Statistical Analysis of Text Summarization Evaluation

2016

This dissertation applies statistical methods to the evaluation of automatic
summarization using data from the Text Analysis Conferences in 2008-2011.
Several aspects of the evaluation framework itself are studied, including the
statistical testing used to determine significant differences, the assessors,
and the design of the experiment. In addition, a family of evaluation metrics
is developed to predict the score an automatically generated summary would
receive from a human judge, and its results are demonstrated at the Text
Analysis Conference. Finally, variations on the evaluation framework are
studied and their relative merits considered. An overarching theme of this
dissertation is the application of standard statistical methods to data that
does not conform to the usual testing assumptions.