Statistical Analysis of Text Summarization Evaluation
Abstract
This dissertation applies statistical methods to the evaluation of automatic
summarization, using data from the Text Analysis Conference (TAC) from 2008
to 2011. Several aspects of the evaluation framework itself are studied,
including the statistical tests used to determine significant differences
between systems, the assessors, and the design of the experiment. In
addition, a family of evaluation metrics is developed to predict the score
an automatically generated summary would receive from a human judge, and its
performance is demonstrated at TAC. Finally, variations on the evaluation
framework are studied and their relative merits are considered. An
overarching theme of this dissertation is the application of standard
statistical methods to data that do not conform to the usual testing
assumptions.