Library Award for Undergraduate Research

Evaluating Evaluation Metrics for Ancient Chinese to English Machine Translation
(2024) Bennett, Eric; Schonebaum, Andrew
Evaluation metrics are an important driver of progress in Machine Translation (MT), but they have been validated primarily on high-resource modern languages. In this paper, we conduct an empirical evaluation of metrics commonly used to evaluate MT from Ancient Chinese into English. Using LLMs, we construct a contrastive test set that pairs high-quality MT with purposefully flawed MT of the same Pre-Qin texts. We then evaluate the ability of each metric to discriminate between accurate and flawed translations.
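
The contrastive evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`unigram_f1`, `discrimination_accuracy`), the toy overlap metric, and the example sentences in `TOY_PAIRS` are all assumptions made for illustration, and the abstract does not say how discrimination ability was actually scored. Here a metric is credited with a "win" whenever it scores the accurate translation strictly higher than the flawed one for the same reference.

```python
from collections import Counter
from typing import Callable, List, Tuple

# (reference, accurate_mt, flawed_mt); hypothetical layout, not the paper's format
Pair = Tuple[str, str, str]
Metric = Callable[[str, str], float]  # metric(hypothesis, reference) -> score


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy stand-in metric: F1 over lowercased whitespace tokens."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def discrimination_accuracy(metric: Metric, pairs: List[Pair]) -> float:
    """Fraction of pairs where the accurate MT outscores the flawed MT.

    Ties count as failures, since a tie means the metric could not
    tell the two translations apart.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    wins = sum(
        1 for ref, good, flawed in pairs
        if metric(good, ref) > metric(flawed, ref)
    )
    return wins / len(pairs)


# Invented toy sentences for illustration only; not data from the paper.
TOY_PAIRS: List[Pair] = [
    (
        "the ruler asked how to govern the state well",
        "the ruler asked how to govern the state",
        "the merchant sold grain in the market",
    ),
    (
        "a wise person speaks little and acts carefully",
        "a wise person speaks little and acts with care",
        "a foolish person shouts loudly",
    ),
]

if __name__ == "__main__":
    score = discrimination_accuracy(unigram_f1, TOY_PAIRS)
    print(f"unigram_f1 discrimination accuracy: {score:.2f}")
```

Any metric with the signature `metric(hypothesis, reference) -> float` can be passed in place of `unigram_f1`, so several metrics can be compared on the same set of pairs.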