Evaluating Evaluation Metrics for Ancient Chinese to English Machine Translation
dc.contributor.advisor | Schonebaum, Andrew
dc.contributor.author | Bennett, Eric
dc.date.accessioned | 2025-04-02T15:55:54Z
dc.date.issued | 2024
dc.description.abstract | Evaluation metrics are an important driver of progress in Machine Translation (MT), but they have been primarily validated on high-resource modern languages. In this paper, we conduct an empirical evaluation of metrics commonly used to evaluate MT from Ancient Chinese into English. Using LLMs, we construct a contrastive test set, pairing high-quality MT and purposefully flawed MT of the same Pre-Qin texts. We then evaluate the ability of each metric to discriminate between accurate and flawed translations.
dc.identifier | https://doi.org/10.13016/hluh-dgoh
dc.identifier.uri | http://hdl.handle.net/1903/33827
dc.relation.isAvailableAt | Digital Repository at the University of Maryland
dc.relation.isAvailableAt | University of Maryland (College Park, Md)
dc.subject | Machine Translation
dc.subject | Ancient Chinese
dc.subject | Natural Language Processing
dc.subject | Machine Learning
dc.subject | Artificial Intelligence
dc.title | Evaluating Evaluation Metrics for Ancient Chinese to English Machine Translation
dc.type | Other
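The contrastive evaluation described in the abstract can be illustrated with a small sketch. It assumes each test item is a triple of (reference, accurate translation, flawed translation) and scores a metric by how often it ranks the accurate translation above the flawed one. The record does not state which metrics or which scoring rule the paper uses, so `unigram_f1` is only a hypothetical stand-in metric, `pairwise_accuracy` is one plausible way to measure discrimination, and the example sentences are invented for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# Assumed item layout: (reference, accurate translation, flawed translation).
Triple = Tuple[str, str, str]
# Assumed metric signature: metric(hypothesis, reference) -> score, higher is better.
Metric = Callable[[str, str], float]


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy stand-in metric: F1 over whitespace-separated, lowercased tokens."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def pairwise_accuracy(metric: Metric, triples: Iterable[Triple]) -> float:
    """Fraction of items where the metric scores the accurate translation
    strictly higher than the flawed one (ties count as failures)."""
    items = list(triples)
    if not items:
        raise ValueError("need at least one contrastive item")
    wins = sum(
        1 for ref, good, bad in items if metric(good, ref) > metric(bad, ref)
    )
    return wins / len(items)


if __name__ == "__main__":
    # Invented toy data, not taken from the paper's test set.
    toy_set = [
        (
            "to learn and to practice what is learned is a pleasure",
            "to learn and practice what one has learned is a pleasure",
            "to forget and abandon what is taught is a burden",
        ),
    ]
    print(pairwise_accuracy(unigram_f1, toy_set))
```

Counting ties as failures is a deliberate choice in this sketch: a metric that cannot separate the two translations has not discriminated between them. Any other metric with the same `(hypothesis, reference)` signature can be passed in place of `unigram_f1`.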
Files
Original bundle
- Name: Bennett_ResearchPaper.pdf
- Size: 274.53 KB
- Format: Adobe Portable Document Format