Center for Advanced Study of Language Research Works

Permanent URI for this collection: http://hdl.handle.net/1903/11610

  • Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach
    (2010-10) Baker, Kathryn; Bloodgood, Michael; Callison-Burch, Chris; Dorr, Bonnie; Filardo, Nathaniel; Levin, Lori; Miller, Scott; Piatko, Christine
    We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation. Semantically enriched syntactic tags assigned to the target-language training texts improved translation quality. The resulting system significantly outperformed a linguistically naive baseline model (Hiero), and reached the highest scores yet reported on the NIST 2009 Urdu-English translation task. This finding supports the hypothesis (posed by many researchers in the MT community, e.g., in DARPA GALE) that both syntactic and semantic information are critical for improving translation quality—and further demonstrates that large gains can be achieved for low-resource languages with different word order than English.
  • Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation
    (Association for Computational Linguistics, 2010-07) Bloodgood, Michael; Callison-Burch, Chris
    We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find that we get an order of magnitude increase in performance rates of improvement.