Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

dc.contributor.author: Bloodgood, Michael
dc.contributor.author: Callison-Burch, Chris
dc.date.accessioned: 2014-07-31T11:46:32Z
dc.date.available: 2014-07-31T11:46:32Z
dc.date.issued: 2010-07
dc.description.abstract: We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find that we get an order of magnitude increase in performance rates of improvement. (en_US)
dc.description.sponsorship: Johns Hopkins University Human Language Technology Center of Excellence (en_US)
dc.identifier.citation: Michael Bloodgood and Chris Callison-Burch. 2010. Bucking the trend: cost-focused active learning for statistical machine translation. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 854-864, Uppsala, Sweden, July. Association for Computational Linguistics. (en_US)
dc.identifier.uri: http://hdl.handle.net/1903/15549
dc.language.iso: en_US (en_US)
dc.publisher: Association for Computational Linguistics (en_US)
dc.relation.isAvailableAt: Center for Advanced Study of Language
dc.relation.isAvailableAt: Digital Repository at the University of Maryland
dc.relation.isAvailableAt: University of Maryland (College Park, Md)
dc.subject: computer science (en_US)
dc.subject: statistical methods (en_US)
dc.subject: artificial intelligence (en_US)
dc.subject: machine learning (en_US)
dc.subject: computational linguistics (en_US)
dc.subject: natural language processing (en_US)
dc.subject: human language technology (en_US)
dc.subject: translation technology (en_US)
dc.subject: machine translation (en_US)
dc.subject: statistical machine translation (en_US)
dc.subject: active learning (en_US)
dc.subject: selective sampling (en_US)
dc.subject: query learning (en_US)
dc.subject: stopping criteria (en_US)
dc.subject: stopping methods (en_US)
dc.subject: crowdsourcing (en_US)
dc.subject: cost-focused active learning (en_US)
dc.subject: cost-efficient annotation (en_US)
dc.subject: annotation costs (en_US)
dc.subject: annotation bottleneck (en_US)
dc.subject: annotation cost metrics (en_US)
dc.subject: Urdu-English translation (en_US)
dc.subject: uncertainty-based active learning (en_US)
dc.subject: uncertainty-based sampling (en_US)
dc.subject: Amazon Mechanical Turk (en_US)
dc.subject: Highlighted N-Gram Method (en_US)
dc.title: Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation (en_US)
dc.type: Article (en_US)

Files

Original bundle
Name: buckingTheTrendActiveLearningMachineTranslationACL2010.pdf
Size: 509.94 KB
Format: Adobe Portable Document Format