Crowdsourced Monolingual Translation

Hu_umd_0117E_13650.pdf (12.31 MB)

An enormous potential exists for solving certain classes of computational problems through rich collaboration among crowds of humans supported by computers. Solutions to these problems used to involve human professionals, who are expensive to hire or difficult to find. Despite significant advances, fully automatic systems still have much room for improvement. Recent research has involved recruiting large crowds of skilled humans ("crowdsourcing"), but crowdsourcing solutions are still restricted by the availability of those skilled participants. With translation, for example, professional translators are costly and not always available; machine translation systems have improved greatly in recent years, but still provide only passable translations, and only for a limited set of language pairs; crowdsourced translation is limited by the availability of bilingual humans.

This dissertation describes crowdsourced monolingual translation, where monolingual translation is translation performed by monolingual people. Crowdsourced monolingual translation is a collaborative form of translation performed by two crowds of people who speak the source or the target language respectively, with machine translation as the mediating device.

A general protocol for crowdsourced monolingual translation is introduced, along with three systems that implement it. The MonoTrans system initially established the feasibility of the protocol. Then, MonoTrans2 enabled lab experiments with a second implementation of the protocol. MonoTrans2 was also applied to an emergency-response scenario in a developing country (Haiti). The MonoTrans Widgets system was deployed to a large crowd of casual web users with a third implementation of the protocol. These systems were studied in various settings, and were found to improve quality over both machine translation and monolingual post-editing.