When Good MT Goes Bad: Understanding and Mitigating Misleading Machine Translations

dc.contributor.advisor: Carpuat, Marine [en_US]
dc.contributor.author: Martindale, Marianna [en_US]
dc.contributor.department: Information Studies [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2024-09-23T06:23:28Z
dc.date.available: 2024-09-23T06:23:28Z
dc.date.issued: 2024 [en_US]
dc.description.abstract: Machine Translation (MT) has long been viewed as a force multiplier, enabling monolingual users to assist in processing foreign language text. In ideal situations, Neural MT (NMT) provides unprecedented MT quality, potentially increasing productivity and user acceptance of the technology. However, outside of ideal circumstances, NMT introduces new types of errors that may be difficult for users who don't understand the source language to recognize, resulting in misleading output. This dissertation seeks to understand the prevalence, nature, and impact of potentially misleading output and whether a simple intervention can mitigate its effects on monolingual users. To understand the prevalence of misleading MT output, we conduct a study to quantify the potential impact of output that is fluent but not adequate, or "fluently inadequate," by observing the relative frequency of these types of errors in two types of MT models: statistical and early neural. We find that neural models were consistently more prone to this type of error than traditional statistical models. However, improving the overall quality of the MT system, such as through domain adaptation, reduces these errors. We examine the nature of misleading MT output by moving from an intrinsic feature (fluency) to a more user-centered feature, believability, defined as a monolingual user's perception of the likelihood that the meaning of the MT output matches the meaning of the input, without understanding the source. We find that fluency accounts for most believability judgments, but semantic features like plausibility also play a role. Finally, we turn to mitigating the impacts of potentially misleading NMT output. We propose two simple interventions to help users more effectively handle inadequate output: providing output from a second NMT system and providing output from a rule-based MT (RBMT) system. We test these interventions for one use case with a user study designed to mimic typical intelligence analysis triage workflows and with actual intelligence analysts as participants. We see significant increases in performance on relevance judgment tasks with output from two NMT systems and in performance on relevant entity identification tasks with the addition of RBMT output. [en_US]
dc.identifier: https://doi.org/10.13016/pznx-ypp1
dc.identifier.uri: http://hdl.handle.net/1903/33445
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Information science [en_US]
dc.subject.pquncontrolled: artificial intelligence [en_US]
dc.subject.pquncontrolled: machine translation [en_US]
dc.title: When Good MT Goes Bad: Understanding and Mitigating Misleading Machine Translations [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle

Name: Martindale_umd_0117E_24634.pdf
Size: 2.63 MB
Format: Adobe Portable Document Format