Searching to Translate and Translating to Search: When Information Retrieval Meets Machine Translation

Ture, Ferhan

dc.contributor.advisor	Lin, Jimmy	en_US
dc.contributor.author	Ture, Ferhan	en_US
dc.contributor.department	Computer Science	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2013-10-03T05:32:39Z
dc.date.available	2013-10-03T05:32:39Z
dc.date.issued	2013	en_US
dc.description.abstract	With the adoption of web services in daily life, people have access to tremendous amounts of information, beyond any human's reading and comprehension capabilities. As a result, search technologies have become a fundamental tool for accessing information. Furthermore, the web contains information in multiple languages, introducing another barrier between people and information. Therefore, search technologies need to handle content written in multiple languages, which requires techniques that account for linguistic differences. Information Retrieval (IR) is the study of search techniques, in which the task is to find material relevant to a given information need. Cross-Language Information Retrieval (CLIR) is a special case of IR in which the search takes place in a multilingual collection. Of course, it is not helpful to retrieve content in languages the user cannot understand. Machine Translation (MT) studies the translation of text from one language into another efficiently (within a reasonable amount of time) and effectively (producing fluent output that retains the original meaning), which helps people understand what is written, regardless of the source language. Putting these together, we observe that search and translation technologies are part of an important user application, calling for a better integration of search (IR) and translation (MT), since these two technologies need to work together to produce high-quality output. In this dissertation, the main goal is to build better connections between IR and MT, for which we present solutions to two problems. "Searching to translate" explores approximate search techniques for extracting bilingual data from multilingual Wikipedia collections to train better translation models. "Translating to search" explores the integration of a modern statistical MT system into the cross-language search process. 
In both cases, our best-performing approach yielded improvements over strong baselines for a variety of language pairs. Finally, we propose a general architecture, in which various components of IR and MT systems can be connected together into a feedback loop, with potential improvements to both search and translation tasks. We hope that the ideas presented in this dissertation will spur more interest in the integration of search and translation technologies.	en_US
dc.identifier.uri	http://hdl.handle.net/1903/14502
dc.subject.pqcontrolled	Computer science	en_US
dc.subject.pqcontrolled	Artificial intelligence	en_US
dc.subject.pqcontrolled	Information science	en_US
dc.subject.pquncontrolled	cross-language	en_US
dc.subject.pquncontrolled	information retrieval	en_US
dc.subject.pquncontrolled	ivory	en_US
dc.subject.pquncontrolled	locality sensitive hashing	en_US
dc.subject.pquncontrolled	machine translation	en_US
dc.subject.pquncontrolled	mapreduce	en_US
dc.title	Searching to Translate and Translating to Search: When Information Retrieval Meets Machine Translation	en_US
dc.type	Dissertation	en_US

Files

Original bundle

Name: Ture_umd_0117E_14453.pdf
Size: 1.55 MB
Format: Adobe Portable Document Format

Collections

UMD Theses and Dissertations
Computer Science Theses and Dissertations