University of Maryland DRUM  
University of Maryland Digital Repository at the University of Maryland

DRUM >
Theses and Dissertations from UMD >
UMD Theses and Dissertations >

Please use this identifier to cite or link to this item: http://hdl.handle.net/1903/3212

Title: Matching Meaning for Cross-Language Information Retrieval
Authors: Wang, Jianqiang
Advisors: Oard, Douglas W
Department/Program: Library & Information Services
Type: Dissertation
Sponsors: Digital Repository at the University of Maryland
University of Maryland (College Park, Md.)
Keywords: Information Science (0723)
Computer Science (0984)
Library Science (0399)
information retrieval; cross-language information retrieval; information retrieval evaluation
Issue Date: 6-Dec-2005
Abstract: Cross-language information retrieval concerns the problem of finding information in one language in response to search requests expressed in another language. The explosive growth of the World Wide Web, with access to information in many languages, has provided a substantial impetus for research on this important problem. In recent years, significant advances in cross-language retrieval effectiveness have resulted from the application of statistical techniques to estimate accurate translation probabilities for individual terms from automated analysis of human-prepared translations. With few exceptions, however, those results have been obtained by applying evidence about the meaning of terms to translation in one direction at a time (e.g., by translating the queries into the document language). This dissertation introduces a more general framework for the use of translation probability in cross-language information retrieval based on the notion that information retrieval is dependent fundamentally upon matching what the searcher means with what the document author meant. The perspective yields a simple computational formulation that provides a natural way of combining what have been known traditionally as query and document translation. When combined with the use of synonym sets as a computational model of meaning, cross-language search results are obtained using English queries that approximate a strong monolingual baseline for both French and Chinese documents. Two well-known techniques (structured queries and probabilistic structured queries) are also shown to be a special case of this model under restrictive assumptions.
URI: http://hdl.handle.net/1903/3212
Appears in Collections:UMD Theses and Dissertations
Information Studies Theses and Dissertations

Files in This Item:

File Description SizeFormatNo. of Downloads
umi-umd-3035.pdf760.31 kBAdobe PDF982View/Open

All items in DRUM are protected by copyright, with all rights reserved.

 

DRUM is brought to you by the University of Maryland Libraries
University of Maryland, College Park, MD 20742-7011 (301)314-1328.
Please send us your comments. -
All Contents