Structured Translation for Cross-Language Information Retrieval

Sperer, Ruth; Oard, Douglas W.

Files

CS-TR-4154.ps (8.72 MB)

CS-TR-4154.pdf (153.72 KB)

Date

2000-06-21

Abstract

The paper introduces a query translation model that reflects the structure of the cross-language information retrieval task. The model is based on a structured bilingual dictionary in which the translations of each term are clustered into groups with distinct meanings. Query translation is modeled as a two-stage process, with the system first determining the intended meaning of a query term and then selecting translations appropriate to that meaning that might appear in the document collection. An implementation of structured translation based on automatic dictionary clustering is described and evaluated by using Chinese queries to retrieve English documents. Structured translation achieved an average precision that was statistically indistinguishable from Pirkola's technique for very short queries, but Pirkola's technique outperformed structured translation on long queries. The paper concludes with some observations on future work to improve retrieval effectiveness and on other potential uses of structured translation in interactive cross-language retrieval applications. (Also cross-referenced as UMIACS-TR-2000-45, LAMP-TR-052)
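
The two-stage process described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the dictionary entries, the function names, and in particular the stage-one heuristic (choosing the meaning whose translations have the highest total document frequency) are assumptions made only for illustration.

```python
from typing import Callable, Dict, List

Cluster = List[str]  # translations that share one meaning


def pick_by_doc_frequency(clusters: List[Cluster], doc_freq: Dict[str, int]) -> Cluster:
    """Stage 1 (assumed heuristic, not from the report): choose the meaning
    whose translations have the highest summed document frequency."""
    return max(clusters, key=lambda c: sum(doc_freq.get(t, 0) for t in c))


def translate_query(
    query_terms: List[str],
    dictionary: Dict[str, List[Cluster]],
    doc_freq: Dict[str, int],
    pick_meaning: Callable[[List[Cluster], Dict[str, int]], Cluster] = pick_by_doc_frequency,
) -> Dict[str, List[str]]:
    """Translate each query term in two stages using a clustered dictionary."""
    translated: Dict[str, List[str]] = {}
    for term in query_terms:
        clusters = dictionary.get(term, [])
        if not clusters:
            translated[term] = []  # no dictionary entry for this term
            continue
        # Stage 1: decide which meaning of the term is intended.
        meaning = pick_meaning(clusters, doc_freq)
        # Stage 2: keep only translations that occur in the document collection.
        translated[term] = [t for t in meaning if doc_freq.get(t, 0) > 0]
    return translated


# Hypothetical clustered entry and collection statistics (invented values).
example_dictionary = {"行": [["row", "line"], ["walk", "go"], ["firm", "business"]]}
example_doc_freq = {"firm": 5, "line": 3}
example_result = translate_query(["行"], example_dictionary, example_doc_freq)
```

Passing a different `pick_meaning` function, for example one that asks a user to choose among the clusters, would correspond to the interactive use the abstract mentions; that variant is likewise an assumption here.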

URI (handle)

http://hdl.handle.net/1903/1085

Collections

Technical Reports from UMIACS
Technical Reports of the Computer Science Department