Structured Translation for Cross-Language Information Retrieval

Files

CS-TR-4154.ps (8.72 MB)
CS-TR-4154.pdf (153.72 KB)

Date

2000-06-21

Abstract

The paper introduces a query translation model that reflects the structure of the cross-language information retrieval task. The model is based on a structured bilingual dictionary in which the translations of each term are clustered into groups with distinct meanings. Query translation is modeled as a two-stage process: the system first determines the intended meaning of a query term and then selects translations appropriate to that meaning that might appear in the document collection. An implementation of structured translation based on automatic dictionary clustering is described and evaluated by using Chinese queries to retrieve English documents. For very short queries, structured translation achieved an average precision that was statistically indistinguishable from that of Pirkola's technique, but Pirkola's technique outperformed structured translation on long queries. The paper concludes with some observations on future work to improve retrieval effectiveness and on other potential uses of structured translation in interactive cross-language retrieval applications. (Also cross-referenced as UMIACS-TR-2000-45, LAMP-TR-052)
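
The two-stage process described in the abstract can be illustrated with a minimal sketch. This is not the implementation evaluated in the report: the dictionary entry, the toy document collection, and the function names (`document_frequencies`, `pick_cluster`, `translate_query`) are all hypothetical, and the meaning-selection heuristic (pick the cluster whose translations occur in the most documents) is only a stand-in assumption for whatever method the report actually uses.

```python
from collections import Counter

# Hypothetical structured dictionary: each source term maps to clusters of
# translations, one cluster per distinct meaning. The entry is illustrative.
STRUCTURED_DICT = {
    "杜鹃": [["cuckoo"], ["azalea", "rhododendron"]],
}

# Toy English document collection (invented for this sketch).
DOCS = [
    "the azalea garden blooms in spring",
    "azalea and camellia shrubs need acidic soil",
    "a cuckoo called from the forest",
]


def document_frequencies(docs):
    """Count, for each word, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))
    return df


def pick_cluster(clusters, df):
    """Stage 1 (assumed heuristic): choose the meaning cluster whose
    translations have the highest total document frequency."""
    return max(clusters, key=lambda cluster: sum(df[t] for t in cluster))


def translate_query(query_terms, dictionary, df):
    """Stage 2: within the chosen meaning, keep only translations that
    actually occur in the collection. Returns one list per query term."""
    translated = []
    for term in query_terms:
        clusters = dictionary.get(term)
        if not clusters:
            continue  # terms missing from the dictionary are dropped here
        cluster = pick_cluster(clusters, df)
        translated.append([t for t in cluster if df[t] > 0])
    return translated


DF = document_frequencies(DOCS)
RESULT = translate_query(["杜鹃"], STRUCTURED_DICT, DF)
print(RESULT)
```

In this toy collection the "azalea" cluster is selected because its translations occur in more documents than "cuckoo", and "rhododendron" is then discarded because it never appears in the collection. Keeping each query term's surviving translations grouped in their own list leaves room for a retrieval step that treats them as alternatives for a single term.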
