Construction of a Chinese-English Verb Lexicon for Embedded Machine Translation in Cross-Language Information Retrieval

dc.contributor.author: Dorr, Bonnie Jean (en_US)
dc.contributor.author: Levow, Gina-Anne (en_US)
dc.contributor.author: Lin, Dekang (en_US)
dc.date.accessioned: 2004-05-31T23:21:21Z
dc.date.available: 2004-05-31T23:21:21Z
dc.date.created: 2002-09 (en_US)
dc.date.issued: 2002-10-07 (en_US)
dc.description.abstract: This paper addresses the problem of automatic acquisition of lexical knowledge for rapid construction of MT engines for use in multilingual applications. We describe new techniques for large-scale construction of a Chinese-English verb lexicon, and we evaluate the coverage and effectiveness of the resulting lexicon for a structured MT approach that is embedded in a cross-language information retrieval system. Leveraging an existing Chinese conceptual database called HowNet and a large, semantically rich English verb database, we use thematic-role information to create links between Chinese concepts and English classes. We apply the metrics of recall and precision to evaluate the coverage and effectiveness of the linguistic resources. The results of this work indicate that: (1) we are able to obtain reliable Chinese-English entries both with and without pre-existing semantic links between the two languages; (2) if we have pre-existing semantic links, we are able to produce a more robust lexical resource by merging these with our semantically rich English database; (3) in our comparisons with manual lexicon creation, our automatic techniques achieved 62% precision, compared to a much lower precision of 10% for arbitrary assignment of semantic links. (Also LAMP-TR-093) (Also UMIACS-TR-2002-80) (en_US)
dc.format.extent: 843244 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/1226
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland (en_US)
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) (en_US)
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering (en_US)
dc.relation.isAvailableAt: UMIACS Technical Reports (en_US)
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-4400 (en_US)
dc.relation.ispartofseries: LAMP-TR-093 (en_US)
dc.relation.ispartofseries: UMIACS; UMIACS-TR-2002-80 (en_US)
dc.title: Construction of a Chinese-English Verb Lexicon for Embedded Machine Translation in Cross-Language Information Retrieval (en_US)
dc.type: Technical Report (en_US)

Files

Original bundle

Name: CS-TR-4400.psd.pdf
Size: 823.48 KB
Format: Adobe Portable Document Format