
dc.contributor.author: Habash, Nizar [en_US]
dc.contributor.author: Dorr, Bonnie [en_US]
dc.date.accessioned: 2004-05-31T23:25:28Z
dc.date.available: 2004-05-31T23:25:28Z
dc.date.created: 2003-02 [en_US]
dc.date.issued: 2003-02-27 [en_US]
dc.identifier.uri: http://hdl.handle.net/1903/1258
dc.description.abstract: We describe our approach to the construction and evaluation of a large-scale database called "CatVar" which contains categorial variations of English lexemes. Due to the prevalence of cross-language categorial variation in multilingual applications, our categorial-variation resource may serve as an integral part of a diverse range of natural language applications. Thus, the research reported herein overlaps heavily with that of the machine-translation, lexicon-construction, and information-retrieval communities. We apply the information-retrieval metrics of precision and recall to evaluate the accuracy and coverage of our database with respect to a human-produced gold standard. This evaluation reveals that the categorial database achieves a high degree of precision and recall. Additionally, we demonstrate that the database improves on the linkability of Porter Stemmer by over 30%. (UMIACS-TR-2003-13, LAMP-TR-095) [en_US]
dc.format.extent: 5452136 bytes
dc.format.mimetype: application/postscript
dc.language.iso: en_US
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-4443 [en_US]
dc.relation.ispartofseries: UMIACS; UMIACS-TR-2003-13 [en_US]
dc.relation.ispartofseries: LAMP-TR-095 [en_US]
dc.title: A Categorial Variation Database for English [en_US]
dc.type: Technical Report [en_US]
dc.relation.isAvailableAt: Digital Repository at the University of Maryland [en_US]
dc.relation.isAvailableAt: University of Maryland (College Park, Md.) [en_US]
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering [en_US]
dc.relation.isAvailableAt: UMIACS Technical Reports [en_US]

