Domain-Specific Term-List Expansion Using Existing Linguistic Resources
dc.contributor.author | Dorr, Bonnie | en_US |
dc.contributor.author | Zhao, Tiejun | en_US |
dc.date.accessioned | 2004-05-31T23:21:15Z | |
dc.date.available | 2004-05-31T23:21:15Z | |
dc.date.created | 2002-09 | en_US |
dc.date.issued | 2002-10-03 | en_US |
dc.description.abstract | This report describes a series of experiments involving expansion of a domain-specific human-generated "seed list" using available linguistic resources. The resources used for the expansion are intended to be general purpose: two large-scale Chinese-English dictionaries and a Chinese lexical knowledge base (HowNet). The methodology involves three steps: (1) hand extraction of head words from each entry in the human-generated seed list; (2) automatic comparison of these head words against entries in the linguistic resources, where an entry matches if the head word matches the entry exactly or is included in its semantic definition; and (3) collection of any resulting matching entries into a larger term list. The terms extracted by this process were verified manually to confirm whether they were relevant to the topic of a specific domain. An important contribution of this work is the finding that the use of a bilingual term list for the expansion process does not provide a significant improvement over the use of a simpler, more easily produced, monolingual term list. (Also LAMP-TR-092) (Also UMIACS-TR-2002-79) | en_US |
dc.format.extent | 409351 bytes | |
dc.format.mimetype | application/postscript | |
dc.identifier.uri | http://hdl.handle.net/1903/1225 | |
dc.language.iso | en_US | |
dc.relation.isAvailableAt | Digital Repository at the University of Maryland | en_US |
dc.relation.isAvailableAt | University of Maryland (College Park, Md.) | en_US |
dc.relation.isAvailableAt | Tech Reports in Computer Science and Engineering | en_US |
dc.relation.isAvailableAt | UMIACS Technical Reports | en_US |
dc.relation.ispartofseries | UM Computer Science Department; CS-TR-4399 | en_US |
dc.relation.ispartofseries | LAMP-TR-092 | en_US |
dc.relation.ispartofseries | UMIACS; UMIACS-TR-2002-79 | en_US |
dc.title | Domain-Specific Term-List Expansion Using Existing Linguistic Resources | en_US |
dc.type | Technical Report | en_US |
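
Steps (2) and (3) of the methodology described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the report's implementation: the function name `expand_term_list`, the representation of a resource entry as a (headword, definition-tokens) pair, and the toy English entries are all assumptions made for the example. The actual resources are Chinese-English dictionaries and HowNet, whose formats are not given in this record. Step (1), the hand extraction of head words, and the final manual verification are manual and are not modeled.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical representation: (entry headword, tokens of its semantic definition).
Entry = Tuple[str, Sequence[str]]


def expand_term_list(head_words: Iterable[str], entries: Iterable[Entry]) -> List[str]:
    """Collect resource entries matched by any seed head word.

    An entry matches if a head word equals the entry exactly or
    appears among the tokens of the entry's semantic definition.
    """
    heads = set(head_words)
    expanded = []
    for headword, definition in entries:
        # Step 2: exact match on the entry, or inclusion in its definition.
        if headword in heads or heads.intersection(definition):
            # Step 3: gather the matching entry into the larger term list.
            expanded.append(headword)
    return expanded


# Toy data, invented for illustration only.
TOY_ENTRIES: List[Entry] = [
    ("vaccine", ["medicine", "prevent", "disease"]),
    ("disease", ["illness"]),
    ("river", ["water", "flow"]),
]

if __name__ == "__main__":
    print(expand_term_list(["disease"], TOY_ENTRIES))
```

Here the head word "disease" picks up "vaccine" through its definition and "disease" through the exact match, while "river" is left out. The candidate list produced this way would still need the manual relevance check that the abstract describes.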