Domain-Specific Term-List Expansion Using Existing Linguistic Resources

dc.contributor.author: Dorr, Bonnie
dc.contributor.author: Zhao, Tiejun
dc.date.accessioned: 2004-05-31T23:21:15Z
dc.date.available: 2004-05-31T23:21:15Z
dc.date.created: 2002-09
dc.date.issued: 2002-10-03
dc.description.abstract: This report describes a series of experiments involving expansion of a domain-specific human-generated "seed list" using available linguistic resources. The resources used for the expansion are intended to be general purpose: two large-scale Chinese-English dictionaries and a Chinese lexical knowledge base (HowNet). The methodology involves three steps: (1) hand extraction of head words from each entry in the human-generated seed list; (2) automatic comparison of these head words against entries in the linguistic resources, where an entry matches if the head word matches the entry exactly or is included in its semantic definition; and (3) collection of any resulting matching entries into a larger term list. The terms extracted by this process were verified manually to confirm whether they were relevant to the topic of a specific domain. An important contribution of this work is the finding that using a bilingual term list for the expansion process does not provide a significant improvement over using a simpler, more easily produced, monolingual term list. (Also LAMP-TR-092) (Also UMIACS-TR-2002-79)
dc.format.extent: 409351 bytes
dc.format.mimetype: application/postscript
dc.identifier.uri: http://hdl.handle.net/1903/1225
dc.language.iso: en_US
dc.relation.isAvailableAt: Digital Repository at the University of Maryland
dc.relation.isAvailableAt: University of Maryland (College Park, Md.)
dc.relation.isAvailableAt: Tech Reports in Computer Science and Engineering
dc.relation.isAvailableAt: UMIACS Technical Reports
dc.relation.ispartofseries: UM Computer Science Department; CS-TR-4399
dc.relation.ispartofseries: LAMP-TR-092
dc.relation.ispartofseries: UMIACS; UMIACS-TR-2002-79
dc.title: Domain-Specific Term-List Expansion Using Existing Linguistic Resources
dc.type: Technical Report
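
The matching and collection steps described in the abstract (steps 2 and 3) can be sketched roughly as follows. This is a minimal illustration, not the report's implementation: the `Entry` layout, the function names, and the toy English data are all hypothetical, and treating "included in its semantic definition" as whitespace-token membership is an assumption (Chinese text or HowNet definitions would need a different inclusion test). Step 1, head-word extraction, is done by hand in the report, and so is the final relevance check, so both stay outside the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One record in a linguistic resource (hypothetical layout)."""
    term: str
    definition: str  # semantic definition or gloss


def matches(head_word: str, entry: Entry) -> bool:
    """Step 2: the head word equals the entry term exactly, or appears
    as a token in the entry's definition (tokenization is an assumption)."""
    return head_word == entry.term or head_word in entry.definition.split()


def expand_term_list(head_words, entries):
    """Step 3: collect the terms of all matching entries, in first-seen order."""
    expanded = []
    seen = set()
    for entry in entries:
        if entry.term not in seen and any(matches(h, entry) for h in head_words):
            seen.add(entry.term)
            expanded.append(entry.term)
    return expanded


# Toy data invented for illustration; not taken from the report.
head_words = ["election"]  # step 1 output, extracted by hand
resource = [
    Entry("election", "the act of choosing by vote"),
    Entry("ballot", "a paper used to vote in an election"),
    Entry("harvest", "the gathering of crops"),
]
candidates = expand_term_list(head_words, resource)
print(candidates)  # candidates still need manual relevance checking
```

Because matching on definitions can pull in loosely related entries, the output is best read as a candidate list for the manual verification the abstract mentions.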

Files

Original bundle

Name: CS-TR-4399.pdf
Size: 148.72 KB
Format: Adobe Portable Document Format
Description: Auto-generated copy of CS-TR-4399.ps

Name: CS-TR-4399.ps
Size: 399.76 KB
Format: Postscript Files