Information Extraction Tool

Files
UG_2001-1.pdf(163.84 KB)
No. of downloads: 330
Date
2001
Authors
Lo, Alex
Advisor
Gupta, S.K.
Lin, Edward Yi-tzer
Abstract
In the "Internet age," where a wealth of information is available, it becomes important to get the information desired. This becomes difficult when the information on the web is not uniformly formatted. While technologies such as XML promise to bring more organization to the Internet, they are not commonly used. Many projects have been based around "dumb" extraction: simply taking information from a specified place and storing it in a specified location. A more intelligent method can be used when dealing with information that is semi-structured and needs to be cataloged. The method developed here incorporates information division and recognition to identify and catalog information on the Internet.