Improving Information Retrieval Systems using Part of Speech Tagging

Files

TR_98-48.pdf (106.33 KB)
No. of downloads: 923

Date

1998

Abstract

The objective of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of Part of Speech Tagging to improve the index storage overhead and general speed of the system with only a minimal reduction in precision-recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90 percent of precision-recall is achieved with 40 percent of the document collection's terms. We also show that this is an improvement in overhead with only a 1 percent reduction in precision-recall.
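
The idea of indexing only selected parts of speech can be illustrated with a minimal Python sketch. It is not the report's implementation: the two hand-tagged documents, the Penn Treebank-style tags, the function name build_index, and the choice of nouns as the part of speech to keep are all assumptions made for illustration. The abstract does not state which part of speech the experiments found most relevant.

```python
from collections import defaultdict

# Hypothetical hand-tagged documents: (token, tag) pairs using
# Penn Treebank-style tags. A real system would obtain the tags
# from a part-of-speech tagger run over the collection.
TAGGED_DOCS = {
    "d1": [("the", "DT"), ("senator", "NN"), ("quickly", "RB"),
           ("vetoed", "VBD"), ("the", "DT"), ("budget", "NN")],
    "d2": [("a", "DT"), ("budget", "NN"), ("deficit", "NN"),
           ("grew", "VBD"), ("rapidly", "RB")],
}


def build_index(tagged_docs, keep_prefixes=None):
    """Build an inverted index mapping each term to the set of document ids.

    If keep_prefixes is a tuple of tag prefixes, only tokens whose tag
    starts with one of them are indexed; None indexes every token.
    """
    index = defaultdict(set)
    for doc_id, tokens in tagged_docs.items():
        for word, tag in tokens:
            if keep_prefixes is None or tag.startswith(keep_prefixes):
                index[word.lower()].add(doc_id)
    return dict(index)


# Index every term, then only nouns (an assumed choice for this sketch).
full_index = build_index(TAGGED_DOCS)
noun_index = build_index(TAGGED_DOCS, keep_prefixes=("NN",))

# Fraction of distinct terms retained by the restricted index.
kept_fraction = len(noun_index) / len(full_index)
print(f"terms kept: {len(noun_index)} of {len(full_index)} ({kept_fraction:.0%})")
```

On this toy data the noun-only index keeps a fraction of the distinct terms while still mapping "budget" to both documents. Whether retrieval quality holds up on a real collection has to be measured with precision-recall experiments such as those the report describes.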