Classifying Attitude by Topic Aspect for English and Chinese Document Collections

dc.contributor.advisor: Oard, Douglas W. [en_US]
dc.contributor.author: Wu, Yejun [en_US]
dc.contributor.department: Library & Information Services [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2008-06-20T05:36:23Z
dc.date.available: 2008-06-20T05:36:23Z
dc.date.issued: 2008-04-25 [en_US]
dc.description.abstract: The goal of this dissertation is to explore the design of tools to help users make sense of subjective information in English and Chinese by comparing attitudes on aspects of a topic in English and Chinese document collections. This involves two coupled challenges: topic aspect focus and attitude characterization. The topic aspect focus is specified by using information retrieval techniques to obtain documents on a topic that are of interest to a user and then allowing the user to designate a few segments of those documents to serve as examples for aspects that she wishes to see characterized. A novel feature of this work is that the examples can be drawn from documents in two languages (English and Chinese). A bilingual aspect classifier which applies monolingual and cross-language classification techniques is used to assemble automatically a large set of document segments on those same aspects. A test collection was designed for aspect classification by annotating consecutive sentences in documents from the Topic Detection and Tracking collections as aspect instances. Experiments show that classification effectiveness can often be increased by using training examples from both languages. Attitude characterization is achieved by classifiers which determine the subjectivity and polarity of document segments. Sentence attitude classification is the focus of the experiments in the dissertation because the best presently available test collection for Chinese attitude classification (the NTCIR-6 Chinese Opinion Analysis Pilot Task) is focused on sentence-level classification. A large Chinese sentiment lexicon was constructed by leveraging existing Chinese and English lexical resources, and an existing character-based approach for estimating the semantic orientation of other Chinese words was extended. A shallow linguistic analysis approach was adopted to classify the subjectivity and polarity of a sentence.
Using the large sentiment lexicon with appropriate handling of negation, and leveraging sentence subjectivity density, sentence positivity and negativity, the resulting sentence attitude classifier was more effective than the best previously reported systems. [en_US]
dc.format.extent: 3884173 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/1903/8150
dc.language.iso: en_US
dc.subject.pqcontrolled: Information Science [en_US]
dc.subject.pqcontrolled: Library Science [en_US]
dc.subject.pqcontrolled: Computer Science [en_US]
dc.subject.pquncontrolled: classification [en_US]
dc.subject.pquncontrolled: subtopic [en_US]
dc.subject.pquncontrolled: facet [en_US]
dc.subject.pquncontrolled: cross-language [en_US]
dc.subject.pquncontrolled: test collection [en_US]
dc.subject.pquncontrolled: bilingual [en_US]
dc.title: Classifying Attitude by Topic Aspect for English and Chinese Document Collections [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle
Name: umi-umd-5329.pdf
Size: 3.7 MB
Format: Adobe Portable Document Format