dc.contributor.advisor: Perlis, Donald R. (en_US)
dc.contributor.author: Wilson, Shomir (en_US)
dc.date.accessioned: 2011-07-07T05:44:47Z
dc.date.available: 2011-07-07T05:44:47Z
dc.date.issued: 2011 (en_US)
dc.identifier.uri: http://hdl.handle.net/1903/11694
dc.description.abstract: To understand the language we use, we sometimes must turn language on itself, and we do this through an understanding of the use-mention distinction. In particular, we are able to recognize mentioned language: that is, tokens (e.g., words, phrases, sentences, letters, symbols, sounds) produced to draw attention to linguistic properties that they possess. Evidence suggests that humans frequently employ the use-mention distinction, and we would be severely handicapped without it; mentioned language frequently occurs for the introduction of new words, attribution of statements, explanation of meaning, and assignment of names. Moreover, just as we benefit from mutual recognition of the use-mention distinction, the potential exists for us to benefit from language technologies that recognize it as well. With a better understanding of the use-mention distinction, applications can be built to extract valuable information from mentioned language, leading to better language learning materials, precise dictionary building tools, and highly adaptive computer dialogue systems. This dissertation presents the first computational study of how the use-mention distinction occurs in natural language, with a focus on occurrences of mentioned language. Three specific contributions are made. The first is a framework for identifying and analyzing instances of mentioned language, in an effort to reconcile elements of previous theoretical work for practical use. Definitions for mentioned language, metalanguage, and quotation have been formulated, and a procedural rubric has been constructed for labeling instances of mentioned language. The second is a sequence of three labeled corpora of mentioned language, containing delineated instances of the phenomenon. The corpora illustrate the variety of mentioned language, and they enable analysis of how the phenomenon relates to sentence structure. Using these corpora, inter-annotator agreement studies have quantified the concurrence of human readers in labeling the phenomenon. The third contribution is a method for identifying common forms of mentioned language in text, using patterns in metalanguage and sentence structure. Although the full breadth of the phenomenon is likely to elude computational tools for the foreseeable future, some specific, common rules for detecting and delineating mentioned language have been shown to perform well. (en_US)
dc.title: A Computational Theory of the Use-Mention Distinction in Natural Language (en_US)
dc.type: Dissertation (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.contributor.department: Computer Science (en_US)
dc.subject.pqcontrolled: Computer Science (en_US)
dc.subject.pqcontrolled: Artificial Intelligence (en_US)
dc.subject.pqcontrolled: Language, Linguistics (en_US)
dc.subject.pquncontrolled: computational linguistics (en_US)
dc.subject.pquncontrolled: dialog systems (en_US)
dc.subject.pquncontrolled: metalanguage (en_US)
dc.subject.pquncontrolled: natural language processing (en_US)


