Data and Methods for Reference Resolution in Different Modalities

Guha, Anupam

Data and Methods for Reference Resolution in Different Modalities

dc.contributor.advisor	Aloimonos, Yiannis	en_US
dc.contributor.advisor	Boyd-Graber, Jordan	en_US
dc.contributor.author	Guha, Anupam	en_US
dc.contributor.department	Computer Science	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2018-01-23T06:31:49Z
dc.date.available	2018-01-23T06:31:49Z
dc.date.issued	2017	en_US
dc.description.abstract	One foundational goal of artificial intelligence is to build intelligent agents which interact with humans, and to do so, they must have the capacity to infer from human communication what concept is being referred to in a span of symbols. They should be able, like humans, to map these representations to perceptual inputs, visual or otherwise. In NLP, this problem of discovering which spans of text are referring to the same real-world entity is called Coreference Resolution. This dissertation expands this problem to go beyond text and maps concepts referred to by text spans to concepts represented in images. This dissertation also investigates the complex and hard nature of real world coreference resolution. Lastly, this dissertation expands upon the definition of references to include abstractions referred by non-contiguous text distributions. A central theme throughout this thesis is the paucity of data in solving hard problems of reference, which it addresses by designing several datasets. To investigate hard text coreference this dissertation analyses a domain of coreference heavy text, namely questions present in the trivia game of quiz bowl and creates a novel dataset. Solving quiz bowl questions requires robust coreference resolution and world knowledge, something humans possess but current models do not. This work uses distributional semantics for world knowledge. Also, this work addresses the sub-problems of coreference like mention detection. Next, to investigate complex visual representations of concepts, this dissertation uses the domain of paintings. Mapping spans of text in descriptions of paintings to regions of paintings being described by that text is a non-trivial problem because paintings are sufficiently harder than natural images. Distributional semantics are again used here. Finally, to discover prototypical concepts present in distributed rather than contiguous spans of text, this dissertation investigates a source which is rich in prototypical concepts, namely movie scripts. All movie narratives, character arcs, and character relationships, are distilled to sequences of interconnected prototypical concepts which are discovered using unsupervised deep learning models, also using distributional semantics. I conclude this dissertation by discussing potential future research in downstream tasks which can be aided by discovery of referring multi-modal concepts.	en_US
dc.identifier	https://doi.org/10.13016/M2SJ19S6M
dc.identifier.uri	http://hdl.handle.net/1903/20273
dc.language.iso	en	en_US
dc.subject.pqcontrolled	Computer science	en_US
dc.subject.pqcontrolled	Artificial intelligence	en_US
dc.subject.pquncontrolled	Coreference Resolution	en_US
dc.subject.pquncontrolled	Multimodal	en_US
dc.subject.pquncontrolled	Natural Language Processing	en_US
dc.subject.pquncontrolled	References	en_US
dc.title	Data and Methods for Reference Resolution in Different Modalities	en_US
dc.type	Dissertation	en_US

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Guha_umd_0117E_18335.pdf
Size:: 9.4 MB
Format:: Adobe Portable Document Format

Download

Collections

UMD Theses and Dissertations
Computer Science Theses and Dissertations