Context Models for Understanding Images and Videos

dc.contributor.advisor: Davis, Larry S. (en_US)
dc.contributor.author: Nagaraja, Varun K. (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2016-09-03T05:42:49Z
dc.date.available: 2016-09-03T05:42:49Z
dc.date.issued: 2016 (en_US)
dc.description.abstract: A computer vision system that interacts in natural language needs to understand the visual appearance of interactions between objects as well as the appearance of the objects themselves. Relationships between objects are frequently mentioned in queries for tasks such as semantic image retrieval, image captioning, visual question answering, and natural language object detection. Hence, modeling context between objects is essential for solving these tasks. In the first part of this thesis, we present a technique for detecting an object mentioned in a natural language query. Specifically, we work with referring expressions, which are sentences that identify a particular object instance in an image. In many referring expressions, an object is described in relation to another object using prepositions, comparative adjectives, action verbs, etc. Our proposed technique identifies both the referred object and the context object mentioned in such expressions. Context is also useful for incrementally understanding scenes and videos. In the second part of this thesis, we propose techniques for searching for objects in an image and for events in a video. Our incremental algorithms use context from previously explored regions to prioritize the regions to explore next. The advantage of incremental understanding is that it limits the computation time and/or resources spent on various detection tasks. The first technique shows how to learn context in indoor scenes implicitly and use it to search for objects; the second shows how explicitly written context rules for one-on-one basketball can be used to sequentially detect events in a game. (en_US)
dc.identifier: https://doi.org/10.13016/M29V36
dc.identifier.uri: http://hdl.handle.net/1903/18603
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.title: Context Models for Understanding Images and Videos (en_US)
dc.type: Dissertation (en_US)
