Computer Science Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/2756

  • Context Driven Scene Understanding
    (2015) Chen, Xi; Davis, Larry S; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Understanding objects in complex scenes is a fundamental and challenging problem in computer vision. Given an image, we would like to determine whether an object of a particular category is present, where it is, and, if possible, to localize it with a bounding box or pixel-wise labels. In this dissertation, we present context-driven approaches that leverage relationships between objects in the scene to improve both the accuracy and the efficiency of scene understanding. In the first part, we describe an approach that jointly solves the segmentation and recognition problems using a multiple-segmentation framework with context. Our approach formulates a cost function based on contextual information in conjunction with appearance matching. A relaxed form of this cost function is minimized using an efficient quadratic programming solver, and an approximate solution is obtained by discretizing the relaxed solution. Our approach improves labeling performance compared to other segmentation-based recognition approaches. In the second part, we introduce a new problem called object co-labeling, where the goal is to jointly annotate multiple images of the same scene that do not have temporal consistency. We present an adaptive framework for joint segmentation and recognition to solve this problem. We propose an objective function that considers not only appearance but also appearance and context consistency across images of the scene. A relaxed form of the cost function is minimized using an efficient quadratic programming solver. Our approach improves labeling performance compared to labeling each image individually. We also show the application of our co-labeling framework to other recognition problems, such as label propagation in videos and object recognition in similar scenes.
    In the third part, we propose a novel general strategy for simultaneous object detection and segmentation. Instead of passively evaluating all object detectors at all possible locations in an image, we develop a divide-and-conquer approach that actively and sequentially evaluates contextual cues related to the query, based on the scene and previous evaluations, to decide where to search for the object, much like playing a game of "20 Questions". The questions are dynamically selected based on the query, the scene, and the responses observed so far from object detectors and classifiers. We first present an efficient object search policy based on the information gain of asking a question. We formulate the policy in a probabilistic framework that integrates current information and observations to update the model and determine the most informative action to take next. We further improve the power and generalization capacity of the Twenty Questions strategy by learning its policy from data. We formulate the problem as a Markov Decision Process and learn a search policy by imitation learning.
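The information-gain idea behind the "20 Questions" search in the abstract above can be illustrated with a small sketch. This is not the dissertation's implementation: the discrete belief over candidate regions, the yes/no question model, and every name below (`information_gain`, `next_question`, the example questions and their likelihoods) are assumptions made purely for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def posterior(belief, likelihood_yes, answer):
    """Bayes update of the region belief after a yes/no answer.

    likelihood_yes[i] is the assumed P(answer is yes | object in region i).
    """
    like = [l if answer else 1.0 - l for l in likelihood_yes]
    unnorm = [b * l for b, l in zip(belief, like)]
    z = sum(unnorm)
    return [u / z for u in unnorm] if z > 0 else list(belief)

def information_gain(belief, likelihood_yes):
    """Expected reduction in entropy from asking one yes/no question."""
    p_yes = sum(b * l for b, l in zip(belief, likelihood_yes))
    expected = 0.0
    for answer, p_ans in ((True, p_yes), (False, 1.0 - p_yes)):
        if p_ans > 0:
            expected += p_ans * entropy(posterior(belief, likelihood_yes, answer))
    return entropy(belief) - expected

def next_question(belief, questions):
    """Return the name of the question with the largest expected gain."""
    return max(questions, key=lambda name: information_gain(belief, questions[name]))

# Hypothetical example: four candidate regions with a uniform prior.
belief = [0.25, 0.25, 0.25, 0.25]
questions = {
    "is_there_a_table_nearby": [0.9, 0.9, 0.1, 0.1],  # separates the regions
    "is_it_indoors": [0.5, 0.5, 0.5, 0.5],            # carries no information
}
best = next_question(belief, questions)
belief = posterior(belief, questions[best], answer=True)
```

In this toy setup the greedy rule prefers the question that splits the candidate regions, and a "yes" answer concentrates the belief on the first two regions. A learned policy, as the abstract describes, would replace this hand-written greedy rule with one trained from data.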
  • Rover: Architectural Support for Exposing and Using Context
    (2010) Almazan, Christian Butiu; Agrawala, Ashok K; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Technology has advanced to the point where many people feel it has created a world with an overwhelming amount of information. This information includes the messages people send to each other, logged data from their activities, and the services available to them. The problem has been exacerbated in modern societies by the high availability of Internet connectivity. All types of information contain context, whether it is stated explicitly or understood implicitly. Understanding, handling, and using context is one of the most critical steps towards coping with the amount of information available today. In this dissertation, we examine two topics: context and the design of a context-aware platform. We describe fundamental types of context associated with every piece of information and discuss issues that may arise when implementing a system that uses context. We present a context-aware platform called Rover. The Rover architecture provides a conceptual framework for understanding how application developers can use a variety of aspects of context to assist the development of modern applications. To help developers determine what context may be useful in their applications, we describe the concept of a Rover ecosystem: a logical organization analogous to how similar groups of people interact with each other. We also discuss how information and context can be shared between ecosystems. To examine the feasibility of the Rover architecture's conceptual framework, we have built a reference implementation of the core unit of a Rover ecosystem: the Rover server. We discuss the details of the Rover server and describe the implementation of an emergency response application that demonstrates the utility of the conceptual framework.