University of Maryland Libraries: Digital Repository at the University of Maryland (DRUM)

    Productive Vision: Methods for Automatic Image Comprehension

    View/Open
    SummersStay_umd_0117E_14898.pdf (22.31 MB), 325 downloads
    Machinamenta.pdf (31.04 MB), 147 downloads

    Date
    2013
    Author
    Summers-Stay, Douglas Alan
    Advisor
    Aloimonos, Yiannis
    Abstract
    Image comprehension is the ability to summarize, translate, and answer basic questions about images. Using original techniques for scene object parsing, material labeling, and activity recognition, a system can gather information about the objects and actions in a scene. When this information is integrated into a deep knowledge base capable of inference, the system can perform tasks that, when performed by students, educators consider to demonstrate comprehension. The vision components of the system are: object scene parsing by means of visual filters, material scene parsing by superpixel segmentation and kernel descriptors, and activity recognition by action grammars. These techniques are characterized and compared with the state of the art in their respective fields. The output of the vision components is a list of assertions in a Cyc microtheory. By reasoning over these assertions and the rest of the Cyc knowledge base, the system is able to perform a variety of tasks, including the following:
    • Recognize that essential parts of objects are likely present in the scene despite not having an explicit detector for them.
    • Recognize the likely presence of objects from the presence of their essential parts.
    • Improve estimates of both object and material labels by incorporating knowledge about typical pairings.
    • Label ambiguous objects with a more general label that encompasses both possible labelings.
    • Answer questions about the scene that require inference, and justify the answers in natural language.
    • Create a visual representation of the scene in a new medium.
    • Recognize scene similarity even when there is little visual similarity.
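    The two part-whole inferences described above can be sketched in a few lines. This is a minimal illustration of the idea only: the part-whole facts, object names, and function names below are hypothetical examples, not the dissertation's actual Cyc microtheory, detectors, or API.

```python
# Illustrative sketch of part-based inference over scene assertions.
# ESSENTIAL_PARTS is a hypothetical stand-in for part-whole knowledge
# that a system like the one described would draw from a knowledge base.

ESSENTIAL_PARTS = {
    "car": {"wheel", "windshield"},
    "bicycle": {"wheel", "handlebar"},
}

def infer_objects(detected_parts):
    """Return whole objects all of whose essential parts were detected."""
    return {obj for obj, parts in ESSENTIAL_PARTS.items()
            if parts <= detected_parts}

def infer_parts(detected_objects):
    """Return parts likely present because a detected whole requires them."""
    parts = set()
    for obj in detected_objects:
        parts |= ESSENTIAL_PARTS.get(obj, set())
    return parts

# Detecting wheels and a windshield suggests a car is present,
# and detecting a bicycle suggests its essential parts are present.
print(infer_objects({"wheel", "windshield", "door"}))  # {'car'}
print(infer_parts({"bicycle"}))
```

    A real system would attach confidences to these assertions and reason over far richer relations (materials, activities, typical pairings) rather than a flat dictionary.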
    URI
    http://hdl.handle.net/1903/15131
    Collections
    • Computer Science Theses and Dissertations
    • UMD Theses and Dissertations

    DRUM is brought to you by the University of Maryland Libraries
    University of Maryland, College Park, MD 20742-7011 (301)314-1328.
    Please send us your comments.
    Web Accessibility