Theses and Dissertations from UMD

Permanent URI for this community: http://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay before a given thesis/dissertation appears in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1–4 of 4
  • TOWARDS EFFECTIVE DISPLAYS FOR VIRTUAL AND AUGMENTED REALITY
    (2020) Sun, Xuetong; Varshney, Amitabh; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Virtual and augmented reality (VR and AR) are becoming increasingly accessible and useful. This dissertation focuses on several aspects of designing effective displays for VR and AR. Compared to conventional desktop displays, VR and AR displays can better engage human peripheral vision, providing an opportunity for more information to be perceived. To fully leverage the human visual system, we must account for how it perceives content differently in the periphery than in the fovea. By investigating the relationship between perception time and eccentricity, we deduce a scaling function that allows content in the far periphery to be perceived as efficiently as content in central vision. AR overlays additional information on the real environment, which is useful in a number of fields, including surgery, where timely access to information is critical. We present a medical AR system that visualizes the occluded catheter in the external ventricular drainage (EVD) procedure. We develop an accurate and efficient catheter-tracking method that requires minimal changes to existing medical equipment. The AR display projects a virtual image of the catheter over the occluded real catheter to depict its position in real time. Our system can make the risky EVD procedure considerably safer. Existing VR and AR displays support a limited number of focal distances, leading to the vergence-accommodation conflict. Holographic displays can address this issue. In this dissertation, we explore the design and development of nanophotonic phased arrays (NPAs) as a special class of holographic displays. NPAs have the advantages of being compact and supporting very high refresh rates. However, their use of the thermo-optic effect for phase modulation makes them susceptible to the thermal proximity effect. We study how the proximity effect impacts the images formed on NPAs, then propose several novel algorithms to compensate for it and compare their effectiveness and computational efficiency. Computer-generated holography (CGH) has traditionally focused on 2D images and on 3D content in the form of meshes and point clouds, but volumetric data can also benefit from CGH. A key challenge in using volumetric data sources in CGH is the computational cost of calculating their holograms. We propose a new method that achieves a significant speedup over existing holographic volume rendering methods.
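    As a minimal illustration of the eccentricity-scaling idea, here is a sketch assuming a simple linear magnification model; the dissertation derives its actual scaling function from perception-time measurements, and the slope used below is a placeholder, not the fitted value:

        # Hypothetical linear model: content at eccentricity e (degrees)
        # is magnified by s(e) = 1 + K * e. K is an assumed placeholder.
        K = 0.045

        def scale_factor(eccentricity_deg):
            """Magnification applied at a given retinal eccentricity."""
            return 1.0 + K * eccentricity_deg

        def scaled_size(base_size_deg, eccentricity_deg):
            """Angular size at which to render content so the periphery
            perceives it as efficiently as central vision."""
            return base_size_deg * scale_factor(eccentricity_deg)

        # Under this assumed slope, a label subtending 1 degree at the
        # fovea would be drawn at roughly 3.7 degrees when placed at
        # 60 degrees of eccentricity.
        print(scaled_size(1.0, 60.0))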
  • Fusing Multimedia Data Into Dynamic Virtual Environments
    (2018) Du, Ruofei; Varshney, Amitabh; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Despite the dramatic growth of virtual and augmented reality (VR and AR) technology, content creation for immersive and dynamic virtual environments remains a significant challenge. In this dissertation, we present our research in fusing multimedia data, including text, photos, panoramas, and multi-view videos, to create rich and compelling virtual environments. First, we present Social Street View, which renders geo-tagged social media in its natural geo-spatial context provided by 360° panoramas. Our system takes into account visual saliency and uses maximal Poisson-disc placement with spatiotemporal filters to render social multimedia in an immersive setting. We also present a novel GPU-driven pipeline for saliency computation in 360° panoramas using spherical harmonics (SH); our spherical residual model can be applied to virtual cinematography in 360° videos. We further present Geollery, a mixed-reality platform that renders an interactive mirrored world in real time with three-dimensional (3D) buildings, user-generated content, and geo-tagged social media. Our user study identified several use cases for these systems, including immersive social storytelling, experiencing local culture, and crowd-sourced tourism. We next present Video Fields, a web-based interactive system to create, calibrate, and render dynamic videos overlaid on 3D scenes. Our system renders dynamic entities from multiple videos using early and deferred texture sampling; Video Fields can be used for immersive surveillance in virtual environments. Furthermore, we present the VRSurus and ARCrypt projects, which explore applications of gesture recognition, haptic feedback, and visual cryptography for virtual and augmented reality. Finally, we present our work on Montage4D, a real-time system for seamlessly fusing multi-view video textures with dynamic meshes. We use geodesics on meshes with view-dependent rendering to mitigate spatial occlusion seams while maintaining temporal consistency. Our experiments show significant enhancement in rendering quality, especially for salient regions such as faces. We believe that Social Street View, Geollery, Video Fields, and Montage4D will greatly facilitate applications such as virtual tourism, immersive telepresence, and remote education.
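    A minimal sketch of the maximal Poisson-disc placement rule used to space markers, written as greedy dart throwing in panorama pixel coordinates; the actual Social Street View pipeline additionally weights candidates by visual saliency and applies spatiotemporal filters, which are omitted here:

        import math
        import random

        def poisson_disc_place(candidates, radius):
            """Accept a candidate only if it lies at least `radius` away
            from every marker placed so far; one pass over the candidate
            pool yields a maximal subset with respect to that pool."""
            placed = []
            for x, y in candidates:
                if all(math.hypot(x - px, y - py) >= radius
                       for px, py in placed):
                    placed.append((x, y))
            return placed

        # Hypothetical geo-tagged posts projected onto a 4096x2048 panorama.
        posts = [(random.uniform(0, 4096), random.uniform(0, 2048))
                 for _ in range(200)]
        markers = poisson_disc_place(posts, radius=150.0)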
  • HandSight: A Touch-Based Wearable System to Increase Information Accessibility for People with Visual Impairments
    (2018) Stearns, Lee Stephan; Froehlich, Jon E; Chellappa, Rama; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Many activities of daily living, such as getting dressed, preparing food, wayfinding, or shopping, rely heavily on visual information, and the inability to access that information can negatively impact the quality of life of people with vision impairments. While numerous researchers have explored solutions for assisting with visual tasks that can be performed at a distance, such as identifying landmarks for navigation or recognizing people and objects, few have attempted to provide access to nearby visual information through touch. Touch is a highly attuned means of acquiring tactile and spatial information, especially for people with vision impairments. By supporting touch-based access to information, we may help users better understand how a surface appears (e.g., document layout, clothing patterns), thereby improving their quality of life. To address this gap in research, this dissertation explores methods to augment a visually impaired user’s sense of touch with interactive, real-time computer vision to access information about the physical world. These explorations span three application areas: reading and exploring printed documents, controlling mobile devices, and identifying colors and visual textures. At the core of each application is a system called HandSight that uses wearable cameras and other sensors to detect touch events and identify surface content beneath the user’s finger. To create HandSight, we designed and implemented the physical hardware, developed signal processing and computer vision algorithms, and designed real-time feedback that enables users to interpret visual or digital content. We involved visually impaired users throughout the design and development process, conducting several user studies to assess usability and robustness and to improve our prototype designs. The contributions of this dissertation include: (i) developing and iteratively refining HandSight, a novel wearable system to assist visually impaired users in their daily lives; (ii) evaluating HandSight across a diverse set of tasks, and identifying tradeoffs of a finger-worn approach in terms of physical design, algorithmic complexity and robustness, and usability; and (iii) identifying broader design implications for future wearable systems and for the fields of accessibility, computer vision, augmented and virtual reality, and human-computer interaction.
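    A minimal sketch of the touch-driven feedback loop at the core of such a system; all four callables below are hypothetical stand-ins rather than the HandSight API, which in reality pairs a finger-worn camera with custom vision algorithms and audio/haptic output:

        def reading_loop(frames, detect_touch, recognize_text, speak):
            """Speak the surface content under the user's finger whenever
            a touch event is detected in the camera stream."""
            last_spoken = None
            for frame in frames:
                if not detect_touch(frame):   # e.g., a contact classifier
                    continue
                text = recognize_text(frame)  # e.g., OCR on the touched patch
                if text and text != last_spoken:
                    speak(text)               # real-time audio feedback
                    last_spoken = text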
  • FINDING OBJECTS IN COMPLEX SCENES
    (2018) Sun, Jin; Jacobs, David; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Object detection is one of the fundamental problems in computer vision, with great practical impact. Current object detectors work well under certain conditions, but challenges arise when scenes become more complex. Scenes are often cluttered, and object detectors trained on Internet-collected data fail when objects’ appearance varies widely. We believe the key to tackling these challenges is understanding the rich context of objects in scenes, which includes: the appearance variations of an object due to viewpoint and lighting changes; the relationships between objects and their typical environment; and the composition of multiple objects in the same scene. This dissertation studies the complexity of scenes from these aspects. To facilitate collecting training data with large variations, we design a novel user interface, ARLabeler, utilizing the power of Augmented Reality (AR) devices. Instead of passively labeling images from the Internet, we put an observer in the real world with full control over the scene’s complexity. Users walk around freely and observe objects from multiple angles; lighting can be adjusted; and objects can be added to or removed from the scene to create rich compositions. Our tool opens new possibilities for preparing data for complex scenes. We also study the challenges of deploying object detectors in real-world scenes: detecting curb ramps in street view images. We propose Tohme, a system that combines detection results from automatic detectors with crowdsourced human verification. One core component is a meta-classifier that estimates the complexity of a scene and assigns it to humans (accurate but costly) or to the computer (low cost but error-prone) accordingly. One insight from Tohme is that context is crucial to detecting objects. To understand the complex relationship between objects and their environment, we propose a standalone context model that predicts where an object can occur in an image. Combined with an object detector, this model can find regions where an object is missing; it can also be used to find out-of-context objects. To go beyond single-object detection, we explicitly model the geometric relationships between groups of objects and use this layout information to represent scenes as a whole. We show that such a strategy is useful for retrieving indoor furniture scenes from natural language queries.
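    A minimal sketch of the meta-classifier routing idea described above, assuming any probabilistic classifier (scikit-learn style) trained to predict whether the automatic detector will fail on a scene; the feature set and the 0.5 threshold are assumptions, not values from the dissertation:

        def route_scene(scene_features, complexity_model, threshold=0.5):
            """Send hard scenes to human verification (accurate but costly)
            and easy scenes to the detector (low cost but error-prone)."""
            # Probability that the scene is too complex for the detector.
            p_hard = complexity_model.predict_proba([scene_features])[0][1]
            return "human" if p_hard >= threshold else "computer"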