Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a four-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
43 results
Search Results
Item SIMULATION, REPRESENTATION, AND AUTOMATION: HUMAN-CENTERED ARTIFICIAL INTELLIGENCE FOR AUGMENTING VISUALIZATION DESIGN (2024) Shin, Sungbok; Elmqvist, Niklas; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Data visualization is a powerful strategy for using graphics to represent data for effective communication and analysis. Unfortunately, creating effective data visualizations is a challenge for both novice and expert designers. The task often involves an iterative process of trial and error, which is by nature time-consuming. Designers frequently seek feedback to ensure their visualizations convey the intended message clearly to their target audience. However, obtaining feedback from peers can be challenging, and alternatives like user studies or crowdsourcing are costly and time-consuming. This suggests the potential for a tool that can provide design feedback for visualizations. To that end, I create a virtual, human vision-inspired system that looks at a visualization design and provides feedback on it using various AI techniques. The goal is not to replicate an exact version of a human eye. Instead, my work aims to develop a practical and effective system that delivers design feedback to visualization designers, utilizing advanced AI techniques such as deep neural networks (DNNs) and large language models (LLMs). My thesis includes three distinct works, each aimed at developing a virtual system inspired by human vision using AI techniques. Specifically, these works focus on simulation, representation, and automation, collectively progressing toward that aim. First, I develop a methodology to simulate human perception in machines through a virtual eye tracker named A SCANNER DEEPLY. This involves gathering eye gaze data from chart images and training a DNN on them.
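The abstract leaves the training setup at a high level. As an illustrative assumption only (not the dissertation's actual pipeline), gathered gaze fixations are commonly rasterized into Gaussian-smoothed attention heatmaps that serve as a DNN's per-image regression target; the function names below are hypothetical:

```python
import numpy as np

def gaze_heatmap(fixations, width, height, sigma=10.0):
    """Rasterize (x, y) gaze fixations into a Gaussian-smoothed heatmap.

    A per-image heatmap like this is a common regression target for a
    DNN that predicts where viewers look on a chart. (Illustrative only.)
    """
    yy, xx = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width))
    for fx, fy in fixations:
        heat += np.exp(-((xx - fx) ** 2 + (yy - fy) ** 2) / (2 * sigma ** 2))
    if heat.max() > 0:
        heat /= heat.max()  # normalize so the hottest region is 1.0
    return heat

# Two fixation clusters on a hypothetical 100x60-pixel chart
hm = gaze_heatmap([(20, 30), (22, 28), (80, 15)], width=100, height=60)
```

A network trained against such targets can then score an unseen chart by predicting where attention is likely to fall.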
Second, I focus on effectively and pragmatically representing a virtual human vision-inspired system by creating PERCEPTUAL PAT, which includes a suite of perceptually based filters. Third, I automate the feedback generation process with VISUALIZATIONARY, leveraging large language models to enhance the automation. I report on challenges and lessons learned about the key components and design considerations that help visualization designers. Finally, I end the dissertation by discussing future research directions for using AI to augment the visualization design process.

Item A Framework for Benchmarking Graph-Based Artificial Intelligence (2024) O'Sullivan, Kent Daniel; Regli, William C; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Graph-based Artificial Intelligence (GraphAI) encompasses AI problems formulated using graphs, operating on graphs, or relying on graph structures for learning. Contemporary Artificial Intelligence (AI) research explores how structured knowledge from graphs can enhance existing approaches to meet the real world’s demands for transparency, explainability, and performance. Characterizing GraphAI performance is challenging because different combinations of graph abstractions, representations, algorithms, and hardware acceleration techniques can trigger unpredictable changes in efficiency. Although benchmarks enable testing different GraphAI implementations, most cannot currently capture the complex interaction between effectiveness and efficiency, especially across dynamic knowledge graphs. This work proposes an empirical ‘grey-box’ approach to GraphAI benchmarking, providing a method that enables experimentally trading between effectiveness and efficiency across different combinations of graph abstractions, representations, algorithms, and hardware accelerators.
A systematic literature review yields a taxonomy of GraphAI tasks and a collection of intelligence and security problems that interact with GraphAI. The taxonomy and problem survey guide the development of a framework that fuses empirical computer science with constraint theory in an approach to benchmarking that does not require invasive workload analyses or code instrumentation. We formalize a methodology for developing problem-centric GraphAI benchmarks and develop a tool to create graphs from OpenStreetMap data to fill a gap in real-world mesh graph datasets required for benchmark inputs. Finally, this work provides a completed benchmark for the Population Segmentation intelligence and security problem, developed using the GraphAI benchmark problem development methodology. It provides experimental results that validate the utility of the GraphAI benchmark framework for evaluating if, how, and when GraphAI acceleration should be applied to the population segmentation problem.

Item A Multifaceted Quantification of Bias in Large Language Models (2023) Sotnikova, Anna; Daumé III, Hal; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Language models are rapidly developing, demonstrating impressive capabilities in comprehending, generating, and manipulating text. As they advance, they unlock diverse applications across various domains and become increasingly integrated into our daily lives. Nevertheless, these models, trained on vast and unfiltered datasets, come with a range of potential drawbacks and ethical issues. One significant concern is the potential amplification of biases present in the training data, generating stereotypes and reinforcing societal injustices when language models are deployed. In this work, we propose methods to quantify biases in large language models.
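One common family of such measurements, shown here only as a toy illustration (the helper names and the stand-in scoring function are hypothetical, not the dissertation's method), compares a model's log-probability of attribute sentences across different social groups:

```python
def association_score(logprob, groups, attributes):
    """Mean log-probability of attribute sentences for each social group.

    A systematic gap between groups signals a stereotypical association.
    `logprob` stands in for a real language model's scoring function.
    """
    return {
        g: sum(logprob(f"{g} are {a}.") for a in attributes) / len(attributes)
        for g in groups
    }

# Toy stand-in "model" that is biased toward linking group A with "logical"
def toy_logprob(sentence):
    if sentence.startswith("group A") and "logical" in sentence:
        return -1.0
    return -3.0

scores = association_score(toy_logprob, ["group A", "group B"], ["logical", "kind"])
bias_gap = scores["group A"] - scores["group B"]  # positive => skew toward group A
```

With a real model, `logprob` would query the LLM and the group/attribute lists would come from curated stereotype lexicons.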
We examine stereotypical associations for a wide variety of social groups characterized by both single and intersectional identities. Additionally, we propose a framework for measuring stereotype leakage across different languages within multilingual large language models. Finally, we introduce an algorithm that allows us to optimize human data collection under conditions of high human disagreement.

Item FAST FEASIBLE MOTION PLANNING WITHOUT TWO-POINT BOUNDARY VALUE SOLUTION (2023) Nayak, Sharan Harish; Otte, Michael; Aerospace Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Autonomous robotic systems have seen extensive deployment across domains such as manufacturing, industrial inspection, transportation, and planetary surface exploration. A crucial requirement for these systems is navigating from an initial to a final position while avoiding collisions with obstacles en route. This challenging task of devising collision-free trajectories, formally termed motion planning, is of prime importance in robotics. Traditional motion planning approaches encounter scalability challenges when planning in higher-dimensional state spaces. Moreover, they rarely consider robot dynamics during the planning process. To address these limitations, a class of probabilistic planning methods called Sampling-Based Motion Planning (SBMP) has gained prominence. SBMP strategies exploit probabilistic techniques to construct motion planning solutions. In this dissertation, our focus turns to feasible SBMP algorithms that prioritize rapidly discovering solutions while respecting robot kinematics and dynamics. These algorithms are useful for quickly solving complex problems (e.g., the Alpha puzzle) where obtaining any feasible solution is considered an achievement. Furthermore, they find practical use in computationally constrained systems and in seeding time-consuming optimal planners.
However, many existing feasible SBMP approaches assume the ability to find precise trajectories that exactly connect two states in a robot's state space. This challenge is framed as the Two-Point Boundary Value Problem (TPBVP). But closed-form solutions for the TPBVP are difficult to find, and numerical approaches are computationally expensive and prone to precision and stability issues. Given these complexities, this dissertation's primary focus is the development of SBMP algorithms for scenarios where solving the TPBVP is challenging. Our work addresses four distinct scenarios -- two for single-agent systems and two for multi-agent systems. The first single-agent scenario involves quickly finding a feasible path from the start to the goal state, using bidirectional search strategies for fast solution discovery. The second scenario focuses on prompt motion replanning when a vehicle's dynamical constraints change mid-mission. We leverage heuristic information from the original search tree, constructed using the vehicle's nominal dynamics, to speed up the replanning process. Both of these scenarios unfold in static environments with known obstacles. Transitioning to multi-agent systems, we address the feasible multi-robot motion planning problem, where a robot team is guided toward predefined targets in a partially known environment. We employ a dynamic roadmap, updated from the currently known environment, to accelerate agent planning. Lastly, we explore the problem of multi-robot exploration in a completely unknown environment, applied to the CADRE mission. We demonstrate how our proposed bidirectional search strategies can facilitate efficient exploration for robots with significant dynamics.
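The forward-propagation idea behind BVP-free planners can be sketched in a few lines (a hedged illustration only; the dissertation's algorithms, dynamics models, and heuristics are more sophisticated, and obstacle checking is omitted here). Each extension never solves a TPBVP: it simulates candidate motions forward from the nearest tree node and keeps the end state closest to the random sample:

```python
import math
import random

def rrt_forward(start, goal, steps=2000, step=0.5, goal_tol=0.5,
                goal_bias=0.1, seed=1):
    """Feasible motion planning by forward propagation only.

    Rather than solving a two-point boundary value problem to connect
    two sampled states exactly, each extension forward-simulates a few
    candidate motions from the nearest tree node and keeps the end state
    closest to the random sample.
    """
    rng = random.Random(seed)
    parent = {start: None}  # child state -> parent state
    for _ in range(steps):
        sample = goal if rng.random() < goal_bias else \
                 (rng.uniform(0, 10), rng.uniform(0, 10))
        near = min(parent, key=lambda s: math.dist(s, sample))
        # candidate motions: straight at the sample plus random headings
        base = math.atan2(sample[1] - near[1], sample[0] - near[0])
        headings = [base] + [rng.uniform(-math.pi, math.pi) for _ in range(4)]
        ends = [(near[0] + step * math.cos(h), near[1] + step * math.sin(h))
                for h in headings]
        new = min(ends, key=lambda s: math.dist(s, sample))
        parent[new] = near
        if math.dist(new, goal) <= goal_tol:
            path, s = [], new  # reconstruct by walking parent links
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
    return None

path = rrt_forward((1.0, 1.0), (6.0, 6.0))
```

The same loop generalizes to richer dynamics by replacing the straight-line motion with a forward integration of sampled controls.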
The effectiveness of our algorithms is validated through extensive simulation and real-world experiments.

Item Minimal Perception: Enabling Autonomy on Resource-Constrained Robots (2023) Singh, Chahat Deep; Aloimonos, Yiannis; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Mobile robots are widely used and crucial in diverse fields due to their autonomous task performance. They enhance efficiency and safety, and enable novel applications like precision agriculture, environmental monitoring, disaster management, and inspection. Perception plays a vital role in their autonomous behavior, enabling environmental understanding and interaction. Perception in robots refers to their ability to gather, process, and interpret environmental data, enabling autonomous interactions. It facilitates navigation, object identification, and real-time reactions. By integrating perception, robots achieve onboard autonomy, operating without constant human intervention, even in remote or hazardous areas. This enhances adaptability and scalability. This thesis explores the challenge of developing autonomous systems for smaller robots used in precise tasks like confined-space inspection and robotic pollination. These robots face limitations in real-time perception due to computing, power, and sensing constraints. To address this, we draw inspiration from small organisms such as insects and hummingbirds, known for their sophisticated perception, navigation, and survival abilities despite their minimalistic sensory and neural systems. This research aims to provide insights into designing compact, efficient, and minimal perception systems for tiny autonomous robots. Embracing this minimalism is paramount to unlocking the full potential of tiny robots and enhancing their perception systems.
By streamlining and simplifying their design and functionality, these compact robots can maximize efficiency and overcome limitations imposed by size constraints. In this work, a Minimal Perception framework is proposed that enables onboard autonomy in resource-constrained robots at scales (as small as a credit card) that were not possible before. Minimal perception refers to a simplified, efficient, and selective approach, from both hardware and software perspectives, to gathering and processing sensory information. Adopting a task-centric perspective allows for further refinement of the minimalist perception framework for tiny robots. For instance, certain animals like jumping spiders, measuring just 1/2 inch in length, demonstrate minimal perception capabilities through sparse vision facilitated by multiple eyes, enabling them to efficiently perceive their surroundings and capture prey with remarkable agility. This thesis introduces a cutting-edge exploration of the minimal perception framework, pushing the boundaries of robot autonomy. The contributions of this work can be summarized as follows:
1. Utilizing minimal quantities such as uncertainty in optical flow, and its untapped potential, to enable autonomous navigation, static and dynamic obstacle avoidance, and the ability to fly through unknown gaps.
2. Utilizing the principles of interactive perception to propose novel object segmentation in cluttered environments, eliminating the reliance on neural network training for object recognition.
3. Introducing a generative simulator called WorldGen that can generate countless cities and petabytes of high-quality annotated data, designed to minimize the demanding need for laborious 3D modeling and annotation, thus unlocking unprecedented possibilities for perception and autonomy tasks.
4. Proposing a method to predict metric dense depth maps in never-seen or out-of-domain environments by fusing information from a traditional RGB camera and a sparse 64-pixel depth sensor.
5. Demonstrating the autonomous capabilities of tiny robots on both aerial and ground platforms: (a) an autonomous car smaller than a credit card (70 mm), and (b) a bee drone with a length of 120 mm, showcasing navigation abilities, depth perception in all four main directions, and effective avoidance of both static and dynamic obstacles.
In conclusion, the integration of the minimal perception framework in tiny mobile robots heralds a new era of possibilities, signaling a paradigm shift in unlocking their perception and autonomy potential. This thesis serves as a milestone toward reshaping the landscape of mobile robot autonomy, working toward a future where tiny robots operate synergistically in swarms, revolutionizing fields such as exploration, disaster response, and distributed sensing.

Item MOLECULAR RECOGNITION PROPERTIES OF MOLECULAR CONTAINERS IN AQUEOUS SOLUTIONS (2023) King, David; Isaacs, Lyle D; Chemistry; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Supramolecular containers take advantage of non-covalent interactions to perform a variety of tasks with high affinity. In particular, water-soluble containers are able to bind biologically relevant molecules to perform useful and interesting tasks. Chapter 1 introduces the field of supramolecular chemistry and establishes the ability of cucurbit[n]urils (CB[n]) to bind guests with high affinity. It also establishes the uses of water-soluble supramolecular containers, including new-generation pillar[n]arene sulfate (P[n]AS) hosts, in biologically relevant systems.
Chapter 2 expands on previous attempts at finding high-affinity host-guest pairings by showing that triamantane amines and triamantane diamines are able to bind CB[8] with femtomolar dissociation constants. It also shows that these ultratight binding complexes can be measured in competition experiments against slightly weaker ternary complexes, thus reducing the number of measurements needed and the error of those measurements. Chapter 3 shows the discriminatory power of P6AS toward various amino acids and amino acid amides, as well as their methylated derivatives. This discriminatory power is further explored by showing that P6AS discriminates between histone 3 peptide sequences that are methylated on either lysine or arginine. This system was also modeled computationally to investigate the role of water in binding affinity. Chapter 4 expands on the use of P[n]AS in biologically relevant systems by measuring binding constants and developing an assay to detect and differentiate various World Anti-Doping Agency (WADA) banned compounds in PBS. The same assay was then used to create a calibration curve in simulated urine for two compounds. In total, the proof-of-concept assay is able to detect Pseudo down to 31.8 μM concentrations.

Item A Goal, Question, Metric Approach to Coherent Use Integration Within the DevOps Lifecycle (2022) Rassmann, Kelsey Anne; Regli, William; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The development of high-stakes artificial intelligence (AI) technology creates a possibility for disastrous errors of misuse and disuse. Despite these risks, AI still needs to be developed in a timely manner, as it has the potential to positively impact users and their surrounding environment.
High-stakes AI needs to “move fast” but it must not “break things.” This thesis provides developers with a methodology that allows them to establish human-AI coherency while maintaining the development speed of the DevOps software development lifecycle. First, I present a model of human-machine interaction (HMI) that motivates a new mode of AI use entitled ‘Coherent Use.’ Then, I describe a Goal, Question, Metric approach to maximizing Coherent Use that integrates directly into the DevOps lifecycle. Finally, I simulate the usage of this template on an existing software product.

Item Towards Trust and Transparency in Deep Learning Systems through Behavior Introspection & Online Competency Prediction (2021) Allen, Julia Filiberti; Gabriel, Steven A.; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Deep neural networks are naturally “black boxes”, offering little insight into how or why they make decisions. These limitations diminish the likelihood that such systems will be adopted for important tasks and as trusted teammates. We employ introspective techniques to abstract machine activation patterns into human-interpretable strategies and identify relationships between environmental conditions (why), strategies (how), and performance (result) in two settings: a deep reinforcement learning two-dimensional pursuit game and image-based deep supervised obstacle recognition. Pursuit-evasion games have been studied for decades under perfect information, with analytically derived policies for static environments. We incorporate uncertainty in a target's position via simulated measurements and demonstrate a novel continuous deep reinforcement learning approach against speed-advantaged targets. The resulting approach was tested under many scenarios, and its performance exceeded that of a baseline course-aligned strategy.
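To make the setting concrete, here is a hedged sketch (not the dissertation's learned policy; all names and constants are illustrative) of a pure-pursuit baseline acting on noise-corrupted measurements of the target. Against a speed-advantaged evader the gap only grows, which is what motivates a learned controller:

```python
import math
import random

def pursue_step(pos, target, speed, noise, rng):
    """Advance the pursuer one step toward a noise-corrupted measurement
    of the target's position (simulated sensing uncertainty)."""
    meas = (target[0] + rng.gauss(0, noise), target[1] + rng.gauss(0, noise))
    h = math.atan2(meas[1] - pos[1], meas[0] - pos[0])
    return (pos[0] + speed * math.cos(h), pos[1] + speed * math.sin(h))

rng = random.Random(0)
pursuer, evader = (0.0, 0.0), (5.0, 0.0)
for _ in range(50):
    evader = (evader[0] + 0.12, evader[1])  # evader is speed-advantaged
    pursuer = pursue_step(pursuer, evader, speed=0.1, noise=0.2, rng=rng)
# Pure pursuit cannot close the gap against a faster target.
```

A reinforcement-learned policy would replace the fixed heading rule with a trained mapping from (noisy) observations to continuous controls.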
We manually observed separation of learned pursuit behaviors into strategy groups and manually hypothesized environmental conditions that affected performance. These manual observations motivated automation and abstraction of the relationships among conditions, performance, and strategies. Next, we found that deep network activation patterns could be abstracted into human-interpretable strategies for two separate deep learning approaches. We characterized machine commitment by introducing a novel measure and revealed significant correlations between machine commitment, strategies, environmental conditions, and task performance. As such, we motivated online exploitation of machine behavior estimation for competency-aware intelligent systems. And finally, we realized online prediction capabilities for conditions, strategies, and performance. Our competency-aware machine learning approach is easily portable to new applications due to its Bayesian nonparametric foundation, wherein all inputs are transformed into the same compact data representation. In particular, image data is transformed into a probability distribution over features extracted from the data. The resulting transformation forms a common representation for comparing two images, possibly from different types of sensors.
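As a simplified stand-in for that common representation (the dissertation's transformation is Bayesian nonparametric; a fixed-bin histogram is only an assumption for illustration), features from any sensor can be collapsed into a probability distribution and compared with a standard distributional distance:

```python
import numpy as np

def feature_distribution(features, bins=8, value_range=(0.0, 1.0)):
    """Collapse an arbitrary set of scalar feature values into a
    probability distribution, so inputs from different sensors share
    one compact, directly comparable representation."""
    hist, _ = np.histogram(features, bins=bins, range=value_range)
    return hist / hist.sum()

def hellinger(p, q):
    """Hellinger distance between two distributions, in [0, 1]."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

a = feature_distribution([0.10, 0.12, 0.15, 0.60])
b = feature_distribution([0.10, 0.11, 0.14, 0.62])  # similar input
c = feature_distribution([0.90, 0.95, 0.85, 0.88])  # dissimilar input
d_similar, d_different = hellinger(a, b), hellinger(a, c)
```

Two images, however sourced, reduce to two such distributions, and their distance drives the competency estimates.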
By uncovering relationships between environmental conditions (why), machine strategies (how), and performance (result), and by giving rise to online estimation of machine competency, we increase transparency and trust in machine learning systems, contributing to the overarching explainable artificial intelligence initiative.

Item ADVANCED VISION INTELLIGENT METHODS FOR FOOD, AGRICULTURAL, AND HEALTHCARE APPLICATIONS (2021) Wang, Dongyi; Tao, Yang; Bioengineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
With fast software and hardware developments, vision intelligence models have attracted great attention and shown unprecedented performance on large-scale datasets. In practice, studies are still needed to design innovative intelligence models for niche applications with limited data accessibility in uncertain real-world scenarios. This research casts light on cutting-edge vision intelligence applications that enhance essential areas of people’s livelihood, including food, agriculture, and healthcare. First, a 2D/3D imaging system was developed to facilitate the autonomous processing of Chesapeake Bay blue crabs, as an efficient alternative to current hand-picking protocols. The system integrates a semantic segmentation model to understand crab 2D morphology. It can detect crab back-fin knuckles with an R² larger than 0.995, which guides the movements of a two-degree-of-freedom gantry station in removing crab legs and extracting crab body cores with 2 mm accuracy. The customized active laser-line-scanning 3D range imaging system shows high imaging accuracy (0.15 mm) and is able to assist a linear actuator in removing crab chamber cartilages. Second, computer-aided vision intelligent methods were applied to an emerging ophthalmologic imaging modality known as erythrocyte-mediated angiography.
A novel regression-based segmentation model and a Monte Carlo-based tracking method were proposed to monitor erythrocytes in stasis and in motion. Both models displayed performance comparable to human experts. Preliminary clinical results also suggest potential relationships between paused erythrocyte densities and primary open-angle glaucoma. To better understand retinal vessel and erythrocyte distributions, a novel network architecture, the Hard Attention Net, was proposed. This network has achieved state-of-the-art retinal vessel segmentation performance across different ophthalmologic imaging modalities. Finally, deep learning-based qualitative and quantitative analyses were applied to spectral signals for monitoring the high-level status and low-level chemical properties of agricultural bioproducts. Experiments include early-stage tomato spotted wilt virus detection as well as nutrition content estimation of plants and corn kernels. By using adversarial training and feature-weighting ideas, the two proposed networks were effectively trained with a limited dataset. The results of these studies show the great potential of vision intelligence models for promoting applications of advanced imaging modalities and vision-guided automation in the food, agricultural, and healthcare fields.

Item AN ANALYSIS OF BOTTOM-UP ATTENTION MODELS AND MULTIMODAL REPRESENTATION LEARNING FOR VISUAL QUESTION ANSWERING (2019) Narayanan, Venkatraman; Shrivastava, Abhinav; Systems Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
A Visual Question Answering (VQA) task is the ability of a system to take an image and an open-ended, natural language question about the image and provide a natural language text answer as the output. VQA is a relatively nascent field, with only a few strategies explored so far.
The performance of VQA systems, in terms of the accuracy of answers to image-question pairs, requires considerable improvement before such systems can be used in practice. A typical system for the VQA task consists of an image encoder network, a question encoder network, a multimodal attention network that combines the information obtained from the image and the question, and an answering network that generates natural language answers for the image-question pair. In this thesis, we follow two strategies to improve the performance (accuracy) of VQA. The first is a representation learning approach (utilizing state-of-the-art Generative Adversarial Networks (GANs) (Goodfellow et al., 2014)) to improve the image encoding stage of VQA. This thesis evaluates four variants of GANs to identify an architecture that best captures the data distribution of the images; it was determined that the GAN variants become unstable and fail to serve as a viable image encoding system for VQA. The second strategy is to evaluate an alternative approach to the attention network, using multimodal compact bilinear pooling, in the existing VQA system. The second strategy led to an increase in the accuracy of VQA by 2% compared to the previous state-of-the-art technique.
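Multimodal compact bilinear pooling, the second strategy above, approximates the outer product of the image and question feature vectors without ever materializing it, by multiplying the modalities' count sketches in the FFT domain. A minimal numpy sketch of the idea (the dimensions and seeds here are arbitrary, not the configuration evaluated in the thesis):

```python
import numpy as np

def count_sketch(x, h, s, d):
    """Project a feature vector into a d-dimensional sketch using a
    random hash h (bucket per input dim) and random signs s."""
    y = np.zeros(d)
    np.add.at(y, h, s * x)
    return y

def mcb_pool(v_img, v_txt, d=64, seed=0):
    """Multimodal compact bilinear pooling: the product of the two
    sketches' FFTs approximates the sketch of the full outer product,
    avoiding the n*m bilinear feature."""
    rng = np.random.default_rng(seed)
    out = np.ones(d, dtype=complex)
    for v in (v_img, v_txt):
        h = rng.integers(0, d, size=v.shape[0])       # hash buckets
        s = rng.choice([-1.0, 1.0], size=v.shape[0])  # random signs
        out *= np.fft.fft(count_sketch(v, h, s, d))
    return np.real(np.fft.ifft(out))

fused = mcb_pool(np.random.default_rng(1).normal(size=512),
                 np.random.default_rng(2).normal(size=300))
```

The fused vector then feeds the answering network in place of a simple concatenation of the two modalities.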