Computer Science Theses and Dissertations

Permanent URI for this collection

Browse

Recent Submissions

Now showing 1 - 20 of 969
  • Item
    Analyzing and indexing huge reference sequence collections
    (2023) Fan, Jason; Patro, Rob; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Recent advancements in genome-scale assays and high throughput sequencing have made systematic measurement of model-organisms both accessible and abundant. As a result, novel algorithms that exploit similarities across multiple samples or compare measurements against multiple reference organisms have been designed to improve analyses and gain new insights. However, such models and algorithms can be difficult to apply in practice. Furthermore, analysis of high-throughput sequencing data across multiple samples and multiple reference genomic sequences can be prohibitively costly in terms of space and time. In three parts, this dissertation investigates novel computational techniques that improve analyses at various scales. In Part I, I present two general matrix-factorization algorithms designed to analyze and compare biological measurements of related species that can be summarized as networks. In Part II, I present methods that improve analyses of high-throughput sequencing data. The first method, SCALPELSIG, reduces the computation burden of applying mutational signature analysis in resource limited settings; and the second method, a derivation of perplexity for gene and transcript expression estimation models, enables effective model se- lection in experimental RNA-seq data where ground-truth is absent. In Part III, I tackle the difficulties of indexing and analyzing huge collections reference sequences. I introduce the spectrum preserving tiling (SPT), a new computational and mathematical abstraction. Mathematically, the SPT explicitly relates past work on compactly representing k-mer sets --- namely the compacted de Bruijn graph and recent derivations of spectrum preserving string sets --- to the indexing of k-mer positions and metadata in reference sequences. Computationally, the SPT makes possible an entire class of efficient and modular k-mer indexes. I introduce a pair of indexing schemes respectively designed to efficiently support rapid locate and k-mer "color" queries in small space. In the final Chapter of this dissertation, I show how these modular indexes can be effectively and generically implemented.
  • Item
    IMPROVED ALGORITHMS AND PRIMITIVES FOR QUANTUM CRYPTOGRAPHY
    (2023) Rodrigues, Nishant; Lackey, Brad; Childs, Andrew; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In this dissertation we develop novel primitives and algorithms in quantum cryptography, specifically for quantum key distribution and random number generation. We show a device-independent quantum key distribution (DIQKD) algorithm that is based on the notion of synchronous correlations. Most algorithms in DIQKD literature rely on the well-known CHSH inequality which is neither symmetric nor synchronous. We propose a new synchronous Bell inequality that simplifies the QKD setting by being fully symmetric so that the roles of the two parties in the protocol, Alice and Bob, are completely interchangeable. This has implications for QKD hardware since an identical set of devices can be produced for both parties instead of separate devices for each. We also achieve key rates comparable to CHSH-based protocols. This dissertation also focuses on closing the causality or locality loophole present in device-independent schemes. An assumption that is critical to device-independent protocols is that the two parties are acausally separated and cannot communicate with each other once they receive their inputs. This is typically referred to as the nonsignaling condition. If the condition of nonsignaling is violated, then an attacker may simulate the entire protocol classically. This erases any certificate of quantumness produced by the violation of a Bell inequality. We pose a new security assumption with respect to the adversary's uncertainty about the two parties' measurement bases. We derive a bound for this uncertainty and show that if the uncertainty grows any larger than the threshold, there is no strategy any adversary can use to cheat in the protocol. This closes the causality loophole and makes the protocol easier to implement for practical use. A primitive widely used in QKD and other cryptographic protocols is a random bit generator. We define ideal and real models of random bit generators and show their efficiency and security in the Constructive Cryptography framework. We specifically look at random bit generators based on process tomography of one-qubit channels. We consider ideal quantum random bit generators, then introduce some state preparation and bit-flip errors to define real quantum random bit generators. We show that the ideal and real generators are close to each other in statistical distance. The third part of this dissertation presents some initial ideas for quantum lattice sieving algorithms. Lattices are very important objects in the effort to construct cryptographic primitives that are secure against quantum attacks. A central problem in the study of lattices is that of finding the shortest non-zero vector in the lattice. Asymptotically, sieving is the best known technique for solving the shortest vector problem, however, sieving requires memory exponential in the dimension of the lattice. This work tries to provide better memory complexity while also improving runtime. Our ideas are inspired by classical heuristic sieving algorithms and make an attempt to quantize those algorithms.
  • Item
    CHARACTERIZING AND IMPROVING MENTAL MODELS OF SECURE COMMUNICATION TOOLS
    (2023) Akgul, Omer; Mazurek, Michelle L.; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Investigating the adoption and correct use of secure communication tools has been an open research question for the past two decades, starting with research on encrypted email and moving on to a wide range of security & privacy tools. Barriers to adoption include; usability issues, social factors (e.g., network effects), and misaligned mental models. More recently, as secure instant messengers have seen widespread adoption—with some products reaching billions of users—and VPNs have become increasingly popular, researchers have uncovered a fundamental problem; misalignments between what secure communication tools can offer and what users believe. Even though users might have already adopted the state-of-the-art secure communication tools, insufficient mental models lead to users overestimating capabilities, or, erroneously thinking other tools are more appropriate for use. Both scenarios are likely to hurt users’ security & privacy postures. In this thesis I describe my approach to characterizing and improving mental models, I detail background and related works, my contributions, and discuss implications. Among my main contributions, I first describe my work on an emerging vector of questionable security information, influencer VPN ads on YouTube. I show how this questionable information is linked to real-world mental models. Next, I detail my investigation into how the description of security & privacy technology, another source of influence, changes mental models. Further, I report on my efforts to improve mental models when they are already misaligned in the context of end-to-end encryption. A related study exploring best user sampling practices for studies like mine is discussed.
  • Item
    Learning-based Motion Planning for High-DoF Robot Systems
    (2023) Jia, Biao; Manocha, Dinesh; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    A high-degree-of-freedom (DoF) robot system refers to a type of robotic system that possesses many independently controllable mechanical degrees of freedom. This includes high-DoF robots or objects being manipulated, such as flexible robotic arms and flexible objects. Degrees of freedom in robotics represent the different ways a robot can move or manipulate its parts. High-DoF robot systems have a significant number of these independent motions, allowing them to exhibit complex and versatile movements and behaviors. These systems are employed in various applications, including manufacturing and healthcare, where precise and flexible control is essential. The main difficulty associated with high-DoF robot systems is the complexity arising from their numerous degrees of freedom. Calculating the optimal trajectories or control inputs for high-DoF systems can be computationally intensive. The sheer number of variables and the need for real-time responsiveness pose significant challenges in terms of computation and control. In some cases, high-DoF robot systems interact with deformable objects such as fabrics and foam. Modeling and controlling these objects add additional layers of complexity due to their dynamic and unpredictable behavior. To address these challenges, we delve into several key areas: Object Deformation Modeling, Controller Parameterization, System Identification, Control Policy Learning, and Sim-to- Real Transfer. We begin by using cloth manipulation as an example to illustrate how to model high-DoF objects and design mapping relationships. By leveraging computer vision and visual feedback-based controllers, we enhance the ability to model and control objects with substantial shape variations, which is particularly valuable in applications involving deformable materials. Next, we shift our focus to Controller Parameterization, aiming to define control parameters for high-DoF objects. We employ a random forest-based controller along with imitation learning, resulting in more robust and efficient controllers, which are essential for high-DoF robot systems. This method can be used for human-robot collaboration involving flexible objects and enables imitation learning to converge in as few as 4-5 iterations. Furthermore, we explore how to reduce the dimensionality of both high-degree-of-freedom (high-DoF) robot systems and objects simultaneously. Our system allows for the more effective use of computationally intensive methods like reinforcement learning (RL) or trajectory optimization. Therefore, we design a system identification method to reduce the need for repeated rendering or experiments, significantly improving the efficiency of RL. This enables some algorithms with exponential computational complexity to be solved in linear time. In this part of the work, we adopt a real setup where humans and robots collaborate in real-time to manipulate flexible objects. In the second part of our research, we focus on the task of natural media painting. We utilize reinforcement learning techniques. Painting itself can be considered a high-DoF robot system, as it entails a multitude of context-dependent actions to complete the task. Our objective is to replicate a reference image using brush strokes, with the goal encoded through observations. We will focus on how to address the sparse reward distribution with a large continuous action space. Additionally, we investigate the practicality of transferring learned policies from simulated environments to real-world scenarios, with a specific focus on tasks like painting. This research bridges the gap between simulation and practical application, ensuring that the knowledge gained from our work can be effectively utilized in real-world settings. Ultimately, we will demonstrate the use of RL-learned painting strategies in both virtual and real robot environments.
  • Item
    A Multifaceted Quantification of Bias in Large Language Models
    (2023) Sotnikova, Anna; Daumé III, Hal; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Language models are rapidly developing, demonstrating impressive capabilities in comprehending, generating, and manipulating text. As they advance, they unlock diverse applications across various domains and become increasingly integrated into our daily lives. Nevertheless, these models, trained on vast and unfiltered datasets, come with a range of potential drawbacks and ethical issues. One significant concern is the potential amplification of biases present in the training data, generating stereotypes and reinforcing societal injustices when language models are deployed. In this work, we propose methods to quantify biases in large language models. We examine stereotypical associations for a wide variety of social groups characterized by both single and intersectional identities. Additionally, we propose a framework for measuring stereotype leakage across different languages within multilingual large language models. Finally, we introduce an algorithm that allows us to optimize human data collection in conditions of high levels of human disagreement.
  • Item
    NUMERICAL ACOUSTICS FOR PHYSICAL AND SIMULATED ENVIRONMENTS
    (2023) Kaneko, Shoken Eckhart; Duraiswami, Ramani; Gumerov, Nail A; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Computer modeling and numerical analysis of acoustical phenomena have important applications including manufacturing, audio technologies in immersive multimedia, and machine learning systems involving audio. The focus of the present dissertation is the exploration of numerical methods for modeling, simulating, synthesizing, estimating, processing, controlling, and analyzing acoustical phenomena in the physical world as well as its applications to the virtual world, i.e. immersive technologies for creating virtual, augmented, and extended realities.The dissertation is structured as follows. In chapter 1, I introduce some fundamentals and basic concepts of numerical acoustics and discuss existing practical problems in acoustics. In chapter 2 and chapter 3, I propose two novel techniques for three-dimensional sound field capturing end encoding for immersive audio applications, which are both based on (semi-)analytical cancellation of scattering caused by microphone arrays mounted on acoustic scatterers. In chapter 4 and chapter 5, I introduce a fast algorithm for synthesizing acoustic impulse responses in large-scale forests, and use it to predict the performance of acoustic wildlife monitoring systems based on large-scale distributed microphone arrays. In chapter 6, I propose a novel general-purpose individual-agnostic binaural localizer which supports sound source localization from arbitrary directions without a priori knowledge of the process generating the binaural signal. In chapter 7 and chapter 8, I develop frameworks for regularized active sound control, using either point- or mode-control and using either distributed or local worn loudspeaker and microphone arrays with applications including speech privacy, personal active noise control, and local crosstalk cancellation with limited noise injection into the environment. In chapter 9, chapter 10 and chapter 11, three numerical methods for evaluating integrals arising in the (fast multipole accelerated) boundary element method are introduced. In chapter 9, a recursive algorithm is developed which allows efficient analytical evaluation of singular and nearly singular layer potential integrals arising in the boundary element method using flat high-order elements for Helmholtz and Laplace equations. In chapter 10, a differential geometry-based quadrature algorithm is developed which allows accurate evaluation of singular and nearly singular layer potential integrals arising in the boundary element method using smooth manifold boundary elements with constant densities for Helmholtz and Laplace equations. In chapter 11, an algorithm for efficient exact evaluation of integrals of regular solid harmonics over high-order boundary elements with simplex geometries is developed. In chapter 12, I discuss future research directions and conclude the dissertation.
  • Item
    Fair and Scalable Algorithms on Massive Graphs
    (2023) Knittel, Marina; Hajiaghayi, MohammadTaghi; Dickerson, John; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The impact of massive data processing on our daily lives is becoming increasingly more clear. With this, we must guarantee that vital, data-driven decision-making systems mitigate the potential negative impacts of the technologies themselves and scale to the massive datasets we have access to today. We explore both of these facets by studying fairness and scalability for algorithms on large graphs. In Part I, we focus on on fair hierarchical clustering. Our first work on this topic [Ahmadian et al., 2020b] initiates the study of fair hierarchical clustering by extending Chierichetti et al.’s [Chierichetti et al., 2017] notion of representationally proportional flat clustering to the hierarchical setting. From there, we introduce the first approximation algorithms for three well studied hierarchical clustering optimization problems in the fair context: cost [Dasgupta, 2016], revenue [Moseley and Wang, 2017], and value [Cohen-Addad et al., 2018]. Our initial work studies all three fair optimization problems, and our follow-up works [Knittel et al., 2023a, Knittel et al., 2023b] dive deeper into the notoriously difficult cost optimization problem. Regarding scalability, we leverage the Massively Parallel Computation (MPC) model, as well as its recent successor Adaptive Massively Parallel Computation (AMPC), to develop efficient graph algorithms for big data. MPC, discussed in Part II, has been one of the most practically useful models for massively parallel algorithms in the past decade, influencing a number of major frameworks including MapReduce, Hadoop, Spark, and Flume. In this model, we present our work on edge coloring [Behnezhad et al., 2019b], hierarchical clustering [Hajiaghayi and Knittel, 2020], and tree embeddings [Ahanchi et al., 2023]. AMPC improves upon the MPC model by adding access to a distributed hash table while still remaining highly implementable in practice. This allows it to overcome some shortcomings proposed in MPC literature, notably, the 1vs2Cycle Conjecture (i.e., that differentiating between a single cycle and two cycles is difficult in MPC). In Part III, we introduce a highly efficient and general tool for executing tree contractions in AMPC [Hajiaghayi et al., 2022b] and additionally exhibit the power of AMPC on minimum cut problems [Hajiaghayi et al., 2022a].
  • Item
    A Pedagogical Approach to Ramsey Multiplicity
    (2023) Brady, Robert; Gasarch, William; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    It is well known that for all 2-colorings of the edges of $K_6$ there is amonochromatic triangle. Less well known is that there are two monochromatic triangles. More generally, for all 2-colorings of the edges of $K_n$ there are roughly $\ge n^3/24$ monochromatic triangles. Another way to state this is that the density of monochromatic triangles is at least $1/4$. The Ramsey Multiplicity of $k$ is (asymptotically) the greatest $\alpha$ such that for every coloring of $K_n$ the density of monochromatic $K_k$'s is $\alpha$. This concept has been studied for many years. We survey the area and provide proofs that are more complete, more motivated, and using modern notation.
  • Item
    DOCUMENT INFORMATION EXTRACTION, STRUCTURE UNDERSTANDING AND MANIPULATION
    (2023) Mathur, Puneet; Manocha, Dinesh; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Documents play an increasingly central role in human communications and workplace productivity. Every day, billions of documents are created, consumed, collaborated on, and edited. However, most such interactions are manual or rule-based semi-automated. Learning from semi-structured and unstructured documents is a crucial step in designing intelligent systems that can understand, interpret, and extract information contained in digital PDFs, forms, receipts, contracts, infographics, etc. Our work tries to solve three major problems in the domain of information extraction from real-world multimodal (text+images+layout) documents: (1) multi-hop reasoning between concepts and entities spanning several paragraphs; (2) semi-structured layout extraction in documents consisting of thousands of text tokens and embedded images arranged in specific layouts; (3) hierarchical document representations and the need to transcend content lengths beyond a fixed window for effective semantic reasoning. Our research broadly binds together the semantic (document-level information extraction) and structural (document image analysis) aspects of document intelligence to advance user productivity. The first part of the research addresses issues related to information extraction from characteristically long-range documents that consist of multiple paragraphs and require long-range contextualization. We propose augmenting the capabilities of the Transformer-based methods with graph neural networks to capture local-level context as well as long-range global information to solve document-level information extraction tasks. In this aspect, we first solve the task of document-level temporal relation extraction by leveraging rhetorical discourse features, temporal arguments, and syntactic features through a Gated Relational-GCN model to extend the capability of Transformer architecture for discourse-level modeling. Next, we propose DocTime, a novel temporal dependency graph parsing method that utilizes structural, syntactic, and semantic relations to learn dependency structures over time expressions and event entities in text documents to capture long-range interdependencies. We also show how the temporal dependency graphs can be incorporated into the self-attention layer of Transformer models to improve the downstream tasks of temporal questions answering and temporal NLI. Finally, we present DocInfer - a novel, end-to-end Document-level Natural Language Inference model that builds a hierarchical document graph, performs paragraph pruning, and optimally selects evidence sentences to identify the most important context sentences for a given hypothesis. Our evidence selection mechanism allows it to transcend the input length limitation of modern BERT-like Transformer models while presenting the entire evidence together for inferential reasoning that helps it to reason on large documents where the evidence may be fragmented and located arbitrarily far apart. The second part of the research covers novel approaches for understanding, manipulation, and downstream applications of spatial structures extracted from digital documents. We first propose LayerDoc to extract the hierarchical layout structure in visually rich documents by leveraging visual features, textual semantics, and spatial coordinates along with constraint inference in a bottom-up layer-wise fashion. Next, we propose DocEditor, a Transformer-based localization-aware multimodal (textual, spatial, and visual) model that performs the novel task of language-guided document editing based on user text prompts. Further, we investigated methods for building text-to-speech systems for semi-structured documents. Finally, we will explore two applications of long-context document-level reasoning: (i) user-personalized speech recognition systems for improved next-word prediction in specific domains by utilizing retrieval augmentation techniques for ASR Language Models; (ii) Transformer-based methods to utilize multimodal information from long-form financial conference calls (document-level transcripts, audio-visual recordings, and tabular information) for improved financial time series prediction tasks.
  • Item
    Reliability of Machine Learning Models in the Real World
    (2023) Chiang, Ping-yeh; Goldstein, Thomas; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Neural networks have consistently showcased exceptional performance in various applications. Yet, their deployment in adversarial settings is limited due to concerns about reliability. In this paper, we’ll first explore methods to verify a model’s reliability in diverse scenarios, including classification, detection, auctions, and watermarking. We’ll then discuss the challenges and limitations of these verification techniques in real-world situations and suggest potential remedies. We’ll wrap up by examining the reliability of neural networks in the context of the model’s implicit bias. Our initial research investigated three critical areas where deep learning model reliability is crucial: object detection, deep auctions, and model watermarking. We found that without rigorous verification, systems could be vulnerable to accidents, manipulation of auction systems, and potential intellectual property theft. To counteract this, we introduced verification algorithms tailored to these respective scenarios. However, while certificates affirm the resilience of our models within a predefined threatframework, they don’t guarantee real-world infallibility. Hence, in the subsequent section, we explored strategies to improve model’s adaptability to domain shifts. While the pyramid adversarial training technique is effective in improving reliability with respect to domain shift, it is very computationally intensive. In response, we devised an alternative technique, universal pyramid adversarial training, which offers comparable advantages while being 30-70% more efficient. Finally, we try to understand the inherent non-robustness of neural networks through the lens of the model’s implicit bias. Surprisingly, we found that the generalization ability of deep learning models comes almost entirely from the architecture and not the optimizer as commonly believed. This architectural bias might be a crucial factor in explaining the inherent non-robustness of neural networks. Looking ahead, we intend to probe deeper into how neural networks’ innate biases can lead to their frailties. Moreover, we posit that refining these implicit biases could offer avenues to enhance model reliability.
  • Item
    Immersive Visual Analytics of Wi-Fi Signal Propagation and Network Health
    (2023) Rowden, Alexander R; Varsnhney, Amitabh; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    e are immersed in waves of information. This information is typically transmitted as radio waves in many protocols and frequencies, such as WiFi, Bluetooth, and Near-Field Communications (NFC). It carries vital information such as health data, private messages, and financial records. There is a critical need for systematic and comprehensive visualization techniques to facilitate seamless, resilient, and secure transmission of these signals. Traditional visualization techniques are not enough because of the scale of these datasets. In this dissertation, we present three novel contributions that leverage advances in volume rendering and virtual reality (VR): (a) an outdoor volume-rendering visualization system that facilitates large-scale visualization of radio waves over a college campus through real-time programmable customization for analysis purposes, (b) an indoor, building-scale visualization system that enables data to be collected and analyzed without occluding the user's view of the environment, and (c) a systematic user study with 32 participants which shows that users perform analysis tasks well with our novel visualizations. In our outdoor system, we present the Programmable Transfer Function. Programmable Transfer Functions offer the user a way to replace the traditional transfer function paradigm with a more flexible and less memory-demanding alternative. Our work on indoor WiFi visualization is called WaveRider. WaveRider is our contribution to indoor-modeled WiFi visualization using a virtual environment. WaveRider was designed with the help of expert signal engineers we interviewed to determine the needs of the visualization and who we used to evaluate the application. These works provide a solid starting point for signal visualization as our networks transition to more complex models. Indoor and outdoor visualizations are not the only dichotomy in the realm of signal visualization. We are also interested in visualizations of modeled data compared to visualization of data samples. We have also explored designs for multiple sample-based visualizations and conducted a formal evaluation where we compare these to our previous model-based approach. This analysis has shown that visualizing the data without modeling improves user confidence in their analyses. In the future, we hope to explore how these sample-based methods allow more routers to be visualized at once.
  • Item
    Modeling the fracture of polymer networks
    (2023) Tao, Manyuan; Cameron, Maria; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This dissertation is devoted to modeling the fracture of highly elastic materials that consist of polymer networks, such as elastomers and hydrogels. These polymer materials are composed of long polymer chains of repeating molecular units, which are crosslinked to form a three-dimensional network structure. A polymer network fractures by breaking covalent bonds, but the experimentally measured strength of a polymer network is orders of magnitude lower than the strength of covalent bonds. In this dissertation, we develop mesoscale models to understand what are the necessary ingredients leading to a large reduction in the strength of polymer networks observed in experiments. We hypothesize that the large reduction in strength is caused by statistical variation in lengths of polymer chains and a J-shaped stress-stretch relationship. The polymer chain carries entropic forces for most of the extension and carries covalent forces only for a narrow range of the extension. As a result, the statistical distribution of chain lengths causes only a small fraction of polymer chains to be highly stressed when the network is near fracture. We test this hypothesis using two mesoscale models: an idealized parallel chain model and a two-dimensional network model. Both models assume a statistical distribution for the lengths of polymer chains. Polymer chains are represented by freely-jointed chains that feature a nonlinear J-shaped stress-stretch relationship. The parallel chain model allows for simple calculations and is amenable for analysis by analytical tools. The network model accounts for the effect of stress concentration and is amenable for numerical simulations. Our models show that the combination of a J-shaped stress-stretch relationship and a distribution of chain lengths leads to a large reduction in strength, while keeping the variability in strength small from sample to sample. The large scatter in chain lengths causes a reduction in strength by up to two orders of magnitude, which explains a portion of the giant discrepancy between the experimentally measured strength of hydrogels and the strength of covalent bonds. Furthermore, our models demonstrate a power law relationship between the strength and the scatter in chain lengths. We provide an analytical derivation of the power law by taking advantage of the simplicity of the parallel chain model. In addition to studying macroscopic fracture properties, we further investigate the microscopic characteristics and the breaking mechanism of the polymer network, using the network model. By examining the characteristics of shortest paths, we find that the links traversed by a large number of shortest paths are more likely to break. Finally, we connect the microstructure of the network to the macroscopic mechanical properties. It is observed that the strength of the network correlates with the growth of holes during deformation.
  • Item
    Learning-based Autonomous Driving with Enhanced Data Efficiency and Policy Training
    (2023) Shen, Yu; Lin, Ming C.; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Autonomous vehicles are capable of sensing the environment and moving around with little to no human intervention, enhancing efficiency and safety. Self-driving cars, for instance, will affect our modes of transportation and way of life in the years to come. With rapid advances in hardware and software design, learning-based autonomous driving is becoming a viable and popular solution. This thesis focuses on data from the front-end and policy from the back-end in learning-based autonomous driving. As commonly known, data is central to all learning-based methods. However, engineers cannot collect all the data from all possible scenarios to train a model due to the great variety of real-world driving scenarios, e.g., different conditions in weather, lighting, roads, traffic, etc. Consequently, the issue of how to train a model with robustness, generalizability, and transferability becomes crucial. In addition, while input data is the key component of autonomous driving in the front-end, policy, which controls the vehicle to navigate safely, is also an essential component in the back-end. Existing methods have made progress on policy learning, but there is room for improvement, e.g., reinforcement learning is not able to utilize expert demos, inverse reinforcement learning can not directly utilize driving domain knowledge, etc. I propose to address these important open research issues by adopting machine learning and deep learning techniques, including adversarial data augmentation and training, auxiliary modality learning, transfer learning, reinforcement learning, and inverse reinforcement learning. To make autonomous driving more robust against varying weather changes, lighting conditions, or other image corruptions, I propose a gradient-free adversarial training method based on data augmentation and sensitivity analysis. To utilize multi-modal information for good performance with low computational costs, I design an auxiliary modality learning framework that can distill knowledge from multi-modality data to single modality data, with a specific condition that allows the teacher network to stay aware of the student's status for better distillation. I further propose a small-shot cross-modal distillation to solve the problem in a small-shot setting. To overcome the difficulty of collecting data in the real world, I present a transfer learning architecture that is able to transfer knowledge from the simulation domain to the real-world scenarios. To utilize both expert demonstration and real-world driving knowledge, I propose enhanced inverse reinforcement learning with hybrid-weight trust-region optimization and curriculum learning. In summary, my proposed learning-based frameworks enhance robustness, efficiency, and generalization via adversarial data augmentation and training, auxiliary modality learning, and transfer learning w.r.t. data processing in the front-end; and further improve policy learning via inverse reinforcement learning in the back-end.
  • Item
    Expanding Robustness in Responsible AI for Novel Bias Mitigation
    (2023) Dooley, Samuel; Dickerson, John P; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Conventional belief in the fairness community is that one should first find the highest performing model for a given problem and then apply a bias mitigation strategy. One starts with an existing model architecture and hyperparameters, and then adjusts model weights, learning procedures, or input data to make the model fairer using a pre-, post-, or in-processing bias mitigation technique. While existing methods for de-biasing machine learning systems use a fixed neural architecture and hyperparameter setting, I instead ask a fundamental question which has received little attention: how much does model-bias arise from the architecture and hyperparameters, and ask how can we exploit the extensive research in the fields of neural architecture search (NAS) and hyperparameter optimization (HPO) to search for more inherently fair models. By thinking of bias mitigation in this new way, we really are expanding our conceptualization of robustness in responsible AI. Robustness is an emerging aspect of responsible AI and focuses on maintaining model performance in the face of uncertainties and variations for all subgroups of a data population. Often robustness deals with protecting models from intentional or unintentional manipulations in data, while handling noisy or corrupted data and preserving accuracy in real-world scenarios. In other words, robustness, as commonly defined, examines the output of a system under changes to input data. However, I will broaden the idea of what robustness in responsible AI is in a manner which defines new fairness metrics, yields insights into robustness of deployed AI systems, and proposes an entirely new bias mitigation strategy. This thesis explores the connection between robust machine learning and responsible AI. It introduces a fairness metric that quantifies disparities in susceptibility to adversarial attacks. It also audits face detection systems for robustness to common natural noises, revealing biases in these systems. Finally, it proposes using neural architecture search to find fairer architectures, challenging the conventional approach of starting with accurate architectures and applying bias mitigation strategies.
  • Item
    Understanding and Enriching the Algorithmic Reasoning Capabilities of Deep Learning Models
    (2023) Li, Jingling; Dickerson, John; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Learning to reason is an essential step to achieving general intelligence. My research has been focusing on empowering deep learning models with the abilities to generalize efficiently, extrapolate to out-of-distribution data, learn under noisy labels, and make better sequential decisions --- all of these require the models to have varying levels of reasoning capabilities. As the reasoning process can be described as a step-by-step algorithmic procedure, understanding and enriching the algorithmic reasoning capabilities has drawn increasing attention in deep learning communities. To bridge algorithms and neural networks, we propose a framework, algorithmic alignment, which connects neural networks with algorithms in a novel manner and advances our understanding of how these two fields can work together to solve complex reasoning tasks. Intuitively, the algorithmic alignment framework evaluates how well a neural network's computation structure aligns with the algorithmic structure in a reasoning process. Utilizing algorithmic alignment, we are able to understand the limitations of machine learning models in the context of reasoning (e.g., generalization, extrapolation, and robustness) and know where to make improvements. Beyond this framework, we also investigate how to make better sequential decisions, during which we introduce a class of efficient approaches, hindsight learning, which allows us to leverage the knowledge inside existing human-designed algorithms to make better sequential decisions under uncertainty. First, why and when does a network structure generalize better than others, despite having equal expressive power? In studying this, we unify seemingly different reasoning tasks, such as intuitive physics, visual question answering, and shortest paths, via the lens of a powerful algorithmic paradigm, dynamic programming (DP). We then show that Graph Neural Network (GNNs)---a popular reasoning model---algorithmically align well with DP, which explains their empirical success on many reasoning tasks. At the same time, using this framework, we also suggest the limitations of GNNs and demonstrate how we can use algorithmic alignment to design a structured network given a particular task. Second, what does a neural network learn outside the support of the training distribution? To answer this, we identify conditions under which MLPs and GNNs extrapolate well.We first quantify the observation that ReLU MLPs quickly converge to linear functions along any direction from the origin, implying that ReLU MLPs do not extrapolate most nonlinear functions. Yet, we prove that MLPs can successfully extrapolate a linear target function given sufficient training support. These results suggest a hypothesis on GNNs for which we provide theoretical and empirical evidence: the success of GNNs in extrapolating algorithmic tasks to new data (e.g., larger graphs or edge weights) relies on encoding task-specific non-linearities in the architecture or features. Third, how does a neural network’s architecture impact its robustness to noisy labels? To figure this out, we use the proposed framework connecting a network's robustness to its architectural alignments with the target/noise functions. We hypothesize that a network is more robust to noisy labels if its architecture is more aligned with the target function than the noise, and we provide both theoretical and empirical evidence across various neural network architectures and different domains. Last but not least, how can we design sample-efficient approaches for sequential decision-making problems? Many decision-making problems especially the ones in resource management have a bunch of human-designed heuristics that work well in practice. These heuristics could provide more guidance to RL agents than just the environmental reward. At the same time, we observe that in many resource management problems requiring sequential decision-making under uncertainty, the only uncertainty affecting the decision outcomes are exogenous variables outside the control of the decision-maker in the system, and we can use machine learning to learn from historical patterns of compute demands to help deal with these uncertainties. Therefore, we design a class of data-efficient algorithms termed Hindsight Learning (HL), where we utilize existing human-designed algorithms with historical data to better guide the RL agents.
  • Item
    Scalable Methods for Robust Machine Learning
    (2023) Levine, Alexander Jacob; Feizi, Soheil; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In recent years, machine learning systems have been developed that demonstrate remarkable performance on many tasks. However, naive metrics of performance, such as the accuracy of a classifier on test samples drawn from the same distribution as the training set, can provide an overly optimistic view of the suitability of a model for real-world deployment. In this dissertation, we develop models that are robust, in addition to performing well on large-scale tasks. One notion of robustness is adversarial robustness, which characterizes the performance of models under adversarial attacks. Adversarial attacks are small, often imperceptible, distortions to the inputs of machine learning systems which are crafted to substantially change the output of the system. These attacks represent a real security threat, and are especially concerning when machine learning systems are used in safety-critical applications. To mitigate this threat, certifiably robust classification techniques have been developed. In a certifiably robust classifier, for each input sample, in addition to a classification, the classifier also produces a certificate, which is a guaranteed lower bound on the magnitude of any perturbation required to change the classification. Existing methods for certifiable robustness have significant limitations, which we address in Parts I and II of this dissertation: (i) Currently, randomized smoothing techniques are the only certification techniques that are viable for large-scale image classification (i.e. ImageNet). However, randomized smoothing techniques generally provide only high-probability, rather than exact, certificate results. To address this, we develop deterministic randomized smoothing-based algorithms, which produce exact certificates with finite computational costs. In particular, in Part I of this dissertation, we present to our knowledge the first deterministic, ImageNet-scale certification methods under the L_1, L_p (for p < 1), and "L_0" metrics. (ii) Certification results only apply to particular metrics of perturbation size. There is therefore a need to develop new techniques to provide provable robustness against different types of attacks. In Part II of this dissertation, we develop randomized smoothing-based algorithms for several new types of adversarial perturbation, including Wasserstein adversarial attacks, Patch adversarial attacks, and Data Poisoning attacks. The methods developed for Patch and Poisoning attacks are also deterministic, allowing for efficient exact certification. In Part III of this dissertation, we consider a different notion of robustness: test-time adaptability to new objectives in reinforcement learning. This is formalized as goal-conditioned reinforcement learning (GCRL), in which each episode is conditioned by a new "goal," which determines the episode's reward function. In this work, we explore a connection between off-policy GCRL and knowledge distillation, which leads us to apply Gradient-Based Attention Transfer, a knowledge distillation technique, to the Q-function update. We show, empirically and theoretically, that this can improve the performance of off-policy GCRL when the space of goals is high-dimensional.
  • Item
    The Limitations of Deep Learning Methods in Realistic Adversarial Settings
    (2023) Kaya, Yigitcan; Dumitras, Tudor; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The study of adversarial examples has evolved from a niche phenomenon to a well-established branch of machine learning (ML). In the conventional view of an adversarial attack, the adversary takes an input sample, e.g., an image of a dog, and applies a deliberate transformation to this input, e.g., a rotation. This then causes the victim model to abruptly change its prediction, e.g., the rotated image is classified as a cat. Most prior work has adapted this view across different applications and provided powerful attack algorithms as well as defensive strategies to improve robustness. The progress in this domain has been influential for both research and practice and it has produced a perception of better security. Yet, security literature tells us that adversaries often do not follow a specific threat model and adversarial pressure can exist in unprecedented ways. In this dissertation, I will start from the threats studied in security literature to highlight the limitations of the conventional view and extend it to capture realistic adversarial scenarios. First, I will discuss how adversaries can pursue goals other than hurting the predictive performance of the victim. In particular, an adversary can wield adversarial examples to perform denial-of-service against emerging ML systems that rely on input-adaptiveness for efficient predictions. Our attack algorithm, DeepSloth, can transform the inputs to offset the computational benefits of these systems. Moreover, an existing conventional defense is ineffective against DeepSloth and poses a trade-off between efficiency and security. Second, I will show how the conventional view leads to a false sense of security for anomalous input detection methods. These methods build modern statistical tools around deep neural networks and have shown to be successful in detecting conventional adversarial examples. As a general-purpose analogue of blending attacks in security literature, we introduce the Statistical Indistinguishability Attack (SIA). SIA bypasses a range of published detection methods by producing anomalous samples that are statistically similar to normal samples. Third, and finally, I will focus on malware detection with ML, a domain where adversaries gain leverage over ML naturally without deliberately perturbing inputs like in the conventional view. Security vendors often rely on ML for automating malware detection due to the large volume of new malware. A standard approach for detection is collecting runtime behaviors of programs in controlled environments (sandboxes) and feeding them to an ML model. I have first observed that a model trained using this approach performs poorly when it is deployed on program behaviors from realistic, uncontrolled environments, which gives malware authors an advantage in causing harm. We attribute this deterioration to distribution shift and investigate possible improvements by adapting modern ML techniques, such as distributionally robust optimization. Overall, my dissertation work has reinforced the importance of considering comprehensive threat models and applications with well-documented adversaries for properly assessing the security risks of ML.
  • Item
    RESILIENT AND EFFICIENT CONSENSUS UNDER UNKNOWN NETWORK CONDITIONS
    (2023) Blum, Erica 3645 SE Woodstock, Portland, OR, 97202; Katz, Jonathan; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Large-scale distributed services need to provide high throughput and low latency without sacrificing resilience. In particular, even if some servers crash or malfunction, the system as a whole should remain consistent. This challenge has been studied extensively in distributed computing and cryptography in the form of consensus algorithms. A consensus algorithm is an interactive protocol that allows honest (non-faulty) nodes to agree on a shared output in the presence of Byzantine (faulty) nodes, who may behave arbitrarily. Consensus algorithms have a long history in distributed computing, and are now receiving even more attention in the context of blockchain systems.Consensus has frequently been studied in the context of two contrasting network models. In the synchronous network model, any message sent by an honest party will be delivered within a fixed bound; this bound is known to all parties and may be used as a protocol parameter. In the asynchronous network model, messages may be delayed for arbitrary lengths of time. For certain consensus problems and settings, the optimal fault tolerance is higher in the synchronous model than the asynchronous model (all else being equal). For example, assuming a public key infrastructure (PKI), the fundamental problem of Byzantine agreement (BA) for n parties is feasible for t < n/2 faults in the synchronous model, compared to only t < n/3 in the asynchronous model. On the other hand, synchronous consensus protocols can become stuck or even lose consistency if delays exceed the fixed bound. In this dissertation, we consider a novel network-agnostic notion of security. Our central contribution is a suite of consensus protocols that achieve precisely defined security guarantees when run in either a synchronous or asynchronous network model, even when the parties are unaware of the network’s true status. In addition, we provide matching impossibility results characterizing the best-possible security guarantees for this setting. We conclude by exploring a natural extension to network-agnostic security, in which protocols must remain secure in a setting where the underlying network status is not only unknown, but may switch between synchrony and asynchrony during a single protocol execution.
  • Item
    COMPLEXITY CONTROLLED NATURAL LANGUAGE GENERATION
    (2023) Agrawal, Sweta; Carpuat, Marine; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Generating text at the right level of complexity for its target audience so it can be easily understood by its target audience has the potential to make information more accessible to a wider range of people, including non-native speakers, language learners, and people who suffer from language or cognitive impairments. For example, a native Hindi speaker learning English might prefer reading a U.S. news article in English or Hindi tailored to their vocabulary and language proficiency level. Natural Language Generation (NLG), the use of computational models to generate human-like text, has been used to empower countless applications – from automatically summarizing financial and weather reports to enabling communication between multilingual communities through automatic translation. Although NLG has met some level of success, current models ignore that there are many valid ways of conveying the same information in a text and that selecting the appropriate variation requires knowing who the text is written for and its intended purpose. To address this, in this thesis, we present tasks, datasets, models, and algorithms that are designed to let users specify how simple or complex the generated text should be in a given language. We introduce the Complexity Controlled Machine Translation task, where the goal is to translate text from one language to another at a specific complexity level defined by the U.S. reading grade level. While standard machine translation (MT) tools generate a single output for each input, the models we design for this task produce translation at various complexity levels to suit the needs of different users. In order to build such models, we ideally require rich annotation and resources for supervised training, i.e., examples of the same input text paired with several translations in the output language, which is not available in most datasets used in MT. Hence, we have also contributed datasets that can enable the generation and evaluation of complexity-controlled translations. Furthermore, recognizing that when humans simplify a complex text in a given language, they often revise parts of the complex text according to the intended audience, we present strategies to adopt general-purpose Edit-based Non-Autoregressive models for controllable text simplification (TS). In this framing, the simplified output for a desired grade level is generated through a sequence of edit operations like deletions and insertions applied to the complex input sequence. As the model needs to learn to perform a wide range of edit operations for different target grade levels, we introduce algorithms to inject additional guidance during training and inference, which results in improved output quality while also providing users with the specific changes made to the input text. Finally, we present approaches to adapt general-purpose controllable TS models that leverage unsupervised pre-training and low-level control tokens describing the nature of TS edit operations as side constraints for grade-specific TS. Having developed models that can enable complexity-controlled text generation, in the final part of the thesis, we introduce a reading comprehension-based human evaluation framework that is designed to assess the correctness of texts generated by these systems using multiple-choice question-answering. Furthermore, we evaluate whether the measure of correctness (via the ability of native speakers to answer the questions correctly using the simplified texts) is captured by existing automatic metrics that measure text complexity or meaning preservation.
  • Item
    Adversarial Robustness and Fairness in Deep Learning
    (2023) Cherepanova, Valeriia; Goldstein, Tom; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    While deep learning has led to remarkable advancements across various domains, the widespread adoption of neural network models has brought forth significant challenges such as vulnerability to adversarial attacks and model unfairness. These challenges have profound implications for privacy, security, and societal impact, requiring thorough investigation and development of effective mitigation strategies. In this work we address both these challenges. We study adversarial robustness of deep learning models and explore defense mechanisms against poisoning attacks. We also explore the sources of algorithmic bias and evaluate existing bias mitigation strategies in neural networks. Through this work, we aim to contribute to the understanding and enhancement of both adversarial robustness and fairness of deep learning systems.