Theses and Dissertations from UMD
Permanent URI for this community: http://hdl.handle.net/1903/2
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay in the appearance of a given thesis/dissertation in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
8 results
Search Results
Item Understanding and Improving Reliability of Predictive and Generative Deep Learning Models (2024) Kattakinda, Priyatham; Feizi, Soheil; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Deep learning models are prone to acquiring spurious correlations and biases during training and adversarial attacks during inference. In the context of predictive models, this results in inaccurate predictions relying on spurious features. Our research delves into this phenomenon specifically concerning objects placed in uncommon settings, where they are not conventionally found in the real world (e.g., a plane on water or a television in a cave). We introduce the "FOCUS: Familiar Objects in Common and Uncommon Settings" dataset which aims to stress-test the generalization capabilities of deep image classifiers. By leveraging the power of modern search engines, we deliberately gather data containing objects in common and uncommon settings in a wide range of locations, weather conditions, and time of day. Our comprehensive analysis of popular image classifiers on the FOCUS dataset reveals a noticeable decline in performance when classifying images in atypical scenarios. FOCUS only consists of natural images which are extremely challenging to collect as by definition it is rare to find objects in unusual settings. To address this challenge, we introduce an alternative dataset named Diffusion Dreamed Distribution Shifts (D3S). D3S comprises synthetic images generated through StableDiffusion, utilizing text prompts and image guides derived from placing a sample foreground image onto a background template image. This scalable approach allows us to create 120,000 images featuring objects from all 1000 ImageNet classes set against 10 diverse backgrounds. Due to the incredible photorealism of the diffusion model, our images are much closer to natural images than previous synthetic datasets.
To alleviate this problem, we propose two methods of learning richer and more robust image representations. In the first approach, we harness the foreground and background labels within D3S to learn a foreground (background) representation resistant to changes in background (foreground). This is achieved by penalizing the mutual information between the foreground (background) features and the background (foreground) labels. We demonstrate the efficacy of these representations by training classifiers on a task with strong spurious correlations. Thus far, our focus has centered on predictive models, scrutinizing the robustness of the learned object representations, particularly when the contextual surroundings are unconventional. In the second approach, we propose to use embeddings of objects and their relationships extracted using off-the-shelf image segmentation models and text encoders respectively as input tokens to a transformer. This leads to remarkably richer features that improve performance on downstream tasks such as image retrieval. Large language models are also prone to failures during inference. Given the widespread use of LLMs, understanding the propensity of these models to fail given adversarial inputs is crucial. To that end, we propose a series of fast adversarial attacks called BEAST that use beam search to add adversarial tokens to a given input prompt. These attacks induce hallucination, cause the models to jailbreak and facilitate unintended membership inference from model outputs. Our attacks are fast and are executable in relatively compute-constrained environments.
Item Scalable Methods for Robust Machine Learning (2023) Levine, Alexander Jacob; Feizi, Soheil; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In recent years, machine learning systems have been developed that demonstrate remarkable performance on many tasks.
However, naive metrics of performance, such as the accuracy of a classifier on test samples drawn from the same distribution as the training set, can provide an overly optimistic view of the suitability of a model for real-world deployment. In this dissertation, we develop models that are robust, in addition to performing well on large-scale tasks. One notion of robustness is adversarial robustness, which characterizes the performance of models under adversarial attacks. Adversarial attacks are small, often imperceptible, distortions to the inputs of machine learning systems which are crafted to substantially change the output of the system. These attacks represent a real security threat, and are especially concerning when machine learning systems are used in safety-critical applications. To mitigate this threat, certifiably robust classification techniques have been developed. In a certifiably robust classifier, for each input sample, in addition to a classification, the classifier also produces a certificate, which is a guaranteed lower bound on the magnitude of any perturbation required to change the classification. Existing methods for certifiable robustness have significant limitations, which we address in Parts I and II of this dissertation: (i) Currently, randomized smoothing techniques are the only certification techniques that are viable for large-scale image classification (i.e. ImageNet). However, randomized smoothing techniques generally provide only high-probability, rather than exact, certificate results. To address this, we develop deterministic randomized smoothing-based algorithms, which produce exact certificates with finite computational costs. In particular, in Part I of this dissertation, we present to our knowledge the first deterministic, ImageNet-scale certification methods under the L_1, L_p (for p < 1), and "L_0" metrics. (ii) Certification results only apply to particular metrics of perturbation size. 
There is therefore a need to develop new techniques to provide provable robustness against different types of attacks. In Part II of this dissertation, we develop randomized smoothing-based algorithms for several new types of adversarial perturbation, including Wasserstein adversarial attacks, Patch adversarial attacks, and Data Poisoning attacks. The methods developed for Patch and Poisoning attacks are also deterministic, allowing for efficient exact certification. In Part III of this dissertation, we consider a different notion of robustness: test-time adaptability to new objectives in reinforcement learning. This is formalized as goal-conditioned reinforcement learning (GCRL), in which each episode is conditioned by a new "goal," which determines the episode's reward function. In this work, we explore a connection between off-policy GCRL and knowledge distillation, which leads us to apply Gradient-Based Attention Transfer, a knowledge distillation technique, to the Q-function update. We show, empirically and theoretically, that this can improve the performance of off-policy GCRL when the space of goals is high-dimensional.
Item Towards Reliable and Efficient Representation Learning (2022) Zhu, Chen; Goldstein, Tom; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Large-scale representation learning has achieved enormous success during the past decade, surpassing human-level accuracy on a range of benchmarks including image recognition and language understanding. The success is supported by advances in both the algorithms and computing capabilities, which enable training large models on enormous amounts of data. While the performance continues to improve on existing benchmarks with larger model and training dataset sizes, the reliability and efficiency of large models are often questioned for deployment in practice.
Uncensored datasets may have been poisoned to manipulate model behavior, while practical deployment requires models to be trained or updated quickly on the latest data, and to have low latency for inference. This dissertation studies how to improve the reliability and efficiency of representation learning. On reliability, we study the threats of data poisoning and evasion attacks and how to defend against these threats. We propose a more vicious targeted clean-label poisoning attack that is highly effective even when the target architecture is unknown. To defend against such threats, we develop a k-NN based method in the feature space to filter out the poison examples from the training set, which effectively reduces the success rate of poisoning attacks at an insignificant cost of accuracy. For evasion attacks, we demonstrate a new threat model against transfer learning, where the attack can be successful without knowledge of the specific classification head. In a broader sense, we also propose methods to enhance the empirical and certified robustness against evasion attacks. For efficiency, our study focuses on three dimensions: data efficiency, convergence speed and computational complexity. For data efficiency, we propose enhanced adversarial training algorithms as a general data augmentation technique to improve the generalization of models given the same amount of labeled data, where we show its efficacy for Transformer models on a range of language understanding tasks. For convergence speed, we propose an automated initialization scheme to accelerate the convergence of convolutional networks for image recognition and Transformers for machine translation.
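The k-NN filtering defense mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything here is an assumption for illustration: the function name `knn_filter`, the use of Euclidean distance, the plurality-vote rule, and the toy two-dimensional "features" (in practice these would presumably be activations from a trained network).

```python
from collections import Counter
import math


def knn_filter(features, labels, k=3):
    """Return indices of examples whose label matches the plurality label
    of their k nearest neighbors in feature space (hypothetical helper)."""
    keep = []
    for i, (f_i, y_i) in enumerate(zip(features, labels)):
        # Euclidean distance from example i to every other example.
        dists = sorted(
            (math.dist(f_i, f_j), j)
            for j, f_j in enumerate(features)
            if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        plurality_label, _ = Counter(neighbor_labels).most_common(1)[0]
        # An example that disagrees with its neighborhood is treated as suspect.
        if plurality_label == y_i:
            keep.append(i)
    return keep


# Toy data: a "cat" cluster, a "dog" cluster, and one suspicious point
# (index 7) that sits inside the cat cluster but carries the label "dog".
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (0.05, 0.05)]
labels = ["cat", "cat", "cat", "cat", "dog", "dog", "dog", "dog"]
kept = knn_filter(features, labels, k=3)
```

On this toy data the mislabeled-looking point at index 7 is dropped while the clean examples are kept; the choice of `k` trades off how aggressively outliers are removed.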
For computational complexity, to scale Transformers to long sequences, we propose a linear-complexity attention mechanism, which improves the efficiency while preserving the performance of full attention on a range of language and vision tasks.
Item Analysis of Data Security Vulnerabilities in Deep Learning (2022) Fowl, Liam; Czaja, Wojciech; Goldstein, Thomas; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
As deep learning systems become more integrated into important application areas, the security of such systems becomes a paramount concern. Specifically, as modern networks require an increasing amount of data on which to train, the security of data that is collected for these models cannot be guaranteed. In this work, we investigate several security vulnerabilities and security applications of the data pipeline for deep learning systems. We systematically evaluate the risks and mechanisms of data security from multiple perspectives, ranging from users to large companies and third parties, and reveal several security mechanisms and vulnerabilities that are of interest to machine learning practitioners.
Item ENHANCING RESILIENCE OF COMPLEX NETWORKS: WASHINGTON D.C. URBAN RAIL TRANSIT AS A CASE STUDY (2020) Saadat, Yalda; Ayyub, Bilal BA; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
According to the United Nations Department of Economic and Social Affairs Population Division, 66% of the world’s population will reside in urban areas by 2050, up from 30% in 1950. Urbanization has indeed triumphed and its speed has brought innovation and economic growth. Its synergies within infrastructure systems are undeniable and have increased the demand for such systems. However, urbanization is one reason infrastructure systems are knocked out of equilibrium and show complex dynamical behavior.
Most infrastructure systems have been designed without planning for this magnitude of potential demographic changes; thus redesigns are long overdue. Also, climate change looms. Resource scarcity and a host of other factors leave their impacts; all pose some incidence of perturbation in the state of the infrastructure system. These perturbations can affect the system’s resilience, which is a defining property of each system for remaining functional in the midst of disruption from an adverse event. Therefore, it is essential to develop appropriate metrics and methods to enhance the resilience of infrastructures at the network level. Such enhancements are critical for sustainable infrastructure development that is capable of performing satisfactorily through intentional and/or stochastic disruptions. A resilience evaluation of a network typically entails assessing vulnerability and robustness as well as identifying strategies to increase network efficiency and performance and offering recovery strategies ideally taken in a cost-effective manner. This dissertation uses complex network theory (CNT) as the theoretic basis to enhance the resilience of large-scale infrastructure networks, such as urban rail transit systems. Urban rail transit infrastructures are heterogeneous, complex systems consisting of a large number of interacting nodes and links, which can imitate a network paradigm. Any adverse event leading to a disruption in the interaction and connectivity of network components would dramatically affect the safety and wellbeing of commuters, as well as the direct and indirect costs associated with performance loss. Therefore, enhancing their resilience is necessary. Using the Washington D.C. urban rail transit as a case study, this dissertation develops a methodology to analyze network topology and compute its efficiency, vulnerability, and robustness, in addition to providing a unified metric for assessing the network resilience.
The steps of the methodology are applied to two models of weighted and unweighted networks. For the weighted model, two novel algorithms are proposed to capture the general pattern of ridership in the network, and to reflect the weights on assessing network efficiency, respectively. This dissertation then proposes an effective strategy to increase the network resilience prior to a disruptive event, e.g., a natural disaster, by adding several loop lines in the network for topological enhancement. As such, adding a loop line can create redundancy to the vulnerable components and improve network resilience. Expanding on this, the dissertation offers comparative recovery strategies and a cost model in the case of disruption. An effective recovery strategy must demonstrate rapid optimal restoration of a disrupted system performance while minimizing recovery costs. In summary, the systematic methodology described above assesses and enhances the network resilience. The initial results rank the most vulnerable and robust components of the network. The algorithms developed throughout the study advance the state of the art in weighted network analysis. The topological enhancement strategy offers a basis to justify capital improvement. Post-failure recovery analysis and the cost model serve to inform decision makers in identifying the best recovery strategies, with special attention not only to restoring the performance of a system but also to reducing associated failure and recovery costs.
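The efficiency, vulnerability, and loop-line ideas in this abstract can be made concrete with a small sketch. It uses the global-efficiency measure that is standard in complex network theory (the mean of 1/d over all node pairs) on an unweighted toy line of five stations. The dissertation's own metrics and weighted algorithms are not given in the abstract, so the function names, the convention that a failed station keeps its place in the network but loses all links, and the station labels are assumptions for illustration only.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(i, j) over ordered node pairs; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        for t, d in shortest_path_lengths(adj, s).items():
            if t != s:
                total += 1.0 / d
    return total / (n * (n - 1))


def disable_node(adj, node):
    """Copy of the network in which `node` has lost all of its links."""
    return {u: (set() if u == node else {v for v in nbrs if v != node})
            for u, nbrs in adj.items()}


def vulnerability(adj, node):
    """Relative drop in global efficiency when `node` fails."""
    base = global_efficiency(adj)
    return (base - global_efficiency(disable_node(adj, node))) / base


def add_link(adj, u, v):
    """Copy of the network with an extra undirected link u-v."""
    new = {k: set(nbrs) for k, nbrs in adj.items()}
    new[u].add(v)
    new[v].add(u)
    return new


# Toy line A-B-C-D-E; linking the two ends turns it into a loop.
line = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
        "D": {"C", "E"}, "E": {"D"}}
loop = add_link(line, "A", "E")
most_vulnerable = max(sorted(line), key=lambda s: vulnerability(line, s))
```

In this toy case the middle station is the most vulnerable node of the line, and closing the loop raises global efficiency because every station gains a second route to the others.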
The use of the methodology proposed in this dissertation may lead to significant societal benefits by reducing the risk of catastrophic failures, providing references for mitigation of disruption due to adverse events, and offering resilience-based strategies, and related pursuits.
Item DEVELOPMENT OF A GENERAL-PURPOSE STEADY-STATE SIMULATION FRAMEWORK FOR VAPOR COMPRESSION SYSTEMS (2020) Huang, Ransisi; Radermacher, Reinhard; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The vapor compression system is the dominating technology in heat pumping, air conditioning and refrigeration. Vapor compression is associated with significant energy consumption and high global warming potential. Steady-state simulation of vapor compression systems is a crucial numerical technology that helps to assess and mitigate the energy and environmental impact of these systems. This dissertation aims to advance the steady-state modeling and simulation technologies for vapor compression systems toward a higher level of flexibility, computational efficiency, and robustness, improving designs and reducing time to market. First, the dissertation proposes a generalized solution methodology for the steady-state analysis of arbitrary systems. A tripartite-graph based tearing algorithm is proposed to generically formulate the residual equations. The methodology was extensively validated by five test systems with capacities from 10 to 100 kW. The maximum simulation energy imbalance was 0.91%, and the maximum system performance deviation was 8.94%. The methodology was also applied to analyze two advanced vapor compression systems, presenting strong capability to contribute to the acceleration of their R&D stage. Second, the dissertation develops an approximation-assisted modeling methodology to speed up the steady-state system simulation.
Three approximation-assisted heat exchanger models were compared in terms of accuracy and computational efficiency. The Kriging metamodel presented the highest accuracy among the three. For heat exchanger performance approximation, its overall ∆P and ∆h mean absolute error (MAE) were 4.46% and 0.9%, respectively. For system simulations, the maximum COP and capacity errors with the Kriging metamodel were 2.54% and 1.45%, respectively. System simulation was sped up by 10X to 600X, depending on the test conditions. Third, the dissertation proposes two convergence improvement approaches on the basis of nonlinear equation fundamentals, and assesses them on a standard vapor compression system as a first step, allowing for later application to more complex cycles. The assessment results show that a large initial Jacobian condition number is associated with a low convergence probability at the current initial guess point. The results also indicate a correlation between component nonlinearity and simulation convergence. It was found that by changing the characterization methods in the heat exchanger models, 47 out of 51 originally non-converged cases were able to reach convergence.
Item Optimization-based Robustness and Stabilization in Decentralized Control (2017) Alavian, Alborz; Rotkowitz, Michael C; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This dissertation pertains to the stabilization, robustness, and optimization of Finite Dimensional Linear Time Invariant (FDLTI) decentralized control systems. We study these concepts for FDLTI systems subject to decentralizations that emerge from imposing sparsity constraints on the controller. While these concepts are well-understood in the absence of an information structure, they continue to raise fundamental interesting questions regarding an optimal controller, or on suitable notions of robustness in the presence of information structures.
Two notions of stabilizability with respect to decentralized controllers are considered. First, the seminal result of Wang & Davison in 1973 regarding internal stabilizability of perfectly decentralized systems and its connection to the decentralized fixed modes of the plant is revisited. This result is generalized to any arbitrary sparsity-induced information structure by providing an inductive proof showing that those modes of the plant that are fixed with respect to static controllers remain fixed with respect to dynamic ones. A constructive proof is also provided to show that one can move any non-fixed mode of the plant to any arbitrary location within desired accuracy provided that they remain symmetric in the complex plane. A synthesis algorithm is then derived from the inductive proof. A second stronger notion of stability referred to as "non-overshooting stability" is then addressed. A key property called "feedthrough consistency" is derived, that when satisfied, makes extension of the centralized results to the decentralized case possible. Synthesis of decentralized controllers to optimize an H_Infinity norm for model-matching problems is considered next. This model-matching problem corresponds to an infinite-dimensional convex optimization problem. We study a finite-dimensional parametrization, and show that once the poles are chosen for this parametrization, the remaining problem of coefficient optimization can be cast as a semidefinite program (SDP). We further demonstrate how to use first-order methods when the SDP is too large or when a first-order method is otherwise desired. This leaves the remaining choice of poles, for which we develop and discuss several methods to better select the most effective poles among many candidates, and to systematically improve their location using convex optimization techniques. Controllability of LTI systems with decentralized controllers is then studied.
Whether an LTI system is controllable (by LTI controllers) with respect to a given information structure can be determined by testing for fixed modes, but this gives a binary answer with no information about robustness. Measures have already been developed to determine how far a system is from having a fixed mode when one considers complex or real perturbations to the state-space matrices. These measures involve intractable minimizations of a non-convex singular value over a power-set, and hence cannot be computed except for the smallest plants. We replace these problems by equivalent optimization problems that involve a binary vector rather than the power-set minimization and prove their equality. Approximate forms are also provided that upper bound the original metrics, and enable us to utilize MINLP techniques to derive scalable upper bounds. We also show that we can formulate lower bounds for these measures as polynomial optimization problems, and then use sum-of-squares methods to obtain a sequence of SDPs, whose solutions lower bound these metrics.
Item RESILIENCE OF TRANSPORTATION INFRASTRUCTURE SYSTEMS: QUANTIFICATION AND OPTIMIZATION (2013) Faturechi, Reza; Miller-Hooks, Elise; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Transportation systems are critical lifelines for society, but are at risk from natural or human-caused hazards. To prevent significant loss from disaster events caused by such hazards, the transportation system must be resilient, and thus able to cope with disaster impact. It is impractical to reinforce or harden these systems to all types of events. However, options that support quick recovery of these systems and increase the system's resilience to such events may be helpful.
To address these challenges, this dissertation provides a general mathematical framework to protect transportation infrastructure systems in the presence of uncertain events with the potential to reduce system capacity/performance. A single, general decision-support optimization model is formulated as a multi-stage stochastic program. The program seeks an optimal sequence of decisions over time based upon the realization of random events in each time stage. This dissertation addresses three problems to demonstrate the application of the proposed mathematical model in different transportation environments with emphasis on system-level resilience: Airport Resilience Problem (ARP), Building Evacuation Design Problem (BEDP), and Travel Time Resilience in Roadways (TTR). These problems aim to measure system performance given the system's topological and operational characteristics and support operational decision-making, mitigation and preparedness planning, and post-event immediate response. Mathematical optimization techniques, including bi-level programming, nonlinear programming, stochastic programming and robust optimization, are employed in the formulation of each problem. Exact (or approximate) solution methodologies based on concepts of primal and dual decomposition (integer L-shaped decomposition, Generalized Benders decomposition, and progressive hedging), disjunctive optimization, scenario simulation, and piecewise linearization methods are presented. Numerical experiments were conducted on network representations of a United States rail-based intermodal container network, the LaGuardia Airport taxiway and runway pavement network, a single-story office building, and a small roadway network.