Mathematics
Permanent URI for this community: http://hdl.handle.net/1903/2261
Search Results (24 results)
Item: OUT OF DISTRIBUTION EVALUATION OF NATURAL LANGUAGE PROCESSING SYSTEMS: GENERALIZATION TO LOW-RESOURCE AND DISTANT LANGUAGES AND HUMAN-AI COLLABORATIVE WRITING (2024)
Richburg, Aquia; Carpuat, Marine; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Large language models have revolutionized natural language processing with their capabilities in text generation and understanding. Their rich contextual representations, learned from training on diverse text datasets, have led LLMs to be used across a variety of settings. However, this increases the chance of models being used in unintended use cases and causing harm to users. This dissertation delves into empirical studies of out-of-distribution issues in text generation (machine translation) and text classification (authorship analysis) tasks, examining how LLMs perform in settings distant from their training distributions.

In our first work, the goal is to understand the characteristics of the training distribution of LLMs by visualizing the roles of samples during the training of a machine translation model. Our results indicate that sample contributions are not uniform and play complex roles throughout the training process. This highlights the difficulty of describing samples that are representative of the training distribution and motivates thorough evaluation of models in diverse settings. Our second and third works turn to the evaluation of LLMs in out-of-distribution settings to better understand their strengths and limitations for generalization on unseen tasks. We evaluate LLMs on machine translation tasks, focusing on how translation quality is affected by the presence or absence of specific language pairs in the training data. Our findings show that while finetuning improves translation for unseen languages, the impact varies across different language pairs.
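The per-pair quality gap described above can be illustrated with a toy modified n-gram precision score, a stripped-down version of BLEU. All sentences and "language pairs" below are invented for illustration; this is not the evaluation pipeline used in the dissertation.

```python
from collections import Counter

def ngram_precision(hypothesis, reference, n):
    """Modified n-gram precision: clipped overlap of hypothesis n-grams
    with reference n-grams, divided by the hypothesis n-gram count."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    hyp, ref = ngrams(hypothesis.split(), n), ngrams(reference.split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in hyp.items())
    total = sum(hyp.values())
    return overlap / total if total else 0.0

# Invented example: the same model scored on two "language pairs".
ref = "the cat sat on the mat"
seen_pair_output = "the cat sat on the mat"    # pair present in training
unseen_pair_output = "cat the mat sat"         # pair absent from training

p_seen = ngram_precision(seen_pair_output, ref, 2)
p_unseen = ngram_precision(unseen_pair_output, ref, 2)
assert p_seen > p_unseen  # quality gap between seen and unseen pairs
```

Corpus-level BLEU additionally combines precisions across n-gram orders and applies a brevity penalty; the bigram precision above captures only the core overlap statistic.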
This emphasizes the need for further research to enable effective massively multilingual translation with LLMs. In text classification, we explore out-of-distribution generalization for authorship analysis in the context of human-AI collaborative writing. Our studies reveal that traditional AI detection models underperform when distinguishing between human and AI co-written text. Simpler n-gram techniques are more robust than LLMs for authorship identification, suggesting the need for adapted authorship analysis tools. In summary, this dissertation advances our understanding of LLM generalization and provides insights for improving the robustness and adaptability of NLP systems.

Item: Quantum Algorithms for Nonconvex Optimization: Theory and Implementation (2024)
Leng, Jiaqi; Wu, Xiaodi; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Continuous optimization problems arise in virtually all disciplines of quantitative research. While convex optimization has been well studied in recent decades, large-scale nonconvex optimization problems remain intractable in both theory and practice. Quantum computers are expected to outperform classical computers in certain challenging computational problems. Some quantum algorithms for convex optimization have shown asymptotic speedups, while the quantum advantage for nonconvex optimization is yet to be fully understood. This thesis focuses on Quantum Hamiltonian Descent (QHD), a quantum algorithm for continuous optimization problems. A systematic study of QHD is presented, including theoretical results concerning nonconvex optimization and efficient implementation techniques for quantum computers. QHD is derived as the path integral of classical gradient descent algorithms.
Due to the quantum interference of classical descent trajectories, QHD exhibits drastically different behavior from classical gradient descent, especially on nonconvex problems. Under mild assumptions, we prove that QHD always finds the global minimum of an unconstrained optimization problem given a sufficiently long runtime. Moreover, we demonstrate that QHD can efficiently solve a family of nonconvex optimization problems with exponentially many local minima, on which the most commonly used classical optimizers require super-polynomial time. Additionally, by using QHD as an algorithmic primitive, we show a quadratic oracular separation between quantum and classical computing. We consider the implementation of QHD for two important paradigms of quantum computing, namely digital (fault-tolerant) and analog quantum computers. Exploiting the product formula for quantum Hamiltonian simulation, we demonstrate that a digital quantum computer can implement QHD with gate complexity nearly linear in the problem dimension and evolution time. With a hardware-efficient sparse Hamiltonian simulation technique known as Hamiltonian embedding, we develop an analog implementation recipe for QHD that addresses a broad class of nonlinear optimization problems, including nonconvex quadratic programming.
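The intuition behind these results can be illustrated with a classical numerical cartoon (this is not the thesis's algorithm, only the schematic Hamiltonian form H = -eps * d^2/dx^2 + f(x); the objective and all parameters are invented): for small eps, the ground state of H concentrates at the global minimum of f, while gradient descent started on the wrong side is trapped in a local minimum.

```python
import numpy as np

# Toy 1-D objective: local minimum near x = +0.96, deeper global
# minimum near x = -1.03 (the +0.3x term breaks the symmetry).
xs = np.linspace(-2.0, 2.0, 401)
h = xs[1] - xs[0]
f = (xs**2 - 1.0)**2 + 0.3 * xs

# Classical gradient descent started at x = 1.2 is trapped.
x, lr = 1.2, 0.01
for _ in range(5000):
    x -= lr * (4.0 * x * (x**2 - 1.0) + 0.3)

# Ground state of H = -eps * d^2/dx^2 + f(x), discretized by finite
# differences with Dirichlet boundaries at x = -2 and x = 2.
eps = 0.01
lap = (np.diag(np.full(401, -2.0)) + np.diag(np.ones(400), 1)
       + np.diag(np.ones(400), -1)) / h**2
H = -eps * lap + np.diag(f)
psi = np.linalg.eigh(H)[1][:, 0]        # lowest-energy eigenvector

assert x > 0.8                          # descent stuck in the local minimum
assert xs[np.argmax(psi**2)] < -0.8     # ground state sits at the global one
```

The actual QHD dynamics evolve a time-dependent Hamiltonian; the static ground-state computation above only shows why concentrating quantum mass at low values of f favors the global minimum.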
This analog implementation approach is deployed on large-scale quantum spin-glass simulators, and the empirical results strongly suggest that QHD has great potential for highly nonconvex and nonlinear optimization tasks.

Item: A Multifaceted Quantification of Bias in Large Language Models (2023)
Sotnikova, Anna; Daumé III, Hal; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Language models are rapidly developing, demonstrating impressive capabilities in comprehending, generating, and manipulating text. As they advance, they unlock diverse applications across various domains and become increasingly integrated into our daily lives. Nevertheless, these models, trained on vast and unfiltered datasets, come with a range of potential drawbacks and ethical issues. One significant concern is the potential amplification of biases present in the training data, generating stereotypes and reinforcing societal injustices when language models are deployed. In this work, we propose methods to quantify biases in large language models. We examine stereotypical associations for a wide variety of social groups characterized by both single and intersectional identities. Additionally, we propose a framework for measuring stereotype leakage across different languages within multilingual large language models. Finally, we introduce an algorithm that allows us to optimize human data collection under conditions of high human disagreement.

Item: QUANTUM COMBINATORIAL OPTIMIZATION ALGORITHMS FOR PACKING PROBLEMS IN CLASSICAL COMPUTING AND NETWORKING (2023)
Unsal, Cem Mehmet; Oruc, Yavuz A.; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

In computer engineering, packing problems play a central role in many aspects of hardware control.
The field aims to maximize computer processing speed, network throughput, and dependability in industry applications. Many of these constrained maximization problems can be expressed as packing problems in integer programming when working with restrictions such as latency, memory size, race conditions, power, and component availability. Some of the most crucial of these integer programming problems are NP-hard to solve to global optimality. Therefore, real-world applications rely heavily on heuristics and meta-heuristics to find good solutions. With recent developments in quantum meta-heuristic methods and promising results from experimental quantum computing systems, quantum computing is rapidly becoming more relevant for complex real-world combinatorial optimization tasks. This thesis is about applications of quantum combinatorial optimization algorithms to classical computer engineering problems, including novel quantum computing techniques that respect the constraints of state-of-the-art experimental quantum systems. This thesis includes five projects.

FASTER QUANTUM CONCENTRATION VIA GROVER'S SEARCH: One of the most important challenges in information networks is to gather data from a larger set of nodes to a smaller set of nodes. This can be done via the use of a concentrator architecture in the connection topology. This chapter is a proof of concept demonstrating that a quantum-based controller in large interconnection networks can asymptotically perform this task faster. We specifically present quantum algorithms for routing concentration assignments on full-capacity fat-and-slim concentrators, bounded fat-and-slim concentrators, and regular fat-and-slim concentrators. Classically, the concentration assignment takes O(n) time on all these concentrators, where n is the number of inputs. Powered by Grover's quantum search algorithm, our algorithms take O(√(nc) ln(c)) time, where c is the capacity of the concentrator.
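A quick numeric sanity check of these growth rates (constants suppressed, so this compares asymptotic shapes only, not real running times):

```python
import math

def classical(n):            # O(n) concentration assignment
    return n

def quantum(n, mu):          # O(sqrt(n * c) * ln c) with capacity c = n**mu
    c = n ** mu
    return math.sqrt(n * c) * math.log(c)

# For c = n^mu with 0 < mu < 1, the quantum bound grows like
# n^(0.5 * (1 + mu)) * ln n, asymptotically smaller than n, so the
# quantum/classical ratio should shrink as n grows.
mu = 0.5
ratios = [quantum(n, mu) / classical(n) for n in (10**4, 10**6, 10**8)]
assert ratios[0] > ratios[1] > ratios[2]
```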
Thus, our quantum algorithms are asymptotically faster than their classical counterparts whenever c ln²(c) = o(n). In particular, c = n^μ satisfies c ln²(c) = o(n) for any μ with 0 < μ < 1, giving a time complexity of O(n^(0.5(1+μ)) ln(n)).

QUANTUM ADVERSARIAL LEARNING IN EMULATION OF MONTE-CARLO METHODS FOR MAX-CUT APPROXIMATION: QAOA IS NOT OPTIMAL: One of the leading candidates for near-term quantum advantage is the class of variational quantum algorithms. However, these algorithms suffer from classical difficulty in optimizing the variational parameters as the number of parameters increases. It is therefore important to understand the expressibility and power of various ansätze to produce target states and distributions. To this end, we apply notions of emulation to variational quantum annealing and the Quantum Approximate Optimization Algorithm (QAOA) to show that variational annealing schedules with equivalent numbers of parameters outperform QAOA. Our variational quantum annealing schedule is based on a novel polynomial parameterization that can be optimized in a similar gradient-free way as QAOA, using the same physical ingredients. To compare the performance of ansatz types, we have developed statistical notions of Monte-Carlo methods. Monte-Carlo methods are computer programs that generate random variables to approximate a target number that is computationally hard to calculate exactly. While the most well-known Monte-Carlo method is Monte-Carlo integration (e.g., diffusion Monte-Carlo or path-integral quantum Monte-Carlo), QAOA is itself a Monte-Carlo method that finds good solutions to NP-complete problems such as Max-cut. We apply these statistical Monte-Carlo notions to further elucidate the theoretical framework around these quantum algorithms.

SCHEDULING JOBS IN A SHARED HIGH-PERFORMANCE COMPUTER WITH A NISQ COMPUTER: Several quantum approximation algorithms for NP-hard optimization problems have been described in the literature.
The properties of quantum approximation algorithms have been well explored for optimization problems of Ising type with 2-local Hamiltonians, and a wide range of optimization problems can be mapped to Ising problems. However, the mapping overhead of many problem instances puts them out of the reach of Noisy Intermediate-Scale Quantum (NISQ) devices. In this chapter, we develop a way of mapping constrained optimization problems to higher-order spin interactions to put a larger set of problem instances within reach of spin-interaction devices with potential NISQ applications. We demonstrate the growth in the practicable set of problem instances by comparing the resource requirements as a function of coupling. As an example, we demonstrate our techniques on the problem of scheduling jobs in a high-performance computer queue with limited memory and CPUs.

PROTEIN STRUCTURES WITH OSCILLATING QPACKER: A significant challenge in designing proteins for therapeutic purposes is determining the structure of a protein, i.e., finding the sidechain identities given a protein backbone. This problem can be easily and efficiently encoded as a quadratic binary optimization problem. There has been a significant effort in the field of quantum information to solve such problems, both exactly and approximately. An important initiative has applied experimental quantum annealing platforms to this problem and obtained promising results. This project optimizes the annealing schedule for the sidechain identity problem, inspired by cutting-edge developments in the algorithmic theory of quantum annealing.

ON THE COMPLEXITY OF GENERALIZED DISCRETE LOGARITHM PROBLEM: The Generalized Discrete Logarithm Problem (GDLP) is an extension of the Discrete Logarithm Problem, where the goal is to find x ∈ ℤ_s such that g^x mod s = y for given g, y ∈ ℤ_s.
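For tiny, non-cryptographic parameters, the base problem can be solved by exhaustive search, which makes the definition concrete (the values below are invented toy parameters, far from the group sizes where the problem is hard):

```python
def discrete_log(g, y, s):
    """Brute-force the x with g**x % s == y, or return None."""
    acc = 1 % s
    for x in range(s):
        if acc == y:
            return x
        acc = (acc * g) % s
    return None

# Toy instance: 3 is a primitive root mod 17, so every nonzero
# residue mod 17 has a discrete log base 3.
x = discrete_log(3, 12, 17)
assert pow(3, x, 17) == 12
```

Exhaustive search is O(s); the interest of the problem is that no classical polynomial-time algorithm is known, while Shor's algorithm solves this (commutative) case efficiently on a quantum computer.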
The generalized discrete logarithm is similar, but instead of a single base element it uses a number of base elements that do not necessarily commute. In this chapter, we prove that GDLP is NP-hard for symmetric groups. The lower-bound complexity of GDLP had been an open question since GDLP was defined in 2008, until our proof. Furthermore, we prove that GDLP remains NP-hard even when the base elements are permutations of at most three elements. Lastly, we discuss the implications of our proofs for classical and quantum complexity theory.

Item: Deep Thinking Systems: Logical Extrapolation with Recurrent Neural Networks (2023)
Schwarzschild, Avi Koplon; Goldstein, Tom; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Deep neural networks are powerful machines for visual pattern recognition, but reasoning tasks that are easy for humans are still difficult for neural models. Humans possess the ability to extrapolate reasoning strategies learned on simple problems to solve harder examples, often by thinking for longer. We study neural networks that have exactly this capability. By employing recurrence, we build neural networks that can expend more computation when needed. Using several datasets designed specifically for studying generalization from easy problems to harder test samples, we show that our recurrent networks can extrapolate from easy training data to much harder examples at test time, and they do so by running many more iterations of a recurrent block of layers than are used during training.

Item: Quantum Speedups: Structure, Design, and Application (2023)
Wang, Daochen; Childs, Andrew M.; Miller, Carl A.; Applied Mathematics and Scientific Computation; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

A quantum speedup occurs when a computational problem can be solved faster on a quantum computer than on a classical computer.
This thesis studies the circumstances under which quantum speedups can arise from three perspectives. First, we study the structure of the problem. We characterize how a problem's symmetries determine whether it admits a superpolynomial or merely polynomial quantum speedup. In particular, we show that the computation of any partial function of a hypergraph's adjacency matrix admits at most a polynomial speedup. Such functions are only weakly symmetric, and subsequent work shows that if the symmetry is weakened even further, a superpolynomial speedup can be possible. To complete the picture for graph problems, we also construct a partial function of a graph's adjacency list that admits an exponential speedup. Second, we study classical paradigms for solving problems. We design a natural quantum query analogue of the divide-and-conquer paradigm, which we call the "quantum divide and conquer" framework. This framework allows us to quantumly speed up the solution of many problems that are classically solved using divide and conquer. The main technical novelty of our framework is that it naturally combines quantum adversary and quantum algorithmic techniques. To showcase its usefulness, we apply it to obtain near-optimal quantum query complexities for four string problems: (i) recognizing regular languages; (ii) decision versions of String Rotation and String Suffix; and natural parameterized versions of (iii) Longest Increasing Subsequence and (iv) Longest Common Subsequence. Third, we study ways of generalizing a problem known to admit a quantum speedup. We generalize the search problem of finding a marked item in a database to the case where items are not simply marked or unmarked but "partially marked". The goal is to find the item that is the most heavily marked.
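Classically, this "most heavily marked item" search is the best-arm identification problem: each item answers queries stochastically in proportion to how marked it is. A naive sampling baseline (uniform allocation and an empirical argmax; all numbers invented, and this is not the quantum algorithm of the thesis) looks like this:

```python
import random

def best_arm_naive(means, samples_per_arm, rng):
    """Estimate the most heavily 'marked' item by querying each item's
    Bernoulli oracle equally and returning the empirical best."""
    estimates = []
    for p in means:
        hits = sum(rng.random() < p for _ in range(samples_per_arm))
        estimates.append(hits / samples_per_arm)
    return max(range(len(means)), key=estimates.__getitem__)

rng = random.Random(0)
means = [0.10, 0.25, 0.70, 0.30]   # item 2 is the most heavily marked
assert best_arm_naive(means, 500, rng) == 2
```

Smarter classical strategies (e.g., successive elimination) adapt the sample allocation to the observed gaps; the variable-time quantum algorithm described next improves on the classical query complexity.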
We construct a quantum algorithm for this problem within the variable-time framework and prove its near-optimality by modifying the standard adversary method to work for oracles encoding probability distributions. Coincidentally, the problem studied is equivalent to the multi-armed bandit problem in reinforcement learning, which has many applications.

Item: Combinatorial Optimization Algorithms for Hypergraph Matching with Application to Posture Identification in Embryonic Caenorhabditis elegans (2022)
Lauziere, Andrew Neil; Balan, Radu; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Point-set matching is the task in computer vision of identifying a one-to-one alignment between two sets of points. Existing techniques rely on simple relationships between point-sets in order to efficiently find optimal correspondences between larger sets; this precludes application to more challenging point-set matching tasks that benefit from interdependent modeling. This thesis addresses a gap in the combinatorial optimization literature by enhancing leading methods in both graph matching and multiple object tracking (MOT) to more flexibly use data-driven point-set matching models. The presented contributions are inspired by Caenorhabditis elegans, a transparent free-living roundworm frequently studied in developmental biology and neurobiology. The C. elegans embryo, containing around 550 cells at hatch, can be used for cell tracking studies to understand how cell movement drives the development of specific embryonic tissues and organ functions. The development of muscle cells complicates analyses during late-stage development, as embryos begin twitching due to muscular activity. The sporadic twitches cause cells to move violently and unpredictably, invalidating traditional cell tracking approaches.
The embryo possesses seam cells, a set of 20 cells that together act as fiducial markers approximating the coiled embryo's body. Novel optimization algorithms and data-driven hypergraphical models leveraging the correlated movement among seam cells are used to advance research on C. elegans embryogenesis. We contribute two optimization algorithms, applicable under differing conditions, for interdependent point-set matching. The first algorithm, Exact Hypergraph Matching (EHGM), exactly solves the n-adic assignment problem by casting it as hypergraph matching; it obtains solutions to highly interdependent seam cell identification models. The second, Multiple Hypothesis Hypergraph Tracking (MHHT), adapts traditional multiple hypothesis tracking with hypergraphical data association. Results from both studies show improved performance over established methods while providing adaptable optimization tools for multiple academic communities. The canonical point-set matching task is solved efficiently under strict assumptions on frame-to-frame transformations; challenging situations arising from non-rigid displacements between frames complicate established methods. In particular, limitations in fluorescence microscopy paired with muscular twitching in late-stage embryonic C. elegans yield adversarial point-set matching tasks. Seam cell identification is cast as an assignment problem: detected cells in images are uniquely identified through a combinatorial optimization algorithm. Existing graph matching methods are underequipped to contextualize the coiled embryonic position in sparsely imaged samples. Both the lack of an effective point-set matching model and the lack of an efficient algorithm for solving the resulting optimization problem limit computationally driven solutions for identifying seam cells in acquired image volumes.
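The flavor of the underlying optimization can be sketched with a brute-force reference solver (a toy example with an invented cost, not EHGM itself): because the cost couples the choices made for different points, whole assignments must be scored, which is what makes the problem harder than linear assignment.

```python
from itertools import permutations

def brute_force_match(cost, k):
    """Reference solver for an interdependent assignment of k points to
    k slots: cost(assignment) may couple choices, so whole permutations
    are scored rather than independent point-slot pairs.  Brute force
    is O(k!); EHGM-style branch-and-bound prunes this search."""
    best = min(permutations(range(k)), key=cost)
    return best, cost(best)

# Toy model: a unary term (stay near your own slot) plus a coupling
# term (neighbors should take consecutive slots) that a purely linear
# assignment model cannot express.
def cost(assign):
    unary = sum(abs(i - j) for i, j in enumerate(assign))
    coupling = sum(abs((assign[i + 1] - assign[i]) - 1)
                   for i in range(len(assign) - 1))
    return unary + 2 * coupling

best, val = brute_force_match(cost, 3)
assert best == (0, 1, 2) and val == 0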
We cast the n-adic problem as hypergraph matching and present an exact algorithm to solve the resulting optimization problem. EHGM adapts the branch-and-bound paradigm to dynamically identify a globally optimal correspondence; it is the first algorithm capable of solving the underlying optimization problem. Our algorithm and accompanying data-driven hypergraphical models identify seam cells more accurately than established point-set matching methods. The final hours of embryogenesis encompass the moments in which C. elegans assumes motor control and begins exhibiting behavior. Rapid imaging of the seam cells provides insight into the embryo's movement as a proxy for behavior. However, seam cell tracking is especially challenging due to both muscular twitching and the low dose required to gently image the embryo without perturbing development. Current methods in MOT rely on independent object trajectories undergoing smooth motion to effectively track large numbers of objects. Multiple Hypothesis Tracking (MHT) is the foremost method for challenging MOT tasks, yet it cannot model correlated object movements. We contribute Multiple Hypothesis Hypergraph Tracking (MHHT) as an extension of MHT that performs interdependent object tracking by jointly representing objects as a hypergraph. We apply MHHT to track seam cell nuclei during late-stage embryogenesis. Data-driven hypergraphical models track seam cells more accurately than traditional MHT-based approaches. Analysis of time-lapse embryonic postures and behavioral motifs reveals a stereotyped developmental progression in C. elegans.
Further analysis uncovers late-stage motility defects in unc-13 mutants.

Item: STATISTICAL INFERENCE ACROSS MULTIPLE NETWORKS: ADVANCEMENTS IN MULTIPLEX GRAPH MATCHING AND JOINT SPECTRAL NETWORK EMBEDDINGS (2022)
Pantazis, Konstantinos; Lyzinski, Vince; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Networks are commonly used to model and study complex systems that arise in a variety of scientific domains. One important network data modality is the multiplex network, comprised of a collection of networks over a common vertex set. Multiplex networks can describe complex relational data in which the edges within each network encode different relationship or interaction types. With the rise of network modeling of complex, multi-sample network data, there has been a recent emphasis on multiplex inference methods. In this thesis, we develop novel theory and methodology to study underlying network structures and perform statistical inference on multiple networks. While each chapter of the thesis has its own individual merit, together they constitute a coherent multi-scale spectral network inference framework that accounts for unlabeled and correlated multi-sample network data. These results significantly extend the reach of such procedures in the literature. In the first part of the thesis, we consider the inference task of aligning the vertices across a pair of multiplex networks, a key algorithmic step in routines that assume a priori node-aligned data. This general multiplex matching framework is then adapted to the task of detecting a noisy induced multiplex template network in a larger multiplex background network. Our methodology, which lifts the classical graph matching framework and the matched filters method of Sussman et al.
(2018) to the multiple-network setting, uses the template network to search for the "most" similar subgraph(s) in the background network, where similarity is measured via a multiplex graph matching distance. We present an algorithm that can efficiently match the template to a subgraph (induced or not) of the background that approximately minimizes a suitable graph matching distance, and we demonstrate the effectiveness of our approach both theoretically and empirically in synthetic and real-world data settings. In the second part of the thesis, we present a joint embedding procedure for multi-scale spectral network inference. In our proposed framework, a collection of node-aligned networks is jointly embedded into a common Euclidean space by embedding a specialized omnibus matrix which contains the information across the entire collection of networks. Our approach, which builds upon the work of Levin et al. (2017), is flexible enough to faithfully embed many different network collection types (e.g., network time-series data, test-retest network data) and is theoretically tractable. Indeed, our method is among the first joint embedding methods for which statistical consistency and asymptotic normality are established across correlated network collections. Moreover, we identify (and fully analyze in our setting) the phenomenon of induced correlation in the embedded space, an artifact of joint embedding methodologies. We examine how both the induced (by the method) and inherent correlation can impact inference for correlated network data, and we provide network analogues of classical questions such as the effective sample size for more generally correlated data.
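The classical omnibus construction of Levin et al. (2017), which this part of the thesis generalizes, places each network on the block diagonal and pairwise averages off the diagonal. A minimal two-network sketch (function names are mine; the deterministic 5-cycle input is only for illustration):

```python
import numpy as np

def omnibus(adjacencies):
    """Stack m aligned n-by-n adjacency matrices into the mn-by-mn
    omnibus matrix M with blocks M[i][j] = (A_i + A_j) / 2."""
    m = len(adjacencies)
    return np.block([[(adjacencies[i] + adjacencies[j]) / 2
                      for j in range(m)] for i in range(m)])

def joint_embed(adjacencies, d):
    """Embed all networks at once: top-d scaled eigenvectors of M give
    one d-dimensional position per (vertex, network) pair."""
    M = omnibus(adjacencies)
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

# Two identical copies of the 5-cycle graph.
A1 = np.roll(np.eye(5), 1, axis=1) + np.roll(np.eye(5), -1, axis=1)
X = joint_embed([A1, A1.copy()], d=2)

assert X.shape == (10, 2)          # 5 vertices x 2 networks, 2 dims each
# Identical networks receive identical embeddings, vertex by vertex.
assert np.allclose(X[:5], X[5:])
```

With correlated but non-identical networks, the corresponding rows are close rather than equal; quantifying that induced closeness is exactly the correlation analysis described above.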
In the final part of the thesis, we consider a constrained optimization problem called corr2OMNI, and we present an algorithm that approximates the generalized omnibus matrix structure (for jointly embedding networks) that best preserves, in the embedded space, a given correlation structure across a collection of networks. Moreover, we theoretically analyze an important special case of corr2OMNI in which we desire a fixed level of correlation across the networks.

Item: Quantum algorithms for linear and nonlinear differential equations (2022)
Liu, Jinpeng; Childs, Andrew; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

Quantum computers are expected to dramatically outperform classical computers for certain computational problems. Originally developed for simulating quantum physics, quantum algorithms have subsequently been developed to address diverse computational challenges. There has been extensive previous work on linear dynamics and discrete models, including Hamiltonian simulation and systems of linear equations. However, for more complex realistic problems characterized by differential equations, the capability of quantum computing is far from well understood. One fundamental challenge is the substantial difference between the linear dynamics of a system of qubits and real-world systems with continuum and nonlinear behaviors. My research is concerned with mathematical aspects of quantum computing. In this dissertation, I focus mainly on the design and analysis of quantum algorithms for differential equations. Systems of linear ordinary differential equations (ODEs) and linear elliptic partial differential equations (PDEs) are ubiquitous in natural and social science, engineering, and medicine.
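Finite difference methods reduce such equations to linear systems, which is the object a quantum linear-systems-based solver would encode in amplitudes. A classical sketch for the simplest elliptic model problem, 1-D Poisson with u''(x) = f(x) and u(0) = u(1) = 0 (the right-hand side is chosen so the exact solution is known):

```python
import numpy as np

n = 99                               # interior grid points
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)

# Second-difference matrix: (u[i-1] - 2u[i] + u[i+1]) / h^2 ~ u''(x_i)
L = (np.diag(np.full(n, -2.0)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

f = -np.pi**2 * np.sin(np.pi * x)    # chosen so u(x) = sin(pi x) exactly
u = np.linalg.solve(L, f)            # the linear system Lu = f

assert np.max(np.abs(u - np.sin(np.pi * x))) < 1e-3   # O(h^2) accuracy
```

A quantum algorithm would prepare a state proportional to the solution vector u rather than output its entries; the exponential precision improvements claimed below concern how the cost scales with the allowed error in that encoding.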
I propose a variety of quantum algorithms based on finite difference methods and spectral methods for producing the quantum encoding of the solutions, with an exponential improvement in precision over previous quantum algorithms. Nonlinear differential equations exhibit rich phenomena in many domains but are notoriously difficult to solve. Whereas previous quantum algorithms for general nonlinear equations have been severely limited by the linearity of quantum mechanics, I give the first efficient quantum algorithm for nonlinear differential equations with sufficiently strong dissipation. I also establish a lower bound, showing that nonlinear differential equations with sufficiently weak dissipation have worst-case complexity exponential in time, giving an almost tight classification of the quantum complexity of simulating nonlinear dynamics. Overall, utilizing advanced linear algebra techniques and nonlinear analysis, I attempt to build a bridge between classical and quantum mechanics, understand and optimize the power of quantum computation, and discover new quantum speedups over classical algorithms with provable guarantees.

Item: Analysis of Data Security Vulnerabilities in Deep Learning (2022)
Fowl, Liam; Czaja, Wojciech; Goldstein, Thomas; Mathematics; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

As deep learning systems become more integrated into important application areas, the security of such systems becomes a paramount concern. Specifically, as modern networks require an increasing amount of data on which to train, the security of data collected for these models cannot be guaranteed. In this work, we investigate several security vulnerabilities and security applications of the data pipeline for deep learning systems.
We systematically evaluate the risks and mechanisms of data security from multiple perspectives, ranging from individual users to large companies and third parties, and reveal several security mechanisms and vulnerabilities of interest to machine learning practitioners.
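One of the simplest data-pipeline vulnerabilities, training-label poisoning, illustrates the threat model (this toy least-squares classifier on synthetic blobs is my own illustration, not an attack studied in the thesis):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Two Gaussian blobs in 2-D with labels -1 / +1."""
    X = np.vstack([rng.normal(-2, 1, (n, 2)), rng.normal(2, 1, (n, 2))])
    y = np.hstack([-np.ones(n), np.ones(n)])
    return X, y

def fit_and_score(Xtr, ytr, Xte, yte):
    """Least-squares linear classifier with a bias feature."""
    A = np.hstack([Xtr, np.ones((len(Xtr), 1))])
    w = np.linalg.lstsq(A, ytr, rcond=None)[0]
    pred = np.sign(np.hstack([Xte, np.ones((len(Xte), 1))]) @ w)
    return float(np.mean(pred == yte))

Xtr, ytr = make_data(200)
Xte, yte = make_data(200)
clean_acc = fit_and_score(Xtr, ytr, Xte, yte)

# Poison: flip the labels of positive-class training points lying on the
# boundary side of their blob, dragging the decision boundary inward.
poisoned = ytr.copy()
poisoned[(ytr == 1) & (Xtr[:, 0] < 2.0)] = -1
poisoned_acc = fit_and_score(Xtr, poisoned, Xte, yte)

assert clean_acc > 0.9
assert poisoned_acc < clean_acc   # poisoning degrades test accuracy
```

The attacker here touches only labels in the collected training set, never the model or the test data, which is precisely why unsecured data collection is the weak point the abstract describes.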