Computer Science Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/2756
1028 results
Item Inertially Constrained Ruled Surfaces for Visual Odometry (2024)
Zhu, Chenqi; Aloimonos, Yiannis; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In computer vision, camera egomotion is typically solved with visual odometry techniques that rely on feature extraction from a sequence of images and computation of the optical flow. This, however, often requires a point-to-point correspondence between two consecutive frames, which can be costly to compute, and whose varying accuracy greatly affects the quality of the estimated motion. Attempts have been made to bypass the difficulties originating from the correspondence problem by adopting line features and fusing other sensors (event cameras, IMUs), but many of these approaches still rely heavily on feature detectors. If the camera observes a straight line as it moves, the image of that line sweeps out a surface; this is a ruled surface, and analyzing its shape gives information about the egomotion. This research presents a novel algorithm to estimate 3D camera egomotion from scenes represented by ruled surfaces. By constraining the egomotion with inertial measurements from an onboard IMU sensor, the dimensionality of the solution space is greatly reduced.

Item Parallel and External-Memory Algorithms for Large-Scale Genomic Sequence-Graph and -Index Construction (2024)
Khan, Jamshed; Patro, Robert; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Rapid developments in the throughput and affordability of modern genome-sequencing technologies have made the generation of billions of sequencing reads from biological samples highly time- and cost-efficient. Concurrent development of efficient algorithms and tools for genome assembly has also vastly expanded the number of available assembled genome sequences.
This fast-paced and sustained increase in available genomic data has resulted in situations where, in many cases, computational data analysis has become more resource-expensive than the experiments generating the data. This is particularly true since genomic data are generated once, but are typically stored indefinitely and subsequently reanalyzed many times. As such, the development of algorithmic approaches to process these data in a fundamentally scalable manner has been a key focus of modern computational genomics. This dissertation focuses on developing scalable, parallel algorithms and data structures for compressing, organizing, and indexing high-throughput and large-scale genomic data using external memory, so as to increase the scale at which existing analyses can be performed. Toward these objectives, a mathematical object called the de Bruijn graph—and the associated data structure and its variants—has become a compact data representation of increasing utility across computational genomics. Genomic data are expensive in terms of space, and building a compact representation of sequences of interest while retaining relevant features is a first step in many computational genomics pipelines. The majority of this dissertation focuses on construction algorithms for such a representation—a compact, lossless variant of the de Bruijn graph used to represent sets of genomic sequences. This dissertation first presents a state-of-the-art parallel algorithm, Cuttlefish, to construct the contracted de Bruijn graph from genome references with superior performance to pre-existing methods, in which we propose a novel modeling approach that treats each graph vertex as an individual finite-state automaton and the graph as a collection of automata. Cuttlefish is applicable to a specific class of input data, namely assembled genome sequences. The dissertation then presents our subsequent work, in which we extend this method to be more general and enable it to take any form of input.
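The central object in this line of work can be illustrated at toy scale. Below is a minimal Python sketch, with invented function names, that builds a node-centric de Bruijn graph from k-mers and spells its maximal non-branching paths; it is an illustration of the data structure only, not the Cuttlefish algorithm, which models each vertex as a finite-state automaton and runs in parallel over external memory.

```python
from collections import defaultdict

def build_de_bruijn(sequences, k):
    """Node-centric de Bruijn graph: vertices are k-mers; an edge joins
    consecutive k-mers, which overlap in k-1 characters."""
    nodes, edges = set(), defaultdict(set)
    for seq in sequences:
        prev = None
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            nodes.add(kmer)
            if prev is not None:
                edges[prev].add(kmer)
            prev = kmer
    return nodes, edges

def unitigs(nodes, edges):
    """Spell maximal non-branching paths (unitigs): every internal vertex
    has in-degree and out-degree exactly 1. Cycles are ignored here."""
    indeg = defaultdict(int)
    for dsts in edges.values():
        for d in dsts:
            indeg[d] += 1
    internal = lambda n: indeg[n] == 1 and len(edges[n]) == 1
    paths = set()
    for n in nodes:
        if internal(n):
            continue  # unitigs start at source or branching vertices
        for nxt in edges[n]:
            spelled, cur = n + nxt[-1], nxt
            while internal(cur):
                cur = next(iter(edges[cur]))
                spelled += cur[-1]
            paths.add(spelled)
    return paths
```

For example, the graph of ACGTT with k = 3 compacts to the single unitig ACGTT, while two reads that diverge after a shared prefix yield one unitig per branch.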
We developed a new state-of-the-art algorithm, Cuttlefish 2—a generalization of the earlier algorithm—that also facilitates construction from sequencing read sets as input, which is a fundamentally harder problem due to the structural properties of raw sequencing data. Both Cuttlefish and Cuttlefish 2 construct the de Bruijn graph without source annotations for the vertices. The dissertation then presents Cuttlefish 3, an even faster, more memory-frugal, and further scalable algorithm to construct the contracted de Bruijn graph, while also incorporating source-annotation information (i.e., inverted indexing) into the graph. Orthogonal to the strategy of working with the whole graph at all times, Cuttlefish 3 adopts an established paradigm of partitioning the original graph into subgraphs, contracting the subgraphs separately, and then merging the local solutions appropriately—with novel algorithmic contributions in key steps. It speeds up the construction of the local contractions non-trivially, proposes a novel algorithm based on parallel tree contraction to join the local solutions, and presents a graph sparsification strategy that vastly reduces the number of vertices to be tracked when inverted-indexing the graph. Finally, the dissertation presents a simple and highly parallel algorithm, CaPS-SA, to construct the classic suffix array data structure for indexing genomic sequences, advancing the state-of-the-art parallel solutions for modern architectures.

Item Practical Cryptography for Blockchains: Secure Protocols with Minimal Trust (2024)
Glaeser, Noemi; Katz, Jonathan; Malavolta, Giulio; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In 2008, Satoshi Nakamoto introduced Bitcoin, the first digital currency without a trusted authority, whose security is maintained by a decentralized blockchain. Since then, a plethora of decentralized applications have been proposed that utilize blockchains as a public bulletin board.
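The suffix array named above is easy to state even though building it at genomic scale is hard. A naive Python sketch follows, with an invented helper for pattern lookup; CaPS-SA's contribution is constructing this same structure in parallel on modern architectures, not the quadratic sort shown here.

```python
def suffix_array(text):
    """Naive suffix array: the indices of all suffixes of `text`,
    sorted lexicographically by the suffix each index starts."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def occurs(text, sa, pattern):
    """Binary-search the sorted suffixes for one starting with `pattern`."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and text[sa[lo]:sa[lo] + len(pattern)] == pattern
```

Once the array is built, any pattern can be located in O(m log n) character comparisons, which is why the structure is a workhorse for sequence indexing.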
This growing popularity has put pressure on the ecosystem to prioritize scalability at the expense of trustlessness and decentralization. This work explores the role cryptography has to play in the blockchain ecosystem in improving performance while maintaining minimal trust and strong security guarantees. I discuss a new paradigm for scalability, called naysayer proofs, which sits between optimistic and zero-knowledge approaches. Next, I cover two on-chain applications. First, I show how to improve the security of a class of coin mixing protocols by giving a formal security treatment of the construction paradigm and patching the security of an existing, insecure protocol. Second, I show how to construct practical on-chain protocols for a large class of elections and auctions which simultaneously offer fairness and non-interactivity without relying on a trusted third party. Finally, I look to the edges of the blockchain and formalize new design requirements for the problem of backing up high-value but rarely used secret keys, such as those used to secure the reserves of a cryptocurrency exchange, and develop a protocol which efficiently meets these new challenges. All of these works will be deployed in practice or have seen interest from practitioners. These examples show that advanced cryptography has the potential to meaningfully nudge the blockchain ecosystem toward increased security and reduced trust.

Item Everything Efficient All at Once - Compressing Data and Deep Networks (2024)
Girish, Sharath; Shrivastava, Abhinav; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In this thesis, we examine the efficiency of deep networks and data, both of which are widely used in various computer vision/AI applications and are ubiquitous in today's information age.
As deep networks continue to grow exponentially in size, improving their efficiency in terms of size and computation becomes necessary for deployment across mobile and small devices with hardware constraints. Data efficiency is equally important due to the memory and network-speed bottlenecks incurred when storing and transmitting data, which is also being created at an exponential rate. In this work, we explore in detail various approaches to improve the efficiency of deep networks, as well as to compress various forms of data content. Efficiency of deep networks involves two major aspects: size, or the memory required to store deep networks on disk, and computation, or the number of operations and time taken to execute the network. The first work analyzes sparsity for computation reduction in the context of vision tasks that involve a large pretraining stage followed by downstream task finetuning. We show that task-specific sparse subnetworks are more efficient than generalized sparse subnetworks, which are denser and do not transfer very well. We analyze several behaviors of training sparse networks for various vision tasks. While efficient, this form of sparsity reduces computation largely in theory and requires dedicated hardware for practical deployment. We therefore develop a framework for simultaneously reducing size and computation by utilizing a latent quantization framework along with regularization losses. We compress convolutional networks by more than an order of magnitude in size while maintaining accuracy and speeding up inference without dedicated hardware. Data can take different forms, such as audio, language, images, or video. We develop approaches for improving the compression and efficiency of various forms of visual data, which make up the bulk of global network traffic as well as storage.
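As a rough illustration of the two levers discussed above, the following NumPy sketch shows generic magnitude pruning (sparsity) and uniform quantization (size reduction). Both functions are invented minimal examples, not the thesis's latent quantization framework or its regularization losses.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (a generic
    sparsity baseline, not the task-specific subnetworks of the thesis)."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def uniform_quantize(weights, num_bits=8):
    """Uniformly quantize to 2**num_bits levels and dequantize, trading a
    bounded precision loss (at most half a quantization step) for size."""
    levels = 2 ** num_bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    q = np.round((weights - w_min) / scale)  # integer codes, storable in num_bits
    return q * scale + w_min                 # dequantized approximation
```

Pruning targets the computation axis (skipping zeroed weights needs sparse-aware hardware, echoing the caveat above), while quantization directly shrinks on-disk size.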
This consists of 2D images or videos and, more recently, their 3D equivalents of static and dynamic scenes, which are becoming popular for immersive AR/VR applications, scene understanding, 3D-aware generative modeling, and so on. To achieve data compression, we utilize Implicit Neural Representations (INRs), which represent data signals in terms of deep network weights. We transform the problem of data compression into network compression, thereby learning efficient data representations. We first develop an algorithm for compression of 2D videos via autoregressive INRs whose weights are compressed with the latent-quantization framework. We then focus on learning a general-purpose INR which can compress different forms of data, such as 2D images and videos, and can potentially be extended to the audio or language domains, as well as to compression of 3D objects and scenes. Finally, while INRs can represent 3D information, they are slow to train and render, both of which matter for real-time 3D applications. We therefore utilize 3D Gaussian Splatting (3D-GS), a form of explicit representation for 3D scenes or objects. 3D-GS is quite fast to train and render, but consumes large amounts of memory and is especially inefficient for modeling dynamic scenes or 3D videos. We first develop a framework for efficiently training and compressing 3D-GS for static scenes. We achieve large reductions in storage memory, runtime memory, and training and rendering time costs while maintaining high reconstruction quality. Next, we extend to dynamic scenes or 3D videos, developing an online streamable framework for 3D-GS.
We learn per-frame 3D-GS and learn/transmit only the residuals of the 3D-GS attributes, achieving large reductions in per-frame storage memory for online streamable 3D-GS while also reducing training time costs and maintaining high rendering speeds and reconstruction quality.

Item Analytics of Configuration Management for System Administration (2024)
Katz, Yehuda; Agrawala, Ashok K; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
System administrators are usually trusted to be experts on the systems they control, yet security breaches and performance problems can often be traced to improper configuration by these same administrators. These configuration mistakes are not made deliberately, and certainly not maliciously, but are usually due to a lack of information about the consequences and interactions of these settings. This problem becomes more apparent as the complexity of the software being configured grows and as the role of system administrator is taken on by more people with less time to develop a complete understanding of the systems they control. We call this *Uninformed Configuration*. There is a blind spot in existing scientific research when it comes to understanding the effects of configuration changes on system performance and security, which, if well understood, would allow for informed configuration management. We present a new way to analyze and understand the effects of configuration management. We define a clear division between the operations of a program that are controlled by configuration and the operations of a program that are affected by the data the program is processing. This allows us to make more accurate inferences about how changing a configuration *knob* will affect the overall security and performance of the system.
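The division just described can be made concrete with a hypothetical Python sketch: classify program branches by whether their conditions read only configuration variables, then rank knobs by how many configuration-controlled branches they gate. All names and the scoring proxy are invented for illustration and are not the dissertation's actual analysis or metric.

```python
def classify_branches(branches, config_vars):
    """Split conditional branches into a configuration-controlled set and
    an input-dependent set. `branches` maps a branch id to the set of
    variables its condition reads."""
    config_controlled, data_dependent = [], []
    for branch_id, vars_read in branches.items():
        if vars_read <= config_vars:  # condition reads only configuration
            config_controlled.append(branch_id)
        else:
            data_dependent.append(branch_id)
    return config_controlled, data_dependent

def rank_knobs(knob_to_branches, config_controlled):
    """Rank knobs by how many configuration-controlled branches they gate;
    the count is an invented proxy for knob importance."""
    cc = set(config_controlled)
    scores = {k: len(cc & set(bs)) for k, bs in knob_to_branches.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A knob gating many configuration-controlled branches would, under this proxy, surface first for the administrator's attention.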
We build on existing static analysis tools and control flow representations originally designed for compiler optimization to build a clear picture of the effects of configuration changes. We refine the concept of understanding program execution paths in terms of a control plane and a data plane by treating the effects of configuration changes as part of the control plane. We provide a method for communicating the importance of each configuration knob to a system administrator using a standardized ranking and scoring system. We also apply these methods to configuration knobs with known performance and security effects in two commonly used pieces of software. Finally, we discuss several future avenues of scientific research and practical work which will carry these ideas further to improve the state of configuration management.

Item Computational Methods for Natural Walking in Virtual Reality (2024)
Williams, Niall; Manocha, Dinesh; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Virtual reality (VR) allows users to feel as though they are really present in a computer-generated virtual environment (VE). A key component of an immersive virtual experience is the ability to interact with the VE, which includes the ability to explore it. Exploration of VEs is usually not straightforward, since the virtual environment is typically shaped differently from the user's physical environment. This can cause users to walk on virtual routes that correspond to physical routes obstructed by unseen physical objects or by the boundaries of the tracked physical space. In this dissertation, we develop new algorithms that help us understand how people explore large VEs and that enable them to do so using natural walking while incurring fewer collisions with physical objects in their surroundings.
Our methods leverage concepts of alignment between the physical and virtual spaces, robot motion planning, and statistical models of human visual perception. Through a series of user studies and simulations, we show that our algorithms enable users to explore large VEs with fewer collisions, allow us to predict the navigability of a pair of environments without collecting any locomotion data, and deepen our understanding of how human perception functions during locomotion in VR.

Item Interpreting Visual Representations and Mitigating their Failures (2024)
Kalibhat, Neha; Feizi, Soheil; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Deep learning has become the cornerstone of artificial intelligence (AI), particularly in the language and computer vision domains. The progression in this field is reflected in numerous applications accessible to the general public, such as information retrieval via virtual assistants, content generation, autonomous vehicles, drug discovery, and medical imaging. This unprecedented rate of AI adoption raises the critical need for research on the fundamental underpinnings of deep neural networks, to understand what leads to their decisions and why they fail. This thesis concentrates on self-supervised representation learning, a prevalent unsupervised method employed by foundational models to extract patterns from extensive visual data. Specifically, our focus lies in examining the low-dimensional representations generated by these models and dissecting their failure modes. In our initial investigation, we discover that self-supervised representations lack robustness to domain shifts, as they are not explicitly trained to distinguish image content from its domain. We remedy this issue by proposing a module that can be plugged into existing self-supervised baselines to disentangle their representation spaces and promote domain invariance and generalization.
Our subsequent analysis delves into the patterns within representations that influence downstream classification. We scrutinize the discriminative capacity of individual features and their activations. We then propose an unsupervised quality metric that can preemptively determine, with high precision, whether a given representation will be correctly or incorrectly classified. In the next segment of this thesis, we leverage our findings to further demystify the representation space by uncovering interpretable subspaces which have unique concepts associated with them. We design a novel explainability framework that uses a vision-language model (such as CLIP) to provide natural language explanations for neural features (or groups thereof) of a given pre-trained model. We next investigate the role of augmentations and format transformations in learning generalizable visual representations. Drawing inspiration from advancements in the audio and speech modalities, we examine how presenting visual data in multiple formats affects learning, separating this from the impact of augmentations. In the final segment, we reveal compositionality as a notable failure mode in current state-of-the-art representation methods. We critique the use of fixed-size patches in vision transformers and demonstrate the benefits of employing semantically meaningful patches based on visual priors. This design adjustment leads to significant improvements in image-text retrieval tasks and, more importantly, enhances performance on compositionality benchmarks.

Item Advance Video Modeling Techniques for Video Generation and Enhancement Tasks (2024)
Shrivastava, Gaurav; Shrivastava, Abhinav; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis investigates advanced techniques that are useful in video modeling for generation and enhancement tasks.
In the first part of the thesis, we explore generative modeling that exploits an external corpus for learning priors. The task here is video prediction, i.e., extrapolating future sequences given a few context frames. In a follow-up work, we also demonstrate how we can reduce the inference time further and make the video prediction model more efficient. Additionally, we demonstrate that we are able to extrapolate not only one but multiple future sequences from the given context frames. In the second part, we explore methods that exploit the internal statistics of videos to perform various restoration and enhancement tasks. Here, we show how robustly these methods perform restoration tasks such as denoising, super-resolution, frame interpolation, and object removal. Furthermore, in a follow-up work, we utilize the inherent compositionality of videos and their internal statistics to perform a wider variety of enhancement tasks, such as relighting, dehazing, and foreground/background manipulations. Lastly, we provide insight into our future work on how data-free enhancement techniques could be improved, as well as further insights on how multi-step video prediction techniques can be improved.

Item Dynamical Memory in Deep Neural Networks (2024)
Evanusa, Matthew S; Aloimonos, Yiannis; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In this work, I begin to lay out a roadmap, or framework, that I believe will serve the scientific communities of artificial intelligence and cognitive neuroscience in the future development and design of a thinking intelligent machine. It is based on the knowledge I have accumulated, over the almost decade that I have spent at my new home in College Park, Maryland, from many sources: my advisors, peers, colleagues, and collaborators; talks, symposia, and conferences; and long paper dives.
It is my hope and intent that this thesis serves its stated goal of advancing the science of memory integration in neural networks and, in addition, furthers the distant dream of discovering the mystery of what it means to be alive. It is important to note that while this thesis is focused on the critical integration of memory mechanisms into artificial neural networks, the author's larger goal is the creation of an overarching cognitive architecture that takes advantage of the right amount of advances from deep learning, with the right amount of insights from cognitive science and neuroscience: a "Goldilocks" of sorts for AI. It is my hope that through understanding mechanisms of memory and how they interact with our stimuli, we move one step closer to understanding our place in the cosmos.

Item HIGH PERFORMANCE DISTRIBUTED TRANSACTIONS FOR MULTI-REGION DATABASE SYSTEMS (2024)
Nguyen, Cuong; Abadi, Daniel J; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The growing popularity of global applications, such as social networks, online gaming, and supply chain management, has spurred the demand for a more advanced database system layer in multi-region cloud infrastructures. This layer should provide a strongly consistent interface, abstracting away the complexity of distributed computing so developers can focus on application logic without worrying about data race issues. At the same time, it must leverage geographic redundancy to tolerate failures, ranging from single-node crashes to regional outages. Achieving these goals often conflicts with the need for scalability and high performance. Ensuring strong data consistency requires additional coordination between nodes, while enhancing availability and durability requires data replication. These measures increase latency and resource consumption, thereby limiting performance and scalability.
As a result, existing solutions, such as NoSQL database systems, often compromise with weaker consistency levels, while fully geo-replicated systems incur performance and scalability penalties to achieve stronger consistency guarantees. This dissertation addresses these challenges in three directions. First, we take a retrospective approach by enhancing an existing production-ready shared-storage architecture, derived from the Amazon Aurora database system. Our proposed modifications enable Aurora-style systems to support multiple writer nodes across geographically dispersed regions. Evaluation of our implementation, SunStorm, demonstrates substantial reductions in latency and improved scalability. Second, we adopt a forward-looking perspective by investigating a more recently proposed architecture known as deterministic database systems. We develop a new protocol within this architecture that facilitates the processing of strictly serializable multi-region transactions. Our implementation, Detock, suffers minimal performance degradation under high-contention workloads, improving throughput by an order of magnitude compared to state-of-the-art approaches and reducing latency by up to a factor of five. Finally, we conduct a comprehensive empirical study across a diverse range of real-world applications (such as e-commerce, chat, blogs, and content management systems), reassessing the assumptions of recent database system proposals and providing valuable insights for future transactional database research.
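The deterministic-database idea underlying this architecture can be sketched in a few lines: replicas agree on a global transaction order up front and then execute it deterministically, so they reach identical states without a cross-replica commit protocol. The toy Python sketch below (with an invented class and transaction format) illustrates only this paradigm, not the Detock protocol itself.

```python
class DeterministicReplica:
    """Toy deterministic execution engine: every replica applies the same
    globally ordered transaction log, so all replicas reach identical
    states with no cross-replica coordination at execution time."""

    def __init__(self):
        self.store = {}

    def apply_log(self, ordered_txns):
        # A sequencing layer (not shown) has already fixed this order;
        # execution must simply follow it deterministically.
        for txn in ordered_txns:
            for key, update in txn:  # txn = list of (key, update function)
                self.store[key] = update(self.store.get(key, 0))
        return dict(self.store)
```

Running the same log through two independent replica instances yields byte-identical stores, which is the property that lets deterministic systems avoid two-phase commit across regions.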