UMD Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/3
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay before a given thesis/dissertation appears in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
Search Results
5 results
Item: Everything Efficient All at Once - Compressing Data and Deep Networks (2024) Girish, Sharath; Shrivastava, Abhinav; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In this thesis, we examine the efficiency of deep networks and of data, both of which are widely used in various computer vision/AI applications and are ubiquitous in today's information age. As deep networks continue to grow exponentially in size, improving their efficiency in terms of size and computation becomes necessary for deployment across mobile and small devices with hardware constraints. Data efficiency is equally important, given the memory and network-speed bottlenecks in storing and transmitting data, which is itself being created at an exponential rate. In this work, we explore in detail various approaches to improving the efficiency of deep networks, and we perform compression of various forms of data content. Efficiency of deep networks involves two major aspects: size, the memory required to store a deep network on disk, and computation, the number of operations or the time taken to execute the network. The first work analyzes sparsity for computation reduction in vision tasks that involve a large pretraining stage followed by downstream task finetuning. We show that task-specific sparse subnetworks are more efficient than generalized sparse subnetworks, which are denser and do not transfer well, and we analyze several behaviors of sparse-network training across vision tasks. While efficient, this sparsity reduces only theoretical computation and requires dedicated hardware for practical deployment. We therefore develop a framework for simultaneously reducing size and computation by utilizing a latent-quantization framework along with regularization losses.
We compress convolutional networks by more than an order of magnitude in size while maintaining accuracy and speeding up inference without dedicated hardware. Data can take different forms, such as audio, language, images, or video. We develop approaches for improving the compression and efficiency of various forms of visual data, which accounts for the bulk of global network traffic as well as storage. This consists of 2D images and videos and, more recently, their 3D equivalents, static and dynamic scenes, which are becoming popular for immersive AR/VR applications, scene understanding, 3D-aware generative modeling, and so on. To achieve data compression, we utilize Implicit Neural Representations (INRs), which represent data signals in terms of deep network weights. We thus transform the problem of data compression into network compression, learning efficient data representations. We first develop an algorithm for compressing 2D videos via autoregressive INRs whose weights are compressed with the latent-quantization framework. We then focus on learning a general-purpose INR that can compress different forms of data, such as 2D images and videos, and can potentially be extended to the audio or language domain, as well as to the compression of 3D objects and scenes. Finally, while INRs can represent 3D information, they are slow to train and render, which matters for real-time 3D applications. We therefore utilize 3D Gaussian Splatting (3D-GS), a form of explicit representation for 3D scenes and objects. 3D-GS is fast to train and render but consumes large amounts of memory and is especially inefficient for modeling dynamic scenes or 3D videos. We first develop a framework for efficiently training and compressing 3D-GS for static scenes, achieving large reductions in storage memory, runtime memory, and training and rendering time while maintaining high reconstruction quality.
Next, we extend to dynamic scenes or 3D videos, developing an online streamable framework for 3D-GS. We learn per-frame 3D-GS and transmit only the residuals of the 3D-GS attributes, achieving large reductions in per-frame storage memory for online streamable 3D-GS while also reducing training time and maintaining high rendering speeds and reconstruction quality.

Item: The First Principles of Deep Learning and Compression (2022) Ehrlich, Max Donohue; Shrivastava, Abhinav; Davis, Larry S; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The deep learning revolution incited by the 2012 AlexNet paper has been transformative for the field of computer vision. Many problems that were severely limited under classical solutions are now seeing unprecedented success. The rapid proliferation of deep learning methods has led to a sharp increase in their use in consumer and embedded applications. One consequence of such applications is lossy multimedia compression, which is required to engineer the efficient storage and transmission of data in these real-world scenarios. As such, there has been increased interest in deep learning solutions for multimedia compression that would allow higher compression ratios and increased visual quality. The deep learning approach to multimedia compression, so-called Learned Multimedia Compression, involves computing a compressed representation of an image or video using deep networks for the encoder and the decoder. While these techniques have enjoyed impressive academic success, their industry adoption has been essentially non-existent: classical compression techniques like JPEG and MPEG are too entrenched in modern computing to be easily replaced. This dissertation takes an orthogonal approach and leverages deep learning to improve the compression fidelity of these classical algorithms.
This allows the advances in deep learning to be used for multimedia compression without threatening the ubiquity of the classical methods. The key insight of this work is that methods motivated by first principles, i.e., the underlying engineering decisions made when the compression algorithms were developed, are more effective than general methods. By encoding prior knowledge into the design of the algorithm, flexibility, performance, and/or accuracy are improved at the cost of generality. While this dissertation focuses on compression, the high-level idea can be applied to many different problems with success. Four completed works in this area are reviewed. The first, foundational work unifies the disjoint mathematical theories of compression and deep learning, allowing deep networks to operate on compressed data directly. The second work shows how deep learning can be used to correct information loss in JPEG compression over a wide range of compression qualities, a problem that is not readily solvable without a first-principles approach; this allows images to be encoded at high compression ratios while still maintaining visual fidelity. The third work examines how deep-learning inference tasks, such as classification, detection, and segmentation, behave in the presence of classical compression, and how to mitigate the resulting performance loss. As in the previous work, this allows images to be compressed further, but this time without accuracy loss on downstream learning tasks. Finally, these ideas are extended to video compression by developing an algorithm to correct video compression artifacts. By incorporating bitstream metadata and mimicking the decoding process with deep learning, the method produces more accurate results with higher throughput than general methods.
This allows deep learning to improve the rate-distortion performance of classical MPEG codecs, competing with fully deep-learning-based codecs at a much lower barrier to entry.

Item: Response of hypersonic boundary-layer disturbances to compression and expansion corners (2021) Butler, Cameron Scott; Laurence, Stuart; Aerospace Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
An experimental campaign was conducted at the University of Maryland, College Park to examine the impact of abrupt changes in surface geometry on hypersonic boundary-layer instability waves. A model consisting of a 5-degree conical forebody was selected to encourage the dominance of second-mode wavepackets upstream of the interaction region. Interchangeable afterbody attachments corresponding to flow deflections of -5 to +15 degrees in 5-degree increments were considered. The adverse pressure gradient imposed by the +10-degree and +15-degree configurations caused the boundary layer to separate upstream, creating a region of recirculating flow. High-speed schlieren imaging (440-822 kHz) was employed as the primary means of flow interrogation, with supplemental surface measurements provided by PCB132B38 pressure transducers. A lens calibration was applied to the images to provide quantitative density-gradient fluctuations, and the high frame rate made possible the use of spectral analysis techniques throughout the entire field of view. This analysis reveals complex growth and decay trends for incoming second-mode disturbances. Additional low-frequency content is generated by the deflected configurations; this is most pronounced for the separated cases, where distinct, shear-generated disturbances are observed. Spectral proper orthogonal decomposition (SPOD) is demonstrated as a powerful tool for resolving the flow structures tied to amplifying frequencies, and nonlinear interactions are probed through bispectral analysis.
Resonance of low-frequency structures is found to play a large role in nonlinear energy transfer downstream of the compression corners, particularly for the separated cases. Concave streamline curvature appears to produce concentrated regions of increased nonlinearity, and these nonlinear interactions are shown to be spatially correlated with coherent flow structures resolved through SPOD. Finally, a limited computational study demonstrates the ability of linear stability theory and the parabolized stability equations to reproduce the experimental results obtained for the +10-degree extension. The development of the second-mode and shear-generated disturbances resolved by the computational analysis shows excellent agreement with the experimental results.

Item: A Viscoelastoplastic Continuum Damage Model for the Compressive Behavior of Asphalt Concrete (2006-10-23) Gibson, Nelson Harold; Schwartz, Charles W.; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Mechanistic performance prediction of asphalt concrete pavements has long been a goal of the pavement industry, and a comprehensive material model is essential for such predictions. This dissertation presents the development, calibration, and validation of a comprehensive constitutive material model for asphalt concrete in unconfined and confined compression. A continuum damage-based viscoelastic model is extended with viscoplasticity. Thermodynamic principles, an elastic-viscoelastic correspondence principle, and internal state variables quantify degradation, accounting for linear viscoelasticity and for nonlinear viscoelasticity with cumulative damage. Viscoplastic effects are addressed separately: two distinctly different strain-hardening viscoplastic models were investigated, and a more capable multiaxial model with primary-secondary hardening improved upon the original uniaxial model.
These characteristics enable the full model to decompose total strain into individual response components of viscoelasticity, viscoplasticity, and damage. Separate laboratory tests were required to measure and calibrate the individual response components: small-strain dynamic modulus tests for undamaged viscoelastic properties, cyclic creep-and-recovery tests for viscoplastic properties, and constant-rate-of-strain tests for damage properties, all performed at appropriate temperatures and loading rates. An extensive set of validation tests, conducted under conditions very different from the calibration conditions, was used to evaluate the model's capabilities. The predictions under these different conditions indicate that the comprehensive model can realistically simulate a wide range of asphalt concrete behavior. Recommendations are given based on lessons learned in the laboratory experiments and analyses of the data generated.

Item: IMPACT ASSESSMENT OF DYNAMIC SLOT EXCHANGE IN AIR TRAFFIC MANAGEMENT (2004-12-09) Sankararaman, Ravi; Ball, Michael; Decision and Information Technologies; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Since the inception of Collaborative Decision Making (CDM), the Federal Aviation Administration and the airlines have been striving to improve the utilization of critical resources such as arrival slots and to reduce flight delays during Ground Delay Programs. Two mechanisms implemented to increase utilization at resource-constrained airports are Compression and Slot Credit Substitution (SCS). SCS is a conditional, dynamic means of inter-airline slot exchange, while Compression can be considered a static means of achieving slot utilization. This thesis develops theoretical models to understand the performance of Compression with respect to slot exchange requests from airlines.
This thesis will also address trends in these slot exchange procedures, the benefits in delay savings realized by the airlines, and avenues for future applications to improve the efficiency of the National Airspace System.
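The Compression mechanism described in the last abstract can be illustrated with a small sketch: when a slot opens up (e.g., its flight is cancelled), later flights that can use it are moved earlier, reducing total delay. This is a hypothetical, heavily simplified greedy version for illustration only; real CDM Compression also honors inter-airline slot ownership and credit rules, which are omitted here, and all names and data structures are invented.

```python
def compress(slots, flights):
    """Greedily reassign arrival slots after cancellations (illustrative only).

    slots:   sorted list of slot times
    flights: dict flight_id -> earliest feasible arrival time;
             a cancelled flight is simply absent from the dict.
    Returns a dict slot_time -> flight_id (None if the slot goes unused).
    """
    unassigned = dict(flights)  # flights still needing a slot
    assignment = {}
    for slot in slots:
        # Flights that could physically arrive by this slot time.
        eligible = [f for f, t in unassigned.items() if t <= slot]
        if eligible:
            # First-come-first-served: pick the earliest feasible flight.
            chosen = min(eligible, key=lambda f: unassigned[f])
            assignment[slot] = chosen
            del unassigned[chosen]
        else:
            assignment[slot] = None  # no eligible flight; slot is wasted
    return assignment

# Example: flight F2 was cancelled, so F3 and F4 each move up one slot.
slots = [10, 20, 30]
flights = {"F1": 5, "F3": 15, "F4": 25}
print(compress(slots, flights))  # → {10: 'F1', 20: 'F3', 30: 'F4'}
```

The static nature of this procedure is what the thesis contrasts with SCS: Compression runs over the whole slot list at once, whereas SCS lets an airline conditionally trade a specific slot in exchange for credit.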