The First Principles of Deep Learning and Compression

Ehrlich, Max Donohue

The First Principles of Deep Learning and Compression

dc.contributor.advisor	Shrivastava, Abhinav	en_US
dc.contributor.advisor	Davis, Larry S	en_US
dc.contributor.author	Ehrlich, Max Donohue	en_US
dc.contributor.department	Computer Science	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2022-06-21T05:34:22Z
dc.date.available	2022-06-21T05:34:22Z
dc.date.issued	2022	en_US
dc.description.abstract	The deep learning revolution incited by the 2012 Alexnet paper has been transformative for the field of computer vision. Many problems which were severely limited using classical solutions are now seeing unprecedented success. The rapid proliferation of deep learning methods has led to a sharp increase in their use in consumer and embedded applications. One consequence of consumer and embedded applications is lossy multimedia compression which is required to engineer the efficient storage and transmission of data in these real-world scenarios. As such, there has been increased interest in a deep learning solution for multimedia compression which would allow for higher compression ratios and increased visual quality. The deep learning approach to multimedia compression, so called Learned Multimedia Compression, involves computing a compressed representation of an image or video using a deep network for the encoder and the decoder. While these techniques have enjoyed impressive academic success, their industry adoption has been essentially non-existent. Classical compression techniques like JPEG and MPEG are too entrenched in modern computing to be easily replaced. This dissertation takes an orthogonal approach and leverages deep learning to improve the compression fidelity of these classical algorithms. This allows the incredible advances in deep learning to be used for multimedia compression without threatening the ubiquity of the classical methods. The key insight of this work is that methods which are motivated by first principles, \ie, the underlying engineering decisions that were made when the compression algorithms were developed, are more effective than general methods. By encoding prior knowledge into the design of the algorithm, the flexibility, performance, and/or accuracy are improved at the cost of generality. While this dissertation focuses on compression, the high level idea can be applied to many different problems with success. Four completed works in this area are reviewed. The first work, which is foundational, unifies the disjoint mathematical theories of compression and deep learning allowing deep networks to operate on compressed data directly. The second work shows how deep learning can be used to correct information loss in JPEG compression over a wide range of compression quality, a problem that is not readily solvable without a first principles approach. This allows images to be encoded at high compression ratios while still maintaining visual fidelity. The third work examines how deep learning based inferencing tasks, like classification, detection, and segmentation, behave in the presence of classical compression and how to mitigate performance loss. As in the previous work, this allows images to be compressed further but this time without accuracy loss on downstream learning tasks. Finally, these ideas are extended to video compression by developing an algorithm to correct video compression artifacts. By incorporating bitstream metadata and mimicking the decoding process with deep learning, the method produces more accurate results with higher throughput than general methods. This allows deep learning to improve the rate-distortion of classical MPEG codecs and competes with fully deep learning based codecs but with a much lower barrier-to-entry.	en_US
dc.identifier	https://doi.org/10.13016/rakj-790u
dc.identifier.uri	http://hdl.handle.net/1903/28918
dc.language.iso	en	en_US
dc.subject.pqcontrolled	Computer science	en_US
dc.subject.pquncontrolled	Compression	en_US
dc.subject.pquncontrolled	Computational Photography	en_US
dc.subject.pquncontrolled	Computer Vision	en_US
dc.subject.pquncontrolled	Deep Learning	en_US
dc.subject.pquncontrolled	Machine Learning	en_US
dc.subject.pquncontrolled	Multimedia	en_US
dc.title	The First Principles of Deep Learning and Compression	en_US
dc.type	Dissertation	en_US

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Ehrlich_umd_0117E_22295.pdf
Size:: 172.66 MB
Format:: Adobe Portable Document Format

Download

Collections

UMD Theses and Dissertations
Computer Science Theses and Dissertations