Neural Networks as Databases - From Data to Model Compression
| dc.contributor.advisor | Shrivastava, Abhinav | en_US |
| dc.contributor.author | Maiya, Shishira Raghunath | en_US |
| dc.contributor.department | Computer Science | en_US |
| dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
| dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
| dc.date.accessioned | 2025-09-15T05:37:56Z | |
| dc.date.issued | 2025 | en_US |
| dc.description.abstract | The past decade has witnessed an exponential increase in data and the rise of deep learning systems to handle it. These systems analyze, learn from, and interpret data primarily by exploiting the patterns within it. This leads us to a question: can we use these same patterns to store the data efficiently as well? Essentially, this would mean moving from the paradigm of data compression towards model compression. This thesis introduces and investigates a variety of methods in the realm of model compression, with the eventual goal of replacing data compression itself. To explore this, we start with the most information-rich, yet sparse, media form: videos. In the first part of this thesis, we introduce a framework for representing videos as continuous functions using Implicit Neural Representations (INRs), which represent a given signal as a mapping from its spatial/temporal coordinate space to its values (a minimal sketch of such a network follows this record). We propose an auto-regressive framework that exploits the redundancies of a video for efficient compression and real-time decoding - a first in this field. We then build upon this framework by introducing a shared video prior that captures common patterns across video frames, speeding up encoding by 10-20$\times$ while also improving compression rates. Despite these advances, INRs are not truly representations of the underlying data unless the resulting model weights also encode some semantics. Based on this intuition, we introduce a hypernetwork-based INR system that enables semantic tasks such as video retrieval and understanding alongside compression. Having established that the problem of data compression is in fact a model compression problem, we then present a case study on a widely used model compression method: pruning. We take a popular pruning approach, the Lottery Ticket Hypothesis, which offers extreme sparsity, and study its effects on fundamental vision tasks such as classification, detection, and segmentation. Finally, we look at proposed future directions, exploring enhanced network and algorithmic designs: random networks for greater compression and meta-learning for faster video encoding. | en_US |
| dc.identifier | https://doi.org/10.13016/yyrf-kusd | |
| dc.identifier.uri | http://hdl.handle.net/1903/34656 | |
| dc.language.iso | en | en_US |
| dc.subject.pqcontrolled | Computer science | en_US |
| dc.subject.pqcontrolled | Artificial intelligence | en_US |
| dc.subject.pquncontrolled | Computer vision | en_US |
| dc.subject.pquncontrolled | Deep Learning | en_US |
| dc.subject.pquncontrolled | Machine Learning | en_US |
| dc.title | Neural Networks as Databases - From Data to Model Compression | en_US |
| dc.type | Dissertation | en_US |
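The abstract defines an INR as a mapping from spatio-temporal coordinates to signal values. The following is a minimal, hypothetical PyTorch sketch of that idea: a small Fourier-feature MLP overfit to a toy clip, with the stored weights standing in for the pixels. The architecture, layer sizes, and training loop here are illustrative assumptions, not the auto-regressive framework proposed in the dissertation.

```python
# Minimal sketch of an Implicit Neural Representation (INR) for video.
# Assumptions: a toy (x, y, t) -> RGB Fourier-feature MLP, not the
# dissertation's actual architecture or training recipe.
import torch
import torch.nn as nn

class VideoINR(nn.Module):
    def __init__(self, hidden=256, layers=4, freqs=10):
        super().__init__()
        self.freqs = freqs
        in_dim = 3 * 2 * freqs  # sin/cos Fourier features per coordinate
        blocks = [nn.Linear(in_dim, hidden), nn.ReLU()]
        for _ in range(layers - 1):
            blocks += [nn.Linear(hidden, hidden), nn.ReLU()]
        blocks += [nn.Linear(hidden, 3), nn.Sigmoid()]  # RGB in [0, 1]
        self.net = nn.Sequential(*blocks)

    def forward(self, coords):
        # coords: (N, 3) with x, y, t each normalized to [-1, 1]
        bands = 2.0 ** torch.arange(
            self.freqs, dtype=torch.float32, device=coords.device) * torch.pi
        feats = coords[..., None] * bands            # (N, 3, freqs)
        feats = torch.cat([feats.sin(), feats.cos()], dim=-1)
        return self.net(feats.flatten(-2))           # (N, 3) RGB

# "Compressing" a video amounts to overfitting the MLP to that single
# video and storing (quantized) weights instead of pixels.
model = VideoINR()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
video = torch.rand(8, 32, 32, 3)                     # dummy (T, H, W, 3) clip

T, H, W, _ = video.shape
ts, ys, xs = torch.meshgrid(
    torch.linspace(-1, 1, T), torch.linspace(-1, 1, H),
    torch.linspace(-1, 1, W), indexing="ij")
coords = torch.stack([xs, ys, ts], dim=-1).reshape(-1, 3)
targets = video.reshape(-1, 3)

for step in range(200):                              # fit the signal
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(coords), targets)
    loss.backward()
    opt.step()
```

Decoding then amounts to querying the trained network on any coordinate grid, and the compression rate is the ratio of stored weights to raw pixels; the dissertation's contributions concern making such encoding fast and the resulting weights semantically meaningful.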