An efficient neural representation for videos

dc.contributor.advisor: Shrivastava, Abhinav (en_US)
dc.contributor.author: Chen, Hao (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2023-10-06T05:37:18Z
dc.date.available: 2023-10-06T05:37:18Z
dc.date.issued: 2023 (en_US)
dc.description.abstract: With the increasing popularity of videos, it has become crucial to find efficient and compact ways to represent them for easier storage, transmission, and downstream video tasks. This dissertation proposes a neural representation for videos called NeRV, which stores each video implicitly as a neural network. Building on NeRV, we introduce a hybrid representation for videos called HNeRV, which improves internal generalization and representation capacity. HNeRV allows for highly efficient video representation and compression, with a model size that can be up to 1000 times smaller than the original raw video. Apart from efficiency, HNeRV's simple decoding process, which involves only a feedforward operation, enables fast video loading and easy deployment. To further enhance efficiency, we develop a neural video dataloader called NVLoader, which is 3-6 times faster than conventional video dataloaders. To address encoding speed, we also introduce the HyperNeRV framework, which uses a hypernetwork to map input videos directly to NeRV model weights, making encoding 10^4 times faster. Aside from developing compact and implicit neural video representations, we explore several compelling applications, including frame interpolation, video restoration, and video editing. Furthermore, the compactness of these representations makes them an ideal output format for video generation models, significantly reducing the search space. They can also serve as an efficient input for video understanding models. (en_US)
dc.identifier: https://doi.org/10.13016/dspace/rpio-zrgb
dc.identifier.uri: http://hdl.handle.net/1903/30742
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.subject.pquncontrolled: efficient video loading (en_US)
dc.subject.pquncontrolled: Implicit neural representation (en_US)
dc.subject.pquncontrolled: Video compression (en_US)
dc.subject.pquncontrolled: video editing (en_US)
dc.subject.pquncontrolled: video restoration (en_US)
dc.title: An efficient neural representation for videos (en_US)
dc.type: Dissertation (en_US)

Files

Original bundle
Name: Chen_umd_0117E_23302.pdf
Size: 27.75 MB
Format: Adobe Portable Document Format