Computer Science Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/2756
2 results
Search Results
Item: Advance Video Modeling Techniques for Video Generation and Enhancement Tasks (2024)
Shrivastava, Gaurav; Shrivastava, Abhinav; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis investigates advanced video modeling techniques for generation and enhancement tasks. In the first part of the thesis, we explore generative modeling that exploits an external corpus to learn priors. The task here is video prediction, i.e., extrapolating future sequences given a few context frames. In a follow-up work, we also demonstrate how we can further reduce the inference time and make the video prediction model more efficient. Additionally, we demonstrate that we can extrapolate not just one future sequence but multiple sequences from the given context frames. In the second part, we explore methods that exploit the internal statistics of videos to perform various restoration and enhancement tasks. Here, we show how robustly these methods perform restoration tasks such as denoising, super-resolution, frame interpolation, and object removal. Furthermore, in a follow-up work, we utilize the inherent compositionality of videos and their internal statistics to perform a wider variety of enhancement tasks, such as relighting, dehazing, and foreground/background manipulations. Lastly, we provide insight into how data-free enhancement techniques and multi-step video prediction techniques could be improved in future work.

Item: Diverse Video Generation (2021)
Shrivastava, Gaurav; Shrivastava, Abhinav; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Generating future frames given a few context (or past) frames is a challenging task. It requires modeling the temporal coherence of videos and multi-modality in terms of diversity in the potential future states. Current variational approaches for video generation tend to marginalize over multi-modal future outcomes. Instead, in this thesis, we propose to explicitly model the multi-modality in the future outcomes and leverage it to sample diverse futures. Our approach, Diverse Video Generator, uses a Gaussian Process (GP) to learn priors on future states given the past and maintains a probability distribution over possible futures given a particular sample. In addition, we leverage the changes in this distribution over time to control the sampling of diverse future states by estimating the end of ongoing sequences. That is, we use the variance of the GP over the output function space to trigger a change in an action sequence. We achieve state-of-the-art results on diverse future frame generation in terms of reconstruction quality and diversity of the generated sequences.
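The idea of using GP variance as a trigger can be illustrated with a toy sketch. This is not the thesis's implementation: it assumes a scalar time index as the GP input, an RBF kernel, and hand-picked values for the noise and threshold, whereas the abstract describes a GP over learned future states. The names `posterior_variance` and `first_trigger` are hypothetical. The sketch only shows that GP predictive variance grows as a query moves away from the observed context, so a threshold on it can signal when to switch.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def posterior_variance(train_t, query_t, noise=1e-3):
    """GP predictive variance: k(q, q) - k_q^T (K + noise * I)^-1 k_q."""
    n = len(train_t)
    K = [[rbf(train_t[i], train_t[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_q = [rbf(t, query_t) for t in train_t]
    alpha = solve(K, k_q)
    return rbf(query_t, query_t) - sum(a * b for a, b in zip(k_q, alpha))

def first_trigger(train_t, horizon, threshold):
    """First future step whose variance exceeds threshold, else None.

    Assumption: a threshold on variance stands in for the 'end of an
    ongoing sequence' signal described in the abstract.
    """
    last = train_t[-1]
    for step in range(1, horizon + 1):
        if posterior_variance(train_t, last + step) > threshold:
            return step
    return None

context = [0.0, 1.0, 2.0, 3.0, 4.0]  # time indices of observed context frames
step = first_trigger(context, horizon=10, threshold=0.99)
```

Here `step` is the first extrapolated step at which the toy GP is uncertain enough to cross the threshold. In a real system the threshold and kernel would need to be tuned or learned.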