Modeling Deep Context in Spatial and Temporal Domain

dc.contributor.advisor: Davis, Larry S. (en_US)
dc.contributor.author: Dai, Xiyang (en_US)
dc.contributor.department: Computer Science (en_US)
dc.contributor.publisher: Digital Repository at the University of Maryland (en_US)
dc.contributor.publisher: University of Maryland (College Park, Md.) (en_US)
dc.date.accessioned: 2019-02-08T06:30:30Z
dc.date.available: 2019-02-08T06:30:30Z
dc.date.issued: 2018 (en_US)
dc.description.abstract: Context has been one of the most important aspects of computer vision research because it provides useful guidance for solving a variety of tasks in both the spatial and temporal domains. With the recent rise of deep learning, deep networks have shown impressive performance on many computer vision tasks. Modeling deep context, both explicitly and implicitly, in deep networks can further boost the effectiveness and efficiency of deep models. In the spatial domain, implicitly modeling context is useful for learning discriminative texture representations: we present an effective deep fusion architecture that captures both the first-order and second-order statistics of texture features. Explicitly modeling context is also important for challenging tasks such as fine-grained classification: we present a deep multi-task network that explicitly captures geometric constraints by simultaneously conducting fine-grained classification and key-point localization. In the temporal domain, explicitly modeling context is crucial for activity recognition and localization: we present a temporal context network that explicitly captures the relative context around a proposal, sampling two temporal scales pair-wisely for precise temporal localization of human activities. Implicitly modeling context can also lead to better network architectures for video applications: we present a temporal aggregation network that learns a deep hierarchical representation to capture temporal consistency. Finally, we study jointly modeling context in both the spatial and temporal domains for human action understanding, which requires predicting where, when, and what a human action happens in a crowded scene. We present a decoupled framework with dedicated branches for spatial localization and temporal recognition; context in the spatial and temporal branches is modeled explicitly and later fused to generate the final predictions. (en_US)
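The abstract mentions a fusion architecture combining first-order and second-order statistics of texture features. As a rough illustration only (not the dissertation's actual architecture), a minimal NumPy sketch of such a fusion might concatenate the channel means (first-order) with a normalized Gram matrix, in the spirit of bilinear pooling; the function name and normalization choices here are assumptions for illustration.

```python
import numpy as np

def fuse_texture_statistics(features):
    """Fuse first- and second-order statistics of a feature map.

    features: array of shape (C, N) -- C channels, N spatial locations.
    Returns a 1-D descriptor concatenating the channel means
    (first-order) with the flattened Gram matrix (second-order,
    as in bilinear pooling). This is an illustrative sketch, not
    the architecture described in the dissertation.
    """
    c, n = features.shape
    first_order = features.mean(axis=1)      # per-channel mean, shape (C,)
    gram = features @ features.T / n         # channel co-occurrences, (C, C)
    second_order = gram.flatten()
    desc = np.concatenate([first_order, second_order])
    # signed square-root + L2 normalization, a common post-processing
    # step for bilinear descriptors
    desc = np.sign(desc) * np.sqrt(np.abs(desc))
    return desc / (np.linalg.norm(desc) + 1e-12)
```

In a deep network the same idea would typically be applied to convolutional feature maps, with the fused descriptor fed to a classifier head.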
dc.identifier: https://doi.org/10.13016/ifop-it5w
dc.identifier.uri: http://hdl.handle.net/1903/21735
dc.language.iso: en (en_US)
dc.subject.pqcontrolled: Computer science (en_US)
dc.subject.pquncontrolled: computer vision (en_US)
dc.subject.pquncontrolled: context (en_US)
dc.subject.pquncontrolled: deep learning (en_US)
dc.title: Modeling Deep Context in Spatial and Temporal Domain (en_US)
dc.type: Dissertation (en_US)

Files

Original bundle

Name: Dai_umd_0117E_19542.pdf
Size: 16.96 MB
Format: Adobe Portable Document Format