Computer Science Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/2756
Item Multi-Object Tracking, Event Modeling, and Activity Discovery in Video Sequences (2007-04-26) Joo, Seong-Wook; Chellappa, Rama; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

One of the main goals of computer vision is video understanding, where objects in the video are detected and tracked, and their behavior is analyzed. In this dissertation, several key problems in video understanding are addressed, focusing on video surveillance applications. Moving target detection and tracking is one of the most fundamental tasks in visual surveillance. A new moving target detection method is proposed where the temporal variance is used as a measure for characterizing object motion. Our method is experimentally shown to produce high detection rates while keeping false positive rates low. In tracking multiple objects, it is essential to correctly associate targets and measurements. We describe an efficient multi-object tracking approach that maintains multiple hypotheses over time regarding the association of targets and measurements. The data association problem is solved by a combinatorial optimization technique which finds the most likely association, allowing track initiation, termination, merge, and split. Experimental results show that our method tracks through varying degrees of interaction among the targets with a high success rate. Recognizing complex high-level events requires an explicit model of the structure of the events. Our approach uses an attribute grammar to represent such events, which formally specifies the syntax of the symbols and the conditions on the attributes. Events are recognized using an extension of the Earley parser that handles attributes and concurrent event threads. Various examples of recognizing specific events of interest and detecting abnormal events are demonstrated using real data.
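The temporal-variance idea mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the dissertation's actual formulation, so everything here is an assumption made for illustration: the function name `temporal_variance_mask`, the fixed window of frames, the plain population variance, and the hand-picked threshold. Pixels whose intensity varies strongly over time are flagged as candidate motion.

```python
from statistics import pvariance


def temporal_variance_mask(frames, threshold):
    """Flag pixels whose intensity variance over time exceeds a threshold.

    frames: list of equally sized 2D grayscale frames (lists of lists).
    threshold: hypothetical, hand-picked variance cutoff.
    Returns a 2D list of booleans (True = candidate moving pixel).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    height, width = len(frames[0]), len(frames[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Intensity of this pixel across all frames in the window.
            series = [frame[y][x] for frame in frames]
            row.append(pvariance(series) > threshold)
        mask.append(row)
    return mask


# Toy data: a bright blob moves along the top row of a static background.
background = [[10, 10, 10], [10, 10, 10]]
frames = []
for t in range(4):
    frame = [row[:] for row in background]
    frame[0][t % 3] = 200
    frames.append(frame)

mask = temporal_variance_mask(frames, threshold=100.0)
```

In this toy case only the top row, which the blob crosses, is flagged. A real detector would also need to cope with noise, illumination changes, and camera motion, none of which this sketch addresses.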
Unsupervised methods for learning human activities have been largely based on clustering trajectories from a given scene. However, conventional clustering algorithms are not suitable for scenes that have many outlier trajectories. We describe a method for finding only salient groups of trajectories, using the probability of trajectories accidentally forming a group as the measure of significance of the group. The grouping algorithm finds groups that maximize significance, while automatically determining the threshold for significance. We validate our approach on real data and analyze its performance using simulated data.

Item Applications of Factorization Theorem and Ontologies for Activity Modeling, Recognition and Anomaly Detection (2005-05-06) Akdemir, Umut; Chellappa, Rama; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)

In this thesis, two approaches for activity modeling and suspicious activity detection are examined. The first is the application of an extension of the factorization theorem for deformable models in two different contexts: human activity detection from joint position information, and suspicious activity detection for tarmac security. It is shown that the first basis vector from the factorization theorem is good enough to differentiate activities for human data and to distinguish suspicious activities for tarmac security data. The second approach differentiates individual components of those activities using semantic methodology. Although ontologies are currently used mainly for improving search and information retrieval, we show that they are applicable to video surveillance. We evaluate the domain ontologies from the Challenge Project on Video Event Taxonomy sponsored by ARDA from the perspective of general ontology design principles.
We also focus on the effect of the domain on the granularity of the ontology for suspicious activity detection.

Item View-Invariance in Visual Human Motion Analysis (2004-04-29) Parameswaran, Vasudev; Chellappa, Rama; Computer Science

This thesis makes contributions towards the solutions to two problems in the area of visual human motion analysis: human action recognition and human body pose estimation. Although there has been a substantial amount of research addressing these two problems in the past, the important issue of viewpoint invariance in the representation and recognition of poses and actions has received relatively little attention, and forms a key goal of this thesis. Drawing on results from 2D projective invariance theory and 3D mutual invariants, we present three different approaches, of varying degrees of generality, for human action representation and recognition. A detailed analysis of the approaches reveals key challenges, which are circumvented by enforcing spatial and temporal coherency constraints. An extensive performance evaluation of the approaches on 2D projections of motion capture data and manually segmented real image sequences demonstrates that, in addition to viewpoint changes, the approaches cope well with varying speeds of execution of actions (and hence different frame rates of the video), different subjects, and minor variabilities in the spatiotemporal dynamics of the action. Next, we present a method for recovering the body-centric coordinates of key joints and parts of a canonically scaled human body, given an image of the body and the point correspondences of specific body joints in an image. This problem is difficult to solve because of body articulation and perspective effects. To make the problem tractable, previous researchers have resorted to restricting the camera model or requiring an unrealistic number of point correspondences, both of which are more restrictive than necessary.
We present a solution for the general case of a perspective uncalibrated camera. Our method requires that the torso does not twist considerably, an assumption that is usually satisfied for many poses of the body. We evaluate the quantitative performance of the method on synthetic data and the qualitative performance of the method on real images taken with unknown cameras and viewpoints. Both these evaluations show the effectiveness of the method at recovering the pose of the human body.
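The abstract above draws on 2D projective invariance theory without naming the specific invariants used. As a hedged illustration of what a projective invariant is, the sketch below uses the textbook cross-ratio of four collinear points, which is unchanged by any one-dimensional projective map x -> (p*x + q) / (r*x + s) with p*s - q*r != 0. The function names and the sample coefficients are assumptions chosen for the example and are not taken from the thesis.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four distinct points on a line."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map(x, p, q, r, s):
    """1D projective transformation x -> (p*x + q) / (r*x + s).

    Requires p*s - q*r != 0 and r*x + s != 0 for the given x.
    """
    return (p * x + q) / (r * x + s)


# Four collinear points, given by their coordinate along the line.
points = [Fraction(0), Fraction(1), Fraction(3), Fraction(7)]

# Hypothetical map standing in for a change of viewpoint (p*s - q*r = 9).
coeffs = (Fraction(2), Fraction(1), Fraction(1), Fraction(5))
mapped = [projective_map(x, *coeffs) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)
print(before, after)  # both are 9/7
```

Exact rational arithmetic is used so that the equality holds without floating-point tolerance. With measured image points, the two values would agree only approximately.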