Petri Net Models for Event Recognition in Surveillance Videos

umi-umd-4317.pdf (2.73 MB)
Ghanem, Nagia M
Davis, Larry
Video surveillance is the monitoring of the behavior of people and objects in public places, e.g. airports and traffic intersections, by means of cameras, usually for safety and security purposes. As the amount of video data gathered daily by surveillance cameras increases, so does the need for automatic systems that detect and recognize suspicious activities performed by people and objects. The first part of the thesis describes a framework for modeling and recognizing events in surveillance video. Our framework is based on deterministic inference using Petri nets. Events are composed by combining primitive events and previously defined events through spatial, temporal and logical relations. We provide a graphical user interface (GUI) for formulating such event models. Our approach automatically maps each of these models into a set of Petri net filters that represent the components of the event. Lower-level video processing modules, e.g. background subtraction, tracking and classification, detect the occurrence of primitive events. These primitive events are then filtered by the Petri net filters to recognize composite events of interest. The framework is general, and we have applied it to many surveillance domains.

In the second part of the thesis, we address the problem of detecting carried objects, which is the main step in solving the problem of left object detection. We present two approaches to the left object detection problem. Both approaches pose the problem as a classification problem. For both, we trained SVM classifiers on a laboratory database that contains examples of people seen with and without two common objects, namely backpacks and suitcases. We used a boosting technique, AdaBoost, to select the most discriminative features used by the SVMs and to enhance the performance of the classifiers. We give recognition results for each approach, then compare the two and describe the advantages of each. We also compare the performance of both approaches on real-world videos captured at the Munich airport.
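
To make the idea of filtering primitive events through a Petri net more concrete, the following is a minimal illustrative sketch in Python, not the thesis implementation. It assumes a purely sequential composite event, and the class names (`Transition`, `PetriNetFilter`), the helper functions, and the event labels (`enter`, `drop_bag`, `exit`) are all hypothetical. A transition fires when its primitive event arrives and every one of its input places holds a token; the composite event counts as recognized once a token reaches the final place.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    label: str      # primitive event that can trigger this transition
    inputs: tuple   # places that must each hold at least one token
    outputs: tuple  # places that receive a token when the transition fires


@dataclass
class PetriNetFilter:
    transitions: list
    final_place: str
    marking: dict = field(default_factory=dict)  # place name -> token count

    def feed(self, event: str) -> bool:
        """Consume one primitive event; return True once the final place is marked."""
        for t in self.transitions:
            enabled = all(self.marking.get(p, 0) > 0 for p in t.inputs)
            if t.label == event and enabled:
                for p in t.inputs:
                    self.marking[p] -= 1
                for p in t.outputs:
                    self.marking[p] = self.marking.get(p, 0) + 1
                break  # fire at most one transition per incoming event
        return self.marking.get(self.final_place, 0) > 0


def make_sequence_filter(labels):
    """Build a net that recognizes the given primitive events in order (hypothetical helper)."""
    places = [f"p{i}" for i in range(len(labels) + 1)]
    transitions = [
        Transition(label, (places[i],), (places[i + 1],))
        for i, label in enumerate(labels)
    ]
    # One token in the start place; unrelated events leave the marking unchanged.
    return PetriNetFilter(transitions, places[-1], {places[0]: 1})


def recognize(net, events):
    """Feed a stream of primitive events; True if the composite event is recognized."""
    return any(net.feed(e) for e in events)


if __name__ == "__main__":
    net = make_sequence_filter(["enter", "drop_bag", "exit"])
    print(recognize(net, ["enter", "walk", "drop_bag", "exit"]))
```

In this sketch, primitive events that do not match an enabled transition are simply ignored, which is one way to read "filtering". Spatial and logical relations, as well as concurrent branches, would need transitions with several input places and additional guard conditions, which are omitted here.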