Petri Net Models for Event Recognition in Surveillance Videos
Abstract
Video surveillance is the process of monitoring the behavior of people and objects in public places, e.g. airports and traffic intersections, by means of cameras, usually for safety and security purposes. As the amount of video data gathered daily by surveillance cameras grows, so does the need for automatic systems that detect and recognize suspicious activities performed by people and objects.
The first part of the thesis describes a framework for modeling and recognizing events in surveillance video. Our framework is based on deterministic inference using Petri nets. Composite events are built by combining primitive events and previously defined events through spatial, temporal, and logical relations, and a graphical user interface (GUI) is provided for formulating such event models. Our approach automatically maps each model into a set of Petri net filters that represent the components of the event. Lower-level video processing modules, e.g. background subtraction, tracking, and classification, detect the occurrence of primitive events; these primitive events are then filtered by the Petri net filters to recognize the composite events of interest. Our framework is general and has been applied to many surveillance domains.
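To make the Petri net filtering idea concrete, the following is a minimal sketch in Python of how a composite event can be recognized from a stream of primitive events. The class names, the primitive event labels, and the two-step "enter zone, then drop object" model are illustrative assumptions, not the thesis's actual implementation.

```python
# Sketch of a Petri net filter: tokens move through places as primitive
# events arrive; the composite event is recognized when the final place
# holds a token. Names and event labels are hypothetical.

class Place:
    def __init__(self, name, tokens=0):
        self.name = name
        self.tokens = tokens

class Transition:
    def __init__(self, event_label, inputs, outputs):
        self.event_label = event_label   # primitive event that enables this transition
        self.inputs = inputs             # places that must each hold a token
        self.outputs = outputs           # places that receive a token when fired

    def enabled(self):
        return all(p.tokens > 0 for p in self.inputs)

    def fire(self):
        for p in self.inputs:
            p.tokens -= 1
        for p in self.outputs:
            p.tokens += 1

class PetriNetFilter:
    """Consumes primitive events and reports when the final place is marked."""
    def __init__(self, transitions, final_place):
        self.transitions = transitions
        self.final_place = final_place

    def process(self, event_label):
        for t in self.transitions:
            if t.event_label == event_label and t.enabled():
                t.fire()
        return self.final_place.tokens > 0

# A tiny sequential ("A and then B") composite event model.
start    = Place("start", tokens=1)
in_zone  = Place("person_in_zone")
done     = Place("object_abandoned")

net = PetriNetFilter(
    transitions=[
        Transition("enter_zone",  [start],   [in_zone]),
        Transition("drop_object", [in_zone], [done]),
    ],
    final_place=done,
)

# Primitive events would come from the low-level modules (tracking, classification, ...).
for ev in ["enter_zone", "drop_object"]:
    if net.process(ev):
        print("Composite event recognized")
```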
In the second part of the thesis, we address the problem of detecting carried objects, which is the key step toward solving the left-object detection problem. We present two approaches to this problem, both of which pose it as a classification task. For both approaches, we trained SVM classifiers on a laboratory database containing examples of people seen with and without two common objects, namely backpacks and suitcases. We used a boosting technique, AdaBoost, to select the most discriminative features for the SVMs and to improve classifier performance. We report recognition results for each approach, compare the two approaches, and describe the advantages of each. We also compare the performance of both approaches on real-world videos captured at the Munich airport.