Appearance modeling under geometric context for object recognition in videos

Thumbnail Image


umi-umd-3692.pdf (4.61 MB)
No. of downloads: 2148

Publication or External Link







Object recognition is a very important high-level task in surveillance applications. This dissertation focuses on building appearance models for object recognition and exploring the relationship between shape and appearance for two key types of objects, human and vehicle. The dissertation proposes a generic framework that models the appearance while incorporating certain geometric prior information, or the so-called geometric context. Then under this framework, special methods are developed for recognizing humans and vehicles based on their appearance and shape attributes in surveillance videos.

The first part of the dissertation presents a unified framework based on a general definition of geometric transform (GeT) which is applied to modeling object appearances under geometric context. The GeT models the appearance by applying designed functionals over certain geometric sets. GeT unifies Radon transform, trace transform, image warping etc. Moreover, five novel types of GeTs are introduced and applied to fingerprinting the appearance inside a contour. They include GeT based on level sets, GeT based on shape matching, GeT based on feature curves, GeT invariant to occlusion, and a multi-resolution GeT (MRGeT) that combines both shape and appearance information.

The second part focuses on how to use the GeT to build appearance models for objects like walking humans, which have articulated motion of body parts. This part also illustrates the application of GeT for object recognition, image segmentation, video retrieval, and image synthesis. The proposed approach produces promising results when applied to automatic body part segmentation and fingerprinting the appearance of a human and body parts despite the presence of non-rigid deformations and articulated motion.

It is very important to understand the 3D structure of vehicles in order to recognize them. To reconstruct the 3D model of a vehicle, the third part presents a factorization method for structure from planar motion. Experimental results show that the algorithm is accurate and fairly robust to noise and inaccurate calibration. Differences and the dual relationship between planar motion and planar object are also clarified in this part. Based on our method, a fully automated vehicle reconstruction system has been designed.