Algorithmic issues in visual object recognition

This thesis is divided into two parts covering two aspects of
research in the area of visual object recognition.

Part I is about human detection in still images. Human
detection is a challenging computer vision task due to the wide
variability in human visual appearances and body poses. In this
part, we present several enhancements to human detection
algorithms. First, we present an extension to the integral
images framework to allow for constant-time computation of
non-uniformly weighted summations over rectangular regions
using a bundle of integral images. Such a computational element
is commonly used in constructing gradient-based feature
descriptors, which are the most successful in shape-based human
detection. Second, we introduce deformable features as an
alternative to the conventional static features used in
classifiers based on boosted ensembles. Deformable features can
enhance the accuracy of human detection by adapting to pose
changes that can be described as translations of body features.
Third, we present a comprehensive evaluation framework for
cascade-based human detectors. The presented framework
facilitates comparison between cascade-based detection
algorithms, provides a confidence measure for the results, and
deploys a practical evaluation scenario.
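
The idea behind the first enhancement in Part I can be illustrated with a small sketch. The abstract does not give the exact formulation, so the following assumes the simplest non-uniform case: weights that are linear in the pixel coordinates, w(x, y) = a + b*x + c*y. Under that assumption, three integral images (of f, x*f, and y*f) are enough to evaluate the weighted sum over any rectangle with a fixed number of lookups. The function names and the choice of weight family are hypothetical and are not taken from the thesis.

```python
def integral_image(values):
    """Return an (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(values), len(values[0])
    table = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += values[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x0, y0, x1, y1):
    """Sum over the half-open rectangle [x0, x1) x [y0, y1) using four lookups."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def build_bundle(image):
    """Build integral images of f, x*f and y*f (an assumed, illustrative bundle)."""
    h, w = len(image), len(image[0])
    xf = [[x * image[y][x] for x in range(w)] for y in range(h)]
    yf = [[y * image[y][x] for x in range(w)] for y in range(h)]
    return integral_image(image), integral_image(xf), integral_image(yf)


def weighted_rect_sum(bundle, x0, y0, x1, y1, a, b, c):
    """Sum of (a + b*x + c*y) * f(x, y) over the rectangle, in constant time."""
    ii_f, ii_xf, ii_yf = bundle
    return (a * rect_sum(ii_f, x0, y0, x1, y1)
            + b * rect_sum(ii_xf, x0, y0, x1, y1)
            + c * rect_sum(ii_yf, x0, y0, x1, y1))


def brute_force(image, x0, y0, x1, y1, a, b, c):
    """Reference implementation that visits every pixel in the rectangle."""
    return sum((a + b * x + c * y) * image[y][x]
               for y in range(y0, y1) for x in range(x0, x1))
```

Building the bundle costs one pass over the image per table. After that, each query costs twelve table lookups regardless of the rectangle size, which is what makes this kind of element attractive for dense gradient-based descriptors.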

Part II explores the possibilities of enhancing the speed of
core algorithms used in visual object recognition using the
computing capabilities of Graphics Processing Units (GPUs).
First, we present an implementation of Graph Cut on GPUs, which
achieves up to a 4x speedup compared to a CPU
implementation. The Graph Cut algorithm has many applications
related to visual object recognition, such as segmentation and
3D point matching. Second, we present an efficient sparse
approximation of kernel matrices for GPUs that can
significantly speed up kernel-based learning algorithms, which
are widely used in object detection and recognition. We present
an implementation of the Affinity Propagation clustering
algorithm based on this representation, which is about 6 times
faster than another GPU implementation based on a conventional
sparse matrix representation.