Theses and Dissertations from UMD

Permanent URI for this communityhttp://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4 month delay in the appearance of a give thesis/dissertation in DRUM

More information is available at Theses and Dissertations at University of Maryland Libraries.

Browse

Search Results

Now showing 1 - 3 of 3
  • Thumbnail Image
    Item
    Looking at People Using Partial Least Squares
    (2010) Schwartz, William Robson; Davis, Larry S; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Analysis of images involving humans is of significant interest in computer vision because problems such as detection, modeling, recognition, and tracking are fundamental to model interactions between people and understand high-level activities. Visual information contained in images is generally represented using descriptors (features). Many general classes of descriptors have been proposed focusing on different characteristics of images. Therefore, if one considers only a single descriptor, one might ignore useful information for a given task, compromising performance. In this research we consider a rich set of image descriptors analyzed by a statistical technique known as Partial Least Squares (PLS). PLS is a class of methods for modeling relations between sets of observations by means of latent variables and it is used to project exemplars from a very high dimensional feature space onto a low dimensional subspace. We demonstrate the effectiveness of combining a richer set of descriptors using PLS in two significant tasks in computer vision. First, we propose a method to detect humans, which is then extended to handle partial occlusion and finally a framework based on PLS regression models is incorporated to further reduce the computational cost. Second, an object recognition framework based on a one-against-all scheme is exploited for appearance-based person modeling and face identification.
  • Thumbnail Image
    Item
    Algorithmic issues in visual object recognition
    (2009) Hussein, Mohamed Elsayed Ahmed; Davis, Larry; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis is divided into two parts covering two aspects of research in the area of visual object recognition. Part I is about human detection in still images. Human detection is a challenging computer vision task due to the wide variability in human visual appearances and body poses. In this part, we present several enhancements to human detection algorithms. First, we present an extension to the integral images framework to allow for constant time computation of non-uniformly weighted summations over rectangular regions using a bundle of integral images. Such computational element is commonly used in constructing gradient-based feature descriptors, which are the most successful in shape-based human detection. Second, we introduce deformable features as an alternative to the conventional static features used in classifiers based on boosted ensembles. Deformable features can enhance the accuracy of human detection by adapting to pose changes that can be described as translations of body features. Third, we present a comprehensive evaluation framework for cascade-based human detectors. The presented framework facilitates comparison between cascade-based detection algorithms, provides a confidence measure for result, and deploys a practical evaluation scenario. Part II explores the possibilities of enhancing the speed of core algorithms used in visual object recognition using the computing capabilities of Graphics Processing Units (GPUs). First, we present an implementation of Graph Cut on GPUs, which achieves up to 4x speedup against compared to a CPU implementation. The Graph Cut algorithm has many applications related to visual object recognition such as segmentation and 3D point matching. Second, we present an efficient sparse approximation of kernel matrices for GPUs that can significantly speed up kernel based learning algorithms, which are widely used in object detection and recognition. We present an implementation of the Affinity Propagation clustering algorithm based on this representation, which is about 6 times faster than another GPU implementation based on a conventional sparse matrix representation.
  • Thumbnail Image
    Item
    Modeling Shape, Appearance and Motion for Human Movement Analysis
    (2009) Lin, Zhe; Davis, Larry S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Shape, Appearance and Motion are the most important cues for analyzing human movements in visual surveillance. Representation of these visual cues should be rich, invariant and discriminative. We present several approaches to model and integrate them for human detection and segmentation, person identification, and action recognition. First, we describe a hierarchical part-template matching approach to simultaneous human detection and segmentation combining local part-based and global shape-based schemes. For learning generic human detectors, a pose-adaptive representation is developed based on a hierarchical tree matching scheme and combined with an support vector machine classifier to perform human/non-human classification. We also formulate multiple occluded human detection using a Bayesian framework and optimize it through an iterative process. We evaluated the approach on several public pedestrian datasets. Second, given regions of interest provided by human detectors, we introduce an approach to iteratively estimates segmentation via a generalized Expectation-Maximization algorithm. The approach incorporates local Markov random field constraints and global pose inferences to propagate beliefs over image space iteratively to determine a coherent segmentation. Additionally, a layered occlusion model and a probabilistic occlusion reasoning scheme are introduced to handle inter-occlusion. The approach is tested on a wide variety of real-life images. Third, we describe an approach to appearance-based person recognition. In learning, we perform discriminative analysis through pairwise coupling of training samples, and estimate a set of normalized invariant profiles by marginalizing likelihood ratio functions which reflect local appearance differences. In recognition, we calculate discriminative information-based distances by a soft voting approach, and combine them with appearance-based distances for nearest neighbor classification. We evaluated the approach on videos of 61 individuals under significant illumination and viewpoint changes. Fourth, we describe a prototype-based approach to action recognition. During training, a set of action prototypes are learned in a joint shape and motion space via $k$-means clustering; During testing, humans are tracked while a frame-to-prototype correspondence is established by nearest neighbor search, and then actions are recognized using dynamic prototype sequence matching. Similarity matrices used for sequence matching are efficiently obtained by look-up table indexing. We experimented the approach on several action datasets.