Sparse Methods for Robust and Efficient Visual Recognition

Thumbnail Image


Publication or External Link





Visual recognition has been a subject of extensive research in computer vision. A vast literature exists on feature extraction and learning methods for recognition. However, due to large variations in visual data, robust visual recognition is still an open problem. In recent years, sparse representation-based methods have become popular for visual recognition. By learning a compact dictionary of data and exploiting the notion of sparsity, start-of-the-art results have been obtained on many recognition tasks. However, existing data-driven sparse model techniques may not be optimal for some challenging recognition problems. In this dissertation, we consider some of these recognition tasks and present approaches based on sparse coding for robust and efficient recognition in such cases.

First we study the problem of low-resolution face recognition. This is a challenging problem, and methods have been proposed using super-resolution and machine learning based techniques. However, these methods cannot handle variations like illumination changes which can happen at low resolutions, and degrade the performance. We propose a generative approach for classifying low resolution faces, by exploiting 3D face models.

Further, we propose a joint sparse coding framework for robust classification at low resolutions. The effectiveness of the method is demonstrated on different face datasets.

In the second part, we study a robust feature-level fusion method for multimodal biometric recognition. Although score-level and decision-level fusion methods exist in biometric literature, feature-level fusion is challenging due to different output formats of biometric modalities. In this work, we propose a novel sparse representation-based method for multimodal fusion, and present experimental results for a large multimodal dataset. Robustness to noise and occlusion are demonstrated.

In the third part, we consider the problem of domain adaptation, where we want to learn effective classifiers for cases where the test images come from a different distribution than the training data. Typically, due to high cost of human annotation, very few labeled samples are available for images in the test domain. Specifically, we study the problem of adapting sparse dictionary-based classification methods for such cases. We describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both domains in the projected low dimensional space. The proposed method is efficient and performs on par or better than many competing state-of-the-art methods.

Lastly, we study an emerging analysis framework of sparse coding for image classification. We show that the analysis sparse coding can give similar performance as the typical synthesis sparse coding methods, while being much faster at sparse encoding. In the end, we conclude the dissertation with discussions and possible future directions.