Efficient Evaluation of Gaussian Sums with Applications in Vision and Learning

Thumbnail Image


umi-umd-2804.pdf (5.8 MB)
No. of downloads: 2545

Publication or External Link






Evaluating sums of multivariate Gaussians is a common computational task in image processing, computer vision and learning, including in the general and powerful technique of kernel density estimation (KDE). The quadratic computational complexity of KDE is a significant barrier to practical applications. The fast Gauss transform (FGT) has successfully accelerated the kernel density estimation to linear time for low-dimensional problems. Unfortunately, direct extension of the FGT to higher-dimensional problems results in an algorithm whose cost grows exponentially with dimension, making it impractical for dimensions above three. We develop an improved fast Gauss transform (IFGT) in higher dimensions. A new multivariate expansion scheme and an adaptive space subdivision technique dramatically improve the performance. The IFGT has been applied to the mean shift algorithm achieving linear computational complexity, and been applied to applications such as image segmentation, and visual tracking.

Another issue when applying kernel methods to vision problems is that the statistical similarity measures commonly used were meant for low-dimensional problems. In particular similarity measures such as the Bhattacharya coefficient or the K-L distance were computed using histograms. When these measures are used for high-dimensional problems it is found that they are either unstable or not discriminative, and further are expensive to compute. We proposed a novel simple symmetric similarity function between kernel-density estimates of the model and target distributions. The mean-shift algorithm is used to track objects by iteratively maximizing this similarity function. To alleviate the quadratic complexity of the density estimation, we employ Gaussian kernels and the IFGT, which leads to very efficient and robust nonparametric tracking algorithms. The proposed algorithms are tested with several image sequences and shown to achieve robust real-time tracking performance.

The amount of computation required for the recently popular kernel machines used in learning grows with $N$ training samples at least as $O(N^2)$. Such a computational complexity is significant even for moderate size problems and is prohibitive for large datasets. We present an approximation technique based on the IFGT to reduce the computation into $O(N)$. We also give error bound for the approximation, and show experimental results on the UCI datasets without significant decreasing in the accuracy.