The compositional character of visual correspondence

Given two images of a scene, the problem of finding a map relating the points in the two images is known as the correspondence problem. Stereo correspondence is a special case in which corresponding points lie on the same row in the two images; optical flow is the general case. In this thesis, we argue that correspondence is inextricably linked to other problems such as depth segmentation, occlusion detection, and shape estimation, and that it cannot be solved in isolation: these problems must be solved concurrently within a compositional framework. We first demonstrate the relationship between correspondence and segmentation in a world devoid of shape, and propose an algorithm based on connected components that solves these two problems simultaneously by matching image pixels. Occlusions are found by using the uniqueness constraint, which forces each pixel in the first image to match exactly one pixel in the second image. Shape is then introduced, and we show that a horizontally slanted surface is sampled differently by the two cameras of a stereo pair, creating images of different width. In this scenario, pixel matching must be replaced by interval matching, so that intervals of different width in the two images can correspond. A new interval uniqueness constraint is proposed to detect occlusions. Vertical slant is shown to have a qualitatively different character from horizontal slant, requiring the use of vertical consistency constraints based on non-horizontal edges. Complexities that arise in optical flow estimation in the presence of slant are also examined. For greater robustness and flexibility, the algorithm based on connected components is generalized into a diffusion-like process, which allows the use of new local matching metrics that we developed to build contrast-invariant and noise-resistant correspondence algorithms.
Finally, we show that temporal information can be used to assign correspondences to occluded areas, which also yields ordinal depth information about the scene, even in the presence of independently moving objects. This information can be used for motion segmentation to detect types of independently moving objects that state-of-the-art methods miss.
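
As an illustration of the pixel-level uniqueness constraint described in the abstract, the following is a minimal sketch of a one-to-one check along a single scanline. It is not the algorithm from the thesis: the function name `uniqueness_occlusions`, the sign convention (left pixel `x` matches right pixel `x - disparity[x]`), and the rule that the lower-cost match wins a conflict are all assumptions made for this example.

```python
def uniqueness_occlusions(disparity, cost):
    """Flag left-image pixels that violate a one-to-one matching rule.

    disparity[x]: integer disparity of left pixel x; under the convention
                  assumed here it matches right pixel x - disparity[x].
    cost[x]:      matching cost of that assignment (lower is better).
    Returns a list of booleans, True where the left pixel is labeled occluded.
    """
    width = len(disparity)
    owner = {}                      # right column -> left pixel holding it
    occluded = [False] * width
    for x in range(width):
        xr = x - disparity[x]
        if xr < 0 or xr >= width:
            occluded[x] = True      # match falls outside the right image
            continue
        prev = owner.get(xr)
        if prev is None:
            owner[xr] = x           # right pixel was free, claim it
        elif cost[x] < cost[prev]:
            occluded[prev] = True   # new match is better, evict the old one
            owner[xr] = x
        else:
            occluded[x] = True      # existing match is at least as good
    return occluded


# Hypothetical scanline: left pixels 1 and 2 both map to right column 1.
flags = uniqueness_occlusions([0, 0, 1, 1, 1, 0],
                              [0.1, 0.2, 0.05, 0.3, 0.3, 0.1])
print(flags)
```

In this toy input, left pixels 1 and 2 compete for the same right pixel, and the one with the higher cost is labeled occluded. The interval uniqueness constraint proposed in the thesis generalizes this idea to intervals of unequal width and is not captured by this pixel-level sketch.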