Deterministic Annealing for Correspondence, Pose, and Recognition

umi-umd-3361.pdf (6.35 MB)

Determining the pose (the position and orientation) of an object, given a model and an image of that object, is a fundamental problem in computer vision. Applications include object recognition, object tracking, site inspection and updating, and autonomous navigation when scene models are available. The pose of an object is readily determined given a few correspondences between features in the image and features in the model. Conversely, corresponding model and image features are easily determined if the pose of the object is known. However, when neither the pose nor the correspondences are known, determining either is difficult because a small change in an object's pose can produce a large change in its appearance. Most existing techniques treat this as a combinatorial optimization problem in which the space of model-to-image feature correspondences is searched for object poses that are supported by large numbers of image features. These approaches, however, are practical only when the level of clutter and occlusion in the image is low, which is often not the case in real-world environments.
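
The observation that pose is readily determined from a few known correspondences can be illustrated with a minimal sketch. It assumes the simplest possible setting, a 2D rigid motion between two point lists whose pairing is already known, and uses the closed-form least-squares solution. The function names `estimate_pose_2d` and `transform` are illustrative only, and this is not the dissertation's model-to-image pose algorithm.

```python
import math

def transform(points, theta, tx, ty):
    """Rotate 2D points by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def estimate_pose_2d(model_pts, image_pts):
    """Least-squares rigid pose (theta, tx, ty) mapping model_pts onto image_pts.

    Assumption: model_pts[i] corresponds to image_pts[i] (pairing is known).
    """
    n = len(model_pts)
    # Centroids of both point sets.
    mcx = sum(p[0] for p in model_pts) / n
    mcy = sum(p[1] for p in model_pts) / n
    icx = sum(p[0] for p in image_pts) / n
    icy = sum(p[1] for p in image_pts) / n
    # Accumulate dot and cross products of the centred pairs.
    s_cos = 0.0
    s_sin = 0.0
    for (mx, my), (ix, iy) in zip(model_pts, image_pts):
        ax, ay = mx - mcx, my - mcy
        bx, by = ix - icx, iy - icy
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    # The optimal rotation angle maximises s_cos*cos(theta) + s_sin*sin(theta).
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # The translation aligns the rotated model centroid with the image centroid.
    tx = icx - (c * mcx - s * mcy)
    ty = icy - (s * mcx + c * mcy)
    return theta, tx, ty
```

With the pairing known, the estimate is a single closed-form computation. The difficulty described above arises only when the pairing itself must be searched for.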

This dissertation presents new algorithms that simultaneously determine the pose and feature correspondences of 2D and 3D objects from images containing large amounts of clutter and occlusion. Objects are modeled as sets of 2D or 3D points or line segments, and image features consist of either points or line segments. In each algorithm, deterministic annealing is used to convert a discrete combinatorial optimization problem into a continuous one that is indexed by a control parameter. This has two advantages. First, it allows solutions to the simpler continuous problem to transform slowly into a solution to the discrete problem. Second, many local minima are avoided by minimizing an objective function that is highly smoothed during the early phases of the optimization but that gradually transforms into the original objective function and constraints by the end of the optimization. These algorithms perform well in experiments involving highly cluttered synthetic and real imagery.
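
The annealing idea described above can be sketched on a toy problem. The sketch assumes 2D point sets related by a rigid motion, with unknown pairing. It replaces the discrete match matrix with Gaussian soft weights controlled by a parameter `beta`, alternates weight and pose updates, and raises `beta` so that the weights harden toward a discrete assignment. All names, the unnormalised weighting scheme, the schedule values, and the 0.5 match threshold are assumptions made for illustration; this is not the dissertation's algorithm, which also handles 3D models and line features.

```python
import math

def apply_pose(points, theta, tx, ty):
    """Rotate 2D points by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def soft_weights(moved, image, beta):
    """Soft match weights w[i][j] = exp(-beta * squared distance).

    Small beta: every pair is a plausible match (heavily smoothed objective).
    Large beta: only near-coincident pairs keep weight (nearly discrete).
    """
    return [[math.exp(-beta * ((mx - ix) ** 2 + (my - iy) ** 2))
             for ix, iy in image] for mx, my in moved]

def update_pose(model, image, w):
    """Weighted least-squares rigid pose for fixed soft weights w."""
    total = sum(sum(row) for row in w)
    if total == 0.0:
        raise ValueError("no model/image pair has non-zero weight")
    n, k = len(model), len(image)
    pairs = [(i, j) for i in range(n) for j in range(k)]
    # Weighted centroids of the model and image points.
    mcx = sum(w[i][j] * model[i][0] for i, j in pairs) / total
    mcy = sum(w[i][j] * model[i][1] for i, j in pairs) / total
    icx = sum(w[i][j] * image[j][0] for i, j in pairs) / total
    icy = sum(w[i][j] * image[j][1] for i, j in pairs) / total
    s_cos = s_sin = 0.0
    for i, j in pairs:
        ax, ay = model[i][0] - mcx, model[i][1] - mcy
        bx, by = image[j][0] - icx, image[j][1] - icy
        s_cos += w[i][j] * (ax * bx + ay * by)
        s_sin += w[i][j] * (ax * by - ay * bx)
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, icx - (c * mcx - s * mcy), icy - (s * mcx + c * mcy)

def anneal(model, image, beta=0.05, beta_max=50.0, rate=1.3, inner=5):
    """Alternate weight and pose updates while beta grows geometrically."""
    theta, tx, ty = 0.0, 0.0, 0.0  # start from the identity pose
    while beta <= beta_max:
        for _ in range(inner):
            w = soft_weights(apply_pose(model, theta, tx, ty), image, beta)
            theta, tx, ty = update_pose(model, image, w)
        beta *= rate
    # Read off discrete matches; None marks a model point with no confident match.
    w = soft_weights(apply_pose(model, theta, tx, ty), image, beta_max)
    matches = [max(range(len(image)), key=row.__getitem__)
               if max(row) > 0.5 else None for row in w]
    return (theta, tx, ty), matches
```

Calling `anneal(model, image)` returns the estimated pose and, for each model point, the index of its matched image point. The schedule (`beta`, `beta_max`, `rate`, `inner`) is arbitrary here and would need tuning to the scale of the data. The sketch does not address the heavy clutter and occlusion that the dissertation targets.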