Similarity Searching in Large Image Databases

CS-TR-3388
We propose a method to handle approximate searching by image content in large image databases. Image content is represented by attributed relational graphs that hold the features of objects and the relationships between objects. The method relies on the assumption that a fixed number of "labeled" or "expected" objects (e.g., "heart", "lungs", etc.) are common to all images of a given application domain, in addition to a variable number of "unexpected" or "unlabeled" objects (e.g., "tumor", "hematoma", etc.). The method can answer queries by example, such as "find all X-rays that are similar to Smith's X-ray". The stored images are mapped to points in a multidimensional space and are indexed using state-of-the-art database methods (R-trees). The proposed method has several desirable properties: (a) database search is approximate, so that all images up to a pre-specified degree of similarity (tolerance) are retrieved; (b) it has no "false dismissals" (i.e., all images satisfying the query selection criteria are retrieved); and (c) it scales up well as the database grows. We implemented the method and ran experiments on a database of synthetic (but realistic) medical images. The experiments showed that our method significantly outperforms sequential scanning, by up to an order of magnitude. (Also cross-referenced as UMIACS-TR-94-134)
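The core query model described above, mapping each image to a point in a multidimensional feature space and retrieving everything within a given tolerance of the query point, can be sketched as follows. This is an illustrative toy only: the image names, feature vectors, and the `similar_images` helper are invented for the example, and the linear scan stands in for the R-tree range search that the report actually uses for scalability.

```python
import math

# Hypothetical feature extraction: each image is summarized by a fixed-length
# vector of attributes of its objects. Names and numbers are illustrative,
# not taken from the report.
database = {
    "xray_smith": (4.0, 2.5, 1.0),
    "xray_jones": (4.1, 2.4, 1.1),
    "xray_doe":   (9.0, 7.0, 3.0),
}

def distance(p, q):
    """Euclidean distance between two feature points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def similar_images(query_point, tolerance):
    """Range query: return every image whose feature point lies within
    `tolerance` of the query point. A linear scan is exact by construction
    (no false dismissals); the report replaces it with an R-tree search,
    which prunes the space but must preserve the same guarantee."""
    return [name for name, point in database.items()
            if distance(point, query_point) <= tolerance]

# Query by example: "find all X-rays similar to Smith's X-ray",
# up to tolerance 0.5 in feature space.
matches = similar_images(database["xray_smith"], 0.5)
```

Here `matches` contains Smith's own X-ray and Jones's near-identical one, but not Doe's, illustrating how the tolerance parameter controls the degree of approximation in the retrieval.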