IMAGE RETRIEVAL BASED ON COMPLEX DESCRIPTIVE QUERIES
DAVIS, LARRY S
MetadataShow full item record
The amount of visual data such as images and videos available over web has increased exponentially over the last few years. In order to efficiently organize and exploit these massive collections, a system, apart from being able to answer simple classification based questions such as whether a specific object is present (or absent) in an image, should also be capable of searching images and videos based on more complex descriptive questions. There is also a considerable amount of structure present in the visual world which, if effectively utilized, can help achieve this goal. To this end, we first present an approach for image ranking and retrieval based on queries consisting of multiple semantic attributes. We further show that there are significant correlations present between these attributes and accounting for them can lead to superior performance. Next, we extend this by proposing an image retrieval framework for descriptive queries composed of object categories, semantic attributes and spatial relationships. The proposed framework also includes a unique multi-view hashing technique, which enables query specification in three different modalities - image, sketch and text. We also demonstrate the effectiveness of leveraging contextual information to reduce the supervision requirements for learning object and scene recognition models. We present an active learning framework to simultaneously learn appearance and contextual models for scene understanding. Within this framework we introduce new kinds of labeling questions that are designed to collect appearance as well as contextual information and which mimic the way in which humans actively learn about their environment. Furthermore we explicitly model the contextual interactions between the regions within an image and select the question which leads to the maximum reduction in the combined entropy of all the regions in the image (image entropy).