Image geo-localization is an important research problem. In recent years, the IARPA Finder program has brought together many researchers to develop technology for the geo-localization task. One particularly effective approach combines large-scale ground-level and/or overhead imagery with image matching techniques. In this dissertation, we focus on two aspects of geo-localization. First, we focus on indoor images and use geo-localization to recognize different business venues. Second, we address the vulnerability of such computer vision systems and apply geo-localization to media forensics problems such as content manipulation and metadata manipulation.

With the prevalence of social media platforms, media shared on the Internet can reach millions of people in a short time. The sheer amount of media available on the Internet enables many different computer vision applications. At the same time, however, people can share tampered media with little effort for malicious goals such as creating panic or distorting public opinion.

We first propose an image localization framework for extracting fine-grained location information (i.e., business venues) from images. Our framework uses the information available on social media websites such as Instagram and Yelp to extract a set of location-related concepts. Using these concepts with a multi-modal recognition model, we are able to extract location information from the image content.

Secondly, to make the system robust, we address the metadata tampering detection problem: detecting discrepancies between an image and its associated metadata, such as GPS location and timestamp. We propose a multi-task learning model that verifies an image's authenticity by detecting such discrepancies. The model first estimates meteorological properties such as weather condition, sun angle, and temperature from the image content and then compares them with the information from an online weather database. To facilitate training and evaluation of the model, we create a large-scale outdoor dataset labeled with meteorological properties.
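
The comparison step described above can be illustrated with a minimal sketch. It is not the dissertation's model: the `Weather` schema, the tolerance values, and the decision threshold are all hypothetical, and the predicted properties are assumed to come from some upstream image model, while the reference record is assumed to have been looked up using the GPS location and timestamp claimed in the metadata.

```python
from dataclasses import dataclass


@dataclass
class Weather:
    """Meteorological properties for one place and time (hypothetical schema)."""
    condition: str        # e.g. "sunny", "cloudy", "rain"
    sun_angle_deg: float  # sun elevation above the horizon, in degrees
    temperature_c: float  # air temperature in degrees Celsius


def inconsistency_score(predicted, reference,
                        max_angle_diff=20.0, max_temp_diff=10.0):
    """Return the fraction of properties (0.0 to 1.0) that disagree.

    The tolerances are illustrative assumptions, not values from the source.
    """
    mismatches = [
        predicted.condition != reference.condition,
        abs(predicted.sun_angle_deg - reference.sun_angle_deg) > max_angle_diff,
        abs(predicted.temperature_c - reference.temperature_c) > max_temp_diff,
    ]
    return sum(mismatches) / len(mismatches)


def looks_tampered(predicted, reference, threshold=0.5):
    """Flag the metadata when at least `threshold` of the properties disagree."""
    return inconsistency_score(predicted, reference) >= threshold


# Properties estimated from the image content (assumed model output).
predicted = Weather("sunny", 45.0, 25.0)
# Records retrieved for the location and time claimed in the metadata.
consistent_record = Weather("sunny", 50.0, 22.0)
conflicting_record = Weather("snow", 10.0, -5.0)

print(looks_tampered(predicted, consistent_record))
print(looks_tampered(predicted, conflicting_record))
```

A learned model would typically replace the fixed tolerances with a trained decision function; the sketch only shows the shape of the image-versus-metadata comparison.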

Thirdly, we address the event verification problem by designing a convolutional neural network configuration specifically targeted at image localization. The proposed networks use a bilinear pooling layer and an attention module to extract detailed location information from the image content.
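
For readers unfamiliar with bilinear pooling, the following is a generic sketch of the operation in plain Python, not the network configuration proposed in the dissertation. It pools the outer products of two feature maps over all spatial locations, then applies the commonly used signed square root and L2 normalization. The function name `bilinear_pool` is hypothetical, and the optional per-location `weights` argument is only an assumed stand-in for what an attention module might supply.

```python
import math


def bilinear_pool(features_a, features_b, weights=None):
    """Pool two feature maps into one flattened descriptor.

    features_a: list of N location vectors, each with C_a channels
    features_b: list of N location vectors, each with C_b channels
    weights:    optional list of N per-location weights (assumed stand-in
                for attention); defaults to uniform averaging
    """
    n = len(features_a)
    if n == 0 or len(features_b) != n:
        raise ValueError("feature maps must be non-empty and cover the same locations")
    if weights is None:
        weights = [1.0 / n] * n

    ca, cb = len(features_a[0]), len(features_b[0])
    pooled = [0.0] * (ca * cb)

    # Weighted sum of the outer product at every spatial location.
    for xa, xb, w in zip(features_a, features_b, weights):
        for i in range(ca):
            for j in range(cb):
                pooled[i * cb + j] += w * xa[i] * xb[j]

    # Signed square root, then L2 normalization.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


# Two locations with two channels each; both streams share the same features.
feature_map = [[1.0, 0.0], [0.0, 1.0]]
descriptor = bilinear_pool(feature_map, feature_map)
print(len(descriptor))
```

The descriptor has C_a × C_b entries, which is why bilinear pooling captures pairwise channel interactions that help distinguish visually similar locations, at the cost of a much larger feature vector.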

Fourth, we present a generative model that produces realistic image composites using adversarial learning, which can be used to further improve the image tampering detection model. Finally, we propose an object-based provenance approach to address the content manipulation problem in media forensics.