IP Geolocation in Metropolitan Areas

In this thesis, we propose a robust methodology to geolocate a target IP address in a metropolitan area. We model the problem as a pattern recognition problem and present algorithms that extract patterns and match them to infer the geographic location of the target's IP address.

The first algorithm is a relatively non-invasive method called Pattern Based Geolocation (PBG), which models the distribution of Round Trip Times (RTTs) to a target and matches it against those of nearby landmarks to deduce the target's location. PBG builds Probability Mass Functions (PMFs) to model the distribution of RTTs. To compare PMFs, we propose a novel distance metric, the 'Shifted Symmetrized Divergence', a modified form of the Kullback-Leibler divergence that is symmetric as well as invariant to shifts. PBG works in almost stealth mode and leaves an almost undetectable signature in network traffic.
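The abstract does not give the exact definition of the Shifted Symmetrized Divergence, so the following Python sketch is only an illustration of the general idea, not the thesis's formulation. It assumes the metric can be approximated as the symmetrized Kullback-Leibler divergence minimized over integer bin shifts. The function names, the bin width, the number of bins, the smoothing constant and the shift range are all hypothetical choices made for this example.

```python
import numpy as np

def rtt_pmf(rtts_ms, bin_ms=1.0, n_bins=64, eps=1e-6):
    """Histogram RTT samples (milliseconds) into a smoothed PMF.

    Assumption: fixed-width bins on [0, n_bins * bin_ms); samples outside
    that range are ignored. eps keeps every bin strictly positive so the
    logarithms below are always defined.
    """
    edges = np.arange(n_bins + 1) * bin_ms
    counts, _ = np.histogram(rtts_ms, bins=edges)
    pmf = counts.astype(float) + eps
    return pmf / pmf.sum()

def symmetrized_kl(p, q):
    """KL(p || q) + KL(q || p), which is symmetric in its arguments."""
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def shifted_symmetrized_divergence(p, q, max_shift=20):
    """Smallest symmetrized KL divergence over integer shifts of q.

    Assumption: shifts are circular (np.roll), which is harmless only when
    the probability mass sits away from the ends of the histogram.
    """
    return min(
        symmetrized_kl(p, np.roll(q, s))
        for s in range(-max_shift, max_shift + 1)
    )

def closest_landmark(target_pmf, landmark_pmfs):
    """Return the name of the landmark whose RTT PMF best matches the target.

    landmark_pmfs is a hypothetical dict mapping landmark name to PMF.
    """
    return min(
        landmark_pmfs,
        key=lambda name: shifted_symmetrized_divergence(
            target_pmf, landmark_pmfs[name]
        ),
    )
```

In this sketch, two RTT distributions with the same shape but a constant delay offset (for example, a longer access link) have a divergence near zero, which is the practical meaning of shift invariance: the match depends on the shape of the delay distribution rather than on its absolute level.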

The second algorithm, Perturbation Augmented PBG (PAPBG), gives a higher resolution in the location estimate by using additional perturbation traffic. The goal of this algorithm is to induce a stronger signature of background traffic in the vicinity of the target and then detect it in the collected RTT sequences. At the cost of being intrusive, this algorithm improves the resolution of PBG by approximately 20-40%.

We evaluate the performance of PBG and PAPBG on real data collected from 20 machines distributed over the 700-square-mile Washington-Baltimore metropolitan area. We compare the performance of the proposed algorithms with existing measurement-based geolocation techniques. Our experiments show that PBG improves markedly on current techniques and can geolocate a target IP address to within 2-4 miles of its actual location. By sending additional traffic into the network, PAPBG improves the resolution to within 1-3 miles.