Empirically Evaluating the Security and Privacy Implications of Persistent Network Identifiers
Abstract
Persistent, globally-unique link- and network-layer addresses present a privacy threat to the owners of the devices to which they are assigned, particularly when the device is physically or logically mobile. Best practices recommend the use of addresses that are ephemeral, random, or both, to protect user privacy. To date, there have been no large-scale, empirical studies of the degree to which these recommendations are being implemented in practice without privileged access to network data from a commercial third party. This is due, in part, to the difficulty of obtaining the quantities of addresses needed to draw any conclusions. Obtaining client IPv6 addresses is challenging without running a network service or partnering with an ISP; obtaining in-use link-layer addresses is difficult without being in physical or logical proximity to the interfaces to which they are assigned.
In this thesis, I demonstrate that a low-power attacker can collect network addresses at scale and that recommendations to prevent persistent identifiers are not being followed in practice, resulting in a substantial degradation of user privacy. To demonstrate this, I obtain large-scale corpora of link- and network-layer addresses from a variety of novel sources. Using this data, I then show the feasibility of a variety of attacks. For example, I show that an adversary can passively gather billions of active, client IPv6 addresses that can be used to track individual devices longitudinally and across network changes. I demonstrate IPv6-specific home network vulnerabilities, such as a lack of stateful firewalling, that permit attackers to discover in-home IoT devices. Using archival Wikimedia data, I demonstrate a novel methodology for collecting millions of historical, client IPv6 addresses. Switching to a different address space, I then show that a low-power attacker can remotely obtain the geolocations of billions of Wi-Fi access points and track their movements over time. I demonstrate that, when combined with Wi-Fi access point geolocation data, some IPv6 devices can be precisely geolocated due to leaked link-layer identifiers in IPv6 addresses. Finally, I distill the key components of privacy-preserving network addressing systems, and offer recommendations for how to adapt today's addressing schemes to these principles.
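As an illustration of how a link-layer identifier can leak through an IPv6 address, the sketch below recovers a MAC address from an interface identifier built with the modified EUI-64 scheme of RFC 4291. The abstract does not name the mechanism, so treating it as modified EUI-64 is an assumption. The function name `mac_from_eui64` and the documentation-prefix addresses are hypothetical and chosen only for this example.

```python
import ipaddress
from typing import Optional


def mac_from_eui64(addr: str) -> Optional[str]:
    """Return the MAC address embedded in a modified EUI-64 IPv6 address.

    Returns None when the lower 64 bits do not carry the ff:fe marker.
    A random interface identifier can contain ff:fe by chance, so a
    match is a strong hint rather than proof.
    """
    # The interface identifier is the low 64 bits of the address.
    iid = int(ipaddress.IPv6Address(addr)) & ((1 << 64) - 1)
    b = iid.to_bytes(8, "big")

    # Modified EUI-64 inserts 0xff 0xfe between the two halves of the MAC.
    if b[3] != 0xFF or b[4] != 0xFE:
        return None

    # Undo the universal/local bit flip applied to the first octet.
    mac = bytes([b[0] ^ 0x02, b[1], b[2], b[5], b[6], b[7]])
    return ":".join(f"{octet:02x}" for octet in mac)


if __name__ == "__main__":
    # Hypothetical addresses in the 2001:db8::/32 documentation prefix.
    print(mac_from_eui64("2001:db8::211:22ff:fe33:4455"))
    print(mac_from_eui64("2001:db8::1"))
```

An address whose interface identifier is random or ephemeral, as best practices recommend, yields no stable MAC under this check. That is why such addresses resist this kind of correlation with Wi-Fi access point data.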