Automating the Discovery of Censorship Evasion Strategies
Files
Publication or External Link
Date
Authors
Advisor
Citation
DRUM DOI
Abstract
Censoring nation-states deploy complex network infrastructure to regulate what content citizens can access, and such restrictions to open sharing of information threaten the freedoms of billions of users worldwide, especially marginalized groups. Researchers and censoring regimes have long engaged in a cat-and-mouse game, leading to increasingly sophisticated Internet-scale censorship techniques and methods to evade them. In this dissertation, I study the technology that underpins this Internet censorship: middleboxes (e.g. firewalls). I argue the following thesis: It is possible to automatically discover packet sequence modifications that render deployed censorship middleboxes ineffective across multiple application-layer protocols.
To evaluate this thesis, I develop Geneva, a novel genetic algorithm that discovers packet-manipulation-based censorship evasion strategies automatically against nation-state level censors. Training directly against a live adversary, Geneva com- poses, mutates, and evolves sophisticated strategies out of four basic packet manipulation primitives (drop, tamper, duplicate, and fragment).
I show that Geneva can be effective across different application layer protocols (HTTP, HTTPS+SNI, HTTPS+ESNI, DNS, SMTP, FTP), censoring regimes (China, Iran, India, and Kazakhstan), and deployment contexts (client-side, server- side), even in cases where multiple middleboxes work in parallel to perform censorship. In total, I present 112 client-side strategies (85 of which work by modifying application layer data), and the first ever server-side strategies (11 in total). Finally, I use Geneva to discover two novel attacks that show censoring middleboxes can be weaponized to launch attacks against innocent hosts anywhere on the Internet.
Collectively, my work shows that censorship evasion can be automated and that censorship infrastructures pose a greater threat to Internet availability than previously understood.