Modeling, Quantifying, and Limiting Adversary Knowledge
Abstract
Users participating in online services are required to relinquish
control over potentially sensitive personal information, exposing
them to intentional or unintentional misuse of that information by
the service providers.
Users wishing to avoid this must either abstain from services that
are often extremely useful, or provide false information, which
usually violates the terms of service they must abide by.
An attractive middle-ground alternative is to maintain control in
the hands of the users and provide a mechanism with which
information that is necessary for useful services can be queried.
Users need not trust any external party in the management of their
information but are now faced with the problem of judging when
queries by service providers should be answered or when they should
be refused due to revealing too much sensitive information.
Judging query safety is difficult.
Two queries may be benign in isolation but might reveal more than a
user is comfortable with in combination.
Additionally, malicious adversaries who wish to learn more than
allowed might query in a manner that attempts to hide the flows of
sensitive information.
Finally, users cannot rely on human inspection of queries due to
their volume and users' general lack of expertise.
This thesis tackles the automation of query judgment, giving the
self-reliant user a means with which to discern benign queries from
dangerous or exploitive ones.
The approach is based on explicit modeling and tracking of the
knowledge of adversaries as they learn about a user through the
queries they are allowed to observe.
The approach quantifies the absolute risk a user is exposed to,
taking into account all the information that has already been
revealed when deciding whether to answer a query.
Proposed techniques for approximate but sound probabilistic
inference are used to make the approach tractable,
letting the user trade off utility (in terms of the queries judged
safe) against efficiency (in terms of the expense of knowledge
tracking), while maintaining the guarantee that risk to the user is
never underestimated.
We apply the approach to settings where user data changes over time
and settings where multiple users wish to pool their data to perform
useful collaborative computations without revealing too much
information.
By addressing one of the major obstacles preventing the viability of
personal information control, this work brings the attractive
proposition closer to reality.
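
The knowledge-tracking idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it enumerates a tiny secret domain exactly instead of using approximate inference, and every name in it (`posterior`, `is_safe`, `KnowledgeTracker`, the three example queries, the threshold of 1/4) is hypothetical. The adversary's knowledge is modeled as a probability distribution over possible secrets, each answered query conditions that distribution, and a query is refused if any of its possible outputs would let the adversary guess the secret with probability above the threshold.

```python
from fractions import Fraction


def posterior(belief, query, output):
    """Condition the adversary's belief on observing `output` for `query`."""
    consistent = {s: p for s, p in belief.items() if query(s) == output}
    total = sum(consistent.values())
    return {s: p / total for s, p in consistent.items()}


def is_safe(belief, query, threshold):
    """Check every possible output, not just the true one, so that the
    decision to refuse does not itself depend on the secret."""
    outputs = {query(s) for s, p in belief.items() if p > 0}
    return all(
        max(posterior(belief, query, out).values()) <= threshold
        for out in outputs
    )


class KnowledgeTracker:
    """Tracks the modeled adversary belief across a sequence of queries."""

    def __init__(self, prior, threshold):
        self.belief = dict(prior)
        self.threshold = threshold

    def ask(self, query, secret):
        if not is_safe(self.belief, query, self.threshold):
            return None  # refused; belief is left unchanged
        answer = query(secret)
        self.belief = posterior(self.belief, query, answer)
        return answer


# Assumed toy setting: a secret in 0..15 with a uniform prior.
prior = {s: Fraction(1, 16) for s in range(16)}
threshold = Fraction(1, 4)


def q1(s):
    return s < 8


def q2(s):
    return s % 2 == 0


def q3(s):
    return s % 4 < 2


tracker = KnowledgeTracker(prior, threshold)
answers = [tracker.ask(q, 6) for q in (q1, q2, q3)]
```

In this toy setting each of the three queries is safe when judged against the prior alone, but after the first two are answered only four candidates remain, and answering the third would leave two. The tracker therefore answers `q1` and `q2` and refuses `q3`, which mirrors the observation that queries benign in isolation can be revealing in combination.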