Modeling, Quantifying, and Limiting Adversary Knowledge


Users participating in online services are required to relinquish

control over potentially sensitive personal information, exposing

them to intentional or unintentional misuse of said information by

the service providers.

Users wishing to avoid this must either abstain from often extremely

useful services, or provide false information, which is usually

contrary to the terms of service they must abide by.

An attractive middle-ground alternative is to maintain control in

the hands of the users and provide a mechanism with which

information that is necessary for useful services can be queried.

Users need not trust any external party in the management of their

information but are now faced with the problem of judging when

queries by service providers should be answered and when they should

be refused because they would reveal too much sensitive information.

Judging query safety is difficult.

Two queries may be benign in isolation but might reveal more than a

user is comfortable with in combination.

Additionally, malicious adversaries who wish to learn more than

allowed might query in a manner that attempts to hide the flows of

sensitive information.

Finally, users cannot rely on human inspection of queries due to their

volume and the general lack of expertise.

This thesis tackles the automation of query judgment, giving the

self-reliant user a means with which to discern benign queries from

dangerous or exploitive ones.

The approach is based on explicit modeling and tracking of the

knowledge of adversaries as they learn about a user through the

queries they are allowed to observe.

The approach quantifies the absolute risk to which a user is exposed,

taking into account all the information that has already been revealed

when deciding whether to answer a query.
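
As an illustration only, the following minimal Python sketch shows one way knowledge tracking with a risk threshold could work. It is not the thesis's implementation. It assumes a secret drawn from a small finite set, an adversary whose belief is an explicit probability distribution, boolean queries, and risk measured as the highest probability the adversary assigns to any single secret value. The names `condition`, `worst_case_risk`, and `answer_or_refuse`, the example queries, and the threshold are all hypothetical.

```python
from fractions import Fraction


def condition(belief, query, answer):
    """Bayesian update: keep secrets consistent with the answer and renormalize."""
    kept = {s: p for s, p in belief.items() if query(s) == answer}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()} if total else {}


def worst_case_risk(belief, query):
    """Highest probability of any single secret after either possible answer."""
    risks = []
    for answer in (True, False):
        posterior = condition(belief, query, answer)
        if posterior:  # skip answers the adversary considers impossible
            risks.append(max(posterior.values()))
    return max(risks)


def answer_or_refuse(belief, query, secret, threshold):
    """Return (response, new_belief); a response of None means the query is refused.

    The decision uses only the belief and the query, never the secret itself,
    so a refusal does not reveal anything about the secret.
    """
    if worst_case_risk(belief, query) > threshold:
        return None, belief
    response = query(secret)
    return response, condition(belief, query, response)


def at_least_five(s):  # hypothetical query
    return s >= 5


def is_odd(s):  # hypothetical query
    return s % 2 == 1


# Assumed setup: secret in 0..9, uniform adversary prior, risk threshold 1/4.
prior = {s: Fraction(1, 10) for s in range(10)}
secret = 7
threshold = Fraction(1, 4)

# Each query alone leaves five candidates (risk 1/5), so each is answered.
alone_1, _ = answer_or_refuse(prior, at_least_five, secret, threshold)
alone_2, _ = answer_or_refuse(prior, is_odd, secret, threshold)

# Asked in sequence, the second query could narrow the secret to two
# candidates (risk 1/2), so it is refused and the belief stays unchanged.
first, belief_after_first = answer_or_refuse(prior, at_least_five, secret, threshold)
second, belief_after_second = answer_or_refuse(belief_after_first, is_odd, secret, threshold)
```

In this sketch each query is judged safe in isolation, but the second is refused once the first has been answered, because the tracked belief already reflects what the adversary has learned.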

Proposed techniques for approximate but sound probabilistic

inference are used to tackle the tractability of the approach,

letting the user trade off utility (in terms of the queries judged

safe) and efficiency (in terms of the expense of knowledge

tracking), while maintaining the guarantee that risk to the user is

never underestimated.

We apply the approach to settings where user data changes over time

and settings where multiple users wish to pool their data to perform

useful collaborative computations without revealing too much


By addressing one of the major obstacles preventing the viability of

personal information control, this work brings the attractive

proposition closer to reality.