Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity


Files

Hamman et al.pdf (2.2 MB)
No. of downloads: 38

Date

2023-06-12

Citation

Faisal Hamman, Jiahao Chen, and Sanghamitra Dutta. 2023. Can Querying for Bias Leak Protected Attributes? Achieving Privacy With Smooth Sensitivity. In 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’23), June 12–15, 2023, Chicago, IL, USA. ACM, New York, NY, USA, 11 pages.

Abstract

Existing regulations often prohibit model developers from accessing protected attributes (gender, race, etc.) during training. As a result, fairness assessments may need to be carried out on populations whose membership in protected groups is unknown to the developers. In such scenarios, institutions often separate the model developers (who train their models with no access to the protected attributes) from a compliance team (who may have access to the entire dataset solely for auditing purposes). However, the model developers might be allowed to test their models for disparity by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as statistical parity and equalized odds, can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. Furthermore, we show that one can reconstruct the protected attributes of all the individuals from O(N_k log(n/N_k)) queries when N_k ≪ n using techniques from compressed sensing (n is the size of the test dataset and N_k is the size of the smallest group therein). Our results pose an interesting debate in algorithmic fairness: Should querying for fairness metrics be viewed as a neutral-valued solution to ensure compliance with regulations? Or does it constitute a violation of regulations and privacy if the number of queries answered is enough for the model developers to identify the protected attributes of specific individuals? To address this supposed violation of regulations and privacy, we also propose Attribute-Conceal, a novel technique that achieves differential privacy by calibrating noise to the smooth sensitivity of our bias query function, outperforming naive techniques such as the Laplace mechanism.
We also include experimental results on the Adult dataset and on a synthetic dataset (over a broad range of parameters).
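
The single-query leakage described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the compliance team returns the signed statistical parity gap for a binary protected attribute, and the function names (`statistical_parity_gap`, `single_query_attack`, `laplace_answer`) are hypothetical. The noisy variant is only the naive Laplace baseline with a worst-case sensitivity bound of 2; it is not Attribute-Conceal and does not use smooth sensitivity.

```python
import numpy as np

def statistical_parity_gap(y_pred, protected):
    """Signed gap P(y_hat=1 | A=1) - P(y_hat=1 | A=0).

    Assumes a binary protected attribute and that both groups are non-empty.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    protected = np.asarray(protected, dtype=bool)
    return y_pred[protected].mean() - y_pred[~protected].mean()

def single_query_attack(query, n, target):
    """Guess the target's protected attribute from one fairness query."""
    y_pred = np.zeros(n)
    y_pred[target] = 1.0  # only the target receives a positive prediction
    # Exact gap is 1/N_1 > 0 if the target is in group 1, else -1/N_0 < 0.
    return int(query(y_pred) > 0)

def laplace_answer(y_pred, protected, epsilon, rng):
    """Naive baseline (assumption): Laplace noise scaled to sensitivity 2."""
    sensitivity = 2.0  # the gap lies in [-1, 1], so it changes by at most 2
    noise = rng.laplace(scale=sensitivity / epsilon)
    return statistical_parity_gap(y_pred, protected) + noise

rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=200)  # synthetic attributes, not real data
protected[:2] = [0, 1]                    # guarantee both groups are non-empty
n = len(protected)

# Compliance team answering exactly: the sign of the gap reveals the attribute.
exact = lambda y: statistical_parity_gap(y, protected)
guesses = np.array([single_query_attack(exact, n, i) for i in range(n)])
exact_accuracy = np.mean(guesses == protected)
print("attack accuracy with exact answers:", exact_accuracy)

# Compliance team answering with Laplace noise: the sign is no longer reliable.
noisy = lambda y: laplace_answer(y, protected, epsilon=1.0, rng=rng)
noisy_guesses = np.array([single_query_attack(noisy, n, i) for i in range(n)])
noisy_accuracy = np.mean(noisy_guesses == protected)
print("attack accuracy with noisy answers:", noisy_accuracy)
```

In this sketch the noise scale (2/ε) is far larger than the gap being measured (on the order of 1/N), which hints at why worst-case calibration costs so much utility and why a data-dependent calibration is attractive. Each targeted individual here is attacked with a separate query, so the privacy cost of the noisy answers would compose across queries.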
