Abstract
Differential privacy has emerged as the most studied framework for privacy-preserving machine learning. However, recent studies show that enforcing differential privacy guarantees can not only significantly degrade the utility of the model, but also amplify existing disparities in its predictive performance across demographic groups. Although there is extensive research on the identification of factors that contribute to this phenomenon, we still lack a complete understanding of the mechanisms through which differential privacy exacerbates disparities. The literature on this problem is muddled by varying definitions of fairness, differential privacy mechanisms, and inconsistent experimental settings, often leading to seemingly contradictory results. This survey provides the first comprehensive overview of the factors that contribute to the disparate effect of training models with differential privacy guarantees. We discuss their impact and analyze their causal role in such a disparate effect. Our analysis is guided by a taxonomy that categorizes these factors by their position within the machine learning pipeline, allowing us to draw conclusions about their interaction and the feasibility of potential mitigation strategies. We find that factors related to the training dataset and the underlying distribution play a decisive role in the occurrence of disparate impact, highlighting the need for research on these factors to address the issue.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 3rd IEEE Conference on Secure and Trustworthy Machine Learning |
| Publisher | Institute of Electrical and Electronics Engineers |
| Pages | 1-17 |
| Number of pages | 17 |
| Publication status | Accepted/In press - 13 Dec 2024 |
| Event | The 3rd IEEE Conference on Secure and Trustworthy Machine Learning, University of Copenhagen, Copenhagen, Denmark. Duration: 9 Apr 2025 → 11 Apr 2025. Conference number: 3. https://satml.org/ |
Conference
| Conference | The 3rd IEEE Conference on Secure and Trustworthy Machine Learning |
|---|---|
| Abbreviated title | SaTML 2025 |
| Country/Territory | Denmark |
| City | Copenhagen |
| Period | 9/04/25 → 11/04/25 |
Keywords / Materials (for Non-textual outputs)
- machine learning
- differential privacy
- privacy-preserving ML
- fairness
- trustworthy ML
Fingerprint
Dive into the research topics of 'SoK: What makes private learning unfair?'. Together they form a unique fingerprint.