SoK: What makes private learning unfair?

Kai Yao, Marc Juarez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Differential privacy has emerged as the most studied framework for privacy-preserving machine learning. However, recent studies show that enforcing differential privacy guarantees can not only significantly degrade the utility of the model, but also amplify existing disparities in its predictive performance across demographic groups. Although there is extensive research on the identification of factors that contribute to this phenomenon, we still lack a complete understanding of the mechanisms through which differential privacy exacerbates disparities. The literature on this problem is muddled by varying definitions of fairness, differential privacy mechanisms, and inconsistent experimental settings, often leading to seemingly contradictory results. This survey provides the first comprehensive overview of the factors that contribute to the disparate effect of training models with differential privacy guarantees. We discuss their impact and analyze their causal role in such a disparate effect. Our analysis is guided by a taxonomy that categorizes these factors by their position within the machine learning pipeline, allowing us to draw conclusions about their interaction and the feasibility of potential mitigation strategies. We find that factors related to the training dataset and the underlying distribution play a decisive role in the occurrence of disparate impact, highlighting the need for research on these factors to address the issue.
Original language: English
Title of host publication: Proceedings of the 3rd IEEE Conference on Secure and Trustworthy Machine Learning
Publisher: Institute of Electrical and Electronics Engineers
Pages: 1-17
Number of pages: 17
Publication status: Accepted/In press - 13 Dec 2024
Event: The 3rd IEEE Conference on Secure and Trustworthy Machine Learning - University of Copenhagen, Copenhagen, Denmark
Duration: 9 Apr 2025 to 11 Apr 2025
Conference number: 3
https://satml.org/

Conference

Conference: The 3rd IEEE Conference on Secure and Trustworthy Machine Learning
Abbreviated title: SaTML 2025
Country/Territory: Denmark
City: Copenhagen
Period: 9/04/25 to 11/04/25

Keywords / Materials (for Non-textual outputs)

  • machine learning
  • differential privacy
  • privacy-preserving ML
  • fairness
  • trustworthy ML
