Abstract / Description of output
Machine Learning techniques have become pervasive across a range of different applications, and are now widely used in areas as disparate as recidivism prediction, consumer credit-risk analysis and insurance pricing. The prevalence of machine learning techniques has raised concerns about the potential for learned algorithms to become biased against certain groups. Many definitions of fairness have been proposed in the literature, but the fundamental task of reasoning about probabilistic events is a challenging one, owing to the intractability of inference.
The focus of this paper is to take steps towards the application of tractable probabilistic models to fairness in machine learning. Tractable probabilistic models have recently emerged that guarantee that conditional marginals can be computed in time linear in the size of the model. In particular, we show that sum product networks (SPNs) enable an effective technique for determining the statistical relationships between protected attributes and other training variables. We also motivate the concept of “fairness through percentile equivalence”, a new definition predicated on the notion that individuals at the same percentile of their respective distributions should be treated equivalently; this prevents the unfair penalisation of individuals who lie at the extremities of their respective distributions.
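To make the linear-time claim concrete, the following is a minimal sketch (not code from the paper; the structure, weights and variable names are illustrative assumptions) of a toy SPN over two binary variables, in which a conditional marginal such as P(X1 = 1 | X2 = 1) is obtained from two bottom-up evaluations, each visiting every node once.

```python
# Toy SPN over binary variables X1, X2. A conditional marginal needs only two
# bottom-up passes, each linear in the number of nodes. Marginalising a
# variable amounts to letting its leaves evaluate to 1.

class Leaf:
    def __init__(self, var, p_one):
        self.var, self.p_one = var, p_one          # P(var = 1)

    def value(self, evidence):
        x = evidence.get(self.var)                  # None means "marginalise out"
        if x is None:
            return 1.0                              # sum over both states of a Bernoulli leaf
        return self.p_one if x == 1 else 1.0 - self.p_one

class Product:
    def __init__(self, children):
        self.children = children

    def value(self, evidence):
        out = 1.0
        for c in self.children:
            out *= c.value(evidence)
        return out

class Sum:
    def __init__(self, weighted_children):          # list of (weight, child); weights sum to 1
        self.weighted_children = weighted_children

    def value(self, evidence):
        return sum(w * c.value(evidence) for w, c in self.weighted_children)

# A small valid SPN: a mixture of two independent factorisations of (X1, X2).
spn = Sum([
    (0.6, Product([Leaf("X1", 0.9), Leaf("X2", 0.8)])),
    (0.4, Product([Leaf("X1", 0.2), Leaf("X2", 0.3)])),
])

joint = spn.value({"X1": 1, "X2": 1})               # P(X1 = 1, X2 = 1)
marg  = spn.value({"X2": 1})                        # P(X2 = 1), with X1 marginalised
print("P(X1 = 1 | X2 = 1) =", joint / marg)
```

Both evaluations traverse the network once, which is what makes conditional queries of this kind tractable.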
We compare the efficacy of this pre-processing technique with an alternative approach that assumes an additive contribution (used in [24]). When the two approaches were compared on a data set containing the results of law school applicants [33], the percentile equivalence method reduced the average underestimation of the exam scores of black applicants at the bottom end of their conditional distribution by about a fifth. We conclude by outlining potential improvements to our existing methodology and suggesting opportunities for further work in this field.
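As a hedged illustration of why an additive shift can underestimate individuals in the tails (the synthetic data, group labels and function names below are assumptions, not the paper's implementation), the following sketch contrasts an additive mean-shift repair with a percentile-equivalence mapping on scores drawn from two groups with different variances.

```python
# Illustrative sketch only: compare an additive-shift repair with a
# percentile-equivalence repair on synthetic scores for two groups, A and B.
import numpy as np

rng = np.random.default_rng(0)
scores_a = rng.normal(60, 10, 5000)                  # reference group
scores_b = rng.normal(50, 15, 5000)                  # group to be repaired

def additive_repair(x, group_scores, ref_scores):
    # Shift every score by the difference in group means.
    return x + (ref_scores.mean() - group_scores.mean())

def percentile_repair(x, group_scores, ref_scores):
    # Map x to the reference value at the same within-group percentile.
    pct = (group_scores < x).mean() * 100.0
    return np.percentile(ref_scores, pct)

x = np.percentile(scores_b, 5)                       # an individual low in group B
print("additive  :", additive_repair(x, scores_b, scores_a))
print("percentile:", percentile_repair(x, scores_b, scores_a))
```

Because group B is more dispersed, the additive shift places a 5th-percentile member of B well below the 5th percentile of the reference group, whereas the percentile mapping treats individuals at the same percentile of their respective distributions equivalently.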
Original language | English |
---|---|
Number of pages | 26 |
Publication status | Published - 7 Feb 2020 |
Event | Ninth International Workshop on Statistical Relational AI, New York, United States, 7 Feb 2020 (conference number 9), http://www.starai.org/2020/ |
Workshop
Workshop | Ninth International Workshop on Statistical Relational AI |
---|---|
Abbreviated title | StarAI 2020 |
Country/Territory | United States |
City | New York |
Period | 7/02/20 → 7/02/20 |
Internet address | http://www.starai.org/2020/ |