1 Introduction

Ranking is a fundamental primitive in many algorithmic decision-making contexts, such as health (e.g., solid organ transplantation priority lists), education (e.g., university admission), or employment (e.g., selection for a job). Typically, different pieces of information about an individual are collected and processed by some machine learning model to produce a final “fitness” score of a candidate, which then forms the basis of the final ranking. However, bias might be hiding in the underlying data, potentially interfering with the definition of the fitness score and ultimately leading to unfair rankings, which might substantially impact people’s lives. Of particular concern are historically disadvantaged groups, whose information in the underlying data might correlate with lower fitness due to historical reasons and preexisting societal inequalities. The growing awareness of the risks associated with algorithmic decision-making has been attracting an increasing research effort toward devising fair ranking systems (Zehlike et al. 2017; Yang and Stoyanovich 2017; Singh and Joachims 2018; Celis et al. 2018; Yang et al. 2019; Celis et al. 2020; García-Soriano and Bonchi 2021; Zehlike et al. 2022; Ekstrand et al. 2023). The bulk of this literature treats fair ranking as a constrained optimization problem, where the fairness constraint requires that a valid ranking exhibit, in the top-k positions for any k, a certain fraction of individuals from some protected groups, defined on the basis of sensitive attributes such as ethnicity, gender, or age. A main limitation of this approach is that it requires someone to define the group-fairness constraint, that is, (i) to identify the potentially disadvantaged groups and (ii) to decide the minimum representation for each of these groups in the top-k positions. As the potential biases hiding in the underlying data may not always be directly observable or measurable, these questions might be hard to answer in a principled way, especially in the intersectional case, when many different potentially discriminated subgroups exist.

The notion of intersectionality (Crenshaw 1990) refers to individuals belonging to multiple protected groups who may experience a unique disadvantage. Intersectionality is complex because guaranteeing a fair representation or treatment for every single attribute does not guarantee the fair representation of their intersection. As shown by Celis et al. (2018), when each of the elements to be ranked belongs to one and only one group, the constrained optimization problem can be solved exactly in polynomial time. Instead, when each element can belong to more than one group, the problem becomes hard.

Table 1 Subgroups from the LSAT dataset with support (fraction of the population), the divergence \(\Delta\) from the overall average score, and its statistical significance (Welch’s t-test)

To show such complexity, consider the law school admission LSAT dataset (Wightman 1998), containing information on law students, and suppose that we rank the students purely based on their LSAT scores. Table 1 reports, for various subgroups, their support (fraction of the population) and their divergence (Pastor et al. 2021) (denoted \(\Delta\)) from the overall population w.r.t. the ranking: a negative value indicates discrimination, while a positive one indicates that the group is favored (a formal definition is provided in Section 3). The group {gender = female, ethnicity = African-American} is so heavily discriminated against (\(\Delta = -7.701\)) that the whole group {gender = female} appears discriminated, even though the subgroup {gender = female, ethnicity = Caucasian} is not discriminated (\(\Delta > 0\)). An affirmative action supporting the whole group {gender = female} would positively impact all women, favoring white women who are not disadvantaged, while damaging already disadvantaged groups, e.g., {gender = male, ethnicity = African-American} (\(\Delta = -6.773\)).

In this paper, we tackle the intersectional fair ranking problem, in which the individuals to rank have multiple sensitive attributes and thus can belong to one or more protected groups. Our proposal does not require predefining the group-fairness constraint, thus avoiding the complication of deciding, a priori, which are the disadvantaged groups and how much representation each disadvantaged group should have in the top-k. Instead, it leverages the notion of divergence to automatically identify which subgroups, defined as combinations of known protected attributes, show a statistically significant deviation in ranking utility compared to the overall population. We formulate our intersectional fair ranking problem as a bi-criteria optimization problem. The first criterion aims at mitigating subgroup disparities, thereby promoting equity: in particular, following Rawls’ theory of justice (Rawls 1971), which advocates arranging social and financial inequalities to the benefit of the worst-off, it requires maximizing the utility of the subgroup with the worst negative and statistically significant divergence.

The second criterion requires that the final ranking is as close as possible to the original one. While improving group fairness is a primary objective, we want to preserve similarity with the original ranking to maintain ranking utility and consistency in decision-making processes. To solve this fair ranking problem, we develop a method that identifies the divergent subgroups and applies a re-ranking procedure that monotonically increases the minimum divergence. Our approach directly processes the outcome of a ranking process: it is a post-processing method that enhances the fairness of the ranking without the need to access or modify the original ranker or the function that generates its scores. Our experiments show that, in all real-world datasets we consider, our approach always eliminates all the disadvantaged groups while maintaining a ranking very similar to the original one.

2 Related work

Fair ranking. Assessing and ensuring fairness in rankings has recently attracted growing research attention [see recent surveys (Zehlike et al. 2022a, b; Patro et al. 2022; Pitoura et al. 2022)]. Research on fair machine learning can first be divided between two targets of fairness: individual and group fairness (Mitchell et al. 2021). Individual fairness requires that similar individuals be treated similarly by an algorithmic process (Dwork et al. 2012). Group fairness ensures equal treatment across groups of individuals. With a few exceptions that look at individual and group fairness jointly (García-Soriano and Bonchi 2021; Zehlike et al. 2020), the bulk of the literature on fair ranking focuses on group-level fairness (Asudeh et al. 2019; Celis et al. 2018; Singh and Joachims 2018; Yang et al. 2019; Yang and Stoyanovich 2017; Feldman et al. 2015). Group fairness is assessed and addressed for groups that are protected from discrimination. The group definition is hence typically based on the knowledge of a set of sensitive (or protected) attributes and corresponding protected values (Zehlike et al. 2022a). The protected attributes can denote the membership of instances in demographic groups (e.g., female gender or African American ethnicity) that represent a minority or are historically disadvantaged.

Several works (Zehlike et al. 2017; Yang and Stoyanovich 2017; Feldman et al. 2015) consider algorithmic fairness for a single sensitive attribute. The work of Feldman et al. (2015) aligns the probability distribution of the candidates of the protected group with that of the non-protected ones. Other researchers have approached fair ranking as a constrained optimization problem, in which the group fairness constraint requires a minimum fraction of individuals from the protected group to be included among the top-k positions (Zehlike et al. 2017; Celis et al. 2018; Singh and Joachims 2018). We focus on ensuring fairness for multiple protected groups over the entire ranking, and we do not impose explicit group fairness constraints.

Intersectional fair ranking. Intersectionality (Crenshaw 1990) refers to the discrimination that affects individuals who belong to multiple protected groups simultaneously. The fair-ranking survey (Zehlike et al. 2022a, b) notes that, in the presence of multiple protected attributes, we can distinguish approaches that handle the attributes independently, i.e., detecting and mitigating the bias separately for each attribute (Celis et al. 2020, 2018; Yang et al. 2019), and approaches that truly tackle the intersectional problem of dealing with multiple attributes together (Zehlike et al. 2020; Yang et al. 2021; Zehlike et al. 2022).

The Continuous Fairness Algorithm CFA\(\theta\) (Zehlike et al. 2020) aligns the score distributions with the Wasserstein barycenter of all group distributions. The setting handles multiple protected groups and protected attributes. Using the barycenter notion avoids imposing a privileged or majority group as opposed to protected or minority groups. Like our approach, the method does not distinguish between (predefined) protected and non-protected groups but only between protected (or sensitive) and non-protected attributes. The target is an adequate treatment for all groups defined by protected criteria. For CFA\(\theta\), the fulfillment of group fairness across all groups over multiple protected attributes entails considering a number of subgroups that grows exponentially with the number of attributes and values. We instead control the re-ranking mitigation process automatically by (i) considering groups with a size above a user-chosen frequency threshold and (ii) directly focusing on problematic groups only.

Yang et al. (2021) study intersectional fairness in ranking by modeling the causal effects of sensitive attributes on other variables and removing these effects to induce fairer rankers. They use counterfactuals to model how the score would change if the sensitive attributes of a given instance were different. The study models each instance as belonging to one specific intersectional subgroup. In our work, each individual can belong to multiple (and overlapping) subgroups (e.g., women and African American women); we focus on identifying whether the subgroups face disadvantage and then implement measures to mitigate their specific disadvantages.

Multi-FAIR (Zehlike et al. 2022) extends the top-k algorithm of FA*IR (Zehlike et al. 2017) to handle multiple protected groups. The approach ensures that the proportion of protected candidates at any point of the top-k ranking is statistically above a minimum percentage for each protected group, defined via fairness constraints. The statistical test is based on a multinomial distribution. This approach differs substantially from our proposal: we do not require protected groups to be pre-defined; instead, we focus on detecting disadvantaged subgroups by analyzing disparities across the entire ranking, not only in the top-k positions. Nevertheless, as we show in the experiments in Appendix 1.5, we can adapt our method to deal with top-k settings by suitably tailoring the ranking utility function (i.e., giving a null utility to positions beyond k). Our experiments show that our approach, although not specifically designed for the top-k problem, is able to address fairness concerns within the top-k subset by enhancing group representation in these top positions.

Anomalous subgroup identification. Fairness assessment and mitigation algorithms typically assume knowledge of the protected group or set of protected groups to address (Zehlike et al. 2022; Feldman et al. 2015), or they generalize to all subgroups over the protected attributes. Recently, several works have been proposed to automatically identify the data subgroups associated with a biased behavior in the context of classification (Pastor et al. 2021, 2023; Sagadeeva et al. 2021; Chung et al. 2019) and rankings (Pastor et al. 2021; Li et al. 2023). Both Pastor et al. (2021) and Li et al. (2023) focus on identifying groups with a biased representation in the rankings and propose efficient exploration strategies to avoid the exponential enumeration of all subgroups. The approach proposed by Li et al. (2023) automatically detects groups with biased representation in the top-k positions of the ranking using search algorithms based on fairness measures, bounding the global or proportional representation of groups in the ranking by imposing fairness constraints. The work of Pastor et al. (2021) adapts the divergence-based approach of Pastor et al. (2021) to rankings. The approach explores all subgroups that occur sufficiently frequently in the dataset, based on a frequency threshold, leveraging frequent pattern mining techniques. It identifies overlapping subgroups whose rankings differ, in terms of both advantaged and disadvantaged representation in the ranking. Both approaches (Pastor et al. 2021; Li et al. 2023) are limited to detecting groups with a biased representation. We instead focus on mitigating the biased representation: we detect the disadvantaged subgroups via the method of Pastor et al. (2021) and propose a re-ranking algorithm that mitigates the disadvantage rather than just identifying it. Existing intersectional fair-ranking approaches do not inherently support the mitigation of these identified disadvantaged subgroups, as they do not directly accommodate overlapping subgroups. As we explain in Sect. 4.4, we adapt the method in Pastor et al. (2021) to our context for automatically identifying the subgroups that have an adequate representation, are disadvantaged in the ranking, and thus require a mitigation process. Our solution mitigates such disparities without the need to impose fairness constraints; instead, it automatically quantifies the necessary degree of mitigation for each disadvantaged group.

3 Preliminaries

We next provide the basic definitions, introduce the notion of utility divergence, and present the problem statement.

Candidates and attributes. We are given a set of candidates \(C = \{c_1, \ldots , c_n\}\) and for each candidate we have a set of attributes \(X = \{X_1, \ldots , X_m\}\). We assume that every attribute \(X_i \in X\) can be mapped to a discrete,Footnote 1 finite set of values \(\mathcal {V}_{X_i}\), and we denote with \(c(X_i)\) the value of the attribute \(X_i\) for the candidate c. Without loss of generality, X are protected attributes (e.g., gender, ethnicity, age). If other non-sensitive attributes exist, they are simply disregarded since they are not relevant to our problem. We are also given a relevance function \(S: C \rightarrow \mathbb {R}\) that assigns a relevance score to each candidate: this could be, e.g., a fitness score computed by a machine learning algorithm or the result of an aptitude test.

Groups description. Any pair of attribute-value, \([X_i = v]\), where \(v \in \mathcal {V}_{X_i}\) (e.g., \([gender = female]\)), uniquely identifies a group g(C) of candidates: i.e., if \(g:= [X_i = v]\) then \(g(C) = \{c \in C | c(X_i) = v\}.\) Similarly, any conjunction of attribute-value pairs uniquely identifies the subgroup of candidates having all the required features. More formally, let G denote the set of all possible attribute-value pairs, and let \(\mathbb {G} = 2^{G}\) denote the set of all possible subsets of G. An element \(\{g_1, \ldots g_k\} \in \mathbb {G}\) uniquely identifies the group of candidates that have all the attribute-value pairs in \(g_1, \ldots g_k\), i.e., \(\{g_1, \ldots g_k\}(C) = \bigcap _{i=1}^k g_i(C)\). For instance, consider \(g_1:= [gender = female]\) and \(g_2:= [ethnicity = Black]\), then:

$$\begin{aligned} \{g_1,g_2\}(C) = \{c \in C | c(gender) = female \wedge c(ethnicity) = Black\}. \end{aligned}$$

In the rest of this paper we will call subgroup description any element of G or \(\mathbb {G}\), and denote it with \(g_i\); while we call subgroup \(g_i(C) \subseteq C\) the set of candidates that satisfy the description.

Adequately represented subgroups. Borrowing from the frequent-pattern terminology, given a subgroup description \(g_i \in \mathbb {G}\) we call support the fraction of the overall population belonging to the group: \(sup(g_i) = |g_i(C)|/n\). We can also use a minimum support threshold \(s \in [0,1]\) to disregard subgroups that are not adequately represented in C. Given such a threshold s we denote the set of adequately represented subgroups as \(\mathbb {G}^s = \{g_i \in \mathbb {G} | sup(g_i) \ge s \}\).
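To make the notion concrete, the following minimal sketch (for illustration only; the experiments use DivExplorer rather than this brute-force enumeration, and the helper names are hypothetical) extracts all subgroup descriptions over a pandas DataFrame whose support reaches a threshold \(s\).

```python
from itertools import combinations

import pandas as pd


def subgroup_mask(df, description):
    """Boolean mask of candidates matching a conjunction of attribute-value pairs."""
    mask = pd.Series(True, index=df.index)
    for attr, value in description:
        mask &= df[attr] == value
    return mask


def adequately_represented_subgroups(df, protected_attrs, min_support):
    """Return every description {X_i = v, ...} with support >= min_support (brute force)."""
    items = [(a, v) for a in protected_attrs for v in df[a].unique()]
    result = {}
    for k in range(1, len(protected_attrs) + 1):
        for desc in combinations(items, k):
            attrs = [a for a, _ in desc]
            if len(set(attrs)) < len(attrs):  # skip contradictory descriptions (same attribute twice)
                continue
            support = subgroup_mask(df, desc).mean()
            if support >= min_support:
                result[desc] = support
    return result


# toy example
df = pd.DataFrame({"gender": ["F", "F", "M", "M", "F"],
                   "ethnicity": ["Black", "White", "Black", "White", "White"]})
print(adequately_represented_subgroups(df, ["gender", "ethnicity"], min_support=0.2))
```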

Ranking and utility. A ranking r is a permutation of C. In a ranking r each candidate \(c \in C\) has a rank \(r(c) \in [n]\), where 1 is the top rank.Footnote 2 Given a ranking r, the utility function \(\gamma : C \rightarrow \mathbb {R}\) represents the utility of ranking candidate c at position r(c). In general, different utility functions can be adopted. For instance, in the fair-ranking literature, to take into account position bias, the individual utility is typically combined with a decreasing function of the position in the ranking, as in the discounted cumulative gain \(\gamma (c)=S(c)/\log _2(r(c)+1)\). Another possibility is to use the ranking position itself, with \(\gamma (c)=r(c)\), or the position in the top-k ranking, i.e., \(\gamma (c)=r(c)\) if \(r(c)\le k\) and 0 otherwise. In our setting, as we do not have a fairness constraint, we will always consider only the ranking by decreasing relevance S as the natural ranking, and our framework will focus on “adjusting” such relevance function. Thus, for the sake of simplicity of presentation, in defining utility we can drop the ranking and simply define \(\gamma (c)=S(c)\).
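For illustration, the following minimal sketch (hypothetical function names) spells out the utility choices mentioned above; in the rest of the paper only the first one, \(\gamma (c)=S(c)\), is used.

```python
import numpy as np


def score_utility(score, rank):
    """gamma(c) = S(c): the choice adopted in this paper."""
    return score


def dcg_utility(score, rank):
    """Position-discounted utility, gamma(c) = S(c) / log2(r(c) + 1)."""
    return score / np.log2(rank + 1)


def rank_topk_utility(score, rank, k=None):
    """gamma(c) = r(c), optionally restricted to the top-k (0 beyond position k)."""
    if k is not None and rank > k:
        return 0
    return rank
```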

The utility function \(\gamma (g_i(C))\) of a subgroup of candidates \(g_i(C)\) is the average utility for the group described by \(g_i\), i.e., \(\gamma (g_i(C)) = \frac{1}{|g_i(C)|}\sum _{c \in g_i(C)}\gamma (c)\). When C is obvious from the context, we drop it and use the simpler notation \(\gamma (g_i)\).

Divergent subgroups. As we do not have any additional information besides each candidate’s relevance and attributes, and we know that the relevance score might be biased, the assumption at the basis of our approach is that relevance is distributed uniformly among the population, i.e., no subgroup over protected attributes is deemed a priori to be more or less skilled. Therefore, we would expect that the relevance score for each subgroup does not deviate substantially from that of the overall population. Still, due to pre-existing societal inequalities and historical biases, we might observe a disparate ranking relevance in subgroups over protected attributes.

We define the divergence of a subgroup described by \(g_i\) as a measure of how it deviates from the behavior of the entire population C with respect to the utility function \(\gamma\), following (Pastor et al. 2021):

$$\begin{aligned} \Delta _{\gamma }(g_i) = \gamma (g_i) - \gamma (C) \end{aligned}$$
(1)

The divergence \(\Delta _{\gamma }(g_i)\) quantifies how much the subgroup’s average utility is higher or lower than that of the general population: if it is positive, we say that the subgroup described by \(g_i\) is advantaged; if it is negative, we say it is disadvantaged. We then measure the statistical significance of the divergence of a subgroup by means of Welch’s t-test (Welch 1947).Footnote 3 We test the hypothesis that the subgroup and the overall population have equal means; the test statistic is computed as follows.

$$\begin{aligned} W\text {-}t(g_i) = \frac{\gamma (g_i)-\gamma (C)}{\sqrt{\sigma _{\gamma (g_i(C))}^2/|g_i(C)|+\sigma _{\gamma (C)}^2/|C|}} \end{aligned}$$
(2)

where \(\sigma ^2\) is the variance. The \(W\text {-}t\) statistic is then compared to a critical value t to determine if the hypothesis should be accepted. When the null hypothesis is rejected, we say the subgroup has a statistically significant divergence.

We denote with \(\mathbb {D}^{s, t}_{\gamma } \subseteq \mathbb {G}^s\) the set of descriptions of statistically significant disadvantaged subgroups, with \(\Delta _{\gamma }(g_i)<0\) and \(|W\text {-}t(g_i)|>t\) for each \(g_i \in \mathbb {D}^{s, t}_{\gamma }\). Conversely, we denote with \(\mathbb {A}^{s, t}_{\gamma } \subseteq \mathbb {G}^s\) the set of descriptions of advantaged subgroups, with positive and statistically significant \(\Delta _\gamma (g_i)\). Our analysis ignores the subgroups whose divergence is not statistically significant. Adopting a common rule of thumb (Siegel 2012), if Welch’s t-statistic for a subgroup described by \(g_i\) is larger in absolute value than 2, we reject the null hypothesis, and we identify the divergence of \(g_i\) as statistically significant. The values s and t are fixed input thresholds of our problem. Without loss of generality, we use \(\mathbb {G}\), \(\mathbb {D}_{\gamma }\) and \(\mathbb {A}_{\gamma }\) when the thresholds s and t are clear from the context.
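For concreteness, the snippet below (an illustrative sketch with hypothetical helper names, assuming the sample variance with ddof=1 in Eq. 2) computes \(\Delta _{\gamma }(g_i)\) and the Welch statistic for a subgroup given its utility values.

```python
import numpy as np


def divergence(subgroup_utilities, all_utilities):
    """Delta_gamma(g_i) = gamma(g_i) - gamma(C), Eq. (1)."""
    return np.mean(subgroup_utilities) - np.mean(all_utilities)


def welch_t(subgroup_utilities, all_utilities):
    """Welch's t statistic of Eq. (2), comparing the subgroup with the whole population."""
    n_g, n_c = len(subgroup_utilities), len(all_utilities)
    num = np.mean(subgroup_utilities) - np.mean(all_utilities)
    den = np.sqrt(np.var(subgroup_utilities, ddof=1) / n_g
                  + np.var(all_utilities, ddof=1) / n_c)
    return num / den


def is_disadvantaged(subgroup_utilities, all_utilities, t_crit=2.0):
    """g_i belongs to D^{s,t}_gamma: negative and statistically significant divergence."""
    return (divergence(subgroup_utilities, all_utilities) < 0
            and abs(welch_t(subgroup_utilities, all_utilities)) > t_crit)
```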

Problem statement. We are now ready to define the objective of our work. We are given as input the set of candidates \(C = \{c_1, \ldots , c_n\}\), with their attributes \(X = \{X_1, \ldots , X_m\}\), the relevance function \(S: C \rightarrow \mathbb {R}\), the utility function \(\gamma : C \rightarrow \mathbb {R}\), the two thresholds for minimum support s and statistical significance t, as discussed above.

Let r denote the optimal ranking w.r.t. \(\gamma\): when we assume \(\gamma (c)=S(c)\) as we do in the rest of the paper, r is simply the ranking by decreasing relevance, without any group fair representation consideration.

We want to identify a utility score \(\gamma ^*\) such that it optimizes two criteria:

  1. \(\max \; \underset{g_i \in \mathbb {D_{\gamma ^*}}}{\min } \; \Delta _{\gamma ^*}(g_i)\);

  2. \(\min \; dist(r^*,r)\).

where \(r^*\) is the ranking induced by \(\gamma ^*\). The first criterion, following Rawls’ theory of justice (Rawls 1971) (which advocates arranging social and financial inequalities to the benefit of the worst-off), requires maximizing the utility of the subgroup with the worst negative and statistically significant divergence. The second criterion requires that the final ranking is as close as possible to the original one. A proper optimization problem can be obtained by optimizing one of the two criteria, while using the other as a constraint. In the method introduced in the next section, we focus on the first criterion, aiming at having as few disadvantaged groups as possible, while using \(dist(r^*,r)\) as a measure of quality. In fact, in our experiments in Sect. 5, we show that, in all real-world datasets we consider, we can always eliminate all the disadvantaged groups (i.e., produce \(\mathbb {D_{\gamma ^*}} = \emptyset\)) while maintaining a ranking very similar to the original one (high Kendall’s \(\tau\) similarity).

4 Ranking divergence mitigation

We next introduce the notion of divergence mitigation for disadvantaged subgroups. Recall that our goal is to modify \(\gamma : C \rightarrow \mathbb {R}\) into \(\gamma ^*\) to maximize the minimum divergence across all disadvantaged groups. Intuitively, we want to reduce, in absolute terms, the (negative) divergence of the disadvantaged groups to the extent that their divergence is no longer statistically significant. In the following, we first specify the desired properties of the mitigation step; then, we provide an intuitive transformation of \(\gamma\) satisfying them. Finally, we present an iterative approach for divergence mitigation for fair ranking.

Table 2 Disadvantaged (top) and advantaged (bottom) subgroups for LSAT dataset

Example 1

(Running example—part 1) To illustrate the divergence mitigation process of disadvantaged groups, we use a running example on the LSAT dataset. The dataset contains information on 21,791 law students. We consider their ethnicity and gender as protected attributes and the LSAT score as the target utility \(\gamma\) for the ranking. The ranking defined via \(\gamma\) induces 11 disadvantaged subgroups \(\mathbb {D_{\gamma }}\) according to our definition in Sect. 3 (i.e., groups with a negative and statistically significant divergence), and 4 advantaged subgroups \(\mathbb {A_{\gamma }}\). In Table 2 we report the disadvantaged and advantaged groups with their divergence scores \(\Delta _{\gamma }\) and the statistical significance of the divergence. Disadvantaged subgroups have an average utility that is statistically significantly different from that of the entire ranking. This divergence in the average utility indicates that candidates of these subgroups tend to occupy lower positions in the ranking. For instance, the subgroup characterized by {ethnicity = African-American, gender = Female} has a divergence \(\Delta _{\gamma }\) equal to \(-7.7\), i.e., its average utility score is 7.7 points lower than the overall average. In the rest of this section, we will refer again to this running example while describing the process to mitigate the disadvantage of the groups in \(\mathbb {D_{\gamma }}\).

4.1 Desired properties of the mitigation process

We use the term mitigation of the divergence of a disadvantaged subgroup to refer to the process that reduces, in absolute terms, its divergence, bringing it close to 0. Consider a disadvantaged subgroup described by \(g_i \in \mathbb {D_{\gamma }}\). We seek a mitigation process of \(g_i(C)\) that turns \(\gamma (c)\) into \(\gamma '(c)\) for each \(c \in C\), with \(\gamma ': C \rightarrow \mathbb {R}\) being the utility scores after the mitigation step of subgroup \(g_i(C)\) is applied. We require the mitigation process of a disadvantaged subgroup \(g_i(C)\) to satisfy some properties.

The first, obvious, property states that the mitigation action should reduce the negative divergence of the disadvantaged subgroup described by \(g_i \in \mathbb {D_{\gamma }}\).

Property 4.1

(Mitigate the divergence of a subgroup) We say that we mitigate subgroup described by \(g_i \in \mathbb {D}_{\gamma }\) if \(\Delta _{\gamma '}(g_i)>\Delta _{\gamma }(g_i)\)Footnote 4, with \(\Delta _{\gamma '}(g_i)\) being the divergence of \(g_i\) for scores \(\gamma '\). We say that we fully mitigate a subgroup \(g_i\) if \(g_i \in \mathbb {D_{\gamma }}\) and \(g_i \notin \mathbb {D_{\gamma '}}\), i.e., the divergence of \(g_i\) becomes either non-negative or not statistically significant.

The second property requires that the mitigation action should increase the minimum divergence among all subgroups. This is to avoid that, e.g., by mitigating the divergence of a subgroup, we worsen the condition of another disadvantaged subgroup. In other terms, we want the mitigation process to be monotonic in improving the minimum divergence.

Property 4.2

(Monotonicity of the mitigation process) We mitigate the divergence monotonically if the minimum divergence over all subgroup descriptions \(g \in \mathbb {G}^s\) increases, i.e., \(\underset{g \in \mathbb {G}}{\min } \; \Delta _{\gamma '}(g) > \underset{g \in \mathbb {G}}{\min } \; \Delta _{\gamma }(g)\).

The third property requires that the overall ranking utility is maintained in the population. Given that divergence is assessed with respect to \(\gamma (C)\), our goal is to prevent changes in its value.

Property 4.3

(Constant average overall behavior) The average behavior of the overall population C after mitigation is preserved, i.e., \(\gamma (C) = \gamma '(C)\).

4.2 Mitigation step

Intuitively, we can mitigate the divergence of a disadvantaged subgroup described by \(g_i \in \mathbb {D_{\gamma }}\) by increasing the utility scores of the candidates in \(g_i(C)\). In this way, however, we would vary the properties of the dataset: the average ranking utility of the overall population changes, i.e., \(\gamma (C) \ne \gamma '(C)\). Hence, the property of constant overall behavior would not be satisfied. We can address this issue by correspondingly decreasing the utility scores of other candidates. To avoid introducing a disparate treatment, we decrease the score of all candidates in \(C {\setminus } g_i(C)\). The following mitigation function satisfies both this intuition and the desired properties. Let \(\tau \in \mathbb {R}_{> 0}\) and let \(g_i \in \mathbb {D}\) describe a subgroup whose divergence we want to mitigate. We derive \(\gamma '(c)\) that mitigates \(g_i\) by tweaking the scores \(\gamma (c)\) as follows.

$$\begin{aligned} \gamma '(c) = {\left\{ \begin{array}{ll} \gamma (c) +\tau &{} c \in g_i(C) \\ \gamma (c) - \frac{\tau \cdot |g_i(C)|}{|C|-|g_i(C)|} &{} c \not \in g_i(C) \end{array}\right. } \end{aligned}$$
(3)

The term \(\frac{\tau \cdot |g_i(C)|}{|C| - |g_i(C)|}\) distributes the reduction in score equally among all candidates in \(C {\setminus } g_i(C)\) to counterbalance the \(\tau\) given to the candidates in \(g_i(C)\). Equation 3 thus satisfies by definition Property 4.3. It is also straightforward that, if \(\tau >0\), we have \(\Delta _{\gamma '}(g_i)>\Delta _{\gamma }( g_i)\) (thus satisfying Property 4.1). Finally, if \(\tau = -\Delta _{\gamma }(g_i)\), after the mitigation step we have that \(\Delta _{\gamma '}(g_i) = 0\) (fully mitigated).
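A minimal sketch of the mitigation step of Eq. 3 (illustrative code operating on a NumPy score vector and a boolean membership mask for \(g_i(C)\)):

```python
import numpy as np


def mitigation_step(scores, in_group, tau):
    """Apply Eq. (3): raise the scores of candidates in g_i(C) by tau and lower
    everyone else's score so that the overall average gamma(C) is unchanged."""
    scores = np.asarray(scores, dtype=float).copy()
    n_in = int(in_group.sum())
    n_out = len(scores) - n_in
    scores[in_group] += tau
    scores[~in_group] -= tau * n_in / n_out
    return scores


# full mitigation of g_i corresponds to tau = -Delta_gamma(g_i):
# new_scores = mitigation_step(scores, mask_gi, tau=-delta_gi)
```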

Example 2

(Running example—part 2) Consider our example for the LSAT dataset. The subgroup with the highest disadvantage (i.e., highest negative divergence) is described by \(g_i\) = {ethnicity = African-American, gender = Female} with \(\Delta (g_i) = -7.7\) (first row in Table 2). Fully mitigating the disadvantage of this subgroup entails increasing the score \(\gamma\) of each member of \(g_i\) by \(\tau =7.7\). Then, to satisfy Property 4.3 and preserve the average behavior of the overall population, we decrease the score of all candidates not satisfying \(g_i\) by \(\frac{\tau \cdot |g_i(C)|}{|C|-|g_i(C)|}\). Decreasing by an equal amount ensures that the mitigation process avoids unfairly disadvantaging any specific group.

Unfortunately, the score tweaking of Eq. 3 is not enough to guarantee the monotonicity property, as the corresponding mitigation of the divergence of \(g_i\) impacts the divergence \(\Delta _{\gamma '}(g_j)\) of other \(g_j \in \mathbb {G}\) as follows.

$$\begin{aligned} \begin{aligned} \Delta _{\gamma '}(g_j) =&\Delta _{\gamma }(g_j) + \tau \cdot \frac{|\{g_i, g_j\}(C)|}{|g_j(C)|} \\ {}&- \frac{\tau \cdot |g_i(C)|}{|C|-|g_i(C)|} \cdot \frac{|g_j(C)| - |\{g_i, g_j\}(C)|}{|g_j(C)|} \end{aligned} \end{aligned}$$
(4)

We can rewrite Eq. 4 by dividing it into three cases, as in Eq. 5. The first case never decreases the divergence of \(g_j\), since \(\tau \in \mathbb {R}_{> 0}\).

$$\begin{aligned} \Delta _{\gamma '}(g_j) = {\left\{ \begin{array}{ll} \Delta _{\gamma }(g_j) + \tau &{} \text {if} \, g_j(C) \supseteq g_i(C) \\ \Delta _{\gamma }(g_j) + \tau \cdot \frac{|g_i(C)|}{|g_j(C)|} - \frac{\tau \cdot |g_i(C)|}{|C|-|g_i(C)|} \cdot \frac{|g_j(C)|-|g_i(C)|}{|g_j(C)|} &{} \text {if} \, g_j(C) \subset g_i(C)\\ \Delta _{\gamma }(g_j) + \tau \cdot \frac{|\{g_i, g_j\}(C)|}{|g_j(C)|} - \frac{\tau \cdot |g_i(C)|}{|C|-|g_i(C)|} \cdot \frac{|g_j(C)| - |\{g_i, g_j\}(C)|}{|g_j(C)|} &{} \text {otherwise} \\ \end{array}\right. } \end{aligned}$$
(5)

The second case also cannot decrease the divergence: with a minor rewriting, we can check that the divergence of \(g_j\) would decrease only when \(|C|<|g_j(C)|\), which cannot happen in this case as \(|C|\ge |g_i(C)|>|g_j(C)|\). However, in the last case the divergence of \(g_j\) can decrease, namely when:

$$\begin{aligned} \begin{aligned}&\Delta _{\gamma }(g_j) > \\ {}&\left( \Delta _{\gamma }(g_j) + \tau \cdot \frac{|\{g_i, g_j\}(C)|}{|g_j(C)|} - \frac{\tau \cdot |g_i(C)|}{|C|-|g_i(C)|} \cdot \frac{|g_j(C)| - |\{g_i, g_j\}(C)|}{|g_j(C)|} \right) \end{aligned} \end{aligned}$$

which can be rewritten as:

$$\begin{aligned} |g_i(C)| \cdot |g_j(C)| > |\{g_i, g_j\}(C)| \cdot |C| \end{aligned}$$
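This condition depends only on subgroup sizes; the following small illustrative helper flags the pairs \((g_i, g_j)\) for which the mitigation step of Eq. 3 could lower \(\Delta (g_j)\).

```python
def mitigation_can_harm(n_gi, n_gj, n_overlap, n_total):
    """True iff |g_i(C)| * |g_j(C)| > |{g_i, g_j}(C)| * |C|, i.e. g_j overlaps g_i
    less than expected under independence, so mitigating g_i via Eq. (3) can
    decrease the divergence of g_j."""
    return n_gi * n_gj > n_overlap * n_total
```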

We next discuss how we avoid this case of non-monotonicity.

4.3 Ensuring monotonicity

To ensure the monotonicity of the mitigation process, we want to avoid that, by mitigating the divergence of a subgroup \(g_i\), we push the divergence of another subgroup \(g_j \ne g_i\) below the current minimum, i.e., \(\Delta _{\gamma '}(g_j)<\underset{g \in \mathbb {G}}{\min } \; \Delta _{\gamma }(g)\). We showed above in which case the mitigation step of Eq. 3 can decrease the divergence of other subgroups. We now study the impact of subgroup mitigation on the minimum divergence. Specifically, we define the maximum \(\tau\) we can apply to mitigate a subgroup \(g_i\) such that the mitigation satisfies the monotonicity property.

Let \(g_i \in \mathbb {D}_\gamma\) be a subgroup whose divergence we want to mitigate via Eq. 3. Let \(a_{\gamma } = \underset{g_j \in \mathbb {G}}{min} \; \Delta _{\gamma }(g_j)\) be the minimum divergence. The maximum \(\tau \in \mathbb {R}_{> 0}\) for a subgroup \(g_j \in \mathbb {G}\) ensuring \(\Delta _{\gamma '}(g_j) > a_{\gamma }\) when mitigating \(g_i\) is computed as follows.

$$\begin{aligned} \tau _{cap}(g_i, g_j, a_{\gamma }) = \frac{(a_{\gamma } - \Delta _{\gamma }(g_j) ) }{ \frac{|\{g_i, g_j\}(C)|}{|g_j(C)|} - \frac{|g_i(C)|}{|C|-|g_i(C)|} \frac{|g_j(C)|-|\{g_i, g_j\}(C)|}{|g_j(C)|}} \end{aligned}$$
(6)

For \(\tau < \tau _{cap}\), we ensure that the divergence of \(g_j\) for \(\gamma '\) is greater than \(a_{\gamma }\).

Recall that we could break the monotonicity constraint only for the last case of Eq. 5, the only case for which we could decrease divergence. The definition of the maximum threshold \(\tau _{cap}\) directly derives from imposing \(\Delta _{\gamma '}(g_j)\) as equal to \(a_{\gamma }\).

Equation 6 defines the mitigation threshold for \(g_i\) with respect to a single subgroup \(g_j\). We now define the maximum mitigation we can adopt to ensure the monotonicity of the process across all subgroups \(\mathbb {G}\). Let \(\mathbb {G}^{l} \subset \mathbb {G}\) be the set of \(g_j \in \mathbb {G}\) such that \(g_j(C) \not \supseteq g_i(C)\) and \(g_j(C) \not \subset g_i(C)\) (last case of Eq. 5). The maximum \(\tau \in \mathbb {R}_{>0}\) across all subgroups when mitigating \(g_i\) is defined as follows.

$$\begin{aligned} \tau _{cap}(g_i, a_{\gamma }) = \underset{g_j \in \mathbb {G}^{l}}{\min } \; \tau _{cap}(g_i, g_j, a_{\gamma }) \end{aligned}$$
(7)

For \(\tau < \tau _{cap}(g_i, a_{\gamma })\), we ensure the monotonicity of the minimum divergence of the mitigation process.
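A sketch of Eqs. 6–7 (illustrative; the bookkeeping structure is hypothetical, and the iteration is restricted to the pairs flagged by the condition of Sect. 4.2, i.e., those whose divergence can actually decrease):

```python
def tau_cap_pair(n_gi, n_gj, n_overlap, n_total, delta_gj, a_gamma):
    """Eq. (6): maximum tau keeping Delta_{gamma'}(g_j) above a_gamma when mitigating g_i."""
    coeff = (n_overlap / n_gj
             - (n_gi / (n_total - n_gi)) * (n_gj - n_overlap) / n_gj)
    return (a_gamma - delta_gj) / coeff


def tau_cap(subgroups, gi, a_gamma, n_total):
    """Eq. (7): minimum pairwise cap over the subgroups g_j whose divergence can decrease.

    `subgroups` maps each description to a dict with keys 'size', 'overlap_with_gi',
    and 'delta' (a hypothetical bookkeeping structure)."""
    caps = []
    n_gi = subgroups[gi]["size"]
    for gj, stats in subgroups.items():
        if gj == gi:
            continue
        n_gj, n_overlap = stats["size"], stats["overlap_with_gi"]
        if n_gi * n_gj > n_overlap * n_total:   # only pairs whose divergence can decrease
            caps.append(tau_cap_pair(n_gi, n_gj, n_overlap, n_total,
                                     stats["delta"], a_gamma))
    return min(caps) if caps else float("inf")
```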

Example 3

(Running example—part 3) In our running example, when we mitigate \(g_i\) = {ethnicity = African-American, gender = Female}, we have that \(\tau = -\Delta (g_i)= 7.7 < \tau _{cap}(g_i, a_{\gamma }) = 24.1.\) Hence, we can fully mitigate the disadvantage of \(g_i\) while satisfying the monotonicity property.

4.4 An iterative mitigation approach

We next introduce Div-Rank, an iterative approach to mitigate the divergence of disadvantaged subgroups. The Div-Rank mitigation process involves iteratively applying the mitigation step discussed above. The algorithm (whose pseudocode is outlined in Algorithm 1) iteratively selects the subgroup with the highest disadvantage and mitigates its divergence. Div-Rank ensures that the monotonicity property is satisfied by applying a mitigation lower than the maximum admitted one, defined by Eq. 7.

Algorithm 1 Div-Rank mitigation approach

The algorithm requires as input the set of candidates C, the protected attributes X, the utility score \(\gamma\), the minimum support s, and critical value t for the divergence significance. After initializing the output utility score \(\gamma '\) (Line 1), the first step is the extraction of the adequately represented subgroups and their divergence.

To perform this step, we adopt the subgroup identification algorithm DivExplorer (Pastor et al. 2021) (Line 2). We opt for DivExplorer for two main reasons. First, it leverages frequent pattern mining techniques for the exploration. The approach extracts all subgroups with support greater than or equal to the threshold s (we use the FP-Growth (Han et al. 2000) frequent pattern mining algorithm in the experiments). This ensures that the subgroups we consider are well-represented in the data and that their average utility scores are estimated on a sufficient number of candidates. Second, it directly defines and integrates the notion of divergence, which is fundamental to our notion of disadvantaged groups. Specifically, it efficiently computes the subgroup divergence and its statistical significance during the subgroup extraction process. Other subgroup discovery approaches (Herrera et al. 2011) could also be considered. We will explore alternative methodologies in future work.

Next, we compute the disadvantaged subgroups as the set of subgroups with negative and statistically significant divergence with respect to the critical value t (Line 3). The minimum divergence across all subgroups is then computed (Line 4). Div-Rank iteratively selects the subgroup \(g_i \in \mathbb {D}_{\gamma '}^{s,t}\) with the most negative divergence from the set of disadvantaged groups (Line 5). In the case of ties, it selects the one with the highest statistical significance.

A disadvantaged subgroup \(g_i\) falls below the overall behavior by \(-\Delta _{\gamma '}(g_i)\), i.e., \(\gamma '(g_i)-\gamma '(C) = E\{\gamma '(c)\;|\; c \in g_i(C) \} - E\{\gamma '(c)\;|\; c \in C \} = \Delta _{\gamma '}(g_i) < 0\). Setting \(\tau = -\Delta _{\gamma '}(g_i)\) ensures the full mitigation of the divergence of \(g_i\). On the other hand (Line 6), the maximum mitigation a ranking can handle while satisfying the monotonicity constraint is \(\tau _{cap}\), defined in Eq. 7. To allow the maximum mitigation while ensuring monotonicity, we set \(\tau\) to the minimum of \(-\Delta _{\gamma '}(g_i)\) and \(\tau _{cap}\) (Line 7). If \(\tau\) is positive, we can proceed with the mitigation by applying Eq. 3 (Line 9). We then update the divergence scores (Line 10) and the set of disadvantaged subgroups given the updated scores \(\gamma '\) (Line 11). In Line 12, we update the minimum divergence value. The process stops when there are no disadvantaged subgroups with statistically significant divergence (\(\mathbb {D} = \emptyset\)) or no disadvantaged subgroup can be mitigated without breaking the monotonicity constraint (i.e., \(\tau \le 0\) for all currently disadvantaged subgroups).
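The following illustrative sketch (simplified, not the released implementation) summarizes the loop of Algorithm 1; it assumes the subgroup masks have already been extracted, e.g., by DivExplorer, and reuses the helpers divergence, welch_t, mitigation_step, and tau_cap_pair from the earlier sketches.

```python
import numpy as np


def div_rank(scores, masks, t_crit=2.0):
    """Iteratively mitigate the most disadvantaged subgroup (Algorithm 1, simplified).

    scores : array of utility scores gamma(c)
    masks  : dict mapping each adequately represented description to a boolean array
    """
    scores = np.asarray(scores, dtype=float).copy()
    n = len(scores)
    while True:
        deltas = {g: divergence(scores[m], scores) for g, m in masks.items()}
        disadvantaged = {g: d for g, d in deltas.items()
                         if d < 0 and abs(welch_t(scores[masks[g]], scores)) > t_crit}
        if not disadvantaged:                   # D is empty: all disadvantages mitigated
            break
        a_gamma = min(deltas.values())          # current minimum divergence
        g_i = min(disadvantaged, key=disadvantaged.get)   # most disadvantaged subgroup
        m_i, n_gi = masks[g_i], int(masks[g_i].sum())
        caps = []
        for g_j, m_j in masks.items():          # Eq. (7), over pairs that can be harmed
            n_gj, n_overlap = int(m_j.sum()), int((m_i & m_j).sum())
            if n_gi * n_gj > n_overlap * n:
                caps.append(tau_cap_pair(n_gi, n_gj, n_overlap, n,
                                         deltas[g_j], a_gamma))
        tau = min([-deltas[g_i]] + caps)        # full mitigation, capped for monotonicity
        if tau <= 0:                            # no admissible mitigation left
            break
        scores = mitigation_step(scores, m_i, tau)
    return scores
```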

Example 4

(Running example—part 4) Considering our running example, we start by mitigating the highest disadvantage, which corresponds to the subgroup described by \(g_i = \{\text {ethnicity} = \text {African-American}, \text {gender} = \text {Female}\}\). As detailed in Sect. 4.3, we can fully mitigate its disadvantage by applying Eq. 3 with \(\tau =-\Delta (g_i)\). Subsequently, we update the divergence scores. We can observe the impact of the first iteration of Div-Rank in Table 2. The divergence of \(g_i\) \(\{\text {ethnicity} = \text {African-American}, \text {gender} = \text {Female}\}\) is 0, as we fully mitigate its disadvantage. The mitigation step on \(g_i\) also reduces the disadvantages of its components, i.e., \(\{\text {ethnicity} = \text {African-American}\}\) and \(\{\text {gender} = \text {Female}\}\). Specifically, the divergence of the group of African-American candidates is reduced from \(-\)7.35 to \(-\)2.66, while the disadvantage of the group of female candidates is fully mitigated: its \(\Delta\) goes from \(-\)0.49 to \(-\)0.11 and its t-value drops below 2.

We then proceed by mitigating the new highest disadvantage subgroup. The iterative process continues until \(\mathbb {D} = \emptyset\), hence it successfully mitigates all disadvantages.

5 Experiments

We assess Div-Rank w.r.t. its capability of mitigating divergence of disadvantaged groups while producing a ranking as close as possible to the original one. We do this on real-world and synthetic datasets, comparing against baselines in the literature.

The source code of Div-Rank and all the conducted experiments are available at https://github.com/elianap/divrank.

5.1 Experimental setup

Datasets. We use five real-world datasets commonly adopted in the fairness literature: COMPAS (Angwin et al. 2016), LSAT (Wightman 1998), German credit (Lichman 2013), IIT-JEE (Technology IIO 2009), and folktables (Ding et al. 2021). We also leverage a synthetic dataset to further benchmark our approach. Table 3 provides, for each dataset, the protected attributes and their values and the target score used for the ranking. We report a detailed description of the five real-world datasets and the target scores in the “Appendix”.

Synthetic is a dataset created to have more protected attributes than those found in the adopted real-world ones. It has 10,000 instances with 5 attributes with domain {0, 1} and a relevance score in [0, 100]. We generate the score at random and then inject a controlled bias. Specifically, we decrease the score of all instances with \(a = b = 1\) or \(c = 1\) whose original score is higher than 70. We use the perturbed score as the target score for the ranking. As a result, instances with \(a = b = 1\) or \(c=1\) are associated with lower positions in the ranking. The dataset and the code to generate it are available in our repository.
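A sketch of how such a dataset can be generated (an illustrative version; the exact generation code, including the magnitude of the injected penalty, is the one in the repository and may differ):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 10_000

# five binary protected attributes and a uniform relevance score in [0, 100]
df = pd.DataFrame(rng.integers(0, 2, size=(n, 5)), columns=list("abcde"))
df["score"] = rng.uniform(0, 100, size=n)

# inject a controlled bias: penalize instances with a = b = 1 or c = 1
# whose original score exceeds 70 (the penalty of 30 points is our assumption)
biased = ((df["a"].eq(1) & df["b"].eq(1)) | df["c"].eq(1)) & (df["score"] > 70)
df.loc[biased, "score"] -= 30

# the perturbed score is the target used for the ranking
df["rank"] = df["score"].rank(ascending=False, method="first").astype(int)
```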

Table 3 Dataset score, number of instances (\(|D|\)), protected attributes and their values

Evaluation measures. For the evaluation, we adopt divergence-based subgroup measures and ranking performance measures. The former assess the divergence in the ranking. We consider the minimum and maximum divergence across all subgroups, defined as min\(\Delta _{\gamma }\) = \(\underset{g \in \mathbb {G}^{s}}{ min} \; \Delta _{\gamma }(g)\) and max\(\Delta _{\gamma }\) = \(\underset{g \in \mathbb {G}^{s}}{ max} \; \Delta _{\gamma }(g)\), respectively. We also compute the number of disadvantaged subgroups \(|\mathbb {D}^{s,t}|\) and advantaged ones \(|\mathbb {A}^{s,t}|\). In addition, we consider the Gini index as a measure of inequality (Gini 1921). It quantifies how unevenly the scores are distributed among the members of a population. The index ranges from 0 to 1, where 0 represents perfect equality and 1 represents perfect inequality. The closer the Gini index is to 0, the more equal the distribution of the candidates’ scores.

The ranking performance indexes measure the quality of the derived ranking compared to the original one, which maximizes utility and imposes no fairness constraints. In the following, we refer to the ranking based on the original scores as the original ranking. We consider Kendall’s \(\tau\) and the Normalized Discounted Cumulative Gain loss. Kendall’s \(\tau\) measures the similarity between the mitigated ranking and the original one: the closer the value to 1, the stronger the similarity between the two rankings. The Normalized Discounted Cumulative Gain (ndcg) is a standard measure of ranking quality. It is the weighted sum of the candidates’ scores in the ranking, using a logarithmic discount of the ranking position as weight, normalized to obtain a score between 0 and 1. We compute the ndcg loss (ndcgLoss) between the top-K = 300 of the original and the mitigated ranking. The lower the ndcgLoss, the lower the loss incurred when adopting the mitigated ranking.
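The sketch below shows one way to compute the evaluation measures from the original and mitigated scores (SciPy's kendalltau for Kendall's \(\tau\); gini and ndcg_at_k are illustrative helpers implementing one possible formulation of the measures).

```python
import numpy as np
from scipy.stats import kendalltau


def gini(scores):
    """Gini index of a non-negative score distribution: 0 = perfect equality."""
    s = np.sort(np.asarray(scores, dtype=float))
    n = len(s)
    cum = np.cumsum(s)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n


def ndcg_at_k(scores_in_rank_order, k):
    """DCG of the top-k scores, normalized by the ideal (sorted) DCG."""
    scores_in_rank_order = np.asarray(scores_in_rank_order, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    dcg = np.sum(scores_in_rank_order[:k] * discounts)
    ideal = np.sum(np.sort(scores_in_rank_order)[::-1][:k] * discounts)
    return dcg / ideal


def ranking_measures(original_scores, mitigated_scores, k=300):
    """Kendall's tau between the two induced rankings, ndcgLoss at k, and Gini index."""
    tau, _ = kendalltau(original_scores, mitigated_scores)
    order = np.argsort(-np.asarray(mitigated_scores))        # mitigated ranking order
    ndcg_loss = 1 - ndcg_at_k(np.asarray(original_scores)[order], k)
    return {"kendall_tau": tau, "ndcgLoss": ndcg_loss, "gini": gini(mitigated_scores)}
```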

Baselines. Our method automatically detects the disadvantaged groups, i.e., those with a statistically significant negative deviation, before applying the needed mitigation. To the best of our knowledge, there is no other method in the literature that approaches intersectional fair ranking without receiving as input the subgroups that need attention. Nevertheless, for the sake of comparison, we adopt three methods from the literature; it is worth stressing again that these methods require the protected groups as input, while our method automatically identifies the disadvantaged groups to address.

The technique in Feldman et al. (2015) was not originally proposed for intersectional fair ranking: we use it by taking one protected subgroup for each experiment and the rest of the dataset as non-protected. The method aligns the probability distribution of candidates belonging to a protected group with that of the non-protected group. The approach substitutes the score of a candidate belonging to a protected group with one of a non-protected candidate whose score is in the same quantile, considering protected and non-protected distributions separately. We consider as protected subgroups the ones analyzed in Zehlike et al. (2022). We note that these also generally align with the ones derived by our automatic identification. We include the detailed list in the “Appendix”.

CFA\(\theta\) (Zehlike et al. 2020) aligns the score distribution with the Wasserstein barycenter of all group distributions, with the parameter \(\theta\) controlling the alignment. We set \(\theta\) = 1 in all the experiments, which corresponds to enforcing group fairness and imposing equal distribution of scores among the groups. This setting is closer to ours, in which we aim to mitigate disparities in subgroups compared to the overall population. The approach can handle multiple protected groups. We evaluate two configurations. In the former, we consider the same groups specified for the previous baseline, but in this case, the multiple groups are approached simultaneously. The latter considers all groups for a given set of protected attributes (e.g., all ethnicities). Note that it entails enumerating the cartesian product of the values of all protected attributes.

Multi-FAIR (Zehlike et al. 2022) is a top-K fair-ranking approach able to handle multiple protected groups. The approach ensures that the proportion of protected candidates in the top-k ranking is statistically above a minimum percentage for each protected group, defined via fairness constraints. We consider a fairness constraint equal to the minimum proportion for each protected group, since it is close to our setting. It requires the protected groups to be specified, and we use the same groups as before. Unlike Div-Rank, which focuses on disparities in the entire ranking, Multi-FAIR targets the top-K. Moreover, Multi-FAIR re-ranks the candidates, while Div-Rank adjusts the utility scores that define the ranking. To compare the results, we set K equal to the number of instances. With this configuration, Multi-FAIR struggles to terminate within a reasonable time. For this reason, we cannot report the results for this configuration. To still enable a comparison, we consider a setting with a lower K and explore how we can define the utility function \(\gamma\) for Div-Rank to support a top-k fair-ranking scenario. We analyze this setting and compare the results in Appendix 1.5.

Parameters. We consider s = 0.01, thus identifying and mitigating disadvantaged groups representing at least 1% of the dataset. We set the critical value t for the statistical significance of divergence to 2 (Siegel 2012).

Table 4 Top-10 positions of the original (top) and mitigated (bottom) rankings

5.2 Divergence mitigation

This section provides anecdotal and qualitative analysis to illustrate the behaviour of Div-Rank in mitigating the divergence of disadvantaged subgroups. We focus the analysis on the LSAT dataset, where the utility function \(\gamma\) is the LSAT score of each individual. Table 4 (top) reports the top-10 candidates with the highest utility.

As noted in Sect. 4, the original ranking defined via \(\gamma\) induces 11 disadvantaged subgroups (Table 2, top block). We observe that African-American students are associated with lower positions in the ranking, and this is accentuated for women. We can quantify the relative contribution of each attribute value to the subgroup disadvantage using the notion of Shapley value (Shapley 1952) from game theory, as adopted in Pastor et al. (2021) for analyzing divergence. Figure 1 (left) shows the Shapley values for the subgroup with the highest disadvantage. Indeed, the largest contribution comes from the African-American ethnicity, followed by the female gender. Candidates of the 11 disadvantaged subgroups would experience a disadvantage if the ranking were adopted, for example, to decide who can access an internship program. While this example is purely illustrative, it raises concerns about the use of rankings without fairness constraints.

Fig. 1 Shapley value of the subgroup with the highest disadvantage before (left) and after (right) mitigation. LSAT dataset

Fig. 2 Global Shapley value before (left) and after (right) mitigation. LSAT dataset

Fig. 3 Kendall’s \(\tau\) and minimum divergence (left) and number of advantaged and disadvantaged subgroups (right) over the steps of the mitigation process. LSAT dataset

Figure 2 (left) reports the disparate impact of groups on the ranking of individuals in terms of Global Shapley Value (GSV) (Pastor et al. 2021), which measures how much each attribute value contributes to the divergence across all explored subgroups. The lower the value, the more the attribute and its value are associated with lower scores and lower positions in the ranking. Women and all ethnicities other than Caucasian are associated with a lower-than-average score. The highest discrepancy is observed for African-American students.

We apply the Div-Rank algorithm to reduce the disadvantage in the ranking. The results are reported in the last block of Table 6. Div-Rank successfully mitigates the bias: after mitigation, there are no disadvantaged subgroups (\(|\mathbb {D}|\) = 0). We note that the number of advantaged subgroups (\(|\mathbb {A}|\)) also decreases, from 4 to 1. The minimum subgroup divergence increases from \(-\)7.7 to \(-\)0.24. By mitigating the divergence of the disadvantaged groups, the mitigation process, as a by-product, also reduces the advantage of some groups. Indeed, the divergence of the subgroup with the highest advantage decreases substantially, from 0.96 to 0.15. These mitigation results are obtained by producing a re-ranking that is still quite close to the original one (Kendall’s \(\tau\) = 0.88). Table 4 (bottom) reports the top-10 candidates after the mitigation. The first candidate in the mitigated ranking is an African-American female candidate, originally at position 777; that position marked the first occurrence of an African-American female candidate in the original ranking. Similar observations apply to other candidates. For example, the second and third positions are held by two African-American male candidates, who first occurred at positions 402 and 501 in the original ranking. The top-10 also includes, in the tenth position, a Mexican male candidate, originally at position 94. Therefore, Div-Rank raised the ranking positions of candidates belonging to groups that were disadvantaged in the original ranking.

Table 5 Disadvantaged (top) and advantaged (bottom) subgroups for LSAT dataset

We further analyze the impact of the mitigation process on ranking positions in Table 5. The table details each disadvantaged and advantaged group, including the minimum, 25th percentile, 50th percentile, and maximum ranking positions, before and after mitigation. In the original ranking, candidates from disadvantaged groups typically held much lower positions than those from advantaged groups. For instance, in the original ranking, the first female African-American candidate occupied position 777, the first male African-American candidate held position 402, and the first Mexican candidate was in position 94. In contrast, the top position was held by a Caucasian male, with the first Caucasian female candidate ranking at position 5. This discrepancy is particularly evident at the 25th and 50th percentiles. Specifically, for African American candidates, the 25th percentile rankings ranged from approximately 16,000 positions for males to 17,000 for females, with 50th percentile rankings around 19,500 and 20,000, respectively. In contrast, Caucasian candidates had rankings ranging from 4500 to 5000 for the 25th percentile and approximately 10,000 for the 50th percentile, indicating significant disparities. Following mitigation, the 25th and 50th percentiles for both disadvantaged and advantaged groups fall within the range of 5000 and 10,000–11,000 positions, indicating a more equitable distribution across the ranking.

Table 5 also includes each group’s representation in the top-K positions [with K = 300, in line with the value adopted in Zehlike et al. (2022)]. Before the mitigation, most of the candidates belonged to a single ethnicity (94% Caucasians and 60.67% male Caucasians), and none were African Americans. After the mitigation, we have a representation of all ethnicities in the top 300.

We further analyze qualitatively the impact of the mitigation for specific subgroups and across all subgroups. Consider the subgroup with the highest disadvantage in the original ranking. Its divergence goes from \(-\)7.7 (first row of Table 2) to approximately zero (0.1); the contribution of the two terms after mitigation is reported in Fig. 1 (right). Figure 2 (right) shows the impact of Div-Rank mitigation on the Global Shapley value. The GSV is reduced to negligible values for all terms; hence, no term is highly associated with divergence.

Further insights in the mitigation process. The mitigation algorithm stops when either no statistically significant disadvantaged subgroups remain in the ranking or further mitigation would violate the monotonicity constraint. In all our experiments, we met the first condition; hence, we obtain a ranking in which no subgroup faces a statistically significant disadvantage. In this section, we analyze the iterative mitigation process, again for the LSAT dataset, but similar considerations apply to the other datasets.

Figure 3 (left) shows the minimum subgroup divergence and Kendall’s \(\tau\) during the iterations of the mitigation process. Figure 3 (right) shows the number of disadvantaged and advantaged subgroups. The minimum divergence increases monotonically (Fig. 3, left). At iteration 12, the minimum divergence is \(-\)0.24, and no subgroup is statistically significantly disadvantaged. As expected, Kendall’s \(\tau\) decreases as the mitigation proceeds: the more we adjust the utility of the candidates to mitigate subgroup disparities, the more the mitigated ranking deviates from the original one. Still, users can easily bound the dissimilarity to the original ranking by imposing a threshold on the ranking quality indexes as a further stopping condition of the iterative process.
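For example, the loop of the earlier Div-Rank sketch can be extended with such a stopping condition (a hypothetical extension; min_kendall_tau is a user-chosen parameter):

```python
from scipy.stats import kendalltau


def drifted_too_far(original_scores, current_scores, min_kendall_tau=0.8):
    """Extra stopping condition: halt the mitigation loop once the mitigated
    ranking becomes too dissimilar from the original one."""
    tau, _ = kendalltau(original_scores, current_scores)
    return tau < min_kendall_tau
```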

5.3 Comparison with baselines

Table 6 Original and re-ranking results for Div-Rank, CFA \(\theta\), and feldman
Table 7 Original and re-ranking results

We evaluate Div-Rank on all datasets. We compare the ranking results with (i) the original (score-based) ranking, (ii) the ranking produced by the technique of Feldman et al. (2015), and (iii) CFA\(\theta\) (Zehlike et al. 2020); we also experimentally evaluate (iv) Multi-FAIR (Zehlike et al. 2022). As explained before, a direct and fully fair comparison is not possible. The method by Feldman et al. (2015) considers one disadvantaged group at a time, while CFA\(\theta\) (Zehlike et al. 2020) can handle multiple subgroups at once; Multi-FAIR (Zehlike et al. 2022) can handle multiple groups as well, but it is designed explicitly for fair top-K ranking. For all these methods, however, we need to specify the protected groups as part of the input. Instead, Div-Rank automatically identifies the disadvantaged groups (i.e., groups with a negative and statistically significant divergence) to mitigate.

Tables 6 and 7, together with Tables 8, 9, 10, and 11 in the appendix, compare the results for the LSAT, Synthetic, COMPAS, German credit, IIT-JEE, and folktables datasets. We first remark that, in all experiments, Div-Rank is able to mitigate the disadvantage of all disadvantaged subgroups.

As mentioned, Multi-FAIR (Zehlike et al. 2022) addresses top-K fair ranking. To enable the comparison, we set K equal to the number of instances of the dataset. With this configuration, however, the algorithm did not terminate in a practical time compared to our approach and the other baselines (we interrupted its execution after two days of computation). The infeasibility of the computation can be traced back to the time complexity, which increases exponentially with K. Instead, our approach mitigates the disadvantage of all groups in at most 20 s (further details in the following performance analysis). We explore a setting with \(K\ll |C|\) in Appendix 1.5.

Consider Table 6 and the LSAT dataset. As also revealed by our analysis in Fig. 2 (left), women are associated with lower positions in the ranking. Hence, we first compare the ability of feldman and CFA\(\theta\) to mitigate the disparate impact on the female gender with that of Div-Rank. For Div-Rank, we just provide gender as a sensitive attribute (i.e., without directly indicating ‘female’ as protected). They all mitigate the divergence of the subgroups (\(|\mathbb {D}| = 0\)), with Div-Rank achieving slightly better results in terms of Kendall’s \(\tau\) than feldman, while feldman achieves the lowest Gini index.

The second block reports the results when considering only the ethnicity attribute. For feldman, we consider two configurations. The first considers only ‘African-American’ candidates as protected. feldman mitigates this subgroup, while other ethnicities still experience a disadvantage (\(|\mathbb {D}|\) drops from 5 to 4). The second configuration groups all ethnicities except ‘Caucasian’ into a single protected group. feldman mitigates 4 out of 5 disadvantaged subgroups, yet African-American candidates are still associated with lower positions. Moreover, the number of advantaged subgroups rises from 1 to 3. We then apply CFA\(\theta\) considering all ethnicities as input groups. The number of disadvantaged subgroups drops to 1, and the number of advantaged ones drops to 0. Div-Rank eliminates both disadvantaged and advantaged subgroups related to the ethnicity attribute while maintaining a high Kendall’s \(\tau\).

We then apply the mitigation to the intersection of ethnicity and gender. For feldman, we consider as protected subgroups African-American women (which we previously observed to be the most divergent subgroup) and non-Caucasian women. For the latter configuration, the number of disadvantaged subgroups drops from 11 to 8. For CFA\(\theta\), we consider three configurations. The first considers only the subgroup of African-American women as protected; the approach is not effective in this case. The second considers the intersection of all ethnicities in the dataset (8 in total) and the female gender, reducing the number of disadvantaged subgroups \(\mathbb {D}\) from 11 to 8. The third considers all subgroups at the intersection of ethnicity and gender values (16 in total); the number of disadvantaged groups drops to 1. Div-Rank mitigates all disadvantaged subgroups. The mitigated ranking still has a high Kendall’s \(\tau\), but a slightly higher Gini index.

Similar results and considerations apply to the COMPAS (Table 8), German credit (Lichman 2013) (Table 9), IIT-JEE (Technology IIO 2009) (Table 10), and folktables (Ding et al. 2021) (Table 11) datasets, reported in the appendix. In all cases, Div-Rank reduces the number of disadvantaged subgroups to 0. For the COMPAS and German credit datasets, CFA\(\theta\) also mitigates the disadvantage when all subgroups at the intersection of multiple attributes are considered. Still, Div-Rank achieves a higher Kendall’s \(\tau\) (0.68 vs. 0.66 for COMPAS, 0.96 vs. 0.92 for German credit) and generally lower execution times (discussed below). These experimental results show the ability of Div-Rank to mitigate divergence and remove disadvantaged subgroups without requiring the protected subgroups of interest to be specified.

Div-Rank identifies the subgroups to mitigate automatically. This ability is particularly relevant when the number of protected attributes increases, since applying the mitigation process of CFA\(\theta\) to all possible subgroups becomes computationally expensive. Conversely, Div-Rank considers only adequately represented subgroups and iteratively mitigates the disadvantaged ones. We demonstrate this on the Synthetic dataset, with all five of its attributes treated as protected. Table 7 shows the mitigation results. Since the dataset is synthetic, we know that instances with \(a = b = 1\) or \(c = 1\) are located in lower positions of the ranking, and we leverage this knowledge to select the protected values for the feldman mitigation; in a real-world scenario, such knowledge would require preliminary analyses of the ranking. Yet, the feldman mitigation still yields numerous disadvantaged subgroups, which again highlights the need to mitigate multiple subgroups simultaneously. Div-Rank and CFA\(\theta\) both reduce the number of disadvantaged subgroups to 0. However, CFA\(\theta\) requires enumerating and considering all possible subgroups of the protected attributes, which makes the mitigation process computationally expensive, requiring 7.5 h. Div-Rank instead automatically identifies the subgroups to mitigate and removes all disadvantaged subgroups in 20 s. Moreover, it achieves a higher Kendall’s \(\tau\) and a lower Gini index. While our ranking remains more aligned with the original one, CFA\(\theta\) achieves a greater reduction of subgroup divergence (lower minimum and maximum divergence in absolute terms). We attribute this to CFA\(\theta\) considering, in this experiment, all subgroups across the protected attributes, even those without a statistically significant disadvantage. In contrast, our method mitigates only statistically significant disadvantages, thus limiting the alterations to the original ranking and keeping the intervention targeted.
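The contrast between the two configuration strategies can be sketched as follows: instead of enumerating every subgroup over the protected attributes, only the adequately represented (frequent) ones are kept. The attribute handling and the naive level-wise enumeration are illustrative assumptions; in practice, a frequent-pattern miner with support-based pruning would typically be used.

```python
# Illustrative level-wise enumeration of subgroups over the protected
# attributes, keeping only those whose support reaches the minimum threshold.
from itertools import combinations
import pandas as pd

def frequent_subgroups(df, protected_attrs, min_support=0.01):
    n = len(df)
    frequent = {}
    for r in range(1, len(protected_attrs) + 1):
        for attrs in combinations(protected_attrs, r):
            # Count each attribute-value combination and keep the frequent ones.
            counts = df.groupby(list(attrs)).size()
            for values, count in counts.items():
                if count / n >= min_support:
                    values = values if isinstance(values, tuple) else (values,)
                    frequent[tuple(zip(attrs, values))] = count / n
    return frequent
```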

To further illustrate the effectiveness of Div-Rank in handling a larger number of attributes, we conducted experiments considering all attributes of the three evaluated datasets, thus disregarding the distinction between protected and non-protected ones. Note that this setting is for demonstration purposes, as our goal is fair ranking and reducing disparities among subgroups defined over protected attributes. We report the results and a detailed discussion in Appendix 1.4. The findings reveal that both feldman and CFA\(\theta\) fail to mitigate disparities for all disadvantaged groups, whereas Div-Rank reduces the number of disadvantaged groups to zero.

Computational performance analysis. We performed the experiments on an Ubuntu server with a 12-core Intel Xeon CPU and 32 GB of memory. Div-Rank required 7 s, 11 s, 0.5 s, 4.9 min, 3.05 min, and 20 s for LSAT, COMPAS, German credit, IIT-JEE, folktables, and Synthetic, respectively, when considering all sensitive attributes. Execution time is also low for the feldman method: its maximum running time is 4.5 s, 1.5 s, 0.1 s, 8.5 min, 4.4 min, and 1.5 s, respectively. However, as shown above, it does not reduce the number of disadvantaged subgroups to zero. CFA\(\theta\) requires more time, especially when the number of protected groups increases: its maximum running time for LSAT, COMPAS, German credit, IIT-JEE, folktables, and Synthetic is 13.5 min, 2.2 s, 13.7 s, 5 s, 13.4 min, and 7.5 h, respectively.

Sensitivity analysis. Div-Rank identifies and mitigates the disadvantaged subgroups whose support in the dataset is at least \(s\). We vary the minimum support \(s\), considering the values 0.001, 0.005, 0.01 (as in the previous experiments), 0.1, 0.2, and 0.3, and study the impact on the mitigation process. The lower the value, the larger the number of frequent subgroups, and hence of potentially disadvantaged ones, we expect. For all values of \(s\) and all datasets, Div-Rank terminates while satisfying the monotonicity constraint and reduces the number of disadvantaged subgroups to 0.

6 Conclusions

We propose a framework aimed at reducing inequalities in a ranking task for automatically identified disadvantaged subgroups. The approach leverages the notion of divergence to automatically identify data subgroups that experience a disadvantage in terms of ranking utility compared to the overall population. We first outline the desired properties of the mitigation process. We then propose a mitigation step that reduces the divergence of disadvantaged subgroups, analyzing its properties and its impact on subgroup divergence. Finally, we propose the re-ranking algorithm Div-Rank, which iteratively applies the mitigation step while satisfying the desired properties. The experimental results show the effectiveness of the proposed approach in removing all disadvantaged subgroups.

Our approach prioritizes group fairness over individual fairness when ranking candidates. Hence, we mitigate disparities among demographic groups, even if this results in individuals from originally advantaged groups being displaced by those from disadvantaged groups. This choice acknowledges systemic biases in decision-making processes and aims to promote equity, but it entails trade-offs: while it addresses historical inequalities, it may disadvantage individuals who would have performed well under the original system. Practitioners must weigh these trade-offs in their own context; prioritizing group fairness may rectify systemic biases, yet a more balanced approach may be necessary where individual performance is critical.

Transparent communication is crucial in explaining and justifying this process to the impacted individuals, adhering to trustworthy decision-making and AI principles. It is critical to communicate that the goal is to address systemic biases and promote equity, rather than to unfairly disadvantage individuals. Providing examples and illustrating the positive impact on historically disadvantaged groups can help individuals understand the rationale behind the adjustments.

In future work, we plan to incorporate individual-level fairness aspects into the mitigation process and to explore other subgroup discovery approaches for subgroup identification.