Group and Individual Fairness in Clustering Algorithms

Gupta, Shivam; Jain, Shweta; Ghalme, Ganesh; Krishnan, Narayanan C.; Hemachandra, Nandyala

doi:10.1007/978-981-99-7184-8_2

Part of the book series: Studies in Computational Intelligence (SCI, volume 1123)

Abstract

Clustering is a classical unsupervised machine learning technique with applications in criminal justice, automated resume processing, bank loan approvals, recommender systems, and many other areas. Despite their popularity, traditional clustering algorithms may discriminate against groups of people (or individuals) and can thereby have societal impacts. This has led to the study of fair clustering algorithms, which aim to minimize the clustering cost while ensuring fairness. This chapter outlines existing group and individual fairness notions, discusses their relationships, and comprehensively categorizes the current algorithms. It further discusses the advantages and disadvantages of existing algorithms in terms of theoretical guarantees, time complexity, and reproducibility. Finally, the chapter concludes with a discussion of new directions and open problems in fair clustering.

Notes

1. The \(\boldsymbol{\tau }\) vector is written in the form (red, blue) in the \(\boldsymbol{\tau }\)-mp, \(\boldsymbol{\tau }\)-rd, and \(\boldsymbol{\tau }\)-fair notions.
2. A (p, q)-approx bicriteria guarantee denotes a cost approximation of p and a fairness approximation of q.
3. The ratio of the clustering objective value under the fairness constraint to the standard objective value.
4. The mean center of all points belonging to a single color (say, red points) in the dataset.
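
Notes 3 and 4 can be made concrete with a small sketch. The code below is illustrative only and is not taken from the chapter: it assumes the k-means objective (sum of squared distances to cluster means) as the clustering objective, and the function names, the toy points, and the two labelings are all hypothetical.

```python
from collections import defaultdict


def mean_center(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def color_mean_centers(points, colors):
    """Mean center of the points of each color (Note 4)."""
    groups = defaultdict(list)
    for point, color in zip(points, colors):
        groups[color].append(point)
    return {color: mean_center(group) for color, group in groups.items()}


def kmeans_cost(points, labels):
    """Assumed objective: sum of squared distances to each cluster's mean."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    cost = 0.0
    for members in clusters.values():
        mu = mean_center(members)
        cost += sum(sum((a - b) ** 2 for a, b in zip(p, mu)) for p in members)
    return cost


def objective_ratio(points, fair_labels, standard_labels):
    """Objective under the fairness constraint over the standard objective (Note 3)."""
    return kmeans_cost(points, fair_labels) / kmeans_cost(points, standard_labels)


# Hypothetical toy data: two red points on the left, two blue points on the right.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
colors = ["red", "red", "blue", "blue"]

standard_labels = [0, 0, 1, 1]  # groups nearby points; each cluster is one color
fair_labels = [0, 1, 0, 1]      # one red and one blue point in every cluster

print(color_mean_centers(points, colors))
print(objective_ratio(points, fair_labels, standard_labels))
```

In this toy instance the color-balanced labeling pairs distant points, so the ratio is well above 1; a ratio of exactly 1 would mean the fairness constraint costs nothing on that input.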

Author information

Authors and Affiliations

Indian Institute of Technology Ropar, Bara Phool, India
Shivam Gupta & Shweta Jain
Indian Institute of Technology Hyderabad, Kandi, India
Ganesh Ghalme
Indian Institute of Technology Palakkad, Kanjikode, India
Narayanan C. Krishnan
Indian Institute of Technology Bombay, Mumbai, India
Nandyala Hemachandra

Corresponding author

Correspondence to Narayanan C. Krishnan.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, IIT Kharagpur, West Bengal, India
Animesh Mukherjee
Department of Computer Science, Aalto University, Espoo, Finland
Juhi Kulshrestha
Department of Computer Science and Engineering, IIT Delhi, New Delhi, Delhi, India
Abhijnan Chakraborty
School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
Srijan Kumar

About this chapter

Cite this chapter

Gupta, S., Jain, S., Ghalme, G., Krishnan, N.C., Hemachandra, N. (2023). Group and Individual Fairness in Clustering Algorithms. In: Mukherjee, A., Kulshrestha, J., Chakraborty, A., Kumar, S. (eds) Ethics in Artificial Intelligence: Bias, Fairness and Beyond. Studies in Computational Intelligence, vol 1123. Springer, Singapore. https://doi.org/10.1007/978-981-99-7184-8_2

DOI: https://doi.org/10.1007/978-981-99-7184-8_2
Published: 30 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7183-1
Online ISBN: 978-981-99-7184-8
eBook Packages: Intelligent Technologies and Robotics (R0)
