Abstract
Trustworthy AI is crucial for the broad adoption of AI systems. An important question is, therefore, how to ensure this trustworthiness. The absence of algorithmic bias is a key attribute for an AI system to be considered trustworthy. In this book chapter, we address various problems related to the detection and mitigation of algorithmic bias in machine learning models, specifically individual discrimination. A model shows individual discrimination if two instances, predominantly differing in protected attributes like race, gender, or age, produce different decision outcomes. In a black-box setting, detecting individual discrimination requires extensive testing. We present a methodology that automatically generates test inputs with a high likelihood of exposing individual discrimination. Our approach combines two well-established techniques, symbolic execution and local explainability, to generate test cases effectively. We further address the problem of localizing individual-discrimination failures, a prerequisite for effective model debugging. We introduce a notion called the Region of Individual Discrimination (RID), an interpretable region of the feature space, described by simple predicates on features, that aims to contain all the discriminatory instances. This region essentially captures the positive correlation between discriminatory instances and features. Finally, we describe a repair algorithm that aims to remove individual discrimination from the model by creating retraining data based on RID. We empirically show that our approaches are effective for assuring individual fairness of machine learning models.
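To make the property concrete, the sketch below shows a naive individual-discrimination check for a black-box classifier in Python. It only illustrates the predicate being tested, not the chapter's test-generation method (which combines symbolic execution with local explainability to find such inputs efficiently); the model interface, feature encodings, and all names here are assumptions.

PROTECTED = {"gender": [0, 1], "race": [0, 1, 2]}  # assumed integer encodings

def is_discriminatory(model, instance, protected=PROTECTED):
    """Return True if changing only a protected attribute of `instance`
    flips the decision of a black-box `model` (sklearn-style predict()).
    `instance` maps feature name -> value, in the order the model expects."""
    original = model.predict([list(instance.values())])[0]
    for attr, values in protected.items():
        for v in values:
            if v == instance[attr]:
                continue  # skip the instance's own value
            variant = dict(instance)
            variant[attr] = v  # flip only the protected attribute
            if model.predict([list(variant.values())])[0] != original:
                return True  # found a discriminatory pair
    return False

Exhaustively sampling the input space and applying this check is prohibitively expensive, which is why the chapter's approach instead steers test generation toward inputs likely to satisfy the predicate.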
Notes
1. The choice of training data depends on the debugging stage in the Data and AI life cycle. An alternative at the model-building phase is the union of the training and validation data. Some debugging can also be performed at runtime, when several samples fail in production; at that point, the training, validation, and test sets for model M, along with the runtime workload, can all be used.
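As a hypothetical illustration of assembling such a debugging set, the sketch below concatenates whichever splits are available at the current lifecycle stage; the function and variable names are assumptions, not the chapter's API.

import pandas as pd

def debugging_data(train, validation=None, test=None, runtime_failures=None):
    # At model-building time this is typically train + validation; at
    # runtime it can also include failing samples from the workload.
    parts = [df for df in (train, validation, test, runtime_failures)
             if df is not None]
    return pd.concat(parts, ignore_index=True).drop_duplicates()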
Copyright information
© 2023 The Institution of Engineers (India)
Cite this chapter
Saha, D., Agarwal, A., Hans, S., Haldar, S. (2023). Testing, Debugging, and Repairing Individual Discrimination in Machine Learning Models. In: Mukherjee, A., Kulshrestha, J., Chakraborty, A., Kumar, S. (eds) Ethics in Artificial Intelligence: Bias, Fairness and Beyond. Studies in Computational Intelligence, vol 1123. Springer, Singapore. https://doi.org/10.1007/978-981-99-7184-8_1
DOI: https://doi.org/10.1007/978-981-99-7184-8_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7183-1
Online ISBN: 978-981-99-7184-8