Abstract
Trustworthy AI is crucial for the broad adoption of AI systems. An important question is, therefore, how to ensure this trustworthiness. The absence of algorithmic bias is a key attribute for an AI system to be considered trustworthy. In this book chapter, we address various problems related to the detection and mitigation of algorithmic bias in machine learning models, specifically individual discrimination. A model shows individual discrimination if two instances, predominantly differing in protected attributes like race, gender, or age, produce different decision outcomes. In a black-box setting, detecting individual discrimination requires extensive testing. We present a methodology that automatically generates test inputs with a high likelihood of exposing individual discrimination. Our approach combines two well-established techniques, symbolic execution and local explainability, to generate test cases effectively. We further address the problem of localizing individual-discrimination failures, a prerequisite for effective model debugging. We introduce a notion called the Region of Individual Discrimination (RID), an interpretable region of the feature space, described by simple predicates on features, that aims to contain all the discriminatory instances. This region essentially captures the positive correlation between discriminatory instances and features. Finally, we describe a repair algorithm that aims to remove individual discrimination from the model by creating retraining data based on RID. We empirically show that our approaches are effective for assuring individual fairness of machine learning models.
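To make the property concrete, the sketch below shows a naive individual-discrimination check for a black-box classifier in Python. It only illustrates the predicate being tested, not the chapter's test-generation method (which combines symbolic execution with local explainability to find such inputs efficiently); the model interface, feature encodings, and all names here are assumptions.

PROTECTED = {"gender": [0, 1], "race": [0, 1, 2]}  # assumed integer encodings

def is_discriminatory(model, instance, protected=PROTECTED):
    """Return True if changing only a protected attribute of `instance`
    flips the decision of a black-box `model` (sklearn-style predict()).
    `instance` maps feature name -> value, in the order the model expects."""
    original = model.predict([list(instance.values())])[0]
    for attr, values in protected.items():
        for v in values:
            if v == instance[attr]:
                continue  # skip the instance's own value
            variant = dict(instance)
            variant[attr] = v  # flip only the protected attribute
            if model.predict([list(variant.values())])[0] != original:
                return True  # found a discriminatory pair
    return False

Exhaustively sampling the input space and applying this check is prohibitively expensive, which is why the chapter's approach instead steers test generation toward inputs likely to satisfy the predicate.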
Notes
1. The choice of training data depends on the debugging stage in the Data and AI life cycle. An alternative at the model-building phase is the union of the training and validation data. Some debugging can also be performed at runtime, when several samples fail in production; at that point, the training, validation, and test sets for model M, along with the runtime workload, can all be used.
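As a hypothetical illustration of assembling such a debugging set, the sketch below concatenates whichever splits are available at the current lifecycle stage; the function and variable names are assumptions, not the chapter's API.

import pandas as pd

def debugging_data(train, validation=None, test=None, runtime_failures=None):
    # At model-building time this is typically train + validation; at
    # runtime it can also include failing samples from the workload.
    parts = [df for df in (train, validation, test, runtime_failures)
             if df is not None]
    return pd.concat(parts, ignore_index=True).drop_duplicates()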
Copyright information
© 2023 The Institution of Engineers (India)
Cite this chapter
Saha, D., Agarwal, A., Hans, S., Haldar, S. (2023). Testing, Debugging, and Repairing Individual Discrimination in Machine Learning Models. In: Mukherjee, A., Kulshrestha, J., Chakraborty, A., Kumar, S. (eds) Ethics in Artificial Intelligence: Bias, Fairness and Beyond. Studies in Computational Intelligence, vol 1123. Springer, Singapore. https://doi.org/10.1007/978-981-99-7184-8_1
DOI: https://doi.org/10.1007/978-981-99-7184-8_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7183-1
Online ISBN: 978-981-99-7184-8