
Towards Formal Fairness in Machine Learning

  • Conference paper

Principles and Practice of Constraint Programming (CP 2020)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12333)

Abstract

One of the challenges of deploying machine learning (ML) systems is fairness. Datasets often include sensitive features, which ML algorithms may unwittingly use to create models that exhibit unfairness. Past work on fairness offers no formal guarantees about its results. This paper proposes to exploit formal reasoning methods to tackle fairness. Starting from an intuitive criterion for the fairness of an ML model, the paper formalises it and shows how fairness can be represented as a decision problem, given some logic representation of an ML model. The same criterion can also be applied to assessing bias in training data. Moreover, we propose a reasonable set of axiomatic properties which no other definition of dataset bias can satisfy. The paper also investigates the relationship between fairness and explainability, and shows that approaches for computing explanations can serve to assess the fairness of particular predictions. Finally, the paper proposes SAT-based approaches for learning fair ML models, even when the training data exhibits bias, and reports results of experimental trials.
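As a concrete illustration of the decision-problem view, the following is a minimal sketch in Python using the PySAT toolkit [38]. Everything in it is an assumption for illustration: the model is a toy Boolean circuit given directly in CNF (the prediction is the conjunction of two features), and the choice of protected feature is arbitrary, not taken from the paper. The encoding conjoins two copies of the model, forces the copies to agree on every non-protected feature, and asks a SAT oracle whether the two predictions can still differ; a satisfying assignment is then a witness of unfairness, while unsatisfiability certifies fairness under this criterion.

```python
# Minimal sketch: fairness as a SAT decision problem, with PySAT [38].
# The toy model y <-> (x1 AND x3) and the protected set are illustrative.
from pysat.formula import CNF
from pysat.solvers import Glucose3

def model_cnf(x, y):
    # Tseitin clauses for y <-> (x[0] AND x[2]).
    return [[-y, x[0]], [-y, x[2]], [y, -x[0], -x[2]]]

protected = {1}        # feature x1 is deemed protected (an assumption)
xa, ya = [1, 2, 3], 4  # copy A: feature literals and output literal
xb, yb = [5, 6, 7], 8  # copy B: a fresh copy of the same model

cnf = CNF()
cnf.extend(model_cnf(xa, ya))
cnf.extend(model_cnf(xb, yb))
for i in range(3):
    if i + 1 not in protected:
        # The two copies must agree on every non-protected feature...
        cnf.extend([[-xa[i], xb[i]], [xa[i], -xb[i]]])
# ...and yet the two predictions must differ (an XOR constraint).
cnf.extend([[ya, yb], [-ya, -yb]])

with Glucose3(bootstrap_with=cnf.clauses) as solver:
    # SAT: a witness pair exists, so the model is unfair; UNSAT: fair.
    print("unfair" if solver.solve() else "fair")
```

Read on training data rather than on a model, the same criterion flags a dataset as biased if it contains two examples that agree on all non-protected features yet carry different labels. A pure-Python sketch of that check, on fabricated two-row data, could look as follows (this is one natural reading of the abstract, not necessarily the paper's exact definition):

```python
# Sketch of the dataset-bias reading of the criterion: look for pairs of
# examples identical on non-protected features but labelled differently.
from itertools import combinations

def biased(dataset, protected):
    """dataset: iterable of (features, label); protected: feature indices."""
    def visible(x):  # projection onto the non-protected features
        return tuple(v for i, v in enumerate(x) if i not in protected)
    return any(visible(xa) == visible(xb) and ya != yb
               for (xa, ya), (xb, yb) in combinations(dataset, 2))

# Two examples differing only in protected feature 0, with opposite labels.
data = [((0, 1, 1), 1), ((1, 1, 1), 0)]
print(biased(data, protected={0}))  # True: a bias witness exists
```

Both sketches are deliberately naïve; the paper develops the formal encodings and the axiomatic treatment in full.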

This work was partially funded by ANITI, itself funded by the French program “Investing for the Future – PIA3” under grant agreement no. ANR-19-PI3A-0004.


Notes

  1. Real-valued features can be discretized. Moreover, to focus on binary features, the fairly standard one-hot encoding [58] is assumed for handling non-binary categorical features (see the sketch after these notes).

  2. For a number of reasons, datasets can contain such protected features, but their removal may be undesirable, for example because it may introduce inconsistencies in the dataset.

  3. It should be noted that, in ML settings, logic-based models that are not 100% accurate are expected to be less prone to overfitting. Thus, the fact that some accuracy is lost is not necessarily a drawback [8].
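The one-hot encoding assumed in note 1 is readily available in scikit-learn [58]. A minimal sketch, assuming scikit-learn ≥ 1.2 and a purely illustrative categorical column:

```python
# One-hot encoding of a non-binary categorical feature, as assumed in
# note 1, via scikit-learn [58]. Requires scikit-learn >= 1.2 (earlier
# versions spell the keyword `sparse` instead of `sparse_output`).
from sklearn.preprocessing import OneHotEncoder

X = [["red"], ["green"], ["blue"], ["green"]]  # toy categorical column

enc = OneHotEncoder(sparse_output=False)
# Each category becomes its own binary feature (columns are sorted:
# blue, green, red), so downstream reasoning sees only Boolean features.
print(enc.fit_transform(X))
# [[0. 0. 1.]
#  [0. 1. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]]
```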

References

  1. Adebayo, J.A.: FairML: ToolBox for diagnosing bias in predictive modeling. Master’s thesis, Massachusetts Institute of Technology (2016)
  2. Adebayo, J.A.: FairML: auditing black-box predictive models (2017)
  3. Aïvodji, U., Arai, H., Fortineau, O., Gambs, S., Hara, S., Tapp, A.: Fairwashing: the risk of rationalization. In: ICML, pp. 161–170 (2019)
  4. Aïvodji, U., Ferry, J., Gambs, S., Huguet, M., Siala, M.: Learning fair rule lists. CoRR abs/1909.03977 (2019). http://arxiv.org/abs/1909.03977
  5. Angelino, E., Larus-Stone, N., Alabi, D., Seltzer, M., Rudin, C.: Learning certifiably optimal rule lists for categorical data. J. Mach. Learn. Res. 18, 234:1–234:78 (2017)
  6. Angwin, J., Larson, J., Mattu, S., Kirchner, L.: Machine bias. propublica.org, May 2016. http://tiny.cc/a3b3iz
  7. Berk, R., Heidari, H., Jabbari, S., Kearns, M., Roth, A.: Fairness in criminal justice risk assessments: the state of the art. Sociol. Methods Res. (2017). https://doi.org/10.1177/0049124118782533
  8. Berkman, N.C., Sandholm, T.W.: What should be minimized in a decision tree: a re-examination. Department of Computer Science (1995)
  9. Bessiere, C., Hebrard, E., O’Sullivan, B.: Minimising decision tree size as combinatorial optimisation. In: Gent, I.P. (ed.) CP 2009. LNCS, vol. 5732, pp. 173–187. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04244-7_16
  10. Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability. Frontiers in Artificial Intelligence and Applications, vol. 185. IOS Press (2009)
  11. Bird, S., Hutchinson, B., Kenthapadi, K., Kiciman, E., Mitchell, M.: Fairness-aware machine learning: practical challenges and lessons learned. In: KDD, pp. 3205–3206 (2019)
  12. Cardelli, L., Kwiatkowska, M., Laurenti, L., Paoletti, N., Patane, A., Wicker, M.: Statistical guarantees for the robustness of Bayesian neural networks. In: IJCAI, pp. 5693–5700 (2019). https://doi.org/10.24963/ijcai.2019/789
  13. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: KDD, pp. 785–794 (2016)
  14. Chouldechova, A.: Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017)
  15. Chouldechova, A., Roth, A.: A snapshot of the frontiers of fairness in machine learning. Commun. ACM 63(5), 82–89 (2020)
  16. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision making and the cost of fairness. In: KDD, pp. 797–806 (2017)
  17. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn. MIT Press (2009). http://mitpress.mit.edu/books/introduction-algorithms
  18. Demsar, J., et al.: Orange: data mining toolbox in Python. J. Mach. Learn. Res. 14(1), 2349–2353 (2013)
  19. Dressel, J., Farid, H.: The accuracy, fairness, and limits of predicting recidivism. Sci. Adv. 4(1), eaao5580 (2018)
  20. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: ITCS, pp. 214–226 (2012)
  21. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella, A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24605-3_37
  22. Ehlers, R.: Formal verification of piece-wise linear feed-forward neural networks. In: D’Souza, D., Narayan Kumar, K. (eds.) ATVA 2017. LNCS, vol. 10482, pp. 269–286. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68167-2_19
  23. European Union High-Level Expert Group on Artificial Intelligence: Ethics guidelines for trustworthy AI, April 2019. https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai
  24. Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., Venkatasubramanian, S.: Certifying and removing disparate impact. In: KDD, pp. 259–268. ACM (2015)
  25. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S.: On the (im)possibility of fairness. CoRR abs/1609.07236 (2016). http://arxiv.org/abs/1609.07236
  26. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E.P., Roth, D.: A comparative study of fairness-enhancing interventions in machine learning. In: FAT, pp. 329–338 (2019)
  27. Galhotra, S., Brun, Y., Meliou, A.: Fairness testing: testing software for discrimination. In: FSE, pp. 498–510 (2017)
  28. Garg, S., Perot, V., Limtiaco, N., Taly, A., Chi, E.H., Beutel, A.: Counterfactual fairness in text classification through robustness. In: AIES, pp. 219–226 (2019)
  29. Gario, M., Micheli, A.: PySMT: a solver-agnostic library for fast prototyping of SMT-based algorithms. In: SMT Workshop (2015)
  30. Ghosh, B., Meel, K.S.: IMLI: an incremental framework for MaxSAT-based learning of interpretable classification rules. In: AIES, pp. 203–210 (2019)
  31. Grgic-Hlaca, N., Zafar, M.B., Gummadi, K.P., Weller, A.: The case for process fairness in learning: feature selection for fair decision making. In: NIPS Symposium on Machine Learning and the Law (2016)
  32. Han, J., Kamber, M., Pei, J.: Data Mining: Concepts and Techniques, 3rd edn. Morgan Kaufmann (2012)
  33. Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: NeurIPS, pp. 3315–3323 (2016). http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning
  34. Holstein, K., Vaughan, J.W., Daumé III, H., Dudík, M., Wallach, H.M.: Improving fairness in machine learning systems: what do industry practitioners need? In: CHI, p. 600 (2019)
  35. Hu, H., Siala, M., Hebrard, E., Huguet, M.J.: Learning optimal decision trees with MaxSAT and its integration in AdaBoost. In: IJCAI, pp. 1170–1176 (2020)
  36. Hu, X., Rudin, C., Seltzer, M.: Optimal sparse decision trees. In: NeurIPS, pp. 7265–7273 (2019)
  37. Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 3–29. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63387-9_1
  38. Ignatiev, A., Morgado, A., Marques-Silva, J.: PySAT: a Python toolkit for prototyping with SAT oracles. In: Beyersdorff, O., Wintersteiger, C.M. (eds.) SAT 2018. LNCS, vol. 10929, pp. 428–437. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-94144-8_26
  39. Ignatiev, A., Narodytska, N., Marques-Silva, J.: Abduction-based explanations for machine learning models. In: AAAI, pp. 1511–1519 (2019)
  40. Ignatiev, A., Narodytska, N., Marques-Silva, J.: On validating, repairing and refining heuristic ML explanations. CoRR abs/1907.02509 (2019). http://arxiv.org/abs/1907.02509
  41. Ignatiev, A., Pereira, F., Narodytska, N., Marques-Silva, J.: A SAT-based approach to learn explainable decision sets. In: IJCAR, pp. 627–645 (2018)
  42. Kamath, A.P., Karmarkar, N., Ramakrishnan, K.G., Resende, M.G.C.: A continuous approach to inductive inference. Math. Program. 57, 215–238 (1992)
  43. Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 97–117. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63387-9_5
  44. Katz, G., et al.: The Marabou framework for verification and analysis of deep neural networks. In: Dillig, I., Tasiran, S. (eds.) CAV 2019. LNCS, vol. 11561, pp. 443–452. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25540-4_26
  45. Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.: Avoiding discrimination through causal reasoning. In: NeurIPS, pp. 656–666 (2017)
  46. Kleinberg, J.M., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair determination of risk scores. In: ITCS, pp. 43:1–43:23 (2017)
  47. Kohavi, R.: Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid. In: KDD, pp. 202–207 (1996)
  48. Kusner, M.J., Loftus, J.R., Russell, C., Silva, R.: Counterfactual fairness. In: NeurIPS, pp. 4066–4076 (2017)
  49. Lakkaraju, H., Bach, S.H., Leskovec, J.: Interpretable decision sets: a joint framework for description and prediction. In: KDD, pp. 1675–1684 (2016)
  50. Leofante, F., Narodytska, N., Pulina, L., Tacchella, A.: Automated verification of neural networks: advances, challenges and perspectives. CoRR abs/1805.09938 (2018). http://arxiv.org/abs/1805.09938
  51. Maliotov, D., Meel, K.S.: MLIC: a MaxSAT-based framework for learning interpretable classification rules. In: Hooker, J. (ed.) CP 2018. LNCS, vol. 11008, pp. 312–327. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-98334-9_21
  52. de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_24
  53. Nabi, R., Shpitser, I.: Fair inference on outcomes. In: AAAI, pp. 1931–1940 (2018)
  54. Narayanan, A.: Translation tutorial: 21 fairness definitions and their politics. In: FAT (2018)
  55. Narodytska, N.: Formal analysis of deep binarized neural networks. In: IJCAI, pp. 5692–5696 (2018)
  56. Narodytska, N., Ignatiev, A., Pereira, F., Marques-Silva, J.: Learning optimal decision trees with SAT. In: IJCAI, pp. 1362–1368 (2018)
  57. Narodytska, N., Kasiviswanathan, S.P., Ryzhyk, L., Sagiv, M., Walsh, T.: Verifying properties of binarized deep neural networks. In: AAAI, pp. 6615–6624 (2018)
  58. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
  59. du Pin Calmon, F., Wei, D., Vinzamuri, B., Ramamurthy, K.N., Varshney, K.R.: Optimized pre-processing for discrimination prevention. In: NeurIPS, pp. 3992–4001 (2017)
  60. Pulina, L., Tacchella, A.: An abstraction-refinement approach to verification of artificial neural networks. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 243–257. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14295-6_24
  61. Ruan, W., Huang, X., Kwiatkowska, M.: Reachability analysis of deep neural networks with provable guarantees. In: IJCAI, pp. 2651–2659 (2018)
  62. Shih, A., Choi, A., Darwiche, A.: A symbolic approach to explaining Bayesian network classifiers. In: IJCAI, pp. 5103–5111 (2018)
  63. Supreme Court of the United States: Ricci v. DeStefano, 557 U.S. 557 (2009)
  64. Verma, S., Rubin, J.: Fairness definitions explained. In: FairWare@ICSE, pp. 1–7 (2018)
  65. Verwer, S., Zhang, Y.: Learning decision trees with flexible constraints and objectives using integer optimization. In: Salvagnin, D., Lombardi, M. (eds.) CPAIOR 2017. LNCS, vol. 10335, pp. 94–103. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59776-8_8
  66. Verwer, S., Zhang, Y.: Learning optimal classification trees using a binary linear program formulation. In: AAAI, pp. 1625–1632 (2019)
  67. Wu, M., Wicker, M., Ruan, W., Huang, X., Kwiatkowska, M.: A game-based approximate verification of deep neural networks with provable guarantees. Theor. Comput. Sci. 807, 298–329 (2020). https://doi.org/10.1016/j.tcs.2019.05.046


Author information

Correspondence to Martin C. Cooper.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Ignatiev, A., Cooper, M.C., Siala, M., Hebrard, E., Marques-Silva, J. (2020). Towards Formal Fairness in Machine Learning. In: Simonis, H. (ed.) Principles and Practice of Constraint Programming. CP 2020. Lecture Notes in Computer Science, vol. 12333. Springer, Cham. https://doi.org/10.1007/978-3-030-58475-7_49


  • DOI: https://doi.org/10.1007/978-3-030-58475-7_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58474-0

  • Online ISBN: 978-3-030-58475-7

  • eBook Packages: Computer Science; Computer Science (R0)
