Causal Inference in Data Analysis with Applications to Fairness and Explanations

Reasoning Web. Causality, Explanations and Declarative Knowledge

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13759)

Abstract

Causal inference is a fundamental concept that goes beyond simple correlation and model-based prediction analysis, and is highly relevant in domains such as health, medicine, and the social sciences. Causal inference enables the estimation of the impact of an intervention or treatment on the world, making it critical for sound and robust policy making. However, randomized controlled experiments, which are typically considered the gold standard for drawing causal conclusions, are often not feasible due to ethical, cost, or other constraints. Fortunately, there is a rich literature in Artificial Intelligence (AI), Machine Learning (ML), and Statistics on observational studies, which are methods for causal inference on observed or collected data under certain assumptions. In this paper, we provide an overview of popular formal and rigorous techniques for causal inference on observed data from the AI and Statistics literature. Furthermore, we discuss how concepts from causal inference can be used to infer fairness and enable explainability in machine learning models, both of which are critical in responsible data science when ML is used to make high-stakes decisions in various contexts. Our discussion highlights the importance of using causal inference in ML models and provides insights on how to develop more transparent and responsible AI systems.

Notes

  1. For proof and intuition behind the back-door criterion, along with other sufficient conditions, see [55].

References

  1. New York Times article on crime and summer (2009). https://www.nytimes.com/2009/06/19/nyregion/19murder.html?smid=url-share

  2. Alvarez-Melis, D., Jaakkola, T.S.: On the robustness of interpretability methods. arXiv preprint arXiv:1806.08049 (2018)

  3. Apley, D.W., Zhu, J.: Visualizing the effects of predictor variables in black box supervised learning models. arXiv preprint arXiv:1612.08468 (2016)

  4. Avin, C., Shpitser, I., Pearl, J.: Identifiability of path-specific effects (2005)

  5. Awan, M.U., Morucci, M., Orlandi, V., Roy, S., Rudin, C., Volfovsky, A.: Almost-matching-exactly for treatment effect estimation under network interference. In: Chiappa, S., Calandra, R. (eds.) The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, 26–28 August 2020, Online, Palermo, Sicily, Italy. Proceedings of Machine Learning Research, vol. 108, pp. 3252–3262. PMLR (2020)

  6. Bickel, P.J., Hammel, E.A., O’Connell, J., et al.: Sex bias in graduate admissions: data from Berkeley. Science 187(4175), 398–404 (1975)

  7. Chiappa, S.: Path-specific counterfactual fairness. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 7801–7808 (2019)

  8. Chouldechova, A.: Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017)

  9. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision making and the cost of fairness. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 797–806. ACM (2017)

  10. Cox, D.R.: The regression analysis of binary sequences (with discussion). J. Roy. Stat. Soc. B 20, 215–242 (1958)

  11. De Graaf, M.M.A., Malle, B.F.: How people explain action (and autonomous intelligent systems should too). In: 2017 AAAI Fall Symposium Series (2017)

  12. Dieng, A., Liu, Y., Roy, S., Rudin, C., Volfovsky, A.: Interpretable almost-exact matching for causal inference. In: Chaudhuri, K., Sugiyama, M. (eds.) The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 16–18 April 2019, Naha, Okinawa, Japan. Proceedings of Machine Learning Research, vol. 89, pp. 2445–2453. PMLR (2019)

  13. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.S.: Fairness through awareness. In: ITCS, pp. 214–226. ACM (2012)

  14. Fisher, A., Rudin, C., Dominici, F.: Model class reliance: variable importance measures for any machine learning model class, from the “Rashomon” perspective. arXiv preprint arXiv:1801.01489, p. 68 (2018)

  15. Fisher, R.A.: The Design of Experiments. Oliver and Boyd, Oxford (1935)

  16. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 1189–1232 (2001)

  17. Frye, C., Feige, I., Rowat, C.: Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability. arXiv preprint arXiv:1910.06358 (2019)

  18. Funk, M.J., Westreich, D., Wiesen, C., Stürmer, T., Brookhart, M.A., Davidian, M.: Doubly robust estimation of causal effects. Am. J. Epidemiol. 173, 761–767 (2011)

  19. Galhotra, S., Brun, Y., Meliou, A.: Fairness testing: testing software for discrimination. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pp. 498–510. ACM (2017)

  20. Galhotra, S., Pradhan, R., Salimi, B.: Explaining black-box algorithms using probabilistic contrastive counterfactuals. In: Proceedings of the International Conference on Management of Data, pp. 577–590 (2021)

  21. Gerstenberg, T., Goodman, N.D., Lagnado, D.A., Tenenbaum, J.B.: How, whether, why: causal judgments as counterfactual contrasts. In: CogSci (2015)

  22. Goldstein, A., Kapelner, A., Bleich, J., Pitkin, E.: Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation. J. Comput. Graph. Stat. 24(1), 44–65 (2015)

  23. Greenwell, B.M., Boehmke, B.C., McCarthy, A.J.: A simple and effective model-based variable importance measure. arXiv preprint arXiv:1805.04755 (2018)

  24. Grynaviski, E.: Contrasts, counterfactuals, and causes. Eur. J. Int. Rel. 19(4), 823–846 (2013)

  25. Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR) 51(5), 1–42 (2018)

  26. Hahn, P.R., Murray, J.S., Carvalho, C.: Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects (2017)

  27. Hernán, M.A., Robins, J.M.: Causal inference (2010)

  28. Heskes, T., Sijben, E., Bucur, I.G., Claassen, T.: Causal Shapley values: exploiting causal knowledge to explain individual predictions of complex models. arXiv preprint arXiv:2011.01625 (2020)

  29. Holland, P.W.: Statistics and causal inference. J. Am. Stat. Assoc. 81(396), 945–960 (1986)

  30. Hooker, G.: Discovering additive structure in black box functions. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 575–580 (2004)

  31. Hooker, G., Mentch, L.: Please stop permuting features: an explanation and alternatives. arXiv preprint arXiv:1905.03151 (2019)

  32. Iacus, S.M., King, G., Porro, G., Katz, J.N.: Causal inference without balance checking: coarsened exact matching. Polit. Anal. 1–24 (2012)

  33. Imbens, G.W., Rubin, D.B.: Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press, Cambridge (2015)

  34. Islam, M.T., Fariha, A., Meliou, A., Salimi, B.: Through the data management lens: experimental analysis and evaluation of fair classification. In: Proceedings of the 2022 International Conference on Management of Data, pp. 232–246 (2022)

  35. Karimi, A.-H., Barthe, G., Balle, B., Valera, I.: Model-agnostic counterfactual explanations for consequential decisions. arXiv preprint arXiv:1905.11190 (2019)

  36. Karimi, A.-H., Barthe, G., Schölkopf, B., Valera, I.: A survey of algorithmic recourse: contrastive explanations and consequential recommendations. ACM Comput. Surv. 55(5), 1–29 (2022)

  37. Karimi, A.-H., von Kügelgen, J., Schölkopf, B., Valera, I.: Algorithmic recourse under imperfect causal knowledge: a probabilistic approach. arXiv preprint arXiv:2006.06831 (2020)

  38. Kilbertus, N., Carulla, M.R., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.: Avoiding discrimination through causal reasoning. In: Advances in Neural Information Processing Systems, pp. 656–666 (2017)

  39. Kumar, I.E., Venkatasubramanian, S., Scheidegger, C., Friedler, S.: Problems with Shapley-value-based explanations as feature importance measures. In: International Conference on Machine Learning, pp. 5491–5500. PMLR (2020)

  40. Kusner, M.J., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: Advances in Neural Information Processing Systems, pp. 4069–4079 (2017)

  41. Larson, J., Mattu, S., Kirchner, L., Angwin, J.: How we analyzed the COMPAS recidivism algorithm. ProPublica 9 (2016)

  42. Laugel, T., Lesot, M.-J., Marsala, C., Renard, X., Detyniecki, M.: Inverse classification for comparison-based interpretability in machine learning. arXiv preprint arXiv:1712.08443 (2017)

  43. Lipton, P.: Contrastive explanation. R. Inst. Philos. Suppl. 27, 247–266 (1990)

  44. Mahajan, D., Tan, C., Sharma, A.: Preserving causal constraints in counterfactual explanations for machine learning classifiers. arXiv preprint arXiv:1912.03277 (2019)

  45. Makhlouf, K., Zhioua, S., Palamidessi, C.: Survey on causal-based machine learning fairness notions. arXiv preprint arXiv:2010.09553 (2020)

  46. Molnar, C.: Interpretable Machine Learning (2020). Lulu.com

  47. Morton, A.: Contrastive knowledge. Contrastivism Philos. 101–115 (2013)

  48. Morucci, M., Orlandi, V., Roy, S., Rudin, C., Volfovsky, A.: Adaptive hyper-box matching for interpretable individualized treatment effect estimation. In: Adams, R.P., Gogate, V. (eds.) Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, UAI 2020, Virtual Online, 3–6 August 2020. Proceedings of Machine Learning Research, vol. 124, pp. 1089–1098. AUAI Press (2020)

  49. Mothilal, R.K., Sharma, A., Tan, C.: Explaining machine learning classifiers through diverse counterfactual explanations. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 607–617 (2020)

  50. Nabi, R., Shpitser, I.: Fair inference on outcomes. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 2018, p. 1931. NIH Public Access (2018)

  51. Neyman, J.: On the application of probability theory to agricultural experiments. Essay on Principles. Section 9. PhD thesis, Roczniki Nauk Rolniczych Tom X [in Polish] (1923). Translated in Statistical Science, vol. 5, pp. 465–480

  52. Ogburn, E.L., Shpitser, I., Lee, Y.: Causal inference, social networks, and chain graphs (2018)

  53. Ogburn, E.L., Sofrygin, O., Diaz, I., van der Laan, M.J.: Causal inference for social network data (2017)

  54. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann (1988)

  55. Pearl, J.: Causality: Models, Reasoning, and Inference, 2nd edn. Cambridge University Press, Cambridge (2009)

  56. Pearl, J.: Comment: understanding Simpson’s paradox. In: Probabilistic and Causal Inference: The Works of Judea Pearl, pp. 399–412 (2022)

  57. Pearl, J., et al.: Causal inference in statistics: an overview. Stat. Surv. 3, 96–146 (2009)

  58. Pearl, J., Glymour, M., Jewell, N.P.: Causal Inference in Statistics: A Primer. Wiley, Hoboken (2016)

  59. Pfohl, S.R., Duan, T., Ding, D.Y., Shah, N.H.: Counterfactual reasoning for fair clinical risk prediction. In: Machine Learning for Healthcare Conference, pp. 325–358. PMLR (2019)

  60. Pradhan, R., Zhu, J., Glavic, B., Salimi, B.: Interpretable data-based explanations for fairness debugging. In: SIGMOD (2022)

  61. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)

  62. Ribeiro, M.T., Singh, S., Guestrin, C.: Anchors: high-precision model-agnostic explanations. In: AAAI, vol. 18, pp. 1527–1535 (2018)

  63. Rosenbaum, P.R.: Observational Study. Wiley, Hoboken (2005)

  64. Rosenbaum, P.R.: Design of Observational Studies, vol. 10. Springer, Heidelberg (2010)

  65. Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55 (1983)

  66. Rosenbaum, P.R., Rubin, D.B.: Reducing bias in observational studies using subclassification on the propensity score. J. Am. Stat. Assoc. 79(387), 516–524 (1984)

  67. Rubin, D.B.: Matching to remove bias in observational studies. Biometrics 159–183 (1973)

  68. Rubin, D.B.: Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 66(5), 688 (1974)

  69. Rubin, D.B.: Causal inference using potential outcomes. J. Am. Stat. Assoc. 100(469), 322–331 (2005)

  70. Russell, C., Kusner, M.J., Loftus, J., Silva, R.: When worlds collide: integrating different counterfactual assumptions in fairness. In: Advances in Neural Information Processing Systems, pp. 6414–6423 (2017)

  71. Salimi, B., Cole, C., Ports, D.R.K., Suciu, D.: ZaliQL: causal inference from observational data at scale. Proc. VLDB Endow. 10(12), 1957–1960 (2017)

  72. Salimi, B., Howe, B., Suciu, D.: Data management for causal algorithmic fairness. Data Eng. 24 (2019)

  73. Salimi, B., Parikh, H., Kayali, M., Getoor, L., Roy, S., Suciu, D.: Causal relational learning. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 241–256 (2020)

  74. Salimi, B., Rodriguez, L., Howe, B., Suciu, D.: Interventional fairness: causal database repair for algorithmic fairness. In: Proceedings of the 2019 International Conference on Management of Data, pp. 793–810. ACM (2019)

  75. Schwab, P., Karlen, W.: CXPlain: causal explanations for model interpretation under uncertainty. arXiv preprint arXiv:1910.12336 (2019)

  76. Stuart, E.A.: Matching methods for causal inference: a review and a look forward. Stat. Sci. 25(1), 1–21 (2010)

  77. Ustun, B., Spangher, A., Liu, Y.: Actionable recourse in linear classification. In: Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 10–19 (2019)

  78. Verma, S., Rubin, J.: Fairness definitions explained. In: 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), pp. 1–7. IEEE (2018)

  79. Wachter, S., Mittelstadt, B., Russell, C.: Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv. JL Tech. 31, 841 (2017)

  80. Wang, J., Wiens, J., Lundberg, S.: Shapley flow: a graph-based approach to interpreting model predictions. In: International Conference on Artificial Intelligence and Statistics, pp. 721–729. PMLR (2021)

  81. Wang, T., et al.: FLAME: a fast large-scale almost matching exactly approach to causal inference. J. Mach. Learn. Res. 22, 31:1–31:41 (2021)

  82. Woodward, J.: Making Things Happen: A Theory of Causal Explanation. Oxford University Press, Oxford (2005)

  83. Zliobaite, I.: A survey on measuring indirect discrimination in machine learning. arXiv preprint arXiv:1511.00148 (2015)

Author information

Corresponding author

Correspondence to Sudeepa Roy.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Roy, S., Salimi, B. (2023). Causal Inference in Data Analysis with Applications to Fairness and Explanations. In: Bertossi, L., Xiao, G. (eds) Reasoning Web. Causality, Explanations and Declarative Knowledge. Lecture Notes in Computer Science, vol 13759. Springer, Cham. https://doi.org/10.1007/978-3-031-31414-8_3

  • DOI: https://doi.org/10.1007/978-3-031-31414-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31413-1

  • Online ISBN: 978-3-031-31414-8

  • eBook Packages: Computer Science, Computer Science (R0)
