Causality Modeling and Statistical Generative Mechanisms

  • Igor Mandel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11100)


The notion of causality lies at the heart of science, but when statistics tries to address it, some profound questions remain unanswered. How is statistical inference in probabilistic terms linked with causality? What do modern causality models offer that is substantially different from traditional dependency models such as regression or decision trees, and, if they do offer something, do they deliver on these promises? How are causality models related to statistical and machine learning techniques? What is the relationship between causality modeling, statistical inference, and machine learning on one side, and operations research and optimization on the other? Or, more generally: if a causal picture of the world is a commonly accepted goal of any science, can non-causal statistical models be of any use? If yes, in what sense? If not, why are they so widely used? The insufficient level of detail in discussions of these and similar problems creates a lot of confusion, especially now, when lauded terms like Data Mining, Big Data, Deep Learning and others appear even in the non-professional media. This paper inspects the underlying logic of different approaches that are directly or indirectly related to causality. It shows that even established methods are vulnerable to small deviations from the ideal setting and that the leading approaches to statistical causality, Structural Equation Modeling (SEM), Directed Acyclic Graphs (DAG) and Potential Outcomes (PO) theories, do not provide a coherent theory of causality, and it argues that such a theory is impossible on purely statistical grounds. It also discusses a new approach in which the concept of causality is replaced by the concept of dependent variable generation. The paper proposes separating the variables that generate the outcome from those merely correlated with it, a separation that often also distinguishes causal from non-causal variables.


Dependency modeling · Statistical inference · Causality modeling · Counterfactual statements · Statistical learning · Intrinsic probability · Generative statistical mechanisms



The study of causality was supported by Telmar Inc., and some of the results were incorporated into its software. The author sincerely thanks I. Lipkovich and S. Lipovetsky for numerous fruitful discussions and B. Mirkin for very meaningful comments and suggestions.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Telmar Inc., New York, USA
