Learning Bivariate Functional Causal Models

  • Olivier Goudet
  • Diviyan Kalainathan
  • Michèle Sebag
  • Isabelle Guyon
Part of The Springer Series on Challenges in Machine Learning book series (SSCML)


Finding the causal direction in the cause-effect pair problem has been addressed in the literature by comparing two alternative generative models, X → Y and Y → X. In this chapter, we first define what is meant by generative modeling and state the main assumptions usually invoked in the literature in this bivariate setting. We then present the theoretical identifiability problem that arises when considering a causal graph with only two variables. This leads us to the general ideas used in the literature to perform model selection based on a complexity/fit trade-off. Three main families of methods can be identified: methods making restrictive assumptions on the class of admissible causal mechanisms, methods computing a smooth trade-off between fit and complexity, and methods exploiting independence between cause and mechanism.
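As a rough illustration of what comparing the two candidate models X → Y and Y → X can look like in practice, the sketch below fits a regression in each direction and prefers the direction whose residuals look less dependent on the putative cause. It is only a minimal sketch under stated assumptions, not the procedure of any particular method covered in the chapter: it assumes an additive noise mechanism, uses a cubic polynomial as the regressor, and scores dependence with a biased HSIC estimate using Gaussian kernels and a median-heuristic bandwidth. The function names (`gaussian_gram`, `hsic`, `residual_dependence`, `infer_direction`) and the synthetic data are hypothetical and introduced here for illustration only.

```python
import numpy as np


def gaussian_gram(v):
    """Gaussian-kernel Gram matrix of a 1-D sample (median-heuristic bandwidth)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))


def hsic(a, b):
    """Biased empirical HSIC between two 1-D samples (larger = more dependent)."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K, L = gaussian_gram(a), gaussian_gram(b)
    return np.trace(K @ H @ L @ H) / n**2


def residual_dependence(cause, effect, degree=3):
    """Fit effect = f(cause) + noise with a polynomial; score noise/cause dependence."""
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


def infer_direction(x, y, degree=3):
    """Prefer the direction whose residuals are least dependent on the input."""
    x = (x - x.mean()) / x.std()  # standardize so both directions are comparable
    y = (y - y.mean()) / y.std()
    score_xy = residual_dependence(x, y, degree)
    score_yx = residual_dependence(y, x, degree)
    direction = "X->Y" if score_xy < score_yx else "Y->X"
    return direction, score_xy, score_yx


# Synthetic pair (an assumption for this example): X causes Y through a
# nonlinear mechanism with additive uniform noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 400)
y = x**3 + x + rng.uniform(-1, 1, 400)

direction, s_xy, s_yx = infer_direction(x, y)
print(direction, s_xy, s_yx)
```

The restrictive assumption here (additive noise with a low-degree polynomial mechanism) is what makes the two directions distinguishable in this toy case; with a linear mechanism and Gaussian noise the two scores would be expected to be close, which is the identifiability problem the chapter discusses.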


Cause-effect pairs, Causal discovery, Causal modeling, Identifiability, Causal mechanisms



The authors would like to thank Daniel Rolland for proofreading this document, as well as the reviewers for their constructive feedback.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Olivier Goudet (1)
  • Diviyan Kalainathan (1)
  • Michèle Sebag (1)
  • Isabelle Guyon (2, 3)
  1. Team TAU - CNRS, INRIA, Université Paris Sud, Université Paris Saclay, Orsay, France
  2. Team TAU - CNRS, INRIA, Université Paris Sud, Université Paris Saclay, Orsay, France
  3. ChaLearn, Berkeley, USA
