Justifying Information-Geometric Causal Inference

  • Dominik Janzing
  • Bastian Steudel
  • Naji Shajarisales
  • Bernhard Schölkopf


Information-Geometric Causal Inference (IGCI) is a recent approach to distinguishing cause from effect between two variables. It rests on an independence assumption between the input distribution and the causal mechanism, which can be phrased as orthogonality in information space. We describe two intuitive reinterpretations of this approach that make IGCI more accessible to a broader audience. Moreover, we show that the described independence is related to the hypothesis that unsupervised and semi-supervised learning only help when predicting the cause from the effect, and not vice versa.
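To make the abstract concrete, the following is a minimal sketch (not taken from this chapter) of the slope-based IGCI estimator introduced by Daniušis et al. (2010): after rescaling both variables to [0, 1], the empirical mean of log |Δy/Δx| over consecutive sorted sample points estimates E[log |f′(X)|], and the direction with the smaller score is inferred as causal. Function names and implementation details here are illustrative.

```python
import numpy as np

def igci_slope_score(x, y):
    """Empirical slope-based IGCI score for the direction x -> y:
    an estimate of E[log |f'(X)|] from consecutive sorted pairs."""
    # Rescale both variables to [0, 1] (uniform reference measure).
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    # Sort by x so that consecutive differences approximate the slope of f.
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    dx, dy = np.diff(xs), np.diff(ys)
    valid = (dx != 0) & (dy != 0)  # skip ties, where the slope is undefined
    return np.mean(np.log(np.abs(dy[valid] / dx[valid])))

def igci_direction(x, y):
    """Infer the causal direction: the smaller score wins."""
    if igci_slope_score(x, y) < igci_slope_score(y, x):
        return "X->Y"
    return "Y->X"
```

For a deterministic invertible relation the two population scores have opposite signs, so e.g. a uniformly distributed cause passed through a monotonic nonlinearity such as y = x³ yields a negative score in the causal direction and a positive one in the anticausal direction.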


Keywords: Equivalence Class · Monotonic Function · True Function · Causal Direction · Statistical Learning Theory



Acknowledgments. The authors are grateful to Joris Mooij for insightful discussions.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Dominik Janzing (1)
  • Bastian Steudel (1)
  • Naji Shajarisales (1)
  • Bernhard Schölkopf (1)

  1. Max Planck Institute for Intelligent Systems, Tübingen, Germany
