Results of the Cause-Effect Pair Challenge

  • Isabelle Guyon
  • Alexander Statnikov
Part of The Springer Series on Challenges in Machine Learning book series (SSCML)


We organized a challenge in causal discovery from observational data with the aim of devising a “causation coefficient” to score pairs of variables. The participants were provided with a large database of thousands of pairs of variables {X, Y} (80% semi-artificial data and 20% real data) from which samples were drawn independently (i.e., ignoring possible time dependencies). The goal was to discover whether the data support the hypothesis that Y = f(X, noise), which for the purpose of this challenge was our definition of causality (X causes Y). The participants adopted a machine learning approach, which contrasts with previously published model-based methods: they extracted numerous features of the joint empirical distribution of X and Y and built a classifier to separate pairs belonging to the class “X causes Y” from other cases (“Y causes X”; “X and Y are related but not causally”, e.g. a third variable causes both; “X and Y are independent”). The classifier was trained on examples provided by the organizers and tested on independent test data for which the truth values of the causal relationships were known only to the organizers. The participants achieved an Area under the ROC Curve (AUC) above 0.8 in the first phase, deployed on the Kaggle platform, which ran from March through September 2013 (round 1). The participants were then invited to improve code efficiency by submitting fast causation coefficients on the Codalab platform (round 2). The causation coefficients developed by the winners have been made available under open source licenses. We have made all data and code publicly available at
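The featurize-and-score idea described above can be sketched in a few lines. This is a minimal illustration under an additive-noise assumption, not any participant's actual method: if Y = f(X, noise), the residuals of regressing Y on X should be (nearly) independent of X, while the reverse regression leaves residuals that depend on Y. The function names `binned_residuals`, `dependence`, and `causation_coefficient` are hypothetical, introduced only for this sketch.

```python
import numpy as np

def binned_residuals(x, y, n_bins=20):
    """Residuals of y after a piecewise-constant (binned-mean) fit on x."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    means = np.array([y[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])
    return y - means[idx]

def dependence(a, b):
    """Crude dependence score: how strongly the spread of b varies with a."""
    a = (a - a.mean()) / a.std()
    s = np.abs(b - b.mean())
    return max(abs(np.corrcoef(a**k, s)[0, 1]) for k in (1, 2, 3))

def causation_coefficient(x, y):
    """Positive values vote for 'X causes Y', negative for 'Y causes X'."""
    r_fwd = binned_residuals(x, y)  # residuals of y given x
    r_bwd = binned_residuals(y, x)  # residuals of x given y
    # Asymmetry: residuals in the true causal direction are less
    # dependent on the cause than residuals in the reverse direction.
    return dependence(y, r_bwd) - dependence(x, r_fwd)

# Synthetic pair with known ground truth: X causes Y additively.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 5000)
y = x**3 + rng.normal(scale=1.0, size=5000)
print(f"causation coefficient: {causation_coefficient(x, y):.3f}")
```

The challenge winners went much further: they extracted hundreds of such asymmetry features and trained a classifier (typically gradient boosting) on the labeled training pairs, rather than relying on a single hand-crafted score.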


Keywords: Causal discovery · Cause-effect pairs · Benchmark · Challenge



We are very grateful to all those who contributed time to make this challenge happen. The initial impulse for this challenge was given by Joris Mooij, Dominik Janzing, and Bernhard Schölkopf, from the Max Planck Institute, who devised a cause-effect pair task that was part of the NIPS 2008 Pot-Luck challenge. Examples of algorithms and data were supplied by Povilas Daniušis, Arthur Gretton, Patrik O. Hoyer, Dominik Janzing, Antti Kerminen, Joris Mooij, Jonas Peters, Bernhard Schölkopf, Shohei Shimizu, Oliver Stegle, Kun Zhang, and Jakob Zscheischler. The datasets were prepared by Isabelle Guyon, Mehreen Saeed, Mikael Henaff, Sisi Ma, and Alexander Statnikov. The website and the sample code were prepared by Isabelle Guyon and Ben Hamner. The challenge protocol and implementation were tested and/or reviewed by Marc Boullé, Léon Bottou, Hugo Jair Escalante, Frederick Eberhardt, Seth Flaxman, Patrik Hoyer, Dominik Janzing, Richard Kennaway, Vincent Lemaire, Joris Mooij, Jonas Peters, Florin, Peter Spirtes, Ioannis Tsamardinos, Jianxin Yin, and Kun Zhang. The second round was supported by Microsoft Research. We are very grateful to Evelyne Viegas for her help and advice.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Isabelle Guyon (1, 2)
  • Alexander Statnikov (3)
  1. Team TAU - CNRS, INRIA, Université Paris Sud, Université Paris Saclay, Orsay, France
  2. ChaLearn, Berkeley, USA
  3. SoFi, San Francisco, USA
