A new causal discovery heuristic



Probabilistic methods for causal discovery work by detecting patterns of correlation between variables. They are grounded in statistical theory and have revolutionised the study of causality. However, when correlation itself is unreliable, so are probabilistic methods: unusual data can lead to spurious causal links, while nonmonotonic functional relationships between variables can prevent the detection of causal links. We describe a new heuristic method for inferring causality between two continuous variables, based on randomness and unimodality tests and making few assumptions about the data. We evaluate the method against probabilistic and additive noise algorithms on real and artificial datasets, and show that it performs competitively.
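The claim that nonmonotonic relationships can hide a dependence from correlation-based methods can be illustrated with a small sketch. This is not the heuristic proposed in the paper; it only shows the failure mode that motivates it. The function name `pearson`, the grid `xs`, and the two example relationships (a cubic and a square) are assumptions chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative data (an assumption, not from the paper): a grid symmetric about zero.
xs = [i / 50 for i in range(-50, 51)]

# Monotonic dependence: y is fully determined by x and rises with it.
ys_monotonic = [x ** 3 for x in xs]

# Nonmonotonic dependence: y is still fully determined by x,
# but falls and then rises, so the linear trend cancels out.
ys_nonmonotonic = [x ** 2 for x in xs]

r_monotonic = pearson(xs, ys_monotonic)
r_nonmonotonic = pearson(xs, ys_nonmonotonic)

print(f"monotonic    y = x^3: r = {r_monotonic:.3f}")
print(f"nonmonotonic y = x^2: r = {r_nonmonotonic:.3f}")
```

In both cases y is a deterministic function of x, yet the correlation is strong only for the monotonic relationship and is essentially zero for the symmetric one. A method that relies on correlation alone would therefore be likely to miss the second link.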


Causality, Randomness, Unimodality

Mathematics Subject Classification (2010)

62 00 68 





Our research was aided by the availability of benchmarks in the UCI Machine Learning Repository [19] and the Cause Effect Pairs collection of [21]. This work was supported in part by Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.


  1. Basu, S., DasGupta, A.: The mean, median, and mode of unimodal distributions: a characterization. Theory Probab. Appl. 41(2), 210–223 (1997)
  2. Black, S.E.: Do better schools matter? Parental valuation of elementary education. Q. J. Econ. 114(2), 577–599 (1999)
  3. Buehlmann, P., Peters, J., Ernest, J.: CAM: causal additive models, high-dimensional order search and penalized regression. Ann. Stat. 42, 2526–2556 (2014)
  4. Bunge, M.: Causality and Modern Science. Transaction Publishers (2009)
  5. Chay, K.Y., Greenstone, M.: Does air quality matter? Evidence from the housing market. J. Polit. Econ. 113(2), 376–424 (2005)
  6. Chiodo, A.J., Hernandez-Murillo, R., Owyang, M.T.: Nonlinear effects of school quality on house prices. Federal Reserve Bank St. Louis Rev. 92(6), 185–204 (2010)
  7. Cooper, G.F.: The computational complexity of probabilistic inference using Bayesian belief networks. Artif. Intell. 42(2-3), 393–405 (1990)
  8. Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J.: Modeling wine preferences by data mining from physicochemical properties. Decis. Support Syst. 47(4), 547–553 (2009)
  9. Currie, J., Davis, L., Greenstone, M., Walker, R.: Environmental health risks and housing values: evidence from 1,600 toxic plant openings and closings. Am. Econ. Rev. 105(2), 678–709 (2015)
  10. Daniušis, P., Janzing, D., Mooij, J.M., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.: Inferring deterministic causal relations. In: Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pp. 143–150 (2010)
  11. Fukumizu, K., Gretton, A., Sun, X., Schoelkopf, B.: Kernel measures of conditional dependence. In: Proceedings of the 20th International Conference on Advances in Neural Information Processing Systems, pp. 489–496. MIT Press (2007)
  12. Granger, C.W.: Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37(3), 424–438 (1969)
  13. Guyon, I., Aliferis, C., Elisseeff, A.: Causal feature selection. In: Liu, H., Motoda, H. (eds.) Computational Methods of Feature Selection. Chapman and Hall/CRC (2007)
  14. Harrison, D., Rubinfeld, D.L.: Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5, 81–102 (1978)
  15. Hoover, K.D.: Nonstationary time series, cointegration, and the principle of the common cause. Brit. J. Phil. Sci. 54, 527–551 (2003)
  16. Hoyer, P.O., Janzing, D., Mooij, J.M., Peters, J., Schölkopf, B.: Nonlinear causal discovery with additive noise models. Adv. Neural Inf. Process. Syst. 21, 689–696 (2009)
  17. Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., Steudel, B., Schölkopf, B.: Information-geometric approach to inferring causal directions. Artif. Intell. 182–183, 1–31 (2012)
  18. Kalisch, M., Maechler, M., Colombo, D., Maathuis, M.H., Buehlmann, P.: Causal inference using graphical models with the R package pcalg. J. Stat. Softw. 47(11) (2012)
  19. Lichman, M.: UCI machine learning repository. http://archive.ics.uci.edu/ml (2013)
  20. Margaritis, D.: Distribution-free learning of Bayesian network structure in continuous domains. In: Proceedings of the 20th National Conference on Artificial Intelligence AAAI, pp. 825–830 (2005)
  21. Mooij, J.M., Janzing, D., Zscheischler, J., Schölkopf, B.: CauseEffectPairs repository. http://webdav.tuebingen.mpg.de/causality/ (2014)
  22. Mooij, J.M., Peters, J., Janzing, D., Zscheischler, J., Schölkopf, B.: Distinguishing cause from effect using observational data: methods and benchmarks. Technical Report arXiv:1412.3773v1, Max-Planck-Institute for Intelligent Systems at Tuebingen (2014)
  23. Parr, R., Mackay, J.: Secrets of the Sommeliers: How to Think and Drink Like the World’s Top Wine Professionals. Penguin Random House (2010)
  24. Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press (2000)
  25. Peters, J., Ernest, J.: CAM: Causal Additive Model (CAM). R package version 1.0, http://CRAN.R-project.org/package=CAM (2015)
  26. Prestwich, S.D., Tarim, S.A., Ozkan, I.: Causal discovery by randomness test. In: Proceedings of the 14th International Symposium on Artificial Intelligence and Mathematics (2016)
  27. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/ (2016)
  28. Redmond, M.A., Baveja, A.: A data-driven software tool for enabling cooperative information sharing among police departments. Eur. J. Oper. Res. 141, 660–678 (2002)
  29. Reichenbach, H.: The Direction of Time. University of California Press, Berkeley (1956)
  30. Reiss, J.: Causation, Evidence, and Inference. Routledge (2015)
  31. Salkind, N.J., Rasmussen, K.: Encyclopedia of Measurement and Statistics. SAGE Publications Inc (2007)
  32. Shimizu, S., Hoyer, P.O., Hyvarinen, A., Kerminen, A.J.: A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 7, 2003–2030 (2006)
  33. Smith, V.K., Huang, J.C.: Hedonic models and air pollution: twenty-five years and counting. Environ. Resour. Econ. 3, 381–394 (1993)
  34. Sober, E.: Venetian sea levels, British bread prices, and the principle of the common cause. Brit. J. Phil. Sci. 52, 331–346 (2001)
  35. Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction and Search. MIT Press, Cambridge (2000)
  36. Ben Taieb, S., Hyndman, R.J.: A gradient boosting approach to the Kaggle load forecasting competition. Int. J. Forecast. 30(2), 382–394 (2014)
  37. Wald, A., Wolfowitz, J.: On a test whether two samples are from the same population. Ann. Math. Statist. 11, 147–162 (1940)
  38. You, J.: DARPA sets out to automate research. Science 347(6221), 465 (2015)
  39. Yule, G.: Why do we sometimes get nonsense-correlations between time series? J. R. Stat. Soc. 89, 1–64 (1926)
  40. Zahirovic-Herbert, V., Turnbull, G.K.: School quality, house prices and liquidity. J. Real Estate Financ. Econ. 37(2), 113–130 (2008)
  41. Zhang, K., Hyvarinen, A.: On the identifiability of the post-nonlinear causal model. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pp. 647–655 (2009)
  42. Zhang, K., Peters, J., Janzing, D., Schoelkopf, B.: Kernel-based conditional independence test and application in causal discovery. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (2011)

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Insight Centre for Data Analytics, Department of Computer Science, University College Cork, Cork, Ireland
  2. Department of Management, Cankaya University, Ankara, Turkey
  3. Department of Economics, Hacettepe University, Ankara, Turkey
