A new causal discovery heuristic

Annals of Mathematics and Artificial Intelligence

Abstract

Probabilistic methods for causal discovery detect patterns of correlation between variables. They are grounded in statistical theory and have revolutionised the study of causality. However, when correlation itself is unreliable, so are probabilistic methods: unusual data can lead to spurious causal links, while nonmonotonic functional relationships between variables can prevent the detection of causal links. We describe a new heuristic method for inferring causality between two continuous variables, based on randomness and unimodality tests and making few assumptions about the data. We evaluate the method against probabilistic and additive noise algorithms on real and artificial datasets, and show that it performs competitively.
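
The abstract's point that a nonmonotonic functional relationship can hide a dependence from correlation can be illustrated with a small sketch. It is not the heuristic proposed in the article: the data, the helper `pearson`, and the variable names are all hypothetical, chosen only to show a fully deterministic relationship with a Pearson correlation of essentially zero.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustrative data: a symmetric grid on [-1, 1].
xs = [i / 100 for i in range(-100, 101)]

# y is fully determined by x in both cases; only the shape differs.
ys_nonmonotonic = [x ** 2 for x in xs]  # falls, then rises
ys_monotonic = [x ** 3 for x in xs]     # always rises

r_nonmonotonic = pearson(xs, ys_nonmonotonic)
r_monotonic = pearson(xs, ys_monotonic)

print(f"nonmonotonic y = x^2: r = {r_nonmonotonic:.3f}")
print(f"monotonic    y = x^3: r = {r_monotonic:.3f}")
```

On this symmetric grid the positive and negative halves of the quadratic cancel in the covariance, so a method that relies on correlation alone would see no link between the two variables, while the monotonic cubic is strongly correlated.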

References

  1. Basu, S., DasGupta, A.: The mean, median, and mode of unimodal distributions: A characterization. Theory Probab. Appl. 41(2), 210–223 (1997)

  2. Black, S.E.: Do better schools matter? Parental valuation of elementary education. Q. J. Econ. 114(2), 577–599 (1999)

  3. Buehlmann, P., Peters, J., Ernest, J.: CAM: causal additive models, high-dimensional order search and penalized regression. Ann. Stat. 42, 2526–2556 (2014)

  4. Bunge, M.: Causality and Modern Science. Transaction Publishers (2009)

  5. Chay, K.Y., Greenstone, M.: Does air quality matter? Evidence from the housing market. J. Polit. Econ. 113(2), 376–424 (2005)

  6. Chiodo, A.J., Hernandez-Murillo, R., Owyang, M.T.: Nonlinear effects of school quality on house prices. Federal Reserve Bank St. Louis Rev. 92(6), 185–204 (2010)

  7. Cooper, G.F.: The computational complexity of probabilistic inference using Bayesian belief networks. Artif. Intell. 42(2–3), 393–405 (1990)

  8. Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J.: Modeling wine preferences by data mining from physicochemical properties. Decis. Support. Syst. 47(4), 547–553 (2009)

  9. Currie, J., Davis, L., Greenstone, M., Walker, R.: Environmental health risks and housing values: evidence from 1,600 toxic plant openings and closings. Am. Econ. Rev. 105(2), 678–709 (2015)

  10. Daniušis, P., Janzing, D., Mooij, J.M., Zscheischler, J., Steudel, B., Zhang, K., Schölkopf, B.: Inferring deterministic causal relations. In: Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pp. 143–150 (2010)

  11. Fukumizu, K., Gretton, A., Sun, X., Schölkopf, B.: Kernel measures of conditional dependence. In: Proceedings of the 20th International Conference on Advances in Neural Information Processing Systems, pp. 489–496. MIT Press (2007)

  12. Granger, C.W.: Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37(3), 424–438 (1969)

  13. Guyon, I., Aliferis, C., Elisseeff, A.: Causal feature selection. In: Liu, H., Motoda, H. (eds.) Computational Methods of Feature Selection. Chapman and Hall/CRC (2007)

  14. Harrison, D., Rubinfeld, D.L.: Hedonic prices and the demand for clean air. J. Environ. Econ. Manag. 5, 81–102 (1978)

  15. Hoover, K.D.: Nonstationary time series, cointegration, and the principle of the common cause. Brit. J. Phil. Sci. 54, 527–551 (2003)

  16. Hoyer, P.O., Janzing, D., Mooij, J.M., Peters, J., Schölkopf, B.: Nonlinear causal discovery with additive noise models. Adv. Neural Inf. Process. Syst. 21, 689–696 (2009)

  17. Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., Steudel, B., Schölkopf, B.: Information-geometric approach to inferring causal directions. Artif. Intell. 182–183, 1–31 (2012)

  18. Kalisch, M., Maechler, M., Colombo, D., Maathuis, M.H., Buehlmann, P.: Causal inference using graphical models with the R package pcalg. J. Statist. Softw. 47(11) (2012)

  19. Lichman, M.: UCI machine learning repository. http://archive.ics.uci.edu/ml (2013)

  20. Margaritis, D.: Distribution-free learning of Bayesian network structure in continuous domains. In: Proceedings of the 20th National Conference on Artificial Intelligence AAAI, pp. 825–830 (2005)

  21. Mooij, J.M., Janzing, D., Zscheischler, J., Schölkopf, B.: CauseEffectPairs repository. http://webdav.tuebingen.mpg.de/causality/ (2014)

  22. Mooij, J.M., Peters, J., Janzing, D., Zscheischler, J., Schölkopf, B.: Distinguishing cause from effect using observational data: Methods and benchmarks. Technical Report arXiv:1412.3773v1, Max-Planck-Institute for Intelligent Systems at Tuebingen (2014)

  23. Parr, R., Mackay, J.: Secrets of the Sommeliers: How to Think and Drink Like the World’s Top Wine Professionals. Penguin Random House (2010)

  24. Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press (2000)

  25. Peters, J., Ernest, J.: CAM: Causal Additive Model (CAM). R package version 1.0, http://CRAN.R-project.org/package=CAM (2015)

  26. Prestwich, S.D., Tarim, S.A., Ozkan, I.: Causal discovery by randomness test. In: Proceedings of the 14th International Symposium on Artificial Intelligence and Mathematics (2016)

  27. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, http://www.R-project.org/ (2016)

  28. Redmond, M.A., Baveja, A.: A data-driven software tool for enabling cooperative information sharing among police departments. Eur. J. Oper. Res. 141, 660–678 (2002)

  29. Reichenbach, H.: The Direction of Time. University of California Press, Berkeley (1956)

  30. Reiss, J.: Causation, Evidence, and Inference. Routledge (2015)

  31. Salkind, N.J., Rasmussen, K.: Encyclopedia of Measurement and Statistics. SAGE Publications Inc (2007)

  32. Shimizu, S., Hoyer, P.O., Hyvarinen, A., Kerminen, A.J.: A linear non-gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 7, 2003–2030 (2006)

  33. Smith, V.K., Huang, J.C.: Hedonic models and air pollution: twenty-five years and counting. Environ. Resour. Econ. 3, 381–394 (1993)

  34. Sober, E.: Venetian sea levels, British bread prices, and the principle of the common cause. Brit. J. Phil. Sci. 52, 331–346 (2001)

  35. Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction and Search. MIT Press, Cambridge (2000)

  36. Ben Taieb, S., Hyndman, R.J.: A gradient boosting approach to the Kaggle load forecasting competition. Int. J. Forecast. 30(2), 382–394 (2014)

  37. Wald, A., Wolfowitz, J.: On a test whether two samples are from the same population. Ann. Math. Statist. 11, 147–162 (1940)

  38. You, J.: DARPA sets out to automate research. Science 347(6221), 465 (2015)

  39. Yule, G.: Why do we sometimes get nonsense-correlations between time series? J. R. Stat. Soc. 89, 1–64 (1926)

  40. Zahirovic-Herbert, V., Turnbull, G.K.: School quality, house prices and liquidity. J. Real Estate Financ. Econ. 37(2), 113–130 (2008)

  41. Zhang, K., Hyvarinen, A.: On the identifiability of the post-nonlinear causal model. In: Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pp. 647–655 (2009)

  42. Zhang, K., Peters, J., Janzing, D., Schölkopf, B.: Kernel-based conditional independence test and application in causal discovery. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (2011)

Acknowledgements

Our research was aided by the availability of benchmarks in the UCI Machine Learning Repository [19] and the Cause Effect Pairs collection of [21]. This work was supported in part by Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

Author information

Corresponding author

Correspondence to S. D. Prestwich.

Additional information

Supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289.

About this article

Cite this article

Prestwich, S.D., Tarim, S.A. & Ozkan, I. A new causal discovery heuristic. Ann Math Artif Intell 82, 245–259 (2018). https://doi.org/10.1007/s10472-018-9575-0

  • DOI: https://doi.org/10.1007/s10472-018-9575-0
