Soft Computing, Volume 21, Issue 24, pp 7381–7391

An efficient algorithm for large-scale causal discovery

  • Yinghan Hong
  • Zhusong Liu
  • Guizhen Mai
Methodologies and Application


Causal discovery is a fundamental problem in scientific research. Although many researchers have worked on inferring causal relationships from observational data, large-scale causal discovery remains a major challenge. This paper proposes a new approach to large-scale causal discovery based on a split-and-merge strategy. The method first splits a given dataset into small subdatasets using a graph-partitioning method and then applies an effective algorithm to infer the causal relationships within each subdataset. The causal structure of the entire dataset is obtained by combining the causal relationships inferred for the subdatasets. Experimental results show that the proposed approach is effective and scalable for large-scale causal discovery problems.
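The split-and-merge idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the partitioning method or the local inference algorithm, so this sketch assumes a precomputed undirected dependency graph, uses connected components as a stand-in for graph partitioning, and takes the local causal solver as a pluggable function. The names `split_variables`, `split_and_merge`, and `toy_local_solver` are hypothetical.

```python
from itertools import combinations


def split_variables(adjacency):
    """Split step (stand-in): group variables by connected component
    of an undirected dependency graph given as {node: set_of_neighbours}."""
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(component))
    return groups


def toy_local_solver(group, adjacency):
    """Placeholder local solver (assumption): orients each dependent pair
    alphabetically. A real solver would test causal direction from data."""
    return [(u, v) for u, v in combinations(group, 2) if v in adjacency[u]]


def split_and_merge(adjacency, infer_local):
    """Merge step: union of the directed edges found in each subproblem."""
    edges = set()
    for group in split_variables(adjacency):
        edges.update(infer_local(group, adjacency))
    return edges


if __name__ == "__main__":
    graph = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}, "e": set()}
    print(split_variables(graph))
    print(sorted(split_and_merge(graph, toy_local_solver)))
```

Because each subproblem is solved independently, the cost of the local solver depends on the size of the largest group rather than on the total number of variables, which is the intuition behind the scalability claim.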


Causation discovery; Causal network; Additive noise model



This work was supported by the Science and Technology Planning Project of Guangdong Province, China (2015A030401101, 2015B090922014), and by the National Natural Science Foundation of China (61572144).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.


Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Physics and Electronic Engineering Department, Hanshan Normal University, Chaozhou, China
  2. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
  3. School of Automation, Guangdong University of Technology, Guangzhou, China
