Soft Computing, Volume 20, Issue 11, pp 4575–4588

Multiple-cause discovery combined with structure learning for high-dimensional discrete data and application to stock prediction

  • Weiqi Chen
  • Zhifeng Hao
  • Ruichu Cai
  • Xiangzhou Zhang
  • Yong Hu
  • Mei Liu
Methodologies and Application


Causal discovery from observational data is crucial to a variety of scientific and business research. Although many causal discovery algorithms have been proposed in recent decades, none of them is effective enough in dealing with high-dimensional discrete data. The main challenge is the complex interactions among a large number of variables, which lead to numerous spurious causal relations being found. In this work, we propose a novel multiple-cause discovery method combined with structure learning (McDSL) to eliminate these spurious causalities. The method is carried out in two phases. In the first phase, conditional independence tests are used to distinguish direct causal candidates from indirect ones. In the second phase, the causal direction of each multi-cause structure is determined with a hybrid causal discovery method. Validation experiments on synthetic data showed that McDSL is reliable in discovering multi-cause structures and eliminating indirect causes. We then applied the algorithm to discover multiple causes of stock return based on 13 years of historical financial data from the Shanghai Stock Exchange of China, and established a stock prediction model. Experimental results showed that the causes discovered by McDSL revealed changes in the key risk factors of the stock market over the 13 years, which indicates that investors should change their investment strategies over time. Moreover, the causes discovered by McDSL perform better in predicting stock return than the features selected by other common filter-based feature selection algorithms.
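The first phase described above relies on conditional independence testing for discrete variables. The sketch below is not the McDSL implementation; it is a minimal illustration, under stated assumptions, of how conditioning can separate a direct causal candidate from an indirect one. It uses a plug-in estimate of conditional mutual information as the dependence measure, which is an assumption made here for illustration (the abstract does not name the test). The function name, the variable names X, Z, Y, and the toy data (a chain X → Z → Y with exactly factorized counts) are all hypothetical.

```python
from collections import Counter
from math import log


def conditional_mutual_information(rows, x, y, z=()):
    """Plug-in estimate of I(X; Y | Z) in nats from a list of dict rows.

    Illustrative helper, not part of McDSL. `z` is a tuple of
    conditioning variable names; an empty tuple gives plain I(X; Y).
    """
    n = len(rows)
    zkey = lambda r: tuple(r[k] for k in z)
    n_xyz = Counter((r[x], r[y], zkey(r)) for r in rows)
    n_xz = Counter((r[x], zkey(r)) for r in rows)
    n_yz = Counter((r[y], zkey(r)) for r in rows)
    n_z = Counter(zkey(r) for r in rows)
    cmi = 0.0
    for (xv, yv, zv), c in n_xyz.items():
        # p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]; the 1/n factors cancel
        cmi += (c / n) * log(c * n_z[zv] / (n_xz[(xv, zv)] * n_yz[(yv, zv)]))
    return cmi


# Toy data for the chain X -> Z -> Y (assumed structure, hand-built counts):
# Z copies X with weight 3:1, Y copies Z with weight 3:1, 32 rows in total.
rows = []
for xv in (0, 1):
    for zv in (0, 1):
        for yv in (0, 1):
            count = (3 if zv == xv else 1) * (3 if yv == zv else 1)
            rows += [{"X": xv, "Z": zv, "Y": yv}] * count

marginal_xy = conditional_mutual_information(rows, "X", "Y")         # X and Y look dependent
given_z_xy = conditional_mutual_information(rows, "X", "Y", ("Z",))  # dependence vanishes given Z
given_x_zy = conditional_mutual_information(rows, "Z", "Y", ("X",))  # Z stays dependent given X

print(f"I(X;Y)   = {marginal_xy:.4f}")
print(f"I(X;Y|Z) = {given_z_xy:.4f}")
print(f"I(Z;Y|X) = {given_x_zy:.4f}")
```

In this constructed example X is associated with Y marginally but becomes independent of Y once Z is conditioned on, so X would be discarded as an indirect cause while Z remains a direct candidate. On real, finite samples the estimate is never exactly zero; one common choice is to convert it to the G statistic, G = 2·N·I(X;Y|Z), and compare it against a chi-square distribution with the appropriate degrees of freedom.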


Causal discovery; High-dimensional discrete data; Structure learning; Additive noise model; Stock prediction



This research was partly supported by the National Natural Science Foundation of China (71271061, 70801020), Science and Technology Planning Project of Guangdong Province, China (2010B010600034, 2012B091100192), Guangdong Natural Science Foundation Research Team (S2013030015737), and Business Intelligence Key Team of Guangdong University of Foreign Studies (TD1202).



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Weiqi Chen (1)
  • Zhifeng Hao (2)
  • Ruichu Cai (2)
  • Xiangzhou Zhang (3, 4)
  • Yong Hu (3, 4)
  • Mei Liu (4, 5)

  1. Faculty of Automation, Guangdong University of Technology, Guangzhou, China
  2. Department of Computer Science, Guangdong University of Technology, Guangzhou, China
  3. School of Business, Sun Yat-sen University, Guangzhou, China
  4. Big Data Decision Institute, Jinan University, Guangzhou, China
  5. University of Kansas Medical Center, Kansas City, USA
