
Semi-causal decision trees

Abstract

Classification algorithms typically rely on correlation analysis to make decisions. However, these decisions, and the models behind them, are not easily understood by the typical user. Causal discovery is the field that studies how to find causal relationships in observational data. Although highly interpretable, causal discovery algorithms tend to perform poorly in classification problems. This paper proposes a hybrid decision tree approach (SC tree) that combines causal discovery with correlation analysis through a custom metric for splitting the data during tree construction (semi-causal gain ratio). In experiments on ten binary data sets, the proposed methodology achieved a mean error rate of 11.26%, a significant improvement over the causal baselines CDT-PS (23.67%) and CDT-SPS (25.14%), and close to the 10.20% obtained by J48, used as a correlation baseline. In addition, when compared with PC on discrete data sets, the proposed approach obtained a substantial improvement (16.17% against 28.07% in terms of mean error rate).
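For readers unfamiliar with the correlation side of the split metric, the following is a minimal sketch of the standard gain ratio used by C4.5-style trees such as J48. It is not the semi-causal gain ratio proposed in the paper, whose definition is not given in this abstract; the function names `entropy` and `gain_ratio` are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Standard C4.5 gain ratio of a discrete feature with respect to the class.

    Illustrative only: this is the plain correlation-based metric,
    not the semi-causal variant described in the paper.
    """
    n = len(labels)
    # Group the class labels by feature value.
    groups = {}
    for v, y in zip(feature, labels):
        groups.setdefault(v, []).append(y)
    # Information gain = class entropy minus weighted entropy after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalises features with many distinct values.
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0
```

A feature that perfectly separates two balanced classes gets a gain ratio of 1, while a feature that is independent of the class, or constant, gets 0.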

Notes

  1. https://www.nationalgeographic.com/science/phenomena/2015/09/11/nick-cage-movies-vs-drownings-and-more-strange-but-spurious-correlations/.

  2. “For each of the separate levels of the co-variable set \(h = 1, 2, \ldots, q\), the response variable is distributed at random with respect to the sub-populations, i.e. the data in the respective rows of the \(h\)th table can be regarded as a successive set of simple random samples of sizes \(\{N_{hi\cdot}\}\) from a fixed population corresponding to the marginal total distribution of the response variable \(\{N_{h\cdot j}\}\)” [10].

  3. We used the WEKA jar file kindly provided by the authors to compare with our methodology.

  4. We used the WEKA implementation.

  5. https://cran.r-project.org/web/packages/pcalg/index.html.

References

  1. Agresti, A.: An Introduction to Categorical Data Analysis. Wiley, New York (2018)

  2. Birch, M.: The detection of partial association, I: the \(2 \times 2\) case. J. Roy. Stat. Soc.: Ser. B (Methodol.) 26(2), 313–324 (1964)

  3. Cochran, W.G.: Some methods for strengthening the common \(\chi^2\) tests. Biometrics 10(4), 417–451 (1954)

  4. DeFries, R., Agarwala, M., Baquie, S., Choksi, P., Khanwilkar, S., Mondal, P., Nagendra, H., Uperlainen, J.: Improved household living standards can restore dry tropical forests. Biotropica (2021)

  5. Domingos, P.M.: The role of Occam’s razor in knowledge discovery. Data Min. Knowl. Discov. 3(4), 409–425 (1999). https://doi.org/10.1023/A:1009868929893

  6. Glymour, C., Zhang, K., Spirtes, P.: Review of causal discovery methods based on graphical models. Front. Genet. (2019). https://doi.org/10.3389/fgene.2019.00524

  7. Guo, R., Cheng, L., Li, J., Hahn, P.R., Liu, H.: A survey of learning causality with data: problems and methods. ACM Comput. Surv. (2020). https://doi.org/10.1145/3397269

  8. Jin, Z., Li, J., Liu, L., Le, T.D., Sun, B., Wang, R.: Discovery of causal rules using partial association. In: Proceedings IEEE International Conference on Data Mining, ICDM, pp. 309–318 (2012). https://doi.org/10.1109/ICDM.2012.36

  9. Kent, J.T.: Information gain and a general measure of correlation. Biometrika 70(1), 163–173 (1983). https://doi.org/10.1093/biomet/70.1.163

  10. Landis, J.R., Heyman, E.R., Koch, G.G.: Average partial association in three-way contingency tables: a review and discussion of alternative tests. Int. Stat. Rev. 46(3), 237 (2006). https://doi.org/10.2307/1402373

  11. Li, F., Gao, L., Ma, X., Yang, X.: Detection of driver pathways using mutated gene network in cancer. Mol. BioSyst. 12, 2135–2141 (2016). https://doi.org/10.1039/C6MB00084C

  12. Li, J., Ma, S., Le, T., Liu, L., Liu, J.: Causal decision trees. IEEE Trans. Knowl. Data Eng. 29(2), 257–271 (2017). https://doi.org/10.1109/TKDE.2016.2619350

  13. Luma-Osmani, S., Ismaili, F., Zenuni, X., Raufi, B.: A systematic literature review in causal association rules mining. In: 2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), pp. 0048–0054 (2020). https://doi.org/10.1109/IEMCON51383.2020.9284908

  14. Ma, S., Statnikov, A.: Methods for computational causal discovery in biomedicine. Behaviormetrika 44(1), 165–191 (2017). https://doi.org/10.1007/s41237-016-0013-5

  15. Mantas, C.J., Abellán, J.: Credal-C4.5: decision tree based on imprecise probabilities to classify noisy data. Expert Syst. Appl. 41(10), 4625–4637 (2014). https://doi.org/10.1016/j.eswa.2014.01.017. http://www.sciencedirect.com/science/article/pii/S0957417414000384

  16. Marx, A., Vreeken, J.: Testing conditional independence on discrete data using stochastic complexity. arXiv preprint arXiv:1903.04829 (2019)

  17. Mooij, J.M., Cremers, J., et al.: An empirical study of one of the simplest causal prediction algorithms. In: UAI 2015 Workshop on Advances in Causal Inference, 1504, pp. 30–39 (2015)

  18. Pearl, J., Verma, T.S.: A theory of inferred causation. In: Studies in Logic and the Foundations of Mathematics, vol. 134, pp. 789–811. Elsevier (1995)

  19. Piltaver, R., Luštrek, M., Gams, M., Martinšić-Ipšić, S.: What makes classification trees comprehensible? Expert Syst. Appl. 62, 333–346 (2016)

  20. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986). https://doi.org/10.1007/BF00116251

  21. Samothrakis, S., Perez, D., Lucas, S.: Training Gradient Boosting Machines Using Curve-Fitting and Information-Theoretic Features for Causal Direction Detection, pp. 331–338. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-21810-2_11

  22. Spirtes, P., Glymour, C.N., Scheines, R., Heckerman, D.: Causation, Prediction, and Search. MIT Press (2000)

  23. Tangirala, S.: Evaluating the impact of Gini index and information gain on classification using decision tree classifier algorithm. Int. J. Adv. Comput. Sci. Appl. 11(2), 612–619 (2020)

  24. Theil, H.: Statistical decomposition analysis; with applications in the social and administrative sciences. Tech. rep. (1972)

  25. Verma, T.S., Pearl, J.: On the equivalence of causal models. arXiv preprint arXiv:1304.1108 (2013)

  26. Yu, K., Li, J., Liu, L.: A Review on Algorithms for Constraint-based Causal Discovery (2016)

  27. Zhang, W., Wang, S.L.: An integrated framework for identifying mutated driver pathway and cancer progression. IEEE/ACM Trans. Comput. Biol. Bioinf. 16(2), 455–464 (2019). https://doi.org/10.1109/TCBB.2017.2788016

  28. Zhang, X., Baral, C., Kim, S.: An algorithm to learn causal relations between genes from steady state data: simulation and its application to melanoma dataset. In: Miksch, S., Hunter, J., Keravnou, E.T. (eds.) Artificial Intelligence in Medicine, pp. 524–534. Springer, Berlin (2005)

  29. Zhou, Q., Liao, F., Mou, C., Wang, P.: Measuring interpretability for different types of machine learning models. In: Ganji, M., Rashidi, L., Fung, B.C.M., Wang, C. (eds.) Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2018 Workshops, BDASC, BDM, ML4Cyber, PAISI, DaMEMO, Melbourne, VIC, Australia, June 3, 2018, Revised Selected Papers, Lecture Notes in Computer Science, vol. 11154, pp. 295–308. Springer (2018). https://doi.org/10.1007/978-3-030-04503-6_29

Acknowledgements

This research was carried out in the context of the project FailStopper (DSAIPA/DS/0086/2018) and supported by the Fundação para a Ciência e Tecnologia (FCT), Portugal for the PhD Grant SFRH/BD/146197/2019.

Author information

Corresponding author

Correspondence to Ana Rita Nogueira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nogueira, A.R., Ferreira, C.A. & Gama, J. Semi-causal decision trees. Prog Artif Intell (2021). https://doi.org/10.1007/s13748-021-00262-2

Keywords

  • Causal discovery
  • Correlation
  • Decision tree
  • Semi-causal information gain
  • Semi-causal gain ratio
  • Uncertainty coefficient