Abstract
We propose a novel tree-based ensemble method named Selective Cascade of Residual ExtraTrees (SCORE). SCORE draws inspiration from representation learning, incorporates regularized regression with variable selection, and uses boosting to improve prediction and reduce generalization error. We also develop a variable importance measure to increase the explainability of SCORE. Our computer experiments show that SCORE achieves predictive performance comparable or superior to ExtraTrees, random forests, gradient boosting machines, and neural networks, and that the proposed variable importance measure is comparable to the benchmark methods studied. Finally, the predictive performance of SCORE remains stable across hyper-parameter values, suggesting robustness to hyper-parameter specification.
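The core idea the abstract describes, a cascade in which each stage of extremely randomized trees fits the residuals left by the stages before it, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the number of stages, the shrinkage factor, and the use of scikit-learn's `ExtraTreesRegressor` are all illustrative assumptions, and SCORE's selective/regularized components are omitted.

```python
# Minimal sketch of a residual cascade of ExtraTrees (illustrative only;
# omits SCORE's selection and regularization steps).
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fit_residual_cascade(X, y, n_stages=3, shrinkage=0.5, seed=0):
    """Fit each ExtraTrees stage to the residuals of the cascade so far."""
    stages = []
    residual = y.astype(float).copy()
    for s in range(n_stages):
        model = ExtraTreesRegressor(n_estimators=50, random_state=seed + s)
        model.fit(X, residual)
        residual -= shrinkage * model.predict(X)  # boosting-style update
        stages.append(model)
    return stages

def predict_cascade(stages, X, shrinkage=0.5):
    """Prediction is the shrunk sum of the stage predictions."""
    return sum(shrinkage * m.predict(X) for m in stages)

# Toy data: a nonlinear signal in the first two of five features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=200)

stages = fit_residual_cascade(X, y)
pred = predict_cascade(stages, X)
mse = float(np.mean((y - pred) ** 2))
print(f"training MSE: {mse:.4f}")
```

Each stage only has to model what the previous stages missed, which is the sense in which the cascade "boosts" the base ExtraTrees learner.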
Acknowledgements
We thank Dr. Gitta Lubke and two anonymous referees for their useful and constructive comments on the project and the manuscript.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest.
Cite this article
Liu, Q., Liu, F. Selective Cascade of Residual ExtraTrees. SN COMPUT. SCI. 1, 354 (2020). https://doi.org/10.1007/s42979-020-00358-x