Abstract
Persistent efforts continue to produce more accurate decision forest building techniques. In this paper, we propose a new decision forest building technique called "Forest by Continuously Excluding Root Node (Forest CERN)". The key feature of the proposed technique is that it strives to exclude attributes that participated in the root nodes of previous trees by imposing penalties on them, preventing those attributes from appearing in some subsequent trees. Penalties are gradually lifted so that the attributes can reappear after a while. In addition, our technique uses bootstrap samples to generate a predefined number of trees. The goal of the proposed algorithm is to maximize tree diversity without impeding individual tree accuracy. We present elaborate experimental results involving fifteen widely used data sets from the UCI Machine Learning Repository. The results indicate the effectiveness of the proposed technique in most cases.
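The penalty-and-recovery idea described in the abstract can be illustrated with a small sketch. This is a hedged toy, not the authors' implementation: the function name, the fixed per-attribute merit table, and the `penalty`/`recovery` constants are all assumptions chosen for illustration. A real forest would recompute split merits on each bootstrap sample; here merits are held static so the root-selection dynamics are easy to follow.

```python
def build_forest_roots(merits, n_trees, penalty=0.1, recovery=2.0):
    """Return the sequence of root attributes chosen across n_trees trees.

    merits   -- dict mapping attribute name -> (static, illustrative) split merit
    penalty  -- weight assigned to an attribute right after it serves as a root
    recovery -- multiplicative lift applied to all weights each round (capped at 1.0)
    """
    weights = {a: 1.0 for a in merits}  # 1.0 means "no penalty"
    roots = []
    for _ in range(n_trees):
        # choose the root by merit scaled with the current penalty weight
        root = max(merits, key=lambda a: merits[a] * weights[a])
        roots.append(root)
        # gradually lift existing penalties so attributes can reappear later...
        for a in weights:
            weights[a] = min(1.0, weights[a] * recovery)
        # ...then penalise the attribute that just served as a root
        weights[root] = penalty
    return roots
```

With merits `{'a': 0.9, 'b': 0.8, 'c': 0.5}` and four trees, the roots come out as `['a', 'b', 'c', 'a']`: the strongest attribute wins first, is pushed aside for two rounds while weaker attributes take the root, and then reappears once its penalty has recovered, which is the diversity-without-sacrificing-accuracy behaviour the abstract describes.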
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Adnan, M.N., Islam, M.Z. (2016). Forest CERN: A New Decision Forest Building Technique. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science(), vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_25
DOI: https://doi.org/10.1007/978-3-319-31753-3_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science (R0)