
Forest CERN: A New Decision Forest Building Technique

  • Conference paper

Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9651)


Abstract

Persistent efforts are under way to develop more accurate decision forest building techniques. In this paper, we propose a new decision forest building technique called “Forest by Continuously Excluding Root Node (Forest CERN)”. The key feature of the proposed technique is that it penalizes attributes that appeared in the root nodes of previous trees, preventing them from appearing in a number of subsequent trees. Penalties are gradually lifted so that those attributes can reappear after a while. In addition, our technique uses bootstrap samples to generate a predefined number of trees. The goal of the proposed algorithm is to maximize tree diversity without impeding individual tree accuracy. We present elaborate experimental results involving fifteen widely used data sets from the UCI Machine Learning Repository. The results indicate the effectiveness of the proposed technique in most cases.
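The abstract specifies the mechanism only at a high level, so the following is an illustrative sketch of the root-exclusion idea, not the authors' implementation. It uses one-level trees (decision stumps) so that the split attribute is exactly the root attribute, assumes non-negative integer class labels, and introduces a hypothetical `penalty` parameter giving the number of subsequent trees from which a root attribute is excluded; the paper itself builds full trees from bootstrap samples and may apply softer, weight-based penalties.

```python
import numpy as np

def best_split(X, y, features):
    """Exhaustively pick the (feature, threshold) pair with the lowest
    weighted Gini impurity, considering only the `features` given."""
    best_f, best_t, best_gini = None, None, np.inf
    for f in features:
        for t in np.unique(X[:, f])[:-1]:          # candidate thresholds
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            gini = sum(len(s) / len(y) * (1.0 - ((np.bincount(s) / len(s)) ** 2).sum())
                       for s in (left, right) if len(s))
            if gini < best_gini:
                best_f, best_t, best_gini = f, t, gini
    return best_f, best_t

def forest_cern(X, y, n_trees=10, penalty=2, seed=0):
    """Sketch of the root-exclusion idea: an attribute chosen as a root
    is barred from the next `penalty` trees, after which its penalty lapses."""
    rng = np.random.default_rng(seed)
    cooldown = np.zeros(X.shape[1], dtype=int)     # trees each attribute must sit out
    forest = []
    for _ in range(n_trees):
        allowed = np.flatnonzero(cooldown == 0)
        if allowed.size == 0:                      # safety net: lift all penalties
            cooldown[:] = 0
            allowed = np.arange(X.shape[1])
        idx = rng.integers(0, len(y), len(y))      # bootstrap sample
        Xb, yb = X[idx], y[idx]
        f, t = best_split(Xb, yb, allowed)
        left = np.bincount(yb[Xb[:, f] <= t]).argmax()   # majority class per branch
        right = np.bincount(yb[Xb[:, f] > t]).argmax()
        forest.append((f, t, left, right))
        cooldown = np.maximum(cooldown - 1, 0)     # penalties are gradually lifted
        cooldown[f] = penalty                      # penalise the new root attribute
    return forest

def predict(forest, X):
    """Majority vote over all stumps in the forest."""
    votes = np.array([np.where(X[:, f] <= t, l, r) for f, t, l, r in forest])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

Because `cooldown[f] = penalty` is set immediately after each tree is built, two consecutive trees can never share a root attribute (as long as the penalty is shorter than the number of attributes), which is the diversity effect the abstract describes.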


References

  1. Adnan, M.N.: On dynamic selection of subspace for random forest. In: Luo, X., Yu, J.X., Li, Z. (eds.) ADMA 2014. LNCS, vol. 8933, pp. 370–379. Springer, Heidelberg (2014)
  2. Amasyali, M.F., Ersoy, O.K.: Classifier ensembles with the extended space forest. IEEE Trans. Knowl. Data Eng. 16, 145–153 (2014)
  3. Arlot, S.: A survey of cross-validation procedures for model selection. Stat. Surv. 4, 40–79 (2010)
  4. Bernard, S., Heutte, L., Adam, S.: Forest-RK: a new random forest induction method. In: Huang, D.-S., Wunsch, D.C., Levine, D.S., Jo, K.-H. (eds.) ICIC 2008. LNCS (LNAI), vol. 5227, pp. 430–437. Springer, Heidelberg (2008)
  5. Breiman, L.: Bagging predictors. Mach. Learn. 24, 123–140 (1996)
  6. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)
  7. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth International Group, Belmont (1985)
  8. Demsar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
  9. Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Mach. Learn. 63, 3–42 (2006)
  10. Han, J., Kamber, M.: Data Mining Concepts and Techniques. Morgan Kaufmann Publishers, San Francisco (2006)
  11. Ho, T.K.: The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20, 832–844 (1998)
  12. Hu, H., Li, J., Wang, H., Daggard, G., Shi, M.: A maximally diversified multiple decision tree algorithm for microarray data classification. In: Proceedings of the Workshop on Intelligent Systems for Bioinformatics (WISB), vol. 73, pp. 35–38 (2006)
  13. Islam, M.Z., Giggins, H.: Knowledge discovery through SysFor - a systematically developed forest of multiple decision trees. In: Proceedings of the 9th Australasian Data Mining Conference (2011)
  14. Kurgan, L.A., Cios, K.J.: CAIM discretization algorithm. IEEE Trans. Knowl. Data Eng. 16, 145–153 (2004)
  15. Li, J., Liu, H.: Ensembles of cascading trees. In: Proceedings of the Third IEEE International Conference on Data Mining, pp. 585–588 (2003)
  16. Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml/datasets.html. Accessed 10 January 2015
  17. Munoz, G.M., Suarez, A.: Out-of-bag estimation of the optimal sample size in bagging. Pattern Recogn. 43, 143–152 (2010)
  18. Polikar, R.: Ensemble based systems in decision making. IEEE Circuits Syst. Mag. 6, 21–45 (2006)
  19. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo (1993)
  20. Rodriguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: a new classifier ensemble method. IEEE Trans. Pattern Anal. Mach. Intell. 28, 1619–1630 (2006)
  21. Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Pearson Education, New York (2006)
  22. Triola, M.F.: Elementary Statistics. Addison Wesley Longman Inc., Boston (2001)


Author information

Correspondence to Md. Nasim Adnan.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Adnan, M.N., Islam, M.Z. (2016). Forest CERN: A New Decision Forest Building Technique. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol. 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_25

  • DOI: https://doi.org/10.1007/978-3-319-31753-3_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31752-6

  • Online ISBN: 978-3-319-31753-3

  • eBook Packages: Computer Science, Computer Science (R0)
