Diversified Random Forests Using Random Subspaces

  • Khaled Fawagreh
  • Mohamed Medhat Gaber
  • Eyad Elyan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8669)

Abstract

Random Forest is an ensemble learning method used for classification and regression. In such an ensemble, multiple classifiers are used, and each classifier casts one vote for its predicted class label; majority voting then determines the class label for unlabelled instances. Since it has been shown empirically that ensembles tend to yield better results when there is significant diversity among the constituent models, many extensions have been developed over the past decade that aim to induce diversity in the constituent models in order to improve the performance of Random Forests in terms of both speed and accuracy. In this paper, we propose a method to promote Random Forest diversity by using randomly selected subspaces, assigning each subspace a weight according to its predictive power, and using this weight in majority voting. An experimental study on 15 real datasets showed favourable results, demonstrating the potential of the proposed method.
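
The abstract describes the method only in prose, so the following is a minimal illustrative sketch in Python of the general idea: train each tree on a randomly drawn feature subspace, estimate that subspace's predictive power, and use the estimate as the tree's weight in majority voting. The scikit-learn usage, the choice of out-of-bag accuracy as the measure of predictive power, and all names below are assumptions made for illustration, not the authors' exact algorithm.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

n_trees, d = 25, 2                      # ensemble size and subspace dimensionality
n, n_classes = X_tr.shape[0], len(np.unique(y))
ensemble = []

for _ in range(n_trees):
    # Draw a random feature subspace and a bootstrap sample of the training rows.
    feats = rng.choice(X_tr.shape[1], size=d, replace=False)
    rows = rng.choice(n, size=n, replace=True)
    tree = DecisionTreeClassifier(random_state=0).fit(X_tr[np.ix_(rows, feats)], y_tr[rows])
    # Estimate the subspace's predictive power on the out-of-bag rows (an
    # assumed proxy) and keep that accuracy as the tree's voting weight.
    oob = np.setdiff1d(np.arange(n), rows)
    weight = tree.score(X_tr[np.ix_(oob, feats)], y_tr[oob])
    ensemble.append((tree, feats, weight))

# Weighted majority voting: each tree adds its weight to the class it predicts.
votes = np.zeros((X_te.shape[0], n_classes))
for tree, feats, weight in ensemble:
    votes[np.arange(X_te.shape[0]), tree.predict(X_te[:, feats])] += weight
print("weighted-vote accuracy:", (votes.argmax(axis=1) == y_te).mean())

Weighting the votes this way lets trees built on informative subspaces dominate the final decision while trees built on weak subspaces still contribute, which is the intuition the abstract describes.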

Keywords

Random Forest, Class Label, Gini Index, Random Subspace, Random Subspace Method

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Khaled Fawagreh¹
  • Mohamed Medhat Gaber¹
  • Eyad Elyan¹

  1. IDEAS, School of Computing Science and Digital Media, Robert Gordon University, Aberdeen, UK
