GRASP Forest: A New Ensemble Method for Trees

  • José F. Diez-Pastor
  • César García-Osorio
  • Juan J. Rodríguez
  • Andrés Bustillo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6713)

Abstract

This paper proposes a method for constructing ensembles of decision trees: GRASP Forest. The method uses the GRASP metaheuristic, commonly applied to optimization problems, to increase the diversity of the ensemble. While Random Forest increases diversity by randomly choosing a subset of attributes at each tree node, GRASP Forest takes all the attributes into account; the randomness of the method comes from the GRASP metaheuristic itself. Instead of choosing the best attribute from a randomly selected subset of attributes, as Random Forest does, the attribute is chosen at random from a subset of good candidate attributes. Besides attribute selection, GRASP is also used to select the split value for each numeric attribute. The method is compared with Bagging, Random Forest, Random Subspaces, AdaBoost and MultiBoost, and the results for the proposed method are very competitive.
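
To make the node-splitting idea concrete, below is a minimal Python sketch of GRASP-style split selection at a single tree node, following the description in the abstract: every attribute is considered, the candidate splits are ranked by quality, and the split is drawn at random from a restricted candidate list of good candidates instead of always taking the best one. The Gini impurity measure, the candidate-list size rcl_size, and all function names here are illustrative assumptions, not the authors' implementation.

    # Sketch of GRASP-style split selection at one tree node (illustrative only).
    import random
    from collections import Counter


    def gini(labels):
        # Gini impurity of a list of class labels.
        n = len(labels)
        if n == 0:
            return 0.0
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


    def split_quality(X, y, attribute, threshold):
        # Impurity decrease obtained by splitting `attribute` at `threshold`.
        left = [label for row, label in zip(X, y) if row[attribute] <= threshold]
        right = [label for row, label in zip(X, y) if row[attribute] > threshold]
        n = len(y)
        weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
        return gini(y) - weighted


    def grasp_choose_split(X, y, rcl_size=3, rng=random):
        # Evaluate every (attribute, threshold) pair, keep the `rcl_size` best
        # in a restricted candidate list (RCL), then pick one of them at random:
        # the ensemble's diversity comes from this random choice.
        candidates = []
        for attribute in range(len(X[0])):
            values = sorted({row[attribute] for row in X})
            thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
            for t in thresholds:
                candidates.append((split_quality(X, y, attribute, t), attribute, t))
        candidates.sort(reverse=True)
        _, attribute, threshold = rng.choice(candidates[:rcl_size])
        return attribute, threshold


    # Toy usage: two numeric attributes, binary class.
    X = [[2.0, 7.1], [1.3, 6.5], [3.8, 2.2], [4.1, 1.9], [0.9, 8.0], [3.5, 2.8]]
    y = ["a", "a", "b", "b", "a", "b"]
    print(grasp_choose_split(X, y, rcl_size=3))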

Keywords

Classifier ensembles · Bagging · Random Subspaces · Boosting · Random Forest · decision trees · GRASP

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • José F. Diez-Pastor (1)
  • César García-Osorio (1)
  • Juan J. Rodríguez (1)
  • Andrés Bustillo (1)
  1. University of Burgos, Spain
