Combining Bagging and Random Subspaces to Create Better Ensembles
Random forests are among the best-performing methods for constructing ensembles. They derive their strength from two sources: using random subsamples of the training data (as in bagging) and randomizing the algorithm that learns the base-level classifiers (decision trees). The randomized base-level algorithm selects a random subset of the features at each step of tree construction and chooses the best split among these. We propose instead to combine the concepts used in bagging and random subspaces to achieve a similar effect. The random subspace method selects a random subset of the features once, at the start, and then applies a deterministic version of the base-level algorithm; it is thus somewhat similar to the randomized version of that algorithm. The results of our experiments show that the proposed approach performs comparably to random forests, with the added advantage of being applicable to any base-level algorithm without the need to randomize it.
Keywords: Random Forest, Bootstrap Sample, Ensemble Method, Baseline Method, Vote Scheme
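As a rough sketch of the proposed combination (not the authors' exact implementation), the Python fragment below draws, for each ensemble member, a bootstrap sample of the training rows (bagging) together with a random feature subspace chosen once up front, and then fits an unmodified, deterministic base learner to that reduced view of the data. The function names, the feature_frac parameter, and the choice of a scikit-learn decision tree as the base learner are illustrative assumptions; predictions are combined by a simple plurality vote.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier  # stand-in for any deterministic base learner

def fit_subspace_bagging(X, y, n_estimators=50, feature_frac=0.75, seed=None):
    """For each ensemble member: draw a bootstrap sample of the rows (bagging)
    and a random subset of the columns (random subspaces), then fit the
    unmodified base-level algorithm to that reduced view of the data."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = max(1, int(round(feature_frac * d)))
    ensemble = []
    for _ in range(n_estimators):
        rows = rng.integers(0, n, size=n)            # bootstrap sample, with replacement
        cols = rng.choice(d, size=k, replace=False)  # feature subspace, chosen once up front
        clf = DecisionTreeClassifier(random_state=0) # fixed seed keeps the learner deterministic
        clf.fit(X[np.ix_(rows, cols)], y[rows])
        ensemble.append((clf, cols))
    return ensemble

def predict_subspace_bagging(ensemble, X):
    """Combine the members' predictions with a plurality vote."""
    member_preds = np.array([clf.predict(X[:, cols]) for clf, cols in ensemble])
    return np.array([Counter(col).most_common(1)[0][0] for col in member_preds.T])
```

Because the base learner is used as a black box, the tree could be swapped for any other deterministic algorithm (e.g., a rule learner or a nearest-neighbor classifier) without changing the ensemble code, which is the advantage the abstract claims over randomizing the base-level algorithm itself.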