Bagged Voting Ensembles
Bayesian and decision tree classifiers are among the most popular classifiers in the data mining community, and numerous researchers have recently examined their suitability as members of ensembles. Although many methods of ensemble creation have been proposed, there is as yet no clear picture of which method is best. In this work, we propose Bagged Voting, which trains on different subsets of the same training dataset while combining a Bayesian and a decision tree algorithm through a voting methodology. In our algorithm, voters express the degree of their preference by using the probabilities of the classifiers' predictions as confidence scores. All confidence values are then summed for each candidate class, and the candidate with the highest sum wins the election. We compared the presented ensemble with other ensembles that use either the Bayesian or the decision tree classifier as the base learner and obtained better accuracy in most cases.
Keywords: Base Classifier, Ensemble Method, High Error Rate, Data Mining Algorithm, Decision Tree Algorithm
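The combination rule described in the abstract, summing each base classifier's confidence scores per candidate class and electing the class with the highest total, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name and class labels are invented for the example, and each voter stands in for one base model (a Bayesian or decision tree classifier trained on a bootstrap subset).

```python
def soft_vote(voter_probs):
    """Elect a class by summed confidence (soft voting).

    voter_probs: list of dicts, one per base classifier, mapping each
    candidate class label to that classifier's predicted probability.
    Returns the label with the highest summed confidence.
    """
    totals = {}
    for probs in voter_probs:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p
    return max(totals, key=totals.get)


# Three hypothetical voters (e.g. models trained on different bootstrap
# samples) expressing graded preferences over two classes:
winner = soft_vote([
    {"yes": 0.6, "no": 0.4},
    {"yes": 0.3, "no": 0.7},
    {"yes": 0.8, "no": 0.2},
])
```

Note that the second voter prefers "no", but the election is decided by the summed confidences (1.7 vs. 1.3), not by counting majority votes, so "yes" wins; this is the distinction between the confidence-weighted voting described above and simple plurality voting.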