Abstract
In this paper, we examine ensemble algorithms (Boosting Lite and Ivoting) that provide accuracy approximating that of a single classifier while requiring significantly fewer training examples. Such algorithms allow ensemble methods to operate on very large data sets or with very slow learning algorithms. Boosting Lite is compared with Ivoting, standard boosting, and building a single classifier, on 11 data sets to which other approaches have been applied. We find that ensembles of support vector machines can attain higher accuracy with less data than ensembles of decision trees. We also find that Ivoting may produce more accurate ensembles on some data sets; however, Boosting Lite is generally able to indicate when boosting will increase overall accuracy.
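As a rough illustration of the idea these methods share, the sketch below trains each boosted ensemble member on a small sample (a "bite") drawn from the current boosting distribution instead of the full training set. This is a minimal, hypothetical sketch, not the paper's exact procedure: the stump weak learner, the bite_size parameter, and the plain AdaBoost.M1-style weight update are our assumptions for illustration only.

```python
# Hypothetical sketch of boosting on small "bites" of a large data set.
# Assumptions (not from the paper): decision stumps as weak learners,
# bite_size samples per round, standard AdaBoost.M1-style reweighting.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_on_bites(X, y, n_rounds=10, bite_size=500, seed=None):
    """X, y are numpy arrays; returns (models, alphas)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)              # example-weight distribution
    models, alphas = [], []
    for _ in range(n_rounds):
        # Draw a small "bite" according to the current boosting weights,
        # so each round trains on far fewer examples than the full set.
        idx = rng.choice(n, size=min(bite_size, n), replace=True, p=w)
        clf = DecisionTreeClassifier(max_depth=1)  # stump (our choice)
        clf.fit(X[idx], y[idx])
        pred = clf.predict(X)
        err = np.sum(w[pred != y])
        if err >= 0.5:                    # no better than chance: stop
            break
        err = max(err, 1e-10)             # guard against log(0)
        alpha = 0.5 * np.log((1.0 - err) / err)
        # Misclassified examples gain weight, so the next bite
        # oversamples the hard cases.
        w *= np.exp(alpha * np.where(pred != y, 1.0, -1.0))
        w /= w.sum()
        models.append(clf)
        alphas.append(alpha)
    return models, alphas

def ensemble_predict(models, alphas, X, classes):
    """Weighted vote of the ensemble over the given class labels."""
    votes = np.zeros((len(X), len(classes)))
    for clf, a in zip(models, alphas):
        pred = clf.predict(X)
        for j, c in enumerate(classes):
            votes[pred == c, j] += a
    return np.asarray(classes)[votes.argmax(axis=1)]
```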
Cite this paper
Hall, L.O., Banfield, R.E., Bowyer, K.W., Kegelmeyer, W.P. (2007). Boosting Lite – Handling Larger Datasets and Slower Base Classifiers. In: Haindl, M., Kittler, J., Roli, F. (eds.) Multiple Classifier Systems. MCS 2007. Lecture Notes in Computer Science, vol. 4472. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72523-7_17
Print ISBN: 978-3-540-72481-0
Online ISBN: 978-3-540-72523-7
© 2007 Springer Berlin Heidelberg