Abstract
In this paper, we examine ensemble algorithms (Boosting Lite and Ivoting) that provide accuracy approximating that of a single classifier while requiring significantly fewer training examples. Such algorithms allow ensemble methods to operate on very large data sets or with very slow learning algorithms. Boosting Lite is compared with Ivoting, standard boosting, and building a single classifier, on 11 data sets to which other approaches have been applied. We find that ensembles of support vector machines can attain higher accuracy with less data than ensembles of decision trees. We also find that Ivoting may produce more accurate ensembles on some data sets; however, Boosting Lite is generally able to indicate when boosting will increase overall accuracy.
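As a rough illustration of the idea these methods share, the sketch below trains each boosted ensemble member on a small sample (a "bite") drawn from the current boosting distribution instead of the full training set. This is a minimal, hypothetical sketch, not the paper's exact procedure: the stump weak learner, the bite_size parameter, and the plain AdaBoost.M1-style weight update are our assumptions for illustration only.

```python
# Hypothetical sketch of boosting on small "bites" of a large data set.
# Assumptions (not from the paper): decision stumps as weak learners,
# bite_size samples per round, standard AdaBoost.M1-style reweighting.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_on_bites(X, y, n_rounds=10, bite_size=500, seed=None):
    """X, y are numpy arrays; returns (models, alphas)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    w = np.full(n, 1.0 / n)              # example-weight distribution
    models, alphas = [], []
    for _ in range(n_rounds):
        # Draw a small "bite" according to the current boosting weights,
        # so each round trains on far fewer examples than the full set.
        idx = rng.choice(n, size=min(bite_size, n), replace=True, p=w)
        clf = DecisionTreeClassifier(max_depth=1)  # stump (our choice)
        clf.fit(X[idx], y[idx])
        pred = clf.predict(X)
        err = np.sum(w[pred != y])
        if err >= 0.5:                    # no better than chance: stop
            break
        err = max(err, 1e-10)             # guard against log(0)
        alpha = 0.5 * np.log((1.0 - err) / err)
        # Misclassified examples gain weight, so the next bite
        # oversamples the hard cases.
        w *= np.exp(alpha * np.where(pred != y, 1.0, -1.0))
        w /= w.sum()
        models.append(clf)
        alphas.append(alpha)
    return models, alphas

def ensemble_predict(models, alphas, X, classes):
    """Weighted vote of the ensemble over the given class labels."""
    votes = np.zeros((len(X), len(classes)))
    for clf, a in zip(models, alphas):
        pred = clf.predict(X)
        for j, c in enumerate(classes):
            votes[pred == c, j] += a
    return np.asarray(classes)[votes.argmax(axis=1)]
```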
Cite this paper
Hall, L.O., Banfield, R.E., Bowyer, K.W., Kegelmeyer, W.P. (2007). Boosting Lite – Handling Larger Datasets and Slower Base Classifiers. In: Haindl, M., Kittler, J., Roli, F. (eds.) Multiple Classifier Systems. MCS 2007. Lecture Notes in Computer Science, vol. 4472. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72523-7_17
Print ISBN: 978-3-540-72481-0
Online ISBN: 978-3-540-72523-7
© 2007 Springer Berlin Heidelberg