Improving Bagging Performance through Multi-algorithm Ensembles

Hsu, Kuo-Wei; Srivastava, Jaideep

doi:10.1007/978-3-642-28320-8_40

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7104)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Bagging first builds a committee of classifiers and then aggregates their outputs through majority voting. It has attracted considerable research interest and has been applied in a variety of domains. Its advantages include an improved ability to handle small data sets, lower sensitivity to noise and outliers, and a parallel structure that permits efficient implementation. However, it has been found to be less accurate than some other ensemble methods. In this paper, we propose an approach that improves bagging by employing multiple classification algorithms within an ensemble. Our approach preserves bagging's parallel structure while improving its accuracy. As a result, it unlocks the power of bagging and expands its user base.
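The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which algorithms the paper uses or how they are assigned to committee members, so everything below is an assumption made for illustration: the two toy learners (nearest centroid and 1-nearest-neighbour), the round-robin assignment, and all function names are hypothetical and are not the authors' implementation. Each member is trained on its own bootstrap replicate, independently of the others, and the committee predicts by majority vote.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def bootstrap(data, rng):
    """Draw len(data) examples with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]

def train_centroid(sample):
    """Toy learner 1 (assumed): predict the class with the nearest mean vector."""
    sums, counts = {}, Counter()
    for x, y in sample:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))

def train_1nn(sample):
    """Toy learner 2 (assumed): predict the label of the closest stored example."""
    stored = list(sample)
    return lambda x: min(stored, key=lambda ex: sq_dist(x, ex[0]))[1]

def fit_multi_algorithm_bagging(data, learners, n_members, seed=0):
    """Train n_members classifiers, cycling through the learners round-robin.

    Each member depends only on its own bootstrap replicate, so the loop
    could be run in parallel, as in ordinary bagging.
    """
    rng = random.Random(seed)
    return [learners[i % len(learners)](bootstrap(data, rng))
            for i in range(n_members)]

def vote(committee, x):
    """Aggregate the members' predictions by majority voting."""
    return Counter(member(x) for member in committee).most_common(1)[0][0]
```

With `learners=[train_centroid, train_1nn]` and an odd `n_members`, a call such as `vote(committee, x)` returns the label chosen by most members. Passing a single learner reduces the sketch to ordinary bagging, which makes the two variants easy to compare on the same data.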

Author information

Authors and Affiliations

Department of Computer Science, National Chengchi University, Taipei, 11605, Taiwan, ROC
Kuo-Wei Hsu
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, 55455, USA
Jaideep Srivastava

Editor information

Editors and Affiliations

Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway, PO Box 123, NSW 2007, Sydney, Australia
Longbing Cao
Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences, 518055, Shenzhen, China
Joshua Zhexue Huang & Jun Luo
The University of Melbourne, VIC 3010, Melbourne, Australia
James Bailey
The University of Auckland, Auckland, New Zealand
Yun Sing Koh

Cite this paper

Hsu, KW., Srivastava, J. (2012). Improving Bagging Performance through Multi-algorithm Ensembles. In: Cao, L., Huang, J.Z., Bailey, J., Koh, Y.S., Luo, J. (eds) New Frontiers in Applied Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 7104. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28320-8_40

DOI: https://doi.org/10.1007/978-3-642-28320-8_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28319-2
Online ISBN: 978-3-642-28320-8
eBook Packages: Computer Science (R0)
