Improving Bagging Performance through Multi-algorithm Ensembles

  • Conference paper
New Frontiers in Applied Data Mining (PAKDD 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7104)


Abstract

Bagging first builds a committee of classifiers and then aggregates their outputs through majority voting. It has attracted considerable research interest and has been applied in a variety of domains. Its advantages include a better ability to handle small data sets, lower sensitivity to noise and outliers, and a parallel structure that allows efficient implementation. However, it has been found to be less accurate than some other ensemble methods. In this paper, we propose an approach that improves bagging by employing multiple classification algorithms within an ensemble. Our approach preserves the parallel structure of bagging while improving its accuracy. As a result, it makes bagging more powerful and useful to a wider range of users.
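The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which algorithms the authors use or how they are assigned to ensemble members, so everything below is an assumption made for illustration: the three toy learners (1-nearest-neighbor, nearest centroid, decision stump), the round-robin assignment of algorithms to bootstrap samples, the toy data, and all function names. It is not the method evaluated in the paper.

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Draw len(data) examples with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def train_nearest_neighbor(sample):
    """1-NN: predict the label of the closest training example."""
    return lambda x: min(sample, key=lambda ex: sq_dist(ex[0], x))[1]

def train_nearest_centroid(sample):
    """Predict the label whose class mean is closest to x."""
    groups = {}
    for features, label in sample:
        groups.setdefault(label, []).append(features)
    centroids = {lab: [sum(col) / len(rows) for col in zip(*rows)]
                 for lab, rows in groups.items()}
    return lambda x: min(centroids, key=lambda lab: sq_dist(centroids[lab], x))

def train_stump(sample):
    """Single-feature threshold rule with the fewest training errors."""
    best = None
    for j in range(len(sample[0][0])):
        for features, _ in sample:
            t = features[j]
            left = [lab for f, lab in sample if f[j] <= t]
            right = [lab for f, lab in sample if f[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    _, j, t, left_label, right_label = best
    return lambda x: left_label if x[j] <= t else right_label

def fit_multi_algorithm_bagging(data, trainers, n_members, seed=0):
    """Train each member on its own bootstrap sample.

    Assumption: algorithms are assigned round-robin. Members do not depend
    on one another, so this loop could run in parallel.
    """
    rng = random.Random(seed)
    return [trainers[i % len(trainers)](bootstrap(data, rng))
            for i in range(n_members)]

def predict_majority(members, x):
    """Aggregate member predictions by majority vote."""
    return Counter(member(x) for member in members).most_common(1)[0][0]

# Toy data (hypothetical): two well-separated clusters labeled "a" and "b".
data = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"), ((0.3, 0.2), "a"),
        ((1.0, 0.9), "b"), ((0.8, 1.0), "b"), ((0.9, 0.7), "b"), ((1.1, 1.2), "b")]
trainers = [train_nearest_neighbor, train_nearest_centroid, train_stump]
members = fit_multi_algorithm_bagging(data, trainers, n_members=9)
print(predict_majority(members, (0.0, 0.0)), predict_majority(members, (1.0, 1.0)))
```

Passing a single trainer in `trainers` reduces the sketch to ordinary single-algorithm bagging, which makes the two setups easy to compare on the same bootstrap seed.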

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hsu, KW., Srivastava, J. (2012). Improving Bagging Performance through Multi-algorithm Ensembles. In: Cao, L., Huang, J.Z., Bailey, J., Koh, Y.S., Luo, J. (eds) New Frontiers in Applied Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 7104. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28320-8_40

  • DOI: https://doi.org/10.1007/978-3-642-28320-8_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28319-2

  • Online ISBN: 978-3-642-28320-8

  • eBook Packages: Computer Science, Computer Science (R0)
