
Improving Bagging Performance through Multi-algorithm Ensembles

  • Kuo-Wei Hsu
  • Jaideep Srivastava
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7104)

Abstract

Bagging first establishes a committee of classifiers, each trained on a bootstrap sample of the data, and then aggregates their predictions through majority voting. It has attracted considerable research interest and has been applied in various application domains. Its advantages include an improved ability to handle small data sets, lower sensitivity to noise and outliers, and a parallel structure that allows efficient implementations. However, bagging has been found to be less accurate than some other ensemble methods. In this paper, we propose an approach that improves bagging by employing multiple classification algorithms within an ensemble. The approach preserves the parallel structure of bagging while improving its accuracy, thereby unlocking the power of bagging and expanding its user base.
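To make the idea concrete, the following is a minimal sketch of bagging with a pool of heterogeneous base algorithms rather than a single one. It is not the authors' implementation; the scikit-learn base learners, the benchmark data set, and the committee size are illustrative assumptions. Each committee member is still fitted on an independent bootstrap sample, so the training loop remains embarrassingly parallel, and the final prediction is still obtained by majority voting, exactly as in plain bagging.

# Sketch of multi-algorithm (heterogeneous) bagging; base learners and data are illustrative.
from collections import Counter

import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

# Pool of different classification algorithms (the "multi-algorithm" part).
base_learners = [DecisionTreeClassifier(), GaussianNB(), KNeighborsClassifier()]

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_estimators = 15
committee = []
for i in range(n_estimators):
    # Standard bagging step: draw a bootstrap sample of the training data.
    X_boot, y_boot = resample(X_train, y_train, replace=True, random_state=i)
    # Multi-algorithm step: cycle through the pool instead of reusing one algorithm.
    model = clone(base_learners[i % len(base_learners)])
    committee.append(model.fit(X_boot, y_boot))

# Aggregate the committee's outcomes by majority voting, as in plain bagging.
votes = np.array([m.predict(X_test) for m in committee])
y_pred = np.array([Counter(col).most_common(1)[0][0] for col in votes.T])
print("test accuracy:", np.mean(y_pred == y_test))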

Keywords

Random Forest, Ensemble Method, Sequential Minimal Optimization, Considerable Research Interest, Rotation Forest
These keywords were added by machine and not by the authors. This process is experimental, and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Kuo-Wei Hsu (1)
  • Jaideep Srivastava (2)
  1. Department of Computer Science, National Chengchi University, Taipei, Taiwan, ROC
  2. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, USA
