Selective Ensemble under Regularization Framework

  • Nan Li
  • Zhi-Hua Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5519)

Abstract

An ensemble is generated by training multiple component learners for the same task and then combining their predictions. It is known that when many trained learners are available, it is better to ensemble some of them rather than all. Selecting which learners to keep, however, is generally difficult, and heuristics are often used. In this paper, we investigate the problem under the regularization framework and propose a regularized selective ensemble algorithm, RSE. In RSE, the selection is reduced to a quadratic programming problem, which has a sparse solution and can be solved efficiently. Since it naturally fits the semi-supervised learning setting, RSE can also exploit unlabeled data to improve performance. Experimental results show that RSE can generate ensembles that are small in size but have strong generalization ability.
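To illustrate how selecting learners can be posed as a quadratic program with a sparse solution, here is a minimal sketch. It is not the RSE objective, which the abstract does not spell out. It assumes a plain squared loss on labeled data, binary predictions in {-1, +1}, and weights constrained to the probability simplex, solved with projected gradient descent. The names `project_to_simplex`, `select_weights`, `PREDS`, and `LABELS` are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # keep the threshold of the largest valid k
    return [max(x - theta, 0.0) for x in v]


def select_weights(preds, labels, n_iter=2000):
    """Minimize mean squared error of the weighted vote over the simplex.

    preds[i][j] is the prediction of learner j on sample i, in {-1, +1}.
    This is an illustrative objective, not the one used by RSE.
    """
    n, m = len(preds), len(preds[0])
    w = [1.0 / m] * m
    # For +/-1 predictions the gradient's Lipschitz constant is at most 2*m,
    # so a step of 1/(2*m) is safe.
    step = 1.0 / (2.0 * m)
    for _ in range(n_iter):
        residual = [
            sum(p * w_j for p, w_j in zip(row, w)) - y
            for row, y in zip(preds, labels)
        ]
        grad = [
            2.0 / n * sum(r * row[j] for r, row in zip(residual, preds))
            for j in range(m)
        ]
        w = project_to_simplex([w_j - step * g for w_j, g in zip(w, grad)])
    return w


# Toy data (made up for illustration): 6 samples, 4 learners.
# Learner 0 is always right; the others make one or two mistakes.
LABELS = [1, -1, 1, -1, 1, -1]
PREDS = [
    [1, 1, 1, -1],
    [-1, 1, -1, -1],
    [1, 1, -1, 1],
    [-1, -1, -1, 1],
    [1, 1, 1, 1],
    [-1, -1, 1, -1],
]

weights = select_weights(PREDS, LABELS)
selected = [j for j, w_j in enumerate(weights) if w_j > 1e-6]
```

On this toy data the weights should concentrate on the first learner, so `selected` ends up much smaller than the full pool. The non-negativity and sum-to-one constraints are what drive many weights to zero here; a graph-based regularizer over unlabeled points, as in semi-supervised settings, would add another quadratic term to the same kind of problem.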

Keywords

Unlabeled Data, Ensemble Size, Ensemble Learning, Average Error Rate, Label Rate

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Nan Li (1)
  • Zhi-Hua Zhou (1)
  1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China