Feature Selection for Ensembles of Simple Bayesian Classifiers

  • Alexey Tsymbal
  • Seppo Puuronen
  • David Patterson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2366)


A popular method for creating an accurate classifier from a set of training data is to train several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research, yet the simple Bayesian classifier has much broader applicability than previously thought: besides its high classification accuracy, it offers advantages in simplicity, learning speed, classification speed, storage space, and incrementality. One way to generate an ensemble of simple Bayesian classifiers is to train each member on a different feature subset, as in the random subspace method. In this paper we present a technique for building ensembles of simple Bayesian classifiers in random subspaces. We also consider a hill-climbing-based refinement cycle, which improves both the accuracy and the diversity of the base classifiers. In experiments on a collection of real-world and synthetic data sets, the ensembles of simple Bayesian classifiers are in many cases significantly more accurate than the single “global” simple Bayesian classifier. We consider several methods for integrating the simple Bayesian classifiers and find that dynamic integration exploits ensemble diversity better than static integration.
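
As an illustration of the approach described above, here is a minimal sketch of the random subspace method with a simple Bayesian base learner, written in Python. It is not the authors' implementation: scikit-learn's GaussianNB stands in for the simple Bayesian classifier, and the function names, ensemble size, and half-of-the-features subset size are illustrative assumptions. The unweighted majority vote corresponds to the static integration mentioned in the abstract.

    # A minimal sketch of a random-subspace ensemble of naive Bayes
    # classifiers (an illustrative assumption, not the authors' code).
    from collections import Counter

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    def train_random_subspace_ensemble(X, y, n_members=25, subset_size=None, seed=0):
        """Train one naive Bayes classifier per randomly drawn feature subset."""
        rng = np.random.default_rng(seed)
        n_features = X.shape[1]
        if subset_size is None:
            # Half of the features per member, as in Ho's random subspace method.
            subset_size = max(1, n_features // 2)
        ensemble = []
        for _ in range(n_members):
            subset = rng.choice(n_features, size=subset_size, replace=False)
            member = GaussianNB().fit(X[:, subset], y)
            ensemble.append((member, subset))
        return ensemble

    def predict_majority_vote(ensemble, X):
        """Static integration: unweighted majority vote over the members."""
        votes = np.array([member.predict(X[:, subset]) for member, subset in ensemble])
        return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])

    if __name__ == "__main__":
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split

        X, y = make_classification(n_samples=500, n_features=30,
                                   n_informative=8, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        single = GaussianNB().fit(X_tr, y_tr)  # the single "global" classifier
        ensemble = train_random_subspace_ensemble(X_tr, y_tr)

        print("global NB:", np.mean(single.predict(X_te) == y_te))
        print("ensemble :", np.mean(predict_majority_vote(ensemble, X_te) == y_te))

The paper's hill-climbing refinement cycle would additionally swap features in and out of each member's subset while a validation estimate of accuracy and diversity improves, and its dynamic integration would weight each member per instance by its estimated local accuracy; both are omitted here for brevity.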


Keywords: Feature Selection · Ensemble Member · Feature Subset · Base Classifier · Iterative Refinement





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Alexey Tsymbal (1)
  • Seppo Puuronen (1)
  • David Patterson (2)

  1. University of Jyväskylä, Jyväskylä, Finland
  2. Northern Ireland Knowledge Engineering Laboratory, University of Ulster, UK
