Diversity versus Quality in Classification Ensembles Based on Feature Selection

  • Pádraig Cunningham
  • John Carney
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1810)


Feature subset selection has emerged as a useful technique for creating diversity in ensembles, particularly in classification ensembles. In this paper we argue that this diversity needs to be monitored while the ensemble is being created. We propose the entropy of the outputs of the ensemble members as a measure of ensemble diversity. Further, we show that the associated conditional entropy works well as a loss function (error measure), and that the entropy in the ensemble is a good predictor of the reduction in error due to the ensemble. These measures are evaluated on a medical prediction problem, where they predict the performance of the ensemble well. The entropy measure of diversity has the added advantage that it appears to model the change in diversity with the size of the ensemble.
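To make the idea of an entropy-based diversity measure concrete, the following is a minimal sketch, not the paper's own formulation: the abstract does not state the exact definition, so this assumes diversity is taken as the Shannon entropy of the class labels voted by the ensemble members for each example, averaged over the examples. The function names `vote_entropy` and `ensemble_diversity` are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the labels predicted for ONE example.

    `votes` holds one predicted class label per ensemble member.
    Assumption: this per-example vote entropy is an illustrative stand-in
    for the paper's measure, whose exact form is not given in the abstract.
    """
    n = len(votes)
    counts = Counter(votes)
    # p * log2(1/p) with p = c / n; written as log2(n / c) so that a
    # unanimous vote yields exactly 0.0.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def ensemble_diversity(predictions):
    """Mean per-example vote entropy.

    `predictions[m][i]` is the label predicted by member m for example i.
    """
    per_example = list(zip(*predictions))  # regroup the votes by example
    return sum(vote_entropy(v) for v in per_example) / len(per_example)


if __name__ == "__main__":
    # Three hypothetical members, four examples, binary labels.
    preds = [
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 1, 1, 1],
    ]
    print(ensemble_diversity(preds))
```

Under this assumption, an ensemble whose members always agree scores 0, and the score rises as the members' votes spread more evenly across classes, which is the behaviour one would want to monitor while adding members trained on different feature subsets.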


Keywords: Feature Selection, Ensemble Member, Feature Subset, Entropy Measure, Ensemble Size
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Pádraig Cunningham (1)
  • John Carney (1)
  1. Department of Computer Science, Trinity College Dublin, Ireland
