Abstract
Ensemble pruning is an important area of research in multiple classifier systems. Reducing ensemble size by selecting diverse and accurate classifiers from a given pool is a popular strategy for improving ensemble performance. In this paper, we present the Accu-Prune (AP) algorithm, a majority-voting ensemble method that uses accuracy ordering and a reduced error pruning approach to identify an optimal ensemble from a given pool of classifiers. At each step, the ensemble is extended by adding the two next lower-accuracy classifiers, implicitly adding diversity to the ensemble. The proposed approach closely mimics the results of a Brute Force (BF) search for the optimal ensemble while reducing the search space drastically. We propose that the quality of an ensemble is determined by two factors: size and accuracy. Ideally, smaller ensembles are preferable, quality-wise, to larger ensembles of the same accuracy. Based on this notion, we design a deficit function that quantifies the quality differential between two arbitrary ensembles by examining their difference in performance and size. Experiments have been carried out on 25 UCI datasets, and the AP algorithm has been compared with BF search and other pruning algorithms. The deficit function is used to compare AP with BF search and a well-known pruning algorithm, EPIC. Relevant statistical tests reveal that the generalization capability of the AP algorithm is better than forward search and backward elimination, comparable to BF search, and slightly inferior to EPIC. Since EPIC ensembles are significantly larger, the quality differential between AP and EPIC ensembles is not significant. Thus, for limited-memory applications that can tolerate a small amount of error, AP ensembles may be more appropriate.
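The abstract's description of AP, as we read it, amounts to ordering the pool by individual accuracy, growing the candidate ensemble two classifiers at a time (so its size stays odd), and keeping whichever prefix votes best on a held-out pruning set. The sketch below illustrates that reading; all names, the 0/1 label encoding, and the use of a pruning set as the selection criterion are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def majority_vote(preds):
    # Column-wise majority vote over an (n_classifiers, n_samples)
    # matrix of 0/1 predictions.
    return (preds.mean(axis=0) >= 0.5).astype(int)

def accu_prune(preds, y_val):
    """Sketch of accuracy-ordered reduced error pruning (assumed reading of AP).

    preds : (n_classifiers, n_samples) 0/1 predictions on a pruning set
    y_val : (n_samples,) true 0/1 labels
    Returns indices of the selected (odd-sized) sub-ensemble.
    """
    accs = (preds == y_val).mean(axis=1)      # individual accuracies
    order = np.argsort(-accs)                 # most accurate first
    best_idx = [int(order[0])]                # start from the single best classifier
    best_acc = (majority_vote(preds[best_idx]) == y_val).mean()
    current = list(best_idx)
    # Extend two classifiers at a time so the ensemble size stays odd
    # (no voting ties); assumes an odd-sized pool, as in the paper.
    for k in range(1, len(order) - 1, 2):
        current = current + [int(i) for i in order[k:k + 2]]
        acc = (majority_vote(preds[current]) == y_val).mean()
        if acc > best_acc:                    # reduced error pruning: keep the best prefix
            best_acc, best_idx = acc, list(current)
    return best_idx
```

Against a brute-force search over all sub-ensembles, this examines only a linear number of candidates along the accuracy ordering, which is the drastic search-space reduction the abstract refers to.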
Notes
This lattice is ordered by the subset relation (\(\subset \)) with the infimum as the null ensemble and the supremum as the complete pool.
Since majority voting is used as the combiner function, levels of even order in the lattice (search space) can be skipped, which speeds up the search somewhat, though the computational complexity of the search remains of the same order.
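The point of this note can be made concrete by counting: under majority voting an even-sized ensemble can tie, so a brute-force search need only score the odd-sized subsets, which is exactly half of the lattice and therefore still exponential in the pool size. A small counting sketch (names hypothetical):

```python
from itertools import combinations

def odd_subsets(pool):
    # Yield only the odd-sized sub-ensembles of the pool
    # (even levels of the subset lattice are skipped).
    for k in range(1, len(pool) + 1, 2):
        yield from combinations(pool, k)

# A pool of 5 classifiers has 2**5 - 1 = 31 non-empty subsets, but under
# majority voting only the odd-sized ones need scoring -- 16 of them,
# exactly half of 2**5, so the search remains exponential in pool size.
n_odd = sum(1 for _ in odd_subsets(range(5)))
```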
If a classifier predicts the class label of an instance correctly, an oracle output of 1 is generated, and −1 otherwise.
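The oracle encoding described in this note is a simple element-wise map from predictions and true labels to ±1; a minimal sketch (the function name is an assumption):

```python
import numpy as np

def oracle_outputs(preds, y):
    # +1 where a classifier's prediction matches the true label, -1 elsewhere.
    return np.where(preds == y, 1, -1)

oracle_outputs(np.array([0, 1, 1]), np.array([0, 1, 0]))  # -> [1, 1, -1]
```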
We restricted this study to pool sizes 11 and 21, since BF search on a pool of size 31 was not feasible because of unacceptably long execution times.
Acknowledgments
We thank the anonymous reviewers for the time they spent on the manuscript. Their valuable comments have resulted in appreciable improvements in the manuscript.
Bhardwaj, M., Bhatnagar, V. Towards an optimally pruned classifier ensemble. Int. J. Mach. Learn. & Cyber. 6, 699–718 (2015). https://doi.org/10.1007/s13042-014-0303-8