
Introducing randomness into greedy ensemble pruning algorithms


Abstract

The Greedy Ensemble Pruning (GEP) algorithm, also known as the Directed Hill Climbing Ensemble Pruning (DHCEP) algorithm, is well known for its relatively good performance and high speed. However, because it explores only a small subspace of the whole solution space, it often produces suboptimal solutions to the ensemble pruning problem. To address this drawback, we propose a novel Randomized GEP (RandomGEP) algorithm, also called the Randomized DHCEP (RandomDHCEP) algorithm, which uses a randomization technique to effectively enlarge the search space of the classical DHCEP while keeping the same level of time complexity. This randomization of the classical DHCEP algorithm achieves a good tradeoff between the effectiveness and the efficiency of ensemble pruning. Moreover, the RandomDHCEP algorithm naturally inherits the two advantages that randomized algorithms usually possess. First, in most cases, its running time or space requirement is smaller than that of well-performing deterministic ensemble pruning algorithms. Second, it is easy to understand and implement. Experimental results on three benchmark classification datasets verify the practicality and effectiveness of the RandomGEP algorithm.
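The abstract does not spell out the randomization mechanism, so the following is only an illustrative sketch: one common way to randomize a forward greedy ensemble pruner is to score just a random sample of the remaining candidate classifiers at each step, rather than all of them. The function names, the sample_frac parameter, and the majority-vote fitness measure below are assumptions for illustration, not the paper's exact method; the base classifiers are assumed to expose a scikit-learn-style predict method and integer class labels.

```python
import random

import numpy as np


def majority_vote_accuracy(members, X_val, y_val):
    """Accuracy of the sub-ensemble's majority vote on a validation set."""
    if not members:
        return 0.0
    # Each row holds one member's integer class predictions (labels 0..K-1).
    preds = np.array([clf.predict(X_val) for clf in members], dtype=int)
    # Majority vote per validation sample.
    votes = np.apply_along_axis(lambda col: np.bincount(col).argmax(),
                                axis=0, arr=preds)
    return float(np.mean(votes == np.asarray(y_val)))


def randomized_greedy_prune(pool, X_val, y_val, target_size,
                            sample_frac=0.5, seed=None):
    """Forward greedy pruning that scores only a random candidate sample.

    At every step, a random fraction (sample_frac) of the remaining
    classifiers is evaluated, and the best of that sample joins the
    sub-ensemble; different seeds therefore explore different regions
    of the solution space at no extra cost per step.
    """
    rng = random.Random(seed)
    selected, remaining = [], list(pool)
    while remaining and len(selected) < target_size:
        # Sample a subset of candidates instead of scanning all of them.
        k = max(1, int(sample_frac * len(remaining)))
        candidates = rng.sample(remaining, k)
        # Greedy step: pick the sampled candidate that most improves
        # the validation accuracy of the current sub-ensemble.
        best = max(candidates,
                   key=lambda c: majority_vote_accuracy(selected + [c],
                                                        X_val, y_val))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With sample_frac=1.0 the loop reduces to a classical deterministic forward GEP step; smaller fractions keep the per-step cost at or below that of the deterministic search while letting different random seeds explore different regions of the solution space.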



Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants No. 61473150, 61100108, and 61375021; by the Natural Science Foundation of Jiangsu Province of China under Grant No. SBK201322136; by the Fundamental Research Funds for the Central Universities under Grant No. NZ2013306; and by the Qing Lan Project under Grant No. YPB13001. We thank the reviewers and editors for their valuable comments.

Author information


Corresponding author

Correspondence to Qun Dai.


About this article


Cite this article

Dai, Q., Li, M. Introducing randomness into greedy ensemble pruning algorithms. Appl Intell 42, 406–429 (2015). https://doi.org/10.1007/s10489-014-0605-2

