
A fast ensemble pruning algorithm based on pattern mining process


Abstract

Ensemble pruning reduces the number of base classifiers prior to combination in order to improve generalization and prediction efficiency, but existing ensemble pruning algorithms require substantial pruning time. This paper presents a fast pruning approach: pattern mining based ensemble pruning (PMEP). In this algorithm, the prediction results of all base classifiers are organized as a transaction database, and an FP-Tree structure is used to compact them. A greedy pattern mining method is then applied to find the best ensemble of each candidate size k; after the ensembles of all possible sizes have been obtained, the one with the best accuracy is returned. Experimental results show that, compared with Bagging, GASEN, and Forward Selection, PMEP achieves the best prediction accuracy while keeping the final ensemble small; more importantly, its pruning time is much shorter than that of the other ensemble pruning algorithms.
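
The pipeline the abstract outlines can be made concrete with a short sketch. The following Python is a hypothetical illustration, not the paper's implementation: it builds the transaction database of correct predictions, then uses a plain support-counting greedy (standing in for the paper's FP-Tree construction and pattern mining) to grow a candidate ensemble of every size k and keep the most accurate one. All names are illustrative assumptions.

```python
# Hypothetical sketch of a PMEP-like pipeline. Assumptions: `predictions`
# is an (n_classifiers, n_samples) integer array of predicted labels in
# 0..C-1, and `y` is the array of true labels on a validation set. The
# paper's FP-Tree compaction and mining step are replaced here by a
# simple support-counting greedy.

import numpy as np

def transactions_from(predictions, y):
    """One transaction per sample: the set of classifier indices
    that predicted that sample correctly."""
    return [set(np.nonzero(predictions[:, j] == y[j])[0])
            for j in range(len(y))]

def majority_votes(predictions, members):
    """Majority-vote label of the sub-ensemble `members` on each sample."""
    sub = predictions[list(members)]
    return np.array([np.bincount(sub[:, j]).argmax()
                     for j in range(sub.shape[1])])

def pmep_like_prune(predictions, y):
    """Greedily build sub-ensembles of sizes 1..n_classifiers and
    return the members (and accuracy) of the most accurate one."""
    n_clf = predictions.shape[0]
    transactions = transactions_from(predictions, y)
    members, best_members, best_acc = [], [], -1.0
    for _ in range(n_clf):
        # Count each classifier's support over the transactions
        # (samples) that the current sub-ensemble still gets wrong.
        if members:
            wrong = majority_votes(predictions, members) != y
            pool = [t for t, w in zip(transactions, wrong) if w]
        else:
            pool = transactions
        support = np.zeros(n_clf)
        for t in pool:
            for c in t:
                support[c] += 1
        support[members] = -1.0            # never re-select a member
        members.append(int(support.argmax()))
        acc = float(np.mean(majority_votes(predictions, members) == y))
        if acc > best_acc:                 # keep the best size k seen
            best_members, best_acc = list(members), acc
    return best_members, best_acc
```

With, say, the validation-set predictions of 20 bagged classifiers in preds, pmep_like_prune(preds, y_val) would return the indices of the selected classifiers. The speed advantage claimed in the paper comes from compacting the transactions into an FP-Tree once and mining it directly, rather than rescanning the transaction list at every step as this sketch does.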


References

  • Ali KM, Pazzani MJ (1996) Error reduction through learning multiple descriptions. Mach Learn 24(3): 173–202

  • Allwein EL, Schapire RE, Singer Y (2000) Reducing multiclass to binary: a unifying approach for margin classifiers. J Mach Learn Res 1: 113–141

  • Asuncion A, Newman DJ (2007) UCI machine learning repository. http://www.ics.uci.edu/mlearn/MLRepository.html

  • Breiman L (1996) Bagging predictors. Mach Learn 24(2): 123–140

  • Caruana R, Niculescu-Mizil A, Crew G, Ksikes A (2004) Ensemble selection from libraries of models. In: Proceedings of the 21st international conference on machine learning (ICML2004), Banff, Alberta

  • Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7: 1–30

  • Han J, Pei J (2000) Mining frequent patterns by pattern growth: methodology and implications. SIGKDD Explor 2(2): 14–20

  • Jain AK, Duin RPW, Mao JC (2000) Statistical pattern recognition: a review. IEEE Trans Patt Anal Mach Intell 22(1): 4–37

  • Martínez-Muñoz G, Suárez A (2007) Using boosting to prune bagging ensembles. Patt Recogn Lett 28(1): 156–165

  • Opitz D, Maclin R (1999) Popular ensemble methods: an empirical study. J Artif Intell Res 11: 169–198

  • Parmanto B, Munro PW, Doyle HR (1996) Improving committee diagnosis with resampling techniques. In: Touretzky DS, Mozer MC, Hasselmo ME (eds) Advances in neural information processing systems. MIT Press, Cambridge, pp 882–888

  • Partalas I, Tsoumakas G, Vlahavas I (2009) Pruning an ensemble of classifiers via reinforcement learning. Neurocomputing (in press)

  • Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. In: Rumelhart DE, McClelland JL (eds) Parallel distributed processing: explorations in the microstructure of cognition. MIT Press, Cambridge, pp 318–362

  • Ruta D, Gabrys B (2005) Classifier selection for majority voting. Inf Fusion 6(1): 63–81

  • Schapire RE (1999) A brief introduction to boosting. In: Proceedings of the sixteenth international joint conference on artificial intelligence (IJCAI-99). Morgan Kaufmann, pp 1401–1406

  • Sewell M (2008) Ensemble learning. http://machine-learning.martinsewell.com/ensembles/ensemble-learning.pdf

  • Tsoumakas G, Angelis L, Vlahavas I (2005) Selective fusion of heterogeneous classifiers. Intell Data Anal 9(6): 511–525

  • Wolpert DH (1992) Stacked generalization. Neural Netw 5(2): 241–259

  • Xu L, Krzyzak A, Suen CY (1992) Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Trans Syst Man Cybern 22(3): 418–435

  • Zhang Y, Burer S, Street WN (2006) Ensemble pruning via semi-definite programming. J Mach Learn Res 7: 1315–1338

  • Zhou ZH, Wu JX, Tang W (2002) Ensembling neural networks: many could be better than all. Artif Intell 137(1–2): 239–263

Author information

Corresponding author

Correspondence to Qiang-Li Zhao.

Additional information

Responsible editors: Aleksander Kołcz, Wray Buntine, Marko Grobelnik, Dunja Mladenić, and John Shawe-Taylor.


About this article

Cite this article

Zhao, QL., Jiang, YH. & Xu, M. A fast ensemble pruning algorithm based on pattern mining process. Data Min Knowl Disc 19, 277–292 (2009). https://doi.org/10.1007/s10618-009-0138-1
