An optimal pruning algorithm of classifier ensembles: dynamic programming approach

  • S.I.: Applying Artificial Intelligence to the Internet of Things
  • Neural Computing and Applications

Abstract

In recent years, classifier ensemble techniques have drawn considerable attention in the machine learning research community. The ultimate goal of this research is to improve the accuracy of the ensemble over that of its individual classifiers. In this paper, a novel algorithm for building ensembles, called the dynamic programming-based ensemble design algorithm (DPED), is introduced and studied in detail. DPED works in two phases: the first applies cooperative game theory, and the second applies a dynamic programming approach. The main objective of DPED is to reduce the size of the ensemble while encouraging extra diversity in order to improve accuracy. The performance of DPED is compared empirically with the classical ensemble model and with a well-known algorithm called "the most diverse." The experiments were carried out on 13 UCI datasets with three ensemble models, each constructed from 15 different base classifiers. The experimental results demonstrate that DPED outperforms the classical ensembles on all datasets in terms of both accuracy and ensemble size. In comparison with the most diverse algorithm, the number of classifiers selected by DPED across all datasets and domains is less than or equal to the number selected by the most diverse algorithm. On the blog spam dataset, for instance, DPED achieves an accuracy of 96.47% compared to 93.87% for the most diverse algorithm with a 40% training-set size. Finally, the experimental results verify the reliability, stability, and effectiveness of the proposed DPED algorithm.
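The abstract does not spell out either phase, but both name standard techniques, so the overall shape can be sketched. Below is a minimal, illustrative Python sketch, not the authors' implementation: it assumes phase one scores each base classifier by its Shapley value with respect to majority-vote accuracy (a common game-theoretic contribution measure), and phase two selects a fixed-size subset with a cardinality-constrained dynamic program over a simplified additive objective. The published DPED recurrence and objective may well differ, and all function names here (`vote_accuracy`, `shapley_scores`, `dp_select`) are hypothetical.

```python
import numpy as np
from math import factorial
from itertools import combinations

def vote_accuracy(preds, y):
    """Accuracy of a 0/1 majority-vote ensemble.
    preds: (n_classifiers, n_samples) array of binary predictions."""
    return float(((preds.mean(axis=0) >= 0.5) == y).mean())

def shapley_scores(preds, y):
    """Phase 1 (assumed): exact Shapley value of each classifier's
    marginal contribution to majority-vote accuracy.
    Exponential in pool size, so only viable for small pools."""
    n = preds.shape[0]
    scores = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            # Standard Shapley weight for a coalition of size r out of n players.
            w = factorial(r) * factorial(n - r - 1) / factorial(n)
            for S in combinations(others, r):
                # v(empty) = 0.5: chance level on a binary task (assumption).
                base = vote_accuracy(preds[list(S)], y) if S else 0.5
                scores[i] += w * (vote_accuracy(preds[list(S) + [i]], y) - base)
    return scores

def dp_select(scores, k):
    """Phase 2 (simplified): choose k classifiers maximizing total score
    via a cardinality-constrained dynamic program. With a purely additive
    objective this reduces to top-k; the paper's DP presumably optimizes
    a richer subset objective, which is what makes DP worthwhile."""
    n, NEG = len(scores), float("-inf")
    dp = [[NEG] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(k + 1):
            dp[i][j] = dp[i - 1][j]                  # skip classifier i-1
            if j and dp[i - 1][j - 1] > NEG:         # or take it
                dp[i][j] = max(dp[i][j], dp[i - 1][j - 1] + scores[i - 1])
    chosen, j = [], k                                # backtrack the choices
    for i in range(n, 0, -1):
        if j and dp[i][j] != dp[i - 1][j]:
            chosen.append(i - 1)
            j -= 1
    return sorted(chosen)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 200)
    # Six synthetic classifiers, each agreeing with y ~70% of the time.
    preds = np.where(rng.random((6, 200)) < 0.7, y, 1 - y)
    print("pruned ensemble:", dp_select(shapley_scores(preds, y), k=3))
```

Exact Shapley computation is exponential in the pool size; the paper's pools of 15 base classifiers sit near the practical limit for exact evaluation, and sampling-based approximations are typical for larger pools.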





Author information

Correspondence to Jafar A. Alzubi.

Ethics declarations

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Alzubi, O.A., Alzubi, J.A., Alweshah, M. et al. An optimal pruning algorithm of classifier ensembles: dynamic programming approach. Neural Comput & Applic 32, 16091–16107 (2020). https://doi.org/10.1007/s00521-020-04761-6

