Abstract
In recent years, classifier ensemble techniques have drawn considerable attention in the machine learning research community. The ultimate goal of this research is to improve the accuracy of the ensemble over that of its individual classifiers. In this paper, a novel algorithm for building ensembles, the dynamic programming-based ensemble design algorithm (DPED), is introduced and studied in detail. DPED proceeds in two phases: the first applies cooperative game theory, and the second applies a dynamic programming approach. The main objective of DPED is to reduce the size of the ensemble while encouraging extra diversity in order to improve accuracy. The performance of the DPED algorithm is compared empirically with the classical ensemble model and with a well-known algorithm called "the most diverse." The experiments were carried out on 13 datasets from the UCI repository using three ensemble models, each constructed from 15 different base classifiers. The experimental results demonstrate that DPED outperforms the classical ensembles on all datasets in terms of both accuracy and ensemble size. In the comparison with the most diverse algorithm, the number of classifiers selected by DPED across all datasets and domains is less than or equal to the number selected by the most diverse algorithm. On the blog spam dataset, for instance, DPED achieves an accuracy of 96.47% compared to 93.87% for the most diverse algorithm at a 40% training size. Finally, the experimental results verify the reliability, stability, and effectiveness of the proposed DPED algorithm.
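The DPED algorithm itself is specified in the body of the paper; as a rough illustration of the ensemble-pruning problem it addresses, the following sketch (a hypothetical baseline, not the authors' method) brute-forces the smallest majority-vote subset of base classifiers that maximizes accuracy on a validation set. The exponential cost of this exhaustive search is precisely what a dynamic programming formulation aims to avoid. All names and the toy data below are illustrative assumptions.

```python
from itertools import combinations
from collections import Counter

def majority_vote(preds_subset):
    """Combine per-classifier prediction lists by plurality vote per example."""
    n = len(preds_subset[0])
    return [Counter(p[i] for p in preds_subset).most_common(1)[0][0]
            for i in range(n)]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def prune_ensemble(preds, truth):
    """Exhaustively search odd-sized subsets of classifiers; return the
    (accuracy, indices) of the smallest subset with the best vote accuracy.
    Cost is exponential in the number of classifiers -- the motivation for
    smarter pruning schemes such as dynamic programming."""
    best = (0.0, None)
    n = len(preds)
    for k in range(1, n + 1, 2):  # odd sizes avoid voting ties
        for idx in combinations(range(n), k):
            acc = accuracy(majority_vote([preds[i] for i in idx]), truth)
            if acc > best[0]:  # strict '>' keeps the smallest winning subset
                best = (acc, idx)
    return best

# Toy example: 5 base classifiers' binary predictions on 6 validation points.
truth = [1, 0, 1, 1, 0, 1]
preds = [
    [1, 0, 1, 0, 0, 1],  # one error
    [1, 0, 0, 1, 0, 1],  # one error
    [0, 0, 1, 1, 1, 1],  # two errors
    [1, 1, 1, 1, 0, 0],  # two errors
    [0, 1, 0, 0, 1, 0],  # wrong on every point
]
acc, subset = prune_ensemble(preds, truth)
```

Here a three-classifier subset votes perfectly even though no single classifier is error-free, illustrating why pruning for diversity, rather than keeping all 15 base classifiers, can improve both accuracy and ensemble size.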
Ethics declarations
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Cite this article
Alzubi, O.A., Alzubi, J.A., Alweshah, M. et al. An optimal pruning algorithm of classifier ensembles: dynamic programming approach. Neural Comput & Applic 32, 16091–16107 (2020). https://doi.org/10.1007/s00521-020-04761-6