An efficient chaotic salp swarm optimization approach based on ensemble algorithm for class imbalance problems

  • Application of soft computing
Soft Computing

Abstract

Class imbalance problems have attracted considerable attention from the research community, but few works have focused on feature selection with imbalanced datasets. To handle class imbalance problems, we developed a novel fitness function for feature selection using the chaotic salp swarm optimization algorithm, an efficient meta-heuristic optimization algorithm that has been successfully used in a wide range of optimization problems. This paper proposes an AdaBoost algorithm combined with chaotic salp swarm optimization. The most discriminating features are selected using salp swarm optimization, and AdaBoost classifiers are then trained on the selected features. Experiments show that the proposed technique is able to find the optimal features while maximizing the performance of AdaBoost.
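To make the selection step concrete, the following is a minimal sketch of a chaotic salp swarm search over binary feature masks, not the authors' implementation. It follows the standard salp swarm update rules (a leader moving around the best solution found so far, followers averaging with their predecessor) and assumes that a logistic chaotic map replaces the random coefficient c2. The fitness function `toy_fitness`, the `relevant` mask, and all parameter values are hypothetical placeholders; the fitness function proposed in the paper is not given in this abstract. In a real pipeline the placeholder would be replaced by an imbalance-aware score of an AdaBoost classifier trained on the selected features.

```python
import math
import random


def logistic_map(x, r=4.0):
    """One step of the logistic chaotic map, x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def to_mask(position):
    """Binarize a continuous position: feature j is selected when value > 0.5."""
    return [p > 0.5 for p in position]


def toy_fitness(mask, relevant):
    """Placeholder fitness (an assumption, NOT the paper's fitness function).

    Rewards selecting features flagged in the hypothetical `relevant` mask and
    penalizes every other selected feature.
    """
    hits = sum(1 for m, r in zip(mask, relevant) if m and r)
    extras = sum(1 for m, r in zip(mask, relevant) if m and not r)
    return hits - 0.5 * extras


def chaotic_ssa_select(relevant, n_salps=10, n_iter=50, seed=0):
    """Search for a feature mask that maximizes toy_fitness."""
    rng = random.Random(seed)
    dim = len(relevant)
    lb, ub = 0.0, 1.0
    salps = [[rng.random() for _ in range(dim)] for _ in range(n_salps)]
    best = max(salps, key=lambda s: toy_fitness(to_mask(s), relevant))[:]
    best_fit = toy_fitness(to_mask(best), relevant)
    chaos = 0.7  # arbitrary start value that is not a fixed point of the map

    for it in range(1, n_iter + 1):
        # c1 decays over iterations, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iter) ** 2))
        for i in range(n_salps):
            if i == 0:
                # Leader: move around the best solution found so far.
                for j in range(dim):
                    chaos = logistic_map(chaos)
                    if not 0.0 < chaos < 1.0:
                        chaos = 0.7  # guard against floating-point collapse
                    c2 = chaos  # chaotic value in place of a uniform random c2
                    c3 = rng.random()
                    step = c1 * ((ub - lb) * c2 + lb)
                    x = best[j] + step if c3 >= 0.5 else best[j] - step
                    salps[0][j] = min(ub, max(lb, x))
            else:
                # Follower: average with the salp directly ahead in the chain.
                for j in range(dim):
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
            fit = toy_fitness(to_mask(salps[i]), relevant)
            if fit > best_fit:
                best, best_fit = salps[i][:], fit
    return to_mask(best), best_fit


if __name__ == "__main__":
    relevant = [True, False, True, False, False, True, False, False]
    mask, fit = chaotic_ssa_select(relevant)
    print(mask, fit)
```

The search keeps continuous positions in [0, 1] and thresholds them only when evaluating fitness, which is one common way to adapt a continuous swarm optimizer to feature selection; a sigmoid transfer function would be another option.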

Notes

  1. https://sci2s.ugr.es/keel/imbalanced.php.

Author information

Corresponding author

Correspondence to Chandrashekar Jatoth.

Ethics declarations

Conflict of interest

Rekha G declares that she has no conflict of interest. Krishna Reddy V declares that he has no conflict of interest. Chandrashekar Jatoth declares that he has no conflict of interest. Ugo Fiore declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gillala, R., Vuyyuru, K.R., Jatoth, C. et al. An efficient chaotic salp swarm optimization approach based on ensemble algorithm for class imbalance problems. Soft Comput 25, 14955–14965 (2021). https://doi.org/10.1007/s00500-021-06080-x

  • DOI: https://doi.org/10.1007/s00500-021-06080-x
