
Density-weighted support vector machines for binary class imbalance learning

  • Original Article
  • Published in Neural Computing and Applications

Abstract

In real-world binary classification problems, the number of samples belonging to each class often varies. Problems in which the majority class is substantially larger than the minority class are known as class imbalance learning (CIL) problems, and model performance may degrade as a result. This paper presents a new support vector machine (SVM) model based on density weighting for the binary CIL problem (DSVM-CIL). In addition, an improved 2-norm-based density-weighted least squares SVM for binary CIL (IDLSSVM-CIL) is proposed to increase the training speed of DSVM-CIL. In IDLSSVM-CIL, a least squares solution is obtained by taking the 2-norm of the slack variables and solving the primal problem of DSVM-CIL with equality constraints instead of inequality constraints. The basic idea behind both algorithms is that training data points are assigned weights during the training phase based on their class distributions. The weights are generated using a density-weighting technique (Cha et al. in Expert Syst Appl 41(7):3343–3350, 2014) to reduce the effects of class imbalance. Experimental analyses are performed on several imbalanced artificial and real-world datasets, and performance is measured using the area under the curve (AUC) and the geometric mean (G-mean). The results are compared with SVM, least squares SVM, fuzzy SVM, improved fuzzy least squares SVM, affinity and class probability-based fuzzy SVM, and entropy-based fuzzy least squares SVM. Similar or better generalization results indicate the efficacy and applicability of the proposed algorithms.
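The abstract's key ingredients can be illustrated with a small sketch: a k-nearest-neighbour density estimate standing in for the density-weighting step (the paper's exact scheme follows Cha et al. 2014), a weighted least-squares SVM trained by solving a single linear system (the equality-constraint idea behind IDLSSVM-CIL, here in a generic Suykens-style form rather than the paper's exact formulation), and the G-mean metric used for evaluation. All function names, the RBF kernel, and the parameter choices are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def knn_density_weights(X, k=5):
    """Per-sample weights from a k-NN density estimate: samples in dense
    regions get weights near 1, outliers get small weights. Illustrative
    stand-in for the density-weighting scheme of Cha et al. (2014)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    knn_mean = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)  # skip self (d = 0)
    dens = 1.0 / (knn_mean + 1e-12)
    return dens / dens.max()

def weighted_lssvm_fit(X, y, w, C=10.0, gamma=1.0):
    """Weighted least-squares SVM: with equality constraints and 2-norm
    slacks, training reduces to one (n+1)x(n+1) linear system; the
    per-sample weights w enter the regularised kernel diagonal."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # RBF kernel
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.diag(1.0 / (C * w))
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[1:], sol[0]  # (alpha, b)

def weighted_lssvm_predict(X_train, alpha, b, X, gamma=1.0):
    """Decision function sign(sum_i alpha_i K(x, x_i) + b)."""
    sq_tr = np.sum(X_train ** 2, axis=1)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq_tr[None, :] - 2 * X @ X_train.T))
    return np.sign(K @ alpha + b)

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls, an imbalance-aware metric."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.prod(recalls) ** (1.0 / len(recalls)))
```

As a sanity check, fitting two well-separated clusters with uniform weights recovers the training labels, and G-mean penalises a classifier that ignores the minority class far more severely than plain accuracy does.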


References

  1. Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) Keel data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Mult Valued Log Soft Comput 17(2):255–287

  2. Anitha PU, Neelima G, Kumar YS (2019) Prediction of cardiovascular disease using support vector machine. J Innov Electron Commun Eng 9(1):28–33

  3. Azevedo N, Pinheiro D, Weber GW (2014) Dynamic programming for a Markov-switching jump–diffusion. J Comput Appl Math 267:1–19

  4. Bakan HÖ, Yılmaz F, Weber GW (2018) Minimal truncation error constants for Runge–Kutta method for stochastic optimal control problems. J Comput Appl Math 331:196–207

  5. Balasundaram S, Gupta D (2016) On optimization based extreme learning machine in primal for regression and classification by functional iterative method. Int J Mach Learn Cybernet 7(5):707–728

  6. Batuwita R, Palade V (2010) FSVM-CIL: fuzzy support vector machines for class imbalance learning. IEEE Trans Fuzzy Syst 18(3):558–571

  7. Bhaumik A, Roy SK, Weber GW (2019) Hesitant interval-valued intuitionistic fuzzy-linguistic term set approach in Prisoners’ dilemma game theory using TOPSIS: a case study on human-trafficking. CEJOR 28:1–20

  8. Borah P, Gupta D (2019) Functional iterative approaches for solving support vector classification problems based on generalized Huber loss. Neural Comput Appl 32:9245–9265

  9. Borah P, Gupta D, Prasad M (2018) Improved 2-norm based fuzzy least squares twin support vector machine. In: 2018 IEEE symposium series on computational intelligence (SSCI), pp 412–419. IEEE

  10. Cardillo G (2007) McNemar test: perform the McNemar test on a 2 × 2 matrix. http://www.mathworks.com/matlabcentral/fileexchange/15472

  11. Cha M, Kim JS, Baek JG (2014) Density weighted support vector data description. Expert Syst Appl 41(7):3343–3350

  12. Chen YH, Hong WC, Shen W, Huang NN (2016) Electric load forecasting based on a least squares support vector machine with fuzzy time series and global harmony search algorithm. Energies 9(2):70

  13. Çiftçi BB, Kuter S, Akyürek Z, Weber GW (2017) Fractional snow cover mapping by artificial neural networks and support vector machines. ISPRS Ann Photogramm Remote Sens Spat Inf Sci 4:179

  14. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

  15. Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(Jan):1–30

  16. Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1923

  17. Dohare AK, Kumar V, Kumar R (2018) Detection of myocardial infarction in 12 lead ECG using support vector machine. Appl Soft Comput 64:138–147

  18. Dudani SA (1976) The distance-weighted k-nearest-neighbor rule. IEEE Trans Syst Man Cybern 4:325–327

  19. Fernández A, del Río S, Chawla NV, Herrera F (2017) An insight into imbalanced big data classification: outcomes and challenges. Complex Intell Syst 3(2):105–120

  20. Gupta D, Richhariya B (2018) Entropy based fuzzy least squares twin support vector machine for class imbalance learning. Appl Intell 48(11):4212–4231

  21. Gupta D, Hazarika BB, Berlin M (2020) Robust regularized extreme learning machine with asymmetric Huber loss function. Neural Comput Appl 32:12971–12998

  22. Gupta D, Richhariya B, Borah P (2018) A fuzzy twin support vector machine based on information entropy for class imbalance learning. Neural Comput Appl 31:1–12

  23. Hazarika BB, Gupta D, Berlin M (2020a) A comparative analysis of artificial neural network and support vector regression for river suspended sediment load prediction. In: First international conference on sustainable technologies for computational intelligence. Springer, Singapore, pp 339–349

  24. Hazarika BB, Gupta D, Berlin M (2020) Modeling suspended sediment load in a river using extreme learning machine and twin support vector regression with wavelet conjunction. Environ Earth Sci 79:234

  25. He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284

  26. Jongen HT, Weber GW (1992) Nonconvex optimization and its structural frontiers. In: Modern methods of optimization. Springer, Berlin, pp 151–203

  27. Kara G, Özmen A, Weber GW (2019) Stability advances in robust portfolio optimization under parallelepiped uncertainty. CEJOR 27(1):241–261

  28. Kropat E, Weber GW, Belen S (2011) Dynamical gene-environment networks under ellipsoidal uncertainty: set-theoretic regression analysis based on ellipsoidal OR. In: Dynamics, games and science I. Springer, Berlin, pp 545–571

  29. Kürüm E, Yildirak K, Weber GW (2012) A classification problem of credit risk rating investigated and solved by optimisation of the ROC curve. CEJOR 20(3):529–557

  30. Lin CF, Wang SD (2002) Fuzzy support vector machines. IEEE Trans Neural Netw 13(2):464–471

  31. Liu J, Zio E (2018) A scalable fuzzy support vector machine for fault detection in transportation systems. Expert Syst Appl 102:36–43

  32. Liu YH, Huang HP (2002) Fuzzy support vector machines for pattern recognition and data mining. Int J Fuzzy Syst 4(3):826–835

  33. Lu S, Zhu C, Jiao C (2015) Density weighted core support vector machine. Adv Comput Sci Int J 4(6):150–155

  34. McNemar Q (1947) Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 12(2):153–157

  35. Murphy PM, Aha DW (1992) UCI machine learning repository

  36. Napierała K, Stefanowski J, Wilk S (2010) Learning from imbalanced data in presence of noisy and borderline examples. In: International conference on rough sets and current trends in computing. Springer, Berlin, pp 158–167

  37. Özmen A, Kropat E, Weber GW (2017) Robust optimization in spline regression models for multi-model regulatory networks under polyhedral uncertainty. Optimization 66(12):2135–2155

  38. Özöğür Akyüz S, Üstünkar G, Weber GW (2016) Adapted infinite kernel learning by multi-local algorithm. Int J Pattern Recognit Artif Intell 30(04):1651004

  39. Özöğür-Akyüz S, Hussain Z, Shawe-Taylor J (2010) Prediction with the SVM using test point margins. In: Data mining. Springer, Boston, pp 147–158

  40. Pant R, Trafalis TB, Barker K (2011) Support vector machine classification of uncertain and imbalanced data using robust optimization. In: Proceedings of the 15th WSEAS international conference on computers. World Scientific and Engineering Academy and Society (WSEAS) Stevens Point, Wisconsin, USA, pp 369–374

  41. Roy SK, Maiti SK (2020) Reduction methods of type-2 fuzzy variables and their applications to Stackelberg game. Appl Intell 50:1398–1415

  42. Savku E, Weber GW (2018) A stochastic maximum principle for a markov regime-switching jump-diffusion model with delay and an application to finance. J Optim Theory Appl 179(2):696–721

  43. Shao SY, Shen KQ, Ong CJ, Wilder-Smith EP, Li XP (2008) Automatic EEG artifact removal: a weighted support vector machine approach with error correction. IEEE Trans Biomed Eng 56(2):336–344

  44. Suykens JA, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300

  45. Suykens JA, De Brabanter J, Lukas L, Vandewalle J (2002) Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1–4):85–105

  46. Tang H, Dong P, Shi Y (2019) A new approach of integrating piecewise linear representation and weighted support vector machine for forecasting stock turning points. Appl Soft Comput 78:685–696

  47. Tao X, Li Q, Ren C, Guo W, He Q, Liu R, Zou J (2020) Affinity and class probability-based fuzzy support vector machine for imbalanced data sets. Neural Netw 122:289–307

  48. Tax DM, Duin RP (2004) Support vector data description. Mach Learn 54(1):45–66

  49. Temoçin BZ, Weber GW (2014) Optimal control of stochastic hybrid system with jumps: a numerical approximation. J Comput Appl Math 259:443–451

  50. Temoçin BZ, Weber GW, Azevedo N, Pinheiro D (2011) Applications of stochastic hybrid systems in finance. In: Petrosyan LA, Zenkevich NA (eds) Game theory and management: collected abstracts of the fifth international conference. Graduate School of Management, SPbU, St. Petersburg, p 236

  51. Tomar D, Singhal S, Agarwal S (2014) Weighted least square twin support vector machine for imbalanced dataset. Int J Database Theory Appl 7(2):25–36

  52. Trafalis TB, Alwazzi SA (2007) Support vector regression with noisy data: a second order cone programming approach. Int J Gen Syst 36(2):237–250

  53. Tsang IW, Kwok JT, Cheung PM (2005) Core vector machines: fast SVM training on very large data sets. J Mach Learn Res 6(Apr):363–392

  54. Van Gestel T, Suykens JA, Baestaens DE, Lambrechts A, Lanckriet G, Vandaele B, Vandewalle J (2001) Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Trans Neural Netw 12(4):809–821

  55. Wang L, Gao C, Zhao N, Chen X (2019) A projection wavelet weighted twin support vector regression and its primal solution. Appl Intell 49:1–21

  56. Wang Q, Tian Y, Liu D (2019) Adaptive FH-SVM for imbalanced classification. IEEE Access 7:130410–130422

  57. Wang TY, Chiang HM (2007) Fuzzy support vector machine for multi-class text categorization. Inf Process Manag 43(4):914–929

  58. Wang T, Qiu Y, Hua J (2019) Centered kernel alignment inspired fuzzy support vector machine. Fuzzy Sets Syst 394:110–123

  59. Weber GW (2002) Generalized semi-infinite optimization: theory and applications in optimal control and discrete optimization. J Stat Manag Syst 5(1–3):359–388

  60. Xia S, Xiong Z, Luo Y, Dong L, Xing C (2015) Relative density based support vector machine. Neurocomputing 149:1424–1432

  61. Xu S, Yuan C, Zhang X (2011) Density weighted least squares support vector machine. In: Proceedings of the 30th Chinese control conference. IEEE, pp 5310–5314

  62. Yang X, Song Q, Wang Y (2007) A weighted support vector machine for data classification. Int J Pattern Recognit Artif Intell 21(05):961–976

  63. Zhang C, Bi J, Xu S, Ramentol E, Fan G, Qiao B, Fujita H (2019) Multi-imbalance: an open-source software for multi-class imbalance learning. Knowl Based Syst 174:137–143


Acknowledgements

The authors gratefully acknowledge the valuable comments and suggestions of the anonymous reviewers. This study was supported by an Early Career Research Award (ECRA) from the Science and Engineering Research Board (SERB), Government of India, Grant No. ECR/2016/001464.

Author information

Corresponding author

Correspondence to Deepak Gupta.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hazarika, B.B., Gupta, D. Density-weighted support vector machines for binary class imbalance learning. Neural Comput & Applic 33, 4243–4261 (2021). https://doi.org/10.1007/s00521-020-05240-8
