Cost-sensitive SVDD models based on a sample selection approach

Abstract

The asymmetry of misclassification costs is a common problem in many real-world applications. However, most traditional classifiers pursue high recognition accuracy under the assumption that all misclassification errors incur the same cost. This paper proposes two cost-sensitive models based on support vector data description (SVDD) that minimize the classification cost while maximizing the classification accuracy. The one-class SVDD classifier is extended to two two-class models. Cost information is incorporated to trade off the generalization performance of the two classes so that the misclassification cost is minimized, and it is also used to build the decision rules. The optimization problems of the two proposed models are solved with the sequential minimal optimization (SMO) algorithm. However, SMO must examine all samples to select the working set in each iteration, which is very time consuming. Since only the support vectors are needed to describe the class boundaries, a sample selection approach is proposed that reduces the training time and storage requirement by retaining edge and overlapping samples, and that alleviates local overlearning by removing outliers. Experimental results on synthetic and public datasets demonstrate the effectiveness and efficiency of the proposed methods.
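
To make the cost trade-off concrete, one common way to attach class-dependent penalties to an SVDD that also uses negative examples (in the spirit of Tax and Duin's formulation) is

\min_{R,\,a,\,\xi}\; R^{2} \;+\; C_{+}\sum_{i:\,y_i=+1}\xi_i \;+\; C_{-}\sum_{j:\,y_j=-1}\xi_j

\text{s.t.}\quad \|\phi(x_i)-a\|^{2} \le R^{2}+\xi_i \;\;(y_i=+1), \qquad \|\phi(x_j)-a\|^{2} \ge R^{2}-\xi_j \;\;(y_j=-1), \qquad \xi_i,\xi_j \ge 0,

where \phi is the kernel feature map, a and R are the centre and radius of the description, and the penalties C_+ and C_- can be chosen proportional to the two misclassification costs. The abstract does not spell out the paper's two models, so this objective is an assumption used here only to illustrate how cost information enters the trade-off between the classes.

The sample selection step can likewise be illustrated with a small neighborhood heuristic in Python. The sketch below is not the paper's exact procedure; the function select_samples and the thresholds overlap_ratio and outlier_ratio are hypothetical, and only the general idea (keep boundary points, drop interior points and isolated outliers) is taken from the abstract.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_samples(X, y, k=10, overlap_ratio=0.1, outlier_ratio=0.9):
    """Illustrative neighborhood-based sample selection.

    Keeps edge/overlapping points, i.e. points whose k nearest neighbors
    contain a noticeable fraction of the other class, and drops outliers,
    i.e. points almost entirely surrounded by the other class.  Interior
    points are dropped too, since only boundary points can become support
    vectors of the class descriptions.
    X is an (n, d) array, y an (n,) array of +1/-1 labels.
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)            # the first neighbor is the point itself
    neighbor_labels = y[idx[:, 1:]]      # labels of the k true neighbors

    # Fraction of neighbors whose label disagrees with the point's own label.
    disagree = (neighbor_labels != y[:, None]).mean(axis=1)

    is_outlier = disagree >= outlier_ratio               # surrounded by the other class
    is_edge = (disagree >= overlap_ratio) & ~is_outlier  # near the class boundary
    return np.where(is_edge)[0]

# Usage: shrink a two-class training set before solving the SVDD problems with SMO.
# keep = select_samples(X_train, y_train)
# X_small, y_small = X_train[keep], y_train[keep]

Because each point is examined only through its k nearest neighbors, the retained set is typically much smaller than the full training set, which is where the reported savings in training time and storage come from.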


Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grants 60975026 and 61273275.

Author information

Corresponding author

Correspondence to Xiaodan Wang.

Cite this article

Zhao, Z., Wang, X. Cost-sensitive SVDD models based on a sample selection approach. Appl Intell 48, 4247–4266 (2018). https://doi.org/10.1007/s10489-018-1187-1
