Manifold proximal support vector machine for semi-supervised classification

Abstract

Recently, semi-supervised learning (SSL) has attracted a great deal of attention in the machine learning community. Under SSL, large amounts of unlabeled data are used together with the labeled data to construct a more reliable classifier. In this paper, we propose a novel manifold proximal support vector machine (MPSVM) for semi-supervised classification. By incorporating discriminant information into manifold regularization (MR), MPSVM not only uses MR terms to capture the underlying geometric structure of the data, but also exploits the maximum distance criterion to characterize the discrepancy between the two classes, which leads to a pair of eigenvalue problems. In addition, an efficient particle swarm optimization (PSO)-based model selection approach is suggested for MPSVM. Experimental results on several artificial and real-world datasets demonstrate that MPSVM performs significantly better than the supervised GEPSVM, and achieves performance comparable to or better than LapSVM and LapTSVM with better learning efficiency.
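
The following minimal MATLAB sketch is only meant to make the recipe described above concrete; it is not the exact MPSVM formulation, whose derivation is given in the full paper. It combines a within-class proximity term, a between-class distance term (the "difference" criterion), a small Tikhonov term and a graph-Laplacian manifold term, and reads a proximal plane off an eigenvector. All variable and parameter names (X_A, X_B, X_all, Lap, c1, c2) are assumptions made for this illustration.

    % Illustrative sketch only -- NOT the exact MPSVM formulation from the paper.
    % X_A, X_B : labeled samples of the two classes (one row per sample);
    % X_all    : labeled + unlabeled samples; Lap: graph Laplacian over X_all;
    % c1, c2   : trade-off parameters.
    function [w, b] = proximal_plane_sketch(X_A, X_B, X_all, Lap, c1, c2)
        H_A = [X_A, ones(size(X_A, 1), 1)];      % augmented class-A data
        H_B = [X_B, ones(size(X_B, 1), 1)];      % augmented class-B data
        H   = [X_all, ones(size(X_all, 1), 1)];  % augmented labeled + unlabeled data
        d   = size(H_A, 2);
        % closeness to class A, distance from class B, manifold smoothness, Tikhonov
        M = H_A'*H_A - c1*(H_B'*H_B) + c2*(H'*Lap*H) + 1e-6*eye(d);
        M = (M + M')/2;                          % symmetrize for a stable eig call
        [V, E]   = eig(M);
        [~, idx] = min(diag(E));                 % smallest eigenvalue -> best plane
        z = V(:, idx);
        w = z(1:end-1);                          % plane is w'*x + b = 0
        b = z(end);
    end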

Notes

  1. We use \(b_{1}\) and \(b_{2}\) instead of the \(-\gamma_{1}\) and \(-\gamma_{2}\) of the original paper [9], solely for unified notation.

  2. According to [10, 11], using the "difference" instead of the "ratio" does not change the geometric interpretation of GEPSVM, and it leads to standard eigenvalue problems, which are more efficient to solve than the generalized eigenvalue problems in GEPSVM. Moreover, comprehensive comparisons in [10, 11] show that the "difference" formulation achieves comparable or better performance than the "ratio" (GEPSVM), with less learning time (a toy illustration of this point appears after this list).

  3. A particle \(\boldsymbol{x}_{i}^{t}\) with higher classification accuracy produces a better fitness value (lower training error); that is, better fitness corresponds to a lower value (see the fitness sketch after this list).

  4. Matlab code is available at http://www.optimal-group.org/Resource/MPSVM.html.

  5. Classification accuracy is defined as \(\mathit{Acc} =\frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} +\mathrm{TN} + \mathrm{FN}}\), where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.

  6. We use the training time \(T_{\mathrm{train}}\) and the parameter search time \(T_{\mathrm{para}}\) to measure the computational efficiency (learning time) of each algorithm.

  7. Matlab is available at http://www.mathworks.com.

  8. The UCI datasets are available at http://archive.ics.uci.edu/ml.

  9. The USPS datasets are available at www.cs.nyu.edu/~roweis/data.html.
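
The self-contained MATLAB toy example below illustrates the point of note 2: the "ratio" criterion requires a generalized eigenvalue problem, whereas the "difference" criterion only needs a standard symmetric eigenvalue problem. The matrices G and H are synthetic stand-ins for the within-class and between-class terms of a GEPSVM-style objective, and delta is an assumed trade-off value; none of this is data or notation taken from the paper.

    % Toy illustration of note 2 (synthetic matrices; not data from the paper).
    % "Ratio" criterion      : min_z (z'*G*z)/(z'*H*z)     -> generalized eig problem
    % "Difference" criterion : min_z  z'*G*z - delta*z'*H*z -> standard eig problem
    rng(0);                                  % reproducible toy example
    A = randn(6, 4);  B = randn(8, 4);
    G = A'*A + 1e-6*eye(4);                  % stands in for the within-class term
    H = B'*B + 1e-6*eye(4);                  % stands in for the between-class term
    delta = 1;                               % trade-off parameter (assumed value)

    [Vr, Er] = eig(G, H);                    % generalized problem (ratio / GEPSVM)
    [~, ir]  = min(diag(Er));
    z_ratio  = Vr(:, ir);

    D = G - delta*H;  D = (D + D')/2;        % standard symmetric problem (difference)
    [Vd, Ed] = eig(D);
    [~, id]  = min(diag(Ed));
    z_diff   = Vd(:, id);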

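The sketch below shows how the PSO fitness mentioned in notes 3 and 5 can be computed: accuracy is obtained from TP, TN, FP and FN, and the fitness is the training error 1 − Acc, so a more accurate particle gets a lower (better) fitness. The ±1 label convention and the function name are assumptions made only for this sketch.

    % Fitness sketch for PSO model selection (notes 3 and 5).
    % y_true, y_pred: vectors of +1/-1 labels for the training set.
    function f = fitness_sketch(y_true, y_pred)
        TP  = sum(y_true ==  1 & y_pred ==  1);
        TN  = sum(y_true == -1 & y_pred == -1);
        FP  = sum(y_true == -1 & y_pred ==  1);
        FN  = sum(y_true ==  1 & y_pred == -1);
        Acc = (TP + TN) / (TP + FP + TN + FN);
        f   = 1 - Acc;                       % lower fitness = better particle
    end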
References

  1. Vapnik VN (1998) Statistical learning theory. Wiley, New York

  2. Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167

  3. Deng N, Tian Y, Zhang C (2013) Support vector machines: theory, algorithms and extensions. CRC Press, Philadelphia

  4. Hao P, Chiang J, Lin Y (2009) A new maximal-margin spherical-structured multi-class support vector machine. Appl Intell 30(2):98–111

  5. Zhang HH, Ahn J, Lin XD, Park C (2006) Gene selection using support vector machines with non-convex penalty. Bioinformatics 22(1):88–95

  6. Lee L, Wan C, Rajkumar R, Isa D (2012) An enhanced support vector machine classification framework by using Euclidean distance function for text document categorization. Appl Intell 37(1):80–99

  7. Lee L, Rajkumar R, Isa D (2012) Automatic folder allocation system using Bayesian-support vector machines hybrid classification approach. Appl Intell 36(2):295–307

  8. Wang C, You W (2013) Boosting-SVM: effective learning with reduced data dimension. Appl Intell 39(3):465–474

  9. Mangasarian OL, Wild EW (2006) Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Trans Pattern Anal Mach Intell 28(1):69–74

  10. Shao Y, Deng N, Chen W, Wang Z (2013) Improved generalized eigenvalue proximal support vector machine. IEEE Signal Process Lett 20(3):213–216

  11. Ye Q, Zhao C, Zhang H, Ye N (2011) Distance difference and linear programming nonparallel plane classifier. Expert Syst Appl 38(8):9425–9433

  12. Jayadeva, Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans Pattern Anal Mach Intell 29(5):905–910

  13. Shao Y, Zhang C, Wang X, Deng N (2011) Improvements on twin support vector machines. IEEE Trans Neural Netw 22(6):962–968

  14. Peng X (2011) TPMSVM: a novel twin parametric-margin support vector machine for pattern recognition. Pattern Recognit 44(10–11):2678–2692

  15. Qi Z, Tian Y, Shi Y (2013) Structural twin support vector machine for classification. Knowl-Based Syst 43:74–81

  16. Shao Y, Deng N, Yang Z, Chen W, Wang Z (2012) Probabilistic outputs for twin support vector machines. Knowl-Based Syst 33:145–151

  17. Shao Y, Deng N, Yang Z (2012) Least squares recursive projection twin support vector machine for classification. Pattern Recognit 45(6):2299–2307

  18. Qi Z, Tian Y, Shi Y (2012) Twin support vector machine with universum data. Neural Netw 36:112–119

  19. Qi Z, Tian Y, Shi Y (2013) Robust twin support vector machine for pattern classification. Pattern Recognit 46(1):305–316

  20. Ding S, Yu J, Qi B, Huang H (2013) An overview on twin support vector machines. Artif Intell Rev. doi:10.1007/s10462-012-9336-0

  21. Chapelle O, Schölkopf B, Zien A (2010) Semi-supervised learning. MIT Press, Massachusetts

  22. Zhu X, Goldberg AB (2009) Introduction to semi-supervised learning. Morgan & Claypool, San Rafael

  23. Tur G, Hakkani D, Schapire RE (2005) Combining active and semi-supervised learning for spoken language understanding. Speech Commun 45(2):171–186

  24. Guzella TS, Caminhas WM (2009) A review of machine learning approaches to spam filtering. Expert Syst Appl 36(7):10206–10222

  25. Zhang T, Liu S, Xu C, Lu H (2011) Boosted multi-class semi-supervised learning for human action recognition. Pattern Recognit 44(10–11):2334–2342

  26. Nguyen T, Ho T (2012) Detecting disease genes based on semi-supervised learning and protein protein interaction networks. Artif Intell Med 54(1):63–71

  27. Soares RGF, Chen H, Yao X (2012) Semisupervised classification with cluster regularization. IEEE Trans Neural Netw Learn Syst 23(11):1779–1792

  28. Fan M, Gu N, Qiao H, Zhang B (2011) Sparse regularization for semi-supervised classification. Pattern Recognit 44(8):1777–1784

  29. Belkin M, Niyogi P, Sindhwani V (2006) Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res 7:2399–2434

  30. Melacci S, Belkin M (2011) Laplacian support vector machines trained in the primal. J Mach Learn Res 12:1149–1184

  31. Qi Z, Tian Y, Shi Y (2012) Laplacian twin support vector machine for semi-supervised classification. Neural Netw 35:46–53

  32. Chen W, Shao Y, Ye Y (2013) Improving Lap-TSVM with successive overrelaxation and differential evolution. Proc Comput Sci 17:33–40

  33. Chen W, Shao Y, Hong N (2013) Laplacian smooth twin support vector machine for semi-supervised classification. Int J Mach Learn Cybern. doi:10.1007/s13042-013-0183-3

  34. Tikhonov AN, Arsenin VY (1979) Methods for solving ill-posed problems. Nauka, Moscow

  35. Parlett B (1998) The symmetric eigenvalue problem. SIAM, Philadelphia

  36. Lin SW, Ying KC, Chen SC, Lee ZJ (2008) Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst Appl 35(4):1817–1824

  37. Shao Y, Wang Z, Chen W, Deng N (2013) Least squares twin parametric-margin support vector machine for classification. Appl Intell 39(3):451–464

  38. Huang CL, Dun JF (2008) A distributed pso-svm hybrid system with feature selection and parameter optimization. Appl Soft Comput 8(4):1381–1391

  39. Das S, Suganthan PN (2011) Differential evolution: a survey of the state-of-the-art. IEEE Trans Evol Comput 15(1):4–31

  40. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: IEEE international conference on neural networks, vol 4, pp 1942–1948

  41. Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1(1):33–57

  42. Gan H, Sang N, Huang R, Tong X, Dan Z (2013) Using clustering analysis to improve semi-supervised classification. Neurocomputing 101:290–298

  43. Yang Z, Fang K, Kotz S (2007) On the student’s t-distribution and the t-statistic. J Multivar Anal 98(6):1293–1304

Acknowledgements

The authors would like to thank the editors and the anonymous reviewers, whose invaluable comments helped improve the presentation of this paper substantially. This work is supported by the National Natural Science Foundation of China (11201426, 61203133, 11301485 and 61304125), the Zhejiang Provincial Natural Science Foundation of China (LQ12A01020, LQ13F030010) and the Science and Technology Foundation of Department of Education of Zhejiang Province (Y201225179).

Author information

Corresponding author

Correspondence to Wei-Jie Chen.

About this article

Cite this article

Chen, WJ., Shao, YH., Xu, DK. et al. Manifold proximal support vector machine for semi-supervised classification. Appl Intell 40, 623–638 (2014). https://doi.org/10.1007/s10489-013-0491-z
