A risk degree-based safe semi-supervised learning algorithm

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Semi-supervised learning has attracted much attention in the machine learning field over the past decades, and a number of algorithms have been proposed to improve performance by exploiting unlabeled data. However, unlabeled data may hurt the performance of semi-supervised learning in some cases. It is therefore natural to design a reasonable strategy to exploit unlabeled data safely. To address this problem, we introduce a safe semi-supervised learning algorithm by analyzing the different characteristics of unlabeled data in supervised and semi-supervised learning. Our intuition is that unlabeled data may often be risky in the semi-supervised setting and that the risk degrees differ. Hence, we assign different risk degrees to unlabeled data, and the risk degrees serve as a sieve that determines how each unlabeled sample is exploited. Unlabeled data with high risk should be exploited by supervised learning, and the rest should be used for semi-supervised learning. In particular, we utilize kernel minimum squared error (KMSE) and Laplacian regularized KMSE for supervised and semi-supervised learning, respectively. Experimental results on several benchmark datasets show that the performance of our algorithm is never inferior to that of KMSE and indicate the effectiveness and efficiency of our algorithm.
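
The sieve idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not state how the risk degree is computed, so the disagreement-based `risk_sieve`, the threshold `tau`, the hyperparameters `gamma`, `lam`, `mu`, and all function names below are assumptions made for illustration. The supervised model is a simplified, bias-free regularized kernel least squares fit standing in for KMSE, and the semi-supervised model uses the standard Laplacian regularized least squares closed form standing in for Laplacian regularized KMSE.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kmse_predict(X_l, y_l, X_query, gamma=1.0, lam=1e-2):
    """Supervised stand-in for KMSE: regularized kernel least squares, labeled data only."""
    K = rbf_kernel(X_l, X_l, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_l)), y_l)
    return rbf_kernel(X_query, X_l, gamma) @ alpha


def lap_kmse_predict(X_l, y_l, X_u, gamma=1.0, lam=1e-2, mu=1e-1):
    """Semi-supervised stand-in: kernel least squares plus a graph-Laplacian penalty.

    Returns the model outputs on the unlabeled points X_u.
    """
    X = np.vstack([X_l, X_u])
    n, n_l = len(X), len(X_l)
    K = rbf_kernel(X, X, gamma)
    W = K.copy()
    np.fill_diagonal(W, 0.0)            # dense similarity graph without self-loops
    L = np.diag(W.sum(axis=1)) - W      # unnormalized graph Laplacian
    J = np.zeros((n, n))
    J[:n_l, :n_l] = np.eye(n_l)         # selects the labeled rows
    Y = np.concatenate([y_l, np.zeros(n - n_l)])
    alpha = np.linalg.solve(J @ K + lam * np.eye(n) + mu * L @ K, Y)
    return (K @ alpha)[n_l:]


def risk_sieve(f_sup, f_semi, tau=0.5):
    """Hypothetical risk degree: normalized disagreement between the two models."""
    gap = np.abs(f_sup - f_semi)
    risk = gap / (gap.max() + 1e-12)    # scaled into [0, 1]
    # High-risk points fall back to the supervised output;
    # the remaining points keep the semi-supervised output.
    return risk, np.where(risk > tau, f_sup, f_semi)
```

A typical call computes `f_sup = kmse_predict(X_l, y_l, X_u)` and `f_semi = lap_kmse_predict(X_l, y_l, X_u)` on the same unlabeled set, then passes both to `risk_sieve`. Falling back to the supervised output wherever the two models disagree strongly is one simple way to keep the combined predictor close to the supervised baseline on risky samples, which is the behavior the abstract describes.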

Acknowledgments

This work is supported by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY14F030023, and the Natural Science Foundation of China under Grant Nos. 61172134, 61201302 and 61372023.

Author information

Corresponding author

Correspondence to Haitao Gan.

About this article

Cite this article

Gan, H., Luo, Z., Meng, M. et al. A risk degree-based safe semi-supervised learning algorithm. Int. J. Mach. Learn. & Cyber. 7, 85–94 (2016). https://doi.org/10.1007/s13042-015-0416-8

  • DOI: https://doi.org/10.1007/s13042-015-0416-8
