A multiclass boosting algorithm to labeled and unlabeled data

  • Jafar Tanha
Original Article


In this article, we focus on semi-supervised learning, which is typically the task of learning from both labeled and unlabeled data. We consider in particular the multiclass semi-supervised classification problem. To solve it, we propose a new multiclass loss function using new codewords. The proposed loss function combines the classifier predictions, based on the labeled data, with the pairwise similarity between labeled and unlabeled examples. Its main goal is to minimize the inconsistency between classifier predictions and the pairwise similarity. The loss function consists of two terms: the first is the multiclass margin cost of the labeled data, and the second is a regularization term on the unlabeled data, which is used to minimize the cost of the pseudo-margin on unlabeled data. We then derive a new multiclass boosting algorithm, called GMSB, from the proposed risk function. The derived algorithm also uses a set of optimal similarity functions for a given dataset. The results of our experiments on a number of UCI and real-world biological, text, and image datasets show that GMSB outperforms state-of-the-art boosting methods for multiclass semi-supervised learning.
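The two-term structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual GMSB formulas, so everything below is an assumption chosen for illustration: the codewords (1 for the true class, -1/(K-1) elsewhere), the exponential cost, the 1/K scaling, the weight `lam`, and the function names `codewords` and `semi_supervised_risk` are hypothetical and are not taken from the paper.

```python
import numpy as np


def codewords(num_classes):
    """Hypothetical codewords: 1 at the class index, -1/(K-1) elsewhere."""
    K = num_classes
    Y = np.full((K, K), -1.0 / (K - 1))
    np.fill_diagonal(Y, 1.0)
    return Y


def semi_supervised_risk(F_lab, y_lab, F_unl, S, lam=1.0):
    """Illustrative two-term risk (an assumption, not the paper's formula).

    F_lab: (n_l, K) classifier outputs on labeled examples
    y_lab: (n_l,)   integer class labels
    F_unl: (n_u, K) classifier outputs on unlabeled examples
    S:     (n_l, n_u) pairwise similarities, labeled vs. unlabeled
    lam:   weight of the unlabeled regularization term
    """
    K = F_lab.shape[1]
    targets = codewords(K)[y_lab]  # (n_l, K) codeword of each labeled example

    # Term 1: exponential cost of the multiclass margin on labeled data.
    margins = np.sum(F_lab * targets, axis=1) / K
    labeled_cost = np.exp(-margins).sum()

    # Term 2: pseudo-margin of each unlabeled example with respect to the
    # label of each labeled example, weighted by their similarity.
    pseudo_margins = targets @ F_unl.T / K  # (n_l, n_u)
    unlabeled_cost = (S * np.exp(-pseudo_margins)).sum()

    return labeled_cost + lam * unlabeled_cost


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    F_lab = rng.normal(size=(4, 3))
    F_unl = rng.normal(size=(5, 3))
    y_lab = np.array([0, 1, 2, 1])
    S = rng.uniform(size=(4, 5))
    print(semi_supervised_risk(F_lab, y_lab, F_unl, S, lam=0.5))
```

Under these assumptions, the second term is small when an unlabeled example that is highly similar to a labeled one receives a prediction aligned with that example's codeword, which is one way to express the consistency between predictions and pairwise similarity that the abstract describes.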


Keywords: Multiclass classification; Semi-supervised learning; Similarity function; Boosting



This research was partially supported by a grant from IPM (No. CS1398-4-224). We also thank the anonymous reviewers for their valuable comments.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Electrical and Computer Engineering Department, University of Tabriz, Tabriz, Iran
  2. School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
