Self-taught support vector machines

  • Parvin Razzaghi
Regular Paper


This paper proposes a new approach to self-taught learning, in which classification on a target task with limited labeled data is improved by exploiting a large amount of unlabeled source data. The target and source data may be drawn from different distributions. Previous approaches assume covariate shift: the marginal distributions p(x) change across domains while the conditional distributions p(y|x) remain the same. We instead propose a new objective function that simultaneously learns a common space ℑ(.) in which the conditional distributions p(ℑ(x)|y) remain the same across domains, and learns robust SVM classifiers for the target task using both source and target data in the new representation. The hidden labels of the source data are therefore also incorporated into the objective function. We applied the proposed approach to the Caltech-256 and MSRC + LMO datasets and compared its performance with that of the available competing methods. In these experiments, our method outperformed the existing algorithms.
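
The two distributional assumptions contrasted in the abstract can be written side by side. This is only a notational sketch: the subscripts s and t for the source and target domains are assumed here for illustration and do not appear in the abstract itself.

```latex
% Covariate shift (assumed by previous approaches):
% the marginals differ, the label conditionals agree.
p_s(x) \neq p_t(x), \qquad p_s(y \mid x) = p_t(y \mid x)

% Assumption of the proposed approach, stated in the learned common space \Im(\cdot):
% the class-conditional distributions of the mapped data agree across domains.
p_s\bigl(\Im(x) \mid y\bigr) = p_t\bigl(\Im(x) \mid y\bigr)
```

Because the second condition is conditioned on y, matching it requires label information for the source data, which is consistent with the abstract's remark that the hidden labels of the source data enter the objective function.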


Self-taught learning, Domain adaption, Inductive transfer learning, Self-taught SVM



This research was in part supported by a grant from the Institute for Research in Fundamental Sciences (IPM) (Grant number CS1395-4-68).

Supplementary material

Supplementary material 1: 10115_2018_1218_MOESM1_ESM.rar (RAR, 1984 kb)



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
  2. School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
