Effective Multiclass Transfer for Hypothesis Transfer Learning

  • Shuang Ao
  • Xiang Li
  • Charles X. Ling
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10235)


In this paper, we investigate the visual domain adaptation problem under the setting of Hypothesis Transfer Learning (HTL), where only the source models, not the source data, are accessible. Previous studies of HTL are limited either to leveraging knowledge from a specific type of source classifier or by low transfer efficiency on a small training set. In this paper, we address two important issues: the effectiveness of the transfer on a small target training set and the compatibility of the transfer model with real-world HTL problems. To solve these two issues, we propose Effective Multiclass Transfer Learning (EMTLe). We demonstrate that EMTLe, which uses the predictions of the source models as the transferable knowledge, can exploit the knowledge of different types of source classifiers. We use transfer parameters to weigh the importance of each source model's prediction as an auxiliary bias. We then estimate the transfer parameters with bi-level optimization and demonstrate that our novel objective function can effectively recover the optimal values. Empirical results show that EMTLe can effectively exploit source knowledge and outperform other HTL baselines when the target training set is small.
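The idea sketched in the abstract can be illustrated in a few lines. Below is a minimal toy sketch, not the authors' EMTLe implementation: the target scorer is f(x) = Wx + beta * s(x), where s(x) is a black-box source model's class-score vector and beta is the transfer parameter weighting it as an auxiliary bias. The bi-level structure is mimicked crudely by a grid search: the inner problem fits W for a fixed beta (closed-form ridge regression on one-hot labels), and the outer problem picks the beta with the best held-out accuracy. All data, dimensions, and the source model here are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, dim = 3, 5
true_W = rng.normal(size=(n_classes, dim))
src_W = true_W + 0.4 * rng.normal(size=true_W.shape)  # imperfect source hypothesis

def make_data(n):
    X = rng.normal(size=(n, dim))
    return X, (X @ true_W.T).argmax(axis=1)

def source_scores(X):
    # The only access we have to the source domain: the model's predictions.
    return X @ src_W.T

def fit_inner(X, y, beta, lam=1.0):
    # Inner problem: ridge-fit the target correction W given the auxiliary
    # bias beta * s(x), i.e. regress one-hot labels minus the biased scores.
    Y = np.eye(n_classes)[y] - beta * source_scores(X)
    return np.linalg.solve(X.T @ X + lam * np.eye(dim), X.T @ Y).T

def accuracy(W, beta, X, y):
    scores = X @ W.T + beta * source_scores(X)
    return (scores.argmax(axis=1) == y).mean()

X_tr, y_tr = make_data(20)    # deliberately small target training set
X_va, y_va = make_data(200)

# Outer problem: choose the transfer parameter on held-out target data.
betas = np.linspace(0.0, 1.0, 11)
accs = [accuracy(fit_inner(X_tr, y_tr, b), b, X_va, y_va) for b in betas]
best_beta = betas[int(np.argmax(accs))]
print(f"best beta = {best_beta:.1f}, val acc = {max(accs):.3f}")
```

Since beta = 0 (no transfer) is in the search grid, the selected beta can never do worse than training from the target data alone on the validation split; the paper replaces this grid search with a principled bi-level objective and handles multiple source models with one transfer parameter each.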



We thank the anonymous reviewers for their valuable comments, which helped improve this paper. This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Western University, London, Canada
