Multi-Task Multi-Sample Learning

  • Yusuf Aytar
  • Andrew Zisserman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8927)


In the exemplar SVM (E-SVM) approach of Malisiewicz et al., ICCV 2011, an ensemble of SVMs is learnt, with each SVM trained independently using only a single positive sample and all negative samples for the class. In this paper we develop a multi-sample learning (MSL) model which enables joint regularization of the E-SVMs without any additional cost over the original ensemble learning. The advantage of the MSL model is that the degree of sharing between positive samples can be controlled, such that the classification performance of either an ensemble of E-SVMs (sample independence) or a standard SVM (all positive samples used) is reproduced. However, between these two limits the model can exceed the performance of either. This MSL framework is inspired by multi-task learning approaches.
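The abstract does not state the MSL objective, so the following is only a minimal sketch under a stated assumption: it borrows the shared-plus-specific decomposition from regularized multi-task learning (Evgeniou and Pontil, KDD 2004), where the classifier for positive sample i is w_i = w0 + v_i. All names here (`msl_objective`, `lam_shared`, `lam_specific`) are hypothetical and are not taken from the paper. Under this assumed form, a very large `lam_specific` pushes every v_i to zero, so all samples share one classifier (the standard-SVM limit), while a very large `lam_shared` pushes w0 to zero, so each classifier is fitted independently (the E-SVM limit).

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def hinge(margin: float) -> float:
    """Standard hinge loss max(0, 1 - margin)."""
    return max(0.0, 1.0 - margin)


def sample_classifiers(w0: Vector, vs: Sequence[Vector]) -> List[List[float]]:
    """Assumed decomposition: classifier i is the shared part plus its own offset."""
    return [[a + b for a, b in zip(w0, v)] for v in vs]


def msl_objective(
    w0: Vector,
    vs: Sequence[Vector],
    positives: Sequence[Vector],
    negatives: Sequence[Vector],
    lam_shared: float,
    lam_specific: float,
) -> float:
    """Value of the assumed joint objective (hypothetical, not the paper's).

    Regularizer: lam_shared * ||w0||^2 + lam_specific * sum_i ||v_i||^2.
    Loss: each classifier w_i sees its own positive sample and all negatives,
    as in the exemplar SVM setup described in the abstract.
    """
    if len(vs) != len(positives):
        raise ValueError("need exactly one offset vector per positive sample")
    reg = lam_shared * dot(w0, w0) + lam_specific * sum(dot(v, v) for v in vs)
    loss = 0.0
    for w, x_pos in zip(sample_classifiers(w0, vs), positives):
        loss += hinge(dot(w, x_pos))  # positive sample, label +1
        loss += sum(hinge(-dot(w, x_neg)) for x_neg in negatives)  # label -1
    return reg + loss


# Toy data: two positive samples, one negative sample, two features.
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, -1.0]]

# Fully shared solution: every sample uses w0, offsets are zero.
shared = msl_objective([1.0, 1.0], [[0.0, 0.0], [0.0, 0.0]],
                       positives, negatives, lam_shared=0.5, lam_specific=10.0)

# Fully independent solution: no shared part, one offset per sample.
independent = msl_objective([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                            positives, negatives, lam_shared=0.5, lam_specific=10.0)
```

Both regularizer terms and the hinge loss are convex in (w0, v_1, ..., v_n), so this assumed objective is convex, in line with the convexity claim in the abstract. The ratio of the two weights plays the role of the sharing control: with the weights above, the shared solution is cheaper than the independent one.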

We also introduce a multi-task extension to MSL and develop a multi-task multi-sample learning (MTMSL) model that encourages both sharing between classes and sharing between sample specific classifiers within each class. Both MSL and MTMSL have convex objective functions.

The MSL and MTMSL models are evaluated on standard benchmarks including the MNIST, ‘Animals with attributes’ and the PASCAL VOC 2007 datasets. They achieve a significant performance improvement over both a standard SVM and an ensemble of E-SVMs.


Keywords: Multi-task learning, Exemplar SVMs


  1. Amit, Y., Fink, M., Srebro, N., Ullman, S.: Uncovering shared structures in multiclass classification. In: ICML, pp. 17–24 (2007)
  2. Ando, R.K., Zhang, T.: A framework for learning predictive structures from multiple tasks and unlabeled data. Journal of Machine Learning Research 6, 1817–1853 (2005)
  3. Argyriou, A., Evgeniou, T., Pontil, M.: Multi-task feature learning. In: NIPS, pp. 41–48 (2006)
  4. Argyriou, A., Evgeniou, T., Pontil, M.: Convex multi-task feature learning. Machine Learning 73(3), 243–272 (2008)
  5. Caruana, R.: Multitask learning. Machine Learning 28(1), 41–75 (1997)
  6. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proc. CVPR, vol. 2, pp. 886–893 (2005)
  7. Daumé III, H.: Frustratingly easy domain adaptation. CoRR (2009)
  8. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results (2007)
  9. Evgeniou, T., Micchelli, C.A., Pontil, M.: Learning multiple tasks with kernel methods. Journal of Machine Learning Research 6, 615–637 (2005)
  10. Evgeniou, T., Pontil, M.: Regularized multi-task learning. In: KDD 2004: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 109–117. ACM, New York (2004)
  11. Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: Proc. CVPR (2008)
  12. Fornoni, M., Caputo, B., Orabona, F.: Multiclass latent locally linear support vector machines. In: ACML, pp. 229–244 (2013)
  13. Fu, Z., Robles-Kelly, A., Zhou, J.: Mixing linear SVMs for nonlinear classification. IEEE Transactions on Neural Networks 21(12), 1963–1975 (2010)
  14. Jacob, L., Bach, F., Vert, J.: Clustered multi-task learning: A convex formulation. In: NIPS, pp. 745–752 (2008)
  15. Johnson, S., Everingham, M.: Learning effective human pose estimation from inaccurate annotation. In: Proc. CVPR, pp. 1465–1472 (2011)
  16. Kang, Z., Grauman, K., Sha, F.: Learning with whom to share in multi-task feature learning. In: ICML, pp. 521–528 (2011)
  17. Kato, T., Kashima, H., Sugiyama, M., Asai, K.: Multi-task learning via conic programming. In: NIPS (2007)
  18. Khosla, A., Zhou, T., Malisiewicz, T., Efros, A.A., Torralba, A.: Undoing the damage of dataset bias. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part I. LNCS, vol. 7572, pp. 158–171. Springer, Heidelberg (2012)
  19. Ladicky, L., Torr, P.H.S.: Locally linear support vector machines. In: Proc. ICML, pp. 985–992 (2011)
  20. Lampert, C.H., Blaschko, M.B.: Structured prediction by joint kernel support estimation. Machine Learning (2009)
  21. Lee, S., Chatalbashev, V., Vickrey, D., Koller, D.: Learning a meta-level prior for feature relevance from multiple related tasks. In: ICML 2007: Proceedings of the 24th International Conference on Machine Learning, pp. 489–496. ACM, New York (2007)
  22. Liu, J., Sun, J., Shum, H.: Paint selection. In: Proc. ACM SIGGRAPH (2009)
  23. Malisiewicz, T., Gupta, A., Efros, A.A.: Ensemble of exemplar-SVMs for object detection and beyond. In: Proc. ICCV (2011)
  24. Obozinski, G., Taskar, B., Jordan, M.I.: Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing 20(2), 231–252 (2010)
  25. Parameswaran, S., Weinberger, K.Q.: Large margin multi-task metric learning. In: NIPS, pp. 1867–1875 (2010)
  26. Thrun, S.: Learning to learn: Introduction. Kluwer Academic Publishers (1996)
  27. Yu, K., Tresp, V., Schwaighofer, A.: Learning Gaussian processes from multiple tasks. In: ICML, pp. 1012–1019 (2005)
  28. Zhang, Y., Yeung, D.: A convex formulation for learning task relationships in multi-task learning. CoRR abs/1203.3536 (2012)
  29. Zhou, J., Chen, J., Ye, J.: Clustered multi-task learning via alternating structure optimization. In: NIPS, pp. 702–710 (2011)
  30. Zhu, X., Vondrick, C., Ramanan, D., Fowlkes, C.: Do we need more training data or better models for object detection? In: Proc. BMVC, pp. 445–458 (2012)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Visual Geometry Group, Department of Engineering Science, University of Oxford, Oxford, UK
