Efficient Training of Graph-Regularized Multitask SVMs

  • Christian Widmer
  • Marius Kloft
  • Nico Görnitz
  • Gunnar Rätsch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7523)


We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus become inapplicable when the number of training examples n is large (typically they can handle at most n ≈ 20,000 points, even for just a few tasks). In this paper, we present a primal optimization criterion that allows for general loss functions, and we derive its dual representation. Building on the work of Hsieh et al. [1,2], we derive an algorithm for optimizing the large-margin objective and prove its convergence. Our computational experiments show a speedup of up to three orders of magnitude over LibSVM and SVMLight on several standard benchmarks as well as on challenging data sets from the application domain of computational biology. Combining our optimization methodology with the COFFIN large-scale learning framework [3], we are able to train a multi-task SVM using over 1,000,000 training points stemming from 4 different tasks. An efficient C++ implementation of our algorithm is made publicly available as part of the SHOGUN machine learning toolbox [4].
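To make the setting concrete, the following is a minimal sketch (not the authors' implementation) combining the two ingredients named in the abstract: a graph-regularized multi-task penalty of the form μ Σ_{s,t} A_{st} ||w_s − w_t||² over task weight vectors w_t, and the dual coordinate descent method of Hsieh et al. [1]. For simplicity the sketch realizes the coupling through the equivalent multi-task-kernel feature map and therefore does not reproduce the paper's large-scale speedup; all names (`graph_mt_svm`, `mu`, `A`, etc.) and defaults are illustrative assumptions.

```python
import numpy as np

def graph_mt_svm(X, y, tasks, A, C=1.0, mu=1.0, n_epochs=50, seed=0):
    """Graph-regularized multi-task SVM, trained by dual coordinate descent.

    X: (n, d) features; y: labels in {-1, +1}; tasks: task index per example;
    A: (T, T) symmetric task-similarity (adjacency) matrix.
    Illustrative sketch only -- parameter names and defaults are assumptions.
    """
    T = A.shape[0]
    n, d = X.shape
    # Graph Laplacian of the task graph; I + mu*L is the inverse task coupling.
    L = np.diag(A.sum(axis=1)) - A
    M = np.linalg.inv(np.eye(T) + mu * L)   # multi-task coupling matrix
    B = np.linalg.cholesky(M)               # B @ B.T == M
    # Joint feature map phi(x, t) = (B[t,0]*x, ..., B[t,T-1]*x), so that
    # phi(x,s) . phi(x',t) = M[s,t] * (x . x') -- a multi-task kernel.
    Phi = np.zeros((n, T * d))
    for i in range(n):
        for s in range(T):
            Phi[i, s*d:(s+1)*d] = B[tasks[i], s] * X[i]
    # Dual coordinate descent for the L1-loss SVM (Hsieh et al., 2008):
    # min 0.5 a'Qa - e'a, 0 <= a_i <= C, with Q_ij = y_i y_j phi_i . phi_j.
    alpha = np.zeros(n)
    w = np.zeros(T * d)
    Qii = (Phi ** 2).sum(axis=1)
    rng = np.random.default_rng(seed)
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            if Qii[i] == 0.0:
                continue
            G = y[i] * (Phi[i] @ w) - 1.0      # partial gradient wrt alpha_i
            a_new = min(max(alpha[i] - G / Qii[i], 0.0), C)
            w += (a_new - alpha[i]) * y[i] * Phi[i]
            alpha[i] = a_new
    # Recover per-task primal weights: w_t = sum_s B[t, s] * w_block_s.
    return B @ w.reshape(T, d)

def predict(W, X, tasks):
    """Predict labels for examples with known task membership."""
    return np.sign((W[tasks] * X).sum(axis=1))
```

Each coordinate update touches a single dual variable and maintains the primal vector w incrementally, which is what makes the dual coordinate descent family attractive at scale; the paper's contribution is to carry this efficiency over to the graph-regularized multi-task objective without materializing the MTK.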


  1. Hsieh, C., Chang, K., Lin, C., Keerthi, S., Sundararajan, S.: A dual coordinate descent method for large-scale linear SVM. In: Proceedings of the 25th International Conference on Machine Learning, pp. 408–415 (2008)
  2. Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9, 1871–1874 (2008)
  3. Sonnenburg, S., Franc, V.: COFFIN: A computational framework for linear SVMs. In: Fürnkranz, J., Joachims, T. (eds.) ICML, pp. 999–1006. Omnipress (2010)
  4. Sonnenburg, S., Rätsch, G., Henschel, S., Widmer, C., Behr, J., Zien, A., de Bona, F., Binder, A., Gehl, C., Franc, V.: The SHOGUN Machine Learning Toolbox. Journal of Machine Learning Research 11, 1799–1802 (2010)
  5. Pan, S., Yang, Q.: A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 1345–1359 (2009)
  6. Schweikert, G., Widmer, C., Schölkopf, B., Rätsch, G.: An empirical analysis of domain adaptation algorithms for genomic sequence analysis. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 21, pp. 1433–1440 (2008)
  7. Görnitz, N., Widmer, C., Zeller, G., Kahles, A., Sonnenburg, S., Rätsch, G.: Hierarchical multitask structured output learning for large-scale sequence segmentation. In: Advances in Neural Information Processing Systems 24 (2011)
  8. Collobert, R., Weston, J.: A unified architecture for natural language processing: deep neural networks with multitask learning. In: Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) ICML. ACM International Conference Proceeding Series, vol. 307, pp. 160–167. ACM (2008)
  9. Jiang, Y.G., Wang, J., Chang, S.F., Ngo, C.W.: Domain adaptive semantic diffusion for large scale context-based video annotation. In: ICCV, pp. 1420–1427. IEEE (2009)
  10. Samek, W., Binder, A., Kawanabe, M.: Multi-task learning via non-sparse multiple kernel learning. In: Real, P., Diaz-Pernil, D., Molina-Abril, H., Berciano, A., Kropatsch, W. (eds.) CAIP 2011, Part I. LNCS, vol. 6854, pp. 335–342. Springer, Heidelberg (2011)
  11. Torralba, A., Murphy, K.P., Freeman, W.T.: Sharing visual features for multiclass and multiview object detection. IEEE Trans. Pattern Anal. Mach. Intell. 29, 854–869 (2007)
  12. Evgeniou, T., Pontil, M.: Regularized multi-task learning. In: International Conference on Knowledge Discovery and Data Mining, pp. 109–117 (2004)
  13. Agarwal, A., Daumé III, H., Gerber, S.: Learning multiple tasks using manifold regularization. In: Advances in Neural Information Processing Systems 23 (2010)
  14. Evgeniou, T., Micchelli, C., Pontil, M.: Learning multiple tasks with kernel methods. Journal of Machine Learning Research 6(1), 615–637 (2005)
  15. Cortes, C., Vapnik, V.: Support vector networks. Machine Learning 20, 273–297 (1995)
  16. Müller, K.R., Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B.: An introduction to kernel-based learning algorithms. IEEE Neural Networks 12(2), 181–201 (2001)
  17. Joachims, T.: Making large-scale SVM learning practical. In: Schölkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods — Support Vector Learning, pp. 169–184. MIT Press, Cambridge (1999)
  18. Rifkin, R.M., Lippert, R.A.: Value regularization and Fenchel duality. J. Mach. Learn. Res. 8, 441–479 (2007)
  19. Bertsekas, D.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
  20. Xue, Y., Liao, X., Carin, L., Krishnapuram, B.: Multi-task learning for classification with Dirichlet process priors. J. Mach. Learn. Res. 8, 35–63 (2007)
  21. Sonnenburg, S., Rätsch, G., Rieck, K.: Large scale learning with string kernels. In: Bottou, L., Chapelle, O., DeCoste, D., Weston, J. (eds.) Large Scale Kernel Machines, pp. 73–103. MIT Press, Cambridge (2007)
  22. The Wellcome Trust Case Control Consortium: Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145), 661–678 (2007)
  23. Kloft, M., Brefeld, U., Sonnenburg, S., Zien, A.: Lp-norm multiple kernel learning. Journal of Machine Learning Research 12, 953–997 (2011)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Christian Widmer (1, 2)
  • Marius Kloft (3)
  • Nico Görnitz (3)
  • Gunnar Rätsch (1, 2)
  1. Memorial Sloan-Kettering Cancer Center, New York, USA
  2. FML, Max-Planck Society, Tübingen, Germany
  3. Machine Learning Laboratory, TU Berlin, Germany
