Multi-task Learning for Computational Biology: Overview and Outlook

Widmer, Christian; Kloft, Marius; Rätsch, Gunnar

doi:10.1007/978-3-642-41136-6_12

Abstract

We present an overview of the field of regularization-based multi-task learning, which is a relatively recent offshoot of statistical machine learning. We discuss the foundations as well as some of the recent advances of the field, including strategies for learning or refining the measure of task relatedness. We present an example from the application domain of Computational Biology, where multi-task learning has been successfully applied, and give some practical guidelines for assessing a priori, for a given dataset, whether or not multi-task learning is likely to pay off.
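
As an illustration of what "regularization-based" means here, the following is a minimal sketch and not the chapter's own formulation. It assumes a squared loss (chosen only because it gives a closed-form solution; the literature on this topic often uses SVM-style losses instead), a symmetric task-similarity matrix A supplied by the user, and two hypothetical hyperparameters, lam and mu. Each task t gets its own weight vector w_t, and the penalty mu * sum_{s<t} A[s,t] * ||w_s - w_t||^2 pulls the weight vectors of related tasks toward each other. The function name and all parameter names are assumptions made for this example.

```python
import numpy as np

def fit_graph_regularized_mtl(Xs, ys, A, lam=1.0, mu=1.0):
    """Jointly fit one linear model per task (illustrative sketch).

    Minimizes over w_1..w_T:
        sum_t ||X_t w_t - y_t||^2
        + lam * sum_t ||w_t||^2
        + mu  * sum_{s<t} A[s, t] * ||w_s - w_t||^2
    Xs, ys: lists of per-task design matrices and targets.
    A: symmetric (T x T) task-similarity matrix (assumed given).
    Returns an array of shape (T, d); row t is w_t.
    """
    T = len(Xs)
    d = Xs[0].shape[1]
    A = np.asarray(A, dtype=float)
    # Graph Laplacian of the task graph: the coupling penalty equals
    # w^T (L kron I_d) w for the stacked vector w = [w_1; ...; w_T].
    L = np.diag(A.sum(axis=1)) - A
    M = mu * np.kron(L, np.eye(d)) + lam * np.eye(T * d)
    b = np.zeros(T * d)
    for t, (X, y) in enumerate(zip(Xs, ys)):
        block = slice(t * d, (t + 1) * d)
        M[block, block] += X.T @ X   # per-task data term
        b[block] = X.T @ y
    # Setting the gradient to zero gives the linear system M w = b.
    return np.linalg.solve(M, b).reshape(T, d)

# Toy usage with two tasks that are declared similar (A[0, 1] = 1).
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(5, 3)) for _ in range(2)]
ys = [rng.normal(size=5) for _ in range(2)]
A = np.array([[0.0, 1.0], [1.0, 0.0]])
W = fit_graph_regularized_mtl(Xs, ys, A, lam=1.0, mu=10.0)
print(W.shape)
```

With mu = 0 the tasks decouple and each row reduces to an ordinary ridge regression on that task alone; as mu grows, the weight vectors of similar tasks are forced together, which approaches pooling their data. Refining the measure of task relatedness, as mentioned in the abstract, would correspond to learning A rather than fixing it in advance.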

Acknowledgements

We thank Klaus-Robert Müller and Mehryar Mohri for inspiring and helpful discussions. This work was supported by the German Research Foundation (DFG) under MU 987/6-1 and RA 1894/1-1 as well as by the European Community’s 7th Framework Programme under the PASCAL2 Network of Excellence (ICT-216886). Marius Kloft acknowledges a postdoctoral fellowship by the German Research Foundation (DFG).

Author information

Authors and Affiliations

Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 415 E 68th Street, New York, NY, 10065, USA
Christian Widmer, Marius Kloft & Gunnar Rätsch
Machine Learning Group, Technische Universität Berlin, Franklinstr. 28/29, 10587, Berlin, Germany
Christian Widmer
Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY, 10012, USA
Marius Kloft

Corresponding author

Correspondence to Christian Widmer.

Editor information

Editors and Affiliations

Max Planck Institute for Intelligent Systems, Tübingen, Germany
Bernhard Schölkopf
Dept. of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
Zhiyuan Luo
Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom
Vladimir Vovk

About this chapter

Cite this chapter

Widmer, C., Kloft, M., Rätsch, G. (2013). Multi-task Learning for Computational Biology: Overview and Outlook. In: Schölkopf, B., Luo, Z., Vovk, V. (eds) Empirical Inference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41136-6_12

DOI: https://doi.org/10.1007/978-3-642-41136-6_12
Published: 09 October 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41135-9
Online ISBN: 978-3-642-41136-6
eBook Packages: Computer Science, Computer Science (R0)