Semi-Supervised Multi-Task Regression

  • Yu Zhang
  • Dit-Yan Yeung
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5782)

Abstract

Labeled data are needed for many machine learning applications, but in some applications they are scarce. Semi-supervised learning and multi-task learning are two approaches that have been proposed to alleviate this problem. In this paper, we seek to integrate these two approaches for regression applications. We first propose a new supervised multi-task regression method called SMTR, which is based on Gaussian processes (GP) with the assumption that the kernel parameters for all tasks share a common prior. We then incorporate unlabeled data into SMTR by changing the kernel function of the GP prior to a data-dependent kernel function, resulting in a semi-supervised extension of SMTR, called SSMTR. Moreover, we incorporate pairwise information into SSMTR to further boost the learning performance for applications in which such information is available. Experiments conducted on two commonly used data sets for multi-task regression demonstrate the effectiveness of our methods.
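To make the abstract's construction concrete, below is a minimal, illustrative sketch of the data-dependent kernel idea it describes: an RBF kernel over all inputs (labeled and unlabeled) is deformed by a graph Laplacian, in the spirit of deformed kernels from the semi-supervised learning literature (e.g., Sindhwani, Niyogi, and Belkin), and the deformed kernel is then used in standard GP regression. This is not the authors' SMTR/SSMTR implementation; the function names, hyperparameter values, and toy data are assumptions for demonstration only, and the multi-task ingredient (a common prior tying the kernel parameters of all tasks) is omitted.

# Illustrative sketch only: single-task GP regression with a kernel
# deformed by a graph Laplacian built over labeled + unlabeled inputs.
# NOT the paper's SMTR/SSMTR code; all names and values are assumptions.
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * length_scale**2))

def graph_laplacian(X, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W over all inputs."""
    W = rbf_kernel(X, X, sigma)
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(1)) - W

def deformed_kernel(X_all, lam=1.0, length_scale=1.0):
    """Data-dependent kernel K~ = K - K (I/lam + L K)^{-1} L K:
    the base kernel warped by the geometry of the unlabeled data."""
    K = rbf_kernel(X_all, X_all, length_scale)
    L = graph_laplacian(X_all)
    n = len(X_all)
    M = np.linalg.solve(np.eye(n) / lam + L @ K, L @ K)
    return K - K @ M

def gp_predict(K_tilde, labeled_idx, test_idx, y, noise=0.1):
    """Standard GP posterior mean at test_idx given labels at labeled_idx."""
    K_ll = K_tilde[np.ix_(labeled_idx, labeled_idx)]
    K_tl = K_tilde[np.ix_(test_idx, labeled_idx)]
    alpha = np.linalg.solve(K_ll + noise**2 * np.eye(len(labeled_idx)), y)
    return K_tl @ alpha

# Toy usage: 1-D regression with 5 labeled points and 55 unlabeled ones.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y_full = np.sin(X[:, 0])
labeled = np.arange(5)
test = np.arange(40, 60)
K_tilde = deformed_kernel(X, lam=1.0, length_scale=0.5)
y_hat = gp_predict(K_tilde, labeled, test, y_full[labeled])
print("mean abs error:", np.abs(y_hat - y_full[test]).mean())

In the full SSMTR model, a deformed kernel of this flavor would replace the ordinary kernel inside each task's GP prior, with the kernel parameters of all tasks tied through a shared prior; pairwise information, where available, would enter as additional constraints relating the outputs.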

Keywords

Gaussian Process, Unlabeled Data, Kernel Parameter, Pairwise Constraint, Machine Learning Research


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Yu Zhang (1)
  • Dit-Yan Yeung (1)

  1. Hong Kong University of Science and Technology, Hong Kong
