Transfer Learning beyond Text Classification

  • Qiang Yang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5828)

Abstract

Transfer learning is a new machine learning and data mining framework that allows the training and test data to come from different distributions or feature spaces. Many novel applications of machine learning and data mining require transfer learning. While much has been done on transfer learning in text classification and reinforcement learning, documented success stories of novel applications of transfer learning in other areas have been lacking. In this invited article, I argue that transfer learning is in fact ubiquitous in real-world applications. I illustrate this point through an overview of a broad spectrum of applications of transfer learning, ranging from collaborative filtering to sensor-based location estimation and logical action model learning for AI planning. I also discuss some potential future directions of transfer learning.

Keywords

Action Model, Target Domain, Rating Matrix, Collaborative Filter, Transfer Learning
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Qiang Yang
  1. Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong
