Learning and Domain Adaptation

  • Yishay Mansour
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5808)


Domain adaptation is a fundamental learning problem in which one wishes to use labeled data from one or several source domains to learn a hypothesis that performs well on a different, yet related, target domain for which no labeled data is available. Generalizing across domains in this way is a significant challenge for many machine learning applications and arises in a variety of natural settings, including NLP tasks (document classification, sentiment analysis, etc.), speech recognition (adaptation to speaker, noise, or environment), and face recognition (different lighting conditions, different population composition).

The learning theory community has only recently started to analyze domain adaptation problems. In this talk, I will survey some recent theoretical models and results on domain adaptation.

This talk is based on joint work with Mehryar Mohri and Afshin Rostamizadeh.



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Yishay Mansour, Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
