Domain Adaptation for Document Classification by Alternately Using Semi-supervised Learning and Feature Weighted Learning

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 781)

Abstract

In this paper, we propose a new unsupervised domain adaptation method for document classification, addressing the setting in which the source and target domains do not differ significantly and no labeled data are available in the target domain. In this setting, conventional semi-supervised learning is applicable, so we use the naive Bayes-based expectation-maximization method (NBEM), which is highly effective for document classification. However, NBEM does not exploit the difference between the source domain and the target domain. We therefore combine NBEM with a feature weighted learning method for domain adaptation, referred to as “self-training feature weight” (STFW). Our proposed method alternately applies NBEM and STFW to gradually improve document classification precision on the target domain, and it significantly outperforms conventional unsupervised domain adaptation methods.
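The abstract only names the two components of the method. As a rough illustration of the NBEM half, the sketch below (not the authors' code) shows a standard naive Bayes EM loop in the style of Nigam et al. (2000), trained on labeled source documents and unlabeled target documents. The inputs X_src, y_src, and X_tgt are assumed to be dense bag-of-words matrices and label arrays; the STFW step is only named in the abstract, so it is left as a comment rather than guessed at.

```python
# Minimal sketch of semi-supervised naive Bayes EM (NBEM), assuming dense
# bag-of-words matrices X_src (labeled source) and X_tgt (unlabeled target).
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def nbem(X_src, y_src, X_tgt, n_iter=10):
    """Naive Bayes EM: E-step soft-labels the unlabeled target documents,
    M-step refits the classifier on source data plus soft-labeled target data."""
    clf = MultinomialNB().fit(X_src, y_src)   # initialize from labeled source data
    classes = clf.classes_
    for _ in range(n_iter):
        # E-step: posterior class probabilities for unlabeled target documents.
        post = clf.predict_proba(X_tgt)
        # M-step: MultinomialNB has no soft-label interface, so each target
        # document is repeated once per class with its posterior as the sample weight.
        X_all = np.vstack([X_src] + [X_tgt for _ in classes])
        y_all = np.concatenate([y_src] + [np.full(X_tgt.shape[0], c) for c in classes])
        w_all = np.concatenate([np.ones(X_src.shape[0])]
                               + [post[:, k] for k in range(len(classes))])
        clf = MultinomialNB().fit(X_all, y_all, sample_weight=w_all)
    return clf

# The proposed method would alternate nbem() with an STFW step that reweights
# target-domain features before the next NBEM round; that step is not
# specified in the abstract and is therefore omitted here.
```

The soft-label-by-duplication trick is a common workaround for classifiers without a fractional-label interface; it is an approximation chosen for brevity here, not a detail taken from the paper.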

Keywords

Domain adaptation · Document classification · Semi-supervised learning · Feature-based methods

Notes

Acknowledgment

The work reported in this article was supported by the NINJAL collaborative research project ‘Development of all-words WSD systems and construction of a correspondence table between WLSP and IJD by these systems.’

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Hiroyuki Shinnou, Ibaraki University, Hitachi, Japan
  • Kanako Komiya, Ibaraki University, Hitachi, Japan
  • Minoru Sasaki, Ibaraki University, Hitachi, Japan