Coping with Distribution Change in the Same Domain Using Similarity-Based Instance Weighting

  • Jeong-Woo Son
  • Hyun-Je Song
  • Seong-Bae Park
  • Se-Young Park
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5828)

Abstract

Lexical features are considered the most crucial features in natural language processing (NLP), and are thus widely used in machine learning algorithms applied to NLP tasks. However, due to the diversity of the lexical space, machine learning algorithms with lexical features suffer from the difference between the distributions of training and test data. To overcome this distribution change, this paper proposes support vector machines with example-wise weights. The training distribution is made to coincide with the test distribution by weighting training examples according to their similarity to the test data. Experimental results on text chunking show that the distribution change between training and test data is actually observed, and that the proposed method, which accounts for this change in its training phase, outperforms ordinary support vector machines.
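The idea in the abstract can be sketched with a weighted SVM: each training example receives a weight proportional to its similarity to the test set, so examples resembling the test distribution dominate training. This is a minimal illustration only, assuming an RBF kernel as the similarity measure and scikit-learn's `sample_weight` mechanism; the paper's actual weighting scheme and chunking features are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

# Toy data: training and test sets drawn from shifted distributions
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(100, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_test = rng.normal(0.5, 1.0, size=(50, 2))  # test distribution is shifted

# Weight each training example by its mean RBF similarity to all test data
weights = rbf_kernel(X_train, X_test, gamma=0.5).mean(axis=1)

# Example-wise weighted SVM: sample_weight rescales each example's
# contribution to the hinge loss, biasing the model toward the
# region covered by the test distribution
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train, sample_weight=weights)
```

Training examples far from the test cloud receive small weights and contribute little to the learned decision boundary, which is the intended effect of the similarity-based instance weighting.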

Keywords

Support Vector Machine · Natural Language Processing · Test Instance · Distribution Change · Training Instance

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jeong-Woo Son¹
  • Hyun-Je Song¹
  • Seong-Bae Park¹
  • Se-Young Park¹
  1. Department of Computer Engineering, Kyungpook National University, Daegu, Korea