Domain Adaptation for English–Korean MT System: From Patent Domain to IT Web News Domain

  • Ki-Young Lee
  • Sung-Kwon Choi
  • Oh-Woog Kwon
  • Yoon-Hyung Roh
  • Young-Gil Kim
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5459)

Abstract

This paper presents a method for adapting an existing machine translation (MT) system to a newly targeted translation domain. In particular, we describe in detail how a patent-specific English–Korean MT system was customized for the IT web news domain. The proposed method consists of the following steps: constructing a corpus from IT web news documents; analyzing the characteristics of IT web news sentences from the viewpoint of each MT system module (tagger, parser, transfer) and of the translation knowledge; and adapting each MT system module and the translation knowledge to those characteristics. To evaluate the domain adaptation method, we conducted both a human evaluation and an automatic evaluation. The experiments showed promising results for diverse sentences extracted from IT web news documents.

Keywords

machine translation, domain adaptation

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ki-Young Lee (1)
  • Sung-Kwon Choi (1)
  • Oh-Woog Kwon (1)
  • Yoon-Hyung Roh (1)
  • Young-Gil Kim (1)

  1. Natural Language Processing Team, Electronics and Telecommunications Research Institute, Daejeon, Korea
