Domain Adaptation for English–Korean MT System: From Patent Domain to IT Web News Domain
This paper presents a method for adapting an existing machine translation (MT) system to a newly targeted translation domain. In particular, we describe in detail how we customized a patent-specific English-Korean MT system for the IT web news domain. The proposed method consists of three steps: constructing a corpus from IT web news documents; analyzing the characteristics of IT web news sentences from the viewpoint of each MT system module (tagger, parser, transfer) and of the translation knowledge; and adapting each MT system module and the translation knowledge to those characteristics. To evaluate our domain adaptation method, we conducted both a human evaluation and an automatic evaluation. The experiments showed promising results for diverse sentences extracted from IT web news documents.
Keywords: machine translation, domain adaptation