Klex: A Finite-State Transducer Lexicon of Korean

  • Na-Rae Han
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4002)

Abstract

This paper describes the implementation and system details of Klex, a finite-state transducer lexicon for the Korean language, developed using XRCE’s Xerox Finite State Tool (XFST). Klex is essentially a transducer network representing the lexicon of the Korean language with the lexical string on the upper side and the inflected surface string on the lower side. Two major applications for Klex are morphological analysis and generation: given a well-formed inflected lower string, a language-independent algorithm derives the upper lexical string from the network and vice versa. Klex was written to conform to the part-of-speech tagging standards of the Korean Treebank Project, and is currently operating as the morphological analysis engine for the project.
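The two directions described above (surface string to lexical string, and back) can be illustrated with a minimal sketch. It is not Klex and not XFST: a real lexical transducer is a compiled finite-state network, whereas this toy stores the relation as an explicit list of string pairs. The class name `ToyLexicon`, the three entries, and the tags `/N`, `/P`, `/V`, `/E` are all hypothetical placeholders, not the Korean Treebank tagset.

```python
from collections import defaultdict

# Hypothetical toy entries: (upper lexical string, lower surface string).
# The tags /N, /P, /V, /E are illustrative placeholders only.
PAIRS = [
    ("학교/N+에/P", "학교에"),      # noun + postposition, plain concatenation
    ("먹/V+었/E+다/E", "먹었다"),   # verb stem + past + declarative
    ("가/V+았/E+다/E", "갔다"),     # 가 + 았 contracts to 갔 on the surface
]


class ToyLexicon:
    """A bidirectional relation between lexical and surface strings."""

    def __init__(self, pairs):
        self._to_upper = defaultdict(set)  # surface -> lexical strings
        self._to_lower = defaultdict(set)  # lexical -> surface strings
        for upper, lower in pairs:
            self._to_upper[lower].add(upper)
            self._to_lower[upper].add(lower)

    def analyze(self, surface):
        """Morphological analysis: lower (surface) side to upper side."""
        return sorted(self._to_upper.get(surface, ()))

    def generate(self, lexical):
        """Morphological generation: upper (lexical) side to lower side."""
        return sorted(self._to_lower.get(lexical, ()))


lexicon = ToyLexicon(PAIRS)
print(lexicon.analyze("갔다"))
print(lexicon.generate("학교/N+에/P"))
```

The same relation serves both applications; only the lookup direction changes. A string that is not in the relation returns an empty list, which corresponds to an ill-formed input having no path through the network.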

Keywords

Korean Language; Vocabulary Item; Representative Form; Linguistic Data Consortium; Vowel Harmony
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Na-Rae Han (1)
  1. Department of Linguistics, University of Pennsylvania, Philadelphia, USA
