The Contribution of Lexical Resources to Natural Language Processing of CJK Languages

  • Jack Halpern
Conference paper

DOI: 10.1007/11939993_77

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4274)
Cite this paper as:
Halpern J. (2006) The Contribution of Lexical Resources to Natural Language Processing of CJK Languages. In: Huo Q., Ma B., Chng ES., Li H. (eds) Chinese Spoken Language Processing. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg

Abstract

The role of lexical resources is often understated in NLP research. The complexity of Chinese, Japanese and Korean (CJK) poses special challenges to developers of NLP tools, especially in the areas of word segmentation (WS), information retrieval (IR), named entity extraction (NER), and machine translation (MT). These difficulties are exacerbated by the lack of comprehensive lexical resources, especially for proper nouns, and the lack of a standardized orthography, especially in Japanese. This paper summarizes some of the major linguistic issues in the development of NLP applications that are dependent on lexical resources, and discusses the central role such resources should play in enhancing the accuracy of NLP tools.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jack Halpern (1)
  1. The CJK Dictionary Institute (CJKI), Saitama, Japan
