Indonesian Shallow Stemmer for Text Reading Support System

  • Hajime Mochizuki
  • Yuhei Nakamura
  • Kohji Shibano
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 157)

Abstract

Our project involves the construction of a web-based system to facilitate the reading and comprehension of Indonesian text. The system helps users understand difficult words in a text by displaying dictionary information about those words in a window. Many Indonesian words are formed by combining root words with affixes and other combining forms, so finding the relevant dictionary entry requires a stemming program that extracts the root word. We therefore developed our own Indonesian stemming program. Because the stemmer is used only within a text reading system, it does not need to be perfect. In this paper, we describe the stemmer and present the results of preliminary experiments that evaluate it. We also describe a design for the text reading support system that uses the stemming program.
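As a rough illustration of the idea in the abstract (strip affixes from an input word, then look the result up in a dictionary), the following minimal Python sketch may help. It is not the authors' stemmer: the affix lists are a small, incomplete sample of common Indonesian affixes chosen for illustration, and `ROOT_WORDS`, `candidates`, and `shallow_stem` are hypothetical names with a toy dictionary standing in for a real lexicon.

```python
# Hypothetical toy dictionary; a real system would load a full lexicon.
ROOT_WORDS = {"baca", "makan", "ajar", "main", "buku"}

# Illustrative, incomplete affix lists (assumptions, not the paper's rule set).
PREFIXES = ("mem", "men", "me", "di", "ber", "ter", "pe")
SUFFIXES = ("kan", "an", "i", "nya", "lah")


def candidates(word):
    """Yield the word itself, then forms with one prefix and/or one suffix removed."""
    stems = [word]
    stems += [word[len(p):] for p in PREFIXES if word.startswith(p)]
    for stem in stems:
        yield stem
        for s in SUFFIXES:
            if stem.endswith(s):
                yield stem[: -len(s)]


def shallow_stem(word, roots=ROOT_WORDS):
    """Return the first candidate found in the dictionary, or None if none matches."""
    for cand in candidates(word.lower()):
        if cand in roots:
            return cand
    return None


if __name__ == "__main__":
    for w in ["membaca", "dibacakan", "makanan", "bukunya", "belajar"]:
        print(w, "->", shallow_stem(w))
```

In this sketch, a word such as "belajar" returns `None` because the toy rules do not cover its prefix variant. That kind of miss is tolerable in a reading aid, where the system can fall back to showing no dictionary entry, which is in the spirit of the abstract's remark that the stemmer need not be perfect.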

Keywords

  • Text Reading
  • Base Word
  • Baseline System
  • Input Word
  • Root Word
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Hajime Mochizuki (1)
  • Yuhei Nakamura (2)
  • Kohji Shibano (3)
  1. Institute of Global Studies, Tokyo University of Foreign Studies, Fuchu-shi, Japan
  2. Mitsubishi UFJ Lease & Finance Co. Ltd., Tokyo, Japan
  3. Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies, Fuchu-shi, Japan
