
Dictionary-Based Thai CLIR: An Experimental Survey of Thai CLIR

  • Conference paper
Evaluation of Cross-Language Information Retrieval Systems (CLEF 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2406)


Abstract

This paper describes our participation in the Cross-Language Evaluation Forum (CLEF). Our experiment had three objectives. First, we evaluated the coverage of the Thai bilingual dictionary when translating queries. Second, we investigated whether the segmentation process affected CLIR. Third, we examined query formation techniques. Since this was our first international CLIR experiment, we used dictionary-based techniques to translate Thai queries into English queries. Four runs were submitted to CLEF: (a) single-mapping translation with manual segmentation, (b) multiple-mapping translation with manual segmentation, (c) single-mapping translation with automatic segmentation, and (d) single-mapping translation with query enhancement using words from our Thai thesaurus. Retrieval effectiveness was worse than we expected. The simple dictionary mapping technique did not achieve satisfactory retrieval effectiveness, even though the dictionary lookup found a mapping for a very high percentage of words. The words returned by the dictionary lookup are not specific terms; instead, each word is mapped to a definition or meaning of that term. Furthermore, Thai stopwords, stemmed words and word separation reduced the effectiveness of Thai CLIR.
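The difference between the single-mapping and multiple-mapping runs can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy dictionary entries, the function names, and the greedy longest-match segmenter (a common baseline for Thai word segmentation, not necessarily the automatic segmenter used in the experiment).

```python
from typing import Dict, Iterable, List

# Hypothetical toy Thai-English dictionary; a real bilingual dictionary
# would be far larger and its entries would differ.
TOY_DICT: Dict[str, List[str]] = {
    "แมว": ["cat"],
    "กิน": ["eat", "consume"],
    "ปลา": ["fish"],
    "ปลาทอง": ["goldfish"],
}


def segment_longest_match(text: str, lexicon: Iterable[str]) -> List[str]:
    """Greedy longest-match segmentation of unspaced text.

    At each position the longest lexicon entry that matches is taken.
    Characters that start no lexicon entry become single-character tokens.
    """
    words = set(lexicon)
    max_len = max(len(w) for w in words)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in words:
                tokens.append(candidate)
                i += size
                break
        else:
            # No lexicon entry starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


def translate(tokens: List[str], dictionary: Dict[str, List[str]],
              multiple: bool = False) -> List[str]:
    """Translate tokens by dictionary lookup.

    Single mapping keeps only the first listed translation of each token;
    multiple mapping keeps all of them. Tokens missing from the dictionary
    are dropped.
    """
    query: List[str] = []
    for token in tokens:
        senses = dictionary.get(token, [])
        query.extend(senses if multiple else senses[:1])
    return query


if __name__ == "__main__":
    thai_tokens = segment_longest_match("แมวกินปลาทอง", TOY_DICT)
    print(thai_tokens)
    print(translate(thai_tokens, TOY_DICT))                 # single mapping
    print(translate(thai_tokens, TOY_DICT, multiple=True))  # multiple mapping
```

In this sketch a segmentation error propagates directly into translation: if the segmenter had split the last word into "ปลา" plus leftover characters, the English query would contain "fish" rather than "goldfish", which is one way word separation can reduce retrieval effectiveness.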




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chuleerat, J. (2002). Dictionary-Based Thai CLIR: An Experimental Survey of Thai CLIR. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Evaluation of Cross-Language Information Retrieval Systems. CLEF 2001. Lecture Notes in Computer Science, vol 2406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45691-0_18


  • DOI: https://doi.org/10.1007/3-540-45691-0_18


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44042-0

  • Online ISBN: 978-3-540-45691-9

  • eBook Packages: Springer Book Archive
