Bengali, Hindi and Telugu to English Ad-Hoc Bilingual Task at CLEF 2007

  • Sivaji Bandyopadhyay
  • Tapabrata Mondal
  • Sudip Kumar Naskar
  • Asif Ekbal
  • Rejwanul Haque
  • Srinivasa Rao Godhavarthy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5152)

Abstract

This paper presents the experiments carried out at Jadavpur University for the CLEF 2007 ad-hoc bilingual task. This was our first participation in a CLEF evaluation, and we used Bengali, Hindi and Telugu as query languages for retrieval from an English document collection. We describe our Bengali, Hindi and Telugu to English CLIR system for the ad-hoc bilingual task, the English IR system for the ad-hoc monolingual task, and the associated experiments at CLEF. Query construction was manual for the Telugu-English ad-hoc bilingual task and automatic for all other tasks.

Keywords

Machine Translation, Query Term, Content Word, Stop Word, Word Sense Disambiguation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sivaji Bandyopadhyay
    • 1
  • Tapabrata Mondal
    • 1
  • Sudip Kumar Naskar
    • 1
  • Asif Ekbal
    • 1
  • Rejwanul Haque
    • 1
  • Srinivasa Rao Godhavarthy
    • 1
  1. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
