Simple Transliteration for CLIR

  • Sauparna Palchowdhury
  • Prasenjit Majumder
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7536)


This is an experiment in cross-lingual information retrieval (CLIR) for Indian languages in a resource-poor setting. We use a simple grapheme-to-grapheme technique to transliterate parallel query text between three morphologically similar Indian languages, and we compare cross-lingual with mono-lingual retrieval performance. Where a state-of-the-art system such as the Google Translation tool reaches roughly 60-90% of mono-lingual performance, our transliteration technique achieves 20-60%. Though these figures are not impressive, we argue that where linguistic resources are scarce to the point of being non-existent, this can be a starting point for engineering retrieval effectiveness.
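As a rough illustration of what a grapheme-to-grapheme transliteration can look like, the sketch below maps characters between two Indic scripts by shifting Unicode code points from one script block to the other. The abstract does not name the three languages or describe the mapping actually used, so the script pair (Devanagari and Bengali), the block-offset approach, and every name in the code are assumptions made for illustration, not the authors' method.

```python
import unicodedata

# Assumed script pair for illustration; the abstract does not name the languages.
DEVANAGARI_BASE = 0x0900  # start of the Unicode Devanagari block
BENGALI_BASE = 0x0980     # start of the Unicode Bengali block
BLOCK_SIZE = 0x80         # each of these blocks spans 128 code points


def transliterate(text, src_base=DEVANAGARI_BASE, dst_base=BENGALI_BASE):
    """Map each character to the same position in the target script block.

    Characters outside the source block, or whose target code point is
    unassigned, are passed through unchanged (a hypothetical fallback).
    """
    out = []
    for ch in text:
        offset = ord(ch) - src_base
        if 0 <= offset < BLOCK_SIZE:
            candidate = chr(dst_base + offset)
            # unicodedata.name returns the default ("") for unassigned code points
            if unicodedata.name(candidate, ""):
                out.append(candidate)
                continue
        out.append(ch)
    return "".join(out)


# Devanagari KA + vowel sign AA + MA, mapped into the Bengali block
query_bn = transliterate("\u0915\u093e\u092e")
# The reverse direction only swaps the two base offsets
query_hi = transliterate(query_bn, BENGALI_BASE, DEVANAGARI_BASE)
```

The offset trick relies on the Unicode blocks for the major Indic scripts sharing a common ISCII-derived layout. Letters with no counterpart in the target block are left as they are here; a real system would need an explicit fallback rule for them.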


Keywords: Target Language; Query Expansion; Source Language; Test Collection; Indian Language
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sauparna Palchowdhury (1)
  • Prasenjit Majumder (2)
  1. CVPR Unit, Indian Statistical Institute, Kolkata, India
  2. Computer Science & Engineering, Dhirubhai Ambani Institute of Information and Communication Technology, India
