Abstract

The volume of information on the WWW in languages other than English is increasing rapidly. To access information in a language other than one's native language, we need Cross-Language Information Retrieval (CLIR). The approaches to CLIR can be classified into three categories: document translation, query translation, and interlingua matching. The dictionary-based query translation approach has been widely used by CLIR researchers. Translation ambiguity and target polysemy are the two major problems of dictionary-based CLIR. In this paper, we investigate part-of-speech and co-occurrence-based disambiguation techniques for an English-Hindi CLIR system.
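As an illustration of the co-occurrence idea mentioned above, the following minimal sketch picks, for each English query term, the Hindi translation that co-occurs most often with the translations of the other query terms. It is not taken from the paper and may differ from the authors' method. The dictionary entries, the co-occurrence counts, and the function names `cooccurrence` and `disambiguate` are all hypothetical, chosen only to make the example runnable.

```python
from itertools import product

# Hypothetical toy bilingual dictionary: English term -> candidate Hindi translations.
# "bank" is ambiguous: a financial institution or the edge of a river.
DICTIONARY = {
    "river": ["नदी"],
    "bank": ["बैंक", "किनारा"],
}

# Made-up co-occurrence counts standing in for statistics that would
# normally be gathered from a Hindi corpus. The values are illustrative only.
COOCCURRENCE = {
    frozenset(["नदी", "किनारा"]): 12,
    frozenset(["नदी", "बैंक"]): 1,
}


def cooccurrence(a, b):
    """Return the (toy) co-occurrence count of two target-language terms."""
    return COOCCURRENCE.get(frozenset([a, b]), 0)


def disambiguate(query_terms):
    """Choose one translation per query term by maximising pairwise co-occurrence.

    Terms missing from the dictionary are passed through untranslated.
    The search is exhaustive, which is only practical for short queries.
    """
    candidates = [DICTIONARY.get(term, [term]) for term in query_terms]
    best, best_score = None, -1
    for combo in product(*candidates):
        # Sum the co-occurrence counts over every unordered pair in this combination.
        score = sum(
            cooccurrence(a, b)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if score > best_score:
            best, best_score = list(combo), score
    return best


if __name__ == "__main__":
    print(disambiguate(["river", "bank"]))
```

With these toy counts, the query "river bank" resolves "bank" to the riverside sense rather than the financial one, because that candidate co-occurs more often with the translation of "river".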




Copyright information

© 2009 Indian Institute of Information Technology, India

About this paper

Cite this paper

Das, S., Seetha, A., Kumar, M., Rana, J.L. (2009). Disambiguation Strategies for English-Hindi Cross Language Information Retrieval System. In: Tiwary, U.S., Siddiqui, T.J., Radhakrishna, M., Tiwari, M.D. (eds) Proceedings of the First International Conference on Intelligent Human Computer Interaction. Springer, New Delhi. https://doi.org/10.1007/978-81-8489-203-1_30


  • DOI: https://doi.org/10.1007/978-81-8489-203-1_30

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-8489-404-2

  • Online ISBN: 978-81-8489-203-1

  • eBook Packages: Computer Science (R0)
