Abstract
Whether NLP techniques improve the results of information retrieval is still a controversial issue, whereas linguistic processing is already well established in cross-language information retrieval (CLIR). This paper presents Mpro-IR, a CLIR component developed as the core module of a multilingual information system for the legal domain. The component uses not only the lexical base form for indexing but also derivational information and, for German, information about the decomposition of compounds. This information is provided by a sophisticated morpho-syntactic analyser and is exploited for query translation, query expansion, search, and document ranking. The objective of the CLEF evaluation was to assess this linguistically based retrieval approach in an unrestricted domain. The investigation focused on how derivation and decomposition can contribute to improving recall.
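To make the indexing idea concrete, here is a minimal sketch of how base forms, derivational relatives, and compound parts could be combined into index or expansion terms. It is not the Mpro-IR implementation: the lexicon `TOY_LEXICON`, its entries, and the function names `analyse` and `index_terms` are hypothetical stand-ins for the output of a morpho-syntactic analyser.

```python
# Hypothetical toy lexicon standing in for a morpho-syntactic analyser.
# Each surface form maps to a base form (lemma), derivationally related
# lemmas, and the parts of a compound (empty if the word is not a compound).
TOY_LEXICON = {
    "Steuerberatungen": {
        "lemma": "Steuerberatung",
        "derived": ["Steuerberater", "beraten"],
        "parts": ["Steuer", "Beratung"],
    },
    "Gesetze": {
        "lemma": "Gesetz",
        "derived": ["gesetzlich"],
        "parts": [],
    },
}


def analyse(token):
    """Return the toy analysis of a token; unknown tokens are their own lemma."""
    return TOY_LEXICON.get(token, {"lemma": token, "derived": [], "parts": []})


def index_terms(token, use_derivation=True, use_decomposition=True):
    """Collect index/expansion terms for one token, base form first."""
    entry = analyse(token)
    terms = [entry["lemma"]]
    if use_derivation:
        terms += entry["derived"]
    if use_decomposition:
        terms += entry["parts"]
    # Remove duplicates while preserving the original order.
    seen = set()
    ordered = []
    for term in terms:
        if term not in seen:
            seen.add(term)
            ordered.append(term)
    return ordered


if __name__ == "__main__":
    for word in ["Steuerberatungen", "Gesetze", "Recht"]:
        print(word, "->", index_terms(word))
```

Switching `use_derivation` or `use_decomposition` off reduces the term set to a plain base-form index, which is one way to compare the contribution of each knowledge source to recall.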
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ripplinger, B. (2001). The Use of NLP Techniques in CLIR. In: Peters, C. (eds) Cross-Language Information Retrieval and Evaluation. CLEF 2000. Lecture Notes in Computer Science, vol 2069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44645-1_16
DOI: https://doi.org/10.1007/3-540-44645-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42446-8
Online ISBN: 978-3-540-44645-3
eBook Packages: Springer Book Archive