Abstract
This paper addresses the value of linguistically-motivated indexing (LMI) for document and text retrieval. After reviewing the basic concepts involved and the assumptions on which LMI is based, namely that complex index descriptions and terms are necessary, I consider past and recent research on LMI, and specifically on automated LMI via NLP. Experiments in the first phase of research, up to the late 1980s, did not demonstrate value in LMI, though they were very limited; the much larger full-text tests of the 1990s have not done so either. My conclusion is that LMI is not needed for effective retrieval, but has other important roles within information-selection systems.
Copyright information
© 1999 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Jones, K.S. (1999). What is the Role of NLP in Text Retrieval? In: Strzalkowski, T. (ed.) Natural Language Information Retrieval. Text, Speech and Language Technology, vol 7. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2388-6_1
DOI: https://doi.org/10.1007/978-94-017-2388-6_1
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5209-4
Online ISBN: 978-94-017-2388-6
eBook Packages: Springer Book Archive