What is the Role of NLP in Text Retrieval?

Chapter in: Natural Language Information Retrieval

Part of the book series: Text, Speech and Language Technology (TLTB, volume 7)

Abstract

This paper addresses the value of linguistically-motivated indexing (LMI) for document and text retrieval. After reviewing the basic concepts involved and the assumptions on which LMI is based, namely that complex index descriptions and terms are necessary, I consider past and recent research on LMI, and specifically on automated LMI via NLP. Experiments in the first phase of research, up to the late eighties, did not demonstrate value in LMI, but they were very limited; the much larger full-text tests of the nineties have not done so either. My conclusion is that LMI is not needed for effective retrieval, but has other important roles within information-selection systems.



Copyright information

© 1999 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Jones, K.S. (1999). What is the Role of NLP in Text Retrieval?. In: Strzalkowski, T. (eds) Natural Language Information Retrieval. Text, Speech and Language Technology, vol 7. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2388-6_1

  • DOI: https://doi.org/10.1007/978-94-017-2388-6_1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5209-4

  • Online ISBN: 978-94-017-2388-6

  • eBook Packages: Springer Book Archive
