On the Usefulness of Extracting Syntactic Dependencies for Text Indexing

Alonso, Miguel A.; Vilares, Jesús; Darriba, Víctor M.

doi:10.1007/3-540-45750-X_1


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2464)

Included in the following conference series:

Irish Conference on Artificial Intelligence and Cognitive Science


Abstract

In recent years, there has been considerable interest in using Natural Language Processing in Information Retrieval research, with specific implementations varying from word-level morphological analysis to syntactic parsing to conceptual-level semantic analysis. In particular, different degrees of phrase-level syntactic information have been incorporated in information retrieval systems working on English or Germanic languages such as Dutch. In this paper we study the impact of using such information, in the form of syntactic dependency pairs, on the performance of a text retrieval system for a Romance language, Spanish.

The research reported in this article has been supported in part by Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica (Grant TIC2000-0370-C02-01), Ministerio de Ciencia y Tecnología (Grant HP2001-0044) and Xunta de Galicia (Grant PGIDT01PXI10506PN).



Author information

Authors and Affiliations

Departamento de Computación, Universidade da Coruña, Campus de Elviña s/n, 15071, La Coruña, Spain
Miguel A. Alonso & Jesús Vilares
Escuela Superior de Ingeniería Informática, Universidade de Vigo, Campus de As Lagoas, 32004, Orense, Spain
Jesús Vilares & Víctor M. Darriba


Editor information

Editors and Affiliations

Department of Computer Science and Information Systems, University of Limerick, Ireland
Michael O’Neill, Richard F. E. Sutcliffe, Conor Ryan, Malachy Eaton & Niall J. L. Griffith


About this paper

Cite this paper

Alonso, M.A., Vilares, J., Darriba, V.M. (2002). On the Usefulness of Extracting Syntactic Dependencies for Text Indexing. In: O’Neill, M., Sutcliffe, R.F.E., Ryan, C., Eaton, M., Griffith, N.J.L. (eds) Artificial Intelligence and Cognitive Science. AICS 2002. Lecture Notes in Computer Science, vol 2464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45750-X_1


DOI: https://doi.org/10.1007/3-540-45750-X_1
Published: 27 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44184-7
Online ISBN: 978-3-540-45750-3
eBook Packages: Springer Book Archive
