The PHASAR Search Engine

  • Cornelis H. A. Koster
  • Olaf Seibert
  • Marc Seutter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3999)

Abstract

This article describes the rationale behind the PHASAR system (Phrase-based Accurate Search And Retrieval), a professional Information Retrieval and Text Mining system under development for the collection of information about metabolites from the biological literature. The system is generic in nature and applicable (given suitable linguistic resources and thesauri) to many other forms of professional search. Instead of keywords, the PHASAR search engine uses Dependency Triples as terms. Both the documents and the queries are parsed, transduced to Dependency Triples and lemmatized. Queries consist of a set of Dependency Triples, whose elements may be generalized or specialized in order to achieve the desired precision and recall. In order to help in interactive exploration, the search process is supported by document frequency information from the index, both for terms from the query and for terms from the thesaurus.
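
The idea of searching on Dependency Triples rather than keywords can be illustrated with a minimal Python sketch. It is not the PHASAR implementation. The class name TripleIndex, the relation labels SUBJ and OBJ, the sample triples and the use of None as a wildcard are all assumptions made for illustration. The wildcard stands in for thesaurus-based generalization, which is not modelled here, and a query is assumed to be a conjunction of its triples.

```python
from collections import defaultdict

# A term is a lemmatized triple (head, relation, modifier).
# In a query pattern, None is a wildcard: a crude stand-in for
# generalizing one element of the triple (assumption, not PHASAR's method).


class TripleIndex:
    """Toy inverted index whose terms are dependency triples."""

    def __init__(self):
        # Maps each triple to the set of documents that contain it.
        self.postings = defaultdict(set)

    def add(self, doc_id, triples):
        for triple in triples:
            self.postings[triple].add(doc_id)

    def match(self, pattern):
        """Return the documents containing a triple that fits the pattern."""
        docs = set()
        for triple, ids in self.postings.items():
            if all(p is None or p == x for p, x in zip(pattern, triple)):
                docs |= ids
        return docs

    def doc_freq(self, pattern):
        """Document frequency of a (possibly generalized) triple."""
        return len(self.match(pattern))

    def search(self, query):
        """Return the documents matching every pattern in the query."""
        result = None
        for pattern in query:
            docs = self.match(pattern)
            result = docs if result is None else result & docs
        return result if result is not None else set()


# Hypothetical documents, already parsed and lemmatized.
index = TripleIndex()
index.add("d1", [("inhibit", "SUBJ", "aspirin"), ("inhibit", "OBJ", "enzyme")])
index.add("d2", [("inhibit", "SUBJ", "ibuprofen"), ("inhibit", "OBJ", "enzyme")])
index.add("d3", [("activate", "SUBJ", "aspirin")])

specific = ("inhibit", "SUBJ", "aspirin")
general = ("inhibit", "SUBJ", None)

# Generalizing the modifier raises the document frequency from 1 to 2.
print(index.doc_freq(specific), index.doc_freq(general))
print(sorted(index.search([general, ("inhibit", "OBJ", "enzyme")])))
```

Comparing the document frequency of the specific and the generalized pattern before running the full query mirrors, in a very reduced form, the interactive trade-off between precision and recall that the abstract describes.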

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cornelis H. A. Koster (1)
  • Olaf Seibert (1)
  • Marc Seutter (1)
  1. Department of Computer Science, Radboud University Nijmegen, Nijmegen, The Netherlands
