Information Retrieval, Volume 9, Issue 1, pp 55–70

Retrieval quality vs. effectiveness of specificity-oriented search in XML collections

  • Norbert Fuhr
  • Norbert Gövert
Article

Abstract

Content-only queries in hierarchically structured documents should retrieve the most specific document nodes that are exhaustive with respect to the information need. For this problem, we investigate two methods of augmentation, both of which yield high retrieval quality. We define retrieval effectiveness as the ratio of retrieval quality to response time; thus, fast approximations to the 'correct' retrieval result may yield higher effectiveness. We present a classification scheme for algorithms addressing this issue, and adapt known algorithms from standard document retrieval to XML retrieval. As a new strategy, we propose incremental-interruptible retrieval, which allows for instant presentation of the top-ranking documents. We develop a new algorithm implementing this strategy and evaluate the different methods with the INEX collection.
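
The two ideas in the abstract that lend themselves to a small illustration are the effectiveness ratio and the incremental, interruptible delivery of ranked results. The Python sketch below is only an illustration under stated assumptions and is not the algorithm developed in the article: the function names `effectiveness` and `ranked_incrementally`, the node paths, and the score values are all hypothetical, and the quality measure is left abstract because the abstract does not specify one.

```python
import heapq
from typing import Dict, Iterator, Tuple


def effectiveness(quality: float, response_time_s: float) -> float:
    """Ratio of retrieval quality to response time (quality measure assumed given)."""
    if response_time_s <= 0:
        raise ValueError("response time must be positive")
    return quality / response_time_s


def ranked_incrementally(scores: Dict[str, float]) -> Iterator[Tuple[str, float]]:
    """Yield (node, score) pairs in descending score order, one at a time.

    The caller may stop iterating at any point (the interruptible part);
    only the items actually consumed are popped from the heap.
    """
    # heapq implements a min-heap, so scores are negated to pop the highest first.
    heap = [(-score, node) for node, score in scores.items()]
    heapq.heapify(heap)
    while heap:
        neg_score, node = heapq.heappop(heap)
        yield node, -neg_score


# Hypothetical scores for XML document nodes (paths and values are made up).
scores = {
    "/article[1]": 0.42,
    "/article[1]/sec[2]": 0.77,
    "/article[1]/sec[2]/p[1]": 0.61,
}

top = []
for node, score in ranked_incrementally(scores):
    top.append(node)
    if len(top) == 2:  # interrupt after the first two results
        break
```

A generator is used here so that the first result is available after a single heap pop, and stopping early costs nothing further. Under the ratio above, an approximate run that reaches slightly lower quality in much less time can still score higher on effectiveness.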

Keywords

XML retrieval; Content-only search; Ranked retrieval; Efficiency; Incremental algorithm

Copyright information

© Springer Science + Business Media, Inc. 2006

Authors and Affiliations

  1. University of Duisburg-Essen, Germany
  2. University of Dortmund, Germany