Ranked Retrieval of Structured Documents with the S-Term Vector Space Model

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3493)

Abstract

This paper shows how the s-term ranking model [1] is extended and combined with index structures and algorithms for structured document retrieval to improve both the effectiveness of the model and the efficiency of retrieval. We explain in detail how previous work on ranked and exact retrieval can be integrated and optimized, and which adaptations are necessary. Our approach was evaluated experimentally at the INEX 2004 workshop [2]. The results are encouraging and suggest a number of future enhancements.

References

  1. Schlieder, T., Meuss, H.: Querying and Ranking XML Documents. Journal of the American Society for Information Science and Technology 53 (2002)

  2. INEX: Initiative for the Evaluation of XML Retrieval (2004), Available at http://inex.is.informatik.uni-duisburg.de:2004

  3. Fuhr, N., Großjohann, K.: XIRQL: A Query Language for Information Retrieval in XML Documents. In: Research and Development in Information Retrieval (2001)

  4. Wolff, J.E., Flörke, H., Cremers, A.B.: Searching and Browsing Collections of Structural Information. In: Proc. IEEE Forum on Research and Technology Advances in Digital Libraries (2000)

  5. Schlieder, T.: Similarity Search in XML Data using Cost-Based Query Transformations. In: Proc. 4th Intern. Workshop on the Web and Databases (2001)

  6. Theobald, A., Weikum, G.: The Index-Based XXL Search Engine for Querying XML Data with Relevance Ranking. In: Proc. 8th Int. Conf. on Extending Database Technology (2002)

  7. Shin, D., Jang, H., Jin, H.: BUS: An Effective Indexing and Retrieval Scheme in Structured Documents. In: Proc. 3rd ACM Int. Conf. on Digital Libraries (1998)

  8. Salton, G.: The SMART Retrieval System – Experiments in Automatic Document Processing. Prentice Hall Inc., Englewood Cliffs (1971)

  9. Weigel, F., Meuss, H., Schulz, K.U., Bry, F.: Content and Structure in Indexing and Ranking XML. In: Proc. 7th Int. Workshop on the Web and Databases (2004)

  10. Weigel, F., Meuss, H., Bry, F., Schulz, K.U.: Content-Aware DataGuides: Interleaving IR and DB Indexing Techniques for Efficient Retrieval of Textual XML Data. In: McDonald, S., Tait, J.I. (eds.) ECIR 2004. LNCS, vol. 2997, pp. 378–393. Springer, Heidelberg (2004)

  11. Sacks-Davis, R., Arnold-Moore, T., Zobel, J.: Database Systems for Structured Documents. In: Proc. Int. Symposium on Advanced Database Technologies and Their Integration (1994)

  12. Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M., Srivastava, D., Wu, Y.: Structural Joins: A Primitive for Efficient XML Query Pattern Matching. In: Proc. 18th IEEE Int. Conf. on Data Engineering (2002)

  13. Kilpeläinen, P.: Tree Matching Problems with Applications to Structured Text Databases. PhD thesis, University of Helsinki (1992)

  14. Meuss, H., Schulz, K.U., Weigel, F., Leonardi, S., Bry, F.: Visual Exploration and Retrieval of XML Document Collections with the Generic System X². Journal of Digital Libraries, Special Issue on Information Visualization Interfaces (2004)

  15. Meuss, H.: Logical Tree Matching with Complete Answer Aggregates for Retrieving Structured Documents. PhD thesis, University of Munich (2000)

  16. Meuss, H., Schulz, K.U.: Complete Answer Aggregates for Tree-like Databases: A Novel Approach to Combine Querying and Navigation. ACM Transactions on Information Systems 19 (2001)

  17. Meuss, H., Schulz, K., Bry, F.: Towards Aggregated Answers for Semistructured Data. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, p. 346. Springer, Heidelberg (2001)

  18. Trotman, A., Sigurbjörnsson, B.: Narrowed Extended XPath I (2004)

  19. Goldman, R., Widom, J.: DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. In: Proc. 23rd Int. Conf. on Very Large Data Bases (1997)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Weigel, F., Schulz, K.U., Meuss, H. (2005). Ranked Retrieval of Structured Documents with the S-Term Vector Space Model. In: Fuhr, N., Lalmas, M., Malik, S., Szlávik, Z. (eds) Advances in XML Information Retrieval. INEX 2004. Lecture Notes in Computer Science, vol 3493. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424550_19

  • DOI: https://doi.org/10.1007/11424550_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26166-7

  • Online ISBN: 978-3-540-32053-1

  • eBook Packages: Computer Science, Computer Science (R0)
