Feature- and Query-Based Table of Contents Generation for XML Documents

  • Conference paper
Advances in Information Retrieval (ECIR 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4425)

Abstract

The availability of a document’s logical structure in XML retrieval allows retrieval systems to return document portions (elements) instead of whole documents. This helps searchers focus their attention on the relevant content within a document. However, other elements, such as the siblings or parents of retrieved elements, may also be important because they provide context for the retrieved elements. A table of contents (TOC) offers an overview of a document and shows the most important elements and their relations to each other. In this paper, we investigate what searchers think is important in automatic TOC generation. We ask searchers to indicate their preferences for element features (depth, length, relevance) in order to generate TOCs that help them complete information-seeking tasks. We investigate what these preferences are and what characterises the TOCs generated from searchers’ settings. The results have implications for the design of intelligent TOC generation approaches for XML retrieval.
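
To make the idea of feature-based TOC generation concrete, the following is a minimal sketch and not the authors' method, which the abstract does not specify. It assumes that an element becomes a TOC entry only if it passes three thresholds matching the features named above: a maximum depth, a minimum length in tokens, and a minimum relevance to the query. The function names (`text_length`, `relevance`, `build_toc`), the convention that candidate elements carry a `title` child, the toy term-overlap relevance score, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; candidate TOC elements are those with a <title> child.
DOC = """
<article>
  <title>XML retrieval</title>
  <section>
    <title>Element retrieval</title>
    <p>Systems return XML elements instead of whole documents.</p>
    <section>
      <title>Overlap</title>
      <p>Nested XML elements overlap.</p>
    </section>
  </section>
  <section>
    <title>History</title>
    <p>Short.</p>
  </section>
</article>
"""


def text_length(elem):
    """Length feature: number of whitespace-separated tokens in the element's text."""
    return len("".join(elem.itertext()).split())


def relevance(elem, query_terms):
    """Toy relevance feature (an assumption): fraction of query terms found in the text."""
    if not query_terms:
        return 0.0
    words = {w.lower().strip(".,;:") for w in "".join(elem.itertext()).split()}
    return sum(t.lower() in words for t in query_terms) / len(query_terms)


def build_toc(elem, query_terms, max_depth=2, min_length=3, min_relevance=0.5, depth=1):
    """Return nested (title, children) entries for elements that pass all three thresholds."""
    if depth > max_depth:
        return []
    toc = []
    for child in elem:
        title = child.findtext("title")
        if title is None:
            continue  # not a candidate element (e.g. <p> or <title> itself)
        if text_length(child) < min_length:
            continue  # too short to be listed
        if relevance(child, query_terms) < min_relevance:
            continue  # not relevant enough to the query
        children = build_toc(child, query_terms, max_depth, min_length,
                             min_relevance, depth + 1)
        toc.append((title.strip(), children))
    return toc


root = ET.fromstring(DOC.strip())
toc = build_toc(root, ["xml"])
```

In this sketch, an element that fails a threshold is dropped together with its descendants. That is only one possible design; a TOC generator could equally keep a short parent in order to show a relevant child in context.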

Editor information

Giambattista Amati, Claudio Carpineto, Giovanni Romano

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Szlávik, Z., Tombros, A., Lalmas, M. (2007). Feature- and Query-Based Table of Contents Generation for XML Documents. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_41

  • DOI: https://doi.org/10.1007/978-3-540-71496-5_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71494-1

  • Online ISBN: 978-3-540-71496-5

  • eBook Packages: Computer Science, Computer Science (R0)
