Space-Efficient Support for Temporal Text Indexing in a Document Archive Context

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2769))

Abstract

Support for temporal text-containment queries (queries for all versions of documents that contained one or more particular words at a particular time t) is of interest in a number of contexts, including web archives, smaller-scale temporal XML/web warehouses, and temporal document database systems in general. In the V2 temporal document database system, we employed a combination of full-text indexes and variants of time indexes to perform efficient text-containment queries. That approach was optimized for moderately large temporal document databases. For extremely large databases, however, the index space usage of that approach could be too large. In this paper, we present a more space-efficient solution to the problem: the interval-based temporal text index (ITTX). We also present appropriate algorithms for update and retrieval, and we discuss the advantages and disadvantages of the V2 and ITTX approaches.
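
To make the notion of a temporal text-containment query concrete, the following is a minimal sketch of an index whose postings carry validity intervals. It is an illustration under stated assumptions, not the ITTX structure or the update and retrieval algorithms from the paper: the names `Posting`, `IntervalTextIndex`, `add_version`, and `query` are hypothetical, timestamps are plain integers, a multi-word query is assumed to be conjunctive, and the result is a set of document identifiers rather than document versions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    """A closed posting: the word was present in doc_id during [start, end)."""
    doc_id: int
    start: int
    end: int


class IntervalTextIndex:
    """Hypothetical interval-based text index (illustrative only)."""

    def __init__(self):
        # word -> list of closed postings
        self._closed = defaultdict(list)
        # doc_id -> {word: start time of the still-open interval}
        self._open = defaultdict(dict)

    def add_version(self, doc_id, time, words):
        """Register a new version of doc_id created at `time`."""
        new_words = set(words)
        current = self._open[doc_id]
        # Close intervals for words that disappeared in this version.
        for word in list(current):
            if word not in new_words:
                self._closed[word].append(Posting(doc_id, current.pop(word), time))
        # Open intervals for words that appeared in this version.
        for word in new_words:
            current.setdefault(word, time)

    def query(self, words, t):
        """Return ids of documents that contained all `words` at time t."""
        result = None
        for word in words:
            docs = {p.doc_id for p in self._closed.get(word, ())
                    if p.start <= t < p.end}
            docs |= {doc_id for doc_id, open_words in self._open.items()
                     if word in open_words and open_words[word] <= t}
            result = docs if result is None else result & docs
        return result if result else set()


idx = IntervalTextIndex()
idx.add_version(1, 10, ["temporal", "index"])
idx.add_version(1, 20, ["temporal", "archive"])
idx.add_version(2, 15, ["index", "archive"])
print(idx.query(["index"], 12))
print(idx.query(["temporal", "archive"], 25))
```

In this sketch a word that survives unchanged across many versions of a document occupies a single posting, instead of one posting per version, which is the general intuition behind storing intervals to save index space.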




Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nørvåg, K. (2003). Space-Efficient Support for Temporal Text Indexing in a Document Archive Context. In: Koch, T., Sølvberg, I.T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2003. Lecture Notes in Computer Science, vol 2769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45175-4_46

  • DOI: https://doi.org/10.1007/978-3-540-45175-4_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40726-3

  • Online ISBN: 978-3-540-45175-4

  • eBook Packages: Springer Book Archive
