Abstract
Support for temporal text-containment queries (queries for all versions of documents that contained one or more particular words at a particular time t) is of interest in a number of contexts, including web archives, smaller-scale temporal XML/web warehouses, and temporal document database systems in general. In the V2 temporal document database system, we employed a combination of full-text indexes and variants of time indexes to perform efficient text-containment queries. That approach was optimized for moderately large temporal document databases. For extremely large databases, however, the index space usage of that approach could be too large. In this paper, we present a more space-efficient solution to the problem: the interval-based temporal text index (ITTX). We also present appropriate algorithms for update and retrieval, and we discuss the advantages and disadvantages of the V2 and ITTX approaches.
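To make the query type concrete, the following is a minimal sketch of a temporal text-containment lookup over interval postings. It is not the ITTX structure or the V2 index from the paper, whose details are not given in this abstract. All names (`Posting`, `IntervalTextIndex`, `add_version`, `contains_at`) are hypothetical, and the sketch assumes half-open validity intervals [start, end), numeric timestamps, and a conjunctive query (every word must be present at time t).

```python
from collections import defaultdict
from typing import NamedTuple

FOREVER = float("inf")  # assumed marker for a version that is still current


class Posting(NamedTuple):
    doc_id: int
    start: float  # time from which the word is present (inclusive)
    end: float    # time at which the word stops being present (exclusive)


class IntervalTextIndex:
    """Illustrative word -> interval postings map (hypothetical, not ITTX)."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add_version(self, doc_id, words, start, end=FOREVER):
        """Record that a version of doc_id valid in [start, end) contains words."""
        for word in set(words):
            plist = self._postings[word]
            for i, p in enumerate(plist):
                # If the previous version of the same document also contained
                # the word, extend its interval instead of adding a posting.
                if p.doc_id == doc_id and p.end == start:
                    plist[i] = Posting(doc_id, p.start, end)
                    break
            else:
                plist.append(Posting(doc_id, start, end))

    def contains_at(self, words, t):
        """Return ids of documents that contained all of words at time t."""
        result = None
        for word in words:
            docs = {
                p.doc_id
                for p in self._postings.get(word, [])
                if p.start <= t < p.end
            }
            result = docs if result is None else result & docs
        return result or set()
```

In this sketch, a word that survives across consecutive versions of a document occupies a single posting rather than one posting per version, which is one plausible way an interval-based layout can save space. The linear scan in `contains_at` is for clarity only; a real index would need an ordered or tree-based access path.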
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nørvåg, K. (2003). Space-Efficient Support for Temporal Text Indexing in a Document Archive Context. In: Koch, T., Sølvberg, I.T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2003. Lecture Notes in Computer Science, vol 2769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45175-4_46
DOI: https://doi.org/10.1007/978-3-540-45175-4_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40726-3
Online ISBN: 978-3-540-45175-4
eBook Packages: Springer Book Archive