Optimisation of partitioned temporal joins
Partitioning data for temporal join processing is not trivial because tuples whose interval timestamps span more than one data fragment have to be replicated between those fragments. This causes three types of overhead: (a) an overhead caused by the replication process itself, (b) a processing overhead caused by the additional joining that has to be done and (c) an overhead for removing duplicates in the result. Previous work has mainly concentrated on avoiding (a) but still suffers from the consequences of (b) and (c).
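To make the three overheads concrete, the following is a minimal sketch of a range-partitioned intersection join over closed intervals. It is not taken from the paper: the function names (`partition`, `partitioned_join`), the tuple layout `(key, start, end)` and the sample data are assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import product


def partition(tuples, breakpoints):
    """Copy each (key, start, end) tuple into every range fragment it overlaps.

    Fragment i covers the half-open time range [bounds[i], bounds[i + 1]).
    Intervals are closed: [start, end].
    """
    bounds = [float("-inf")] + sorted(breakpoints) + [float("inf")]
    fragments = defaultdict(list)
    for t in tuples:
        _, start, end = t
        for i in range(len(bounds) - 1):
            lo, hi = bounds[i], bounds[i + 1]
            if start < hi and end >= lo:
                fragments[i].append(t)  # overhead (a): one stored copy per fragment
    return fragments


def overlaps(r, s):
    """True if the closed intervals of r and s share at least one time point."""
    return r[1] <= s[2] and s[1] <= r[2]


def partitioned_join(r_tuples, s_tuples, breakpoints):
    """Join fragment by fragment; return raw result pairs and the comparison count."""
    r_frags = partition(r_tuples, breakpoints)
    s_frags = partition(s_tuples, breakpoints)
    raw, comparisons = [], 0
    for i in r_frags:
        for r, s in product(r_frags[i], s_frags.get(i, [])):
            comparisons += 1  # overhead (b): replicas are compared again
            if overlaps(r, s):
                raw.append((r, s))
    return raw, comparisons


# Hypothetical sample data: two long-lived and two short-lived tuples.
R = [("r1", 1, 12), ("r2", 3, 4)]
S = [("s1", 2, 11), ("s2", 8, 9)]

raw, comparisons = partitioned_join(R, S, breakpoints=[5, 10])
distinct = set(raw)  # overhead (c): duplicates must be eliminated
stored = sum(len(f) for f in partition(R + S, [5, 10]).values())
```

With this sample data, the four input tuples are stored eight times across three fragments, five comparisons are performed instead of the four an unpartitioned nested loop would need, and the five raw result pairs collapse to three distinct ones. Fewer replicated tuples therefore lower all three counts at once.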
In this paper, we show how partitioned temporal joins can be optimised for sequential and parallel processing by reducing tuple replication, thereby reducing the total overhead. For that purpose, a new data structure, the IP-table, is introduced. The idea is to store the IP-tables of individual temporal relations in the database system's catalog, from which they can be retrieved for the optimisation process. Optimisation may also require IP-tables covering two or more temporal relations; these can be created by merging the IP-tables of the individual relations at optimisation time, which is a fast and straightforward process. IP-tables can be used for creating and analysing partitions over interval timestamps. Three strategies for partitioning interval data are presented, each of which can be easily and efficiently implemented using IP-tables. The performance-determining parameters of a partition can also be derived from IP-tables.
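The abstract does not specify the layout of an IP-table, so the sketch below does not reproduce it. As a loosely related illustration, it uses a hypothetical stand-in: a per-relation summary that counts how many intervals start and end at each time point. The names `build_table`, `merge_tables` and `replicated_at` are assumptions. The sketch only shows why a catalog-resident summary of interval endpoints could be merged cheaply and then queried for the replication a candidate breakpoint would cause.

```python
from collections import Counter


def build_table(intervals):
    """Summarise closed intervals [start, end] as (start counts, end counts)."""
    starts = Counter(s for s, _ in intervals)
    ends = Counter(e for _, e in intervals)
    return starts, ends


def merge_tables(a, b):
    """Combine the summaries of two relations by adding their counts."""
    return a[0] + b[0], a[1] + b[1]


def replicated_at(table, breakpoint):
    """Count intervals with start < breakpoint <= end.

    These intervals lie on both sides of the breakpoint, so their tuples
    would have to be copied into the fragments on either side.
    """
    starts, ends = table
    started = sum(c for t, c in starts.items() if t < breakpoint)
    finished = sum(c for t, c in ends.items() if t < breakpoint)
    return started - finished


# Hypothetical sample relations, given as (start, end) pairs.
r_table = build_table([(1, 12), (3, 4)])
s_table = build_table([(2, 11), (8, 9)])
merged = merge_tables(r_table, s_table)

# Compare candidate breakpoints by the replication each would cause.
cost = {b: replicated_at(merged, b) for b in (5, 9, 10)}
```

In this example, breakpoints 5 and 10 each split two intervals while breakpoint 9 splits three, so an optimiser working from such a summary would prefer 5 or 10 without scanning the relations themselves.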
Keywords: temporal join, parallel join, optimisation, interval partitioning