Abstract
We present a technique to decluster a spatial access method on a shared-nothing multi-computer architecture [DGS+90]. We propose a software architecture with the R-tree as the underlying spatial access method, with its non-leaf levels on the ‘master-server’ and its leaf nodes distributed across the servers. The major contribution of our work is the study of the optimal capacity of leaf nodes, or ‘chunk size’ (or ‘striping unit’): we express the response time on range queries as a function of the ‘chunk size’, and we show how to optimize it.
We implemented our method on a network of workstations, using a real dataset, and compared the experimental results with the theoretical ones. We conclude that our formula for the response time is very accurate: the maximum relative error was 29%, and the typical error was around 10–15%. We illustrate one way to exploit such an accurate formula by examining several ‘what-if’ scenarios. One major practical conclusion is that a chunk size of 1 page gives optimal or close-to-optimal results over a wide range of parameter values.
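The architecture described above can be illustrated with a minimal sketch. It assumes round-robin placement of leaf chunks and a flat directory on the master; the abstract specifies neither the placement policy nor the response-time formula, so the names `LeafChunk`, `decluster`, and `servers_for_query` are hypothetical and serve only to show how a range query is routed from the master to the servers that hold intersecting leaves.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A rectangle as (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap (shared boundaries count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class LeafChunk:
    mbr: Rect     # bounding rectangle of the leaf, kept on the master
    server: int   # id of the server that stores the leaf's contents


def decluster(leaf_mbrs: List[Rect], num_servers: int) -> List[LeafChunk]:
    """Assign leaves to servers round-robin (an assumed policy)."""
    return [LeafChunk(mbr, i % num_servers) for i, mbr in enumerate(leaf_mbrs)]


def servers_for_query(directory: List[LeafChunk], query: Rect) -> Dict[int, List[Rect]]:
    """Master-side step: map each server to the leaves it must read."""
    hits: Dict[int, List[Rect]] = {}
    for chunk in directory:
        if intersects(chunk.mbr, query):
            hits.setdefault(chunk.server, []).append(chunk.mbr)
    return hits


# Four leaves in a row, spread over two servers.
directory = decluster([(0, 0, 1, 1), (1, 0, 2, 1), (2, 0, 3, 1), (3, 0, 4, 1)], 2)
print(servers_for_query(directory, (0.5, 0.2, 2.5, 0.8)))
```

In this toy layout a query that covers three adjacent leaves is served by both servers in parallel, which is the effect declustering aims for. A larger chunk size would put more data on each contacted server and reduce the number of servers involved.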
On leave from the University of Maryland, College Park. His research was partially funded by the Institute for Systems Research (ISR), and by the National Science Foundation under Grants IRI-9205273 and IRI-8958546 (PYI), with matching funds from EMPRESS Software Inc. and Thinking Machines Inc.
References
Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. Proc. of VLDB Conf., pages 487–499, September 1994.
D. Ballard and C. Brown. Computer Vision. Prentice Hall, 1982.
J.L. Bentley. Multidimensional binary search trees used for associative searching. CACM, 18(9):509–517, September 1975.
C. K. Baru, G. Fecteau, A. Goyal, H. Hsiao, A. Jhingran, S. Padmanabhan, G. P. Copeland, and W. G. Wilson. DB2 Parallel Edition. IBM Systems Journal, 32(2):292–322, 1995.
Thomas Brinkhoff, Hans-Peter Kriegel, and Bernhard Seeger. Efficient processing of spatial joins using R-trees. Proc. of ACM SIGMOD, pages 237–246, May 1993.
N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: an efficient and robust access method for points and rectangles. ACM SIGMOD, pages 322–331, May 1990.
David Boggs, Jeffrey C. Mogul, and Christopher A. Kent. Measured capacity of an Ethernet: Myths and reality. WRL Research Report 88/4, 1988.
Ling Tony Chen and Doron Rotem. Declustering objects for visualization. Proc. VLDB Conf., August 1993. to appear.
D. DeWitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H. Hsiao, and R. Rasmussen. The gamma database machine project. IEEE Transactions on Knowledge and Data Engineering, 2(1), March 1990.
David J. DeWitt, Navin Kabra, Jun Luo, Jignesh Patel, and Jie-Bing Yu. The client/server Paradise. Proc. of VLDB Conf., Santiago, Chile, September 1994.
H.C. Du and J.S. Sobolewski. Disk allocation for cartesian product files on multiple disk systems. ACM Trans. Database Systems (TODS), 7(1):82–101, March 1982.
Christos Faloutsos and Pravin Bhagwat. Declustering using fractals. In 2nd Int. Conference on Parallel and Distributed Information Systems (PDIS), pages 18–25, San Diego, CA, January 1993.
Christos Faloutsos, Ron Barber, Myron Flickner, J. Hafner, Wayne Niblack, Dragutin Petkovic, and William Equitz. Efficient and effective querying by image content. J. of Intelligent Information Systems, 3(3/4):231–262, July 1994.
Christos Faloutsos and Ibrahim Kamel. Beyond uniformity and independence: Analysis of R-trees using the concept of fractal dimension. Proc. ACM SIGACT-SIGMOD-SIGART PODS, pages 4–13, May 1994. Also available as CS-TR-3198, UMIACS-TR-93-130.
M.F. Fang, R.C.T. Lee, and C.C. Chang. The idea of de-clustering and its applications. In Proc. 12th International Conference on VLDB, pages 181–188, Kyoto, Japan, August 1986.
C. Faloutsos and D. Metaxas. Declustering using error correcting codes. Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 253–258, March 1989. Also available as UMIACS-TR-88-91 and CS-TR-2157.
C. Faloutsos and S. Roseman. Fractals for secondary key retrieval. Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 247–252, March 1989. Also available as UMIACS-TR-89-47 and CS-TR-2242.
Michael Freeston. The BANG file: a new kind of grid file. Proc. of ACM SIGMOD, pages 260–269, May 1987.
Michael Freeston. A general solution of the n-dimensional B-tree problem. Proc. of ACM SIGMOD, pages 80–91, May 1995.
I. Gargantini. An effective way to represent quadtrees. Comm. of ACM (CACM), 25(12):905–910, December 1982.
Shahram Ghandeharizadeh, David J. DeWitt, and W. Qureshi. A performance analysis of alternative multi-attribute declustering strategies. SIGMOD Conf., June 1992.
O. Gunther. The cell tree: an index for geometric data. Memorandum No. UCB/ERL M86/89, Univ. of California, Berkeley, December 1986.
A. Guttman. New Features for Relational Database Systems to Support CAD Applications. PhD thesis, University of California, Berkeley, June 1984.
A. Guttman. R-trees: a dynamic index structure for spatial searching. Proc. ACM SIGMOD, pages 47–57, June 1984.
K. Hinrichs and J. Nievergelt. The grid file: a data structure to support proximity queries on spatial objects. Proc. of the WG'83 (Intern. Workshop on Graph Theoretic Concepts in Computer Science), pages 100–113, 1983.
H.V. Jagadish. Linear clustering of objects with multiple attributes. ACM SIGMOD Conf., pages 332–342, May 1990.
Ibrahim Kamel and Christos Faloutsos. Parallel R-trees. Proc. of ACM SIGMOD Conf., pages 195–204, June 1992. Also available as Tech. Report UMIACS TR 92-1, CS-TR-2820.
Ibrahim Kamel and Christos Faloutsos. On packing R-trees. Second Int. Conf. on Information and Knowledge Management (CIKM), November 1993.
Ibrahim Kamel and Christos Faloutsos. Hilbert R-tree: an improved R-tree using fractals. In Proc. of VLDB Conference, pages 500–509, Santiago, Chile, September 1994.
M.H. Kim and S. Pramanik. Optimal file distribution for partial match retrieval. Proc. ACM SIGMOD Conf., pages 173–182, June 1988.
Curtis P. Kolovson and Michael Stonebraker. Segment indexes: Dynamic indexing techniques for multi-dimensional interval data. Proc. ACM SIGMOD, pages 138–147, May 1991.
David B. Lomet and Betty Salzberg. The hB-tree: a multiattribute indexing method with good guaranteed performance. ACM TODS, 15(4):625–658, December 1990.
J. Nievergelt, H. Hinterberger, and K.C. Sevcik. The grid file: an adaptable, symmetric multikey file structure. ACM TODS, 9(1):38–71, March 1984.
J. K. Ousterhout, G. T. Hamachi, R. N. Mayo, W. S. Scott, and G. S. Taylor. Magic: a VLSI layout system. In 21st Design Automation Conference, pages 152–159, Albuquerque, NM, June 1984.
J. Orenstein. Spatial query processing in an object-oriented database system. Proc. ACM SIGMOD, pages 326–336, May 1986.
N. Roussopoulos and D. Leifker. Direct spatial search on pictorial databases using packed R-trees. Proc. ACM SIGMOD, May 1985.
J.T. Robinson. The K-D-B-tree: a search structure for large multidimensional dynamic indexes. Proc. ACM SIGMOD, pages 10–18, 1981.
H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1990.
T. Sellis, N. Roussopoulos, and C. Faloutsos. The R+ tree: a dynamic index for multi-dimensional objects. In Proc. 13th International Conference on VLDB, pages 507–518, England, September 1987. Also available as SRC-TR-87-32, UMIACS-TR-87-3, CS-TR-1795.
R. Stam and Richard Snodgrass. A bibliography on temporal databases. IEEE Bulletin on Data Engineering, 11(4), December 1988.
M. Stonebraker, T. Sellis, and E. Hanson. Rule indexing implementations in database systems. In Proceedings of the First International Conference on Expert Database Systems, Charleston, SC., April 1986.
M. White. N-Trees: Large Ordered Indexes for Multi-Dimensional Space. Application Mathematics Research Staff, Statistical Research Division, U.S. Bureau of the Census, December 1981.
J.-H. Wang, T.-S. Yuen, and D.H.-C. Du. On multiple random accesses and physical data placement in dynamic files. IEEE Trans. on Software Engineering, SE-13(8):977–987, August 1987.
Gerhard Weikum, Peter Zabback, and Peter Scheuermann. Dynamic file allocation in disk arrays. Proc. ACM SIGMOD, pages 406–415, May 1991.
© 1996 Springer-Verlag Berlin Heidelberg
Cite this paper
Koudas, N., Faloutsos, C., Kamel, I. (1996). Declustering spatial databases on a multi-computer architecture. In: Apers, P., Bouzeghoub, M., Gardarin, G. (eds) Advances in Database Technology — EDBT '96. EDBT 1996. Lecture Notes in Computer Science, vol 1057. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014180
DOI: https://doi.org/10.1007/BFb0014180
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61057-1
Online ISBN: 978-3-540-49943-5
eBook Packages: Springer Book Archive