Declustering spatial databases on a multi-computer architecture

  • Nikos Koudas
  • Christos Faloutsos
  • Ibrahim Kamel
Parallel Databases
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1057)


We present a technique to decluster a spatial access method on a shared-nothing multi-computer architecture [DGS+90]. We propose a software architecture with the R-tree as the underlying spatial access method, with its non-leaf levels on the ‘master-server’ and its leaf nodes distributed across the servers. The major contribution of our work is the study of the optimal capacity of leaf nodes, or ‘chunk size’ (or ‘striping unit’): we express the response time on range queries as a function of the ‘chunk size’, and we show how to optimize it.

We implemented our method on a network of workstations, using a real dataset, and we compared the experimental and the theoretical results. The conclusion is that our formula for the response time is very accurate (the maximum relative error was 29%; the typical error was in the vicinity of 10–15%). We illustrate one of the possible ways to exploit such an accurate formula, by examining several ‘what-if’ scenarios. One major, practical conclusion is that a chunk size of 1 page gives either optimal or close to optimal results, for a wide range of the parameters.
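The architecture described above (non-leaf levels on a master server, leaf nodes spread across the servers) can be illustrated with a minimal sketch. This is not the authors' implementation: the round-robin placement, the flat leaf directory standing in for the non-leaf R-tree levels, and all names (`LeafEntry`, `decluster_round_robin`, `plan_range_query`, `bottleneck`) are assumptions made for illustration. Each leaf is treated as one chunk, and the sketch only counts how many chunks each server must read for a range query; it does not reproduce the paper's response-time formula.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Axis-aligned rectangle: (xmin, ymin, xmax, ymax)
Rect = Tuple[float, float, float, float]


def intersects(a: Rect, b: Rect) -> bool:
    """True if two rectangles overlap (touching edges count as overlap)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class LeafEntry:
    mbr: Rect      # minimum bounding rectangle of one leaf node (one chunk)
    server: int    # hypothetical id of the server that stores this leaf


def decluster_round_robin(leaf_mbrs: List[Rect], num_servers: int) -> List[LeafEntry]:
    """Assign leaves to servers in round-robin order (an assumed placement
    policy, chosen only because it is the simplest one to show)."""
    return [LeafEntry(mbr, i % num_servers) for i, mbr in enumerate(leaf_mbrs)]


def plan_range_query(directory: List[LeafEntry], query: Rect) -> Dict[int, int]:
    """Master-side step: find the leaves whose MBR intersects the query and
    count how many chunks each server has to fetch."""
    load: Dict[int, int] = {}
    for entry in directory:
        if intersects(entry.mbr, query):
            load[entry.server] = load.get(entry.server, 0) + 1
    return load


def bottleneck(load: Dict[int, int]) -> int:
    """Chunks read by the busiest server; with servers working in parallel,
    this server bounds how quickly the query can finish."""
    return max(load.values(), default=0)


if __name__ == "__main__":
    # Six unit-square leaves laid out in a row, declustered over three servers.
    leaves = [(float(i), 0.0, float(i + 1), 1.0) for i in range(6)]
    directory = decluster_round_robin(leaves, num_servers=3)
    load = plan_range_query(directory, query=(0.5, 0.0, 3.5, 1.0))
    print(load, bottleneck(load))
```

In this toy setting, a larger chunk would mean fewer, bigger leaves (fewer requests but more data per request), which is the kind of trade-off the chunk-size analysis in the paper addresses.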


Parallel databases; spatial access methods; shared-nothing architecture




  1. [AS94]
    Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. Proc. of VLDB Conf., pages 487–499, September 1994.
  2. [BB82]
    D. Ballard and C. Brown. Computer Vision. Prentice Hall, 1982.
  3. [Ben75]
    J.L. Bentley. Multidimensional binary search trees used for associative searching. CACM, 18(9):509–517, September 1975.
  4. [BFG+95]
    C. K. Baru, G. Fecteau, A. Goyal, H. Hsiao, A. Jhingran, S. Padmanabhan, G. P. Copeland, and W. G. Wilson. DB2 Parallel Edition. IBM Systems Journal, 32(2):292–322, 1995.
  5. [BKS93]
    Thomas Brinkhoff, Hans-Peter Kriegel, and Bernhard Seeger. Efficient processing of spatial joins using R-trees. Proc. of ACM SIGMOD, pages 237–246, May 1993.
  6. [BKSS90]
    N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: an efficient and robust access method for points and rectangles. ACM SIGMOD, pages 322–331, May 1990.
  7. [BMK88]
    David Boggs, Jeffrey C. Mogul, and Christopher A. Kent. Measured capacity of an Ethernet: Myths and reality. WRL Research Report 88/4, 1988.
  8. [CR93]
    Ling Tony Chen and Doron Rotem. Declustering objects for visualization. Proc. VLDB Conf., August 1993. To appear.
  9. [DGS+90]
    D. DeWitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H. Hsiao, and R. Rasmussen. The Gamma database machine project. IEEE Transactions on Knowledge and Data Engineering, 2(1), March 1990.
  10. [DKL+94]
    David J. DeWitt, Navin Kabra, Jun Luo, Jignesh Patel, and Jie-Bing Yu. The client/server paradise. Proceedings of the VLDB, Santiago, Chile, September 1994.
  11. [DS82]
    H.C. Du and J.S. Sobolewski. Disk allocation for Cartesian product files on multiple disk systems. ACM Trans. Database Systems (TODS), 7(1):82–101, March 1982.
  12. [FB93]
    Christos Faloutsos and Pravin Bhagwat. Declustering using fractals. In 2nd Int. Conference on Parallel and Distributed Information Systems (PDIS), pages 18–25, San Diego, CA, January 1993.
  13. [FBF+94]
    Christos Faloutsos, Ron Barber, Myron Flickner, J. Hafner, Wayne Niblack, Dragutin Petkovic, and William Equitz. Efficient and effective querying by image content. J. of Intelligent Information Systems, 3(3/4):231–262, July 1994.
  14. [FK94]
    Christos Faloutsos and Ibrahim Kamel. Beyond uniformity and independence: Analysis of R-trees using the concept of fractal dimension. Proc. ACM SIGACT-SIGMOD-SIGART PODS, pages 4–13, May 1994. Also available as CS-TR-3198, UMIACS-TR-93-130.
  15. [FLC86]
    M.F. Fang, R.C.T. Lee, and C.C. Chang. The idea of de-clustering and its applications. In Proc. 12th International Conference on VLDB, pages 181–188, Kyoto, Japan, August 1986.
  16. [FM89]
    C. Faloutsos and D. Metaxas. Declustering using error correcting codes. Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 253–258, March 1989. Also available as UMIACS-TR-88-91 and CS-TR-2157.
  17. [FR89]
    C. Faloutsos and S. Roseman. Fractals for secondary key retrieval. Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 247–252, March 1989. Also available as UMIACS-TR-89-47 and CS-TR-2242.
  18. [Fre87]
    Michael Freeston. The BANG file: a new kind of grid file. Proc. of ACM SIGMOD, pages 260–269, May 1987.
  19. [Fre95]
    Michael Freeston. A general solution of the n-dimensional B-tree problem. Proc. of ACM SIGMOD, pages 80–91, May 1995.
  20. [Gar82]
    I. Gargantini. An effective way to represent quadtrees. Comm. of ACM (CACM), 25(12):905–910, December 1982.
  21. [GDQ92]
    Shahram Ghandeharizadeh, David J. DeWitt, and W. Qureshi. A performance analysis of alternative multi-attribute declustering strategies. SIGMOD Conf., June 1992.
  22. [Gun86]
    O. Gunther. The cell tree: an index for geometric data. Memorandum No. UCB/ERL M86/89, Univ. of California, Berkeley, December 1986.
  23. [Gut84a]
    A. Guttman. New Features for Relational Database Systems to Support CAD Applications. PhD thesis, University of California, Berkeley, June 1984.
  24. [Gut84b]
    A. Guttman. R-trees: a dynamic index structure for spatial searching. Proc. ACM SIGMOD, pages 47–57, June 1984.
  25. [HN83]
    K. Hinrichs and J. Nievergelt. The grid file: a data structure to support proximity queries on spatial objects. Proc. of the WG'83 (Intern. Workshop on Graph Theoretic Concepts in Computer Science), pages 100–113, 1983.
  26. [Jag90]
    H.V. Jagadish. Linear clustering of objects with multiple attributes. ACM SIGMOD Conf., pages 332–342, May 1990.
  27. [KF92]
    Ibrahim Kamel and Christos Faloutsos. Parallel R-trees. Proc. of ACM SIGMOD Conf., pages 195–204, June 1992. Also available as Tech. Report UMIACS TR 92-1, CS-TR-2820.
  28. [KF93]
    Ibrahim Kamel and Christos Faloutsos. On packing R-trees. Second Int. Conf. on Information and Knowledge Management (CIKM), November 1993.
  29. [KF94]
    Ibrahim Kamel and Christos Faloutsos. Hilbert R-tree: an improved R-tree using fractals. In Proc. of VLDB Conference, pages 500–509, Santiago, Chile, September 1994.
  30. [KP88]
    M.H. Kim and S. Pramanik. Optimal file distribution for partial match retrieval. Proc. ACM SIGMOD Conf., pages 173–182, June 1988.
  31. [KS91]
    Curtis P. Kolovson and Michael Stonebraker. Segment indexes: Dynamic indexing techniques for multi-dimensional interval data. Proc. ACM SIGMOD, pages 138–147, May 1991.
  32. [LS90]
    David B. Lomet and Betty Salzberg. The hB-tree: a multiattribute indexing method with good guaranteed performance. ACM TODS, 15(4):625–658, December 1990.
  33. [NHS84]
    J. Nievergelt, H. Hinterberger, and K.C. Sevcik. The grid file: an adaptable, symmetric multikey file structure. ACM TODS, 9(1):38–71, March 1984.
  34. [OHM+84]
    J. K. Ousterhout, G. T. Hamachi, R. N. Mayo, W. S. Scott, and G. S. Taylor. Magic: a VLSI layout system. In 21st Design Automation Conference, pages 152–159, Albuquerque, NM, June 1984.
  35. [Ore86]
    J. Orenstein. Spatial query processing in an object-oriented database system. Proc. ACM SIGMOD, pages 326–336, May 1986.
  36. [RL85]
    N. Roussopoulos and D. Leifker. Direct spatial search on pictorial databases using packed R-trees. Proc. ACM SIGMOD, May 1985.
  37. [Rob81]
    J.T. Robinson. The k-d-B-tree: a search structure for large multidimensional dynamic indexes. Proc. ACM SIGMOD, pages 10–18, 1981.
  38. [Sam90]
    H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1990.
  39. [SRF87]
    T. Sellis, N. Roussopoulos, and C. Faloutsos. The R+ tree: a dynamic index for multi-dimensional objects. In Proc. 13th International Conference on VLDB, pages 507–518, England, September 1987. Also available as SRC-TR-87-32, UMIACS-TR-87-3, CS-TR-1795.
  40. [SS88]
    R. Stam and Richard Snodgrass. A bibliography on temporal databases. IEEE Bulletin on Data Engineering, 11(4), December 1988.
  41. [SSH86]
    M. Stonebraker, T. Sellis, and E. Hanson. Rule indexing implementations in database systems. In Proceedings of the First International Conference on Expert Database Systems, Charleston, SC, April 1986.
  42. [Whi81]
    M. White. N-Trees: Large Ordered Indexes for Multi-Dimensional Space. Application Mathematics Research Staff, Statistical Research Division, U.S. Bureau of the Census, December 1981.
  43. [WYD87]
    J.-H. Wang, T.-S. Yuen, and D.H.-C. Du. On multiple random accesses and physical data placement in dynamic files. IEEE Trans. on Software Engineering, SE-13(8):977–987, August 1987.
  44. [WZS91]
    Gerhard Weikum, Peter Zabback, and Peter Scheuermann. Dynamic file allocation in disk arrays. Proc. ACM SIGMOD, pages 406–415, May 1991.

Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Nikos Koudas (1)
  • Christos Faloutsos (2)
  • Ibrahim Kamel (3)
  1. Computer Systems Research Institute, University of Toronto, Canada
  2. AT&T Bell Laboratories, Murray Hill
  3. Matsushita Information Technology Laboratory, Japan
