Declustering spatial databases on a multi-computer architecture

  • Nikos Koudas
  • Christos Faloutsos
  • Ibrahim Kamel
Parallel Databases
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1057)

Abstract

We present a technique to decluster a spatial access method on a shared-nothing multi-computer architecture [DGS+90]. We propose a software architecture with the R-tree as the underlying spatial access method, with its non-leaf levels on the ‘master-server’ and its leaf nodes distributed across the servers. The major contribution of our work is the study of the optimal capacity of leaf nodes, or ‘chunk size’ (or ‘striping unit’): we express the response time on range queries as a function of the ‘chunk size’, and we show how to optimize it.

We implemented our method on a network of workstations, using a real dataset, and we compared the experimental and the theoretical results. The conclusion is that our formula for the response time is very accurate (the maximum relative error was 29%; the typical error was in the vicinity of 10–15%). We illustrate one of the possible ways to exploit such an accurate formula, by examining several ‘what-if’ scenarios. One major, practical conclusion is that a chunk size of 1 page gives either optimal or close to optimal results, for a wide range of the parameters.
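
The architecture described above can be illustrated with a minimal sketch. The abstract does not say how leaf nodes are assigned to servers or how the data is ordered before packing, so the round-robin assignment, the sort by lower-left corner, and all names below (`Rect`, `decluster`, `range_query`, `entries_per_chunk`) are assumptions made for illustration, not the authors' method. The master keeps one bounding rectangle per chunk, standing in for the non-leaf levels of the R-tree, and forwards a range query only to the servers that hold intersecting chunks.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical helper type)."""
    xlo: float
    ylo: float
    xhi: float
    yhi: float

    def intersects(self, other: "Rect") -> bool:
        return (self.xlo <= other.xhi and other.xlo <= self.xhi
                and self.ylo <= other.yhi and other.ylo <= self.yhi)


def mbr(rects):
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return Rect(min(r.xlo for r in rects), min(r.ylo for r in rects),
                max(r.xhi for r in rects), max(r.yhi for r in rects))


def decluster(rects, entries_per_chunk, num_servers):
    """Pack rectangles into chunks and deal the chunks to servers.

    Returns the master's directory: a list of (chunk MBR, server id, chunk).
    Assumption: chunks are assigned round-robin after a simple sort.
    """
    ordered = sorted(rects, key=lambda r: (r.xlo, r.ylo))
    directory = []
    for start in range(0, len(ordered), entries_per_chunk):
        chunk = ordered[start:start + entries_per_chunk]
        server = (start // entries_per_chunk) % num_servers
        directory.append((mbr(chunk), server, chunk))
    return directory


def range_query(directory, query):
    """Master-side routing: visit only chunks whose MBR meets the query.

    Returns a dict mapping server id to the qualifying rectangles it holds.
    """
    per_server = defaultdict(list)
    for box, server, chunk in directory:
        if box.intersects(query):
            per_server[server].extend(r for r in chunk if r.intersects(query))
    return dict(per_server)
```

In this sketch a larger `entries_per_chunk` means fewer, bigger units per server, while a smaller one spreads a query over more servers; that is the trade-off the chunk-size analysis addresses.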

Keywords

Parallel databases, spatial access methods, shared-nothing architecture

References

  1. [AS94]
    Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. Proc. of VLDB Conf., pages 487–499, September 1994.
  2. [BB82]
    D. Ballard and C. Brown. Computer Vision. Prentice Hall, 1982.
  3. [Ben75]
    J.L. Bentley. Multidimensional binary search trees used for associative searching. CACM, 18(9):509–517, September 1975.
  4. [BFG+95]
    C. K. Baru, G. Fecteau, A. Goyal, H. Hsiao, A. Jhingran, S. Padmanabhan, G. P. Copeland, and W. G. Wilson. DB2 Parallel Edition. IBM Systems Journal, 32(2):292–322, 1995.
  5. [BKS93]
    Thomas Brinkhoff, Hans-Peter Kriegel, and Bernhard Seeger. Efficient processing of spatial joins using r-trees. Proc. of ACM SIGMOD, pages 237–246, May 1993.
  6. [BKSS90]
    N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The r*-tree: an efficient and robust access method for points and rectangles. ACM SIGMOD, pages 322–331, May 1990.
  7. [BMK88]
    David Boggs, Jeffrey C. Mogul, and Christopher A. Kent. Measured capacity of an ethernet: Myths and reality. WRL Research Report 88/4, 1988.
  8. [CR93]
    Ling Tony Chen and Doron Rotem. Declustering objects for visualization. Proc. VLDB Conf., August 1993. to appear.
  9. [DGS+90]
    D. DeWitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H. Hsiao, and R. Rasmussen. The gamma database machine project. IEEE Transactions on Knowledge and Data Engineering, 2(1), March 1990.
  10. [DKL+94]
    David J. DeWitt, Navin Kabra, Jun Luo, Jignesh Patel, and Jie-Bing Yu. The client/server paradise. Proceedings of the VLDB, Santiago, Chile, September 1994.
  11. [DS82]
    H.C. Du and J.S. Sobolewski. Disk allocation for cartesian product files on multiple disk systems. ACM Trans. Database Systems (TODS), 7(1):82–101, March 1982.
  12. [FB93]
    Christos Faloutsos and Pravin Bhagwat. Declustering using fractals. In 2nd Int. Conference on Parallel and Distributed Information Systems (PDIS), pages 18–25, San Diego, CA, January 1993.
  13. [FBF+94]
    Christos Faloutsos, Ron Barber, Myron Flickner, J. Hafner, Wayne Niblack, Dragutin Petkovic, and William Equitz. Efficient and effective querying by image content. J. of Intelligent Information Systems, 3(3/4):231–262, July 1994.
  14. [FK94]
    Christos Faloutsos and Ibrahim Kamel. Beyond uniformity and independence: Analysis of r-trees using the concept of fractal dimension. Proc. ACM SIGACT-SIGMOD-SIGART PODS, pages 4–13, May 1994. Also available as CS-TR-3198, UMIACS-TR-93-130.
  15. [FLC86]
    M.F. Fang, R.C.T. Lee, and C.C. Chang. The idea of de-clustering and its applications. In Proc. 12th International Conference on VLDB, pages 181–188, Kyoto, Japan, August 1986.
  16. [FM89]
    C. Faloutsos and D. Metaxas. Declustering using error correcting codes. Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 253–258, March 1989. Also available as UMIACS-TR-88-91 and CS-TR-2157.
  17. [FR89]
    C. Faloutsos and S. Roseman. Fractals for secondary key retrieval. Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 247–252, March 1989. Also available as UMIACS-TR-89-47 and CS-TR-2242.
  18. [Fre87]
    Michael Freeston. The bang file: a new kind of grid file. Proc. of ACM SIGMOD, pages 260–269, May 1987.
  19. [Fre95]
    Michael Freeston. A general solution of the n-dimensional b-tree problem. Proc. of ACM-SIGMOD, pages 80–91, May 1995.
  20. [Gar82]
    I. Gargantini. An effective way to represent quadtrees. Comm. of ACM (CACM), 25(12):905–910, December 1982.
  21. [GDQ92]
    Shahram Ghandeharizadeh, David J. DeWitt, and W. Qureshi. A performance analysis of alternative multi-attribute declustering strategies. SIGMOD Conf., June 1992.
  22. [Gun86]
    O. Gunther. The cell tree: an index for geometric data. Memorandum No. UCB/ERL M86/89, Univ. of California, Berkeley, December 1986.
  23. [Gut84a]
    A. Guttman. New Features for Relational Database Systems to Support CAD Applications. PhD thesis, University of California, Berkeley, June 1984.
  24. [Gut84b]
    A. Guttman. R-trees: a dynamic index structure for spatial searching. Proc. ACM SIGMOD, pages 47–57, June 1984.
  25. [HN83]
    K. Hinrichs and J. Nievergelt. The grid file: a data structure to support proximity queries on spatial objects. Proc. of the WG'83 (Intern. Workshop on Graph Theoretic Concepts in Computer Science), pages 100–113, 1983.
  26. [Jag90]
    H.V. Jagadish. Linear clustering of objects with multiple attributes. ACM SIGMOD Conf., pages 332–342, May 1990.
  27. [KF92]
    Ibrahim Kamel and Christos Faloutsos. Parallel r-trees. Proc. of ACM SIGMOD Conf., pages 195–204, June 1992. Also available as Tech. Report UMIACS TR 92-1, CS-TR-2820.
  28. [KF93]
    Ibrahim Kamel and Christos Faloutsos. On packing r-trees. Second Int. Conf. on Information and Knowledge Management (CIKM), November 1993.
  29. [KF94]
    Ibrahim Kamel and Christos Faloutsos. Hilbert r-tree: an improved r-tree using fractals. In Proc. of VLDB Conference, pages 500–509, Santiago, Chile, September 1994.
  30. [KP88]
    M.H. Kim and S. Pramanik. Optimal file distribution for partial match retrieval. Proc. ACM SIGMOD Conf., pages 173–182, June 1988.
  31. [KS91]
    Curtis P. Kolovson and Michael Stonebraker. Segment indexes: Dynamic indexing techniques for multi-dimensional interval data. Proc. ACM SIGMOD, pages 138–147, May 1991.
  32. [LS90]
    David B. Lomet and Betty Salzberg. The hb-tree: a multiattribute indexing method with good guaranteed performance. ACM TODS, 15(4):625–658, December 1990.
  33. [NHS84]
    J. Nievergelt, H. Hinterberger, and K.C. Sevcik. The grid file: an adaptable, symmetric multikey file structure. ACM TODS, 9(1):38–71, March 1984.
  34. [OHM+84]
    J. K. Ousterhout, G. T. Hamachi, R. N. Mayo, W. S. Scott, and G. S. Taylor. Magic: a vlsi layout system. In 21st Design Automation Conference, pages 152–159, Albuquerque, NM, June 1984.
  35. [Ore86]
    J. Orenstein. Spatial query processing in an object-oriented database system. Proc. ACM SIGMOD, pages 326–336, May 1986.
  36. [RL85]
    N. Roussopoulos and D. Leifker. Direct spatial search on pictorial databases using packed r-trees. Proc. ACM SIGMOD, May 1985.
  37. [Rob81]
    J.T. Robinson. The k-d-b-tree: a search structure for large multidimensional dynamic indexes. Proc. ACM SIGMOD, pages 10–18, 1981.
  38. [Sam90]
    H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1990.
  39. [SRF87]
    T. Sellis, N. Roussopoulos, and C. Faloutsos. The r+ tree: a dynamic index for multi-dimensional objects. In Proc. 13th International Conference on VLDB, pages 507–518, England, September 1987. Also available as SRC-TR-87-32, UMIACS-TR-87-3, CS-TR-1795.
  40. [SS88]
    R. Stam and Richard Snodgrass. A bibliography on temporal databases. IEEE Bulletin on Data Engineering, 11(4), December 1988.
  41. [SSH86]
    M. Stonebraker, T. Sellis, and E. Hanson. Rule indexing implementations in database systems. In Proceedings of the First International Conference on Expert Database Systems, Charleston, SC, April 1986.
  42. [Whi81]
    M. White. N-Trees: Large Ordered Indexes for Multi-Dimensional Space. Application Mathematics Research Staff, Statistical Research Division, U.S. Bureau of the Census, December 1981.
  43. [WYD87]
    J.-H. Wang, T.-S. Yuen, and D.H.-C. Du. On multiple random accesses and physical data placement in dynamic files. IEEE Trans. on Software Engineering, SE-13(8):977–987, August 1987.
  44. [WZS91]
    Gerhard Weikum, Peter Zabback, and Peter Scheuermann. Dynamic file allocation in disk arrays. Proc. ACM SIGMOD, pages 406–415, May 1991.

Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Nikos Koudas (1)
  • Christos Faloutsos (2)
  • Ibrahim Kamel (3)
  1. Computer Systems Research Institute, University of Toronto, Canada
  2. AT&T Bell Laboratories, Murray Hill
  3. Matsushita Information Technology Laboratory, Japan
