Hierarchical Declustering Schemes for Range Queries

  • Randeep Bhatia
  • Rakesh K. Sinha
  • Chung-Min Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1777)

Abstract

Declustering schemes have been widely used to speed up access to multi-device storage systems (e.g. disk arrays) in modern geospatial applications. A declustering scheme distributes data items among multiple devices, thus enabling parallel I/O access and reducing query response time. To date, efficient declustering schemes are known only for small values of M, where M is the number of devices. For large values of M, the search overhead of finding an efficient scheme is prohibitive. In this paper, we present an efficient hierarchical technique for building declustering schemes for large values of M from any known good declustering schemes for small values of M. We analyze the performance of the declustering schemes generated by our technique in two dimensions, giving tight asymptotic bounds on their response time. For example, we show that, in two dimensions, optimal declustering schemes for M_1 and M_2 devices can be combined into a scheme for M_1 · M_2 devices whose response time is at most seven more than the optimal response time. Our technique generalizes to any number of dimensions d. We also present simulation results to evaluate the performance of our scheme in practice.
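To make the terms in the abstract concrete, the following minimal sketch models a two-dimensional declustering scheme as a function from grid tile coordinates to a device number, and measures the response time of a rectangular range query as the largest number of its tiles that land on a single device. The optimal response time for a query of w × h tiles on M devices is ⌈w·h / M⌉. All function names here are illustrative assumptions. The base scheme is the well-known disk-modulo allocation, and `compose` is only a hypothetical way of stacking two small schemes into one for M_1 · M_2 devices; it is not the construction analyzed in the paper and carries none of its guarantees.

```python
from collections import Counter
from math import ceil


def disk_modulo(M):
    """Disk-modulo allocation: tile (i, j) is stored on device (i + j) mod M."""
    return lambda i, j: (i + j) % M


def response_time(scheme, x0, y0, w, h):
    """Largest number of tiles of the w-by-h query at (x0, y0) on one device."""
    counts = Counter(
        scheme(i, j)
        for i in range(x0, x0 + w)
        for j in range(y0, y0 + h)
    )
    return max(counts.values())


def optimal_response_time(w, h, M):
    """Lower bound: the w*h tiles spread perfectly evenly over M devices."""
    return ceil(w * h / M)


def compose(inner, M1, outer, M2):
    """Hypothetical two-level scheme for M1 * M2 devices (an assumption,
    not the paper's method): the outer scheme picks one of M2 device groups
    per M1-by-M1 block of tiles, the inner scheme picks a device in the group.
    M2 is listed only to document the size of the outer scheme."""
    return lambda i, j: inner(i, j) + M1 * outer(i // M1, j // M1)


if __name__ == "__main__":
    dm5 = disk_modulo(5)
    # A 2-by-2 query on 5 devices: two tiles share a device, optimum is one.
    print(response_time(dm5, 0, 0, 2, 2), optimal_response_time(2, 2, 5))

    two_level = compose(disk_modulo(2), 2, disk_modulo(3), 3)
    # A 6-by-6 query on the composed scheme for 2 * 3 = 6 devices.
    print(response_time(two_level, 0, 0, 6, 6), optimal_response_time(6, 6, 6))
```

The difference between `response_time` and `optimal_response_time` for a given query is the additive deviation that the abstract bounds by seven for the authors' hierarchical schemes in two dimensions.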



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Randeep Bhatia (1)
  • Rakesh K. Sinha (1)
  • Chung-Min Chen (2)

  1. Bell Laboratories, Murray Hill, USA
  2. Telcordia Technologies, Inc., Morristown, USA
