Hierarchical Declustering Schemes for Range Queries
Declustering schemes are widely used to reduce access time in multi-device storage systems (e.g., disk arrays) in modern geospatial applications. A declustering scheme distributes data items among multiple devices, enabling parallel I/O access and thereby reducing query response time. To date, efficient declustering schemes are known only for small values of M, where M is the number of devices. For large values of M, the search overhead of finding an efficient scheme is prohibitive. In this paper, we present an efficient hierarchical technique for building declustering schemes for large values of M from declustering schemes for small values of M. With this technique, any known good declustering schemes for small M can be combined into an efficient declustering scheme for large M. We analyze the performance of the schemes generated by our technique in two dimensions, giving tight asymptotic bounds on their response time. For example, we show that in two dimensions, given optimal declustering schemes for M_1 and M_2 devices, we can construct a scheme for M_1 · M_2 devices whose response time is at most seven more than the optimal response time. Our technique generalizes to any number of dimensions d. We also present simulation results that evaluate the performance of our scheme in practice.
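To make the notion of response time and of an additive gap from optimal concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes a 2-dimensional grid of data blocks, takes the response time of a range query to be the largest number of requested blocks stored on any single device, and takes the optimal response time to be ceil(rows * cols / M). The disk modulo assignment (i + j) mod M is used only as a simple stand-in scheme, and all function names are hypothetical.

```python
from itertools import product
from math import ceil


def disk_modulo(i, j, m):
    """Stand-in scheme (an assumption, not the paper's): block (i, j) -> device (i + j) mod m."""
    return (i + j) % m


def response_time(scheme, m, row0, col0, rows, cols):
    """Largest number of blocks any one device serves for a rows x cols range query."""
    load = [0] * m
    for i, j in product(range(row0, row0 + rows), range(col0, col0 + cols)):
        load[scheme(i, j, m)] += 1
    return max(load)


def additive_error(scheme, m, row0, col0, rows, cols):
    """Response time minus the lower bound ceil(rows * cols / m)."""
    optimal = ceil(rows * cols / m)
    return response_time(scheme, m, row0, col0, rows, cols) - optimal


def worst_additive_error(scheme, m, grid):
    """Exhaustively check every range query inside a grid x grid block array."""
    return max(
        additive_error(scheme, m, r0, c0, r, c)
        for r0 in range(grid)
        for c0 in range(grid)
        for r in range(1, grid - r0 + 1)
        for c in range(1, grid - c0 + 1)
    )


if __name__ == "__main__":
    # A 1 x 4 query on 4 devices is served in one parallel step.
    print(response_time(disk_modulo, 4, 0, 0, 1, 4))
    # A 2 x 2 query on 4 devices needs two steps under this stand-in scheme.
    print(additive_error(disk_modulo, 4, 0, 0, 2, 2))
```

In this vocabulary, the bound stated in the abstract says that the additive error of the composed scheme for M_1 · M_2 devices never exceeds seven, for any range query. Replacing `disk_modulo` with another assignment function of the same signature allows a different scheme to be measured in the same way.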
Keywords: Base Scheme, Range Query, Data Block, Query Response Time, Hierarchical Scheme