Asymptotically Optimal Declustering Schemes for Range Queries

  • Rakesh K. Sinha
  • Randeep Bhatia
  • Chung-Min Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1973)

Abstract

Declustering techniques have been widely adopted in parallel storage systems (e.g. disk arrays) to speed up bulk retrieval of multidimensional data. A declustering scheme distributes data items among multiple devices, thus enabling parallel I/O access and reducing query response time. We measure the performance of any declustering scheme as its worst case additive deviation from the ideal scheme. The goal thus is to design declustering schemes with as small an additive error as possible. We describe a number of declustering schemes with additive error O(log M) for 2-dimensional range queries, where M is the number of disks. These are the first results giving such a strong bound for any value of M. Our second result is a lower bound on the additive error. In 1997, Abdel-Ghaffar and Abbadi showed that except for a few stringent cases, additive error of any 2-dim declustering scheme is at least one. We strengthen this lower bound to Ω((log M)^((d-1)/2)) for d-dim schemes and to Ω(log M) for 2-dim schemes, thus proving that the 2-dim schemes described in this paper are (asymptotically) optimal. These results are obtained by establishing a connection to geometric discrepancy, a widely studied area of mathematics. We also present simulation results to evaluate the performance of these schemes in practice.
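
The additive-error measure can be made concrete with a small brute-force sketch. It assumes the usual grid model: data is split into tiles (i, j) of an n x n grid, a scheme maps each tile to one of M disks, a range query is an axis-aligned rectangle of tiles, its response time is the largest number of its tiles on any single disk, and the ideal response time for a query of A tiles is ceil(A / M). The function names below are illustrative only, and the allocation (i + j) mod M is a simple baseline, not one of the O(log M) schemes described in the paper.

```python
from math import ceil


def additive_error(scheme, n, m):
    """Worst-case additive error of `scheme` over all range queries
    in an n x n grid of tiles spread across m disks (brute force)."""
    worst = 0
    for r0 in range(n):
        for c0 in range(n):
            for r1 in range(r0, n):
                # Tiles per disk for the query with rows r0..r1,
                # grown one column at a time starting at column c0.
                counts = [0] * m
                for c1 in range(c0, n):
                    for r in range(r0, r1 + 1):
                        counts[scheme(r, c1, m)] += 1
                    area = (r1 - r0 + 1) * (c1 - c0 + 1)
                    # Response time minus the ideal ceil(area / m).
                    worst = max(worst, max(counts) - ceil(area / m))
    return worst


def disk_modulo(i, j, m):
    """Baseline allocation (assumed for illustration): tile (i, j) -> disk (i + j) mod m."""
    return (i + j) % m


if __name__ == "__main__":
    for m in (4, 5, 8):
        print(m, additive_error(disk_modulo, m, m))
```

The enumeration costs on the order of n^5 steps, so it is only suitable for small grids; it is meant to illustrate the definition rather than to reproduce the paper's simulations.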

Keywords

Discrepancy Theory, Range Query, Additive Error, Placement Scheme, Ideal Scheme
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  2. K. Abdel-Ghaffar and A. E. Abbadi. Optimal allocation of two-dimensional data. In Proceedings of the International Conference on Database Theory, 1997.
  3. M. J. Atallah and S. Prabhakar. (Almost) optimal parallel block access for range queries. In ACM Symp. on Principles of Database Systems, May 2000.
  4. R. C. Baker. On irregularities of distribution, II. Journal of London Math. Soc. To appear.
  5. S. Berchtold, C. Böhm, B. Braunmüller, D. A. Keim, and H.-P. Kriegel. Fast parallel similarity search in multimedia databases. In Proc. of ACM Int'l Conf. on Management of Data, 1997.
  6. R. Bhatia, R. K. Sinha, and C. M. Chen. Declustering using golden ratio sequences. In 16th Int'l Conf. on Data Engineering, Feb 2000.
  7. R. Bhatia, R. K. Sinha, and C. M. Chen. Hierarchical declustering schemes for range queries. In 7th Int'l Conf. on Extending Database Technology, Mar 2000.
  8. C. Chang, B. Moon, A. Acharya, C. Shock, A. Sussman, and J. Saltz. Titan: a high-performance remote-sensing database. In 13th Int. Conf. on Data Engineering, 1997.
  9. C. M. Chen and R. Sinha. Raster-spatial data declustering revisited: an interactive navigation perspective. In 15th Int. Conf. on Data Engineering, 1999.
  10. L. T. Chen and D. Rotem. Declustering objects for visualization. In Proc. of the 19th International Conference on Very Large Data Bases, 1993.
  11. H. C. Du and J. S. Sobolewski. Disk allocation for cartesian product files on multiple disk systems. ACM Trans. Database Systems, pages 82–101, 1982.
  12. C. Faloutsos and P. Bhagwat. Declustering using fractals. In Proceedings of the 2nd International Conference on Parallel and Distributed Information Systems, 1993.
  13. H. Faure. Discrepancy of sequences associated with a number system (in dimension s) (in French). Acta Arithmetica, 41(4):337–351, 1982.
  14. H. Faure. Good permutations for extreme discrepancy. J. Number Theory, 42:47–56, 1992.
  15. H. Ferhatosmanoglu and D. Agrawal. Concentric hyperspaces and disk allocations for fast parallel range searching. In Proc. of 15th Int. Conf. on Data Engineering, pages 608–615, 1999.
  16. E. Hlawka. The theory of uniform distribution. A B Academic Publ, Berkhamsted, Herts, 1984.
  17. K. Keeton, D. A. Patterson, and J. M. Hellerstein. A case for intelligent disks (idisks). SIGMOD Record, 27(3), 1998.
  18. M. H. Kim and S. Pramanik. Optimal file distribution for partial match retrieval. In Proceedings of the ACM International Conference on Management of Data, 1988.
  19. S. Kou, M. Winslett, Y. Cho, and J. Lee. New gdm-based declustering methods for parallel range queries. In Int'l Database Engineering and Applications Symposium (IDEAS), Aug. 1999.
  20. J. Matousek. Geometric discrepancy, an illustrated guide. Springer-Verlag, 1999.
  21. B. Moon, A. Acharya, and J. Saltz. Study of scalable declustering algorithms for parallel grid files. In Proceedings of the 10th International Parallel Processing Symposium, 1996.
  22. S. Prabhakar, K. Abdel-Ghaffar, D. Agrawal, and A. E. Abbadi. Cyclic allocation of two-dimensional data. In 14th Int. Conf. on Data Engineering, 1998.
  23. S. Prabhakar, K. Abdel-Ghaffar, D. Agrawal, and A. E. Abbadi. Efficient retrieval of multidimensional datasets through parallel I/O. In 5th Int. Conf. on High Performance Computing, 1998.
  24. I. M. Sobol. Distribution of points in a cube and approximate evaluation of integrals (in Russian). Zh. Vychisl. Mat. i Mat. Fiz., 7:784–802, 1967.
  25. J. G. van der Corput. Verteilungsfunktionen I. In Akad. Wetensch Amsterdam, pages 813–821, 1935.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Rakesh K. Sinha (1)
  • Randeep Bhatia (1)
  • Chung-Min Chen (2)

  1. Bell Laboratories, NJ
  2. Telcordia Technologies, Inc., NJ
