Efficient Declustering of Non-uniform Multidimensional Data Using Shifted Hilbert Curves

  • Hak-Cheol Kim
  • Mario A. Lopez
  • Scott T. Leutenegger
  • Ki-Joune Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2973)

Abstract

Data declustering speeds up retrieval from large data sets by partitioning the data across multiple disks or sites and performing retrievals in parallel. Performance is determined by how the data is broken into "buckets" and how the buckets are assigned to disks. While some work has been done on declustering uniformly distributed low-dimensional data, little has been done on declustering non-uniform high-dimensional data. To decluster non-uniform data, a distribution-sensitive bucketing algorithm is crucial for achieving good performance. In this paper we propose a simple and efficient distribution-sensitive bucketing algorithm that uses shifted Hilbert curves to adapt to the underlying data distribution. Our proposed declustering algorithm performs well compared with previous work, which has mostly focused on the bucket-to-disk allocation scheme. Our experimental results show that it achieves a performance improvement of up to 5 times relative to the two leading algorithms.
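To make the two steps named in the abstract concrete (bucketing, then bucket-to-disk assignment), here is a minimal sketch under stated assumptions. It is not the authors' algorithm: it uses a single, unshifted 2-D Hilbert curve, assumes points lie in the unit square, cuts the curve order into equal-count buckets so that dense regions get more buckets, and assigns buckets to disks round-robin. The names `hilbert_index`, `decluster`, `order`, `bucket_capacity`, and `num_disks` are hypothetical and chosen only for illustration.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve that fills
    a 2^order x 2^order grid (standard iterative xy-to-d conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def decluster(points, order, bucket_capacity, num_disks):
    """Illustrative only: sort points by Hilbert position, cut the sorted
    sequence into equal-count buckets, and deal buckets to disks round-robin.

    Assumptions (not from the paper): 2-D points in [0, 1) x [0, 1),
    a single unshifted curve, and round-robin allocation.
    """
    cells = 1 << order

    def cell(v):
        # Map a coordinate in [0, 1) to a grid cell, clamping the upper edge.
        return min(int(v * cells), cells - 1)

    ordered = sorted(
        points,
        key=lambda p: hilbert_index(order, cell(p[0]), cell(p[1])),
    )
    # Equal-count buckets follow the data density along the curve.
    buckets = [
        ordered[i:i + bucket_capacity]
        for i in range(0, len(ordered), bucket_capacity)
    ]
    disks = [[] for _ in range(num_disks)]
    for i, bucket in enumerate(buckets):
        disks[i % num_disks].append(bucket)
    return disks
```

Because consecutive buckets along the curve tend to cover neighboring regions, dealing them to different disks lets a range query that touches several adjacent buckets read them in parallel. The shifted-curve refinement described in the abstract is not reproduced here.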

Keywords

Allocation Scheme; Range Query; Data Block; Average Response Time; Neighbor Graph
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Hak-Cheol Kim (1)
  • Mario A. Lopez (2)
  • Scott T. Leutenegger (2)
  • Ki-Joune Li (1)
  1. School of Electrical and Computer Engineering, Pusan National University, Pusan, Korea
  2. Department of Computer Science, University of Denver, Denver, U.S.A.
