Abstract
Parallelizing I/O operations through effective declustering of data is becoming essential for scaling up the performance of parallel databases and high-performance systems. Declustering has been shown to be an NP-complete problem in some contexts, and several heuristic methods have been proposed to solve it. However, most of these methods are ineffective in important cases, such as queries with different access frequencies or data items of different sizes. In this paper, we propose a hypergraph model to formulate the declustering problem. Analysis of the proposed model yields several theoretical results, and the model can represent a wide range of declustering problems. Furthermore, the hypergraph declustering model serves as the basis for new heuristic methods, including a greedy method and a hybrid declustering method. Experiments show that the proposed methods can achieve better performance than several existing declustering methods.
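As a rough illustration of the kind of formulation the abstract describes, the sketch below treats data items as vertices and queries as hyperedges, weights each query by its access frequency and each item by its size, and assigns items to disks greedily. It is not the paper's algorithm: the cost measure (the largest per-disk load a query induces), the placement order, the tie-breaking rule, and the names `weighted_cost` and `greedy_decluster` are all assumptions made for this example.

```python
def weighted_cost(queries, freq, alloc, size, num_disks):
    """Frequency-weighted sum of per-query response times.

    Assumption: a query's response time is the largest total item size
    it must read from any single disk. Items not yet placed are ignored.
    """
    total = 0
    for query, f in zip(queries, freq):
        load = [0] * num_disks
        for item in query:
            if item in alloc:
                load[alloc[item]] += size[item]
        total += f * max(load)
    return total


def greedy_decluster(items, queries, freq, size, num_disks):
    """Place items one at a time on the disk that keeps the cost lowest.

    Hypothetical heuristic: larger items are placed first; ties are broken
    by the lighter disk, then by the lower disk index.
    """
    alloc = {}
    disk_load = [0] * num_disks
    for item in sorted(items, key=lambda v: (-size[v], v)):
        candidates = []
        for disk in range(num_disks):
            alloc[item] = disk  # tentative placement
            cost = weighted_cost(queries, freq, alloc, size, num_disks)
            candidates.append((cost, disk_load[disk], disk))
        _, _, best = min(candidates)
        alloc[item] = best
        disk_load[best] += size[item]
    return alloc
```

With four unit-size items, two disks, and the queries `{a, b}`, `{c, d}`, and `{a, c}`, this sketch places the two items of every query on different disks, so each query reads one item per disk in parallel.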
Cite this article
Liu, DR., Wu, MY. A Hypergraph Based Approach to Declustering Problems. Distributed and Parallel Databases 10, 269–288 (2001). https://doi.org/10.1023/A:1019269409432
DOI: https://doi.org/10.1023/A:1019269409432