Abstract
In distributed database systems, tables are frequently fragmented and replicated over a number of sites in order to reduce network communication costs. How to fragment, when to replicate, and how to allocate the fragments to the sites are challenging problems that have previously been addressed either through static fragmentation, replication, and allocation or through a priori query analysis. Many emerging applications of distributed database systems generate very dynamic workloads with frequent changes in access patterns from different sites. In such contexts, continuous refragmentation and reallocation can significantly improve performance. In this paper we present DYFRAM, a decentralized approach for dynamic table fragmentation and allocation in distributed database systems based on observation of the access patterns of sites to tables. The approach performs fragmentation, replication, and reallocation based on recent access history, aiming to maximize the number of local accesses relative to accesses from remote sites. We show through simulations and experiments on the DASCOSA distributed database system that the approach significantly reduces communication costs for typical access patterns, which demonstrates its feasibility.
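The idea of reallocating a fragment based on recent access history can be illustrated with a minimal sketch. This is not the DYFRAM algorithm itself, whose details the abstract does not give; the class `FragmentAccessLog`, the fixed-size sliding window, and the "move to the site with the most recent accesses" rule are all assumptions made for illustration only.

```python
from collections import Counter, deque


class FragmentAccessLog:
    """Hypothetical sliding window over the most recent accesses to one fragment."""

    def __init__(self, owner, window=100):
        self.owner = owner                   # site currently storing the fragment
        self.recent = deque(maxlen=window)   # site ids of the latest accesses

    def record(self, site):
        """Remember which site issued an access; old entries fall out of the window."""
        self.recent.append(site)

    def local_ratio(self):
        """Fraction of recent accesses issued by the owning site."""
        if not self.recent:
            return 1.0
        return sum(1 for s in self.recent if s == self.owner) / len(self.recent)

    def preferred_site(self):
        """Site with the most recent accesses; the current owner wins ties."""
        counts = Counter(self.recent)
        best = self.owner
        for site, n in counts.items():
            if n > counts[best]:
                best = site
        return best

    def maybe_migrate(self):
        """Assumed rule: move the fragment if another site dominates recent accesses."""
        target = self.preferred_site()
        moved = target != self.owner
        self.owner = target
        return moved


log = FragmentAccessLog(owner="A", window=4)
for site in ["A", "B", "B", "B"]:
    log.record(site)
before = log.local_ratio()    # share of local accesses while "A" owns the fragment
moved = log.maybe_migrate()   # "B" issued most recent accesses, so the fragment moves
after = log.local_ratio()     # share of local accesses once "B" owns the fragment
```

In this toy run, moving the fragment to the dominant site raises the share of local accesses, which is the quantity the abstract says the approach aims to maximize. A real system would also have to weigh migration cost, replication, and reads versus writes, none of which this sketch models.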
Additional information
Communicated by Kam-Fai Wong.
Supported by grant #176894/V30 from the Norwegian Research Council.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Hauglid, J.O., Ryeng, N.H. & Nørvåg, K. DYFRAM: dynamic fragmentation and replica management in distributed database systems. Distrib Parallel Databases 28, 157–185 (2010). https://doi.org/10.1007/s10619-010-7068-1
DOI: https://doi.org/10.1007/s10619-010-7068-1