Distributed and Parallel Databases, Volume 28, Issue 2–3, pp 157–185

DYFRAM: dynamic fragmentation and replica management in distributed database systems

  • Jon Olav Hauglid
  • Norvald H. Ryeng
  • Kjetil Nørvåg
Open Access

Abstract

In distributed database systems, tables are frequently fragmented and replicated over a number of sites in order to reduce network communication costs. How to fragment, when to replicate, and how to allocate the fragments to the sites are challenging problems that have previously been solved either by static fragmentation, replication, and allocation, or by a priori query analysis. Many emerging applications of distributed database systems generate very dynamic workloads with frequent changes in access patterns from different sites. In such contexts, continuous refragmentation and reallocation can significantly improve performance. In this paper, we present DYFRAM, a decentralized approach for dynamic table fragmentation and allocation in distributed database systems based on observation of the access patterns of sites to tables. The approach performs fragmentation, replication, and reallocation based on recent access history, aiming to maximize the number of local accesses relative to accesses from remote sites. We show through simulations and experiments on the DASCOSA distributed database system that the approach significantly reduces communication costs for typical access patterns, thus demonstrating its feasibility.

Keywords

Distributed DBMS, Fragmentation, Replication, Physical database design

Copyright information

© The Author(s) 2010

Authors and Affiliations

  • Jon Olav Hauglid (1)
  • Norvald H. Ryeng (1)
  • Kjetil Nørvåg (1)
  1. Dept. of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway
