A Data Dependency and Access Threshold Based Replication Strategy for Multi-cloud Workflow Applications

  • Fei Xie
  • Jun Yan
  • Jun Shen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11434)


Data replication is an important sub-area of data management in cloud-based workflows. Data-intensive workflow applications benefit greatly from cloud environments, but they usually need data management strategies to handle large amounts of data. At the same time, multi-cloud environments are becoming increasingly popular. We propose a cost-effective, threshold-based data replication strategy for data-intensive workflows in multi-cloud environments that considers both data dependency and data access times. Simulation results show that, by considering both factors, our approach can greatly reduce the total cost of data-intensive workflow applications in multi-cloud environments.
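The abstract does not state the replication rule itself, so the following is only a minimal sketch of what a threshold-based decision combining access counts and data dependency could look like. Everything in it is an assumption made for illustration: the `Dataset` class, the `should_replicate` function, the reading of "data access times" as a count of accesses, and the rule that each dependent dataset already placed in the target cloud counts as one extra access. It is not the authors' algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    home_cloud: str
    # Names of datasets this one is used together with
    # (hypothetical dependency model, not taken from the paper).
    depends_on: set = field(default_factory=set)


def should_replicate(dataset, target_cloud, access_counts, placement, threshold):
    """Return True if `dataset` should get a replica in `target_cloud`.

    Assumed rule (illustrative only): replicate once the number of accesses
    from `target_cloud`, plus the number of dependent datasets already
    placed in `target_cloud`, reaches `threshold`.
    """
    if target_cloud in placement.get(dataset.name, set()):
        return False  # a copy already exists in the target cloud
    accesses = access_counts.get((dataset.name, target_cloud), 0)
    co_located = sum(
        1 for dep in dataset.depends_on
        if target_cloud in placement.get(dep, set())
    )
    return accesses + co_located >= threshold


# Example: d1 lives in cloud_a, depends on d2, and d2 lives in cloud_b.
d1 = Dataset("d1", "cloud_a", depends_on={"d2"})
placement = {"d1": {"cloud_a"}, "d2": {"cloud_b"}}
access_counts = {("d1", "cloud_b"): 2}
print(should_replicate(d1, "cloud_b", access_counts, placement, threshold=3))
```

In this sketch the dependency term lets a dataset qualify for replication slightly earlier when the data it is used with is already in the target cloud. A cost-based variant would compare the storage cost of the replica against the transfer cost it saves, which would need pricing parameters that the abstract does not provide.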


Multi-cloud · Data management · Data replication · Data dependency · Data access times



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Wollongong, Wollongong, Australia
