A hybrid data replication strategy with fuzzy-based deletion for heterogeneous cloud data centers

Large-scale cloud-based applications place ever higher demands on data center storage. In a large-scale cloud environment, data replication is an appropriate way to manage data files because it improves data reliability and availability. In this paper, we propose a data replication algorithm called hybrid replication strategy (HRS) that covers the replica placement, selection, and replacement steps. HRS has three main phases and is suitable for replicating data files in the cloud. In the first phase, it selects the best site (i.e., the most central site with a high number of accesses) for storing a new replica, in order to reduce access time. In the second phase, HRS chooses the best replica node for each user based on parameters such as CPU processing capability, network transmission capability, disk I/O capability, load, and network latency. In the third phase, the replacement decision is made in order to provide better response time. HRS determines how valuable each replica is by means of a fuzzy inference system with three input parameters (i.e., number of accesses, cost, and the last time the replica was accessed). The new replication policy is simulated using the CloudSim toolkit. The proposed mechanism replicates data over the cloud nodes reasonably well and is easy to implement in a real environment. Experimental results show that HRS can significantly enhance availability, performance, and load balance for data-intensive applications, and that it does so without introducing additional overhead.
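The third phase can be illustrated with a small fuzzy scoring sketch. The abstract does not give the membership functions or rule base used by HRS, so everything below is an assumption made for illustration: the three inputs are taken to be normalised to [0, 1], each input has only two linear fuzzy sets (`low` and `high`), the rule table `RULES` is hypothetical, and the inference is a simple zero-order Sugeno scheme (min for AND, weighted average for defuzzification) rather than whatever scheme the paper uses. The names `replica_value` and `pick_victim` are likewise invented here.

```python
from itertools import product


def clamp(x: float) -> float:
    """Restrict x to the unit interval."""
    return min(max(x, 0.0), 1.0)


def memberships(x: float) -> dict:
    """Degrees of membership in two linear fuzzy sets (an assumed shape)."""
    x = clamp(x)
    return {"low": 1.0 - x, "high": x}


# Hypothetical rule base: (accesses, cost, idle time) -> replica value.
# Frequently accessed, costly, recently used replicas are rated most valuable.
RULES = {
    ("high", "high", "low"): 1.0,
    ("high", "high", "high"): 0.7,
    ("high", "low", "low"): 0.8,
    ("high", "low", "high"): 0.5,
    ("low", "high", "low"): 0.6,
    ("low", "high", "high"): 0.3,
    ("low", "low", "low"): 0.4,
    ("low", "low", "high"): 0.0,
}


def replica_value(accesses: float, cost: float, idle: float) -> float:
    """Score a replica in [0, 1] from three normalised inputs.

    accesses: normalised number of accesses
    cost:     normalised cost of the replica
    idle:     normalised time since the last access (1 = longest idle)
    """
    mu_a, mu_c, mu_i = memberships(accesses), memberships(cost), memberships(idle)
    numerator = 0.0
    denominator = 0.0
    for a, c, i in product(("low", "high"), repeat=3):
        # Firing strength of the rule: fuzzy AND implemented as min.
        strength = min(mu_a[a], mu_c[c], mu_i[i])
        numerator += strength * RULES[(a, c, i)]
        denominator += strength
    # The two sets are complementary, so at least one rule fires with
    # strength >= 0.5 and the denominator is never zero.
    return numerator / denominator


def pick_victim(replicas: dict) -> str:
    """Return the name of the least valuable replica (deletion candidate)."""
    return min(replicas, key=lambda name: replica_value(*replicas[name]))


if __name__ == "__main__":
    candidates = {
        "hot_file": (0.9, 0.5, 0.1),   # often accessed, recently used
        "cold_file": (0.1, 0.5, 0.9),  # rarely accessed, long idle
    }
    print(pick_victim(candidates))
```

Under these assumed rules, the replica with few accesses and a long idle time receives the lowest score and is the one proposed for deletion when storage space is needed.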





Author information

Correspondence to M. M. Javidi.

About this article

Cite this article

Mansouri, N., Javidi, M.M. A hybrid data replication strategy with fuzzy-based deletion for heterogeneous cloud data centers. J Supercomput 74, 5349–5372 (2018) doi:10.1007/s11227-018-2427-1


  • Cloud computing
  • Replication
  • CloudSim
  • Fuzzy system