Evaluation of site availability exploitation towards performance optimization in data grids

Cluster Computing

Abstract

Data in distributed systems are often replicated across several storage elements to make them easier to access. Replication reduces execution time and bandwidth consumption, balances load, and increases data availability and quality of service. Many replication strategies have therefore been proposed in the literature. In this work, a new evaluation metric for replication strategies is introduced and experimentally evaluated. This metric, called SAvE, captures a key feature that the literature has neglected: the ability of a replication strategy to exploit the most available sites in the system. Designing such a metric requires an accurate determination of the availability degree of each site. A new measurement of site availability, denoted SA, is therefore designed for integration into SAvE, overcoming the drawbacks of existing measurements. Extensive experiments are performed using the OptorSim simulator to show the accuracy and effectiveness of our contributions.

Author information

Corresponding author

Correspondence to T. Hamrouni.

About this article

Cite this article

Hamdeni, C., Hamrouni, T. & Ben Charrada, F. Evaluation of site availability exploitation towards performance optimization in data grids. Cluster Comput 21, 1967–1980 (2018). https://doi.org/10.1007/s10586-018-2836-1

  • DOI: https://doi.org/10.1007/s10586-018-2836-1
