Highly Available Clouds: System Modeling, Evaluations, and Open Challenges

  • Chapter
Research Advances in Cloud Computing

Abstract

Adopting cloud-based solutions is becoming an indispensable strategy for enterprises, since it brings many advantages, such as low cost. To meet this demand, however, cloud providers face a major resource management challenge: how can they provide highly available services while relying on finite computational resources and limited physical infrastructure? Understanding the components and operations of a cloud data center is key to managing resources optimally and to estimating how physical and logical failures affect users’ perception. This chapter explores computational modeling theories for representing a cloud infrastructure, focusing on how to estimate and model cloud availability.

Notes

  1. http://tl9000.org/about/tl9000/overview.html.

  2. VMware: Business Continuity and Disaster Recovery, http://www.vmware.com/solutions/business-continuity.html.

  3. aws.amazon.com.

  4. https://cloud.google.com/.

  5. https://cloud.google.com/sql/faq.

  6. https://azure.microsoft.com/pt-br/overview/what-is-azure/.

  7. https://www.ibm.com/watson/.

  8. https://www.ibm.com/cloud-computing/bluemix.

  9. http://www.ibm.com/software/products/en/ibm-cloud-orchestrator.

  10. http://www.futurefacilities.com/solutions/data-centers/.

  11. http://www.coolsimsoftware.com/Home.aspx.

  12. According to the authors, “an irrational environment is where a network operator is worried more about a big failure disconnecting all clients for 1 h at the same time than for multiple small failures throughout the year disconnecting every client for 1 h on average [14].”

  13. http://www.gartner.com/newsroom/id/3354117.

References

  1. ANSI/BICSI 002, Data center design and implementation best practices. Retrieved November 2016, from https://www.bicsi.org/uploadedFiles/BICSI_Website/Global_Community/Presentations/CALA/Ciordia_002_Colombia_2016.pdf.

  2. Cost of data center outages: Data center performance benchmark series. Retrieved November 2016, from http://www.emersonnetworkpower.com/en-US/Resources/Market/Data-Center/Latest-Thinking/Ponemon/Documents/2016-Cost-of-Data-Center-Outages-FINAL-2.pdf/.

  3. Data center disaster recovery and backup solution. enterprise. Retrieved November 2016, from enterprise.huawei.com/ilink/enenterprise/download/HW_322364.

  4. Relationship Between Availability and Reliability. Retrieved November 2016, from http://www.weibull.com/hotwire/issue26/relbasics26.htm.

  5. Top 4 data center outages of 2014. Retrieved November 2016, from http://www.cyrusone.com/blog/top-5-data-center-outages-of-2014/.

  6. Bai, H. (2014). Zen of cloud: Learning cloud computing by examples on Microsoft Azure. CRC Press.

  7. Barroso, L. A., Clidaras, J., & Hölzle, U. (2013). The datacenter as a computer: An introduction to the design of warehouse-scale machines. Synthesis Lectures on Computer Architecture, 8(3), 1–154.

  8. Bauer, E., & Adams, R. (2012). Reliability and availability of cloud computing. Wiley.

  9. Beach, B. (2014). Pro PowerShell for Amazon Web Services: DevOps for the AWS cloud. Apress.

  10. Clarke, E. M., Klieber, W., Nováček, M., & Zuliani, P. (2011). Model checking and the state explosion problem. In LASER Summer School on Software Engineering, pp. 1–30. Springer.

  11. Chen, J., Liu, Y., Cui, H., & Li, Y. (2013). Methods with low complexity for evaluating cloud service reliability. In Proceedings 16th International Symposium on Wireless Personal Multimedia Communications, pp. 1–5. IEEE.

  12. Dantas, J., Matos, R., Araujo, J., & Maciel, P. (2012). An availability model for eucalyptus platform: An analysis of warm-standy replication mechanism. In 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1664–1669. IEEE.

  13. Dantas, J., Matos, R., Araujo, J., & Maciel, P. (2015). Eucalyptus-based private clouds: availability modeling and comparison to the cost of a public cloud. Computing, 97(11), 1121–1140.

  14. Dixit, A., Mahloo, M., Lannoo, B., Chen, J., Wosinska, L., Colle, D., & Pickavet, M. (2014). Protection strategies for next generation passive optical networks-2. In 2014 International Conference on Optical Network Design and Modeling, pp. 13–18. IEEE.

  15. Endo, P. T., Rodrigues, M., Gonçalves, G. E., Kelner, J., Sadok, D. H., & Curescu, C. (2016). High availability in clouds: Systematic review and research challenges. Journal of Cloud Computing, 5(1), 16.

  16. Gailey, G., Taubensee, J., Rabeler, C., Glick, A., & Squillace, R.: Azure resiliency technical guidance: Recovery from a region-wide service disruption. Retrieved December 2016. https://docs.microsoft.com/en-us/azure/resiliency/resiliency-technical-guidance-recovery-loss-azure-region.

  17. Geng, H. (2014). Data center handbook. Wiley.

  18. Ghemawat, S., Gobioff, H., & Leung, S.-T. (2003). The Google file system. In ACM SIGOPS Operating Systems Review, vol. 37, pp. 29–43. ACM.

  19. Gill, P., Jain, N., & Nagappan, N. (2011). Understanding network failures in data centers: Measurement, analysis, and implications. In ACM SIGCOMM Computer Communication Review, vol. 41, pp. 350–361. ACM.

  20. Gonçalves, G., Endo, P. T., Rodrigues, M., Kelner, J., Sadok, D., & Curescu, C. (2016). Risk-based model for availability estimation of saf redundancy models. In 2016 IEEE Symposium on Computers and Communication (ISCC), pp. 886–891. IEEE.

  21. Gonzalez, A. J., & Helvik, B. E. (2013). Hybrid cloud management to comply efficiently with sla availability guarantees. In 2013 12th IEEE International Symposium on Network Computing and Applications (NCA), pp. 127–134. IEEE.

  22. Hoelzle, U., & Barroso, L. (2009). The datacenter as a computer. Morgan and Claypool.

  23. Høyland, A., & Rausand, M. (2009). System reliability theory: models and statistical methods, vol. 420. Wiley.

  24. Jammal, M., Kanso, A., Heidari, P., & Shami, A. (2016). A formal model for the availability analysis of cloud deployed multi-tiered applications. pp. 82–87. IEEE.

  25. Kao, W., & Geng, H. (2015). Renewable and clean energy for data centers. Data Center Handbook, pp. 559–576.

  26. Khazaei, H., Mišić, J., Mišić, V. B., & Mohammadi, N. B. (2012). Availability analysis of cloud computing centers. In Global Communications Conference (GLOBECOM), 2012 IEEE, pp. 1957–1962. IEEE.

  27. Kosik, W. J., & Geng, H. (2014). Energy and sustainability in data centers. Data Center Handbook, pp. 15–45.

  28. ADC Krone. (2008). Tia-942: Data center standards overview.

  29. Longo, F., Ghosh, R., Naik, V.K., & Trivedi, K.S. (2011). A scalable availability model for infrastructure-as-a-service cloud. In 2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN), pp. 335–346. IEEE.

  30. Machida, F., Kim, D. S., & Trivedi, K. S. (2013). Modeling and analysis of software rejuvenation in a server virtualized system with live VM migration. Performance Evaluation, 70(3), 212–230.

  31. Malhotra, M., & Trivedi, K. S. (1994). Power-hierarchy of dependability-model types. IEEE Transactions on Reliability, 43(3), 493–502.

  32. Marrone, S. (2015). Using bayesian networks for highly available cloud-based web applications. Journal of Reliable Intelligent Environments, 1(2–4), 87–100.

  33. Meisner, D., Wu, J., & Wenisch, T. F. (2012). Bighouse: A simulation infrastructure for data center systems. In 2012 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 35–45. IEEE.

  34. Melo, M., Araujo, J., Matos, R., Menezes, J., & Maciel, P. (2013). Comparative analysis of migration-based rejuvenation schedules on cloud availability. In 2013 IEEE International Conference on Systems, Man, and Cybernetics, pp. 4110–4115. IEEE.

  35. Melo, M., Maciel, P., Araujo, J., Matos, R., & Araújo, C. (2013). Availability study on cloud computing environments: Live migration as a rejuvenation mechanism. In 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 1–6. IEEE.

  36. Miglierina, M., Gibilisco, G. P., Ardagna, D., & Di Nitto, E. (2013). Model based control for multi-cloud applications. In 2013 5th International Workshop on Modeling in Software Engineering (MiSE), pp. 37–43. IEEE.

  37. Nae, V., Prodan, R., & Iosup, A. (2014). Sla-based operations of massively multiplayer online games in clouds. Multimedia Systems, 20(5), 521–544.

  38. Nguyen, T. A., Kim, D. S., & Park, J. S. (2016). Availability modeling and analysis of a data center for disaster tolerance. Future Generation Computer Systems, 56, 27–50.

  39. Noor, T. H., Sheng, Q. Z., Yao, L., Dustdar, S., & Ngu, A. H. H. (2016). CloudArmor: Supporting reputation-based trust management for cloud services. IEEE Transactions on Parallel and Distributed Systems, 27(2), 367–380.

  40. Pelánek, R. (2008). Fighting state space explosion: Review and evaluation. In International Workshop on Formal Methods for Industrial Critical Systems, pp. 37–52. Springer.

  41. Pham, C., Cao, P., Kalbarczyk, Z., & Iyer, R. K. (2012). Toward a high availability cloud: Techniques and challenges. In IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN 2012), pp. 1–6. IEEE.

  42. Ro, C. (2015). Modeling and analysis of memory virtualization in cloud computing. Cluster Computing, 18(1), 177–185.

  43. SAForum. (September, 2011). Service Availability Forum Service Availability Interface—Overview SAI-Overview-B.05.03. SAForum.

  44. Shvachko, K., Kuang, H., Radia, S., & Chansler, R. (2010). The Hadoop distributed file system. In 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–10. IEEE.

  45. Szatmári, Z., Kövi, A., & Reitenspiess, M. (2008). Applying mda approach for the sa forum platform. In Proceedings of the 2nd Workshop on Middleware-Application Interaction: Affiliated with the DisCoTec Federated Conferences 2008, pp. 19–24. ACM.

  46. ASHRAE Technical Committee. (2011). Thermal guidelines for data processing environments expanded data center classes and usage guidance.

  47. Toeroe, M., & Tam, F. (2012). Service availability: principles and practice. Wiley.

  48. Trivedi, K., Sathaye, A., & Ramani, S. Availability modeling in practice.

  49. Turner, W. P., Seader, J. H., & Brill, K. J. (2006). Tier classifications define site infrastructure performance. Uptime Institute, 17.

Acknowledgements

This work was supported by the RLAM Innovation Center, Ericsson Telecomunicações S.A., Brazil.

Author information

Corresponding author

Correspondence to Patricia Takako Endo.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Endo, P.T. et al. (2017). Highly Available Clouds: System Modeling, Evaluations, and Open Challenges. In: Chaudhary, S., Somani, G., Buyya, R. (eds) Research Advances in Cloud Computing. Springer, Singapore. https://doi.org/10.1007/978-981-10-5026-8_2

  • DOI: https://doi.org/10.1007/978-981-10-5026-8_2

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-5025-1

  • Online ISBN: 978-981-10-5026-8

  • eBook Packages: Computer Science (R0)
