Model-Based Performability and Dependability Evaluation of a System with VM Migration as Rejuvenation in the Presence of Bursty Workloads

Journal of Network and Systems Management

Abstract

Software aging accumulation leads to increased resource consumption, and memory leaks are among the best-known problems related to software aging. A bursty workload can accelerate the activation of software aging bugs because it requires instantaneous resource allocation; the resulting rapid allocation and deallocation of resources may lead to software aging through memory leaks. Moreover, a bursty workload may cause a resource exhaustion failure in a system already overloaded by software aging accumulation. Virtual Machine (VM) migration schedules can be used to mitigate software aging by moving services away from a compromised physical host. Despite considerable progress in this area, the state of the art still lacks a modeling framework for performability and dependability evaluation of VM migration as rejuvenation in a system under bursty workloads. This paper proposes a set of Stochastic Reward Net (SRN) models to fill this research gap. We consider five scenarios covering different bursty workload conditions and present a specific model to cover the uncertainties related to bursty workloads. Our results identify the rejuvenation schedule that maximizes system performability and dependability in each scenario. The proposed modeling framework may be useful for supporting management decisions in virtualized environments.

Notes

  1. Bursty workload occurrence usually causes a system utilization peak.

  2. https://cloud.google.com/.

  3. https://aws.amazon.com/.

  4. https://simpy.readthedocs.io/en/latest/.

  5. Arcs terminating in a circle instead of an arrowhead.

  6. Receiving and returning.

  7. We consider a year with 365 days.

  8. We consider a month with 30 days, that is, \(30 \cdot 24 \cdot 60 = 43{,}200\) minutes.

  9. Note that, in this case, performance degrades because of software aging accumulation, so the computed metrics relate to system performability [37].
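
The time bases in notes 7 and 8 can be made concrete with a short sketch. It assumes (this is not stated in the notes themselves) that the 30-day month is used to turn a steady-state availability into expected downtime; the function name and the sample availability value are hypothetical.

```python
# Time bases taken from notes 7 and 8: a 365-day year and a 30-day month.
HOURS_PER_YEAR = 365 * 24            # 8760 hours
MINUTES_PER_MONTH = 30 * 24 * 60     # 43200 minutes


def downtime_minutes_per_month(availability: float) -> float:
    """Expected downtime (minutes) in a 30-day month for a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must lie in [0, 1]")
    return (1.0 - availability) * MINUTES_PER_MONTH


# Illustrative input only; 0.999 is not a result reported in the paper.
print(downtime_minutes_per_month(0.999))
```

An availability of 0.999 corresponds to roughly 43 minutes of downtime in such a month.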

References

  1. Akoush, S., Sohan, R., Rice, A., Moore, A.W., Hopper, A.: Predicting the performance of virtual machine migration. In: 2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 37–46. IEEE (2010)

  2. Araujo, J., Matos, R., Maciel, P., Matias, R., Beicker, I.: Experimental evaluation of software aging effects on the eucalyptus cloud computing infrastructure. In: Proceedings of the Middleware 2011 Industry Track Workshop, p. 4. ACM (2011)

  3. Avizienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. Dependable Secure Comput. 1(1), 11–33 (2004)

  4. Avritzer, A., Weyuker, E.J.: Monitoring smoothly degrading systems for increased dependability. Empir. Softw. Eng. 2(1), 59–77 (1997)

  5. Bause, F.: Queueing petri nets-a formalism for the combined qualitative and quantitative analysis of systems. In: Proceedings of 5th International Workshop on Petri Nets and Performance Models, pp. 14–23. IEEE (1993)

  6. Bobbio, A.: System modelling with petri nets. In: Systems Reliability Assessment, pp. 103–143. Springer (1990)

  7. Clark, C., Fraser, K., Hand, S., Hansen, J.G., Jul, E., Limpach, C., Pratt, I., Warfield, A.: Live migration of virtual machines. In: Proceedings of the 2nd Conference on Symposium on Networked Systems Design & Implementation-Volume 2, pp. 273–286. USENIX Association (2005)

  8. Cotroneo, D., Natella, R., Pietrantuono, R., Russo, S.: A survey of software aging and rejuvenation studies. ACM J. Emerg. Technol. Comput. Syst. 10(1), 8 (2014)

  9. Dohi, T., Zheng, J., Okamura, H., Trivedi, K.S.: Optimal periodic software rejuvenation policies based on interval reliability criteria. Reliab. Eng. Syst. Saf. 180, 463–475 (2018)

  10. Escheikh, M., Tayachi, Z., Barkaoui, K.: Performability evaluation of server virtualized systems under bursty workload. IFAC-PapersOnLine 51(7), 45–50 (2018)

  11. Feuerlicht, G., Burkon, L., Sebesta, M.: Cloud computing adoption: what are the issues. Syst. Integr. 18(2), 187–192 (2011)

  12. Garg, S., Van Moorsel, A., Vaidyanathan, K., Trivedi, K.S.: A methodology for detection and estimation of software aging. In: Proceedings Ninth International Symposium on Software Reliability Engineering (Cat. No. 98TB100257), pp. 283–292. IEEE (1998)

  13. Grottke, M., Matias, R., Trivedi, K.S.: The fundamentals of software aging. In: 2008 IEEE International Conference on Software Reliability Engineering Workshops (ISSRE Wksp), pp. 1–6. IEEE (2008)

  14. Gupta, A.K., Zeng, W.B., Wu, Y.: Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics. Springer, New York (2010)

  15. Huang, Y., Kintala, C., Kolettis, N., Fulton, N.D.: Software rejuvenation: analysis, module and applications. In: Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers, pp. 381–390. IEEE (1995)

  16. Jain, R.: The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling. Wiley, New York (1990)

  17. Kleinrock, L.: Queueing Systems, vol. I: Theory (1975)

  18. Kounev, S.: Performance modeling and evaluation of distributed component-based systems using queueing petri nets. IEEE Trans. Softw. Eng. 32(7), 486–502 (2006)

  19. Kuchárik, M., Balogh, Z.: Modeling of uncertainty with petri nets. In: Asian Conference on Intelligent Information and Database Systems, pp. 499–509. Springer (2019)

  20. Liu, H., Xu, C.Z., Jin, H., Gong, J., Liao, X.: Performance and energy modeling for live migration of virtual machines. In: Proceedings of the 20th International Symposium on High Performance Distributed Computing, pp. 171–182. ACM (2011)

  21. Low, C., Chen, Y., Wu, M.: Understanding the determinants of cloud computing adoption. Ind. Manag. Data Syst. 111(7), 1006–1023 (2011)

  22. Macêdo, A., Ferreira, T.B., Matias, R.: The mechanics of memory-related software aging. In: 2010 IEEE Second International Workshop on Software Aging and Rejuvenation, pp. 1–5. IEEE (2010)

  23. Machida, F., Kim, D.S., Trivedi, K.S.: Modeling and analysis of software rejuvenation in a server virtualized system. In: 2010 IEEE Second International Workshop on Software Aging and Rejuvenation, pp. 1–6. IEEE (2010)

  24. Machida, F., Kim, D.S., Trivedi, K.S.: Modeling and analysis of software rejuvenation in a server virtualized system with live vm migration. Perform. Eval. 70(3), 212–230 (2013)

  25. Machida, F., Miyoshi, N.: An optimal stopping problem for software rejuvenation in a job processing system. In: 2015 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 139–143. IEEE (2015)

  26. Machida, F., Miyoshi, N.: Analysis of an optimal stopping problem for software rejuvenation in a deteriorating job processing system. Reliab. Eng. Syst. Saf. 168, 128–135 (2017)

  27. Machida, F., Nicola, V.F., Trivedi, K.S.: Job completion time on a virtualized server subject to software aging and rejuvenation. In: 2011 IEEE Third International Workshop on Software Aging and Rejuvenation, pp. 44–49. IEEE (2011)

  28. Machida, F., Nicola, V.F., Trivedi, K.S.: Job completion time on a virtualized server with software rejuvenation. ACM J. Emerg. Technol. Comput. Syst. 10(1), 10 (2014)

  29. Machida, F., Xiang, J., Tadano, K., Maeno, Y.: Aging-related bugs in cloud computing software. In: 2012 IEEE 23rd International Symposium on Software Reliability Engineering Workshops, pp. 287–292. IEEE (2012)

  30. Machida, F., Xiang, J., Tadano, K., Maeno, Y.: Lifetime extension of software execution subject to aging. IEEE Trans. Reliab. 66(1), 123–134 (2016)

  31. Maciel, P., Matos, R., Silva, B., Figueiredo, J., Oliveira, D., Fé, I., Maciel, R., Dantas, J.: Mercury: performance and dependability evaluation of systems with exponential, expolynomial, and general distributions. In: 2017 IEEE 22nd Pacific Rim International Symposium on Dependable Computing (PRDC), pp. 50–57. IEEE (2017)

  32. Matos, R., Araujo, J., Alves, V., Maciel, P.: Characterization of software aging effects in elastic storage mechanisms for private clouds. In: 2012 IEEE 23rd International Symposium on Software Reliability Engineering Workshops, pp. 293–298. IEEE (2012)

  33. Maziku, H., Shetty, S.: Towards a network aware vm migration: Evaluating the cost of vm migration in cloud data centers. In: 2014 IEEE 3rd International Conference on Cloud Networking (CloudNet), pp. 114–119. IEEE (2014)

  34. Mell, P., Grance, T., et al.: The nist definition of cloud computing (2011)

  35. Melo, M., Araujo, J., Matos, R., Menezes, J., Maciel, P.: Comparative analysis of migration-based rejuvenation schedules on cloud availability. In: 2013 IEEE International Conference on Systems, Man, and Cybernetics, pp. 4110–4115. IEEE (2013)

  36. Melo, M., Maciel, P., Araujo, J., Matos, R., Araujo, C.: Availability study on cloud computing environments: live migration as a rejuvenation mechanism. In: 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 1–6. IEEE (2013)

  37. Meyer, J.F.: Performability: a retrospective and some pointers to the future. Perform. Eval. 14(3–4), 139–156 (1992)

  38. Mijumbi, R., Serrat, J., Gorricho, J.L., Bouten, N., De Turck, F., Boutaba, R.: Network function virtualization: state-of-the-art and research challenges. IEEE Commun. Surv. Tutor. 18(1), 236–262 (2015)

  39. Murata, T.: Petri nets: properties, analysis and applications. Proc. IEEE 77(4), 541–580 (1989). https://doi.org/10.1109/5.24143

  40. Myint, M.T.H., Thein, T.: Availability improvement in virtualized multiple servers with software rejuvenation and virtualization. In: 2010 Fourth International Conference on Secure Software Integration and Reliability Improvement, pp. 156–162. IEEE (2010)

  41. Nguyen, T.A., Min, D., Choi, E., Tran, T.D.: Reliability and availability evaluation for cloud data center networks using hierarchical models. IEEE Access 7, 9273–9313 (2019)

  42. Oliveira, T., Thomas, M., Espadanal, M.: Assessing the determinants of cloud computing adoption: an analysis of the manufacturing and services sectors. Inf. Manag. 51(5), 497–510 (2014)

  43. Patterson, D.A., et al.: A simple way to estimate the cost of downtime. LISA 2, 185–188 (2002)

  44. Pietrantuono, R., Russo, S.: A survey on software aging and rejuvenation in the cloud. Softw. Qual. J. 1–32 (2019)

  45. Salfner, F., Tröger, P., Polze, A.: Downtime analysis of virtual machine live migration. In: The Fourth International Conference on Dependability (DEPEND 2011). IARIA, pp. 100–105 (2011)

  46. Schroeder, B., Gibson, G.A.: Disk failures in the real world: What does an mttf of 1,000,000 hours mean to you? FAST 7, 1–16 (2007)

  47. Siddiqui, S., Darbari, M., Yagyasen, D., et al.: Modelling and simulation of queuing models through the concept of petri nets (2020)

  48. Soltesz, S., Pötzl, H., Fiuczynski, M.E., Bavier, A., Peterson, L.: Container-based operating system virtualization: a scalable, high-performance alternative to hypervisors. ACM SIGOPS Oper. Syst. Rev. 41, 275–287 (2007)

  49. Strunk, A.: Costs of virtual machine live migration: a survey. In: 2012 IEEE Eighth World Congress on Services, pp. 323–329. IEEE (2012)

  50. Thein, T., Park, J.S.: Availability analysis of application servers using software rejuvenation and virtualization. J. Comput. Sci. Technol. 24(2), 339–346 (2009)

  51. Torquato, M., Araujo, J., Umesh, I., Maciel, P.: Sware: a methodology for software aging and rejuvenation experiments. J. Inf. Syst. Eng. Manag. 3(2), 15 (2018)

  52. Torquato, M., Maciel, P., Araujo, J., Umesh, I.: An approach to investigate aging symptoms and rejuvenation effectiveness on software systems. In: 2017 12th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1–6. IEEE (2017)

  53. Torquato, M., Maciel, P., Vieira, M.: A model for availability and security risk evaluation for systems with vmm rejuvenation enabled by vm migration scheduling. IEEE Access 7, 138315–138326 (2019)

  54. Torquato, M., Maciel, P., Vieira, M.: Availability and reliability modeling of vm migration as rejuvenation on a system under varying workload. Softw. Qual. J. 1–25 (2020)

  55. Torquato, M., Torquato, L., Maciel, P., Vieira, M.: Iaas cloud availability planning using models and genetic algorithms. In: 2019 9th Latin-American Symposium on Dependable Computing (LADC), pp. 1–10. IEEE (2019)

  56. Torquato, M., Umesh, I., Maciel, P.: Models for availability and power consumption evaluation of a private cloud with vmm rejuvenation enabled by vm live migration. J. Supercomput. 74(9), 4817–4841 (2018)

  57. Torquato, M., Vieira, M.: Interacting srn models for availability evaluation of vm migration as rejuvenation on a system under varying workload. In: 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), pp. 300–307. IEEE (2018)

  58. Torquato, M., Vieira, M.: An experimental study of software aging and rejuvenation in dockerd. In: 2019 15th European Dependable Computing Conference (EDCC), pp. 1–6. IEEE (2019)

  59. Trivedi, K.S., Vaidyanathan, K., Goseva-Popstojanova, K.: Modeling and analysis of software aging and rejuvenation. In: Proceedings 33rd Annual Simulation Symposium (SS 2000), pp. 270–279. IEEE (2000)

  60. Vaidyanathan, K., Trivedi, K.S.: A comprehensive model for software rejuvenation. IEEE Trans. Dependable Secure Comput. 2(2), 124–137 (2005)

  61. Valmari, A.: The state explosion problem. In: Advanced Course on Petri Nets, pp. 429–528. Springer (1996)

  62. Voorsluys, W., Broberg, J., Venugopal, S., Buyya, R.: Cost of virtual machine live migration in clouds: a performance evaluation. In: IEEE International Conference on Cloud Computing, pp. 254–265. Springer (2009)

  63. Wang, D., Xie, W., Trivedi, K.S.: Performability analysis of clustered systems with rejuvenation under varying workload. Perform. Eval. 64(3), 247–265 (2007)

  64. Yeboah-Boateng, E.O., Essandoh, K.A.: Factors influencing the adoption of cloud computing by small and medium enterprises in developing economies. Int. J. Emerg. Sci. Eng. 2(4), 13–20 (2014)

  65. Zheng, J., Okamura, H., Dohi, T.: A transient interval reliability analysis for software rejuvenation models with phase expansion. Softw. Qual. J. 1–22 (2019)

  66. Zimmermann, A.: Modelling and performance evaluation with timenet 4.4. In: International Conference on Quantitative Evaluation of Systems, pp. 300–303. Springer (2017)

Acknowledgements

This work has been partially supported by the Portuguese Foundation for Science and Technology (FCT), through the PhD Grant SFRH/BD/146181/2019, within the scope of the project CISUC - UID/CEC/00326/2020. This work is also funded by the European Social Fund, through the Regional Operational Program Centro 2020. This work also received support from the AIDA (Adaptive, Intelligent and Distributed Assurance Platform) project, funded by the Operational Program for Competitiveness and Internationalization (COMPETE 2020) and FCT (under the CMU Portugal Program) through Grant POCI-01-0247-FEDER-045907, and from the project TalkConnect, funded by COMPETE 2020 through Grant POCI-01-0247-FEDER-039676.

Author information

Corresponding author

Correspondence to Matheus Torquato.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Stochastic Reward Nets

Stochastic Reward Nets (SRNs) are a sub-type of Petri Nets (PNs). A PN is a 5-tuple, \(PN = (P, T, F, W, M_0)\), where: \(P = \{p_1, p_2, \ldots , p_m\}\) is a finite set of places, \(T = \{t_1, t_2, \ldots , t_n\}\) is a finite set of transitions, \(F \subseteq (P \times T) \cup (T \times P)\) is a set of arcs, \(W: F \rightarrow \{1, 2, 3, \ldots \}\) is a weight function, and \(M_0: P \rightarrow \{0, 1, 2, 3, \ldots \}\) is the initial marking [39].
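
As a minimal sketch of this definition, the 5-tuple can be held in a small Python structure that checks the constraints on \(F\), \(W\), and \(M_0\). The class name `PetriNet`, its field names, and the choice to store \(F\) as the key set of the weight map are assumptions made for illustration; the instance mirrors the simple availability model of Fig. 15, with unit arc weights assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PetriNet:
    """Holds the 5-tuple (P, T, F, W, M_0); F is the key set of `weights`."""
    places: frozenset        # P
    transitions: frozenset   # T
    weights: dict            # W: arc (source, target) -> integer weight
    initial_marking: dict    # M_0: place -> non-negative token count

    def __post_init__(self):
        if self.places & self.transitions:
            raise ValueError("P and T must be disjoint")
        for (src, dst), weight in self.weights.items():
            place_to_transition = src in self.places and dst in self.transitions
            transition_to_place = src in self.transitions and dst in self.places
            if not (place_to_transition or transition_to_place):
                raise ValueError(f"arc {(src, dst)!r} is not in (P x T) U (T x P)")
            # Assumption: a weight of 0 is treated as "no arc" and rejected.
            if not isinstance(weight, int) or weight < 1:
                raise ValueError(f"arc {(src, dst)!r} needs a positive integer weight")
        for place, tokens in self.initial_marking.items():
            if place not in self.places or tokens < 0:
                raise ValueError(f"invalid initial marking for {place!r}")

    @property
    def arcs(self):
        """The arc set F."""
        return frozenset(self.weights)


# Two places, two transitions, four unit-weight arcs, one token in UP.
availability_net = PetriNet(
    places=frozenset({"UP", "DW"}),
    transitions=frozenset({"MTTF", "MTTR"}),
    weights={("UP", "MTTF"): 1, ("MTTF", "DW"): 1,
             ("DW", "MTTR"): 1, ("MTTR", "UP"): 1},
    initial_marking={"UP": 1, "DW": 0},
)
```

Validating at construction time keeps malformed nets, such as an arc that links two places directly, from reaching any later analysis step.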

The graphical representation of a PN has four main components, as presented in Fig. 14: places, transitions, arcs, and tokens. Places hold tokens, and arcs indicate the relation between places and transitions. The PN state (its marking) changes when a transition fires, which moves tokens from the transition's input places to its output places. In SRNs, it is possible to assign time delays to transitions.
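
The firing behaviour can be sketched in a few lines, assuming the standard PN rule: a transition is enabled when every input place holds at least as many tokens as the weight of its input arc, and firing removes those tokens and deposits tokens in the output places. The function names and the places `A` and `B` are hypothetical.

```python
def is_enabled(marking, inputs):
    """True if every input place holds at least the input-arc weight in tokens."""
    return all(marking.get(place, 0) >= weight for place, weight in inputs.items())


def fire(marking, inputs, outputs):
    """Return the marking reached by firing one transition (input is not mutated)."""
    if not is_enabled(marking, inputs):
        raise ValueError("transition is not enabled in this marking")
    new_marking = dict(marking)
    for place, weight in inputs.items():
        new_marking[place] -= weight          # consume tokens from input places
    for place, weight in outputs.items():
        new_marking[place] = new_marking.get(place, 0) + weight  # produce tokens
    return new_marking


# Hypothetical transition: consumes 2 tokens from A, produces 1 token in B.
print(fire({"A": 2, "B": 0}, inputs={"A": 2}, outputs={"B": 1}))
# prints {'A': 0, 'B': 1}
```

Returning a new marking instead of mutating the old one makes it straightforward to enumerate reachable states later.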

Fig. 14 Petri net components

Consider the flow of the simple SRN availability model in Fig. 15. In the initial state, the system is running, which is represented by the token in the UP place. The MTTF transition models the system mean time to failure (MTTF): its firing represents a failure occurrence and moves the token from the UP place to the DW place. System repair is represented by the MTTR transition (mean time to repair, MTTR), whose firing returns the token to the UP place and brings the model back to its initial state.

Fig. 15 Flow of a simple SRN availability model
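
The token flow of this model can be imitated with a small Monte Carlo sketch. It assumes exponentially distributed firing delays, as is usual for timed SRN transitions; the function name and the numeric values for MTTF, MTTR, and the horizon are hypothetical and are not taken from the paper.

```python
import random


def simulate_availability(mttf, mttr, horizon, seed=0):
    """Estimate the fraction of time the token spends in the UP place."""
    rng = random.Random(seed)
    clock = 0.0
    up_time = 0.0
    token_in_up = True  # initial marking: one token in UP
    while clock < horizon:
        mean_delay = mttf if token_in_up else mttr
        delay = rng.expovariate(1.0 / mean_delay)   # exponential firing delay
        end = min(clock + delay, horizon)           # truncate at the horizon
        if token_in_up:
            up_time += end - clock
        clock = end
        token_in_up = not token_in_up               # MTTF or MTTR fires
    return up_time / horizon


# Hypothetical parameters in hours: MTTF = 1000, MTTR = 10.
estimate = simulate_availability(mttf=1000.0, mttr=10.0, horizon=1_000_000.0)
print(f"estimated availability: {estimate:.4f}")
```

The estimate fluctuates with the seed and the horizon; longer horizons bring it closer to the steady-state value.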

We can compute the system availability using the reward measure \(Availability = P\{UP > 0\}\), which captures the probability that the UP place holds at least one token.
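
For the two-state model, this measure has a closed form if both transitions are assumed to have exponentially distributed delays. With failure rate \(\lambda = 1/MTTF\) and repair rate \(\mu = 1/MTTR\), the steady-state balance equation \(\pi_{UP} \lambda = \pi_{DW} \mu\) together with \(\pi_{UP} + \pi_{DW} = 1\) gives \(P\{UP > 0\} = \mu / (\lambda + \mu) = MTTF / (MTTF + MTTR)\). The sketch below evaluates it; the function name and the input values are hypothetical.

```python
def steady_state_availability(mttf, mttr):
    """Steady-state probability of a token in UP for the two-state model."""
    if mttf <= 0 or mttr <= 0:
        raise ValueError("MTTF and MTTR must be positive")
    failure_rate = 1.0 / mttf   # rate of the MTTF transition
    repair_rate = 1.0 / mttr    # rate of the MTTR transition
    # Solution of pi_UP * failure_rate = pi_DW * repair_rate, pi_UP + pi_DW = 1.
    return repair_rate / (failure_rate + repair_rate)


# Hypothetical parameters in hours: MTTF = 1000, MTTR = 10.
print(f"{steady_state_availability(1000.0, 10.0):.6f}")
```

Larger models generally have no such closed form, and the marking probabilities are then obtained numerically from the underlying Markov chain.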

Cite this article

Torquato, M., Maciel, P. & Vieira, M. Model-Based Performability and Dependability Evaluation of a System with VM Migration as Rejuvenation in the Presence of Bursty Workloads. J Netw Syst Manage 30, 3 (2022). https://doi.org/10.1007/s10922-021-09619-3

  • DOI: https://doi.org/10.1007/s10922-021-09619-3
