Toward high assurance software systems with adaptive fault management

Software Quality Journal

Abstract

In this paper, we develop an adaptive approach to estimating the optimal preventive rejuvenation schedule that maximizes the steady-state system availability. We formulate upper and lower bounds on the predictive system availability using the one-look-ahead predictive survival function obtained from system failure time data, and from these bounds we derive pessimistic and optimistic rejuvenation policies. We then derive adaptive rejuvenation policies from the original data together with a right-censored observation. Simulation experiments illustrate the usefulness of the proposed adaptive nonparametric predictive inference approach.
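To make the idea of bounding the predictive availability concrete, the following Python sketch may help. It is not the formulation used in the paper, which is not reproduced on this page. It assumes Hill's A(n) assumption for the next failure time (equal probability 1/(n+1) for each interval between ordered observations), complete data with no ties or censoring, and a simplified age-replacement-style availability, A(t0) = U(t0) / (U(t0) + mu_fail * F(t0) + mu_rejuv * S(t0)), where U(t0) is the integral of the survival function S over [0, t0] and F = 1 - S. The names `npi_survival_bounds`, `availability_bounds`, `best_schedules`, `mu_fail`, and `mu_rejuv` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def npi_survival_bounds(failure_times, t):
    """Lower and upper predictive survival probabilities P(T_next > t).

    Assumes Hill's A(n): the next failure time falls in each of the n + 1
    intervals between ordered observations with probability 1 / (n + 1).
    Assumes no ties and no censoring (a simplification). At an observed
    failure time the lower value is conservative.
    """
    xs = sorted(failure_times)
    n = len(xs)
    j = bisect_right(xs, t)          # observed failures at or before t
    lower = (n - j) / (n + 1)        # mass of intervals entirely beyond t
    upper = (n - j + 1) / (n + 1)    # plus the interval that may contain t
    return lower, upper


def availability_bounds(failure_times, t0, mu_fail, mu_rejuv):
    """Availability under the lower and the upper survival bound.

    Assumed model (not taken from the paper): mean up time is the integral
    of S on [0, t0]; mean down time is mu_fail * F(t0) + mu_rejuv * S(t0).
    Requires positive failure times and t0 > 0.
    """
    xs = sorted(failure_times)
    n = len(xs)
    # Breakpoints of the step-shaped survival bounds on [0, t0].
    pts = [0.0] + [x for x in xs if x < t0] + [t0]
    up_lo = up_hi = 0.0
    for j in range(len(pts) - 1):
        width = pts[j + 1] - pts[j]
        up_lo += width * (n - j) / (n + 1)      # integral of lower bound
        up_hi += width * (n - j + 1) / (n + 1)  # integral of upper bound
    s_lo, s_hi = npi_survival_bounds(xs, t0)
    a_lo = up_lo / (up_lo + mu_fail * (1 - s_lo) + mu_rejuv * s_lo)
    a_hi = up_hi / (up_hi + mu_fail * (1 - s_hi) + mu_rejuv * s_hi)
    return a_lo, a_hi


def best_schedules(failure_times, candidates, mu_fail, mu_rejuv):
    """Pick the candidate schedule maximizing each availability variant."""
    scored = [
        (t0, *availability_bounds(failure_times, t0, mu_fail, mu_rejuv))
        for t0 in candidates
    ]
    by_lower = max(scored, key=lambda row: row[1])[0]
    by_upper = max(scored, key=lambda row: row[2])[0]
    return by_lower, by_upper
```

In this sketch the availability computed from the lower survival bound does not exceed the one computed from the upper bound whenever `mu_fail >= mu_rejuv`, since a smaller survival function shortens the mean up time and lengthens the mean down time. The two maximizers returned by `best_schedules` play a role loosely analogous to the pessimistic and optimistic policies mentioned in the abstract. Handling a right-censored observation, as the adaptive policies do, would need the extended inference of Coolen and Yan and is not covered here.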

Author information

Corresponding author

Correspondence to Tadashi Dohi.

About this article

Cite this article

Rinsaka, K., Dohi, T. Toward high assurance software systems with adaptive fault management. Software Qual J 24, 65–85 (2016). https://doi.org/10.1007/s11219-014-9264-0

  • DOI: https://doi.org/10.1007/s11219-014-9264-0
