Reliability analysis of real-time fault-tolerant task models


One notable advantage of the Model-Driven Architecture (MDA) method is that software developers can perform thorough analysis and testing on software models in the design phase, which helps build high confidence in the expected software behavior and performance, especially for safety-critical real-time software. Most existing literature on reliability analysis ignores the effects of task deadline requirements, which are critical properties of real-time software and thus cannot be ignored.

Considering the conflict between deadline requirements and the time costs of fault tolerance in real-time tasks, in this paper we present a novel reliability model that treats schedulability as one of the major factors affecting reliability, and we use it to analyze the reliability of the task execution model in the design phase of real-time software. The tasks in this reliability model have no restrictions on their distribution and can thus be placed on a multiprocessor or on a distributed system. Furthermore, each task defines a fault arrival rate and a fault-tolerant mechanism, which model the occurrence of non-permanent faults and the corresponding time costs of fault handling. By analyzing the probability that tasks remain schedulable in the worst-case execution scenario when faults occur, reliability and schedulability are combined into a unified analysis framework, and two algorithms for reliability analysis are given. To make this reliability model more practical, we also present a technique for estimating the fault arrival rate of each task. Two case studies show the detailed derivation process, one under static-priority scheduling in a multiprocessor system and one in the design process of avionics software; simulation experiments then analyze the factors affecting the reliability analysis. When no assumptions about fault occurrences are made on the task model, this reliability model reduces to a generic schedulability model.
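The idea of treating schedulability under faults as a reliability measure can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single processor, fixed-priority preemptive scheduling, classical response-time analysis, a fixed recovery cost per fault, and Poisson fault arrivals at a rate per unit time. The names `Task`, `response_time`, `max_tolerated_faults`, and `task_reliability` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    wcet: float        # worst-case execution time C
    period: float      # period (minimum inter-arrival time) T
    deadline: float    # relative deadline D, assumed D <= T
    recovery: float    # assumed time cost of handling one fault, must be > 0
    fault_rate: float  # assumed Poisson fault arrival rate per unit time


def response_time(tasks, i, k):
    """Worst-case response time of tasks[i] when k faults hit its busy window.

    tasks must be sorted by decreasing priority. Each fault is charged the
    largest recovery cost among tasks[i] and all higher-priority tasks.
    Returns None if the deadline would be exceeded.
    """
    task = tasks[i]
    worst_recovery = max(t.recovery for t in tasks[: i + 1])
    r = task.wcet
    while True:
        # Interference from higher-priority releases within a window of length r
        interference = sum(math.ceil(r / hp.period) * hp.wcet for hp in tasks[:i])
        nxt = task.wcet + interference + k * worst_recovery
        if nxt > task.deadline:
            return None
        if nxt == r:  # fixed point reached
            return r
        r = nxt


def max_tolerated_faults(tasks, i):
    """Largest k for which tasks[i] still meets its deadline; -1 if none."""
    if any(t.recovery <= 0 for t in tasks[: i + 1]):
        raise ValueError("recovery costs must be positive in this sketch")
    k = -1
    while response_time(tasks, i, k + 1) is not None:
        k += 1
    return k


def poisson_cdf(k, mean):
    """P(N <= k) for N ~ Poisson(mean)."""
    return sum(math.exp(-mean) * mean**n / math.factorial(n) for n in range(k + 1))


def task_reliability(tasks, i):
    """Probability that no more faults arrive in [0, D] than tasks[i] can absorb.

    Assumption: faults on tasks[i] and on higher-priority tasks arrive as
    independent Poisson processes, so their rates add.
    """
    k = max_tolerated_faults(tasks, i)
    if k < 0:
        return 0.0  # not schedulable even without faults
    mean = sum(t.fault_rate for t in tasks[: i + 1]) * tasks[i].deadline
    return poisson_cdf(k, mean)
```

Under these assumptions, setting every `fault_rate` to zero makes `task_reliability` return 1.0 for a schedulable task and 0.0 for an unschedulable one, which mirrors the remark that the model reduces to a plain schedulability test when no faults are assumed.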



  1. The precedence relation between tasks T_i and T_j is defined to mean that task T_j cannot be released until task T_i finishes.



Author information



Corresponding author

Correspondence to Shenglin Gui.

Additional information

The research presented in this paper has been partially supported by the Fundamental Research Funds for the Central Universities No. ZYGX2012J080 and Applied Basic Research Programs of Science and Technology Department of Sichuan Province No. 2013JY0002.


About this article

Cite this article

Gui, S., Luo, L. Reliability analysis of real-time fault-tolerant task models. Des Autom Embed Syst 17, 87–107 (2013).



  • Real-time tasks
  • Design phase
  • Task model
  • Reliability model
  • Fault-tolerant mechanism
  • Schedulability