Designing for high integrity: the software fault tolerance approach

  • M. R. Moulding
Part of the Software Science and Engineering book series (SSEN)

Abstract

Traditional software engineering approaches for highly reliable systems aim to avoid introducing faults into the software and to remove faults during subsequent verification, validation and testing. Collectively, these approaches attempt to prevent software faults from existing in the operational system, but for realistic systems they are unlikely to be totally successful, and a number of residual faults will remain. Consequently, in the cost-effective engineering of reliable software, it can be appropriate to supplement fault prevention with design approaches which attempt to suppress the effects of residual faults. Such fault tolerance approaches are the subject of this chapter, which investigates the major schemes devised to achieve fault tolerance and discusses some associated design and implementation issues. The chapter begins, however, with an overview of software fault tolerance that introduces some important concepts and terms. The chapter as a whole has been written primarily for software developers, but software managers are invited to read the overview and summary sections in order to gain an understanding of this technology.

Keywords

Virtual Machine, Fault Tolerance, Acceptance Test, Design Fault, Software Fault
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Crown Copyright 1989

Authors and Affiliations

  • M. R. Moulding (1)
  1. Royal Military College of Science, UK
