Logic of Programs 1983: Logics of Programs pp 147-160

A rigorous approach to fault-tolerant system development

extended abstract
  • Flaviu Cristian
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 164)


This paper investigates what it means for a system to behave correctly despite hardware fault occurrences. Using a stable storage system as a running example, a framework is presented for specifying, understanding, and verifying the correctness of fault-tolerant systems.

A clear separation is made between the notions of software correctness and system reliability in the face of hardware malfunction. Correctness is established by using a programming logic augmented with fault axioms and rules. Stochastic modelling is employed to investigate system reliability and availability properties.

Index Terms

Correctness, Fault-Tolerance, Reliability





Copyright information

© Springer-Verlag Berlin Heidelberg 1984

Authors and Affiliations

  • Flaviu Cristian, IBM Research Laboratory, San Jose
