Formal Methods in System Design, Volume 22, Issue 3, pp 225–248

Design and Verification of Distributed Recovery Blocks with CSP

  • W.L. Yeung
  • S.A. Schneider

Abstract

A case study on the application of Communicating Sequential Processes (CSP) to the design and verification of fault-tolerant real-time systems is presented. The distributed recovery block (DRB) scheme is a design technique for the uniform treatment of hardware and software faults in real-time systems. Through a simple fault-tolerant real-time system design using the DRB scheme, the case study illustrates a paradigm for specifying fault-tolerant software and demonstrates how the different behavioural aspects of a fault-tolerant real-time system design can be separately and systematically specified, formulated, and verified using an integrated set of formal techniques based on CSP.

Keywords: real-time systems, fault-tolerance, distributed recovery block scheme, CSP, formal specification and verification, timewise refinement

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • W.L. Yeung (1)
  • S.A. Schneider (2)
  1. Lingnan University, Hong Kong, People's Republic of China
  2. Royal Holloway, University of London, Egham, Surrey, UK
