A Theory for Observational Fault Tolerance

  • Adrian Francalanza
  • Matthew Hennessy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3921)

Abstract

In general, faults cannot be prevented; instead, they need to be tolerated to guarantee certain degrees of software dependability. We develop a theory of fault tolerance for a distributed pi-calculus, in which locations act as units of failure and redundancy is distributed across independently failing locations. We give formal definitions of fault-tolerant programs in our calculus, based on the well-studied notion of contextual equivalence. We then develop bisimulation proof techniques to verify fault tolerance properties of distributed programs and show that they are sound with respect to our definitions of fault tolerance.

Keywords

Fault Tolerance; Link Failure; Reduction Rule; Fault Recovery; Reduction Semantic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Adrian Francalanza (1)
  • Matthew Hennessy (2)
  1. University of Malta, Msida, Malta
  2. University of Sussex, Brighton, England