A Theory for Observational Fault Tolerance
In general, faults cannot be prevented; instead, they need to be tolerated to guarantee certain degrees of software dependability. We develop a theory for fault tolerance for a distributed pi-calculus, whereby locations act as units of failure and redundancy is distributed across independently failing locations. We give formal definitions for fault-tolerant programs in our calculus, based on the well-studied notion of contextual equivalence. We then develop bisimulation proof techniques to verify fault tolerance properties of distributed programs and show that they are sound with respect to our definitions for fault tolerance.
Keywords: Fault Tolerance, Link Failure, Reduction Rule, Fault Recovery, Reduction Semantic
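As a rough illustration of the idea in the abstract (locations as units of failure, with redundancy spread across independently failing locations), the following toy sketch checks whether a replicated service looks the same to an observer under every combination of up to n location failures. It is not the paper's calculus or its definition of fault tolerance: the names `observe` and `tolerates`, the dictionary model of replicas, and the location names `k1`, `k2`, `k3` are all assumptions made for this example, and plain equality of observations stands in for contextual equivalence.

```python
from itertools import combinations


def observe(replicas, dead):
    """Hypothetical observer: the set of answers served by live locations.

    `replicas` maps a location name to the value its replica serves;
    `dead` is the set of locations that have failed.
    """
    return frozenset(value for loc, value in replicas.items() if loc not in dead)


def tolerates(replicas, n):
    """Check that observations are unchanged by any set of at most n failures.

    Assumption: comparing against the fault-free observation is a crude
    stand-in for the equivalence-based definitions used in the paper.
    """
    baseline = observe(replicas, set())
    return all(
        observe(replicas, set(dead)) == baseline
        for k in range(n + 1)
        for dead in combinations(replicas, k)
    )


# Three replicas of the same service at three independently failing locations.
replicated = {"k1": 42, "k2": 42, "k3": 42}
# A single, unreplicated server.
single = {"k1": 42}
```

Under these assumptions, `tolerates(replicated, 2)` holds because one live replica always remains, while `tolerates(replicated, 3)` and `tolerates(single, 1)` fail because all locations hosting the service can be lost.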