Abstract
In general, faults cannot be prevented; instead, they need to be tolerated to guarantee certain degrees of software dependability. We develop a theory for fault tolerance for a distributed pi-calculus, whereby locations act as units of failure and redundancy is distributed across independently failing locations. We give formal definitions for fault tolerant programs in our calculus, based on the well studied notion of contextual equivalence. We then develop bisimulation proof techniques to verify fault tolerance properties of distributed programs and show they are sound with respect to our definitions for fault tolerance.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Amadio, R.M., Prasad, S.: Localities and failures. FSTTCS: Foundations of Software Technology and Theoretical Computer Science 14 (1994)
Christian, F.: Understanding fault tolerant distributed systems. Communications of the ACM 34(2), 56–78 (1991)
Ciaffaglione, A., Hennessy, M., Rathke, J.: Proof methodologies for behavioural equivalence in Dπ. Technical Report 03/2005, University of Sussex (2005)
Francalanza, A., Hennessy, M.: A theory for observational fault tolerance, www.cs.um.edu.mt/~afran/
Francalanza, A., Hennessy, M.: A theory of system behaviour in the presence of node and link failures. In: Abadi, M., de Alfaro, L. (eds.) CONCUR 2005. LNCS, vol. 3653, pp. 368–382. Springer, Heidelberg (2005)
Hennessy, M., Merro, M., Rathke, J.: Towards a behavioural theory of access and mobility control in distributed systems. Theoretical Computer Science 322, 615–669 (2004)
Hennessy, M., Rathke, J.: Typed behavioural equivalences for processes in the presence of subtyping. Mathematical Structures in Computer Science 14, 651–684 (2004)
Hennessy, M., Riely, J.: Resource access control in systems of mobile agents. Information and Computation 173, 82–120 (2002)
Prasad, K.V.S.: Combinators and Bisimulation Proofs for Restartable Systems. PhD thesis, Department of Computer Science, University of Edinburgh (December 1987)
Riely, J., Hennessy, M.: Distributed processes and location failures. Theoretical Computer Science 226, 693–735 (2001)
Sangiorgi, D., Walker, D.: The π-calculus. Cambridge University Press, Cambridge (2001)
Schlichting, R.D., Schneider, F.B.: Fail-stop processors: An approach to designing fault-tolerant computing systems. Computer Systems 1(3), 222–238 (1983)
Verissimo, P., Rodrigues, L.: Distributed Systems for System Architects. Kluwer Academic Publishers, Dordrecht (2001)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Francalanza, A., Hennessy, M. (2006). A Theory for Observational Fault Tolerance. In: Aceto, L., Ingólfsdóttir, A. (eds) Foundations of Software Science and Computation Structures. FoSSaCS 2006. Lecture Notes in Computer Science, vol 3921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11690634_2
Download citation
DOI: https://doi.org/10.1007/11690634_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33045-5
Online ISBN: 978-3-540-33046-2
eBook Packages: Computer ScienceComputer Science (R0)