Abstract
In this paper, we present a software framework for adding fault-tolerance to existing finite-state programs. The input to our framework is a fault-intolerant program and a class of faults that perturbs the program. The output of our framework is a fault-tolerant version of the input program. Our framework provides (1) the first automated tool for the synthesis of fault-tolerant distributed programs, and (2) an extensible platform for researchers to develop a repository of heuristics that deal with the complexity of adding fault-tolerance to distributed programs. We also present a set of heuristics for polynomial-time addition of fault-tolerance to distributed programs. We have used this framework for automated synthesis of several fault-tolerant programs including a simplified version of an aircraft altitude switch, token ring, Byzantine agreement, and agreement in the presence of Byzantine and fail-stop faults. These examples illustrate that our framework can be used for synthesizing programs that tolerate different types of faults (process restarts, Byzantine and fail-stop) and programs that are subject to multiple faults (Byzantine and fail-stop) simultaneously. We have found our framework to be highly useful for pedagogical purposes, especially for teaching concepts of fault-tolerance, automatic program transformation, and the effect of heuristics.
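To make the input/output shape described above concrete, here is a minimal sketch of adding failsafe fault-tolerance to a toy finite-state program. It is not FTSyn's algorithm or API. The function names, the encoding of transitions as pairs, and the toy state space are all assumptions made for illustration. The sketch ignores the read/write restrictions of distributed processes, liveness, and deadlock resolution, which are the parts that make the real problem hard and that the framework's heuristics are meant to address.

```python
# Illustrative sketch only: hypothetical names, not the FTSyn API.
# A program and a fault class are both sets of (source, target) transitions
# over a finite set of states; bad_states are states that violate safety.

def fault_unsafe_states(bad_states, fault_transitions):
    """Return the states from which fault transitions alone can reach a bad state."""
    unsafe = set(bad_states)
    changed = True
    while changed:  # backward fixpoint over fault transitions
        changed = False
        for src, dst in fault_transitions:
            if dst in unsafe and src not in unsafe:
                unsafe.add(src)
                changed = True
    return unsafe


def add_failsafe(program_transitions, fault_transitions, bad_states):
    """Drop every program transition that starts in or enters a fault-unsafe state."""
    unsafe = fault_unsafe_states(bad_states, fault_transitions)
    return {
        (src, dst)
        for (src, dst) in program_transitions
        if src not in unsafe and dst not in unsafe
    }


# Hypothetical toy input: states 0..4, where state 4 violates safety.
program = {(0, 1), (1, 2), (2, 0), (1, 3)}  # fault-intolerant program
faults = {(3, 4)}                           # class of faults
bad = {4}

tolerant = add_failsafe(program, faults, bad)
print(sorted(tolerant))
```

In this toy input, a fault can take state 3 to the bad state 4, so the sketch removes the program transition that enters state 3 and keeps the rest of the program unchanged.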
Cite this article
Ebnenasir, A., Kulkarni, S.S. & Arora, A. FTSyn: a framework for automatic synthesis of fault-tolerance. Int J Softw Tools Technol Transfer 10, 455–471 (2008). https://doi.org/10.1007/s10009-008-0083-0
DOI: https://doi.org/10.1007/s10009-008-0083-0
Keywords
- Fault-tolerance
- Automatic addition of fault-tolerance
- Formal methods
- Program synthesis
- Distributed programs