Semi-formal Development of a Fault-Tolerant Leader Election Protocol in Erlang
We present a semi-formal analysis method for fault-tolerant distributed algorithms written in the distributed functional programming language Erlang. In this setting, standard model checking techniques are often too expensive or too limiting, whereas testing techniques often do not cover enough of the state space.
Our idea is to first run instances of the algorithm on generated stimuli, thereby creating traces of events and states. Then, using an abstraction function specified by the user, our tool generates from these traces an abstract state transition diagram of the system, which can be visualized and thus helps considerably in debugging the system. Lastly, formal requirements of the system, specified in temporal logic, can be checked automatically against the generated abstract state transition diagram. Because the state transition diagram is abstract, the checked requirements hold for many more traces than just the traces we actually ran.
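The three steps described above (collect traces, abstract them into a state transition diagram, check a requirement on the diagram) can be illustrated with a minimal sketch. It is written in Python rather than Erlang and is not the authors' tool: the trace format, the role names, the event names, the abstraction function `count_leaders`, and the helpers `build_abstract_diagram`, `reachable`, and `always` are all assumptions made for illustration, and the only temporal property handled is a simple invariant ("always p").

```python
from collections import defaultdict

# Hypothetical traces: (initial concrete state, [(event, next concrete state), ...]).
# A concrete state is assumed to be a tuple holding the role of each node.
TRACES = [
    (("cand", "cand", "cand"), [
        ("elect0", ("leader", "follower", "follower")),
        ("crash0", ("down", "cand", "cand")),
        ("elect1", ("down", "leader", "follower")),
    ]),
    (("cand", "cand", "cand"), [
        ("elect2", ("follower", "follower", "leader")),
    ]),
]


def count_leaders(state):
    """Assumed user-supplied abstraction: keep only the number of leaders."""
    return state.count("leader")


def build_abstract_diagram(traces, abstract):
    """Merge all traces into one abstract state transition diagram."""
    initial = set()
    transitions = defaultdict(set)  # abstract state -> {(event, abstract state)}
    for start, steps in traces:
        current = abstract(start)
        initial.add(current)
        for event, next_state in steps:
            target = abstract(next_state)
            transitions[current].add((event, target))
            current = target
    return initial, transitions


def reachable(initial, transitions):
    """All abstract states reachable from the initial abstract states."""
    seen = set(initial)
    stack = list(initial)
    while stack:
        state = stack.pop()
        for _event, target in transitions.get(state, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def always(initial, transitions, predicate):
    """Check the invariant 'always predicate' on the abstract diagram."""
    return all(predicate(s) for s in reachable(initial, transitions))


INITIAL, TRANSITIONS = build_abstract_diagram(TRACES, count_leaders)
AT_MOST_ONE_LEADER = always(INITIAL, TRANSITIONS, lambda leaders: leaders <= 1)
```

In this sketch the two traces collapse into two abstract states (zero leaders, one leader). The resulting diagram also admits paths that were never executed, such as `elect2` followed by `crash0` and `elect0`, so an invariant checked on the diagram covers those paths as well.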
We have applied our method to a commonly used open-source fault-tolerant leader election algorithm, and discovered two serious bugs. We have also implemented a new algorithm that does not have these bugs.
Keywords: Model Check, Leader Election, Concrete State, State Transition Diagram, Model Check Technique