Scalable Fault Tolerant Protocol for Parallel Runtime Environments

  • Thara Angskun
  • Graham E. Fagg
  • George Bosilca
  • Jelena Pješivac–Grbović
  • Jack J. Dongarra
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4192)


The number of processors embedded in high performance computing platforms is growing daily to satisfy users' desire to solve larger and more complex problems. Parallel runtime environments have to support and adapt to the underlying libraries and hardware which require a high degree of scalability in dynamic environments. This paper presents the design of a scalable and fault tolerant protocol for supporting parallel runtime environment communications. The protocol is designed to support transmission of messages across multiple nodes within a self-healing topology to protect against recursive node and process failures. A formal protocol verification has validated the protocol for both the normal and failure cases. We have implemented multiple routing algorithms for the protocol and concluded that the variant rule-based routing algorithm yields the best overall results for damaged and incomplete topologies.


Distribute Hash Table, Broadcast Message, Procedure Rule, Message Type, Dead Node
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Thara Angskun (1)
  • Graham E. Fagg (1)
  • George Bosilca (1)
  • Jelena Pješivac–Grbović (1)
  • Jack J. Dongarra (1)

  1. Dept. of Computer Science, The University of Tennessee, Knoxville, USA
