P2P-MPI: A Peer-to-Peer Framework for Robust Execution of Message Passing Parallel Programs on Grids

Abstract

This paper presents P2P-MPI, a middleware aimed at computational Grids. From the programmer's point of view, P2P-MPI provides a message-passing programming model that enables the development of MPI applications for Grids. Its originality lies in its adaptation to unstable environments. First, the peer-to-peer design of P2P-MPI allows dynamic discovery of collaborating resources. Second, it lets the user adjust the robustness of an execution through an internal process replication mechanism. Finally, we measure the performance of the integrated message-passing library on several benchmarks and on different hardware platforms.

Author information

Correspondence to Stéphane Genaud.

About this article

Cite this article

Genaud, S., Rattanapoka, C. P2P-MPI: A Peer-to-Peer Framework for Robust Execution of Message Passing Parallel Programs on Grids. J Grid Computing 5, 27–42 (2007). https://doi.org/10.1007/s10723-006-9056-2

  • DOI: https://doi.org/10.1007/s10723-006-9056-2

Key words

  • Grid
  • middleware
  • peer-to-peer
  • MPI
  • Java