Abstract
This paper presents P2P-MPI, a middleware aimed at computational Grids. From the programmer's point of view, P2P-MPI provides a message-passing programming model that enables the development of MPI applications for Grids. Its originality lies in its adaptation to unstable environments. First, the peer-to-peer design of P2P-MPI allows dynamic discovery of collaborating resources. Second, it lets the user adjust the robustness of an execution through an internal process replication mechanism. Finally, we measure the performance of the integrated message-passing library on several benchmarks and on different hardware platforms.
Cite this article
Genaud, S., Rattanapoka, C. P2P-MPI: A Peer-to-Peer Framework for Robust Execution of Message Passing Parallel Programs on Grids. J Grid Computing 5, 27–42 (2007). https://doi.org/10.1007/s10723-006-9056-2
DOI: https://doi.org/10.1007/s10723-006-9056-2
Key words
- Grid
- middleware
- peer-to-peer
- MPI
- Java