Design of efficient Java message-passing collectives on multi-core clusters

Abstract

This paper presents a scalable and efficient Message-Passing in Java (MPJ) collective communication library for parallel computing on multi-core architectures. The continuous increase in the number of cores per processor underscores the need for scalable parallel solutions. Moreover, current system deployments are usually multi-core clusters, a hybrid shared/distributed memory architecture that increases the complexity of communication protocols. In this scenario, Java is an attractive choice for developing communication middleware for these systems, as it provides built-in networking and multithreading support. Because the performance gap between Java and natively compiled languages has been narrowing in recent years, Java is an emerging option for High Performance Computing (HPC).

Our MPJ collective communication library improves the performance of Java HPC applications on multi-core clusters by: (1) providing multi-core aware collective primitives; (2) implementing several algorithms (up to six) per collective operation, whereas publicly available MPJ libraries are usually restricted to one algorithm; (3) analyzing the efficiency of thread-based collective operations; (4) selecting at runtime the most efficient algorithm depending on the specific multi-core system architecture, the number of cores, and the message length involved in the collective operation; (5) supporting the automatic performance tuning of the collectives depending on the system and communication parameters; and (6) allowing its integration in any MPJ implementation, as it is based on MPJ point-to-point primitives. A performance evaluation on an InfiniBand and Gigabit Ethernet multi-core cluster has shown that the implemented collectives significantly outperform the original ones, and that their use yields higher speedups in Java HPC applications that make intensive use of collective communications. Finally, the presented library has been successfully integrated in MPJ Express (http://mpj-express.org), and will be distributed with the next release.
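
As an illustration of the runtime selection described in item (4), the following is a minimal sketch, not the library's actual code. The class name `BroadcastSelector`, the three algorithm names, and both thresholds are assumptions made for this example; the abstract does not list the implemented algorithms or their switching points, and in practice such thresholds would come from the automatic tuning mentioned in item (5).

```java
/**
 * Hypothetical sketch of runtime algorithm selection for a broadcast
 * collective. Names and thresholds are illustrative assumptions only.
 */
final class BroadcastSelector {

    /** Illustrative algorithm names; the abstract does not enumerate the real set. */
    enum Algorithm { FLAT_TREE, BINOMIAL_TREE, SCATTER_ALLGATHER }

    // Assumed thresholds. A tuned library would measure these per system
    // rather than hard-code them.
    static final int SMALL_GROUP = 4;            // number of processes
    static final int LARGE_MESSAGE = 64 * 1024;  // message length in bytes

    private BroadcastSelector() { }

    /**
     * Picks an algorithm from the number of participating processes and the
     * message length, the two communication parameters named in the abstract.
     */
    static Algorithm select(int numProcesses, int messageBytes) {
        if (numProcesses < 1 || messageBytes < 0) {
            throw new IllegalArgumentException("invalid collective parameters");
        }
        if (numProcesses <= SMALL_GROUP) {
            // Few participants: the root sends directly to every process.
            return Algorithm.FLAT_TREE;
        }
        if (messageBytes >= LARGE_MESSAGE) {
            // Long messages: split the data so that bandwidth is shared.
            return Algorithm.SCATTER_ALLGATHER;
        }
        // Short messages on many processes: logarithmic number of steps.
        return Algorithm.BINOMIAL_TREE;
    }
}
```

With these assumed thresholds, `BroadcastSelector.select(32, 1024)` picks the binomial tree, while `BroadcastSelector.select(32, 1 << 20)` switches to scatter plus allgather. Because the decision only chooses among algorithms built on point-to-point sends and receives, a selector of this kind could sit on top of any MPJ implementation, in line with item (6).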

Author information

Corresponding author

Correspondence to Guillermo L. Taboada.

About this article

Cite this article

Taboada, G.L., Ramos, S., Touriño, J. et al. Design of efficient Java message-passing collectives on multi-core clusters. J Supercomput 55, 126–154 (2011). https://doi.org/10.1007/s11227-010-0464-5

  • DOI: https://doi.org/10.1007/s11227-010-0464-5

Keywords

  • Message-passing in Java (MPJ)
  • Multi-core clusters
  • Scalable collective communication
  • High performance computing
  • Performance evaluation