Comparison between pure MPI and hybrid MPI-OpenMP parallelism for Discrete Element Method (DEM) of ellipsoidal and poly-ellipsoidal particles

  • Published in: Computational Particle Mechanics

Abstract

Parallel computing of 3D Discrete Element Method (DEM) simulations can be achieved in different modes; two of them are pure MPI and hybrid MPI-OpenMP. The hybrid MPI-OpenMP mode allows flexibly combined mapping schemes on contemporary multiprocessing supercomputers. This paper profiles the computational components and floating-point operation features of complex-shaped 3D DEM, develops a spatial-decomposition-based MPI parallelism and various thread-based OpenMP parallelization schemes, and carries out performance comparison and analysis from intranode to internode scales across four orders of magnitude of problem size (i.e., number of particles). The influences of the memory/cache hierarchy, process/thread pinning, variation of the hybrid MPI-OpenMP mapping scheme, and ellipsoidal versus poly-ellipsoidal particle shape are carefully examined. It is found that OpenMP achieves high efficiency in interparticle contact detection, but the unparallelizable portion of the code prevents it from achieving the same high efficiency overall; pure MPI achieves not only lower computational granularity (and thus higher spatial locality of particles) but also lower communication granularity (and thus faster MPI transmission) than hybrid MPI-OpenMP using the same computational resources; the cache miss rate is sensitive to the shrinkage of per-processor memory consumption, and the last-level cache contributes most significantly to the strong superlinear speedup among the three cache levels of modern microprocessors; in hybrid MPI-OpenMP mode, as the number of MPI processes increases (and the number of threads per MPI process decreases accordingly), the total execution time decreases, until maximum performance is obtained in pure MPI mode; process/thread pinning on NUMA architectures improves performance significantly when there are multiple threads per process, whereas the improvement becomes less pronounced as the number of threads per process decreases; both the communication time and the computation time increase substantially from ellipsoids to poly-ellipsoids. Overall, pure MPI outperforms hybrid MPI-OpenMP in 3D DEM modeling of ellipsoidal and poly-ellipsoidal particles.
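
To make the parallel structure discussed above concrete, the following is a minimal sketch, not the authors' production DEM code, of a hybrid MPI-OpenMP time-stepping loop in C++: each MPI rank owns a spatial subdomain of particles and exchanges ghost particles with its neighbors, while OpenMP threads share the interparticle contact-detection loop. The Particle struct and the loop bodies are hypothetical placeholders; the actual ellipsoid/poly-ellipsoid contact resolution and ghost exchange are only indicated by comments.

    // Hybrid MPI-OpenMP DEM skeleton (illustrative sketch; assumptions noted above).
    #include <mpi.h>
    #include <omp.h>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Particle { double x[3], v[3], f[3]; };  // hypothetical minimal particle state

    int main(int argc, char** argv) {
      int provided = 0;
      // FUNNELED suffices here: only the master thread makes MPI calls.
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

      int rank = 0, nprocs = 1;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

      std::vector<Particle> owned;   // particles inside this rank's subdomain
      std::vector<Particle> ghosts;  // boundary-layer copies received from neighbor ranks
      // ... fill `owned` from this rank's portion of the initial particle packing ...

      const int nsteps = 100;
      for (int step = 0; step < nsteps; ++step) {
        // 1. Communication: exchange boundary-layer particles with neighboring
        //    subdomains (typically nonblocking MPI_Isend/MPI_Irecv of packed data),
        //    so contacts across subdomain boundaries can be detected.

        // 2. Contact detection and force computation: the dominant cost, and the part
        //    that threads well because each owned particle can be processed independently.
        #pragma omp parallel for schedule(dynamic)
        for (std::size_t i = 0; i < owned.size(); ++i) {
          // search neighbors among owned + ghost particles, resolve the
          // ellipsoid/poly-ellipsoid contact geometry, accumulate forces into owned[i].f
        }

        // 3. Time integration of owned particles (also thread-parallel).
        #pragma omp parallel for
        for (std::size_t i = 0; i < owned.size(); ++i) {
          // update owned[i].v and owned[i].x from owned[i].f
        }

        // 4. Migration: particles that crossed a subdomain boundary are handed to the
        //    rank that now owns them; this serial MPI section limits OpenMP efficiency.
      }

      if (rank == 0) std::printf("finished %d steps on %d ranks\n", nsteps, nprocs);
      MPI_Finalize();
      return 0;
    }

In such a setup, the pure-MPI limit corresponds to one thread per rank (OMP_NUM_THREADS=1) with as many ranks as cores, while the pinning experiments mentioned in the abstract would typically be controlled through the standard OpenMP environment variables OMP_NUM_THREADS, OMP_PLACES, and OMP_PROC_BIND together with the launcher's process-binding options.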





Acknowledgements

We would like to acknowledge the support provided by ONR MURI Grant N00014-11-1-0691, and the DoD High Performance Computing Modernization Program (HPCMP) for granting us the computing resources required to conduct this work.

Author information


Corresponding author

Correspondence to Beichuan Yan.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yan, B., Regueiro, R.A. Comparison between pure MPI and hybrid MPI-OpenMP parallelism for Discrete Element Method (DEM) of ellipsoidal and poly-ellipsoidal particles. Comp. Part. Mech. 6, 271–295 (2019). https://doi.org/10.1007/s40571-018-0213-8


  • DOI: https://doi.org/10.1007/s40571-018-0213-8
