This chapter presents a detailed performance assessment of ls1 mardyn, carried out in three parts. First, we analyze performance on a regular high-performance CPU (Intel Xeon) for scenarios containing particles with one to four centers. In all cases we run strong-scaling and weak-scaling scenarios and analyze the performance characteristics of the implementation. Second, we present a performance study of a hybrid parallelization on the Intel Xeon Phi coprocessor, including its scalability across several coprocessors. Special focus is put on the performance of the proposed gather- and scatter-enhanced force calculation. Finally, we discuss our implementation specialized for atomic fluids, e.g., inert gases. This version of ls1 mardyn enabled the world’s largest molecular dynamics simulation in 2013.
Keywords: Molecular dynamics simulation, Vectorization, Gather, Scatter, Lennard-Jones potential, Shared-memory parallelization, Distributed-memory parallelization