Hybrid Codes for Atomistic Simulations on the Desmos Supercomputer: GPU-acceleration, Scalability and Parallel I/O

  • Nikolay Kondratyuk
  • Grigory Smirnov
  • Vladimir Stegailov
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 965)


In this paper, we compare different GPU accelerators and algorithms for classical molecular dynamics using the LAMMPS and GROMACS codes. BigDFT is considered as an example of a modern ab initio code that implements density functional theory algorithms in a wavelet basis and makes effective use of GPU acceleration. The efficiency of distributed storage managed by the BeeGFS parallel file system is analysed with respect to saving large molecular-dynamics trajectories. The results were obtained on the Desmos supercomputer at JIHT RAS.


Molecular dynamics · Density functional theory · GPU acceleration · Strong scaling · Parallel I/O



The work is supported by the Russian Science Foundation (grant No. 14-50-00124). HSE and MIPT helped in accessing some hardware used in this study. We acknowledge the Shared Resource Center of CC FEB RAS for access to the Jupiter supercomputer. We are grateful to Nvidia for providing us with one Tesla V100 card for the benchmarks.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Nikolay Kondratyuk (1, 2, 3)
  • Grigory Smirnov (1, 2, 3)
  • Vladimir Stegailov (1, 2, 3)

  1. Joint Institute for High Temperatures of RAS, Moscow, Russia
  2. Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  3. National Research University Higher School of Economics, Moscow, Russia
