Porting VASP from MPI to MPI+OpenMP [SIMD]

Optimization Strategies, Insights and Feature Proposals
  • Florian Wende
  • Martijn Marsman
  • Zhengji Zhao
  • Jeongnim Kim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10468)

Abstract

We describe, for the VASP application (a widely used electronic structure code written in Fortran), the transition from an MPI-only to a hybrid code base that leverages the three levels of parallelism relevant to effective execution on modern computer platforms: multiprocessing, multithreading, and SIMD vectorization. To achieve code portability, we draw on MPI parallelization together with OpenMP threading and SIMD constructs; combining the latter two can be challenging in complex code bases. Our optimization targets are the combination of multithreading and vectorization in different calling contexts as well as whole-function vectorization. In addition to outlining the design decisions made throughout the code transformation process, we demonstrate the effectiveness of the code adaptations using different compilers (GNU, Intel) and target platforms (CPU, Intel Xeon Phi (KNL)).
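
As a concrete illustration of the optimization targets named above, the following is a minimal sketch (not taken from the paper; all module, routine, and variable names are hypothetical) of how OpenMP threading and SIMD constructs combine in Fortran, including whole-function vectorization via the declare simd directive. The MPI level is omitted; each MPI rank would execute such a region on its own share of the data.

    ! Minimal sketch (not from the paper): OpenMP threading and SIMD combined,
    ! plus whole-function vectorization. All names are hypothetical.
    module kernels
      implicit none
    contains
      ! Whole-function vectorization: !$omp declare simd makes the compiler
      ! emit a SIMD variant of this function callable from vectorized loops.
      function smooth(x) result(y)
        !$omp declare simd(smooth)
        real(8), intent(in) :: x
        real(8) :: y
        y = x / (1.0d0 + abs(x))
      end function smooth
    end module kernels

    program hybrid_demo
      use kernels
      implicit none
      integer, parameter :: n = 1000000
      integer :: i
      real(8), allocatable :: a(:), b(:)
      real(8) :: s

      allocate(a(n), b(n))
      a = 1.0d0
      s = 0.0d0

      ! Combined construct: iterations are distributed across OpenMP threads
      ! (multithreading) and each thread's chunk is vectorized (SIMD).
      !$omp parallel do simd reduction(+:s)
      do i = 1, n
        b(i) = smooth(a(i))  ! resolves to the SIMD variant of smooth
        s = s + b(i)
      end do
      !$omp end parallel do simd

      print *, 'checksum =', s
    end program hybrid_demo

Built with, e.g., gfortran -fopenmp or ifort -qopenmp, the combined parallel do simd construct distributes loop iterations across threads and vectorizes each thread's chunk, while declare simd instructs the compiler to generate a vector variant of the function that the vectorized loop can call.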

Acknowledgements

This work is (partially) supported by Intel within the IPCC activities at ZIB and by the ASCR Office of the DOE Office of Science under contract number DE-AC02-05CH11231. It used resources of the National Energy Research Scientific Computing Center (NERSC).

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Florian Wende (1)
  • Martijn Marsman (2)
  • Zhengji Zhao (3)
  • Jeongnim Kim (4)

  1. Zuse Institute Berlin, Berlin, Germany
  2. University of Vienna, Vienna, Austria
  3. National Energy Research Scientific Computing Center, Berkeley, USA
  4. Intel Corporation, Hillsboro, USA