Abstract
We present an implementation of parallel GPU-accelerated GPAW, a density-functional theory (DFT) code based on grid based projector-augmented wave method. GPAW is suitable for large scale electronic structure calculations and capable of scaling to thousands of cores. We have accelerated the most computationally intensive components of the program with CUDA. We will provide performance and scaling analysis of our multi-GPU-accelerated code staring from small systems up to systems with thousands of atoms running on GPU clusters. We have achieved up to 15 times speed-ups on large systems.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Mortensen, J.J., Hansen, L.B., Jacobsen, K.W.: Real-space grid implementation of the projector augmented wave method. Phys. Rev. BÂ 71, 35109 (2005)
Enkovaara, J., Rostgaard, C., Mortensen, J.J., Chen, J., Dulak, M., Ferrighi, L., Gavnholt, J., Glinsvad, C., Haikola, V., Hansen, H.A., Kristoffersen, H.H., Kuisma, M., Larsen, A.H., Lehtovaara, L., Ljungberg, M., Lopez-Acevedo, O., Moses, P.G., Ojanen, J., Olsen, T., Petzold, V., Romero, N.A., Stausholm-Møller, J., Strange, M., Tritsaris, G.A., Vanin, M., Walter, M., Hammer, B., Häkkinen, H., Madsen, G.K.H., Nieminen, R.M., Nørskov, J.K., Puska, M., Rantala, T.T., Schiøtz, J., Thygesen, K.S., Jacobsen, K.W.: Electronic structure calculations with GPAW: a real-space implementation of the projector augmented-wave method. Journal of Physics: Condensed Matter 22(25), 253202 (2010)
Meuer, H., Strohmaier, E., Dongarra, J., Simon, H.: Top500 supercomputer sites (November 2012), http://www.top500.org/lists/2012/11/ (accessed December 5, 2012)
Harju, A., Siro, T., Canova, F.F., Hakala, S., Rantalaiho, T.: Computational Physics on Graphics Processing Units. In: Manninen, P., Öster, P. (eds.) PARA 2012. LNCS, vol. 7782, pp. 3–26. Springer, Heidelberg (2013)
Payne, M.C., Teter, M.P., Allan, D.C., Arias, T.A., Joannopoulos, J.D.: Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients. Rev. Mod. Phys. 64, 1045–1097 (1992)
Yasuda, K.: Accelerating density functional calculations with graphics processing unit. Journal of Chemical Theory and Computation 4(8), 1230–1236 (2008)
Yasuda, K.: Two-electron integral evaluation on the graphics processor unit. Journal of Computational Chemistry 29(3), 334–342 (2008)
Ufimtsev, I.S., Martinez, T.J.: Quantum chemistry on graphical processing units. 1. Strategies for two-electron integral evaluation. Journal of Chemical Theory and Computation 4(2), 222–231 (2008)
Ufimtsev, I.S., Martinez, T.J.: Quantum chemistry on graphical processing units. 2. Direct self-consistent-field implementation. Journal of Chemical Theory and Computation 5(4), 1004–1015 (2009)
Asadchev, A., Allada, V., Felder, J., Bode, B.M., Gordon, M.S., Windus, T.L.: Uncontracted Rys quadrature implementation of up to G functions on graphical processing units. Journal of Chemical Theory and Computation 6(3), 696–704 (2010)
Genovese, L., Ospici, M., Deutsch, T., Méhaut, J.F., Neelov, A., Goedecker, S.: Density functional theory calculation on many-cores hybrid central processing unit-graphic processing unit architectures. The Journal of Chemical Physics 131(7), 34103 (2009)
Maintz, S., Eck, B., Dronskowski, R.: Speeding up plane-wave electronic-structure calculations using graphics-processing units. Computer Physics Communications 182(7), 1421–1427 (2011)
Hacene, M., Anciaux-Sedrakian, A., Rozanska, X., Klahr, D., Guignon, T., Fleurat-Lessard, P.: Accelerating VASP electronic structure calculations using graphic processing units. Journal of Computational Chemistry (2012) n/a–n/a
Spiga, F., Girotto, I.: phiGEMM: A CPU-GPU library for porting Quantum ESPRESSO on hybrid systems. In: 2012 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 368–375 (February 2012)
Wang, L., Wu, Y., Jia, W., Gao, W., Chi, X., Wang, L.W.: Large scale plane wave pseudopotential density functional theory calculations on GPU clusters. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2011, pp. 71:1–71:10. ACM, New York (2011)
Jia, W., Cao, Z., Wang, L., Fu, J., Chi, X., Gao, W., Wang, L.W.: The analysis of a plane wave pseudopotential density functional theory code on a GPU machine. Computer Physics Communications 184(1), 9–18 (2013)
Andrade, X., Alberdi-Rodriguez, J., Strubbe, D.A., Oliveira, M.J.T., Nogueira, F., Castro, A., Muguerza, J., Arruabarrena, A., Louie, S.G., Aspuru-Guzik, A., Rubio, A., Marques, M.A.L.: Time-dependent density-functional theory in massively parallel computer architectures: the octopus project. Journal of Physics: Condensed Matter 24, 233202 (2012)
Blöchl, P.E.: Projector augmented-wave method. Phys. Rev. B 50, 17953–17979 (1994)
Kohn, W., Sham, L.J.: Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138 (1965)
Parr, R., Yang, W.: Density-Functional Theory of Atoms and Molecules. International Series of Monographs on Chemistry. Oxford University Press, USA (1994)
Brandt, A.: Multi-level adaptive solutions to boundary-value problems. Math. Comp. 31, 333–390 (1977)
Wood, D., Zunger, A.: A new method for diagonalising large matrices. Journal of Physics A: Mathematical and General 18(9), 1343 (1999)
Kresse, G., Furthmüller, J.: Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169–11186 (1996)
Briggs, E.L., Sullivan, D.J., Bernholc, J.: Real-space multigrid-based approach to large-scale electronic structure calculations. Physical Review B 54, 14362–14375 (1996)
NVIDIA Corp: Whitepaper: NVIDIA’s next generation CUDA compute architecture: Fermi, http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf (accessed October 20, 2012)
Klöckner, A., Pinto, N., Lee, Y., Catanzaro, B., Ivanov, P., Fasih, A.: PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation. Parallel Computing 38(3), 157–174 (2012)
NVIDIA Corp: CUDA parallel computing platform, http://www.nvidia.com/object/cuda_home_new.html (accessed October 14, 2012)
Micikevicius, P.: 3D finite difference computation on GPUs using CUDA. In: GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, pp. 79–84. ACM, New York (2009)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hakala, S., Havu, V., Enkovaara, J., Nieminen, R. (2013). Parallel Electronic Structure Calculations Using Multiple Graphics Processing Units (GPUs). In: Manninen, P., Öster, P. (eds) Applied Parallel and Scientific Computing. PARA 2012. Lecture Notes in Computer Science, vol 7782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36803-5_4
Download citation
DOI: https://doi.org/10.1007/978-3-642-36803-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36802-8
Online ISBN: 978-3-642-36803-5
eBook Packages: Computer ScienceComputer Science (R0)