cuVASP: A GPU-Accelerated Plane-Wave Electronic-Structure Code
We report on a source-code modification of the density-functional program suite VASP that benefits greatly from the use of graphics-processing units (GPUs). The blocked Davidson iteration scheme (EDDAV) has been optimized for GPUs and achieves speed-ups of up to 3.39 on S1070 devices and of up to 6.97 on a C2050 device. On the Fermi card, the code reaches an efficiency of 61.7% without any loss of accuracy. The algorithmic bottleneck lies in the multiplication of rectangular matrices. We also give some initial thoughts on introducing a different level of parallelism in order to harness the computational power of multi-GPU installations.
Keywords: Fast Fourier Transformation, Memory Transfer, Rectangular Matrix, CUDA Kernel, Vector Computing