Up to 700k GPU Cores, Kepler, and the Exascale Future for Simulations of Star Clusters Around Black Holes
We present benchmarks of high-precision direct astrophysical N-body simulations using up to several hundred thousand GPU cores; their weak and strong scaling behaves very well at that scale and allows the core count to be increased further on the path to Exascale computing. Our simulations use large GPU clusters both in China (Chinese Academy of Sciences) and in Germany (JUDGE/MilkyWay cluster at FZ Jülich). We also present first results on the performance gain of the new Kepler K20 GPU technology, which we have tested in two small experimental systems and which also runs in the Titan supercomputer in the United States, currently the fastest computer in the world. Our high-resolution astrophysical N-body simulations are used to model star clusters and galactic nuclei with central black holes. They address key issues in theoretical physics and astrophysics, such as galaxy formation and evolution, massive black hole formation, and gravitational wave emission. The models have to cover thousands of orbital time scales or more for several million bodies. The total numerical effort is comparable to, if not higher than, that of the more widely known cosmological N-body simulations. Due to their complex structure in time (hierarchical block time steps), our codes are not considered "brute force".
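To make the two key ingredients concrete, the sketch below shows (i) the O(N²) direct-summation force kernel that such codes offload to the GPU, and (ii) the power-of-two quantization behind hierarchical block time steps. This is an illustrative NumPy sketch, not the paper's actual code; the softening length `eps` and G = 1 units are assumptions for the example.

```python
import numpy as np

def pairwise_accelerations(pos, mass, eps=1e-4):
    """Direct-summation gravitational accelerations, O(N^2).

    This is the kernel that direct N-body codes offload to the GPU.
    eps is a Plummer softening length (illustrative value, not from
    the paper); units with G = 1 are assumed.
    """
    dx = pos[None, :, :] - pos[:, None, :]        # (N, N, 3) separations r_j - r_i
    r2 = np.sum(dx * dx, axis=-1) + eps * eps     # softened squared distances
    inv_r3 = r2 ** -1.5
    np.fill_diagonal(inv_r3, 0.0)                 # exclude self-interaction
    # a_i = sum_j m_j (r_j - r_i) / |r_j - r_i|^3
    return np.einsum('ij,j,ijk->ik', inv_r3, mass, dx)

def block_time_step(dt_individual, dt_max=1.0):
    """Quantize an individual time step down to a power-of-two fraction
    of dt_max, so particles fall into hierarchical blocks that share a
    common step and can be advanced together (sketch of the scheme)."""
    dt = dt_max
    while dt > dt_individual:
        dt *= 0.5
    return dt
```

Because particles with the same quantized step form a block, only that block's forces are recomputed at each level of the hierarchy, which is what distinguishes these codes from a naive "brute force" shared-time-step integrator.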
Keywords: Black Hole, Graphical Processing Unit, Star Cluster, Astrophysical Journal, Strong Scaling