Accelerating astrophysical particle simulations with programmable hardware (FPGA and GPU)

  • R. Spurzem
  • P. Berczik
  • G. Marcus
  • A. Kugel
  • G. Lienhart
  • I. Berentzen
  • R. Männer
  • R. Klessen
  • R. Banerjee
Special Issue Paper


In a previous paper we showed that direct gravitational N-body simulations in astrophysics scale very well on moderately parallel supercomputers (of order 10–100 nodes). The best balance between computation and communication is reached if the nodes are accelerated by special-purpose hardware. In this paper we describe the implementation of particle-based astrophysical simulation codes on new types of accelerator hardware: field-programmable gate arrays (FPGAs) and graphics processing units (GPUs). In addition to direct gravitational N-body simulations, we use the algorithmically similar "smoothed particle hydrodynamics" (SPH) method as a test application; the algorithms are applied to astrophysical problems such as the evolution of galactic nuclei with central black holes and gravitational-wave generation, and star formation in galaxies and galactic nuclei. We present the code performance on a single node using different kinds of special hardware (traditional GRAPE, FPGA, and GPU) and discuss some implementation aspects (e.g. accuracy). The results show that for real application codes GPU hardware is as fast as GRAPE, but at an order of magnitude lower price, and that FPGAs are useful for accelerating complex sequences of operations (such as SPH). We discuss future prospects and new cluster computers built with new generations of FPGA and GPU cards.
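To make the computational pattern concrete, the following is an illustrative sketch (not taken from the paper) of the O(N²) direct-summation gravity kernel that special-purpose boards such as GRAPE, and the FPGA and GPU accelerators discussed here, are designed to evaluate. The function name and the Plummer-softening parameter `eps` are assumptions for the example; units with G = 1 are assumed.

```python
def direct_accelerations(pos, mass, eps=1e-2):
    """Softened gravitational accelerations by direct O(N^2) summation.

    pos  -- list of (x, y, z) particle positions
    mass -- list of particle masses (G = 1 units)
    eps  -- Plummer softening length, avoids the 1/r^2 singularity
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = pos[i]
        for j in range(n):
            if i == j:
                continue
            # Pairwise separation vector from particle i to particle j.
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            # Softened squared distance and 1/r^3 factor.
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = r2 ** -1.5
            # Accumulate acceleration contribution of particle j on i.
            acc[i][0] += mass[j] * dx * inv_r3
            acc[i][1] += mass[j] * dy * inv_r3
            acc[i][2] += mass[j] * dz * inv_r3
    return acc
```

The inner loop over `j` is exactly the regular, data-parallel pairwise interaction that maps well onto hardware pipelines and GPU threads; accelerated codes offload this loop and keep the time integration (e.g. Hermite schemes) on the host.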


Keywords: Astrophysics · FPGA · GPU · Special purpose accelerators · Particle simulations





Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  • R. Spurzem (1), email author
  • P. Berczik (1)
  • G. Marcus (3)
  • A. Kugel (3)
  • G. Lienhart (3)
  • I. Berentzen (2, 1)
  • R. Männer (3)
  • R. Klessen (2)
  • R. Banerjee (2)

  1. Astronomisches Rechen-Institut (ARI-ZAH), University of Heidelberg, Heidelberg, Germany
  2. Inst. für Theor. Astrophysik (ITA-ZAH), University of Heidelberg, Heidelberg, Germany
  3. Institute for Computer Engineering (ZITI), University of Heidelberg, Mannheim, Germany
