Source-to-Source Optimization of CUDA C for GPU Accelerated Cardiac Cell Modeling

  • Fred V. Lionetti
  • Andrew D. McCulloch
  • Scott B. Baden
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6271)


Large and complex systems of ordinary differential equations (ODEs) arise in diverse areas of science and engineering and pose special challenges on a streaming processor owing to the large amount of state they manipulate. We describe a set of domain-specific source transformations on CUDA C that improved performance by a factor of 6.7 on a system of ODEs arising in cardiac electrophysiology running on the nVidia GTX-295, without requiring expert knowledge of the GPU. Our transformations should apply to a wide range of reaction-diffusion systems.


Automatic code generation, source-to-source transformations, optimization, GPU, CUDA, cardiac cell modeling, ODEs





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Fred V. Lionetti (1)
  • Andrew D. McCulloch (2)
  • Scott B. Baden (1)
  1. Department of Computer Science and Engineering, University of California, La Jolla, USA
  2. Department of Bioengineering, University of California, La Jolla, USA
