International Journal of Parallel Programming, Volume 45, Issue 5, pp 1236–1258

Accelerating Detailed Tissue-Scale 3D Cardiac Simulations Using Heterogeneous CPU-Xeon Phi Computing

  • Johannes Langguth
  • Qiang Lan
  • Namit Gaur
  • Xing Cai


We investigate heterogeneous computing, which involves both multicore CPUs and manycore Xeon Phi coprocessors, as a new strategy for computational cardiology. In particular, 3D tissues of the human cardiac ventricle are studied with a physiologically realistic model that has 10,000 calcium release units per cell and 100 ryanodine receptors per release unit, together with tissue-scale simulations of the electrical activity and calcium handling. In order to attain resource-efficient use of heterogeneous computing systems that consist of both CPUs and Xeon Phis, we first direct the coding effort at ensuring good performance on the two types of compute devices individually. Although SIMD code vectorization is the main theme of performance programming, the actual implementation details differ considerably between CPU and Xeon Phi. Moreover, in addition to combined OpenMP+MPI programming, a suitable division of the cells between the CPUs and Xeon Phis is important for resource-efficient usage of an entire heterogeneous system. Numerical experiments show that good resource utilization is indeed achieved and that such a heterogeneous simulator paves the way for ultimately understanding the mechanisms of arrhythmia. The uncovered good programming practices can be used by computational scientists who want to adopt similar heterogeneous hardware platforms for a wide variety of applications.
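The division of cells between CPUs and Xeon Phis mentioned above can be illustrated with a minimal sketch. It is not the authors' partitioning scheme: it assumes a simple static split in proportion to per-device throughput, and the function name `split_cells` and the throughput figures are hypothetical. Only the model sizes (10,000 calcium release units per cell, 100 ryanodine receptors per unit) come from the abstract.

```python
# Model sizes stated in the abstract.
CRUS_PER_CELL = 10_000   # calcium release units per cell
RYRS_PER_CRU = 100       # ryanodine receptors per release unit
RYRS_PER_CELL = CRUS_PER_CELL * RYRS_PER_CRU  # 1,000,000 receptors per cell


def split_cells(total_cells, throughputs):
    """Divide cells among devices in proportion to their throughput.

    throughputs holds one relative speed per device (for example,
    cells processed per second). These values are hypothetical here;
    in practice they would come from benchmarking each device.
    Returns integer cell counts that sum exactly to total_cells.
    """
    if total_cells < 0:
        raise ValueError("total_cells must be non-negative")
    if not throughputs or any(r <= 0 for r in throughputs):
        raise ValueError("throughputs must be a non-empty list of positive values")

    total_rate = sum(throughputs)
    # Ideal (fractional) share of each device.
    ideal = [total_cells * r / total_rate for r in throughputs]
    # Round down first, then hand out the cells lost to rounding.
    counts = [int(x) for x in ideal]
    leftover = total_cells - sum(counts)
    # Devices with the largest fractional remainder get one extra cell.
    order = sorted(range(len(ideal)),
                   key=lambda i: ideal[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical node: two CPU sockets and two coprocessors, with each
    # coprocessor assumed to be three times as fast as one socket.
    print(split_cells(1000, [1.0, 1.0, 3.0, 3.0]))
    print(RYRS_PER_CELL)
```

A static proportional split like this only balances the load if the per-cell cost is roughly uniform and the throughput figures are measured on the actual kernel; otherwise the ratios would need to be tuned empirically.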


Keywords: Calcium handling; Multiscale cardiac tissue simulation; Supercomputing; Xeon Phi



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Simula Research Laboratory, Lysaker, Norway
  2. College of Computer, National University of Defense Technology, Changsha, China
  3. National Key Laboratory of Parallel and Distributed Processing, Changsha, China
  4. Department of Informatics, University of Oslo, Oslo, Norway
