GPU acceleration of splitting schemes applied to differential matrix equations

  • Hermann Mena
  • Lena-Maria Pfurtscheller
  • Tony Stillfjord
Open Access
Original Paper


We consider differential Lyapunov and Riccati equations, and generalized versions thereof. Such equations arise in many different areas and are especially important within the field of optimal control. Several different kinds of numerical methods may be used to approximate their solutions. Of these, splitting schemes are often a very competitive choice. In this article, we investigate the use of graphics processing units (GPUs) to parallelize such schemes and thereby further increase their effectiveness. In our numerical experiments, large speed-ups are often observed for sufficiently large matrices. We also provide a comparison between different splitting strategies, demonstrating that splitting the equations into a moderate number of subproblems is generally optimal.
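As a brief illustration of what "splitting" means in this setting, the following is a minimal sketch for a differential Lyapunov equation. The abstract does not fix any notation, so the form of the equation and the symbols \(A\), \(Q\), \(P_0\) and the step size \(h\) are assumptions made here for illustration only, following a common convention.

```latex
% Assumed standard form of a differential Lyapunov equation
\dot{P}(t) = A^{\mathsf T} P(t) + P(t) A + Q, \qquad P(0) = P_0 .

% Split into two subproblems, each solvable in closed form:
\dot{P} = A^{\mathsf T} P + P A \;\Rightarrow\; P(t) = e^{tA^{\mathsf T}} P_0 \, e^{tA},
\qquad
\dot{P} = Q \;\Rightarrow\; P(t) = P_0 + tQ .

% One first-order (Lie) splitting step of size h composes the two flows:
P_{n+1} = e^{hA^{\mathsf T}} \bigl( P_n + hQ \bigr) e^{hA} .
```

For a differential Riccati equation, a quadratic term of the assumed form \(-P S P\) with \(S = B R^{-1} B^{\mathsf T}\) would give a third subproblem, \(\dot{P} = -P S P\), whose solution \(P(t) = (I + t P_0 S)^{-1} P_0\) is also available in closed form. Each step then consists mainly of matrix products and actions of matrix exponentials, which are the kind of operations that may benefit from GPU parallelization.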


Keywords: Differential Lyapunov equations; Differential Riccati equations; Large scale; Splitting schemes; GPU acceleration



Open access funding provided by Max Planck Society. The authors would like to thank the anonymous referees, whose critical and constructive comments greatly improved the manuscript. We are also grateful to Peter Kandolf for his assistance with the original expleja code.

Funding information

This study was supported by the Austrian Science Fund (FWF), project id: P27926, and by a scholarship of the Vizerektorat für Forschung, University of Innsbruck.



Copyright information

© The Author(s) 2019

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Universidad Yachay Tech, San Miguel de Urcuquí, Ecuador
  2. Universität Innsbruck, Innsbruck, Austria
  3. Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
