GPU Acceleration of Hermite Methods for the Simulation of Wave Propagation

  • Arturo Vargas
  • Jesse Chan
  • Thomas Hagstrom
  • Timothy Warburton
Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 119)


The Hermite methods of Goodrich, Hagstrom, and Lorenz (2006) use Hermite interpolation to construct high-order numerical methods for hyperbolic initial value problems. The structure of the method has several favorable features for parallel computing. In this work, we propose algorithms that take advantage of the many-core architecture of Graphics Processing Units. The algorithms exploit the compact stencil of Hermite methods and use data structures that allow for efficient data loads and stores. Additionally, the highly localized evolution operator of Hermite methods allows us to combine multi-stage time-stepping methods within the new algorithms while incurring minimal global memory accesses. Using a scalar linear wave equation, we study the algorithm in two forms: with Hermite interpolation and evolution as individual kernels, and with both combined into a single monolithic kernel. For both approaches we demonstrate strategies to increase performance. Our numerical experiments show that although a two-kernel approach allows for better performance on the hardware, a monolithic kernel can offer a comparable time to solution with less global memory usage.
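
As a rough illustration of the compact stencil mentioned above, the following Python sketch performs the lowest-order (cubic) Hermite interpolation in one dimension: values and first derivatives at two neighboring nodes determine the value and derivative at the cell midpoint. It is not the authors' GPU implementation. The function name `hermite_interp_to_midpoints`, the periodic grid, and the sine test data are assumptions made only for this example.

```python
import math

def hermite_interp_to_midpoints(u, du, h):
    """Cubic Hermite interpolation from grid nodes to cell midpoints.

    u[i] and du[i] hold the function value and first derivative at node i
    of a periodic grid with spacing h (hypothetical layout for this sketch).
    Returns the interpolant's value and derivative at each cell midpoint.
    """
    n = len(u)
    u_mid = [0.0] * n
    du_mid = [0.0] * n
    for i in range(n):
        # Each midpoint reads only its two neighboring nodes, so every
        # iteration is independent of the others.
        j = (i + 1) % n  # right neighbor, with periodic wrap-around
        u_mid[i] = 0.5 * (u[i] + u[j]) + h * (du[i] - du[j]) / 8.0
        du_mid[i] = 1.5 * (u[j] - u[i]) / h - 0.25 * (du[i] + du[j])
    return u_mid, du_mid

# Example data: one period of sin(x) sampled at n nodes.
n = 32
h = 2.0 * math.pi / n
x = [i * h for i in range(n)]
u = [math.sin(xi) for xi in x]
du = [math.cos(xi) for xi in x]

u_mid, du_mid = hermite_interp_to_midpoints(u, du, h)
max_err = max(abs(u_mid[i] - math.sin(x[i] + 0.5 * h)) for i in range(n))
print("max midpoint error:", max_err)
```

Because the loop iterations do not depend on one another, each cell could in principle be assigned to its own GPU thread or thread block. How the work is actually mapped to the hardware is the subject of the paper itself.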



TH was supported in part by NSF Grant DMS-1418871. TW and JC were supported in part by NSF Grant DMS-1216674. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.


  1. D. Appelö, M. Inkman, T. Hagstrom, T. Colonius, Recent progress on Hermite methods in aeroacoustics, in 17th AIAA/CEAS Aeroacoustics Conference. AIAA, 2011
  2. E. Baysal, D.D. Kosloff, J.W. Sherwood, Reverse time migration. Geophysics 48, 1514–1524 (1983)
  3. X. Chen, Numerical and analytical studies of electromagnetic waves: hermite methods, supercontinuum generation, and multiple poles in the SEM, Doctoral Thesis, University of New Mexico, 2012
  4. E.T. Dye, Performance analysis and optimization of hermite methods on NVIDIA GPUs using CUDA, Master Thesis, The University of New Mexico, 2015
  5. J. Goodrich, T. Hagstrom, J. Lorenz, Hermite methods for hyperbolic initial-boundary value problems. Math. Comput. 75, 595–630 (2006)
  6. T. Hagstrom, D. Appelö. Experiments with Hermite methods for simulating compressible flows: Runge-Kutta time-stepping and absorbing layers, in 13th AIAA/CEAS Aeroacoustics Conference. AIAA, 2007
  7. T. Hagstrom, D. Appelö, 2015. Solving PDEs with hermite interpolation, in Spectral and High Order Methods for Partial Differential Equations ICOSAHOM 2014 (Springer International Publishing, Cham, 2014), pp. 31–49
  8. J.S. Hesthaven, T. Warburton, Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications (Springer Science & Business Media, New York, 2007)
  9. A. Klöckner, T. Warburton, J. Bridge, J.S. Hesthaven, Nodal discontinuous Galerkin methods on graphics processors. J. Comput. Phys. 228, 7863–7882 (2009)
  10. D. Medina. OKL: a unified language for parallel architectures, Doctoral Thesis, Rice University, 2015
  11. P. Micikevicius. 3D finite difference computation on GPUs using CUDA, in Proceedings of 2nd workshop on general purpose processing on graphics processing units, ACM, 2009, pp. 79–84
  12. A. Modave, A. St-Cyr, T. Warburton, GPU performance analysis of a nodal discontinuous Galerkin method for acoustic and elastic models. Comput. Geosci. 91, 64–76 (2006)
  13. J. Sanders, E. Kandrot, CUDA by Example: An Introduction to General-Purpose GPU Programming (Addison-Wesley Professional, Boston, MA, 2010)
  14. A. Taflove, S.C. Hagness, Computational Electrodynamics: The Finite-Difference Time-Domain Method, 2nd edn. (Artech House, Norwood, MA, 1995)
  15. A. Vargas, J. Chan, T. Hagstrom, T. Warburton, Variations on Hermite methods for wave propagation. arXiv:1509.08012 (2015, arXiv preprint)
  16. J. Virieux, S. Operto, An overview of full-waveform inversion in exploration geophysics. Geophysics 74, WCC1–WCC2 (2009)
  17. S. Williams, A. Waterman, D. Patterson, Roofline: an insightful visual performance model for multicore architectures. Commun. ACM 52, 65–76 (2009)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Arturo Vargas (1)
  • Jesse Chan (1)
  • Thomas Hagstrom (2)
  • Timothy Warburton (3)

  1. Rice University, Houston, USA
  2. Southern Methodist University, Dallas, USA
  3. Virginia Tech, Blacksburg, USA
