1 Introduction

Each field of applied sciences has particular requirements for computational modeling and often develops its own suite of numerical techniques. The numerical modeling of mechanical waves involves, in some applications, two somewhat conflicting requirements:

  • Complex, heterogeneous structures must be correctly modeled. In particular, interfaces and shapes of geological structures must be taken into account during space discretization. Moreover, high accuracy is needed to avoid numerical anisotropy, attenuation, or dispersion that would mislead interpretation. As a result, the numerical solution requires a huge computational effort, both in memory storage (from giga- to tera-scale node counts) and CPU time (from hours to weeks);

  • The medium where waves propagate is iteratively updated to fit recorded data. Wave simulation is one step of an imaging/inversion algorithm that may be repeated several times, thus it must be fast enough not to compromise the entire process.

These requirements are typical in the analysis of material response and imaging, for instance in

  • Exploration geophysics, reservoir scale seismics;

  • Geotechnical and engineering seismology;

  • Local, regional, and global seismology;

  • Planetary seismology;

  • Earth’s interior imaging;

  • Ground-shaking risk analysis - strong ground motion;

  • Monitoring of volcanic processes;

  • Earthquake and Tsunami early warning systems;

  • Global monitoring of nuclear tests.

Traditionally, these demands have been met with high-order schemes [1, 2]: highly accurate methods that require few grid points per wavelength, thus reducing storage and CPU-time requirements. Regardless of the chosen method, an efficient implementation is needed to reduce the total cost of the simulations. Alternatives include resorting to vector/parallel platforms (massively parallel machines, clusters, GRID [2]), efficient subroutines and libraries (FFT, Lapack, MPI [3]), and seeking a low count of operations and of primary storage.

The numerical methods that have been developed for the above-mentioned purposes constitute a multidisciplinary field named computational seismology, the numerical simulation of seismic wave propagation in arbitrary 3D models [2]. Its scope is naturally beyond global-scale seismology, reaching other topics of geosciences (such as rock physics, exploration geophysics, volcanology, and geotechnical engineering) and beyond (computational mechanics, materials science, underwater acoustics, and medicine).

The recent literature provides detailed reviews oriented towards specific methods [4,5,6,7] or communities [8,9,10]. The purpose of this paper is to share an overview of computational seismology methods with a broader audience, starting from the mathematical models, visiting general aspects of spatial and temporal discretization and then arriving at the theoretical and computational aspects of the main numerical methods currently in use.

2 Governing equations

2.1 Scalar wave equation

The most elementary mathematical model of wave propagation is the scalar wave equation

$$\begin{aligned} \ddot{u} - c^2\frac{\partial ^2{u}}{\partial x^2} = f, \end{aligned}$$
(1)

where the unknown function u(x, t) may denote, for instance, the acoustic pressure, c is the wave velocity in a homogeneous medium, and f(x, t) is the divergence of an external body force. The dots denote time differentiation. A related model is the one-dimensional shear-wave propagation equation

$$\begin{aligned} \rho \ddot{u} - \frac{\partial {}}{\partial x}\left( \mu \frac{\partial {u}}{\partial x}\right) = f, \end{aligned}$$
(2)

where \(\rho \) is the density and \(\mu \) is the shear modulus. When the latter is constant, Eq. (2) can be written similarly to (1) by defining \(c = (\mu /\rho )^{1/2}\).

In the absence of source functions, Eq. (1) has the general solution

$$\begin{aligned} u(x,t) = F(x+ct)+G(x-ct), \end{aligned}$$
(3)

where the functions \(F(\cdot )\) and \(G(\cdot )\) are arbitrary. Some important particular solutions are the d’Alembert solution

$$\begin{aligned} u(x,t) = \frac{u_0(x+ct)+u_0(x-ct)}{2} + \frac{1}{2c}\int _{x-ct}^{x+ct}u_{00}(s)\, ds, \end{aligned}$$
(4)

which satisfies the initial conditions \(u(x,0)=u_0(x)\) and \(\dot{u}(x,0)= u_{00}(x)\), and the plane-wave solution

$$\begin{aligned} u(x,t) = \exp [-i(\omega t-\kappa x)], \end{aligned}$$
(5)

where the angular frequency \(\omega \) and the wave number \(\kappa \) satisfy the dispersion relation \(\omega =\pm c\kappa \). Equation (1) can be generalized to 2D or 3D media as follows:

$$\begin{aligned} \ddot{u} - c^2\Delta u = f, \end{aligned}$$
(6)

which admits plane-wave solutions of the form \(u({{\varvec{x}}},t) = \exp [-i(\omega t-{{\varvec{\kappa }}}\cdot {{\varvec{x}}})]\), with \(\omega = \pm c|{{\varvec{\kappa }}}|\), when \(f=0\). Both Eqs. (1) and (6) may be considered in the more general case of a heterogeneous velocity field \(c({{\varvec{x}}})\). Moreover, the more general acoustic wave equation

$$\begin{aligned} \frac{\partial }{\partial t}{\left( \frac{1}{\rho c^2}\dot{u}\right) } -\nabla \cdot \left( \frac{1}{\rho }\nabla u\right) = f, \end{aligned}$$
(7)

accounts for variable density. We recall that Eq. (7) arises from the linearized mass and momentum conservation equations

$$\begin{aligned} \frac{1}{K}\dot{u} + \nabla \cdot {{\varvec{v}}}&= 0, \end{aligned}$$
(8a)
$$\begin{aligned} \rho \dot{{{\varvec{v}}}} + \nabla u&= {{\varvec{F}}}, \end{aligned}$$
(8b)

for the pressure u and velocity \({{\varvec{v}}}\) of a fluid with density \(\rho ({{\varvec{x}}})\) and bulk modulus \(K({{\varvec{x}}})\) (see, e.g., [11]), for which the velocity is \(c({{\varvec{x}}}) = \left( K({{\varvec{x}}})/\rho ({{\varvec{x}}})\right) ^{1/2}\).
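As a quick numerical illustration of the traveling-wave structure (3), the sketch below (with an arbitrary wave speed and a Gaussian profile, both assumed for illustration only) evaluates a centered finite-difference residual of the homogeneous Eq. (1) on a right-going pulse; the residual is dominated by the \({\mathcal {O}}(h^2)\) discretization error.

```python
import numpy as np

# Sketch: any traveling profile G(x - c*t) should satisfy the homogeneous
# scalar wave equation (1); we check the finite-difference residual
# u_tt - c^2 * u_xx on a smooth Gaussian profile (illustrative values).
c = 2.0                      # assumed wave velocity
G = lambda s: np.exp(-s**2)  # assumed smooth profile

def residual(x, t, h=1e-3):
    u = lambda x, t: G(x - c*t)                           # right-going pulse
    u_tt = (u(x, t+h) - 2*u(x, t) + u(x, t-h)) / h**2     # centered in time
    u_xx = (u(x+h, t) - 2*u(x, t) + u(x-h, t)) / h**2     # centered in space
    return u_tt - c**2 * u_xx

x = np.linspace(-3.0, 3.0, 61)
print(np.max(np.abs(residual(x, 0.7))))  # small: only discretization error
```

The same check with \(G(x+ct)\) in place of \(G(x-ct)\) verifies the left-going branch of (3).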

2.2 Elastic wave equation

The standard model for wave propagation in solids is given by the conservation of linear momentum (Newton’s law)

$$\begin{aligned} \rho \ddot{{{\varvec{u}}}} - \nabla \cdot {{\varvec{\sigma }}}({{\varvec{u}}}) = {{\varvec{f}}}, \end{aligned}$$
(9)

where \({{\varvec{u}}}\) is the displacement and \({{\varvec{\sigma }}}\) is the stress tensor. In particular, elastic media are described by the linear constitutive relation (Hooke’s law)

$$\begin{aligned} {{\varvec{\sigma }}}({{\varvec{u}}}) = {{\varvec{C}}}\,{{\varvec{\epsilon }}}({{\varvec{u}}}), \end{aligned}$$
(10)

between the stress \({{\varvec{\sigma }}}\) and the linearized strain \({{\varvec{\epsilon }}}({{\varvec{u}}}) = \left( \nabla {{\varvec{u}}} + \nabla {{\varvec{u}}}^{\top }\right) /2\), where \({{\varvec{C}}}\) is the elasticity tensor. Due to the symmetry provided by the conservation of angular momentum, it is convenient to use Voigt notation

$$\begin{aligned} {{\varvec{\sigma }}}=\{\sigma _{xx},\sigma _{yy},\sigma _{zz},\sigma _{xy},\sigma _{xz},\sigma _{yz}\}^{\top }, \end{aligned}$$
(11)

or \({{\varvec{\sigma }}} = \{\sigma _{xx},\sigma _{yy},\sigma _{xy}\}^{\top }\) in 2D, for which the governing equations may be written as

$$\begin{aligned} \rho \ddot{{{\varvec{u}}}} - {\mathcal {D}}^{\top }{{\varvec{\sigma }}}&= {{\varvec{f}}}, \end{aligned}$$
(12a)
$$\begin{aligned} {{\varvec{\sigma }}}&= {{\varvec{C}}}\,{\mathcal {D}}{{\varvec{u}}}, \end{aligned}$$
(12b)

where the differential operator \({\mathcal {D}}\) is

$$\begin{aligned} {\mathcal {D}} = \left[ \begin{array}{lll} \partial _x & 0 & 0\\ 0 & \partial _y & 0\\ 0 & 0 & \partial _z \\ \partial _y & \partial _x & 0\\ \partial _z & 0 & \partial _x \\ 0 & \partial _z & \partial _y \end{array}\right] \; \hbox { or } \quad {\mathcal {D}} = \left[ \begin{array}{ll} \partial _x & 0 \\ 0 & \partial _y \\ \partial _y & \partial _x \end{array}\right] \; \hbox { in 2D}. \end{aligned}$$
(13)

The elasticity tensor \({{\varvec{C}}}\) has up to 21 free parameters, but there can be significantly fewer depending on the symmetry assumptions [12]. When the medium is isotropic, we have \({{\varvec{C}}}{{\varvec{\epsilon }}} = \lambda \,\hbox {tr}({{\varvec{\epsilon }}}){{\varvec{I}}} + 2\mu {{\varvec{\epsilon }}}\), where \({{\varvec{I}}}\) is the identity tensor and \(\lambda ,\mu >0\) are the Lamé coefficients. In experimental studies, other elastic parameters are more typical, such as Young's modulus and Poisson's ratio, which respectively are \(E {:=} \mu (3\lambda +2\mu )/(\lambda +\mu )\) and \(\nu {:=} \lambda /(2(\lambda +\mu ))\) for isotropic media. Under Voigt notation, the tensor \({{\varvec{C}}}\) has the following matrix representation:

$$\begin{aligned} {{\varvec{C}}} = \left[ \begin{array}{cccccc} \lambda +2\mu & \lambda & \lambda & 0 & 0 & 0\\ \lambda & \lambda +2\mu & \lambda & 0 & 0 & 0\\ \lambda & \lambda & \lambda +2\mu & 0 & 0 & 0\\ 0 & 0 & 0 & \mu & 0 & 0\\ 0 & 0 & 0 & 0 & \mu & 0\\ 0 & 0 & 0 & 0 & 0 & \mu \end{array}\right] . \end{aligned}$$
(14)

In vector notation, Eq. (9) assumes the following standard form:

$$\begin{aligned} \rho \ddot{{{\varvec{u}}}} - \nabla [(\lambda + \mu )\nabla \cdot {{\varvec{u}}}] - \nabla \cdot (\mu \nabla {{\varvec{u}}}) = {{\varvec{f}}}. \end{aligned}$$
(15)

When \({{\varvec{f}}}= {{\varvec{0}}}\) and the elastic parameters are constant, the plane wave \({{\varvec{u}}}({{\varvec{x}}},t) {:=} {{\varvec{R}}}\exp [-i(\omega t - {{\varvec{\kappa }}}\cdot {{\varvec{x}}})]\) is a solution to (15) if

$$\begin{aligned} \left( \frac{\omega ^2}{|{{\varvec{\kappa }}}|^2} - c_S^2\right) {{\varvec{R}}} - (c_P^2 - c_S^2) \left( {{\varvec{R}}}\cdot \frac{{{\varvec{\kappa }}}}{|{{\varvec{\kappa }}}|}\right) \frac{{{\varvec{\kappa }}}}{|{{\varvec{\kappa }}}|} = {{\varvec{0}}}, \end{aligned}$$
(16)

where

$$\begin{aligned} c_P {:=} \sqrt{\frac{\lambda +2\mu }{\rho }} , \quad c_S {:=} \sqrt{\frac{\mu }{\rho }} \end{aligned}$$
(17)

are the compressional and shear-wave velocities. The vector Eq. (16) admits the solutions \((\omega _P,{{\varvec{R}}}_P)\) and \((\omega _S,{{\varvec{R}}}_S)\), where the angular frequencies are \(\omega _P = \pm c_P|{{\varvec{\kappa }}}|\) and \(\omega _S = \pm c_S|{{\varvec{\kappa }}}|\), while the polarization directions \({{\varvec{R}}}_P\) and \({{\varvec{R}}}_S\) (\({{\varvec{R}}}_{SV}\) and \({{\varvec{R}}}_{SH}\) in 3D) are parallel and perpendicular to \({{\varvec{\kappa }}}\), respectively [13].
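The two branches of (16) are easy to verify numerically. The sketch below, with hypothetical Lamé parameters chosen for illustration, checks that a polarization parallel to \({{\varvec{\kappa }}}\) solves (16) with \(\omega = c_P|{{\varvec{\kappa }}}|\), while a perpendicular one solves it with \(\omega = c_S|{{\varvec{\kappa }}}|\).

```python
import numpy as np

# Sketch of the plane-wave dispersion check for Eq. (16), using assumed
# (illustrative) Lame parameters and density.
lam, mu, rho = 2.0, 1.0, 1.0
c_P = np.sqrt((lam + 2*mu) / rho)   # compressional velocity, Eq. (17)
c_S = np.sqrt(mu / rho)             # shear velocity, Eq. (17)

def lhs16(omega, R, kappa):
    """Left-hand side of Eq. (16) for a trial (omega, R, kappa)."""
    k = np.linalg.norm(kappa)
    n = kappa / k
    return (omega**2/k**2 - c_S**2)*R - (c_P**2 - c_S**2)*np.dot(R, n)*n

kappa = np.array([3.0, 4.0, 0.0])
k = np.linalg.norm(kappa)
r_P = lhs16(c_P*k, kappa/k, kappa)                       # P: R parallel to kappa
r_S = lhs16(c_S*k, np.array([-4.0, 3.0, 0.0])/k, kappa)  # S: R perpendicular
print(np.max(np.abs(r_P)), np.max(np.abs(r_S)))          # both vanish
```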

A relevant class of anisotropic elastic stress-strain relations in computational seismology is that of transversely isotropic media with vertical symmetry axis (VTI):

$$\begin{aligned} {{\varvec{C}}} = \left[ \begin{array}{cccccc} C_{11} & C_{12} & C_{13} & 0 & 0 & 0\\ C_{12} & C_{11} & C_{13} & 0 & 0 & 0\\ C_{13} & C_{13} & C_{33} & 0 & 0 & 0\\ 0 & 0 & 0 & C_{66} & 0 & 0\\ 0 & 0 & 0 & 0 & C_{44} & 0\\ 0 & 0 & 0 & 0 & 0 & C_{44} \end{array}\right] , \end{aligned}$$
(18)

where \(C_{66} = (C_{11}-C_{12})/2\). Following [14], a plane-wave solution \({{\varvec{u}}}({{\varvec{x}}},t) {:=} {{\varvec{R}}}\exp [-i(\omega t - {{\varvec{\kappa }}}\cdot {{\varvec{x}}})]\) in the three-dimensional case with \({{\varvec{\kappa }}} = \kappa \{\sin \theta ,0,\cos \theta \}^{\top }\) (without loss of generality, given the cylindrical symmetry [15]) yields the following dispersion relations:

$$\begin{aligned} \omega _P&= \pm c_P\left( 1 + \varepsilon \sin ^2\theta +\Delta (\theta )\right) ^{1/2}\kappa , \end{aligned}$$
(19a)
$$\begin{aligned} \omega _{SV}&= \pm c_S\left( 1 + \frac{c_P^2}{c_S^2}\varepsilon \sin ^2\theta -\frac{c_P^2}{c_S^2}\Delta (\theta )\right) ^{1/2}\kappa , \end{aligned}$$
(19b)
$$\begin{aligned} \omega _{SH}&= \pm c_S\left( 1+2\gamma \sin ^2\theta \right) ^{1/2}\kappa , \end{aligned}$$
(19c)

where \(c_P=(C_{33}/\rho )^{1/2}\), \(c_S=(C_{44}/\rho )^{1/2}\), \(\varepsilon =(C_{11}-C_{33})/(2C_{33})\), \(\gamma = (C_{66}-C_{44})/(2C_{44})\), \(\delta = [(C_{13}+C_{44})^2-(C_{33}-C_{44})^2]/[2C_{33}(C_{33}-C_{44})]\), and

$$\begin{aligned} \Delta (\theta ) = \frac{r_{CP}}{2}\left\{ \left[ 1 + \frac{4(2\delta -\varepsilon )}{r_{CP}}\sin ^2\theta \cos ^2\theta + \frac{4(r_{CP}+\varepsilon )\varepsilon }{r_{CP}^2}\sin ^4\theta \right] ^{1/2} -1\right\} , \end{aligned}$$
(20)

with \(r_{CP} = 1 -c_S^2/c_P^2\). The fact that \(\omega _{SV}\) and \(\omega _{SH}\) may not coincide leads to the phenomenon of shear-wave splitting. The dimensionless numbers \(\varepsilon \), \(\delta \), and \(\gamma \) are known as Thomsen parameters and serve as measures of anisotropy. Under additional assumptions, the elasticity tensor can be entirely written in terms of \(c_P,c_S\), and Thomsen parameters [16, 17].
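A small sketch of the dispersion relations (19)-(20), with assumed (illustrative) velocities and Thomsen parameters: at \(\theta = 0\) all three modes reduce to the isotropic values, while at \(\theta = \pi /2\) the SV and SH frequencies differ, exhibiting shear-wave splitting.

```python
import numpy as np

# Sketch of the VTI phase frequencies (19) with the exact correction term
# Delta(theta) of Eq. (20); parameter values are assumed for illustration.
c_P, c_S = 3.0, 1.5                  # vertical P and S velocities (assumed)
eps, gamma, delta = 0.1, 0.05, 0.08  # assumed Thomsen parameters

def Delta(theta):
    r = 1.0 - c_S**2/c_P**2          # r_CP in Eq. (20)
    s2, c2 = np.sin(theta)**2, np.cos(theta)**2
    return 0.5*r*(np.sqrt(1.0 + 4.0*(2.0*delta - eps)*s2*c2/r
                          + 4.0*(r + eps)*eps*s2**2/r**2) - 1.0)

def omega_P(theta, kappa=1.0):
    return c_P*np.sqrt(1.0 + eps*np.sin(theta)**2 + Delta(theta))*kappa

def omega_SV(theta, kappa=1.0):
    q = c_P**2/c_S**2
    return c_S*np.sqrt(1.0 + q*eps*np.sin(theta)**2 - q*Delta(theta))*kappa

def omega_SH(theta, kappa=1.0):
    return c_S*np.sqrt(1.0 + 2.0*gamma*np.sin(theta)**2)*kappa

print(omega_P(0.0), omega_SV(0.0), omega_SH(0.0))  # isotropic limit at theta=0
print(omega_SV(np.pi/2), omega_SH(np.pi/2))        # split shear modes
```

Note that \(\Delta (\pi /2) = \varepsilon \), so the horizontal P frequency is \(c_P(1+2\varepsilon )^{1/2}\kappa \), as expected for a transversely isotropic medium.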

2.2.1 Viscoelasticity

The model of viscoelastic media is based on stress-strain relations that account for not only the instantaneous strain, but all its history. This is accomplished by a convolution in time between the strain rate of change and a time-dependent tensor known as the relaxation tensor \({{\varvec{\Psi }}}\) [18]:

$$\begin{aligned} {{\varvec{\sigma }}}({{\varvec{u}}},t) = \int _{-\infty }^{t}{{\varvec{\Psi }}}(t-\tau )\,\dot{{{\varvec{\epsilon }}}}({{\varvec{u}}},\tau )\, d\tau . \end{aligned}$$
(21)

This equation may be recast in differential form through fractional derivatives [12, Sec. 2.5.2] and can be numerically handled with the aid of auxiliary memory variables [19]. If \({{\varvec{\Psi }}}(t) = {{\varvec{C}}}H(t)\), where H is the Heaviside function, then we recover the elastic model (10) with elasticity tensor \({{\varvec{C}}}\). Another extreme case is when \({{\varvec{\Psi }}}(t) = {{\varvec{\eta }}}_0\delta (t)\), where \(\delta \) is the Dirac distribution, which leads to \({{\varvec{\sigma }}}({{\varvec{u}}},t) = {{\varvec{\eta }}}_0\dot{{{\varvec{\epsilon }}}}({{\varvec{u}}},t)\).

Important aspects of wave propagation in viscoelastic media are present in the Kelvin–Voigt model, which combines both cases above, in the one-dimensional case:

$$\begin{aligned} \sigma (x,t) = G_0\frac{\partial {u}}{\partial x}(x,t) + \eta _0\frac{\partial }{\partial t}{}\left( \frac{\partial {u}}{\partial x}(x,t)\right) . \end{aligned}$$
(22)

This constitutive relation yields the following wave equation:

$$\begin{aligned} \rho \ddot{u} - \frac{\partial {}}{\partial x}\left( G_0\frac{\partial {u}}{\partial x}+ \eta _0\frac{\partial }{\partial t}{}\left( \frac{\partial {u}}{\partial x}\right) \right) = f, \end{aligned}$$
(23)

which serves as a preliminary site-response model for small strains [20]. The dispersion relation for this equation is \(\rho \omega ^2=\kappa ^2(G_0-i\omega \eta _0)\), hence the phase (\(\omega /\kappa \)) and group (\(d\omega /d\kappa \)) velocities are complex and frequency-dependent, highlighting two relevant aspects of viscoelastic wave propagation: attenuation and physical dispersion, respectively.
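Both effects can be seen by solving the Kelvin-Voigt dispersion relation for \(\omega \) at fixed real \(\kappa \). The sketch below, with assumed parameter values, shows that the roots have negative imaginary part (temporal decay of \(\exp [-i\omega t]\), i.e., attenuation) and a phase velocity that varies with \(\kappa \) (physical dispersion).

```python
import numpy as np

# Sketch: roots of the Kelvin-Voigt dispersion relation
# rho*omega^2 = kappa^2*(G0 - i*omega*eta0), with assumed parameters.
rho, G0, eta0 = 1.0, 4.0, 0.1

def omega(kappa):
    # Roots of rho*w**2 + 1j*eta0*kappa**2*w - G0*kappa**2 = 0.
    return np.roots([rho, 1j*eta0*kappa**2, -G0*kappa**2])

for kappa in (1.0, 4.0):
    w = omega(kappa)
    # Im(omega) < 0: attenuation; Re(omega)/kappa varies: dispersion.
    print(kappa, w, np.abs(w.real)/kappa)
```

For \(\eta _0 = 0\) the roots reduce to the elastic values \(\omega = \pm \kappa (G_0/\rho )^{1/2}\).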

2.2.2 Poroelasticity

Wave propagation in fluid-saturated porous media was studied by M. A. Biot for a variety of cases [21,22,23]. For simplicity, we consider an isotropic solid matrix with constant porosity \(\phi \).

Let us denote the densities and bulk moduli of the constituent solid and the saturating fluid as \(\rho _s,\rho _f\) and \(K_s,K_f\), respectively. Moreover, the bulk and shear moduli of the dry matrix will be denoted as \(K_d\) and \(\mu _d\). For convenience, let us also introduce the Lamé parameter \(\lambda _d = K_d - (2/3)\mu _d\).

Though the governing equations may be written in terms of the displacement vectors \({{\varvec{u}}}_s,\, {{\varvec{u}}}_f\) at the solid and fluid phases [21], it is convenient to substitute fluid displacement by \({{\varvec{w}}}=\phi ({{\varvec{u}}}_f-{{\varvec{u}}}_s)\), which represents the flow of the fluid relative to the solid but measured in terms of volume per unit area of the bulk medium [23]. In this case, the equations of motion are

$$\begin{aligned} \rho \ddot{{{\varvec{u}}}}_s + \rho _{f}\ddot{{{\varvec{w}}}} - \nabla \cdot {{\varvec{\sigma }}}_b&= {{\varvec{0}}}, \end{aligned}$$
(24a)
$$\begin{aligned} \rho _{f}\ddot{{{\varvec{u}}}}_s + \frac{T\rho _f}{\phi }\ddot{{{\varvec{w}}}} + \nabla p + \frac{\eta }{k}\dot{{{\varvec{w}}}}&= {{\varvec{0}}}, \end{aligned}$$
(24b)

where \(\rho = (1-\phi )\rho _s + \phi \rho _f\) is the density of the saturated matrix, T is the tortuosity, \(\eta \) is the fluid viscosity, and k is the permeability. The constitutive relations for the total stress and the fluid pressure p can be written as

$$\begin{aligned} {{\varvec{\sigma }}}_b&= \lambda _d(\nabla \cdot {{\varvec{u}}}_s){{\varvec{I}}} + \mu _d\left( \nabla {{\varvec{u}}}_s+\nabla {{\varvec{u}}}_s^{\top }\right) - \alpha p\,{{\varvec{I}}}, \end{aligned}$$
(25a)
$$\begin{aligned} p&= -M\left( \alpha \nabla \cdot {{\varvec{u}}}_s + \nabla \cdot {{\varvec{w}}}\right) , \end{aligned}$$
(25b)

where \(\alpha =1-K_d/K_s\) and M is such that \(1/M = (\alpha -\phi )/K_s+\phi /K_f\) [12].

A plane-wave analysis can be performed by decoupling the \(P\)- and \(S\)-modes of propagation [12]. Namely, we can apply the divergence operator to Eq. (24) to find a system of equations for \(\nabla \cdot {{\varvec{u}}}_s\) and \(\nabla \cdot {{\varvec{w}}}\), and apply the curl operator to the same equations to find another system for \(\hbox {curl }{{\varvec{u}}}_s\) and \(\hbox {curl }{{\varvec{w}}}\). Proceeding to the analysis of propagation in a single direction, we find two compressional velocities rather than a single one; the propagation mode associated with the velocity of lower magnitude is known as Biot's slow wave.

2.3 Velocity-stress formulation

The mathematical models reviewed above involve partial differential equations of second order in time for the displacement field. If we seek instead the velocity field (and introduce the stress as an additional variable), we are led to a system of first-order equations, for which a large variety of numerical methods is available.

Let us consider for instance the elastic Eq. (9). By taking time derivatives of both sides of (10), we find

$$\begin{aligned} \rho \dot{{{\varvec{v}}}} - \nabla \cdot {{\varvec{\sigma }}}&= {{\varvec{f}}}, \end{aligned}$$
(26a)
$$\begin{aligned} \dot{{{\varvec{\sigma }}}} - {{\varvec{C}}}\,{{\varvec{\epsilon }}}({{\varvec{v}}})&= {{\varvec{0}}}, \end{aligned}$$
(26b)

where the unknowns are the stress \({{\varvec{\sigma }}}\) and the velocity \({{\varvec{v}}} = \dot{{{\varvec{u}}}}\). In matrix form, we have

$$\begin{aligned} \frac{\partial }{\partial t}\left\{ \begin{array}{c} {{\varvec{v}}}\\ {{\varvec{\sigma }}} \end{array}\right\} = \left[ \begin{array}{cc} {{\varvec{0}}} & \rho ^{-1}\nabla \cdot \\ {{\varvec{C}}}\,{{\varvec{\epsilon }}}(\cdot ) & {{\varvec{0}}} \end{array}\right] \left\{ \begin{array}{c} {{\varvec{v}}}\\ {{\varvec{\sigma }}} \end{array}\right\} + \left\{ \begin{array}{c} \rho ^{-1}{{\varvec{f}}}\\ {{\varvec{0}}} \end{array}\right\} . \end{aligned}$$
(27)

A related first-order system is given by the velocity-displacement formulation

$$\begin{aligned} \dot{{{\varvec{u}}}}&= {{\varvec{v}}}, \end{aligned}$$
(28a)
$$\begin{aligned} \rho \dot{{{\varvec{v}}}} - \nabla \cdot \left( {{\varvec{C}}}\,{{\varvec{\epsilon }}}({{\varvec{u}}})\right)&= {{\varvec{f}}}. \end{aligned}$$
(28b)

From the computational point of view, formulation (28) has the advantage of involving fewer unknowns than (26). For instance, in the three-dimensional case we have six unknowns rather than nine.

2.4 Boundary conditions

Modeling of wave-propagation problems may involve not only physical but also computational (or artificial) boundaries depending on the region of interest, denoted by \(\Omega \) (the problem domain). Physical boundaries are usually modeled by transmission conditions of the form

$$\begin{aligned} {{\varvec{\sigma }}}\cdot {{\varvec{n}}} = g, \end{aligned}$$
(29)

where \({{\varvec{n}}}\) is the unit vector normal to the boundary and pointing outwards. The case \(g=0\) is referred to as a free-surface boundary condition.

Ideally, a computational boundary should not interfere with the waves, which makes it very challenging to model. One of the classical approaches is to use absorbing (or non-reflecting) boundary conditions [7, 24, 25]. As pointed out in [26], absorbing boundary conditions are related to the Sommerfeld radiation condition

$$\begin{aligned} \lim _{r\rightarrow \infty }r^{(d-1)/2}\left( \frac{\partial u}{\partial r}(r) - iku(r)\right) =0,\quad r^2 = x_1^2 + \cdots + x_d^2 \end{aligned}$$
(30)

for the Helmholtz equation \(\Delta u + k^2u=0\), in the sense that the analogue of (30) for the scalar wave equation (6) is

$$\begin{aligned} \lim _{r\rightarrow \infty }r^{(d-1)/2}\left( \frac{\partial u}{\partial r}(r,t) + \frac{1}{c}\frac{\partial u}{\partial t}(r,t)\right) =0. \end{aligned}$$
(31)

Engquist and co-authors [24, 27] obtained the following boundary condition at \(r=a\) in the two-dimensional case:

$$\begin{aligned} \frac{\partial u}{\partial r}(a,t) + \frac{1}{c}\frac{\partial u}{\partial t}(a,t) + \frac{1}{2a}u(a,t)=0, \end{aligned}$$
(32)

as well as higher-order conditions, based on paraxial approximations of the wave equation (see also [28]). In Cartesian coordinates, the lowest-order boundary conditions at \(x=0\) (a typical lateral border in a two-dimensional simulation) are

$$\begin{aligned} \left( \frac{\partial }{\partial x} - \frac{1}{c}\frac{\partial }{\partial t}\right) u(0,y,t) = \frac{\partial u}{\partial x}(0,y,t) - \frac{1}{c}\frac{\partial u}{\partial t}(0,y,t)&= 0, \end{aligned}$$
(33a)
$$\begin{aligned} \left( \frac{\partial ^2}{\partial x\partial t} - \frac{1}{c}\frac{\partial ^2}{\partial t^2} + \frac{c}{2}\frac{\partial ^2}{\partial y^2}\right) u(0,y,t)&= 0. \end{aligned}$$
(33b)

Conditions (33) were also derived by annihilating the reflection coefficient of plane-wave solutions [29,30,31]. A more general approach was later proposed by Higdon [25, 32], who considered boundary conditions of order p in the form

$$\begin{aligned} \prod _{j=1}^{p}\left( \cos \alpha _j\frac{\partial }{\partial t} - c\frac{\partial }{\partial x}\right) u(0,y,t) = 0, \end{aligned}$$
(34)

which reduce to the Engquist–Majda conditions when \(\alpha _j=0\), \(1\le j\le p\). These angles may be chosen to minimize the reflection coefficient

$$\begin{aligned} R(\theta ) = -\prod _{j=1}^{p}\frac{\cos \alpha _j-\cos \theta }{\cos \alpha _j+\cos \theta } \end{aligned}$$
(35)

of plane waves traveling with angle of incidence \(\theta \). Bamberger et al. [33] proposed modified conditions that account for Rayleigh waves. Another relevant progress on absorbing boundary conditions is handling corner points [34,35,36,37].
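The behavior of (35) is easy to explore numerically. The sketch below, for a hypothetical Higdon condition of order \(p=2\) with illustrative angles \(\alpha = (0, \pi /3)\), shows that R vanishes exactly at the design angles and stays small in between.

```python
import numpy as np

# Sketch of the reflection coefficient (35) for a Higdon-type condition of
# order p = 2; the angles alpha_j below are an assumed design choice.
alphas = np.array([0.0, np.pi/3])

def R(theta):
    """Reflection coefficient (35) for incidence angle theta."""
    return -np.prod((np.cos(alphas) - np.cos(theta))
                    / (np.cos(alphas) + np.cos(theta)))

print(abs(R(0.0)), abs(R(np.pi/3)))  # exact zeros at the design angles
print(abs(R(np.pi/6)))               # small in between
```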

As pointed out in [7], the error of a high-order absorbing boundary condition does not necessarily converge to zero as the order tends to infinity. When the error due to the boundary condition does converge to zero, it is referred to as an exact non-reflecting boundary condition [38,39,40,41].

Moreover, higher-order approximations involve high-order spatial and temporal derivatives, which must be appropriately represented in the numerical discretization and usually incur a higher computational cost. Such a constraint has motivated the study of high-order local non-reflecting boundary conditions (high-order local NRBCs, [7, 36]), which introduce auxiliary variables that avoid the need for high-order derivatives. As outlined in [7], one of the first approaches of this class, due to Collino [34], can be written as

$$\begin{aligned} \frac{\partial u}{\partial x}(0,y,t) + \frac{1}{c}\sum _{j=1}^{p}\frac{2\sin ^2\theta _j}{2p+1}\frac{\partial \phi _j}{\partial t}(0,y,t)&= 0, \end{aligned}$$
(36a)
$$\begin{aligned} \frac{1}{c^2}\frac{\partial ^2\phi _j}{\partial t^2}(y,t) - \cos ^2\theta _j\frac{\partial ^2\phi _j}{\partial y^2}(y,t) - \frac{\partial ^2u}{\partial y^2}(y,t)&= 0 \quad (1\le j\le p), \end{aligned}$$
(36b)

where \(\theta _j = j\pi /(2p+1)\). Thanks to the auxiliary variables \(\phi _1,\ldots ,\phi _p\), the derivatives in (36) have order no greater than two.

2.5 Variational formulation

The variational, or weak, formulation is a convenient representation of the mathematical model that allows one to seek the approximate solution in a functional space with lower regularity requirements, for instance when the material properties are discontinuous.

For conciseness, let us focus on system (9)–(10). Its weak form is obtained by taking the scalar product of both sides of (9) by a test function \({{\varvec{w}}}\) and integrating over the domain \(\Omega \). In the case of a homogeneous Dirichlet condition, we seek \({{\varvec{u}}}(\cdot ,t)\in H^1_0(\Omega )^d\) such that

$$\begin{aligned} \int _{\Omega }\rho \,\ddot{{{\varvec{u}}}}\cdot {{\varvec{w}}}\, d{{\varvec{x}}} + \int _{\Omega }{{\varvec{\sigma }}}({{\varvec{u}}}):{{\varvec{\epsilon }}}({{\varvec{w}}})\, d{{\varvec{x}}} = \int _{\Omega }{{\varvec{f}}}\cdot {{\varvec{w}}}\, d{{\varvec{x}}} \quad \forall \,{{\varvec{w}}}\in H^1_0(\Omega )^d, \end{aligned}$$
(37)

where \(d=2\) or 3 and \(H^1_0(\Omega ) = \{u\in L^2(\Omega ) \; | \; \nabla u\in L^2(\Omega ) \text{ and } u\!\mid _{\partial \Omega }=0\}\). Here \(L^2(\Omega )\) is the space of functions that are square-integrable with respect to the Lebesgue measure in \(\Omega \), and \(u\!\mid _{\partial \Omega }\) denotes the trace of u over the boundary of \(\Omega \) [42]. Analogously, the variational formulation of system (26) is

$$\begin{aligned} \int _{\Omega }\rho \,\dot{{{\varvec{v}}}}\cdot {{\varvec{w}}}\, d{{\varvec{x}}} + \int _{\Omega }{{\varvec{\sigma }}}:{{\varvec{\epsilon }}}({{\varvec{w}}})\, d{{\varvec{x}}}&= \int _{\Omega }{{\varvec{f}}}\cdot {{\varvec{w}}}\, d{{\varvec{x}}} \quad \forall \,{{\varvec{w}}}\in H^1_0(\Omega )^d, \end{aligned}$$
(38a)
$$\begin{aligned} \int _{\Omega }\dot{{{\varvec{\sigma }}}}:{{\varvec{\tau }}}\, d{{\varvec{x}}} - \int _{\Omega }{{\varvec{C}}}\,{{\varvec{\epsilon }}}({{\varvec{v}}}):{{\varvec{\tau }}}\, d{{\varvec{x}}}&= 0 \quad \forall \,{{\varvec{\tau }}}\in X, \end{aligned}$$
(38b)

where \(X = \{{{\varvec{\tau }}}\in L^2(\Omega )^{d\times d} \; ; \; {{\varvec{\tau }}}^{\top }= {{\varvec{\tau }}}\}\). A similar variational formulation can be obtained for the velocity-displacement formulation (28). One may also consider a formulation of system (26) with lower regularity on velocities and higher regularity on stresses, which can be discretized with mixed finite elements (see, e.g., [43]), and has been adapted to include perfectly matched layers [44].

3 Model discretization

The differential or variational formulations presented in the previous section, when complemented with proper initial and/or boundary conditions, provide a unique wave field that is complete in the sense that it can be determined for any point \({{\varvec{x}}}\) of the domain \(\Omega \) and any time \(t\ge t_0\), where \(t_0\) is the initial time of observation (for convenience, we consider \(t_0=0\) from here on). Since those initial-boundary value problems rarely have an analytical solution, we must resort to approximate solutions.

For some methods such as finite-difference and mimetic methods, the approximate solution corresponds to an array of coefficients \({{\varvec{U}}}_j^n\) such that \({{\varvec{U}}}_j^n \approx {{\varvec{u}}}({{\varvec{x}}},t_n)\) for any point \({{\varvec{x}}}\) in a subset \(X_j\) that usually consists of a single point \({{\varvec{x}}}_j\), but could also be an edge, face, or a three-dimensional shape. The differential/integral operators are approximated or replaced with discrete operators defined at the sets \(X_j\) for any j and at \(t=t_n,\) \(n=0,1,\ldots \).

For another group of methods that is considered in this survey, we have a finite expansion of functions in the form

$$\begin{aligned} {{\varvec{u}}}({{\varvec{x}}},t) \approx {{\varvec{u}}}_{N}({{\varvec{x}}},t) = \sum _{j=0}^{N}{\hat{{{{\varvec{u}}}}}}_j(t)\phi _j({{\varvec{x}}}), \end{aligned}$$
(39)

where \(\phi _0({{\varvec{x}}}),\ldots ,\phi _{N}({{\varvec{x}}})\) are previously chosen functions, while \({\hat{{{{\varvec{u}}}}}}_0(t),\ldots ,{\hat{{{{\varvec{u}}}}}}_{N}(t)\) are time-dependent coefficients to be determined. Pseudospectral and finite-element methods fall into this category. Through expansion (39), time and space are separately handled, in analogy with the method of separation of variables for partial differential equations. The spatial operators are applied to functions \(\phi _j({{\varvec{x}}})\) in original form, i.e., they are not discretized, though the operator evaluation may depend on interpolation or numerical integration.

To determine the arrays \({{\varvec{U}}}_j^n\) or the coefficients \({\hat{{{{\varvec{u}}}}}}_j(t)\), each method relies on a particular approximation principle, but most methods need to discretize the independent variables, space and time. Time and space sampling could be required in the approximation principle, in auxiliary calculations, or in the output generation.

3.1 Spatial discretization

The main requirements for the discretization in space are that the waves and the medium heterogeneities must be sufficiently sampled. While the former is strongly dependent on the chosen numerical method, the latter is similar for most methods.

The geological model is usually discretized in a finite number of cells (also referred to as elements or blocks) where, ideally,

  • there are no restrictions on the material variability;

  • interfaces and structures can be honored;

  • different rheologies can be easily handled.

Following [2], we classify the ensemble of all cells of a domain, called the mesh or grid, as structured or non-structured. A structured mesh is obtained by a mapping from the integer set \(\{0,1,\ldots ,N_1\}\times \cdots \times \{0,1,\ldots ,N_d\}\) to a d-dimensional domain \(\Omega \). For example, a cubic mesh with uniform spacing h can be defined by the mapping of each \(\{i,j,k\}\in \{0,\ldots ,N\}^3\) to the vertex \((x_0+ih,y_0+jh,z_0+kh)\). As data structures, these meshes are uniquely defined by arrays with the coordinates of the vertices. The cells are understood to be formed by points whose coordinates have adjacent indices in a Cartesian fashion.
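The structured-mesh mapping described above can be sketched in a few lines; the grid size and spacing below are illustrative.

```python
import numpy as np

# Sketch of the structured cubic mesh: each index triple (i, j, k) in
# {0,...,N}^3 is mapped to the vertex (x0 + i*h, y0 + j*h, z0 + k*h).
def cubic_mesh(N, h, origin=(0.0, 0.0, 0.0)):
    x0, y0, z0 = origin
    i, j, k = np.meshgrid(np.arange(N + 1), np.arange(N + 1),
                          np.arange(N + 1), indexing="ij")
    return np.stack([x0 + i*h, y0 + j*h, z0 + k*h], axis=-1)

verts = cubic_mesh(N=2, h=0.5)
print(verts.shape)     # (3, 3, 3, 3): (N+1)^3 vertices, 3 coordinates each
print(verts[2, 1, 0])  # vertex for indices (i, j, k) = (2, 1, 0)
```

No connectivity array is needed: cells are implied by adjacent indices, which is precisely what distinguishes structured from non-structured meshes.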

On the other hand, non-structured meshes do not have a Cartesian structure, hence the coordinates alone are not sufficient to identify the mesh. Usually there is one mapping from \(\{0,1\}^d\), or another reference cell, to each cell in the mesh. Despite their complexity, non-structured meshes conform more easily to interfaces and structures, while structured meshes usually need a large number of grid nodes to avoid the staircase effect [45].

3.2 Temporal discretization

The discretization of the time variable can be done in a straightforward manner: a uniform partition of the interval [0, T] into \({N}\) intervals with time step \(\Delta t=T/{N}\):

$$\begin{aligned} 0 = t_0< t_1< \cdots < t_{N} = T, \quad t_n = n\Delta t, \end{aligned}$$
(40)

where T is the final time of the simulation. The time step will depend on the smallest grid length in space and the chosen approximation method for the time derivatives, and in general it can be dynamically selected by means of error estimators [46,47,48].

One of the greatest difficulties with temporal discretization is the situation where some elements in the spatial grid are extremely small (such as slivers generated from 3D unstructured mesh-generation codes [49]), forcing an equally small time step and undermining the efficiency of the wave-propagation simulation. This has motivated the development of local time-stepping schemes [50,51,52,53,54], where the time step is smaller where the grid is refined and larger where the grid is coarse.

4 Temporal discretization methods

In the following we review the methods for approximation of time derivatives that are most frequently used in seismic wave propagation. These methods are typically applied to the following second-order linear system of ordinary differential equations (ODEs):

$$\begin{aligned} {{\varvec{M}}}\ddot{{{\varvec{u}}}} + {{\varvec{C}}}\dot{{{\varvec{u}}}} + {{\varvec{K}}}{{\varvec{u}}} = {{\varvec{F}}}(t), \end{aligned}$$
(41)

where the matrices \({{\varvec{M}}}\), \({{\varvec{C}}}\), and \({{\varvec{K}}}\) are known as the mass, damping, and stiffness matrices, respectively. The damping matrix arises from the discretization of viscous terms such as in Eqs. (22) and (24b), but also from absorbing boundary conditions, such as (33). One can also seek the block vector \({{\varvec{v}}} = \{{{\varvec{u}}},\dot{{{\varvec{u}}}}\}^{\top }\) by solving a first-order system of the form

$$\begin{aligned} \dot{{{\varvec{v}}}} = {{\varvec{A}}}{{\varvec{v}}} + {{\varvec{F}}}(t), \end{aligned}$$
(42)

which is also the kind of system of ODEs that arises from velocity-stress and velocity-displacement formulations (27)–(28). The convenience of system (42) resides in the fact that it is a particular case of the classical equations

$$\begin{aligned} \dot{{{\varvec{v}}}} = {{\varvec{f}}}(t,{{\varvec{v}}}), \end{aligned}$$
(43)

for which a vast literature is available (see, e.g., [55]). On the other hand, system (41), as well as its generalization to time-dependent coefficient matrices [56] and non-linear internal forces [57], among others, has been thoroughly studied by the structural dynamics community.

For simplicity, temporal discretization methods will be described for the case of a uniform partition (40). Perhaps the most popular method is the leapfrog scheme, where the approximation \({{\varvec{u}}}_n\approx {{\varvec{u}}}(t_n)\) over the time partition is determined from the centered finite-difference approximation

$$\begin{aligned} {{\varvec{M}}}\frac{{{\varvec{u}}}_{n+1}-2{{\varvec{u}}}_{n}+{{\varvec{u}}}_{n-1}}{\Delta t^2} + {{\varvec{C}}}\frac{{{\varvec{u}}}_{n+1}-{{\varvec{u}}}_{n-1}}{2\Delta t} + {{\varvec{K}}}{{\varvec{u}}}_{n} = {{\varvec{F}}}(t_n), \end{aligned}$$
(44)

which is present in some of the pioneering works on computational seismology [58, 59] and has been employed in several different approaches (e.g., [60,61,62]).

Some schemes such as leapfrog can be written as multi-step methods of the form

$$\begin{aligned} \sum _{j=0}^{s}{{\varvec{A}}}_j{{\varvec{u}}}_{n+1-j} = {{\varvec{b}}}_n. \end{aligned}$$
(45)

If \({{\varvec{A}}}_0\) is the identity in (45), the scheme is called explicit; otherwise, it is implicit. In the literature of finite-element methods, the concept of explicit is extended to the cases where \({{\varvec{A}}}_0\) is diagonal [63, 64]. In general, we may refer as explicit to a method that does not require solving a linear system in any step of the computation of \({{\varvec{u}}}_{n+1}\). For instance, when \({{\varvec{M}}}\) is diagonal (lumped) and \({{\varvec{C}}}={{\varvec{0}}}\), the leapfrog method (44) becomes explicit:

$$\begin{aligned} {{\varvec{u}}}_{n+1} = 2{{\varvec{u}}}_{n} - {{\varvec{u}}}_{n-1} + \Delta t^2\,{{\varvec{M}}}^{-1}\left( {{\varvec{F}}}(t_n) - {{\varvec{K}}}{{\varvec{u}}}_{n}\right) . \end{aligned}$$
(46)
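A minimal sketch applying the explicit leapfrog idea of (46) to the 1-D scalar wave equation (1), with standard second-order finite differences in space and illustrative grid parameters; the result is compared against the d'Alembert solution (4) with zero initial velocity.

```python
import numpy as np

# Sketch: explicit leapfrog time stepping for the 1-D scalar wave equation
# with homogeneous Dirichlet borders; all parameter values are illustrative.
c, L, T = 1.0, 10.0, 2.0
nx = 801
h = L / (nx - 1)
dt = 0.5 * h / c                         # CFL number 0.5 (stable for <= 1)
nt = int(round(T / dt))
x = np.linspace(0.0, L, nx)
u0 = lambda s: np.exp(-20.0*(s - L/2)**2)  # assumed Gaussian initial pulse

r2 = (c*dt/h)**2
lap = lambda u: np.concatenate(([0.0], u[2:] - 2*u[1:-1] + u[:-2], [0.0]))
u_prev = u0(x)
u_curr = u_prev + 0.5*r2*lap(u_prev)     # first step (zero initial velocity)
for n in range(1, nt):
    u_next = 2*u_curr - u_prev + r2*lap(u_curr)
    u_next[0] = u_next[-1] = 0.0         # Dirichlet borders
    u_prev, u_curr = u_curr, u_next

# d'Alembert reference (4) with u00 = 0: two half-amplitude traveling pulses
exact = 0.5*(u0(x + c*T) + u0(x - c*T))
print(np.max(np.abs(u_curr - exact)))    # small discretization error
```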

4.1 Newmark methods

The leapfrog method is one member of a family of time-integration methods known as Newmark methods [65]. As described, e.g., in [64], the Newmark method for (41) with parameters \(\beta \) and \(\gamma \) can be written as follows:

$$\begin{aligned} \tilde{{{\varvec{u}}}}_{n+1}&= {{\varvec{u}}}_{n} + \Delta t\,{{\varvec{v}}}_{n} + \frac{\Delta t^2}{2}(1-2\beta ){{\varvec{a}}}_{n}, \end{aligned}$$
(47a)
$$\begin{aligned} \tilde{{{\varvec{v}}}}_{n+1}&= {{\varvec{v}}}_{n} + (1-\gamma )\Delta t\,{{\varvec{a}}}_{n}, \end{aligned}$$
(47b)
$$\begin{aligned} \left( {{\varvec{M}}} + \gamma \Delta t\,{{\varvec{C}}} + \beta \Delta t^2{{\varvec{K}}}\right) {{\varvec{a}}}_{n+1}&= {{\varvec{F}}}(t_{n+1}) - {{\varvec{C}}}\tilde{{{\varvec{v}}}}_{n+1} - {{\varvec{K}}}\tilde{{{\varvec{u}}}}_{n+1}, \end{aligned}$$
(47c)
$$\begin{aligned} {{\varvec{u}}}_{n+1}&= \tilde{{{\varvec{u}}}}_{n+1} + \beta \Delta t^2{{\varvec{a}}}_{n+1}, \end{aligned}$$
(47d)
$$\begin{aligned} {{\varvec{v}}}_{n+1}&= \tilde{{{\varvec{v}}}}_{n+1} + \gamma \Delta t\,{{\varvec{a}}}_{n+1}, \end{aligned}$$
(47e)

where \({{\varvec{a}}}_{n}\approx \ddot{{{\varvec{u}}}}(t_n)\) and \({{\varvec{v}}}_{n} \approx \dot{{{\varvec{u}}}}(t_n)\). The Newmark scheme naturally provides approximations for not only displacement, but also velocity and acceleration, which are useful for the inversion of three-component data [66]. The leapfrog method corresponds to \(\beta =0\) and \(\gamma =1/2\). Another well-known method from the Newmark family is the average-acceleration method [67], which corresponds to \(\beta =1/4\) and \(\gamma =1/2\). This method has been applied to acoustic and elastic wave propagation [68, 69]. The Newmark scheme has also been interpreted as a time-staggered velocity-stress algorithm with the purpose of implementing absorbing boundary conditions [70].
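A minimal single-DOF sketch of the Newmark update in a standard predictor/corrector arrangement (all parameter values are illustrative): for the undamped free oscillator, the average-acceleration method (\(\beta =1/4\), \(\gamma =1/2\)) introduces no numerical damping, so the oscillation amplitude is preserved over long runs.

```python
import numpy as np

# Sketch of the Newmark method for m*u'' + c*u' + k*u = F(t), single DOF,
# in a common predictor/corrector (a-form) arrangement.
def newmark(m, c, k, F, u0, v0, dt, nsteps, beta=0.25, gamma=0.5):
    u, v = u0, v0
    a = (F(0.0) - c*v - k*u) / m        # consistent initial acceleration
    out = [u]
    for n in range(nsteps):
        u_pred = u + dt*v + 0.5*dt**2*(1 - 2*beta)*a   # predictors
        v_pred = v + (1 - gamma)*dt*a
        a = (F((n + 1)*dt) - c*v_pred - k*u_pred) \
            / (m + gamma*dt*c + beta*dt**2*k)          # solve for a_{n+1}
        u = u_pred + beta*dt**2*a                      # correctors
        v = v_pred + gamma*dt*a
        out.append(u)
    return np.array(out)

# Undamped free vibration with assumed m, k (natural frequency 2 rad/s):
m, k = 1.0, 4.0
u = newmark(m, 0.0, k, lambda t: 0.0, 1.0, 0.0, dt=0.05, nsteps=2000)
print(u.max(), u.min())   # amplitude stays close to +1 / -1
```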

Newmark methods are at most second-order accurate [67], i.e., \({{\varvec{u}}}_n - {{\varvec{u}}}(t_n) = {\mathcal {O}}(\Delta t^2)\), and the use of higher-order methods may increase the computational efficiency despite their higher cost [71]. In the following we review several high-order temporal discretization methods.

4.2 Lax–Wendroff methods

One of the most traditional high-order approaches is the family of Lax–Wendroff methods [72], which use spatial derivatives to replace high-order time derivatives. This principle is also present in the arbitrary high-order derivatives (ADER) method [73, 74] and in nearly analytical discrete methods [75, 76].

This scheme has been widely employed to second-order equations in the form

$$\begin{aligned} \ddot{{{\varvec{v}}}} + {{\varvec{A}}}{{\varvec{v}}} = {{\varvec{f}}}(t), \end{aligned}$$
(48)

which can be derived from (41) in the absence of damping (\({{\varvec{C}}}={{\varvec{0}}}\)), with \({{\varvec{A}}}={{\varvec{M}}}^{-1}{{\varvec{K}}}\), as, e.g., in [54]. If the spatial discretization is performed with finite differences [71, 77], then \({{\varvec{M}}}\) is the identity and there is no need of transformations from (41) to (48). Most authors have considered the acoustic wave equation [54, 71, 77, 78], but the Lax-Wendroff approximation has also been applied to the elastic wave equation [79, 80].

Following [77], let us derive the fourth-order Lax–Wendroff method for (48). We refer to [78] for higher-order approximations. A standard Taylor expansion of the second-derivative term in (48) yields

$$\begin{aligned} \ddot{{{\varvec{v}}}}(t_n) = \frac{{{\varvec{v}}}(t_{n+1})-2{{\varvec{v}}}(t_n) + {{\varvec{v}}}(t_{n-1})}{\Delta t^2} - \frac{\Delta t^2}{12}\frac{\partial ^4{{\varvec{v}}}}{\partial t^4}(t_n) + {\mathcal {O}}(\Delta t^4). \end{aligned}$$
(49)

On the other hand, the fourth-order time derivative can be eliminated using (48) itself: by taking a second time derivative of that equation, we find

(50)

By combining (49) and (50), we arrive at the following explicit scheme:

(51)

The vector \(\ddot{{{\varvec{f}}}}_n\) can be approximated by a second-order scheme [81]. The use of the Lax–Wendroff scheme has also been investigated for the first-order system (42), not only for acoustic waves [82], but also for structural dynamics [83] and viscoacoustic waves [84].
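To make the construction concrete, below is a minimal sketch of a fourth-order Lax–Wendroff (Dablain-type) step for the semi-discrete system \(\ddot{{{\varvec{v}}}}={{\varvec{A}}}{{\varvec{v}}}\) without a source term, assuming a periodic 1D grid with a second-order spatial operator; all names and sizes are illustrative.

```python
import numpy as np

# Hedged sketch: fourth-order Lax-Wendroff time stepping for v'' = A v
# on a periodic 1D grid (illustrative setting, f = 0).
N, c, dx = 64, 1.0, 1.0/64
x = np.arange(N)*dx
# second-order periodic finite-difference Laplacian times c^2
A = (np.diag(-2.0*np.ones(N)) + np.diag(np.ones(N-1), 1)
     + np.diag(np.ones(N-1), -1))
A[0, -1] = A[-1, 0] = 1.0
A *= (c/dx)**2

dt, nsteps = 0.005, 200               # advance to T = 1
v_prev = np.sin(2*np.pi*x)            # eigenmode of A
# symmetric first step for zero initial velocity (Taylor to O(dt^4))
v = v_prev + 0.5*dt**2*(A @ v_prev) + dt**4/24*(A @ (A @ v_prev))
for _ in range(nsteps - 1):
    Av = A @ v
    v_new = 2*v - v_prev + dt**2*Av + dt**4/12*(A @ Av)
    v_prev, v = v, v_new

# exact semi-discrete solution: v(T) = cos(sqrt(lam) T) v(0) for this mode
lam = (2 - 2*np.cos(2*np.pi*dx))/dx**2 * c**2
err = np.max(np.abs(v - np.cos(np.sqrt(lam)*1.0)*np.sin(2*np.pi*x)))
```

For a single eigenmode the scheme reproduces the semi-discrete cosine evolution up to the fourth-order truncation error of the amplification factor.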

4.3 Runge–Kutta and symplectic methods

The Runge–Kutta (RK) methods [55] can be readily applied to system (42). For instance, the classical fourth-order RK method is given as follows:

(52)
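As a concrete illustration, the classical fourth-order RK step can be sketched for a scalar oscillator written in the first-order form (42); the matrix, step size, and final time below are illustrative choices.

```python
import numpy as np

# Hedged sketch: classical RK4 applied to y' = B y, the first-order form of
# a scalar oscillator (u' = v, v' = -w^2 u); B and dt are illustrative.
def rk4_step(f, y, t, dt):
    k1 = f(t, y)
    k2 = f(t + dt/2, y + dt/2*k1)
    k3 = f(t + dt/2, y + dt/2*k2)
    k4 = f(t + dt, y + dt*k3)
    return y + dt/6*(k1 + 2*k2 + 2*k3 + k4)

w = 2.0
B = np.array([[0.0, 1.0], [-w**2, 0.0]])
f = lambda t, y: B @ y

y = np.array([1.0, 0.0])              # u(0) = 1, v(0) = 0
dt, nsteps = 0.01, 100                # integrate to t = 1
for n in range(nsteps):
    y = rk4_step(f, y, n*dt, dt)
# exact solution: u(1) = cos(w), v(1) = -w sin(w)
```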

The RK discretization for first-order systems in the form (42) has been used in conjunction with pseudospectral [85], finite-element [86], and discontinuous Galerkin [87, 88] methods. The extension of RK methods to the second-order system (41) can be done by 2-step schemes known as Runge–Kutta–Nyström methods [89, 90], which have been thoroughly developed in the context of Hamiltonian systems [91],

(53)

Consider the flow map defined by the solution \(({{\varvec{p}}}(t),{{\varvec{q}}}(t))\) to (53) with initial conditions \({{\varvec{p}}}(0)={{\varvec{p}}}_0\) and \({{\varvec{q}}}(0)={{\varvec{q}}}_0\). Its Jacobian with respect to \(({{\varvec{p}}}_0,{{\varvec{q}}}_0)\) preserves the canonical symplectic form, and if the same property holds for the map produced by a numerical method for (53), then this method is called symplectic [92, 93]. Symplectic schemes have slower error growth and preserve conserved quantities, which makes them an attractive choice for long-time simulations [94, 95]. These schemes have also been proposed for the more general Birkhoffian systems [96], with application to poroelastic wave propagation [97].

The implicit, fourth-order Störmer–Numerov method, which has been applied to acoustic and elastic waves [98,99,100], is an example of a symplectic method [101]. The Störmer–Numerov approximation for system (48) is the following:

(54)

Makridakis [98] has extended this method to solve system (41), where the damping term accounts for absorbing boundary conditions.

4.4 Approximation of evolution operators

Another family of high-order methods is based on approximations of matrix power series that are present in the analytical solutions of systems of ODEs. As shown for instance in [102], the analytical solution of system (48) with initial conditions \({{\varvec{v}}}(0) = {{\varvec{v}}}_0\) and \(\dot{{{\varvec{v}}}}(0) = {{\varvec{v}}}_{00}\) can be written as

(55a)
(55b)

If \(a\) is a scalar, then \(C(a,t)=\cos (a^{1/2}t)\) and \(S(a,t)=\sin (a^{1/2}t)/a^{1/2}\). Taking into account the temporal discretization (40), the solution (55) may be represented as follows:

(56)

In order to derive numerical schemes from the recursive form (56), we may determine approximations to the operators \(C\) and \(S\), and select quadrature schemes to deal with the source term [102]. For instance, when \({{\varvec{f}}}={{\varvec{0}}}\) and the rational approximation

$$\begin{aligned} \cos (a^{1/2}t)\approx \frac{1 + (\beta -1/2)at^2}{1+\beta at^2}, \end{aligned}$$
(57)

is employed, then (56) leads to the following implicit scheme:

(58)

which corresponds to the Störmer–Numerov method (54) when \(\beta =1/12\) and has been revisited in [81, 103]. Baker et al. [104] studied sufficient conditions for the convergence of rational-approximation methods in the homogeneous case. One of the high-order schemes that satisfies these conditions is defined by the following approximation:

$$\begin{aligned} \cos (a^{1/2}t)\approx \frac{1 + (2\beta -1/2)at^2+(\beta ^2-\beta +1/24)a^2t^4}{(1+\beta at^2)^2},\quad \beta = \frac{5+\sqrt{15}}{60}, \end{aligned}$$
(59)

which leads to sixth-order accuracy. Some classical methods are related to the Taylor approximation of evolution operators [8, 105]. Indeed, a Taylor approximation of degree \(m\ge 1\) for (56) in the case \({{\varvec{f}}}={{\varvec{0}}}\) yields

(60)

which recovers the leapfrog and Lax–Wendroff schemes when \(m=1\) and \(m=2\), respectively. Besides rational (Padé) and Taylor expansions, Chebyshev expansions can also be employed [106]. Such an approach is known in the seismic exploration literature as the rapid expansion method [107].
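The order-raising cancellation behind the Störmer–Numerov choice \(\beta =1/12\) in (57) can be checked numerically: the error of the rational approximation to \(\cos (a^{1/2}t)\) then behaves like \({\mathcal {O}}(t^6)\), so halving \(t\) should reduce it by roughly \(2^6=64\). The scalar \(a\) and the step sizes below are illustrative.

```python
import math

# Hedged numeric check of the rational approximation (57): with beta = 1/12
# the error in approximating cos(sqrt(a) t) is O(t^6); a is illustrative.
def cos_approx(a, t, beta):
    s = a * t**2
    return (1 + (beta - 0.5)*s) / (1 + beta*s)

a = 4.0
errs = [abs(math.cos(math.sqrt(a)*t) - cos_approx(a, t, beta=1/12))
        for t in (0.1, 0.05)]
ratio = errs[0] / errs[1]     # expect about 2**6 = 64 for an O(t^6) error
```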

One can also consider the approximation of exponential operators associated with the analytical solution of the first-order system (42):

(61)

As in (55), several approximations of the matrix exponential may be employed, such as Taylor, Fourier [105], Chebyshev [105, 107, 108], and rational [109, 110] expansions. Besides using a truncated expansion of the matrix operator, one may also split the matrices into diagonal and non-diagonal parts, leading to a fixed-point iteration method [111].

5 Spatial discretization methods

The numerical methods described next provide approximations of the mathematical models described in Sect. 2 as the systems of ODEs (41) or (42).

5.1 Finite-difference methods

Most time-discretization schemes described in Sect. 4 are based on finite-difference approximations of time derivatives. These formulas are based on Taylor-series expansions that are combined to reach the desired accuracy. For instance, the expansion

$$\begin{aligned} \ddot{f}(x,t) = \frac{f(x,t-\Delta t)-2f(x,t)+f(x,t+\Delta t)}{\Delta t^2} - \frac{\Delta t^2}{24}\left( \frac{\partial ^4f}{\partial t^4}(x,\bar{t}_1)+\frac{\partial ^4f}{\partial t^4}(x,\bar{t}_2)\right) , \end{aligned}$$
(62)

where \(t_{n-1}\le \bar{t}_1\le t_n\) and \(t_{n}\le \bar{t}_2\le t_{n+1}\), leads to the finite-difference formula

$$\begin{aligned} \ddot{f}(x_j,t_n) \approx \frac{f(x_j,t_{n-1})-2f(x_j,t_n)+f(x_j,t_{n+1})}{\Delta t^2} \end{aligned}$$
(63)

over a space-time grid defined by the points \((x_j,t_n) = (x_0 + j\Delta x,n\Delta t)\). This scheme is second-order accurate if f has continuous fourth-order derivatives. By applying the same approximation to the second partial derivative in space, we arrive at the following finite-difference approximation of the scalar wave equation (1):

$$\begin{aligned} \frac{u_{j}^{n-1}-2u_{j}^{n}+u_{j}^{n+1}}{\Delta t^2} - c^2\frac{u_{j-1}^{n}-2u_{j}^{n}+u_{j+1}^{n}}{\Delta x^2} = f_{j}^{n}. \end{aligned}$$
(64)

At time step \(n\ge 1\), Eq. (64) for \(0< j < N\), complemented with boundary conditions, leads to the fully discrete system (44). For instance, under the boundary conditions \(u(x_0,t)=u(x_{N},t)=0\), we have

(65)

In the simpler situation where \(c=1\), \(f=0\), and \(\Delta x=\Delta t\), Eq. (64) reduces to

$$\begin{aligned} u_{j}^{n-1}+u_{j}^{n+1}-u_{j-1}^{n}-u_{j+1}^{n} = 0, \end{aligned}$$
(66)

which is considered in the seminal paper by Courant et al. [112]. The same formula along with the centered finite-difference approximation of first-order derivatives was used in the approximation of the elastic wave equation in cylindrical coordinates [58] and in 2D Cartesian coordinates [113]. Alford and co-authors proposed schemes with fourth-order accuracy in space for both acoustic [59] and elastic [114] wave equations, with the purpose of improving accuracy (in particular, reducing numerical dispersion).
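The "magic" time step \(c\Delta t=\Delta x\), at which scheme (66) propagates d'Alembert solutions exactly, can be verified directly; the grid size and pulse position below are illustrative.

```python
import numpy as np

# Hedged sketch of scheme (64) for the 1D scalar wave equation with f = 0,
# homogeneous Dirichlet boundaries, at the magic time step c*dt = dx.
N = 101
g = np.zeros(N); g[50] = 1.0       # initial displacement, zero initial velocity
u_prev = g.copy()
u = np.zeros(N)
u[1:-1] = 0.5*(g[:-2] + g[2:])     # exact first step for zero initial velocity
for _ in range(9):                 # 9 more steps -> t = 10*dt
    u_new = np.zeros(N)
    u_new[1:-1] = u[:-2] + u[2:] - u_prev[1:-1]   # update (66)
    u_prev, u = u, u_new
# expect two half-amplitude pulses at j = 40 and j = 60, zero in between
```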

For the velocity-stress formulation (27) of the elastic wave equation, the use of staggered grids [115] is a standard practice [116,117,118,119]. A related approach seeks second-order accurate finite-difference formulas with a lower number of grid points, leading to simplicial grids [120]. Igel [2] illustrates staggered grids with the 1D elastic wave equation (2), whose velocity-stress form is

$$\begin{aligned} \rho \frac{\partial }{\partial t}{v} = \frac{\partial {\sigma }}{\partial x} + f ; \quad \frac{\partial }{\partial t}{\sigma } = \mu \frac{\partial {v}}{\partial x}. \end{aligned}$$
(67)

The space and time derivatives may be approximated by centered finite differences with spacing \(\Delta x/2\) and \(\Delta t/2\), respectively (Fig. 1):

$$\begin{aligned} \rho _j\frac{v_j^{n+\frac{1}{2}}-v_j^{n-\frac{1}{2}}}{\Delta t} = \frac{\sigma _{j+\frac{1}{2}}^n-\sigma _{j-\frac{1}{2}}^n}{\Delta x} + f_j^n ; \quad \frac{\sigma _{j+\frac{1}{2}}^{n+1}-\sigma _{j+\frac{1}{2}}^{n}}{\Delta t} = \mu _{j+\frac{1}{2}} \frac{v_{j+1}^{n+\frac{1}{2}}-v_j^{n+\frac{1}{2}}}{\Delta x}. \end{aligned}$$
(68)
Fig. 1

Stencil of the staggered scheme (68). Circles and squares represent the nodes where velocity and stress are evaluated, respectively

In this manner, v and \(\sigma \) are computed on disjoint spatial grids with spacing \(\Delta x\) rather than \(\Delta x/2\), reducing computer memory requirements. In higher dimensions, one may need to interpolate stress components from separate grids to evaluate stress-strain relations [8].
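A minimal sketch of the 1D staggered update (68), assuming \(\rho =\mu =1\), a periodic grid, and an illustrative CFL number of 0.5; stable leapfrog stepping keeps the solution bounded over many steps.

```python
import numpy as np

# Hedged sketch of the 1D staggered velocity-stress scheme (68) with
# rho = mu = 1 on a periodic grid; sizes and CFL number are illustrative.
N = 100
dx = 1.0/N
dt = 0.5*dx                        # CFL number 0.5 (stable for c = 1)
x = np.arange(N)*dx
v = np.sin(2*np.pi*x)              # velocity at integer nodes x_j
s = np.zeros(N)                    # stress at half nodes x_{j+1/2}

for _ in range(1000):
    # v update uses d(sigma)/dx centered at x_j; roll() enforces periodicity
    v += dt/dx*(s - np.roll(s, 1))
    # sigma update uses dv/dx centered at x_{j+1/2}
    s += dt/dx*(np.roll(v, -1) - v)
```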

The staggered formulation has been applied to anisotropic [121,122,123], viscoelastic [124,125,126], and poroelastic [127,128,129] models in velocity-stress form as in (67). Wave equations in second-order form may also be considered through an equivalent staggered-grid scheme [130]. Another extension is given by rotated staggered grids [131], which have proven useful for anisotropic media, especially in the study of shear-wave splitting [132]. Staggered grids handle boundary conditions and interfaces more easily [117, 119, 133,134,135], and their numerical error is less sensitive to Poisson's ratio [119]. They also provide a natural framework for implementing non-uniform or discontinuous grids [136,137,138,139].

As illustrated in scheme (64), the mass matrix is typically the identity matrix in finite-difference methods, thus it is natural to consider explicit temporal discretization schemes. On the other hand, implicit schemes require an efficient implementation to handle the additional cost of solving linear systems, and a common approach is to use sequential splitting techniques [140].

Splitting methods have their roots in the classical alternating-direction implicit (ADI) [141, 142] and fractional-step [143] methods, which have been gathered as locally one-dimensional (LOD) methods [144]. These techniques were extended from parabolic to hyperbolic problems [145,146,147], and perhaps the first high-order ADI method for the acoustic wave equation is due to Ciment and Leventhal [148]. More recently, splitting methods have been employed in seismic wave-propagation problems [81, 103, 149]. Wave equations with viscous terms have been considered in [150].

Another splitting approach due to Strang [151] and Marchuk [152] has also been adopted in wave-propagation problems [153]. A related technique has been employed to separate the stiff from the nonstiff part of the velocity-pressure poroacoustic wave equations, so that Biot’s slow wave can be numerically modeled [154]. A theoretical framework to analyze splitting methods for general second-order systems of ODEs has been proposed in [155].

For instance, let us review the splitting of scheme (58) following [81], in the case where the spatial operator is the two-dimensional, second-order finite-difference operator

(69)

where the terms \(u_{j,k}^{n}\) and \(w_{j,k}^{n}\) are given as follows:

$$\begin{aligned} u_{j,k}^{n} = c^2_{j,k}\frac{v_{j-1,k}^{n}-2v_{j,k}^{n}+v_{j+1,k}^{n}}{\Delta x^2},\quad w_{j,k}^{n} = c^2_{j,k}\frac{v_{j,k-1}^{n}-2v_{j,k}^{n}+v_{j,k+1}^{n}}{\Delta y^2}. \end{aligned}$$
(70)

Note that \(u_{j,k}^{n}\) and \(w_{j,k}^{n}\) involve approximations along the \(x\)- and \(y\)-directions, respectively. Scheme (58) is approximated as the following three-step formula:

(71a)
(71b)
(71c)

Note that step (71a) is explicit whereas steps (71b)–(71c) involve tridiagonal linear systems. The error of (71) with respect to (58) is \({\mathcal {O}}(\Delta t^4)\) [81].

As pointed out by Emerman et al. [156], schemes such as (71) may allow larger time steps than explicit schemes but have very low accuracy. Thus implicit methods must employ high-order finite-difference approximations (see, e.g., [103]) in order to become competitive.

A well-known limitation of traditional finite-difference methods is that they handle irregular interfaces, topography, and boundary conditions poorly, and several approaches have been proposed to circumvent these difficulties [157]. Alterman and Nathaniel have addressed the case of a constant slope by means of a change of coordinates [158]. Ilan has adapted the work in [158] to standard Cartesian coordinates and, by allowing a non-uniform grid, has extended it to polygonal topography. It is worth noting that several schemes have been proposed for non-uniform finite-difference grids [159,160,161]. Jih et al. [162] revisited the same issue and proposed local changes of coordinates, which allow the use of a uniform grid. This technique evolved to boundary-conforming grids defined by curvilinear coordinates [163,164,165,166,167], motivated by the work of Fornberg [168]. Additional approaches are hybrid finite-difference and finite-element/discrete-wavenumber methods [169], the vacuum method [170], the interface method [171, 172], and the use of non-matching grids [173].

5.2 Pseudospectral methods

As mentioned in Sect. 3, another approach to discretize the initial-boundary value problems arising from wave-propagation models is to employ the finite expansion (39). The schemes that follow this approach are known as spectral methods [174,175,176], and are essentially characterized by the choice of the basis functions \(\phi _j({{\varvec{x}}})\) and the way to determine the expansion coefficients \({\hat{{{{\varvec{u}}}}}}_j(t)\) (\(1\le j\le M\)).

The classical choices for the approximation space are orthogonal trigonometric or polynomial functions, while the approaches to determine the expansion coefficients are classically divided into tau [177], Galerkin [178], and collocation [179] methods.

Both tau and Galerkin methods choose the coefficients such that the solution satisfies the variational formulation in the approximation space, and they differ in how boundary conditions are handled. Currently, the tau method is seldom used [180]. The Galerkin method is better known in the form of finite- and spectral-element methods, which use piecewise-polynomial interpolation basis functions and are discussed later on. The Galerkin technique has also been proposed with wavelet basis functions [181,182,183,184,185]. Global orthogonal polynomials are rarer and are mostly applied to waves in fluids [186]. On the other hand, spectral methods with the collocation technique, which became known as pseudospectral methods [187], have achieved great popularity thanks to the Fast Fourier Transform (FFT) algorithm [188].

In the one-dimensional case, a pseudospectral method is essentially defined from a set of functions \(\phi _0,\ldots ,\phi _{N}\) that are orthogonal with respect to some inner product \(\left\langle \cdot ,\cdot \right\rangle \) and collocation points \(x_0,\ldots ,x_{N}\) that are chosen such that the orthogonal projection

$$\begin{aligned} u_{N}(x) = \sum _{j=0}^{N}{\hat{u}}_j\phi _j(x), \quad {\hat{u}}_j=\frac{\left\langle u,\phi _j\right\rangle }{\left\langle \phi _j,\phi _j\right\rangle }, \end{aligned}$$
(72)

which corresponds to the best approximation of a function u(x) in the vector space spanned by \(\{ \phi _0,\ldots ,\phi _{N}\}\) [1], satisfies \(u_{N}(x_i)=u(x_i)\) for \(0\le i\le N\).

In seismic wave propagation, the earliest works are due to Gazdag [45] and to Kosloff and Baysal [60]; the latter was generalized to account for anisotropy [189] and viscosity [19, 190]. These works considered Cartesian grids with uniformly spaced collocation points and complex Fourier basis functions, and the resulting method can be interpreted as the limit of finite differences with infinite order of accuracy [191].

In the Fourier pseudospectral method, the basis functions are \(\phi _j(x)=\exp (i\kappa (j)x)\), with \(\kappa (j) = 2\pi j/(N\Delta x)\). The corresponding collocation points are \(x_j = a + j\Delta x\), which have uniform spacing \(\Delta x = (b-a)/N\) over the interval \([a,b]\). For convenience, the indices run from 0 to \(N-1\) so that \(N\) coincides with the number of subintervals. The relationship between \(u_{N}(x_j)\) and \({\hat{u}}_j\) can be written in terms of the discrete Fourier transform pair

$$\begin{aligned} \hat{v}_{j} = \sum _{k=0}^{N-1}v_{k}\, e^{-i\frac{2\pi }{N}jk},\quad v_{j} = \frac{1}{N}\sum _{k=0}^{N-1}\hat{v}_{k}\, e^{i\frac{2\pi }{N}jk}. \end{aligned}$$
(73)

The expansion coefficients are determined from a system of algebraic equations that is obtained by evaluating the differential equations at the collocation points, which requires evaluating derivatives of the expansion \(u_{N}\). The Fourier method is more naturally derived by approximating the computation of spatial derivatives in the frequency domain. For instance, the approximate solution to (6) with leapfrog time discretization in two dimensions [45] can be written as

$$\begin{aligned} u^{n+1}_{j,k} = 2u^{n}_{j,k}-u^{n-1}_{j,k} + \Delta t^2F^n_{j,k} + V_{j,k}^2\Delta t^2(D_{X}u^{n}_{j,k} + D_{Z}u^{n}_{j,k}), \end{aligned}$$
(74)

where \((x_j,z_k) = (j\Delta x,k\Delta z)\) \((0\le j< N_x,\; 0\le k< N_z)\), \(t_n = n\Delta t\) \((0\le n< N)\), and

$$\begin{aligned} D_{X}u^n_{j,k}= & {} \frac{-1}{N_x}\frac{1}{N_z} \sum _{{\hat{\jmath }}=-N_x/2}^{N_x/2-1}\sum _{{\hat{k}}=-N_z/2}^{N_z/2-1} \kappa ^2_x({\hat{\jmath }})\, {\hat{u}}^n_{{\hat{\jmath }},{\hat{k}}}\, e^{i[\kappa _x({\hat{\jmath }})x_j + \kappa _z({\hat{k}})z_k]}, \end{aligned}$$
(75a)
$$\begin{aligned} D_{Z}u^n_{j,k}= & {} \frac{-1}{N_x}\frac{1}{N_z} \sum _{{\hat{\jmath }}=-N_x/2}^{N_x/2-1}\sum _{{\hat{k}}=-N_z/2}^{N_z/2-1} \kappa _z^2({\hat{k}})\, {\hat{u}}^n_{{\hat{\jmath }},{\hat{k}}}\, e^{i[\kappa _x({\hat{\jmath }})x_j + \kappa _z({\hat{k}})z_k]}, \end{aligned}$$
(75b)

with \(\kappa _x({\hat{\jmath }}) = 2\pi {\hat{\jmath }}/(N_x\Delta x)\) and \(\kappa _z({\hat{k}}) = 2\pi {\hat{k}}/(N_z\Delta z)\), while \({\hat{u}}^n_{{\hat{\jmath }},{\hat{k}}}\) is defined by the 2D discrete Fourier transform

$$\begin{aligned} {\hat{u}}^n_{{\hat{\jmath }},{\hat{k}}} = \sum _{j=0}^{N_x-1}\sum _{k=0}^{N_z-1} u^n_{j,k}\, e^{-i[\kappa _x({\hat{\jmath }})x_j + \kappa _z({\hat{k}})z_k]}. \end{aligned}$$
(76)

The calculations from (75) and (76) can be efficiently carried out with the FFT algorithm. This algorithm requires \({\mathcal {O}}(N\log N)\) operations, which is significantly lower than the \({\mathcal {O}}(N^2)\) operations of matrix-vector multiplication for large \(N\). On the other hand, this method assumes periodic boundary conditions, demanding additional strategies to implement realistic ones (see, e.g. [192]).
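The core operation in (74)-(76) is a spectral second derivative computed with FFTs. A 1D sketch with an illustrative band-limited test function, for which the result is exact to machine precision:

```python
import numpy as np

# Hedged sketch: spectral second derivative via FFT, the core operation in
# (74)-(76), here in 1D on a periodic grid; N and the test function are
# illustrative.
N = 64
L = 2*np.pi
x = np.arange(N)*L/N
k = 2*np.pi*np.fft.fftfreq(N, d=L/N)   # angular wavenumbers kappa(j)

u = np.sin(3*x)
d2u = np.real(np.fft.ifft(-(k**2)*np.fft.fft(u)))
# exact second derivative is -9 sin(3x); spectral accuracy for this
# band-limited function
err = np.max(np.abs(d2u + 9*np.sin(3*x)))
```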

To circumvent this problem, Raggio [193] proposed the Chebyshev pseudospectral method. Chebyshev basis functions have also been used in only one spatial direction, either on polar [85] or Cartesian [68] coordinates in order to better handle free-surface boundary conditions. Its extension to three-dimensional problems can be found in [194].

The Chebyshev basis functions in the interval \([-1,1]\) are \(T_j(x)=\cos (j\cos ^{-1}(x))\), while the collocation points are \(x_j = \cos (j\pi /N)\), \(0\le j\le N\). Herein, \(N\) denotes the maximal polynomial degree. The coefficients \({\hat{u}}_j\) in (72) are

$$\begin{aligned} {\hat{u}}_j = \frac{2}{N}\sum _{l=0}^N\frac{1}{c_lc_j}u(x_l)T_j(x_l),\quad c_j = \left\{ \begin{array}{ll} 1, &{} 0< j < N, \\ 2, &{} j=0,N. \end{array}\right. \end{aligned}$$
(77)

These coefficients can be written as

$$\begin{aligned} {\hat{u}}_j = \sum _{l=0}^Nv_l\frac{1}{c_j}\cos \left( \frac{jl\pi }{N}\right) ,\quad v_l = \frac{2}{Nc_l}u(x_l), \end{aligned}$$
(78)

which is related to the real part of the discrete Fourier transform. The coefficients of the derivatives of \(u_{N}\) can be computed from \({\hat{u}}_j\) through recursive relations [195].

The fact that the collocation points \(x_j\) are clustered at the boundary implies that the distance between grid points can be very small, which in turn leads to a small time step as well. For this reason, a stretching transformation should be employed [196, 197]. An alternative approach uses the tau method with Legendre polynomials [198], but the Chebyshev pseudospectral method has become more popular as it can be implemented with the FFT algorithm.
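The clustering of the Chebyshev points near the endpoints, and hence the \({\mathcal {O}}(1/N^2)\) minimal spacing that motivates the stretching transformation, can be checked directly; the degree below is illustrative.

```python
import numpy as np

# Hedged illustration: Chebyshev collocation points x_j = cos(j*pi/N)
# cluster near the endpoints, so the minimal spacing shrinks like O(1/N^2).
N = 64
xj = np.cos(np.arange(N + 1)*np.pi/N)
min_gap = np.min(np.abs(np.diff(xj)))
uniform_gap = 2.0/N
# near the boundary: |x_1 - x_0| = 1 - cos(pi/N) ~ (pi/N)^2 / 2
```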

Like finite-difference methods, pseudospectral methods benefit from staggered grids [199,200,201]. Their classical implementations were also limited to regular grids, and one alternative to avoid this restriction is to employ curvilinear coordinates [168, 202]. Another approach is to resort to domain decomposition, which was proposed for the elastic wave equation initially for isotropic media [203, 204] and later extended to viscoelastic [205, 206] and poroelastic [207] media. A further benefit of using domain decomposition in pseudospectral methods is that the resulting wave operator is not entirely global, avoiding non-causal interactions of the propagating wavefield with parameter discontinuities in the model [200].

The Fourier pseudospectral method has recently been generalized to handle fractional derivatives, which are useful to model attenuating media without the need for memory variables [208,209,210].

5.3 Finite-element methods

As mentioned in the previous section, finite-element methods [63, 64] belong to the family of Galerkin methods, and typically use continuous Lagrange interpolation basis functions, which are associated with the spatial grid. The earliest applications of finite elements to seismic wave-propagation problems are due to Lysmer and Drake [211] in the frequency domain, and to Smith [212] in the time domain. Later works addressed several relevant aspects of seismic modeling with finite elements [5, 213, 214].

In the following we describe a finite-element approximation of the variational problem (37). The first step is to decompose the spatial domain \(\Omega \) into \(n_e\) non-overlapping elements \(\Omega ^e\) such that \(\Omega =\cup \Omega ^e\). Physical elements \(\Omega ^e\) are mapped through a transformation onto a reference element \({\hat{\Omega }}\) where computations are actually performed. The approximate solution may be written as

(79)

where the union operator denotes that \({\tilde{{{{\varvec{u}}}}}}\) is defined in \(\Omega \) and \({\tilde{{{{\varvec{u}}}}}}({{\varvec{x}}},\cdot ) = {\tilde{{{{\varvec{u}}}}}}^e({{\varvec{x}}},\cdot )\) for any \({{\varvec{x}}}\in \Omega ^e\). Moreover, \(n_{dof}\) is the number of element degrees of freedom. Each basis function is obtained by mapping to \(\Omega ^e\) the Lagrange interpolation vector function associated with the j-th degree of freedom in a space of polynomial, vector-valued functions in \({\hat{\Omega }}\). For instance, if the polynomial degree is one and the spatial dimension is two with \({\hat{\Omega }}=[-1,1]\times [-1,1]\), then the reference basis functions are the bilinear interpolation functions

(80a)
(80b)
(80c)
(80d)

where \(\varphi _1(\zeta )=(1-\zeta )/2\) and \(\varphi _2(\zeta )=(1+\zeta )/2\), for \(\zeta \in [-1,1]\). The coefficients \({\tilde{u}}_{j}^e(t)\) are determined from the following Galerkin approximation of (37):

(81)

where \(\tilde{V}\) is the subspace of \(H^1_0(\Omega )^d\) of continuous piecewise-polynomial functions built from the local basis functions above. After algebraic manipulations, the Galerkin equations (81) are written as the system of ordinary differential equations

(82)

where \({{\varvec{M}}}\), \({{\varvec{K}}}\), and \({{\varvec{F}}}\) are the global mass matrix, stiffness matrix, and load vector, respectively. Initial conditions \({{\varvec{U}}}(0)={{\varvec{U}}}_0\) and \(\dot{{{\varvec{U}}}}(0)=\dot{{{\varvec{U}}}}_0\) must be provided. The global arrays are built through a summation process of elemental matrices and vectors,

(83)

where \(\tilde{{{\varvec{M}}}}^e\), \(\tilde{{{\varvec{K}}}}^e\), and \(\tilde{{{\varvec{F}}}}^e\) are the elemental mass matrix, stiffness matrix, and load vector in sparse global form, that is, only non-zero entries are used and they are mapped into appropriate global locations by a connectivity map from local to global nodes [215]. The dense elemental arrays \({{\varvec{M}}}^e\), \({{\varvec{K}}}^e\), and \({{\varvec{F}}}^e\) are defined by the contributions from element \(\Omega ^e\) to the integrals in (81),

(84)

for \(1\le i,j\le n_{dof}\). These integrals are computed in the reference element \({\hat{\Omega }}\) through standard changes of variables [64].
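The assembly process (83) can be sketched for 1D linear elements with unit material coefficients; the mesh size and the closed-form element matrices below correspond to this illustrative setting, not to a particular reference.

```python
import numpy as np

# Hedged sketch of the assembly loop (83) for 1D linear elements on [0, 1]
# with unit material coefficients; n_e is an illustrative mesh size.
n_e = 10
h = 1.0/n_e
# elemental matrices for linear shape functions on an element of length h
Me = h/6*np.array([[2.0, 1.0], [1.0, 2.0]])      # mass
Ke = 1/h*np.array([[1.0, -1.0], [-1.0, 1.0]])    # stiffness
M = np.zeros((n_e + 1, n_e + 1))
K = np.zeros((n_e + 1, n_e + 1))
for e in range(n_e):
    idx = [e, e + 1]                              # connectivity map
    M[np.ix_(idx, idx)] += Me
    K[np.ix_(idx, idx)] += Ke
# sum of all entries of M integrates 1*1 over [0, 1]; K annihilates constants
```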

In general, finite-element basis functions are defined from linear, quadratic and cubic polynomials over triangular or quadrilateral elements (tetrahedral or hexahedral in 3D, though other elements such as pyramids and wedges can be used [216]). Low-order finite-element methods are comparable to centered finite differences [213] and thus have low accuracy, but can be useful when the problem geometry leads to highly refined and irregular meshes [217, 218]. One alternative to improve the performance of low-order finite elements is to enrich the approximation space [219,220,221,222]. It is worth noticing that isogeometric analysis, a finite-element approach integrated with computer-aided design [223], has been shown to provide a better geometric representation than traditional finite elements. Several authors have studied this technique in wave-propagation problems [224,225,226].

5.4 Spectral-element methods

Unlike pseudospectral methods, the use of high-order finite elements was not common in the literature of computational seismology at least until the 1990s, mostly because there were concerns about their accuracy [227].

Standard high-order finite-element bases rely on polynomial interpolation at equally spaced points, which is an ill-conditioned problem [228, 229]. This can be noticed from the behavior of Lagrange basis functions of degree N on the reference element \([-1,1]\) with equally spaced points \(-1+2j/N\), \(0\le j\le N\). As shown in Fig. 2, the Lagrange function associated with the element midpoint begins to oscillate as N increases, similarly to the Runge phenomenon.

Fig. 2

Lagrange basis functions of degree \(N=2,4,8\) with equally spaced points
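The oscillation of the midpoint basis function can be quantified by evaluating it on a fine grid and recording its maximum; the degrees and grid resolution below are illustrative.

```python
import numpy as np

# Hedged illustration: the Lagrange basis function attached to the midpoint
# of equally spaced nodes on [-1, 1] develops large oscillations as the
# degree N grows (Runge-type behavior); degrees are illustrative.
def midpoint_basis_max(N):
    nodes = np.linspace(-1.0, 1.0, N + 1)
    i = N//2
    zz = np.linspace(-1.0, 1.0, 2001)
    num = np.prod([zz - nodes[j] for j in range(N + 1) if j != i], axis=0)
    den = np.prod([nodes[i] - nodes[j] for j in range(N + 1) if j != i])
    return np.max(np.abs(num/den))

growth = [midpoint_basis_max(N) for N in (4, 8, 16)]
```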

A classical approach to circumvent this problem is to use the Chebyshev collocation points

$$\begin{aligned} \zeta _j = -\cos (j\pi /N), \quad 0\le j\le N, \end{aligned}$$
(85)

rather than equally spaced points, as shown in Fig. 3.

Fig. 3

Lagrange basis functions of degree \(N=2,4,8\) with Chebyshev points

The use of Chebyshev points as well as other orthogonal polynomial roots was initially regarded as unnecessary [230], but in the work of Patera and co-workers [231,232,233], these points provided the link between finite-element and pseudospectral methods, with the appeal of having the geometric flexibility of the former and the rapid convergence properties of the latter.

Later on, the spectral-element method with Chebyshev collocation points was adapted to wave-propagation problems [69, 234,235,236,237], where its low numerical dispersion was pointed out. Another strand, which evolved from multi-domain pseudospectral methods [238] and high-order, mass-lumped finite-element methods [239], led to the spectral-element methods with Legendre collocation points [240,241,242]. Moreover, Laguerre spectral elements have been proposed to handle infinite domains [243]. Jacobi polynomials have also been employed, but are less common [244].

Spectral elements are usually quadrilateral or hexahedral, so that the Lagrange shape functions defined in the reference element \({\hat{\Omega }}=[-1,1]^2\) or \([-1,1]^3\) are built from tensor products of one-dimensional Lagrange shape functions \(\varphi _i(\zeta )\) of degree N such that \(\varphi _i(\zeta _j)=\delta _{i,j}\), as in (80). The collocation points \(\zeta _i\) \((0\le i\le N)\) are given by (85) for Chebyshev elements, while for Legendre elements these points are the solutions to

$$\begin{aligned} (1-\zeta ^2)L_N'(\zeta ) = 0, \end{aligned}$$
(86)

where \(L_N(\zeta )\) is the Nth degree Legendre polynomial [1].

The standard implementations of spectral-element methods with Chebyshev (SEM-GLC) and Legendre (SEM-GLL) collocation points may be classified as consistent and lumped finite elements, respectively. In other words, the mass matrix in (83) for Chebyshev elements is calculated without approximations (assuming piecewise constant density) and is non-diagonal, while for Legendre elements this matrix is approximated through reduced integration by a diagonal matrix. SEM-GLL has been adapted to triangular meshes [245,246,247,248], but the selection of collocation points is more complex and may lead to non-diagonal mass matrices.

The computation of elemental matrices in SEM-GLC is based on the properties of Chebyshev polynomials \(T_j(\cos \theta )=\cos (j\theta )\). In particular, the Lagrange shape functions \(\varphi _i\) \((0\le i\le N)\) are obtained by choosing u in (72) such that \(u(x_j)=\delta _{i,j}\). It follows from (72) and (77) that

$$\begin{aligned} \varphi _i(\zeta ) = \frac{2}{N}\sum _{j=0}^N\frac{1}{c_ic_j}T_j(\zeta _i)T_j(\zeta ) = \sum _{j=0}^Ns_{i,j}T_j(\zeta ),\quad c_j = \left\{ \begin{array}{ll} 1, &{} 0< j < N, \\ 2, &{} j=0,N, \end{array}\right. \end{aligned}$$
(87)

so that the entries of the elemental mass matrix in \([-1,1]\) are

$$\begin{aligned} \hat{M}_{i,j} = \int _{-1}^1 \varphi _i(\zeta )\varphi _j(\zeta )\, d\zeta = \sum _{k=0}^N\sum _{l=0}^Ns_{i,k}\,s_{j,l}\int _{-1}^1T_k(\zeta )T_l(\zeta )\, d\zeta . \end{aligned}$$
(88)

The integral \(\int _{-1}^1T_k(\zeta )T_l(\zeta )\, d\zeta \) in (88) equals 0 if \(k+l\) is odd, and \((1-(k+l)^2)^{-1}+(1-(k-l)^2)^{-1}\) otherwise [215, 231]. These formulas have been generalized to take into account variable material properties, which are represented by expansions in basis functions that do not necessarily coincide with those from the wave field [215, 249].

Because fully discrete SEM-GLC schemes are implicit in time, they need efficient linear-system solvers. Some useful strategies are the element-by-element formulation and suitable factorizations of matrix-vector products [250, 251]. Moreover, unconditionally stable time-integration schemes should be chosen to allow the use of large time steps.

For SEM-GLL, the standard practice is to employ the GLL quadrature

$$\begin{aligned} \int _{-1}^1 f(\zeta )\, d\zeta \approx \sum _{k=0}^Nf(\zeta _k)w_k,\quad w_k = \frac{2}{(N+1)N}\frac{1}{L_N^2(\zeta _k)}, \end{aligned}$$
(89)

with \(\zeta _j\) satisfying (86). This formula is exact if f is a polynomial of degree \(\le 2N-1\). In particular, the calculation

$$\begin{aligned} \hat{M}^{GLL}_{i,j} = \sum _{k=0}^N\varphi _i(\zeta _k)\varphi _j(\zeta _k)w_k \approx \int _{-1}^1 \varphi _i(\zeta )\varphi _j(\zeta )\, d\zeta \end{aligned}$$
(90)

is not exact, since \(\varphi _i\varphi _j\) has degree 2N. On the other hand, since \(\varphi _i(\zeta _j) = \delta _{i,j}\), we have \(\hat{M}^{GLL}_{i,j} = w_j\delta _{i,j}\), i.e., the elemental mass matrix is diagonal. The diagonality of the mass matrix was crucial to the success of spectral elements in computational seismology.
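A minimal sketch of the GLL points (86), weights (89), and the resulting lumped mass matrix (90), using NumPy's Legendre utilities; the degree is illustrative.

```python
import numpy as np
from numpy.polynomial import legendre as leg

# Hedged sketch: Gauss-Lobatto-Legendre points (86), weights (89), and the
# diagonal (lumped) elemental mass matrix (90); N is an illustrative degree.
N = 4
cN = np.zeros(N + 1); cN[N] = 1.0             # coefficients of L_N
# interior points: roots of L_N'; endpoints are -1 and 1
zeta = np.concatenate(([-1.0], leg.legroots(leg.legder(cN)), [1.0]))
w = 2.0/(N*(N + 1)*leg.legval(zeta, cN)**2)   # GLL weights
M_hat = np.diag(w)                            # lumped elemental mass matrix
# the rule integrates polynomials of degree <= 2N-1 exactly on [-1, 1]
I_x2 = np.sum(w*zeta**2)                      # exact value is 2/3
```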

The spectral-element method has been implemented for anisotropic, visco- and poroelastic wave propagation [61, 252, 253] and the advent of collaborative codes and platforms [254, Appendix A] has encouraged its application in several studies of regional and global seismology [6, 255,256,257]. This method has also been widely used in conjunction with adjoint methods [258, 259].

5.5 Finite-volume methods

The finite-volume method for elastic wave propagation was initially proposed by Dormy and Tarantola [260] for the velocity-displacement formulation (28) and later by Tadi [261] for the second-order formulation (15).

In [260], the main motivation to introduce the finite-volume method was to generalize minimum grid, second-order finite differences [120] to unstructured grids and irregular boundaries. The key idea was to use the divergence theorem to obtain derivative estimates of a field from its values at surrounding grid points, rather than Taylor series expansions. On the other hand, the main concern in [261] was the enforcement of traction boundary conditions.

By using the ADER method [73, 74] and a reconstruction algorithm to introduce high-order numerical fluxes, Dumbser et al. [262] obtained higher accuracy in both space and time. Later on, Zhang and co-authors [263, 264] proposed a high-order finite-volume method that combines the reconstruction algorithm from [262] with the element-subdivision algorithm from the spectral volume method [265]. The finite-volume method has been used in studies of dynamic rupture [266] and of hypoelastic [267] and poroelastic [268] media.

As in [269], let us introduce the finite-volume methods through the one-dimensional conservation law

$$\begin{aligned} \dot{u}(x,t) + \frac{\partial {f(u(x,t))}}{\partial x} = 0. \end{aligned}$$
(91)

Given a uniform spatial grid with spacing \(\Delta x\), let \(\mathcal {C}_j = [x_{j-1/2},x_{j+1/2}]\) be a grid cell (or finite volume) centered at node \(x_j\). By integrating both sides of (91) over \(\mathcal {C}_j\), we find

$$\begin{aligned} \frac{\partial }{\partial t}{}\int _{\mathcal {C}_j}u(x,t)\, dx = f(u(x_{j-1/2},t)) - f(u(x_{j+1/2},t)). \end{aligned}$$
(92)

A subsequent integration from \(t_n\) to \(t_{n+1}\) yields

$$\begin{aligned} \int _{\mathcal {C}_j}u(x,t_{n+1})\, dx - \int _{\mathcal {C}_j}u(x,t_n)\, dx = \int _{t_n}^{t_{n+1}}f(u(x_{j-1/2},t))-f(u(x_{j+1/2},t))\, dt, \end{aligned}$$
(93)

which can be written as

$$\begin{aligned} U_j^{n+1} - U_j^n = \frac{\Delta t}{\Delta x}\left( F_{j-1/2}^n - F_{j+1/2}^n\right) , \end{aligned}$$
(94)

where

$$\begin{aligned} U_j^n = \frac{1}{\Delta x}\int _{\mathcal {C}_j}u(x,t_{n})\, dx \end{aligned}$$
(95)

is the average approximation of the unknown field \(u(x,t_n)\) at \(\mathcal {C}_j\), while

$$\begin{aligned} F_{j\pm 1/2}^n = \frac{1}{\Delta t}\int _{t_n}^{t_{n+1}}f(u(x_{j\pm 1/2},t))\, dt \end{aligned}$$
(96)

are the average fluxes at \(x_{j\pm 1/2}\). Assuming that \(F_{j\pm 1/2}^n\) approximately depends only on the adjacent average values \(U_j^n\) and \(U_{j\pm 1}^n\), we arrive at the finite-volume approximation of (94):

$$\begin{aligned} U_j^{n+1} = U_j^n + \frac{\Delta t}{\Delta x}\left( \mathcal {F}(U_{j-1}^n,U_{j}^n)-\mathcal {F}(U_j^n,U_{j+1}^n)\right) , \end{aligned}$$
(97)

where the numerical-flux function \(\mathcal {F}(U_-,U_+)\) approximates the average fluxes (96). For instance, let us consider the linear case \(f(u)=au\) with \(a>0\), which is a one-way wave equation. Taking into account that the information propagates from the left to the right for \(a>0\), an effective choice is the upwind flux \(\mathcal {F}(U_-,U_+) = f(U_-)=aU_-\), leading to the scheme

$$\begin{aligned} U_j^{n+1} = U_j^n - \frac{a\Delta t}{\Delta x}\left( U_{j}^n-U_{j-1}^n\right) , \end{aligned}$$
(98)

which coincides with the classical upwind finite-difference method. In the case where a may change sign, we choose \(\mathcal {F}(U_-,U_+) = \max \{a,0\}U_- + \min \{a,0\}U_+\), so that

$$\begin{aligned} U_j^{n+1} = U_j^n - \frac{\Delta t}{\Delta x}\left( \max \{a,0\}\Delta U_{j-1/2}^n+\min \{a,0\}\Delta U_{j+1/2}^n\right) . \end{aligned}$$
(99)

The jumps \(\Delta U_{j-1/2}^n = U_{j}^n-U^n_{j-1}\) and \(\Delta U_{j+1/2}^n = U_{j+1}^n-U^n_{j}\) can be interpreted as waves moving across cells \(\mathcal {C}_j\) and \(\mathcal {C}_{j+1}\) in opposite directions; in general, the numerical flux is driven by the solution of a Riemann problem [269].
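As an illustration of the upwind scheme (98), the sketch below advects a pulse on a periodic grid; the Courant number, resolution, and Gaussian initial profile are arbitrary choices.

```python
import numpy as np

# Upwind finite-volume scheme for u_t + a u_x = 0 with a > 0, periodic domain
a, nx = 1.0, 100
dx = 1.0 / nx
dt = 0.5 * dx / a                       # Courant number r = a dt/dx = 0.5
x = (np.arange(nx) + 0.5) * dx          # cell centers
U = np.exp(-100.0 * (x - 0.5) ** 2)     # initial cell averages
mass0 = U.sum()

r = a * dt / dx
for _ in range(40):                     # advance to t = 40 dt = 0.2
    U = U - r * (U - np.roll(U, 1))     # upwind flux difference a (U_j - U_{j-1})

# the pulse has moved by a*t = 0.2, smeared by first-order numerical diffusion,
# while the total cell average is conserved by the flux-difference form
```

The conservative flux-difference form preserves the sum of the cell averages exactly on a periodic grid, while the first-order upwind flux introduces numerical diffusion that smears the pulse.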

Let us now consider the 1D elastic wave equation (67) in the homogeneous case, which can be written in a matrix form similar to (91):

$$\begin{aligned} \dot{{{\varvec{u}}}}(x,t) + A\frac{\partial {{{\varvec{u}}}}(x,t)}{\partial x} = {{\varvec{0}}}. \end{aligned}$$
(100)

From the eigenvalue decomposition \(A = R\Lambda R^{-1}\), with \(\Lambda = \text{ diag }(c_S,-c_S)\), the vector \({{\varvec{w}}} = R^{-1}{{\varvec{u}}} = (w_+,w_-)^T\) satisfies

$$\begin{aligned} \dot{{{\varvec{w}}}}(x,t) + \Lambda \frac{\partial {{{\varvec{w}}}}(x,t)}{\partial x} = {{\varvec{0}}}. \end{aligned}$$
(101)

We can apply the upwind scheme (99) to either \(w_+\) or \(w_-\):

$$\begin{aligned} W_{+,j}^{n+1} = W_{+,j}^n - \frac{\Delta t}{\Delta x}c_S\Delta W_{+,j-1/2}^n, \quad W_{-,j}^{n+1} = W_{-,j}^n - \frac{\Delta t}{\Delta x}(-c_S)\Delta W_{-,j+1/2}^n. \end{aligned}$$
(102)

The matrix form of (102) yields an upwind scheme for \({{\varvec{w}}}\):

$$\begin{aligned} {{\varvec{W}}}_{j}^{n+1} = {{\varvec{W}}}_{j}^n - \frac{\Delta t}{\Delta x}\left( \left[ \begin{array}{cc} c_S &{} 0 \\ 0 &{} 0 \end{array}\right] \Delta {{\varvec{W}}}_{j-1/2}^n + \left[ \begin{array}{cc} 0 &{} 0 \\ 0 &{} -c_S \end{array}\right] \Delta {{\varvec{W}}}_{j+1/2}^n \right) . \end{aligned}$$
(103)

By multiplying both sides of (103) by \(R\) on the left and replacing the jumps \(\Delta {{\varvec{W}}}_{j\pm 1/2}^n\) with the corresponding jumps \(\Delta {{\varvec{U}}}_{j\pm 1/2}^n\) in the average approximations of \({{\varvec{u}}}(x,t_n)\), we arrive at the scheme

$$\begin{aligned} {{\varvec{U}}}_{j}^{n+1} = {{\varvec{U}}}_{j}^n - \frac{\Delta t}{\Delta x}\left( A^+\Delta {{\varvec{U}}}_{j-1/2}^n + A^-\Delta {{\varvec{U}}}_{j+1/2}^n\right) \end{aligned}$$
(104)

for the original field \({{\varvec{u}}}\), where

$$\begin{aligned} A^{\pm } = R\Lambda ^{\pm }R^{-1},\quad \Lambda ^{+} = \text{ diag }(c_S,0),\quad \Lambda ^{-} = \text{ diag }(0,-c_S). \end{aligned}$$
(105)

The extension of the finite-volume method to 2D and 3D can be obtained through the divergence theorem [2, 260].
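The flux splitting (105) can be checked directly from an eigenvalue decomposition. The sketch below assumes, for illustration only, that (67) takes the SH velocity-stress form with the matrix A written in the code; the values of \(\rho \) and \(\mu \) are arbitrary.

```python
import numpy as np

# Hypothetical SH velocity-stress system u_t + A u_x = 0 (illustrative rho, mu)
rho, mu = 1.0, 4.0
cs = np.sqrt(mu / rho)                      # shear-wave speed c_S
A = np.array([[0.0, -1.0 / rho],
              [-mu, 0.0]])

lam, R = np.linalg.eig(A)                   # A = R Lambda R^{-1}
order = np.argsort(-lam.real)               # order the eigenvalues as (c_S, -c_S)
lam, R = lam[order].real, R[:, order].real
Rinv = np.linalg.inv(R)

# (105): A+ = R Lambda+ R^{-1},  A- = R Lambda- R^{-1}
Ap = R @ np.diag(np.maximum(lam, 0.0)) @ Rinv
Am = R @ np.diag(np.minimum(lam, 0.0)) @ Rinv
# A+ + A- recovers A; A+ (A-) carries the right- (left-) going characteristic
```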

5.6 Discontinuous Galerkin methods

The discontinuous Galerkin method (DG) incorporates into the finite-element framework the concept, borrowed from the finite-volume method, of numerical fluxes across element interfaces. In particular, computations are done on a reference element to increase computational efficiency. The method is well suited for parallelization due to the local character of the scheme and its low communication requirements. Because continuity between elements is not required, the choice of spatial meshes is more flexible. On the other hand, these methods require more degrees of freedom than continuous methods. In the acoustic case, for instance, each node on an internal edge is associated with two or more local degrees of freedom, rather than a single global degree of freedom.

Like finite-volume methods, DG was initially designed for transport problems [270, 271]. Interior-penalty methods, another class of discontinuous Galerkin approximations, were independently developed for elliptic and parabolic problems [272]. While the former approach is based on suitable approximations of fluxes across elements, the latter weakly imposes continuity between them. We refer to Arnold et al. [273] for a unified framework covering these and other discontinuous Galerkin formulations.

The earliest DG methods for seismic wave propagation followed the interior-penalty approach [274,275,276,277,278] and were developed for second-order formulations. However, the most popular DG method in computational seismology is derived from first-order, conservation-law formulations [74, 279]. It employs piecewise high-order polynomial approximation, as spectral-element methods do, together with the ADER time-integration approach, and the resulting method achieves arbitrarily high approximation order in space and time. In contrast with spectral elements, which rely on Lagrange interpolation functions, ADER-DG methods follow a modal rather than nodal approach [2], i.e., they rely on orthogonal polynomial basis functions, in particular for triangular and tetrahedral elements [280, 281]. Hence, these methods benefit from automatic generators of unstructured triangular and tetrahedral meshes, which are usually more highly developed than those for quadrilateral and hexahedral elements, for example through optimized mesh-partitioning techniques based on graph theory [262]. This allows, for instance, precise digital elevation models for the topography of the free surface.

A more recent version of DG, termed hybridizable discontinuous Galerkin (HDG), employs Lagrange multipliers over element boundaries and can provide a higher convergence rate [88, 100, 282]. As shown in [283], HDG is related to the earlier staggered discontinuous Galerkin methods [284,285,286].

There have been several applications of DG beyond the isotropic wave equation, such as anisotropic [287], viscoelastic [288], and poroelastic [289, 290] waves. A unified Riemann solver has been recently proposed to couple these media [291]. Wilcox et al. [87] consider a velocity-strain (rather than velocity-stress) formulation, which allows coupling elastic and acoustic media. As pointed out by [2], DG is very well suited to dynamic-rupture problems [292,293,294,295] and has also benefited from the availability of open-source codes [126, 296, 297].

In order to compare spectral-element and discontinuous Galerkin methods [2], let us derive the elemental equations for a nodal upwind DG method for the 1D elastic wave system (100). Let \(\Omega ^e\) be an interior subinterval of the domain and \(\tilde{{{\varvec{u}}}}\) the local polynomial approximation of \({{\varvec{u}}}\), expanded in Lagrange basis functions \(\varphi _i\) as in (79). We take the scalar product of both sides of (100) with a test function \(\varphi _i\) and integrate by parts in \(\Omega ^e\), so that

$$\begin{aligned} \int _{\Omega ^e}\varphi _i\,\dot{\tilde{{{\varvec{u}}}}}\, dx = \int _{\Omega ^e}\frac{d\varphi _i}{dx}\,A\tilde{{{\varvec{u}}}}\, dx - \varphi _i(x_R^e)\,A\tilde{{{\varvec{u}}}}(x_R^e,t) + \varphi _i(x_L^e)\,A\tilde{{{\varvec{u}}}}(x_L^e,t), \end{aligned}$$
(106)

where \(x_L^e\) and \(x_R^e\) are the left and right endpoints of \(\Omega ^e\). The boundary terms are not present in standard finite- and spectral-element methods, as they cancel out with contributions from adjacent intervals. In the discontinuous case, adjacent elements are no longer connected through continuity conditions, which are replaced with numerical fluxes. Similarly to (104), the boundary terms of (106), not yet defined at the element interfaces, can be approximated as follows (Fig. 4):

$$\begin{aligned} A\tilde{{{\varvec{u}}}}(x_L^e,t)\approx & {} A^+\tilde{{{\varvec{u}}}}^{e-1}(x_L^e,t) + A^-\tilde{{{\varvec{u}}}}^{e}(x_L^e,t), \end{aligned}$$
(107a)
$$\begin{aligned} A\tilde{{{\varvec{u}}}}(x_R^e,t)\approx & {} A^+\tilde{{{\varvec{u}}}}^{e}(x_R^e,t) + A^-\tilde{{{\varvec{u}}}}^{e+1}(x_R^e,t), \end{aligned}$$
(107b)
where \(\tilde{{{\varvec{u}}}}^{e-1}\), \(\tilde{{{\varvec{u}}}}^{e}\), and \(\tilde{{{\varvec{u}}}}^{e+1}\) denote the local approximations on the corresponding elements.
Fig. 4

Upwind contributions of a scalar, discontinuous piecewise-polynomial field \({\tilde{u}}\) at the endpoints of element \(\Omega ^e\), when the velocity is positive (\(a^+\)) or negative (\(a^-\)). Adapted from [2, p. 259]

By introducing the numerical fluxes (107) and the expansion (79) into (106), we find

$$\begin{aligned} \sum _{j}\hat{M}_{i,j}\,\dot{{{\varvec{u}}}}^e_j = \sum _{j}\hat{K}_{i,j}\,A{{\varvec{u}}}^e_j - \delta _{i,i(R)}\left( A^+{{\varvec{u}}}^{e}_{i(R)} + A^-{{\varvec{u}}}^{e+1}_{i(L)}\right) + \delta _{i,i(L)}\left( A^+{{\varvec{u}}}^{e-1}_{i(R)} + A^-{{\varvec{u}}}^{e}_{i(L)}\right) , \end{aligned}$$
(108)

where \(\hat{M}\) and \(\hat{K}\) are the elemental mass and stiffness matrices, while the indices i(R) and i(L) satisfy \(\varphi _{i(R)}(x_R^e)=1\) and \(\varphi _{i(L)}(x_L^e)=1\), respectively. Equation (108) is the same as for the spectral-element method, except that there the boundary-flux terms on the right-hand side are not present.

5.7 Other methods

We close this section by reviewing some families of numerical methods that are conceptually distinct from the ones presented above, though their computational implementation may build on traditional spatial discretizations.

5.7.1 Physics-compatible numerical methods

Physics-compatible (also called mimetic or conservative) numerical methods are techniques that try to preserve (mimic) the fundamental physical and mathematical properties of continuous physics models in their finite-dimensional algebraic representations.

The numerical methods presented above, such as finite differences (FD), finite volumes (FV), and finite elements (FE), evolved separately until recently, but the need to develop more complex algorithms for challenging real-world problems has prompted the search for better and more robust schemes. Investigation of and experience with the computational behavior of standard methods (stability, convergence, numerical errors, and efficiency) demonstrated that discrete models that reproduce fundamental properties of the original continuum model yield the best results. Among the important properties to be preserved are topology, conservation of energy, monotonicity, stability, maximum principles, symmetries, and involutions of continuum models. For this purpose, differential geometry, exterior calculus, and algebraic topology are the main mathematical tools for developing compatible discretizations.

Examples include compatible spatial discretizations, variational and geometric time integrators, and conservative finite-volume, finite-element, and spectral-element methods. The design principles for the development of mimetic discretization methods are described in the books [298,299,300] and the references therein, while a general introduction to and overview of spatial and temporal mimetic/geometric methods can be found in [301,302,303,304,305,306,307].

The general approach in developing compatible numerical schemes is to formulate the PDEs that describe the continuum models using invariant first-order differential operators, such as the divergence of vectors and tensors, the gradient of scalars and vectors, and the curl of vectors. The next step is to work out compatible discretizations using equivalent discrete forms of these invariant operators. The divergence, gradient, and curl operators satisfy certain integral identities (such as the Green, Gauss, and Stokes theorems) that are closely related to the conservation laws of the continuum models.

Therefore, the equivalent discrete forms of these integral identities are used to build the compatible discrete divergence, gradient, and curl operators, which must satisfy such discrete identities. Other approaches have also been used, for example based on algebraic topology, variational principles, or discrete vector calculus, as well as extensions of the mimetic approach to more general grids, including polygonal, polyhedral, locally refined, and non-matching meshes.

For the sake of clarity, we show below the application of the basic principles of mimetic discretizations using the scalar wave equation (6) as described by Solano–Feo et al. [308].

Let us initially consider a one-dimensional grid with points \(x_0,\ldots ,x_N\) and uniform spacing h. We denote the mimetic approximations of a scalar function u at the grid points and at their midpoints as \(u_i\) and \(u_{i+1/2}\), respectively. We may define the discrete divergence and gradient operators through central finite differences as

$$\begin{aligned} v_{i+1/2} = (Du)_{i+1/2} = \frac{u_{i+1}-u_i}{h}, \quad v_i = (Gu)_i = \frac{u_{i+1/2}-u_{i-1/2}}{h}, \end{aligned}$$
(109)

which can be written in matrix form as \({{\varvec{v}}} = D{{\varvec{u}}}\) and \({{\varvec{v}}} = G{{\varvec{u}}}\), respectively. The discrete divergence and gradient operators yield grid functions defined at the midpoints and at the nodes, respectively (Fig. 5). Left- and right-sided approximations should be employed to define the gradient operator at the grid endpoints, and two- and three-dimensional operators can be constructed with the aid of Kronecker products [299]. Solano–Feo et al. [308] pointed out that these operators satisfy the discrete integral identity

$$\begin{aligned} \langle D{{\varvec{v}}},{{\varvec{u}}}\rangle _{Q} + \langle {{\varvec{v}}},G{{\varvec{u}}}\rangle _{P} = \langle B{{\varvec{v}}},{{\varvec{u}}}\rangle _{I}, \end{aligned}$$
(110)

where \(B\) is a boundary operator, \(\langle \cdot ,\cdot \rangle _{W}\) denotes a discrete inner product with weighting matrix W, \(Q\) (\(P\)) is the diagonal matrix containing the quadrature weights of the compound midpoint (3/8 Newton–Cotes) rule, \(I\) is the identity matrix, and analogous identities hold in spatial dimension \(d=1,2,\) or 3.
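The interior stencils of the operators (109) can be assembled as small matrices; the grid size and spacing below are arbitrary, and the one-sided endpoint closures mentioned above are omitted. Depending on whether u is stored at the midpoints or at the nodes, the discrete Laplacian is the composition \(DG\) or \(GD\); the sketch checks the node-based composition, which reduces to the classical 3-point stencil.

```python
import numpy as np

# Interior parts of the staggered operators (109): D maps the N+1 node values
# to the N midpoints, G maps midpoint values to the N-1 interior nodes.
N, h = 8, 0.1
D = (np.eye(N, N + 1, k=1) - np.eye(N, N + 1)) / h    # (Du)_{i+1/2} = (u_{i+1} - u_i)/h
G = (np.eye(N - 1, N, k=1) - np.eye(N - 1, N)) / h    # (Gv)_i = (v_{i+1/2} - v_{i-1/2})/h

Lap = G @ D        # composition: the 3-point Laplacian at the interior nodes

x = np.linspace(0.0, N * h, N + 1)
u = x ** 2         # the 3-point stencil is exact on quadratics: Lap u = 2
```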

Fig. 5

Discrete divergence (D) and gradient (G) operators in a 1D staggered grid

Let us now proceed to the mimetic approximation of (6), considering the leapfrog scheme in time. By writing \(\Delta u = \text{ div }(\nabla u)\), the Laplacian operator can be approximated by the compound discrete operator \(DG\), leading to the explicit scheme

$$\begin{aligned} {{\varvec{u}}}^{n+1} = 2{{\varvec{u}}}^{n} - {{\varvec{u}}}^{n-1} + c^2\Delta t^2\,DG\,{{\varvec{u}}}^{n}, \end{aligned}$$
(111)

where \({{\varvec{u}}}^{n}\) denotes the vector of mimetic approximations of \(u(\cdot ,t_n)\).

More details can be found in [299, 308] and the references therein. Mimetic principles have been applied to the modeling of wave-propagation problems by many authors [309,310,311,312]. Mimetic finite differences are particularly effective in handling topography and boundary conditions [313,314,315].

5.7.2 Cell method

It has been observed that many physical theories have a very similar formal structure from a geometric, algebraic, and analytical point of view. This principle led to the Tonti diagrams [316], a classification scheme for physical quantities and the physical theories in which they are involved, such as the equations of equilibrium, continuity, and motion. These equations can be reformulated on a finite grid using basic concepts of algebraic topology, namely completely discrete functions defined on combinations of elements of the grid rather than functions on the continuum. It is therefore possible to directly establish a set of algebraic relations between physical variables associated with the geometric elements of the problem, relations which are suitable for numerical simulations.

The cell method (CM) [317,318,319,320,321] is a computational method based on such a direct algebraic formulation, developed by Enzo Tonti [317, 322,323,324,325]. By using the experimental laws directly, it avoids discretizing the differential equations of the continuum formulation and is physically compatible by construction. CM is therefore a mimetic method.

In practice, the CM first accepts the idea of an approximate solution and focuses on single limited parts of the analyzed domain: the cells. After dividing the domain into cells (the primal cell complex), a second subdivision is made, coupling a piece of each cell to each of its nodes. With this second subdivision, a portion of the domain is attributed to each node of the primal cell complex, thus creating a second cell system, called the dual system (Fig. 6). There is full reciprocity (duality) between the geometric elements of the two systems of cells: to each node of the primal system remains connected a cell of the dual system (its “tributary region”), and, vice versa, to each node of the dual system correspond cells of the primal system. Therefore the geometric elements of the primal system (points P, lines L, surfaces S, and volumes V) correspond to the geometric elements of the dual system (respectively volumes \({\tilde{V}}\), surfaces \({\tilde{S}}\), lines \({\tilde{L}}\), and points \({\tilde{P}}\)).

Fig. 6

Domain partition in the cell method: the primal complex (left), primal and dual complexes (center), and tributary regions of some nodes of the primal complex (right)

The cell method, in addition to its simplicity, has close compatibility with physical and experimental reality, since it associates the physical quantities with the geometric elements of the two cell systems following the same logic with which those quantities are measured experimentally. The association between the quantities involved in a physical problem and the geometric elements of the two cell complexes is illustrated and effectively summarized in the Tonti diagrams.

In summary, spatial and temporal quantities are represented by sets of topological entities (cells) of multiple dimensions, called primal and dual cell complexes, and a system of inner and outer orientations is assigned to these cell complexes. The physical variables are associated with spatial and temporal elements according to the following classification:

  • Configuration variables: geometric and kinematic variables that describe the configuration of the (wave) field, such as displacement;

  • Source variables: static and dynamic variables that describe the sources of the field, such as force and mass flow;

  • Energy variables: variables that are obtained from the product of configuration and source variables, such as work and kinetic energy.

Configuration variables are associated with elements of the primal cell complex, while source/energy variables are associated with elements of the dual complex. The constitutive and balance laws are then imposed, leading to algebraic equations for the variables of interest.

5.7.3 Homogenization and multiscale methods

Two approaches have been proposed to handle medium heterogeneities that must be taken into account in the wave simulation but would require an impractically fine mesh. These methods essentially convert the material properties on a fine scale, where the relevant variability occurs, into equivalent properties on a coarse scale corresponding to the target wavelength. In general, the physical laws may differ between the fine and coarse scales [326].

One of these approaches can be seen as a generalization of averaging techniques [327] and leads to the homogenization methods [328,329,330,331,332]. Through asymptotic theory, they obtain effective equations at the macroscopic level that qualitatively account for the fine scales. The zeroth-order asymptotic expansion usually corresponds to the classical averaging techniques. Though these methods are not restricted to periodic media, they have been mostly developed for rectangular and cuboid grids [333,334,335].

The second approach obtains the effective medium by numerically modeling the fine scales [336]. Several methods follow this approach, such as numerical upscaling [337, 338], the heterogeneous multiscale method [339, 340], multiscale finite elements [341], multiscale coupling methods [342], and Fast Fourier homogenization [343]. These techniques are especially useful for finely layered, randomly oriented, and fractured media [344, 345].

Homogenization methods have the advantage of a lower computational cost. On the other hand, methods that numerically evaluate the contribution of fine scales to the macroscopic model can be more flexible with respect to the mesh geometry [346].

6 Numerical approximation of boundary conditions

This section concerns the implementation of the free-surface and computational boundary conditions mentioned in Sect. 2.4. They can also be handled directly at the discrete level, rather than through the discretization of analytical boundary conditions [29, 347]. An important class of such methods uses an artificial layer surrounding the domain to attenuate reflected waves.

6.1 Free-surface boundary conditions

Free-surface conditions can be easily imposed in numerical methods based on variational formulations, such as finite/spectral elements and discontinuous Galerkin methods. As an illustration, let us consider the variational formulation of the elastic wave equations (9) with boundary conditions

$$\begin{aligned} {{\varvec{\sigma }}}({{\varvec{u}}})\cdot {{\varvec{n}}}= & {} {{\varvec{0}}} \text{ on } \Gamma _1, \end{aligned}$$
(112a)
$$\begin{aligned} {{\varvec{u}}}= & {} {{\varvec{0}}} \text{ on } \Gamma _2, \end{aligned}$$
(112b)

where \(\Gamma _1\cup \Gamma _2=\partial \Omega \) and \(\Gamma _1\) denotes the surface boundary. The variational formulation is posed in the space

$$\begin{aligned} W = \left\{ {{\varvec{w}}}\in H^1(\Omega )^d \; ; \; {{\varvec{w}}} = {{\varvec{0}}} \text{ on } \Gamma _2\right\} . \end{aligned}$$
(113)

We have from the divergence theorem that, for any \({{\varvec{w}}}\in W\),

$$\begin{aligned} \int _{\Omega }{{\varvec{w}}}\cdot (\nabla \cdot {{\varvec{\sigma }}}({{\varvec{u}}}))\,d\Omega= & {} \int _{\Gamma _1}{{\varvec{w}}}\cdot ({{\varvec{\sigma }}}({{\varvec{u}}})\cdot {{\varvec{n}}})\,d\Gamma + \int _{\Gamma _2}{{\varvec{w}}}\cdot ({{\varvec{\sigma }}}({{\varvec{u}}})\cdot {{\varvec{n}}})\,d\Gamma \nonumber \\&- \int _{\Omega }{{\varvec{\epsilon }}}({{\varvec{w}}}):{{\varvec{\sigma }}}({{\varvec{u}}})\,d\Omega . \end{aligned}$$
(114)

The first and second boundary integrals in the right-hand side of (114) vanish due to (112a) and (113), respectively, so that we arrive at the same variational formulation as in (37), except that \({{\varvec{w}}}\in W\). A similar result applies to the velocity-stress and velocity-displacement formulations.

Other methods, such as finite differences, need some strategy to handle the z-derivatives present in condition (112a). In the 2D isotropic case, for instance, this condition becomes

$$\begin{aligned} \sigma _{zz} = \lambda \frac{\partial u_x}{\partial x} + (\lambda +2\mu )\frac{\partial u_z}{\partial z} = 0,\quad \sigma _{xz} = \mu \left( \frac{\partial u_x}{\partial z} + \frac{\partial u_z}{\partial x}\right) = 0. \end{aligned}$$
(115)

For a free-surface condition on \(z=0\), the x-derivatives can be approximated using grid points on this line. For the z-derivatives, one can extend the grid above \(z=0\) and impose the skew-symmetry of the stress components to evaluate the variables at these additional points [119, 157], or employ one-sided finite-difference expansions to avoid extending the grid [348]. Another approach is extrapolation based on characteristic variables [118], which has also been used in pseudospectral methods [68, 107, 349]. In the finite-volume method, the free-surface boundary condition may be imposed by solving an inverse Riemann problem [262].

6.2 Absorbing boundary conditions

The classical absorbing boundary conditions were initially implemented in finite-difference methods for the scalar wave equation, with one-sided difference formulas at the boundary [24, 28, 32]. Later on, these conditions were implemented in finite-difference methods for the two- and three-dimensional elastic wave equations [350, 351].

Engquist–Majda conditions (33) have been widely employed by other spatial discretization techniques, such as pseudospectral methods [352, 353], finite elements [35, 354], spectral elements [61, 355, 356], and discontinuous Galerkin methods [88, 285].

High-order local absorbing boundary conditions have been mostly implemented with finite-difference and finite-element methods [357,358,359], but have also been considered in other methods [360, 361].

It is worth noting that absorbing boundary conditions often involve first-order time derivatives, leading to second-order linear systems of ODEs (41) where the damping matrix is present [98]. We refer to [36] for conditions for first-order hyperbolic problems.

6.3 Absorbing layers and PML

An alternative to designing non-reflecting boundary conditions is to extend the computational domain by surrounding it with a layer in which the wave field is subject to some form of filtering that attenuates the waves generated by reflection at the outer layer boundary (Fig. 7). This technique can be traced back to the works of Petschek and Hanson [362, 363] and became popular in exploration geophysics after the method of Cerjan et al. [192]. The latter attenuates the numerical solution at the end of each time step through multiplication by a factor that tapers gradually towards the center of the grid [364], as suggested by the shading pattern in Fig. 7.

Fig. 7

Sketch of an absorbing layer (shaded) surrounding a rectangular domain with a homogeneous grid. In the upper right, the idealized effect of the absorbing layer on waves reflected by the outer boundary

Rather than post-processing the wave field, wave attenuation may be obtained by adopting a modified governing equation in the absorbing layer, as proposed in [364,365,366]. For instance, the acoustic wave equation (7) can be modified in the absorbing layer as follows:

$$\begin{aligned} \frac{\partial }{\partial t}{\left( \frac{1}{\rho c^2}\dot{u}\right) } -\nabla \cdot \left( \frac{1}{\rho }\nabla u\right) + 2\gamma \dot{u}+\gamma ^2u = f, \end{aligned}$$
(116)

where the parameter \(\gamma ({{\varvec{x}}})\) is chosen to achieve the best amplitude elimination [364]. Sarma et al. [367] developed modified equations for finite-element methods under the framework of Rayleigh damping.

A disadvantage of absorbing layers is that, while waves passing through them may be effectively damped, spurious reflections occur at the interface between the domain and the absorbing layer. This limitation motivated the development of perfectly matched layers (PML), which were originally proposed for electromagnetic waves [368] and later extended to acoustic and elastic waves [44, 369,370,371].

Following [372], let us illustrate a PML for the scalar wave equation (6). Firstly, this equation is rewritten as a first-order system and Laplace transformed:

$$\begin{aligned} -i\omega \hat{{{\varvec{v}}}} = -\nabla {\hat{u}}, \quad -i\omega {\hat{u}} = -c^2\nabla \cdot \hat{{{\varvec{v}}}}. \end{aligned}$$
(117)

For simplicity, consider the layer portion adjacent to the boundary \(x=0\). In this part of the layer, system (117) is modified as follows:

$$\begin{aligned} (-i\omega +\omega _1(x_1))\hat{v}_1= & {} -\frac{\partial {\hat{u}}}{\partial x_1},\quad (-i\omega +\omega _1(x_1)){\hat{u}}_1 = -c^2\frac{\partial \hat{v}_1}{\partial x_1}, \end{aligned}$$
(118a)
$$\begin{aligned} -i\omega \hat{v}_j= & {} -\frac{\partial {\hat{u}}}{\partial x_j}, \quad -i\omega {\hat{u}}_j = -c^2\frac{\partial \hat{v}_j}{\partial x_j} \quad (j=2,3), \end{aligned}$$
(118b)

where \({\hat{u}}_j\) (\(j=1,2,3\)) are auxiliary variables such that \({\hat{u}}={\hat{u}}_1+{\hat{u}}_2+{\hat{u}}_3\), while \(\omega _1\) is a function that vanishes along with its derivative at the interface (for instance, \(\omega _1(x_1)=Ax_1^2\)). Finally, the equations in the time domain are obtained by applying the inverse Laplace transform to (118):

$$\begin{aligned} \dot{v}_1+\omega _1(x_1)v_1= & {} -\frac{\partial u}{\partial x_1}, \quad \dot{u}_1+\omega _1(x_1)u_1 = -c^2\frac{\partial v_1}{\partial x_1}, \end{aligned}$$
(119a)
$$\begin{aligned} \dot{v}_j= & {} -\frac{\partial u}{\partial x_j},\quad \dot{u}_j = -c^2\frac{\partial v_j}{\partial x_j} \quad (j=2,3). \end{aligned}$$
(119b)

In general, one splits the derivative operators and the unknowns into components normal and tangential to the boundary and applies a complex change of variables in the normal direction [44, 373]. Moreover, the modified equations may also be obtained for second-order formulations of the wave equation [373].
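A minimal 1D realization of (119) illustrates the effect of the layer (in 1D only the \(x_1\)-direction survives, so \(u=u_1\)). The layer width, the quadratic damping profile, and the damping amplitude below are illustrative choices, and the damping terms are treated semi-implicitly; this is a sketch, not a production PML.

```python
import numpy as np

# 1D PML sketch for (119): v_t + w1 v = -u_x,  u_t + w1 u = -c^2 v_x,
# on [0, 1] with absorbing layers of width lw at both ends.
def run(w_max, c=1.0, nx=400, T=1.0, lw=0.2):
    dx, dt = 1.0 / nx, 0.5 / (nx * c)             # CFL number 0.5
    xn = np.linspace(0.0, 1.0, nx + 1)            # u at the nodes
    xm = 0.5 * (xn[1:] + xn[:-1])                 # v at the midpoints

    def w1(x):                                    # quadratic profile, zero at the interfaces
        d = np.maximum(lw - x, 0.0) + np.maximum(x - (1.0 - lw), 0.0)
        return w_max * (d / lw) ** 2

    wn, wm = w1(xn), w1(xm)
    u = np.exp(-((xn - 0.5) / 0.05) ** 2)         # initial pulse at the center
    v = np.zeros(nx)
    for _ in range(round(T / dt)):
        v = ((1 - 0.5 * dt * wm) * v - dt / dx * np.diff(u)) / (1 + 0.5 * dt * wm)
        u[1:-1] = ((1 - 0.5 * dt * wn[1:-1]) * u[1:-1]
                   - c**2 * dt / dx * np.diff(v)) / (1 + 0.5 * dt * wn[1:-1])
    return np.abs(u[(xn > 0.3) & (xn < 0.7)]).max()  # residual interior amplitude

reflected = run(0.0)      # no damping: the rigid ends send the pulse back
absorbed = run(200.0)     # with the layer, the returning energy is strongly attenuated
```

Comparing `run(0.0)` with `run(200.0)` shows the reflected pulse returning to the interior in the first case and being strongly attenuated in the second.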

Later on, the convolutional perfectly matched layer (CPML) was developed to avoid spurious reflections at grazing incidence. This method was originally proposed for the elastic wave equation in the velocity-stress formulation [374] and later extended to the displacement formulation [375] and to poroelastic [376] and viscoelastic [377] media. Kristek et al. [378] explored the relationship between PML and CPML.

Another approach that uses an absorbing layer applies high-order local NRBCs on two parallel artificial boundaries; it is known as the double-absorbing-boundary method [379]. This method has been evaluated in 2D and 3D seismic wave-propagation benchmark problems [380, 381].

7 Numerical errors

Convergence analyses have been proposed for most of the fully discrete methods outlined in the previous section.

However, convergence analysis usually does not guide the practitioner in the choice of discretization parameters for a wave-propagation simulation. Such information is mostly provided by stability analysis (often a constituent part of convergence proofs [392]) and by dispersion analysis.

7.1 Stability

Numerical stability, understood as the sensitivity of the numerical solution to perturbations, is an essential feature of numerical wave simulations, in which such perturbations should not grow over time. The stability analysis of time-dependent problems is usually carried out through Von Neumann analysis [393], the matrix method [394], or the energy method [395].

Let us illustrate these analyses with the explicit finite-difference scheme (64) in the absence of source terms. In Von Neumann (or discrete Fourier) analysis, we represent the numerical solution in the form

$$\begin{aligned} u_j^n = {\hat{u}}_0e^{\alpha t_n}e^{ik x_j} = {\hat{u}}_0\xi ^ne^{ik x_j}, \end{aligned}$$
(120)

and refer to the method as stable if the amplification factor \(\xi =u_j^{n+1}/u_j^n=\exp (\alpha \Delta t)\) satisfies \(|\xi |\le 1\) for any k [393], and unstable otherwise. By substituting (120) into (64), we obtain the following equation for the amplification factor [393, 396]:

$$\begin{aligned} \xi ^2 - 2A\xi + 1 = 0,\quad A = 1 - 2r^2\sin ^2\left( \frac{k\Delta x}{2}\right) ,\quad r = \frac{c\Delta t}{\Delta x}. \end{aligned}$$
(121)

The solutions \(\xi = A \pm \sqrt{A^2-1}\) satisfy \(|\xi |\le 1\) if the space and time steps \(\Delta x\) and \(\Delta t\) are chosen such that

$$\begin{aligned} r = \frac{c\Delta t}{\Delta x} \le 1, \end{aligned}$$
(122)

coinciding with the CFL stability criterion [112]. Similarly to [396], the amplification factor of the implicit version of scheme (64) in the absence of sources,

$$\begin{aligned} \frac{u_{j}^{n-1}-2u_{j}^{n}+u_{j}^{n+1}}{\Delta t^2} = c^2\frac{u_{j-1}^{n+1}-2u_{j}^{n+1}+u_{j+1}^{n+1}}{\Delta x^2}, \end{aligned}$$
(123)

satisfies \(\xi ^2(1+4A)-2\xi +1=0\), where \(A=r^2\sin ^2(k\Delta x/2)\), hence

$$\begin{aligned} |\xi | = \left| \frac{1\pm 2\sqrt{A}i}{1+4A}\right| = \frac{1}{\sqrt{1+4A}} \le 1. \end{aligned}$$
(124)

Therefore, the implicit scheme (123) is unconditionally stable, i.e., it is stable for any combination of the grid parameters \(\Delta x\) and \(\Delta t\). For this reason, implicit schemes may be an attractive choice despite their higher computational cost. However, one must keep in mind that numerical dispersion, presented in the next section, constrains the choices of the grid parameters for both explicit and implicit methods.
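The explicit amplification-factor analysis above can be reproduced numerically by solving (121) over a sweep of wavenumbers; the sampling below is an arbitrary choice.

```python
import numpy as np

# Roots of (121): xi^2 - 2 A xi + 1 = 0 with A = 1 - 2 r^2 sin^2(k dx / 2)
def max_amplification(r, nk=257):
    A = 1.0 - 2.0 * r**2 * np.sin(np.linspace(0.0, np.pi, nk) / 2.0) ** 2
    disc = np.sqrt(A.astype(complex) ** 2 - 1.0)
    return np.maximum(np.abs(A + disc), np.abs(A - disc)).max()

stable = max_amplification(0.9)     # r <= 1: |xi| = 1 for every wavenumber
unstable = max_amplification(1.1)   # r > 1: short-wavelength modes grow
```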

A general form of this procedure can be found in many textbooks (e.g., [397]). The simplicity of Von Neumann analysis has made it the most frequently used tool for the stability analysis of finite-difference methods as well as of other techniques [80, 277, 398, 399]. However, since the free-space solution of the form (120) is essentially restricted to unbounded or periodic domains, boundary conditions are not taken into account. Moreover, Von Neumann analysis is essentially limited to equations with constant coefficients.

While equations for heterogeneous media can be handled by considering the “frozen-coefficient” equations [400, 401], the analysis of boundary conditions requires alternative techniques. Even though the Von Neumann conditions are sufficient in particular wave-propagation problems, in general boundary conditions are stable only for a certain range of elastic parameters [402].

One of these alternatives is the matrix method, whose best-known source is the book by Mitchell [394] as well as other textbooks [401, 403], though its main idea is present in earlier works [404,405,406]. The matrix method analysis is carried out in the physical domain (writing the equations in matrix form) rather than in the wavenumber domain, and has been considered in [153, 156, 240, 261].

Recalling that the matrix form of (64) with homogeneous boundary conditions is \({{\varvec{u}}}_{n+1} = {{\varvec{B}}}{{\varvec{u}}}_n - {{\varvec{u}}}_{n-1}\), with \({{\varvec{B}}}\) given in (65), we have the following single-step equation:

$$\begin{aligned} {{\varvec{v}}}_{n+1} = {{\varvec{G}}}{{\varvec{v}}}_n, \end{aligned}$$
(125)

where \({{\varvec{v}}}_n=({{\varvec{u}}}_n,{{\varvec{u}}}_{n-1})^T\) and \({{\varvec{G}}}\) is the block matrix \(\left( {\begin{matrix}{{\varvec{B}}}&-{{\varvec{I}}}\\ {{\varvec{I}}}&{{\varvec{0}}}\end{matrix}}\right) \), with \({{\varvec{B}}}\) the update matrix given in (65). It then follows that \(\Vert {{\varvec{v}}}_{n}\Vert \le \Vert {{\varvec{G}}}\Vert ^n\Vert {{\varvec{v}}}_{0}\Vert \), where \(\Vert \cdot \Vert \) denotes a vector norm and its induced matrix norm (i.e., \(\Vert {{\varvec{G}}}\Vert =\max _{{{\varvec{x}}}\ne {{\varvec{0}}}}\Vert {{\varvec{G}}}{{\varvec{x}}}\Vert /\Vert {{\varvec{x}}}\Vert \)). A necessary (but in general not sufficient) condition for the boundedness of \(\Vert {{\varvec{v}}}_{n}\Vert \) is \(\rho ({{\varvec{G}}})\le 1\), where \(\rho ({{\varvec{G}}})\) is the spectral radius of \({{\varvec{G}}}\).

The eigenvalues \(\lambda \) of the block matrix in (125) satisfy \(\lambda ^2-\mu \lambda +1=0\), where \(\mu \) is an eigenvalue of \({{\varvec{B}}}=2{{\varvec{I}}}-r^2\,\mathrm {tridiag}(-1,2,-1)\) from (65), thus to ensure \(|\lambda |\le 1\) we must have \(|\mu |\le 2\). Since the eigenvalues of the tridiagonal matrix \(\mathrm {tridiag}(-1,2,-1)\) are \(2-2\cos (j\pi /N)\) for \(1\le j\le N-1\) [407], it follows that \(r\le 1/\sin ((\pi /2)(N-1)/N)\), which is nearly the same as condition (122).
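This threshold can be verified numerically. The following sketch (Python/NumPy; the matrix names and grid size are illustrative) assembles the update matrix \(2{{\varvec{I}}}-r^2\,\mathrm {tridiag}(-1,2,-1)\), builds the one-step block matrix of (125), and compares its spectral radius with 1 on both sides of the bound:

```python
import numpy as np

def spectral_radius(r, N):
    """Spectral radius of the one-step (companion) matrix of the explicit scheme."""
    T = 2*np.eye(N-1) - np.eye(N-1, k=1) - np.eye(N-1, k=-1)  # tridiag(-1,2,-1)
    B = 2*np.eye(N-1) - r**2 * T                              # update matrix
    I, Z = np.eye(N-1), np.zeros((N-1, N-1))
    G = np.block([[B, -I], [I, Z]])   # v_{n+1} = G v_n, v_n = (u_n, u_{n-1})
    return max(abs(np.linalg.eigvals(G)))

N = 50
r_star = 1.0 / np.sin(0.5 * np.pi * (N - 1) / N)  # slightly larger than 1
print(spectral_radius(0.99 * r_star, N))  # 1 up to rounding: stable
print(spectral_radius(1.01 * r_star, N))  # > 1: unstable
```

Note that the stability limit \(r^*\) is slightly larger than the Von Neumann bound \(r\le 1\), and tends to it as \(N\rightarrow \infty \).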

A drawback of the matrix method is the need to analyze large matrices, and some approaches alleviate the underlying computational cost. Ilan and Loewenthal [402] restricted the analysis to a portion of the domain close to the boundary. On the other hand, Kamel [408] proposed to seek the largest eigenvalue through the power method [228], which can be interpreted as updating initial data through successive time steps.
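Kamel's time-stepping interpretation can be sketched as follows (a Python/NumPy sketch; the grid size and step count are illustrative): advancing random initial data with the scheme itself is a power iteration on the one-step matrix, so the geometric-mean growth per step estimates its spectral radius.

```python
import numpy as np

def avg_growth(r, N, steps=500, seed=0):
    """Geometric-mean growth per step of the leapfrog scheme with homogeneous
    Dirichlet boundaries, obtained by time stepping random initial data."""
    rng = np.random.default_rng(seed)
    u_old = rng.standard_normal(N - 1)
    u = rng.standard_normal(N - 1)
    log_growth = 0.0
    for _ in range(steps):
        up = np.concatenate(([0.0], u, [0.0]))  # pad with u = 0 at both ends
        u_new = 2*u - u_old + r**2 * (up[2:] - 2*up[1:-1] + up[:-2])
        n0 = np.linalg.norm(np.concatenate([u, u_old]))
        n1 = np.linalg.norm(np.concatenate([u_new, u]))
        log_growth += np.log(n1 / n0)
        u_old, u = u / n1, u_new / n1           # rescale to avoid overflow
    return np.exp(log_growth / steps)

print(avg_growth(0.9, 30))  # close to 1: stable
print(avg_growth(1.1, 30))  # well above 1: unstable
```

No matrix is ever formed: each iteration of the power method is one time step of the scheme.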

As pointed out in [401, 409], the spectral-radius condition assures that the norm of the numerical solution remains bounded as n increases but, under this condition, the norm may initially increase before decreasing. Griffiths et al. [409] point out that bounding the norm of the one-step matrix in (125) by one is a sufficient condition, and suggest an intermediate condition between these two.

Let us proceed to the energy method, which seeks a discrete quadratic form that does not grow (or grows moderately) with time and at the same time bounds the discrete \(L^2\) norm from above [410]. In the context of the previous examples, the discrete \(L^2\) norm and inner product are

$$\begin{aligned} \Vert {{\varvec{u}}}_n\Vert _h = \langle {{\varvec{u}}}_n,{{\varvec{u}}}_n\rangle _h^{1/2},\quad \langle {{\varvec{u}}}_n,{{\varvec{v}}}_n\rangle _h = \Delta x\sum _{j=1}^{N-1}u^n_jv^n_j. \end{aligned}$$
(126)

Let us consider Eq. (1) in the absence of sources with boundary conditions \(u(a)=u(b)=0\). By multiplying both sides of (1) by \(\dot{u}\) and integrating by parts over \([a,b]\), we find

$$\begin{aligned} \int _a^b\ddot{u}(x,t)\dot{u}(x,t)\, dx + \int _a^bc^2\frac{\partial {u}}{\partial x}(x,t)\frac{\partial {\dot{u}}}{\partial x}(x,t)\, dx = 0. \end{aligned}$$
(127)

It then follows that

$$\begin{aligned} \frac{dE}{dt}(t) = 0, \quad E(t) = \frac{1}{2}\int _a^b\dot{u}(x,t)^2\, dx + \frac{1}{2}\int _a^bc^2\left( \frac{\partial {u}}{\partial x}(x,t)\right) ^2\, dx, \end{aligned}$$
(128)

i.e., the quadratic form E(t) (the total energy) is constant over time. Note by writing \(c^2=\mu /\rho \) in analogy with (2) that the first and second terms in E(t) are associated with kinetic and potential energy, respectively. In general, E(t) may not be associated with a physical energy [410, 411].

Analogously to (127), we multiply (64) by the centered-difference approximation \((u_j^{n+1}-u_j^{n-1})/(2\Delta t)\) and sum from \(j=1\) to \(j=N-1\) arriving at the discrete energy conservation

$$\begin{aligned} \frac{E_h^{n+1/2}-E_h^{n-1/2}}{\Delta t} = 0, \quad E_h^{n+1/2} = \frac{1}{2}\left\| \frac{{{\varvec{u}}}_{n+1}-{{\varvec{u}}}_n}{\Delta t}\right\| _h^2 + \frac{1}{2}\left\langle c^2D_+{{\varvec{u}}}_{n+1},D_+{{\varvec{u}}}_n\right\rangle _h, \end{aligned}$$
(129)

where \((D_+{{\varvec{u}}}_n)_j=(u_{j+1}^n-u_j^n)/\Delta x\) for \(1\le j\le N-1\) and \({{\varvec{u}}}_{n+1/2}=({{\varvec{u}}}_{n+1}+{{\varvec{u}}}_n)/2\). Moreover, we have the (energy) inequality

$$\begin{aligned} E_h^{n+1/2} \ge \frac{1}{2}\left( 1-\frac{c^2\Delta t^2}{\Delta x^2}\right) \left\| \frac{{{\varvec{u}}}_{n+1}-{{\varvec{u}}}_n}{\Delta t}\right\| _h^2 + \frac{c^2}{2}\left\| D_+{{\varvec{u}}}_{n+1/2}\right\| ^2_h \ge \frac{c^2}{2}\left\| D_+{{\varvec{u}}}_{n+1/2}\right\| ^2_h, \end{aligned}$$
(130)

if (122) holds. Since \(E_h^{n+1/2}\) remains bounded due to (129), so does \(\Vert D_+{{\varvec{u}}}_{n+1/2}\Vert _h\). The details of (129)–(130) are available in Sec. 9.2 of [412], where the general heterogeneous case is considered.
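The conservation law (129) can be observed directly in a simulation. The sketch below (Python/NumPy; the domain, initial bump, and grid are illustrative, and the forward differences are taken over all grid intervals) advances the leapfrog scheme with homogeneous Dirichlet conditions and evaluates \(E_h^{n+1/2}\) at every step; it stays constant to rounding error.

```python
import numpy as np

def leapfrog_energies(r=0.5, N=100, c=1.0, steps=200):
    """Discrete energies E_h^{n+1/2} of (129) along a leapfrog run on [0, 1]
    with u = 0 at both ends and (approximately) zero initial velocity."""
    dx = 1.0 / N
    dt = r * dx / c
    x = np.linspace(0.0, 1.0, N + 1)
    u_old = np.exp(-200.0 * (x - 0.5)**2)  # smooth initial bump
    u_old[0] = u_old[-1] = 0.0
    u = u_old.copy()
    energies = []
    for _ in range(steps):
        u_new = np.zeros_like(u)
        u_new[1:-1] = 2*u[1:-1] - u_old[1:-1] + r**2 * (u[2:] - 2*u[1:-1] + u[:-2])
        w = (u_new - u) / dt                 # (u_{n+1} - u_n) / dt
        E = 0.5 * dx * np.sum(w**2) \
            + 0.5 * c**2 * dx * np.sum((np.diff(u_new) / dx) * (np.diff(u) / dx))
        energies.append(E)
        u_old, u = u, u_new
    return np.array(energies)

E = leapfrog_energies()
print(E.std() / E.mean())  # of the order of machine precision
```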

The energy method has been successfully used to analyze problems with free-surface [413], PML [414, 415], and absorbing [28, 37, 379, 411, 416] boundary conditions. Besides finite differences, energy-method analyses are available for finite [35] and spectral [417] elements, finite volumes [266], and discontinuous Galerkin methods [418]. On the other hand, since this method only indirectly bounds some norm of the numerical solution, the stability condition it provides is sufficient, but not necessary [410].

Finally, let us point out that other approaches are very well suited to the stability analysis of boundary conditions, such as the normal-mode (also known as GKS or GKSO) analysis [32, 419, 420] and the geometric stability condition [421, 422].

7.2 Dispersion and numerical anisotropy

Dispersion analysis is an important tool for assessing the quality of approximation of numerical methods, providing an estimate of the minimum number of grid points per wavelength required to prevent waves from traveling with incorrect speed. A continuous or discrete wave model is dispersive if the wave speed depends on its wavelength.

Let us initially consider the same 1D problem as in the previous section, recalling that the plane-wave solution (5) satisfies the dispersion relation \(\omega =\pm c\kappa \), thus the phase and group velocities,

$$\begin{aligned} c^{ph} = \frac{\omega }{\kappa },\quad c^{gr} = \frac{d\omega }{d\kappa }, \end{aligned}$$
(131)

coincide and are constant. On the other hand, if \(u_j^n = \exp (-i({\omega }_ht_n-\kappa x_j))\) then scheme (64) in the absence of sources yields

$$\begin{aligned} \sin \left( \frac{{\omega }_h\Delta t}{2}\right) = \pm r\sin \left( \frac{\kappa \Delta x}{2}\right) , \end{aligned}$$
(132)

with r defined as in (121). It follows from (132) that the numerical phase and group velocities satisfy

$$\begin{aligned} \frac{{c}_h^{ph}}{c^{ph}} = \frac{1}{r\pi H}\sin ^{-1}\left( r\sin \left( \pi H\right) \right) , \quad \frac{{c}_h^{gr}}{c^{gr}} = \frac{\cos (\pi H)}{\cos \left( \sin ^{-1}\left( r\sin (\pi H)\right) \right) }, \end{aligned}$$
(133)

where \(H=\kappa \Delta x/(2\pi )\). Noting that \(\exp (i\kappa x)\) has period (wavelength) \(2\pi /\kappa \), we have that \(G = 2\pi /(\kappa \Delta x)=1/H\) is the number of grid points per wavelength.

If \(r=1\), then \({c}_h^{ph}=c^{ph}\) and \({c}_h^{gr}=c^{gr}\), as long as \(H\le 1/2\) (since \(\sin x\) is not invertible on \(0\le x\le L\) if \(L>\pi /2\)). The bound \(H\le 1/2\), or \(G\ge 2\), is known as the Nyquist limit. The result for \(r=1\) is exceptional and is not necessarily observed in less trivial problems [401]. When \(r<1\), the numerical and exact phase/group velocities do not coincide. The dispersion error is illustrated with \(r=1/2\) in Fig. 8. For instance, when \(H=0.2\) (i.e., \(G=5\) grid points per wavelength), the relative errors of phase and group velocity are approximately 5% and 15%, respectively.
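The ratios (133) and the figures quoted above can be reproduced with a few lines (a Python/NumPy sketch):

```python
import numpy as np

def phase_ratio(H, r):
    """c_h^ph / c^ph from (133), with H = kappa*dx/(2*pi) = 1/G."""
    return np.arcsin(r * np.sin(np.pi * H)) / (r * np.pi * H)

def group_ratio(H, r):
    """c_h^gr / c^gr from (133)."""
    return np.cos(np.pi * H) / np.cos(np.arcsin(r * np.sin(np.pi * H)))

r, H = 0.5, 0.2                 # G = 1/H = 5 grid points per wavelength
print(1 - phase_ratio(H, r))    # about 0.05: ~5% phase-velocity error
print(1 - group_ratio(H, r))    # about 0.15: ~15% group-velocity error
print(phase_ratio(0.3, 1.0))    # exactly 1 when r = 1 and H <= 1/2
```

Note the much larger group-velocity error at the same resolution, which is typical of low-order schemes.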

Fig. 8

Numerical dispersion of the explicit, second-order finite-difference method with \(r=0.5\)

A comprehensive review of the dispersion analysis of finite-difference schemes of higher order and spatial dimension is available in [412]. Trefethen [423] reviews the group-velocity analysis of finite-difference schemes for the acoustic wave equation and establishes a relationship between group velocity and GKS stability for first-order hyperbolic systems. Numerical dispersion has also been studied beyond the acoustic case [122, 125, 128, 213].

Kosloff and Baysal [60] presented the numerical dispersion relations of the 1D and 2D Fourier pseudospectral methods using a procedure similar to the one above, while Fornberg [191] focused on the dispersion of the spatial discretization. Spa et al. [424] considered fully discrete schemes with Lax-Wendroff and rapid expansion methods. The numerical dispersion in the case of Chebyshev collocation points was not studied in the classical works, though an analysis of its multidomain version has recently been proposed [425].

The numerical dispersion analysis of finite-element methods of degree one can be carried out exactly as for finite-difference methods; that is, by plugging the discrete plane wave into the finite-element stencil, assuming an infinite, periodic mesh (see, e.g., [213]). For 1D quadratic meshes and certain triangular meshes, we must separate the nodes into sets that share the same degrees of freedom and are located at the same cyclically repeating location in the mesh pattern [426]. In this case, the numerical dispersion relation is expressed by an eigenvalue problem, whose solutions are analogous to the acoustic and optical branches from the theory of wave propagation in crystal structures [427, 428]. Finite elements of higher degree lead to a larger number of solutions, and the classical interpretation is that only one eigenvalue is physically meaningful (in the case of the acoustic wave equation), while the others are regarded as spurious modes [227, 429]. For this reason, the use of high-order finite-element methods was long discouraged in numerical wave propagation.

Priolo and Seriani [69, 234] performed a dispersion analysis of the 1D spectral-element method with Chebyshev collocation points by solving the discrete problem for a large final time, taking a wavelet as the initial condition and periodic boundary conditions. The final approximate and exact solutions are transformed into the Fourier space and the amplitude and the phase of their ratio is found for several wave numbers and degrees of polynomial approximation. The results are similar to the theoretical estimates presented in [191].

Mulder [430] applied the discrete Fourier transform sampled in the mesh nodes to the spatial operator and matched its eigenpairs with the transformed plane waves and their normalized wave numbers. Under this setting, the spurious modes provide reasonable approximations of particular eigenvectors of the exact operator. On the other hand, the spatial operator must be properly ordered to assure eigenpair matching. It is not trivial to find such an ordering for 2D or 3D problems.

A common practice in the dispersion analysis of spectral-element and discontinuous Galerkin methods is to select the eigenvalue mode that approximates the dispersion relation of the continuous wave equation [235, 277, 412, 431], and locating these modes is also not trivial in general. Cohen et al. [432] use a Taylor series expansion. Abboud and Pinsky [433] write the amplitude-variable discrete plane wave as a linear combination of discrete plane waves and classify the modes by the dominating coefficient of the combination (see also [434]). Seriani and Oliveira [435] identify these modes by a Rayleigh-quotient approximation of the constant-amplitude mode. A similar analysis was done for the elastic wave equation [99, 436]. The Rayleigh-quotient technique has also been employed in other Galerkin-type methods [225, 437].

Another related form of error is numerical anisotropy, which is present when the speed of the approximate wave solution depends on the propagation direction in a different fashion than the exact solution’s speed [397, 438, 439].

Let us illustrate numerical anisotropy with the two-dimensional version of the previous example. Let \({\varvec{\kappa }}=\kappa (\cos \theta ,\sin \theta )\) be a wave vector with magnitude \(\kappa \) and propagation direction given by the angle \(\theta \). It follows from the dispersion relation \(\omega =\pm c\kappa \) of the scalar wave equation (6) that the phase and group velocities do not depend on \(\theta \). On the other hand, by substituting \(u^n_{j,k}=\exp (-i({\omega }_ht_n-\kappa (x_j\cos \theta +y_k\sin \theta )))\) into the 2D version of (64) we find

$$\begin{aligned} \sin \left( \frac{{\omega }_h\Delta t}{2}\right) = \pm \sqrt{ r_x^2\sin ^2\left( \frac{\kappa \Delta x\cos \theta }{2}\right) + r_y^2\sin ^2\left( \frac{\kappa \Delta y\sin \theta }{2}\right) }, \end{aligned}$$
(134)

where \(r_x=c\Delta t/\Delta x\) and \(r_y=c\Delta t/\Delta y\). Thus the numerical phase and group velocities depend on \(\theta \), even if \(r_x=r_y=1\). The detailed study of the numerical anisotropy of this scheme is available in [59], and the three-dimensional case readily follows by considering a three-dimensional wave vector \({\varvec{\kappa }}\).
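A small sketch (Python/NumPy, square grid with \(r_x=r_y=r\); the sample values of H and \(\theta \) are illustrative) evaluates the angle-dependent phase-velocity ratio implied by (134):

```python
import numpy as np

def phase_ratio_2d(H, theta, r):
    """Numerical/exact phase-velocity ratio of the 2D scheme from (134),
    on a square grid (dx = dy, so r_x = r_y = r); H = kappa*dx/(2*pi)."""
    k_dx = 2.0 * np.pi * H
    s = np.sqrt(np.sin(0.5 * k_dx * np.cos(theta))**2
                + np.sin(0.5 * k_dx * np.sin(theta))**2)
    w_dt = 2.0 * np.arcsin(r * s)   # omega_h * dt from (134)
    return w_dt / (r * k_dx)        # omega_h / (c * kappa)

# for this scheme, the error is largest along the grid axes
# and smallest along the diagonal
for theta in (0.0, np.pi / 8, np.pi / 4):
    print(theta, 1 - phase_ratio_2d(0.2, theta, 0.5))
```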

A convenient way to represent an angle-dependent dispersion relation is a polar diagram [397]. Similarly to [2], Fig. 9 shows percent phase-velocity errors of the 2D and 3D versions of the finite-difference scheme (64) in polar form.

Fig. 9

Percent phase-velocity error of the explicit, second-order finite-difference method with \(r=0.5\) in two (left) and three (right) dimensions. The black, blue, and red graphs correspond to \(H=0.3\), \(H=0.2\), and \(H=0.1\), respectively

Numerical dispersion analysis usually assumes homogeneous media and uniform rectangular/cuboid grids, but more general scenarios have also been considered in many studies, such as nonuniform grids [439, 440], interfaces [441,442,443], distorted elements [412, 444, 445], and periodic composite materials [446, 447].

7.2.1 Mass lumping and blending

Finite-element methods for second-order wave equations lead to systems of ODEs in the form (41), where the mass matrix \({{\varvec{M}}}\) is usually non-diagonal. The mass-lumping technique approximates \({{\varvec{M}}}\) by a diagonal matrix \(\tilde{{{\varvec{M}}}}\), allowing the use of explicit time-stepping schemes. The classical approach is to row-lump the mass matrix [63, 64, 428], i.e.,

$$\begin{aligned} \tilde{M}_{i,j} = \left\{ \begin{array}{ll} \displaystyle {\sum _{k=0}^NM_{i,k}},&{}\; i=j,\\ 0,&{}\; i\ne j.\\ \end{array}\right. \end{aligned}$$
(135)

This concept has also been proposed to discretize the same integral formulations that lead to finite-volume methods [448, 449].
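As a concrete illustration of (135), the following Python/NumPy sketch lumps the consistent mass matrix of 1D linear elements on a uniform mesh, \(({h}/{6})\,\mathrm {tridiag}(1,4,1)\) (the standard textbook example, not taken from the references above):

```python
import numpy as np

def p1_mass_matrix(N, h):
    """Consistent mass matrix of 1D linear elements on a uniform mesh of N
    elements with homogeneous Dirichlet ends: (h/6) * tridiag(1, 4, 1)."""
    return (h / 6.0) * (4*np.eye(N-1) + np.eye(N-1, k=1) + np.eye(N-1, k=-1))

def row_lump(M):
    """Row lumping (135): each diagonal entry becomes the row sum."""
    return np.diag(M.sum(axis=1))

h = 0.1
Mt = row_lump(p1_mass_matrix(10, h))
print(np.diag(Mt))  # h for interior rows; 5h/6 next to the boundary
```

For this element, lumping reproduces the scaling \(\Delta x\,{{\varvec{I}}}\) of the finite-difference mass term in the interior of the mesh.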

In addition to the algebraic form (135), the approximate diagonal mass matrix may be obtained through reduced integration [450]. For high-order finite elements, the natural choice is to employ a Gauss–Lobatto–Legendre quadrature and shift the degrees of freedom so that they coincide with the quadrature nodes [239]. This procedure is followed in GLL spectral-element methods, as mentioned in Sect. 5.4.

Mass lumping significantly affects the numerical dispersion of finite elements. As illustrated in Fig. 10, the numerical phase velocity is higher than the actual phase velocity when a consistent matrix is used but lower when row lumping is employed. Usually the consistent mass matrix produces leading phase and group error, while the lumped mass matrix produces lagging phase and group error [451].
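This leading/lagging behavior can be checked with the standard semi-discrete dispersion relations of 1D linear elements on a uniform mesh of size h, namely \(\omega _h^2=(6c^2/h^2)(1-\cos \kappa h)/(2+\cos \kappa h)\) for the consistent mass matrix and \(\omega _h^2=(4c^2/h^2)\sin ^2(\kappa h/2)\) for the row-lumped one (a Python/NumPy sketch; these relations are textbook results, not from the figure's reference):

```python
import numpy as np

def phase_ratio_consistent(H):
    """c_h^ph / c^ph of semi-discrete 1D linear elements, consistent mass."""
    kh = 2.0 * np.pi * H   # kappa * h, with H = 1/G as before
    return np.sqrt(6.0 * (1 - np.cos(kh)) / (2 + np.cos(kh))) / kh

def phase_ratio_lumped(H):
    """c_h^ph / c^ph of semi-discrete 1D linear elements, row-lumped mass."""
    kh = 2.0 * np.pi * H
    return 2.0 * np.sin(kh / 2.0) / kh

H = 0.2  # 5 grid points per wavelength
print(phase_ratio_consistent(H))  # > 1: numerical wave too fast (leading)
print(phase_ratio_lumped(H))      # < 1: numerical wave too slow (lagging)
```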

Fig. 10

Rayleigh-quotient dispersion analysis of GLC spectral-element methods of degree \(N=1,2,4,8\) using leapfrog time integration; the CFL parameter is \(r=0.5/N\) [452]. Left: consistent mass matrix; right: lumped mass matrix

It is natural to seek a combination of lumped and consistent mass matrices that balances the over- and undershoots of these approaches, reducing numerical dispersion [213, 428, 452, 453]. For instance, Fig. 11 shows the numerical phase velocity of the optimal blended operators [452] for the same example provided in Fig. 10. The dispersion error remains lower than that of consistent and lumped elements up to nearly the resolution limit of \(\pi \) grid points per wavelength of Chebyshev collocation points [454].

Fig. 11

Rayleigh-quotient dispersion analysis of optimal blended GLC spectral-element methods of degree \(N=1,2,4,8\) using leapfrog time integration; the CFL parameter is \(r=0.5/N\) [452]

An alternative approach is to seek the coefficients of the mass and stiffness matrices that minimize dispersion [254]. The search for optimal operators is well known in the context of finite-difference methods [455,456,457,458] and can be performed in a framework that is valid for most numerical methods [459].

8 Final remarks

Most numerical methods presented in this survey have reached comparable levels of efficiency and scope, and continue to evolve. Multiscale and multiphysics modeling should push these techniques further, and community coding narrows the gap between theoretical advances and practical applications.

It is worth noting that some ideas developed for a method have been transferred to others. The concept of staggered grids from finite differences has been useful to pseudospectral methods, which in turn contributed back through curvilinear coordinates. Spectral elements have inspired discontinuous Galerkin methods to seek higher accuracy by using orthogonal polynomials, which is also a contribution from pseudospectral methods. Such an exchange corroborates the relevance of each family of methods to the overall progress of numerical modeling.