1 Introduction

Numerical modeling for elastic wave propagation is a powerful tool in seismic data processing and interpretation (Duan et al. 2013). The numerical modeling has extensive applications in 2D seismic exploration, but it suffers inherent weaknesses and certain limitations for complex media. In recent years, seismic explorations have gradually shifted to areas with complex geological conditions, such as regions with mountains, deserts, faulted basins (Yang et al. 2015). As a result, 3D numerical modeling has attracted growing attention for high-precision seismic exploration. 3D wave equations can comprehensively describe seismic wavefield characteristics and propagation laws, and provide the theoretical basis for the acquisition, processing, interpretation (Xia et al. 2004). In addition, the elastic wavefield carries rich information related to the earth media, which means that it may provide more underground geological and structural information (Yong et al. 2016; Yang et al. 2018). Therefore, the high-precision simulation of the 3D elastic waveform plays a pivotal role in seismic exploration technologies.

Many numerical methods have been developed to solve wave equations, including the finite-difference (FD) methods, pseudo-spectral methods (Reshef et al. 1988), finite element methods (Marfurt 1984), spectral element methods (Zhu et al. 2011), etc. FD methods are generally popular in seismic numerical modeling because of their small computational cost and easy implementation (Alford et al. 1974; Dablain 1986; Etgen 2007; Moczo et al. 2007), while they suffer the inherent numerical dispersion because we have to approximate the differential operators with the truncated difference operators. In addition, there will lead to strong dispersion artifacts for modeling high-frequency wave propagations with large spatial spacing (Holberg 1987; Zhang and Yao 2012, 2013). Numerical dispersion is an inevitable problem in FD methods, which directly affects the accuracy of the wavefield simulation. To reduce the numerical dispersion, we can adopt a finer mesh or a lower wavelet frequency, but a thinner grid means high computational cost, reducing computing efficiency and increasing the storage burden on the computer. Optimizing FD coefficients is an effective approach to suppress numerical dispersion without increasing the computational cost and memory storage.

So far, various methods of optimizing FD coefficients have been developed, which can be mainly divided into two categories: Taylor expansion methods (Jastram and Behle 1993; Finkelstein and Kastner 2007; Liu and Sen 2009), optimization-based methods (Chen et al. 2010; Liu and Sen 2011; Wang et al. 2014). The first category consumes a lower computational cost than the other one for solving the dispersion equations, and it usually provides high accuracy on a small frequency zone (Ren and Liu 2015). The second category acquires FD coefficients by minimizing phase velocity errors, which possesses a broad effective frequency band but sacrifices computational efficiency and still has the potential of not convergence (Liu 2013). In addition, FD coefficients can be optimized by truncating window functions. The conventional window functions mainly include Hanning windows (Zhou and Greenhalgh 1992), Gauss windows (Igel et al. 1995) and modified binomial windows (Chu and Stoffa 2012). Recently, several new truncation windows have gradually been developed, such as a Chebyshev’s auto-convolution window (Wang et al. 2015) and a truncation window based on the least-squares algorithm (Ren et al. 2018). These methods based on truncated windows can partially suppress numerical dispersion.

The staggered-grid (SG) scheme introduced by Yee (1966) is an alternative approach to solve the elastic wave equation. Virieux (1986) applies this scheme with second-order accuracy in both time and space to solve the P-SV wave equation in heterogenous media. Levander (1988) develops a fourth-order accuracy in space and second-order accuracy in time FD scheme based on the staggered-grid scheme. Afterward, some optimized FD methods based on the staggered-grid scheme are gradually applied to solve the elastic wave equation (Ma and Zhu 2003; Kosloff et al. 2010; Yang et al. 2014). The SG scheme is often applied to solve first-order velocity–stress equations, which is not conducive to memory storage because of the existence of intermediate variables. Bartolo et al. (2012) develop an equivalent SG scheme (ESG) to solve the second-order system of the acoustic equation, obtaining the same level of accuracy and stability as the conventional SG scheme. Based on previous methods, we construct an optimized equivalent staggered-grid scheme (OESG) to improve the accuracy of elastic wave simulation. Different from the acoustic wave equation, the spatial derivatives are coupled with the velocity terms in the elastic wave equation, which affects the weight distribution of the grid points when the spatial derivatives are approximated with the truncated difference operators. In this paper, we take P-wave, S-wave, and C-wave into account and obtain the optimized FD coefficients of 3D elastic wave modeling.

We start by deriving the plane wave solutions of the 3D elastic wave equation to construct the dispersion relationships. Then we give three sets of FD coefficients to make up the OESG scheme. Moreover, the Newton iteration method is used to solve the new dispersion equations by minimizing the relative phase velocity error in the time–space domain. Finally, we analyze the dispersion characteristics and give several results of numerical modeling, proving the validity and superiority of the presented method.

2 Methodology

2.1 The OESG scheme for the 3D elastic wave equation

The first-order velocity–stress elastic wave equation is popular in seismic forward modeling, but there are nine variables (three particle velocities and six stresses) propagated in 3D cases, increasing the storage demands (Li et al. 2018). Therefore, we will adopt the second-order elastic wave equation to simulate in this study. In the 3D Cartesian coordinate system, the second-order elastic wave equation with constant P- and S-wave speed in the isotropic media can be approximately expressed as

$$ \begin{aligned} & \frac{{\partial^{2} u}}{{\partial t^{2} }} = \alpha^{2} \frac{{\partial^{2} u}}{{\partial x^{2} }} + \beta^{2} \left( {\frac{{\partial^{2} u}}{{\partial y^{2} }} + \frac{{\partial^{2} u}}{{\partial z^{2} }}} \right) + \left( {\alpha^{2} - \beta^{2} } \right)\left( {\frac{{\partial^{2} v}}{\partial x\partial y} + \frac{{\partial^{2} w}}{\partial x\partial z}} \right) \\ & \frac{{\partial^{2} v}}{{\partial t^{2} }} = \alpha^{2} \frac{{\partial^{2} v}}{{\partial y^{2} }} + \beta^{2} \left( {\frac{{\partial^{2} v}}{{\partial x^{2} }} + \frac{{\partial^{2} v}}{{\partial z^{2} }}} \right) + \left( {\alpha^{2} - \beta^{2} } \right)\left( {\frac{{\partial^{2} u}}{\partial y\partial x} + \frac{{\partial^{2} w}}{\partial y\partial z}} \right), \\ & \frac{{\partial^{2} w}}{{\partial t^{2} }} = \alpha^{2} \frac{{\partial^{2} w}}{{\partial z^{2} }} + \beta^{2} \left( {\frac{{\partial^{2} w}}{{\partial x^{2} }} + \frac{{\partial^{2} w}}{{\partial y^{2} }}} \right) + \left( {\alpha^{2} - \beta^{2} } \right)\left( {\frac{{\partial^{2} u}}{\partial z\partial x} + \frac{{\partial^{2} v}}{\partial z\partial y}} \right) \\ \end{aligned} $$
(1)

where u, v, w are the particle displacement vectors in x-, y- and z-direction, respectively. t is the travel time. α and β denote P-wave and S-wave speeds, respectively, where we have α2 = (λ +2 μ)/ρ, β2 = μ/ρ. λ and μ are the Lamé parameters, and ρ is the medium density. Note that Eq. 1 is derived from the constant velocity media, where the model parameters λ, μ, ρ are assumed to be constant. To extend the new method from homogeneous model to heterogeneous model, we need to assume the local constant velocity, which is reasonable according to the numerical results of a two-layer model with large velocity variations given by Liu (2014). Hence, Eq. 1 can also handle heterogeneous models with this assumption.

We set three sets of FD coefficients with respect to wave speeds and write out the OESG scheme of Eq. 1 as

$$ \begin{aligned} & u_{0,0,0}^{1} = 2u_{0,0,0}^{0} - u_{0,0,0}^{ - 1} + \frac{{\alpha^{2} \Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {a_{l} a_{m} \left[ {\left( {u_{m + l - 1,0,0}^{0} - u_{ - m + l,0,0}^{0} } \right) - \left( {u_{m - l,0,0}^{0} - u_{ - m - l + 1,0,0}^{0} } \right)} \right]} } \\ & \quad + \frac{{\beta^{2} \Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {b_{l} b_{m} \left[ {\left( {u_{0,m + l - 1,0}^{0} - u_{0, - m + l,0}^{0} } \right) - \left( {u_{0,m - l,0}^{0} - u_{0, - m - l + 1,0}^{0} } \right) + \left( {u_{0,0,m + l - 1}^{0} - u_{0,0, - m + l}^{0} } \right) - \left( {u_{0,0,m - l}^{0} - u_{0,0, - m - l + 1}^{0} } \right)} \right]} } \\ & \quad + \frac{{\left( {\alpha^{2} - \beta^{2} } \right)\Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {c_{l} c_{m} \left[ {\left( {v_{m - 1/2,l - 1/2,0}^{0} - v_{ - m + 1/2,l - 1/2,0}^{0} } \right) - \left( {v_{m - 1/2, - l + 1/2,0}^{0} - v_{ - m + 1/2, - l + 1/2,0}^{0} } \right)} \right.} } \\ & \quad \left. { + \left( {w_{m - 1/2,0,l - 1/2}^{0} - w_{ - m + 1/2,0,l - 1/2}^{0} } \right) - \left( {w_{m - 1/2,0, - l + 1/2}^{0} - w_{ - m + 1/2,0, - l + 1/2}^{0} } \right)} \right], \\ \end{aligned} $$
(2a)
$$ \begin{aligned} & v_{1/2,1/2,0}^{1} = 2v_{1/2,1/2,0}^{0} - v_{1/2,1/2,0}^{ - 1} + \frac{{\alpha^{2} \Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {a_{l} a_{m} \left[ {\left( {v_{1/2,m + l - 1/2,0}^{0} - v_{1/2, - m + l + 1/2,0}^{0} } \right) - \left( {v_{1/2,m - l + 1/2,0}^{0} - v_{1/2, - m - l + 3/2,0}^{0} } \right)} \right]} } \\ & \quad + \frac{{\beta^{2} \Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {b_{l} b_{m} \left[ {\left( {v_{m + l - 1/2,1/2,0}^{0} - v_{ - m + l + 1/2,1/2,0}^{0} } \right) - \left( {v_{m - l + 1/2,1/2,0}^{0} - v_{ - m - l + 3/2,1/2,0}^{0} } \right) + \left( {v_{1/2,1/2,m + l - 1}^{0} - v_{1/2,1/2, - m + l}^{0} } \right)} \right.} } \\ & \quad \left. { - \left( {v_{1/2,1/2,m - l}^{0} - v_{1/2,1/2, - m - l + 1}^{0} } \right)} \right] + \frac{{\left( {\alpha^{2} - \beta^{2} } \right)\Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {c_{l} c_{m} \left[ {\left( {u_{m,l,0}^{0} - u_{ - m + 1,l,0}^{0} } \right) - \left( {u_{m, - l + 1,0}^{0} - u_{ - m + 1, - l + 1,0}^{0} } \right)} \right.} } \\ & \quad \left. { + \left( {w_{1/2,m,l - 1/2}^{0} - w_{1/2, - m + 1,l - 1/2}^{0} } \right) - \left( {w_{1/2,m, - l + 1/2}^{0} - w_{1/2, - m + 1, - l + 1/2}^{0} } \right)} \right], \\ \end{aligned} $$
(2b)
$$ \begin{aligned} & w_{1/2,0,1/2}^{1} = 2w_{1/2,0,1/2}^{0} - w_{1/2,0,1/2}^{ - 1} + \frac{{\alpha^{2} \Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {a_{l} a_{m} \left[ {\left( {w_{1/2,0,m + l - 1/2}^{0} - w_{1/2,0, - m + l + 1/2}^{0} } \right) - \left( {w_{1/2,0,m - l + 1/2}^{0} - w_{1/2,0, - m - l + 3/2}^{0} } \right)} \right]} } \\ & \quad + \frac{{\beta^{2} \Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {b_{l} b_{m} \left[ {\left( {w_{m + l - 1/2,0,1/2}^{0} - w_{ - m + l + 1/2,0,1/2}^{0} } \right) - \left( {w_{m - l + 1/2,0,1/2}^{0} - w_{ - m - l + 3/2,0,1/2}^{0} } \right) + \left( {w_{1/2,m + l - 1,1/2}^{0} - w_{1/2, - m + l,1/2}^{0} } \right)} \right.} } \\ & \quad \left. { - \left( {w_{1/2,m - l,1/2}^{0} - w_{1/2, - m - l + 1,1/2}^{0} } \right)} \right] + \frac{{\left( {\alpha^{2} - \beta^{2} } \right)\Delta t^{2} }}{{h^{2} }}\sum\limits_{l = 1}^{M} {\sum\limits_{m = 1}^{M} {c_{l} c_{m} \left[ {\left( {u_{m,0,l}^{0} - u_{ - m + 1,0,l}^{0} } \right) - \left( {u_{m,0, - l + 1}^{0} - u_{ - m + 1,0, - l + 1}^{0} } \right)} \right.} } \\ & \quad \left. { + \left( {v_{1/2,m - 1/2,l}^{0} - v_{1/2, - m + 1/2,l}^{0} } \right) - \left( {v_{1/2,m - 1/2, - l + 1}^{0} - v_{1/2, - m + 1/2, - l + 1}^{0} } \right)} \right], \\ \end{aligned} $$
(2c)

where \( \left( {u,v,w} \right)_{i,j,k}^{n} \) are displacement components at discrete positions (x, y, z) = (ih, jh, kh) in discrete times t = n∆t. h and ∆t denote spatial and temporal sampling intervals, respectively. M represents half of the spatial operator lengths. i, j, k are discrete points on different directions, and a, b, c are the FD coefficients associated with P-wave, S-wave and C-wave terms, respectively. We also display the 3D OESG stencil to help understand, as shown in Fig. 1.

Fig. 1
figure 1

The 3D OESG stencil. h is a mesh spacing. x, y, z denote the grid coordinates. In the figure above, all grid points, half grid points and surfaces should be corresponding components, which are not shown here for easy observation

2.2 Plane wave solution of the 3D elastic wave equation

Applying the spatial Fourier transform to Eq. 1, we have

$$ \begin{aligned} & \frac{{\partial^{2} \hat{u}\left( {\varvec{k},t} \right)}}{{\partial t^{2} }} = - \left[ {\alpha^{2} k_{x}^{2} \hat{u}\left( {\varvec{k},t} \right) + \beta^{2} \left( {k_{y}^{2} + k_{z}^{2} } \right)\hat{u}\left( {\varvec{k},t} \right) + \left( {\alpha^{2} - \beta^{2} } \right)\left( {k_{x} k_{y} \hat{v}\left( {\varvec{k},t} \right) + k_{x} k_{z} \hat{w}\left( {\varvec{k},t} \right)} \right)} \right] \\ & \frac{{\partial^{2} \hat{v}\left( {\varvec{k},t} \right)}}{{\partial t^{2} }} = - \left[ {\alpha^{2} k_{y}^{2} \hat{v}\left( {\varvec{k},t} \right) + \beta^{2} \left( {k_{x}^{2} + k_{z}^{2} } \right)\hat{v}\left( {\varvec{k},t} \right) + \left( {\alpha^{2} - \beta^{2} } \right)\left( {k_{y} k_{x} \hat{u}\left( {\varvec{k},t} \right) + k_{y} k_{z} \hat{w}\left( {\varvec{k},t} \right)} \right)} \right], \\ & \frac{{\partial^{2} \hat{w}\left( {\varvec{k},t} \right)}}{{\partial t^{2} }} = - \left[ {\alpha^{2} k_{z}^{2} \hat{w}\left( {\varvec{k},t} \right) + \beta^{2} \left( {k_{x}^{2} + k_{y}^{2} } \right)\hat{w}\left( {\varvec{k},t} \right) + \left( {\alpha^{2} - \beta^{2} } \right)\left( {k_{z} k_{x} \hat{u}\left( {\varvec{k},t} \right) + k_{z} k_{y} \hat{v}\left( {\varvec{k},t} \right)} \right)} \right] \\ \end{aligned} $$
(3)

where \( \left( {\hat{u},\hat{v},\hat{w}} \right)\left( {\varvec{k},t} \right) \) denote displacement components in the wavenumber domain, and k = (kx, ky, kz) = (kcosθcosϕ, kcosθsinϕ, ksinθ) denotes the wavenumber vector. θ and ϕ are the dip angle and azimuth angle, respectively. In a matrix form, Eq. 3 can be written in the following form, as

$$ L = AU, $$
(4)

where L is the second-order derivative of displacement components, as

$$ L = \left( {\frac{{\partial^{2} \hat{u}}}{{\partial t^{2} }},\frac{{\partial^{2} \hat{v}}}{{\partial t^{2} }},\frac{{\partial^{2} \hat{w}}}{{\partial t^{2} }}} \right)^{\text{T}} ,\quad U = \left( {\hat{u},\hat{v},\hat{w}} \right). $$
(5)

T denotes the transpose symbol. A is the coefficients matrix with

$$ A = - \left[ {\begin{array}{*{20}l} {\alpha^{2} k_{x}^{2} + \beta^{2} \left( {k_{y}^{2} + k_{z}^{2} } \right)} \hfill & {\left( {\alpha^{2} - \beta^{2} } \right)k_{x} k_{y} } \hfill & {\left( {\alpha^{2} - \beta^{2} } \right)k_{x} k_{z} } \hfill \\ {\left( {\alpha^{2} - \beta^{2} } \right)k_{x} k_{y} } \hfill & {\alpha^{2} k_{y}^{2} + \beta^{2} \left( {k_{x}^{2} + k_{z}^{2} } \right)} \hfill & {\left( {\alpha^{2} - \beta^{2} } \right)k_{y} k_{z} } \hfill \\ {\left( {\alpha^{2} - \beta^{2} } \right)k_{x} k_{z} } \hfill & {\left( {\alpha^{2} - \beta^{2} } \right)k_{y} k_{z} } \hfill & {\alpha^{2} k_{z}^{2} + \beta^{2} \left( {k_{x}^{2} + k_{y}^{2} } \right)} \hfill \\ \end{array} } \right]. $$
(6)

It turns out that the matrix A is a real symmetric matrix, concluding that there exists an orthogonal matrix P that makes A = P\( \varLambda \)P−1. Eigenvalues of the matrix A is λ1 = − α2k2, λ2 = λ3 = − β2k2. According to the theory of linear algebra, we can derive the orthogonal matrix P, as

$$ P = \left[ {\begin{array}{*{20}c} {\frac{{k_{x} }}{k}} & {\frac{{k_{z} }}{{\sqrt {k_{x}^{2} + k_{z}^{2} } }}} & {\frac{{- k_{x} k_{y} }}{{k\sqrt {k_{x}^{2} + k_{z}^{2} } }}} \\ {\frac{{k_{y} }}{k}} & 0 & {\frac{{\sqrt {k_{x}^{2} + k_{z}^{2} } }}{k}} \\ {\frac{{k_{z} }}{k}} & {\frac{{ - k_{x} }}{{\sqrt {k_{x}^{2} + k_{z}^{2} } }}} & {\frac{{ - k_{y} k_{z} }}{{k\sqrt {k_{x}^{2} + k_{z}^{2} } }}} \\ \end{array} } \right],\quad P^{ - 1} = P^{\text{T}} . $$
(7)

Therefore, Eq. 4 can be rewritten as

$$ L = P\left[ {\begin{array}{*{20}c} { - \alpha^{2} k^{2} } & 0 & 0 \\ 0 & { - \beta^{2} k^{2} } & 0 \\ 0 & 0 & { - \beta^{2} k^{2} } \\ \end{array} } \right]P^{ - 1} U. $$
(8)

For the above equations, we can easily find the general solution of Eq. 4, as

$$ \left[ {\begin{array}{*{20}c} {\hat{u}\left( {\varvec{k},t} \right)} \\ {\hat{v}\left( {\varvec{k},t} \right)} \\ {\hat{w}\left( {\varvec{k},t} \right)} \\ \end{array} } \right] = P\left[ {\begin{array}{*{20}c} {\exp \left( { \pm {\text{i}}\alpha kt} \right)} & {} & {} \\ {} & {\exp \left( { \pm {\text{i}}\beta kt} \right)} & {} \\ {} & {} & {\exp \left( { \pm {\text{i}}\beta kt} \right)} \\ \end{array} } \right]P^{ - 1} \left[ {\begin{array}{*{20}c} {\hat{u}\left( {\varvec{k},0} \right)} \\ {\hat{v}\left( {\varvec{k},0} \right)} \\ {\hat{w}\left( {\varvec{k},0} \right)} \\ \end{array} } \right], $$
(9)

where \( \left( {\hat{u},\hat{v},\hat{w}} \right)\left( {\varvec{k},0} \right) \) denote the initial values of displacement components. Finally, we can acquire the plane wave solution of the 3D elastic wave by writing Eq. 9 as an expression form of Eq. 10.

$$ \begin{aligned} & \hat{u}\left( {\varvec{k},t} \right) = \frac{1}{k ^2} \left[ {k_{x}^{2} \exp \left( { \pm {\text{i}}\alpha kt} \right) + \left( {k_{y}^{2} + k_{z}^{2} } \right)\exp \left( { \pm {\text{i}}\beta kt} \right)} \right]\hat{u}\left( {\varvec{k},0} \right) \\ & \quad \quad \quad \quad + \frac{1}{k ^2}\left[ {\exp \left( { \pm {\text{i}}\alpha kt} \right) - \exp \left( { \pm {\text{i}}\beta kt} \right)} \right]\left[ {k_{x} k_{y} \hat{v}\left( {\varvec{k},0} \right) + k_{x} k_{z} \hat{w}\left( {\varvec{k},0} \right)_{y} } \right], \\ & \hat{v}\left( {\varvec{k},t} \right) = \frac{1}{k ^2}\left[ {k_{y}^{2} \exp \left( { \pm {\text{i}}\alpha kt} \right) + \left( {k_{x}^{2} + k_{z}^{2} } \right)\exp \left( { \pm {\text{i}}\beta kt} \right)} \right]\hat{v}\left( {\varvec{k},0} \right) \\ & \quad \quad \quad \quad + \frac{1}{k ^2}\left[ {\exp \left( { \pm {\text{i}}\alpha kt} \right) - \exp \left( { \pm {\text{i}}\beta kt} \right)} \right]\left[ {k_{y} k_{x} \hat{u}\left( {\varvec{k},0} \right) + k_{y} k_{z} \hat{w}\left( {\varvec{k},0} \right)_{y} } \right], \\ & \hat{w}\left( {\varvec{k},t} \right) = \frac{1}{k ^2}\left[ {k_{z}^{2} \exp \left( { \pm {\text{i}}\alpha kt} \right) + \left( {k_{x}^{2} + k_{y}^{2} } \right)\exp \left( { \pm {\text{i}}\beta kt} \right)} \right]\hat{w}\left( {\varvec{k},0} \right) \\ & \quad \quad \quad \quad + \frac{1}{k ^2}\left[ {\exp \left( { \pm {\text{i}}\alpha kt} \right) - \exp \left( { \pm {\text{i}}\beta kt} \right)} \right]\left[ {k_{z} k_{x} \hat{u}\left( {\varvec{k},0} \right) + k_{z} k_{y} \hat{v}\left( {\varvec{k},0} \right)_{y} } \right]. \\ \end{aligned} $$
(10)

2.3 Solving FD coefficients based on the time–space dispersion relations with Newton’s method

With the spatial Fourier transform on both sides of Eqs. 2a2c, then substituting Eq. 10 and simplifying, we obtain three dispersion relations, as

$$ \sin^{2} \left( {0.5\alpha k\Delta t} \right) = \frac{{\alpha^{2} \Delta t^{2} }}{{h^{2} }}\left\{ {\left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right]^{2} } \right\}, $$
(11a)
$$ \sin^{2} \left( {0.5\beta k\Delta t} \right) = \frac{{\beta^{2} \Delta t^{2} }}{{h^{2} }}\left\{ {\left[ {\sum\limits_{m = 1}^{M} {b_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {b_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {b_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right]^{2} } \right\}, $$
(11b)
$$ \begin{aligned} & \frac{{k_{x} k_{y} }}{{k^{2} }}\left( {\sin^{2} \left( {0.5\alpha k\Delta t} \right) - \sin^{2} \left( {0.5\beta k\Delta t} \right)} \right) \\ & \quad \quad = \frac{{\left( {\alpha^{2} - \beta } \right)^{2} \Delta t^{2} }}{{h^{2} }}\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right] \\ & \frac{{k_{y} k_{z} }}{{k^{2} }}\left( {\sin^{2} \left( {0.5\alpha k\Delta t} \right) - \sin^{2} \left( {0.5\beta k\Delta t} \right)} \right) \\ & \quad \quad = \frac{{\left( {\alpha^{2} - \beta } \right)^{2} \Delta t^{2} }}{{h^{2} }}\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right]. \\ & \frac{{k_{z} k_{x} }}{{k^{2} }}\left( {\sin^{2} \left( {0.5\alpha k\Delta t} \right) - \sin^{2} \left( {0.5\beta k\Delta t} \right)} \right) \\ & \quad \quad = \frac{{\left( {\alpha^{2} - \beta^{2} } \right)\Delta t^{2} }}{{h^{2} }}\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right] \\ \end{aligned} $$
(11c)

Detailed derivations can be found in “Appendix 1”. Equations 11a and 11b have the same forms, only with different wave speeds. Therefore, we can compute FD coefficients associated with the P- and S-wave speeds by solving Eq. 11a. Equation 11a cannot be solved directly because of its nonlinearity. Therefore, we need to construct the objective function based on the L2 norm theory, as

$$ F\left( a \right) = \frac{1}{2}\sum\limits_{k = 0}^{{k_{\text{e}} }} {\sum\limits_{\theta = 0}^{\pi /4} {\sum\limits_{\phi = 0}^{\pi /4} {\left\| {\frac{{r^{2} q\left( a \right)}}{{\sin^{2} \left( {0.5\alpha k\Delta t} \right)}} - 1} \right\|} } } , $$
(12)

where

$$ \begin{aligned} & q\left( a \right) = \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right]^{2} \\ & \quad \quad + \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right]^{2} , \\ \end{aligned} $$

and ke denotes the effective wavenumber range. r is the CFL (Courant–Friedrichs–Lewy) number, which can be represented as r = α∆t/h. We adopt the iterative format of Newton’s method to solve Eq. 11a, as

$$ a^{k + 1} = a^{k} - \left( {H^{k} } \right)^{ - 1} g^{k} , $$
(13)

where a represents FD coefficients, and k is the number of iterations. We can acquire the Hessian matrix H and gradient vector g of the objective function by taking first-order and second-order derivatives of Eq. 12. Details of the derivations are shown in “Appendix 2”. As a result, we can get the FD coefficients a and b by Eq. 13. To accelerate the convergence, the initial iteration values should be set as the conventional SG FD coefficients, and the error limit is set to 1% as the termination condition (Yong et al. 2017). Due to the fast convergence speed of Newton’s method, it usually takes only a few iterations to get a satisfactory result.

As for solving Eq. 11c, there is a trouble that the denominator could be zero if we just divide it by sin2(0.5αk∆t) − sin2(0.5βk∆t), resulting in non-convergence of the algorithm. Thus, we choose to break this term down into two terms by matching the terms of α and β, as

$$ \begin{aligned} & F\left( c \right) = \frac{1}{6}\sum\limits_{k = 0}^{{k_{\text{e}} }} {\sum\limits_{\theta = 0}^{\pi /4} {\sum\limits_{\phi = 0}^{\pi /4} {\left\{ {\left\| {\frac{{r_{1}^{2} k^{2} qxy\left( c \right)}}{{k_{x} k_{y} \sin^{2} \left( {0.5\alpha k\Delta t} \right)}} - 1} \right\| + \left\| {\frac{{r_{2}^{2} k^{2} qxy\left( c \right)}}{{k_{x} k_{y} \sin^{2} \left( {0.5\beta k\Delta t} \right)}} - 1} \right\| + \left\| {\frac{{r_{1}^{2} k^{2} qyz\left( c \right)}}{{k_{y} k_{z} \sin^{2} \left( {0.5\alpha k\Delta t} \right)}} - 1} \right\|} \right.} } } \\ & \quad \quad + \left. {\left\| {\frac{{r_{2}^{2} k^{2} qyz\left( c \right)}}{{k_{y} k_{z} \sin^{2} \left( {0.5\beta k\Delta t} \right)}} - 1} \right\| + \left\| {\frac{{r_{1}^{2} k^{2} qzx\left( c \right)}}{{k_{z} k_{x} \sin^{2} \left( {0.5\alpha k\Delta t} \right)}} - 1} \right\| + \left\| {\frac{{r_{2}^{2} k^{2} qzx\left( c \right)}}{{k_{z} k_{x} \sin^{2} \left( {0.5\beta k\Delta t} \right)}} - 1} \right\|} \right\}, \\ \end{aligned} $$
(14)

where

$$ \begin{aligned} & qxy\left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right] \\ & qyz\left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{y} h} \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right] \\ & qzx\left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{z} h} \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)k_{x} h} \right)} } \right],r_{1} = \frac{\alpha \Delta t}{h},r_{2} = \frac{\beta \Delta t}{h}. \\ \end{aligned} $$

Similarly, the detailed derivations of first-order and second-order derivatives for Eq. 11c are contained in “Appendix 2”. Therefore, we can also get the FD coefficients c by Eq. 13.

2.4 Numerical dispersion analysis

It can be seen from the above analyses that three kinds of FD coefficients are solved separately, indicating that the dispersion analyses for them should also be separate. As a result, using the relation \( f = kv/2\pi \), we construct three dispersion equations taking the frequency as an independent variable, as

$$ \begin{aligned} & \delta \left( {a,\theta ,\phi ,f} \right) = \frac{{r_{1}^{2} qq\left( a \right)}}{{\sin^{2} \left( {\pi f\Delta t} \right)}},\varepsilon \left( {b,\theta ,\phi ,f} \right) = \frac{{r_{2}^{2} qq\left( b \right)}}{{\sin^{2} \left( {\pi f\Delta t} \right)}}, \\ & \eta \left( {c,\theta ,\phi ,f} \right) = \frac{1}{6}\left\{ {\frac{{r_{1}^{2} qqxy_{1} \left( c \right) + r_{2}^{2} qqxy_{2} \left( c \right)}}{{\cos^{2} \theta \cos \phi \sin \phi \sin^{2} \left( {\pi f\Delta t} \right)}} + \frac{{r_{1}^{2} qqyz_{1} \left( c \right) + r_{2}^{2} qqyz_{2} \left( c \right)}}{{\cos \theta \sin \theta \sin \phi \sin^{2} \left( {\pi f\Delta t} \right)}}} \right. \\ & \quad \quad \quad \quad \quad \left. { + \frac{{r_{1}^{2} qqzx_{1} \left( c \right) + r_{2}^{2} qqzx_{2} \left( c \right)}}{{\cos \theta \sin \theta \cos \phi \sin^{2} \left( {\pi f\Delta t} \right)}}} \right\}, \\ \end{aligned} $$
(15)

where

$$ \begin{aligned} qq\left( a \right) & = \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\cos \theta \cos \phi } \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\cos \theta \sin \phi } \right)} } \right]^{2} \\ & \quad + \left[ {\sum\limits_{m = 1}^{M} {a_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\sin \theta } \right)} } \right]^{2} , \\ \end{aligned} $$
$$ \begin{aligned} qq\left( b \right) & = \left[ {\sum\limits_{m = 1}^{M} {b_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\cos \theta \cos \phi } \right)} } \right]^{2} + \left[ {\sum\limits_{m = 1}^{M} {b_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\cos \theta \sin \phi } \right)} } \right]^{2} \\ & \quad + \left[ {\sum\limits_{m = 1}^{M} {b_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\sin \theta } \right)} } \right]^{2} , \\ \end{aligned} $$
$$ \begin{aligned} & qqxy_{1} \left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\cos \theta \cos \phi } \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\cos \theta \sin \phi } \right)} } \right] \\ & qqxy_{2} \left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\cos \theta \cos \phi } \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\cos \theta \sin \phi } \right)} } \right] \\ & qqyz_{1} \left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\cos \theta \sin \phi } \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\sin \theta } \right)} } \right] \\ & qqyz_{1} \left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\cos \theta \sin \phi } \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\sin \theta } \right)} } \right] \\ & qqzx_{1} \left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\sin \theta } \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\alpha }h\cos \theta \cos \phi } \right)} } \right] \\ & qqzx_{2} \left( c \right) = \left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\sin \theta } \right)} } \right]\left[ {\sum\limits_{m = 1}^{M} {c_{m} \sin \left( {\left( {m - 0.5} \right)\frac{2\pi f}{\beta }h\cos \theta \cos \phi } \right)} } \right]. \\ \end{aligned} $$

For Eq. 15, it is worth noting that the new independent variable of the dispersion equations is the frequency f instead of kh. Next, based on Eqs. 15, we give several dispersion curves in the homogeneous media. The first test is a low-speed medium, whose parameters are α = 2000 m/s, β =1154 m/s, h = 15 m, ∆t = 0.5 ms, M = 5, shown in Fig. 2. Note that all simulation parameters should meet the stability condition, whose governing equation will be given below.

Fig. 2
figure 2

Dispersion curves of three wave terms with different methods at α = 2000 m/s, β =1154 m/s, h = 15 m, ∆t = 0.5 ms, M = 5. ac The ESG method; df The OESG method

Note that the effective frequency range in Fig. 2 is about 0-38 Hz, which can be calculated by the spatial domain sampling theorem f < β/2 h. Figure 2 shows the dispersion curves of three wave terms with the ESG and OESG methods in a constant velocity medium. One can see that all the curves in Fig. 2 almost lie underneath the ideal one for all frequency ranges, indicating that there suffers the spatial dispersion. In addition, the dispersion errors of the three wave terms are different under the same medium parameters, indicating that it is necessary to analyze the three wave terms separately. As can be seen from Fig. 2, the P-wave term suffers the smallest dispersion errors among the three terms, and the S-wave term has the largest errors.

Comparing Fig. 2a with d, we can conclude that the OESG method performs fewer errors of P-wave than the traditional ESG method, which reveals the advantage of the presented method. For the errors of S-wave and C-wave, the OESG method also can reduce dispersion errors compared with the ESG method in the same frequency range, shown in Fig. 2b–c and e–f. Figure 2 also displays the error values of three wave terms when f = 30 Hz, θ = ϕ = 0 (See the black arrows), providing a piece of evidence for the above conclusions. In addition, the dispersion errors vary with different propagation angles. There is a maximal dispersion error on the condition of θ = ϕ = 0 and minimal dispersion error on the condition of θ = ϕ = π/4, indicating that the propagation angles are the factor that affects the numerical dispersion. In general, the OESG method can effectively suppress spatial dispersion and has higher simulation precision than the traditional ESG method.

To test the presented method’s accuracy in a high-velocity medium, we increase the P-wave and S-wave speed to α = 4000 m/s, β =2309 m/s, shown in Fig. 3. Comparing with Fig. 3a–c, d–f shows weaker spatial dispersion of three wave terms in the same frequency values, indicating that the presented can also effectively reduce spatial dispersion compared to the traditional ESG method in high-velocity media. Figure 3 also marks the percentage error values at some points, providing a piece of evidence for verifying the superiority of the OESG method in suppressing spatial dispersion. In addition, temporal dispersion occurs in a high-speed medium even though the time step is small, shown in Fig. 3a (see the red box). To further investigate the ability of our method to suppress temporal dispersion for the elastic wave equation modeling, we move ∆t = 0.5 ms to ∆t = 1 ms, and then introduce an optimal method based on least-squares (LS) algorithm (Yang et al. 2014). As a result, the three wave terms suffer severe temporal dispersion, shown in Fig. 4.

Fig. 3
figure 3

Dispersion curves of three wave terms with different methods at α = 4000 m/s, β =2309 m/s, h = 15 m, ∆t = 0.5 ms, M = 5. ac The ESG method; df The OESG method

Fig. 4
figure 4

Dispersion curves of three wave terms with different methods at α = 4000 m/s, β =2309 m/s, h = 15 m, ∆t = 1 ms, M = 5. ac The ESG method; df The LS method; gi The OESG method

The black arrows in Fig. 4 indicate the temporal dispersion errors of the three methods when f = 60 Hz, θ = ϕ = π/4. By comparison, we know that the OESG method has the minimum temporal dispersion errors among the three methods. Comparing Fig. 4h–i with b–c, it can be seen that the OESG method also effectively suppress spatial dispersion compared to the traditional ESG method. In Fig. 4e–f and h–i, one can see that the LS method based on spatial domain optimization has less spatial dispersive error than the OESG method when frequency is high. Note that the OESG method based on time–space domain optimization can simultaneously suppress spatial and temporal dispersion to attain an optimal result, which leads to an average effect between the spatial dispersion and temporal dispersion. Therefore, the OESG method is slightly weaker than the LS method in suppressing spatial dispersion, while the LS method performs poor in suppressing temporal dispersion.

In addition, the P- to S-wave speed ratio (α/β) is also a factor affecting the numerical dispersion. In some special media such as sedimentary basins, α/β is as large as five and even larger (Moczo et al. 2010). Based on Fig. 2d–f, we move α = 2000 m/s, β = 1143 m/s to α= 2000 m/s, β =400 m/s, and obtain corresponding dispersion error curves, shown in Fig. 5. The maximum frequency f decreases to about 13 Hz according to the spatial domain sampling theorem, which means that the FD method suffers the severe numerical dispersion using a wavelet with high frequency. The numerical dispersion can be reduced effectively by shortening mesh spacing. Therefore, for the high α/β ratio areas, fine space girds are used.

Fig. 5
figure 5

Dispersion curves of three wave terms with the OESG method at α = 2000 m/s, β =400 m/s, h = 15 m, ∆t = 0.5 ms, M = 5. a P-wave; b S-wave; c C-wave

2.5 Stability analysis and boundary conditions

When solving wave equations, it is very important that the solution of the discrete difference equation is stable or not. There are two main reasons for the instability in the process of numerical simulation. On the one hand, FD coefficients are sensitive to the optimized wavenumber range. FD methods will be unstable when the optimized wavenumber range is too small. According to the sampling theorem, we have kmax= π/h, which is maximal wavenumber. We define fmax= 2.5 f0 and take kcon= 2πfmax/vmax (where, kcon< kmax) as the wavenumber range to optimize FD coefficients, while the iterative process will be unstable when k is too small. In this paper, we adopt Yong’s (2017) method to modify the wavenumber range. The new wavenumber ke can be defined as

$$ \begin{aligned} & \tau \left( {k_{\text{opt}} } \right) = \frac{{v_{\text{ph}} }}{{v_{\text{real}} }} - 1 > - 0.005 \\ & k_{\text{e}} = \hbox{max} \left( {k_{\text{con}} ,k_{\text{opt}} } \right), \\ \end{aligned} $$
(16)

where vph, vreal denote the phase and real velocities, respectively. τ is the relative phase velocity error, and kopt is the optimized wavenumber. We will choose the optimized wavenumber range strictly according to Eq. 16 in the following numerical examples.

On the other hand, improper calculation parameters will also lead to the instability of algorithms. At present, the Fourier analysis and matrix analysis are mainly used to qualitatively and quantitatively study stability conditions for different elastic wave equations. We derive the stability condition of the presented method by the matrix analysis method, as

$$ \hbox{max} \left( {\alpha_{\hbox{max} } \Delta t\sqrt {\frac{{\sum\nolimits_{m = 1}^{M} {\left| {a_{m} } \right|^{2} } }}{{\Delta x^{2} }} + \frac{{\sum\nolimits_{m = 1}^{M} {\left| {b_{m} } \right|^{2} } }}{{\Delta y^{2} }} + \frac{{\sum\nolimits_{m = 1}^{M} {\left| {b_{m} } \right|^{2} } }}{{\Delta z^{2} }},} \quad \alpha_{\hbox{max} } \Delta t\sqrt {\sum\nolimits_{m = 1}^{M} {\left| {c_{m} } \right|^{2} } \left( {\frac{1}{{\Delta x^{2} }} + \frac{1}{{\Delta y^{2} }} + \frac{1}{{\Delta z^{2} }}} \right)} } \right) \le 1. $$
(17)

The detailed derivation process is included in “Appendix 3”. Subsequent algorithm tests show that the presented method has almost the same stability as the traditional ESG method. Besides, we adopt absorption boundary conditions, whose attenuation function can be written as (Cerjan et al. 1985)

$$ G = \exp \left\{ { - \left[ {0.015\left( {npd - i} \right)} \right]^{ 2} } \right\}, $$
(18)

where npd is the thickness of the absorption layer, and i denotes spatial coordinates.

3 Numerical examples

In this section, we test some models, including several homogeneous models, the two-layer model and the salt-dome model in the 3D case. The main objective of these simulations is to prove the validity and superiority of the OESG method compared with the traditional methods. In these examples, we will use a Ricker wavelet source that is simultaneously added to three displacement components. In addition, we consider the previously mentioned absorbing boundary condition and give npd =30 in all the simulations. Finally, for the 3D elastic wave equation, there are three displacement components. Here, we can draw our conclusions by analyzing one of the three components.

3.1 The homogeneous model

The first model is a homogeneous model with lower velocity (α = 2000 m/s, β =1154 m/s), which is modeled using the numerical parameters h = 15 m and ∆t = 0.5 ms in the grid dimensions of 200 × 200 × 200. The source whose peak frequency is 14 Hz is located at the center of the model. Figure 6 displays the snapshots of different sections with the ESG and OESG methods. It exhibits a strong spatial dispersion for the S-wave term with the ESG method in Fig. 6a. We can shrink the sampling intervals or increase the spatial FD order to improve the simulation results, but there are all sorts of problems such as oversize computational cost and the instability of the algorithm. The OESG method based on time–space domain optimization can effectively reduce numerical dispersion, shown in Fig. 6b.

Fig. 6
figure 6

Snapshots of displacement components at 0.6 s of different methods with α = 2000 m/s, β =1154 m/s, h = 15 m, ∆t = 0.5 ms, f = 14 Hz. a The ESG method with M = 5; b The OESG method with M = 5; c The OESG method with M = 3; d The reference method

To further investigate the simulation accuracy of the OESG method, we reduce the spatial FD order of the OESG method to M = 3, shown in Fig. 6c. It can be seen that Fig. 6c performs the similar modeling accuracy to Fig. 6a, which suggests that the presented method with the low spatial FD order can achieve the accuracy requirement of the traditional ESG method with the high spatial FD order.

Besides, we extract single vertical slices of snapshots and adopt the reference method (see Fig. 6d) with the minimum possible dispersion errors by reducing the mesh step. As shown in Fig. 7, the reference method matches well with the OESG method with M = 5, but poor with the ESG method with M = 5. Table 1 gives a statistic of relative amplitude errors in Fig. 7. From Table 1, we know that the OESG method with M = 3 has smaller relative amplitude errors than the ESG method with M = 5, which means that forward modeling using the OESG method can save the computational cost of FD operator lengths compared to the traditional ESG method. The relative amplitude errors can be calculated by

Fig. 7
figure 7

Vertical slice of snapshots at x = 1500 m, y = 1500 m, z = 600 m-1035 m in Fig. 6. a The ESG method with M = 5; b The OESG method with M = 3; c The OESG method with M = 5

Table 1 The relative amplitude errors between different methods and the reference method
$$ E = {{\sum\limits_{n = 1}^{N} {\left[ {U_{\text{cal}} \left( z \right) - U_{\text{ref}} \left( z \right)} \right]^{2} } } \mathord{\left/ {\vphantom {{\sum\limits_{n = 1}^{N} {\left[ {U_{\text{cal}} \left( z \right) - U_{\text{ref}} \left( z \right)} \right]^{2} } } {\sum\limits_{n = 1}^{N} {\left[ {U_{\text{ref}} \left( z \right)} \right]^{2} } }}} \right. \kern-0pt} {\sum\limits_{n = 1}^{N} {\left[ {U_{\text{ref}} \left( z \right)} \right]^{2} } }} $$
(19)

where Ucal, Uref denote the calculated wavefield values and reference values along the z-direction, respectively. N is the total number of the calculated wavefields.

We continue to test the accuracy of the presented method in a high-speed medium. We give the snapshots of the ESG and OESG methods, where α = 4000 m/s, β =2309 m/s, f = 30 Hz, shown in Fig. 8. Comparing Fig. 8a with b, the traditional ESG method suffers severer numerical dispersion than the OESG method, indicating that the presented method can effectively suppress numerical dispersion in high-speed media. In general, the presented method can suppress the numerical dispersion in the low- and high-speed media, which is also consistent with the above dispersion analyses.

Fig. 8
figure 8

Snapshots of displacement component at 0.5 s of different methods with α = 4000 m/s, β =2309 m/s, h = 15 m, ∆t = 0.5 ms, f = 30 Hz. a The ESG method with M = 5; b The OESG method with M = 5

3.2 The two-layer model

It can be known from the above analyses that the FD coefficients of the OESG method are calculated in the time–space domain and correspond to the wave speed values one to one. As a result, the OESG method has higher simulation accuracy in the heterogeneous media than the traditional ESG method. Therefore, we first test a two-layer model, shown in Fig. 9. The gird dimensions are 201 × 201 × 201. The spatial and temporal sampling intervals are h = 15 m, ∆t = 0.5 ms. The P-wave velocities in the first and second layers are 2000 m/s and 2500 m/s, respectively. The S-wave speed is obtained by the ratio α/β = 1.73. The source whose peak frequency is 14 Hz is located at the location (1500 m, 1500 m, 1500 m). For the two-layer model in Fig. 9, we give its FD coefficients according to P-wave speeds, listed in Table 2.

Fig. 9
figure 9

Two-layer model

Table 2 The list of FD coefficients with M = 5, h = 15 m, ∆t = 0.5 ms for the two-layer model

Figure 10 shows the simulation results of different methods. In Fig. 10, we can observe the reflections, transmissions of P-wave, S-wave, and C-wave:

Fig. 10
figure 10

Snapshots of wavefield at 0.7 s for the two-layer model with different methods. a The ESG method with M = 5; b the OESG method with M = 5

  • A, B, and C denote the direct S-wave, the reflected S-wave, and the transmitted S-wave, respectively;

  • D, E, and F denote the direct P-wave, the reflected P-wave and the transmitted P-wave, respectively;

  • G and H denote the reflection and transmission for the converted SP-wave, respectively;

  • I and J denote the reflection and transmission for the converted PS-wave,

which illustrates that the OESG method is effective for numerical forward modeling of the heterogeneous medium. In addition, comparing Fig. 10b with Fig. 10a, we know that the traditional ESG method suffers the severer spatial dispersion for the S-wave term than the OESG method (see black arrows).

3.3 The 3D salt-dome model

Next, we consider the 3D salt-dome model whose velocity range is 1500 to 4482 m/s, shown in Fig. 11. The model dimensions are 225 × 225 × 201. The spatial and temporal sampling intervals are h = 15 m, ∆t = 0.5 ms. The S-wave speeds are obtained by the ratio α/β = 1.73. The source whose peak frequency is 14 Hz is located at the location (1680 m, 1680 m, 0 m). For the 3D salt-dome model, we need to discretize the velocity model with a sampling interval. Given a sampling interval of 5 m/s, about 600 sets of FD coefficients are required, but we can store them in the computer before starting numerical simulations. Table 3 lists the partial FD coefficients for simulations of the 3D salt-dome model.

Fig. 11
figure 11

3D salt-dome model

Table 3 The list of FD coefficients with M = 5, h = 15 m, ∆t = 0.5 ms for the salt-dome model

Figure 12 shows the snapshots of the 3D salt-dome model at 0.9 s with different methods. On the one hand, the reflections from the salt-dome are distinct and the location of salt-dome anomalies in Fig. 11 can be well matched in Fig. 12, proving the validity of the OESG method in the heterogeneous media. On the other hand, from the black arrows and red boxes in Fig. 12a–c, we can see that the traditional ESG method can lead to severer numerical dispersion than the OESG and LS methods.

Fig. 12
figure 12

Snapshots at 0.9 s for the salt model with different methods. a The ESG method with M = 5; b The LS method with M = 5; c The OESG method with M = 5; d The reference method

To facilitate analysis, we extract some single traces from common-shot gathers with a time record of 1.5 s (common-shot gathers are found in “Appendix 4”) and adopt the reference method (a)

with the lowest possible dispersion errors by increasing the order of the FD scheme, shown in Fig. 13. One can see that the OESG method is very close to the reference method (See red dashed in Fig. 13), while there is always a mismatch between the LS method and the reference method (See green dashed in Fig. 13). The blue dashed represents the traditional method, which has the biggest amplitude errors among the three methods. Table 4 shows the relative amplitude errors of seismograms in Fig. 13. By comparison, the OESG method has lower relative amplitude errors than the other two methods. From the above analyses, it can be concluded that the OESG method is superior to the other two methods for 3D elastic waveform modeling in complex med.

Fig. 13
figure 13

Single traces of different methods at ax = 1500 m, y = 1680 m, bx = 1680 m, y = 1680 m cx = 1680 m, y = 1500 m, and dx = 1680 m, y = 2250 m

Table 4 The relative amplitude errors between different methods and the reference method

4 Discussions

FD methods have always been used to solve partial differential equations (acoustic wave and elastic wave equations) on seismic exploration (Keiiti and Larner 1970; Kindelan et al. 1990), but it suffers the dispersion errors because of the inherent characteristics of its algorithm. In this paper, we adopt the second-order elastic wave equation in terms of pressure only, avoiding the first-order derivative variables and thus reducing memory requirements. To improve simulation accuracy, we obtain the optimized FD coefficients by minimizing the relative phase velocity error.

There are two different points from other optimized methods. On the one hand, we get new plane wave solutions of the 3D elastic wave equation, as shown in Eq. 10. On the other hand, we consider the coupling between P-wave and S-wave when computing the second-order spatial derivatives, and then adopt three sets of coefficients with respect to the three sets of wave terms shown in Eq. 2. Intending to eliminate the instability caused by too narrow wavenumber range, we modify the optimized wavenumber range, which makes the algorithm more robust. In addition, we also derive the stability condition of the presented method, concluding that the OESG method has almost the same stability as the traditional ESG method.

Naturally, there are some drawbacks to the new method, such as the complexity of formula derivations. In this aspect, subsequent researchers can choose other iterative methods or objective functions to simplify the formula according to the requirements. Besides, the computational efficiency of the algorithm has not been improved, which limits its promotion and application in seismic imaging and inversion. For this problem, we plan to take the work of imaging and inversion to further work by GPU’s method.

5 Conclusions

The OESG method for the 3D elastic wave equation is developed to simultaneously suppress temporal and spatial dispersion. Different from the traditional FD methods, the OESG method possesses the new plane wave solution that is more appropriate to construct the time–space dispersion relationships. As a result, we obtain FD coefficients using the Newton iteration method based on new time–space dispersion equations. In addition, we also modify the wavenumber range and objective function, obtaining more satisfactory results. With a local constant velocity assumption, we simulate the propagation of elastic waves in the 3D homogeneous and heterogeneous media using the OESG method.

The presented method’s advantages mainly include three aspects. Firstly, the presented method has a lower spatial dispersion error than the traditional ESG method. Numerical examples show that the OESG method with M = 3 can achieve the simulation accuracy of the traditional ESG method with M = 5, which is conducive to save the amount of computation. Secondly, compared with the LS method, the presented method performs well in suppressing temporal dispersion, while it is slightly weaker than the LS method in suppressing spatial dispersion. However, the presented method has smaller relative amplitude errors than the LS method in 3D salt-dome model, shown in Table 4, indicating that the OESG method is superior to the LS method in complex media. Thirdly, FD coefficients of the presented method are calculated from medium velocities, meaning that it is more accurate to heterogeneous media than the LS and traditional ESG methods whose FD coefficients are the constant. Numerical examples prove that the presented method is superior to the LS and traditional ESG methods.