Appendix A: Formulation of perfectly matched layers with the Neumann boundary conditions
The PML technique [4] was first implemented and used in a finite-difference time-domain method for the computation of electromagnetic waves. A more straightforward and convenient approach [14] was introduced by using complex coordinate stretching to build the same PMLs. Here, we will follow the work for the isotropic case [65] with the caveat that the half-space problem with a Neumann boundary condition on the top requires some adaptations. That is, we will need a constraint for the damping function in constructing the PMLs.
We let \(S_{i}(x_{i})\) be a complex-valued damping function. We note that each \(S_{i}(x_{i})\) is only a function of \(x_{i}\) and is independent of the other coordinates. We adjust the partial derivatives, \(\partial _{x_{i}} \rightarrow \frac {1}{S_{i}} \partial _{x_{i}}\), with \(S_{i}\) being identically one in the domain of interest and complex-valued inside the PML region.
Numerically, we expect \(u_{l}|_{\partial X / \varSigma } \rightarrow 0\). The boundary value problem (1)–(2) takes the form
$$ \left\{\begin{array}{lll} (-\rho(x) \omega^{2} \delta_{il} - \frac{1}{S_{j}}\partial_{x_{j}} c_{ijkl}(x) \frac{1}{S_{k}}\partial_{x_{k}} )u_{l} &= 0 , \\ (c_{ijkl}\frac{1}{S_{k}} \partial_{x_{k}} u_{l}) \nu_{j} |_{\varSigma} &= g_{i}. \end{array}\right. $$
(12)
To arrive at the weak formulation, we carry out the following steps. We multiply both sides of the first equation in (12) by \(S_{1} S_{2} S_{3}\),
$$ (-S_{1} S_{2} S_{3} \rho(x) \omega^{2} \delta_{il} - \partial_{x_{j}} c_{ijkl}(x) \frac{S_{1} S_{2} S_{3}}{S_{j} S_{k}} \partial_{x_{k}}) u_{l} = 0 , $$
noting that \(S_{1} S_{2} S_{3}/S_{j}\) is not a function of \(x_{j}\), so that it commutes with \(\partial_{x_{j}}\). We now introduce the coefficients,
$$ \begin{array}{@{}rcl@{}} \tilde{\rho}(x) = S_{1} S_{2} S_{3} \rho(x) , \tilde{c}_{ijkl}(x) = c_{ijkl}(x) \frac{S_{1} S_{2} S_{3}}{S_{j} S_{k}} ,\\ x \in X \cup \partial X , \end{array} $$
(13)
where \(X \cup \partial X\) is the computational box with the PML inside X. When we apply the classical PML coefficient \(S_{j}\), we observe surface waves reflected from the corners of the upper surface. This is caused by a mismatch between the PML and the Neumann boundary condition. Here we modify the PML coefficient so that the boundary condition is handled properly. We let
$$ S_{j} |_{\partial X} = 1, \quad \text{for } j = 1,2,3. $$
(14)
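As an illustration of Eqs. 13 and 14, the following minimal Python sketch evaluates a damping function and the resulting coefficients at a single point. The function names `stretch` and `pml_coefficients`, the sin² profile, the sign convention \(1 + \mathrm{i}\sigma/\omega\), and the parameters `start`, `end` and `sigma_max` are assumptions made for this example only; they are not prescribed by the formulation above. The profile returns to one at the outer end of the layer, which is one simple way to mimic the constraint (14) along a coordinate direction.

```python
import numpy as np

def stretch(x, start, end, omega, sigma_max):
    """Hypothetical 1D damping function S(x) = 1 + i*sigma(x)/omega.

    sigma vanishes for x <= start (domain of interest), is positive inside
    the layer (start, end), and vanishes again at x = end, so that S = 1
    both in the interior and on the outer boundary.
    """
    x = np.asarray(x, dtype=float)
    # t = 0 in the domain of interest, t = 1 on the outer boundary
    t = np.clip((x - start) / (end - start), 0.0, 1.0)
    sigma = sigma_max * np.sin(np.pi * t) ** 2
    return 1.0 + 1j * sigma / omega

def pml_coefficients(rho, c, S):
    """Coefficients of Eq. 13 at one point.

    rho : density (scalar)
    c   : stiffness tensor c_ijkl, array of shape (3, 3, 3, 3)
    S   : the three values (S_1, S_2, S_3) at that point
    """
    S = np.asarray(S, dtype=complex)
    prod = S.prod()                      # S_1 * S_2 * S_3
    rho_tilde = prod * rho
    # entry (i, j, k, l) is divided by S_j * S_k via broadcasting
    c_tilde = c * prod / (S[None, :, None, None] * S[None, None, :, None])
    return rho_tilde, c_tilde

# Example: a layer occupying 1.0 <= x_1 <= 1.2, evaluated at its midpoint
omega = 10.0
S1 = stretch(1.1, start=1.0, end=1.2, omega=omega, sigma_max=50.0)
c = np.random.default_rng(0).standard_normal((3, 3, 3, 3))  # placeholder tensor
rho_tilde, c_tilde = pml_coefficients(2.0, c, [S1, 1.0, 1.0])
```

With \(S_{2} = S_{3} = 1\), the entries of `c_tilde` with \(j = k = 1\) are divided by \(S_{1}\), those with exactly one of \(j, k\) equal to 1 are unchanged, and the remaining entries are multiplied by \(S_{1}\).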
We multiply both sides of the Neumann boundary condition by \(S_{1} S_{2} S_{3}\) and use (14),
$$ \begin{array}{@{}rcl@{}} (c_{ijkl} \frac{S_{1} S_{2} S_{3}}{S_{k}} \partial_{x_{k}} u_{l}) \nu_{j} |_{\varSigma} = (c_{ijkl} \frac{S_{1} S_{2} S_{3}}{S_{j} S_{k}} \partial_{x_{k}} u_{l}) \nu_{j} |_{\varSigma} \\ = (\tilde{c}_{ijkl} \partial_{x_{k}} u_{l}) \nu_{j} |_{\varSigma} = S_{1} S_{2} S_{3} g_{i} . \end{array} $$
(15)
We note that we replace the original coefficients cijkl with the new coefficients \(\tilde {c}_{ijkl}\) at the boundary.
Considering that \(S_{1} S_{2} S_{3} g_{i}|_{\varSigma} = g_{i}|_{\varSigma}\), we obtain the modified strong formulation,
$$ \begin{array}{@{}rcl@{}} (-\tilde{\rho}(x) \omega^{2} \delta_{il} - \partial_{x_{j}} \tilde{c}_{ijkl}(x) \partial_{x_{k}} ) u_{l} &= 0 , \end{array} $$
(16)
$$ \begin{array}{@{}rcl@{}} (\tilde{c}_{ijkl} \partial_{x_{k}} u_{l}) \nu_{j} |_{\varSigma} &= g_{i} . \end{array} $$
(17)
Since we now have standard derivatives without any complex functions, we are now able to apply the continuous Galerkin finite element approximation to the system with PMLs. We then construct the local matrices on each element and assemble these local matrices into the global matrix. The strategy is similar to the standard work [26].
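To make the assembly step concrete, the following sketch assembles the global matrix for a one-dimensional analog of (16) with piecewise linear elements. It is a simplified illustration under stated assumptions, not the three-dimensional implementation: the function name `assemble_1d`, the element-wise constant coefficients, and the uniform test mesh are all hypothetical. In one dimension, (13) reduces to \(\tilde{\rho} = S \rho\) and \(\tilde{c} = c / S\).

```python
import numpy as np

def assemble_1d(nodes, rho_tilde, c_tilde, omega):
    """Assemble A = K - omega^2 M for -omega^2 rho~ u - (c~ u')' = 0.

    nodes     : sorted node coordinates, length n
    rho_tilde : complex density per element, length n - 1
    c_tilde   : complex stiffness per element, length n - 1
    """
    n = len(nodes)
    A = np.zeros((n, n), dtype=complex)
    for e in range(n - 1):
        h = nodes[e + 1] - nodes[e]
        # local stiffness and (consistent) mass matrices of a linear element
        K_loc = c_tilde[e] / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
        M_loc = rho_tilde[e] * h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        idx = [e, e + 1]                      # global indices of the two nodes
        A[np.ix_(idx, idx)] += K_loc - omega**2 * M_loc
    return A

# Example: four elements, the last one acting as a damping layer with S = 1 + 2i
nodes = np.linspace(0.0, 1.0, 5)
S = np.array([1.0, 1.0, 1.0, 1.0 + 2.0j])
rho, c = 2.0, 3.0
A = assemble_1d(nodes, rho_tilde=S * rho, c_tilde=c / S, omega=2.0)
```

Because the stretching only enters through the coefficients, the assembly loop is the same as for a problem without PMLs; the resulting matrix is complex symmetric rather than Hermitian.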
Appendix B: First-order adjoint state method: the gradient
Elastic FWI can be formulated as an optimization problem with equality constraints.
Since we deal with inverse boundary value problems, we revisit the classical first-order adjoint state method in order to extract the adjoint boundary values for the misfit functional. We consider a single source g here and sum over all the available sources later. The optimization problem minimizing \(\mathit{\varPsi}^{HS}(u)\) in Eq. 7 takes the form,
$$ \begin{array}{@{}rcl@{}} &&\arg\!\min\limits_{m} \mathit{\varPsi}^{HS}(u) \ \text{subject to}\ \\ &&{\int}_{X} \Big(-\omega^{2} \rho u_{i} v_{i} + (\partial_{x_{j}} v_{i}) c_{ijkl} \partial_{x_{k}} u_{l}\Big) \mathrm{d} x \\ && = {\int}_{\varSigma} g_{i} v_{i} \mathrm{d} x, \forall v \in H^{1}(X), \end{array} $$
(18)
where the constraint in Eq. 18 represents the weak form of the entire boundary value problem (1)–(2), u denotes the weak solution and v denotes the test function. \(H^{1}\) denotes the Sobolev space of square-integrable functions with square-integrable weak first-order derivatives. We point out that boundary value problems with discontinuities in the media can naturally be solved in the weak sense. Additionally, to obtain the adjoint boundary value, one needs to derive the adjoint formula in the weak sense.
To compute the gradient of the functional involved, we use a Lagrangian approach: the constrained optimization problem is cast into a formulation with Lagrange multipliers γ,
$$ \begin{array}{@{}rcl@{}} \mathcal{L}(m,u,\gamma) = \frac{1}{2} {\int}_{\partial X} \chi_{\varSigma} R (u_{i} - u^{\star}_{i}) \cdot R(u_{i} - u^{\star}_{i}) \mathrm{d} x \\ + {\int}_{X} \Big(-\omega^{2} \rho u_{i} \gamma_{i} + (\partial_{x_{j}} u_{i}) c_{ijkl} \partial_{x_{k}}\gamma_{l} \Big) \mathrm{d} x \\ - {\int}_{\varSigma} g_{i} \gamma_{i} \mathrm{d} x , \end{array} $$
(19)
where u⋆ denotes the solution in the true model m⋆. Given some m, we let \(\widetilde {u} = \widetilde {u}(m)\) be the solution to the forward boundary value problem and write
$$ \mathcal{L}(m,\widetilde{u},\gamma) = \mathit{\varPsi}^{HS}(\widetilde{u}) = \mathcal{E}(m) . $$
(20)
Since we consider piecewise constant models as described in Eq. 4, \(\mathcal {E}\) is a Fréchet differentiable function \(\mathcal {E}: V \rightarrow \mathbb {R}\), where V is a finite-dimensional vector space, and the derivative \(D_{m} \mathcal {E}[m]\) exists. Since the Fréchet derivative is a continuous linear functional, the Riesz representation theorem can be applied; here, we use the \(L^{2}\) inner product in the model space [9]:
$$ D_{m} \mathcal{E}[m] \delta m = (\nabla \mathcal{E}, \delta m) , \quad \forall \delta m \in V , $$
where \(\nabla \mathcal {E}\) denotes the gradient and \(D_{m} \mathcal {E}\) is defined as the linear operator
$$ D_{m} \mathcal{E}[m] : \delta m \mapsto \frac{\mathrm{d}}{\mathrm{d} t}\Big|_{t=0} \mathcal{E}(m+t\delta m), \delta m \in V. $$
Since the Fréchet derivative of \(\widetilde {u}(m)\) exists, the Fréchet derivative of \(\mathcal {E}(m)\) with respect to m in the direction δm attains the form
$$ \begin{array}{@{}rcl@{}} D_{m} \mathcal{E}[m] \delta m \!&=&\! D_{m} \mathcal{L}(m,\widetilde{u},\gamma) \delta m \\ &=& {\int}_{X} -\omega^{2} \widetilde{u}_{i} \gamma_{i} \ \frac{\partial \rho}{\partial m} \delta m \mathrm{d} x\\ &&+ {\int}_{X} (\partial_{x_{j}} \widetilde{u}_{i} ) (\partial_{x_{k}} \gamma_{l}) \frac{\partial c_{ijkl}}{\partial m} \delta m \mathrm{d} x \\ &&+ {\int}_{X} \Big(-\omega^{2} \rho \gamma_{i} (D_{m} \widetilde{u}_{i}[m] \delta m) \\ &&+ \partial_{x_{j}} (D_{m} \widetilde{u}_{i}[m] \delta m ) c_{ijkl} \partial_{x_{k}} \gamma_{l} \Big) \mathrm{d} x \\ &&+ {\int}_{\partial X} \Big(R (D_{m} \widetilde{u}_{i}[m] \delta m ) \chi_{\varSigma} R (\widetilde{u}_{i} - u_{i}^{\star}) \Big) \mathrm{d} x.\\ \end{array} $$
(21)
We choose the adjoint state, \(\widetilde {\gamma } = \widetilde {\gamma }(m)\), so that \((m,\widetilde {u},\widetilde {\gamma })\) is a stationary point of the Lagrangian [16, 42, 61].
Thus, applying the calculus of variations, we let \(\widetilde {\gamma }\) solve
$$ \begin{array}{@{}rcl@{}} {\int}_{X} \Big(-\omega^{2} \rho \gamma_{i} v_{i} + (\partial_{x_{j}} v_{i} ) c_{ijkl} (\partial_{x_{k}} \gamma_{l}) \Big) \mathrm{d} x \\ + {\int}_{\partial X} v_{i} \chi_{\varSigma} R (\widetilde{u}_{i} - u_{i}^{\star}) \mathrm{d} x = 0 , \quad \forall v \in H^{1}(X) . \end{array} $$
(22)
From Eq. 22, it follows that the first-order adjoint state equation for the boundary value problem takes the form of Eqs. 8 and 9. Clearly, the adjoint boundary value problem (8)–(9) is well-posed in the weak sense. Substituting \(v=D_{m} \widetilde {u}[m] \delta m\) in Eq. 22 shows that the last two integrals in Eq. 21 sum to zero for \(\gamma = \widetilde {\gamma }\); we thus avoid computing \(D_{m} \widetilde {u}[m]\) explicitly and obtain,
$$ \begin{array}{@{}rcl@{}} D_{m} \mathcal{E}[m] \delta m = (\nabla \mathcal{E}, \delta m) = {\int}_{X} -\omega^{2} \widetilde{u}_{i} \widetilde{\gamma}_{i} \ \frac{\partial \rho}{\partial m} \delta m \mathrm{d} x \\ + {\int}_{X} (\partial_{x_{j}} \widetilde{u}_{i}) (\partial_{x_{k}} \widetilde{\gamma}_{l} ) \frac{\partial c_{ijkl}}{\partial m} \delta m \mathrm{d} x . \end{array} $$
(23)
Summing over all available sources, we arrive at Eq. 10.
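The structure of Eqs. 22 and 23 can be checked on a small discrete analog. The sketch below is a toy under stated assumptions and not the finite element implementation: the operator is taken to be real-valued and linear in the model, \(A(m) = A_{0} + \sum_{k} m_{k} A_{k}\), the matrix `R` plays the role of the restriction to the receivers, and all names (`forward_operator`, `misfit_and_gradient`, `A_list`) are hypothetical. The adjoint-based gradient is compared with central finite differences of the misfit.

```python
import numpy as np

def forward_operator(m, A0, A_list):
    """Discrete operator A(m) = A0 + sum_k m_k A_k (linear in m by assumption)."""
    return A0 + sum(mk * Ak for mk, Ak in zip(m, A_list))

def misfit_and_gradient(m, A0, A_list, R, g, d):
    """Least-squares misfit and its gradient via one forward and one adjoint solve."""
    A = forward_operator(m, A0, A_list)
    u = np.linalg.solve(A, g)                  # forward state: A u = g
    r = R @ u - d                              # data residual
    gamma = np.linalg.solve(A.T, -R.T @ r)     # adjoint state: A^T gamma = -R^T r
    # discrete counterpart of Eq. 23: gamma^T (dA/dm_k) u
    grad = np.array([gamma @ (Ak @ u) for Ak in A_list])
    return 0.5 * r @ r, grad

# Toy setup: two "subdomains", each with one model parameter
rng = np.random.default_rng(0)
n = 6
A0 = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
A_list = [np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]),
          np.diag([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])]
R = np.eye(n)[:2]                              # observe the first two unknowns
g = rng.standard_normal(n)
d = rng.standard_normal(2)
m = np.array([1.0, 2.0])

E, grad = misfit_and_gradient(m, A0, A_list, R, g, d)

# Central finite differences of the misfit, one parameter at a time
eps = 1e-6
fd = np.zeros_like(m)
for k in range(len(m)):
    dm = np.zeros_like(m)
    dm[k] = eps
    E_plus = misfit_and_gradient(m + dm, A0, A_list, R, g, d)[0]
    E_minus = misfit_and_gradient(m - dm, A0, A_list, R, g, d)[0]
    fd[k] = (E_plus - E_minus) / (2.0 * eps)
```

The adjoint route needs two linear solves regardless of the number of parameters, whereas the finite-difference check needs two forward solves per parameter.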
Appendix C: Second-order adjoint state method for the inverse boundary value problem
Because vibroseis data lead to inverse boundary value problems, we present the evaluation of the (full and Gauss-Newton) Hessian-vector products in this setting. The analogous evaluation for traditional FWI has been treated in several previous works [23, 36, 44].
C.1 Full Hessian-vector product computation
To begin with, we consider the optimization problem with equality constraints with a single source,
$$ \begin{array}{@{}rcl@{}} && \min\limits_{m}\ \mathit{{\varPsi}_{1}}(u,u_{1})\quad \text{subject to} \\ && {\int}_{X} \Big(-\omega^{2} \rho u_{i} {v_{1}}_{i} + (\partial_{x_{j}} {v_{1}}_{i}) c_{ijkl} \partial_{x_{k}} u_{l}\Big) \mathrm{d} x \\ && = {\int}_{\varSigma} g_{i} {v_{1}}_{i} \mathrm{d} x , \quad \forall v_{1} \in H^{1}(X) ,\\ && {\int}_{X} \Big(-\omega^{2} \rho {u_{1}}_{i} v_{i} + (\partial_{x_{j}} v_{i}) c_{ijkl} \partial_{x_{k}} {u_{1}}_{l}\Big) \mathrm{d} x \\ && = - {\int}_{X} \Big(- \omega^{2} (\delta l^{\rho}) {u}_{i} v_{i} + (\partial_{x_{j}} v_{i}) (\delta l^{c})_{ijkl} \partial_{x_{k}} {u}_{l}\Big) \mathrm{d} x\\ && + {\int}_{\partial X} -\big[(\delta l^{c})_{ijkl} \partial_{x_{k}} u_{l} \big] \nu_{j} v_{i} \mathrm{d} x , \quad \forall v \in H^{1}(X), \end{array} $$
in which
$$ \mathit{{\varPsi}_{1}}(u,u_{1}) = D_{m} \mathit{\varPsi}(u) \delta l = {\int}_{\partial X} \chi_{\varSigma} R (u_{i} - u^{\star}_{i}) R {u_{1}}_{i} \mathrm{d} x , $$
(24)
where Ψ was introduced in Eq. 18, δl is the parameter perturbation (m is perturbed to m + δl), \(\delta l^{c}\) is the stiffness tensor part of δl, \(\delta l^{\rho}\) is the density part of δl, and \(u_{1}\) is the first-order perturbed field with respect to m along δl.
We derive the full Hessian-vector product for the inverse boundary value problem. We have two forward problems: the first generates \(\widetilde {u}\), the weak solution to the direct problem (1)–(2), and the second generates \(\widetilde {u_{1}}\), which is the solution to
$$ P_{il} {u_{1}}_{l} = \omega^{2} (\delta l^{\rho}) \widetilde{u}_{l} \delta_{il} + \partial_{x_{j}} (\delta l^{c})_{ijkl} \partial_{x_{k}} \widetilde{u}_{l}, $$
supplemented with the boundary condition,
$$ (c_{ijkl} \partial_{x_{k}} {u_{1}}_{l}) \nu_{j} |_{\partial X} = -\big[(\delta l^{c})_{ijkl} \partial_{x_{k}} \widetilde{u}_{l} \big] \nu_{j} |_{\partial X} . $$
We introduce two Lagrange multipliers, γ and \(\gamma_{1}\), to replace v and \(v_{1}\). Following an argument similar to that in Appendix B, we choose \(\widetilde {\gamma }\) to be the weak solution to the first adjoint boundary value problem (8)–(9), and \(\widetilde {\gamma _{1}}\) to be the weak solution to the second adjoint boundary value problem, which is given by
$$ \begin{array}{@{}rcl@{}} P_{il} {\gamma_{1}}_{l} &=& \delta l^{\rho} \omega^{2} \widetilde{\gamma}_{l} \delta_{il} + \partial_{x_{j}} [(\delta l^{c})_{ijkl} \partial_{x_{k}} \widetilde{\gamma}_{l}] ,\\ (c_{ijkl} \partial_{x_{k}} {\gamma_{1}}_{l}) \nu_{j} |_{\partial X} &=& -((\delta l^{c})_{ijkl} \partial_{x_{k}} \widetilde{\gamma}_{l}) \nu_{j} |_{\partial X} - \chi_{\varSigma} R \widetilde{u_{1}}_{i} . \end{array} $$
Summing over the available boundary sources \(g^{s}\), we obtain the Hessian-vector product,
$$ \begin{array}{@{}rcl@{}} H \delta l (\cdot) &=& \sum\limits_{s} \left\{ \int \left[ - \omega^{2} \widetilde{u_{1}}^{s}_{i} \widetilde{\gamma}^{s}_{i} \frac{\partial \rho}{\partial m} (\cdot) + (\partial_{x_{j}} \widetilde{u_{1}}^{s}_{i}) (\partial_{x_{k}} \widetilde{\gamma}^{s}_{l}) \frac{\partial c_{ijkl}}{\partial m}(\cdot)\right] \mathrm{d} x \right.\\ &&+ \int \left[ - \omega^{2} \widetilde{u}^{s}_{i} \widetilde{\gamma_{1}}^{s}_{i} \frac{\partial \rho}{\partial m}(\cdot) + (\partial_{x_{j}} \widetilde{u}^{s}_{i}) (\partial_{x_{k}} \widetilde{\gamma_{1}}^{s}_{l}) \frac{\partial c_{ijkl}}{\partial m}(\cdot) \right] \mathrm{d} x\\ &&\left. + \int ({\partial_{m}^{2}} P \delta l (\cdot) \widetilde{u}^{s}) \cdot \widetilde{\gamma}^{s} \mathrm{d} x \right\} , \end{array} $$
(25)
where the data residual information is hidden in the adjoint wavefields, \(\widetilde {\gamma }^{s}\) and \(\widetilde {\gamma _{1}}^{s}\); \(P \delta l\) is a short-hand representation of \(P_{il}\) acting on δl.
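The four solves behind Eq. 25 (two forward, two adjoint) can be illustrated on a discrete toy problem. The sketch below rests on simplifying assumptions that are not part of the formulation above: the operator is real-valued and linear in the model, \(A(m) = A_{0} + \sum_{k} m_{k} A_{k}\), so the term with \(\partial_{m}^{2} P\) vanishes, and the names `solve_states`, `gradient` and `hessian_vector` are hypothetical. The result is compared with a central finite difference of the gradient along δl.

```python
import numpy as np

def solve_states(m, A0, A_list, R, g, d):
    """Forward state u and first adjoint state gamma for the model m."""
    A = A0 + sum(mk * Ak for mk, Ak in zip(m, A_list))
    u = np.linalg.solve(A, g)
    gamma = np.linalg.solve(A.T, -R.T @ (R @ u - d))
    return A, u, gamma

def gradient(m, A0, A_list, R, g, d):
    """First-order adjoint gradient, used here only for the check."""
    _, u, gamma = solve_states(m, A0, A_list, R, g, d)
    return np.array([gamma @ (Ak @ u) for Ak in A_list])

def hessian_vector(m, dl, A0, A_list, R, g, d):
    """Full Hessian-vector product H dl via second-order adjoint states."""
    A, u, gamma = solve_states(m, A0, A_list, R, g, d)
    dA = sum(dlk * Ak for dlk, Ak in zip(dl, A_list))   # operator perturbation
    u1 = np.linalg.solve(A, -dA @ u)                    # perturbed forward field
    # second adjoint: sources from the perturbed operator and from R u1
    gamma1 = np.linalg.solve(A.T, -dA.T @ gamma - R.T @ (R @ u1))
    return np.array([gamma @ (Ak @ u1) + gamma1 @ (Ak @ u) for Ak in A_list])

# Toy setup: two "subdomains", each with one model parameter
rng = np.random.default_rng(1)
n = 6
A0 = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
A_list = [np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]),
          np.diag([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])]
R = np.eye(n)[:2]
g = rng.standard_normal(n)
d = rng.standard_normal(2)
m = np.array([1.0, 2.0])
dl = np.array([0.3, -0.7])

Hdl = hessian_vector(m, dl, A0, A_list, R, g, d)

# Central finite difference of the gradient along dl
eps = 1e-5
Hdl_fd = (gradient(m + eps * dl, A0, A_list, R, g, d)
          - gradient(m - eps * dl, A0, A_list, R, g, d)) / (2.0 * eps)
```

For a parametrization in which the operator depends nonlinearly on m, the additional term with the second derivative of the operator would have to be added to the returned vector.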
C.2 Gauss-Newton Hessian-vector product computation
For the Gauss-Newton method, we consider the least-squares misfit and aim to compute the Gauss-Newton Hessian-vector product via the constrained minimization problem [36]. We consider a new objective function \(\mathit{\varPsi}^{GN}\),
$$ \begin{array}{@{}rcl@{}} \min\limits_{m} \mathit{\varPsi}^{GN}(u) \quad \text{subject to} \ \\ {\int}_{X} \Big(-\omega^{2} \rho u_{i} v_{i} + (\partial_{x_{j}} v_{i}) c_{ijkl} \partial_{x_{k}} u_{l}\Big) \mathrm{d} x \\ = {\int}_{\varSigma} g_{i} v_{i} \mathrm{d} x, \forall v \in H^{1}(X), \end{array} $$
in which
$$ \mathit{\varPsi}^{GN}(u) = {\int}_{\partial X} \chi_{\varSigma} R u_{i} R \widetilde{u_{1}}_{i} \mathrm{d} x . $$
By a derivation analogous to that of the second-order adjoint state method, we introduce a Lagrange multiplier η and let \(\widetilde {\eta }\) be the weak solution to the Gauss-Newton adjoint equation
$$ \begin{array}{@{}rcl@{}} P_{il} \eta_{l} &=& 0 , \end{array} $$
(26)
$$ \begin{array}{@{}rcl@{}} \nu_{j} (c_{ijkl} \partial_{x_{k}} \eta_{l})|_{\partial X} &=& - \chi_{\varSigma} R \widetilde{u_{1}}_{i} . \end{array} $$
(27)
The Gauss-Newton Hessian-vector product thus requires a new adjoint equation, which means that we need to solve one more equation to evaluate each Gauss-Newton Hessian-vector product.
Then, for any choice of the parameters, we have
$$ \begin{array}{@{}rcl@{}} H^{GN} \delta l (\cdot) &=& \sum\limits_{s} \left\{ \int - \omega^{2} \widetilde{u}^{s}_{i}\widetilde{\eta}^{s}_{i} \ \frac{\partial \rho}{\partial m} (\cdot) \mathrm{d} x\right.\\ &&\left. + \int (\partial_{x_{j}} \widetilde{u}^{s}_{i}) (\partial_{x_{k}} \widetilde{\eta}^{s}_{l})\ \frac{\partial c_{ijkl}}{\partial m} (\cdot) \mathrm{d} x \right\}. \end{array} $$
(28)
Note that δl is hidden in the Gauss-Newton adjoint wavefield \(\widetilde {\eta }\).
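A discrete toy version of Eqs. 26 to 28 makes explicit how δl enters only through \(\widetilde {u_{1}}\) and \(\widetilde {\eta }\). As before, the sketch assumes a real-valued operator that is linear in the model, \(A(m) = A_{0} + \sum_{k} m_{k} A_{k}\), and the name `gauss_newton_hessian_vector` is hypothetical. The result is compared with \(J^{T} J \delta l\), where the Jacobian J of the data with respect to the model is built explicitly, which is affordable only for such a small example.

```python
import numpy as np

def gauss_newton_hessian_vector(m, dl, A0, A_list, R, g):
    """Gauss-Newton Hessian-vector product via one extra adjoint solve."""
    A = A0 + sum(mk * Ak for mk, Ak in zip(m, A_list))
    u = np.linalg.solve(A, g)                            # forward state
    dA = sum(dlk * Ak for dlk, Ak in zip(dl, A_list))    # operator perturbation
    u1 = np.linalg.solve(A, -dA @ u)                     # perturbed forward field
    eta = np.linalg.solve(A.T, -R.T @ (R @ u1))          # Gauss-Newton adjoint state
    return np.array([eta @ (Ak @ u) for Ak in A_list])

# Toy setup: two "subdomains", each with one model parameter
rng = np.random.default_rng(2)
n = 6
A0 = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
A_list = [np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]),
          np.diag([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])]
R = np.eye(n)[:2]
g = rng.standard_normal(n)
m = np.array([1.0, 2.0])
dl = np.array([0.3, -0.7])

H_gn_dl = gauss_newton_hessian_vector(m, dl, A0, A_list, R, g)

# Reference: explicit Jacobian of the data R u(m), column k = R du/dm_k
A = A0 + sum(mk * Ak for mk, Ak in zip(m, A_list))
u = np.linalg.solve(A, g)
J = np.column_stack([R @ np.linalg.solve(A, -Ak @ u) for Ak in A_list])
reference = J.T @ (J @ dl)
```

Unlike the full Hessian-vector product, this evaluation does not involve the data residual, so the observed data do not appear among the arguments.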