Velocity variation coefficient-based angle-dependent gradient conditioning scheme: a new strategy for an enhanced full waveform inversion

Building an accurate velocity model is the principal goal of Full Waveform Inversion (FWI), in which the data fitting is an ill-posed problem. Optimization plays a crucial role in FWI: it minimizes an objective function that measures the misfit between observed and modelled data. However, the success of the technique is affected by local minima of the data misfit, by noise in the data, and by artefacts that arise during gradient computation. This study presents a strategy to mitigate the influence of these local minima and other artefacts: an angle-dependent gradient conditioning approach based on a velocity variation coefficient. It is an auto-controlled process in which the primary mechanism updates the velocity model from a large angle scale to a smaller angle scale as the iterations proceed. At each iteration, it preserves the previous result without scattering or overlapping with it. It covers all the angles smoothly, which helps to minimize the data misfit and to provide a high-resolution velocity model. The proposed conditioning approach is demonstrated on the Marmousi model, and the results show that the method provides a much-improved velocity inversion within a reasonable number of iterations. The study thus offers a suitable procedure for FWI in which artefacts are suppressed at negligible additional computation time. Furthermore, it helps to reduce cycle skipping and improve convergence in complex scenarios.

List of symbols
Gradient with respect to model m at the i-th iteration in terms of scattering angle: $\nabla_{m,\theta}^{i}$
Misfit function: $J(m)$
Observed seismic data: $d_{obs}$
Pressure wavefield at location (x, z) and time t, for velocity c of the medium m: $p(x, z, t)$
Source function: $f(t)$
Step length: $\alpha_{m}^{i}$
Scattering angles for the forward wavefield
Scattering angles for the residual wavefield
Cut-off value for the application of the filter: $\Gamma$
Quadratic mean of the velocity at location (x, z) from l neighbors: V(x, z)

Introduction
Full Waveform Inversion (FWI) is an iterative process that inverts for the model parameters by minimizing the error between the observed shot data and the modeled data. It is most frequently used as a high-resolution velocity model building method (Vigh et al. 2014; Operto et al. 2015). In this process, the error is expressed as an objective function whose gradient directs the model updates so that the error is minimized. Convergence to local minima and cycle skipping are the two main drawbacks of the FWI algorithm, and both must be overcome to obtain an enhanced velocity model. Various efforts have been made to develop a stable and robust FWI (Virieux and Operto 2009). Bunks et al. (1995) studied multi-scale inversion of the data to avoid local minima. Sirgue and Pratt (2004) presented a similar approach in the frequency domain. Brenders et al. (2009) proposed time damping of the input data to reduce the local minima by focusing on different parts of the data. Shin and Cha (2009) proposed Laplace and Laplace-Fourier domain FWI to generate extremely low-frequency information for robust inversion results. Bozdağ et al. (2011) proposed using the seismic envelopes of the observed and forward-modeled data. The use of envelopes in FWI has been shown to reduce the risk of convergence to local minima.
Warner and Guasch (2014) demonstrated deconvolution as an effective misfit function. Optimal transport distance (Métivier et al. 2016) and energy norm (Rocha et al. 2016) misfits have been proposed to invert better velocity models than the conventional approach. Xu et al. (2012) studied the use of reflection energy to recover the background velocity and termed it Reflection Full Waveform Inversion (RFWI). Wang et al. (2013) proposed a frequency-domain implementation with a multi-stage approach. In conventional FWI, the transmission and reflection components are weighted equally in the gradient. The tomographic component of the FWI gradient updates the long wavelengths of the velocity model. Tang et al. (2013) showed that the FWI gradient can be filtered to enhance the tomographic component, which improves FWI convergence. Convergence can also be improved by updating the velocity with low-wavenumber components in the initial iterations. Wu and Alkhalifah (2017) decomposed the wavefield into up-going and down-going components to update the velocity using smooth components. Alkhalifah (2014) proposed scattering angle-based filtering of the FWI gradient. Wu and Alkhalifah (2017) recommended an efficient scattering angle enrichment filter to control the wavelength components of the gradient. Various researchers have made further contributions to overcome the limitations of velocity model updating techniques, most of them based on scattering angle techniques. Luo and Wu (2019) studied a scattering angle-based multi-stage inversion strategy for velocity and density estimation. Yao et al. (2019) proposed nonstationary smoothing to extract the tomographic component through a scattering angle filter. Luo et al. (2021) used scattering angle filters in elastic full waveform inversion to reduce multiparameter crosstalk and cycle skipping.
Recently, AlAli and Anifowose (2022) reviewed the role of machine learning (ML) techniques in updating the velocity model, but the review concentrated on low-frequency extrapolation only. The FWI gradient can be conditioned with the scattering angle approach, which requires selecting the angle ranges to be included in the inversion process.
In this study, the coefficient of velocity variation is defined as a measure of the lateral change in velocity and is used to control the angle-based gradient filtering. Angle-weighted gradient preconditioning is then proposed, based on this coefficient and on modified weight functions, to obtain large-scale and fine-scale structures simultaneously. The proposed approach helps to invert the large-scale structures in the initial iterations. As the iterations proceed, the updated coefficient of velocity variation helps to select the angle ranges for areas with higher velocity variation and to preserve the fine-scale structures. The applicability of the proposed approach is demonstrated on the Marmousi dataset.

Mathematical description
The acoustic wave equation in a heterogeneous medium with velocity c and source f(t) has the following form:
$$\frac{1}{c^{2}(x,z)}\,\frac{\partial^{2} p(x,z,t)}{\partial t^{2}} - \nabla^{2} p(x,z,t) = f(t) \tag{1}$$
where p(x, z, t) is the pressure wavefield at location (x, z) and time t for the velocity c of the medium m, and f(t) is the source function. FWI iteratively updates the model m to minimize the misfit between the modeled data $d_{cal}$, computed using Eq. (1), and the observed seismic data $d_{obs}$. The misfit function to be minimized can be given as follows:
$$J(m) = \frac{1}{2}\left\| d_{cal}(m) - d_{obs} \right\|^{2} \tag{2}$$
The modeled data $d_{cal}$ for the model m are calculated using Eq. (1) at the receiver positions.
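As a rough illustration of how the acoustic wave equation above can be advanced in time, the following sketch applies a second-order explicit finite-difference update in one spatial dimension. It is a simplification under stated assumptions and not the solver used in this study: the grid is 1-D rather than 2-D, the end points are held at zero instead of using PML, and the function names `step_wavefield` and `propagate` are hypothetical.

```python
def step_wavefield(p_prev, p_curr, c, dt, dx, src_index, src_value):
    """Advance a 1-D pressure field by one time step.

    Discretizes (1/c^2) p_tt - p_xx = f with second-order central
    differences in time and space. End points are held at zero.
    """
    n = len(p_curr)
    p_next = [0.0] * n
    for i in range(1, n - 1):
        laplacian = (p_curr[i - 1] - 2.0 * p_curr[i] + p_curr[i + 1]) / dx ** 2
        p_next[i] = 2.0 * p_curr[i] - p_prev[i] + (c[i] * dt) ** 2 * laplacian
    # Inject the source sample at a single grid point (assumed point source).
    p_next[src_index] += (c[src_index] * dt) ** 2 * src_value
    return p_next


def propagate(c, dt, dx, src_index, wavelet):
    """Run one time step per wavelet sample and return the final wavefield."""
    # The explicit 1-D scheme is stable only if max(c) * dt / dx <= 1 (CFL).
    if max(c) * dt / dx > 1.0:
        raise ValueError("CFL condition violated: reduce dt or increase dx")
    p_prev = [0.0] * len(c)
    p_curr = [0.0] * len(c)
    for sample in wavelet:
        p_prev, p_curr = p_curr, step_wavefield(
            p_prev, p_curr, c, dt, dx, src_index, sample)
    return p_curr


if __name__ == "__main__":
    velocity = [1500.0] * 101      # homogeneous medium, m/s (illustrative)
    impulse = [1.0] + [0.0] * 9    # ten time steps, source fires once
    field = propagate(velocity, dt=0.002, dx=10.0, src_index=50, wavelet=impulse)
    print(max(abs(v) for v in field))
```

In this scheme a disturbance can travel at most one grid cell per time step, so after ten steps the field is still exactly zero more than nine cells away from the source.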
The model parameters can be updated by minimizing the misfit function (Eq. 2) using the following relation:
$$m_{i+1} = m_{i} - \alpha_{m}^{i}\,\nabla_{m}^{i} \tag{3}$$
where $\nabla_{m}^{i}$ is the gradient of the misfit function with respect to model m at the i-th iteration and $\alpha_{m}^{i}$ is a positive number termed the step length. The gradient $\nabla_{m}^{i}$ can be obtained by calculating the derivative of the misfit function J(m) as:
$$\nabla_{m}^{i} = \frac{\partial J(m)}{\partial m} = \frac{2}{c^{3}(x,z)} \int \frac{\partial^{2} p_{m}(x,z,t)}{\partial t^{2}}\; p_{res}(x,z,t)\, dt \tag{4}$$
where $p_{m}(x, z, t)$ is the forward wavefield for the current model and $p_{res}(x, z, t)$ is the residual wavefield obtained using the difference between the modeled data $d_{cal}$ and the observed seismic data $d_{obs}$.
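The misfit evaluation and the model update described above can be sketched in a few lines. This is a minimal illustration assuming a least-squares misfit with a factor of one half and a plain steepest-descent step; the function names `l2_misfit` and `update_model` are hypothetical, and the numbers in the usage example are made up.

```python
def l2_misfit(d_cal, d_obs):
    """Least-squares misfit: half the sum of squared residual samples."""
    if len(d_cal) != len(d_obs):
        raise ValueError("modeled and observed data must have equal length")
    return 0.5 * sum((cal - obs) ** 2 for cal, obs in zip(d_cal, d_obs))


def update_model(model, gradient, step_length):
    """Steepest-descent update: m_new = m - step_length * gradient."""
    if step_length <= 0.0:
        raise ValueError("step length must be positive")
    return [m - step_length * g for m, g in zip(model, gradient)]


if __name__ == "__main__":
    # Illustrative values only, not taken from the Marmousi experiment.
    print(l2_misfit([1.0, 2.0], [0.0, 0.0]))                   # misfit of a toy trace
    print(update_model([2000.0, 2500.0], [10.0, -20.0], 0.5))  # one update step
```

In practice the step length would be chosen by a line search at each iteration; a fixed value is used here only to keep the sketch short.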
Model m can be updated using Eq. (3). The forward and residual wavefields can also be expressed as superpositions of local plane waves, with separate scattering angles defined for the forward and the residual wavefield.

Gradient preconditioning
Gradient preconditioning is the process that limits the influence of unwanted features. There are several approaches to conditioning the gradient with a scattering angle approach (Alkhalifah 2015b; Choi and Alkhalifah 2015; Xie 2015; Luo and Xie 2017). Based on the local plane wave decomposition, the FWI gradient $\nabla_{m}^{i}$ can be written in terms of the scattering angle θ as $\nabla_{m,\theta}^{i}$ (Tang et al. 2013; Alkhalifah 2015; Jeong and Kim 2018). The angle-domain gradient $\nabla_{m,\theta}^{i}$ can be used to obtain the tomographic or migration component, or it can be filtered to update the velocity model. Alkhalifah (2015) used a filter W(θ) to remove the low scattering angles at the initial stage of FWI, so that the modified $\nabla_{m,\theta}^{i}$ is the angle-domain gradient weighted by W(θ). The filter W(θ) has also been used to enrich the FWI gradient with cosine weights of the scattering angle θ (Wu and Alkhalifah 2017). Jeong and Kim (2018) used variable angle weights for different angle ranges to achieve optimum FWI convergence, selecting W(θ) as a variable that depends on the angle range. In most cases, the weighting factor for each angle range needs to be selected carefully for stable inversion. Jeong and Kim (2018) showed that selecting low angle ranges during the inversion provides the large-scale structures and that, as the cut-off angle increases, the recovered detail progresses from large-scale to fine-scale structures. A judicious selection of these angle ranges and weight factors is essential to optimally invert the fine-scale structures in complex geologies.
In this paper, we propose W(θ) as a combination of sine and cosine weights of the scattering angle, coupled with a threshold determined by the lateral velocity variation.
For the proposed analysis, the Poynting vector method (Xie 2015) in time-domain full waveform inversion is used due to its high efficiency. Gradient preconditioning can be done by introducing a scattering angle-dependent weight factor W(θ) that modifies (Eq. 5) the gradient $\nabla_{m}^{i}$ to $\nabla_{m,\theta}^{i}$, where the weight function W(θ) can be chosen optimally for stable convergence. The scattering angle θ can be obtained using the Poynting vector (Yoon and Marfurt 2006).

Table 1 Weight functions tested for angle-based gradient conditioning
Scheme | Weight function
M3 | Angle-based gradient conditioning with Cos3 weights
M4 | Angle-based gradient conditioning with the proposed method where m = 1 and Γ = 0.36
M5 | Angle-based gradient conditioning with the proposed method where m = 2 and Γ = 0.36
M6 | Angle-based gradient conditioning with the proposed method where m = 3 and Γ = 0.36

The angle-based preconditioning provides a way to select the details to be inverted. This paper proposes to use this property of angle-based gradient preconditioning to increase the fine-scale detail of the model in complex geometries with high lateral velocity variations. It also defines a coefficient of velocity variation that controls the application of weights in the gather preconditioning. The lateral velocity variation is measured by the coefficient of velocity variation $\hat{C}_{v,l}$, which is derived analytically from the quadratic mean of the velocity in the lateral direction.
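The Poynting vector approach mentioned above can be sketched as follows. The acoustic Poynting vector is commonly taken as the negative product of the time derivative and the spatial gradient of the pressure, and the angle between the forward and residual vectors follows from their normalized dot product. How that angle maps onto the scattering angle θ (full or half opening angle, and whether the residual vector is reversed) varies between authors, so the convention here is an assumption, and the function names are hypothetical.

```python
import math


def poynting_vector(dp_dt, dp_dx, dp_dz):
    """Acoustic Poynting vector S = -(dp/dt) * grad(p) at one grid point."""
    return (-dp_dt * dp_dx, -dp_dt * dp_dz)


def angle_between(s_forward, s_residual, eps=1e-12):
    """Angle in radians between two 2-D Poynting vectors.

    Returns None when either vector is too short for a reliable direction.
    """
    norm_f = math.hypot(s_forward[0], s_forward[1])
    norm_r = math.hypot(s_residual[0], s_residual[1])
    if norm_f < eps or norm_r < eps:
        return None
    dot = s_forward[0] * s_residual[0] + s_forward[1] * s_residual[1]
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_f * norm_r)))
    return math.acos(cos_angle)


if __name__ == "__main__":
    s_fwd = poynting_vector(dp_dt=2.0, dp_dx=3.0, dp_dz=-1.0)
    s_res = poynting_vector(dp_dt=1.0, dp_dx=0.5, dp_dz=2.0)
    print(math.degrees(angle_between(s_fwd, s_res)))
```

Returning None for near-zero vectors lets the caller skip grid points where the wavefield carries no usable direction information, which is one simple way to avoid unstable angle estimates.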
For our analysis, the proposed W(θ) is defined from the sine and cosine weights of the scattering angle θ, with the choice between them controlled by the coefficient of velocity variation $\hat{C}_{v,l}$, where l defines the length over which the coefficient is computed and the quadratic mean of the velocity is taken over the l neighbors. The proposed coefficient provides an estimate of the lateral variation weighted by normalized velocities. A lower $\hat{C}_{v}$ corresponds to low variation in the lateral velocity, and a relatively higher $\hat{C}_{v}$ indicates high lateral velocity variation with higher velocities.
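The quadratic mean over lateral neighbors referred to above can be illustrated with a sliding window along one depth level. This sketch computes only the quadratic (root-mean-square) mean; it does not reproduce the coefficient of velocity variation itself. The window is expressed in samples through a hypothetical `half_width` parameter, since converting the 100 m length used in this study into samples requires the grid spacing, and windows are simply clipped at the model edges, which is an assumption.

```python
import math


def lateral_quadratic_mean(velocity_row, half_width):
    """Sliding-window quadratic mean of velocity along one depth level.

    The window at index i covers [i - half_width, i + half_width],
    clipped to the ends of the row.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n = len(velocity_row)
    result = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = velocity_row[lo:hi]
        result.append(math.sqrt(sum(v * v for v in window) / len(window)))
    return result


if __name__ == "__main__":
    # Illustrative lateral profile in m/s with a sharp velocity contrast.
    row = [1500.0, 1500.0, 1500.0, 3000.0, 3000.0, 3000.0]
    print(lateral_quadratic_mean(row, half_width=1))
```

Because the quadratic mean weights larger values more heavily than the arithmetic mean, windows that straddle a velocity contrast are pulled toward the faster side.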
The modified FWI gradient $\nabla_{m,\theta}^{i}$ depends on the scattering angle-based filtering, with weights defined by the analytically derived coefficient of velocity variation. The proposed weighting factor, with sine and cosine weights of the scattering angle, provides smooth changes in the angle-based gather conditioning; otherwise, a tapered window on the angle ranges would be required to avoid edge effects, which may lead to instability. The verification of the weighting factor is provided in the Appendix.
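To make the idea of threshold-controlled sine and cosine weights concrete, the sketch below shows one possible form. It is purely illustrative and is not the weight function defined in this paper: it assumes that a cosine weight of order m is applied where the coefficient of velocity variation exceeds the cut-off Γ and a sine weight of the same order elsewhere, with θ restricted to the range 0 to 90 degrees. The names `angle_weight` and `condition_gradient` and the per-point list of angle bins are hypothetical; the defaults Γ = 0.36 and m = 3 are the values listed for scheme M6.

```python
import math


def angle_weight(theta, coeff, gamma=0.36, order=3):
    """Hypothetical weight: cos^order(theta) if coeff > gamma, else sin^order(theta).

    theta is in radians and assumed to lie in [0, pi/2].
    """
    if not 0.0 <= theta <= math.pi / 2:
        raise ValueError("theta must lie in [0, pi/2]")
    base = math.cos(theta) if coeff > gamma else math.sin(theta)
    return base ** order


def condition_gradient(angle_gradient, thetas, coeff, gamma=0.36, order=3):
    """Weighted sum over angle bins of the angle-domain gradient at one grid point."""
    if len(angle_gradient) != len(thetas):
        raise ValueError("one gradient value is required per angle bin")
    return sum(angle_weight(t, coeff, gamma, order) * g
               for t, g in zip(thetas, angle_gradient))


if __name__ == "__main__":
    thetas = [math.radians(a) for a in (0.0, 30.0, 60.0, 90.0)]
    gradient_bins = [1.0, 1.0, 1.0, 1.0]   # made-up angle-domain gradient
    print(condition_gradient(gradient_bins, thetas, coeff=0.45))  # cosine branch
    print(condition_gradient(gradient_bins, thetas, coeff=0.30))  # sine branch
```

Both branches vary smoothly with θ, which is the property the text relies on to avoid a tapered window on hard angle cut-offs.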

Results and discussion
FWI has been implemented in this study using the C and CUDA (Garland et al. 2008) programming languages. Conventional FWI, FWI with a weighted angle range and the proposed method are compared on the different models to understand the behavior of the proposed method relative to conventional and existing angle-based gradient preconditioning. A point source with a Ricker wavelet is used. A staggered-grid finite-difference method with PML boundary conditions is used to solve the acoustic wave equation on Graphics Processing Units.
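For reference, a Ricker wavelet of the kind used as the point source can be generated as in the sketch below, which uses the standard closed form in terms of the peak frequency. The peak frequency, sampling interval and time delay in the example are illustrative assumptions, since the values used in this study are not stated here, and the function names are hypothetical.

```python
import math


def ricker(t, peak_freq):
    """Ricker wavelet amplitude at time t (s) for a given peak frequency (Hz)."""
    a = (math.pi * peak_freq * t) ** 2
    return (1.0 - 2.0 * a) * math.exp(-a)


def ricker_trace(peak_freq, dt, n_samples, delay=None):
    """Sampled Ricker wavelet, delayed so that it starts near zero amplitude."""
    if delay is None:
        delay = 1.0 / peak_freq   # assumed delay of one dominant period
    return [ricker(i * dt - delay, peak_freq) for i in range(n_samples)]


if __name__ == "__main__":
    trace = ricker_trace(peak_freq=10.0, dt=0.001, n_samples=201)
    peak_index = max(range(len(trace)), key=lambda i: trace[i])
    print(peak_index, trace[peak_index])
```

The delay shifts the wavelet peak away from time zero so that the source starts smoothly, which avoids injecting an abrupt step into the finite-difference grid.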

Comparison of FWI schemes
To understand the behavior of the proposed method relative to other methods, it is compared with conventional FWI and with FWI using the weighted angle range technique. All the techniques are run on the Marmousi 2-D model, whose original velocity model is shown in Fig. 1. The initial model used for FWI is shown in Fig. 2. This model is generated by iteratively smoothing the actual velocity model. It is smoothed to such an extent that only a few major boundaries where the velocity changes remain visible, and no other detailed features can be seen. To investigate the effect of angle-based gradient conditioning on FWI, various weight functions are tested, as shown in Table 1. The length parameter l for the computation of the velocity variation is kept at 100 m.
The coefficient of velocity variation at the initial iteration is calculated and shown in Fig. 3. The water layer has a coefficient of velocity variation of 0.36, and the rest of the model has values greater than 0.4. This indicates that the application of the proposed angle weights is valid for most of the model.
All these methods are used for inversion, and the results after the 100th iteration are shown in Fig. 4. All the schemes (M1-M6) except M2 are able to invert the model to a good extent. The M2 scheme fails to invert the anticline nodes present in the central part of the model.
The proposed method M6 provides a relatively better model than all other schemes. To give better insight into the inverted models, zoomed sections are shown in Fig. 5. The zoomed sections confirm that the existing angle-based scheme (M2) performs worse than all other schemes, as it includes the low-angle information only. Note that for M2, different weight functions per iteration may lead to different outcomes. It is also observed that the schemes based on the proposed method (M4, M5 and M6) outperform the other methods in terms of velocity model inversion.
To show the relative convergence, the misfit functions for the M1, M2, M3 and M6 schemes are plotted in Fig. 6. The curves show that the M6 scheme converges faster than the other methods. The M2 scheme, which uses selective ranges of angles at different iterations, performs better than the M1 and M3 schemes. The better convergence of M6 compared with M3 indicates the effectiveness of the proposed method.
In the same manner, the misfit functions for M1, M4, M5 and M6 are shown in Fig. 7, which indicates that all the schemes based on the proposed method perform better than M1. At the initial stage, M5 has a slower convergence rate, but as the iterations proceed it converges well.
To test the feasibility of the proposed method further, the coefficient of velocity variation for the M6 scheme after the 200th iteration is calculated and shown in Fig. 8. At the 200th iteration, the areas highlighted as colored blocks indicate gradient contributions from relatively lower angles due to the cosine-based weights. Areas in grey/white indicate gradient contributions from higher angles due to the sine-based weight function.
At higher iterations, most of the model is updated using the sine-based weight function, and the areas with high lateral velocity variation are updated using cosine-based weight functions, which are found to provide better models. A comparison of the inverted models at different iterations is shown in Fig. 9. The use of the coefficient of velocity variation provides models of similar quality in far fewer iterations.

Execution time comparison
FWI performs well enough for subsurface imaging. However, it is a time-consuming process, so the execution time of any new development must be considered. With this in mind, a comparison of the run times of a single iteration over all the shots is shown in Fig. 10.
From the plot, it is observed that the M1 scheme takes the least time (98 s). M2 takes 99 s, whereas M3 takes 106 s. For the proposed M4, M5 and M6 schemes, each iteration takes 103 s, 105 s and 106 s, respectively. This shows that the run time increases as the cos(θ) weight increases. However, the run-time difference is small (at most 8 s per iteration), and the overhead of computing the weights is negligible.
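The relative cost of each scheme can be worked out directly from the per-iteration run times quoted above. The short calculation below takes M1 as the baseline, which is a choice made here for illustration; the function name `overhead_percent` is hypothetical.

```python
# Per-iteration run times in seconds, as quoted in the text.
RUN_TIME_S = {"M1": 98, "M2": 99, "M3": 106, "M4": 103, "M5": 105, "M6": 106}


def overhead_percent(run_times, baseline="M1"):
    """Run-time overhead of each scheme relative to the baseline, in percent."""
    base = run_times[baseline]
    return {name: 100.0 * (t - base) / base for name, t in run_times.items()}


if __name__ == "__main__":
    for name, pct in overhead_percent(RUN_TIME_S).items():
        print(f"{name}: {pct:.2f} %")
```

With these numbers the proposed schemes M4 to M6 cost roughly 5 to 8 % more per iteration than M1, and M6 matches the cost of M3.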

Conclusions
A velocity variation coefficient-based angle-dependent gradient conditioning is proposed as an approach for obtaining better results in FWI. The technique aims to obtain a high-resolution velocity model. The velocity model is first updated at a large angle scale and then at progressively smaller scales as the iterations proceed. At each iteration, the previous result is preserved and carried forward without any overlap. Once all the angles are covered, the data misfit is minimized and a high-resolution velocity model is obtained. The algorithm is tested on the Marmousi dataset and performs much better than the conventional algorithm as well as other recently proposed angle-based gradient conditioning algorithms. The conclusions drawn are as follows:
1. The proposed algorithm gives stable and smooth inversion results, attenuates artifacts and is kept from falling into local minima.
2. The computation is done at several levels, and the execution time analysis shows that the proposed scheme runs in about the same time as the other schemes.
3. The convergence curve analysis of the proposed algorithm shows that better results can be obtained with fewer iterations by restoring the appropriate velocity range. At each iteration, preserving previous results without overlapping provides a smoothed and enhanced result with all the angle information.
4. In practice, the proposed scheme achieves much better resolution than the conventional and other proposed algorithms for the number of iterations needed for convergence to the true solution.
The validation of the proposed method indicates that our approach efficiently obtains better results while adding little to no computation time to the FWI workflow by tuning each velocity range. Finally, the methodology presented in this paper makes FWI converge faster by taking advantage of scattering angle filtering based on the proposed coefficient of velocity variation.
Fig. 12 Proposed coefficient of velocity variation for the initial and actual velocity profiles in the lateral direction. For the initial iterations, most of the model is updated by cosine-based weights, which helps in inverting large-scale structures; as the model gets inverted close to the actual velocity, the proposed scheme applies cosine-based weights in the areas with high velocity variation

Appendix
To illustrate the selection of weights by the proposed method, a conceptual lateral velocity profile with velocity variation is created and shown in Fig. 11 along with its smoothed version. For the application of FWI, an initial or smoothed velocity is used. The proposed coefficient of velocity variation $\hat{C}_{v,l}$ for the conceptual initial and actual velocity profiles, with a length of 100 m, is shown in Fig. 12.
The calculated coefficient of velocity variation $\hat{C}_{v,l}$ for the initial velocity profile indicates the application of cosine-based angle weights, which helps in inverting large-scale structures. The coefficient of velocity variation for the actual model indicates that sine-based angle weights are applied over most of the lateral profile and cosine-based weights are applied in the areas with high velocity variation. The conceptual velocity profile and the calculated coefficient of velocity variation show the auto-selection of angle weights based on the proposed coefficient. The proposed scheme adds minimal overhead in determining the angle weights for gradient conditioning.