Convolution-based multi-scale envelope inversion

Envelope inversion (EI) is an efficient tool to mitigate the nonlinearity of conventional full waveform inversion (FWI) by utilizing the ultralow-frequency component in the seismic data. However, the performance of envelope inversion depends on the frequency component and initial model to some extent. To improve the convergence ability and avoid the local minima issue, we propose a convolution-based envelope inversion method to update the low-wavenumber component of the velocity model. Besides, the multi-scale inversion strategy (MCEI) is also incorporated to improve the inversion accuracy while guaranteeing the global convergence. The success of this method relies on modifying the original envelope data to expand the overlap region between observed and modeled envelope data, which in turn expands the global minimum basin of misfit function. The accurate low-wavenumber component of the velocity model provided by MCEI can be used as the migration model or an initial model for conventional FWI. The numerical tests on simple layer model and complex BP 2004 model verify that the proposed method is more robust than EI even when the initial model is coarse and the frequency component of data is high.


Introduction
Building an accurate velocity model is essential for reservoir characterization and seismic imaging. Proposed decades ago, full waveform inversion has gained great interest as a promising data-fitting procedure for exploiting full information from the seismogram (Tarantola 1984;Virieux and Operto 2009). The theory and framework of FWI have been developed in different inversion domains (Gauthier et al. 1986;Pratt et al. 1998;Cha 2008, 2009) and in different inversion mediums (Yang et al. 2014;Köhn 2011;Mora 1987Mora , 1988. With the growth of computing power and the enhanced quality of recorded data, some pioneering applications of FWI have been performed in both global scales and in the seismic exploration problems (Brenders and Pratt 2007;Wang et al. 2014;Wang et al. 2015;Operto et al. 2015;Fichtner et al. 2009).
An efficient approach to conduct full waveform inversion is the local optimization method by which the model is iteratively updated using the gradient of the misfit function (Virieux and Operto 2009). However, one instinct problem of the local optimization strategy is that the solution tends to fall into the local minimum of the misfit function because of the highly nonlinear nature of FWI (Fichtner and Trampert 2011). Low-frequency component is of great importance which can significantly expand the global minimum basin of the misfit function. Multi-scale inversion strategy is commonly adopted, which starts the inversion procedure from low frequency and then gradually incorporates high-frequency component (Bunks et al. 1995;Boonyasiriwat et al. 2009). However, the performance of multi-scale inversion strategy is limited by the availability of low-frequency data which is usually expensive (Luo and Wu 2015;Morgan et al. 2013). Wu et al. (2014) proposed to use the envelope-based misfit function to extract the ultralow-frequency component which utilizes the idea of Bozdağ et al. (2011). Seismic envelope inversion methods can demodulate the ultralow-frequency components from the original seismic data for FWI, by using the nonlinear demodulation operator, and the initial model dependence in waveform inversion is reduced. The success of envelope-based inversion lies in that the adjoint source is related to waveform and envelope data while the adjoint source of conventional FWI is entirely determined by waveform (Chi et al. 2014). The implementation issues, as well as the noise resistance ability of envelope inversion, are discussed in detail by Luo and Wu (2014) and Luo et al. (2016). Envelope-based misfit function, combined with the conventional FWI engine, is successfully applied in acoustic and elastic cases (Huang et al. 2015).
Although the envelope inversion is proven to be efficient in initial model building, there are several theoretical drawbacks (Bharadwaj et al. 2016). Firstly, envelope-based misfit function can only update model when the envelope of modeled data and observed data is partly overlapped, which means it fails when starting from initial model that separates the modeled and the observed data more than the half dominant period. Secondly, the envelope-based misfit function lacks the inversion ability for reflected wave, and some artifacts may be introduced when inverting for reflected wave only as verified by Wu et al. (2014), which means the data that could be utilized by envelope-based misfit function are very limited. Thirdly, conventional envelope inversion does not adapt to strong-scattering media, such as the SEG/EAGE salt model (Chen et al. 2018;Kadu et al. 2016;Wu and Chen 2017). Bharadwaj et al. (2016) proposed bump functional which is a generalized envelope-based misfit function and has improved global convergence ability. Besides, a combination of multi-scale inversion engine (Bunks et al. 1995;Boonyasiriwat et al. 2009) and envelope-based misfit function may further expand the global minimum basin.
To improve the feasibility of envelope inversion and guarantee the computational efficiency, firstly we illustrate the case when conventional EI fails by analyzing the configuration of misfit function. Then a convolution-based envelope inversion scheme is proposed and the corresponding gradient is deduced by adjoint state method. In order to improve the inversion accuracy while guaranteeing the convergence ability, a multi-scale strategy is introduced by adjusting the convolution factors. Theoretical analysis and application on the simple layered model verify the correctness of MCEI, while the test on BP 2004 model indicates its potential for complex models.

Theory
The misfit function of conventional envelope inversion is defined as (Bozdağ et al. 2011;Wu et al. 2014;Chi et al. 2014): where u obs and u cal represent the observed and modeled data, respectively, and s , r , t represent the source and receiver position and recording time correspondingly. m represents the model parameter, which means V p in this paper. The envelope of seismic data E(u) can be obtained by the following equation: where H(u) represents the Hilbert transform of u.
Using the adjoint state method (Plessix 2006;Fichtner and Trampert 2011), the gradient of misfit function (1) can be expressed as: where û denotes the adjoint wavefield generated by propagating the adjoint source in reverse time direction from the receiver position, and B represents the forward modeling operator, which is chosen as acoustic finite-difference forward modeling method. The forward wavefield and adjoint wavefield obey the wave equation and adjoint equation: and where S and S adj represent source and adjoint source term, respectively, and B represents the adjoint forward modeling operator which utilizes the terminal condition and propagates in a reverse time manner.
The adjoint source term from misfit function (1) can be expressed as Chi et al. 2014): Bharadwaj et al. (2016) pointed out that the performance of conventional envelope inversion relies on the frequency of the source wavelet and accuracy of initial model, which means the initial model should guarantee that the difference between observed and modeled data is smaller than one dominant period. Figure 1a shows the observed (blue line) and modeled data (red line) obtained by convolving a spike and a Ricker wavelet with the dominant frequency 7 Hz. It is clear that the difference between two signals is greater than half period and cycle skipping arises when using the conventional L2 norm distance as misfit function.
the cycle-skipping problem is eliminated and envelope inversion is suitable because the envelope of modeled and observed data is overlapped (Bharadwaj et al. 2016). However, for another case, with a stable spike but a higher frequency (15 Hz) Ricker wavelet is employed, as shown in Fig. 2a, the envelope data are not overlapped again, because the difference between modeled and observed data is greater than one dominant period. The dashed line in Fig. 3 represents the misfit function (1) by shifting the modeled data, and the current model point is denoted by the red star. It is clear that the gradient of misfit function at current model point is zero, so the conventional envelope-based inversion fails to update the model.
In order to alleviate the dependence on the frequency component, we modify the misfit function as: where h(t) is the new smoothing function. The effect of the smooth function is to modify the original envelope data to the weighted sum of the current point and the points in the vicinity. In this paper, we choose the smoothing function as: where t max represents the time with maximum recording and is the convolution factor which controls the degree of By convolving the original envelope data with a smoothing function, the overlapping region of observed and modeled data will be expanded and better convergence ability should be expected, especially for seismic data with highfrequency content and initial models far away from true ones.
The adjoint source is obtained, and the adjoint wavefield û(x, t) is propagated using the new adjoint source in Eq. 10. The gradient of misfit function is as follows.
Full waveform inversion updates the velocity model through iterations to minimize data misfit. The iteration formula is written as, where k is the iterating number, H k represents the modification term of gradient, k is step length, which is obtained by parabolic interpolation. In this paper, the conjugate gradient method is chosen as the local optimization method.
The envelope data convolved with the smoothing function are shown in Fig. 2c, where the observed and modeled data are overlapped. The corresponding misfit function is shown by the solid line in Fig. 3. It is clear that the gradient of the new misfit function at current point is not zero and velocity model can be updated by gradient-based optimization algorithm, without falling into the local minima.
However, the convolution factor in Eq. (8) must be chosen carefully, which should satisfy following requirements: 1. The convolution factor should be big enough so that the modified envelope data are overlapped, which means the initial model is in the basin of global or local minima. 2. The convolution factor should be small enough to avoid cross-talk between different events, which may lead to the failure of convergence. Figure 4a shows the observed and modeled data, both of which contain two events. The corresponding envelope data are shown in Fig. 4b. It is clear that cycle-skipping arises indicated by the black arrow. We convolve the original envelope data with the smoothing function and set = 0.05. The modified envelope data are shown in Fig. 5a, and the misfit function before and after applying the smoothing function is shown in Fig. 5b. Because of the smoothing effect, the global minimum basin is expanded and local minima are eliminated, which verify that the smoothing function is helpful to improve the global convergence. However, when is set as 0.3, severe distortion can be observed in the modified envelope data (Fig. 6a). Although the basin is expanded, the global minimum does not coincide with the real minimum, which means the inversion accuracy is decreased (Fig. 6b).
Based on the analysis of the effect of the smoothing factor, we propose a multi-scale inversion scheme. Because the choice of can directly determine the convergence ability and the accuracy of inversion result, we can start the inversion from a big value and then gradually decrease it. The main purpose of multi-scale inversion method is, on the one hand, to guarantee the global convergence ability of misfit function and, on the other hand, to improve the inversion accuracy gradually.

Simple layered model
The first example is based on a simple layered model (Fig. 7a), and a constant gradient model is chosen as the initial model (Fig. 7b), which implies no prior knowledge of the model. The grid interval for horizontal and vertical directions is 10 m. A total of 80 shots are positioned by a 100-m interval on the surface, and every grid point on the surface is used as a receiver point. For waveform modeling, a tenth-order finite-difference stencil is performed for the spatial discretization and a second-order scheme is implemented for the time discretization (Levander 1988). Figure 9a, b provides the inversion results of conventional FWI after 500 iterations when dominant frequency of Ricker wavelets is 7 Hz and 15 Hz, respectively. The low-frequency information (0-4 Hz) in shot data is muted by applying a high-pass filter (Fig. 8). The lack of low-frequency information and the presence of errors in the initial model result in an incorrect high-wavenumber update, which prevents conventional FWI from converging to the true model. Starting from the linear initial model, the smooth background model is obtained from envelope inversion using a Ricker wavelet with a central frequency of 7 Hz. Due to the weaker nonlinearity, EI generates long-wavelength updates and  Fig. 7 a The true layered model and b the initial linear increasing model successfully obtains the background velocity, especially for structure of the bottom layer (Fig. 10a). The inversion result of EI is used as the starting model for the high-wavenumber recovery by FWI. Figure 10b shows the final result of combined inversion (EI + FWI) by waveform inversion using the recovered smooth background from EI. Most structures are successfully recovered, including the low-velocity anomaly and the dip faults. However, when the higher frequency (15 Hz) Ricker wavelet is used, the conventional EI is unable to provide satisfactory velocity estimation (Fig. 11a) and the inversion result is almost the same as the initial model. This is emphasized by the inversion result of subsequent FWI (Fig. 10b).
Then MCEI is applied to the high-frequency case, and we define convolution factor as 0.1 2 (i−1) , where i represents inversion stage number increasing from 1 to 10 with an interval of one. In this way, the convolution factor decreases from 0.1 to 0.00,020. Five iterations are carried out at each convolution factor sequentially. The long-wavelength models we obtain after 2 and 10 global stages are shown in Fig. 11a and Fig. 12b. As expected, the resolution of the image improves as smaller is used in the inversion and the final inversion result fits the longwavelength components of the real model pretty well. With this background model, we can significantly improve the FWI inversion result. The following FWI successfully avoids the local minima and provides an excellent inversion result, which is almost the same as the real model in the shallow parts. The low-velocity abnormal and dip fault are recovered accurately.

BP 2004 model
The second example involves BP 2004 model (Fig. 13a), which is representative of the geology of the Gulf of Mexico (Billlette and Brandsberg-Dahl 2004). This area is characterized by a deep water environment and the presence of complex salt structures. The original model is re-scaled due to the computation limitation, and the size of target area is 5.4 km by 1.07 km. The grid intervals in horizontal and vertical directions are both 10 m. We have sources and receivers on every grid at the surface, with source spacing 200 m and receiver spacing 10 m. A Ricker wavelet with peak frequency of 7 Hz is used. The top boundary condition is absorbing boundary condition. We muted low-frequency  Fig. 14. The modeling is performed by the same scheme as the former test. The maximum recording time is 2.4 s, and the sampling interval is 0.8 ms. The source function and water layer are assumed known during the inversion. We use two constant gradient models to test the dependence of EI and MCEI on the accuracy of initial model. The first one varies from 1500 m/s to 4670 m/s (Fig. 13b) which is very close to the real average background velocity (about 1500 m/s to 4500 m/s). The second is more distant from the exact model, varying from 1500 m/s to 3670 m/s (Fig. 15), which underestimates the increase in velocity in depth. Starting from these two initial models, MCEI and EI are used to interpret the data.
Starting from the first initial model, the inversion result using the conventional time-domain FWI after 500 iterations is shown in Fig. 16. Because the initial model deviates substantially from the exact model and the data lack low-frequency information, the traditional FWI merely adds some detailed short wavelength structures in shallow part. According to this result, conventional FWI fails to update the area deeper than 0.5 km, especially for the inner salt body and sub-salt structures.
To help traditional FWI to reach global convergence, we use EI to invert for the long-wavelength structure and to provide a more accurate initial model. After 200 iterations, the inversion result is shown in Fig. 17a, the salt domes successfully recovered. The salt structure in the middle part is successfully retrieved even for the deep part. However, we note some errors in the right bottom part because of lacking of illumination. To verify the accuracy of the inversion result, the inversion result of EI (Fig. 17a) is used as the input velocity model of conventional FWI and the inverted result is shown in Fig. 17b after 500 iterations. In this way, we can refine the low-wavenumber component model from EI with high-wavenumber details. Of particular note is that the shapes of domes are almost the same as in the true model; the interfaces and the velocities of low-velocity anomaly below the central salt-dome are recovered accurately.
To further investigate the benefits of EI, Fig. 18 provides the variation of normalized data and model misfits of FWI. Conventional FWI decreases the data misfit slower than the proposed inversion strategy, but due to the cycle skipping, the model misfit, in fact, does not decrease. On the other hand, FWI after EI reduces both data and model misfits effectively.
With the second initial model, we believe the inversion result of conventional FWI is much poorer than that shown in Fig. 17, which is not provided for clarity. To compare the robustness of EI and MCEI with respect to initial model, firstly we perform conventional EI for 200 iterations. The inversion result is shown in Fig. 19a. Because the seismic data contain some long-offset refracted waves and the gradient model is not very far from the true model in the shallow part, conventional EI obtains an acceptable inversion result for the shallow part. However, the lack of low-frequency component leads to an inaccurate inversion result especially in the deep part of the model. Some artifacts can be observed clearly under salt body. Sub-salt velocity structures are not clear, and inner-salt velocity is not accurate as well.
In comparison, with the broader global minima basin, the velocity model estimation obtained by MCEI significantly close to the exact model (Fig. 20a) especially for the deep part and the low-velocity abnormal in the middle of salt body. The inversion result of the following FWI (Fig. 20b) is almost identical to which is shown in Fig. 16b which verifies the accuracy of MCEI result. We show the MCEI adjoint source and its spectrum in Fig. 21. Besides, we also plot the curves of the seismic data residual and model residual reduction in Fig. 22. It is clear that FWI after MCEI reduces the data misfit and model misfit more rapidly.

Conclusion
A convolution-based envelope misfit function is proposed to improve the robustness of envelop inversion when cycleskipping problem is serious during FWI process. The gradient and the adjoint source are provided based on the adjoint state method. In order to improve the inversion accuracy and guarantee the global convergence, a multi-scale scheme is proposed. The combination of MCEI and conventional FWI can estimate the low-and high-wavenumber components of the velocity model in sequence even when the data frequency is high and the initial model is coarse.  . 22 The variation of normalized a data and b model misfit. FWI after MCEI decreases both model misfit and data misfit more rapidly than EI + FWI (Grant No. 2016ZX05014001, 2016ZX05002). J. P. Huang is supported by Tai Shan Science Foundation for the Excellent Youth Scholars.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.