1 Introduction

The crustal structure of the Earth is fundamental to our understanding of the formation mechanisms and evolution processes of the crust as well as dynamics of the Earth’s deep interior. High-resolution images of the crust are also essential to characterize oil and gas reservoirs or mineral deposit fields, and to understand earthquake and other geological hazards.

Numerous seismological studies that utilize active sources or passive earthquakes have been devoted to imaging the high-resolution crustal structures for decades. Active source studies commonly provide good constraints on seismic velocity within the lithosphere, but are expensive and may cause environmental issues when chemical explosions are used (e.g., Prodehl and Mooney 2012, references therein). Techniques using passive seismic data to investigate the subsurface seismic structure overcome these shortcomings and have seen rapid developments with the increased availability of high-quality data from dense arrays. Many key understandings of the Earth structure have been learned by seismic imaging methods, but resolutions using different datasets are various. For example, teleseismic body wave tomography provides good constrains on the mantle structure (e.g., Dziewonski et al. 1977; van der Hilst et al. 1997; Bijwaard et al. 1998; Karason and van der Hilst 2001; Zhao 2004), but have poor resolution in the crust; Local body wave tomography has high spatial resolution, but only limited to a certain region that is seismically active and with good ray path coverage (e.g., Aki and Lee 1976; Eberhart-Phillips 1990; Thurber 1983; Zhang and Thurber 2003); Teleseismic surface wave tomography offers good resolution of the upper mantle (e.g., Trampert and Woodhouse 1995; Shapiro and Ritzwoller 2002; Simons et al. 2002), but has difficulty in resolving the shallow to middle crust structure due to its lower frequency content (mostly above 20-s period). In particular, ambient noise tomography provides powerful complementary constraints on regional- or local-scale crustal structures with short to intermediate period surface waves (e.g., Shapiro and Campillo 2004; Shapiro et al. 2005; Yao et al. 2006; Yang et al. 2007; Lin et al. 2008; Yao et al. 2008; Fang et al. 2015). Receiver function (RF) imaging provides powerful constraints on the discontinuities but lacks of absolute velocity information (Langston 1979; Ammon et al. 1990; Julià et al. 2000). In these regards, traditional imaging methods on their own as described above only offer limited constraints on fine crustal structures.

Recently, joint inversion methods, which combine multiple measurements with different sensitivities to the structure, have led to substantial improvements on the resolution of the crust. Joint surface wave tomography with both earthquake surface waves and ambient noise cross-correlation can successfully characterize crustal and upper mantle structure (e.g., Yao et al. 2008, 2010). Studies have also shown that the vertical and horizontal components of fundamental mode Rayleigh waves have a 90° phase shift, and their amplitude ratio (ZH ratio or ellipticity) is frequency dependent and only controlled by the structure beneath the recorded seismic station (Tanimoto and Alvizuri 2006; Tanimoto and Rivera 2008) under the 1D layered model assumption. Rayleigh wave ZH ratio has higher sensitivity to shallower structures compared to Rayleigh wave dispersion at the same period (Lin et al. 2012). Recent studies (Chong et al. 2015, 2016) also indicate that Rayleigh wave ZH ratio is sensitive to the smooth velocity variation (long wavelength velocity contrast) in the depth range that it samples. Therefore, Rayleigh wave ZH ratio can also serve as another useful measurement for wave speed structure inversion (Tanimoto and Alvizuri 2006; Tanimoto and Rivera 2008; Yano et al. 2009; Tanimoto et al. 2013). Lin et al. (2012, 2014) successfully performed joint inversion of Rayleigh wave dispersion and ellipticity using a linearized inversion algorithm for the crustal and upper mantle shear wave velocity (v S) structure of the Western United States. Chong et al. (2015) and Yuan et al. (2016) used the global optimization methods to invert Rayleigh wave phase velocity and ZH ratio data for layered v S structures. But the surface wave data as described above only resolve smooth velocity variations at depths and cannot distinguish a rapid change in velocity, due largely to their relative broad depth sensitivity to the velocity structure.

RF is particularly sensitive to the discontinuities in the crust and upper mantle beneath seismic stations, but the inversion of RF for crustal v S structures is intrinsically non-unique and highly depends on the initial model (e.g., Langston 1979; Owens et al. 1984; Ammon et al. 1990; Ammon 1991). Thus, it cannot well constrain absolute velocity structures alone. Julià et al. (2000) proposed a joint inversion using surface wave dispersion and RF data with a linearized inversion algorithm to provide tighter constraints on both the absolute v S structure and the velocity discontinuities. Thereafter, the joint inversion with these two datasets is extensively applied to studies in different areas, such as the Arabian Shield (Julià et al. 2003), East Africa (Julià et al. 2005), and southeastern Tibet (Liu et al. 2014). Shen et al. (2012, 2013) developed joint inversion of dispersion and RF using global optimization algorithm (the Bayesian Monte Carlo method) to obtain v S model and its associated uncertainty. The feasibility of the joint inversion with ellipticity and RF data was also explored by Chong et al. (2016) using the fast simulated annealing method.

Surface wave dispersion, ZH ratio, and RF provide complementary constraints on subsurface velocity structures, so their combinations are natural in the joint inversion. Recently, Kang et al. (2015) first combined these three datasets using a Bayesian Monte Carlo method to study the lithospheric v S structure at the depth of 0–150 km in Northeast China. But the global optimization algorithm is usually time-consuming even with a small set of model parameters.

In this paper, we propose a stepwise linearized inversion algorithm to jointly invert surface wave dispersion, Rayleigh wave ZH ratio, and receiver function data to better resolve 1D crustal v S structure. We first describe the methodology and then conduct a series of synthetic tests and Monte Carlo error analyses to demonstrate the feasibility and robustness of the proposed method. At last, we apply the new joint inversion strategy to real data from an example station in Northeast China.

2 Methodology

The forward problems for surface wave dispersion, Rayleigh wave ZH ratio, or RF can be expressed as

$${\varvec{d}} = {\varvec{F}}\left( {\varvec{m}} \right),$$
(1)

where m is an M-dimensional model vector representing seismic velocity and density parameters, d is an N-dimensional data vector that represents the combination of different data vectors (dispersion, ZH ratios, and RF), and \({\varvec{F}}\) denotes the nonlinear forward operator (e.g., Owens et al. 1984; Ammon et al. 1990; Yano et al. 2009; Lebedev et al. 2013). The inverse problem can be cast as minimizing the misfit function as

$$\phi \left( {\varvec{m}} \right) = \left\| {{\varvec{F}}\left( {\varvec{m}} \right) - {\varvec{d}}} \right\|_{2}^{2} \,+\, \eta^{2} \left\| {{\varvec{L}}\Delta {\varvec{m}}} \right\|_{2}^{2} ,$$
(2)

where \(\eta\) is a weighting coefficient to balance data fitness and model regularization; \(\Delta {\varvec{m}}\) denotes model update; and L is the first- or high-order Tikhonov smoothing operator (Aster et al. 2011).

As three datasets are primarily sensitive to v S (Fig. 1), we only retrieve the v S model with joint inversion in this paper. Therefore, the model vector m can be simplified as \({\varvec{m}}_{\text{S}}\), representing an M-dimensional vector of v S with fixed thickness layers. P-wave velocity and density are computed from S-wave velocity by empirical relationships given by Brocher (2005). Then a linearized iterative damped least-square scheme (e.g., Julià et al. 2000, 2003) is used to minimize the misfit function in Eq. (2), through the weighted normal equation

$$\left[ {\begin{array}{*{20}c} {\theta {\varvec{G}}_{\text{sw}} } \\ {\varphi {\varvec{G}}_{\text{zh}} } \\ {\lambda {\varvec{G}}_{\text{rf}} } \\ {\eta {\varvec{L}}} \\ \end{array} } \right]\left[ {\Delta {\varvec{m}}_{\text{s}} } \right] = \left[ {\begin{array}{*{20}c} {\theta {\varvec{r}}_{\text{sw}} } \\ {\varphi {\varvec{r}}_{\text{zh}} } \\ {\lambda {\varvec{r}}_{\text{rf}} } \\ {\varvec{0}} \\ \end{array} } \right],$$
(3)

and we update the v S model vector by \({\varvec{m}}_{\text{S}}^{{\left( {n + 1} \right)}} = {\varvec{m}}_{\text{S}}^{\left( n \right)} + \Delta {\varvec{m}}_{\text{S}}^{\left( n \right)}\) iteratively. \(\Delta {\varvec{m}}_{\text{S}}\) is the v S model update and we need an initial reference model \({\varvec{m}}_{\text{S}}^{\left[ 0 \right]}\) to start the iteration. The weighting coefficients are given by

$$\theta^{2} = \frac{p}{{N_{\text{sw}} \sigma_{{{\text{sw}},i}}^{2} }},\;\;\varphi^{2} = \frac{q}{{N_{\text{zh}} \sigma_{{{\text{zh}},i}}^{2} }},\;\;\lambda^{2} = \frac{r}{{N_{\text{rf}} \sigma_{{{\text{rf}},i}}^{2} }},$$
(4)

which are used to equalize the relative importance of different datasets by the influence coefficients pqr (\(p + q + r = 1\)), and account for the number of data points and physical units by N and data variance \(\sigma^{2}\). The subscripts sw, zh, and rf in all equations represent dispersion, ZH ratio, and RF data, respectively; \({\varvec{r}}_{\text{sw}} ,{\varvec{r}}_{\text{zh}} ,{\varvec{r}}_{\text{rf}}\) are the corresponding residual data vectors for different datasets; \({\varvec{G}}_{\text{sw}} ,{\varvec{G}}_{\text{zh}} ,{\varvec{G}}_{\text{rf}}\) are the corresponding partial derivative matrices, which can be calculated numerically through difference algorithms (e.g., Julià et al. 2000; Tanimoto and Alvizuri 2006; Julià 2007).

Fig. 1
figure 1

Sensitivity kernels to v S, v P, and density (Rho) of the fundamental mode Rayleigh wave phase velocity (RC) at periods 5 s (a) and 20 s (b) and ZH ratio (ZH) at periods 5 s (c) and 20 s (d). Sensitivity kernels of surface waves are computed from the PREM model (Dziewonski and Anderson 1981) using a difference method, and the forward calculations of both datasets are derived from the package by Herrmann (2013). Differential seismograms to v S (e), v P (f) and density (g) of RF are computed for a one-layer crust model with 30 km thickness (v S = 3.5 km/s, v P = 6.0 km/s and density is 2.7 g/cm3). They are computed by an efficient algorithm of Randall (1989) derived from the package by Charles J. Ammon at http://eqseis.geosc.psu.edu/~cammon/HTML/RftnDocs/thecodes01.html

Instead of inverting three datasets at the same time, we propose a stepwise joint inversion strategy for better taking advantages of their complementary sensitivities to the v S structure.

Stage 1: We first invert surface wave dispersion and ZH ratio data for a smooth average v S model in the crust and upper mantle, because their sensitivities are spanned over broad depth ranges and deepen with periods (Fig. 1a–d). The inclusion of ZH ratio data can provide better constraints on the shallow crustal v S structure comparing with the inversion with dispersion alone, as ZH ratio has higher sensitivity to shallow v S structure (Fig. 1a–d). The obtained smooth model is served as the initial model for the next-stage inversion, which helps to mitigate the strong initial model dependency of RF inversion.

Stage 2: We then incorporate RF data into the joint inversion to retrieve a finer v S model considering its high sensitivity to discontinuities (Fig. 1e–g). In this stage, we use a larger influence coefficient (i.e., r > p and r > q) for RF data, in order to resolve small-scale v S velocity contrast information based on the obtained smooth v S model in Stage 1. At the same time, we use smaller influence coefficients (pq) of both dispersion and ZH ratio data compared to those in Stage 1. This adjustment of influence coefficients for surface wave data is because both datasets have already been well fitted after the first-stage inversion. But we still keep them in the second-stage inversion to constrain the absolute v S structure so that the final inversion result would not be dominated by RF data only.

3 Synthetic tests

3.1 Inversion with noise-free data

We generated the noise-free synthetic datasets (Fig. 2a–c) from a v S model with 1.25 km layer thickness, involving low-velocity layers in the uppermost crust (v S about 2.5–3 km/s at depth of 0–5 km) and a low-velocity zone in the lower crust (v S ~ 3.3 km/s at depth of 20–25 km) (Fig. 2d). The P-wave velocity and density were derived based on the v S model through empirical relationships (Brocher 2005). As our research interest focuses on the crustal structure, we only used the dispersion data (phase and group velocities) at periods from 5 to 50 s (Fig. 2a), Rayleigh wave ZH ratio at periods from 5 to 60 s (Fig. 2b), with a 2.5-s period interval. RF waveforms are of 35-s duration, which contains most potential crustal multiples, with the Gaussian width parameter of 2.5, ray parameter of 0.06 s/km, and sampling rate of 0.1 s (Fig. 2c).

Fig. 2
figure 2

Inversion result after 7 iterations of Stage 1. Target (thick blue lines) and synthetic data from the final model (thin red lines) for (a) Rayleigh wave phase (RC) and group velocity (RU) dispersion at periods from 5 to 50 s; (b) ZH ratio at periods from 5 to 60 s; (c) RF waveform of 35-s duration with the Gaussian width parameter 2.5 and ray parameter of 0.06 s/km. (d) Target v S model (thick blue solid line), inverted v S model (thin red solid line), and initial reference v S model (thick red dashed line)

We started with a uniform v S model with layer thickness of 1.25 km and velocity value of 3.5 km/s at each depth (Fig. 2d), and then performed the first-stage inversion using noise-free synthetic dispersion (Fig. 2a) and ZH ratio (Fig. 2b) with influence coefficients of p = q = 0.5. Here we used the first-order Tikhonov regularization with smoothing parameter η = 0.5. Figure 2 shows the obtained v S model (Fig. 2d) and data fitness (Fig. 2a–c) after 7 iterations of Stage 1. It presents nearly perfect fitness of the group/phase velocity dispersion (Fig. 2a) and ZH ratio (Fig. 2b). But the RF waveform predicted by the optimal model cannot fit the observation quite well (Fig. 2c). We then incorporated receiver function data with influence coefficients of r = 0.8, p = 0.1, q = 0.1 in Stage 2. Figure 3 shows the final inversion results. The inclusion of RF data in the inversion in Stage 2 led to nearly ideal recovery of small-scale velocity contrasts, and all noise-free datasets are fitted well (Fig. 3).

Fig. 3
figure 3

Inversion result after 14 iterations of Stage 2. Symbols and explanations are the same as in Fig. 2

3.2 Effects of influence coefficients in Stage 2

Since RF data have much better constraints on small-scale velocity contrasts than surface wave data, we use a larger influence coefficient of RF data (e.g., r = 0.8) and smaller influence coefficients of both dispersion and ZH ratio data (p = q = 0.1) in Stage 2. For the above noise-free datasets, we try another two inversion tests with p:q:r = 2:2:6 and 3:3:4 in Stage 2, and both tests get very similar results as that in Fig. 3. Larger influence coefficient means more constraints from the corresponding dataset. From another perspective, different influence coefficients will nearly not change the global minimum of the misfit function for noise-free datasets, so we will get nearly the same solution. In practice, the influence coefficient depends on the data quality. We can reduce the influence coefficient of the corresponding data with poor quality (large uncertainties) in the joint inversion procedure.

3.3 Initial model dependence

RF and surface wave (dispersion or ZH ratio) data have initial model dependency with linear inversion strategy (Ammon et al. 1990; Yano et al. 2009; Lebedev et al. 2013), so we performed the stepwise joint inversion using a variety of different starting models to investigate its initial model dependency. Figure 4a and b presents 15 different initial models, in which both initial v S and Moho depths are altered with respect to the starting model in Fig. 2d. Figure 4c compares the true velocity model and all resulting models after joint inversion. We observe that all inverted models converge to the true solution with deviations less than 0.1 km/s of each layer in the crust after 20 iterations (Fig. 4c), which fully demonstrates that our stepwise joint inversion approach has little initial model dependency issues.

Fig. 4
figure 4

Initial model dependency tests. Thick blue lines represent the target v S models. a Different uniform half-space initial models with vs ranging from 2.3 to 4.6 km/s at each depth above the Moho (9 initial models, thin red solid/dashed lines). b Different uniform half-space models with Moho depths ranging from 23.75 to 35.25 km (6 different initial models, thin red solid/dashed lines). The thin red solid lines in (a) and (b) denote the initial reference model used for the synthetic test (see Fig. 2d). c Inversion results (thin red solid lines) of different initial models in (a) and (b). We use the same data parameters (surface wave periods, RF waveform duration and Gaussian width) as in the synthetic test 1 except 5-s period interval for dispersion and ZH ratio data

3.4 Monte Carlo error analyses with random data noise

In order to test the robustness of the new joint method in the presence of noise, we inverted 100 noisy datasets, generated from inclusions of random Gaussian noise into the noise-free synthetic datasets in Fig. 2a–c. Based on real data quality in practice, the noise levels were chosen to be 5% for receiver function waveforms and 1% for surface wave data, respectively. Noise levels are defined as the percentage of the maximum amplitude value of the waveform and the percentage of data values at different periods for receiver function and surface wave data, respectively. Then we performed two-stage inversion with the same inversion parameters (influence coefficients, smoothing parameter, and damping factor) as those in Sect. 3.2. As shown in Fig. 5, inverted results of 100 noisy datasets are very close to the true model (Fig. 5a) and the statistical histogram of root-mean-square (RMS) data residuals is Gaussian distributed (Fig. 5b), which demonstrates the robustness of our stepwise joint inversion against random data noise.

Fig. 5
figure 5

Inversion results from Monte Carlo error analyses with random data noise. a Target v S model (thick blue line) and inverted v S models (thin gray lines) after 12 iterations of 100 datasets with Gaussian random noise added. b Histogram of the root-mean square (RMS) data residuals of 100 Monte Carlo tests in (a). Noise levels are 5% for RF, 1% for both dispersion and ZH ratio, respectively

In case of poor data quality in practice, we performed another individual test (Fig. 6), where the noise levels are increased for three datasets, that is, 2% for surface wave data (dispersion and ZH ratio, Fig. 6a and b) and 25% for RF, respectively. We then filtered the RF with the Gaussian parameter of 2.5, in order to obtain the final noisy RF data (Fig. 6c). For this test, we used a larger smoothing parameter (about twice of that used in the noise-free inversion test) to avoid unreasonable jumps of the velocity model, with the cost of slower convergence speed (or more iterations), in comparison with the previous tests (Figs. 2, 3, 4, 5). Figure 6 presents the inversion result after 50 iterations, which is very close to the true model (Fig. 6d). Although the sharp interfaces are smeared (Fig. 6d), data fits are acceptable (Fig. 6a–c). This verifies that our stepwise joint inversion method can be successfully employed in some practical cases with large data noise level.

Fig. 6
figure 6

Synthetic test with large noise levels (25% for RF, 2% for group velocity (RU) and phase velocity (RC) dispersion and ZH ratio). Both noisy dispersion (a) and ZH ratio (b) are presented as solid points with estimated one standard deviation error bars; filtered noisy RF data (c) are shown as the thick blue line with the gray shaded area for its uncertainties; d target v S model (blue thick line) and the inverted v S model (thin red line) from the corresponding datasets in (a–c)

4 Application to real data

We apply the joint inversion method to the real data in the NE China. For this test, we use the Rayleigh wave phase velocity dispersion, ZH ratio, and RF data from NE53, a station of the NECESS Array, which was deployed in Northeast China from September 2009 to August 2011. This station is located at the Great Xing’an range. Both RF and Rayleigh wave phase velocity data are from Kang et al. (2015). And the ZH ratio measurements in the period band 8–100 s are from ambient noise (Li et al. 2016) and earthquake Rayleigh wave-based measurements (from Guoliang Li, personal communication). Here, we put emphasis on the validation of the new joint inversion method with real data, rather than a comprehensive analysis of the data or the geological interpretation of the inversion results.

Figure 7 shows all three datasets with uncertainties. Rayleigh wave phase velocity data are in the period range of 8–80 s, with a 2-s period interval from 8 to 32 s and then a 5-s period interval from 35 to 80 s. The period range of ZH ratio is from 8 to 100 s, with the period interval of 1 s from 8 to 30 s and 5-s period interval above 30 s. These two surface wave data were extracted from both earthquake and ambient noise (Kang et al. 2015; Li et al. 2016). The RF waveform was filtered with the Gaussian width of 3 and was made to a reference slowness of 0.06 s/km (Kang et al. 2015). The waveform duration is 10 s, and the uncertainties were estimated based upon a harmonic stripping analyses and compensate scaling (Shen et al. 2012, 2013; Kang et al. 2015). More detailed information about the data measurements can be found in Kang et al. (2015) and Li et al. (2016).

Fig. 7
figure 7

Real data application for the station NE53 in Northeast China (Kang et al. 2015; Li et al. 2016). a Rayleigh wave phase velocities with uncertainties presented as one standard deviation error bars (from Kang et al. 2015); b Rayleigh wave ZH ratio with uncertainties indicated by vertical bars (from Li et al. 2016); c RF with uncertainties shown as the gray shaded area (from Kang et al. 2015). Red solid lines in a–c correspond to synthetic datasets of the final inversion model. d Three different initial models (black dashed lines), the smooth v S models after the first-stage inversion (blue solid lines), and the final inverted v S models after the two-stage inversion (red solid lines). Note that the first- or second-stage inversion results are almost the same although the initial models are very different

We construct three different initial models with uniform v S (3.5, 4.0, and 4.5 km/s) for each layer and with the layer thickness of 4 km (black dashed lines in Fig. 7d). For each case, we perform the stepwise joint inversion, with jointly inverting phase velocity and ZH ratio in the first stage and subsequently incorporating RF data in the second stage. The same influence coefficients as those in the synthetic tests are adopted in the inversion, that is, p = q = 0.5 in Stage 1 and r = 0.8, p = 0.1, q = 0.1 in Stage 2. Figure 7d indicates three v S model (blue solid lines) after the first-stage inversion using surface wave data alone; it is very smooth and no reliable Moho interface can be distinguished in the model. After the second stage of inversion, all three obtained models converge to almost the same result (red solid lines in Fig. 7d). The most distinct feature of the final v S model is an apparent velocity jump (from 3.98 km/s to 4.51 km/s, about 0.5 km/s jump) across the Moho at the depth of 40 km, which is unresolved with surface wave data alone. The estimated Moho depth is very close to both results from previous joint inversion studies by Kang et al. (2015) (39.5 ± 1.58 km) and H − k stacking of RF by Tao et al. (2014) (38.6 ± 0.8 km) for the station NE53. As the longer period (> 40 s) surface wave data are included, structures of the uppermost mantle are also well constrained (Fig. 7d), which also correlates well with the results from Kang et al. (2015). From this example, we demonstrate the stepwise joint inversion method is feasible to real data. In addition, the inclusion of RF with surface wave data together in Stage 2 helps to resolve more reliable interface structures, and the absolute v S information retrieved from surface wave data mitigates the strong initial model dependency of RF data inversion alone.

5 Conclusion

In this paper, we propose a stepwise linearized joint inversion method using surface wave dispersion, Rayleigh wave ZH ratio, and RF data to resolve the fine 1D crustal v S structure. Our two-stage inversion strategy can take better advantages of each dataset due to their complementary sensitivities to the crust velocity structures. In the first stage, we invert surface wave dispersion and ZH ratio data to obtain a 1D smooth absolute v S model, and in the second stage, we incorporate receiver function data in the joint inversion to obtain a finer v S model with better constraints on interface structures. Synthetic tests and real data application have shown that the proposed stepwise joint inversion method is capable of resolving robust fine v S model in the crust, and has little initial model dependency. In the future, we are intending to resolve the 1D crustal v P/v S ratios from the joint inversion of these three or more datasets.