1 Introduction

Local site conditions in seismic hazard-prone areas, such as geological structure, topography, and stratum lithology, can amplify or diminish ground motion during an earthquake and thus affect the regional distribution of earthquake damage (Bo et al. 2003; Qi 2007). Compared with hard foundations and simple site conditions, soft foundations and complex site conditions have been shown to amplify ground motion more strongly and to induce more serious earthquake damage (Richter 1958). Site condition estimation therefore plays an important role in assessing potential seismically induced losses, especially in rapid emergency response after earthquake disasters (Li and Zuo 1991; Ding et al. 2018). The time-averaged shear wave velocity to a specific depth (30 m for Vs30 or 20 m for Vs20) is an important indicator of site condition (Boore et al. 1993; BSSC 2009) and is widely used to quantify site amplification effects in relation to earthquakes.

In many regions, shear wave velocity data are scarce and various approaches have been proposed for the estimation of Vs30 based on different proxies. These proxies include surface geology, geotechnical descriptors, slope gradient, geomorphic terrain class, and hybrids of more than one proxy. Previous studies have identified that the applicability of these approaches to the estimation of Vs30 is dependent on the local geologic and topographic environment. A comparative study of the geology-slope approach and the terrain-based approach in Greece showed that both approaches were equally applicable (σlnv = 0.396, σlnv = 0.394, respectively) (Yong et al. 2012; Stewart et al. 2014). A hybrid geology-slope approach with additional proxies was found to perform better (lower values of R and σlnv) than the geology-based approach in the Pacific Northwest of North America (Ahdi et al. 2017). The terrain-based approach was proven more suitable (σ = 128 m/s) than the geology-based approach in California, the United States (Yong et al. 2012). Choosing an appropriate approach for the estimation of Vs30 in seismic hazard-prone areas with few boreholes is very important for strategic rapid earthquake emergency response.

Sichuan Province in southwestern China is one of the areas of the country most prone to earthquakes because of its complex geological structure, which is attributable to its location close to the Mediterranean-Himalayan seismic zone. The terrain of Sichuan Province is undulating but predominantly mountainous, and the distribution of stations available to determine Vs30 values is sparse. Kang et al. (2017) found that the macroscopic spatial distributions of the site classes under the geology-based and topographic slope-based approaches are similar, and that the two approaches corroborate each other's distribution of hard and soft sites. However, given the complex topography and site conditions and the high probability of earthquake hazard occurrence in Sichuan Province, a more accurate approach for the estimation of Vs30 is needed to determine seismic site conditions. With reference to measured Vs30 values from 160 stations, this study compared three proxy-based approaches and discussed the performance of each with respect to two objectives: (1) simulation of site amplification effects using estimated Vs30 values; and (2) selection of the most appropriate approach for site condition estimation in Sichuan Province.

2 Study Area

Sichuan Province is located on the southeastern edge of the Qinghai-Tibetan Plateau and lies within the Mediterranean-Himalayan seismic belt, making it one of the areas in China with the highest earthquake hazard occurrences (Huang et al. 2018). About 80% of Sichuan Province’s landmass is formed by old strata comprising hard Triassic metamorphic rocks in the northwestern parts and Permian sedimentary rocks in the southwestern parts and at the basin edge (Fig. 1a and b). Quaternary thick sandy and clayey alluvial deposits cover the western side of the eastern plain where the Cretaceous and Jurassic strata are distributed. Affected by uplifting associated with the Qinghai-Tibetan Plateau in the west, the topography of Sichuan Province shows marked variation. The relative elevation difference across the province is 6965 m, and the average slope of the province is approximately 16°. The geomorphology of Sichuan Province can be divided into three major units: a complete basin in the east, alpine mountains in the west, and medium-height mountains with wide valleys in the southwest (Diao 1991). Affected by the Quaternary glaciation, the plain area at the bottom of the basin is mostly covered by alluvial and diluvial sands and clay ranging from 15 to 35 m deep, and the western region is covered by glacial river and lake sediments (Zhao and Chen 2009).

Fig. 1

Basic land surface and subterranean units of Sichuan Province: a geographical overview of Sichuan Province, China; and b geological composition in Sichuan Province based on the 1:500,000 scale geological map. G1: Holocene sandy and clayey alluvial deposits; G2: Neogene sedimentary rocks; G3: Paleogene sedimentary rocks; G4: Cretaceous-Paleogene sedimentary rocks; G5: Cretaceous sedimentary rocks; G6: Jurassic sedimentary rocks; G7: Triassic-Jurassic sedimentary rocks; G8: Triassic metamorphic rocks; G9: Beyond Permian

3 Data and Method

Shear wave velocity data from 160 seismic stations in Sichuan Province were collected in this study. The reference Vs30 value was calculated by simple continuation or velocity continuation. Because methods that estimate Vs30 from geology, topographic slope, and terrain indices are widely used globally, these three methods were selected to estimate Vs30 at the seismic stations.

3.1 Data

Measured Vs30 values were compiled from several sources. Layer thickness and corresponding shear wave velocity data from boreholes at 113 seismic stations were obtained from the Institute of Crustal Stress of China Earthquake Administration. Additional Vs30 values from another 25 seismic stations in Sichuan Province were obtained from the National Geospatial-Intelligence Agency (NGA; see Footnote 1) of the United States Department of Defense, and data on both averaged shear wave velocity and bedrock overburden thickness from 22 seismic stations were collected from the literature (Ye 2013; Jiang 2017). In total, shear wave velocity data were collected from 160 seismic stations in Sichuan Province.

In China, Vs20 and the thickness of soil cover above the layer with shear wave velocity > 500 m/s are used as the reference for bedrock classification; as a result, the shear wave velocity measured at 135 of the 160 stations did not reach a depth of 30 m. Two calculation methods were used to infer Vs30 from the measured values. For the seismic stations where the thickness and shear wave velocity of each borehole layer are known, we used the simple continuation method (113 seismic stations). For the 24 seismic stations where only the deepest drilling depth and the equivalent shear wave velocity were recorded, we used the extension approach (Yu 2015; see Footnote 2) to obtain Vs30 values.
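The time-averaged velocity itself is the target depth divided by the vertical shear wave travel time through the layers. The following is a minimal sketch of that calculation for a layered borehole profile. The function name is hypothetical, and the treatment of boreholes shallower than 30 m (holding the deepest measured velocity constant down to the target depth) is an illustrative assumption; it is not necessarily the simple continuation or extension rule used in this study.

```python
def time_averaged_vs(thicknesses_m, velocities_ms, depth_m=30.0):
    """Time-averaged shear wave velocity from the surface to depth_m.

    Layers are listed from the surface downward. If the profile is
    shallower than depth_m, the deepest measured velocity is assumed to
    continue unchanged to depth_m (illustrative assumption only).
    """
    if not thicknesses_m or len(thicknesses_m) != len(velocities_ms):
        raise ValueError("need at least one layer and one velocity per layer")

    remaining = depth_m      # depth still to be covered, in metres
    travel_time = 0.0        # accumulated vertical travel time, in seconds
    for thickness, velocity in zip(thicknesses_m, velocities_ms):
        if thickness <= 0 or velocity <= 0:
            raise ValueError("thickness and velocity must be positive")
        used = min(thickness, remaining)
        travel_time += used / velocity
        remaining -= used
        if remaining <= 0:
            break

    if remaining > 0:
        # Borehole ended above depth_m: extend the last layer downward.
        travel_time += remaining / velocities_ms[-1]

    return depth_m / travel_time
```

Calling the function with `depth_m=20.0` gives the corresponding Vs20 under the same assumptions.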

The digital elevation model used was the Shuttle Radar Topography Mission 30 (SRTM30; see Footnote 3) dataset, available from the website of the U.S. Geological Survey, from which a topographic slope model and a global automatic terrain classification map were generated. The geological composition of Sichuan Province was taken from the 1:500,000 digital geological map database of the People’s Republic of China.

3.2 Method

The geology-based approach (GE-approach) adopted in this study was established by Wills et al. (2000), who related Vs30 to geological characteristics (rock type and/or geological age) in California, United States. According to stratigraphic age and physical characteristics (such as particle size, hardness, and fracture distribution), and based on the principle of analogy, the California region was divided into seven site types, which provides a basis for site classification in areas that lack borehole-based shear wave velocity data (Tables 1 and 2). The approach was subsequently refined using depositional environment and geographic criteria as additional constraints (Wills et al. 2000; Wills and Clahan 2006). Shi (2011) applied this method to establish a relational matrix between geological conditions and site categories, and completed the division of sites in China. Vilanova et al. (2018) developed a near-surface shear wave velocity database for Portugal and applied a three-step methodological approach to develop a Vs30 site-condition map using extrapolation.

Table 1 Correspondence between stratigraphic units and site types in different geological ages.
Table 2 Summary of measured Vs30 in California, United States.

Based on the 1:500,000 scale digital geological map of Sichuan Province and the classification criteria of the GE-approach, our study area was divided into four geologic units: beyond Cretaceous, upper Cretaceous to Oligocene, Pliocene to Pleistocene, and clay, which were assigned site types B, C, CD, and D, respectively.
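The unit-to-site-type assignment described above amounts to a lookup. The sketch below encodes the four geologic units and their site types as stated in the text, together with the Vs30 ranges reported for those site types in Section 4.1. The dictionary and function names are hypothetical and serve only as an illustration.

```python
# Site type assigned to each geologic unit, as stated in the text.
SITE_TYPE_BY_UNIT = {
    "beyond Cretaceous": "B",
    "upper Cretaceous to Oligocene": "C",
    "Pliocene to Pleistocene": "CD",
    "clay": "D",
}

# Vs30 ranges in m/s reported for these site types in Section 4.1.
# None means that no upper bound is given.
VS30_RANGE_BY_SITE_TYPE = {
    "B": (760.0, None),
    "C": (360.0, 760.0),
    "CD": (270.0, 555.0),
    "D": (180.0, 360.0),
}


def site_type_for_unit(unit):
    """Return the site type for a geologic unit name."""
    return SITE_TYPE_BY_UNIT[unit]


def vs30_range_for_unit(unit):
    """Return the (lower, upper) Vs30 range in m/s for a geologic unit."""
    return VS30_RANGE_BY_SITE_TYPE[site_type_for_unit(unit)]
```

Note that the ranges for C, CD, and D overlap, so a range alone does not identify a site type; the geologic unit remains the key.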

The topographic slope-based approach (TS-approach) adopted in this study was introduced by Wald and Allen (2007), who found that Vs30 correlates well with slope, landform, and elevation. Based on SRTM30 elevation data, the relationship between slope and Vs30 was established separately for tectonically active areas (such as California, Salt Lake City, Taiwan) and for tectonically stable areas (such as Memphis, the continental United States) with rich borehole data, and was then applied to regions that lacked measured shear wave velocity (Table 3).

Table 3 Slope range and Vs30 value range correspondence table.

In this method, “slope” refers to gradient, that is, the ratio of vertical height to horizontal distance, which can be calculated from digital elevation model (DEM) data. In our study, the site types of Sichuan Province and the estimated Vs30 values of the seismic stations were obtained from a global Vs30 model (see Footnote 4) that was built with arc-second spatial resolution. This method has been widely used in rapid seismic evaluation systems, such as the Prompt Assessment of Global Earthquakes for Response (PAGER; see Footnote 5) system developed by Wald and his colleagues (2006), in site condition classification maps (Wald and Allen 2007), and in earthquake early warning systems. Chen et al. (2010) established the relationship between Vs30 and topographic slope in China to quantify the site amplification factor of ground motion, and applied it to their own Shake-Map system to obtain the ground motion parameter distribution on the surface soil layer.
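As an illustration of the gradient definition, the sketch below computes rise over run at an interior cell of a gridded DEM using central differences. The function names and the assumption of a square grid with a uniform cell size in metres are hypothetical; the global Vs30 model used in this study was downloaded, not recomputed in this way.

```python
import math


def slope_gradient(dem, row, col, cell_size_m):
    """Gradient (vertical height / horizontal distance, m/m) at an interior cell.

    dem is a list of rows of elevations in metres on a square grid with
    spacing cell_size_m (assumed uniform). Central differences are used.
    """
    if not (0 < row < len(dem) - 1 and 0 < col < len(dem[0]) - 1):
        raise ValueError("cell must lie in the grid interior")
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size_m)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size_m)
    return math.hypot(dz_dx, dz_dy)


def gradient_to_degrees(gradient):
    """Convert a gradient in m/m to a slope angle in degrees."""
    return math.degrees(math.atan(gradient))
```

A gradient of 1 m/m corresponds to a slope angle of 45°, which is why the two measures should not be mixed when a slope-to-Vs30 table is applied.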

The Vs30 range corresponding to each site type under both the geology-based approach and the topographic slope-based approach was defined according to the site classification standard of the National Earthquake Hazards Reduction Program (NEHRP) (FEMA 1995; Table 4).

Table 4 US National Earthquake Hazards Reduction Program (NEHRP) site classification standard.
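A minimal sketch of a Vs30-to-class lookup follows. It assumes the commonly cited NEHRP boundaries of 180, 360, 760, and 1500 m/s; the 180, 360, and 760 m/s boundaries match the class ranges quoted in Section 4.1, whereas the 1500 m/s boundary and the handling of values that fall exactly on a boundary are assumptions that should be checked against Table 4. The function name is hypothetical.

```python
def nehrp_class(vs30_ms):
    """Return the NEHRP site class letter for a Vs30 value in m/s.

    Assumed boundaries: A > 1500, B 760-1500, C 360-760, D 180-360, E < 180.
    """
    if vs30_ms <= 0:
        raise ValueError("Vs30 must be positive")
    if vs30_ms > 1500:
        return "A"
    if vs30_ms > 760:
        return "B"
    if vs30_ms > 360:
        return "C"
    if vs30_ms >= 180:
        return "D"
    return "E"
```

In this paper the hardest class is reported simply as Class B (Vs30 > 760 m/s), so classes A and B from this sketch would be merged when comparing against the maps in Fig. 2.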

The terrain-based classification (TE-approach) is an automatic procedure developed by Iwahashi and Pike (2007) that defines 16 geomorphic categories (TT1-16) based on slope gradient, surface texture, and local convexity. In this method, the judgment of “topography” depends on three indicators: (1) slope gradient, calculated from DEM elevation data, which indicates steepness, is an important factor affecting surface formation processes, and is one of the most basic attributes of the resulting terrain; (2) surface texture, that is, the roughness of the surface, which reflects the characteristics of ridges and valleys; and (3) local convexity, derived from the calculated surface curvature. Slope gradient and surface texture together can classify steep terrain automatically but are not sufficient to distinguish low-relief terrain units such as alluvial fans and plains, so local convexity was introduced as the third variable to better identify this type of terrain. This approach has been applied to characterize Vs30 in California, United States, where each geomorphic type was assigned a mean Vs30 value (Yong et al. 2012) (Table 5). Irsyam et al. (2017) also developed a nationwide Vs30 map for Indonesia based on this approach, using 90-m grid digital elevation data and their correlation with Vs30.

Table 5 16 types of terrain and corresponding mean Vs30 values.

Site type maps have been generated for the world at 1-km spatial resolution, and at spatial resolutions of 270 m and 55 m for the Japanese archipelago and parts of Hokkaido, respectively (see Footnote 6).

4 Results

Site type maps of Sichuan Province with Vs30 simulated using the three approaches are shown in Fig. 2. Using numerical statistical methods (residuals, deviation rate, standard deviation, homogenization), we compared and assessed the applicability of the three approaches.

Fig. 2

Site classification of Sichuan Province by: a the geology-based approach; b the topographic slope-based approach; and c the terrain-based approach. The ranges of Vs30 values of specific site types under the three approaches appear in d. Blue brackets indicate geology and topographic slope classes; black squares indicate terrain types (TT)

4.1 Site Classification

Based on the GE-approach, site types in Sichuan Province were classified as Classes B (Vs30 > 760 m/s), C (Vs30 = 360–760 m/s), CD (Vs30 = 270–555 m/s), and D (Vs30 = 180–360 m/s) (Fig. 2a). It was found that the proportion of Class B was the largest (approximately 91.38%). The proportions of Class C and CD were similar at 2.34% and 2.42%, respectively, and that of Class D was 3.86%. In the TS-approach, the site types were divided into Class B (Vs30 > 760 m/s), Class C (Vs30 = 360–760 m/s), and Class D (Vs30 = 180–360 m/s) (Fig. 2b). The proportion of Class B sites was the largest (52.61%), followed by Class C and D (37.82% and 9.57%, respectively). The 16 types of the TE-approach were found to be distributed throughout the province (Fig. 2c). Among them, TT1 was largest (approximately 66.9%), followed by TT3, which accounted for approximately 10.7% of the total area, and the proportion of TT14 was the smallest (0.06%).

Both the GE-approach and the TS-approach adopted the same Vs30 value range for Class B; nevertheless, the area of Class B under the former approach was larger, covering almost the entire province. Obvious differences in site type existed among the three approaches in two specific areas. One area was located at the basin edge extending eastward into the plain; it was classified roughly into two types (Class C and D) by the TS-approach and into four categories (Class B, C, CD, and D) by the GE-approach, but was divided in greater detail by the TE-approach. The second area was in the northern part of the alpine plateau (Ruoergai, Hongyuan, and Aba), where two site types (Class C and D) were identified by the TS-approach, three site types (Class C, CD, and D) were identified by the GE-approach, and three additional site types were identified by the TE-approach.

Generally, the northwestern mountain area with old strata (hard rocks) and steep slopes was also the area with high convexity and fine texture. It was classified as Class B (Vs30 > 760 m/s) by both the GE-approach and the TS-approach and as TT1 (Vs30 = 519 m/s) by the TE-approach, and it could be identified clearly using all three approaches (Fig. 3). In contrast, the site classifications of two other areas, one located on the basin edge extending eastward into the plain and the other in the northern part of the alpine plateau, differed significantly among the approaches, and both areas were characterized in more detail using the TE-approach.

Fig. 3

Site classification area ratio of: a the geology-based approach and topographic slope-based approach; and b the terrain-based approach

4.2 Proxy Performance

With reference to the measured Vs30 values of the 160 stations in Sichuan Province, we used the natural logarithmic form of the residual and the deviation rate to estimate the predictability of the estimated Vs30 values. The residual and the deviation rate were calculated using the following equations:

$$R_{i} = \log (V_{s30i} ) - \overline{{\log (V_{s30i} )}}$$
(1)
$$D_{i} = \frac{\log (V_{s30i} ) - \overline{{\log (V_{s30i} )}} }{\log (V_{s30i} )}$$
(2)

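where \(V_{s30i}\) is the measured Vs30 at station i, and the overbar marks the estimated value, so that \(\overline{{\log (V_{s30i} )}}\) is the logarithm of the estimated Vs30 at the same station.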
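Equations (1) and (2) can be evaluated per station as in the following minimal sketch. It assumes that the logarithm is the natural logarithm, as stated in the text, and the function names are hypothetical. With this sign convention a negative residual means that the estimate exceeds the measurement (overestimation), and a positive residual means underestimation.

```python
import math


def log_residual(measured_vs30, estimated_vs30):
    """Eq. (1): log of measured Vs30 minus log of estimated Vs30."""
    return math.log(measured_vs30) - math.log(estimated_vs30)


def deviation_rate(measured_vs30, estimated_vs30):
    """Eq. (2): residual divided by the log of the measured Vs30.

    Undefined when measured_vs30 equals 1 m/s, because log(1) = 0;
    that case does not arise for realistic Vs30 values.
    """
    return log_residual(measured_vs30, estimated_vs30) / math.log(measured_vs30)
```

For example, a station with a measured Vs30 of 300 m/s that is assigned 519 m/s has a negative residual and is therefore counted as overestimated.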

Our results show that the Vs30 values estimated for the 160 stations using the three approaches were mostly higher than the measured values. The residual between the measured and estimated logVs30 values was smallest for the TE-approach; that is, its predictability was the best among the three, despite serious underestimation at several sites. The predictability of the GE-approach was intermediate, with some sites in the southwestern mountains underestimated relative to the other two approaches. The TS-approach exhibited the worst predictability (Fig. 4).

Fig. 4

Residuals between estimated logVs30 values using the a geology-based approach, b topographic slope-based approach, and c terrain-based approach and the measured logVs30 values at the 160 stations in Sichuan Province. Stations in yellow indicate where the Vs30 value was overestimated; stations in red indicate where the Vs30 value was underestimated. Station size was magnified 30 times to illustrate Abs (Ri); the bigger the point, the larger the residual. “S-wave” refers to shear wave. d Deviation rate of Vs30 values using the three approaches

For the 160 station sites, the Vs30 values at approximately 79% of the sites were overestimated (Abs (Ri) = 0.26–0.57) and 18% were underestimated (Abs (Ri) = 0.01–0.39) using the GE-approach. The Vs30 values at approximately 81% of the sites were overestimated (Abs (Ri) = 0.28–0.56) and 19% were underestimated (Abs (Ri) = 0.01–0.33) using the TS-approach. The Vs30 values at approximately 77% of the sites were overestimated (Abs (Ri) = 0.28–0.56) and 14% were underestimated (Abs (Ri) = 0.01–0.33) using the TE-approach. Using the GE-approach and the TS-approach, the maximum deviation rate of estimated Vs30 values was > 250%, whereas with the TE-approach, it was 150–200% (Fig. 4d).

For all three approaches, stations with overestimated Vs30 values were distributed mainly in the western and central parts of Sichuan Province, and stations with underestimated Vs30 values were distributed mostly in the western parts of the plain. Generally, the large deviations of estimated Vs30 values in the high mountains and the plain highlight the areas of Sichuan Province where the three approaches were found inadequate.

5 Discussion

With the TE-approach, the classification of site type in Sichuan Province was detailed and diverse. The residual between the measured logVs30 values and the estimated values was small, indicating low overall deviation. The classification of site types using either the GE-approach or the TS-approach was simple and approximate, and the overall deviation of the estimated Vs30 values was high for both methods, particularly the former. Furthermore, we calculated the smoothed mean and the standard deviation of the three approaches to test their stability in the estimation of Vs30 values. The procedure comprised two steps. (1) We obtained the estimated Vs30 values for the 160 stations using the three approaches. For this, we used the reported mean Vs30 values for each of the seven expanded NEHRP classes to obtain the GE-Vs30 values (Table 2), and the mean Vs30 values for each of the 16 terrain types to obtain the TE-Vs30 values. The TS-Vs30 values were obtained from a downloaded text file (Table 3). (2) We used the band-plot function and a customized subroutine (see Footnote 7) to calculate the standard deviation (σ) of each approach (Fig. 5).
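The band-plot function and the customized subroutine are not reproduced here. As a stand-in, the sketch below summarizes estimated against measured Vs30 with a simpler technique: stations are sorted by measured Vs30, split into equal-count bins, and the mean and sample standard deviation of the estimates are computed per bin. The function name, the bin size, and the binning scheme itself are assumptions made for illustration.

```python
import statistics


def binned_mean_and_sd(measured, estimated, bin_size=5):
    """Summarize estimated Vs30 in equal-count bins of measured Vs30.

    Returns a list of tuples (mean measured, mean estimated, sd estimated),
    one per bin, ordered by increasing measured Vs30. A bin with a single
    station is given a standard deviation of 0.0.
    """
    if len(measured) != len(estimated):
        raise ValueError("measured and estimated must have the same length")
    pairs = sorted(zip(measured, estimated))
    summary = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        xs = [m for m, _ in chunk]
        ys = [e for _, e in chunk]
        sd = statistics.stdev(ys) if len(ys) > 1 else 0.0
        summary.append((statistics.fmean(xs), statistics.fmean(ys), sd))
    return summary
```

Plotting the per-bin mean with bands of one and two standard deviations gives a picture similar in spirit to Fig. 5, although the values will differ from those produced by the smoothing actually used.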

Fig. 5

Smoothed mean and standard deviation of estimated Vs30 values using: a the geology-based approach; b the topographic slope-based approach; and c the terrain-based approach and measured Vs30 values for the 160 stations in Sichuan Province. Red lines indicate smoothed mean, blue lines indicate one standard deviation, and pink lines indicate two standard deviations

It was found that the value of σ for the TS-approach, GE-approach, and TE-approach was approximately 208, 167, and 67 m/s, respectively, indicating that the TE-approach is most suitable for the estimation of Vs30 in Sichuan Province. The overall fluctuations were reasonably stable in both the GE-approach and the TS-approach. However, the TE-approach showed undulating fluctuation, which was severe for Vs30 values < 400 m/s but less notable in the range 400–520 m/s. The lower-velocity material is distributed on the lower slopes (Matsuoka et al. 2005; Chiou et al. 2006); therefore, we infer that the violent fluctuation of Vs30 values appears in the plain area under the TE-approach.

For deeper analysis of the situation regarding the plain and the performance of each of the three approaches, we divided Sichuan Province into four geological areas: Alpine plateau area; Mountain area; Basin edge; and Plain. The methodological approach adopted comprised two steps: (1) defining a preliminary set of geological areas based on slope; and (2) calculating the logVs30 distribution for each geological area. The mean (µ) and standard deviation (σ) were used to evaluate the overall estimation of the three approaches, and the normalized count distribution was used to assess the ability of each of the three approaches in each geological area (Fig. 6).
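The per-area statistics in step (2) can be sketched as follows. The record layout (area name, measured Vs30, estimated Vs30), the function name, and the use of the natural logarithm and of the population standard deviation are assumptions for illustration; the assignment of stations to the four areas is taken as given.

```python
import math
import statistics
from collections import defaultdict


def residual_stats_by_area(records):
    """Mean and standard deviation of log residuals for each area.

    records is an iterable of (area, measured_vs30, estimated_vs30).
    Returns {area: (mu, sigma)}, where the residual is
    log(measured) - log(estimated) and sigma is the population value.
    """
    grouped = defaultdict(list)
    for area, measured, estimated in records:
        grouped[area].append(math.log(measured) - math.log(estimated))
    return {
        area: (statistics.fmean(residuals), statistics.pstdev(residuals))
        for area, residuals in grouped.items()
    }
```

A mean close to zero indicates an unbiased approach for that area, and a small standard deviation indicates consistent estimates across its stations.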

Fig. 6

Normalized count distribution for each of the three approaches in each geological area in Sichuan Province. The left side shows the distribution of residuals between the measured logVs30 values and the logVs30 values estimated using: a the geology-based approach; b the topographic slope-based approach; and c the terrain-based approach. The right side shows the normalized count distribution of logVs30 values, sorted by the four geological areas. The red line shows the corresponding fitted normal distribution with mean (µ) and standard deviation (σ), while the black line corresponds to the fitted normal distribution of Vs30 values for Sichuan Province

The normalized count distribution line for each geological area shown in Fig. 6 is distributed normally under the TE-approach, indicating this approach performs best in the estimation of Vs30 values. This is also confirmed by it having the smallest overall deviation in the estimated values (µ = 0.11, σ = 0.07). The three approaches were reasonably unbiased for the Mountain area and the Basin edge, biased for the Alpine plateau area, and severely biased for the Plain area. The bias in the GE-approach might in part be related to the challenges associated with correlating the geological units of regions with different geological and lithological conditions and/or because some geological units are poorly sampled. In the TS-approach, the linear residual distribution of each geological area indicates poor correlation between slope and Vs30 values. The occurrence of this bias might be related to the accuracy of the digital elevation model data. The Vs30 deviation in the Basin edge area is larger than in the Mountain area. This might reflect the basin effect that is a typical complex site effect and usually has a significant amplification effect on the ground motion (Guo and Zhou 2010).

Under the TE-approach, violent fluctuation in the Vs30 values was evident in the Plain area, and the bias in this area was larger than in the other geological areas. This bias might indicate that the actual surface strata, rock properties, and sediment properties in this area are not identified appropriately. For example, the three stations in the Plain area with severe underestimation (Fig. 4c) are actually located in an extension of the mountain range (low hills), where the slope is not very steep but the surface strata are old and the rock is hard. In addition, because of Quaternary glaciation, most of the western plains are glacial grooves whose surface is covered with alluvial deposits, alluvial gravel, and clay layers 15 m to 35 m thick. Consequently, the sediments are harder than those of ordinary plains and the measured Vs30 values are high. The poor correlation between slope and Vs30 values has already been confirmed for the TS-approach. Under the TE-approach, however, slope gradient is key to the classification of site type, so the estimated Vs30 values assigned to the Plain area are low. In this case, the Vs30 values are inevitably underestimated.

Kang et al. (2017) applied the GE-approach and the TS-approach to site classification in Sichuan Province. Their results showed that the macroscopic spatial distributions of the site types obtained by the two approaches were similar and mutually confirmed, although the proportions of the site types differed somewhat. Comparison of the characteristic-area site classification with the two approaches showed that a site classification obtained by directly adopting relationships from other regions is not accurate enough in the eastern part of Sichuan Province, and that the relationships need to be corrected using measured drilling data from Sichuan. In this study, we found that the TE-approach is the most suitable of the three approaches for evaluating site amplification effects in Sichuan Province, exhibiting the smallest estimation error in Vs30 values and the most detailed categorization. We also found that this approach estimates Vs30 less well in the plain area and needs to be localized. Comparing the estimated and measured values at the 160 seismic stations under this approach, the average difference was roughly 97 m/s.

6 Conclusion

This study collected shear wave velocity data from 160 seismic stations in Sichuan Province as reference Vs30 values, and compared the applicability of three methods that estimate Vs30 from different proxies. The results show that the Vs30 values estimated by the TE-approach had the smallest residual mean and residual standard deviation (μ = 0.11, σ = 0.07), and that its stability was also the best (σ = 67 m/s). The TE-approach should be prioritized over both the GE-approach and the TS-approach for site type classification in Sichuan Province because it embeds both geology and topographic slope proxies to capture site characteristics and material properties. The TE-approach was the most suitable method to estimate Vs30 in Sichuan Province, followed by the GE-approach and the TS-approach.

Due to the influence of Quaternary glaciers, slopes, and residual and depositional materials, the actual site conditions were complex, resulting in no significant correlation between the estimated Vs30 and the measured Vs30 under the three methods. This also caused the Vs30 at stations in the western plateau mountains, the southwestern mountains, and the basin edge to be generally overestimated, and the Vs30 at the eastern plain stations to be mostly underestimated.

The Vs30 residual at the basin edge stations in the TS-approach showed a significant linear trend, indicating that the Vs30 characteristics of this type of geomorphic unit were not well characterized; and most of the underestimated Vs30 calculations were found in the eastern plain stations, revealing the need for improvement using this method. The poor performance of the TE-approach in the plain area highlights that localization correction would be required.