Methodology and Assessment of Proxy-Based Vs30 Estimation in Sichuan Province, China

Although time-averaged shear wave velocity to the depth of 30 m (Vs30) is an important indicator of earthquake site effects, it is difficult to obtain. Several proxies have been used either individually or in combination to infer Vs30 values during seismic hazard estimation under limited observational conditions. Sichuan Province is an area highly prone to earthquakes. Complex geological structures and lack of drilling sites mean that it is particularly important to establish a suitable approach for the estimation of Vs30 values for site classification. This study compared the application of three proxy-based approaches—geology-based, topographic slope-based, and terrain-based—to the estimation of Vs30 values in Sichuan Province. The results revealed that the residual between the measured logVs30 values and the estimations derived from the terrain-based approach was smallest, indicating best predictability. Stability analysis of the three approaches also showed that the terrain-based approach performed best. However, its performance in the plain area was poor, that is, the Vs30 values were mostly underestimated. This might indicate that the old strata, hard rock, and alluvial deposits formed by Quaternary glacier sediments were not identified appropriately in the plain area, highlighting the need for localized corrections.


Introduction
Local site conditions in seismic hazard-prone areas, such as geological structure, topography, and stratum lithology, can amplify or diminish ground motion during an earthquake, thus affecting the regional distribution of earthquake damage (Bo et al. 2003; Qi 2007). Compared with a hard foundation and simple site conditions, a soft foundation and complex site conditions amplify ground motion more strongly and can induce more serious earthquake damage (Richter 1958). Thus, site condition estimation can play an important role in the assessment of potential seismically induced losses, especially in rapid emergency response after earthquake disasters (Li and Zuo 1991; Ding et al. 2018). Time-averaged shear wave velocity to a specific depth (30 m as Vs30 or 20 m as Vs20), which is an important parameter for characterizing site conditions (Boore et al. 1993; BSSC 2009), is widely used to quantify site amplification effects in relation to earthquakes.
In many regions, shear wave velocity data are scarce, and various approaches have been proposed for the estimation of Vs30 based on different proxies. These proxies include surface geology, geotechnical descriptors, slope gradient, geomorphic terrain class, and hybrids of more than one proxy. Previous studies have found that the applicability of these approaches to the estimation of Vs30 depends on the local geologic and topographic environment. A comparative study of the geology-slope approach and the terrain-based approach in Greece showed that both approaches were equally applicable (σlnV = 0.396 and σlnV = 0.394, respectively) (Yong et al. 2012; Stewart et al. 2014). A hybrid geology-slope approach with additional proxies was found to perform better (lower values of R and σlnV) than the geology-based approach in the Pacific Northwest of North America (Ahdi et al. 2017). The terrain-based approach was found to be more suitable (σ = 128 m/s) than the geology-based approach in California, United States (Yong et al. 2012). Choosing an appropriate approach for the estimation of Vs30 in seismic hazard-prone areas with few boreholes is very important for strategic rapid earthquake emergency response.
Sichuan Province in southwestern China is one of the areas of the country most prone to earthquakes because of its complex geological structure, attributable to its location close to the Mediterranean-Himalayan seismic zone. The terrain of Sichuan Province is undulating but predominantly mountainous, and the stations available to determine Vs30 values are sparsely distributed. Kang et al. (2017) found that the macroscopic spatial distributions of the site classes obtained with the geology-based and topographic slope-based approaches are similar, and that the distributions of hard and soft sites from the two approaches corroborate each other. However, given the complex topography and site conditions and the high probability of earthquake hazard occurrence in Sichuan Province, a more accurate approach for the estimation of Vs30 is needed to determine seismic site conditions. With reference to measured Vs30 values from 160 stations, this study compared three proxy-based approaches and discussed the performance of each with respect to two objectives: (1) simulation of site amplification effects using estimated Vs30 values; and (2) selection of the most appropriate approach for site condition estimation in Sichuan Province.

Study Area
Sichuan Province is located on the southeastern edge of the Qinghai-Tibetan Plateau and lies within the Mediterranean-Himalayan seismic belt, making it one of the areas in China where earthquake hazards occur most often (Huang et al. 2018). About 80% of Sichuan Province's landmass is formed by old strata comprising hard Triassic metamorphic rocks in the northwestern parts and Permian sedimentary rocks in the southwestern parts and at the basin edge (Fig. 1a and b). Thick Quaternary sandy and clayey alluvial deposits cover the western side of the eastern plain, where the Cretaceous and Jurassic strata are distributed. Affected by uplifting associated with the Qinghai-Tibetan Plateau in the west, the topography of Sichuan Province shows marked variation. The relative elevation difference across the province is 6965 m, and the average slope of the province is approximately 16°. The geomorphology of Sichuan Province can be divided into three major units: a complete basin in the east, alpine mountains in the west, and medium-height mountains with wide valleys in the southwest (Diao 1991). Affected by the Quaternary glaciation, the plain area at the bottom of the basin is mostly covered by alluvial and diluvial sands and clay ranging from 15 to 35 m deep, and the western region is covered by glacial river and lake sediments (Zhao and Chen 2009).

Data and Method
The shear wave velocity data of 160 seismic stations in Sichuan Province were collected in this study. The reference value of Vs30 was calculated by simple continuation or velocity continuation. Methods that simulate Vs30 from geology, topographic slope, and terrain indices are widely used globally; these three methods were therefore selected to estimate Vs30 at the seismic stations.

Data
Measured Vs30 values were compiled from several sources. Borehole thickness and corresponding shear wave velocity data of 113 seismic stations were obtained from the Institute of Crustal Stress of China Earthquake Administration. Additional Vs30 values from another 25 seismic stations in Sichuan Province were obtained from the National Geospatial-Intelligence Agency (NGA) of the United States Department of Defense, and data of both averaged shear wave velocity and bedrock overburden thickness from 22 seismic stations were collected from the literature (Ye 2013; Jiang 2017). In total, shear wave velocity information was collected for 160 seismic stations in Sichuan Province.
In China, the thickness of soil cover with Vs20 and shear wave velocity > 500 m/s are used as the reference for bedrock classification; therefore, shear wave velocity measured at 135 of the 160 stations did not reach a depth of 30 m. Two calculation methods were used to infer Vs30 from the measured values. For the seismic stations where the thickness of the borehole and the corresponding shear wave velocity for each layer are known, we used the simple continuation method (113 seismic stations). For the 24 seismic stations where the deepest drilling depth and equivalent shear wave velocity were recorded, we used the extension approach (Yu 2015) to obtain Vs30 values. The digital elevation model used was the Shuttle Radar Topography Mission 30 (SRTM30), available from the website of the U.S. Geological Survey, from which a topographic slope model and a global automatic terrain classification map were generated. The geological composition of Sichuan Province was taken from the 1:500,000 scale digital geological map database of the People's Republic of China.
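The time-averaged velocity and the idea of continuing a shallow profile down to 30 m can be illustrated with a minimal sketch. It assumes that "simple continuation" means the velocity of the deepest measured layer is held constant down to the target depth; the text does not spell out the procedure, so this is an assumption, and the function name and the two-layer profile are hypothetical.

```python
def time_averaged_vs(thicknesses_m, velocities_ms, depth_m=30.0):
    """Time-averaged shear wave velocity from the surface to depth_m.

    Layers are listed from top to bottom. If the profile is shallower than
    depth_m, the velocity of the deepest layer is ASSUMED to continue
    unchanged down to depth_m (constant-velocity continuation).
    """
    if not thicknesses_m or len(thicknesses_m) != len(velocities_ms):
        raise ValueError("need one velocity per layer")
    remaining = depth_m
    travel_time = 0.0
    for h, v in zip(thicknesses_m, velocities_ms):
        used = min(h, remaining)      # clip the layer at the target depth
        travel_time += used / v       # vertical travel time in this layer
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:                 # profile ended above depth_m
        travel_time += remaining / velocities_ms[-1]
    return depth_m / travel_time


# Hypothetical 20 m borehole: 5 m at 200 m/s over 15 m at 400 m/s.
profile_h = [5.0, 15.0]
profile_v = [200.0, 400.0]
vs20 = time_averaged_vs(profile_h, profile_v, depth_m=20.0)
vs30 = time_averaged_vs(profile_h, profile_v, depth_m=30.0)
print(round(vs20, 1), round(vs30, 1))
```

Because the average is taken over travel time rather than thickness, slow near-surface layers weigh more heavily than their thickness alone would suggest.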

Method
The geology-based approach (GE-approach) adopted in this study was established by Wills et al. (2000), who related Vs30 to geological characteristics (including rock types and/or geological age) in California, United States. According to stratigraphic age and physical characteristics (such as particle size, hardness, and fracture distribution, among others) and based on the principle of analogy, the California region was divided into seven site types, which provides a basis for site classification in areas that lack borehole-based shear wave velocity data (Tables 1 and 2). The classification was subsequently refined using depositional environment and geographic criteria as additional constraints (Wills et al. 2000; Wills and Clahan 2006). Shi (2011) applied this method to establish a relational matrix between geological conditions and site categories, and completed the division of sites in China. Vilanova et al. (2018) developed a near-surface shear wave velocity database for Portugal and applied a three-step methodological approach to develop a Vs30 site-condition map using extrapolation.
Based on the 1:500,000 scale digital geological map of Sichuan Province and the classification criteria of the GE-approach, our study area was divided into four geologic units: beyond Cretaceous, upper Cretaceous to Oligocene, Pliocene to Pleistocene, and clay, which were assigned site types B, C, CD, and D, respectively.
The topographic slope-based approach (TS-approach) adopted in this study was introduced by Wald and Allen (2007), who found that Vs30 correlates well with slope, landform, and elevation. Based on SRTM30 elevation data, the relationship between slope and Vs30 was established for tectonically active areas (such as California, Salt Lake City, Taiwan) and for tectonically stable areas (such as Memphis, the continental United States) with rich borehole data, and was then applied to regions that lacked measured wave velocity (Table 3).
In this method, "slope" refers to gradient, that is, the ratio of vertical height to horizontal distance, which can be calculated from digital elevation model (DEM) data. In our study, the site types of Sichuan Province and the estimated Vs30 values of the seismic stations were obtained from a global Vs30 model that was built with arc-second spatial resolution. This method has been widely used in research on rapid seismic evaluation systems, such as the Prompt Assessment of Global Earthquakes for Response (PAGER) system developed by Wald and his colleagues (2006), site condition classification maps (Wald and Allen 2007), and earthquake early warning systems. Chen et al. (2010) established the relationship between Vs30 and topographic slope in China to quantify the site amplification factor of ground motion, and applied it to a self-developed Shake-Map system to obtain the ground motion parameter distribution on the surface soil layer.
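As a rough illustration of gradient as rise over run, the sketch below computes the slope at interior cells of a small elevation grid with central differences. It assumes a projected grid with square cells measured in metres; a DEM stored in geographic coordinates (arc-seconds) would first need its cell spacing converted to metres. The function name and the sample grid are hypothetical and are not taken from the study.

```python
import math


def slope_gradient(dem, cell_size_m):
    """Gradient (rise over run, m/m) at interior cells of a square-cell DEM.

    dem is a list of rows of elevations in metres. Border cells are left as
    None because central differences need a neighbour on both sides.
    """
    rows, cols = len(dem), len(dem[0])
    grad = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2.0 * cell_size_m)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2.0 * cell_size_m)
            grad[r][c] = math.hypot(dz_dx, dz_dy)
    return grad


# Hypothetical 3 x 3 grid: a uniform ramp rising 10 m per 100 m cell eastward.
dem = [
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
]
grid = slope_gradient(dem, cell_size_m=100.0)
center = grid[1][1]
print(center)
```

A gradient in m/m can be converted to degrees with `math.degrees(math.atan(center))` when a slope angle is needed.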
The range of Vs30 values corresponding to each site type under both the geology-based approach and the topographic slope-based approach follows the site specification given in Table 4.
The terrain-based classification (TE-approach) is an automatic procedure developed by Iwahashi and Pike (2007), which relies on a set of 16 geomorphic categories (TT1-16) based on slope gradient, local convexity, and surface texture. In this method, the judgment of "topography" depends on three important indicators: (1) Slope gradient, calculated from DEM elevation data, which indicates the steepness of a slope, is an important factor affecting the surface formation process, and is one of the most basic attributes of the resulting terrain; (2) Surface texture, that is, the roughness of the surface, which shows the characteristics of ridges and valleys; and (3) Local convexity, which is derived from the calculated surface curvature. Slope gradient and surface texture together can automatically classify steep terrain, but they are not enough to distinguish low-relief terrain units, such as alluvial fans and plain areas, so local convexity was introduced as the third variable to better identify this type of terrain. This approach has been applied to characterize Vs30 in California, United States, where each geomorphic type was assigned a mean Vs30 value (Yong et al. 2012) (Table 5). Irsyam et al. (2017) also developed a nationwide Vs30 map for Indonesia based on this approach, using 90-m grid digital elevation data and their correlation with Vs30.
Site type maps have been generated for the world at 1-km spatial resolution and at spatial resolutions of 270 m and 55 m for the Japanese archipelago and parts of Hokkaido, respectively.

Results
Site type maps of Sichuan Province with Vs30 simulated using the three approaches are shown in Fig. 2. Using numerical statistical methods (residuals, deviation rate, standard deviation, homogenization), we compared and assessed the applicability of the three approaches.

Site Classification
Based on the GE-approach, site types in Sichuan Province were classified as Classes B (Vs30 > 760 m/s), C (Vs30 = 360-760 m/s), CD (Vs30 = 270-555 m/s), and D (Vs30 = 180-360 m/s) (Fig. 2a). The proportion of Class B was the largest (approximately 91.38%). The proportions of Class C and CD were similar at 2.34% and 2.42%, respectively, and that of Class D was 3.86%. In the TS-approach, the site types were divided into Class B (Vs30 > 760 m/s), Class C (Vs30 = 360-760 m/s), and Class D (Vs30 = 180-360 m/s) (Fig. 2b). The proportion of Class B sites was the largest (52.61%), followed by Class C and D (37.82% and 9.57%, respectively). The 16 types of the TE-approach were found to be distributed throughout the province (Fig. 2c). Among them, TT1 was the largest (approximately 66.9%), followed by TT3, which accounted for approximately 10.7% of the total area, and the proportion of TT14 was the smallest (0.06%). Both the GE-approach and the TS-approach adopted the same Vs30 value range for Class B; nevertheless, the area of Class B using the former approach was larger, covering almost the entire province. Obvious differences in site type existed among the three approaches in two specific areas. One area was located at the basin edge extending eastward into the plain, which was classified roughly into two types (Class C and D) in the TS-approach and into four categories (Class B, C, CD, and D) in the GE-approach, but was divided in greater detail in the TE-approach. The second area was in the northern part of the alpine plateau (Ruoergai, Hongyuan, and Aba), where two site types (Class C and D) were identified by the TS-approach, three site types (Class C, CD, and D) were identified by the GE-approach, and three additional site types were identified by the TE-approach.
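The class bounds quoted above for the TS-approach can be written as a small lookup. The sketch below is only illustrative: how a value lying exactly on a boundary (360 or 760 m/s) is assigned is an assumption, since the text gives ranges but not the edge convention, and the function name is hypothetical. The overlapping CD class of the GE-approach (270-555 m/s) is deliberately left out because it cannot be separated from C and D by Vs30 alone.

```python
def site_class(vs30_ms):
    """Map a Vs30 value (m/s) to Class B, C, or D using the quoted ranges.

    Boundary handling is an ASSUMPTION: exactly 760 m/s goes to C and
    exactly 360 m/s goes to C. Values below 180 m/s fall outside the three
    classes listed in the text and return None.
    """
    if vs30_ms > 760.0:
        return "B"
    if vs30_ms >= 360.0:
        return "C"
    if vs30_ms >= 180.0:
        return "D"
    return None


# Hypothetical station values, plus the TT1 mean of 519 m/s quoted in the text.
for value in (800.0, 519.0, 250.0, 100.0):
    print(value, site_class(value))
```

Under these bounds, the TT1 mean of 519 m/s falls in Class C, whereas the same mountain terrain is Class B (> 760 m/s) in the other two approaches, which is one way to see why the three approaches give different estimates for the same terrain.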
Generally, the northwestern mountain area with old strata (hard rocks) and steep slopes was also the area with high convexity and fine texture. It was classified as Class B (Vs30 > 760 m/s) by both the GE-approach and the TS-approach, corresponding to TT1 (Vs30 = 519 m/s) in the TE-approach, and could be identified clearly using all three approaches (Fig. 3). In contrast, the site classifications of two other areas, one located on the basin edge extending eastward into the plain and the other in the northern parts of the alpine plateau, were significantly different, and both areas were characterized in more detail using the TE-approach.

Proxy Performance
With reference to the measured Vs30 values of the 160 stations in Sichuan Province, we used the natural logarithmic form of the residual and the deviation rate to estimate the predictability of the estimated Vs30 values. The residual and the deviation rate were calculated for each station i from the measured Vs30 and the estimated Vs30 at that station. Our results show that the estimated Vs30 values of the 160 stations using the three approaches were mostly higher than the measured Vs30 values. However, the residual between the measured logVs30 value and that estimated using the TE-approach was the smallest, that is, the predictability of this approach was the best among the three, despite serious underestimation at several sites. The predictability of the GE-approach was intermediate, that is, some sites in the southwestern mountains were underestimated compared with the other two approaches. The TS-approach exhibited the worst predictability (Fig. 4).
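The two per-station measures can be sketched as follows. The exact equations are not reproduced in the text, so both definitions here are assumptions: the residual is taken as ln(measured) minus ln(estimated), a common convention for Vs30 proxy studies, and the deviation rate is taken as the relative difference between estimate and measurement. The function names and station values are hypothetical.

```python
import math


def log_residual(measured_ms, estimated_ms):
    """ASSUMED form: R_i = ln(measured Vs30) - ln(estimated Vs30).

    With this sign convention a negative value means the proxy
    overestimates Vs30 and a positive value means it underestimates.
    """
    return math.log(measured_ms) - math.log(estimated_ms)


def deviation_rate(measured_ms, estimated_ms):
    """ASSUMED form: (estimated - measured) / measured, as a fraction."""
    return (estimated_ms - measured_ms) / measured_ms


# Hypothetical (measured, estimated) pairs in m/s, not data from the study.
stations = [(300.0, 450.0), (500.0, 400.0)]
residuals = [log_residual(m, e) for m, e in stations]
rates = [deviation_rate(m, e) for m, e in stations]
for (m, e), r_i, d_i in zip(stations, residuals, rates):
    label = "overestimated" if e > m else "underestimated"
    print(m, e, round(abs(r_i), 3), round(d_i, 2), label)
```

Working in logarithms makes the residual symmetric in relative terms: an estimate that is too high by a given factor and one that is too low by the same factor have the same absolute residual.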
For the 160 station sites, the Vs30 values at approximately 79% of the sites were overestimated (Abs(Ri) = 0.26-0.57) and 18% were underestimated (Abs(Ri) = 0.01-0.39) using the GE-approach. The Vs30 values at approximately 81% of the sites were overestimated (Abs(Ri) = 0.28-0.56) and 19% were underestimated (Abs(Ri) = 0.01-0.33) using the TS-approach. The Vs30 values at approximately 77% of the sites were overestimated (Abs(Ri) = 0.28-0.56) and 14% were underestimated (Abs(Ri) = 0.01-0.33) using the TE-approach. Using the GE-approach and the TS-approach, the maximum deviation
Among the three approaches, stations with overestimated Vs30 values were distributed mainly in the western and central parts of Sichuan Province, whereas stations with underestimated Vs30 values were distributed mostly in the western parts of the plain. Generally, in Sichuan Province, the large deviation of estimated Vs30 values in the high mountains and the plain highlighted the areas where the three approaches were found inadequate.

Discussion
With the TE-approach, the classification of site type in Sichuan Province was detailed and diverse. The residual between the measured logVs30 values and the estimated values was small, indicating low overall deviation. The classification of site types using either the GE-approach or the TS-approach was simple and approximate, and the overall deviation of the estimated Vs30 values was high for both methods, particularly the former. Furthermore, we calculated the smoothed mean and the standard deviation of the three approaches to test their stability in the estimation of Vs30 values. The methodological approach comprised two steps. (1) We obtained the estimated Vs30 values for the 160 stations using the three approaches. For this, we used the reported mean Vs30 values for each of the seven expanded NEHRP (National Earthquake Hazards Reduction Program) classes to obtain the GE-Vs30 values (Table 2) and we used the mean Vs30 values for each of the 16 terrain types to obtain the TE-Vs30 values. The TS-Vs30 values were obtained from a downloaded text file (Table 3).
(2) We used the band-plot function and a customized subroutine to calculate the standard deviation (σ) of each approach (Fig. 5).

Fig. 4 Residuals between estimated logVs30 values using the a geology-based approach, b topographic slope-based approach, and c terrain-based approach and the measured logVs30 values at the 160 stations in Sichuan Province. Stations in yellow indicate where the Vs30 value was overestimated; stations in red indicate where the Vs30 value was underestimated. Station size was magnified 30 times to illustrate Abs(Ri); the bigger the point, the larger the residual. "Swave" refers to shear wave. d Deviation rate of Vs30 values using the three approaches
It was found that the value of σ for the TS-approach, GE-approach, and TE-approach was approximately 208, 167, and 67 m/s, respectively, indicating that the TE-approach is most suitable for the estimation of Vs30 in Sichuan Province. The overall fluctuations were reasonably stable in both the GE-approach and the TS-approach. However, the TE-approach showed undulating fluctuation, which was severe for Vs30 values < 400 m/s but less notable in the range 400-520 m/s. The lower-velocity material is distributed on the lower slopes (Matsuoka et al. 2005; Chiou et al. 2006); therefore, we infer that the violent fluctuation of Vs30 values appears in the plain area under the TE-approach.
For deeper analysis of the situation regarding the plain and the performance of each of the three approaches, we divided Sichuan Province into four geological areas: Alpine plateau area; Mountain area; Basin edge; and Plain. The methodological approach adopted comprised two steps: (1) defining a preliminary set of geological areas based on slope; and (2) calculating the logVs30 distribution for each geological area. The mean (μ) and standard deviation (σ) were used to evaluate the overall estimation of the three approaches, and the normalized count distribution was used to assess the ability of each of the three approaches in each geological area (Fig. 6).
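The per-area summary described above amounts to grouping station residuals by geological area and computing their mean and standard deviation. The following is a minimal sketch under stated assumptions: the sample standard deviation is used (the text does not say whether the sample or population form was applied), and the function name and residual values are hypothetical placeholders, not results from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_area(records):
    """Return {area: (mean, sample standard deviation)} of log residuals.

    records is an iterable of (area_name, residual) pairs. An area with a
    single station gets a standard deviation of 0.0, because the sample
    form is undefined for one value.
    """
    groups = defaultdict(list)
    for area, residual in records:
        groups[area].append(residual)
    return {
        area: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for area, values in groups.items()
    }


# Hypothetical residuals for two of the four areas named in the text.
records = [
    ("Plain", 0.4),
    ("Plain", 0.2),
    ("Mountain area", -0.1),
    ("Mountain area", 0.1),
]
summary = summarize_by_area(records)
for area, (mu, sigma) in summary.items():
    print(area, round(mu, 3), round(sigma, 3))
```

A mean far from zero indicates systematic bias in an area, while a large standard deviation indicates scatter, so the two numbers answer different questions about a proxy.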
The normalized count distribution line for each geological area shown in Fig. 6 follows a normal distribution under the TE-approach, indicating that this approach performs best in the estimation of Vs30 values. This is also confirmed by it having the smallest overall deviation in the estimated values (μ = 0.11, σ = 0.07). The three approaches were reasonably unbiased for the Mountain area and the Basin edge, biased for the Alpine plateau area, and severely biased for the Plain area. The bias in the GE-approach might in part be related to the challenges associated with correlating the geological units of regions with different geological and lithological conditions and/or because some geological units are poorly sampled. In the TS-approach, the linear residual distribution of each geological area indicates poor correlation between slope and Vs30 values. The occurrence of this bias might be related to the accuracy of the digital elevation model data. The Vs30 deviation in the Basin edge area is larger than in the Mountain area. This might reflect the basin effect, which is a typical complex site effect and usually has a significant amplification effect on the ground motion (Guo and Zhou 2010).
Under the TE-approach, violent fluctuation in the Vs30 values was evident in the Plain area, and the bias in this area was larger in comparison with the other geological areas. This bias might reflect that the actual surface strata, rock properties, and sediment properties in this area are not identified appropriately. For example, macroscopically, the three stations in the Plain area with severe underestimation (Fig. 4c) are actually in an extension of the mountain range (low hills). Thus, the slope in such areas will not be very steep, but the surface strata are old and the rock is hard. In addition, due to the influence of Quaternary glaciers, most of the western plains are glacier-grooves, and the surface is covered with alluvial, alluvial gravel, and clay layers with thicknesses ranging from 15 m to 35 m. The sediments are therefore harder than those of ordinary plains (with high measured Vs30 values). The fact that the correlation between slope and Vs30 values is poor has been confirmed in relation to the TS-approach. However, under the TE-approach, slope gradient is key to the classification of site type; thus, the Vs30 values corresponding to the Plain area are low (low estimated Vs30 values). In this case, the Vs30 values are inevitably underestimated.

Fig. 6 Normalized count distribution for each of the three approaches in each geological area in Sichuan Province. The left side shows the residual distribution between measured logVs30 values and logVs30 values estimated using: a the geology-based approach; b the topographic slope-based approach; and c the terrain-based approach. The right side shows the normalized count distribution for logVs30 values sorted by the four geological areas. A red line shows the corresponding fitted normal distribution with mean (μ) and standard deviation (σ), while the black line corresponds to the fitted normal distribution of Vs30 values for Sichuan Province

Kang et al. (2017) applied the GE-approach and the TS-approach to site classification in Sichuan Province. The results showed that the macroscopic spatial distributions of the various sites obtained by the two approaches were similar and could be mutually confirmed, but that there was a certain difference in the proportions of the various site types. Comparison of the results of the characteristic area site classification approach with those of the two approaches showed that the site classification obtained by directly applying relationships from other regions to the eastern part of Sichuan Province is not accurate enough, and that the relationship needs to be corrected in combination with the measured drilling data in Sichuan. However, we found that the TE-approach is the most suitable of the three approaches to evaluate the site effect amplification in Sichuan Province, exhibiting the smallest estimation error of Vs30 values and the most detailed categorization. We also found that the ability of this approach to estimate the Vs30 value in the plain area is slightly insufficient and needs to be localized. Comparing the estimated and measured values of the 160 seismic stations under this approach, the average difference was roughly 97 m/s.

Conclusion
This study collected the shear wave velocity data of 160 seismic stations in Sichuan Province as the reference Vs30 values, and compared the applicability of three methods for estimating Vs30 values based on different proxy indicators. The results show that the Vs30 values in Sichuan Province estimated by the TE-approach had the smallest residual error and residual standard deviation (μ = 0.11, σ = 0.07), and its stability was also the best (σ = 67 m/s). The TE-approach should be prioritized over both the GE-approach and the TS-approach in site type classification in Sichuan Province because both geology and topographic slope proxies were embedded to capture the characteristics and material properties. The TE-approach was the most suitable method to estimate Vs30 in Sichuan Province, followed by the GE-approach and the TS-approach.
Due to the influence of Quaternary glaciers, slopes, and residues and deposits, the actual site conditions were complex, resulting in no significant correlation between the estimated Vs30 and the measured Vs30 under the three methods. It also caused the Vs30 of the western plateau mountains, the southwestern mountains, and the basin edge stations to be generally overestimated, and the Vs30 of the eastern plain stations to be mostly underestimated.
The Vs30 residual at the basin edge stations in the TS-approach showed a significant linear trend, indicating that the Vs30 characteristics of this type of geomorphic unit were not well characterized; and most of the underestimated Vs30 calculations were found in the eastern plain stations, revealing the need for improvement using this method. The poor performance of the TE-approach in the plain area highlights that localization correction would be required.