1 Introduction

Uncertainty in site response during an earthquake is an important contributor to uncertainty in seismic hazard. Uncertainty in site response is due to a number of factors, including limitations in the modeling approach used (typically assuming a horizontally layered medium and equivalent linear behavior) and uncertainty in the soil properties, in particular uncertainty in the shear wave velocity Vs. Uncertainty in Vs comprises the vertical and lateral variability of Vs, as well as uncertainty associated with limitations in the available data. Section 2 of this paper describes the different kinds of uncertainty and the physical processes that contribute to them.

In order to quantify the effect of uncertainty in Vs on uncertainty in site response, it is necessary to simulate multiple synthetic Vs profiles. Then, site response calculations are performed for each profile, and the results are analyzed to obtain estimates of the central tendency and uncertainty in the site response. A number of inputs are required for simulating the synthetic profiles. The most important of these inputs is the variability of Vs as a function of depth, which is the main topic of this paper and is typically quantified by its logarithmic standard deviation (\({\sigma }_{lnV}\)) or by the coefficient of variation (COV; the two quantities are nearly equivalent in practice). This quantity will depend in large measure on the amount of data available for the site, which may range from a gross characterization based on geology or topography, to a site-specific application at a site where multiple velocity profiles have been measured.

Another important input is the vertical correlation structure of Vs. It is important for the simulated profiles to be realistic in their degree of vertical roughness. An excessively rough profile will generate excessive wave scattering, in addition to high concentrations of nonlinearity in very soft layers, and will predict unreasonably low site response. On the other hand, an excessively smooth profile will over-predict the response. It is worth noting, though, that very rough or very smooth profiles do occur in nature in particular geologic environments. An extreme example of this occurs in regions of basaltic volcanism, where there may be many basalt layers separated by sedimentary interbeds with low Vs. Because of this profound effect of the geological environment, it is not possible to provide general guidance on what constitutes a realistic profile.

Following the discussion on the different kinds of uncertainty, this paper summarizes the Vs simulation model developed by the author (EPRI 1993; Toro 1995, 1997, 2005) for this purpose. Although this model is relatively old and does not benefit from the much more extensive soil-profile databases that have become available in recent years, it is useful to describe it because it has been used extensively in engineering practice and because many of its elements form the bases of more recent models. There is also at least one earlier model in the literature (Vanmarcke, 1977), but this model has not been used by the earthquake engineering community. This paper also summarizes the model recently developed by Shi and Asimaki (2018), which is based on a large California data set.

Most of the data used in the development of the above models was obtained using so-called invasive methods such as down-hole, cross-hole, cone penetrometer, or suspension logger. In recent years, non-invasive surface-wave methods have gained widespread use because of their much lower cost and much greater convenience (see Foti et al. 2018 for guidance regarding the use of these methods). In addition to cost considerations, there are many sites where boreholes cannot be drilled for environmental or other reasons.

The profile randomization method developed by the author has been re-formulated by Passeri (2019) using surface wave data. One very interesting feature of this model is that it is applied to travel time, rather than velocity, as a function of depth. The formulation in terms of travel times provides considerable advantages, which are discussed in this paper.

Surface-wave methods bring their own challenges to the problem of characterizing uncertainty in Vs. In particular, improper parameterization of the problem may lead to non-uniqueness of the solution, so that very different alternative profiles may satisfy the data constraints (which are in the form of wave-dispersion data) equally well. This paper provides a brief review of the uncertainty inherent in these methods and provides some recommendations based on blind studies performed at multiple sites. Because this is a rapidly evolving field, this discussion of surface wave methods and associated uncertainty in Vs is by no means exhaustive.

This paper concludes by providing some recommendations, which are developed on the basis of the material presented here, practical experience using profile simulation methods, and discussions with other practitioners and researchers.

2 Epistemic and aleatory uncertainty

In probabilistic seismic hazard analysis (PSHA) and in other natural-hazard studies, it has become common to distinguish two kinds of uncertainty as follows:

Epistemic uncertainty represents lack of knowledge about the physical process, and it can be reduced by the collection of additional data. The word “episteme” comes from the ancient Greek and means to know, to understand.

Aleatory uncertainty represents intrinsic randomness of the process, and it cannot be reduced by the collection of additional data. The word “Alea” means dice in Latin. Aleatory uncertainty is frequently called “aleatory variability.” This author prefers the term aleatory uncertainty, which is the term originally used by the SSHAC panel (Budnitz et al., 1997), to highlight that the differences between the two kinds of uncertainty are very subtle, as will be discussed below, and that the total uncertainty is the combined effect of the aleatory and epistemic uncertainties.

This partition between epistemic and aleatory uncertainties is very much model dependent. In this regard, Veneziano (2003) points out the following:

Uncertainty that is explicitly recognized by a stochastic model is aleatory. Uncertainty on the model itself and its parameters is epistemic. Hence the aleatory/epistemic split of the total uncertainty is model-dependent.

A useful illustration of this model dependence is provided by the history of ground motion prediction equations (GMPEs). The earliest regression-derived GMPEs had very few parameters, and as a result, the scatter of the data about the median predictions (commonly represented by the logarithmic standard deviation \({\sigma }_{\epsilon }\) and treated as aleatory uncertainty) was high. Although it was not explicitly included in these early PSHAs, there was also epistemic uncertainty in the form of statistical uncertainty in the derived regression coefficients. As more parameters were added to represent site effects, style of faulting, path effects, etc. (what are now called non-ergodic or partially non-ergodic effects), \({\sigma }_{\epsilon }\) has decreased (although not always as much as expected), while the epistemic uncertainties in the regression coefficients and in the applicable site, style, and path terms for a particular site and region have increased (they will increase unless more data are used to constrain the additional parameters).

It is sometimes assumed that epistemic uncertainty is always represented as discrete branches in a logic tree, whereas aleatory uncertainty is represented as continuous quantities and integrated over in the PSHA calculations. This is far from correct. Many epistemic uncertainties can be thought of as continuous quantities (e.g., the magnitude-scaling coefficient in a GMPE, or an uncertain Gutenberg-Richter b value), and their discretization for the purposes of constructing a logic tree is purely a matter of convenience.

As discussed by McGuire et al. (2005), classical decision theory indicates that decisions related to seismic design should be based on the mean probability of the undesirable outcome (e.g., mean seismic failure probability of a critical structure). They also point out that there are strong precedents in current design and building-code practice for using mean risk for setting design criteria (e.g., the ASCE 7 building code and the ASCE 43 guidelines for nuclear structures). One interesting property of the mean hazard and mean risk is that they are insensitive to how the total uncertainty is partitioned between epistemic and aleatory, as long as the total uncertainty is preserved. This property is very useful, given the difficulty in distinguishing epistemic and aleatory uncertainty in practice.

Despite the insensitivity of mean hazard and risk to the epistemic/aleatory distinction, this distinction has at least two useful purposes, as follows:

  • Cataloging the two kinds of uncertainty, and the different phenomena that contribute to each kind, facilitates the process of characterizing the total uncertainty in a proper manner. Unless the probability of the undesirable outcomes is a linear function of the uncertain quantities (which is rarely the case), it is necessary to estimate the total uncertainty (not only the aleatory uncertainty) in order to obtain a correct estimate of the hazard or risk.

  • Knowing which epistemic uncertainties can be reduced (and by how much) with the benefit of additional data helps in the process of deciding which additional investigation should be performed in a particular project. Thus, the epistemic/aleatory partition helps focus this effort.

3 Epistemic and aleatory uncertainty in site characterization and site response

The distinction between epistemic and aleatory uncertainty is particularly difficult for Vs and other geotechnical properties relevant to earthquake site response, in the sense that one can always collect more data about a particular site in order to improve its characterization. Because a site’s properties are fixed in time (except for temporal variations in the top few meters, as discussed below), it is difficult to think of truly aleatory uncertainties when thinking about that site’s properties.

Focusing on the uncertainty in site response (which is the ultimate objective), significant aleatory uncertainty will remain about the actual response of the site during a future earthquake (of unknown magnitude, focal mechanism, depth, and orientation relative to the site), as well as uncertainty arising from limitations in the modeling. At present, this aleatory uncertainty can only be elucidated with repeated recordings at each site and will not be considered further in this paper.

A somewhat arbitrary but reasonable distinction between epistemic and aleatory uncertainty in Vs has become generally accepted in nuclear earthquake engineering practice, and it is proposed in Appendix B of EPRI (2013; the so-called SPID document). In the SPID, variations within the footprint of a structure (approximately 100 to 200 m) are considered aleatory and are characterized using profile randomization, while lack of knowledge about the parameters to characterize that variation (in particular, the base-case profile) is considered epistemic and is characterized using three alternative base-case profiles with associated weights (more details on this are provided later).

One motivation for treating the footprint variability as aleatory is that the ground motions at the site are affected by the lateral variation in Vs within the footprint because the standard engineering assumption of vertically propagating shear waves does not fully apply for most sites. In some sense, the site response “samples” a multitude of profiles around the instrument or structure being considered, resulting in surface motions that may be viewed as a rough average of the response of the many profiles within that footprint. In addition to introducing uncertainty in the amplification factors, the randomization also tends to broaden and lower the peaks of the soil response, thereby compensating for the perceived tendency of one-dimensional site response to over-predict the peaks. This averaging effect is clearly illustrated by Hallal and Cox (2021), who use actual recordings and site data from Treasure Island in San Francisco Bay.

In general, the epistemic uncertainty in Vs at a site depends greatly on whether site-specific geotechnical and geophysical information is available, which is not always the case, and on the quantity and quality of that information. For example, the site profile information for many seismograph sites in the central and eastern USA consists only of an estimate of Vs30, which has been obtained using so-called proxy methods that rely on regional geology, topography, and/or site description (e.g., Wills and Clahan 2006; Chiou et al. 2008; Allen and Wald 2009; Parker et al. 2017). If one wants to generate representative profiles for a poorly characterized site such as one of these seismograph sites, one needs to incorporate a much larger epistemic uncertainty than for a well-characterized site for which one or more invasive or non-invasive Vs measurements are available.

Temporal variations in the dynamic properties of the top few meters of soil are known to occur and are due to meteorological factors such as precipitation, thawing, and freezing. Also, soil sites exposed to strong shaking may take a long time to return to their initial state. These variations, which are discussed by Zhu et al. (2021), can affect the high frequency response of the site and are clearly aleatory in nature.

4 The Toro randomization model

4.1 Model formulation

This model was developed by the author and several collaborators (in particular Dr. Walt Silva) and has evolved over multiple iterations (EPRI 1993; Toro 1995, 1997, 2005). It was originally developed for generic applications, where there is very limited information about the site, such as the Geomatrix (Chiou et al., 2008) or NEHRP (Dobry et al., 2000) site classifications. The available profiles were segregated into broad categories and a separate model was developed for each category, using as inputs the interpreted interval-velocity profiles obtained in different regions of the USA with different geologic settings. It is useful to describe this model here because it has been used extensively in practice, even though some may consider it obsolete, and because some of its elements are still used in more recent models.

The three basic elements of the model are shown in Fig. 1 and are described below. The first element (part b) is the randomization of layer thicknesses. Thicknesses are simulated as a non-homogeneous Poisson or renewal process as a function of depth. Unlike the standard Poisson process (used, for instance, to represent the occurrence of earthquakes in time), the intensity function or rate is a function of depth, with a higher value (i.e., thinner layers) near the surface. The functional form is a modified power law, i.e.,

Fig. 1

Elements of the Toro profile randomization model. Note: deviations from the median model are calculated in logarithmic scale but they are shown in arithmetic scale for the sake of clarity

$$\lambda \left(z\right)={c}_{3}{\left[z+{c}_{1}\right]}^{-{c}_{2}}$$
(1)

where the three coefficients are determined from all the data. Toro (1995) obtained c1 = 10.86 m, c2 = 0.89, and c3 = 1.98 m\(^{-0.11}\) (these values produce \(\lambda (z)\) in units of m\(^{-1}\) when z is in units of meters). \(\lambda \left(z\right)\) may be interpreted as the reciprocal of the mean layer thickness at depth z. Figure 2a shows a typical intensity function as a function of depth z.

Fig. 2

Elements of the layer-thickness model. a Intensity function (see Eq. 1); b distribution of normalized thickness (from Toro 2005)

The properties of homogeneous Poisson processes (i.e., Poisson processes with constant rate) are well understood and familiar to most readers. For this reason and for other reasons that will become clear soon, it is useful to convert the non-homogeneous Poisson process of layer thickness defined by Eq. 1 into an equivalent homogeneous process with unit rate. This is achieved by mapping the depth z into a “normalized” depth \(\zeta (z)\), which is defined as

$$\zeta\left(z\right)=\int_0^{z}\lambda\left(u\right)du$$
(2)

One of the important properties of a homogeneous Poisson process is that the interval between occurrences of the process (inter-arrival times if the process is parameterized in terms of time) follows an exponential distribution (see Parzen 1999).

For the non-homogeneous Poisson model of layer thickness, one can use Eq. 2 to convert the depths of layer boundaries in each observed profile from \(z\) to \(\zeta (z)\), and one can calculate the normalized layer thickness for the ith layer, as follows:

$$\tau_i=\int_{z_i}^{z_{i+1}}\lambda\left(z\right)dz$$
(3)

where \({z}_{1}, {z}_{2}, \dots\) are the depths to the layer boundaries in an observed profile and \({\tau }_{1}={\zeta }_{2}-{\zeta }_{1}, {\tau }_{2}={\zeta }_{3}-{\zeta }_{2}\), etc., are the corresponding normalized layer thicknesses. The quantity \({\tau }_{i}\) has a mean value of 1 and follows an exponential distribution if layer thickness is a non-homogeneous Poisson process.

The exponential distribution has its mode (i.e., most likely value) at zero, which implies a tendency of the model to produce many unrealistically thin layers. Figure 2b shows a typical distribution of normalized thickness for actual profiles (from Toro 2005), and it clearly shows that the observed distribution of normalized thicknesses does not have a mode at zero. To make the thickness model more realistic, the nonhomogeneous Poisson process was replaced by a nonhomogeneous renewal process. In this model, the lognormally distributed normalized layer thickness has a mean value of 1 and a standard deviation of 0.5 to 0.7 (Toro 1997, 2005, respectively). Unlike the exponential distribution, the lognormal distribution has a low probability density for short values of the normalized thickness, thereby reducing the likelihood of very thin layers.

The simulation of layers can proceed as follows: (1) simulate \({\tau }_{0}\) (the normalized thickness of the topmost layer, which is also equal to \({\zeta }_{0}\) because this is the first layer) from the selected distribution of normalized thickness (i.e., exponential or lognormal); (2) convert \({\zeta }_{0}\) to the actual depth \({z}_{0}\) by inverting Eq. 2; (3) simulate \({\tau }_{1}\) from the selected distribution of normalized thickness; (4) convert \({\zeta }_{1}={\zeta }_{0}+{\tau }_{1}\) to the actual depth \({z}_{1}\) by inverting Eq. 2; repeat steps 3 and 4 until the depth to bedrock is reached. Depth to bedrock is generally treated as an additional external input, but it has also been treated as uncertain with a uniform (e.g., EPRI 1993), normal, or lognormal (e.g., Kottke et al. 2013) distribution. For rock sites with gradual weathering profiles and no distinct impedance contrasts at bedrock, a termination criterion in terms of the layer’s randomized Vs may be more appropriate. This latter criterion has been used by the author in some applications.
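To make the procedure concrete, the following sketch simulates one set of layer boundaries using the closed-form integral of Eq. 1 and its inverse, together with a lognormal normalized thickness of mean 1. It is a minimal illustration, not the author's code: the generic Toro (1995) coefficients are assumed, the standard deviation of the normalized thickness is taken as 0.5, and all function and variable names are illustrative.

```python
import numpy as np

# Minimal sketch of the layer-thickness simulation (Eqs. 1-3), assuming the
# generic Toro (1995) coefficients; all names are illustrative.
C1, C2, C3 = 10.86, 0.89, 1.98      # lambda(z) = C3 * (z + C1)**(-C2), z in m

def zeta(z):
    """Normalized depth, i.e., the integral of lambda from 0 to z (Eq. 2)."""
    a = 1.0 - C2
    return C3 / a * ((z + C1)**a - C1**a)

def zeta_inverse(zt):
    """Physical depth z corresponding to a normalized depth zt."""
    a = 1.0 - C2
    return (a * zt / C3 + C1**a)**(1.0 / a) - C1

def simulate_boundaries(depth_to_bedrock, tau_sigma=0.5, rng=None):
    """Simulate layer-boundary depths using lognormal normalized thicknesses
    with mean 1 and standard deviation tau_sigma (the renewal model)."""
    rng = np.random.default_rng() if rng is None else rng
    s2 = np.log(1.0 + tau_sigma**2)           # lognormal parameters for mean 1
    mu = -0.5 * s2
    boundaries, zt = [0.0], 0.0
    while boundaries[-1] < depth_to_bedrock:
        zt += rng.lognormal(mu, np.sqrt(s2))  # add a normalized thickness
        boundaries.append(zeta_inverse(zt))   # map back to physical depth
    boundaries[-1] = depth_to_bedrock         # truncate the last layer at bedrock
    return np.array(boundaries)

# Example: one realization of the stratigraphy for a 150-m-deep soil column
print(simulate_boundaries(150.0, rng=np.random.default_rng(1)))
```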

The second element of the model (part c in Fig. 1) is the base case or median profile, which represents the central tendency of all profiles in the category or site under consideration. In generic applications, the number of profiles is large, and layer boundaries are not concentrated at specific depths (i.e., many diverse sites), so that this median profile is generally smooth. In site-specific applications, this is often not the case, as will be discussed later.

The third element (part d in Fig. 1) is the Vs randomization, which works with the deviation (in logarithmic scale, but shown in arithmetic scale in Fig. 1 for the sake of clarity) between each layer’s Vs and the median Vs at the layer mid-point. This deviation, after normalization by the standard deviation at the layer mid-point, is modeled as

$${Z}_{i}=\frac{ln{V}_{i}-ln{V}_{\mathrm{median}}\left({z}_{i}\right)}{{\sigma }_{lnV}\left({z}_{i}\right)}$$
(4)

where the subscript S has been removed from \({V}_{S}\) for the sake of brevity. Earlier versions of this model (EPRI 1993; Toro 1995) treated \({\sigma }_{lnV}\) as independent of depth. Recent work by the author and others indicates that \({\sigma }_{lnV}\) varies significantly with depth.

The simulation proceeds by generating \({Z}_{1}\) as a standard normal random number and calculating the corresponding \({V}_{1}\) using Eq. 4. Subsequent values are calculated using a simple auto-regressive model of the form

$${Z}_{i}=\rho {Z}_{i-1}+\sqrt{1-{\rho }^{2}}\epsilon$$
(5)

where \(\epsilon\) is a standard normal random number, and the correlation coefficient \(\rho\) depends on both depth z and layer thickness t (both in meters), as follows:

$$\rho \left(z,t\right)=\left(1-{\rho }_{d}\left(z\right)\right){\rho }_{t}\left(t\right)+{\rho }_{d}\left(z\right)$$
(6)
$${\rho }_{d}\left(z\right)=\left\{\begin{array}{cc}{\rho }_{200}{\left[\frac{z+{h}_{0}}{200+{h}_{0}}\right]}^{b}& z\le 200\ \mathrm{m}\\ {\rho }_{200}& z>200\ \mathrm{m}\end{array}\right.$$
(7)
$${\rho }_{t}\left(t\right)=\mathrm{exp}\left[-\frac{t}{\Delta }\right]$$
(8)

The median Vs profile, the standard deviation \({\sigma }_{lnV}\), and the parameters of Eqs. 7 and 8 were calculated jointly from the data in each category using the method of maximum likelihood.

The method of maximum likelihood is implemented as follows. For each profile, one constructs the likelihood function of the data, which is the joint distribution of the observed Vs values at all layers. This joint distribution can be written as \({f}_{{V}_{1}}\left({v}_{1}\right)\times {f}_{{V}_{2}|{V}_{1}}\left({v}_{2};{v}_{1}\right)\times {f}_{{V}_{3}|{V}_{2}}\left({v}_{3};{v}_{2}\right)\times \dots\) using Eqs. 4 and 5 (together with the assumption that the Vs values at the layer mid-points are lognormal), where the \({v}_{i}\) are the observed Vs values at the layer midpoints. The total likelihood is the product of the likelihoods for all profiles in the category under consideration. Strictly speaking, this “Markovian” form of the joint distribution is applicable only to the case where \(\rho\) in Eq. 5 is constant. The free parameters of the total likelihood are \({V}_{\mathrm{median}}(z)\) (which is typically parameterized as a sequence of linear segments), \({\sigma }_{lnV}(z)\) (which may be parameterized as constant or as a sequence of linear segments), and all the constants that control \(\rho \left(z,t\right)\) (see Eqs. 7 and 8). The maximum likelihood method (Benjamin and Cornell 1970) determines the free parameters as those values that maximize the likelihood function given the data.
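As a concrete illustration (not the author's code), the sketch below evaluates the log-likelihood contribution of a single profile under the Markovian factorization and Eqs. 4 through 8. The callable interface for the median and standard-deviation profiles, the parameter names, and the use of scipy are assumptions made here for illustration; as noted above, the factorization is strict only when \(\rho\) is constant.

```python
import numpy as np
from scipy.stats import norm

def rho(z, t, rho200, h0, b, delta):
    """Inter-layer correlation of Eqs. 6-8 (depth z and thickness t in meters)."""
    rho_d = rho200 * ((z + h0) / (200.0 + h0))**b if z <= 200.0 else rho200
    rho_t = np.exp(-t / delta)
    return (1.0 - rho_d) * rho_t + rho_d

def profile_loglik(vs, z, t, ln_v_median, sigma_lnv, rho200, h0, b, delta):
    """Log-likelihood of one profile's mid-layer velocities vs, at mid-layer
    depths z and with layer thicknesses t; ln_v_median and sigma_lnv are
    vectorized callables of depth."""
    zz = (np.log(vs) - ln_v_median(z)) / sigma_lnv(z)        # Eq. 4
    # first layer: standard-normal density plus the lognormal Jacobian term
    ll = norm.logpdf(zz[0]) - np.log(vs[0] * sigma_lnv(z[0]))
    for i in range(1, len(vs)):
        r = rho(z[i], t[i], rho200, h0, b, delta)
        # conditional density of Z_i given Z_{i-1} (Eq. 5), plus the Jacobian
        ll += norm.logpdf(zz[i], loc=r * zz[i - 1], scale=np.sqrt(1.0 - r**2))
        ll -= np.log(vs[i] * sigma_lnv(z[i]))
    return ll
```

The total log-likelihood is the sum of this quantity over all profiles in the category, and the free parameters (the piecewise-linear median and \({\sigma }_{lnV}\) profiles and the correlation constants) can then be found with a general-purpose numerical optimizer.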

After the parameters have been determined, Eqs. 4 and 5 can be used to simulate Vs for the mid-points of all layers in the profile, as identified in the layer-thickness randomization described earlier. None of the stochastic models developed by Toro consider bedrock Vs and its uncertainty, but other researchers have proposed such models (e.g., Passeri 2019; Talukder et al. 2021).
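A corresponding sketch of the simulation step, reusing the rho() helper from the likelihood sketch above, is given below; again, the callable interface and all names are illustrative rather than part of the original formulation.

```python
import numpy as np

def simulate_vs(z, t, ln_v_median, sigma_lnv, rho200, h0, b, delta, rng=None):
    """Simulate mid-layer Vs values (Eqs. 4 and 5) for layers with mid-point
    depths z and thicknesses t; ln_v_median and sigma_lnv are callables of depth."""
    rng = np.random.default_rng() if rng is None else rng
    zz = np.empty(len(z))
    zz[0] = rng.standard_normal()                      # Z_1 ~ N(0, 1)
    for i in range(1, len(z)):
        r = rho(z[i], t[i], rho200, h0, b, delta)      # Eqs. 6-8 (see above)
        zz[i] = r * zz[i - 1] + np.sqrt(1.0 - r**2) * rng.standard_normal()
    return np.exp(ln_v_median(z) + sigma_lnv(z) * zz)  # invert Eq. 4
```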

Table 1 shows the parameter values obtained in Toro (1995) for various data sets associated with different Geomatrix (Chiou et al., 2008) and NEHRP (Dobry et al., 2000) Vs30-based site categories. For the Geomatrix classification, categories were always grouped into pairs (e.g., Geomatrix A + B) because the number of profiles in each individual category was too small. For the NEHRP classification, results were obtained for grouped and individual categories because the data are somewhat more abundant. Toro (1995) also tabulated the median Vs profile for each category. The associated standard deviations are high compared to those in other methods to be presented later because these categories are broad and encompass a range of site conditions and anticipated site response. In terms of the SPID epistemic/aleatory partition of Sect. 3, the \({\sigma }_{lnV}(z)\) values in Table 1 contain both epistemic and aleatory uncertainty.

Table 1 Parameters of generic shear-wave velocity randomization model. Modified from Toro 1995
Table 2 Values of \({\sigma }_{ln{V}_{s}}\) from site-specific studies (n indicates the number of sites in each group of profiles)

Table 1 quantifies uncertainty as the logarithmic standard deviation \({\sigma }_{lnV}\), which is the natural quantity for representing uncertainty in a lognormal distribution. For typical values of \({\sigma }_{lnV}\) (say, 0.35 or less), there is not much difference between \({\sigma }_{lnV}\) and the coefficient of variation (or COV, which is defined as standard deviation/mean). Similarly, there is not much difference between the bodies and upper tails (but not the lower tails) of the normal and lognormal distributions for \({\sigma }_{lnV}\) or COV values in this range. This paper will use \({\sigma }_{lnV}\) or COV (whichever quantity the paper being discussed uses), but the reader should think of them as nearly equivalent for the purposes of quantifying uncertainty in Vs.
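For reference, the exact relation between the two quantities for a lognormal variable is

$$\mathrm{COV}=\sqrt{\mathrm{exp}\left({\sigma }_{lnV}^{2}\right)-1}\approx {\sigma }_{lnV}\hspace{1em}\mathrm{for\ small\ }{\sigma }_{lnV}$$

so that, for example, \({\sigma }_{lnV}=0.35\) corresponds to a COV of approximately 0.36.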

In the author’s implementation of the randomization for this model, the normalized Gaussian deviation \({Z}_{i}\) is restricted to the \(\left|{Z}_{i}\right|<2\) range (see Toro 1995). In order to preserve the calculated standard deviations, the nominal \({\sigma }_{lnV}(z)\) is multiplied by 1.16 prior to the calculations. If any layer violates this constraint, the profile is discarded, and a new profile is simulated, beginning with the top layer.
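A sketch of this truncation scheme, building on the simulate_vs() function above (again with illustrative names), is as follows:

```python
import numpy as np

def simulate_vs_truncated(z, t, ln_v_median, sigma_lnv,
                          rho200, h0, b, delta, rng=None):
    """Re-simulate until all normalized deviations satisfy |Z_i| < 2, using the
    nominal sigma_lnV inflated by 1.16, as described in the text."""
    rng = np.random.default_rng() if rng is None else rng
    inflated = lambda depth: 1.16 * sigma_lnv(depth)
    while True:
        vs = simulate_vs(z, t, ln_v_median, inflated,
                         rho200, h0, b, delta, rng=rng)
        zz = (np.log(vs) - ln_v_median(z)) / inflated(z)   # Eq. 4 deviations
        if np.all(np.abs(zz) < 2.0):
            return vs
```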

4.2 Generic application

As indicated earlier, the above model was developed for the representation of uncertainty in Vs when there is very little information about the site, such as the situation where all that is known about the site is its Geomatrix or NEHRP class. The model can be used, for instance, to calculate amplification factors and associated uncertainties for use in building codes (which use these broad categories) using the appropriate median profile and the parameters in Table 1.

4.3 Site-specific applications

Most applications of the Toro randomization model have been for site-specific situations, where some profiles are available for the site footprint (sometimes 10–20 in the case of new nuclear plants), but the number of profiles is not sufficient for the site-specific development of a full model (i.e., a thickness model, a median or base-case profile, \({\sigma }_{lnV}\) vs. depth, and parameters for the correlation model).

Although the author has proposed a formal Bayesian methodology to combine a generic model with site-specific data to calculate the median profile and \({\sigma }_{lnV}\) vs. depth (EPRI 1993; Toro 1997), this approach has not been used in practice. Instead, more ad hoc approaches have been employed. Realistically, the Bayesian approach may end up adding little additional value if the number of site-specific profiles is 10 to 20, and they have a lower \({\sigma }_{lnV}\) than the applicable generic model.

The site-specific median or base-case profile is generally calculated as the median of the ln[Vs] from the measured site-specific profiles (typically 10 to 20 profiles). Sometimes, the statistics are calculated as a function of depth; sometimes they are calculated for each soil unit.

In most instances, the site-specific median or base-case profile is not smooth if most measured profiles have significant Vs contrasts over a narrow range of depths. If this is the case, the generic layer-thickness model (Eq. 1) is not applicable. If one uses a median profile with strong velocity contrasts, together with the generic thickness model, the finite layer thickness will act as a smoothing operator, the median of the simulated profiles will be smooth (i.e., not matching the desired median profile), and the observed \({\sigma }_{lnV}\) will be inflated.

Several approaches have been used in practice to simulate a realistic stratigraphy for site-specific profiles with Vs contrasts. The simplest is not to randomize thickness, using the important Vs contrasts in the median profile to define layer boundaries. A second approach is to treat each distinct layer in the median profile as having a certain probability distribution of thickness. A third approach is to develop a “lumpy” \(\lambda (z)\) that forces more layer boundaries to occur in the depth ranges where the median or base-case profile is changing rapidly, but also allows for some layer boundaries to occur elsewhere (see Sect. 4.3 in Toro 1995).

The most difficult quantity to estimate for site-specific analyses is \({\sigma }_{lnV}\) and its dependence on depth. In most situations, this quantity is smaller than the generic values in Table 1 and is calculated from the statistics of the profiles measured at the site. Table 2 shows the range of site-specific \({\sigma }_{lnV}\) from a number of nuclear projects performed by the author as well as from other studies (Toro 1995, 1997, 2005). This table shows a broad range of values, in large part because the geographic dimensions of these clusters are different. Also, some sites are more complex and variable than others (note that the values for the California highway bridges are much higher than other values). In some of the nuclear projects, the observed \({\sigma }_{lnV}\) values have been increased beyond the observed scatter to avoid sharp peaks in the site amplification factors and to allow for the effect of possible measurement errors.

When calculating \({\sigma }_{lnV}\) as a function of depth from site-specific Vs data, one often sees a higher \({\sigma }_{lnV}\) at locations near an uncertain layer boundary with a high Vs contrast because that boundary does not occur exactly at the same depth in all measured profiles. This effect is well illustrated by Passeri et al. (2020). Their Fig. 2 (parts a through c) shows that the distribution of Vs may be very broad and bimodal in these situations. This problem does not arise in the fitting of the Toro model (and in the values in Tables 1 and 2) because the maximum-likelihood fitting of the Toro model is done in terms of the velocities at the layer mid-points. These higher \({\sigma }_{lnV}\) values near layer boundaries should be ignored when determining a representative \({\sigma }_{lnV}\) from site-specific measurements for use in the Toro randomization model.

There are very few results for the inter-layer correlation for site-specific applications. Generally, one expects the inter-layer correlation (Eqs. 6 through 8) to be lower in site-specific than in generic applications because the site-to-site differences within a broad site category are expected to have a strong vertical correlation (i.e., a site that has higher than average Vs at one depth is likely to have consistently higher Vs at other depths). Once that site-to-site variation is removed by going from a generic to a site-specific analysis, that source of vertical correlation is eliminated. This reasoning is confirmed by comparing the generic correlation parameters in Table 1 to the site-specific parameters obtained by Toro (1997) and Toro (2005), which contain results for the Savannah River site in South Carolina and for a cluster within that site (see Table 3). The use of high correlation coefficients in site-specific applications may explain the large variations in site period, amplification functions, and dispersion relations observed by Griffiths et al. (2016a, 2016b) and others when investigating profiles that they simulated using the Toro approach.

Table 3 Parameters of inter-layer correlation model from three site-specific applications

5 Other guidance on uncertainty for site-specific applications (SPID and Stewart et al. 2014)

The SPID (Screening, Prioritization and Implementation Details) document was prepared by the US nuclear industry as a standard approach for performing seismic re-evaluations of existing plants, in response to a request by the US Nuclear Regulatory Commission (NRC) issued after the 2011 Fukushima-Daiichi accident. The NRC endorsed the approach.

Appendix B of the SPID provides guidance on how to develop the site response, including the quantification of uncertainty, in cases where limited site response data exist. For many sites, the geotechnical and geophysical data available for the post-Fukushima evaluation were those obtained to develop the original license applications decades ago. Although Appendix B of the SPID contains guidance on many parameters of the site profile, this section will describe only the treatment of uncertainty in Vs. Although the SPID guidance was intended for the post-Fukushima re-evaluation of existing plants, it has been used in other situations and is discussed in many publications related to uncertainty in Vs (e.g., Griffiths et al. 2016a, b).

As indicated earlier, the SPID defines aleatory uncertainty as lateral variations within a footprint of approximately 100 to 200 m. The SPID (Section B-4.1) recommends \({\sigma }_{lnV}\) values of 0.25 at the surface, decreasing to 0.15 at 15 m and deeper. These values are roughly consistent with the Toro Savannah River data summarized in Table 2, on which they are based. They are also roughly consistent with the values shown by Passeri (2019) in his Fig. 6.24 (see Sect. 7 for more details on the data used by Passeri).
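For concreteness, this aleatory profile can be encoded as a simple function of depth; a linear transition between the surface and 15 m is assumed in the sketch below, since the exact shape of the decrease is not restated here.

```python
import numpy as np

# Minimal sketch of the SPID aleatory sigma_lnV(z): 0.25 at the surface and
# 0.15 at 15 m and deeper; the linear transition in between is an assumption.
def sigma_lnv_spid(z):
    return np.interp(z, [0.0, 15.0], [0.25, 0.15])
```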

Stewart et al. (2014) recommend \({\sigma }_{lnV}\) values of 0.15 for 0–50 m and 0.22 at greater depths, also on the basis of the site-specific data analyzed by Toro (1995). Note that the two recommendations follow opposite trends at greater depths, even though they are supposed to be based on the same data. This difference is likely a consequence of Stewart et al. (2014) basing their estimates on the Toro (1995) graphs of \({\sigma }_{lnV}\) vs. depth from profile clusters, while the SPID uses the tabulated parameters (which are based on mid-layer velocities). As indicated earlier, the \({\sigma }_{lnV}\) values based on mid-layer velocities are the values that should be used in the Toro randomization.

Regarding epistemic uncertainty, the SPID recommends that epistemic uncertainty in Vs be represented by considering three alternative values of the median profile for the randomizations, as follows: the base-case profile (with a weight of 0.4) and upper-range and lower-range profiles (with weights of 0.3 each), where the base-case profile is scaled by a factor of \(\mathrm{exp}[\pm 1.28{\sigma }_{ln{V}_{epistemic}}]\). The value of 1.28 and the weights are chosen to preserve the mean and standard deviation of the underlying continuous epistemic distribution of \(ln{V}_{s}\) (see Keefer and Bodily 1983). This shifting by a constant factor implies perfect depth-wise correlation, which may lead to unrealistic profiles (e.g., profiles with large differences in travel time, as pointed out by Griffiths et al. 2016a, b and others). The epistemic uncertainty \({\sigma }_{ln{V}_{epistemic}}\) is generally different from the aleatory \({\sigma }_{lnV}\) discussed earlier (although they are related, as will be discussed later). The total uncertainty in Vs is characterized by treating these three median profiles as branches in a logic tree and then randomizing about each median profile to quantify aleatory uncertainty.
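As a quick check of this three-point discretization, the mean and variance of the discrete shift in \(ln{V}_{s}\) (writing \(\sigma ={\sigma }_{ln{V}_{epistemic}}\)) are

$$E=0.3\left(-1.28\sigma \right)+0.4\left(0\right)+0.3\left(+1.28\sigma \right)=0,\hspace{1em}\mathrm{Var}=2\times 0.3\times {\left(1.28\sigma \right)}^{2}\approx 0.98{\sigma }^{2}$$

so the mean is preserved exactly and the variance to within about 2 percent.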

The SPID provides limited guidance about the values of \({\sigma }_{ln{V}_{epistemic}}\) to use. Section B-3.2 of the SPID rightly indicates that the level of uncertainty should depend on the amount of information available, and on how well that information is correlated with shear-wave velocity (if indirect or proxy information is used). The same section recommends \({\sigma }_{ln{V}_{epistemic}}=0.35\) for sites with very limited site-specific Vs data obtained using geophysical methods and \({\sigma }_{ln{V}_{epistemic}}=0.50\) for sites where Vs estimates are limited and are inferred from geological observations or other indirect information such as blow counts. This dependence of \({\sigma }_{ln{V}_{epistemic}}\) on the amount and quality of information available implies that \({\sigma }_{ln{V}_{epistemic}}\) should vary with depth in some situations (contrary to the way the SPID is often interpreted), for example where direct Vs measurements are available for the shallow portion of the profile while Vs for the deeper portion is estimated on the basis of the geology.

6 The Shi and Asimaki randomization model

This model (Shi and Asimaki 2018) used data from 914 measured Vs profiles in California (mostly in the Los Angeles and San Francisco areas). These profiles were measured using a variety of invasive (e.g., suspension logging, cross-hole, and down-hole) and noninvasive (i.e., based on the inversion of surface-wave dispersion curves) techniques.

These profile data are divided into 26 overlapping Vs30 bins (so each profile is present in more than one bin), with mid-point Vs30 values ranging from approximately 170 to 990 m/s. The goal was that each bin should contain approximately 40 profiles for the sake of statistical stability.

Following Toro (1995), the variation of Vs with depth is assumed to follow a lognormal distribution. The mean profile is fit parametrically using a three-parameter modified power law with constant Vs in the top 2.5 m, with parameters that depend on Vs30. The authors also develop a model for the standard deviation of Vs as a function of depth and Vs30, as well as a power-law model for the mean layer thickness (note that the mean layer thickness is the reciprocal of the Toro (1995) intensity function given in Eq. 1).

This model is comparable to the Toro model in the sense that it applies to categories of sites. In this case, each category is a Vs30 bin indexed by its mid-point Vs30, but the maximum and minimum Vs30 in each bin are not specified in the paper. As in the Toro model, the standard deviations reflect deviations from the median profile for that Vs30 bin.

It is interesting to compare the Vs uncertainty implied by this model to that of the generic Toro model, because both models are generic. To this effect, the mean and standard deviation of Vs were calculated (using Eqs. 3, 4, and 9 of Shi and Asimaki 2018) and used to calculate \({\sigma }_{lnV}\). \({V}_{s30}\) values of 1000, 500, and 250 m/s were considered, which correspond approximately to the logarithmic mid-point \({V}_{s30}\) values for NEHRP site classes B, C, and D. Results are shown in Fig. 3. The observed values are comparable to those in Table 1, except for the much higher values in the top 10 m for site classes B and C (which may be a consequence of differences in the functional forms used by Shi and Asimaki for the mean and for the standard deviation). To a degree, the value of this comparison is limited because the width of the bins employed by Shi and Asimaki is not specified in their paper.

Fig. 3

Uncertainty in Vs calculated using the Shi and Asimaki (2018) model for three NEHRP categories

7 The Passeri randomization model

This model is designed for site-specific applications and is documented in Passeri (2019) and Passeri et al. (2020). The model was developed using a large database of surface-wave geophysical measurements at 91 sites (the Polito Surface Wave Database or PSWD). Only sites where the dispersion curves did not exhibit the effect of higher modes were used (71 sites). Instead of fitting interval velocities as a function of depth, as the Toro and Shi-Asimaki models do, this model fits the travel time as a function of depth:

$$t{t}_{S,z}\left(z\right)=\sum \nolimits_{0<{z}_{i}\le z}\frac{{H}_{i}}{{V}_{S,i}}$$
(9)

that is, the sum of the one-way travel times in all full or partial layers (\({H}_{i})\) above depth z. This quantity has the advantage of being very closely related to the dynamic response of the site and to engineering measures such as Vs30. It also has the advantage of being a continuous quantity, thereby avoiding the higher Vs COV near uncertain layer boundaries. This travel time can be converted to a time-averaged velocity by dividing each depth \(z\) by the associated travel time, i.e.,

$${V}_{S,z}\left(z\right)=\frac{z}{t{t}_{S,z}\left(z\right)}=\frac{\sum_{0<{z}_{i}\le z}{H}_{i}}{\sum_{0<{z}_{i}\le z}\frac{{H}_{i}}{{V}_{S,i}}}$$
(10)

Because \(z\) in the numerator of Eq. 10 can be seen as a constant, the statistical properties of \(t{t}_{s,z}\) and \({V}_{s,z}\) are closely related (in particular, they have the same COV). The Passeri thickness model uses Eq. 1 but fits the parameters for each site-specific profile and uses an acceptance-rejection technique in the simulations to avoid layer thicknesses that are inconsistent with the layer-thickness resolution of surface-wave methods.
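A short sketch of Eqs. 9 and 10 may help fix ideas; the layer thicknesses, velocities, and function names below are illustrative values chosen for this example.

```python
import numpy as np

def travel_time(h, vs, z):
    """One-way travel time tt_{S,z}(z) of Eq. 9; h and vs are layer thicknesses
    (m) and velocities (m/s), and the layer containing depth z contributes only
    the portion above z."""
    tops = np.concatenate(([0.0], np.cumsum(h)[:-1]))  # depth to top of each layer
    thick_above = np.clip(z - tops, 0.0, h)            # full or partial thickness
    return np.sum(thick_above / vs)

def v_sz(h, vs, z):
    """Time-averaged velocity V_{S,z}(z) = z / tt_{S,z}(z) of Eq. 10."""
    return z / travel_time(h, vs, z)

# Example: V_{S,z} at z = 30 m (i.e., Vs30) for a hypothetical three-layer profile
h = np.array([5.0, 15.0, 80.0])        # thicknesses, m
vs = np.array([180.0, 350.0, 760.0])   # shear-wave velocities, m/s
print(v_sz(h, vs, 30.0))               # approximately 358 m/s
```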

The Passeri model for the randomization of the travel time as a function of depth uses Eqs. 4 through 8, but it fits \(t{t}_{S,z}\left(z\right)\) instead of \({V}_{S}\left(z\right)\). According to Passeri (2019), the resulting standard deviation \({\sigma }_{\mathrm{ln}(t{t}_{S,z})}\) can be represented as taking values in the range from 0.01 to 0.02, although some profiles show significantly higher values in the top 20–40 m (see Fig. 4). This standard deviation is supposed to encompass all epistemic and aleatory uncertainties, but it appears that it only includes uncertainty in the dispersion curves and that this uncertainty has been underestimated (Rodriguez-Marek, personal communication, October 5, 2021). These values are nearly one order of magnitude lower than the site-specific values shown in Table 2 and other values to be shown in the next section.

Fig. 4

Logarithmic standard deviation of travel time, as calculated by Passeri (2019). Source: Passeri (2019)

8 Uncertainty in surface-wave methods

Surface wave methods for the development of Vs profiles are a cost-effective alternative to the traditional invasive methods and are being used with increasing frequency. Uncertainty in the calculated Vs profile arises from uncertainties in the raw experimental data, the effect of assuming a 1D isotropic model that ignores lateral variability, non-uniqueness in the inversion of the dispersion curve, and differences between analysts (due to different approaches to construct the dispersion curves, different inversion techniques and software, different interpretations of possible higher modes, different parameterizations of the model space, etc.).

Foti et al. (2018) suggest that the uncertainty associated with the raw or primitive measurements at a given site and with a given array is aleatory. Contributors to this uncertainty include noise in the signals, errors in sensor location, and near-field effects. This uncertainty is represented by uncertainty in the dispersion curves.

Foti et al. (2018) also suggest that the uncertainty associated with the derived quantities (quantities calculated by one or more analysts, possibly using different methods and parameterizations) should be treated as epistemic uncertainty. This characterization as epistemic uncertainty is generally consistent with the traditional SSHAC distinction between epistemic and aleatory uncertainty because this uncertainty can be reduced by increasing the number of analysts (call it \(n)\). As \(n\) is increased (and under the assumption of independence between experts), the associated uncertainty (or standard error) in the median Vs at a given depth is proportional to \(\sigma /\sqrt{n}\), where \(\sigma\) in this case is the inter-expert (or expert-to-expert) standard deviation. This reduction with \(\sqrt{n}\) is a natural consequence of the assumption of independence between experts (see Benjamin and Cornell 1970).

Another source of uncertainty in surface-wave methods is non-uniqueness. Depending on the parameterization used for the inversion, there may be many profiles that match the dispersion curves equally well. The site-response implications of non-uniqueness and approaches to reduce its impact are investigated by Foti et al. (2009), Boaga et al. (2011), and Teague and Cox (2016), among others.

To this author’s knowledge, there are no generally accepted standard procedures to quantify the uncertainty in Vs measured using surface-wave methods. Lai et al. (2005) propose a mathematically rigorous approach based on derivatives and uncertainty propagation. They obtain very low COVs (less than 0.04 overall and less than 0.01 at many depths), possibly because the algorithm considered only a local optimal point and ignored other optimal points further away in parameter space. Some authors have used global-search or Monte Carlo approaches, which are more likely to identify multiple optimal points if they exist. For instance, Foti et al. (2009) obtained COVs of Vs ranging from 0.01 to 0.10 when using Monte Carlo methods to invert the dispersion curve from a three-layer synthetic profile. Others have used Markov-Chain Monte Carlo, which can be thought of as combining the process of searching for an optimal solution with the process of randomizing in a manner that samples the uncertainty in the dispersion data (Molnar et al. 2010, 2013; Dettmer et al. 2012; Gosselin et al. 2017).

Perhaps the best approach to investigate the uncertainty in surface-wave methods in their current state of development is by means of blind tests. One advantage of this approach is that it automatically includes uncertainties in the field data as well as expert-to-expert differences related to different analytical approaches, different parameterizations, and other factors. The InterPACIFIC project (Garofalo et al., 2016a) performed blind tests at 3 different sites, comparing the Vs profiles developed by 14 teams of experts using a variety of techniques. Each team was provided with the same experimental surface-wave datasets, consisting of both actively and passively obtained data, collected in multiple arrays at each site. The three sites cover a broad range of site conditions. The teams were not given any other information about the sites (e.g., stratigraphy, geology). The teams used different methods and software to process and invert the data. Some teams considered additional information contained within the raw data, such as HVSR, ellipticity, Love-wave dispersion, etc. Each team also calculated its own within-team estimates of uncertainty, but these estimates are not reported or analyzed by Garofalo et al. (2016a).

The first site (Mirandola, Italy) consists of soft alluvial deposits overlying a stiffer layer of clay at a depth between 50 and 150 m. The second site (Grenoble, France) consists of very deep, stiff alluvial deposits (500–800 m thick). The third site (Cadarache, France) is a rock outcrop site. At all three sites, the Rayleigh dispersion curves obtained by the various teams agree over a broad range of frequencies, with COVs lower than 10%.

Figure 5 shows the COVs as a function of depth for the three sites superimposed in one figure, for both Vs and \({V}_{S,z}\) (recall Eq. 10). Focusing first on the COV of Vs, we note that it is high in the top 10 m (likely the result of lateral variability), then it decreases to a value of approximately 0.10. At greater depths, it becomes significantly larger for the Mirandola and Grenoble sites, which contain significant Vs contrasts at depths of 100 and 200 m, respectively. As indicated earlier, these high COVs associated with large velocity contrasts at uncertain depths should not be compared to the \({\sigma }_{lnVs}\) values obtained with the Toro and Shi-Asimaki models (e.g., Tables 1 and 2, and Fig. 3), because those models operate with the Vs values at the layer mid-points and not with Vs as a function of depth, and their uncertainties are not affected by these contrasts. The COV of Vs is much lower for Cadarache at depths above 80 m, because this site has no significant velocity contrasts.

Fig. 5

Inter-expert coefficients of variation for \({V}_{s}(z)\) (left) and \({V}_{s,z}(z)\) (right) as reported by Garofalo et al. (2016a)

Focusing now on the COV of Vs,z, we note that this quantity is much more stable as a function of depth. It is also higher in the top 10 m, especially for the Cadarache site. Garofalo et al. (2016a) indicate that lateral variability was a particular problem for this site, resulting in a high COV in the high-frequency (> 60 Hz) portion of the Rayleigh dispersion data. At depths between 50 and 200 m, the COV of Vs,z ranges from 0.04 to 0.10. Values also show a moderate tendency to increase at greater depths, which is consistent with the fact that surface-wave methods lose resolution at greater depths. Interestingly, the COVs of Vs,z in Fig. 5 are comparable to the COVs of Vs30 obtained by Yong et al. (2019), which consider within-site intra- and inter-method variability for 31 California sites. On the other hand, these values are significantly higher than the 0.01–0.02 range in \({\sigma }_{\mathrm{ln}(t{t}_{s,z})}\) proposed by Passeri (2019, see Fig. 4). Note that \({\sigma }_{\mathrm{ln}(t{t}_{s,z})}={\sigma }_{\mathrm{ln}\left({V}_{s,z}\right)}\approx \mathrm{COV}[{V}_{s,z}]\).

The InterPACIFIC project (see Garofalo et al. 2016b) also compared the results from invasive geophysical investigations at the same three sites performed by multiple teams using a variety of methods (namely, cross-hole, down-hole, suspension logger, and seismic dilatometer). In general, there is no significant bias between the invasive and non-invasive methods, and the COVs of both Vs and Vs,z obtained at these three sites are comparable between methods.

9 Discussion and recommendations

This paper reviewed three stochastic models that have been used for the characterization of uncertainty in Vs, two documents that provide guidance on uncertainty in Vs, and papers and results that provide insights on the uncertainty in Vs for surface-wave methods. The latter topic is a very active one at this time, and some important contributions have likely been missed.

Despite these limitations, this paper tries to develop some insights and recommendations regarding the characterization of uncertainty in Vs for site response analysis.

One important consideration to keep in mind is that the primary purpose of the quantification of uncertainty in Vs, and the associated randomization, is to quantify the resulting uncertainty in site response. Another consideration is that a desirable effect of profile randomization is that it softens the peaks calculated using one-dimensional site response, thereby providing a better match to observations. The interplay between these two desirable effects of profile randomization has never been clearly established.

The ideal testing ground to resolve the issue of uncertainty in site response would therefore be a set of blind studies similar to those of Garofalo et al. (2016a, 2016b), but performed at sites where strong-motion recordings from earthquakes of engineering interest have been obtained. The work by Teague et al. (2018) is a step in that direction, and similar studies at other sites are needed, although such investigations are not truly blind in the sense that the ground motions are known beforehand.

Regarding the definition of what constitutes epistemic and aleatory uncertainty in Vs, any definition will necessarily be arbitrary, for reasons discussed earlier. Nonetheless, in order to develop a comprehensive and consistent quantification of total uncertainty, it would be very useful for researchers and practitioners to agree on what to consider epistemic and what to consider aleatory under the different site-investigation methods, and also agree on a comprehensive list of the various physical processes that contribute to each type of uncertainty. By identifying and naming the various uncertainty contributors in a consistent manner, it will be less likely that some of these contributors will be missed. This will also avoid confusion in nomenclature.

The epistemic uncertainty in Vs is large when no site-specific geophysical measurements of Vs are available, such as situations where all that is known about the site is the NEHRP site class or a proxy estimate of Vs30. The generic values of \({\sigma }_{lnV}\) in the Toro (Table 1) and Shi-Asimaki (Fig. 3) models are in rough agreement about the total (epistemic plus aleatory) uncertainty in Vs.

For site-specific applications, the epistemic uncertainty depends on the methods used and on the number of measurements available. A simple rule of thumb in statistics is that the epistemic uncertainty (or standard error) in the best-estimate Vs is equal to \(\sigma /\sqrt{n}\), where \(n\) is the number of independent measurements. This will make the epistemic uncertainty insignificant in many site-specific applications for which several tens of profiles are obtained from multiple boreholes within the building footprint or by different methods or investigators at the same approximate location. Of course, there may be questions as to whether the profiles obtained by \(n\) investigators are truly independent.

There are two common approaches to calculate the site-specific statistics of Vs (or \(ln{V}_{s}\)) as a function of depth. The first one uses the actual interval velocities \({V}_{s}(z)\); the second one uses the travel time or the time-averaged velocity \({V}_{s,z}\) (recall Eqs. 9 and 10). As discussed earlier, the statistics on \({V}_{s}(z)\) are inflated in the vicinity of velocity contrasts at uncertain depths, making it difficult to discern trends in \({\sigma }_{lnV}\) contained within the data and making comparisons more difficult. This problem can be avoided by considering only mid-layer velocities, but then the statistics become more complicated. As a result, comparisons in terms of the COV of \({V}_{s,z}(z)\) are preferable to comparisons in terms of the COV of \({V}_{s}\)(z). This is a reason why it may be preferable to formulate future stochastic Vs models in terms of \({V}_{s,z}(z)\) (as in the Passeri model) or in terms of travel time, in order to facilitate their fitting and modification for site-specific applications. Another reason for this suggestion is that travel time is closely related to site response.

Another reason for the COV of \({V}_{s,z}\) to be smaller is that it involves sums of travel times (recall Eqs. 9 and 10). If the velocities in the individual layers are not fully correlated, this additive process makes the COV of \(t{t}_{z}(z)\) (which is the same as the COV of \({V}_{s,z}(z)\)) smaller than the COV of the individual layer travel times (which is the same as the COV of the individual layer velocities).
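A stylized calculation illustrates the size of this effect: for \(m\) statistically independent layers with equal mean travel time \(\mu\) and a common layer COV \(c\), the sums in Eq. 9 give

$$\mathrm{COV}\left[t{t}_{z}\right]=\frac{\sqrt{m{\left(c\mu \right)}^{2}}}{m\mu }=\frac{c}{\sqrt{m}}$$

With partial correlation between layers the reduction is smaller, but the COV of the sum still cannot exceed the common layer COV.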

Comparing the SPID recommendations for the aleatory \({\sigma }_{lnV}\) as a function of depth to the InterPACIFIC variation of the COV of \({V}_{s,z}(z)\) as a function of depth in Fig. 5 (or to the variation of the COV of \({V}_{s}(z)\) on the top 70 m in the same figure), it is noteworthy that the values are quite similar. This similarity is surprising considering that the results were obtained using two completely different data sets and purport to represent different things: the SPID values represent aleatory uncertainty associated with a 100–200 m footprint as obtained by Toro (1995), while the InterPACIFIC surface-wave values represent analyst-to-analyst epistemic uncertainty at three sites with different geologies, due primarily to different parameterizations of the problem and to non-uniqueness. In both cases, the uncertainty is higher (~ 0.25) in the top 15 to 20 m, likely as a result of higher lateral variability near the surface (and in the surface-wave data, possibly the result of non-uniqueness associated with the highest frequency in the dispersion data), and settles to a value between 0.07 and 0.20 below 20 m depth. In the InterPACIFIC COV of \({V}_{s,z}(z)\), there is a moderate monotonic increase with depth, likely related to increased non-uniqueness associated with the lowest frequency in the dispersion data.

Based on the material summarized and discussed here, the author proposes the following recommendations:

  • For generic applications involving broad site categories, the values in Table 1 provide reasonable ranges of values for the total uncertainty (epistemic and aleatory) in Vs. The median profiles and other parameters in Toro (1995) can also be used. For generic California applications, the Shi and Asimaki (2018) parameters may be preferable.

  • For site-specific applications for which there are multiple measurements within the site footprint, site-specific estimates of uncertainty in Vs should be obtained from mid-point velocities or using \({V}_{s,z}(z)\), to reduce the effect of velocity contrasts. The latter can then be approximately “un-packed” to obtain the corresponding uncertainty in \({V}_{s}\left(z\right)\) without the effect of layer boundaries. If there are not sufficient site-specific profiles to obtain site-specific estimates of the aleatory uncertainty in Vs, the SPID recommendations (i.e., 0.25 at the surface, decreasing to 0.15 at 15 m and deeper) can be used in conjunction with the Toro Vs randomization model. If no information is available on inter-layer correlation, parameter values within the range shown in Table 3 are recommended. The generic layer-thickness randomization model (Eq. 1) should not be used in site-specific applications if the median profile exhibits significant Vs contrasts. In these situations, either no layer-thickness randomization or a more localized randomization should be used.

  • The epistemic uncertainty in site-specific applications should reflect the quantity and quality of data available. In particular, for depths with one reliable geophysical Vs measurement, the epistemic uncertainty should be equal to the SPID aleatory uncertainty or to the COV of \({V}_{s,z}(z)\) in Fig. 5b. If the number of measurements is greater than one, then this uncertainty should be divided by \(\sqrt{n}\) (where \(n\) is the number of measurements) unless there are reasons to suspect that some of the measurements are not independent (in which case a conservatively chosen equivalent \(n\) should be used). At greater depths, one may have to rely on geology, distant sonic logs, or other indirect information, in which case the much larger estimates recommended by the SPID for those cases may be appropriate. The assumption of full depth-wise correlation of the epistemic uncertainty may be questionable. One option may be to perform the randomization using the total uncertainty (this would also require higher inter-layer correlation). Although the effects of the epistemic and aleatory uncertainties are inextricably combined by doing this, the total uncertainty will be captured more accurately.

Even though the parameters of the Toro model have been recommended above, these recommendations should be seen as interim. Over the past 25 years, large sets of profile data have been collected for California, other regions of the USA, and worldwide (Hashash et al. 2014; Kayen et al. 2015; Wang et al. 2019; etc.). New generic and site-specific stochastic Vs models should be developed using these larger data sets, together with insights gained in research and in the practical use of these models. Also, all the above recommendations apply to typical sites where the stratigraphy is relatively flat. Complex sites with significant two- and three-dimensional topography and/or stratigraphy are likely to exhibit higher uncertainties.