Numerical Sensitivity Study Compared to Trend of Experiments for LEAP-UCD-2017


Abstract This paper describes the numerical sensitivity study requested prior to the December 2017 LEAP workshop. Several but not all of the simulation teams participated in this sensitivity study. The results of the sensitivity study are used to begin to map out the simulation response surfaces that relate residual displacement to PGA_eff and relative density. The simulation response surfaces are compared to the corresponding response surfaces determined by nonlinear regression of the centrifuge test data. The definition of the experimental response surface allows a means to objectively reduce the influence of outliers in the experiment dataset. The residuals between the experiments and the regression surface are used to quantify the uncertainty associated with experiment-experiment variability. Some metrics for assessing the comparison between simulations and experiments are explored; it is suggested that differences in the logarithm of displacement are more meaningful than arithmetic differences. As expected, some models predicted the average displacement well and some predicted triggering of liquefaction and the shape of the response function better than others. LEAP-UCD-2017 is not a final assessment of simulation procedures; instead, the results can be used to improve simulation specifications and calibration procedures and as a stimulus for more careful review of simulation results before they are submitted.

Manzari et al. (2019a, b) describe the calibration phase and the Type-B numerical simulation exercise for LEAP-UCD-2017. After receiving the Type-B simulations, the organizing team (Manzari, Zeghal, and Kutter) invited the simulation teams to predict the results of an additional centrifuge test performed at RPI and to conduct a study to illustrate the sensitivity of the simulations to relative density and ground motion intensity. Considering the time constraints, about half of the simulation teams listed in Table 11.1 were able to provide the requested sensitivity study. This paper analyzes the predicted residual displacements from the sensitivity study.

Description of the Requested Sensitivity Study
A unique feature of the LEAP-UCD-2017 exercise is the intention to quantify the uncertainty of centrifuge experimental results. To understand the significance of the uncertainty, it is necessary to quantify the sensitivity of results to variations in key input parameters. For the present study we focus on sensitivity to three specific parameters: (1) the intensity of the input motion, (2) the presence of high-frequency excitation superimposed on the 1 Hz ramped sine wave component of shaking, and (3) the relative density of the sand. The RPI1 centrifuge test was arbitrarily chosen as the base case for the sensitivity study.
The input parameters varied to generate seven unique simulations, NS-1 to NS-7, are listed in the first several rows of Table 11.2. A subset of three conditions (NS-1, NS-2, and NS-3) was planned to illustrate the sensitivity of simulations to relative density. A subset of three simulations was included to illustrate the sensitivity of the simulations to input ground motion intensity. A pair of simulations (NS-1 and NS-6) have similar effective PGAs but different high-frequency contents (a 3 Hz component was superimposed on the 1 Hz ramped sine wave). Finally, a pair of simulations (NS-6 and NS-7) have the same proportions of the 1 Hz and 3 Hz components but different amplitudes to allow a separate assessment of the sensitivity to shaking intensity for an input motion with significant high-frequency content. The simulation teams were specifically requested to provide the calculated lateral displacement of the ground surface in the middle of the slope (Fig. 11.1).

Characterization of Displacements from Experiments
In this paper, the displacement results of the numerical simulations are compared to the two smoothed experimental regression surfaces presented by Kutter et al. (2018, 2019b). Two regression equations, based on different underlying functional forms, were used to obtain the two regression surfaces. The first surface, shown in Fig. 11.2, was based on a six-parameter regression equation in which b_1, b_2, n1, n2, n3, and n4 were used as regression parameters. For the experimental data, Ux2 is the average displacement from the two central surface markers. Figure 11.3 shows a slightly different surface that is based only on the regression using the four parameters b_1, b_2, n1, and n3 (Eq. 11.2). As explained by Kutter et al. (2019b), depending on the model, the method of estimating the parameters, and the exclusion of a small number of outliers, the correlation coefficients (R^2 values) varied between about 0.6 and 0.85, indicating that there is a meaningful relationship between the experimental data and the fitted surface. The bottom diagrams in Fig. 11.3 show the residuals between the experimental data and the regression surface.
Table 11.1 (row fragment): FLAC-2D; 11a-F2-Ub; 11b-F2-pm; Yes; Yes
Note to Table 11.1: The last column indicates whether the team participated in the sensitivity study. Yes* indicates that the team did participate and that its results are shown in its paper in this book, but are not shown in this paper due to time limitations.
Note to Table 11.2: At the time the request for the sensitivity study was sent to the simulation teams, the maximum and minimum dry densities were estimated to be 1765 and 1476 kg/m³. Based upon the study by Carey et al. (2019b), the maximum and minimum dry densities were updated to 1757 and 1490.5 kg/m³, respectively. Using the updated index dry densities, the relative densities corresponding to 1651, 1608, and 1683 kg/m³ would be 64%, 48%, and 75%, slightly smaller than the relative densities listed in the table.
The bottom right diagram shows lines that are suggested as being indicative of the uncertainty associated with experiment-to-experiment variability. The suggested uncertainty bounds were subjectively selected to be consistent with the data and these concepts: (1) based upon measurement accuracy, the minimum uncertainty should be about 1 mm in model scale, which corresponds to between 20 and 50 mm in prototype scale; (2) the variability of displacement should increase with the amplitude of the displacement; and (3) about 2/3 of the data points lie inside the suggested bounds, so that these bounds might be indicative of 1 standard deviation of experiment-experiment variability.
Fig. 11.3 Two views of regression surface for four-parameter model to fit experimental data as described by Kutter et al. (2019b) (top). Two views of residuals of the data from the curve fit, with suggested uncertainty bounds representing experiment-experiment variability (bottom right). Displacements (Ux2) are given in units of mm and accelerations are given in g

2D Comparisons of Experimental Regression Surfaces to Numerical Simulations
Figure 11.4 compares results from the numerical sensitivity study to sections of the regression surface. In Fig. 11.4a, three points from a section through the regression surfaces at PGA_eff = 0.146 g are connected by thick dashed lines; it is apparent that the 4-parameter and 6-parameter surfaces produced very similar results on this section. The thin dashed lines represent the uncertainty band associated with experiment-experiment variability for the 4-parameter model. Eight sets of simulations provided results for the sensitivity analysis for relative densities of 50%, 65%, and 75%. During the Type-B simulations described by Manzari et al. (2019b), the simulations by 5a-F2-Pm respected the prescribed densities but mistakenly used ρ_max and ρ_min reported by a different source, resulting in lower relative densities. During the sensitivity analysis, this team revised the relative density calculations to produce prediction 5s-F2-Pm. The thick dashed lines in Fig. 11.4b indicate sections through the regression surfaces at D_r = 65%. The thinner dashed lines indicate the experimental uncertainty bands. Based on Fig. 11.4, one may conclude the following with respect to simulations:
1. On average, for the teams that participated in the sensitivity study, the NS-1 to NS-5 simulations overpredicted displacements. None of the NS-1 to NS-5 simulations fell below the uncertainty bounds of the experimental data. This observation is not applicable to the teams that did not participate in the sensitivity study.
2. Many of the simulations for the sensitivity study overpredicted displacements at the high ends of the displacement plots (low D_r and high PGA_eff). Only two of the simulations fell within the uncertainty bounds for D_r = 0.5. Only three of the simulations fell within the uncertainty bounds for PGA_eff = 0.288.
3. On the low displacement ends of Fig. 11.4a, b, where the uncertainty bounds are wider on the log plot, most of the simulations fell within the uncertainty bounds. For D_r = 0.75, only three of the predictions were above the uncertainty bound of the regression surface. Five of the eight fell within the uncertainty bounds. For PGA_eff = 0.1, only one of the simulations fell above the uncertainty bounds for the data.
4. It is also interesting to see that a few of the models (6a, 9, 11a) are somewhat insensitive to the PGA after it exceeds 0.15 g.
5. All of the simulation teams (except Team 5) used the relative density based on mass and volume measurements of the experiments instead of the method for determining the density from the cone penetration tests as recommended by Carey et al. (2019a, b). Simulation team 5 used different ρ_d,max and ρ_d,min data to determine relative density and thus their predictions (5a-F2-Pm) were for a looser state than the other simulation teams, as is apparent in Fig. 11.4a. This discrepancy was corrected in a second set of simulations (5s-F2-Pm), which are more consistent with the other simulation teams and will be considered in later comparisons.
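The effect of the assumed index densities on relative density, which underlies the Team 5 discrepancy, can be checked with a short calculation. The sketch below assumes the standard definition of relative density written in terms of dry density, D_r = ρ_max(ρ_d − ρ_min) / [ρ_d(ρ_max − ρ_min)]; the function name `relative_density` and the dictionary labels are illustrative and do not come from the LEAP documents. The index densities are the initial estimates (1765 and 1476 kg/m³) and the updated values (1757 and 1490.5 kg/m³) quoted earlier in this paper.

```python
def relative_density(rho_d, rho_min, rho_max):
    """Relative density (fraction) from dry density and index dry densities.

    Assumes the standard definition D_r = (e_max - e) / (e_max - e_min),
    rewritten in terms of dry density so that specific gravity cancels.
    All densities must share the same units (here kg/m^3).
    """
    if not rho_min < rho_max:
        raise ValueError("rho_min must be smaller than rho_max")
    return rho_max * (rho_d - rho_min) / (rho_d * (rho_max - rho_min))


# Index dry densities (rho_min, rho_max) in kg/m^3, as quoted in the text.
INDEX_DENSITIES = {
    "initial estimate": (1476.0, 1765.0),
    "updated (Carey et al. 2019b)": (1490.5, 1757.0),
}

# Dry densities specified for the sensitivity study, in kg/m^3.
for rho_d in (1651.0, 1608.0, 1683.0):
    for label, (rho_min, rho_max) in INDEX_DENSITIES.items():
        dr = relative_density(rho_d, rho_min, rho_max)
        print(f"rho_d = {rho_d:.0f} kg/m^3, {label}: D_r = {100 * dr:.1f}%")
```

With these inputs the same dry density maps to relative densities that differ by up to about two percentage points, which is why the choice of index densities should be stated whenever a relative density is reported.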

Error Measures and Ranking of Numerical Simulations
The results from the sensitivity study for simulations NS-1 to NS-5 are summarized in Table 11.3. The displacements from experiments are obtained from the regression surfaces at the specified D_r and PGA_eff. The displacements from the simulations are for the top center surface of the sand slope. Four error measures, M1-M4, were considered to evaluate the quality of the simulations. The first two are based on the logarithmic difference of displacements, ln[(displacement from sim)/(displacement from exp)], and the second two are based on the arithmetic difference in displacement, (displacement from sim) − (displacement from exp), where (displacement from sim) is from the lower half of Table 11.3 for each simulation and (displacement from exp) is the average of the displacement from the 4-parameter and 6-parameter regression surfaces evaluated at the PGA_eff and D_r of NS-1 to NS-5. Tables 11.4 and 11.5 present the computed error measures for the various simulation teams. M1 is the median of the absolute values of the natural logarithm of the ratio of displacements listed in columns A-E (Table 11.4). Similarly, M2 is the root mean square (RMS) of the five values in columns A-E (Table 11.4). M3 and M4 are the corresponding median and RMS of the arithmetic differences (Table 11.5). The values in parentheses next to the error measures are their rank orders. The median error is less affected by a single simulation with a large error. For example, imagine four perfect simulations and one simulation that incorrectly predicted instability with huge displacements: the median error measures (M1 or M3) would be zero, while the RMS error (M2 or M4) could be large. For displacements that span multiple orders of magnitude, and especially for data that are log-normally distributed, the statistics are best done by comparing the logarithms of displacements, as is done with error measures M1 and M2.
For error measures based on arithmetic differences (M3 and M4 in Table 11.5), a 10% error for a 1 m displacement prediction (0.1 m error) is treated as being equivalent to a 100% error for a 0.1 m displacement prediction. Arithmetic error measures such as M3 and M4 may be overly sensitive to small percentage errors in regions where displacements are large. It is interesting to note that in several cases the rank (in parentheses in Tables 11.4 and 11.5) changed by four places (out of eight possible) depending on the choice of error measure.
Note to Tables 11.4 and 11.5: Column F lists the median of the absolute values of columns A-E, and column G lists the RMS of columns A-E.
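The four error measures can be sketched in a few lines of code. The example below is a minimal illustration that follows the verbal definitions in this section (median of absolute values and RMS, applied to logarithmic and arithmetic differences); the function names `rms` and `error_measures` are hypothetical, and the displacement values are invented for illustration and are not taken from Table 11.3.

```python
import math
import statistics


def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def error_measures(sim, exp):
    """Return M1-M4 for paired simulated and experimental displacements."""
    if len(sim) != len(exp) or not sim:
        raise ValueError("sim and exp must be non-empty and of equal length")
    if min(sim) <= 0 or min(exp) <= 0:
        raise ValueError("displacements must be positive to take logarithms")
    log_diffs = [math.log(s / e) for s, e in zip(sim, exp)]
    arith_diffs = [s - e for s, e in zip(sim, exp)]
    return {
        "M1": statistics.median([abs(d) for d in log_diffs]),    # median |ln ratio|
        "M2": rms(log_diffs),                                    # RMS of ln ratio
        "M3": statistics.median([abs(d) for d in arith_diffs]),  # median |difference|
        "M4": rms(arith_diffs),                                  # RMS of difference
    }


# Hypothetical displacements in metres (illustration only).
exp = [0.05, 0.1, 0.2, 0.4, 1.0]

# Four perfect simulations and one that overpredicts by a factor of ten:
# the medians (M1, M3) are zero while the RMS values (M2, M4) are large.
print(error_measures([0.05, 0.1, 0.2, 0.4, 10.0], exp))

# A 10% error at 1 m and a 100% error at 0.1 m give the same arithmetic
# error (0.1 m) but very different logarithmic errors.
print(error_measures([1.1], [1.0]))
print(error_measures([0.2], [0.1]))
```

The last two calls reproduce the comparison made in the text: the arithmetic measures cannot distinguish the two cases, whereas the logarithmic measures differ by roughly a factor of seven.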

3-D Comparison of Simulations to Experimental Regression Surfaces
The 4-parameter and 6-parameter regression surfaces described above are compared to the Type-B predictions along with the results of the numerical sensitivity study in Fig. 11.5. For all of the plots, the Ux value is the predicted horizontal component of the surface displacement in the middle of the soil profile.
The results from the sensitivity study (simulations NS-1 to NS-5) are shown in these figures as black circular data points connected by lines. The sensitivity study simulations NS-6 and NS-7 are indicated by the white circular data points. White points correspond to input motions with significant high-frequency content and black points to input motions with little high-frequency content. Some of the simulations show a sensitivity to the high-frequency content that was not accounted for by the PGA_eff. In other words, the white circles of NS-6 and NS-7 plot above the black points for some simulations. For many of the simulations, the white points plot well above the black points, suggesting that these simulations are more sensitive to high-frequency content than are the experiments.
The results of Type-B simulations for the nine model tests selected are also compared to the regression surfaces in Fig. 11.5. The shape of the data points indicates which experiment is being simulated, and the gray scale of the data points is indicative of the ratio PGA_HF/PGA_1Hz (the gray scale is consistent with that of the black and white points used for the sensitivity study). The density coordinates assumed by the simulation teams were used to plot their simulation data in Fig. 11.5; thus some errors associated with estimation of density are approximately accounted for in this data presentation.
Using this 3-D plot format allows for comparison of the trends and sensitivities of the simulations and experiments. Note that the vantage point chosen for the surface plots almost reduces the experimental regression surfaces to a curved line. Also note that the length of the stem below each data point indicates the predicted displacement and the intersection of the stem with the base plane indicates the density and PGA coordinates.
For most of the simulations, the results of the sensitivity study are consistent with the results of the Type-B simulations. Some of the predictions tend to be close to the experimental surface on average, despite having obviously different slopes and curvatures. Different metrics (as yet undetermined, but possibly similar to those explored by Goswami et al. (2019) to compare experimental data) should be used to quantitatively compare the shape and the amplitude of the trends.
Fig. 11.5 Comparison between regression to the experimentally determined response surfaces and the numerical predictions for the nine Type-B predictions and the seven predictions performed for the sensitivity study. The black circles joined together are simulations of NS-1 to NS-5 of the sensitivity study. The shape of the non-circular symbols indicates the experiment being simulated. The grey scale of the points indicates the intensity of the high frequency components (black = negligible high frequency content, white = very significant high frequency content)
An important feature of codes intended for simulation of liquefaction is the ability to predict a triggering curve, where deformations increase for cases above the threshold cyclic stress ratio and deformations are small below the threshold cyclic stress ratio, and the threshold cyclic stress ratio should increase as density increases. This behavior is not apparent for all of the simulations in Fig. 11.5. For some simulations that did predict a reasonable triggering threshold, the simulated displacements increased too rapidly after passing the threshold for liquefaction, while the experimental displacements increase more gradually as PGA increases or density decreases.

Summary and Conclusions
The numerical sensitivity study was useful for LEAP because it allowed for the evaluation of the sensitivity of the numerical simulations to changes in the important input parameters for this problem. The evaluation of the quality of a simulation for a single point should account for the uncertainties associated with the experimental data point and the sensitivity of the simulation to these uncertainties. With enough data points from the sensitivity study and the centrifuge test matrix, we are able to map out response functions for the simulations and experiments and see how well they correspond to each other. In some cases, simulations may accurately model a subset of the experiments, even if the shape of the simulated response surface does not accurately mimic the shape of the experimental response surface.
It is not a goal of this paper to determine winners and losers, especially in light of difficulties with scalar metrics such as M1-M4 explored in this paper. For liquefaction-induced displacements, where the displacements vary by orders of magnitude, logarithmic differences (metrics M1 and M2) are recommended over arithmetic differences (M3 and M4). Arithmetic error metrics may be unduly affected by small percentage errors in the large numbers and may not be appropriately sensitive to large percentage errors in the small-displacement portion of the function.
The results clearly show that most of the teams that participated in the sensitivity study overpredicted the experimentally observed displacements in the large displacement range, with the simulations falling well above the suggested uncertainty bounds due to experiment-experiment variability. Based upon the 3-D plots of Type-B predictions, some of the simulation teams (Teams 3 and 4) probably would not have overestimated displacements had they completed the sensitivity study.
A large number of experiments is needed to map out the experimental response function and a suitable regression analysis is required to provide an experimental response function to map against the numerical response function. The mapping of a response function by the parametric study in numerical simulations is one useful way to assess the quality of the comparisons between experiments and simulations, but many other aspects of the simulations should be considered. Numerical simulations should produce reasonable calibrations to element tests, and the time series data for pore pressure, displacement, and acceleration should also be considered as was done in many respects by Manzari et al. (2019a, b).
Some differences in the comparisons between the simulations and experiments can be attributed to simple discrepancies, such as miscalculation of relative density by the simulation team. The errors are understandable as different maximum and minimum index densities were reported to the simulation teams at different stages of LEAP. In future LEAP efforts using the same Ottawa Sand, the dry density should be used as the primary indicator of the state of the sand in all correspondence because this avoids confusion associated with estimation of the maximum and minimum dry density parameters and the small errors associated with conversion to void ratio using specific gravity reported by different sources. Whenever relative density is specified, the assumed maximum and minimum index dry densities should be explicitly stated; best estimates of these index densities, unfortunately, continue to change as more testing is done to determine the index densities.
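The recommendation to report dry density directly can be illustrated with the conversion to void ratio, e = G_s ρ_w / ρ_d − 1. The sketch below is a minimal example under stated assumptions: the two specific gravity values (2.65 and 2.67) are hypothetical stand-ins for values "reported by different sources" and are not taken from the LEAP documents, the water density is assumed to be 1000 kg/m³, and the function name `void_ratio` is illustrative.

```python
RHO_WATER = 1000.0  # kg/m^3, assumed density of water


def void_ratio(rho_d, specific_gravity, rho_water=RHO_WATER):
    """Void ratio from dry density: e = G_s * rho_w / rho_d - 1."""
    if rho_d <= 0:
        raise ValueError("dry density must be positive")
    return specific_gravity * rho_water / rho_d - 1.0


rho_d = 1651.0  # kg/m^3, one of the dry densities used in the sensitivity study

# Hypothetical specific gravities from two different sources (assumed values).
for gs in (2.65, 2.67):
    print(f"G_s = {gs:.2f}: e = {void_ratio(rho_d, gs):.4f}")
```

Under these assumed values the same measured dry density corresponds to void ratios that differ by about 0.012, so a state reported only as a void ratio carries an ambiguity that the dry density itself does not.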
At this time, LEAP-UCD-2017 should not be used to pass judgment on winners and losers of a contest. In many cases, the differences are more indicative of mistakes, imperfect instructions, and limited resources than validity of the numerical or constitutive behavior. The results of LEAP-UCD-2017 should be used constructively to improve simulation specifications, calibration procedures, and peer review practices that could minimize avoidable errors in future validation efforts.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.