On the automatic parameter calibration of a hypoplastic soil model

This paper presents an approach for the automatic parameter calibration (AC) of a hypoplastic constitutive soil model. The calibration software developed in this work simplifies the parameter calibration, reduces the subjective “human” factor on the calibration result and lowers the entry hurdle for the use of the hypoplastic constitutive model. The performance of the software was demonstrated by comparing automatically calibrated parameter sets for two sands and their related simulations of the underlying experimental data with simulations using two reference parameter sets. The first reference parameter set was calibrated the classical way, "by hand", and the second was calibrated using the AC tool ExCalibre. Two different optimization methods were used, namely the Differential Evolution (DE) and the Particle Swarm Optimization (PSO). The simulations performed with the parameters obtained from the AC agree well with the experimental data and show improvements over the reference parameter sets. With respect to the optimization method, the performance of the DE proved superior to that of the PSO. Various measures of comparison were examined to quantify the discrepancy between experiment and simulation. By repeating 500 calibration runs, the dispersion of parameters was determined and correlations between different parameters of the hypoplastic model were found.


Introduction
Numerous constitutive soil models were developed in the past and are still being developed today.Many of the constitutive models utilized today originate either from the framework of hypoplasticity or elasto-plasticity.Prominent and established representatives of such advanced models for cohesionless soils are Sanisand [9,51] for elasto-plasticity as well as hypoplasticity according to Wu et al. [59] or in the version of von Wolffersdorff [57] with the extension by intergranular strain by Niemunis & Herle [38].For cohesive soils, the anisotropic visco-hypoplastic model by Niemunis et al. [39] or the clay-hypoplasticity by Mas ˇı ´n [34,35] can be listed at this point.
Recently, Fuentes et al. extended the hypoplastic constitutive model with the intergranular strain anisotropy (ISA, an extension of the intergranular strain concept first presented in [12]) to describe liquefaction phenomena of sands [13,40].For the Sanisand model of 2004 [9], many extensions have been proposed.For instance, Taiebat & Dafalias [51] added a yield surface with closed cap, Liu et al. [30] and Liu and Pisano [29] introduced a memory surface which improves the model's performance under cyclic loading and allows for the simulation of several thousands of loading cycles and Barrero et al. [3] improved the model for cyclic shearing of sands in semifluidized states, among others.New and further developments in the field of constitutive models for cohesive soils are for example the anisotropic visco-ISA (AVISA) model by Tafili & Triantafyllidis [50], the coupling of clay-hypoplasticity with ISA-plasticity by Fuentes et al. [14], the barodesy with intergranular strain extension by Bode et al. [5] or a novel hypoplastic model for overconsolidated clays by Wang and Wu [55,56].
Even if the above-mentioned works are only an excerpt from the constitutive models published in the recent years, they are testimony to the rapid and varied developments that have taken place in this field of soil mechanics.Many new models extend and improve the prediction quality of the old ones.This is usually accompanied by an increase in the number of material parameters which need to be calibrated.The quality of the forecast depends significantly on a suitable choice of these parameters.However, their calibration is often time-consuming and requires a high degree of experience in handling the model to be calibrated because many of the parameters cannot be identified straight forward by specific experiments or empirical equations.This makes the application of such advanced soil models a very challenging task not only for beginners but also for experienced engineers and researchers and by this often prevents the application in practical projects and to boundary value problems.Automatic Calibration (AC) aims to simplify and speed up the calibration process and, in particular, to reduce the application hurdles when using advanced constitutive soil models.In addition, AC helps to reduce the ''human factor'' in the calibration of model parameters (e.g., subconscious personal preferences, outcome-based calibrations and experience of the person performing the calibration).Another, not as obvious advantage of AC is the possibility to identify previously unknown relations between individual parameters.The detection of such relations can be used for a possible reduction in the number of required parameters of the model.A well-established tool for AC of constitutive soil models is ExCalibre by Kadlı ´c ˇek et al. [23][24][25] (https:// soilmodels.com/excalibre/).It allows to estimate parameters of constitutive soil models such as clay and sand hypoplasticity based on the results of oedometric compression and monotonic triaxial tests.However, the software only allows to calibrate a full set of parameters (calibration of only individual parameters is not possible) and, in case of the hypoplastic model for sands, produces parameters that underestimate dilatancy [36].In addition, custom implementations of soil models cannot be linked to the ExCalibre.Recently, an AC method for a hypoplastic model based on genetic algorithms was presented by Mendez et al. [36].The therein developed AC uses a Fre´chet Distance-based similarity measure and was able to determine parameters for the hypoplastic model that lead to simulation results closer to the experiments than those calibrated by hand.However, the influence of the optimization method and of the similarity measure used for determining the discrepancy between experiment and simulation remains unclear.In addition, the previous work was limited to a rather small number of laboratory tests when calibrating the parameters (in [36] a maximum of two oedometric compression tests and three drained monotonic triaxial tests were used for calibration).It is unclear how AC performs with a larger data set and what possible limitations follow from a calibration based on a reduced data set.
In this paper, a calibration software for constitutive soil models is presented.The calibration is based on two optimization methods: Differential Evolution (DE) [49] and Particle Swarm Optimization (PSO).The influence of the similarity measures used to quantify the differences between laboratory tests and simulation is investigated.The software is presented by means of the calibration of the parameters for a hypoplastic constitutive model for two different sands.The number of experiments taken into account is varied in order to check the influence of the data basis.
Both, DE and PSO fall in the category of so-called direct or derivative-free algorithms.Such algorithms make few or no assumptions about the underlying optimization problem and can quickly explore very large design spaces (ranges of parameters).In the field of geotechnics, they have been applied to the calibration of the parameters of the Mohr-Coulomb model by means of inverse analyses [41], to the calibration of an elasto-plastic constitutive model [43], to the automatic selection of constitutive models [21] or to the continuous parameter identification in staged excavation simulations [22].In [36], a Genetic Algorithm (GA, also belonging to the category of direct algorithms) was used to calibrate the parameters of a hypoplastic model and in [45] for the identification of maximum discontinuity frequency in a complex rock structure.Recently, [16] used a combination of DEM simulations and Artificial Neural Networks to determine the parameters of Duncan-Chang model, namely the tangent elastic modulus and the ultimate deviatoric stress.
The paper is structured as follows.The developed global optimizer, including the constitutive equations, the definition of the cost function as well as the optimization methods are introduced in Sect. 2. Section 3 presents the experimental data set, and the corresponding reference parameter sets, based on which the performance of the optimization is evaluated.The influence of the choice of optimization method and the choice of similarity measure to calculate the error function is investigated in Sects.4 and 5, respectively.Section 6 focuses on the solution uniqueness and Sect.7 investigates the influence of the extent of the available data for calibration.The conclusions of the study are given in Sect.8.

Numerical tool set
The Automatic Calibration (AC) software has been developed with the aim to provide easy access to the calibration of different (advanced) constitutive soil models.Several of such advanced constitutive models are implemented in the finite element program numgeo (Machac ˇek & Staubach, [31,32,47,48]).The parameter calibration should preferably be performed by using the identical implementation of the constitutive model as for the later simulation of the boundary value problem.In particular for advanced constitutive soil models the implementation can have an important influence on the response of the model.Therefore, the calibration software was built on top of numgeo.This allows easy access to all constitutive soil models implemented in numgeo and thus to utilize the same implementation for calibration and the subsequent calculation of boundary value problems.Although this work is restricted to the calibration of parameters for a hypoplastic constitutive model (see Sect. 2.1), the implementation can easily be extended to other constitutive models implemented in numgeo.
The AC software is implemented in Python, leveraged by numerical libraries such as numpy [17] and scipy [53].An overview of the software is given in Fig. 1.A database has been designed to access and store several (and different in kind) standard laboratory tests such as oedometric compression tests or (drained monotonic) triaxial tests.Functions are available for pre-processing (if desired/required) of the raw data, such as filtering, data reduction or interpolation.Besides the global optimizer, which is the focus of this article, the software also offers the possibility to automatically estimate the parameters based on simplified calibration methods, such as the one according to Herle [18].Two heuristic optimization algorithms are available for global parameter optimization, namely PSO and DE [49].The global optimizer operates on the set of model parameters, comparing the numerical prediction of the constitutive model to the experimental results stored in the database until a best set (the parameters that characterize better the material behavior) is identified, see Sects.

Hypoplastic model for sand
The finite element program numgeo implements a hypoplastic constitutive model for granular materials [59] with a predefined limit state surface [57] and intergranular strain extension [38].The basic hypoplastic model of von Wolffersdorff [57] interrelates the stress rate _ r with the strain rate _ e: Therein, L is a fourth order tensor being linear in _ e, whereas N is a second order tensor being nonlinear in _ e.Both stiffness tensors L and N are functions of stress and void ratio and given by the following equations [57]: Therein r ¼ r trr and rÃ ¼ r À 1 3 I are used.The scalar factors are defined by and u c , h s , n, e i0 , e d0 , e c0 , a and b are parameters and e is the actual void ratio.The pressure-dependent void ratios e i , e c and e d in Eq. ( 4), describing the loosest, the critical and the densest state, are calculated using the following relation [4,59] e is the mean effective stress.The scalar factor F in Eq. ( 2) and Eq. ( 3) is given by To improve the performance of the hypoplastic model in the range of small strains Niemunis & Herle, [38] introduced a new tensorial state variable, the intergranular strain h, which memorizes the recent deformation history.However, since the present work focuses on monotonic loading, the introduction of the intergranular strain extension is omitted for the sake of brevity.The present implementation uses an adaptive Newton scheme for the integration of Eq. (1) considering the rate of convergence within a substepping scheme.For void ratios e exceeding the loosest possible void ratio e i ðpÞ at a given mean effective stress p, a special treatment originally developed by A. Niemunis and described in [33] is applied.

Objective/cost function
The basis of any optimization is the existence of a (scalar) objective function (''cost function'') which has to be minimized by changing the parameters in the course of the optimization.The cost function in this work is defined as follows: iT depends on the laboratory test type.It is a natural choice to judge the quality of the optimization on the basis of the comparison of the experimental data with the simulation results.For this purpose, a comparison of the experimentally measured and simulated relationships in the stress and strain spaces is useful.For the present work, the cost function for oedometric compression tests is evaluated in the axial stress-axial strain ( r1 À ẽ1 ) plane, whereas for the drained monotonic triaxial tests, both the axial strain-deviatoric stress (ẽ 1 À q) and the axial strain-volumetric strain (ẽ 1 À ẽV ) planes are used: Therein, w eq dmt and w ee dmt are weights controlling whether the focus during calibration should be on achieving the best possible reproducibility of the response in the e 1 À q plane, e 1 À e V plane, or a combination of both.Note that in Equation ( 10) scaled stress and strain measures are used instead of the actual values.This is necessary to ensure that the variables used for the error calculation-despite their different value ranges and units (e 2 ½0 À 0:3 and q 2 ½0 À 500 kPa)-have a comparable influence on the optimization.Therefore, both the experimental (exp) and numerical (sim) data are made dimensionless using the following relations: This normalization transfers the experimental and numerical results into representations which have their origin in (0, 0) and range to (1,1) in the case of the r1 À ẽ1 and the ẽ1 À q plane and up to ðÀ1; 1Þ in the case of the ẽ1 À ẽV plane.
The cost functions it iT driving the optimization of the parameters are thus built by accounting for the discrepancy between the numerical predictions and the measured (experimental) data.Traditionally, this discrepancy is often quantified using a sum-of-square-based cost function.However, in the present case this is not always possible.One problem is a potentially different number of data points on the experimental curve compared to the numerical curve.Admittedly, this circumstance could be remedied by linear interpolation-provided there are enough data points to keep the introduced error low.However, in many cases there is no unique relationship between stress and strain-the material behavior is path dependent.An example is the oedometric compression test with loading, unloading and reloading.Thus, the use of alternative objective functions is advised to measure the similarity between the experimental data and the simulation results.In a recently published study [36], the Fre ´chet Distance was used for this purpose.[20] compared the performance of different similarity measures for the automatic parameter calibration of a kinematic hardening material model for steel.Up to now, the influence of the similarity measure applied to evaluate the cost function on the calibration result of constitutive soil models was not reported to the authors' best knowledge.For the present work, the same similarity measurements (and implementations) as in [20] were used: • Dynamic time warping (DTW) DTW [42,52] is a well-established method to account for temporal variations in the comparison of related time series.First applied to speech processing, the application of DTW ranges from data mining over signal processing to issues of different engineering disciplines, see e.g., [1] and the therein included references.
• Fre´chet Distance (FD) The Fre ´chet Distance [10,11] is a measure of the similarity between the curves taking into account the location and ordering of the points along the curves.
• Area Between Two Curves (ABTC) The Area Between Two Curves is the integral of the absolute value of their difference (evaluated as the sum of piece-wise integrals).The resulting integral corresponds to the amount of mismatch between the two curves.The approach proposed and implemented in [20] is used in this work.
The Curve Length Measure [2,6] compares a point p A on curve c A (experimental data) to its corresponding curve length point p B on curve c B (simulation result).p B is evaluated on c B at the equivalent curve length distance of p A on c A .The error is then calculated based on the distance between p A and p B .
An illustrative graphical description of the above similarity measurements is given in [20], to which reference is hereby made.In addition, a similarity measure based on the sumof-square method proposed by [28] and also used in [25,60] was tested which takes the general form: Therein, N is the number of values, U exp i is the value of measurement point i and U sim i is the value of the simulation at point i.Contrary to the other similarity measures, no scaling according to Eqs. (11)(12)(13) is required for this method.However, it has to be ensured that the arrays U exp and U sim have an identical number of data points.In the following, this method will be referred to as the Modified Least Square method (MLS).
In general, it is nearly impossible for numerical models to match experimental data exactly.It is not likely that a constitutive material model matches the real (measured) material behavior perfectly.This applies in particular to the very complex behavior of soils, which is highly nonlinear, anelastic and path dependent.Even if a ''perfect'' constitutive model existed, measurement errors and scattering of experimental results (e.g., resulting from the preparation of soil samples by different technicians) would prevent a perfect fit.Note that it is therefore very unlikely that the cost function will take values smaller than \10 À2 , especially if the parameter calibration is carried out on more than a single test.A good fit must therefore be judged by both a stagnant decrease in the error function and a visual inspection of the simulation results with the experimental data.
At this point it becomes obvious that due to the required specification of weighting factors and the formulation of error measures an absolute objectivity of the AC cannot be achieved and is subject to human influences.In the case of the AC, this corresponds to the influence of the engineer who developed the AC but also to that of the user of the AC.This is shown in Sect. 5 for the choice of similarity measure as well as the choice of weighting factors w eq dmt and w ee dmt .The different combinations of weighting factors considered in this work are summarized in Table 1.

Optimization algorithm
When choosing an optimization algorithm, it should be noted that the objective function of constitutive soil models as represented by the hypoplastic model is a nonlinear, complex multi-peak and non-differentiable function.The reason for this is the high sensitivity of the hypoplastic material model to changes in the parameters.In addition, some parameters of the hypoplastic model are interdependent (see also Sect.6).For the application of traditional direct search algorithms and optimization algorithms, depending on the gradient information, this will lead to local optimization and premature convergence [41].In such cases, the use of so-called direct or derivative-free algorithms is recommended, since they do not use derivatives or finite differences.Popular representatives of these methods are the Nelder-Mead method [37], Differential Evolution algorithms [49,58], Genetic Algorithms or Particle Swarm Optimization algorithms [26,44], among others.Such algorithms make few or no assumptions about the underlying optimization problem and can quickly explore very large design spaces (ranges of parameters).Compared to algorithms depending on gradient information, stochastic optimization algorithms require generally a large amount of evaluations of the objective function.[49], is a population-based metaheuristic1 search algorithm that optimizes a problem by iteratively improving a candidate solution x j (where j is the candidate number) based on an evolutionary process.In this work, the DE implementation of [53] is used.That implementation is based on the method proposed in [49] and later extended by [58].In the DE, the solution of the problem is described by a vector x ¼ ½x 1 ; x 2 ; :::; x nD with the dimension of the search space nD.In the present work, nD is equal to the number of parameters of the constitutive model to be calibrated.The solution space is investigated by multiple candidate solutions x j simultaneously.The set of solutions is called population p ¼ ½x 1 ; x 2 ; :::; x nC , with nC individuals (candidate solutions).For the present work, the overall population size is nC ¼ maxf15 Á nD; 2 p g ¼ 124 (the next power of 2 after 15 Á nD, with nD ¼ 7).For the initialization of p 0 , each parameter varies within a user provided range.These ranges remain fixed for all candidates during the whole optimization process.Details on the ranges are provided in Sect.2.3.3.The initialization of the candidate solution p 0 is based on Sobol sequences, which is known to cover the solution space evenly [27].The quality of the solution is determined based on the cost function as described in Sect.2.2.

Differential evolution (DE), proposed by Storn and Price
At each iteration I (also often referred to as evolutionary step or generation), DE uses a mutation operator for producing a so-called donor vector v j ¼ ½v j 1 ; v j 2 ; :::; v j nD for each individual x j in the current population p.For each target vector x j at iteration I, the associated mutant vector v j can be produced using a specific mutation scheme.Various mutation strategies exist in the literature for this, see for example [15] for a comparison of the most widely used strategies.In this work, the ''best/1/bin'' strategy is used: where x best is the currently best vector in the population p, x randð1Þ and x randð2Þ (with x randð1Þ 6 ¼ x randð2Þ ) are randomly chosen members of p. Their difference is used to mutate x best .F is the differential weight, a positive control parameter for scaling the difference vector.In this work the differential weight randomly varies after each generation in the range of 0:5 F 1.

Particle swarm optimization
The Particle Swarm Optimization (PSO), first introduced by Kennedy, Eberhart & Shi [26,44], is a nature inspired heuristic optimization method.In this work, the PSO implementation of [26,44] is used.In the PSO a candidate solution x ¼ ½x 1 ; x 2 ; :::; x nD T (with the dimension of the search space nD, cf.Section 2.3.1) is iteratively adjusted using a velocity update method.Similar to the DE, the PSO considers a set (a swarm) of solution candidates named particles s ¼ ½x 1 ; x 2 ; :::; x nC T , with nC individuals (candidate solutions).The position update at time t þ 1 of each particle in the swarm is defined as follows: where x i ðtÞ is the position of the considered particle at time t (the last iteration) and v i ðt þ 1Þ is the velocity of the particle at time t þ 1.To calculate the velocity, the following rule is applied: Therein, x gÀbest is the best candidate solution ever explored by the swarm s and x pÀbest is the personal best position of the considered particle so far.c 1 and c 2 are often referred to as ''cognitive weights.''They control if the particle follows the overall swarm's best solution or its personal one.w is the ''inertia weight'' and controls how much the particles velocity influences the update.r 1 and r 2 are random numbers between 0 and 1.In the present work, c 1 ¼ 0:5, c 2 ¼ 0:3 and w ¼ 0:9 were used.

Bounds and constraints
Whenever parameter optimization is performed for advanced material models, it is necessary to limit the search space of possible parameters.If this is omitted, it may result in nonphysical values for some parameters (e.g., negative limiting void ratios in hypoplasticity) or in an unstable numerical solution.On the other hand, the optimization algorithm can only identify the best fit parameters if they are within the search space.Assigning an appropriate search space is a non-trivial task and might require some iterations.However, identifying a too narrow search space is straight forward, since a too narrow search space leads to clustering of the population at the boundaries.
For the selection of the parameter limits, the parameters are divided into two groups.The first group includes the parameters u c , e c0 and e d0 .For these three parameters a suitable starting point for the optimization can be estimated with comparatively high confidence from simple laboratory tests (u c as the angle of repose from loosely pluviated cones of dry sand, e c0 and e d0 from index tests on maximum and minimum void ratio).For these parameters the bounds are set at þ= À 10 % of the corresponding values determined in the laboratory (u lab c , e lab c0 and e lab d0 ).The remaining parameters (e i0 , h s , n, a and b) belong to the second group.Their determination from laboratory tests or correlations is associated with greater uncertainties.Their limits are chosen freely, but in such a way that clustering does not occur.It is imperative to check this.Of course-if desired-the parameters of group 1 can also be treated like parameters of group 2. The bounds of the search space used in the present work are summarized in Table 2.

Experimental data and reference parameter sets
The performance of the calibration software is investigated by calibrating the parameters of the hypoplastic material model for two different sands.The first sand is the socalled Karlsruhe Sand (KSa): a medium coarse sand which has already been the basis for numerous model tests [54] and simulations [8,33,46].The laboratory tests of this sand comprise oedometric compression tests and drained monotonic triaxial tests and were carried out at the Karlsruhe Institute of Technology.The second sand is the socalled UWA Silica Sand (SSa): a fine-to-medium-grained sub-angular sand used for centrifuge tests at the University of Western Australia.The drained monotonic triaxial tests were carried out at the Technical University of Hamburg and are documented in [7].The oedometric compression ''max(q; e)'' max( it;eq dmt ; it;ee dmt ) 1 ''q'' 1 0 For cases where it;eq dmt [ it;ee dmt , the weights take the values of w eq dmt ¼ 1 and w ee dmt ¼ 0. Vice versa for it;ee dmt [ it;eq dmt , the weights are w eq dmt ¼ 0 and w ee dmt ¼ 1 Table 2 Bounds of the search space for the optimization.tests were carried out at the Ruhr-University Bochum.The index parameters of both sands are given in Table 3.A summary of the drained monotonic triaxial tests is given in Tables 4 and 5 for KSa and SSa, respectively.
For both sands, hypoplastic parameters were determined by hand by the authors in previous projects as outlined in the following: • In a first step, the parameters were calibrated by hand following the procedure proposed by [18].The eight parameters of the hypoplastic model have been calibrated based on loosely pluviated cones of dry sand (u c ), index tests on maximum and minimum void ratio (e d0 , e c0 ), oedometric compression tests (h s , n, b) and drained monotonic triaxial tests (a).• The hypoplastic parameters have been further optimized by recalculating the laboratory tests using numgeo.The parameters have been iteratively adjusted in order to receive a better fit of the material behavior under monotonic loading conditions.
In addition, a second parameter set was determined for each sand using the AC tool ExCalibre [23,24].ExCalibre uses a combination of empirical relations and automatic calibration to determine the hypoplastic parameters.The parameters h s and n are calculated from oedometric compression tests as described in [23].The void ratio e c0 is assigned to the initial value of the void ratio of the oedometric compression test performed on the loosest available sample [23].In doing so, it is assumed that the sample of the oedometric compression test is in its loosest possible state.The void ratios e d0 and e i0 are calculated using the empirical relations e d0 ¼ 0:5 Á e c0 and e i0 ¼ 1:2 Á e c0 [23].
The parameters a and b are calibrated by using back-calculations of drained monotonic triaxial tests.a and b are iteratively adjusted until a good fit with the experimental results is obtained.To the best of the authors' knowledge, it is not publicly documented how the discrepancy between experiment and simulation is determined or how the iterative adjustment is performed.A comparison of the simulation results with the parameters determined by hand and by means of ExCalibre with the experimental data is given for the Karlsruhe Sand in Fig. 2 and for the UWA Silica Sand in Fig. 3.The corresponding hypoplastic parameters are summarized in Table 6.The comparison of the parameters shows two peculiarities: first, the void ratios e d0 determined by ExCalibre are significantly lower than those determined in the laboratory and used for calibration by hand.Secondly, an unusually high value for h s is determined by ExCalibre for the Karlsruhe Sand.
For both sands, an acceptable agreement between the experimental data and the results of the numerical simulations can be observed for the parameters calibrated by hand.The parameters h s , n and b of the hypoplastic model have been calibrated to reproduce the first loading curves of the oedometric compression tests on loose samples, thus the good prediction at this phase of the test.For the UWA Silica Sand, noticeable differences for the oedometric compression tests on dense samples are observed.In comparison, the simulations with the parameters obtained from ExCalibre show a worse agreement with the experimental data.This applies in particular to the test on the loose sample of Karlsruhe Sand.For the UWA Silica Sand the parameters calibrated with the help of ExCalibre The critical friction angle was determined as the inclination of a loosely pluviated cone of sand, i.e., as the angle of repose From the simulations of the drained monotonic triaxial tests, one may note that the peak strength is slightly overestimated for both sands.This is more pronounced in case of the UWA Silica Sand, especially for initially loose samples at low initial stress (p 0 ¼ f50 kPa ; 100 kPa g).Except from the test at high initial stress (p 0 ¼ 400 kPa), the residual shear strength is reproduced reasonably well in the simulations of the Karlsruhe Sand.The simulated volumetric strain behavior for initially loose samples can be judged as satisfactory.However, significant differences are noted in the e v À e 1 plots for the medium-dense and dense samples.The inability of the hypoplastic model in reproducing the volumetric strain behavior of dense samples is clearly visibly.This is because the parameter a controlling these curves was previously calibrated to reproduce the peak stress of the test (a common approach to calibrate the hypoplastic constitutive model).For Karlsruhe Sand, the parameters calibrated using ExCalibre overestimate the initial stiffness and show slightly worse   3 Comparison of experiments (black) and simulation results (colored) for oedometric compression tests and drained monotonic triaxial tests on samples of UWA Silica Sand with different initial densities and different initial mean effective stresses.The hypoplastic parameters were calibrated by hand and by ExCalibre [23,24] agreement with the experimental results compared to the parameters calibrated by hand.For the UWA Silica Sand, the simulations using the parameters determined with the help of ExCalibre show a worse match with the experimental data than the ones using the parameters calibrated by hand.In this case, the initial stiffness in the q À e 1 plot is underestimated and the e v À e 1 behavior noticeably underestimates dilatancy.
In summary, the simulation results obtained with the help of the software ExCalibre are of similar quality to those using parameters calibrated by hand.The parameters determined in this way provide a reference for assessing the quality of the parameters obtained by AC.In addition, they also reveal some weaknesses, and it will be interesting to see whether the AC can reduce or even avoid them.

Influence of the optimization method
So far, there is little experience regarding the choice of optimization algorithm for determining the parameters of advanced constitutive soil models.In the following, the results of the parameter calibration of the hypoplastic material model using both, DE and PSO are compared with each other.For this purpose, 200 optimization runs were performed with each optimization method.A ''run'' denotes a complete calibration using AC, which again includes several iterations.The parameters were determined for Karlsruhe Sand on the basis of all available laboratory test data (2 oedometric compression tests and 7 drained monotonic triaxial tests).The discrepancy between the experimental data and the simulation results are quantified using the Fre´chet Distance (see Sect. 2.2).The cost function considers both the discrepancies for the oedometric compression tests as well as for the drained monotonic triaxial tests.Thus, the calibration considers both types of experiments simultaneously.In general, the quality of the parameters determined in a calibration run by optimization is measured by the final value of the cost function (see Sect. 2.2).The smaller this value, the better the agreement between experimental data and simulation results.The final values of the cost function of all runs of the optimization using DE and PSO are compared in Fig. 4.
The comparison clearly shows that the results of the optimization obtained by DE, by means of final cost function value, have a significantly lower scatter than the optimization using PSO.It is particularly noteworthy that the worst optimization result obtained in the 200 runs with the DE ( worst DE ¼ 0:0884) is only slightly worse than the best result obtained with PSO ( best PSO ¼ 0:0864).The worst run with PSO ( worst PSO ¼ 0:1349), on the other hand, leads to an almost 50 % larger value for than the worst run with DE.In terms of the achievable quality of the optimization and the reproducibility of the solution (both measured by means of ), the DE is preferable to the PSO.The extent to which the scattering of the optimization results influences the actual forecast quality is illustrated in Figs. 5 and 6.Therein, the simulation results obtained with the ''best fit'' and the ''worst fit'' parameter set from 200 calibration runs are compared.The corresponding parameters are summarized in Table 7.The reference sets correspond to the one  The comparison given in Figs. 5 and 6 shows that the simulation results obtained by AC are in satisfactory agreement with the experimental data.Furthermore, there are only small deviations in the simulation results based on the ''best fit'' parameter sets of the DE and the PSO.The largest deviations are found for the recalculation of the triaxial test on the very dense sample (D r0 ¼ 0:97; p 0 ¼ 100 kPa).Here, the simulations strongly overestimate the maximum deviatoric stress q, but show a good agreement in terms of volumetric strain e v .Compared to the manually determined parameters, which serve here as a reference parameter set, the automatically calibrated parameters can be attested a similarly good agreement for the oedometric compression tests.In contrast, the automatically calibrated parameters render better simulation results for the oedometric compression tests than the parameters calibrated by means of ExCalibre.The largest deviations between both reference parameter sets and the parameters obtained by AC can be seen in the simulation  6 Comparison of the simulation results for the parameter sets for Karlsruhe Sand with the ''best fit'' (solid) and the ''worst fit'' (dashed) from 200 repeated runs of the AC using PSO Acta Geotechnica results of the drained monotonic triaxial tests.While the reference parameter sets show a good prediction of the peak deviatoric stress for all relative densities, in the case of the AC there is the above-mentioned overestimation of the peak deviatoric stress for the very dense sample.For all other samples, the maximum deviatoric stress is well predicted.With respect to the residual shear strength, there is a slightly better agreement for the parameters obtained by AC.The largest difference is observed for the predicted volumetric strain e v .Here, the simulations with the automatically calibrated parameters achieve a significantly better agreement with the values measured in laboratory tests.Overall, the parameter sets obtained by AC can be attested a better agreement with the laboratory tests than both reference parameter sets.
For the parameters calibrated by DE, a very small deviation for the ''worst fit'' ( ¼ 0:0884) and the ''best fit'' ( ¼ 0:0854) can be observed.This indicates good reproducibility and is in good agreement with the low scatter of the cost function in Fig. 4. From a user perspective, both parameter sets would be equally suitable, as measured by the comparisons shown in Fig. 5.A different picture emerges for the parameter sets optimized by means of PSO.Already in Fig. 4 it became clear that results obtained using PSO show a significantly larger scatter than the ones obtained using DE.This comparatively large deviation in the error function is also reflected in the reproducibility of the simulation results, as can be seen in Fig. 6.In general, both DE and PSO are capable of determining parameters of comparable quality.Also to be emphasized at this point is that only for h s , n, e c0 and a (for the ''best fit'') clearly different parameters of PSO and DE are determined.The other parameters are comparable.However, this good agreement in terms of simulation results only refers to the best of 200 calibration runs.The fact that not every calibration run achieves the same final value of the error function has already been shown in Fig. 4. Compared to the DE, the optimization by means of PSO gets stuck in local minima more often.This could possibly be improved by an adjustment of the optimization constants (see Sect. 2.3.2).However, there are no general rules according to which these constants have to be chosen.The only possibility would be to search for better optimization constants by trial and error.However, this is very time-consuming and contradicts the actual goal of AC: simplification of the calibration process.Due to the significantly lower scatter, optimization by means of DE is preferred for further comparisons.

Influence of the objective function
The objective function, which is to be minimized in the course of the optimization, consists of two components: the similarity measure used to evaluate the discrepancy between the experimental data and the simulation results as well as the weighting of the influence of different laboratory experiments.To the authors' best knowledge, neither the influence of the similarity measure nor the influence of the applied weighting factors were yet investigated in the light of AC of soil models.
In what follows, the performance of five widely used approaches to quantify the discrepancy between experimental data and simulation results for their use in automatic parameter calibration will be shown.These approaches are namely the Area Between Two Curves (ABTC), the Dynamic time warping (DTW), Fre´chet Distance (FD), Curve Length Measure (CLM) and the Modified Least Square (MLS), see Sect.2.2.Since the similarity measures are defined differently, a comparison of their results in terms of a comparison of a numerical value is not possible.Instead, their applicability must be based on a personal comparison of experiment and simulation.This seems like a step backwards, since it is this subjective comparison carried out by an engineer that is supposed to become superfluous by the development of a calibration software.However, this path is inevitable.Taking into account the study of the influence of the optimization algorithm presented in Sect.4, the calibrations required for this purpose were performed using DE.The parameters were calibrated for Karlsruhe Sand and the results of this study are shown in Fig. 7 for the variation of weighting factors and in Fig. 8 for the variation of similarity measures.
As can be seen from Fig. 7 the overestimation of the maximum deviatoric stress, especially visible for the very dense sample, can be reduced by modifying the weights in favor of a better fit in the q À e 1 plane (w eq dmt ¼ 2=3; w ee dmt ¼ 1=3 and max(w eq dmt ; w ee dmt )).However, this leads to an underestimation of residual deviatoric stress and a slightly worse fit in terms of the behavior in the e v À e 1 plane, especially in case of the max(w eq dmt ; w ee dmt )weighting.Not surprisingly, the best overall agreement in terms of the behavior in the q À e 1 plane is observed for the calibration that considers only the discrepancy in the q À e 1 plane to evaluate the fitness in the triaxial tests (w eq dmt ¼ 1; w ee dmt ¼ 0).However, this comes at the price of a noticeable discrepancy between simulation and experimental results in the e v À e 1 plane.It should be noted that this is due to the inability of the hypoplastic constitutive model to capture both the behavior in the q À e 1 and in the e v À e 1 plane equally well with one set of parameters, and is not related to the algorithm.In the authors' experience, the default equal weight settings (w eq dmt ¼ 1=2; w ee dmt ¼ 1=2) work very well for other constitutive models, such as Sanisand [9] for instance.For the forthcoming simulations, setting w eq dmt ¼ 2=3; w ee dmt ¼ 1=3 is considered to be a good compromise between reducing the overestimation of the peak deviatoric stress and still providing good agreement in the e v À e 1 plane.It should first be noted that most similarity measures are capable of producing acceptable parameter sets, see Fig. 8. Probably the worst results, measured by the comparison of the simulation results with the experimental data, are noted for the determination of the error by means of ABTC and CL.In both cases, a very significant overestimation of the peak deviatoric stress is observed for initially dense soil samples.In terms of the e v À e 1 plane on the other hand, both simulations show the best agreement with the experimental data.However, the deviations in the q À e 1 plane in Fig. 8 Influence of the similarity measurement on the results of the AC with ''2/3, 1/3''-weighting.Simulation results (blue) vs. experimental data (black) for seven monotonic drained triaxial tests (left and middle) and two oedometric compression tests (right) on Karlsruhe Sand the triaxial tests outweigh this.Thus, further use of the ABTC and DTW is omitted.The best agreement in the q À e 1 plane is observed for the calibration using the MLS similarity measure.Both peak and residual shear strength are well-matched in the simulations.However, these simulations show a significant underestimation of the dilatancy in the e v À e 1 plot.A compromise between good agreement with the experiments in the q À e 1 and e v À e 1 plots is shown by the calibrations with the CLM and FD similarity measures, with a slight preference for FD.In terms of agreement with the oedometric compression tests, on the other hand, the worst agreement is obtained with the MLS approach since the simulation results show a too soft material response at higher effective stress (r v [ 100 kPa).The simulation results obtained with the other similarity measures show no noticeable differences in the oedometric compression tests and reveal very good agreement with the experiments.
From the authors' perspective, the use of the similarity measure Fre´chet Distance (FD) with w eq dmt ¼ 2=3; w ee dmt ¼ 1=3-weighting (from now on referred to as ''2/3, 1/3''-weighting) for the calibration of the hypoplastic constitutive model leads to the overall best agreement between simulation and experiment on average.Regarding the performance of the hypoplastic model in the oedometric compression tests, the AC was-regardless of the chosen similarity measure-able to either improve or maintain the simulation quality compared to the reference parameter sets calibrated ''by hand'' and using ExCalibre (compare Figs. 2, 7 and 8).To check whether these settings also give good results for another sand, the parameters of the UWA Silica Sand are calibrated (using FD and ''2/3, 1/ 3''-weighting).The simulation results obtained with this calibration are compared with the experimental data in Fig. 9.By calibrating the parameters by means of AC, significant improvement of the agreement between simulations and laboratory tests can be achieved for the UWA Silica Sand compared to the results obtained with the reference parameters.This is especially true for the triaxial tests and applies to both the behavior in the q À e 1 plane but also (and especially) in the e v À e 1 plane.The overestimation of the peak deviatoric stress for dense samples can also be observed in these simulations, although not as pronounced as for the Karlsruhe Sand.As already stated, this is due to the inability of the hypoplastic constitutive model in representing both the behavior in the q À e 1 and the e v À e 1 planes equally well.For the oedometric   compression tests, a slight or a significant improvement is observed compared to the simulations using parameters calibrated by means of ExCalibre and ''by hand,'' respectively.The resulting parameters for the optimization using the Fre´chet Distance are summarized in Table 8.

Range of parameter values
The final parameters for Karlsruhe Sand obtained from 500 AC runs using DE in combination with the Fre´chet Distance and ''2/3, 1/3''-weighting are shown as a scatter plot in Fig. 10.Looking at Fig. 10, the large scatter of the values determined for h s is striking.If the standard deviation is determined for all parameters, these take the following values in relation to the mean value: the largest standard deviation of approximately 19 % is determined for the parameter h s , followed by approx.4.2 for e i0 , 2.8 % for n as well as 2.2 % and 1.8 % for a and b, respectively.The standard deviation for the parameters u c , e c0 and e d0 is less than 0.5 %.This non-negligible variance can be associated with the parameter uncertainty and the very wide limits chosen for the parameters h s , n, a and b.In particular for h s and n lower standard deviations could be achieved by choosing narrower bounds, see for example [36].However, this would have the consequence that the bounds for different sands would have to be examined anew each time.
Considering the results presented in the previous sections, this is not necessary, since the parameters determined by means of DE show a good agreement with the experimental results (see Fig. 5) as well as an extremely low scatter with respect to the simulation results (despite strongly differing parameters).This is due to the lack of physical significance of the parameters.At this point, the name granular hardness for the parameter h s is certainly misleading.h s is a fitting parameter which (together with n) describes (among other relations) the development of the critical void ratio with respect to mean effective stress.Another striking feature in Fig. 10 is the apparently strong correlation of various parameters.These are particularly noticeable for the parameter pairs n À h s , b À h s , e c0 À n, and e c0 À b.The correlation h s À n is certainly known to exist [19] and is in good agreement with the linear correlation presented in [36].However, the comparison with the results shown in [36] should be treated with caution, since therein much tighter bounds were chosen for the parameters h s and n, and the parameters were determined for synthetic laboratory tests.However, the general statement of [36] that potential correlations exist between some parameters of the hypoplastic constitutive model can be confirmed.

Required test data
The calibrations presented in the previous sections were done based on the entire data set, i.e., taking into account all available experiments.Accordingly, the parameters determined in this way represent the soil behavior under these test conditions very well, see Figs. 8 or 9.However, such comprehensive laboratory data is not always available, especially when dealing with practical problems or at the beginning of new project phases.In these cases, the question of the (minimum) number of tests required for a (first) calibration of the parameters of a constitutive model often arises.
The AC offers a solution for this.This is exemplarily shown for the determination of the parameters for Karlsruhe Sand.For this purpose, four additional calibration runs were performed using DE with the Fre´chet Distance and ''2/3, 1/3''-weighting.Each run considered only a part of the experimental data.Which experiments were used for the calibration varied from run to run.The different combinations are shown in Table 9.The parameters determined on the basis of all experiments serve as the reference parameter set, see Sect. 4.
The results of the different runs are given in the form of a comparison with the experimental data in Fig. 11.Blue are the simulation results which belong to a laboratory test considered in the calibration.Red, on the other hand, are those whose experimental counterpart was not used in the calibration.These correspond to ''blind predictions.'' The results shown in Fig. 11 show that considering only an oedometric compression test on an initially loose sample in the AC results in a parameter set that underestimates the  Particle Swarm Optimization (PSO).Parameter calibration was based on oedometric compression tests and drained monotonic triaxial tests.However, the consideration of other test types is also possible.The findings can be summarized as follows: • In principle, parameter sets obtained with both PSO and DE showed good agreement with the experimental data.However, the DE showed a much lower scatter and is therefore the recommended optimization algorithm.• The choice of the similarity measure to quantify the discrepancy between laboratory test and simulation was found to be anything but trivial.The same applies to an appropriate choice of factors for weighting different aspects of soil behavior (q À e 1 plane vs. e v À e 1 plane).
In the authors' opinion, the combination of the Fre´chet Distance with a ''2/3, 1/3''-weighting showed the best results and was thus chosen for the remaining comparisons.However, as already stated, this is influenced by the disadvantages of the hypoplastic constitutive model in representing both the behavior in the q À e 1 plane and the e v À e 1 plane equally well.Thus, the question of the most appropriate method to quantify the discrepancy between simulation and experiment cannot be answered conclusively.In particular with regard to the calibration of the ''cyclic parameters'' individual similarity measures could turn out to be unusable.The study of these issues is the subject of future work.• The scatter of the determined parameters was examined using 200 repeated calibration runs.It was shown that especially for the parameter h s with a standard deviation of up to 19 % a large variance exists.However, this has only little influence on the quality of the simulations which are carried out with these parameters.Using DE, the difference between the simulations using different parameter sets obtained by AC is negligible.This shows that the actual value of individual parameters does not have a large influence on the prediction of the constitutive model for the element tests considered, as long as the totality of the parameters is taken into account in the calibration.
The automatic calibration is able to successfully calibrate parameters for advanced constitutive soil models.This was demonstrated for the hypoplastic model for sands.Automatic calibration reduces the entry barriers for the application of advanced constitutive models and the associated effort (of calibration) and is able to find a better parameter set compared to hand calibration.

Fig. 1
Fig. 1 Overview of the AC tool set for the optimization of parameters of constitutive models for soils u lab c , e lab c0 and e lab d0 are evaluated from standard laboratory tests as explained in the text u

Fig. 2
Fig.2Comparison of experiments (black) and simulation results (colored) for oedometric compression tests and drained monotonic triaxial tests on samples of Karlsruhe Sand with different initial densities and different initial mean effective stresses.The hypoplastic parameters were calibrated by hand in previous work[33] and by ExCalibre[23,24] Fig.3Comparison of experiments (black) and simulation results (colored) for oedometric compression tests and drained monotonic triaxial tests on samples of UWA Silica Sand with different initial densities and different initial mean effective stresses.The hypoplastic parameters were calibrated by hand and by ExCalibre[23,24]

Fig. 4
Fig. 4 Final values of the cost function for 200 repeated runs of the automatic parameter calibration for Karlsruhe Sand using DE and PSO

Fig. 5
Fig.5Comparison of the simulation results for the parameter sets for Karlsruhe Sand with the ''best fit'' (solid) and the ''worst fit'' (dashed) from 200 repeated runs of the AC using DE

Fig. 7
Fig. 7 Influence of the weighting factors on the results of the AC.Simulation results (blue) vs. experimental data (black) for seven monotonic drained triaxial tests (left and middle) and two oedometric compression tests (right) on Karlsruhe Sand using DE with the Fre´chet Distance

Fig. 9
Fig.9Comparison of experiments (black) and simulation results (blue) for oedometric compression tests and drained monotonic triaxial tests on samples of UWA Silica Sand with different initial densities and different initial mean effective stresses.The hypoplastic parameters were calibrated using DE with FD and ''2/3, 1/3''weighting

Fig. 10
Fig. 10 Collection of scatter plots showing the mutual distribution of different parameter sets considering the results from 500 runs using DE with the Fre´chet Distance and ''2/3, 1/3''-weighting for Karlsruhe Sand

Fig. 11
Fig. 11 Comparison of experiments (black) and simulation results (colored) for the parameter sets based on different sets of laboratory tests as basis for the calibration.Blue: tests used for calibration (training data), Red: tests not used for calibration (test data) 2.3.2 and 2.3.1.Interfaces to various comparison measures are implemented as explained in more detail in Sect.2.2.

Table 1
Combinations of weighting factors for judging the constitutive models performance in drained monotonic triaxial tests

Table 3
Index parameters of the Karlsruhe Sand and UWA Silica Sand used in the experiments

Table 4
Drained monotonic triaxial tests on Karlsruhe Sand (KSa)

Table 5
Drained monotonic triaxial tests on UWA Silica Sand (SSa)

Table 6
Parameters of the hypoplastic model for Karlsruhe Sand (KSa) and UWA Silica Sand (SSa)

Table 7 '
'Best fit'' and ''worst fit'' parameter sets of the hypoplastic model for Karlsruhe Sand obtained by optimization with DE and PSO

Table 8
Parameters of the Hypoplastic model for Karlsruhe Sand and UWA Silica Sand obtained from DE with Fre´chet Distance and ''

Table 9
Overview of the laboratory tests considered in the different calibration runs samples (see Data Set 3).Obviously, as can be seen from Data Set 4, it is not possible to completely dispense with oedometric compression tests for calibration.Furthermore, the result of the calibration with Data Set 3 seems to allow the conclusion that the influence of the mean effective stress can be reproduced by the hypoplastic model even if no triaxial tests on samples with different mean effective stresses are available.Of course, these observations are not universal.For different sands other conclusions could be drawn, which has to be investigated further.However, they serve as an illustration of the possibilities offered by AC.Aspects of calibration and the influence of individual parameters can be investigated in a comparatively simple way.An automatic calibration software has been developed, which performs the calibration of constitutive model parameters by solving a regression problem.It is based on the minimization of a cost function that quantifies the discrepancy between experiment and simulation.For the hypoplastic model, this is a nonlinear and non-differentiable multi-peak function.Five different methods were used to quantify the discrepancy: Area Between Two Curves, Dynamic time warping, Partial Curve Mapping, Fre´chet Distance and Curve Length Measure.Two heuristic optimization algorithms were used to solve this minimization problem: Differential Evolution (DE) and Funding Open Access funding enabled and organized by Projekt DEAL.No funding was received for conducting this study.Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.