Structural and Multidisciplinary Optimization, Volume 60, Issue 4, pp 1327–1353

Target output distribution and distribution of bias for statistical model validation given a limited number of test data

  • Min-Yeong Moon
  • K. K. Choi
  • David Lamb
Research Paper

Abstract

Simulation models must be validated with experimental data before they can be used with confidence to predict the outputs of engineered systems. A pointwise comparison between the output predicted by a simulation model and experimental data is not appropriate for model verification and validation (V&V), because real-world phenomena are not deterministic: irreducible uncertainty is always present. The output prediction of a simulation model therefore needs to be represented by a probability density function (PDF), and statistical model validation methods are necessary to compare the model prediction with physical test data. Validating a simulation model requires detailed test data that are expensive to generate, and practicing engineers can afford only a very limited number of tests. This paper proposes an effective method for validating a simulation model by using a target output distribution that closely approximates the true output distribution. The proposed target output distribution accounts for a biased simulation model with stochastic outputs, that is, a simulation output distribution, using limited numbers of input and output test data. Since limited test data may contain outliers or be sparse, a data quality checking process is proposed to determine whether a given set of output test data needs to be balanced; if so, stratified sampling using cluster analysis is employed to obtain a balanced test dataset. Bayesian analysis is then used to obtain many candidate target output distributions, from which the one at the posterior median is selected. Finally, the distribution of bias is identified using Monte Carlo convolution. Three engineering examples demonstrate that (1) the developed target output distribution closely approximates the true output distribution and is robust under different sets of test data; (2) the test dataset reallocated by the quality checking process and balance sampling matches the true output distribution better; and (3) the distribution of bias is effective for understanding the model's accuracy and model confidence in a comparison study.
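
The workflow summarized above, estimating a target output distribution with a KDE whose bandwidth is chosen at the posterior median and then identifying the bias distribution by Monte Carlo convolution, can be illustrated with a short sketch. The Python sketch below is illustrative only: the synthetic test data, the flat prior and leave-one-out Gaussian likelihood for the bandwidth, the grid-based posterior, the normal family assumed for the bias, and the sorted-sample distance are all simplifying assumptions and not the procedure developed in the paper.

import numpy as np

rng = np.random.default_rng(0)

# Limited output test data y^e (synthetic stand-in for physical test results).
y_e = rng.normal(10.0, 1.5, size=20)

# Bayesian bandwidth selection for the KDE, simplified to a grid posterior with
# a flat prior and a leave-one-out Gaussian-kernel likelihood.
def loo_log_likelihood(h0, data):
    ll = 0.0
    for i in range(len(data)):
        others = np.delete(data, i)
        dens = np.exp(-0.5 * ((data[i] - others) / h0) ** 2) / (h0 * np.sqrt(2.0 * np.pi))
        ll += np.log(dens.mean())
    return ll

h_grid = np.linspace(0.2, 2.0, 100)
log_post = np.array([loo_log_likelihood(h, y_e) for h in h_grid])
post = np.exp(log_post - log_post.max())
post /= post.sum()
h_med = h_grid[np.searchsorted(np.cumsum(post), 0.5)]  # posterior-median bandwidth

# Target output distribution: sample from the KDE built with the selected bandwidth.
def sample_target(n):
    centers = rng.choice(y_e, size=n, replace=True)
    return centers + rng.normal(0.0, h_med, size=n)

# Biased simulation output G(x) from MCS (assumed normal here for illustration).
M = 20_000
g_sim = rng.normal(9.0, 1.2, size=M)
y_target = np.sort(sample_target(M))

# Monte Carlo convolution: search a (normal) bias family for the distribution B
# such that samples of G(x) + B best match the target output distribution.
best = None
for mu_b in np.linspace(-1.0, 3.0, 21):
    for sd_b in np.linspace(0.05, 2.0, 20):
        y_conv = np.sort(g_sim + rng.normal(mu_b, sd_b, size=M))
        dist = np.mean(np.abs(y_conv - y_target))  # distance between sorted samples
        if best is None or dist < best[0]:
            best = (dist, mu_b, sd_b)

print(f"posterior-median bandwidth h0 = {h_med:.3f}")
print(f"estimated bias distribution: mean {best[1]:.2f}, std {best[2]:.2f} (distance {best[0]:.3f})")

The paper's method uses adaptive bandwidths h(y; h0) and, per the nomenclature, MCMC sampling of the bandwidth posterior; the fixed-bandwidth grid above only conveys the structure of the steps.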

Keywords

Statistical model validation · Target output distribution · Irreducible uncertainty · Uncertain output distribution · Uncertainty quantification · Biased simulation model · Reducible uncertainty · Limited number of input test data · Limited number of output physical test data

Nomenclature

AKDE: adaptive KDE

Bi(x): unknown model bias for the ith output response

CAE: computer-aided engineering

CDF: cumulative distribution function

DKG: dynamic kriging

\( \hat{f}(y) \): output PDF using AKDE

\( G_i\left(\boldsymbol{x}\right), G_i^{true}\left(\boldsymbol{x}\right) \): biased simulation output and true output of the ith constraint

h(y; h0): adaptive bandwidth in AKDE

h0: global fixed bandwidth for modeling the output distribution

ISFC: indicated specific fuel consumption

K: kernel

KDE: kernel density estimation

M: number of MCS samples

MAE: mean absolute error

MAP: maximum a posteriori probability

MCMC: Markov chain Monte Carlo

MCS: Monte Carlo simulation

MSE: mean squared error

\( \hat{\mu}_{h_0},\ \hat{\sigma}_{h_0}^2 \): mean and variance of the prior distribution for h0

UQ: uncertainty quantification

PDF: probability density function

P(h0): prior distribution of the bandwidth

P(h0|ye): posterior distribution of the bandwidth given output data

RBDO: reliability-based design optimization

RPM: revolutions per minute

STD: standard deviation

V&V: verification and validation

\( \boldsymbol{y}_i^e, \boldsymbol{y}^e \): ith output data and output data vector

\( \boldsymbol{x}_{ik}^e \): kth element of the collected input data vector \( \boldsymbol{x}_i^e \)

Xi: ith input random variable
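
For orientation, the KDE and bandwidth symbols above combine in the standard forms written below; these are the textbook expressions restated in the paper's notation, given as a reference sketch rather than a reproduction of the paper's derivation:

\[
\hat{f}(y) \;=\; \frac{1}{N}\sum_{j=1}^{N}
\frac{1}{h\!\left(y_j^{e};h_0\right)}\,
K\!\left(\frac{y-y_j^{e}}{h\!\left(y_j^{e};h_0\right)}\right),
\qquad
P\!\left(h_0\,\middle|\,\boldsymbol{y}^{e}\right) \;\propto\;
P\!\left(\boldsymbol{y}^{e}\,\middle|\,h_0\right)P\!\left(h_0\right),
\]

where N denotes the number of output test data (a symbol not listed in the nomenclature).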

Notes

Funding information

Technical and financial support was provided by the RAMDO—U.S. Army SBIR Sequential Phase II sub-contract from RAMDO Solutions, LLC.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Mechanical Engineering, College of Engineering, The University of Iowa, Iowa City, USA
  2. US Army RDECOM/TARDEC, Warren, USA
