# Integrated statistical modeling method: part I—statistical simulations for symmetric distributions

## Abstract

The choice between parametric and nonparametric statistical modeling methods depends on data sufficiency. For sufficient data, the parametric method is preferred because it converges closely to the population distribution. Conversely, for insufficient data, the nonparametric method is preferred because of its flexibility and its conservative modeling of the given data. However, it is difficult for users to choose between the two, because the adequacy of each method depends on how well the given data represent the population model, which is unknown to users. For insufficient data or limited prior information on random variables, the interval approach, which uses interval information of data or random variables, can be used. However, it remains difficult to apply in uncertainty analysis and design because it yields imprecise probabilities. To overcome this problem, this study proposes an integrated statistical modeling (ISM) method, which combines the parametric, nonparametric, and interval approaches. The ISM method uses the two-sample Kolmogorov–Smirnov (K–S) test to determine whether to use the parametric or the nonparametric method according to data sufficiency. Sequential statistical modeling (SSM) and kernel density estimation with estimated bounded data (KDE-ebd) are used as the parametric and nonparametric methods, respectively, in combination with the interval approach. To verify the modeling accuracy, conservativeness, and convergence of the proposed method, it is compared with the original SSM and KDE-ebd for various sample sizes and distribution types in simulation tests. An engineering and reliability analysis example shows that the proposed ISM method has the highest accuracy and reliability in statistical modeling, regardless of data sufficiency. The ISM method is applicable to real engineering data, is conservative in reliability analysis for insufficient data, unlike SSM, and converges to the exact probability of failure more rapidly than KDE-ebd as the amount of data increases.
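The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the two-sample K–S test compares samples simulated from a parametric fit with samples simulated from a nonparametric fit, and that the parametric model is chosen when the two are not significantly different. A fitted normal distribution stands in for SSM, and a plain Gaussian kernel density estimate with a normal-reference bandwidth stands in for KDE-ebd. The function names, the simulated sample size `n_sim`, and the significance level `alpha` are all hypothetical.

```python
import math
import random
import statistics


def ks_two_sample(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every observation equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value of the two-sample K-S statistic."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def select_model(data, n_sim=2000, alpha=0.05, seed=0):
    """Return ("parametric" or "nonparametric", K-S statistic).

    Assumption: the decision compares simulated samples from both fits.
    """
    rng = random.Random(seed)
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)

    # Parametric stand-in (assumption): a fitted normal, not the SSM procedure.
    parametric = [rng.gauss(mu, sigma) for _ in range(n_sim)]

    # Nonparametric stand-in (assumption): Gaussian KDE without estimated
    # bounds, using the normal-reference bandwidth 1.06 * sigma * n^(-1/5).
    h = 1.06 * sigma * len(data) ** (-0.2)
    nonparametric = [rng.choice(data) + rng.gauss(0.0, h) for _ in range(n_sim)]

    d = ks_two_sample(parametric, nonparametric)
    if d <= ks_critical(n_sim, n_sim, alpha):
        return "parametric", d
    return "nonparametric", d


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(5.0, 2.0) for _ in range(30)]
    print(select_model(sample))
```

In this sketch a strongly bimodal data set makes the two fits disagree, so the nonparametric model is selected. For data that a normal distribution describes well, the K–S statistic tends to be smaller, although the outcome still depends on the sample and on the assumed `n_sim` and `alpha`.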

## Keywords

- Integrated statistical modeling (ISM)
- Kernel density estimation with estimated bounded data (KDE-ebd)
- Sequential statistical modeling (SSM)
- Symmetric distribution
- Kernel density estimation with estimated bounded data and sequential statistical modeling method (KbSSM)

## Notes

### Funding information

This work was supported by a grant from the National Research Foundation of Korea (NRF), funded by the Korean Government (NRF-2015R1A1A3A04001351) and by the Technology Innovation Program (10048305, Launching Plug-In Digital Analysis Framework for Modular System Design) funded by the Ministry of Trade, Industry, and Energy (MOTIE, Korea).

### Compliance with ethical standards

### Conflict of interest

The authors declare that they have no conflict of interest.
