Kernel density estimation with bounded data
The uncertainties of input variables are quantified as probability distribution functions using parametric or nonparametric statistical modeling methods for reliability analysis or reliability-based design optimization. However, parametric statistical modeling methods such as the goodness-of-fit test and the model selection method are inaccurate when the number of data points is very small or the input variables do not follow parametric distributions. To deal with this problem, this study proposes kernel density estimation with bounded data (KDE-bd) and KDE with estimated bounded data (KDE-ebd), which randomly generate bounded data within given input variable intervals for the given data and use them to construct density functions. Because KDE-bd and KDE-ebd use input variable intervals, they converge to the population distribution better than the original KDE does, especially when few data points are given. KDE-bd can even handle a problem that has a single data point with input variable bounds. To verify the proposed methods, statistical simulation tests were carried out for various numbers of data points using multiple distribution types, and KDE-bd and KDE-ebd were compared with the original KDE. The results showed KDE-bd and KDE-ebd to be more accurate than the original KDE, especially when the number of data points is less than 10. They are also more robust than the original KDE regardless of the quality of the given data, and are therefore more useful when the data available for the input variables are insufficient.
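The idea of generating bounded data inside a known interval and feeding it to a kernel density estimator can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state how many bounded points are generated, how they are distributed, or how the bandwidth is chosen, so the uniform sampling, the `n_aux` count, the equal weighting of given and generated points, and the use of Silverman's rule of thumb are all assumptions made here for illustration. The function names (`kde_bounded`, `gaussian_kde`, `silverman_bandwidth`) are hypothetical.

```python
import math
import random
import statistics


def silverman_bandwidth(samples):
    """Silverman's rule of thumb for a Gaussian kernel: h = 1.06 * sigma * n^(-1/5)."""
    n = len(samples)
    sigma = statistics.stdev(samples)  # needs at least two samples
    return 1.06 * sigma * n ** (-0.2)


def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def kde_bounded(data, lower, upper, n_aux=20, seed=0):
    """Illustrative sketch only (assumed scheme, not the published algorithm).

    Draws n_aux auxiliary points uniformly inside [lower, upper], pools them
    with the given data, and fits an ordinary Gaussian KDE to the pooled set.
    """
    rng = random.Random(seed)
    aux = [rng.uniform(lower, upper) for _ in range(n_aux)]
    pooled = list(data) + aux
    h = silverman_bandwidth(pooled)
    return gaussian_kde(pooled, h)


# A single observation plus interval bounds is enough to build a density here,
# whereas a plain KDE cannot estimate a sample standard deviation from one point.
pdf = kde_bounded([4.2], lower=3.0, upper=6.0, n_aux=20, seed=1)
```

In this sketch the interval information dominates when only one observation is available, and its influence shrinks as more observations are pooled in. How the published method balances the two is not described in the abstract.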
Keywords: Kernel density estimation, Nonparametric statistical modeling, Interval approach, Nonparametric distribution, Bounded data, Intersection area
This work was supported by the National Research Foundation of Korea (NRF) grant, funded by the Korean Government (NRF-2015R1A1A3A04001351) and by the Technology Innovation Program (10048305, Launching Plug-in Digital Analysis Framework for Modular System Design) and the Human Resources Development program (No. 20164030201230) of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Ministry of Trade, Industry and Energy. This support is greatly appreciated.