Dependence Between Histogram Parameters and the Kernel Estimate of a Unimodal Probability Density

  • GENERAL PROBLEMS OF METROLOGY AND MEASUREMENT TECHNIQUE
Measurement Techniques

The dependence between the length of the sampling interval (bin width) of the domain of values of a one-dimensional random variable and the blur coefficient (bandwidth) of the kernel probability density estimate is determined. The study relies on an analysis of the asymptotic properties of a nonparametric probability density estimate of the Rosenblatt–Parzen type and of its modification. It is shown that the modified kernel estimate is a smoothed histogram. Optimal expressions for the kernel function blur coefficient and for the length of the sampling interval are considered; both are obtained from the condition of minimum mean square deviation of the corresponding probability density estimates. On this basis, a relationship is established between the two parameters. The relationship is determined by a constant and depends on the kernel function applied and on the volume of the initial statistical data. The value of this constant is determined by the form of the reconstructed probability density and is independent of the parameters of that density. Based on computational experiments, formulas are proposed for estimating the constant from the value of the antikurtosis coefficient for symmetric and asymmetric distribution laws. The antikurtosis coefficient is estimated from the same initial statistical data that are used to reconstruct the probability density. The results make it possible to determine the length of the sampling interval quickly from the value of the kernel function blur coefficient, which is relevant when testing hypotheses about the distributions of random variables. The conclusions are confirmed by the results of computational experiments.
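The quantities named in the abstract can be illustrated with a minimal Python sketch. It does not reproduce the authors' formulas, which are not given here. As a stand-in it assumes a Gaussian kernel and the standard textbook normal-reference rules: the asymptotically optimal blur coefficient (commonly called the bandwidth) for normal data and Scott's rule for the histogram sampling interval (bin width). The antikurtosis coefficient is taken as the sample variance divided by the square root of the sample fourth central moment. All function names are hypothetical.

```python
import numpy as np


def kernel_density(x, sample, c):
    """Rosenblatt-Parzen estimate at points x (Gaussian kernel, blur coefficient c)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    u = (x[:, None] - sample[None, :]) / c
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * c * np.sqrt(2.0 * np.pi))


def normal_reference_bandwidth(sample):
    """Assumed stand-in: optimal Gaussian-kernel blur coefficient for normal data."""
    sample = np.asarray(sample, dtype=float)
    return (4.0 / (3.0 * len(sample))) ** 0.2 * sample.std(ddof=1)


def normal_reference_bin_width(sample):
    """Assumed stand-in: Scott's rule for the histogram sampling interval."""
    sample = np.asarray(sample, dtype=float)
    return (24.0 * np.sqrt(np.pi) / len(sample)) ** (1.0 / 3.0) * sample.std(ddof=1)


def antikurtosis(sample):
    """Sample antikurtosis: variance / sqrt(fourth central moment)."""
    d = np.asarray(sample, dtype=float)
    d = d - d.mean()
    return np.mean(d**2) / np.sqrt(np.mean(d**4))


# Synthetic data for illustration only (not from the article).
rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=3.0, size=500)

c = normal_reference_bandwidth(sample)
h = normal_reference_bin_width(sample)
print("blur coefficient c:", c)
print("sampling interval h:", h)
print("ratio h / c:", h / c)
print("antikurtosis:", antikurtosis(sample))
```

Under these assumed rules the ratio h / c does not depend on the location or scale of the data; it depends only on the sample size and the kernel. This is consistent in spirit with the relationship described above, but the constant and the antikurtosis-based formulas proposed by the authors may differ.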

Author information

Corresponding author

Correspondence to A. V. Lapko.

Additional information

Translated from Izmeritel’naya Tekhnika, No. 9, pp. 3–8, September, 2019.

About this article

Cite this article

Lapko, A.V., Lapko, V.A. Dependence Between Histogram Parameters and the Kernel Estimate of a Unimodal Probability Density. Meas Tech 62, 747–753 (2019). https://doi.org/10.1007/s11018-019-01690-2

  • DOI: https://doi.org/10.1007/s11018-019-01690-2
