Abstract
This article proposes a procedure for selecting the blur factor (bandwidth) of kernel functions in the nonparametric density estimation of a one-dimensional random variable from large volumes of statistical data, such as those obtained by remote sensing of natural objects. The procedure relies on regression density estimation, and a method for synthesizing the regression density estimate is presented. The synthesis compresses the original sample by partitioning the range of the random variable into intervals; the number of intervals is chosen using the Heinhold-Gaede rule and a formula for the optimal number of sampling intervals. Two approaches to selecting the blur factor of the regression density estimate are considered: the conventional method for optimizing nonparametric density estimates and the method proposed by the authors. The conventional method is based on minimizing the standard deviation of the estimate. In the proposed method, the blur factors of the kernel functions are chosen from the conditions for the minimum approximation error of the regression density estimate. The approximation properties of the regression density estimate are analyzed under both optimization methods, and the conditions under which each method is applicable are established for estimating the probability densities of lognormally distributed random variables. The results obtained for a one-dimensional random variable can be used to optimize the regression density estimation of a multidimensional random variable.
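The sample-compression idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the formulas, so several things here are assumptions: the function name `regression_density_estimate` is hypothetical, the number of intervals defaults to the square root of the sample size (a commonly quoted form of the Heinhold-Gaede rule), the kernel is Gaussian, and the blur factor `c` is supplied by the caller rather than optimized by either method discussed in the article.

```python
import numpy as np


def regression_density_estimate(sample, x, c, n_bins=None):
    """Kernel density estimate built on a binned (compressed) sample.

    Assumptions (not taken from the article): Gaussian kernel,
    n_bins = round(sqrt(n)) when not given, blur factor c fixed by the caller.
    """
    sample = np.asarray(sample, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = sample.size
    if n_bins is None:
        # Assumed interval-count rule: square root of the sample size.
        n_bins = max(1, int(round(np.sqrt(n))))

    # Compress the sample: keep only interval centers and relative frequencies.
    counts, edges = np.histogram(sample, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = counts / n

    # Weighted sum of Gaussian kernels placed at the interval centers.
    u = (x[:, None] - centers[None, :]) / c
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return (kernel * weights[None, :]).sum(axis=1) / c
```

With a sample of 10,000 observations this evaluates about 100 kernels per point instead of 10,000, which is the practical motivation for compressing large samples before kernel estimation.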
References
Lapko, A.V., Lapko, V.A.: Yadernye Ocenki Plotnosti Veroyatnosti i ih Primenenie [Kernel Probability Density Estimates and their Application; in Russian]. Reshetnev University Publ, Krasnoyarsk (2021)
Lapko, A.V., Lapko, V.A.: Optoelectron. Instrum. Data Process. 50(2), 148–153 (2014). https://doi.org/10.3103/S875669901402006X
Rudemo, M.: Empirical choice of histogram and kernel density estimators. Scand. J. Stat. 9, 65–78 (1982)
Bowman, A.W.: J. Stat. Comput. Sim. 21(3–4) (1985). https://doi.org/10.1080/00949658508810822
Hall, P.: Ann. Stat. 11(4), 1156–1174 (1983). https://doi.org/10.1214/aos/1176346329
Jiang, M., Provost, S.B.: J Stat Comput Simul 84(3), 614–627 (2014). https://doi.org/10.1080/00949655.2012.721366
Dutta, S.: Commun. Stat. Simulat. 45(2), 472–490 (2016). https://doi.org/10.1080/03610918.2013.862275
Sturges, H.A.: J Am Stat Assoc 21, 65–66 (1926). https://doi.org/10.1080/01621459.1926.10502161
Storm, R.: Teoriya Veroyatnostej. Matematicheskaya Statistika. Statisticheskij Kontrol’ Kachestva [Probability Theory. Mathematical Statistics. Statistical Quality Control; in Russian]. Mir Publ, Moscow (1970)
Heinhold, I., Gaede, K.W.: Ingenieur-Statistik [in German]. Springler Verlag, München, Wien (1964)
Lapko, A.V., Lapko, V.A.: Meas Tech 56(7), 763–767 (2013). https://doi.org/10.1007/s11018-013-0279-x
Lapko, A.V., Lapko, V.A.: Meas. Tech. 63(7), 534–542 (2020). https://doi.org/10.1007/s11018-020-01820-1
Parzen, E.: Ann Math Stat 33(3), 1065–1076 (1962). https://doi.org/10.1214/aoms/1177704472
Epanechnikov, V.A.: Theor Probab Appl 14(1), 156–161 (1969). https://doi.org/10.1137/1114019
Gradov, V.M., Ovechkin, G.V., Ovechkin, P.V., Rudakov, I.V.: Komp’yuternoe Modelirovanie [Computer Modeling; in Russian]. Kurs, INFRA‑M Publ., Moscow (2019)
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Translated from Izmeritel’naya Tekhnika, No. 11, pp. 26–32, November 2023. Russian DOI: https://doi.org/10.32446/0368-1025it.2023-11-26-32
Original article submitted 06/20/2023. Accepted 09/07/2023.
About this article
Cite this article
Lapko, A.V., Lapko, V.A. Analysis of optimization methods for nonparametric estimation of probability density in large samples. Meas Tech (2024). https://doi.org/10.1007/s11018-024-02298-x
DOI: https://doi.org/10.1007/s11018-024-02298-x
Keywords
- Regression density estimation
- One-dimensional random variable
- Kernel density estimation
- Blur factor selection
- Heinhold-Gaede rule
- Optimal number
- Sampling intervals
- Large samples