Comparison of the Methodology for Hypothesis Testing of the Independence of Two-Dimensional Random Variables Based on a Nonparametric Classifier

Scientific and Technical Information Processing

Abstract

This paper examines the properties of a new method for testing the hypothesis that random variables are independent. The method uses a nonparametric pattern recognition algorithm that corresponds to the maximum likelihood criterion. The distribution laws in the classes are estimated from the initial statistical data, once under the assumption that the analyzed random variables are independent and once under the assumption that they are dependent. Under these conditions, estimates of the probabilities of pattern recognition errors in the classes are calculated, and the decision on the independence or dependence of the random variables is made according to the smaller of these estimates. The results of the proposed method are compared with those obtained using the Pearson criterion and the Pearson, Spearman, and Kendall correlation coefficients. The Pearson criterion is implemented using a formula for the optimal discretization of the range of values of a two-dimensional random variable. Computational experiments are used to study how the effectiveness of these methods changes as the dependence between the random variables becomes more complex and as the volume of initial statistical data varies.
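The decision rule outlined in the abstract can be illustrated with a minimal sketch. It is one reading of the abstract, not the authors' implementation, and several details are assumptions made only for illustration: Gaussian kernels, a rule-of-thumb bandwidth, leave-one-out density estimates, and the hypothetical function names `bandwidth`, `independence_errors`, and `decide`. The "independent" class density is taken as the product of two marginal kernel estimates, and the "dependent" class density as a joint two-dimensional kernel estimate.

```python
import math
import random


def gauss(u):
    """Standard Gaussian kernel (an assumed kernel choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def bandwidth(values):
    """Rule-of-thumb bandwidth std * n^(-1/6); an assumption, not the paper's selector."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return std * n ** (-1.0 / 6.0)


def independence_errors(xs, ys):
    """Return (err_indep, err_dep): estimated error rates for the two classes.

    err_indep is the share of sample points that the maximum likelihood rule
    assigns to the "dependent" class; err_dep is the share it assigns to the
    "independent" class.
    """
    n = len(xs)
    hx, hy = bandwidth(xs), bandwidth(ys)
    m = n - 1
    to_dependent = 0
    for i in range(n):
        sx = sy = sxy = 0.0
        for j in range(n):
            if j == i:
                continue  # leave-one-out: skip the point being classified
            kx = gauss((xs[i] - xs[j]) / hx)
            ky = gauss((ys[i] - ys[j]) / hy)
            sx += kx
            sy += ky
            sxy += kx * ky
        p_indep = (sx / (m * hx)) * (sy / (m * hy))  # product of marginal estimates
        p_dep = sxy / (m * hx * hy)                  # joint estimate
        if p_dep > p_indep:
            to_dependent += 1
    err_indep = to_dependent / n
    return err_indep, 1.0 - err_indep


def decide(err_indep, err_dep):
    """Choose the class whose estimated error is smaller."""
    return "independent" if err_indep <= err_dep else "dependent"


# Synthetic, strongly dependent data: y is x plus small noise.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]

err_indep, err_dep = independence_errors(x, y)
decision = decide(err_indep, err_dep)
print(err_indep, err_dep, decision)
```

In this sketch both classes are evaluated on the same sample, so the two error estimates sum to one and the rule reduces to a majority vote over the sample points. The abstract does not say how the error estimates are formed in the paper, so the leave-one-out scheme here should be treated as a placeholder.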



Funding

This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.

Author information

Corresponding author

Correspondence to A. V. Lapko.

Ethics declarations

The authors of this work declare that they have no conflicts of interest.

Additional information

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lapko, A.V., Lapko, V.A. & Bakhtina, A.V. Comparison of the Methodology for Hypothesis Testing of the Independence of Two-Dimensional Random Variables Based on a Nonparametric Classifier. Sci. Tech. Inf. Proc. 50, 572–581 (2023). https://doi.org/10.3103/S0147688223060084

  • DOI: https://doi.org/10.3103/S0147688223060084
