Modeling the Nearest Neighbor Graphs to Estimate the Probability of the Independence of Data

Kislitsyn, A. A.

doi:10.1134/S2070048223070086

Published: 26 December 2023

Volume 15, pages S41–S53, (2023)

Mathematical Models and Computer Simulations

A. A. Kislitsyn¹

Abstract

The proposed method is based on the statistics of nearest neighbor graph (NNG) structures, which serve as a benchmark: the probability distribution of graphs over the number of disconnected fragments. The deviation of the observed connectivity from the calculated one makes it possible to estimate the probability that a given sample can be considered a set of statistically independent variables. It is proved that the NNG statistics do not depend on the distribution of distances or on the triangle inequality, which allows such structures to be modeled numerically. The accuracy of the calculated graph statistics is estimated and compared with estimates obtained by modeling random coordinates of points in d-dimensional space. It is shown that the NNG model that ignores the dimension of the space gives fairly accurate estimates of the statistics of graph structures in spaces of dimension higher than five. For spaces of lower dimension, the benchmark can be obtained by directly calculating the distances between points with random coordinates in a unit cube. The proposed method is applied to analyzing the level of nonstationarity of the earthquake catalog for the Kuril–Kamchatka region. The lengths of samples of time intervals between neighboring events are analyzed. It is shown that the analyzed system as a whole is interconnected with a probability of 0.91, and that this dependence is fundamentally different from the lag correlation between the sample elements.
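
The benchmark described in the abstract can be illustrated with a minimal simulation sketch. It is not the author's implementation: all function names, the sample size n = 20, the number of trials, and the choice of uniform i.i.d. "distances" are assumptions made for illustration. The sketch builds the first nearest neighbor graph in two ways, from a symmetric matrix of random distances that ignores geometry and from points with random coordinates in the unit cube, and tabulates the empirical distribution of the number of disconnected fragments.

```python
import random
from collections import Counter


def nearest_neighbor_links(dist):
    """For each vertex i, return the index of its nearest neighbor under dist[i][j]."""
    n = len(dist)
    return [min((j for j in range(n) if j != i), key=lambda j: dist[i][j])
            for i in range(n)]


def count_fragments(links):
    """Count connected components of the undirected graph with edges i -- links[i]."""
    parent = list(range(len(links)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in enumerate(links):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    return sum(1 for i in range(len(links)) if find(i) == i)


def cube_distances(n, d, rng):
    """Squared Euclidean distances between n uniform points in the unit cube [0, 1]^d."""
    pts = [[rng.random() for _ in range(d)] for _ in range(n)]
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in pts] for p in pts]


def random_symmetric_distances(n, rng):
    """Symmetric matrix of i.i.d. uniform values (assumed choice); no geometry,
    so the triangle inequality is not enforced."""
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = rng.random()
    return dist


def fragment_distribution(make_dist, trials):
    """Empirical distribution of the number of fragments over repeated trials."""
    counts = Counter(count_fragments(nearest_neighbor_links(make_dist()))
                     for _ in range(trials))
    return {k: counts[k] / trials for k in sorted(counts)}


rng = random.Random(0)  # fixed seed so that runs are repeatable
n, trials = 20, 1000    # illustrative values, not taken from the paper

benchmark_free = fragment_distribution(lambda: random_symmetric_distances(n, rng), trials)
benchmark_d2 = fragment_distribution(lambda: cube_distances(n, 2, rng), trials)
benchmark_d6 = fragment_distribution(lambda: cube_distances(n, 6, rng), trials)

for name, dist in [("no geometry", benchmark_free), ("d = 2", benchmark_d2), ("d = 6", benchmark_d6)]:
    print(name, dist)
```

Comparing the printed distributions gives a rough feel for how closely the dimension-free model tracks the unit-cube simulation at different d; assessing the agreement properly would need larger samples and the accuracy estimates developed in the article.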

Funding

This study was supported by the Russian Science Foundation (grant no. 23-27-00395).

Author information

Authors and Affiliations

Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, Moscow, Russia
A. A. Kislitsyn

Corresponding author

Correspondence to A. A. Kislitsyn.

Ethics declarations

The author of this work declares that he has no conflicts of interest.

Additional information

Publisher’s Note. Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kislitsyn, A.A. Modeling the Nearest Neighbor Graphs to Estimate the Probability of the Independence of Data. Math Models Comput Simul 15 (Suppl 1), S41–S53 (2023). https://doi.org/10.1134/S2070048223070086

Received: 17 April 2023
Revised: 17 April 2023
Accepted: 15 May 2023
Published: 26 December 2023
Issue Date: December 2023
DOI: https://doi.org/10.1134/S2070048223070086

Keywords:
