Abstract
We explore the properties of the squared Euclidean interpoint distances (IDs) between observations drawn from multinomial distributions. We consider the distances within one sample and across two samples and obtain their means, variances, covariances and distributions. We discuss applications of IDs to testing goodness of fit, testing the equality of high-dimensional multinomial distributions, classification, and outlier detection. A simulation study compares the performance of the \(\chi^2\) and likelihood ratio statistics for testing equality of distributions with that of methods based on the IDs.
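As a rough illustration of the quantities named in the abstract, the sketch below simulates two multinomial samples, computes the squared IDs within one sample and across the two samples, and compares their averages with the expected value that follows from elementary multinomial moments, \(E\Vert X-Y\Vert^2 = n\sum_i p_i(1-p_i) + m\sum_i q_i(1-q_i) + \sum_i (np_i - mq_i)^2\) for independent \(X \sim \text{Mult}(n, p)\) and \(Y \sim \text{Mult}(m, q)\). The function names, sample sizes, and probability vectors are assumptions chosen for the example; the article's own expressions and notation are not reproduced here.

```python
import numpy as np


def squared_ids_within(x):
    """Squared Euclidean distances between all row pairs (i < j) of x."""
    x = np.asarray(x, dtype=float)
    diff = x[:, None, :] - x[None, :, :]      # shape (N, N, k)
    d2 = np.sum(diff ** 2, axis=-1)           # shape (N, N)
    upper = np.triu_indices(x.shape[0], k=1)  # keep each pair once
    return d2[upper]


def squared_ids_between(x, y):
    """Squared Euclidean distances between every row of x and every row of y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    diff = x[:, None, :] - y[None, :, :]      # shape (Nx, Ny, k)
    return np.sum(diff ** 2, axis=-1).ravel()


def expected_squared_id(n, p, m=None, q=None):
    """Mean squared distance between independent Mult(n, p) and Mult(m, q).

    With m and q omitted, both vectors come from Mult(n, p).
    Derived from Var(X_i) = n p_i (1 - p_i); not taken from the article.
    """
    p = np.asarray(p, dtype=float)
    if q is None:
        m, q = n, p
    q = np.asarray(q, dtype=float)
    return (n * np.sum(p * (1 - p))
            + m * np.sum(q * (1 - q))
            + np.sum((n * p - m * q) ** 2))


# Illustrative (hypothetical) settings: 3 categories, 20 trials per observation.
rng = np.random.default_rng(0)
n = 20
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])
x = rng.multinomial(n, p, size=400)   # sample from Mult(n, p)
y = rng.multinomial(n, q, size=400)   # sample from Mult(n, q)

within_mean = squared_ids_within(x).mean()
between_mean = squared_ids_between(x, y).mean()

print("within-sample mean:", within_mean, "expected:", expected_squared_id(n, p))
print("between-sample mean:", between_mean, "expected:", expected_squared_id(n, p, n, q))
```

When the two samples share the same distribution, the between-sample average should be close to the within-sample average; a large gap between them is the kind of signal that ID-based tests of equality exploit.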
Acknowledgments
I would like to thank the anonymous referees, whose helpful comments and suggestions improved the presentation of the article.
Cite this article
Modarres, R. Multinomial interpoint distances. Stat Papers 59, 341–360 (2018). https://doi.org/10.1007/s00362-016-0766-7
DOI: https://doi.org/10.1007/s00362-016-0766-7