Abstract
We (a) establish the probability mass function of the interpoint distance (IPD) between random vectors drawn from the multivariate power series family of distributions (MPSD); (b) obtain the distribution of the IPD within one sample and across two samples from this family; (c) determine the distribution of the Euclidean norm of an MPSD vector and of its distance from fixed points in \(\mathbb{Z}^d\); and (d) provide the distribution of the IPDs of vectors drawn from a mixture of MPSDs. We also present a method for testing the homogeneity of MPSD mixtures using the sample IPDs.
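To make the central object concrete, the following minimal sketch enumerates the exact probability mass function of the squared Euclidean IPD between two independent multinomial vectors, taking the multinomial as one familiar member of the power series family. It is a brute-force illustration for small cases, not the derivation used in the paper; the function names and the parameter values (n = 3, three categories) are hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """P(X = counts) for X ~ Multinomial(sum(counts), probs)."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))


def compositions(n, d):
    """Yield every vector of d non-negative integers that sums to n."""
    if d == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, d - 1):
            yield (first,) + rest


def squared_ipd_pmf(n, probs):
    """Exact pmf of ||X - Y||^2 for independent X, Y ~ Multinomial(n, probs)."""
    support = [(x, multinomial_pmf(x, probs)) for x in compositions(n, len(probs))]
    pmf = defaultdict(float)
    # Sum the joint probability of every pair (x, y) onto its squared distance.
    for (x, px), (y, py) in product(support, repeat=2):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
        pmf[dist2] += px * py
    return dict(sorted(pmf.items()))


# Hypothetical illustration: n = 3 trials over d = 3 categories.
ipd_pmf = squared_ipd_pmf(n=3, probs=(0.5, 0.3, 0.2))
```

Because both vectors share the same total n, the coordinate differences sum to zero, so every attainable squared distance in this sketch is an even integer. The enumeration grows quickly with n and d, which is one reason closed-form results for the IPD distribution are useful.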
Modarres, R., Song, Y. Multivariate power series interpoint distances. Stat Methods Appl 29, 955–982 (2020). https://doi.org/10.1007/s10260-020-00508-8
DOI: https://doi.org/10.1007/s10260-020-00508-8