Multinomial interpoint distances

  • Regular Article
  • Statistical Papers

Abstract

We explore the properties of the squared Euclidean interpoint distances (IDs) between observations drawn from multinomial distributions. We consider the distances within one sample and across two samples and obtain their means, variances, covariances and distributions. We discuss applications of IDs to goodness-of-fit testing, testing the equality of high-dimensional multinomial distributions, classification, and outlier detection. A simulation study compares the performance of the \(\chi ^2\) and likelihood ratio statistics for testing equality of distributions with that of methods based on the IDs.
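To make the quantities in the abstract concrete, here is a minimal simulation sketch. It is not the article's code. All function names are hypothetical, and the closed-form means are not quoted from the paper: they follow from the standard multinomial moments \(\mathrm{Var}(X_i) = n p_i (1 - p_i)\), under the assumptions that the vectors are independent and that both samples use the same number of trials \(n\).

```python
import random
from itertools import combinations


def multinomial_draw(n, probs, rng):
    """Draw one Multinomial(n, probs) count vector from n categorical trials."""
    counts = [0] * len(probs)
    for k in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[k] += 1
    return counts


def squared_distance(x, y):
    """Squared Euclidean distance between two count vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def within_sample_ids(sample):
    """Squared IDs over all unordered pairs within one sample."""
    return [squared_distance(x, y) for x, y in combinations(sample, 2)]


def between_sample_ids(sample_x, sample_y):
    """Squared IDs over all pairs with one vector from each sample."""
    return [squared_distance(x, y) for x in sample_x for y in sample_y]


def expected_within_id(n, probs):
    """E||X - Y||^2 for independent X, Y ~ Multinomial(n, probs).

    Standard moment calculation (an assumption of this sketch, not a
    formula quoted from the article): 2 * n * (1 - sum p_i^2).
    """
    return 2 * n * (1 - sum(p * p for p in probs))


def expected_between_id(n, p, q):
    """E||X - Y||^2 for independent X ~ Mult(n, p), Y ~ Mult(n, q).

    Sum of the two traces of the covariance matrices plus the squared
    distance between the mean vectors n*p and n*q.
    """
    spread = n * (1 - sum(a * a for a in p)) + n * (1 - sum(b * b for b in q))
    shift = n * n * sum((a - b) ** 2 for a, b in zip(p, q))
    return spread + shift


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    n, p, q = 20, [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]
    xs = [multinomial_draw(n, p, rng) for _ in range(100)]
    ys = [multinomial_draw(n, q, rng) for _ in range(100)]

    within = within_sample_ids(xs)
    between = between_sample_ids(xs, ys)
    print("within-sample mean ID :", sum(within) / len(within))
    print("theoretical value     :", expected_within_id(n, p))
    print("between-sample mean ID:", sum(between) / len(between))
    print("theoretical value     :", expected_between_id(n, p, q))
```

When the two cell-probability vectors differ, the between-sample mean exceeds the within-sample means by the term \(n^2 \sum_i (p_i - q_i)^2\). This gap is one way to see why IDs can carry information about the equality of two multinomial distributions.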

Acknowledgments

I would like to thank the anonymous referees, whose helpful comments and suggestions improved the presentation of the article.

Author information

Corresponding author

Correspondence to Reza Modarres.

About this article

Cite this article

Modarres, R. Multinomial interpoint distances. Stat Papers 59, 341–360 (2018). https://doi.org/10.1007/s00362-016-0766-7
