Comparison of relative density of two random geometric digraph families in testing spatial clustering

  • Original Paper, published in TEST

Abstract

We compare the performance of the relative densities of two parameterized random geometric digraph families, called proximity catch digraphs (PCDs), in testing bivariate spatial patterns. These families are the proportional edge (PE) and central similarity (CS) PCDs, both defined with proximity regions based on the relative positions of data points from two classes. The relative densities of these PCDs were previously used as statistics for testing segregation and association patterns against complete spatial randomness. The relative density of a digraph D with n vertices (i.e., of order n) is the ratio of the number of arcs in D to the number of arcs in the complete symmetric digraph of the same order. When scaled properly, the relative density of a PCD is a U-statistic; hence, it is asymptotically normal by the standard central limit theory of U-statistics. The PE- and CS-PCDs are defined with an expansion parameter that determines the size or measure of the associated proximity regions. In this article, we extend the distribution of the relative density of CS-PCDs to expansion parameters larger than one, and we compare the finite sample performance of the tests by Monte Carlo simulations and their asymptotic performance by Pitman asymptotic efficiency. We find the optimal expansion parameters of the PCDs for testing each alternative in finite samples and in the limit as the sample size tends to infinity. Our comparisons demonstrate that, in terms of empirical power (i.e., for finite samples), the relative density of the CS-PCD performs better for the segregation alternative (with the best performance occurring at expansion parameter values larger than one), while the relative density of the PE-PCD performs better for the association alternative. The methods are also illustrated on a real-life data set from plant ecology.
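
To make the definition of relative density concrete, the following is a minimal sketch in Python. It assumes only that a digraph is given as a vertex count and a collection of arcs (ordered pairs); the function name `relative_density` and this representation are illustrative choices, not taken from the article. The complete symmetric digraph of order n has n(n − 1) arcs, so the relative density is the arc count divided by n(n − 1). The sketch does not construct PE- or CS-PCDs, whose proximity regions are defined in the full text.

```python
from itertools import permutations


def relative_density(n_vertices, arcs):
    """Ratio of the number of arcs to n(n-1), the arc count of the
    complete symmetric digraph on the same number of vertices."""
    if n_vertices < 2:
        raise ValueError("relative density needs at least two vertices")
    # Keep distinct arcs (u, v) with u != v; loops and repeats are ignored
    # (an assumption of this sketch).
    distinct = {(u, v) for (u, v) in arcs if u != v}
    return len(distinct) / (n_vertices * (n_vertices - 1))


# Toy digraph on 3 vertices with 3 of the 6 possible arcs.
toy_arcs = [(0, 1), (1, 0), (2, 0)]
print(relative_density(3, toy_arcs))  # 3 / 6 = 0.5

# The complete symmetric digraph on 4 vertices has all 12 arcs.
complete_arcs = list(permutations(range(4), 2))
print(relative_density(4, complete_arcs))  # 12 / 12 = 1.0
```

In the setting of the abstract, the arcs would come from a PCD built on the class 1 points, and the resulting value would then be centered and scaled before being used as a test statistic.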

Abbreviations

PCD:

proximity catch digraph

CCCD:

class cover catch digraph

CS-PCD:

central similarity PCD

PE-PCD:

proportional edge PCD

PAE:

Pitman asymptotic efficiency

CSR:

complete spatial randomness

NNCT:

nearest neighbor contingency table

NN:

nearest neighbor

D(V,A):

Vertex random PCD with vertex set V and arc set A. See Sect. 2, paragraph 1.

N(⋅) and N(x):

Proximity map and the proximity region associated with a point x. See Sect. 2, paragraph 5.

\(\mathcal {C}_{H}(\mathcal {Y}_{m})\):

Convex hull of \(\mathcal {Y}_{m}\). See Sect. 2, paragraph 7.

\(T(\mathcal {Y}_{3})=T(\mathsf {y}_{1},\mathsf {y}_{2},\mathsf {y}_{3})\):

The triangle with vertices \(\mathsf {y}_{1},\mathsf {y}_{2},\mathsf {y}_{3}\). See Sect. 2, paragraph 7.

\(\mathcal {S}(F)\) and \(\mathcal {U}(T (\mathcal {Y}_{3} ))\):

Support of the distribution F and the uniform distribution on \(T (\mathcal {Y}_{3} )\), respectively. See Sect. 2, paragraph 7.

\(N_{\mathrm{PE}}(x,r)\):

The proportional edge proximity map with expansion parameter r. See Sect. 2, paragraph 8.

\(R_{V}(\mathsf {y}_{1})\), \(R_{V}(\mathsf {y}_{2})\), and \(R_{V}(\mathsf {y}_{3})\):

The vertex regions for vertices \(\mathsf {y}_{1},\mathsf {y}_{2},\mathsf {y}_{3}\). See Sect. 2, paragraph 8.

\(N_{\mathrm{CS}}(x,\tau)\):

The central similarity proximity map with expansion parameter τ. See Sect. 2, paragraph 9.

\(R_{E}(e_{1})\), \(R_{E}(e_{2})\), and \(R_{E}(e_{3})\):

The edge regions for edges \(e_{1},e_{2},e_{3}\) opposite to the vertices \(\mathsf {y}_{1},\mathsf {y}_{2},\mathsf {y}_{3}\), respectively. See Sect. 2, paragraph 9.

ρ(D):

The relative density of the digraph D. See Sect. 3, paragraph 1.

\(\rho_{\mathrm{PE}}(n,r)\) and \(\rho_{\mathrm{CS}}(n,\tau)\):

The relative densities of PE-PCDs and CS-PCDs, respectively. See Theorem 3.1.

\(\mu_{\mathrm{PE}}(r)\) and \(\nu_{\mathrm{PE}}(r)\):

The arc probability (or the asymptotic mean) and the asymptotic variance of the relative density of PE-PCDs. See Sect. 3.1, paragraph 4.

\(\mu_{\mathrm{CS}}(\tau)\) and \(\nu_{\mathrm{CS}}(\tau)\):

The arc probability (or the asymptotic mean) and the asymptotic variance of the relative density of CS-PCDs. See Sect. 3.1, paragraph 4.

\(\widetilde{\rho}_{\mathrm{PE}}(n,m,r)\) and \(\widetilde{\rho}_{\mathrm{CS}}(n,m,\tau)\):

The relative densities of the PE-PCD and CS-PCD in the multiple triangle case. See Sect. 3.2, paragraph 2.

\(\widetilde{\mu}_{\mathrm{PE}}(m,r)\) and \(\widetilde{\nu}_{\mathrm{PE}}(m,r)\):

The asymptotic mean and asymptotic variance of the relative density of PE-PCDs in the multiple triangle case. See Corollary 3.4.

\(R_{\mathrm{PE}}(r)\):

The standardized test statistic based on the relative density of the PE-PCD in the one-triangle case. See (13).

\(\widetilde{R}_{\mathrm{PE}}(r)\):

The standardized test statistic based on the relative density of the PE-PCD in the multi-triangle case. See Sect. 4.1, paragraph 3.

\(\operatorname {PAE}_{\mathrm{PE}}(r)\) and \(\operatorname {PAE}_{\mathrm{CS}}(\tau)\):

Pitman asymptotic efficiency scores for the relative density of the PE-PCD and CS-PCD, respectively. See Sect. 7, paragraph 2.

\(\pi_{\operatorname {out}}\) and \(\widehat{\pi}_{\operatorname {out}}\):

Proportion of class 1 points outside the convex hull of class 2 points and its estimate. See Sect. 8, paragraphs 1 and 2, respectively.

\(C_{\mathrm{ch}}\):

The correction coefficient for the class 1 points outside the convex hull of class 2 points. See (14).

\(\widetilde {R}^{\mathrm{ch}}_{\mathrm{PE}}(r)\) and \(\widetilde {R}^{\mathrm{ch}}_{\mathrm{CS}}(\tau)\):

The convex hull corrected versions of the standardized test statistics based on the relative density of PE- and CS-PCDs. See (15).

Acknowledgements

I would like to thank an anonymous associate editor and referees whose constructive comments and suggestions greatly improved the presentation and flow of the paper. Most of the Monte Carlo simulations presented in this article were executed at Koç University High Performance Computing Laboratory. This research was supported by the European Commission under the Marie Curie International Outgoing Fellowship Programme via Project # 329370 titled PRinHDD.

Author information

Corresponding author

Correspondence to Elvan Ceyhan.

About this article

Cite this article

Ceyhan, E. Comparison of relative density of two random geometric digraph families in testing spatial clustering. TEST 23, 100–134 (2014). https://doi.org/10.1007/s11749-013-0344-4

  • DOI: https://doi.org/10.1007/s11749-013-0344-4
