Comparison of relative density of two random geometric digraph families in testing spatial clustering

Ceyhan, Elvan

doi:10.1007/s11749-013-0344-4

Original Paper
Published: 18 October 2013

Volume 23, pages 100–134, (2014)
TEST

Elvan Ceyhan¹

Abstract

We compare the performance of the relative densities of two parameterized random geometric digraph families, called proximity catch digraphs (PCDs), in testing bivariate spatial patterns. These families are the proportional edge (PE) and central similarity (CS) PCDs, and both are defined with proximity regions based on the relative positions of data points from two classes. The relative densities of these PCDs were previously used as statistics for testing segregation and association patterns against complete spatial randomness. The relative density of a digraph D with n vertices (i.e., of order n) is the ratio of the number of arcs in D to the number of arcs in the complete symmetric digraph of the same order. When scaled properly, the relative density of a PCD is a U-statistic; hence, it is asymptotically normal by the standard central limit theory of U-statistics. The PE- and CS-PCDs are defined with an expansion parameter that determines the size or measure of the associated proximity regions. In this article, we extend the distribution of the relative density of CS-PCDs to expansion parameters larger than one, and compare the finite sample performance of the tests by Monte Carlo simulations and their asymptotic performance by Pitman asymptotic efficiency. We find the optimal expansion parameters of the PCDs for testing each alternative in finite samples and in the limit as the sample size tends to infinity. Our comparisons demonstrate that, in terms of empirical power (i.e., for finite samples), the relative density of the CS-PCD performs better for the segregation alternative (with the best performance at expansion parameter values larger than one), while the relative density of the PE-PCD performs better for the association alternative. The methods are also illustrated on a real-life data set from plant ecology.
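The definition of relative density given in the abstract can be illustrated with a minimal Python sketch. It assumes only that a digraph is supplied as a vertex count and a collection of arcs; the function name `relative_density` and the toy arc list are hypothetical and do not come from the article. The complete symmetric digraph of order n has n(n-1) arcs, which gives the denominator.

```python
from typing import Hashable, Iterable, Tuple

Arc = Tuple[Hashable, Hashable]


def relative_density(n_vertices: int, arcs: Iterable[Arc]) -> float:
    """Ratio of the number of arcs in D to the number of arcs in the
    complete symmetric digraph of the same order, which is n(n-1)."""
    if n_vertices < 2:
        raise ValueError("relative density needs at least two vertices")
    # Count each arc (u, v) once; loops (u == v) are ignored (an assumption
    # of this sketch, since the complete symmetric digraph has no loops).
    distinct_arcs = {(u, v) for u, v in arcs if u != v}
    return len(distinct_arcs) / (n_vertices * (n_vertices - 1))


# Hypothetical toy digraph on 4 vertices with 3 arcs.
toy_arcs = [(0, 1), (1, 0), (2, 3)]
toy_density = relative_density(4, toy_arcs)
```

In a PCD, an arc (u, v) would be present when the point v falls inside the proximity region of the point u; building those regions is outside the scope of this sketch.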

Abbreviations

PCD:: proximity catch digraph
CCCD:: class cover catch digraph
CS-PCD:: central similarity PCD
PE-PCD:: proportional edge PCD
PAE:: Pitman asymptotic efficiency
CSR:: complete spatial randomness
NNCT:: nearest neighbor contingency table
NN:: nearest neighbor
\(D(V,A)\):: Vertex random PCD with vertex set \(V\) and arc set \(A\). See Sect. 2, paragraph 1.
\(N(\cdot)\) and \(N(x)\):: Proximity map and the proximity region associated with a point \(x\). See Sect. 2, paragraph 5.
\(\mathcal{C}_{H}(\mathcal{Y}_{m})\):: Convex hull of \(\mathcal{Y}_{m}\). See Sect. 2, paragraph 7.
\(T(\mathcal{Y}_{3})=T(\mathsf{y}_{1},\mathsf{y}_{2},\mathsf{y}_{3})\):: The triangle with vertices \(\mathsf{y}_{1},\mathsf{y}_{2},\mathsf{y}_{3}\). See Sect. 2, paragraph 7.
\(\mathcal{S}(F)\) and \(\mathcal{U}(T(\mathcal{Y}_{3}))\):: Support of the distribution \(F\) and the uniform distribution on \(T(\mathcal{Y}_{3})\), respectively. See Sect. 2, paragraph 7.
\(N_{\mathrm{PE}}(x,r)\):: The proportional edge proximity map with expansion parameter \(r\). See Sect. 2, paragraph 8.
\(R_{V}(\mathsf{y}_{1})\), \(R_{V}(\mathsf{y}_{2})\), and \(R_{V}(\mathsf{y}_{3})\):: The vertex regions for vertices \(\mathsf{y}_{1},\mathsf{y}_{2},\mathsf{y}_{3}\). See Sect. 2, paragraph 8.
\(N_{\mathrm{CS}}(x,\tau)\):: The central similarity proximity map with expansion parameter \(\tau\). See Sect. 2, paragraph 9.
\(R_{E}(e_{1})\), \(R_{E}(e_{2})\), and \(R_{E}(e_{3})\):: The edge regions for edges \(e_{1},e_{2},e_{3}\) opposite to the vertices \(\mathsf{y}_{1},\mathsf{y}_{2},\mathsf{y}_{3}\). See Sect. 2, paragraph 9.
\(\rho(D)\):: The relative density of the digraph \(D\). See Sect. 3, paragraph 1.
\(\rho_{\mathrm{PE}}(n,r)\) and \(\rho_{\mathrm{CS}}(n,\tau)\):: The relative densities of PE-PCDs and CS-PCDs, respectively. See Theorem 3.1.
\(\mu_{\mathrm{PE}}(r)\) and \(\nu_{\mathrm{PE}}(r)\):: The arc probability (or the asymptotic mean) and asymptotic variance of relative density of PE-PCDs. See Sect. 3.1, paragraph 4.
\(\mu_{\mathrm{CS}}(\tau)\) and \(\nu_{\mathrm{CS}}(\tau)\):: The arc probability (or the asymptotic mean) and asymptotic variance of relative density of CS-PCDs. See Sect. 3.1, paragraph 4.
\(\widetilde{\rho}_{\mathrm{PE}}(n,m,r)\) and \(\widetilde{\rho}_{\mathrm{CS}}(n,m,\tau)\):: The relative density for the PE-PCD and CS-PCD in the multiple triangle case. See Sect. 3.2, paragraph 2.
\(\widetilde{\mu}_{\mathrm{PE}}(m,r)\) and \(\widetilde{\nu}_{\mathrm{PE}}(m,r)\):: The asymptotic mean and asymptotic variance of relative density of PE-PCDs in the multiple triangle case. See Corollary 3.4.
\(R_{\mathrm{PE}}(r)\):: The standardized test statistic based on the relative density of PE-PCD in the one-triangle case. See (13).
\(\widetilde{R}_{\mathrm{PE}}(r)\):: The standardized test statistic based on the relative density of PE-PCD in the multi-triangle case. See Sect. 4.1, paragraph 3.
\(\operatorname{PAE}_{\mathrm{PE}}(r)\) and \(\operatorname{PAE}_{\mathrm{CS}}(\tau)\):: Pitman asymptotic efficiency score for relative density of PE-PCD and CS-PCD, respectively. See Sect. 7, paragraph 2.
\(\pi_{\operatorname{out}}\) and \(\widehat{\pi}_{\operatorname{out}}\):: Proportion of class 1 points outside the convex hull of class 2 points and its estimate. See Sect. 8, paragraphs 1 and 2, respectively.
\(C_{\mathrm{ch}}\):: The correction coefficient for the class 1 points outside the convex hull of class 2 points. See (14).
\(\widetilde{R}^{\mathrm{ch}}_{\mathrm{PE}}(r)\) and \(\widetilde{R}^{\mathrm{ch}}_{\mathrm{CS}}(\tau)\):: The convex hull corrected versions of the standardized test statistics based on the relative density of PE- and CS-PCDs. See (15).

Acknowledgements

I would like to thank an anonymous associate editor and referees whose constructive comments and suggestions greatly improved the presentation and flow of the paper. Most of the Monte Carlo simulations presented in this article were executed at Koç University High Performance Computing Laboratory. This research was supported by the European Commission under the Marie Curie International Outgoing Fellowship Programme via Project # 329370 titled PRinHDD.

Author information

Authors and Affiliations

Department of Mathematics, College of Sciences, Koç University, 34450 Sarıyer, Istanbul, Turkey
Elvan Ceyhan

Corresponding author

Correspondence to Elvan Ceyhan.

About this article

Cite this article

Ceyhan, E. Comparison of relative density of two random geometric digraph families in testing spatial clustering. TEST 23, 100–134 (2014). https://doi.org/10.1007/s11749-013-0344-4

Received: 22 February 2012
Accepted: 23 September 2013
Published: 18 October 2013
Issue Date: March 2014
DOI: https://doi.org/10.1007/s11749-013-0344-4
