Spatial autoregressive models for scan statistic

Abstract

Spatial scan statistics are well-known methods for cluster detection and are widely used in epidemiology and medical studies for detecting disease hotspots and evaluating their statistical significance. For the sake of simplicity, the classical spatial scan statistic assumes that the observations of the outcome variable in different locations are independent, whereas in practice the data may exhibit spatial correlation. In this article, we use spatial autoregressive (SAR) models to account for spatial correlation in parametric and non-parametric scan statistics. First, the correlation parameter of the SAR model is estimated and used to transform the outcome into a new outcome that is independent across locations. Second, we propose an adapted spatial scan statistic based on this independent outcome for cluster detection. A simulation study shows that the proposed methods outperform the classical one in the presence of spatial correlation in the data. As spatial correlation increases, the classical method shows a sharp increase in Type I error and false-positive rate, together with a decrease in true-positive rate. By contrast, our methods maintain the Type I error and have stable true-positive and false-positive rates with respect to the spatial correlation. The proposed methods are illustrated using a spatial economic dataset of the median income in the city of Paris. In this application, we show that taking spatial correlation into account leads to the identification of more concentrated clusters than those identified by the classical spatial scan statistic.

Availability of data and material

Data are available on request.

References

  • Anselin L (1995) Local indicators of spatial association. Geograph Anal 27:93–115

  • Anselin L (2013) Spatial econometrics: methods and models. Springer, New York

  • Anselin L, Bera AK, Florax R, Yoon MJ (1996) Simple diagnostic tests for spatial dependence. Regional Sci Urban Econ 26:77–104

  • Arbia G (1990) On second-order non-stationarity in two-dimensional lattice processes. Comput Stat Data Anal 9:147–160

  • Arbia G (2014) A primer for spatial econometrics: with applications in R. Springer, New York

  • Bhatt V, Tiwari N (2014) A spatial scan statistic for survival data based on Weibull distribution. Stat Med 33:1867–1876

  • Brunsdon C, Fotheringham AS, Charlton ME (1996) Geographically weighted regression: a method for exploring spatial nonstationarity. Geograph Anal 28:281–298

  • Brunsdon C, Fotheringham S, Charlton M (2007) Geographically weighted discriminant analysis. Geograph Anal 39:376–396

  • Burnham KP, Anderson DR (2004) Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res 33:261–304

  • Chasco C, Le Gallo J, López Hernández F (2020) The spatial structure of housing prices in Madrid: evidence from spatio-temporal scan statistics. In: Glaz J, Koutras MV (eds) Handbook of scan statistics. Springer, pp 1–19

  • Cleveland WS, Devlin SJ (1988) Locally weighted regression: an approach to regression analysis by local fitting. J Am Stat Assoc 83:596–610

  • Cliff A, Ord K (1973) Spatial autocorrelation. Pion Ltd, London

  • Cressie N (1977) On some properties of the scan statistic on the circle and the line. J Appl Prob 14:272–283

  • Cressie N (1993) Statistics for spatial data. Wiley, Hoboken

  • Cucala L (2014) A distribution-free spatial scan statistic for marked point processes. Spatial Stat 10:117–125

  • Cucala L, Demattei C, Lopes P, Ribeiro A (2013) A spatial scan statistic for case event data based on connected components. Comput Stat 28:357–369

  • Cucala L, Genin M, Lanier C, Occelli F (2017) A multivariate Gaussian scan statistic for spatial data. Spatial Stat 21:66–74

  • Cucala L, Genin M, Occelli F, Soula J (2019) A multivariate nonparametric scan statistic for spatial data. Spatial Stat 29:1–14

  • Fisher RA (1959) Statistical methods and scientific inference. Hafner, New York

  • Fotheringham AS, Brunsdon C, Charlton M (2003) Geographically weighted regression: the analysis of spatially varying relationships. Wiley, Hoboken

  • Fotheringham AS, Charlton ME, Brunsdon C (1998) Geographically weighted regression: a natural evolution of the expansion method for spatial data analysis. Environ Plan A 30:1905–1927

  • Gelfand AE, Kim HJ, Sirmans C, Banerjee S (2003) Spatial modeling with spatially varying coefficient processes. J Am Stat Assoc 98:387–396

  • Genin M, Fumery M, Occelli F, Savoye G, Pariente B, Dauchet L, Giovannelli J, Vignal C, Body-Malapel M, Sarter H et al (2020) Fine-scale geographical distribution and ecological risk factors for Crohn's disease in France (2007–2014). Aliment Pharmacol Ther 51:139–148

  • Ghiringhelli C, Bartolucci F, Mira A, Arbia G (2021) Modelling nonstationary spatial lag models with hidden Markov random fields. Spatial Stat 44:100522

  • Glaz J, Naus J, Wallenstein S (2001) Scan statistics. Springer. https://books.google.fr/books?id=CHUwtWl6zOYC

  • Glaz J, Pozdnyakov V, Wallenstein S (2009) Scan statistics: methods and applications. Springer, New York

  • Huang L, Kulldorff M, Gregorio D (2007) A spatial scan statistic for survival data. Biometrics 63:109–118

  • Huang L, Tiwari RC, Zou Z, Kulldorff M, Feuer EJ (2009) Weighted normal spatial scan statistic for heterogeneous population data. J Am Stat Assoc 104:886–898

  • Jung I (2009) A generalized linear models approach to spatial scan statistics for covariate adjustment. Stat Med 28:1131–1143

  • Jung I, Cho HJ (2015) A nonparametric spatial scan statistic for continuous data. Int J Health Geograph 14:30

  • Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17:99–121

  • Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40:509–533

  • Kooijman S (1976) Some remarks on the statistical analysis of grids especially with respect to ecology. Annals of Systems Research. Springer, pp 113–132

  • Kostov P (2010) Model boosting for spatial weighting matrix selection in spatial lag models. Environ Plan B: Plan Des 37:533–549

  • Kulldorff M (1997) A spatial scan statistic. Commun Stat Theory Methods 26:1481–1496

  • Kulldorff M, Huang L, Konty K (2009) A scan statistic for continuous data based on the normal probability model. Int J Health Geograph 8:58

  • Kulldorff M, Huang L, Pickle L, Duczmal L (2006) An elliptic spatial scan statistic. Stat Med 25:3929–3943

  • Lawson A, Denison D (2002) Spatial cluster modelling. CRC Press, USA

  • Lee J, Gangnon RE, Zhu J (2017) Cluster detection of spatial regression coefficients. Stat Med 36:1118–1133

  • Lee J, Sun Y, Chang HH (2019) Spatial cluster detection of regression coefficients in a mixed-effects model. Environmetrics 31:e2578

  • Lee LF (2004) Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72:1899–1925

  • Lee LF (2007) GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. J Econ 137:489–514

  • LeSage J, Pace RK (2009) Introduction to spatial econometrics, 1st edn. Chapman and Hall/CRC. https://doi.org/10.1201/9781420064254

  • LeSage JP, Pace RK (2014) The biggest myth in spatial econometrics. Econometrics 2:217–249

  • Li F, Sang H (2019) Spatial homogeneity pursuit of regression coefficients for large datasets. J Am Stat Assoc 114(527):1050–1062. https://doi.org/10.1080/01621459.2018.1529595

  • Lin PS (2014) Generalized scan statistics for disease surveillance. Scand J Stat 41:791–808

  • Loh JM, Zhu Z (2007) Accounting for spatial correlation in the scan statistic. Ann Appl Stat 1:560–584

  • Luquero FJ, Banga CN, Remartínez D, Palma PP, Baron E, Grais RF (2011) Cholera epidemic in Guinea-Bissau (2008): the importance of place. PLoS One 6:e19005

  • Naus JI (1965) Clustering of random points in two dimensions. Biometrika 52:263–267

  • Ott J, Hoh J (2012) Scan statistics in human gene mapping. Am J Hum Genet 91:970

  • Páez A, Uchida T, Miyamoto K (2002) A general framework for estimation and inference of geographically weighted regression models: 1. location-specific kernel bandwidths and a test for locational heterogeneity. Environ Plan A 34:733–754

  • Raftery AE (1995) Bayesian model selection in social research. Sociol Methodol 25:111–164

  • Ripley B (1981) Spatial statistics. Wiley, Hoboken

  • Santi F, Arbia G, Bee M, Espa G (2017) A frequency domain test for isotropy in spatial data models. Spatial Stat 28:262–278

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Michaël Genin.

Ethics declarations

Conflict of interest/Competing interests

none

Code availability

Code is available on request.

Ethics approval

not applicable

Consent to participate

not applicable

Consent for publication

not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PNG 70 KB)

Supplementary file2 (PDF 12 KB)

Appendix 1

Explicit expressions of the parameter estimators for the Gaussian spatial scan statistic

Under \({\mathcal {H}}_0\), the MLEs of \(\alpha\) and \(\sigma ^2\) have the following explicit expressions:

$$\begin{aligned} {\widehat{\alpha }}=\frac{1}{n}\sum _{i=1}^{n}Y_i\qquad \text{ and } \qquad \widehat{\sigma ^2}=\frac{1}{n}\sum _{i=1}^{n}\left( Y_i-{\widehat{\alpha }}\right) ^{2}. \end{aligned}$$

Under \({\mathcal {H}}_1\), the MLEs of \(\alpha\), \(\sigma ^2\) and \(\delta _k\) have the following explicit expressions:

$$\begin{aligned} {\widehat{\alpha }}_{k}=\frac{1}{n-n_k}\sum _{i=1}^{n}\left( 1-\xi ^{(k)}_i\right) Y_i,\qquad {\widehat{\delta }}_{k}=\frac{1}{n-n_k}\sum _{i=1}^{n}\left( \frac{n}{n_k}\xi ^{(k)}_i-1\right) {Y}_i \end{aligned}$$

and

$$\begin{aligned} \widehat{\sigma ^2}_{k}= & {} \frac{1}{n}\sum _{i=1}^{n}\left( Y_i-{\widehat{\alpha }}_{k}-{\widehat{\delta }}_{k}\xi _{i}^{(k)}\right) ^2\\= & {} \frac{1}{n}\left\{ \sum _{i \notin C_k}\left( Y_i-\frac{1}{n-n_k}\sum _{j \notin C_k }Y_j\right) ^2 +\sum _{i\in C_k}\left( Y_i-\frac{1}{n_k}\sum _{j \in C_k}Y_j\right) ^2\right\} \end{aligned}$$

where \(n_k\) is the number of locations inside \(C_k\). The second expression of \(\widehat{\sigma ^2}_{k}\) thus coincides with the estimator of \(\sigma ^2\) under \({\mathcal {H}}_1\) given in Kulldorff et al. (2009).
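
The closed-form estimators above can be checked numerically. The following is a minimal sketch in plain Python, not the authors' code: the function name `gaussian_scan_mles`, the dictionary keys and the toy data are illustrative assumptions. It evaluates the formulas as written for a single candidate cluster \(C_k\) and computes \(\widehat{\sigma ^2}_{k}\) both ways, so that the two expressions can be compared.

```python
def gaussian_scan_mles(y, in_cluster):
    """Closed-form MLEs under H0 and H1 for one candidate cluster C_k.

    y          : outcome values Y_1..Y_n
    in_cluster : indicators xi_i^(k), truthy when location i lies in C_k
    (Hypothetical helper written for illustration only.)
    """
    y = [float(v) for v in y]
    xi = [1.0 if flag else 0.0 for flag in in_cluster]
    n = len(y)
    if len(xi) != n:
        raise ValueError("y and in_cluster must have the same length")
    n_k = int(sum(xi))
    if not 0 < n_k < n:
        raise ValueError("C_k must be a non-empty proper subset of the locations")

    # Under H0: overall mean and the 1/n (maximum likelihood) variance.
    alpha_0 = sum(y) / n
    sigma2_0 = sum((v - alpha_0) ** 2 for v in y) / n

    # Under H1: the formulas exactly as written in the appendix.
    alpha_k = sum((1 - x) * v for x, v in zip(xi, y)) / (n - n_k)
    delta_k = sum((n / n_k * x - 1) * v for x, v in zip(xi, y)) / (n - n_k)
    sigma2_k = sum((v - alpha_k - delta_k * x) ** 2 for x, v in zip(xi, y)) / n

    # Second expression: within-group sums of squares, pooled and divided by n.
    inside = [v for x, v in zip(xi, y) if x]
    outside = [v for x, v in zip(xi, y) if not x]
    mean_in = sum(inside) / n_k
    mean_out = sum(outside) / (n - n_k)
    sigma2_k_pooled = (
        sum((v - mean_out) ** 2 for v in outside)
        + sum((v - mean_in) ** 2 for v in inside)
    ) / n

    return {
        "alpha_0": alpha_0,
        "sigma2_0": sigma2_0,
        "alpha_k": alpha_k,
        "delta_k": delta_k,
        "sigma2_k": sigma2_k,
        "sigma2_k_pooled": sigma2_k_pooled,
    }


# Toy data (made up): the last two locations form the candidate cluster.
est = gaussian_scan_mles([1, 2, 3, 10, 12], [False, False, False, True, True])
print(est)
```

In this toy example the mean is 11 inside the cluster and 2 outside it, so \({\widehat{\alpha }}_{k}\) is the outside mean and \({\widehat{\delta }}_{k}\) is the difference between the two means, while both expressions of \(\widehat{\sigma ^2}_{k}\) return the same value up to floating-point error.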

Cite this article

Ahmed, MS., Cucala, L. & Genin, M. Spatial autoregressive models for scan statistic. J Spat Econometrics 2, 11 (2021). https://doi.org/10.1007/s43071-021-00017-0

  • DOI: https://doi.org/10.1007/s43071-021-00017-0

Keywords

  • Spatial autoregressive models
  • Scan statistics
  • Cluster detection

JEL classification

  • C210