Spatial heterogeneity in Bayesian disease mapping

Congdon, Peter

doi:10.1007/s10708-018-9920-1

Spatial heterogeneity in Bayesian disease mapping

Open access
Published: 04 September 2018

Volume 84, pages 1303–1316, (2019)
Cite this article

Download PDF

You have full access to this open access article

GeoJournal Aims and scope Submit manuscript

Spatial heterogeneity in Bayesian disease mapping

Download PDF

Peter Congdon ORCID: orcid.org/0000-0003-1934-9205¹

2205 Accesses
3 Citations
2 Altmetric
Explore all metrics

Abstract

Disease mapping applications generally assume homogeneous regression effects and use random intercepts to account for residual spatial dependence. However, there may be local variation in the association between disease and area risk factors. We consider implications for model fit, estimated regression coefficients, and substantive inferences of allowing spatial variability in impacts of area risk factors. An application to suicide in 6791 English small areas shows that average regression coefficients and substantive inferences (e.g. about relative risk) may be considerably affected by allowing spatially varying predictor effects, while fit is improved.

Bayesian forecasting of disease spread with little or no local data

Article Open access 19 May 2023

Modeling of Infectious Diseases: A Core Research Topic for the Next Hundred Years

ScanZID: Spatial Scan Statistics with Zero Inflation and Dispersion

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Many area disease studies consider official mortality statistics for administrative areas (e.g. US counties, English wards). Assume a region subdivided into n small areas. Let $Y_{i}$ denote observed deaths in such areas, and $E_{i}$ denote expected deaths obtained by applying national rates to small area populations. For relatively rare diseases, maximum likelihood estimates of relative risk, namely standard mortality ratios $Y_{i}/E_{i}$ are unstable (Haining 2001), with extreme ratios associated with areas with the smallest populations. By contrast, disease mapping methods, usually using Bayesian inference, seek to borrow strength across areas to produce stable risk estimates.

A common analytical framework for rare diseases over small areas adopts the Besag et al. (1991) model, whereby disease counts $Y_{i}$$(\,i =1 , \ldots ,n)$ are taken as Poisson with means $\mu _{i} =E_{i}\lambda _{i}$, where $\lambda _{i}$ denote relative disease risks. These are centred around a global relative risk of 1 when ${\sum _{i =1}^{n}}Y_{i} ={\sum _{i =1}^{n}}E_{i} .$ Let $X_{i}$ denote known predictors of disease risk, such as area deprivation. Presentations of disease mapping regression (e.g. Schrödle and Held 2011; Mollié 1996; Clayton et al. 1993) typically assume a homogenous effect of such predictors, and use varying intercepts to represent and account for spatial dependence in the outcome, and hence to borrow strength in estimation. Such models incorporate spatially dependent effects, as, for example, under conditional autoregressive schemes (e.g. Besag et al. 1991; Leroux et al. 1999).

An alternative to spatial homogeneity (also known as spatial stationarity) for predictor effects is to allow spatially varying impacts. A classical technique allowing varying coefficients is geographically weighted regression (e.g. Mou et al. 2017). Analogous models using random effects Bayesian approaches have been proposed (e.g. Assunçao 2003), but their full implications for inferences from disease mapping studies are not widely studied—though see Feng et al. (2016). However spatial dependence in residuals may be due to spatial nonstationarity (Fotheringham 2009), with implication that if allowing spatially varying regression impacts removes spatial residual dependence, then additional spatially structured random intercepts may be unnecessary.

In this case study, we consider suicide counts in 6791 middle level super output areas (MSOAs) in England. Suicides are for the 5 year period, 2011–2015, and available as gender—specific totals within each MSOA; there are 23,460 suicides overall, 17,897 male and 5563 female. We consider the impact on suicide variations of three area level predictors: socioeconomic deprivation, social fragmentation and rurality.

Analysis uses Bayesian inference via integrated nested Laplace approximation (INLA), a computationally efficient alternative to Markov Chain Monte Carlo, and implemented in the R package R-INLA (Bivand et al. 2015). We compare models with homogeneous regressor effects and varying intercepts (“global models”) with models allowing spatially varying predictor effects (“local models”). This comparison shows that fit improves under local models, while residual spatial dependence is removed. Central predictor effects are different between the models, and there is significant variability in one or more predictors under local models. Implications for relative risk estimates, and for local variations in risk within subregions, are discussed.

Subsequent sections discuss specifications of the two classes of model, the definition and epidemiological role of the predictors, and the case study implementation and findings.

Model specification

A commonly adopted formulation for disease mapping involves the convolution model (Besag et al. 1991), which leads to spatially varying intercepts. This involves, at a minimum, a spatially configured effect $s_{i}$, and an unstructured or iid term $u_{i}$, normally distributed with mean 0 and variance $\sigma _{u}^{2}$. Let $L_{i}$ denote the locality surrounding area i, meaning the set of $d_{i}$ small areas adjacent to it. Let N(m, V) denote a normal density with mean m and variance V. Then the spatial effects follow a conditional autoregressive (or CAR) scheme,

$$ s_{i} \sim N\left( S_{i} ,\sigma _{s}^{2}/d_{i}\right),$$

(1)

where $S_{i}$ is the average of the spatial effects in locality $L_{i}$. For identifiability purposes, spatial effects are centred to have mean 0. Then with $\beta _{0}$ denoting an intercept, relative risks can be modelled as a log-link regression with varying intercepts:

$$ \log (\lambda _{i}) =\beta _{0} +s_{i} +u_{i}.$$

(2)

Assume there are ecological (area-level) predictors $X_{i}$ relevant to the disease. Under a “global” or stationary model, predictors have a spatially homogeneous impact, namely

$$ \log (\lambda _{i}) =\beta _{0} +X_{i}\beta +s_{i} +u_{i},$$

(3)

with random intercepts intended to eliminate residual spatial dependence.

However, spatial dependence and spatial heterogeneity (i.e. spatial non-stationarity) are often interrelated (Anselin 2010). One possible form of spatial heterogeneity is in predictor impacts. Consider a simple linear regression $y_{i} \sim N(\mu _{i} ,\sigma ^{2})$ for an area outcome, and with one predictor (Fotheringham 2009). Suppose a global model $\mu _{i} =\beta _{0} +X_{i}\beta _{1} ,$ is assumed, but that the true model is

$$ \mu _{i} =\beta _{0} +_{}X_{i}\beta _{1i}, $$

(4)

where $\beta _{1i}$ are varying slopes. Assume the global regression coefficient is estimated as $\beta _{1}^{ *}$ .

Then for areas where $\beta _{1i} >\beta _{1}^{ *}$ , $y_{i}$ is typically underestimated under a global model, and the residual is positive. For locations where $\beta _{1i} <\beta _{1}^{ *}$ , the residual is typically negative, since $y_{i}$ is overestimated. If the $\beta _{1i}$ show spatial dependence, then residuals from the global model will also show spatial dependence.

More generally if non-stationarity in regression effects is a major source of spatial dependence in residuals, such dependence may be eliminated by a local model allowing non-stationary impacts of $X_{i}$. A model allowing for local variation in regression effects is also potentially more adaptive to local variations in risk and their association with local risk factor patterns: links between disease and risk factors may be stronger in some sub-regions.

Area socioeconomic influences on suicide

The above discussion assumes that area level predictors of disease risk are available. Many studies report that area poverty and deprivation increase suicide risks (e.g. Gunnell et al. 1995). Area deprivation is to some extent simply an aggregate measure of individual suicide risk factors such as unemployment and low income, so acting as a compositional factor (Collins et al. 2017). However, it may also partly reflect contextual risks, or place effects per se. This is supported by studies of mental illness reporting significant associations with area-level socio-economic status, beyond individual-level factors (e.g. Silver et al. 2002; Matheson et al. 2006). Here we use the UK government’s Index of Multiple Deprivation (or IMD) as a measure of area SES.

A number of studies have considered impacts on suicide outcomes of indices of social fragmentation, meaning relatively low levels of community integration linked to high numbers of non-family households, and high residential turnover. Thus Congdon (1996) proposed a social fragmentation index based on residential turnover, one person households, renting from private sector landlords (excluding social renting), and non-married adults.

Social fragmentation is to some degree a compositional measure of individual suicide risk factors such as living alone (Holt-Lunstad et al. 2015), recent residential relocation, and being unmarried. However, it may also partly reflect contextual risks, such as negative effects of high neighborhood transience on population mental health, after control for individual risk factors (Matheson et al. 2006). Here a fragmentation score is obtained from principal component analysis of the four variables of Congdon (1996), but updated to the 2011 Census. Social fragmentation so defined tends to be higher in central city areas, but also in particular types of town (coastal resorts, university towns) with relatively high population turnover.

Findings on urbanicity or rurality in relation to suicide are inconsistent, though many confounding factors could contribute to these findings. Thus Gartner et al. (2008) report that “mortality rates from suicide for males in England were 10 per cent lower in rural areas before adjustment for deprivation, but 11 per cent higher [...] after adjustment”; see also Saunderson et al. (1998). Various features of rural economic life and healthcare may affect suicide, such as sole entrepreneurship, easier access to suicide methods, and lesser access to mental health services.

In the analysis below a rurality score is based on the 2011 rural/urban classification (RUC2011) of UK small areas (Bibby and Brindley 2013). Specifically a ridit score (e.g. Ernstsen et al. 2012) is obtained from MSOA frequencies in eight ordered urban-rural categories (from Urban Major Conurbation at one extreme to Rural Village and Dispersed in a Sparse Setting at the other). Ridit scores are assigned to ordered categories following a procedure developed by Bross (1958).

Case study implementation

Suicide deaths $Y_{i}$ are available by gender, with expected deaths $E_{i}$ obtained by multiplying MSOA populations by England wide age specific suicide death rates, with ${\sum _{i =1}^{n}}Y_{i} ={\sum _{i =1}^{n}}E_{i} .$ Let $X_{i} =(X_{1i} ,X_{2i} ,X_{3i})$ denote the predictors; respectively deprivation, social fragmentation and rurality. For each of three analyses (overall, male, and females) we compare four models.

Model 1 is a global model assuming the same predictor effects across all MSOAs, together with varying intercepts. Thus for latent relative risks $\lambda _{i}$,

$$ \log (\lambda _{i}) =\beta _{0} +X_{1i}\beta _{1} +X_{2i}\beta _{2} +X_{3i }\beta _{3} +s_{i} +u_{i },$$

(5)

with spatial and iid effects, $s_{i}$ and $u_{i}$respectively, defined as above. Among non-stationary models, the simplest assumes varying slopes, without varying intercepts. This is denoted model 2, or the local model, with

$$ \log (\lambda _{i}) =\beta _{0} +X_{1i}\beta _{1i} +X_{2i}\beta _{2i} +X_{3i }\beta _{3i}, $$

(6)

where $\beta _{1i} ,\beta _{2i} ,$ and $\beta _{3i}$ each follow the autoregressive scheme (1). The data are relatively sparse (the average overall suicide count per MSOA is 3.5) so may not support complex models with additional random effects. Accordingly, we compare model 2 with two more complex models: model 2′, with varying slopes and spatial intercepts, and model 2″, containing varying predictor effects, and both spatial and iid intercepts. Thus model 2″ is

$$ \log (\lambda _{i}) =\beta _{0} +X_{1i}\beta _{1i} +X_{2i}\beta _{2i} +X_{3i }\beta _{3i} +s_{i} +u_{i}.$$

(7)

In practice, R-INLA estimates varying slopes as the sum of a fixed effect and a spatial random effect, for example $\beta _{1i} =\beta _{1} +w_{1i}$ where $w_{1i}$ is a zero mean CAR scheme. The $X_{i}$ are scaled to the interval [0, 1], so that the relative importance of predictors as risk factors can be directly assessed from the regression coefficients.

It is important to assess how far each model eliminates residual spatial dependence. With $\overline{\mu }_{i}$ denoting posterior mean of $\mu _{i}$, Poisson residuals are defined as

$$ r_{i} =(Y_{i} -\overline{\mu }_{i})/\overline{\mu }_{i}^{0.5},$$

(8)

and one may assess significant residual dependence using an index such as Moran’s I. Specifically the moran.mc procedure in R uses a Monte Carlo permutation test for Moran’s I statistic, with 1000 permutations being used. Significant correlation will show in extreme p values, namely values close to zero when positive residual correlation remains, and p values close to 1 for significant negative residual correlation. Moran I calculations use a binary adjacency spatial interaction matrix for the 6791 areas.

Model fit is assessed by using the deviance information criterion (DIC) (Spiegelhalter et al. 2002), and the Watanabe-Akaike information criterion (WAIC) (Watanabe 2010). Both measures incorporate a complexity penalty as well as simply measuring fit, and so are analogous to classical fit measures such as the Akaike information criterion (AIC). Smaller values of the DIC and WAIC suggest a model which is both better fitting and more parsimonious in terms of parameters.

Case study results

Table 1 summarises model fit across the outcomes, and levels of residual spatial dependence. The largest gains in model fit (in terms of reduced WAIC and DIC) are obtained in going from the global model 1 to the simplest local model, the varying slopes model 2. Any additional improvements in fit are slight: for overall suicides, the reduction in WAIC in going from model 2 to model 2′ is only 3, and for males the corresponding reduction is only 4. There is no gain in fit in moving from model 2′ to model 2″. For females, the DIC and WAIC are lowest for model 2, though changes in fit measures between models are small; for females, fit deteriorates slightly in moving from model 2 to more complex variants.

Table 1 Comparison of model variants

Full size table

Additionally under the global models for overall and male suicides, there is still evidence of some residual spatial dependence, in fact negative correlation, so that the p value is close to 1. For overall sucides, the more complex models 2′ and 2″ also show residual spatial dependence.

So on grounds of both fit and eliminating residual dependence, the simplest local model, model 2, is the best choice for overall and female suicides. For conceptual simplicity and comparability, Tables 2, 3 and 4 compare outcome specific regression coefficients under the varying intercepts model 1 and varying slopes model 2.

Table 2 Parameter estimates, overall suicides, England MSOAs, 2011–2015

Full size table

Table 3 Parameter estimates, male suicides, England MSOAs, 2011–2015

Full size table

Table 4 Parameter estimates, female suicides, England MSOAs, 2011–2015

Full size table

It can be seen that deprivation is the leading risk factor for overall suicide, and for male suicides (which account for 76% of overall suicides). For males, the relative suicide risk under model 2 is 2.58, when comparing the most and least deprived communities and holding other influences constant. By contrast, for females, social fragmentation is the leading suicide risk factor, with a relative risk of 3.26 when comparing the most and least fragmented communities. Rurality is a weaker but still significantly positive risk factor for all outcomes. Regarding central predictor effects under model 2, as compared to homogeneous effects under model 1, impacts of deprivation are reduced for all outcomes, but especially female suicide.

For male suicides, social fragmentation effects show the most variability under model 2 (as shown by the estimated random effect standard deviations on the right side of Table 3). However, for female suicides, the impact of deprivation is the most variable. For overall suicides, both fragmentation and deprivation show variable impacts, but fragmentation shows greater variability. Thus Figs. 1 and 2 represent the varying effects of fragmentation and deprivation on overall suicide.

While there is significant variability in SFI impacts on two suicide outcomes, this is overwhelmingly a positive risk factor on suicide in the sense that higher SFI is associated with higher suicide risk. For local SFI impacts on overall suicides, in 4750 (of 6791) areas there is a 90% chance or more of a positive effect, but no areas with a 90% chance or more that the SFI effect is negative. Regarding local SFI impacts on female suicide, in all 6791 MSOAs there is an over 90% chance of a positive effect. For male suicides, the SFI effect is significantly positive in 3080 areas, though for a few areas (under 10) the effect is significantly negative.

Comparison of predictions

Detecting markedly elevated or reduced risk is a primary goal in disease mapping, and often a criterion for model effectiveness.

As one approach to comparing predictions and detecting extreme risk, we focus on overall suicides, and compare relative risk estimates between global and local models for those MSOAs where predictions contrast most. The ten MSOAs where the local model 2 predicts distinctly higher relative risks are listed first (see Table 5). We include SMRs as a risk indicator, but because of their drawbacks as relative risk estimates, supplement them by probability estimates that relative risks exceed 1. These are based on Poisson random simulations, using only the observed $Y_{i}$ and $E_{i}$ (see “Appendix 1”). These estimates take account of varying MSOA populations, and higher $E_{i}$ in some areas, whereas an SMR of, say, 2, does not distinguish between the scenarios $(Y =10 ,E =5)$ and $(Y =4 ,E =2)$.

It can be seen that risk tends to be underpredicted under the global model 1 for the first ten areas in Table 5. In these areas, high suicide risks are associated with high scores on deprivation or fragmentation, and with above average IMD and SFI coefficients (compared to the coefficient averages in Table 2). Such high coefficients are acting adaptively to explain locally high suicide risk under model 2.

Table 5 Comparison of relative risk estimates

Full size table

By contrast, the last ten rows of Table 5 are for ten MSOAs where the global model 1 produces distinctly higher relative risk estimates as compared to model 2. These tend to be MSOAs where actual deaths $Y_{i}$ are less than expected counts $E_{i}$, and there is a low data-based probability that relative risk exceeds 1. However, for all these MSOAs, the global model produces relative risk estimates in excess of 1.

A more generic impression of how predictions compare between models is based on forming groups of MSOAs with distinct risk levels, as assessed from probability estimates of excess relative risk (“Appendix 1”) and the population size of MSOAs (as reflected in expected suicides). The excess risk probabilities more clearly identify elevated than depressed risk. Considering overall suicides, there are 148 MSOAs with excess risk probabilities over 0.99, and 449 with probabilities over 0.95. By contrast, there are only 36 areas with probabilities under 0.01, while 199 are under 0.025, and 420 under 0.05.

Therefore we define area categories according to excess risk probabilities over 0.99, over 0.95, under 0.05 and under 0.025, and expected suicides (over 5, 3–5, and under 3). Thus see the upper section of Table 6, with the last column showing the number of MSOAs in the category, and also including intermediate risk areas. Table 6 shows the average relative risks under models 1, 2, 2′, and 2″ (columns 2–5) as well as the standard mortality ratio in that category (total deaths divided by total expected).

Table 6 Comparing predictions between models

Full size table

Thus for areas with elevated risk, the varying slopes model 2 identifies such risk better than the varying intercepts model 1. This advantage is most pronounced for larger areas (E > 5) where the simulation probabilities exceed 0.99. Model 2 also has an advantage over model 1 for low risk areas with larger populations (E > 5). Additionally, throughout all comparisons, there is no advantage in predicting risk for models 2′ and 2″ against the less complex model 2. For intermediate risk areas, the models make similar predictions.

These themes continue in the lower part of Table 6 which compares predictions between the nine English regions. Thus different models tend to be broadly similar in their predictive success within some regions as against others. For example, all models are relatively successful in predicting high risk in the North West, South West and North East regions. By contrast, the high risk in 20 London MSOAs is not well predicted by any model, suggesting that an expanded model might be needed to capture distinct regional effects. However, the advantage of model 2 over model 1 in predicting higher risk still pertains for most regions, most markedly for the East of England.

To illustrate how varying intercepts and varying slopes model compare for data following a known model form, we also conduct a simulation for one English region, the North West. Details are set out in Online Resource O1, but replicate the above discussed features: better fit and better prediction for higher risk areas under the varying slopes model 2.

Suicide variation within local authorities

The 6791 MSOAs are nested within larger administrative units, namely 326 local authorities. To illustrate locally varying SFI impacts within a subregion, consider Tendring local authority in NE Essex, which includes coastal resort areas. As mentioned above, coastal resort towns may have high residential turnover, and high levels of private renting, leading to high fragmentation scores. They can also be relatively deprived. High suicide risks in such towns are exemplified by authority-wide SMRs (for overall suicides) of 1.57 (Tendring), 1.51 (Blackpool), 1.53 (Hastings), 1.48 (Scarborough), 1.43 (Great Yarmouth), 1.37 (Brighton and Hove) and 1.32 (Eastbourne). These authorities are all within the 25 local authorities with the highest suicide SMRs.

Within Tendring, there are wide contrasts in suicide risks, with observed counts $Y_{i}$ considerably exceeding expected counts $E_{i}$ in some MSOAs, with the reverse true in other areas (see Table 7). In six areas there is an over 90% chance that relative risk exceeds 1, based simply on the data.

Table 7 Suicide risk estimates, IMD and SFI Scores, and local model regression coefficients

Full size table

As to risk-predictor associations, there is a positive correlation (0.90) between fragmentation and suicide SMRs within Tendring, and a positive correlation (0.86) between deprivation and suicide also. So both deprivation and fragmentation coefficients are above average, as the two risk factors have a clear role in explaining suicide contrasts within the subregion. Figure 3 shows the varying SFI coefficients in this local authority. The Figure highlights three adjacent coastal MSOAs (Tendring 012, 014 and 016) with SFI coefficients in the highest category and also high suicide risks.

Another example is a relatively deprived, post-industrial, local authority in northern England, namely Middlesbrough, with an authority-wide suicide SMR of 1.58. Here deprivation coefficients are above average (penultimate column, Table 8). These coefficients reflect associations between suicide and risk factors: high risk MSOAs tend to be highly deprived, while low risk MSOAs have low deprivation levels. The same applies to social fragmentation, where coefficients are above average, and high (low) risk tends to be associated with high (low) fragmentation. Areas with elevated risk, and high levels of both deprivation and fragmentation, are exemplified by Middlesbrough 001 and 003. Areas with low risk, and low levels of both deprivation and fragmentation, are exemplified by Middlesbrough 012 and 017.

Table 8 Suicide risk estimates, IMD and SFI Scores, and local model regression coefficients

Full size table

Conclusions

The primary intention of the preceding analysis has been to evaluate models for a relatively rare mortality outcome, allowing for spatially varying predictor effects. This approach is compared to a varying intercepts model within the broader framework of Bayesian disease mapping. Disease mapping applications tend to assume homogeneous effects of area risk factors on disease, and use random intercepts to account for residual spatial dependence, as in suicide studies (Qi et al. 2014; Yoon et al. 2015). Area suicide studies may also omit area risk factors altogether, and use a varying intercepts model (Cheung et al. 2012).

However, as demonstrated in the current application, additional substantive perspectives may be gained through exploring local variability in risk factor effects, and in some circumstances, there may be little gain in fit through using varying intercepts in combination with varying predictor effects. The latter applies for suicide across England small areas, where models with varying slopes only provide comparable fit to more complex models. However, this finding may be specific to this particularly outcome, and the analysis does not establish a generic tendency for varying slopes to dispense with the need for random intercepts.

Varying regression effects are here implemented using a Bayesian random effects approach, and analysis of locally varying regression is facilitated by software such as R-INLA. This avoids the computational burden involved in Monte Carlo Markov Chain analysis.

Geographically weighted regression (GWR) has the same focus on spatial heterogeneity, but does not use a random effects approach, but instead a series of separate weighted regressions. GWR has been widely used, with health applications included. Since readers of the journal may well be more familiar with this approach, a supplementary analysis using GWR (applied to overall suicides) is reported on in Online Resource O2. This shows some findings common with the borrowing strength Bayesian approach, but not a strong correlation between area regression coefficients.

Such contrasts are likely given the rather different methods used in the two approaches (Waller et al. 2007), and may be more apparent for a rare disease outcome. Borrowing strength operates via the assumed prior distribution of random effect across all areas, and this is especially important for rarer outcomes. The borrowing strength approach has, however, primarily been used in modelling intercept variation. Studies of spatial heterogeneity in disease mapping, and in particular the potential for spatially varying coefficients on risk factors, are far fewer, and the present paper is intended to demonstrate how this form of analysis may be approached and its potential benefits.

References

Anselin, L. (2010). Thirty years of spatial econometrics. Papers in Regional Science, 89(1), 3–25.
Article Google Scholar
Assunçao, R. (2003). Space varying coefficient models for small area data. Environmetrics, 5, 453–473.
Article Google Scholar
Besag, J., York, J., & Mollie, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43, 1–21.
Article Google Scholar
Bibby, P., & Brindley, P. (2013). The 2011 rural-urban classification for small area geographies: A user guide and frequently asked questions (v1. 0), Government Statistical Service, UK
Bivand, R., Gómez-Rubio, V., & Rue, H. (2015). Spatial data analysis with R-INLA with some extensions. Journal of Statistical Software, 63(20), 1–31.
Article Google Scholar
Bross, I. (1958). How to use ridit analysis. Biometrics, 14(1), 18–38.
Article Google Scholar
Cheung, Y., Spittal, M., Pirkis, J., & Yip, P. (2012). Spatial analysis of suicide mortality in Australia: Investigation of metropolitan-rural-remote differentials of suicide risk across states/territories. Social Science & Medicine, 75(8), 1460–1468.
Article Google Scholar
Clayton, D., Bernardinelli, L., & Montomoli, C. (1993). Spatial correlation in ecological analysis. International Journal of Epidemiology, 22(6), 1193–1202.
Article Google Scholar
Collins, J., Ward, B., Snow, P., Kippen, S., & Judd, F. (2017). Compositional, contextual, and collective community factors in mental health and well-being in Australian rural communities. Qualitative Health Research, 27(5), 677–687.
Article Google Scholar
Congdon, P. (1996). The incidence of suicide and parasuicide: a small area study. Urban Studies, 33, 137–158.
Article Google Scholar
Ernstsen, L., Strand, B., Nilsen, S. M., Espnes, G., & Krokstad, S. (2012). Trends in absolute and relative educational inequalities in four modifiable ischaemic heart disease risk factors. BMC Public Health, 12(1), 266.
Article Google Scholar
Feng, W., Lim, C., Maiti, T., & Zhang, Z. (2016). Spatial regression and estimation of disease risks: A clustering based approach. Statistical Analysis and Data Mining: The ASA Data Science Journal, 9(6), 417–434.
Article Google Scholar
Fotheringham, A. (2009). “The problem of spatial autocorrelation” and local spatial statistics. Geographical Analysis, 41(4), 398–403.
Article Google Scholar
Gartner, A., Farewell, D., Dunstan, F., & Gordon, E. (2008). Differences in mortality between rural and urban areas in England and Wales, 2002–2004. Health Statistics Quarterly, 39, 6.
Google Scholar
Gunnell, D., Peters, T., Kammerling, R., & Brooks, J. (1995). Relation between parasuicide, suicide, psychiatric admissions, and socioeconomic deprivation. British Medical Journal, 311, 226–230.
Article Google Scholar
Haining, R. (2001). Spatial autocorrelation. In International encyclopedia of the social and behavioral sciences, Elsevier, pp. 14763–14768
Holt-Lunstad, J., Smith, T. B., Baker, M., Harris, T., & Stephenson, D. (2015). Loneliness and social isolation as risk factors for mortality: A meta-analytic review. Perspectives on Psychological Science, 10(2), 227–237.
Article Google Scholar
Leroux, B., Lei, X., & Breslow, N. (1999). Estimation of disease rates in small areas: A new mixed model for spatial dependence. In M. Halloran & D. Berry (Eds.), Statistical models in epidemiology, the environment and clinical trials (pp. 135–178). New York: Springer.
Google Scholar
Marshall, E., & Spiegelhalter, D. (2007). Identifying outliers in Bayesian hierarchical models: A simulation-based approach. Bayesian Analysis, 2(2), 409–444.
Article Google Scholar
Matheson, F., Moineddin, R., Dunn, J., Creatore, M., Gozdyra, P., & Glazier, R. (2006). Urban neighborhoods, chronic stress, gender and depression. Social Science & Medicine, 63(10), 2604–2616.
Article Google Scholar
Mollié, A. (1996). Bayesian mapping of disease. In W. Gilks, S. Richardson, & D. Spiegelhalter (Eds.), Markov Chain Monte Carlo in practice. London: Chapman and Hall.
Google Scholar
Mou, Y., He, Q., & Zhou, B. (2017). Detecting the spatially non-stationary relationships between housing price and its determinants in China: Guide for housing market sustainability. Sustainability, 9(10), 1826.
Article Google Scholar
Qi, X., Hu, W., Mengersen, K., & Tong, S. (2014). Socio-environmental drivers and suicide in Australia: Bayesian spatial analysis. BMC Public Health, 14(1), 681.
Article Google Scholar
Saunderson, T., Haynes, R., & Langford, I. (1998). Urban–rural variations in suicides and undetermined deaths in England and Wales. Journal of Public Health, 20(3), 261–267.
Article Google Scholar
Schrödle, B., & Held, L. (2011). A primer on disease mapping and ecological regression using INLA. Computational Statistics, 26(2), 241–258.
Article Google Scholar
Silver, E., Mulvey, E., & Swanson, J. (2002). Neighborhood structural characteristics and mental disorder: Faris and Dunham revisited. Social Science & Medicine, 55(8), 1457–1470.
Article Google Scholar
Spiegelhalter, D., Best, N., Carlin, B., & van der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B, 64(4), 583–639.
Article Google Scholar
Waller, L., Zhu, L., Gotway, C., & Gruenewald, P. (2007). Quantifying geographic variations in associations between alcohol distribution and violence: A comparison of geographically weighted regression and spatially varying coefficient models. Stochastic Environmental Research and Risk Assessment, 21(5), 573–588.
Article Google Scholar
Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research, 11(Dec), 3571–3594.
Google Scholar
Yoon, T., Noh, M., Han, J., Jung-Choi, K., & Khang, Y. (2015). Deprivation and suicide mortality across 424 neighborhoods in Seoul, South Korea: A Bayesian spatial analysis. International Journal of Public Health, 60(8), 969–976.
Article Google Scholar

Download references

Author information

Authors and Affiliations

School of Geography, Queen Mary University of London, Mile End Rd, London, E1 4NS, UK
Peter Congdon

Authors

Peter Congdon
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Peter Congdon.

Ethics declarations

Conflicts of interest

The author declares no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 87 KB)

Appendix 1

To indicate disease risk using simply the observed data, one may simulate death counts $\widetilde{Y}_{it}$ (for simulations $t =1 , \ldots ,T$) based on the expected deaths $E_{i}$, and compare these with actual deaths $Y_{i} .$ If for most simulations, one has $Y_{i} \ge \widetilde{Y}_{it}$ then this indicates high relative risk in area i. The relative risk interpretation holds when ${\sum _{i =1}^{n}}Y_{i} ={\sum _{i =1}^{n}}E_{i}$.

Let $I(A) =1$ or 0 according as condition A holds. Following Marshall and Spiegelhalter (2007), and allowing for equality of simulated and actual counts, we find probability estimates that relative risks exceed 1, based simply on the data, as $R_{i}/T$ where

$$ R_{i} =\underset{t =1}{\sum \nolimits ^{T}} I(Y_{i} >\widetilde{Y}_{it}) +0.5\underset{t =1}{\sum \nolimits ^{T}}I(Y_{i} =\widetilde{Y}_{it}) $$

The relevant R code for $T =1000$ is

Ysim = exc = matrix(,1000,6791)
for (t in 1:1000) {Ysim[t,] = rpois(6791,E)
exc[t,] = ifelse(Y > Ysim[t,],1,0) + ifelse(Y == Ysim[t,],0.5,0)}
pr.exc = apply(exc,2,mean)

where pr.exc are the probability estimates.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article

Congdon, P. Spatial heterogeneity in Bayesian disease mapping. GeoJournal 84, 1303–1316 (2019). https://doi.org/10.1007/s10708-018-9920-1

Download citation

Published: 04 September 2018
Issue Date: October 2019
DOI: https://doi.org/10.1007/s10708-018-9920-1

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Spatial heterogeneity in Bayesian disease mapping

Abstract

Similar content being viewed by others

Bayesian forecasting of disease spread with little or no local data

Modeling of Infectious Diseases: A Core Research Topic for the Next Hundred Years

ScanZID: Spatial Scan Statistics with Zero Inflation and Dispersion

Introduction

Model specification

Area socioeconomic influences on suicide

Case study implementation

Case study results

Comparison of predictions

Suicide variation within local authorities

Conclusions

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflicts of interest

Electronic supplementary material

Supplementary material 1 (pdf 87 KB)

Appendix 1

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Spatial heterogeneity in Bayesian disease mapping

Abstract

Similar content being viewed by others

Bayesian forecasting of disease spread with little or no local data

Modeling of Infectious Diseases: A Core Research Topic for the Next Hundred Years

ScanZID: Spatial Scan Statistics with Zero Inflation and Dispersion

Introduction

Model specification

Area socioeconomic influences on suicide

Case study implementation

Case study results

Comparison of predictions

Suicide variation within local authorities

Conclusions

References

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflicts of interest

Electronic supplementary material

Supplementary material 1 (pdf 87 KB)

Appendix 1

Appendix 1

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation