MSPOCK: Alleviating Spatial Confounding in Multivariate Disease Mapping Models


Exploring spatial patterns in the context of disease mapping is a decisive approach to bring evidence of geographical tendencies in assessing disease status and progression. In most cases, multiple count responses (corresponding to disease incidences of multiple types, such as cancer in men and women) are recorded at each spatial location, which may exhibit similar spatial patterns in addition to disease-specific patterns. These are typically modeled using multivariate shared component models, where the spatial (random) effects may be shared between the disease types to model their association. However, this framework is not immune to spatial confounding, where the latent correlation between the spatial random effects and the fixed effects often leads to misleading interpretation. A recent approach to attenuate spatial confounding is the “SPatial Orthogonal Centroid ‘K’orrection”, aka SPOCK, which displaces the geographical centroids, ensuring orthogonality of the spatial random effects and the fixed effects. In this paper, we introduce MSPOCK, or Multiple SPOCK, to tackle spatial confounding for the multiple counts scenario. The methodology is evaluated on synthetic data, and illustrated via an application to new cases of respiratory system cancer for men and women for the US state of California in 2016. Our studies show that the MSPOCK correction leads to a reduction of the posterior variance estimates of model parameters, while maintaining the interpretation of the model.


  • Banerjee S, Carlin BP, Gelfand AE (2014) Hierarchical modeling and analysis for spatial data, 2nd ed. CRC Press

  • Besag J (1974) Spatial interaction and the statistical analysis of lattice systems. J R Stat Soc: Ser B (Stat Methodol) 36:192–225

  • Besag J, York J, Mollié A (1991) Bayesian image restoration, with two applications in spatial statistics. Ann Inst Stat Math 43:1–20

  • Boloker G, Wang C, Zhang J (2018) Updated statistics of lung and bronchus cancer in united states (2018). J Thorac Dis 10:1158

  • Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25

  • Chew LP (1989) Constrained Delaunay triangulations. Algorithmica 4:97–108

  • CHRR (2019) University of Wisconsin Population Health Institute. County Health Rankings and Roadmaps 2019. Accessed 16 Jan 2020

  • Clayton DG, Bernardinelli L, Montomoli C (1993) Spatial correlation in ecological analysis. Int J Epidemiol 22:1193–1202

  • Dabney AR, Wakefield JC (2005) Issues in the mapping of two diseases. Stat Methods Med Res 14:83–112

  • Datta A, Banerjee S, Hodges JS, Gao L et al (2019) Spatial disease mapping using directed acyclic graph auto-regressive (DAGAR) models. Bayes Anal 14:1221–1244

  • de Valpine P, Paciorek C, Turek D, Michaud N, Anderson-Bergman C, Obermeyer F, Wehrhahn Cortes C, Rodrìguez A, Temple Lang D, Paganin S (2020) NIMBLE User Manual. R package manual version

  • Fu JB, Kau TY, Severson RK, Kalemkerian GP (2005) Lung cancer in women: analysis of the national surveillance, epidemiology, and end results database. Chest 127:768–777

  • Gelfand AE, Vounatsou P (2003) Proper multivariate conditional autoregressive models for spatial data analysis. Biostatistics 4:11–15

  • Gelman A, Hwang J, Vehtari A (2014) Understanding predictive information criteria for Bayesian models. Stat Comput 24:997–1016

  • Gelman A, Rubin DB et al (1992) Inference from iterative simulation using multiple sequences. Stat Sci 7:457–472

  • Gómez-Rubio V, Palmí-Perales F (2019) Multivariate posterior inference for spatial models with the integrated nested laplace approximation. J Roy Stat Soc: Ser C (Appl Stat) 68:199–215

  • Gómez-Rubio V, Rue H (2018) Markov chain monte carlo with the integrated nested laplace approximation. Stat Comput 28:1033–1051

  • Guan Y, Haran M (2018) A computationally efficient projection-based approach for spatial generalized linear mixed models. J Comput Gr Stat 27:701–714

  • Guan Y, Haran M (2019) Fast expectation-maximization algorithms for spatial generalized linear mixed models. arXiv:1909.05440

  • Hanks EM, Schliep EM, Hooten MB, Hoeting JA (2015) Restricted spatial regression in practice: geostatistical models, confounding, and robustness under model misspecification. Environmetrics 26:243–254

  • Hatami R (2018) A practical method to control spatiotemporal confounding in environmental impact studies. MethodsX 5:710–716

  • Hefley TJ, Hooten MB, Hanks EM, Russell RE, Walsh DP (2017) The bayesian group lasso for confounded spatial data. J Agric Biol Environ Stat 22:42–59

  • Hodges JS, Reich BJ (2010) Adding spatially-correlated errors can mess up the fixed effect you love. Am Stat 64:325–334

  • Hughes J, Haran M (2013) Dimension reduction and alleviation of confounding for spatial generalized linear mixed models. J R Stat Soc: Ser B (Stat Methodol) 75:139–159

  • Jiao J, Han Y (2020) Bias Correction With Jackknife, Bootstrap, and Taylor Series. IEEE Trans Inf Theory 66:4392–4418

  • Kim H, Sun D, Tsutakawa RK (2001) A bivariate Bayes method for improving the estimates of mortality rates with a twofold conditional autoregressive model. J Am Stat Assoc 96:1506–1521

  • Knorr-Held L, Best NG (2001) A shared component model for detecting joint and selective clustering of two diseases. J R Stat Soc: Ser A (Stat Soc) 164:73–85

  • Knorr-Held L, Natário I, Fenton SE, Rue H, Becker N (2005) Towards joint disease mapping. Stat Methods Med Res 14:61–82

  • Knorr-Held L, Raßer G (2000) Bayesian detection of clusters and discontinuities in disease maps. Biometrics 56:13–21

  • Lawson AB (2019) Bayesian disease mapping: hierarchical modeling in spatial epidemiology, 3rd ed. Chapman and Hall/CRC

  • Leroux BG, Lei X, Breslow N (1999) Estimation of disease rates in small areas: a new mixed model for spatial dependence. In Statistical models in epidemiology, the environment, and clinical trials. Springer, pp 179–191

  • Lindgren F, Rue H et al (2015) Bayesian spatial modelling with R-INLA. J Stat Softw 63:1–25

  • Moran PA (1950) A test for the serial independence of residuals. Biometrika 37:178–181

  • Park J, Haran M (2020) Reduced-dimensional monte carlo maximum likelihood for latent gaussian random field models. J Comput Gr Stat 1–15

  • Prates MO, Assunção RM, Rodrigues EC et al (2019) Alleviating spatial confounding for areal data problems by displacing the geographical centroids. Bayes Anal 14:623–647

  • Reich BJ, Hodges JS, Zadnik V (2006) Effects of residual smoothing on the posterior of the fixed effects in disease-mapping models. Biometrics 62:1197–1206

  • Rodrigues EC, Assunção R (2012) Bayesian spatial models with a mixture neighborhood structure. J Multivar Anal 109:88–102

  • Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman and Hall/CRC

  • Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J R Stat Soc: Ser B (Stat Methodol) 71:319–392

  • SEER (2019) National Cancer Institute. Surveillance, Epidemiology, and End Results (SEER) Program. SEER*Stat Database: Incidence - SEER 9 Regs Research Data, Nov 2018 Sub (1975–2016). Released April 2019, based on the November 2018 submission

  • Siegel RL, Miller KD, Jemal A (2019) Cancer statistics, 2019. CA Cancer J Clin 69:7–34

  • Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc: Ser B (Stat Methodol) 64:583–639

  • Thaden H, Kneib T (2018) Structural equation models for dealing with spatial confounding. Am Stat 72:239–252

  • Vargas FR (2013) Bayesian estimates of the lethality rate of acute myocardial infarction. PhD thesis, Universidade Federal de Minas Gerais (UFMG)

  • Watanabe S (2010) Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res 11:3571–3594

  • WHO (2004) Gender in lung cancer and smoking research.


The authors thank the anonymous Associate Editor and two reviewers, whose constructive comments led to an improved presentation. Prates acknowledges partial funding support from CNPq Grants 436948/2018-4 and 307547/2018-4, and FAPEMIG grant PPM-00532-16. Bandyopadhyay acknowledges partial support from Grants R01DE024984 and P30CA016059 from the United States National Institutes of Health.

Corresponding author

Correspondence to Marcos O. Prates.


A: Integrated Nested Laplace Approximation—INLA

Integrated nested Laplace approximation (INLA, Rue et al. 2009) is a methodology for approximate Bayesian inference that covers a wide variety of models. A model can be fitted with INLA if, for a response vector \({\varvec{Y}}\), the mean \({\varvec{\mu }}\) can be related to an additive predictor through a link function \(g(\cdot )\):

$$\begin{aligned} g(\mu _i) = \eta _i = \beta _0 + \sum _{j = 1}^{n_{\xi }}\xi ^{(j)}(z_{ji}) + \sum _{k = 1}^{n_{\beta }}\beta _kX_{ki} + \epsilon _i, \end{aligned}$$

where \(\xi ^{(j)}(z_{ji})\) are unknown functions of the covariates \(z_{ji}\), \(\beta _0\) is an intercept, \(\beta _k\) are the coefficients of the fixed effects \(X_{ki}\), and \(\epsilon _i\) are unstructured terms. INLA assigns Gaussian priors to the vector \({\varvec{u}} = \{\beta _0, {\varvec{\xi }}, {\varvec{\beta }}, {\varvec{\epsilon }}\}\), giving rise to a Gaussian Markov random field (GMRF, Rue and Held 2005). Whenever the latent structure of a model can be written as a GMRF, the INLA methodology applies. Most common models in the GLMM family can be fitted in this framework.
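To make the additive predictor concrete, the following minimal Python sketch evaluates \(\eta _i\) and the implied mean for a single observation. The function names, the choice of a log link, and all numerical values are illustrative assumptions; they are not part of INLA or of the analysis in this paper.

```python
import math

def linear_predictor(beta0, xi_terms, betas, x_row, eps):
    """eta_i = beta0 + sum_j xi^(j)(z_ji) + sum_k beta_k * X_ki + eps_i.

    xi_terms holds the already-evaluated smooth terms xi^(j)(z_ji).
    """
    smooth = sum(xi_terms)
    fixed = sum(b * x for b, x in zip(betas, x_row))
    return beta0 + smooth + fixed + eps

def poisson_mean(eta):
    """Inverse of the log link g(mu) = log(mu) (assumed here for count data)."""
    return math.exp(eta)

# Hypothetical values for one area i with two smooth terms and two covariates.
eta = linear_predictor(beta0=0.5, xi_terms=[0.2, -0.1],
                       betas=[1.0, -0.5], x_row=[0.4, 0.8], eps=0.0)
mu = poisson_mean(eta)
print(eta, mu)
```

In a disease mapping setting, the smooth terms would typically be spatial random effects, which is why the latent vector has a much higher dimension than the set of hyperparameters.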

The vector \({\varvec{u}} = \{\beta _0, {\varvec{\xi }}, {\varvec{\beta }}, {\varvec{\epsilon }} \}\) may depend on hyperparameters \({\varvec{\theta }}\), such as variances and correlation parameters, and in general \(\text {dim}({\varvec{u}}) \gg \text {dim}({\varvec{\theta }}) = n_{\theta }\). A prior distribution is therefore required for the pair \(\{{\varvec{u}}, {\varvec{\theta }}\}\). INLA factorizes it as \(\pi ({\varvec{u}}, {\varvec{\theta }}) = \pi ({\varvec{u}}|{\varvec{\theta }})\pi ({\varvec{\theta }})\), where \(\pi ({\varvec{u}}|{\varvec{\theta }})\) is a GMRF and \(\pi ({\varvec{\theta }})\) may be decomposed as \(\prod _{j = 1}^{n_{\theta }}\pi (\theta _j)\). The marginal posterior distributions of the parameters are given by:

$$\begin{aligned} \pi (u_j|{\varvec{y}})&= \int \pi (u_j, {\varvec{\theta }}|{\varvec{y}})\,d{\varvec{\theta }} = \int \pi (u_j|{\varvec{\theta }}, {\varvec{y}})\pi ({\varvec{\theta }}|{\varvec{y}})\,d{\varvec{\theta }},\\ \pi (\theta _k|{\varvec{y}})&= \int \pi ({\varvec{\theta }}|{\varvec{y}})\, d{\varvec{\theta }}_{-k}. \end{aligned}$$

In the absence of an analytical solution to these integrals, numerical approximations are needed to obtain \({\tilde{\pi }}(u_j|{\varvec{y}})\) and \({\tilde{\pi }}(\theta _k|{\varvec{y}})\), where \({\tilde{\pi }}(\cdot )\) denotes an approximation to \(\pi (\cdot )\).

Marginal Distribution for \(\theta _k\)

We can rewrite \(\displaystyle \pi ({\varvec{\theta }}|{\varvec{y}}) = \frac{\pi ({\varvec{u}}, {\varvec{\theta }}|{\varvec{y}})}{\pi ({\varvec{u}}| {\varvec{\theta }}, {\varvec{y}})}\). To approximate this quantity, Rue et al. (2009) suggest a Gaussian approximation for the denominator as:

$$\begin{aligned} {\tilde{\pi }}({\varvec{\theta }}|{\varvec{y}}) \propto \frac{\pi ({\varvec{u}}, {\varvec{\theta }}, {\varvec{y}})}{\pi _G({\varvec{u}}| {\varvec{\theta }}, {\varvec{y}})}\Bigg |_{u = u^{*}({\varvec{\theta }})}, \end{aligned}$$
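The ratio above can be sketched in a few lines of Python for a deliberately small toy model, which is an assumption made only for illustration: counts \(y_i \sim \text{Poisson}(e^{u})\) with a single scalar latent effect \(u|\theta \sim N(0, 1/\theta )\) and a Gamma(1, 1) prior on the precision \(\theta \). The function names are hypothetical, and the mode is found by Newton iterations rather than by the sparse-matrix routines that an actual INLA implementation relies on.

```python
import math

def log_joint(u, theta, y, a=1.0, b=1.0):
    """log pi(u, theta, y) up to an additive constant for the toy model
    y_i | u ~ Poisson(exp(u)), u | theta ~ N(0, 1/theta), theta ~ Gamma(a, b)."""
    n, s = len(y), sum(y)
    log_lik = s * u - n * math.exp(u)              # log(y_i!) terms dropped
    log_prior_u = 0.5 * math.log(theta) - 0.5 * theta * u * u
    log_prior_theta = (a - 1.0) * math.log(theta) - b * theta
    return log_lik + log_prior_u + log_prior_theta

def conditional_mode(theta, y, iters=50):
    """Mode u*(theta) of pi(u | theta, y), found by Newton's method."""
    n, s = len(y), sum(y)
    u = math.log((s + 0.5) / n)                    # crude starting value
    for _ in range(iters):
        grad = s - n * math.exp(u) - theta * u
        hess = -n * math.exp(u) - theta
        u -= grad / hess
    return u

def log_posterior_theta(theta, y):
    """Unnormalised log of the Laplace-type approximation to pi(theta | y)."""
    u_star = conditional_mode(theta, y)
    neg_hess = len(y) * math.exp(u_star) + theta   # precision of pi_G at the mode
    log_gauss_at_mode = 0.5 * math.log(neg_hess / (2.0 * math.pi))
    return log_joint(u_star, theta, y) - log_gauss_at_mode

y = [3, 5, 4, 6, 2]                                # hypothetical counts
for theta in (0.5, 1.0, 2.0, 4.0):
    print(theta, round(log_posterior_theta(theta, y), 3))
```

Because the Gaussian approximation is evaluated at its own mode, its density there depends only on the curvature, so the denominator reduces to a single square-root term.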

where \(\pi _G(\cdot )\) is the Gaussian approximation of a density, and \(u^{*}({\varvec{\theta }})\) is the mode of \(\pi ({\varvec{u}}| {\varvec{\theta }}, {\varvec{y}})\) at a given \({\varvec{\theta }}\). The marginal distribution \({\tilde{\pi }}(\theta _k|{\varvec{y}})\) is then obtained by numerical integration. For each value of \(\theta _k\), the approximation \({\tilde{\pi }}({\varvec{\theta }}|{\varvec{y}})\) is summed over a grid of \(H\) points for the remaining hyperparameters \({\varvec{\theta }}_{-k}\), with integration weights \(\Delta _{kh}\):

$$\begin{aligned} {\tilde{\pi }}(\theta _k|{\varvec{y}}) = \sum _{h=1}^H {\tilde{\pi }}({\varvec{\theta }}|{\varvec{y}})\Delta _{kh}. \end{aligned}$$
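As a rough illustration of this grid sum, the Python sketch below marginalises a two-dimensional unnormalised log posterior over its second argument. The helper `marginal_on_grid`, the equally spaced grids, and the toy Gaussian log posterior are all assumptions chosen so that the result can be checked by eye; they are not taken from the paper.

```python
import math

def marginal_on_grid(log_post, grid_k, grid_rest):
    """Approximate pi(theta_k | y) on grid_k by summing an unnormalised
    log posterior over grid_rest (both grids equally spaced)."""
    step_k = grid_k[1] - grid_k[0]
    step_rest = grid_rest[1] - grid_rest[0]        # plays the role of Delta_kh
    log_vals = [[log_post(tk, tr) for tr in grid_rest] for tk in grid_k]
    shift = max(max(row) for row in log_vals)      # avoids overflow in exp
    unnorm = [sum(math.exp(v - shift) for v in row) * step_rest
              for row in log_vals]
    norm_const = sum(unnorm) * step_k              # renormalise on grid_k
    return [w / norm_const for w in unnorm]

def toy_log_post(theta_k, theta_rest):
    """Hypothetical log posterior: independent normals with means 1 and -0.5
    and standard deviations 0.5 and 1."""
    return -0.5 * (((theta_k - 1.0) / 0.5) ** 2 + (theta_rest + 0.5) ** 2)

grid_k = [-2.0 + 0.05 * i for i in range(121)]
grid_rest = [-5.5 + 0.1 * j for j in range(101)]
marginal = marginal_on_grid(toy_log_post, grid_k, grid_rest)
```

Working on the log scale and subtracting the maximum before exponentiating keeps the sum stable when the posterior is sharply peaked.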

Marginal Distribution for \(u_j\)

Rue et al. (2009) propose three approximations to the conditional density \(\pi (u_j|{\varvec{\theta }}, {\varvec{y}})\): (1) the Gaussian approximation; (2) the Laplace approximation; and (3) the simplified Laplace approximation. The Gaussian approximation is the easiest to obtain but can be inaccurate. The Laplace approximation is more accurate but computationally expensive. The simplified Laplace approximation gives satisfactory results at a lower computational cost. Taking one of them as \({\tilde{\pi }}(u_j|{\varvec{\theta }}, {\varvec{y}})\), the posterior marginal distribution is computed as:

$$\begin{aligned} {\tilde{\pi }}(u_j|{\varvec{y}}) \approx \sum _{h=1}^H {\tilde{\pi }}(u_j | \theta ^*_h, {\varvec{y}}) {\tilde{\pi }}(\theta ^*_h|{\varvec{y}}) \Delta _h. \end{aligned}$$
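The weighted sum above is a finite mixture, which the following Python sketch makes explicit. For simplicity it assumes the Gaussian option for each conditional density; the conditional means, standard deviations, and weights are made-up numbers standing in for quantities that would come from the previous steps, and the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_marginal(x, cond_means, cond_sds, theta_weights):
    """Density of u_j at x: sum_h pi~(u_j | theta_h, y) * w_h, where w_h is
    proportional to pi~(theta_h | y) * Delta_h and is normalised here."""
    total = sum(theta_weights)
    return sum((w / total) * normal_pdf(x, m, s)
               for m, s, w in zip(cond_means, cond_sds, theta_weights))

def mixture_mean(cond_means, theta_weights):
    """Posterior mean of u_j implied by the same mixture."""
    total = sum(theta_weights)
    return sum(w * m for m, w in zip(cond_means, theta_weights)) / total

# Hypothetical conditional summaries at three hyperparameter grid points.
cond_means = [0.8, 1.0, 1.1]
cond_sds = [0.30, 0.25, 0.22]
theta_weights = [0.2, 0.5, 0.3]

density_at_one = mixture_marginal(1.0, cond_means, cond_sds, theta_weights)
posterior_mean = mixture_mean(cond_means, theta_weights)
print(density_at_one, posterior_mean)
```

The mixture is wider than any single conditional density whenever the conditional means differ across grid points, which is how uncertainty about \({\varvec{\theta }}\) propagates into the marginal of \(u_j\).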

B: Additional Simulation Results

Table 6 presents the simulation results for scenarios SM2 (cubic and linear) and SM3 (cubic), comparing the SCM without the MSPOCK adjustment to the SCM with the adjustment.

Table 6 Simulation results comparing SCM (shared component model, without confounding adjustment), and MSPOCK (shared component model, with confounding adjustment) for scenario SM2 (linear and cubic) and SM3 (cubic)

C: Widely Applicable Information Criterion

In any application, it is common to have several competing models. These models may differ in the number of parameters and/or in the likelihood, and therefore in complexity. One important principle is parsimony, which seeks a trade-off between model fit and model complexity. In practice, we search for the best fit. However, the best fit does not necessarily require a more complex model, since complex models may have undesirable properties such as overfitting, high computational cost, identifiability issues, and so on.

Under the Bayesian paradigm, the deviance information criterion (DIC, Spiegelhalter et al. 2002) remains a widely used metric. However, Gelman et al. (2014) compared several model selection criteria and concluded that the widely applicable information criterion (WAIC, Watanabe 2010) is a promising alternative. To calculate the WAIC, one first computes the log pointwise posterior predictive density (\( {lppd}\)):

$$\begin{aligned} lppd = \log \left( \prod _{i=1}^n \pi _{post}(y_i) \right) = \sum _{i=1}^n \log \left( \int \pi (y_i|{\varvec{u}},{\varvec{\theta }}) \pi _{post}({\varvec{u}},{\varvec{\theta }})\, d{\varvec{u}}\, d{\varvec{\theta }} \right) , \end{aligned}$$

where \(\pi _{post}(y_i)\) is the posterior predictive density of \(y_i\) and \(\pi _{post}({\varvec{u}},{\varvec{\theta }})\) is the joint posterior distribution of \(({\varvec{u}},{\varvec{\theta }})\). Next, to adjust for possible overfitting, the \( {lppd}\) is penalized by the effective number of parameters \(p_{\text {WAIC}} = \sum _{i=1}^n V(\log \pi (y_i|{\varvec{u}}, {\varvec{\theta }}))\), where \(V(\cdot )\) is the posterior variance of the pointwise log density. Finally, the WAIC is given by:

$$\begin{aligned} \text {WAIC} = -2 (lppd - p_{\text {WAIC}}). \end{aligned}$$

The model with the smallest WAIC value is considered the model of best fit to a dataset.
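When posterior draws are available, the quantities above can be estimated by Monte Carlo, as in the minimal Python sketch below. It assumes that `log_lik[s][i]` stores \(\log \pi (y_i|{\varvec{u}}^{(s)}, {\varvec{\theta }}^{(s)})\) for posterior draw \(s\); the function names and the tiny input matrix are hypothetical, and the sample variance across draws is used as the estimate of \(V(\cdot )\).

```python
import math
from statistics import variance

def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def waic(log_lik):
    """Return (WAIC, lppd, p_waic) from a draws-by-observations matrix of
    pointwise log-likelihood values (at least two draws are required)."""
    n_obs = len(log_lik[0])
    columns = [[row[i] for row in log_lik] for i in range(n_obs)]
    lppd = sum(log_mean_exp(col) for col in columns)
    p_waic = sum(variance(col) for col in columns)  # sample variance over draws
    return -2.0 * (lppd - p_waic), lppd, p_waic

# Hypothetical log-likelihood values: 3 posterior draws, 2 observations.
log_lik = [[-1.2, -2.1],
           [-1.0, -2.4],
           [-1.1, -2.0]]
value, lppd, p_waic = waic(log_lik)
print(value, lppd, p_waic)
```

Applying the same function to the log-likelihood matrices of two competing fits, for example a shared component model with and without a confounding adjustment, gives directly comparable WAIC values.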

Cite this article

Azevedo, D.R.M., Prates, M.O. & Bandyopadhyay, D. MSPOCK: Alleviating Spatial Confounding in Multivariate Disease Mapping Models. JABES 26, 464–491 (2021).

  • Areal modeling
  • Bayesian
  • Respiratory system cancer
  • Shared components
  • Spatial confounding
  • Variance inflation