
Spatiotemporal high-resolution prediction and mapping: methodology and application to dengue disease


Dengue disease has become a major public health problem. Accurate and precise identification, prediction and mapping of high-risk areas are crucial elements of an effective and efficient early warning system in countering the spread of dengue disease. In this paper, we present the fusion area-cell spatiotemporal generalized geoadditive-Gaussian Markov random field (FGG-GMRF) framework for joint estimation of an area-cell model, involving temporally varying coefficients, spatially and temporally structured and unstructured random effects, and spatiotemporal interaction of the random effects. The spatiotemporal Gaussian field is applied to determine the unobserved relative risk at cell level. It is transformed to a Gaussian Markov random field using the finite element method and the linear stochastic partial differential equation approach to solve the “big n” problem. Sub-area relative risk estimates are obtained as block averages of the cell outcomes within each sub-area boundary. The FGG-GMRF model is estimated by applying Bayesian Integrated Nested Laplace Approximation. In the application to Bandung city, Indonesia, we combine low-resolution area level (district) spatiotemporal data on population at risk and incidence and high-resolution cell level data on weather variables to obtain predictions of relative risk at subdistrict level. The predicted dengue relative risk at subdistrict level suggests significant fine-scale heterogeneities which are not apparent when examining the area level. The relative risk varies considerably across subdistricts and time, with the latter showing an increase in the period January–July and a decrease in the period August–December.


Dengue disease is a major challenge to healthcare worldwide, potentially leading to death, especially among the poor in low- and middle-income countries (Ak et al. 2018). In addition, there are latent costs related to stress, productivity loss, school absence, and care taking (Wilastonegoro et al. 2020). To reduce health impacts and treatment costs, there have been substantial efforts aimed at the prevention of dengue disease outbreaks (Kampen et al. 2014). For these purposes, statistical models aimed at identifying the causes, transmission mechanisms, and prediction of outbreaks at a fine spatiotemporal scale are of crucial importance (Messina et al. 2019).

High-resolution information is needed for research on the etiology of the disease and the development of control and prevention strategies (Ak et al. 2018; Jaya et al. 2017; Jaya and Folmer 2020, 2021a, b; Pokharel and Deardon 2016). Dengue disease typically involves a spatiotemporal pattern (Jaya and Folmer 2020, 2021a, b; Phanitchat et al. 2019; Puggioni et al. 2020); therefore, high-resolution spatiotemporal models and maps of the distribution of relative risk are basic elements of an early warning system aimed at identifying where and when an outbreak will occur (Hanigan et al. 2019; Shi et al. 2013; Xu et al. 2019).

In many areas of spatial research, including epidemiology, data are often available only at different levels of aggregation (Moraga et al. 2017; Shi et al. 2013; Utazi et al. 2019). If low-resolution information is available whereas high-resolution (cell level) information is needed but lacking, area-to-cell disaggregation through joint area-cell estimation can be applied to obtain the missing information (Moraga et al. 2017; Utazi et al. 2019; Wang et al. 2018a; see Footnote 1). Data fusion or data assimilation (Banerjee et al. 2015) and Bayesian melding (Fuentes and Raftery 2005; Liu et al. 2011) are terms used to denote integrating multiple data sources of different spatial resolutions (see Footnote 2). The basic concept involves combining area and cell observations in a single statistical model. Combining data measured at different levels of aggregation can improve parameter estimation and increase prediction accuracy (Wang et al. 2018b). However, it may lead to spatial misalignment (Moraga et al. 2017; Sahu et al. 2010; Truong et al. 2014; Utazi et al. 2019), which induces biased or inconsistent estimators (Liu and Bertazzon 2016; Peng and Bell 2010; Saez and López-Casasnovas 2019).

Several kinds of correction approaches based on regression models have been developed to deal with area-to-cell misalignment problems (Banerjee and Gelfand 2002; Banerjee et al. 2015). Moraga et al. (2017) and Utazi et al. (2019) applied Bayesian geostatistical analysis to deal with misalignment in spatial non-Gaussian data with linear covariates. However, methods that address area-to-cell misalignment in spatiotemporal non-Gaussian data with nonlinear covariates are less well known. This applies especially to Poisson or Negative Binomial (NB) spatiotemporal data, which are typical of dengue and other disease incidence modeling and also occur in other kinds of spatial and regional research.

In this paper, we introduce the Fusion Area-Cell Spatiotemporal Generalized Geoadditive (GG)-Gaussian Field (GF) model, abbreviated as FGG-GF model, to generate high-resolution (cell) predictions based on observations at lower (area) resolution of the variable of interest (i.e., the number of dengue incidences) and the population at risk, and high-resolution cell data (i.e., weather variables), while controlling for misalignment. Moraga et al. (2017) and Utazi et al. (2019) have shown that the prediction performance of integrated area-cell models can outperform models that use single level data sources.

Kammann and Wand (2003) introduced the Generalized Geoadditive Model (GGM), which has become popular in disease mapping (among other fields) because of its suitability for making high-resolution maps (Muleia et al. 2020; Wand et al. 2011). The model assumes that there is a spatially continuous variable underlying all observations which can be modeled using a Gaussian process, usually denoted Gaussian Field (GF). A GF is characterized by a first-order autoregressive model with spatially correlated innovations. The GGM combines the Generalized Additive Model (GAM) and the Geostatistical Model (GM). The former was introduced by Hastie and Tibshirani (1986) to provide a flexible means of handling nonlinear and interacting covariates. GAMs are also suitable for handling complex spatial and temporal autocorrelation (French and Wand 2004; Ma et al. 2014). GAMs are nonparametric because they do not require a priori specification of the regression function (Wang et al. 2018a). The GM was introduced by Matheron (1963) to construct high-resolution maps over a particular geographical region based on cell data on (risk) factors associated with a (dependent) variable of interest.

The integrated area-cell observations and the combination of non-Gaussian data, a nonlinear predictor and latent model components, in particular the spatiotemporal GF, make estimation of the FGG-GF model, prediction, and mapping computationally complex and time-consuming (Barber et al. 2016) because of the “big n” problem. This issue can be handled by transforming a GF with a dense covariance matrix into a Gaussian Markov Random Field (GMRF) with a sparse precision matrix (see Footnote 3) of Matérn covariances (Lindgren et al. 2011).

The objective of this paper is to develop a high-resolution prediction and mapping procedure for spatiotemporal Poisson or Negative Binomial data applying a Fusion Area-Cell Spatiotemporal Generalized Geoadditive-Gaussian Markov Random Field (abbreviated as FGG-GMRF) model. Inference and prediction are handled in a Bayesian framework. The approach will subsequently be applied to dengue disease risk in Bandung city, Indonesia. The purpose is to predict and map the relative dengue risk at subdistrict level, given observations on dengue incidence and population at risk at district level and weather risk factors at cell level (see Footnote 4). Special attention is paid to high-risk districts and subdistricts requiring public intervention (Aguayo et al. 2020).

The structure of the remainder of this paper is as follows. Section 2 introduces the spatiotemporal GG-GF model. Section 3 presents the FGG-GF and FGG-GMRF models and the Bayesian inference framework. The link between the GF and GMRF models is summarized in Appendix 1. Section 4 applies the methodology to dengue incidence in Bandung city, Indonesia, and Sect. 5 summarizes and concludes the conducted research.

The spatiotemporal generalized geoadditive-Gaussian field model

Consider region \(\mathcal{A}\in {\mathbb{R}}^{2}\), partitioned into \({n}_{\mathcal{A}}\) areas (e.g., districts in a city), each measured for \(T\) periods. The areas are labeled \(\left\{\left({\mathcal{A}}_{1},1\right),\ldots ,\left({\mathcal{A}}_{1},T\right),\left({\mathcal{A}}_{2},1\right),\ldots ,\left({\mathcal{A}}_{i},t\right),\ldots ,\left({\mathcal{A}}_{{n}_{\mathcal{A}}},T\right)\right\}\), where \(\left({\mathcal{A}}_{i},t\right)\) denotes area \(i\) at time \(t,\) for \(i=1,\ldots ,{n}_{\mathcal{A}}\) and \(t=1,\ldots ,T\). Region \(\mathcal{A}\) is further divided into a finite set of \({n}_{p}\) cells for \(T\) periods. The set of \({n}_{p}\) cells over \(T\) periods is denoted \(\left\{\left({{\varvec{s}}}_{1},1\right),\ldots ,\left({{\varvec{s}}}_{{n}_{{\mathcal{A}}_{1}}},T\right),\left({{\varvec{s}}}_{{n}_{{\mathcal{A}}_{1}}+1},1\right),\ldots ,\left({{\varvec{s}}}_{g},t\right),\ldots ,\left({{\varvec{s}}}_{{n}_{p}},T\right)\right\},\) with \({n}_{{\mathcal{A}}_{i}}\) denoting the number of cells in area \({\mathcal{A}}_{i}\) for \(g=1,\ldots ,{n}_{p}\) and \(t=1,\ldots ,T \,{{\rm and}}\,n_{p}=\sum_{i=1}^{{n}_{\mathcal{A}}}{n}_{{\mathcal{A}}_{i}}\). Note that the notation \(\left\{\left({{\varvec{s}}}_{1\left({\mathcal{A}}_{1}\right)},1\right),\ldots ,\left({{\varvec{s}}}_{{n}_{{\mathcal{A}}_{1}}\left({\mathcal{A}}_{1}\right)},T\right),\left({{\varvec{s}}}_{{n}_{{\mathcal{A}}_{1}}+1\left({\mathcal{A}}_{2}\right)},1\right),\ldots ,\left({{\varvec{s}}}_{g\left({\mathcal{A}}_{i}\right)},t\right),\ldots ,\left({{\varvec{s}}}_{{n}_{p}\left({\mathcal{A}}_{{n}_{\mathcal{A}}}\right)},T\right)\right\}\) will be used to explicitly denote that cell \({{\varvec{s}}}_{g\left({\mathcal{A}}_{i}\right)}\) belongs to area \({\mathcal{A}}_{i}\). If the area is not relevant, \({{\varvec{s}}}_{g}\) will be used. Moreover, the notation \(g\) for \({\mathbf{s}}_{g}\) will be incidentally used if there is no risk of misunderstanding. 
Finally, \({s}_{g,1}\) and \({s}_{g,2}\) denote the latitude and longitude coordinates of the centroid of cell \({\mathbf{s}}_{g}\), respectively.

Let \({y}_{it}\) and \({N}_{it}\) denote the number of (dengue) incidences and population at risk in area \({\mathcal{A}}_{i}\) at time \(t\), respectively, and \({y}_{gt}\) and \({N}_{gt}\) the number of (dengue) incidences and population at risk in cell \({\mathbf{s}}_{g}\) at time \(t\), respectively. Note that both \({y}_{gt}\) and \({N}_{gt}\) are unobserved at the cell level. \({y}_{it}\) and \({y}_{gt}\) are assumed to follow Poisson distributions (see Footnote 5) with means \({\mu }_{it}={E}_{it}{\theta }_{it}\) and \({\mu }_{gt}={E}_{gt}{\theta }_{gt}\), respectively, with \({E}_{it}\) and \({E}_{gt}\) denoting the expected number of (dengue) incidences and \({\theta }_{it}\) and \({\theta }_{gt}\) the relative (dengue) risk for area \(i\) and cell \(g\) at time \(t\), respectively (Jaya and Folmer 2020). The expected number \({E}_{it}\) is calculated using external standardization. It is defined based on the overall average rate across all areas and periods (Abente et al. 2018; Jaya and Folmer 2020, 2021a, b):

$${E}_{it}={N}_{it}\left(\frac{1}{{n}_{\mathcal{A}}T}\sum_{i=1}^{{n}_{\mathcal{A}}}\sum_{t=1}^{T}{y}_{it}\bigg/\frac{1}{{n}_{\mathcal{A}}T}\sum_{i=1}^{{n}_{\mathcal{A}}}\sum_{t=1}^{T}{N}_{it}\right) \quad {{\rm for}}\,i=1,\ldots ,{n}_{\mathcal{A}}\,{{\rm and}}\,t=1,\ldots ,T.\tag{1}$$

The relative risk is defined as the ratio of the local risk in a spatiotemporal unit to the average risk across the whole study region over the entire time period (Yin et al. 2014). It is centered around one: a value of one means that the observed number of incidences equals the expected number, whereas values above (below) one indicate a higher (lower) risk than average. The maximum likelihood (ML) estimator of the relative risk \({\theta }_{it}\) is (Jaya et al. 2017; Jaya and Folmer 2020):

$${\widehat{\theta }}_{it}=\frac{{y}_{it}}{{E}_{it}} \quad {{\rm for}}\,i=1,\ldots ,{n}_{\mathcal{A}}\,{{\rm and}}\,t=1,\ldots ,T.\tag{2}$$

This is known as the crude risk or the standardized incidence ratio (SIR).
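The two formulas above can be illustrated with a minimal sketch. The function names and the toy counts are hypothetical and are not taken from the Bandung data. The factor \(1/({n}_{\mathcal{A}}T)\) cancels in the ratio, so the overall rate is simply total cases divided by total population at risk.

```python
def expected_counts(y, N):
    """Expected counts E_it = N_it * (total cases / total population at risk).

    y and N are nested lists indexed as [area][period].
    """
    total_cases = sum(sum(row) for row in y)
    total_pop = sum(sum(row) for row in N)
    overall_rate = total_cases / total_pop
    return [[n_it * overall_rate for n_it in row] for row in N]


def crude_relative_risk(y, E):
    """Standardized incidence ratio (SIR): y_it / E_it for every area and period."""
    return [[y_it / e_it for y_it, e_it in zip(y_row, e_row)]
            for y_row, e_row in zip(y, E)]


# Toy example (made-up numbers): 2 areas observed over 3 periods.
y = [[4, 6, 10], [1, 2, 1]]
N = [[1000, 1000, 1000], [500, 500, 500]]

E = expected_counts(y, N)
theta_hat = crude_relative_risk(y, E)
```

By construction the expected counts sum to the observed total, which is why the SIR is centered around one.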

Following Moraga et al. (2017), Jaya and Folmer (2020, 2021a, b), and Utazi et al. (2019), we model the relative risk as a non-separable Poisson log-linear model as follows (see Footnote 6):

$${\eta }_{it}={\beta }_{0}+\sum_{k=1}^{K}{f}_{k}\left({\overline{{\rm x}} }_{k,it}\right)+{\omega }_{i}+{\upsilon }_{i}+{\phi }_{t}+{\varsigma }_{t}+{\delta }_{it}+{\overline{\Phi } }_{it}\,\,{\rm for}\,\,i=1,\ldots ,{n}_{\mathcal{A}}\,{{\rm and}}\, t=1,\ldots ,T,\tag{3a}$$
$${\eta }_{gt}={\beta }_{0}+\sum_{k=1}^{K}{f}_{k}\left({{\rm x}}_{k,gt}\right)+{\omega }_{g({\mathcal{A}}_{i})}+{\upsilon }_{g({\mathcal{A}}_{i})}+{\phi }_{t}+{\varsigma }_{t}+{\delta }_{g({\mathcal{A}}_{i})t}+{\Phi }_{gt}\,\,{\rm for}\,\,g=1,\ldots ,{n}_{p}\,{{\rm and}}\,t =1,\ldots ,T\tag{3b}$$

with \({\eta }_{it}={\rm log}\left({\theta }_{it}\right)\) and \({\eta }_{gt}={\rm log}\left({\theta }_{gt}\right)\).

In Eqs. (3a) and (3b), \({\beta }_{0}\) is the overall intercept denoting the average risk across space and time, i.e., across all \(i=1,\ldots ,{n}_{\mathcal{A}}\), \(g=1,\ldots ,{n}_{p}\), and \(t=1,\ldots ,T\). The latent (see Footnote 7) functions \({f}_{k}\left({\overline{{\rm x}} }_{k,it}\right)\) and \({f}_{k}\left({{\rm x}}_{k,gt}\right)\), for \(k=1,\ldots ,K\), represent the (non)linear effects of the metrical area and cell risk factors, respectively. The risk factors at cell level for a given area \({\mathcal{A}}_{i}\) and time \(t\) are fixed. However, they vary across areas and times. The latent (non)linear risk factor functions are based on observations at cell level but are predicted at area level. The risks at cell and area levels are assumed to be driven by the same factors; therefore, we adopt joint risk factor functions. For this purpose, we stack the observations on the risk factors such that risk factor \(k\), at both area and cell level, becomes \({\mathbf{z}}_{k}={\left({\overline{{\rm x}} }_{k,11},\ldots ,{\overline{{\rm x}} }_{k,{n}_{\mathcal{A}}T},{{\rm x}}_{k,11},\ldots ,{{\rm x}}_{k,{n}_{p}T}\right)}^{{\prime}}\) for \(k=1,\ldots ,K\), with latent function \({f}_{k}\left({\mathbf{z}}_{k}\right)\). The functions \({f}_{k}\left({{\varvec{z}}}_{k}\right)\) are commonly centered at the mean, i.e., \({\mathbb{E}}\left[{f}_{k}\left({{\varvec{z}}}_{k}\right)\right]=0\), for identifiability reasons (Fahrmeir and Lang 2001).

Let \(f({\varvec{z}})\) be the sum of the functions \({f}_{k}({{\varvec{z}}}_{k})\) for \(k=1,\ldots ,K\):

$$f\left({\varvec{z}}\right)=\sum_{k=1}^{K}{f}_{k}\left({{\varvec{z}}}_{k}\right)={f}_{1}\left({{\varvec{z}}}_{1}\right)+\cdots +{f}_{K}\left({{\varvec{z}}}_{K}\right).\tag{4}$$

To account for spatiotemporal variation, Eq. (4) can be extended to a varying coefficients model (see Footnote 8):

$$f\left({\varvec{z}}\right)={f}_{1}\left({{\varvec{z}}}_{1}\right){\mathbf{v}}_{1}+\cdots +{f}_{K}({{\varvec{z}}}_{K}){\mathbf{v}}_{K}\tag{5}$$

where the design vector \(\mathbf{v}={\left({\mathbf{v}}_{1},\ldots ,{\mathbf{v}}_{K}\right)}^{{\prime}}\) contains components of \({\varvec{z}}\) or additional covariates. The vector \({\mathbf{v}}_{k}\), for \(k=1,\ldots ,K\), modifies the relationship between the covariate \({{\varvec{z}}}_{k}\) and the log-linear conditional expectation \({\mathbb{E}}\left[\mathbf{y}|{\varvec{z}}\right]\). If it is identical to the vector of ones, i.e., \({\mathbf{v}}_{k}={\left(1,\ldots ,1\right)}^{{\prime}}\) with dimension \(\left({n}_{\mathcal{A}}+{n}_{p}\right)T\times 1\), then \({f}_{k}({{\varvec{z}}}_{k})\) represents the overall (main) effect of \({{\varvec{z}}}_{k}\). If it differs from the vector of ones, \({f}_{k}\left({{\varvec{z}}}_{k}\right){\mathbf{v}}_{k}\) represents the effect of \({\mathbf{z}}_{k}\) that varies along with \({\mathbf{v}}_{k}\). In other words, \({f}_{k}\left({{\varvec{z}}}_{k}\right){\mathbf{v}}_{k}\) models the interaction between \({\mathbf{z}}_{k}\) and \({\mathbf{v}}_{k}\) (Fahrmeir and Lang 2001). According to Martínez-Bello et al. (2017a, b), the varying coefficients model helps refine the association between the regressors (e.g., the weather variables) and the response, thus improving predictions at a fine spatiotemporal scale. For example, if \({\mathbf{v}}_{k}\) denotes the calendar day and \({\mathbf{z}}_{k}\) is the spatiotemporal covariate temperature, then \({f}_{k}\left({{\varvec{z}}}_{k}\right){\mathbf{v}}_{k}\) represents the temperature effect varying by day. In this study, we apply the temporally varying coefficients model to accommodate the temporally varying nonlinear effects of risk factors on the response. The time-varying effect of, for example, the \(k\)-th covariate can be written as (Franco-Villoria et al. 2019):

$${f}_{k}\left({{\varvec{z}}}_{k,t}\right){\mathbf{v}}_{k,t}={\beta }_{k}\left({\mathbf{v}}_{k,t}\right){{\varvec{z}}}_{k,t}\, {\rm for\, every }\,i\,{\rm and }\,g,\,{\rm and\,for }\,k=1,\ldots ,K\,{{\rm and}}\,t=1,\ldots ,T,\tag{6}$$

where \({\beta }_{k}({\mathbf{v}}_{k,t})\) for \(t=1,\ldots ,T\) is the time-varying regression coefficient, which can be regarded as a stochastic process over \({\mathbf{v}}_{k}\) (Fahrmeir and Lang 2001). For ease of notation, we ignore the term \({\mathbf{v}}_{k}\) and write \({\beta }_{k,t}\).

A time-varying coefficient can be conveniently specified as the sum of a fixed (global mean) effect and a temporal random effect of the risk factor: \({\beta }_{k,t}={\beta }_{k}+{\zeta }_{k,t}\) for \(k=1,\ldots ,K\) and \(t=1,\ldots ,T\). The fixed effect (\({\beta }_{k}\)) represents the effect of the risk factor that remains constant across space and time, while the temporal random effect (\({\zeta }_{k,t}\)) accounts for the time-varying effect of the risk factor (Song et al. 2020). The temporal random effect \({\zeta }_{k,t}\) can be conveniently specified as a random walk model of order one (RW1) or two (RW2) (see Footnote 9; Bernardinelli et al. 1995; Martínez-Bello et al. 2017b; Schrödle and Knorr-Held 2011):

$${\zeta }_{k,t}={\zeta }_{k,t-1}+{u}_{k,t} ({\rm RW}1)\,{\rm for }\,t=2,\ldots ,T\,{\rm or }\,{\zeta }_{k,t}=2{\zeta }_{k,t-1}-{\zeta }_{k,t-2}+{u}_{k,t} ({\rm RW}2)\,{\rm for}\, t=3,\ldots ,T\tag{7}$$

with \({u}_{k,t}\sim \mathcal{N}\left(0,{\sigma }_{{\zeta }_{k}}^{2}\right)\) white noise, and \({\sigma }_{{\zeta }_{k}}^{2}\) denoting the variance of the RW process controlling the smoothness of \({\beta }_{k,t}\). A random walk process of order one needs an initial value of \({\zeta }_{k,1}\) and a random walk of order two needs initial values of \({\zeta }_{k,1}\) and \({\zeta }_{k,2}\).
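The random walk recursions and the decomposition \({\beta }_{k,t}={\beta }_{k}+{\zeta }_{k,t}\) can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names are hypothetical, and the initial values are fixed at zero here, whereas in a Bayesian fit they would receive priors.

```python
import random


def simulate_rw(T, sigma, order=1, seed=0):
    """Simulate a random walk of order 1 or 2 with N(0, sigma^2) innovations.

    Initial values (one for RW1, two for RW2) are set to zero, which is an
    assumption made for this sketch only.
    """
    if order not in (1, 2):
        raise ValueError("order must be 1 or 2")
    rng = random.Random(seed)
    zeta = [0.0] * order
    for t in range(order, T):
        u = rng.gauss(0.0, sigma)
        if order == 1:
            zeta.append(zeta[t - 1] + u)                    # RW1 step
        else:
            zeta.append(2 * zeta[t - 1] - zeta[t - 2] + u)  # RW2 step
    return zeta


def time_varying_coef(beta_k, zeta_k):
    """beta_{k,t} = beta_k + zeta_{k,t}: fixed effect plus temporal random effect."""
    return [beta_k + z for z in zeta_k]


# Example: a monthly temperature coefficient over one year (made-up values).
zeta = simulate_rw(T=12, sigma=0.05, order=2, seed=42)
beta_t = time_varying_coef(0.3, zeta)
```

A smaller innovation standard deviation gives a smoother coefficient path, which is the sense in which \({\sigma }_{{\zeta }_{k}}^{2}\) controls smoothness.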

The components \({\omega }_{i}\) and \({\upsilon }_{i}\) are the spatially structured and unstructured main effects at area level, respectively, whereas \({\omega }_{g({\mathcal{A}}_{i})}\) and \({\upsilon }_{g({\mathcal{A}}_{i})}\) denote the spatially structured and unstructured main effects for cell \({\mathbf{s}}_{g}\) in the area \({\mathcal{A}}_{i}\) to which it belongs. The components \({\phi }_{t}\) and \({\varsigma }_{t}\) are the temporally structured and unstructured main effects. For a given time \(t\), they are equal for all areas and cells. \({\delta }_{it}\) represents the spatiotemporal interaction effect of the unobserved risk factors at area level and \({\delta }_{g({\mathcal{A}}_{i})t}\) the impact of the interaction effect \({\delta }_{it}\) in cell \({\mathbf{s}}_{g}\) in the area \({\mathcal{A}}_{i}\) to which it belongs (see Sect. 3.1 for details). We consider four types of space–time interactions (see Table 5 in Appendix 2).

The final component, \({\Phi }_{gt},\) in Eq. (3b) is the spatiotemporal GF in cell \({\mathbf{s}}_{g }\,{\rm for}\, g=1,\ldots ,{n}_{p}\) at time \(t\), indicating the true but unobserved relative risk (Cameletti et al. 2013; Godana et al. 2019). Hence, \({\Phi }_{gt}\) is the “own” spatiotemporal interaction effect of cell \({\mathbf{s}}_{g}\). Because of the large number of cells, it is continuously indexed (Blangiardo and Cameletti 2015). The component \({\overline{\Phi } }_{it}\) in Eq. (3a) denotes the area average of \({\Phi }_{gt}\) across the cells within \({\mathcal{A}}_{i}\). Following Cameletti et al. (2013) and Godana et al. (2019), we assume that \(\Phi \left({{\varvec{s}}}_{g},t\right)\) changes over time following a first-order autoregressive (AR1) process with coefficient \({\lambda }_{1}, \left| {\lambda }_{1}\right|<1\):

$$\Phi \left({{\varvec{s}}}_{g},t\right)={\lambda }_{1}\Phi \left({{\varvec{s}}}_{g},t-1\right)+\gamma \left({{\varvec{s}}}_{g},t\right) \quad {{\rm for}}\,t=2,\ldots ,T\,{{\rm and}}\,g=1,\ldots ,{n}_{p}\tag{8}$$

with \(\Phi \left({{\varvec{s}}}_{g},1\right)\sim \mathcal{N}\left(0,{\sigma }_{\Phi }^{2}/\left(1-{\lambda }_{1}^{2}\right)\right)\) and \(\gamma \left({{\varvec{s}}}_{g},t\right)\) defined as a mean square differentiable process (see Footnote 10; Stein 1999) whose innovations are temporally independent but spatially correlated and follow a zero-mean Gaussian distribution with spatiotemporal covariance function:

$${\rm Cov}\left(\gamma \left({{\varvec{s}}}_{g},t\right),\gamma \left({{\varvec{s}}}_{h},t{^{\prime}}\right)\right)=\left\{\begin{array}{ll}0 & \quad {\rm if}\,\, t\ne t{^{\prime}}\\ {\sigma }_{\Phi }^{2}\mathcal{R}\left(d\right) & \quad {\rm if}\,\, t=t{^{\prime}}\end{array}\right.\tag{9}$$

for \(g\ne h\). Here, \({\sigma }_{\Phi }^{2}\) is the homogeneous variance of \(\gamma \left({{\varvec{s}}}_{g},t\right)\), i.e., \({\rm Var}\left(\gamma \left({{\varvec{s}}}_{g},t\right)\right)={\sigma }_{\Phi }^{2}\) for every \({{\varvec{s}}}_{g}\) and \(t\), and \(\mathcal{R}\left(d\right)\) is the spatial autocorrelation function of the distance \(d\) between \({{\varvec{s}}}_{g}\) and \({{\varvec{s}}}_{h}\) at time \(t\) (e.g., the Euclidean distance). Assuming that the covariance function depends only on \(d\), we specify it as a Matérn covariance function, which satisfies the second-order stationarity and isotropy assumptions. Consequently, the mean of the process is constant and the covariance depends on the locations of \({{\varvec{s}}}_{g}\) and \({{\varvec{s}}}_{h}\) only through the Euclidean distance \(d=\Vert {{\varvec{s}}}_{g}-{{\varvec{s}}}_{h}\Vert \in {\mathbb{R}}\) (Song et al. 2008). The spatial autocorrelation function \(\mathcal{R}\left(d\right)\) is defined as:

$$\mathcal{R}\left(d\right)=\frac{1}{\Gamma \left(v\right){2}^{v-1}}{\left(\kappa d\right)}^{v}{K}_{v}\left(\kappa d\right) \quad {\rm for\, every}\, t\tag{10}$$

where \(\Gamma \left(.\right)\) is the gamma function, \({K}_{v}\left(.\right)\) the modified Bessel function of the second kind of order \(v\) (Abramowitz and Stegun 1965), and \(v>0\) the parameter controlling the smoothness of the GF (smoothness parameter). In applications, \(v\) is commonly fixed because it is usually poorly identified (Miller et al. 2019; Utazi et al. 2019). In several software packages, including R-INLA (Integrated Nested Laplace Approximation), the default value is one (\(v=1\)), corresponding to moderate smoothness (Lindgren et al. 2011; Utazi et al. 2019). The scale parameter \(\kappa >0\) controls the rate of decay of the correlation and is inversely related to the range parameter \(r\), i.e., the distance beyond which the correlation between \(\gamma ({{\varvec{s}}}_{g},t)\) and \(\gamma \left({{\varvec{s}}}_{h},t\right)\) becomes negligible. For large \(r\), \(\kappa\) goes to zero. Because of a lack of a simple relationship between \(\kappa\) and \(r\), Lindgren et al. (2011) proposed the empirically derived relationship \(r=\sqrt{8v}/\kappa\), the distance at which the spatial autocorrelation is near 0.1. Substituting Eq. (10) in Eq. (9), the spatiotemporal Matérn covariance function \(\Sigma \left(d\right)\) for each time \(t\) is:
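To make the role of \(\kappa\), \(v\), and \(r\) concrete, the Matérn correlation function can be evaluated numerically. The sketch below is a dependency-free illustration, not the routine used by R-INLA: it approximates \({K}_{v}\) with the integral representation \({K}_{v}(x)=\int_{0}^{\infty }\exp (-x\cosh t)\cosh (vt)\,{\rm d}t\) and a trapezoid rule, and all function names are hypothetical. In practice a library routine such as `scipy.special.kv` would be the usual choice.

```python
import math


def bessel_k(nu, x, upper=30.0, steps=30000):
    """Approximate the modified Bessel function of the second kind K_nu(x), x > 0.

    Uses the trapezoid rule on the integral of exp(-x*cosh(t)) * cosh(nu*t)
    over t in [0, upper]. Adequate for this sketch unless x is extremely small.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    h = upper / steps
    total = 0.0
    for j in range(steps + 1):
        t = j * h
        c = x * math.cosh(t)
        if c > 700.0:  # exp(-c) is numerically zero from here on
            break
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-c) * math.cosh(nu * t)
    return total * h


def matern_corr(d, kappa, nu=1.0):
    """Matern correlation R(d) = (kappa*d)^nu * K_nu(kappa*d) / (Gamma(nu) * 2^(nu-1))."""
    if d == 0:
        return 1.0  # limiting value at zero distance
    x = kappa * d
    return (x ** nu) * bessel_k(nu, x) / (math.gamma(nu) * 2 ** (nu - 1))


def kappa_from_range(r, nu=1.0):
    """Invert the empirical relationship r = sqrt(8*nu) / kappa."""
    return math.sqrt(8 * nu) / r


# Example: a hypothetical range of 5 distance units with nu = 1.
kappa = kappa_from_range(5.0)
rho_at_range = matern_corr(5.0, kappa)  # roughly 0.14, i.e. near 0.1
```

Multiplying `matern_corr` by \({\sigma }_{\Phi }^{2}\) gives the covariance for a pair of cells at distance \(d\).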

$$\Sigma \left(d\right)=\frac{{\sigma }_{\Phi }^{2}}{\Gamma \left(v\right){2}^{v-1}}{\left(\kappa d\right)}^{v}{K}_{v}\left(\kappa d\right) \quad{\rm for\,\, every }\,t.\tag{11}$$

For the joint latent spatiotemporal GF \({{\varvec{\Phi}}}_{t}={\left({\Phi }_{1t},\ldots ,{\Phi }_{{n}_{p}t}\right)}^{{{\prime}}}\) at cell level, we have:

$${{\varvec{\Phi}}}_{t}={\lambda }_{1}{{\varvec{\Phi}}}_{t-1}+{{\varvec{\upgamma}}}_{t}\,{{\rm and}}\,{{\varvec{\upgamma}}}_{t}\sim \mathcal{N}\left(0,{\varvec{\Sigma}}\right) \quad {\rm for }\,t=2,\ldots ,T\tag{12}$$

with \({{\varvec{\upgamma}}}_{t}={\left({\gamma }_{1,t},\ldots ,{\gamma }_{{n}_{p},t}\right)}^{{\prime}}\) and \({\varvec{\Sigma}}={\sigma }_{\Phi }^{2}\mathcal{R}\) a Matérn covariance matrix. That is, the joint latent spatiotemporal GF is a second-order stationary, isotropic GF with Matérn covariance function Eq. (11) and initial value distributed as \({{\varvec{\Phi}}}_{1}\sim \mathcal{N}\left(0,\frac{{\sigma }_{\Phi }^{2}}{\left(1-{\lambda }_{1}^{2}\right)}\mathcal{R}\right)\).
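The covariance of the initial state follows from the stationarity of the AR(1) recursion. As a short check, assume that \({{\varvec{\Phi}}}_{t-1}\) and \({{\varvec{\upgamma}}}_{t}\) are independent and write \({{\varvec{\Sigma}}}_{\Phi }\) for the time-invariant marginal covariance of \({{\varvec{\Phi}}}_{t}\) (a symbol introduced here for illustration only). Taking covariances on both sides of the recursion gives:

$${{\varvec{\Sigma}}}_{\Phi }={\lambda }_{1}^{2}{{\varvec{\Sigma}}}_{\Phi }+{\sigma }_{\Phi }^{2}\mathcal{R}\quad \Rightarrow \quad {{\varvec{\Sigma}}}_{\Phi }=\frac{{\sigma }_{\Phi }^{2}}{1-{\lambda }_{1}^{2}}\mathcal{R},$$

which matches the distribution stated for \({{\varvec{\Phi}}}_{1}\) and is finite and positive only if \(\left|{\lambda }_{1}\right|<1\).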

Bayesian inference

This section consists of two subsections. In the first, we present the Fusion Area-Cell Spatiotemporal Generalized Geoadditive-Gaussian Field (FGG-GF) model, which integrates the sub-models (3a) and (3b) into a single statistical model. This subsection also presents the Bayesian statistical tools. In the second subsection, we discuss solving the “big n” problem, resulting in the Fusion Area-Cell Spatiotemporal Generalized Geoadditive-Gaussian Markov Random Field (FGG-GMRF) model, and point out that it can be estimated using the R-INLA package. Details on the link between the GF and the GMRF through the Linear Stochastic Partial Differential Equation (LSPDE) approach are discussed in Appendix 1.

The fusion area-cell spatiotemporal generalized geoadditive-Gaussian field model

As observed above, combining low-resolution and high-resolution data to generate high-resolution predictions entails the risk of misalignment (Moraga et al. 2017; Utazi et al. 2019). To handle misalignment, we first stack the corresponding objects of the area and cell models in Eqs. (3a) and (3b) to give the FGG-GF model (see Footnote 11), which is then estimated as a single model (Blangiardo and Cameletti 2015; Kifle et al. 2017; Utazi et al. 2019). The FGG-GF model reads:

$${{\varvec{\upeta}}}_{t}={\beta }_{0}{1}_{({n}_{\mathcal{A}}+{n}_{p})}+{\sum }_{k=1}^{K}\left({\beta }_{k}+{\zeta }_{k,t}\right){\mathbf{z}}_{k,t}+\ddot{{\varvec{\upomega}}}+\ddot{{\varvec{\upsilon}}}+{\phi }_{t}{1}_{({n}_{\mathcal{A}}+{n}_{p})}+{\varsigma }_{t}{1}_{({n}_{\mathcal{A}}+{n}_{p})}+{\ddot{{\varvec{\updelta}}}}_{t}+{\ddot{{\varvec{\Phi}}}}_{t} \quad {\rm for }\,t=1, \ldots ,T\tag{13}$$

where \({{\varvec{\upeta}}}_{t}={\left({\eta }_{1,t},\ldots ,{\eta }_{{n}_{\mathcal{A}},t},{\eta }_{\left({n}_{\mathcal{A}}+1\right),t},\ldots ,{\eta }_{\left({n}_{\mathcal{A}}+{n}_{p}\right),t}\right)}^{{\prime}}\) is the stacked linear predictor, \({\beta }_{0}\) is the global mean defined in Eqs. (3a) and (3b), \({1}_{\left({n}_{\mathcal{A}}+{n}_{p}\right)}\) is a vector of ones of dimension \(\left({n}_{\mathcal{A}}+{n}_{p}\right)\), and \({\mathbf{z}}_{k,t}={\left({\mathbf{z}}_{k,1t},\ldots ,{\mathbf{z}}_{k,{n}_{\mathcal{A}}t},{\mathbf{z}}_{k,\left({n}_{\mathcal{A}}+1\right)t },\ldots ,{\mathbf{z}}_{k,\left({n}_{\mathcal{A}}+{n}_{p}\right)t}\right)}^{{\prime}}\) is the joint \(k\)th risk factor with fixed coefficient \({\beta }_{k}\) and temporal random coefficient \({{\varvec{\zeta}}}_{k}={\left({\zeta }_{k,1},\ldots ,{\zeta }_{k,T}\right)}^{{\prime}}\) for \(k=1,\ldots ,K\). The temporal effects \({\phi }_{t}\) and \({\varsigma }_{t}\) are defined as in Eqs. (3a) and (3b). Furthermore, \(\ddot{{\varvec{\upomega}}}={\left({\varvec{\upomega}},{{\varvec{\upomega}}}_{\mathcal{A}}\right)}^{{\prime}}\), with \({\varvec{\upomega}}={\left({\omega }_{1},\ldots ,{\omega }_{i},\ldots ,{\omega }_{{n}_{\mathcal{A}}}\right)}^{{\prime}}\) and \({{\varvec{\upomega}}}_{\mathcal{A}}={\left({\omega }_{1\left({\mathcal{A}}_{1}\right)},\ldots ,{\omega }_{g\left({\mathcal{A}}_{i}\right)},\ldots ,{\omega }_{{n}_{p}\left({\mathcal{A}}_{{n}_{\mathcal{A}}}\right)}\right)}^{{\prime}}\); \(\ddot{{\varvec{\upsilon}}}={\left({\varvec{\upsilon}},{{\varvec{\upsilon}}}_{\mathcal{A}}\right)}^{{\prime}}\), with \({\varvec{\upsilon}}={\left({\upsilon }_{1},\ldots ,{\upsilon }_{i},\ldots ,{\upsilon }_{{n}_{\mathcal{A}}}\right)}^{{\prime}}\) and \({{\varvec{\upsilon}}}_{\mathcal{A}}={\left({\upsilon }_{1\left({\mathcal{A}}_{1}\right)},\ldots ,{\upsilon }_{g\left({\mathcal{A}}_{i}\right)},\ldots ,{\upsilon }_{{n}_{p}\left({\mathcal{A}}_{{n}_{\mathcal{A}}}\right)}\right)}^{{\prime}}\); \({\ddot{{\varvec{\updelta}}}}_{t}={\left({{\varvec{\updelta}}}_{t},{{\varvec{\updelta}}}_{\mathcal{A}t}\right)}^{{\prime}}\), with \({{\varvec{\updelta}}}_{t}={\left({\updelta }_{1t},\ldots ,{\updelta }_{it},\ldots ,{\updelta }_{{n}_{\mathcal{A}}t}\right)}^{{\prime}}\) and \({{\varvec{\updelta}}}_{\mathcal{A}t}={\left({\updelta }_{1\left({\mathcal{A}}_{1}\right)t},\ldots ,{\updelta }_{g\left({\mathcal{A}}_{i}\right)t},\ldots ,{\updelta }_{{n}_{p}\left({\mathcal{A}}_{{n}_{\mathcal{A}}}\right)t}\right)}^{{\prime}}\); and \({\ddot{{\varvec{\Phi}}}}_{t}={\left({\overline{{\varvec{\Phi}}} }_{t},{{\varvec{\Phi}}}_{t}\right)}^{{\prime}}\), with \({\overline{{\varvec{\Phi}}} }_{t}={\left({\overline{\Phi } }_{1t},\ldots ,{\overline{\Phi } }_{it},\ldots ,{\overline{\Phi } }_{{n}_{\mathcal{A}}t}\right)}^{{\prime}}\) and \({{\varvec{\Phi}}}_{t}={\left({\Phi }_{1t},\ldots ,{\Phi }_{gt},\ldots ,{\Phi }_{{n}_{p}t}\right)}^{{\prime}}\). Note that each of the above vectors has dimension \(\left({n}_{\mathcal{A}}+{n}_{p}\right)\times 1\) for \(t=1,\ldots ,T\).

The following observations apply. First, the basic components of a high-resolution spatiotemporal relative risk model are the covariates and/or the GF at cell level \(\left(\Phi \left({{\varvec{s}}}_{g},t\right)\right)\). Either one or both are required for the estimation of the relative risk at cell level. Second, the interaction terms \({\delta }_{it}\) and \({\Phi }_{gt}\) in the non-separable models in Eqs. (3a) and (3b), respectively, have as covariance matrices the Kronecker products of the spatial and temporal covariance matrices (Blangiardo and Cameletti 2015; Fuentes et al. 2008). See Table 5 in Appendix 2 and Sect. 3.2 for further details. For alternative approaches to handling non-separable models, see among others Bakka et al. (2020), Gneiting (2002), and Sherman (2011). Third, the parameters \(\left\{{\omega }_{i}, {\upsilon }_{i},{\delta }_{it}\right\}\) are estimated at area level. For the cell level, they are the corresponding area level parameters, implying that they do not vary among cells within a given area \({\mathcal{A}}_{i}\). Fourth, to control for misalignment, for each area \({\mathcal{A}}_{i}\), the model component \({\overline{\Phi } }_{it}\) and the area values of the risk factors \({\overline{{\rm x}} }_{k,it}\) are taken as the block averages of the cells within \({\mathcal{A}}_{i}\) for a given time point \(t\) (Banerjee et al. 2015). That is, for \(i=1,\ldots ,{n}_{\mathcal{A}}\) and \(t=1,\ldots ,T\), \({\overline{{\rm x}} }_{k,it}={\left|{\mathcal{A}}_{i}\right|}^{-1}\int_{{\mathcal{A}}_{i}}{{\rm x}}_{k}\left({\varvec{s}},t\right){\rm d}{\varvec{s}}\) for \(k=1,\ldots ,K\) and \({\overline{\Phi } }_{it}={\left|{\mathcal{A}}_{i}\right|}^{-1}\int_{{\mathcal{A}}_{i}}\Phi \left({\varvec{s}},t\right){\rm d}{\varvec{s}}\), where \(\left|{\mathcal{A}}_{i}\right|=\int_{{\mathcal{A}}_{i}}1\,{\rm d}{\varvec{s}}\) denotes the size of \({\mathcal{A}}_{i}\).
The simplest procedure to estimate \({\overline{{\rm x}} }_{k,it}\) is to approximate \({\left|{\mathcal{A}}_{i}\right|}^{-1}\int_{{\mathcal{A}}_{i}}{{\rm x}}_{k}\left({\varvec{s}},t\right){\rm d}{\varvec{s}}\) for each time \(t\) by taking the average of the values of the cell risk factor \({{\rm x}}_{k,g\left({\mathcal{A}}_{i}\right)t}\) in \({\mathcal{A}}_{i}\): \({\overline{{\rm x}} }_{k,it}\approx \frac{1}{{n}_{{\mathcal{A}}_{i}}}\sum_{{{\varvec{s}}}_{g}\in {\mathcal{A}}_{i}}{{\rm x}}_{k}\left({{\varvec{s}}}_{g},t\right)\) for \(k=1,\ldots ,K\), \(i=1,\ldots ,{n}_{\mathcal{A}}\), and \(t=1,\ldots ,T\), with \({n}_{{\mathcal{A}}_{i}}\) denoting the number of cells in \({\mathcal{A}}_{i}\) (Lawson et al. 2012; Utazi et al. 2019). Estimation of \({\overline{\Phi } }_{it}={\left|{\mathcal{A}}_{i}\right|}^{-1}\int_{{\mathcal{A}}_{i}}\Phi \left({\varvec{s}},t\right){\rm d}{\varvec{s}}\) is discussed in Sect. 3.2.
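The block-average approximation of an area-level covariate can be sketched for a single time point as follows. The function name, the area labels, and the temperature values are hypothetical and serve only to illustrate the averaging step.

```python
from collections import defaultdict


def block_average(cell_values, cell_area):
    """Approximate the area-level covariate by the mean of its cell values.

    cell_values[g] is the covariate in cell g at a given time point and
    cell_area[g] is the label of the area that contains cell g.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, area in zip(cell_values, cell_area):
        sums[area] += value
        counts[area] += 1
    return {area: sums[area] / counts[area] for area in sums}


# Toy example: five cells with made-up temperatures, nested in two areas.
temperature = [26.0, 27.0, 28.0, 24.0, 25.0]
cell_area = ["A1", "A1", "A1", "A2", "A2"]

x_bar = block_average(temperature, cell_area)
```

With equally sized cells, the arithmetic mean is the natural discretization of the integral divided by the area size; cells of unequal size would call for area-weighted averages instead.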

Bayesian estimation of the FGG-GF model is initiated by defining the estimated parameter and hyperparameter vectors. Let \(\boldsymbol{\ell}=\left({\beta }_{0},{\beta }_{1},\ldots ,{\beta }_{K},{{\varvec{\zeta}}}_{1},\ldots ,{{\varvec{\zeta}}}_{K},{\varvec{\upomega}},{\varvec{\upsilon}},{\varvec{\phi}},\boldsymbol{\varsigma },{\varvec{\updelta}},{\varvec{\Phi}}\right)\) and \({\varvec{\Psi}}=\left({\sigma }_{{\beta }_{0}}^{2},{\sigma }_{{\beta }_{1}}^{2},\ldots ,{\sigma }_{{\beta }_{K}}^{2},{\sigma }_{{\zeta }_{1}}^{2},\ldots ,{\sigma }_{{\zeta }_{K}}^{2},{\sigma }_{\upomega }^{2},{\sigma }_{\upsilon }^{2},{\sigma }_{\phi }^{2},{\sigma }_{\boldsymbol{\varsigma }}^{2},{\sigma }_{\updelta }^{2},{\sigma }_{\Phi }^{2},\rho ,{\lambda }_{1},{\lambda }_{2},r\right){^{\prime}}\) denote the parameter and hyperparameter vectors, respectively, of the FGG-GF model in Eq. (13). The joint posterior distribution of the FGG-GF model is:

$$p\left(\boldsymbol{\ell},{\varvec{\Psi}}|\mathbf{y}\right)\propto p\left(\mathbf{y}|\boldsymbol{\ell},{\varvec{\Psi}}\right)p\left(\boldsymbol{\ell}|{\varvec{\Psi}}\right)p\left({\varvec{\Psi}}\right)$$

where \(p\left(.\right)\) denotes the probability density function. Below, we first discuss the likelihood function \(p\left(\mathbf{y}|\boldsymbol{\ell},{\varvec{\Psi}}\right)\) and next the joint prior of the GF at cell level. Based on the assumption that \(\mathbf{y}\) follows a Poisson distribution at area and cell levels (see Eq. (3a) and (3b)), the likelihood function \(p\left(\mathbf{y}|\boldsymbol{\ell},{\varvec{\Psi}}\right)\) is given by:

$$\begin{aligned} p\left( {{\mathbf{y}}|{{\boldsymbol{\ell} }},{{\varvec{\Psi}}}} \right) & = \mathop \prod \limits_{t = 1}^{T} \frac{{\exp \left( {- {\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right)} \right)\left( {{\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right)} \right)^{{{\mathbf{y}}_{{\varvec{t}}} }} }}{{{\mathbf{y}}_{t} !}} \\ & = \exp \left( {\log \left( {\mathop \prod \limits_{t = 1}^{T} \frac{{\exp \left( {- {\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right)} \right)\left( {{\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right)} \right)^{{{\mathbf{y}}_{{\varvec{t}}} }} }}{{{\mathbf{y}}_{t} !}}} \right)} \right) \\ & = \exp \left( {\mathop \sum \limits_{t = 1}^{T} \log \left( {\frac{{\exp \left( {- {\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right)} \right)\left( {{\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right)} \right)^{{{\mathbf{y}}_{{\varvec{t}}} }} }}{{{\mathbf{y}}_{t} !}}} \right)} \right) \\ & = \exp \left( {\mathop \sum \limits_{t = 1}^{T} \left( {{\mathbf{y}}_{{\varvec{t}}} \left( {\log \left( {{\mathbf{E}}_{t} } \right) + {{\varvec{\upeta}}}_{t} } \right) - {\mathbf{E}}_{t} \exp \left( {{{\varvec{\upeta}}}_{t} } \right) - \log \left( {{\mathbf{y}}_{t} !} \right)} \right)} \right). \\ \end{aligned}$$
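As a numerical check of the last line of the derivation, the following sketch evaluates the Poisson log-likelihood in the simplified form \(y(\log E+\eta)-E\exp(\eta)-\log(y!)\). The function name and the toy counts, expected values and linear predictors are hypothetical.

```python
import math


def poisson_loglik(y, E, eta):
    """Log-likelihood of counts y with mean E * exp(eta), summed over observations."""
    total = 0.0
    for y_i, E_i, eta_i in zip(y, E, eta):
        mu = E_i * math.exp(eta_i)  # Poisson mean E * exp(eta)
        # y * (log E + eta) - E * exp(eta) - log(y!), with log(y!) = lgamma(y + 1)
        total += y_i * (math.log(E_i) + eta_i) - mu - math.lgamma(y_i + 1)
    return total


# Hypothetical toy values for three area-time observations
y = [3, 0, 5]
E = [2.0, 1.5, 4.0]
eta = [0.1, -0.2, 0.3]
loglik = poisson_loglik(y, E, eta)
```

Working on the log scale avoids the overflow and underflow that the product form of the likelihood would cause for realistic sample sizes.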

The joint prior of the GF at cell level is obtained as follows. Since the GF in Eq. (8) at cell level is assumed to follow an AR1 model, the joint prior distribution of \({\varvec{\Phi}}=({{\varvec{\Phi}}}_{1},\ldots,{{\varvec{\Phi}}}_{T})\mathbf{^{\prime}}\), i.e., \(p\left({\varvec{\Phi}}|{\lambda }_{1},{\varvec{\Sigma}}\right)\), is (Godana et al. 2019):

$$p\left({{\varvec{\Phi}}}_{1},\ldots ,{{\varvec{\Phi}}}_{T}|{\lambda }_{1},{\varvec{\Sigma}}\right)=p\left({{\varvec{\Phi}}}_{T}|{{\varvec{\Phi}}}_{T-1},{{\varvec{\Phi}}}_{T-2},\ldots ,{{\varvec{\Phi}}}_{1},{\lambda }_{1},{\varvec{\Sigma}}\right)\times \ldots \times p\left({{\varvec{\Phi}}}_{2}|{{\varvec{\Phi}}}_{1},{\lambda }_{1},{\varvec{\Sigma}}\right)\times p\left({{\varvec{\Phi}}}_{1}|{\lambda }_{1},{\varvec{\Sigma}}\right).$$

Because of the AR1 process, we have:

$$p\left({{\varvec{\Phi}}}_{T}|{{\varvec{\Phi}}}_{T-1},{{\varvec{\Phi}}}_{T-2},\ldots ,{{\varvec{\Phi}}}_{1},{\lambda }_{1},{\varvec{\Sigma}}\right)=p\left({{\varvec{\Phi}}}_{T}|{{\varvec{\Phi}}}_{T-1},{\lambda }_{1},{\varvec{\Sigma}}\right).$$

Thus, the joint distribution of the latent spatiotemporal Gaussian process \({\varvec{\Phi}}\) is:

$$p\left({{\varvec{\Phi}}}_{1},\ldots ,{{\varvec{\Phi}}}_{T}|{\lambda }_{1},{\varvec{\Sigma}}\right)=p\left({{\varvec{\Phi}}}_{1}|{\lambda }_{1},{\varvec{\Sigma}}\right)\prod_{t=2}^{T}p\left({{\varvec{\Phi}}}_{t}|{{\varvec{\Phi}}}_{t-1},{\lambda }_{1},{\varvec{\Sigma}}\right).$$

The joint distribution of the GF in Eq. (18) consists of two probability distributions: \(p\left({{\varvec{\Phi}}}_{1}|{\lambda }_{1},{\varvec{\Sigma}}\right)\) and \(p\left({{\varvec{\Phi}}}_{t}|{{\varvec{\Phi}}}_{t-1},{\lambda }_{1},{\varvec{\Sigma}}\right)\) for \(t=2,\ldots ,T\). To obtain the joint distribution of \({\varvec{\Phi}}\), we need the joint distributions \(p\left({{\varvec{\Phi}}}_{1}|{\lambda }_{1},{\varvec{\Sigma}}\right)\) for \(g=1,\ldots ,{n}_{p}\) and \(t=1\), and \(p\left({{\varvec{\Phi}}}_{t}|{{\varvec{\Phi}}}_{t-1},{\lambda }_{1},{\varvec{\Sigma}}\right)\) for \(g=1,\ldots ,{n}_{p}\) and \(t=2,\ldots ,T\). Because the AR1 process is stationary, \({{\varvec{\Phi}}}_{1}|{\lambda }_{1},{\varvec{\Sigma}}\sim \mathcal{N}\left(0,\frac{{\varvec{\Sigma}}}{1-{\lambda }_{1}^{2}}\right).\) This is called the initial distribution for \(g=1,\ldots ,{n}_{p}\) and reads as:

$$p\left({{\varvec{\Phi}}}_{1}|{\lambda }_{1},{\varvec{\Sigma}}\right)={\left(\frac{1}{\sqrt{2\pi }}\right)}^{{n}_{p}}\frac{1}{{\left|\frac{{\varvec{\Sigma}}}{1-{\lambda }_{1}^{2}}\right|}^{1/2}}{\rm exp}\left(-\frac{1}{2}{{\varvec{\Phi}}}_{1}^{{{\prime}}}{\left(\frac{{\varvec{\Sigma}}}{1-{\lambda }_{1}^{2}}\right)}^{-1}{{\varvec{\Phi}}}_{1}\right).$$

Because \({\varvec{\Sigma}}={\sigma }_{\Phi }^{2}\mathcal{R}\), with \(\mathcal{R}\) defined in Eq. (10), we have:

$$\begin{aligned} p\left( {{{\varvec{\Phi}}}_{1} |\lambda_{1} ,{{\varvec{\Sigma}}}} \right) & = \left( {\frac{1}{{\sqrt {2\pi } }}} \right)^{{n_{p} }} \frac{1}{{\left| {\frac{{\sigma_{{\Phi }}^{2} }}{{1 - \lambda_{1}^{2} }}{\boldsymbol{\mathcal{R}}}} \right|^{1/2} }}\exp \left( {- \frac{1}{{2\sigma_{{\Phi }}^{2} }}{{\varvec{\Phi}}}_{1}^{{^{\prime}}} \left( {\frac{{\boldsymbol{\mathcal{R}}}}{{1 - \lambda_{1}^{2} }}} \right)^{- 1} {{\varvec{\Phi}}}_{1} } \right) \\ & = \left( {\frac{1}{{\sqrt {2\pi } }}} \right)^{{n_{p} }} \left( {\frac{{\sigma_{{\Phi }}^{2} }}{{1 - \lambda_{1}^{2} }}} \right)^{{- \frac{{n_{p} }}{2}}} \left| {\boldsymbol{\mathcal{R}}} \right|^{{- \frac{1}{2}}} \exp \left( {- \frac{{\left( {1 - \lambda_{1}^{2} } \right)}}{{2\sigma_{{\Phi }}^{2} }}{{\varvec{\Phi}}}_{1}^{{^{\prime}}} {\boldsymbol{\mathcal{R}}}^{- 1} {{\varvec{\Phi}}}_{1} } \right) \\ & \propto \exp \left( {- \frac{{\left( {1 - \lambda_{1}^{2} } \right)}}{{2\sigma_{{\Phi }}^{2} }}{{\varvec{\Phi}}}_{1}^{{^{\prime}}} {\boldsymbol{\mathcal{R}}}^{- 1} {{\varvec{\Phi}}}_{1} } \right). \\ \end{aligned}$$
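The scaling factor \(1/\left(1-{\lambda }_{1}^{2}\right)\) in the initial distribution follows from stationarity. A brief sketch of the argument, assuming that the AR1 recursion is \({{\varvec{\Phi}}}_{t}={\lambda }_{1}{{\varvec{\Phi}}}_{t-1}+{{\varvec{\upgamma}}}_{t}\) with innovations \({{\varvec{\upgamma}}}_{t}\sim \mathcal{N}\left(0,{\varvec{\Sigma}}\right)\) independent of \({{\varvec{\Phi}}}_{t-1}\), that \(\left|{\lambda }_{1}\right|<1\), and writing \(\mathbf{V}\) (a symbol introduced here for illustration) for the common covariance matrix of \({{\varvec{\Phi}}}_{t}\) and \({{\varvec{\Phi}}}_{t-1}\):

$$\begin{aligned} {\rm Var}\left({{\varvec{\Phi}}}_{t}\right) & = {\lambda }_{1}^{2}\,{\rm Var}\left({{\varvec{\Phi}}}_{t-1}\right)+{\varvec{\Sigma}} \\ \mathbf{V} & = {\lambda }_{1}^{2}\mathbf{V}+{\varvec{\Sigma}} \\ \mathbf{V} & = \frac{{\varvec{\Sigma}}}{1-{\lambda }_{1}^{2}}. \end{aligned}$$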

The joint distribution \(p\left({{\varvec{\Phi}}}_{t}|{{\varvec{\Phi}}}_{t-1},{\lambda }_{1},{\varvec{\Sigma}}\right)\) for \(g=1,\ldots ,{n}_{p}\) and \(t=2,\ldots ,T\) is given by:

$$\begin{aligned} \prod_{t=2}^{T}p\left({\varvec{\Phi}}_{t}|{\varvec{\Phi}}_{t-1},\lambda_{1},{\varvec{\Sigma}}\right) & = \prod_{t=2}^{T}\left(\frac{1}{\sqrt{2\pi}}\right)^{n_{p}}\frac{1}{\left|\sigma_{\Phi}^{2}{\boldsymbol{\mathcal{R}}}\right|^{1/2}}\exp\left(-\frac{1}{2}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)^{\prime}\left(\sigma_{\Phi}^{2}{\boldsymbol{\mathcal{R}}}\right)^{-1}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)\right) \\ & = \prod_{t=2}^{T}\left(2\pi\right)^{-\frac{n_{p}}{2}}\left(\sigma_{\Phi}^{2}\right)^{-\frac{n_{p}}{2}}\left|{\boldsymbol{\mathcal{R}}}\right|^{-\frac{1}{2}}\exp\left(-\frac{1}{2}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)^{\prime}\left(\sigma_{\Phi}^{2}{\boldsymbol{\mathcal{R}}}\right)^{-1}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)\right) \\ & = \left(2\pi\right)^{-\frac{n_{p}\left(T-1\right)}{2}}\left(\sigma_{\Phi}^{2}\right)^{-\frac{n_{p}\left(T-1\right)}{2}}\left|{\boldsymbol{\mathcal{R}}}\right|^{-\frac{\left(T-1\right)}{2}}\exp\left(-\frac{1}{2\sigma_{\Phi}^{2}}\sum_{t=2}^{T}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)^{\prime}{\boldsymbol{\mathcal{R}}}^{-1}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)\right) \\ & \propto \exp\left(-\frac{1}{2\sigma_{\Phi}^{2}}\sum_{t=2}^{T}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)^{\prime}{\boldsymbol{\mathcal{R}}}^{-1}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)\right). \\ \end{aligned}$$

Finally, the joint prior distribution for the AR1 process, denoted as \(p\left({\varvec{\Phi}}|{\mathbf{Q}}_{{\varvec{\Phi}}}^{-1}\right)\), is given by multiplying Eqs. (20) and (21). It reads:

$$\begin{aligned} p\left({\varvec{\Phi}}|{\mathbf{Q}}_{{\varvec{\Phi}}}^{-1}\right) & \propto \exp\left(-\frac{\left(1-\lambda_{1}^{2}\right)}{2\sigma_{\Phi}^{2}}{\varvec{\Phi}}_{1}^{\prime}{\boldsymbol{\mathcal{R}}}^{-1}{\varvec{\Phi}}_{1}\right)\times \exp\left(-\frac{1}{2\sigma_{\Phi}^{2}}\sum_{t=2}^{T}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)^{\prime}{\boldsymbol{\mathcal{R}}}^{-1}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)\right) \\ & \propto \exp\left(-\frac{1}{2}\left(\frac{1}{\sigma_{\Phi}^{2}}\left(\left(1-\lambda_{1}^{2}\right){\varvec{\Phi}}_{1}^{\prime}{\boldsymbol{\mathcal{R}}}^{-1}{\varvec{\Phi}}_{1}+\sum_{t=2}^{T}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)^{\prime}{\boldsymbol{\mathcal{R}}}^{-1}\left({\varvec{\Phi}}_{t}-\lambda_{1}{\varvec{\Phi}}_{t-1}\right)\right)\right)\right) \\ & \propto \exp\left(-\frac{1}{2}{\varvec{\Phi}}^{\prime}{\mathbf{Q}}_{{\varvec{\Phi}}}{\varvec{\Phi}}\right) \\ & \propto {\mathcal{N}}\left(0,{\mathbf{Q}}_{{\varvec{\Phi}}}^{-1}\right) \\ \end{aligned}$$

with \({\mathbf{Q}}_{{\varvec{\Phi}}}^{-1}={\varvec{\Sigma}}\) denoting the covariance matrix of the GF in Eq. (12).

The priors and joint priors of \(\boldsymbol{\ell}=\left({\beta }_{0},{\beta }_{1},\ldots ,{\beta }_{K},{{\varvec{\zeta}}}_{1},\ldots ,{{\varvec{\zeta}}}_{K},{\varvec{\upomega}},{\varvec{\upsilon}},{\varvec{\phi}},\boldsymbol{\varsigma },{\varvec{\updelta}},{\varvec{\Phi}}\right)\) and the hyperpriors of \({\varvec{\Psi}}=\left({\sigma }_{{\beta }_{0}}^{2},{\sigma }_{{\beta }_{1}}^{2},\ldots ,{\sigma }_{{\beta }_{K}}^{2},{\sigma }_{{\zeta }_{1}}^{2},\ldots ,{\sigma }_{{\zeta }_{K}}^{2},{\sigma }_{\upomega }^{2},{\sigma }_{\upsilon }^{2},{\sigma }_{\phi }^{2},{\sigma }_{\boldsymbol{\varsigma }}^{2},{\sigma }_{\updelta }^{2},{\sigma }_{\Phi }^{2},\rho ,{\lambda }_{1},{\lambda }_{2},r\right){^{\prime}}\) are presented in Table 5 in Appendix 2 (see footnote 12). Details can be found in Jaya and Folmer (2021a) and the references therein. The prior distributions are assumed to be independent, implying that:

$$\begin{aligned} p\left( {\boldsymbol{\ell} |{{\varvec{\Psi}}}} \right) & = {\mathcal{N}}\left( {0,\sigma_{{\beta_{0} }}^{2} } \right) \times {\mathcal{N}}\left( {0;{\mathbf{Q}}_{{\varvec{\beta}}}^{- 1} } \right) \times \mathop \prod \limits_{k = 1}^{K} {\mathcal{N}}\left( {0;{\mathbf{Q}}_{{{\varvec{\zeta}}_{k} }}^{- 1} } \right) \times {\mathcal{N}}\left( {0,{\mathbf{Q}}_{{\upomega }}^{- 1} } \right) \times {\mathcal{N}}\left( {0,{\mathbf{Q}}_{\upsilon }^{- 1} } \right) \\ & \quad \times {\mathcal{N}}\left( {0,{\mathbf{Q}}_{\phi }^{- 1} } \right) \times {\mathcal{N}}\left( {0,{\mathbf{Q}}_{\varvec{\varsigma }}^{- 1} } \right) \times {\mathcal{N}}\left( {0,{\mathbf{Q}}_{{{\varvec{\updelta}}}}^{- 1} } \right) \times {\mathcal{N}}\left( {0,{\mathbf{Q}}_{{{\varvec{\Phi}}}}^{- 1} } \right). \\ \end{aligned}$$

The joint hyperparameter distribution is given by (see footnote 13):

$$\begin{aligned} p\left( {{\varvec{\Psi}}} \right) & = p\left( {\sigma_{{\beta_{0} }}^{2} } \right)p\left( {\sigma_{{\beta_{1} }}^{2} } \right) \ldots p\left( {\sigma_{{\beta_{K} }}^{2} } \right)p\left( {\sigma_{{\zeta_{1} }}^{2} } \right) \ldots p\left( {\sigma_{{\zeta_{K} }}^{2} } \right)p\left( {\sigma_{{\upomega }}^{2} } \right)p\left( {\sigma_{\upsilon }^{2} } \right)p\left( {\sigma_{\phi }^{2} } \right)p\left( {\sigma_{\varvec{\varsigma }}^{2} } \right)p\left( {\sigma_{{\Phi }}^{2} } \right)p\left( {\sigma_{{\updelta }}^{2} } \right)p\left( \rho \right)p\left( {\lambda_{1} } \right)p\left( {\lambda_{2} } \right)p\left( \kappa \right) \\ & = \mathop \prod \limits_{j = 1}^{{\dim \left( {{\varvec{\Psi}}} \right)}} p\left( {{\Psi }_{j} } \right) \quad {\text{for}} \quad j = 1, \ldots ,J. \\ \end{aligned}$$

Given the likelihood function, the joint prior distributions for the parameter vectors and the joint hyperparameter, the joint posterior distribution in Eq. (14) can be written as:

$$\begin{aligned} p\left({\boldsymbol{\ell}},{\varvec{\Psi}}|{\mathbf{y}}\right) & \propto \exp\left(\sum_{t=1}^{T}\left({\mathbf{y}}_{t}\log\left({\mathbf{E}}_{t}\exp\left({\varvec{\upeta}}_{t}\right)\right)-{\mathbf{E}}_{t}\exp\left({\varvec{\upeta}}_{t}\right)\right)\right) \\ & \quad \times \exp\left(-\frac{1}{2\sigma_{\beta_{0}}^{2}}\beta_{0}^{2}\right)\times \exp\left(-\frac{1}{2}{\varvec{\beta}}^{\prime}{\mathbf{Q}}_{{\varvec{\beta}}}{\varvec{\beta}}\right)\times \exp\left(-\frac{1}{2}\sum_{k=1}^{K}{\varvec{\zeta}}_{k}^{\prime}{\mathbf{Q}}_{{\varvec{\zeta}}_{k}}{\varvec{\zeta}}_{k}\right) \\ & \quad \times \exp\left(-\frac{1}{2}{\varvec{\upomega}}^{\prime}{\mathbf{Q}}_{\upomega}{\varvec{\upomega}}\right)\times \exp\left(-\frac{1}{2}{\varvec{\upsilon}}^{\prime}{\mathbf{Q}}_{\upsilon}{\varvec{\upsilon}}\right)\times \exp\left(-\frac{1}{2}{\varvec{\phi}}^{\prime}{\mathbf{Q}}_{\phi}{\varvec{\phi}}\right)\times \exp\left(-\frac{1}{2}{\boldsymbol{\varsigma}}^{\prime}{\mathbf{Q}}_{\boldsymbol{\varsigma}}{\boldsymbol{\varsigma}}\right) \\ & \quad \times \exp\left(-\frac{1}{2}{\varvec{\updelta}}^{\prime}{\mathbf{Q}}_{{\varvec{\updelta}}}{\varvec{\updelta}}\right)\times \exp\left(-\frac{1}{2}{\varvec{\Phi}}^{\prime}{\mathbf{Q}}_{{\varvec{\Phi}}}{\varvec{\Phi}}\right)\times \prod_{j=1}^{\dim\left({\varvec{\Psi}}\right)}p\left({\Psi}_{j}\right). \\ \end{aligned}$$

The fusion area-cell spatiotemporal generalized geoadditive-Gaussian Markov random field model

A continuously indexed GF typically has a dense covariance matrix, such as \({\mathbf{Q}}_{{\varvec{\Phi}}}^{-1}\) in Eq. (24), which makes numerical estimation complex and time-consuming; this is commonly referred to as the “big n” problem. Lindgren et al. (2011) proposed to solve the big \(n\) problem by substituting a sparse, discretely indexed Gaussian Markov Random Field (GMRF) for the continuously indexed GF (see footnote 14). For a GMRF, the full conditional distribution of each component \({\upgamma }_{g,t}\), for \(g=1,\ldots ,{n}_{p}\), depends only on a set of neighbors \(N\left(g\right)\):

$$p\left({\upgamma }_{g,t}|{{\varvec{\upgamma}}}_{-g,t}\right)=p\left({\upgamma }_{g,t}|{{\varvec{\upgamma}}}_{N\left(g\right),t}\right) \quad {{\rm for}}\,g=1,\ldots ,{n}_{p}\,{{\rm and}}\,t=1,\ldots ,T$$

where \({{\varvec{\upgamma}}}_{-g,t}\) denotes all the elements of \({{\varvec{\upgamma}}}_{t}\) except \({\upgamma }_{g,t}\), and \({{\varvec{\upgamma}}}_{N\left(g\right),t}\) denotes all the elements of \({{\varvec{\upgamma}}}_{t}\) in the neighborhood \(N\left(g\right)\) of \({\upgamma }_{g,t}\). The vector of elements of \({{\varvec{\upgamma}}}_{t}\) not in the neighborhood \(N\left(g\right)\) of \({\upgamma }_{g,t}\) is denoted as \({{\varvec{\upgamma}}}_{-\left\{g,N\left(g\right)\right\},t}\). Given its neighbors, \({\upgamma }_{g,t}\) is conditionally independent of the elements of \({{\varvec{\upgamma}}}_{-\left\{g,N\left(g\right)\right\},t}\). The conditional independence relationship is written as:

$${\upgamma }_{g,t}\perp {{\varvec{\upgamma}}}_{-\left\{g,N\left(g\right)\right\},t}|{{\varvec{\upgamma}}}_{N\left(g\right),t} \quad {{\rm for }}\,t=1,\ldots ,T\,{{\rm and }}\,g=1,\ldots ,{n}_{p}.$$

If Eq. (26) holds, the precision matrix \(\mathbf{Q}={{\varvec{\Sigma}}}^{-1}\) of \({{\varvec{\upgamma}}}_{t}\) is sparse for each \(t\). In other words, for a pair \(g\) and \(h\) with \(g\ne h\) and \(h\notin \left\{N(g)\right\}\), we have:

$${\upgamma }_{g,t}\perp {\upgamma }_{h,t}|{{\varvec{\upgamma}}}_{-\left(g,h\right),t}\iff \mathbf{Q}(g,h)=0 \quad {{\rm for}}\, g=1,\ldots ,{n}_{p} \,{{\rm and }}\,t=1,\ldots ,T$$

implying that the nonzero pattern in the precision matrix \(\mathbf{Q}\) is given by the neighborhood structure. Conversely,

$$\mathbf{Q}\left(g,h\right)\ne 0,\,{{\rm if }}\,h\in \left\{N\left(g\right)\right\}.$$
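The contrast between a sparse precision matrix and a dense covariance matrix can be illustrated on a toy example. The sketch below assumes a hypothetical chain of four cells, a hypothetical diagonal offset `kappa`, and a small Gauss-Jordan helper `invert` written only for this illustration; none of these are part of the FGG-GMRF implementation.

```python
def invert(M):
    """Invert a small dense matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A]


# Hypothetical neighborhood structure N(g): four cells on a line
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
n = len(neighbors)
kappa = 1.0  # hypothetical offset that keeps Q positive definite

# Q(g, h) is nonzero only on the diagonal and for neighboring cells
Q = [[0.0] * n for _ in range(n)]
for g, nbrs in neighbors.items():
    Q[g][g] = len(nbrs) + kappa
    for h in nbrs:
        Q[g][h] = -1.0

Sigma = invert(Q)  # covariance matrix implied by Q

# Off-diagonal zeros of Q: exactly the pairs of cells that are not neighbors
zero_pairs = [(g, h) for g in range(n) for h in range(n) if g != h and Q[g][h] == 0.0]
# Every entry of the covariance matrix is nonzero
covariance_is_dense = all(abs(Sigma[g][h]) > 1e-9 for g in range(n) for h in range(n))
```

Cells 0 and 3 are marginally correlated (nonzero entry in `Sigma`) although `Q[0][3]` is zero, because they are conditionally independent given the cells in between.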

Lindgren et al. (2011) proposed the Linear Stochastic Partial Differential Equation (LSPDE) approach (see footnote 15), based on a mesh of the study area, to transform a dense Matérn covariance matrix of a GF, such as in Eq. (11), into a sparse Matérn precision matrix of a GMRF (see Appendix 1). Specifically, for \(t=1, \ldots , T,\) the GF \({\upgamma }_{g,t}\) in Eq. (12) with Matérn covariance function \({\varvec{\Sigma}}={\sigma }_{\Phi }^{2}\mathcal{R}\) is transformed into a GMRF, \({\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\left({\varvec{s}}\right)\sim \mathcal{N}\left(0,{\stackrel{\sim }{\mathbf{Q}}}_{s}^{-1}\right),\) with sparse spatial precision matrix \({\stackrel{\sim }{\mathbf{Q}}}_{s}\) defined in Eq. (47). Consequently, for \(t=1,\ldots ,T\), the joint latent spatiotemporal GF \({{\varvec{\Phi}}}_{t}={\left({\Phi }_{1t},\ldots ,{\Phi }_{{n}_{p}t}\right)}^{{{\prime}}}\) at cell level in Eq. (12) is transformed into a GMRF \({\stackrel{\sim }{{\varvec{\Phi}}}}_{t}\) as:

$${\stackrel{\sim }{{\varvec{\Phi}}}}_{t}={\lambda }_{1}{\stackrel{\sim }{{\varvec{\Phi}}}}_{t-1}+{\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\,\,{\rm and }\,\,{\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}({\varvec{s}})\sim \mathcal{N}\left(0,{\stackrel{\sim }{\mathbf{Q}}}_{s}^{-1}\right)$$

with the initial value distributed as: \({\stackrel{\sim }{{\varvec{\Phi}}}}_{1}\sim \mathcal{N}\left(0,{\stackrel{\sim }{\mathbf{Q}}}_{s}^{-1}/\left(1-{\lambda }_{1}^{2}\right)\right)\). The joint distribution of the \(T\times L\)-dimensional cell level GMRF \(\stackrel{\sim }{{\varvec{\Phi}}}=\left({\stackrel{\sim }{{\varvec{\Phi}}}}_{1}^{{{\prime}}},\ldots ,{\stackrel{\sim }{{\varvec{\Phi}}}}_{T}^{{{\prime}}}\right)\boldsymbol{^{\prime}}\) is:

$$\stackrel{\sim }{{\varvec{\Phi}}}\sim \mathcal{N}\left(0,{\stackrel{\sim }{\mathbf{Q}}}_{\stackrel{\sim }{{\varvec{\Phi}}}}^{-1}\right)$$

with precision matrix \({\stackrel{\sim }{\mathbf{Q}}}_{\stackrel{\sim }{{\varvec{\Phi}}}}={\mathbf{Q}}_{T}\otimes {\stackrel{\sim }{\mathbf{Q}}}_{s}\), i.e., the Kronecker product of the autoregressive temporal precision matrix \({\mathbf{Q}}_{T}\) (see Table 5 in Appendix 2) and the sparse Matérn spatial precision matrix \({\stackrel{\sim }{\mathbf{Q}}}_{s}\).
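The Kronecker structure can be checked numerically against the AR1 quadratic form derived for the joint prior. The sketch below assumes the standard tridiagonal precision matrix of a stationary AR1 process for the temporal part (Table 5 is not reproduced here, so this form is an assumption) and uses a hypothetical \(2\times 2\) spatial precision matrix that already absorbs \(1/{\sigma }_{\Phi }^{2}\); all helper names and numbers are illustrative.

```python
def kron(A, B):
    """Kronecker product of two dense matrices stored as lists of lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def ar1_precision(T, lam):
    """Tridiagonal precision matrix of a stationary AR1 process (T >= 2, unit innovation variance)."""
    Q = [[0.0] * T for _ in range(T)]
    for t in range(T):
        Q[t][t] = 1.0 if t in (0, T - 1) else 1.0 + lam ** 2
        if t + 1 < T:
            Q[t][t + 1] = Q[t + 1][t] = -lam
    return Q


def quad_form(x, Q):
    """Quadratic form x' Q x."""
    return sum(x[i] * Q[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))


lam, T = 0.6, 3                       # hypothetical AR1 coefficient and number of time points
Q_s = [[2.0, -1.0], [-1.0, 2.0]]      # hypothetical spatial precision for two mesh nodes
Q_full = kron(ar1_precision(T, lam), Q_s)

# Hypothetical field values, one vector per time point, stacked time-major
phi = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
stacked = [v for phi_t in phi for v in phi_t]
lhs = quad_form(stacked, Q_full)

# AR1 form: (1 - lam^2) phi_1' Q_s phi_1 + sum_{t>=2} (phi_t - lam phi_{t-1})' Q_s (phi_t - lam phi_{t-1})
residuals = [[phi[t][g] - lam * phi[t - 1][g] for g in range(2)] for t in range(1, T)]
rhs = (1 - lam ** 2) * quad_form(phi[0], Q_s) + sum(quad_form(r, Q_s) for r in residuals)
```

Here `lhs` and `rhs` agree up to floating-point error. In applications the Kronecker product would be kept in sparse storage rather than expanded into dense lists as done here.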

To facilitate the estimation of \({\ddot{{\varvec{\Phi}}}}_{t}={\left({\overline{{\varvec{\Phi}}} }_{t},{{\varvec{\Phi}}}_{t}\right)}^{{\prime}}\), with \({\overline{{\varvec{\Phi}}} }_{t}={\left({\overline{\Phi } }_{1t},\ldots ,{\overline{\Phi } }_{it},\ldots ,{\overline{\Phi } }_{{n}_{\mathcal{A}}t}\right)}^{{{\prime}}}\) and \({{\varvec{\Phi}}}_{t}=({\Phi }_{1t},\ldots ,{\Phi }_{gt},\ldots ,{\Phi }_{{n}_{p}t})\boldsymbol{^{\prime}}\) for \(t=1,\ldots ,T\) (Eq. (13)), as a GMRF \(\stackrel{\sim }{{\varvec{\Phi}}}\), we introduce the \(\left(\left({n}_{\mathcal{A}}+{n}_{p}\right)\times L\right)\)-dimensional partitioned or block matrix \(\mathbf{H}={\left[\begin{array}{cc}{\mathbf{H}}_{1}^{\prime}& {\mathbf{H}}_{2}^{\prime}\end{array}\right]}^{\prime}\) that maps the GMRF values at the \(L\) triangulation nodes (see footnote 16) to the \({n}_{\mathcal{A}}\) areas and \({n}_{p}\) cells, respectively. The elements of \({\mathbf{H}}_{1}\) correspond to the block average \({\left|{\mathcal{A}}_{i}\right|}^{-1}\underset{{\mathcal{A}}_{i}}{\overset{}{\int }}\Phi \left({\varvec{s}},t\right){\rm d}{\varvec{s}}\) for \(i=1,\ldots ,{n}_{\mathcal{A}}\), \(t=1,\ldots ,T\) and \({\varvec{s}}={({\varvec{s}}}_{1({\mathcal{A}}_{1})},\ldots ,{{\varvec{s}}}_{g\left({\mathcal{A}}_{i}\right)},\ldots , {{\varvec{s}}}_{{n}_{p}({\mathcal{A}}_{{n}_{\mathcal{A}}})}){^{\prime}}\). That is, with \({V}_{i}\) denoting the number of vertices in area \({\mathcal{A}}_{i}\), \({\mathbf{H}}_{1}\) is the \(({n}_{\mathcal{A}}\times L)\) sparse matrix with elements:

$${\mathbf{H}}_{1}\left(i,l\right)=\begin{cases}1/{V}_{i} & \text{if vertex } l \text{ is in area } {\mathcal{A}}_{i}\\ 0 & \text{otherwise.}\end{cases}$$

Consequently, \({\overline{\Phi } }_{it}\approx \sum_{l=1}^{L}{\mathbf{H}}_{1}\left(i,l\right){\stackrel{\sim }{\Phi }}_{t,l}\) for \(t=1,\ldots ,T\), with \({\stackrel{\sim }{\Phi }}_{t,l}\) the \((t,l)\)th element of \(\stackrel{\sim }{{\varvec{\Phi}}}\).
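A minimal sketch of how the area-level projection matrix can be assembled and applied is given below. The function name `build_H1`, the vertex-to-area assignment and the GMRF values are hypothetical; vertices that lie outside every area (for example in a mesh extension) are marked with `None`, which is an assumption of this sketch.

```python
def build_H1(vertex_area, n_areas):
    """Build H1 with H1[i][l] = 1 / V_i if vertex l lies in area i, and 0 otherwise."""
    L = len(vertex_area)
    counts = [0] * n_areas  # V_i: number of mesh vertices in each area
    for area in vertex_area:
        if area is not None:
            counts[area] += 1
    H1 = [[0.0] * L for _ in range(n_areas)]
    for l, area in enumerate(vertex_area):
        if area is not None:
            H1[area][l] = 1.0 / counts[area]
    return H1


# Hypothetical mesh: six vertices, two areas, last vertex outside both areas
vertex_area = [0, 0, 1, 1, 1, None]
H1 = build_H1(vertex_area, n_areas=2)

# Hypothetical GMRF values at the six vertices for one time point t
phi_tilde_t = [1.0, 3.0, 2.0, 4.0, 6.0, 9.0]

# Area-level block averages: phi_bar_{it} ~ sum_l H1(i, l) * phi_tilde_{t, l}
phi_bar_t = [sum(h * v for h, v in zip(row, phi_tilde_t)) for row in H1]
```

Each row of `H1` that contains at least one vertex sums to one, so the projection is a plain average of the vertex values inside the area.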

\({\mathbf{H}}_{2}\) transforms the \((T\times L)\) elements of \(\stackrel{\sim }{{\varvec{\Phi}}}\) into the \(({n}_{p}\times T)\) elements of \({\varvec{\Phi}}\), with the value of the \((g,t)\)th element of \({\varvec{\Phi}}\) corresponding to the value of the \((t,l)\)th element of \(\stackrel{\sim }{{\varvec{\Phi}}}\). That is, \({\mathbf{H}}_{2}\) is an \(({n}_{p}\times L)\) sparse matrix such that, for the \(g\)th cell, \({\Phi }_{g,t}\approx \sum_{l=1}^{L}{\mathbf{H}}_{2}(g,l){\stackrel{\sim }{\Phi }}_{t,l}\) for \(g=1,\ldots ,{n}_{p}\) and \(t=1,\ldots ,T\), with elements:

$${\mathbf{H}}_{2}\left(g,l\right)=\begin{cases}1 & \text{if vertex } l \text{ is at location } {{\varvec{s}}}_{g}\\ 0 & \text{otherwise.}\end{cases}$$

Given the partitioned matrix \({\varvec{H}}\), the FGG-GF in Eq. (13) can be written as:

$${{\varvec{\eta}}}_{t}={\beta }_{0}{1}_{({n}_{\mathcal{A}}+{n}_{p})}+\sum\limits_{{\varvec{k}}=1}^{{\varvec{K}}}\left({\beta }_{k}+{\zeta }_{k,t}\right){{\varvec{z}}}_{k,t}+\ddot{{\varvec{\upomega}}}+\ddot{{\varvec{\upsilon}}}+{\phi }_{t}{1}_{({n}_{\mathcal{A}}+{n}_{p})}+{\varsigma }_{t}{1}_{({n}_{\mathcal{A}}+{n}_{p})}+{\ddot{{\varvec{\updelta}}}}_{t}+\mathbf{H}{\stackrel{\sim }{{\varvec{\Phi}}}}_{t} \quad {{\rm for}}\,t=1,\ldots ,T.$$

Because the FGG-GMRF model in Eq. (33) belongs to the class of the latent Gaussian models, it can be estimated using INLA-LSPDE (Cameletti et al. 2013; Gómez-Rubio et al. 2021). Predictions of the relative risk \({\varvec{\theta}}\) for the cells of the triangulated domain can be obtained via the posterior conditional distribution of \(\stackrel{\sim }{{\varvec{\Phi}}}\), given \({\mathbf{Q}}_{\stackrel{\sim }{{\varvec{\Phi}}}}\) for all the \(L\) vertices and the posterior distributions of the parameter and hyperparameters in Eq. (24). The FGG-GMRF model setup in Eq. (33) implies that INLA generates predictions for the target cells during model-fitting.

Application: relative dengue risk at subdistrict level in Bandung, 2012–2018

Bandung city is divided into 30 districts and 151 subdistricts. The districts are third level administrative units within a province, and the subdistricts are fourth level administrative units. Every district in Bandung city consists of a minimum of four subdistricts. While the number of dengue incidences in Bandung city is reported at district level, for efficient and effective prevention and control, figures at the subdistrict scale are needed. In Sect. 4.1, we discuss and explore the data; in Sect. 4.2, we estimate the FGG-GMRF model; and in Sect. 4.3, we use the model to predict the relative dengue risk at subdistrict level.

Data and exploratory data analysis

The data were obtained from existing databases. Annual observations at district level on the population at risk (see Online Resource 1) and monthly dengue incidence at district level (see Fig. 1) were obtained from the Bandung Central Statistical Bureau (2012, 2013, 2014, 2015, 2016, 2017, 2018) and the Bandung Health Department (2013, 2014, 2015, 2016, 2017, 2018, 2019), respectively. From January 1, 2012, until December 31, 2018, a total of 26,095 dengue incidences (1,030 per 100,000 inhabitants) were reported. The monthly incidence pattern is highly similar from year to year and is taken as constant. In particular, the mean Pearson correlation coefficient of monthly dengue incidence for the years 2012–2018 is approximately 0.70. In addition, there were no major shifts across the years in the annual cycle. Hence, as in Jaya and Folmer (2021a), we only consider the monthly cycle for the 30 districts (see footnote 17). Figure 1 shows a high number of incidences from January to July, followed by a sharp drop in July and a low level of incidences for the remainder of the year. The monthly incidences range from 5 to 265 (0.197–10.461 per 100,000 inhabitants, respectively).

Fig. 1 Monthly dengue incidences at district level (in 1,000), Bandung City, Indonesia, 2012–2018 (1–30: district codes, see Online Resource 1)

We derived the crude risk rate (i.e., the standardized incidence ratio, SIR) as the ratio of the observed to the expected number of incidences (see Eq. (2)). Figure 2 presents the monthly dengue SIR per district for the period 2012–2018. It ranges from 0.229 to 3.132. Most districts, primarily those in northern and southern Bandung, have a SIR greater than one from January to July. The districts with the highest SIR are Buah Batu (\({\rm id}=20\)), Lengkong (\({\rm id}=27\)) and Rancasari (\({\rm id}=29\)).
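The SIR calculation can be sketched as follows. Eq. (2) is not reproduced in this section, so the sketch assumes the common internal standardization in which the expected count of a district is its population multiplied by the overall incidence rate; the function name and the toy counts and populations are hypothetical.

```python
def standardized_incidence_ratio(cases, population):
    """SIR per district: observed cases divided by expected cases.

    Assumption of this sketch: expected cases = population * overall incidence rate.
    """
    overall_rate = sum(cases) / sum(population)
    expected = [pop * overall_rate for pop in population]
    return [y / e for y, e in zip(cases, expected)]


# Hypothetical toy data for two districts in one month
cases = [30, 10]
population = [1000, 1000]
sir = standardized_incidence_ratio(cases, population)
```

A SIR above one indicates more cases than expected under the overall rate, and a SIR below one indicates fewer.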

Fig. 2 Monthly dengue standardized incidence ratio (SIR) at district level, Bandung City, Indonesia, 2012–2018

As observed by Ebi and Nealon (2016), Jaya and Folmer (2021a), and Zellweger et al. (2017), among others, socioeconomic and environmental conditions are the main factors influencing dengue disease risk over space and time. However, for Bandung, socioeconomic risk variables such as income, education, occupation and living conditions are unavailable for districts and cells. These factors are accounted for by the random effects, thus controlling for omitted variable bias (Jaya and Folmer 2020, 2021a, b). By contrast, the monthly averages of the weather risk variables precipitation (mm), temperature (°C), solar radiation (kJ m⁻² day⁻¹) and water vapor pressure (kPa) are available at cell level from the WorldClim2.0 database (Fick and Hijmans 2017), based on observations from 19 weather stations surrounding Bandung in West Java for 1970–2000. We selected cells with a resolution of 1 km². Accordingly, Bandung city was divided into 179 cells. Table 1 presents the monthly average weather variables for the period 1970–2000.

Table 1 Descriptive statistics for the monthly averages of the weather variables

Figure 3 shows that precipitation and water vapor pressure are relatively high in the period November–April, temperature is relatively high in the period April–June and in October, and solar radiation is relatively high in the period August–November. Figure 4 indicates that the average temperature varies strongly over space. The minimum average temperature occurs in the northern districts, which are mountainous areas at approximately 800 m above sea level. In addition, they are densely covered with forests and have relatively high precipitation. The central districts, where the governmental facilities and businesses are located, also have high precipitation in the period November to April. Moreover, they have higher temperatures than northern Bandung because of differences in forest density and elevation. They also have high population density, high mobility and high air pollution (Jaya and Folmer 2020).

Fig. 3 Monthly variation of the mean annual weather variables: (a) precipitation (mm), (b) temperature (°C), (c) solar radiation (1000 kJ m⁻² day⁻¹), and (d) water vapor pressure (kPa)

Fig. 4 The spatiotemporal variation of the weather variables: (a) precipitation (mm), (b) temperature (°C), (c) solar radiation (1000 kJ m⁻² day⁻¹), and (d) water vapor pressure (kPa)

In preparation for estimating the FGG-GMRF model in Eq. (33), we calculated the variance inflation factor (VIF) of the weather variables to check multicollinearity. Table 2 shows that the maximum VIF is 9.312 (for water vapor pressure) which is below the critical (rule of thumb) value of 10, indicating that the correlation among variables is unlikely to affect estimation (Montgomery et al. 2012).

Table 2 Variance inflation factors (VIF) for precipitation, average temperature, solar radiation and water vapor pressure

The estimated generalized Geoadditive-Gaussian Markov Random Field model and prediction

The first step in estimating the FGG-GMRF model given by Eq. (33) is the construction of a triangle mesh of the study area for the application of the Finite Element Method (FEM) and LSPDE approach.

As described in Appendix 1, the accuracy of the FEM calculations and the precision of the forecasts are a function of the number of vertices in the mesh, which is determined by the edge length. Blangiardo and Cameletti (2015) and Utazi et al. (2019) recommended varying the edge length between the minimum distance and approximately 5–8% of the maximum distance between any two cells (18,681 m). Hence, we considered \(G=\left\{966, 831, 717, 616, 548, 501\right\}\) vertices, corresponding to edge lengths varying from 1000 to 1500 m in steps of 100 m (see footnote 18 and Online Resource 1). The data and R code are available in Online Resource 2.
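The arithmetic behind the candidate edge lengths can be reproduced as follows. The pairing of each edge length with a vertex count assumes that the vertex counts are listed in order of increasing edge length (shorter edges give more vertices); this ordering is an assumption of the sketch, and the variable names are illustrative.

```python
max_distance = 18681  # maximum distance between any two cells, in metres

# Recommended upper range for the edge length: roughly 5-8% of the maximum distance
lower_bound = 0.05 * max_distance   # about 934 m
upper_bound = 0.08 * max_distance   # about 1494 m

# Candidate edge lengths from 1000 m to 1500 m in steps of 100 m
edge_lengths = list(range(1000, 1501, 100))

# Vertex counts reported for the six meshes, assumed to be ordered by increasing edge length
vertices = [966, 831, 717, 616, 548, 501]
mesh_options = dict(zip(edge_lengths, vertices))
```

The largest candidate (1500 m) lies marginally above 8% of the maximum distance, in line with the "approximately" in the recommendation.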

Before turning to the estimations of the spatiotemporal FGG-GMRF model, we make the following observations. First, as explained in Sect. 3, the covariates and/or the state process \({\varvec{\Phi}}\) at cell level are required for high-resolution spatiotemporal prediction using the FGG-GMRF model. Hence, either the covariates or \({\varvec{\Phi}}\), or both, are included in the selected model. Second, for every model, we considered Poisson and Negative Binomial model specifications for the number of incidences, a random walk of order one (RW1) and two (RW2) for the time-varying coefficients, structured and unstructured spatial and temporal random effects and their interaction, and six different edge lengths. Third, the best model was selected using the deviance information criterion (DIC), the Watanabe–Akaike information criterion (WAIC) and the marginal predictive likelihood (MPL). As a rule of thumb, the best model is the one with the smallest DIC and WAIC, and the largest MPL. Fourth, we started the estimation with the simplest models with covariates only (M1); we then proceeded to the model specifications with covariates and four types of interaction (see Table 5 in Appendix 2) at area and cell levels (M2); and, finally, we estimated the full models with covariates, four types of interaction at area and cell levels, and spatially and temporally structured and unstructured main effects (M3) (see footnote 19). Fifth, due to the large number of outcomes, we only present the estimates for interaction Type IV (spatially structured \(\otimes\) temporally structured) which, as in Jaya and Folmer (2020), performed best among the models with spatiotemporal interaction (see footnote 20).

The estimations are presented in Table 6 (in Appendix 3). The table shows that the modelsFootnote 21 with covariates only (M1) have the worst fit and predictive performance among the three classes of models. They have the highest DICs, WAICs and smallest MPLs. Given their relatively poor fit and predictive performance, the M1 models were not considered further in the selection procedure. Introduction of the interaction effect type IV (M2) yielded substantially better predictive performance. The full models with covariates, interaction at area and cell levels and spatially and temporally structured and unstructured main effects (M3) had fit and predictive performance similar to the M2 models. Hence, the main spatially and temporally structured and unstructured effects did not improve the model fit and prediction performance, which is consistent with Jaya and Folmer (2020). Based on these observations, and because the M2 models have a simpler structure, we selected the class of M2 models.

Next, we turned to the selection of the best model from the 24 models M2.1.1.1–M2.2.2.6. First, we considered the edge length, finding that the models M2.1.1.1–M2.2.2.6 have similar DIC, WAIC and MPL values. Based on this observation, we selected the edge length of 1500 m for reasons of computational time. Online Resource 1 presents the mesh. Second, among the M2 models with edge length 1500 m, the Poisson model had slightly lower DIC and WAIC and slightly higher MPL than the Negative Binomial model. In addition, the models with temporal trends RW1 and RW2 had similar DIC, WAIC and MPL values. Based on these considerations, we selected model M2 with Poisson distribution, RW1 time-varying effect, and edge length of 1500 m (denoted as model M2.1.1.6 below).
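The 24 candidate models arise from two likelihoods, two random walk orders and six edge lengths. The sketch below enumerates them in Python. The index convention (likelihood, then trend, then edge length) is inferred from the label M2.1.1.6 for the Poisson, RW1, 1500 m model and should be treated as an assumption; the precise definitions are given in the note under Table 6.

```python
from itertools import product

# Index conventions inferred from the label M2.1.1.6 (assumption of this sketch).
likelihoods = {1: "Poisson", 2: "Negative Binomial"}
trends = {1: "RW1", 2: "RW2"}
edge_lengths_m = {i + 1: 1000 + 100 * i for i in range(6)}  # 1 -> 1000 m, ..., 6 -> 1500 m

# Build one label per combination of likelihood, trend and edge length.
models = {
    f"M2.{d}.{r}.{e}": (likelihoods[d], trends[r], edge_lengths_m[e])
    for d, r, e in product(likelihoods, trends, edge_lengths_m)
}

print(len(models))
print(models["M2.1.1.6"])
```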

Figure 5a shows that for model M2.1.1.6, the observed and predicted dengue relative risks are strongly correlated (Pearson correlation coefficient = 0.986), indicating that the model fits the data well. Figure 5b shows that the PIT histogram is close to the Uniform distribution, also indicating that model M2.1.1.6 fits the data well.

Fig. 5 a Scatterplot for model M2.1.1.6 of the predicted versus the observed dengue relative risk and b histogram of the probability integral transform (PIT)

Table 3 summarizes various components of model M2.1.1.6 which are subsequently used to calculate the posterior means of the monthly relative risk at district and subdistrict levels. Before doing so, we discuss the components separately. To this end, we also make use of Figs. 6 and 7. Before going into detail, we make the following remarks. First, as observed in Sect. 2, p.8, the posterior means of the time-varying coefficients \({\beta }_{k,t}={\beta }_{k}+{\zeta }_{k,t}\) for \(k=1,\ldots ,K\,{{\rm and}}\,t=1,\ldots ,T\), which are presented in Fig. 6, consist of the fixed effect plus the temporal random effect. The minimum and maximum posterior means of the temporal random effects present the largest negative and largest positive differences of the temporal random effects relative to the global effects. Secondly, the contributions of the weather variables in explaining the spatiotemporal variation of the dengue risk are conveniently summarized by the posterior means of their hyperparameter variances and their percentage contributions as fractions of the total variance (the last column of Table 3). In a similar vein, the posterior means and fractions of the hyperparameter variance of the random components present their variability and strength in explaining the relative risk across space and time.

Table 3 The posterior means and 95% credible intervals of the fixed effects, the minimum and maximum posterior means and 95% credible intervals of the temporal random effects and the posterior means and 95% credible intervals of the hyperparameters of the FGG-GMRF model M2.1.1.6

Fig. 6 Time-varying effect (\({\widehat{\beta }}_{k,t}\)) of a precipitation (mm), b average temperature (°C), c solar radiation (kJ m−2 day−1) and d water vapor pressure (kPa)

Fig. 7 Monthly posterior means of a the area level interaction effect and b the cell level interaction effect, January–December

We will begin the discussion of Table 3 with the global mean effects of the weather variables. We only consider the posterior means and disregard credible intervals. Before going into the detail, it is worth noting that an indirect relationship exists between the relative risk of dengue and weather variables via the development and survival of the dengue virus and its vector (Jaya and Folmer 2021a).

Precipitation in general has a negative impact, with a posterior global (overall) mean of − 0.0041, which is consistent with Jaya and Folmer (2021a). For a one mm increase in the global mean, the dengue risk decreases by \(\left({\rm exp}\left(-0.0041\right)-1\right)100\%=-0.41\%\). The explanation is that heavy rainfall disrupts the Aedes-spp mosquito’s reproductive cycle by washing away breeding sites (Abiodun et al. 2016; Benedum et al. 2018).

Temperature in general has a positive impact (0.1002), which is consistent with Hurtado-Díaz et al. (2007) and Jaya and Folmer (2021a). The relative risk increases by \(\left({\rm exp}\left(0.1002\right)-1\right)100\%=10.54\%\) for an increase of the global mean temperature by 1 °C. The explanation is that higher temperatures offer good conditions for mosquito development, particularly feeding (Hales et al. 2002; Lambrechts et al. 2012).

Solar radiation has a negative impact (− 0.0001), which is consistent with Ekasari et al. (2018), Jaya and Folmer (2021a) and Martínez-Bello et al. (2017b). An increase of 1 kJ m−2 day−1 of solar radiation decreases the relative dengue risk by \(\left({\rm exp}\left(-0.0001\right)-1\right)100\%=-0.01\%\). As shown by Rasjid et al. (2019), strong solar radiation negatively influences the breeding and spread of Aedes-spp mosquitoes. A longer spell of solar radiation implies a shortened spell of dawn and dusk, during which the Aedes-spp mosquito preys on animals and humans, particularly 20 to 30 min after sunset (Ekasari et al. 2018; Jaya and Folmer 2021a).

Water vapor pressure in general has a positive impact, with a posterior global mean of 0.5033. An increase of the global mean by 1% increases the relative dengue risk by \(\left({\rm exp}\left(0.5033\right)-1\right)100\%=65.42\%\) due to an increase in breeding ability (Bambrick et al. 2009).
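The percentage changes reported for the four weather variables all follow from the same transformation of the posterior global mean, \(\left({\rm exp}\left(\beta \right)-1\right)100\%\). The short Python sketch below reproduces them; the function name `percent_change` and the dictionary are ours and serve only as an illustration.

```python
import math


def percent_change(beta):
    """Percentage change in relative risk for a one-unit increase in a covariate."""
    return (math.exp(beta) - 1.0) * 100.0


# Posterior global means as reported in the text.
global_means = {
    "precipitation": -0.0041,
    "average temperature": 0.1002,
    "solar radiation": -0.0001,
    "water vapor pressure": 0.5033,
}

for name, beta in global_means.items():
    print(f"{name}: {percent_change(beta):+.2f}%")
```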

For the temporal random effects of the weather variables, the greatest negative temporal random effect of precipitation occurred in June (− 0.0248), while the greatest positive effect occurred in March (0.0176). For temperature, the largest negative temporal random effect was in February (− 0.0713), while the largest positive effect was in June (0.1104). Solar radiation has a temporal random effect close to zero, as indicated by its minimum and maximum posterior means of − 0.0002 and 0.0003, respectively. The greatest negative temporal random effect of water vapor pressure was in January (− 0.0137), while the greatest positive effect was in June (0.0161).
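Since \({\beta }_{k,t}={\beta }_{k}+{\zeta }_{k,t}\), the range of each time-varying coefficient follows by adding the minimum and maximum temporal random effects to the corresponding global mean. The Python sketch below carries out this addition for the values reported above; the names used are illustrative only.

```python
# Posterior global means (beta_k) as reported in the text.
global_mean = {
    "precipitation": -0.0041,
    "average temperature": 0.1002,
    "solar radiation": -0.0001,
    "water vapor pressure": 0.5033,
}

# Minimum and maximum posterior means of the temporal random effects (zeta_kt).
random_effect_range = {
    "precipitation": (-0.0248, 0.0176),
    "average temperature": (-0.0713, 0.1104),
    "solar radiation": (-0.0002, 0.0003),
    "water vapor pressure": (-0.0137, 0.0161),
}


def time_varying_range(beta, zeta_min, zeta_max):
    """Return (min, max) of beta_kt = beta_k + zeta_kt, rounded to four decimals."""
    return round(beta + zeta_min, 4), round(beta + zeta_max, 4)


for name, beta in global_mean.items():
    low, high = time_varying_range(beta, *random_effect_range[name])
    print(f"{name}: {low} to {high}")
```

For precipitation, for instance, this gives a range from − 0.0289 to 0.0135.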

The variances of the weather variables together account for 31.87% of the total variance of the hyperpriors, with solar radiation being the most important (\({\sigma }_{{\beta }_{3}}^{2}= 0.023\)), explaining 9.88%, while temperature contributes least, accounting for 6.2%. The variance of the area level interaction effect (\({\sigma }_{\updelta }^{2}\)) accounts for the highest fraction of the total variance (43.47%), followed by the cell level interaction effect (\({\sigma }_{\Phi }^{2}\)), with a fraction of 24.68%. These fractions indicate that the trend of dengue relative risk for each district and subdistrict is strongly affected by neighboring districts and subdistricts, respectively. The relatively low fractions of the total variance for the other effects imply that they are less important. Specifically, the low fraction of the total variance for the average temperature indicates that only a small part of the variability of the relative risk of dengue in districts and subdistricts is explained by the average temperature.
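The fractions are obtained by dividing each variance by the sum of all variances. The Python sketch below illustrates the calculation and checks that the three reported fractions add up to 100% apart from rounding. The function `variance_shares` and the two example variances are hypothetical and only serve to show the arithmetic.

```python
def variance_shares(variances):
    """Return each variance as a percentage of the total variance."""
    total = sum(variances.values())
    return {name: 100.0 * value / total for name, value in variances.items()}


# Fractions reported in the text (percent of the total variance).
reported_shares = {
    "weather variables (combined)": 31.87,
    "area level interaction": 43.47,
    "cell level interaction": 24.68,
}
total_reported = sum(reported_shares.values())
print(f"sum of reported shares: {total_reported:.2f}%")

# Hypothetical variances, for illustration only (not taken from Table 3).
example = variance_shares({"component A": 0.1, "component B": 0.3})
print(example)
```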

The posterior mean of the hyperparameter of the Leroux CAR spatial autoregressive coefficient (\(\rho\)) and the posterior means of the hyperparameters of the temporal autoregressive coefficients at area level (\({\lambda }_{2}\)) and cell level (\({\lambda }_{1}\)) are substantial (larger than 0.700), indicating strong spatial and temporal dependency. The estimated range is \(r=13.597\,{\rm km}\). This implies that beyond a distance of \(13.597\,{\rm km}\) the spatial correlation between any two cells is smaller than 0.1. Hence, nearby observations are strongly spatially correlated, whereas beyond \(13.597\,{\rm km}\) the correlation is negligible.
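In the SPDE approach of Lindgren et al. (2011), the range is linked to the scale parameter \(\kappa\) of the Matérn field through the empirical relation \(r=\sqrt{8\nu }/\kappa\), the distance at which the correlation drops to about 0.1. As an illustration, the Python sketch below recovers the \(\kappa\) implied by the estimated range. It assumes a smoothness of \(\nu =1\) (i.e., \(\alpha =2\) in two dimensions); the function name is ours.

```python
import math


def kappa_from_range(range_km, nu=1.0):
    """Scale parameter kappa implied by the relation r = sqrt(8 * nu) / kappa."""
    return math.sqrt(8.0 * nu) / range_km


r_km = 13.597  # estimated range reported in the text
kappa = kappa_from_range(r_km)  # nu = 1 is an assumption of this sketch
print(f"kappa is approximately {kappa:.4f} per km")
```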

Based on Table 3, we now discuss the estimated parameters for the time-varying effects of the risk factors and, next, the spatiotemporal interaction effects at the area and cell levels.

The posterior means of the time-varying effects of the weather variables are presented in Fig. 6. The figure shows that the time-varying effects of precipitation, average temperature and solar radiation vary considerably over the year. The time-varying effect of water vapor pressure, in contrast, is nearly constant over time.

The time-varying effect of precipitation is positive for the periods January–April, August–September and December and negative for May–July and October–November. The strongest negative effect was in June (− 0.0289), and the strongest positive effect was in March (0.0135). The negative impact for the period May–July follows after the peak of the rainy season from November–April. The negative impact for October–November is caused by the increase in precipitation after the peak of the dry season in June–July. The positive impacts for the periods January–April and August–September, and the peak in March, correspond to the relatively low rainfall one to two months before these periods.

The time-varying effect of temperature is positive for all months and ranges from 0.0289 (end of August) to 0.2106 (June–July). It increases from January–June, is at its peak in June, decreases from July until mid-September, and then starts increasing up to the global mean, where it remains for the rest of the year. Note that the monthly temperature has a delayed risk effect in that it increases from January–May, while its impact is largest in June. The delay is due to the mosquito life cycle and incubation period (Jaya and Folmer 2021a).

The time-varying coefficient of solar radiation is below zero for almost all months, except for November, and varies from − 0.0003 to 0.0002. This is due to the fact that tropical countries such as Indonesia receive a lot of solar radiation throughout the year (Handayani and Ariyanti 2012). The time-varying effect of water vapor pressure is positive all year round and hardly varies. The effect varies from 0.4896 to 0.5194. The strongest effect is in June (0.5194).

Figure 7 presents the estimated parameters of the spatiotemporal interaction effects at area and cell level which depend on their hyperparameters in Table 3.

Figure 7a presents the district level spatiotemporal interaction effects (i.e., the residual effect after accounting for the weather effects). The figure shows that the interaction effect varies across districts and time. In the northern districts, it is positive and quite high during January and February, followed by a decrease to around zero during the period March–May. From June–September it is moderately positive followed by a period of high positive interaction for the rest of the year, especially in the most north-western districts. The high interaction effect in the period September–February is related to multiple factors, in particular environmental conditions. The northern areas are ideal breeding habitats because of dense vegetation and, especially during the rainy season, low sunshine duration and high humidity.

The central districts have high spatiotemporal interaction effects because of favorable socioeconomic conditions for the spread of the dengue virus, including high population density and high density of hotels, hostels and student apartments. For the northern central districts, there is the additional effect of spillover of mosquitoes from the northern districts. As a consequence, the interaction effect of the northern central districts follows a time pattern similar to time patterns of the northern districts, though less intense. The most central districts have high positive interaction effects all year round because they have the highest population density and density of hotels and hostels. The two most central districts have low interaction effects all year round, indicating that the main effects (risk factors) virtually fully explain the dengue risk. These districts have no special socioeconomic or environmental conditions affecting the dengue incidence rate.

In the southern districts, the spatiotemporal interaction effect is similar to that in the most central districts, although for partly different reasons. They have high population density with many residential areas that have unhygienic conditions. See Hsu et al. (2017) for details on the relationship between hygiene and dengue infection.

The situation in the eastern districts differs from that in the northern, southern and central districts. The interaction effect is negative in January and February, highly positive in March–May, slightly positive in June–August and negative for the rest of the year. The negative interaction effect in January–February is probably caused by the interaction of the weather variables and the environmental conditions. The absence of forests, heavy rainfall, and the short spells of sunshine in the period January–February keep the humidity low, which is unfavorable for the presence of dengue mosquitoes. The districts are residential areas with inadequate drainage and sanitation. The heavy rainfall until March combined with inadequate drainage and sanitation leads to large quantities of standing water, which provides favorable breeding habitats, contributing to the positive interaction effect in March.

The western districts have medium to strong negative interaction effects all year round, reducing the effects of the weather variables. The majority of the western districts have good drainage and sanitation, and the lifestyle and health behavior of the population are substantially better than in the other parts of Bandung, reducing dengue infection (Bandung Health Profile 2019). For example, the district with the highest healthy behavior index, Cicendo, is located in the western region. The negative effects also indicate that there is limited spillover of mosquitoes from the other districts.

Figure 7b presents contour maps of the cell level interaction effects. In contrast with Fig. 7a, the cell level interaction effects are almost the same across the months. Hence, after accounting for the weather variables, the cell level residual varies over space but is relatively constant over time, implying that it is related more to a topographical dimension, such as elevation, than to time. Positive cell level interaction effects are found in the northern part of Bandung, which is at 800 m above sea level and has high precipitation and dense vegetation, providing an ideal breeding ground and habitat for the Aedes-spp mosquito (Arboleda et al. 2009).

Posterior mean of the relative risk

Figure 8a shows the posterior means of the relative risk estimates (based on the posterior means of the time-varying coefficients of the risk factors and the posterior means of the spatiotemporal interaction effects) at district level. The posterior means of the relative risk at subdistrict level (see Fig. 8b) are obtained as the block average of the cell values within each subdistrict boundary and the cell values that are partly outside its boundary.Footnote 22 Accordingly, Bandung city is divided into 30 districts, 151 subdistricts and 179 cells.
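The block averaging step can be sketched as follows. The Python example is a simplified illustration under stated assumptions: each cell is listed under every subdistrict it overlaps, cells that straddle a boundary receive the same weight as interior cells, and the cell identifiers, subdistrict names and risk values are hypothetical.

```python
from collections import defaultdict


def block_average(cell_risk, cell_to_subdistricts):
    """Average the cell-level relative risks over the cells overlapping each subdistrict.

    A cell that straddles a boundary is counted in every subdistrict it
    overlaps, with equal weight (an assumption of this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, risk in cell_risk.items():
        for subdistrict in cell_to_subdistricts.get(cell, ()):
            sums[subdistrict] += risk
            counts[subdistrict] += 1
    return {sub: sums[sub] / counts[sub] for sub in sums}


# Hypothetical posterior mean relative risks for three cells in one month.
cell_risk = {"c1": 1.0, "c2": 1.4, "c3": 0.6}

# Hypothetical overlap: cell c2 lies partly in both subdistricts.
cell_to_subdistricts = {"c1": ["S1"], "c2": ["S1", "S2"], "c3": ["S2"]}

subdistrict_risk = block_average(cell_risk, cell_to_subdistricts)
print(subdistrict_risk)
```

Area-proportional weights for the straddling cells would be a natural refinement of this unweighted average.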

Fig. 8 Monthly posterior means of the relative risk at a area (district) level and b subdistrict level (surrounded area: Lengkong district)

Comparing Figs. 8a and 8b shows similar temporal trends, in particular, an increase in January–July and a decrease in August–December. This applies especially to the spatial units with the highest risk, notably the districts in southern Bandung. For the spatial dimension, however, we notice substantial differences. For example, according to Fig. 8a, the entire central district of Lengkong (surrounded area in Fig. 8) is categorized as a high-risk area in the period January–July, whereas Fig. 8b shows that this only applies to parts of the district. The explanation is that categorization based on the model given by Eq. (3a) ignores within-district heterogeneity, while this is taken into account when categorization is based on the model given by Eq. (3b). To explore this issue further, we calculated the high-risk and low-risk districts and subdistricts based on the posterior exceedance probability for the two approaches, denoted as the top-down and the bottom-up approaches, respectively. Following Sparks (2015) and Osei and Stein (2017), we fixed the posterior exceedance probability threshold for \({\theta }_{it}\) at 1.25. The exceedance probability \(\widehat{Pr}\left({\theta }_{it}>1.25|\mathbf{y}\right)\) over space and time is presented in Fig. 9. Table 4 presents the classification of the subdistricts into high and low risk based on the bottom-up and top-down approaches, respectively.
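The exceedance probability can be illustrated with a Monte Carlo approximation: given draws from the posterior of \({\theta }_{it}\), it is the share of draws above the threshold. The Python sketch below uses this sampling-based approximation, which is a stand-in for the computation from the posterior marginals and not necessarily the method used in the paper. The lognormal draws are simulated for illustration and do not represent the Bandung posterior.

```python
import random


def exceedance_probability(samples, threshold=1.25):
    """Share of posterior draws of the relative risk that exceed the threshold."""
    samples = list(samples)
    return sum(draw > threshold for draw in samples) / len(samples)


# Tiny deterministic example: two of the four draws exceed 1.25.
print(exceedance_probability([1.0, 1.3, 1.5, 1.2]))

# Simulated draws standing in for the posterior of one unit and month (hypothetical).
random.seed(1)
draws = [random.lognormvariate(0.3, 0.2) for _ in range(10_000)]
p_exceed = exceedance_probability(draws)
print(f"estimated Pr(theta > 1.25 | y): {p_exceed:.3f}")
```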

Fig. 9 Monthly posterior exceedance probability of the relative risk at a district and b subdistrict levels (surrounded area: Lengkong district)

Table 4 Misclassification of the subdistricts based on the bottom-up and top-down approaches

Table 4 shows substantial misclassification for the top-down approach for the period January–July. The overall misclassification rate is 15.6% for all periods (January–December) and 26.7% for the high-risk period (January–July). The table furthermore shows that all the misclassifications occurred in the period January–July, which partly overlaps with the rainy season in November–May, whereas there is no misclassification from August–December. The explanation for the misclassification is that the bottom-up approach averages out the differences in risk factors, and consequently in the number of incidences, over a relatively small number of relatively homogeneous cells within a subdistrict. The top-down approach, on the other hand, averages out the differences over a substantially larger number of relatively heterogeneous cells in a district (a district contains at least four subdistricts; see Sect. 4.1).
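The two misclassification rates are mutually consistent if all misclassifications fall in January–July. The Python sketch below checks this, assuming that the rates are computed over subdistrict-month units (151 subdistricts times the number of months); this unit of counting is our assumption, since Table 4 is not reproduced here.

```python
N_SUBDISTRICTS = 151
MONTHS_ALL = 12   # January-December
MONTHS_HIGH = 7   # January-July

overall_rate = 0.156  # overall misclassification rate reported in the text

# Number of subdistrict-month units in each period (assumed unit of counting).
units_all = N_SUBDISTRICTS * MONTHS_ALL
units_high = N_SUBDISTRICTS * MONTHS_HIGH

# If every misclassification occurs in January-July, the same count is
# divided by fewer units, which raises the rate by the factor 12/7.
implied_count = overall_rate * units_all
implied_high_rate = implied_count / units_high

print(f"implied misclassified units: about {implied_count:.0f}")
print(f"implied January-July rate: {100 * implied_high_rate:.1f}%")
```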

The misclassification is concentrated in the high-risk period January–July, during which the risk factors vary locally due to variation in local characteristics such as elevation or vegetation density. Although the rainy season is from November to May, there are positive rates of misclassification in June–July and unexpected zero rates in November–December. These misclassifications and unexpected rates are due to the delayed responses of mating, breeding and hunting by Aedes-spp mosquitoes. Mating and breeding occur mainly during the rainy season. Following this, it takes approximately two weeks to one month for the eggs to develop into adult mosquitoes and for the virus to multiply and reach the salivary glands before it is transmitted to humans. If an individual is infected, the symptoms can be observed approximately four to seven days after being bitten (Ehelepola et al. 2015). Accordingly, there is a delayed infection response with respect to the weather conditions (Jaya and Folmer 2021a).

Summary and conclusions

Effective and efficient control of a variety of spatial problems, including dengue disease abatement, requires data at a fine spatiotemporal scale. However, data availability at the same (especially fine) spatial scale is quite rare (Moraga et al. 2017; Utazi et al. 2019). A major challenge in spatial sciences, including modeling of infectious diseases such as dengue and COVID-19, is how to align databases of different resolutions consistently. In this study, we presented the Fusion Area-Cell Spatiotemporal Generalized Geoadditive-Gaussian Markov Random Field model as a solution to this problem. This model combines observations on the dependent variable and population at risk at the area level and covariates at the cell level to generate predictions of relative risk at the subdistrict level. Special attention was paid to the model setup to generate predictions for the target cells during model-fitting, using Bayesian Integrated Nested Laplace Approximation (INLA). The methodology was applied to monthly dengue disease data for 30 districts in the city of Bandung, Indonesia, for the period January 2012 to December 2018. The risk factors consisted of the monthly averages of precipitation, temperature, solar radiation and water vapor pressure. The analysis showed that the effects of precipitation, temperature and solar radiation varied considerably across space and time, while the effect of water vapor pressure was nearly constant over time. Solar radiation was found to be the most important risk factor. The spatiotemporal interaction effect, capturing the effects of omitted variables at area level, also varied across districts and time. In contrast, the cell level interaction effect was almost constant over the months but varied substantially over space, indicating a strong spatial spillover effect.

Based on the posterior means of the relative risk at cell level, we obtained the relative risk estimates at subdistrict level. We found a similar temporal pattern for districts and subdistricts. Relative dengue risk was relatively high in the period January–July and relatively low during the period August–December. We further compared the risk estimates per subdistrict based on: (i) the bottom-up approach using the cell level estimates and (ii) the top-down approach assigning the district value to its subdistricts. Using the posterior exceedance probability of the relative risk, we identified high-risk and low-risk districts and subdistricts and found that during the high-risk period of January–July, the top-down approach misclassified 26.7% of the subdistricts as high risk, whereas according to the bottom-up approach they were low risk. The overall misclassification rate was 15.6%.

The main conclusions of the paper are the following. First, effective and efficient policy intervention, such as the control of infectious diseases, requires data at the right level of resolution. In particular, low-resolution maps may misclassify regions. If regions are incorrectly classified as high-risk, unnecessary policy intervention with undue financial, social and environmental costs may result. In contrast, if regions are incorrectly classified as low-risk, opportunities for policy intervention may be missed, which may also have costs of various kinds. Second, the proposed FGG-GMRF model adjusts data and maps of different resolutions consistently, and allows more data to be utilized, thus improving the statistical efficiency. Third, application of the FGG-GMRF model to the dengue disease data for Bandung from 2012 to 2018 shows that the relative infection risk is high in various cells, subdistricts and districts from January–July. The strong spatiotemporal interaction indicates that the occurrence of the dengue disease vector is highly contagious and must be detected early in order to prevent its spread. Rapid response measures such as fogging are critical in areas with high dengue incidence. Finally, based on the experiences in the present paper, it is worthwhile to investigate the suitability of the FGG-GMRF model for a variety of other spatiotemporal problems, including other infectious diseases such as COVID-19 (see Jaya and Folmer 2021b), vaccination coverage (Utazi et al. 2019), particulate matter concentration (Cameletti et al. 2013; Lee et al. 2016), and social issues such as unemployment and crime.


  1. Below we use the notion of “cell” to denote spatial units of higher resolution.

  2. Joint area-cell estimation is based on the assumption that the same factors at area and cell level drive the spatiotemporal process, although usually in various degrees (Banerjee et al. 2015; Utazi et al. 2019). In the present case, the same disease risk factors (weather) drive disease incidence at area and cell levels.

  3. Note that although the precision matrix is sparse, the covariance matrix is dense in general. Specifying models in terms of the precision matrix rather than the covariance matrix is useful in many high-dimensional applications because it allows for a sparse representation, typically reducing computational cost and memory usage (Sidén et al. 2018).

  4. Although dengue disease incidence depends on socioeconomic risk factors such as income, family size, age and education, and on weather variables such as humidity, precipitation and sunshine, the case study presented below only considers the weather variables, while the socioeconomic risk factors are taken into account by the random effects. The reason is that the socioeconomic variables are only available at city level.

  5. A negative binomial distribution is appropriate when there is large overdispersion of zeros (Berk and MacDonald 2008; Payne et al. 2017).

  6. Spatiotemporal variation (in disease outcomes) consists of the following three components: a spatial component capturing the overall spatial distribution, a temporal component capturing the overall temporal pattern, and a space–time component capturing space–time interaction. Spatiotemporal models are classified as separable or non-separable (Knorr-Held 2000). A spatiotemporal model is said to be separable if it consists of spatial and temporal components without their interaction, while a non-separable model comprises all three components (Haining and Li 2020; Knorr-Held 2000; Martinez-Beneito and Botella-Rocamora 2019).

  7. The regression functions are latent in that they do not have pre-specified functional forms. This feature makes latent regression functions suitable to accommodate complicated relationships between the risk factors and the dependent variable.

  8. According to Hall et al. (2016) and Martínez-Bello et al. (2017b), temporally varying coefficients models can be seen as distributed lag parameters. They allow the visualization and exploration of changes in the dependent variable as linear or nonlinear functions of time.

  9. Random walk models are the Bayesian equivalents of P(enalized) Spline regression models (Wang et al. 2018a).

  10. A GF \(\gamma \left({\boldsymbol{s}}_{g},t\right)={\gamma }_{t}\left({\boldsymbol{s}}_{g}\right)\) is mean square differentiable in \(\mathcal{A}\), if for every \({\boldsymbol{s}}_{g}\) in \(\mathcal{A}\) and for every \(t\), \({\gamma }_{t}^{\prime}\left({\boldsymbol{s}}_{g}\right)=\underset{l\to 0}{{\rm lim}}\frac{{\gamma }_{t}\left({\boldsymbol{s}}_{g}+l\right)-{\gamma }_{t}\left({\boldsymbol{s}}_{g}\right)}{l}\) exists and \(\underset{l\to 0}{{\rm lim}}\mathbb{E}{\left[\frac{{\gamma }_{t}\left({\boldsymbol{s}}_{g}+l\right)-{\gamma }_{t}\left({\boldsymbol{s}}_{g}\right)}{l}-{\gamma }_{t}^{\prime}\left({\boldsymbol{s}}_{g}\right)\right]}^{2}=0\) (Stein 1999).

  11. The inla.stack function can be used to define the fusion model framework.

  12. A joint prior may refer to multiple parameters, to the spatial units (area and cell), to multiple time periods, or to space–time. The specification of the parameter vector or the subscripts of the precision matrices indicate what kind of jointness is in order.

  13. The total number of hyperparameters is \(J=2K+11\), where \(K\) is the total number of risk factors and 11 represents the total of the elements of {\({\sigma }_{{\beta }_{0}}^{2},{\sigma }_{\upomega }^{2}, {\sigma }_{\upsilon }^{2},{\sigma }_{\phi }^{2},{\sigma }_{\boldsymbol{\varsigma }}^{2},{\sigma }_{\Phi }^{2},{\sigma }_{\updelta }^{2}, \rho ,{\lambda }_{1}, {\lambda }_{2},r\)}.

  14. A sparse precision matrix enables the application of computationally efficient numerical methods. For instance, whereas the factorization of a dense covariance matrix of a spatiotemporal GF requires \({\rm O}({n}^{3})\) computation time, this is reduced to \({\rm O}({n}^{2})\) for the sparse precision matrix of a GMRF (Rue and Held 2005).

  15. The LSPDE approach uses finite basis functions defined by a triangulation of the study region (see Online Resource 1 for application to Bandung city) to transform a GF to a GMRF. Implementation can be conveniently handled using INLA (Rue et al. 2009). The LSPDE approach is immediately applicable to a large class of spatiotemporal models. For instance, it applies to models with complex hierarchical structures or with non-separable covariance functions, as well as non-stationary models with time-varying parameters (Cameletti et al. 2013).

  16. The number of nodes (vertices) \(L\) is larger than the number of cells \({n}_{p}\) because we extend the domain of interest by an outer area to avoid the boundary effect (see Appendix 1 for details).

  17. The main objective of this research is to find out in which areas and which months Dengue outbreaks will occur annually. For this purpose monthly cycle observations suffice.

  18. R-INLA contains the routine inla.mesh.2d() for meshing of spatial domains. It takes several arguments, notably loc, for the number and the location of the initial mesh vertices, and max.edge, for the maximum edge lengths in the study area and in the outer area.

  19. In Jaya and Folmer (2021a), the order of estimation was reversed. The models with the spatial and temporal main random effects were estimated before the models with the interaction effects. The latter outperformed the former. Based on this outcome, we estimated the M2 models before the M3 models to economize on computation time.

  20. The estimates of the models with other types of interaction are available from the first author upon request.

  21. For a precise definition of the models see the note under Table 6, Appendix 3.

  22. The reason that some cells are partly outside some subdistrict boundaries is that the cells are squares.

  23. The errors are due to the difference between the exact solution of the model equations and the numerical solution (Tu et al. 2018).

  24. For \({\varvec{s}}\in {\mathbb{R}}^{2}\), two different values of \(\alpha\) are commonly applied, viz. \(\alpha =\{{\rm 1,2}\}\) corresponding to \(v=\{{\rm 0,1}\}\) (Lindgren et al. 2011). For \(\alpha =2\), \({\left({\kappa }^{2}-{\nabla }^{2}\right)}^{\alpha /2}{{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) is a linear differential operator which is easier to solve than a fractional differential operator when \(\alpha\) is an odd number, e.g. \(\alpha = 1\) (Miller et al. 2019).

  25. Green’s first identity theorem is a multidimensional generalization of integration by parts to reduce the order of derivatives (Langtangen and Logg 2016). Applied to Eq. (37), it reads:

    $${\int }_{\mathcal{A}}\boldsymbol{\varphi }\left({\varvec{s}}\right){\nabla }^{2}{{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}={\int }_{\partial \mathcal{A}}\frac{\partial {{\varvec{\upgamma}}}_{t}({\varvec{s}})}{\partial \mathbf{n}}\boldsymbol{\varphi }\left({\varvec{s}}\right){\rm d}{\varvec{s}}-{\int }_{\mathcal{A}}\nabla \boldsymbol{\varphi }\left({\varvec{s}}\right)\nabla {{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}$$

    with \(\frac{\partial {\upgamma }_{t}({\varvec{s}})}{\partial \mathbf{n}}\) the directional derivative:

    $$\frac{\partial {{\varvec{\upgamma}}}_{t}({\varvec{s}})}{\partial \mathbf{n}}={\nabla }_{\mathbf{n}}{{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)=\underset{\varepsilon \to 0}{{\rm lim}}\frac{{{\varvec{\upgamma}}}_{t}\left({\varvec{s}}+\varepsilon \mathbf{n}\right)-{{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)}{\varepsilon }=\frac{\partial {{\varvec{\upgamma}}}_{t}(\mathbf{s})}{\partial \mathbf{s}}=\mathbf{n}\bullet \nabla {{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)$$
  26. This procedure is known as the Galerkin finite element method (Langtangen and Logg 2016).

  27. By definition, a generalized GF \(\boldsymbol{\chi }\left(\boldsymbol{s}\right)\) in a domain \(\mathcal{A}\) is a random \({L}^{2}\left(\mathcal{A}\right)\) generalized function, with \({L}^{2}\left(\mathcal{A}\right)\) the vector space of square-integrable functions on the two-dimensional domain \(\mathcal{A}\) (i.e., \({L}^{2}\left(\mathcal{A}\right)=\left\{\boldsymbol{\chi }:\mathcal{A}\to {\mathbb{R}}\,\,{\rm with}\,\,{\int }_{\mathcal{A}}{\left|\boldsymbol{\chi }\left(\boldsymbol{s}\right)\right|}^{2}{\rm d}\boldsymbol{s}<\infty \right\}\)), such that for every finite set of test functions \(\left\{{\boldsymbol{\psi }}_{h}\in {L}^{2}\left(\mathcal{A}\right),h=1,\ldots ,H\right\}\), the inner products \({\int }_{\mathcal{A}}{\boldsymbol{\psi }}_{h}\left(\boldsymbol{s}\right)\boldsymbol{\chi }\left(\boldsymbol{s}\right){\rm d}\boldsymbol{s}\), \(h=1,\ldots ,H\), are jointly Gaussian. Because Gaussian white noise \({\mathcal{W}}_{t}(\boldsymbol{s})\) is an \({L}^{2}\left(\mathcal{A}\right)\)-bounded generalized GF in the domain \(\mathcal{A}\), the integrals \({\int }_{\mathcal{A}}{\boldsymbol{\psi }}_{l}\left(\boldsymbol{s}\right){\mathcal{W}}_{t}(\boldsymbol{s}){\rm d}\boldsymbol{s}\), \(l=1,\ldots ,H\), are jointly Gaussian, with expectation and covariance given by (Lindgren et al. 2011):

    $${\mathbb{E}}\left({\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}\right)=0$$
    $${\rm Cov}\left({\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}},{\int }_{\mathcal{A}}{{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}\right)={\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\rm d}{\varvec{s}}.$$
  28. The inner product of two functions \({\boldsymbol{\psi }}_{l}\left(\boldsymbol{s}\right)\) and \({\boldsymbol{\psi }}_{h}\left(\boldsymbol{s}\right)\) is the continuous analogue of the dot product of two vectors: the sum of the products of pairwise values, with summation replaced by integration (Langtangen and Mardal 2019).

  29. This prior represents a situation where we are not completely ignorant about the values of a parameter, but have no firm prior information about them (Simpson et al. 2017).

  30. Note that in the present setting we are dealing with the latent Gaussian model (LGM) for which the IG distribution is a conjugate hyperprior for the variance of the Gaussian prior (Hamura et al. 2021).

  31. The Uniform distribution becomes an increasingly uninformative prior as the upper bound increases (Lemoine 2019). Hence, it is unclear what upper bound is appropriate for a Uniform prior.

  32. The smaller the marginal variance, the smoother the time-varying regression coefficient.

  33. An IGMRF has a sparse precision matrix which is not of full rank. According to Rue and Held (2005), the order of an IGMRF is the rank deficiency of its precision matrix, i.e., the number of its zero eigenvalues. Hence, the density of a zero-mean IGMRF of order s is:

    $$p\left({{\varvec{\zeta}}}_{k}|{\mathbf{Q}}_{{\zeta }_{k}}^{-}\right)={(2\pi )}^{-\frac{T-s}{2}}{\left({\left|{\mathbf{Q}}_{{\zeta }_{k}}\right|}^{*}\right)}^{\frac{1}{2}}{\rm exp}\left(-\frac{1}{2}{{\varvec{\zeta}}}_{k}^{{{\prime}}}{\mathbf{Q}}_{{\zeta }_{k}}{{\varvec{\zeta}}}_{k}\right)$$

    where \({\mathbf{Q}}_{{\zeta }_{k}}\) is the (\(T\times T\)) positive semi-definite precision matrix of the parameters \({{\varvec{\zeta}}}_{k}\), given by \({\mathbf{Q}}_{{\zeta }_{k}}=\frac{1}{{\sigma }_{{\zeta }_{k}}^{2}}{\mathbf{R}}_{{\zeta }_{k}}\) with \({\mathbf{R}}_{{\zeta }_{k}}\) the (\(T\times T\)) temporal structure matrix. \({\left|{\mathbf{Q}}_{{\zeta }_{k}}\right|}^{*}\) denotes the generalized determinant, which is equal to the product of the \(T-s\) non-zero eigenvalues of \({\mathbf{Q}}_{{\zeta }_{k}}\) (\(s=1\) for an RW1 and \(s=2\) for an RW2). The generalized inverse of the precision matrix (\({\mathbf{Q}}_{{\zeta }_{k}}^{-}\)) is the generalized covariance matrix \({{\varvec{\Sigma}}}_{{\zeta }_{k}}={\sigma }_{{\zeta }_{k}}^{2}{\mathbf{R}}_{{\zeta }_{k}}^{-}\).

  34. Sørbye and Rue (2014) pointed out that scaling of the RW1 and RW2 variances reduces the sensitivity of the estimated marginal variance to changes in the scale parameter of the IG hyperprior and re-scaling of the covariates.

  35. The estimates were obtained using the software package R-INLA (R-version 4.0.3, Integrated Nested Laplace Approximation-INLA). The code is presented in Online Resource 2.
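
The rank deficiency described in footnote 33 can be checked numerically for a first-order random walk. The following is a minimal Python sketch, not the paper's R-INLA code: the function names `rw1_structure` and `quad_form` and the example vector `zeta` are hypothetical and chosen only for illustration. It builds the RW1 structure matrix and checks that constants lie in its null space (order \(s=1\)) and that the quadratic form equals the sum of squared first differences.

```python
def rw1_structure(T):
    """Return the T x T structure matrix R of a first-order random walk (RW1)."""
    R = [[0.0] * T for _ in range(T)]
    for t in range(T - 1):
        # Each increment zeta[t+1] - zeta[t] contributes the block [[1, -1], [-1, 1]].
        R[t][t] += 1.0
        R[t + 1][t + 1] += 1.0
        R[t][t + 1] -= 1.0
        R[t + 1][t] -= 1.0
    return R


def quad_form(R, z):
    """Compute the quadratic form z' R z."""
    n = len(z)
    return sum(z[i] * R[i][j] * z[j] for i in range(n) for j in range(n))


T = 6
R = rw1_structure(T)

# Rank deficiency: every row sums to zero, so R maps constant vectors to zero.
row_sums = [sum(row) for row in R]

# The quadratic form penalises only differences between neighbouring time points.
zeta = [0.3, 0.5, 0.4, 0.9, 1.2, 1.0]  # hypothetical coefficient path
qf = quad_form(R, zeta)
sum_sq_diff = sum((zeta[t + 1] - zeta[t]) ** 2 for t in range(T - 1))
```

Adding any constant to `zeta` leaves `qf` unchanged, which is one way to see why the density is improper and why a constraint on the level is typically imposed in practice.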


  • Abente LG, Aragonés N, García-Pérez J, Fernández NP (2018) Disease mapping and spatio-temporal analysis: importance of expected-case computation criteria. Geospat Health 9(1):27–33

  • Abiodun G, Maharaj R, Witbooi P, Okosun K (2016) Modelling the influence of temperature and rainfall on the population dynamics of Anopheles arabiensis. Malar J 15(364):1–15

  • Abramowitz M, Stegun I (1965) Handbook of mathematical functions. Dover Publications, New York

  • Aguayo G, Schritz A, Ruiz-Castell M, Villarroel L, Valdivia G, Fagherazzi G, Lawson A (2020) Identifying hotspots of cardiometabolic outcomes based on a Bayesian approach: the example of Chile. PLoS ONE 15(6):1–16

  • Ahmadian H, Friswell M, Mottershead J (1998) Minimization of the discretization error in mass and stiffness formulation by an inverse method. Int J Numer Methods Eng 41(2):371–378

  • Ak C, Ergonul O, Şencan I, Torunoğlu MA, Gonen M (2018) Spatiotemporal prediction of infectious diseases using structured Gaussian processes with application to Crimean-Congo hemorrhagic fever. PLoS Negl Trop Dis 12(8):1–20

  • Arboleda S, Jaramillo ON, Peterson A (2009) Mapping environmental dimensions of dengue fever transmission risk in the Aburrá Valley Colombia. Int J Environ Res Public Health 6(12):3040–3055

  • Bakka H (2019) How to solve the stochastic partial differential equation that gives a Matérn random field using the finite element method, pp 1–17. [stat.CO]

  • Bakka H, Krainski E, Bolin D, Rue H, Lindgren F (2020) The diffusion-based extension of the Matérn field to space-time, pp 1–22. [stat.ME]

  • Bambrick H, Woodruff R, Hanigan I (2009) Climate change could threaten blood supply by altering the distribution of vector-borne disease: An Australian case-study. Glob Health Action 2(1):1–11

  • Bandung Central Statistical Bureau (2012) Bandung City in Figure 2012. Bandung Government, Bandung

  • Bandung Central Statistical Bureau (2013) Bandung City in Figure 2013. Bandung Government, Bandung

  • Bandung Central Statistical Bureau (2014) Bandung City in Figure 2014. Bandung Government, Bandung

  • Bandung Central Statistical Bureau (2015) Bandung City in Figure 2015. Bandung Government, Bandung

  • Bandung Central Statistical Bureau (2016) Bandung City in Figure 2016. Bandung Government, Bandung

  • Bandung Central Statistical Bureau (2017) Bandung City in Figure 2017. Bandung Government, Bandung

  • Bandung Central Statistical Bureau (2018) Bandung City in Figure 2018. Bandung Government, Bandung

  • Bandung Health Department (2013) Health Profile of Bandung Municipality in 2012. Bandung Government, Bandung

  • Bandung Health Department (2014) Health Profile of Bandung Municipality in 2013. Bandung Government, Bandung

  • Bandung Health Department (2015) Health Profile of Bandung Municipality in 2014. Bandung Government, Bandung

  • Bandung Health Department (2016) Health Profile of Bandung Municipality in 2015. Bandung Government, Bandung

  • Bandung Health Department (2017) Health Profile of Bandung Municipality in 2016. Bandung Government, Bandung

  • Bandung Health Department (2018) Health Profile of Bandung Municipality in 2017. Bandung Government, Bandung

  • Bandung Health Department (2019) Health Profile of Bandung Municipality in 2018. Bandung Government, Bandung

  • Banerjee S, Gelfand A (2002) Prediction interpolation and regression for spatially misaligned data. Sankhyā: Indian J Stat 64(2):227–245

  • Banerjee S, Carlin B, Gelfand A (2015) Hierarchical modeling and analysis for spatial data, 2nd edn. CRC Press Taylor and Francis Group, Boca Raton

  • Barber X, Conesa D, Lladosa S, López-Quílez A (2016) Modelling the presence of disease under spatial misalignment using Bayesian latent Gaussian models. Geospat Health 11(415):11–20

  • Benedum C, Seidahmed O, Eltahir E, Markuzon N (2018) Statistical modeling of the effect of rainfall flushing on dengue transmission in Singapore. PLoS Negl Trop Dis 12(12):1–18

  • Berk R, MacDonald J (2008) Overdispersion and Poisson regression. J Quant Criminol 24(3):269–284

  • Bernardinelli L, Clayton D, Pascutto C, Montomoli C, Ghislandi M, Songini M (1995) Bayesian analysis of space-time variation in disease risk. Stat Med 14(21–22):2433–2443

  • Bivand R, Gómez-Rubio V, Rue H (2015) Spatial data analysis with R-INLA with some extensions. J Stat Softw 63(20):1–31

  • Blangiardo M, Cameletti M (2015) Spatial and spatio-temporal Bayesian models with R-INLA. Wiley, Chichester

  • Bohn J, Feischl M (2021) Recurrent neural networks as optimal mesh refinement strategies. Comput Math Appl 97:61–76.

  • Bolin D, Lindgren F (2009) Wavelet Markov models as efficient alternatives to tapering and convolution fields. Mathematical Sciences Preprint 13. Lund University, Lund

  • Cameletti M, Lindgren F, Simpson D, Rue H (2013) Spatio-temporal modeling of particulate matter concentration through the SPDE approach. AStA Adv Stat Anal 97(2):109–131

  • Cheng Yb, Chen Xh, Li Hl, Cheng Zy, Jiang R, Lü J, Fu Hd (2018) Analysis and comparison of Bayesian methods for measurement uncertainty evaluation. Math Probl Eng.

  • Coly S, Garrido M, Abrial D, Yao AF (2021) Bayesian hierarchical models for disease mapping applied to contagious pathologies. PLoS ONE 16(1):1–28

  • Ebi K, Nealon J (2016) Dengue in a changing climate. Environ Res 151(1):115–123

  • Ehelepola NDB, Ariyaratne K, Buddhadasa WNMP, Ratnayake S, Wickramasinghe M (2015) A study of the correlation between dengue and weather in Kandy city Sri Lanka (2003–2012) and lessons learned. Infect Dis Poverty 4(1):42–55

  • Ekasari R, Susanna D, Riskiyani S (2018) Climate factors and dengue fever in Jakarta 2011–2015. KnE Life Sci 4(4):151–160

  • Fahrmeir L, Lang S (2001) Bayesian inference for generalized additive mixed models based on Markov random field prior. Appl Stat 50(2):201–220

  • Fick SE, Hijmans RJ (2017) WorldClim 2: New 1-km spatial resolution climate surfaces for global land areas. Int J Climatol 37(12):4302–4315

  • Franco-Villoria M, Ventrucci M, Rue H (2019) A unified view on Bayesian varying coefficient models. Electron J Stat 13(2):5334–5359

  • French J, Wand M (2004) Generalized additive models for cancer mapping with incomplete covariates. Biostatistics 5(2):177–191

  • Fuentes M, Raftery AE (2005) Model evaluation and spatial interpolation by Bayesian combination of observations with outputs from numerical models. Biometrics 61(1):36–45

  • Fuentes M, Chen L, Davis JM (2008) A class of nonseparable and nonstationary spatial temporal covariance functions. Environmetrics 19(5):487–507

  • Fuglstad GA, Hem I, Knight A, Rue H, Riebler A (2020) Intuitive joint priors for variance parameters. Bayesian Anal 15(4):1109–1137

  • Gelman A (2006) Prior distribution for variance parameters in hierarchical models. Bayesian Anal 1(3):515–533

  • Gelman A, Simpson D, Betancourt M (2017) The prior can often only be understood in the context of the likelihood. Entropy 19(555):1–13

  • Gneiting T (2002) Nonseparable, stationary covariance functions for space-time data. J Am Stat Assoc 97(458):590–600

  • Godana AA, Mwalili SM, Orwa GO (2019) Dynamic spatiotemporal modeling of the infected rate of visceral leishmaniasis in humans in an endemic area of Amhara regional state Ethiopia. PLoS ONE 14(3):1–21

  • Goicoa T, Adin A, Ugarte M, Hodges J (2018) In spatio-temporal disease mapping models, identifiability constraints affect PQL and INLA results. Stoch Environ Res Risk Assess 32(3):749–770

  • Gómez-Rubio V (2020) Bayesian inference with INLA. Chapman & Hall/CRC Press, Boca Raton

  • Gómez-Rubio V, Bivand R, Rue H (2021) Spatial models using Laplace approximation methods. In: Fischer MM, Nijkamp P (eds) Handbook of regional science, 2nd and extended edn. Springer, Berlin, pp 1943–1959

  • Haining R, Li G (2020) Modelling spatial and spatial-temporal data a Bayesian approach. CRC Press Taylor & Francis Group, Boca Raton

  • Hales S, de Wet N, Maindonald J, Woodward A (2002) Potential effect of population and climate changes on global distribution of dengue fever: an empirical model. Lancet 360(9336):830–834

  • Hall S, Swamy P, Tavlas G (2016) Time-varying coefficient models: a proposal for selecting the coefficient drivers sets. Macroecon Dyn 21(5):1158–1174

  • Hamura Y, Irie K, Sugasawa S (2021) On global-local shrinkage priors for count data. Bayesian Anal.

  • Handayani N, Ariyanti D (2012) Potency of solar energy applications in Indonesia. Int J Renew Energy Dev 1(2):33–38

  • Hanigan I, Chaston T, Hinze B, Dennekamp M, Jalaludin B, Kinfu Y, Morgan G (2019) A statistical downscaling approach for generating high spatial resolution health risk maps: A case study of road noise and ischemic heart disease mortality in Melbourne, Australia. Int J Health Geogr 18(1):20–29

  • Hastie T, Tibshirani R (1986) Generalized additive models. Stat Sci 1(3):297–318

  • Hsu J, Hsieh CL, Lub C (2017) Trend and geographic analysis of the prevalence of dengue in Taiwan 2010–2015. J Glob Infect Dis 54:43–49.

  • Hurtado-Díaz M, Riojas-Rodríguez H, Rothenberg S, Gomez-Dantés H, Cifuentes E (2007) Short communication: impact of climate variability on the incidence of dengue in Mexico. Trop Med Int Health 12(11):1327–1337

  • Jaya IGNM, Folmer H, Ruchjana BN, Kristiani F, Andriyana Y (2017) Modeling of infectious diseases: a core research topic for the next hundred years. In: Jackson R, Schaeffer P (eds) Regional research frontiers, vol 2. methodological advances, regional systems modeling and open sciences. Springer, West Virginia, pp 239–255

  • Jaya IGNM, Folmer H (2020) Bayesian spatiotemporal mapping of relative dengue disease risk in Bandung Indonesia. J Geogr Syst 22(1):105–142

  • Jaya IGNM, Folmer H (2021a) Identifying spatiotemporal clusters by means of agglomerative hierarchical clustering and Bayesian regression analysis with spatiotemporally varying coefficients: methodology and application to dengue disease in Bandung Indonesia. Geogr Anal 53(4):767–817

  • Jaya IGNM, Folmer H (2021b) Bayesian spatiotemporal forecasting and mapping of COVID-19 risk with application to West Java Province Indonesia. J Reg Sci 61(4):849–881

  • Kammann EE, Wand MP (2003) Geoadditive models. Appl Stat 52(1):1–18

  • Kampen GI, Engelfriet P, Pv B (2014) Disease prevention: saving lives or reducing health care costs? PLoS ONE 9(8):1–5

  • Kang S, McGree J, Baade P, Mengersen K (2015) A case study for modelling cancer incidence using Bayesian spatio-temporal models. Aust N Z J Stat 57(3):325–345

  • Kifle YW, Hens N, Faes C (2017) Cross-covariance functions for additive and coupled joint spatiotemporal SPDE models in R-INLA. Environ Ecol Stat 24(4):551–586

  • Knorr-Held L (2000) Bayesian modeling of inseparable space-time variation in disease risk. Stat Med 19(17–18):2555–2567

  • Lambrechts L, Paaijmans K, Fansiri T, Carrington L (2012) Impact of daily temperature fluctuations on dengue virus transmission by Aedes aegypti. Proc Natl Acad Sci USA 108(18):7460–7465

  • Langtangen HP, Logg A (2016) Solving PDEs in Python. The FEniCS tutorial I. Springer Open, Cham

  • Langtangen HP, Mardal KA (2019) Introduction to numerical methods for variational problems. Springer, Cham

  • Lawson AB (2010) Hotspot detection and clustering: ways and means. Environ Ecol Stat 17(2):231–245

  • Lawson AB, Choi J, Cai B, Hossain M, Kirby RS, Liu J (2012) Bayesian 2-stage space-time mixture modeling with spatial misalignment of the exposure in small area health data. J Agric Biol Environ Stat 17(3):417–441

  • Lee M, Kloog I, Chudnovsky A, Lyapustin A, Wang Y, Melly S, Coull B, Koutrakis P, Schwartz J (2016) Spatiotemporal prediction of fine particulate matter using high resolution satellite images in the southeastern U.S 2003–2011. J Exp Sci Environ Epidemiol 26(4):377–384

  • Leroux B, Lei X, Breslow N (2000) Estimation of disease rates in small areas: a new mixed model for spatial dependence. In: Halloran M, Berry D (eds) Statistical models in epidemiology the environment and clinical trials. Springer, New York, pp 179–191

  • Lindgren F, Rue H (2015) Bayesian spatial modelling with R-INLA. J Stat Softw 63(19):1–25

  • Lindgren F, Rue H, Lindström J (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc B 73(4):423–498

  • Lindquist D, Gilest M (1989) A comparison of numerical schemes on triangular and quadrilateral meshes. In: Dwoyer D, Hussaini M, Voigt R (eds) 11th International conference on numerical methods in fluid dynamics. Lecture notes in physics. Springer, Berlin, pp 369–373

  • Liu X, Bertazzon S (2016) Fine scale spatio-temporal modelling of urban air pollution. In: Miller J, O’Sullivan D, Wiegand N (eds) Geographic Information Science. Springer, Cham, pp 210–224

  • Liu Z, Le ND, Zidek JV (2011) An empirical assessment of Bayesian melding for mapping ozone pollution. Environmetrics 22(3):340–353

  • Ma W, Gu S, Wang Y, Zhang X, Wang A, Zhao N, Song Y (2014) The use of mixed generalized additive modeling to assess the effect of temperature on the usage of emergency electrocardiography examination among the elderly in Shanghai. PLoS ONE 9(6):1–10

  • Martinez-Beneito M, Botella-Rocamora P (2019) Disease mapping from foundations to multidimensional modeling. Taylor & Francis Group, Boca Raton

  • Martínez Bello DA, López-Quílez A, Torres-Prieto A (2017) Relative risk estimation of dengue disease at small spatial scale. Int J Health Geogr 16(31):1–15

  • Martínez-Bello DA, López-Quílez A, Torres-Prieto A (2017) Bayesian dynamic modeling of time series of dengue disease case counts. PLoS Negl Trop Dis 11(7):1–19

  • Matheron G (1963) Principles of geostatistics. Econ Geol 58(8):1246–1266

  • Messina JP, Brady OJ, Golding N, Kraemer MU, Wint GR, Ray SE, Velayudha, (2019) The current and future global distribution and population at risk of dengue. Nat Microbiol 4(9):1508–1515

  • Miller D, Glennie R, Seaton A (2019) Understanding the stochastic partial differential equation approach to smoothing. J Agric Biol Environ Stat 25(1):1–16

  • Montgomery D, Peck E, Vining G (2012) Introduction to linear regression analysis. Wiley, Hoboken

  • Moraga P, Cramb S, Mengersen K, Pagano M (2017) A geostatistical model for combined analysis of point level and area level data using INLA and SPDE. Spat Stat 21(Part A):27–41

  • Muleia R, Boothe M, Loquiha O, Aerts M, Faes C (2020) Spatial distribution of HIV prevalence among young people in Mozambique. Int J Environ Res Public Health 17(3):885–904

  • Osei F, Stein A (2017) Diarrhea Morbidities in small areas: accounting for non-stationarity in sociodemographic impacts using Bayesian spatially varying coefficient modelling. Sci Rep 7(1):9908–9922

  • Payne E, Hardin J, Egede L, Ramakrishnan V, Selassie A, Gebregziabher M (2017) Approaches for dealing with various sources of overdispersion in modeling count data: scale adjustment versus modeling. Stat Methods Med Res 26(4):1802–1823

  • Peng R, Bell M (2010) Spatial misalignment in time series studies of air pollution and health data. Biostatistics 11(4):720–740

  • Phanitchat T, Zhao B, Haque U, Pientong C, Ekalaksananan T, Aromseree S, Overgaard H (2019) Spatial and temporal patterns of dengue incidence in northeastern Thailand 2006–2016. BMC Infect Dis 19(1):743–755

  • Pokharel G, Deardon R (2016) Gaussian process emulators for spatial individual level models of infectious disease. Can J Stat 44(4):480–501

  • Puggioni G, Couret J, Serman E, Akanda A, Ginsberg H (2020) Spatiotemporal modeling of dengue fever risk in Puerto Rico. Spat Spatio-Temporal Epidemiol 35:100375–100383.

  • Rasjid A, Yudhastuti R, Notobroto HB, Hartono R (2019) Climate change: An overview of the prevalence of dengue hemorrhagic fever in the South Sulawesi province of Indonesia. Indian J Public Health Res Dev 10(8):1982–1986

  • Righetto AJ, Faes C, Vandendijck Y, Ribeiro PJ Jr (2018) On the choice of the mesh for the analysis of geostatistical data using R-INLA. Commun Stat 49(1):203–220

  • Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman and Hall/CRC, Boca Raton

  • Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J R Stat Soc Ser B 71(2):319–392

  • Saez M, López-Casasnovas G (2019) Assessing the effects on health inequalities of differential exposure and differential susceptibility of air pollution and environmental noise in Barcelona, 2007–2014. Int J Environ Res Public Health 16(18):340–362

  • Sahu S, Gelfand A, Holland D (2010) Fusing point and areal level space–time data with application to wet deposition. J R Stat Soc Ser C Appl Stat 59(1):77–103

  • Schrödle B, Held L (2011) Spatio-temporal disease mapping using INLA. Environmetrics 22(6):725–734

  • Sedda L, Vilela AP, Aguia ER, Gaspar CH, Gonçalves AN, Olmo RP, Silva ATS, Silveira LC, Drumond BP, Marques JT (2018) The spatial and temporal scales of local dengue virus transmission in natural settings: A retrospective analysis. Parasites Vectors 11(1):79–92

  • Sherman M (2011) Spatial statistics and spatio-temporal data. Wiley, West Sussex

  • Shi X, Miller S, Mwenda K, Onda A, Rees J, Onega T, Moeschler J (2013) Mapping disease at an approximated individual level using aggregate data: a case study of mapping New Hampshire birth defects. Int J Environ Res Public Health 10(9):4161–4167

  • Sidén P, Lindgren F, Bolin D, Villani M (2018) Efficient covariance approximations for large sparse precision matrices. J Comput Graph Stat 27(4):898–909

  • Simpson D, Lindgren F, Rue H (2012) Think continuous: Markovian Gaussian models in spatial statistics. Spat Stat 1:16–29.

  • Simpson D, Rue H, Riebler A, Martins T, Sørbye S (2017) Penalising model component complexity: a principled, practical approach to constructing priors. Stat Sci 32(1):1–28

  • Sloan S (1993) A fast algorithm for generating constrained Delaunay triangulations. Comput Struct 47(3):441–450

  • Song C, Sh X, Wang J (2020) Spatiotemporally Varying Coefficients (STVC) model: a Bayesian local regression to detect spatial and temporal nonstationarity in variables relationships. Ann GIS 26(3):277–291

  • Song HR, Fuentes M, Ghosh S (2008) A comparative study of Gaussian geostatistical models and Gaussian Markov random field models. J Multivar Anal 99(8):1681–1697

  • Sørbye SH (2013) Tutorial: scaling IGMRF-models in R-INLA. University of Tromsø, Tromsø, Department of Mathematics and Statistics

  • Sørbye SH, Rue H (2017) Penalised complexity priors for stationary autoregressive processes. J Time Ser Anal 38(6):923–935

  • Sørbye SH, Rue H (2014) Scaling intrinsic Gaussian Markov random field priors in spatial modelling. Spat Stat 8:39–51.

  • Sparks C (2015) An examination of disparities in cancer incidence in Texas using Bayesian random coefficient models. PeerJ 3:1–24.

  • Stein ML (1999) Interpolation of spatial data: some theory for Kriging. Springer, New York

  • Truong P, Heuvelink G, Pebesma E (2014) Bayesian area-to-point Kriging using expert knowledge as informative priors. Int J Appl Earth Obs Geoinform 30:128–138.

  • Tu J, Yeoh GH, Liu C (2018) Computational fluid dynamics: a practical approach. Butterworth-Heinemann, Oxford

  • Utazi C, Thorley J, Alegana V, Ferrari M, Nilsen K, Takahashi S, Tatem A (2019) A spatial regression model for the disaggregation of areal unit-based data to high-resolution grids with application to vaccination coverage mapping. Stat Methods Med Res 28(10–11):3226–3241

  • Wand H, Whitaker C, Ramjee G (2011) Geoadditive models to assess spatial variation of HIV infections among women in local communities of Durban South Africa. Int J Health Geogr 10:28–36

  • Wang X, Yue YR, Faraway J (2018a) Bayesian regression modeling with INLA. Taylor and Francis Group LLC, Boca Raton

  • Wang C, Puhan M, Furrer R (2018b) Generalized spatial fusion model framework for joint analysis of point and areal data. Spat Stat 23:72–90.

  • Whittle P (1963) Stochastic processes in several dimensions. Bull Inst Int Stat 40:974–994

  • Whittle P (1954) On stationary processes in the plane. Biometrika 41(3–4):434–449

  • Wilastonegoro N, Kharisma D, Laksono I, Halasa-Rappel Y, Brady O, Shepard D (2020) Cost of dengue illness in Indonesia across hospital, ambulatory, and not medically attended settings. Am J Trop Med Hyg 103(5):2029–2039

  • WorldClim (2020) Global climate and weather data, version 2.1. WorldClim: Accessed 2 May 2020

  • Xu Y, Cancino-Muñoz I, Torres-Puente M, Villamayor L, Borras R, Borras-Mañez M, Escribano I (2019) High-resolution mapping of tuberculosis transmission: Whole genome sequencing and phylogenetic modelling of a cohort from Valencia Region, Spain. PLoS Med 16(10):1–20

  • Yin P, Mu L, Madden M, Vena J (2014) Hierarchical Bayesian modeling of spatio-temporal patterns of lung cancer incidence risk in Georgia, USA: 2000–2007. J Geogr Syst 16(1):387–407

  • Zellweger R, Cano J, Mangeas M, Taglioni F, Mercier A, Despinoy M, Teurlai M (2017) Socioeconomic and environmental determinants of dengue transmission in an urban setting: an ecological study in Noumea New Caledonia. PLoS Negl Trop Dis 11(4):1–18


We thank the Bandung City Health Office for providing the data. This research was funded by ALG Unpad contract: 1427/UN6.3.1/LT/2020.

Author information


Corresponding author

Correspondence to I. Gede Nyoman Mindra Jaya.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF 210 KB)

Supplementary file 2 (ZIP 5354 KB)


Appendix 1: The linear stochastic partial differential equation approach

A GF with a dense covariance matrix can be transformed to a GMRF with a sparse precision matrix by means of a Linear Stochastic Partial Differential Equation (LSPDE) (Lindgren et al. 2011), which reads:

$${\left({\kappa }^{2}-{\nabla }^{2}\right)}^{\alpha /2}{{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)={\mathcal{W}}_{t}\left({\varvec{s}}\right)\quad {\rm for}\,\,t=1,\ldots ,T\,\,{\rm and}\,\,{\varvec{s}}={\left({{\varvec{s}}}_{1},\ldots ,{{\varvec{s}}}_{g},\ldots ,{{\varvec{s}}}_{{n}_{p}}\right)}^{\prime}\,\,{\rm with}\,\,{{\varvec{s}}}_{g}=({s}_{g,1},{s}_{g,2})$$

where \({{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) is a temporally independent GF, \({\mathcal{W}}_{t}\left({\varvec{s}}\right)\) a Gaussian white noise process, \(\alpha\) a positive integer related to the smoothness parameter of the Matérn covariance function in Eq. (11) by \(\alpha =v+1\), and \({\nabla }^{2}\) the Laplace operator with respect to the two spatial coordinates, i.e., \({\nabla }^{2}=\frac{{\partial }^{2}}{\partial {s}_{g,1}^{2}}+\frac{{\partial }^{2}}{\partial {s}_{g,2}^{2}}\). Using spectral decomposition, Whittle (1954, 1963) showed that for \(\alpha >1\), the only exact stationary solution to Eq. (34) is the isotropic Matérn field, i.e., the stationary GF \({{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) with the Matérn covariance function in Eq. (11).
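
As an illustration, consider the integer-order case \(\alpha =2\), so that \(v=1\) (see footnote 24). The display below is a sketch under the assumption that the white noise in Eq. (34) has unit variance and that no separate scaling parameter is present. The variance expression is the general result reported by Lindgren et al. (2011) for dimension \(d\), evaluated here at \(d=2\); the symbol \({\sigma }^{2}\) for the marginal variance is introduced only for this illustration.

$$\left({\kappa }^{2}-{\nabla }^{2}\right){\upgamma }_{t}\left({\varvec{s}}\right)={\kappa }^{2}{\upgamma }_{t}\left({\varvec{s}}\right)-\frac{{\partial }^{2}{\upgamma }_{t}\left({\varvec{s}}\right)}{\partial {s}_{g,1}^{2}}-\frac{{\partial }^{2}{\upgamma }_{t}\left({\varvec{s}}\right)}{\partial {s}_{g,2}^{2}}={\mathcal{W}}_{t}\left({\varvec{s}}\right),\qquad {\sigma }^{2}=\frac{\Gamma \left(v\right)}{\Gamma \left(\alpha \right){\left(4\pi \right)}^{d/2}{\kappa }^{2v}}=\frac{1}{4\pi {\kappa }^{2}}\quad {\rm for}\,\,v=1,\,\alpha =2,\,d=2.$$

Under these assumptions, a larger \(\kappa\) implies both a shorter correlation range and a smaller marginal variance.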

A closed-form solution for the LSPDE in Eq. (34) is restricted to regular lattices (Lindgren et al. 2011). For an irregular lattice, it can be approximated through a basis function representation using the Finite Element Method (FEM) defined on the domain \(\mathcal{A}\subset {\mathbb{R}}^{2}\). The basis function representation is defined on a mesh, that is, a collection of (i) vertices, (ii) the edges between the vertices, and (iii) the polygons described by the edges. A mesh consists of a minimum of three connected edges conforming to the shape of the domain. It subdivides a continuous geometric space into a finite set of discrete geometric or topological elements such as triangles or rectangles (for a two-dimensional space) or tetrahedra or rectangular prisms (for a three-dimensional space), thereby reducing the degrees of freedom from infinite to finite. Because the FEM calculations are based on a finite number of elements and the results are generalized to the entire domain through interpolation, the accuracy of the global solution depends on the number of elements of the mesh (Bohn and Feischl 2021).
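
The dependence of accuracy on the number of elements can be illustrated in one dimension. The following minimal Python sketch is not the paper's method: the target function `math.sin`, the interval \([0,\pi ]\) and the helper name `max_interp_error` are assumptions chosen for illustration. It measures the largest error of piecewise linear interpolation for meshes of increasing resolution.

```python
import math


def max_interp_error(n_elements, f=math.sin, a=0.0, b=math.pi, n_check=2000):
    """Largest absolute error of piecewise linear interpolation of f on [a, b]."""
    h = (b - a) / n_elements
    nodes = [a + i * h for i in range(n_elements + 1)]
    values = [f(x) for x in nodes]
    worst = 0.0
    for k in range(n_check + 1):
        x = a + (b - a) * k / n_check
        i = min(int((x - a) / h), n_elements - 1)  # element containing x
        w = (x - nodes[i]) / h                      # local coordinate within the element
        approx = (1.0 - w) * values[i] + w * values[i + 1]
        worst = max(worst, abs(approx - f(x)))
    return worst


# Error for meshes with 5, 10, 20 and 40 elements of equal length.
errors = {n: max_interp_error(n) for n in (5, 10, 20, 40)}
```

In this smooth example, halving the element length reduces the error by roughly a factor of four, which is the usual behaviour of linear elements. The gain has to be weighed against the larger number of weights to be estimated.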

Triangulation is a common FEM meshing scheme (Sloan 1993) because of its flexibility for irregular domains (Lindquist and Gilest 1989) and accuracy (due to the minimization of the discretization error; Footnote 23) (Ahmadian et al. 1998). Triangulation divides the domain into a set of non-intersecting triangles, where any two triangles meet in, at most, a common edge or vertex. The popular Delaunay triangulation scheme (Cheng et al. 2013) ensures that the triangulation maximizes the minimum angle of the triangles, thus avoiding sliver (i.e., long and thin) triangles and rendering the transitions between small and large triangles smooth (Lindgren et al. 2011). The restricted Bowyer–Watson algorithm, which is designed to conform to the domain’s boundary, is a popular Delaunay triangulation algorithm (Cheng et al. 2013).
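
The max-min angle property can be verified on a single quadrilateral, which admits exactly two triangulations. The sketch below is a minimal Python illustration, not the restricted Bowyer–Watson algorithm: the four points and the helper names `in_circumcircle` and `min_angle` are hypothetical. It applies the empty-circumcircle test that characterizes a Delaunay triangulation and compares the smallest angles of the two candidate splits.

```python
import math


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle (a, b, c).

    The vertices a, b, c must be given in counter-clockwise order.
    """
    m = []
    for q in (a, b, c):
        dx, dy = q[0] - p[0], q[1] - p[1]
        m.append((dx, dy, dx * dx + dy * dy))
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return det > 0.0


def min_angle(tri):
    """Smallest interior angle of a triangle, in degrees."""
    angles = []
    for i in range(3):
        p, q, r = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return min(angles)


# Hypothetical convex quadrilateral, vertices in counter-clockwise order.
A, B, C, D = (0.0, 0.0), (4.0, 0.0), (5.0, 2.0), (1.0, 2.0)

split_bd = [(A, B, D), (B, C, D)]  # triangulation using diagonal B-D
split_ac = [(A, B, C), (A, C, D)]  # triangulation using diagonal A-C

# A triangulation is Delaunay if no vertex lies inside any triangle's circumcircle.
bd_is_delaunay = not in_circumcircle(A, B, D, C) and not in_circumcircle(B, C, D, A)
ac_is_delaunay = not in_circumcircle(A, B, C, D) and not in_circumcircle(A, C, D, B)

min_angle_bd = min(min_angle(t) for t in split_bd)
min_angle_ac = min(min_angle(t) for t in split_ac)
```

For these points the split along the shorter diagonal passes the circumcircle test and has the larger minimum angle, so it is the one a Delaunay scheme would select.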

A drawback of applying an algorithm with boundary conditions is that the variance near the boundary is inflated by a factor of two (Lindgren et al. 2011). To avoid the boundary effect, Lindgren and Rue (2015) proposed to extend the domain of interest by an outer area at a distance of at least \(r\), corresponding to a correlation of approximately 0.1 between two points in the inner and outer areas. The edge length of the triangles of the outer area should be at least equal to the edge length of the triangles of the inner area (Blangiardo and Cameletti 2015). To find the appropriate mesh for the data at hand, meshes of different sizes are usually considered, which can be evaluated using the DIC and WAIC (Righetto et al. 2018).

Given the triangular mesh, the FEM approximation of the solution of the LSPDE in Eq. (34) is:

$${{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\approx \sum_{l=1}^{L}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\stackrel{\sim }{\upgamma }}_{tl}\,\,{\rm for }\quad t=1,\ldots ,T , {\varvec{s}}={({\varvec{s}}}_{1},\ldots {{\varvec{s}}}_{g},\ldots , {{\varvec{s}}}_{{n}_{p}}){^{\prime}}\,\,{\rm and}\,\,{{\varvec{s}}}_{g}=({s}_{g,1},{s}_{g,2})$$

with \({\left\{{{\varvec{\psi}}}_{l}({\varvec{s}})\right\}}_{l=1}^{L}\) the set of piecewise linear basis functions, \(L\) the number of vertices in the triangulation, and \(l\) the index of the \(l\)th vertex. For location \({{\varvec{s}}}_{g}\), the piecewise linear basis function \({{\varvec{\psi}}}_{l}({{\varvec{s}}}_{g})\) is:

$${{\varvec{\psi}}}_{l}\left({{\varvec{s}}}_{g}\right)=\left\{\begin{array}{ll}1 &\quad{\rm if\, the \,vertex }\,l \,{\rm is\, at\,location }\,{{\varvec{s}}}_{g} \\ 0&\quad {\rm elsewhere}.\end{array}\right.$$

The \({\left\{{\stackrel{\sim }{\upgamma }}_{tl}\right\}}_{l=1}^{L}\) are zero-mean Gaussian-distributed weights, with \({\stackrel{\sim }{\upgamma }}_{tl}\) the value of \({{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) at vertex \(l\). Hence, \({{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) is uniquely defined by its values at the vertices \(l,\,\,l=1, 2,\ldots ,L\) of the mesh; the values in the interior of the triangles are obtained by linear interpolation. Consequently, the distribution of \({{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) is determined by the joint distribution of the weights \({\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}=\left({\stackrel{\sim }{\upgamma }}_{t1},\ldots ,{\stackrel{\sim }{\upgamma }}_{tL}\right)\boldsymbol{^{\prime}}\) with sparse precision matrix \({\stackrel{\sim }{\mathbf{Q}}}_{s}\) (Lindgren et al. 2011).
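The interpolation step can be illustrated with a minimal sketch. It assumes that, inside a single triangle, the three piecewise linear basis functions coincide with the barycentric coordinates of the evaluation point (each equals 1 at its own vertex, 0 at the other two vertices, and is linear in between). The triangle coordinates and vertex weights below are hypothetical values chosen for illustration and are not taken from the application.

```python
def barycentric_weights(p, a, b, c):
    """Values of the three piecewise linear basis functions at point p
    for the triangle with vertices a, b, c (2-D coordinate pairs)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate(p, vertices, weights):
    """Evaluate sum_l psi_l(p) * weight_l on one triangle."""
    psi = barycentric_weights(p, *vertices)
    return sum(w * g for w, g in zip(psi, weights))


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # hypothetical mesh triangle
gamma_tilde = [2.0, 4.0, 6.0]                    # hypothetical vertex weights

# At a vertex the field equals the weight of that vertex.
at_vertex = interpolate((1.0, 0.0), triangle, gamma_tilde)
# At the centroid the field is the average of the three weights.
at_centroid = interpolate((1.0 / 3.0, 1.0 / 3.0), triangle, gamma_tilde)
```

In a full mesh the same evaluation is applied triangle by triangle, so each location depends on at most three weights, which is what makes the resulting projection sparse.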

To show that Eq. (34) approximates the GF \({{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\), we take \(\alpha =2\) (Footnote 24), corresponding to \(v\) = 1, define the LSPDE in Eq. (34) as a variational problem, i.e., multiply it by an arbitrary test function \(\boldsymbol{\varphi }\left({\varvec{s}}\right) \quad {{\rm for}}\,{\varvec{s}}={({\varvec{s}}}_{1},\ldots {{\varvec{s}}}_{g},\ldots , {{\varvec{s}}}_{{n}_{p}}){^{\prime}}\) with \({{\varvec{s}}}_{g}=({s}_{g,1},{s}_{g,2})\), and integrate it by parts over the domain \(\mathcal{A}\) using Green’s first identity (Footnote 25) (Langtangen and Logg 2016). Multiplying the LSPDE in Eq. (34) by a test function \(\boldsymbol{\varphi }\left({\varvec{s}}\right)\) gives:

$${\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}})\left({\kappa }^{2}-{\nabla }^{2}\right){{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}={\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}}){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}$$
$${\int }_{\mathcal{A}}{\kappa }^{2}\boldsymbol{\varphi }({\varvec{s}}){{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}-{\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}}){\nabla }^{2}{{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}={\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}}){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}$$

where \(\mathcal{A}\) is the entire spatial domain over which Eq. (37) is to be solved and \(d{\varvec{s}}\) is shorthand for \(d{{\varvec{s}}}_{1}d{{\varvec{s}}}_{2}\) of the two-dimensional integral. Partial integration of \(-{\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}}){\nabla }^{2}{{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}\) gives:

$$-{\int }_{\mathcal{A}}\boldsymbol{\varphi }\left({\varvec{s}}\right){\nabla }^{2}{{\varvec{\upgamma}}}_{{\varvec{t}}}({\varvec{s}}){\rm d}{\varvec{s}}={\int }_{\mathcal{A}}\nabla \boldsymbol{\varphi }\left({\varvec{s}}\right)\nabla {{\varvec{\upgamma}}}_{{\varvec{t}}}({\varvec{s}}){\rm d}{\varvec{s}}-{\int }_{\partial \mathcal{A}}\frac{\partial {{\varvec{\upgamma}}}_{{\varvec{t}}}({\varvec{s}})}{\partial \mathbf{n}}\boldsymbol{\varphi }\left({\varvec{s}}\right){\rm d}\stackrel{\sim }{{\varvec{s}}}$$

where \(\frac{\partial {{\varvec{\upgamma}}}_{{\varvec{t}}}({\varvec{s}})}{\partial \mathbf{n}}=\mathbf{n}\bullet \nabla {{\varvec{\upgamma}}}_{t}\left({\varvec{s}}\right)\) is the directional derivative of \({{\varvec{\upgamma}}}_{t}({\varvec{s}})\) in outward normal direction \(\mathbf{n}\) on the boundary \(\partial \mathcal{A}\) (i.e., the subset of points which can be approached both from \(\mathcal{A}\) and from the outside of \(\mathcal{A}\)) (Langtangen and Logg 2016) and \(d\stackrel{\sim }{{\varvec{s}}}\) denotes the line element of the integral along the boundary \(\partial \mathcal{A}\). Substituting Eq. (38) in Eq. (37), we have:

$${\int }_{\mathcal{A}}{\kappa }^{2}\boldsymbol{\varphi }({\varvec{s}}){{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}+{\int }_{\mathcal{A}}\nabla \boldsymbol{\varphi }\left({\varvec{s}}\right)\nabla {{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}-{\int }_{\partial \mathcal{A}}\frac{\partial {{\varvec{\upgamma}}}_{t}({\varvec{s}})}{\partial \mathbf{n}}\boldsymbol{\varphi }\left({\varvec{s}}\right){\rm d}\stackrel{\sim }{{\varvec{s}}}={\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}}){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}$$

with the Neumann boundary condition:

$${\left.\frac{\partial {{\varvec{\upgamma}}}_{t}({\varvec{s}})}{\partial \mathbf{n}}\right|}_{\partial \mathcal{A}}=0,$$

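where \({\left.\cdot \right|}_{\partial \mathcal{A}}\) denotes evaluation on the boundary \(\partial \mathcal{A}\) (Bakka 2019). Under this condition the boundary integral vanishes, so Eq. (39) can be written as: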

$${\kappa }^{2}{\int }_{\mathcal{A}}\boldsymbol{\varphi }({\varvec{s}}){{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}+{\int }_{\mathcal{A}}\nabla \boldsymbol{\varphi }\left({\varvec{s}}\right)\nabla {{\varvec{\upgamma}}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}={\int }_{\mathcal{A}}\boldsymbol{\varphi }\left({\varvec{s}}\right){\mathcal{W}}_{t}\left({\varvec{s}}\right){\rm d}{\varvec{s}}.$$

Because the set of test functions is infinite (Langtangen and Logg 2016), it is not possible to test Eq. (40) against every test function \(\boldsymbol{\varphi }({\varvec{s}})\). As a solution, the FEM constructs a finite set of test functions \({\left\{{\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right)\right\}}_{h=1}^{L}\) against which Eq. (40) is tested. Substituting the basis function representation in Eq. (35), with basis functions \({\left\{{{\varvec{\psi}}}_{l}({\varvec{s}})\right\}}_{l=1}^{L}\), in Eq. (40), we obtain, for \(h=1,\ldots ,L\), the system of linear equations:

$${\int }_{\mathcal{A}}{\kappa }^{2}{\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right)\sum_{l=1}^{L}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\stackrel{\sim }{\upgamma }}_{tl}{\rm d}{\varvec{s}}+{\int }_{\mathcal{A}}\nabla {\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right)\nabla \sum_{l=1}^{L}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\stackrel{\sim }{\upgamma }}_{tl}{\rm d}{\varvec{s}}={\int }_{\mathcal{A}}{\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}},$$
$$\sum_{l=1}^{L}\left({\kappa }^{2}{\int }_{\mathcal{A}}{\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right){{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\rm d}{\varvec{s}}+{\int }_{\mathcal{A}}\nabla {\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right)\nabla {{\varvec{\psi}}}_{l}({\varvec{s}}){\rm d}{\varvec{s}}\right){\stackrel{\sim }{\upgamma }}_{tl}={\int }_{\mathcal{A}}{\boldsymbol{\varphi }}_{h}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}.$$

The test functions are commonly taken to be equal to the basis functions, i.e., \({\boldsymbol{\varphi }}_{h}({\varvec{s}})={{\varvec{\psi}}}_{h}({\varvec{s}})\) for \(h=1,\ldots ,L\) (Footnote 26). Hence:

$$\sum_{l=1}^{L}\left({\kappa }^{2}{\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\rm d}{\varvec{s}}+{\int }_{\mathcal{A}}\nabla {{\varvec{\psi}}}_{h}\left({\varvec{s}}\right)\nabla {{\varvec{\psi}}}_{l}({\varvec{s}}){\rm d}{\varvec{s}}\right){\stackrel{\sim }{\upgamma }}_{tl}={\int }_{\mathcal{A}}{{\varvec{\psi}}}_{h}({\varvec{s}}){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}.$$

The integral on the right-hand side of Eq. (42) is Gaussian distributed, \(\mathcal{N}\left(0,{\int }_{\mathcal{A}}{{\varvec{\psi}}}_{h}^{2}({\varvec{s}}){\rm d}{\varvec{s}}\right)\). For \(l,h=1,\ldots ,L\), these integrals have mean zero and a covariance matrix with elements (Footnote 27):

$$Cov\left({\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}},{\int }_{\mathcal{A}}{{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\mathcal{W}}_{t}({\varvec{s}}){\rm d}{\varvec{s}}\right)={\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\rm d}{\varvec{s}}.$$

We can write Eq. (42) in matrix form as (Simpson et al. 2012):

$${{\varvec{K}}}_{{\varvec{s}}}{\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\sim \mathcal{N}\left(0,{\mathbf{C}}_{{\varvec{s}}}\right)$$

where \({{\varvec{K}}}_{{\varvec{s}}}={\upkappa }^{2}{\mathbf{C}}_{{\varvec{s}}}+{{\varvec{M}}}_{{\varvec{s}}}\), \({\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}={\left({\stackrel{\sim }{\upgamma }}_{t1},\ldots ,{\stackrel{\sim }{\upgamma }}_{tL}\right)}^{{\prime}}\), \({{\rm C}}_{s,lh}={\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\rm d}{\varvec{s}},\) and \({M}_{s,lh}={\int }_{\mathcal{A}}\nabla {{\varvec{\psi}}}_{l}\left({\varvec{s}}\right)\cdot \nabla {{\varvec{\psi}}}_{h}({\varvec{s}}){\rm d}{\varvec{s}}\).

Because of the highly local nature of the piecewise linear basis functions, \({\mathbf{C}}_{{\varvec{s}}}\) and \({\mathbf{M}}_{{\varvec{s}}}\) are sparse matrices. However, \({\mathbf{C}}_{{\varvec{s}}}^{-1}\) is a dense matrix, so the implied precision matrix of \({\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\) is dense and \({\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\) will generally not be a GMRF. Lindgren et al. (2011) proposed to solve this issue by reducing the integration order of the inner product of the piecewise linear basis functions \({{\varvec{\psi}}}_{l}\left({\varvec{s}}\right)\) and \({{\varvec{\psi}}}_{h}\left({\varvec{s}}\right)\) for vertices \(l=1,\ldots ,L\) and \(h=1,\ldots ,L\) over the domain \(\mathcal{A}\) (i.e., \({\langle {{\varvec{\psi}}}_{l}\left({\varvec{s}}\right),{{\varvec{\psi}}}_{h}\left({\varvec{s}}\right)\rangle }_{\mathcal{A}}={\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){{\varvec{\psi}}}_{h}\left({\varvec{s}}\right){\rm d}{\varvec{s}}\)) (Footnote 28) by taking \({{\varvec{\psi}}}_{h}\left({\varvec{s}}\right)\) as the constant function 1, yielding \({\langle {{\varvec{\psi}}}_{l}\left({\varvec{s}}\right),1\rangle }_{\mathcal{A}}={\int }_{\mathcal{A}}{{\varvec{\psi}}}_{l}\left({\varvec{s}}\right){\rm d}{\varvec{s}}= \sum_{h=1}^{L}{{\rm C}}_{s,lh}\). The result is a diagonal approximation of \({\mathbf{C}}_{{\varvec{s}}}\), denoted \({\stackrel{\sim }{\mathbf{C}}}_{{\varvec{s}}},\) with diagonal elements \({\stackrel{\sim }{{\rm C}}}_{{\varvec{s}},ll}=\sum_{h=1}^{L}{{\rm C}}_{s,lh}\). Replacing \({\mathbf{C}}_{{\varvec{s}}}\) with \({\stackrel{\sim }{\mathbf{C}}}_{{\varvec{s}}}\) yields

$${\mathbf{K}}_{{\varvec{s}}}{\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\sim \mathcal{N}\left(0,{\stackrel{\sim }{\mathbf{C}}}_{{\varvec{s}}}\right)$$


$${\stackrel{\sim }{{\varvec{\upgamma}}}}_{t}\sim \mathcal{N}\left(0,{\stackrel{\sim }{\mathbf{Q}}}_{\mathbf{s}}^{-1}\right)$$

with \({\stackrel{\sim }{\mathbf{Q}}}_{\mathbf{s}}\) the precision matrix

$${\stackrel{\sim }{\mathbf{Q}}}_{\mathbf{s}}={{\mathbf{K}}_{\mathbf{s}}}^{\mathbf{^{\prime}}}{\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}^{-1}{\mathbf{K}}_{\mathbf{s}}={\left({\upkappa }^{2}{\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}+{\mathbf{M}}_{\mathbf{s}}\right)}^{\mathbf{^{\prime}}}{\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}^{-1}\left({\upkappa }^{2}{\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}+{\mathbf{M}}_{\mathbf{s}}\right)$$
$${\stackrel{\sim }{\mathbf{Q}}}_{\mathbf{s}}=\left({\upkappa }^{4}{\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}+2{\upkappa }^{2}{\mathbf{M}}_{\mathbf{s}}+{\mathbf{M}}_{\mathbf{s}}{\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}^{-1}{\mathbf{M}}_{\mathbf{s}}\right)\quad {\rm for \,every }\,t,$$

where \(\upkappa =\sqrt{8v}/r.\) Because of the serial independence assumption in Eq. (9), \({\stackrel{\sim }{\mathbf{Q}}}_{\mathbf{s}}\) is constant over time.
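The effect of the diagonal approximation on sparsity can be checked numerically. The following is a minimal sketch, not the R-INLA implementation: for brevity it assumes a one-dimensional mesh with piecewise linear basis functions as a stand-in for the triangulation, and the node positions and the range used to compute \(\upkappa\) are hypothetical. It assembles \(\mathbf{C}\) and \(\mathbf{M}\), forms the diagonal \(\stackrel{\sim }{\mathbf{C}}\) from the row sums of \(\mathbf{C}\), and builds \(\stackrel{\sim }{\mathbf{Q}}={\upkappa }^{4}\stackrel{\sim }{\mathbf{C}}+2{\upkappa }^{2}\mathbf{M}+\mathbf{M}{\stackrel{\sim }{\mathbf{C}}}^{-1}\mathbf{M}\).

```python
import math


def assemble_1d(nodes):
    """Matrices C (inner products of basis functions) and M (inner products
    of their gradients) for piecewise linear basis functions on a 1-D mesh."""
    n = len(nodes)
    C = [[0.0] * n for _ in range(n)]
    M = [[0.0] * n for _ in range(n)]
    for e in range(n - 1):
        h = nodes[e + 1] - nodes[e]          # length of element e
        for a in (0, 1):
            for b in (0, 1):
                C[e + a][e + b] += h / 3.0 if a == b else h / 6.0
                M[e + a][e + b] += 1.0 / h if a == b else -1.0 / h
    return C, M


def matmul(A, B):
    """Dense matrix product for small list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def precision(nodes, kappa):
    """Q = kappa^4 C~ + 2 kappa^2 M + M C~^{-1} M with diagonal C~."""
    C, M = assemble_1d(nodes)
    n = len(nodes)
    c_lumped = [sum(row) for row in C]       # diagonal of C~: row sums of C
    # C~^{-1} M: divide row i of M by the i-th diagonal element of C~.
    CinvM = [[M[i][j] / c_lumped[i] for j in range(n)] for i in range(n)]
    MCM = matmul(M, CinvM)
    Q = [[2.0 * kappa ** 2 * M[i][j] + MCM[i][j] for j in range(n)]
         for i in range(n)]
    for i in range(n):
        Q[i][i] += kappa ** 4 * c_lumped[i]
    return Q


nodes = [0.0, 0.5, 1.0, 2.0, 2.5, 3.0]   # hypothetical irregular 1-D mesh
kappa = math.sqrt(8 * 1.0) / 5.0          # sqrt(8 v) / r with v = 1, hypothetical r = 5
Q = precision(nodes, kappa)
n = len(nodes)
# Entries linking vertices more than two steps apart are exactly zero.
is_banded = all(Q[i][j] == 0.0
                for i in range(n) for j in range(n) if abs(i - j) > 2)
```

Because \(\mathbf{M}\) only links neighboring vertices and \(\stackrel{\sim }{\mathbf{C}}\) is diagonal, the product \(\mathbf{M}{\stackrel{\sim }{\mathbf{C}}}^{-1}\mathbf{M}\) only links vertices that are at most two steps apart, which is the Markov property exploited by the GMRF representation.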

Bolin and Lindgren (2009) compared the exact FEM approach (using \({\mathbf{C}}_{{\varvec{s}}}\) with all the elements of \({\mathbf{C}}_{{\varvec{s}}}\) calculated as a Hilbert space wavelet model such as a B-spline or a Daubechies wavelet model) and the Markov approximation (replacing \({\mathbf{C}}_{{\varvec{s}}}\) with \({\stackrel{\sim }{\mathbf{C}}}_{\mathbf{s}}\)) for spatial prediction and found that the differences between the two approaches in terms of prediction errors are negligible.

Appendix 2

This appendix consists of three parts. The first part is Table 5, which presents the priors and hyperpriors for the FGG-GMRF model given by Eq. (33). The second part consists of comments on the priors and hyperpriors, and the final part deals with identification.

Priors, joint priors and hyperpriors for the FGG-GMRF model

See Table 5.

Table 5 Priors, joint priors and hyperpriors

Comments on the priors and hyperpriors

The following observations apply to the parameters and hyperparameters of the FGG-GMRF. Due to the lack of strong prior knowledge, we used a vague Gaussian prior distribution with a zero mean and a very large variance for the parameters \({\beta }_{0}: {\beta }_{0}\sim \mathcal{N}\left(0,{10}^{6}\right)\) and \({\beta }_{k}: {\beta }_{k}\sim \mathcal{N}\left(0,{10}^{6}\right)\) for \(k=1,\ldots ,K\) (Blangiardo and Cameletti 2015; Martinez-Beneito and Botella-Rocamora 2019). A weakly informative prior (Footnote 29) was assigned to the log-odds of the spatial autoregressive parameter \(\rho :{\rm log}\left(\rho /(1-\rho )\right)\sim \mathcal{N}\left(0, 0.45\right)\) (Utazi et al. 2019). Note that the transformation is used to ensure that \(\rho\) takes values between 0 and 1 (Bivand et al. 2015; Martinez-Beneito and Botella-Rocamora 2019).

For the scale hyperparameters (variance and standard deviation) of the temporally varying coefficients, spatially and temporally structured and unstructured random effects, spatiotemporal interaction of the random effects, and the Gaussian field (GF), the Inverse Gamma (IG), half-Cauchy (HC), Uniform and Penalized Complexity (PC) distributions are common hyperpriors. The IG distribution is a well-known, easy-to-use, conjugate hyperprior for the variance (Coly et al. 2021; Hamura et al. 2021) (Footnote 30). The HC is a Cauchy distribution truncated at zero. Gelman (2006) recommended the HC with scale parameter 25 as an alternative to the IG when the variance hyperparameter is close to zero. The HC is also a frequently used hyperprior for the standard deviation (Gómez-Rubio 2020). As the scale parameter goes to infinity, the HC converges to the Uniform hyperprior (Gelman 2006; Gómez-Rubio 2020). However, for the Uniform hyperprior, it is difficult to set ranges of values for the standard deviation using R-INLA. The reason is that R-INLA assumes that the standard deviation is unbounded from above (Gelman et al. 2017; Gómez-Rubio 2020), whereas a fixed upper bound on the standard deviation is frequently needed to avoid overfitting (Footnote 31) (Simpson et al. 2017). As an alternative to the HC and Uniform hyperpriors for the standard deviation, Simpson et al. (2017) proposed the PC hyperprior. The PC allows probability statements on the values of parameters such as the standard deviation, autocorrelation and range parameters (Franco-Villoria et al. 2019). The parameters can be restricted using either an upper bound or a lower bound but not both. The PC is frequently used for autoregressive processes (Sørbye and Rue 2017), varying coefficients models and high-resolution prediction models (Franco-Villoria et al. 2019).

In the application, we assigned the IG (1, 0.01) hyperprior to the variance of the autoregressive AR1 model of the temporally structured effects (\({\sigma }_{\phi }^{2}\)) and to the variances of the exchangeable (iid) priors of the spatially (\({\sigma }_{\upsilon }^{2})\) and temporally (\({\sigma }_{\varsigma }^{2})\) unstructured random effects, respectively. Following Lawson et al. (2010), we assigned the HC hyperprior with scale parameter to the standard deviation of the Leroux prior (Leroux et al. 2000) of the spatially structured random effect \(\left({\sigma }_{\omega }^{2}\right)\).

Regarding the hyperpriors for the variances of the RW1 and RW2 priors, the following applies. The marginal variances of the RW1 and RW2 priors reflect the smoothness of the \(k\)th vector of temporal random effects \({{\varvec{\zeta}}}_{k}={\left({\zeta }_{k,1},\ldots ,{\zeta }_{k,T}\right)}^{{{\prime}}}\) for \(k=1,\ldots ,K\) (Footnote 32). However, the marginal variances are affected by the temporal structure matrices and by the scale of the risk factors, which makes them incomparable as smoothness indicators (Sørbye and Rue 2014; Blangiardo and Cameletti 2015; Gómez-Rubio 2020; Kang et al. 2015). Because the temporal structure matrices carry over to the hyperpriors, the marginal variances need to be made comparable before hyperpriors are assigned. To overcome the incomparability problem, Sørbye and Rue (2014) proposed to scale the temporal structure matrices by the geometric means of the diagonal elements of their inverses. However, the random walk models of order one (RW1) and two (RW2) are intrinsic Gaussian Markov random fields (IGMRFs) (Footnote 33), whose structure matrices are rank deficient, so that generalized inverses are needed. They have zero mean and generalized covariance matrices \({{\varvec{\Sigma}}}_{{\zeta }_{k}}^{RW1}={\sigma }_{{\zeta }_{k}}^{2}{\mathbf{R}}_{{\zeta }_{k}}^{RW1-}\) and \({{\varvec{\Sigma}}}_{{\zeta }_{k}}^{RW2}={\sigma }_{{\zeta }_{k}}^{2}{\mathbf{R}}_{{\zeta }_{k}}^{RW2-}\), respectively, with \({\mathbf{R}}_{{\zeta }_{k}}^{RW1-}\) and \({\mathbf{R}}_{{\zeta }_{k}}^{RW2-}\) the generalized inverses of the temporal structure matrices of RW1 and RW2, respectively.

Sørbye and Rue (2014) proposed to scale the temporal structure matrices of the RW1 and RW2 priors by the geometric mean of the corresponding diagonal elements of their generalized inverses \({\mathbf{R}}_{{\zeta }_{k}}^{-}\):

$${\rm exp}\left(\frac{1}{T}\sum_{t=1}^{T}{\rm log}\left({\rm diag}\left({\mathbf{R}}_{{\zeta }_{k}}^{-}\right)\right)\right)={\sigma }_{GV}^{2}$$

with \({\sigma }_{GV}^{2}\) called the generalized variance of the diagonal elements of \({\mathbf{R}}_{{\zeta }_{k}}^{-}\) (see Sørbye and Rue (2014) for details). Scaling of \({{\varvec{R}}}_{{\zeta }_{k}}\) gives the scaled temporal structure matrix \({\stackrel{\sim }{{\varvec{R}}}}_{{\zeta }_{k}}={{\varvec{R}}}_{{\zeta }_{k}}{\sigma }_{GV}^{2}\) and scaled generalized covariance matrix as \({\stackrel{\sim }{{\varvec{\Sigma}}}}_{{\zeta }_{k}}={\sigma }_{{\zeta }_{k}}^{2}{\stackrel{\sim }{{\varvec{R}}}}_{{\zeta }_{k}}^{-}\) with scaled generalized variance \({\stackrel{\sim }{\sigma }}_{GV}^{2}=1\).

Scaling the RW1 and RW2 priors can be straightforwardly done in R-INLA by using the option scale.model = TRUE (Blangiardo and Cameletti 2015; Sørbye 2013). After scaling, the same hyperprior IG (1, 0.01) can be assigned to the variance \({\sigma }_{{\zeta }_{k}}^{2}\) of the IGMRF priors RW1 and RW2 (Footnote 34). We followed this procedure in the application.
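The scaling step can be reproduced by hand for a small RW1 model. The sketch below is an illustration under stated assumptions and not what R-INLA does internally: it assumes that the generalized inverse is the Moore-Penrose inverse, uses the fact that the null space of the RW1 structure matrix is spanned by the constant vector, and takes a hypothetical series length \(T=12\). All function names are introduced here for illustration only.

```python
import math


def rw1_structure(T):
    """Structure matrix R of a first-order random walk of length T."""
    R = [[0.0] * T for _ in range(T)]
    for t in range(T - 1):
        R[t][t] += 1.0
        R[t + 1][t + 1] += 1.0
        R[t][t + 1] -= 1.0
        R[t + 1][t] -= 1.0
    return R


def invert(A):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(A)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def generalized_inverse(R):
    """Moore-Penrose inverse of a symmetric matrix whose null space is spanned
    by the constant vector: R^- = (R + J/T)^{-1} - J/T, J the all-ones matrix."""
    T = len(R)
    shifted = [[R[i][j] + 1.0 / T for j in range(T)] for i in range(T)]
    inv = invert(shifted)
    return [[inv[i][j] - 1.0 / T for j in range(T)] for i in range(T)]


def generalized_variance(R_ginv):
    """Geometric mean of the diagonal elements of the generalized inverse."""
    T = len(R_ginv)
    return math.exp(sum(math.log(R_ginv[t][t]) for t in range(T)) / T)


T = 12                                   # hypothetical number of time points
R = rw1_structure(T)
sigma2_gv = generalized_variance(generalized_inverse(R))
R_scaled = [[sigma2_gv * v for v in row] for row in R]
# After scaling, the generalized variance equals one up to rounding error.
sigma2_gv_scaled = generalized_variance(generalized_inverse(R_scaled))
```

Multiplying the structure matrix by \({\sigma }_{GV}^{2}\) divides its generalized inverse by the same factor, which is why the geometric mean of the diagonal becomes one and the same hyperprior can be used for RW1 and RW2.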

To avoid overfitting, we followed Fuglstad et al. (2020) and applied a weakly informative PC hyperprior to the range \(\left(r\right)\) and the standard deviation (\({\sigma }_{\Phi }\)) of the LSPDE model. Particularly, we applied \({\rm Pr}\left({\sigma }_{\Phi }>{\sigma }_{\Phi 0}\right)={\alpha }_{{\sigma }_{\Phi }}\) with \({\sigma }_{\Phi 0}\) and \({\alpha }_{{\sigma }_{\Phi }}\) denoting the lower limit and lower tail probability of the PC hyperprior for \({\sigma }_{\gamma }\), respectively. The actual value for the standard deviation was 1 and 0.001 for the tail probability, i.e., \({\rm Pr}\left({\sigma }_{\Phi }>1\right)=0.01\). For the range \(r\) we selected \({\rm Pr}\left(r<{r}_{0}\right)={\alpha }_{r}\) with \({r}_{0}\) and \({\alpha }_{r}\) denoting the upper limit and upper tail probability, respectively. We took prior information \({r}_{0}= 5{\rm km}\) which is the distance beyond which dengue disease risk is no longer spatially correlated (Sedda et al. 2018). It corresponds to the median distance between the grid cells. For the prior belief we took \({\alpha }_{r}=0.5\), i.e., \({\rm Pr}\left(r<5\right)=0.5\). Note that the two preceding decisions imply that we know the true limits ahead of the analysis.
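How such probability statements translate into prior parameters can be sketched as follows. The sketch assumes the functional forms commonly given for PC priors on a two-dimensional Matérn field: an exponential prior on the standard deviation with rate \(-{\rm log}\left(\alpha \right)/{\sigma }_{0}\), and a prior on the range with distribution function \({\rm exp}\left(-\lambda /r\right)\) and \(\lambda =-{\rm log}\left({\alpha }_{r}\right){r}_{0}\). These forms are an assumption of this illustration and should be checked against Fuglstad et al. before being relied upon. The tail probability 0.01 is taken from the probability statement \({\rm Pr}\left({\sigma }_{\Phi }>1\right)=0.01\) above; the function names are hypothetical.

```python
import math


def pc_rate_sigma(sigma0, alpha):
    """Rate of the exponential prior on sigma with Pr(sigma > sigma0) = alpha."""
    return -math.log(alpha) / sigma0


def pc_upper_tail_sigma(sigma, rate):
    """Pr(sigma_Phi > sigma) under the exponential prior."""
    return math.exp(-rate * sigma)


def pc_rate_range(r0, alpha_r):
    """Rate of the range prior (assumed 2-D form) with Pr(r < r0) = alpha_r."""
    return -math.log(alpha_r) * r0


def pc_cdf_range(r, rate):
    """Pr(range < r) under the assumed range prior."""
    return math.exp(-rate / r)


rate_sigma = pc_rate_sigma(sigma0=1.0, alpha=0.01)    # Pr(sigma_Phi > 1) = 0.01
rate_range = pc_rate_range(r0=5.0, alpha_r=0.5)       # Pr(r < 5 km) = 0.5

check_sigma = pc_upper_tail_sigma(1.0, rate_sigma)    # recovers the stated tail probability
check_range = pc_cdf_range(5.0, rate_range)           # recovers the stated probability
prob_short_range = pc_cdf_range(1.0, rate_range)      # prior mass on ranges below 1 km
```

With \({\alpha }_{r}=0.5\), the value \({r}_{0}\) is the prior median of the range, and the prior mass placed on very short ranges is small, which is how the prior discourages overfitting.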


Because the ICAR, Leroux, AR1, RW1, and RW2 priors are specified conditionally on neighboring observations and their parameters are unique up to an additive constant, their specifications implicitly include the overall intercept. Consequently, if the model also included an additional intercept, it would not be identified due to perfect collinearity (Goicoa et al. 2018). To solve this identification problem, the explicit intercept can be excluded or sum-to-zero constraints on the random effects can be imposed (Goicoa et al. 2018). When sum-to-zero constraints are imposed, the random effects become orthogonal to the explicit overall intercept.

To achieve identifiability, we imposed the following sum-to-zero constraints on the temporal random effects of the risk factors, the spatial and temporal random effects, and the components of their spatiotemporal interaction effects:

$$\sum_{t=1}^{T}{\zeta }_{k,t}=0\quad{\rm for \,every }\,i\,{\rm and }\,g\, {\rm and} \quad {{\rm for}}\,k=1,\ldots ,K,$$
$$\sum_{i=1}^{{n}_{\mathcal{A}}}{\omega }_{i}+{\upsilon }_{i}=0\quad {\rm and }\,\sum_{i=1}^{{n}_{\mathcal{A}}}{\delta }_{it}=0 \quad {{\rm for}}\,t=1,\ldots ,T,$$
$$\sum_{t=1}^{T}{\phi }_{t}+{\varsigma }_{t}=0\quad{\rm and }\,\sum_{t=1}^{T}{\delta }_{it}=0\, \,{\rm for }\,i=1,\ldots ,{n}_{\mathcal{A}}.$$

The above-mentioned sum-to-zero constraints are imposed in INLA by setting the argument constr = TRUE in the function \(f(.)\), which defines the temporal random effects of the risk factors and the spatial, temporal, and spatiotemporal interaction effect components.
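The identification problem and its resolution can be seen in a small numerical example. The sketch below is only an illustration with hypothetical values and does not reproduce how INLA enforces the constraint: it shows that shifting a constant between the intercept and a vector of random effects leaves the linear predictor unchanged, and that centering the effects removes this ambiguity.

```python
def linear_predictor(intercept, effects):
    """Intercept plus one random effect per time point."""
    return [intercept + e for e in effects]


def center(intercept, effects):
    """Impose the sum-to-zero constraint by moving the mean of the
    effects into the intercept."""
    mean = sum(effects) / len(effects)
    return intercept + mean, [e - mean for e in effects]


beta0 = 0.3                          # hypothetical intercept
phi = [0.5, -0.2, 0.1, 0.4]          # hypothetical temporal effects

# A second parameterization: add 1.7 to the intercept, subtract it from the effects.
beta0_alt = beta0 + 1.7
phi_alt = [e - 1.7 for e in phi]

# Both parameterizations give the same linear predictor, so the data
# cannot distinguish between them.
eta = linear_predictor(beta0, phi)
eta_alt = linear_predictor(beta0_alt, phi_alt)

# After centering, both parameterizations collapse to the same values.
beta0_c, phi_c = center(beta0, phi)
beta0_alt_c, phi_alt_c = center(beta0_alt, phi_alt)
```

The centered effects sum to zero and the two intercepts coincide, which is the sense in which the constraint makes the explicit intercept identifiable.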

Appendix 3

The best specification of the FGG-GMRF model in Eq. (33) was selected using the deviance information criterion (DIC), the Watanabe–Akaike information criterion (WAIC), and the marginal predictive likelihood (MPL). The results are presented in Table 6 (Footnote 35). As a rule of thumb, the best model is the one with the smallest DIC and WAIC, and the largest MPL.

Table 6 Deviance information criterion (DIC), Watanabe–Akaike information criterion (WAIC) and marginal predictive likelihood (MPL) of a subset of models of the FGG-GMRF model in Eq. (33)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Jaya, I.G.N.M., Folmer, H. Spatiotemporal high-resolution prediction and mapping: methodology and application to dengue disease. J Geogr Syst (2022).


  • Dengue disease
  • Relative risk
  • Fusion area-cell generalized geoadditive-Gaussian Markov random field model
  • Bayesian statistics
  • “Big n” problem
  • Bottom-up approach

JEL Classification

  • C18
  • I18