Spatio-temporal joint modelling on moderate and extreme air pollution in Spain

Wang, Kai; Ling, Chengxiu; Chen, Ying; Zhang, Zhengjun

doi:10.1007/s10651-023-00575-6

Spatio-temporal joint modelling on moderate and extreme air pollution in Spain

Open access
Published: 08 September 2023

Volume 30, pages 601–624, (2023)
Cite this article

Download PDF

You have full access to this open access article

Environmental and Ecological Statistics Aims and scope Submit manuscript

Spatio-temporal joint modelling on moderate and extreme air pollution in Spain

Download PDF

Kai Wang^1,2,
Chengxiu Ling¹,
Ying Chen¹ &
…
Zhengjun Zhang^3,4

1281 Accesses
2 Citations
Explore all metrics

Abstract

Very unhealthy air quality is consistently connected with numerous diseases. Appropriate extreme analysis and accurate predictions are in rising demand for exploring potential linked causes and for providing suggestions for the environmental agency in public policy strategy. This paper aims to model the spatial and temporal pattern of both moderate and extremely poor PM$_{10}$ concentrations (of daily mean) collected from 342 representative monitors distributed throughout mainland Spain from 2017 to 2021. We first propose and compare a series of Bayesian hierarchical generalized extreme models of annual maxima PM$_{10}$ concentrations, including both the fixed effect of altitude, temperature, precipitation, vapour pressure and population density, as well as the spatio-temporal random effect with the Stochastic Partial Differential Equation (SPDE) approach and a lag-one dynamic auto-regressive component (AR(1)). Under WAIC, DIC and other criteria, the best model is selected with good predictive ability based on the first four-year data (2017–2020) for training and the last-year data (2021) for testing. We bring the structure of the best model to establish the joint Bayesian model of annual mean and annual maxima PM$_{10}$ concentrations and provide evidence that certain predictors (precipitation, vapour pressure and population density) influence comparably while the other predictors (altitude and temperature) impact reversely in the different scaled PM$_{10}$ concentrations. The findings are applied to identify the hot-spot regions with poor air quality using excursion functions specified at the grid level. It suggests that the community of Madrid and some sites in northwestern and southern Spain are likely to be exposed to severe air pollution, simultaneously exceeding the warning risk threshold.

Unveiling air pollution patterns in Yemen: a spatial–temporal functional data analysis

Article 15 February 2023

Measurement error in a multi-level analysis of air pollution and health: a simulation study

Article Open access 14 February 2019

Spatiotemporal modelling of $$\hbox {PM}_{2.5}$$ concentrations in Lombardy (Italy): a comparative study

Article Open access 01 February 2024

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Air pollution, composed of particulate matter (${\text{PM}}$) and gaseous pollutants, has a substantial negative impact on the environment, ecosystem and human health. Poor air quality is one of the five most significant health risks worldwide, alongside high blood pressure, smoking, diabetes and obesity (Cohen et al. 2017; Daellenbach et al. 2020). It becomes one of the most considerable health concerns for the residents in areas of higher population density (Dias and Tchepel 2018), centres with dense activities, and to particular user groups (Agarwal and Kaddoura 2019; Singh et al. 2021). Among all the pollutants, the particulate matter, with an aerodynamic diameter less than or equal to 10 and 2.5 μm respectively (${\text{PM}}_{10}$ and ${\text{PM}}_{2.5}$) are most consistently connected with numerous adverse health outcomes including lung infections, cardiovascular diseases, and respiratory problems (Joseph et al. 2003; Martuzzi et al. 2006; Samoli et al. 2013), while appropriate regulation directly reduces the adverse health effects, increases general well-being, and improves public health (Steinle et al. 2015).

In Europe, although the European Environment Agency (EEA) maintains a rather dense PM monitoring network to record the concentration levels across countries, huge regions of the European continent remain unmonitored. For proper assessment of population-wide exposure and appropriate formulation of pollution mitigation strategies, the responsible authorities need accurately estimate and predict the concentration levels at the unobserved locations (Chu et al. 2015).

The main challenge to forecasting PM concentrations corresponds to the complexity of PM generation and spreading dynamics. On the one hand, the PM generation is dominated by two complicated sources, inorganic aerosol coming from the agriculture, long-range transport and energy sectors (Steinle et al. 2013; Daellenbach et al. 2020), as well as organic aerosol coming from biomass and fossil fuels burning emissions, vehicles emissions and cooking (Lenschow et al. 2001; Omidvarborna et al. 2015). On the other hand, the PM spread depends on both meteorological conditions and land use dispersion, leading the observed concentration levels to fluctuate geographically and temporally.

In the statistical literature, the Bayesian spatio-temporal model that allows modelling a complex environmental phenomenon through a hierarchy of sub-models becomes one of the most promising methodologies in air quality scientific investigations (Cameletti et al. 2013; Amin et al. 2015; Taheri Shahraiyni and Sodoudi 2016; Forlani et al. 2020; Fioravanti et al. 2021; Castro-Camilo et al. 2021). In particular, this approach allows involving the explanatory variables to explain the large-scale variability, take residual dependency into account through a space-time process with a Gaussian random field (GRF), and produce high-resolution spatial forecasts to meet the rising demand for predictive concentrations maps in epidemiological studies.

According to Porcu et al. (2012), the main drawback of the Bayesian model with a GRF refers to the computational difficulty of dealing with enormous amounts of data, especially when applying complex spatial dependence measures (i.e., the Matérn covariance function). Some strategies have been proposed to alleviate the computational burden of fitting complex spatial and temporal models. Lindgren et al. (2011) proposed the stochastic partial differential equation (SPDE) approach, providing a method to represent a continuous Matérn field through a discretely indexed Gaussian Markov random field (GMRF) associated with a sparse precision matrix, which enjoys good computational property. Rue et al. (2009) also provided the integrated nested Laplace approximation (INLA) algorithm that performs direct numerical calculations on the marginal posterior distributions, avoiding the time-consuming Markov chain Monte Carlo (MCMC) simulations. Additionally, GMRF with SPDE approach can be fitted in a Bayesian hierarchical framework through the INLA approach, with implementation in the R-INLA package available at https://www.r-inla.org/, making this methodology fast and easily implemented.

Most previous spatial and temporal studies on air pollution only concentrated on moderate (i.e., daily, monthly and annual mean) PM concentrations (Cameletti et al. 2013; Beloconi et al. 2018; Fioravanti et al. 2021; Saez and Barceló 2022). However, extreme conditions are actually more concerned with environmental quality management due to their various hazardous impacts (Amin et al. 2015). Numerous epidemiological studies pointed out that short-term exposures to severe PM pollution can trigger serious acute cardiovascular and respiratory mortality (Orellano et al. 2020; Lei et al. 2019; Zhang et al. 2019; Yu et al. 2014; Brook et al. 2016) and huge economic loss in the corresponding hospitalization (Xie et al. 2021; Shah et al. 2013). In the field of extreme case spatio-temporal analysis, Sharma et al. (2012), Rodríguez et al. (2016), Amin et al. (2015), Martins et al. (2017) and Castro-Camilo et al. (2021) typically focused on a small spatial domain, making it difficult to consider the complicated orography with a variety of climatic conditions, as well as to provide general suggestions to national governments on the environmental policy formulation and health care allocation. More importantly, to our best knowledge, no studies consider the potential differences between moderate and extreme air pollution, in other words, model different scaled air pollution simultaneously to identify similarities and differences in the effects of influential factors.

In this paper, we focus on the spatial and temporal variation of both moderate and extreme air pollution (i.e., annual mean and annual maxima of daily ${\text{PM}}_{10}$ concentration levels) in mainland Spain from 2017 to 2021, after controlling for meteorological variables and socio-economic factors. The contribution is two-fold: the predictions of extreme pollution (excursion functions) and the investigation of similar/reverse effects of predictors in different scaled cases. First, we establish several Bayesian hierarchical generalized extreme models on annual maxima and select the best model based on their predictive performance, detailed in Sects. 3.1 and 4.1. Secondly, we utilize the joint Bayesian model with sharing effects in Sects. 3.2 and 4.3 to model both annual mean and maxima concentrations simultaneously. We observe the comparable influence of precipitation, vapour pressure, and population density, as well as the possible opposite effects of altitude and temperature. We also generate excursion function maps (Bolin and Lindgren 2015) based on the joint model to highlight the regional risk ranking that simultaneously exceeds the warning risk threshold in Sect. 4.4. These main findings, comprehensive knowledge on the ${\text{PM}}_{10}$ generation and spread with high-resolution spatial forecasts are expected to promote awareness of the significance of extreme air pollution research, help in the investigation of the long-term effect in epidemiological studies, and underpins air pollutants regulation and human health protection strategies for environmental agencies.

The rest of the paper is organized as follows. In Sect. 2, we demonstrate the dataset for response and main explanatory variables. In Sect. 3, we formalize spatio-temporal Bayesian generalized extreme models on moderate-extreme air pollution in the framework of Bayesian spatial analysis and extreme value theory. We present the main results, applications and potential influence in Sect. 4, and conclude this paper with extensional discussions in Sect. 5.

2 Data

2.1 PM$_{10}$ concentrations and spatial domain

In mainland Spain, the ${\text{PM}}_{10}$ concentration levels data is accessible by the EEA’s air quality database, Air Quality e-Reporting, consisting of a multi-annual time series data of air quality measurement and calculated statistics for a number of air pollutants. To work with a more robust dataset, we only retain 342 air pollution stations that have at least 60% valid observations in a year, with the geographical distribution of the stations shown in red circles embedded in the mesh constructed for the SPDE approach (Fig. 1). This five-year dataset (2017–2021) with 1470 observations is divided into the training set (2017–2020; 1215 observations) and the validation set (2021; 255 observations), which are used to evaluate and compare the model fitness and predictive ability.

Figure 2 shows the temporal and spatial variations of extreme ${\text{PM}}_{10}$ following the EEA’s air quality categories with 0–20 μg/m³ (Good), 20–40 μg/m³ (Fair), 40–50 μg/m³ (Moderate), 50–100 μg/m³ (Poor), 100–150 μg/m³ (Very poor), more than 150 μg/m³ (Extremely poor). Temporally, severe pollution seems to occur in 2017 and 2020, as indicated by the numerous monitors coloured in red (very poor) and purple (extremely poor). Spatially, compared with relatively low annual maxima recorded in the east (Valencian Community) and north (Basque Country), high ${\text{PM}}_{10}$ concentrations are most prevalent in the centre, northwest and southeast, which correspond to the autonomous communities of Madrid, Galicia, Andalusia, and the Region of Murcia, respectively. This spatio-temporal pattern inspires our further investigation of spatio-temporal modelling taking appropriate topography covariates into account, which are stated in the following section.

2.2 Explanatory variables

A number of potential predictors are available based on prior findings in the air quality literature (Cameletti et al. 2013; Fioravanti et al. 2021; Castro-Camilo et al. 2021), and we choose to include a set of five main spatial and spatio-temporal varying predictors with the complete description list reported in Table 1.

Table 1 Description for explanatory variables

Full size table

In the following, we describe the selected predictors in detail.

Meteorological variables. The meteorological variables (temperature, precipitation and vapour pressure) of the monthly mean are collected from the CRU TS (Climatic Research Unit gridded Time Series; Harris et al. 2020) dataset and aggregated to be annual mean. Accordingly, CRU TS was first published in 2000, using ADW (angular-distance weighting) to interpolate anomalies of monthly observations onto a 0.5° grid over land surfaces (excluding Antarctica) for observed and derived variables (mean, minimum and maximum temperatures, precipitation, vapour pressure, wet days and cloud cover) with no missing values in the defined domain.

Elevation. The altitude data, height over sea level, matched with locations of all air pollution monitors, is accessible in the annual aggregated air quality values dataset provided by EEA, available at https://discomap.eea.europa.eu/App/AirQualityStatistics/index.html.

Population density. The population densities are calculated in each autonomous community. The original data is collected from the statistics report (available at https://stats.oecd.org/) of the Organisation for Economic Co-operation and Development (OECD). The OECD statistics contain data and metadata for economic and education indexes of OECD countries and some selected non-member economies.

We see in Table 2 that the correlations of potential predictors to extreme and average PM$_{10}$ concentrations display a similar or different direction, which also vary year by year. This will be further investigated by our spatio-temporal generalized extreme model and joint model with sharing effects, see details in both Sects. 3.1 and 3.2. Furthermore, considering the correlation between location variables and meteorological variables (e.g., latitude and temperature), we adjust the location variables (longitude and latitude) as covariates to investigate the impact of meteorological factors.

Table 2 Correlation between explanatory variables and annual mean and annual maximum PM$_{10}$ concentrations on log scale

Full size table

3 Model formulation

In Sect. 3.1, we first provide a brief introduction to spatio-temporal Gaussian field and extreme value models, followed by four candidate models for the annual maximum of daily PM$_{10}$ concentrations in mainland Spain. Subsequently, in Sect. 3.2, we present the joint Bayesian Gumbel-Gaussian model for both extreme and moderate levels.

3.1 Spatio-temporal extreme value models with mixed effects

The generalized extreme value (GEV) distribution is widely employed for modelling extremes in environmental science (Reiss and Thomas 2007), such as temperature (Cheng et al. 2014), precipitation (Panagoulia et al. 2014), air pollution (Deng and Zhang 2018; Martins et al. 2017) and sea level (Lobeto et al. 2018). The GEV distribution has three parameters, location parameter ($\mu$, $-\infty<\mu <\infty$), scale parameter ($\sigma$, $\sigma >0$) and tail parameter ($\xi$, $-\infty<\xi <\infty$) with the cumulative distribution function

$$\begin{aligned} \text{ GEV }(x; \mu ,\sigma , \xi )=\exp \left\{ -\left[ 1+\xi \left( \frac{x-\mu }{\sigma }\right) \right] _{+}^{-\frac{1}{\xi }}\right\} . \end{aligned}$$

The case $\xi =0$ is interpreted as the limit case of $\xi \rightarrow 0$, leading to the Gumbel family with distribution function

$$\begin{aligned} \text{ Gumbel }(x; \mu , \sigma )=\exp \left\{ -\exp \left[ -\left( \frac{x-\mu }{\sigma }\right) \right] \right\} , \quad x \in \mathbb {R}. \end{aligned}$$

Considering the potential spatial and temporal dependence among the PM$_{10}$ concentrations, we use two typical approaches for measurement: Matérn covariance function (Matérn 1986; Guttorp and Gneiting 2006) with the stochastic partial differential equations (SPDE; Lindgren et al. 2011) approximation for spatial correlation and the auto-regressive dynamic model (AR; Shumway and Stoffer 2017) for temporal dependence. We establish two groups of generalised extreme value models with similar fixed effects and varying random effects to model the annual maxima ${\text{PM}}_{10}$ concentrations.

3.1.1 Model 1: Gumbel model with fixed effect and spatio-temporal random effect

Let $y_{\text {max}}({\varvec{s}},t)$ denote the logarithm transform of annual maxima ${\text{PM}}_{10}$ concentrations at location ${\varvec{s}} \in \mathcal {S}$ and year $t \in \mathcal {T}$, where $\mathcal {S}$ is the study area and $\mathcal {T}$ is the time period in focus. Under the assumption of constant scale ($\sigma$) and tail ($\xi$) parameters, we use a linear combination of fixed effects with explanatory variables and spatio-temporal varying random effect to model the location parameter ($\mu ({\varvec{s}},t)$) in the Gumbel model below. Suppose that

$$\begin{aligned} \begin{aligned} \left[ y_{\text {max}}({\varvec{s}},t) \mid \mu ({\varvec{s}},t), \sigma \right]&\sim {\text {Gumbel}}\left( \mu ({\varvec{s}},t), \sigma \right) \\ \text{ with } \quad \mu ({\varvec{s}},t)&= {\textbf{x}}({\varvec{s}},t)^\top \varvec{\beta }+u({\varvec{s}},t) \end{aligned} \end{aligned}$$

(1)

and

$$\begin{aligned} \begin{aligned} u({\varvec{s}},t)&= a u({\varvec{s}},t-1 ) +w\left( {\varvec{s}}, t\right) , \\ w\left( {\varvec{s}}, t\right)&\sim \mathcal {G} \mathcal {P}_{2 \textrm{D}-{\text {SPDE}}}\left( \rho _M, \sigma _M, \nu _M\right) . \end{aligned} \end{aligned}$$

(2)

Here, the vector ${\textbf{x}}(s,t)$ contains an intercept and the explanatory variables of location variables, meteorological variables and human-effect variables listed in Table 1, and the vector $\varvec{\beta }$ corresponds to the regression coefficients associated with the fixed effects. The term $u({\varvec{s}},t)$ represents a spatio-temporal varying random effect that incorporates spatio-temporal interaction (Cameletti et al. 2013). It temporally changes according to AR(1) dynamics with autocorrelation parameter a and spatial correlated and serially independent innovations $w\left( {\varvec{s}}, t\right)$.

Given two locations $s_i$ and $s_j$ separated by $h=d(s_i,s_j)$ (normally Euclidean) units, the Gaussian process with mean 0 and Matérn covariance function is in the form of

$$\begin{aligned} {\text {Cov}}(w(s_i,t),w(s_j,t'))= {\left\{ \begin{array}{ll}0, &{} t \ne t', \\ {\frac{\sigma ^2}{2^{\nu-1} \ \ \Gamma (\nu )}\left( \sqrt{8 \nu } \frac{h}{\rho }\right) ^{\nu } K_{\nu }\left( \sqrt{8 \nu } \frac{h}{\rho }\right) }, &{} t=t',\end{array}\right. } \end{aligned}$$

where $\Gamma$ is the gamma function, $K_\nu$ is the modified Bessel function of the second kind, $\rho >0$ is the range parameter, $\nu >0$ is the smoothness parameter, and $\sigma ^2>0$ is the marginal variance.

3.1.2 Model 2: Gumbel model with fixed effect and separated spatial/temporal/spatio-temporal random effects

Our second model is a modification of Model 1 specified in Eq. (1) with spatial and temporal random effects and separable interaction effects specified in Eq. (2), namely, we keep the Gumbel distribution assumption on $y_{\text {max}}({\varvec{s}},t)$ as below.

$$\begin{aligned} \begin{aligned} \left[ y_{\text {max}}({\varvec{s}},t) \mid \mu ({\varvec{s}},t), \sigma \right]&\sim {\text {Gumbel}}\left( \mu ({\varvec{s}},t), \sigma \right) ,\\ \mu ({\varvec{s}},t)&= {\textbf{x}}({\varvec{s}},t)^\top \varvec{\beta }+f(t)+w({\varvec{s}}) + {u({\varvec{s}},t)},\\ \text{ with }\ \ f(t)&\sim \mathcal {G} \mathcal {P}_{\textrm{AR} (1)}\left( a, \tau _{AR}\right) ,\\ w\left( {\varvec{s}}\right)&\sim \mathcal {G} \mathcal {P}_{2 \textrm{D}-{\text {SPDE}}}\left( \rho _M, \sigma _M, \nu _M\right) . \end{aligned} \end{aligned}$$

(3)

Note that $u({\varvec{s}},t)$ is defined the same as in Eq. (2), indicating the spatio-temporal interaction term with the Kronecker product, see e.g., Cameletti et al. (2011, 2013) and Fioravanti et al. (2021). The f(t) denotes the non-linear random effect in the temporal structure of AR(1), the $w({\varvec{s}}, t)$ is the spatially dependent only random effect with SPDE structure. Specifically, the implementation of the AR(1) model in INLA generally assumes the Gaussian white noise with mean 0 and precision $\tau _{AR}$. For f(t) defined over the naturally binned covariate (Year), let ${\varvec{t}}=\left( t_1, \ldots , t_5\right) ^\top$ denotes the time from the first year ($t_1$) to the last year ($t_5$),

$$\begin{aligned} \begin{aligned} f(t_1)&\sim \mathcal {N}\left( 0,\left( \tau _{AR}\left( 1-a^2\right) \right) ^{-1}\right) , \\ f(t_i)&=a f(t_{i-1})+\epsilon _i, \quad \epsilon _i \sim \mathcal {N}\left( 0, \tau _{AR}^{-1}\right) , \quad i=2, \ldots , 5, \end{aligned} \end{aligned}$$

where $-1<a<1$ is a numeric constant, the so-called autocorrelation, by which we multiply the lagged variable $f(t_{i-1})$, and $\epsilon _i$ denotes the unpredictable error in the form of Gaussian white noise.

3.1.3 Model 3: GEV model with fixed effect and spatio-temporal random effect

The generalised extreme value (GEV) models are basically following the same structure as Gumbel models except the generalized extreme value distribution for the response. To be specific, we suppose that

$$\begin{aligned} \left[ y_{\text {max}}({\varvec{s}},t) \mid \mu ({\varvec{s}},t)\right] \sim {\text {GEV}}\left( \mu ({\varvec{s}},t), \sigma , \xi \right) , \end{aligned}$$

(4)

where the location parameter $\mu ({\varvec{s}},t)$ is of the same form of mixed effects as in Eq. (1) and random effects in Eq. (2).

3.1.4 Model 4: GEV model with fixed effect and separated spatial/temporal random/spatio-temporal random effect

Similar consideration of Model 2 modified from Model 1, we consider the following model in parallel with Model 3, i.e., we take spatial and temporal random effects with an interaction effect into the GEV model. Suppose that

$$\begin{aligned} \begin{aligned} \left[ y_{\text {max}}({\varvec{s}},t) \mid \mu ({\varvec{s}},t), \sigma , \xi \right]&\sim {\text {GEV}}\left( \mu ({\varvec{s}},t), \sigma , \xi \right) \end{aligned} \end{aligned}$$

(5)

with the location parameter $\mu ({\varvec{s}},t)$ is of spatio-temporal structure of the form in Eq. (3).

It is worth pointing out that the well-known limit type theorem in Coles (2001) motivates our GEV distribution assumption of annual maxima of PM${_{10}}$ concentrations, and its reduced case (i.e., Gumbel corresponds to GEV with $\xi =0$). The latter one becomes more parsimonious, and its exponential decay tail might be appropriate to fit extreme air quality (Deng and Zhang 2018). The Gumbel model is generally needed when the uncertainty of $\xi$ being non-zero arises in the GEV model with an estimate of $\xi$ close to zero.

3.2 Bayesian joint model with sharing effects

In order to identify the potential varied effect levels of main explanatory variables, with inspiration from applications of the joint model with sharing effects on wildfire (Koh et al. 2023), we model both moderate and extreme PM$_{10}$ pollution simultaneously in two respective sub-models linked by the sharing effects and the sharing coefficients (scaling factors).

Let $y_{\text {mean}}({\varvec{s}},t)$ denote the logarithm transform of annual mean ${\text{PM}}_{10}$ at location ${\varvec{s}} \in \mathcal {S}$ and year $t \in \mathcal {T}$. We perform the Gaussian sub-model and Gumbel sub-model on annual mean and annual maxima simultaneously, with the structure of the best extreme value model as Model 1 according to the model fitness and prediction analysed in Sect. 4.1.

$$\begin{aligned} \begin{aligned} {\left[ y_{\text {mean}}({\varvec{s}},t) \mid \mu _{\text {mean}}({\varvec{s}},t), \sigma _{\text {mean}}^2 \right] }&\sim {\text {Gaussian}}\left( \mu _{\text {mean}}({\varvec{s}},t), \sigma _{\text {mean}}^2 \right) \\ \text{ with }\qquad \mu _{\text {mean}}({\varvec{s}},t)&= {{\textbf{x}}^{S}({\varvec{s}},t)^\top \varvec{\beta }^{S}}+{{\textbf{x}}^{N\!S}({\varvec{s}},t)^\top \varvec{\beta }_{\text {mean}}^{N\!S}} +{u^{S}({\varvec{s}},t)}; \\ {\left[ y_{\text {max}}({\varvec{s}},t) \mid \mu _{\text {max}}({\varvec{s}},t), \sigma _{\text {max}}\right] }&\sim {\text {Gumbel}}\left( \mu _{\text {max}}({\varvec{s}},t), \sigma _{\text {max}}\right) \\ \text{ with } \qquad \mu _{\text {max}}({\varvec{s}},t)&=\ {{\textbf{x}}^{S}({\varvec{s}},t)^\top \varvec{\beta }^{S}}{\varvec{\beta }_{1}^{\text {Gaussian-Gumbel}}}\\&\quad +{{\textbf{x}}^{N\!S}({\varvec{s}},t)^\top \varvec{\beta }_{\text {max}}^{N\!S}} +{{\beta }_{2}^{\text {Gaussian-Gumbel}}}{u^{S}({\varvec{s}},t)}, \qquad \end{aligned} \end{aligned}$$

(6)

where the terms $\varvec{\beta }^{S}$ and $u^{S}({\varvec{s}},t)$ with superscript S denote the sharing effects, and ${\textbf{x}}^{S}({\varvec{s}},t)$ are corresponding variables. The term $\varvec{\beta }^{N\!S}$ with superscript NS denotes the non-sharing effects with corresponding covariates ${\textbf{x}}^{N\!S}({\varvec{s}},t)$ which includes the intercept and the other variables. For the selection of sharing and non-sharing variables, to avoid the potential issue of uncertainty of significant sharing effects (${\beta }^{S}=0$), we treat two significant predictors (altitude and precipitation) as sharing terms to investigate the potential similar and reverse effects by the ratios ($\varvec{\beta }_1^{\text {Gaussian-Gumbel}}$) while considering all other variables, including three non-significant ones (temperature, vapour pressure, and population density), and location variables as non-sharing terms.

Additionally, the spatio-temporal random effect is also treated as sharing effects with respect to the simplicity of computation. The sharing coefficients $\varvec{\beta }_1^{\text {Gaussian-Gumbel}}$ and ${\beta }_2^{\text {Gaussian-Gumbel}}$ scale the common components of predictor vector ${\varvec{x}}^S({\varvec{s}}, t)$ and random effect $u^S({\varvec{s}}, t)$, and control how much information is shared from the average predictor towards the annual max predictor, and determine the strength of interaction between the two processes. Precisely, it allows for capturing the positive or negative correlations.

For further explanation, the sharing effects (including fixed effects and random effects) are the same in two sub-models ($\varvec{\beta }^{S}$, $u^{S}({\varvec{s}},t)$). Meanwhile, they are linked by the sharing coefficients (scaling factors) $\varvec{\beta }_{1}^{\text {Gaussian-Gumbel}}$ and ${\beta }_{2}^{\text {Gaussian-Gumbel}}$. On the one hand, these sharing coefficients relax the strictly equal relation between the sharing effects in two sub-models. On the other hand, more importantly, their posterior distributions can also measure the similarities and differences in the effects of predictors in the sub-models. For instance, a significantly negative ${\beta }_{1}^{\text {Gaussian-Gumbel}}$ implies that the corresponding predictors oppositely influence the moderate and extreme air pollution cases.

3.3 Priors definition

In a Bayesian context, in order to finalize the model, we need to define prior distributions for the remaining parameters in Gumbel ($\sigma _{\text {Gumbel}}$) and GEV distributions ($\sigma _{\text {GEV}}$, $\xi _{\text {GEV}}$), the regression coefficients ($\varvec{\beta }$), the sharing coefficients in the joint model ($\varvec{\beta }^{\text {Gaussian-Gumbel}}$), the parameters in the Matérn covariance function ($\sigma _{M}, \rho _{M}$, $\nu _{M}$) and the parameters in AR(1) dynamic model (a, $\tau _{AR}$).

We use vague Gaussian priors for the tail parameter ($\xi _{\text {GEV}}$) in GEV distribution and the elements of coefficients ($\varvec{\beta }$, $\varvec{\beta }^{\text {Gaussian-Gumbel}}$). The smooth parameter $\nu _{M}$ is treated here as a fixed value with $\nu _{M} = 1$, as in most spatial analyses. The parameters $\sigma _{M}$ and $\rho _{M}$ in the Matérn function and autocorrelation parameter a in AR(1) model are defined by penalized complexity (PC) priors (Simpson et al. 2017) with knowledge from Moraga (2019) and Fuglstad et al. (2019). PC prior for the range parameter ($\rho _{M}$) is defined with ${\text {Prob}}\left( \rho _{M}<10^4\right) =0.01$, which means the probability that the range is less than $10 \textrm{km}$ is very small, and the PC prior for variance parameter as ${\text {Prob}} \left( \sigma _{M}>3\right) =0.01$, indicating the probability for variance greater than 3 is low. Similarly, we apply the auto-correlation (a) PC prior following the recommendation of ${\text {Prob}}\left( a>0\right) =0.9$.

Note that INLA often uses the precision parameter ($\tau$) to replace the scale parameter ($\sigma$) by $\sigma = {1}/{\sqrt{\tau }}$. The PC priors for all precision parameters are given by ${\text {Prob}}\left( {1}/{\sqrt{\tau }}>3\right) =0.01$ for Gumbel and GEV likelihood and ${\text {Prob}}\left( {1}/{\sqrt{\tau _{AR}}}>5\right) =0.01$ for the AR model.

3.4 Model evaluation, diagnosis and cross-validation

Traditional Bayesian model performance is evaluated by two popular criteria, the deviance information criterion (DIC) and the Watanabe-Akaike information criterion (WAIC). The deviance information criterion (DIC) proposed by Spiegelhalter et al. (2002), is a popular criterion for model choice similar to the Akaike information criterion (AIC).

$$\begin{aligned} \text {DIC}=D( \widehat{\varvec{\theta }})+2 p_D, \end{aligned}$$

where $D( \widehat{\varvec{\theta }})$ is the deviance function with Bayes estimate $\widehat{\varvec{\theta }}$, and $p_D$ is the effective number of parameters. The Watanabe-Akaike information criterion, also known as the widely applicable Bayesian information criterion, is similar to the DIC, but the effective number of parameters is computed in a different way (Watanabe 2013).

However, DIC may under-penalize complex models with many random effects. Alternatively, for prediction performance, INLA suggests applying the leave-one-out cross-validation criteria, conditional predictive ordinates (CPO; Pettit 1990) with its summative version, the Logarithmic Score (LS; Gneiting and Raftery 2007) and predictive integral transform (PIT; Marshall and Spiegelhalter 2003), which facilitates the computation of the cross-validated log-score for model choice, and enables the calibration assessment of out-of-sample predictions, respectively.

$$\begin{aligned} \begin{aligned} {\text {LS}}&= - \sum _{i=1}^{n} \ln {\text {CPO}_i} \quad \text {with\ } \textrm{CPO}_i =\pi \left( y_i^{\text{ obs } } \mid y_{-i}\right) ,\\ \text {PIT}_i&={\text {Prob}}\left( Y_i \le y_i^{\text{ obs } } \mid y_{-i}\right) , \end{aligned} \end{aligned}$$

where $y_i^{\text{ obs } }$ denotes the i-th observation and $y_{-i}$ denotes the observations y with the i-th component omitted. A model with a small value of LS and a standard uniform distributed PIT is preferable (Gómez Rubio 2020).

Note that numerical problems may occur when CPO and PIT values are computed with the complicated Gaussian process with INLA algorithm (Held et al. 2010). Hence, we take a few other criteria introduced in Bayesian literature into account: The coverage probability of 95% CI is computed as the proportion of the validation observations that the observed value lies between the 2.5% quantile to the 97.5% quantile of the predicted value (posterior distribution). The correlation coefficient denotes the correlation between observed values and predicted values in the validation set. The root mean square error (RMSE) is defined as the square root of the second sample moment of the differences between predicted values and observed values.

As we separate the original five-year dataset (2017–2021) into training (2017–2020) and validation (2021) sets, we decide to apply DIC, WAIC, LS, PIT and RMSE to compare the model fitness on the training set and use the coverage probability, correlation coefficient, and RMSE for predictive ability evaluation.

4 Results

In this section, we first compare the performance of the Bayesian generalised extreme models and select the best one with the outstanding predictive ability (Sect. 4.1), then summarize the corresponding posterior estimates for both fixed and random effects (Sect. 4.2). The analysis and interpretation of the joint model are included in Sect. 4.3 and followed by the excursion functions generation with severe air pollution risk ranking (Sect. 4.4). Note that since the explanatory variables are measured in different scales, to avoid numerical problems, each predictor is standardized to have mean zero and unit standard deviation.

4.1 Model comparison

In order to examine the fitting and prediction ability of the extreme value models, we apply some evaluation criteria (DIC, WAIC, LS, PIT and RMSE) on the training set (Table 3), and use other criteria (coverage probability, correlation and RMSE) on the validation set (Table 4). We see from Table 3 with model performance on the training set, Models 2 and 4 outperform in DIC (36.27 and 73.43, respectively), WAIC (193.67 and 223.06, respectively) and RMSE (0.24 and 0.18, respectively). However, Model 1 is preferable in the leave-one-out cross-validation criteria with LS (8019.07) and PIT plots (Fig. 3). Table 4 shows that all four models are comparable in the coverage probability, correlation and RMSE.

Figure 4 shows the simultaneous visualization of model performance on both training and validation sets, where the scatters’ distribution along the line with intercept 0 and slope 1 indicates the estimated (predicted) values are best suited to the observations. All four models exhibit a strong fit for the trend.

To summarise, no model excels in all criteria. Considering the similar performance of all four models on the validation set, we choose the parsimonious model, Model 1 (the Gumbel model with fixed effects and spatio-temporal random effects defined in Eq. (1)), as the best model and bring its structure into further analysis.

Table 3 Performance evaluation criteria (DIC, WAIC, LS and RMSE) of Models 1–4 specified in Eqs. (1), (3)–(5) on the training set (all criteria prefer lower values)

Full size table

Table 4 Performance evaluation criteria (coverage probability, correlation and RMSE) of Models 1–4 specified in Eqs. (1), (3)–(5) on the validation set

Full size table

4.2 Summary of Model 1

The summary statistics for fixed effects are shown in Table 5. We see that altitude and population density are significantly negatively associated with annual maxima ${\text{PM}}_{10}$ concentrations, whereas the effects of other covariates are not statistically significant. High temperature, low precipitation and low vapour pressure are likely to associate with extreme ${\text{PM}}_{10}$ concentrations, consistent with Kalisa et al. (2018) and Li et al. (2014).

The negative association between population density and annual maxima ${\text{PM}}_{10}$ concentrations is different from the positive correlation found in Table 2 and the positive association demonstrated by Borck and Schrauth (2021). Together with the facts of relatively low correlation to population density (Table 2) and the phenomena that severe ${\text{PM}}_{10}$ pollution in 2017, 2020 and 2021 (Fig. 2) is usually associated with high temperature and low precipitation, our results probably imply that the generation and spread of extreme ${\text{PM}}_{10}$ concentrations depend heavily on the climate conditions. This evidence coincides with the findings of the overwhelming role of adverse meteorology in severe air pollution events (Wang et al. 2020; Morawska et al. 2021).

The spatial and temporal dependence is accessible by spatial heat plot (Fig. 5) and posterior estimates of autocorrelation coefficient (a in Table 6). Spatially, similar values of mean random effects occur in groups, especially a large cluster of high values happens in the centre of the mainland. Temporally, the estimated mean correlation coefficient (a) is 0.80 with 95% credible interval (0.74, 0.85), providing evidence of relatively strong dependence between two consecutive years.

Table 5 Posterior estimates (mean, standard deviation and quantiles) of the coefficients of the covariates involved in Model 1 specified in Eq. (1)

Full size table

Table 6 Posterior estimates of mean, standard deviation and quantiles of the parameters in Model 1 specified in Eq. (1)

Full size table

4.3 Results of the joint model with sharing effects

Combining the best extreme value model (Model 1) with the Gaussian model, we establish the joint model in Sect. 3.2 with sharing effects to estimate both maxima and mean concentration levels simultaneously. We keep the two significant effects of predictors (altitude and precipitation) as sharing effects, and all other effects (temperature, vapour pressure, population density and location) are treated as non-sharing ones. Table 7 summarizes the posterior estimates of these effects from the joint model. Precipitation shows significant negative associations with both annual mean and maxima concentrations, which is consistent with the notion that ${\text{PM}}_{10}$ concentrations are generally low in wet regions (Li et al. 2014). In contrast, we find that altitude is negatively connected with the mean but positively connected with the maxima.

This is also supported by the detailed sharing coefficients analysed below. We see from Fig. 6 that the posterior distribution of the sharing coefficients of precipitation (see detailed definition of $\varvec{\beta }_{1}^{\text {Gaussian-Gumbel}}$ in Sect. 3.2) almost lies between 0 and 1, showing that the influences of precipitation are similar on extreme and moderate air pollution. The coefficient for altitude is negative, suggesting certain evidence that altitude can impact reversely in different scaled pollution.

Furthermore, the effect of temperature is not significant with respect to 95% credible interval for mean but becomes significant for maxima. This result implies that high temperatures weakly impact the general air quality, while playing an essential role in the dispersion of extremely poor pollution. Nevertheless, considering the possibility of unmeasured confounders potentially distorting the coefficients, the strength of the aforementioned inference might be compromised.

In terms of model performance evaluation, the joint model demonstrated promising results in the validation set. In particular, the Gumbel sub-model (max sub-model) exhibited improvements in the coverage probability ($88.63\%$) and correlation ($46.71\%$), compared with the univariate Gumbel model (Model 1), as shown in Tables 4 and 8. The insights of the joint model performance are depicted in Fig. 7, where both the performance of the Gumbel model (maxima model) and the Gaussian model (mean model) are satisfied. Because of this, despite the less parsimonious nature of the joint model, which may introduce more volatility in the prediction of the Gumbel sub-model, the overall satisfying evaluation results bolster the credibility of the joint model’s findings. With this confidence, we proceed to utilize the joint model for subsequent predictions using excursion functions.

Table 7 Posterior estimates (mean and 95% credible interval) of the sharing effects and non-sharing effects specified in Eq. (6)

Full size table

Table 8 Sub-models performance evaluation (coverage probability, correlation and RMSE) in the validation set

Full size table

4.4 Annual maxima prediction and excursion functions

Hot-spot region identifications and predictive concentration level plots are important tools in air pollution studies, because they intuitively imply the air quality in specific regions and are easily interpreted to environmental agencies and the general public. Bolin and Lindgren (2015) proposed the positive ($\textrm{E}_{u, \alpha }^{+}(X)$) and negative excursion sets ($\textrm{E}_{u, \alpha }^{-}(X)$) that determine the largest set that simultaneously exceeds or falls below the risk level (u) with a small error probability ($\alpha$), employing a parametric family and sequential importance sampling method for estimating joint probabilities. To visualize the excursion sets simultaneously, we apply the positive and negative excursion functions, $F_u^{+}(s)=1-\inf \left\{ \alpha \mid s \in \textrm{E}_{u, \alpha }^{+}\right\}$ and $F_u^{-}(s)=1-\inf \left\{ \alpha \mid s \in \textrm{E}_{u, \alpha }^{-}\right\}$. To explain, the term $\inf \left\{ \alpha \mid s_0 \in \textrm{E}_{u, \alpha }^{+}\right\}$ denotes the "smallest" $\alpha$ required that the location ($s_0$) can be included into the positive excursion set $\textrm{E}_{u, \alpha }^{+}$ at the first time, while the higher $1-\inf \left\{ \alpha \mid s_0 \in \textrm{E}_{u, \alpha }^{+}\right\}$ reported by positive excursion function generally indicates higher probabilities for the location ($s_0$) to exceed the risk threshold u simultaneously.

In our case, to discover the areas that are most likely and most unlikely to suffer from severe ${\text{PM}}_{10}$ pollution simultaneously, we utilize predicted ${\text{PM}}_{10}$ concentration levels from the max sub-model of the joint model to generate both positive and negative excursion functions with the thresholds 50 μg/m³ (poor) and 100 μg/m³ (very poor) at 548 locations distributed throughout the mainland of Spain, including a set of 0.5° $\times$ 0.5° (50 km $\times$ 50 km) grids (206 locations) and locations of all ${\text{PM}}_{10}$ stations (342 monitors).

In the case of exceeding 50 μg/m³ (Fig. 8), the probability for simultaneously exceeding is high in the northwest, middle and south, meaning that poor ${\text{PM}}_{10}$ pollution probably hazard these locations during the year. In contrast, the negative excursion function with threshold 50 μg/m³ indicates that the regions in the north and east enjoy good or moderate air quality throughout the year. In the case of 100 μg/m³ (Fig. 9), the probabilities for very poor ${\text{PM}}_{10}$ pollution occurrence are low (in white) in most regions, but still likely to appear in certain areas in the community of Madrid, meanwhile, most areas in the north, northeast and east are expected to be below this threshold.

5 Discussions and conclusions

In this paper, we discovered the spatio-temporal patterns of ${\text{PM}}_{10}$ concentration levels in mainland Spain under the framework of INLA-EVT methodology. The high-quality data-set of annual maxima and annual mean of daily PM$_{10}$ concentration levels was jointly studied successfully, due to the following three reasons. Firstly, Spain is a mountainous country with a large central plateau, and this complicated orography is usually associated with a variety of climatic conditions. Secondly, the air pollution monitors distributed throughout mainland Spain provide high-quality time series data, which supports both local and national level accurate prediction and credible inference. Finally, Spain generally enjoys good air quality (low annual mean), but extreme air pollution does appear on certain days during the year (high annual maxima). This circumstance allows us to investigate the potential difference in the generation and spread of the moderate and extreme cases.

We establish a series of Bayesian spatio-temporal models on extreme ${\text{PM}}_{10}$ concentrations in Spain, with specifically meteorological predictors, human effects, and spatio-temporal random effects following SPDE and AR(1) dynamic to account for the dependence not explained by covariates. We also provide evidence of similar and reverse connections between influential predictors and different scaled ${\text{PM}}_{10}$ concentrations by the Bayesian joint model with sharing effects, as well as generate the annual maxima excursion functions maps specified at the grid level to highlight the regional risk ranking.

Although most statistical studies focus on moderate cases with long-term exposure, in the epidemiological field, short-term exposure to severe particulate matter is considered as an essential public health issue with major acute cardiovascular problems and health economic consequences. For example, Brook et al. (2016) emphasized that people living in highly polluted regions probably increase heart failure hospitalisations and cardiovascular mortality several folds. Shah et al. (2013) also pointed out that even modest improvements in air quality are projected to have major population health benefits and substantial health-care cost savings, preventing thousands of heart failure hospitalisations and saving millions of US dollars a year.

Combined with these health and economic hazards, the main findings in this paper are expected to provide in-depth knowledge of extreme air pollution spreading, promote awareness of extreme value studies, and provide suggestions to the national governments regarding the legislation of extremely poor air quality regulation and human health protection. In particular, the joint model incorporating sharing effects emphasizes the potential reverse or similar impacts of altitude and precipitation on both moderate and extreme cases of ${\text{PM}}_{10}$ pollution. Moreover, the excursion functions maps indicate that the central region in Spain is more likely to experience severe ${\text{PM}}_{10}$ pollution, which can be applied in research of long-term effects and health outcomes in epidemiological studies, such as acute cardiovascular events (Mustafić et al. 2012; Shah et al. 2013) and various types of strokes (Yu et al. 2014).

In conclusion, our study provides valuable insights into the generation and spreading of extreme PM$_{10}$ pollution through innovative methods incorporating sharing effects in the joint model. This approach holds promise for future environmental and epidemiological studies in exploring various air pollutants and their relationship with meteorological variables and anthropogenic factors.

References

Agarwal A, Kaddoura I (2019) On-road air pollution exposure to cyclists in an agent-based simulation framework. Period Polytech Transp Eng 48(2):117–125
Google Scholar
Amin NA, Adam M, Aris A (2015) Bayesian extreme for modeling high ${\rm PM}_{10}$ concentration in Johor. Procedia Environ Sci 30:309–314
CAS Google Scholar
Beloconi A, Chrysoulakis N, Lyapustin A, Utzinger J, Vounatsou P (2018) Bayesian geostatistical modelling of ${\rm PM}_{10}$ and ${\rm PM}_{2.5}$ surface level concentrations in Europe using high-resolution satellite-derived products. Environ Int 121:57–70
CAS PubMed PubMed Central Google Scholar
Bolin D, Lindgren F (2015) Excursion and contour uncertainty regions for latent Gaussian models. J R Stat Soc Ser B (Stat Methodol) 77(1):85–106
Google Scholar
Borck R, Schrauth P (2021) Population density and urban air quality. Reg Sci Urban Econ 86:103596
Google Scholar
Brook R, Sun Z, Brook J, Zhao X, Ruan Y, Yan J, Mukherjee B, Rao X, Duan F, Sun L, Liang R, Lian H, Zhang S, Fang Q, Gu D, Sun Q, Fan Z, Rajagopalan S (2016) Extreme air pollution conditions adversely affect blood pressure and insulin resistance: the air pollution and cardiometabolic disease study. Hypertension 67(1):77–85
CAS PubMed Google Scholar
Cameletti M, Ignaccolo R, Bande S (2011) Comparing spatio-temporal models for particulate matter in Piemonte. Environmetrics 22(8):985–996
Google Scholar
Cameletti M, Lindgren F, Simpson D, Rue H (2013) Spatio-temporal modeling of particulate matter concentration through the SPDE approach. AStA Adv Stat Anal 97(2):109–131
Google Scholar
Castro-Camilo D, Huser R, Rue H (2021) Practical strategies for generalized extreme value-based regression models for extremes. Environmetrics 33(6):e2742
Google Scholar
Cheng L, AghaKouchak A, Gilleland E, Katz R (2014) Non-stationary extreme value analysis in a changing climate. Clim Change 127(2):353–369
Google Scholar
Chu H, Huang B, Lin C (2015) Modeling the spatio-temporal heterogeneity in the PM10-PM2.5 relationship. Atmos Environ 102:176–182
CAS Google Scholar
Cohen A, Brauer M, Burnett R, Anderson H, Frostad J, Estep K, Balakrishnan K, Brunekreef B, Dandona L, Dandona R, Feigin V, Freedman G, Hubbell B, Jobling A, Kan H, Knibbs L, Liu Y, Martin R, Morawska L, Pope C, Shin H, Straif K, Shaddick G, Thomas M, van Dingenen R, van Donkelaar A, Vos T, Murray CJ, Forouzanfar M (2017) Estimates and 25-year trends of the global burden of disease attributable to ambient air pollution: an analysis of data from the global burden of diseases study 2015. Lancet 389(10082):1907–1918
PubMed PubMed Central Google Scholar
Coles S (2001) An introduction to statistical modeling of extreme values. Springer, Berlin
Google Scholar
Daellenbach K, Uzu G, Jiang J, Cassagnes L-E, Leni Z, Vlachou A, Stefenelli G, Canonaco F, Weber S, Segers A, Kuenen J, Schaap M, Favez O, Albinet A, Aksoyoglu S, Dommen J, Baltensperger U, Geiser M, Haddad I, Prevot A (2020) Sources of particulate-matter air pollution and its oxidative potential in Europe. Nature 587:414–419
CAS PubMed Google Scholar
Deng L, Zhang Z (2018) Assessing the features of extreme smog in China and the differentiated treatment strategy. Proc R Soc A 474(2009):20170511
Google Scholar
Dias D, Tchepel O (2018) Spatial and temporal dynamics in air pollution exposure assessment. Int J Environ Res Public Health 15(3):558
PubMed PubMed Central Google Scholar
Fioravanti G, Martino S, Cameletti M, Cattani G (2021) Spatio-temporal modelling of ${\rm PM}_{10}$ daily concentrations in Italy using the SPDE approach. Atmos Environ 248(1):118192
CAS Google Scholar
Forlani C, Bhatt S, Cameletti M, Krainski E, Blangiardo M (2020) A joint Bayesian space-time model to integrate spatially misaligned air pollution data in R-INLA. Environmetrics 31(8):e2644
Google Scholar
Fuglstad G-A, Simpson D, Lindgren F, Rue H (2019) Constructing priors that penalize the complexity of Gaussian random fields. J Am Stat Assoc 114(525):445–452
CAS Google Scholar
Gneiting T, Raftery A (2007) Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 102(477):359–378
CAS Google Scholar
Gómez Rubio V (2020) Bayesian inference with INLA. Chapman & Hall, Boca Raton
Google Scholar
Guttorp P, Gneiting T (2006) Studies in the history of probability and statistics XLIX On the Matérn correlation family. Biometrika 93(4):989–995
Google Scholar
Harris I, Osborn T, Jones P, Lister D (2020) Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Sci Data 7(1):109
PubMed PubMed Central Google Scholar
Held L, Schrödle B, Rue H (2010) Posterior and cross-validatory predictive checks: a comparison of MCMC and INLA. Physica-Verlag HD, Heidelberg, pp 91–110
Google Scholar
Joseph A, Sawant A, Srivastava A (2003) ${\rm PM}_{10}$ and its impacts on health—a case study in Mumbai. Int J Environ Health Res 13(2):207–214
PubMed Google Scholar
Kalisa E, Fadlallah S, Amani M, Nahayo L, Habiyaremye G (2018) Temperature and air pollution relationship during heatwaves in Birmingham, UK. Sustain Cities Soc 43:111–120
Google Scholar
Koh J, Pimont F, Dupuy J, Opitz T (2023) Spatiotemporal wildfire modeling through point processes with moderate and extreme marks. Ann Appl Stat 17(1):560–582
Google Scholar
Lei R, Zhu F, Cheng H, Liu J, Shen C, Zhang C, Xu Y, Xiao C, Li X, Zhang J, Ding R, Cao J (2019) Short-term effect of ${\rm PM}_{2.5}$ / ${\rm O}_{3}$ on non-accidental and respiratory deaths in highly polluted area of China. Atmos Pollut Res 10(5):1412–1419
CAS Google Scholar
Lenschow P, Abraham H-J, Kutzner K, Lutz M, Preuß J-D, Reichenbächer W (2001) Some ideas about the sources of ${\rm PM}_{10}$. Atmos Environ 35:S23–S33
CAS Google Scholar
Li L, Qian J, Ou C-Q, Zhou Y-X, Guo C, Guo Y (2014) Spatial and temporal analysis of air pollution index and its timescale-dependent relationship with meteorological factors in Guangzhou, China, 2001–2011. Environ Pollut 190:75–81
CAS PubMed Google Scholar
Lindgren F, Rue H, Lindström J (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J R Stat Soc Ser B (Stat Methodol) 73(4):423–498
Google Scholar
Lobeto H, Menendez M, Losada I (2018) Toward a methodology for estimating coastal extreme sea levels from satellite altimetry. J Geophys Res 123(11):8284–8298
Google Scholar
Marshall E, Spiegelhalter D (2003) Approximate cross-validatory predictive checks in disease mapping models. Stat Med 22(10):1649–1660
CAS PubMed Google Scholar
Martins L, Wikuats CF, Capucim M, de Almeida D, da Costa S, Albuquerque T, Barreto Carvalho V, de Freitas E, de Fátima Andrade M, Martins J (2017) Extreme value analysis of air pollution data and their comparison between two large urban regions of South America. Weather Clim Extrem 18:44–54
Google Scholar
Martuzzi M, Mitis F, Iavarone I, Serinelli M (2006) Health impact of ${\rm PM}_{10}$ and ozone in 13 Italian cities. World Health Organization, Regional Office for Europe, Denmark
Google Scholar
Matérn B (1986) Spatial variation. Springer, New York
Google Scholar
Moraga P (2019) Geospatial health data: modeling and visualization with R-INLA and shiny. Chapman & Hall, Boca Raton
Google Scholar
Morawska L, Zhu T, Liu N, Amouei Torkmahalleh M, de Fatima Andrade M, Barratt B, Broomandi P, Buonanno G, Ceron Carlos Belalcazar L, Chen J, Cheng Y, Evans G, Gavidia M, Guo H, Hanigan I, Hu M, Jeong C, Kelly F, Gallardo L, Kumar P, Lyu X, Mullins B, Nordstrøm C, Pereira G, Querol X, Yezid Rojas Roa N, Russell A, Thompson H, Wang H, Wang L, Wang T, Wierzbicka A, Xue T, Ye C (2021) The state of science on severe air pollution episodes: quantitative and qualitative analysis. Environ Int 156:106732
CAS PubMed Google Scholar
Mustafić H, Jabre P, Caussin C, Murad M, Escolano S, Tafflet M, Périer M-C, Marijon E, Vernerey D, Empana J-P, Jouven X (2012) Main air pollutants and myocardial infarction: a systematic review and meta-analysis. J Am Med Assoc 307(7):713–721
Google Scholar
Omidvarborna H, Kumar A, Kim D-S (2015) Recent studies on soot modeling for diesel combustion. Renew Sustain Energy Rev 48:635–647
CAS Google Scholar
Orellano P, Reynoso J, Quaranta N, Bardach A, Ciapponi A (2020) Short-term exposure to particulate matter (${\rm PM}_{10}$ and ${\rm PM}_{2.5}$), nitrogen dioxide (${\rm NO}_{2}$), and ozone (${\rm O}_{3}$) and all-cause and cause-specific mortality: systematic review and meta-analysis. Environ Int 142:105876
CAS PubMed Google Scholar
Panagoulia D, Economou P, Caroni C (2014) Stationary and nonstationary generalized extreme value modelling of extreme precipitation over a mountainous area under climate change. Environmetrics 25(1):29–43
Google Scholar
Pettit L (1990) The conditional predictive ordinate for the normal distribution. J R Stat Soc Ser B (Methodol) 52(1):175–184
Google Scholar
Porcu E, Montero J, Schlather M (2012) Advances and challenges in space-time modelling of natural events. Springer, New York
Google Scholar
Reiss R-D, Thomas M (2007) Statistical analysis of extreme values: with applications to insurance, finance, hydrology and other fields. Birkhäuser, Basel
Google Scholar
Rodríguez S, Huerta G, Reyes H (2016) A study of trends for Mexico city ozone extremes: 2001–2014. Atmósfera 29(2):107–120
Google Scholar
Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using Integrated Nested Laplace Approximations. J R Stat Soc Ser B (Stat Methodol) 71(2):319–392
Google Scholar
Saez M, Barceló M (2022) Spatial prediction of air pollution levels using a hierarchical Bayesian spatiotemporal model in Catalonia, Spain. Environ Model Softw 151(C):105369
Google Scholar
Samoli E, Stafoggia M, Rodopoulou S, Ostro B, Declercq C, Alessandrini E, Díaz J, Karanasiou A, Kelessis A, Tertre A, Pandolfi P, Randi G, Scarinzi C, Zauli-Sajani S, Katsouyanni K, Forastiere F (2013) Associations between fine and coarse particles and mortality in Mediterranean cities: results from the MED-PARTICLES Project. Environ Health Perspect 121(8):932–938
PubMed PubMed Central Google Scholar
Shah A, Langrish J, Nair H, McAllister D, Hunter A, Donaldson K, Newby D, Mills N (2013) Global association of air pollution and heart failure: a systematic review and meta-analysis. Lancet 382(9897):1039–1048
CAS PubMed PubMed Central Google Scholar
Sharma P, Chandra A, Kaushik S, Sharma P, Jain S (2012) Predicting violations of national ambient air quality standards using extreme value theory for Delhi city. Atmos Pollut Res 3(2):170–179
CAS Google Scholar
Shumway R, Stoffer D (2017) Time series analysis and its applications: with R examples, 4th edn. Springer, Cham
Google Scholar
Simpson D, Rue H, Riebler A, Martins T, Sørbye S (2017) Penalising model component complexity: a principled, practical approach to constructing priors. Stat Sci 32(1):1–28
Google Scholar
Singh V, Meena K, Agarwal A (2021) Travellers’ exposure to air pollution: a systematic review and future directions. Urban Clim 38:100901
Google Scholar
Spiegelhalter D, Best N, Carlin B, van der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc Ser B (Stat Methodol) 64(4):583–639
Google Scholar
Steinle S, Reis S, Sabel C (2013) Quantifying human exposure to air pollution-moving from static monitoring to spatio-temporally resolved personal exposure assessment. Sci Total Environ 443(15):184–193
CAS PubMed Google Scholar
Steinle S, Reis S, Sabel C, Semple S, Twigg M, Braban C, Leeson S, Heal M, Harrison D, Lin C, Wu H (2015) Personal exposure monitoring of $\rm PM_{2.5}$ in indoor and outdoor microenvironments. Sci Total Environ 508(1):383–394
CAS PubMed Google Scholar
Taheri Shahraiyni H, Sodoudi S (2016) Statistical modeling approaches for $\rm PM _{10}$ prediction in urban areas; a review of 21st-Century studies. Atmosphere 7(2):15
Google Scholar
Wang P, Chen K, Zhu S, Wang P, Zhang H (2020) Severe air pollution events not avoided by reduced anthropogenic activities during COVID-19 outbreak. Resour Conserv Recycl 158:104814
PubMed PubMed Central Google Scholar
Watanabe S (2013) A widely applicable Bayesian information criterion. J Mach Learn Res 14(1):867–897
Google Scholar
Xie Y, Li Z, Zhong H, Feng X, Lu P, Xu Z, Guo T, Si Y, Wang J, Chen L, Wei C, Deng F, Baccarelli A, Zheng Z, Guo X, Wu S (2021) Short-term ambient particulate air pollution and hospitalization expenditures of cause-specific cardiorespiratory diseases in China: a multicity analysis. Lancet Reg Health 15:100232
Google Scholar
Yu X, Su J, Li X, Chen G (2014) Short-term effects of particulate matter on stroke attack: meta-regression and meta-analyses. Public Librar Sci 9(5):e95682
Google Scholar
Zhang J, Chen Q, Wang Q, Ding Z, Sun H, Xu Y (2019) The acute health effects of ozone and ${\rm PM}_{2.5}$ on daily cardiovascular disease mortality: a multi-center time series study in China. Ecotoxicol Environ Saf 174(15):218–223
CAS PubMed Google Scholar

Download references

Author information

Authors and Affiliations

Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou, 215123, China
Kai Wang, Chengxiu Ling & Ying Chen
Department of Mathematical Sciences, University of Liverpool, Liverpool, L693BX, UK
Kai Wang
School of Economics and Management, University of Chinese Academy of Sciences, Beijing, 101418, China
Zhengjun Zhang
Department of Statistics, University of Wisconsin-Madison, Madison, WI, 53706, USA
Zhengjun Zhang

Authors

Kai Wang
View author publications
You can also search for this author in PubMed Google Scholar
Chengxiu Ling
View author publications
You can also search for this author in PubMed Google Scholar
Ying Chen
View author publications
You can also search for this author in PubMed Google Scholar
Zhengjun Zhang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Chengxiu Ling.

Additional information

Handling editor: Luiz Duczmal.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Wang, K., Ling, C., Chen, Y. et al. Spatio-temporal joint modelling on moderate and extreme air pollution in Spain. Environ Ecol Stat 30, 601–624 (2023). https://doi.org/10.1007/s10651-023-00575-6

Download citation

Received: 03 March 2023
Revised: 24 July 2023
Accepted: 07 August 2023
Published: 08 September 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10651-023-00575-6

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Spatio-temporal joint modelling on moderate and extreme air pollution in Spain

Abstract

Similar content being viewed by others

Unveiling air pollution patterns in Yemen: a spatial–temporal functional data analysis

Measurement error in a multi-level analysis of air pollution and health: a simulation study

Spatiotemporal modelling of $$\hbox {PM}_{2.5}$$ concentrations in Lombardy (Italy): a comparative study

1 Introduction

2 Data