1 Introduction

With more than twelve million confirmed COVID-19 cases and more than 550 thousand related deaths globally as of the beginning of July 2020,Footnote 1 the novel coronavirus pandemic has unquestionably caused dramatic health and economic impacts. Despite the public health benefits of the consequent COVID-19 mitigation measures adopted by the central and the regional governments in Italy, one of the most heavily impacted countries, there are adverse socioeconomic effects of the lockdown on top of what are already dramatic public health impacts. Official morbidity statistics, although complicated by the public health interventions and the emergency status, reveal a strong spatial clustering phenomenon across administrative regions in Italy and provinces and municipalities within each region. Such a geographical concentration of both COVID-19 morbidity and mortality is most likely the result of the interaction of multiple factors, among which include the clustering of initially infected individuals, different choices made about testing and contact tracing in order to identify community transmission, underlying population demographic and prevalence of health status, and the timely adoption of lockdown measures to control the COVID-19 epidemic (Ciminelli and Garcia-mandicó 2020). Beyond such proximal factors, however, additional contextual factors may have played an important role in the health impacts of COVID-19 in Italy.

The Northern Italian regions most affected by the spreading of coronavirus (Lombardia, Veneto, Piemonte, Emilia Romagna) are also the most densely populated and heavily industrialized and thereby the most heavily polluted regions of Italy. These four regions together host 39% of the national population,Footnote 2 and approximately one-half of the Italian GDP is produced there. Such a spatial concentration of economic activities involves the industrial manufacturing sectors to the largest extent, and the consequent high level of emissions is at least in part responsible for poor air quality in the region.Footnote 3 In Brescia, among the most affected cities in Lombardy, the concentration of particulate matter (PM) and ozone exceeded the allowable threshold in 150 days in 2018, making it the most polluted city in Italy. Lodi and Monza follow, with 149 and 140 exceedance days, respectively. Milan and Bergamo are sixth and ninth, respectively, with 135 and 127 days.Footnote 4 Lombardy is also among the most polluted regions in all of Europe (European Environmental Agency 2019). The relatively higher air pollutant concentrations in the Po Valley region of Italy contrasts sharply with neighbouring alpine regions and stems from the combination of two main factors (Carugno et al. 2016; Larsen et al. 2012; Pozzer et al. 2019). The first is the high concentration of urban areas with their congested roads and industrial belts. Source apportionment research from the Lombardy region (Pirovano et al. 2015) indicates that the major sources of PM2.5 include residential heating (e.g., fuel), transport, agriculture, background (including natural-source sand long-range transport), and other (including stationary industrial sources). The second is the location in the orographic “bowl” of the Po Valley, an extension of flat river lands enclosed between the Alps and Apennines mountains, which causes the stagnation of pollutants due to low ventilation (Giulianelli et al. 2014).

These factors help to characterize the Po Valley’s peculiarity with respect to different European areas with comparable urban and industrial density levels (Eeftens et al. 2012). Moreover, in addition to the urbanized and industrial areas, the remainder of the valley presents an intensive agricultural activity. Local studies on emission sources highlight a varying composition of the final concentration values depending on the position of monitoring stations and with different sources acting as local or diffused ones (for instance having high emissions from traffic close to cities, while having background biomass burning diffused in the whole region) (Bigi and Ghermandi 2016; Larsen et al. 2012). Indeed, given the EU Ambient Air Quality Directives that sets the Air quality standards for the protection of health at 25 μg/m3 for the averaging period of a calendar year, the Po valley shows values consistently near or above the threshold. These values often range in the 25–30 μg/m3 interval with peaks of > 30 μg/m3, which in Europe are only matched in Southern Poland and other smaller Eastern European clusters (EEA 2019).

Compared to its overall representation in the population, Lombardy is disproportionately impacted by COVID-19 related mortality, with approximately 53% of Italy’s COVID-19 deaths as of April 15, 2020 (Odone et al. 2020). Lombardy is also the most impacted Italian region as far as the total number of deaths in excess in the first quarter of 2020 compared to the same period of the previous years. Comparing the official COVID-19 death data with registry deaths, it emerges that the latter is almost 70% larger than the former in Lombardy, 27% larger in Emilia-Romagna and 18% and 16% in Veneto and Piemonte, respectively. It is, therefore, imperative to consider the role that PM may have played in such disproportionate COVID-19 deaths in Northern Italy.

There are a number of plausible pathways by which airborne PM may impact COVID-19 related morbidity and mortality. Existing data already finds a strong positive correlation between viral respiratory infection incidence and ambient PM concentrations (Ciencewicki and Jaspers 2007; Sedlmaier et al. 2009). One plausible pathway for this phenomenon is the fate and transport of the virus itself within the environment. A recent position paper by the Italian Society of Environmental Medicine argues that PM may act as both a carrier and substrate of the virus and thus influence the virus’ fate and transport in the environment and reaching susceptible receptors (Setti et al. 2020). Another pathway is the increase in susceptibility to COVID-19 mortality caused by long term exposure to PM. Fine PM is already known to affect cardiovascular and respiratory morbidity and mortality (Cakmak et al. 2018; Jeong et al. 2017; McGuinn et al. 2017; Yin et al. 2017). Moreover, among 1596 Italian COVID-19 patients who died in the hospitals, and for whom it was possible to analyze clinic charts, data showed substantial comorbidities including ischemic heart disease (27.9%); atrial fibrillation (22.4%); heart failure (15.6%); stroke (10.9%); hypertension (70.6%), and chronic obstructive pulmonary disease (17.9%) (Istituto Superiore di Sanità 2020). Biologically, long-term PM exposure may be responsible for a chronic inflammation status that causes the hyper-activation of the immune system and the life-threatening respiratory disorders caused by COVID-19 (Shi et al. 2020).

Some preliminary evidence is now emerging about COVID-19 that shows a positive relationship between air pollution and morbidity and mortality. Beyond qualitatively describing the European Air Quality Index for Northern Italy to argue the causal role of air pollution and the relatively high COVID-19 mortality observed in that region, Conticini et al. (2020) review the most recent existing toxicological and epidemiological literature. Based on existing evidence from other empirical studies, they clarify the relationship between air pollution, prolonged inflammation and immune system hyper-activation and immune suppression, and the link between the latter and acute respiratory distress syndrome, and respiratory mortality. Their paper is important in that it suggests a clinical and biologically plausible explanation to our analysis, but does not provide statistical evidence in support of the hypothesis. A separate empirical analysis by Becchetti et al. (2020) finds preliminary evidence that confirms such a positive effect of air pollution on mortality in Italy based on the analysis of death data at the province level. Similarly, Wu et al. (2020) show a positive association between long term PM exposure and COVID-19 related deaths in US counties. Ogen (2020) recently analysed data from 66 administrative regions in France, Spain, Italy, and Germany, and found that the highest COVID-19 deaths in these regions were associated with five regions of Northern Italy that also corresponded with the highest levels of atmospheric nitrogen dioxide (NO2). Cole et al. (2020) estimate the same relationship using Netherlands municipality data and find PM2.5 positively associated with COVID-19 cases, hospitalization, and deaths.

In this paper, we follow this emerging stream of the empirical literature and test the hypothesis that a higher average long-term exposure to PM2.5 is positively associated with the current extraordinarily high death toll in Northern Italy. We decided to focus on PM2.5 because, given the complexity of air pollution, it is quite common in air pollution epidemiology studies to focus the analysis on a single pollutant (Wu et al. 2020), although multipollutant analyses are certainly warranted. We selected PM2.5 for a variety of important reasons, including policy implications and evidence in the literature in terms of chronic health effects. Regarding its policy implications, we selected PM2.5 as opposed to PM10 because the former is more correlated with human activities than the latter, and it correlates with stronger health effects than PM10 does. With respect to respiratory mortality effects from the existing air pollution literature, the most robust evidence points to PM2.5 as opposed to other gaseous air pollutants (Bowe et al. 2019).

Mortality data are collected at the municipality level for the period January-April 2020. Given that mortality data are not disaggregated by mortality cause, death counts are measured as the difference from the last five-years mean to reflect the abnormal number of deaths caused by the spreading of the pandemic. Since PM2.5 can be associated to generic mortality even in the absence of the pandemic outbreak (Dominici et al. 2003; Katsouyanni et al. 2001; Samet et al. 2000), we also estimate the impact of PM2.5 on the excess mortality in the sample using 2019 data, a time in which the coronavirus epidemic had presumably not yet begun. Data on PM2.5 concentration at the municipality level refer to the years prior 2020 to account for long-term population exposure. We assign municipality PM2.5 concentration by a set of different methods of spatial interpolation (kriging) of monitoring station data related to the years 2015–2019.

We estimate a negative binomial model of excessive deaths on historical PM2.5 concentrations and a series of control variables that may plausibly affect both PM2.5 concentration and mortality, including population density; the spatial concentration of the industrial manufacturing sites; climatic conditions observed during the first quarter of 2020; and the demographic composition of the municipal population among others. In addition, we consider spatial heterogeneity in the distribution of the number of deaths related to regional and local unobservable factors. We account for region-specific effects because regions, in Italy, are the administrative units in charge of managing the health systems and the measures taken to trace and contrast the spreading of the pandemic varied greatly among even contiguous regions. We also account for local effects common to functionally linked clusters of municipalities (the Local Labour Systems—LLS). We deem this part of the identification strategy crucial because the relationship between PM2.5 and COVID-19 mortality may be confounded by several other factors, some of which were not observable or measurable, but are nevertheless intrinsically related to the geographical location of the observed units.

The remainder of the paper is organized as follows. The next section introduces the empirical strategy and describes the dataset. The results are presented and discussed in section three, considering the total number of (excess) deaths. Section four draws the conclusions and highlights the limitation of the study and the indications for future research.

2 Empirical Strategy and Data

Our analysis is restricted to the study area of Northern Italy (Fig. 1), which encompasses the sub-regions of Valle D’Aosta, Piemonte, Liguria, Lombardia, Emilia-Romagna, Veneto, Friuli-Venezia Giulia and Trentino-Alto Adige/Südtirol. Official territorial data on COVID-19 mortality in Italy are available at the rather aggregate regional or provincial level, corresponding to the levels 2 and 3, respectively of the European nomenclature units for territorial statistics (NUTS).Footnote 5 In addition, these official data refer to the deaths of patients tested positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV2) only and do not include (potential) patients without COVID-19 diagnosis because they were not tested and died at home or elsewhere. Hence, the officially reported deaths are likely underestimated. Because testing policies vary among regions in Italy, the induced measurement error is also non-randomly distributed among the provinces. Ciminelli and Garcia-mandicó (2020) compare the official COVID-19 fatality rates with historical death data and report that deaths were higher than official fatalities throughout the period of COVID-19 epidemic.

Fig. 1
figure 1

Italian regions included in the study

Working under the assumption that COVID-19 deaths are underestimated in Italy, the choice is made in this paper to use the total deaths from the official registries, accordingly, and to scale the analysis at the municipality level, the smallest administrative units, to have a more granular representation of the spatial dimension of the phenomenon. Since we are interested in excess deaths, we take the difference between the number of deaths in the period January 1—April 30, 2020, and the average number of deaths in the same period of the previous 5 years (ExDeaths) and use this metric as the dependent variable in our statistical model. Figure 2 displays the geographical distribution of the above-described data among the 4041 municipalities for which data is available.

Fig. 2
figure 2

Spatial distribution of cumulative excess deaths in sample municipalities, Northern Italy, January 1—April 30, 2020

The variable is assumed to follow a Negative Binomial distribution, a generalization of the Poisson distribution that avoids the restrictive mean–variance equality of the latter, and is modelled as follows:

$$\begin{aligned} & {\text{ExDeaths}}_{i} \sim NB\left( {\mu_{i} ,\theta } \right) \\ & \log (\mu_{i} ) = \alpha + \beta PM_{i} + \delta^{\prime }X_{i} + \varepsilon_{i} \\ \end{aligned}$$

where \(\theta\) is the overdispersion parameter to be estimated and \(\mu_{i}\) is the municipality-specific expectation conditional on the value of the covariates. Among the covariates, PM is the concentration of fine particulate matter in municipality i and \(\beta\) is the associated parameter, which we expect positive and statistically different from zero; X is a vector of control variables that adjusts for the potential confounding effects and includes the (log of) total population as the offset while \(\varepsilon\) is a normally-distributed error term.

Our main source of PM2.5 data is the European Environmental Agency’s (EEA) air monitoring database, which is provided to EEA by the Institute for Environmental Protection and Research (ISPRA). ISPRA conducts ground-level air measurements of PM2.5 air concentrations (µg/m3) collected at 268 monitoring sites throughout Italy. Specifically, we use the EEA’s E1a and E2a datasets, which are primary validated assessment data and primary up-to-date assessment data reported by the European Member States, respectively. Although the measurements come both in hourly and daily averaging formats, we work with daily values and use them to obtain yearly aggregates for the years 2015, 2016, 2017, 2018, and 2019. However, because model (1) does not include a time component, we further compute a six-year averaging time to obtain a metric of long-term (chronic) PM2.5 concentration levels throughout different spatial units of Northern Italy. The number of 6 years for the reference period is sufficiently long to account for long-term exposure while being not too long to be affected by the mobility of people among municipalities, and it is in line with existing literature assessing long-term effects of PM exposure (Yorifuji et al. 2019). Since the air monitoring stations provide only partial spatial coverage for municipality-level PM2.5 concentration data, we impute missing observations using a spatial interpolation model. Specifically, we fill in the gaps using a mean stationary Ordinary Kriging (see Bivand et al. 2013, p 209) defined through an exponential covariance function with nugget, partial sill and range parameters estimated through (restricted) maximum likelihood methods. Figure 2 displays the resulting PM2.5 concentration data.Footnote 6

Comparing Figs. 2 and 3, it is possible to visually appreciate a spatial coincidence between higher levels of excess mortality and higher levels of PM2.5, in particular in the Lombardia region which notably is the region with both the highest particulate concentration and the highest number of excess mortality.

Fig. 3
figure 3

Spatial distribution of PM2.5 concentration levels in the sample municipalities, simple kriging of monitoring stations, average across years 2015–2019

The hypothesis that PM2.5 concentration affected COVID deaths, that is \(\hat{\beta } > 0\), is tested among several possible specifications. In model (2) we include regional effects \(\left( {\lambda_{j} } \right)\). These effects are expected to capture the aspects related to the management of the outbreak, which may have systematically influenced COVID-19 mortality and that are common to all the municipalities in the same region. Italy has a national health system that ensures equal access to healthcare to all citizens. The system is managed by regions at the local level, and, in the specific case of this pandemic, regions were responsible for defining the testing and contact-tracing protocols and implementing the necessary measures to contain the outbreak, among which the measure to protect healthcare workers. In model (3), we include LLS-specific effects \(\left( {e_{k} } \right)\). LLS are spatial clusters of contiguous municipalities related by commuting flows that share a common specialization in a specific sector of manufacturing production and correspond to the conceptualization of Marshallian districts (Becattini 2002). The number of LLS clusters per-region and the total number of municipalities belonging to clusters are reported in Table 1, along with the minimum, maximum, and average cluster size.

Table 1 Number of LLS spatial clusters in each region

The use of LLS captures the interlinkages within neighbouring municipalities that may have favoured the geographical spreading of coronavirus around specific hotspots. Mortality data are then expected to vary among municipalities in different LLS, but differences are expected to be non-systematic in this case. In model (4) we include both the regional fixed effects and the LLS random effects.

$$\begin{aligned} & ExDeaths_{i} \sim NB\left( {\mu_{ij} ,\theta } \right) \\ & \log (\mu_{ij} ) = \alpha + \beta PM_{ij} + \delta^{\prime }X_{ij} + \lambda_{j} + \varepsilon_{ij} \\ \end{aligned}$$
$$\begin{aligned} & ExDeaths_{i} \sim NB\left( {\mu_{i} ,\theta } \right) \\ & \log (\mu_{ik} ) = \alpha + \beta PM_{ik} + \delta^{\prime }X_{ik} + u_{ik} \\ & u_{ik} = \varepsilon_{ik} + e_{k} \\ \end{aligned}$$
$$\begin{aligned} & ExDeaths_{i} \sim NB\left( {\mu_{ijk} ,\theta } \right) \\ & \log (\mu_{ijk} ) = \alpha + \beta PM_{ijk} + \delta^{\prime }X_{ijk} + \lambda_{j} + u_{ijk} \\ & u_{ijk} = \varepsilon_{ijk} + e_{k} \\ \end{aligned}$$

Control variables to be included in the model were chosen to avoid any potential spatial confounding effect and considering as well the emerging literature on the impact of PM on COVID-19 related deaths (Cole et al. 2020; Wu et al. 2020). The population density and per-capita income account for urbanisation level. The most densely populated and wealthy municipalities are among the most polluted due to the spatial concentration of manufacturing and service activities but are also the places where the contagion could have been easier, with a potential impact on mortality. In addition to the density of population, the shares of municipality area occupied by industrial sites and the average size of manufacturing firms are included in the regression because they are related to pollutant concentration and possibly to mortality. National measures to stop the spreading of the viral infection (lockdown) involved the service sector to the largest extent while many manufacturing activities, being considered necessary, were left open and, in the absence of social distance and individual protection measures, the geographical concentration of these activities in a municipality with their complex logistics and transport interconnections, and the size of plants, may have influenced mortality. Average temperature, for which an association with COVID-19 deaths has also been found (Ma et al. 2020), is also included in the regression.Footnote 7 Moreover, COVID-19 incidence has proven to be higher among men than women and people aged 65 or more. Hence these two variables are considered in the model, even though these aspects are not necessarily connected with the average PM2.5 exposure in a municipality. Underlying socioeconomic conditions can also play a role in COVID-19 related mortality (Goutte et al. 2020). Brandt et al. (2020) and Mukherji (n.d.) have shown that, in the US, COVID-19 is more threatening for ethnic minorities, and we believe that the share of migrants, identified as non-EU citizens, can control for this aspect influencing the observed excess mortality. On the other hand, Mukherji (2020) and Goutte et al. (2020) also find that places with a higher share of the population with a low level of education have higher deaths. In our paper, given the lack of updated data on education at the municipal level, we proxy it with the percentage of university students on the total population. The distance from the closest airport is a proxy for the functional and relational linkage between a municipality and a place of highly frequent national and international connections and potential sources of coronavirus spreading. Finally, we consider the number of hospital beds as a proxy for the supply of health services to account for the fact that many people died at home without being diagnosed for coronavirus due to the shortage of beds in public structures. The full details of the variables in the model, including sources and summary statistics, are presented in Table 2.

Table 2 Description of model variables and summary sample statistics

Having accounted for the confounding effect due to the omission of relevant information from the empirical specification, we exclude any other potential source of endogeneity considered in similar papers. In particular, we exclude endogeneity due to measurement error in the outcome variable and the main independent variable. Concerning the outcome variable, the relationship between deaths and cases with fine PM could be spurious because more cases could be registered, and more individuals tested in highly polluted areas as people there are more likely to show COVID-19 symptoms due to the chronic inflammation induced by PM. The high toll of deaths of people diagnosed with COVID-19 would be a natural consequence of that. In contrast, the number of deaths in excess, used in this paper, is not affected by testing problems since it considers all the potential COVID-19 deaths. Concerning the PM variable, measurement errors are likely to occur when using satellite data or modelled data. We preferred to use PM2.5 levels observed from monitoring stations to avoid such a measurement error. Some caution is needed in the spatial interpolation because the method chosen to fill the missing data may underestimate the value in locations farther from the monitoring stations. With this concern in mind, we test the robustness of our results using PM2.5 data obtained from different interpolation approaches.

3 Results

As indicated in Table 2, the overall average of PM2.5 for the study area between 2015 and 2019 is roughly 20 µg/m3, as most municipalities in Norther Italian regions belong to industrial and agricultural intensive locations. The average mortality between 2015 and 2019 for the period of interest (January 1—April 30) was 25 deaths, while it grew to 34 in 2020. That results in an average excess death of 9, with standard deviation four times as larger (see Table 2).

Estimation results from the negative binomial models are summarised in Table 3 for the four different specifications of the model (1—no geographical effects; 2—regional fixed effects; 3—LLS random effects; 4—regional fixed effects and LLS random effects). In the lower part of the table, the estimated overdispersion parameter, the Akaike Information Criterion (AIC), and the Moran’s test for the null hypothesis of absence of spatial autocorrelationFootnote 8 in the residuals (Moran 1950) are reported.

Table 3 Estimation result of main regressions, dependent variable: excess deaths during the period January 1—April 30 2020, municipalities in Northern Italy

The four specifications provide consistent results in terms of the direction and significance of PM2.5 coefficients. The overall effect of PM2.5 on COVID-19-related excess mortality is positive and statistically significant in all models. The estimated incidence rate ratios, reported in Table 4 with their confidence interval, for Model 1, 2, 3 and 4 are 13.7%, 8.9%, 9.3%, and 9.3%, respectively.

Table 4 Marginal effects of an increase in PM2.5 concentration on excess deaths in Northern Italy during COVID-19 outbreak

In model 2, the regional fixed effects coefficients are statistically significant. They indicate that other things being equal, the number of deaths in municipalities in Lombardy and Emilia Romagna has been systematically higher compared to base categoryFootnote 9 and in municipalities in Veneto it has been systematically lower. The significance of the coefficient for Emilia Romagna, however, drops after including the random effects in the model. Since the first three models are nested into model 4 it is also possible to compare the models in terms of AIC. Model 4 performs substantially better than the other three. In general, the inclusion of RE in models 3 and 4 leads to a decrease in the value of the AIC. In models 1 and 2 the residuals appear spatially autocorrelated, as the null hypothesis of no spatial autocorrelation is rejected in both cases (p < 0.001). The introduction of the LLS random effects appears to solve the issue of autocorrelation.

Based on the estimates of model 4, we compute the expected value of excess deaths conditional on covariates (taken at the average level) in the average city for varying levels of PM2.5 and show how the expected number of deaths by region varies at different concentration levels in Fig. 4. Notably, Emilia–Romagna and Liguria are the regions in which a a reduction of average fine PM from the highest level to the lowest would have benefited the most.

Fig. 4
figure 4

Expected excess deaths in the average municipality against the observed value of PM2.5, by region

3.1 Robustness checks

For robustness check of the PM2.5 metric used in our study, we explored the influence that other alternate PM2.5 metrics may have on the direction and magnitude of the observed associations. Figure 5 depicts the point estimates and the 95% confidence interval for the Incidence Rate Ratios (IRR).Footnote 10 We find that while data from satellite elaborations (MODISFootnote 11 and DIMAQFootnote 12), and monitoring stations’ interpolation EEAFootnote 13 PM2.5 models result in IRRs trending in the same direction, the point estimates for IRRs are lower than our primary analysis which was based on a combination of ground monitoring and kriging. The lower IRR point estimates are unsurprising because the underlying data for the alternate PM2.5 metrics do not have the same temporal coverage as the ground-level monitoring data (2015–2019). This lack of temporal coverage contributes to non-differential exposure misclassification, which, in turn, would lead to suppressing effect estimates towards the null. Despite this, it is encouraging to find that regardless of the PM2.5 metric used, the direction of the observed associations remains, and so does statistical significance.

Fig. 5
figure 5

Robustness check: estimated IRR (PM variable only) for models (1)–(4) using spatially interpolated data and four alternative satellite measures of particulate concentration

As previously anticipated, we re-estimate model (4) using different specifications of the Kriging interpolator. In particular, we first relax the mean-stationarity assumption of Ordinary Kriging by modelling the mean function of the process through both a linear and a quadratic trend in latitude and longitude. Next, we replace the simple Exponential function with a Spherical model and a more flexible Matérn kernel with the characteristic parameter set at 3/2 (to preserve mean-square differentiability). All these specifications still assume covariance stationarity. Figure 6 and Table 5 in the Appendix report the estimated Incidence Rate Ratios (IRR) regression coefficients for the PM variable in model (4) under these multiple setups: both point estimates and 95% confidence intervals indicate that there are no substantive differences between using different trend or covariance models, indicating that our result is robust to alternative specification of the interpolation method.

Fig. 6
figure 6

Robustness Check: estimated IRR (PM variable only) for PM in Model 4 using three different covariance functions and three alternative trend models

Table 5 Estimated regression coefficient (PM variable only) for Model 4 using nine Kriging specifications: three different covariance functions (Exponential, Matérn and Spherical) time three alternative trend models (constant trend, linear trend and quadratic trend)

4 Discussion

In each of the four specifications presented, the coefficient related to PM2.5 is always of the hypothesized direction and statistically different from zero. Precisely and consistently with previous results for the original SARS-Coronavirus during the 2003 outbreak (Cui et al. 2003), an increase in air pollution exposure is associated with increased mortality for COVID-19. The first panel in Fig. 4, as well as Table 4, suggests that, when using interpolated data from ISPRA monitoring stations, the increase in mortality rate due to a one-unit increase in PM2.5 concentration varies between 14% (model 1—highest rate) and nearly 9% (model 4—lowest rate). The 95% confidence interval for the point estimate in model 4 lies between roughly 6% and 12%. Our findings fall in line with both Wu et al. (2020) and Cole et al. (2020) papers. Specifically, both papers find a positive ecological relationship between PM2.5 and COVID-19 mortality. In relation to a 1 µg/m3 increase in PM2.5, Wu et al. find 8% change in COVID-19 mortality, Cole et al. find the same increase associated to additional 3 COVID-19 deaths (almost 17% if compared to their sample mean), and our paper finds 9% increase in COVID-19 related excess mortality. Despite this similarity in results, the two key differences between our study and the others relate to the exposure assessment method and the outcome assessment method. In our study we use a spatial interpolation method (kriging) from ground-level monitoring data, whereas these other two studies utilize PM2.5 gridded surfaces such as chemical transport modelling in the case of Cole et al. and a hybrid approach using chemical transport, aerosol optical depth and land use regression modelling in the case of Wu et al. With respect to COVID-19 mortality data, Wu et al. use county-level data from the Johns Hopkins University, Center for Systems Science and Engineering Coronavirus Resource Center, which is comprised of COVID-19 deaths tabulated by the US Centers for Disease Control and Prevention and State health departments. In Cole et al., researchers obtained COVID-19 deaths by residential address and aggregated these to the municipality level. The obvious difference between their study and ours is that we used a surrogate excess mortality measure due to the issues of reliability for COVID-19 death data, as we have already discussed. The other relevant difference between our study and the Wu et al. and Cole et al. studies is that we subsample the total cohort of Italian municipalities to only regions with a very high mortality rate, which are also the regions most affected by the air quality problems. On the other hand, when satellite data are used, our estimate yields lower incidence ratios. Although ground-level concentration metrics come with fewer measurement errors, satellite data proves nevertheless useful in corroborating both the direction and the significance of the effect of interest. This redundancy is particularly relevant in light of the relatively few stations capable of detecting the finest particulate.

With reference to model (4) and the remaining covariates, we observe no effect related to population density or income or the extent of industrial areas in the municipality. Likewise, there is no evidence suggesting significant links between the share of non-EU residents, the female to male ratio (which disappears after we incorporate the random effects), and the level of education (proxy by the percentage of university students) on the dependent variable of interest. On the other hand, our results suggest a negative association between temperatures and mortality due to COVID-19. Finally, as expected, we find that municipalities with higher shares of the population aged 65 or more have been most affected. The distance from the closest airport, a measure of relational connectedness that also proxy for the exposure to the contagion process, deserves a last comment. We find that municipalities closer to an airport experience a higher number of deaths in excess. We speculate that the result could be related to a higher likelihood for these municipalities to become clusters of contagion in the initial phase of the pandemic, but a causal link cannot be inferred based on our result ad the topic needs more research to be addressed adequately.

We conclude our analysis by checking the consistency of our results to different choices of the dependent variable. Existing evidence (Dominici et al. 2003; Pascal et al. 2014; Samet et al. 2000; Yin et al. 2017) associates fine PM to severe cardiovascular and respiratory diseases and mortality. In European cities, in particular, an estimated increase in the number of daily deaths of 0.7% is associated with an increase of 10 µg/m3 of PM10 (Katsouyanni et al. 2001). This evidence suggests that long term PM exposure may have had an overall effect on deaths even before the outbreak in the sample municipalities, making it more difficult to isolate the real effect on COVID-19 deaths. We thus run model (4) using the total number of deaths in the same observation period of 2019 as the dependent variable to understand whether the effect of fine PM2.5 on mortality has been more severe during the pandemic. We find no evidence of an effect of PM2.5 on total deaths for 2019 in the sampled municipalities, suggesting that the effect of PM exposure on the mortality rate is closely connected to the novel coronavirus outbreak (see Table 6 in the Appendix). However, since the dependent variable in this “placebo” regression cannot be directly compared to the excess mortality, we repeat the test using total mortality for the year 2020. Although the latter includes both COVID-19 related and unrelated deaths, these two variables represent data generating processes of the same nature. As expected, both the regression coefficients and IRRs calculated regressing total deaths in 2020 suggest a positive and statistically significant effect of exposure to fine particulate on mortality, even though its magnitude is greatly reduced if compared to the estimates in Tables 3 and 4 (see Table 7 in the Appendix). Presumably, the effect of PM2.5 concentration on COVID-19 related mortality becomes muted by the noise introduced when accounting for other causes of death. This would also explain the non-significant PM2.5 coefficient in the first “placebo” regression.

Table 6 Estimation results for the placebo regression, dependent variable: total number of deaths during the period Jan1-April 30 2019, municipalities in Northern Italy
Table 7 Estimation results for the placebo regression, dependent variable: total number of deaths during the period Jan1–April 30 2020, municipalities in Northern Italy

5 Conclusion

Italy is among the countries most severely affected by the new coronavirus, with more than 230 thousand confirmed cases and more than 30 thousand deaths as of the end of May. Yet, the spatial distribution of confirmed cases and deaths suggest that the effects of the viral infection spreading largely vary across the regions of the country but also within regions. In this work, we examined the role of ambient PM2.5 in explaining the spatial variation in deaths that occurred throughout the most extreme time period of the epidemic. The results in the paper, that suggest a positive relationship between PM2.5 concentration and COVID-19 related excess mortality, are robust to different specifications PM2.5 and estimation strategies, even after controlling for additional confounder factors. Coherently with previous findings in the literature, we highlight a strong positive correlation between viral respiratory infection incidence and ambient PM2.5 concentrations and the increase in susceptibility to COVID-19 mortality caused by long term exposure to PM2.5, consistent with evidence for the original SARS-Coronavirus during the 2003 outbreak. In fact, fine PM is already known to affect cardiovascular and respiratory morbidity and mortality.

However, we are aware that the phenomenon and the cause and effect relationships are very complex and that our work can only address part of the problem. The cross-sectional nature of the dataset and the use of geographically aggregated information in the epidemiological model does not allow concluding a causal effect exists. In our opinion, the robust evidence in the paper shows that the relationship between PM2.5 and COVID-19 related excess deaths goes far beyond a simple geographical correlation, and further research is needed to explore the causal effect more in depth, when reliable time series data are available.

In fact, our paper does not deal with the spread of contagion and the dynamics linked to it, also because, as we underlined, such analysis would require time-series data, a different econometric methodology, and the identification of the exogenous Coronavirus insurgence in Northern Italy. To the latter purpose, the spread of the pandemic incorporates two different dynamics: (1) on the one hand, the dynamics of the spread of the contagion requires further information to be investigated such as its genesis, the type of virus, and setting of the first outbreak; (2) the effects of the lockdown changed (or partially blocked) the contagion in an asymmetric way. In addition to this, of course, there are other elements that should be investigated, such as additional variables about health data, mobility, and so forth.

Our results reinforce the need to adopt environmental policies that would not only reduce the impact of pollution on the health of citizens and workers but would contribute to smooth the negative effects of a (future) pandemics, avoiding collapses of health systems. Indeed, recent studies show that in addition to chronic lung inflammation, environmental air pollutant concentrations can exacerbate the effects of increasingly frequent one-shot systemic shocks, which in turn are also caused by environmental factors. In this regard, sustainable and decarbonization policies such as the Green New Deal, conceived as long-term policies, should be accelerated.