1 Introduction

The outbreak of SARS-CoV-2 triggered a strong reaction from public authorities, who responded in accordance with the threat that this new infectious disease entailed for the population. Despite having adopted social distancing measures that had never been seen in living memory in the twenty-first century, this pandemic led to most countries seeing significantly increased mortality [1], which the national surveillance authorities in Europe highlighted daily with their published reports [2, 3]. Nowadays, the daily monitoring of the pandemic’s evolution has been partially replaced by the quest to globally assess its mortality burden.

However, because these various national surveillance authorities were quickly established ad hoc to monitor the pandemic, use of the data they provide is broadly questioned for a wide range of reasons: different definitions of data amongst countries; time-varying collection methods; reporting delays; and diverse coverage by place of death [4, 5]. In contrast, the standard official statistics systems provide all-cause deaths, which are essential for computing excess mortality, defined as the difference between the deaths observed and the deaths that would have been observed under normal circumstances. In the context of the SARS-CoV-2 pandemic, excess mortality is defined as “the difference between the number of deaths (from any cause) that occur during the pandemic and the number of deaths that would have occurred in the absence of the pandemic” [6]. Excess mortality is considered by scholars to be the gold standard for estimating the global impact of SARS-CoV-2 [7, 8], which is why numerous studies have estimated excess mortality for a large number of countries [9,10,11,12,13,14,15,16,17,18].

Whilst some of these studies evaluate the pandemic’s impact on mortality by estimating the absolute or relative numbers of excess deaths, these figures are generally insufficient because they fail to consider changes in a population’s age structure. For this reason, scholars prefer looking at age-specific mortality rates instead, as they can be collapsed into a summary measure such as the age-standardised death rate or period life expectancy at birth (hereinafter, shortened to simply “life expectancy”). Especially, the latter measure is a popular tool for quantifying period shocks in mortality because life expectancy readily expresses changes in mortality in terms of a longer or shorter average life span and are thus, comprehensible for a broad audience. The impact on changes in age-specific mortality rates on life expectancy trends is, however, more complex than often assumed (see e.g. [19,20,21]). Recently, [22] suggested interpreting a drop in life expectancy due to a mortality shock such as those incurred by the COVID-19 pandemic rather as a measure of premature mortality as compared to differences in mean longevity. This new interpretation might be suitable for our analysis because the mortality conditions observed in 2020 at the subnational level are unlikely to prevail in the future and thus, it is difficult to conceive the presented mortality changes as differences in average life spans [23].

To date, most of these studies estimate excess mortality for countries as a whole, with the undeniable risk of hiding intranational differences and hindering effective health policies within a pandemic context. Hence, in recent months, many papers have estimated regional excess mortality for specific countries [24,25,26,27,28,29,30,31,32]. However, comparing these regional patterns is relatively problematic because these studies take different approaches to compute the mortality levels that would have occurred without the pandemic. Specifically, they either use pre-pandemic levels or employ forecasting techniques. Moreover, these papers rely on different indicators to assess excess mortality, namely life expectancy or death toll. To our knowledge, only one study [33] allows for a simultaneous comparison of regional excess deaths in five European countries (Italy, Switzerland, Spain, Greece, England, and Wales) in 2020.

Our paper aims to fill this research gap by presenting life expectancy losses at birth in 2020 for 561 regions from 21 countries in Central and Western Europe. In dealing with small geographical areas, we focus on point-estimates as much as on uncertainty quantification. To obtain the mortality levels that would have been observed in 2020 in the absence of the pandemic, we took a robust forecasting approach that accounts for regional diversity and delivers analytic confidence intervals that account for all sources of uncertainty [34, 35]. Furthermore, this study also identifies geographical clusters of high and low losses of life expectancy.

2 Data and Methods

2.1 Data Preparation

We collected regional death and population counts for 21 European countries by age and sex from the national statistical offices, Eurostat, and the Human Mortality Database [36]. Since these data show varying age classes, we harmonised them into single-age intervals up to 95 + for all spatial units [37]. The lowest number of age groups in our input data is 16 (for Slovakia) and the largest age group that we ungrouped into single ages is 14 (for Germany, deaths at age 1 to 14). To ensure comparability in area size when selecting spatial units amongst the 21 countries, we relied mostly on level 3 (Czechia, Denmark, France, Italy, Luxembourg, Slovakia, Spain, Sweden) and level 2 (Austria, Belgium, England and Wales, Hungary, Iceland, Netherlands, Norway, Portugal, Switzerland, Slovenia) of the Nomenclature of Territorial Units for Statistics (NUTS 3 and NUTS 2). As exceptions, we used NUTS 1 for Ireland, Northern Ireland, and Scotland; and, for Germany, a national classification (“Raumordnungsregionen”) [38]. Minor adjustments were required due to territorial changes over time and data availability issues (see Table A1 in online Supplementary Information for details). In total, we analysed 561 harmonised spatial units containing populations ranging from 40,000 (Bornholm, Denmark) to 6,760,000 (Madrid, Spain).

Finally, we verified the data quality by comparing our regional data (aggregated at the national level) with those available in the Human Mortality Database. The differences between age-specific death rates calculated from the two data sources are almost nil for ages below 90, and very small for ages upper this limit. Thus, summary measures such as life expectancy at birth correspond to each other.

2.2 Methodology

When dealing with excess mortality, a central methodological issue concerns estimating the baseline mortality level, which is what would have been the expected level in the absence of the pandemic. Pre-pandemic mortality levels are often used as the baseline because they are easy to obtain and compute. However, because such a simplistic approach frequently ignores temporal trends, it is necessary to derive a more suitable expected mortality level in the absence of COVID-19 by forecasting pre-pandemic historical trends for the year affected by the pandemic (here, 2020). Among the numerous methodologies currently available (e.g. [39]), we opted for a CP-spline approach [35], which combines two-dimensional P-splines with prior demographical information derived from population-specific historical patterns.

A significant advantage to taking a non-parametric approach like that of CP-splines lies in its great flexibility for describing various mortality scenarios, which becomes particularly relevant when dealing with 561 diverse subpopulations across 21 European countries. Furthermore, smooth, plausible age profiles and time trends are ensured whilst we gain the added advantage of robustness when analysing small populations at risk. Moreover, the relatively low computation costs of CP-splines allow us to optimise region-specific time windows for our 2020 forecast values.

It is essential to quantify uncertainty before drawing any conclusions on excess mortality levels, particularly when dealing with regional mortality data. The error associated with the forecast of the life expectancy in 2020 must be added to the uncertainty around the observed 2020 mortality levels. Whereas the variance–covariance structure from CP-splines allows us to compute confidence intervals for our forecast values, a fully analytic procedure is used to compute uncertainty around the observed 2020 mortality levels. These two sources of uncertainty are then combined without performing any simulation or bootstrap procedures, thereby considerably reducing computational costs and time. A detailed description of the analytic procedure is available in online Supplementary Information C which includes a validation of our approach through a comparison with the traditional bootstrap procedure. To replicate our methods, the codes we applied in R as well as the population and death counts for France’s 95 study regions are available in [34]. All calculations are performed in R version 4.0.3 [40].

It is noteworthy that the whole procedure can be performed for any given age and regardless of the mortality indicator selected for estimating excess mortality (e.g. life expectancy or age-standardised death rate). Our study focuses on losses in life expectancy at birth and at age 60, rather than on total excess deaths. A key advantage of our two indicators is that they remain unaffected by the age structure of the population, which differs greatly from one region to another in Europe.

To identify any spatial clustering of losses of life expectancy, we rely on the Getis-Ord Gi* statistic [41], which allows us to locate hot and cold spots, that is, neighbouring areas of high concentration of spatial units with either large or small loss of life expectancy. In practice, Gi* is calculated for all estimated losses of life expectancy in our dataset, and the resultant z-scores and p-values are evaluated within a defined neighbourhood. Here, we used the contiguity edges corners definition of neighbourhood, which specifically considers neighbours to be units that share a boundary, share a node, or overlap. To be a statistically significant hot spot, a spatial unit should not only have a high value but also be surrounded by other units with high values. The local sum for a given unit and its neighbours is evaluated against the sum of all spatial units. If the observed local sum is very different from the expected local sum, then this difference is not the result of random chance, and the corresponding z-score is statistically significant. The same logic is applied for identifying cold spots. This well-established methodology has been implemented and thoroughly documented in ArcMap software (module Spatial Statistics Tools) [42], which we used to perform the respective analysis as well as to produce mortality maps. Required shapefiles were collected from Eurostat [43] and were re-projected using the Winkel tripel projection.

3 Results

Figure 1 shows the spatial distribution of losses of life expectancy at birth across 561 territorial units in 21 European countries during 2020. Light blue represents gains in life expectancy compared to expected values, whereas the remaining colours indicate life expectancy losses. Among all the countries considered in our analysis, those most affected by the pandemic appear to be Italy, Spain, and Poland. In most Polish regions as well as in many Spanish and Italian regions, losses in both male and female life expectancy amounted to more than 1 year. This fall in life expectancy observed within just one calendar year is unprecedented in twenty-first century Europe [9, 10]. Interestingly, they are amongst the areas with both the highest longevity (Spanish and Italian regions) and the lowest longevity (Polish regions) in Europe (see Figures A1 and A2 in the online Supplementary Information A). In contrast, most regions in Denmark, Germany, Norway, and western France, as well as some rural areas in Sweden, showed only moderate declines or even slight increases in life expectancy in 2020 compared to their expected values.

Fig. 1
figure 1

Spatial distribution of losses of life expectancy at birth (years) across 21 European countries in 2020 Source: As shown in Table A1

Although previous works could only show national differences in excess mortality associated with the COVID-19 pandemic, our regional analysis is able to highlight substantial within-country variations at the subnational level. Italy represents a notable example, with the whole country showing a loss of life expectancy at birth of 1.16 years for both sexes combined. However, this national value conceals both a loss of 4.18 years in the Bergamo province in the north (CI: − 3.95 to − 4.4) and a gain of 0.4 years in the Sicilian province of Caltanissetta, even if non-significant (CI: − 0.17 to 0.92). Moreover, we can see in Fig. 3 that 95% confidence intervals are plotted along estimated excess mortality in life expectancy at birth. Other examples of remarkable subnational differences can be found in France and the United Kingdom. In the former case, higher excess mortality is visible in the north-east compared to the south-west regions. In the latter, south-west shows a moderate gain, unlike the central parts of England that experience high losses. Spatial variation is also evident in Germany, a country with relatively better performance during the first pandemic wave in 2020. Specifically, its eastern and south-eastern regions bordering Poland, Czechia, and Austria show higher life expectancy losses than the rest of the country.

In contrast, although Poland shows life expectancy losses similar to Italy (1.32 year), there is considerably less subnational variation between the highest and lowest affected regions, with Jeleniogorski in the south showing a loss of 0.7 years (CI: − 0.16 to − 1.09) and Przemyski in the east a loss of 2.17 years (CI: − 1.7 to − 2.65).

The analysis of spatial distribution of losses of life expectancy at age 60 has revealed very similar results (see Figure A3 in the online Supplementary Information A).

Our next step was to analyse the hot and cold spots for losses of life expectancy at birth (Fig. 2) and at age 60 (Fig. A4 in the online Supplementary Information A). This analysis statistically confirms our previous visual inspection: hot spots of life expectancy losses were indeed located in northern Italy and covered most of the Polish and Spanish territory; whereas cold spots were located in Denmark, Germany, western France, and Norway. Furthermore, our analysis identifies several small clusters of elevated excess mortality expanding across national borders. One example can be seen in an area covering parts of northern Italy and southern Switzerland, and—at least amongst women—spreading into eastern France. Another cross-national hot spot expands from southern Poland into eastern Czechia and—again, amongst women—into northern Slovakia. These observations may well indicate possible spill-over effects of the pandemic burden across European countries.

Fig. 2
figure 2

Hot and cold spots of losses of life expectancy at birth across 21 European countries in 2020 Source: as shown in Table A1

Figure 3 ranks the life expectancy losses and shows the associated 95% confidence intervals (CI) for all 561 spatial units. This figure also allows us to visually assess the magnitude of uncertainty around point estimates. The Panel A colours correspond to the colours and cut points presented in Fig. 1. Panels B and C depict the top and bottom 30 regions, respectively. Figures A5 and A6 in the Supplementary Information A separately show the same distributions for men and women.

Fig. 3
figure 3

Source: As shown in Table A1. Notes: (1) Excess mortality is estimated as the difference between observed and expected life expectancy at birth (2) Horizontal bars represent 95% confidence intervals

Losses or gains of life expectancy at birth (e0) associated with COVID-19 pandemic across 561 spatial units in 21 European countries, 2020, both sexes combined.

Figure 3 also reveals the large regional variability in excess mortality across Europe. The aforementioned Italian Bergamo province experienced the highest loss of life expectancy, amounting to 4.18 years (CI: − 3.95 to − 4.4), whereas the Ariège department in France benefited from the highest gain of 0.65 years, even if non-significant (CI: − 0.13 to 1.45). Only a few regions experienced life expectancy gains for which the 95% confidence intervals do not include 0: Devon, Cornwall, and Scilly Island in the UK and Nordjylland in Denmark. Overall, 391 out of 561 regions suffered consequential life expectancy losses, meaning that the associated 95% confidence intervals do not contain 0. The tail of the distribution (Panel C) is dominated by northern Italian and central Spanish regions. Among those harshly hit by the pandemic, 12 regions experienced losses of life expectancy greater than 2 years, with the Italian provinces of Piacenza, Cremona, and Bergamo experiencing more than 3 years of losses. As expected, the group of regions least affected by the pandemic (Panel B) is dominated by German, French, and Danish spatial units. However, this successful group also includes some Italian and UK regions. The numerical values of losses of life expectancy at birth and at age 60 for each spatial unit and for both females and males are provided in the online Supplementary Information B.

In addition to revealing intranational heterogeneity and cross-national patterns, using subnational data has the advantage of enabling a deeper analysis of excess mortality. Moreover, the uncommonly large number of spatial units also increases statistical power in these supplementary analyses. For instance, we can easily assess the association between sex differences in life expectancy loss and the overall regional loss. Figure 4 displays the results of this closer examination using a modified version of a Bland–Altman plot, where the differences between female and male changes in life expectancy are plotted against the corresponding overall regional change. Each dot in the plot is sized proportionally to the population of its respective region.

Fig. 4
figure 4

A modified version of a Bland-Altman plot to portray association between differences between female and male losses of life expectancy at birth and overall regional changes across 561 spatial units in 21 European countries, 2020 Source: As shown in Table A1

As already clear from previous analysis, the majority of regions present an overall loss in life expectancy and this is indicated by the large number of dots in the red areas of Fig. 4. The average difference in excess mortality between males and females is clearly indicated by the mean line, showing that males have experienced over three months of additional loss on average. Regions with higher losses amongst males are represented by dots below the horizontal zero line (lighter red), whilst regions with greater losses amongst females are indicated by dots above this line (darker red).

The identification of outliers is particularly noteworthy. Here, we identify outliers by depicting spatial units lying beyond the 95% limits of agreement of the Bland–Altman plot estimated as prediction intervals of a mean model. Notably, nearly all outliers amongst regions with greater losses of males belong to areas identified as hot spots (Italy, Poland, and Spain). Nevertheless, some outliers also show greater female losses, and these are located predominantly in France.

4 Discussion

4.1 Statement of Principal Findings

This large study of 561 territorial units in 21 countries uses loss of life expectancy at birth and at age 60 to measure the burden of COVID-19 in 2020 throughout Western and Central Europe. Our findings focus on life expectancy at birth and show evidence of losses in 391 regions, whilst only three regions exhibit notable gains during the first year of pandemic. Overall, those most affected were central Spain, northern Italy, and eastern Poland, whilst Denmark, western France, Germany, and Norway were somewhat spared from the COVID-19 burden. Regional differences in life expectancy at birth changes were impressive, ranging from a loss of more than 4 years to a gain of 8 months. For 12 regions, losses amounted to more than 2 years, and three regions showed losses greater than 3 years. Our results regarding losses in life expectancy at age 60 are similar in magnitude since the deaths were mainly observed above this age.

This study further reveals that regional differences in excess mortality were also highly substantial within some countries. In Italy, for example, we estimated a 4-year life expectancy loss in the Bergamo province but no loss in the Caltanissetta province. However, the regional differences in one country can differ greatly from those in another. For example, although Italy and Poland both have similar losses of life expectancy at birth at the national level, the gap between the highest and lowest afflicted regions is three times larger in Italy than in Poland.

Geographical clusters identify where life expectancy was much lower than expected (hot spot), and others where life expectancy was in line with expectations or even higher (cold spot). Notwithstanding, these analyses pinpoint clusters where excess mortality expanded beyond national borders in certain European areas: The hot spots observed in northern Italy spilled over into southern Switzerland and eastern France, and other hot spots observed in Poland spilled over into northern Slovakia and eastern Czechia. Meanwhile, this study singles out a cold spot located in two nearby countries (Germany and Denmark).

We observe differences between women and men in their life expectancy losses in 2020. In most analysed regions, reduced life expectancy was greater amongst men. For example, in the worst hit regions in northern Italy and central Spain, men’s losses were more than 1 year higher as compared to their female counterparts. Widening sex differentials in mortality during the COVID-19 pandemic have been reported previously [1, 44, 45]. Yet, we also find more pronounced life expectancy losses for women in many European regions where drops in life expectancy were more moderate. This is relevant for the discussion on causes for male excess deaths, i.e. biological mechanisms vs. social mechanism [46]. As we observe substantial variability in the magnitude and even direction of sex differences in life expectancy losses, our results indicate that the degree of mortality deterioration is not necessarily linked to biological sex.

4.2 Strengths and Limitations

This study is the first to use a large set of small territorial units to provide estimates of loss of life expectancy during the COVID-19 pandemic in 2020. The contiguity of the regions investigated has allowed us to accurately identify geographical clusters of elevated mortality. This approach is novel because most previous studies have focussed primarily on single-country regions or on regions in various countries with no common borders.

Contrary to many other studies on COVID-19-related deaths, we estimated losses of life expectancy using official mortality data routinely collected by vital registration systems, which are less sensitive to reporting delays and misclassification. To verify that each country’s sums of its regional data were consistent for all ages, we verified them with data from the Human Mortality Database. This is especially crucial, not only for calculating outcomes at older ages, which require statistical techniques to obtain single-year-of-age data, but also considering how COVID-19 was particularly dangerous for older individuals. Finally, we computed baseline mortality using an up-to-date statistical methodology that optimises the time windows in our models for forecasting regional trends in 2020. We compared our country-specific estimates with results from previous studies to evaluate our findings in terms of reliability. Our estimates for losses in life expectancy are consistent with results from Aburto et al. [9] and Islam et al. [15] (see table A2).

Our study also has several limitations. It covers 21 Central and Western European countries and still cannot be extended to Eastern European countries. However, comparisons with these countries would have been interesting insofar as this global health crisis triggered different political responses. Furthermore, this study covers only the year 2020. Given the experience gained by policymakers during the first waves of the pandemic and the emergence of new variants, the results for 2021 and 2022 should be different. As soon as more up-to-date data become available, the analyses we performed can be easily extended along both the spatial and temporal dimensions.

Whilst our adopted forecast approach is robust and flexible enough to be adapted to different demographic scenarios and relatively small populations, we model each spatial unit independently and without considering spatial autocorrelation. By including a spatial structure in modelling and forecasting mortality, this may eventually reduce uncertainty around excess mortality estimates and thus lead to more explicit outcomes. Nonetheless, it remains essential to account for temporal trends when estimating mortality levels in the absence of a pandemic and, consequently, compute excess mortality more accurately [15, 17, 33]. However, other approaches in computing baseline mortality levels have been proposed [47,48,49], and these could be adopted in order to obtain alternative perspectives.

4.3 Policy Implications and Future Research

Our study extends the existing literature on excess mortality during the COVID-19 pandemic in 2020 by reporting results at the regional level for many European countries, which has not been done so far. The findings provide a strong rationale for this regional analysis, as we show that national estimates would have hidden significant regional disparities. Policymakers should be made aware of this intranational heterogeneity to fully assess the burden of the pandemic in their own countries and adopt differentiated health policy responses.

Whist our study quantifies regional differences in excess mortality, it cannot provide avenues for explaining them. The first step towards doing so would be to link these estimates to regional contextual variables (such as income level or occupational structure) as well as to public policies of social distancing and international isolation, which were implemented at both regional and national levels. Eventually, ecological analyses should be complemented with carefully designed epidemiological studies. In this way, one will be able to highlight the decisive factors explaining these regional differences and ultimately gain a better understanding of how to deal with the spread of a new infectious disease. For example, the exceptional mortality observed in northern Italy might be due to the early onset of the pandemic, which triggered a strong public response from the Italian government that limited the spread of the pandemic to the southern parts of the country [50].

It should be emphasised that this study estimates the burden of the pandemic by calculating excess mortality, which is the overall net assessment of the pandemic’s impact on society and must be distinguished from the mortality count due to COVID-19. More specifically, the Institute for Health Metrics and Evaluation (IHME) considers excess mortality to be influenced by six drivers related to the pandemic and the response to it, which are namely: (i) deaths directly related to SARS-CoV-2 infection; (ii) increased mortality due to cancelling or postponing care; (iii) increased mortality due to mental health problems, including depression, increased alcohol use, and increased opiate use; (iv) reduced mortality from decreases in accidents because of reduced mobility; (v) reduced mortality due to reduced transmission of other viruses, including influenza; and (vi) reduced mortality from many chronic diseases such as cardiovascular and respiratory diseases that would have killed frail individuals had COVID-19 not killed them instead. When detailed cause-of-death statistics become available, an accurate assessment must be made of COVID-19 mortality, as well as the distinction between direct and indirect COVID-19 mortality.

Finally, this study critically depends on reliable data being made available from the national statistical institutes of the different European countries analysed here. The magnitude of our presented results should encourage policymakers to promptly provide regional data to better measure the burden of future epidemics in a timely fashion. To produce accurate analyses in times of demographical ageing, it is of utmost importance to harmonise European population and mortality data with consistent age groups and an oldest age class of at least 100 years upwards.