Key Points for Decision Makers

Lockdown is effective in reducing the number of new COVID-19 infections.

Lockdown starts to reduce the number of COVID-19 infections around 10 days after implementation.

Lockdown continues to reduce the number of new cases as much as 20 days after the initiation of the policy.

1 Introduction and Research Question

Diffusion of a novel coronavirus infectious disease (COVID-19) began in China in December 2019, when the first cases were identified in the city of Wuhan, in Hubei province. Since then, COVID-19 has spread quickly all over the world [1]. For this reason, on 11 March 2020, the World Health Organization (WHO) declared COVID-19 a pandemic. According to WHO data, on 12 May 2020 there were more than 4 million confirmed cases, approximately 280,000 confirmed deaths, and at least 215 countries, areas or territories that had reported confirmed cases of infection. Starting in April, the US observed swift growth of COVID-19, and by 12 May 2020 was the country with the highest number of confirmed cases (approximately 1,300,000). In May, Europe and North America were the most severely affected continents, with 1,755,790 and 1,743,717 cases, respectively (as of 12 May 2020). By 12 May 2020, Spain was the most badly affected European country, with over 224,000 confirmed cases, followed by Russia (221,334), the UK (219,187), Italy (219,070) and Germany (169,575). It should be noted that the situation is rapidly evolving and continuously changing.

Although the approaches taken by national governments to deal with the emergency have varied widely [2], two principal types of policies can be identified: (1) health policies aimed at strengthening the capacity of the hospital system; and (2) policies aimed at reducing the probability of people contracting the virus, such as lockdowns and social distancing measures.

Lockdown has fostered a debate about the need for such measures. This resistance on the part of stakeholders and policymakers is likely because lockdown comes with a very high economic price, characterized by the necessary halting of many productive activities. Although both the WHO and previous literature focusing on the Chinese case [3, 4] have highlighted the importance and centrality of such measures in reducing the probability of contagion (and thus the related diffusion of the virus), the political debate appears to have been greatly influenced by the negative impact of those measures on national economies. The debate has thus focused on the potential inefficiency of this policy, and also on the estimation of a good trade-off between safeguarding citizens’ health and avoiding damage to the economy.

Some countries, such as Taiwan (Footnote 1), have apparently managed to contain the outbreak without imposing a lockdown; in others, such as China, lockdown played a major role in reducing the contagion; and still others claim that a lockdown is unnecessary, such as Sweden, or that it has to be lifted very soon, such as the US.

For all these reasons, it is important to empirically test the efficiency of lockdown in reducing the contagion. While various contributions address this issue, to the best of our knowledge the principal empirical papers that focus specifically on the efficacy of lockdown have been carried out at national level, often adopting susceptible, infectious, recovered (SIR) models, and focusing especially on the Indian and Chinese cases. This literature includes the work of Lau et al. [4], who conclude that thanks to lockdown, a significantly decreased growth in new COVID-19 cases was observed in China, and Sardar et al. [5], who, in a study that incorporates lockdown measures for India, conclude that positive effects of the lockdown are only observed in some provinces. A recent study from the Istituto Superiore di Sanità and Bruno Kessler Foundation of Trento, analysing the reproduction trend of the virus, “underlines the importance of non-pharmaceutical control measures” [6].

Piguillem and Shi [2] study the topic theoretically, adapting an SIR model to include lockdown and virus testing. Their finding was that lockdown is a government’s second best option, after testing. As regards a cross-country perspective, there are comparisons between different patterns of the virus [7], but, to the best of our knowledge, no empirical evidence on the impact of lockdown measures has been provided. Using a qualitative approach, Cruz and Dias [8] investigate COVID-19 in China, Italy, Brazil, and the US, suggesting that “not all relevant actions were taken, in a timely manner, to efficiently address the spread of COVID-19”.

The aim of this manuscript is to provide empirical evidence on the efficacy of lockdown measures by means of quantitative analysis, namely a panel data approach. We address two different research questions. First, is there cross-country empirical evidence that lockdown measures are capable of reducing the number of new cases? Second, how long does it take for lockdown measures to become effective?

This kind of approach allows us to observe and take into account heterogeneity between countries. Results of this type, although less useful for obtaining precise country-level estimates, allow us to generalize a complex phenomenon and to benefit from a higher number of observations, which adds robustness to the results.

2 Data and Methods

To pursue our objectives, we utilized a panel dataset, with daily data from 202 countries around the world used as the basic statistical unit of observation. In formal terms, we estimated the following equation (Eq. 1):

$$\Delta {i}_{ct}=\alpha +{\beta }_{1}{i}_{ct-1}+{\beta }_{2}{DLD}_{ct}+\varepsilon \tag{1}$$

where \(\Delta {i}_{ct}\) is the number of new COVID-19 cases in country c at time t with respect to t-1. This is modeled as a function of the infections in country c on the previous day (\({i}_{ct-1}\)). Furthermore, Eq. 1 includes DLD, a dummy that signals whether or not a lockdown was in effect in country c on day t. DLD may also signal whether a country had had a lockdown in place for x days (more details on this further below), in order to control for the time efficiency of the policy.

To estimate our equation, we needed (1) the daily number of COVID-19 cases; and (2) the lockdown measures put in place. We gathered the former from the ‘Novel Coronavirus Cases’ dataset compiled by the Johns Hopkins University Center for Systems Science and Engineering [9] in its latest version available on 11 May 2020. It offers a daily estimation of COVID-19 cases from 22 January to 10 May 2020; 202 countries were included in our final sample, given the listwise deletion of some minor Pacific Island countries due to the lack of data in the datasets of the other variables. From this source, we computed our dependent variable New cases, the operationalization of \(\Delta {i}_{ct}\), as the first difference between the cases of today and those of yesterday, and also one of the independent variables, YCases, the operationalization of \({i}_{ct-1}\), which is, for each day, the absolute number of cases recorded up to yesterday.
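The construction of New cases and YCases described above can be illustrated with a minimal sketch. It assumes that the input is one country's cumulative confirmed-case series, ordered by day; the function name and the toy numbers are hypothetical and are not taken from the dataset.

```python
def new_cases_and_ycases(cumulative):
    """Return (new_cases, ycases) for one country's cumulative case series.

    new_cases[k] is the first difference between day k+1 and day k;
    ycases[k] is the cumulative count on day k (i.e. "yesterday").
    The first calendar day is lost because it has no "yesterday".
    """
    ycases = list(cumulative[:-1])
    new_cases = [today - yesterday
                 for yesterday, today in zip(cumulative[:-1], cumulative[1:])]
    return new_cases, ycases


# Made-up cumulative counts for five consecutive days (illustration only).
cumulative = [0, 2, 5, 5, 11]
new_cases, ycases = new_cases_and_ycases(cumulative)
print(new_cases)  # [2, 3, 0, 6]
print(ycases)     # [0, 2, 5, 5]
```

Note that a series of T days yields T-1 usable observations under this construction.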

For the lockdown measures, we relied on ACAPS data from the ‘#COVID-19: Government Measures Dataset’ [10] (Footnote 2). This allowed us to distinguish between countries that applied lockdown measures and countries that did not. We used the latest version available on 11 May 2020 (which had been compiled on 5 May 2020), and built the dummy variable Lockdown (operationalization of DLD), which assumes the value of 1 on the first date that a country implemented a partial or complete lockdown measure for the entire population, and also on all subsequent days on which the lockdown is in place. We included only policies aimed at the entire population in order to avoid estimates biased by policy interventions that applied to a small share of the population. This strategy resulted in a total of 272 measures captured by the DLD dummy, in 100 different countries, at many different times.

We also computed DLD for the number of days elapsed since implementation of the policy. Indeed, 97.5% of those who develop symptoms do so within 11.5 days of infection, with a 95% confidence interval of between 8.2 and 15.6 days [11]. Because of this delay between infection and the possibility of testing positive, the lockdown may yield greater benefits, in terms of a reduction in new cases, only some time after the implementation of the policy.
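A minimal sketch of how such dummies might be built is given below. It rests on one possible reading of the text, namely that the lagged dummy equals 1 once a lockdown has been in place for at least x days; the function name, the `lag` parameter and the dates are hypothetical.

```python
from datetime import date, timedelta


def lockdown_dummy(days, start, end=None, lag=0):
    """Return a 0/1 list aligned with `days`.

    The dummy is 1 on each day on which a lockdown that began on `start`
    has been in place for at least `lag` days (and has not ended yet).
    `start=None` marks a country that never imposed a lockdown.
    """
    dummy = []
    for day in days:
        active = (
            start is not None
            and day >= start + timedelta(days=lag)
            and (end is None or day <= end)
        )
        dummy.append(1 if active else 0)
    return dummy


# Hypothetical example: 15 days of data, lockdown starting on the 4th day.
days = [date(2020, 3, 1) + timedelta(days=k) for k in range(15)]
start = date(2020, 3, 4)

dld_0 = lockdown_dummy(days, start)           # lockdown in effect
dld_10 = lockdown_dummy(days, start, lag=10)  # in effect for >= 10 days
print(dld_0)
print(dld_10)
```

Under this reading, each lagged dummy would enter Eq. 1 in place of DLD, one regression per value of x.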

Considering that the data contain repeated observations over time t for each country c, the best estimators are either a feasible generalized least squares (FGLS) [12, 13] or generalized estimating equation (GEE) model [14, 15], which have been proposed to extend the generalized linear model to allow for correlated observations. Considering that the spread of the virus may be due to factors specific to each country, from a theoretical perspective we consider it more appropriate to employ a fixed effects (FE) estimator, which captures the heterogeneity between countries. In other words, in this way it is possible to estimate the average effects for single countries, assuming that the heterogeneity among them does not change in the 110 days of our timespan. A Hausman test (reported in Table 1) also suggests that an FE estimator is preferred to a random effects (RE) estimator.
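To make the role of the fixed effects concrete, the following sketch applies the standard within transformation (demeaning each variable by country) and then ordinary least squares to Eq. 1. It is a simplification: it is plain OLS on demeaned data rather than the FGLS-FE estimator used in the paper, and the data are synthetic, generated with known coefficients purely for illustration.

```python
def within_ols(panel):
    """Within (fixed effects) OLS for y = a_c + b1*x1 + b2*x2 + e.

    `panel` maps each country to a list of (y, x1, x2) rows. Variables are
    demeaned by country, which removes the country intercepts a_c, and the
    2x2 normal equations are solved in closed form.
    """
    s11 = s12 = s22 = s1y = s2y = 0.0
    for rows in panel.values():
        n = len(rows)
        mean_y = sum(r[0] for r in rows) / n
        mean_1 = sum(r[1] for r in rows) / n
        mean_2 = sum(r[2] for r in rows) / n
        for y, x1, x2 in rows:
            dy, d1, d2 = y - mean_y, x1 - mean_1, x2 - mean_2
            s11 += d1 * d1
            s12 += d1 * d2
            s22 += d2 * d2
            s1y += d1 * dy
            s2y += d2 * dy
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear after demeaning")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def make_rows(intercept, ycases, lockdown):
    # Synthetic outcome with known coefficients: 0.1 on YCases, -5 on Lockdown.
    return [(intercept + 0.1 * x1 - 5.0 * x2, x1, x2)
            for x1, x2 in zip(ycases, lockdown)]


panel = {
    "A": make_rows(10.0, [0, 10, 20, 30], [0, 0, 1, 1]),
    "B": make_rows(100.0, [5, 15, 40, 50], [0, 1, 1, 1]),
}
b1, b2 = within_ols(panel)
print(round(b1, 6), round(b2, 6))
```

Because the two synthetic countries have very different intercepts (10 and 100), a pooled regression without country effects would be distorted by them, whereas the within transformation recovers the coefficients used to generate the data.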

Table 1 Hausman test for fixed versus random effects, baseline model (dummy lockdown), worldwide and European-only samples

Moreover, it is possible to imagine that, other than the number of cases on the day before, there are regional differences that influence the evolution of the pandemic. While in an FE estimation all these country-level peculiarities are controlled for, some of these characteristics may be due to the world region to which the country belongs, i.e. its continent. It is also important because, after originating in Asia, the virus took some time to reach Europe, and arrived in America and Africa even later. Thus, the continents may be at different stages of development of the infection. For these reasons, a hierarchical linear model [16] could help to highlight differences and peculiarities within the different world regions by determining the degree of variance in the diffusion of COVID-19 due to cross-regional differences (rather than country-level differences).
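As an illustration only, one common way of writing a two-level random-intercept version of Eq. 1 is sketched here. The exact specification is not spelled out in the text, so the region-level term \(u_{r}\), the index r (the world region to which country c belongs) and the variance symbols are assumptions:

$$\Delta {i}_{ct}=\alpha +{\beta }_{1}{i}_{ct-1}+{\beta }_{2}{DLD}_{ct}+{u}_{r}+{\varepsilon }_{ct},\qquad {u}_{r}\sim N(0,{\sigma }_{u}^{2}),\quad {\varepsilon }_{ct}\sim N(0,{\sigma }_{\varepsilon }^{2})$$

Under this formulation, the share of the unexplained variance attributable to cross-regional differences would be

$$\rho =\frac{{\sigma }_{u}^{2}}{{\sigma }_{u}^{2}+{\sigma }_{\varepsilon }^{2}}$$

and a test of no cross-regional variation corresponds to the null hypothesis \({\sigma }_{u}^{2}=0\).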

Our final dataset is composed of 109 daily observations (for 110 days, i.e. from 22 January to 10 May 2020) in 202 countries, giving a total of 22,018 observations. We also computed estimates for the 47 countries in Europe (with a total of 5123 observations included in this subsample).
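The observation counts above can be checked with simple arithmetic. The sketch below assumes that the single missing day per country is the first one, which has no previous-day value from which to compute New cases and YCases; this reading is an inference and is not stated explicitly in the text.

```python
from datetime import date

first_day = date(2020, 1, 22)
last_day = date(2020, 5, 10)

# Inclusive count of calendar days in the window.
calendar_days = (last_day - first_day).days + 1

# Assumption: the first day is dropped because it has no "yesterday".
usable_days = calendar_days - 1

world_obs = usable_days * 202   # complete sample
europe_obs = usable_days * 47   # European subsample

print(calendar_days, usable_days, world_obs, europe_obs)
# 110 109 22018 5123
```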

3 Results

Results of the estimates through an FGLS-FE on the complete sample are reported in Table 2 and Fig. 1. YCases is the operationalization of \({i}_{ct-1}\) and is the total number of COVID-19 cases registered in country c yesterday (on t-1). It has, as expected, a positive and statistically significant coefficient, suggesting that the more cases reported yesterday, the more New Cases of COVID-19 there will be today.

Table 2 Feasible generalised least squares fixed-effect estimation of the worldwide (complete) sample
Fig. 1 Betas of several lockdown dummies, computed at the date of implementation, 7 days after, 8 days after, etc., up to 20 days after. Lines and lighter colours represent the 95% and 90% confidence intervals. Betas are estimated through the FGLS-FE model. FGLS-FE: feasible generalized least squares–fixed effects

Lockdown has a negative and statistically significant coefficient, suggesting that countries that implemented the lockdown have fewer New Cases than countries that did not. We can also see that the benefits of lockdown increase exponentially with the passing of time.

When looking at the European subsample (Table 3), the situation is slightly different. It is worth noting that Europe has been affected by COVID-19 severely and diffusely, and that compared with other continents, it is densely inhabited. This, combined with the existence of several different countries in a relatively small space, has possibly increased the speed of the contagion. Here Lockdown is positive, suggesting that countries that implemented the lockdown have, on average, more New Cases than countries that did not. This is possibly due to the fact that in the countries that implemented lockdown, the spread of COVID-19 was already advanced compared with other European countries. The positive coefficient thus signals the presence, on the same continent, of countries at different stages of the spread of the virus. In Europe, this difference became statistically non-significant 13 days after the start of lockdown; after 17 days the coefficient becomes negative, suggesting a net benefit in having implemented the measure; its magnitude and statistical significance keep growing up to 20 days, with an apparently exponential trend.

Table 3 Feasible generalised least squares fixed-effect estimation of the European subsample

While both theory and a Hausman test suggested that FE would be a better estimator for testing the model, we cannot assume that all lockdown measures are equal among the 202 different countries included in this analysis. Furthermore, lockdown measures may have happened at different times, and thus could be considerably different between countries or geographical regions. For these reasons, we also estimated Eq. 1 (for both the whole sample and the European subsample) through an FGLS RE estimator. The results are reported in Tables 4 and 5 and are perfectly compatible with the previous finding, suggesting some robustness in the results. Moreover, we also replicated the analysis employing GEE estimators. The results, included in Tables 6 and 7, once again confirm the principal findings, suggesting, if possible, even more stability in the estimates.

Table 4 Feasible generalised least squares random-effect estimation of the worldwide (complete) sample
Table 5 Feasible generalised least squares random-effect estimation of the European subsample
Table 6 Generalized estimating equation of the worldwide (complete) sample
Table 7 Generalized estimating equation of the European subsample

Furthermore, a hierarchical linear RE analysis of variance (ANOVA) is presented in Table 8, for the whole dataset, and Table 9, for the European subsample. This model allows us to analyse the degree of variance due to cross-regional differences. As regards the whole dataset (Table 8), the regions identified as a level of aggregation of the countries were Africa (27.23% of the sample), the Americas (23.76%), Asia (22.77%), Europe (23.27%), and the Pacific (2.97%). On the other hand, for the European subsample, following the distinction adopted by the United Nations Statistics Division (Footnote 3), the regions identified were Eastern Europe (22.92% of this subsample), Northern Europe (22.92%), Southern Europe (35.42%), and Western Europe (18.75%). As expected, the likelihood ratio test statistics for the null hypothesis that there is no cross-regional variation in New Cases have a p-value of 0.00, suggesting that there is indeed variation between the regions identified, both in the complete sample and in the European subsample. Looking at the coefficients, these results are also in line with the previous estimates, suggesting once again some robustness in our findings.

Table 8 Hierarchical linear model, ANOVA random effects of the worldwide (complete) sample
Table 9 Hierarchical linear model, ANOVA random effects of the European subsample

Moreover, estimates of an alternative model, with an interaction term, are included in the electronic supplementary material.

4 Discussion

This article provides the first empirical evidence, from a cross-country perspective, regarding the efficacy of lockdown measures. Our results confirm that lockdown policies have had a positive impact on the pandemic, and that they have been able to reduce the number of COVID-19 cases in the countries that implemented them.

It is worth noting that, as previously pointed out, our results are robust to different kinds of estimators (namely FGLS, both with fixed effects and random effects, and GEE), and also when considering the possible existence of differences depending on the world region or European region to which a specific country belongs (through a hierarchical linear model).

All these results contribute to the current debate regarding the need for this kind of policy. Specifically, we believe that our results may contribute to the debate on defining both ‘mitigation measures’ and ‘exit strategy’. Indeed, we have shown that lockdown is effective, on average and worldwide, in reducing COVID-19 contagion. More specifically, its efficiency begins approximately 10 days after the start of the lockdown, and it continues to reduce the number of COVID-19 infections up to 20 days after implementation. While Sarwal and Sarwal [17] suggest that a ‘localization’ strategy is efficient after 14 days of lockdown, a point at which no further gains can be expected by extending the lockdown, our results seem to contradict this finding from a cross-country perspective, since we detect an increase in benefits and a reduction in contagion even 20 days after the start of lockdown. Accordingly, it seems that any decision to loosen lockdown should be considered very carefully if the full benefits of the measure in terms of reducing new cases are to be enjoyed. More importantly, a rushed removal of lockdown measures may waste hard-won results.

4.1 Limitations

It is important to highlight that we have presented a cross-country analysis. This means it should be considered as an estimate measuring the average effects, worldwide or for European countries. While this has many benefits in terms of generalization of the results and the statistical power of the model, there are also serious limitations in deriving precise estimates, since standard errors can of course increase. Caution is thus suggested in reading these results, which are necessarily also driven by the timing of the measures taken in Europe and the rest of the world, as well as by the spread of the pandemic (it is important to highlight that while we partially controlled this bias in the multilevel analysis, there is still an important heterogeneity within the regions, which are continents for the estimations on the entire dataset, and agglomerates of a dozen countries in the case of the European subsample).

For these reasons, we highlight the importance of and need for further investigations on this topic, which may focus on more specific territorial or climatic subsamples, or on how governments have implemented lockdown policies.