The impact of non-pharmaceutical interventions on COVID-19 epidemic growth in the 37 OECD member states

We estimated the impact of a comprehensive set of non-pharmeceutical interventions on the COVID-19 epidemic growth rate across the 37 member states of the Organisation for Economic Co-operation and Development during the early phase of the COVID-19 pandemic and between October and December 2020. For this task, we conducted a data-driven, longitudinal analysis using a multilevel modelling approach with both maximum likelihood and Bayesian estimation. We found that during the early phase of the epidemic: implementing restrictions on gatherings of more than 100 people, between 11 and 100 people, and 10 people or less was associated with a respective average reduction of 2.58%, 2.78% and 2.81% in the daily growth rate in weekly confirmed cases; requiring closing for some sectors or for all but essential workplaces with an average reduction of 1.51% and 1.78%; requiring closing of some school levels or all school levels with an average reduction of 1.12% or 1.65%; recommending mask wearing with an average reduction of 0.45%, requiring mask wearing country-wide in specific public spaces or in specific geographical areas within the country with an average reduction of 0.44%, requiring mask-wearing country-wide in all public places or all public places where social distancing is not possible with an average reduction of 0.96%; and number of tests per thousand population with an average reduction of 0.02% per unit increase. Between October and December 2020 work closing requirements and testing policy were significant predictors of the epidemic growth rate. These findings provide evidence to support policy decision-making regarding which NPIs to implement to control the spread of the COVID-19 pandemic. Supplementary Information The online version contains supplementary material available at 10.1007/s10654-021-00766-0.


Appendix 1. Calculation of the average daily growth rate in the cumulative number of weekly cases (wADGR).
The average daily growth rate in the cumulative number of weekly cases (wADGR) is expressed as: = ( −1 )(1 + ) 7 , where is the cumulative number of cases at the end of week t and −1 is the cumulative number of cases at the end of week t-1. Solving for wADGR: The average daily growth rate in the cumulative number of cases in week 5 is 2.6%.

Appendix 2. Data collection, data inputs, and data sources.
Table A1 provides a description of the data inputs and their sources. All data was collected via searches in the World Wide Web. The values of some categorical variables were recoded to balance the number of observations between levels (see Table 1). For the mask wearing requirements, we combined country-specific values from two sources -a study by Leffler et al [1] and the WHO COVID-19 Government Response Tracker [2] and coded them ourselves according to three levels: No requirements/ Mask wearing in public recommended / Mask wearing in public required in some public places or in some geographical areas within the country / Mask wearing in public required in all public places and in all geographical areas within the country. Restrictions on gatherings 0 -No restrictions 1 -Restrictions on very large gatherings (the limit is above 1000 people) 2 -Restrictions on gatherings between 101-1000 people 3 -Restrictions on gatherings between 11-100 people 4 -Restrictions on gatherings of 10 people or less 0 -No restrictions 1 -Restrictions on gatherings of more than 100 people 2 -Restrictions on gatherings of between 11 and 100 people 3 -Restrictions on gatherings of 10 people or less Public transport restrictions 0 -No measures 1 -Recommend closing (or significantly reduce volume/ route/ means of transport available) 2 -Require closing (or prohibit most citizens from using it) 0 -No measures 1 -Recommend closing (or significantly reduce volume/ route/ means of transport available) or require closing (or prohibit most citizens from using it) Stay at home requirements 0 -No measures 1 -recommend not leaving house 2 -require not leaving house with exceptions for daily exercise, grocery shopping, and 'essential' trips 3 -Require not leaving house with minimal exceptions (e.g. allowed to leave only once a week, or only one person can leave at a time, etc.) 0 -No measures or recommend not leaving house 1 -require not leaving house with exceptions for daily exercise, grocery shopping, and 'essential' trips or require not leaving house with minimal exceptions (e.g. allowed to leave only once a week, or only one person can leave at a time, etc.)  Outcome and overall approach to model fitting.
We performed a probit transformation of the ADGR values (probit_wADGR). We then fitted a series of OLS-lines to the probit_ADGR trajectories (see Figure A1). From Figure A1, it was appropriate to use a growth model linear in time. Figure A1. Probit_ADGR trajectories and fitted OLS lines.

Model fitting.
All models were fitted using Maximum Likelihood (ML) estimation with the Nelder-Mead optimiser. Statistical significance was set at p=0.05. In addition, the final model was also fitted using Bayesian Estimation with minimally informative priors using Integrated Nested Laplace Estimation. The first fitted model was an unconditional growth model with only time as an independent variable (see Table 2). To identify the regressors for inclusion in the mLMM, we used maximum likelihood estimation with a forward selection procedure as follows: -First, we fitted a series of univariate linear mixed models with time and individual NPIs as regressors -Second, we selected all the NPIs which had shown to be significant regressors in the univariate models -Third, we ranked each of these policies in decreasing order, based on the goodness of fit of the univariate models as expressed by the Bayesian Information Criterion (BIC) -Fourth, we fitted a series of multivariable forward selection linear mixed models with time and adding each NPI sequentially based on its rank from the previous step. If a particular NPI was not a significant predictor it was excluded from the forward selection models -Fifth, we introduced implementation time delay and each control variable individually into the forward selection models. If any of these regressors was statistically significant it was included in the final model.
-Intercept -Time -Workplace closing (require closing or work from home for some sectors or categories of workers) -Workplace closing (require closing or work from home all-but-essential workplaces e.g. grocery stores, doctors)    In the mLMM, variations in the following regressors were significant predictors of changes in the probit_ADGR: 1) restrictions on gatherings, 2) mask wearing requirements, 3) workplace closing requirements, 4) school closing requirements, and 5) total number of tests performed per thousand population during the study period. The maximum likelihood and Bayesian estimation give very similar results.
Testing the assumptions of the mLMM.
In the sub-sections below we present our assessment of the model assumptions, including: Normality in the residuals, heteroscedasticity in the residuals, and lack of strong collinearity in the regressors.
Normality in the distribution of the residuals Figure A2 presents qqplots for, respectively, the level-1 residuals, i.e. the residuals across all countries and all occasions of measurement (top graph) and the level-2 residuals, i.e. the residuals in the intercepts and slopes across countries (bottom graph), i.e. in the between-country intercept and slope residuals. From Figure A2, the distribution of these residuals can be considered approximately Normal.   While mobility (an independent predictor of the probit_ADGR) can be considered an intermediate variable, i.e.occurring in the causal pathway between most NPIs and the epidemic growth rate and hence should not be included as an explanatory variable in the mLMM, it may be a confounding variable in the association between mask wearing and epidemic growth. In a sensibility analysis, we assessed whether the effect of mask wearing was maintained when mobility was included in the model (see Table A6). -Restrictions on gatherings (more than 100 people) -Restrictions on gatherings (between 11-100 people) -Restrictions on gatherings (10 people or less) -Mask wearing requirements (mask wearing recommended) -Mask wearing requirements (mask wearing required in specific public spaces country-wide or in specific geographical areas) -Mask wearing requirements (mask wearing required country-wide in all public places or in all public places where social distancing is not possible) -Workplace closing (require closing or work from home for some sectors or categories of workers) -Workplace closing (require closing or work from home all-but-essential workplaces e.g. grocery stores, doctors) -School closing (only some levels or categories, e.g. just high school, or just public schools) -School closing (require closing all levels)  Table A6, mobility picks up the effect of other NPIs but not of mask wearing requirements, which was still present.

Multivariable generalised linear mixed model (mGLMM)
We fitted a multivariable beta regression generalized linear mixed model (mGLMM) with a probit link function using the average daily rate of growth in the cumulative weekly cases (wADGR) as the response variable. We used restricted maximum likelihood estimation to fit the model. The mGLMM allowed for the estimation of the average marginal effects (AME) of the NPIs on the wADGR. The AME provide a measure, across all observed data, of the average change in the wADGR which results from changes in the level of intensity of each of the NPIs. The best fitting mGLMM was adjusted by modelling the model dispersion with workplace closing requirements as a regressor. Table A7 presents the model results.  The AME estimated based on the results of the mGLMM (last column), indicate that restrictions on gatherings had the highest impact of all NPIs in reducing the average daily growth rate in cumulative weekly confirmed COVID-19 cases (wADGR). Workplace closing requirements had the second highest impact, followed by school closing requirements. Mask wearing requirements ranked fourth in terms of impact and, as a proxy for testing strategy. Additionally, the total number of tests performed country-wide per thousand population was a significant predictor.
Testing the assumptions of the mGLMM.
In the sub-sections below we present our assessment of the model assumptions, including: adequacy of the probit link function, normality in the distribution of the random-effect residuals, no overdispersion (i.e. uniformity in the distribution of the scaled residuals and uniformity in y-direction of the residuals), no collinearity between the regressors.  Figure 4, the mGLMM has a good predictive ability. Figure A5 shows two qqplots for, respectively, the random effects residuals of the mGLMM. From these plots, the distribution of these residuals is approximately Normal. Figure A5. qqplots of the random effect residuals for the mGLMM Figure A6 shows two the output from the DHARMa package residual diagnostics. The left panel shows a qqplot comparing the distribution of the scaled residuals with their expected distribution. The scaled residuals of the mGLMM follow (as appropriate) a uniform distribution. The DHARMa package overdispersion test is not significant, the Kolmogorov-Smirnof test for correct distribution is not significant, the test for outliers is not significant. The right panel shows a plot of the empirical residuals in the 0.25, 0.5 and 0.75 quantiles against their predicted values to detect deviations from uniformity in y-direction. No significant deviations are detected. Figure A6. Residual diagnostics for the mGLMM Table A8 shows the variance inflation factor (VIF) for each of the regressors in the mGLMM. Values close to 1 indicate no collinearity.

Multivariable generalised linear mixed model (mGLMM) with NPI-time interactions
For the mGLMM, we explored the interactions between the NPIs and the time variable to understand what policies may impact longitudinal changes in the wADGR. Table A9 shows the results of the mGLMM with NPI-time interactions adjusted for overdispersion in workplace closing requirements. The best fitting model showed an interaction of mask wearing requirements with time.  The AME estimated based on the results of the mGLMM with a mask wearing requirements-time interaction (last column) indicate, again, as was the case with the mGLMM without interactions, that restrictions on gatherings had the highest impact of all NPIs in reducing the wADGR. Comparing the last column of Table A9 with the last column of Table A7, the interaction of mask wearing requirements with time results in a slight decrement of the average marginal effects of the NPIs.
Testing the assumptions of the mGLMM with mask wearing requirements-time interaction.
As with the mGLMM with no interactions, in the sub-sections below we present our assessment of the model assumptions (except collinearity, as the model incorporates interaction terms). Figure A7 presents, in the top panel, a plot of the observed and predicted log-scaled wADGR for all countries and time points and, in the bottom panel, a posterior predictive check. From Figure 4, the mGLMM with mask wearing requirements-time interaction has a good predictive ability. Figure A7. Observed versus predicted values of the wADGR for the mGLMM with mask wearing requirements-time interaction Figure A8 shows two qqplots for, respectively, the random effects residuals of the mGLMM with interaction between mask wearing requirements and time. From these plots, the distribution of these residuals is approximately Normal. Figure A8. qqplots of the random effect residuals for the mGLMM with mask wearing requirementstime interaction. Figure A9 shows the output from the DHARMa package residual diagnostics. The scaled residuals of the mGLMM follow (as appropriate) a uniform distribution. The DHARMa package overdispersion test is not significant, the Kolmogorov-Smirnof test for correct distribution is not significant, the test for outliers is not significant. No significant deviations from uniformity in y-direction are detected. Figure A9. Residual diagnostics for the mGLMM with mask wearing requirement-time interaction

Analysis of the period October 1-December 31, 2020.
Figure A10 below presents side by side the change over time in the intensity of the NPIs across OECD countries for the period October-December 2020 and for the initial phase of the epidemic (in each graph, each point is the intensity of the corresponding intervention in each country during the relevant week).  Figure A10, there were several differences in the intensity of each NPI in the period October-December 2020 compared to the initial stages of the epidemic: 1. Public event cancelling requirements. In the period October-December 2020 (P2), this policy was overall less stringent but tightened more over time than in the study period of the initial epidemic stage (P1) 2. Restrictions on gatherings. In P2, this NPI was implemented at a higher level of stringency and tightened over time compared to P1 3. International travel controls. In P2 this NPI was stable over time and overall less stringent than in P1.
4. Public information campaigns. In P2, this NPI was stable over time and overall slightly more intense than in P1 5. Public transport restrictions. In P2, this NPI intensified slightly over time and was overall less stringent than in P1 6. Stay at home restrictions. In P2, this NPI tightened somewhat over time and was overall less stringent than in P1 7. Internal travel restrictions. In P2, this NPI was stable over time and overall somewhat less stringent than in P1 8. Contact tracing policy. In P2, this NPI was stable over time and overall more intense than in P1.
9. School closing requirements. In P2, this NPI tightened slightly over time and was overall less stringent than in P1 10. Work closing requirements. In P2, this NPI tightened over time and was slightly less stringent than in P1 11. Testing policy. In P2, this NPI was stable over time and was overall more intense than in P1 12. Mask wearing requirements. In P2, this NPI tightened slightly over time and was overall more stringent than in P1

Multivariable linear mixed model (mLMM2)
Outcome and overall approach to model fitting.
We performed a probit transformation of the ADGR values (probit_wADGR). We then fitted a series of OLS-lines to the probit_ADGR trajectories (see Figure A11). From Figure A11, it was appropriate to use a growth model linear in time. Figure A11. Probit_ADGR trajectories and fitted OLS lines.
Identifying the mLMM2: forward selection. Table A10 below shows the results of the univariate linear mixed models (only for the NPIs which showed a statistical effect) and their rank. This rank determined the sequence in which the regressors were introduced into the forward selection models: Table A10. Univariate results and sequence in which the regressors were introduced into the forward selection models.

Regressors Coefficients (SE) p-values BIC Variance (intercept) Variance (slope) Variance (residuals)
-Intercept -Time -Workplace closing (require closing or work from home for some sectors or categories of workers) -Workplace closing (require closing or work from home all-but-essential workplaces e.g. grocery stores, doctors)  Table A11 below shows the mLMM2 resulting from the forward selection process.  Testing the assumptions of the mLMM2.
In the sub-sections below we present our assessment of the model assumptions, including: Normality in the residuals, heteroscedasticity in the residuals, and lack of strong collinearity in the regressors. Figure A12 presents qqplots for, respectively, the level-1 residuals, i.e. the residuals across all countries and all occasions of measurement (top graph) and the level-2 residuals, i.e. the residuals in the intercepts and slopes across countries (bottom graph), i.e. in the between-country intercept and slope residuals. From Figure A12, the distribution of these residuals can be considered approximately Normal.   Percentage of total population living in urban areas 1

Multivariable generalised linear mixed model (mGLMM2)
We fitted a multivariable beta regression generalized linear mixed model (mGLMM2) with a probit link function using the average daily rate of growth in the cumulative weekly cases (wADGR) as the response variable. We used restricted maximum likelihood estimation to fit the model. The mGLMM2 allowed for the estimation of the average marginal effects (AME) of the NPIs on the wADGR. The AME provide a measure, across all observed data, of the average change in the wADGR which results from changes in the level of intensity of each of the NPIs. The best fitting mGLMM2 was adjusted by modelling the model dispersion with test policy requirements as a regressor. Table  A13 presents the model results.  Testing the assumptions of the mGLMM2.
We tested the following model assumptions: adequacy of the probit link function, normality in the distribution of the random-effect residuals, no overdispersion (i.e. uniformity in the distribution of the scaled residuals and uniformity in y-direction of the residuals), no collinearity between the regressors.  Figure A14, the mGLMM2 has a good predictive ability.

Adequacy of the probit link function
29 Figure A14. Observed versus predicted values of the wADGR for the mGLMM2 Normality in the distribution of the random effects residuals Figure A15 shows two qqplots for, respectively, the random effects residuals of the mGLMM2. From these plots, the distribution of these residuals is approximately Normal. Figure A15. qqplots of the random effect residuals for the mGLMM2 Figure A16 shows two the output from the DHARMa package residual diagnostics. The left panel shows a qqplot comparing the distribution of the scaled residuals with their expected distribution. The scaled residuals of the mGLMM follow (as appropriate) a uniform distribution. The DHARMa package overdispersion test is not significant, the Kolmogorov-Smirnof test for correct distribution is not significant, the test for outliers is not significant. The right panel shows a plot of the empirical residuals in the 0.25, 0.5 and 0.75 quantiles against their predicted values to detect deviations from uniformity in y-direction. Although some significant deviations are detected at two quantiles, the combined adjusted quantile test is not significant. -Country = name of OECD member state -pop2020 = total population in 2020 -urban_percent = percentage of the population living in urban areas -SDI = sociodemographic index -GDP_pc_PPP = GDP, purchase power parity -GDP_health = % of GDP spent in health -household = average household size -palma = Palma ratio -timely = time delay in the implementation of the first in-country social distancing NPI -time_w_recoded = week number (0 = two weeks after first internal NPI implemented for NPI intensity / 0 = 4 weeks after first internal NPI implemented for the outcome) -ave_daily_gr = average daily growth in the cumulative number of weekly cases (wADGR) -strin_index = stringency index -tests_policy = intensity of testing policy -TT_84_pth = tests per thousand population -contacts_policy = intensity of the contact tracing policy -IT_1 = Intensity of international travel controls -PI_1 = Intensity of public information campaigns -SC_11 = Intensity of school closing requirements -WC_11 = Intensity of workplace closing requirements -PE_11 = Intensity of public event cancelling restrictions -RG_11 = Intensity of restrictions on gatherings -PT_11 = Intensity of public transport closure requirements -SH_11 = Intensity of stay at home restrictions -DT_11 = Intensity of internal travel restrictions -masks = Intensity of mask-wearing policy -temperature = temperature -dem_index = democracy index -mob_index = mobility composite -cases_start = baseline number of cumulative cases

R script.
Loading required packages