Introduction

In January 2020, in Wuhan, China, what began as a cluster of viral pneumonia cases rapidly spiraled into an epidemic of a new disease, COVID-19, caused by a novel coronavirus, designated SARS-CoV-2. As cases mounted in Wuhan and began cropping up elsewhere, China introduced a comprehensive set of measures, termed nonpharmaceutical interventions (NPIs), to contain the virus. On January 23, a lockdown was enacted in Wuhan, which shut down public transit and travel out of the city. Other cities throughout Hubei province (of which Wuhan is capital) announced similar lockdowns over the next few days [1]. The rest of China was subject to social distancing measures: mass transit and public gatherings were severely curtailed, and the New Year holiday (Chunyun) was extended, which kept most schools, workplaces, and businesses closed [1,2,3]. In addition, numerous measures were implemented to rapidly identify and isolate suspected cases. These included temperature checks at borders and travel hubs, quarantine of new arrivals, isolation of both confirmed and suspected cases, and contact tracing with quarantine and medical observation.

China’s robust public health response was decidedly effective in controlling the spread of SARS-CoV-2 [4,5,6,7]. As of April 4, 2021, China had a cumulative incidence of 70 cases per million residents and cumulative mortality of 3 deaths per million, compared to 16,737 cases and 365 deaths per million persons globally [8]. Given the magnitude and intensity of the response in China, which would be difficult to replicate in many settings, it would be useful to know which interventions were most effective in limiting the spread of the virus.

Evaluating the effectiveness of specific NPIs has been a challenge throughout the COVID-19 pandemic because multiple interventions are typically used concurrently. Several studies have estimated the combined impact of multiple NPIs [5,6,7, 9]; others have estimated the effect of specific control measures using statistical approaches [10,11,12,13,14,15,16,17,18,19,20] or modeling [21,22,23,24,25]. The majority of studies focus on the effectiveness of social distancing interventions, while relatively few explicitly consider the impact of quarantine or isolation measures. However, in China, the impact of rapid case isolation was indirectly observed in a shift toward earlier transmission, reflected in shorter serial intervals [26]. Here, we explore how strong this effect was, using serial interval data and case incidence data to estimate how much NPIs reduced transmission before and after symptom onset. We show that, in the first few weeks after NPIs were implemented, presymptomatic transmission decreased slightly but transmission post-symptom onset declined dramatically. These results suggest that “post-onset” interventions, such as case isolation, may have been more effective than other control measures at limiting transmission of SARS-CoV-2.

Results

We compiled published data, including symptom onset dates, for 873 infector-infectee case pairs from China (see Methods) [27,28,29,30]. Among these case pairs, the majority of transmission events occurred outside Hubei, with 84% of the secondary cases reported to be infected in other provinces (Supplemental Table S1). Symptom onset dates ranged from January 7 to February 29, 2020, a period that spans the rollout of nonpharmaceutical interventions in China. The mean serial interval, defined as the time between onset of symptoms in infector and infectee, was 4.64 days over the entire period, similar to published estimates [27, 28, 31, 32], but this distribution shifted markedly over time. A linear regression of serial intervals vs. symptom onset dates of primary cases reveals a significant decrease in the length of serial intervals (p = 4 × 10–34, Fig. 1a), an observation also made by Ali et al. [26].

Fig. 1
figure 1

Serial intervals of infector-infectee pairs in China. a Serial interval vs. date of symptom onset in the primary case of each pair. Shading reflects the number of overlapping points, with darker colors indicating higher numbers. Dashed vertical line indicates Jan 23, when the rollout of nonpharmaceutical interventions (NPIs) began. b Serial interval histograms and best-fit normal distributions for primary cases with symptom onset before Jan 23 (pre-NPI; blue, hatched bars and dashed line) and on/after Jan 23 (post-NPI; red, solid bars and solid line)

We divided case pairs into two time periods using the symptom onset dates of the primary cases. January 23 marked the lockdown of Wuhan and the start of a national rollout of nonpharmaceutical interventions (NPIs); cases with symptom onset prior to Jan. 23 were therefore denoted pre-NPI (n = 207) and the rest designated post-NPI (n = 666). Serial intervals were significantly different between the two time periods (p = 4 × 10–19, Fig. 1b), with a mean of 7.57 days (standard deviation 5.13 days) before the NPI rollout and a mean of 3.73 days (standard deviation 4.93 days) afterward.

Next, we used a Markov chain Monte Carlo (MCMC) approach to estimate the distribution of the generation interval for each time period by fitting to serial interval data (Whereas the serial interval is the time between consecutive symptom onset events, the generation interval is the time between consecutive transmission events). We replicated the analysis using three different priors for the incubation period distribution (Supplemental Table S2), based on work by Lauer et al. [33], Zhang et al. [30]; and Backer et al. [34]; we also used two different models for the generation interval distribution. Under the incubation-independent model, all generation intervals are drawn from a single gamma distribution; under the incubation-dependent model, the generation interval is still gamma-distributed but the probability distribution for a given case is horizontally “stretched” in proportion to the incubation period, which means that cases with longer incubation periods will tend to have longer generation intervals. We used a modified deviance information criterion [35] to assess model fit (Supplementary Information) and found that the best model for both time periods was the incubation-independent model—which, somewhat surprisingly, suggests generation intervals do not increase with incubation period—paired with the prior based on Lauer et al. (Supplemental Table S3).

As expected, the fitted generation interval distributions had similar means but smaller variances than the serial interval distributions [36]. Under the best model, pre-NPI generation intervals had a mean of 7.50 days (standard deviation 3.95 days) while post-NPI generation intervals had a mean of 3.90 days (standard deviation 3.15 days). Under the remaining models, mean generation intervals ranged from 7.49 to 7.64 days for the pre-NPI period and from 3.84 to 4.04 days for the post-NPI period (Supplemental Table S4).

Using the inferred generation interval distributions, we estimated the relative frequency of presymptomatic transmission under each model by calculating the probability of the generation interval being shorter than the incubation period. Under the best model, the frequency of presymptomatic transmission was estimated to be 34.4% in the pre-NPI period (95% credible interval 28.3%–41.3%) and 71.0% in the post-NPI period (95% CI 67.6%–74.2%) (Fig. 2). Across all models, the estimated frequency of presymptomatic transmission ranged from 30.7% to 47.0% for the pre-NPI period, with 95% CIs collectively extending from 24.0% to 53.7% (Supplemental Table S5). For the post-NPI period, estimates ranged from 68.1% to 80.6%, with 95% CIs extending from 64.5% to 83.5%. We note that the estimates for the pre-NPI period are lower than many published estimates of the frequency of presymptomatic transmission [27, 29, 37,38,39,40], which raises the possibility that many estimates might have been inflated by the effects of nonpharmaceutical interventions.

Fig. 2
figure 2

Estimated relative frequency of presymptomatic transmission before and after the rollout of nonpharmaceutical interventions (NPIs). Points and lines show posterior means and 95% credible intervals, respectively. Colors denote time periods (blue, pre-NPI; red, post-NPI), shades denote generation interval distribution models (light, incubation-dependent model; dark, incubation-independent model), and symbols denote sources for incubation period prior distributions (squares, Lauer et al. [33]; circles, Zhang et al. [30]; triangles, Backer et al. [34]). Estimates from the best model are highlighted in gray

The shift toward presymptomatic transmission following the rollout of NPIs suggests a disproportionate reduction in transmission following symptom onset. We therefore estimated the change in transmission during each phase of the infection (before and after symptom onset) following the implementation of control measures. To do so, it was first necessary to estimate the overall reduction in transmission of SARS-CoV-2. For this, we used province-level incidence data for all of China, combining the numbers for all provinces except Hubei because the transmission events represented in the case-pair data mostly occurred outside Hubei. We calculated the daily case reproduction number \((R_{t} )\), shown in Fig. 3, using a version of the Wallinga-Teunis method [41] which was modified to allow for time-varying serial intervals, as changes in the serial interval distribution have been shown to affect inference of reproduction numbers [26].

Fig. 3
figure 3

Daily reproduction numbers estimated using incidence data from all Chinese provinces except Hubei. Dashed vertical line, start of nonpharmaceutical intervention (NPI) rollout on Jan 23; gray shading, region in which \(R_{t}\) is likely to be underestimated due to right-truncation

We used the estimated \(R_{t}\) values to calculate the mean reproduction numbers for the pre- and post-NPI periods; estimates past February 20 were not used due to the potential for underestimation of \(R_{t}\) as a result of right-truncation (Fig. 3) [42]. The mean reproduction numbers were estimated to be 1.64 for the pre-NPI period and 0.666 for the post-NPI period, corresponding to a 59.4% reduction in overall transmission. Treating this net change as a weighted average of the changes in presymptomatic transmission and transmission post-symptom onset, we calculated the change in the absolute frequency of transmission during each phase of the infection. Under the best model, we estimate that presymptomatic transmission decreased by 15.5% (95% CI: − 30.6% to + 2.56%) while transmission post-symptom onset decreased by 82.0% (95% CI: − 84.5% to − 79.0%; Fig. 4). Across all models, estimates of the change in presymptomatic transmission ranged from − 29.9% to − 8.62%, with 95% CIs extending from − 39.4% to + 16.3% (Supplemental Table S6). Estimates of the change in transmission post-symptom onset ranged from -85.1% to -81.3%, with 95% CIs extending from − 87.9% to -78.3%.

Fig. 4
figure 4

Estimated percent change in absolute frequency of presymptomatic transmission and transmission post-symptom onset. Points and lines show posterior means and 95% credible intervals, respectively. Orange, presymptomatic transmission; green, transmission post-symptom onset; closed symbols, incubation-dependent model; open symbols, incubation-independent model. Symbol shapes denote sources for incubation period prior distributions (squares, Lauer et al. [33]; circles, Zhang et al. [30]; triangles, Backer et al. [34]). Estimates from the best model are highlighted in gray

One hypothesis to explain the disproportionate reduction in transmission post-symptom onset is that “post-onset” interventions, such as case isolation, were highly effective at preventing transmission in the later stages of infection; however, several alternative explanations must be considered. One possibility is that travel restrictions reduced the frequency of case pairs in which the primary and secondary cases were infected in different cities; if such case pairs had longer-than-average serial intervals, then a reduction in the frequency of such case pairs might cause the average serial interval to decrease. Although the proportion of case pairs with the primary and secondary cases infected in different cities decreased after the rollout of nonpharmaceutical interventions (p = 1 × 10–4; Supplemental Table S7), there was no significant effect of being infected in different cities on serial intervals (p > 0.05; Supplemental Fig S1).

Another possibility is that social distancing increased time spent at home, with more frequent exposure leading to earlier transmission between family members or household contacts, similar to the relationship between force of infection and age of first infection. The proportion of transmission events occurring between family members increased following the NPI rollout (p = 3 × 10–5; Supplemental Table 7), although there was no change in the proportion of transmission events taking place within households (p > 0.05). However, neither familial relationship nor household contact had a significant effect on serial intervals (p > 0.05 in both instances; Supplemental Figs S2-S3).

Finally, it is possible that NPIs shifted the demographics of the infectees and/or infectors in such a way as to favor shorter serial intervals, e.g. with earlier transmission from infectors or earlier symptom onset in infectees. Potential confounders include age and sex, both of which can influence the severity and/or infectiousness of COVID-19. However, neither age nor sex differed significantly between the periods preceding and following the introduction of NPIs (p > 0.05 in both instances; Supplemental Figs S4-S5), although age and sex were associated with one another (p = 0.01; Supplemental Fig S6) and both were associated with position in a case pair (primary vs. secondary) (effect of case pair position on sex ratio, p = 4 × 10-10; effect on age, p = 0.01; Supplemental Figs S5, S7). In addition, there was no significant association between serial intervals and age or sex of either primary or secondary cases (p > 0.05 in all cases; Supplemental Fig S8).

Discussion

Following the implementation of nonpharmaceutical interventions in China, the transmission of SARS-CoV-2 changed in two key ways. Compared to the period preceding the rollout of nonpharmaceutical interventions, the post-NPI period was characterized by a significant reduction in the reproduction number \(\left( {R_{t} } \right)\), indicating decreased transmission, and a decrease in the length of generation intervals, reflecting earlier transmission. Specifically, we found that overall transmission of SARS-CoV-2 declined by 59.4% after the NPI rollout, while presymptomatic transmission increased from 34% to 71% of all transmission events. The shift toward earlier transmission implies a disproportionate reduction in transmission following symptom onset; we estimate that presymptomatic transmission decreased by roughly 16% after the rollout of NPIs, whereas transmission post-symptom onset decreased by approximately 82%.

Several factors might contribute to the observed shift toward earlier transmission: a reduction in the proportion of case pairs infected in different cities, eliminating a potential delay in transmission; increased time at home, resulting in more frequent exposure to infectious household contacts or family members; shifts in demographics of cases that alter the timing of transmission and/or symptom onset; and interventions that disproportionately reduce transmission in the later stages of infection. Our results do not support the first three possibilities: whether a linked pair of cases were infected in the same city was not a significant predictor of the serial interval, nor was the existence of a shared household or familial relationship, nor the age or sex of either the primary or secondary case. The final possibility is indirectly supported by observed correlations between isolation delays and serial intervals, which suggest that case isolation is capable of shifting the serial interval distribution [25, 26, 43].

Our findings therefore suggest that “post-onset” interventions, such as case isolation, may have been responsible for the dramatic reduction in transmisson post-symptom onset which followed the implementation of NPIs in China. If so, given that post-onset transmission comprised an estimated two-thirds of total transmission at baseline, this would suggest that post-onset interventions, especially rapid isolation of COVID-19 cases, were among the largest contributors to rapid control of SARS-CoV-2 in China. In contrast, the comparatively modest reduction in presymptomatic transmission, combined with the fact that approximately a third of transmission was presymptomatic before NPIs were introduced, suggest that reductions in presymptomatic transmission, e.g. from quarantine of asymptomatic exposed contacts, likely had a smaller impact on overall transmission. However, it does not follow that control could be achieved through post-onset interventions alone, since the reductions in transmission may not be sufficient to bring the effective reproduction number \((R_{t} )\) below the threshold for control [25]. Interventions that specifically limit presymptomatic transmission are likely to provide modest additional benefits, while the impact of interventions that reduce transmission at all stages of the infection, such as social distancing, may be appreciable but could not be quantified in this analysis, although this question has been explored elsewhere [9, 11, 14, 16,17,18,19].

A key limitation of this work is the lack of information regarding transmission from asymptomatic infections (those that never develop symptoms) which are believed to comprise 40–45% of all SARS-CoV-2 infections [44]. Because the serial interval is defined by symptom onset dates, alternative methods would be required to examine the impact of NPIs on asymptomatic transmission. For instance, data on exposure windows could be used to estimate the time of infection for primary cases, similar to the approach used for estimating incubation periods [30, 33, 34, 45, 46]; in conjunction with exposure or symptom onset dates for secondary cases, an approach similar to the one employed here could then be used to infer generation intervals for transmission pairs with asymptomatic primary cases.

Another limitation is the fact that case pairs, especially those found through passive surveillance, may represent a non-random sample of all infections. Even when only symptomatic cases are considered, those with milder symptoms may be more likely to go unreported, and it has been suggested that severe disease is associated with shorter incubation periods [47], although this effect may be the result of confounding by other factors known to influence incubation period, such as age [48]. Many estimates of the serial interval and generation interval of SARS-CoV-2 could be affected by the biases of passive surveillance, and this work is no exception; however, the fact that age distributions and sex ratios were consistent before and after the rollout of NPIs suggests that case detection biases are unlikely to be responsible for the shift in transmission timing following the introduction of control measures.

In summary, we find that the implementation of nonpharmaceutical interventions in China was followed not only by a rapid decrease in the rate of SARS-CoV-2 transmission, but a significant shift in the timing of transmission, with more transmission occurring in the presymptomatic (incubation) period. The leading hypothesis to explain these observations is that interventions, particularly case isolation, were highly effective in limiting transmission in the later stages of infection, while other measures, such as quarantine of exposed contacts, had a more limited impact on transmission in the earlier stages. These findings suggest that rapid case detection and isolation, if rigorously implemented, may be a highly effective strategy for interrupting transmission of SARS-CoV-2.

Methods

Data sources

We obtained data on serial intervals for infector-infectee pairs reported in China in January and February 2020. We combined data that were previously collected and published by the following sources: Xu et al. with 679 case pairs, compiled from provincial and urban health commission reports [27]; Du et al. with 468 case pairs, compiled from provincial health agency reports [28]; He et al. with 41 case pairs, compiled from government and media reports [29]; and Zhang et al. with 35 case pairs, compiled from health agency and media reports [30]. We cross-checked all datasets and eliminated suspected duplicate case pairs, which we identified as those with matching sex, age, and symptom onset date for both cases. This left a total of 873 unique case pairs for use in parameter estimation.

Among the compiled data, age, sex, and date of symptom onset were reported for all cases (asymptomatic cases, meaning those which remained asymptomatic throughout the entire infection, were not included in any of the datasets). Infection locations (the cities and provinces in which each case was presumed to be infected) were reported for 782 case pairs; the relationship between infector and infectee was reported for 660 case pairs, and the type of contact (household vs. non-household) was reported for 121 case pairs.

Among all case pairs, a plurality of primary cases (45.5%) were infected in Hubei province, which is home to the city of Wuhan; however, 84.2% of secondary cases were infected in other provinces (Supplemental Table S1), indicating that the majority of transmission events took place outside Hubei. We therefore assume that the transmission events captured in the case pair data are most representative of the dynamics outside Hubei province.

Analysis of serial intervals

Since dates of transmission were unknown, we dated each case pair using the date of symptom onset in the primary case. We used simple linear regression (serial interval versus date) to test the hypothesis that serial intervals declined over time. We then divided serial intervals into two time periods based on the date of symptom onset in the primary case of each pair. Case pairs were designated “pre-NPI” if the primary case developed symptoms before the rollout of nonpharmaceutical interventions beginning on Jan 23 (i.e. symptom onset occurred on or before Jan 22) and “post-NPI” otherwise. We used a two-sided Student’s t-test to test the hypothesis that the mean serial intervals differed between the pre- and post-NPI periods.

Generation interval distributions

Consider a linked pair of cases, with the secondary case arising by transmission from the primary case. The time between symptom onset in the primary case and symptom onset in the secondary case is termed the serial interval. A related term, the generation interval describes the time from infection of the primary case to infection of the secondary case. The time between infection and symptom onset is called the incubation period. In what follows, we denote the serial interval by \(\delta\), the generation interval by \(\tau\), and the incubation periods of the primary and secondary cases by \(\theta_{1}\) and \(\theta_{2}\), respectively. The relationship between these quantities is given by \(\tau = \delta + \theta_{1} - \theta_{2}\) (Fig. 5).

Fig. 5
figure 5

Diagram showing incubation period \(\left( \theta \right)\), serial interval \(\left( \delta \right)\), and generation interval \(\left( \tau \right)\) for a linked pair of cases. Closed circle, time of infection; open circle, time of symptom onset; dot-dash arrow, transmission from primary case to secondary case

We assumed that the generation interval \(\tau\) followed a gamma distribution \(f_{\tau }\) with unknown parameters \(\alpha\) and \(\beta\). Due to uncertainty regarding the relationship between the incubation period and the generation interval, we used two different models for the generation interval distribution. In the incubation-independent model, we assumed that a single generation interval distribution, with shape parameter \(\alpha\) and rate parameter \(\beta\), applied to all individuals, regardless of incubation period. In the “incubation-dependent” model, we assumed that individuals with longer incubation periods would tend to have longer generation intervals; specifically, for a primary case with incubation period \(\theta_{1}\), we assumed that the generation interval followed a gamma distribution with shape parameter \(\alpha\) and rate parameter \(\beta /\theta_{1}\). This is equivalent to defining a new random variable \(X = \tau /\theta_{1}\) where \(X\) follows a gamma distribution with shape \(\alpha\) and rate \(\beta\). This formulation causes the generation interval distribution to be horizontally “stretched out” in proportion to \(\theta_{1}\) – for instance, it results in the expected generation interval being a fixed multiple of \(\theta_{1}\), rather than a fixed length of time.

Parameter estimation

We used a Markov chain Monte Carlo (MCMC) algorithm to estimate the parameters of the pre- and post-NPI generation interval distributions by fitting to serial interval data from each time period. Because an observed serial interval depends on the incubation period of each infection as well as the generation interval, we also estimated posteriors for the unobserved incubation periods \(\theta_{1}\) and \(\theta_{2}\) for each pair of cases (data augmentation). The model assumed that the incubation periods were drawn from a prior \(f_{\theta }\); we used three different priors drawn from the literature (see below). The generation interval was assumed to be drawn from a gamma distribution \(f_{\tau }\) with unknown parameters \(\alpha\) and \(\beta\), with minimally informative priors \(f_{\alpha }\) and \(f_{\beta }\), respectively (Supplemental Table S2). We can express the joint posterior density of the unknown parameters as follows:

$$\begin{gathered} f_{{\alpha ,\beta |\overline{\delta }}} \left( {\alpha ,\beta {|}\overline{\delta }} \right) \propto \mathop \prod \limits_{i = 1}^{N} f_{\delta |\alpha ,\beta } \left( {\delta_{i} {|}\alpha ,\beta } \right)f_{\alpha } \left( \alpha \right)f_{\beta } \left( \beta \right) \hfill \\ = \mathop \prod \limits_{i = 1}^{N} \left( {\mathop \smallint \limits_{0}^{\infty } \mathop \smallint \limits_{0}^{\infty } f_{\delta |\alpha ,\beta } \left( {\delta_{i} {|}\alpha ,\beta ,\theta_{1} ,\theta_{2} } \right)f_{\alpha } \left( \alpha \right)f_{\beta } \left( \beta \right)f_{\theta } \left( {\theta_{1} } \right)f_{\theta } \left( {\theta_{2} } \right)d\theta_{1} d\theta_{2} } \right) \hfill \\ = \mathop \prod \limits_{i = 1}^{N} \left( {\mathop \smallint \limits_{0}^{\infty } \mathop \smallint \limits_{0}^{\infty } f_{\tau |\alpha ,\beta } \left( {\delta_{i} + \theta_{1} - \theta_{2} {|}\alpha ,\beta } \right)f_{\alpha } \left( \alpha \right)f_{\beta } \left( \beta \right)f_{\theta } \left( {\theta_{1} } \right)f_{\theta } \left( {\theta_{2} } \right)d\theta_{1} d\theta_{2} } \right) \hfill \\ \end{gathered}$$

We estimated the unknown quantities using a Metropolis–Hastings algorithm, with each iteration taking place in two parts. The parameters \(\alpha\) and \(\beta\) were updated first, with proposed values \(\alpha ^{\prime}\) and \(\beta ^{\prime}\) being accepted with probability \({\text{min}}\left( {1,\frac{{\mathop \prod \nolimits_{i = 1}^{N} f_{\tau |\alpha ,\beta } (\delta_{i} + \theta_{1\left( i \right)} - \theta_{2\left( i \right)} |\alpha^{\prime},\beta^{\prime})f_{\alpha } \left( {\alpha^{\prime}} \right)f_{\beta } \left( {\beta^{\prime}} \right)}}{{\mathop \prod \nolimits_{i = 1}^{N} f_{\tau |\alpha ,\beta } (\delta_{i} + \theta_{1\left( i \right)} - \theta_{2\left( i \right)} |\alpha ,\beta )f_{\alpha } \left( \alpha \right)f_{\beta } \left( \beta \right)}}} \right)\). The incubation periods \(\theta_{1}\) and \(\theta_{2}\) for each case pair were then updated, with proposed values \(\theta_{1\left( i \right)} ^{\prime}\) and \(\theta_{2\left( i \right)} ^{\prime}\) being accepted with probability \({\text{min}}\left( {1,\frac{{f_{\tau |\alpha ,\beta } (\delta_{i} + \theta_{1\left( i \right)} ^{\prime} - \theta_{2\left( i \right)} ^{\prime}|\alpha ,\beta )f_{\theta } \left( {\theta_{1\left( i \right)} ^{\prime}} \right)f_{\theta } \left( {\theta_{2\left( i \right)} ^{\prime}} \right)}}{{f_{\tau |\alpha ,\beta } (\delta_{i} + \theta_{1\left( i \right)} - \theta_{2\left( i \right)} |\alpha ,\beta )f_{\theta } \left( {\theta_{1\left( i \right)} } \right)f_{\theta } \left( {\theta_{2\left( i \right)} } \right)}}} \right)\).

We ran the algorithm for 250,000 iterations, discarded the first 50,000 and thinned the remainder by keeping every 10th iteration. The resulting set of 20,000 observations was used to approximate the joint posterior distribution of the parameters of interest. The aggregated posterior distributions of the incubation periods \(\theta_{1}\) and \(\theta_{2}\) for each model are illustrated in Supplemental Figs S9-S12, while posterior means and 95% credible intervals for the generation interval parameters are reported in Supplemental Tables S3-S4. We report the mean and standard deviation of the generation interval distribution alongside the parameters \(\alpha\) and \(\beta\), since these latter quantities are difficult to interpret. For the incubation-independent model, the mean and variance are simply \(\alpha /\beta\) and \(\alpha /\beta^{2}\), respectively. For the incubation-dependent model, the mean generation interval is given by \({\text{E}}\left[ X \right]{\text{E}}\left[ \theta \right]\), where \(X = \tau /\theta\) follows a gamma distribution with shape \(\alpha\) and rate \(\beta\). The variance is given by \(\left( {{\text{Var}}\left[ X \right] + {\text{E}}\left[ X \right]^{2} } \right)\left( {{\text{Var}}\left[ \theta \right] + {\text{E}}\left[ \theta \right]^{2} } \right) - \left( {{\text{E}}\left[ X \right]^{2} } \right)\left( {{\text{E}}\left[ \theta \right]^{2} } \right)\). We calculated the mean and variance of the generation interval for each iteration in the converged and thinned Markov chain in order to approximate posterior distributions, which we used to obtain posterior means and 95% credible intervals.

Incubation period distributions

Since the incubation period distribution affects the inferred generation interval distribution, we replicated our analysis using three different priors (\(f_{\theta }\)) for the incubation period. The distributions and their sources are as follows: a lognormal distribution with mean 5.52 days and standard deviation 2.41 days, based on Lauer et al. [33]; a lognormal distribution with mean 5.21 days and standard deviation 2.59 days, based on Zhang et al. [30]; and a Weibull distribution with mean 6.49 days and standard deviation 2.35 days, based on Backer et al. [34]. The parameters for these distributions can be found in Supplemental Table S2.

Model fit

We assessed model fit using a modified deviance information criterion (DIC) for data-augmented models (Supplementary Information) [35]. Broadly speaking, DIC is a generalization of the Akaike information criterion (AIC); it penalizes model complexity as well as poor model fit (low likelihood of observing the data under the specified model). As with AIC, the “best” model is the one with the lowest DIC value.

Presymptomatic transmission

We next used the joint posterior distribution of the generation interval parameters to approximate the distribution of the percentage of transmission expected to take place prior to symptom onset, which we denote \(\varphi\). For the incubation-independent model, given \(\alpha\) and \(\beta\), the proportion of transmission expected to occur before symptom onset is given by \(\varphi = \mathop \smallint \limits_{0}^{\infty } \mathop \smallint \limits_{0}^{u} f_{\theta } \left( u \right)f_{\tau } \left( v \right)dv du\), where \(f_{\tau }\) is a gamma distribution with shape \(\alpha\) and rate \(\beta\). For the incubation-dependent model, the expected proportion of transmission occurring before symptom onset is given by \(\varphi = \mathop \smallint \limits_{0}^{\theta } f_{\tau } \left( u \right)du\), where \(f_{\tau }\) is a gamma distribution with shape \(\alpha\) and rate \(\beta /\theta\). We calculated \(\varphi\) for each iteration in the converged and thinned Markov chain in order to approximate the posterior distribution of \(\varphi\), which we use to obtain posterior means and 95% credible intervals.

Method to estimate reproduction numbers \(\left( {R_{t} } \right)\) with time-varying serial intervals

We introduce a modified version of the Wallinga-Teunis method of estimating the reproduction number at time \(t\) (denoted \(R_{t}\)). In the basic method described by Wallinga and Teunis (2004), the probability that case \(i\) was infected by case \(j\) is given by

$$p_{ij} = \frac{{w\left( {t_{i} - t_{j} } \right)}}{{\mathop \sum \nolimits_{k \ne i} w\left( {t_{i} - t_{k} } \right)}}$$

where \(t_{i}\) is the time at which case \(i\) developed symptoms, and \(w\) the serial interval distribution. The reproduction number for case j is therefore given by

$$R_{j} = \mathop \sum \limits_{i \ne j} p_{ij}$$

and the reproduction number at time \(t\) \(\left( {R_{t} } \right)\) is equal to the reproduction number for any case with symptom onset at time \(t\).

We now present a modification of this approach, which allows for time-varying serial interval distributions. Rather than a fixed serial interval distribution \(w\), let \(w_{t}\) be the distribution of serial intervals for primary cases with symptom onset in the time window \(\left[ {t - z,t + z} \right]\). Then the probability case \(i\) was infected by case j is given by

$$p_{ij} = \frac{{w_{{t_{j} }} \left( {t_{i} - t_{j} } \right)}}{{\mathop \sum \nolimits_{k \ne i} w_{{t_{k} }} \left( {t_{i} - t_{k} } \right)}}$$

and the reproduction number for case \(j\) is similarly given by \(R_{j} = \sum\nolimits_{i \ne j} {p_{ij} }\), which we can rewrite as follows:

$$R_{j} = \mathop \sum \limits_{{x = t_{{{\text{min}}}} }}^{{t_{{{\text{max}}}} }} \frac{{\left( {n_{x} - \delta_{{xt_{j} }} } \right)w_{{t_{j} }} \left( {x - t_{j} } \right)}}{{\mathop \sum \nolimits_{{y = t_{{{\text{min}}}} }}^{{t_{{{\text{max}}}} }} \left( {n_{y} - \delta_{xy} } \right)w_{y} \left( {x - y} \right)}}$$

where \(t_{{{\text{min}}}}\) and \(t_{{{\text{max}}}}\) are the first and last dates of symptom onset, \(n_{t}\) is the number of cases with symptom onset at time \(t\), and \(\delta_{ij}\) is the Kronecker delta function, with

$$\delta_{ij} = \left\{ {\begin{array}{*{20}c} {1 {\text{ if }}i = j} \\ {0 {\text{ if }}i \ne j} \\ \end{array} } \right.$$

such that case \(j\) is subtracted from the set of potential infectees with symptom onset at time \(t_{j}\). With \(R_{t}\) assumed to be equal to the reproduction number for any case with symptom onset at time \(t\), we can modify the above to get the final expression for \(R_{t}\):

$$R_{t} = \mathop \sum \limits_{{x = t_{{{\text{min}}}} }}^{{t_{{{\text{max}}}} }} \frac{{\left( {n_{x} - \delta_{xt} } \right)w_{t} \left( {x - t} \right)}}{{\mathop \sum \nolimits_{{y = t_{{{\text{min}}}} }}^{{t_{{{\text{max}}}} }} \left( {n_{y} - \delta_{xy} } \right)w_{y} \left( {x - y} \right)}}$$

Estimation of time-varying serial interval distributions

We used the serial interval data described above to estimate the time-varying serial interval distributions; we assumed serial intervals for primary cases with symptom onset at time \(t\) followed a normal distribution with mean and standard deviation calculated using serial intervals for primary cases with symptom onset in the window \(\left[ {t - z,t + z} \right]\) with \(z = 3\) (corresponding to a 7-day moving window). For time windows extending beyond the range of primary case symptom onset dates, the nearest 7-day window falling within this range was used to estimate the serial interval distribution.

Estimation of reproduction numbers \(\left( {R_{t} } \right)\) using case incidence data

The incidence data based on reported case pairs are less than ideal for estimation of \(R_{t}\) due to small daily case numbers (median of 9 cases per day) as well as potential sampling biases. We therefore compared case pair incidence data to the incidence data for all of China minus Hubei province. Incidence curves for the two datasets were similar, although the non-Hubei curve was shifted to the right (Supplemental Fig S13), presumably due to the delay between symptom onset and case reporting (nationwide data did not include symptom onset dates). After shifting the non-Hubei incidence data back by 7 days to align the peaks of the two curves, we find good agreement between the case pair data and non-Hubei data (Supplemental Fig S14).

We estimated \(R_{t}\) using both the case pair incidence data and the time-shifted non-Hubei incidence data, and find good agreement between the two sets of estimates (Supplemental Figure S15). However, because the non-Hubei data feature larger case numbers, and presumably smaller sampling error, we use the \(R_{t}\) estimates based on the these data for subsequent analysis.

Estimating mean reproduction numbers before and after rollout of nonpharmaceutical interventions

We used \(R_{t}\) estimates from before and after the rollout of nonpharmaceutical interventions (before Jan. 23 and on/after Jan. 23, respectively) to estimate the mean reproduction numbers for the pre- and post-NPI periods. In order to reduce the effects of sampling error, we only used \(R_{t}\) estimates for days with at least 10 cases. In addition, because the Wallinga-Teunis method uses future case incidence to estimate the reproduction number, it will tend to underestimate \(R_{t}\) toward the end of a time series due to right-truncation. We therefore calculated the probability of a secondary case developing symptoms by \(t_{{{\text{max}}}}\) if the primary case developed symptoms at time \(t\), given the serial interval distribution at time \(t\) (Supplemental Figure S16). We discarded \(R_{t}\) estimates beyond the time point at which this probability dropped below 90%.

Estimating reductions in transmission before and after symptom onset

Let \(R_{{{\text{pre}}}}\) and \(R_{{{\text{post}}}}\) denote the mean reproduction numbers for the pre-NPI and post-NPI periods, respectively. Similarly, let \(\varphi_{{{\text{pre}}}}\) and \(\varphi_{{{\text{post}}}}\) denote the relative frequency of presymptomatic transmission for each time period. Then the change in the absolute frequency of presymptomatic transmission (from pre-NPI to post-NPI) is given by

$$r_{{{\text{presym}}}} = \frac{{\varphi_{{{\text{post}}}} R_{{{\text{post}}}} - \varphi_{{{\text{pre}}}} R_{{{\text{pre}}}} }}{{\varphi_{{{\text{pre}}}} R_{{{\text{pre}}}} }}$$

and the change in the absolute frequency of transmission post-symptom onset is given by

$$r_{{{\text{postsym}}}} = \frac{{(1 - \varphi_{{{\text{post}}}} )R_{{{\text{post}}}} - (1 - \varphi_{{{\text{pre}}}} )R_{{{\text{pre}}}} }}{{(1 - \varphi_{{{\text{pre}}}} )R_{{{\text{pre}}}} }}$$

Posterior distributions for the changes in presymptomatic and (post)symptomatic transmission were obtained using the posterior distributions for the relative frequency of presymptomatic transmission for each time period.

Alternative hypotheses to explain shift in serial intervals

We considered three alternative hypotheses that might explain the shift toward shorter serial intervals following the introduction of NPIs. We first considered that lockdowns in Hubei province may have reduced the proportion of case pairs in which the primary and secondary cases were infected in different cities; such case pairs might be expected to have longer serial intervals.We classified case pairs as having “matched” or “mismatched” infection locations and used a two-sided chi-squared test to determine whether the frequency of mismatched case pairs changed following NPI rollout. We then used a two-way analysis of variance (ANOVA) to examine the effects of infection location matching and time period (pre-NPI vs. post-NPI) on serial intervals. With this and all other linear models, we started with the maximal model (main effects and all interaction terms) followed by stepwise model simplification, eliminating terms with deletion p-values greater than 0.05, starting with the highest-order interactions. Finally, we confirmed that the residual sum of squares did not differ significantly between the maximal model and the minimal adequate model.

We next considered that the introduction of NPIs may have coincided with an increase in the proportion of transmission events between relatives or household members, which might have shorter serial intervals. We used data on contact type, described as either “household” or “non-household,” as well as relationships between infectors and infectees, which we classified as either “family” or “non-family.” As described above, we used two-sided chi-squared tests to determine whether the frequencies of contact type or relationship type changed following introduction of NPIs, and two-way ANOVAs to explore the effects of these factors (in conjunction with time period) on serial intervals.

Finally, we considered the hypothesis that the demographics (age, sex, or both) of infectors and/or infectees may have shifted in such a way as to favor shorter serial intervals. We first explored whether age and/or sex of infectees or infectors changed following the introduction of NPIs. We used a multi-way ANOVA to characterize the associations between age and three explanatory variables: sex, position in the case pair (primary or secondary), and time period. We also used a generalized linear model with binomial errors to characterize the effects of age, position in case pair, and time period on sex ratio. Finally, we used analysis of covariance (ANCOVA) to explore the associations between serial interval and five explanatory variables: age and sex of primary case, age and sex of secondary case, and time period. For all of these analyses, we followed the process of starting with the maximal model and eliminating nonsignificant terms until arriving at the minimal adequate model.

Software

All of the analysis for this study was conducted in R (version 3.6.1).