Introduction

A hallmark of influenza A pandemics is their unpredictability, not only with respect to the timing of their occurrence but also with respect to their size, duration, and severity. With the benefit of hindsight it is now clear that the 2009 pandemic influenza A/H1N1 has been relatively mild, both in terms of the fraction of the population that has developed influenza-like illness, and the overall severity of the disease [1, 2, 3]. Nevertheless, demand for high care hospital beds has been high compared to the available number of beds [4, 5].

Excess high care hospital capacity is limited [6]. For instance, in the Netherlands intensive care capacity is approximately 10 beds per 100,000 persons, of which less than 2 per 100,000 may be available to meet sudden increases in demand [7]. It is therefore vital that trends of rapidly increasing incidence and health care demand (especially hospitalizations requiring intensive care) are noticed early so that there may be time to increase operational capacity by strict triaging and by postponement of non-critical operations [8].

Real-time tracking of hospital admissions during epidemics is difficult because of the inherent delay in reporting of cases or hospital admissions. Reasons for such delays include the time to complete diagnostic tests, logistics, and overwhelmed surveillance systems. A number of studies have addressed the problem of reporting delay, and recently the term ‘nowcasting’ has been coined for attempts to assess the current situation based on imperfect information [9, 10, 11].

Here we propose an algorithm to correct for delays in reporting, and to infer the number of admissions from incoming reports. We have applied this nowcasting method to track the 2009 influenza A/H1N1 hospitalizations in the Netherlands, using a complete set of dates of hospital admissions and associated reporting delays. Now that the pandemic has passed, we can assess retrospectively the precision of our nowcasting estimates of the number of hospital admissions during the pandemic.

Methods

Surveillance system

From April 25, 2009, both general practitioners and hospitals were required to notify the municipal health services of patients with influenza-like symptoms. Laboratory tests were performed at the National Influenza Centre (represented by the National Institute for Public Health and the Environment, RIVM, and the Erasmus Medical Centre), using RT-PCR. Anonymized data about confirmed cases, including date of admission and travel history, were entered into a web-based database by the municipal health services and collected at the RIVM. On August 3 it was announced that the novel influenza A/H1N1 would no longer be a notifiable disease, which stopped the registration of cases. From that day on, only hospitalized cases that fulfilled the case definition [12] were reported by the hospitals to the municipal health services.

Reporting probability

We reconstructed the number of hospitalized patients on each day of the epidemic. The reporting delay was measured as the difference between the date of admission of a patient and the date of reporting. We used the reporting delays of all cases from June 5 onwards to estimate the distribution of reporting delays. The cumulative frequency distribution of reporting delays gives the probability \(\rho_i\) that a case has been reported within \(i\) days after the day of admission. The 95% reporting horizon is the shortest delay at which this cumulative distribution surpasses 0.95.
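The construction of the cumulative delay distribution and the reporting horizon can be illustrated with a minimal Python sketch. The function names and the example delays are hypothetical and are not taken from the surveillance data; delays are assumed to be non-negative whole days.

```python
from collections import Counter


def cumulative_delay_distribution(delays):
    """Return a list rho where rho[i] is the fraction of cases
    reported within i days of admission (empirical cumulative distribution)."""
    if not delays:
        raise ValueError("need at least one observed delay")
    counts = Counter(delays)
    total = len(delays)
    rho, running = [], 0
    for d in range(max(delays) + 1):
        running += counts.get(d, 0)
        rho.append(running / total)
    return rho


def reporting_horizon(rho, level=0.95):
    """Return the shortest delay at which the cumulative distribution exceeds `level`."""
    for d, p in enumerate(rho):
        if p > level:
            return d
    return len(rho) - 1  # level only reached at the longest observed delay


# Hypothetical admission-to-report delays in days (illustration only).
example_delays = [0, 1, 2, 2, 3, 3, 3, 4, 5, 10]
rho = cumulative_delay_distribution(example_delays)
horizon = reporting_horizon(rho)
```

Because the distribution is built only from cases that have already been reported, very long delays are under-represented while the epidemic is still ongoing.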

Estimation of the actual number of cases

We set the current day as day 0. Our goal is to estimate the number of admissions \(i\) days ago, \(N_i\). We denote the number of admissions on that day that have been reported up to the current day by \(C_i\), and the probability that an admission on that day has been reported up to and including the current day, that is, within \(i\) days of admission, by \(\rho_i\). We are thus looking for the actual number of admissions \(i\) days ago, \(N_i\), given the number of reported cases for that day so far, \(C_i\).

We note that the number of observed cases is, in expectation, the product of the reporting probability \(\rho_i\) and the actual number of cases \(N_i\): \(C_i = \rho_i N_i\). Rearranging gives an estimator for the actual number of cases

$$ \hat{N}_i = {\frac{C_i}{\rho_i}}. $$
(1)
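A minimal Python sketch of estimator (1) is given below. The function name `nowcast` and the example numbers are hypothetical; the sketch assumes that admissions older than the longest observed delay are fully reported, and it returns no estimate when the reporting probability is zero.

```python
def nowcast(reported, rho):
    """Estimate the actual number of admissions for each of the last days.

    reported[i] is C_i, the number of admissions i days ago reported so far;
    rho[i] is the probability of having been reported within i days.
    Returns a list with C_i / rho_i, or None where rho_i is zero.
    """
    estimates = []
    for i, c in enumerate(reported):
        # Assumption: beyond the longest observed delay, reporting is complete.
        p = rho[i] if i < len(rho) else 1.0
        estimates.append(c / p if p > 0 else None)
    return estimates


# Hypothetical counts and reporting probabilities (illustration only).
estimates = nowcast([2, 6, 9, 7], [0.1, 0.5, 0.9])
```

For small \(i\) the reporting probability is small, so that a single additional report changes the estimate considerably; this is why the most recent days carry the widest uncertainty.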

Maximum likelihood estimator and 95% confidence interval

In order to construct a likelihood function for the actual number of cases we assume that the number of observed cases follows a binomial distribution that is defined by a number of \(N_i\) independent trials where the probability of success is \(\rho_i\). The probability of observing \(C_i\) cases is

$$ P(C_i|N_i,\rho_i) = {N_i \choose C_i}\rho_i^{C_i} (1-\rho_i)^{N_i-C_i} $$
(2)

The corresponding likelihood function for \(N_i\) given the number of observed cases \(C_i\) and probability of reporting \(\rho_i\) is, up to a constant, given by

$$ L(N_i;C_i,\rho_i) = {N_i \choose C_i} (1-\rho_i)^{N_i-C_i} $$
(3)

The value of the actual number of cases that maximizes the likelihood is

$$ \hat{N}_i = {\frac{C_i}{\rho_i}}. $$
(4)

This confirms that the straightforward estimator derived earlier is a maximum likelihood estimator.
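As a brief check of this result (a sketch that treats \(N_i\) as an integer, which is an assumption not stated explicitly above), one may compare the likelihood (3) at consecutive values of \(N_i\):

$$ \frac{L(N_i+1;C_i,\rho_i)}{L(N_i;C_i,\rho_i)} = \frac{(N_i+1)(1-\rho_i)}{N_i+1-C_i} \ge 1 \iff N_i + 1 \le \frac{C_i}{\rho_i}. $$

The likelihood therefore increases as long as \(N_i + 1 \le C_i/\rho_i\) and decreases afterwards. Over the integers it peaks at \(\lfloor C_i/\rho_i \rfloor\), and the expression in (4) is the corresponding value without rounding.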

We construct a confidence interval by the profile likelihood method. We accept values of \(N_i\) for which the likelihood ratio \(\lambda = {\frac{L(N_i)}{L(\hat{N}_i)}}\) lies in the acceptance region specified by the likelihood ratio test with α = 0.05, that is, \(-2 \log \lambda \le \chi^2_{1:0.05}\). Since \(\chi^2_{1:0.05} \approx 3.84\), this is equivalent to \(\lambda \ge e^{-1.92} \approx 1/6.8\). The confidence interval thus includes all values \(N_i\) that have a likelihood of at least 1/6.8 of the maximum likelihood.
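The likelihood-based interval can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used for the analysis: the function names are hypothetical, \(N_i\) is restricted to integers, the critical value 3.84 is hard-coded, and the search simply walks outward from the maximum, which works because the likelihood has a single peak.

```python
import math


def log_likelihood(n, c, rho):
    """Logarithm of Eq. (3): log C(n, c) + (n - c) * log(1 - rho)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(c + 1) - math.lgamma(n - c + 1)
    return log_binom + (n - c) * math.log(1 - rho)


def likelihood_interval(c, rho, chi2_crit=3.84):
    """Return (n_hat, lower, upper) for the actual number of cases,
    given c reported cases and reporting probability rho."""
    if not 0 < rho <= 1:
        raise ValueError("rho must lie in (0, 1]")
    if rho == 1.0:
        return c, c, c  # everything has been reported
    n_hat = max(c, math.floor(c / rho))  # integer maximizer of Eq. (3)
    ll_max = log_likelihood(n_hat, c, rho)

    def supported(n):
        # Accept n if -2 log(lambda) does not exceed the critical value.
        return -2 * (log_likelihood(n, c, rho) - ll_max) <= chi2_crit

    lower = n_hat
    while lower - 1 >= c and supported(lower - 1):
        lower -= 1
    upper = n_hat
    while supported(upper + 1):
        upper += 1
    return n_hat, lower, upper


# Hypothetical example: 5 reported cases, reporting probability 0.5.
n_hat, lower, upper = likelihood_interval(5, 0.5)
```

The lower bound can never fall below the number of cases already reported, which makes the interval asymmetric around the estimate when the reporting probability is small.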

Results

Figure 1 summarizes the information on the daily number of hospitalizations due to pandemic influenza A/H1N1 in the Netherlands. After a period from July up to early October during which approximately 15 hospitalizations were recorded per week, the number of hospitalizations started to increase steeply in the second week of October (Fig. 1a). The frequency distribution of the reporting delay and the associated cumulative delay distribution are shown in Fig. 1c, d, respectively. The delay distribution is sharply peaked around 3 days, but also has a long tail that extends to more than 25 days. Of all hospitalizations, 95% are reported within 14 days; hence we call this the 95% reporting horizon. The mean reporting delay was variable in the early stages of the epidemic (2–7 days) due to the small number of hospitalizations (Fig. 1b). Later on the delay first stabilized at 6–7 days in the period from early September until mid October, and then slowly decreased to 5 days over the period from mid October to December (Fig. 1b). Hence, the daily number of reported admissions provides a poor estimate of the actual number of admissions.

Fig. 1

Hospitalizations of confirmed pandemic influenza A/H1N1 cases during the 2009 pandemic. a Daily numbers of patients admitted between July 13 and December 30. The peak of the hospitalizations is on November 12. b The mean reporting delay over all cases up to the indicated date of admission. c The frequency distribution of the admission-to-reporting delay. d The normalized cumulative delay distribution. The dotted line indicates the threshold level of 0.95 for the reporting horizon, the dashed line indicates that after 14 days more than 95% of the hospitalizations have been reported

With the number of reported cases and the cumulative delay distribution at hand, it is possible to estimate the number of hospitalizations that are still to be reported. Figure 2 shows for two specific dates the number of hospitalizations that were recorded up to that day (top panels), the cumulative delay distribution up to that day (middle panels), the expected number of hospitalizations (bottom panels, red lines), and the number of hospitalizations that were ultimately recorded (bottom panels, black lines). The bottom panels of Fig. 2 also show the likelihood support for the estimates as confidence bounds (red shading). Both the increasing and decreasing trends in the early and late stages of the epidemic are well captured. Moreover, our method is even able to estimate the actual number of cases with fair precision (Fig. 3).

Fig. 2

Correcting for the reporting delay on October 28 (left) and December 2 (right). Top panels: the reported number of patients admitted to hospital with confirmed influenza A/H1N1 at each of the dates. Middle panels: the probability of having been reported, as a function of admission date. The dotted line shows the 95% threshold, used for the reporting horizon (dashed line). Bottom panels: the estimated number of cases (red line), including 1/6.8 likelihood support (red shaded area) and 95% reporting horizon (yellow-black dashed line). The black line denotes the final number of cases, reported until December 30. The initial number of cases shows a decline on both dates, but the compensation shows that the number of hospitalized patients is still increasing on the first date. (Color figure online)

Fig. 3

Accuracy of the estimates. a Distribution of the difference between the estimated and actual number of admitted patients as a function of the time between the admission and observation, measured as a moving window over the entire epidemic. The solid black line shows the median, the shaded areas show the total range, 95% of the data (between the 2.5th and 97.5th percentiles), and 75% of the data (between the 12.5th and 87.5th percentiles). b The percentage of estimations where the actual number of cases was below the lower or above the upper confidence bound

The hospital admissions were evenly distributed over the days of the week (∼300 in total for each day of the week), while reports were filed mostly on working days (∼400 in total for each working day) and hardly ever (∼0) in weekends (Fig. 4a). The delay between admission and reporting was longest for admissions on Thursday and Friday (∼6 days), and shorter for the other days of the week (4–5 days) (Fig. 4b). However, this difference in delays between weekdays did not greatly affect our nowcasting estimates (Fig. 5). Overall, our nowcasting estimates were most precise on Wednesdays, with underestimation at the beginning of the week and slight overestimation at the end of the week.

Fig. 4

Differences between weekdays. a Total number of hospitalizations (dark grey) and incoming reports (light grey) by weekday. Only 8 reports were filed during the weekends. b The mean (and 95%CI) reporting delay for each weekday. Delays are longest on Thursday and Friday, and shortest on Sunday and Monday

Fig. 5

Total difference between estimated and actual number of admitted patients over the entire time period, excluding the day of observation (0 days delay), and split up by weekday of observation. The results of the estimator clearly shift during the week, with the best estimations on Wednesday

Discussion

In emerging outbreaks, it is important to have up-to-date information on the spread of the disease and the growth of the epidemic, because the number of cases can increase dramatically in a matter of days. Hospitalizations, and intensive care admissions in particular, should be tracked promptly, because excess capacity is small most of the time. However, surveillance systems suffer from delayed reporting of cases. This delay causes an apparent decrease in the number of cases in the most recent part of the epidemic, and should therefore be taken into consideration when interpreting epidemic curves.

We have shown how routinely collected surveillance data can be used to obtain precise estimates of the actual number of hospitalized patients during an outbreak. Despite considerable reporting delays, the estimates were close to the actual numbers of daily hospitalized patients, up to 1 day before the observation.

The estimator has a number of limitations that should be addressed. First, patients that are reported after a very long time have not yet been included in the delay distribution. This truncation of data can be adjusted for [13], but we believe that little additional precision can be obtained by such a more complicated analysis. Second, we assume that the distribution of delays is at least approximately stationary. If there is evidence of significant changes in the delay distribution over time, the different phases of the epidemic should be analyzed separately to reduce bias in the estimation, at the expense of a loss of precision. Third, the reporting delay will typically differ between weekdays, because hospitalizations are generally not reported during the weekends. If the difference between weekdays is large, the number that is still to be reported for each day of the week should be analyzed separately. Again, this reduces bias at the expense of a loss of precision.
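A separate analysis per day of the week could, for instance, start from delay distributions that are stratified by the weekday of admission. The Python sketch below illustrates this idea; the function name, the record layout (pairs of admission date and report date) and the example dates are hypothetical, and such a stratification was not part of the analysis presented here.

```python
from collections import defaultdict
from datetime import date


def delay_distribution_by_weekday(records):
    """Return {weekday: rho}, where weekday follows date.weekday()
    (Monday is 0) and rho[i] is the fraction of admissions on that
    weekday that were reported within i days."""
    delays = defaultdict(list)
    for admitted, reported in records:
        delays[admitted.weekday()].append((reported - admitted).days)
    distributions = {}
    for weekday, observed in delays.items():
        n = len(observed)
        distributions[weekday] = [
            sum(1 for d in observed if d <= i) / n
            for i in range(max(observed) + 1)
        ]
    return distributions


# Hypothetical records (admission date, report date), illustration only.
records = [
    (date(2009, 10, 26), date(2009, 10, 28)),
    (date(2009, 10, 26), date(2009, 10, 30)),
    (date(2009, 10, 30), date(2009, 11, 2)),
]
by_weekday = delay_distribution_by_weekday(records)
```

Each stratum contains only a fraction of the cases, which is the loss of precision mentioned above.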

The application of the nowcasting algorithm to pandemic influenza A/H1N1 2009 hospitalizations in the Netherlands provides an example where the limitations described above have been checked carefully. The scale of the delay distribution is much shorter than the scale of the epidemic. The reporting delay differed between weekdays but not enough to cause a substantial bias. Our analysis of the pandemic influenza A/H1N1 2009 hospitalizations may have profited from the relatively long period between the first reported cases and the start of the epidemic growth in the Netherlands. The Dutch hospitals and health services had ample time to prepare for the epidemic, and diagnostic tests were available throughout the epidemic. These preparations resulted in the relatively stable reporting delay distribution throughout the epidemic. A shorter period, such as in the UK, USA or Australia, could have overwhelmed the health services, causing larger fluctuations in the reporting delay. This, in turn, could result in less precise estimations.

The method presented here enables estimation of the current number of cases, and is not intended to predict the development of the epidemic. To that end, real-time prediction models are available that use the numbers of reported cases in combination with simple mathematical models to project the trajectory of the epidemic [14, 15, 16]. These models usually assume that cases are reported instantaneously, which is hardly ever the case in practice. We believe that a two-pronged approach in which our nowcasting estimator is used in conjunction with real-time prediction models could substantially improve prospects for the practical application of predicting the future course of an epidemic.

In conclusion, we combined surveillance data to estimate the number of hospitalizations during the pandemic influenza A/H1N1 2009 outbreak, to track the actual health care demand. The method reliably captures both increasing and decreasing trends in the number of hospitalizations. The nowcasting tool holds considerable promise for gauging the actual number of hospitalizations in the presence of reporting delays.