Predictions of future cases, hospitalizations, and deaths have dominated the public discourse around COVID-19. Experts forecasted that 20 to 60% of the world's population could become infected and that unmitigated spread could cause up to 2.2 million American deaths [1, 2]. Similarly, researchers predicted 510,000 British deaths and at least 300 million Indian cases [2, 3]. Many of these forecasts may not come true, so how can we use them effectively to inform policy?

Similar prediction efforts undertaken by researchers during past epidemics can lend clarity to this question. For example, prominent experts forecasted up to 200 million global deaths from H5N1 and up to 50,000 deaths from mad cow disease [4, 5]. However, these were drastic overpredictions: only 455 and 177 deaths, respectively, ensued [6, 7].

To systematically evaluate the success of prior forecasting models, we reviewed predictions from three twenty-first-century epidemics: the 2002–2004 severe acute respiratory syndrome (SARS) outbreak, the 2009 H1N1 influenza pandemic, and the 2014 Ebola virus disease outbreak. We found that during the SARS and H1N1 outbreaks, only a few studies attempted to predict future cases, and those attempts were ultimately unsuccessful. During the Ebola epidemic, the number of forecasting studies increased dramatically, and most overestimated the true number of cases and deaths, often by a substantial margin.

We identified studies that forecasted cases or deaths for each epidemic by employing a broad PubMed search strategy with the terms “estimate”, “model”, “forecast”, “predict”, “transmission”, and “intervention.” We included only studies that made predictions while the outbreak was occurring. Because some forecasts may not have been published in the peer-reviewed literature, we applied this search strategy to news articles from major media outlets as well. For our analysis of the Ebola epidemic, we used the references from a prior Ebola review [8]. Of the reviewed studies, only three predicted deaths directly. For the remaining studies, which predicted cases, we extrapolated deaths by multiplying the predicted cases by the studies’ estimates of case fatality rate (CFR). For studies that did not estimate a CFR, we applied the average Ebola CFR of 50%, as reported by the World Health Organization [9]. Finally, we numerically compared the studies’ predicted deaths to the eventual true deaths to assess prediction accuracy.
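To make the extrapolation concrete, the following is a minimal Python sketch of the step described above; it is our illustration rather than published code from the review, and the example figures are hypothetical placeholders, not data from any reviewed study.

```python
# Minimal sketch of the death-extrapolation step described above.
# Numbers in the example are hypothetical placeholders, not data
# from any reviewed study.

WHO_AVG_EBOLA_CFR = 0.50  # average Ebola CFR reported by the WHO [9]

def predicted_deaths(predicted_cases, study_cfr=None):
    """Extrapolate deaths from a study's predicted cases, using the
    study's own CFR estimate when available and the WHO average
    otherwise."""
    cfr = study_cfr if study_cfr is not None else WHO_AVG_EBOLA_CFR
    return predicted_cases * cfr

def relative_error(predicted, actual):
    """Signed relative error: positive values indicate overprediction."""
    return (predicted - actual) / actual

# Hypothetical study predicting 10,000 cases with a 60% CFR estimate,
# compared against a hypothetical actual toll of 4,000 deaths.
deaths = predicted_deaths(10_000, study_cfr=0.60)  # 6,000 deaths
print(f"relative error: {relative_error(deaths, 4_000):+.0%}")  # +50%
```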

For SARS, we found only one prediction, which vastly overpredicted the number of cases in Canada by June 2003 (Predicted (P): 4432; Actual (A): 188) [10, 11]. During H1N1, only two studies predicted future case counts, and both significantly underpredicted the number of cases in the U.S. by June 2009 (P: 2000–2500; A: 100,000) [12, 13].

There were 17 studies (reporting 35 predictions) that forecasted the Ebola epidemic. Of the 35 predictions, 71% (n = 25) overpredicted and 29% (n = 10) underpredicted the number of deaths. These mispredictions ranged from projecting 89% (n = 2256) fewer than the actual deaths in Guinea to 9495% (n = 456,690) more than the actual deaths in Liberia [14]. Only 37% (n = 13) of the predictions fell within 50% of the actual number of deaths (Fig. 1). Additionally, several predictions assumed best-case (all interventions implemented) or worst-case (no interventions implemented) scenarios. Of the 12 predictions that assumed a worst-case scenario, 92% (n = 11) overpredicted and 8% (n = 1) underpredicted. Of the 7 predictions that assumed a best-case scenario, 57% (n = 4) still overpredicted (Supplementary File 1).

Fig. 1 Frequency of predictions based on accuracy compared to actual numbers of deaths
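For concreteness, here is a minimal sketch, under the assumption that accuracy is scored as relative error against actual deaths, of how predictions could be tallied into the categories above; the (predicted, actual) pairs are hypothetical placeholders, not the 35 reviewed predictions.

```python
# Sketch of tallying predictions into the categories reported above.
# The (predicted, actual) death pairs are hypothetical placeholders,
# not the 35 reviewed predictions.

predictions = [(6_000, 4_000), (1_500, 4_000), (90_000, 4_000)]

n = len(predictions)
over = sum(p > a for p, a in predictions)
under = sum(p < a for p, a in predictions)
within_band = sum(abs(p - a) / a <= 0.5 for p, a in predictions)

print(f"overpredicted: {over / n:.0%} (n = {over})")
print(f"underpredicted: {under / n:.0%} (n = {under})")
print(f"within 50% of actual: {within_band / n:.0%} (n = {within_band})")
```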

As of June 19th, 2020, there were over 50 studies that predicted the course of COVID-19. It is reassuring to see scientists rise to the challenge of helping leaders make informed decisions. However, our review of Ebola forecasts suggests that the majority of predictions were far from the eventual reality. Indeed, COVID-19 predictions have already ranged from massive underestimates, such as a worst-case scenario of 50,562 cases in Italy by May 31st (there were over 230,000 cases) [15, 16], to proven overestimates, such as 190,000 cases in Wuhan, China by April (there were 50,339 cases by mid-May) [17, 18]. A model developed by the Institute for Health Metrics and Evaluation has gained prominence and has been widely utilized for state and federal policymaking. However, even this model has often been inaccurate at local and national levels, tending to provide overly narrow confidence intervals [19, 20]. Thus, we must consider the historically poor performance of disease prediction models when engaging with predictions for COVID-19.

Imperfect data, unverifiable assumptions, and the unpredictability of human behavior make forecasting epidemics an inherently uncertain task. For disease models to appropriately inform policy, we must acknowledge not only the uncertainty of prediction estimates (via confidence intervals), but also the uncertainty inherent to the exercise of prediction itself.

One approach for improving predictions is to incorporate a broader set of disciplinary perspectives. Often, disease forecasts are made on the basis of individual expertise in virology, infectious disease epidemiology, or demography. However, the psychology of behavior change, the economics of the unemployment that ensues, and the policy options available to nations also influence a pandemic's course, yet they are typically left unconsidered. Models that integrate these various forms of epidemic information will bring much-needed nuance and humility to the challenge of prediction.

Furthermore, we recommend standardized reporting guidelines for forecasting studies, much like STROBE for observational epidemiological studies and CONSORT for randomized controlled trials. Forecasting studies should discuss the “Current Forecasting Effort in Context” to summarize other predictions for the same outbreak as well as relevant predictions from prior outbreaks. These models should also report how their data were collected, detail the assumptions made and how realistic they are, and incorporate key epidemiological factors like age structure into the model [21]. Lastly, researchers should indicate how their forecast builds upon the existing landscape of predictions. As research and the media pivot their focus toward the second surge of COVID-19, it is critical to quickly improve reporting standards so that future models can be more honestly appraised.
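To illustrate what incorporating age structure can mean in practice, the following is a minimal sketch of an age-stratified SIR model with a contact matrix; the two age groups, contact rates, and all parameter values are illustrative assumptions for demonstration, not calibrated values from any reviewed model.

```python
import numpy as np

# Minimal age-stratified SIR sketch. The two age groups ("young",
# "old"), contact matrix, and rates are assumptions for illustration,
# not calibrated values from any reviewed study.

contacts = np.array([[10.0, 3.0],   # young-young, young-old contacts/day
                     [ 3.0, 5.0]])  # old-young,   old-old
beta = 0.03                         # transmission prob. per contact (assumed)
gamma = 1 / 7                       # recovery rate: 7-day infectious period
N = np.array([7e6, 3e6])            # group population sizes (assumed)

I = np.array([10.0, 0.0])           # seed 10 infections in the young group
S = N - I
R = np.zeros(2)

dt = 0.1
for _ in range(int(180 / dt)):      # simulate 180 days by Euler steps
    # force of infection on each group from prevalence in all groups
    lam = beta * contacts @ (I / N)
    new_inf = lam * S * dt
    new_rec = gamma * I * dt
    S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec

print(f"attack rate, young: {R[0] / N[0]:.1%}; old: {R[1] / N[1]:.1%}")
```

Even this toy model makes group-specific outcomes visible, which is exactly the kind of detail that uniform-mixing forecasts obscure and that standardized reporting should make explicit.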

Niels Bohr once said, “It is difficult to predict, especially the future.” Only once COVID-19 is behind us will we know whether prediction models did better than their counterparts from the Ebola epidemic. Until then, it is critical that researchers communicate the contexts and uncertainties of their predictions to best inform policy and the public.