ASSESSING THE AGE SPECIFICITY OF INFECTION FATALITY RATES FOR COVID-19: META-ANALYSIS & PUBLIC POLICY IMPLICATIONS

This paper assesses the age specificity of the infection fatality rate (IFR) for COVID-19. Our benchmark meta-regression synthesizes the age-specific IFRs from four recent large-scale seroprevalence studies conducted in Belgium, Geneva, Spain, and Sweden. The estimated IFR is close to zero for children and younger adults but rises exponentially with age, reaching about 0.3 percent for ages 50-59, 1 percent for ages 60-69, 4 percent for ages 70-79, and 24 percent for ages 80 and above. We compare those predictions to the age-specific IFRs computed using recent seroprevalence studies of six U.S. geographical areas, three countries that have engaged in comprehensive tracking and tracing of COVID-19 infections, and three small-scale seroprevalence studies. We also review more than 30 other seroprevalence studies whose design was not well-suited for estimating age-specific IFRs. Our findings indicate that COVID-19 is not just dangerous for the elderly and infirm but also for healthy middle-aged adults, for whom the fatality rate is roughly 50 times greater than the risk of dying in an automobile accident. Consequently, the overall IFR for a given location is intrinsically linked to the age-specific pattern of infections. In a scenario where the U.S. infection rate reaches nearly 30%, our analysis indicates that protecting vulnerable age groups could prevent over 200,000 deaths.


Introduction
As the COVID-19 pandemic has spread across the globe, some fundamental issues have remained unclear: How dangerous is COVID-19? And to whom? The answers to these questions have crucial implications in determining appropriate public health policies as well as informing prudent decision-making by individuals, families, and communities.
The standard epidemiological approach to gauging the severity of an infectious disease is to determine its infection fatality rate (IFR), that is, the ratio of deaths to the total number of infected individuals. The IFR is readily observable for certain viruses, such as Ebola, where nearly every case is associated with severe symptoms and the incidence of fatalities is extremely high; for such diseases, the IFR is practically identical to the case fatality rate (CFR), that is, the ratio of deaths to reported cases. By contrast, most people who are infected with SARS-CoV-2, the virus that causes COVID-19, are asymptomatic or experience only mild symptoms such as headache or loss of taste and may be unlikely to receive a viral test or be included in official case reports. Consequently, reported cases tend to comprise a small fraction of the total number of infections, and hence the CFR is not an adequate metric for the true severity of the disease.
As shown in Table 1, assessing the IFR for COVID-19 is analogous to finding a needle in a haystack, especially in a dense urban area such as New York City (NYC). The New York State Department of Health recently conducted a large-scale seroprevalence study and estimated the NYC infection rate at about 22 percent, that is, 1.6 million out of 8 million NYC residents. 1 As of mid-July, NYC had about 220,000 reported COVID-19 cases, almost exactly one-tenth of the total number of infections. About one-fourth of those reported cases were severe enough to require hospitalization, and many of those patients unfortunately succumbed to the disease. All told, fatalities represented about one-tenth of reported cases but only one-hundredth of all infections.
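The distinction between the CFR and the IFR in this example can be checked with a few lines of arithmetic. The sketch below uses only the rounded figures quoted in the text; the death count is not stated directly, so it is taken here as one-tenth of reported cases, which is an assumption drawn from the final sentence of the paragraph.

```python
# Rounded NYC figures quoted in the text (mid-July 2020).
infections = 1_600_000      # estimated from the seroprevalence study
reported_cases = 220_000    # officially reported COVID-19 cases

# Assumption: the text gives deaths only as "about one-tenth of reported cases".
deaths = reported_cases / 10

cfr = deaths / reported_cases   # case fatality rate: deaths per reported case
ifr = deaths / infections       # infection fatality rate: deaths per infection

print(f"CFR = {cfr:.1%}")
print(f"IFR = {ifr:.1%}")
```

With these rounded inputs the CFR is an order of magnitude larger than the IFR, which is the gap the paragraph describes.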
While the NYC data indicate an IFR of about 1 percent, analysis of other locations has produced a puzzlingly wide array of IFR estimates, ranging from around 0.5 percent in Geneva and Zurich to rates above 2 percent in Spain and in the Republic of Korea (henceforth "Korea"). Indeed, a [...]
1 See New York Department of Health (2020).
Our analysis has two key conclusions: (1) COVID-19 is not just dangerous for the elderly and infirm but also for healthy middle-aged adults, for whom the fatality rate is roughly 50 times greater than the risk of dying in an automobile accident; and (2) age-specific policy choices and communications can dramatically decrease COVID-19 deaths. In particular, the overall IFR should not be viewed as an exogenously fixed parameter but as intrinsically linked to the age composition of the population and the age-specific pattern of infections. 3 Consequently, individual and collective efforts that minimize infections in older adults could substantially decrease total deaths. In a scenario where the infection rate of the U.S. population reaches nearly 30%, our analysis indicates that protecting vulnerable age groups could prevent over 200,000 deaths.
The remainder of this paper is structured as follows: Section 2 describes our methodology. Section 3 presents our meta-analysis results. Section 4 considers these findings in the context of other demographic characteristics (including race and ethnicity) and co-morbidities. Section 5 discusses the public policy implications of our analysis, including comparison to other types of fatality risks and scenario analysis of the age-specific pattern of U.S. infections and deaths.

Overview
To perform the present meta-analysis, we collected published papers and preprints that have studied the seroprevalence and/or infection fatality rate of COVID-19. To identify these studies, we performed online searches in medRxiv and Medline using the criterion (("infection fatality rate" or "IFR" or "seroprevalence") and ("COVID-19" or "SARS-Cov-2")). 4 We identified other studies listed in reports by government agencies such as the U.S. Center for Disease Control & Prevention and the U.K. Parliament Office. 5 Finally, we confirmed the comprehensiveness of our literature search by referring to two recent meta-analysis studies that have assessed overall IFR for COVID-19 and a recent meta-analysis study comparing seroprevalence with reported cases. 6
Before proceeding further, we restricted our meta-analysis to studies of advanced economies, based on current membership in the Organization for Economic Cooperation and Development (OECD). 7 It should be emphasized that we applaud recent efforts to assess seroprevalence in a number of developing countries (including Brazil, Croatia, Ethiopia, and Iran), but we have excluded those studies in light of the distinct challenges associated with health care provision and reporting of fatalities in those locations. 8 We also excluded studies focused exclusively on measuring seroprevalence in a narrow segment of the population such as health care workers or pregnant women. 9 Appendix A lists all of the studies identified in our literature search.

Prevalence Measures
Our meta-analysis encompasses two distinct approaches for assessing COVID-19's prevalence: (1) extensive tracking and contact-tracing using live-virus testing and (2) seroprevalence studies that test for antibodies produced in response to the virus. Testing for the live virus is done by either a quantitative reverse-transcription polymerase chain reaction (qRT-PCR) molecular test for the viral nucleic acid sequence, or an antigen test for proteins specific to the virus. 10 These tests detect the virus within a few days of disease onset. While live-virus testing is the optimal approach for determining prevalence, it requires extensive continuous testing of a population, and was only thoroughly implemented in select countries with relatively small populations, notably Korea, Iceland, and New Zealand.
Most studies of COVID-19 prevalence have proceeded using serological analysis to determine what fraction of the population has developed either IgG or IgM antibodies to the virus. IgM antibodies develop earlier, but decrease over time, while IgG antibodies develop later and remain in high concentrations for several months. Several methods are used to test for antibodies. Enzyme-linked immunosorbent assays (ELISA) proceed by tagging antibody-antigen interactions with a reporter protein. Chemiluminescent immunoassays (CLA) work similarly by tagging the antigen-antibody interaction with a light-emitting (chemiluminescent) label. Lateral Flow Assays (LFA), also known as rapid diagnostic tests (RDT), produce a colored band upon antigen-antibody interaction.
Recognizing that SARS-CoV-2 is both novel and hazardous, public regulatory agencies have issued "emergency use authorizations" (EUA) to facilitate the rapid deployment of live virus and antibody tests based on the test characteristics reported by each manufacturer. 11 Subsequent studies by independent laboratories have reassessed the characteristics of these test kits, in many cases finding markedly different results than those of the manufacturer. Such differences reflect (a) the extent to which test results may be affected by seemingly trivial differences in implementation, and (b) the extent to which serological properties may vary across different segments of the population. For example, a significant challenge in producing accurate tests is to distinguish COVID-19 antibodies from those associated with other coronaviruses (including the common cold). Consequently, the assessment of test characteristics may vary with seemingly innocuous factors such as the season of the year in which the blood samples were collected.
8 See Silveira et al. (2020), Jerkovic et al. (2020), Kempen et al. (2020), and Shakiba et al. (2020) for seroprevalence analysis of locations in Brazil, Croatia, Ethiopia, and Iran, respectively. Fassihit and Gladstone (2020) highlight the shortcomings of official tabulations of COVID-19 fatalities in Iran during the early stages of the pandemic.
9 For example, Flannery et al. (2020) assess seroprevalence in parturient women.
10 Carter et al. (2020).
11 For example, see U.S. Food & Drug Administration (2020).
medRxiv preprint doi: https://doi.org/10.1101/2020.07.23.20160895; this version posted July 24, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
The reliability of seroprevalence testing depends on three key factors: (1) the test's sensitivity (the probability that the test returns a positive result for a person who has been infected); (2) the test's specificity (the probability that the test returns a negative result for an uninfected person); and (3) the true disease prevalence in the sample. In a population where the actual prevalence is relatively low, the frequency of false-positive tests is crucial for determining the reliability of the test results. Consequently, a key metric of test reliability is positive predictive value (PPV), that is, the likelihood that a positive test result is a true positive. The PPV can be evaluated as follows:

$$\mathrm{PPV} = \frac{\text{sensitivity} \times \text{prevalence}}{\text{sensitivity} \times \text{prevalence} + (1 - \text{specificity}) \times (1 - \text{prevalence})}$$

Evidently, lower prevalence can markedly diminish the reliability of seroprevalence testing. For example, in a seroprevalence study of Dutch blood donors using the Wantai Total Antibody ELISA, the crude prevalence rate was found to be 2.7%. 12 However, that antibody test has a PPV of 42.4%, and hence the adjusted prevalence is only 0.6%, with a 95 percent confidence interval of 0% to 5.2%. In effect, practically all of the positive tests obtained in this study might be false positives. By contrast, a seroprevalence study of New York City found a much higher crude prevalence of 20.0% using a Wadsworth Pan-Ig test with a PPV of 94.8%. 13 Consequently, the adjusted prevalence for this study is higher than the crude prevalence, namely, 21.7% with a 95 percent confidence interval of 19.2% to 24.4%. 14
Test sensitivity and specificity also have a strong impact on PPV. For example, in a serological study of Santa Clara County, researchers used a Premier Biotech LFA test and estimated prevalence at 1.5% based on a test specificity of 99.5%. 15 However, a subsequent study found the specificity of that test to be only 97.2%. 16 That revision to the test specificity reduces its PPV in the Santa Clara study from 71.6% to 31.1%, and the adjusted prevalence for Santa Clara residents declines to 0%; that is, the prevalence in that population was so low that it could not be distinguished from zero. 17
These examples underscore why the sensitivity and specificity of COVID-19 antibody tests should not be treated as fixed parameters that are known with a high degree of certainty, as would generally be the case for medical tests of other diseases that have been authorized via standard regulatory procedures. Thus, the 95% confidence interval for each seroprevalence estimate should reflect the degree of uncertainty about its sensitivity and specificity as well as the conventional uncertainty that reflects the size of the sample used in producing that estimate. 18
In light of these testing reliability concerns, our meta-analysis excludes seroprevalence studies that do not disclose the test method or report estimates and confidence intervals that reflect the characteristics of the test. We also exclude studies that relied on test kits that were subsequently withdrawn by the manufacturer due to concerns about inadequate reliability.
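The interplay between sensitivity, specificity, and prevalence described in this subsection can be illustrated with a minimal sketch. The PPV function follows the standard definition. The prevalence adjustment uses the Rogan-Gladen correction, which is a common textbook choice and not necessarily the adjustment used in the studies cited above. The sensitivity value is hypothetical, and the quoted 1.5% Santa Clara figure is treated here as a crude positive rate, which is also an assumption.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive test result is a true positive."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def adjusted_prevalence(crude, sensitivity, specificity):
    """Rogan-Gladen correction of a crude positive rate, clipped to [0, 1]."""
    estimate = (crude + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, estimate))


# Hypothetical sensitivity; the text does not report one for this test.
SENSITIVITY = 0.80

# The two specificities quoted in the text for the Santa Clara test kit.
for specificity in (0.995, 0.972):
    adj = adjusted_prevalence(0.015, SENSITIVITY, specificity)
    print(f"specificity={specificity:.1%}  adjusted prevalence={adj:.2%}")
```

Under these assumptions, the lower specificity implies a false-positive rate (2.8%) that exceeds the crude positive rate (1.5%), so the corrected estimate is clipped to zero, in line with the qualitative point made in the text.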

Constructing a Representative Seroprevalence Sample
In order to accurately use antibody tests to estimate population prevalence, the study sample must accurately reflect the sampled population. We exclude four types of seroprevalence studies from our paper that do not provide accurate estimates of population-wide age-specific infection rates: (1) studies from clinics including COVID-19 patients; (2) studies which employed active recruitment; (3) studies whose samples are heavily skewed by age; and (4) studies of blood donors. Many of these studies were extremely useful in their originally intended contexts, but are not useful in the current context, for reasons outlined below.
Studies from serum samples at health care facilities produced inflated estimates of prevalence when those samples included patients who were obtaining treatment for symptoms associated with COVID-19. For example, a New York City study from an outpatient clinic in May yielded a seroprevalence estimate double that of two April studies of the same area. As the entire state was in lockdown between the two time points, it is difficult to believe that the prevalence grew so drastically during that time period. Instead, it is likely that the majority of primary care and urgent care patients in the month of May sought medical attention for COVID-like symptoms, and when these patients were not excluded, the estimate of COVID-19 prevalence was inflated.
Studies which employ active recruitment also inflate the number of positive patients, as people who think they are positive are more likely to enroll for the free testing. For example, in a study in Luxembourg, of the 35 participants who tested positive, 19 had previously interacted with a person whom they knew to be positive or had been tested for SARS-CoV-2 previously. Excluding these patients from the sample doubles this particular study's implied overall IFR.
Even random samples may draw mainly from younger people, making them less useful for estimating age-specific prevalence and IFRs across the full spectrum of the population. For example, in a town in the French department of Oise, seroprevalence tests were run on schoolchildren, their teachers, and their immediate families; the entire study only included two individuals older than sixty-five. 19 Since our analysis is aimed at gauging the relationship between IFR and age, we restrict our meta-analysis to studies that report age-specific results for a broad spectrum of age groups.
Finally, seroprevalence studies of blood donors are also likely to be non-representative and to overstate the actual infection rate. In the discussion of their Milan blood donor study, Valenti et al. (2020) note that blood donors are generally healthier than the general population and "might have a higher number of social interactions than other groups." 20 For example, a blood donor study in the United Kingdom estimated that 7.8% (CI: 7.1 to 8.6%) 21 of the population had been infected with COVID-19, while a randomized seroprevalence study of the UK population at a later date estimated the prevalence of COVID-19 to be 5.41% (CI: 4.3 to 6.5%). 22 Though blood donor studies were useful while few other samples were available, randomized seroprevalence studies provide more reliable population prevalence estimates.
Blood specimens from commercial labs may also not be representative of the location's broader population. The CDC conducted seroprevalence testing on blood samples from commercial labs in New York City (NYC), Connecticut, Utah, Missouri, and Puget Sound, Washington. 23 Though these studies mostly control for demographic data such as age, zip code, race, and sex, some other factors may skew the data. Patients who received healthcare during quarantine may have been more cautious about infection due to their underlying medical conditions; alternatively, a decision to enter medical spaces may have corresponded with a less cautious demographic. The CDC compares their estimated seroprevalence in each location to the number of reported cases. Four locations have ratios of seroprevalence to reported cases of around 11:1, but the Connecticut study reports a much lower ratio of 6:1, suggesting the study may have underestimated prevalence there, while the Missouri study has a ratio of 25:1, suggesting that prevalence may have been overestimated. These observations are further discussed in Section 3.
Concerns about potential sample selection issues in studies of health care specimens are underscored by findings from a seroprevalence study that analyzed specimens from two commercial laboratories in New York City. 24 In that study, the estimated seroprevalence differed markedly between the two labs, with non-overlapping 95% confidence intervals of 6.5 to 12.3% for Lab A and 3.7 to 6.1% for Lab B.

Matching Infections to Deaths
Accurately measuring total deaths, the numerator of the IFR calculation, is perhaps more difficult than assessing prevalence. The time lags from onset of symptoms to death and from death to official reporting are crucial. Symptoms typically develop within 6 days after exposure, but may develop as early as 2 days or as late as 14 days. 25 More than 95% of symptomatic COVID patients have positive antibody (IgG) tests within 17-19 days of symptom onset. 26 The CDC estimates that the mean time interval from symptom onset to death is 15 days for ages 18-64 (interquartile range of about 9 to 24 days) and 12 days for ages 65+ (IQR of 7 to 19 days). The mean interval from date of death to the reporting of that person's death is about 7 days (interquartile range of about 2 to 19 days). Consequently, the upper bound of the 95% confidence interval for the lag between symptom onset and the reporting of fatalities is about six weeks (41 days). 27
Figure 1 illustrates a scenario in which the pandemic ended two weeks prior to the date of a seroprevalence study. This figure shows the results of a stochastic simulation calibrated to reflect the CDC's estimated distribution for the time lags between symptom onset, death, and inclusion in official fatality reports. The histogram shows the frequency of deaths and reported fatalities associated with the infections that occurred on the last day prior to full containment. Consistent with the CDC confidence intervals, about 95% of cumulative fatalities are reported within roughly four weeks of the date of the seroprevalence study.
20 See Valenti et al. (2020).
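A stochastic simulation of this kind can be sketched in a few lines. The version below is only an illustration and is not the calibration behind Figure 1: it draws the two lags from gamma distributions whose means match the figures quoted above (15 days from onset to death, 7 days from death to report), while the gamma family and the shape parameters are assumptions made here for simplicity.

```python
import random


def simulate_reporting_lags(n=20_000, seed=0):
    """Draw days from symptom onset to the official report of each death."""
    rng = random.Random(seed)
    lags = []
    for _ in range(n):
        # Assumed gamma shapes; only the means (15 and 7 days) come from the text.
        onset_to_death = rng.gammavariate(2.0, 15.0 / 2.0)   # mean 15 days
        death_to_report = rng.gammavariate(1.0, 7.0)          # mean 7 days
        lags.append(onset_to_death + death_to_report)
    return lags


def share_reported_within(lags, days):
    """Fraction of deaths that have been reported within `days` of symptom onset."""
    return sum(lag <= days for lag in lags) / len(lags)


lags = simulate_reporting_lags()
for days in (14, 28, 41):
    share = share_reported_within(lags, days)
    print(f"reported within {days} days of onset: {share:.1%}")
```

Because the shape parameters are illustrative, the printed shares will not reproduce the paper's figure exactly; the point is that a substantial fraction of deaths is still unreported two weeks after symptom onset, which is why the timing of the fatality count matters.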
These considerations underscore the pitfalls of constructing IFRs based on the death toll at the midpoint date of a seroprevalence study, which is the approach that has been taken in most previous studies (including both of the meta-analysis studies of the overall IFR for COVID-19).
In particular, as shown in Table 2, the cumulative fatalities at the time of a seroprevalence study can markedly understate the full death count as of four weeks later. All of these studies were conducted in locations where the pandemic had been contained by the time that seroprevalence was measured, as evident from the fact that the fatality count leveled off over the subsequent month. 28 Evidently, the precise timing of the count of cumulative fatalities is relatively innocuous in locations (such as Spain and Castiglione d'Adda) where the outbreak had been contained for more than a month prior to the date of the seroprevalence study. But for the other studies shown in Table 2, the outbreak had only recently been contained, and hence the death count continued rising markedly for several more weeks after the midpoint of the seroprevalence study. For each of those locations, matching seroprevalence to the death count at the midpoint date of the study would significantly underestimate the true level of the IFR. For example, in the case of New York state, the IFR computed using the 4-week fatality count is nearly 1.5 times higher than the IFR computed using the fatality count at the midpoint date of that study (which was conducted in late April).
27 See U.S. Center for Disease Control & Prevention (2020e), Table 2.
28 In each of the locations shown in Table 2, cumulative fatalities stabilized over the month following the seroprevalence study; in each case, the percent change from week 4 to week 5 was less than 10% with the exceptions of Missouri (18%) and South Florida (11%). See Appendix Table D1 for further details.

Figure 1: Time Lags in the Incidence and Reporting of COVID-19 Fatalities
Source: authors' calculations based on CDC estimates; see text.
By contrast, matching seroprevalence estimates with subsequent fatalities is infeasible if the seroprevalence study takes place in the midst of an accelerating outbreak. In particular, if infections and fatalities continue rising exponentially over subsequent weeks, there is no precise way of determining what fraction of those deaths resulted from infections before vs. after the date of the seroprevalence study.
Therefore, a crucial criterion for seroprevalence studies to be included in our meta-analysis is that the pandemic is well contained in advance of the study, as indicated by the stabilization of cumulative fatalities within the next several weeks after the midpoint date of the study.
Diamond Princess 8 10 25
Sources: See Appendix A.
As shown in Table 3, four studies are clearly inconsistent with that criterion: Los Angeles County (mid-April), New York City (late March), Santa Clara County (early April), and Scotland (late March). 29 It should be emphasized that these studies provided valuable information about seroprevalence in the midst of an active outbreak, but these seroprevalence results are not well-suited for gauging the IFR of COVID-19.
Finally, it should be noted that reported deaths may not fully capture all fatalities resulting from COVID-19 infections, especially in locations where a substantial fraction of such deaths occur outside of healthcare institutions. In the absence of accurate COVID-19 death counts, an alternative measure, referred to as excess mortality, can be computed by comparing the number of deaths for a given time period in 2020 to the average number of deaths over the comparable time period in prior calendar years, e.g., 2015 to 2019. For example, a recent Belgian study used seroprevalence results in conjunction with excess mortality to compute age-specific IFRs, noting that their measure of excess mortality over the period from March to May coincided almost exactly with the tally of reported COVID-19 cases. 31
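The excess mortality calculation described above, and its use as the numerator of an IFR, can be sketched as follows. All numbers in the example are hypothetical and serve only to show the arithmetic; the function names are introduced here for illustration.

```python
def excess_mortality(deaths_2020, deaths_prior_years):
    """Deaths in the 2020 window minus the mean for the same window in prior years."""
    baseline = sum(deaths_prior_years) / len(deaths_prior_years)
    return deaths_2020 - baseline


def infection_fatality_rate(deaths, prevalence, population):
    """Deaths divided by the estimated number of infections."""
    return deaths / (prevalence * population)


# Hypothetical March-May death counts for 2015-2019, and for 2020.
prior = [10_000, 10_200, 9_900, 10_100, 9_800]
excess = excess_mortality(12_500, prior)

# Hypothetical seroprevalence (5%) in a population of one million.
ifr = infection_fatality_rate(excess, prevalence=0.05, population=1_000_000)
print(f"excess deaths = {excess:.0f}, implied IFR = {ifr:.1%}")
```

In an age-specific application, the same two steps would be repeated for each age group, using that group's deaths, seroprevalence, and population.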

Meta-Regression Methodology
The goal of our meta-analysis is to systematically assess previous studies of mortality and infection rates to determine how age and fatality risk are related. To perform this analysis quantitatively, we use random-effects meta-regressions, using the STATA metareg procedure. 33 Meta-regressions are a useful tool for comparing study-level summary data. Since the individual observations are not available for any study we use, comparing summary-level data is the only way to compare the studies. We use summary level data from each age group in each study, so effectively one study has multiple "groups" in our meta-regressions. 34 We treat each age group separately because there are likely random variations in age-specific IFR both across studies and across age groups within a study. Random-effects procedures allow for such random variation between groups (referred to as residual heterogeneity) by assuming that these effects are drawn from a Gaussian distribution. The procedure provides reasonable results even if the errors are not strictly normal but may be unsatisfactory if the sample includes large outliers or the distribution of groups is not unimodal. In analytical terms, this framework can be expressed as follows:

$$\log(IFR_{ij}) = \alpha + \beta \cdot age_{ij} + \epsilon_{ij} + u_{ij}$$

In this specification, $IFR_{ij}$ is the estimated IFR in study i for age group j, $age_{ij}$ denotes the median age of that group, $\epsilon_{ij}$ denotes the source of idiosyncratic variations for that particular location and age group, and $u_{ij}$ denotes the random effects that characterize any systematic deviations in outcomes across locations and age groups. Under the maintained assumption that each idiosyncratic term has a normal distribution, the idiosyncratic variance is $\sigma_{ij}^2 = ((U_{ij} - L_{ij})/3.96)^2$, where $U_{ij}$ and $L_{ij}$ denote the upper and lower bounds of the 95% confidence interval for that study-age group. The random effects $u_{ij}$ are assumed to be drawn from a homogeneous distribution with zero mean and variance $\tau^2$.
The null hypothesis of $\tau^2 = 0$ characterizes the case in which there are no systematic deviations across studies or age groups. If that null hypothesis is rejected, then the estimated value of $\tau^2$ encapsulates the magnitude of those systematic deviations.
Under our baseline specification, the infection fatality rate increases exponentially with age. 35 In particular, this meta-regression is specified in logarithmic terms, with the slope coefficient $\beta$ encapsulating the impact of higher age on log(IFR). Consequently, the null hypothesis that IFR is unrelated to age can be evaluated by testing whether the value of $\beta$ is significantly different from zero. If that null hypothesis is rejected, then the estimated values of the intercept $\alpha$ and the slope $\beta$ characterize the estimated relationship between log(IFR) and age. Consequently, the predicted relationship between IFR and age can be expressed as follows:

$$IFR_a = \exp(\alpha + \beta \cdot a)$$

The 95% confidence interval for this prediction can be obtained using the delta method. In particular, let $IFR_a$ denote the infection fatality rate for age a, and let $\sigma_a$ denote the standard error of the meta-regression estimate of $\log(IFR_a)$. If $IFR_a$ has a non-zero value, then the delta method indicates that its standard error is approximately $IFR_a \cdot \sigma_a$, and this standard error is used to construct the confidence interval for $IFR_a$ at each age a. Likewise, the prediction interval for $\log(IFR_a)$ is computed using a standard error of $\sigma_a + \tau$ that incorporates the systematic variation in the random effects across studies and age groups, and hence the corresponding prediction interval for $IFR_a$ is computed using a standard error of $IFR_a \cdot (\sigma_a + \tau)$.
33 See Harbord and Higgins (2008) and Higgins, Thompson, and Spiegelhalter (2009).
34 We also replicated this analysis using fixed effects for studies and random effects for age groups within studies.
35 Bonanad et al. (2020) conducted a meta-analysis study of COVID-19 case fatality rates as a function of age using aggregate data from China, Italy, New York, Spain, and the U.K. and found a very strong exponential pattern of mortality: ages 40-49: 1.1%; ages 50-59: 3%; ages 60-69: 9.5%; ages 70-79: 22.8%; ages 80+: 29.6%. Similarly, Doherty et al. (2020) investigated a large sample of U.K. hospitalized COVID-19 patients and identified an exponentially increasing mortality hazard rate as a function of patient age.
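The log-linear specification in this section can be illustrated with a simplified sketch. This is not the metareg procedure: it fits log(IFR) on age by weighted least squares with weights equal to the inverse of the idiosyncratic variance plus a between-group variance, and it takes that between-group variance (`tau2`) as a given input rather than estimating it from the data. The function names and the data are hypothetical and chosen so that the result is easy to verify by hand.

```python
import math


def ci_to_variance(lower, upper):
    """Variance implied by a 95% confidence interval, as ((U - L) / 3.96)^2."""
    return ((upper - lower) / 3.96) ** 2


def fit_log_ifr(ages, log_ifrs, variances, tau2):
    """Weighted least squares of log(IFR) on age, weights 1 / (variance + tau2)."""
    weights = [1.0 / (v + tau2) for v in variances]
    w_sum = sum(weights)
    age_bar = sum(w * a for w, a in zip(weights, ages)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, log_ifrs)) / w_sum
    sxy = sum(w * (a - age_bar) * (y - y_bar)
              for w, a, y in zip(weights, ages, log_ifrs))
    sxx = sum(w * (a - age_bar) ** 2 for w, a in zip(weights, ages))
    beta = sxy / sxx
    alpha = y_bar - beta * age_bar
    return alpha, beta


def predict_ifr(alpha, beta, age):
    """Predicted IFR at a given age: exp(alpha + beta * age)."""
    return math.exp(alpha + beta * age)


# Hypothetical age groups: median ages and IFRs that rise tenfold every 20 years.
ages = [25, 45, 65, 85]
log_ifrs = [math.log(x) for x in (0.0001, 0.001, 0.01, 0.1)]
# Hypothetical 95% confidence bounds of +/- 0.4 around each log(IFR).
variances = [ci_to_variance(y - 0.4, y + 0.4) for y in log_ifrs]

alpha, beta = fit_log_ifr(ages, log_ifrs, variances, tau2=0.05)
print(f"beta = {beta:.4f}")
print(f"predicted IFR at age 55 = {predict_ifr(alpha, beta, 55):.4%}")
```

Because the hypothetical data lie exactly on a log-linear curve, the fitted slope equals ln(10)/20 regardless of the weights; with real data the weights determine how much each study-age group influences the fit.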

Study Selection
In subsections 2.1 to 2.4, we have identified four specific criteria for determining whether a given study should be included in our meta-analysis: (i) transparency about the characteristics and positive predictive power of the prevalence test procedure; (ii) use of a sample data frame that is broadly representative of the general population; (iii) effective containment of the pandemic prior to the initiation of the prevalence survey; and (iv) reporting of prevalence estimates and confidence intervals for specific age groups as required for the estimation of age-specific IFRs. Based on those four criteria, we have determined that 32 studies are not suitable for assessing the IFR of COVID-19 even though each of those studies has made significant contributions along other lines. Those 32 studies are listed in Appendix A along with the rationale for excluding each of them from our meta-analysis.
Consequently, our meta-analysis focuses on synthesizing IFR data from sixteen locations; see Appendix A for further details. These locations can be classified into four distinct groups:
• Benchmark Studies: Belgium, Geneva, Spain, and Sweden. Each of these locations has been the subject of a large-scale seroprevalence study using a test procedure with high positive predictive power and a sample frame that is broadly representative of the general public and that covers a wide array of age groups. 36 For Belgium and Geneva, estimates of age-specific IFRs have been reported based on each location's seroprevalence results. For Spain, we construct age-specific IFRs using the seroprevalence data in conjunction with excess mortality data published by the Spanish National Institute of Statistics. For Sweden, we construct age-specific IFRs using the seroprevalence data in conjunction [...]

• Other Large-Scale Seroprevalence Studies: Connecticut, Missouri, New York, Puget Sound, South Florida, and Utah. Five of these studies were conducted by the U.S. Center for Disease Control & Prevention, using sample specimens from two commercial laboratory companies, and the sixth was conducted by the New York Department of Health using samples collected at supermarkets and grocery stores. 38 In light of potential concerns about sample data frames and limited age-specific seroprevalence data, we include these studies in our meta-analysis but not in the benchmark group. We construct age-specific IFRs using seroprevalence results matched to cumulative fatalities four weeks after the midpoint date of each study.

• Comprehensive Tracking and Tracing Countries: Iceland, Korea, and New Zealand. These three countries engaged in extensive testing and tracing to halt the spread of infections. Icelandic researchers also conducted a large-scale seroprevalence study, and we use the results of that study in computing age-specific IFRs for Iceland. 39 That study also indicates that reported cases in Iceland substantially understated actual prevalence; see Appendix C for details. 40 Thus, we make corresponding adjustments to the reported cases for Korea and New Zealand in constructing age-specific IFRs for each of those locations.

• Small-Scale Studies: Castiglione d'Adda, Gangelt, and Diamond Princess cruise ship. The first two locations have had seroprevalence studies based on random samples, while data from the Diamond Princess has been influential in informing subsequent studies. 41
37 Sweden Public Health Agency (2020f) recently produced estimates of the infection fatality rate in Stockholm for two age groups (ages 0-69 and 70+) using a novel methodology that links live virus tests, reported cases, and mortality outcomes. Given the markedly different methodology and the breadth of the two age groups, that study is not included in our meta-regression analysis, but it should be noted that their estimated IFR of 4.3% for ages 70+ is well aligned with the results of our meta-regression analysis.
38 See Havers et al. (2020) and Rosenberg et al. (2020), respectively.
39 See Gudbjartsson et al. (2020).

Benchmark Analysis
As shown in Figure 2, the results from a random-effects meta-regression of the logarithm of IFR on age reveal a clear exponential relationship between age and IFR. Note that this relationship holds across all ages, not just the elderly. Thus, the predicted IFR for a forty-year-old parent is far higher than for their ten-year-old child, even though both face relatively low absolute IFRs.
As noted in Section 2.4, one key issue for assuring the validity of our meta-regression method is confirming that the distribution of observations is consistent with a normal distribution, without any extreme outliers or clustering of observations. The validity of those assumptions is evident from Figure 2: the observations provide relatively even coverage of the age interval from 30 to 85 years, and nearly all of the observations fall within the 95 percent confidence interval. The only exception is the middle-aged group from the Geneva study, but even that observation is only slightly outside the confidence interval, as one might expect with one out of 21 observations for this meta-regression. 42
42 We have also used the output of the Stata metareg procedure to confirm that the estimated random effects are consistent with a normal distribution.

Figure 2: The Log-Linear Relationship between Age and IFR for COVID-19 (vertical axis: log(IFR))
The copyright holder for this preprint this version posted July 24, 2020. https://doi.org/10.1101/2020.07.23.20160895 doi: medRxiv preprint
Figure 3 depicts the exponential relationship between IFR and age using the transformed output of the Stata metareg procedure. The thick red line is the meta-regression's predicted age-specific IFR curve. The dark purple area is the meta-regression's 95% confidence interval. This means there is a 95% chance the true relationship between IFR and age is within this confidence interval. The wider, light purple area is the meta-regression's 95% prediction interval. Taking into account random variation across studies and locations, there should be a 95% chance that a subsequent study of age-specific IFR would land within that prediction interval. The prediction interval is much wider than the confidence interval, consistent with the inclusion of random effects in the meta-regression, i.e., there is substantial random variation across studies as well as age groups.
Figure 3 demonstrates a clear, exponential relationship between age and fatality risk. The disease poses a substantial mortality risk for middle-aged adults and even higher risks for elderly people: a 50-year-old faces a 0.14% chance of dying if infected, and that rate rises to 0.5% for a 60-year-old, 1.6% for a 70-year-old, and over 15% for people ages 85 and above.
In Figure 3, the second-oldest Spanish age group (70-80 years) appears at the high end of the prediction interval. This may be due to sampling error that underestimated the infection rate (resulting in an inflated IFR estimate). Spanish researchers succeeded in conducting a very large-scale seroprevalence study encompassing about 100,000 people, with associated challenges in maintaining consistency across locations and age groups. Thus, it may be plausible that the infection rate was underestimated among the elderly, thereby diminishing the denominator in our IFR calculations.
Finally, it is conceivable that elderly people in Spain might have a higher incidence of comorbidities than their peers in other European countries, a possibility that we revisit in Section 4.

Figure 3 (vertical axis: IFR, percent)
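To make the log-linear relationship concrete, the following minimal sketch fits a straight line to the logarithm of the three point predictions quoted above (0.14% at age 50, 0.5% at age 60, and 1.6% at age 70). It is an illustration only: it uses ordinary least squares on three rounded values, not the random-effects meta-regression produced by the Stata metareg procedure, and the names `predicted_ifr`, `decade_multiplier`, and `doubling_years` are introduced here for exposition.

```python
import math

# Point predictions quoted in the text: (age in years, IFR in percent).
quoted = [(50, 0.14), (60, 0.5), (70, 1.6)]

ages = [age for age, _ in quoted]
log_ifr = [math.log(ifr) for _, ifr in quoted]

# Ordinary least squares for ln(IFR) = intercept + slope * age.
# This is a simplification, not the paper's random-effects meta-regression.
n = len(quoted)
mean_age = sum(ages) / n
mean_log = sum(log_ifr) / n
slope = sum((a - mean_age) * (y - mean_log) for a, y in zip(ages, log_ifr)) / sum(
    (a - mean_age) ** 2 for a in ages
)
intercept = mean_log - slope * mean_age


def predicted_ifr(age):
    """IFR in percent implied by the fitted log-linear line."""
    return math.exp(intercept + slope * age)


# Factor by which the fitted IFR grows with each additional decade of age.
decade_multiplier = math.exp(10 * slope)
# Number of years of age over which the fitted IFR doubles.
doubling_years = math.log(2) / slope

print(f"IFR multiplier per decade: {decade_multiplier:.2f}")
print(f"IFR doubling interval: {doubling_years:.1f} years")
```

Under these assumptions the fitted IFR rises by a factor of roughly 3.4 per decade of age, which corresponds to a doubling about every six years. The line should not be extrapolated far outside the ages used to fit it.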

Comparing Benchmark Results to Other Studies
We now compare the benchmark meta-regression results with the age-specific IFRs for 12 other locations. In effect, this approach is comparable to "out-of-sample" exercises that statisticians commonly use in assessing the validity of a particular model. As shown in Figure 4, the observations from these twelve studies generally fit within the benchmark prediction intervals, broadly confirming the usefulness of the meta-regression in assessing age-specific IFRs beyond the four locations covered by the set of benchmark studies. Moreover, the outliers in this figure are mostly above the prediction interval rather than below it; that is, the unexplained variations outside the prediction interval tend to be observations with unusually high IFRs.

Figure 4 (vertical axis: IFR, percent)

We now consider the specific results for each group of studies:

Seroprevalence Studies
Every seroprevalence study in this group except for New York State was conducted using commercial lab blood samples. Connecticut's oldest age-group (the green triangle) appears far above the prediction interval. As discussed in Section 2.2, the Connecticut study may have underestimated prevalence; if the prevalence were revised upwards to align with the typical ratio of reported cases to infections, then this particular observation would likely fall within the prediction interval. Another potentially significant issue is whether Connecticut experienced a relatively high infection rate among residents of assisted care facilities, who would be particularly vulnerable due to elevated age as well as a higher incidence of co-morbidities.
By contrast, the observations for Missouri lie below the prediction interval. In this case, the ratio of estimated seroprevalence to reported cases was extraordinarily high (24:1), more than double the typical 10:1 ratio for other seroprevalence studies. 43 In effect, Missouri's seroprevalence might be significantly overstated, perhaps reflecting non-representative aspects of the sample data frame (i.e., specimens collected by two commercial labs). Moreover, Missouri has a relatively low estimated prevalence of 2.7% compared with most of the other studies in our meta-analysis, which raises concerns about the PPV of the antibody test and the possibility that false positives may have distorted the estimated prevalence.
Finally, these results suggest that Utah may indeed have a systematically different pattern of age-specific IFR compared to most other locations. As in Missouri, there are potential concerns about PPV for a location with relatively low estimated prevalence of about 2.2%. Unlike Missouri, however, the ratio of estimated infections to reported cases is about 11:1, similar to most other studies. One plausible factor is that a large fraction of Utah residents abstain from use of alcohol, tobacco, and narcotic drugs and hence may have a lower incidence of co-morbidities compared to most other locations in the United States, Europe, and East Asia. The bottom line is that Utah's remarkably low IFR should not simply be dismissed as an outlier; rather, further study of that location may yield significant insights that are applicable elsewhere.
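The concern about PPV at low prevalence can be illustrated with a short calculation. The sketch below applies the standard definition of positive predictive value to the estimated prevalence levels cited above for Utah (2.2%) and Missouri (2.7%). The sensitivity and specificity values are hypothetical placeholders chosen for illustration; they are not the characteristics of the test used in those studies.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive test results that are true positives."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, assumed purely for illustration.
ASSUMED_SENSITIVITY = 0.90
ASSUMED_SPECIFICITY = 0.98

# Prevalence levels for Utah and Missouri are taken from the text;
# the 10% case is a hypothetical higher-prevalence comparison.
cases = [("Utah", 0.022), ("Missouri", 0.027), ("hypothetical 10%", 0.10)]

for label, prevalence in cases:
    ppv = positive_predictive_value(
        prevalence, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY
    )
    print(f"{label}: prevalence {prevalence:.1%}, PPV {ppv:.0%}")
```

With these assumed test characteristics, only about half of the positive results at a prevalence of 2 to 3 percent would be true positives, compared with more than 80 percent at a prevalence of 10 percent.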

Comprehensive Tracking and Tracing Countries
The age-specific IFRs for the three countries that employed widespread testing and contact tracing to control the virus's spread all fit well within the benchmark's 95% prediction interval. Of the three, a seroprevalence study has been published for only one country, Iceland. The study found that Iceland's ratio of actual to reported cases was about 1.4x, enabling us to compute Iceland's age-specific IFRs reasonably reliably. 44 Both Iceland and New Zealand were able to control the virus fairly rapidly, and thus very few deaths occurred (10 in Iceland and 22 in New Zealand). As three of the deaths in Iceland were in healthcare workers, the death data could be greatly skewed by confounding factors such as viral load; excluding healthcare workers alters their IFR by 30%. Since each region's death counts were so low, little can be inferred from their age-specific IFRs.
43 Havers et al. (2020). See Table 3.
44 See Gudbjartsson et al. (2020).

Small-Scale Studies
It is remarkable how well the small-scale studies of Gangelt, Germany, and the Diamond Princess cruise ship fit the benchmark analysis once we account for delays in the timing and reporting of fatalities. As shown in Table 2 above, the number of deaths at the time of the initial infection study was only about half of the final death count a few weeks later.
These two locations also demonstrate why large sample sizes should be weighted much more heavily than small sample sizes. The benchmark's estimated IFR for a 55 year-old is about 0.5%. So, on average, 1 out of 200 infected 55 year-olds should die. However, in Gangelt only about 150 people aged 50-60 were infected, and on the Diamond Princess only 59 people in that age range were infected. It would be quite consistent with the benchmark finding if no people in that age group died at either location. For a virus that only kills a small percentage of patients, observers should be wary of basing their findings on small absolute death counts, since a few deaths can significantly change the implied IFR. For such small samples, the appropriate confidence intervals are too wide to draw many conclusions from. Iceland (10 deaths) and New Zealand (22 deaths) share this limitation.
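The small-sample point can be checked directly. The sketch below computes the probability of observing no deaths at all among the infected 50-to-60-year-olds in each location, using the 0.5% benchmark IFR and the infection counts cited above. It assumes that each infected person's outcome is independent and that the same IFR applies to everyone in the group, which is a simplification.

```python
def prob_no_deaths(n_infected, ifr):
    """Probability of zero deaths among n_infected people, assuming
    independent outcomes with a common fatality probability ifr."""
    return (1.0 - ifr) ** n_infected


BENCHMARK_IFR = 0.005  # 0.5%, i.e. 1 death per 200 infections (from the text)

# Infection counts for ages 50-60 as cited in the text.
locations = {"Gangelt": 150, "Diamond Princess": 59}

for name, n_infected in locations.items():
    expected_deaths = n_infected * BENCHMARK_IFR
    p_zero = prob_no_deaths(n_infected, BENCHMARK_IFR)
    print(f"{name}: expected deaths {expected_deaths:.2f}, "
          f"P(no deaths) = {p_zero:.2f}")
```

Under these assumptions the chance of zero deaths is a little under one-half in Gangelt and roughly three-quarters on the Diamond Princess, so an observed count of zero in either location would be unremarkable.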
Castiglione d'Adda's age-specific IFRs for individuals aged 65-85 are outliers above our prediction interval. This outlier status was likely due to the fact that the town was affected relatively severely and early by COVID-19 in the pandemic's first wave, and hospitals became overwhelmed, leading to rationing of medical care. It should not be surprising that the region's IFR for people aged 85 years and older falls within our benchmark estimation; at such high ages the disease is so dangerous that medical care may not influence mortality to any significant degree. The relatively high IFR for the next age cohort (65 to 85 years) could also reflect a higher incidence of co-morbidities compared to similar cohorts in other locations. Higher IFRs might also reflect dense multi-generational urban housing that resulted in increased viral load for elderly people. In sum, the extraordinarily high IFR for individuals ages 65-85 years plays a key role in explaining Castiglione d'Adda's overall IFR of about 5%.
Finally, while not included in our formal meta-analysis, it should be noted that the pathbreaking study of  is broadly consistent with our findings. That study was completed at an early stage of the COVID-19 pandemic, drawing on data from expatriation flights to estimate infection rates in Wuhan and then computing age-specific IFRs based on reported fatalities in Wuhan. As in our meta-regression results, the IFR estimates in that study increase exponentially as a function of age, with rates near zero for ages 0-39 and far higher rates for older adults.

Discussion
While age and fatality risk are closely related, it is very unlikely that age alone explains differences in IFR across regions and populations. The remaining variation may be explained by either comorbidities or other population demographics. Section 4.1 discusses the effect of comorbidities on fatality risk. Section 4.2 discusses how other demographic and socioeconomic factors may explain variations in IFR across locations.

Comorbidities
Researchers have debated whether certain conditions predispose patients to have more severe cases of COVID-19. A recent study in New York City reported the incidence of several chronic medical conditions in hospitalized COVID-19 patients; the median age of the patients in that sample was 59 years. 45 Table 4 compares those results to the average incidence of comorbidities among New York City residents ages 50 years and above, that is, the relevant age group for comparison with the sample of hospitalized patients. 46 Most comorbidities, including hypertension and diabetes, do not explain which infected individuals are likely to require hospitalization for COVID-19. The only striking positive comorbidity was obesity. However, in NYC, obesity is also much more prevalent among low-income groups who are more likely to live in densely populated neighborhoods and to work in high-exposure jobs. Thus, it is quite possible that NYC's obese population has a higher infection rate, which would explain the higher hospitalization rate for obese individuals. A more detailed version of Table 4 can be found in Appendix D.
45 See Richardson et al. (2020).
46 New York City does not publish data on the incidence of x and y, and hence Table 4 uses U.S. data to gauge the prevalence of those comorbidities.
A recent U.K. study of hospitalized COVID-19 patients analyzed the impact of comorbidities on the risk of a fatal outcome. 47 Table 5 shows the most relevant portion of their results. Evidently, age influences the estimated hazard ratio far more than any specific comorbidity. In effect, comorbidities appear to be a scaling factor that has a noticeable impact on the hazard ratio compared to a healthy peer with no comorbidities, but that impact is much smaller than the impact of higher age. For example, the fatality risk for an obese 40-year-old hospital patient is moderately higher than for a non-obese individual of the same cohort but far lower than (less than one-tenth of) the fatality risk for a non-obese 75-year-old hospital patient.
Both of these studies suggest that differences in comorbidity across geographical locations are likely to be a relatively modest factor in explaining the dispersion in overall IFRs for COVID-19, especially compared to the very strong link between IFR and age.
47 See Doherty et al. (2020).
Figure 5.

Demographic and Socioeconomic Factors
Our results indicate that age is a crucial determinant of infection fatality rates for COVID-19. However, our meta-analysis has not directly considered the extent to which IFRs may vary with other demographic factors, including race and ethnicity. Fortunately, some valuable insights can be garnered from other recent studies. In particular, one recent seroprevalence study of residents of two urban locations in Louisiana found no significant difference in IFRs between whites and Blacks. 48 Nonetheless, the incidence of COVID-19 mortality among people of color is extraordinarily high due to markedly different infection rates that reflect systematic racial and ethnic disparities in housing and employment. For example, a recent infection study of a San Francisco neighborhood found that 80% of positive cases were Latinx, far higher than the proportion of Latinx residents in that neighborhood. 49 That study concluded as follows: "Risk factors for recent infection were Latinx ethnicity, inability to shelter-in-place and maintain income, frontline service work, unemployment, and household income less than $50,000 per year." Recent CDC analysis has reached similar conclusions, attributing elevated infection rates among Blacks and Hispanics to dense housing of multi-generational families, increased employment in high-contact service jobs, high incidence of chronic health conditions, and lower quality of health care. 50 In summary, while the present study has investigated the effects of age on the IFR of COVID-19, further research needs to be done on how infection and fatality rates for this disease are affected by demographic and socioeconomic factors.

Conclusion
Age and fatality risk for COVID-19 are exponentially related. In non-technical terms, COVID-19 poses a very low risk for children and younger adults but is hazardous for middle-aged adults and extremely dangerous for elderly people. Table 6 contextualizes these risks by comparing the age-specific IFRs from our meta-regression analysis to the annualized risk of a fatal auto accident or other accidental injury. For the youngest age groups, the risk from COVID-19 is broadly comparable to those everyday activities. By contrast, for adults ages 55 to 64, the COVID-19 fatality risk is roughly 50 times greater than the risk of driving a car, and that hazard ratio rises to about 1000:1 for people ages 65 to 74. Moreover, as discussed in Section 4, comorbidities have only modest effects on these risks; that is, being in good health does not necessarily ensure that a middle-aged or older adult will survive a COVID-19 infection.
Our analysis facilitates comparisons between the COVID-19 pandemic and the Spanish Flu pandemic of 1918-20. The U.S. CDC estimates that about 28 percent of the U.S. population was infected by the Spanish Flu and that the death toll was about 675,000. However, that disease was most dangerous for young adults, with an IFR of about 4 percent for people ages 20 to 40 years old, who comprised roughly one-third of the U.S. population at that time. By contrast, COVID-19 is far more dangerous for middle-aged and older adults, whereas the Spanish Flu caused relatively few deaths among those age groups.
Our meta-regression analysis also confirms that COVID-19 is far more deadly than seasonal flu, especially for older adults and elderly people. For example, the U.S. CDC estimates that during winter 2018-19 influenza was associated with about 50 million infections and 34,000 fatalities, that is, an overall IFR of about 0.07 percent. By comparison, recent seroprevalence data from U.S. public health laboratories indicates that COVID-19 had infected more than 20 million people by the third week of June, that is, about 6.4% of the U.S. population. 51 Cumulative U.S. fatalities reached nearly 150,000 as of July 21 (four weeks after the date of the seroprevalence data, appropriately reflecting time lags as discussed in Section 2). These figures indicate that the overall IFR of COVID-19 is currently about 0.7%, in line with recent guidance from the CDC. 52 That IFR indicates that COVID-19 is roughly ten times more deadly than the seasonal flu.
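The comparison above rests on simple ratios of deaths to infections. The following sketch reproduces that arithmetic from the rounded figures quoted in the text; because the text describes infections as "more than 20 million" and fatalities as "nearly 150,000", the results are approximate and the COVID-19 figure is, if anything, slightly overstated by the rounding.

```python
def ifr_percent(deaths, infections):
    """Infection fatality rate expressed as a percentage."""
    return 100.0 * deaths / infections


# Rounded figures quoted in the text.
flu_ifr = ifr_percent(deaths=34_000, infections=50_000_000)
covid_ifr = ifr_percent(deaths=150_000, infections=20_000_000)

# How many times more deadly COVID-19 is than seasonal flu, per infection.
relative_severity = covid_ifr / flu_ifr

print(f"Seasonal flu IFR (2018-19): {flu_ifr:.3f}%")
print(f"COVID-19 IFR (mid-2020):    {covid_ifr:.2f}%")
print(f"Ratio: {relative_severity:.0f}x")
```

The rounded inputs give an influenza IFR of about 0.07 percent, a COVID-19 IFR of roughly 0.7 to 0.75 percent, and a ratio on the order of ten.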
Nonetheless, the current level of the overall U.S. IFR should not be interpreted as a fixed parameter. Rather, our meta-analysis clearly underscores the rationale for public health measures and communications aimed at reducing the aggregate IFR by mitigating the incidence of new COVID-19 infections among middle-aged and older adults. 53 To illustrate these considerations, Table 7 outlines three alternative scenarios for the U.S. trajectory of COVID-19 infections and fatalities. All three scenarios assume that the infection rate continues rising to a plateau of 28%, matching the U.S. prevalence of the Spanish Flu. However, the age-specific infection rates vary markedly across the three scenarios:
• Scenario #1 assumes that age-specific prevalence will remain similar to the average pattern that has prevailed to date, as indicated by seroprevalence data from U.S. public health laboratories.
• Scenario #2 assumes that the prevalence will eventually become uniform across all age groups, similar to the Spanish Flu pandemic.
• Scenario #3 assumes that public health measures and communications will restrain the incidence of new infections among middle-aged and older adults while prevalence continues rising rapidly among children and younger adults.
51 See U.S. CDC (2020c). Seroprevalence estimates are reported in the U.S. CDC's Weekly COVID Surveillance Summary, based on data collected by 85 state and local public health laboratories spanning the entire country. These reports include age-specific seroprevalence but no details regarding sample selection, test characteristics, or confidence intervals and hence could not be used in our meta-regression analysis.
To assess the implications of these three alternative assumptions, we use the age-specific IFRs from our meta-regression analysis to determine the death toll for each age group as follows:
Scenario #1 shows that, if the current age-specific infection pattern continues until 28% of the U.S. is infected, deaths will increase by a factor of 3 to around 450,000. The outcome is far worse in Scenario #2, where the virus spreads uniformly across age groups and causes nearly one million deaths. In contrast, Scenario #3 is associated with a far lower proportion of older adults contracting the virus; thus, most of the growing prevalence occurs among children and younger adults, and the death toll is held to about 240,000.
This scenario analysis underscores the possibility that the United States could plausibly end up with an overall infection fatality rate of about 1%, similar to the outcome in New York City (as shown in Table 1). Such an outcome would be associated with a total death toll of nearly 1 million people. Simply maintaining the current pattern of prevalence across age groups would be somewhat less catastrophic, but still result in nearly 500,000 deaths. By comparison, policy measures and communications that protect vulnerable age groups could halve the current overall IFR to around 0.3%, and prevent over 200,000 deaths.
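The scenario totals can be cross-checked against the overall IFRs they imply. The sketch below does so using the 28% plateau and the death tolls quoted above. The U.S. population figure of 328 million is an assumption introduced here (it is roughly consistent with 20 million infections representing about 6.4% of the population); the age-group detail behind Table 7 is not reproduced.

```python
US_POPULATION = 328_000_000  # assumption, not stated in the text
PLATEAU_INFECTION_RATE = 0.28  # plateau assumed in all three scenarios

total_infected = US_POPULATION * PLATEAU_INFECTION_RATE

# Approximate death tolls quoted in the text for Scenarios #1 and #3.
scenario_deaths = {"scenario_1": 450_000, "scenario_3": 240_000}

# Overall IFR (percent) implied by each death toll at the 28% plateau.
implied_ifr = {
    name: 100.0 * deaths / total_infected
    for name, deaths in scenario_deaths.items()
}

# Death toll implied by an overall IFR of 1% (the Scenario #2 outcome).
deaths_at_one_percent = 0.01 * total_infected

# Deaths avoided by moving from the Scenario #1 pattern to Scenario #3.
deaths_averted = scenario_deaths["scenario_1"] - scenario_deaths["scenario_3"]

for name, ifr in implied_ifr.items():
    print(f"{name}: implied overall IFR {ifr:.2f}%")
print(f"Deaths at 1% overall IFR: {deaths_at_one_percent:,.0f}")
print(f"Deaths averted (Scenario #1 vs #3): {deaths_averted:,}")
```

Under the stated population assumption, the quoted figures are mutually consistent: an overall IFR of 1% corresponds to a little over 900,000 deaths, Scenario #3 corresponds to an overall IFR between 0.25% and 0.3%, and the gap between Scenarios #1 and #3 is 210,000 deaths.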
54 See Molenberghs et al. (2020), Table 6. This study used the seroprevalence findings of Herzog et al. (2020).
55 See Perez-Saez et al. (2020), Table S2. This study used the seroprevalence findings of Stringhini et al. (2020).
56 Age-specific IFRs were constructed using the seroprevalence findings of Pollán et al. (2020), Table S7 (both tests positive) and excess mortality data for Week 25 reported by Spain National Institute of Statistics (2020).
57 See Sweden Public Health Authority (2020a,b,c,d,e) for information about the seroprevalence program design, antibody test standards, results for weeks 18 to 21, and COVID-19 fatalities as of week 24, respectively.

A.5 Studies Excluded Due to Transparency Concerns

Location
Transparency Concern

Denmark 75
The seroprevalence test kit used in this study was subsequently withdrawn from the market by the manufacturer due to reliability concerns. 76

Indiana 77
Indiana University issued two press releases regarding initial findings, but as of July 15, no report had been issued regarding the test procedure, sampling data frame, or construction of test-adjusted confidence intervals.

Slovenia 78
Public officials issued a press release regarding initial findings, but as of July 15, no report had been issued regarding the test procedure, sampling data frame, or construction of test-adjusted confidence intervals.

(a) Studies of Blood Donors
Rationale: Prior research has shown that blood donors tend to be younger and healthier than the general population, with very few blood donors over age 60. Moreover, individuals who donate blood during a pandemic may be more gregarious and less risk-averse than non-donors. Recent U.K. seroprevalence studies indicated that English blood donors had a COVID-19 infection rate of 7.9% compared to a rate of 5.4% for a random sample of the English population.
Thompson et al. (2020).
84 See Ng et al. (2020).
85 See Emmenegger et al. (2020); this study covers two distinct sets of samples, one of which is from blood donors.

(b) Hospitals or Outpatient Clinics without Screening of COVID-Related Cases
Rationale: A substantial fraction of individuals seeking health care at a hospital or outpatient clinic may have symptoms related to prior exposure to COVID-19 and hence exhibit a higher prevalence of positive test results compared to a random sample of the general population.

Location
Description

Brooklyn, NY 86
This study used samples from an outpatient clinic and yielded a much higher infection rate than other seroprevalence studies of the New York metropolitan area.

Kobe, Japan 87
This study tested for IgG antibodies in 1,000 specimens from an outpatient clinic and found 33 positive cases. However, the study did not screen out samples from patients who were seeking treatment for COVID-related symptoms. Moreover, the study reported raw prevalence and confidence interval but did not report statistics adjusted for test characteristics. The manufacturer (ADS Biotec / Kurabo Japan) has indicated that this test has specificity of 100%, based on a sample of 14 pre-COVID specimens, but that specificity has not been evaluated by any independent study. If the true specificity is 98%, then the adjusted prevalence would not be significant.
The authors concluded by noting the selection bias and recommended that "further serological studies targeting randomly selected people in Kobe City could clarify this potential limitation."

Tokyo, Japan 88
The authors of this study specifically cautioned against interpreting their results as representative of the general population. In particular, the sample of 1,071 participants included 175 healthcare workers, 332 individuals who had experienced a fever in the past four months, 45 individuals who had previously taken a PCR test, and 9 people living with a COVID-positive cohabitant. The study obtained a raw infection rate of 3.8%, but the rate is only 0.8% if those subgroups are excluded.

Zurich, Switzerland 89
This study analyzed two distinct sets of samples: (i) blood donors and (ii) hospital patients. Nearly all blood donors were ages 20 to 55, so that sample is not useful for assessing age-specific IFRs for older adults. The sample of hospital patients was not screened to eliminate cases directly related to COVID-19, so that sample may not be representative of the broader population. Moreover, inhabitants of the city of Zurich constituted a relatively large fraction of seropositive results compared to residents from the remainder of the canton of Zurich (which is predominantly rural).
The study computes an overall IFR of 0.5%, similar to that of Geneva.
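The concern about test specificity raised in the Kobe entry above can be quantified with a short calculation. The sketch below computes an exact (Clopper-Pearson) lower confidence bound on specificity when all 14 pre-COVID specimens test negative, and the number of false positives that a specificity of 98% would generate among 1,000 specimens. The second quantity assumes that nearly all specimens come from uninfected people; the function name is introduced here for illustration.

```python
def specificity_lower_bound_all_negative(n_specimens, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided lower confidence limit for
    specificity when every one of n_specimens known-negative samples
    tests negative. In that special case the limit has the closed
    form (alpha / 2) ** (1 / n)."""
    return (alpha / 2.0) ** (1.0 / n_specimens)


# 14 pre-COVID specimens, all negative (from the text).
lower_bound = specificity_lower_bound_all_negative(14)

# False positives expected among 1,000 specimens at 98% specificity,
# assuming (as a simplification) that nearly all are truly negative.
SAMPLE_SIZE = 1_000
expected_false_positives = SAMPLE_SIZE * (1.0 - 0.98)

print(f"95% lower bound on specificity: {lower_bound:.1%}")
print(f"Expected false positives at 98% specificity: "
      f"{expected_false_positives:.0f} (observed positives: 33)")
```

A validation sample of 14 specimens is compatible with a specificity well below 98%, and at 98% specificity roughly 20 of the 33 observed positives could be false positives. This is why a manufacturer-reported specificity of 100% on so small a sample offers little reassurance.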

(c) Active Recruitment of Participants
Rationale: With active recruitment, a substantial fraction of the sample may be comprised of individuals who are aware of or concerned about prior exposure to COVID-19 and hence exhibit a higher prevalence of infections compared to a random sample of the general population.

Luxembourg 90
Of the 35 participants who tested positive, 19 had previously interacted with a person who was known to be infected or had a prior test for SARS-CoV-2.

Boise, Idaho 91
This study was promoted during a "Crush the Curve" publicity campaign and required participants to sign up for a test.

Santa Clara, CA 70
Participants were recruited via social media and needed to drive to the testing site. Stanford Medicine subsequently released a statement indicating that the study was under review due to concerns about potential biases. 92

Frankfurt, Germany 93
This study was conducted at an industrial worksite. Among the 5 seropositive participants, 3 had prior positive tests or direct contact with a known positive case.

Location
Description

Oise, France 94
This sample of 1,340 participants included elementary school teachers, pupils, and their families. Only two individuals in the sample were ages 65 years and above.

Saxony, Germany 95
This study analyzed specimen samples from students and teachers at thirteen secondary schools in eastern Saxony and found very low seroprevalence (0.6%).

A.7: Studies Excluded Due to Accelerating Outbreak 96
Rationale: As discussed in Section 2, if a seroprevalence study takes place in the midst of an accelerating outbreak, then there is no precise way to determine which of the subsequent fatalities resulted from new infections vs. infections prior to the date of the study.

95 See Armann et al. (2020).
96 Studies on expatriate flights and Japanese evacuees from Wuhan also occurred extremely early in the outbreak, and infections and deaths in their respective populations rose exponentially following the initial study.
97 See Sood et al. (2020).
98 See Havers et al. (2020).
99 See Bendavid et al. (2020).
100 See Thompson et al. (2020).

Excluded Studies (see Section 2.2 and
This study analyzed 885 laboratory specimens from outpatient clinics for the period May 15-27 and found only four positive cases (0.6%). This sample is not well-suited for assessing age-specific prevalence or age-specific IFRs.

Czech Republic 102
The Czech Ministry of Health conducted a large-scale seroprevalence survey on April 23-May 1, collecting specimens from a random sample of 22,316 residents and testing for IgG antibodies using the Wantai test kit. Only 107 positive cases were identified (raw prevalence = 0.4%), and hence the test-adjusted confidence intervals include the lower bound of zero prevalence. That result is consistent with the very low number of reported cases in the Czech Republic as of early May; for example, Prague had only 1,638 reported cases for a population of 1.3 million.
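
To illustrate why a raw prevalence this low is compatible with zero true prevalence, the following sketch applies the standard Rogan-Gladen adjustment for test sensitivity and specificity. The sensitivity and specificity values are placeholders chosen for illustration, not the validated characteristics of the Wantai kit, and this is not necessarily the adjustment method used in the Czech survey.

```python
def rogan_gladen(raw_prevalence: float, sensitivity: float, specificity: float) -> float:
    """Test-adjusted prevalence, truncated to the interval [0, 1]."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (raw_prevalence + specificity - 1.0) / denom
    return min(max(adjusted, 0.0), 1.0)


if __name__ == "__main__":
    raw = 107 / 22316  # positives and sample size reported in the text
    # Hypothetical test characteristics (assumptions, not from the study).
    for spec in (1.000, 0.997, 0.995):
        adj = rogan_gladen(raw, sensitivity=0.95, specificity=spec)
        print(f"specificity={spec:.3f}: adjusted prevalence = {adj:.3%}")
```

Once the assumed false-positive rate (one minus specificity) reaches the raw prevalence, the adjusted estimate is truncated at zero, which is why small uncertainty about specificity dominates the result in low-prevalence settings.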

Expatriate Flights 103
This study performed PCR tests on 689 individuals expatriated from Wuhan, China on six international flights during January 31-February 2. There were six positive tests (raw prevalence = 0.87%), but assessment of age-specific prevalence or IFRs is not feasible given the sample size, low prevalence, and lack of case outcomes.

Japanese Evacuees 104
This study performed PCR tests on 565 Japanese citizens expatriated from Wuhan, China. There were eight positive tests, indicating a raw prevalence of 1.4%, but assessment of age-specific prevalence or IFRs is not feasible given the small sample, low prevalence, and lack of data on case outcomes.
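
The imprecision of these raw prevalence figures can be gauged with a short sketch that computes a Wilson score interval for a binomial proportion, using the counts reported for the two Wuhan evacuation studies. The Wilson interval is used here purely for illustration; the source studies may have reported intervals based on a different method.

```python
import math


def wilson_interval(positives: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0 or not 0 <= positives <= n:
        raise ValueError("need n > 0 and 0 <= positives <= n")
    p_hat = positives / n
    denom = 1.0 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Counts taken from the text: expatriate flights and Japanese evacuees.
    studies = [("Expatriate flights", 6, 689), ("Japanese evacuees", 8, 565)]
    for label, positives, n in studies:
        lo, hi = wilson_interval(positives, n)
        print(f"{label}: {positives}/{n} = {positives / n:.2%} "
              f"(approx. 95% CI {lo:.2%} to {hi:.2%})")
```

With fewer than ten positives in total, the interval for overall prevalence is already wide relative to the point estimate, and splitting the sample into age bands would leave most bands with zero or one positive case.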

Jersey (U.K.) 105
This study collected samples from 629 households comprising 1,062 individuals and estimated seroprevalence at 4.2% (CI 2.9 to 5.5%), indicating that about 3,300 Jersey residents have been infected. Jersey has had 30 COVID-19 fatalities (as of July 15), and hence the overall IFR is about 1% (similar to that of NYC). However, the seroprevalence sample is too small to facilitate accurate assessments of age-specific IFRs; for ages 55+, there were 258 samples and 12 positive cases.

New Orleans, LA 106
This study analyzed a random sample of 2,640 participants and obtained a seroprevalence estimate of 6.86% and an IFR of 1.63% (CI 1.53 to 1.74%). The study reported race-specific results but not age-specific seroprevalence or IFRs.
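
The overall IFR quoted for Jersey is simple arithmetic, and the same calculation shows how the seroprevalence confidence interval propagates into the IFR. In this sketch the population figure is back-calculated from the numbers in the text (about 3,300 infections at 4.2% seroprevalence) and is therefore an assumption, as is treating the death count as exact.

```python
def infection_fatality_rate(deaths: int, infections: float) -> float:
    """IFR = deaths / total infections."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    return deaths / infections


if __name__ == "__main__":
    deaths = 30  # Jersey fatalities as of July 15 (from the text)
    point, lower, upper = 0.042, 0.029, 0.055  # seroprevalence and CI (from the text)
    population = 3300 / point  # assumption: implied by ~3,300 infections at 4.2%

    print(f"point IFR: {infection_fatality_rate(deaths, point * population):.2%}")
    # Higher seroprevalence implies more infections and hence a lower IFR.
    print(f"IFR range: {infection_fatality_rate(deaths, upper * population):.2%}"
          f" to {infection_fatality_rate(deaths, lower * population):.2%}")
```

The resulting range reflects only sampling uncertainty in seroprevalence; it does not account for test characteristics or for the lag between infection and death.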

Mount Sinai Hospital, New York City 107
This study analyzed seroprevalence using specimens from four groups of patients (Cardiology, OB/GYN, Oncology, and Surgery) starting in mid-February. For the final week of the study (April 19), positive results were obtained for 47 of 243 patients; that seroprevalence estimate of 19.3% is well-aligned with the results of the New York Department of Health study. However, the sample size of this cohort is too small for assessing age-specific IFRs.

Neustadt-am-Rennsteig, Germany 108
This study analyzed seroprevalence of 626 residents (71% of the population of this municipality) and estimated seroprevalence of 8.4% (52 positive cases). However, this sample size is too small for assessing age-specific IFRs.

District, CA 109
This study analyzed active infections and seroprevalence of 3,953 residents in a densely populated majority Latinx neighborhood in downtown San Francisco. Positive seroprevalence in older adults was very low (22 out of 3,953) and hence too small for assessing age-specific IFRs.

San Miguel County, CO 110
The San Miguel County Health Department assessed seroprevalence in March and April using samples from 5,283 participants (66% of county residents). Raw prevalence was very low (0.53%), with only 3 confirmed positive results for adults ages 60 years and above.

Vo, Italy 111
Vo' is a municipality of 3,300 people, nearly all of whom (87%) participated in an infection survey in late February. However, there were only 54 infections among people ages 50+, so assessing age-specific IFRs is not feasible.
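
To see why 54 infections are too few, consider the probability of observing no deaths at all among them under a given IFR. The sketch below treats deaths as independent outcomes and uses the benchmark IFR estimates from the abstract purely as illustrative inputs; applying a single age band's IFR to all 54 infections is a simplifying assumption, since the text does not break those infections down by age.

```python
def prob_zero_deaths(n_infections: int, ifr: float) -> float:
    """Probability that none of n independent infections is fatal."""
    if n_infections < 0:
        raise ValueError("n_infections must be non-negative")
    if not 0.0 <= ifr <= 1.0:
        raise ValueError("ifr must be between 0 and 1")
    return (1.0 - ifr) ** n_infections


if __name__ == "__main__":
    n = 54  # infections among ages 50+ in Vo' (from the text)
    # Illustrative IFRs: benchmark estimates for ages 50-59, 60-69, and 70-79.
    for ifr in (0.003, 0.01, 0.04):
        expected = n * ifr
        print(f"IFR={ifr:.1%}: expected deaths={expected:.2f}, "
              f"P(no deaths)={prob_zero_deaths(n, ifr):.0%}")
```

Under these assumptions, observing zero deaths would be unremarkable for IFRs that differ by an order of magnitude, so a sample of this size cannot discriminate between them.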
medRxiv preprint doi: https://doi.org/10.1101/2020.07.23.20160895; this version posted July 24, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

Appendix C: Comparison of Seroprevalence vs. Reported Cases in Iceland
Age Group
Confidence Interval