The title of this chapter is not a rhetorical question. It is a serious question about what kinds of experiments influenza virologists should undertake. It is a privilege to be asked to address it in a laboratory manual that is primarily devoted to compiling protocols from leading virologists on how to perform experiments, once the scientific objective is selected.

This chapter expresses a view contrary to that expressed by many leading influenza virologists. A few have argued that in the right hands, there are no exceptionally dangerous gain-of-function experiments in influenza. They argue that biosafety and biosecurity can be assured for such experiments when they are performed in the right laboratories, to a degree that the danger is not exceptional [1,2,3]. Others have argued that certain experiments likely to produce potential pandemic pathogens (PPPs), defined as infectious agents capable of high virulence and efficient transmission between humans, are exceptionally valuable to basic science or to public health and are therefore worth whatever risk they may entail [1, 4, 5].

This chapter will make the case that, in most instances, both views are mistaken. Because this book is about virology methods, rather than about biosafety and biosecurity, the chapter will attend briefly to the risk side of the equation and will focus mainly on the question of scientific and public health value of these experiments. The argument will acknowledge that there are certain scientific questions that can be answered only by gain-of-function experiments and among these a subset that can be answered only by gain-of-function experiments that are expected to create PPPs [1, 4]. However, it will attempt to convince the reader that the additional scientific value of this so-called gain-of-function research of concern (GoFRoC, in the elegant parlance of the US government) is relatively modest compared to what can be learned from GoF experiments that do not create PPPs, combined with other approaches to experimental and observational influenza studies [6, 7]. Last, it will make the claim that the value of this small class of GoFRoC experiments for public health is not, as has been claimed, exceptionally high but indeed is incremental at best and will not, in the foreseeable future, substantially improve the evidence base for public health decision-making [6].

For the purposes of this chapter, I will use interchangeably the phrases “exceptionally dangerous gain-of-function experiments”; “gain-of-function research of concern (GoFRoC),” as defined by the US National Science Advisory Board on Biosecurity; and “research likely to create PPP” as used in past writings by various authors. Specifically, these refer to experiments that are reasonably anticipated to create novel viral strains that combine high virulence for humans with high transmissibility in humans, especially when it is expected that the strain is likely to be unaffected by existing immunity in the general population due to antigenic novelty. Ferret passage of modified influenza A H5N1 viruses to select for droplet transmissibility [8, 9] is a paradigm case of such research. More generally, such studies (Fig. 1) typically start with a particular virus that does not have the phenotype of interest: to date, this has usually been droplet transmissibility between individuals of some mammalian species. Manipulations are performed to introduce changes that may confer such transmissibility. Such manipulations may involve (1) introducing (e.g., via reverse genetics) mutations expected to contribute to the phenotype [8], (2) providing novel genetic material by coinfecting cells or animals with another strain and permitting reassortment [10,11,12], or (3) simply waiting for spontaneous mutations to occur [8, 9, 13]. After performing one or more of these manipulations, the next step is to exert selection for the phenotype (say by placing an infected animal near but physically separated from an uninfected one and looking for viruses in the second animal that have moved via droplets from the first). The processes of introducing variation and selecting for the phenotype may be repeated one or more times and usually stop when the phenotype is identified.
Different variations of these procedures have been used by several laboratories to enhance transmissibility of influenza A in various animal models [8,9,10,11,12].

Fig. 1 Schematic of the design of gain-of-function experiments involving selection for transmissibility in a mammalian host, starting from an avian-adapted virus obtained from an avian host or human zoonotic case

1 A Brief History of the Debate

Perhaps the first experimental effort to create a PPP in the laboratory was in fact by a somewhat different means: recreation of a strain of influenza H1N1 from 1918 by Tumpey and colleagues based on synthesizing nucleic acid sequences obtained from partially preserved viral RNA in frozen corpses from 1918 and then creating infectious virus by reverse genetics [14]. The question of whether it was wise to construct a virus that was historically associated with the worst pandemic in modern history and somewhat different from any virus currently circulating was raised, but the debate seems to have been internal to the US Department of Health and Human Services (HHS), particularly the National Institutes of Health (NIH), and it was judged that the work should proceed [14].

In 2004, as human cases of influenza H5N1 spilled over from avian reservoir hosts, a debate began about the desirability of experiments to test whether such viruses could reassort with human seasonal influenza viruses. Some argued that such experiments were essential and safe, while others argued for a global review of such studies, but apart from a news article in Science, there was little further public discussion [15].

Experiments to assess the potential for reassortment between subtypes of influenza enzootic in birds, such as H9N2 [10] and H5N1 [16, 17], continued with little attention from outside the field. Then, at a conference in Malta in September 2011, Prof. Ron Fouchier presented data from experiments in which his laboratory had modified a human isolate of H5N1 avian-origin influenza to acquire some mutations expected to adapt it to human-to-human transmission and then introduced the resulting virus into ferrets. Through serial passage in ferrets they had obtained a virus capable of small-droplet transmission from one infected ferret to another [3, 8]. Given the close resemblance of ferrets to humans in many aspects of influenza biology and pathology, this was seen by many (though not all [18]) as a proxy for human-to-human small-droplet transmission. Soon after, the laboratory of Prof. Yoshihiro Kawaoka reported a related set of experiments, this time using a virus created by reverse genetics from a human H1N1 virus and the hemagglutinin gene of a zoonotic H5N1 isolate [9]. A major controversy erupted over whether these experiments should be published. The controversy focused on the biosecurity issue of whether knowledge of the mutations required to render H5N1 influenza transmissible in ferrets (and possibly humans) would abet potential bioterrorists by providing them a “recipe” for a potential pandemic pathogen [19]. Despite strong objections from some, the US government, which had funded both sets of studies, recommended that they be published in full, as they soon were [8, 9]. Following this outcry, influenza virologists briefly announced [20, 21], then lifted [22, 23], a self-imposed moratorium on such experiments.

Concomitant with the publication of the two controversial studies, some colleagues raised concerns about a different risk: that of a laboratory accident that could lead to the release of a pathogen that, by design, combined high virulence and antigenic novelty (characteristic of the starting H5N1 strain) with high human-to-human transmissibility (a property selected for by proxy in the ferret experiments) [24,25,26]. These early critiques also questioned whether the scientific and public health value of such research justified the risks involved. Klotz and Sylvester coined the term “potential pandemic pathogen (PPP)” for such viruses [25]. The first major discussion meeting on this topic, to my knowledge, was held at the Royal Society of London in 2012 (https://royalsociety.org/science-events-and-lectures/2012/viruses/), with mainly UK and North American speakers.

Following the revelations of a number of laboratory mishaps involving mishandling and in some cases possible human exposures to potentially lethal (but not highly transmissible) pathogens at high-containment federal laboratories in the United States, criticism of experiments that created PPP began to intensify. A meeting that I co-organized with colleagues in July 2014 led to the creation of the Cambridge Working Group, which issued a statement calling for such work to be “curtailed” pending a formal risk-benefit assessment (http://www.cambridgeworkinggroup.org/). In response, a countervailing group called Scientists for Science (http://www.scientistsforscience.org/) formed in opposition to this call. Each group garnered the support of prominent scientists and others. In October 2014, citing the laboratory mishaps and the exceptional hazard that release of a potential pandemic pathogen might cause, the US White House Office of Science and Technology Policy (OSTP) and the National Institutes of Health announced a “pause” in federal funding of certain categories of research and a deliberative process that would culminate in a formal risk-benefit assessment, spearheaded by the National Science Advisory Board on Biosecurity (NSABB) (https://obamawhitehouse.archives.gov/blog/2014/10/17/doing-diligence-assess-risks-and-benefits-life-sciences-gain-function-research). This pause initially affected about two dozen existing grants from the NIH concerning influenza and coronaviruses [27], but it was soon narrowed to cover only about half that number (http://www.sciencemag.org/news/2014/12/moratorium-risky-experiments-lifted-mers-mouse-studies), out of an NIH portfolio of about 250 active research grants at the time on these two groups of viruses.

Following the funding pause, two major discussion meetings were held by the US National Academy of Sciences [28, 29], multiple meetings were held by the NSABB (at the time of this writing, no link to the records of these meetings was available from https://osp.od.nih.gov/biotechnology/national-science-advisory-board-for-biosecurity-nsabb/), and a risk-benefit assessment commissioned by the NSABB was completed by a private consulting firm, Gryphon Scientific (http://www.gryphonscientific.com/wp-content/uploads/2016/04/Risk-and-Benefit-Analysis-of-Gain-of-Function-Research-Final-Report.pdf). Many scientists, a few ethicists, and others debated the scientific and public health rationale for PPP experiments, the risks they posed, and the ethics of doing research that poses potentially major risks to persons not involved in the studies and even unaware of them. This process raised awareness of many issues that had not been previously highlighted, notably the lack of a framework for evaluating risks of research to persons who are not research participants [30, 31], the very poor availability of data on biosafety in biological laboratories in the United States and elsewhere [32], and the consequent uncertainty in risk-benefit calculations.

In January 2017, OSTP announced that policy guidance for gain-of-function work was being released (https://www.phe.gov/s3/dualuse/Pages/GainOfFunction.aspx), and at the time of writing, there has been no final resolution of the issue.

This account has been focused on the US debate, with which I was most directly involved. A parallel process played out in Europe, slightly faster than in the United States, and with little net change in the conditions for such research to proceed. China, also an important player in PPP research, did not to my knowledge have a formal debate on the wisdom of such work, but this may reflect inadequate coverage in the Western press.

2 Making Choices About What Science to Do and How to Do It

The majority of this book is intended to facilitate the technical aspects of cutting-edge influenza virology by providing protocols and techniques for performing such experiments in the laboratory. This chapter, by contrast, attempts to address the (conceptually prior) question of which kinds of experiments one should undertake. Throughout a career in experimental biology, one makes choices about which experiments to perform—and which not to perform—on many different time scales. The choice of which field of biology to enter may have consequences for decades or an entire career. The choice of what project to select (from among those available in graduate school) or apply for grant support for (often at later stages of a career) will have consequences for many years; and the choices of which experimental approaches to take may govern one’s activities for an afternoon, a week, or several months or years depending on the complexity of the experiments. Over long time horizons, we make these decisions based on a combination of scientific curiosity, the potential for novelty, the importance of the problem for advancing science or achieving a practical (e.g., public health) goal, and more mundane but essential constraints such as the availability of technology, reagents, facilities, funding, mentors, and collaborators, to name a few. On the shorter time scale of hours, days, and months, the choice of detailed experimental approach is motivated by questions about which approach is:

  • Most likely to provide interpretable data to answer the question, unaffected by uncontrolled variation

  • Most likely to provide an answer fast

  • Likely to be least expensive (or fit within a budget constraint)

  • Most elegant scientifically

These are questions of judgment on which reasonable scientists will disagree because they have different tastes, skills, and resources, but they are the kinds of questions a funder or a principal investigator expects a researcher to attempt to answer before proceeding.

Typically, biosafety has not been considered within this list of questions (except perhaps in the situation where time in a high-containment laboratory is a limiting resource). Rather, the scientific judgment is typically made first, and then appropriate biosafety conditions are identified to permit the research to be performed safely. In the specific case of US NIH grants, biosafety considerations are reviewed for a proposal only after the scientific merit is determined, and any concerns about biosafety are not supposed to affect the scientific score of a proposal. For the global research enterprise, this type of reasoning has worked relatively well: known fatalities associated with laboratory exposure to pathogens have been extremely rare [33], around one per year globally.

3 Unique Risks of GoF and GoFRoC Experiments

This procedure is not sufficient for those cases where the risk of a laboratory accident extends beyond those working in the laboratory or nearby—to a much larger, even global, population, who might be infected and greatly harmed by a pathogen that is both highly virulent to humans and readily transmissible [31]. As I will argue below, that category of research is a very limited category—roughly covered by the terms PPP creation or GoFRoC. The accident risk posed by performing an activity for a particular period of time is conventionally defined as the product of two factors: the probability that an accident will occur during that period, times the expected magnitude of harm that the accident causes. For the vast majority of pathogens studied in the laboratory, accidental infection of a laboratory worker or other person poses little risk to the broader population. This is because most pathogens we study have one or more of the following properties that limit the impact of an accidental infection:

  • The pathogen is not transmissible from person to person (e.g., anthrax) or is somewhat transmissible but would be readily contained in a developed-world setting (e.g., filoviruses).

  • The pathogen is already widespread (and possibly usually carried asymptomatically), such that adding one more infected person to the pool of infected persons would have a negligible impact on the overall risk of transmission (e.g., the bacterial species that constitute the normal flora of the gut and respiratory tract).

  • Transmission of the pathogen is limited by widespread vaccine coverage (e.g., measles virus, Bordetella pertussis, and other targets of routine vaccination).

  • Transmission of the pathogen is locally limited by lack of a suitable vector (e.g., flaviviruses in sufficiently cool climates).

Potential pandemic pathogens are intrinsically those which meet none of the above conditions: there is widespread susceptibility, high virulence for humans, little or no current human infection, and a route of transmission that is likely to be efficient in the population where the research is performed. In this case, the potential impact of an accidental infection is manyfold larger than for most pathogens under study. A probability of laboratory accident that may be acceptable when the impact is one or two human cases may not be acceptable when the potential impact is thousands, millions, or more [31].

There has been much debate about the exact magnitude of risk from GoFRoC experiments. Rehearsing those arguments is not the purpose of this chapter. A few points, however, are important:

  1. A published estimate by the author of this chapter [34] using the experience of laboratory-acquired infections in BSL3 laboratories in the United States suggested that a single year of work in a single laboratory on a novel influenza virus readily transmissible between humans carries between a 1/1000 and a 1/10,000 chance of accidentally sparking a pandemic by accidental release. A reply by Prof. Ron Fouchier [2] placed the probability more than one million times lower. This is not the place to rehash the arguments over these probabilities. The essential point is that either estimate is a small probability; both are dependent on assumptions because data are limited; but when multiplied by the potential magnitude of a pandemic, the higher estimate (or even an estimate between the two) suggests a clearly unacceptable risk. While we believe that our estimate is if anything too low rather than too high, it is notable that the expected risk (probability of an accident leading to a pandemic times expected fatalities in such a pandemic) would be unacceptable to most people even if it were 1000-fold lower than we have estimated [34].

  2. While excellent biosafety conditions in the laboratories performing GoFRoC are certainly important, they are not a panacea for guaranteeing safety. Of the major mishaps at US government labs in recent years, nearly all involved removing the infectious agent from the high-containment lab where it was under study to another, lower-containment lab because it was thought to be inert [35,36,37] (https://www.defense.gov/News/Special-Reports/DoD-Laboratory-Review/). High-tech containment cannot prevent the deliberate removal of supposedly safe material from a laboratory, and so human error remains a source of potential missteps, regardless of the quality of the laboratory facilities. Among other consequences, this consideration means that measures to protect personnel in laboratories conducting GoFRoC, such as vaccination, prophylaxis with antiviral drugs, and enhanced surveillance for illness, cannot address the problem of exposure outside the “home” laboratory [38].
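
The risk definition used in this section (probability of an accident times expected magnitude of harm) can be made concrete with a minimal sketch. The probability range is the one quoted above from [34]. The fatality figures are hypothetical placeholders, chosen only to span the "thousands, millions, or more" range mentioned earlier; they are not estimates from this chapter, and the function names are illustrative.

```python
"""Illustrative expected-harm arithmetic: risk = probability x magnitude."""

# Per-laboratory-year probability range quoted in the text from [34].
P_HIGH = 1 / 1_000
P_LOW = 1 / 10_000

# HYPOTHETICAL pandemic magnitudes (fatalities), for illustration only.
ILLUSTRATIVE_FATALITIES = (1_000, 1_000_000, 10_000_000)


def expected_fatalities(p_accident: float, fatalities_if_pandemic: float) -> float:
    """Return expected fatalities per laboratory-year of work."""
    if not 0.0 <= p_accident <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if fatalities_if_pandemic < 0:
        raise ValueError("magnitude must be non-negative")
    return p_accident * fatalities_if_pandemic


def risk_table(probabilities, magnitudes):
    """Return (probability, magnitude, expected fatalities) for every pairing."""
    return [
        (p, m, expected_fatalities(p, m))
        for p in probabilities
        for m in magnitudes
    ]


if __name__ == "__main__":
    # The third probability is the "1000-fold lower" scenario from the text.
    scenarios = (P_HIGH, P_LOW, P_HIGH / 1_000)
    for p, m, risk in risk_table(scenarios, ILLUSTRATIVE_FATALITIES):
        print(f"p = {p:.0e}  magnitude = {m:>10,}  "
              f"expected fatalities per lab-year = {risk:,.3f}")
```

Because the expected harm scales linearly with each factor, a very small probability does not by itself imply a small risk when the magnitude is large, which is the point the two estimates above are meant to convey.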

In summary, the record of laboratory accidents and accidental infections in the most secure and highly scrutinized government labs shows that such accidents are inevitable. In the vast majority of cases, the consequences, while potentially devastating for one or a few exposed persons, will be limited in scope. The unique risk posed by GoFRoC or PPP creation is that such accidents will lead not only to individual infections but to ongoing transmission and, in the worst case, extensive global spread of the engineered pathogen. Such risks are different in kind from ordinary biosafety risks and should not be seen as simply an issue to be addressed after deciding whether to do the experiment. Rather, the existence of such risks should be counted, like great expense in time or money, as an important factor arguing against undertaking the experiment in the first place. The unique benefits of doing GoFRoC should justify the unique risks such experiments create, and both comparisons should be made against devoting the same resources to alternative approaches that do not raise population-level biosafety concerns.

4 What Questions Can GoF and GoFRoC Experiments Answer Uniquely?

This leads directly to the question of what can be uniquely answered by experiments involving the creation of PPP. Here it is useful to distinguish the kinds of knowledge that can be obtained only by gain-of-function experiments and then the smaller set of knowledge that can be obtained only by GoFRoC, that is, by GoF experiments that lead to the creation of PPP.

Some scientific questions can be answered only by GoF experiments, as has been argued before [4]. However, these are limited to questions of the form: Starting from a particular virus genotype, is a particular set of genetic changes (for influenza, these may be reassortments or mutations) sufficient to create a particular phenotype that was not present in the starting genotype? This question can be answered only by a GoF experiment because if one does not create the phenotype, one cannot measure it. Loss-of-function experiments cannot answer this question unless one already has a strain with the phenotype in question and the exact genotype of interest, which is not the case in most GoF experiments. Comparisons of naturally occurring strains cannot answer the question for the same reason—if the phenotype already existed in a naturally occurring isolate, then the GoF experiment would not in general be under consideration.

Note the narrow scope of the question that such an experiment can answer. It cannot answer any of the following types of question:

  1. Starting from a particular viral sequence, is a particular set of genetic changes necessary to create a phenotype of interest? This question is unanswerable in principle, because one can never make all possible genetic changes on a single starting genotype and rule out all alternative sets of changes [39].

  2. Starting from a particular viral sequence, is a particular set of genetic changes the smallest set of changes on that genetic background sufficient to produce the phenotype? This question is unanswerable in practice for any set larger than about three mutations, because it is not possible to test phenotypically every one of the ~10^12 combinations of three or more mutations, at least for whole-animal phenotypes like transmission that must be evaluated statistically in a number of animals for each genetic sequence. Thus even if each member of the set of changes is required along with the others to get the phenotype, there may be another set of (the same number of or fewer) changes that has not been tried in the experiment [39].

  3. Is the set of mutations that produces the phenotype in the laboratory likely to happen in nature? This question can’t be answered by GoF experiments in the laboratory because viruses in nature are subject to many, possibly competing selective pressures and nonselective forces (e.g., population bottlenecks) that might produce a different trajectory of sequence changes from those that appear in the lab. Moreover, only a vanishingly small proportion of the viruses in nature have the genotype that forms the starting point of the experiment.

  4. Is the set of mutations that produces the phenotype in the laboratory given a particular starting sequence likely to produce the phenotype when introduced into a viral strain with a different starting sequence? This would be predictable only if epistatic interactions were rare in influenza, such that the effect of a set of genetic changes on phenotype was relatively independent of genetic background. However, epistasis is pervasive in influenza [40, 41], including for phenotypes of interest such as hemagglutinin receptor binding specificity [42, 43]. Thus, for example, the very mutations identified in GoF experiments on H5N1 isolates as conferring human receptor specificity on the viral hemagglutinin (HA) did not do so when introduced into a slightly different genetic background [43].

  5. Does the set of mutations that confers ferret-to-ferret droplet transmissibility when introduced into a particular starting sequence also confer human-to-human transmissibility? Barring an unethical deliberate infection of humans with the virus resulting from GoF experiments in ferrets, and barring an accidental infection of humans, which would be potentially catastrophic, we will never know whether that particular virus’s transmission potential in ferrets translates to humans. Ferret droplet transmissibility is a very good, but not perfect predictor of human transmissibility for influenza [44]. At least one prominent influenza virologist has argued that the strains produced in GoF experiments in ferrets probably are not readily transmissible in humans [18], and others have said the same thing in public forums. If taken seriously, this uncertainty undermines the scientific and public health value of the experiments, because it undermines our ability to use the presence of GoF-identified mutations and phenotypes to gauge the level of risk presented by naturally occurring viruses [45].
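
The combinatorial limit described in the second question above can be illustrated with a short counting sketch. It rests on assumptions that are not taken from this chapter: a genome length of roughly 13,500 nucleotides, counting only single-nucleotide substitutions, and a hypothetical group size of four animals per genotype. Depending on the counting convention, the total differs from the figure cited above, but under these assumptions it lands at or above the same order of magnitude, which is all the argument requires.

```python
"""Counting candidate mutation sets: why exhaustive phenotypic testing is infeasible."""
from math import comb

# ASSUMPTION: approximate influenza A genome length in nucleotides.
GENOME_LENGTH_NT = 13_500
# Each nucleotide position can change to any of the 3 other bases.
ALTERNATIVES_PER_SITE = 3


def count_mutation_sets(n_sites: int, k: int, alternatives: int) -> int:
    """Number of distinct genotypes carrying exactly k point substitutions."""
    if n_sites < 0 or k < 0 or alternatives < 1:
        raise ValueError("n_sites and k must be >= 0, alternatives >= 1")
    # Choose which k sites change, then which alternative appears at each.
    return comb(n_sites, k) * alternatives ** k


def animals_required(n_genotypes: int, animals_per_genotype: int) -> int:
    """Animals needed if every genotype is tested in its own group."""
    return n_genotypes * animals_per_genotype


if __name__ == "__main__":
    for k in (1, 2, 3):
        n = count_mutation_sets(GENOME_LENGTH_NT, k, ALTERNATIVES_PER_SITE)
        # HYPOTHETICAL group size of 4 animals per genotype.
        print(f"k = {k}: {n:.2e} genotypes, "
              f"{animals_required(n, 4):.2e} animals")
```

The count grows by several orders of magnitude with each additional mutation, so the search becomes impractical well before a whole-animal transmission phenotype could be assessed for every candidate set.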

4.1 What Scientific Questions Are of Greatest Importance for Pandemic Prevention and Response?

Each of the above classes of questions—those answerable by GoF experiments (in some cases by GoFRoC experiments) and those not answerable by such experiments—has legitimate scientific interest. But improving our ability to prevent and/or respond to novel influenza pandemics is the major public health goal that lends practical importance to advancing our scientific knowledge of the determinants of transmissibility in particular. Most importantly, effective targeting of efforts to prevent or mitigate pandemic threats partly depends on developing accurate predictive capabilities to estimate the magnitude of the threat posed by particular influenza A strains that we are observing in nonhuman reservoirs [40, 45]. Knowing that strains isolated from wild or domesticated avian or domesticated swine populations have genetic sequences or phenotypic traits (e.g., receptor binding specificity) predictive of the ability to transmit effectively in humans can motivate targeted countermeasures against such strains, such as culling of infected animals or strain-specific vaccine development. These two activities in particular depend on prioritizing which strains are most threatening, because it is impractical to cull all influenza-infected animals, and it is currently impractical to make vaccines except in a strain-specific fashion. Such strain-dependent prevention and mitigation activities are part of an overall pandemic preparedness portfolio including strengthening human respiratory disease surveillance and analytic capacity, stockpiling of non-strain-specific countermeasures such as antiviral drugs, personal protective equipment and ventilators, and a range of other public health measures whose usefulness is relatively independent of the strain of influenza that causes zoonotic cases or initiates a pandemic. But for animal culling and vaccine development (absent universal vaccines), prioritizing pandemic threats based on sequence and viral traits is important [40, 45, 46].

Many different types of scientific approaches can contribute to such prioritization efforts [7]. Comparison of naturally occurring influenza viruses, either by genetic sequence or by phenotypic traits or both, can help to categorize those sequences and traits that are characteristics of strains that are avian-adapted and poorly able to transmit in humans, compared to those that are well-adapted to human transmission. Such comparisons have been made for decades, and a number of traits and genetic sequence features (such as specific variants at specific amino acid sites in specific viral proteins) have been associated with human adaptation. Among the most important traits correlated with human adaptation are HA specificity for sialic acids found in the human upper airway (alpha-2,6 linked), a relatively low pH of HA activation, and human adaptation of the nucleoprotein-polymerase complex [40].

As the importance of these and other traits has become apparent from natural comparisons, the sequence variations correlated with these properties have been established through biochemical and other assays of single gene products or minigenomes (in the case of the nucleoprotein-polymerase complex) [47]. Such sequence changes can then be tested for their causal role in the individual traits, measurable in vitro, again without employing infectious virus [48,49,50]. Gain-of-function studies can be performed without employing strains with pandemic potential, for example, by taking a natural strain, introducing a change that abrogates one of the traits of interest, and then selecting revertants (so-called gain+loss-of-function) [51] or by doing gain-of-function experiments similar to GoFRoC but employing attenuated strains [52], strains of lower virulence, and strains for which there is greater population-wide immunity. All of these are alternatives to GoFRoC, and each is capable of informing public health efforts to prioritize pandemic threats from naturally occurring strains.

Importantly, such prioritization efforts are and will remain intrinsically imperfect. There have only been four influenza pandemics since the turn of the twentieth century, so our evolutionary models of what confers human-to-human transmissibility are constrained by being based on essentially four independent data points. The limitations of predicting pandemic potential were evident in 2009, when the strain that emerged and began spreading in humans arguably failed to meet the existing notions of human adaptation on all three key human-adaptive traits as understood at the time: early pandemic isolates’ HA receptor specificity was mixed human and avian; their HA pH of activation was outside the range of human-adapted viruses known before; and they lacked the sequence variant (lysine at PB2 amino acid 627) that had been thought necessary in the past for human adaptation. While subsequent evolution “corrected” the first two of these limitations on human-to-human transmission, the criteria had to be updated to accommodate the fact that what seemed necessary for pandemic potential in 2008 was violated by the early pandemic strain in 2009 [40]. While this limitation is surely a rationale for continuing to expand our scientific knowledge in this area, it more fundamentally points out that with only four modern pandemics to test our models of what is needed for a pandemic strain, no amount of further research will be able to make a highly reliable predictive model for pandemic potential: we simply lack the data (actual pandemics) to validate such models and therefore can never rely fully on these models, no matter how many more experimental inputs we add. This consideration heightens the importance of other approaches to pandemic preparedness, which are not dependent on correct predictions of the strain that will cause a pandemic, in combination with efforts to improve our predictive power.
Table 1 lists a number of experimental and computational approaches to increasing our understanding of influenza pandemic risk determinants (top) and increasing our preparedness for pandemics more generally (bottom).

Table 1 Scientific approaches that are scientifically valuable for understanding influenza virus determinants of pandemic potential and more generally for improving pandemic preparedness, but do not risk the creation of potential pandemic pathogens (PPPs)
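
One way to illustrate the four-pandemic limitation discussed above is to ask how well a predictive model could be validated even in the best case. The sketch below is a simplification and not an analysis from this chapter: it treats a hypothetical model as a classifier, assumes the pandemics are independent trials, and computes the exact (Clopper-Pearson) lower confidence bound on the model's sensitivity when it correctly flags every one of n pandemic strains.

```python
"""Exact binomial lower bound when a model is right on every one of n events."""


def lower_bound_all_correct(n_events: int, alpha: float = 0.05) -> float:
    """Two-sided Clopper-Pearson lower bound on sensitivity for n of n correct.

    When all n trials succeed, the exact lower limit has the closed form
    (alpha / 2) ** (1 / n); the upper limit is 1.
    """
    if n_events < 1:
        raise ValueError("n_events must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return (alpha / 2) ** (1 / n_events)


if __name__ == "__main__":
    # 4 = number of pandemics since 1900 cited in the text; the larger
    # values are HYPOTHETICAL, shown only for comparison.
    for n in (4, 20, 100):
        print(f"n = {n:>3}: 95% lower bound on sensitivity = "
              f"{lower_bound_all_correct(n):.3f}")
```

Under these assumptions, a model that retrospectively fits all four pandemics is still statistically compatible with a true sensitivity of roughly 0.40, which is consistent with the argument that additional experimental inputs cannot substitute for validation data.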

5 What Do We Achieve, from a Public Health Perspective, with GoFRoC, that We Cannot Achieve with Safer Alternatives?

Given these considerations, no reasonable number of GoFRoC studies will give us enough data to be certain we know all the ways that influenza can achieve human-to-human transmission. Yet dramatic public health benefits have been claimed for GoFRoC experiments. Authors from the US CDC wrote in 2015 that prioritization of countermeasures against H7N9 strains in Cambodia that year had been possible uniquely thanks to the GoFRoC experiments that identified the importance of particular mutations in achieving human-to-human transmissibility [53]. These claims were exaggerated in the sense that every one of the mutations identified by those authors as generating concern about the Cambodian H7N9 strains, which had been identified in GoFRoC with H5N1, had been previously pinpointed in publications describing sequence analysis of existing strains and/or experimental approaches not constituting GoFRoC, such as binding studies of purified HA; the relevant experiments are listed in a table in Ref. [54]. While the GoFRoC studies in H5N1 added to the evidence base for the importance of these sites, there was nothing unique about the GoFRoC studies, and the same level of concern would have been appropriate purely on the basis of prior studies.

Similarly, participants in the recreation of the 1918 influenza A H1N1 strain asserted after the 2009 pandemic that this recreation had produced major public health benefits in the 2009 pandemic due to the understanding those experiments provided of H1N1 biology [5]. Yet the examples they cited, such as recognizing the likely protection enjoyed by elderly persons in 2009 who had been alive in 1918, in no way depended on the actual reconstruction of the 1918 virus. Sequence comparisons between the 2009 and 1918 viruses would have suggested the hypothesis, and serological assays of persons born before 1918 using the 2009 pandemic virus could—and did—test the hypothesis.

6 Conclusion

In summary, pandemic prediction, prevention, and mitigation are necessarily inexact sciences, where uncertainty remains mainly because we have so few pandemics to study. GoFRoC can add to the evidence base, but it cannot qualitatively change that evidence base. Empirically, the contribution of such studies to applied public health goals has been far more modest than claimed. Accounting for the unique risk posed by GoFRoC experiments as a factor in deciding whether to pursue them—as one should—shines a light on the narrowness of the scientific questions that they alone can answer. Vast advances in the most essential questions of influenza virology and in the public health goal of pandemic preparedness can be achieved without undertaking experiments that, if an accident occurs, could start a new pandemic. The consequences of such an accident should be enough to direct the attention of ambitious and public health-minded virologists toward these safe alternatives.