Replication, mutation, and the scope of the quasispecies concept

The main feature of quasispecies theory that rendered it an ideal framework to understand RNA viruses is its consideration of mutation as an integral part of the genome replication process (Eigen 1971; Eigen and Schuster 1979; Page and Nowak 2002). The two mathematical equations of quasispecies theory that are pertinent to the interpretation of virus behavior are the calculation of the concentration of mutant types as a function of replication time, and the formulation of the error threshold for maintenance of genetic information (Fig. 1). We refer to the original, deterministic formulation of quasispecies theory, but it must be clarified from the outset that extensions to stochastic treatments have been developed that justify the application of quasispecies theory to finite populations of mutable viruses (Domingo and Schuster 2016a). The term $Q_i$ in the first equation of Fig. 1 quantifies the relative concentration of correct copies of genome i during replication, with $0 \le Q_i \le 1$. Related to it, $W_i = A_i Q_i - D_i$ measures the selective value of genome i in the mutant ensemble. Thus, the first equation in Fig. 1 embodies self-reproduction, mutability and the selective value of the mutant species, with mutant generation occurring in parallel with correct copying. These are core concepts that have facilitated the understanding of virus behavior, even if viruses are intricate multi-gene organizations as compared with the simple genetic entities implied in quasispecies theory. The late Christof Biebricher, working in Manfred Eigen’s Department in Göttingen, determined the basic kinetic parameters for replication, mutation and selection of small replicating RNAs derived either from the bacterial virus (bacteriophage) Qβ RNA or synthesized de novo by the Qβ replicase [studies reviewed in (Biebricher 2008)]. These fundamental investigations bridged quasispecies theory and RNA virus analysis.
In a strict sense, the quasispecies concept should apply to a single replicative unit in an infected cell because it is within such a unit that the primary events of mutant generation, competition among variants and selection of the most fit take place (Fig. 1). In the life cycle of viruses, there are successive levels of competition and selection. A second level is established when RNAs gathered from different replicative units of the same cell engage in subsequent intracellular steps to assemble into viral particles that exit the cell. Still a third level of competition occurs when viruses invade new tissues or organs to extend the infection process. The intra-host viruses can then be transmitted into susceptible hosts to propagate the infection at the epidemiological level. In all these processes, mutant spectra are at play, with rapid generation of mutants and compartmentalization of different subpopulations in different organs [several studies were reviewed in (Domingo 2016), and new examples are currently being described (de Almeida et al. 2017; Lu et al. 2017), several with medical implications]. Probably, both competitive selection and random events (population bottlenecks of different intensity) intervene in shaping the transient composition of ever-changing mutant distributions.
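
As an illustration of how error-prone replication yields a mutant distribution rather than a single genome, the following is a minimal numerical sketch and not the original authors' calculation. It assumes the standard deterministic form of the replication-mutation equation, $dx_i/dt = (A_i Q_i - D_i) x_i + \sum_{k \neq i} W_{ik} x_k$, replaces the flux term by a renormalization that keeps the total concentration constant, and uses invented parameter values for a hypothetical three-genome system (one master sequence and two mutants). The function name and all numbers are assumptions made for the example.

```python
def simulate_quasispecies(A, D, Q, W, x0, dt=0.01, steps=5000):
    """Euler integration of dx_i/dt = (A_i*Q_i - D_i)*x_i + sum_{k != i} W[i][k]*x_k.

    After every step the concentrations are renormalized to sum to 1,
    which plays the role of a dilution flux (an assumption of this sketch).
    """
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        change = []
        for i in range(n):
            selective_value = A[i] * Q[i] - D[i]  # W_i in the text
            mutational_input = sum(W[i][k] * x[k] for k in range(n) if k != i)
            change.append(selective_value * x[i] + mutational_input)
        x = [xi + dt * ci for xi, ci in zip(x, change)]
        total = sum(x)
        x = [xi / total for xi in x]
    return x


# Hypothetical parameters: genome 0 is the master sequence (replicates twice as fast).
n = 3
A = [2.0, 1.0, 1.0]   # replication rates (invented)
D = [0.1, 0.1, 0.1]   # degradation rates (invented)
Q = [0.9, 0.9, 0.9]   # fraction of correct copies (invented)
# Erroneous copies of template k are shared equally among the other genomes.
W = [[0.0 if i == k else A[k] * (1 - Q[k]) / (n - 1) for k in range(n)]
     for i in range(n)]

freqs = simulate_quasispecies(A, D, Q, W, x0=[1.0, 0.0, 0.0])
print([round(f, 3) for f in freqs])
```

Starting from a pure master population, the distribution settles into a dominant master accompanied by a persistent minority of mutants, which is the qualitative behavior the equation is meant to capture.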

Fig. 1

Basic concepts in viral quasispecies. Top: fundamental equations that express error-prone replication and the error threshold for stability of genetic information. The first equation describes the concentration of mutant i as a function of time. $A_i$, $D_i$, and $W_{ik}$ represent replication of i, degradation of i, and synthesis of i from template k, respectively. $Q_i$ and $W_i$ are defined in the text. $\phi_i$ describes the flux of molecules into or from the location where replication takes place. The second group of equations (second line) refers to the error threshold concept. $\nu_{max}$ is the maximum amount of genetic complexity (information) that can be maintained during replication; $\sigma_0$ quantifies the superiority of the master sequence relative to the other components of the mutant spectrum; $q_i$ is the average copying fidelity of the replicative system, and in consequence $1 - q_i$ is the average error rate per site copied. The terms on the left express a maximum sequence length $\nu_{max}$ for maintenance of information, while the terms on the right express the maximum tolerated error rate for a given genome complexity (length). The error threshold equations are at the basis of lethal mutagenesis as an antiviral strategy (see text). Bottom: a schematic representation of the evolution of a mutant spectrum, as experimentally evidenced with RNA viruses. Horizontal lines depict individual genomes and symbols on the lines represent mutations. Large arrows separate discrete time points at which the mutant distribution is observed. Note the continuous dynamics despite an invariant consensus sequence (represented by the lines at the bottom). The dynamics implies continuous generation of new genome types combined with the elimination of the less fit, as depicted here with discontinuous horizontal lines that indicate genomes that have accumulated a larger than average number of mutations. Real mutant distributions in cells infected with RNA viruses can reach thousands of genome copies, which implies an impressive and ever-changing diversity (see also data shown in Fig. 2, and references in the text) [figure adapted from (Domingo et al. 2012), with permission from American Society for Microbiology, Washington DC, USA].
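
Because the equations themselves appear only in the figure image, the following LaTeX block gives the form in which the two relations described in the caption are commonly written in quasispecies theory. It is a reconstruction from the symbol definitions in the caption, not a transcription of the figure; the symbol $x_i$ for the concentration of mutant i is an assumption, and the exact typography of the original figure may differ.

```latex
% Replication-mutation equation (first equation described in the caption)
\frac{dx_i}{dt} = \left( A_i Q_i - D_i \right) x_i + \sum_{k \neq i} W_{ik}\, x_k + \phi_i

% Error threshold: maximum sequence length (left) and maximum tolerated error rate (right)
\nu_{max} = \frac{\ln \sigma_0}{1 - q_i}
\qquad\qquad
\left( 1 - q_i \right)_{max} = \frac{\ln \sigma_0}{\nu}
```

Read this way, the threshold states that the longer the genome, the lower the error rate per site that is compatible with maintenance of the information it carries.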

The term quasispecies is now used in virology to refer to the collections of related viral genomes independently of their molecular origin and the level at which the mutant swarms act. Viral quasispecies has been extended to include mechanisms of genetic variation other than point mutations, such as recombination (the formation of chimeric genomes from two or more parental genomes), gene duplication, genome segment reassortment, and gene transfers (Boerlijst et al. 1996; Jacobi and Nordahl 2006; Muñoz et al. 2008; Park and Deem 2007; Wagner et al. 2016). In all these formulations, including the extension of quasispecies to populations of finite size in variable fitness landscapes, the fundamental theoretical framework of the original theory stands [reviewed in (Domingo et al. 2012) and in several chapters of (Domingo and Schuster 2016a)].

The unusual genetics of RNA viruses

What are viruses and how do they exploit mutation and other types of genetic variation? Viruses are subcellular genetic elements that have either DNA or RNA as genetic material. They replicate only inside a host cell. Outside a cell, viruses are inert macromolecular aggregates, no matter how many proteins and membrane structures surround their genetic material [reviews in (Domingo 2016; Flint et al. 2009)]. Viruses with an RNA genome are the most abundant (estimated at 75%) among those characterized in our biosphere; they are the cause of frequent diseases such as the common cold or influenza, and of more severe ones such as hepatitis, encephalitis or AIDS. To use RNA as genetic material is unusual in our DNA-based biosphere, and it might be a trace of a remote origin from an ancestral RNA world, in environments in which naked RNA (or RNA-related macromolecules) could be stable repositories of inheritable information. It is no surprise that it was with the RNA viruses that a link was first established between quasispecies and virology, because these viruses mutate at rates that exceed by nearly a million-fold the rates exhibited by their cellular host organisms (Holland et al. 1982), and high mutability is now an established fact for all RNA viruses studied to date (Domingo 2016).

Instability of phenotypic traits of RNA viruses was noted even before their genetic material could be isolated and characterized [early work reviewed in (Domingo et al. 1988; Holland et al. 1982)]. Typical findings were the presence of temperature sensitive or plaque morphology mutants in viral preparations, the easy selection of mutant viruses resistant to antiviral inhibitors, or differences in the types of lesions caused by “the same” plant virus, with reports going back as early as 1940 (Kunkel 1940). Although no procedures to identify mutations in RNA or DNA were available, and non-genetic mechanisms of variation could not be excluded, the early observations were compatible with the frequent occurrence of genetic changes during virus multiplication.

Direct evidence of mutations was obtained soon after the first nucleotide sequencing procedures were applied to the RNA bacteriophages (Fiers et al. 1976; Weissmann et al. 1973). Reverse genetics (which consists of introducing a precise genetic lesion and then studying its consequences) was invented in the laboratory of Charles Weissmann, and permitted the in vitro synthesis of E. coli bacteriophage Qβ mutants with preselected mutations. One of the point mutants synthesized in vitro by site-directed mutagenesis produced infectious progeny, and it was calculated that the mutant reverted to its wild type counterpart at a rate of $10^{-4}$ mutations per nucleotide copying event (Batschelet et al. 1976). This value was a true mutation rate (rather than a mutation frequency) because its calculation took into consideration not only the rate of occurrence of the mutation but also the competition between the generated mutant and its parental genome. Its limitation was that the value referred to a single mutation type at a single site in the viral RNA, and it was not certain that the value was representative of other genomic sites or other RNA viruses. However, several subsequent measurements yielded comparably high mutation rates and frequencies for RNA viruses, including estimates based on new deep sequencing methodologies. Many measurements, however, determine mutation frequencies rather than mutation rates. That is, they measure the proportion of mutant genomes in a population rather than the rates at which mutations are generated. We speak generically of high mutation rates and frequencies for RNA viruses, in the range of $10^{-3}$ to $10^{-5}$ mutations per nucleotide, with site-to-site variations due to the sequence context of the template molecules and other environmental influences (Domingo 2016; Drake and Holland 1999; Sanjuan et al. 2010).
The biochemical basis of high mutation rates is the absence or low activity of proofreading-repair and post-replicative repair activities during RNA replication (Steinhauer et al. 1992).

The so-called “error-prone nature” of viral RNA-dependent RNA or DNA polymerases (the latter termed also reverse transcriptases) has been amply supported biochemically by fidelity measurements with the purified enzymes. The development of an in vivo assay to measure mutations in a single cycle of human immunodeficiency virus type 1 (HIV-1) using different mutation counting procedures has yielded error rates for HIV-1 over a range of $0.8 \times 10^{-4}$ to $6.9 \times 10^{-4}$ mutations per nucleotide and replication cycle, values which are lower than those obtained with purified reverse transcriptase (Mansky and Temin 1995; Rawson et al. 2015, 2017). A difference between in vivo and retrotranscriptase-based measurements of error rates suggested that either the fidelity of a polymerase may vary when separated from its intracellular context or that there might be features other than the fidelity properties of viral polymerases that can affect mutation rates. Although both factors are likely to exert an influence, studies on lethal mutagenesis have shown that, in addition to the polymerase, other viral proteins can modulate mutation types (see “Lethal mutagenesis and error threshold”).

Generation of mutant swarms is inherent to the infectious cycle of RNA viruses, and modifications of template copying fidelity can have a fitness cost for the virus (Borderia et al. 2016; Dapp et al. 2013; Lloyd et al. 2014; Newstein and Desrosiers 2001; Paredes et al. 2009; Pfeiffer and Kirkegaard 2005; Vignuzzi et al. 2006; Weber et al. 2005). These results suggest that viruses have evolved to display average error rates at levels that render exploration of sequence space compatible with functionality (Earl and Deem 2004). By an obvious extension, parallel concepts should apply to DNA viruses that are replicated by low fidelity DNA polymerases, those devoid of 3′–5′ exonuclease proofreading activity or other error-correcting activities (Vaisman and Woodgate 2017). Thus, formation of quasispecies swarms, with its biological implications, applies to many DNA viruses as well.

Exploration of sequence space for fitness gain

The application of the concept of sequence space to viral genomes (Eigen and Biebricher 1988) put into an evolutionary perspective the advantage of high mutation rates displayed by RNA viruses. The theoretical sequence space available to a viral genome is an immense number given by $4^{\nu}$, where $\nu$ is the number of genomic residues (genome length in nucleotides) and 4 is the number of types of monomeric units (nucleotides) in the genome, A, G, C, U for RNA. High mutation rates favor exploration of sequence space, as illustrated by the following considerations. If every newly arising mutation in a viral genome were lethal, the viral genome sequence would remain invariant. If no mutation had any consequence for replication (strict neutrality) the virus would drift in sequence space, free of any constraint. The reality lies between the two extremes. When alternative sequences are not subjected to strong and immediate negative selection, high mutation rates imply that a virus population will consist of a mutant cloud (or mutant swarm), as observed experimentally. The components of a viral population will occupy multiple points in sequence space. In the event of an environmental change, the virus has multiple sequence space locations from which to initiate adaptive walks. Experiments have documented that multiple, alternative mutational pathways are available to viruses to gain fitness in a given environment, and that attaining high fitness does not imply population stability (a steady-state distribution of mutant genomes) (Escarmís et al. 1999, 2014; Moreno et al. 2017; Novella et al. 1999). A recent observation on the continuous modification of mutant frequencies in a hepatitis C virus (HCV) quasispecies, reflected in a continuum of mutational waves evolving in a constant biological environment (Moreno et al. 2017), can be explained by constant modification of selective values expressed by the term $W_i = A_i Q_i - D_i$ in Fig. 1.
A hypothetical maximum selective value expressed by $W_{max} = A_{max} Q_{max} - D_{max}$ will never be attained because the mutational input does not allow the adaptive walk to reach an optimized goal. The environment is internally perturbed because the genomes present in the population at any given time are part of the environment (Kuppers 2016; Stadler 2016). Multiplication of HCV during 200 serial passages in the same host cells (equivalent to 700 days of continuous intracellular replication) did not attenuate the mutational waves, even at late passages when adaptation to the host cells had occurred (Fig. 2), and the virus did not reach phenotypic stasis (Moreno et al. 2017) [a number of controls excluded the possibility that the mutational waves were the result of a sampling bias due to the limited amounts of RNA taken for sequence analysis relative to the total amount of viral RNA in the sequential biological samples, see (Moreno et al. 2017)].
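
The scale of the theoretical sequence space, four raised to the genome length, can be appreciated with a short calculation. The sketch below is illustrative only: the genome length of 10,000 nucleotides is an assumed round figure, and the function names are introduced here for the example. It reports the number of decimal digits of the total space and counts how many distinct genomes lie one or two substitutions away from a reference sequence.

```python
import math


def sequence_space_digits(genome_length, alphabet_size=4):
    """Number of decimal digits of alphabet_size ** genome_length.

    Computed with logarithms so that the huge integer never has to be printed.
    """
    return math.floor(genome_length * math.log10(alphabet_size)) + 1


def count_mutants(genome_length, n_mutations, alphabet_size=4):
    """Distinct genomes that differ from a reference at exactly n_mutations sites."""
    choices_of_sites = math.comb(genome_length, n_mutations)
    alternatives_per_site = (alphabet_size - 1) ** n_mutations
    return choices_of_sites * alternatives_per_site


genome_length = 10_000  # assumed genome length in nucleotides

digits = sequence_space_digits(genome_length)
single_mutants = count_mutants(genome_length, 1)
double_mutants = count_mutants(genome_length, 2)

print(f"total sequence space has {digits} decimal digits")
print(f"single mutants of one genome: {single_mutants}")
print(f"double mutants of one genome: {double_mutants}")
```

Under these assumptions all single mutants (30,000) and a sizeable share of the double mutants (about 4.5 × 10^8) are within numerical reach of a large viral population, whereas the full space is far beyond any population size.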

Fig. 2

Analysis of the evolution of hepatitis C virus in a constant, unperturbed cellular environment. A clonal preparation of the virus was passaged in human hepatoma cells and the mutant spectrum composition of successive populations analyzed. The top panels indicate the number of mutations that accumulate in the consensus sequence of the entire genome or one of its genes (NS5A) as a function of passage number. HCVp0 is the initial population; HCVp200 is HCVp0 after 200 serial passages. The bottom panels depict the frequency of several individual mutations (color coded) in the NS5A-coding region of five populations analyzed. The colored lines link the five data points for visualization of modification of mutant frequencies. Mutant frequencies display “mutational waves” that cannot be an artifact resulting from sampling effects because many of them follow the same pattern according to the two independent analyses (shown in the two panels). The figure illustrates the simplification of reality when only consensus sequences are reported in virological studies. Data from Moreno et al. 2017 [figure adapted from (Moreno et al. 2017), with permission from American Society for Microbiology, Washington DC, USA].

Metastable equilibrium in biologically constant environments probably confers a selective advantage to viruses in preparation for the multiple environments they must confront in their life cycles. In this manner, each confrontation is faced by multiple players. It is no surprise that, given that viruses exist despite the armamentarium of the host immune system, a single mutation or a limited number of mutations in viral genomes can help them adapt to the environmental changes they have to face. Examples of adaptive change have filled many research articles in virology: adaptation to a different cell type (the so-called cell tropism) or to a different host organism (host range), capacity to escape from antibodies, cytotoxic T cells or antiviral agents, and increased resistance to some extreme conditions (temperature, pH), among others. These adaptations depend on a number of mutations that are within reach of the mutant spectra of RNA viruses in their exploration of sequence space [reviewed in (Domingo 2016)].

New means to attenuate viruses

Vaccination is the most effective way to prevent infectious diseases. However, for many human pathogens, including several RNA viral pathogens, no effective vaccines are available due to many interconnected reasons, one of them being the continuous variation in antigenic properties (responsible for triggering the immune response) that RNA viruses undergo in nature. The development of new vaccine strategies has been a very active field of research for decades. Concepts derived from the understanding of viruses as quasispecies, together with advances in viral genomics, have recently furnished new tools to convert virulent (disease-causing) viruses into attenuated counterparts that are vaccine candidates. Positioning viruses in unfavorable regions of sequence space can result in fitness loss and virus attenuation. This has been demonstrated by Marco Vignuzzi and colleagues by engineering serine- and leucine-coding triplets of Coxsackie B3 and influenza A viruses to synonymous alternatives that generate a stop codon after a single nucleotide substitution. The altered viruses displayed fitness loss, were attenuated in mice, and generated high levels of protective neutralizing antibodies, a hallmark of a good vaccine (Moratorio et al. 2017). Forced movement towards unfavorable regions of sequence space is a strategy to reduce fitness, and it is one of the ingredients of lethal mutagenesis as an antiviral strategy (de la Higuera et al. 2017; Domingo et al. 2012) (see “Lethal mutagenesis and error threshold”). Alternative means to reduce fitness are alteration of polymerase copying fidelity (Vignuzzi et al. 2008), and massive deoptimization of codon composition or codon pair frequencies [(Bosch et al. 2010; Burns et al. 2006; Coleman et al. 2008; Cheng et al. 2017), among other examples reviewed in (Domingo 2016)].
A pending issue is whether the attenuation trait, despite being associated with multiple genetic changes, will remain stable upon extensive viral multiplication; the concern is that viruses are adept at finding unexpected mutational pathways to regain fitness (Le Nouen et al. 2017).
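
The codon-engineering idea described above can be illustrated with the standard genetic code. The sketch below is not the procedure used in the cited study; it simply enumerates the synonymous serine and leucine codons and identifies those that lie a single nucleotide substitution away from a stop codon. The function and variable names are introduced here for the example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Synonymous codons for serine and leucine in the standard genetic code.
SYNONYMOUS_CODONS = {
    "Ser": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "Leu": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
}


def single_substitution_neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for position in range(3):
        for base in "ACGU":
            if base != codon[position]:
                yield codon[:position] + base + codon[position + 1:]


def one_step_from_stop(codon):
    """True if some single nucleotide substitution turns `codon` into a stop codon."""
    return any(n in STOP_CODONS for n in single_substitution_neighbors(codon))


one_to_stop = {
    amino_acid: [c for c in codons if one_step_from_stop(c)]
    for amino_acid, codons in SYNONYMOUS_CODONS.items()
}
print(one_to_stop)
```

Recoding a gene so that serine and leucine are specified by these codons leaves the protein sequence unchanged while increasing the fraction of point mutations that truncate the protein, which is one way of placing a genome in an unfavorable region of sequence space.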

Some puzzles on lethality and neutrality

The ease of finding multiple evolutionary pathways to gain fitness has been taken as evidence of predominance of neutral mutations (or of neutral constellations of mutations) in RNA virus evolution, while the experimental evidence suggests that neutral mutations in RNA viruses are infrequent. In some tests of the effect of point mutations introduced into viral genomes by site-directed mutagenesis, as many as 40% of individual mutations were lethal or highly deleterious (Fernandez et al. 2007; Sanjuan et al. 2004). The new data on mutant spectrum composition provided by deep sequencing are forcing reconsideration of the meaning of lethality: mutations previously scored as lethal may inflict a strong fitness cost and maintain the genomes bearing them at low frequencies without eliminating them. The increasing possibility of detection of rare mutations in any biological system (Jee et al. 2016) raises the issue for viruses of how many rare mutations can contribute to the pool of replicating RNA on which selection and random events act. An interesting example of the distinction between “whole variation” and “evolutionarily meaningful variation” is provided by a current debate on the role of cellular editing APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) proteins as antiviral lethal mutagenic activities versus their role in the natural variation of HIV-1 (Delviks-Frankenberry et al. 2016; Rawson et al. 2017). Extremely infrequent mutations may be biologically (not biochemically) irrelevant, or they might have a relevance still to be discovered. The concept of lethality regarding the effect of mutations has entered a new scenario with the use of deep sequencing to detect rare mutations (those present at very low frequency in a cell or virus population).

A distinction has been also made between “mechanistically active but inconsequential recombination” and “biologically meaningful recombination”, which acquires particular significance regarding clonality in virus evolution (Perales et al. 2015). Recombination may contribute to the reshuffling of mutated genomic sequences, further expanding the exploration of sequence space (Manrubia and Lazaro 2016). Many recombination events may be continuously occurring in viruses in which recombination is mechanistically linked to replication (Lowry et al. 2014; Perales et al. 2015; Urbanowicz et al. 2005). The available evidence suggests that the molecular connection between replication and recombination may be analogous to that between mutation and replication in quasispecies theory (see “Replication, mutation, and the scope of the quasispecies concept”). It appears that some RNA viruses (in particular negative strand RNA viruses, those with the polarity of the genomic RNA opposite to the polarity of the mRNAs that express viral proteins) display low levels of recombination. However, some of these viruses can readily generate defective-interfering particles that involve recombination-related mechanisms, and the level of transient recombination events in replication complexes is unknown. Some clarification may come from application of deep-sequencing to the analysis of nascent RNAs inside individual cells (when artifactual recombination events during in vitro copying and amplification of RNA molecules prior to deep sequencing can be controlled). Recombination may be evolutionarily relevant not only at the epidemiological level when sufficient divergence of parental genomes has been attained, but also within replication complexes to expand immediate adaptability at the molecular level (Perales et al. 2015).

Most mutations inflict a fitness cost upon a virus. Therefore, a clarification of the meaning of neutrality is in order here. Using as model systems short replicative elements and secondary structure as the target phenotypic trait, new insights into the genotype–phenotype relationship were achieved [reviewed in (Schuster 2016)]. In these systems, a large proportion of point mutations are neutral in the sense that they do not modify the predicted secondary structure of the macromolecule. There are many neutral networks that connect divergent nucleotide sequences that fold into the same secondary structure. Neutral networks have evolutionary implications such as a punctuated equilibrium-like dynamics consisting of long phases of diffusion in neutral networks which are intermittently interrupted by spurts of adaptation [(Huynen et al. 1996); reviewed in (Schuster 2016; Stadler 2016)]. When considering highly evolved biological systems such as RNA viruses, one of their relevant features is information compactness. Each nucleotide can affect several traits (RNA secondary structure, interaction with RNA or proteins, codon composition, and amino acid sequence of the expressed protein, among others); the RNA is per se part of the phenotype. Furthermore, it is increasingly evident that most proteins encoded by viral genomes are multifunctional [several examples have been described in (Domingo 2016)]. Can a large neutral space be reconciled with genome compactness? Alternative constellations of mutations may confer indistinguishable overall replicative ability in a given environment. Epistatic interactions among mutations may yield genomes with similar fitness despite differing in nucleotide sequence. The existence of similar fitness values in distant positions of sequence space is possible thanks to high mutation rates.
An average rate of $10^{-4}$ mutations per nucleotide copied means that each RNA molecule that acts as template for replication has a high probability of acquiring one mutation, since RNA virus genomes are on the order of $10^{4}$ nucleotides long. In any infected cell, a hundred thousand progeny genomes are generated with multiple combinations of mutations. In cell culture or in vivo settings (infected tissues and organs) the number of infected cells may reach millions. (There are exceptions of viral pathogens that in persistent infections display limited fecundity; how such a limitation affects the adaptive potential of these viruses is not known). Thus, high mutation rates and large population sizes may provide constellations of mutations that behave as collectively neutral despite the non-neutral nature of the individual mutations.
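
The arithmetic behind the statement that a template has a high probability of acquiring a mutation can be made explicit. The sketch below assumes a genome of 10,000 nucleotides (an illustrative round figure, not a value taken from the text), a uniform rate of 10^-4 mutations per nucleotide copied, and independence among sites; the Poisson distribution is used as the usual approximation for the number of mutations per copy.

```python
import math


def prob_at_least_one_mutation(rate_per_nt, genome_length):
    """Probability that one copying round introduces one or more mutations,
    assuming independent errors at every site."""
    return 1.0 - (1.0 - rate_per_nt) ** genome_length


def poisson_probability(k, mean):
    """Poisson probability of exactly k mutations given the mean per genome."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mutation_rate = 1e-4      # mutations per nucleotide copied (from the text)
genome_length = 10_000    # assumed genome length in nucleotides

mean_mutations = mutation_rate * genome_length
p_mutated = prob_at_least_one_mutation(mutation_rate, genome_length)
p_unmutated_poisson = poisson_probability(0, mean_mutations)

print(f"mean mutations per genome copied: {mean_mutations:.2f}")
print(f"probability of at least one mutation: {p_mutated:.3f}")
print(f"Poisson probability of an error-free copy: {p_unmutated_poisson:.3f}")
```

Under these assumptions roughly two out of three progeny genomes differ from their template, so a cell that yields many thousands of progeny will contain a large and varied collection of mutants.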

Collective properties of mutant spectra

Selection of an individual genome from a viral quasispecies is never strictly individual because replication entails mutation (Fig. 1). What will be selected is a collection of related mutants that share the mutation (or mutations) that confer on the virus an advantage in the selective environment. Viruses with different mutations that express the same phenotype may be the object of selection. Studies on the behavior of FMDV quasispecies reconstructed in the laboratory with low frequency mixtures of two categories of antigenic variants, which shared resistance to a neutralizing antibody but differed in relative fitness when tested in isolation, illustrated how the composition of the cloud may assist the selected genomes in subsequent adaptability. When the reconstructed population was passaged in the absence of antibody, one category of antigenic variants became dominant but it was surrounded by a cloud of the other category. The dominant category was inverted when the reconstructed quasispecies was passaged in the presence of the antibody. The results point to a mechanism of antigenic flexibility (which is essential for the survival of most viruses) based on the mutant spectrum, since the variants selected for a phenotypic profile were surrounded by a cloud of mutants of the alternative profile (Martin and Domingo 2008). Thus, even if we term this mode of selection individual, relevant intricacies operate in the response of mutant spectra to a selective agent.

Selection will be conditioned by the fitness of an individual genome in a new environment as well as by the permissive or suppressive effect of the surrounding mutant spectrum. The effect of fitness has been extensively illustrated in clinical virology with studies of viral mutants which are resistant to drugs administered in therapy. These mutants (too numerous to describe here since they have been found for any pathogenic virus for which a therapeutic drug is available) must resist the inhibitory activity of antiviral agents while maintaining their basic requirement to produce progeny. In the majority of cases, the required mutations inflict a fitness cost, often counteracted by compensatory mutations when allowed by the viral population size (capacity of sequence space exploration). The suppressive effects of mutant spectra have been described both in theoretical and experimental studies (Crowder and Kirkegaard 2005; de la Torre and Holland 1990; González-López et al. 2004; Kirkegaard et al. 2016; Swetina and Schuster 1982). When the mutant spectrum is favorable because it includes many mutants that are neighbors in sequence space and fitness to the mutants apt to be selected, the ensemble has a higher probability of responding positively to the selective force. A related formulation was popularized with the terms “advantage of the flattest” (Wilke et al. 2001), also established in previous studies (Schuster and Swetina 1988).

Quasispecies memory

The evidence of a molecular memory in viral quasispecies (Briones and Domingo 2008; Ruiz-Jarabo et al. 2000) reinforced the biological role of mutant spectra. The origin of quasispecies memory is as follows. A mutation that confers a selective advantage in the face of an environmental change that has not occurred yet (for example the irruption of an antibody in a milieu where virus particles circulate) often belongs to a minority genome that has lower fitness than other more abundant components of the same mutant spectrum. However, in the process of being selected by the antibody, the mutant and its accompanying cloud will gain fitness because of their higher replication level than other members of the mutant spectrum (Escarmís et al. 1999; Novella et al. 1995). When the selective constraint (in this case the antibody) is removed, the genetic change responsible for the antibody resistance remains in the population at a significantly higher frequency than in the original population. With the genetic markers used in our experiments, the frequency was 10- to $10^{2}$-fold higher than the original frequency (before memory implementation) (Arias et al. 2004; Ruiz-Jarabo et al. 2000). The higher mutant frequency due to the history of the quasispecies is called memory, and its presence has been documented with several viruses and genetic markers in cell culture and in vivo (Briones and Domingo 2008). Memory is lost when the population is subjected to a bottleneck event that eliminates low frequency genomes including memory genomes. Since memory can affect the response to recurrent selective constraints that abound in nature, it follows that when the mutant spectrum that accompanies the dominant viral genomes includes memory genomes, the latter can contribute to future adaptive responses (Briones and Domingo 2008). In the absence of the triggering selective constraint, memory levels decayed with time.
Interestingly, the decay rate was identical in parallel viral lineages, suggesting a deterministic behavior of viral populations (Ruiz-Jarabo et al. 2003b). Deterministic features have also been observed in other experimental settings with RNA viruses (Clarke et al. 1994; Escarmís et al. 2002; Lázaro et al. 2002, 2003; Quer et al. 1996, 2001). Viral quasispecies probably hover between deterministic and stochastic responses (Rouzine et al. 2001), driven by population size and still largely unknown influences. It is worth stressing that a deterministic behavior is not a requirement for viral populations to be described as quasispecies (Domingo and Schuster 2016a) (see also “Replication, mutation, and the scope of the quasispecies concept”); the contrary view is a misconception still seen in the literature on virology.

Mutant collectivities and viral pathology

There are several additional examples of collective behavior of viral quasispecies. A poliovirus with substitution G46S in its polymerase that conferred a three- to fivefold increase in the template-copying fidelity (decreased error rate) did not replicate efficiently in the brain of transgenic, susceptible mice. However, when the mutant was co-inoculated with the wild type virus, the mutant virus infected the brain (Vignuzzi et al. 2006). Other forms of complementation can occur between specific mutants that arise within viral quasispecies. A clonal population (derived from a single initial genome) of FMDV evolved to produce two genomic forms, each with a large internal deletion. The two forms produced infectious progeny by complementation, in the absence of the standard genomic RNA (without deletions) (García-Arriaza et al. 2004, 2006; Moreno et al. 2014; Moreno and Perales 2016). This finding represents a remarkable case of RNA genome segmentation (a major evolutionary transition), prompted by exploration of sequence space.

Flaviviruses offer additional evidence. A defective dengue virus mutant with a nonsense mutation in the surface envelope protein was transmitted and maintained in infected cells (Aaskov et al. 2006). In West Nile virus, increased levels of co-infection led to complementation among variants, and contributed to maintaining genetic and phenotypic diversity; fitness levels of mutant swarms exceeded the fitness of individual genome types (Ciota et al. 2012). In a population of type A human influenza virus H3N2, two neuraminidase mutants cooperated because the efficacy of one mutant type for cell entry was combined with the efficiency of the other mutant to exit the cell (Xue et al. 2016).

Shirogane et al. have distinguished complementation (a genome encoding a functional protein permits replication of another closely related genome in which the corresponding protein is defective) from cooperation (two genomes produce a new phenotype through interaction between variant proteins) (Shirogane et al. 2016). They documented a case of cooperation in measles virus, with studies on membrane fusion activities that depend on two viral proteins, H and F, that mediate entry into the host cells. A truncated form of H failed to cause cell-to-cell fusion. Membrane fusion was restored when the truncated H was accompanied by two forms of F protein, the wild type and a mutant with a single amino acid substitution, but not when it was accompanied by either of the two F versions individually. The new phenotype was attributed to heterotrimers of F that displayed enhanced fusion activity with wild type H as compared with the homotrimers of wild type F (Shirogane et al. 2012, 2016). These observations suggest an interpretation of early data obtained with other viruses that showed that fitness of a mutant ensemble was on average higher than fitness of individual biological clones isolated from the ensemble (Domingo et al. 1978; Duarte et al. 1994). Presumably, micro-complementation events among components of a mutant spectrum were diminished following the cloning event.

Opposite to complementation is interference exerted among components of the mutant spectrum. This was first evidenced by suppression of high fitness vesicular stomatitis virus (VSV) by a population of lower fitness (de la Torre and Holland 1990). Also, high fitness antigenic variants of FMDV were suppressed by low fitness antibody-escape mutants (Borrego et al. 1993). Karla Kirkegaard and colleagues showed that suppressive effects could delay replication of drug-resistant mutants (Crowder and Kirkegaard 2005). Antiviral drug targets may be chosen in a way that the growth of a resistant mutant is inhibited by the drug-susceptible genomes within the same replicative units. If suppression can be sustained in time, the problem of dominance of drug-escape mutants can be avoided; experimental evidence in support of this strategy has been obtained with several RNA viruses (Kirkegaard et al. 2016).

Attenuated poliovirus could suppress virulent virus present in vaccine preparations (Chumakov et al. 1991). Pathogenic LCMV can cause a growth hormone deficiency syndrome in mice; this pathology was suppressed by non-pathogenic LCMV variants (Teng et al. 1996). A mutagenized FMDV population exerted a suppressive effect on standard genomes (González-López et al. 2004). Collective suppression was also achieved with an excess of capsid or polymerase mutants, and in the case of the polymerase the interference was dependent on a single amino acid substitution (Perales et al. 2007). A mutant spectrum composition can affect virus virulence in ways that are only beginning to be understood (Farci 2011; Sanz-Ramos et al. 2008).

Some collective effects of mutant distributions can be facilitated in cases in which multiple viral particles are packaged and transmitted en bloc in lipid vesicles (Altan-Bonnet and Chen 2015; Chen et al. 2015; Nagashima et al. 2014; Sanjuan 2017). It is common practice to treat viral populations with mild detergent to disassemble aggregates prior to biological cloning (Sheldon et al. 2014; Sobrino et al. 1983), to derive progeny from a single infectious particle. In addition to en bloc entry of multiple particles in vesicles, there are other mechanisms that favor intracellular interactions among components of a viral quasispecies, such as high MOI in the absence of aggregates, or cells that preferentially take up multiple viral particles (Cicin-Sain et al. 2005; Chohan et al. 2005; Del Portillo et al. 2011). Obviously, even when a single particle enters a cell, mutations occurring during intracellular replication can engage the carrier genomes in mutual interactions, a situation which is particularly relevant during lethal mutagenesis (see “Lethal mutagenesis and error threshold”). Thus, there is ample evidence that the existence of viruses as mutant spectra can affect their behavior, either permitting new phenotypes or modulating the expression of others, in ways and with consequences that are only beginning to be understood. In this scenario, new strategies to combat viral diseases are under study that try to jeopardize virus survival by favoring negative intra-population interactions (Fig. 3). Such strategies are also based on concepts derived from quasispecies theory.

Fig. 3

A schematic view of interactions among components of a mutant spectrum. Left: standard viral genomes are depicted as blue circles and interfering mutated genomes as red rough circles. The figure shows the transition from dominance of positive (complementing) interactions towards dominance of negative (interfering) interactions as the average error rate of the system is increased. Right: A specific case of interference exerted by mutated forms of a protein that is functional as a hexameric structure. This is one of the mechanisms postulated to be involved in lethal defection (see text for data and references) [figure adapted from (Domingo et al. 2012), with permission from American Society for Microbiology, Washington DC, USA].

Lethal mutagenesis and error threshold

Lethal mutagenesis is an antiviral strategy that consists of raising mutation rates in RNA viruses above the basal level determined by polymerase fidelity. The enhanced mutation rates are generally achieved by introducing mutagenic nucleotide analogues during intracellular replication of the virus. In favor of lethal mutagenesis being potentially a broad spectrum antiviral design, the negative effects of enhanced mutation rates have been observed with RNA viruses displaying widely different replication mechanisms. Furthermore, virus extinction by nucleotide analogues has been achieved both in cell culture model systems and in vivo with natural hosts of the relevant viruses. In this context, noteworthy contributions have been the proof of principle of 5-fluorouracil interfering with a persisting LCMV infection of mice (Ruiz-Jarabo et al. 2003a), the effective extinction by favipiravir (T-705) of norovirus from mice (Arias et al. 2014), and a first clinical trial with a pyrimidine analogue administered to AIDS patients who had lost other therapeutic options (Mullins et al. 2011). Although this clinical trial was unsuccessful in the sense that HIV-1 was not eradicated from patients (a truly challenging aim with any type of antiviral therapy), the resident HIV-1 was mutagenized, opening prospects of application of lethal mutagenesis for human infections. Two antiviral agents used in the clinic to treat several viral infections may be exerting at least part of their antiviral activity via lethal mutagenesis. They are the two purine nucleoside analogues ribavirin and favipiravir (T-705), increasingly used as antiviral agents for established and emerging diseases. The discovery that their nucleoside-triphosphate derivatives, which are produced intracellularly, could act as antiviral mutagens (Baranovich et al. 2013; Crotty et al. 2000) was a surprise that has opened a new chapter in our understanding of lethal mutagenesis and antiviral designs generally.

The mechanisms by which high mutation rates are detrimental to RNA viruses may seem obvious since the forced introduction of mutations in any biological system tends to be deleterious rather than beneficial. However, several studies on the molecular events underlying virus extinction suggest a more complex picture than initially thought, despite agreement on the detrimental effect of enhanced mutation rates when viruses replicate close to an error threshold for maintenance of genetic information, as is the case of RNA viruses (Holland et al. 1990; Loeb et al. 1999; Schuster 2016).

Several models have been proposed to interpret virus extinction by an excess of mutations. Some models are related to the error threshold concept of quasispecies theory (Fig. 1) (Eigen 2002), while other models deviate from the error threshold concept (Tejero et al. 2016). An important point is the distinction between the error threshold as defined in quasispecies theory, which implies disappearance of the master sequence, and the extinction threshold, which implies disappearance of the entire population (Bull et al. 2005; Tejero et al. 2016). In our view it is important to confront models with the experimental findings when following events underlying virus extinction by mutagenic agents. Some models propose that high mutation rates could lead to viral genomes with higher resistance to extinction than their parental genomes. However, there is experimental (Martin et al. 2008) and theoretical (O’Dea et al. 2009) evidence against such a proposal. It seems highly unlikely that under a continuous (not transient) aggression to a genome collectivity with forced acquisition of mutations, the genome population could, in the time frame of genome replication, move collectively towards areas of sequence space in which mutations decrease their average deleteriousness.
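To make the error threshold concrete, the following minimal sketch uses the classical single-peak approximation of quasispecies theory, in which a master sequence of selective superiority sigma is maintained while sigma · q^nu > 1, where q is the average per-site copying fidelity and nu the genome length. The function names and all numerical values are illustrative assumptions, not measurements for any virus. The sketch addresses only the loss of the master sequence; it says nothing about the extinction threshold, which depends on the absolute capacity of the whole population to produce progeny.

```python
import math


def master_is_maintained(sigma, q, nu):
    """Single-peak criterion: the master persists while sigma * q**nu > 1.

    sigma: selective superiority of the master over the mutant cloud
    q:     average per-site copying fidelity (0 < q <= 1)
    nu:    genome length in nucleotides
    """
    return sigma * q ** nu > 1.0


def max_genome_length(sigma, q):
    """Approximate maximum length nu_max = ln(sigma) / (1 - q)."""
    return math.log(sigma) / (1.0 - q)


def critical_error_rate(sigma, nu):
    """Per-site error rate (1 - q) at which sigma * q**nu equals 1."""
    return 1.0 - sigma ** (-1.0 / nu)


if __name__ == "__main__":
    sigma = 20.0      # hypothetical superiority of the master sequence
    nu = 10_000       # hypothetical genome length (nucleotides)
    for error_rate in (1e-4, 2e-4, 5e-4):   # hypothetical, e.g. with a mutagen
        q = 1.0 - error_rate
        status = "maintained" if master_is_maintained(sigma, q, nu) else "lost"
        print(f"error rate {error_rate:.0e}: master sequence {status}")
    print(f"critical error rate: {critical_error_rate(sigma, nu):.2e}")
```

In this idealized setting a modest increase in the per-site error rate, of the kind a mutagenic analogue might cause, is enough to cross the threshold, which is the sense in which RNA viruses are said to replicate close to it.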

Two broad classes of mechanisms of resistance to mutagenic agents unrelated to robustness acquisition have been described. One is due to amino acid substitutions in the viral polymerase (or in a protein functionally associated with the polymerase) that limit incorporation of the mutagenic nucleotide or counteract the mutational bias introduced by the mutagenic agents (Agudo et al. 2010, 2016; Borderia et al. 2016; de la Higuera et al. 2017; Pfeiffer and Kirkegaard 2003). The selection of mutagen-resistant mutants appears to follow the pattern of competitive dynamics described for standard inhibitors. In some cases there might be cross-resistance among different mutagenic agents (Mihalik and Feigelstock 2013), while in other cases a mutant resistant to a mutagenic agent can be extinguished by subjecting it to a different mutagenic agent (Perales et al. 2009). The second mechanism of mutagen resistance that we have identified is viral fitness or a fitness-associated trait, so far described only for HCV (Gallego et al. 2016; Sheldon et al. 2014). The fitness-dependent resistance affects both mutagenic and non-mutagenic inhibitors and its molecular basis is not well understood, but its occurrence does not require that the virus be subjected to enhanced mutagenesis.

Based on our experimental evidence and a theoretical model developed by Susanna Manrubia and her colleagues, our current view is that the transition towards extinction can have two different, probably overlapping, phases. Both phases involve mutations, but mutations occur with different intensity and with distinct biological consequences. In a first phase, with a moderate increase in mutational load, viral replication is still possible but defective genomes arise that can still replicate and interfere with the replication of the standard virus (see “Collective properties of mutant spectra”). This first phase was formulated as the lethal defection model of virus extinction since the virus is eventually driven to extinction, as evidenced experimentally and by model predictions (Grande-Pérez et al. 2005; Iranzo and Manrubia 2009) (Fig. 4). In the following phase, as the number of mutations per genome increases, bona fide lethality gradually replaces lethal defection, driving the entire system towards extinction (Perales and Domingo 2016). As expected, the deleterious effect of mutations is also noted in those viruses that still maintain a capacity to produce progeny during the transition towards extinction. This was shown by the fitness impairment of viable FMDV genomes rescued from a population subjected to ribavirin mutagenesis (Arias et al. 2001). Since the rescued viruses are the most viable subset of those present during mutagenesis, this result argues for a collective, gradual loss of fitness, with lethal defection and directly lethal mutations (or combinations of mutations) playing a role in loss of infectivity.

Fig. 4

Representation of the connection between the error threshold concept of quasispecies theory, and experimental observations when RNA viruses are replicated in the presence of mutagenic agents. The central horizontal line marks the theoretical copying fidelity limits: 1, perfect fidelity (no errors in template copying); 0, complete lack of fidelity (any template residue can be copied into any other template residue). As fidelity diminishes by the increasing activity of a mutagenic agent, the quasispecies distribution moves towards lethal defection and overt lethality of mutations, preceding the violation of the error threshold (or extinction threshold) and loss of infectivity. See text for data, and for the experimental evidence in connection with alternative models of lethal mutagenesis. Scales at the bottom indicate a few of several influences that can accelerate or delay virus extinction [figure adapted from (Domingo et al. 2012), with permission from American Society for Microbiology, Washington DC, USA].

Research on lethal mutagenesis continues today on several fronts: (i) the application of the known mutagenic base analogues to established and newly emerging viral pathogens for which limited therapeutic options are available; (ii) the discovery of new mutagenic agents, some through drug repositioning (the use of drugs that are already employed for other purposes), for example from chemotherapeutic agents for cancer into antiviral agents; this search is being pursued not only for many families of RNA viruses but also for retroviruses (HIV-1 being the most important target) and hepadnaviruses such as hepatitis B virus, the latter two being a very special challenge because of the reservoir of intracellular DNAs (proviral DNA and cccDNA, respectively) that ideally should be activated before analogues can lethally mutate the replicating virus; (iii) last but not least, the study of viral dynamics and of how viruses can cope with abnormal error rates, a promising field that may bring additional practical consequences.

Extensions and conclusions

It is now 47 years since the first seminal paper on error-prone replication by Manfred Eigen was published (Eigen 1971), and 40 years since the first experimental evidence of quasispecies in viruses was obtained (Domingo et al. 1978). Since then, new developments in cell biology, microbiology and virology that have permitted characterizing genome compositions at the population level have evidenced a degree of overall complexity totally unpredicted at the time when quasispecies theory was formulated. Normal and pathological cell populations in parasites or animal organs, and bacterial collectivities with or without biofilm organizations, are all highly diverse, and in some cases their individual cell components appear more and more as genetically unique. Since generation of diversity is a key ingredient of quasispecies theory, it is not surprising that the theory and its derived error threshold concept have found application in the understanding of such diverse biological systems. Several examples dealing with bacterial collectivities, tumor cell dynamics, or strain specificity of prions can be found in recent reviews and references (Bertels et al. 2017; Domingo 2016; Domingo and Schuster 2016b; Loeb 2011; Sardanyes et al. 2017; Vanni et al. 2016; Weissmann et al. 2011). In all these systems the core concepts of enhanced exploration of sequence space by heterogeneous populations, and interactions among components of a collectivity apply. However, there is a key feature that stands out in the case of RNA viruses. Due to their limited genomic size, the RNA viruses can benefit most from their existence as mutant distributions. While in the walks in sequence space of cellular populations only a tiny minority of possible mutants can be explored, in many viral populations all possible single mutants and a considerable proportion of the possible double and higher order mutants can participate in adaptive processes.
Interestingly, current evidence suggests that, whatever their available mutant repertoire is, both viruses and cells can engage in competitive ratings among individual members of a collectivity that result in adaptability and survival.
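The contrast between viral and cellular walks in sequence space follows from simple counting. A genome of length L has C(L, k) · 3^k distinct variants carrying exactly k point substitutions. The sketch below evaluates this expression for two genome lengths chosen purely for illustration (10,000 nucleotides for an RNA virus and 3 × 10^9 for a cellular genome); insertions and deletions are ignored, and the function name is ours.

```python
import math


def mutants_at_distance(genome_length, k):
    """Number of distinct sequences carrying exactly k point substitutions.

    Choose k of the genome positions, then one of the 3 alternative
    nucleotides at each chosen position: C(L, k) * 3**k.
    """
    return math.comb(genome_length, k) * 3 ** k


if __name__ == "__main__":
    viral_genome = 10_000            # hypothetical RNA virus genome length
    cellular_genome = 3_000_000_000  # hypothetical cellular genome length
    for name, length in (("viral", viral_genome), ("cellular", cellular_genome)):
        for k in (1, 2):
            print(f"{name} genome, {k}-substitution mutants: "
                  f"{mutants_at_distance(length, k):.2e}")
```

For the illustrative viral genome there are 30,000 possible single mutants and about 4.5 × 10^8 double mutants, whereas the cellular genome already has 9 × 10^9 single mutants. Comparing such counts with the size of the replicating population indicates how much of the neighboring sequence space can be sampled.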

The application of quasispecies to virology, the focus of the present article, illustrates why basic research in any form should be supported by the scientific community, not only as part of the need to understand nature but also because practical applications can come in the most unexpected ways. A proposal to study quasispecies to design new antiviral interventions would have been immediately dismissed 20 years ago. At present it makes sense, despite reluctance that is gradually fading.