Introduction

Perhaps the most astonishing feature of the virosphere is its diversity. Across the three domains of life, viruses display a stunning versatility in virion organization and genomic content [48]. Furthermore, viruses span an entire range of morphological, genomic, and functional complexity. Some viruses are organized in an extremely simple manner, whereas others are exceedingly complex, surpassing some unicellular organisms in terms of physical dimensions and the number of genome-encoded proteins. Irrespective of complexity, replication of all viruses depends on certain functions provided by the host cell, but the extent of such dependence varies from one virus to the other. Some of the most complex viruses, such as members of the family Mimiviridae, encode many of the molecular machineries required for their multiplication [3, 79], whereas viruses with short genomes, such as circoviruses, have evolved masterful host manipulation strategies that allow hijacking all necessary components from the host cell to support their replication [25]. More generally, exploration of viral diversity has revealed a continuum of genome and virion sizes within the viral world, and any threshold between small and large viruses appears increasingly arbitrary [29]. Nevertheless, some groups of viruses appear to be discriminated against based on the level of their complexity. In particular, the classification scheme used by the International Committee on Taxonomy of Viruses (ICTV) does not extend to certain viruses, commonly known as satellite viruses, which, for successful propagation, require certain functions to be provided by other viruses. Paraphrasing George Orwell, it thus seems that “All viruses are equal, but some viruses are more equal than others”.

The ability to form infectious particles is a feature unique to classified viruses and distinguishes them from other types of (unclassified) mobile genetic elements, such as plasmids and certain transposable elements [51, 80]. It should be noted, however, that classified viruses from several taxonomic groups do not form virions (e.g., endornaviruses, hypoviruses, narnaviruses, umbraviruses [49]). Satellite viruses do encode components required for virion formation. Nevertheless, they are currently not classified into the same taxon ranks as the “full-fledged” viruses. Instead, satellite viruses are banished into a broad category called “sub-viral agents” on an equal footing with non-viral parasitic nucleic acids (satellite nucleic acids and viroids), and even prions [47]. Paradoxically, a distinct classification system has been put in place for the non-protein-coding viroids (family names ending in “-viroidae”, genus names ending in “-viroid”) [47], whereas protein-coding satellite nucleic acids remain unclassified. Furthermore, the categorization of satellite viruses as “sub-viral” agents is not applied consistently. For example, adenovirus-associated satellite viruses (AAVs) that depend on members of the families Herpesviridae, Adenoviridae, Papillomaviridae or Poxviridae for replication have been assigned to the genus Dependoparvovirus, included in the family Parvoviridae, whereas satellite hepatitis delta virus (HDV), which uses hepatitis B virus (family Hepadnaviridae) as a helper virus, is classified as a member of the free-floating genus Deltavirus. Notably, although HDV uses the envelope proteins of the helper virus, it also encodes two proteins, S-HDAg and L-HDAg, which form a ribonucleocapsid [11], thereby adhering to the definition of a satellite virus. By contrast, none of the remaining satellite viruses, some of which are considerably more complex than AAVs and HDV, have undergone proper taxonomic classification. 
For example, as of the latest (Ninth) Report of the ICTV [47] and its updates, the Sputnik virophage, a satellite virus with a complex T = 27 virion and an 18-kb dsDNA genome that encodes structural and DNA replication proteins [58, 109], is labelled a sub-viral agent. Such unsubstantiated separation of satellite viruses from the remainder of the viral world has previously fuelled discussions on the necessity to reassess the classification of these entities [22, 28, 31, 52–54].

Here, we argue that all nucleic-acid-containing non-organismal entities that encode their own capsid proteins are to be classified within proper viral taxa, regardless of whether or not they depend on another virus for replication. We propose a consistent classification scheme for satellite viruses, including ssRNA satellite viruses that infect plants or arthropods, as well as for the dsDNA virophages found in protists. Although satellite nucleic acids that do not encode their own capsid proteins often also display clear evolutionary relationships to bona fide viruses, they are not considered in this proposal.

Satellite viruses of plants

The phenomenon whereby one virus depends for its propagation on another virus was first described in a plant virus system in the early 1960s [43, 46]. Certain preparations of tobacco necrosis viruses (TNV; genera Alphanecrovirus and Betanecrovirus, family Tombusviridae) contained two types of spherical particles that differed in size and antigenic properties; whereas the larger, TNV, particles could propagate autonomously, the smaller ones were unable to replicate in the absence of the larger ones [46]. The smaller particles became known as virions of satellite tobacco necrosis virus (STNV), whereas TNV is referred to as STNV’s helper virus. Subsequently, several other plant viruses having features similar to those of STNV have been discovered, suggesting that such parasitic virus-virus associations are not uncommon among plant viruses. Different properties of plant satellite viruses have been reviewed previously on multiple occasions [24, 30, 38, 68, 101]. All of these satellite viruses have single-stranded (ss) RNA genomes of positive polarity that are packed into small capsids exhibiting T = 1 icosahedral symmetry (reviewed in reference [6]). The virions are constructed from 60 copies of the capsid protein (CP), which adopts the jelly-roll topology (Fig. 1A). Based on sequence similarity, plant satellite viruses can be broadly classified into four groups, which are briefly described below.

Fig. 1

Relationships between plant satellite viruses. A. Structural similarity between the virions (top) and jelly-roll capsid proteins (bottom) of satellite tobacco necrosis virus (STNV; PDB ID: 2BUK), satellite panicum mosaic virus (SPMV; PDB ID: 1STM), and satellite tobacco mosaic virus (STMV; PDB ID: 4OQ8). All three virions have T = 1 icosahedral symmetry. Images of the depicted virions were downloaded from the VIPER database (http://viperdb.scripps.edu/). B. Pairwise identities between capsid proteins of plant satellite viruses, calculated using SIAS (http://imed.med.ucm.es/Tools/sias.html). SPMV-like capsid protein homologs encoded by satellite RNAs (OVsatRNA and satBaMV) and SGVV are shaded in light green. The boxes containing identity values among the capsid proteins of STNV-like viruses, SMWLMV, SPMV-like viruses, and STMV are coloured according to the proposed classification of the plant satellite viruses: Albetovirus, cyan; Aumaivirus, orange; Papanivirus, green; Virtovirus, red. Abbreviations: SMWLMV, satellite maize white line mosaic virus; SSADV, satellite St. Augustine decline virus; SGVV, satellite grapevine virus; OVsatRNA, olive viral satellite RNA; satBaMV, bamboo mosaic virus satellite RNA. Note that OVsatRNA and satBaMV are not satellite viruses but satellite nucleic acids because they are packed into the virions of helper viruses

Viruses related to satellite tobacco necrosis virus

STNV is one of the most extensively studied satellite viruses. Over the years, many properties of this virus, including its genome sequence, virion structure and assembly, as well as interaction with the helper virus, have been elucidated [41, 46, 68, 75]. The linear STNV genome consists of 1,239 nucleotides and encodes a single CP (195 aa), which is necessary and sufficient for virion formation. Assembly of icosahedral virions proceeds cooperatively via interactions between multiple CP copies and the packaging signals, which are degenerate stem-loop structures distributed throughout the genome [75].

As in the case of the helper TNV, the 5′ end of the STNV genome is phosphorylated and lacks a 7-methylguanylate cap or a genome-linked protein, whereas the 3′ terminus lacks a polyadenylation sequence [68]. Several cis-acting elements located within the 5′ and 3′ untranslated regions (UTRs) are responsible for efficient translation and replication of the STNV genome [15, 66, 94, 98]. The 3′ and 5′ UTRs of STNV and TNV can be exchanged without abolishing RNA accumulation [15], although the translation elements in the 3′-terminal regions appear to be unrelated in the two viruses [87].

In its natural habitat, STNV is transmitted through the soil the same way as its helper virus, i.e., by zoospores of a plant-pathogenic fungus (Olpidium brassicae). STNV typically has a negative effect on the propagation of its helper virus, which manifests as a decrease in (i) the number of necrotic lesions formed in co-inoculated plant leaves, (ii) the diameter of the lesions, and (iii) the amount of TNV produced in inoculated leaves [44]. However, several factors influence the extent to which TNV multiplication is affected, particularly the relative concentrations of the satellite and helper viruses in the inoculum, the physiological state of the host plants before and during the infection, and the type of plant used for the assay [44].

Three serotypes of STNV have been described, namely STNV-1 (or STNV), STNV-2, and STNV-C [45]. Different STNV strains are supported by different helper viruses. The replication of STNV-1 and STNV-2 is supported by isolates of tobacco necrosis virus A (TNV-A, the sole member of the type species of the genus Alphanecrovirus, Tobacco necrosis virus A), whereas tobacco necrosis virus D (TNV-D, the sole member of the type species of the genus Betanecrovirus, Tobacco necrosis virus D) supports the replication of STNV-C [45]. Genome sequences for the three STNV strains have been determined (Table 1) [14, 19, 103]. The overall organization of the genomes of the three viruses is similar. The CPs are ≈50-63 % identical in sequence (Fig. 1B). Whereas the 5′ UTRs of STNV-1 and STNV-2 are 30 nt in length and are nearly identical, the 5′ UTR of STNV-C is significantly longer (101 nt). Similarly, the 3′ UTR of STNV-C RNA differs markedly from those of STNV-1 and STNV-2 (40 and 38 % similarity, respectively), which are approximately 64 % similar to each other [14].

Table 1 General properties of plant satellite viruses

Satellite maize white line mosaic virus

The fourth, more divergent member of the STNV-like virus group is satellite maize white line mosaic virus (SMWLMV). SMWLMV depends on maize white line mosaic virus (MWLMV; species Maize white line mosaic virus, genus Aureusvirus, family Tombusviridae) for multiplication [108]. MWLMV can infect maize in the absence of SMWLMV, whereas SMWLMV can infect maize only when co-inoculated with MWLMV. The ssRNA genome of SMWLMV is 1,168 nucleotides in length and encodes one capsid protein [108]. As for STNV-like viruses, the SMWLMV virion is 17 nm in diameter, but there is only limited sequence similarity between the SMWLMV capsid protein and the corresponding proteins of STNV-like viruses. Accordingly, SMWLMV was previously considered unrelated to other satellite viruses [101, 108]. However, BLASTp searches seeded with the SMWLMV CP sequence result in a significant match to the corresponding protein of STNV-1 (32 % identity over 177 aa; E = 1e-14). Consistently, CD-search against a comprehensive collection of domain models available at the NCBI’s Conserved Domain Database [64] shows that SMWLMV CP contains the TNV_CP domain (PF03898; E = 4.5e-96), indicating that SMWLMV and the STNV-like viruses have diverged from a common ancestor.

Viruses related to satellite panicum mosaic virus

Satellite panicum mosaic virus (SPMV) is completely dependent on panicum mosaic virus (PMV), a member of the species Panicum mosaic virus, genus Panicovirus, family Tombusviridae, for replication as well as systemic spread in plants [16, 95, 96]. As in the case of STNV, the 5′-terminus of the SPMV genome is phosphorylated and lacks a 7-methylguanylate cap [65]. Several secondary structure elements implicated in the replication of the SPMV genome were predicted in the 5′ and 3′ UTRs [65]. The 826-nt-long ssRNA genome of SPMV contains two open reading frames. However, only one of them (encoding CP) was found to be expressed in in vitro translation assays [65]. The sequence of SPMV CP is not appreciably similar to those of STNV-like viruses (below 15 % identity; Fig. 1B). However, X-ray structure analysis of the SPMV particle [7] revealed that the protein has a jelly-roll fold that is similar to that of the STNV CP (Fig. 1A). In addition to its structural role, SPMV CP has several other biological functions, most notably systemic accumulation, maintenance, and movement of the cognate SPMV RNA [71]. Interestingly, the latter activities of the SPMV CP apparently extend to the helper virus RNA, assisting in its maintenance or stabilization. As a result, co-infection of PMV with SPMV exacerbates the PMV disease phenotype in millet plants, resulting in severe chlorosis and stunting [86]. It is noteworthy that PMV and SPMV are involved in a peculiar tripartite association with a 350-nt-long satellite RNA (satRNA), whereby PMV provides necessary factors for satRNA replication and SPMV provides CPs for satRNA encapsidation [23].

Two other viruses encoding SPMV-like CPs have been reported. The first one, satellite St. Augustine decline virus (SSADV), is associated with the St. Augustine decline strain of PMV [8]. SSADV is 95 % identical to SPMV over the entire genome length (36 nt changes, 5 aa changes) and can be considered a different strain of SPMV. The second putative satellite virus, satellite grapevine virus (SGVV), was discovered by deep sequencing of total intracellular RNA from grapevine [2]. However, neither the viral particles nor the associated helper virus have been characterized. SGVV CP shares ≈24 % sequence identity with SPMV CP (Fig. 1B).
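The genome-wide identity quoted for SSADV can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that the 36 nucleotide changes are substitutions counted over the 826-nt SPMV genome length given above (the SSADV genome length is not stated here), and the function name is hypothetical.

```python
def percent_identity_from_changes(length_nt: int, changes_nt: int) -> float:
    """Percent identity when `changes_nt` positions differ over `length_nt` positions."""
    if length_nt <= 0 or not 0 <= changes_nt <= length_nt:
        raise ValueError("length must be positive and changes must lie in [0, length]")
    return 100.0 * (length_nt - changes_nt) / length_nt


# Values taken from the text; using the SPMV length as the denominator is an assumption.
SPMV_GENOME_NT = 826
SSADV_CHANGES_NT = 36

identity = percent_identity_from_changes(SPMV_GENOME_NT, SSADV_CHANGES_NT)
print(f"{identity:.1f} % identical")  # 95.6 % identical
```

Under these assumptions, the result agrees with the rounded value of 95 % given in the text.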

Homologs of SPMV CP are encoded by certain satRNAs. In particular, bamboo mosaic virus satellite RNA (satBaMV; 836 nt) encodes a protein, P20, that is 44 % identical to the CP of SPMV (Fig. 1B). P20 plays a role in the accumulation and movement of the satBaMV within the plant but does not participate in satRNA encapsidation [74]. Instead, satBaMV is packaged into rod-shaped particles by the CP of the helper bamboo mosaic virus, a member of the family Alphaflexiviridae [62]. In addition, a sequence of olive viral satellite RNA (OVsatRNA) has been deposited to GenBank (Table 1) that encodes a protein that is 35 % identical to P20 of satBaMV (Fig. 1B). Considering the conservation of the SPMV-like CPs, it appears likely that satBaMV and OVsatRNA have evolved from genuine satellite viruses, once again emphasizing the apparent ease with which transitions between different types of mobile elements (i.e., parasitic nucleic acids and viruses) occur.

Satellite tobacco mosaic virus

Satellite tobacco mosaic virus (STMV) has been isolated from tree tobacco (Nicotiana glauca) and is naturally associated with and dependent on tobacco mild green mosaic virus (TMGMV), a member of the species Tobacco mild green mosaic virus, genus Tobamovirus, family Virgaviridae [24]. However, under experimental settings, STMV can adapt and replicate in many plant hosts (e.g., tobacco, pepper, tomato) in association with other tobamoviruses, including tobacco mosaic virus (TMV) [97]. Thus far, STMV is the only known satellite virus that uses rod-shaped viruses as helpers.

The effects of STMV on the multiplication of its helper virus, as well as on the helper virus-induced symptoms, are dependent on the host [24]. In tobacco plants, STMV does not change the mild mosaic symptom caused by TMGMV, whereas in jalapeño pepper, severe leaf blistering induced by TMGMV is attenuated by STMV infection. Furthermore, tobamovirus titers are greatly decreased by STMV in pepper compared to other hosts [82].

The STMV genome is a linear ssRNA molecule of 1,059 nt that contains two open reading frames (ORFs), both of which are functional in the in vitro translation assay [67]. The first ORF encodes a protein of 58 aa that lacks similarity to proteins with sequences in the public databases and appears to be dispensable for STMV multiplication [84]. Indeed, certain naturally occurring isolates of STMV contain a deletion within ORF1 and do not produce the corresponding product [24]. The second ORF encodes STMV CP, which also has no identifiable homologs in sequence databases. However, structural analysis shows that the STMV CP has a jelly-roll fold similar to those of STNV and SPMV (Fig. 1A) [60], suggesting that the three satellite viruses might be evolutionarily related.

As in the case of STNV and SPMV, but unlike the helper TMV, the genome of STMV lacks a 7-methylguanylate cap, and the first six nucleotides of the STMV RNA are identical to those of the STNV genome. However, in contrast to STNV and SPMV, the 5′ end of the STMV genome is not phosphorylated [67]. The 3′ UTR is predicted to contain a series of pseudoknots followed by a tRNA-like structure, which can be aminoacylated with histidine [36]. The latter features are strikingly similar to those of the genome of helper TMV and other tobamoviruses, with 40–50-nt-long regions of near identity among the STMV and TMV 3′ UTRs [24]. These secondary structure elements play critical roles in STMV genome replication, translation, and initiation of virion assembly [84, 88].

Proposed classification of plant satellite viruses

Plant satellite viruses from the four groups described above all propagate in flowering plants (angiosperms). The viruses share several genomic and structural characteristics that distinguish them from other known viruses: (i) Representatives of all four satellite virus groups form capsids with T = 1 icosahedral symmetry (Fig. 1A). Although common among ssDNA viruses [55], T = 1 capsids are not used by other ssRNA viruses, which typically have larger T = 3 or T = 4 capsids [47]. (ii) The capsid proteins of all described plant satellite viruses are structurally homologous (Fig. 1A) despite negligible sequence similarity. Notably, the structure of the SMWLMV capsid protein is not available. However, sequence similarity to the corresponding proteins of STNV-like viruses (Fig. 1B) and the fact that these viruses have the same capsid diameter (17 nm) strongly suggest that the SMWLMV capsid protein also adopts the jelly-roll topology. (iii) In all cases, the linear ssRNA genomes lack 7-methylguanylate caps and polyadenylation sequences in their 5′ and 3′ UTRs, respectively. Furthermore, 5′ ends of the STNV and SPMV genomes are phosphorylated, whereas the first six nucleotides at the 5′ terminus of the STMV genome are identical to those in STNV. Considering these similarities, it is conceivable that all plant satellite viruses have evolved from a common ancestor. However, due to high divergence in their nucleotide and protein sequences, the monophyly of these viruses cannot be ascertained at this point. Thus, based on the comparison of their CP sequences, we propose to establish four unassigned genera for their classification (Fig. 1B). We propose to classify the STNV-like viruses STNV-1, STNV-2, and STNV-C into three species, Tobacco albetovirus 1, 2, and 3, respectively, within a new genus, Albetovirus (sigil: Al- for alphanecrovirus [helper virus], be- for betanecrovirus [helper virus], to- for tobacco). 
The low sequence similarity between the CPs of STNV-like viruses and SMWLMV calls for the creation of a separate genus for classification of the latter virus. Thus, for classification of SMWLMV, we propose creating a new species, Maize aumaivirus 1, within a new genus, Aumaivirus (sigil: Au- for aureusvirus [helper virus], mai- for maize). SPMV and SSADV could be assigned to the tentative genus Papanivirus (sigil: Pa- for panicovirus [helper virus], pani- for panicum), within the species Panicum papanivirus 1. It is premature to classify SGVV because virions of this putative SPMV-like satellite virus, as well as its helper virus, are yet to be characterized. The fourth genus, which we suggest naming Virtovirus (sigil: Vir- for virgavirus [helper virus], to- for tobacco), would include STMV as the sole representative of the species Tobacco virtovirus 1.

We recommend using pairwise sequence identity comparisons between the capsid proteins as the main demarcation criterion for future members of the genera. Within each proposed genus, currently known viruses show 45-90 % sequence identity between their capsid proteins, whereas viruses with capsid protein sequence identity lower than 45 % are classified into separate genera. The complete structure of the taxa proposed for classification of plant satellite viruses is summarized in Table 2.
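As an illustration of how the 45 % demarcation criterion might be applied in practice, the following minimal sketch groups pre-aligned capsid protein sequences into candidate genera. Several points are assumptions rather than part of the proposal: identity is computed over alignment columns in which neither sequence has a gap (tools such as SIAS offer several alternative denominators), viruses are merged by single linkage, and the sequences and names (`toyA`, `toyB`, `toyC`) are invented placeholders, not real capsid proteins.

```python
from itertools import combinations

GENUS_THRESHOLD = 45.0  # percent CP identity, as stated in the text


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_into_genera(aligned: dict, threshold: float = GENUS_THRESHOLD) -> list:
    """Single-linkage grouping (an assumption): link two viruses if identity >= threshold."""
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy alignment, for illustration only.
toy = {
    "toyA": "MKTAYIAKQR",
    "toyB": "MKTAYLAKQR",
    "toyC": "MPSG-IVWQE",
}
print(group_into_genera(toy))  # [['toyA', 'toyB'], ['toyC']]
```

Single linkage is only one possible convention; borderline cases near the threshold would still require expert judgement and supporting biological data.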

Table 2 Proposed family, genus, and species names for plant and arthropod satellite viruses, as well as virophages

Animal satellite viruses

In addition to members of the genus Dependoparvovirus (family Parvoviridae) and the free-floating genus Deltavirus, which have been properly established in the framework of the ICTV [47], three other satellite viruses associated with helper viruses infecting animals have been reported. These include chronic bee-paralysis satellite virus (CBPSV), extra small virus (XSV), and Nilaparvata lugens commensal X virus (NLCXV).

Chronic bee-paralysis satellite virus

CBPSV reproduction is strictly dependent on chronic bee-paralysis virus (CBPV), an unclassified virus of honey bees (Apis mellifera) that is evolutionarily related to members of the family Nodaviridae [1, 70, 81]. CBPSV has a negative effect on CBPV reproduction. The total amount of CBPV genomic RNAs (2 segments) is greatly reduced as CBPSV multiplication increases [5]. The efficiency of CBPSV replication appears to be host dependent, as worker and drone bees produce much less CBPSV than most queens [5].

CBPSV virions are isometric and serologically unrelated to the ellipsoidal virus particles produced by CBPV [4]. The virions are similar in size (17 nm) to those of plant satellite viruses (Table 1) and are constructed from a single CP of ≈15 kDa [4]. The genome consists of three segments (≈1,100 nt each) of linear ssRNA [72]. Occasionally, CBPSV RNAs might be encapsidated into the CBPV virions [72], although co-purification of CBPV and CBPSV particles could not be excluded. Unfortunately, neither the sequence of the genome nor the structure of the virion has been reported, precluding meaningful comparisons with other satellite viruses.

In the absence of a complete genome sequence, proper classification of CBPSV appears premature; however, the distinguishing features, particularly the segmented genome, of CBPSV suggest that a new genus will have to be created for its classification once the complete genome sequence becomes available.

Extra small virus

XSV and its helper virus, Macrobrachium rosenbergii nodavirus (MrNV), infect giant freshwater prawns and cause white tail disease, which is responsible for mass mortalities and important economic losses in prawn hatcheries and farms (reviewed in reference [10]). MrNV is currently unclassified, but sequence analyses clearly show that it is a genuine member of the family Nodaviridae [9]. XSV replication is dependent on that of MrNV, and the two viruses are always found together. However, the exact relationship and the effect of XSV multiplication on that of MrNV remain obscure [10, 107]. XSV and MrNV have been detected in aquatic insects of several species that were collected from nursery ponds containing freshwater prawn (Macrobrachium rosenbergii) infected with MrNV and XSV [89]. Both viruses could also replicate in mosquito cell lines, suggesting that aquatic insects serve as vectors for XSV and MrNV transfer [89]. Several XSV isolates from geographically remote locations, including the French West Indies, Thailand, Taiwan, China, and India, have been reported [10]. The isolates display 96-99 % sequence identity in their capsid protein genes [90].

The XSV genome is a linear positive-sense RNA molecule of 796 nucleotides, which, unlike other satellite viruses, contains a short poly(A) tail of 15-20 nucleotides at the 3′ end [100]. XSV particles are spherical, ≈15 nm in diameter and serologically unrelated to and considerably smaller than those of MrNV (Fig. 2) [63, 78]. The XSV particle is constructed from two CPs, CP-17 (17 kDa) and CP-16 (16 kDa), which are present in nearly equimolar ratios and which are independently translated, initiating from different start codons within the same gene [99, 100]. The 3′ UTR plays an important role in selective encapsidation of the XSV genome [61]. The capsid protein is not recognizably similar to proteins with sequences in public databases [100]. However, secondary structure prediction using Psi-Pred shows that the protein contains eight beta-strands, consistent with the jelly-roll fold found in the CPs of all other known satellite ssRNA viruses.

Fig. 2

Transmission electron micrographs of XSV (a) and MrNV (b) virions purified on CsCl gradients. Bars = 100 nm. Inset in b: higher magnification of MrNV; bar = 50 nm. Reproduced from reference 10 with permission from Elsevier

Considering the lack of sequence similarity to other viruses, we propose to classify XSV as the sole representative of the species Macrobrachium satellite virus 1 within the new genus Macronovirus (sigil: Macro- for Macrobrachium rosenbergii, no- for nodavirus [helper virus]) of the new family Sarthroviridae (sigil: S- for small, arthro- for arthropod, and the suffix for virus families, -viridae) (Table 2).

Nilaparvata lugens commensal X virus

NLCXV has been isolated from brown planthoppers (Nilaparvata lugens) along with Himetobi P virus (HiPV; species Himetobi P virus, genus Cripavirus, family Dicistroviridae, order Picornavirales) and Nilaparvata lugens reovirus (NLRV; species Nilaparvata lugens reovirus, genus Fijivirus, family Reoviridae) [69]. The virion of NLCXV is 30 nm in diameter and is considerably larger than those of all other described satellite RNA viruses (Table 1). The NLCXV genome consists of a 1,647-nt-long linear ssRNA molecule that lacks a poly(A) tail. The genome encodes a single CP of ≈50 kDa, suggesting that factors required for NLCXV genome replication are provided by the helper virus. This observation has led to the conclusion that NLCXV is a satellite virus. However, NLCXV propagation does not seem to be always associated with either HiPV or NLRV, and the actual helper virus, if any, remains to be identified [69]. Thus, more data on the replication mode and evolutionary origins of NLCXV are needed for proper classification of this virus.

Virophages

The serendipitous discovery of the giant Acanthamoeba polyphaga mimivirus (APMV; species Acanthamoeba polyphaga mimivirus, genus Mimivirus, family Mimiviridae), a dsDNA virus with a 500-nm large icosahedral capsid and a 1.2-Mbp genome [79], spurred research efforts to isolate further giant viruses from diverse aquatic and terrestrial environments. To date, several dozen mimiviruses have been isolated, and many of those have been genetically characterized [3, 17, 58, 59, 73] but not yet classified. One particular APMV strain called Acanthamoeba castellanii mamavirus (ACMV), which originated from a cooling tower in Paris, France, was accompanied by icosahedral virus particles that were only a tenth of the size of APMV [58]. This smaller virus, named Sputnik, replicated in the same amoebal host as ACMV, even though Sputnik replication was strictly dependent on co-infection with APMV or ACMV. Electron microscopy of co-infected cells revealed that Sputnik targeted the cytoplasmic replication factory of the giant virus and caused aberrant capsid phenotypes [58]. The presence of Sputnik interfered with ACMV propagation, resulted in decreased ACMV progeny, and increased host cell survival [58]. As a viral parasite of a virus, Sputnik was termed a “virophage” [31, 58]. Virophages are thus dsDNA viruses that depend on giant dsDNA viruses for their own propagation. In recent years, several additional virophages have been found, but many of them are known by genome only as a result of assembly from metagenomic datasets [17, 21, 33, 102, 106, 110, 111]. These viruses typically carry 17- to 30-kbp-long linear or circular dsDNA genomes. The virophages that have been isolated in culture produce icosahedral particles with diameters of 40-80 nm.

Sputnik

The circular dsDNA genome of Sputnik consists of 18,342 bp and encodes 21 ORFs that appear related to genes of other DNA viruses infecting eukaryotes, bacteria, and archaea [58]. The Sputnik genome is packaged in a 74-nm icosahedral protein shell. A cryo-electron microscopy-based reconstruction of the Sputnik virion at 3.5-Å resolution showed that the capsid has T = 27 quasisymmetry and is built from 260 pseudohexameric capsomers of the double jelly-roll fold major capsid protein (MCP, ORF 20) and 12 pentameric capsomers of the single jelly-roll minor capsid protein (mCP, ORF19) [91, 109] (Fig. 3A). The exact mechanism by which Sputnik enters the amoebal host cell is unknown. A likely scenario involves Sputnik attaching to the 125-nm-thick fibre coat of APMV/ACMV particles and subsequent co-phagocytosis of the two virions. Support for this hypothesis, dubbed “paired-entry mode” [93], stems from electron micrographs that display Sputnik particles within the ACMV fibre coat [20], as well as from an APMV deletion mutant called mimivirus M4, which has lost 207 kbp of its 1.2-Mbp genome [13]. These deletions affect genes for the external virion fibres, and the resulting fibreless M4 particles are no longer able to support Sputnik replication [13]. In addition to physically attaching its virion to APMV/ACMV particles, Sputnik encodes a lambda-type integrase that enables the virophage genome to integrate into the genome of an unclassified APMV isolate called Lentille virus [21]. These mechanisms are likely to increase the chances that Sputnik will remain associated with its giant helper/host virus.
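The capsomer numbers quoted above follow from standard Caspar-Klug geometry for icosahedral capsids. The sketch below reproduces that arithmetic; the function names are hypothetical, and the choice of lattice indices h = 3, k = 3 is simply one pair that yields T = 27. Note that 60T counts quasi-equivalent positions rather than protein copies: in capsids built from double jelly-roll proteins, each pseudohexameric capsomer is typically a trimer.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsomer_counts(t: int) -> dict:
    """Capsomer counts for an idealized icosahedral capsid with triangulation number t."""
    if t < 1:
        raise ValueError("T must be at least 1")
    return {
        "pentamers": 12,             # one capsomer per icosahedral vertex
        "hexamers": 10 * (t - 1),    # (pseudo)hexameric capsomers
        "capsomers": 10 * t + 2,     # pentamers + hexamers
        "positions": 60 * t,         # quasi-equivalent positions
    }


# Sputnik: T = 27 (h = 3, k = 3 is an assumed index pair giving this T).
print(capsomer_counts(triangulation_number(3, 3)))
# {'pentamers': 12, 'hexamers': 260, 'capsomers': 272, 'positions': 1620}

# Plant satellite viruses: T = 1, i.e., 60 positions filled by 60 CP copies.
print(capsomer_counts(triangulation_number(1, 0)))
# {'pentamers': 12, 'hexamers': 0, 'capsomers': 12, 'positions': 60}
```

For T = 27, the calculation gives 260 hexameric plus 12 pentameric capsomers, matching the Sputnik reconstruction, whereas T = 1 gives the 60 CP copies described for the plant satellite viruses.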

Fig. 3

The virions of mavirus and Sputnik. A. Cryo-EM reconstruction of the Sputnik virion (adapted from reference 110, Electron Microscopy Data Bank ID 5495). B. Negative stain electron micrograph of mavirus particles (U. Mersdorf, Max Planck Institute for Medical Research)

Two additional isolates of Sputnik (Sputnik 2 and 3) [32, 59] and the Sputnik-related virophage Zamilon [33] have been described and genetically analyzed. The genomes of the three Sputnik isolates differ from each other at fewer than 10 nucleotide sites; the 17,276-bp-long Zamilon genome, on the other hand, is only 76 % identical to that of Sputnik [33]. Zamilon was isolated on A. polyphaga together with its unclassified giant helper/host virus Mont1 mimivirus from Tunisian soil [12]. Another virophage, Rio Negro virophage, was also isolated from the Acanthamoeba system and is associated with an unclassified giant virus called Samba virus. Rio Negro virophage seems to be closely related to Sputnik, although its genome has not been sequenced yet [17].

Mavirus

The mavirus virophage, with its 19,063-bp circular dsDNA genome, depends for its propagation on Cafeteria roenbergensis virus (CroV; species Cafeteria roenbergensis virus, genus Cafeteriavirus, family Mimiviridae), a giant virus with a ≈700-kbp dsDNA genome that infects a marine heterotrophic nanoflagellate (Cafeteria roenbergensis) [26, 27]. Like most other virophages, mavirus encodes two capsid proteins (MCP and mCP) that form an icosahedral capsid with a diameter of ≈75 nm (Fig. 3B). The cell entry mechanism of mavirus differs from that of Sputnik in that mavirus is endocytosed independently of CroV (“independent entry mode” [93]), most likely via the clathrin-mediated pathway [27]. Once inside the host cell, mavirus targets the cytoplasmic virion factory of its associated giant virus CroV, inhibits the production of new CroV particles, and increases host cell survival [27].

Mavirus shares many features with the large, virus-like transposons of the Maverick/Polinton (MP) superfamily, which are widespread in eukaryotes [42, 57, 77]. Both types of elements encode seven homologous proteins involved in virion morphogenesis (MCP, mCP, FtsK-HerA-type genome packaging ATPase and a cysteine protease homologous to adenoviral maturation proteases), genome replication (protein-primed family B DNA polymerase and superfamily 3 helicase), and integration (retrovirus-like integrase, which belongs to a broad superfamily of DDE transposases) [27, 56]. Furthermore, the mavirus genome contains long inverted repeats that resemble those found at the termini of MP transposons. Based on these similarities, it has been proposed that mavirus is evolutionarily related to MP transposons, even though the directionality of the evolutionary processes that led to these two forms of mobile DNA elements is a matter of ongoing debate [27, 105]. Whereas a virophage origin was initially proposed for the emergence of MP transposons, based on biological properties of mavirus (such as the potential for genome integration and its positive effect on host cell populations [27]), phylogenetic analysis based on DNA polymerase sequences rather suggests that MP transposons gave rise to mavirus (and other virophages), as well as several other groups of eukaryotic dsDNA viruses [57, 105]. Owing to the high degree of divergence and gene shuffling among virophages and other forms of mobile genetic elements, the reconstruction of ancient evolutionary events that led to the extant virophages remains challenging.

Other virophages

Several virophage genome sequences have been partially or fully assembled from metagenomic datasets, notably from two Antarctic lakes, Yellowstone Lake, and sheep rumen [102, 106, 110, 111]. The viral and cellular hosts for these virophages are unknown, but in the case of Organic Lake virophage (OLV) and some Yellowstone Lake virophages (YSLVs), it can be assumed that they replicate in photosynthetic protists and are associated with algae-infecting viruses related to members of the family Mimiviridae [102, 104, 111]. The genomes of these metagenomic virophages are up to 30 kbp long and contain up to 34 ORFs. A special case is represented by the Phaeocystis globosa virus-associated virophage (PgVV) [85], which appears to have lost most of its structural genes except for a distant version of the MCP [56]. No encapsidated forms of PgVV have been found so far, and it has been proposed that this element replicates as a linear plasmid or as a “provirophage” integrated in the genome of its host virus PgV [85].

Proposed classification of virophages

In terms of genome and particle size, virophages are at least as complex as members of several families of bona fide viruses with isometric capsids and dsDNA genomes, including Polyoma-, Papilloma-, Cortico-, and Tectiviridae, or podoviruses such as Bacillus virus phi29. The largest virophage genomes assembled from metagenomic datasets are comparable in size to adenoviral genomes. Thus, the only argument in favour of classifying virophages as satellite viruses is the fact that Sputnik and mavirus cannot propagate by themselves without a co-infecting giant virus. However, given the strong similarity of transcriptional regulatory motifs (promoter and transcription termination signals) found in virophage and giant virus genomes, it can be assumed that virophages use the transcriptional machinery encoded by their associated giant virus for mRNA synthesis instead of relying on the host cell transcription system [18, 27, 28]. Virophages therefore use the cytoplasmic giant virus factory in the same manner as other dsDNA viruses of comparable size would use the host cell nucleus.

A conserved set of six proteins or domains found in all canonical virophages (excluding PgVV) comprises the morphogenetic module (MCP, mCP, FtsK-HerA family DNA-packaging ATPase, and cysteine protease), the primase-superfamily 3 helicase (S3H), and a zinc-ribbon domain protein (Fig. 4A) [105]. The existence of these core genes strongly suggests a monophyletic origin for virophages and justifies the creation of a family-rank taxon within the ICTV framework.

Fig. 4

Relationships between virophages. A. Comparative genomic maps of the virophages OLV, Sputnik, and mavirus. ORFs are indicated with arrows. Conserved virophage genes are shown in color: superfamily 3 helicase, pink; zinc-ribbon domain, yellow; FtsK-HerA family ATPase, red; Cys protease, green; minor capsid protein, light blue; major capsid protein, indigo. The scale bar shows distances in kilobase pairs. B. Pairwise identities between major capsid proteins of virophages, calculated using SIAS (http://imed.med.ucm.es/Tools/sias.html). The SIAS calculation is based on a PROMALS alignment [76]. Identity values among the MCPs of Sputnik-like viruses and mavirus-like viruses are highlighted in blue and red, respectively. C. Phylogenetic analysis of the MCPs of virophages. Branches are coloured according to the proposed classification of virophages: Sputnikvirus, blue; Mavirus, red. The multiple sequence alignment for phylogenetic analysis was constructed using PROMALS3D [76] with the Sputnik Cryo-EM structure (protein data bank ID 3j26) as a 3D structure template. Columns containing gaps were removed from the alignment. Maximum-likelihood phylogenetic analysis was carried out using PhyML 3.1 [35] with the Whelan and Goldman (WAG) model of amino acid substitutions, including a gamma law with four substitution rate categories. The tree is unrooted due to a lack of identifiable homologs outside of this group of viruses that could be used as an outgroup. Numbers at the branch points represent SH (Shimodaira–Hasegawa)-like local support values. The scale bar represents the number of substitutions per site. All taxa are indicated with the corresponding GenBank identifiers, or in the case of rumen virophages, with the Shotgun Assembly Sequence identifier. Abbreviations: ALM, Ace Lake mavirus; OLV, Organic Lake virophage; RVP, rumen virophage; YSLV, Yellowstone Lake virophage
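The pairwise identity values in panel B are derived from a multiple sequence alignment. The following sketch shows one common way to compute such values from aligned sequences. It is not the SIAS implementation: the function names are illustrative, and the choice of denominator (columns in which neither sequence has a gap) is an assumption, since identity tools differ on this point.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: columns with a gap ('-') in either sequence are
    skipped, so the denominator is the number of ungapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment: dict) -> dict:
    """All-against-all identities for a {name: aligned_sequence} mapping."""
    names = list(alignment)
    return {
        (x, y): percent_identity(alignment[x], alignment[y])
        for x in names
        for y in names
    }


# Toy aligned fragments, not real MCP sequences.
toy = {"seqA": "MKT-AL", "seqB": "MRTQAL"}
matrix = identity_matrix(toy)
print(matrix[("seqA", "seqB")])
```

Because the result depends on how gapped columns are treated, identity values from different tools are comparable only when the same denominator is used.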

We propose to create the family Lavidaviridae for Sputnik, Zamilon, mavirus, and other virophages yet to be isolated. “Lavida-” stands for large virus-dependent or -associated virus and refers to the property of Sputnik, mavirus, and other virophages of depending on or associating with large dsDNA viruses. The demarcation criteria for membership in the proposed family Lavidaviridae are fulfilled

  1) if the virus encodes at least some of the morphogenetic genes that are conserved in virophages (MCP, mCP, ATPase, PRO) and can be used for phylogenetic analysis to demonstrate genetic similarity to other virophages and

  2) if this virus is dependent on, or associated with, a large dsDNA virus related to the so-called nucleo-cytoplasmic large DNA viruses (NCLDVs; [39]).
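The two demarcation criteria can be read as a simple conjunction. The following sketch expresses them as a check on an annotated candidate. The class, field names, and the `min_genes` threshold are hypothetical (the text says only "at least some" of the morphogenetic genes), and the sketch does not model the requirement that the genes be suitable for phylogenetic analysis.

```python
from dataclasses import dataclass, field

# Conserved morphogenetic genes named in criterion 1.
MORPHOGENETIC_GENES = {"MCP", "mCP", "ATPase", "PRO"}


@dataclass
class Candidate:
    """Hypothetical annotation record for a candidate virophage."""
    name: str
    genes: set = field(default_factory=set)
    helper_is_ncldv_related: bool = False


def meets_demarcation_criteria(c: Candidate, min_genes: int = 1) -> bool:
    """Return True if both proposed criteria are satisfied.

    min_genes is an assumed threshold standing in for "at least some".
    """
    conserved = c.genes & MORPHOGENETIC_GENES
    criterion_1 = len(conserved) >= min_genes
    criterion_2 = c.helper_is_ncldv_related
    return criterion_1 and criterion_2


mavirus = Candidate("mavirus", {"MCP", "mCP", "ATPase", "PRO"}, True)
unrelated = Candidate("hypothetical element", {"integrase"}, True)
print(meets_demarcation_criteria(mavirus), meets_demarcation_criteria(unrelated))
```

In practice, the first criterion would be established by phylogenetic analysis of the encoded proteins rather than by gene presence alone.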

Since phylogenetic analyses based on conserved virophage proteins consistently produce separate clades for mavirus and Sputnik (exemplified by the MCP similarity matrix in Fig. 4B and the MCP tree in Fig. 4C), we suggest the creation of two genera within the proposed family Lavidaviridae: the genus Sputnikvirus for Sputnik-like virophages, and the genus Mavirus for mavirus-like virophages. In addition to phylogenetic placement, Sputnik and mavirus differ with respect to their genetic similarity to the MP DNA transposons. Whereas mavirus encodes a protein-primed family B DNA polymerase and a retrovirus-type rve-family integrase, both of which are conserved in MP transposons, these genes have been replaced in Sputnik by a family A DNA polymerase (termed TV-Pol) fused to an S3H domain [40] and a lambda-type integrase found in bacterial and archaeal (pro-)viruses, respectively. These distinguishing criteria can be applied when classifying new virophages. OLV [102] and the related YSLVs [110, 111], as well as rumen virophages [106], are likely to require classification in separate genera (see Fig. 4A and C). However, metagenomic virus sequences in the absence of replicating isolates are currently not classified by the ICTV.

At the species level, we propose the following names: the genus Sputnikvirus includes the species Mimivirus-dependent virus Sputnik, whose sole member is Sputnik virus, as well as the species Mimivirus-dependent virus Zamilon, which currently has a single member, Zamilon virus. The genus Mavirus contains the species Cafeteriavirus-dependent mavirus with the single isolate Maverick-related virus (mavirus).

The complete structure of the taxa proposed for classification of virophages is summarized in Table 2.

Concluding remarks

Viruses inhabit highly dynamic, open environments, where they face not only their cellular hosts but also various other types of mobile genetic elements that are generally not considered to be viruses, despite the overlap in evolutionary histories [49]. Indeed, virus evolution is a perpetual process, and transitions between different types of mobile elements, e.g., from a plasmid to a virus to a transposon, seem to have occurred on multiple occasions during evolution, and in different directions [50]. As a result, viruses display a striking versatility in morphological and genomic complexity and content, as well as in the ways they interact with their hosts. For example, whereas most viruses are obligate cellular parasites, members of the family Polydnaviridae have become obligate symbionts of parasitic wasps [37]; whereas most viruses have an extracellular phase, viruses of fungi are almost exclusively transmitted vertically [34]; whereas most viruses form viral particles, some are never encapsidated [49, 83] or encapsidated by capsid proteins of other viruses [92]. Magnificent as it may seem, the richness of the viral world raises certain issues regarding virus classification and definition, i.e., what should be objectively considered a virus and what should not. As a case in point, until now, satellite viruses were deprived of the legitimate “virus” status and placed within an oblique category of “sub-viral agents” for which formal rules of virus classification do not apply. Here, we attempt to rectify this situation by suggesting a classification scheme for all isolated satellite viruses for which complete genome sequences are available. We propose to create two new families with seven new genera for classification of satellite viruses replicating in plant, insect, or protist hosts (Table 2). 
The proposed framework will not only increase the consistency of classification of known viruses but should also provide valuable guidelines for classifying satellite viruses that will be isolated in the future. Many such viruses certainly await discovery in the environment.