Introduction

Double-stranded RNA (dsRNA) viruses represent a remarkably diverse group of biological entities, infecting organisms in all three domains of life. Nine distinct dsRNA virus families (Amalgaviridae, Birnaviridae, Chrysoviridae, Cystoviridae, Endornaviridae, Partitiviridae, Picobirnaviridae, Reoviridae and Totiviridae) are currently recognised by the International Committee on Taxonomy of Viruses (ICTV). These viruses vary in host specificity (ranging from bacteria to humans), number of genome segments (one to twelve) and virion organization (varying numbers of capsid layers with different triangulation (T) numbers). However, the majority of dsRNA viruses share certain fundamental structural and functional features. These similarities reflect the challenges viruses face in replicating their dsRNA genomes while simultaneously avoiding the dsRNA-triggered antiviral defense mechanisms of their host organisms [1]. DsRNA viruses overcome these challenges by delivering their genomes into the host cell within specialized icosahedral capsids that contain enzymatic activities for RNA metabolism. These multifunctional nanocompartments carry out replication and transcription and also protect the dsRNA genome from antiviral responses [2]. Probably because of these common functional requirements, the innermost capsids are highly conserved among most dsRNA viruses. They consist of 120 protein subunits arranged as 60 asymmetric dimers on a T=1 icosahedral lattice. Moreover, in most cases, the protein subunits in the capsid possess a similar fold, despite the lack of significant sequence similarity [3, 4]. This icosahedral protein shell encloses the segmented dsRNA genome and several copies of the viral RNA-dependent RNA polymerase (RdRP). Additional capsid shells, typically arranged on a T=13 icosahedral lattice, may reside on top of the inner core particle. These external layers facilitate interaction with the host and therefore show greater diversity between dsRNA virus families [2].
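The triangulation numbers mentioned above follow the standard Caspar-Klug relation T = h² + hk + k², and a fully occupied lattice offers 60T subunit positions. The short Python sketch below only illustrates that arithmetic; the function and constant names are hypothetical, and the choice (h, k) = (3, 1) is one index pair that yields T=13.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def lattice_positions(t_number: int) -> int:
    """Subunit positions on a fully occupied icosahedral lattice (60 * T)."""
    return 60 * t_number


# Inner capsid: T=1 lattice, with an asymmetric dimer at each of the
# 60 positions, which gives the 120 subunits stated in the text.
INNER_T = triangulation_number(1, 0)
INNER_SUBUNITS = lattice_positions(INNER_T) * 2

# Outer shell: T=13 lattice, generated here from (h, k) = (3, 1).
OUTER_T = triangulation_number(3, 1)
OUTER_POSITIONS = lattice_positions(OUTER_T)
```

The count returned by `lattice_positions` is for a completely filled lattice. The 200 P8 trimers (600 copies) reported for the phi6 nucleocapsid shell show that a real shell need not occupy every position.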

Pseudomonas phage phi6, the type member of the Cystoviridae, was isolated from Pseudomonas-infested bean straw in Nebraska, USA, in the early 1970s [5]. It was the first (and for over two decades the only) known dsRNA virus to infect bacteria. The tripartite dsRNA genome of phi6 is enclosed in an icosahedral, double-layered protein capsid, which in turn is surrounded by a membrane envelope. A lipid envelope around an icosahedral protein capsid is a unique feature among bacterial viruses, though members of the Tectiviridae and Corticoviridae have lipid membranes inside their protein capsids. Owing to extensive biochemical, genetic and structural characterization, phi6 has become one of the best-known bacterial virus models and dsRNA systems. Until now, phi6 has remained the sole representative of the Cystoviridae family (and the Cystovirus genus) recognised by the ICTV. However, additional dsRNA phages have been isolated [6, 7] and some have been characterized in more detail [8,9,10,11,12,13,14,15] (Table 1). These viruses share notable genetic and structural similarities with phage phi6. Consequently, we have proposed classifying them in the Cystoviridae family. Here we provide a short overview of phi6 and the proposed new members of the Cystoviridae family.

Table 1 Proposed members of the revised Cystoviridae family

Extended Cystoviridae family includes seven viruses

Since the late 1990s, dsRNA phages have been readily isolated from environmental samples, indicating that members of the Cystoviridae are far more widespread and abundant than previously acknowledged [6,7,8, 13,14,15]. The complete nucleotide sequences of six of these phage isolates (Pseudomonas phages phi8, phi12, phi13, phi2954, phiNN and phiYY) have been determined to date [9,10,11,12,13,14,15]. Like phi6, phages phi8, phi12, phi13 and phi2954 were isolated from bacteria-infested legumes in the USA [8, 13]. They all infect hosts belonging to the genus Pseudomonas, most commonly plant-pathogenic Pseudomonas syringae strains. However, this limited host range likely reflects the somewhat biased isolation method, in which the host strain of phi6, Pseudomonas syringae pv. phaseolicola HB10Y (or one of its mutants), was used for enrichment. Interestingly, a number of additional dsRNA virus isolates have been obtained by sampling clovers and green beans at various locations in the USA [6, 7]. These virus isolates have been partially sequenced but not otherwise characterized. Nevertheless, the high frequency of dsRNA phages in these environmental samples suggests that this virus type is a common bacterial parasite in certain terrestrial habitats [6].

The most recent dsRNA phage isolations were reported in Europe and Asia from diverse environmental sources: Pseudomonas phage phiNN was isolated together with its host bacterial strain Pseudomonas sp. B314 from a freshwater sample in Finland [14], whereas the isolation source of Pseudomonas phage phiYY was hospital sewage in China [15]. PhiYY was isolated together with Pseudomonas aeruginosa strain PAO38, and it also infects several other clinical strains of P. aeruginosa [15]. P. aeruginosa is an opportunistic human pathogen that causes serious infections in immunocompromised individuals. These recent discoveries demonstrate that dsRNA phages have adapted to varying habitats in globally distant locations.

The dsRNA phages described above share genetic and structural characteristics (overall virion morphology, genome type and genome organization) with phage phi6, the prototype virus of the Cystoviridae. These features clearly distinguish them from other viruses and demonstrate their relatedness. Consequently, they should be included in the Cystoviridae family.

Cystoviruses have one or two icosahedral protein shells surrounded by an envelope

The virion organization of the proposed members of the Cystoviridae, where it has been described, resembles that of the type species. The virions are enveloped and the tri-segmented genome is enclosed in one or two concentric, icosahedrally symmetric protein shells [16, 17]. Studies on phi6 have revealed that the innermost protein shell of the virion, also referred to as the polymerase complex (PC), is composed of the major capsid protein (MCP) P1, the RdRP P2, the packaging NTPase P4 and the minor protein P7 [18]. The structural framework of the PC consists of 60 asymmetric dimers of the MCP P1 arranged in the T=1 architecture characteristic of dsRNA viruses [19, 20].

The second protein layer, or the nucleocapsid (NC) shell, of the phi6 virion consists of 200 trimers of protein P8 on a T=13 icosahedral lattice [18,19,20]. The near-atomic structure of the phi6 NC shell has recently been solved [21]. Interestingly, phage phi8 lacks this NC shell and therefore has a single protein shell surrounding the genome [16].

The outermost layer of cystoviruses is a lipid envelope, containing host-derived phospholipids [22] and phage-encoded membrane proteins [23]. Host binding spikes, composed of protein P3, protrude from the virion surface [24]. Spike proteins differ between cystoviruses, resulting in varying host specificities [25,26,27].

Genome organizations of cystoviruses are highly similar

All the proposed members of the Cystoviridae family have a dsRNA genome, which is divided into three separate segments designated according to their size as L (large, 6.4 – 7.1 kb), M (medium, 3.6 – 4.7 kb) and S (small, 2.3 – 3.2 kb) [9,10,11,12,13,14,15, 28,29,30]. The total genome size varies from 12.7 kb (phi2954) to 15.0 kb (phi8). The GC content of the cystovirus genomes ranges from 53.4% (phi2954) to 58.8% (phiYY). Cystoviruses share limited similarity at the nucleotide sequence level (Table 2), with the exception of phi6 and phiNN, which are considerably more similar to each other (79.5%, 51.2% and 83.4% nucleotide sequence similarity for the L, M and S segments, respectively). The high genetic similarity between phi6 and phiNN is surprising, given that these two viruses were isolated from different habitats (plant debris and a freshwater sample) at globally distant locations (Nebraska, USA and Jyväskylä, Finland) at an interval of over 40 years [5, 14].
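The genome statistics quoted above (total genome size and GC content) can be derived from the three segment sequences in a few lines. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, and the toy sequences are invented placeholders, not real cystovirus sequences.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, U or T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("U") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def genome_summary(segments: dict[str, str]) -> tuple[int, float]:
    """Return (total length in nt, overall GC%) for a segmented genome."""
    joined = "".join(segments.values())
    return len(joined), gc_content(joined)


# Hypothetical toy segments for illustration only (NOT real sequences).
TOY_GENOME = {"L": "GGCCAUGC", "M": "GCAUAU", "S": "CCGGAA"}
TOTAL_LENGTH, GC_PERCENT = genome_summary(TOY_GENOME)
```

In practice the segment sequences would be read from the deposited database records; here the segments are simply concatenated, so the GC value is weighted by segment length.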

Table 2 Nucleotide sequence similarities (%) between the cystoviral genome segments. Color code: > 95% = dark grey, > 75% = medium grey, > 50% = light grey

The genome organization of cystoviruses is highly similar and they encode a comparable set of proteins [9,10,11,12,13,14,15, 28,29,30]. Genes are clustered by function on each genome segment: the L-segment encodes the proteins forming the virion core (P1, P2, P4 and P7), the M-segment contains the genes for the host recognition complex (P3 and P6), and the S-segment encodes the NC shell protein (P8; as an exception, phi8 P8 is a membrane protein), the major membrane protein (P9), the putative membrane morphogenetic factor P12 and the protein needed for host cell lysis (P5). In each segment, the coding region is flanked by non-coding regions, which are essential for genome packaging and replication. Comparative genome analysis reveals an intergenome rearrangement in Pseudomonas phage phi8: gene 7 is located at the 3′ terminus of the L-segment in phage phi8, but at the 5′ terminus of the L-segment in all other proposed cystoviruses (Fig. 1). Otherwise, the order of the genes in the genome segments appears to be the same in all proposed members of the Cystoviridae.

Fig. 1

Genome maps of the segments S (a), M (b) and L (c) of the proposed members of the Cystoviridae. Open reading frames (ORFs) of the predicted positive strands are depicted and amino acid sequence similarities (%) between corresponding ORFs are indicated. Comparisons were conducted with EMBOSS Needle Pairwise Sequence Alignment [51]. The order of the genome segments follows the clustering in the phylogenetic trees presented in Fig. 2

A moderate level of amino acid sequence similarity is seen among the proposed cystoviruses between the major structural proteins and essential enzymes, which are encoded by the L- and S-segments (Fig. 1). For instance, when comparing phages phi6 and phiNN, the protein products of the S- and L-segments are almost identical (89 – 99% amino acid sequence similarity; Fig. 1a,c), whereas more diversity is seen between the corresponding proteins of the M-segments (41 – 70% amino acid sequence similarity; Fig. 1b; [14]). This higher genetic flexibility in the M-segment, which encodes the host recognition complex, may reflect the evolutionary pressure to broaden the host range and adapt to new habitats [14]. Interestingly though, 36 – 85% amino acid sequence similarity can be detected between the corresponding putative host recognition proteins of phages phiYY, phi12 and phi13, despite their taxonomically distant host bacteria (phiYY infects the human pathogen P. aeruginosa, whereas phi12 and phi13 infect the plant pathogen P. syringae).
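Percentages such as those above come from global pairwise alignments. The Python sketch below is a simplified stand-in and not the EMBOSS Needle implementation: it runs a plain Needleman-Wunsch alignment with a linear gap penalty and reports percent identity over the alignment columns, whereas the reported values are similarities obtained with a substitution matrix. The function name and the scoring parameters are assumptions chosen for illustration.

```python
def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a Needleman-Wunsch global alignment of a and b."""
    n, m = len(a), len(b)
    if n == 0 and m == 0:
        raise ValueError("both sequences are empty")

    def pair_score(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair_score(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting identical and total columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            identical += int(a[i - 1] == b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * identical / columns
```

For example, `global_identity("ACGU", "ACCU")` aligns four columns with three identical positions. When several optimal alignments exist, the traceback prefers the diagonal move, so the reported identity corresponds to one of them.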

We conducted phylogenetic analyses separately for each of the cystoviral genome segments based on the nucleotide sequences (Fig. 2). All the deduced phylogenetic trees show close relatedness between phages phi6 and phiNN. Otherwise, the clustering was different in each of the three trees, suggesting reassortment of genome segments between the proposed species during their evolution. The exchange of genome segments between some of the cystoviral isolates has also been experimentally demonstrated [8, 10, 13] and can be detected in natural dsRNA phage isolates [6, 7].

Fig. 2

Phylogenetic trees showing relationships between the proposed members of the Cystoviridae based on nucleotide sequence comparisons of the segments S (a), M (b) and L (c). The trees were constructed with the maximum likelihood method using MEGA 7.0 [52]. Robustness was evaluated statistically by bootstrap analysis with 1000 replicates. Bootstrap values greater than 50% are indicated at the branch points. The trees were visualized using FigTree v1.3.1

Cystoviruses have lytic life cycles

All the proposed members of the Cystoviridae are virulent viruses, which induce lysis of their bacterial host cells at the end of the viral reproduction cycle. However, it has been shown that phage phi6, the type member of the Cystoviridae, may also establish a carrier state in the host bacterium [31,32,33]. The different stages of the infection cycle have been described comprehensively for phi6. Upon infection, the phi6 virion adsorbs to a type IV pilus on the bacterial cell surface [34, 35]. As the pilus retracts, the virion is brought into contact with the bacterial outer membrane. Phages phiNN and phi2954 also use this type IV pilus-mediated infection strategy [13, 14], whereas phages phi8, phi12 and phi13 bind the host cell directly through rough LPS on the cell surface [8]. In each case, the P3 protein complex is required for the initial binding to the host bacterium [9, 10, 12,13,14, 29, 34]. It has been suggested that the P3 protein complex consists of a single polypeptide or its multimer in phages phi6, phi2954 and phiNN [13, 14, 29], whereas in phages phi8, phi12, phi13 and phiYY the P3 complex is heteromeric, containing two or three different polypeptides (P3a, P3b, P3c) [9, 10, 12, 15]. The P3 protein of phi6 is anchored to the viral membrane via the phage membrane protein P6. The P6 protein mediates fusion between the viral envelope and the bacterial outer membrane, ultimately releasing the NC into the periplasmic space [36].

The removal of the viral membrane from the NC releases the lytic enzyme P5, which then digests the host peptidoglycan layer [37, 38]. The rupturing of the peptidoglycan layer enables the NC to reach the cytoplasmic membrane. The NC penetrates the cytoplasmic membrane using an endocytic-like mechanism [39]. The NC outer shell (formed by P8) dissociates in the cytoplasm, revealing the PC [40, 41]. This activates the virion-associated RdRP, which then launches viral transcription within the PC [42]. Semi-conservative transcription produces full-length, polycistronic mRNA molecules of the genome segments [43]. In the first phase of the infection, approximately equal amounts of transcripts are produced from each genome segment [44, 45]. However, only the transcripts of the L-segment are efficiently translated, which leads to the accumulation of PC proteins and, consequently, to the formation of empty PCs. The packaging NTPase P4 translocates one copy of each type of genome segment transcript into the newly synthesized empty PCs [46, 47]. The transcripts are packaged, based on the 5′ terminal packaging signals, in the order S, M and L. Packaging triggers negative-strand synthesis within the PC, ultimately resulting in mature, double-stranded forms of all three genome segments [42]. After replication, plus-strand synthesis is switched on again. During this late phase of infection, transcription of the S- and M-segments predominates [44, 45]. This leads to the production of the proteins needed for virion assembly. The NC shell assembles around the PC [40, 48], after which the viral membrane, derived from the host cytoplasmic membrane, encloses the NC [49]. Finally, P3 spikes are attached to the virion surface, yielding the mature virion. The lytic enzyme P5 and the membrane protein P10 mediate lysis of the host cell, ultimately releasing the newly synthesized virions into the environment [37, 38].

Taxonomic structure of the cystoviruses

Although the identified dsRNA phages commonly share a relatively low degree of nucleotide sequence identity (< 50%, except for phi6 and phiNN; Table 2), their overall virion structures (one or two icosahedral capsids enclosed by a lipid envelope) and genome characteristics (genome type and size, GC content, genome organization and gene synteny) are strikingly similar. They undoubtedly belong to the Cystoviridae family. We have proposed 95% nucleotide sequence identity as the criterion for demarcation of species in the Cystoviridae family. This initial criterion may be adjusted when new cystovirus isolates are described. Under the current criterion, the members of each proposed species should differ from those of other species by more than 5% at the nucleotide sequence level. Consequently, Pseudomonas phages phi8, phi12, phi13, phi2954, phiNN and phiYY should be included in the Cystoviridae family as distinct species (Table 1).
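The demarcation rule can be expressed as a simple threshold test. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the 95% criterion is applied to each genome segment separately and that all segments must pass, which the text does not specify. The example values are the phi6 versus phiNN nucleotide similarities quoted earlier in the text.

```python
SPECIES_IDENTITY_THRESHOLD = 95.0  # percent; the proposed demarcation criterion


def same_species(segment_identities: dict[str, float],
                 threshold: float = SPECIES_IDENTITY_THRESHOLD) -> bool:
    """True if every compared segment reaches the identity threshold.

    Assumption: the criterion is checked per segment and all segments
    must pass; how segments are combined is not stated in the text.
    """
    if not segment_identities:
        raise ValueError("no segment identities supplied")
    return all(value >= threshold for value in segment_identities.values())


# Values reported in the text for phi6 versus phiNN (L, M and S segments).
PHI6_VS_PHINN = {"L": 79.5, "M": 51.2, "S": 83.4}
PHI6_PHINN_SAME_SPECIES = same_species(PHI6_VS_PHINN)
```

Under this reading, even the most closely related pair, phi6 and phiNN, falls well below the threshold on every segment and is therefore assigned to separate species.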

The phylogenetic analyses indicate a close relationship between the type member of the genus Cystovirus (Pseudomonas virus phi6) and one of the proposed new species, Pseudomonas virus phiNN, whereas the other proposed species are more distantly related (Fig. 2). Furthermore, the three genome segments of these isolates apparently have distinct evolutionary histories (owing to frequent genome segment reassortment). These viruses clearly belong to the Cystoviridae family, but because of the low number of isolates it is difficult to resolve the taxonomic structure within the family. Therefore, we propose that all the new species belong to the same genus, Cystovirus.