Genome sequence of pineapple secovirus B, a second sadwavirus reported infecting Ananas comosus

The complete genome sequence of pineapple secovirus B (PSV-B), a new virus infecting pineapple (Ananas comosus) on the island of Oahu, Hawaii, was determined by high-throughput sequencing (HTS). The genome comprises two RNAs that are 5,956 and 3,808 nt long, excluding the 3’-end poly(A) tails, each coding for a single large polyprotein. The RNA1 polyprotein contains five conserved domains associated with replication, while the RNA2 polyprotein is cleaved into the movement protein and coat protein. PSV-B is representative of a new species in the subgenus Cholivirus (genus Sadwavirus; family Secoviridae), as the level of amino acid sequence identity to recognized members of this subgenus in the Pro-Pol and coat protein regions is below currently valid species demarcation thresholds.

Supplementary Information: The online version contains supplementary material available at 10.1007/s00705-022-05590-9.

carried out in 2019 on the island of Oahu and showed the presence of the virus in 6 of 12 plants with symptoms of reddening and wilting of leaves associated with mealybug wilt of pineapple (MWP). PSV-A was absent in the 13 asymptomatic plants tested [5]. Various combinations of ampeloviruses from the pineapple mealybug wilt-associated virus (PMWaV) species complex were also detected in the 12 symptomatic plants (Larrea-Sarmiento et al., unpublished results).
To examine the presence of further undiscovered viruses infecting A. comosus, high-throughput sequencing (HTS) was performed on the same field plants detailed in the study by Larrea-Sarmiento et al. [5]. Total RNA was extracted from the basal portions of individual pineapple leaf samples using a Spectrum™ Total RNA Kit (Sigma-Aldrich, USA), following the manufacturer's instructions. Total RNA extracted from the 12 MWP-symptomatic field samples and from the 13 healthy-looking plants was pooled into two respective composite RNA samples, which were subjected to ribodepletion to remove ribosomal RNA (rRNA). cDNA library synthesis was followed by HTS using an Illumina® NovaSeq 6000 system to obtain paired-end reads (2 × 100 bp) at the Genomics High-Throughput Sequencing Facility at the University of California, Irvine.
Data obtained from ~40 million raw reads per composite rRNA-depleted total RNA sample were curated and assembled following the methods of Green et al. [2]. The resulting contigs were annotated by BLASTx searches of the NCBI virus sequence database. Annotated contigs revealed sequence similarity to the previously characterized PMWaVs and secoviruses. Two contigs recovered from the symptomatic composite sample dataset had significant matches to PSV-A and other sadwaviruses but were sufficiently divergent to suggest that they represented the two RNA components of a new virus. Similar to PSV-A and the majority of secovirids, the potential new virus has a bipartite genome consisting of two positive-sense RNA molecules.
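The contig annotation step can be illustrated with a minimal sketch, assuming the BLASTx results were exported in the standard 12-column tabular layout (`-outfmt 6`). The contig names, subject identifiers, scores, and the e-value cutoff below are hypothetical placeholders, not values from this study, and the helper names `parse_hits` and `best_hit_per_contig` are introduced only for illustration.

```python
# Column order of the standard BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(lines):
    """Parse tab-separated BLAST hits into dictionaries with numeric scores."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(FIELDS, line.split("\t")))
        for key in ("pident", "evalue", "bitscore"):
            row[key] = float(row[key])
        hits.append(row)
    return hits


def best_hit_per_contig(hits, max_evalue=1e-5):
    """Keep the highest-bitscore hit per contig among hits passing the cutoff.

    The default cutoff is an assumption for this sketch, not the authors' setting.
    """
    best = {}
    for hit in hits:
        if hit["evalue"] > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical example rows (made-up contigs, subjects, and scores).
EXAMPLE = [
    "contig_1\tsubject_A\t45.0\t700\t380\t5\t10\t2109\t1\t700\t1e-150\t520",
    "contig_1\tsubject_B\t40.0\t650\t390\t6\t30\t1979\t5\t655\t1e-90\t330",
    "contig_2\tsubject_C\t30.0\t90\t63\t0\t1\t270\t10\t99\t0.5\t25.0",
]

if __name__ == "__main__":
    for contig, hit in best_hit_per_contig(parse_hits(EXAMPLE)).items():
        print(contig, hit["sseqid"], hit["pident"], hit["evalue"])
```

A contig whose best hit shows only moderate identity to known viruses, as with the two contigs described above, would then be a candidate for a divergent or new virus and would need confirmation by other means.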
To obtain the complete genome sequence of the virus, 5' and 3' rapid amplification of cDNA ends (RACE) was performed. Both 5' and 3' ends were obtained using a Takara SMARTer RACE 5'/3' Kit according to the manufacturer's instructions, followed by PCR with a universal anchored primer and sequence-specific primers (Supplementary Table S1). Amplicons were cloned, and five to seven clones were sequenced by the Sanger method. The complete genome comprises two RNA molecules; RNA1 is 5,956 nt long (GenBank accession no. OM777135) and RNA2 is 3,808 nt long (GenBank accession no. OM777136), each coding for a large polyprotein, referred to as P1 and P2, respectively. The name "pineapple secovirus B" (PSV-B) is proposed for this putative new virus infecting pineapple.
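As a quick consistency check on these figures, the reported RNA lengths can be compared with the coding capacity required by the polyproteins, whose lengths are reported in this paper as 1,875 aa (P1) and 1,143 aa (P2). The sketch below assumes a single open reading frame per RNA terminated by one stop codon; the function names are hypothetical, and the result is only the combined length of the untranslated regions, not their individual 5' and 3' sizes.

```python
# Reported values: genome segment length (nt, excluding poly(A)) and polyprotein length (aa).
GENOME = {
    "RNA1": (5956, 1875),
    "RNA2": (3808, 1143),
}


def orf_length_nt(aa_len, stop_codon=True):
    """Nucleotides needed to encode aa_len residues, plus an optional stop codon."""
    return 3 * aa_len + (3 if stop_codon else 0)


def noncoding_nt(genome_nt, aa_len):
    """Combined 5' + 3' untranslated length, assuming one ORF with one stop codon."""
    remaining = genome_nt - orf_length_nt(aa_len)
    if remaining < 0:
        raise ValueError("polyprotein is too long for the reported RNA length")
    return remaining


if __name__ == "__main__":
    for name, (nt, aa) in GENOME.items():
        print(name, orf_length_nt(aa), noncoding_nt(nt, aa))
    # RNA1: ORF 5628 nt, 328 nt untranslated in total
    # RNA2: ORF 3432 nt, 376 nt untranslated in total
```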
The polyprotein precursor P1 of PSV-B is 1,875 aa long and is composed of proteins involved in replication: protease cofactor (Pro-C), helicase (Hel), VPg, protease (Pro), and RNA-dependent RNA polymerase (Pol). Likewise, the polyprotein P2 of PSV-B is 1,143 aa long and is composed of a movement protein (MP) and one large coat protein (CP) (Fig. 1). Similar to other secovirids, both PSV-B RNAs are expected to possess a VPg bound at the 5' end and a poly(A) tail at the 3' end [1, 8].  Sadwavirus) are predicted to encode only one large CP [4, 5, 7, 12]. Analysis of the predicted cleavage sites identified four Q/S and five E/G dipeptides in polyprotein P1 [3]. The RNA1-encoded 3C-like protease (3CL-Pro) likely cleaves P1 at four sites, defining five domains, and P2 at one site, defining two domains (Fig. 1) [1, 7, 8].
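The relationship between candidate dipeptides, cleavage sites, and mature domains can be sketched as follows. This is a minimal illustration on a made-up toy sequence, not the prediction method used in the study: it simply lists every Q/S and E/G dipeptide and shows that cutting at n chosen sites yields n + 1 products. As the counts above indicate, not every matching dipeptide is expected to be a genuine cleavage site, so the scan yields candidates only.

```python
def find_dipeptides(protein, dipeptides=("QS", "EG")):
    """Return (index, dipeptide) for each match; index is the 0-based first residue."""
    found = []
    for i in range(len(protein) - 1):
        pair = protein[i:i + 2]
        if pair in dipeptides:
            found.append((i, pair))
    return found


def split_at(protein, cut_indices):
    """Cut between residue i and residue i + 1 for every i in cut_indices."""
    pieces = []
    start = 0
    for i in sorted(cut_indices):
        pieces.append(protein[start:i + 1])
        start = i + 1
    pieces.append(protein[start:])
    return pieces


# Hypothetical toy sequence, not derived from PSV-B.
TOY = "MAAQSKLLEGTTVQSPPR"

if __name__ == "__main__":
    candidates = find_dipeptides(TOY)
    print(candidates)
    products = split_at(TOY, [index for index, _ in candidates])
    print(products)  # three cuts give four products
```

Applied to P1, four accepted cleavage sites give the five domains described above, and one site in P2 gives the MP and CP.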
The recently characterized PSV-A was found to be closely related to Dioscorea mosaic associated virus (DMaV) and chocolate lily virus A (CLVA) [5]. In 2020, the proposed revision of the family Secoviridae classified DMaV and CLVA, previously denoted as unassigned secoviruses, as members of the subgenus Cholivirus within the genus Sadwavirus [8]. To study the taxonomic position of PSV-B and its relatedness to PSV-A and other members of the family Secoviridae, a phylogenetic analysis of the aa sequence of the Pro-Pol region was carried out by the maximum-likelihood method, using LG (Le and Gascuel) + G (discrete gamma distribution) as the best-fit model of protein evolution. This analysis suggested that PSV-B is a new Sadwavirus member that is related to, but distinct from, the previously characterized Cholivirus member PSV-A (Fig. 2). PSV-B is placed on a branch distinct from PSV-A and from a basal clade that contains DMaV and CLVA (Fig. 2). For members of the family Secoviridae, the species demarcation criteria are < 80% identity for the aa sequence of the Pro-Pol region and < 75% identity for the large and small CP together [11]. In pairwise comparisons, the Pro-Pol region of PSV-B showed 45.1% and 53.5% aa sequence identity to the homologous regions of PSV-A and CLVA, respectively. Likewise, aa sequence identity values of 23.5% and 25.4% were obtained when comparing the CP of PSV-B to those of PSV-A and CLVA, respectively. These results are consistent with the findings