Background

The genus Begomovirus is the largest genus in the family Geminiviridae, with over 400 species recognized by the International Committee on Taxonomy of Viruses (ICTV) (Zerbini et al. 2017). Begomoviruses infect economically important crops and cause serious damage to agriculture throughout tropical and subtropical regions (Brown et al. 2015). Based on genome organization, begomoviruses are further classified into monopartite and bipartite subgroups (Hanley-Bowdoin et al. 2013). The genome of bipartite begomoviruses contains two similarly sized single-stranded DNA (ssDNA) components, designated DNA A and DNA B. The DNA A component encodes six proteins involved in viral replication, encapsidation, transmission and pathogenesis (Fondong 2013). The DNA B component encodes two proteins, which participate in cell-to-cell and systemic spread throughout the host (Lazarowitz and Beachy 1999). The genome of monopartite begomoviruses comprises a single molecule that is similar to the DNA A component of bipartite begomoviruses. For a few begomoviruses, the DNA A/DNA component also encodes a V3 (Gong et al. 2021) or C5 (Li et al. 2015) protein involved in suppression of host RNA interference. In addition, two types of ssDNA molecules, referred to as alphasatellites and betasatellites, are frequently found associated with infections by monopartite begomoviruses (Briddon et al. 2018; Yang et al. 2019). Both alphasatellites and betasatellites are approximately half the size of the begomovirus components, and depend on their helper viruses for encapsidation and movement in plants (Zhou 2013). Alphasatellites have a single open reading frame (ORF) coding for a replication initiator protein (alpha-Rep), and are capable of self-replication in host plants (Briddon et al. 2018). Betasatellites encode a multifunctional βC1 protein involved in pathogenicity, suppression of gene silencing, and repression of other plant defense responses (Li et al. 2018). Recently, a novel βV1 gene, which is associated with symptom development, was identified in a betasatellite (Hu et al. 2020).

Weeds serve as important intermediate hosts of plant viruses and may contribute to disease epidemics. Several weed species have been identified as alternative hosts of geminiviruses (Yang et al. 2008; Papayiannis et al. 2011; Fiallo-Olive et al. 2012). Malvastrum coromandelianum is a weed commonly found in Yunnan Province, China, and is a reservoir host for geminiviruses (Zhou et al. 2003). In this study, we describe the molecular characterization of two distinct begomovirus species that infect M. coromandelianum in Yunnan, China.

Results

Identification of begomoviruses in M. coromandelianum

Three M. coromandelianum leaf samples with yellow vein symptoms, Y249, Y278 and Y281, were collected in Yunnan Province. To test whether the symptoms were caused by geminivirus infection, degenerate primers PA and PB, universal for members of the genus Begomovirus, were designed to detect a conserved region within the DNA-A/DNA component of begomoviruses. An amplicon of ~500 base pairs (bp) was obtained from each sample, suggesting that these samples were infected by begomoviruses. Based on the obtained sequences, primer pairs were designed to amplify the remainder of the viral genome from each sample. The complete nucleotide sequences of Y278 and Y281 are both 2743 bp in length (accession numbers FN386459 and FN386460, respectively). These two sequences are nearly identical (99.5% similarity), indicating that they are isolates of a single begomovirus species. Comparative sequence analysis of Y278 and Y281 with other begomoviruses showed that they share the highest sequence identities (88.4% and 88.5%, respectively) with pea leaf distortion virus (PLDV). The complete genome of Y249 is 2740 bp in length (accession number FN552749). Sequence analysis showed that the Y249 genome shares the highest sequence similarity (86.5%) with malvastrum yellow mosaic virus (MaYMV). These sequence identities are below the threshold value of 91% for species demarcation within the genus Begomovirus (Fauquet and Stanley 2003; Brown et al. 2015). According to the established principles of geminivirus taxonomy and nomenclature (Fauquet et al. 2008), the virus isolates Y278 and Y281 were named malvastrum yellow vein Baoshan virus (MaYVBsV), and the virus isolate Y249 was named malvastrum yellow vein Honghe virus (MaYVHhV) (Fig. 1).
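The species assignment described here comes down to comparing a pairwise genome identity with the 91% demarcation threshold. The following Python sketch illustrates that comparison only; it is not the procedure used in this study. The function names `percent_identity` and `below_species_threshold` are hypothetical, and skipping alignment columns that contain a gap is an assumption. The numeric values are those reported in the text.

```python
SPECIES_THRESHOLD = 91.0  # % identity threshold cited in the text


def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def below_species_threshold(identity, threshold=SPECIES_THRESHOLD):
    """True if the identity falls below the species demarcation threshold."""
    return identity < threshold


# Highest genome-wide identities reported in the text
reported = {
    "Y278 vs PLDV": 88.4,
    "Y281 vs PLDV": 88.5,
    "Y249 vs MaYMV": 86.5,
}
calls = {pair: below_species_threshold(value) for pair, value in reported.items()}

for pair, is_below in calls.items():
    print(pair, "below threshold" if is_below else "at or above threshold")
```

With the reported values, all three comparisons fall below the threshold, whereas the 99.5% value between Y278 and Y281 does not, which is consistent with treating those two isolates as one species.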

Fig. 1 Schematic representations of the genome organization of MaYVBsV and MaYVHhV. Different open reading frames (ORFs) with their respective products are indicated

Genome organization of MaYVBsV and MaYVHhV

Both MaYVBsV and MaYVHhV have the canonical begomovirus genome arrangement (Fig. 1). In the virion-sense strand of the genome, ORF V1 encodes the viral coat protein (CP) (pfam 00844), which is 256 amino acids (aa) long in MaYVBsV and 257 aa long in MaYVHhV. The putative proteins encoded by ORF V2 of MaYVBsV and MaYVHhV are 112 aa (12.9 kDa) and 116 aa (13.3 kDa), respectively, and are predicted to be the movement protein-like V2 protein (pfam 01524). MaYVHhV has an additional ORF V3 located downstream of ORF V2, encoding a polypeptide with a molecular weight of 7.4 kDa. The function of ORF V3 remains unknown, although in silico analysis revealed a putative transmembrane helix at aa 13–35. In the complementary-sense strand, the C1 ORF of both MaYVBsV and MaYVHhV encodes a putative 363-aa polypeptide with a predicted molecular mass of 41.2 kDa. These proteins were identified as the replication-associated protein (Rep) (pfam 00799), containing the four conserved motifs of geminivirus Reps (Nash et al. 2011; Fondong 2013) (Fig. 2a). P-loop NTPase domains, which are involved in nucleotide binding, were identified at aa 217–264 in both proteins. ORF C2 of each virus encodes a putative protein of 150 aa, identified as the transcriptional activator protein (TrAP) (pfam 01440). Three known functional regions of C2 proteins were identified: the nuclear localization signal (NLS), the cysteine-rich zinc finger-like domain, which confers DNA-binding activity, and the C-terminal acidic region required for transactivation activity (Fig. 2b). Putative proteins encoded by ORFs C3 and C4 of MaYVBsV and MaYVHhV are 134 aa and 143 aa, and were identified as the replication enhancer protein (REn) (pfam 01407) and C4 protein (pfam 01492), respectively. The C5 ORFs potentially encode proteins of 167 aa for MaYVBsV and 208 aa for MaYVHhV. By analogy to similarly located ORFs of other members of the genus Begomovirus, the noncoding regions of both MaYVBsV and MaYVHhV are 272 nt in length, with a predicted hairpin structure containing the conserved nonanucleotide motif TAATATT/AC.
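As an illustration of how the conserved nonanucleotide could be located in a genome sequence, the Python sketch below searches a circular sequence for TAATATTAC. It reads the slash in TAATATT/AC as a position marker rather than a base, which is an interpretation not stated in the text. The function name `find_motif_circular` and the toy sequence are hypothetical; the toy sequence is not viral data.

```python
NONANUCLEOTIDE = "TAATATTAC"  # motif from the text, slash removed (assumption)


def find_motif_circular(genome, motif=NONANUCLEOTIDE):
    """Return 0-based start positions of motif in a circular sequence.

    The first len(motif) - 1 bases are appended to the end so that
    matches spanning the arbitrary linearization point are found.
    """
    genome = genome.upper()
    extended = genome + genome[: len(motif) - 1]
    return [i for i in range(len(genome)) if extended.startswith(motif, i)]


# Hypothetical toy sequence: the motif spans the end/start junction,
# so a plain linear search would miss it.
toy_genome = "TTACGGCCGCGCATAATA"

positions = find_motif_circular(toy_genome)
print(positions)
```

Because geminivirus genomes are circular, the position at which a database record is linearized is arbitrary, so a search that ignores wraparound could miss a motif at the junction.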

Fig. 2 Multiple alignments of protein sequences from MaYVBsV, MaYVHhV and other begomoviruses. a An alignment of Rep sequences of MaYVBsV, MaYVHhV and other begomoviruses. The four conserved motifs I (DNA binding), II (metal binding), GRS (Geminivirus Rep Sequence) and III (DNA cleavage and ligation) are indicated. b An alignment of C2 protein sequences of MaYVBsV, MaYVHhV and other begomoviruses. The positions of functional regions are indicated

Phylogenetic relationship of MaYVBsV and MaYVHhV with other geminiviruses

Amino acid sequence comparisons of the six viral proteins of MaYVBsV or MaYVHhV were performed with the 12 begomoviruses that have the highest genome sequence identities with these two viruses (Table 1). MaYVBsV has the highest amino acid sequence identities with hollyhock leaf curl virus (HoLCV) for CP (96.5%), malvastrum yellow mosaic virus (MaYMV) for Rep (92.0%), and okra enation leaf curl virus (OELCuV) for C4 (80.8%). The V2, C2 and C3 of MaYVBsV share the highest amino acid identities with those of MaYMV (92.0, 95.3 and 95.5%, respectively) (Table 1). On the other hand, the CP of MaYVHhV has the highest amino acid sequence identity with those of malvastrum yellow vein Yunnan virus (MaYVYNV) and pepper yellow leaf curl virus (PYLCV) (94.6%). The V2, Rep and C2 of MaYVHhV share the highest amino acid sequence identities with those of MaYMV (92.2, 92.6 and 94.6%, respectively). MaYVHhV C3 has the highest sequence similarity with C3 of malvastrum yellow vein virus (MaYVV), MaYMV and MaYVYNV (92.5%). The MaYVHhV C4 protein has the highest amino acid sequence identity with that of HoLCV (81.4%) (Table 1). Phylogenetic analyses were further performed based on the amino acid sequences of two taxonomically relevant gene products, CP and Rep, and on the full-length genome sequences of geminiviruses. The trees showed that MaYVBsV and MaYVHhV always cluster with other begomoviruses (Fig. 3).

Table 1 Amino acid sequence identities between proteins encoded by MaYVBsV or MaYVHhV and other begomoviruses
Fig. 3 Phylogenetic trees constructed based on the amino acid sequences of CP (a), Rep (b) or the complete viral DNA sequences (c) of members in the family Geminiviridae. Bootstrap values (%) for 1000 replicates are indicated. Virus names and their GenBank accession numbers are shown in Additional file 1: Table S1

Discussion

Owing to their broad distribution and rapid propagation, weeds may survive in or around crop fields during the non-cropping season, which makes them important reservoir hosts for plant viruses. In addition, mutation and recombination in viral genomes may increase when weeds are infected with multiple virus species, which may increase the transmission rate of viruses and further broaden their host range. In our study, two distinct begomoviruses, MaYVBsV and MaYVHhV, were identified in M. coromandelianum. Sequence alignment analysis of the viral genomes and gene products showed that MaYVBsV and MaYVHhV are closely related to begomoviruses collected from different plant hosts, indicating that these two viruses may be transmitted from M. coromandelianum to different types of crops and nurseries. Despite our efforts, we failed to detect a DNA-B component associated with MaYVBsV or MaYVHhV, indicating that they are monopartite begomoviruses. Further study will focus on determining the infectivity and host range of MaYVBsV and MaYVHhV.

Conclusions

Our study reports the detection and characterization of two novel putative begomovirus species infecting M. coromandelianum plants. These results will facilitate the development of strategies for managing the spread of geminiviruses.

Methods

Plant materials

Virus isolates were collected from M. coromandelianum plants displaying yellow vein symptoms in Honghe (Y249) in 2003 and in Baoshan (Y278 and Y281) in 2005, in Yunnan Province, China.

Determination of the full-length genomic sequences of viruses

Total plant DNA was extracted from leaves of naturally infected symptomatic plants as described (Zhou et al. 2001). The degenerate primer pair PA (5′-TAATATTACCKGWKGVCCSC-3′) and PB (5′-TGGACYTTRCAWGGBCCTTCACA-3′) was designed to amplify the DNA-A/DNA fragment that contains part of the intergenic region and the CP gene. Based on the determined sequences, the primer pairs Y249F (5′-GTAACTGTCCCTACTGTCCGC-3′)/Y249R (5′-TACACGGGTTGAGTAAGGACTG-3′) and Y278F (5′-TCAAAGCTTAAATAATTTTCCCACCG-3′)/Y278R (5′-TTGAGTGCGTCATCTGATTGGACCAG-3′) were designed to amplify the full-length DNA-A/DNA. The PCR products were cloned into the pMD18-T vector (TaKaRa, Tokyo, Japan) and the viral genome sequences were determined by Sanger sequencing.
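Primers PA and PB contain IUPAC ambiguity codes (K, W, V, S, Y, R, B), so each represents a pool of oligonucleotides. The Python sketch below, offered only as an illustration, counts and enumerates the sequences in each pool using the standard IUPAC nucleotide code table. The function names `degeneracy` and `expand` are hypothetical; the primer sequences are those given in the text.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

PA = "TAATATTACCKGWKGVCCSC"
PB = "TGGACYTTRCAWGGBCCTTCACA"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


for name, primer in (("PA", PA), ("PB", PB)):
    print(name, len(primer), "nt, degeneracy", degeneracy(primer))
```

Under this table, PA corresponds to 48 distinct 20-nt sequences and PB to 24 distinct 23-nt sequences.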

Sequence analysis

Sequences were assembled and analyzed with SnapGene®. Domains were analyzed using the Pfam database (http://pfam.xfam.org). Transmembrane helices were predicted using TMHMM 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). Sequence alignments were performed using MUSCLE in MEGA X. Phylogenetic analyses were performed using the maximum likelihood method in MEGA X. The GenBank accession numbers of sequences analyzed in the study are listed in Additional file 1: Table S1.