Introduction

RNA silencing, a gene regulatory mechanism that uses small RNAs for sequence-specific inhibition of gene expression, protects eukaryotic genomes against aberrant endogenous or exogenous RNA molecules [1, 7, 30, 41]. RNA silencing pathways are triggered by double-stranded RNAs (dsRNAs) or single-stranded RNAs (ssRNAs) with foldback structures that are processed into small interfering RNAs (siRNAs) of 21–24 nucleotides (nt) by RNase III-type DICER enzymes [1, 2, 25]. These siRNAs are recruited into the RNA-induced silencing complex (RISC) by proteins of the Argonaute (AGO) family to facilitate the cleavage of target RNAs through various base-pairing mechanisms [6, 36].

In higher plants, RNA silencing can serve as an adaptive, antiviral defence system, which is transmitted systemically in response to localised virus challenge [37]. Plant viruses can activate the RNA silencing machinery and can also be targets of RNA silencing. In infected cells, high levels of virus-derived small interfering RNAs (vsiRNAs) can be processed from viral dsRNA replicative intermediates, from local self-complementary regions of the viral genome, or from dsRNA resulting from the action of RNA-dependent RNA polymerases (RDRs) on viral RNA templates [8, 11, 12, 27, 32, 38]. These vsiRNAs are key elements in guiding auto-silencing of viral RNA as part of an antiviral self-defence response in plants [8]. Multiple AGO proteins, such as AGO1, AGO2, AGO4, AGO5, AGO7, AGO10 and AGO18, are involved in this process [10]. Loading of siRNAs onto a particular AGO complex is preferentially, but not exclusively, dictated by their 5′-terminal nucleotides [5, 10, 26]. Moreover, compelling evidence indicates that the biogenesis of vsiRNAs of different size classes involves the same Dicer-like (DCL)-dependent pathways responsible for the formation of endogenous siRNAs [3, 4, 39]. Thus, vsiRNAs share some features with host siRNAs and can mediate RNA silencing. It is worth noting that some of these vsiRNAs can participate in degrading complementary cellular transcripts to create cellular conditions suitable for viral infection [33–35].

Bamboo mosaic virus (BaMV) has been investigated intensively and has thus become one of the most important models for studying plant–virus interactions [24]. This virus has a single-stranded, positive-sense RNA genome of 6,400 nt comprising five open reading frames (ORFs) flanked by 5′- and 3′-untranslated regions (UTRs) of 94 and 142 nt, respectively [21]. ORF1 encodes a 155-kDa replication-related protein with three functional domains: an N-terminal mRNA capping enzyme, a central RNA helicase and a C-terminal RDR [13, 16, 17]. ORF2–4, which overlap and are referred to as the ‘triple gene block’, are required for viral movement [24]. The 25-kDa coat protein encoded by ORF5 is associated with virion encapsidation, replication and both cell-to-cell and long-distance viral movement [14]. The satellite RNA (satRNA) of BaMV (satBaMV), the only example of a satRNA associated with a potexvirus, depends entirely on BaMV for replication, assembly and movement [20]. P20, which is encoded by satBaMV, is not required for satBaMV replication or cell-to-cell movement; however, it does facilitate long-distance movement of satBaMV in Nicotiana benthamiana co-infected with BaMV [22, 28].

In recent years, deep sequencing of vsiRNAs in different host–virus systems, along with functional characterisation, has provided insights into the origin and composition of vsiRNAs and their potential role in virus–host interactions [31]. The composition of BaMV and satBaMV-derived siRNAs in infected N. benthamiana and Arabidopsis thaliana has been investigated [19]. However, few studies have focused on vsiRNA profiles from bamboo, which is the natural host of BaMV; such studies are indispensable for the understanding of virus-plant interactions and, most importantly, for developing sustainable methods of controlling viral infections. Here, we report on vsiRNAs of new divergent BaMV isolates (Ba-vsiRNAs) and their associated satBaMV in a cultivated bamboo species (Dendrocalamus latiflorus).

Bamboo leaves of a single plant with mosaic symptoms were collected in the Fuzhou National Forest Park, Fujian, China, once a month from July to October 2014. Total RNA was extracted with an RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. The complete genomes of BaMV and satBaMV in infected D. latiflorus were sequenced as described previously [23]. Small RNA isolation and library construction were performed essentially as described [19]. Deep sequencing was performed on the Illumina Solexa platform following the manufacturer’s protocol. The 5′ and 3′ adapter sequences were trimmed from the Solexa reads, and vsiRNAs and satRNA-derived siRNAs (satsiRNAs) of 18 to 28 nt were identified by a BLAST search against the complete genomic sequences of the BaMV-MAZSL1 isolate and the satBaMV-MAZSL1 isolate (accession numbers: KU870664 and KU870665, respectively) identified in this study. Only sequences with no more than two mismatched positions were analysed further. Library characterisation and determination of the BaMV and satBaMV siRNA profiles were performed using in-house scripts. Further statistical analyses and summaries were conducted using Microsoft Excel 2010. Potential targets of the Ba-vsiRNAs and satsiRNAs were identified with psRNATarget (http://plantgrn.noble.org/psRNATarget/), using the default parameters [40]. Because the complete genome of D. latiflorus is not yet well characterised, the cds_DNA of Phyllostachys heterocycla (Moso Bamboo) was used as the pool for putative target prediction.
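The read-selection criteria described above (18 to 28 nt, no more than two mismatched positions, either strand) can be illustrated with a minimal Python sketch. This is not the in-house script or the BLAST search used in the study: it substitutes a naive ungapped sliding comparison for BLAST, and the function names (`normalise`, `reverse_complement`, `best_hit`, `assign_read`) are hypothetical.

```python
# Illustrative only: an ungapped, brute-force stand-in for the BLAST-based
# selection of 18-28 nt reads with at most two mismatches.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalise(seq: str) -> str:
    """Upper-case a sequence and write it in the DNA alphabet (U -> T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def best_hit(read: str, reference: str, max_mismatches: int = 2):
    """Return (start, mismatches) of the best ungapped placement, or None."""
    best = None
    for start in range(len(reference) - len(read) + 1):
        mismatches = 0
        for a, b in zip(read, reference[start:start + len(read)]):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:  # loop finished without exceeding the mismatch limit
            if best is None or mismatches < best[1]:
                best = (start, mismatches)
    return best


def assign_read(read, reference, min_len=18, max_len=28, max_mismatches=2):
    """Return (strand, start, mismatches) for an accepted read, else None.

    `start` is the 0-based leftmost position on the positive strand.
    """
    read = normalise(read)
    reference = normalise(reference)
    if not (min_len <= len(read) <= max_len):
        return None
    candidates = []
    plus = best_hit(read, reference, max_mismatches)
    if plus is not None:
        candidates.append(("+", plus[0], plus[1]))
    minus = best_hit(reverse_complement(read), reference, max_mismatches)
    if minus is not None:
        candidates.append(("-", minus[0], minus[1]))
    if not candidates:
        return None
    # Fewest mismatches wins; ties go to the positive strand (listed first).
    return min(candidates, key=lambda c: c[2])
```

For genome-scale data a dedicated aligner is far more efficient; the sketch is only meant to make the acceptance rule explicit.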

To date, only eight complete genomes of wild BaMV isolates have been deposited in GenBank. The genome of BaMV-MAZSL1 (6,350 nt) has the same genomic structure as that of the other isolates, except for five amino acid deletions close to the N-terminus of the coat protein. BaMV-MAZSL1 and satBaMV-MAZSL1 share at most 83 and 94 % overall nucleotide sequence identity, respectively, with the BaMV and satBaMV isolates available in GenBank. Detailed genomic information about the isolates is presented in Table S1. Phylogenetic analysis placed nine complete BaMV genomic sequences into two clusters; however, BaMV-MAZSL1 was clustered into a new phylogenetic sub-lineage, which is closer to isolates from Taiwan than to those from Fuzhou (Fig. S1A). A similar result was obtained for satBaMV-MAZSL1, which was also grouped into a new sub-cluster despite the large number of satBaMV isolates analysed in this study (Fig. S1B).

For siRNA analysis, a total of 21,006,558 reads with lengths between 18 and 28 nt (after adapter trimming) were searched against the corresponding viral genomes. As a result, 3,190,981 and 727,234 reads, accounting for 15.2 and 3.5 % of the 18- to 28-nt reads, were identified as BaMV-MAZSL- and satBaMV-MAZSL-specific siRNAs, respectively, representing 323,031 and 67,856 unique reads (Fig. 1). For both vsiRNAs and satsiRNAs, the 21-nt class was clearly the most dominant, followed by the 22-nt class, together accounting for 81.7 and 74.6 % of the total selected reads, respectively (Fig. 1). These results suggest that the bamboo homologues of DCL4 and DCL2 may be the predominant DCL ribonucleases involved in vsiRNA and satsiRNA biogenesis and that the 21- and 22-nt siRNAs are the predominant antiviral silencing components, in accordance with many studies of various plant viruses [9, 18, 29, 40]. By contrast, we detected far fewer 23- and 24-nt vsiRNAs and satsiRNAs, suggesting that DCL3 plays only a minor role in the biogenesis of BaMV- and satBaMV-specific siRNAs in bamboo.
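The proportions quoted above follow directly from the read counts. As a rough sketch (the helper names `percent_of` and `size_class_percentages` are hypothetical, not taken from the study's scripts), the same arithmetic and a size-class tally can be written in Python as follows:

```python
from collections import Counter


def percent_of(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def size_class_percentages(read_lengths):
    """Map each read length (nt) to its percentage of all reads supplied."""
    counts = Counter(read_lengths)
    total = sum(counts.values())
    return {length: percent_of(n, total) for length, n in sorted(counts.items())}


# Read counts reported in the text.
TOTAL_18_28_NT = 21_006_558
BAMV_READS = 3_190_981
SATBAMV_READS = 727_234

bamv_pct = round(percent_of(BAMV_READS, TOTAL_18_28_NT), 1)        # 15.2
satbamv_pct = round(percent_of(SATBAMV_READS, TOTAL_18_28_NT), 1)  # 3.5
```

Passing the lengths of all redundant virus-matching reads to `size_class_percentages` would give the per-class shares from which the combined 21- and 22-nt fraction is obtained.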

Fig. 1

Number of total siRNAs, BaMV-derived siRNAs and satBaMV-derived siRNAs of 18–28 nt in size in libraries prepared from BaMV-infected Dendrocalamus latiflorus

To investigate the frequency distribution of vsiRNAs and satsiRNAs in the BaMV- and satBaMV-MAZSL genomes, respectively, single-base resolution maps of all redundant BaMV- and satBaMV-derived siRNAs along the genomes were constructed using Bowtie [15]. The results showed that the most abundant Ba-vsiRNAs were mainly located within both the positive and negative strands of the capping enzyme domain of the replicase and the 5′ UTR (Fig. 2A, B). A similar pattern was found in A. thaliana, whereas very different results were obtained in N. benthamiana, with the CP and 3′ UTR being the major sources of Ba-vsiRNAs [19]. The region with the highest abundance of satsiRNAs was in the positive strand of the P20-encoding region, and far fewer satsiRNAs mapped to the negative strand (Fig. 2C, D). Moreover, a total of 18 hotspots (V1–V13 on BaMV and S1–S5 on satBaMV) of virus-specific siRNAs, with more than 10,000 reads, were identified (Fig. 2B, D). The mapping patterns of vsiRNAs and satsiRNAs of different size classes (21–24 nt) were examined, revealing that 21-nt sequences, followed by 22-nt sequences, predominated within these hotspots (Fig. S2A, B). In contrast to the negative-strand dominance of Ba-vsiRNAs and satsiRNAs found in N. benthamiana in a previous study [19], there were slightly more vsiRNAs in the positive strand (52 %) than in the negative strand (48 %) in D. latiflorus (Fig. 2B). Moreover, there were substantially more satsiRNAs in the positive strand (88.9 %) than in the negative strand (11.1 %) (Fig. 2D). Our results suggest that the high or low abundance of antisense vsiRNAs relative to sense vsiRNAs is not a specific feature of a plant virus. Instead, the mechanism responsible for strand polarity might depend on other factors, possibly related to the specific virus, the host or the environment.
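A single-base resolution map, the strand shares and a hotspot call of the kind described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's pipeline: the `Hit` record and the function names are hypothetical, and a hotspot is assumed here to be a contiguous run of positions whose per-base depth exceeds 10,000 reads, which may differ from the definition actually applied.

```python
from collections import namedtuple

# One aligned siRNA species: strand ("+" or "-"), 0-based leftmost start on
# the positive strand, length in nt, and redundant read count (hypothetical).
Hit = namedtuple("Hit", "strand start length count")


def coverage_map(hits, genome_length):
    """Per-position read depth for each strand."""
    cov = {"+": [0] * genome_length, "-": [0] * genome_length}
    for h in hits:
        for pos in range(h.start, min(h.start + h.length, genome_length)):
            cov[h.strand][pos] += h.count
    return cov


def strand_share(hits):
    """Percentage of redundant reads assigned to each strand."""
    totals = {"+": 0, "-": 0}
    for h in hits:
        totals[h.strand] += h.count
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {strand: 100.0 * n / total for strand, n in totals.items()}


def hotspots(track, threshold=10_000):
    """Contiguous half-open intervals (start, end) with depth > threshold."""
    regions, start = [], None
    for pos, depth in enumerate(track):
        if depth > threshold and start is None:
            start = pos
        elif depth <= threshold and start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:
        regions.append((start, len(track)))
    return regions
```

Running `hotspots` separately on `cov["+"]` and `cov["-"]` keeps sense and antisense peaks apart, mirroring the way the two strands are plotted above and below the axis in the distribution maps.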

Fig. 2

The distribution of siRNAs on the BaMV (A and B) and satBaMV (C and D) genomes of Dendrocalamus latiflorus. The siRNAs are shown in orange above (positive strand) and in blue below (negative strand) the horizontal line. Hotspots accumulating vsiRNAs and satsiRNAs at precise positions are indicated by V1–13 and S1–5 (>10,000 reads), respectively. The X axis represents the length of the genome, and the Y axis represents the number of siRNAs

Previous studies have indicated that loading of siRNAs onto a particular AGO complex is preferentially dictated by their 5′-terminal nucleotides. The bioinformatics data revealed that sense Ba-vsiRNAs with an adenine (A) at their 5′ termini were the most abundant, accounting for 35.92 % of all sense vsiRNAs, followed by those with cytosine (C), uracil (U) and guanine (G), with 32.43, 19.24 and 12.41 %, respectively (Fig. 3A; Fig. S3). Likewise, antisense Ba-vsiRNAs with an A residue at their 5′ termini were the most abundant (37.39 % of all antisense vsiRNAs) (Fig. 3A; Fig. S3). However, A accounts for only 30.31 and 18.55 % of the nucleotides in the genomic and complementary RNAs of BaMV-MAZSL1, respectively. This suggests that, regardless of their polarity, the 5′-terminal base of Ba-vsiRNAs is biased towards A (Fig. S3). Interestingly, satsiRNAs exhibited a different pattern: a bias for C and U was evident in sense satsiRNAs, and the preference for A was strong in antisense satsiRNAs (Fig. S3). These results contrast with the previous finding that there is no nucleotide preference in the generation of BaMV and satBaMV siRNAs in N. benthamiana and A. thaliana [19]. To gain further insight into vsiRNA and satsiRNA biogenesis and sorting, the different size classes (21–24 nt) were analysed (Fig. 3B, C). For vsiRNAs of all size classes, a clear preference for A as the 5′-terminal nucleotide was observed, which is indicative of vsiRNAs with high binding affinity for AGO2 and AGO4 homologues; however, a strong bias for sequences beginning with a 5′-C was observed in satsiRNAs of all size classes, suggesting a high binding affinity of AGO5 homologues for satsiRNAs in D. latiflorus.
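The comparison made above, between the 5′-terminal nucleotide frequencies of the siRNAs and the base composition of the strand they derive from, can be sketched in Python. The function names are hypothetical and the observed-to-background ratio is only one simple way of expressing the bias; it is not necessarily the measure used in the study.

```python
from collections import Counter

BASES = "ACGU"


def to_rna(seq: str) -> str:
    """Upper-case a sequence and write it in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def five_prime_percentages(reads_with_counts):
    """Percentage of redundant reads starting with each base.

    `reads_with_counts` is an iterable of (sequence, read_count) pairs.
    """
    totals = Counter()
    for seq, count in reads_with_counts:
        totals[to_rna(seq)[0]] += count
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {base: 100.0 * totals[base] / total for base in BASES}


def base_percentages(sequence):
    """Percentage of each base in a reference strand."""
    counts = Counter(to_rna(sequence))
    total = sum(counts[base] for base in BASES)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or U")
    return {base: 100.0 * counts[base] / total for base in BASES}


def enrichment(observed, background):
    """Observed/background ratio per base (None where background is zero)."""
    return {
        base: (observed[base] / background[base]) if background[base] else None
        for base in BASES
    }
```

Applied to the percentages reported above, the ratio for a 5′-terminal A is about 1.19 for sense Ba-vsiRNAs (35.92/30.31) and about 2.02 for antisense Ba-vsiRNAs (37.39/18.55).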

Fig. 3

The 5′-terminal nucleotides of Ba-vsiRNAs and satsiRNAs. (A): Relative frequency of the 5′-terminal nucleotides of all Ba-vsiRNAs and satsiRNAs (sense and antisense). (B): Number of 21–24 nt reads of Ba-vsiRNAs. (C): Number of 21–24 nt reads of satsiRNAs

A total of 1,389 and 584 bamboo genes were predicted to be possible targets of Ba-vsiRNAs and satsiRNAs, respectively (Table S2); these potential targets were predicted to be involved in a broad range of biological processes. In addition, twelve hotspots of these siRNAs were predicted to have one or more targets (Table S3). The current experimental evidence supporting a functional interaction between host mRNAs and virus-specific siRNAs is weak. Nevertheless, this finding suggests that many host genes and their regulatory sequences might be targeted by Ba-vsiRNA- and satsiRNA-mediated downregulation during viral infection.

In this study, we identified and characterised new divergent BaMV and satBaMV isolates and their specific siRNA profiles, which differ somewhat from those of previous studies, supporting the view that the combined action of viruses, satRNAs, DCLs and AGOs in different host plants results in the high diversity of the vsiRNA pool found in nature. The challenge ahead is to further determine the extent of these functional interactions between vsiRNAs and their targets from a biological perspective.