Functional & Integrative Genomics, Volume 9, Issue 3, pp 277–286

Dicer-like (DCL) proteins in plants

Authors

  • Qingpo Liu
    • School of Agriculture and Food Science, Zhejiang Forestry University
  • Ying Feng
    • College of Environmental and Resources Science, Zhejiang University
  • Zhujun Zhu
    • School of Agriculture and Food Science, Zhejiang Forestry University
Review

DOI: 10.1007/s10142-009-0111-5

Cite this article as:
Liu, Q., Feng, Y. & Zhu, Z. Funct Integr Genomics (2009) 9: 277. doi:10.1007/s10142-009-0111-5

Abstract

Dicer and Dicer-like (DCL) proteins are key components in small RNA biogenesis. DCLs form a small protein family in plants whose diversification predates the emergence of mosses (Physcomitrella patens). DCLs are ubiquitously but not evenly expressed across tissues, at different developmental stages, and in response to environmental stresses. In Arabidopsis, AtDCL1, AtDCL2, and AtDCL4 exhibit similar expression patterns during leaf and stem development, which distinguish them from AtDCL3. However, all DCLs show distinct expression profiles during the development of the reproductive organs, flower and seed. The grape VvDCL1 and VvDCL3 may act sequentially in response to fungal challenge. Overall, the responses of DCLs to drought, cold, and salt are quite different, indicating that plants might have specialized regulatory mechanisms for different abiotic stresses. Further analysis of the promoter regions reveals a few cis-elements that are hormone-responsive, stress-responsive, and development-related. However, gain and loss of cis-elements are frequent during evolution, and not only paralogous but also orthologous DCLs have dissimilar cis-element organization. In addition to cis-elements, AtDCL1 is probably regulated by both ath-miR162 and ath-miR414. Posterior probability analysis has identified some critical amino acid sites that are responsible for functional divergence between DCL family members. These findings provide new insights into DCL protein functions.

Keywords

Dicer-like, Expression, Cis-elements, Functional divergence, Plant

Introduction

Genetic and biochemical evidence has demonstrated that small RNAs such as microRNAs (miRNAs) and small interfering RNAs (siRNAs) in eukaryotic organisms play important roles in developmental regulation (Kidner and Martienssen 2005), epigenetic modifications (Vaucheret 2006), tumorigenesis (Murakami et al. 2006), and biotic and abiotic stress responses (Llave 2004). The two kinds of non-coding RNAs, miRNAs and siRNAs, are produced from different types of precursors (Millar and Waterhouse 2005; Großhans and Filipowicz 2008). Dicer or Dicer-like (DCL) proteins are key components of the miRNA and siRNA biogenesis pathways, processing long double-stranded RNAs into mature small RNAs (Millar and Waterhouse 2005; Chapman and Carrington 2007; Großhans and Filipowicz 2008). In higher plants, insects, protozoa, and some fungi such as Neurospora crassa and Magnaporthe oryzae, Dicer or DCLs form a small gene family composed of two, four, or five members, whereas only one Dicer protein is found in vertebrates, nematodes, Schizosaccharomyces pombe, and the green alga Chlamydomonas reinhardtii.

Dicer and DCL proteins are large multi-domain ribonucleases. Vertebrate, insect, and plant Dicer and DCL proteins generally contain six types of domains including DEAD box, helicase-C, DUF283, PAZ, RNase III, and dsRBD (Margis et al. 2006). In lower eukaryotes, one or more of these domains may be absent. The PAZ, RNase III, and dsRBD domains are considered to function in dsRNA binding and cleavage. The PAZ domain of Dicer is directly connected to the RNase IIIa domain by a long α helix and can specifically bind the end of dsRNA containing a 3′ two-base overhang (MacRae et al. 2006). In addition, the PAZ domain also plays a role in binding single-stranded RNAs (Kini and Walton 2007). Zhang et al. (2004) suggested that Dicer functions through intramolecular dimerization of its two RNase III domains. Structural and biochemical analysis of mouse Dicer (Du et al. 2008) revealed four RNA binding motifs (RBMs 1–4) with RBMs 1 and 2 in dsRBD and RBMs 3 and 4 in RNase IIIb; importantly, a highly conserved lysine residue in Dicer RNase IIIa and IIIb has been suggested to be critical for dsRNA cleavage. In addition to dsRNA binding, the RNase III domain is found to directly bind to the PIWI box of Argonaute proteins, which is dependent on the activity of Hsp90 (Tahbaz et al. 2004). It is worth noting that Dicer itself could act as a molecular ruler, as the distance between the PAZ and RNase III domains (65 Å) matches the length spanned by 25 bp of RNA (MacRae et al. 2006). The dsRBD domain has been suggested to play a role in mediating the processes of discriminating different RNA substrates and the subsequent incorporation of effector complexes (Margis et al. 2006). In higher eukaryotes, the DUF283 domain is proposed to be involved in siRNA/miRNA strand selection by recognizing the asymmetry of RNA duplexes directly or by recruiting another dsRBD protein (Dlakić 2006).

In view of the importance of miRNA/siRNA biogenesis, Dicer and DCL proteins are essential for eukaryote development and viral defense. Mutagenesis studies have indicated that Dicer is indispensable to normal germline development for Caenorhabditis elegans (Knight and Bass 2001) and maintaining two types of stem cells (GSCs and SSCs) in the Drosophila ovary (Jin and Xie 2007). Knockout of Dicer in mouse oocytes results in an inability to progress through first meiotic division due to disorganized spindles and chromosome congression defects (Murchison et al. 2007). Moreover, Dicer is also found to play pivotal roles in embryogenesis (Yang et al. 2005), lung epithelium morphogenesis (Harris et al. 2006), limb development (Harfe et al. 2005), and apoptosis (Matskevich and Moelling 2008). In Drosophila, the two Dicers have distinct but related roles (Lee et al. 2004): Dicer-1 processes miRNA precursors, whereas Dicer-2 is necessary for processing siRNA precursors. Both Dicer-1 and Dicer-2 are required for siRNA-directed mRNA cleavage, and a role for Dicer to protect host against virus infection has been established (Millar and Waterhouse 2005). In mammals, the absence of Dicer leads to a modest increase of virus production and accelerated apoptosis of influenza A virus-infected cells (Matskevich and Moelling 2007). Flies with a loss-of-function of Dicer-2 are more susceptible to infection by flock house virus than the wild type, demonstrating the importance of Dicer-2 in virus defense (Galiana-Arnoux et al. 2006).

Relative to animals and fungi, the notable expansion of DCL family members in monocot and dicot plants may reflect the deployment of RNA silencing in antiviral defense (Deleris et al. 2006; Margis et al. 2006). In Arabidopsis thaliana, four Dicer-like proteins (DCL1–DCL4) with different roles are found (Xie et al. 2004; Dunoyer et al. 2005; Moissiard et al. 2007; Mlotshwa et al. 2008). DCL1 is not only associated with miRNA production but also has a role in the production of small RNAs from endogenous inverted repeats. The other three DCLs are siRNA-generating enzymes. DCL2 generates siRNAs from natural cis-acting antisense transcripts and functions in viral resistance. DCL3 generates siRNAs that guide chromatin modification, while DCL4 is associated with tasiRNA metabolism and acts during posttranscriptional silencing (Liu et al. 2007). The functions of DCL1 and DCL3 overlap to promote Arabidopsis flowering (Schmitz et al. 2007). Overlaps in function are also found for DCL2 and DCL4 with respect to antiviral defense (Deleris et al. 2006) and for DCL2, DCL3, and DCL4 in siRNA and tasiRNA production and in the establishment and maintenance of DNA methylation (Henderson et al. 2006).

Dicer expression studies in humans have shown that 5′-UTR variants generally repress translational efficiency and that their diversity determines tissue- and development-specific expression patterns (Singh et al. 2005). In Arabidopsis, loss of function of all four DCLs was found to cause ABA supersensitivity during seed germination, possibly because one or more specific microRNAs whose biogenesis depends on DCLs function in ABA signaling (Zhang et al. 2008). Several ABA-responsive cis-acting elements are found in the promoter regions of DCL genes, as discussed later.

To gain more insight into the DCL protein family, a comprehensive survey was conducted using various data sources in the public domain.

Databases and methodologies used in the survey

A. thaliana DCL sequences downloaded from the GenBank database were used as queries in BLAST searches against the Oryza sativa, Vitis vinifera, Populus trichocarpa, and Sorghum bicolor genomes. To search exhaustively for homologues, the ENSEMBL and GenBank databases were also searched using the programs BLASTN and BLASTP, respectively. In addition, human Dicer was used as a query to collect orthologous Dicers from fungi and other animals. The program InterProScan (Quevillon et al. 2005) was employed to detect conserved domains within Dicer and DCL protein candidates.

Gene expression microarray datasets (GSE7951 and GSE6901 for rice; GSE5621, GSE5623, GSE5624, GSE5630, GSE5632, GSE5633, GSE5634, and GSE607 for Arabidopsis; and GPL1320 for grape) were downloaded from the GEO database in NCBI. The rice microarray data cover gene expression profiles in nine tissues (Li et al. 2007) and in 7-day-old seedlings under drought, salt, and cold stress treatments (Jain et al. 2007). In Arabidopsis, the expression patterns of DCL genes in roots and shoots under different stresses at successive treatment times, and in various tissues at different developmental stages, were investigated and compared (Bergmann et al. 2004; Schmid et al. 2005). The DCL gene expression profiles in the grape cultivars Cabernet sauvignon and Norton infected with the powdery mildew Erysiphe necator were analyzed to investigate their possible roles in disease resistance.

Program GEPS (Wang et al. 2006) was employed to quantitatively analyze the expression pattern of DCL genes. Similarity measure (SM) was used to quantify the similarity between gene expression profiles. A value of SM close to 1 indicates high similarity of two gene expression patterns irrespective of their absolute expression levels. Thus, a high SM value means that the corresponding genes may have related biological roles (Wang et al. 2006). In addition, specificity measure (SPM) was used to define the tissue-specific expression pattern of a gene, which may be useful for further understanding its physiological behaviors (Wang et al. 2006). In this survey, gene expression level of DCL genes was divided into three classes based on the SPM value: comparatively high expression (SPM ≥ 0.7), above average (0.7 > SPM ≥ 0.5), and below average (SPM < 0.5).
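The two measures can be illustrated with a minimal sketch. The exact GEPS formulas are not reproduced in this survey, so the code assumes a cosine-type formulation: SM as the cosine of the angle between two expression profiles, which is insensitive to absolute expression level, and SPM as a tissue's signal divided by the length of the whole profile vector. Both formulas and all function names are assumptions for illustration; only the three class boundaries come from the text above.

```python
import math


def similarity_measure(profile_a, profile_b):
    """Assumed form of SM: cosine of the angle between two profiles.

    Profiles are equal-length lists of non-negative signal intensities,
    one value per tissue or stage, and must not be all zeros.
    """
    dot = sum(a * b for a, b in zip(profile_a, profile_b))
    norm_a = math.sqrt(sum(a * a for a in profile_a))
    norm_b = math.sqrt(sum(b * b for b in profile_b))
    return dot / (norm_a * norm_b)


def specificity_measures(profile):
    """Assumed form of SPM: each tissue's signal divided by the profile norm."""
    norm = math.sqrt(sum(x * x for x in profile))
    return [x / norm for x in profile]


def expression_class(spm):
    """Three classes used in this survey, based on the SPM value."""
    if spm >= 0.7:
        return "comparatively high"
    if spm >= 0.5:
        return "above average"
    return "below average"
```

Under these assumptions, two profiles that differ only by a constant factor give SM = 1, and the SPM value of 0.619 reported later for AtDCL3 at flower stages 10–11 falls in the "above average" class.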

The 1,000 bp of nucleotide sequences upstream of the translation initiation codon for each DCL gene in three species (rice, Arabidopsis, and grape) was extracted using a custom PERL script and used further for the transcription factor binding sites (TFBSs) analysis. At present, no full-length cDNA sequences for grape exist in JGI and/or the GenBank database. In order to facilitate comparison between species, the sequences upstream of the translation initiation codon rather than the transcription start site were used to screen for possible cis-acting regulatory elements. The software PlantCARE (Lescot et al. 2002) was utilized to determine putative plant-specific TFBSs in a given DNA sequence. To reduce false positives, only TFBSs whose matrix score is not less than 5 were considered further.
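The extraction and filtering steps described above can be sketched as follows. This is not the authors' PERL script: it is a Python illustration that assumes 0-based, half-open coding-sequence coordinates on the forward strand, and a list of hit records with a `score` field. The function names, the coordinate convention, and the record layout are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, cds_begin, cds_end, strand, length=1000):
    """Return up to `length` bases immediately 5' of the start codon.

    cds_begin and cds_end delimit the coding sequence on the forward
    strand (0-based, half-open). For a minus-strand gene the start codon
    sits at the cds_end side, so the region is read from the bases that
    follow cds_end and then reverse-complemented.
    """
    if strand == "+":
        return chrom_seq[max(0, cds_begin - length):cds_begin]
    if strand == "-":
        return reverse_complement(chrom_seq[cds_end:cds_end + length])
    raise ValueError("strand must be '+' or '-'")


def filter_hits(hits, min_score=5.0):
    """Keep predicted binding sites whose matrix score is at least min_score."""
    return [hit for hit in hits if hit["score"] >= min_score]
```

Anchoring the window at the translation initiation codon, as in the text, means the extracted region may include the 5′-UTR; the sketch makes no attempt to separate the two.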

DIVERGE, a program developed by Gu and Vander Velden (2002), was used to detect functional divergence between members of a protein family. The coefficient of type I functional divergence θ and likelihood ratio statistic (LRT) between any two DCL clusters were calculated. If θ is significantly greater than 0, it means altered selective constraints of amino acid sites after gene duplication (Gu and Vander Velden 2002).
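As a rough illustration of how an LRT value translates into a significance level, the sketch below assumes that the statistic is referred to a chi-square distribution with one degree of freedom, which is the usual approximation for testing θ = 0 against θ > 0. That assumption is not stated in this survey, and the function name is hypothetical.

```python
import math


def lrt_p_value(lrt):
    """Upper-tail probability of a chi-square variable with 1 df.

    For one degree of freedom the survival function has the closed form
    erfc(sqrt(x / 2)), so no statistics library is needed.
    """
    if lrt < 0:
        raise ValueError("LRT statistic must be non-negative")
    return math.erfc(math.sqrt(lrt / 2.0))


# Smallest LRT value listed in Table 1 (DCL2 vs DCL4).
p_smallest = lrt_p_value(23.99)
```

Under this assumption, even the smallest LRT in Table 1 gives a p value far below 0.01, in line with the significance levels reported there.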

Dicer and DCL protein sequences were aligned using the E-INS-I program implemented in MAFFT v6.6 (Katoh et al. 2005). Phylogenetic trees were reconstructed with MEGA v3.1 (Kumar et al. 2004) using the neighbor-joining (NJ) and minimum evolution (ME) methods. For both methods, the p-distance model and pairwise deletion of gaps/missing data were selected. Bootstrap tests of phylogeny were performed with 1,000 replications. The phylogenetic trees were displayed using MEGA v3.1 (Kumar et al. 2004).
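The distance setting named above can be made concrete with a small sketch. It computes the p-distance (the proportion of differing sites) with pairwise deletion, meaning that a column is skipped for a given pair only when one of the two sequences has a gap or missing character there. It is an illustration, not MEGA's implementation; treating "-" and "?" as gap or missing symbols is an assumption.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap_chars="-?"):
    """Proportion of differing sites among sites present in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion: drop this site for this pair only
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no sites shared by the two sequences")
    return differing / compared


def distance_matrix(alignment):
    """Map each pair of sequence names to its p-distance.

    `alignment` is a dict of name -> aligned sequence string.
    """
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }
```

A matrix of this kind is the input that distance methods such as NJ and ME work from; with pairwise deletion, different pairs may be compared over different numbers of sites.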

Expression pattern of DCL genes in different tissues

The transcriptional patterns of DCL genes were investigated in nine rice tissues (stigma, ovary, suspension cell, shoot, root, anther, embryo, endosperm, and seed) and three Arabidopsis tissues (rosette leaf, stem, and flowers). Rice and Arabidopsis DCL genes were constitutively expressed in all examined tissues (Fig. 1). Compared with the relatively high expression level of OsDCL1 and OsDCL4, rice OsDCL5 is poorly expressed in all tissues. OsDCL3 and OsDCL4 show a similar expression pattern (SM value 0.957), although the general expression level of OsDCL3 is lower than that of OsDCL4 (Fig. 1a). Although the SPM values of OsDCLs are relatively low, most being below 0.7, the DCL genes still exhibit tissue-specific expression patterns (Fig. 1 and Electronic supplementary material Table S1). The particularly high expression of OsDCL1 in embryo relative to other tissues suggests that it may function in embryogenesis (Fig. 1a). OsDCLs 3 and 5 show comparatively higher expression levels in suspension cell than in other tissues; OsDCLs 2, 4, and 5 are expressed more strongly in shoot, endosperm, and embryo, respectively. Different expression patterns of DCL genes among tissues were also found in Arabidopsis (Fig. 1 and Electronic supplementary material Table S1). AtDCLs 1, 3, and 4 have higher expression levels in flowers but lower levels in rosette leaf and stem (Fig. 1b).
https://static-content.springer.com/image/art%3A10.1007%2Fs10142-009-0111-5/MediaObjects/10142_2009_111_Fig1_HTML.gif
Fig. 1

Gene expression patterns of rice (a) and Arabidopsis (b) DCLs in different tissues. The vertical axis indicates the extent of gene expression levels, which are signal intensity as calculated by GCOS 1.2.1 (rice) and Affymetrix Microarray Suite 5.0 (Arabidopsis), respectively

Developmental regulation of DCL gene expression

Small RNAs (miRNAs and siRNAs) regulate gene expression at the transcriptional or posttranscriptional level (Kidner and Martienssen 2005; Millar and Waterhouse 2005), and since this is crucial for eukaryotic development, the regulation of expression of DCLs should be important for these processes. The available gene expression patterns of DCL genes during the leaf, stem, flower, and seed development collected using the Arabidopsis microarray (Fig. 2) were studied in detail.
https://static-content.springer.com/image/art%3A10.1007%2Fs10142-009-0111-5/MediaObjects/10142_2009_111_Fig2_HTML.gif
Fig. 2

The diversity of expression profiles for DCL genes during the leaf (a), stem (b), flower (c), and seed (d) developments in Arabidopsis. The vertical axis represents the gene expression level, while the horizontal axis is the different developmental stages. The stage-12-equivalent in flower development (c) is abbreviated as follows: stage12-equivalent-1 multi-carpel gyneoceum, enlarged meristem, increased organ number; stage12-equivalent-2 flowers converted to leaf-like structures, some shoot characteristics; stage12-equivalent-3 flowers without sepals, petals replaced by second flowers; stage12-equivalent-4 flowers without sepals, petals, carpeloid structures on sepals; stage12-equivalent-5 flowers without petals, stamens; stage12-equivalent-6 flowers without stamens, carpels, replaced by sepals and petals, indeterminate; stage12-equivalent-7, filamentous organs in whorls two and three

The similarity measure results show that during leaf or stem development, AtDCLs 1, 2, and 4 exhibit similar expression patterns, which differ from that of AtDCL3 (Fig. 2). Figure 2a shows that the expression level of AtDCL3 is low throughout leaf development. The expression of AtDCLs 1, 2, and 4 fluctuates more extensively, reaching a peak at the senescing leaf stage (Fig. 2a). The AtDCLs can also be classified into two groups of expression patterns during stem development. AtDCL3 represents the first group, while the other three AtDCLs form the second one (Fig. 2b). The specificity measure analysis reveals that AtDCL3 is significantly expressed in shoot apex, inflorescence (after bolting), whereas AtDCL2 shows higher expression in shoot apex, transition (before bolting). In addition, AtDCL1 has higher expression in the stem second internode than the other AtDCLs, suggesting a particular role at the corresponding developmental stage.

AtDCLs show a wide diversity of expression profiles during flower and seed development (Fig. 2). The expression of AtDCLs 1, 3, and 4 tends to decrease from flower stage 9 to stage-12-equivalent and then rebounds at flower stage 15. In contrast, the expression of AtDCL2 is significantly higher at flower stage-12-equivalent (p < 0.001) and then decreases at stage 15 (Fig. 2c). Relative to other stages, AtDCL3 is fairly specifically expressed at flower stages 10–11 (SPM value, 0.619). However, all AtDCLs are weakly expressed in mature pollen. During seed development, the expression of AtDCL2 remains nearly constant; AtDCLs 1 and 3 show expression peaks at the seventh and sixth seed stages, respectively. Notably, AtDCL4 expression increases significantly from seed stage 3 to stage 6 and then decreases dramatically up to stage 10 (Fig. 2d).

Expression profiles of DCL genes in response to stress

Abiotic stress

The expression profiles of DCL genes in response to stresses such as drought, cold, and salt were examined using rice datasets. Compared with the control, the expression of rice OsDCLs is slightly repressed under these stress treatments (Fig. 3). The expression of OsDCL3 is significantly reduced under drought or salt treatment (p < 0.05), as is that of OsDCL4 under drought treatment.
https://static-content.springer.com/image/art%3A10.1007%2Fs10142-009-0111-5/MediaObjects/10142_2009_111_Fig3_HTML.gif
Fig. 3

The expression of rice OsDCL genes in response to the drought, cold, and salt stresses

To further confirm the responsiveness of DCLs to stresses, the expression of Arabidopsis AtDCLs in roots and shoots was examined under drought, cold, and salt treatments. In roots, AtDCL1 expression decreases extensively from 0.25 to 1.0 h after drought treatment, whereas the other AtDCLs remain nearly constant (Fig. 4a). As in roots, AtDCL1 expression in shoots is lowest at 1.0 h and then increases to a high level at 24.0 h after drought treatment. AtDCL3 shows no obvious change in expression under the different drought treatment conditions. Both AtDCL2 and AtDCL4 increase their expression at 6.0 h. However, AtDCL2 shows its highest expression at 6.0 h, while AtDCL4 reaches its expression peak at 12.0 h after treatment and then declines at 24.0 h.
https://static-content.springer.com/image/art%3A10.1007%2Fs10142-009-0111-5/MediaObjects/10142_2009_111_Fig4_HTML.gif
Fig. 4

Expression profiles of Arabidopsis AtDCLs in roots and shoots under different physiological conditions. a drought, b cold, c salt

After cold treatment, the expression of AtDCL1 continues to increase and shows its highest level at 24.0 h in roots, and AtDCL4 decreased from 6.0 to 24.0 h. Similar patterns were observed in shoots where the expression of AtDCL1 has increased extensively after long time of cold treatment, whereas other AtDCLs show an inverse tendency (Fig. 4b).

More complicated expression patterns of AtDCLs were revealed after salt treatment (Fig. 4c). AtDCL1 showed a decreasing expression pattern, while AtDCL4 was significantly expressed at 12.0 h and then declined rapidly at 24.0 h. In contrast, the changes in expression for AtDCLs 2 and 3 are insignificant in shoots. In roots, the expression of AtDCL1 decreased promptly after 3 and 6-h salt treatment and subsequently recovered at 12.0 h. The AtDCL4 expression decreased with the time of salt treatment. Interestingly, AtDCL2 and AtDCL3 exhibit distinct expression patterns, namely, the former had the lowest expression level at 6.0 h, whereas AtDCL3 was significantly activated at the same time point.

Overall, it is evident that DCL action can be compartmentalized in different tissues (Xie et al. 2005) under different environmental conditions. These results imply that plants may have evolved specialized regulatory mechanisms in response to different abiotic stresses (Xie et al. 2004).

Biotic stress

The correlation between DCLs and disease resistance was examined in grape. Because DCL2 and DCL4 were not included in the grape microarray data, only the responsiveness of VvDCLs 1 and 3 to infection by the powdery mildew E. necator in two grape cultivars was examined (Fig. 5). In both the disease-resistant and susceptible grape cultivars, Norton and C. sauvignon, VvDCL1 and VvDCL3 exhibit similar expression patterns. VvDCL1 expression peaks at 4.0 h after fungal inoculation. Afterward, it decreases rapidly and increases again after 24.0 h; the expression level of C. sauvignon VvDCL1 is relatively higher than that of its counterpart in Norton at 4.0, 8.0, and 24.0 h after treatment (Fig. 5a). In the C. sauvignon–Norton comparison, the expression of VvDCL3 is significantly higher in Norton than in C. sauvignon, although both show the highest expression level at 8.0 h after treatment (Fig. 5b). With respect to other diseases, DCL1 is argued to be required for resistance to the bacterial pathogen Pseudomonas syringae (Katiyar-Agarwal et al. 2006), while DCLs 2, 3, and 4 are important for antiviral defense (Xie et al. 2004; Deleris et al. 2006).
https://static-content.springer.com/image/art%3A10.1007%2Fs10142-009-0111-5/MediaObjects/10142_2009_111_Fig5_HTML.gif
Fig. 5

Comparison of the responses of VvDCL1 (a) and VvDCL3 (b) to the infection by powdery mildew E. necator in two grape cultivars Norton and C. sauvignon

Regulatory elements for plant DCL genes

Transcription factors bind to corresponding TFBSs upstream from genes of interest and the profiles of cis-acting elements may thus provide information for understanding the regulatory mechanism of gene expression. A computational tool PlantCARE (Lescot et al. 2002) was adopted to identify putative TFBSs in the 1,000-bp DNA sequence upstream of the translation initiation codon of DCL genes in rice, Arabidopsis, and grape.

Light-responsive elements such as Sp1 and the GT1 box are redundantly present in the promoters of plant DCL genes (Electronic supplementary material Table S2). Sp1 is the most redundant cis-element found in rice DCLs. All but OsDCL3, where one GT1 box is found, have at least three copies of Sp1 elements. In contrast, Arabidopsis and grape each showed a single Sp1 element, in AtDCL4 and VvDCL4, respectively. In addition, two and four GT1 boxes have been identified in VvDCL3 and AtDCL4, respectively. The second class of cis-elements enriched in the promoter region comprises the plant hormone response elements, such as ABRE, GARE, P-box, the TCA element, and the CGTCA and TGACG elements, suggesting that plant DCLs may play a role in the corresponding ABA, gibberellin, salicylic acid, and MeJA signaling pathways. The Skn-1 motif, which is required for endosperm expression, is also found frequently. With one exception (VvDCL1), all DCLs possess this regulatory element (Electronic supplementary material Table S2). The presence of anaerobic and stress response elements such as the GC motif, MBS, HSE, LTR, and TC-rich repeats in the upstream regions of DCLs further supports the idea that plant DCLs function in a wide diversity of ways.

In addition, species- and/or DCL membership-specific cis-elements have been also observed. In six out of eight Arabidopsis and grape DCLs, a cis-element termed circadian that is involved in circadian control was found, whereas none of the rice OsDCLs possessed this element. AC-II, a cis-element required for xylem-specific expression, is specifically present in OsDCL2 and AtDCL2. Furthermore, orthologous DCLs have different cis-element organization as well (Electronic supplementary material Table S2). Consistent with this observation, Liu et al. (2007) revealed that loss-of-function mutations of OsDCL4 cause severe developmental defects in rice but not in Arabidopsis; thereby, they suggested that OsDCL4 may have evolved a much broader role in development than its Arabidopsis counterpart.

Interestingly, DCL1 participates in generating mature miRNAs, and in a feedback loop, miR162 negatively regulates DCL1 expression (Xie et al. 2003). Moreover, searches using both FASTA and the plant microRNA potential target finder miRU (Zhang 2005) predict ath-miR414 to be another potential regulator of AtDCL1. However, osa-miR414 is not predicted to target OsDCL1.

Divergence between plant DCL family members

The estimation of functional divergence between clusters was based on the NJ tree reconstructed using DCL proteins from rice, S. bicolor, Arabidopsis, poplar, and grape. As expected, the null hypothesis (no functional divergence) could be strongly rejected in that the coefficients of type I functional divergence (θ) between DCL subgroups were statistically significant (p < 0.01; Table 1), indicating that significant amino acid site-specific selective constraints operate on different types of DCL members, leading to a subgroup-specific functional evolution after their diversification.
Table 1

Functional divergence between subgroups of the plant DCL family

Group 1   Group 2   θ ± SE          LRT     p       Sites with Qk > 0.8   Sites with Qk > 0.9
DCL1      DCL2      0.528 ± 0.073   51.96   <0.01   18                    2
DCL1      DCL3      0.378 ± 0.072   27.32   <0.01   2                     2
DCL1      DCL4      0.476 ± 0.071   44.68   <0.01   15                    4
DCL2      DCL3      0.322 ± 0.052   39.02   <0.01   2                     0
DCL2      DCL4      0.234 ± 0.048   23.99   <0.01   0                     0
DCL3      DCL4      0.362 ± 0.053   46.77   <0.01   11                    0

A large Qk value indicates a high possibility that the functional constraint (or the evolutionary rate) of a site is different between two clusters (Gu 2003)

θ: coefficient of type I functional divergence between two gene clusters; LRT: likelihood ratio statistic; Qk: posterior probability

To identify critical amino acid sites that may be responsible for functional divergence between DCL subgroups, the posterior probability (Qk) of divergence was determined for each site. By definition, a large Qk indicates a high possibility that the functional constraint (or the evolutionary rate) of a site is different between two clusters (Gu 2003). The results showed that the functional divergence between DCL members can be partially attributed to variation at several to tens of amino acid sites whose Qk value is greater than 0.8 (Table 1 and Electronic supplementary material Table S3). Strong functional diversification is indicated to have occurred between the DCL1/DCL2, DCL1/DCL3, and DCL1/DCL4 pairs, as there are two, two, and four amino acid sites with Qk > 0.9, respectively (Table 1 and Electronic supplementary material Table S3). However, the diversification between DCL1 and DCL3 was not as strong as that between DCL1/DCL2 and DCL1/DCL4 because only two amino acid sites are found to be possible contributors. For the DCL2/DCL3 and DCL2/DCL4 pairs, there are only two and zero amino acid sites with Qk > 0.8, respectively, suggesting that their diversification is much weaker.

Based on the method of Gu (1999), the functions of plant DCLs were found to be significantly divergent from one another. In agreement with previous reports, DCL1 is found to be strongly divergent from the other DCL family members, whereas the divergence between DCL2, DCL3, and DCL4 was relatively weak because no amino acid site with Qk > 0.9 was found for the corresponding gene pairs (Table 1). Most of the critical amino acid sites fall in the PAZ domain and in the loops between Helicase-C and PAZ and between RNase IIIa and RNase IIIb (Electronic supplementary material Table S3). Electronic supplementary material Figure S1 shows the amino acid sites with Qk > 0.9 that are predicted to be highly related to functional divergence (Gu 2003). In DCL2, sites 1495 (Qk = 0.967) and 1846 (Qk = 0.978) are invariant for cysteine and arginine, respectively, whereas the same positions in DCL1 have several amino acids with different chemical properties, such as the non-polar amino acid valine as well as the uncharged polar amino acids cysteine and asparagine. Similar cases were also observed between DCL1/DCL3 and DCL1/DCL4 (Electronic supplementary material Fig. S1).

Evolutionary analysis of Dicer and DCL proteins

The A. thaliana genome encodes four DCL proteins. Based on homology searches, four DCLs were identified in V. vinifera and five DCLs each in poplar, rice, and S. bicolor. Using the TIGR rice database (build 3), Margis et al. (2006) reported that there are six DCLs in the rice genome. We carefully examined the rice genomic and protein sequences (version 5) and also identified six DCL candidates. However, Os09g14160 is just 670 aa in length. The InterProScan analysis shows that Os09g14160 contains only the PAZ, RNase III, and dsRBD domains. According to the annotation in TIGR (build 5), Os09g14160 has three alternatively spliced transcripts, and none of them is longer than 670 aa; at present, therefore, it is only possible to confirm the presence of five DCLs in rice. Notably, three DCL genes have been identified in the moss Physcomitrella patens. Phylogenetic analysis showed that each of the three PpDCLs clearly clustered with its orthologues in vascular plants (Fig. 6), which might place the divergence time of DCLs before the emergence of the moss P. patens but after that of the single-cell green alga C. reinhardtii.
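The reasoning about Os09g14160 can be pictured as a domain-completeness check. The sketch below assumes a simple rule: a candidate is flagged when any of the six domain types listed in the Introduction is not detected. This rule, the domain labels used as strings, and the function name are illustrative assumptions, not the authors' stated criterion and not InterProScan output.

```python
# The six domain types named in the Introduction for Dicer and DCL proteins.
REQUIRED_DOMAINS = (
    "DEAD box",
    "helicase-C",
    "DUF283",
    "PAZ",
    "RNase III",
    "dsRBD",
)


def missing_domains(detected):
    """Return the required domain types absent from a candidate.

    `detected` is any iterable of domain labels found in the protein;
    repeated labels (for example two RNase III domains) are fine.
    """
    found = set(detected)
    return [domain for domain in REQUIRED_DOMAINS if domain not in found]


# Domains reported in the text for Os09g14160.
os09g14160_missing = missing_domains(["PAZ", "RNase III", "dsRBD"])
```

For Os09g14160 the check reports the DEAD box, helicase-C, and DUF283 domains as missing, which matches the description of a truncated candidate that could not be confirmed as a sixth rice DCL.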
https://static-content.springer.com/image/art%3A10.1007%2Fs10142-009-0111-5/MediaObjects/10142_2009_111_Fig6_HTML.gif
Fig. 6

Phylogenetic tree of Dicer and Dicer-like (DCL) proteins in eukaryotic organisms. The numbers beside the branches represent bootstrap values (≥60%) based on 1,000 resamplings. To identify the species of origin for each Dicer or DCL, a species acronym is included before the protein name: Ac Aspergillus clavatus, Af Aspergillus fumigatus, Ag Anopheles gambiae, Am Apis mellifera, At Arabidopsis thaliana, Bt Bos taurus, Cb Caenorhabditis briggsae, Cco Coprinopsis cinerea okayama7#130, Ce Caenorhabditis elegans, Cf Canis familiaris, Cr Chlamydomonas reinhardtii, Dd Dictyostelium discoideum, Dm Drosophila melanogaster, Dr Danio rerio, Gg Gallus gallus, Hs Homo sapiens, Md Monodelphis domestica, Mm Mus musculus, Mt Medicago truncatula, Nc Neurospora crassa, Nf Neosartorya fischeri NRRL 181, Nv Nasonia vitripennis, Oa Ornithorhynchus anatinus, Os Oryza sativa, Pa Podospora anserina, Pp Physcomitrella patens, Pt Populus trichocarpa, Sb Sorghum bicolor, Sp Strongylocentrotus purpuratus, Spo Schizosaccharomyces pombe, Tr Takifugu rubripes, Tt Tetrahymena thermophila, Vv Vitis vinifera, Xt Xenopus tropicalis, Zm Zea mays

With regard to genomic location, all four Arabidopsis AtDCLs are located outside the putative duplicated segments, while in rice, OsDCLs 3, 4, and 5 are in regions that have undergone whole genome duplication events. It is evident, however, that in rice, the duplicated copies have been lost during evolution because only single copies of the genes remain. There are five DCLs in the poplar genome (Margis et al. 2006). Based on sequence similarity, it can be inferred that PtDCL5 resulted from a recent duplication of its paralogue PtDCL2, probably after the speciation of poplar and grape (Fig. 6). The emergence of the fifth DCL in monocot species, however, should have occurred after the monocot–dicot split ∼200 million years ago (mya) but before the divergence of cereals approximately 70 mya (Margis et al. 2006). As suggested by Deleris et al. (2006), the extensive proliferation of the DCL family in plants might be partially attributed to the requirement for multiple antiviral DCL activities. It is therefore reasonable to predict that the expansion of DCL proteins in plants may be an ongoing process.

Conclusions

The diversification of plant DCLs can be placed before the emergence of the moss P. patens, and their rapid proliferation is argued to be partially attributable to the need for plants to acquire resistance to viruses, bacteria, and fungi. The survey of upstream elements revealed three major classes of cis-elements in the promoter regions of DCLs, and their distinct organization patterns are interpreted to reflect their varying participation in the regulation of gene expression. Importantly, the amino acid level analysis suggested that functional divergence has occurred between plant DCL proteins and identified the critical amino acid sites involved in this divergence for further investigation.

Acknowledgements

We thank Prof. Rudi Appels and Prof. Wujun Ma for their valuable and constructive suggestions and for careful editing of the manuscript. This work was supported by an intramural fund from Zhejiang Forestry University (to Qingpo Liu) and grants from National Basic Research Program of China (973 program; no. 2007CB109305), National Natural Science Foundation of China (no. 30740011), Zijin Program from Zhejiang University (to Y. Feng), the Special Fund for Grade B Innovative Research Team from Zhejiang Forestry University (to Z. Zhu).

Supplementary material

10142_2009_111_MOESM1_ESM.pdf (269 kb)
Table S1 (PDF 269 KB).
10142_2009_111_MOESM2_ESM.pdf (17 kb)
Fig. S1 Functional divergence significantly related amino acid site candidates [Qk > 0.9]. A site-specific profile based on the posterior probability (Qk) was used to identify critical amino acid sites that were responsible for functional divergence between DCL family members. According to the definition, large Qk indicates a high possibility that the functional constraint (or the evolutionary rate) of a site is different between two clusters. a DCL1/DCL2; b DCL1/DCL3; c DCL1/DCL4 (PDF 17.4 KB).

Copyright information

© Springer-Verlag 2009