Introduction

During the last two decades, a group of silencing phenomena emerged, in which small RNA molecules (20–30 nucleotides (nt) long) function as sequence-specific guides for ribonucleoprotein complexes with repressive functions (reviewed in [48]). Each RNA silencing pathway involves three main steps: (1) production of small RNAs, (2) formation of an effector complex, and (3) sequence-specific induction of silencing effects. Small RNA production in RNA silencing pathways frequently employs Dicer, a ∼200-kDa RNase III endonuclease. Dicer, which is encoded by a single mammalian gene, cleaves different substrates into small RNAs acting in both microRNA (miRNA) and RNA interference (RNAi) pathways (Fig. 1). The main differences between mammalian miRNA and RNAi pathways are in the origins of small RNAs (i.e., in Dicer substrates), prevailing modes of silencing, and biological functions rather than in the molecular mechanism per se, which is essentially shared downstream of Dicer (Fig. 1).

Fig. 1

An overview of RNAi and miRNA pathways and their essential components. The miRNA pathway is a ubiquitous mammalian RNA silencing pathway. During miRNA biogenesis, RNase III Dicer cleaves small hairpin precursors (pre-miRNAs) and produces 21–23-nt-long miRNAs, which are loaded on an AGO protein in the RNA-induced silencing complex (RISC). The RNAi pathway is generally a minor mammalian small RNA pathway. It shares protein components with the miRNA pathway. RNAi employs ∼22-nt-long small interfering RNAs (siRNAs) produced by Dicer from long dsRNA. The silencing effect does not depend on the origin of a small RNA but on the base pairing with a target and the type of AGO protein. Mammals have four AGO proteins (AGO1–4). All bind miRNAs and siRNAs, but only AGO2 is capable of an endonucleolytic cleavage of cognate RNAs, which is a hallmark of RNAi. Importantly, miRNAs loaded on AGO2 can induce endonucleolytic cleavage upon perfect base pairing with targets as well. However, a typical miRNA binding is imperfect and results in translational repression. miRNAs function as gene-specific inhibitors where miRNA networks provide a combinatorial system of post-transcriptional control of gene expression

The miRNA pathway serves for selective post-transcriptional repression of gene expression (reviewed for example in [37, 116]). It is mediated by ∼21–23-nt-long genome-encoded miRNAs. The biogenesis of canonical miRNAs starts with primary miRNA transcripts (pri-miRNAs) synthesized by RNA polymerase II. Pri-miRNAs carry one or more local stem loop structures, which are released by the nuclear Microprocessor complex (composed of Drosha and DGCR8). The released small hairpin (∼80 nt [22]) precursor miRNA (pre-miRNA) with a 2-nt 3′ overhang is transported to the cytoplasm where it is cleaved by Dicer into a miRNA duplex, which is subsequently loaded onto an Argonaute (AGO) protein family member. Argonautes are the principal protein components of effector complexes in all RNA silencing pathways. miRNAs typically have imperfect base pairing with cognate messenger RNAs (mRNAs) resulting in translational repression followed by mRNA degradation (reviewed in [44]). The functional base pairing with a cognate mRNA appears to involve little beyond the so-called miRNA “seed” region (nucleotides 2–8), which essentially defines the bulk of each miRNA target repertoire [6, 23, 34]. In addition to canonical miRNAs, non-canonical miRNAs were also identified, which bypass processing by the Microprocessor complex or Dicer. miRNAs are the dominant class of mammalian small RNAs. They exert a strong impact on gene expression and have been implicated in a wide range of biological processes. It has been estimated that more than 60% of all mRNAs might be regulated by miRNAs in humans [28].
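The seed-based target recognition described above can be illustrated with a minimal sketch. It assumes that a target site is simply a perfect match to the reverse complement of guide nucleotides 2–8; the guide sequence is used purely for illustration, the target sequence is hypothetical, and the function names are not taken from any published tool. Real target prediction considers additional features (site context, conservation, pairing outside the seed) that are ignored here.

```python
# Minimal sketch: locate perfect seed-match sites for a miRNA in a target RNA.
# Assumption: the seed is guide nucleotides 2-8 (1-based), as stated in the text,
# and a site is an exact match to the reverse complement of the seed.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/U/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed(mirna: str) -> str:
    """Return nucleotides 2-8 of the guide (0-based slice 1..8)."""
    return mirna[1:8]


def seed_match_sites(mirna: str, target: str) -> list[int]:
    """Return 0-based start positions of perfect seed matches in the target."""
    site = reverse_complement(seed(mirna))
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]


# Illustrative 22-nt guide and a hypothetical target fragment.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
target = "AAACUACCUCAGGGCUACCUCA"

print(seed(mirna))                      # GAGGUAG
print(seed_match_sites(mirna, target))  # [3, 14]
```

Because only seven nucleotides define a site in this simplified model, a single miRNA can match many transcripts, which is consistent with the broad target repertoires discussed above.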

RNAi was defined as a sequence-specific mRNA degradation induced by long double-stranded RNA (dsRNA) [25]. Accordingly, the term RNAi is used here strictly for sequence-specific degradation induced by long dsRNA. During RNAi (reviewed in [75]), long dsRNA is processed by Dicer into ∼20–23-nt short interfering RNAs (siRNAs). Dicer then assists in loading siRNAs onto an Argonaute protein (AGO2 in mammals). A small RNA loaded on the mammalian AGO2 base pairs with a complementary mRNA. In the case of perfect complementarity, the cognate mRNA is cleaved by AGO2 in the middle of the base-paired sequence. RNAi functions as a defensive mechanism against viruses and repetitive elements in plants and invertebrates [79]. In contrast, mammalian RNAi seems to be a rudimentary pathway whose defensive function has been taken over by other molecular mechanisms during evolution [94]. A known exception to the rule is mouse oocytes, the first mammalian cell type where RNAi was experimentally demonstrated [95, 111]. In the following text, we will discuss Dicer structure and function in Metazoa with a focus on adaptations specific for miRNA and RNAi pathways.

Dicer domain organization and evolution in Metazoa

Dicer (reviewed previously in [42]) is a member of the RNase III family. Mammalian Dicer proteins are ∼220-kDa multidomain proteins, which are composed of domains ordered from the N- to the C-terminus as follows: N-terminal DEAD-like (DExD) and helicase superfamily C-terminal domains, a domain of unknown function DUF283, a Piwi/Argonaute/Zwille (PAZ) domain, RNase IIIa and RNase IIIb domains, and a C-terminal dsRNA-binding domain (dsRBD) (Fig. 2a). In contrast to the simplest RNase III family members (exemplified by Escherichia coli RNase III), which carry only one RNase III domain and dimerize when cleaving dsRNA [43, 54], Dicer proteins carry two RNase III domains, which form an intramolecular dimer [69, 119]. Based on their complex domain composition, Dicers were classified as RNase III class III enzymes, while Drosha, an RNase III enzyme processing pri-miRNAs into pre-miRNAs, was classified as an RNase III class II enzyme [66]. However, as a diversity of protozoan Dicers was discovered, it was proposed to merge the Dicer and Drosha groups into a single class II of RNase III enzymes [42].

Fig. 2

Dicer architecture. a Complex mammalian Dicer protein domain composition and a short Dicer from Giardia intestinalis lacking accessory domains present in human Dicer: the N-terminal helicase domain, domain of unknown function 283 (DUF283), and the C-terminal dsRNA-binding domain (dsRBD). b A phylogenetic tree of metazoan Dicer proteins. Dicer paralogs in insects are designated as follows: D1–DCR-1, and D2–DCR-2. c Dicer from G. intestinalis. On the left is a ribbon model, and on the right is the schematic representation of the structure. The crystal structure of Dicer from Giardia has two RNase III domains (orange) and the PAZ domain (green). The connector helix (red), which links the PAZ domain to RNase III domains, is supported by the platform domain (gray). Two RNase III domains are connected by the bridging domain. The structure was derived from published data [68]. d Human Dicer. The schematic representation was generated based on previously published data [57, 102]; one of the cryo-EM reconstructions [102] is shown above the scheme with the same color coding for domains: blue the N-terminal helicase domain, green the PAZ domain, gray the platform domain, and orange RNase III domains. Please note that the human Dicer is in the literature usually displayed upside-down relative to that of Giardia, i.e., with the PAZ domain (HEAD) on the top of the structure

Although Dicer is generally well conserved among eukaryotic organisms, a number of functionally distinct paralogs emerged [74]. Most higher metazoans have one Dicer gene in their genome, while basal metazoans exhibit frequent Dicer duplications [17]. Exceptional among higher metazoan model organisms are Drosophila’s two Dicer paralogs, which underlie a genetic divergence of RNAi (DCR-2) and miRNA (DCR-1) pathways [59]. Consistent with its defensive role, DCR-2, the RNAi-dedicated Dicer in Drosophila, is more derived from the common ancestral Dicer than the miRNA pathway-dedicated DCR-1 [74] (Fig. 2b). In some invertebrates, such as Caenorhabditis elegans, a single Dicer is used for efficient biogenesis of both miRNAs and siRNAs. In mammals, a single Dicer produces mainly miRNAs and few, if any, siRNAs. Thus, one common Dicer design apparently evolved during metazoan evolution from a universal factor for RNAi and miRNA pathways into a factor specifically adapted for either RNAi or miRNA pathways. The molecular foundations of such adaptations will be discussed in the following text.

Structure of mammalian Dicer and the “molecular ruler” model

A full-length mammalian Dicer has not been crystallized yet. The structure of mammalian Dicer has thus been inferred from (i) biochemical studies of recombinant Dicer and individual domains [63, 83, 87, 118, 119], (ii) comparison with crystal structure of Giardia intestinalis Dicer [67, 69], (iii) crystallographic studies on a fragment of mammalian Dicer [20] or on individual domains [64, 98, 103], and (iv) cryoelectron microscopy (cryo-EM) studies of human Dicer and its complexes with other proteins [56, 57, 102, 108, 113].

The Giardia Dicer structure reveals the spatial organization of the core part of eukaryotic Dicer proteins and explains how Dicer generates small RNAs of specific lengths [68] (Fig. 2c). Two RNase III domains of Giardia Dicer form an intramolecular dimer resulting in a single processing center placed at a specific distance from the PAZ domain, confirming the biochemistry-based prediction of the human Dicer organization [119]. A structural component defining this distance is an α helix (connector helix), which directly links the PAZ domain and RNase III domains [68]. Thus, Dicer functions as a molecular ruler, measuring the length of the substrate from the PAZ domain to RNase III domains where it is cleaved.
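The molecular ruler idea can be sketched in a few lines of code. The sketch below is a deliberate simplification: it treats the duplex as a one-dimensional string of base pairs, assumes that each product is measured from the currently bound end, ignores the 2-nt stagger of the two RNase III cuts, and uses a hypothetical fixed ruler length of 22 nt (a real enzyme produces a narrow distribution of lengths). The function name `dice` is illustrative only.

```python
# Minimal sketch of the "molecular ruler": successive cleavage of a long duplex
# at a fixed distance from the bound end.
# Assumptions (not from the source): fixed ruler length, no 2-nt cut stagger,
# and the enzyme always rebinds the newly created end.


def dice(duplex_length: int, ruler_nt: int = 22) -> list[tuple[int, int]]:
    """Return (start, end) base-pair intervals of products, measured from the end."""
    products = []
    start = 0
    # Keep cutting while the remaining duplex still spans the ruler distance.
    while duplex_length - start >= ruler_nt:
        products.append((start, start + ruler_nt))
        start += ruler_nt
    return products


print(dice(70))  # [(0, 22), (22, 44), (44, 66)]
```

In this toy model, a 70-bp duplex yields three products and a 4-bp remainder that is too short to reach the processing center. A substrate that spans the ruler distance only once, such as a pre-miRNA stem, would be cut a single time.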

The front view of the Giardia Dicer structure resembles an axe (Fig. 2c). The blade is formed of an intramolecular duplex of two RNase III domains, which are connected by a bridging domain constituting the back end of the blade. The platform domain is adjacent to the RNase IIIa domain and makes up the upper part of the handle. The PAZ domain is connected by a long helix to the RNase IIIa domain and forms the base of the handle [68]. Altogether, the Giardia Dicer is formed of three rigid regions, which are linked by flexible hinges. One region is formed by RNase III domains and the bridging domain, the second by the platform domain and the connector helix, and the third by the PAZ domain (Fig. 2c). These three parts can swing relative to each other and possibly ensure accommodation of Dicer to the structure of its substrate [67]. This conformational flexibility likely enables binding of dsRNAs with non-canonical base pairing as well as imperfect duplexes of pre-miRNAs [67]. In addition, dsRNA binding is presumably stabilized by several positively charged patches on the surface of the Giardia Dicer between the processing center and the PAZ domain, which are in contact with dsRNA [67, 69].

Mammalian Dicers are much larger and contain domains absent in the Giardia Dicer (Fig. 2a, c, d). Although a full-length mammalian Dicer has not been crystallized, the architecture of the human Dicer and positions of its domains and interacting partners have been inferred by cryo-EM of the full-length protein and its mutants [56, 57, 102, 108, 113]. The overall shape of the human Dicer resembles the letter L; the shape is further divided into a head, a body, and a base (Fig. 2d). The PAZ domain is adjacent to the platform domain in the head of the protein, while the RNase IIIb is located in the body. The helicase domain constitutes the base. The position of the processing center relative to the PAZ domain differs between human and Giardia Dicers, which explains the fact that the human Dicer produces siRNAs about four nucleotides shorter than the Giardia Dicer, a difference corresponding to approximately one third of a dsRNA helical turn [56]. Therefore, the processing center of the human Dicer has to access the cleavage site of dsRNA from a different angle relative to the dsRNA helical end than that of the Giardia Dicer [56].
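The relation between the four-nucleotide length difference and the helical geometry can be checked with a short calculation. It assumes roughly 11 base pairs per turn for an A-form RNA duplex, which is a textbook approximation and not a value taken from the cited structural work.

```python
# Back-of-the-envelope check: what fraction of a helical turn is a 4-nt shift?
# Assumption (not from the source): an A-form dsRNA helix has ~11 bp per turn.

BP_PER_TURN = 11.0
length_difference_nt = 4  # human Dicer products vs. Giardia Dicer products

fraction_of_turn = length_difference_nt / BP_PER_TURN
rotation_degrees = fraction_of_turn * 360.0

print(round(fraction_of_turn, 2))  # 0.36
print(round(rotation_degrees))     # 131
```

Under this assumption, the cleavage site is rotated by roughly 130 degrees around the helix axis, which is why the processing center must approach the duplex from a different side.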

The PAZ domain

The PAZ domain found in Dicer and Argonaute proteins is a dsRNA-terminus binding module [64, 68]. The PAZ domain has a 3′ overhang binding pocket, but only the PAZ domain of Dicer has an extra loop enriched in basic amino acids, changing electrostatic potential and molecular surface of the pocket. These changes may influence RNA binding by Dicer and handing off the substrate to other protein complexes [68]. The PAZ domain of metazoan Dicers also recognizes the phosphorylated 5′ end of a pre-miRNA. A mutation of the 5′ binding pocket leads to dysregulation of miRNA biogenesis in vivo [83]. The 5′ binding pocket is conserved in Drosophila DCR-1 and human Dicer but not in the Giardia Dicer [83]. Importantly, the 5′ binding pocket appears conserved in Dicer proteins functioning in miRNA biogenesis (human Dicer, Drosophila DCR-1) but not in Dicer proteins dedicated to long dsRNA processing (Giardia, Schizosaccharomyces, Drosophila DCR-2). Accordingly, simultaneous fixing of 3′ and 5′ ends emerges as a feature important for fidelity of miRNA biogenesis but not for siRNAs [83].

The N-terminal helicase

The N-terminus of metazoan Dicers harbors a complex helicase structure, which is adjacent to RNase III catalytic domains [56]. Although the helicase must come into contact with the substrate, its functional significance is only partially understood. The N-terminal helicase belongs to the RIG-I-like helicase family [121] and consists of a proximal DExD/H domain and an adjacent helicase superfamily C-terminal domain (Fig. 2a). A conventional helicase domain has an ATPase activity. Indeed, invertebrate Dicers bind and hydrolyze ATP [4, 49, 78, 117]. However, although the N-terminal helicase with conserved motifs important for ATP binding and hydrolysis is present in mammalian Dicers, there is no evidence of an ATP requirement for human Dicer activity [87, 118]. The human Dicer has the same processing efficiency in the presence or absence of ATP. Moreover, the rate of cleavage is not influenced by addition of other nucleotides, non-cleavable ATP analogues, or a mutation in the Walker A motif of the ATPase/helicase domain [87, 118]. Notably, these experiments were performed using a long dsRNA substrate with blunt ends, whose processing by invertebrate Dicers is ATP-dependent [4, 49, 78, 117, 118]. Remarkably, deletion of the helicase domain results in a high cleavage rate of long dsRNAs by human Dicer in vitro [63] as well as in vivo in murine and human cells [26, 47]. Thus, the N-terminal helicase in mammalian Dicers has a different role in substrate recognition and processing than the helicase in invertebrate Dicers although the overall shapes of human and Drosophila Dicer proteins are similar [56].

As the crystal structure of the N-terminal helicase domain of Dicer has not been obtained yet, we have to rely on cryo-EM-based modeling, which suggests that the N-terminal helicase is composed of three globular subdomains (HEL1, HEL2, and HEL2i) where the DExD/H domain corresponds to HEL1 and the helicase superfamily C-terminal domain to HEL2 and HEL2i. All three parts of the helicase form a clamp near the RNase III domain active site. Interestingly, the N-terminal helicase was found in two distinct conformations with respect to the body of the enzyme [56], similar to the RIG-I helicase, which was used as a template structure for modeling [53]. Analysis of substrate-specific structural rearrangements suggests that human Dicer exists in three states depending on the presence and type of substrate [102]. Unbound Dicer, existing in a canonical state, rearranges upon substrate binding, which involves the PAZ domain as well as the helicase domain. Substrate-bound Dicer exists either in an open or a closed state. The open state is cleavage-competent and is typical for pre-miRNA binding. It is characterized by binding of a pre-miRNA along the platform, bending of the helicase domain, and access of RNase IIIa and IIIb sites to the substrate [102]. The closed state has been observed for a 35-bp A-form RNA duplex, which represents a siRNA precursor. In this state, the substrate is trapped between the PAZ and helicase domains away from the catalytic sites [102]. This provides a structural explanation for previous observations that Dicer poorly processes longer perfect duplexes in vitro and in vivo [50, 76]. Taken together, the helicase domain in mammalian Dicers provides a structural basis for substrate specificity, namely distinguishing pre-miRNAs as a preferred substrate for small RNA biogenesis.

Interestingly, a natural Dicer isoform has been found in mouse oocytes, which lacks the N-terminal helicase domain, efficiently generates siRNAs from long dsRNAs, and is sufficient for enhancing RNAi in cultured cells. Importantly, this isoform is a consequence of a rodent-specific retrotransposon insertion and is present only in the Muridae family [26] (Fig. 3). This demonstrates that, while a small change in a mammalian Dicer gene can restore RNAi, miRNA biogenesis has been the preferred role for Dicer during vertebrate evolution.

Fig. 3

Short Dicer variant and endogenous RNAi in mouse oocytes. a Dicer isoforms domain composition. DicerS - full-length somatic Dicer, DicerO - truncated oocyte-specific Dicer. b DicerO has higher cleavage activity in an in vitro cleavage assay. c Full-length Dicer and DicerO cleave a long dsRNA substrate the same way, but DicerO produces much higher amounts of siRNAs. Shown are 21–23-nt reads from next-generation sequencing, which were mapped onto a ∼500-bp-long dsRNA expressed in analyzed cells. Mapped reads were normalized to counts per million (CPM). d Optineurin gene 3′ end is one of endo-siRNA-generating loci in mouse oocytes and ESCs. Black rectangles represent Optineurin exons. Thick gray lines below the gene scheme represent individual putative endo-siRNAs identified by deep sequencing, which map exactly into the predicted dsRNA region. The folded hairpin is shown below the gene locus scheme. e Putative B1/SINE endo-siRNAs identified in ESCs come from short-to-medium hairpin precursors and have relatively well-defined sequences, thus appear to be closer to non-canonical miRNAs than to endo-siRNAs. Black bars represent 21–23-nt RNAs from deep sequencing mapped onto a SINE-derived inverted repeat. For further details, please see the reference [26]

Dicer-interacting partners–tandem dsRBD proteins

Dicer has two main types of interacting partners in Metazoa: Argonaute proteins, which receive small RNAs produced by Dicer, and dsRNA-binding proteins with tandemly arrayed dsRBDs, which facilitate substrate recognition, cleavage fidelity, and Argonaute loading. Importantly, despite a similar domain organization, these proteins evolved different roles in small RNA biogenesis in different model organisms. For example, RDE-4 in C. elegans is a 385-amino acid protein with two N-terminal dsRBDs and a third degenerate dsRBD at the C-terminus. The rde-4 mutant lacks RNAi but does not show activation of mobile elements [96]. Mutants and biochemical analyses support a model where RDE-4 dimerizes through the C-terminal domain; dimers cooperatively bind long dsRNA and, together with Dicer, an Argonaute protein RDE-1, and a DExH-box helicase DRH-1/2 (Dicer-related helicase), form a complex initiating RNAi [84–86, 97]. RDE-4 is involved in siRNA production from dsRNA but is not essential for later steps of RNAi; RDE-4 immunoprecipitates with long dsRNA but not siRNA [97], and RNAi in mutants can be rescued with siRNAs [86]. RDE-4 is involved in siRNA production from exogenous and endogenous dsRNAs; the latter involves RDE-4 and Dicer but neither RDE-1 nor DRH-1/2 [58].

A similar dsRBD arrangement is also found in the Dicer-interacting partners Loquacious (LOQS) and R2D2 in Drosophila, although their roles in RNA silencing differ. R2D2 associates with DCR-2 and acts in RNAi [62]. R2D2 does not influence DCR-2 enzymatic activity [62] but restricts DCR-2 function to processing of long dsRNAs [8, 29]. It also facilitates passing the cleavage product to AGO2, excluding miRNA-like duplexes with imperfect base pairing [104]. The loquacious gene produces two protein isoforms, which associate with DCR-1 and the miRNA pathway (LOQS-PB isoform) and with DCR-2 and RNAi (LOQS-PD isoform) [39, 72, 120]. LOQS-PD and R2D2 function sequentially and non-redundantly in the endogenous RNAi pathway. LOQS-PD stimulates DCR-2-mediated processing of dsRNA, whereas R2D2 acts downstream during RISC loading [40, 71, 72]. Taken together, LOQS and R2D2 contribute to the profound mechanistic separation of miRNA and RNAi pathways, which evolved in Drosophila (and presumably in insects in general).

In mammals, two dsRNA-binding proteins with tandemly arrayed dsRBDs have been identified as Dicer-binding proteins: trans-activation-responsive RNA-binding protein 2 (TARBP2) and protein activator of PKR (PACT) [10, 38]. TARBP2 and PACT are paralogs, which evolved through a gene duplication event in an ancestral chordate [15]. Each protein consists of three dsRBDs, where the first two domains can bind dsRNA while the third domain has a partial homology to dsRBD and does not bind dsRNA. Instead, it mediates protein-protein interactions and is a part of a larger protein-protein-interacting C-terminal region referred to as the Medipal domain as it interacts with Merlin, Dicer, and PACT (reviewed in [15]). TARBP2 and PACT can also form homodimers and heterodimers through the Medipal domain [55].

The binding site of TARBP2 and PACT on Dicer was recently determined using cryo-EM and crystallography [113]. Homology-based modeling showed that Dicer-binding residues are conserved in TARBP2 and PACT, implying that binding of TARBP2 and PACT to Dicer is mutually exclusive [113].

In vitro, TARBP2 stimulates Dicer-mediated cleavage of both pre-miRNA and pre-siRNA substrates, presumably by enhancing the stability of Dicer–substrate complexes; this stimulation requires the two N-terminal dsRBDs [9]. In contrast, PACT inhibits Dicer processing of pre-siRNA substrates when compared to Dicer and the Dicer–TARBP2 complex [60]. The two N-terminal dsRBDs contribute to the observed differences in dsRNA substrate recognition and processing behavior of Dicer–dsRNA-binding protein complexes [60]. In addition, PACT and TARBP2 have non-redundant effects on the generation of different-sized miRNAs (isomiRs) [52, 60, 113]. Cells lacking TARBP2 exhibit altered cleavage sites in a subset of miRNAs but no effect on the general miRNA abundance or Argonaute loading [52]. Thus, the impact of TARBP2 and PACT on miRNA biogenesis in vivo seems to be relatively minor [52, 113]. However, it should be pointed out that any change in the 5′ end position of any miRNA will have a strong effect on its target repertoire. Taken together, TARBP2 and PACT are regulatory factors that contribute to the substrate specificity and cleavage fidelity during miRNA and siRNA production.

Moreover, TARBP2 and PACT have an additional role in the cross talk between the interferon (IFN) response and small RNA pathways (reviewed in [15]). The IFN response is the major antiviral branch of innate immunity in mammals, which deals with threats associated with long dsRNA. Among the key components sensing dsRNA in the IFN response are protein kinase R (PKR) and helicase RIG-I (reviewed in [32]). The two N-terminal dsRBDs of PACT and TARBP2 bind PKR through the same residues [113], while the (C-terminal) Medipal domain of PACT is needed for PKR activation [41]. In contrast, the Medipal domain of TARBP2 has an inhibitory effect [36]. Furthermore, sequestering of PACT by TARBP2 has a negative effect on PKR phosphorylation and activation. PKR inhibition by TARBP2 is released in stress conditions, leading to IFN response activation [14]. Therefore, absolute and/or relative expression levels of TARBP2 and PACT might buffer or sensitize the IFN response to dsRNA. One could envision that suppression of the IFN response might result in increased RNAi. However, there is no evidence, so far, that TARBP2 would redirect long dsRNA to Dicer and stimulate RNAi in vivo enough to achieve a robust sequence-specific mRNA knockdown.

Substrate processing by mammalian Dicer proteins

The first in vitro studies on recombinant human Dicer showed that substrate cleavage is dependent on Mg2+ but not on ATP [87, 118]. Subsequently, it was found that Dicer cleaves long dsRNAs and pre-miRNAs with different efficiency, which stems from the structural properties of the substrates [9, 63]. Therefore, the cleavage of miRNA precursors and long dsRNAs will be discussed separately below.

Canonical miRNAs

Canonical miRNAs are the dominant Dicer products in mammalian cells. In contrast to long dsRNA, a canonical pre-miRNA is cleaved only once and releases a single small RNA duplex for AGO loading. Pre-miRNAs are the most efficiently cleaved Dicer substrates in vitro. Human Dicer alone cleaves pre-miRNAs much faster than pre-siRNA substrates under both single and multiple turnover conditions, with more than 100-fold difference in maximal cleavage rates (Vmax) under multiple turnover conditions [9]. This indicates that the mammalian Dicer is optimized for miRNA biogenesis and several specific structural adaptations discussed below support this notion.

A characteristic feature of the pre-miRNA hairpin, which is accessed by the PAZ domain of Dicer, is a two-nt 3′ overhang generated by the nuclear Microprocessor complex [33]. Pre-miRNAs with the two-nt 3′ overhang are bound with higher affinity than pre-miRNAs with different ends [24]. Moreover, the two-nt 3′ overhang leads to more efficient substrate processing, which was shown on both pre-miRNAs and perfect duplexes [24, 83, 119]. Such preference is likely conferred by simultaneous binding of the pre-miRNA end by both 5′ and 3′ binding pockets in the PAZ domain [83]. Importantly, fidelity of miRNA biogenesis is critical for miRNA functionality because a single nucleotide shift at the 5′ end of a miRNA would redefine its target repertoire. In contrast, RNAi, which typically involves perfect complementarity between a small RNA and its target, would be essentially insensitive to precise cleavage positioning as long as it would not affect Argonaute loading. Thus, the simultaneous recognition of both strands at the two-nt 3′ overhang terminus by Dicer can be seen as an adaptation driven by miRNA biogenesis [83].
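The sensitivity of the seed to cleavage positioning can be made concrete with a minimal sketch. It assumes that the seed is nucleotides 2–8 of the mature small RNA and compares two products excised from the same precursor arm with 5′ ends one nucleotide apart. The arm sequence, its flanking nucleotides, and the function names are hypothetical and serve only as an illustration.

```python
# Minimal sketch: a 1-nt shift of the 5' end changes the seed (nucleotides 2-8).
# Assumptions: the sequences below are illustrative/hypothetical, and the
# product length is fixed at 22 nt.

CANONICAL = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative 22-nt guide
ARM = "A" + CANONICAL + "U"           # hypothetical precursor arm with flanks


def excise(arm: str, five_prime_start: int, length: int = 22) -> str:
    """Return the small RNA starting at a given 0-based position on the arm."""
    return arm[five_prime_start:five_prime_start + length]


def seed(small_rna: str) -> str:
    """Return nucleotides 2-8 of the small RNA (0-based slice 1..8)."""
    return small_rna[1:8]


precise = excise(ARM, 1)  # cleavage at the expected position
shifted = excise(ARM, 2)  # 5' end shifted by a single nucleotide

print(seed(precise))  # GAGGUAG
print(seed(shifted))  # AGGUAGU
```

Although the two products overlap over 21 of their 22 nucleotides, their seeds are different 7-mers and would therefore select different sets of seed-matched targets. A perfectly complementary siRNA target, by contrast, would still pair with either product along almost its entire length.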

The second structural adaptation of mammalian Dicer supporting miRNA biogenesis is the N-terminal helicase, which forms a clamp-like structure adjacent to RNase III domains; hence, it is positioned to bind the stem loop of a pre-miRNA [56]. While the loss of the entire N-terminal helicase only slightly increases pre-miRNA processing activity in vitro [63], pre-miRNA processing by recombinant Dicer in vitro is much faster than that of a perfect duplex [9, 63]. In vivo, the naturally occurring N-terminally truncated Dicer isoform (Fig. 3) can rescue miRNA biogenesis in Dicer −/− embryonic stem cells (ESCs) [26]. This suggests that the N-terminal helicase domain in mammalian Dicers is not important for miRNA biogenesis per se; it rather provides constraints for substrate selectivity favoring pre-miRNAs.

This is consistent with the model where pre-miRNA binding is associated with the cleavage-competent open conformation. In the open state, a pre-miRNA is bound along the platform, the helicase domain is bent, and RNase IIIa and IIIb sites have access to the substrate [102]. It has been proposed that the loop of a pre-miRNA may prevent the adoption of the closed conformation by Dicer by interacting with HEL1 and HEL2i and by possibly stabilizing the open conformation of Dicer [24, 56, 65]. This also indicates that the N-terminal helicase has acquired distinct roles in Dicer function in RNA silencing during evolution. In mammalian cells, the N-terminal helicase has a gatekeeper function where the pre-miRNA loop appears to be the key that keeps the gate open.

Long dsRNA

In addition to pre-miRNA, Dicer can process long dsRNAs from different sources. Exogenous sources of dsRNA include viral dsRNAs and imply a function of RNAi in the antiviral immune response [105, 106, 112]. Endogenous dsRNAs have variable length and termini and are generated by transcription of inverted repeats, by convergent transcription, or by pairing of complementary RNAs in trans. Importantly, mammals lack an RNA-dependent RNA polymerase (RdRp), which is a conserved component of RNAi-related mechanisms in plants, fungi, and invertebrates. RNAi in mouse oocytes, the best documented mammalian endogenous RNAi example, works independently of RdRp activity [91].

The human Dicer binds long dsRNA but not siRNAs in vitro. Long dsRNA binding is independent of both Mg2+ and ATP. The human Dicer preferentially binds and cleaves long dsRNA from the end, due to inefficient binding to internal regions of dsRNA [118]. In comparison to pre-miRNA processing, human Dicer exhibits lower cleavage activity on perfect dsRNA substrates [63]. A proposed explanation is that a closed conformation of the N-terminal helicase domain disturbs the RNase III catalytic core and inhibits the cleavage of perfect dsRNAs [56]. As already mentioned, deletion of the N-terminal helicase domain increases the cleavage activity of recombinant human Dicer in vitro (∼65-fold). As an increase in kcat (turnover of the enzyme) is the major contribution to Dicer activation, the authors hypothesized that the DExD/H-box domain mainly inhibits the functionality of the Dicer active site, but not RNA binding [63]. This model is supported by the previously mentioned structural data, where Dicer is in a closed state with a 35-bp A-form RNA duplex trapped between PAZ and helicase domains away from the catalytic center [102].

Dicer-mediated cleavage of dsRNA can be stimulated in vitro by TARBP2. However, it is not clear if TARBP2 stimulation could be sufficient to induce endogenous RNAi in vivo [9]. So far, the evidence for endogenous RNAi (including attempts to induce RNAi with exogenous substrates) is scarce (reviewed in detail in [75, 94]). The only cell type in which abundant endogenous siRNAs were found and in which long dsRNA readily induces RNAi is the mouse oocyte, which expresses an oocyte-specific Dicer isoform lacking a part of the N-terminal helicase domain [26], thus mimicking some of the mutants tested in vitro [63]. Taken together, long dsRNA, a typical endogenous RNAi substrate, is poorly processed by endogenous full-length Dicer. This is due to the gatekeeper role of the N-terminal helicase domain, which does not open upon binding long dsRNA.

Biological roles of mammalian Dicers beyond the miRNA pathway

The principal role of Dicer in mammals is undoubtedly microRNA biogenesis. The other roles, some of which will be discussed below, have only partial experimental support. The second best documented function of Dicer is endo-siRNA biogenesis from long dsRNA in mouse oocytes (the extent to which this applies to other mammals is unclear). In addition, there are data from somatic cells supporting a possible role of Dicer in antiviral and retrotransposon defense, nuclear dsRNA clearance, and chromatin association (reviewed in more detail in [7, 13]). Dicer was associated with age-related macular degeneration, a severe condition leading to blindness, where Dicer loss was correlated with accumulation of toxic Alu transcripts and cell death of retinal pigment epithelium cells [45, 51, 101]. While a miRNA-independent role of Dicer in macular degeneration has been proposed [45], miRNA-dependent functions of Dicer should still be taken into consideration [93].

Endogenous RNAi in mouse oocytes and elsewhere

In Drosophila and Caenorhabditis, RNAi functions in gene regulation, silencing of transposable elements, and antiviral defense (reviewed in [75]). In contrast, mammalian RNAi seems to be, with the notable exception of rodent oocytes, a minor pathway with limited functionality. One of the main reasons for RNAi regression is the evolution of the IFN response, the primary sequence-independent innate immune response of vertebrates to cytoplasmic dsRNA in somatic cells. While the IFN response can mask RNAi effects and is likely an evolutionary force acting against RNAi, two additional factors have emerged that underlie non-functional endogenous RNAi in somatic cells: low Dicer activity and limited substrate (dsRNA) availability. Low Dicer activity on long dsRNA has been thoroughly discussed above. In addition, even when high Dicer activity was present in ESCs, the amount of endo-siRNAs remained low relative to miRNAs [26]. Thus, the amount of long dsRNA available for cleavage is another limiting factor in vivo in mammalian cells. Accordingly, an order-of-magnitude higher level of siRNAs occurred in the same ESCs when an excess of dsRNA substrate was present [26]. In fact, when the same experiment was performed in somatic cells, no accumulation of siRNAs from endogenous templates was found while siRNAs from an ectopically expressed dsRNA were readily observed (Flemr et al., unpublished observation). Conversely, ubiquitous long dsRNA expression in a transgenic mouse model induced neither the IFN response nor RNAi in somatic cells while robust RNAi effects were observed in the oocyte [76]. These data imply that endogenous Dicer activity and dsRNA availability in somatic cells are typically too low to support efficient endo-siRNA production and robust RNAi activity. If so, one would expect to observe canonical RNAi in mammalian cells only under unique circumstances, exemplified by but likely not restricted to mouse oocytes.
At the moment, the framework under which canonical RNAi operates (or could be induced) in somatic cells is poorly understood, but it must include (i) high Dicer activity and (ii) sufficient amounts of dsRNA that do not provoke the interferon response. The first condition could be achieved with an N-terminally truncated Dicer, overexpression of the somatic Dicer isoform, or, hypothetically, an interacting partner that stimulates Dicer cleavage in vivo. The second condition may occur either in cells with a suppressed PKR response or when compartmentalization keeps dsRNA away from the IFN pathway but available for Dicer cleavage.

The best studied example of mammalian endogenous RNAi is found in mouse oocytes and early embryos (reviewed in [94]). Mouse oocytes have several unique features that underlie endogenous RNAi functionality: they (i) lack the interferon response [92], (ii) express a highly active Dicer isoform lacking a part of the N-terminal helicase domain [26], (iii) express a sufficient amount of Dicer substrates to accumulate endo-siRNAs [99, 109], and (iv) provide a sufficient time window to accumulate significant RNAi effects in vivo [21].

Because of the origin of the oocyte-specific highly active Dicer isoform and the sources of maternal endo-siRNAs, the essential role of RNAi in mouse oocytes should be seen as a derived character, not a norm for mammalian oocytes. At the same time, endogenous RNAi likely also operates in oocytes of mammals lacking the highly active Dicer isoform, as evidenced by sequence-specific knockdown upon injection of long dsRNA into bovine [82], porcine [2], and ovine [114] oocytes.

Experimental evidence suggests that endogenous RNAi might function also in some other cell types (reviewed in [75]). However, the evidence for a potential physiological role of endo-siRNAs in mammalian somatic cells is scarce at best. Endo-siRNAs were found in the mouse hippocampus, where deep sequencing revealed putative endo-siRNAs generated from overlapping transcripts and from hairpin structures in introns of protein-coding genes, many of which regulate synaptic plasticity [90]. In ESCs, endo-siRNAs were suggested to contribute to their self-renewal and proliferation because of a stronger phenotype observed in Dicer−/− ESCs than in Dgcr8−/− ESCs [46, 73, 107]. However, an analysis of small RNAs from ESCs suggests that a few loci might generate low levels of endo-siRNAs and that previously annotated endo-siRNAs [3] are similar to non-canonical miRNAs rather than to a pool of siRNAs generated from a long dsRNA template [26] (Fig. 3e). RNAi was also implicated in the control of LINE-1 retrotransposon in ESCs [11, 12]. LINE-1-derived siRNAs originate from convergent transcription at the 5′ UTR [115]. However, LINE-1 is a highly adapted and successful mammalian retrotransposon; thus, our observations may be revealing an adaptation of the LINE-1 retrotransposon to maintain low expression levels rather than an effective way of LINE-1 suppression by the host.

Antiviral RNAi

RNAi seems to be the major antiviral pathway in invertebrates and plants [18]. However, in mammals, two factors strongly limit the antiviral potential of endogenous RNAi: (i) the poor cleavage of long dsRNA substrates by Dicer and (ii) the sequence-independent IFN response to long dsRNA. While two groups reported antiviral RNAi function under specific circumstances [61, 70], there is so far no solid evidence for a natural antiviral role of RNAi in mammals [13, 31, 100]. Thus, endogenous RNAi does not seem to be an important antiviral mechanism in mammals. In fact, natural selection might favor suppression of RNAi in mammals because RNAi might reduce the efficiency of the IFN response. If mammals truly use RNAi as an antiviral mechanism, it most likely happens in rare cases and under unique circumstances that allow accumulation of physiologically relevant amounts of virus-derived siRNAs.

Nuclear Dicer in mammalian cells

Nuclear localization and function of mammalian Dicer remain among the least understood aspects of Dicer biology. While cytoplasmic processing of pre-miRNAs seems to be the main function of Dicer, some observations indicate that Dicer might have a nuclear role as well. Possible roles of nuclear Dicer include miRNA and endo-siRNA production directly in the nucleus and/or removal of nuclear dsRNA. However, there is no coherent model that would accommodate the published data and assign a biological role to nuclear Dicer. Accordingly, we will discuss the evidence supporting nuclear Dicer functions while highlighting some of the experimental issues concerning nuclear Dicer.

The bulk of evidence for nuclear mammalian Dicer comes from microscopy and cell fractionation experiments. Staining of Dicer in mammalian interphase cells reveals a strong cytoplasmic signal with a varying nuclear background, so it is difficult to discern whether the nuclear signal truly comes from Dicer or is an antibody artifact (e.g., [110]). Although the C-terminal dsRBD of Dicer may have properties of a nuclear localization signal (NLS), it was not shown to function as an NLS in the context of an intact Dicer [19].

Murine and human Dicer were localized to ribosomal DNA (rDNA) (i.e., insoluble nuclear fraction) using microscopy and chromatin immunoprecipitation. However, these data are difficult to interpret because (i) a minuscule amount of Dicer might be sufficient to generate signal in rDNA on mitotic chromosomes and (ii) no function associated with this localization has been identified [89]. Localization of Dicer on mitotic chromosomes was detected in different cell types and with three different polyclonal Dicer antibodies. Also, HA-, Myc-, FLAG-, and enhanced green fluorescent protein (EGFP)-tagged Dicer isoforms supported chromatin-bound Dicer restricted to transcribed rDNA sequences [89].

Fluorescence correlation spectroscopy (FCS) and fluorescence cross-correlation spectroscopy (FCCS) offer unique strategies to study nuclear Dicer by microscopy. These methods were used, among other things, to explore Dicer localization in the nucleus of ER293 cells [81]. Among the advantages of FCS over classical confocal microscopy are that it is based on direct EGFP fluorescence from an EGFP-tagged protein and that it studies protein localization in a well-defined volume, which can be explored in both the cytoplasm and the nucleus. Interestingly, while EGFP-Dicer fluorescence is cytoplasmic with a minimal nuclear EGFP signal background, the estimated sizes of Dicer-containing complexes in the cytoplasm and in the nucleus differ [80]. It seems that cytoplasmic Dicer is present in large complexes, which might represent a RISC loading complex (Fig. 4). In contrast, nuclear Dicer does not seem to interact with other proteins, and its size estimate corresponds to the Dicer protein itself. This is consistent with a later report showing that Dicer is present in the nucleus of mammalian cells but that loading of small RNAs on AGO proteins does not occur there [30].

Fig. 4

Dicer localization in somatic cells. a Confocal microscopy on HeLa cells transiently transfected with EGFP-Dicer or EGFP-LacZ-expressing plasmids. Cells with comparable cytoplasmic signal suggest that EGFP-LacZ expression yields higher EGFP signal in the nucleus than EGFP-Dicer. b Ectopically expressed tagged DicerO in 3T3 cells was detected with α-myc antibody. c FCS analysis of EGFP-Dicer in a stable ER293 cell line. On the left is a confocal image showing EGFP expression. The table shows the FCS analysis of molecular weight of complexes, in which EGFP-tagged proteins reside in the nucleus or the cytoplasm. For further details, please see references [80, 81]

A recurrent question is the ratio of nuclear to cytoplasmic Dicer. Although some experiments reveal a considerable amount of Dicer in the nucleus (e.g., [1, 16, 30, 110]), the dynamic range of the detection system is typically not shown or is unclear. In one of the experiments, the nuclear/cytoplasmic ratio for Dicer in ER293 cells was estimated by western blotting to be 1:4.3 ± 0.3 [81]. However, this ratio appears to be an overestimate, as there are several experiments in which the nuclear Dicer amount is below the detection limit [5]. Indirect evidence for nuclear substrate processing came from small RNA analysis in ESCs expressing either the full-length Dicer or its truncated variant, which suggested that in the two most prominent loci, endo-siRNAs are most likely produced from a nascent transcript made by RNA polymerase II [26] (Fig. 3d).
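
To put the reported 1:4.3 ± 0.3 nuclear/cytoplasmic ratio in more intuitive terms, it can be converted into the fraction of total cellular Dicer that would reside in the nucleus. The short Python sketch below does only this arithmetic. It assumes that the ratio reflects total protein amounts in the two compartments and that the ± 0.3 applies to the cytoplasmic term; both are assumptions made for illustration, and the function name is hypothetical rather than taken from the cited study.

```python
def nuclear_fraction(nuclear: float, cytoplasmic: float) -> float:
    """Fraction of total Dicer that is nuclear for a nuclear:cytoplasmic ratio."""
    return nuclear / (nuclear + cytoplasmic)


# Reported ratio 1:4.3 +/- 0.3 (the uncertainty is assumed to apply to the
# cytoplasmic term; this is an illustrative assumption).
central = nuclear_fraction(1.0, 4.3)
low = nuclear_fraction(1.0, 4.3 + 0.3)   # more cytoplasmic Dicer, smaller nuclear share
high = nuclear_fraction(1.0, 4.3 - 0.3)  # less cytoplasmic Dicer, larger nuclear share

print(f"nuclear share: {central:.1%} (range {low:.1%} to {high:.1%})")
```

Under these assumptions, the ratio corresponds to slightly less than one fifth of cellular Dicer being nuclear, which illustrates why the estimate is hard to reconcile with experiments in which nuclear Dicer is below the detection limit.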

In terms of the biological role, nuclear Dicer has been associated, for example, with DNA damage response [27], transcriptional silencing [35], detoxification through dsRNA removal in the nucleus [110], and RNA post-transcriptional processing [77]. However, the evidence for the nuclear role of Dicer is still not complete and it is possible that some of the abovementioned nuclear functions will be revised.

Summary

Despite the similar domain composition of Dicer across Metazoa, different metazoan model organisms adopted different strategies for substrate processing and selectivity by Dicer. We can recognize three different scenarios regarding the production of small RNAs for the miRNA and RNAi pathways. In C. elegans, a single Dicer gene serves both pathways sufficiently well (although post-translational processing affects the balance between miRNA and siRNA biogenesis [88]). In Drosophila, the miRNA and RNAi pathways are genetically separated at the level of Dicer and its interacting partners. Mammals employ a single Dicer gene dedicated to the miRNA pathway. Importantly, mammalian Dicer structure and biochemical properties are consistent with its primary role in the cytoplasmic miRNA pathway. This implies that (i) long dsRNA processing by mammalian Dicer is a rudimentary mechanism and (ii) non-miRNA functions are generally of secondary importance unless they evolve into a unique adaptation, such as the one observed in mouse oocytes.