Biogenesis and activity of human microRNAs

MicroRNAs (miRNAs) are small endogenous post-transcriptional regulators of gene expression found in a wide range of eukaryotes [1]. These 20- to 24-nucleotide (nt)-long RNAs regulate numerous physiological processes, including cell proliferation, differentiation, apoptosis and development [2–5]. Deregulation of miRNA expression has been linked to many disorders, including cancer [6–8] and neurodegeneration [9, 10]. Most miRNA genes, which are located predominantly in introns of protein-coding genes and in intergenic regions, are transcribed by RNA polymerase II (Pol II) [11, 12], and some are transcribed by RNA polymerase III (Pol III) [13] (Fig. 1). The primary transcripts (pri-miRNAs), which are typically capped and polyadenylated, contain one or more long hairpin structures. The structural features of these hairpins are unique to pri-miRNAs, thus distinguishing them from the various RNA stem-loop-like structures present in the nucleus. The pri-miRNA hairpin typically contains a long imperfect stem of approximately 30 bp with flanking single-stranded RNA segments at its base (termed the single-stranded–double-stranded RNA junction) [14, 15]. This structure is recognized and cleaved by the Microprocessor complex containing the ribonuclease Drosha (RNase III enzyme), the RNA binding protein DGCR8 (DiGeorge syndrome critical region gene 8) [16–18] and other proteins [17, 19]. Drosha cleaves the pri-miRNA hairpin at a distance of approximately 11 bp from the single-stranded RNA–dsRNA junction, which is recognized by DGCR8 [15]. The Drosha cleavage of intronic miRNAs occurs co-transcriptionally before splicing of the host RNA [20, 21]. As a result, a truncated hairpin of approximately 60 nt (pre-miRNA) is generated, which usually has a 2-nt 3′ overhang, the hallmark of RNase III products [22]. 
The 3′ overhang and the pre-miRNA double-stranded stem with a minimal length of 16 bp [23] are essential structural elements that are recognized by Exportin-5 (Exp-5) [24, 25]. In cooperation with the guanosine triphosphatase (GTPase) Ran, Exp-5 transports the pre-miRNA from the nucleus to the cytoplasm irrespective of its nucleotide sequence and the presence of various structural motifs. The high-resolution X-ray structure of the pre-miRNA nuclear export machinery has recently been reported and the crucial role of the structural features of the precursor for Exp-5 recognition confirmed [26]. However, the exact mechanism by which the pre-miRNA is transferred from the Microprocessor to Exp-5 in the nucleus and from Exp-5 to the RISC loading complex (RLC) in the cytoplasm remains to be elucidated. It is thought that Exp-5 passes pre-miRNA to Dicer directly or via additional components [27]. The pre-miRNA-cleaving enzyme of the RLC is the ribonuclease Dicer (another RNase III enzyme), which converts the pre-miRNA to mature miRNA. Dicer recognizes the 3′-ends generated by Drosha and cleaves the pre-miRNA two helical turns (approx. 22 nt) away, near the terminal loop, to produce a miRNA–miRNA* duplex having 2-nt 3′ overhangs at both ends [28, 29]. The human Dicer is an approximately 200-kDa multidomain protein that contains an N-terminal DEAD-box helicase domain, a domain originally named the domain of unknown function (DUF283), a PAZ domain, two conserved catalytic RNase III domains (RIIIA and RIIIB) and a C-terminal dsRNA-binding domain (dsRBD) [30]. The two RNase III domains of Dicer form a single processing center for pre-miRNA cleavage, and the 5′ and 3′ arms of the precursor are cleaved by domains RIIIB and RIIIA, respectively [31, 32]. Dicer does not function alone; it cooperates within the RLC with other proteins [33, 34]. 
Its protein partners are members of the argonaute (AGO) family [35, 36], HIV-1 transactivation response (TAR) RNA-binding protein (TRBP) [37, 38] and possibly other proteins (see the section Dicer partners in pre-miRNA cleavage). Once Dicer has cleaved the pre-miRNA, only one miRNA strand (guide strand) of the duplex is loaded onto AGO to form the programmed RISC (referred to as the miRISC); the other strand (passenger strand or miRNA*) is released and degraded [37, 39, 40]. The thermodynamic stability of the ends of the miRNA duplexes is thought to play a crucial role in miRNA strand selection [41–43]. Other factors contributing to strand selection are structural features of the miRNA/miRNA* duplex (e.g. positions of base mismatches) [44–46] and sequence composition [27, 47]. These rules are not absolute, however, since in some cases both the miRNA and miRNA* strands may be involved in RISC-mediated gene silencing [46, 48]. The finding of the cognate target is thought to be a diffusion-driven process [49], and the recognition of mRNA by miRISC is based on the partial complementarity of miRNA to mRNA. The “seed” region, i.e. nt 2–8 of the miRNAs, typically forms perfect matches with their target sequences located in the 3′ untranslated region (UTR) of mRNAs, but these interactions fall into several categories [50–52]. MiRNAs downregulate gene expression in several ways; in animal cells, the downregulation is usually achieved by translational arrest, mRNA deadenylation and degradation, or less frequently by mRNA cleavage (reviewed in [1]). More recently, mRNA destabilization and degradation have been proposed to constitute the predominant mechanism of gene expression regulation by miRNAs [53]. Moreover, the existence of “non-canonical” pathways of miRNA biogenesis has also been reported [54]. Pre-miRNA-like introns (mirtrons) are spliced out of mRNA transcripts [55–57]. 
Mirtrons bypass the Drosha requirement, but other steps of their biogenesis, such as nuclear export and cleavage by Dicer, follow the canonical pathway [58]. In contrast, the erythropoietic miR-451 circumvents the Dicer step; miR-451 is cleaved by Ago2 within the 3′ arm, and the exact miRNA 3′-end is generated by trimming, which is mediated by other 3′-5′ exonucleases [59–61]. Like all RNAs, miRNAs also have inherent half-lives. Their turnover has been evaluated in various experiments with Pol II inhibitors and/or by the RNAi-mediated depletion of miRNA processing enzymes. Most miRNAs are highly stable, and their half-lives range from hours to even days, but several miRNAs with an accelerated turnover have also been reported [62–66]. The different aspects of miRNA biogenesis, function and regulation at multiple levels have been extensively reviewed elsewhere [67–71].
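
The seed-based target recognition described above can be illustrated with a minimal Python sketch. It assumes the simplest case only: a perfect Watson-Crick match between nt 2–8 of the miRNA and a site in the 3′ UTR. The function names and the UTR fragment are hypothetical and introduced purely for illustration; the miRNA sequence is the one commonly listed for let-7a. The several categories of seed interactions mentioned above are not modeled.

```python
# Minimal sketch: locate perfect seed matches (miRNA nt 2-8) in a 3' UTR.
# All names and the UTR fragment are illustrative assumptions.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def seed(mirna):
    """Return nt 2-8 of the miRNA, counted from its 5' end (1-based)."""
    return mirna.upper()[1:8]


def find_seed_sites(mirna, utr):
    """Return 0-based start positions of perfect seed-match sites in the UTR.

    The UTR is read 5'->3', so the site it must contain is the reverse
    complement of the miRNA seed.
    """
    site = reverse_complement(seed(mirna))
    utr = utr.upper()
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"   # commonly listed let-7a sequence
    toy_utr = "AAACUACCUCAGG"          # invented UTR fragment
    print(seed(let7a), find_seed_sites(let7a, toy_utr))
```

A search of this kind only identifies candidate sites; it says nothing about whether a site is functional in cells.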

Fig. 1

Canonical pathway of microRNA (miRNA) biogenesis and activity. Sequential processing reactions of the primary transcript by Drosha in the nucleus and of pre-miRNA by Dicer in the cytoplasm are presented schematically (details are given in the text). The structural requirements for Drosha, Exportin-5 and Dicer are highlighted in the yellow boxes

Here we review recent progress in understanding the mechanism of Drosha and Dicer cleavages from the perspective of their end-products. We describe the specificity of these cleavages and analyze miRNA length variety in terms of both diversity and heterogeneity, searching for their sources and consequences. Finally, we discuss the composition of the core RLC complex, the role of each Dicer protein partner and the structure of this complex as a whole interacting with pre-miRNA.

Relaxed specificity of Drosha and Dicer cleavage and its consequences

It was predicted 6 years ago that as many as 1,000 miRNAs may regulate the expression of most human protein coding genes [72]. The latest release of the miRNA repository (miRBase, release 16) [73] indeed matches that prediction, and the phenomenon of miRNA end and length heterogeneity substantially increases this number. This kind of miRNA heterogeneity, i.e. the existence of several length variants and shifted sequence variants of the same miRNA, has been known since early miRNA studies [74] and has recently been termed miRNA end polymorphism or iso-miR formation [75–77]. Since the routine application of deep sequencing technology in miRNA discovery studies [75, 78–82], the scale of miRNA heterogeneity has been found to be greater than anticipated (Fig. 2a). At the same time, the proper interpretation of deep sequencing results and the separation of real biological effects from various deep sequencing artifacts remain challenging. Different deep sequencing platforms generate platform-specific biases [83, 84] that stem from the methods of miRNA library construction [85]. Among the biological effects involved, Drosha- and Dicer-induced cuts with relaxed specificity are among the most likely explanations for the phenomenon of miRNA heterogeneity [75, 78, 81, 86].

Fig. 2

Sources of miRNA end heterogeneity. a Example of deep sequencing data for an miRNA generated from both precursor arms. The number of sequence reads and fraction of miRNA variants are indicated on the right. Variants represented by only one sequence read are not shown. The major contributor (miRNA-MAIN) is shown in boldface, and the miRNA weighted average length (miRNA-WAL) is also indicated. The underlined letters indicate nontemplated nucleotides. The reference precursor and annotated miRNAs (red type), along with the dot-bracket representation of the precursor structure, are shown below the aligned sequences. Note that only the miRNA-MAIN generated from the 5′ arm corresponds to the miRNA annotated in miRBase. The figure was prepared from deep sequencing data deposited in miRBase [73, 149]. b–d Different proposed hypotheses to explain the observed miRNA end heterogeneity. b Drosha and Dicer cleavages are the primary sources of miRNA heterogeneity, and the ends generated by Drosha are less heterogeneous than those generated by Dicer. c No matter which arm the miRNA is generated from, the 5′-end is always less heterogeneous than the 3′-end. d Shifted Drosha cleavages result in shifted Dicer cleavages (leading to miRNA with two heterogeneous ends). e AGO2 loading is the selection step for binding miRNAs with U and A at their 5′-ends. f Nontemplated heterogeneity results from modifications of the 3′-end occurring after Drosha/Dicer cleavages

Based on deep sequencing results, it has been proposed that Drosha-induced cleavages generate much less miRNA end heterogeneity than Dicer cleavages [87] (Fig. 2b). Other authors have concluded that the 5′-ends of miRNAs are always less heterogeneous than the 3′-ends, regardless of the precursor arm from which the functional miRNAs were generated [27, 75, 78, 81, 88, 89] (Fig. 2c). This conclusion is in agreement with the notion that evolutionary pressure favors homogeneous miRNA 5′-end formation, which is important for specific target recognition [90]. However, the above propositions are mutually exclusive as Drosha defines the 5′-ends of miRNAs from the pre-miRNA 5′ arm and the 3′-ends of miRNAs from the 3′ arm, whereas Dicer defines the 3′-ends of miRNAs from the pre-miRNA 5′ arm and the 5′-ends of miRNAs from the pre-miRNA 3′ arm (see Fig. 2). Drosha and Dicer belong to the same RNase III class; both possess a catalytic center formed by two RNase III domains. The difference in cleavage specificity for Drosha and Dicer may arise from their different strategies of substrate recognition. Dicer combines precursor recognition and cleavage activities, as it contains the PAZ and RNase III domains within one protein. This enzyme design makes the structure flexible and adaptable to various structures of pre-miRNA substrates. Drosha requires another protein (DGCR8) to recognize and bind to pri-miRNAs. These differences between Drosha and Dicer may influence their cleavage specificities (Fig. 2b). It would appear that the power of the Drosha/DGCR8 complex to discriminate between substrates and the multitude of nonsubstrate hairpins needs to be very high; also, the use of two specialized proteins to perform this task offers more precision in the cleavage. However, it seems unlikely that Drosha and Dicer can sense which precursor arm will generate the functional miRNA and then cleave that arm more precisely (Fig. 2c). 
Therefore, the hypothesis suggesting that miRNA end heterogeneity is the result of different Drosha and Dicer cleavage specificities seems more appealing. Another model of miRNA heterogeneity that was proposed is a parallel shift of Drosha and Dicer cuts to generate several miRNAs from the same precursor [86] (Fig. 2d). This model simply reflects the fact that nonspecific Drosha cleavages will have consequences for nonspecific Dicer cleavages. Apart from the imprecise cleavages by Drosha and Dicer being the primary source of miRNA heterogeneity, the AGO binding step may also introduce a bias toward the U and A residues at the 5′ nucleotides of the miRNA [91] (Fig. 2e).

Starting with release 16, deep sequencing data have been deposited in miRBase. Most miRNAs are represented by numerous length/sequence variants; in the majority of cases, the miRNAs are generated from one precursor arm (5′ or 3′). Such miRNAs are often represented by approximately 99% of all sequence reads. There are, however, examples of substantial amounts of miRNAs generated from both arms of one precursor (Fig. 2a). The frequency of individual miRNA variants varies strongly, with the most frequent approaching 100% and the rarest being represented by very small fractions (considerably less than 1% of all the reads). Only the most frequent variants (miRNA-MAIN and those that contribute to more than 5% of total reads) can probably be considered functional. When many miRNA variants are derived from one precursor arm, the parameter miRNA-WAL (weighted average length) is appropriate for describing the lengths of miRNAs obtained from deep sequencing and derived from single genes.
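
The quantities introduced in this paragraph can be sketched in a few lines of Python. The text gives no explicit formula for miRNA-WAL, so the sketch assumes it is the mean variant length weighted by read count; the function names and the read counts are invented for illustration, and the 5% cutoff is the threshold mentioned above.

```python
# Minimal sketch, assuming miRNA-WAL is the read-count-weighted mean length.
# Function names and read counts are illustrative, not taken from miRBase.


def weighted_average_length(read_counts):
    """Read-count-weighted mean length of the variants from one arm."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return sum(len(seq) * n for seq, n in read_counts.items()) / total


def main_variant(read_counts):
    """Return the most frequent variant (miRNA-MAIN)."""
    return max(read_counts, key=read_counts.get)


def frequent_variants(read_counts, min_fraction=0.05):
    """Return variants contributing more than min_fraction of all reads."""
    total = sum(read_counts.values())
    return [seq for seq, n in read_counts.items() if n / total > min_fraction]


if __name__ == "__main__":
    # Invented read counts for three length variants from one precursor arm.
    reads = {
        "UGAGGUAGUAGGUUGUAUAGUU": 900,   # 22 nt
        "UGAGGUAGUAGGUUGUAUAGU": 80,     # 21 nt
        "UGAGGUAGUAGGUUGUAUAGUUU": 20,   # 23 nt
    }
    print(main_variant(reads))
    print(round(weighted_average_length(reads), 2))
    print(len(frequent_variants(reads)))
```

With these invented counts the dominant 22-nt variant pulls the weighted average very close to 22, whereas an unweighted mean of the three lengths would hide how rare the 23-nt variant is.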

In addition to deep sequencing, conventional techniques of miRNA detection have also been used to demonstrate and characterize miRNA heterogeneity. Northern blotting, a “gold standard” in molecular biology, has been frequently used to detect newly identified miRNAs [74, 92–95]. Recent improvements in northern blotting protocols have allowed us to distinguish not only miRNAs but also pre-miRNAs differing in length by 1 nt; the northern blot method has also been used for the quantitative analysis of miRNA and pre-miRNA heterogeneity [96–98]. High-resolution northern blotting used in conjunction with primer extension (which detects 5′-end heterogeneity) has made it possible to distinguish the respective contributions of Drosha and Dicer to miRNA heterogeneity. Both Drosha and Dicer generate substantial miRNA end heterogeneity, but the contribution of Dicer is slightly greater [96].

The observed heterogeneity in miRNA ends and lengths may have important functional implications. Different nucleotides at the miRNA 5′-end may change the relative thermodynamic stability of the miRNA/miRNA* duplex ends and cause preferential RISC activation by a different strand. Furthermore, the miRNA 5′-end, particularly its seed sequence, is responsible for the recognition of a complementary sequence and the binding to mRNA. MiRNAs with shifted 5′-ends have different seed sequences and may regulate different targets [77, 78, 96, 99, 100]. It has recently been shown that different miRNA length variants (iso-miRs) may be loaded onto different AGO proteins [101]. Thus, the generation of 5′-end heterogeneity may be another mechanism of miRNA activity regulation that functions either by increasing the number of targets regulated by one miRNA or by decreasing the fraction and functional impact of a dominant, canonical miRNA. The role of miRNA 3′-end heterogeneity is also gaining the attention of researchers. In addition to the frequently detected templated heterogeneity (miRNA end nucleotides match the genomic sequence), the presence of nontemplated nucleotides (Fig. 2f) has also been observed in some miRNAs. The nucleotides that most often differ from genomic DNA are typically 3′-end A and U [75, 79, 89, 102]. It is thought that these nucleotides are added to either miRNA or pre-miRNA ends by specific enzymes [102–105] following Drosha or Dicer cleavage. In addition to contributing to the miRNA–mRNA interaction (compensatory site effect) [50, 51], the 3′-end of the miRNA may influence its localization [63]. Extra nucleotides added at the 3′-ends of some miRNAs may also influence their stabilities [103–106], regulate the Dicer step of biogenesis by blocking cleavage [107] or modulate miRNA uptake by RISC bound with different AGO proteins [102]. The 3′-end modification has also been linked to a reduction in the efficiency of mRNA targeting [102, 104].
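
The different consequences of 5′- and 3′-end heterogeneity for the seed can be shown with a short Python sketch. It groups iso-miR variants by their seed (nt 2–8) and assumes nothing beyond that definition; the variant names, the helper functions and the choice of variants are hypothetical, and the canonical sequence is the one commonly listed for let-7a.

```python
# Minimal sketch: group iso-miR variants by seed (nt 2-8 from the 5' end).
# Variant names and helper functions are illustrative assumptions.
from collections import defaultdict


def seed(mirna):
    """Return nt 2-8 of the miRNA, counted from its 5' end (1-based)."""
    return mirna.upper()[1:8]


def group_by_seed(variants):
    """Map each seed to the names of the variants that carry it."""
    groups = defaultdict(list)
    for name, sequence in variants.items():
        groups[seed(sequence)].append(name)
    return dict(groups)


CANONICAL = "UGAGGUAGUAGGUUGUAUAGUU"

VARIANTS = {
    "canonical": CANONICAL,
    "3p_trimmed": CANONICAL[:-1],     # one nt shorter at the 3' end
    "3p_extended": CANONICAL + "A",   # extra 3' A, as in nontemplated addition
    "5p_shifted": CANONICAL[1:],      # 5' end shifted by one nt
}

if __name__ == "__main__":
    for seed_sequence, names in group_by_seed(VARIANTS).items():
        print(seed_sequence, names)
```

All 3′-end variants share the canonical seed, while the single 5′-shifted variant carries a different seed and would therefore be expected to recognize a different set of seed-match sites.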

The 5′- and 3′-end heterogeneity of the Drosha and Dicer cleavage products, as well as the nontemplated effects, may also have implications for the RNA interference (RNAi) and miRNA technologies. The end heterogeneity of biologically processed short RNAi triggers or exogenous miRNAs may create problems in reproducing the silencing effects achieved with synthetic small interfering RNAs (siRNAs) or miRNAs having defined ends [98]. The nonspecific Dicer and Drosha cleavages generate a population of products, only a fraction of which may have the desired sequence and exert the expected effects.

Structural aspects of pre-miRNA cleavage by recombinant Dicer

Another phenomenon related to the variation in the length of miRNAs is their length diversity, i.e. the formation of miRNAs differing in length from different miRNA genes [96]. miRNA length diversity is generated by Dicer from Drosha cleavage products. This effect originates from the structural features of the pre-miRNA hairpins, which differ from each other in the number and localization of various types of structural motifs. The range of human pre-miRNA structural diversity has been estimated by the analysis of predicted structures of 460 pre-miRNAs whose sequences were reconstructed from mature miRNA sequences [108, 109]. The lowest energy structures of pre-miRNAs [110] were analyzed for the presence of various secondary structure motifs (mismatches, bulges, symmetrical and asymmetrical internal loops). All motifs identified were cataloged according to their type, size, position and orientation. Of the 1,243 motifs, 631 were symmetrical (1- to 5-nt-long mismatches and internal loops) and 612 were asymmetrical (bulges and internal loops). Single nucleotide mismatches and bulges accounted for most of the findings (Fig. 3a). The number of structural motifs in the pre-miRNA structures analyzed ranged from zero to seven, with an average of 2.7 motifs per precursor. Based on the distribution, localization and sequence content of the structural motifs, the following interesting observations were reported: (1) the frequency of symmetrical motifs tended to increase and that of asymmetrical motifs decreased when proceeding from the pre-miRNA hairpin base to its terminal loop; (2) bulges were significantly overrepresented in the 5′ arm of the precursor (262) compared to the 3′ arm (172); (3) there were no strongly overrepresented specific sequences within the structural motifs analyzed. The predicted structures of many pre-miRNAs [109] showed great variation, possibly explaining the length diversity of mature miRNAs. 
To determine the specific structural motifs that could predominantly account for miRNA length diversity, a large number of synthetic pre-miRNAs selected to contain various structural motifs at specific locations were subjected to a cleavage assay with recombinant Dicer [96]. Prior to Dicer cleavage, the structures of nearly 20 pre-miRNAs were determined experimentally by chemical and biochemical methods. The set of metal ions used in that study (Fig. 3b), namely Pb, Ca, Mg and Mn ions [111, 112], was previously used to demonstrate the structural diversity of extended pre-miRNAs [43]. The ions mapped the single-stranded fragments, i.e. the terminal and internal symmetrical and asymmetrical loops. When the structures of the terminal loops were probed, the best results were obtained with S1 and T1 nucleases, but V1 nuclease was used for mapping well-paired stem portions of the secondary structure. The small differences between the predicted and experimentally determined structures were localized mainly in the hairpin terminal loops [96].

Fig. 3

Structure and dicing of miRNA precursors. a Structural diversity of miRNA precursors based on bioinformatics analysis of the predicted structures of 460 human pre-miRNAs [109]. The frequencies of various secondary structure motifs are presented in a pie chart. b Structural probing of the 5′-end labeled pre-miR-31 using the indicated probes. Lanes: Ci Incubation control (no probe), F formamide ladder, T guanine-specific ladder. S1, T1, T2 Nucleases. The positions of selected G residues are shown. On the right is the experimentally determined pre-miRNA structure. The cleavage sites and intensities for the selected probes are indicated by the symbols described in the inset. c Results of a Dicer cleavage assay for pre-miR-31. The precursor was incubated with human recombinant Dicer for 1, 2 and 5 min as described by Starega-Roslan et al. [96]. The Dicer cleavage sites (black arrowheads) are shown in the secondary structure model; the reported miRNA sequences are marked in red. The values of the weighted average length of diced RNA (WALDI) parameter for miRNAs generated from the precursor 5′ and 3′ arms are indicated. Other designations are as in b

The Dicer cleavage assay (Fig. 3c) of 19 pre-miRNAs and several pre-miRNA mutants revealed the important role of the structure of the precursor in determining the cleavage position. In particular, the presence of asymmetrical structural motifs was found to be a major determinant of miRNA length diversity. Precursor arms harboring excessive unpaired nucleotides gave rise to longer miRNAs [96]. To make the analysis more straightforward and quantitative, a new parameter was introduced in that study, namely, the weighted average length of diced RNA (WALDI), which facilitated finding the correlation between the pre-miRNA structure and the lengths of products excised by Dicer. The results from the Dicer cleavage assay of pre-miRNAs were supported by bioinformatic analyses of miRNAs deposited in miRBase [73]. Both approaches indicated that the presence of “excessive” nucleotides in any pre-miRNA arm results in the generation of longer miRNAs from this arm by Dicer [96].
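
The relationship between excessive unpaired nucleotides and product length can be made concrete with a deliberately simplified Python sketch. It is a toy model and not the analysis performed in [96]: it assumes that Dicer counts a fixed number of helical positions (22 by default, an assumed value) along one arm, and that bulged nucleotides are traversed but not counted. The state string, the function name and the default are hypothetical.

```python
# Toy "ruler" model (an illustration only, not the method of the cited study).
# Each nucleotide of one precursor arm, read from the end produced by Drosha
# toward the terminal loop, is labeled:
#   "P" = occupies a position in the helix (counted by the ruler)
#   "B" = excess unpaired nucleotide that bulges out (not counted)


def diced_length(arm_states, ruler_positions=22):
    """Return the product length (nt) predicted by the toy ruler model."""
    counted = 0
    for length, state in enumerate(arm_states, start=1):
        if state == "P":
            counted += 1
            if counted == ruler_positions:
                return length
        elif state != "B":
            raise ValueError("states must be 'P' or 'B'")
    raise ValueError("arm is shorter than the ruler")


if __name__ == "__main__":
    perfect_arm = "P" * 30
    one_bulge = "P" * 10 + "B" + "P" * 20
    two_bulges = "P" * 5 + "B" + "P" * 5 + "B" + "P" * 20
    for arm in (perfect_arm, one_bulge, two_bulges):
        print(diced_length(arm))
```

In this toy model every bulged nucleotide located before the cleavage site lengthens the product by 1 nt, which mirrors the qualitative trend reported for the cleavage assay; symmetrical motifs, overhangs and sequence effects are ignored.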

Most of the other information currently available on Dicer cleavage of dsRNA comes from biochemical studies in which the requirements for Dicer binding and cleavage were determined [31, 113–115]. A preference for dsRNA over single-stranded RNA was shown, and the binding of the PAZ domain to the 3′ terminal single-stranded overhang was demonstrated [114]. Dicer was shown to bind not only to the typical 2-nt 3′ overhang but also to 1-nt and 3- to 5-nt protruding ends [113, 116] and, less efficiently, to blunt ends of RNA [114, 117]. Both the overhang structure itself and the base composition of the overhanging nucleotides influence the efficiency of Dicer binding and cleavage [113, 118]. The pre-miRNA terminal loop structure has also recently been shown to influence Dicer’s cleavage efficiency [119], and the existence of some local sequence preferences at Dicer cleavage sites has been postulated [27] (our unpublished data). Additionally, the RNase IIIB Dicer domain was shown to prefer cleaving phosphodiester bonds adjacent to the structural distortions that occur in the 5′ arm of the RNA substrate, thereby influencing the Dicer cleavage site [27]. Nevertheless, the 3′ terminus of pre-miRNAs is the major determinant of Dicer’s cleavage position as Dicer measures the distance (two helical turns) from the 3′ end to its cleavage site [30].

Dicer partners in pre-miRNA cleavage

As mentioned earlier in this review, in cells, Dicer functions within the RLC, which comprises the two components TRBP and AGO2, as well as some additional proteins [35]. The role of Dicer’s protein partners in pre-miRNA (dsRNA) dicing has been addressed in several studies, but to date it has not been satisfactorily resolved. In this section, the protein components of the human dicing complex are characterized, and the results of studies aimed at resolving their role in pre-miRNA processing are summarized.

There are four argonaute proteins (AGO1–4) in human cells, but only AGO2 has RNA slicing activity. AGO2 is a major protein involved in the RNAi mechanism whose main function is to induce the guide strand-mediated cleavage of target mRNA by the catalytically competent RISC. The approximately 100-kDa AGO2 protein contains three domains. The PAZ domain is responsible for the binding of the 2-nt overhangs of miRNA and siRNA duplexes [120–123], the MID domain interacts with the 5′-phosphate group of RNA [124] and the PIWI domain, located at the C-terminus, possesses the RNase H-like activity required for the endonucleolytic cleavage of the miRNA passenger strand and subsequently the target mRNA [121, 124].

TRBP is an approximately 45-kDa RNA binding protein containing three dsRBDs. Two of the dsRBDs can homodimerize or bind to the interferon (IFN)-induced protein kinase R (PKR) and the protein activator of PKR (PACT). The third dsRBD interacts with the N-terminal helicase domain of Dicer [125]. The exact function of TRBP in miRNA biogenesis has not yet been determined and remains controversial [37, 38]. This protein is believed to cooperate with Dicer to facilitate miRNA/siRNA production [126], probably by enhancing the stability of Dicer–substrate complexes [127]. Alternatively, TRBP may assist in recruiting substrates to Dicer [127–129]. Based on the biochemical results [130] and data from electron microscopy imaging of the RLC [131], TRBP has been proposed to be the sensor for proper strand loading onto RISC, which can proofread incorrect strand loading. TRBP mutations observed in human cancers have been shown to cause defects in miRNA biogenesis [132].

The localization and function of other accessory proteins, such as PACT, in the complex are still poorly understood; however, it has been shown that the loss of PACT expression impairs miRNA biogenesis [34]. PACT is a protein that contains three dsRBDs. Since TRBP and PACT are very similar in terms of their domain composition, it is likely that they compete for the same binding site in Dicer and that they may compensate for each other functionally [34]. TRBP and PACT may interact with each other and also associate with Dicer to facilitate the cleavage of dsRNA; therefore, both proteins play a stimulatory role [126]. The following proteins may also be involved in the regulation of processing selected miRNAs by Dicer: ARS2 [133], FMRP [134], KSRP [135], Lin28 [103] and ADAR [136]. These proteins act by stimulating pre-miRNA processing by Dicer or by repressing miRNA maturation (reviewed in [67]).

Several experimental systems have been used to date to gain insight into the role of Dicer’s partners in its cleavage efficiency and specificity. Initially, the silencing of endogenous Dicer partners resulted in decreased pre-miRNA processing efficiency rather than in changed cleavage specificity. Even with regard to cleavage efficiency, the results obtained in independent studies were rather discordant [37, 38]. Synthetic precursors were also injected or transfected into cells in culture to follow their cleavage by the endogenous dicing complex [137] (Koscianska et al., submitted). The processing products of synthetic pre-miRNAs in cells were found to correspond rather well to the cleavage products generated by the recombinant Dicer; however, the cleavage efficiency differed between these systems. The experimental systems also included pre-miRNA cleaved by an immunoprecipitate containing Dicer with its protein partners [33, 35] and pre-miRNAs incubated in cellular extracts [138, 139]. Both exogenous stimulators [140] and endogenous inhibitors [141] of dicing were shown to exist. Human RLC was reconstituted in a 1:1:1 stoichiometry from recombinant Dicer, AGO2 and TRBP proteins, and this RLC was used in a cleavage assay with synthetic pre-let-7 [36]. Dicer’s partners showed little or no effect on cleavage specificity compared to Dicer’s activity alone [36] (Koscianska et al., submitted).

Structure and dynamics of the dicing complex

The successful reconstitution of RLC activity from recombinant proteins [36] has prompted researchers to aim at acquiring deeper insights into the molecular architecture of the human Dicer in the complex generated with its protein partners. The high-resolution crystal structures of Giardia intestinalis Dicer [142, 143], Thermus thermophilus Argonaute [144] and the DEAD-box helicase domain [145] have provided useful information for developing the human Dicer, Dicer–TRBP and RLC models based on single particle electron microscopy images [131, 146]. These analyses yielded low-resolution (20 Å) information on the mutual arrangement and possible interactions between the proteins within the binary and ternary complex. Multiple images obtained from electron microscopy suggest that RLC forms an L-shaped structure with the active RNase III center of Dicer in the back portion of the L-structure and the N-terminal domains localized at the base of the L-structure [146]. Using its PIWI and MID domains, AGO2 interacts with the Dicer platform formed by the C-terminal region. The N-terminal domain of AGO2, together with TRBP, interacts with Dicer’s DEAD-box domain localized at the base of the L-structure. AGO2 transiently interacts with TRBP to form a closed complex with Dicer. AGO2’s position in the RLC is flexible; it can move upon the binding of RNA and may play a role in increasing pre-miRNA access to Dicer [131, 147]. TRBP increases the affinity of AGO2 for Dicer, thus stabilizing the whole complex. The three-component complex forms a stable, triangle-like architecture [131], with an inside channel with a diameter of about 20 Å and a length of >100 Å. This channel, which runs along the long edge of the L-shaped portion, may be used to bind and position the pre-miRNA for catalysis. Attempts have been made to fit a hairpin structure into the cleft of the reconstructed Dicer–TRBP complex [146]. 
Most human pre-miRNAs range from 57 to 66 nt [108] and are approximately 78–90 Å long; they can, therefore, be accommodated within the channel. The “catalytic valley” formed by the two RNase III domains in Giardia Dicer is about 20 Å wide (which is similar to the diameter of the A-form RNA helix) and 50 Å long [30] and covers about two-thirds of the length of a typical pre-miRNA (Fig. 4). A more in-depth understanding of the RLC and Dicer–TRBP structures [131, 146] and better insight into pre-miRNA structure and dicing [96] will provide answers to the intriguing question of whether the formation of the complex with pre-miRNA requires the structural adaptation of both the RNA and protein components or whether structural changes in only one of them would be sufficient to provide an induced fit [30, 143]. Previous studies that addressed this question focused mainly on the adaptive features in Dicer’s structure [142, 148]. Protein flexibility was proposed to be a critical factor, allowing Dicer to adjust its shape to accommodate the structural diversity of its pre-miRNA substrates [142, 148]. To excise the 20-nt miRNAs from the pre-miRNA hairpin with a fully base-paired stem in the A-form RNA conformation, the catalytic site of the RNase III domain has to be located approximately 56 Å from the pre-miRNA 3′-base. To excise 24-nt miRNAs, the distance needs to be approximately 67 Å. Thus, the amplitude of motion of the Dicer catalytic center has to be at least 10 Å, i.e. approximately one-tenth of the entire length of the substrate channel. However, the movement of the Dicer structure does not need to be as great. Only a few known human pre-miRNAs have perfectly paired hairpin stems, and their derived miRNAs vary in length from 21 to 22 nt [73, 108]. The stems of other human pre-miRNAs are mosaics of base pairs and internal loops of various types and sizes (Fig. 3a). 
The unmatched bases of asymmetrical motifs probably bulge out of the helix when the pre-miRNA is accommodated within the substrate channel (Fig. 4); these bases are therefore not counted by Dicer when it sets the distance to its cleavage site [96]. The accumulation of structural imperfections in pre-miRNA hairpins results in a higher plasticity of the structures of the precursor; thus, the pre-miRNA may also contribute to the induced fit required for active complex formation.
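
The distances quoted above can be checked with simple arithmetic. The Python sketch below assumes a helical rise of 2.8 Å per base pair for A-form RNA, a value not stated in the text but consistent with its figures of 56 Å for 20 nt and 67 Å for 24 nt. The constant and function names are hypothetical, and the 28–32 bp stem range is a back-calculation from the quoted 78–90 Å, not a number given in the text.

```python
# Worked arithmetic for the distances quoted in the text.
# ASSUMPTION: rise of 2.8 angstrom per base pair for an A-form RNA helix.

RISE_PER_BP = 2.8  # angstrom per base pair (assumed value)


def helix_length(n_bp, rise=RISE_PER_BP):
    """Axial length in angstrom of a fully paired helix of n_bp base pairs."""
    return n_bp * rise


if __name__ == "__main__":
    d20 = helix_length(20)   # distance needed to excise a 20-nt product
    d24 = helix_length(24)   # distance needed to excise a 24-nt product
    print(round(d20, 1), round(d24, 1), round(d24 - d20, 1))

    # Back-calculated stem lengths matching the quoted 78-90 angstrom range.
    print(round(helix_length(28), 1), round(helix_length(32), 1))

    # Amplitude relative to a substrate channel of about 100 angstrom.
    print(round((d24 - d20) / 100.0, 2))
```

Under this assumption the 4-nt difference between the shortest and longest products corresponds to roughly 11 Å, in line with the statement that the catalytic center would have to move by at least 10 Å if the enzyme alone accommodated the whole length range.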

Fig. 4

A hypothetical model highlighting the role of structural plasticity of the precursors in the dynamics of the pre-miRNA dicing complex. The pre-miRNA hairpin is forced to enter a narrow substrate channel formed by the Dicer portion of RISC loading complex (RLC) [131]. Excessive nucleotides present in any precursor arm bulge out and are not counted by Dicer, which measures the double-stranded RNA (dsRNA) distance from its anchoring site (PAZ domain) to the cleavage site [96]. RIIIA and RIIIB domains: RNase IIIA and RNase IIIB Dicer domains

Concluding remarks

Over the past several years numerous important advances have been made in the miRNA field, and research focusing on miRNA biogenesis is one of the most rapidly progressing areas in this field. The discovery that many miRNA variants containing slightly shifted sequences and differing in lengths are generated from individual miRNA genes has stimulated additional interest in the mechanism of miRNA biogenesis and in the sources of miRNA variety. Results of miRNA deep sequencing studies have provided new insights into the specificity of miRNA processing steps triggered by Drosha and Dicer and into the nature of post-cleavage modifications. On the other hand, insightful information regarding the structures of miRNA precursors and their processing enzymes has also been obtained by the conventional molecular and structural biology methods. In this review, we have discussed structural aspects of mammalian microRNA biogenesis, placing special emphasis on the cytoplasmic step triggered by RNase Dicer and the role of precursor structure in generating miRNA length variety. It may be concluded from this article that the mechanism of miRNA precursor cleavages induced by Dicer is now fairly well established and that structural sources of miRNA length diversity are well recognized. There remain, however, a number of unresolved or poorly understood issues. Among these are the questions of sequence preferences for Dicer-induced cleavages and the exact role of Dicer protein partners in the dicing process. The low-resolution structures of the dicing complex provide first insights into its functioning, but further refinement of these structures will be needed to learn more about its dynamics. The recent identification of numerous auxiliary proteins implicated in the nuclear step of miRNA maturation sets the stage for more challenging studies on the structure and dynamics of the Microprocessor complex.