Introduction

Gene expression in eukaryotic cells requires the coordinated action of many molecular processes and the machines that carry them out. Pre-mRNAs, synthesised by transcription of protein-coding genes, are assembled into pre-mRNA–protein (pre-mRNP) complexes, processed, transformed into export competent mRNA–protein (mRNP) complexes and exported to the cytoplasm. There is no sharp distinction between components present in pre-mRNPs and components present in mRNPs. Some components interact transiently either with pre-mRNPs or mRNPs, while other components are present in both pre-mRNPs and mRNPs. In addition, an mRNP does not consist of the same components when it enters the nuclear pore complex (NPC) as when it emerges into the cytoplasm.

Many of the pre-mRNP/mRNP components, such as small nuclear ribonucleoproteins (snRNPs) and several of the heterogeneous nuclear ribonucleoproteins (hnRNPs), are abundant in the nucleus. They recognise binding determinants in the transcripts, such as splice sites, poly(A) signals and more or less specific binding motifs. As a result, different pre-mRNPs contain largely the same components. However, gene-specific sequences and unique exon–intron organisation, in combination with the set of available binding components, result in pre-mRNPs that have gene-specific combinatorial compositions. In addition, the composition of a pre-mRNP along a transcribing gene changes as components bind, as processing occurs and, in some cases, as components leave the pre-mRNP.

The spatial separation of making translatable mRNPs from the translation process in eukaryotic cells has made possible additional levels of regulation of gene expression. It has become more and more evident that the mechanisms for export of mRNPs are integrated in the biogenesis of the mRNP. The molecular machines that are responsible for biogenesis have evolved to interact with each other to achieve coordination, efficiency, regulation and quality control. The transcription machinery and the transcription process are at the centre of coordinating many of the processes and the molecular machines. It is moreover evident that the NPC provides a molecular environment that triggers rebuilding of the mRNP, essential for the future cytoplasmic functions.

Our knowledge of the chain of molecular events, from synthesis of pre-mRNPs at the gene to entrance of mRNPs into the cytoplasm, has been gained mainly by combining results from in vitro analyses of individual processing events, studies of gene constructs introduced into cells and yeast genetics. More knowledge about the expression of specific, endogenous genes in vivo is needed to understand the temporal, spatial and structural coordination of the different steps and of the different molecular machines that are involved. Analyses of the expression of the Balbiani ring (BR) genes in the intact polytene nuclei of the dipteran Chironomus tentans have contributed essential and unique information (reviewed in Daneholt 2001a). For the BR genes, it is possible to study most nuclear events at the gene, in the interchromatin space and at the NPCs. Here, we will use the BR genes and their pre-mRNPs/mRNPs as reference to summarise the knowledge about the steps of pre-mRNP/mRNP formation and export in eukaryotic cells. The overall intranuclear steps in gene expression as seen for BR1/BR2 genes are shown in Fig. 1. At the gene, the pre-mRNA is synthesised and assembled with hnRNP proteins, processing, export and quality control components. The released, export competent BR mRNP moves randomly by diffusion through the interchromatin space. At the NPC, the BR mRNP binds, moves into the central channel, goes through conformational and compositional changes and, at the cytoplasmic side, translation is rapidly initiated.

Fig. 1

Overview of the intranuclear steps of expression of a BR1/BR2 gene. Processes taking place at the gene locus, in the interchromatin space and at the NPC are listed. BR gene chromatin in black, RNA polymerase II elongation complex in purple, pre-mRNPs in light blue, mRNPs in dark blue and ribosomes in orange. The NPC and nuclear membrane are in grey

The cotranscriptional formation of export competent mRNPs

According to structural determination of the elongating RNA polymerase II in yeast (Kornberg 2007), about eight bases of the pre-mRNA are base paired to the DNA. It is suggested that the nascent pre-mRNA chain appears on the surface of the RNA polymerase II after reaching approximately double this length. At this exit site, it is likely that the pre-mRNA rapidly associates with various proteins and snRNPs that have access to sequence motifs in the pre-mRNA on a competitive basis. Analyses of transcribing BR genes by electron microscopy (EM) show that pre-mRNP assembly is initiated very early and that the transcription elongation complex and the pre-mRNP form a closely connected multi-molecular structure (Skoglund et al. 1983; Wetterberg et al. 2001) (Fig. 2). The observed gene-specific combinations of SR proteins bound to pre-mRNAs in a single nucleus support the view that binding affinities determine the combination of proteins in each pre-mRNP (Björk et al. 2009). The sequential exposure of newly synthesised pre-mRNA sequences during transcription presumably contributes to order during assembly and processing. This is exemplified in the BR3 gene where introns are excised in an overall 5′ to 3′ order (Wetterberg et al. 1996).

Fig. 2

Cotranscriptional synthesis, assembly and processing of BR1/BR2 and BR3 pre-mRNPs. a The proximal (0–7 kb) and middle (7–35 kb) regions of the BR1/BR2 gene are shown schematically. The bent arrow indicates the transcription start site. The RNA polymerase II elongation complex is in purple and the pre-mRNPs are in light blue. The proximal region contains three introns (i1–i3, black boxes) and exons 1–3, followed by the long (approximately 35 kb) exon 4. The continuous recruitment of components to the pre-mRNPs (listed components have been demonstrated) and the continuous assembly of the pre-mRNPs are indicated by red arrows. The processing events of the pre-mRNPs are indicated (red lines). Assembly of the exon sequences results in a 7-nm pre-mRNP fibre that is further folded into a 19-nm fibre, followed by organisation into higher order structures. Interaction between the pre-mRNP and chromatin may involve pre-mRNP-bound actin. b Schematic representation of part of the BR3 gene. The BR3 gene contains multiple short introns and exons. Three nascent transcript and splicing (NTS) complexes are shown. Each complex consists of a pre-mRNP (light blue), including splicing factors, and an RNA polymerase II elongation complex (purple). b′ EM 3D reconstruction of a corresponding BR3 gene segment with three NTS complexes (adapted from Wetterberg et al. 2001). The repeated assembly and structural dynamics of the spliceosome along the multi-intron gene dominate the structure of the pre-mRNPs. The scale bar represents 10 nm. c, d EM images of nascent BR1/BR2 pre-mRNPs in the middle region of the gene (adapted from Skoglund et al. 1983). The 19-nm fibre and the higher order granular structure are seen in several pre-mRNPs. The granular structure (arrows) gets bigger and denser as more RNA and protein are added (compare c and d). The scale bar represents 50 nm

It is not understood how efficient access for all pre-mRNA interacting components is orchestrated close to the surface of the RNA polymerase II exit channel. It is possible that the components get access to the pre-mRNP because they continuously and randomly move within the interchromatin space. It is also possible that the local concentration of components is increased at active genes by intermolecular interactions. The RNA polymerase II C-terminal domain (CTD) is located in convenient proximity to the pre-mRNA exit channel, and it is implicated in loading components onto the pre-mRNA. Apparently, the CTD has evolved as a regulatable interaction surface to facilitate coordinated delivery of components to the pre-mRNP and to the chromatin (Yoh et al. 2008). In this way, the local concentration of specific components can be increased at the appropriate time. The CTD is long enough to bind several proteins simultaneously. It is structurally versatile, and an induced fit is important when it interacts with different components (reviewed in Meinhart et al. 2005). Furthermore, binding can be regulated because interacting proteins recognise specific serine phosphorylation patterns in the CTD (reviewed in Buratowski 2009).

The coupling between pre-mRNP assembly, the transcription process and chromatin dynamics is not fully understood. One aspect is that assembly of the pre-mRNP may prevent formation of RNA–DNA hybrids and transcription-associated recombination (Huerta and Aguilera 2003). Another aspect might be that pre-mRNP assembly influences the chromatin structure to facilitate transcription elongation. Recent data imply that actin binds to certain hnRNP proteins in BR pre-mRNPs and recruits histone acetyltransferases (HATs) (Fig. 2a), thereby directing histone acetylation (reviewed in Visa and Percipalle 2010). Also, the chromatin remodelling complex SWI/SNF associates with BR pre-mRNAs (Tyagi et al. 2009). It is suggested that this complex, as part of the pre-mRNPs, could in addition influence splicing and 3′ end processing.

As a result of evolving specific interactions between components of the different molecular machines, coupling between transcription and pre-mRNP assembly, processing and export has been established. In such a system, quality control can be reached by connecting different machines through regulating subunit interactions. Such a case could be the apparent coordination between assembly of the 3′ end machinery and delivery of export adaptors (Johnson et al. 2009). It is further likely that the many cotranscriptional interactions and rearrangements that include multi-component assemblies involve kinetic proofreading (Hopfield 1974) that increases fidelity and efficiency. Specific ATP-dependent steps in splicing have been suggested to be such examples (Burgess and Guthrie 1993; Xu and Query 2007).

In addition to the cotranscriptional environment, it is known that the localisation of genes inside the nucleus is important for gene expression. The presence of specific transcription factories in the nucleus has been suggested (reviewed in Sutherland and Bickmore 2009), but the molecular basis for such factories is not clear. In the polytene nucleus, multiple copies of each BR gene are closely packed, providing a local environment of high concentration of various factors, but there is no evidence that different genes are arranged in close proximity. In yeast, a sub-compartment at the NPC facilitates gene expression. Direct interactions between components of the NPC and several different components involved in transcription activation, transcription elongation, 3′ end processing, export and quality control have been demonstrated in yeast for highly expressed and inducible genes (reviewed in Dieppois and Stutz 2010). The multitude of interactions suggests a direct coupling between the cotranscriptional processes and processes at the NPC. Such a direct interaction is apparently not the case for most active genes since they are not located at the NPCs. However, Nup153 and Tpr are dynamic NPC components, and in addition to being part of NPCs, they are associated with active chromatin within the nucleus (Vaquerizas et al. 2010). Interactions in the interior of the nucleus between chromatin and nucleoporins are also important for gene expression during cell differentiation (Capelson et al. 2010; Kalverda et al. 2010). Indirect coupling mechanisms between mRNP synthesis and NPC transport may furthermore exist, as suggested from studies of BR mRNP export (Kylberg et al. 2008).

Using immuno EM, it has been demonstrated that many components are recruited to the multi-molecular pre-mRNA–RNA polymerase II elongation complex on the active BR genes. There, they cooperate throughout transcription to facilitate and coordinate transcription, pre-mRNP assembly and processing. They also, in many cases, couple the cotranscriptional processes with export and cytoplasmic processes. These pre-mRNP/mRNP components and their functions will be briefly described below.

Pre-mRNP/mRNP components and their function

hnRNP proteins

More than 20 different hnRNP proteins, named A1 to U, are abundant components of pre-mRNPs/mRNPs and constitute a structurally and functionally diverse group of RNA binding proteins (reviewed in Dreyfuss et al. 2002). Different hnRNP proteins are present in widely different amounts in cell nuclei. Many of the hnRNP proteins bind to pre-mRNPs cotranscriptionally as demonstrated in lampbrush chromosomes (Wu et al. 1991) and for BR1 and BR2 pre-mRNPs (Visa et al. 1996b). In BR1 and BR2 pre-mRNPs, an hnRNP A1-like protein is present in multiple copies along the pre-mRNP (Kiseleva et al. 1997). The hnRNP proteins thus seem to be present throughout the length of pre-mRNPs and on individual transcripts they are likely to be present in specific combinations (reviewed in Singh and Valcárcel 2005), presumably determined by preferential sequence binding properties, post-translational modifications and cellular localisation. As they bind to the pre-mRNP, the hnRNP proteins can affect the structure of the pre-mRNP and the binding of other components, thereby influencing many aspects of synthesis, processing and function of pre-mRNPs/mRNPs, for example, polyadenylation, mRNA stability, export and translation. Several hnRNP proteins modulate splice site choices (reviewed in Martinez-Contreras et al. 2007), as exemplified by hnRNP A1 and hnRNP H that can both repress and stimulate splicing (Fisette et al. 2010). Hrp36, an hnRNP A1-like protein, and the Y-box p50 protein (Soop et al. 2003) both bind to BR1 and BR2 pre-mRNPs cotranscriptionally and remain associated with the BR mRNPs in polysomes, suggesting multiple roles for these proteins.

Nuclear cap binding proteins

All pre-mRNAs are capped and associated with cap binding proteins (CBPs) to form the cap binding complex (CBC). This universal 5′ end modification of the pre-mRNAs occurs soon after the start of elongation (presumably on 25- to 40-base-long pre-mRNAs) by the combined action of several enzymatic activities that also interact with the Ser5 phosphorylated CTD of RNA polymerase II (reviewed in Gu and Lima 2005). The cap binding proteins CBP20 and probably CBP80 are added early on BR1 and BR2 pre-mRNPs (Visa et al. 1996a), and thus the CBC is present in the pre-mRNP throughout most of the transcription elongation. This is in line with the observed interaction between the CBC and other cotranscriptional processing events (reviewed in Lewis and Izaurralde 1997), including the export machinery (Cheng et al. 2006).

The splicing machinery

SnRNPs and other splicing factors are efficiently recruited to intron-containing nascent transcripts (Kiseleva et al. 1994; Baurén et al. 1996; Lacadie and Rosbash 2005; Listerman et al. 2006) and are part of a nascent pre-mRNP/transcription complex (Wetterberg et al. 2001). Cotranscriptional splicing was originally observed in Drosophila (Beyer and Osheim 1988) and has been directly demonstrated for the BR1 (Baurén and Wieslander 1994) and BR3 (Wetterberg et al. 1996) pre-mRNAs. Observations in mammalian cells support the view that splicing is largely cotranscriptional (reviewed by Neugebauer 2002; Bentley 2005; Perales and Bentley 2009). EM visualisation of the BR3 pre-mRNPs in situ (Fig. 2b and b′) (Wetterberg et al. 2001) indicates that spliceosome components are repeatedly recruited to the multi-intron transcript during transcription elongation and that spliceosome components and introns leave the pre-mRNP after completion of splicing. It should be pointed out that spliceosome assembly is initiated cotranscriptionally, but splicing need not be completed until after 3′ end processing (Tardiff et al. 2006), depending, for example, on the position of the introns (Baurén and Wieslander 1994). Based on observations on the BR1 gene, splicing can be stimulated by 3′ end processing and may be essentially completed in the pre-mRNP while it is being polyadenylated and is still retained at the gene locus (Baurén et al. 1998).

The SR proteins

The family of SR proteins fulfils multiple roles in the life of a pre-mRNP/mRNP (reviewed in Sanford et al. 2005; Long and Cáceres 2009). SR proteins are recruited to nascent pre-mRNPs (Baurén et al. 1996; Misteli et al. 1997; Mabon and Misteli 2005). During constitutive splicing, more than one SR protein is required. SR proteins directly contact the pre-mRNA at several specific sites during the splicing reaction (Shen and Green 2004) and form important protein–protein bridges. They also influence alternative splice site choices, probably by contributing to specific combinations of splicing factors that associate with pre-mRNPs. Pre-mRNPs from different genes, including BR genes, bind several different members of the SR protein family in gene-specific combinations (Björk et al. 2009). At least four different types of SR proteins remain with BR mRNPs during export to the cytoplasm. Experimental interference with one of these SR proteins, SRSF2 (SC35), suggests that it is important for cotranscriptional events as well as for nucleocytoplasmic export of a single BR pre-mRNP/mRNP. One of the SR proteins, SRSF1 (ASF/SF2), even stays with the mRNP in polysomes, in agreement with its known role in translation initiation (Michlewski et al. 2008). These data underline the numerous findings that SR proteins are part of pre-mRNPs/mRNPs from the gene to the cytoplasm and influence many important processes such as synthesis, processing, export, stability and translation. In addition, an SR-like protein, RSF1/hrp23, is recruited to the pre-mRNP cotranscriptionally and inhibits early spliceosome assembly at incorrect sites within exons (Björk et al. 2006).

The exon junction complex

A specific complex of proteins associates with the pre-mRNP/mRNP in a splicing-dependent but non-sequence-specific way, at a region about 20 nucleotides upstream of exon–exon junctions (reviewed in Le Hir and Andersen 2008). Assembly of this exon junction complex (EJC) represents a splicing-dependent change in pre-mRNP composition. It is assumed that an EJC is deposited upstream of each exon–exon junction in a multi-intron pre-mRNA. The core of the EJC consists of eIF4AIII, a member of the DEAD-box family of RNA helicases that binds to the pre-mRNA, the Mago-Y14 heterodimer and Barentsz/MLN51. The order and timing of assembly of the EJC are not precisely known. eIF4AIII is associated with both unspliced and spliced pre-mRNAs and may interact with spliceosomal proteins (Ideue et al. 2007). Detailed structural analyses indicate that eIF4AIII undergoes conformational changes and that Mago-Y14 is needed for the tight binding of eIF4AIII to the RNA. This may take place during spliceosomal C-complex formation (Herold et al. 2009). The four EJC core components associate with the BR1 and BR2 pre-mRNPs during cotranscriptional splicing (Björk et al. manuscript in preparation). The EJC core is stably bound to the mRNA and serves as an anchoring platform for many different proteins in the nucleus and in the cytoplasm. Among these proteins are splicing-associated proteins, export proteins and mRNA quality control proteins (Le Hir et al. 2001). In the cytoplasm, the EJC is involved in localisation of mRNPs, nonsense-mediated decay (NMD) and translational control (Isken et al. 2008; Ma et al. 2008). The EJC is present on CBC-bound mRNPs. Upon the first round of translation, the EJC is removed from the mRNP and eIF4E replaces the CBC, after which steady-state translation initiation can take place (reviewed in Isken and Maquat 2007).

The 3′ end processing machinery

The conserved 3′ end processing machinery (reviewed in Mandel et al. 2008; Millevoi and Vagner 2010) binds to a combination of sequence signals in the pre-mRNA and, in a coupled two-step process, cleaves the pre-mRNA and polyadenylates the free 3′ end. Although it has been reported that 3′ end processing can occur after release from the gene (West et al. 2008), more commonly it takes place cotranscriptionally (reviewed in Perales and Bentley 2009). The recruitment of 3′ end processing factors appears to involve multiple mechanisms, pointing to interplay with the transcription process. Some 3′ end processing factors are recruited already at the 5′ end of the gene, possibly at the promoter, involving for example TFIID (Dantonel et al. 1997). Some factors, such as the cleavage polyadenylation specificity factor (CPSF), bind to RNA polymerase II (Nag et al. 2007). A maximal amount of 3′ end processing factors is found at the 3′ end of the gene, downstream of the poly(A) site, probably recruited by the signals in the pre-mRNA (Glover-Cutter et al. 2008). At this position, Ser2 phosphorylation in the RNA polymerase II CTD is maximal and the cleavage stimulation factor (CstF) binds to the Ser2 phosphorylated CTD. In the BR1 gene, the RNA polymerase continues transcription approximately 600 bp downstream of the position of the poly(A) site (Fig. 3). At this point, termination of transcription, 3′ end cleavage and the initial polyadenylation as well as stimulation of excision of the last intron take place (Baurén et al. 1998). As a result, the composition of the pre-mRNP changes; the functional 3′ end processing machinery temporarily associates and the poly(A)-binding protein (PABP) becomes part of the pre-mRNP. Ultrastructural analysis of the BR1/BR2 pre-mRNPs at the end of the genes has not been performed so far. Although initiated at the gene, polyadenylation is completed after release of the pre-mRNP from the gene, since the poly(A) tail is longer in the BR pre-mRNPs/mRNPs located in the interchromatin space.

Fig. 3

Cotranscriptional processes at the 3′ end of the BR1/BR2 gene. a Schematic representation. Colour code as in Fig. 2. Excision of intron 4 (i4, black box) is initiated rapidly. About 0.6 kb downstream of the poly(A) signal (grey box), transcription termination, 3′ end cleavage and polyadenylation are coordinated. The pre-mRNP is retained at the gene during the initial polyadenylation, and splicing is stimulated at this stage. Approximately 3 kb of DNA, as extended chromatin, is present downstream of the site of release of the pre-mRNP, before being folded into compact chromatin (black curved line). Released mRNP in dark blue. b EM image of the 3′ end of the gene (adapted from Ericsson et al. 1989). The arrows indicate the chromatin fibre, extending from the final pre-mRNP into the compact chromatin. The scale bar represents 50 nm

The transcription-export complex

The transcription-export complex (TREX) is associated with the elongating RNA polymerase II-nascent pre-mRNP and plays several roles. TREX consists of the multi-subunit THO complex and two export proteins, the RNA helicase Sub2 (in yeast)/UAP56 (in mammals) and Yra1 (in yeast)/Aly (in mammals). In yeast, the THO complex appears to be associated with the transcription elongation complex and is believed to facilitate loading of proteins onto the pre-mRNA (Jensen et al. 2004). In mammals, the THO complex contains additional subunits and one of these, Thoc5, has been reported to participate in binding the export receptor NXF1 (Katahira et al. 2009). Recruitment of Sub2/UAP56 and Yra1/Aly to BR pre-mRNPs is cotranscriptional (Kiesler et al. 2002). Importantly, Sub2/UAP56 is yet another example of a protein that has multiple functions. Apart from its presence during transcription elongation, it is involved in splicing, where it plays a role in U2 snRNP recruitment (Kistler and Guthrie 2001). Moreover, as part of the TREX complex, it is involved in export.

The recruitment of Sub2/UAP56 to the pre-mRNA can apparently take place by several mechanisms, connecting export to transcription and processing. Hpr1, a subunit of the yeast THO complex, has been shown to be required for recruitment of Sub2 and Yra1 to the pre-mRNA (Zenklusen et al. 2002). Aly can bind to nascent transcripts dependent on the transcription elongation factor Spt6. Spt6 itself binds to the Ser2 phosphorylated RNA polymerase II CTD and recruits Aly via its interacting protein Iws1 (Yoh et al. 2007). Recruitment of the export proteins is moreover connected to 3′ end processing. Direct interaction between the subunit Pcf11 of the yeast 3′ end cleavage factor IA (CFIA) and Yra1 may couple delivery of Yra1 to Sub2 and assembly of a functional CFIA (Johnson et al. 2009). In mammals, it has been reported that UAP56 and Aly interact with the EJC (for example Le Hir et al. 2001). However, it has also been demonstrated that in humans, CBP80 recruits the TREX complex to the 5′ end of the pre-mRNP in a splicing-dependent manner via interaction with Aly (Cheng et al. 2006).

mRNA export adaptors and receptors

Export of mRNPs requires that specific proteins in the mRNPs serve as adaptors for binding to export receptors. The export receptors in turn mediate contact with the NPC and are essential for translocation of the mRNP through the NPC. The vast majority of mRNPs use a specific heterodimer export receptor called NXF1:NXT1 in mammals and Mex67:Mtr2 in yeast. NXF1 belongs to a family of proteins (Herold et al. 2000), and several NXF family members exhibit mRNA export function. Some members are expressed in specific tissues and may serve specialised functions in mRNA metabolism (Tan et al. 2005). In addition, a second receptor, the karyopherin Crm1, is involved in export of some mRNPs (Gallouzi and Steitz 2001; Cullen 2003; Carmody and Wente 2009).

A number of different adaptors to the NXF1:NXT1 export receptor have been identified. The recruitment of these adaptors is directly or indirectly connected to pre-mRNA processing steps. Yra1/Aly is a well-known adaptor, described above as part of the TREX complex. Besides Yra1/Aly, some SR proteins, SRSF3 (SRp20), SRSF7 (9G8) and SRSF1 (ASF/SF2), remain associated with the processed mRNP. In a hypophosphorylated form, they serve as adaptors for NXF1:NXT1 (reviewed in Huang and Steitz 2005). A BR mRNP contains multiple export adaptors. Aly (Kiesler et al. 2002) and several types of SR proteins (Björk et al. 2009) have been shown to be part of BR1 and BR2 mRNPs, suggesting that multiple copies of the export receptor NXF1:NXT1 are associated with the BR mRNP. In addition, Crm1 is part of the BR1 and BR2 mRNPs (Zhao et al. 2004).

In summary, multiple export adaptor proteins and mechanisms of loading them onto the pre-mRNPs/mRNPs are utilised. These mechanisms involve both transcription and different pre-mRNA processing steps and thus connect these processes to the formation of export competent mRNPs. It seems likely that individual mRNPs associate with more than one adaptor, loaded by different mechanisms, possibly to increase export efficiency. It is also likely that some mRNAs require specific arrangements. The HSP70 mRNA, for example, requires a co-adaptor, Thoc5, which is part of the TREX complex, for binding the export receptor NXF1 (Katahira et al. 2009). Furthermore, it is likely that biological variation accentuates one variant or the other, such that different conditions dominate in different tissues or species and that mRNPs from certain genes preferentially use specific export factors.

Gene-specific components

Although most components that bind to pre-mRNPs/mRNPs are not gene specific, there are exceptions. For example, some, but far from all, regulators of alternative splicing are expressed tissue specifically (reviewed in Nilsen and Graveley 2010). A second example is the transcriptional repressor CA150/hrp130 that is highly enriched in the BR3 gene in an RNA-dependent manner. This conserved protein is known to repress transcription elongation in mammals. It has therefore been suggested that CA150/hrp130 may adjust the transcription rate to the frequent cotranscriptional excision of the many introns in the BR3 pre-mRNAs (Sun et al. 2004). A third example is ADAR II that edits within double-stranded parts of specific pre-mRNAs cotranscriptionally (Laurencikiene et al. 2006), implying that this enzyme can be part of the pre-mRNP in the appropriate cases.

Multiple roles for individual components

An important finding from studies of individual components of pre-mRNPs/mRNPs is that the same protein can interact with different molecular assemblies and be involved in more than one process. Four examples described above, the SR proteins, the CBPs, the core EJC and Sub2/UAP56, emphasise this principle of multi-functional proteins. Evolution has clearly shaped multiple contact possibilities between subunits involved in different processes. The functional diversity of these proteins and their affinities are likely to be influenced by, for example, mutually exclusive interactions, as suggested for Yra1 binding to Sub2 and to Pcf11; by phosphorylation status, as observed for SR protein recruitment in splicing versus interaction with export receptors; and by interaction partners that vary with cellular localisation, as suggested for the core EJC.

These examples show that proteins having multiple functions during transcription, processing, export and translation are essential for the coordination of gene expression and thus are important for reaching accuracy as well as efficiency.

The structure of pre-mRNPs and mRNPs

To understand mRNP export in its molecular context in vivo, it is important to analyse the dynamic structure and composition of pre-mRNPs and the corresponding mRNPs. The structure and composition of a pre-mRNP change during transcription, reflecting the gradual synthesis of the pre-mRNA, the continuous assembly of the pre-mRNP and the transient interactions with processing machines (Figs. 2 and 3). Information about the structure of pre-mRNPs is generally lacking. In EM studies of mammalian cell nuclei, perichromatin fibrils and granules presumably represent pre-mRNPs, but detailed structural analyses of such gene-specific pre-mRNPs have not been performed (reviewed in Fakan and van Driel 2007). The most detailed structural information available is for BR pre-mRNPs (Skoglund et al. 1983; Wetterberg et al. 2001). During transcription elongation, the pre-mRNP structure changes as more RNA is made and incorporated together with additional proteins. In the BR1 and BR2 pre-mRNPs, distinct structural changes are seen (Figs. 2 and 3). In the very beginning of the 5′ proximal part of the approximately 35-kb-long genes, the pre-mRNP structure has not yet been precisely analysed. Thin fibres and occasional granules are present. In this region, three introns are incorporated into the pre-mRNP and subsequently rapidly excised (Fig. 2a). The structure of a pre-mRNP can be highly influenced by the presence of processing machines, such as the spliceosome. The 11-kb-long BR3 gene contains 38 introns and the pre-mRNP repeatedly takes on several distinct structures during recruitment, assembly and function of the spliceosome (Fig. 2b and b′) (Wetterberg et al. 2001).

Further into the proximal region of the BR1/BR2 gene, excision of the three introns is complete. The pre-mRNP then appears as a 19-nm-thick irregular fibre (Fig. 2a). Based on Miller spreads and on analyses of released BR mRNPs, the 19-nm fibre is in fact built from a folded, basic 7-nm fibre. At the end of the proximal region, a second structural change appears in the pre-mRNP. The 19-nm-thick fibre reaches a length that from then on is constant throughout transcription of the middle region (a single approximately 30-kb exon) of the BR1 and BR2 genes (Fig. 2a, c and d). At the tip of the 19-nm fibre, the 5′ part of the pre-mRNP folds into a more compact granular higher order structure. In this higher order structure, the 7-nm fibre is still the basic unit, although arranged in a different way. The molecular mechanisms for the structural transitions (7 nm to 19 nm to compact granular structure) are not known. They could for example be a consequence of processing of introns 1 to 3 or they could reflect assembly properties of a long pre-mRNP fibre. In summary, the packing of the BR1 and BR2 exon sequences takes place in distinct and reproducible steps; a 7-nm fibre is folded into an irregular 19-nm fibre and subsequently folded into a more compact, eventually spherical structure. At the 3′ end of the BR1 and BR2 genes, the overall pre-mRNP structure is still dominated by the 19-nm fibre and the growing compact spherical structure (Fig. 3).

We predict from these results on the BR genes that, in general, pre-mRNPs are structurally diverse. The structure will depend on the size of the transcription unit, the exon–intron organisation, the presence of processing machineries and the dynamic rearrangements during the processing events.

Structural information is also needed for mRNPs in the interchromatin space. The structure of the BR1 and BR2 mRNPs has been analysed by EM tomography (Skoglund et al. 1986). The mRNP consists of a 7-nm fibre (Lönnroth et al. 1992) that is folded into higher order structures, apparently in a back-and-forth manner (Daneholt 2001b). Overall, a spherical mRNP is formed, having a diameter of about 50 nm and a central hole (Fig. 4a and b). This structure largely reflects the build-up of the compact spherical structure seen late during transcription (Fig. 3b). All BR mRNPs from an individual BR1 or BR2 gene have the same structure, and pre-mRNPs from the related BR1 and BR2 genes are very similar. This suggests that the mRNP structure is the result of specific interactions. In yeast, a large population of presumably nuclear mRNPs has been isolated and characterised by EM (Batisse et al. 2009). Many different mRNPs shared a common architecture consisting of a constant 5–7-nm-thick elongated ribbon-like structure with a length that increased with the size of the mRNA. From these examples, it is difficult to know if there is a common architecture for all nuclear mRNPs. It is possible that mRNPs have a more unified structure than pre-mRNPs because processing factors are not present. It could then be that a 5–7-nm fibre-like structural element is present and that for longer mRNAs, such a fibre can be arranged in a higher order of folding. It is also possible that the CBC and the poly(A) tail with its PABPs adopt structures that are different from the body of the mRNP. Studies of BR mRNPs (Mehlin et al. 1992, 1995) and dystrophin mRNPs (Mor et al. 2010) during export through the NPCs indicate that the mRNP structure is flexible and can be rearranged (see ‘mRNP interaction with the NPC and release into the cytoplasm’ section).

Fig. 4

The BR1/BR2 mRNPs in the interchromatin space. a EM image of BR mRNPs in situ. The BR mRNPs (50 nm in diameter) in the interchromatin are indicated by long arrows. For comparison, short arrows indicate ribosomal subunits. N nucleus, C cytoplasm. The scale bar represents 100 nm. b The 3D structure of the BR1/BR2 mRNP (50 nm in diameter) (adopted from Mehlin et al. 1995, © The Rockefeller University Press. The Journal of Cell Biology, 1995, 129:1205–1216, Fig. 7). The numbers (1–6) on the mRNP indicate multiple contact points with the inner ring of the NPC during export. The numbers outside the mRNP (1–4) indicate described domains of the mRNP. The scale bar represents 10 nm. c The released BR1/BR2 mRNPs move randomly inside the interchromatin space and become part of a population of mRNPs, from which individual mRNPs (stochastically) bind to the basket of the NPCs. Colour code as in Fig. 1. c′ EM image of a section through a polytene nucleus (adopted from Singh et al. 1999). Newly synthesised (red dots) and old (blue dots) BR1/BR2 mRNPs are randomly distributed. The highest concentration of mRNPs is at the gene locus (BR). PC polytene chromosomes, Nu nucleolus, N nucleus, C cytoplasm. The scale bar represents 5 μm. d Within the interchromatin space, the random movement of the mRNP is occasionally interrupted by transient interactions (light grey, thin lines) with interchromatin structures (dark grey, thick line). d′ EM 3D reconstruction of a BR1/BR2 mRNP interacting with an interchromatin fibre structure (adopted from Miralles et al. 2000, © The Rockefeller University Press. The Journal of Cell Biology, 2000, 148:271–282, Fig. 2d, part IV). Numbers 1–4 refer to the described domains of the mRNP. The scale bar represents 20 nm

Quality control in the nucleus

Improperly processed or assembled pre-mRNPs are recognised and degraded in the nucleus, as has been most extensively analysed in yeast (reviewed in Schmid and Jensen 2008; Fasken and Corbett 2009; Dieppois and Stutz 2010). Although decapping and 5′–3′ degradation have been observed for poorly spliced pre-mRNAs, the nuclear exosome, which has 3′–5′ exonucleolytic activity, together with its activator complex TRAMP, is the best-characterised degradation complex. The exosome is associated with the elongating RNA polymerase II in active genes (Andrulis et al. 2002). In the BR genes, the exosome component Rrp4 (and presumably the exosome) is in addition present in pre-mRNPs throughout transcription, an interaction that is mediated by hrp59/hnRNP M (Hessle et al. 2009). Pre-mRNPs with defects in assembly, splicing or 3′ end processing can be retained at the gene locus in an exosome-dependent manner (Hilleren et al. 2001). The molecular mechanism of this retention is not known, at least partly because the mechanisms for normal release of pre-mRNPs/mRNPs from the gene are not known. It has been observed that an appropriate poly(A) tail, presumably coupled to PABPs, is needed for release from the gene locus. In the BR1 gene (Fig. 3), correctly 3′ end cleaved transcripts with short poly(A) tails, approximately 20 A residues long, were present at the gene locus, and extension of the poly(A) tails occurred as the mRNPs were released or shortly thereafter (Baurén et al. 1998). The initial polyadenylation thus occurs at the gene locus and, unless performed efficiently, leads to retention and degradation by the exosome (reviewed in Schmid and Jensen 2008). Since the different cotranscriptional events are coupled, defects in RNA polymerase II, pre-mRNP assembly and splicing could possibly all result in retention at the gene locus, mediated through a 3′ end processing pathway.

Unspliced and improperly assembled mRNPs are also retained and presumably degraded at the NPC. The basket-associated Mlp1 and Mlp2 (Tpr in vertebrates) and Pml39 proteins are involved in sorting proper and improper mRNPs for translocation and retention, respectively (reviewed in Fasken and Corbett 2009). Additional NPC-associated proteins are involved, probably indirectly because they are important for NPC assembly (Esc1) and Mlp1 anchoring (Nup60). The mRNPs may directly interact with Mlp1 via the poly(A)-binding protein Nab2. The SUMO protease Ulp1 is localised to the NPC and is involved in retaining unspliced mRNPs, suggesting that desumoylation is part of this process (Lewis et al. 2007).

mRNPs move inside the interchromatin space by diffusion

Different types of data support the view that active chromatin is unfolded and forms loops of various lengths (reviewed in Cremer et al. 2004; Sutherland and Bickmore 2009). In mammalian cell nuclei, such actively transcribing chromatin loops may be found inside chromosome territories or at the surface of these territories. EM observations suggest that in both cases, active transcription occurs at the interface between chromatin and interchromatin (reviewed in Fakan and van Driel 2007). The nascent pre-mRNPs are accessible from the interchromatin space, and upon termination of transcription/processing, mRNPs are directly delivered into the interchromatin space. This situation is most evident in polytene nuclei. EM analyses show that the active BR genes loop out into the interchromatin space, that the nascent BR pre-mRNPs are in direct contact with the interchromatin space and that the pre-mRNPs/mRNPs are directly delivered into the interchromatin space (Fig. 4c and c′).

Biochemical analysis of the nuclear BR1/BR2 mRNPs suggests that each of these long mRNAs is associated with approximately 500 average-sized proteins (Wurtz et al. 1990). A number of different proteins have been identified by immuno EM (CBC, hnRNP proteins, export factors, PABP, EJC core complex, helicases, SR proteins and a splicing repressor), but the complete composition is not yet known. It is evident that each BR mRNP contains proteins that have specific functions during export, for example at the NPC and subsequently in the cytoplasm. In general, proteins present in mRNPs have been identified (Dreyfuss et al. 2002; Singh and Valcárcel 2005) and also characterised for mRNP populations (Batisse et al. 2009). More information about the protein composition in gene-specific mRNPs is however needed.

Before the movement and distribution of mRNPs inside the cell nucleus could be studied, it was believed that efficiency demanded structural mRNP transport systems. The current model proposes that mRNPs move in a non-directional manner away from the gene by diffusion. This diffusion, however, is restricted by the chromatin organisation and structures inside the interchromatin (Zachar et al. 1993). In mammalian nuclei, the interchromatin space forms a three-dimensional labyrinth and diffusion of the mRNPs occurs within this restricted space. It is possible that narrow channels in between volumes of chromatin increase the directionality of mRNP movement. Direct measurements of distribution and rate of movement are compatible with a restricted diffusion (Politz and Pederson 2000; Shav-Tal et al. 2004; Ben-Ari et al. 2010; Mor et al. 2010). It has been observed that within the interchromatin space, mRNPs move freely in and out of interchromatin granule clusters (speckles at the light microscope level), suggesting that, in general, mRNPs do not accumulate at specific regions for specific processing/modification steps, at least not for splicing (Ritland Politz et al. 2006).

Compared to mRNPs in mammalian nuclei, the movement of BR1/BR2 mRNPs is less restricted within the interchromatin space of the large polytene nucleus (Singh et al. 1999), and the diffusion coefficient is higher than reported for mRNPs in mammalian nuclei (Siebrasse et al. 2008). This is probably because chromatin restrains movement in mammalian nuclei. Tracking of individual BR mRNPs demonstrated that within the interchromatin space, the mRNPs are slowed down occasionally. This may reflect transient interactions with non-chromatin components in the interchromatin space (Fig. 4d and d′). Such interactions have been demonstrated (Miralles et al. 2000).

It has been estimated that movement through the nucleus of mammalian cells, from the gene to the NPC, occurs within a time frame of 5–40 min, i.e. not a fixed time (Mor et al. 2010). In the case of BR1/BR2 mRNPs, newly made mRNPs become part of a pool of nuclear mRNPs from which individual BR mRNPs are exported, presumably after randomly docking at NPCs (Fig. 4c and c′). We propose that this reflects a situation common to all mRNPs and that the nuclear residence time for individual mRNPs varies. It is possible that diffusion through narrow interchromatin channels increases the directionality for mRNP movement in mammalian nuclei.
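The idea that purely random movement gives each mRNP a different nuclear residence time can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not taken from the studies cited above: the nucleus is idealised as an empty sphere, the particle starts at its centre, export is counted as soon as the particle first reaches the boundary, and all lengths and times are in arbitrary units. Chromatin obstruction, transient binding and docking at NPCs are ignored.

```python
import random


def first_passage_steps(radius, step, rng, max_steps=1_000_000):
    """Number of steps a 3D random walker needs to travel from the
    centre of a sphere to its surface (hypothetical 'gene to envelope' path)."""
    x = y = z = 0.0
    for n in range(1, max_steps + 1):
        # Each step is an independent Gaussian displacement along every axis.
        x += rng.gauss(0.0, step)
        y += rng.gauss(0.0, step)
        z += rng.gauss(0.0, step)
        if x * x + y * y + z * z >= radius * radius:
            return n
    return max_steps


def simulate(n_particles=200, radius=10.0, step=1.0, seed=1):
    """First-passage times for a population of independent walkers."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [first_passage_steps(radius, step, rng) for _ in range(n_particles)]


times = simulate()
mean_time = sum(times) / len(times)
print(f"fastest: {min(times)} steps, slowest: {max(times)} steps, "
      f"mean: {mean_time:.1f} steps")
```

Even in this idealised setting, identical particles released from the same point reach the boundary after widely different times, which is qualitatively what a variable rather than fixed nuclear residence time would look like.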

mRNP interaction with the NPC and release into the cytoplasm

A decisive step in nucleocytoplasmic export is the recognition between the mRNP and the NPC. The NPC is a huge protein complex, with a mass of around 125 MDa in mammals and 60 MDa in yeast. The structure and function of the NPC have been covered in several recent reviews (for example Cook et al. 2007; D’Angelo and Hetzer 2008; Lim et al. 2008; Terry and Wente 2009; Strambio-De-Castillia et al. 2010). Briefly, the NPC is built from about 30 different proteins, called nucleoporins, divided into structural and FG nucleoporins. FG nucleoporins contain flexible domains rich in phenylalanine and glycine residue repeats, separated by characteristic spacer sequences. In total, each NPC is estimated to contain 500–1,000 nucleoporin molecules. The NPC has a characteristic 8-fold symmetry and consists of a core embedded in the nuclear envelope and fibrillar structures on both the nuclear and cytoplasmic sides. The core consists of a central channel positioned in between a nuclear and a cytoplasmic ring. The central channel contains a meshwork of unfolded, flexible FG nucleoporins. The fibrillar structure on the nuclear side, called the basket, consists of eight fibrils attached to the nuclear ring of the core. The ends of the basket fibrils can interconnect with each other. On the cytoplasmic side, there are also eight fibrils but they are shorter than those on the nuclear side.

The binding of BR1 and BR2 mRNPs to the basket of the NPC and their translocation through the NPC have been studied in some detail (Fig. 5). Initially, the BR mRNP interacts with the tip of the basket fibrils and there are structural rearrangements of the basket (Fig. 5a and a′) (Kiseleva et al. 1996). The fibrils open up and a ring-like structure is formed around the centrally placed BR mRNP. It is possible that, before binding the mRNP, the basket fibrils are in fact quite flexible and that they constantly move and thereby contribute to the efficiency of recruitment of mRNPs to the basket (Kylberg et al. 2010). After binding, the BR mRNP enters the basket and the ring structure regresses (Fig. 5b and b′). Tpr and the FG nucleoporin Nup153 are present in the basket fibrils. Export receptors interact with FG repeats and several FG nucleoporins in addition to Nup153 are present on surface-accessible NPC positions including the basket (reviewed in Terry and Wente 2009). Export receptors bind directly to the FG repeats with domains that are different from the mRNP-binding domains (Stewart 2007), and different export receptors may require different subsets of FG nucleoporins. BR mRNPs are retained on top of the basket when Nup153 is experimentally interfered with, suggesting that Nup153 is involved in the transfer of the BR mRNP into the basket (Soop et al. 2005). Wheat germ agglutinin blocks the BR mRNPs at the basket, suggesting that GlcNAc-bearing nucleoporins, but presumably not Nup153, are involved in processes preceding the translocation through the central channel (Kylberg et al. 2010).

Fig. 5

The behaviour of BR1/BR2 mRNPs at the NPC. Schematic representations and EM images (adopted from Mehlin et al. 1992) of the corresponding steps. a, a′ The mRNP (in dark blue) binds to the tip of the basket (in grey). b, b′ It then enters into the basket and docks at the entrance of the central channel. c, c′ Subsequently, the mRNP changes conformation as it is translocated through the central channel with the 5′ end first. d, d′ At the cytoplasmic side, initiation of translation can occur rapidly at the exit of the central channel. Orange dots in (d) and arrows in (d′) indicate ribosomes. The scale bar represents 100 nm

As previously pointed out, a BR mRNP contains multiple export adaptors for NXF1:NXT1 and it also contains Crm1. It has been observed that during translocation, the BR mRNP rotates and that it has multiple contact points with the NPC nuclear ring (Fig. 4b) (Mehlin et al. 1995). This may reflect the multiple export receptors and/or the mechanics of entering the central channel. At least for large mRNPs, such as the BR mRNPs, efficient NPC passage may require multiple export receptors. In agreement with such an interpretation, interference with SRSF2 (SC35) (Björk et al. 2009) or inhibition of formation of LMB–RanGTP complex (Zhao et al. 2004) only partially reduces export.

After docking at the entrance of the central channel, the BR mRNP strikingly changes structure as it continues to move into the channel and adopts a more extended shape, approximately 25 nm in diameter (Fig. 5c and c′). The NPC channel allows almost free passage of molecules smaller than 5 nm in diameter (Feldherr and Akin 1997; Keminar and Peters 1999) and appears to have an upper limit of transport of complexes with a diameter of 39 nm (Wente and Rout 2010). The structural rearrangement of the BR mRNP may be a consequence of this physical size limit and the translocation process. Compositional changes could facilitate these structural rearrangements. The BR mRNPs move through the NPC channel with their 5′ ends first (Fig. 5c and c′) (Mehlin et al. 1992, 1995; Visa et al. 1996a). This may be a general property of mRNPs since it has also been observed for dystrophin mRNPs (Mor et al. 2010). The molecular details are not known. It is possible, though, that a number of proteins leave the mRNP at this point. In the case of the BR mRNP, immuno EM studies suggest that the TREX components UAP56 and Aly (Kiesler et al. 2002), the splicing regulator hrp23/RSF1 (Sun et al. 1998) and hrp59/hnRNP M (Kiesler et al. 2005) all leave the BR mRNP during docking or translocation through the NPC. It is also possible that proteins important for export, such as RAE1, interact transiently and specifically with the BR mRNP at this stage (Sabri and Visa 2000).

The translocation through the NPC channel is rapid. Real-time visualisation of individual mRNPs during export showed that the export time was about 0.5 s (Mor et al. 2010 and references therein). The movement of the mRNP through the NPC was 15-fold faster than simple diffusion, suggesting that the translocation is facilitated. Several mechanisms for translocation have been put forward (reviewed in Peters 2009).

At the cytoplasmic side of the NPC, BR mRNPs do not regain their spherical interchromatin structure. Instead, they form extended mRNP fibrils. In general, it has been suggested that a reorganisation of mRNPs takes place at the cytoplasmic side of the NPC. The DEAD-box helicase Dbp5 is enriched at this location. Dbp5 is important for the export process (Alcazar-Roman et al. 2006; Weirich et al. 2006), presumably by remodelling the mRNP and simultaneously releasing proteins, such as NXF1:NXT1, thereby creating a directionality of the export (Lund and Guthrie 2005). Dbp5 binds to RNA and to several protein partners that are present at the cytoplasmic side of the NPC (Moeller et al. 2009). Nup214 (Nup159 in yeast) binds Dbp5, but only in the RNA-free form. It is therefore possible that Dbp5 could be delivered from Nup214 to the mRNP. Immuno EM studies of translocating BR1 and BR2 mRNPs suggest that Dbp5 is added to the mRNP at this stage (Zhao et al. 2002). Gle1 and its cofactor InsP6, which are associated with another nucleoporin and positioned close to Nup214, stimulate the Dbp5 ATPase activity. This results in a local activation of Dbp5, leading to mRNP remodelling and release of proteins. One such protein is the nuclear poly(A)-binding protein that is lost from the BR mRNP at NPC translocation or shortly thereafter (Bear et al. 2003). This would be consistent with the in vitro finding that Dbp5 can remodel mRNPs and displace the yeast Nab2 protein (Tran et al. 2007).

Dbp5 has been shown to also bind to pre-mRNPs cotranscriptionally in yeast (Estruch and Cole 2003) and to BR pre-mRNPs (Zhao et al. 2002). Such ‘pre-bound’ Dbp5 could function at the NPC. It is also possible that Dbp5 is involved in remodelling of the pre-mRNP in the nucleus, for example during cotranscriptional assembly.

The routes taken by mRNPs in the cytoplasm can be different. For efficiently translated mRNPs, such as BR mRNPs, initiation of translation can occur immediately after entrance into the cytoplasm (Fig. 5d and d′). CBP20, and presumably the entire CBC, is lost from the BR mRNP during translocation (Visa et al. 1996a). Furthermore, the translation initiation factor eIF4H (Björk et al. 2003) becomes associated with the BR mRNP at the cytoplasmic side of the NPC. Morphologically, ribosomes can be seen associated with the extended BR mRNP fibril (Mehlin et al. 1992). This implies that a pioneer round of translation and establishment of polysomes can occur already in the perinuclear region.

The NPC translocation process is an important step in transforming the intranuclear mRNP into a functional cytoplasmic mRNP. Understanding the molecular events at the NPC that decipher the information built into the nuclear mRNP composition and structure will be important for understanding gene expression. The intranuclear mRNP enters the NPC via the basket as an export competent mRNP, and during translocation, conformational and compositional changes are triggered. Changes occur at both the 5′ and 3′ ends of the mRNP. Export receptors and other proteins are displaced. The cotranscriptional assembly is, however, not entirely replaced. The EJC, hnRNP proteins and SR proteins are examples of cotranscriptionally added components that will influence localisation, translation and quality control in the cytoplasm.

Conclusions and future perspectives

The making of an export and translation competent mRNP requires the coordinated action of many multi-component protein and protein–RNA complexes. At the active gene locus and at the NPC, important events integrate the formation, processing and modification of the mRNP before delivery into the cytoplasm. Interactions between components in the different molecular machines and multiple roles for individual components at several steps result in a streamlined synthesis, processing and export pathway. In addition, the interplay between subcomponents improves efficiency and makes quality control possible at several steps.

Identical molecular components are present in different organisms, but the pathway of mRNP biogenesis clearly varies, presumably emphasising specific processes depending on the cell type or experimental system. We therefore need to analyse principles and mechanisms in favourable systems and learn how these mechanisms operate in specific cases and in general. We furthermore need experimental tools to better analyse the formation, possibly pre-formation, and reuse of the multi-molecular machines in intact cell nuclei. It is still largely unknown how integrated, efficient and regulated action of the machines and processes is brought about in the cell nucleus.