Introduction

When a pre-mRNA molecule is transcribed from DNA, the resulting molecule includes several exons (a term coined from expressed or exported sequence) and introns (for intragenic or intervening sequence). The task of the splicing process is to remove the introns and join the exons together to obtain the mature mRNA molecule; for general reviews on the subject the reader is referred to previous publications [1, 2]. The removal takes place through two transesterification reactions (Fig. 1), and the highly complex cellular machinery that performs this process is called the spliceosome [3–6]. The end result is a mature mRNA that can be divided into three regions: the protein coding sequence (ORF), the 5′-untranslated region (5′UTR), and the 3′-untranslated region (3′UTR). It is this final product that is exported to the cytoplasm and translated into the required protein. Because exons were by definition considered to be the important pieces of a gene, researchers have historically concentrated on studying exons rather than introns. For a very long time, in fact, introns were considered “junk” with no function other than separating exons from each other. Their presence was justified as allowing the generation of protein variants through alternative splicing [7–9] or, from an evolutionary point of view, as allowing novel combinations of protein domains [10, 11]. We now know that introns are not simply passive players in the regulation of gene expression: within their sequences they can host splicing enhancers and silencers [12, 13], regulatory pseudoexons [14], microRNA (miRNA) precursor sequences [15, 16], and small nucleolar RNAs (snoRNAs) [17]. Moreover, the importance of introns is not confined to the human species. For example, Castillo-Davis and coworkers studied possible links between gene expression data and gene structure by whole-genome analysis in humans and C. elegans and found evidence of natural selection for short introns in highly expressed genes in both organisms, but not for the absence or loss of introns [18]. In yeast, introns are present in only about 4% of all S. cerevisiae genes (most of which have a single intron), yet many essential genes contain them, such as the ribosomal protein genes (which account for 90% of all mRNA transcripts from intron-containing genes), and about 27% of the mRNA molecules made each hour in the cell are derived from intron-containing genes [19–21]. This means that the stimulatory effect of introns on gene expression cannot be neglected even in yeast [22, 23].
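The yeast figures quoted above can be turned into a rough per-gene comparison. The following is a minimal back-of-the-envelope sketch, assuming that the two percentages (4% of genes, 27% of mRNA molecules) refer to the same gene set and the same mRNA pool; the function name is hypothetical and the result is only an illustrative estimate, not a value reported in the cited studies.

```python
# Back-of-the-envelope check of the yeast figures quoted in the text.
# Assumption: both percentages describe the same gene set and mRNA pool.
INTRON_GENE_FRACTION = 0.04  # share of S. cerevisiae genes containing introns
INTRON_MRNA_FRACTION = 0.27  # share of mRNA molecules derived from those genes


def per_gene_output_ratio(gene_fraction: float, mrna_fraction: float) -> float:
    """Average mRNA output of an intron-containing gene relative to an
    intronless gene, given the two population-level fractions."""
    with_introns = mrna_fraction / gene_fraction
    without_introns = (1.0 - mrna_fraction) / (1.0 - gene_fraction)
    return with_introns / without_introns


ratio = per_gene_output_ratio(INTRON_GENE_FRACTION, INTRON_MRNA_FRACTION)
print(f"Relative per-gene mRNA output: {ratio:.1f}-fold")
```

Under these assumptions, an average intron-containing yeast gene would contribute roughly nine times as many mRNA molecules as an average intronless gene, which is consistent with the observation that this class is dominated by highly expressed ribosomal protein genes.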

Fig. 1

Schematic representation of the splicing process. A pre-mRNA molecule consists of exons and introns, which are defined by conserved sequences called the 5′ and 3′ splice sites. In the initial step of the splicing process, the U1 snRNA, as part of the U1 small nuclear ribonucleoprotein (U1 snRNP), recognizes the 5′ splice site, and the U2 snRNA within the U2 snRNP binds to the branch site near the 3′ splice site. Three other snRNPs, named U6, U4, and U5, bind sequentially to the pre-mRNA, thus forming the large dynamic nucleoprotein complex called the “spliceosome”, which catalyzes intron excision from the pre-mRNA. Once all the exons are joined together, the mature mRNA molecule is ready to be exported and translated in the cytoplasm.

It is not surprising, therefore, that in complex multicellular organisms most genes contain at least one intron. For example, in humans 94% of genes are interrupted by, on average, seven introns [24, 25]. Moreover, genes interrupted by introns are generally expressed at a significantly higher level in mammalian cells than the same genes lacking introns [26–32]. Indeed, it was demonstrated over 20 years ago, not only in mammalian cells but also in plants, that the expression of some genes appears to have a strong requirement for the presence of an intron (i.e., they are intron dependent) [26, 33–37]. The reason for this long remained unclear because, until a few years ago, it was believed that only very limited contact existed between the splicing process and the other processes that regulate gene expression. Now, however, it is quite clear that all these processes can influence each other to a considerable degree; for reviews on the subject the reader is referred to Maniatis and Reed [38] and Le Hir et al. [39].

In the following paragraphs we briefly summarize the effects of splicing on various other steps in the regulation of gene expression (and vice versa), especially those that may be of interest for the biotechnology field. These are also summarized schematically in Fig. 2 for easier reference.

Fig. 2

Functional coupling between splicing and other steps in gene expression regulation. The functional coupling between splicing and other steps of gene expression regulation spans the entire mRNA life cycle. For example, at the gene/chromatin level it has recently been observed that nucleosomes are positioned on exons rather than introns and may help in the co-transcriptional recruitment of splicing factors. Other important connections are: 1. Reciprocal activation of transcription and splicing occurs at the level of transcriptional initiation, elongation and termination through the interaction of splicing factors with the transcriptional machinery (RNA polymerase II and other transcription factors). 2. Upon splicing, exon–exon junctions are marked with the multiprotein complex called the EJC. The proteins that form the EJC have been described to play a role in at least four different processes: mRNA splicing (RNPS1, SRm160, Pinin), mRNA export (TAP/p15, REF/Aly, UAP56), mRNA localization (eIF4AIII, Y14, Magoh) and nonsense-mediated decay (NMD) (Upf1, Upf2, Upf3). 3. Upon export of the mature mRNA to the cytoplasm, further rearrangement of the EJC protein complex occurs. The EJC proteins RNPS1, Y14, and Magoh mediate polysome association with mRNA and translational enhancement, together with other nucleocytoplasmic shuttling candidates such as SR proteins, U2AF and hnRNPs.

Splicing and Transcription

In vitro splicing assays initially showed that phosphorylation of RNA polymerase II (RNA Pol II) was capable of strongly activating splicing [40]. The first indications that these two processes might be able to talk to each other came from studies showing that several well-known splicing regulatory factors were also capable of interacting directly with RNA Pol II. For example, the SR protein family of splicing factors [41] and the U1 snRNP [42–44] were immunopurified with RNA Pol II, demonstrating their interaction in vivo. Evidence has also accumulated that U1 snRNA acts as a functional component of the transcription factor TFIIH in regulating transcriptional initiation [45] and that snRNPs in general interact with the transcriptional elongation factor TAT-SF1 [46].

Compelling evidence has now accumulated regarding the ways in which transcription rates can influence the splicing process. This influence can be exerted in two basic ways: either by affecting the timing with which splice sites are presented to the splicing machinery [47] or through the recruitment of specific splicing factors to the RNA Pol II C-terminal region [48]. Both issues have recently been reviewed by Kornblihtt, and the reader is referred to that publication for further discussion [49]. More recently, studies focused on nucleosome positioning have also shown that exons, rather than introns, are marked by nucleosomes, and this may help the co-transcriptional recruitment of splicing factors [50–52]. Finally, although still mostly at the hypothetical stage, it has been suggested that chromatin features such as methylation and acetylation may also affect the ability of the spliceosome to distinguish exons from introns, depending on the presence of epigenetic signatures [53].

Splicing and mRNA Export

Another essential step for good protein expression, and one that will be reflected in the final outcome, is the export of the mature mRNA from the nucleus to the cytoplasm. Early experiments performed using Xenopus oocytes demonstrated that some spliced mRNAs are exported to the cytoplasm more rapidly than identical mRNAs transcribed from cDNA [54]. This is also true for mammalian cells, where introns have been shown to alter the nucleocytoplasmic distribution of a particular mRNA [29, 55, 56]. More recently, characterization of the splicing mRNP complex has revealed that some of the key mRNA export factors are recruited to the mRNA by members of the splicing machinery, and several studies have identified proteins that can act to couple the splicing and mRNA export apparatus [57–60].

In particular, a key event in this coupling has been found to rely on the sequence-independent deposition of multiple proteins 20–24 nucleotides upstream of mRNA exon–exon junctions as a consequence of splicing. This was initially discovered in HeLa cells and in Xenopus oocytes [61, 62] and involves mRNA export factors such as REF1, some of which remain bound to the mRNA after export to the cytoplasm [59] (for review see Reed and Hurt [63]). This complex is now referred to as the exon junction complex (EJC) and may also play a role in directly influencing translation and not just export, as reviewed by Tange et al. [64]. More generally, Wiegand et al. [65] demonstrated that splicing fails to enhance gene expression if formation of the EJC is prevented.
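The positional rule described above (deposition 20–24 nucleotides upstream of each exon–exon junction) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not a model of EJC assembly: the function name is hypothetical, coordinates are 1-based positions on the spliced mRNA, the last nucleotide of the upstream exon is counted as 1 nt upstream of the junction, and windows that would not fit entirely within the upstream exon are simply skipped for illustration.

```python
from typing import List, Tuple


def ejc_windows(exon_lengths: List[int],
                upstream_min: int = 20,
                upstream_max: int = 24) -> List[Tuple[int, int]]:
    """Return (start, end) 1-based inclusive positions on the spliced mRNA
    lying upstream_min..upstream_max nt upstream of each exon-exon junction."""
    windows = []
    junction = 0  # 1-based position of the last nucleotide of the upstream exon
    for length in exon_lengths[:-1]:  # the last exon has no downstream junction
        junction += length
        exon_start = junction - length + 1
        start = junction - upstream_max + 1
        end = junction - upstream_min + 1
        # Illustrative simplification: ignore exons too short to hold the window.
        if start >= exon_start:
            windows.append((start, end))
    return windows


# Hypothetical three-exon transcript with exons of 120, 90 and 200 nt.
print(ejc_windows([120, 90, 200]))  # [(97, 101), (187, 191)]
```

Because an intronless cDNA has no exon–exon junctions, the same function returns an empty list for a single-exon input, which mirrors the point made in the text that EJC deposition depends on a splicing event.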

It should be noted, however, that this subject is still very much open to debate. In fact, several studies also indicate that splicing may not be essential for nuclear export of most mRNAs [27, 28, 66]. For example, RNA interference-mediated depletion of REF1 and other known EJC proteins (DEK, SRm160 or Y14) caused no significant nuclear accumulation of poly(A) RNAs, indicating that they are not crucial for mRNA export in Drosophila cells, whereas NXF1 (TAP) and UAP56 are (as in S. cerevisiae and C. elegans) [66]. Furthermore, data from yeast demonstrate that recruitment of mRNA export factors occurs not only during splicing but also cotranscriptionally, and that splicing is not needed for efficient nuclear export of mRNAs [67–69] or for nuclear export of naturally intronless transcripts [70].

Taken together, these data suggest that eukaryotic cells probably contain several pathways for recruiting export factors to mRNA transcripts. However, experimental data also support the view that at least one export pathway can be triggered by splicing. For these reasons, promoting EJC deposition on a gene of interest may be a very useful strategy to keep in mind when optimizing biotechnological processes aimed at increasing recombinant protein yields.

Splicing and mRNA Stability

In addition to transcription and export, a few examples have been documented in which the presence of introns within genes is advantageous in terms of enhanced mRNA stability. For example, work on SV40 late pre-mRNAs has given some indications that the presence of introns stabilizes the primary transcript within the nucleus and mediates the efficient transport of mRNA to the cytoplasm [29]. More recently, a comparison of mRNA stabilities between different human genes found that genes with introns have more stable mRNAs than genes lacking introns, and that there is also a positive correlation between intron number and mRNA stability in Arabidopsis thaliana cells [71]. Finally, several members of the hnRNP family of proteins have been described to play roles in translational regulation by affecting mRNA stability [72, 73].

Splicing and Translation

From a biotechnology point of view, a very important issue is the possibility of direct connections between splicing and the translation process. Early evidence that these two processes might influence each other came from experiments in which mRNA was injected into Xenopus oocyte nuclei; these showed that the presence of an active intron inserted into the 3′ UTR of chloramphenicol acetyl transferase pre-mRNA was a major factor that could influence its translation in the cytoplasm [74].

Since then, the presence of an intron in pre-mRNA has been observed to generally increase the translational efficiency of the spliced mRNA product in the cytoplasm [27, 28, 75–77]. In some cases this enhancement is achieved by increased association of the spliced mRNA with polysomes, which is mediated at least in part by the deposition of the EJC following intron removal (see above). This has been directly tested in HeLa cells, where both the enhanced translational yield and polysome association could be reproduced by tethering three different EJC proteins, RNPS1, Y14 and Magoh, to an intronless reporter (TCR-β and β-globin constructs) [78]. However, this may not be the only means, and other candidates that could mediate translation efficiency are also well known for their splicing regulatory abilities. For example, SR proteins [79, 80] have recently been described as potent stimulators of translation [81], although they can also act as translational inhibitors of their own expression, as has recently been described for the most famous member of the family, ASF/SF2 [82]. In a similar manner, several members of the hnRNP family of proteins that are also involved in pre-mRNA splicing regulation have been described to affect translation directly, for example by inhibiting formation of the 80S ribosomal complex [83].

Importance of pre-mRNA Splicing in Biotechnology

The creation of highly productive cell lines involves many choices: the vector and host cell line, media optimization, maximization of transcription levels through promoters with high intrinsic activity, polyadenylation signals, inclusion of 5′ UTR introns, chromatin opening elements, DNA elements that counteract endogenous repression, and improvement of translational efficiency with translational enhancers (TEE technology). Quite understandably, bioengineers sometimes have difficulty exploiting such diversity to its full extent, because each step requires a distinct set of strategies in order to capitalize on it in a synthetic system. For this reason, in our opinion it is important to provide some practical examples showing how the inclusion of introns can boost the expression of recombinant genes in vitro and in vivo. These examples also contain some useful “tips” that should be taken into account, especially with regard to the choice of intron elements to be introduced and the intron position.

1. The first example concerns the truncated spike protein (Str2) of Severe Acute Respiratory Syndrome Coronavirus, whose expression in mammalian cells is important for vaccine development. In this case, insertion of a 5′ UTR intron (the 138 bp intron of pIRES, Clontech) increased the protein expression yield by 1.9-, 2.5-, and 4.1-fold in Vero E6, QBI-293A and CHO cells, respectively (Fig. 3a). Higher protein expression correlated with a higher total RNA level for the intron-containing construct, which was further demonstrated to be due at least in part to an increase in the rate of RNA transcriptional elongation [84].

Fig. 3

The beneficial effects of introns on recombinant gene expression. Three examples in which splicing has been used to obtain a beneficial effect on biotechnological processes. a Increased STR2 protein expression in transfected cells (293, CHO, Vero) can be achieved by the addition of a 138 bp intron sequence within the 5′ UTR of the STR2 gene. Intronic sequences can also be used to improve the expression of naturally intronless genes. b Insertion of an intron at position 189 within the β interferon gene successfully increased mRNA and protein levels in transfected CHO and HeLa cells. c A schematic view of the Creator Splice system. In this case, a standard acceptor vector from Clontech was modified by insertion of the SD/intron. The Creator Splice Acceptor and Donor Vectors are recombined in vitro in the presence of Cre recombinase. The resulting expression vector has an intron starting with the SD/intron from the acceptor vector and ending with the SA from the donor vector. Upon transfection of mammalian cells, transcription occurs and splicing removes the intron. This system has been observed to give enhanced protein expression in comparison with the original Creator system without an intron. bac bacterial promoter, CmR chloramphenicol resistance ORF, kan/neoR kanamycin and neomycin resistance gene, P promoter, SA splice acceptor, SD splice donor.

2. In the second case, when recombinant human ceruloplasmin was expressed starting from cDNA in baby hamster kidney cells it gave a very low yield due to nuclear retention of mRNA. This problem was solved by inserting the second intron of the rabbit β-globin gene upstream of the human ceruloplasmin cDNA. This action was able to alleviate the block of cytoplasmic export and significantly increased recombinant protein expression [55].

3. In the third example, Zago et al. [85] tested the effect of introducing two well-characterized introns, an α-tropomyosin-derived intron and the pCI-neo intron, into a naturally uninterrupted human gene, interferon beta (IFN beta). Both introns were designed to introduce stop codons into the IFN beta sequence in order to prevent translation from the unspliced mRNA. Both introns were short, 114 bp and 172 bp, with efficient splicing behavior in vivo and in vitro (Fig. 3b). In this work, Zago et al. also tested different intron positions experimentally to obtain the highest impact on subsequent protein production; candidate positions were chosen on the basis of the cDNA homology between interferon beta and the loosely related interferon gamma. Intron insertion at the position with the highest homology increased mRNA and IFN beta protein levels in both CHO and HeLa cells by 1.5- to 2.5-fold. In this respect, it is also important to note that in other systems intron positioning can have a profound effect on gene expression and in some cases may even prove inhibitory [28, 35, 36]. Therefore, introducing introns into biotechnological systems may require some careful experimentation.

Finally, RNA fluorescence in situ hybridization, performed to examine the cellular distribution of the interferon transcripts, detected a few discrete transcription foci for the intron-containing construct, whereas the intronless mRNA showed diffuse staining around the transcription sites, suggesting that introduction of the introns also affected mRNA export efficiency [85]. Therefore, the presence of splicing markedly induced the translational utilization of the mature mRNAs that do reach the cytoplasm.

4. Finally, particular attention must be paid to expression systems that have often been engineered with introns to increase product yield, for example for gene therapy [86]. In fact, many commercially available mammalian expression vectors contain chimeric introns located upstream of the cDNA insertion site to promote a high level of expression. One of the most recent examples of such an improvement has been described by Colwill and collaborators, who modified the Creator recombination system from Clontech by introducing a 5′ intron (360 nucleotides from the adenovirus L1 major late intron) to systematically increase protein expression levels for a wide variety of downstream applications [87] (Fig. 3c). In this case, it should be noted that the inserted intron was specifically chosen for its splicing efficiency and to avoid unwanted side effects (e.g., activation of cryptic elements).
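The design consideration mentioned in the third example, namely that an inserted intron should introduce a stop codon so that unspliced mRNA is not translated, and that the outcome depends on intron position, can be checked computationally before any construct is built. The following is a minimal sketch under stated assumptions: the sequences are short toy sequences invented for illustration (not the introns or the interferon beta cDNA used in the cited work), all function names are hypothetical, and only the reading frame starting at the first nucleotide of the coding sequence is examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_intron(cds: str, intron: str, position: int) -> str:
    """Unspliced sequence: the intron placed after `position` CDS nucleotides."""
    return cds[:position] + intron + cds[position:]


def splice(unspliced: str, position: int, intron_length: int) -> str:
    """Remove the intron again, as splicing would."""
    return unspliced[:position] + unspliced[position + intron_length:]


def first_in_frame_stop(seq: str) -> int:
    """0-based index of the first in-frame stop codon (frame 0), or -1."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return -1


def intron_blocks_unspliced_translation(cds: str, intron: str, position: int) -> bool:
    """True if translation of the unspliced RNA stops at a codon that
    overlaps the intron, i.e. before the downstream exon is read."""
    stop = first_in_frame_stop(insert_intron(cds, intron, position))
    return stop != -1 and stop + 2 >= position and stop < position + len(intron)


# Toy sequences (assumptions for illustration only).
cds = "ATGGCTGCTAAAGGTTAA"        # ATG GCT GCT AAA GGT TAA
intron = "GTAAGTTGACTTTTTCAG"     # begins with GT, ends with AG, contains TGA

for pos in (9, 10):
    unspliced = insert_intron(cds, intron, pos)
    assert splice(unspliced, pos, len(intron)) == cds  # splicing restores the CDS
    print(pos, intron_blocks_unspliced_translation(cds, intron, pos))
```

In this toy case the intron terminates translation of the unspliced RNA when inserted after nucleotide 9 but is read through when inserted after nucleotide 10, which illustrates why intron position has to be evaluated together with the intron sequence.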

Future Prospects and Conclusions

The rapid expansion of global manufacturing capacity represents one of the major challenges faced today by biotech researchers. The therapeutic need to move from microgram to milligram doses of recombinant protein has literally changed the face of biotechnology in recent years. In this review, we propose that one way for biotech practitioners to improve their expression systems is to add an “intronic dimension” to the system. This will allow them to exploit more efficiently several transcriptional and post-transcriptional mechanisms that would otherwise be ineffective when expressing intronless transcripts. In fact, even though the presence of introns in transcripts is not mandatory for mammalian gene expression, the examples reported in this review show that their presence may significantly improve the productivity of low-producing systems, helping to make them financially viable. In addition, even for higher producers, introducing intron processing events may be a useful tool to fine-tune productivity. Last but not least, the exploitation of splicing events can be readily adapted to any system of interest at minimal cost. It may still sound a bit counterintuitive to add complex intron structures to biotech production systems, which always strive for simplicity in order to lessen the chance of things going wrong. However, adding elements to a system should sometimes be considered not just as increasing complexity, but rather as improving efficiency and/or organization.