Competition of RNA splicing: line in or circle up

The advent of high-throughput technologies has revealed that mammalian genomes are pervasively transcribed, mostly into long noncoding RNAs (lncRNAs, at least 200 nt long). Thousands of lncRNAs from intergenic regions (large intergenic noncoding RNAs, lincRNAs) have been uncovered by deep sequencing of polyadenylated (poly(A)+) RNAs, combined with multiple chromatin landscapes. These lncRNAs are messenger RNA (mRNA)-like, with the linear signatures of 5′ m7G caps and 3′ poly(A) tails. Unexpectedly, mammalian transcriptomes are even more complex, as they also express RNAs without poly(A) tails (poly(A)– RNAs) [1], which has led to the identification of new lncRNA formats, such as circular RNAs.
Because they are covalently closed and lack 3′ poly(A) tails, circular RNAs were missed by most transcriptome analyses, which mainly targeted polyadenylated RNAs. By taking advantage of deep sequencing of the nonpolyadenylated RNA population [1], thousands of circular RNAs were found to be widely expressed in human cell lines. At least two types of circular RNAs are processed from pre-RNA splicing: one is derived from spliced introns (circular intronic RNAs) [2] and the other from back-spliced exons (exonic circular RNAs) [3]. Circular intronic RNAs (ciRNAs) are produced from introns that escape debranching after splicing and remain covalently circularized by a 2′,5′-phosphodiester bond between the splice donor site and the branch point site. The formation of ciRNAs can be reconstituted with expression vectors and requires consensus motifs flanking the 2′,5′-phosphodiester bond. Importantly, ciRNAs were shown to play a cis-regulatory role in local gene expression [2].
Exonic circular RNAs (circRNAs) are produced by back-spliced circularization [3]. Unlike normal RNA splicing, which joins an upstream splice donor site to a downstream splice acceptor site and yields a linear RNA transcript (Figure 1A), back splicing joins a downstream splice donor site in reverse order to an upstream splice acceptor site, yielding a circular RNA transcript with a 3′,5′-phosphodiester bond at the junction site (Figure 1B). In past decades, only a handful of circRNAs were identified, and they were regarded as nonfunctional byproducts of splicing errors. Only recently did genome-wide profiling of nonpolyadenylated RNAs or RNase R-enriched RNAs reveal, surprisingly, that circRNAs are widely expressed across a spectrum of cell lines and species. However, the detailed mechanism of circRNA formation and direct evidence for its biogenesis have remained elusive, despite a distinct association of circRNAs with long flanking introns and Alu elements.
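The difference in junction order between normal splicing and back splicing can be made concrete with a minimal Python sketch. It is an illustration only, not the method of any published pipeline: the `Junction` class, its field names, and the coordinates used are hypothetical, and the only rule encoded is the one stated above (a donor that lies downstream of its acceptor in transcript order indicates back splicing).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction in genomic coordinates (hypothetical toy model)."""
    donor: int     # position of the splice donor (5' splice site)
    acceptor: int  # position of the splice acceptor (3' splice site)
    strand: str    # "+" or "-"


def classify_junction(j: Junction) -> str:
    """Return "linear" if the donor lies upstream of the acceptor in
    transcript order, and "back-splice" if it lies downstream."""
    if j.strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if j.donor == j.acceptor:
        raise ValueError("donor and acceptor must differ")
    # Transcript order follows increasing coordinates on the "+" strand
    # and decreasing coordinates on the "-" strand.
    if j.strand == "+":
        donor_is_upstream = j.donor < j.acceptor
    else:
        donor_is_upstream = j.donor > j.acceptor
    return "linear" if donor_is_upstream else "back-splice"


# Made-up coordinates, for illustration only.
print(classify_junction(Junction(donor=1000, acceptor=2000, strand="+")))
print(classify_junction(Junction(donor=3000, acceptor=1500, strand="+")))
```

With these made-up coordinates, the first junction follows the normal donor-to-acceptor order and the second has the reversed order expected at a circRNA junction. A real analysis would also need read alignment, annotation of exon boundaries, and filtering of artifacts, none of which is modeled here.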
Very recently, with a highly efficient computational pipeline (CIRCexplorer) that identifies junction reads from back-spliced exons, thousands of circRNAs were retrieved from poly(A)– and/or RNase R-treated poly(A)– RNA-seq data from the human embryonic stem cell line H9. In addition, genomic characteristics show that circRNA formation is in general coupled with RNA splicing, and that circularized exons are preferentially flanked by long introns containing juxtaposed Alu elements in opposite orientations, which allows pairing of inverted repeated Alu elements (IRAlus). Importantly, the formation of circRNAs can be recapitulated with expression vectors in transfected human and mouse cells. The flanking intronic complementary sequences, either repetitive (mainly Alu elements in humans) or non-repetitive, were demonstrated to play essential roles in exon circularization (Figure 1B). Strikingly, exon circularization efficiency is regulated by the competition of RNA pairing, mainly between IRAlus in humans. RNA pairing within individual introns promotes normal splicing to form a linear RNA transcript (line in). In contrast, RNA pairing across flanking introns preferentially associates with back splicing, resulting in the formation of a circular RNA transcript (circle up). Finally, the alternative formation of IRAlus, from widely distributed and orientation-opposite Alus in human introns, and the competition among them lead to alternative circularization (AC), resulting in multiple circular RNA transcripts from a single gene locus. Notably, this also suggests a previously under-appreciated role of intronic Alus in the regulation of gene expression [3].
Clearly, additional questions about circRNA biogenesis and function remain to be answered. For instance, what other elements or factors are involved in circRNA formation in addition to the flanking complementary sequences? So far, one RNA-binding protein, muscleblind, has been reported to promote exon circularization by binding to flanking introns [4]. By what mechanism does the splicing machinery choose to produce a circRNA or a linear RNA? How is back-spliced circularization coordinated with transcription and splicing? Can circRNAs in mammals be translated? Importantly, what do circRNAs do in cells? Although circRNAs were suggested to function as miRNA sponges [5,6], only a few circRNAs have been demonstrated to contain multiple binding sites that trap one particular miRNA. In fact, the majority of circRNAs have only been determined bioinformatically and lack functional annotation. Uncovering their functional implications will therefore shed new light on circRNA study. Notably, complementary sequences and/or the competition of RNA pairing between them are evolutionarily dynamic, leading to the species-specific expression of circRNAs [3]. Thus, it will be of interest to profile circRNAs and their flanking complementary sequences among species, which will provide useful clues for functional analyses of these molecules during evolution.
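As a starting point for profiling flanking complementary sequences, candidate Alu pairings can be tallied from element orientations alone. The Python sketch below is a toy illustration under strong assumptions: each intron is reduced to a hypothetical list of Alu orientations ("+" or "-"), any two Alus in opposite orientations are counted as a potential inverted pair, and distance, sequence divergence, and non-repetitive complementary sequences are ignored. The function names are invented for this example, and the counts are not a predictor of circularization efficiency.

```python
from itertools import combinations


def _check(orientations):
    """Validate a list of Alu orientations."""
    for o in orientations:
        if o not in ("+", "-"):
            raise ValueError("orientation must be '+' or '-'")


def inverted_pairs_within(orientations):
    """Count pairs of opposite-orientation Alus inside one intron."""
    _check(orientations)
    return sum(1 for a, b in combinations(orientations, 2) if a != b)


def inverted_pairs_across(upstream, downstream):
    """Count pairs of opposite-orientation Alus that span the two
    introns flanking an exon (one Alu from each intron)."""
    _check(upstream)
    _check(downstream)
    return sum(1 for u in upstream for d in downstream if u != d)


def summarize_pairing(upstream, downstream):
    """Tally candidate pairings that would compete: within-intron
    pairs ("line in") versus across-intron pairs ("circle up")."""
    within = inverted_pairs_within(upstream) + inverted_pairs_within(downstream)
    across = inverted_pairs_across(upstream, downstream)
    return {"within": within, "across": across}


# Hypothetical orientations for the introns flanking one exon.
print(summarize_pairing(upstream=["+", "-"], downstream=["-"]))
```

In this hypothetical locus, one candidate pair lies within the upstream intron and one spans the flanking introns, so the two kinds of pairing would compete. Running the same tally on orthologous loci is one simple way to begin comparing flanking Alu arrangements between species.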
Collectively, the discovery of widespread circRNA expression further expands our knowledge of gene expression regulation. Besides the well-known linear mature RNAs formed by splicing (Figure 1A), circular RNAs can be produced from RNA precursors through back-spliced circularization (Figure 1B). Just as multiple linear mRNAs are processed from one locus by alternative splicing, a number of circRNAs can be produced from a given gene locus by alternative circularization. As most ciRNAs and circRNAs are not accessible for protein coding, these widely expressed circular RNAs further enlarge the ever-expanding category of lncRNAs and suggest a whole new level of complexity in our transcriptomes.