Processing and transcriptome expansion at the mRNA 3′ end in health and disease: finding the right end
The human transcriptome is highly dynamic, with each cell type, tissue, and organ system expressing an ensemble of transcript isoforms that give rise to considerable diversity. Apart from alternative splicing affecting the “body” of the transcripts, extensive transcriptome diversification occurs at the 3′ end. Transcripts differing at the 3′ end can have profound physiological effects by encoding proteins with distinct functions or regulatory properties or by affecting the mRNA fate via the inclusion or exclusion of regulatory elements (such as miRNA or protein binding sites). Importantly, the dynamic regulation at the 3′ end is associated with various (patho)physiological processes, including the immune regulation but also tumorigenesis. Here, we recapitulate the mechanisms of constitutive mRNA 3′ end processing and review the current understanding of the dynamically regulated diversity at the transcriptome 3′ end. We illustrate the medical importance by presenting examples that are associated with perturbations of this process and indicate resulting implications for molecular diagnostics as well as potentially arising novel therapeutic strategies.
Keywords: Alternative cleavage and polyadenylation (APA); Disease; Post-transcriptional gene regulation; RNA 3′ end maturation; Transcriptome diversity; Cancer
The mRNA and protein isoforms produced by alternative processing of primary RNA transcripts differ in structure, function, localization, or other properties [13, 120, 183]. Alternative splicing affects more than half of all human genes and represents a primary driver of the evolution of phenotypic complexity in mammals [15, 80, 95]. Across individuals, changes in normal isoform structure can have phenotypic consequences and have been associated with disease . With the advent of next-generation RNA sequencing technologies, it became apparent that not only the “body” of transcripts but also the mRNA 3′ end is affected by enormous diversity. Up to 70 % of the transcriptome undergoes alternative mRNA 3′ end processing . This for example affects numerous genes during the stress response  or after T and B cell activation . The medical significance is highlighted by complex disorders that are associated with alternative cleavage and polyadenylation (APA), i.e., in the susceptibility to systemic lupus erythematosus  or more globally in tumorigenesis [119, 121]. Yet, up to now, the functional role of widespread APA in disease processes is still enigmatic.
In the following, we will briefly present the mechanistic key features of 3′ end formation since several reviews cover the basics of 3′ end formation in greater detail [28, 43, 60, 88, 143, 180, 185, 198]. We will illustrate the medical perspective of 3′ end processing and show how disease can be caused by perturbations of canonical mRNA processing in cis and trans. We will then present how alternative 3′ end processing contributes to the complexity of the transcriptome and thereby affects important cellular functions in physiological as well as pathophysiological conditions. Finally and most intriguingly, we will discuss to what extent alternative cleavage and polyadenylation represent the driver or passenger in the pathogenesis of human disorders.
General principles of canonical 3′ end processing—the eukaryotic mRNA 3′ end cleavage and polyadenylation machinery
mRNA 3′ end processing is tightly connected to transcription and splicing to ensure proper mRNA maturation and to maintain genome integrity
Messenger RNA 3′ end processing is well-orchestrated and interconnected with transcription, splicing, and translation [70, 114, 143] (Fig. 1b): As the pre-mRNA transcript emerges from POL2, extensive constitutive and alternative splicing events occur co-transcriptionally giving rise to a perplexingly high transcriptome diversity. Eventually, transcription termination pauses the elongating polymerase triggered by recognition of poly(A) signals in the nascent transcript [143, 148], and both CPSF and CSTF are transferred by the carboxy-terminal domain (CTD) of POL2 to their specific pre-mRNA binding sites to produce the mRNA 3′ end. The phosphorylation of serine and threonine residues within the CTD regulates gene expression ; it coordinates the recruitment of RNA processing factors (including the 5′ capping complex, the spliceosome, and 3′ processing machinery) and regulates chromatin organization by histone methylation .
The extensive crosstalk between the different pre-mRNA processing activities (capping, splicing, and polyadenylation) is crucial for gene expression and, importantly, genome integrity (Fig. 1c): Proteins binding to the cap structure of pre-mRNAs interact with splicing factors and promote recognition of the cap-proximal splice site. Conversely, splicing factors associating with the 3′ terminal intron interact with downstream polyadenylation factors to mutually promote 3′ end cleavage/polyadenylation and terminal intron splicing [5, 36, 65, 93, 100, 109, 122, 125, 126, 134, 175, 184]. These interactions ensure the recognition of the correct splice sites and the timely, accurate, and efficient 3′ end processing. Moreover, the extensive integration between the different co-transcriptional mechanisms protects chromosomes from potentially deleterious effects, which could arise from interaction between the nascent RNA and template DNA during transcription . Finally, functional polyadenylation sites and polyadenylation factors are required for efficient transcription termination [20, 85, 143, 148] and release of the polyadenylated mRNAs for export from the nucleus . The efficiency of polyadenylation can thus have significant quantitative effects on gene expression in general, and defects of mRNA 3′ end formation can profoundly affect cell viability, growth, and development.
The medical relevance of errors of 3′ end processing is exemplified by different inherited and acquired human disorders. A continuously growing number of reports document the increasing awareness of the mRNA 3′ end becoming a critical constituent for a variety of disorders (for review ).
In the following paragraphs, we highlight the most characteristic hallmarks of constitutive 3′ end processing and—based on a few selected examples—illustrate how alterations of sequence elements or the executing 3′ end processing machinery can result in human pathology. In the second part, we shift to regulated and alternative 3′ end formation and demonstrate its importance in various physiologically and pathophysiologically relevant processes.
When processing goes awry
The polyadenylation signal matters
Numerous mutations affecting cis-regulatory sequence elements required for mRNA 3′ end processing evidence their detrimental role in a variety of human disorders . Intriguingly, a significant factor in deciphering the underlying mechanistic principles of mRNA 3′ end processing was the presence of mutations, which were initially identified in thalassemia patients. Among those patients, different mutations affecting the AAUAAA hexanucleotide of the poly(A) signal were found in the α-  and β-globin genes  (and references therein) and shown to invariably inactivate or severely inhibit gene expression. Similar mutations were observed in, e.g., the Foxp3 gene causing IPEX syndrome , a rare fatal disorder characterized by immune dysregulation, polyendocrinopathy, enteropathy, and X-linked inheritance. Recently, a germline variant in the TP53 polyadenylation signal has been identified in a large genome-wide association study to confer cancer susceptibility. In this gene, a mutation that changes the AATAAA into AATACA results in impaired 3′ end processing of the TP53 mRNA predisposing to prostate cancer, gliomas, and colorectal adenomas .
These and other examples  illustrate the functional importance of the highly conserved poly(A) signal and reveal the devastating consequences of some of these mutations. Although there is some sequence flexibility [9, 173] and even alternative 3′UTR architectures for effective processing , these findings indicate that the hexanucleotide required for recruitment of the CPSF complex represents the “Achilles heel” for disease causing loss-of-function mutations altering 3′ end processing. This is further corroborated by numerous mutations altering polyadenylation in a variety of other disorders (for review ).
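The effect of a point mutation in the hexanucleotide can be illustrated with a minimal sketch that searches a sequence for exact matches of the canonical signal. The function name `find_signal` and both toy sequences are hypothetical and chosen for illustration only; they are not real TP53 or globin sequence, and exact string matching is a simplification that ignores the sequence flexibility mentioned above.

```python
CANONICAL_SIGNAL = "AATAAA"  # DNA form of the AAUAAA hexanucleotide


def find_signal(seq, signal=CANONICAL_SIGNAL):
    """Return the 0-based start positions of every exact match of `signal`."""
    # Accept RNA or DNA input by converting U to T.
    seq = seq.upper().replace("U", "T")
    positions = []
    start = seq.find(signal)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping matches are not missed.
        start = seq.find(signal, start + 1)
    return positions


# Hypothetical toy 3' UTR fragments (assumption, not genomic sequence).
reference = "GCTTGC" + "AATAAA" + "GTCTTGCATTTCA"
variant = "GCTTGC" + "AATACA" + "GTCTTGCATTTCA"  # AATAAA changed to AATACA

reference_hits = find_signal(reference)
variant_hits = find_signal(variant)
```

In this toy case the reference fragment yields one match and the variant yields none, mirroring how a single nucleotide exchange can remove the recognizable signal.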
How about other sequence elements?
3′ end processing depends on various canonical and non-canonical (auxiliary) sequence elements (Fig. 1a). With the exception of the poly(A) hexanucleotide, they are less well conserved and consequently tolerate (“silent”) nucleotide exchanges with greater likelihood.
Nevertheless, examples, which illustrate the function of these elements, exist. This has been first demonstrated for a clinically relevant gain of function mutation of the cleavage site stimulating 3′ end processing : In most cases, endonucleolytic cleavage and polyadenylation of pre-mRNAs occurs predominantly 3′ of a CA dinucleotide. However, in the prothrombin (F2) gene encoding a key blood coagulation factor, the cleavage site is composed of a CG dinucleotide, which is less efficient in promoting the cleavage reaction . As a consequence, a common mutation (F2 20210*A) affecting the most 3′ nucleotide of the F2 mRNA converts the physiologically inefficient cleavage site into the mechanistically most efficient CA dinucleotide [32, 59]. This increases the cleavage site recognition resulting in an approximately twofold enhancement of F2 mRNA and protein expression. This finally causes raised F2 plasma concentrations, which disturb the finely tuned balance between pro- and anti-coagulatory activities and thereby predisposes to thrombosis .
Mutations affecting 3′ end processing were also identified at other positions in the F2 gene [6, 155, 165]. They increase the efficiency of 3′ end processing  either when located at the penultimate position of the F2 3′UTR (F2 20209*T) or further downstream 3′ of the cleavage site in the putative CSTF binding site (F2 20221*T). However, these effects are presumably gene-specific since the putative CSTF binding site in the F2 gene displays an unusually low density of uridine residues when compared to efficiently 3′ end processed mRNAs. Consequently, the introduction of one or more additional uridine residues into that region enhances 3′ end processing, supposedly by facilitating the interaction of CSTF with the pre-mRNA . This illustrates a feature of most PASs, which typically harbor a U-/GU-rich stretch up to 30 nucleotides downstream of the cleavage site to recruit the CSTF complex for efficient 3′ end cleavage and polyadenylation (Fig. 1a).
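The notion of uridine density downstream of the cleavage site can be made concrete with a small sketch. It computes the fraction of uridines in a window of up to 30 nucleotides 3′ of a given cleavage position. The function name, the window handling, and both sequences are assumptions made for illustration; the sequences are invented and do not reproduce the F2 locus.

```python
def downstream_u_fraction(pre_mrna, cleavage_pos, window=30):
    """Fraction of U residues in the `window` nucleotides after `cleavage_pos`.

    `cleavage_pos` is the 0-based index of the first nucleotide downstream
    of the cleavage site (hypothetical convention chosen for this sketch).
    """
    rna = pre_mrna.upper().replace("T", "U")
    region = rna[cleavage_pos:cleavage_pos + window]
    if not region:
        return 0.0
    return region.count("U") / len(region)


# Invented toy sequences: a short upstream part followed by a 30-nt
# downstream region that is uridine-poor (wild type) or carries one
# additional uridine (variant).
upstream = "ACGACG"
wild_type = upstream + "GCAGCACAGC" + "CGAGCAGCAC" + "GGCAGACGCA"
variant = upstream + "GCAGCACAGC" + "CUAGCAGCAC" + "GGCAGACGCA"

wt_fraction = downstream_u_fraction(wild_type, len(upstream))
variant_fraction = downstream_u_fraction(variant, len(upstream))
```

A single added uridine raises the fraction in the toy variant. Whether such a change matters biologically depends on the gene context, as noted above.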
Yet deleterious effects arising from mutations have been documented, which neither directly affect the poly(A) signal or the cleavage site nor the downstream sequence elements. This, for instance, is shown for a 20 base pair duplication in MSH6, one of the four mismatch repair genes causing the LYNCH syndrome (HNPCC or hereditary nonpolyposis colorectal cancer), an autosomal-dominant genetic cancer syndrome with a high risk of colon cancer (among others) . This duplication downregulates processing at an adjacent PAS, although the underlying mechanism remained unclear. Another intriguing example highlighting the mechanistic complexity of mRNA 3′ end processing is found for a complex immunodeficiency syndrome, which can be caused by a 3′UTR mutation in the p14 gene. In this case, the mutation creates a splicing defective 5′ splice site resulting in a U1 snRNP-mediated suppression of an adjacent PAS . Interestingly, this principle already points to a mechanistic aspect that will be relevant in the regulation of alternative cleavage and polyadenylation and the therapeutic manipulation of 3′ end processing (see below).
These examples suggest that aberrations in regions not directly affecting the poly(A) signal deserve special attention. Although there clearly is more sequence flexibility up- and downstream of the poly(A) signal, mutations in such regions can have devastating consequences. This aspect points to another mechanistic peculiarity of 3′ end processing: the majority of RNAs likely do not have “optimal” upstream and downstream core elements. Instead, auxiliary elements situated in these regions aid polyadenylation (for review, ). However, their composition and mode of action are often gene-specific, thereby accommodating the needs for specificity of regulation at the mRNA 3′ end (further detailed below).
Role of trans-acting factors
As indicated before, the processing apparatus is a complex machinery involving more than 50 proteins . The medical relevance of mutations affecting 3′ end processing factors is illustrated by oculopharyngeal muscular dystrophy (OPMD) and the hypereosinophilic syndrome (HES).
OPMD is an adult-onset disease with slowly progressing muscle weakness primarily affecting the eyelids, resulting in ptosis, and the pharyngeal muscles, resulting in dysphagia. It is caused by short trinucleotide repeat [(GCG)8–13] expansions in the coding region of the nuclear poly(A)-binding protein 1 (PABPN1, see above) . Normally, the polyalanine stretch encoded by this trinucleotide comprises 10 alanines, which is expanded to 12–17 alanines in autosomal-dominant OPMD. This expansion results in an increase of self-association, misfolding, and filamentous nuclear aggregation of the PABPN1 protein in skeletal muscle. In vitro, the mutant protein is fully active and OPMD cells do not display a severe polyadenylation defect [21, 91]. Thus, the phenotype might be best explained by either a quantitatively minor disturbance of the protein’s function in polyadenylation (which may be difficult to detect in vitro or in transfected cells) and/or by co-sequestration of other potentially interacting proteins. Finally, PABPN1 also plays an important role in the transcription of muscle-specific genes, which could explain why other tissues are unaffected .
In contrast, HES represents a severe hematologic disorder with sustained overproduction of eosinophils in the bone marrow, eosinophilia, tissue infiltration, and organ damage. In this case, a DNA rearrangement involving a chromosomal deletion of 800 kb and fusion of the hFip1 and PDGFRα genes is the underlying cause of this syndrome . The corresponding chimeric protein, hFip1-PDGFRα, contains the N-terminus of hFip1, an integral subunit of the CPSF complex stimulating 3′ end processing , and the C-terminal kinase domain of PDGFRα. The expression of the hFip1-PDGFRα fusion protein in hematopoietic cells constitutively activates the PDGFRα kinase and transforms cells. As for OPMD, the resulting phenotype might further be aggravated by an interference with canonical 3′ end processing, although this has not been explored in detail. Surprisingly, these examples are currently the only two reported genomic perturbations that affect processing factors and lead to an overt phenotype. This reflects either a negative selective pressure, an unexpectedly high structural and/or functional flexibility, or the redundancy of some of those factors.
Regulated cleavage and polyadenylation
However, an inhibition of 3′ end cleavage and polyadenylation can also occur as a result of endogenous cell-intrinsic mechanisms. One prominent example is the BRCA1-associated protein BARD1, which establishes a causal link between mRNA 3′ end processing and tumor suppression . BARD1 senses sites of DNA damage and repair and physically interacts with CSTF-50 (Fig. 2b). Challenging cells with DNA-damaging agents transiently inhibits 3′ end formation by enhanced formation of CSTF/BARD1/BRCA1 complexes. Furthermore, a tumor-associated germline mutation in BARD1 (Gln564His) decreases its affinity for CSTF-50 and renders the protein inactive in polyadenylation inhibition. The BARD1-mediated inhibition of polyadenylation may thus prevent inappropriate RNA processing during transcription of damaged DNA loci.
Competitive protein interactions modulating the efficiency of cleavage and polyadenylation can be regulated by specific signaling pathways. This has first been demonstrated for the prothrombin (F2) pre-mRNA in which processing relies on a highly conserved upstream sequence element (USE) (Fig. 1a) . In this example, stress conditions that activate p38 MAPK signaling up-modulate components of the 3′ end processing apparatus and phosphorylate the RNA-binding proteins FBP2 and FBP3 (Fig. 2c red complex). Normally, these proteins bind to the USE and inhibit 3′ end processing. Upon phosphorylation, they dissociate from the USE, making it accessible to proteins that stimulate 3′ end processing . These findings have important implications: deregulated F2 expression plays a crucial role in the pathogenesis of thrombophilia but also in other conditions linking p38 MAPK activation with aberrant cellular processes such as tumorigenesis . It is worth noting that the USE motif constitutes a highly conserved nonameric sequence element, which can be found in many genes including MYC (among other key players with a role in tumorigenesis ). Thus, this regulatory principle might account for a plethora of gene functions. From these findings, regulated 3′ end processing emerged as a key mechanism of gene regulation with broad biological and medical implications. It also provides a first example how the basal 3′ end processing apparatus is “wired” to signaling pathways allowing a dynamic adaptation of the 3′ end cleavage and polyadenylation efficiency. In analogy to other mechanisms (i.e., splicing), this also exemplifies how accessory sequence elements confer specificity to this type of gene regulation.
Finally, another important example establishing the role of posttranslational modifications as a critical element for regulation of 3′ end processing is shown for the poly(A) polymerase . This enzyme catalyzing the formation of the poly(A) tail can be posttranslationally modified by the poly(ADP-ribose) polymerase 1 (PARP1) leading to a poly(ADP-ribosyl)ation, which inhibits the PAP activity (Fig. 2d). The physiological importance of this mechanism is shown in the context of heat shock during which PARP1 inhibits polyadenylation of non-heat shock protein-encoding genes, while polyadenylation of hsp transcripts remains unaltered. Thus, a PARP1-mediated modification of PAP has evolved as an effective mechanism for a differential regulation of polyadenylation during thermal stress. Although not fully elucidated, this example also suggests that there must be gene-specific regulatory mechanisms which allow selective gene expression even in conditions, where PAP as a central enzyme is posttranslationally modified .
These and other examples illustrate that complex molecular mechanisms have evolved to control and regulate mRNA 3′ end processing at (a) defined PAS(s) to eventually execute specific cellular programs. Although not yet explored in further detail, analogous mechanisms might also come into play for the dynamic regulation at alternative (“competing”) PASs (next section).
Variations at the transcriptome 3′ end—when processing gets alternative
With the emergence of RNA sequencing (RNA-Seq) technologies, it became clear that the transcriptome is enormously diversified at the 3′ end . Approximately up to 70 % of the transcriptome is affected by a mechanism widely referred to as “alternative 3′ end cleavage and polyadenylation” (APA) . As highlighted above, it regulates numerous genes during the stress response or after T and B cell activation, during differentiation and dedifferentiation, and in various processes linked to tumor progression (detailed below). These findings are in line with earlier observations that alternative PAS selection represents an important and evolutionary conserved regulatory mechanism for spatial (tissue specificity [53, 67, 105, 107]) and temporal control of gene expression (i.e., immunoglobulin class-switch [3, 30, 47, 48, 147, 170, 171]).
Before moving on to a more universal role of APA in health and disease, we will briefly illustrate a few important physiological aspects, which deserve special attention.
APA in differentiation and development
Recent studies based on high-throughput analyses have revealed that APA is highly regulated during development ([8, 54, 76, 79, 152, 158] and references therein). Interestingly, there is a correlation between the proliferation status and the global APA patterns (Fig. 4a). Proliferating cells tend to use upstream (“proximal”) PASs and produce mRNAs with shorter 3′UTRs, while quiescent/differentiated cells favor downstream (“distal”) PASs and produce mRNAs with longer 3′UTRs . Similar observations have been made for mouse development and the differentiation of embryonic stem cells (ESCs) into neurons and other functions [16, 78, 158]. In contrast, during somatic reprogramming  or tumorigenesis, proximal PASs are favored leading to shorter 3′UTRs . In some of those cases, mRNAs with shorter 3′UTRs tend to be more stable [71, 162] or globally elevated  eventually leading to higher protein output [121, 152]. Breaking it down to individual transcripts, however, the consequences of APA can be complex: APA transcript isoforms of the same gene can encode different proteins and/or change the 3′UTR properties, leading to the inclusion or exclusion of mRNA stabilizing or destabilizing elements or miRNA target sites, or resulting in different translation efficiencies or subcellular localization (Fig. 4a, detailed below). Thus, although APA is widespread in processes such as differentiation, dedifferentiation, and development, it is difficult to generalize how the overall trend of APA directionality affects the fate of the respective transcript isoforms. This is supported by studies that failed to detect a straightforward correlation between APA and mRNA stability or protein output [50, 58, 64, 130, 152, 166]. Accordingly, deciphering the downstream functional consequences of APA in the context of cellular programs and understanding whether APA is the cause or consequence of complex biological programs represent a major challenge.
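The shift between proximal and distal PAS usage described above can be expressed as a simple ratio. The following sketch computes, for one gene with two PASs, the fraction of reads supporting the proximal site and the change of that fraction between two conditions. The function names and all read counts are invented for illustration; this is an assumed toy metric, not the statistic used in the cited studies.

```python
def proximal_usage(proximal_reads, distal_reads):
    """Fraction of 3' end reads that map to the proximal PAS."""
    total = proximal_reads + distal_reads
    if total == 0:
        raise ValueError("no reads support either site")
    return proximal_reads / total


def usage_shift(condition_a, condition_b):
    """Change in proximal usage from condition A to condition B.

    Each condition is a (proximal_reads, distal_reads) tuple. A positive
    value means more proximal PAS usage (shorter 3' UTRs) in condition B.
    """
    return proximal_usage(*condition_b) - proximal_usage(*condition_a)


# Hypothetical read counts for a single gene (assumption).
quiescent = (20, 80)       # mostly distal PAS, longer 3' UTR
proliferating = (60, 40)   # mostly proximal PAS, shorter 3' UTR

shift = usage_shift(quiescent, proliferating)
```

A positive shift in this toy example corresponds to 3′UTR shortening in the proliferating state. As the text stresses, such a shift alone does not predict mRNA stability or protein output.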
Possibly APA coordinately regulates post-transcriptional regulons (“RNA operons”) driving specific cellular programs. For example, hFip1 (an integral part of the CPSF complex) has recently been shown to control embryonic stem cell-specific APA profiles to ensure the optimal expression of a specific set of genes, including critical self-renewal factors, in the cell fate specification . Yet, we have just begun to decipher the resulting consequences, and further studies are needed to expand our understanding. In contrast, the impact of APA on the expression and function of individual genes is far better understood.
Role of APA for individual genes
The historically oldest and perhaps the most thoroughly studied example illustrating the importance of APA and also shedding light on the underlying regulatory principles is the regulation of IgM heavy chain expression during B cell differentiation (Fig. 4b; [3, 47, 147]). In this mRNA, alternative PAS selection is regulated by a modulated recruitment of the CSTF 64-kDa subunit to one of two competing PASs. Upon B cell activation, this switches the IgM heavy chain expression from a membrane-bound form (μm) to the secreted form (μs) by activation of an alternative upstream μs-specific PAS in plasma cells . Several mechanisms possibly contribute, including additional modulators (U1A) that tightly control and ensure appropriate PAS selection [138, 139]. Although the underlying regulation is further complicated by being coupled to alternative splicing (“splicing coupled APA”), it establishes an important direct functional link between dynamically regulated APA and a physiological process.
A similar mechanism underlies the regulated expression of the transcription factor NF-ATc during T cell differentiation . Two longer isoforms of NF-ATc mRNA are synthesized in naïve T cells, whereas a shorter isoform is expressed in effector cells. The switch is mediated by activation of a proximal PAS by up-regulation of CSTF 64 kDa subunit, which occurs upon T cell stimulation (“direct APA”). In this context, it is interesting to note that LPS stimulation increases CSTF 64 expression in macrophages, which in turn regulates APA of several mRNAs . These examples illustrate that the modulated expression of individual 3′ end processing factors can directly—or in concert with auxiliary (inhibitory or stimulatory) factors—change the production of APA isoforms and thereby drive important cellular functions.
Inferring from the tight coupling of transcription with processing [115, 131], the kinetics of POL2 play an important role in PAS choice (“kinetic coupling”). In Drosophila, polo is a cell cycle gene, which uses two PASs in the 3′ UTR to produce alternative messenger RNAs that differ in their 3′ UTR length. By using a mutant Drosophila strain with a lower transcriptional elongation rate, it was shown that transcription kinetics can determine alternative PAS selection. Although only one gene is affected, the physiological consequences of incorrect polo PAS choice are detrimental; transgenic flies lacking the distal poly(A) signal cannot produce the longer transcript and die at the pupa stage due to a failure in the proliferation of the precursor cells of the abdomen . Along these lines, transcription elongation factors can also direct alternative RNA processing and thereby control important cellular functions such as the immunoglobulin secretion in plasma cells .
Another interesting example is the brain-derived neurotrophic factor (BDNF), which is encoded by two transcripts with either short or long 3′ UTRs. The physiological significance of the two mRNA isoforms encoding the same protein has been unknown until it could be demonstrated that the short and long 3′ UTR BDNF mRNAs are involved in different cellular functions. The short 3′ UTR mRNAs are restricted to somata, whereas the long 3′ UTR mRNAs are also localized in dendrites. In a mouse mutant where the long 3′ UTR is truncated, dendritic targeting of BDNF mRNAs is impaired, resulting in low level BDNF in hippocampal dendrites, a selective impairment in long-term potentiation in dendrites, while somata of hippocampal neurons remained normal. These results provide insights into local and dendritic actions of BDNF and reveal APA for a differential regulation of subcellular functions of proteins  with important medical implications .
Further examples documenting the biological consequences of APA are represented by the regulated expression of a truncated form of glutamyl-prolyl tRNA synthetase (EPRS), which as a “gamma-interferon-activated inhibitor of translation” (GAIT) constituent controls the translation of GAIT target transcripts such as the VEGF-A . Furthermore, recent studies demonstrated APA’s potential to differentially regulate the localization of membrane proteins by a trafficking mechanism involving the CD47 3′UTR as a scaffold . Altogether, these and other examples illustrate the physiological importance of regulated mRNA 3′ end processing as a mechanism controlling a wide spectrum of cellular functions.
APA in human disease
Aberrant APA profiles are associated with a variety of human disorders ([8, 40, 119, 121, 130] and references therein). Most importantly, the strong prevalence of APA regulation in physiological processes such as differentiation and development (see above) is also reflected in situations where these processes are typically dysregulated. The most prototypical example for this is uncontrolled cellular proliferation in the course of cancer development. Accordingly, a widespread increase in the use of proximal PASs has been observed in various cancer cells . In this context, the shorter mRNA isoforms showed an increased stability and typically produced more protein, in part through the loss of microRNA-mediated repression. Interestingly, switching to shorter 3′UTRs also allowed proto-oncogenes to escape from inhibition by miRNAs, thereby resulting in oncogene activation in the absence of genetic alterations. Global induction of proximal PAS usage has consistently been observed in several studies ever since ([2, 58, 104, 119, 130, 162, 191] and references therein). This phenotype however does not apply to all tumor types , and occasionally the correlation between cancer progression and 3′UTR shortening appears to be more complex [49, 58, 130]. Interestingly, the determination of (selected) APA profiles has recently proven to be of prognostic significance [99, 119, 186, 191]. Yet, also other disorders such as endocrine  or cardiovascular disease  are associated with a widespread regulation of APA.
Previously, PABPN1 has been identified to represent a potent modulator of APA by inhibiting processing at respective PASs . In the context of OPMD (see above), these data may also imply that OPMD is associated with misregulated APA, which results in unbalanced formation of alternative mRNA 3′ ends. They also predict that OPMD symptoms in humans may be to a certain degree a result of aberrant gene expression due to a change in 3′ end formation. Yet, so far, this has not been tested directly.
Interestingly, pathological changes occurring in the context of myotonic dystrophy may also be attributable to specific alterations in 3′UTR structures and subsequent changes in RNA localization and/or protein isoforms and levels . Although primarily demonstrated in mice, complementary analysis of samples of human origin strongly suggests APA to represent an (additional) molecular mechanism underlying muscular dystrophy.
Given the high prevalence of APA as a pervasive gene regulatory mechanism in various physiologically relevant processes, it is likely that its misregulation will be discovered in the context of various other pathological conditions . Yet the extent to which the vast majority of all reported global alterations at the mRNA 3′ end represent drivers or passengers of human disorders remains an open question. Furthermore, it is conceivable, for example, that the observed APA pattern changes could equally reflect compensatory activities reestablishing the cellular homeostasis in response to disease-triggering events.
Potentially interesting evidence for a causal relationship between APA and a resulting disorder is represented by a polymorphism in the interferon regulatory factor (IRF) 5 gene predisposing to systemic lupus erythematosus (SLE) [61, 62]. This newly identified polymorphism creates a functional polyadenylation site resulting in an increased expression of a transcript variant containing a shorter 3′UTR. Interestingly, the expression levels of transcript variants with the shorter or longer 3′UTRs appeared to be inversely correlated. Thereby, it contributes to a misregulation of interferon signaling, a critical constituent in the pathogenesis or progression of SLE. Yet, this is obviously one of the “simpler” examples in which APA (of one gene) is altered in cis. Accordingly, the nature and functional consequences of global APA regulation in the context of human pathologies remain subject to future investigations.
APA in molecular diagnostics
With the advent of high-throughput analyses, the bioinformatical workload has increased dramatically. In contrast to total RNA-Seq, the sequencing restricted to the transcriptome 3′ end directly uncovers the variability and perturbation occurring at the mRNA 3′ end (Fig. 5). This has several advantages. Firstly, it drastically reduces the bioinformatical workload. Furthermore, these data are typically not confounded by other variables that complicate the bioinformatical processing of the data (such as alternative splicing). Finally, restricting the sequencing to the last (approximately) 30 nucleotides of the transcriptome opens up interesting (and first and foremost cost-effective) opportunities for multiplexing—while still keeping a high coverage for a reliable analysis. In depth APA profile studies have recently revealed “aberrant” APA signatures to be associated with more aggressive tumor phenotypes in cancer patients and thereby provided the proof-of-concept that such a determination can reveal prognostic signatures [119, 191]. Yet, applying novel bioinformatical analysis (DaPars), APA patterns can also be extracted from preexisting transcriptome wide sequencing data . Although this takes advantage of the fact that RNA-Seq data is already available for numerous tissue specimens, this technique has the limitation that it is primarily suited to detect alternative 3′UTR events, while APA events, which are located within the coding region, or alternatively spliced introns (internal APA) rather remain obscure. Compared to 3′ end sequencing technologies, this algorithm requires complex bioinformatical calculations and typically allows a less “intuitive” identification of the mRNA 3′ end (Fig. 5c, compare polyA-Seq with DaPars).
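To illustrate why 3′ end restricted sequencing lends itself to a comparatively direct readout, the following sketch groups read end coordinates into candidate cleavage sites by merging positions that lie close together. The function name, the `max_gap` value, and the coordinates are assumptions made for this example; it is not the procedure used by polyA-Seq or DaPars.

```python
def cluster_read_ends(positions, max_gap=10):
    """Group read 3' end coordinates into (start, end, read_count) clusters.

    Two consecutive sorted positions join the same cluster when they are
    at most `max_gap` nucleotides apart (hypothetical threshold).
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][1] <= max_gap:
            # Extend the current cluster and count the additional read.
            start, _, count = clusters[-1]
            clusters[-1] = (start, pos, count + 1)
        else:
            # Start a new cluster at this position.
            clusters.append((pos, pos, 1))
    return clusters


# Invented read end coordinates within one transcript (assumption).
read_ends = [1002, 1000, 1005, 1850, 1853, 1001]
candidate_sites = cluster_read_ends(read_ends)
```

Each resulting cluster stands for one candidate PAS together with its read support, which could then feed into a usage comparison between samples.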
It remains to be seen in which disease conditions and to what extent the analysis of APA signatures can further improve diagnostic strategies and possibly allow the detection of biological aberrations with higher sensitivity and specificity. Interestingly, selected APA events can confer strong prognostic power beyond common clinical and molecular variables, suggesting their potential as novel prognostic biomarkers. Thus, it will be interesting to see how the determination of APA patterns evolves as a potential new biomarker. This could advance diagnostic strategies towards a more thorough understanding of underlying disease mechanisms as well as towards reliable prognostic and possibly therapeutic stratification.
Ultimately, ongoing genome sequencing efforts will most likely grant further insights into genomic variations that result in gene-specific perturbation of APA isoforms with possibly detrimental functional consequences. Unlike global aberrations in trans (i.e., those resulting from a change in the abundance of a processing factor or regulatory protein), the cause-consequence relationship in this kind of setting is substantially clearer. Furthermore, such changes may be directly accessible to specific, targeted therapeutic approaches.
Targeting mRNA 3′ end formation as a novel therapy
We have seen that the determination of global APA patterns can have important diagnostic and even prognostic implications. The therapeutic significance of APA will depend on whether it is a cause, a consequence, or simply a coincidental epiphenomenon of the underlying disease.
However, even the latter two conditions do not necessarily preclude the possibility of regulating APA as a therapeutically meaningful approach. Various disorders are associated with drastic APA changes (see above), and it is difficult to imagine that all observed APA patterns are biologically “silent” and consequently do not affect a potential phenotype.
Although still at an experimental stage, strategies to interfere with 3′ end processing are in principle available. These encompass both nonspecific and target-specific strategies. For example, as shown for the regulation of splicing, antisense oligonucleotides (ASOs) inhibiting U1 binding can be used to specifically promote intronic alternative polyadenylation. Although the experience with therapeutic targeting of splicing is far more advanced, general proof of concept has been obtained for redirecting processing by targeting specific polyadenylation sites with analogous approaches. These include the use of ASOs and siRNAs as well as modified U1 snRNP, which interacts with a target gene upstream of its PAS to regulate gene expression.
Yet, other strategies might come into play as well. We have seen that APA is influenced by various other cellular processes controlling gene expression (Figs. 1 and 3), including the velocity/kinetics of POL2. Although presumably nonspecific, interference with POL2 processivity at various stages of transcription may potentially regulate APA. In fact, numerous anticancer drugs (such as doxorubicin or camptothecin) affect the processivity of POL2 in one way or another. Apart from this, the C-terminal domain (CTD) of POL2 is subject to extensive posttranslational phosphorylation, which influences co-transcriptional events including splicing, transcription termination, and 3′ end processing. Interestingly, although Ser 2 and Ser 5 phosphorylations of the CTD are by far the most studied posttranslational modifications, the way in which the phosphorylation pattern itself, or potentially even other posttranslational modifications, might affect the loading and delivery of processing factors to their ultimate destination is yet to be elucidated.
Furthermore, 3′ end processing is tightly coupled to splicing, and a significant proportion of APA events occur concurrently with alternative splicing. Therefore, virtually all therapeutic approaches currently tested for manipulating splicing may, in one instance or another, help to revert “disordered” APA phenotypes as well. Ultimately, we have obtained first evidence of how extracellular signals influence the basal 3′ end processing machinery. Generally, this and other examples connecting posttranslational modifications with the regulation of 3′ end processing, as shown for PAP or the modifications of the POL2 CTD (see above), may open new avenues towards targeting signaling components in order to regulate the diversity of the transcriptome 3′ end. Finally, in view of ongoing research elucidating how epigenetic modifications can control APA switches ([102, 182, 190] and references therein), it is tempting to speculate that the manipulation of these pathways may eventually be translated into the clinical context.
Conclusions and perspectives
The 3′UTR has emerged as a hotspot for posttranscriptional gene regulation, governing important cellular functions such as morphogenesis, cell differentiation, metabolism, cell proliferation, and other processes by controlling mRNA translation, stability, localization, and 3′ end processing. More recently, APA has been found to represent an important layer of posttranscriptional gene regulation, which in turn can influence the RNA fate and/or regulate the protein output quantitatively or qualitatively, and thereby steer important cellular programs. Interestingly, we have seen distorted APA signatures being associated with a variety of disorders. Presumably, more of these patterns will be discovered in the context of various other pathological conditions in the near future.
From a medical perspective, the most intriguing question relates to the extent to which APA represents a driver or a passenger of human disorders. For individual genes harboring mutations that perturb alternative cleavage and polyadenylation of their own transcripts, the cause-consequence relationship is relatively simple. However, the causal contribution of widespread APA (de)regulation in the context of human pathologies is still unclear. Given the broad biological importance of APA, it appears unlikely that global changes of the transcriptome 3′ end (associated with human pathologies) remain without any phenotypic consequences. Even without directly eliciting disease, these changes may aggravate underlying pathologies. Yet, they could equally represent compensatory activities for reestablishing cellular homeostasis in response to disease-triggering events.
Thus, further studies are required to decipher the functional contribution of regulated APA in the context of human pathologies in order to determine whether APA can serve as a meaningful therapeutic target. Defining key components directing APA and dissecting their functional hierarchy thus represent critical aspects that influence the conceptual tractability, quite apart from the practical possibilities and opportunities. Independent of these challenging aspects, exciting first steps towards establishing APA as a potentially novel biomarker have been taken. It will be interesting to see how these encouraging findings will further translate into a clinical setting and whether they will become part of “routine” molecular diagnostics allowing prognostic and/or therapeutic stratifications in the (near) future.
We thank members of the Danckwardt lab for helpful discussions. We apologize to all colleagues whose work could not be discussed or cited here because of space constraints.
Work in the laboratory of SD is supported by the DFG (DA 1189/2-1), by the GRK 1591, by the Federal Ministry of Education and Research (BMBF 01EO1003), by the Hella Bühler Prize for Cancer Research, and by the DGKL and the Institute of Clinical Chemistry, University Medical Center Mainz. We thank Vanessa Rau for helpful input and critical comments on the manuscript.
AO, YK, and SD selected and analyzed the literature and drafted the manuscript.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
- 8. Batra R, Charizanis K, Manchanda M, Mohan A, Li M, Finn DJ, Goodwin M, Zhang C, Sobczak K, Thornton CA, Swanson MS (2014) Loss of MBNL leads to disruption of developmentally regulated alternative polyadenylation in RNA-mediated disease. Mol Cell 56:311–322. doi: 10.1016/j.molcel.2014.08.027
- 18. Brais B, Bouchard JP, Xie YG, Rochefort DL, Chretien N, Tome FM, Lafreniere RG, Rommens JM, Uyama E, Nohira O, Blumen S, Korczyn AD, Heutink P, Mathieu J, Duranceau A, Codere F, Fardeau M, Rouleau GA, Korcyn AD (1998) Short GCG expansions in the PABP2 gene cause oculopharyngeal muscular dystrophy. Nat Genet 18:164–167
- 29. Cools J, DeAngelo DJ, Gotlib J, Stover EH, Legare RD, Cortes J, Kutok J, Clark J, Galinsky I, Griffin JD, Cross NCP, Tefferi A, Malone J, Alam R, Schrier SL, Schmid J, Rose M, Vandenberghe P, Verhoef G, Boogaerts M, Wlodarska I, Kantarjian H, Marynen P, Coutre SE, Stone R, Gilliland DG (2003) A tyrosine kinase created by fusion of the PDGFRA and FIP1L1 genes as a therapeutic target of imatinib in idiopathic hypereosinophilic syndrome. N Engl J Med 348:1201–1214
- 40. de Klerk E, Venema A, Anvar SY, Goeman JJ, Hu O, Trollet C, Dickson G, den Dunnen JT, van der Maarel SM, Raz V, ‘t Hoen PAC (2012) Poly(A) binding protein nuclear 1 levels affect alternative polyadenylation. Nucleic Acids Res 40:9089–9101. doi: 10.1093/nar/gks655
- 41. Decorsière A, Toulas C, Fouque F, Tilkin-Mariamé A-F, Selves J, Guimbaud R, Chipoulet E, Delmas C, Rey J-M, Pujol P, Favre G, Millevoi S, Vagner S (2012) Decreased efficiency of MSH6 mRNA polyadenylation linked to a 20-base-pair duplication in Lynch syndrome families. Cell Cycle 11:2578–2580
- 54. Flavell SW, Kim TK, Gray JM, Harmin DA, Hemberg M, Hong EJ, Markenscoff-Papadimitriou E, Bear DM, Greenberg ME (2008) Genome-wide analysis of MEF2 transcriptional program reveals synaptic target genes and neuronal activity-dependent polyadenylation site selection. Neuron 60:1022–1038. doi: 10.1016/j.neuron.2008.11.029
- 61. Graham RR, Kyogoku C, Sigurdsson S, Vlasova IA, Davies LRL, Baechler EC, Plenge RM, Koeuth T, Ortmann WA, Hom G, Bauer JW, Gillett C, Burtt N, Cunninghame Graham DS, Onofrio R, Petri M, Gunnarsson I, Svenungsson E, Rönnblom L, Nordmark G, Gregersen PK, Moser K, Gaffney PM, Criswell LA, Vyse TJ, Syvänen A-C, Bohjanen PR, Daly MJ, Behrens TW, Altshuler D (2007) Three functional variants of IFN regulatory factor 5 (IRF5) define risk and protective haplotypes for human lupus. Proc Natl Acad Sci 104:6758–6763. doi: 10.1073/pnas.0701266104
- 64. Gruber AR, Martin G, Müller P, Schmidt A, Gruber AJ, Gumienny R, Mittal N, Jayachandran R, Pieters J, Keller W, van Nimwegen E, Zavolan M (2014) Global 3′ UTR shortening has a limited effect on protein abundance in proliferating T cells. Nat Commun 5. doi: 10.1038/ncomms6465
- 76. Jan CH, Friedman RC, Ruby JG, Bartel DP (2010) Formation, regulation and evolution of Caenorhabditis elegans 3′UTRs. Nature
- 77. Jenal M, Elkon R, Loayza-Puch F, van Haaften G, Kuhn U, Menzies FM, Oude Vrielink JA, Bos AJ, Drost J, Rooijers K, Rubinsztein DC, Agami R (2012) The poly(A)-binding protein nuclear 1 suppresses alternative cleavage and polyadenylation sites. Cell 149:538–553. doi: 10.1016/j.cell.2012.03.022
- 87. Ke S, Alemu EA, Mertens C, Gantman EC, Fak JJ, Mele A, Haribal B, Zucker-Scharff, Moore MJ, Park CY, Vagbo CB, Kussnierczyk A, Klungland A, Darnell JE Jr, Darnell RB (2015) A majority of m6A residues are in the last exons, allowing the potential for 3′ UTR regulation. Genes Dev 29(19):2037–2053
- 94. Lackford B, Yao CG, Charles GM, Weng LJ, Zheng XF, Choi EA, Xie XH, Wan J, Xing Y, Freudenberg JM, Yang PY, Jothi R, Hu G, Shi YS (2014) Fip1 regulates mRNA alternative polyadenylation to promote stem cell self-renewal. EMBO J 33:878–889. doi: 10.1002/embj.201386537
- 96. Langemeier J, Schrom EM, Rabner A, Radtke M, Zychlinski D, Saborowski A, Bohn G, Mandel‐Gutfreund Y, Bodem J, Klein C, Bohne J (2012) A complex immunodeficiency is based on U1 snRNP‐mediated poly(A) site suppression. EMBO J 31:4035–4044. doi: 10.1038/emboj.2012.252
- 102. Lian Z, Karpikov A, Lian J, Mahajan MC, Hartman S, Gerstein M, Snyder M, Weissman SM (2008) A genomic analysis of RNA polymerase II modification and chromatin architecture related to 3′ end RNA polyadenylation. Genome Res 18:1224–1237. doi: 10.1101/gr.075804.107
- 113. Mangone M, Manoharan AP, Thierry-Mieg D, Thierry-Mieg J, Han T, Mackowiak SD, Mis E, Zegar C, Gutwein MR, Khivansara V, Attie O, Chen K, Salehi-Ashtiani K, Vidal M, Harkins TT, Bouffard P, Suzuki Y, Sugano S, Kohara Y, Rajewsky N, Piano F, Gunsalus KC, Kim JK (2010) The landscape of C. elegans 3′UTRs. Science 329:432–435. doi: 10.1126/science.1191244
- 158. Shepard PJ, Choi EA, Lu J, Flanagan LA, Hertel KJ, Shi Y (2011) Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA
- 162. Singh P, Alley TL, Wright SM, Kamdar S, Schott W, Wilpan RY, Mills KD, Graber JH (2009) Global changes in processing of mRNA 3′ untranslated regions characterize clinically distinct cancer subtypes. Cancer Res 69:9422–9430. doi: 10.1158/0008-5472.can-09-2236
- 164. Soetanto R, Hynes CJ, Patel H, Humphreys DT, Evers M, Duan GW, Parker BJ, Archer SK, Clancy JL, Graham RM, Beilharz TH, Smith NJ, Preiss T (2016) Role of miRNAs and alternative mRNA 3′-end cleavage and polyadenylation of their mRNA targets in cardiomyocyte hypertrophy. Biochim Biophys Acta. doi: 10.1016/j.bbagrm.2016.03.010
- 165. Soo PY, Patel RK, Best S, Arya R, Thein SL (2005) Detection of prothrombin gene polymorphism at position 20209 (PT20209C/T): pilot study in a black population in the United Kingdom. Thromb Haemost 93:179–180
- 168. Stacey SN, Sulem P, Jonasdottir A, Masson G, Gudmundsson J, Gudbjartsson DF, Magnusson OT, Gudjonsson SA, Sigurgeirsson B, Thorisdottir K, Ragnarsson R, Benediktsdottir KR, Nexo BA, Tjonneland A, Overvad K, Rudnai P, Gurzau E, Koppova K, Hemminki K, Corredera C, Fuentelsaz V, Grasa P, Navarrete S, Fuertes F, Garcia-Prats MD, Sanambrosio E, Panadero A, De Juan A, Garcia A, Rivera F, Planelles D, Soriano V, Requena C, Aben KK, van Rossum MM, Cremers RGHM, van Oort IM, van Spronsen D-J, Schalken JA, Peters WHM, Helfand BT, Donovan JL, Hamdy FC, Badescu D, Codreanu O, Jinga M, Csiki IE, Constantinescu V, Badea P, Mates IN, Dinu DE, Constantin A, Mates D, Kristjansdottir S, Agnarsson BA, Jonsson E, Barkardottir RB, Einarsson GV, Sigurdsson F, Moller PH, Stefansson T, Valdimarsson T, Johannsson OT, Sigurdsson H, Jonsson T, Jonasson JG, Tryggvadottir L, Rice T, Hansen HM, Xiao Y, Lachance DH, Oneill BP, Kosel ML, Decker PA, Thorleifsson G, Johannsdottir H, Helgadottir HT, Sigurdsson A, Steinthorsdottir V, Lindblom A, Sandler RS, Keku TO, Banasik K, Jorgensen T, Witte DR, Hansen T, Pedersen O, Jinga V, Neal DE, Catalona WJ, Wrensch M, Wiencke J, Jenkins RB, Nagore E, Vogel U, Kiemeney LA, Kumar R, Mayordomo JI, Olafsson JH, Kong A, Thorsteinsdottir U, Rafnar T, Stefansson K (2011) A germline variant in the TP53 polyadenylation signal confers cancer susceptibility. Nat Genet 43:1098–1103. http://www.nature.com/ng/journal/v43/n11/abs/ng.926.html
- 186. Wiestner A, Tehrani M, Chiorazzi M, Wright G, Gibellini F, Nakayama K, Liu H, Rosenwald A, Muller-Hermelink HK, Ott G, Chan WC, Greiner TC, Weisenburger DD, Vose J, Armitage JO, Gascoyne RD, Connors JM, Campo E, Montserrat E, Bosch F, Smeland EB, Kvaloy S, Holte H, Delabie J, Fisher RI, Grogan TM, Miller TP, Wilson WH, Jaffe ES, Staudt LM (2007) Point mutations and genomic deletions in CCND1 create stable truncated cyclin D1 mRNAs that are associated with increased proliferation rate and shorter survival. Blood 109:4599–4606. doi: 10.1182/blood-2006-08-039859
- 191. Xia Z, Donehower LA, Cooper TA, Neilson JR, Wheeler DA, Wagner EJ, Li W (2014) Dynamic analyses of alternative polyadenylation from RNA-seq reveal a 3′-UTR landscape across seven tumour types. Nat Commun 5. doi: 10.1038/ncomms6274
- 195. Yao P, Potdar Alka A, Arif A, Ray Partho S, Mukhopadhyay R, Willard B, Xu Y, Yan J, Saidel Gerald M, Fox Paul L (2012) Coding region polyadenylation generates a truncated tRNA synthetase that counters translation repression. Cell 149:88–100. doi: 10.1016/j.cell.2012.02.018
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.