Nonsense-mediated mRNA decay (NMD) is a specific pathway for the degradation of mRNAs that have premature termination codons (PTCs) in their open reading frames (ORFs). Its importance is highlighted by its conservation in all eukaryotes. NMD counteracts the potentially harmful impact of mRNAs that have PTCs as a result of errors at various levels of gene expression, such as nonsense and frameshift mutations, transcriptional errors and faulty splicing. Thus, NMD serves as a 'cellular vacuum cleaner' that protects the cell from the potentially harmful effects of truncated proteins by eliminating mRNAs with PTCs in a sequence of events that is not yet fully understood. In recent years numerous biochemical and cell-biological investigations in Saccharomyces cerevisiae [1], Drosophila melanogaster [2], Caenorhabditis elegans [3] and human [4, 5] cells have helped to elucidate some of the mechanistic details underlying the NMD pathway. A role for NMD in the regulation of mRNA metabolism beyond the mere vacuum cleaner function for faulty mRNAs has been suspected, and was foreshadowed by work on the splicing factor SC35 and some ribosomal proteins [6, 7]. Now, 'genome-wide' approaches - one in yeast using microarrays and another in silico, analyzing information mined from mRNA and protein databases - have added powerful evidence to suggest that NMD may serve multiple purposes in gene expression [8-11].

Inspirations from yeast

In yeast, NMD depends on the expression of the Upf1, Upf2 (Nmd2) and Upf3 proteins. Single or simultaneous inactivation of the UPF genes stabilizes nonsense-containing mRNAs, indicating that their protein products interact functionally in the same pathway. He et al. [8] used high-density oligonucleotide arrays to analyze genome-wide expression profiles of yeast strains containing single deletions of the UPF1, UPF2 or UPF3 genes, as well as of the DCP1 and XRN1 genes which encode proteins with activities thought to be involved in the NMD pathway - an essential component of the mRNA decapping enzyme and the 5'-3' exonuclease, respectively. They also tested double deletions of the XRN1 gene in combination with each of the UPF genes. Two-dimensional clustering analysis of the expressed genes for the Δupf1, Δupf2 and Δupf3 strains yielded several interesting results.

The deletion of UPF1, UPF2 or UPF3 generated nearly identical expression profiles. Thus, all three gene products act on the same targets, consistent with the function of Upf1, Upf2 and Upf3 in a single, linear pathway in yeast. The abundance of most mRNAs upregulated in Upf-deficient cells was also increased in Δdcp1 and Δxrn1 strains, suggesting that these mRNAs are largely degraded by decapping and subsequent 5'-3' exonucleolytic decay. This approach also identified a considerable number of NMD-regulated transcripts (765 out of the 7,839 genes represented on the microarrays) and showed that NMD substrates are generally expressed at below-average levels. In addition, most NMD substrates were found to be upregulated upon NMD inactivation, but some were downregulated, pointing to the existence of higher-order NMD targets (or additional functions of the Upf proteins in alternative gene-regulation pathways). Finally, only one third of the identified transcripts can be classified into structural or functional groups, some of which are surprising and hitherto unrecognized. Representatives of previously described NMD-substrate categories were identified, including mRNAs with nonsense mutations, transcripts resulting from faulty or regulated alternative splicing, mRNAs subject to leaky scanning during translation initiation, and mRNAs with an upstream ORF or with AATGA or ATGAA motifs immediately upstream of their translation initiation codons. More intriguing is the discovery of several new classes of NMD targets, including mRNAs that use translational +1 frameshifting, bicistronic mRNAs and, most interestingly, two classes of noncoding RNAs: pseudogene transcripts and transcripts encoded by transposable elements or their long terminal repeat (LTR) sequences.

A significant fraction of the protein-encoding transcripts upregulated in the strains with mutations in the NMD-pathway genes [8] could be grouped into clusters of proteins that act in similar pathways. Among these are proteins coordinately involved in telomere maintenance, pre-mRNA splicing, peroxisomal function and DNA repair. This suggests the exciting possibility that NMD could orchestrate the expression of functional groups of genes. Several important results of the work by He et al. [8] confirm findings obtained by another lab using a similar approach several years ago [12]. Notably, the coordinate upregulation of genes involved in telomere maintenance by NMD inactivation has already sparked interesting follow-up investigations. The yeast NMD pathway has been shown to accelerate the rate of senescence promoted by the loss of the telomerase enzyme or by the erosion of telomeres that results from altering the stoichiometry of telomere-cap components [13-15]. At least one likely primary target of NMD is the mRNA of Stn1p, an essential protein involved in chromosome-end protection. The observation by He et al. [8] that 35.9% of all ORFs encoded in the telomere region were upregulated in strains with mutations in components of the NMD pathway further illustrates that NMD controls many genes near telomere ends, although probably indirectly. These genes are usually silenced and may be derepressed when the protection of chromosome ends is disturbed by the loss of NMD. This enlightening example demonstrates how NMD can affect whole pathways by regulating the expression of one or a few primary target mRNAs, with consequences for groups of downstream secondary effectors. The discovery of additional examples of NMD-mediated control of functional pathways can be expected, promoting NMD to a gene-expression tool with many utilities. The cellular vacuum cleaner has therefore become a Swiss army knife (Figure 1).

Figure 1

The role of NMD. (a) Until recently the role of NMD has been predominantly seen as that of a cellular 'vacuum cleaner' that rids the cell of erroneous mRNAs. (b) Now a more sophisticated picture of NMD is emerging - a highly specific and delicate multipurpose tool that contributes at multiple levels to the control and balance of physiological gene expression.

In silico veritas?

In a series of three recent publications [9-11] Steve Brenner and colleagues suggested, by data-mining, that one-third of reliably inferred human splice products form a major class of natural targets for NMD. Alternative splicing is thought to occur in 30-60% of human genes, expanding the coding repertoire of the limited number of genes in the genome and modulating tissue-specific and developmental gene functions. Brenner and colleagues [9-11] now suggest that alternative splicing provides a mechanism to generate PTC-containing splice products that are subsequently degraded by NMD and, as a consequence, that cooperation between alternative splicing and NMD pathways offers a major and currently underappreciated way to regulate gene expression.

For their in silico analysis, they mapped well-characterized human RefSeq [16] and LocusLink [17] mRNA sequences to genomic sequences, and then performed high-stringency alignments between these 'RefSeq-coding genes' and expressed sequence tags (ESTs). The reliably inferred splice variants were only accepted as likely NMD targets when they conformed to the '50-nucleotide rule': this hallmark of mammalian NMD stipulates that stop codons located at least 50 nucleotides upstream of the last exon junction will be interpreted as 'premature' and trigger NMD. This approach leads to an underestimate of potential NMD targets, because some PTCs (for example, in T-cell receptor gene transcripts) will trigger NMD even when they are not followed by a sufficiently distant intron [18]. Moreover, Brenner and colleagues [9-11] also excluded mRNA variants that are indistinguishable from products of faulty splicing. These studies have unearthed several groups of functionally related proteins whose expression appears to be regulated by NMD, including translation factors and ribosomal proteins. This is in remarkable contrast to the yeast data of He et al. [8], where proteins with a function in translation were, if anything, underrepresented in the pool of NMD-regulated genes.
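The 50-nucleotide rule described above lends itself to a small illustration. The following Python sketch is not the pipeline used by Brenner and colleagues; it only shows how such a filter might be expressed for a single spliced transcript. The function names, the use of 0-based coordinates on the spliced mRNA, and the choice to measure the distance from the end of the stop codon to the last exon-exon junction are all assumptions made for illustration.

```python
from typing import Optional, Sequence

# Threshold named in the text ("50-nucleotide rule").
MIN_DISTANCE_NT = 50


def last_exon_junction(exon_lengths: Sequence[int]) -> Optional[int]:
    """Return the 0-based mRNA coordinate of the last exon-exon junction.

    The junction is reported as the index of the first nucleotide of the
    final exon. Single-exon transcripts have no junction, so None is returned.
    """
    if len(exon_lengths) < 2:
        return None
    return sum(exon_lengths[:-1])


def conforms_to_50_nt_rule(
    stop_start: int,
    exon_lengths: Sequence[int],
    min_distance: int = MIN_DISTANCE_NT,
) -> bool:
    """Return True if the stop codon would be read as 'premature'.

    stop_start is the 0-based index of the first nucleotide of the stop
    codon in the spliced mRNA. ASSUMPTION: the distance is counted from
    the nucleotide after the stop codon up to the last junction; the
    source text does not specify the exact convention.
    """
    junction = last_exon_junction(exon_lengths)
    if junction is None:
        return False  # no exon junction downstream, rule cannot apply
    stop_end = stop_start + 3  # index just past the stop codon
    return junction - stop_end >= min_distance


# Hypothetical transcript with three exons; the last junction is at 350.
exons = [200, 150, 300]
print(conforms_to_50_nt_rule(100, exons))  # stop far upstream of the junction
print(conforms_to_50_nt_rule(320, exons))  # stop within 50 nt of the junction
```

As the text notes, such a filter is conservative: transcripts like those of the T-cell receptor genes can be NMD targets without satisfying this condition, so a positional check of this kind would miss them.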

NMD is not (yet) on everyone's mind. As a consequence, Brenner and colleagues [11] found several entries for truncated proteins in the Swiss-Prot database. In some of these cases the available experimental evidence confirms that the mRNAs that encode these truncated proteins are bona fide NMD substrates. We are left with a consolation and a surprise. The consolation is that traces of NMD can be uncovered even though they had been overlooked before. Having recently had to accept that the number of genes in the human genome is too limited to explain the far higher number of proteins (not to mention other gene products), we then had to learn that one plausible and elegant explanation lies in alternative splicing, which enables a gene to code for a whole family of related, or sometimes antagonistic, proteins. And now the surprise is that a large portion of this effort is supposedly expended only to direct many of the primary products to decay. Several conclusions can be drawn from these observations. First, databases need to be read and annotated with a full realization of the implications of NMD; and second, NMD seems to serve as a tool for rapidly switching off gene expression. This view extends the idea of NMD as a mechanism for ridding the cell of the potentially harmful products of faulty splicing. But there may be more to it. NMD rarely downregulates the expression of a transcript completely; more commonly, 10-30% of the PTC-containing transcripts survive and may allow the production of physiologically relevant levels of truncated protein products.

For NMD researchers it has always been hard to reconcile these observations with the presumed protective role of NMD, especially as very low levels of biological products can sometimes have enormous effects. Given the problems of detecting the low levels of proteins or peptides produced from downregulated transcripts, in addition to some lingering lack of awareness of NMD, examples to prove otherwise are hard to come by. A recent publication [19] describes a PTC-containing transcript of the high-affinity immunoglobulin E (IgE) receptor, FcεRIβ, arising from retention of an intron. This alternative transcript conforms to the 50-nucleotide rule, and its expression levels are very low compared with those of the full-length transcript, as would be expected for an NMD target. Nonetheless, the truncated protein is not only detectable but also competes effectively with the full-length protein to control FcεRIβ expression on the cell surface. Thus, even low endogenous expression levels of NMD targets can suffice to generate a product with a biological function. The utility of a bona fide NMD substrate is similarly illustrated by the unc-49 locus in C. elegans. This locus uses alternative splicing to produce three GABA-receptor subunits, two of which (A and B) undergo several splice events in their 3' UTRs, rendering them predicted NMD substrates [20]. While A-form transcripts are hardly detectable, B-form transcripts represent the most abundant form and code for a protein essential for the worm's locomotion. Either the B-form transcript escapes NMD, or the residual mRNA left after NMD suffices for the necessary protein production.

Among the examples presented by Brenner and colleagues [11] or in other studies [21] are genes with complex alternative splice patterns resulting in multiple transcript isoforms with or without PTCs. The lower abundance of these isoforms when compared to the full-length form supports the notion that they are targeted to the NMD pathway. But is the possibility that cells engage in complex alternative-splicing procedures generating multiple products, just to finally dispose of them, the only conceivable option? Four out of five PTC-containing isoforms of the mRNA for the LARD death receptor are readily detectable in non-activated lymphocytes, whereas only the full-length form is expressed in activated lymphocytes [11]. While it is entirely possible that the PTC-containing forms are generated to switch off receptor expression in resting lymphocytes, there are attractive alternatives: as in the case of the FcεRIβ receptor [19], the PTC-containing mRNAs may produce proteins or peptides with relevant, although currently unknown, functions. Thus, quantitative control of the expression of low amounts of protein isoforms could represent yet another facet of the function of the NMD pathway.

A role for NMD in controlling the levels of noncoding RNAs (including noncoding alternative splice products) must also be considered. RNA accounts for more than 95% of the human genome's output and there is increasing evidence that noncoding RNAs (including introns, and spliced and polyadenylated transcripts) can have a function, for example as a modulating network or an additional layer of information [22, 23]. Importantly, noncoding RNAs have been discovered among natural NMD targets in yeast [8], where only a relatively small portion of the genome is transcribed into noncoding RNAs. What might be the role of NMD in mammals, which transcribe a far higher percentage of their genome into noncoding RNAs [22, 24, 25]? Clearly, the recent insights into the RNA-substrate spectrum of the NMD system should enhance the appreciation of NMD as a versatile, multipurpose mechanism that controls the transcriptome qualitatively and quantitatively.