Sequences Promoting Recoding Are Singular Genomic Elements
The distribution of sequences which induce non-standard decoding, especially of shift-prone sequences, is very unusual. On one hand, since they can disrupt standard genetic readout, they are avoided within the coding regions of most genes. On the other hand, they play important regulatory roles for the expression of those genes where they do occur. As a result, they are preserved among homologs and exhibit deep phylogenetic conservation. The combination of these two constraints results in a characteristic distribution of recoding sequences across genomes: they are highly conserved at specific locations while they are very rare in other locations. We term such sequences singular genomic elements to signify their rare occurrence and biological importance.
Keywords: Stop Codon, Codon Bias, Codon Usage Bias, Singular Element, Codon Reassignment
In this chapter we describe the distribution of several known shift-prone patterns and stimulatory signals and how they relate to the concept of singular genomic elements. We also discuss how their characteristic distribution can be utilized for identification of novel recoded genes and describe studies where such strategies have been employed.
14.1 Singular Genomic Elements
A characteristic property of all biological systems is diversity and specialization of their component parts that play distinctive functional roles. The tendency for specialization and uniqueness is profound on the genomic level, as the existence of identical multiple copies of the same gene (unless they relate to mobile elements) is rare. Functional specialization of gene products demands similarly specific regulation of their biosynthesis and processing. Such specificity can be achieved through a combinatorial effect of several regulatory mechanisms acting on different levels of gene expression – from initiation of transcription to posttranslational modifications, where similar regulatory sequences occur in groups of functionally related genes. Specificity is gained through differential combination of these sequences which could be idiosyncratic for a particular gene. However, it is attractive to imagine a simpler scheme where a unique regulatory element would be responsible for the regulation of a specific gene. Such a sequence could respond to changes in particular cellular conditions associated with expression of the regulated gene and so provide feedback control. Indeed such regulatory elements are known and they are characteristically distributed across genomes. Their occurrence at random locations in a single genome is avoided while their occurrence at specific genomic locations across several species is preserved. Such a distribution is easy to explain. Suppose we have a genomic feature F that specifically regulates expression of a gene G. The feature F then should be avoided in all locations where it may have an effect on expression of genes other than G. On the other hand, since association of the feature F with the gene G is beneficial for the organism, such association would likely be preserved during speciation and therefore it will occur in orthologs of the gene G. 
In other words these regulatory sequence elements are avoided and hence underrepresented across a single genome or across particular types of sequences (e.g., those coding for proteins) in a single genome. However, they are present in orthologous genes from multiple related organisms. Here we introduce the term singular genomic element to denote such elements. There are a number of different biologically active nucleotide sequences that exhibit properties of singular genomic elements. Examples are unique sites of restriction, sites encoding unique protease cleavage sites, cases of transcriptional slippage discussed in Chapter 19, or even miRNA targets (Farh et al., 2005). Nonetheless, perhaps, the most striking type of known singular genomic elements is sequences promoting recoding events. Such elements interfere with standard genetic decoding and increase the chances of erroneous translation. Thus, their occurrence in the protein coding sequence of most genes is detrimental. At the same time they do play important roles in those genes that utilize non-standard decoding in their expression and consequently undergo purifying selection during evolution in their corresponding locations. In this chapter we discuss different examples of sequences implicated in recoding events and their distribution across different regions of single genomes and among orthologous genes. We also discuss how searches for singular genomic elements could be used as a strategy for identification of new cases of recoding events and novel genes that are expressed via recoding mechanisms.
14.2 Sequences Promoting Ribosomal Frameshifting as Singular Genomic Elements
14.2.1 +1 Frameshifting Cassette in Bacterial Release Factor 2 mRNA
All other stimulatory elements in the RF2 frameshifting cassette are not responsible for the sensitivity of frameshifting to release factor concentration. However, they are responsible for elevation of the absolute level of frameshifting efficiency, which in their absence would be insignificant even at low concentrations of release factors. The element whose role in the frameshifting mechanism is relatively easy to understand is the identity of the nucleotide 3′ adjacent to the stop codon. Unlike all sense codons that are recognized by RNA molecules via complementary interactions, stop codons are recognized by protein molecules. The recent analysis of crystal structure of the ribosome complex with RF2 reveals details of RF2 interactions with the UGA stop codon in mRNA (Weixlbaumer et al., 2008). Unfortunately, the crystal structure does not provide information on interactions of RF2 with the mRNA region downstream of the stop codon which seems to interact with release factors as evident from earlier cross-linking studies (Poole et al., 1998). While these interactions do not play a role in stop codon discrimination, they do affect termination efficiency. Since frameshifting efficiency negatively correlates with termination efficiency, it is not surprising that the weakest termination context has been selected in the RF2 frameshift site during its evolution (Major et al., 1996). It can be seen in Fig. 14.2 that the 3′ nucleotide adjacent to the stop codon is nearly always C, which has been shown to be the most inefficient context codon for termination in eubacterial organisms (Mottagui-Tabar and Isaksson, 1998; Pavlov et al., 1998).
Another important stimulatory element in the RF2 frameshifting cassette is the internal Shine–Dalgarno (SD) sequence located upstream of the shift site (Weiss et al., 1987, 1988; Curran and Yarus, 1988). Normally SD sequences are used for the initiation of translation in bacteria and are located upstream of initiator codons (Shine and Dalgarno, 1975). The increase in local concentration of initiating ribosomes around initiator sites is achieved through interactions between the SD and the corresponding complementary region of 16S rRNA, termed anti-Shine–Dalgarno (anti-SD). The internal SD 5′ of the frameshifting site could serve the same purpose, and initiation of translation at the UUG codon (which is a part of the frameshifting site) has been demonstrated (Baranov et al., 2002), although no functional role for this internal initiation event has been identified. It could be that this is simply an unintentional side effect caused by sequence constraints of the RF2 frameshifting cassette. Irrespective of internal initiation, the main role of the internal SD is clearly to target elongating ribosomes. One particularly important aspect of the SD stimulatory effect on frameshifting efficiency is the location of the SD relative to the frameshift site (Weiss et al., 1987). The length of the spacer between the SD sequence and the P-site tRNA during the frameshift is shorter than the distance between the SD and initiator codons (Ma et al., 2002). It is reasonable to assume that the distance between an SD and an initiator codon is optimal for the relaxed conformation of the ribosomal RNA during initiation. If so, the shorter distance between the internal SD and the shift site should create tension in the ribosomal RNA between the anti-SD and the decoding center of the ribosome. 
Such tension likely acts in the manner of a compressed spring, whose relaxation is achieved by a progressive movement of tRNA with the decoding center of the ribosome toward the 3′-end of mRNA. This movement would explain the stimulatory effect of an SD on +1 frameshifting. Accordingly, it is known that an internal SD stimulates frameshifting in the opposite direction when the spacer is longer than the optimum for initiation, in which case the RNA likely acts as a stretched spring that facilitates tRNA movement toward the 5′-end of mRNA (Atkins et al., 2001). The conservation of the SD sequence and its location is illustrated in Fig. 14.2. Since base pairing between the SD and rRNA does not have to be perfect to cause the effect, there is a certain degree of flexibility in the RF2 frameshift stimulatory SD sequences; hence, their conservation is less pronounced than that of the shift site and the stop codon.
While the size of the spacer separating the shift site from the internal SD sequence is crucially important for its stimulatory effect, the identity of the spacer is not inconsequential either (Baranov et al., 2002). During frameshifting the spacer corresponds to the codon located in the ribosomal E-site. It has been suggested that there is a competition between the anti-SD and E-site tRNA for interactions with the corresponding part of mRNA. This interference of the SD with normal occupation of the E-site codon by the E-site tRNA affects fidelity of the ribosome (Baranov et al., 2002; Marquez et al., 2004; Sanders and Curran, 2007). Consequently, as the affinity of different tRNAs for the E-site fluctuates (Lill and Wintermeyer, 1987), it is not surprising that the identity of the spacer affects frameshifting efficiency.
Analysis of the distribution of sequences similar to the RF2 frameshifting cassette in bacterial genomes in terms of its “singularity” is meaningless, due to its size and complexity. If we represent the RF2 frameshift cassette as some kind of a roughly estimated consensus sequence such as GRGGNNNYTT-Stop-C, the probability of its appearance in random sequences of the same length is 1/16,384. Since we are interested only in those stop codons that are really used for the termination of translation, then the probability of such a sequence in a genome similar to E. coli (∼4,000 genes) will be about 0.2 and the probability of two such sequences in such a genome will be only ~0.05. Even if a deviation of a single nucleotide in the above consensus sequence is allowed, the probability of two random occurrences of such sequences in a genome of a size similar to that of E. coli would be less than 1/2. In other words, the fact that the above consensus sequence does not occur at the end of any other E. coli gene does not indicate evolutionary selection against such sequences. As for the individual modular stimulatory signals constituting the RF2 frameshifting cassette, they are insufficient to trigger ribosomal frameshifting with comparable efficiency and hence they are relatively frequent in the genomes. Nonetheless, some tendency for their avoidance can be illustrated using the following simple and perhaps somewhat naïve measures. For example, while C nucleotides constitute a 0.25 fraction of the E. coli K12 genome (NC_000913), the fraction of C nucleotides adjacent to the 3′-end of E. coli stop codons is 0.17, and 0.14 for those adjacent to UGA, whereas the portion of Cs after any UGA trinucleotide in the E. coli genome (NTGAC/NTGAN ratio) is 0.22. This seeming underrepresentation of Cs after stop codons and UGA in particular is, of course, due to its weakening effect on termination of translation. 
A similar tendency can be seen in the usage of the codon upstream of stop codons. For example, the proportion of UUU codons among all Phe codons in the E. coli K12 genome is 0.66. But the proportion of UUU codons among Phe codons that are located upstream of stop codons is 0.47, and only 0.24 upstream of UGA codons. For CUU, similar calculations give the less pronounced corresponding values of 0.16, 0.17, and 0.13. There is no avoidance of SD-like sequences at the end of E. coli genes compared to other locations within mRNA coding sequences. On the contrary, analysis of a larger number of bacterial genomes suggests that SD sequences are even overrepresented at the end of coding sequences, perhaps due to translational coupling where such SD sequences are used for the initiation of downstream genes (PVB, unpublished).
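The simple ratios quoted above (the share of C immediately 3′ of annotated stop codons, and the NTGAC/NTGAN-style ratio over a whole sequence) could be computed along the following lines. This is only a minimal sketch: the function names and the input format (each coding sequence supplied with exactly one nucleotide of 3′ flank) are assumptions made for illustration, and the single-strand scan is a simplification.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def base_after_stops(cds_with_flank, stops=STOP_CODONS):
    """Count the base immediately 3' of each terminal stop codon.

    Each input string is assumed (hypothetical format) to be a coding
    sequence ending in its stop codon plus one downstream nucleotide.
    """
    counts = Counter()
    for seq in cds_with_flank:
        seq = seq.upper()
        stop, following = seq[-4:-1], seq[-1]
        if stop in stops:
            counts[following] += 1
    return counts


def fraction_c_after_trinucleotide(genome, tri="TGA"):
    """Share of C among bases following every occurrence of `tri`.

    Occurrences are counted in any frame on the given strand only.
    """
    genome = genome.upper()
    total = c_count = 0
    start = genome.find(tri)
    while start != -1 and start + 3 < len(genome):
        total += 1
        c_count += genome[start + 3] == "C"
        start = genome.find(tri, start + 1)
    return c_count / total if total else float("nan")
```

Restricting `stops` to `{"TGA"}` in `base_after_stops` gives the UGA-specific fraction. Comparing these values with the overall C content of the genome is the kind of naïve measure described in the text, not a formal statistical test.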
Summarizing, the entire RF2 frameshifting cassette constitutes a relatively large and complex constrained sequence pattern whose random occurrence in small genomes, such as the one in E. coli, has a low probability. Smaller and simpler components of the frameshift cassette are relatively ineffective in triggering efficient non-standard translation events; nonetheless they probably can increase the chance of errors and thus some level of selection against such sequences can be detected. In the following section we deal with the analysis of relatively short sequences, so their random occurrence is considerably more likely. Despite their shortness, however, they are sufficient to trigger efficient non-standard translational events.
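The low probability of a random match to the rough consensus GRGGNNNYTT-Stop-C can be checked with a few lines of arithmetic. The sketch below assumes equal base frequencies and independent positions, treats the stop codon as given (only genuine terminators are considered), and uses a Poisson approximation for the chance of at least one match; the Poisson step and all names are assumptions introduced here for illustration.

```python
import math

# IUPAC codes used in the consensus, mapped to the bases they allow
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def consensus_probability(pattern):
    """Probability that a random sequence matches `pattern` (equal base frequencies)."""
    p = 1.0
    for symbol in pattern:
        p *= len(IUPAC[symbol]) / 4
    return p


# Upstream GRGGNNNYTT plus the C following the stop codon; the stop itself
# contributes no factor because only real terminators are examined.
p_site = consensus_probability("GRGGNNNYTT" + "C")
n_stops = 4000                                # approximate gene count of an E. coli-sized genome
expected = n_stops * p_site                   # expected number of random matches
p_at_least_one = 1 - math.exp(-expected)      # Poisson approximation (assumption)

print(f"per-site probability: 1/{round(1 / p_site)}")
print(f"expected matches among {n_stops} stops: {expected:.2f}")
print(f"P(at least one match): {p_at_least_one:.2f}")
```

Under these assumptions the per-site probability is 1/16,384 and the chance of at least one random match among roughly 4,000 stop codons is about 0.2, in line with the figures given for the RF2 cassette.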
14.2.2 −1 Frameshifting Cassette in Coronavirus Polyprotein-Encoding Gene
The coronaviral gene encoding the ORF1AB polyprotein consists of two overlapping ORFs, and the synthesis of the full-length protein product requires programmed ribosomal −1 frameshifting (Brierley et al., 1989). The frameshift cassette consists of the slippery heptamer sequence U_UU.A_AA.C (underscores indicate separation of codons in the initial phase and dots separate codons in the frame after the shift). The frameshifting is stimulated by RNA structures downstream of the slippery sequence. There is a degree of variation among the stimulatory structures. In some viruses the structure is formed by two distant stem loops forming complementary interactions between their apical loops (kissing stem-loop structures) (Herold and Siddell, 1993). In others, it is a classical H-type pseudoknot with variable features; for example, in SARS-CoV there is an important RNA stem-loop structure located within the second loop of the pseudoknot (Baranov et al., 2005; Plant et al., 2005; Su et al., 2005).
14.3 Cars and Ribosomes, Fast and Furious: Role of mRNA in the Accuracy of Translation
One striking difference between erroneous frameshifting and programmed frameshifting lies in their efficiencies. The translational apparatus is able to decode mRNA with remarkable accuracy; misincorporation of an amino acid due to recognition of incorrect tRNAs occurs with frequencies in the range of 10⁻³–10⁻⁵ depending on the exact type of error. These estimates come from a number of studies in E. coli, reviewed in Parker (1989). This high accuracy for amino acid incorporation is observed despite the fact that not all such errors are necessarily harmful, since substitution of a single amino acid in a protein does not necessarily lead to its inactivation. The extent of tolerance to misincorporation errors is best illustrated by Candida albicans where CUG codons are decoded as both Leu and Ser due to ambiguous aminoacylation of the corresponding tRNA (Moura et al., 2007). In contrast, errors in processivity, such as frameshift errors, pose a greater danger during translation since they result in alterations not of just a single amino acid but of the entire sequence following such an error. It is reasonable to expect that the decoding apparatus should be able to prevent such errors with even greater accuracy. Indeed, it has been estimated that background levels of frameshifting errors fluctuate in the range of 10⁻⁵–10⁻⁷ (Kurland, 1979; Parker, 1989). At the 2007 ribosomal meeting in Cape Cod, Mons Ehrenberg summarized his talk with the following statement: “Ribosomes are very fast and very accurate and this is the summary of my talk.” It would be hard and perhaps juvenile to argue with such a statement as it would be hard to argue with commercials advertising modern cars saying that they are fast and safe. Cars are, but the traffic is not, at least not always. The safety and speed of traffic depends not only on cars but also on road conditions. By analogy we can describe mRNAs as the roads for the ribosomal traffic. 
We will argue that the observed accuracy of translation relies not only on the properties of the ribosome but also on mRNA sequence. Under certain circumstances mRNA can force translating ribosomes to alter their behavior so that translation can no longer be considered accurate.
Frameshifting occurs with strikingly high efficiencies at certain recoding sites, exceeding background levels by a factor of 10⁶, and under certain conditions could be even more efficient than standard triplet translation. Of course, such efficiency is frequently achieved by an ensemble of complex stimulatory signals that have evolved to increase frameshifting efficiency at a local site. This was described above for RF2 mRNA frameshifting and is also evident from many other examples throughout this book. However, even relatively simple sequences such as the heptameric C.UU_A.GG_C in yeast transposon Ty1 cause frameshifting with efficiency comparable to that of standard translation at the same site without additional stimulators (Belcourt and Farabaugh, 1990). Other simple short sequences are also shift-prone and can lead to frameshifting events of lower efficiency, but still much greater than the background levels. Evidently the accuracy of translation in terms of reading frame maintenance is highly dependent on mRNA nucleotide context. Why is there such dependence and why do ribosomes not translate all sequences with a similar accuracy?
This explains why certain relatively simple sequences can be particularly prone to frameshift errors and why they are rare in most coding regions. However, the situation is not always so simple as we will see in the following sections.
14.4 Strategies for Searching Recoding Cases as Singular Elements
A number of studies have attempted to search for new cases of programmed frameshifting based on the assumption that the sequences that promote ribosomal frameshifting should behave like singular genomic elements and as such be avoided in the coding regions unless the triggered frameshifting is positively selected for. The simplest idea is to search for further occurrences of sequences, of the type known to be utilized for programmed ribosomal frameshifting, throughout the coding regions of completed genomes. Although this approach limits the search to motifs already known to trigger frameshifting and will not increase our knowledge of frameshift-prone sequences, it could reveal novel cases of utilization of these sequences for gene expression purposes. To analyze the frequency of occurrence of sequences capable of stimulating −1 frameshifting in Saccharomyces cerevisiae, Jacobs et al. (2007) searched for viral consensus slippery sites X_XX.Y_YY.Z, where XXX represents any three identical nucleotides, YYY represents AAA or UUU, Z ≠ G. With this approach they identified 10,340 slippery sites in the 6,353 annotated coding sequences of the yeast genome, 6,016 of which are followed by at least one pseudoknot motif. According to statistical analyses employed by the authors these signals are underrepresented in the S. cerevisiae genome. Of the 6,353 yeast ORFs, 1,275 contain at least one strong and statistically significant −1 frameshift signal [in a recent study Theis et al. (2008) have argued that in some cases there are alternative structures that are more stable than the predicted pseudoknots]. Eight out of nine sequences, selected for experimental verification using artificial genetic constructs, supported efficient levels of frameshifting in vivo. The authors hypothesized that many other frameshift candidates found in their study could lead to significant levels of frameshifting. 
If frameshifting indeed takes place at those locations, in the vast majority of cases it would result in production of truncated and most likely dysfunctional products. The authors hypothesized that the role of frameshifting could be regulatory (see the following section). It is unclear how beneficial such a regulation might be for the cells and no data on phylogenetic conservation of these sequences have been provided.
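The heptamer step of such a search can be sketched as follows. The function scans a coding sequence for in-frame matches to X_XX.Y_YY.Z (XXX any three identical nucleotides, YYY either AAA or UUU, Z ≠ G). The function name is hypothetical, the input is assumed to begin at the first base of the start codon, and the downstream pseudoknot prediction used by Jacobs et al. is not attempted here.

```python
def find_slippery_sites(cds):
    """Return 0-based start positions of in-frame X_XX.Y_YY.Z heptamers.

    X: one base repeated three times; YYY: AAA or TTT (DNA alphabet);
    Z: any base except G. `cds` is assumed to start at the start codon.
    """
    cds = cds.upper().replace("U", "T")
    hits = []
    # The heptamer begins one base before a codon boundary of the 0 frame,
    # so valid start positions satisfy i % 3 == 2.
    for i in range(2, len(cds) - 6, 3):
        heptamer = cds[i:i + 7]
        x, y, z = heptamer[0:3], heptamer[3:6], heptamer[6]
        if len(set(x)) == 1 and x[0] in "ACGT" and y in ("AAA", "TTT") and z in "ACT":
            hits.append(i)
    return hits


# Toy example: ATG GCT TTA AAC GGT AA contains T_TT.A_AA.C starting at index 5
print(find_slippery_sites("ATGGCTTTAAACGGTAA"))
```

Fixing the frame in this way matters: the same heptamer in either of the other two phases is not expected to permit re-pairing of both tRNAs after a −1 shift, so it is not reported.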
In a different work (Gurvich et al., 2003), the E. coli K12 genome was searched for occurrences of the very well-known prokaryotic slippery sequence A_AA.A_AA.G. Frameshifting at A_AA.A_AA.G is utilized for expression of the γ subunit of DNA polymerase III, while the τ subunit is expressed by standard translation from the same gene (dnaX) (Blinkowa and Walker, 1990; Flower and McHenry, 1990; Tsuchihashi and Kornberg, 1990). Frameshifting at this sequence is also utilized by a number of insertion sequence elements in E. coli (Hu et al., 1996; Baranov et al., 2006). Seventy instances of this sequence were found in 68 E. coli genes. Twelve genes were chosen for experimental analysis and all of them were shown to support −1 frameshifting at levels above background. The authors used comparative phylogenetic analysis to address potential utilization of any of those sequences for gene expression purposes. Apart from the dnaX gene, six IS2-like elements and the ydaY gene of unknown function utilize A_AA.A_AA.G for gene expression. Although the number of occurrences is quite high, according to the statistical analysis this sequence is underrepresented in coding regions, and thus does behave as a singular element. The distribution of three other known shift-prone sequences in E. coli K12, CCC_UGA (Gurvich et al., 2003), AGG_AGG, and AGA_AGA (Gurvich et al., 2005), was also examined. All three sequences trigger +1 frameshifting in E. coli. Frameshifting at C.CC_U.GA occurs through near-cognate recognition of the CCC codon by tRNAPro 5’U*GG3’ (where U* designates the cmo5U34 modification) (O’Connor, 2002). Because of suboptimal base pairing with the CCC codon, this tRNA is prone to shift into the +1 frame to re-pair to mRNA at the cognate CCU codon. As with RF2 mRNA frameshifting, frameshifting on C.CC_U.GA is in direct competition with termination mediated by RF2, and its efficiency is increased due to slow decoding of the termination codon. 
Although not known to be utilized for gene expression in E. coli, frameshifting at C.CC_U.GA is employed for expression of antizyme genes in some eukaryotes (Ivanov and Atkins, 2007) and for expression of the tsh gene of Listeria monocytogenes phage PSA (Zimmer et al., 2003). Nineteen genes in E. coli K12 end with C.CC_TGA and in half of them frameshifting occurs at above 1% (Gurvich et al., 2003).
Frameshifting on A.GG_A.GG and A.GA_A.GA is due to limited abundance of the cognate arginine tRNAArg 3’UCC5’ and tRNAArg 3’UCU*5’ (where U* is 5-methylaminomethyl-2-thiouridine), respectively. Due to sequestration of the sparse tRNA by the first of the tandem codons, its availability for the second codon is drastically reduced. When the second codon occupies the A-site of the translating ribosome, the longer-than-usual time for arrival of the cognate tRNA increases the chance for dissociation of the peptidyl-tRNA, which may re-pair to mRNA in the overlapping +1 frame (or potentially −1 frame, as has been shown for an A.GA_A.GA tandem by Lainé et al. (2008)). Frameshifting to the new frame is greatly favored by availability of the tRNA cognate to the new codon in the +1 frame. The A.GG_A.GG and A.GA_A.GA tandems were originally reported to trigger up to 50% frameshifting (Spanjaard and van Duin, 1988; Spanjaard et al., 1990). However, such high levels of frameshifting are likely due to overexpression of the mRNAs containing these sequences (Gurvich et al., 2005) and to the use of streptomycin-resistant strains, in which ribosomes translate the mRNA more slowly, making them prone to +1 frameshifting at the rare codons (Sipley and Goldman, 1993). Nevertheless, even at the lowest possible expression level of the transgene, frameshifting at A.GA_A.GA (and likely A.GG_A.GG) occurs at about the 1% level (Gurvich et al., 2005). None of the three frameshift-prone sequences C.CC_U.GA, A.GG_A.GG, and A.GA_A.GA is underrepresented in E. coli, and in fact C.CC_U.GA is significantly overrepresented. However, none of these sequences, including A_AA.A_AA.G, occurs in the subset of highly expressed genes in E. coli (Karlin et al., 2001). This means that although not significantly underrepresented in coding regions overall, these sequences are selected against in highly expressed ORFs, and in that way they behave as singular elements in highly expressed genes. In contrast to the Jacobs et al. 
study, Gurvich et al. suggested that the occurrence of these frameshift candidates in protein coding regions does not have a functional role, since they do not exhibit phylogenetic conservation. Gurvich et al. argued that frameshifting above background level in lowly expressed genes could easily be tolerated by cells, since only a few aberrant protein molecules would be produced as a result of frameshifting. Therefore, the presence of shift-prone sequences in certain locations can be explained not by their beneficial effects but by the lack of strong selection against such sequences. Future studies are expected to resolve the contrasting interpretations.
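The searches described in this section depend on locating a shift-prone motif in the correct phase relative to the annotated reading frame. A minimal sketch is given below: it reads the notation used in this chapter (underscores separate codons in the initial frame, dots separate codons after the shift) and reports only in-frame occurrences. Both function names are hypothetical, the motif is assumed to contain at least one underscore, and no statistical assessment of under- or overrepresentation is included.

```python
def parse_shift_notation(motif):
    """Split notation such as 'A_AA.A_AA.G' into (DNA sequence, frame offset).

    Underscores mark codon boundaries in the initial (0) frame; dots mark the
    shifted frame and are ignored. The offset is the position modulo 3 at
    which the motif must start within a coding sequence. At least one
    underscore is assumed to be present.
    """
    letters_before_boundary = len(motif.split("_")[0].replace(".", ""))
    seq = motif.replace("_", "").replace(".", "").upper().replace("U", "T")
    return seq, (-letters_before_boundary) % 3


def find_in_frame(cds, motif):
    """0-based start positions of `motif` in the annotated frame of `cds`."""
    seq, offset = parse_shift_notation(motif)
    cds = cds.upper().replace("U", "T")
    return [i for i in range(offset, len(cds) - len(seq) + 1, 3)
            if cds.startswith(seq, i)]


# Toy sequences, not real genes
print(find_in_frame("ATGGCAAAAAAGTAA", "A_AA.A_AA.G"))  # codons ATG GCA AAA AAG TAA
print(find_in_frame("ATGCCCTGA", "C.CC_U.GA"))          # gene ending in CCC_TGA
```

Applied to every annotated coding sequence of a genome, the resulting counts could then be compared between gene sets (for example, highly expressed versus other genes), which is the comparison drawn in the text.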
The most general ab initio study related to singular elements supporting frameshifting was performed by Shah et al. (2002) where the distribution of all heptamers occurring in coding regions of the yeast S. cerevisiae genome was analyzed. A fraction of the least abundant and the most underrepresented heptamers have been tested for their ability to trigger ribosomal frameshifting. All sequences tested stimulated ribosomal frameshifting at above background levels with some of them promoting highly efficient frameshifting. Notably, the heptamer sequences C.UU_A.GU_U and C.UU_A.GG_C used to trigger programmed ribosomal frameshifting for expression of EST3 (Morris and Lundblad, 1997; Taliaferro and Farabaugh, 2007) and ABP140 (Asakura et al., 1998), respectively, are ranked among the least represented in coding regions of S. cerevisiae. While this approach appeared to have good predictability for sequences supporting +1 frameshifting in yeast, it failed in predicting sequences that would stimulate −1 frameshifting. The authors suggested this could be because the sequences utilized for −1 programmed frameshifting in yeast do not stimulate frameshifting at sufficiently high efficiency without additional cis-acting elements.
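An ab initio ranking of this kind can be illustrated with a short sketch that counts heptamers at a fixed codon phase and compares each count with a naive expectation. The null model used here (independent single-base frequencies) is a deliberately crude assumption chosen for illustration and is not claimed to be the statistic used by Shah et al.; the function name and parameters are likewise hypothetical.

```python
from collections import Counter
from math import prod


def heptamer_representation(coding_seqs, frame_offset=0):
    """Observed/expected ratio for heptamers starting at a fixed codon phase.

    Expected counts come from independent single-base frequencies, a crude
    null model used for illustration only. Heptamers that never occur are
    absent from the result (their ratio would be 0).
    """
    base_counts, observed = Counter(), Counter()
    for seq in coding_seqs:
        seq = seq.upper()
        base_counts.update(seq)
        for i in range(frame_offset, len(seq) - 6, 3):
            observed[seq[i:i + 7]] += 1
    total_bases = sum(base_counts.values())
    freq = {base: n / total_bases for base, n in base_counts.items()}
    windows = sum(observed.values())
    return {
        heptamer: n / (windows * prod(freq[b] for b in heptamer))
        for heptamer, n in observed.items()
    }


# Sorting by ratio lists the most underrepresented observed heptamers first
ratios = heptamer_representation(["ACACACACAC"])
print(sorted(ratios.items(), key=lambda item: item[1]))
```

A realistic analysis would need a null model that accounts for codon usage and amino acid composition, since those alone make many heptamers rare in coding regions.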
14.5 Possible Functions of Products Generated by Low-Level Aberrant Translation
As has been shown by several studies described above, shift-prone sequences, although somewhat underrepresented throughout the genome and absent in highly expressed genes, are frequent in coding sequences. In a few distinct cases specific functional consequences of frameshifting can be envisioned. However, such cases are rare and in general frameshifting on frameshift-prone sequences will result in premature termination and production of a nonfunctional peptide that gets degraded. Most likely such frameshift events occur without any specific functional role and constitute minor faults of the translation process. Nevertheless, some general impact of such erroneous frameshifting on regulation of different cellular processes has been proposed. Some authors suggest that erroneous frameshifting can posttranscriptionally regulate mRNA stability, since an encounter of a translating ribosome with a premature termination codon would trigger mRNA degradation through the nonsense-mediated decay (NMD) pathway (Jacobs et al., 2007). However, growing evidence suggests that in higher eukaryotes NMD can be triggered only during the first, so-called pioneer round of translation [reviewed in Chang et al. (2007)]. If frameshifting occurs at a level of about 1%, then an mRNA containing such a frameshift site would be degraded through the NMD pathway in only 1% of the cases. On the other hand, in S. cerevisiae, where NMD is inefficient and can be triggered after a number of translations of the PTC-containing mRNA, some downregulation of the mRNAs containing frameshift sites is feasible.
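The arithmetic behind this argument can be made explicit with a small sketch. It assumes that each round of translation frameshifts independently with the same probability and that a frameshift leads to decay with a fixed probability; both simplifications, as well as the function and parameter names, are assumptions introduced here for illustration.

```python
def fraction_degraded(frameshift_rate, rounds, nmd_efficiency=1.0):
    """Fraction of mRNA molecules degraded after `rounds` of translation.

    Assumes independent rounds, each frameshifting with probability
    `frameshift_rate`, and decay following a frameshift with probability
    `nmd_efficiency` (hypothetical parameter).
    """
    per_round = frameshift_rate * nmd_efficiency
    return 1 - (1 - per_round) ** rounds


# Pioneer round only: the degraded fraction equals the frameshift rate
print(f"{fraction_degraded(0.01, rounds=1):.3f}")
# Decay possible in every round, even at reduced efficiency
print(f"{fraction_degraded(0.01, rounds=100, nmd_efficiency=0.5):.3f}")
```

With decay restricted to a single round, a 1% frameshift rate removes only about 1% of transcripts, whereas allowing decay in every round lets the effect accumulate with the number of translations, which is why some downregulation is conceivable in S. cerevisiae.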
A consequence of erroneous frameshifting is production of an aberrant peptide. In some cases, when frameshifting occurs near the end of the coding region, the peptide synthesized might retain its function and could be utilized along with the products of standard translation (Mejlhede et al., 1999). In all other cases it is generally assumed that nonfunctional peptides get degraded. However, their exact fate is unknown. Peptides produced by erroneous frameshifting can potentially be utilized as cryptic epitopes in the immune system. Two such cases have been described in the literature to date. One was identified in a patient with Reiter’s syndrome. There, a transframe peptide produced via frameshifting from the IL-10 gene served as a cryptic epitope to activate cytotoxic T cells (Saulquin et al., 2002). Intriguingly, the authors speculated that the frameshifting in IL-10 could be of pathophysiological relevance, since preliminary data suggested recognition of the same epitope in another rheumatoid arthritis patient. Another example was identified in the herpes simplex virus (HSV) tk gene, which encodes thymidine kinase (TK). Thymidine kinase is crucial for reactivation of the virus from a latent phase and is a target for antiviral therapy with the drug acyclovir. An acyclovir-resistant mutant has the insertion of a single G nucleotide in a run of 7 G’s in the tk gene, resulting in a run of 8 G’s (Horsburgh et al., 1996). This frameshift mutation results in synthesis of nonfunctional TK, and the mutant is resistant to acyclovir, which has to be phosphorylated by TK and subsequently by host kinases to an active form that interferes with viral replication (Elion, 1982). However, low levels of functional TK that are crucial for viral propagation are synthesized via ribosomal frameshifting on the run of 8 G’s (Griffiths et al., 2006; Besecker et al., 2007). 
In the wild-type tk gene the run of 7 G’s also causes about 1% frameshifting, and the truncated peptide serves as a cryptic epitope and can trigger an immune response (Zook et al., 2006).
As we demonstrated in this chapter, sequences responsible for highly efficient alterations of standard genetic readout are sometimes underrepresented in protein coding regions of genomes. When such sequences play crucial roles in gene expression, e.g., when they are required for the biosynthesis of functional gene products, they exhibit deep phylogenetic conservation. Such sequences can be classified as singular genomic elements. Yet, there are a substantial number of sequences prone to low-level aberrant translational events, and their underrepresentation in coding sequences is less pronounced. Even though the negative impact of such sequences on gene expression is less critical and their genomic locations are not strictly conserved, the subsequent non-canonical translational events have important functional implications, such as fine-tuning of expression levels during posttranscriptional regulation or production of epitopes for an immune response.
P.V.B. thanks Science Foundation Ireland for support.
- Atkins JF, Baranov PV, Fayet O, Herr AJ, Howard MT, Ivanov IP, Matsufuji S, Miller WA, Moore B, Prere MF, Wills NM, Zhou J, Gesteland RF (2001) Overriding standard decoding: implications of recoding for ribosome function and enrichment of gene expression. Cold Spr Harb Symp Quant Biol 66:217–232
- Bernardi G, Bernardi G (1986) Compositional constraints and genome evolution. J Mol Evol 24:1–11
- Besecker MI, Furness CL, Coen DM, Griffiths A (2007) Expression of extremely low levels of thymidine kinase from an acyclovir-resistant herpes simplex virus mutant supports reactivation from latently infected mouse trigeminal ganglia. J Virol 81:8356–8360
- Flower AM, McHenry CS (1990) The gamma subunit of DNA polymerase III holoenzyme of Escherichia coli is produced by ribosomal frameshifting. Proc Natl Acad Sci USA 87:3713–3717
- Griffiths A, Link MA, Furness CL, Coen DM (2006) Low-level expression and reversion both contribute to reactivation of herpes simplex virus drug-resistant mutants with mutations on homopolymeric sequences in thymidine kinase. J Virol 80:6568–6574
- Kurland C (1979) Reading frame errors on ribosomes. In: Celis J, Smith JD (eds) Nonsense mutations and tRNA suppressors, Academic Press, London, pp 97–108
- Kurland CG, Hughes D, Ehrenberg M (1996) Limitations of translational accuracy. In: Escherichia coli and Salmonella typhimurium: Cellular and molecular biology, ASM Press, Washington, DC, pp 979–1004
- Lainé S, Thouard A, Komar AA, Rossignol JM (2008) Ribosome can resume the translation in both +1 or –1 frames after encountering an AGA cluster in Escherichia coli. Gene 412:95–101
- Poole ES, Major LL, Mannering SA, Tate WP (1998) Translational termination in Escherichia coli: three bases following the stop codon crosslink to release factor 2 and affect the decoding efficiency of UGA-containing signals. Nucl Acids Res 26:954–960
- Sanders CL, Curran JF (2007) Genetic analysis of the E site during RF2 programmed frameshifting. RNA 13:1483–1491
- Taliaferro D, Farabaugh PJ (2007) An mRNA sequence derived from the yeast EST3 gene stimulates programmed +1 translational frameshifting. RNA 13:606–613
- Theis C, Reeder J, Giegerich R (2008) KnotInFrame: prediction of –1 ribosomal frameshift events. Nucl Acids Res 36:6013–6020
- Tsuchihashi Z, Kornberg A (1990) Translational frameshifting generates the gamma subunit of DNA polymerase III holoenzyme. Proc Natl Acad Sci USA 87:2516–2520
- Vallabhaneni H, Fan-Minogue H, Bedwell DM, Farabaugh PJ (2009) Connection between stop codon reassignment and frequent use of shifty stop frameshifting. RNA 15:889–897