Abstract
An analysis of the occurrence of simple amino acid repeats in a large ensemble of prokaryotic and eukaryotic sequences reveals that nearly all amino acids found in the repeats are those whose codon repertoires include aggressively expanding triplets from the three known pathologically expanding classes: GCU (GCU, CUG, UGC, AGC, GCA, CAG), GCC (GCC, CCG, CGC, GGC, GCG, CGG), and AAG (AAG, AGA, GAA, CUU, UUC, UCU). This is seen especially clearly in the first exons of proteins of higher eukaryotes. The data are interpreted as a manifestation of everlasting triplet expansions, which presumably started at the very origin of the triplet code. Spontaneous expansions continued to occur throughout evolution, leaving their footprints in protein-coding sequences as still-visible simple amino acid repeats, as preferred triplets encoding the repeats, and as preferred codons in codon usage tables.
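The three triplet classes named in the abstract can be reproduced with a short sketch. It assumes that each class consists of the cyclic permutations of a triplet together with those of its reverse complement (written in RNA notation, with U in place of T), and it uses the standard genetic code to list which amino acids have at least one codon in each class. The function names `reverse_complement`, `expansion_class`, and `encoded_amino_acids` are illustrative, not taken from the article.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ('*' = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(triplet):
    """Return the reverse complement of an RNA triplet."""
    return triplet.translate(COMPLEMENT)[::-1]


def expansion_class(triplet):
    """Assumed class definition: all cyclic shifts of the triplet
    and of its reverse complement."""
    members = set()
    for seq in (triplet, reverse_complement(triplet)):
        for i in range(3):
            members.add(seq[i:] + seq[:i])
    return members


def encoded_amino_acids(triplets):
    """Amino acids with at least one codon among the given triplets."""
    return {CODON_TABLE[t] for t in triplets} - {"*"}


if __name__ == "__main__":
    for name in ("GCU", "GCC", "AAG"):
        members = expansion_class(name)
        amino = "".join(sorted(encoded_amino_acids(members)))
        print(name, sorted(members), amino)
```

Under these assumptions the sketch regenerates the six triplets listed for each class; the amino acid sets it prints follow from the standard genetic code alone and say nothing about how often the corresponding repeats occur in real sequences.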
Acknowledgments
This work was supported by grant 710/02-19.0 of the Israeli Science Foundation and by an EU grant QLG2-CT-2002-01298.
Cite this article
Koren, Z., Trifonov, E.N. Role of Everlasting Triplet Expansions in Protein Evolution. J Mol Evol 72, 232–239 (2011). https://doi.org/10.1007/s00239-010-9425-0
DOI: https://doi.org/10.1007/s00239-010-9425-0