Abstract
The genetic code is the syntactic foundation underlying the structure and function of every protein in the history of the biological world. Its highly ordered degenerate complexity suggests an incremental evolution, the result of a combination of selective, mechanistic, and random processes. These evolutionary processes are still poorly understood and remain an open question in the study of early life on Earth. We perform a compositional analysis of ribosomal proteins and ATPase subunits in bacterial and archaeal lineages, using conserved positions that came and remained under purifying selection before and up to the most recent common ancestor. An observable shift in amino acid usage at these conserved positions likely provides an untapped window into the history of protein sequence space, allowing events of genetic code expansion to be identified. We identify Cys, Glu, Phe, Ile, Lys, Val, Trp, and Tyr as recent additions to the genetic code, with Asn, Gln, Gly, and Leu among the more ancient. Our observations are consistent with a scenario in which genetic code expansion primarily favored amino acids that promoted an increase in polypeptide size and functionality. We propose that this expansion would have been critical in the takeover of many RNA-mediated processes, as well as the addition of novel biological functions inaccessible to an RNA-based physiology, such as crossing lipid membranes. Thus, expansion of the genetic code likely set the stage for the transition from RNA-based to protein-based life.
Acknowledgments
This work was supported through the NASA Exobiology Program (NNX07AK15G).
Cite this article
Fournier, G.P., Gogarten, J.P. Signature of a Primitive Genetic Code in Ancient Protein Lineages. J Mol Evol 65, 425–436 (2007). https://doi.org/10.1007/s00239-007-9024-x
DOI: https://doi.org/10.1007/s00239-007-9024-x