Ancestral Reconstruction of a Pre-LUCA Aminoacyl-tRNA Synthetase Ancestor Supports the Late Addition of Trp to the Genetic Code

  • Original Article
Journal of Molecular Evolution

Abstract

The genetic code was likely complete in its current form by the time of the last universal common ancestor (LUCA). Several scenarios have been proposed to explain the code’s pre-LUCA emergence and expansion, and the relative order in which the amino acids used in translation appeared. One co-evolutionary model of genetic code expansion proposes that at least some amino acids were added to the code through the ancient divergence of aminoacyl-tRNA synthetase (aaRS) families. Of all the amino acids used within the genetic code, Trp is most frequently claimed as a relatively recent addition. We observe that, since TrpRS and TyrRS are paralogous protein families retaining significant sequence similarity, the inferred sequence composition of their ancestor can be used to evaluate this co-evolutionary model of genetic code expansion. We show that ancestral sequence reconstructions of the pre-LUCA paralog ancestor of TyrRS and TrpRS have several sites containing Tyr, yet a complete absence of sites containing Trp. This is consistent with the paralog ancestor being specific for the utilization of Tyr, with Trp being a subsequent addition to the genetic code facilitated by a process of aaRS divergence and neofunctionalization. Only after this divergence could Trp be specifically encoded and incorporated into proteins, including the TyrRS and TrpRS descendant lineages themselves. This early absence of Trp is observed under both homogeneous and non-homogeneous models of ancestral sequence reconstruction. Simulations indicate that this observed absence of Trp is unlikely to be due to chance or model bias. These results support the view that the final stages of genetic code evolution occurred well within the “protein world,” and suggest that the presence or absence of Trp within conserved sites of ancient protein domains is a likely measure of their relative antiquity, permitting the relative timing of extremely early events in protein evolution before LUCA.
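The question of whether a complete absence of Trp could arise by chance can be illustrated with a deliberately simple null model. The sketch below is not the simulation procedure used in the article (which is not described in this abstract); it assumes sites are independent and that Trp occurs at each site with a fixed background frequency. The function names and the numeric values in the usage note are hypothetical and chosen only for illustration.

```python
import random


def prob_zero_occurrences(freq: float, n_sites: int) -> float:
    """Analytic probability that a residue with per-site frequency `freq`
    appears at none of `n_sites` independent sites: (1 - freq) ** n_sites."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    return (1.0 - freq) ** n_sites


def simulate_zero_fraction(freq: float, n_sites: int, n_trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability: the fraction of random
    sequences of length `n_sites` in which the residue never occurs."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    zero_count = 0
    for _ in range(n_trials):
        # A site "contains" the residue with probability `freq`.
        has_residue = any(rng.random() < freq for _ in range(n_sites))
        if not has_residue:
            zero_count += 1
    return zero_count / n_trials
```

For example, `prob_zero_occurrences(0.02, 50)` gives the chance of seeing no Trp at 50 sites if Trp had a hypothetical 2% background frequency, and `simulate_zero_fraction(0.02, 50, 5000)` should approximate the same value. A model that accounts for the phylogeny, site-specific rates, and reconstruction bias would be needed to address the "model bias" part of the claim.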

Acknowledgments

This work was supported by the National Science Foundation Grant 0936234, NASA Astrobiology Institute Grant NNA08CN84A, and an appointment from the NASA Postdoctoral Program to GPF at the Massachusetts Institute of Technology. We thank Mathieu Groussin and Bastien Boussau for helpful discussions and their assistance with implementing non-homogeneous ancestral reconstruction models.

Conflict of interest

The authors declare that they have no conflict of interest.

Author information

Corresponding author

Correspondence to G. P. Fournier.

Electronic supplementary material

Below is the link to the electronic supplementary material.

239_2015_9672_MOESM1_ESM.xlsx

Homogeneous ancestral reconstruction of TyrRS/TrpRS paralog ancestor, per-site amino acid probabilities. Sites within majority topology regions are labeled V (vertical), sites within proposed partial HGT regions are labeled R1–R3 (recombined). Combined alignment sites refer to the native sequence alignment, and match the numbering used throughout the manuscript. Topology-specific alignment sites refer to the provided FASTA format sequence alignments for each topology region (V, R1, R2, R3). As R regions were removed from the V alignment, these numberings differ. Numbering in the “prob” column headings refers to each internal node reconstruction. The mapping of these nodes to the phylogeny is provided in Online Resource 11. For each node, “max” refers to the maximum-likelihood AA for each site. Supplementary material 1 (XLSX 10629 kb)

239_2015_9672_MOESM2_ESM.xlsx

Non-homogeneous ancestral reconstruction of TyrRS/TrpRS paralog ancestor, per-site amino acid probabilities. Sites within majority topology regions are labeled V (vertical), sites within proposed partial HGT regions are labeled R1–R3 (recombined). Combined alignment sites refer to the native sequence alignment, and match the numbering used throughout the manuscript. Topology-specific alignment sites refer to the provided FASTA format sequence alignments for each topology region (V, R1, R2, R3). As R regions were removed from the V alignment, these numberings differ. Numbering in the “prob” column headings refers to each internal node reconstruction. The mapping of these nodes to the phylogeny is provided in Online Resource 11. For each node, “max” refers to the maximum-likelihood AA for each site. Supplementary material 2 (XLSX 10649 kb)

Table of species names and abbreviations used in online trees and alignments. Supplementary material 3 (XLSX 52 kb)

239_2015_9672_MOESM4_ESM.fasta

FASTA format alignment of TyrRS/TrpRS protein sequences, with partial HGT regions removed. Supplementary material 4 (FASTA 183 kb)

239_2015_9672_MOESM5_ESM.fasta

FASTA format alignment of TyrRS/TrpRS protein sequences, proposed partial HGT region R1. Supplementary material 5 (FASTA 7 kb)

239_2015_9672_MOESM6_ESM.fasta

FASTA format alignment of TyrRS/TrpRS protein sequences, proposed partial HGT region R2. Supplementary material 6 (FASTA 13 kb)

239_2015_9672_MOESM7_ESM.fasta

FASTA format alignment of TyrRS/TrpRS protein sequences, proposed partial HGT region R3. Supplementary material 7 (FASTA 5 kb)

Phylogeny for proposed partial HGT region R1. Supplementary material 8 (PDF 10 kb)

Phylogeny for proposed partial HGT region R2. Supplementary material 9 (PDF 10 kb)

Phylogeny for proposed partial HGT region R3. Supplementary material 10 (PDF 10 kb)

Phylogeny with mapping for ancestral node reconstruction numberings. Supplementary material 11 (PDF 12 kb)
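The per-site amino acid probability tables described for Supplementary materials 1 and 2 lend themselves to a simple tally of maximum-likelihood residues at a given ancestral node. The following is a minimal sketch of that idea, assuming the probabilities for one node have already been loaded into a list of per-site dictionaries; the actual spreadsheet layout, the function names, and the toy values are assumptions and are not taken from the supplementary files.

```python
from collections import Counter
from typing import Mapping, Sequence


def ml_residue(site_probs: Mapping[str, float]) -> str:
    """Return the amino acid with the highest probability at one site
    (the analogue of the "max" column described above)."""
    return max(site_probs, key=site_probs.get)


def count_ml_residues(node_sites: Sequence[Mapping[str, float]], min_prob: float = 0.0) -> Counter:
    """Tally maximum-likelihood residues across sites for one node,
    optionally keeping only sites whose best residue reaches `min_prob`."""
    counts: Counter = Counter()
    for site_probs in node_sites:
        residue = ml_residue(site_probs)
        if site_probs[residue] >= min_prob:
            counts[residue] += 1
    return counts


# Hypothetical toy data for a single node: one dictionary per alignment site.
toy_node = [
    {"Y": 0.90, "F": 0.08, "W": 0.02},
    {"G": 0.99, "A": 0.01},
    {"Y": 0.55, "H": 0.45},
    {"L": 0.40, "I": 0.35, "V": 0.25},
]
```

With the toy data, `count_ml_residues(toy_node)["Y"]` counts the sites reconstructed as Tyr and `count_ml_residues(toy_node)["W"]` counts those reconstructed as Trp. Raising `min_prob` restricts the tally to confidently reconstructed sites.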

About this article

Cite this article

Fournier, G.P., Alm, E.J. Ancestral Reconstruction of a Pre-LUCA Aminoacyl-tRNA Synthetase Ancestor Supports the Late Addition of Trp to the Genetic Code. J Mol Evol 80, 171–185 (2015). https://doi.org/10.1007/s00239-015-9672-1

  • DOI: https://doi.org/10.1007/s00239-015-9672-1
