Journal of Molecular Evolution, Volume 77, Issue 4, pp 170–184

A Realistic Model Under Which the Genetic Code is Optimal

  • Harry Buhrman
  • Peter T. S. van der Gulik
  • Gunnar W. Klau
  • Christian Schaffner
  • Dave Speijer
  • Leen Stougie
Original Article


The genetic code has a high level of error robustness. Using values of hydrophobicity scales as a proxy for amino acid character, and the mean square measure as a function quantifying error robustness, a value can be obtained for a genetic code that reflects its error robustness. Comparing this value with the distribution of values for codes generated by random permutations of amino acid assignments quantifies the level of error robustness of a genetic code. We present a calculation in which the standard genetic code is shown to be optimal. We obtain this result by (1) using recently updated values of polar requirement as input; (2) fixing seven assignments (Ile, Trp, His, Phe, Tyr, Arg, and Leu) based on aptamer considerations; and (3) using known biosynthetic relations of the 20 amino acids. This last point is reflected in an approach of subdivision: the random reallocation of assignments is restricted to amino acid subgroups, with the set of 20 divided into four such subgroups. The three approaches to explaining the robustness of the code (specific selection for robustness, amino acid–RNA interactions leading to assignments, or a slow growth process of assignment patterns) are reexamined in light of our findings. We offer a comprehensive hypothesis, stressing the importance of biosynthetic relations, with the code evolving from an early stage with just glycine and alanine, via intermediate stages, towards 64 codons carrying today's meaning.
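The comparison described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes an unweighted mean square measure over single-base substitutions between sense codons, and it uses the original polar requirement values of Woese et al. (1966) as commonly tabulated, not the updated values the abstract refers to. The names `mean_square`, `random_code`, and `fraction_better` are hypothetical. The `groups` argument only shows where a subdivision into subgroups would enter; the actual four subgroups are not listed in the abstract and are left unspecified.

```python
import random
from itertools import product

BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
# Standard genetic code in TCAG order; "*" marks the three stop codons.
AA_STRING = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = dict(zip(CODONS, AA_STRING))

# ASSUMPTION: original polar requirement values (Woese et al. 1966),
# not the updated values used in the paper.
POLAR_REQUIREMENT = {
    "A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8,
    "Q": 8.6, "E": 12.5, "G": 7.9, "H": 8.4, "I": 4.9,
    "L": 4.9, "K": 10.1, "M": 5.3, "F": 5.0, "P": 6.6,
    "S": 7.5, "T": 6.6, "W": 5.2, "Y": 5.4, "V": 5.6,
}

# The seven assignments the abstract holds fixed:
# Ile, Trp, His, Phe, Tyr, Arg, and Leu.
FIXED = frozenset("IWHFYRL")


def mean_square(code, scale):
    """Unweighted mean squared scale difference over all single-base
    changes between sense codons (changes involving stops are skipped)."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = code[codon[:pos] + base + codon[pos + 1:]]
                if mutant == "*":
                    continue
                total += (scale[aa] - scale[mutant]) ** 2
                count += 1
    return total / count


def random_code(rng, fixed=frozenset(), groups=None):
    """Permute amino acids among the synonymous codon blocks of the
    standard code. Stops and `fixed` amino acids keep their blocks;
    if `groups` is given, shuffling happens only within each group."""
    amino_acids = sorted(set(AA_STRING) - {"*"})
    if groups is None:
        groups = [amino_acids]
    mapping = {"*": "*"}
    for group in groups:
        movable = [a for a in group if a not in fixed]
        shuffled = movable[:]
        rng.shuffle(shuffled)
        mapping.update(zip(movable, shuffled))
    for aa in amino_acids:
        mapping.setdefault(aa, aa)  # fixed or ungrouped: unchanged
    return {codon: mapping[aa] for codon, aa in STANDARD_CODE.items()}


def fraction_better(n_samples, seed=0, fixed=frozenset(), groups=None,
                    scale=POLAR_REQUIREMENT):
    """Fraction of sampled codes with a strictly lower mean square
    value (that is, more error robust) than the standard code."""
    rng = random.Random(seed)
    reference = mean_square(STANDARD_CODE, scale)
    better = sum(
        mean_square(random_code(rng, fixed, groups), scale) < reference
        for _ in range(n_samples)
    )
    return better / n_samples


if __name__ == "__main__":
    ms = mean_square(STANDARD_CODE, POLAR_REQUIREMENT)
    print("mean square value, standard code:", round(ms, 3))
    print("fraction better, free shuffle:", fraction_better(2000))
    print("fraction better, seven fixed:", fraction_better(2000, fixed=FIXED))
```

A fraction of zero in such a sample means only that no better code was drawn, not that none exists. Establishing optimality within a restricted set of codes requires searching that set exhaustively or with an exact optimization method, which this sampling sketch does not attempt.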


Keywords: Genetic code; Error robustness; Origin of life; Polar requirement



We thank the Editor-in-Chief and two anonymous reviewers for suggestions that improved the manuscript. Part of this research has been funded by NWO-VICI Grant 639-023-302, by the NWO-CLS MEMESA Grant, by the Tinbergen Institute, and by an NWO-VENI Grant.




Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Harry Buhrman (1, 3)
  • Peter T. S. van der Gulik (1), corresponding author
  • Gunnar W. Klau (1, 4)
  • Christian Schaffner (1, 3)
  • Dave Speijer (2, 3)
  • Leen Stougie (1, 4)

  1. Centrum Wiskunde & Informatica (CWI), Amsterdam, The Netherlands
  2. Department of Medical Biochemistry, Academic Medical Center, Amsterdam, The Netherlands
  3. University of Amsterdam, Amsterdam, The Netherlands
  4. VU University, Amsterdam, The Netherlands
