Volume 7, Issue 6, pp 443–450

Optimum growth temperature and the base composition of open reading frames in prokaryotes

  • R. J. Lambros
  • J. R. Mortimer
  • D. R. Forsdyke (Email author)
Original Paper


The purine-loading index (PLI) is the difference between the numbers of purines (A+G) and pyrimidines (T+C) per kilobase of single-stranded nucleic acid. By purine-loading their mRNAs, organisms may minimize unnecessary RNA–RNA interactions and prevent inadvertent formation of "self" double-stranded RNA. Since RNA–RNA interactions have a strong entropy-driven component, the need to minimize them should increase as temperature increases. Consistent with this, we report for 550 prokaryotic species that optimum growth temperature is related to the average PLI of open reading frames. With increasing temperature, prokaryotes tend to acquire base A and lose base C, while keeping bases T and G relatively constant. Accordingly, while the PLI increases, the (G+C)% decreases. The previously observed positive correlation between (G+C)% and optimum growth temperature, which applies to RNA species whose structure is of major importance for their function (ribosomal and transfer RNAs), does not apply to mRNAs, and hence is unlikely to apply generally to genomic DNA.
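The PLI definition above lends itself to a short worked example. The following Python sketch is illustrative only and is not the authors' pipeline. The function names (`base_counts`, `purine_loading_index`, `gc_percent`) are hypothetical, and the sketch assumes that U is counted as T, that ambiguous symbols such as N are ignored, and that "per kilobase" means scaling by the number of counted bases.

```python
def base_counts(seq):
    """Count A, C, G and T in a sequence.

    Assumptions (not from the paper): U is treated as T, and any other
    symbol (N, gaps, whitespace) is ignored.
    """
    s = seq.upper().replace("U", "T")
    return {base: s.count(base) for base in "ACGT"}


def purine_loading_index(seq):
    """Purines (A+G) minus pyrimidines (T+C), scaled to a per-kilobase value."""
    c = base_counts(seq)
    total = sum(c.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T/U bases")
    return 1000.0 * ((c["A"] + c["G"]) - (c["T"] + c["C"])) / total


def gc_percent(seq):
    """(G+C) as a percentage of the counted bases."""
    c = base_counts(seq)
    total = sum(c.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T/U bases")
    return 100.0 * (c["G"] + c["C"]) / total


if __name__ == "__main__":
    # Toy sequences, invented for illustration: the second differs from
    # the first only by one C -> A substitution.
    before = "AAGGAAGGTC"
    after = "AAGGAAGGTA"
    for name, seq in (("before", before), ("after", after)):
        print(name, purine_loading_index(seq), gc_percent(seq))
```

For the toy sequences, replacing a single C with A raises the PLI from 600 to 800 per kilobase while lowering the (G+C)% from 50 to 40, which mirrors the direction of the trend described in the abstract: gaining A and losing C increases purine loading and decreases (G+C)% at the same time.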


Keywords: Base composition, (G+C)%, Growth temperature, Purine-loading, Thermophiles



Codon usage tables from GenBank


Chargaff difference for the S bases ("GC skew")


Chargaff difference for the W bases ("AT skew")


Open reading frame


Purine-loading index



We thank James Gerlach, Christopher Madill, Andrew Schramm, and Scott Smith for advice and assistance. Jean Lobry and Daniel Chessel kindly made their paper available prior to publication. Access to the GCG suite of programs was provided by the Canadian Bioinformatics Resource (Halifax). Academic Press, Cold Spring Harbor Laboratory Press, and Elsevier Science gave permissions for the inclusion of full-text versions of some of the cited papers in Forsdyke's web pages, which may be accessed at


  1. Barrette IH, McKenna S, Taylor DR, Forsdyke DR (2001) Introns resolve the conflict between base order-dependent stem-loop potential and the encoding of RNA or protein: further evidence from overlapping genes. Gene 270:181–189
  2. Bell SJ, Forsdyke DR (1999a) Accounting units in DNA. J Theor Biol 197:51–61
  3. Bell SJ, Forsdyke DR (1999b) Deviations from Chargaff's second parity rule correlate with direction of transcription. J Theor Biol 197:63–76
  4. Bernardi G (2000) Isochores and the evolutionary genomics of vertebrates. Gene 241:3–17
  5. Bernardi G, Bernardi G (1986) Compositional constraints and genome evolution. J Mol Evol 24:1–11
  6. Cambillau C, Claverie J-M (2000) Structural and genomic correlates of hyperthermostability. J Biol Chem 275:32383–32386
  7. Cantor CR, Schimmel PR (1980) Statistical mechanics and kinetics of nucleic acid interactions. In: Biophysical chemistry. Freeman, San Francisco, pp 1183–1264
  8. Cristillo AD, Mortimer JR, Barrette IH, Lillicrap TP, Forsdyke DR (2001) Double-stranded RNA as a not-self alarm signal: to evade, most viruses purine-load their RNAs, but some (HTLV-1, EBV) pyrimidine-load. J Theor Biol 208:475–491
  9. Dalgaard JZ, Garrett A (1993) Archaeal hyperthermophile genes. In: Kates M, Kushner DJ, Matheson AT (eds) The biochemistry of Archaea (Archaebacteria). Elsevier, Amsterdam, pp 535–562
  10. D'Onofrio G, Jabbari K, Musto H, Bernardi G (1999) The correlation of protein hydropathy with the base composition of coding sequences. Gene 238:3–14
  11. Eigen M, Schuster P (1978) The hypercycle. A principle of natural self-organization, part C. The realistic hypercycle. Naturwissenschaften 65:341–369
  12. Forsdyke DR (1995) Entropy-driven protein self-aggregation as the basis for self/not-self discrimination in the crowded cytosol. J Biol Sys 3:273–287
  13. Forsdyke DR (1996) Different biological species "broadcast" their DNAs at different (C+G)% "wavelengths". J Theor Biol 178:405–417
  14. Forsdyke DR (1998) An alternative way of thinking about stem-loops in DNA. A case study of the G0S2 gene. J Theor Biol 192:489–504
  15. Forsdyke DR (1999) Two levels of information in DNA. Relationship of Romanes' "intrinsic" variability of the reproductive system, and Bateson's "residue", to the species-dependent component of the base composition, (C+G)%. J Theor Biol 201:47–61
  16. Forsdyke DR (2001a) The origin of species, revisited. McGill-Queen's University Press, Montreal
  17. Forsdyke DR (2001b) Functional constraint and molecular evolution. In: Nature encyclopedia of life sciences, vol 7. Nature Publishing, London, pp 396–403
  18. Forsdyke DR (2002a) Symmetry observations in long nucleotide sequences. Bioinformatics 18:215–217
  19. Forsdyke DR (2002b) Selective pressures that decrease synonymous mutations in Plasmodium falciparum. Trends Parasitol 18:411–418
  20. Forsdyke DR, Mortimer JR (2000) Chargaff's legacy. Gene 261:127–137
  21. Forterre P, Elie C (1993) Chromosome structure, DNA topoisomerases, and DNA polymerases in Archaebacteria (Archaea). In: Kates M, Kushner DJ, Matheson AT (eds) The biochemistry of Archaea (Archaebacteria). Elsevier, Amsterdam, pp 325–345
  22. Fukuchi S, Nishikawa K (2001) Protein surface amino acid compositions distinctively differ between thermophilic and mesophilic bacteria. J Mol Biol 309:835–843
  23. Galtier N, Lobry JR (1997) Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J Mol Evol 44:632–636
  24. Grantham R (1980) Workings of the genetic code. Trends Biochem Sci 5:327–331
  25. Grove A, Lim L (2001) High affinity DNA binding of HU protein from the hyperthermophile Thermotoga maritima. J Mol Biol 311:491–502
  26. Hurst LD, Merchant AR (2001) High guanine-cytosine content is not an adaptation to high temperature: a comparative analysis among prokaryotes. Proc R Soc Lond B 268:493–497
  27. Jaenicke R, Bohm G (1998) The stability of proteins in extreme environments. Curr Opin Struct Biol 8:738–748
  28. Jenkins JM, Pagel M, Gould EA, Zanotto PM de A, Holmes EC (2001) Evolution of base composition and codon usage bias in the genus Flavivirus. J Mol Evol 52:383–390
  29. Knight RD, Freeland SJ, Landweber LF (2001) A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol 2:0010.1–0010.13
  30. Lao PJ, Forsdyke DR (2000) Thermophilic bacteria strictly obey Szybalski's transcription direction rule and politely purine-load RNAs with both adenine and guanine. Genome Res 10:228–236
  31. Lauffer MA (1975) Entropy-driven processes in biology. Springer, Berlin Heidelberg New York
  32. Lobry JR, Chessel D (2003) Internal correspondence analysis of codon and amino-acid usage in thermophilic bacteria. J Appl Genet 44:235–261
  33. Mortimer JR, Forsdyke DR (2003) Comparison of responses by bacteriophage and bacteria to pressures on the base composition of open reading frames. Appl Bioinformatics 2:47–62
  34. Nakamura Y, Gojobori T, Ikemura T (2000) Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nucleic Acids Res 28:292
  35. Pizzi E, Frontali C (2001) Low-complexity regions in Plasmodium falciparum proteins. Genome Res 11:218–229
  36. Ream RA, Johns GC, Somero GN (2003) Base compositions of genes encoding α-actin and lactate dehydrogenase-A from differently adapted vertebrates show no temperature-adaptive variation in G+C content. Mol Biol Evol 20:105–110
  37. Saccone C, Gissi C, Lanave C, Larizza A, Pesole G, Reyes A (2000) Evolution of the mitochondrial genetic system: an overview. Gene 261:153–159
  38. Shepherd JCW (1981) Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc Natl Acad Sci U S A 78:1596–1600
  39. Sicot F-X, Mesnage M, Masselot M, Exposito J-Y, Garrone R, Deutsch J, Gaill F (2000) Molecular adaptation to an extreme environment: origin of the thermal stability of the Pompeii worm collagen. J Mol Biol 302:811–820
  40. Smithies O, Engels WR, Devereux JR, Slightom JL, Chen S-H (1981) Base substitutions, length differences and DNA strand asymmetries in the human Gγ and Aγ fetal globin gene region. Cell 26:345–353
  41. Stetter KO (1999) Extremophiles and their adaptation to hot environments. FEBS Lett 452:22–25
  42. Szybalski W, Kubinski H, Sheldrick O (1966) Pyrimidine clusters on the transcribing strand of DNA and their possible role in the initiation of RNA synthesis. Cold Spring Harb Symp Quant Biol 31:123–127
  43. Xue HY, Forsdyke DR (2003) Low complexity segments in Plasmodium falciparum proteins are primarily nucleic acid level adaptations. Mol Biochem Parasitol 128:21–32

Copyright information

© Springer-Verlag 2003

Authors and Affiliations

  • R. J. Lambros (1)
  • J. R. Mortimer (1)
  • D. R. Forsdyke (1) (Email author)
  1. Department of Biochemistry, Queen's University, Kingston, Canada
