Journal of Molecular Evolution, Volume 62, Issue 3, pp 340–361

The Response of Amino Acid Frequencies to Directional Mutation Pressure in Mitochondrial Genome Sequences Is Related to the Physical Properties of the Amino Acids and to the Structure of the Genetic Code



The frequencies of A, C, G, and T in mitochondrial DNA vary among species due to unequal rates of mutation between the bases. The frequencies of bases at fourfold degenerate sites respond directly to mutation pressure. At first and second positions, selection reduces the degree of frequency variation. Using a simple evolutionary model, we show that first position sites are less constrained by selection than second position sites and, therefore, that the frequencies of bases at first position are more responsive to mutation pressure than those at second position. We define a measure of distance between amino acids that is dependent on eight measured physical properties and a similarity measure that is the inverse of this distance. Columns 1, 2, 3, and 4 of the genetic code correspond to codons with U, C, A, and G in their second position, respectively. The similarity of amino acids in the four columns decreases systematically from column 1 to column 2 to column 3 to column 4. We then show that the responsiveness of first position bases to mutation pressure is dependent on the second position base and follows the same decreasing trend through the four columns. Again, this shows the correlation between physical properties and responsiveness. We determine a proximity measure for each amino acid, which is the average similarity between an amino acid and all others that are accessible via single point mutations in the mitochondrial genetic code structure. We also define a responsiveness for each amino acid, which measures how rapidly an amino acid frequency changes as a result of mutation pressure acting on the base frequencies. We show that there is a strong correlation between responsiveness and proximity, and that both these quantities are also correlated with the mutability of amino acids estimated from the mtREV substitution rate matrix. We also consider the variation of base frequencies between strands and between genes on a strand. These trends are consistent with the patterns expected from analysis of the variation among genomes.
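
The proximity measure described in the abstract can be illustrated with a minimal sketch. Several details are assumptions and are not confirmed by the abstract: the vertebrate mitochondrial code (NCBI translation table 2) stands in for "the mitochondrial genetic code", the distance is taken as Euclidean over a vector of property values, and the average runs unweighted over every codon-level single-base change that leads to a different amino acid. The names `neighbours`, `distance`, `proximity`, and the `props` mapping are hypothetical; the eight measured properties are not reproduced here and would have to be supplied by the reader.

```python
from itertools import product
from math import sqrt

BASES = "TCAG"
# Assumption: NCBI translation table 2 (vertebrate mitochondrial),
# codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' marks a stop codon).
AA_TABLE = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA_TABLE)}


def neighbours(aa):
    """Amino acids reached from `aa` by one base change.

    One list entry per (codon, mutation) pair; synonymous changes and
    changes to stop codons are skipped.
    """
    out = []
    for codon, current in CODE.items():
        if current != aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = CODE[codon[:pos] + base + codon[pos + 1:]]
                if other != aa and other != "*":
                    out.append(other)
    return out


def distance(x, y, props):
    """Euclidean distance between two amino acids in property space (assumed form)."""
    return sqrt(sum((px - py) ** 2 for px, py in zip(props[x], props[y])))


def proximity(aa, props):
    """Mean similarity (1 / distance) between `aa` and its single-mutation neighbours.

    Raises ZeroDivisionError if two neighbouring amino acids have identical
    property vectors.
    """
    sims = [1.0 / distance(aa, n, props) for n in neighbours(aa)]
    return sum(sims) / len(sims)


# Under table 2, tryptophan (TGA, TGG) can mutate to five other amino acids.
print(sorted(set(neighbours("W"))))  # ['C', 'G', 'L', 'R', 'S']
```

To compute a proximity value, `props` should map each one-letter amino acid code to a tuple of property values, ideally rescaled so that no single property dominates the distance.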


Keywords: Mitochondrial genomes; Directional mutation pressure; Genetic code; Amino acid substitutions



This work was supported by the Natural Sciences and Engineering Research Council of Canada and by Canada Research Chairs.



Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  1. Department of Physics and Astronomy, McMaster University, Hamilton, Canada
  2. Division of Genomics and Proteomics, Ontario Cancer Institute, University of Toronto, Toronto, Canada
