Journal of Molecular Evolution, Volume 67, Issue 4, pp 334–342

The Universal Trend of Amino Acid Gain–Loss is Caused by CpG Hypermutability

  • Kazuharu Misawa
  • Naoyuki Kamatani
  • Reiko F. Kikuno


Understanding the cause of the changes in the amino acid composition of proteins is essential for understanding the evolution of protein functions. Since the early 1970s, it has been known that the frequency of some amino acids in protein sequences is increasing and that of others is decreasing. Recently, it was found that the trends of amino acid changes were similar in 15 taxa representing Bacteria, Archaea, and Eukaryota. However, the cause of this similarity in the trend of the gains and losses of amino acids continued to be debated. Here, we show that this trend of the gain and loss of amino acids can be simply explained by CpG hypermutability. We found that the frequency of amino acids coded by codons with TpG dinucleotides and those with CpA dinucleotides is increasing, while that of amino acids coded by codons with CpG dinucleotides is decreasing. We also found that organisms that lack DNA methyltransferase show different trends of the gain and loss of amino acids. DNA methyltransferase methylates CpG dinucleotides and induces CpG hypermutability. The incorporation of CpG hypermutability into models of protein evolution will improve studies on protein evolution in different organisms.
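The codon classes named in the abstract can be enumerated directly from the standard genetic code. The following Python sketch is an illustration only, not the authors' analysis pipeline: the function names `amino_acids_with` and `cpg_decay` are hypothetical, only dinucleotides lying within a single codon are considered (CpG sites spanning two adjacent codons are ignored), and the standard genetic code is assumed. It relies on the well-known fact that deamination of a methylated cytosine converts CpG to TpG, or to CpA when the change occurs on the complementary strand.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO
    )
}


def amino_acids_with(dinucleotide):
    """Amino acids with at least one codon containing the dinucleotide.

    Stop codons ('*') are excluded. Only within-codon dinucleotides count.
    """
    return sorted(
        {aa for codon, aa in CODON_TABLE.items()
         if dinucleotide in codon and aa != "*"}
    )


def cpg_decay(codon):
    """Products of a CpG -> TpG or CpG -> CpA change within one codon.

    Returns a dict mapping each mutated codon to its amino acid
    ('*' marks a stop codon). Returns an empty dict if the codon has no CpG.
    """
    if "CG" not in codon:
        return {}
    products = (codon.replace("CG", "TG"), codon.replace("CG", "CA"))
    return {mutant: CODON_TABLE[mutant] for mutant in products}


if __name__ == "__main__":
    for dinuc in ("CG", "TG", "CA"):
        print(dinuc, amino_acids_with(dinuc))
    # Each CpG-containing codon and what it becomes after CpG decay.
    for codon, aa in CODON_TABLE.items():
        if "CG" in codon:
            print(codon, aa, "->", cpg_decay(codon))
```

Under these assumptions, the amino acids with CpG-containing codons are Ala, Pro, Arg, Ser, and Thr, while those with TpG-containing codons are Cys, Leu, Met, Val, and Trp. For example, the arginine codon CGG becomes TGG (Trp) or CAG (Gln) after CpG decay, which matches the direction of gain and loss described in the abstract.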


Keywords: Gain and loss of amino acids; Rates of molecular evolution; CpG hypermutability

Supplementary material

239_2008_9141_MOESM1_ESM.doc (38 kb)
239_2008_9141_MOESM2_ESM.xls (28 kb)
239_2008_9141_MOESM3_ESM.pdf (32 kb)
239_2008_9141_MOESM4_ESM.pdf (32 kb)
239_2008_9141_MOESM5_ESM.pdf (33 kb)
239_2008_9141_MOESM6_ESM.pdf (35 kb)
239_2008_9141_MOESM7_ESM.pdf (36 kb)



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Kazuharu Misawa (1, 2)
  • Naoyuki Kamatani (3, 4)
  • Reiko F. Kikuno (5)

  1. Chiba Industry Advancement Center, Chiba, Japan
  2. Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, Riken, Tokyo, Japan
  3. Division of Genomic Medicine, Department of Advanced Biomedical Engineering and Science, Tokyo Women’s Medical University, Tokyo, Japan
  4. Institute of Rheumatology, Tokyo Women’s Medical University, Tokyo, Japan
  5. Kazusa DNA Research Institute, Chiba, Japan
