Protein Substitution Model and Evolutionary Distance

  • Xuhua Xia


In addition to nucleotide-based substitution models, there are also models based on amino acid and codon sequences. Observed substitutions between two sense codons depend on codon frequencies, the difference between the two encoded amino acids, and the number of nucleotide sites at which the two codons differ (1, 2, or all 3). Similarly, observed substitutions between two amino acids depend on amino acid frequencies and amino acid dissimilarities. This chapter focuses on amino acid substitution models whose parameters are derived from empirical substitution matrices. How is an empirical substitution matrix compiled? How are transition probability and rate matrices derived from an empirical matrix? How are evolutionary distances derived from these matrices? Under what circumstances might one fail to obtain an evolutionary distance? These questions are addressed in detail with numerical illustrations.
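The last question, when a distance estimate fails, can be illustrated with a minimal sketch. It does not use the empirical-matrix approach this chapter develops; as a simpler stand-in it assumes the equal-frequency (Poisson-type) correction for 20 amino acid states, where the distance is d = -(19/20) ln(1 - (20/19) p) and p is the observed proportion of differing sites. The function name `poisson_aa_distance` is hypothetical and chosen only for this illustration.

```python
import math


def poisson_aa_distance(p, n_states=20):
    """Distance under an equal-frequency model (an assumed stand-in,
    not the empirical-matrix method): d = -b * ln(1 - p / b),
    with b = (n_states - 1) / n_states.

    Returns None when the logarithm's argument is not positive,
    i.e. when the observed difference p reaches or exceeds b.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    b = (n_states - 1) / n_states  # 19/20 for amino acids
    arg = 1.0 - p / b
    if arg <= 0.0:
        return None  # sequences too divergent: no finite estimate
    return -b * math.log(arg)


if __name__ == "__main__":
    for p in (0.10, 0.50, 0.90, 0.95, 0.97):
        print(p, poisson_aa_distance(p))
```

Under this assumed model the estimate grows without bound as p approaches 19/20, and no distance can be computed once p reaches that value, which is one simple circumstance in which a distance estimate fails.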



Copyright information

© Springer Science+Business Media LLC 2018

Authors and Affiliations

  • Xuhua Xia, University of Ottawa, CAREG and Biology Department, Ottawa, Canada
