Journal of Molecular Evolution, Volume 45, Issue 4, pp 359–369

Estimation of evolutionary distances from protein spatial structures

  • Nick V. Grishin


New equations are derived to estimate the number of amino acid substitutions per site between two homologous proteins from the root mean square (RMS) deviation between their spatial structures and from the fraction of identical residues between their sequences. The equations are based on evolutionary models that analyze predominantly structural changes rather than sequence changes. Evolution of spatial structure is treated as diffusion in an elastic force field: diffusion accounts for structural changes caused by amino acid substitutions, and the elastic force reflects selection, which preserves the protein fold. The obtained equations are supported by analysis of protein spatial structures.
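The picture of diffusion in an elastic force field can be illustrated with a one-dimensional Ornstein-Uhlenbeck process, in which random steps (substitutions) push a coordinate away from its starting point while a restoring force (selection) pulls it back, so the mean square deviation saturates instead of growing without bound. The sketch below is only an illustration under that assumption and is not the set of equations derived in the paper; the names `diffusion` and `stiffness` and the function names are hypothetical. For comparison it also includes the classical Poisson correction, d = -ln(q), a standard textbook estimate of substitutions per site from the fraction of identical residues q, again not the paper's own formula.

```python
import math


def poisson_distance(identity_fraction):
    """Classical Poisson-corrected distance d = -ln(q).

    q is the fraction of identical residues between two aligned sequences.
    This is a textbook baseline, not the estimator derived in the paper.
    """
    if not 0.0 < identity_fraction <= 1.0:
        raise ValueError("identity fraction must be in (0, 1]")
    return -math.log(identity_fraction)


def ou_mean_square_deviation(t, diffusion, stiffness):
    """Variance at time t of dx = -stiffness * x dt + sqrt(2 * diffusion) dW, x(0) = 0.

    Hypothetical parameters: `diffusion` stands for structural change per
    substitution, `stiffness` for the strength of fold-preserving selection.
    """
    plateau = diffusion / stiffness
    return plateau * (1.0 - math.exp(-2.0 * stiffness * t))


def ou_time_from_msd(msd, diffusion, stiffness):
    """Invert the relation above: recover t from an observed mean square deviation."""
    plateau = diffusion / stiffness
    if not 0.0 <= msd < plateau:
        raise ValueError("mean square deviation must be in [0, plateau)")
    return -math.log(1.0 - msd / plateau) / (2.0 * stiffness)


if __name__ == "__main__":
    # Early on the deviation grows almost linearly (free diffusion);
    # later it levels off at diffusion / stiffness.
    for t in (0.1, 1.0, 10.0):
        print(t, ou_mean_square_deviation(t, diffusion=2.0, stiffness=0.5))
    print(poisson_distance(0.5))
```

In this toy model the inversion becomes ill-conditioned as the deviation approaches its plateau, which is one intuitive reason a structure-based distance would need a carefully derived functional form rather than a naive linear relation.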

Key words

Protein structure; RMS deviation; Molecular evolution; Evolutionary distance; Substitution rates





Copyright information

© Springer-Verlag 1997

Authors and Affiliations

  • Nick V. Grishin, Department of Pharmacology, The University of Texas Southwestern Medical Center at Dallas, Dallas, USA
