Alignment statistic for identifying related protein sequences

Journal of Molecular Evolution

Summary

Closely related proteins show an obvious kinship by having numerous matching amino acids in their aligned sequences. Kinship between anciently separated proteins requires a statistical evaluation to rule out fortuitous similarities. A simple statistic is developed which assumes equal probability for all codon pairs, and a table of critical values for amino acid sequence alignments of length 200 or less is presented. Applying this statistic to V and C regions of immunoglobulin chains, aligned on the basis of shared features of three-dimensional structure, provides evidence that the V and C sequences descended from a common ancestor. Similarly, the distant evolutionary relationship of dehydrogenases, flavodoxin, and subtilisin, suggested by structural alignments, is verified. On the other hand, the statistic does not verify a common evolutionary origin for the heme binding pocket in globins and cytochrome b5. Empirical evidence from the distribution of MMD values of amino acid pairs in comparisons of misaligned polypeptide chains and from Monte Carlo trials of sequences aligned with arbitrary gaps supports the validity of the statistic.
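
The null model named in the summary (equal probability for all codon pairs) can be illustrated with a minimal Python sketch. It is not the authors' statistic, and it does not reproduce their table of critical values. It rests on several assumptions: that MMD stands for minimum mutation distance (the fewest nucleotide replacements needed to convert a codon for one amino acid into a codon for the other), that the standard genetic code applies, that stop codons are excluded, and that aligned positions are independent. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
# Assumption: only the 61 sense codons take part in the null model.
SENSE = [codon for codon, aa in CODON_TO_AA.items() if aa != "*"]


def codon_distance(c1, c2):
    """Number of nucleotide positions at which two codons differ (0 to 3)."""
    return sum(a != b for a, b in zip(c1, c2))


def mmd(aa1, aa2):
    """Minimum codon distance over all codon choices for two amino acids."""
    return min(
        codon_distance(c1, c2)
        for c1 in SENSE if CODON_TO_AA[c1] == aa1
        for c2 in SENSE if CODON_TO_AA[c2] == aa2
    )


def null_distribution():
    """Exact distribution of MMD when every sense codon pair is equally likely."""
    counts = Counter(
        mmd(CODON_TO_AA[c1], CODON_TO_AA[c2]) for c1 in SENSE for c2 in SENSE
    )
    total = sum(counts.values())
    return {d: Fraction(n, total) for d, n in sorted(counts.items())}


def alignment_mmd(seq1, seq2, gap="-"):
    """Total MMD over aligned positions, skipping any position with a gap."""
    return sum(mmd(a, b) for a, b in zip(seq1, seq2) if gap not in (a, b))


def lower_tail_probability(observed, n_positions):
    """P(total MMD <= observed) for n independent positions under the null."""
    null = null_distribution()
    dist = {0: Fraction(1)}
    for _ in range(n_positions):
        nxt = defaultdict(Fraction)
        for total, p in dist.items():
            for d, q in null.items():
                nxt[total + d] += p * q
        dist = nxt
    return sum(p for total, p in dist.items() if total <= observed)
```

Under these assumptions, a call such as `lower_tail_probability(alignment_mmd(a, b), n)` (with `n` the number of ungapped aligned positions) gives the chance that unrelated sequences would score at least as well. A small value suggests the observed similarity is unlikely to be fortuitous.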




Cite this article

Moore, G.W., Goodman, M. Alignment statistic for identifying related protein sequences. J Mol Evol 9, 121–130 (1977). https://doi.org/10.1007/BF01732744


  • DOI: https://doi.org/10.1007/BF01732744
