Journal of Protein Chemistry, Volume 4, Issue 1, pp 23–55

Statistical analysis of the physical properties of the 20 naturally occurring amino acids

  • Akinori Kidera
  • Yasuo Konishi
  • Masahito Oka
  • Tatsuo Ooi
  • Harold A. Scheraga


To describe the conformational and other physical properties of the 20 naturally occurring amino acid residues with a minimum number of parameters, several multivariate statistical analyses were applied to 188 of their physical properties. Ten orthogonal properties (factors) were obtained for the 20 amino acids without losing the information contained in the original physical properties. The analysis consisted of three main steps. First, 72 of the physical properties were eliminated from further consideration because they failed statistical tests for a normal distribution. Second, the remaining 116 physical properties of the amino acids were classified by a cluster analysis to eliminate duplication among highly correlated physical properties. This led to nine clusters, each characterized by an average characteristic property: bulk, two hydrophobicity indices for free amino acids, one hydrophobicity index for amino acid residues in a protein, two types of β-structure preference, α-helix preference, and two types of bend-structure preference. The physical properties within a given cluster were highly correlated with each other, but the correlation between clusters was low. Third, a factor analysis was applied to the nine average classified properties and 16 additional physical properties to obtain a small number of orthogonal properties (ten factors). Four of these factors arise from the nine characteristic properties, and the remaining six were obtained from the 16 physical properties not included in the nine characteristic properties. Finally, most of the 188 physical properties could be expressed as a weighted sum of these ten orthogonal factors.
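The second step, grouping highly correlated properties so that each group can be replaced by one representative, can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or correlation cutoff was used, so the single-linkage grouping, the `threshold` of 0.9, the use of the absolute correlation, and the toy property values below are all assumptions made purely for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster_by_correlation(props, threshold=0.9):
    """Group properties whose |r| >= threshold (single linkage, via union-find).

    `props` maps a property name to its values over the same set of residues.
    The threshold is a hypothetical choice, not a value taken from the paper.
    """
    names = list(props)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(props[a], props[b])) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())

# Made-up values for five residues; not experimental data.
toy = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mass": [1.1, 2.0, 2.9, 4.2, 5.0],
    "hydropathy": [2.0, -1.0, 3.0, -2.0, 0.5],
}
print(cluster_by_correlation(toy))
```

In this toy case the two size-like properties fall into one cluster and the hydropathy-like property stays alone, mirroring the idea that properties within a cluster correlate strongly while correlation between clusters is low.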
Since these factors contain information relating to almost all properties of all 20 amino acids, it is possible to estimate the numerical value of a property for one or two amino acids for which experimental data for this property are not available. For example, the estimated values for the Zimm-Bragg parameters at 20°C are 0.66 and 0.92 for proline and cysteine, respectively, computed from the first four factors.
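One way to picture this estimation is as a least-squares fit: express the property as a weighted sum of factor scores over the amino acids where it has been measured, then apply the fitted weights to the factor scores of the amino acid that lacks data. The sketch below assumes an ordinary least-squares fit with an intercept, which the abstract does not confirm; the function names, the two toy factors, and all numbers are hypothetical and are not the paper's factor scores.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_weights(factor_scores, values):
    """Least-squares weights (intercept first) via the normal equations."""
    rows = [[1.0] + list(f) for f in factor_scores]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve(ata, atb)

def predict(weights, scores):
    """Weighted sum of factor scores plus the intercept."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], scores))

# Toy factor scores for residues with a measured property (made-up numbers).
known_scores = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
known_values = [1.0, 3.0, 0.0, 2.0, 6.0]  # constructed as 1 + 2*f1 - f2

weights = fit_weights(known_scores, known_values)
estimate = predict(weights, (0.5, 0.5))  # residue with no measurement
print(weights, estimate)
```

The abstract's own example uses the first four factors; the same functions would apply with four scores per residue, provided more residues have measured values than there are weights to fit.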

Key words

amino acid physical properties; characteristic properties; bulk; hydrophobicity; β-structure preference; α-helix preference; bend-structure preference; statistical analysis; cluster analysis; factor analysis





Copyright information

© Plenum Publishing Corporation 1985

Authors and Affiliations

  • Akinori Kidera (1)
  • Yasuo Konishi (1)
  • Masahito Oka (1)
  • Tatsuo Ooi (2)
  • Harold A. Scheraga (1)

  1. Baker Laboratory of Chemistry, Cornell University, Ithaca
  2. Institute for Chemical Research, Kyoto University, Uji, Japan
