Relation between sequence similarity and structural similarity in proteins. Role of important properties of amino acids

Kidera, Akinori; Konishi, Yasuo; Ooi, Tatsuo; Scheraga, Harold A.

doi:10.1007/BF01025494

Published: 06 November 1985

Volume 4, pages 265–297, (1985)
Journal of Protein Chemistry

Akinori Kidera¹, Yasuo Konishi¹, Tatsuo Ooi² & Harold A. Scheraga¹

Abstract

In a previous paper we obtained ten (orthogonal) factors, linear combinations of which can express the properties of the 20 naturally occurring amino acids. In this paper, we assume that the most important properties (linear combinations of these ten factors) that determine the three-dimensional structure of a protein are conserved properties, i.e., are those that have been conserved during evolution. Two definitions of a conserved property are presented: (1) a conserved property for an average protein is defined as that linear combination of the ten factors that optimally expresses the similarity of one amino acid to another (hence, little change during evolution), as given by the relatedness odds matrix of Dayhoff et al.; (2) a conserved property for each position in the amino acid sequence (locus) of a specific family of homologous proteins (the cytochrome c family or the globin family) is defined as that linear combination of the ten factors that is common among a set of amino acids at a given locus when the sequences are properly aligned. When the specificity at each locus is averaged over all loci, the same features are observed for three expressions of these two definitions, namely the conserved property for an average protein, the average conserved property for the cytochrome c family, and the average conserved property for the globin family; we find that bulk and hydrophobicity (information about packing and long-range interactions) are more important than other properties, such as the preference for adopting a specific backbone structure (information about short-range interactions). We also demonstrate that the sequence profile of a conserved property, defined for each locus of a protein family (definition 2), corresponds uniquely to the three-dimensional structure, while the conserved property for an average protein (definition 1) is not useful for the prediction of protein structure.
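As a rough illustration of definition 2, the sketch below inspects one column of an alignment and asks which single factor varies least among the residues found there. It is a simplification under stated assumptions, not the authors' procedure: the abstract defines a conserved property as a linear combination of the ten factors, whereas this sketch only looks at individual factors, and the `FACTORS` table holds synthetic placeholder numbers rather than the published factor values. The names `locus_profile` and `most_conserved_factor` are hypothetical.

```python
import random
from statistics import mean, pvariance

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_FACTORS = 10

# Placeholder table: synthetic values for illustration only,
# NOT the ten factors published in the earlier paper.
_rng = random.Random(0)
FACTORS = {aa: [_rng.gauss(0.0, 1.0) for _ in range(N_FACTORS)]
           for aa in AMINO_ACIDS}


def locus_profile(column, factors=FACTORS):
    """Return (means, variances) of each factor over one alignment column.

    `column` is a string holding the residue found at this locus in each
    aligned sequence; characters missing from `factors` (e.g. gaps) are skipped.
    """
    vectors = [factors[aa] for aa in column if aa in factors]
    if not vectors:
        raise ValueError("column contains no residues")
    n = len(vectors[0])
    means = [mean(v[k] for v in vectors) for k in range(n)]
    variances = [pvariance([v[k] for v in vectors]) for k in range(n)]
    return means, variances


def most_conserved_factor(column, factors=FACTORS):
    """Index of the factor with the smallest variance at this locus."""
    _, variances = locus_profile(column, factors)
    return min(range(len(variances)), key=variances.__getitem__)
```

A low variance along some factor means the residues accepted at that locus agree on that property, which is the intuition behind a locus-specific conserved property. Finding the best linear combination, as the abstract describes, would need an optimization over directions in the ten-dimensional factor space rather than this per-axis check.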
The amino acid sequences of numerous proteins are searched to find those that are similar, in terms of the conserved properties (definition 2), to sequences of the same size from one of the homologous families (cytochrome c and globin, respectively) for whose loci the conserved properties were defined. Many similar sequences are found, the number of similarities decreasing with increasing size of the segment. However, the segments must be rather long (≥15 residues) before the comparisons become meaningful. As an example, one sufficiently large sequence (20 residues) from a protein of known structure (apo-liver alcohol dehydrogenase, which is not a member of either family) is found to be similar in the conserved properties to a particular sequence of a member of the family of human hemoglobin α chains, and the two sequences have similar structures. This means that, since conserved properties are expected to be structure determinants, we can use the conserved properties to predict an initial protein structure for subsequent energy minimization for a protein for which the conserved properties are similar to those of a family of proteins with a sufficiently large number of homologous amino acid sequences; such a large number of homologous sequences is required to define a conserved property for each locus of the homologous protein family.
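The segment search described above can be pictured as a sliding-window scan. The following is a minimal sketch, assuming a plain mean squared distance in factor space as the similarity measure; the abstract does not state the measure actually used, so that choice, along with the names `window_distance`, `scan`, and `best_match`, is hypothetical. Only the 15-residue minimum is taken from the text.

```python
def window_distance(segment, profile_means, factors):
    """Mean squared distance between a segment and a family profile.

    `profile_means[i]` is the target property vector at locus i of the
    family segment; `factors[aa]` is the property vector of residue `aa`.
    A residue missing from `factors` raises KeyError.
    """
    total = 0.0
    for aa, target in zip(segment, profile_means):
        total += sum((x - t) ** 2 for x, t in zip(factors[aa], target))
    return total / len(segment)


def scan(sequence, profile_means, factors, min_length=15):
    """Yield (start, distance) for every window as wide as the profile."""
    width = len(profile_means)
    if width < min_length:
        # The abstract reports that shorter segments are not meaningful.
        raise ValueError("profile is shorter than min_length residues")
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        yield start, window_distance(window, profile_means, factors)


def best_match(sequence, profile_means, factors, min_length=15):
    """Return the (start, distance) pair with the smallest distance,
    or None when the sequence is shorter than the profile."""
    hits = scan(sequence, profile_means, factors, min_length)
    return min(hits, key=lambda hit: hit[1], default=None)
```

In use, `profile_means` would come from a well-populated family alignment and `sequence` from an unrelated protein; a small distance at some offset would flag a candidate segment whose structure could then be compared with the family's. What counts as "small" is left open here, since the abstract gives no threshold.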

References

Baba, M. L., Darga, L. L., Goodman, M., and Czelusniak, J. (1981). J. Mol. Evol. 17, 197–213.
Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, Jr., E. F., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. (1977). J. Mol. Biol. 112, 535–542.
Dayhoff, M. O., Schwartz, R. M., and Orcutt, B. C. (1978). In Atlas of Protein Sequence and Structure (Dayhoff, M. O., ed.), National Biomedical Research Foundation, Washington, D.C., Vol. 5, Suppl. 3, pp. 345–352.
Dickerson, R. E. (1980). In The Evolution of Protein Structure and Function (Sigman, D. S., and Brazier, M. A. B., eds.), Academic Press, New York, pp. 173–202.
Eisenberg, D., Weiss, R. M., and Terwilliger, T. C. (1982). Nature 299, 371–374.
Epstein, C. J. (1967). Nature 215, 355–359.
French, S., and Robson, B. (1983). J. Mol. Evol. 19, 171–175.
Gay, D. M. (1983). ACM Trans. Math. Software 9, 503–524.
Gō, M., and Miyazawa, S. (1980). Int. J. Peptide Protein Res. 15, 211–224.
Goodman, M. (1981). Prog. Biophys. Mol. Biol. 38, 105–164.
Grantham, R. (1974). Science 185, 862–864.
IMSL (1982). IMSL Library Reference Manual, 9th ed., IMSL, Houston.
Kabsch, W., and Sander, C. (1983). Biopolymers 22, 2577–2637.
Kidera, A., Konishi, Y., Oka, M., Ooi, T., and Scheraga, H. A. (1985). J. Protein Chem. 4, 23–55.
Lawn, R. M., Adelman, J., Dull, T. J., Gross, M., Goeddel, D., and Ullrich, A. (1981). Science 212, 1159–1162.
Lesk, A. M., and Chothia, C. (1980). J. Mol. Biol. 136, 225–270.
Morrison, D. F. (1976). Multivariate Statistical Methods, McGraw-Hill, New York.
Némethy, G., Pottle, M. S., and Scheraga, H. A. (1983). J. Phys. Chem. 87, 1883–1887.
Ohno, S., and Taniguchi, T. (1981). Proc. Natl. Acad. Sci. USA 78, 5305–5309.
Orcutt, B. C., and Dayhoff, M. O. (1982). Protein Sequence Database, National Biomedical Research Foundation, Washington, D.C.
Perutz, M. F., Kendrew, J. C., and Watson, H. C. (1965). J. Mol. Biol. 13, 669–678.
Pestka, S. (1983). Arch. Biochem. Biophys. 221, 1–37.
Ptitsyn, O. B. (1974). J. Mol. Biol. 88, 287–300.
Richardson, J. S. (1981). Adv. Protein Chem. 34, 167–339.
Rose, G. D., and Roy, S. (1980). Proc. Natl. Acad. Sci. USA 77, 4643–4647.
Schulz, G. E., and Schirmer, R. H. (1979). Principles of Protein Structure, Springer, New York.
Shrake, A., and Rupley, J. A. (1973). J. Mol. Biol. 79, 351–371.
Sippl, M. J. (1982). J. Mol. Biol. 156, 359–388.
Thompson, E. O. P. (1980). In The Evolution of Protein Structure and Function (Sigman, D. S., and Brazier, M. A. B., eds.), Academic Press, New York, pp. 267–298.
Vogel, H., and Zuckerkandl, E. (1972). In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Le Cam, L. M., Neyman, J., and Scott, E., eds.), University of California Press, Los Angeles, pp. 155–176.
Zuckerkandl, E., and Pauling, L. (1965). In Evolving Genes and Proteins (Bryson, V., and Vogel, H. J., eds.), Academic Press, New York, pp. 97–166.

Author information

Authors and Affiliations

Baker Laboratory of Chemistry, Cornell University, Ithaca, New York
Akinori Kidera, Yasuo Konishi & Harold A. Scheraga
Institute for Chemical Research, Kyoto University, Uji, Japan
Tatsuo Ooi

About this article

Cite this article

Kidera, A., Konishi, Y., Ooi, T. et al. Relation between sequence similarity and structural similarity in proteins. Role of important properties of amino acids. J Protein Chem 4, 265–297 (1985). https://doi.org/10.1007/BF01025494

Received: 06 November 1985
Published: 06 November 1985
Issue Date: October 1985
DOI: https://doi.org/10.1007/BF01025494
