Abstract
The location of structural domains in proteins is predicted from the amino acid sequence, based on the analysis of a computed contact map for the protein, the average distance map (ADM). Interactions between residues i and j in a protein are subdivided into several ranges, according to the separation |i-j| in the amino acid sequence. Within each range, average spatial distances between every pair of amino acid residues are computed from a data base of known protein structures. Infrequently occurring pairs are omitted as being statistically insignificant. The average distances are used to construct a predicted ADM. The ADM is analyzed for the occurrence of regions with high densities of contacts (compact regions). Locations of rapid changes of density between various parts of the map are determined by the use of scanning plots of contact densities. These locations serve to pinpoint the distribution of compact regions. This distribution, in turn, is used to predict boundaries of domains in the protein. The technique provides an objective method for the location of domains both on a contact map derived from a known three-dimensional protein structure, the real distance map (RDM), and on an ADM. While most other published methods for the identification of domains locate them in the known three-dimensional structure of a protein, the technique presented here also permits the prediction of domains in proteins of unknown spatial structure, as the construction of the ADM for a given protein requires knowledge of only its amino acid sequence.
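The pipeline described in the abstract (range-wise average distances, a predicted contact map, and a density scan) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the range boundaries in `RANGES`, the use of one 3D point per residue (for example a Cα position), the per-range distance `cutoffs` that turn average distances into contacts, the `min_count` threshold for dropping rare pairs, and the function names themselves. The scan shown here is a simplified stand-in that measures contact density across each possible split point; the scanning plots used in the paper may be defined differently.

```python
from collections import defaultdict
from math import dist

# Hypothetical sequence-separation ranges: inclusive bounds on |i - j|.
RANGES = [(1, 8), (9, 20), (21, 10**9)]


def range_index(sep, ranges=RANGES):
    """Return the index of the range that contains the separation |i - j|."""
    for idx, (lo, hi) in enumerate(ranges):
        if lo <= sep <= hi:
            return idx
    raise ValueError(f"separation {sep} is not covered by any range")


def pair_key(sep, a, b, ranges=RANGES):
    """Key for a residue pair: (range index, unordered residue-type pair)."""
    return (range_index(sep, ranges), tuple(sorted((a, b))))


def average_distances(structures, min_count=2, ranges=RANGES):
    """Average spatial distance per (range, residue-type pair).

    `structures` is a list of (sequence, coords) with one 3D point per
    residue. Pairs observed fewer than `min_count` times are omitted.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for seq, coords in structures:
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                key = pair_key(j - i, seq[i], seq[j], ranges)
                total[key] += dist(coords[i], coords[j])
                count[key] += 1
    return {k: total[k] / count[k] for k in total if count[k] >= min_count}


def predict_adm(seq, avg, cutoffs, ranges=RANGES):
    """Predicted contact map built from the sequence alone.

    Residues i and j are marked as a contact when the tabulated average
    distance for their pair type and range is at most `cutoffs[range]`
    (an assumed rule). Pairs missing from `avg` are left as non-contacts.
    """
    n = len(seq)
    adm = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            key = pair_key(j - i, seq[i], seq[j], ranges)
            d = avg.get(key)
            if d is not None and d <= cutoffs[key[0]]:
                adm[i][j] = adm[j][i] = True
    return adm


def scan_cross_density(cmap):
    """Contact density between residues [0, k) and [k, n) for k = 1..n-1.

    A pronounced minimum marks a split with few contacts across it, which
    this sketch treats as a candidate domain boundary.
    """
    n = len(cmap)
    densities = []
    for k in range(1, n):
        contacts = sum(cmap[i][j] for i in range(k) for j in range(k, n))
        densities.append(contacts / (k * (n - k)))
    return densities
```

The same `scan_cross_density` function accepts either a map predicted by `predict_adm` or a contact map computed from known coordinates, mirroring the abstract's point that the analysis applies to both an ADM and an RDM.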
Kikuchi, T., Némethy, G. & Scheraga, H.A. Prediction of the location of structural domains in globular proteins. J Protein Chem 7, 427–471 (1988). https://doi.org/10.1007/BF01024890
DOI: https://doi.org/10.1007/BF01024890