Abstract
The specificity of GalNAc-transferase is consistent with the existence of an extended site composed of nine subsites, denoted by R4, R3, R2, R1, R0, R1′, R2′, R3′, and R4′, where the acceptor at R0 is either Ser or Thr, to which the reducing monosaccharide is anchored. To predict whether a peptide will react with the enzyme to form a Ser- or Thr-conjugated glycopeptide, a neural network method, Kohonen's self-organization model, is proposed in this paper. A total of 305 oligopeptides are chosen for the training set, with another 30 oligopeptides for the test set. Because of its high rate of correct prediction (26/30 = 86.7%) and stronger fault tolerance, it is expected that the neural network method can be used as a technique for predicting O-glycosylation and designing effective inhibitors of GalNAc-transferase. It might also be useful for targeting drugs to specific sites in the body and for enzyme replacement therapy for the treatment of genetic disorders.
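The abstract does not give the network's input encoding, map size, or training schedule, so the following is only a minimal sketch of how a Kohonen self-organizing map could be applied to nine-residue peptides. The one-hot encoding, the 3×3 grid, the learning-rate and radius decay, the majority-vote labelling of map nodes, and every peptide string below are assumptions introduced for illustration; none of them are taken from the paper or its data set.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 9  # subsites R4, R3, R2, R1, R0, R1', R2', R3', R4'


def encode(peptide):
    """One-hot encode a 9-residue peptide as a 180-dim vector (assumed encoding)."""
    if len(peptide) != WINDOW:
        raise ValueError("peptide must have exactly 9 residues")
    if peptide[4] not in "ST":
        raise ValueError("R0 (the centre residue) must be Ser or Thr")
    vec = [0.0] * (WINDOW * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class KohonenMap:
    """Small 2-D self-organizing map with a Gaussian neighbourhood."""

    def __init__(self, rows, cols, dim, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols = rows, cols
        self.weights = {(r, c): [rng.random() for _ in range(dim)]
                        for r in range(rows) for c in range(cols)}
        self.labels = {}

    def bmu(self, vec):
        """Return the grid position of the best-matching unit."""
        return min(self.weights, key=lambda n: sq_dist(self.weights[n], vec))

    def train(self, vectors, epochs=20, lr0=0.5):
        radius0 = max(self.rows, self.cols) / 2
        for epoch in range(epochs):
            frac = epoch / epochs
            lr = lr0 * (1 - frac)                    # linearly decaying rate
            radius = max(radius0 * (1 - frac), 0.5)  # shrinking neighbourhood
            for vec in vectors:
                br, bc = self.bmu(vec)
                for (r, c), w in self.weights.items():
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2 * radius ** 2))
                    for i in range(len(w)):
                        w[i] += lr * h * (vec[i] - w[i])

    def calibrate(self, vectors, labels):
        """Label each winning node by majority vote of the samples it wins."""
        votes = {}
        for vec, lab in zip(vectors, labels):
            votes.setdefault(self.bmu(vec), []).append(lab)
        self.labels = {n: max(set(v), key=v.count) for n, v in votes.items()}

    def predict(self, vec):
        """Return the label of the nearest labelled node."""
        node = min(self.labels, key=lambda n: sq_dist(self.weights[n], vec))
        return self.labels[node]


# Made-up placeholder peptides (NOT from the paper's data set).
# Label 1 = treated as glycosylated, 0 = treated as not glycosylated.
toy_peptides = ["PAPTTPAPA", "TPAPSPTPA", "APTPTPAPT",
                "LKDESLKEL", "GKLDSEKLG", "DELKTLEKD"]
toy_labels = [1, 1, 1, 0, 0, 0]

train_vectors = [encode(p) for p in toy_peptides]
som = KohonenMap(rows=3, cols=3, dim=len(train_vectors[0]))
som.train(train_vectors)
som.calibrate(train_vectors, toy_labels)

query = "PAPTSPAPA"  # hypothetical query peptide
print(query, "->", som.predict(encode(query)))

# Reported test-set rate from the abstract: 26 correct out of 30.
print(f"{100 * 26 / 30:.1f}%")
```

With the real training set, the 305 oligopeptides would take the place of the six toy strings, and the 30 test peptides would be passed through `predict` to obtain a rate comparable to the reported 26/30.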
Cite this article
Cai, YD., Yu, H. & Chou, KC. Artificial Neural Network Method for Predicting the Specificity of GalNAc-transferase. J Protein Chem 16, 689–700 (1997). https://doi.org/10.1023/A:1026306520790