Abstract
Methods for predicting protein post-translational modifications have been developed extensively. In this chapter, we review the major post-translational modification prediction strategies, with a particular focus on statistical and machine learning approaches. We present the workflow of these methods and summarize their advantages and disadvantages.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Liu, C., Li, H. (2011). In Silico Prediction of Post-translational Modifications. In: Yu, B., Hinchcliffe, M. (eds) In Silico Tools for Gene Discovery. Methods in Molecular Biology, vol 760. Humana Press. https://doi.org/10.1007/978-1-61779-176-5_20
DOI: https://doi.org/10.1007/978-1-61779-176-5_20
Publisher Name: Humana Press
Print ISBN: 978-1-61779-175-8
Online ISBN: 978-1-61779-176-5
eBook Packages: Springer Protocols