Abstract
We present the 2007 update of the AutoMotif Server (AMS 2.0), which predicts post-translational modification sites in protein sequences. The support vector machine (SVM) algorithm was trained on data gathered in 2007 from various sets of proteins with experimentally verified chemical modifications. Short sequence segments around each modification site were extracted from the parent protein and represented in the training set as binary or profile vectors. The efficiency of the SVM classification for each type of modification, and the predictive power of both representations, were estimated using leave-one-out tests for a model of general phosphorylation and for modifications catalyzed by several specific protein kinases. Accuracy improved in comparison with the previous version of the service (Plewczynski et al., “AutoMotif server: prediction of single residue post-translational modifications in proteins”, Bioinformatics 21: 2525–7, 2005). The precision of the updated version exceeded 90% for selected types of phosphorylation; it was optimized at the cost of a lower recall of the classification model. The AutoMotif Server version 2007 is freely available at http://ams2.bioinfo.pl/. In addition, the reference dataset for optimizing the prediction of phosphorylation sites, collected from UniProtKB, is provided at http://ams2.bioinfo.pl/data/.
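The binary representation mentioned above can be illustrated with a minimal sketch. It is not the AMS 2.0 implementation: the flank size of four residues, the `X` padding symbol, the example sequence, and the helper names `extract_window` and `binary_encode` are all assumptions made for illustration. The sketch cuts a fixed-length segment centred on a candidate site and encodes each position as 20 bits, one per standard amino acid.

```python
# Illustrative sketch only; window size, padding and names are assumptions,
# not details taken from the AutoMotif Server.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def extract_window(sequence, site, flank=4, pad="X"):
    """Return the segment of length 2*flank+1 centred on 0-based index `site`.

    Positions that fall outside the protein are filled with `pad`.
    """
    if not 0 <= site < len(sequence):
        raise IndexError("site lies outside the sequence")
    padded = pad * flank + sequence + pad * flank
    # After padding, the original index `site` sits at `site + flank`,
    # so the window starts at `site` in the padded string.
    return padded[site:site + 2 * flank + 1]


def binary_encode(segment):
    """Encode each residue as 20 bits; padding or unknown residues are all zeros."""
    vector = []
    for residue in segment:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


protein = "MKRSSPTLEEGSYAK"  # hypothetical sequence
window = extract_window(protein, site=3)  # candidate serine at index 3
features = binary_encode(window)
print(window, len(features))
```

A vector built this way could serve as one training example for an SVM, with the label indicating whether the central residue is an experimentally verified modification site. A profile representation would replace the 0/1 entries with position-specific scores, which this sketch does not attempt.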
References
Web Resources http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=16448870&ordinalpos=5&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum, http://www.plosone.org/article/fetchArticle.action;jsessionid=25641689BCDA437BC10254AB6634C83F?articleURI=info%3Adoi%2F10.1371%2Fjournal.pone.0000656, http://en.wikipedia.org/wiki/Posttranslational_modification. 2007
Puntervoll P, Linding R, Gemund C, Chabanis-Davidson S, Mattingsdal M, Cameron S, Martin DM, Ausiello G, Brannetti B, Costantini A, Ferre F, Maselli V, Via A, Cesareni G, Diella F, Superti-Furga G, Wyrwicz L, Ramu C, McGuigan C, Gudavalli R, Letunic I, Bork P, Rychlewski L, Kuster B, Helmer-Citterich M, Hunter WN, Aasland R, Gibson TJ (2003) Nucleic Acids Res 31(13):3625–3630
Obenauer JC, Cantley LC, Yaffe MB (2003) Nucleic Acids Res 31(13):3635–3641
Attwood TK, Bradley P, Flower DR, Gaulton A, Maudling N, Mitchell AL, Moulton G, Nordle A, Paine K, Taylor P, Uddin A, Zygouri C (2003) Nucleic Acids Res 31(1):400–402
Huang JY, Brutlag DL (2001) Nucleic Acids Res 29(1):202–204
Nevill-Manning CG, Wu TD, Brutlag DL (1998) Proc Natl Acad Sci USA 95(11):5865–5871
Henikoff JG, Greene EA, Pietrokovski S, Henikoff S (2000) Nucleic Acids Res 28(1):228–230
Henikoff JG, Henikoff S, Pietrokovski S (1999) Nucleic Acids Res 27(1):226–228
Henikoff S, Henikoff JG, Pietrokovski S (1999) Bioinformatics 15(6):471–479
Falquet L, Pagni M, Bucher P, Hulo N, Sigrist CJ, Hofmann K, Bairoch A (2002) Nucleic Acids Res 30(1):235–238
Hofmann K, Bucher P, Falquet L, Bairoch A (1999) Nucleic Acids Res 27(1):215–219
Sigrist CJ, Cerutti L, Hulo N, Gattiker A, Falquet L, Pagni M, Bairoch A, Bucher P (2002) Brief Bioinform 3(3):265–274
de Castro E, Sigrist CJ, Gattiker A, Bulliard V, Langendijk-Genevaux PS, Gasteiger E, Bairoch A, Hulo N (2006) Nucleic Acids Res 34(Web Server issue):W362–W365
Gattiker A, Gasteiger E, Bairoch A (2002) Appl Bioinformatics 1(2):107–108
Jonassen I, Collins JF, Higgins DG (1995) Protein Sci 4(8):1587–1595
Blinov NN Jr, Gurzhiev AN, Gurzhiev SN, Kostritskii AV (2004) Med Tekh (5):47
Gurzhiev AN, Gurzhiev SN, Kirichenko MG, Kostritskii AV (2005) Med Tekh (5):45–48
Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R (2005) Nucleic Acids Res 33(Web Server issue):W116–W120
Zdobnov EM, Apweiler R (2001) Bioinformatics 17(9):847–848
Balla S, Thapar V, Verma S, Luong T, Faghri T, Huang CH, Rajasekaran S, del Campo JJ, Shinn JH, Mohler WA, Maciejewski MW, Gryk MR, Piccirillo B, Schiller SR, Schiller MR (2006) Nat Methods 3(3):175–177
Ahmad I, Hoessli DC, Walker-Nasir E, Choudhary MI, Rafik SM, Shakoori AR (2006) J Cell Biochem 99(3):706–718
Senawongse P, Dalby AR, Yang ZR (2005) J Chem Inf Model 45(4):1147–1152
Blom N, Kreegipuu A, Brunak S (1998) Nucleic Acids Res 26(1):382–386
Kreegipuu A, Blom N, Brunak S (1999) Nucleic Acids Res 27(1):237–239
Blom N, Gammeltoft S, Brunak S (1999) J Mol Biol 294(5):1351–1362
Xue Y, Li A, Wang L, Feng H, Yao X (2006) BMC Bioinformatics 7:163
Li A, Wang L, Shi Y, Wang M, Jiang Z, Feng H (2005) Conf Proc IEEE Eng Med Biol Soc 6:6075–6078
Li A, Xue Y, Jin C, Wang M, Yao X (2006) Biochem Biophys Res Commun 350(4):818–824
Lee TY, Huang HD, Hung JH, Huang HY, Yang YS, Wang TH (2006) Nucleic Acids Res 34(Database issue):D622–D627
Chen H, Xue Y, Huang N, Yao X, Sun Z (2006) Nucleic Acids Res 34(Web Server issue):W249–W253
Li S, Liu B, Zeng R, Cai Y, Li Y (2006) Comput Biol Chem 30(3):203–208
Monigatti F, Hekking B, Steen H (2006) Biochim Biophys Acta 1764(12):1904–1913
Xue Y, Chen H, Jin C, Sun Z, Yao X (2006) BMC Bioinformatics 7:458
Zhou F, Xue Y, Yao X, Xu Y (2006) Bioinformatics 22(7):894–896
Bairoch A, Apweiler R (1999) Nucleic Acids Res 27(1):49–54
Plewczynski D, Tkacz A, Godzik A, Rychlewski L (2005) Cell Mol Biol Lett 10(1):73–89
Plewczynski D, Tkacz A, Wyrwicz LS, Godzik A, Kloczkowski A, Rychlewski L (2006) J Mol Model 12(4):453–461
Plewczynski D, Tkacz A, Wyrwicz LS, Rychlewski L (2005) Bioinformatics 21(10):2525–2527
Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines: and other kernel-based learning methods. Cambridge University Press, Cambridge, U.K.; New York, p 189, xiii
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York, p 188, xv
Vapnik VN (1998) Statistical learning theory. Adaptive and learning systems for signal processing, communications, and control. Wiley, New York, p 736, xxiv
Byvatov E, Fechner U, Sadowski J, Schneider G (2003) J Chem Inf Comput Sci 43(6):1882–1889
Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, Haussler D (2000) Bioinformatics 16(10):906–914
Kim H, Park H (2003) Protein Eng 16(8):553–560
Schölkopf B, Burges CJC, Smola AJ (1999) Advances in kernel methods: support vector learning. MIT Press, Cambridge, MA, p 376, vii
Zavaljevski N, Stevens FJ, Reifman J (2002) Bioinformatics 18(5):689–696
Zien A, Ratsch G, Mika S, Scholkopf B, Lengauer T, Muller KR (2000) Bioinformatics 16(9):799–807
Joachims T (2002) Learning to classify text using support vector machines. Kluwer international series in engineering and computer science; SECS 668. Kluwer, Boston, p 205, xvi
Diella F, Cameron S, Gemund C, Linding R, Via A, Kuster B, Sicheritz-Ponten T, Blom N, Gibson TJ (2004) BMC Bioinformatics 5:79
Lohmann R, Schneider G, Behrens D, Wrede P (1994) Protein Sci 3(9):1597–1601
Acknowledgements
This work was supported by EC (LHSG-CT-2003-503265), NIH (1R01GM081680-01), EMBO Installation, FNP (FOCUS) and MNiSW (PBZ-MNiI-2/1/2005, N401 050 32/1181) grants.
Cite this article
Plewczynski, D., Tkacz, A., Wyrwicz, L.S. et al. AutoMotif Server for prediction of phosphorylation sites in proteins using support vector machine: 2007 update. J Mol Model 14, 69–76 (2008). https://doi.org/10.1007/s00894-007-0250-3
DOI: https://doi.org/10.1007/s00894-007-0250-3