Support-vector-machine classification of linear functional motifs in proteins

  • Original Paper
  • Journal of Molecular Modeling

Abstract

Our algorithm predicts short linear functional motifs in proteins using only sequence information. Statistical models for these motifs are built from a database of short sequence fragments taken from proteins in the current release of the Swiss-Prot database. Each of these segments is experimentally confirmed to carry a single-residue post-translational modification. The sensitivity of the classification for the various types of short linear motifs is around 70%. To make a prediction, the query protein sequence is dissected into short overlapping fragments, and each fragment is represented as a vector. Each vector is then classified by a machine learning algorithm (Support Vector Machine) as potentially modifiable or not. The resulting list of plausible post-translational modification sites in the query protein is returned to the user. We also present a study of the human protein kinase C family as a biological application of our method.
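The pipeline outlined in the abstract (overlapping fragments, vector representation, SVM classification) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the window width of 7, the one-hot encoding, the linear kernel trained with a Pegasos-style sub-gradient method, the function names, and the toy training fragments, which are invented and deliberately easy to separate rather than drawn from Swiss-Prot.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def fragments(sequence, width=7):
    """Yield (centre_index, fragment) for every residue; ends are padded with 'X'."""
    half = width // 2
    padded = "X" * half + sequence + "X" * half
    for i in range(len(sequence)):
        yield i, padded[i:i + width]


def encode(fragment):
    """One-hot encode a fragment: 20 slots per position, unknown residues stay zero."""
    vec = [0.0] * (len(fragment) * len(AMINO_ACIDS))
    for pos, aa in enumerate(fragment):
        idx = AA_INDEX.get(aa)
        if idx is not None:
            vec[pos * len(AMINO_ACIDS) + idx] = 1.0
    return vec


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the L2-regularised
    hinge loss. Labels must be +1 or -1; no bias term is fitted."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)               # decaying step size
            margin = y[i] * dot(w, X[i])        # margin under the current weights
            w = [(1.0 - eta * lam) * wj for wj in w]   # shrink (regularisation)
            if margin < 1.0:                    # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict_sites(sequence, w, width=7):
    """Return (1-based position, fragment, score) for fragments scored positive."""
    hits = []
    for pos, frag in fragments(sequence, width):
        score = dot(w, encode(frag))
        if score > 0:
            hits.append((pos + 1, frag, score))
    return hits


# Invented toy fragments (NOT from Swiss-Prot): positives carry a central
# serine in an R-R-x-S-like context, negatives are unrelated fragments.
POSITIVE = ["RRASLGV", "RRGSVEL", "RKASLGI", "RRPSIEV"]
NEGATIVE = ["GDLAEAD", "AEVTDPK", "LGDGAKE", "VLEAGDA"]
X = [encode(f) for f in POSITIVE + NEGATIVE]
y = [1] * len(POSITIVE) + [-1] * len(NEGATIVE)
w = train_linear_svm(X, y)

# Hypothetical query sequence, scanned residue by residue.
for pos, frag, score in predict_sites("MGDLRRASLGVAEDK", w):
    print(pos, frag, round(score, 3))
```

A real predictor would be trained on experimentally annotated fragments, would likely use a tuned kernel and feature representation, and would be evaluated on held-out data; the sketch only shows how the three steps fit together.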



Acknowledgements

This work was supported by a US grant (“SPAM”, GM63208) and by the ELM (QLRT-CT2000-00127), BioSapiens (LHSG-CT-2003-503265), and GeneFun (LSHG-CT-2004-503567) projects within the Fifth and Sixth EC Framework Programmes. A. K. acknowledges the financial support provided by NIH grant 1R01GM072014-01. LSW is supported by the Foundation for Polish Science within the Program for Young Researchers.

Author information

Corresponding author

Correspondence to Dariusz Plewczynski.

About this article

Cite this article

Plewczynski, D., Tkacz, A., Wyrwicz, L.S. et al. Support-vector-machine classification of linear functional motifs in proteins. J Mol Model 12, 453–461 (2006). https://doi.org/10.1007/s00894-005-0070-2

  • DOI: https://doi.org/10.1007/s00894-005-0070-2
