Molecular modeling of phosphorylation sites in proteins using a database of local structure segments

  • Original Paper
Journal of Molecular Modeling

Abstract

A new bioinformatics tool for molecular modeling of the local structure around phosphorylation sites in proteins has been developed. Our method is based on a library of short sequence and structure motifs. The basic structural elements to be predicted are local structure segments (LSSs). This enables us to avoid the problem of inexact local description of structures, caused either by diversity in the structural context or by uncertainties in prediction methods. We have developed a library of LSSs and a profile-profile matching algorithm that predicts local structures of proteins from their sequence information. Our fragment library prediction method is publicly available on the FRAGlib server at http://ffas.ljcrf.edu/Servers/frag.html. The algorithm has been applied successfully to the characterization of local structure around phosphorylation sites in proteins. Our computational predictions of sequence and structure preferences around phosphorylated residues have been confirmed by phosphorylation experiments for PKA and PKC kinases. The quality of predictions has been evaluated with several independent statistical tests. We have observed a significant improvement in the accuracy of predictions when structural information is incorporated into the description of the neighborhood of the phosphorylated site. Our results strongly suggest that sequence information ought to be supplemented with structural context information (predicted with our segment similarity method) for more successful predictions of phosphorylation sites in proteins.

Figure. The automatic annotation service used for predicting posttranslational modification sites in proteins. Our local prediction method compares the sequence profile of the query protein against all members of the fragment database. The query protein is dissected into short fragments (7 to 19 residues long). For each fragment, a similarity search is performed. Each member of the fragment database that is similar to the query fragment in terms of its homology-based sequence profile is added to the list of predicted structures. The list is then sorted and truncated to the 20 best results. If the highest score among the predicted fragments is below the user-defined cutoff value, the whole prediction is discarded. At the end, some parts of the query protein are covered by the list of 20 fragments from the database (PLSSs). The resulting PLSSs are then compared with the database of segments known to be phosphorylated by PKA or PKC kinases. Each verified segment in the database is stored with its LSS and sequence profile. Our similarity method assigns a probability of being phosphorylated (the C score) to every PLSS predicted for the input query protein. Sites with scores higher than the cutoff value C0 (different for PKA and PKC phosphorylation) are predicted to be phosphorylated by the specific kinase.
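The workflow in the figure caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the FRAGlib implementation: the names `dissect`, `predict_segments`, `predicted_sites` and `identity` are hypothetical, the query and library entries are plain strings rather than sequence profiles, and the toy `identity` score merely stands in for the profile-profile score, which the text does not specify. The caption also leaves open whether the list of 20 results is kept per fragment or per query; the sketch assumes one list per query.

```python
from typing import Callable, Dict, Iterator, List, NamedTuple, Tuple


class Hit(NamedTuple):
    start: int       # 0-based start of the query fragment
    length: int      # fragment length in residues
    segment_id: str  # identifier of the matched library segment (hypothetical)
    score: float     # similarity score, higher is better


def dissect(query_len: int, min_len: int = 7, max_len: int = 19) -> Iterator[Tuple[int, int]]:
    """Yield (start, length) for every window of 7 to 19 residues."""
    for length in range(min_len, max_len + 1):
        for start in range(0, query_len - length + 1):
            yield start, length


def identity(a: str, b: str) -> float:
    """Toy stand-in for a profile-profile score: fraction of identical positions."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_segments(
    query: str,
    library: Dict[str, str],
    similarity: Callable[[str, str], float],
    cutoff: float,
    top_n: int = 20,
) -> List[Hit]:
    """Score every query window against equal-length library segments,
    keep the top_n best hits, and discard the whole prediction if the
    best score is below the user-defined cutoff."""
    hits: List[Hit] = []
    for start, length in dissect(len(query)):
        fragment = query[start:start + length]
        for seg_id, segment in library.items():
            if len(segment) != length:
                continue  # assumption: only equal-length segments are compared
            hits.append(Hit(start, length, seg_id, similarity(fragment, segment)))
    hits.sort(key=lambda h: h.score, reverse=True)
    hits = hits[:top_n]
    if not hits or hits[0].score < cutoff:
        return []
    return hits


def predicted_sites(c_scores: Dict[int, float], c0: float) -> List[int]:
    """Return positions whose C score is strictly higher than the
    kinase-specific cutoff C0 (how C scores are computed is not modeled here)."""
    return sorted(pos for pos, c in c_scores.items() if c > c0)
```

A call such as `predict_segments("MRRASVAGL", {"seg1": "RRASVAG"}, identity, cutoff=0.5)` returns the ranked hits for a toy query; in practice the similarity function and the C scores would come from the profile comparison described in the abstract, and C0 would be chosen separately for PKA and PKC.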

Acknowledgments

This work was supported by a US grant (“SPAM” GM63208) and by the BioSapiens (LHSG-CT-2003-503265), ELM (QLRT-CT-2000-00127) and GeneFun (LSHG-CT-2004-503567) projects within the EU 6FP program.

Author information

Corresponding author

Correspondence to Dariusz Plewczynski.

About this article

Cite this article

Plewczynski, D., Jaroszewski, L., Godzik, A. et al. Molecular modeling of phosphorylation sites in proteins using a database of local structure segments. J Mol Model 11, 431–438 (2005). https://doi.org/10.1007/s00894-005-0235-z

  • DOI: https://doi.org/10.1007/s00894-005-0235-z
