Abstract
A new bioinformatics tool has been developed for molecular modeling of the local structure around phosphorylation sites in proteins. Our method is based on a library of short sequence and structure motifs. The basic structural elements to be predicted are local structure segments (LSSs). Working at the level of segments lets us avoid the problem of inexact local descriptions of structure, which arise either from diversity in the structural context or from uncertainties in prediction methods. We have developed a library of LSSs and a profile-profile matching algorithm that predicts the local structures of proteins from their sequence information. Our fragment library prediction method is publicly available on a server (FRAGlib) at http://ffas.ljcrf.edu/Servers/frag.html. The algorithm has been applied successfully to the characterization of local structure around phosphorylation sites in proteins. Our computational predictions of sequence and structure preferences around phosphorylated residues have been confirmed by phosphorylation experiments for PKA and PKC kinases. The quality of the predictions has been evaluated with several independent statistical tests. We observed a significant improvement in prediction accuracy when structural information was incorporated into the description of the neighborhood of the phosphorylated site. Our results strongly suggest that sequence information should be supplemented with structural context information (predicted with our segment similarity method) for more successful prediction of phosphorylation sites in proteins.
Figure. The automatic annotation service used for predicting posttranslational modification sites in proteins. Our local prediction method compares the sequence profile of the query protein against all members of the fragment database. The query protein is dissected into short fragments (7–19 residues long), and a similarity search is performed for each fragment. Each member of the fragment database whose sequence profile is similar to that of the query fragment is added to the list of predicted structures. The list is then sorted and truncated to the 20 best results. If the highest score among the predicted fragments is below the user-defined cutoff value, the whole prediction is discarded. At the end, some parts of the query protein are covered by lists of 20 fragments from the database, the predicted local structure segments (PLSSs). The resulting PLSSs are then compared with the database of segments known to be phosphorylated by PKA or PKC kinases. Each verified segment in that database is stored together with its LSS and sequence profile. Our similarity method assigns a probability of being phosphorylated (the C score) to every PLSS predicted for the input query protein. Sites with scores higher than the cutoff value C0 (which differs for PKA and PKC phosphorylation) are predicted to be phosphorylated by the specific kinase.
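The workflow in the caption can be illustrated with a minimal Python sketch. It is not the FRAGlib implementation: the function names (`dissect`, `predict_plss`, `predict_sites`), the `toy_similarity` placeholder that stands in for profile-profile scoring, and the `c_score` callable are all hypothetical, since the text does not specify how similarity or C scores are computed. Only the fragment lengths (7–19), the list size (20), and the two cutoffs are taken from the caption.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

MIN_LEN, MAX_LEN = 7, 19  # fragment lengths stated in the caption
TOP_N = 20                # number of hits kept per query fragment

Hit = Tuple[float, str]   # (similarity score, library entry)


def dissect(query: str, min_len: int = MIN_LEN,
            max_len: int = MAX_LEN) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for every window of length min_len..max_len."""
    for length in range(min_len, max_len + 1):
        for start in range(len(query) - length + 1):
            yield start, query[start:start + length]


def toy_similarity(a: str, b: str) -> float:
    """Placeholder score (assumption): fraction of identical positions.

    The real method compares sequence profiles, which is not modeled here.
    """
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_plss(fragment: str, library: Sequence[str],
                 similarity: Callable[[str, str], float],
                 cutoff: float, top_n: int = TOP_N) -> List[Hit]:
    """Score the fragment against the library, sort, and keep the top_n hits.

    Returns an empty list if the best score is below the cutoff,
    mirroring the "whole prediction is discarded" step.
    """
    hits = sorted(((similarity(fragment, entry), entry) for entry in library),
                  key=lambda hit: hit[0], reverse=True)[:top_n]
    if not hits or hits[0][0] < cutoff:
        return []
    return hits


def predict_sites(plss_by_start: Dict[int, List[Hit]],
                  c_score: Callable[[str], float], c0: float) -> List[int]:
    """Return fragment start positions whose best C score exceeds c0.

    c_score is a hypothetical callable; c0 would differ for PKA and PKC.
    """
    predicted = []
    for start, hits in plss_by_start.items():
        if hits and max(c_score(entry) for _, entry in hits) > c0:
            predicted.append(start)
    return sorted(predicted)
```

In this sketch every library entry is scored and ranked. A real implementation would probably pre-filter dissimilar entries and would use a separate C0 value for each kinase.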
Acknowledgments
This work was supported by the USA grant (“SPAM” GM63208), BioSapiens (LHSG-CT-2003-503265), ELM (QLRT-CT-2000-00127) and GeneFun (LSHG-CT-2004-503567) projects within 6FP EU program.
Cite this article
Plewczynski, D., Jaroszewski, L., Godzik, A. et al. Molecular modeling of phosphorylation sites in proteins using a database of local structure segments. J Mol Model 11, 431–438 (2005). https://doi.org/10.1007/s00894-005-0235-z