Abstract
The prediction of β-sheet topology requires the consideration of long-range interactions between β-strands that are not necessarily consecutive in sequence. Since these interactions are difficult to simulate using ab initio methods, we propose a supplementary method able to assign β-sheet topology using only sequence information. We envision using the results of our method to reduce the three-dimensional search space of ab initio methods. Our method is based on the signature molecular descriptor, which has previously been used successfully to predict protein–protein interactions and to develop quantitative structure–activity relationships for small organic drugs and peptide inhibitors. Here, we show how the signature descriptor can be used in a Support Vector Machine to predict whether or not two β-strands will pack adjacently within a protein. We then show how these predictions can be used to order β-strands within β-sheets. Using the entire Protein Data Bank (PDB) with ten-fold cross-validation, we have achieved 74.0% accuracy in packing prediction and 75.6% accuracy in the prediction of edge strands. For the case of β-strand ordering, we predict the correct ordering for 51.3% of the β-sheets. Furthermore, using a simple confidence metric, we can identify those sheets for which accurate predictions can be obtained. For the 25% of predictions with the highest confidence, we achieve 95.7% accuracy in β-strand ordering.
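The abstract does not say how the pairwise packing predictions are turned into a strand order, nor how the confidence metric is defined. As a rough illustration of the general idea only, the sketch below assumes a hypothetical symmetric matrix `scores`, where `scores[i][j]` stands in for a classifier's packing score for strands i and j. It enumerates all linear orderings by brute force, keeps the one with the highest total score between neighbours, and reports the gap to the runner-up as a stand-in confidence value. The function name, the scoring rule, and the margin are all assumptions made for illustration, not the authors' procedure.

```python
from itertools import permutations


def order_strands(scores):
    """Pick the linear strand order with the highest summed neighbour score.

    scores: symmetric matrix (list of lists); scores[i][j] is a hypothetical
    packing score for strands i and j (an assumption, not the paper's output).
    Returns (best_order, best_total, margin), where margin is the gap between
    the best and second-best orderings (an assumed confidence proxy).
    """
    n = len(scores)
    ranked = []
    for perm in permutations(range(n)):
        # An ordering and its reverse describe the same sheet; keep one of them.
        if perm[0] > perm[-1]:
            continue
        total = sum(scores[a][b] for a, b in zip(perm, perm[1:]))
        ranked.append((total, perm))
    ranked.sort(reverse=True)
    best_total, best_order = ranked[0]
    margin = best_total - ranked[1][0] if len(ranked) > 1 else float("inf")
    return list(best_order), best_total, margin


# Made-up scores for a four-strand sheet (illustrative values only).
example_scores = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.2, 0.8],
    [0.1, 0.2, 0.0, 0.7],
    [0.2, 0.8, 0.7, 0.0],
]
order, total, margin = order_strands(example_scores)
print("order:", order, "edge strands:", (order[0], order[-1]))
print("total:", total, "margin:", margin)
```

In this toy setting the first and last entries of the returned order play the role of edge strands. Exhaustive enumeration grows factorially with the number of strands, so it is only workable for small sheets.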
Acknowledgements
This work was funded by the U.S. Department of Energy’s Genomics: GTL program (http://www.doegenomestolife.org) under the project “Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling” (http://www.genomes-to-life.org). Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
Cite this article
Brown, W.M., Martin, S., Chabarek, J.P. et al. Prediction of β-strand packing interactions using the signature product. J Mol Model 12, 355–361 (2006). https://doi.org/10.1007/s00894-005-0052-4
DOI: https://doi.org/10.1007/s00894-005-0052-4