Efficient Traversal of Beta-Sheet Protein Folding Pathways Using Ensemble Models

  • Solomon Shenker
  • Charles W. O’Donnell
  • Srinivas Devadas
  • Bonnie Berger
  • Jérôme Waldispühl
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6577)


Molecular Dynamics (MD) simulations can now predict ms-timescale folding processes of small proteins; however, this currently requires hundreds of thousands of CPU hours and is primarily applicable to short peptides with few long-range interactions. Larger and slower-folding proteins, such as many with extended β-sheet structure, would require orders of magnitude more time and computing resources. Furthermore, when the objective is only to determine which folding events are necessary and rate-limiting, MD simulations at atomistic detail can be unnecessary. Here, we introduce the program tFolder as an efficient method for modelling the folding process of large β-sheet proteins using sequence data alone. To do so, we extend existing ensemble β-sheet prediction techniques, which permitted only a fixed anti-parallel β-barrel shape, with a method that predicts arbitrary β-strand/β-strand orientations and strand-order permutations. By accounting for all partial and final structural states, we can then model the transition from random coil to native state as a Markov process, using a master equation to simulate the population dynamics of folding over time. Thus, all putative folding pathways can be energetically scored, and the transitions that present the greatest barriers can be identified. Since correct folding pathway prediction is likely determined by the accuracy of contact prediction, we demonstrate that the accuracy of tFolder is comparable with that of state-of-the-art methods designed specifically for the contact prediction problem alone. We validate our method for dynamics prediction by applying it to the folding pathway of the well-studied Protein G. With comparatively little computation time, tFolder is able to reveal critical features of the folding pathways that were previously observed only through time-consuming MD simulations and experimental studies. Such a result greatly expands the number of proteins whose folding pathways can be studied, while the algorithmic integration of ensemble prediction with Markovian dynamics can be applied to many other problems.
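To make the master-equation step concrete, the following is a minimal sketch of population dynamics on a toy three-state landscape (coil, partial hairpin, native). It is not the tFolder implementation: the state energies, the neighbour list, the Metropolis rate rule, and the function names `metropolis_rates`, `simulate`, and `boltzmann` are all assumptions made for illustration, since the abstract does not specify how transition rates are derived.

```python
import math

def metropolis_rates(energies, neighbors, kT=1.0):
    """Rate matrix K with K[i][j] = rate of i -> j (assumed Metropolis rule)."""
    n = len(energies)
    K = [[0.0] * n for _ in range(n)]
    for i, j in neighbors:
        for a, b in ((i, j), (j, i)):
            # Downhill moves have rate 1; uphill moves are Boltzmann-penalised.
            K[a][b] = min(1.0, math.exp(-(energies[b] - energies[a]) / kT))
    for i in range(n):
        # Diagonal entries make each row sum to zero (probability is conserved).
        K[i][i] = -sum(K[i][j] for j in range(n) if j != i)
    return K

def simulate(K, p0, t_end, dt=0.01):
    """Integrate the master equation dp_j/dt = sum_i p_i K[i][j] by forward Euler."""
    p = list(p0)
    n = len(p)
    for _ in range(int(round(t_end / dt))):
        flow = [sum(p[i] * K[i][j] for i in range(n)) for j in range(n)]
        p = [p[j] + dt * flow[j] for j in range(n)]
    return p

def boltzmann(energies, kT=1.0):
    """Equilibrium distribution that the dynamics should relax to."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical toy landscape: 0 = random coil, 1 = partial hairpin, 2 = native.
energies = [0.0, 1.0, -3.0]        # illustrative values in units of kT
neighbors = [(0, 1), (1, 2)]       # coil <-> hairpin <-> native
K = metropolis_rates(energies, neighbors)
p_final = simulate(K, [1.0, 0.0, 0.0], t_end=200.0)
p_equilibrium = boltzmann(energies)
```

Starting with the whole population in the coil state, the populations relax towards the Boltzmann distribution, and the uphill coil-to-hairpin step acts as the barrier in this toy landscape. Forward Euler is used only to keep the sketch dependency-free; for a realistic number of states a matrix exponential or eigendecomposition of the rate matrix would be the more usual choice.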







Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Solomon Shenker (1)
  • Charles W. O’Donnell (2)
  • Srinivas Devadas (2)
  • Bonnie Berger (2, 3)
  • Jérôme Waldispühl (1, 2)

  1. School of Computer Science & McGill Centre for Bioinformatics, McGill University, Montreal, Canada
  2. Computer Science and AI Lab, MIT, Cambridge, USA
  3. Department of Mathematics, MIT, Cambridge, USA
