Learning Organizations of Protein Energy Landscapes: An Application on Decoy Selection in Template-Free Protein Structure Prediction

  • Protocol
Protein Supersecondary Structures

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1958))

Abstract

The protein energy landscape, which lifts the protein structure space by associating energies with structures, has been useful in improving our understanding of the relationship between structure, dynamics, and function. Currently, however, it is challenging to automatically extract and utilize the underlying organization of an energy landscape to link the structural states it houses to biological activity. In this chapter, we first report on two computational approaches that extract such an organization, one that ignores energies and operates directly in the structure space and another that operates on the energy landscape associated with the structure space. We then describe two complementary approaches, one based on unsupervised learning and another based on supervised learning. Both approaches utilize the extracted organization to address the problem of decoy selection in template-free protein structure prediction. The presented results make the case that learning organizations of protein energy landscapes advances our ability to link structures to biological activity.

Author information

Corresponding author

Correspondence to Amarda Shehu.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Akhter, N., Hassan, L., Rajabi, Z., Barbará, D., Shehu, A. (2019). Learning Organizations of Protein Energy Landscapes: An Application on Decoy Selection in Template-Free Protein Structure Prediction. In: Kister, A. (ed) Protein Supersecondary Structures. Methods in Molecular Biology, vol 1958. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9161-7_8

  • DOI: https://doi.org/10.1007/978-1-4939-9161-7_8

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-9160-0

  • Online ISBN: 978-1-4939-9161-7

  • eBook Packages: Springer Protocols
