Toolbox for Protein Structure Prediction

Part of the Methods in Molecular Biology book series (MIMB, volume 1369)


Protein tertiary structure prediction algorithms aim to predict the tertiary structure of a protein from its amino acid sequence. In silico protein structure prediction methods have become extremely important because in vitro structural elucidation cannot keep pace with the growth of sequence databases driven by high-throughput next-generation sequencing, which has widened the gap in our knowledge between sequences and structures.

Here we briefly discuss protein tertiary structure prediction and the biennial Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition, including its role in shaping the field. We also discuss, in detail, our cutting-edge web-server method IntFOLD2-TS for tertiary structure prediction. Furthermore, we provide a step-by-step guide to using the IntFOLD2-TS web server, along with some real-world examples in which the IntFOLD server can be, and has been, used to improve protein tertiary structure prediction and aid functional elucidation.

Key words

Protein tertiary structure prediction; Protein structure; Fold recognition; Template-based modeling; Template-free modeling; Critical Assessment of Techniques for Protein Structure Prediction (CASP); Bioinformatics web servers; Model quality assessment methods; Continuous Automated Model EvaluatiOn (CAMEO); Protein Model Portal (PMP); Protein Structure Initiative (PSI)



DBR is a recipient of a Young Investigator Fellowship from the Institut de Biologie Computationnelle, Université de Montpellier (ANR Investissements D’Avenir Bio-informatique: projet IBC). The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement No. 246556 [to D.B.R.].



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Institut de Biologie Computationnelle, LIRMM, CNRS, Université de Montpellier, Montpellier, France
  2. CEA, DSV, IG, Genoscope, Évry, France
  3. CNRS-UMR8030, Évry, France
  4. Université d’Évry Val d’Essonne, Évry, France
  5. PRES UniverSud Paris, Saint-Aubin, France
  6. School of Biological Sciences, University of Reading, Reading, UK
