In silico Identification and Characterization of Protein-Ligand Binding Sites

  • Protocol
Computational Design of Ligand Binding Proteins

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1414))

Abstract

Protein–ligand binding site prediction methods aim to predict, from amino acid sequence, protein–ligand interactions, putative ligands, and ligand binding site residues using either sequence information, structural information, or a combination of both. In silico characterization of protein–ligand interactions has become extremely important to help determine a protein’s functionality, as in vivo-based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time-consuming, costly, and may not be feasible for large-scale analysis, such as drug discovery. Thus, in silico prediction of protein–ligand interactions must be utilized to aid in functional elucidation. Here, we briefly discuss protein function prediction, prediction of protein–ligand interactions, and the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method, FunFOLD, for the structurally informed prediction of protein–ligand interactions. Furthermore, we provide a step-by-step guide to using the FunFOLD web server and the FunFOLD3 downloadable application, along with some real-world examples in which the FunFOLD methods have been used to aid functional elucidation.

Acknowledgements

Daniel Barry Roche is a recipient of a Young Investigator Fellowship from the Institut de Biologie Computationnelle, Université de Montpellier (ANR Investissements D’Avenir Bio-informatique: projet IBC).

Corresponding author

Correspondence to Daniel Barry Roche.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Roche, D.B., McGuffin, L.J. (2016). In silico Identification and Characterization of Protein-Ligand Binding Sites. In: Stoddard, B. (eds) Computational Design of Ligand Binding Proteins. Methods in Molecular Biology, vol 1414. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3569-7_1

  • DOI: https://doi.org/10.1007/978-1-4939-3569-7_1

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3567-3

  • Online ISBN: 978-1-4939-3569-7

  • eBook Packages: Springer Protocols
