In silico Identification and Characterization of Protein-Ligand Binding Sites

Roche, Daniel Barry; McGuffin, Liam James

doi:10.1007/978-1-4939-3569-7_1

Part of the book series: Methods in Molecular Biology (MIMB, volume 1414)

Abstract

Protein–ligand binding site prediction methods aim to predict, from amino acid sequence, protein–ligand interactions, putative ligands, and ligand binding site residues using either sequence information, structural information, or a combination of both. In silico characterization of protein–ligand interactions has become extremely important to help determine a protein’s functionality, as in vivo-based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time-consuming, costly, and may not be feasible for large-scale analysis, such as drug discovery. Thus, in silico prediction of protein–ligand interactions must be utilized to aid in functional elucidation. Here, we briefly discuss protein function prediction, prediction of protein–ligand interactions, and the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method, FunFOLD, for the structurally informed prediction of protein–ligand interactions. Furthermore, we provide a step-by-step guide to using the FunFOLD web server and the FunFOLD3 downloadable application, along with some real-world examples in which the FunFOLD methods have been used to aid functional elucidation.

Acknowledgements

Daniel Barry Roche is a recipient of a Young Investigator Fellowship from the Institut de Biologie Computationnelle, Université de Montpellier (ANR Investissements D’Avenir Bio-informatique: projet IBC).

Author information

Authors and Affiliations

Institut de Biologie Computationnelle, LIRMM, CNRS, Université de Montpellier, 860 rue de St Priest, 34095, Montpellier, France
Daniel Barry Roche
Centre de Recherche en Biologie cellulaire de Montpellier, CNRS-UMR 5237, 1919 Route de Mende, Montpellier, 34293, France
Daniel Barry Roche
School of Biological Sciences, University of Reading, Reading, RG6 6AS, UK
Liam James McGuffin

Corresponding author

Correspondence to Daniel Barry Roche.

Editor information

Editors and Affiliations

Division of Basic Sciences, Fred Hutchinson Cancer Research Cen, Seattle, Washington, USA
Barry L. Stoddard

About this protocol

Cite this protocol

Roche, D.B., McGuffin, L.J. (2016). In silico Identification and Characterization of Protein-Ligand Binding Sites. In: Stoddard, B. (eds) Computational Design of Ligand Binding Proteins. Methods in Molecular Biology, vol 1414. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3569-7_1

DOI: https://doi.org/10.1007/978-1-4939-3569-7_1
Published: 20 April 2016
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3567-3
Online ISBN: 978-1-4939-3569-7
eBook Packages: Springer Protocols
