Abstract
Protein–ligand binding site prediction methods aim to predict, from amino acid sequence, protein–ligand interactions, putative ligands, and ligand binding site residues, using sequence information, structural information, or a combination of both. In silico characterization of protein–ligand interactions has become extremely important for determining a protein’s function, as in vivo functional elucidation cannot keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time-consuming and costly, and may not be feasible for large-scale analyses such as drug discovery. Thus, in silico prediction of protein–ligand interactions is needed to aid functional elucidation. Here, we briefly discuss protein function prediction, the prediction of protein–ligand interactions, and the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss in detail our cutting-edge web-server method, FunFOLD, for the structurally informed prediction of protein–ligand interactions. Furthermore, we provide a step-by-step guide to using the FunFOLD web server and the FunFOLD3 downloadable application, along with some real-world examples in which the FunFOLD methods have been used to aid functional elucidation.
Acknowledgements
Daniel Barry Roche is a recipient of a Young Investigator Fellowship from the Institut de Biologie Computationnelle, Université de Montpellier (ANR Investissements D’Avenir Bio-informatique: projet IBC).
Copyright information
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Roche, D.B., McGuffin, L.J. (2016). In silico Identification and Characterization of Protein-Ligand Binding Sites. In: Stoddard, B. (eds) Computational Design of Ligand Binding Proteins. Methods in Molecular Biology, vol 1414. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3569-7_1
DOI: https://doi.org/10.1007/978-1-4939-3569-7_1
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3567-3
Online ISBN: 978-1-4939-3569-7
eBook Packages: Springer Protocols