Beyond structural genomics: computational approaches for the identification of ligand binding sites in protein structures



Structural genomics projects have revealed structures for a large number of proteins of unknown function. Understanding the interactions between these proteins and their ligands would provide an initial step in their functional characterization. Binding site identification methods are a fast and cost-effective way to facilitate the characterization of functionally important protein regions. In this review we describe our recently developed methods for binding site identification in the context of existing methods. The advantage of energy-based approaches is emphasized, since they provide flexibility in the identification and characterization of different types of binding sites.


Keywords: Binding site, Function, Interaction, Ligand, Prediction, Structure



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Department of Structural and Chemical Biology, Mount Sinai School of Medicine, New York, USA
  2. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, USA
