Journal of Computer-Aided Molecular Design, Volume 27, Issue 1, pp 15–29

A consistent description of HYdrogen bond and DEhydration energies in protein–ligand complexes: methods behind the HYDE scoring function

  • Nadine Schneider
  • Gudrun Lange
  • Sally Hindle
  • Robert Klein
  • Matthias Rarey


The estimation of the free energy of binding is a key problem in structure-based design. We developed the scoring function HYDE based on a consistent description of HYdrogen bond and DEhydration energies in protein–ligand complexes. HYDE is applicable to all types of protein targets since it is not calibrated on experimental binding affinity data or protein–ligand complexes. The comprehensible atom-based score of HYDE is visualized with an intuitive coloring scheme, which facilitates the analysis of protein–ligand complexes in the lead optimization process. In this paper, we revise several aspects of the former version of HYDE, which was described in detail previously. The revised HYDE version has already been validated in large-scale redocking and screening experiments performed in the course of the Docking and Scoring Symposium at the 241st ACS National Meeting. In this study, we additionally evaluate the ability of the revised HYDE version to predict binding affinities. On the PDBbind 2007 coreset, HYDE achieves a correlation coefficient of 0.62 between the experimental binding constants and the predicted binding energy, performing second best on this dataset compared to 17 other well-established scoring functions. Further, we show that the performance of HYDE in large-scale redocking and virtual screening experiments on the Astex diverse set and the DUD dataset, respectively, is comparable to the best methods in this field.
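As an illustration of the affinity evaluation mentioned above, the following minimal sketch computes a Pearson correlation coefficient between experimental and predicted affinities. It assumes the reported correlation coefficient is of the Pearson type, which the abstract does not state. The function name `pearson_r` and all numbers are hypothetical toy values, not PDBbind data and not HYDE output, and both lists are assumed to use the same sign convention (larger means tighter binding).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, for illustration only (assumption, not real data):
experimental_pkd = [4.2, 5.1, 6.3, 7.0, 8.4]
predicted_pkd = [4.9, 4.8, 6.0, 7.5, 7.9]

r = pearson_r(experimental_pkd, predicted_pkd)
print(f"Pearson r = {r:.2f}")
```

If a scoring function reports binding energies, where more negative values mean tighter binding, the sign would have to be flipped or the energies converted before comparing them with pKd-like values.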


Protein–ligand interactions, Desolvation, Binding affinity, Virtual screening, Lead optimization, Docking



The authors thank Hans Briem and Kristin Beyer of Bayer Pharma AG and Jürgen Albrecht of Bayer CropScience AG for many fruitful discussions and a successful cooperation. We also thank Holger Claussen, Marcus Gastreich and Christian Lemmen of BioSolveIT GmbH for their ongoing support during the development of HYDE, particularly for the meticulous testing and analysis of HYDE and the resulting valuable feedback. The HYDE project was funded by Bayer CropScience AG and Bayer Pharma AG.

Supplementary material

Supplementary material 1 (PDF 183 kb)



Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  • Nadine Schneider (1)
  • Gudrun Lange (2)
  • Sally Hindle (3)
  • Robert Klein (2)
  • Matthias Rarey (1), corresponding author

  1. Center for Bioinformatics, University of Hamburg, Hamburg, Germany
  2. Bayer CropScience AG, Industriepark Hoechst, G836, Frankfurt am Main, Germany
  3. BioSolveIT GmbH, St. Augustin, Germany
