Journal of Molecular Modeling, Volume 9, Issue 3, pp 172–182

WaterScore: a novel method for distinguishing between bound and displaceable water molecules in the crystal structure of the binding site of protein-ligand complexes

  • Alfonso T. García-Sosa
  • Ricardo L. Mancera
  • Philip M. Dean
Original Paper


We have performed a multivariate logistic regression analysis to establish a statistical correlation between the structural properties of water molecules in the binding site of a free protein crystal structure and the probability of observing those water molecules in the same location in the crystal structure of the ligand-complexed form. In the best regression functions obtained, the temperature B-factor, the solvent-contact surface area, the total hydrogen bond energy and the number of protein–water contacts were found to discriminate between bound and displaceable water molecules. These functions may be used to identify the bound water molecules that should be included in structure-based drug design and ligand docking algorithms.
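As a rough illustration of the form such a regression function takes, the sketch below evaluates a logistic model on the four descriptors named in the abstract. It is a minimal sketch under stated assumptions, not the published WaterScore function: the names `WaterDescriptors`, `bound_probability`, `PLACEHOLDER_WEIGHTS` and `PLACEHOLDER_INTERCEPT` are hypothetical, the coefficient values and their signs are arbitrary placeholders, and the units are assumed.

```python
import math
from dataclasses import dataclass


@dataclass
class WaterDescriptors:
    """The four per-water properties named in the abstract (units are assumed)."""
    b_factor: float         # temperature B-factor
    contact_surface: float  # solvent-contact surface area
    hbond_energy: float     # total hydrogen bond energy
    n_contacts: int         # number of protein-water contacts


# PLACEHOLDER values for illustration only. These are NOT the fitted
# coefficients from the paper; values and signs are arbitrary assumptions.
PLACEHOLDER_INTERCEPT = 0.0
PLACEHOLDER_WEIGHTS = {
    "b_factor": -0.1,
    "contact_surface": -0.05,
    "hbond_energy": -0.5,
    "n_contacts": 0.5,
}


def bound_probability(water, weights=None, intercept=PLACEHOLDER_INTERCEPT):
    """Return the logistic-model probability that a water molecule stays bound."""
    if weights is None:
        weights = PLACEHOLDER_WEIGHTS
    # Linear predictor: intercept plus weighted sum of the descriptors.
    z = intercept + sum(w * getattr(water, name) for name, w in weights.items())
    # Numerically stable logistic function 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    example = WaterDescriptors(b_factor=10.0, contact_surface=0.0,
                               hbond_energy=-5.0, n_contacts=4)
    print(f"P(bound) = {bound_probability(example):.3f}")
```

In practice the intercept and weights would be estimated by maximum likelihood from a training set of free and ligand-complexed structures; the placeholder numbers above only show how a fitted function would be applied to a single water molecule.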

Figure. The binding site (thin sticks) of penicillopepsin (3app) with its crystallographically determined water molecules (spheres) and superimposed ligand (in thick sticks, from complexed structure 1ppk). Water molecules sterically displaced by the ligand upon complexation are shown in cyan. Bound water molecules are shown in blue. Displaced water molecules are shown in yellow. Water molecules removed from the analysis due to a lack of hydrogen bonds to the protein are shown in white. WaterScore correctly predicted waters in blue as Probability = 1 to remain bound and waters in yellow as Probability < 1×10^−20 to remain bound.


Keywords: Protein hydration, Drug design, Bound water molecules, Multivariate logistic regression



ATGS would like to thank Consejo Nacional de Ciencia y Tecnología (CONACyT, México) for the award of a postgraduate scholarship and the CVCP of the Universities of the UK for an Overseas Research Scheme award. RLM is also a Research Fellow of Hughes Hall, Cambridge. We also thank Mr. Benjamin Carrington for his valuable help in the production of some of the figures, Dr. Per Kållblad for help and discussion on PC analysis, and Miss Eva-Liina Asu for proof-reading a draft of the manuscript.



Copyright information

© Springer-Verlag 2003

Authors and Affiliations

  • Alfonso T. García-Sosa (1)
  • Ricardo L. Mancera (2)
  • Philip M. Dean (2)
  1. Department of Pharmacology, University of Cambridge, Cambridge, UK
  2. De Novo Pharmaceuticals, Cambridge, UK
