Abstract
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty but also their model uncertainty, that is, how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery; some decrease in accuracy would therefore be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.
Acknowledgments
D.L.M. and C.C.B. appreciate financial support from the National Institutes of Health (1R01GM108889-01) and the National Science Foundation (CHE 1352608), and computing support from the UCI GreenPlanet cluster, supported in part by NSF Grant CHE-0840513. This work was made possible in part by NIH grant U01 GM111528 for the Drug Design Data Resource, which supported the SAMPL workshop. M.K.G. thanks the National Institutes of Health for Grant GM061300. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. M.K.G. has an equity interest in and is a cofounder and scientific advisor of VeraChem LLC. We would also like to acknowledge John Shelley, Art Bochevarov, Robert Abel, and Mats Svensson from Schrödinger for their help with pKa and tautomer enumeration calculations. We also thank all the SAMPL5 participants and D3R Workshop attendees, and we especially appreciate valuable discussions with John Chodera (MSKCC), Ariën Rustenburg (MSKCC), Andreas Klamt (COSMOLogic), Christopher Fennell (Oklahoma State University), Samuel Genheden (Gothenburg University), and Frank Pickard (National Institutes of Health).
Cite this article
Bannan, C.C., Burley, K.H., Chiu, M. et al. Blind prediction of cyclohexane–water distribution coefficients from the SAMPL5 challenge. J Comput Aided Mol Des 30, 927–944 (2016). https://doi.org/10.1007/s10822-016-9954-8
DOI: https://doi.org/10.1007/s10822-016-9954-8