Abstract
Within the framework of the 6th physical property blind challenge (SAMPL6), the authors participated in predicting the octanol–water partition coefficients (logP) of several small drug-like molecules. The logP values were known experimentally to the organizers but were revealed only after the predictions had been submitted. Two different sets of predictions were submitted by the authors, both based on the COSMOtherm implementation of COSMO-RS theory. COSMOtherm predictions using the FINE parametrization level (hmz0n) obtained the highest accuracy among all submissions as measured by the root mean squared error. COSMOquick predictions using a fast algorithm to estimate σ-profiles and an a posteriori machine learning correction on top of the COSMOtherm results (3vqbi) scored 3rd out of 91 submissions. Both results underline the high quality of COSMO-RS derived molecular free energies in solution.
Acknowledgements
The authors acknowledge the organizers for setting up the SAMPL6 challenge and the SAMPL NIH Grant 1R01GM124270-01A1 for the support of the experimental work carried out in this context.
Ethics declarations
Conflict of interest
The authors declare the following competing financial interest(s): Andreas Klamt, Jens Reinisch and Christoph Loschen are employees of Dassault Systèmes, BIOVIA. Dassault Systèmes commercially distributes software implementations of COSMO-RS (COSMOtherm, COSMOquick), which were used in the present study.
Electronic supplementary material
Below is the link to the electronic supplementary material.
10822_2019_259_MOESM1_ESM.zip
Supplementary material 1—All COSMO files generated by Turbomole representing the conformations used by COSMOtherm for the calculation of the logP values. Additional information about the influence of the σ-profile fragmentation process on prediction quality and the role of conformational effects. (ZIP 5158 kb)
About this article
Cite this article
Loschen, C., Reinisch, J. & Klamt, A. COSMO-RS based predictions for the SAMPL6 logP challenge. J Comput Aided Mol Des 34, 385–392 (2020). https://doi.org/10.1007/s10822-019-00259-z
DOI: https://doi.org/10.1007/s10822-019-00259-z