Calculation of distribution coefficients in the SAMPL5 challenge from atomic solvation parameters and surface areas
In the context of SAMPL5, we submitted blind predictions of the cyclohexane/water distribution coefficient (D) for a series of 53 drug-like molecules. Our method is purely empirical and based on the additive contribution of each solute atom to the free energy of solvation in water and in cyclohexane. The contribution of each atom depends on its atom type and on its exposed surface area. Compared with similar methods in the literature, we used a very small set of atomic parameters: only 10 for solvation in water and 1 for solvation in cyclohexane. As a result, the method is protected from overfitting, and the error in the blind predictions could be reasonably estimated. Moreover, the approach is fast: it takes only 0.5 s to predict the distribution coefficients of all 53 SAMPL5 compounds, which allows its application in virtual screening campaigns. The performance of our approach (submission 49) is modest but satisfactory in view of its efficiency: the root mean square error (RMSE) was 3.3 log D units for the 53 compounds, whereas the RMSE of the best performing method (submission 16, using COSMO-RS) was 2.1. Our method is implemented as a Python script available at https://github.com/diogomart/SAMPL5-DC-surface-empirical.
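The additive scheme described above can be illustrated with a minimal sketch. It is not the published script: the parameter values, the atom types, and the function names are hypothetical placeholders chosen for illustration, and only three water parameters are shown instead of the 10 used in the method. The sketch also assumes that the solute is treated as a single neutral species, so that log D follows directly from the difference between the two solvation free energies, and that the single cyclohexane parameter multiplies the total exposed area.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

# HYPOTHETICAL atomic solvation parameters in kcal/(mol Å^2).
# These are illustrative values, not the fitted parameters of the method.
SIGMA_WATER = {"C": 0.005, "N": -0.060, "O": -0.050}
SIGMA_CYCLOHEXANE = -0.010  # one parameter shared by all atom types (assumption)


def solvation_free_energy_water(atoms):
    """Sum atom-type parameter times exposed area over all atoms.

    `atoms` is a list of (atom_type, exposed_area_in_A2) tuples.
    """
    return sum(SIGMA_WATER[atom_type] * area for atom_type, area in atoms)


def solvation_free_energy_cyclohexane(atoms):
    """Single parameter times the total exposed surface area."""
    return SIGMA_CYCLOHEXANE * sum(area for _, area in atoms)


def log_d(atoms, temperature=298.15):
    """Estimate log D (cyclohexane/water) from the two solvation free energies.

    Transfer free energy (water -> cyclohexane) is dG_chx - dG_wat, and
    log D = -dG_transfer / (RT ln 10) for a single neutral species.
    """
    dg_wat = solvation_free_energy_water(atoms)
    dg_chx = solvation_free_energy_cyclohexane(atoms)
    return (dg_wat - dg_chx) / (R_KCAL * temperature * math.log(10))


# Toy solute: mostly apolar surface with a small polar patch.
toy_solute = [("C", 100.0), ("O", 20.0)]
print(f"log D = {log_d(toy_solute):.2f}")
```

With these placeholder values, a solute dominated by apolar surface yields a positive log D (preference for cyclohexane), whereas a solute exposing only polar atoms yields a negative one. In practice the exposed areas would come from a solvent accessible surface calculation on a 3D structure, which is outside the scope of this sketch.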
Keywords: SAMPL5; Drug design data resource; D3R; Solvent accessible area; Free energy of solvation; Distribution coefficient
We acknowledge the European Union (FEDER funds POCI/01/0145/FEDER/007728) and National Funds (FCT/MEC, Fundação para a Ciência e Tecnologia and Ministério da Educação e Ciência) under the Partnership Agreement PT2020 UID/MULTI/04378/2013; NORTE-01-0145-FEDER-000024, supported by Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). D.S.M. thanks Fundação para a Ciência e Tecnologia for scholarship SFRH/BD/84922/2012.