A deep learning approach for the blind logP prediction in SAMPL6 challenge

Abstract

The water-octanol partition coefficient is a measure of the lipophilicity of a molecule and is important in drug discovery. We have developed a novel method for computational prediction of the logarithm of the partition coefficient (logP) using molecular fingerprints and a deep neural network. The machine learning model was trained on a dataset of 12,000 molecules and tested on 2,000 molecules. In this article, we present our results for the blind prediction of logP in the SAMPL6 challenge. While the best submission achieved an RMSE of 0.41 logP units, our submission had an RMSE of 0.61 logP units. Overall, we ranked in the top quarter of the 92 submissions. Our results show that the deep learning model can be used as a fast, accurate and robust method for high-throughput prediction of logP for small molecules.
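The root-mean-square error (RMSE) used to compare submissions can be illustrated with a minimal sketch. The function name `rmse` and the numeric values below are hypothetical and chosen only for illustration; they are not data from the challenge or from the authors' model.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental logP values."""
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    # Mean of the squared differences, then the square root.
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical logP values, for illustration only.
predicted_logp = [2.1, 3.4, 1.0]
experimental_logp = [2.0, 3.0, 1.5]
print(f"RMSE = {rmse(predicted_logp, experimental_logp):.3f} logP units")
```

Because the error is squared before averaging, a single badly predicted molecule raises the RMSE more than several small deviations would, which is worth keeping in mind when comparing values such as 0.41 and 0.61 logP units.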

Acknowledgements

Samarjeet would like to thank the Biochemistry, Cellular and Molecular Biology (BCMB) Program at JHU-SOM for supporting his graduate training. We would like to thank the LoBos and Biowulf teams at NIH for providing the high-performance computing support used to carry out this work. This study was supported by the Intramural Research Program of the National Heart, Lung and Blood Institute.

Author information

Corresponding author

Correspondence to Samarjeet Prasad.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Electronic supplementary material 1 (DOCX 127 kb)

About this article

Cite this article

Prasad, S., Brooks, B.R. A deep learning approach for the blind logP prediction in SAMPL6 challenge. J Comput Aided Mol Des 34, 535–542 (2020). https://doi.org/10.1007/s10822-020-00292-3

  • DOI: https://doi.org/10.1007/s10822-020-00292-3

Keywords

  • SAMPL6
  • LogP
  • Deep learning
  • Fingerprinting