Multi-task convolutional neural networks for predicting in vitro clearance endpoints from molecular images

Journal of Computer-Aided Molecular Design

Abstract

Optimizing the metabolic stability of compounds is a central concern in pharmaceutical research. Predictive in silico models can potentially reduce the number of design-make-test-analyze iterations and thereby speed up the progression of novel candidate molecules. Herein, we investigated whether multiple in vitro clearance endpoints can be accurately predicted from image-based molecular representations. To this end, compound measurements for four commonly investigated clearance endpoints were curated from AstraZeneca internal sources, providing a sound basis for building multi-task convolutional neural network models. Applying several increasingly challenging data splitting strategies confirmed that convolutional neural network models successfully captured implicit chemical relationships contained in training and test data, similar to what is commonly observed for structural fingerprints. Furthermore, benchmarking against state-of-the-art machine learning methods, including deep neural networks and graph convolutional neural networks trained with structure- and graph-based representations, respectively, showed that convolutional neural networks achieved on-par or higher accuracy, with a clear benefit of multi-task learning across all clearance endpoints. Our findings indicate that image-based molecular representations can be applied to predict multiple clearance endpoints and suggest a follow-up investigation of model interpretability from molecular images.
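To make the multi-task setting concrete, the following framework-free Python sketch shows one common way to train a single model on several endpoints when not every compound has been measured in every assay: the loss is averaged only over the (compound, endpoint) pairs that have a measurement. This is an illustration under stated assumptions, not the authors' implementation. The use of `None` for missing measurements, the Huber loss, the `delta` parameter, and the function names `huber` and `masked_multitask_loss` are all hypothetical choices made for this example.

```python
from typing import Optional, Sequence


def huber(residual: float, delta: float = 1.0) -> float:
    """Huber loss for one residual: quadratic near zero, linear beyond delta."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * magnitude * magnitude
    return delta * (magnitude - 0.5 * delta)


def masked_multitask_loss(
    predictions: Sequence[Sequence[float]],
    targets: Sequence[Sequence[Optional[float]]],
    delta: float = 1.0,
) -> float:
    """Mean Huber loss over all measured (compound, endpoint) pairs.

    Each row is one compound and each column one endpoint.
    A target of None marks a missing measurement and is skipped
    (assumption made for this sketch).
    """
    total = 0.0
    count = 0
    for pred_row, target_row in zip(predictions, targets):
        for pred, target in zip(pred_row, target_row):
            if target is None:
                continue  # no measurement for this endpoint
            total += huber(pred - target, delta)
            count += 1
    if count == 0:
        raise ValueError("batch contains no measured endpoints")
    return total / count


# Two compounds, four endpoints; values are made up for illustration.
predictions = [[1.0, 2.0, 0.5, 1.5], [0.8, 1.2, 2.0, 0.0]]
targets = [[1.5, None, 0.5, None], [None, 1.2, 5.0, None]]
loss = masked_multitask_loss(predictions, targets)
```

In this toy batch only four of the eight entries contribute to the loss. In a deep learning framework the same idea is usually expressed with a boolean mask applied to a per-element loss tensor, so that shared network layers receive gradients from whichever endpoints are available for each compound.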

Author information

Contributions

Conceptualization, A.M.M., V.S., and F.M.; methodology, A.M.M., V.S., and F.M.; formal analysis, A.M.M., V.S., and F.M.; data curation, A.M.M.; writing (original draft preparation), F.M.; writing (review and editing), A.M.M., V.S., and F.M.; supervision, V.S. and F.M. All authors reviewed the manuscript.

Corresponding author

Correspondence to Filip Miljković.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 93 kb)

Cite this article

Martínez Mora, A., Subramanian, V. & Miljković, F. Multi-task convolutional neural networks for predicting in vitro clearance endpoints from molecular images. J Comput Aided Mol Des 36, 443–457 (2022). https://doi.org/10.1007/s10822-022-00458-1

  • DOI: https://doi.org/10.1007/s10822-022-00458-1
