Abstract
Optimization of compound metabolic stability is a highly topical issue in pharmaceutical research. Predictive in silico models can potentially reduce the number of design-make-test-analyze iterations and thereby speed up the progression of novel candidate molecules. Here, we investigated whether multiple in vitro clearance endpoints can be accurately predicted from image-based molecular representations. Compound measurements for four commonly investigated clearance endpoints were curated from AstraZeneca internal sources, providing a sound basis for building multi-task convolutional neural network models. Applying several increasingly challenging data splitting strategies confirmed that convolutional neural network models successfully captured the implicit chemical relationships between training and test data, similar to what is commonly observed for structural fingerprints. Furthermore, benchmarking against state-of-the-art machine learning methods, including deep neural networks trained with structure-based representations and graph convolutional neural networks trained with graph-based representations, showed on-par or higher accuracy for convolutional neural networks, with a clear benefit of multi-task learning across all clearance endpoints. Our findings indicate that image-based molecular representations can be used to predict multiple clearance endpoints, and they suggest model interpretability from molecular images as a potential follow-up investigation.
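In multi-task settings with several assay endpoints, a compound is often measured in only some of the assays, so the training loss is typically computed over the available measurements only. The following is a minimal sketch of such a masked loss, assuming a mean squared error and a `None`/NaN marker for missing values; the abstract does not state which loss function or missing-label handling was used, and the function name and example numbers are hypothetical.

```python
import math


def masked_multitask_mse(predictions, targets):
    """Mean squared error over measured endpoint values only.

    predictions, targets: lists of rows, one row per compound and one
    column per endpoint. A target of None or NaN marks an endpoint that
    was not measured for that compound (an assumption of this sketch).
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(predictions, targets):
        for pred, target in zip(pred_row, target_row):
            if target is None or math.isnan(target):
                continue  # skip endpoints without a measurement
            total += (pred - target) ** 2
            count += 1
    # Return 0.0 when no endpoint was measured, to avoid dividing by zero.
    return total / count if count else 0.0


# Hypothetical predictions for two compounds and four clearance endpoints.
predictions = [[1.0, 2.0, 0.5, 1.5],
               [0.0, 1.0, 1.0, 2.0]]
# Each compound has measurements for only two of the four endpoints.
targets = [[1.5, None, 0.5, None],
           [None, 2.0, None, 2.0]]

loss = masked_multitask_mse(predictions, targets)
```

Because missing entries are skipped rather than imputed, every compound contributes to the shared model through whichever endpoints it has, which is one common way multi-task learning benefits from sparse assay data.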
Author information
Contributions
Conceptualization, A.M.M., V.S., and F.M.; methodology, A.M.M., V.S., and F.M.; formal analysis, A.M.M., V.S., and F.M.; data curation, A.M.M.; writing (original draft preparation), F.M.; writing (review and editing), A.M.M., V.S., and F.M.; supervision, V.S. and F.M. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Martínez Mora, A., Subramanian, V. & Miljković, F. Multi-task convolutional neural networks for predicting in vitro clearance endpoints from molecular images. J Comput Aided Mol Des 36, 443–457 (2022). https://doi.org/10.1007/s10822-022-00458-1
DOI: https://doi.org/10.1007/s10822-022-00458-1