
Artificial Intelligence, Machine Learning, and Deep Learning in Real-Life Drug Design Cases


Part of the book series: Methods in Molecular Biology ((MIMB,volume 2390))

Abstract

The discovery and development of drugs is a long and expensive process with a high attrition rate. Computational drug discovery contributes to ligand discovery and optimization by using models that describe the properties of ligands and their interactions with biological targets. In recent years, artificial intelligence (AI) has made remarkable progress in modeling, driven by new algorithms and by increases in computing power and storage capacity, which allow large amounts of data to be processed in a short time. This review presents the current state of the art of AI methods applied to drug discovery, with a focus on structure- and ligand-based virtual screening, library design and high-throughput analysis, drug repurposing and drug sensitivity, de novo design, chemical reactions and synthetic accessibility, ADMET, and quantum mechanics.


Abbreviations

ADMET:

absorption, distribution, metabolism, excretion, toxicity

AI:

artificial intelligence

ANN:

artificial neural network

AUC:

area under the curve

AUROC:

area under the receiver operating characteristic curve

BA:

balanced accuracy

BKD:

binary kernel discrimination

CCLE:

Cancer Cell Line Encyclopedia

CCLP:

COSMIC Cell Lines Project

CNN:

convolutional neural network

CV:

cross-validation

DFT:

density functional theory

DILI:

drug-induced liver injury

DL:

deep learning

DMTA:

design–make–test–analyze

DNN:

deep neural network

DRD2:

dopamine receptor D2

DTNN:

deep tensor neural network

ECFP4:

extended connectivity fingerprint of diameter 4

FDA:

Food and Drug Administration

FNN:

feed-forward neural network

GCNN:

graph convolutional neural network

GDB-7:

generated database with up to 7 heavy atoms

GDSC:

Genomics of Drug Sensitivity in Cancer

GENTRL:

generative tensorial reinforcement learning

GPU:

graphics processing unit

GSE:

general solubility equation

hERG:

human Ether-à-go-go-Related Gene

HTS:

high-throughput screening

JAK:

Janus kinase

KNN:

k-nearest neighbor

LBVS:

ligand-based virtual screening

LINCS:

Library of Integrated Network-Based Cellular Signatures

LSTM:

long short-term memory

MCC:

Matthews correlation coefficient

MeSH:

medical subject headings

ML:

machine learning

MT:

multitask

MTDL:

multitask deep learning

MTNN:

multitask neural network

NCI-60:

National Cancer Institute 60 human cancer cell lines

PAINS:

pan-assay interference compounds

PPAR:

peroxisome proliferator-activated receptors

PPI:

protein–protein interaction

QM:

quantum mechanics

QSAR:

quantitative structure–activity relationship

QSPR:

quantitative structure–property relationship

RF:

random forest

RL:

reinforcement learning

RNN:

recurrent neural network

ROC:

receiver operating characteristic

RXR:

retinoid X receptors

SA:

synthetic accessibility

SBVS:

structure-based virtual screening

SEA:

similarity ensemble approach

SMILES:

simplified molecular input line entry system

SOM:

site of metabolism

SVM:

support vector machine

SVR:

support vector regression

VS:

virtual screening

References

  1. Vamathevan J, Clark D, Czodrowski P et al (2019) Applications of machine learning in drug discovery and development. Nat Rev Drug Discov 18:463–477

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  2. Lecun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444

    Article  CAS  PubMed  Google Scholar 

  3. Liu Z, Su M, Han L et al (2017) Forging the basis for developing protein-ligand interaction scoring functions. Acc Chem Res 50:302–309

    Article  CAS  PubMed  Google Scholar 

  4. Ain QU, Aleksandrova A, Roessler FD et al (2015) Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening. Wiley Interdiscip Rev Comput Mol Sci 5:405–424

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  5. Shen C, Ding J, Wang Z et al (2020) From machine learning to deep learning: advances in scoring functions for protein–ligand docking. WIREs Comput Mol Sci 10:e1429

    Article  CAS  Google Scholar 

  6. Ashtawy HM, Mahapatra NR (2015) A comparative assessment of predictive accuracies of conventional and machine learning scoring functions for protein-ligand binding affinity prediction. IEEE/ACM Trans Comput Biol Bioinforma 12:335–347

    Article  CAS  Google Scholar 

  7. Wang C, Zhang Y (2017) Improving scoring-docking-screening powers of protein–ligand scoring functions using random forest. J Comput Chem 38:169–177

    Article  PubMed  CAS  Google Scholar 

  8. Ragoza M, Hochuli J, Idrobo E et al (2017) Protein-ligand scoring with convolutional neural networks. J Chem Inf Model 57:942–957

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Pereira JC, Caffarena ER, Dos Santos CN (2016) Boosting docking-based virtual screening with deep learning. J Chem Inf Model 56:2495–2506

    Article  CAS  PubMed  Google Scholar 

  10. Gomes J, Ramsundar B, Feinberg EN, et al (2017) Atomic convolutional networks for predicting protein-ligand binding. arXiv e-prints 1703.10603

    Google Scholar 

  11. Chen L, Cruz A, Ramsey S et al (2019) Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening. PLoS One 14:e0220113

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Yang J, Shen C, Huang N (2020) Predicting or pretending: artificial intelligence for protein-ligand interactions lack of sufficiently large and unbiased datasets. Front Pharmacol 11:69

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Sieg J, Flachsenberg F, Rarey M (2019) In need of bias control: evaluating chemical data for machine learning in structure-based virtual screening. J Chem Inf Model 59:947–961

    Article  CAS  PubMed  Google Scholar 

  14. Scantlebury J, Brown N, Von Delft F et al (2020) Data set augmentation allows deep learning-based virtual screening to better generalize to unseen target classes and highlight important binding interactions. J Chem Inf Model 60:3722–3730

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Gentile F, Agrawal V, Hsing M et al (2020) Deep docking: a deep learning platform for augmentation of structure based drug discovery. ACS Cent Sci 6:939–949

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  16. Ton AT, Gentile F, Hsing M et al (2020) Rapid identification of potential inhibitors of SARS-CoV-2 Main protease by deep docking of 1.3 billion compounds. Mol Inform 39:e2000028

    Article  PubMed  CAS  Google Scholar 

  17. Dahl GE, Jaitly N, and Salakhutdinov R (2014) Multi-task Neural Networks for QSAR Predictions. arXiv 1406.1231

    Google Scholar 

  18. Rodríguez-Pérez R, Bajorath J (2019) Multitask machine learning for classifying highly and weakly potent kinase inhibitors. ACS Omega 4:4367–4375

    Article  CAS  Google Scholar 

  19. Keshavarzi Arshadi A, Salem M, Collins J et al (2020) DeepMalaria: artificial intelligence driven discovery of potent Antiplasmodials. Front Pharmacol 10:1526

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  20. Miljković F, Rodríguez-Pérez R, Bajorath J (2020) Machine learning models for accurate prediction of kinase inhibitors with different binding modes. J Med Chem 63:8738–8748

    Article  PubMed  CAS  Google Scholar 

  21. Aldrich C, Bertozzi C, Georg GI et al (2017) The ecstasy and agony of assay interference compounds. J Chem Inf Model 57:387–390

    Article  CAS  PubMed  Google Scholar 

  22. Yang Z-Y, He J-H, Lu A-P et al (2020) Frequent hitters: nuisance artifacts in high-throughput screening. Drug Discov Today 25:657–667

    Article  CAS  PubMed  Google Scholar 

  23. Stork C, Chen Y, Šícho M et al (2019) Hit Dexter 2.0: machine-learning models for the prediction of frequent hitters. J Chem Inf Model 59:1030–1043

    Article  CAS  PubMed  Google Scholar 

  24. Blaschke T, Miljković F, Bajorath J (2019) Prediction of different classes of promiscuous and nonpromiscuous compounds using machine learning and nearest neighbor analysis. ACS Omega 4:6883–6890

    Article  CAS  Google Scholar 

  25. Borrel A, Huang R, Sakamuru S et al (2020) High-throughput screening to predict chemical-assay interference. Sci Rep 10:3986

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  26. Borrel A, Mansouri K, Nolte S et al (2020) InterPred: a webtool to predict chemical autofluorescence and luminescence interference. Nucleic Acids Res 48:W586–W590

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Lipinski CA, Lombardo F, Dominy BW et al (2001) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings1PII of original article: S0169-409X(96)00423-1. The article was originally published in advanced drug delivery reviews 23 (1997) 3. Adv Drug Deliv Rev 46:3–26

    Article  CAS  PubMed  Google Scholar 

  28. Zhang X, Betzi S, Morelli X et al (2014) Focused chemical libraries--design and enrichment: an example of protein-protein interaction chemical space. Future Med Chem 6:1291–1307

    Article  CAS  PubMed  Google Scholar 

  29. Villoutreix BO, Labbe CM, Lagorce D et al (2012) A leap into the chemical space of protein-protein interaction inhibitors. Curr Pharm Des 18:4648–4667

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Bosc N, Muller C, Hoffer L et al (2020) Fr-PPIChem: an academic compound library dedicated to protein-protein interactions. ACS Chem Biol 15:1566–1574

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. Nidhi GM, Davies JW et al (2006) Prediction of biological targets for compounds using multiple-category bayesian models trained on chemogenomics databases. J Chem Inf Model 46:1124–1133

    Article  CAS  PubMed  Google Scholar 

  32. Zhang P, Wang F, Hu J (2014) Towards drug repositioning: a unified computational framework for integrating multiple aspects of drug similarity and disease similarity. AMIA Annu Symp Proc 2014:1258–1267

    PubMed  PubMed Central  Google Scholar 

  33. Napolitano F, Zhao Y, Moreira VM et al (2013) Drug repositioning: a machine-learning approach through data integration. J Cheminform 5:30

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Jarada TN, Rokne JG, Alhajj R (2020) A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J Cheminform 12:46

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Unterthiner T, Mayr A, Klambauer G et al (2014) Deep learning as an opportunity in virtual screening. In: Conference: Workshop on Deep Learning and Representation Learning (NIPS2014)

    Google Scholar 

  36. Allen BK, Ayad NG, and Schürer SC (2019) Kinome-wide activity classification of small molecules by deep learning. bioRxiv

    Google Scholar 

  37. Rifaioglu AS, Nalbat E, Atalay V et al (2020) DEEPScreen: high performance drug-target interaction prediction with convolutional neural networks using 2-D structural compound representations. Chem Sci 11:2531–2557

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Stokes JM, Yang K, Swanson K et al (2020) A deep learning approach to antibiotic discovery. Cell 180:688–702.e13

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  39. Hu S, Zhang C, Chen P et al (2019) Predicting drug-target interactions from drug structure and protein sequence using novel convolutional neural networks. BMC Bioinformatics 20:689

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Aliper A, Plis S, Artemov A et al (2016) Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol Pharm 13:2524–2530

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Meyer JG, Liu S, Miller IJ et al (2019) Learning drug functions from chemical structures with convolutional neural networks and random forests. J Chem Inf Model 59:4438–4449

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Yang W, Soares J, Greninger P et al (2013) Genomics of drug sensitivity in cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res 41:D955–D961

    Article  CAS  PubMed  Google Scholar 

  43. Tate JG, Bamford S, Jubb HC et al (2019) COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res 47:D941–D947

    Article  CAS  PubMed  Google Scholar 

  44. Barretina J, Caponigro G, Stransky N et al (2012) The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483:603–607

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Shoemaker RH (2006) The NCI60 human tumour cell line anticancer drug screen. Nat Rev Cancer 6:813–823

    Article  CAS  PubMed  Google Scholar 

  46. Garnett MJ, Edelman EJ, Heidorn SJ et al (2012) Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 483:570–575

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Iorio F, Knijnenburg TA, Vis DJ et al (2016) A landscape of Pharmacogenomic interactions in cancer. Cell 166:740–754

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Rahman R, Matlock K, Ghosh S et al (2017) Heterogeneity aware random forest for drug sensitivity prediction. Sci Rep 7:11347

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  49. Costello JC, Heiser LM, Georgii E et al (2014) A community effort to assess and improve drug sensitivity prediction algorithms. Nat Biotechnol 32:1202–1212

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Menden MP, Iorio F, Garnett M et al (2013) Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties. PLoS One 8:e61318

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Cortés-Ciriano I, Van Westen GJP, Bouvier G et al (2016) Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel. Bioinformatics 32:85–95

    PubMed  Google Scholar 

  52. Chang Y, Park H, Yang HJ et al (2018) Cancer drug response profile scan (CDRscan): a deep learning model that predicts drug effectiveness from cancer genomic signature. Sci Rep 8:8857

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  53. Liu P, Li H, Li S et al (2019) Improving prediction of phenotypic drug response on cancer cell lines using deep convolutional network. BMC Bioinformatics 20:408

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  54. Garcia-Alonso L, Iorio F, Matchan A et al (2018) Transcription factor activities enhance markers of drug sensitivity in cancer. Cancer Res 78:769–780

    Article  CAS  PubMed  Google Scholar 

  55. Besnard J, Ruda GF, Setola V et al (2012) Automated design of ligands to polypharmacological profiles. Nature 492:215–220

    Article  CAS  PubMed  Google Scholar 

  56. Hartenfeller M, Zettl H, Walter M et al (2012) Dogs: reaction-driven de novo design of bioactive compounds. PLoS Comput Biol 8:e1002380

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Zhavoronkov A, Ivanenkov YA, Aliper A et al (2019) Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat Biotechnol 37:1038–1040

    Article  CAS  PubMed  Google Scholar 

  58. Walters WP, Murcko M (2020) Assessing the impact of generative AI on medicinal chemistry. Nat Biotechnol 38:143–145

    Article  CAS  PubMed  Google Scholar 

  59. Elton DC, Boukouvalas Z, Fuge MD et al (2019) Deep learning for molecular design - a review of the state of the art. Mol Syst Des Eng 4:828–849

    Article  CAS  Google Scholar 

  60. Bian Y and Xie X-Q (2020) Generative chemistry: drug discovery with deep learning generative models arXiv 2008.09000

    Google Scholar 

  61. Gómez-Bombarelli R, Wei JN, Duvenaud D et al (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 4:268–276

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  62. Segler MHS, Kogej T, Tyrchan C et al (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent Sci 4:120–131

    Article  CAS  PubMed  Google Scholar 

  63. Merk D, Friedrich L, Grisoni F et al (2018) De novo design of bioactive small molecules by artificial intelligence. Mol Inform 37:1700153–1700154

    Article  PubMed Central  CAS  Google Scholar 

  64. Olivecrona M, Blaschke T, Engkvist O et al (2017) Molecular de-novo design through deep reinforcement learning. J Cheminform 9:48

    Article  PubMed  PubMed Central  Google Scholar 

  65. Blaschke T, Arús-Pous J, Chen H et al (2020) REINVENT 2.0: an AI tool for De novo drug design. J Chem Inf Model 60:5918–5922

    Article  CAS  PubMed  Google Scholar 

  66. Cao N de and Kipf T (2018) MolGAN: An implicit generative model for small molecular graphs. arXiv 1805.11973

    Google Scholar 

  67. Zhou Z, Kearnes S, Li L et al (2019) Optimization of molecules via deep reinforcement learning. Sci Rep 9:10752

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  68. Méndez-Lucio O, Baillif B, Clevert DA et al (2020) De novo generation of hit-like molecules from gene expression signatures using artificial intelligence. Nat Commun 11:10

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  69. Benhenda M (2017) ChemGAN challenge for drug discovery: can AI reproduce natural chemical diversity? arXiv 1708.08227

    Google Scholar 

  70. Brown N, Fiscato M, Segler MHS et al (2019) GuacaMol: benchmarking models for de novo molecular design. J Chem Inf Model 59:1096–1108

    Article  CAS  PubMed  Google Scholar 

  71. Gottipati SK, Sattarov B, Niu S, et al (2020) Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning. arXiv 2004.12485

    Google Scholar 

  72. Corey EJ, Wipke WT (1969) Computer-assisted design of complex organic syntheses. Science 166:178–192

    Article  CAS  PubMed  Google Scholar 

  73. Ertl P, Schuffenhauer A (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminform 1:8

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  74. Fukunishi Y, Kurosawa T, Mikami Y et al (2014) Prediction of synthetic accessibility based on commercially available compound databases. J Chem Inf Model 54:3259–3267

    Article  CAS  PubMed  Google Scholar 

  75. Sheridan RP, Zorn N, Sherer EC et al (2014) Modeling a crowdsourced definition of molecular complexity. J Chem Inf Model 54:1604–1616

    Article  CAS  PubMed  Google Scholar 

  76. Coley CW, Rogers L, Green WH et al (2018) SCScore: synthetic complexity learned from a reaction corpus. J Chem Inf Model 58:252–261

    Article  CAS  PubMed  Google Scholar 

  77. Segler MHS, Waller MP (2017) Neural-symbolic machine learning for retrosynthesis and reaction prediction. Chemistry 23:5966–5971

    Article  CAS  PubMed  Google Scholar 

  78. Segler MHS, Preuss M, Waller MP (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555:604–610

    Article  CAS  PubMed  Google Scholar 

  79. Fooshee D, Mood A, Gutman E et al (2018) Deep learning for chemical reaction prediction. Mol Syst Des Eng 3:442–452

    Article  CAS  Google Scholar 

  80. Schwaller P, Laino T, Gaudin T et al (2019) Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent Sci 5:1572–1583

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Segler MHS, Waller MP (2017) Modelling chemical reasoning to predict and invent reactions. Chemistry 23:6118–6128

    Article  CAS  PubMed  Google Scholar 

  82. Ahneman DT, Estrada JG, Lin S et al (2018) Predicting reaction performance in C–N cross-coupling using machine learning. Science 360:186 LP–190 LP

    Article  CAS  Google Scholar 

  83. Sandfort F, Strieth-Kalthoff F, Kühnemund M et al (2020) A structure-based platform for predicting chemical reactivity. Chem 6:1379–1390

    Article  CAS  Google Scholar 

  84. Reker D, Bernardes G, and Rodrigues T (2018) Evolving and Nano data enabled machine intelligence for chemical reaction optimization. ChemRxiv

    Google Scholar 

  85. Gao H, Struble TJ, Coley CW et al (2018) Using machine learning to predict suitable conditions for organic reactions. ACS Cent Sci 4:1465–1476

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  86. Zhou Z, Li X, Zare RN (2017) Optimizing chemical reactions with deep reinforcement learning. ACS Cent Sci 3:1337–1344

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  87. Gao W, Coley CW (2020) The synthesizability of molecules proposed by generative models. J Chem Inf Model 60:5714–5723

    Article  CAS  PubMed  Google Scholar 

  88. Korovina K, Xu S, Kandasamy K, et al (2019) ChemBO: Bayesian Optimization of Small Organic Molecules with Synthesizable Recommendations arXiv 1908.01425

    Google Scholar 

  89. Zubatyuk R, Smith J, Nebgen B, et al (2020) Teaching a neural network to attach and detach electrons from molecules. ChemRxiv

    Google Scholar 

  90. Genheden S, Thakkar A, Chadimova V, et al (2020) AiZynthFinder: A Fast Robust and Flexible Open-Source Software for Retrosynthetic Planning. ChemRxiv

    Google Scholar 

  91. Thakkar A, Selmi N, Reymond J-L et al (2020) “Ring breaker”: neural network driven synthesis prediction of the ring system chemical space. J Med Chem 63:8791–8808

    Article  CAS  PubMed  Google Scholar 

  92. Gale EM, Durand DJ (2020) Improving reaction prediction. Nat Chem 12:509–510

    Article  CAS  PubMed  Google Scholar 

  93. Irmann F (1965) Eine einfache Korrelation zwischen Wasserlöslichkeit und Struktur von Kohlenwasserstoffen und Halogenkohlenwasserstoffen. Chemie Ing Tech 37:789–798

    Article  CAS  Google Scholar 

  94. Hansch C, Quinlan JE, Lawrence GL (1968) Linear free-energy relationship between partition coefficients and the aqueous solubility of organic liquids. J Org Chem 33:347–350

    Article  CAS  Google Scholar 

  95. Ran Y, Yalkowsky SH (2001) Prediction of drug solubility by the general solubility equation (GSE). J Chem Inf Comput Sci 41:354–357

    Article  CAS  PubMed  Google Scholar 

  96. Llinàs A, Glen RC, Goodman JM (2008) Solubility challenge: can you predict Solubilities of 32 molecules using a database of 100 reliable measurements? J Chem Inf Model 48:1289–1303

    Article  PubMed  CAS  Google Scholar 

  97. Llinas A, Avdeef A (2019) Solubility challenge revisited after ten years, with multilab shake-flask data, using tight (SD ∼ 0.17 log) and loose (SD ∼ 0.62 log) test sets. J Chem Inf Model 59:3036–3040

    Article  CAS  PubMed  Google Scholar 

  98. Korotcov A, Tkachenko V, Russo DP et al (2017) Comparison of deep learning with multiple machine learning methods and metrics using diverse drug discovery data sets. Mol Pharm 14:4462–4475

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  99. Wu K, Zhao Z, Wang R et al (2018) TopP–S: persistent homology-based multi-task deep neural networks for simultaneous predictions of partition coefficient and aqueous solubility. J Comput Chem 39:1444–1454

    Article  CAS  PubMed  Google Scholar 

  100. Korolev V, Mitrofanov A, Korotcov A et al (2020) Graph convolutional neural networks as “general-purpose” property predictors: the universality and limits of applicability. J Chem Inf Model 60:22–28

    Article  CAS  PubMed  Google Scholar 

  101. Cui Q, Lu S, Ni B et al (2020) Improved prediction of aqueous solubility of novel compounds by going deeper with deep learning. Front Oncol 10:121

    Article  PubMed  PubMed Central  Google Scholar 

  102. Montanari F, Kuhnke L, Ter Laak A et al (2020) Modeling Physico-chemical ADMET endpoints with multitask graph convolutional networks. Molecules 25:44

    Article  CAS  Google Scholar 

  103. Avdeef A (2020) Prediction of aqueous intrinsic solubility of druglike molecules using random Forest regression trained with wiki-pS0 database. ADMET DMPK 8:29

    Article  PubMed  PubMed Central  Google Scholar 

  104. Khurana S, Rawi R, Kunji K et al (2018) DeepSol: a deep learning framework for sequence-based protein solubility prediction. Bioinformatics 34:2605–2613

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  105. Rawi R, Mall R, Kunji K et al (2018) PaRSnIP: sequence-based protein solubility prediction using gradient boosting machine. Bioinformatics 34:1092–1098

    Article  CAS  PubMed  Google Scholar 

  106. Li X, Fourches D (2020) Inductive transfer learning for molecular activity prediction: next-gen QSAR models with MolPMoFiT. J Cheminform 12:27

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  107. Fuchs J-A, Grisoni F, Kossenjans M et al (2018) Lipophilicity prediction of peptides and peptide derivatives by consensus machine learning. Med Chem Commun 9:1538–1546

    Article  CAS  Google Scholar 

  108. Wenzel J, Matter H, Schmidt F (2019) Predictive multitask deep neural network models for ADME-Tox properties: learning from large data sets. J Chem Inf Model 59:1253–1268

    Article  CAS  PubMed  Google Scholar 

  109. Hunt PA, Segall MD, Tyzack JD (2018) WhichP450: a multi-class categorical model to predict the major metabolising CYP450 isoform for a compound. J Comput Aided Mol Des 32:537–546

    Article  CAS  PubMed  Google Scholar 

  110. Xiong Y, Qiao Y, Kihara D et al (2019) Survey of machine learning techniques for prediction of the isoform specificity of cytochrome P450 substrates. Curr Drug Metab 20:229–235

    Article  CAS  PubMed  Google Scholar 

  111. Rydberg P, Gloriam DE, Olsen L (2010) The SMARTCyp cytochrome P450 metabolism prediction server. Bioinformatics 26:2988–2989

    Article  CAS  PubMed  Google Scholar 

  112. Rudik A, Bezhentsev V, Dmitriev A et al (2018) Metatox - web application for generation of metabolic pathways and toxicity estimation. J Bioinforma Comput Biol 17:1940001

    Article  CAS  Google Scholar 

  113. Madzhidov TI, Khakimova AA, Nugmanov RI et al (2018) Prediction of aromatic hydroxylation sites for human CYP1A2 substrates using condensed graph of reactions. Bionanoscience 8:384–389

    Article  Google Scholar 

  114. Matlock MK, Hughes TB, Swamidass SJ (2015) XenoSite server: a web-available site of metabolism prediction tool. Bioinformatics 31:1136–1137

    Article  CAS  PubMed  Google Scholar 

  115. Rudik AV, Dmitriev AV, Lagunin AA et al (2014) Metabolism site prediction based on xenobiotic structural formulas and PASS prediction algorithm. J Chem Inf Model 54:498–507

    Article  CAS  PubMed  Google Scholar 

  116. Finkelmann AR, Goldmann D, Schneider G et al (2018) MetScore: site of metabolism prediction beyond cytochrome P450 enzymes. ChemMedChem 13:2281–2289

    Article  CAS  PubMed  Google Scholar 

  117. Šícho M, Stork C, Mazzolari A et al (2019) FAME 3: predicting the sites of metabolism in synthetic compounds and natural products for phase 1 and phase 2 metabolic enzymes. J Chem Inf Model 59:3400–3412

    Article  PubMed  CAS  Google Scholar 

  118. Flynn NR, Le Dang N, Ward MD et al (2020) XenoNet: inference and likelihood of intermediate metabolite formation. J Chem Inf Model 60:3431–3449

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  119. Djoumbou-Feunang Y, Fiamoncini J, Gil-de-la-Fuente A et al (2019) BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification. J Cheminform 11:2

    Article  PubMed  PubMed Central  Google Scholar 

  120. Marchant CA, Briggs KA, Long A (2008) In silico tools for sharing data and knowledge on toxicity and metabolism: derek for windows, meteor, and vitic. Toxicol Mech Methods 18:177–187

    Article  CAS  PubMed  Google Scholar 

  121. de Bruyn Kops C, Stork C, Šícho M et al (2019) GLORY: generator of the structures of likely cytochrome P450 metabolites based on predicted sites of metabolism. Front Chem 7:402

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  122. Šícho M, de Bruyn Kops C, Stork C et al (2017) FAME 2: simple and effective machine learning model of cytochrome P450 Regioselectivity. J Chem Inf Model 57:1832–1846

    Article  PubMed  CAS  Google Scholar 

  123. Hartung T (2019) Predicting toxicity of chemicals: software beats animal testing. EFSA J 17:e170710

    Article  PubMed  PubMed Central  Google Scholar 

  124. Lee H-M, Yu M-S, Kazmi SR et al (2019) Computational determination of hERG-related cardiotoxicity of drug candidates. BMC Bioinformatics 20:250

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  125. Ogura K, Sato T, Yuki H et al (2019) Support vector machine model for hERG inhibitory activities based on the integrated hERG database using descriptor selection by NSGA-II. Sci Rep 9:12220

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  126. Zhang Y, Zhao J, Wang Y et al (2019) Prediction of hERG K+ channel blockage using deep neural networks. Chem Biol Drug Des 94:1973–1985

    Article  CAS  PubMed  Google Scholar 

  127. Fourches D, Barnes JC, Day NC et al (2010) Cheminformatics analysis of assertions mined from literature that describe drug-induced liver injury in different species. Chem Res Toxicol 23:171–183

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  128. Kim E, Nam H (2017) Prediction models for drug-induced hepatotoxicity by using weighted molecular fingerprints. BMC Bioinformatics 18:227

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  129. Low Y, Uehara T, Minowa Y et al (2011) Predicting drug-induced hepatotoxicity using QSAR and toxicogenomics approaches. Chem Res Toxicol 24:1251–1262

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  130. Muller C, Pekthong D, Alexandre E et al (2015) Prediction of drug induced liver injury using molecular and biological descriptors. Comb Chem High Throughput Screen 18:315–322

    Article  CAS  PubMed  Google Scholar 

  131. Wang H, Liu R, Schyman P et al (2019) Deep neural network models for predicting chemically induced liver toxicity endpoints from transcriptomic responses. Front Pharmacol 10:42

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  132. Nguyen-Vo T-H, Nguyen L, Do N et al (2020) Predicting drug-induced liver injury using convolutional neural network and molecular fingerprint-embedded features. ACS Omega 5:25432–25439

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  133. Lei T, Li Y, Song Y et al (2016) ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling. J Cheminform 8:6

  134. Fan T, Sun G, Zhao L et al (2018) QSAR and classification study on prediction of acute oral toxicity of N-nitroso compounds. Int J Mol Sci 19:3015

  135. García-Jacas CR, Marrero-Ponce Y, Cortés-Guzmán F et al (2019) Enhancing acute oral toxicity predictions by using consensus modeling and algebraic form-based 0D-to-2D molecular encodes. Chem Res Toxicol 32:1178–1192

  136. Lunghini F, Marcou G, Azam P et al (2019) Consensus models to predict oral rat acute toxicity and validation on a dataset coming from the industrial context. SAR QSAR Environ Res 30:879–897

  137. Wu K, Wei G-W (2018) Quantitative toxicity prediction using topology based multitask deep neural networks. J Chem Inf Model 58:520–531

  138. Xu Y, Pei J, Lai L (2017) Deep learning based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction. J Chem Inf Model 57:2672–2685

  139. Sosnin S, Karlov D, Tetko IV et al (2019) Comparative study of multitask toxicity modeling on a broad chemical space. J Chem Inf Model 59:1062–1072

  140. Carnesecchi E, Raitano G, Gamba A et al (2020) Evaluation of non-commercial models for genotoxicity and carcinogenicity in the assessment of EFSA’s databases. SAR QSAR Environ Res 31:33–48

  141. Honma M, Kitazawa A, Cayley A et al (2019) Improvement of quantitative structure-activity relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR international challenge project. Mutagenesis 34:3–16

  142. Verheyen GR, Braeken E, Van Deun K et al (2017) Evaluation of existing (Q)SAR models for skin and eye irritation and corrosion to use for REACH registration. Toxicol Lett 265:47–52

  143. Piir G, Sild S, Maran U (2021) Binary and multi-class classification for androgen receptor agonists, antagonists and binders. Chemosphere 262:128313

  144. Mazzolari A, Vistoli G, Testa B et al (2018) Prediction of the formation of reactive metabolites by a novel classifier approach based on enrichment factor optimization (EFO) as implemented in the VEGA program. Molecules 23:2955

  145. Yuan Q, Wei Z, Guan X et al (2019) Toxicity prediction method based on multi-channel convolutional neural network. Molecules 24:3383

  146. Watanabe R, Ohashi R, Esaki T et al (2019) Development of an in silico prediction system of human renal excretion and clearance from chemical structure information incorporating fraction unbound in plasma as a descriptor. Sci Rep 9:18782

  147. Sun L, Yang H, Li J et al (2018) In silico prediction of compounds binding to human plasma proteins by QSAR models. ChemMedChem 13:572–581

  148. Esposito C, Wang S, Lange UEW et al (2020) Combining machine learning and molecular dynamics to predict P-glycoprotein substrates. J Chem Inf Model 60:4730–4749

  149. Shin M, Jang D, Nam H et al (2018) Predicting the absorption potential of chemical compounds through a deep learning approach. IEEE/ACM Trans Comput Biol Bioinforma 15:432–440

  150. Guan L, Yang H, Cai Y et al (2019) ADMET-score – a comprehensive scoring function for evaluation of chemical drug-likeness. Med Chem Commun 10:148–157

  151. Kar S, Leszczynski J (2020) Open access in silico tools to predict the ADMET profiling of drug candidates. Expert Opin Drug Discov 15:1473–1487

  152. Feinberg EN, Joshi E, Pande VS et al (2020) Improvement in ADMET prediction with multitask deep featurization. J Med Chem 63:8835–8848

  153. Zhou Y, Cahya S, Combs SA et al (2019) Exploring tunable hyperparameters for deep neural networks with industrial ADME data sets. J Chem Inf Model 59:1005–1016

  154. Schütt KT, Arbabzadah F, Chmiela S et al (2017) Quantum-chemical insights from deep tensor neural networks. Nat Commun 8:13890

  155. Blum LC, Reymond J-L (2009) 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J Am Chem Soc 131:8732–8733

  156. Reymond J-L (2015) The chemical space project. Acc Chem Res 48:722–730

  157. Ramakrishnan R, Dral PO, Rupp M et al (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 1:140022

  158. Smith JS, Isayev O, Roitberg AE (2017) ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem Sci 8:3192–3203

  159. Fink T, Reymond J-L (2007) Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J Chem Inf Model 47:342–353

  160. Gebauer NWA, Gastegger M, Schütt KT (2018) Generating equilibrium molecules with deep neural networks. arXiv:1810.11347

  161. Schütt KT, Sauceda HE, Kindermans P-J et al (2018) SchNet - a deep learning architecture for molecules and materials. J Chem Phys 148:241722

  162. Bleiziffer P, Schaller K, Riniker S (2018) Machine learning of partial charges derived from high-quality quantum-mechanical calculations. J Chem Inf Model 58:579–590

  163. Irwin JJ, Shoichet BK (2005) ZINC—a free database of commercially available compounds for virtual screening. J Chem Inf Model 45:177–182

  164. Gaulton A, Bellis LJ, Bento AP et al (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100–D1107

  165. Callaway E (2020) ‘It will change everything’: DeepMind’s AI makes gigantic leap in solving protein structures. https://www.nature.com/articles/d41586-020-03348-4

Acknowledgments

The authors thank Laurianne David and Martin Kotev (Evotec (France) SAS, Toulouse, France) for their contribution to the manuscript.

Author information

Corresponding author

Correspondence to Constantino Diaz Gonzalez.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Muller, C., Rabal, O., Diaz Gonzalez, C. (2022). Artificial Intelligence, Machine Learning, and Deep Learning in Real-Life Drug Design Cases. In: Heifetz, A. (eds) Artificial Intelligence in Drug Design. Methods in Molecular Biology, vol 2390. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1787-8_16

  • DOI: https://doi.org/10.1007/978-1-0716-1787-8_16

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1786-1

  • Online ISBN: 978-1-0716-1787-8

  • eBook Packages: Springer Protocols
