Generative AI Models for Drug Discovery

Tang, Bowen; Ewalt, John; Ng, Ho-Leung

doi:10.1007/7355_2021_124

Part of the book series: Topics in Medicinal Chemistry ((TMC,volume 37))

Abstract

A drug-like molecule library can contain 10²³–10⁶⁰ molecules, of which only approximately 10¹² may be synthesized in labs. Even so, it remains challenging for researchers to find the most promising candidates among the vast number of synthesizable compounds in a reasonable time. Moreover, although molecules are picked for their predicted bioactivities, their absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties are often difficult to predict and modify, which is frequently a bottleneck for downstream studies and applications. It would be more productive to generate candidate molecules, rather than screen them from libraries, with suitable ADMET properties as prerequisites from the beginning of the molecule design process. Recently, artificial intelligence (AI)-based generative models have been described for designing drug candidates using prior biological and chemical knowledge. A spectacular example was the use of a combination of AI generative techniques and reinforcement learning (RL) by the biotechnology company Insilico Medicine to create new DDR1 kinase inhibitors to treat fibrosis in only 21 days. In this chapter, we review simple and advanced AI generative models and discuss the advantages and disadvantages of each. We also describe how RL algorithms can be applied to generative AI for better real-world effectiveness while making better use of modern distributed hardware.
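
To make the idea of reward-guided generation concrete, the following is a minimal toy sketch and not the chapter's method. It assumes a hypothetical vocabulary of four fragments with made-up "ADMET-like" scores, and it uses the plain REINFORCE policy-gradient update as a stand-in for the more advanced RL algorithms the chapter reviews. The names `FRAGMENTS`, `SCORES`, `SEQ_LEN`, `reward`, and `train` are all illustrative assumptions.

```python
import math
import random

# Hypothetical fragment vocabulary and made-up per-fragment "ADMET-like" scores.
# These are illustrative assumptions; real scores would come from a property predictor.
FRAGMENTS = ["frag_A", "frag_B", "frag_C", "frag_D"]
SCORES = [0.1, 0.3, 0.5, 0.9]
SEQ_LEN = 3  # number of fragments in each toy "molecule"


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(tokens):
    """Toy reward: mean score of the sampled fragments."""
    return sum(SCORES[t] for t in tokens) / len(tokens)


def train(steps=2000, lr=0.1, seed=0):
    """Run REINFORCE on a position-independent categorical policy."""
    rng = random.Random(seed)
    logits = [0.0] * len(FRAGMENTS)  # start from a uniform policy
    baseline = 0.0  # running mean reward, used to reduce variance
    for step in range(1, steps + 1):
        probs = softmax(logits)
        # Generate one toy molecule by sampling fragments from the policy.
        tokens = rng.choices(range(len(FRAGMENTS)), weights=probs, k=SEQ_LEN)
        r = reward(tokens)
        advantage = r - baseline
        # The gradient of log pi(token) w.r.t. the logits is onehot(token) - probs.
        for t in tokens:
            for j in range(len(logits)):
                grad = (1.0 if j == t else 0.0) - probs[j]
                logits[j] += lr * advantage * grad
        baseline += (r - baseline) / step
    return softmax(logits)
```

Calling `train()` returns the final fragment probabilities. Under these assumptions, the policy should shift probability mass toward the highest-scoring fragment, which is the sense in which candidates are generated with desired properties built in rather than screened afterwards.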

Abbreviations

3D-CNN:: 3D convolutional neural networks
AAE:: Adversarial auto-encoders
ADME:: Absorption, distribution, metabolism, and excretion
ADQN-FBDD:: Advanced deep Q-learning neural network with fragment-based drug discovery
AI:: Artificial intelligence
AIDD:: Artificial intelligence drug discovery
APEX-FBDD:: The distributed version of ADQN-FBDD
CASP:: Critical assessment of protein structure prediction
COVID19:: Coronavirus disease 2019
CVAE:: Conditional variational auto-encoder
DAI:: Distributed artificial intelligence
DQN:: Deep Q-learning neural network
FCN:: Fully connected network
GAN:: Generative adversarial networks
GCN:: Graph convolutional network
GENTRL:: Generative tensorial reinforcement learning
GRU:: Gated recurrent units
KL Divergence:: Kullback-Leibler divergence
LSTM:: Long short-term memory
MDP:: Markov decision process
QED:: Quantitative estimation of drug-likeness
R&D:: Research and development
RL:: Reinforcement learning
RNN:: Recurrent neural network
SAMPN:: Self-attention message passing graph neural network
SOM:: Self-organizing map
USD:: United States dollar
VAE:: Variational auto-encoder

Author information

Authors and Affiliations

Department of Biochemistry and Molecular Biophysics, Kansas State University, Manhattan, KS, USA
Bowen Tang, John Ewalt & Ho-Leung Ng

Corresponding author

Correspondence to Ho-Leung Ng.

Editor information

Editors and Affiliations

GIPER, Kashipur, India
Anil Kumar Saxena

Ethics declarations

Disclosure of Potential Conflicts of Interest: There are no conflicts of interest for this chapter.

Funding: There is no funding related to this chapter.

Ethical Approval: No animals or humans were used in this study.

Informed Consent: No patients were used in this study.

About this chapter

Cite this chapter

Tang, B., Ewalt, J., Ng, HL. (2021). Generative AI Models for Drug Discovery. In: Saxena, A.K. (eds) Biophysical and Computational Tools in Drug Discovery. Topics in Medicinal Chemistry, vol 37. Springer, Cham. https://doi.org/10.1007/7355_2021_124

DOI: https://doi.org/10.1007/7355_2021_124
Published: 16 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85280-1
Online ISBN: 978-3-030-85281-8
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)
