Applications of artificial intelligence to drug design and discovery in the big data era: a comprehensive review

  • Comprehensive review
  • Published:
Molecular Diversity

Abstract

Artificial intelligence (AI) now underpins cutting-edge applications across many sectors of society. Owing to substantial progress in high-performance computing, the development of better algorithms, and the accumulation of large volumes of biological and chemical data, computer-assisted drug design is playing a key role in drug discovery, offering high efficiency, speed, and low cost. In recent years, continuous progress in machine learning (ML) algorithms has led to AI being employed extensively at various stages of drug discovery. Drug design and discovery have now entered the big data era. ML algorithms have progressively evolved into deep learning techniques with strong generalization capability and more effective handling of big data, which further promotes the integration of AI with computer-assisted drug discovery and thereby accelerates the design and discovery of new drugs. This review summarizes the progress in applying AI to the drug discovery process and compares its advantages with those of conventional methods. The challenges and limitations of AI in drug design and discovery are also discussed.

Graphic abstract

Fig. 1

Reproduced with permission from ref 19. Copyright 2020 American Chemical Society

Fig. 2

Reproduced with permission from ref 58. Copyright 2020 Nature

Fig. 3

Reproduced with permission from ref 72. Copyright 2018 Oxford University Press

Fig. 4

Reproduced with permission from ref 73. Copyright 2020 MDPI

Fig. 5

Reproduced with permission from ref 84. Copyright 2020 Springer Nature

Fig. 6

Reproduced with permission from ref 87. Copyright 2016 Springer

Fig. 7

Reproduced from ref 98. Copyright 2017 American Chemical Society

Fig. 8

Reproduced from ref 49. Copyright 2016

Fig. 9

Reproduced with permission from ref 119. Copyright 2017 American Chemical Society

Fig. 10

Reproduced with permission from ref 121. Copyright 2019 American Chemical Society

Fig. 11

Reproduced with permission from ref 130. Copyright 2020 Elsevier

Fig. 12

Reproduced with permission from ref 137. Copyright 2019 American Chemical Society

References

  1. Ashburn TT, Thor KB (2004) Drug repositioning: identifying and developing new uses for existing drugs. Nat Rev Drug Discov 3:673–683. https://doi.org/10.1038/nrd1468

  2. DiMasi JA, Grabowski HG, Hansen R (2016) Innovation in the pharmaceutical industry: new estimates of R&D costs. J Health Econ 47:20–33. https://doi.org/10.1016/j.jhealeco.2016.01.012

  3. Silver D, Hubert T, Schrittwieser J, Antonoglou I, Lai M, Guez A, Lanctot M, Sifre L, Kumaran D, Graepel T, Lillicrap T, Simonyan K, Hassabis D (2018) A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362:1140–1144. https://doi.org/10.1126/science.aar6404

  4. Ma C, Wang L, Xie XQ (2011) GPU accelerated chemical similarity calculation for compound library comparison. J Chem Inf Model 51(7):1521–1527. https://doi.org/10.1021/ci1004948

  5. Smalley E (2017) AI-powered drug discovery captures pharma interest. Nat Biotechnol 35:604–605. https://doi.org/10.1038/nbt0717-604

  6. Mak KK, Pichika MR (2019) Artificial intelligence in drug development: present status and future prospects. Drug Discov Today 24(3):773–780. https://doi.org/10.1016/j.drudis.2018.11.014

  7. Domingos P, Pazzani M (1997) On the optimality of the simple Bayesian classifier under zero-one loss. Mach Learn 29:103–130. https://doi.org/10.1023/A:1007413511361

  8. Cox DR (1958) The regression analysis of binary sequences. J R Stat Soc B 20:215–242. https://www.jstor.org/stable/2983890

  9. Hou TJ, Wang JM, Li YY (2007) ADME evaluation in drug discovery. 8. The prediction of human intestinal absorption by a support vector machine. J Chem Inf Model 47:2408–2415. https://doi.org/10.1021/ci7002076

  10. Svetnik V, Liaw A, Tong C (2003) Random forest: a classification and regression tool for compound classification and QSAR modeling. J Chem Inf Comput Sci 43:1947–1958. https://doi.org/10.1021/ci034160g

  11. Rayhan F, Ahmed S, Shatabda S, Farid DM, Mousavian Z, Dehzangi A, Rahman MS (2017) iDTI-ESBoost: identification of drug target interaction using evolutionary and structural features with boosting. Sci Rep 7:17731. https://doi.org/10.1038/s41598-017-12580-2

  12. Cao DS, Xu QS, Liang YZ, Chen XA, Li HD (2010) Automatic feature subset selection for decision tree-based ensemble methods in the prediction of bioactivity. Chemometr Intell Lab 103(2):129–136. https://doi.org/10.1016/j.chemolab.2010.06.008

  13. Lavecchia A, Di Giovanni C (2013) Virtual screening strategies in drug discovery: a critical review. Curr Med Chem 20:2839–2860. https://doi.org/10.2174/09298673113209990001

  14. Vanhaelen Q, Mamoshina P, Aliper AM, Artemov A, Lezhnina K, Ozerov I, Labat I, Zhavoronkov A (2017) Design of efficient computational workflows for in silico drug repurposing. Drug Discov Today 22:210–222. https://doi.org/10.1016/j.drudis.2016.09.019

  15. Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003

  16. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444. https://doi.org/10.1038/nature14539

  17. Sheridan RP (2013) Time-split cross-validation as a method for estimating the goodness of prospective prediction. J Chem Inf Model 53(4):783–790. https://doi.org/10.1021/ci400084k

  18. Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V (2018) MoleculeNet: a benchmark for molecular machine learning. Chem Sci 9:513–530. https://doi.org/10.1039/C7SC02664A

  19. Minnich AJ, McLoughlin K, Tse M, Deng J, Weber A, Murad N, Madej BD, Ramsundar B, Rush T, Calad-Thomson S, Brase J, Allen JE (2020) AMPL: a data-driven modeling pipeline for drug discovery. J Chem Inf Model 60:1955–1968. https://doi.org/10.1021/acs.jcim.9b01053

  20. Mayr A, Klambauer G, Unterthiner T, Steijaert M, Wegner JK, Ceulemans H, Clevert D-A, Hochreiter S (2018) Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chem Sci 9:5441–5451. https://doi.org/10.1039/C8SC00148K

  21. Zhong FS, Xing J, Li XT, Liu XH, Fu ZY, Xiong ZP, Lu D, Wu XL, Zhao JH, Tan XQ, Li F, Luo XM, Li XZ, Chen KX, Zheng MY, Jiang HL (2018) Artificial intelligence in drug design. Sci China Life Sci 61:59–72. https://doi.org/10.1007/s11427-018-9342-2

  22. Jing YK, Bian YM, Hu ZH, Wang LR, Xie XQS (2018) Deep learning for drug design: an artificial intelligence paradigm for drug discovery in the big data era. AAPS J 20:58. https://doi.org/10.1208/s12248-018-0210-0

  23. Sze V, Chen YH, Yang T, Emer JS (2017) Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE 105:2295–2329. https://doi.org/10.1109/JPROC.2017.2761740

  24. Yang Y, Adelstein SJ, Kassis AI (2009) Target discovery from data mining approaches. Drug Discov Today 14(3–4):147–154. https://doi.org/10.1016/j.drudis.2008.12.005

  25. Chan HCS, Shan H, Dahoun T, Vogel H, Yuan S (2019) Advancing drug discovery via artificial intelligence. Trends Pharmacol Sci 40(8):592–604. https://doi.org/10.1016/j.tips.2019.06.004

  26. Ciallella HL, Zhu H (2019) Advancing computational toxicology in the big data era by artificial intelligence: data-driven and mechanism-driven modeling for chemical toxicity. Chem Res Toxicol 32:536–547. https://doi.org/10.1021/chemrestox.8b00393

  27. Brown N (2015) In silico medicinal chemistry: computational methods to support drug design. Royal Society of Chemistry. https://doi.org/10.1039/9781782622604

  28. Kumar R, Chaudhary K, Gupta S, Singh H, Kumar S, Gautam A, Kapoor P, Raghava GPS (2013) CancerDR: cancer drug resistance database. Sci Rep 3:1445. https://doi.org/10.1038/srep01445

  29. Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, Wang J, Yu B, Zhang J, Bryant SH (2016) PubChem substance and compound databases. Nucleic Acids Res 44:D1202–D1213. https://doi.org/10.1093/nar/gkv951

  30. Placzek S, Schomburg I, Chang A, Jeske L, Ulbrich M, Tillack J, Schomburg D (2017) BRENDA in 2017: new perspectives and new tools in BRENDA. Nucleic Acids Res 45:D380–D388. https://doi.org/10.1093/nar/gkw952

  31. Chen R, Liu X, Jin S, Lin J, Liu J (2018) Machine learning for drug-target interaction prediction. Molecules 23:2208. https://doi.org/10.3390/molecules23092208

  32. Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP (2012) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100–D1107. https://doi.org/10.1093/nar/gkr777

  33. Magariños MP, Carmona SJ, Crowther GJ, Ralph SA, Roos DS, Shanmugam D, Voorhis WCV, Agüero F (2012) TDR targets: a chemogenomics resource for neglected diseases. Nucleic Acids Res 40:D1118–D1127. https://doi.org/10.1093/nar/gkr1053

  34. Günther S et al (2008) SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res 36:D919–D922. https://doi.org/10.1093/nar/gkm862

  35. Russell SJ, Norvig P (2003) Artificial intelligence: a modern approach. Upper Saddle River, NJ: Prentice Hall/Pearson Ed.

  36. Hansch C, Fujita T (1964) ρ-σ-π Analysis. A method for the correlation of biological activity and chemical structure. J Am Chem Soc 86:1616–1626. https://doi.org/10.1021/ja01062a035

  37. Zefirov NS, Palyulin VA (2002) Fragmental approach in QSPR. J Chem Inf Comput Sci 42:1112–1122. https://doi.org/10.1021/ci020010e

  38. McGregor MJ, Muskal SM (1999) Pharmacophore fingerprinting. 1. Application to QSAR and focused library design. J Chem Inf Comput Sci 39:569–577. https://doi.org/10.1021/ci980159j

  39. Gozalbes R, Doucet JP, Derouin F (2002) Application of topological descriptors in QSAR and drug design: history and new trends. Curr Drug Targets Infect Disord 2:93–102. https://doi.org/10.2174/1568005024605909

  40. Zhu H (2020) Big data and artificial intelligence modeling for drug discovery. Annu Rev Pharmacol Toxicol 60(23):1–23. https://doi.org/10.1146/annurev-pharmtox-010919-023324

  41. Aoyama T, Suzuki Y, Ichikawa H (1989) Neural networks applied to pharmaceutical problems. 1. Method and application to decision-making. Chem Pharm Bull 37:2558–2560. https://doi.org/10.1248/cpb.37.2558

  42. Tetko IV, Villa AE, Aksenova TI, Zielinski WL, Brower J, Welsh WJ (1998) Application of a pruning algorithm to optimize artificial neural networks for pharmaceutical fingerprinting. J Chem Inf Comput Sci 38(4):660–668. https://doi.org/10.1021/ci970439j

  43. Tetko IV, Villa AE, Livingstone DJ (1996) Neural network studies. 2. Variable selection. J Chem Inf Comput Sci 36(4):794–803. https://doi.org/10.1021/ci950204c

  44. Agatonovic-Kustrin S, Beresford R (2000) Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J Pharm Biomed Anal 22(5):717–727. https://doi.org/10.1016/S0731-7085(99)00272-1

  45. Gawehn E, Hiss JA, Schneider G (2016) Deep learning in drug discovery. Mol Inform 35:3–14. https://doi.org/10.1002/minf.201501008

  46. Hinton G, Deng L, Yu D, Dahl GE, Mohamed AR, Jaitly N, Senior A, Vanhoucke V, Nguyen P, Sainath T, Kingsbury B (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Proc Mag 29:82–97. https://doi.org/10.1109/MSP.2012.2205597

  47. Silver D, Huang A, Maddison CJ, Guez A, Sifre L et al (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529:484–489. https://doi.org/10.1038/nature16961

  48. Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V (2015) Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55:263–274. https://doi.org/10.1021/ci500747n

  49. Mayr A, Klambauer G, Unterthiner T, Hochreiter S (2016) DeepTox: toxicity prediction using deep learning. Front Environ Sci 3:80. https://doi.org/10.3389/fenvs.2015.00080

  50. Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang JH, Sattar A, Yang YD, Zhou YD (2015) Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning. Sci Rep 5:11476. https://doi.org/10.1038/srep11476

  51. Qian N, Sejnowski TJ (1988) Predicting the secondary structure of globular proteins using neural network models. J Mol Biol 202:865–884. https://doi.org/10.1016/0022-2836(88)90564-5

  52. Qi YJ, Oja M, Weston J, Noble WS (2012) A unified multitask architecture for predicting local protein properties. PLoS ONE 7:e32235. https://doi.org/10.1371/journal.pone.0032235

  53. Spencer M, Eickholt J, Cheng J (2015) A deep learning network approach to ab initio protein secondary structure prediction. IEEE/ACM Trans Comput Biol Bioinform 12(1):103–112. https://doi.org/10.1109/TCBB.2014.2343960

  54. Wang S, Peng J, Ma JZ, Xu JB (2016) Protein secondary structure prediction using deep convolutional neural fields. Sci Rep 6:18962. https://doi.org/10.1038/srep18962

  55. Jo T, Hou J, Eickholt J, Cheng J (2015) Improving protein fold recognition by deep learning networks. Sci Rep 5:17573. https://doi.org/10.1038/srep17573

  56. Dill KA, Ozkan SB, Shell MS, Weikl TR (2008) The protein folding problem. Annu Rev Biophys 37:289–316. https://doi.org/10.1146/annurev.biophys.37.092707.153558

  57. Dill KA, MacCallum JL (2012) The protein-folding problem, 50 years on. Science 338:1042–1046. https://doi.org/10.1126/science.1219021

  58. Senior AW et al (2020) Improved protein structure prediction using potentials from deep learning. Nature 577:706–710. https://doi.org/10.1038/s41586-019-1923-7

  59. Senior AW et al (2019) Protein structure prediction using multiple deep neural networks in the 13th critical assessment of protein structure prediction (CASP13). Proteins 87:1141–1148. https://doi.org/10.1002/prot.25834

  60. Goshisht MK, Moudgil L, Rani M, Khullar P, Singh G, Kumar H, Singh N, Kaur G, Bakshi MS (2014) Lysozyme complexes for the synthesis of functionalized biomaterials to understand protein–protein interactions and their biological applications. J Phys Chem C 118(48):28207–28219. https://doi.org/10.1021/jp5078054

  61. Goshisht MK, Moudgil L, Khullar P, Singh G, Kaura A, Kumar H, Kaur G, Bakshi MS (2015) Surface adsorption and molecular modeling of biofunctional gold nanoparticles for systemic circulation and biological sustainability. ACS Sustainable Chem Eng 3(12):3175–3187. https://doi.org/10.1021/acssuschemeng.5b00747

  62. Khullar P, Goshisht MK, Moudgil L, Singh G, Mandial D, Kumar H, Ahliwalia GK, Bakshi MS (2017) Mode of protein complexes on gold nanoparticles surface: synthesis and characterization of biomaterials for hemocompatibility and preferential DNA complexation. ACS Sustainable Chem Eng 5(1):1082–1093. https://doi.org/10.1021/acssuschemeng.6b02373

  63. Mahal A, Goshisht MK, Khullar P, Kumar H, Singh N, Kaur G, Bakshi MS (2014) Protein mixtures of environmentally friendly zein to understand protein–protein interactions through biomaterials synthesis, hemolysis, and their antimicrobial activities. Phys Chem Chem Phys 16:14257–14270. https://doi.org/10.1039/C4CP01457J

  64. Scott DE, Bayly AR, Abell C, Skidmore J (2016) Small molecules, big targets: drug discovery faces the protein–protein interaction challenge. Nat Rev Drug Discov 15:533–550. https://doi.org/10.1038/nrd.2016.29

  65. Santos R, Ursu O, Gaulton A, Bento AP, Donadi RS, Bologa CG, Karlsson A, Al-Lazikani B, Hersey A, Oprea TI, Overington JP (2017) A comprehensive map of molecular drug targets. Nat Rev Drug Discov 16:19–34. https://doi.org/10.1038/nrd.2016.230

  66. Wilson AJ, Murphy NS, Long K, Azzarito V (2013) Inhibition of α-helix-mediated protein-protein interactions using designed molecules. Nat Chem 5:161–173. https://doi.org/10.1038/nchem.1568

  67. Maheshwari S, Brylinski M (2016) Template-based identification of protein–protein interfaces using eFindSitePPI. Methods 93:64–71. https://doi.org/10.1016/j.ymeth.2015.07.017

  68. Vakser IA (2014) Protein-protein docking: from interaction to interactome. Biophys J 107:1785–1793. https://doi.org/10.1016/j.bpj.2014.08.033

  69. Mosca R, Ceol A, Aloy P (2013) Interactome3D: adding structural details to protein networks. Nat Methods 10:47–53. https://doi.org/10.1038/nmeth.2289

  70. Du TC, Li L, Wu CH, Sun BL (2016) Prediction of residue-residue contact matrix for protein–protein interaction with Fisher score features and deep learning. Methods 110:97–105. https://doi.org/10.1016/j.ymeth.2016.06.001

  71. Du XQ, Sun SW, Hu CL, Yao Y, Yan YT, Zhang YP (2017) DeepPPI: boosting prediction of protein-protein interactions with deep neural networks. J Chem Inf Model 57(6):1499–1510. https://doi.org/10.1021/acs.jcim.7b00028

  72. Zeng H, Wang S, Zhou TM, Zhao EF, Li XF, Wu Q, Xu JB (2018) ComplexContact: a web server for inter-protein contact prediction using deep learning. Nucleic Acids Res 46:W432–W437. https://doi.org/10.1093/nar/gky420

  73. Xie Z, Deng X, Shu K (2020) Prediction of protein-protein interaction sites using convolutional neural network and improved data sets. Int J Mol Sci 21(2):467. https://doi.org/10.3390/ijms21020467

  74. Rester U (2008) From virtuality to reality - Virtual screening in lead discovery and lead optimization: a medicinal chemistry perspective. Curr Opin Drug Discov Devel 11(4):559–568

  75. Walters WP, Stahl MT, Murcko MA (1998) Virtual screening - an overview. Drug Discov Today 3(4):160–178. https://doi.org/10.1016/S1359-6446(97)01163-X

  76. Gonczarek A, Tomczak JM, Zareba S, Kaczmar J, Dabrowski P, Walczak MJ (2018) Interaction prediction in structure-based virtual screening using deep learning. Comput Biol Med 100:253–258. https://doi.org/10.1016/j.compbiomed.2017.09.007

  77. Plewczynski D, Spieser SAH, Koch U (2009) Performance of machine learning methods for ligand-based virtual screening. Comb Chem High Throughput Screen 12(4):358–368. https://doi.org/10.2174/138620709788167962

  78. Xiao T, Qi X, Chen YZ, Jiang Y (2018) Development of ligand-based big data deep neural network models for virtual screening of large compound libraries. Mol Inf 37:1800031. https://doi.org/10.1002/minf.201800031

  79. Ferreira LG, Dos Santos RN, Oliva G, Andricopulo AD (2015) Molecular docking and structure-based drug design strategies. Molecules 20(7):13384–13421. https://doi.org/10.3390/molecules200713384

  80. Akbar R, Jusoh SA, Amaro RE, Helms V (2017) ENRI: a tool for selecting structure based virtual screening target conformations. Chem Biol Drug Des 89:762–771. https://doi.org/10.1111/cbdd.12900

  81. Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35:1798–1828. https://doi.org/10.1109/TPAMI.2013.50

  82. Pereira JC, Caffarena ER, dos Santos CN (2016) Boosting docking-based virtual screening with deep learning. J Chem Inf Model 56:2495–2506. https://doi.org/10.1021/acs.jcim.6b00355

  83. Skalic M, Martínez-Rosell G, Jiménez J, De Fabritiis G (2019) PlayMolecule BindScope: large scale CNN-based virtual screening on the web. Bioinformatics 35:1237–1238. https://doi.org/10.1093/bioinformatics/bty758

  84. Mendolia I, Contino S, Perricone U, Ardizzone E, Pirrone R (2020) Convolutional architectures for virtual screening. BMC Bioinformatics 21:310. https://doi.org/10.1186/s12859-020-03645-9

  85. Esposito EX, Hopfinger AJ, Madura JD (2004) Methods for applying the quantitative structure-activity relationship paradigm. Methods Mol Biol 275:131–214. https://doi.org/10.1385/1-59259-802-1:131

  86. Myint KZ, Xie XQ (2010) Recent advances in fragment-based QSAR and multidimensional QSAR methods. Int J Mol Sci 11:3846–3866. https://doi.org/10.3390/ijms11103846

  87. Lei T, Li Y, Song Y, Li D, Sun H, Hou T (2016) ADMET evaluation in drug discovery. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling. J Cheminform 8:6. https://doi.org/10.1186/s13321-016-0117-7

  88. Aoyama T, Suzuki YJ, Ichikawa H (1990) Neural networks applied to quantitative structure-activity relationship analysis. J Med Chem 33:2583–2590. https://doi.org/10.1021/jm00171a037

  89. Dong J, Yao ZJ, Zhu MF, Wang NN, Lu B, Chen AF, Lu AP, Miao HY, Zeng WB, Cao DS (2017) ChemSAR: an online pipelining platform for molecular SAR modeling. J Cheminform 9:27. https://doi.org/10.1186/s13321-017-0215-1

  90. Dahl GE, Jaitly N, Salakhutdinov R (2014) Multi-task neural networks for QSAR predictions. arXiv:1406.1231. https://arxiv.org/abs/1406.1231v1

  91. Vina D, Uriarte E, Orallo F, González-Díaz H (2009) Alignment-free prediction of a drug–target complex network based on parameters of drug connectivity and protein sequence of receptors. Mol Pharmaceutics 6:825–835. https://doi.org/10.1021/mp800102c

  92. Prado-Prado FJ, Ubeira FM, Borges F, González-Díaz H (2010) Unified QSAR & network-based computational chemistry approach to antimicrobials. II. Multiple distance and triadic census analysis of antiparasitic drugs complex networks. J Comput Chem 31:164–173. https://doi.org/10.1002/jcc.21292

  93. Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MN (2012) Chemoinformatics in multi-target drug discovery for anti-cancer therapy: in silico design of potent and versatile anti-brain tumor agents. Anticancer Agents Med Chem 12:678–685. https://doi.org/10.2174/187152012800617722

  94. Tenorio-Borroto E, Penuelas-Rivas CG, Chagoyán JCV, Castañedo N, Prado-Prado FJ, García-Mera X, González-Díaz H (2012) ANN multiplexing model of drugs effect on macrophages; theoretical and flow cytometry study on the cytotoxicity of the anti-microbial drug G1 in spleen. Bioorg Med Chem 20:6181–6194. https://doi.org/10.1016/j.bmc.2012.07.020

  95. Tenorio-Borroto E, García-Mera X, Penuelas-Rivas CG, Vasquez-Chagoyan JC, Prado-Prado FJ, Castanedo N, Gonzalez-Diaz H (2013) Entropy model for multiplex drug-target interaction endpoints of drug immunotoxicity. Curr Top Med Chem 13:1636–1649

  96. Tenorio-Borroto E, Peñuelas-Rivas CG, Vásquez-Chagoyán JC, Castañedo N, Prado-Prado FJ, García-Mera X, González-Díaz H (2014) Model for high-throughput screening of drug immunotoxicity: study of the anti-microbial G1 over peritoneal macrophages using flow cytometry. Eur J Med Chem 72:206–220. https://doi.org/10.1016/j.ejmech.2013.08.035

  97. Speck-Planche A, Cordeiro MNDS (2013) Simultaneous modeling of antimycobacterial activities and ADMET profiles: a chemoinformatic approach to medicinal chemistry. Curr Top Med Chem 13:1656–1665. https://doi.org/10.2174/15680266113139990116

  98. Speck-Planche A, Cordeiro MNDS (2017) Speeding up early drug discovery in antiviral research: a fragment-based in silico approach for the design of virtual anti-hepatitis C leads. ACS Comb Sci 19(8):501–512. https://doi.org/10.1021/acscombsci.7b00039

  99. Ramsundar B, Kearnes S, Riley P, Webster D, Konerding D, Pande V (2015) Massively multitask networks for drug discovery. arXiv:1502.02072. https://arxiv.org/abs/1502.02072v1

  100. Xu Y, Ma J, Liaw A, Sheridan RP, Svetnik V (2017) Demystifying multitask deep neural networks for quantitative structure−activity relationships. J Chem Inf Model 57(10):2490–2504. https://doi.org/10.1021/acs.jcim.7b00087

  101. Koutsoukas A, Monaghan KJ, Li X, Huan J (2017) Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data. J Cheminform 9:42. https://doi.org/10.1186/s13321-017-0226-y

  102. Mendenhall J, Meiler J (2016) Improving quantitative structure−activity relationship models using artificial neural networks trained with dropout. J Comput-Aided Mol Des 30:177–189. https://doi.org/10.1007/s10822-016-9895-2

  103. Shahriari B, Swersky K, Wang Z, Adams RP, De Freitas N (2016) Taking the human out of the loop: a review of Bayesian optimization. Proc IEEE 104:148–175. https://doi.org/10.1109/JPROC.2015.2494218

  104. Zhao Z, Qin J, Gou Z, Zhang Y, Yang Y (2020) Multi-task learning models for predicting active compounds. J Biomed Inform 108:103484. https://doi.org/10.1016/j.jbi.2020.103484

  105. Kharkar PS (2010) Two-dimensional (2D) in silico models for absorption, distribution, metabolism, excretion and toxicity (ADME/T) in drug discovery. Curr Top Med Chem 10:116–126. https://doi.org/10.2174/156802610790232224

  106. Wang YL, Xing J, Xu Y, Zhou NN, Peng JL, Xiong ZP, Liu X, Luo XM, Luo C, Chen KX, Zheng MY, Jiang HL (2015) In silico ADME/T modelling for rational drug design. Q Rev Biophys 48:488–515. https://doi.org/10.1017/s0033583515000190

  107. Xue HQ, Li J, Xie HZ, Wang YD (2018) Review of drug repositioning approaches and resources. Int J Biol Sci 14:1232–1244. https://doi.org/10.7150/ijbs.24612

  108. Kennedy T (1997) Managing the drug discovery/development interface. Drug Discov Today 2:436–444. https://doi.org/10.1016/s1359-6446(97)01099-4

  109. Merlot G (2010) Computational toxicology - a tool for early safety evaluation. Drug Discov Today 15:16–22. https://doi.org/10.1016/j.drudis.2009.09.010

  110. Khanna I (2012) Drug discovery in pharmaceutical industry: productivity challenges and trends. Drug Discov Today 17:1088–1102. https://doi.org/10.1016/j.drudis.2012.05.007

  111. Tan JJ, Cong XJ, Hu LM, Wang CX, Jia L, Liang XJ (2010) Therapeutic strategies underpinning the development of novel techniques for the treatment of HIV infection. Drug Discov Today 15:186–197. https://doi.org/10.1016/j.drudis.2010.01.004

  112. Kortagere S, Chekmarev DS, Welsh WJ, Ekins S (2008) New predictive models for blood-brain barrier permeability of drug-like molecules. Pharm Res 25:1836–1845. https://doi.org/10.1007/s11095-008-9584-5

  113. Obrezanova O, Csanyi G, Gola GMR, Segall MD (2007) Gaussian processes: a method for automatic QSAR modeling of ADME properties. J Chem Inf Model 47(5):1847–1857. https://doi.org/10.1021/ci7000633

  114. Lombardo F, Obach RS, DiCapua FM, Bakken GA, Lu J, Potter DM, Gao F, Miller MD, Zhang Y (2006) A hybrid mixture discriminant analysis-random forest computational model for the prediction of volume of distribution of drugs in human. J Med Chem 49:2262–2267

  115. Klon AE, Lowrie JF, Diller DJ (2006) Improved naive Bayesian modeling of numerical data for absorption, distribution, metabolism and excretion (ADME) property prediction. J Chem Inf Model 46:1945–1956. https://doi.org/10.1021/ci0601315

  116. Clark AM, Dole K, Coulon-Spektor A, McNutt A, Grass G, Freundlich JS, Reynolds RC, Ekins S (2015) Open source Bayesian models. 1. Application to ADME/Tox and drug discovery datasets. J Chem Inf Model 55:1231–1245. https://doi.org/10.1021/acs.jcim.5b00143

  117. Li X, Xu Y, Lai L, Pei J (2018) Prediction of human cytochrome P450 inhibition using a multitask deep autoencoder neural network. Mol Pharm 15(10):4336–4345. https://doi.org/10.1021/acs.molpharmaceut.8b00110

  118. Ramsundar B, Liu B, Wu Z, Verras A, Tudor M, Sheridan RP, Pande V (2017) Is multitask deep learning practical for pharma? J Chem Inf Model 57:2068–2076. https://doi.org/10.1021/acs.jcim.7b00146

  119. Altae-Tran H, Ramsundar B, Pappu AS, Pande V (2017) Low data drug discovery with one-shot learning. ACS Cent Sci 3:283–293. https://doi.org/10.1021/acscentsci.6b00367

  120. Wenlock MC, Carlsson LA (2015) How experimental errors influence drug metabolism and pharmacokinetic QSAR/QSPR models. J Chem Inf Model 55:125–134. https://doi.org/10.1021/ci500535s

  121. Wenzel J, Matter H, Schmidt F (2019) Predictive multitask deep neural network models for ADME-Tox properties: learning from large data sets. J Chem Inf Model 59(3):1253–1268. https://doi.org/10.1021/acs.jcim.8b00785

  122. Hughes TB, Miller GP, Swamidass SJ (2015) Modeling epoxidation of drug-like molecules with a deep machine learning network. ACS Cent Sci 1:168–180. https://doi.org/10.1021/acscentsci.5b00131

  123. Xu YJ, Dai ZW, Chen FJ, Gao SS, Pei JF, Lai LH (2015) Deep learning for drug induced liver injury. J Chem Inf Model 55:2085–2093. https://doi.org/10.1021/acs.jcim.5b00238

  124. Novac N (2013) Challenges and opportunities of drug repositioning. Trends Pharmacol Sci 34:267–272. https://doi.org/10.1016/j.tips.2013.03.004

  125. Tripathi N, Tripathi N, Goshisht MK (2021) COVID-19: inflammatory responses, structure-based drug design and potential therapeutics. Mol Divers. https://doi.org/10.1007/s11030-020-10176-1

  126. Chen X, Yan CC, Zhang XT, Zhang X, Dai F, Yin J, Zhang YD (2016) Drug–target interaction prediction: databases, web servers and computational models. Briefings Bioinf 17:696–712. https://doi.org/10.1093/bib/bbv066

  127. Romero Durán FJ, Alonso N, Caamaño O, García-Mera X, Yañez M, Prado-Prado FJ, González-Díaz H (2014) Prediction of multi-target networks of neuroprotective compounds with entropy indices and synthesis, assay, and theoretical study of new asymmetric 1,2-rasagiline carbamates. Int J Mol Sci 15:17035–17064. https://doi.org/10.3390/ijms150917035

  128. Kitchen DB, Decornez H, Furr JR, Bajorath J (2004) Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov 3:935–949. https://doi.org/10.1038/nrd1549

  129. Yao ZJ, Dong J, Che YJ, Zhu MF, Wen M, Wang NN, Wang S, Lu AP, Cao DS (2016) TargetNet: a web service for predicting potential drug-target interaction profiling via multi-target SAR models. J Comput Aided Mol Des 30:413–424. https://doi.org/10.1007/s10822-016-9915-2

  130. Zhou Y, Wang F, Tang J, Nussinov R, Cheng F (2020) Artificial intelligence in COVID-19 drug repurposing. The Lancet Digital Health 2(12):E667–E676. https://doi.org/10.1016/S2589-7500(20)30192-8

  131. Wen M, Zhang ZM, Niu SY, Sha HZ, Yang RH, Yun YH, Lu HM (2017) Deep learning-based drug-target interaction prediction. J Proteome Res 16:1401–1409. https://doi.org/10.1021/acs.jproteome.6b00618

  132. Luo YL, Zhao XB, Zhou JT, Yang JL, Zhang YQ, Kuang WH, Peng J, Chen L, Zeng JY (2017) A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 8:573. https://doi.org/10.1038/s41467-017-00680-8

  133. Schneider G, Fechner U (2005) Computer-based de novo design of drug-like molecules. Nat Rev Drug Discov 4:649–663. https://doi.org/10.1038/nrd1799

  134. Böhm HJ (1992) The computer program LUDI: a new method for the de novo design of enzyme inhibitors. J Comput Aided Mol Des 6:61–78. https://doi.org/10.1007/bf00124387

  135. Schneider G, Geppert T, Hartenfeller M, Reisen F, Klenner A, Reutlinger M, Hähnke V, Hiss JA, Zettl H, Keppner S, Spänkuch B, Schneider P (2011) Reaction-driven de novo design, synthesis and testing of potential type II kinase inhibitors. Future Med Chem 3:415–424. https://doi.org/10.4155/fmc.11.8

  136. Gómez-Bombarelli R, Wei JN, Duvenaud D, Hernández-Lobato JM, Sánchez-Lengeling B, Sheberla D, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci 4(2):268–276. https://doi.org/10.1021/acscentsci.7b00572

  137. Skalic M, Jiménez J, Sabbadin D, De Fabritiis G (2019) Shape-based generative modeling for de novo drug design. J Chem Inf Model 59:1205–1214. https://doi.org/10.1021/acs.jcim.8b00706

  138. Collins KD, Glorius F (2013) A robustness screen for the rapid assessment of chemical reactions. Nat Chem 5:597–601. https://doi.org/10.1038/nchem.1669

  139. Wei JN, Duvenaud D, Aspuru-Guzik A (2016) Neural networks for the prediction of organic chemistry reactions. ACS Cent Sci 2:725–732.

  140. Huang Q, Li L-L, Yang S-Y (2011) RASA: a rapid retrosynthesis-based scoring method for the assessment of synthetic accessibility of drug-like molecules. J Chem Inf Model 51:2768–2777. https://doi.org/10.1021/ci100216g

  141. Fialkowski M, Bishop KJ, Chubukov VA, Campbell CJ, Grzybowski BA (2005) Architecture and evolution of organic chemistry. Angew Chem Int Ed 44:7263–7269.

  142. Peplow M (2014) Organic synthesis: the robo-chemist. Nature 512:20–22. https://doi.org/10.1038/512020a

  143. Gothard CM, Soh S, Gothard NA, Kowalczyk B, Wei Y, Baytekin B, Grzybowski BA (2012) Rewiring chemistry: algorithmic discovery and experimental validation of one-pot reactions in the network of organic chemistry. Angew Chem Int Ed 51:7922–7927. https://doi.org/10.1002/anie.201202155

  144. Kowalik M, Gothard CM, Drews AM, Gothard NA, Weckiewicz A, Fuller PE, Grzybowski BA, Bishop KJ (2012) Parallel optimization of synthetic pathways within the network of organic chemistry. Angew Chem Int Ed 51:7928–7932. https://doi.org/10.1002/anie.201202209

  145. Coley CW, Barzilay R, Jaakkola TS, Green WH, Jensen KF (2017) Prediction of organic reaction outcomes using machine learning. ACS Cent Sci 3:434–443. https://doi.org/10.1021/acscentsci.7b00064

  146. Lowe DM (2012) Extraction of chemical structures and reactions from the literature. Doctoral dissertation, University of Cambridge 1289. https://doi.org/10.17863/CAM.16293

  147. Segler MH, Waller MP (2017) Neural-symbolic machine learning for retrosynthesis and reaction prediction. Chem Eur J 23:5966–5971. https://doi.org/10.1002/chem.201605499

  148. Segler MH, Preuss M, Waller MP (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555:604–610. https://doi.org/10.1038/nature25978

  149. Browne CB, Powley E, Whitehouse D, Lucas SM, Cowling PI, Rohlfshagen P, Tavener S, Perez D, Samothrakis S, Colton S (2012) A survey of Monte Carlo tree search methods. IEEE Trans Comput Intell AI Games 4:1–43. https://doi.org/10.1109/TCIAIG.2012.2186810

  150. Segler MH, Waller MP (2017) Modelling chemical reasoning to predict and invent reactions. Chem Eur J 23:6118–6128. https://doi.org/10.1002/chem.201604556

  151. Mamoshina P, Vieira A, Putin E, Zhavoronkov A (2016) Applications of deep learning in biomedicine. Mol Pharm 13:1445–1454. https://doi.org/10.1021/acs.molpharmaceut.5b00982

  152. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90. https://doi.org/10.1145/3065386

Acknowledgements

Dr. M.K. Goshisht extends his appreciation to the Principal (Dr. T.R. Ratre) of Government College Tokapal, Bastar, Chhattisgarh, for providing support while writing this work.

Funding

Not Applicable.

Author information

Corresponding author

Correspondence to Manoj Kumar Goshisht.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tripathi, N., Goshisht, M.K., Sahu, S.K. et al. Applications of artificial intelligence to drug design and discovery in the big data era: a comprehensive review. Mol Divers 25, 1643–1664 (2021). https://doi.org/10.1007/s11030-021-10237-z

  • DOI: https://doi.org/10.1007/s11030-021-10237-z
