Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches


As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Because artificial intelligence (AI) is leading the fourth industrial revolution, it can be considered a viable solution for unstable drug research and development. AI is generally applied to fields with sufficient data, such as computer vision and natural language processing, but many efforts are under way to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized. In addition, the review provides an in-depth analysis of the remaining challenges and limitations and proposes promising future directions in each of these areas.


  1. 1.

    DiMasi, J. A., H. G. Grabowski, and R. W. Hansen (2016) Innovation in the pharmaceutical industry: New estimates of R&D costs. J. Health Econ. 47: 20–33.

    Article  Google Scholar 

  2. 2.

    Paul, S. M., D. S. Mytelka, C. T. Dunwiddie, C. C. Persinger, B. H. Munos, S. R. Lindborg, and A. L. Schacht (2010) How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nat. Rev. Drug Discov. 9: 203–214.

    Article  CAS  Google Scholar 

  3. 3.

    van de Waterbeemd, H. and E. Gifford (2003) ADMET in silico modelling: towards prediction paradise? Nat. Rev. Drug Discov. 2: 192–204.

    Article  Google Scholar 

  4. 4.

    Mak, K. K. and M. R. Pichika (2019) Artificial intelligence in drug development: present status and future prospects. Drug Discov. Today. 24: 773–780.

    Article  Google Scholar 

  5. 5.

    Yang, X., Y. Wang, R. Byrne, G. Schneider, and S. Yang (2019) Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119: 10520–10594.

    Article  CAS  Google Scholar 

  6. 6.

    Eder, J., R. Sedrani, and C. Wiesmann (2014) The discovery of first-in-class drugs: origins and evolution. Nat. Rev. Drug Discov. 13: 577–587.

    Article  CAS  Google Scholar 

  7. 7.

    Brown, D. (2007) Unfinished business: target-based drug discovery. Drug Discov. Today 12: 1007–1012.

    Article  CAS  Google Scholar 

  8. 8.

    Hsu, Y. H., J. Yao, L. C. Chan, T. J. Wu, J. L. Hsu, Y. F. Fang, Y. Wei, Y. Wu, W. C. Huang, C. L. Liu, Y. C. Chang, M. Y. Wang, C. W. Li, J. Shen, M. K. Chen, A. A. Sahin, A. Sood, G. B. Mills, D. Yu, G. N. Hortobagyi, and M. C. Hung (2014) Definition of PKC-a, CDK6, and MET as therapeutic targets in triple-negative breast cancer. Cancer Res. 74: 4822–4835.

    Article  CAS  Google Scholar 

  9. 9.

    Chen, B. and A. Butte (2016) Leveraging big data to transform target selection and drug discovery. Clin. Pharmacol. Ther. 99: 285–297.

    Article  Google Scholar 

  10. 10.

    Kodama, K., M. Horikoshi, K. Toda, S. Yamada, K. Hara, J. Irie, M. Sirota, A. A. Morgan, R. Chen, H. Ohtsu, S. Maeda, T. Kadowaki, and A. J. Butte (2012) Expression-based genomewide association study links the receptor CD44 in adipose tissue with type 2 diabetes. Proc. Natl. Acad. Sci. USA. 109: 7049-7054.

  11. 11.

    Zhu, Z., F. Zhang, H. Hu, A. Bakshi, M. R. Robinson, J. E. Powell, G. W. Montgomery, M. E. Goddard, N. R. Wray, P. M. Visscher, and J. Yang (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48: 481-487.

  12. 12.

    van Dam, S., U. Võsa, A. van der Graaf, L. Franke, and J. P. de Magalhães (2018) Gene co-expression analysis for functional classification and gene-disease predictions. Brief. Bioinform. 19: 575–592.

    Google Scholar 

  13. 13.

    Petyuk, V. A., R. Chang, M. Ramirez-Restrepo, N. D. Beckmann, M. Y. R. Henrion, P. D. Piehowski, K. Zhu, S. Wang, J. Clarke, M. J. Huentelman, F. Xie, V. Andreev, A. Engel, T. Guettoche, L. Navarro, P. De Jager, J. A. Schneider, C. M. Morris, I. G. McKeith, R. H. Perry, S. Lovestone, R. L. Woltjer, T. G. Beach, L. I. Sue, G. E. Serrano, A. P. Lieberman, R. L. Albin, I. Ferrer, D. C. Mash, C. M. Hulette, J. F. Ervin, E. M. Reiman, J. A. Hardy, D. A. Bennett, E. Schadt, R. D. Smith, and A. J. Myers (2018) The human brainome: network analysis identifies HSPA2 as a novel Alzheimer's disease target. Brain. 141: 2721-2739.

  14. 14.

    Lee, S., C. Zhang, Z. Liu, M. Klevstig, B. Mukhopadhyay, M. Bergentall, R. Cinar, M. Ståhlman, N. Sikanic, J. K. Park, S. Deshmukh, A. M. Harzandi, T. Kuijpers, M. Grøtli, S. J. Elsässer, B. D. Piening, M. Snyder, U. Smith, J. Nielsen, F. Bäckhed, G. Kunos, M. Uhlen, J. Boren, and A. Mardinoglu (2017) Network analyses identify liver-specific targets for treating liver diseases. Mol. Syst. Biol. 13: 938.

  15. 15.

    Zou, Q., J. Li, L. Song, X. Zeng, and G. Wang (2016) Similarity computation strategies in the microRNA-disease network: a survey. Brief. Funct. Genomics. 15: 55–64.

    Google Scholar 

  16. 16.

    Chen, X., D. Xie, L. Wang, Q. Zhao, Z. H. You, and H. Liu (2018) BNPMDA: Bipartite Network Projection for MiRNADisease Association prediction. Bioinformatics. 34: 3178–3186.

    Article  CAS  Google Scholar 

  17. 17.

    Ding, P., J. Luo, C. Liang, Q. Xiao, and B. Cao (2018) Human disease MiRNA inference by combining target information based on heterogeneous manifolds. J. Biomed. Inform. 80: 26–36.

    Article  Google Scholar 

  18. 18.

    Mohamed, S. K., V. Novácek, and A. Nounu (2020) Discovering protein drug targets using knowledge graph embeddings. Bioinformatics. 36: 603–610.

    Google Scholar 

  19. 19.

    Richardson, P., I. Griffin, C. Tucker, D. Smith, O. Oechsle, A. Phelan, M. Rawling, E. Savory, and J. Stebbing (2020) Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. Lancet. 395: e30–e31.

    Article  Google Scholar 

  20. 20.

    Segler, M. H. S., M. Preuss, and M. P. Waller (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature. 555: 604–610.

    Article  CAS  Google Scholar 

  21. 21.

    Ferrero, E., I. Dunham, and P. Sanseau (2017) In silico prediction of novel therapeutic targets using gene-disease association data. J. Transl. Med. 15: 182.

    Article  CAS  Google Scholar 

  22. 22.

    Mamoshina, P., M. Volosnikova, I. V. Ozerov, E. Putin, E. Skibina, F. Cortese, and A. Zhavoronkov (2018) Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification. Front. Genet. 9: 242.

  23. 23.

    Piñero, J., Á. Bravo, N. Queralt-Rosinach, A. Gutiérrez- Sacristán, J. Deu-Pons, E. Centeno, J. García-García, F. Sanz, and L. I. Furlong (2017) DisGeNET: A comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 45: D833-D839.

  24. 24.

    Stoeger, T., M. Gerlach, R. I. Morimoto, and L. A. Nunes Amaral (2018) Large-scale investigation of the reasons why potentially important genes are ignored. PLoS Biol. 16: e2006643.

  25. 25.

    Piñero, J., J. M. Ramírez-Anguita, J. Saüch-Pitarch, F. Ronzano, E. Centeno, F. Sanz, and L. I. Furlong (2020) The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 48: D845-D855.

  26. 26.

    Davis, A. P., C. J. Grondin, R. J. Johnson, D. Sciaky, R. McMorran, J. Wiegers, T. C. Wiegers, and C. J. Mattingly (2019) The Comparative Toxicogenomics Database: update 2019. Nucleic Acids Res. 47: D948-D954.

  27. 27.

    Vasaikar, S. V., J. Wang, and B. Zhang (2018) LinkedOmics: analyzing multi-omics data within and across 32 cancer types. Nucleic Acids Res. 46: D956–D963.

  28. 28.

    Carvalho-Silva, D., A. Pierleoni, M. Pignatelli, C. Ong, L. Fumis, N. Karamanis, M. Carmona, A. Faulconbridge, A. Hercules, E. McAuley, A. Miranda, G. Peat, M. Spitzer, J. Barrett, D. G. Hulcoop, E. Papa, G. Koscielny, and I. Dunham (2019) Open Targets Platform: new developments and updates two years on. Nucleic Acids Res. 47: D1056-D1065.

  29. 29.

    Brown, K. K., M. M. Hann, A. S. Lakdawala, R. Santos, P. J. Thomas, and K. Todd (2018) Approaches to target tractability assessment - a practical perspective. Medchemcomm. 9: 606–613.

    Article  Google Scholar 

  30. 30.

    Huang, Z., J. Shi, Y. Gao, C. Cui, S. Zhang, J. Li, Y. Zhou, and Q. Cui (2019) HMDD v3.0: a database for experimentally supported human microRNA-disease associations. Nucleic Acids Res. 47: D1013–D1017.

    Article  CAS  Google Scholar 

  31. 31.

    DepMap portal.

  32. 32.

    Meyers, R. M., J. G. Bryan, J. M. McFarland, B. A. Weir, A. E. Sizemore, H. Xu, N. V. Dharia, P. G. Montgomery, G. S. Cowley, S. Pantel, A. Goodale, Y. Lee, L. D. Ali, G. Jiang, R. Lubonja, W. F. Harrington, M. Strickland, T. Wu, D. C. Hawes, V. A. Zhivich, M. R. Wyatt, Z. Kalani, J. J. Chang, M. Okamoto, K. Stegmaier, T. R. Golub, J. S. Boehm, F. Vazquez, D. E. Root, W. C. Hahn, and A. Tsherniak (2017) Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat. Genet. 49: 1779-1784.

  33. 33.

    Tsherniak, A., F. Vazquez, P. G. Montgomery, B. A. Weir, G. Kryukov, G. S. Cowley, S. Gill, W. F. Harrington, S. Pantel, J. M. Krill-Burger, R. M. Meyers, L. Ali, A. Goodale, Y. Lee, G. Jiang, J. Hsiao, W. F. J. Gerath, S. Howell, E. Merkel, M. Ghandi, L. A. Garraway, D. E. Root, T. R. Golub, J. S. Boehm, and W. C. Hahn (2017) Defining a cancer dependency map. Cell. 170: 564-576.e16.

  34. 34.

    Barretina, J., G. Caponigro, N. Stransky, K. Venkatesan, A. A. Margolin, S. Kim, C. J. Wilson, J. Lehár, G. V. Kryukov, D. Sonkin, A. Reddy, M. Liu, L. Murray, M. F. Berger, J. E. Monahan, P. Morais, J. Meltzer, A. Korejwa, J. Jané-Valbuena, F. A. Mapa, J. Thibault, E. Bric-Furlong, P. Raman, A. Shipway, I. H. Engels, J. Cheng, G. K. Yu, J. Yu, P. Aspesi, M. de Silva, K. Jagtap, M. D. Jones, L. Wang, C. Hatton, E. Palescandolo, S. Gupta, S. Mahan, C. Sougnez, R. C. Onofrio, T. Liefeld, L. MacConaill, W. Winckler, M. Reich, N. Li, J. P. Mesirov, S. B. Gabriel, G. Getz, K. Ardlie, V. Chan, V. E. Myer, B. L. Weber, J. Porter, M. Warmuth, P. Finan, J. L. Harris, M. Meyerson, T. R. Golub, M. P. Morrissey, W. R. Sellers, R. Schlegel, and L. A. Garraway (2012) The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 483: 603-607.

  35. 35.

    Stransky, N., M. Ghandi, G. V. Kryukov, L. A. Garraway, J. Lehár, M. Liu, D. Sonkin, A. Kauffmann, K. Venkatesan, E. J. Edelman, M. Riester, J. Barretina, G. Caponigro, R. Schlegel, W. R. Sellers, F. Stegmeier, M. Morrissey, A. Amzallag, I. Pruteanu-Malinici, D. A. Haber, S. Ramaswamy, C. H. Benes, M. P. Menden, F. Iorio, M. R. Stratton, U. McDermott, M. J. Garnett, and J. Saez-Rodriguez (2015) Pharmacogenomic agreement between two cancer cell line data sets. Nature. 528: 84-87.

  36. 36.

    Ghandi, M., F. W. Huang, J. Jané-Valbuena, G. V. Kryukov, C. C. Lo, E. R. McDonald, J. Barretina, E. T. Gelfand, C. M. Bielski, H. Li, K. Hu, A. Y. Andreev-Drakhlin, J. Kim, J. M. Hess, B. J. Haas, F. Aguet, B. A. Weir, M. V. Rothberg, B. R. Paolella, M. S. Lawrence, R. Akbani, Y. Lu, H. L. Tiv, P. C. Gokhale, A. de Weck, A. A. Mansour, C. Oh, J. Shih, K. Hadi, Y. Rosen, J. Bistline, K. Venkatesan, A. Reddy, D. Sonkin, M. Liu, J. Lehar, J. M. Korn, D. A. Porter, M. D. Jones, J. Golji, G. Caponigro, J. E. Taylor, C. M. Dunning, A. L. Creech, A. C. Warren, J. M. McFarland, M. Zamanighomi, A. Kauffmann, N. Stransky, M. Imielinski, Y. E. Maruvka, A. D. Cherniack, A. Tsherniak, F. Vazquez, J. D. Jaffe, A. A. Lane, D. M. Weinstock, C. M. Johannessen, M. P. Morrissey, F. Stegmeier, R. Schlegel, W. C. Hahn, G. Getz, G. B. Mills, J. S. Boehm, T. R. Golub, L. A. Garraway, and W. R. Sellers (2019) Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature. 569: 503-508.

  37. 37.

    Yu, C., A. M. Mannan, G. M. Yvone, K. N. Ross, Y. L. Zhang, M. A. Marton, B. R. Taylor, A. Crenshaw, J. Z. Gould, P. Tamayo, B. A. Weir, A. Tsherniak, B. Wong, L. A. Garraway, A. F. Shamji, M. A. Palmer, M. A. Foley, W. Winckler, S. L. Schreiber, A. L. Kung, and T. R. Golub (2016) High-throughput identification of genotype-specific cancer vulnerabilities in mixtures of barcoded tumor cell lines. Nat. Biotechnol. 34: 419-423.

  38. 38.

    Szklarczyk, D., A. L. Gable, D. Lyon, A. Junge, S. Wyder, J. Huerta-Cepas, M. Simonovic, N. T. Doncheva, J. H. Morris, P. Bork, L. J. Jensen, and C. V. Mering (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47: D607-D613.

  39. 39.

    Wang, Y., S. Zhang, F. Li, Y. Zhou, Y. Zhang, Z. Wang, R. Zhang, J. Zhu, Y. Ren, Y. Tan, C. Qin, Y. Li, X. Li, Y. Chen, and F. Zhu (2020) Therapeutic target database 2020: enriched resource for facilitating research and early development of targeted therapeutics. Nucleic Acids Res. 48: D1031-D1041.

  40. 40.

    Pearson, N., K. Malki, D. Evans, L. Vidler, C. Ruble, J. Scherschel, B. Eastwood, and D. A. Collier (2019) TractaViewer: a genome-wide tool for preliminary assessment of therapeutic target druggability. Bioinformatics. 35: 4509–4510.

    Article  CAS  Google Scholar 

  41. 41.

    Keiser, M. J., V. Setola, J. J. Irwin, C. Laggner, A. I. Abbas, S. J. Hufeisen, N. H. Jensen, M. B. Kuijer, R. C. Matos, T. B. Tran, R. Whaley, R. A. Glennon, J. Hert, K. L. H. Thomas, D. D. Edwards, B. K. Shoichet, and B. L. Roth (2009) Predicting new molecular targets for known drugs. Nature. 462: 175-181.

  42. 42.

    Morris, G. M., R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell, and A. J. Olson (2009) AutoDock4 and AutoDockTools4: Automated docking with selective Receptor flexibility. J. Comput. Chem. 30: 2785–2791.

    Article  CAS  Google Scholar 

  43. 43.

    Trott, O. and A. J. Olson (2010) AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31: 455–461.

    Google Scholar 

  44. 44.

    Koes, D. R., M. P. Baumgartner, and C. J. Camacho (2013) Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. J. Chem. Inf. Model. 53: 1893–1904.

    Article  CAS  Google Scholar 

  45. 45.

    Ballester, P. J. and J. B. O. Mitchell (2010) A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 26: 1169–1175.

    Article  CAS  Google Scholar 

  46. 46.

    Li, L., B. Wang, and S. O. Meroueh (2011) Support vector regression scoring of receptor-ligand complexes for rankordering and virtual screening of chemical libraries. J. Chem. Inf. Model. 51: 2132–2138.

    Article  CAS  Google Scholar 

  47. 47.

    Ragoza, M., J. Hochuli, E. Idrobo, J. Sunseri, and D. R. Koes (2017) Protein-ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57: 942–957.

    Article  CAS  Google Scholar 

  48. 48.

    Jimenez, J., M. Skalic, G. Martinez-Rosell, and G. De Fabritiis (2018) KDEEP: Protein-ligand absolute binding affinity prediction via 3D-convolutional neural networks. J. Chem. Inf. Model. 58: 287-296.

  49. 49.

    Imrie, F., A. R. Bradley, M. van der Schaar, and C. M. Deane (2018) Protein family-specific models using deep neural networks and transfer learning improve virtual screening and highlight the need for more data. J. Chem. Inf. Model. 58: 2319-2330.

  50. 50.

    Stepniewska-Dziubinska, M. M., P. Zielenkiewicz, and P. Siedlecki (2018) Development and evaluation of a deep learning model for protein-ligand binding affinity prediction. Bioinformatics. 34: 3666–3674.

    Article  CAS  Google Scholar 

  51. 51.

    Tian, K., M. Shao, Y. Wang, J. Guan, and S. Zhou (2016) Boosting compound-protein interaction prediction by deep learning. Methods. 110: 64–72.

    Article  CAS  Google Scholar 

  52. 52.

    Feinberg, E. N., D. Sur, Z. Wu, B. E. Husic, H. Mai, Y. Li, S. Sun, J. Yang, B. Ramsundar, and V. S. Pande (2018) PotentialNet for molecular property prediction. ACS Cent. Sci. 4: 1520-1530.

  53. 53.

    Lim, J., S. Ryu, K. Park, Y. J. Choe, J. Ham, and W. Y. Kim (2019) Predicting drug-target interaction using a novel graph neural network with 3D structure-embedded graph representation. J. Chem. Inf. Model. 59: 3981-3988.

  54. 54.

    Landrum, G., B. Kelley, P. Tosco, sriniker, gedeck, NadineSchneider, R. Vianello, A. Dalke, AlexanderSavelyev, S. Turk, B. Cole, M. Swain, A. Vaucher, M. Wójcikowski, A. Pahl, JP, strets123, JLVarjo, P. Fuller, DoliathGavid, N. O'Boyle, P. P. Zarrinkar, G. Sforna, M. Nowotka, pzc, J. van Santen, J. H. Jensen, J. Domanski, D. Hall, and P. Avery (2018) rdkit/rdkit: 2018_03_1 (Q1 2018) Release. Zenodo.

  55. 55.

    O'Boyle, N. M., M. Banck, C. A. James, C. Morley, T. Vandermeersch, and G. R. Hutchison (2011) Open Babel: An open chemical toolbox. J. Cheminform. 3: 33.

    Article  CAS  Google Scholar 

  56. 56.

    Willighagen, E. L., J. W. Mayfield, J. Alvarsson, A. Berg, L. Carlsson, N. Jeliazkova, S. Kuhn, T. Pluskal, M. Rojas-Cherto, O. Spjuth, G. Torrance, C. T. Evelo, R. Guha, and C. Steinbeck (2017) Erratum to: The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J. Cheminform. 9: 53.

  57. 57.

    Yap, C. W. (2011) PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J. Comput. Chem. 32: 1466–1474.

    Article  CAS  Google Scholar 

  58. 58.

    Mauri, A., V. Consonni, M. Pavan, and R. Todeschini (2006) Dragon software: An easy approach to molecular descriptor calculations. Match-Commun. Math. Comput. Chem. 56: 237–248.

    Google Scholar 

  59. 59.

    Cao, D. S., Y. Z. Liang, J. Yan, G. S. Tan, Q. S. Xu, and S. Liu (2013) PyDPI: freely available python package for chemoinformatics, bioinformatics, and chemogenomics studies. J. Chem. Inf. Model. 53: 3086-3096.

  60. 60.

    Cao, D. S., N. Xiao, Q. S. Xu, and A. F. Chen (2015) Rcpi: R/ Bioconductor package to generate various descriptors of proteins, compounds and their interactions. Bioinformatics. 31: 279–281.

    Article  CAS  Google Scholar 

  61. 61.

    Moriwaki, H., Y. S. Tian, N. Kawashita, and T. Takagi (2018) Mordred: a molecular descriptor calculator. J. Cheminform. 10: 4.

    Article  CAS  Google Scholar 

  62. 62.

    Burden, F. R. (2001) Quantitative structure-Activity relationship studies using gaussian processes. J. Chem. Inf. Comput Sci. 41: 830–835.

    Article  CAS  Google Scholar 

  63. 63.

    Svetnik, V., A. Liaw, C. Tong, J. C. Culberson, R. P. Sheridan, and B. P. Feuston (2003) Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43: 1947–1958.

    Article  CAS  Google Scholar 

  64. 64.

    Ma, J., R. P. Sheridan, A. Liaw, G. E. Dahl, and V. Svetnik (2015) Deep neural nets as a method for quantitative structure- activity relationships. J. Chem. Inf. Model. 55: 263-274.

  65. 65.

    Xu, Y., J. Ma, A. Liaw, R. P. Sheridan, and V. Svetnik (2017) Demystifying Multitask Deep neural networks for quantitative structure-activity relationships. J. Chem. Inf. Model. 57: 2490–2504.

    Article  CAS  Google Scholar 

  66. 66.

    Ghasemi, F., A. Mehridehnavi, A. Fassihi, and H. Prez-Snchez (2018) Deep neural network in QSAR studies using deep belief network. Appl. Soft Comput. 62: 251–258.

    Article  Google Scholar 

  67. 67.

    Kato, Y., S. Hamada, and H. Goto (2020) Validation Study of QSAR/DNN models using the competition datasets. Mol. Inf. 39: 1900154.

    Article  CAS  Google Scholar 

  68. 68.

    Lusci, A., G. Pollastri, and P. Baldi (2013) Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. J. Chem. Inf. Model. 53: 1563-1575.

  69. 69.

    Duvenaud, D., D. Maclaurin, J. Aguilera-Iparraguirre, R. Gómez-Bombarelli, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams (2015) Convolutional networks on graphs for learning molecular fingerprints. arXiv. 1509.09292.

  70. 70.

    Rogers, D. and M. Hahn (2010) Extended-connectivity fingerprints. J. Chem. Inf. Model. 50: 742–754.

    Article  CAS  Google Scholar 

  71. 71.

    Jaeger, S., S. Fulle, and S. Turk (2018) Mol2vec: Unsupervised machine learning approach with chemical intuition. J. Chem. Inf. Model. 58: 27–35.

    Article  CAS  Google Scholar 

  72. 72.

    Chakravarti, S. K. and S. R. M. Alla (2019) Descriptor Free QSAR modeling using deep learning with long short-term memory neural networks. Front. Artif. Intell. 2: 17.

    Article  Google Scholar 

  73. 73.

    Winter, R., F. Noé, and D. A. Clevert (2019) Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem. Sci. 10: 1692–1701.

  74. 74.

    Honda, S., S. Shi, and H. R. Ueda (2019) SMILES transformer: pre-trained molecular fingerprint for low data drug discovery. arXiv. 1911.04738.

  75. 75.

    Devlin, J., M. W. Chang, K. Lee, and K. Toutanova (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv. 1810.04805.

  76. 76.

    Altae-Tran, H., B. Ramsundar, A. S. Pappu, and V. Pande (2017) Low data drug discovery with one-shot learning. ACS Cent. Sci. 3: 283–293.

    Article  CAS  Google Scholar 

  77. 77.

    Rohrer, S. G. and K. Baumann (2009) Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J. Chem. Inf. Model. 49: 169–184.

    Article  CAS  Google Scholar 

  78. 78.

    Jeon, M., D. Park, J. Lee, H. Jeon, M. Ko, S. Kim, Y. Choi, A. C. Tan, and J. Kang (2019) ReSimNet: drug response similarity prediction using siamese neural networks. Bioinformatics. 35: 5249–5256.

    Article  CAS  Google Scholar 

  79. 79.

    Lamb, J., E. D. Crawford, D. Peck, J. W. Modell, I. C. Blat, M. J. Wrobel, J. Lerner, J. P. Brunet, A. Subramanian, K. N. Ross, M. Reich, H. Hieronymus, G. Wei, S. A. Armstrong, S. J. Haggarty, P. A. Clemons, R. Wei, S. A. Carr, E. S. Lander, and T. R. Golub (2006) The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 313: 1929-1935.

  80. 80.

    Park, K., Y. J. Ko, P. Durai, and C. H. Pan (2019) Machine learning-based chemical binding similarity using evolutionary relationships of target genes. Nucleic Acids Res. 47: e128.

  81. 81.

    Cheng, T., M. Hao, T. Takeda, S. H. Bryant, and Y. Wang (2017) Large-scale prediction of drug-target interaction: a datacentric review. AAPS J. 19: 1264–1275.

    Article  CAS  Google Scholar 

  82. 82.

    Ding, H., I. Takigawa, H. Mamitsuka, and S. Zhu (2014) Similarity-based machine learning methods for predicting drugtarget interactions: a brief review. Brief Bioinform. 15: 734–747.

    Article  Google Scholar 

  83. 83.

    Bleakley, K. and Y. Yamanishi (2009) Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics. 25: 2397–2403.

    Article  CAS  Google Scholar 

  84. 84.

    Xia, Z., L. Y. Wu, X. Zhou, and S. T. C. Wong (2010) Semisupervised drug-protein interaction prediction from heterogeneous biological spaces. BMC Syst. Biol. 4 Suppl 2: S6.

  85. 85.

    van Laarhoven, T., S. B. Nabuurs, and E. Marchiori (2011) Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics. 27: 3036–3043.

    Article  CAS  Google Scholar 

  86. 86.

    Pahikkala, T., A. Airola, S. Pietila, S. Shakyawar, A. Szwajda, J. Tang, and T. Aittokallio (2015) Toward more realistic drugtarget interaction predictions. Brief. Bioinform. 16: 325-337.

  87. 87.

    Keum, J. and H. Nam (2017) SELF-BLM: Prediction of drugtarget interactions via self-training SVM. PLoS One. 12: e0171839.

    Article  CAS  Google Scholar 

  88. 88.

    Chen, X., M. X. Liu, and G. Y. Yan (2012) Drug-target interaction prediction by random walk on the heterogeneous network. Mol. Biosyst. 8: 1970–1978.

    Article  CAS  Google Scholar 

  89. 89.

    Luo, Y., X. Zhao, J. Zhou, J. Yang, Y. Zhang, W. Kuang, J. Peng, L. Chen, and J. Zeng (2017) A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun. 8: 573.

  90. 90.

    Wang, S., H. Cho, C. Zhai, B. Berger, and J. Peng (2015) Exploiting ontology graph for predicting sparsely annotated gene function. Bioinformatics. 31: i357–i364.

    Article  CAS  Google Scholar 

  91. 91.

    Ewing, T., J. C. Baber, and M. Feher (2006) Novel 2D fingerprints for ligand-based virtual screening. J. Chem. Inf. Model. 46: 2423–2431.

    Article  CAS  Google Scholar 

  92. 92.

    Dubchak, I., I. Muchnik, S. R. Holbrook, and S. H. Kim (1995) Prediction of protein folding class using global description of amino acid sequence. Proc. Natl. Acad. Sci. USA. 92: 8700-8704.

  93. 93.

    Zhang, P., L. Tao, X. Zeng, C. Qin, S. Chen, F. Zhu, Z. Li, Y. Jiang, W. Chen, and Y. Z. Chen (2017) A protein network descriptor server and its use in studying protein, disease, metabolic and drug targeted networks. Brief. Bioinform. 18: 1057-1070.

  94. 94.

    Yu, H., J. Chen, X. Xu, Y. Li, H. Zhao, Y. Fang, X. Li, W. Zhou, W. Wang, and Y. Wang (2012) A systematic prediction of multiple drug-target interactions from chemical, genomic, and pharmacological data. PLoS One. 7: e37608.

    Article  CAS  Google Scholar 

  95. 95.

    Li, Z. C., M. H. Huang, W. Q. Zhong, Z. Q. Liu, Y. Xie, Z. Dai, and X. Y. Zou (2016) Identification of drug-target interaction from interactome network with ‘guilt-by-association’ principle and topology features. Bioinformatics. 32: 1057–1064.

    Article  CAS  Google Scholar 

  96. 96.

    Lee, I. and H. Nam (2018) Identification of drug-target interaction by a random walk with restart method on an interactome network. BMC Bioinformatics. 19: 208.

    Article  CAS  Google Scholar 

  97. 97.

    Wang, Y. and J. Zeng (2013) Predicting drug-target interactions using restricted Boltzmann machines. Bioinformatics. 29: i126–i134.

    Article  CAS  Google Scholar 

  98. 98.

    Wen, M., Z. Zhang, S. Niu, H. Sha, R. Yang, Y. Yun, and H. Lu (2017) Deep-learning-based drug-target interaction prediction. J. Proteome Res. 16: 1401-1409.

  99. 99.

    Hu, P. W., K. C. C. Chan, and Z. H. You (2016) Large-scale prediction of drug-target interactions from deep representations. 2016 International Joint Conference on Neural Networks (IJCNN). July 24-29. Vancouver, BC, Canada.

  100. 100.

    Ozturk, H., A. Ozgur, and E. Ozkirimli (2018) DeepDTA: deep drug-target binding affinity prediction. Bioinformatics. 34: i821–i829.

    Article  CAS  Google Scholar 

  101. 101.

    He, T., M. Heidemeyer, F. Ban, A. Cherkasov, and M. Ester (2017) SimBoost: a read-across approach for predicting drugtarget binding affinities using gradient boosting machines. J. Cheminform. 9: 24.

    Article  CAS  Google Scholar 

  102. 102.

    Tsubaki, M., K. Tomii, and J. Sese (2019) Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics. 35: 309–318.

    Article  CAS  Google Scholar 

  103. 103.

    Gonen, M. (2012) Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics. 28: 2304–2310.

    Article  CAS  Google Scholar 

  104. 104.

    Lee, I., J. Keum, and H. Nam (2019) DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS Comput. Biol. 15: e1007129.

  105. 105.

    Karimi, M., D. Wu, Z. Wang, and Y. Shen (2019) DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks. Bioinformatics. 35: 3329-3338.

  106. 106.

    Shen, C., J. Ding, Z. Wang, D. Cao, X. Ding, and T. Hou (2020) From machine learning to deep learning: Advances in scoring functions for protein-ligand docking. WIREs Comput. Mol. Sci. 10: e1429.

  107. 107.

    Sieg, J., F. Flachsenberg, and M. Rarey (2019) In need of bias control: evaluating chemical data for machine learning in structure-based virtual screening. J. Chem. Inf. Model. 59: 947-961.

  108. 108.

    Chen, L., A. Cruz, S. Ramsey, C. J. Dickson, J. S. Duca, V. Hornak, D. R. Koes, and T. Kurtzman (2019) Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening. PLoS One. 14: e0220113.

  109. 109.

    Hanson, J., K. K. Paliwal, T. Litfin, Y. Yang, and Y. Zhou (2020) Getting to know your neighbor: protein structure prediction comes of age with contextual machine learning. J. Comput. Biol. 27: 796-814.

  110. 110.

    Shi, Q., W. Chen, S. Huang, Y. Wang, and Z. Xue (2019) Deep learning for mining protein data. Brief. Bioinform. bbz156.

  111. 111.

    Goodsell, D. S., C. Zardecki, L. Di Costanzo, J. M. Duarte, B. P. Hudson, I. Persikova, J. Segura, C. Shao, M. Voigt, J. D. Westbrook, J. Y. Young, and S. K. Burley (2020) RCSB Protein Data Bank: Enabling biomedical research and drug discovery. Protein Sci. 29: 52-65.

  112. 112.

    Gola, J., O. Obrezanova, E. Champness, and M. Segall (2006) ADMET property prediction: The state of the art and current challenges. QSAR Comb. Sci. 25: 1172–1180.

    Article  CAS  Google Scholar 

  113. 113.

    Moroy, G., V. Y. Martiny, P. Vayer, B. O. Villoutreix, and M. A. Miteva (2012) Toward in silico structure-based ADMET prediction in drug discovery. Drug Discov. Today. 17: 44–55.

    Article  CAS  Google Scholar 

  114. 114.

    Tian, S., J. Wang, Y. Li, D. Li, L. Xu, and T. Hou (2015) The application of in silico drug-likeness predictions in pharmaceutical research. Adv. Drug Deliv. Rev. 86: 2–10.

    Article  CAS  Google Scholar 

  115. 115.

    Zhao, Y. H., J. Le, M. H. Abraham, A. Hersey, P. J. Eddershaw, C. N. Luscombe, D. Boutina, G. Beck, B. Sherborne, I. Cooper, and J. A. Platts (2001) Evaluation of human intestinal absorption data and subsequent derivation of a quantitative structure-Activity relationship (QSAR) with the Abraham descriptors. J. Pharm. Sci. 90: 749-784.

  116. 116.

    Ponzoni, I., V. Sebastin-Prez, C. Requena-Triguero, C. Roca, M. J. Martnez, F. Cravero, M. F. Daz, J. A. Pez, R. G. Arrays, J. Adrio, and N. E. Campillo (2017) Hybridizing feature selection and feature learning approaches in QSAR modeling for drug discovery. Sci. Rep. 7: 2403.

  117. 117.

    Wang, N. N., C. Huang, J. Dong, Z. J. Yao, M. F. Zhu, Z. K. Deng, B. Lv, A. P. Lu, A. F. Chen, and D. S. Cao (2017) Predicting human intestinal absorption with modified random forest approach: a comprehensive evaluation of molecular representation, unbalanced data, and applicability domain issues. RSC Adv. 7: 19007-19018.

  118. 118.

    Yang, M., J. Chen, L. Xu, X. Shi, X. Zhou, Z. Xi, R. An, and X. Wang (2018) A novel adaptive ensemble classification framework for ADME prediction. RSC Adv. 8: 11661–11683.

    Article  Google Scholar 

  119. 119.

    Fredlund, L., S. Winiwarter, and C. Hilgendorf (2017) In vitro intrinsic permeability: a transporter-independent measure of Caco-2 cell permeability in drug design and development. Mol. Pharm. 14: 1601-1609.

  120. 120.

    Patel, R. D., S. P. Kumar, C. N. Patel, S. S. Shankar, H. A. Pandya, and H. A. Solanki (2017) Parallel screening of druglike natural compounds using Caco-2 cell permeability QSAR model with applicability domain, lipophilic ligand efficiency index and shape property: A case study of HIV-1 reverse transcriptase inhibitors. J. Mol. Struct. 1146: 80-95.

  121. 121.

    Sun, H., K. Nguyen, E. Kerns, Z. Yan, K. R. Yu, P. Shah, A. Jadhav, and X. Xu (2017) Highly predictive and interpretable models for PAMPA permeability. Bioorg. Med. Chem. 25: 1266–1276.


  122. 122.

    Chi, C. T., M. H. Lee, C. F. Weng, and M. K. Leong (2019) In silico prediction of PAMPA effective permeability using a two-QSAR approach. Int. J. Mol. Sci. 20: 3170.

  123. 123.

    Lanevskij, K. and R. Didziapetris (2019) Physicochemical QSAR analysis of passive permeability across Caco-2 monolayers. J. Pharm. Sci. 108: 78–86.


  124. 124.

    Oja, M., S. Sild, and U. Maran (2019) Logistic classification models for pH-permeability profile: predicting permeability classes for the biopharmaceutical classification system. J. Chem. Inf. Model. 59: 2442-2455.

  125. 125.

    Shin, M., D. Jang, H. Nam, K. H. Lee, and D. Lee (2018) Predicting the absorption potential of chemical compounds through a deep learning approach. IEEE/ACM Trans. Comput. Biol. Bioinform. 15: 432–440.


  126. 126.

    Wenzel, J., H. Matter, and F. Schmidt (2019) Predictive multitask deep neural network models for ADME-Tox properties: learning from large data sets. J. Chem. Inf. Model. 59: 1253-1268.

  127. 127.

    Gooch, E. (2004) Medicinal chemistry - an introduction; fundamentals of medicinal chemistry (Gareth Thomas). J. Chem. Educ. 81: 1271.


  128. 128.

    Kumar, R., A. Sharma, M. H. Siddiqui, and R. K. Tiwari (2017) Prediction of drug-plasma protein binding using artificial intelligence based algorithms. Comb. Chem. High Throughput Screen. 21: 57-64.

  129. 129.

    Wang, N. N., Z. K. Deng, C. Huang, J. Dong, M. F. Zhu, Z. J. Yao, A. F. Chen, A. P. Lu, Q. Mi, and D. S. Cao (2017) ADME properties evaluation in drug discovery: Prediction of plasma protein binding using NSGA-II combining PLS and consensus modeling. Chemometr. Intell. Lab. Syst. 170: 84-95.

  130. 130.

    Sun, L., H. Yang, J. Li, T. Wang, W. Li, G. Liu, and Y. Tang (2018) In silico prediction of compounds binding to human plasma proteins by QSAR models. ChemMedChem. 13: 572-581.

  131. 131.

    Toma, C., D. Gadaleta, A. Roncaglioni, A. Toropov, A. Toropova, M. Marzo, and E. Benfenati (2019) QSAR development for plasma protein binding: influence of the ionization state. Pharm. Res. 36: 28.

  132. 132.

    Ye, Z., Y. Yang, X. Li, D. Cao, and D. Ouyang (2019) An integrated transfer learning and multitask learning approach for pharmacokinetic parameter prediction. Mol. Pharm. 16: 533–541.


  133. 133.

    Prachayasittikul, V., A. Worachartcheewan, A. P. Toropova, A. A. Toropov, N. Schaduangrat, V. Prachayasittikul, and C. Nantasenamat (2017) Large-scale classification of P-glycoprotein inhibitors using SMILES-based descriptors. SAR QSAR Environ. Res. 28: 1-16.

  134. 134.

    Gonzalo, C. G. and N. García-Pedrajas (2018) Boosted feature selectors: a case study on prediction P-gp inhibitors and substrates. J. Comput. Aided Mol. Des. 32: 1273–1294.


  135. 135.

    Hinge, V. K., D. Roy, and A. Kovalenko (2019) Prediction of P-glycoprotein inhibitors with machine learning classification models and 3D-RISM-KH theory based solvation energy descriptors. J. Comput. Aided Mol. Des. 33: 965–971.


  136. 136.

    Shi, T., Y. Yang, S. Huang, L. Chen, Z. Kuang, Y. Heng, and H. Mei (2019) Molecular image-based convolutional neural network for the prediction of ADMET properties. Chemometr. Intell. Lab. Syst. 194: 103853.

  137. 137.

    Toropov, A. A., A. P. Toropova, M. Beeg, M. Gobbi, and M. Salmona (2017) QSAR model for blood-brain barrier permeation. J. Pharmacol. Toxicol. Methods. 88: 7–18.


  138. 138.

    Wang, Z., H. Yang, Z. Wu, T. Wang, W. Li, Y. Tang, and G. Liu (2018) In silico prediction of blood-brain barrier permeability of compounds by machine learning and resampling methods. ChemMedChem. 13: 2189-2201.

  139. 139.

    Yuan, Y., F. Zheng, and C. G. Zhan (2018) Improved prediction of blood-brain barrier permeability through machine learning with combined use of molecular property-based descriptors and fingerprints. AAPS J. 20: 54.

  140. 140.

    Miao, R., L. Y. Xia, H. H. Chen, H. H. Huang, and Y. Liang (2019) Improved classification of blood-brain-barrier drugs using deep learning. Sci. Rep. 9: 8802.


  141. 141.

    Hunt, P. A., M. D. Segall, and J. D. Tyzack (2018) WhichP450: a multi-class categorical model to predict the major metabolising CYP450 isoform for a compound. J. Comput. Aided Mol. Des. 32: 537-546.

  142. 142.

    Tian, S., Y. Djoumbou-Feunang, R. Greiner, and D. S. Wishart (2018) CypReact: A software tool for in silico reactant prediction for human cytochrome P450 enzymes. J. Chem. Inf. Model. 58: 1282-1291.

  143. 143.

    Shan, X., X. Wang, C. D. Li, Y. Chu, Y. Zhang, Y. Xiong, and D. Q. Wei (2019) Prediction of CYP450 enzyme-substrate selectivity based on the network-based label space division method. J. Chem. Inf. Model. 59: 4577-4586.

  144. 144.

    Li, X., Y. Xu, L. Lai, and J. Pei (2018) Prediction of human cytochrome P450 inhibition using a multitask deep autoencoder neural network. Mol. Pharm. 15: 4336–4345.


  145. 145.

    Pang, X., B. Zhang, G. Mu, J. Xia, Q. Xiang, X. Zhao, A. Liu, G. Du, and Y. Cui (2018) Screening of cytochrome P450 3A4 inhibitors via in silico and in vitro approaches. RSC Adv. 8: 34783-34792.

  146. 146.

    Wu, Z., T. Lei, C. Shen, Z. Wang, D. Cao, and T. Hou (2019) ADMET evaluation in drug discovery. 19. Reliable prediction of human cytochrome P450 inhibition using artificial intelligence approaches. J. Chem. Inf. Model. 59: 4587-4601.

  147. 147.

    He, S., M. Li, X. Ye, H. Wang, W. Yu, W. He, Y. Wang, and Y. Qiao (2017) Site of metabolism prediction for oxidation reactions mediated by oxidoreductases based on chemical bond. Bioinformatics. 33: 363–372.


  148. 148.

    Šícho, M., C. De Bruyn Kops, C. Stork, D. Svozil, and J. Kirchmair (2017) FAME 2: simple and effective machine learning model of cytochrome P450 regioselectivity. J. Chem. Inf. Model. 57: 1832-1846.

  149. 149.

    Finkelmann, A. R., D. D. Goldmann, G. Schneider, and A. H. Göller (2018) MetScore: Site of metabolism prediction beyond cytochrome P450 enzymes. ChemMedChem. 13: 2281–2289.


  150. 150.

    Cai, Y., H. Yang, W. Li, G. Liu, P. W. Lee, and Y. Tang (2019) Computational prediction of site of metabolism for UGT-catalyzed reactions. J. Chem. Inf. Model. 59: 1085–1095.

  151. 151.

    Lee, P. W. (2014) Handbook of Metabolic Pathways of Xenobiotics. John Wiley & Sons.

  152. 152.

    Podlewska, S. and R. Kafel (2018) MetStabOn-online platform for metabolic stability predictions. Int. J. Mol. Sci. 19: 1040.


  153. 153.

    Esaki, T., R. Watanabe, H. Kawashima, R. Ohashi, Y. Natsume-Kitatani, C. Nagao, and K. Mizuguchi (2019) Data curation can improve the prediction accuracy of metabolic intrinsic clearance. Mol. Inform. 38: e1800086.

  154. 154.

    Liu, K., X. Sun, L. Jia, J. Ma, H. Xing, J. Wu, H. Gao, Y. Sun, F. Boulnois, and J. Fan (2019) Chemi-net: A molecular graph convolutional network for accurate drug property prediction. Int. J. Mol. Sci. 20: 3389.

  155. 155.

    Zhivkova, Z. D. (2017) Quantitative structure - pharmacokinetic relationships for plasma clearance of basic drugs with consideration of the major elimination pathway. J. Pharm. Pharm. Sci. 20: 135–147.


  156. 156.

    Wakayama, N., K. Toshimoto, K. Maeda, S. Hotta, T. Ishida, Y. Akiyama, and Y. Sugiyama (2018) In silico prediction of major clearance pathways of drugs among 9 routes with two-step support vector machines. Pharm. Res. 35: 197.

  157. 157.

    Watanabe, R., R. Ohashi, T. Esaki, H. Kawashima, Y. Natsume-Kitatani, C. Nagao, and K. Mizuguchi (2019) Development of an in silico prediction system of human renal excretion and clearance from chemical structure information incorporating fraction unbound in plasma as a descriptor. Sci. Rep. 9: 18782.

  158. 158.

    Chen, J., H. Yang, L. Zhu, Z. Wu, W. Li, Y. Tang, and G. Liu (2020) In silico prediction of human renal clearance of compounds using quantitative structure-pharmacokinetic relationship models. Chem. Res. Toxicol. 33: 640-650.

  159. 159.

    Hong, H., S. Thakkar, M. Chen, and W. Tong (2017) Development of decision forest models for prediction of drug-induced liver injury in humans using a large set of FDA-approved drugs. Sci. Rep. 7: 17311.


  160. 160.

    Kim, E. and H. Nam (2017) Prediction models for drug-induced hepatotoxicity by using weighted molecular fingerprints. BMC Bioinformatics. 18: 227.

  161. 161.

    Kotsampasakou, E., F. Montanari, and G. F. Ecker (2017) Predicting drug-induced liver injury: The importance of data curation. Toxicology. 389: 139–145.


  162. 162.

    Ai, H., W. Chen, L. Zhang, L. Huang, Z. Yin, H. Hu, Q. Zhao, J. Zhao, and H. Liu (2018) Predicting drug-induced liver injury using ensemble learning methods and molecular fingerprints. Toxicol. Sci. 165: 100-107.

  163. 163.

    Hammann, F., V. Schöning, and J. Drewe (2019) Prediction of clinically relevant drug-induced liver injury from structure using machine learning. J. Appl. Toxicol. 39: 412–419.

  164. 164.

    He, S., T. Ye, R. Wang, C. Zhang, X. Zhang, G. Sun, and X. Sun (2019) An in silico model for predicting drug-induced hepatotoxicity. Int. J. Mol. Sci. 20: 1897.

  165. 165.

    Williams, D. P., S. E. Lazic, A. J. Foster, E. Semenova, and P. Morgan (2019) Predicting drug-induced liver injury with Bayesian machine learning. Chem. Res. Toxicol. 33: 239-248.

  166. 166.

    Munawar, S., M. J. Windley, E. G. Tse, M. H. Todd, A. P. Hill, J. I. Vandenberg, and I. Jabeen (2018) Experimentally validated pharmacoinformatics approach to predict hERG inhibition potential of new chemical entities. Front. Pharmacol. 9: 1035.

  167. 167.

    Siramshetty, V. B., Q. Chen, P. Devarakonda, and R. Preissner (2018) The catch-22 of predicting hERG blockade using publicly accessible bioactivity data. J. Chem. Inf. Model. 58: 1224-1233.

  168. 168.

    Cai, C., P. Guo, Y. Zhou, J. Zhou, Q. Wang, F. Zhang, J. Fang, and F. Cheng (2019) Deep learning-based prediction of drug-induced cardiotoxicity. J. Chem. Inf. Model. 59: 1073–1084.

  169. 169.

    Konda, L. S. K., S. K. Praba, and R. Kristam (2019) hERG liability classification models using machine learning techniques. Comput. Toxicol. 12: 100089.

  170. 170.

    Lee, A. A., Q. Yang, A. Bassyouni, C. R. Butler, X. Hou, S. Jenkinson, and D. A. Price (2019) Ligand biological activity predicted by cleaning positive and negative chemical correlations. Proc. Natl. Acad. Sci. USA. 116: 3373-3378.

  171. 171.

    Lee, H. M., M. S. Yu, S. R. Kazmi, S. Y. Oh, K. H. Rhee, M. A. Bae, B. H. Lee, D. S. Shin, K. S. Oh, H. Ceong, D. Lee, and D. Na (2019) Computational determination of hERG-related cardiotoxicity of drug candidates. BMC Bioinformatics. 20: 250.

  172. 172.

    Ogura, K., T. Sato, H. Yuki, and T. Honma (2019) Support vector machine model for hERG inhibitory activities based on the integrated hERG database using descriptor selection by NSGA-II. Sci. Rep. 9: 12220.

  173. 173.

    Zhang, Y., J. Zhao, Y. Wang, Y. Fan, L. Zhu, Y. Yang, X. Chen, T. Lu, Y. Chen, and H. Liu (2019) Prediction of hERG K+ channel blockage using deep neural networks. Chem. Biol. Drug Des. 94: 1973-1985.

  174. 174.

    Sato, T., H. Yuki, K. Ogura, and T. Honma (2018) Construction of an integrated database for hERG blocking small molecules. PLoS One. 13: e0199348.


  175. 175.

    Kim, H. and H. Nam (2020) hERG-Att: Self-attention-based deep neural network for predicting hERG blockers. Comput. Biol. Chem. 87: 107286.


  176. 176.

    Lei, T., F. Chen, H. Liu, H. Sun, Y. Kang, D. Li, Y. Li, and T. Hou (2017) ADMET evaluation in drug discovery. Part 17: development of quantitative and qualitative prediction models for chemical-induced respiratory toxicity. Mol. Pharm. 14: 2407-2421.

  177. 177.

    Lei, T., H. Sun, Y. Kang, F. Zhu, H. Liu, W. Zhou, Z. Wang, D. Li, Y. Li, and T. Hou (2017) ADMET evaluation in drug discovery. 18. Reliable prediction of chemical-induced urinary tract toxicity by boosting machine learning approaches. Mol. Pharm. 14: 3935–3953.

  178. 178.

    Liu, J., G. Patlewicz, A. J. Williams, R. S. Thomas, and I. Shah (2017) Predicting organ toxicity using in vitro bioactivity data and chemical structure. Chem. Res. Toxicol. 30: 2046–2059.


  179. 179.

    Xu, Y., J. Pei, and L. Lai (2017) Deep learning based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction. J. Chem. Inf. Model. 57: 2672–2685.


  180. 180.

    Zhang, H., P. Yu, J. X. Ren, X. B. Li, H. L. Wang, L. Ding, and W. B. Kong (2017) Development of novel prediction model for drug-induced mitochondrial toxicity by using naive Bayes classifier method. Food Chem. Toxicol. 110: 122-129.

  181. 181.

    Fan, D., H. Yang, F. Li, L. Sun, P. Di, W. Li, Y. Tang, and G. Liu (2018) In silico prediction of chemical genotoxicity using machine learning methods and structural alerts. Toxicol. Res. 7: 211-220.

  182. 182.

    Jiang, C., H. Yang, P. Di, W. Li, Y. Tang, and G. Liu (2019) In silico prediction of chemical reproductive toxicity using machine learning. J. Appl. Toxicol. 39: 844–854.


  183. 183.

    Zheng, S., Y. Wang, W. Liu, W. Chang, G. Liang, Y. Xu, and F. Lin (2019) In silico prediction of hemolytic toxicity on the human erythrocytes for small molecules by machine-learning and genetic algorithm. J. Med. Chem. 12: 6499-6512.

  184. 184.

    Fernandez, M., F. Ban, G. Woo, M. Hsing, T. Yamazaki, E. Leblanc, P. S. Rennie, W. J. Welch, and A. Cherkasov (2018) Toxic colors: the use of deep learning for predicting toxicity of compounds merely from their graphic images. J. Chem. Inf. Model. 58: 1533-1543.

  185. 185.

    Abbasi, K., A. Poso, J. Ghasemi, M. Amanlou, and A. Masoudi-Nejad (2019) Deep transferable compound representation across domains and tasks for low data drug discovery. J. Chem. Inf. Model. 59: 4528-4539.

  186. 186.

    Karim, A., A. Mishra, M. A. H. Newton, and A. Sattar (2019) Efficient toxicity prediction via simple features using shallow neural networks and decision trees. ACS Omega. 4: 1874–1888.


  187. 187.

    Zakharov, A. V., T. Zhao, D. T. Nguyen, T. Peryea, T. Sheils, A. Yasgar, R. Huang, N. Southall, and A. Simeonov (2019) Novel consensus architecture to improve performance of large-scale multitask deep learning QSAR models. J. Chem. Inf. Model. 59: 4613-4624.

  188. 188.

    Wang, J. and T. Hou (2015) Advances in computationally modeling human oral bioavailability. Adv. Drug Deliv. Rev. 86: 11–16.


  189. 189.

    Hutter, M. C. (2018) The current limits in virtual screening and property prediction. Future Med. Chem. 10: 1623–1635.


  190. 190.

    Wu, Z., B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. Pande (2018) MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9: 513-530.

  191. 191.

    Merck Molecular Activity Challenge (2012)

  192. 192.

    Winkler, D. A. and T. C. Le (2017) Performance of deep and shallow neural networks, the universal approximation theorem, activity cliffs, and QSAR. Mol. Inform. 36: 1600118.


  193. 193.

    Ryu, S., Y. Kwon, and W. Y. Kim (2019) A Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification. Chem. Sci. 10: 8438–8446.


  194. 194.

    Xiong, Z., D. Wang, X. Liu, F. Zhong, X. Wan, X. Li, Z. Li, X. Luo, K. Chen, H. Jiang, and M. Zheng (2019) Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J. Med. Chem. 63: 8749-8760.

  195. 195.

    Maggiora, G. M. (2006) On outliers and activity cliffs-Why QSAR often disappoints. J. Chem. Inf. Model. 46: 1535.


  196. 196.

    Kohonen, P., J. A. Parkkinen, E. L. Willighagen, R. Ceder, K. Wennerberg, S. Kaski, and R. C. Grafström (2017) A transcriptomics data-driven gene space accurately predicts liver cytopathology and drug-induced liver injury. Nat. Commun. 8: 15932.

  197. 197.

    Rueda-Zárate, H. A., I. Imaz-Rosshandler, R. A. Cárdenas-Ovando, J. E. Castillo-Fernández, J. Noguez-Monroy, and C. Rangel-Escareño (2017) A computational toxicogenomics approach identifies a list of highly hepatotoxic compounds from a large microarray database. PLoS One. 12: e0176284.

  198. 198.

    Su, R., H. Wu, B. Xu, X. Liu, and L. Wei (2019) Developing a multi-dose computational model for drug-induced hepatotoxicity prediction based on toxicogenomics data. IEEE/ACM Trans. Comput. Biol. Bioinform. 16: 1231-1239.

  199. 199.

    Schneider, G. and U. Fechner (2005) Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov. 4: 649–663.


  200. 200.

    Walters, W. P. (2019) Virtual chemical libraries. J. Med. Chem. 62: 1116–1124.


  201. 201.

    Reymond, J. L., L. Ruddigkeit, L. Blum, and R. van Deursen (2012) The enumeration of chemical space. WIREs Comput. Mol. Sci. 2: 717-733.

  202. 202.

    Sanchez-Lengeling, B. and A. Aspuru-Guzik (2018) Inverse molecular design using machine learning: Generative models for matter engineering. Science. 361: 360–365.


  203. 203.

    Elton, D. C., Z. Boukouvalas, M. D. Fuge, and P. W. Chung (2019) Deep learning for molecular design—a review of the state of the art. Mol. Syst. Des. Eng. 4: 828-849.

  204. 204.

    Brown, N., M. Fiscato, M. H. S. Segler, and A. C. Vaucher (2019) GuacaMol: Benchmarking models for de novo molecular design. J. Chem. Inf. Model. 59: 1096-1108.

  205. 205.

    Huc, I. and J. M. Lehn (1997) Virtual combinatorial libraries: dynamic generation of molecular and supramolecular diversity by self-assembly. Proc. Natl. Acad. Sci. USA. 94: 2106–2110.


  206. 206.

    Lehn, J. M. (1999) Dynamic combinatorial chemistry and virtual combinatorial libraries. Chem. Eur. J. 5: 2455–2463.


  207. 207.

    Kwon, Y., J. Yoo, Y. S. Choi, W. J. Son, D. Lee, and S. Kang (2019) Efficient learning of non-autoregressive graph variational autoencoders for molecular graph generation. J. Cheminform. 11: 70.

  208. 208.

    Segler, M. H. S., T. Kogej, C. Tyrchan, and M. P. Waller (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4: 120-131.

  209. 209.

    Gómez-Bombarelli, R., J. N. Wei, D. Duvenaud, J. M. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. D. Hirzel, R. P. Adams, and A. Aspuru-Guzik (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4: 268–276.

  210. 210.

    Kang, S. and K. Cho (2019) Conditional molecular design with deep generative models. J. Chem. Inf. Model. 59: 43–52.


  211. 211.

    Arús-Pous, J., S. V. Johansson, O. Prykhodko, E. J. Bjerrum, C. Tyrchan, J. L. Reymond, H. Chen, and O. Engkvist (2019) Randomized SMILES strings improve the quality of molecular generative models. J. Cheminform. 11: 71.

  212. 212.

    Gupta, A., A. T. Müller, B. J. H. Huisman, J. A. Fuchs, P. Schneider, and G. Schneider (2018) Generative recurrent networks for de novo drug design. Mol. Inform. 37: 1700111.

  213. 213.

    Merk, D., F. Grisoni, L. Friedrich, and G. Schneider (2018) Tuning artificial intelligence on the de novo design of natural-product-inspired retinoid X receptor modulators. Commun. Chem. 1: 68.

  214. 214.

    Zheng, S., X. Yan, Q. Gu, Y. Yang, Y. Du, Y. Lu, and J. Xu (2019) QBMG: quasi-biogenic molecule generator with deep recurrent neural network. J. Cheminform. 11: 5.

  215. 215.

    Awale, M., F. Sirockin, N. Stiefl, and J. L. Reymond (2019) Drug analogs from fragment-based long short-term memory generative neural networks. J. Chem. Inf. Model. 59: 1347-1356.

  216. 216.

    Arús-Pous, J., T. Blaschke, S. Ulander, J. L. Reymond, H. Chen, and O. Engkvist (2019) Exploring the GDB-13 chemical space using deep generative models. J. Cheminform. 11: 20.

  217. 217.

    Pogány, P., N. Arad, S. Genway, and S. D. Pickett (2019) De novo molecule design by translating from reduced graphs to SMILES. J. Chem. Inf. Model. 59: 1136-1146.

  218. 218.

    Li, Y., L. Zhang, and Z. Liu (2018) Multi-objective de novo drug design with conditional graph generative model. J. Cheminform. 10: 33.


  219. 219.

    Polykovskiy, D., A. Zhebrak, D. Vetrov, Y. Ivanenkov, V. Aladinskiy, P. Mamoshina, M. Bozdaganyan, A. Aliper, A. Zhavoronkov, and A. Kadurin (2018) Entangled conditional adversarial autoencoder for de novo drug discovery. Mol. Pharm. 15: 4398-4405.

  220. 220.

    Lim, J., S. Ryu, J. W. Kim, and W. Y. Kim (2018) Molecular generative model based on conditional variational autoencoder for de novo molecular design. J. Cheminform. 10: 31.

  221. 221.

    Harel, S. and K. Radinsky (2018) Prototype-based compound discovery using deep generative models. Mol. Pharm. 15: 4406–4416.


  222. 222.

    Skalic, M., J. Jiménez, D. Sabbadin, and G. De Fabritiis (2019) Shape-based generative modeling for de novo drug design. J. Chem. Inf. Model. 59: 1205-1214.

  223. 223.

    Lim, J., S. Y. Hwang, S. Moon, S. Kim, and W. Y. Kim (2020) Scaffold-based molecular design with a graph generative model. Chem. Sci. 11: 1153-1164.

  224. 224.

    Kadurin, A., S. Nikolenko, K. Khrabrov, A. Aliper, and A. Zhavoronkov (2017) druGAN: An advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol. Pharm. 14: 3098–3104.

  225. 225.

    Blaschke, T., M. Olivecrona, O. Engkvist, J. Bajorath, and H. Chen (2018) Application of generative autoencoder in de novo molecular design. Mol. Inform. 37: 1700123.


  226. 226.

    Prykhodko, O., S. V. Johansson, P. C. Kotsias, J. Arús-Pous, E. J. Bjerrum, O. Engkvist, and H. Chen (2019) A de novo molecular generation method using latent vector based generative adversarial network. J. Cheminform. 11: 74.

  227. 227.

    Zhou, Z., S. Kearnes, L. Li, R. N. Zare, and P. Riley (2019) Optimization of molecules via deep reinforcement learning. Sci. Rep. 9: 10752.

  228. 228.

    Olivecrona, M., T. Blaschke, O. Engkvist, and H. Chen (2017) Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9: 48.

  229. 229.

    Popova, M., O. Isayev, and A. Tropsha (2018) Deep reinforcement learning for de novo drug design. Sci. Adv. 4: eaap7885.


  230. 230.

    Putin, E., A. Asadulaev, Y. Ivanenkov, V. Aladinskiy, B. Sanchez-Lengeling, A. Aspuru-Guzik, and A. Zhavoronkov (2018) Reinforced adversarial neural computer for de novo molecular design. J. Chem. Inf. Model. 58: 1194-1204.

  231. 231.

    Putin, E., A. Asadulaev, Q. Vanhaelen, Y. Ivanenkov, A. V. Aladinskaya, A. Aliper, and A. Zhavoronkov (2018) Adversarial threshold neural computer for molecular de novo design. Mol. Pharm. 15: 4386–4397.

  232. 232.

    Liu, X., K. Ye, H. W. T. van Vlijmen, A. P. Ijzerman, and G. J. P. van Westen (2019) An exploration strategy improves the diversity of de novo ligands using deep reinforcement learning: a case for the adenosine A2A receptor. J. Cheminform. 11: 35.

  233. 233.

    Ståhl, N., G. Falkman, A. Karlsson, G. Mathiason, and J. Boström (2019) Deep reinforcement learning for multiparameter optimization in de novo drug design. J. Chem. Inf. Model. 59: 3166–3176.


  234. 234.

    Zhavoronkov, A., Y. A. Ivanenkov, A. Aliper, M. S. Veselov, V. A. Aladinskiy, A. V. Aladinskaya, V. A. Terentiev, D. A. Polykovskiy, M. D. Kuznetsov, A. Asadulaev, Y. Volkov, A. Zholus, R. R. Shayakhmetov, A. Zhebrak, L. I. Minaeva, B. A. Zagribelnyy, L. H. Lee, R. Soll, D. Madge, L. Xing, T. Guo, and A. Aspuru-Guzik (2019) Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37: 1038-1040.

  235. 235.

    Polykovskiy, D., A. Zhebrak, B. Sanchez-Lengeling, S. Golovanov, O. Tatanov, S. Belyaev, R. Kurbanov, A. Artamonov, V. Aladinskiy, M. Veselov, A. Kadurin, S. Johansson, H. Chen, S. Nikolenko, A. Aspuru-Guzik, and A. Zhavoronkov (2018) Molecular Sets (MOSES): A benchmarking platform for molecular generation models. arXiv. 1811.12823.

  236. 236.

    Kawai, K., Y. Karuo, A. Tarui, K. Sato, and M. Omote (2020) Effect of structural descriptors on the design of cyclin dependent kinase inhibitors using similarity-based molecular evolution. Mol. Inform. 39: 1900126.

  237. 237.

    Yoshikawa, N., K. Terayama, M. Sumita, T. Homma, K. Oono, and K. Tsuda (2018) Population-based de novo molecule generation, using grammatical evolution. Chem. Lett. 47: 1431-1434.

  238. 238.

    Jensen, J. H. (2019) A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space. Chem. Sci. 10: 3567–3572.


  239. 239.

    Herring, R. H. and M. R. Eden (2015) Evolutionary algorithm for de novo molecular design with multi-dimensional constraints. Comput. Chem. Eng. 83: 267–277.


  240. 240.

    Rupakheti, C., A. Virshup, W. Yang, and D. N. Beratan (2015) Strategy to discover diverse optimal molecules in the small molecule universe. J. Chem. Inf. Model. 55: 529–537.


  241. 241.

    Boolell, M., M. J. Allen, S. A. Ballard, S. Gepi-Attee, G. J. Muirhead, A. M. Naylor, I. H. Osterloh, and C. Gingell (1996) Sildenafil: an orally active type 5 cyclic GMP-specific phosphodiesterase inhibitor for the treatment of penile erectile dysfunction. Int. J. Impot. Res. 8: 47–52.

  242. 242.

    Ning, Y. M., J. L. Gulley, P. M. Arlen, S. Woo, S. M. Steinberg, J. J. Wright, H. L. Parnes, J. B. Trepel, M. J. Lee, Y. S. Kim, H. Sun, R. A. Madan, L. Latham, E. Jones, C. C. Chen, W. D. Figg, and W. L. Dahut (2010) Phase II trial of bevacizumab, thalidomide, docetaxel, and prednisone in patients with metastatic castration-resistant prostate cancer. J. Clin. Oncol. 28: 2070-2076.

  243. 243.

    Singhal, S., J. Mehta, R. Desikan, D. Ayers, P. Roberson, P. Eddlemon, N. Munshi, E. Anaissie, C. Wilson, M. Dhodapkar, J. Zeldis, and B. Barlogie (1999) Antitumor activity of thalidomide in refractory multiple myeloma. N. Engl. J. Med. 341: 1565-1571.

  244. 244.

    D'Amato, R. J., M. S. Loughnan, E. Flynn, and J. Folkman (1994) Thalidomide is an inhibitor of angiogenesis. Proc. Natl. Acad. Sci. USA. 91: 4082–4085.


  245. 245.

    Hameed, P. N., K. Verspoor, S. Kusljic, and S. Halgamuge (2018) A two-tiered unsupervised clustering approach for drug repositioning through heterogeneous data integration. BMC Bioinformatics. 19: 129.


  246. 246.

    Wu, C., R. C. Gudivada, B. J. Aronow, and A. G. Jegga (2013) Computational drug repositioning through heterogeneous network clustering. BMC Syst. Biol. 7: S6.

  247. 247.

    Blondel, V. D., J. L. Guillaume, R. Lambiotte, and E. Lefebvre (2008) Fast unfolding of communities in large networks. J. Stat. Mech. 2008: P10008.

  248. 248.

    Nepusz, T., H. Yu, and A. Paccanaro (2012) Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods. 9: 471–472.


  249. 249.

    Sun, P., J. Guo, R. Winnenburg, and J. Baumbach (2017) Drug repurposing by integrated literature mining and drug-gene-disease triangulation. Drug Discov. Today. 22: 615–619.


  250. 250.

    Chen, H. and Z. Zhang (2018) Prediction of drug-disease associations for drug repositioning through drug-miRNA-disease heterogeneous network. IEEE Access. 6: 45281–45287.


  251. 251.

    Martinez, V., C. Navarro, C. Cano, W. Fajardo, and A. Blanco (2015) DrugNet: network-based drug-disease prioritization by integrating heterogeneous data. Artif. Intell. Med. 63: 41–49.


  252. 252.

    Martinez, V., C. Cano, and A. Blanco (2014) ProphNet: a generic prioritization method through propagation of information. BMC Bioinformatics. 15: S5.


  253. 253.

    Luo, H., J. Wang, M. Li, J. Luo, X. Peng, F. X. Wu, and Y. Pan (2016) Drug repositioning based on comprehensive similarity measures and Bi-Random walk algorithm. Bioinformatics. 32: 2664–2671.


  254. 254.

    Luo, H., M. Li, S. Wang, Q. Liu, Y. Li, and J. Wang (2018) Computational drug repositioning using low-rank matrix approximation and randomized algorithms. Bioinformatics. 34: 1904–1912.


  255. 255.

    Yan, C. K., W. X. Wang, G. Zhang, J. L. Wang, and A. Patel (2019) BiRWDDA: A novel drug repositioning method based on multisimilarity fusion. J. Comput. Biol. 26: 1230–1242.


  256. 256.

    Gottlieb, A., G. Y. Stein, E. Ruppin, and R. Sharan (2011) PREDICT: A method for inferring novel drug indications with application to personalized medicine. Mol. Syst. Biol. 7: 496.


  257. 257.

    Napolitano, F., Y. Zhao, V. M. Moreira, R. Tagliaferri, J. Kere, M. D'Amato, and D. Greco (2013) Drug repositioning: A machine-learning approach through data integration. J. Cheminform. 5: 30.

  258. 258.

    Wang, Y., S. Chen, N. Deng, and Y. Wang (2013) Drug repositioning by kernel-based integration of molecular structure, molecular activity, and phenotype data. PLoS One. 8: e78518.


  259. 259.

    Kim, E., A. S. Choi, and H. Nam (2019) Drug repositioning of herbal compounds via a machine-learning approach. BMC Bioinformatics. 20: 247.


  260. 260.

    Zhang, W., X. Yue, F. Huang, R. Liu, Y. Chen, and C. Ruan (2018) Predicting drug-disease associations and their therapeutic function based on the drug-disease association bipartite network. Methods. 145: 51–59.


  261. 261.

    Le, D. H. and D. Nguyen-Ngoc (2018) Drug repositioning by integrating known disease-gene and drug-target associations in a semi-supervised learning model. Acta Biotheor. 66: 315–331.


  262. 262.

    Xuan, P., Y. Cao, T. Zhang, X. Wang, S. Pan, and T. Shen (2019) Drug repositioning through integration of prior knowledge and projections of drugs and diseases. Bioinformatics. 35: 4108–4119.


  263. 263.

    Wei, X., Y. Zhang, Y. Huang, and Y. Fang (2019) Predicting drug-disease associations by network embedding and biomedical data integration. Data Technol. Appl. 53: 217–229.


  264. 264.

    Moridi, M., M. Ghadirinia, A. Sharifi-Zarchi, and F. Zare-Mirakabad (2019) The assessment of efficient representation of drug features using deep learning for drug repositioning. BMC Bioinformatics. 20: 577.

  265.

    Abdolhosseini, F., B. Azarkhalili, A. Maazallahi, A. Kamal, S. A. Motahari, A. Sharifi-Zarchi, and H. Chitsaz (2019) Cell identity codes: understanding cell identity from gene expression profiles using deep neural networks. Sci. Rep. 9: 2342.

  266.

    Asgari, E. and M. R. K. Mofrad (2015) Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS One. 10: e0141287.

  267.

    Donner, Y., S. Kazmierczak, and K. Fortney (2018) Drug repurposing using deep embeddings of gene expression profiles. Mol. Pharm. 15: 4314–4325.

  268.

    Stathias, V., J. Turner, A. Koleti, D. Vidovic, D. Cooper, M. Fazel-Najafabadi, M. Pilarczyk, R. Terryn, C. Chung, A. Umeano, D. J. B. Clarke, A. Lachmann, J. E. Evangelista, A. Ma'ayan, M. Medvedovic, and S. C. Schurer (2020) LINCS Data Portal 2.0: next generation access point for perturbation-response signatures. Nucleic Acids Res. 48: D431–D439.

  269.

    You, J., R. D. McLeod, and P. Hu (2019) Predicting drug-target interaction network using deep learning model. Comput. Biol. Chem. 80: 90-101.

  270.

    Aliper, A., S. Plis, A. Artemov, A. Ulloa, P. Mamoshina, and A. Zhavoronkov (2016) Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm. 13: 2524-2530.

  271.

    Zeng, X., S. Zhu, X. Liu, Y. Zhou, R. Nussinov, and F. Cheng (2019) deepDR: a network-based deep learning approach to in silico drug repositioning. Bioinformatics. 35: 5191–5198.

  272.

    Xuan, P., L. Zhao, T. Zhang, Y. Ye, and Y. Zhang (2019) Inferring drug-related diseases based on convolutional neural network and gated recurrent unit. Molecules. 24: 2712.

  273.

    Masoudi-Sobhanzadeh, Y., Y. Omidi, M. Amanlou, and A. Masoudi-Nejad (2019) Drug databases and their contributions to drug repurposing. Genomics. 112: 1087–1095.

  274.

    Cheng, F. (2019) In silico oncology drug repositioning and polypharmacology. Methods Mol. Biol. 1878: 243–261.

  275.

    March-Vila, E., L. Pinzi, N. Sturm, A. Tinivella, O. Engkvist, H. Chen, and G. Rastelli (2017) On the integration of in silico drug design methods for drug repurposing. Front. Pharmacol. 8: 298.

  276.

    Fleuren, W. W. M. and W. Alkema (2015) Application of text mining in the biomedical domain. Methods. 74: 97–106.

  277.

    Nugent, T., V. Plachouras, and J. L. Leidner (2016) Computational drug repositioning based on side-effects mined from social media. PeerJ Comput. Sci. 2: e46.

  278.

    Rastegar-Mojarad, M., R. K. Elayavilli, D. Li, R. Prasad, and H. Liu (2015) A new method for prioritizing drug repositioning candidates extracted by literature-based discovery. Proceedings of 2015 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2015. November 9-12. Washington, DC, USA.

  279.

    Su, E. W. and T. M. Sanger (2017) Systematic drug repositioning through mining adverse event data in ClinicalTrials.gov. PeerJ. 5: e3154.

  280.

    Park, K. (2019) A review of computational drug repurposing. Transl. Clin. Pharmacol. 27: 59–63.

  281.


  282.

    Douguet, D. (2018) Data sets representative of the structures and experimental properties of FDA-approved drugs. ACS Med. Chem. Lett. 9: 204–209.

  283.

    Kim, S., P. A. Thiessen, E. E. Bolton, J. Chen, G. Fu, A. Gindulyte, L. Han, J. He, S. He, B. A. Shoemaker, J. Wang, B. Yu, J. Zhang, and S. H. Bryant (2016) PubChem substance and compound databases. Nucleic Acids Res. 44: D1202-D1213.

  284.

    Williams, A. J. (2008) Internet-based tools for communication and collaboration in chemistry. Drug Discov. Today. 13: 502–506.

  285.

    Ursu, O., J. Holmes, C. G. Bologa, J. J. Yang, S. L. Mathias, V. Stathias, D. T. Nguyen, S. Schurer, and T. Oprea (2019) DrugCentral 2018: an update. Nucleic Acids Res. 47: D963-D970.

  286.

    Ursu, O., J. Holmes, J. Knockel, C. G. Bologa, J. J. Yang, S. L. Mathias, S. J. Nelson, and T. I. Oprea (2017) DrugCentral: online drug compendium. Nucleic Acids Res. 45: D932-D939.

  287.


  288.

    Kuhn, M., I. Letunic, L. J. Jensen, and P. Bork (2016) The SIDER database of drugs and side effects. Nucleic Acids Res. 44: D1075–D1079.

  289.

    Tatonetti, N. P., P. P. Ye, R. Daneshjou, and R. B. Altman (2012) Data-driven prediction of drug effects and interactions. Sci. Transl. Med. 4: 125ra31.

  290.

    Fang, H., Z. Su, Y. Wang, A. Miller, Z. Liu, P. C. Howard, W. Tong, and S. M. Lin (2014) Exploring the FDA adverse event reporting system to generate hypotheses for monitoring of disease characteristics. Clin. Pharmacol. Ther. 95: 496-498.

  291.

    Cai, M. C., Q. Xu, Y. J. Pan, W. Pan, N. Ji, Y. B. Li, H. J. Jin, K. Liu, and Z. L. Ji (2015) ADReCS: An ontology database for aiding standardization and hierarchical classification of adverse drug reaction terms. Nucleic Acids Res. 43: D907-D913.

  292.

    Subramanian, A., R. Narayan, S. M. Corsello, D. D. Peck, T. E. Natoli, X. Lu, J. Gould, J. F. Davis, A. A. Tubelli, J. K. Asiedu, D. L. Lahr, J. E. Hirschman, Z. Liu, M. Donahue, B. Julian, M. Khan, D. Wadden, I. C. Smith, D. Lam, A. Liberzon, C. Toder, M. Bagul, M. Orzechowski, O. M. Enache, F. Piccioni, S. A. Johnson, N. J. Lyons, A. H. Berger, A. F. Shamji, A. N. Brooks, A. Vrcic, C. Flynn, J. Rosains, D. Y. Takeda, R. Hu, D. Davison, J. Lamb, K. Ardlie, L. Hogstrom, P. Greenside, N. S. Gray, P. A. Clemons, S. Silver, X. Wu, W. N. Zhao, W. Read-Button, X. Wu, S. J. Haggarty, L. V. Ronco, J. S. Boehm, S. L. Schreiber, J. G. Doench, J. A. Bittker, D. E. Root, B. Wong, and T. R. Golub (2017) A next generation Connectivity Map: L1000 platform and the first 1,000,000 profiles. Cell. 171: 1437-1452.e17.

  293.

    Barrett, T., D. B. Troup, S. E. Wilhite, P. Ledoux, D. Rudnev, C. Evangelista, I. F. Kim, A. Soboleva, M. Tomashevsky, and R. Edgar (2007) NCBI GEO: Mining tens of millions of expression profiles - Database and tools update. Nucleic Acids Res. 35: D760-D765.

  294.

    Barrett, T., T. O. Suzek, D. B. Troup, S. E. Wilhite, W. C. Ngau, P. Ledoux, D. Rudnev, A. E. Lash, W. Fujibuchi, and R. Edgar (2005) NCBI GEO: Mining millions of expression profiles - Database and tools. Nucleic Acids Res. 33: D562–D566.

  295.

    Parkinson, H., M. Kapushesky, M. Shojatalab, N. Abeygunawardena, R. Coulson, A. Farne, E. Holloway, N. Kolesnykov, P. Lilja, M. Lukk, R. Mani, T. Rayner, A. Sharma, E. William, U. Sarkans, and A. Brazma (2007) ArrayExpress - A public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 35: D747–D750.

  296.

    Yang, W., J. Soares, P. Greninger, E. J. Edelman, H. Lightfoot, S. Forbes, N. Bindal, D. Beare, J. A. Smith, I. R. Thompson, S. Ramaswamy, P. A. Futreal, D. A. Haber, M. R. Stratton, C. Benes, U. McDermott, and M. J. Garnett (2013) Genomics of Drug Sensitivity in Cancer (GDSC): A resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 41: D955-D961.

  297.

    Bodenreider, O. (2004) The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 32: D267–D270.

  298.

    Rogers, F. B. (1963) Medical subject headings. Bull. Med. Libr. Assoc. 51: 114–116.

  299.

    Piñero, J., N. Queralt-Rosinach, À. Bravo, J. Deu-Pons, A. Bauer-Mehren, M. Baron, F. Sanz, and L. I. Furlong (2015) DisGeNET: A discovery platform for the dynamical exploration of human diseases and their genes. Database. 2015: bav028.

  300.

    Ogata, H., S. Goto, K. Sato, W. Fujibuchi, H. Bono, and M. Kanehisa (1999) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 27: 29–34.

  301.

    Hewett, M., D. E. Oliver, D. L. Rubin, K. L. Easton, J. M. Stuart, R. B. Altman, and T. E. Klein (2002) PharmGKB: the pharmacogenetics knowledge base. Nucleic Acids Res. 30: 163-165.

  302.

    Tate, J. G., S. Bamford, H. C. Jubb, Z. Sondka, D. M. Beare, N. Bindal, H. Boutselakis, C. G. Cole, C. Creatore, E. Dawson, P. Fish, B. Harsha, C. Hathaway, S. C. Jupe, C. Y. Kok, K. Noble, L. Ponting, C. C. Ramshaw, C. E. Rye, H. E. Speedy, R. Stefancsik, S. L. Thompson, S. Wang, S. Ward, P. J. Campbell, and S. A. Forbes (2019) COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Res. 47: D941-D947.

  303.

    Lappalainen, I., J. Lopez, L. Skipper, T. Hefferon, J. D. Spalding, J. Garner, C. Chen, M. Maguire, M. Corbett, G. Zhou, J. Paschall, V. Ananiev, P. Flicek, and D. M. Church (2013) DbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 41: D936-D941.

  304.

    Mailman, M. D., M. Feolo, Y. Jin, M. Kimura, K. Tryka, R. Bagoutdinov, L. Hao, A. Kiang, J. Paschall, L. Phan, N. Popova, S. Pretel, L. Ziyabari, M. Lee, Y. Shao, Z. Y. Wang, K. Sirotkin, M. Ward, M. Kholodov, K. Zbicz, J. Beck, M. Kimelman, S. Shevelev, D. Preuss, E. Yaschenko, A. Graeff, J. Ostell, and S. T. Sherry (2007) The NCBI dbGaP database of genotypes and phenotypes. Nat. Genet. 39: 1181-1186.

  305.

    Smigielski, E. M., K. Sirotkin, M. Ward, and S. T. Sherry (2000) dbSNP: a database of single nucleotide polymorphisms. Nucleic Acids Res. 28: 352–355.

  306.

    Liu, Z., M. Su, L. Han, J. Liu, Q. Yang, Y. Li, and R. Wang (2017) Forging the basis for developing protein-ligand interaction scoring functions. Acc. Chem. Res. 50: 302-309.

  307.

    Su, M., Q. Yang, Y. Du, G. Feng, Z. Liu, Y. Li, and R. Wang (2019) Comparative assessment of scoring functions: The CASF-2016 update. J. Chem. Inf. Model. 59: 895-913.

  308.

    Mysinger, M. M., M. Carchia, J. J. Irwin, and B. K. Shoichet (2012) Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J. Med. Chem. 55: 6582-6594.

  309.

    Carlson, H. A., R. D. Smith, K. L. Damm-Ganamet, J. A. Stuckey, A. Ahmed, M. A. Convery, D. O. Somers, M. Kranz, P. A. Elkins, G. Cui, C. E. Peishoff, M. H. Lambert, and J. B. Dunbar Jr. (2016) CSAR 2014: A benchmark exercise using unpublished data from pharma. J. Chem. Inf. Model. 56: 1063-1077.

  310.

    Kim, S., J. Chen, T. Cheng, A. Gindulyte, J. He, S. He, Q. Li, B. A. Shoemaker, P. A. Thiessen, B. Yu, L. Zaslavsky, J. Zhang, and E. E. Bolton (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 47: D1102-D1109.

  311.

    Mendez, D., A. Gaulton, A. P. Bento, J. Chambers, M. De Veij, E. Felix, M. P. Magarinos, J. F. Mosquera, P. Mutowo, M. Nowotka, M. Gordillo-Maranon, F. Hunter, L. Junco, G. Mugumbate, M. Rodriguez-Lopez, F. Atkinson, N. Bosc, C. J. Radoux, A. Segura-Cabrera, A. Hersey, and A. R. Leach (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 47: D930-D940.

  312.

    Gilson, M. K., T. Liu, M. Baitaluk, G. Nicola, L. Hwang, and J. Chong (2016) BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 44: D1045–D1053.

  313.

    Wishart, D. S., Y. D. Feunang, A. C. Guo, E. J. Lo, A. Marcu, J. R. Grant, T. Sajed, D. Johnson, C. Li, Z. Sayeeda, N. Assempour, I. Iynkkaran, Y. Liu, A. Maciejewski, N. Gale, A. Wilson, L. Chin, R. Cummings, D. Le, A. Pon, C. Knox, and M. Wilson (2018) DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46: D1074-D1082.

  314.

    Kanehisa, M., M. Furumichi, M. Tanabe, Y. Sato, and K. Morishima (2017) KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45: D353–D361.

  315.

    Alexander, S. P. H., H. E. Benson, E. Faccenda, A. J. Pawson, J. L. Sharman, J. C. McGrath, W. A. Catterall, M. Spedding, J. A. Peters, A. J. Harmar, and CGTP Collaborators (2013) The concise guide to PHARMACOLOGY 2013/14: overview. Br. J. Pharmacol. 170: 1449-1458.

  316.

    Hecker, N., J. Ahmed, J. von Eichborn, M. Dunkel, K. Macha, A. Eckert, M. K. Gilson, P. E. Bourne, and R. Preissner (2012) SuperTarget goes quantitative: update on drug-target interactions. Nucleic Acids Res. 40: D1113-D1117.

  317.

    Gunther, S., M. Kuhn, M. Dunkel, M. Campillos, C. Senger, E. Petsalaki, J. Ahmed, E. G. Urdiales, A. Gewiess, L. J. Jensen, R. Schneider, R. Skoblo, R. B. Russell, P. E. Bourne, P. Bork, and R. Preissner (2008) SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res. 36: D919-D922.

  318.

    Kuhn, M., C. von Mering, M. Campillos, L. J. Jensen, and P. Bork (2008) STITCH: interaction networks of chemicals and proteins. Nucleic Acids Res. 36: D684–D688.

  319.

    Yang, H., C. Lou, L. Sun, J. Li, Y. Cai, Z. Wang, W. Li, G. Liu, and Y. Tang (2019) admetSAR 2.0: web-service for prediction and optimization of chemical ADMET properties. Bioinformatics. 35: 1067–1069.

  320.

    Tomasulo, P. (2002) ChemIDplus-super source for chemical and drug information. Med. Ref. Serv. Q. 21: 53–59.

  321.

    Richard, A. M., R. S. Judson, K. A. Houck, C. M. Grulke, P. Volarath, I. Thillainadarajah, C. Yang, J. Rathman, M. T. Martin, J. F. Wambaugh, T. B. Knudsen, J. Kancherla, K. Mansouri, G. Patlewicz, A. J. Williams, S. B. Little, K. M. Crofton, and R. S. Thomas (2016) ToxCast chemical landscape: Paving the road to 21st century toxicology. Chem. Res. Toxicol. 29: 1225-1251.

  322.

    Tox21 Challenge.

  323.

    Watford, S., L. Ly Pham, J. Wignall, R. Shin, M. T. Martin, and K. P. Friedman (2019) ToxRefDB version 2.0: Improved utility for predictive and retrospective toxicology analyses. Reprod. Toxicol. 89: 145-158.

  324.

    Sterling, T. and J. J. Irwin (2015) ZINC 15 - ligand discovery for everyone. J. Chem. Inf. Model. 55: 2324–2337.

  325.

    Blum, L. C. and J. L. Reymond (2009) 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J. Am. Chem. Soc. 131: 8732–8733.

  326.

    Ruddigkeit, L., R. van Deursen, L. C. Blum, and J. L. Reymond (2012) Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52: 2864–2875.

  327.

    Ramakrishnan, R., P. O. Dral, M. Rupp, and O. A. von Lilienfeld (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data. 1: 140022.

  328.

    Visini, R., M. Awale, and J. L. Reymond (2017) Fragment database FDB-17. J. Chem. Inf. Model. 57: 700-709.

  329.

    Sun, J., N. Jeliazkova, V. Chupakin, J. F. Golib-Dzib, O. Engkvist, L. Carlsson, J. Wegner, H. Ceulemans, I. Georgiev, V. Jeliazkov, N. Kochev, T. J. Ashby, and H. Chen (2017) ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. J. Cheminform. 9: 17.

  330.

    Messenger, A. G. and J. Rundegren (2004) Minoxidil: Mechanisms of action on hair growth. Br. J. Dermatol. 150: 186–194.

  331.

    Steinbach, G., P. M. Lynch, R. K. Phillips, M. H. Wallace, E. Hawk, G. B. Gordon, N. Wakabayashi, B. Saunders, Y. Shen, T. Fujimura, L. K. Su, B. Levin, L. Godio, S. Patterson, M. A. Rodriguez-Bigas, S. L. Jester, K. L. King, M. Schumacher, J. Abbruzzese, R. N. DuBois, W. N. Hittelman, S. Zimmerman, J. W. Sherman, and G. Kelloff (2000) The effect of celecoxib, a cyclooxygenase-2 inhibitor, in familial adenomatous polyposis. N. Engl. J. Med. 342: 1946-1952.

  332.

    Von Eichborn, J., M. S. Murgueitio, M. Dunkel, S. Koerner, P. E. Bourne, and R. Preissner (2011) PROMISCUOUS: A database for network-based drug-repositioning. Nucleic Acids Res. 39: D1060–D1066.

  333.

    Luo, H., P. Zhang, X. H. Cao, D. Du, H. Ye, H. Huang, C. Li, S. Qin, C. Wan, L. Shi, L. He, and L. Yang (2016) DPDR-CPI, a server that predicts drug positioning and drug repositioning via chemical-protein interactome. Sci. Rep. 6: 35996.

  334.

    Brown, A. S. and C. J. Patel (2017) A standard database for drug repositioning. Sci. Data. 4: 170029.

  335.

    Shameer, K., B. S. Glicksberg, R. Hodos, K. W. Johnson, M. A. Badgeley, B. Readhead, M. S. Tomlinson, T. O'Connor, R. Miotto, B. A. Kidd, R. Chen, A. Ma'ayan, and J. T. Dudley (2018) Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning. Brief Bioinform. 19: 656–678.

  336.

    Cotto, K. C., A. H. Wagner, Y. Y. Feng, S. Kiwala, A. C. Coffman, G. Spies, A. Wollam, N. C. Spies, O. L. Griffith, and M. Griffith (2018) DGIdb 3.0: A redesign and expansion of the drug-gene interaction database. Nucleic Acids Res. 46: D1068–D1073.

  337.

    Kohler, S., L. Carmody, N. Vasilevsky, J. O. B. Jacobsen, D. Danis, J. P. Gourdine, M. Gargano, N. L. Harris, N. Matentzoglu, J. A. McMurry, D. Osumi-Sutherland, V. Cipriani, J. P. Balhoff, T. Conlin, H. Blau, G. Baynam, R. Palmer, D. Gratian, H. Dawkins, M. Segal, A. C. Jansen, A. Muaz, W. H. Chang, J. Bergerson, S. J. F. Laulederkind, Z. Yuksel, S. Beltran, A. F. Freeman, P. I. Sergouniotis, D. Durkin, A. L. Storm, M. Hanauer, M. Brudno, S. M. Bello, M. Sincan, K. Rageth, M. T. Wheeler, R. Oegema, H. Lourghi, M. G. Della Rocca, R. Thompson, F. Castellanos, J. Priest, C. Cunningham-Rundles, A. Hegde, R. C. Lovering, C. Hajek, A. Olry, L. Notarangelo, M. Similuk, X. A. Zhang, D. Gomez-Andres, H. Lochmuller, H. Dollfus, S. Rosenzweig, S. Marwaha, A. Rath, K. Sullivan, C. Smith, J. D. Milner, D. Leroux, C. F. Boerkoel, A. Klion, M. C. Carter, T. Groza, D. Smedley, M. A. Haendel, C. Mungall, and P. N. Robinson (2019) Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res. 47: D1018–D1027.


This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (NRF-2020R1A2C2004628) and by the Bio-Synergy Research Project (NRF-2017M3A9C4092978) of the Ministry of Science, ICT.

Author information



Corresponding author

Correspondence to Hojung Nam.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors declare no conflict of interest.

Neither ethical approval nor informed consent was required for this study.

About this article

Cite this article

Kim, H., Kim, E., Lee, I. et al. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioproc E 25, 895–930 (2020).


  • drug discovery
  • artificial intelligence
  • data-driven
  • machine learning