Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches

Kim, Hyunho; Kim, Eunyoung; Lee, Ingoo; Bae, Bongsung; Park, Minsu; Nam, Hojung

doi:10.1007/s12257-020-0049-y

Review Paper
Published: 07 January 2021

Volume 25, pages 895–930, (2020)

Biotechnology and Bioprocess Engineering

Abstract

As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered a viable solution for unstable drug research and development. AI is generally applied to fields with sufficient data, such as computer vision and natural language processing, but many efforts are under way to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized. In addition, the review provides an in-depth analysis of the remaining challenges and limitations and proposes promising future directions in each of the aforementioned areas.

References

DiMasi, J. A., H. G. Grabowski, and R. W. Hansen (2016) Innovation in the pharmaceutical industry: New estimates of R&D costs. J. Health Econ. 47: 20–33.
Article Google Scholar
Paul, S. M., D. S. Mytelka, C. T. Dunwiddie, C. C. Persinger, B. H. Munos, S. R. Lindborg, and A. L. Schacht (2010) How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nat. Rev. Drug Discov. 9: 203–214.
Article CAS Google Scholar
van de Waterbeemd, H. and E. Gifford (2003) ADMET in silico modelling: towards prediction paradise? Nat. Rev. Drug Discov. 2: 192–204.
Article Google Scholar
Mak, K. K. and M. R. Pichika (2019) Artificial intelligence in drug development: present status and future prospects. Drug Discov. Today. 24: 773–780.
Article Google Scholar
Yang, X., Y. Wang, R. Byrne, G. Schneider, and S. Yang (2019) Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119: 10520–10594.
Article CAS Google Scholar
Eder, J., R. Sedrani, and C. Wiesmann (2014) The discovery of first-in-class drugs: origins and evolution. Nat. Rev. Drug Discov. 13: 577–587.
Article CAS Google Scholar
Brown, D. (2007) Unfinished business: target-based drug discovery. Drug Discov. Today 12: 1007–1012.
Article CAS Google Scholar
Hsu, Y. H., J. Yao, L. C. Chan, T. J. Wu, J. L. Hsu, Y. F. Fang, Y. Wei, Y. Wu, W. C. Huang, C. L. Liu, Y. C. Chang, M. Y. Wang, C. W. Li, J. Shen, M. K. Chen, A. A. Sahin, A. Sood, G. B. Mills, D. Yu, G. N. Hortobagyi, and M. C. Hung (2014) Definition of PKC-a, CDK6, and MET as therapeutic targets in triple-negative breast cancer. Cancer Res. 74: 4822–4835.
Article CAS Google Scholar
Chen, B. and A. Butte (2016) Leveraging big data to transform target selection and drug discovery. Clin. Pharmacol. Ther. 99: 285–297.
Article Google Scholar
Kodama, K., M. Horikoshi, K. Toda, S. Yamada, K. Hara, J. Irie, M. Sirota, A. A. Morgan, R. Chen, H. Ohtsu, S. Maeda, T. Kadowaki, and A. J. Butte (2012) Expression-based genomewide association study links the receptor CD44 in adipose tissue with type 2 diabetes. Proc. Natl. Acad. Sci. USA. 109: 7049-7054.
Zhu, Z., F. Zhang, H. Hu, A. Bakshi, M. R. Robinson, J. E. Powell, G. W. Montgomery, M. E. Goddard, N. R. Wray, P. M. Visscher, and J. Yang (2016) Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48: 481-487.
van Dam, S., U. Võsa, A. van der Graaf, L. Franke, and J. P. de Magalhães (2018) Gene co-expression analysis for functional classification and gene-disease predictions. Brief. Bioinform. 19: 575–592.
Google Scholar
Petyuk, V. A., R. Chang, M. Ramirez-Restrepo, N. D. Beckmann, M. Y. R. Henrion, P. D. Piehowski, K. Zhu, S. Wang, J. Clarke, M. J. Huentelman, F. Xie, V. Andreev, A. Engel, T. Guettoche, L. Navarro, P. De Jager, J. A. Schneider, C. M. Morris, I. G. McKeith, R. H. Perry, S. Lovestone, R. L. Woltjer, T. G. Beach, L. I. Sue, G. E. Serrano, A. P. Lieberman, R. L. Albin, I. Ferrer, D. C. Mash, C. M. Hulette, J. F. Ervin, E. M. Reiman, J. A. Hardy, D. A. Bennett, E. Schadt, R. D. Smith, and A. J. Myers (2018) The human brainome: network analysis identifies HSPA2 as a novel Alzheimer's disease target. Brain. 141: 2721-2739.
Lee, S., C. Zhang, Z. Liu, M. Klevstig, B. Mukhopadhyay, M. Bergentall, R. Cinar, M. Ståhlman, N. Sikanic, J. K. Park, S. Deshmukh, A. M. Harzandi, T. Kuijpers, M. Grøtli, S. J. Elsässer, B. D. Piening, M. Snyder, U. Smith, J. Nielsen, F. Bäckhed, G. Kunos, M. Uhlen, J. Boren, and A. Mardinoglu (2017) Network analyses identify liver-specific targets for treating liver diseases. Mol. Syst. Biol. 13: 938.
Zou, Q., J. Li, L. Song, X. Zeng, and G. Wang (2016) Similarity computation strategies in the microRNA-disease network: a survey. Brief. Funct. Genomics. 15: 55–64.
Google Scholar
Chen, X., D. Xie, L. Wang, Q. Zhao, Z. H. You, and H. Liu (2018) BNPMDA: Bipartite Network Projection for MiRNADisease Association prediction. Bioinformatics. 34: 3178–3186.
Article CAS Google Scholar
Ding, P., J. Luo, C. Liang, Q. Xiao, and B. Cao (2018) Human disease MiRNA inference by combining target information based on heterogeneous manifolds. J. Biomed. Inform. 80: 26–36.
Article Google Scholar
Mohamed, S. K., V. Novácek, and A. Nounu (2020) Discovering protein drug targets using knowledge graph embeddings. Bioinformatics. 36: 603–610.
Google Scholar
Richardson, P., I. Griffin, C. Tucker, D. Smith, O. Oechsle, A. Phelan, M. Rawling, E. Savory, and J. Stebbing (2020) Baricitinib as potential treatment for 2019-nCoV acute respiratory disease. Lancet. 395: e30–e31.
Article Google Scholar
Segler, M. H. S., M. Preuss, and M. P. Waller (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature. 555: 604–610.
Article CAS Google Scholar
Ferrero, E., I. Dunham, and P. Sanseau (2017) In silico prediction of novel therapeutic targets using gene-disease association data. J. Transl. Med. 15: 182.
Article CAS Google Scholar
Mamoshina, P., M. Volosnikova, I. V. Ozerov, E. Putin, E. Skibina, F. Cortese, and A. Zhavoronkov (2018) Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification. Front. Genet. 9: 242.
Piñero, J., Á. Bravo, N. Queralt-Rosinach, A. Gutiérrez- Sacristán, J. Deu-Pons, E. Centeno, J. García-García, F. Sanz, and L. I. Furlong (2017) DisGeNET: A comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 45: D833-D839.
Stoeger, T., M. Gerlach, R. I. Morimoto, and L. A. Nunes Amaral (2018) Large-scale investigation of the reasons why potentially important genes are ignored. PLoS Biol. 16: e2006643.
Piñero, J., J. M. Ramírez-Anguita, J. Saüch-Pitarch, F. Ronzano, E. Centeno, F. Sanz, and L. I. Furlong (2020) The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 48: D845-D855.
Davis, A. P., C. J. Grondin, R. J. Johnson, D. Sciaky, R. McMorran, J. Wiegers, T. C. Wiegers, and C. J. Mattingly (2019) The Comparative Toxicogenomics Database: update 2019. Nucleic Acids Res. 47: D948-D954.
Vasaikar, S. V., J. Wang, and B. Zhang (2018) LinkedOmics: analyzing multi-omics data within and across 32 cancer types. Nucleic Acids Res. 46: D956–D963.
Carvalho-Silva, D., A. Pierleoni, M. Pignatelli, C. Ong, L. Fumis, N. Karamanis, M. Carmona, A. Faulconbridge, A. Hercules, E. McAuley, A. Miranda, G. Peat, M. Spitzer, J. Barrett, D. G. Hulcoop, E. Papa, G. Koscielny, and I. Dunham (2019) Open Targets Platform: new developments and updates two years on. Nucleic Acids Res. 47: D1056-D1065.
Brown, K. K., M. M. Hann, A. S. Lakdawala, R. Santos, P. J. Thomas, and K. Todd (2018) Approaches to target tractability assessment - a practical perspective. Medchemcomm. 9: 606–613.
Article Google Scholar
Huang, Z., J. Shi, Y. Gao, C. Cui, S. Zhang, J. Li, Y. Zhou, and Q. Cui (2019) HMDD v3.0: a database for experimentally supported human microRNA-disease associations. Nucleic Acids Res. 47: D1013–D1017.
Article CAS Google Scholar
DepMap portal. https://depmap.org/portal/.
Meyers, R. M., J. G. Bryan, J. M. McFarland, B. A. Weir, A. E. Sizemore, H. Xu, N. V. Dharia, P. G. Montgomery, G. S. Cowley, S. Pantel, A. Goodale, Y. Lee, L. D. Ali, G. Jiang, R. Lubonja, W. F. Harrington, M. Strickland, T. Wu, D. C. Hawes, V. A. Zhivich, M. R. Wyatt, Z. Kalani, J. J. Chang, M. Okamoto, K. Stegmaier, T. R. Golub, J. S. Boehm, F. Vazquez, D. E. Root, W. C. Hahn, and A. Tsherniak (2017) Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nat. Genet. 49: 1779-1784.
Tsherniak, A., F. Vazquez, P. G. Montgomery, B. A. Weir, G. Kryukov, G. S. Cowley, S. Gill, W. F. Harrington, S. Pantel, J. M. Krill-Burger, R. M. Meyers, L. Ali, A. Goodale, Y. Lee, G. Jiang, J. Hsiao, W. F. J. Gerath, S. Howell, E. Merkel, M. Ghandi, L. A. Garraway, D. E. Root, T. R. Golub, J. S. Boehm, and W. C. Hahn (2017) Defining a cancer dependency map. Cell. 170: 564-576.e16.
Barretina, J., G. Caponigro, N. Stransky, K. Venkatesan, A. A. Margolin, S. Kim, C. J. Wilson, J. Lehár, G. V. Kryukov, D. Sonkin, A. Reddy, M. Liu, L. Murray, M. F. Berger, J. E. Monahan, P. Morais, J. Meltzer, A. Korejwa, J. Jané-Valbuena, F. A. Mapa, J. Thibault, E. Bric-Furlong, P. Raman, A. Shipway, I. H. Engels, J. Cheng, G. K. Yu, J. Yu, P. Aspesi, M. de Silva, K. Jagtap, M. D. Jones, L. Wang, C. Hatton, E. Palescandolo, S. Gupta, S. Mahan, C. Sougnez, R. C. Onofrio, T. Liefeld, L. MacConaill, W. Winckler, M. Reich, N. Li, J. P. Mesirov, S. B. Gabriel, G. Getz, K. Ardlie, V. Chan, V. E. Myer, B. L. Weber, J. Porter, M. Warmuth, P. Finan, J. L. Harris, M. Meyerson, T. R. Golub, M. P. Morrissey, W. R. Sellers, R. Schlegel, and L. A. Garraway (2012) The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 483: 603-607.
Stransky, N., M. Ghandi, G. V. Kryukov, L. A. Garraway, J. Lehár, M. Liu, D. Sonkin, A. Kauffmann, K. Venkatesan, E. J. Edelman, M. Riester, J. Barretina, G. Caponigro, R. Schlegel, W. R. Sellers, F. Stegmeier, M. Morrissey, A. Amzallag, I. Pruteanu-Malinici, D. A. Haber, S. Ramaswamy, C. H. Benes, M. P. Menden, F. Iorio, M. R. Stratton, U. McDermott, M. J. Garnett, and J. Saez-Rodriguez (2015) Pharmacogenomic agreement between two cancer cell line data sets. Nature. 528: 84-87.
Ghandi, M., F. W. Huang, J. Jané-Valbuena, G. V. Kryukov, C. C. Lo, E. R. McDonald, J. Barretina, E. T. Gelfand, C. M. Bielski, H. Li, K. Hu, A. Y. Andreev-Drakhlin, J. Kim, J. M. Hess, B. J. Haas, F. Aguet, B. A. Weir, M. V. Rothberg, B. R. Paolella, M. S. Lawrence, R. Akbani, Y. Lu, H. L. Tiv, P. C. Gokhale, A. de Weck, A. A. Mansour, C. Oh, J. Shih, K. Hadi, Y. Rosen, J. Bistline, K. Venkatesan, A. Reddy, D. Sonkin, M. Liu, J. Lehar, J. M. Korn, D. A. Porter, M. D. Jones, J. Golji, G. Caponigro, J. E. Taylor, C. M. Dunning, A. L. Creech, A. C. Warren, J. M. McFarland, M. Zamanighomi, A. Kauffmann, N. Stransky, M. Imielinski, Y. E. Maruvka, A. D. Cherniack, A. Tsherniak, F. Vazquez, J. D. Jaffe, A. A. Lane, D. M. Weinstock, C. M. Johannessen, M. P. Morrissey, F. Stegmeier, R. Schlegel, W. C. Hahn, G. Getz, G. B. Mills, J. S. Boehm, T. R. Golub, L. A. Garraway, and W. R. Sellers (2019) Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature. 569: 503-508.
Yu, C., A. M. Mannan, G. M. Yvone, K. N. Ross, Y. L. Zhang, M. A. Marton, B. R. Taylor, A. Crenshaw, J. Z. Gould, P. Tamayo, B. A. Weir, A. Tsherniak, B. Wong, L. A. Garraway, A. F. Shamji, M. A. Palmer, M. A. Foley, W. Winckler, S. L. Schreiber, A. L. Kung, and T. R. Golub (2016) High-throughput identification of genotype-specific cancer vulnerabilities in mixtures of barcoded tumor cell lines. Nat. Biotechnol. 34: 419-423.
Szklarczyk, D., A. L. Gable, D. Lyon, A. Junge, S. Wyder, J. Huerta-Cepas, M. Simonovic, N. T. Doncheva, J. H. Morris, P. Bork, L. J. Jensen, and C. V. Mering (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47: D607-D613.
Wang, Y., S. Zhang, F. Li, Y. Zhou, Y. Zhang, Z. Wang, R. Zhang, J. Zhu, Y. Ren, Y. Tan, C. Qin, Y. Li, X. Li, Y. Chen, and F. Zhu (2020) Therapeutic target database 2020: enriched resource for facilitating research and early development of targeted therapeutics. Nucleic Acids Res. 48: D1031-D1041.
Pearson, N., K. Malki, D. Evans, L. Vidler, C. Ruble, J. Scherschel, B. Eastwood, and D. A. Collier (2019) TractaViewer: a genome-wide tool for preliminary assessment of therapeutic target druggability. Bioinformatics. 35: 4509–4510.
Article CAS Google Scholar
Keiser, M. J., V. Setola, J. J. Irwin, C. Laggner, A. I. Abbas, S. J. Hufeisen, N. H. Jensen, M. B. Kuijer, R. C. Matos, T. B. Tran, R. Whaley, R. A. Glennon, J. Hert, K. L. H. Thomas, D. D. Edwards, B. K. Shoichet, and B. L. Roth (2009) Predicting new molecular targets for known drugs. Nature. 462: 175-181.
Morris, G. M., R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell, and A. J. Olson (2009) AutoDock4 and AutoDockTools4: Automated docking with selective Receptor flexibility. J. Comput. Chem. 30: 2785–2791.
Article CAS Google Scholar
Trott, O. and A. J. Olson (2010) AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31: 455–461.
Google Scholar
Koes, D. R., M. P. Baumgartner, and C. J. Camacho (2013) Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. J. Chem. Inf. Model. 53: 1893–1904.
Article CAS Google Scholar
Ballester, P. J. and J. B. O. Mitchell (2010) A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 26: 1169–1175.
Article CAS Google Scholar
Li, L., B. Wang, and S. O. Meroueh (2011) Support vector regression scoring of receptor-ligand complexes for rankordering and virtual screening of chemical libraries. J. Chem. Inf. Model. 51: 2132–2138.
Article CAS Google Scholar
Ragoza, M., J. Hochuli, E. Idrobo, J. Sunseri, and D. R. Koes (2017) Protein-ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57: 942–957.
Article CAS Google Scholar
Jimenez, J., M. Skalic, G. Martinez-Rosell, and G. De Fabritiis (2018) KDEEP: Protein-ligand absolute binding affinity prediction via 3D-convolutional neural networks. J. Chem. Inf. Model. 58: 287-296.
Imrie, F., A. R. Bradley, M. van der Schaar, and C. M. Deane (2018) Protein family-specific models using deep neural networks and transfer learning improve virtual screening and highlight the need for more data. J. Chem. Inf. Model. 58: 2319-2330.
Stepniewska-Dziubinska, M. M., P. Zielenkiewicz, and P. Siedlecki (2018) Development and evaluation of a deep learning model for protein-ligand binding affinity prediction. Bioinformatics. 34: 3666–3674.
Article CAS Google Scholar
Tian, K., M. Shao, Y. Wang, J. Guan, and S. Zhou (2016) Boosting compound-protein interaction prediction by deep learning. Methods. 110: 64–72.
Article CAS Google Scholar
Feinberg, E. N., D. Sur, Z. Wu, B. E. Husic, H. Mai, Y. Li, S. Sun, J. Yang, B. Ramsundar, and V. S. Pande (2018) PotentialNet for molecular property prediction. ACS Cent. Sci. 4: 1520-1530.
Lim, J., S. Ryu, K. Park, Y. J. Choe, J. Ham, and W. Y. Kim (2019) Predicting drug-target interaction using a novel graph neural network with 3D structure-embedded graph representation. J. Chem. Inf. Model. 59: 3981-3988.
Landrum, G., B. Kelley, P. Tosco, sriniker, gedeck, NadineSchneider, R. Vianello, A. Dalke, AlexanderSavelyev, S. Turk, B. Cole, M. Swain, A. Vaucher, M. Wójcikowski, A. Pahl, JP, strets123, JLVarjo, P. Fuller, DoliathGavid, N. O'Boyle, P. P. Zarrinkar, G. Sforna, M. Nowotka, pzc, J. van Santen, J. H. Jensen, J. Domanski, D. Hall, and P. Avery (2018) rdkit/rdkit: 2018_03_1 (Q1 2018) Release. Zenodo. https://doi.org/10.5281/zenodo.1222070.
O'Boyle, N. M., M. Banck, C. A. James, C. Morley, T. Vandermeersch, and G. R. Hutchison (2011) Open Babel: An open chemical toolbox. J. Cheminform. 3: 33.
Article CAS Google Scholar
Willighagen, E. L., J. W. Mayfield, J. Alvarsson, A. Berg, L. Carlsson, N. Jeliazkova, S. Kuhn, T. Pluskal, M. Rojas-Cherto, O. Spjuth, G. Torrance, C. T. Evelo, R. Guha, and C. Steinbeck (2017) Erratum to: The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J. Cheminform. 9: 53.
Yap, C. W. (2011) PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J. Comput. Chem. 32: 1466–1474.
Article CAS Google Scholar
Mauri, A., V. Consonni, M. Pavan, and R. Todeschini (2006) Dragon software: An easy approach to molecular descriptor calculations. Match-Commun. Math. Comput. Chem. 56: 237–248.
Google Scholar
Cao, D. S., Y. Z. Liang, J. Yan, G. S. Tan, Q. S. Xu, and S. Liu (2013) PyDPI: freely available python package for chemoinformatics, bioinformatics, and chemogenomics studies. J. Chem. Inf. Model. 53: 3086-3096.
Cao, D. S., N. Xiao, Q. S. Xu, and A. F. Chen (2015) Rcpi: R/ Bioconductor package to generate various descriptors of proteins, compounds and their interactions. Bioinformatics. 31: 279–281.
Article CAS Google Scholar
Moriwaki, H., Y. S. Tian, N. Kawashita, and T. Takagi (2018) Mordred: a molecular descriptor calculator. J. Cheminform. 10: 4.
Article CAS Google Scholar
Burden, F. R. (2001) Quantitative structure-Activity relationship studies using gaussian processes. J. Chem. Inf. Comput Sci. 41: 830–835.
Article CAS Google Scholar
Svetnik, V., A. Liaw, C. Tong, J. C. Culberson, R. P. Sheridan, and B. P. Feuston (2003) Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43: 1947–1958.
Article CAS Google Scholar
Ma, J., R. P. Sheridan, A. Liaw, G. E. Dahl, and V. Svetnik (2015) Deep neural nets as a method for quantitative structure- activity relationships. J. Chem. Inf. Model. 55: 263-274.
Xu, Y., J. Ma, A. Liaw, R. P. Sheridan, and V. Svetnik (2017) Demystifying Multitask Deep neural networks for quantitative structure-activity relationships. J. Chem. Inf. Model. 57: 2490–2504.
Article CAS Google Scholar
Ghasemi, F., A. Mehridehnavi, A. Fassihi, and H. Prez-Snchez (2018) Deep neural network in QSAR studies using deep belief network. Appl. Soft Comput. 62: 251–258.
Article Google Scholar
Kato, Y., S. Hamada, and H. Goto (2020) Validation Study of QSAR/DNN models using the competition datasets. Mol. Inf. 39: 1900154.
Article CAS Google Scholar
Lusci, A., G. Pollastri, and P. Baldi (2013) Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. J. Chem. Inf. Model. 53: 1563-1575.
Duvenaud, D., D. Maclaurin, J. Aguilera-Iparraguirre, R. Gómez-Bombarelli, T. Hirzel, A. Aspuru-Guzik, and R. P. Adams (2015) Convolutional networks on graphs for learning molecular fingerprints. arXiv. 1509.09292.
Rogers, D. and M. Hahn (2010) Extended-connectivity fingerprints. J. Chem. Inf. Model. 50: 742–754.
Article CAS Google Scholar
Jaeger, S., S. Fulle, and S. Turk (2018) Mol2vec: Unsupervised machine learning approach with chemical intuition. J. Chem. Inf. Model. 58: 27–35.
Article CAS Google Scholar
Chakravarti, S. K. and S. R. M. Alla (2019) Descriptor Free QSAR modeling using deep learning with long short-term memory neural networks. Front. Artif. Intell. 2: 17.
Article Google Scholar
Winter, R., F. Noé, and D. A. Clevert (2019) Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem. Sci. 10: 1692–1701.
Honda, S., S. Shi, and H. R. Ueda (2019) SMILES transformer: pre-trained molecular fingerprint for low data drug discovery. arXiv. 1911.04738.
Devlin, J., M. W. Chang, K. Lee, and K. Toutanova (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv. 1810.04805.
Altae-Tran, H., B. Ramsundar, A. S. Pappu, and V. Pande (2017) Low data drug discovery with one-shot learning. ACS Cent. Sci. 3: 283–293.
Article CAS Google Scholar
Rohrer, S. G. and K. Baumann (2009) Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J. Chem. Inf. Model. 49: 169–184.
Article CAS Google Scholar
Jeon, M., D. Park, J. Lee, H. Jeon, M. Ko, S. Kim, Y. Choi, A. C. Tan, and J. Kang (2019) ReSimNet: drug response similarity prediction using siamese neural networks. Bioinformatics. 35: 5249–5256.
Article CAS Google Scholar
Lamb, J., E. D. Crawford, D. Peck, J. W. Modell, I. C. Blat, M. J. Wrobel, J. Lerner, J. P. Brunet, A. Subramanian, K. N. Ross, M. Reich, H. Hieronymus, G. Wei, S. A. Armstrong, S. J. Haggarty, P. A. Clemons, R. Wei, S. A. Carr, E. S. Lander, and T. R. Golub (2006) The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 313: 1929-1935.
Park, K., Y. J. Ko, P. Durai, and C. H. Pan (2019) Machine learning-based chemical binding similarity using evolutionary relationships of target genes. Nucleic Acids Res. 47: e128.
Cheng, T., M. Hao, T. Takeda, S. H. Bryant, and Y. Wang (2017) Large-scale prediction of drug-target interaction: a datacentric review. AAPS J. 19: 1264–1275.
Article CAS Google Scholar
Ding, H., I. Takigawa, H. Mamitsuka, and S. Zhu (2014) Similarity-based machine learning methods for predicting drugtarget interactions: a brief review. Brief Bioinform. 15: 734–747.
Article Google Scholar
Bleakley, K. and Y. Yamanishi (2009) Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics. 25: 2397–2403.
Article CAS Google Scholar
Xia, Z., L. Y. Wu, X. Zhou, and S. T. C. Wong (2010) Semisupervised drug-protein interaction prediction from heterogeneous biological spaces. BMC Syst. Biol. 4 Suppl 2: S6.
van Laarhoven, T., S. B. Nabuurs, and E. Marchiori (2011) Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics. 27: 3036–3043.
Article CAS Google Scholar
Pahikkala, T., A. Airola, S. Pietila, S. Shakyawar, A. Szwajda, J. Tang, and T. Aittokallio (2015) Toward more realistic drugtarget interaction predictions. Brief. Bioinform. 16: 325-337.
Keum, J. and H. Nam (2017) SELF-BLM: Prediction of drugtarget interactions via self-training SVM. PLoS One. 12: e0171839.
Article CAS Google Scholar
Chen, X., M. X. Liu, and G. Y. Yan (2012) Drug-target interaction prediction by random walk on the heterogeneous network. Mol. Biosyst. 8: 1970–1978.
Article CAS Google Scholar
Luo, Y., X. Zhao, J. Zhou, J. Yang, Y. Zhang, W. Kuang, J. Peng, L. Chen, and J. Zeng (2017) A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun. 8: 573.
Wang, S., H. Cho, C. Zhai, B. Berger, and J. Peng (2015) Exploiting ontology graph for predicting sparsely annotated gene function. Bioinformatics. 31: i357–i364.
Article CAS Google Scholar
Ewing, T., J. C. Baber, and M. Feher (2006) Novel 2D fingerprints for ligand-based virtual screening. J. Chem. Inf. Model. 46: 2423–2431.
Article CAS Google Scholar
Dubchak, I., I. Muchnik, S. R. Holbrook, and S. H. Kim (1995) Prediction of protein folding class using global description of amino acid sequence. Proc. Natl. Acad. Sci. USA. 92: 8700-8704.
Zhang, P., L. Tao, X. Zeng, C. Qin, S. Chen, F. Zhu, Z. Li, Y. Jiang, W. Chen, and Y. Z. Chen (2017) A protein network descriptor server and its use in studying protein, disease, metabolic and drug targeted networks. Brief. Bioinform. 18: 1057-1070.
Yu, H., J. Chen, X. Xu, Y. Li, H. Zhao, Y. Fang, X. Li, W. Zhou, W. Wang, and Y. Wang (2012) A systematic prediction of multiple drug-target interactions from chemical, genomic, and pharmacological data. PLoS One. 7: e37608.
Article CAS Google Scholar
Li, Z. C., M. H. Huang, W. Q. Zhong, Z. Q. Liu, Y. Xie, Z. Dai, and X. Y. Zou (2016) Identification of drug-target interaction from interactome network with ‘guilt-by-association’ principle and topology features. Bioinformatics. 32: 1057–1064.
Article CAS Google Scholar
Lee, I. and H. Nam (2018) Identification of drug-target interaction by a random walk with restart method on an interactome network. BMC Bioinformatics. 19: 208.
Article CAS Google Scholar
Wang, Y. and J. Zeng (2013) Predicting drug-target interactions using restricted Boltzmann machines. Bioinformatics. 29: i126–i134.
Article CAS Google Scholar
Wen, M., Z. Zhang, S. Niu, H. Sha, R. Yang, Y. Yun, and H. Lu (2017) Deep-learning-based drug-target interaction prediction. J. Proteome Res. 16: 1401-1409.
Hu, P. W., K. C. C. Chan, and Z. H. You (2016) Large-scale prediction of drug-target interactions from deep representations. 2016 International Joint Conference on Neural Networks (IJCNN). July 24-29. Vancouver, BC, Canada.
Ozturk, H., A. Ozgur, and E. Ozkirimli (2018) DeepDTA: deep drug-target binding affinity prediction. Bioinformatics. 34: i821–i829.
Article CAS Google Scholar
He, T., M. Heidemeyer, F. Ban, A. Cherkasov, and M. Ester (2017) SimBoost: a read-across approach for predicting drugtarget binding affinities using gradient boosting machines. J. Cheminform. 9: 24.
Article CAS Google Scholar
Tsubaki, M., K. Tomii, and J. Sese (2019) Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics. 35: 309–318.
Article CAS Google Scholar
Gonen, M. (2012) Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics. 28: 2304–2310.
Article CAS Google Scholar
Lee, I., J. Keum, and H. Nam (2019) DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS Comput. Biol. 15: e1007129.
Karimi, M., D. Wu, Z. Wang, and Y. Shen (2019) DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks. Bioinformatics. 35: 3329-3338.
Shen, C., J. Ding, Z. Wang, D. Cao, X. Ding, and T. Hou (2020) From machine learning to deep learning: Advances in scoring functions for protein-ligand docking. WIREs Comput. Mol. Sci. 10: e1429.
Sieg, J., F. Flachsenberg, and M. Rarey (2019) In need of bias control: evaluating chemical data for machine learning in structure-based virtual screening. J. Chem. Inf. Model. 59: 947-961.
Chen, L., A. Cruz, S. Ramsey, C. J. Dickson, J. S. Duca, V. Hornak, D. R. Koes, and T. Kurtzman (2019) Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening. PLoS One. 14: e0220113.
Hanson, J., K. K. Paliwal, T. Litfin, Y. Yang, and Y. Zhou (2020) Getting to know your neighbor: protein structure prediction comes of age with contextual machine learning. J. Comput. Biol. 27: 796-814.
Shi, Q., W. Chen, S. Huang, Y. Wang, and Z. Xue (2019) Deep learning for mining protein data. Brief. Bioinform. bbz156.
Goodsell, D. S., C. Zardecki, L. Di Costanzo, J. M. Duarte, B. P. Hudson, I. Persikova, J. Segura, C. Shao, M. Voigt, J. D. Westbrook, J. Y. Young, and S. K. Burley (2020) RCSB Protein Data Bank: Enabling biomedical research and drug discovery. Protein Sci. 29: 52-65.
Gola, J., O. Obrezanova, E. Champness, and M. Segall (2006) ADMET property prediction: The state of the art and current challenges. QSAR Comb. Sci. 25: 1172–1180.
Article CAS Google Scholar
Moroy, G., V. Y. Martiny, P. Vayer, B. O. Villoutreix, and M. A. Miteva (2012) Toward in silico structure-based ADMET prediction in drug discovery. Drug Discov. Today. 17: 44–55.
Article CAS Google Scholar
Tian, S., J. Wang, Y. Li, D. Li, L. Xu, and T. Hou (2015) The application of in silico drug-likeness predictions in pharmaceutical research. Adv. Drug Deliv. Rev. 86: 2–10.
Article CAS Google Scholar
Zhao, Y. H., J. Le, M. H. Abraham, A. Hersey, P. J. Eddershaw, C. N. Luscombe, D. Boutina, G. Beck, B. Sherborne, I. Cooper, and J. A. Platts (2001) Evaluation of human intestinal absorption data and subsequent derivation of a quantitative structure-Activity relationship (QSAR) with the Abraham descriptors. J. Pharm. Sci. 90: 749-784.
Ponzoni, I., V. Sebastin-Prez, C. Requena-Triguero, C. Roca, M. J. Martnez, F. Cravero, M. F. Daz, J. A. Pez, R. G. Arrays, J. Adrio, and N. E. Campillo (2017) Hybridizing feature selection and feature learning approaches in QSAR modeling for drug discovery. Sci. Rep. 7: 2403.
Wang, N. N., C. Huang, J. Dong, Z. J. Yao, M. F. Zhu, Z. K. Deng, B. Lv, A. P. Lu, A. F. Chen, and D. S. Cao (2017) Predicting human intestinal absorption with modified random forest approach: a comprehensive evaluation of molecular representation, unbalanced data, and applicability domain issues. RSC Adv. 7: 19007-19018.
Yang, M., J. Chen, L. Xu, X. Shi, X. Zhou, Z. Xi, R. An, and X. Wang (2018) A novel adaptive ensemble classification framework for ADME prediction. RSC Adv. 8: 11661–11683.
Article Google Scholar
Fredlund, L., S. Winiwarter, and C. Hilgendorf (2017) In vitro intrinsic permeability: a transporter-independent measure of Caco-2 cell permeability in drug design and development. Mol. Pharm. 14: 1601-1609.
Patel, R. D., S. P. Kumar, C. N. Patel, S. S. Shankar, H. A. Pandya, and H. A. Solanki (2017) Parallel screening of druglike natural compounds using Caco-2 cell permeability QSAR model with applicability domain, lipophilic ligand efficiency index and shape property: A case study of HIV-1 reverse transcriptase inhibitors. J. Mol. Struct. 1146: 80-95.
Sun, H., K. Nguyen, E. Kerns, Z. Yan, K. R. Yu, P. Shah, A. Jadhav, and X. Xu (2017) Highly predictive and interpretable models for PAMPA permeability. Bioorg. Med. Chem. 25: 1266–1276.
Article CAS Google Scholar
Chi, C. T., M. H. Lee, C. F. Weng, and M. K. Leong (2019) In silico prediction of PAMPA effective permeability using a two-QSAR approach. Int. J. Mol. Sci. 20: 3170.
Lanevskij, K. and R. Didziapetris (2019) Physicochemical QSAR analysis of passive permeability across Caco-2 monolayers. J. Pharm. Sci. 108: 78–86.
Article Google Scholar
Oja, M., S. Sild, and U. Maran (2019) Logistic classification models for pH-permeability profile: predicting permeability classes for the biopharmaceutical classification system. J. Chem. Inf. Model. 59: 2442-2455.
Shin, M., D. Jang, H. Nam, K. H. Lee, and D. Lee (2018) Predicting the absorption potential of chemical compounds through a deep learning approach. IEEE/ACM Trans. Comput. Biol. Bioinform. 15: 432–440.
Article Google Scholar
Wenzel, J., H. Matter, and F. Schmidt (2019) Predictive multitask deep neural network models for ADME-Tox properties: learning from large data sets. J. Chem. Inf. Model. 59: 1253-1268.
Gooch, E. (2004) Medicinal chemistry - an introduction; fundamentals of medicinal chemistry (Gareth Thomas). J. Chem. Educ. 81: 1271.
Kumar, R., A. Sharma, M. H. Siddiqui, and R. K. Tiwari (2017) Prediction of drug-plasma protein binding using artificial intelligence based algorithms. Comb. Chem. High Throughput Screen. 21: 57-64.
Wang, N. N., Z. K. Deng, C. Huang, J. Dong, M. F. Zhu, Z. J. Yao, A. F. Chen, A. P. Lu, Q. Mi, and D. S. Cao (2017) ADME properties evaluation in drug discovery: Prediction of plasma protein binding using NSGA-II combining PLS and consensus modeling. Chemometr. Intell. Lab. Syst. 170: 84-95.
Sun, L., H. Yang, J. Li, T. Wang, W. Li, G. Liu, and Y. Tang (2018) In silico prediction of compounds binding to human plasma proteins by QSAR models. ChemMedChem. 13: 572-581.
Toma, C., D. Gadaleta, A. Roncaglioni, A. Toropov, A. Toropova, M. Marzo, and E. Benfenati (2019) QSAR development for plasma protein binding: influence of the ionization state. Pharm. Res. 36: 28.
Ye, Z., Y. Yang, X. Li, D. Cao, and D. Ouyang (2019) An integrated transfer learning and multitask learning approach for pharmacokinetic parameter prediction. Mol. Pharm. 16: 533–541.
Prachayasittikul, V., A. Worachartcheewan, A. P. Toropova, A. A. Toropov, N. Schaduangrat, V. Prachayasittikul, and C. Nantasenamat (2017) Large-scale classification of P-glycoprotein inhibitors using SMILES-based descriptors. SAR QSAR Environ. Res. 28: 1-16.
Gonzalo, C. G. and N. García-Pedrajas (2018) Boosted feature selectors: a case study on prediction P-gp inhibitors and substrates. J. Comput. Aided Mol. Des. 32: 1273–1294.
Hinge, V. K., D. Roy, and A. Kovalenko (2019) Prediction of P-glycoprotein inhibitors with machine learning classification models and 3D-RISM-KH theory based solvation energy descriptors. J. Comput. Aided Mol. Des. 33: 965–971.
Shi, T., Y. Yang, S. Huang, L. Chen, Z. Kuang, Y. Heng, and H. Mei (2019) Molecular image-based convolutional neural network for the prediction of ADMET properties. Chemometr. Intell. Lab. Syst. 194: 103853.
Toropov, A. A., A. P. Toropova, M. Beeg, M. Gobbi, and M. Salmona (2017) QSAR model for blood-brain barrier permeation. J. Pharmacol. Toxicol. Methods. 88: 7–18.
Wang, Z., H. Yang, Z. Wu, T. Wang, W. Li, Y. Tang, and G. Liu (2018) In silico prediction of blood-brain barrier permeability of compounds by machine learning and resampling methods. ChemMedChem. 13: 2189-2201.
Yuan, Y., F. Zheng, and C. G. Zhan (2018) Improved prediction of blood-brain barrier permeability through machine learning with combined use of molecular property-based descriptors and fingerprints. AAPS J. 20: 54.
Miao, R., L. Y. Xia, H. H. Chen, H. H. Huang, and Y. Liang (2019) Improved classification of blood-brain-barrier drugs using deep learning. Sci. Rep. 9: 8802.
Hunt, P. A., M. D. Segall, and J. D. Tyzack (2018) WhichP450: a multi-class categorical model to predict the major metabolising CYP450 isoform for a compound. J. Comput. Aided Mol. Des. 32: 537-546.
Tian, S., Y. Djoumbou-Feunang, R. Greiner, and D. S. Wishart (2018) CypReact: A software tool for in silico reactant prediction for human cytochrome P450 enzymes. J. Chem. Inf. Model. 58: 1282-1291.
Shan, X., X. Wang, C. D. Li, Y. Chu, Y. Zhang, Y. Xiong, and D. Q. Wei (2019) Prediction of CYP450 enzyme-substrate selectivity based on the network-based label space division method. J. Chem. Inf. Model. 59: 4577-4586.
Li, X., Y. Xu, L. Lai, and J. Pei (2018) Prediction of human cytochrome P450 inhibition using a multitask deep autoencoder neural network. Mol. Pharm. 15: 4336–4345.
Pang, X., B. Zhang, G. Mu, J. Xia, Q. Xiang, X. Zhao, A. Liu, G. Du, and Y. Cui (2018) Screening of cytochrome P450 3A4 inhibitors via in silico and in vitro approaches. RSC Adv. 8: 34783-34792.
Wu, Z., T. Lei, C. Shen, Z. Wang, D. Cao, and T. Hou (2019) ADMET evaluation in drug discovery. 19. Reliable prediction of human cytochrome P450 inhibition using artificial intelligence approaches. J. Chem. Inf. Model. 59: 4587-4601.
He, S., M. Li, X. Ye, H. Wang, W. Yu, W. He, Y. Wang, and Y. Qiao (2017) Site of metabolism prediction for oxidation reactions mediated by oxidoreductases based on chemical bond. Bioinformatics. 33: 363–372.
Šícho, M., C. De Bruyn Kops, C. Stork, D. Svozil, and J. Kirchmair (2017) FAME 2: simple and effective machine learning model of cytochrome P450 regioselectivity. J. Chem. Inf. Model. 57: 1832-1846.
Finkelmann, A. R., D. D. Goldmann, G. Schneider, and A. H. Goller (2018) MetScore: Site of metabolism prediction beyond cytochrome P450 enzymes. ChemMedChem. 13: 2281–2289.
Cai, Y., H. Yang, W. Li, G. Liu, P. W. Lee, and Y. Tang (2019) Computational prediction of site of metabolism for UGT-catalyzed reactions. J. Chem. Inf. Model. 59: 1085-1095.
Lee, P. W. (2014) Handbook of Metabolic Pathways of Xenobiotics. John Wiley & Sons.
Podlewska, S. and R. Kafel (2018) MetStabOn-online platform for metabolic stability predictions. Int. J. Mol. Sci. 19: 1040.
Esaki, T., R. Watanabe, H. Kawashima, R. Ohashi, Y. Natsume-Kitatani, C. Nagao, and K. Mizuguchi (2019) Data curation can improve the prediction accuracy of metabolic intrinsic clearance. Mol. Inform. 38: e1800086.
Liu, K., X. Sun, L. Jia, J. Ma, H. Xing, J. Wu, H. Gao, Y. Sun, F. Boulnois, and J. Fan (2019) Chemi-net: A molecular graph convolutional network for accurate drug property prediction. Int. J. Mol. Sci. 20: 3389.
Zhivkova, Z. D. (2017) Quantitative structure - pharmacokinetic relationships for plasma clearance of basic drugs with consideration of the major elimination pathway. J. Pharm. Pharm. Sci. 20: 135–147.
Wakayama, N., K. Toshimoto, K. Maeda, S. Hotta, T. Ishida, Y. Akiyama, and Y. Sugiyama (2018) In silico prediction of major clearance pathways of drugs among 9 routes with two-step support vector machines. Pharm. Res. 35: 197.
Watanabe, R., R. Ohashi, T. Esaki, H. Kawashima, Y. Natsume-Kitatani, C. Nagao, and K. Mizuguchi (2019) Development of an in silico prediction system of human renal excretion and clearance from chemical structure information incorporating fraction unbound in plasma as a descriptor. Sci. Rep. 9: 18782.
Chen, J., H. Yang, L. Zhu, Z. Wu, W. Li, Y. Tang, and G. Liu (2020) In silico prediction of human renal clearance of compounds using quantitative structure-pharmacokinetic relationship models. Chem. Res. Toxicol. 33: 640-650.
Hong, H., S. Thakkar, M. Chen, and W. Tong (2017) Development of decision forest models for prediction of drug-induced liver injury in humans using a large set of FDA-approved drugs. Sci. Rep. 7: 17311.
Kim, E. and H. Nam (2017) Prediction models for drug-induced hepatotoxicity by using weighted molecular fingerprints. BMC Bioinformatics. 18: 227.
Kotsampasakou, E., F. Montanari, and G. F. Ecker (2017) Predicting drug-induced liver injury: The importance of data curation. Toxicology. 389: 139–145.
Ai, H., W. Chen, L. Zhang, L. Huang, Z. Yin, H. Hu, Q. Zhao, J. Zhao, and H. Liu (2018) Predicting drug-induced liver injury using ensemble learning methods and molecular fingerprints. Toxicol. Sci. 165: 100-107.
Hammann, F., V. Schöning, and J. Drewe (2019) Prediction of clinically relevant drug-induced liver injury from structure using machine learning. J. Appl. Toxicol. 39: 412-419.
He, S., T. Ye, R. Wang, C. Zhang, X. Zhang, G. Sun, and X. Sun (2019) An in silico model for predicting drug-induced hepatotoxicity. Int. J. Mol. Sci. 20: 1897.
Williams, D. P., S. E. Lazic, A. J. Foster, E. Semenova, and P. Morgan (2019) Predicting drug-induced liver injury with Bayesian machine learning. Chem. Res. Toxicol. 33: 239-248.
Munawar, S., M. J. Windley, E. G. Tse, M. H. Todd, A. P. Hill, J. I. Vandenberg, and I. Jabeen (2018) Experimentally validated pharmacoinformatics approach to predict hERG inhibition potential of new chemical entities. Front. Pharmacol. 9: 1035.
Siramshetty, V. B., Q. Chen, P. Devarakonda, and R. Preissner (2018) The catch-22 of predicting hERG blockade using publicly accessible bioactivity data. J. Chem. Inf. Model. 58: 1224-1233.
Cai, C., P. Guo, Y. Zhou, J. Zhou, Q. Wang, F. Zhang, J. Fang, and F. Cheng (2019) Deep learning-based prediction of drug-induced cardiotoxicity. J. Chem. Inf. Model. 59: 1073-1084.
Konda, L. S. K., S. K. Praba, and R. Kristam (2019) hERG liability classification models using machine learning techniques. Comput. Toxicol. 12: 100089.
Lee, A. A., Q. Yang, A. Bassyouni, C. R. Butler, X. Hou, S. Jenkinson, and D. A. Price (2019) Ligand biological activity predicted by cleaning positive and negative chemical correlations. Proc. Natl. Acad. Sci. USA. 116: 3373-3378.
Lee, H. M., M. S. Yu, S. R. Kazmi, S. Y. Oh, K. H. Rhee, M. A. Bae, B. H. Lee, D. S. Shin, K. S. Oh, H. Ceong, D. Lee, and D. Na (2019) Computational determination of hERG-related cardiotoxicity of drug candidates. BMC Bioinformatics. 20: 250.
Ogura, K., T. Sato, H. Yuki, and T. Honma (2019) Support vector machine model for hERG inhibitory activities based on the integrated hERG database using descriptor selection by NSGA-II. Sci. Rep. 9: 12220.
Zhang, Y., J. Zhao, Y. Wang, Y. Fan, L. Zhu, Y. Yang, X. Chen, T. Lu, Y. Chen, and H. Liu (2019) Prediction of hERG K+ channel blockage using deep neural networks. Chem. Biol. Drug Des. 94: 1973-1985.
Sato, T., H. Yuki, K. Ogura, and T. Honma (2018) Construction of an integrated database for hERG blocking small molecules. PLoS One. 13: e0199348.
Kim, H. and H. Nam (2020) hERG-Att: Self-attention-based deep neural network for predicting hERG blockers. Comput. Biol. Chem. 87: 107286.
Lei, T., F. Chen, H. Liu, H. Sun, Y. Kang, D. Li, Y. Li, and T. Hou (2017) ADMET evaluation in drug discovery. Part 17: development of quantitative and qualitative prediction models for chemical-induced respiratory toxicity. Mol. Pharm. 14: 2407-2421.
Lei, T., H. Sun, Y. Kang, F. Zhu, H. Liu, W. Zhou, Z. Wang, D. Li, Y. Li, and T. Hou (2017) ADMET evaluation in drug discovery. 18. Reliable prediction of chemical-induced urinary tract toxicity by boosting machine learning approaches. Mol. Pharm. 14: 3935-3953.
Liu, J., G. Patlewicz, A. J. Williams, R. S. Thomas, and I. Shah (2017) Predicting organ toxicity using in vitro bioactivity data and chemical structure. Chem. Res. Toxicol. 30: 2046–2059.
Xu, Y., J. Pei, and L. Lai (2017) Deep learning based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction. J. Chem. Inf. Model. 57: 2672–2685.
Zhang, H., P. Yu, J. X. Ren, X. B. Li, H. L. Wang, L. Ding, and W. B. Kong (2017) Development of novel prediction model for drug-induced mitochondrial toxicity by using naive Bayes classifier method. Food Chem. Toxicol. 110: 122-129.
Fan, D., H. Yang, F. Li, L. Sun, P. Di, W. Li, Y. Tang, and G. Liu (2018) In silico prediction of chemical genotoxicity using machine learning methods and structural alerts. Toxicol. Res. 7: 211-220.
Jiang, C., H. Yang, P. Di, W. Li, Y. Tang, and G. Liu (2019) In silico prediction of chemical reproductive toxicity using machine learning. J. Appl. Toxicol. 39: 844–854.
Zheng, S., Y. Wang, W. Liu, W. Chang, G. Liang, Y. Xu, and F. Lin (2019) In silico prediction of hemolytic toxicity on the human erythrocytes for small molecules by machine-learning and genetic algorithm. J. Med. Chem. 12: 6499-6512.
Fernandez, M., F. Ban, G. Woo, M. Hsing, T. Yamazaki, E. Leblanc, P. S. Rennie, W. J. Welch, and A. Cherkasov (2018) Toxic colors: the use of deep learning for predicting toxicity of compounds merely from their graphic images. J. Chem. Inf. Model. 58: 1533-1543.
Abbasi, K., A. Poso, J. Ghasemi, M. Amanlou, and A. Masoudi-Nejad (2019) Deep transferable compound representation across domains and tasks for low data drug discovery. J. Chem. Inf. Model. 59: 4528-4539.
Karim, A., A. Mishra, M. A. H. Newton, and A. Sattar (2019) Efficient toxicity prediction via simple features using shallow neural networks and decision trees. ACS Omega. 4: 1874–1888.
Zakharov, A. V., T. Zhao, D. T. Nguyen, T. Peryea, T. Sheils, A. Yasgar, R. Huang, N. Southall, and A. Simeonov (2019) Novel consensus architecture to improve performance of large-scale multitask deep learning QSAR models. J. Chem. Inf. Model. 59: 4613-4624.
Wang, J. and T. Hou (2015) Advances in computationally modeling human oral bioavailability. Adv. Drug Deliv. Rev. 86: 11–16.
Hutter, M. C. (2018) The current limits in virtual screening and property prediction. Future Med. Chem. 10: 1623–1635.
Wu, Z., B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. Pande (2018) MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9: 513-530.
Merck Molecular Activity Challenge (2012) https://www.kaggle.com/c/MerckActivity.
Winkler, D. A. and T. C. Le (2017) Performance of deep and shallow neural networks, the universal approximation theorem, activity cliffs, and QSAR. Mol. Inform. 36: 1600118.
Ryu, S., Y. Kwon, and W. Y. Kim (2019) A Bayesian graph convolutional network for reliable prediction of molecular properties with uncertainty quantification. Chem. Sci. 10: 8438–8446.
Xiong, Z., D. Wang, X. Liu, F. Zhong, X. Wan, X. Li, Z. Li, X. Luo, K. Chen, H. Jiang, and M. Zheng (2019) Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J. Med. Chem. 63: 8749-8760.
Maggiora, G. M. (2006) On outliers and activity cliffs-Why QSAR often disappoints. J. Chem. Inf. Model. 46: 1535.
Kohonen, P., J. A. Parkkinen, E. L. Willighagen, R. Ceder, K. Wennerberg, S. Kaski, and R. C. Grafström (2017) A transcriptomics data-driven gene space accurately predicts liver cytopathology and drug-induced liver injury. Nat. Commun. 8: 15932.
Rueda-Zárate, H. A., I. Imaz-Rosshandler, R. A. Cárdenas-Ovando, J. E. Castillo-Fernández, J. Noguez-Monroy, and C. Rangel-Escareño (2017) A computational toxicogenomics approach identifies a list of highly hepatotoxic compounds from a large microarray database. PLoS One. 12: e0176284.
Su, R., H. Wu, B. Xu, X. Liu, and L. Wei (2019) Developing a multi-dose computational model for drug-induced hepatotoxicity prediction based on toxicogenomics data. IEEE/ACM Trans. Comput. Biol. Bioinform. 16: 1231-1239.
Schneider, G. and U. Fechner (2005) Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov. 4: 649–663.
Walters, W. P. (2019) Virtual chemical libraries. J. Med. Chem. 62: 1116–1124.
Reymond, J. L., L. Ruddigkeit, L. Blum, and R. van Deursen (2012) The enumeration of chemical space. WIREs Comput. Mol. Sci. 2: 717-733.
Sanchez-Lengeling, B. and A. Aspuru-Guzik (2018) Inverse molecular design using machine learning: Generative models for matter engineering. Science. 361: 360–365.
Elton, D. C., Z. Boukouvalas, M. D. Fuge, and P. W. Chung (2019) Deep learning for molecular design—a review of the state of the art. Mol. Syst. Des. Eng. 4: 828-849.
Brown, N., M. Fiscato, M. H. S. Segler, and A. C. Vaucher (2019) GuacaMol: Benchmarking models for de novo molecular design. J. Chem. Inf. Model. 59: 1096-1108.
Huc, I. and J. M. Lehn (1997) Virtual combinatorial libraries: dynamic generation of molecular and supramolecular diversity by self-assembly. Proc. Natl. Acad. Sci. USA. 94: 2106–2110.
Lehn, J. M. (1999) Dynamic combinatorial chemistry and virtual combinatorial libraries. Chem. Eur. J. 5: 2455–2463.
Kwon, Y., J. Yoo, Y. S. Choi, W. J. Son, D. Lee, and S. Kang (2019) Efficient learning of non-autoregressive graph variational autoencoders for molecular graph generation. J. Cheminform. 11: 70.
Segler, M. H. S., T. Kogej, C. Tyrchan, and M. P. Waller (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4: 120-131.
Gómez-Bombarelli, R., J. N. Wei, D. Duvenaud, J. M. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. D. Hirzel, R. P. Adams, and A. Aspuru-Guzik (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4: 268-276.
Kang, S. and K. Cho (2019) Conditional molecular design with deep generative models. J. Chem. Inf. Model. 59: 43–52.
Arús-Pous, J., S. V. Johansson, O. Prykhodko, E. J. Bjerrum, C. Tyrchan, J. L. Reymond, H. Chen, and O. Engkvist (2019) Randomized SMILES strings improve the quality of molecular generative models. J. Cheminform. 11: 71.
Gupta, A., A. T. Müller, B. J. H. Huisman, J. A. Fuchs, P. Schneider, and G. Schneider (2018) Generative recurrent networks for de novo drug design. Mol. Inform. 37: 1700111.
Merk, D., F. Grisoni, L. Friedrich, and G. Schneider (2018) Tuning artificial intelligence on the de novo design of natural-product-inspired retinoid X receptor modulators. Commun. Chem. 1: 68.
Zheng, S., X. Yan, Q. Gu, Y. Yang, Y. Du, Y. Lu, and J. Xu (2019) QBMG: quasi-biogenic molecule generator with deep recurrent neural network. J. Cheminform. 11: 5.
Awale, M., F. Sirockin, N. Stiefl, and J. L. Reymond (2019) Drug analogs from fragment-based long short-term memory generative neural networks. J. Chem. Inf. Model. 59: 1347-1356.
Arús-Pous, J., T. Blaschke, S. Ulander, J. L. Reymond, H. Chen, and O. Engkvist (2019) Exploring the GDB-13 chemical space using deep generative models. J. Cheminform. 11: 20.
Pogány, P., N. Arad, S. Genway, and S. D. Pickett (2019) De novo molecule design by translating from reduced graphs to SMILES. J. Chem. Inf. Model. 59: 1136-1146.
Li, Y., L. Zhang, and Z. Liu (2018) Multi-objective de novo drug design with conditional graph generative model. J. Cheminform. 10: 33.
Polykovskiy, D., A. Zhebrak, D. Vetrov, Y. Ivanenkov, V. Aladinskiy, P. Mamoshina, M. Bozdaganyan, A. Aliper, A. Zhavoronkov, and A. Kadurin (2018) Entangled conditional adversarial autoencoder for de novo drug discovery. Mol. Pharm. 15: 4398-4405.
Lim, J., S. Ryu, J. W. Kim, and W. Y. Kim (2018) Molecular generative model based on conditional variational autoencoder for de novo molecular design. J. Cheminform. 10: 31.
Harel, S. and K. Radinsky (2018) Prototype-based compound discovery using deep generative models. Mol. Pharm. 15: 4406–4416.
Skalic, M., J. Jiménez, D. Sabbadin, and G. De Fabritiis (2019) Shape-based generative modeling for de novo drug design. J. Chem. Inf. Model. 59: 1205-1214.
Lim, J., S. Y. Hwang, S. Moon, S. Kim, and W. Y. Kim (2020) Scaffold-based molecular design with a graph generative model. Chem. Sci. 11: 1153-1164.
Kadurin, A., S. Nikolenko, K. Khrabrov, A. Aliper, and A. Zhavoronkov (2017) druGAN: An advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol. Pharm. 14: 3098-3104.
Blaschke, T., M. Olivecrona, O. Engkvist, J. Bajorath, and H. Chen (2018) Application of generative autoencoder in de novo molecular design. Mol. Inform. 37: 1700123.
Prykhodko, O., S. V. Johansson, P. C. Kotsias, J. Arús-Pous, E. J. Bjerrum, O. Engkvist, and H. Chen (2019) A de novo molecular generation method using latent vector based generative adversarial network. J. Cheminform. 11: 74.
Zhou, Z., S. Kearnes, L. Li, R. N. Zare, and P. Riley (2019) Optimization of molecules via deep reinforcement learning. Sci. Rep. 9: 10752.
Olivecrona, M., T. Blaschke, O. Engkvist, and H. Chen (2017) Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9: 48.
Popova, M., O. Isayev, and A. Tropsha (2018) Deep reinforcement learning for de novo drug design. Sci. Adv. 4: eaap7885.
Putin, E., A. Asadulaev, Y. Ivanenkov, V. Aladinskiy, B. Sanchez-Lengeling, A. Aspuru-Guzik, and A. Zhavoronkov (2018) Reinforced adversarial neural computer for de novo molecular design. J. Chem. Inf. Model. 58: 1194-1204.
Putin, E., A. Asadulaev, Q. Vanhaelen, Y. Ivanenkov, A. V. Aladinskaya, A. Aliper, and A. Zhavoronkov (2018) Adversarial threshold neural computer for molecular de novo design. Mol. Pharm. 15: 4386-4397.
Liu, X., K. Ye, H. W. T. van Vlijmen, A. P. Ijzerman, and G. J. P. van Westen (2019) An exploration strategy improves the diversity of de novo ligands using deep reinforcement learning: a case for the adenosine A2A receptor. J. Cheminform. 11: 35.
Ståhl, N., G. Falkman, A. Karlsson, G. Mathiason, and J. Boström (2019) Deep reinforcement learning for multiparameter optimization in de novo drug design. J. Chem. Inf. Model. 59: 3166–3176.
Zhavoronkov, A., Y. A. Ivanenkov, A. Aliper, M. S. Veselov, V. A. Aladinskiy, A. V. Aladinskaya, V. A. Terentiev, D. A. Polykovskiy, M. D. Kuznetsov, A. Asadulaev, Y. Volkov, A. Zholus, R. R. Shayakhmetov, A. Zhebrak, L. I. Minaeva, B. A. Zagribelnyy, L. H. Lee, R. Soll, D. Madge, L. Xing, T. Guo, and A. Aspuru-Guzik (2019) Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37: 1038-1040.
Polykovskiy, D., A. Zhebrak, B. Sanchez-Lengeling, S. Golovanov, O. Tatanov, S. Belyaev, R. Kurbanov, A. Artamonov, V. Aladinskiy, M. Veselov, A. Kadurin, S. Johansson, H. Chen, S. Nikolenko, A. Aspuru-Guzik, and A. Zhavoronkov (2018) Molecular Sets (MOSES): A benchmarking platform for molecular generation models. ArXiv. 1811.12823.
Kawai, K., Y. Karuo, A. Tarui, K. Sato, and M. Omote (2020) Effect of structural descriptors on the design of cyclin dependent kinase inhibitors using similarity-based molecular evolution. Mol. Inform. 39: 1900126.
Yoshikawa, N., K. Terayama, M. Sumita, T. Homma, K. Oono, and K. Tsuda (2018) Population-based de novo molecule generation, using grammatical evolution. Chem. Lett. 47: 1431-1434.
Jensen, J. H. (2019) A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space. Chem. Sci. 10: 3567–3572.
Herring, R. H. and M. R. Eden (2015) Evolutionary algorithm for de novo molecular design with multi-dimensional constraints. Comput. Chem. Eng. 83: 267–277.
Rupakheti, C., A. Virshup, W. Yang, and D. N. Beratan (2015) Strategy to discover diverse optimal molecules in the small molecule universe. J. Chem. Inf. Model. 55: 529–537.
Boolell, M., M. J. Allen, S. A. Ballard, S. Gepi-Attee, G. J. Muirhead, A. M. Naylor, I. H. Osterloh, and C. Gingell (1996) Sildenafil: an orally active type 5 cyclic GMP-specific phosphodiesterase inhibitor for the treatment of penile erectile dysfunction. Int. J. Impot. Res. 8: 47-52.
Ning, Y. M., J. L. Gulley, P. M. Arlen, S. Woo, S. M. Steinberg, J. J. Wright, H. L. Parnes, J. B. Trepel, M. J. Lee, Y. S. Kim, H. Sun, R. A. Madan, L. Latham, E. Jones, C. C. Chen, W. D. Figg, and W. L. Dahut (2010) Phase II trial of bevacizumab, thalidomide, docetaxel, and prednisone in patients with metastatic castration-resistant prostate cancer. J. Clin. Oncol. 28: 2070-2076.
Singhal, S., J. Mehta, R. Desikan, D. Ayers, P. Roberson, P. Eddlemon, N. Munshi, E. Anaissie, C. Wilson, M. Dhodapkar, J. Zeldis, and B. Barlogie (1999) Antitumor activity of thalidomide in refractory multiple myeloma. N. Engl. J. Med. 341: 1565-1571.
D'Amato, R. J., M. S. Loughnan, E. Flynn, and J. Folkman (1994) Thalidomide is an inhibitor of angiogenesis. Proc. Natl. Acad. Sci. USA. 91: 4082–4085.
Hameed, P. N., K. Verspoor, S. Kusljic, and S. Halgamuge (2018) A two-tiered unsupervised clustering approach for drug repositioning through heterogeneous data integration. BMC Bioinformatics. 19: 129.
Wu, C., R. C. Gudivada, B. J. Aronow, and A. G. Jegga (2013) Computational drug repositioning through heterogeneous network clustering. BMC Syst. Biol. 7: S6.
Blondel, V. D., J. L. Guillaume, R. Lambiotte, and E. Lefebvre (2008) Fast unfolding of communities in large networks. J. Stat. Mech. 2008: P10008.
Nepusz, T., H. Yu, and A. Paccanaro (2012) Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods. 9: 471–472.
Sun, P., J. Guo, R. Winnenburg, and J. Baumbach (2017) Drug repurposing by integrated literature mining and drug-gene-disease triangulation. Drug Discov. Today. 22: 615–619.
Chen, H. and Z. Zhang (2018) Prediction of drug-disease associations for drug repositioning through drug-miRNA-disease heterogeneous network. IEEE Access. 6: 45281–45287.
Martinez, V., C. Navarro, C. Cano, W. Fajardo, and A. Blanco (2015) DrugNet: network-based drug-disease prioritization by integrating heterogeneous data. Artif. Intell. Med. 63: 41–49.
Martinez, V., C. Cano, and A. Blanco (2014) ProphNet: a generic prioritization method through propagation of information. BMC Bioinformatics. 15: S5.
Luo, H., J. Wang, M. Li, J. Luo, X. Peng, F. X. Wu, and Y. Pan (2016) Drug repositioning based on comprehensive similarity measures and Bi-Random walk algorithm. Bioinformatics. 32: 2664–2671.
Luo, H., M. Li, S. Wang, Q. Liu, Y. Li, and J. Wang (2018) Computational drug repositioning using low-rank matrix approximation and randomized algorithms. Bioinformatics. 34: 1904–1912.
Yan, C. K., W. X. Wang, G. Zhang, J. L. Wang, and A. Patel (2019) BiRWDDA: A novel drug repositioning method based on multisimilarity fusion. J. Comput. Biol. 26: 1230–1242.
Gottlieb, A., G. Y. Stein, E. Ruppin, and R. Sharan (2011) PREDICT: A method for inferring novel drug indications with application to personalized medicine. Mol. Syst. Biol. 7: 496.
Napolitano, F., Y. Zhao, V. M. Moreira, R. Tagliaferri, J. Kere, M. D'Amato, and D. Greco (2013) Drug repositioning: A machine-learning approach through data integration. J. Cheminform. 5: 30.
Wang, Y., S. Chen, N. Deng, and Y. Wang (2013) Drug repositioning by kernel-based integration of molecular structure, molecular activity, and phenotype data. PLoS One. 8: e78518.
Kim, E., A. S. Choi, and H. Nam (2019) Drug repositioning of herbal compounds via a machine-learning approach. BMC Bioinformatics. 20: 247.
Zhang, W., X. Yue, F. Huang, R. Liu, Y. Chen, and C. Ruan (2018) Predicting drug-disease associations and their therapeutic function based on the drug-disease association bipartite network. Methods. 145: 51–59.
Le, D. H. and D. Nguyen-Ngoc (2018) Drug repositioning by integrating known disease-gene and drug-target associations in a semi-supervised learning model. Acta Biotheor. 66: 315–331.
Xuan, P., Y. Cao, T. Zhang, X. Wang, S. Pan, and T. Shen (2019) Drug repositioning through integration of prior knowledge and projections of drugs and diseases. Bioinformatics. 35: 4108–4119.
Wei, X., Y. Zhang, Y. Huang, and Y. Fang (2019) Predicting drug-disease associations by network embedding and biomedical data integration. Data Technol. Appl. 53: 217–229.
Moridi, M., M. Ghadirinia, A. Sharifi-Zarchi, and F. Zare-Mirakabad (2019) The assessment of efficient representation of drug features using deep learning for drug repositioning. BMC Bioinformatics. 20: 577.
Abdolhosseini, F., B. Azarkhalili, A. Maazallahi, A. Kamal, S. A. Motahari, A. Sharifi-Zarchi, and H. Chitsaz (2019) Cell identity codes: understanding cell identity from gene expression profiles using deep neural networks. Sci. Rep. 9: 2342.
Asgari, E. and M. R. K. Mofrad (2015) Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS One. 10: e0141287.
Donner, Y., S. Kazmierczak, and K. Fortney (2018) Drug repurposing using deep embeddings of gene expression profiles. Mol. Pharm. 15: 4314–4325.
Stathias, V., J. Turner, A. Koleti, D. Vidovic, D. Cooper, M. Fazel-Najafabadi, M. Pilarczyk, R. Terryn, C. Chung, A. Umeano, D. J. B. Clarke, A. Lachmann, J. E. Evangelista, A. Ma'ayan, M. Medvedovic, and S. C. Schurer (2020) LINCS Data Portal 2.0: next generation access point for perturbation-response signatures. Nucleic Acids Res. 48: D431-D439.
You, J., R. D. McLeod, and P. Hu (2019) Predicting drug-target interaction network using deep learning model. Comput. Biol. Chem. 80: 90-101.
Aliper, A., S. Plis, A. Artemov, A. Ulloa, P. Mamoshina, and A. Zhavoronkov (2016) Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm. 13: 2524-2530.
Zeng, X., S. Zhu, X. Liu, Y. Zhou, R. Nussinov, and F. Cheng (2019) deepDR: a network-based deep learning approach to in silico drug repositioning. Bioinformatics. 35: 5191–5198.
Xuan, P., L. Zhao, T. Zhang, Y. Ye, and Y. Zhang (2019) Inferring drug-related diseases based on convolutional neural network and gated recurrent unit. Molecules. 24: 2712.
Masoudi-Sobhanzadeh, Y., Y. Omidi, M. Amanlou, and A. Masoudi-Nejad (2019) Drug databases and their contributions to drug repurposing. Genomics. 112: 1087–1095.
Cheng, F. (2019) In silico oncology drug repositioning and polypharmacology. Methods Mol. Biol. 1878: 243–261.
March-Vila, E., L. Pinzi, N. Sturm, A. Tinivella, O. Engkvist, H. Chen, and G. Rastelli (2017) On the integration of in silico drug design methods for drug repurposing. Front. Pharmacol. 8: 298.
Fleuren, W. W. M. and W. Alkema (2015) Application of text mining in the biomedical domain. Methods. 74: 97–106.
Nugent, T., V. Plachouras, and J. L. Leidner (2016) Computational drug repositioning based on side-effects mined from social media. PeerJ. Computer Science. 2: e46.
Rastegar-Mojarad, M., R. K. Elayavilli, D. Li, R. Prasad, and H. Liu (2015) A new method for prioritizing drug repositioning candidates extracted by literature-based discovery. Proceedings of 2015 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2015. November 9-12. Washington, DC, USA.
Su, E. W. and T. M. Sanger (2017) Systematic drug repositioning through mining adverse event data in ClinicalTrials.gov. PeerJ. 5: e3154.
Park, K. (2019) A review of computational drug repurposing. Transl. Clin. Pharmacol. 27: 59–63.
RDKit. http://www.rdkit.org/.
Douguet, D. (2018) Data sets representative of the structures and experimental properties of FDA-approved drugs. ACS Med. Chem. Lett. 9: 204–209.
Kim, S., P. A. Thiessen, E. E. Bolton, J. Chen, G. Fu, A. Gindulyte, L. Han, J. He, S. He, B. A. Shoemaker, J. Wang, B. Yu, J. Zhang, and S. H. Bryant (2016) PubChem substance and compound databases. Nucleic Acids Res. 44: D1202-D1213.
Williams, A. J. (2008) Internet-based tools for communication and collaboration in chemistry. Drug Discov. Today. 13: 502–506.
Ursu, O., J. Holmes, C. G. Bologa, J. J. Yang, S. L. Mathias, V. Stathias, D. T. Nguyen, S. Schurer, and T. Oprea (2019) DrugCentral 2018: an update. Nucleic Acids Res. 47: D963-D970.
Ursu, O., J. Holmes, J. Knockel, C. G. Bologa, J. J. Yang, S. L. Mathias, S. J. Nelson, and T. I. Oprea (2017) DrugCentral: online drug compendium. Nucleic Acids Res. 45: D932-D939.
DailyMed. https://dailymed.nlm.nih.gov/dailymed/.
Kuhn, M., I. Letunic, L. J. Jensen, and P. Bork (2016) The SIDER database of drugs and side effects. Nucleic Acids Res. 44: D1075–D1079.
Tatonetti, N. P., P. P. Ye, R. Daneshjou, and R. B. Altman (2012) Data-driven prediction of drug effects and interactions. Sci. Transl. Med. 4: 125ra31.
Fang, H., Z. Su, Y. Wang, A. Miller, Z. Liu, P. C. Howard, W. Tong, and S. M. Lin (2014) Exploring the FDA adverse event reporting system to generate hypotheses for monitoring of disease characteristics. Clin. Pharmacol. Ther. 95: 496-498.
Cai, M. C., Q. Xu, Y. J. Pan, W. Pan, N. Ji, Y. B. Li, H. J. Jin, K. Liu, and Z. L. Ji (2015) ADReCS: An ontology database for aiding standardization and hierarchical classification of adverse drug reaction terms. Nucleic Acids Res. 43: D907-D913.
Subramanian, A., R. Narayan, S. M. Corsello, D. D. Peck, T. E. Natoli, X. Lu, J. Gould, J. F. Davis, A. A. Tubelli, J. K. Asiedu, D. L. Lahr, J. E. Hirschman, Z. Liu, M. Donahue, B. Julian, M. Khan, D. Wadden, I. C. Smith, D. Lam, A. Liberzon, C. Toder, M. Bagul, M. Orzechowski, O. M. Enache, F. Piccioni, S. A. Johnson, N. J. Lyons, A. H. Berger, A. F. Shamji, A. N. Brooks, A. Vrcic, C. Flynn, J. Rosains, D. Y. Takeda, R. Hu, D. Davison, J. Lamb, K. Ardlie, L. Hogstrom, P. Greenside, N. S. Gray, P. A. Clemons, S. Silver, X. Wu, W. N. Zhao, W. Read-Button, X. Wu, S. J. Haggarty, L. V. Ronco, J. S. Boehm, S. L. Schreiber, J. G. Doench, J. A. Bittker, D. E. Root, B. Wong, and T. R. Golub (2017) A next generation Connectivity Map: L1000 platform and the first 1,000,000 profiles. Cell. 171: 1437-1452.e17.
Barrett, T., D. B. Troup, S. E. Wilhite, P. Ledoux, D. Rudnev, C. Evangelista, I. F. Kim, A. Soboleva, M. Tomashevsky, and R. Edgar (2007) NCBI GEO: Mining tens of millions of expression profiles - Database and tools update. Nucleic Acids Res. 35: D760-D765.
Barrett, T., T. O. Suzek, D. B. Troup, S. E. Wilhite, W. C. Ngau, P. Ledoux, D. Rudnev, A. E. Lash, W. Fujibuchi, and R. Edgar (2005) NCBI GEO: Mining millions of expression profiles - Database and tools. Nucleic Acids Res. 33: D562-D566.
Parkinson, H., M. Kapushesky, M. Shojatalab, N. Abeygunawardena, R. Coulson, A. Farne, E. Holloway, N. Kolesnykov, P. Lilja, M. Lukk, R. Mani, T. Rayner, A. Sharma, E. William, U. Sarkans, and A. Brazma (2007) ArrayExpress - A public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 35: D747-D750.
Yang, W., J. Soares, P. Greninger, E. J. Edelman, H. Lightfoot, S. Forbes, N. Bindal, D. Beare, J. A. Smith, I. R. Thompson, S. Ramaswamy, P. A. Futreal, D. A. Haber, M. R. Stratton, C. Benes, U. McDermott, and M. J. Garnett (2013) Genomics of Drug Sensitivity in Cancer (GDSC): A resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 41: D955-D961.
Bodenreider, O. (2004) The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 32: D267–D270.
Rogers, F. B. (1963) Medical subject headings. Bull. Med. Libr. Assoc. 51: 114–116.
Piñero, J., N. Queralt-Rosinach, À. Bravo, J. Deu-Pons, A. Bauer-Mehren, M. Baron, F. Sanz, and L. I. Furlong (2015) DisGeNET: A discovery platform for the dynamical exploration of human diseases and their genes. Database. 2015: bav028.
Ogata, H., S. Goto, K. Sato, W. Fujibuchi, H. Bono, and M. Kanehisa (1999) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 27: 29–34.
Hewett, M., D. E. Oliver, D. L. Rubin, K. L. Easton, J. M. Stuart, R. B. Altman, and T. E. Klein (2002) PharmGKB: the pharmacogenetics knowledge base. Nucleic Acids Res. 30: 163-165.
Tate, J. G., S. Bamford, H. C. Jubb, Z. Sondka, D. M. Beare, N. Bindal, H. Boutselakis, C. G. Cole, C. Creatore, E. Dawson, P. Fish, B. Harsha, C. Hathaway, S. C. Jupe, C. Y. Kok, K. Noble, L. Ponting, C. C. Ramshaw, C. E. Rye, H. E. Speedy, R. Stefancsik, S. L. Thompson, S. Wang, S. Ward, P. J. Campbell, and S. A. Forbes (2019) COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Res. 47: D941-D947.
Lappalainen, I., J. Lopez, L. Skipper, T. Hefferon, J. D. Spalding, J. Garner, C. Chen, M. Maguire, M. Corbett, G. Zhou, J. Paschall, V. Ananiev, P. Flicek, and D. M. Church (2013) DbVar and DGVa: public archives for genomic structural variation. Nucleic Acids Res. 41: D936-D941.
Mailman, M. D., M. Feolo, Y. Jin, M. Kimura, K. Tryka, R. Bagoutdinov, L. Hao, A. Kiang, J. Paschall, L. Phan, N. Popova, S. Pretel, L. Ziyabari, M. Lee, Y. Shao, Z. Y. Wang, K. Sirotkin, M. Ward, M. Kholodov, K. Zbicz, J. Beck, M. Kimelman, S. Shevelev, D. Preuss, E. Yaschenko, A. Graeff, J. Ostell, and S. T. Sherry (2007) The NCBI dbGaP database of genotypes and phenotypes. Nat. Genet. 39: 1181-1186.
Smigielski, E. M., K. Sirotkin, M. Ward, and S. T. Sherry (2000) dbSNP: a database of single nucleotide polymorphisms. Nucleic Acids Res. 28: 352–355.
Liu, Z., M. Su, L. Han, J. Liu, Q. Yang, Y. Li, and R. Wang (2017) Forging the basis for developing protein-ligand interaction scoring functions. Acc. Chem. Res. 50: 302-309.
Su, M., Q. Yang, Y. Du, G. Feng, Z. Liu, Y. Li, and R. Wang (2019) Comparative assessment of scoring functions: The CASF-2016 update. J. Chem. Inf. Model. 59: 895-913.
Mysinger, M. M., M. Carchia, J. J. Irwin, and B. K. Shoichet (2012) Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J. Med. Chem. 55: 6582-6594.
Carlson, H. A., R. D. Smith, K. L. Damm-Ganamet, J. A. Stuckey, A. Ahmed, M. A. Convery, D. O. Somers, M. Kranz, P. A. Elkins, G. Cui, C. E. Peishoff, M. H. Lambert, and J. B. Dunbar Jr. (2016) CSAR 2014: A benchmark exercise using unpublished data from pharma. J. Chem. Inf. Model. 56: 1063-1077.
Kim, S., J. Chen, T. Cheng, A. Gindulyte, J. He, S. He, Q. Li, B. A. Shoemaker, P. A. Thiessen, B. Yu, L. Zaslavsky, J. Zhang, and E. E. Bolton (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 47: D1102-D1109.
Mendez, D., A. Gaulton, A. P. Bento, J. Chambers, M. De Veij, E. Felix, M. P. Magarinos, J. F. Mosquera, P. Mutowo, M. Nowotka, M. Gordillo-Maranon, F. Hunter, L. Junco, G. Mugumbate, M. Rodriguez-Lopez, F. Atkinson, N. Bosc, C. J. Radoux, A. Segura-Cabrera, A. Hersey, and A. R. Leach (2019) ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 47: D930-D940.
Gilson, M. K., T. Liu, M. Baitaluk, G. Nicola, L. Hwang, and J. Chong (2016) BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 44: D1045–D1053.
Wishart, D. S., Y. D. Feunang, A. C. Guo, E. J. Lo, A. Marcu, J. R. Grant, T. Sajed, D. Johnson, C. Li, Z. Sayeeda, N. Assempour, I. Iynkkaran, Y. Liu, A. Maciejewski, N. Gale, A. Wilson, L. Chin, R. Cummings, D. Le, A. Pon, C. Knox, and M. Wilson (2018) DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 46: D1074-D1082.
Kanehisa, M., M. Furumichi, M. Tanabe, Y. Sato, and K. Morishima (2017) KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45: D353–D361.
Alexander, S. P. H., H. E. Benson, E. Faccenda, A. J. Pawson, J. L. Sharman, J. C. McGrath, W. A. Catterall, M. Spedding, J. A. Peters, A. J. Harmar, and CGTP Collaborators (2013) The concise guide to PHARMACOLOGY 2013/14: overview. Br. J. Pharmacol. 170: 1449-1458.
Hecker, N., J. Ahmed, J. von Eichborn, M. Dunkel, K. Macha, A. Eckert, M. K. Gilson, P. E. Bourne, and R. Preissner (2012) SuperTarget goes quantitative: update on drug-target interactions. Nucleic Acids Res. 40: D1113-D1117.
Gunther, S., M. Kuhn, M. Dunkel, M. Campillos, C. Senger, E. Petsalaki, J. Ahmed, E. G. Urdiales, A. Gewiess, L. J. Jensen, R. Schneider, R. Skoblo, R. B. Russell, P. E. Bourne, P. Bork, and R. Preissner (2008) SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res. 36: D919-D922.
Kuhn, M., C. von Mering, M. Campillos, L. J. Jensen, and P. Bork (2008) STITCH: interaction networks of chemicals and proteins. Nucleic Acids Res. 36: D684–D688.
Yang, H., C. Lou, L. Sun, J. Li, Y. Cai, Z. Wang, W. Li, G. Liu, and Y. Tang (2019) admetSAR 2.0: web-service for prediction and optimization of chemical ADMET properties. Bioinformatics. 35: 1067–1069.
Tomasulo, P. (2002) ChemIDplus-super source for chemical and drug information. Med. Ref. Serv. Q. 21: 53–59.
Richard, A. M., R. S. Judson, K. A. Houck, C. M. Grulke, P. Volarath, I. Thillainadarajah, C. Yang, J. Rathman, M. T. Martin, J. F. Wambaugh, T. B. Knudsen, J. Kancherla, K. Mansouri, G. Patlewicz, A. J. Williams, S. B. Little, K. M. Crofton, and R. S. Thomas (2016) ToxCast chemical landscape: Paving the road to 21st century toxicology. Chem. Res. Toxicol. 29: 1225-1251.
Tox21 Challenge. https://tripod.nih.gov/tox21/challenge/.
Watford, S., L. Ly Pham, J. Wignall, R. Shin, M. T. Martin, and K. P. Friedman (2019) ToxRefDB version 2.0: Improved utility for predictive and retrospective toxicology analyses. Reprod. Toxicol. 89: 145-158.
Sterling, T. and J. J. Irwin (2015) ZINC 15 — ligand discovery for everyone. J. Chem. Inf. Model. 55: 2324–2337.
Blum, L. C. and J. L. Reymond (2009) 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J. Am. Chem. Soc. 131: 8732-8733.
Ruddigkeit, L., R. van Deursen, L. C. Blum, and J. L. Reymond (2012) Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 52: 2864–2875.
Ramakrishnan, R., P. O. Dral, M. Rupp, and O. A. von Lilienfeld (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data. 1: 140022.
Visini, R., M. Awale, and J. L. Reymond (2017) Fragment database FDB-17. J. Chem. Inf. Model. 57: 700-709.
Sun, J., N. Jeliazkova, V. Chupakin, J. F. Golib-Dzib, O. Engkvist, L. Carlsson, J. Wegner, H. Ceulemans, I. Georgiev, V. Jeliazkov, N. Kochev, T. J. Ashby, and H. Chen (2017) ExCAPE-DB: an integrated large scale dataset facilitating Big Data analysis in chemogenomics. J. Cheminform. 9: 17.
Messenger, A. G. and J. Rundegren (2004) Minoxidil: Mechanisms of action on hair growth. Br. J. Dermatol. 150: 186–194.
Steinbach, G., P. M. Lynch, R. K. Phillips, M. H. Wallace, E. Hawk, G. B. Gordon, N. Wakabayashi, B. Saunders, Y. Shen, T. Fujimura, L. K. Su, B. Levin, L. Godio, S. Patterson, M. A. Rodriguez-Bigas, S. L. Jester, K. L. King, M. Schumacher, J. Abbruzzese, R. N. DuBois, W. N. Hittelman, S. Zimmerman, J. W. Sherman, and G. Kelloff (2000) The effect of celecoxib, a cyclooxygenase-2 inhibitor, in familial adenomatous polyposis. N. Engl. J. Med. 342: 1946-1952.
Von Eichborn, J., M. S. Murgueitio, M. Dunkel, S. Koerner, P. E. Bourne, and R. Preissner (2011) PROMISCUOUS: A database for network-based drug-repositioning. Nucleic Acids Res. 39: D1060–D1066.
Luo, H., P. Zhang, X. H. Cao, D. Du, H. Ye, H. Huang, C. Li, S. Qin, C. Wan, L. Shi, L. He, and L. Yang (2016) DPDR-CPI, a server that predicts drug positioning and drug repositioning via chemical-protein interactome. Sci. Rep. 6: 35996.
Brown, A. S. and C. J. Patel (2017) A standard database for drug repositioning. Sci. Data. 4: 170029.
Shameer, K., B. S. Glicksberg, R. Hodos, K. W. Johnson, M. A. Badgeley, B. Readhead, M. S. Tomlinson, T. O'Connor, R. Miotto, B. A. Kidd, R. Chen, A. Ma'ayan, and J. T. Dudley (2018) Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning. Brief Bioinform. 19: 656–678.
Cotto, K. C., A. H. Wagner, Y. Y. Feng, S. Kiwala, A. C. Coffman, G. Spies, A. Wollam, N. C. Spies, O. L. Griffith, and M. Griffith (2018) DGIdb 3.0: A redesign and expansion of the drug-gene interaction database. Nucleic Acids Res. 46: D1068–D1073.
Kohler, S., L. Carmody, N. Vasilevsky, J. O. B. Jacobsen, D. Danis, J. P. Gourdine, M. Gargano, N. L. Harris, N. Matentzoglu, J. A. McMurry, D. Osumi-Sutherland, V. Cipriani, J. P. Balhoff, T. Conlin, H. Blau, G. Baynam, R. Palmer, D. Gratian, H. Dawkins, M. Segal, A. C. Jansen, A. Muaz, W. H. Chang, J. Bergerson, S. J. F. Laulederkind, Z. Yuksel, S. Beltran, A. F. Freeman, P. I. Sergouniotis, D. Durkin, A. L. Storm, M. Hanauer, M. Brudno, S. M. Bello, M. Sincan, K. Rageth, M. T. Wheeler, R. Oegema, H. Lourghi, M. G. Della Rocca, R. Thompson, F. Castellanos, J. Priest, C. Cunningham-Rundles, A. Hegde, R. C. Lovering, C. Hajek, A. Olry, L. Notarangelo, M. Similuk, X. A. Zhang, D. Gomez-Andres, H. Lochmuller, H. Dollfus, S. Rosenzweig, S. Marwaha, A. Rath, K. Sullivan, C. Smith, J. D. Milner, D. Leroux, C. F. Boerkoel, A. Klion, M. C. Carter, T. Groza, D. Smedley, M. A. Haendel, C. Mungall, and P. N. Robinson (2019) Expansion of the Human Phenotype Ontology (HPO) knowledge base and resources. Nucleic Acids Res. 47: D1018–D1027.

Acknowledgments

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (NRF-2020R1A2C2004628) and by the Bio-Synergy Research Project (NRF-2017M3A9C4092978) of the Ministry of Science, ICT.

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005, Korea
Hyunho Kim, Eunyoung Kim, Ingoo Lee, Bongsung Bae, Minsu Park & Hojung Nam

Corresponding author

Correspondence to Hojung Nam.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors declare no conflict of interest.

Neither ethical approval nor informed consent was required for this study.

About this article

Cite this article

Kim, H., Kim, E., Lee, I. et al. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioproc E 25, 895–930 (2020). https://doi.org/10.1007/s12257-020-0049-y

Received: 13 February 2020
Revised: 27 May 2020
Accepted: 03 June 2020
Published: 07 January 2021
Issue Date: December 2020
DOI: https://doi.org/10.1007/s12257-020-0049-y
