Abstract
Computational methods for efficiently predicting the druglikeness of small molecules, and their growing applications in medicinal chemistry and the drug industry, have been a significant scientific development, since only a small fraction of small-molecule libraries are ever identified as approvable drugs. In this study, a deep belief network was used to construct a druglikeness classification model. Small molecules and approved drugs from the ZINC database were selected for the unsupervised pre-training step and the supervised training step. Several binary fingerprints were investigated as data features: MACCS (166 bit), PubChem (881 bit), and Morgan (2048 bit). The results showed that an unsupervised pre-training phase can lead to a model with good performance and generalizability. With MACCS features, the accuracy, precision, and recall of the model were 97%, 96%, and 99%, respectively. To further assess generalizability, an external dataset was created using experimental and investigational drugs from DrugBank as drug data and randomly selected molecules from the ZINC database as non-drug data. The results confirmed the good performance and generalizability of the model. The outcomes also showed that a large proportion of the misclassified non-drug small molecules satisfy the bioavailability conditions and could be investigated as drugs in the future. Furthermore, the model has potential as a drug filter in drug discovery.
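The accuracy, precision, and recall figures reported in the abstract follow the standard definitions for a binary drug/non-drug classifier. A minimal sketch of those definitions is given here, assuming labels are encoded as 1 for drug and 0 for non-drug; the function name `classification_metrics` and the toy labels are illustrative only and are not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels.

    Assumption (not from the paper): 1 = drug, 0 = non-drug.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Fraction of all molecules classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of predicted drugs that are true drugs.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Fraction of true drugs that were recovered.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy example: 4 drugs and 4 non-drugs, with one non-drug predicted as a drug.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case every drug is recovered, so recall is 1.0, while the single false positive lowers precision to 0.8 and accuracy to 0.875. A reported recall higher than precision, as in the abstract, likewise indicates that false positives (non-drugs labelled as drugs) are more common than missed drugs.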
Cite this article
Hooshmand, S.A., Jamalkandi, S.A., Alavi, S.M. et al. Distinguishing drug/non-drug-like small molecules in drug discovery using deep belief network. Mol Divers 25, 827–838 (2021). https://doi.org/10.1007/s11030-020-10065-7
DOI: https://doi.org/10.1007/s11030-020-10065-7