
Distinguishing drug/non-drug-like small molecules in drug discovery using deep belief network

  • Original Article
  • Molecular Diversity

Abstract

Computational methods that efficiently predict the drug-likeness of small molecules have become an important development for medicinal chemistry and the pharmaceutical industry, since only a small fraction of the molecules in small-molecule libraries are ever approved as drugs. In this study, a deep belief network was used to build a drug-likeness classification model. Small molecules and approved drugs from the ZINC database were selected for the unsupervised pre-training step and the supervised training step. Several binary fingerprints were investigated as input features: MACCS (166 bits), PubChem (881 bits), and Morgan (2048 bits). The results showed that an unsupervised pre-training phase can yield a model with good performance and generalizability. With MACCS features, the accuracy, precision, and recall of the model were 97%, 96%, and 99%, respectively. To assess generalizability further, an external dataset was built from experimental and investigational drugs in DrugBank as the drug class and randomly selected molecules from the ZINC database as the non-drug class. The results on this set confirmed the model's good performance and generalizability. In addition, a large proportion of the non-drug small molecules that the model misclassified satisfy bioavailability criteria and could be investigated as drug candidates in the future. The model therefore has potential as a drug-likeness filter in drug discovery.
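The abstract reports accuracy, precision, and recall for a binary drug/non-drug classifier. As a minimal sketch of how these three metrics relate to a confusion matrix, the snippet below computes them from predicted and true labels. The function name `classification_metrics` and the label counts are hypothetical: the counts are invented for illustration and only chosen to land near the reported percentages; they are not the paper's data, and the paper's actual confusion matrix is not given here.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels.

    Label convention (an assumption of this sketch): 1 = drug, 0 = non-drug.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # drugs predicted as drugs
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-drugs predicted as non-drugs
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-drugs predicted as drugs
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # drugs predicted as non-drugs

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Invented example: 1000 drugs followed by 1000 non-drugs (NOT the paper's data).
y_true = [1] * 1000 + [0] * 1000
# 990 drugs recovered, 10 missed; 45 non-drugs flagged as drugs, 955 rejected.
y_pred = [1] * 990 + [0] * 10 + [1] * 45 + [0] * 955

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

In this convention, the false positives are the "misclassified non-drug" molecules the abstract refers to, and they lower precision without affecting recall.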

Author information

Corresponding author

Correspondence to Ali Masoudi-Nejad.

About this article

Cite this article

Hooshmand, S.A., Jamalkandi, S.A., Alavi, S.M. et al. Distinguishing drug/non-drug-like small molecules in drug discovery using deep belief network. Mol Divers 25, 827–838 (2021). https://doi.org/10.1007/s11030-020-10065-7

  • DOI: https://doi.org/10.1007/s11030-020-10065-7
