Progress on Open Chemoinformatic Tools for Drug Discovery

Part of the Computer-Aided Drug Discovery and Design book series (CADDD, volume 1)

Abstract

Informatics plays a fundamental role in many chemistry applications, giving rise to the consolidation of well-established disciplines such as bioinformatics and chemoinformatics. It has also led to the maturation of subdisciplines such as food informatics and epi-informatics and, more recently, natural products informatics. The extensive practice of informatics across disciplines and subdisciplines has been boosted by the large and growing availability of open, well-documented resources. A number of these have been implemented as web applications, which further encourages their use by the scientific community. In this chapter, we review recent progress in the development of public chemoinformatic resources for different tasks, with special emphasis on drug discovery applications. Because of the current COVID-19 pandemic, we highlight resources that have been developed and released over the past few months to support drug discovery efforts worldwide.

Keywords

  • COVID-19
  • Chemical space
  • Chemoinformatics
  • Drug discovery
  • Education
  • Natural products informatics
  • Open science
  • SARS-CoV-2
  • Structure-activity relationships
  • Webserver

Chapter DOI: 10.1007/978-3-030-95895-4_9
Chapter length: 23 pages
eBook ISBN: 978-3-030-95895-4

Abbreviations

AI: Artificial intelligence
COCONUT: Collection of Open Natural Products
CoVs: Coronaviruses
ETP: Epigenetic Target Profiler
LANaPD: Latin America Natural Product Database
NP: Natural products
SAR: Structure-activity relationships
SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2
SMILES: Simplified Molecular Input Line Entry System


Acknowledgments

We gratefully acknowledge the support of DGAPA, UNAM, through the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT), grant IN201321.

Author information

Corresponding author

Correspondence to José L. Medina-Franco.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Medina-Franco, J.L., Gutiérrez-Nieto, R., Gómez-Velasco, H. (2022). Progress on Open Chemoinformatic Tools for Drug Discovery. In: Scotti, M.T., Bellera, C.L. (eds) Drug Target Selection and Validation. Computer-Aided Drug Discovery and Design, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-95895-4_9
