Abstract
Natural products have historically been a major resource for drug discovery, and they are attracting renewed attention as advances in genomic sequencing and other technologies make them more attractive and amenable to drug candidate screening. Collecting and mining bioactivity information on natural products is important for accelerating the drug development process and reducing its cost. A number of publicly accessible databases have recently been established to facilitate access to chemical biology data for small molecules, including natural products. It is therefore important for scientists in related fields to exploit these resources to expedite their research on natural products as drug leads or candidates for disease treatment. PubChem, a public database, contains a large number of natural products with associated bioactivity data. In this review, we introduce the information system provided by PubChem and systematically describe how a set of PubChem web services can be used for rapid retrieval, analysis, and downloading of natural product data. We hope this work can serve as a starting point for researchers performing data mining on natural products using PubChem.
Cite this article
Hao, M., Cheng, T., Wang, Y. et al. Web search and data mining of natural products and their bioactivities in PubChem. Sci. China Chem. 56, 1424–1435 (2013). https://doi.org/10.1007/s11426-013-4910-0
DOI: https://doi.org/10.1007/s11426-013-4910-0