Accessing, Using, and Creating Chemical Property Databases for Computational Toxicology Modeling

  • Antony J. Williams
  • Sean Ekins
  • Ola Spjuth
  • Egon L. Willighagen
Part of the Methods in Molecular Biology book series (MIMB, volume 929)


Toxicity data is expensive to generate, is increasingly seen as precompetitive, and is frequently used to build computational models in a discipline known as computational toxicology. Repositories of chemical property data support computational toxicologists in two ways: they provide access to data on potential toxicity issues with compounds, and they supply the material for building structure–toxicity relationships and associated prediction models. These relationships draw on mathematical, statistical, and computational modeling approaches and can be used to understand the mechanisms by which chemicals cause harm and, ultimately, to predict the adverse effects of these chemicals on human health and/or the environment. Such approaches are valuable because they offer an opportunity to prioritize chemicals for testing. An increasing amount of the data used by computational toxicologists is being published into the public domain and, in parallel, Open Source software for generating computational models is becoming more widely available. This chapter provides an overview of the types of data and software available and how these may be used to produce predictive toxicology models for the community.

Key words

Bioinformatics, Cheminformatics, Computational toxicology, Public domain toxicology data, QSAR, Toxicology databases



SE gratefully acknowledges the many collaborators involved in the cited work. Contributions by OS and ELW were supported by Uppsala University (KoF 07).

SE consults for Collaborative Drug Discovery, Inc. on a Bill and Melinda Gates Foundation Grant #49852, “Collaborative drug discovery for TB through a novel database of SAR data optimized to promote data archiving and sharing.”



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Antony J. Williams (1), corresponding author
  • Sean Ekins (2, 3, 4)
  • Ola Spjuth (5, 6)
  • Egon L. Willighagen (5, 7, 8)

  1. Royal Society of Chemistry, Wake Forest, USA
  2. Collaborations in Chemistry, Fuquay Varina, USA
  3. Department of Pharmaceutical Sciences, University of Maryland, Baltimore, USA
  4. Department of Pharmacology, University of Medicine & Dentistry of New Jersey (UMDNJ)-Robert Wood Johnson Medical School, Piscataway, USA
  5. Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
  6. Swedish e-Science Research Center, Royal Institute of Technology, Stockholm, Sweden
  7. Division of Molecular Toxicology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
  8. Department of Bioinformatics - BiGCaT, Maastricht University, Maastricht, The Netherlands
