Archives of Toxicology, Volume 90, Issue 10, pp 2445–2460

Toward a unifying strategy for the structure-based prediction of toxicological endpoints

  • Pau Carrió
  • Ferran Sanz
  • Manuel Pastor
Regulatory Toxicology


Most computational methods used for the prediction of toxicity endpoints are based on the assumption that similar compounds have similar biological properties. This principle can be exploited using computational methods such as read across or quantitative structure–activity relationships. However, there is no general agreement about which method is the most appropriate for quantifying compound similarity, nor about how best to exploit the similarity principle to obtain reliable estimates of compound properties. Moreover, the optimal similarity metrics and modeling methods might depend on the characteristics of the endpoints and training series used in each case. This study describes a comparative analysis of the predictive performance of diverse similarity metrics and modeling methods in toxicological applications. Two quantitative (n = 660, n = 1114) and three qualitative (n = 447, n = 905, n = 1220) datasets, representing very different endpoints of interest in drug safety evaluation, were analyzed, and rigorous methods were used to estimate the external predictive ability in each case. The results confirm that no single approach produces the best results in all instances; the best predictions were obtained using different tools in different situations. The trends observed in this study were exploited to propose a unifying strategy that allows the most suitable method to be used for every compound. A comparison of the quality of the predictions obtained by the unifying strategy with those obtained by standard prediction methods confirmed the usefulness of the proposed approach.
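As an illustration of the similarity principle mentioned above, the following minimal sketch shows one way a read-across estimate could be computed. It is not the method used in the study: the Tanimoto coefficient, the similarity-weighted average of the k nearest neighbors, the representation of compounds as sets of integer feature identifiers, and the toy values are all assumptions made here for illustration only.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a compound is a set of integer feature ids
# (e.g. the "on" bits of a structural fingerprint).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


def read_across(
    query: Fingerprint,
    training: List[Tuple[Fingerprint, float]],
    k: int = 3,
) -> float:
    """Similarity-weighted mean of the k most similar training compounds."""
    scored = [(tanimoto(query, fp), value) for fp, value in training]
    # Keep the k training compounds most similar to the query.
    nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in nearest)
    if total == 0.0:
        raise ValueError("query shares no features with the training series")
    return sum(sim * value for sim, value in nearest) / total


# Toy training series with made-up endpoint values (not data from the study).
training_series = [
    (frozenset({1, 2, 3, 4}), 5.0),
    (frozenset({1, 2, 3}), 6.0),
    (frozenset({7, 8, 9}), 2.0),
]
query_compound = frozenset({1, 2, 3, 5})

estimate = read_across(query_compound, training_series, k=2)
print(f"read-across estimate: {estimate:.3f}")
```

Swapping `tanimoto` for another similarity metric, or the weighted mean for a fitted model, changes the estimate for the same query, which is the kind of dependence on metric and method that the comparison described above addresses.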


Keywords: In silico toxicity prediction; QSAR; QSPR; Read across; Chemical domain



The research leading to these results has received support from the Innovative Medicines Initiative Joint Undertaking, under Grant Agreement No. 115002 (eTOX), resources of which are composed of a financial contribution from the European Union’s Seventh Framework Programme (FP7/2007–2013) and EFPIA companies’ in-kind contributions.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

204_2015_1618_MOESM1_ESM.gz (3.4 mb)
Supplementary material 1 (GZ 3471 kb)



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Spain
