Abstract
Most computational methods used for the prediction of toxicity endpoints are based on the assumption that similar compounds have similar biological properties. This principle can be exploited using computational methods such as read-across or quantitative structure–activity relationships. However, there is no general agreement about which method is the most appropriate for quantifying compound similarity, or for exploiting the similarity principle to obtain reliable estimations of compound properties. Moreover, optimal similarity metrics and modeling methods might depend on the characteristics of the endpoints and training series used in each case. This study describes a comparative analysis of the predictive performance of diverse similarity metrics and modeling methods in toxicological applications. A collection of two quantitative (n = 660, n = 1114) and three qualitative (n = 447, n = 905, n = 1220) datasets, representing very different endpoints of interest in drug safety evaluation, was used, and rigorous methods were applied to estimate the external predictive ability in each case. The results confirm that no single approach produces the best results in all instances: the best predictions were obtained using different tools in different situations. The trends observed in this study were exploited to propose a unifying strategy that allows the most suitable method to be used for every compound. A comparison of the quality of the predictions obtained by the unifying strategy with those obtained by standard prediction methods confirmed the usefulness of the proposed approach.
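As an illustration of the similarity principle mentioned above, the following minimal sketch shows one common way it can be turned into a read-across style estimate. It is not the method evaluated in this study. It assumes that each compound is represented by a set of fingerprint on-bits, uses the Tanimoto coefficient as the similarity metric, and predicts a quantitative endpoint as the similarity-weighted mean of the k nearest training compounds. The function names, the toy fingerprints, and the endpoint values are all hypothetical.

```python
from typing import List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def read_across(
    query: Set[int],
    training: List[Tuple[Set[int], float]],
    k: int = 3,
) -> float:
    """Similarity-weighted mean endpoint of the k most similar training compounds."""
    if not training:
        raise ValueError("training set is empty")
    # Score every training compound against the query and keep the top k.
    scored = sorted(
        ((tanimoto(query, fp), y) for fp, y in training),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(s for s, _ in scored)
    if total == 0.0:
        # No neighbor shares any feature: fall back to an unweighted mean.
        return sum(y for _, y in scored) / len(scored)
    return sum(s * y for s, y in scored) / total


# Hypothetical toy data: (fingerprint on-bits, endpoint value).
training_set = [
    ({1, 2, 3, 4}, 5.0),
    ({1, 2, 3, 9}, 6.0),
    ({7, 8}, 1.0),
]
query_fp = {1, 2, 3, 5}
prediction = read_across(query_fp, training_set, k=2)
```

In this toy case the two nearest neighbors are equally similar to the query, so the estimate is simply the mean of their endpoint values. Changing the fingerprint, the similarity metric, or k can change which neighbors are selected, which is one reason why the choice of similarity metric matters for prediction quality.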
Acknowledgments
The research leading to these results has received support from the Innovative Medicines Initiative Joint Undertaking, under Grant Agreement No. 115002 (eTOX), resources of which are composed of a financial contribution from the European Union’s Seventh Framework Programme (FP7/2007–2013) and EFPIA companies’ in kind contributions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Carrió, P., Sanz, F. & Pastor, M. Toward a unifying strategy for the structure-based prediction of toxicological endpoints. Arch Toxicol 90, 2445–2460 (2016). https://doi.org/10.1007/s00204-015-1618-2
DOI: https://doi.org/10.1007/s00204-015-1618-2