Utilizing high throughput screening data for predictive toxicology models: protocols and application to MLSCN assays

Guha, Rajarshi; Schürer, Stephan C.

doi:10.1007/s10822-008-9192-9

Published: 19 February 2008

Volume 22, pages 367–384, (2008)

Journal of Computer-Aided Molecular Design

Abstract

Computational toxicology is emerging as an encouraging alternative to experimental testing. The Molecular Libraries Screening Center Network (MLSCN), as part of the NIH Molecular Libraries Roadmap, has recently started generating large and diverse screening datasets, which are publicly available in PubChem. In this report, we investigate various aspects of developing computational models to predict cell toxicity based on cell proliferation screening data generated in the MLSCN. By capturing feature-based information in those datasets, such predictive models would be useful in evaluating cell-based screening results in general (for example, from reporter assays) and could be used as an aid to identify and eliminate potentially undesired compounds. Specifically, we present the results of random forest ensemble models developed using different cell proliferation datasets and highlight protocols that take into account their extremely imbalanced nature. Depending on the nature of the datasets and the descriptors employed, we were able to achieve correct classification rates between 70% and 85% on the prediction set, though the accuracy rate dropped significantly when the models were applied to in vivo data. In this context, we also compare the MLSCN cell proliferation results with animal acute toxicity data to investigate to what extent animal toxicity can be correlated with, and potentially predicted by, proliferation results. Finally, we present a visualization technique that allows one to compare a new dataset to the training set of the models to decide whether the new dataset may be reliably predicted.
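The abstract does not spell out how the imbalance between toxic and non-toxic compounds is handled. One common approach, shown here only as a minimal sketch and not as the authors' protocol, is to train each ensemble member on a balanced subsample (all minority-class compounds plus an equal number of randomly drawn majority-class compounds), combine the members by majority vote, and report the percentage of correctly classified compounds. The function names `balanced_subsamples`, `majority_vote`, and `percent_correct` are hypothetical, the labels are assumed to be binary, and the base classifier (for example, a random forest) is left out.

```python
import random
from collections import Counter


def balanced_subsamples(labels, n_models, seed=0):
    """Yield one index list per ensemble member.

    Each list holds every minority-class row plus an equally sized random
    draw (without replacement) from the majority class. Assumes two classes.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    for _ in range(n_models):
        yield minority_idx + rng.sample(majority_idx, len(minority_idx))


def majority_vote(votes_per_model):
    """Combine per-model prediction lists into one label per compound."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*votes_per_model)]


def percent_correct(y_true, y_pred):
    """Percentage of compounds whose predicted class matches the observed one."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


if __name__ == "__main__":
    # Illustrative labels only: 5 "toxic" versus 95 "nontoxic" compounds.
    labels = ["toxic"] * 5 + ["nontoxic"] * 95
    for subset in balanced_subsamples(labels, n_models=3):
        print(Counter(labels[i] for i in subset))
```

Each index list would be used to fit one classifier, and the resulting predictions on a held-out set would be passed to `majority_vote`. With strongly imbalanced data, an overall percentage can be dominated by the majority class, so per-class rates are usually worth reporting alongside it.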

Acknowledgements

RG would like to acknowledge funding from NIH Grant No. P20 HG003894-01. SCS acknowledges support from the National Institutes of Health Molecular Library Screening Center Network (Grant No. U54 MH074404-01, Prof. Hugh Rosen, Principal Investigator).

Author information

Authors and Affiliations

School of Informatics, Indiana University, Bloomington, IN, 47406, USA
Rajarshi Guha
Department of Scientific Computing, The Scripps Research Institute, Jupiter, FL, 33458, USA
Stephan C. Schürer

Corresponding author

Correspondence to Rajarshi Guha.

About this article

Cite this article

Guha, R., Schürer, S.C. Utilizing high throughput screening data for predictive toxicology models: protocols and application to MLSCN assays. J Comput Aided Mol Des 22, 367–384 (2008). https://doi.org/10.1007/s10822-008-9192-9

Received: 04 October 2007
Accepted: 30 January 2008
Published: 19 February 2008
Issue Date: June 2008
DOI: https://doi.org/10.1007/s10822-008-9192-9
