Pharmaceutical Research, 35:170

Naïve Bayesian Models for Vero Cell Cytotoxicity

  • Alexander L. Perryman
  • Jimmy S. Patel
  • Riccardo Russo
  • Eric Singleton
  • Nancy Connell
  • Sean Ekins
  • Joel S. Freundlich
Research Paper



To advance translational research of potential therapeutic small molecules against infectious microbes, the compounds must display a relative lack of mammalian cell cytotoxicity. Vero cell cytotoxicity (CC50) is a common initial assay for this metric. We explored the development of naïve Bayesian models that can enhance the probability of identifying non-cytotoxic compounds.


Vero cell cytotoxicity assays were identified in PubChem, reformatted, and curated to create a training set with 8741 unique small molecules. These data were used to develop Bayesian classifiers, which were assessed with internal cross-validation, external tests with a set of 193 compounds from our laboratory, and independent validation with an additional diverse set of 1609 unique compounds from PubChem.
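The classification step described above can be illustrated with a minimal sketch of a Bernoulli naïve Bayes classifier over binary fingerprint features. This is a generic textbook formulation, not necessarily the authors' implementation or software. The function names, the smoothing constant `alpha`, the toy feature labels, and the convention that `True` marks a non-cytotoxic compound are all assumptions made for illustration. In practice, each feature set would hold the fingerprint bits (for example ECFP_6-style features) computed by a cheminformatics toolkit.

```python
import math
from collections import Counter


def train_bernoulli_nb(samples, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on (feature_set, label) pairs.

    label is True for the class of interest (assumed here: non-cytotoxic).
    alpha is an additive (Laplace) smoothing constant; it must be > 0.
    """
    n_pos = sum(1 for _, label in samples if label)
    n_neg = len(samples) - n_pos
    pos_counts, neg_counts = Counter(), Counter()
    for features, label in samples:
        (pos_counts if label else neg_counts).update(set(features))

    model = {"prior": math.log((n_pos + alpha) / (n_neg + alpha)), "bits": {}}
    for bit in set(pos_counts) | set(neg_counts):
        # Smoothed probability that the bit is set in each class.
        p_pos = (pos_counts[bit] + alpha) / (n_pos + 2 * alpha)
        p_neg = (neg_counts[bit] + alpha) / (n_neg + 2 * alpha)
        # Log-likelihood ratios for the bit being present and being absent.
        model["bits"][bit] = (
            math.log(p_pos / p_neg),
            math.log((1 - p_pos) / (1 - p_neg)),
        )
    return model


def log_odds(model, features):
    """Return the log-odds that a compound belongs to the True class."""
    features = set(features)
    total = model["prior"]
    for bit, (w_present, w_absent) in model["bits"].items():
        total += w_present if bit in features else w_absent
    return total


# Toy data: letters stand in for fingerprint bits (hypothetical values).
training = [
    ({"a", "b"}, True),
    ({"a", "c"}, True),
    ({"d", "c"}, False),
    ({"d", "e"}, False),
]
model = train_bernoulli_nb(training)
```

A positive value from `log_odds(model, features)` favors the `True` class and a negative value favors the `False` class. Bits never seen in training are ignored in this sketch.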


Evaluation with independent, external test and validation sets indicated that cytotoxicity Bayesian models constructed with the ECFP_6 descriptor were more accurate than those that used FCFP_6 fingerprints. The best cytotoxicity Bayesian model displayed predictive power in external evaluations, according to conventional and chance-corrected statistics, as well as enrichment factors.
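As a rough illustration of the kinds of statistics mentioned above, the following sketch computes conventional metrics, two common chance-corrected metrics, and an enrichment factor from a binary confusion matrix. The choice of Cohen's kappa and the Matthews correlation coefficient as the chance-corrected statistics, and the definition of the enrichment factor as the hit rate divided by the base rate of positives, are assumptions; the abstract does not specify which formulas the authors used. The function name and the example counts are hypothetical.

```python
import math


def classification_stats(tp, fp, tn, fn):
    """Summarize a binary confusion matrix (all four counts assumed > 0)."""
    n = tp + fp + tn + fn
    ppv = tp / (tp + fp)  # positive predictive value (hit rate)
    npv = tn / (tn + fn)  # negative predictive value (filtering rate)
    accuracy = (tp + tn) / n

    # Cohen's kappa: accuracy corrected for the agreement expected by chance.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    # Enrichment factor: hit rate relative to the fraction of true positives
    # in the whole evaluated set.
    enrichment = ppv / ((tp + fn) / n)

    return {
        "ppv": ppv,
        "npv": npv,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": kappa,
        "mcc": mcc,
        "enrichment": enrichment,
    }


# Hypothetical counts, not results from the study.
stats = classification_stats(tp=40, fp=10, tn=40, fn=10)
```

Which class counts as "positive" (for example, non-cytotoxic compounds) is a modeling convention that has to be fixed before these numbers are interpreted.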


The results from external tests demonstrate that our novel cytotoxicity Bayesian model displays sufficient predictive power to help guide translational research. To assist the chemical tool and drug discovery communities, our curated training set is being distributed as part of the Supplementary Material.

Graphical Abstract

Naïve Bayesian models have been trained with publicly available data and offer a useful tool for chemical biology and drug discovery to select for small molecules with a high probability of exhibiting acceptably low Vero cell cytotoxicity.

Key Words

Bayesian model; machine learning; predicting mammalian cytotoxicity; translational research; Vero cell CC50



Abbreviations

ADMET

Absorption, metabolism, distribution, excretion and toxicity

AID

Assay Identification number on PubChem BioAssay

ECFP_6

Extended class fingerprints of maximum diameter 6

FCFP_6

Molecular function class fingerprints of maximum diameter 6

NPV

Negative predictive value (filtering rate)

PPV

Positive predictive value (hit rate)

QSAR

Quantitative Structure-Activity Relationships

ROC

Receiver-operator characteristic

SAR

Structure-Activity Relationship

SMILES

Simplified molecular-input line-entry system

Vero CC50

Vero cell (African green monkey kidney cell) 50% cytotoxicity value


Compliance with Ethical Standards

Conflicts of Interest

S.E. is the Founder and CEO of Collaborations Pharmaceuticals Inc.

Supplementary material

ESM 1: 11095_2018_2439_MOESM1_ESM.docx (DOCX, 1.29 MB)



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Pharmacology, Physiology and Neuroscience, and Medicine, Rutgers University-New Jersey Medical School, Newark, USA
  2. Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University–New Jersey Medical School, Newark, USA
  3. Collaborations Pharmaceuticals, Inc., Raleigh, USA
