Naïve Bayesian Models for Vero Cell Cytotoxicity

  • Research Paper
Pharmaceutical Research



To advance translational research of potential therapeutic small molecules against infectious microbes, the compounds must display a relative lack of mammalian cell cytotoxicity. Vero cell cytotoxicity (CC50) is a common initial assay for this metric. We explored the development of naïve Bayesian models that can enhance the probability of identifying non-cytotoxic compounds.


Vero cell cytotoxicity assays were identified in PubChem, reformatted, and curated to create a training set with 8741 unique small molecules. These data were used to develop Bayesian classifiers, which were assessed with internal cross-validation, external tests with a set of 193 compounds from our laboratory, and independent validation with an additional diverse set of 1609 unique compounds from PubChem.
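The kind of classifier described above can be illustrated with a minimal sketch, assuming a Bernoulli naïve Bayes model with Laplace smoothing trained on sets of fingerprint feature identifiers. This is not the authors' implementation: the toy feature IDs, the class labels, and the function names `train_nb`, `score_nb`, and `predict_nb` are hypothetical, and real ECFP_6 or FCFP_6 features would have to come from a cheminformatics toolkit. Cross-validation and external testing are not shown.

```python
import math

def train_nb(samples, labels):
    """Fit a Bernoulli naive Bayes model on sets of feature IDs."""
    vocab = sorted(set().union(*samples))
    classes = sorted(set(labels))
    model = {"vocab": vocab, "classes": classes, "prior": {}, "p": {}}
    for c in classes:
        members = [s for s, y in zip(samples, labels) if y == c]
        model["prior"][c] = len(members) / len(samples)
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        model["p"][c] = {
            f: (sum(f in s for s in members) + 1) / (len(members) + 2)
            for f in vocab
        }
    return model

def score_nb(model, sample):
    """Return the unnormalized log-posterior of each class for one sample."""
    scores = {}
    for c in model["classes"]:
        total = math.log(model["prior"][c])
        for f in model["vocab"]:
            p = model["p"][c][f]
            # Present features contribute log p, absent ones log(1 - p).
            total += math.log(p if f in sample else 1.0 - p)
        scores[c] = total
    return scores

def predict_nb(model, sample):
    """Return the class with the highest log-posterior."""
    scores = score_nb(model, sample)
    return max(scores, key=scores.get)

# Toy data (hypothetical): each compound is a set of fingerprint feature IDs.
train_x = [{1, 2, 3}, {1, 2}, {2, 3}, {7, 8}, {7, 9}, {8, 9}]
train_y = ["non-cytotoxic"] * 3 + ["cytotoxic"] * 3
model = train_nb(train_x, train_y)

print(predict_nb(model, {1, 3}))
print(predict_nb(model, {7, 8, 9}))
```

Features absent from the training vocabulary are ignored at prediction time in this sketch; a production model would need an explicit policy for unseen features.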


Evaluation with independent, external test and validation sets indicated that cytotoxicity Bayesian models constructed with the ECFP_6 descriptor were more accurate than those that used FCFP_6 fingerprints. The best cytotoxicity Bayesian model displayed predictive power in external evaluations, according to conventional and chance-corrected statistics, as well as enrichment factors.
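The evaluation statistics mentioned above can be sketched from a 2×2 confusion matrix. The example below is illustrative only: the counts are invented and are not results from the paper, the function name `classification_stats` is hypothetical, and Cohen's kappa and the Matthews correlation coefficient are assumed here as examples of chance-corrected statistics because the text does not name the ones used. The positive class is assumed to be "non-cytotoxic".

```python
import math

def classification_stats(tp, fp, tn, fn):
    """Compute conventional and chance-corrected statistics from counts."""
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    ppv = tp / (tp + fp)  # positive predictive value (hit rate)
    npv = tn / (tn + fn)  # negative predictive value (filtering rate)
    # Agreement expected by chance, given the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Enrichment factor: hit rate among predicted positives / base rate.
    base_rate = (tp + fn) / n
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
        "mcc": mcc,
        "enrichment": ppv / base_rate,
    }

# Invented counts for illustration; NOT taken from the paper.
stats = classification_stats(tp=40, fp=10, tn=35, fn=15)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

An enrichment factor above 1 means that compounds predicted to be non-cytotoxic are more often truly non-cytotoxic than compounds drawn at random from the same set.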


The results from external tests demonstrate that our novel cytotoxicity Bayesian model displays sufficient predictive power to help guide translational research. To assist the chemical tool and drug discovery communities, our curated training set is being distributed as part of the Supplementary Material.

Naïve Bayesian models have been trained with publicly available data and offer a useful tool for chemical biology and drug discovery to select small molecules with a high probability of exhibiting acceptably low Vero cell cytotoxicity.


Abbreviations

ADME/Tox: Absorption, distribution, metabolism, excretion and toxicity
AID: Assay identification number on PubChem BioAssay
ECFP_6: Extended-connectivity fingerprints of maximum diameter 6
FCFP_6: Functional-class fingerprints of maximum diameter 6
NPV: Negative predictive value (filtering rate)
PPV: Positive predictive value (hit rate)
QSAR: Quantitative structure-activity relationship
ROC: Receiver operating characteristic
SAR: Structure-activity relationship
SMILES: Simplified molecular-input line-entry system
Vero CC50: Vero cell (African green monkey kidney cell) 50% cytotoxicity value



Author information


Corresponding author

Correspondence to Joel S. Freundlich.

Ethics declarations

Conflicts of Interest

S.E. is the Founder and CEO of Collaborations Pharmaceuticals Inc.

Electronic supplementary material


(DOCX 1.29 MB)

About this article

Cite this article

Perryman, A.L., Patel, J.S., Russo, R. et al. Naïve Bayesian Models for Vero Cell Cytotoxicity. Pharm Res 35, 170 (2018).
