Machine Learning Methods in Computational Toxicology

  • Protocol
Computational Toxicology

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1800))

Abstract

This review considers machine learning methods (supervised and unsupervised, linear and nonlinear, classification and regression) in combination with various types of molecular descriptors, both “handcrafted” and “data-driven,” in the context of their use in computational toxicology. It focuses on multiple linear regression, variants of the naïve Bayes classifier, k-nearest neighbors, support vector machines, decision trees, ensemble learning, random forest, several types of neural networks, and deep learning. The role of fragment descriptors, graph mining, and graph kernels is highlighted. Unsupervised methods, such as Kohonen’s self-organizing maps and related approaches, which allow predictions to be combined with data analysis and visualization, are also considered. The need to apply a wide range of machine learning methods in computational toxicology is underlined.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Baskin, I.I. (2018). Machine Learning Methods in Computational Toxicology. In: Nicolotti, O. (ed) Computational Toxicology. Methods in Molecular Biology, vol 1800. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7899-1_5

  • DOI: https://doi.org/10.1007/978-1-4939-7899-1_5

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7898-4

  • Online ISBN: 978-1-4939-7899-1

  • eBook Packages: Springer Protocols
