Machine Learning Methods in Computational Toxicology

  • Igor I. Baskin
Part of the Methods in Molecular Biology book series (MIMB, volume 1800)


This review considers a range of machine learning methods (supervised and unsupervised, linear and nonlinear, classification and regression), in combination with various types of molecular descriptors, both “handcrafted” and “data-driven,” in the context of their use in computational toxicology. It focuses on multiple linear regression, variants of the naïve Bayes classifier, k-nearest neighbors, support vector machines, decision trees, ensemble learning, random forest, several types of neural networks, and deep learning. The role of fragment descriptors, graph mining, and graph kernels is highlighted. Unsupervised methods, such as Kohonen’s self-organizing maps and related approaches, which allow predictions to be combined with data analysis and visualization, are also considered. The need to apply a wide range of machine learning methods in computational toxicology is underlined.

Key words

Computational toxicology; Machine learning; Support vector machines; Random forest; Neural networks; Deep learning



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Igor I. Baskin (1, 2)
  1. Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russian Federation
  2. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russian Federation
