Chronic Rat Toxicity Prediction of Chemical Compounds Using Kernel Machines

  • Georg Hinselmann
  • Andreas Jahn
  • Nikolas Fechner
  • Andreas Zell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5483)


A recently published study showed the feasibility of chronic rat toxicity prediction, an important task for reducing the number of animal experiments by reusing knowledge from previous experiments. We benchmarked various kernel learning approaches for the prediction of chronic toxicity on a set of 565 chemical compounds labeled with the Lowest Observed Adverse Effect Level (LOAEL) and achieved a prediction error close to the interlaboratory reproducibility. To train the models, ε-Support Vector Regression was used in combination with numerical molecular descriptors and the Radial Basis Function kernel, as well as with graph kernels for molecular graphs. The results show that a kernel approach improves the Mean Squared Error and the Squared Correlation Coefficient under both leave-one-out cross-validation and a seeded 10-fold cross-validation averaged over 10 runs. Compared to the state of the art, the Mean Squared Error was improved to an \(\text{MSE}_{\text{loo}}\) of 0.45 and an \(\text{MSE}_{\text{cv}}\) of 0.46±0.09, which is close to the theoretical limit given by the estimated interlaboratory reproducibility of 0.41. The Squared Empirical Correlation Coefficient was improved to a \(\text{Q}^2_{\text{loo}}\) of 0.58 and a \(\text{Q}^2_{\text{cv}}\) of 0.57±0.10. The results show that numerical kernels and graph kernels are both suited for predicting chronic rat toxicity for unlabeled compounds.
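
The evaluation protocol named in the abstract (leave-one-out cross-validation plus a seeded 10-fold cross-validation repeated over 10 runs, scored by Mean Squared Error and squared correlation) can be sketched roughly as follows. This is a minimal, standard-library-only illustration on synthetic data, not the authors' implementation. The kernel-weighted-mean predictor is a simple stand-in for ε-Support Vector Regression, and the function names, the `gamma` value, the toy data, and the choice to report the standard deviation across runs are all assumptions made for the example.

```python
import math
import random
from statistics import mean, stdev


def rbf_kernel(x, y, gamma=0.5):
    """Radial Basis Function kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def predict(train_X, train_y, x, gamma=0.5):
    """Stand-in kernel predictor (kernel-weighted mean), NOT epsilon-SVR."""
    weights = [rbf_kernel(t, x, gamma) for t in train_X]
    total = sum(weights)
    if total == 0.0:
        return mean(train_y)
    return sum(w * v for w, v in zip(weights, train_y)) / total


def mse(y_true, y_pred):
    """Mean Squared Error between observed and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def squared_correlation(y_true, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


def loo_folds(n):
    """Leave-one-out: every sample forms its own test fold."""
    return [[i] for i in range(n)]


def seeded_kfolds(n, k, seed):
    """Shuffle indices with a fixed seed and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_predict(X, y, folds):
    """Predict each sample from a model built on the remaining folds."""
    preds = [None] * len(X)
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test_idx:
            preds[i] = predict(train_X, train_y, X[i])
    return preds


def evaluate(X, y, k=10, runs=10):
    """Return LOO scores and mean/std of k-fold scores over seeded runs."""
    loo_pred = cross_val_predict(X, y, loo_folds(len(X)))
    cv_mse, cv_q2 = [], []
    for seed in range(runs):  # one seed per run keeps the splits reproducible
        pred = cross_val_predict(X, y, seeded_kfolds(len(X), k, seed))
        cv_mse.append(mse(y, pred))
        cv_q2.append(squared_correlation(y, pred))
    return {
        "mse_loo": mse(y, loo_pred),
        "q2_loo": squared_correlation(y, loo_pred),
        "mse_cv": (mean(cv_mse), stdev(cv_mse)),
        "q2_cv": (mean(cv_q2), stdev(cv_q2)),
    }


# Synthetic toy data (hypothetical descriptors and targets, not LOAEL values).
rng = random.Random(0)
X = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(60)]
y = [math.sin(a) + 0.5 * b + rng.gauss(0, 0.1) for a, b in X]
results = evaluate(X, y)
print(results)
```

Fixing the seed per run makes each 10-fold split reproducible, so different kernels could be compared on identical partitions. Swapping `predict` for a real ε-SVR model would leave the rest of the protocol unchanged.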


Keywords: Support Vector Machine, Mean Square Error, Kernel Function, Radial Basis Function, Support Vector Regression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Georg Hinselmann (1)
  • Andreas Jahn (1)
  • Nikolas Fechner (1)
  • Andreas Zell (1)
  1. Wilhelm-Schickard-Institute for Computer Science, Dept. Computer Architecture, University of Tübingen, Tübingen, Germany
