Prediction of mutagenic toxicity by combination of Recursive Partitioning and Support Vector Machines

  • Full Length Paper
Molecular Diversity

Abstract

Predicting toxicity computationally is valuable because experimental measurement of toxicity is typically time-consuming and expensive. In this paper, the Recursive Partitioning (RP) method was used to select descriptors. RP and Support Vector Machines (SVM) were then used to construct two structure–toxicity relationship models, an RP model and an SVM model. The SVM model is more accurate than the RP model on all three data sets. The prediction accuracies of the RP model are 80.2% for mutagenic compounds in MDL’s toxicity database, 83.4% for compounds in CMC, and 84.9% for agrochemicals in an in-house database. The corresponding accuracies of the SVM model are 81.4%, 87.0%, and 87.3%.
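The descriptor-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gini splitting criterion, the stopping rules, the function names (`gini`, `best_split`, `select_descriptors`), and the toy descriptor matrix are all assumptions made for illustration. The idea shown is only that a tree grown by recursive partitioning uses some descriptors in its splits and ignores others, and the used ones can be kept as inputs for a later classifier such as an SVM.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (descriptor index, threshold, weighted impurity) of the
    best binary split, or None if no descriptor has two distinct values."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def select_descriptors(X, y, max_depth=3, min_samples=2):
    """Grow a tree by recursive partitioning and return the sorted
    indices of the descriptors that were used in at least one split."""
    used = set()

    def grow(rows, depth):
        labels = [y[i] for i in rows]
        if depth >= max_depth or len(rows) < min_samples or gini(labels) == 0.0:
            return
        split = best_split([X[i] for i in rows], labels)
        if split is None or split[2] >= gini(labels):
            return  # no split reduces impurity
        j, t, _ = split
        used.add(j)
        grow([i for i in rows if X[i][j] <= t], depth + 1)
        grow([i for i in rows if X[i][j] > t], depth + 1)

    grow(list(range(len(y))), 0)
    return sorted(used)

# Hypothetical toy data: 8 compounds, 3 descriptors, label 1 = mutagenic.
X = [
    [0.9, 1, 0.2],
    [0.8, 2, 0.9],
    [0.7, 1, 0.1],
    [0.6, 2, 0.8],
    [0.3, 1, 0.9],
    [0.2, 2, 0.8],
    [0.3, 2, 0.1],
    [0.1, 1, 0.2],
]
y = [1, 1, 1, 1, 1, 1, 0, 0]

selected = select_descriptors(X, y)
print(selected)  # descriptor indices the tree actually split on
```

On this toy data the tree splits only on descriptors 0 and 2, so descriptor 1 would be dropped before training a second model. In a real workflow the reduced descriptor matrix would then be passed to an SVM trainer; that step is omitted here.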

Abbreviations

SVM: Support Vector Machines

RP: Recursive Partitioning

Acknowledgments

The authors thank Dr. R. Bursi, Dr. S. S. Young, and Dr. C. Helma for supplying the data sets. This work was supported in part by the National Basic Research Program of China (973 Program) through Grant 2003CB114400; by the National High-Tech Program (863 Program) through Grant 2006AA02Z39; by the National Natural Science Foundation of China through Grants 20473112 and 20572120; and by the Chinese Academy of Sciences through Grants KGCX2-SW-213-05 and KGCX2-SW-213-01.

Author information

Corresponding author

Correspondence to Jianhua Yao.

About this article

Cite this article

Liao, Q., Yao, J. & Yuan, S. Prediction of mutagenic toxicity by combination of Recursive Partitioning and Support Vector Machines. Mol Divers 11, 59–72 (2007). https://doi.org/10.1007/s11030-007-9057-5

  • DOI: https://doi.org/10.1007/s11030-007-9057-5
