Molecular Diversity, Volume 10, Issue 2, pp 147–158

Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity

  • Christoph Helma
Full-length article


lazar is a new tool for the prediction of toxic properties of chemical structures. It derives predictions for query structures from a database of experimentally determined toxicity data. lazar generates a prediction by searching the database for compounds that are similar to the query with respect to a given toxic activity and calculating the prediction from their activities. Apart from the prediction, lazar provides the rationales for it (structural features and similar compounds) and a reliable confidence index that indicates whether a query structure falls within the applicability domain of the training database.
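The general idea of predicting from similar database compounds can be illustrated with a minimal Python sketch. It is not lazar's actual algorithm: the Tanimoto similarity over sets of feature strings, the `min_similarity` cutoff, the +1/-1 activity coding, and the way the confidence value is computed are all assumptions chosen for illustration, and the names `tanimoto` and `predict` are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: a compound is a set of structural feature
# strings plus an activity coded as +1 (toxic) or -1 (non-toxic).
Compound = Tuple[FrozenSet[str], int]


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict(query: FrozenSet[str],
            database: List[Compound],
            min_similarity: float = 0.3) -> Tuple[int, float]:
    """Return (label, confidence) from a similarity-weighted neighbor vote.

    label is +1, -1, or 0 when no prediction can be made. The confidence
    (an assumption of this sketch) is the absolute weighted vote divided by
    the number of neighbors, so it drops when neighbors are dissimilar or
    disagree with each other.
    """
    neighbors = [(tanimoto(query, feats), act) for feats, act in database]
    neighbors = [(s, a) for s, a in neighbors if s >= min_similarity]
    if not neighbors:
        return 0, 0.0  # nothing similar enough: outside the toy domain
    score = sum(s * a for s, a in neighbors)
    label = 1 if score > 0 else -1 if score < 0 else 0
    return label, abs(score) / len(neighbors)


database = [
    (frozenset({"a", "b"}), 1),
    (frozenset({"a", "b", "c"}), 1),
    (frozenset({"x", "y"}), -1),
]
label, confidence = predict(frozenset({"a", "b"}), database)
```

In this toy setting a low confidence value plays the role of an applicability-domain warning: a query with no sufficiently similar neighbors receives no prediction at all.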

Leave-one-out (LOO) crossvalidation experiments were carried out for 10 carcinogenicity endpoints ({female|male} {hamster|mouse|rat} carcinogenicity and the aggregate endpoints {hamster|mouse|rat} carcinogenicity and rodent carcinogenicity) and for Salmonella mutagenicity, all from the Carcinogenic Potency Database (CPDB). An external validation of Salmonella mutagenicity predictions was performed with a dataset of 3895 structures. Leave-one-out and external validation experiments indicate that Salmonella mutagenicity can be predicted with 85% accuracy for compounds within the applicability domain of the CPDB. The LOO accuracy of lazar predictions of rodent carcinogenicity is 86%; the accuracies for the other carcinogenicity endpoints vary between 78% and 95% for structures within the applicability domain.
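The evaluation scheme described above, leave-one-out crossvalidation with accuracy reported only for structures inside the applicability domain, might look roughly like the following Python sketch. The predictor `nearest_neighbor`, the toy dataset, and the `min_confidence` threshold are hypothetical stand-ins for illustration; they do not reproduce lazar or the CPDB data.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

Compound = Tuple[FrozenSet[str], int]  # (features, activity coded +1 / -1)
Predictor = Callable[[FrozenSet[str], List[Compound]], Tuple[int, float]]


def nearest_neighbor(query: FrozenSet[str],
                     training: List[Compound]) -> Tuple[int, float]:
    """Toy predictor: copy the activity of the most similar compound.

    Returns (label, confidence), where the confidence is simply the
    Tanimoto similarity to that compound (an assumption of this sketch).
    """
    best_sim, best_act = 0.0, 0
    for feats, act in training:
        union = len(query | feats)
        sim = len(query & feats) / union if union else 0.0
        if sim > best_sim:
            best_sim, best_act = sim, act
    return best_act, best_sim


def loo_accuracy(dataset: List[Compound],
                 predict: Predictor,
                 min_confidence: float) -> Tuple[Optional[float], int]:
    """Leave-one-out accuracy over predictions with sufficient confidence.

    Each compound is removed in turn, predicted from the remaining ones,
    and counted only if the confidence reaches min_confidence.
    Returns (accuracy or None if nothing was covered, number covered).
    """
    correct = covered = 0
    for i, (features, activity) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]  # hold out compound i
        label, confidence = predict(features, training)
        if confidence < min_confidence:
            continue  # treated as outside the applicability domain
        covered += 1
        correct += int(label == activity)
    return (correct / covered if covered else None), covered


data = [
    (frozenset("ab"), 1),
    (frozenset("abc"), 1),
    (frozenset("xy"), -1),
    (frozenset("xyz"), -1),
    (frozenset("q"), 1),
]
accuracy, covered = loo_accuracy(data, nearest_neighbor, min_confidence=0.5)
```

Raising `min_confidence` trades coverage for accuracy, which is why an accuracy figure of this kind is only meaningful together with the number of structures it covers.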


Keywords: applicability domain; carcinogenic potency database; data mining; lazar; predictive toxicology; (quantitative) structure-activity relationships



Abbreviations

CCRIS: chemical carcinogenesis research information system
CPDB: carcinogenic potency database
DSSTox: distributed structure-searchable toxicity project
lazar: lazy structure-activity relationships
LOO: leave-one-out crossvalidation
(Q)SAR: (quantitative) structure-activity relationships





Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  1. In Silico Toxicology, Freiburg, Germany
