Interpretation of Conformal Prediction Classification Models

  • Ernst Ahlberg
  • Ola Spjuth
  • Catrin Hasselgren
  • Lars Carlsson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9047)

Abstract

We present a method for interpretation of conformal prediction models. The discrete gradient of the largest p-value is calculated with respect to object space. A criterion is applied to identify the most important component of the gradient and the corresponding part of the object is visualized.

The method is exemplified with drug discovery data relating chemical compounds to mutagenicity. Furthermore, the results are compared with subgraphs already established as important for mutagenicity; this initial assessment suggests that the method is useful for interpreting a conformal predictor.
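
The steps named in the abstract (take the largest p-value, form its discrete gradient over the object's components, pick the most important component) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class-conditional inductive p-value formula, the toy linear nonconformity score with weights `W`, the unit forward-difference step, the "largest absolute gradient component" criterion, and all function names.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Score = Callable[[Sequence[float], str], float]


def p_values(x: Sequence[float], calibration: Dict[str, List[float]],
             score: Score) -> Dict[str, float]:
    """Class-conditional inductive conformal p-values for object x.

    calibration maps each label to the nonconformity scores of its
    calibration examples (an assumed setup, not the paper's exact one).
    """
    result = {}
    for label, cal_scores in calibration.items():
        alpha = score(x, label)
        n_at_least = sum(1 for a in cal_scores if a >= alpha)
        result[label] = (n_at_least + 1) / (len(cal_scores) + 1)
    return result


def discrete_gradient(x: Sequence[float], calibration: Dict[str, List[float]],
                      score: Score, step: float = 1.0) -> Tuple[str, List[float]]:
    """Forward differences of the largest p-value, one per component of x."""
    base = p_values(x, calibration, score)
    label = max(base, key=base.get)  # class with the largest p-value
    grad = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] += step  # change one descriptor at a time
        p_new = p_values(perturbed, calibration, score)[label]
        grad.append((p_new - base[label]) / step)
    return label, grad


def most_important_component(grad: Sequence[float]) -> int:
    """Assumed criterion: the component with the largest absolute gradient."""
    return max(range(len(grad)), key=lambda i: abs(grad[i]))


# Hypothetical toy model: a linear decision value over three descriptor counts.
W = [0.5, -0.2, 2.0]


def toy_score(x: Sequence[float], label: str) -> float:
    s = sum(w * xi for w, xi in zip(W, x))
    # A high decision value conforms to "mutagen", so its nonconformity is -s.
    return -s if label == "mutagen" else s


toy_calibration = {
    "mutagen": [-3.0, -2.0, -1.0, 0.0, 1.0],
    "non-mutagen": [-1.0, 0.0, 1.0, 2.0, 3.0],
}

label, grad = discrete_gradient([1, 2, 0], toy_calibration, toy_score)
print(label, grad, most_important_component(grad))
```

In a chemical setting the selected index would be mapped back to the part of the molecule that the descriptor encodes, which is the part that would then be visualized.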

Keywords

Object Space, QSAR Model, Gradient Component, Discrete Gradient, Signature Descriptor
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ernst Ahlberg (1)
  • Ola Spjuth (2)
  • Catrin Hasselgren (3)
  • Lars Carlsson (1)

  1. Drug Safety and Metabolism, AstraZeneca Innovative Medicines and Early Development, Mölndal, Sweden
  2. Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
  3. Internal Medicine, University of New Mexico, Albuquerque, USA