Safe and Interpretable Machine Learning: A Methodological Review

Part of the Studies in Computational Intelligence book series (SCI, volume 445)

Abstract

When learning models from data, interpretability of the resulting model is often mandatory. For example, safety-related applications in automation and control require that the correctness of the model be ensured not only for the available data but for all possible input combinations. Understanding what the model has learned, and in particular how it will extrapolate to unseen data, is therefore a crucial concern. This paper discusses suitable learning methods for classification and regression. For classification problems, we review an approach based on an ensemble of nonlinear, low-dimensional submodels, where each submodel is simple enough to be verified completely by domain experts. For regression problems, we review related approaches that achieve interpretability through low-dimensional submodels (for instance, MARS and tree-growing methods) and compare them with symbolic regression, a different approach based on genetic algorithms. Finally, a novel approach is proposed that combines a symbolic regression model, shown to be easily interpretable, with a Gaussian process. The combined model improves accuracy and provides an error bound in the sense that its deviation from the verified symbolic model is always kept below a defined limit.
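The combination step in the last sentence can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the placeholder `symbolic_model`, the RBF kernel, and the bound `delta` are assumptions chosen for the example. The idea is that a Gaussian process is fitted to the residuals of the verified symbolic model, and its correction is clipped so that the combined prediction never leaves a `delta`-band around the symbolic model.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def symbolic_model(x):
    # Placeholder standing in for a verified symbolic-regression formula.
    return 1.5 * x[:, 0] + np.sin(x[:, 1])

def fit_residual_gp(X, y):
    # Fit a GP to the residuals of the verified symbolic model.
    residuals = y - symbolic_model(X)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6)
    gp.fit(X, residuals)
    return gp

def combined_predict(X, gp, delta=0.2):
    # delta is an illustrative bound: the GP correction is clipped so the
    # combined prediction deviates from the symbolic model by at most delta.
    correction = gp.predict(X)
    return symbolic_model(X) + np.clip(correction, -delta, delta)

# Usage sketch on synthetic data with a small unmodeled term plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 2))
y = symbolic_model(X) + 0.1 * X[:, 0] ** 2 + rng.normal(0.0, 0.05, 200)

gp = fit_residual_gp(X, y)
y_hat = combined_predict(X, gp)
# The guaranteed bound holds by construction.
assert np.all(np.abs(y_hat - symbolic_model(X)) <= 0.2 + 1e-12)
```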



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

Corporate Research and Technologies, Siemens AG, Munich, Germany
