An Interpretable Neural Model with Interactive Stepwise Influence

  • Yin Zhang
  • Ninghao Liu
  • Shuiwang Ji
  • James Caverlee
  • Xia Hu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11441)


Deep neural networks have achieved promising prediction performance, but they are often criticized for their lack of interpretability, which is essential in many real-world applications such as health informatics and political science. Meanwhile, it has been observed that many shallow models, such as linear or tree-based models, are fairly interpretable though not accurate enough. Motivated by these observations, in this paper we investigate how to take full advantage of the interpretability of shallow models in neural networks. To this end, we propose a novel interpretable neural model with an Interactive Stepwise Influence (ISI) framework. Specifically, in each iteration of the learning process, ISI interactively trains a shallow model with soft labels computed from a neural network, and the learned shallow model is then used to influence the neural network so that it gains interpretability. ISI thus achieves interpretability in three aspects: the importance of features, the impact of feature value changes, and the adaptability of feature weights during the neural network learning process. Experiments on a synthetic dataset and two real-world datasets demonstrate that ISI generates reliable interpretations with respect to these three aspects while preserving prediction accuracy comparable to state-of-the-art methods.
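The abstract's interactive scheme (train a shallow model on the network's soft labels, then let the shallow model influence the network) can be illustrated with a minimal numpy sketch. Everything concrete here is an assumption for illustration, not the paper's actual formulation: the shallow model is a linear least-squares fit, the network is a one-hidden-layer MLP, and the influence is a simple mimicry penalty with an assumed weight `alpha`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data: the label is driven mostly by feature 0.
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(float)
Xb = np.column_stack([X, np.ones(len(X))])  # intercept column for the shallow fit

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One-hidden-layer network (stand-in for the deep model).
W1 = rng.normal(scale=0.5, size=(3, 8))
w2 = rng.normal(scale=0.5, size=8)
alpha, lr = 0.5, 0.2  # mimicry weight and learning rate (assumed values)

for step in range(500):
    h = np.tanh(X @ W1)   # hidden activations
    p = sigmoid(h @ w2)   # network's soft labels

    # Step 1: refit the shallow (linear) model to the network's soft labels.
    beta, *_ = np.linalg.lstsq(Xb, p, rcond=None)
    p_shallow = np.clip(Xb @ beta, 0.0, 1.0)

    # Step 2: gradient step on cross-entropy plus a penalty pulling the
    # network's outputs toward the shallow model's predictions.
    grad_out = (p - y) + alpha * (p - p_shallow)  # d(loss)/d(logit)
    gw2 = h.T @ grad_out / len(y)
    gW1 = X.T @ (np.outer(grad_out, w2) * (1.0 - h**2)) / len(y)
    w2 -= lr * gw2
    W1 -= lr * gW1

acc = float(((p > 0.5) == y).mean())
print(np.round(beta[:3], 2), acc)  # shallow weights: feature 0 should dominate
```

The shallow model's coefficients `beta` serve as the interpretable readout (feature importance), while the mimicry term keeps the network's behavior close to something the shallow model can express. The clipping of `p_shallow` and the intercept column are implementation conveniences of this sketch, not claims about the paper's method.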


Keywords: Neural network · Interpretation · Stepwise Influence



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yin Zhang (1)
  • Ninghao Liu (1)
  • Shuiwang Ji (1)
  • James Caverlee (1)
  • Xia Hu (1)
  1. Texas A&M University, College Station, USA
