Multi-level selective potentiality maximization for interpreting multi-layered neural networks

Abstract

The present paper aims to extract the inference mechanism of neural networks, which is assumed to be hidden behind complicated surface phenomena, by maximizing information under multiple selective constraints. To make the information content easy to interpret, information is represented as the selectivity of components, or selective potentiality. Selective potentiality represents the ability of neurons to respond selectively to inputs, and this selectivity should increase exclusively as signals pass through successive neurons. In addition, because selectivity can be realized by increasing the strength of connection weights, we try to reduce this strength as much as possible, that is, to minimize the cost. Selectivity and cost are applied hierarchically as multiple constraints, disentangling complicated components so that the functions of neurons and connection weights become as clear as possible and the inner inference mechanism can be found. The method was applied to the simple qualitative bankruptcy data set and to the more complicated bank marketing data set, where the number of hidden layers was increased to 15 to examine how multi-layered networks can be used to disentangle complicated components. Experimental results showed that selective potentiality could disentangle connection weights and eventually produce linear and individual features that are easy to interpret.
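The abstract does not spell out the underlying formulas. As a rough illustration only, selectivity measures of this kind are often defined as one minus the normalized entropy of a neuron's absolute responses, and the cost as the total squared connection strength. The NumPy sketch below rests on those assumptions; the function names and the entropy-based formulation are illustrative, not the paper's exact definitions.

    import numpy as np

    def selective_potentiality(responses):
        # Selectivity of one neuron over a batch of inputs:
        # 1 - normalized entropy of its absolute responses.
        # 1.0 = the neuron responds to a single input (fully selective);
        # 0.0 = it responds uniformly to all inputs (not selective).
        u = np.abs(responses)
        p = u / (u.sum() + 1e-12)
        entropy = -(p * np.log(p + 1e-12)).sum()
        return 1.0 - entropy / np.log(len(p))

    def connection_cost(W):
        # Cost of a weight matrix: total squared connection strength,
        # which the method tries to keep as small as possible.
        return float((W ** 2).sum())

    # Hypothetical usage on one randomly initialized hidden layer.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 10))   # 100 inputs with 10 features
    W = rng.normal(size=(10, 5))     # weights into 5 hidden neurons
    H = np.maximum(X @ W, 0.0)       # ReLU hidden activations
    print([round(selective_potentiality(H[:, j]), 3) for j in range(5)])
    print(connection_cost(W))

In a training loop, maximizing the first quantity while minimizing the second would push each neuron toward responding to few inputs with weights no stronger than necessary, which is the trade-off between selectivity and cost that the abstract describes.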



Author information

Correspondence to Ryotaro Kamimura.



About this article


Cite this article

Kamimura, R. Multi-level selective potentiality maximization for interpreting multi-layered neural networks. Appl Intell 52, 13961–13986 (2022). https://doi.org/10.1007/s10489-021-02705-8

