Abstract
This paper proposes a new method for measuring the relative importance of features in Artificial Neural Network (ANN) models. Its underlying principle is that the more important a feature is, the more the weights connected to the respective input neuron will change during the training of the model. To capture this behavior, a running variance of every weight connected to the input layer is maintained during training, using an adaptation of Welford's online algorithm for computing variance. When training is finished, for each input, the variances of its weights are combined with the final weights to obtain the measure of relative importance for each feature. This method was tested with shallow and deep neural network architectures on several well-known classification and regression problems. The results obtained confirm that the approach produces meaningful measurements. Moreover, the importance scores are highly correlated with those of the variable importance method from Random Forests (RF).
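The abstract outlines a two-step pipeline: track a running variance of every input-layer weight with Welford's online algorithm, then combine those variances with the final weights into a per-feature score. Below is a minimal NumPy sketch of that idea, under stated assumptions: weights are snapshotted once per epoch, and the combination rule (variance weighted by the absolute final weight, summed over hidden units and normalized) is illustrative. The names `WelfordAccumulator` and `feature_importance` are hypothetical and not taken from the paper; the author's reference implementation is at the GitHub link in the Notes.

```python
import numpy as np


class WelfordAccumulator:
    """Running mean/variance of a weight matrix (Welford, 1962)."""

    def __init__(self, shape):
        self.n = 0                   # number of snapshots seen
        self.mean = np.zeros(shape)  # running mean of each weight
        self.m2 = np.zeros(shape)    # running sum of squared deviations

    def update(self, w):
        """Fold one snapshot of the input-layer weights into the statistics."""
        self.n += 1
        delta = w - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (w - self.mean)

    @property
    def variance(self):
        """Unbiased per-weight variance over the snapshots seen so far."""
        return self.m2 / (self.n - 1) if self.n > 1 else np.zeros_like(self.m2)


def feature_importance(acc, final_w):
    """Combine weight variances with the final weights into per-feature scores.

    `final_w` has shape (n_features, n_hidden). Weighting the variance by
    |final weight| and normalizing to sum to 1 are illustrative assumptions,
    not necessarily the exact combination rule used in the paper.
    """
    scores = (acc.variance * np.abs(final_w)).sum(axis=1)
    return scores / scores.sum()


# Usage inside a training loop (model-specific parts elided):
#   acc = WelfordAccumulator(first_layer_weights.shape)
#   for epoch in range(n_epochs):
#       ...train one epoch...
#       acc.update(first_layer_weights)   # snapshot after each epoch
#   importances = feature_importance(acc, first_layer_weights)
```

Because Welford's update is single-pass and constant-memory per weight, this tracking adds only one elementwise pass over the first-layer weights per epoch, regardless of how long training runs.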
Notes
1. All the results presented in this paper can be replicated using the Python file at https://github.com/rebelosa/feature-importance-neural-networks.
References
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Garson, G.D.: Interpreting neural-network connection weights. AI Expert 6(4), 46–51 (1991)
Heaton, J., McElwee, S., Fraley, J.B., Cannady, J.: Early stabilizing feature importance for TensorFlow deep neural networks. In: 2017 International Joint Conference on Neural Networks, IJCNN 2017, Anchorage, AK, USA, 14–19 May 2017, pp. 4618–4624 (2017)
Martínez, A., Castellanos, J., Hernández, C., de Mingo López, L.F.: Study of weight importance in neural networks working with colineal variables in regression problems. In: Multiple Approaches to Intelligent Systems, 12th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, IEA/AIE-99, Cairo, Egypt, May 31 – June 3, 1999, Proceedings, pp. 101–110 (1999)
Olden, J.D., Jackson, D.A.: Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks. Ecol. Model. 154(1), 135–150 (2002)
Paliwal, M., Kumar, U.A.: Assessing the contribution of variables in feed forward neural network. Appl. Soft Comput. 11(4), 3690–3696 (2011)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Shavitt, I., Segal, E.: Regularization learning networks: deep learning for tabular datasets. In: Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3–8 December 2018, Montréal, Canada, pp. 1386–1396 (2018)
Welford, B.P.: Note on a method for calculating corrected sums of squares and products. Technometrics 4(3), 419–420 (1962)
Acknowledgments
I gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
de Sá, C.R. (2019). Variance-Based Feature Importance in Neural Networks. In: Kralj Novak, P., Šmuc, T., Džeroski, S. (eds) Discovery Science. DS 2019. Lecture Notes in Computer Science, vol 11828. Springer, Cham. https://doi.org/10.1007/978-3-030-33778-0_24
DOI: https://doi.org/10.1007/978-3-030-33778-0_24
Print ISBN: 978-3-030-33777-3
Online ISBN: 978-3-030-33778-0
eBook Packages: Computer Science (R0)