Abstract
We explore the impact of adding entropy and sparsity criteria to a standard neural network cost function, considering a variety of network types and applications. Measuring network performance via the testing error, we ask: does including an entropy criterion and/or a sparsity criterion, with suitable choice(s) of coefficient(s), improve network performance? The exploration suggests that adding either criterion alone, with an appropriate coefficient, improves performance, and that including both criteria, with appropriate coefficients, yields a further improvement. This finding reflects established results for parameter estimation inverse problems in a number of other settings.
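The abstract does not specify the exact form of the two criteria, but the idea can be sketched as a standard cost (here, mean squared error) augmented by a sparsity penalty and an entropy criterion, each with its own coefficient. In this hypothetical sketch, sparsity is encouraged via an L1 penalty on the weights and entropy is the Shannon entropy of the normalized absolute weights, which is maximized by subtracting it from the cost; the penalty choices and coefficient names (`lam_sparsity`, `lam_entropy`) are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def l1_sparsity(weights):
    # L1 norm of the weights: a common sparsity-promoting penalty.
    return np.sum(np.abs(weights))

def shannon_entropy(weights, eps=1e-12):
    # Shannon entropy of the normalized absolute weights.
    # eps guards against log(0) and division by zero.
    p = np.abs(weights) / (np.sum(np.abs(weights)) + eps)
    return -np.sum(p * np.log(p + eps))

def total_cost(y_pred, y_true, weights, lam_sparsity=0.01, lam_entropy=0.01):
    # Mean squared error plus a weighted sparsity criterion;
    # entropy is subtracted so that minimizing the cost maximizes entropy.
    mse = np.mean((y_pred - y_true) ** 2)
    return (mse
            + lam_sparsity * l1_sparsity(weights)
            - lam_entropy * shannon_entropy(weights))
```

Setting either coefficient to zero recovers a single-criterion cost, mirroring the comparisons the abstract describes (one criterion alone versus both together).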
Funding
The research of H. Kunze was funded by NSERC Discovery Grant #401274.
Author information
Contributions
The work in this paper is based primarily on work in BB's doctoral thesis, for which HK and KL were co-advisors.
Corresponding author
Ethics declarations
Conflict of interest
B. Boreland, H. Kunze, and K. Levere declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Boreland, B., Kunze, H. & Levere, K. The impact of sparsity and entropy criteria on neural network performance. Ann Oper Res (2024). https://doi.org/10.1007/s10479-024-05834-8