Maximal Information Divergence from Statistical Models Defined by Neural Networks

Montúfar, Guido; Rauh, Johannes; Ay, Nihat

doi:10.1007/978-3-642-40020-9_85

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8085)

Included in the following conference series:

International Conference on Geometric Science of Information

Abstract

We review recent results about the maximal values of the Kullback-Leibler information divergence from statistical models defined by neural networks, including naïve Bayes models, restricted Boltzmann machines, deep belief networks, and various classes of exponential families. We illustrate approaches to compute the maximal divergence from a given model starting from simple sub- or super-models. We give a new result for deep and narrow belief networks with finite-valued units.
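As a small illustration of the quantity the abstract refers to, the sketch below computes the Kullback-Leibler divergence from a distribution to the independence model on n binary variables, one of the simplest exponential families. It is an illustrative example and not taken from the chapter: the function names are hypothetical, and it relies on the standard fact that the divergence from the independence model equals the multi-information, the sum of the marginal entropies minus the joint entropy.

```python
import math
from itertools import product


def entropy(probs):
    """Shannon entropy in nats of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def divergence_from_independence(p, n):
    """KL divergence from p to the independence model on n binary variables.

    p maps each tuple in {0,1}^n to its probability. The closest product
    distribution is the product of the marginals of p, so the divergence
    equals the multi-information: sum_i H(X_i) - H(X).
    (Hypothetical helper written for this illustration.)
    """
    marginal_entropies = 0.0
    for i in range(n):
        p_one = sum(prob for x, prob in p.items() if x[i] == 1)
        marginal_entropies += entropy([p_one, 1.0 - p_one])
    return marginal_entropies - entropy(p.values())


n = 3
states = list(product([0, 1], repeat=n))

# The uniform distribution is itself a product distribution.
uniform = {x: 1.0 / len(states) for x in states}

# Equal mass on the all-zeros and all-ones states: fully correlated units.
two_point = {x: 0.0 for x in states}
two_point[(0,) * n] = 0.5
two_point[(1,) * n] = 0.5

d_uniform = divergence_from_independence(uniform, n)
d_two_point = divergence_from_independence(two_point, n)
```

Here `d_uniform` is zero up to rounding, since the uniform distribution lies in the model, while `d_two_point` equals (n - 1) log 2, the largest value the multi-information of n binary variables can take. Richer models, such as those named in the abstract, contain the independence model, so the divergence from them can only be smaller or equal.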

Author information

Authors and Affiliations

Department of Mathematics, Pennsylvania State University, University Park, PA, 16802, USA
Guido Montúfar
Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103, Leipzig, Germany
Johannes Rauh & Nihat Ay
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM, 87501, USA
Nihat Ay

Editor information

Editors and Affiliations

Sony Computer Science Laboratories Inc, Tokyo, Japan
Frank Nielsen
Thales Land Air Systems, 91470, Limours, France
Frédéric Barbaresco

Cite this paper

Montúfar, G., Rauh, J., Ay, N. (2013). Maximal Information Divergence from Statistical Models Defined by Neural Networks. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2013. Lecture Notes in Computer Science, vol 8085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40020-9_85

DOI: https://doi.org/10.1007/978-3-642-40020-9_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40019-3
Online ISBN: 978-3-642-40020-9
eBook Packages: Computer Science, Computer Science (R0)
