Abstract
Mutual Information (MI) is a long-studied measure of information content, and many attempts have been made to apply it to feature extraction and stochastic coding. However, MI is in general computationally intractable to evaluate, and most previous studies redefine the criterion in the form of approximations. Recently we described properties of a simple lower bound on MI and discussed its links to some of the popular dimensionality reduction techniques [1]. Here we introduce a richer family of auxiliary variational bounds on MI, which generalizes our previous approximations. Our specific focus is on applying the bound to extracting informative lower-dimensional projections in the presence of irreducible Gaussian noise. We show that our method produces significantly tighter bounds than the well-known as-if-Gaussian approximations of MI. We also show that the auxiliary variable method may significantly improve reconstructions from noisy lower-dimensional projections.
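The lower bound referred to above is the variational bound of [1], I(x; y) ≥ H(x) + E_{p(x,y)}[log q(x|y)], where q(x|y) is an arbitrary "decoder" distribution and the bound is tight when q equals the true posterior p(x|y). A minimal sketch on a toy linear-Gaussian channel, where the exact MI is available in closed form for comparison, is given below; the toy setup, dimensions, and variable names are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-Gaussian channel: x ~ N(0, I_d), y = W x + eps, eps ~ N(0, s2 I_k).
d, k, s2 = 5, 2, 0.1
W = rng.standard_normal((k, d))
Sx = np.eye(d)

# Closed-form MI (in nats) for the Gaussian channel:
#   I(x; y) = 1/2 log det(W Sx W^T + s2 I) - 1/2 log det(s2 I)
Sy = W @ Sx @ W.T + s2 * np.eye(k)
mi_true = 0.5 * (np.linalg.slogdet(Sy)[1] - k * np.log(s2))

# Barber-Agakov bound: I(x; y) >= H(x) + E_{p(x,y)}[log q(x|y)],
# with a Gaussian decoder q(x|y) = N(x; A y, C).  Here we use the
# posterior-mean decoder (which makes the bound tight for this channel)
# and estimate the expectation by Monte Carlo.
A = Sx @ W.T @ np.linalg.inv(Sy)   # posterior-mean decoder matrix
C = Sx - A @ W @ Sx                # posterior covariance

n = 200_000
x = rng.standard_normal((n, d))
y = x @ W.T + np.sqrt(s2) * rng.standard_normal((n, k))
r = x - y @ A.T                    # decoder residuals
Cinv = np.linalg.inv(C)
log_q = -0.5 * (np.einsum('ni,ij,nj->n', r, Cinv, r)
                + np.linalg.slogdet(C)[1] + d * np.log(2 * np.pi))

H_x = 0.5 * (d * np.log(2 * np.pi) + np.linalg.slogdet(Sx)[1] + d)
mi_bound = H_x + log_q.mean()

print(mi_true, mi_bound)  # bound matches the exact MI up to Monte Carlo error
```

With a restricted decoder (e.g. diagonal C, or a lower-dimensional auxiliary variable as in the paper) the bound drops below the exact value, and maximizing it over the projection W and the decoder parameters yields informative projections.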
References
Barber, D., Agakov, F.V.: The IM Algorithm: A Variational Approach to Information Maximization. In: Advances in Neural Information Processing Systems. MIT Press, Cambridge (2003)
Linsker, R.: An Application of the Principle of Maximum Information Preservation to Linear Systems. In: Touretzky, D. (ed.) Advances in Neural Information Processing Systems, vol. 1. Morgan-Kaufmann, San Francisco (1989)
Bell, A.J., Sejnowski, T.J.: An information-maximization approach to blind separation and blind deconvolution. Neural Computation 7(6), 1129–1159 (1995)
Brunel, N., Nadal, J.P.: Mutual Information, Fisher Information and Population Coding. Neural Computation 10, 1731–1757 (1998)
Agakov, F.V., Barber, D.: Variational Information Maximization for Neural Coding. In: International Conference on Neural Information Processing. Springer, Heidelberg (2004)
Agakov, F.V.: Variational Information Maximization in Stochastic Environments. PhD thesis, School of Informatics, University of Edinburgh (2005)
Bishop, C., Svensen, M., Williams, C.K.I.: GTM: The Generative Topographic Mapping. Neural Computation 10(1), 215–234 (1998)
Tipping, M.E., Bishop, C.M.: Mixtures of Probabilistic Principal Component Analyzers. Neural Computation 11(2), 443–482 (1999)
Linsker, R.: Deriving Receptive Fields Using an Optimal Encoding Criterion. In: Hanson, S., Cowan, J., Gilese, L. (eds.) Advances in Neural Information Processing Systems, vol. 5. Morgan Kaufmann, San Francisco (1993)
Neal, R.M., Hinton, G.E.: A View of the EM Algorithm That Justifies Incremental, Sparse, and Other Variants. In: Jordan, M. (ed.) Learning in Graphical Models. Kluwer Academic, Dordrecht (1998)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Chichester (1991)
Arimoto, S.: An algorithm for computing the capacity of arbitrary discrete memoryless channels. IEEE Transactions on Information Theory 18 (1972)
Blahut, R.: Computation of channel capacity and rate-distortion functions. IEEE Transactions on Information Theory 18 (1972)
Jacobs, R.A., Jordan, M.I., Nowlan, S.J., Hinton, G.E.: Adaptive Mixtures of Local Experts. Neural Computation 3 (1991)
Tishby, N., Pereira, F.C., Bialek, W.: The information bottleneck method. In: Proc. of the 37-th Annual Allerton Conference on Communication, Control and Computing, pp. 368–377 (1999)
LeCun, Y., Cortes, C.: The MNIST database of handwritten digits (1998)
Cardoso, J.F.: Infomax and maximum likelihood for blind source separation. IEEE Signal Processing Letters 4 (1997)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Agakov, F., Barber, D. (2006). Auxiliary Variational Information Maximization for Dimensionality Reduction. In: Saunders, C., Grobelnik, M., Gunn, S., Shawe-Taylor, J. (eds) Subspace, Latent Structure and Feature Selection. SLSFS 2005. Lecture Notes in Computer Science, vol 3940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11752790_6
Print ISBN: 978-3-540-34137-6
Online ISBN: 978-3-540-34138-3
eBook Packages: Computer Science (R0)