Abstract
In his doctoral thesis and in a prize-winning paper in the IEEE Transactions on Information Theory, R.E. Blahut described two computational algorithms that are important in information theory. The first, which was also described by Arimoto, computes channel capacity and the distribution that achieves it. The second computes the rate-distortion function and distributions that achieve it. Blahut derived these algorithms as alternating maximization and minimization algorithms. Two closely related algorithms in estimation theory are the expectation-maximization algorithm and the generalized iterative scaling algorithm, each of which can be written as an alternating minimization algorithm. Algorithmic and historical connections among these four algorithms are explored.
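As an illustration of the first algorithm mentioned in the abstract, the following is a minimal sketch of a Blahut-Arimoto style capacity iteration for a discrete memoryless channel. It is not taken from the chapter: the function name `blahut_arimoto_capacity`, the row-stochastic matrix convention `W[x][y] = P(y|x)`, the tolerance, and the iteration cap are all assumptions made for this example. Each pass alternates between computing the output distribution induced by the current input distribution and reweighting the input distribution by the exponentiated divergence of each channel row from that output distribution.

```python
import math

def blahut_arimoto_capacity(W, tol=1e-12, max_iter=10_000):
    """Sketch of a capacity iteration; W[x][y] = P(y|x) (assumed convention).

    Returns (capacity estimate in bits, input distribution as a list).
    """
    n_in, n_out = len(W), len(W[0])
    p = [1.0 / n_in] * n_in          # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * W[x][y] for x in range(n_in)) for y in range(n_out)]
        # d[x] = divergence D(W(.|x) || q) in nats; zero-probability terms are skipped.
        d = [
            sum(W[x][y] * math.log(W[x][y] / q[y])
                for y in range(n_out) if W[x][y] > 0)
            for x in range(n_in)
        ]
        # Lower and upper bounds on capacity (in nats) for the current p.
        z = sum(p[x] * math.exp(d[x]) for x in range(n_in))
        lower, upper = math.log(z), max(d)
        # Multiplicative update of the input distribution, renormalized by z.
        p = [p[x] * math.exp(d[x]) / z for x in range(n_in)]
        if upper - lower < tol:
            break
    return lower / math.log(2), p

# Usage: binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
capacity_bits, input_dist = blahut_arimoto_capacity(bsc)
```

For the binary symmetric channel in the usage lines, the estimate can be checked against the closed form 1 - H(0.1), where H is the binary entropy function in bits. The stopping rule uses the gap between the two bounds, which is one common choice and not the only possible one.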
References
S. Arimoto, An algorithm for computing the capacity of an arbitrary discrete memoryless channel, IEEE Trans. Inform. Theory, vol. 18, pp. 14–20, 1972.
R.E. Blahut, Computation of channel capacity and rate distortion functions, IEEE Trans. Inform. Theory, vol. 18, pp. 460–473, 1972.
—, Principles and Practice of Information Theory, Addison-Wesley 1987.
C.L. Byrne, Iterative image reconstruction algorithms based on cross-entropy minimization, IEEE Trans. Image Processing, vol. 2, pp. 96–103, 1993.
T.M. Cover and J.A. Thomas, Elements of Information Theory, New York: Wiley 1991.
I. Csiszár, On the computation of rate-distortion functions, IEEE Trans. Inform. Theory, vol. 20, pp. 122–124, 1974.
—, I-divergence geometry of probability distributions and minimization problems, Annals of Probability, vol. 3, pp. 146–158, 1975.
—, A geometric interpretation of Darroch and Ratcliff’s generalized iterative scaling, Annals of Statistics, vol. 17, pp. 1409–1413, 1989.
—, Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems, Annals of Statistics, vol. 19, pp. 2032–2066, 1991.
I. Csiszár and G. Tusnády, Information geometry and alternating minimization procedures, Statistics and Decisions, Suppl. issue 1, pp. 205–237, 1984.
J.N. Darroch and D. Ratcliff, Generalized iterative scaling for log-linear models, Annals Math. Statistics, vol. 43, pp. 1470–1480, 1972.
A.P. Dempster, N.M. Laird, and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Royal Stat. Society, Series B, vol. 39, pp. 1–37, 1977.
A.O. Hero and J.A. Fessler, Convergence in norm for alternating expectation-maximization (EM) type algorithms, Statistica Sinica, vol. 5, pp. 41–54, 1995.
S. Kullback, Information Theory and Statistics, Wiley, New York, 1959.
S. Kullback and M.A. Khairat, A note on minimum discrimination information, Annals Math. Statistics, vol. 37, pp. 279–280, 1966.
S. Kullback and R.A. Leibler, On information and sufficiency, Annals Math. Statistics, vol. 22, pp. 79–86, 1951.
L.B. Lucy, Iterative technique for the rectification of observed distributions, Astronomical Journal, vol. 79, pp. 745–765, 1974.
M.I. Miller and D.L. Snyder, The role of likelihood and entropy in incomplete-data problems: Applications to estimating point-process intensities and Toeplitz and constrained covariances, Proc. IEEE, vol. 75, pp. 892–907, 1987.
W.H. Richardson, Bayesian-based iterative method of image restoration, J. Optical Society of America, vol. 62, pp. 55–59, 1972.
L.A. Shepp and Y. Vardi, Maximum-likelihood reconstruction for emission tomography, IEEE Trans. Medical Imaging, vol. 1, pp. 113–122, 1982.
J.E. Shore and R.W. Johnson, Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy, IEEE Trans. Inform. Theory, vol. 26, pp. 26–37, 1980.
D.L. Snyder, T.J. Schulz, and J.A. O’Sullivan, Deblurring subject to nonnegativity constraints, IEEE Trans. Signal Processing, vol. 40, pp. 1143–1150, 1992.
Y. Vardi and D. Lee, From image deblurring to optimal investments: Maximum-likelihood solutions to positive linear inverse problems, J. Royal Stat. Society, Series B, pp. 569–612, 1993.
Y. Vardi, L.A. Shepp, and L. Kaufman, A statistical model for positron emission tomography, J. Amer. Statistical Association, vol. 80, pp. 8–35, 1985.
C.F.J. Wu, On the convergence properties of the EM algorithm, Annals Statistics, vol. 11, pp. 95–103, 1983.
© 1998 Springer Science+Business Media New York
Cite this chapter
O’Sullivan, J.A. (1998). Alternating Minimization Algorithms: From Blahut-Arimoto to Expectation-Maximization. In: Vardy, A. (ed.) Codes, Curves, and Signals. The Springer International Series in Engineering and Computer Science, vol 485. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5121-8_13
DOI: https://doi.org/10.1007/978-1-4615-5121-8_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7330-8
Online ISBN: 978-1-4615-5121-8
eBook Packages: Springer Book Archive