Alternating Minimization Algorithms: From Blahut-Arimoto to Expectation-Maximization

Chapter in Codes, Curves, and Signals

Abstract

In his doctoral thesis and in a prize-winning paper in the IEEE Transactions on Information Theory, R.E. Blahut described two computational algorithms that are important in information theory. The first, which was also described by Arimoto, computes channel capacity and the distribution that achieves it. The second computes the rate-distortion function and distributions that achieve it. Blahut derived these algorithms as alternating maximization and minimization algorithms. Two closely related algorithms in estimation theory are the expectation-maximization algorithm and the generalized iterative scaling algorithm, each of which can be written as an alternating minimization algorithm. Algorithmic and historical connections between these four algorithms are explored.
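For a concrete picture of the first of these algorithms, the following is a minimal sketch (in Python with NumPy, not taken from the chapter) of the Blahut-Arimoto iteration for the capacity of a discrete memoryless channel: it alternates between computing the posterior q(x|y) induced by the current input distribution and re-optimizing that distribution. The function name, tolerance, iteration cap, and the binary symmetric channel example are illustrative choices only.

import numpy as np

def blahut_arimoto_capacity(P, tol=1e-9, max_iter=10_000):
    # P[x, y] = Pr(Y = y | X = x); each row of P sums to 1.
    n_in = P.shape[0]
    r = np.full(n_in, 1.0 / n_in)            # start from the uniform input distribution
    for _ in range(max_iter):
        # Step 1: posterior q(x | y) induced by the current input law r
        q = r[:, None] * P                    # joint r(x) P(y | x)
        q /= q.sum(axis=0, keepdims=True)     # normalize over x for each y
        # Step 2: re-optimize the input law, r(x) proportional to exp(sum_y P(y|x) log q(x|y))
        log_r = np.sum(P * np.log(q + 1e-300), axis=1)
        r_new = np.exp(log_r - log_r.max())
        r_new /= r_new.sum()
        done = np.max(np.abs(r_new - r)) < tol
        r = r_new
        if done:
            break
    # Mutual information of the final (r, P) pair, in bits
    p_y = r @ P
    ratio = np.where(P > 0, P / p_y[None, :], 1.0)
    capacity = np.sum(r[:, None] * P * np.log2(ratio))
    return capacity, r

# Binary symmetric channel with crossover probability 0.1:
P_bsc = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
C, r_opt = blahut_arimoto_capacity(P_bsc)
print(C, r_opt)   # C is about 0.531 bits = 1 - h(0.1); r_opt is about [0.5, 0.5]

The chapter relates this alternating structure to the rate-distortion, expectation-maximization, and generalized iterative scaling algorithms.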


References

  1. S. Arimoto, An algorithm for computing the capacity of an arbitrary discrete memoryless channel, IEEE Trans. Inform. Theory, vol. 18, pp. 14–20, 1972.
  2. R.E. Blahut, Computation of channel capacity and rate distortion functions, IEEE Trans. Inform. Theory, vol. 18, pp. 460–473, 1972.
  3. R.E. Blahut, Principles and Practice of Information Theory, Addison-Wesley, 1987.
  4. C.L. Byrne, Iterative image reconstruction algorithms based on cross-entropy minimization, IEEE Trans. Image Processing, vol. 2, pp. 96–103, 1993.
  5. T.M. Cover and J.A. Thomas, Elements of Information Theory, New York: Wiley, 1991.
  6. I. Csiszár, On the computation of rate-distortion functions, IEEE Trans. Inform. Theory, vol. 20, pp. 122–124, 1974.
  7. I. Csiszár, I-divergence geometry of probability distributions and minimization problems, Annals of Probability, vol. 3, pp. 146–158, 1975.
  8. I. Csiszár, A geometric interpretation of Darroch and Ratcliff's generalized iterative scaling, Annals of Statistics, vol. 17, pp. 1409–1413, 1989.
  9. I. Csiszár, Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems, Annals of Statistics, vol. 19, pp. 2032–2066, 1991.
  10. I. Csiszár and G. Tusnády, Information geometry and alternating minimization procedures, Statistics and Decisions, Supplement Issue 1, pp. 205–207, 1984.
  11. J.N. Darroch and D. Ratcliff, Generalized iterative scaling for log-linear models, Annals of Mathematical Statistics, vol. 43, pp. 1470–1480, 1972.
  12. A.P. Dempster, N.M. Laird, and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Royal Statistical Society, Series B, vol. 39, pp. 1–37, 1977.
  13. A.O. Hero and J.A. Fessler, Convergence in norm for alternating expectation-maximization (EM) type algorithms, Statistica Sinica, vol. 5, pp. 41–54, 1995.
  14. S. Kullback, Information Theory and Statistics, New York: Wiley, 1959.
  15. S. Kullback and M.A. Khairat, A note on minimum discrimination information, Annals of Mathematical Statistics, vol. 37, pp. 279–280, 1966.
  16. S. Kullback and R.A. Leibler, On information and sufficiency, Annals of Mathematical Statistics, vol. 22, pp. 79–86, 1951.
  17. L.B. Lucy, Iterative technique for the rectification of observed distributions, Astronomical Journal, vol. 79, pp. 745–765, 1974.
  18. M.I. Miller and D.L. Snyder, The role of likelihood and entropy in incomplete-data problems: Applications to estimating point-process intensities and Toeplitz constrained covariances, Proc. IEEE, vol. 75, pp. 892–907, 1987.
  19. W.H. Richardson, Bayesian-based iterative method of image restoration, J. Optical Society of America, vol. 62, pp. 55–59, 1972.
  20. L.A. Shepp and Y. Vardi, Maximum-likelihood reconstruction for emission tomography, IEEE Trans. Medical Imaging, vol. 1, pp. 113–122, 1982.
  21. J.E. Shore and R.W. Johnson, Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy, IEEE Trans. Inform. Theory, vol. 26, pp. 26–37, 1980.
  22. D.L. Snyder, T.J. Schulz, and J.A. O'Sullivan, Deblurring subject to nonnegativity constraints, IEEE Trans. Signal Processing, vol. 40, pp. 1143–1150, 1992.
  23. Y. Vardi and D. Lee, From image deblurring to optimal investments: Maximum-likelihood solutions to positive linear inverse problems, J. Royal Statistical Society, Series B, pp. 569–612, 1993.
  24. Y. Vardi, L.A. Shepp, and L. Kaufmann, A statistical model for positron emission tomography, J. Amer. Statistical Association, vol. 80, pp. 8–35, 1985.
  25. C.F.J. Wu, On the convergence properties of the EM algorithm, Annals of Statistics, vol. 11, pp. 95–103, 1983.


Copyright information

© 1998 Springer Science+Business Media New York

About this chapter

Cite this chapter

O’Sullivan, J.A. (1998). Alternating Minimization Algorithms: From Blahut-Arimoto to Expectation-Maximization. In: Vardy, A. (eds) Codes, Curves, and Signals. The Springer International Series in Engineering and Computer Science, vol 485. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5121-8_13


  • DOI: https://doi.org/10.1007/978-1-4615-5121-8_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7330-8

  • Online ISBN: 978-1-4615-5121-8

