Maximum Entropy and Bayesian Methods, pp. 35–56

# Maximum Entropy, Likelihood and Uncertainty: A Comparison

## Abstract

A framework for comparing the maximum likelihood (ML) and maximum entropy (ME) approaches is developed. Two types of linear models are considered. In the first type, the objective is to estimate probability distributions given some moment conditions. In this case, the ME and ML estimators are equivalent. A generalization of this type of estimation model to incorporate noisy data is discussed as well. The second type encompasses the traditional linear-regression-type models, where the number of observations is larger than the number of unknowns and the objects to be inferred are not natural probabilities. After reviewing a generalized ME estimator and the empirical likelihood (or weighted least squares) estimator, the two are compared and contrasted with the ML estimator. It is shown that, in general, the ME estimators use less input information and may be viewed, within the second type of model, as expected log-likelihood estimators. In terms of informational ranking, if the objective is to estimate with minimum a priori assumptions, then the generalized ME estimator is superior to the other estimators. Two detailed examples, reflecting the two types of models, are discussed. The first example deals with estimating a first-order Markov process. In the second example, the empirical (natural) weights of each observation, together with the other unknowns, are the subject of interest.
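
The equivalence claimed for the first type of model can be illustrated with a minimal sketch. It is not taken from the chapter: the six-point support, the target mean of 4.5, the toy sample, and all function names are assumptions chosen for illustration. Under a single mean constraint, the ME distribution has the exponential form p_i ∝ exp(λ x_i) (the standard Lagrangian result), and the ML estimate of λ within that same exponential family, fitted to a sample whose mean equals the constraint value, should coincide with the ME multiplier.

```python
import math

def gibbs(lam, support):
    """Exponential-form distribution with p_i proportional to exp(lam * x_i)."""
    weights = [math.exp(lam * x) for x in support]
    z = sum(weights)
    return [w / z for w in weights]

def mean_under(lam, support):
    """Mean of the support under gibbs(lam, support)."""
    return sum(p * x for p, x in zip(gibbs(lam, support), support))

def solve_me(support, target_mean, lo=-10.0, hi=10.0, tol=1e-12):
    """ME under a mean constraint: bisect on lam until the moment condition
    E_lam[x] = target_mean holds (the mean is increasing in lam)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_under(mid, support) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def avg_loglik(lam, sample, support):
    """Average log-likelihood of the sample under gibbs(lam, support)."""
    prob = dict(zip(support, gibbs(lam, support)))
    return sum(math.log(prob[x]) for x in sample) / len(sample)

def solve_ml(sample, support, lo=-10.0, hi=10.0, tol=1e-8):
    """ML within the exponential family: golden-section search for the
    maximum of the (concave) average log-likelihood."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if avg_loglik(c, sample, support) > avg_loglik(d, sample, support):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Hypothetical setup: a six-sided die whose observed mean is 4.5.
support = [1, 2, 3, 4, 5, 6]
sample = [3, 4, 5, 6, 4, 5]              # toy data, sample mean = 4.5
target_mean = sum(sample) / len(sample)  # the only moment condition used

lam_me = solve_me(support, target_mean)  # multiplier from the ME problem
lam_ml = solve_ml(sample, support)       # ML estimate in the same family
p_me = gibbs(lam_me, support)            # the ME probabilities

print("ME multiplier:", lam_me)
print("ML estimate:  ", lam_ml)
print("ME probabilities:", [round(p, 4) for p in p_me])
```

The two routines solve different-looking problems (a moment condition versus a likelihood maximization), yet the ML first-order condition is exactly the moment condition, so the two values of λ agree up to numerical tolerance. The sketch covers only the noiseless, pure moment-condition case; it does not attempt the generalized ME or empirical likelihood estimators discussed for the second type of model.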

## Key words

Empirical likelihood; Information; Maximum entropy; Maximum likelihood
