Information Theory and an Extension of the Maximum Likelihood Principle

  • Hirotugu Akaike
Part of the Springer Series in Statistics book series (SSS)


In this paper it is shown that the classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion. This observation shows an extension of the principle to provide answers to many practical problems of statistical model fitting.
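The link between maximum likelihood and an information-theoretic criterion can be illustrated numerically. The sketch below is not taken from the paper. It assumes the criterion is the Kullback-Leibler divergence (which the abstract does not name) and uses a hypothetical Gaussian model with known unit variance. The function names `mean_log_likelihood` and `kl_gaussian` are illustrative. It shows that the candidate mean with the highest average log-likelihood on a sample is also the one with the smallest divergence from the true distribution.

```python
import math
import random

def mean_log_likelihood(data, mu, sigma=1.0):
    """Average Gaussian log-density of the data under N(mu, sigma^2)."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - 0.5 * ((x - mu) / sigma) ** 2 for x in data) / len(data)

def kl_gaussian(mu_true, mu_model, sigma=1.0):
    """KL divergence from N(mu_true, sigma^2) to N(mu_model, sigma^2)."""
    return (mu_true - mu_model) ** 2 / (2.0 * sigma ** 2)

# Assumed setup: the true distribution is N(2, 1); this is an illustrative choice.
random.seed(0)
mu_true = 2.0
data = [random.gauss(mu_true, 1.0) for _ in range(5000)]

# For this model the maximum likelihood estimate of the mean is the sample mean.
mu_hat = sum(data) / len(data)

# Compare the estimate with a few arbitrary alternative values of the mean.
candidates = [mu_hat, 1.0, 1.5, 2.5, 3.0]
for mu in candidates:
    ll = mean_log_likelihood(data, mu)
    kl = kl_gaussian(mu_true, mu)
    print(f"mu={mu:.3f}  mean log-likelihood={ll:.4f}  KL divergence={kl:.4f}")
```

In this toy setting the ordering of the candidates by average log-likelihood is the reverse of their ordering by divergence, which is the sense in which maximizing likelihood approximately minimizes the information-theoretic loss as the sample grows.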


Keywords: Autoregressive Model; Final Prediction Error; Maximum Likelihood Principle; Statistical Model Identification; Statistical Decision Function





Copyright information

© Springer Science+Business Media New York 1998

Authors and Affiliations

  • Hirotugu Akaike, Institute of Statistical Mathematics, Japan
