Summary
We take a detailed look at Akaike's information criterion (AIC) and Kullback-Leibler cross-validation (KLCV) in the problem of histogram density estimation. Two different definitions of “number of unknown parameters” in AIC are considered. A careful description is given of the influence of density tail properties on performance of both types of AIC and on KLCV. A number of practical conclusions emerge. In particular, we find that AIC will often give problems when used with heavy-tailed unbounded densities, but can perform quite well with compactly supported densities. In the latter case, both types of AIC produce similar results, and those results will sometimes be asymptotically equivalent to the ones obtained from KLCV. However, depending on the shape of the true density, the KLCV method can fail to balance “bias” and “variance” components of loss, with the result that KLCV and AIC may produce very different results.
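To make the two criteria concrete, the following is a minimal sketch of how they might be computed for an equal-width histogram. The summary gives no formulas, so everything here is an assumption made for illustration: the function names, the use of "log-likelihood minus (number of cells minus 1)" as the AIC-type score, the two parameter counts (all cells spanning the data versus non-empty cells only), and the leave-one-out form of KLCV. It should not be read as the paper's exact definitions.

```python
import numpy as np

def histogram_counts(x, h, origin=0.0):
    """Counts in equal-width cells [origin + k*h, origin + (k+1)*h) spanning the data."""
    idx = np.floor((np.asarray(x, dtype=float) - origin) / h).astype(int)
    idx -= idx.min()              # shift so the leftmost occupied cell has index 0
    return np.bincount(idx)       # includes empty cells between the extremes

def log_likelihood(counts, h):
    """Log-likelihood of the histogram estimate N_j / (n h) at the sample points."""
    n = counts.sum()
    nz = counts[counts > 0]
    return float(np.sum(nz * np.log(nz / (n * h))))

def aic_score(counts, h, nonempty_only=False):
    """Penalised log-likelihood (larger is better).

    Assumption: the penalty is the number of free cell probabilities, counted
    either over all cells spanning the data or over non-empty cells only.
    """
    k = np.count_nonzero(counts) if nonempty_only else counts.size
    return log_likelihood(counts, h) - (k - 1)

def klcv_score(counts, h):
    """Leave-one-out log-likelihood (larger is better).

    Removing a point from cell j leaves the estimate (N_j - 1) / ((n - 1) h),
    so any cell holding a single observation drives the score to -inf.
    """
    n = counts.sum()
    nz = counts[counts > 0]
    if np.any(nz == 1):
        return float("-inf")
    return float(np.sum(nz * np.log((nz - 1) / ((n - 1) * h))))
```

In this sketch a cell width would be chosen by maximising either score over a grid of values of `h`. The `-inf` branch in `klcv_score` shows one way isolated observations in the tails can dominate the cross-validation score.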
Cite this article
Hall, P. Akaike's information criterion and Kullback-Leibler loss for histogram density estimation. Probab. Th. Rel. Fields 85, 449–467 (1990). https://doi.org/10.1007/BF01203164
DOI: https://doi.org/10.1007/BF01203164