Probability Theory and Related Fields, Volume 92, Issue 2, pp 195–229

Data compression and histograms

  • Bin Yu
  • T. P. Speed


In this paper, the relationship between code length and the selection of the number of bins for a histogram density is considered for a sequence of iid observations on [0,1]. First, we use a shortest code length criterion to select the number of bins for a histogram. A uniform almost sure asymptotic expansion for the code length is given and it is used to prove the asymptotic optimality of the selection rule. In addition, the selection rule is consistent if the true density is uniform on [0,1]. Secondly, we deal with the problem: what is the “best” achievable average code length with underlying density function f? Minimax lower bounds are derived for the average code length over certain smooth classes of underlying densities f. For the smooth class with bounded first derivatives, the rate in the lower bound is shown to be achieved by a code based on a sequence of histograms whose number of bins is changed predictively. Moreover, this best code can be modified to ensure that the almost sure version of the code length has asymptotically the same behavior as its expected value, i.e., the average code length.
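The idea of choosing the number of bins by shortest code length can be illustrated with a minimal sketch. It assumes a predictive (Laplace-type) histogram code, in which the density assigned to the next observation falling in bin j of an m-bin histogram is m(n_j + 1)/(t + m), where n_j counts the earlier observations in that bin and t is the number of observations seen so far. The function names `predictive_code_length` and `select_bins`, and this particular estimator, are illustrative assumptions and not necessarily the criterion analyzed in the paper. Code lengths are in bits, measured relative to coding under the uniform density on [0,1], so they may be negative.

```python
import math


def predictive_code_length(xs, m):
    """Code length (bits, relative to the uniform density on [0,1]) of xs
    under an m-bin predictive histogram code (illustrative assumption)."""
    counts = [0] * m
    total = 0.0
    for t, x in enumerate(xs):
        # Bin index of x; the point x == 1.0 is placed in the last bin.
        j = min(int(x * m), m - 1)
        # Predictive density: Laplace-smoothed bin frequency times 1/bin width.
        density = m * (counts[j] + 1) / (t + m)
        total -= math.log2(density)
        counts[j] += 1
    return total


def select_bins(xs, max_bins):
    """Return the number of bins in 1..max_bins with the shortest code length."""
    return min(range(1, max_bins + 1),
               key=lambda m: predictive_code_length(xs, m))
```

With a single bin the predictive density is identically 1, so the relative code length is zero; a larger number of bins is selected only when it shortens the code for the observed data.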

AMS 1980 Classifications

60G05 94A99 94A17 





Copyright information

© Springer-Verlag 1992

Authors and Affiliations

  • Bin Yu (1)
  • T. P. Speed (2)
  1. Department of Statistics, University of Wisconsin, Madison, USA
  2. Department of Statistics, University of California, Berkeley, USA
