Asymptotic Log-Loss of Prequential Maximum Likelihood Codes

Grünwald, Peter; de Rooij, Steven

doi:10.1007/11503415_44

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3559)

Included in the following conference series:

International Conference on Computational Learning Theory

Abstract

We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models \(\mathcal{M}\). If data are i.i.d. according to an (essentially) arbitrary \(P\), then the redundancy grows at rate \(\frac{1}{2} c \ln n\). We show that \(c = \sigma_1^2 / \sigma_2^2\), where \(\sigma_1^2\) is the variance of \(P\), and \(\sigma_2^2\) is the variance of the distribution \(M^{*} \in \mathcal{M}\) that is closest to \(P\) in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the 2-part MDL, Shtarkov and Bayes codes, for which \(c = 1\). This behavior is undesirable in an MDL model selection setting.
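
The rate stated in the abstract can be checked numerically in one simple case. The sketch below is an illustration added here, not code from the paper. It assumes the model is the Gaussian location family with fixed variance `model_var` (playing the role of \(\sigma_2^2\)), while the data have variance `data_var` (playing the role of \(\sigma_1^2\)). The function names, the start value `mu0` used to predict the first outcome, and the choice of Gaussian data in the simulation are all assumptions made for the example. In this special case the expected redundancy relative to \(M^{*}\) works out to \(\frac{1}{2} c \sum_{k=1}^{n-1} 1/k\), which grows like \(\frac{1}{2} c \ln n\).

```python
import math
import random


def plugin_redundancy(xs, model_var, mu_star, mu0):
    """Extra log-loss (nats) of the plug-in ML code over coding with M*.

    Outcome i is predicted with N(mu_hat, model_var), where mu_hat is the
    mean of the earlier outcomes (mu0 for the first outcome; mu0 is a
    hypothetical start value chosen for this sketch).
    """
    total = 0.0
    mu_hat = mu0
    running_sum = 0.0
    for i, x in enumerate(xs, start=1):
        # Difference of the two Gaussian negative log-densities;
        # the normalizing constants cancel because the variance is shared.
        total += ((x - mu_hat) ** 2 - (x - mu_star) ** 2) / (2.0 * model_var)
        running_sum += x
        mu_hat = running_sum / i  # ML estimate after i outcomes
    return total


def expected_redundancy(n, data_var, model_var):
    """Closed form for this special case with mu0 = mu_star: 0.5 * c * H_{n-1}."""
    c = data_var / model_var
    harmonic = sum(1.0 / k for k in range(1, n))
    return 0.5 * c * harmonic


def simulate_mean_redundancy(n, data_var, model_var, trials=2000, seed=0):
    """Monte Carlo average of the redundancy for zero-mean Gaussian data."""
    rng = random.Random(seed)
    sigma = math.sqrt(data_var)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += plugin_redundancy(xs, model_var, mu_star=0.0, mu0=0.0)
    return total / trials


if __name__ == "__main__":
    n = 200
    for data_var in (1.0, 4.0):
        sim = simulate_mean_redundancy(n, data_var, model_var=1.0)
        exact = expected_redundancy(n, data_var, model_var=1.0)
        print(f"c = {data_var:.1f}: simulated {sim:.3f}, closed form {exact:.3f}")
```

With `data_var` equal to `model_var` the factor is \(c = 1\), matching the other universal codes mentioned above. Misspecifying the variance by a factor of four multiplies the expected redundancy by four in this toy setting.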

Author information

Authors and Affiliations

CWI Amsterdam
Peter Grünwald & Steven de Rooij

Editor information

Editors and Affiliations

University of Leoben, A-8700, Leoben, Austria
Peter Auer
Department of Electrical Engineering, Technion, P.O. Box, 3200, Haifa, Israel
Ron Meir

About this paper

Cite this paper

Grünwald, P., de Rooij, S. (2005). Asymptotic Log-Loss of Prequential Maximum Likelihood Codes. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science, vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_44

DOI: https://doi.org/10.1007/11503415_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26556-6
Online ISBN: 978-3-540-31892-7
eBook Packages: Computer Science; Computer Science (R0)
