Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm

Bühlmann, Peter

doi:10.1023/A:1004165822461


Published: June 2000

Volume 52, pages 287–315, (2000)

Annals of the Institute of Statistical Mathematics



Abstract

We consider the model selection problem in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of high order, but with memory of variable length. Various aims in selecting a VLMC can be formalized with different non-equivalent risks, such as final prediction error or expected Kullback-Leibler information. We consider the asymptotic behavior of different risk functions and show how they can be generally estimated with the same resampling strategy. Such estimated risks then yield new model selection criteria. In particular, we obtain a data-driven tuning of Rissanen's tree-structured context algorithm, which is a computationally feasible procedure for selection and estimation of a VLMC.
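To make "memory of variable length" concrete, the following is a minimal sketch, not the paper's method. It assumes a hypothetical binary VLMC whose contexts and transition probabilities (`CONTEXTS`) are invented purely for illustration; the helper names `find_context` and `next_symbol_probs` are likewise assumptions. It shows only how the relevant part of the past is looked up, and does not implement the context algorithm or any of the risk estimates.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical binary VLMC (illustrative numbers, not from the paper).
# Each context is written most-recent-symbol first and maps to
# (P(next = 0), P(next = 1)). The contexts are chosen so that every
# sufficiently long past matches exactly one of them.
CONTEXTS: Dict[Tuple[int, ...], Tuple[float, float]] = {
    (0,): (0.7, 0.3),    # last symbol 0: one symbol of memory suffices
    (1, 0): (0.4, 0.6),  # last symbol 1, preceded by 0
    (1, 1): (0.1, 0.9),  # last symbol 1, preceded by 1
}


def find_context(past: Sequence[int]) -> Tuple[int, ...]:
    """Return the context matching the end of `past` (oldest symbol first)."""
    reversed_past = tuple(reversed(past))
    # Grow the candidate suffix one symbol at a time until it is a context.
    for length in range(1, len(reversed_past) + 1):
        candidate = reversed_past[:length]
        if candidate in CONTEXTS:
            return candidate
    raise ValueError("past is too short to determine a context")


def next_symbol_probs(past: Sequence[int]) -> Tuple[float, float]:
    """Transition probabilities for the next symbol given the observed past."""
    return CONTEXTS[find_context(past)]
```

In this toy example a full second-order Markov chain on two symbols would need four states, whereas the variable length description needs only three contexts, because after a 0 the earlier symbol is irrelevant.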


References

Akaike, H. (1969). Fitting autoregressive models for prediction, Ann. Inst. Statist. Math., 21, 243–247.
Akaike, H. (1970). Statistical predictor identification, Ann. Inst. Statist. Math., 22, 202–217.
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle, 2nd International Symposium on Information Theory (eds. B. N. Petrov and F. Csáki), 267–281, Akadémiai Kiadó, Budapest.
Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and Regression Trees, Wadsworth, Belmont, CA.
Bühlmann, P. (1999). Efficient and adaptive post-model-selection estimators, J. Statist. Plann. Inference, 79, 1–9.
Bühlmann, P. and Wyner, A. J. (1999). Variable length Markov chains, Ann. Statist., 27, 480–513.
Bunton, S. (1997). A percolating state selector for suffix-tree context models, Proc. of the 1997 Data Compression Conference, Snowbird, Utah (eds. J. A. Storer and M. Cohn), 32–41, IEEE Computer Society Press, Los Alamitos, CA.
Cavanaugh, J. and Shumway, R. (1997). A bootstrap variant of AIC for state-space model selection, Statist. Sinica, 7, 473–496.
Doukhan, P. (1994). Mixing. Properties and Examples, Lecture Notes in Statist., No. 85, Springer, New York.
Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation, J. Amer. Statist. Assoc., 78, 316–331.
Efron, B. (1986). How biased is the apparent error rate of a prediction rule, J. Amer. Statist. Assoc., 81, 461–470.
Merhav, N., Gutman, M. and Ziv, J. (1989). On the estimation of the order of a Markov chain and universal data compression, IEEE Trans. Inform. Theory, IT-35, 1014–1019.
Rissanen, J. (1983). A universal data compression system, IEEE Trans. Inform. Theory, IT-29, 656–664.
Rissanen, J. (1986). Complexity of strings in the class of Markov sources, IEEE Trans. Inform. Theory, IT-32, 526–532.
Rissanen, J. (1994). Noise separation and MDL modeling of chaotic processes, From Statistical Physics to Statistical Inference and Back (eds. P. Grassberger and J.-P. Nadal), 317–330. Kluwer, Dordrecht.
Shibata, R. (1989). Statistical aspects of model selection, From Data to Model (ed. J. C. Willems), 215–240, Springer, New York.
Shibata, R. (1997). Bootstrap estimate of Kullback-Leibler information for model selection, Statist. Sinica, 7, 375–394.
Takeuchi, K. (1976). Distribution of informational statistics and a criterion of model fitting, Suri-Kagaku (Mathematical Sciences), 153, 12–18 (in Japanese).
Tong, H. (1975). Determination of the order of a Markov chain by Akaike's information criterion, J. Appl. Probab., 12, 488–497.
Weinberger, M. J. and Feder, M. (1994). Predictive stochastic complexity and model estimation for finite-state processes, J. Statist. Plann. Inference, 39, 353–372.
Weinberger, M. J., Lempel, A. and Ziv, J. (1992). A sequential algorithm for the universal coding of finite memory sources, IEEE Trans. Inform. Theory, IT-38, 1002–1014.
Weinberger, M. J., Rissanen, J. and Feder, M. (1995). A universal finite memory source, IEEE Trans. Inform. Theory, IT-41, 643–652.
Weinberger, M. J., Rissanen, J. and Arps, R. B. (1996). Applications of universal context modeling to lossless compression of gray-scale images, IEEE Trans. Image Processing, IP-5, 575–586.

Author information

Authors and Affiliations

Seminar für Statistik, ETH Zentrum, CH-8092, Zürich, Switzerland
Peter Bühlmann


About this article

Cite this article

Bühlmann, P. Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm. Annals of the Institute of Statistical Mathematics 52, 287–315 (2000). https://doi.org/10.1023/A:1004165822461


Issue Date: June 2000
DOI: https://doi.org/10.1023/A:1004165822461
