Abstract
We consider the model selection problem in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of high order, but with memory of variable length. Various aims in selecting a VLMC can be formalized with different non-equivalent risks, such as final prediction error or expected Kullback-Leibler information. We consider the asymptotic behavior of different risk functions and show how they can be generally estimated with the same resampling strategy. Such estimated risks then yield new model selection criteria. In particular, we obtain a data-driven tuning of Rissanen's tree-structured context algorithm, which is a computationally feasible procedure for selection and estimation of a VLMC.
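To make the notion of variable-length memory concrete, the following is a minimal Python sketch, not taken from the article. The binary alphabet, the context tree `CONTEXTS`, its transition probabilities, and the helper names `context_of` and `simulate` are all hypothetical choices for illustration. It shows how the relevant part of the past depends on the past itself: after a 0 one symbol of memory suffices, while after a 1 two symbols are needed.

```python
import random

# Hypothetical context tree over the alphabet {0, 1} (illustrative values only).
# Keys are contexts written most-recent-symbol first;
# values are P(next symbol = 1 | context).
CONTEXTS = {
    (0,): 0.2,     # last symbol was 0: one symbol of memory suffices
    (1, 0): 0.7,   # last symbol was 1, preceded by 0
    (1, 1): 0.4,   # last symbol was 1, preceded by 1
}


def context_of(history):
    """Return the shortest suffix of `history`, read most recent first,
    that is a context of the tree."""
    reversed_past = tuple(reversed(history))
    for length in range(1, len(reversed_past) + 1):
        candidate = reversed_past[:length]
        if candidate in CONTEXTS:
            return candidate
    raise ValueError("history too short to determine a context")


def simulate(n, seed=0, start=(0, 0)):
    """Simulate n symbols from the chain, beginning with the symbols in `start`."""
    rng = random.Random(seed)
    x = list(start)
    while len(x) < n:
        p_one = CONTEXTS[context_of(x)]
        x.append(1 if rng.random() < p_one else 0)
    return x[:n]


if __name__ == "__main__":
    path = simulate(20)
    print(path)
    print("context of the simulated past:", context_of(path))
```

A full Markov chain of order 2 on this alphabet would need four states; the tree above needs only three contexts, which is the kind of parsimony that selecting a VLMC aims to exploit.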
Cite this article
Bühlmann, P. Model Selection for Variable Length Markov Chains and Tuning the Context Algorithm. Annals of the Institute of Statistical Mathematics 52, 287–315 (2000). https://doi.org/10.1023/A:1004165822461
DOI: https://doi.org/10.1023/A:1004165822461