Strategies for Sequential Prediction of Stationary Time Series

  • László Györfi
  • Gábor Lugosi
Part of the International Series in Operations Research & Management Science book series (ISOR, volume 46)


We present simple procedures for the prediction of a real-valued sequence. The algorithms are based on a combination of several simple predictors. We show that if the sequence is a realization of a bounded, stationary, and ergodic random process, then the average of the squared errors converges, almost surely, to that of the optimum, given by the Bayes predictor. We offer an analogous result for the prediction of stationary Gaussian processes.
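The abstract does not say how the simple predictors are combined. As a rough illustration only, the sketch below assumes an exponentially weighted average under squared loss, a standard technique from the prediction-with-expert-advice literature, and it is not claimed to be the authors' algorithm. The names `window_mean_expert` and `combine_predictors`, the choice of moving-average experts, and the learning rate `eta` are all hypothetical.

```python
import math


def window_mean_expert(k):
    """Hypothetical simple predictor: mean of the last k observations.

    Predicts 0.0 when no history is available yet.
    """
    def predict(history):
        recent = history[-k:]
        return sum(recent) / len(recent) if recent else 0.0
    return predict


def combine_predictors(sequence, experts, eta=0.125):
    """Exponentially weighted average of expert predictions (assumed scheme).

    Returns (predictions, average squared error of the combined predictor).
    """
    cum_loss = [0.0] * len(experts)  # cumulative squared loss of each expert
    predictions = []
    total_sq_err = 0.0
    history = []
    for y in sequence:
        advice = [expert(history) for expert in experts]
        # Weights are proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum loss keeps the exponentials well scaled.
        base = min(cum_loss)
        weights = [math.exp(-eta * (loss - base)) for loss in cum_loss]
        norm = sum(weights)
        y_hat = sum(w * a for w, a in zip(weights, advice)) / norm
        predictions.append(y_hat)
        total_sq_err += (y_hat - y) ** 2
        # Only after predicting is the outcome revealed and losses updated.
        for i, a in enumerate(advice):
            cum_loss[i] += (a - y) ** 2
        history.append(y)
    n = len(predictions)
    return predictions, (total_sq_err / n if n else 0.0)


if __name__ == "__main__":
    experts = [window_mean_expert(k) for k in (1, 2, 5)]
    data = [math.sin(0.3 * t) for t in range(200)]  # bounded toy sequence
    _, avg_err = combine_predictors(data, experts)
    print(f"average squared error: {avg_err:.4f}")
```

Each prediction uses only past observations, so the procedure is sequential in the sense of the abstract. Experts with a smaller cumulative squared error receive exponentially larger weight in later rounds.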


Keywords: Gaussian Process, Modeling Uncertainty, Ergodic Theorem, Prediction Strategy, Stationary Time Series
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science + Business Media, Inc. 2002

Authors and Affiliations

  • László Györfi (1)
  • Gábor Lugosi (2)
  1. Department of Computer Science and Information Theory, Technical University of Budapest, Budapest, Hungary
  2. Department of Economics, Pompeu Fabra University, Barcelona, Spain
