
Consistency of Feature Markov Processes

  • Peter Sunehag
  • Marcus Hutter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6331)

Abstract

We study long-term sequence prediction (forecasting). We approach this by investigating criteria for choosing a compact, useful state representation, where the state is meant to summarize the useful information in the history. We want a method that is asymptotically consistent in the sense that it will provably, eventually, choose only among alternatives that satisfy an optimality property related to the criterion used. We extend our work to the case where side information is available that one can take advantage of, and we briefly discuss the active setting, where an agent takes actions to achieve desirable outcomes.
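The abstract does not spell out the selection criterion, but to make the idea concrete, here is a minimal sketch, assuming a BIC/MDL-style penalized maximum-likelihood criterion and a simple candidate class of maps that use a suffix of the history as the state. All names (suffix_map, penalized_log_likelihood, select_order) and the exact penalty form are illustrative assumptions, not the paper's formal setup.

    # Hypothetical sketch: choose a state representation ("map" phi from
    # histories to states) by a penalized maximum-likelihood criterion.
    # The candidate maps here simply keep the last k observations as the state.
    from collections import defaultdict
    from math import log
    import random

    def suffix_map(k):
        """Map a history (tuple of symbols) to its last-k suffix, used as the state."""
        return lambda history: history[-k:] if k > 0 else ()

    def penalized_log_likelihood(seq, phi, alphabet_size, penalty_weight=0.5):
        """ML log-likelihood of each symbol given phi(history so far),
        minus penalty_weight * (#free parameters) * log(sequence length)."""
        counts = defaultdict(lambda: defaultdict(int))
        for t in range(len(seq)):
            state = phi(tuple(seq[:t]))
            counts[state][seq[t]] += 1
        loglik = 0.0
        for state, obs_counts in counts.items():
            total = sum(obs_counts.values())
            for c in obs_counts.values():
                loglik += c * log(c / total)
        num_params = len(counts) * (alphabet_size - 1)
        return loglik - penalty_weight * num_params * log(len(seq))

    def select_order(seq, max_order, alphabet_size):
        """Pick the suffix length k whose induced state representation scores best."""
        scores = {k: penalized_log_likelihood(seq, suffix_map(k), alphabet_size)
                  for k in range(max_order + 1)}
        return max(scores, key=scores.get), scores

    if __name__ == "__main__":
        # A simple order-1 source: the next bit flips the previous one with prob. 0.9.
        random.seed(0)
        seq = [0]
        for _ in range(2000):
            seq.append(1 - seq[-1] if random.random() < 0.9 else seq[-1])
        best_k, scores = select_order(seq, max_order=4, alphabet_size=2)
        print("selected suffix length:", best_k)

For the strongly order-1 source in the example, the penalized score typically selects suffix length 1: longer suffixes barely improve the likelihood but pay a growing log(n) parameter penalty, which is the kind of trade-off a consistent criterion must resolve correctly in the limit.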

Keywords

Markov Chain · Hidden Markov Model · State Machine · State Sequence · Side Information


References

  1. [BP66] Baum, L.E., Petrie, T.: Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics 37(6), 1554–1563 (1966)
  2. [CMR05] Cappé, O., Moulines, E., Rydén, T.: Inference in Hidden Markov Models. Springer Series in Statistics. Springer, New York (2005)
  3. [CS00] Csiszár, I., Shields, P.C.: The consistency of the BIC Markov order estimator (2000)
  4. [EM02] Ephraim, Y., Merhav, N.: Hidden Markov processes. IEEE Transactions on Information Theory 48(6), 1518–1569 (2002)
  5. [FLN96] Finesso, L., Liu, C., Narayan, P.: The optimal error exponent for Markov order estimation. IEEE Transactions on Information Theory 42, 1488–1497 (1996)
  6. [GB03] Gassiat, E., Boucheron, S.: Optimal error exponents in hidden Markov models order estimation. IEEE Transactions on Information Theory 49(4), 964–980 (2003)
  7. [Hut09] Hutter, M.: Feature reinforcement learning: Part I: Unstructured MDPs. Journal of Artificial General Intelligence 1, 3–24 (2009)
  8. [Mah10] Mahmud, M.M.: Constructing states for reinforcement learning. In: The 27th International Conference on Machine Learning, ICML 2010 (2010)
  9. [McC96] McCallum, A.K.: Reinforcement learning with selective perception and hidden state. PhD thesis, The University of Rochester (1996)
  10. [Pet69] Petrie, T.: Probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics 40(1), 97–115 (1969)
  11. [Ris83] Rissanen, J.: A universal data compression system. IEEE Transactions on Information Theory 29(5), 656–663 (1983)
  12. [Ris86] Rissanen, J.: Complexity of strings in the class of Markov sources. IEEE Transactions on Information Theory 32(4), 526–532 (1986)
  13. [RN10] Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach, 3rd edn. Prentice-Hall, Englewood Cliffs (2010)
  14. [SB98] Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning). MIT Press, Cambridge (1998)
  15. [Sin96] Singer, Y.: Adaptive mixtures of probabilistic transducers. Neural Computation 9, 1711–1733 (1996)
  16. [VTdlH+05a] Vidal, E., Thollard, F., de la Higuera, C., Casacuberta, F., Carrasco, R.C.: Probabilistic finite-state machines – Part I. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(7), 1013–1025 (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Peter Sunehag (1)
  • Marcus Hutter (1)
  1. RSISE@Australian National University and SML@NICTA, Canberra, Australia
