Point-Based Planning for Predictive State Representations

  • Masoumeh T. Izadi
  • Doina Precup
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5032)

Abstract

Predictive state representations (PSRs) have been proposed recently as an alternative representation for environments with partial observability. The representation is rooted in actions and observations, so it holds the promise of being easier to learn than Partially Observable Markov Decision Processes (POMDPs). However, comparatively little work has explored planning algorithms using PSRs. Exact methods developed to date are no faster than existing exact planning approaches for POMDPs, and only memory-based PSRs have been shown so far to have an advantage in terms of planning speed. In this paper, we present an algorithm for approximate planning in PSRs, based on an approach similar to point-based value iteration in POMDPs. The point-based approach turns out to be a natural match for the PSR state representation. We present empirical results showing that our approach is comparable to or better than POMDP point-based planning.
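
For readers unfamiliar with the point-based value iteration that the abstract refers to, the following is a minimal sketch of the standard POMDP version, not the PSR algorithm of the paper. It backs up the value function only at a fixed, finite set of belief points. All names (point_based_backup, pbvi, T, Z, R) and the table layout are assumptions made for illustration, and the all-zero initial vector assumes non-negative rewards so that it is a valid lower bound.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, Z, R, gamma):
    """Return the best backed-up alpha vector at belief point b.

    Assumed (hypothetical) layout:
      T[a][s][s2] : probability of moving from s to s2 under action a
      Z[a][s2][o] : probability of observing o after reaching s2 via a
      R[a][s]     : immediate reward for taking a in s
    """
    n_states = len(b)
    n_obs = len(Z[0][0])
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        g_a = list(R[a])
        for o in range(n_obs):
            # Project every current alpha vector back through (a, o).
            projected = [
                [
                    sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2]
                        for s2 in range(n_states))
                    for s in range(n_states)
                ]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            best = max(projected, key=lambda g: dot(g, b))
            g_a = [x + gamma * y for x, y in zip(g_a, best)]
        val = dot(g_a, b)
        if val > best_val:
            best_vec, best_val = g_a, val
    return best_vec


def pbvi(beliefs, T, Z, R, gamma, n_iters):
    """Run n_iters rounds of point-based backups over a fixed belief set."""
    n_states = len(beliefs[0])
    alphas = [[0.0] * n_states]  # assumed initial lower bound (rewards >= 0)
    for _ in range(n_iters):
        alphas = [point_based_backup(b, alphas, T, Z, R, gamma)
                  for b in beliefs]
    return alphas
```

Each round produces at most one alpha vector per belief point, which is what keeps the cost of a backup polynomial rather than growing with the full set of vectors an exact method would enumerate. In the PSR setting described in the abstract, the belief point would be replaced by a vector of core-test predictions; that variant is not reproduced here.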

Keywords

Belief State; Hide State; Partially Observable Markov Decision Process; Belief Space; Core Test
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Masoumeh T. Izadi (1)
  • Doina Precup (1)
  1. McGill University