Belief Selection in Point-Based Planning Algorithms for POMDPs

  • Masoumeh T. Izadi
  • Doina Precup
  • Danielle Azar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4013)


Current point-based planning algorithms for solving partially observable Markov decision processes (POMDPs) have demonstrated that a good approximation of the value function can be derived by interpolation from the values of a specially selected set of points. The performance of these algorithms can be improved by eliminating unnecessary backups or by concentrating on more important points in the belief simplex. We study three methods designed to improve point-based value iteration algorithms. The first two methods are based on reachability analysis of the POMDP belief space: they prioritize beliefs according to how they are reached from the given initial belief state. The third approach is motivated by the observation that the beliefs whose values are most overestimated or underestimated have a greater influence on the precision of the value function than other beliefs. We present an empirical evaluation illustrating how the performance of point-based value iteration (Pineau et al., 2003) varies with these approaches.
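The reachability analysis mentioned above rests on the standard POMDP belief update: from a given belief, each action-observation pair induces a successor belief, and the reachable beliefs are exactly those generated by iterating this update from the initial belief. The following is a minimal sketch of that update; the two-state model (`T`, `O`) is a hypothetical example for illustration, not a domain from the paper.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes-filter belief update:
    b'(s') ∝ O[a][s'][o] * sum_s T[a][s][s'] * b(s)."""
    predicted = b @ T[a]                    # sum_s b(s) T(s, a, s')
    unnormalized = predicted * O[a][:, o]   # weight by observation likelihood
    return unnormalized / unnormalized.sum()

# Hypothetical 2-state, 2-action, 2-observation POMDP
T = np.array([[[0.9, 0.1], [0.2, 0.8]],    # T[a][s][s']: transition matrices
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],    # O[a][s'][o]: observation matrices
              [[0.5, 0.5], [0.5, 0.5]]])

b0 = np.array([0.5, 0.5])                  # initial belief
b1 = belief_update(b0, a=0, o=0, T=T, O=O) # one reachable successor belief
print(b1)
```

Enumerating `belief_update(b, a, o, ...)` over all `(a, o)` pairs, starting from `b0`, yields the tree of reachable beliefs from which a point-based algorithm can select its backup set.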






  1. Cassandra, A.R., Littman, M.L., Kaelbling, L.P.: A simple, fast, exact method for partially observable Markov decision processes. In: Proceedings of UAI, pp. 54–61 (1997)
  2. Izadi, M.T., Rajwade, A., Precup, D.: Using core beliefs for point-based value iteration. In: Proceedings of IJCAI, pp. 1751–1753 (2005)
  3. Hauskrecht, M.: Value-function approximations for partially observable Markov decision processes. Journal of Artificial Intelligence Research 13, 33–94 (2000)
  4. Pineau, J., Gordon, G., Thrun, S.: Point-based value iteration: An anytime algorithm for POMDPs. In: Proceedings of IJCAI, pp. 1025–1032 (2003)
  5. Smith, T., Simmons, R.: Heuristic search value iteration for POMDPs. In: Proceedings of UAI, pp. 520–527 (2004)
  6. Smith, T., Simmons, R.: Point-based POMDP algorithms: Improved analysis and implementation. In: Proceedings of ICML (2005)
  7. Sondik, E.J.: The optimal control of partially observable Markov processes. Ph.D. thesis, Stanford University (1971)
  8. Spaan, M.T.J., Vlassis, N.: Perseus: Randomized point-based value iteration for POMDPs. Journal of Artificial Intelligence Research 24, 195–220 (2005)
  9. Zhang, N.L., Zhang, W.: Speeding up the convergence of value iteration in partially observable Markov decision processes. Journal of Artificial Intelligence Research 14 (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Masoumeh T. Izadi (1)
  • Doina Precup (1)
  • Danielle Azar (2)
  1. McGill University, Canada
  2. Lebanese American University, Byblos, Lebanon
