POMDP Planning for Robust Robot Control

  • Joelle Pineau
  • Geoffrey J. Gordon
Part of the Springer Tracts in Advanced Robotics book series (STAR, volume 28)


Partially observable Markov decision processes (POMDPs) provide a rich framework for planning and control in partially observable domains. Recent algorithms have greatly improved the scalability of POMDP solvers, to the point where they can be used in robot applications. In this paper, we describe how approximate POMDP solving can be further improved by a new theoretically motivated algorithm for selecting salient information states. We present the algorithm, called PEMA, demonstrate competitive performance on a range of navigation tasks, and show that this approach is robust to mismatches between the robot’s physical environment and the model used for planning.
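The approach the abstract describes builds on point-based value iteration, in which Bellman backups are performed only at a finite set of belief points rather than over the whole belief simplex. The following is a minimal sketch of such a point-based backup on a toy two-state "tiger" problem; the model numbers, belief set, and all names here are illustrative assumptions, not the paper's PEMA implementation.

```python
import numpy as np

# Toy "tiger" POMDP: 2 states, 3 actions (listen, open-left, open-right),
# 2 observations. All numbers are illustrative.
S, A, Z = 2, 3, 2
gamma = 0.95

# T[a, s, s']: listening keeps the state; opening a door resets it to uniform.
T = np.empty((A, S, S))
T[0] = np.eye(S)
T[1] = T[2] = 0.5

# O[a, s', z]: listening is 85% accurate; opening a door is uninformative.
O = np.full((A, S, Z), 0.5)
O[0] = np.array([[0.85, 0.15], [0.15, 0.85]])

# R[a, s]: small cost to listen; +10 for the safe door, -100 for the tiger.
R = np.array([[-1.0, -1.0], [-100.0, 10.0], [10.0, -100.0]])

def point_based_backup(b, Gamma):
    """One Bellman backup at belief point b over the alpha-vector set Gamma."""
    best_val, best_alpha = -np.inf, None
    for a in range(A):
        alpha_a = R[a].copy()
        for z in range(Z):
            # Projections g_{a,z}(s) = sum_{s'} T(s,a,s') O(a,s',z) alpha(s')
            g = np.array([T[a] @ (O[a, :, z] * alpha) for alpha in Gamma])
            # Keep the projection that is best at this particular belief point.
            alpha_a += gamma * g[np.argmax(g @ b)]
        if alpha_a @ b > best_val:
            best_val, best_alpha = alpha_a @ b, alpha_a
    return best_alpha

# Value iteration restricted to a fixed, hand-picked set of belief points.
B = [np.array([0.5, 0.5]), np.array([0.85, 0.15]), np.array([0.15, 0.85])]
Gamma = [np.zeros(S)]
for _ in range(60):
    Gamma = [point_based_backup(b, Gamma) for b in B]

value_at_uniform = max(alpha @ B[0] for alpha in Gamma)
```

The quality of the resulting policy depends heavily on which belief points are backed up; choosing those points well is exactly the problem the paper's belief-selection algorithm addresses.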







Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Joelle Pineau: School of Computer Science, McGill University, Montreal, Canada
  • Geoffrey J. Gordon: Center for Automated Learning and Discovery, Carnegie Mellon University, Pittsburgh
