Planning in partially-observable switching-mode continuous domains

Annals of Mathematics and Artificial Intelligence

Abstract

Continuous-state POMDPs provide a natural representation for a variety of tasks, including many in robotics. However, most existing parametric continuous-state POMDP approaches are limited by their reliance on a single linear model to represent the world dynamics. We introduce a new switching-state dynamics model that can represent multi-modal state-dependent dynamics. We present the Switching Mode POMDP (SM-POMDP) planning algorithm for solving continuous-state POMDPs using this dynamics model. We also consider several procedures to approximate the value function as a mixture of a bounded number of Gaussians. Unlike the majority of prior work on approximate continuous-state POMDP planners, we provide a formal analysis of our SM-POMDP algorithm, with bounds, where possible, on the quality of the resulting solution. We also analyze the computational complexity of SM-POMDP. Empirical results on an unmanned aerial vehicle collision avoidance simulation, and on a robot navigation simulation in which the robot has faulty actuators, demonstrate the benefit of SM-POMDP over a prior parametric approach.
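To give a rough sense of what "multi-modal state-dependent dynamics" can mean, the following is a minimal Python sketch of one possible switching-mode transition model. It is an illustration under stated assumptions, not the formulation from the article: the one-dimensional state, the two modes, the Gaussian-shaped gating weights, and every name and number (`MODES`, `mode_probs`, `step`) are hypothetical. The only idea it tries to convey is that a discrete mode is drawn with a probability that depends on the current state, and the next state then follows that mode's own linear-Gaussian dynamics.

```python
import math
import random

# Hypothetical two-mode model over a 1-D state; all values are illustrative.
# Each mode has a gate (weight, center, width) that makes it more likely near
# its center, and linear-Gaussian dynamics s' = a * s + b + N(0, sigma^2).
MODES = [
    {"gate": (1.0, -2.0, 1.0), "a": 1.0, "b": 0.5, "sigma": 0.1},  # e.g. free motion
    {"gate": (1.0, 2.0, 1.0), "a": 1.0, "b": 0.0, "sigma": 0.3},   # e.g. blocked motion
]

def gaussian(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def mode_probs(s):
    """State-dependent mode distribution: weighted Gaussians, normalized to sum to 1.

    Note: for states very far from every gate center the scores underflow to 0;
    a more careful version would normalize in log space.
    """
    scores = [w * gaussian(s, c, d) for (w, c, d) in (m["gate"] for m in MODES)]
    total = sum(scores)
    return [x / total for x in scores]

def step(s, rng):
    """Sample a mode given s, then sample s' from that mode's linear dynamics."""
    h = rng.choices(range(len(MODES)), weights=mode_probs(s))[0]
    m = MODES[h]
    return h, m["a"] * s + m["b"] + rng.gauss(0.0, m["sigma"])
```

Calling `step(-2.0, random.Random(0))` returns a sampled mode index and next state. Near `s = -2` the first mode dominates, while near `s = 0` both modes are equally likely, so the one-step transition density is a two-component mixture there, which a single linear model cannot represent.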

Author information

Corresponding author

Correspondence to Emma Brunskill.

Additional information

This research was conducted while E. Brunskill was at the Massachusetts Institute of Technology.

About this article

Cite this article

Brunskill, E., Kaelbling, L.P., Lozano-Pérez, T. et al. Planning in partially-observable switching-mode continuous domains. Ann Math Artif Intell 58, 185–216 (2010). https://doi.org/10.1007/s10472-010-9202-1

  • DOI: https://doi.org/10.1007/s10472-010-9202-1
