Abstract
Partially observable Markov decision processes (POMDPs) have been successfully applied to various robot motion planning tasks under uncertainty. However, most existing POMDP algorithms assume a discrete state space, while the natural state space of a robot is often continuous. This paper presents Monte Carlo Value Iteration (MCVI) for continuous-state POMDPs. MCVI samples both a robot’s state space and the corresponding belief space, and avoids inefficient a priori discretization of the state space as a grid. Both theoretical results and preliminary experimental results indicate that MCVI is a promising new approach for robot motion planning under uncertainty.
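The abstract's key idea, replacing grid discretization with Monte Carlo sampling of states and beliefs, can be illustrated with a sketch of one MCVI-style backup. This is not the authors' implementation: the particle belief representation, the `simulate(s, a)` generative-model interface, and the `node_value(node, s)` policy-graph evaluator are all hypothetical stand-ins, and the real algorithm reuses simulations far more carefully.

```python
import random

def mc_backup(belief_particles, actions, observations, nodes,
              simulate, node_value, num_samples=100, gamma=0.95):
    """One Monte Carlo backup at a particle-represented belief (sketch).

    Assumed, hypothetical interfaces:
      simulate(s, a)     -> (reward, next_state, observation)
      node_value(node, s) -> estimated value of executing policy-graph
                             `node` from state `s`
    All expectations are estimated by sampling, so no a priori
    discretization of the continuous state space is needed.
    """
    best_action, best_value, best_edges = None, float("-inf"), None
    for a in actions:
        # totals[o][n] accumulates the simulated value of routing to
        # policy-graph node n after observing o; dividing by num_samples
        # yields P(o) * E[V_n(s') | o] under the sampled belief.
        totals = {o: {n: 0.0 for n in nodes} for o in observations}
        reward_sum = 0.0
        for _ in range(num_samples):
            s = random.choice(belief_particles)  # sample a state from b
            r, s2, o = simulate(s, a)
            reward_sum += r
            for n in nodes:
                totals[o][n] += node_value(n, s2)
        # Pick the best next node per observation, then score the action:
        # Q(b, a) ~= E[r] + gamma * sum_o P(o) * max_n E[V_n | o].
        value = reward_sum / num_samples
        edges = {}
        for o in observations:
            n_best = max(nodes, key=lambda n: totals[o][n])
            edges[o] = n_best
            value += gamma * totals[o][n_best] / num_samples
        if value > best_value:
            best_action, best_value, best_edges = a, value, edges
    return best_action, best_edges, best_value
```

The returned `(action, edges)` pair defines a new policy-graph node: execute `action`, then follow the edge labeled by the received observation.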
© 2010 Springer-Verlag Berlin Heidelberg
Bai, H., Hsu, D., Lee, W.S., Ngo, V.A. (2010). Monte Carlo Value Iteration for Continuous-State POMDPs. In: Hsu, D., Isler, V., Latombe, JC., Lin, M.C. (eds) Algorithmic Foundations of Robotics IX. Springer Tracts in Advanced Robotics, vol 68. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17452-0_11
DOI: https://doi.org/10.1007/978-3-642-17452-0_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17451-3
Online ISBN: 978-3-642-17452-0