Abstract
The framework of Partially Observable Markov Decision Processes (POMDPs) offers a standard approach to modeling uncertainty in many robot tasks. Traditionally, POMDPs are formulated with optimality objectives. However, for robotic domains that require a correctness guarantee of accomplishing tasks, Boolean objectives are natural formulations. We study POMDPs with a common Boolean objective: safe-reachability, which requires that, with a probability above a threshold, the robot eventually reaches a goal state while keeping the probability of visiting unsafe states below a different threshold. The solutions to POMDPs are policies or conditional plans that specify the action to take contingent on every possible event. A full policy or conditional plan that covers all possible events is generally expensive to compute. To improve efficiency, we introduce the notion of partial conditional plans that cover only a sampled subset of all possible events. Our approach constructs a partial conditional plan parameterized by a replanning probability. We prove that the probability that the constructed partial conditional plan fails is bounded by this replanning probability. Our approach therefore allows users to specify an appropriate bound on the replanning probability to balance efficiency and correctness. We validate our approach in several robotic domains. The results show that our approach outperforms a previous approach for POMDPs with safe-reachability objectives in these domains.
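The core idea of a partial conditional plan can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation: the names (`build_partial_plan`, `choose_action`, `obs_model`, `update`) and the greedy branch-selection strategy are assumptions made for clarity. The sketch covers observation branches in decreasing probability order until the uncovered probability mass at each node drops below a user-chosen replanning bound `delta`; execution replans only when an uncovered observation occurs, so the chance of replanning at any node is at most `delta`.

```python
def build_partial_plan(belief, horizon, delta, choose_action, obs_model, update):
    """Construct a partial conditional plan (illustrative sketch).

    At each node, branches are added for the most probable observations
    until the uncovered observation mass is at most `delta`. Uncovered
    branches trigger online replanning during execution, so the
    probability of replanning at this node is bounded by `delta`.
    """
    if horizon == 0:
        return {"action": None, "children": {}}
    action = choose_action(belief)
    node = {"action": action, "children": {}}
    covered = 0.0
    # Sort observations by probability so few branches cover most mass.
    for obs, prob in sorted(obs_model(belief, action).items(),
                            key=lambda kv: -kv[1]):
        if 1.0 - covered <= delta:
            break  # remaining uncovered mass already within the replan bound
        node["children"][obs] = build_partial_plan(
            update(belief, action, obs), horizon - 1, delta,
            choose_action, obs_model, update)
        covered += prob
    return node


# Toy usage with a fixed observation distribution: with delta = 0.15,
# branches "a" (0.7) and "b" (0.2) are covered, leaving "c" (0.1) to
# online replanning.
plan = build_partial_plan(
    belief="b0", horizon=1, delta=0.15,
    choose_action=lambda b: "move",
    obs_model=lambda b, a: {"a": 0.7, "b": 0.2, "c": 0.1},
    update=lambda b, a, o: b)
```

In the paper's setting the per-node choices are made by a constraint solver against the safe-reachability objective rather than by the toy callbacks above; the sketch only shows how the replanning bound shapes which branches a partial plan needs to cover.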
Acknowledgments
This work was supported in part by NSF CCF 1139011, NSF CCF 1514372, NSF CCF 1162076 and NSF IIS 1317849. We thank the reviewers for their insightful comments, and Juan David Hernández, Bryce Willey and Constantinos Chamzas for their assistance in the physical experiments.
© 2020 Springer Nature Switzerland AG
Cite this paper
Wang, Y., Chaudhuri, S., Kavraki, L.E. (2020). Online Partial Conditional Plan Synthesis for POMDPs with Safe-Reachability Objectives. In: Morales, M., Tapia, L., Sánchez-Ante, G., Hutchinson, S. (eds) Algorithmic Foundations of Robotics XIII. WAFR 2018. Springer Proceedings in Advanced Robotics, vol 14. Springer, Cham. https://doi.org/10.1007/978-3-030-44051-0_8
DOI: https://doi.org/10.1007/978-3-030-44051-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44050-3
Online ISBN: 978-3-030-44051-0
eBook Packages: Intelligent Technologies and Robotics (R0)