Abstract
Markov Decision Processes (MDPs) model problems in which a decision-maker makes sequential decisions and the effects of those decisions are probabilistic. A particular formulation of MDPs is the Stochastic Shortest Path (SSP), in which the agent seeks to reach a goal while minimizing the cost of the path to it. The literature introduces several optimality criteria; most of them give priority to maximizing the probability of reaching the goal while minimizing some cost measure, so such criteria offer the decision-maker only a single trade-off between probability-to-goal and path cost. Here, we present algorithms that trade off probability-to-goal against expected cost. Based on the Minimum Cost given Maximum Probability (MCMP) criterion, we propose to treat this trade-off with three different methods: (i) additional constraints on probability-to-goal or expected cost; (ii) Pareto optimality, by finding non-dominated policies; and (iii) an efficient preference elicitation process based on non-dominated policies. We report experiments on a toy problem in which the trade-off between probability-to-goal and expected cost can be observed.
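To make methods (i) and (ii) concrete, the following minimal sketch filters a hand-made set of candidate policies, each summarized only by its probability-to-goal and expected cost. It is an illustration under stated assumptions and not the algorithms of the paper: the policy names, the numeric values, and the helpers `dominates`, `non_dominated`, and `cheapest_with_min_probability` are all hypothetical, and the sketch assumes the two values of each policy have already been computed by some SSP solver.

```python
from typing import List, Optional, Tuple

# Hypothetical policy summary: (name, probability_to_goal, expected_cost).
Policy = Tuple[str, float, float]


def dominates(a: Policy, b: Policy) -> bool:
    """True if a is at least as good as b on both criteria and strictly
    better on one (higher probability-to-goal, lower expected cost)."""
    _, prob_a, cost_a = a
    _, prob_b, cost_b = b
    no_worse = prob_a >= prob_b and cost_a <= cost_b
    strictly_better = prob_a > prob_b or cost_a < cost_b
    return no_worse and strictly_better


def non_dominated(policies: List[Policy]) -> List[Policy]:
    """Method (ii), sketched: keep the policies no other policy dominates."""
    return [p for p in policies
            if not any(dominates(q, p) for q in policies)]


def cheapest_with_min_probability(policies: List[Policy],
                                  min_prob: float) -> Optional[Policy]:
    """Method (i), sketched: minimize expected cost subject to a lower
    bound on probability-to-goal. Returns None if nothing is feasible."""
    feasible = [p for p in policies if p[1] >= min_prob]
    return min(feasible, key=lambda p: p[2]) if feasible else None


# Made-up values, for illustration only.
candidates: List[Policy] = [
    ("pi_a", 0.95, 40.0),
    ("pi_b", 0.90, 25.0),
    ("pi_c", 0.90, 30.0),  # dominated by pi_b (same probability, higher cost)
    ("pi_d", 0.60, 10.0),
    ("pi_e", 0.55, 12.0),  # dominated by pi_d (lower probability, higher cost)
]

front = non_dominated(candidates)
choice = cheapest_with_min_probability(candidates, min_prob=0.90)
```

With these made-up values, `front` keeps `pi_a`, `pi_b`, and `pi_d`, and the constrained choice for a bound of 0.90 is `pi_b`. A preference elicitation process in the spirit of method (iii) would then only need to query the decision-maker about policies in `front`.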
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kuo, I., Freire, V. (2021). Probability-to-Goal and Expected Cost Trade-Off in Stochastic Shortest Path. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12951. Springer, Cham. https://doi.org/10.1007/978-3-030-86970-0_9
DOI: https://doi.org/10.1007/978-3-030-86970-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86969-4
Online ISBN: 978-3-030-86970-0
eBook Packages: Computer Science (R0)