MICAI 2012: Advances in Artificial Intelligence, pp. 371–382
Shortest Stochastic Path with Risk Sensitive Evaluation
Abstract
In an uncertain environment where decisions must be made, how should a decision account for risk? The shortest stochastic path (SSP) problem models the task of reaching a goal at the least cost. Under uncertainty, however, a best decision may minimize expected cost, minimize variance, minimize the worst case, maximize the best case, and so on. Markov Decision Processes (MDPs) define the optimal decision in the SSP problem as the one that minimizes expected cost; MDPs are therefore indifferent to risk. Risk Sensitive MDPs (RSMDPs), an extension of MDPs that has received little attention in the Artificial Intelligence literature, take risk into account and integrate expected cost, variance, worst case and best case in a simple way. We show theoretically the differences and similarities between MDPs and RSMDPs for modeling the SSP problem, in particular the relationship between the discount factor γ and risk-prone attitudes under the SSP with constant cost. We also exemplify each model in a simple artificial scenario.
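As an informal illustration of the contrast drawn in the abstract, the sketch below compares an expected-cost criterion, an exponential-utility criterion, and a discounted criterion on two hypothetical policies. Everything in it is an assumption made for illustration and is not taken from the paper: the two policies `safe` and `risky`, the unit step cost, and the certainty-equivalent form (1/λ) ln E[exp(λC)], with λ > 0 read as risk-averse and λ < 0 as risk-prone. Under these assumptions, and with a constant cost c per step, the expected discounted cost equals c(1 − E[γ^N])/(1 − γ), so ranking policies by it coincides with ranking by the certainty equivalent at λ = ln(γ)/c < 0, a risk-prone setting.

```python
import math

# Hypothetical policies, each given as a distribution over the number of
# steps N needed to reach the goal: a list of (steps, probability) pairs.
# With a constant unit cost per step, the total cost equals N.
safe = [(3, 1.0)]               # always reaches the goal in 3 steps
risky = [(1, 0.5), (6, 0.5)]    # 1 step or 6 steps, equally likely


def expected_cost(dist):
    """Risk-neutral criterion: expected total cost."""
    return sum(p * n for n, p in dist)


def certainty_equivalent(dist, lam):
    """Exponential-utility criterion (assumed form): (1/lam) * ln E[exp(lam * C)].

    lam > 0 penalizes spread (risk-averse), lam < 0 rewards it (risk-prone),
    and lam == 0 falls back to the expected cost.
    """
    if lam == 0:
        return expected_cost(dist)
    return math.log(sum(p * math.exp(lam * n) for n, p in dist)) / lam


def expected_discounted_cost(dist, gamma):
    """Expected discounted cost with unit step cost: E[(1 - gamma**N) / (1 - gamma)]."""
    return sum(p * (1 - gamma ** n) / (1 - gamma) for n, p in dist)


def preferred(score):
    """Name of the policy with the lower score under a cost-like criterion."""
    return "safe" if score(safe) <= score(risky) else "risky"


if __name__ == "__main__":
    print("expected cost:      ", preferred(expected_cost))
    print("risk-averse (0.5):  ", preferred(lambda d: certainty_equivalent(d, 0.5)))
    print("risk-prone (-0.5):  ", preferred(lambda d: certainty_equivalent(d, -0.5)))
    for gamma in (0.6, 0.99):
        lam = math.log(gamma)  # lam = ln(gamma) / c with c = 1
        print(f"gamma={gamma}: discounted ->",
              preferred(lambda d: expected_discounted_cost(d, gamma)),
              "| exponential utility at ln(gamma) ->",
              preferred(lambda d: certainty_equivalent(d, lam)))
```

In this toy setting the expected-cost criterion and the risk-averse criterion both pick `safe`, while the risk-prone criterion picks `risky`. A strong discount (γ = 0.6) also picks `risky`, whereas γ close to 1 behaves like the risk-neutral criterion.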
Keywords
Markov Decision Process, Expected Utility Theory, Risk Sensitive