Shortest Stochastic Path with Risk Sensitive Evaluation

  • Renato Minami
  • Valdinei Freire da Silva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7629)

Abstract

In an environment of uncertainty where decisions must be made, how should risk be taken into account? The shortest stochastic path (SSP) problem models the problem of reaching a goal at least cost. Under uncertainty, however, a best decision may minimize expected cost, minimize variance, minimize the worst case, maximize the best case, etc. Markov Decision Processes (MDPs) define the optimal decision in the shortest stochastic path problem as the decision that minimizes expected cost; MDPs are therefore indifferent to risk. Risk Sensitive MDPs (RSMDPs) are an extension of MDPs that has received little attention in the Artificial Intelligence literature; RSMDPs account for risk and integrate expected cost, variance, worst case, and best case in a simple way. We show theoretically the differences and similarities between MDPs and RSMDPs for modeling the SSP problem, in particular the relationship between the discount factor γ and risk-prone attitudes in the SSP with constant cost. We also illustrate each model in a simple artificial scenario.
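The risk-sensitive criterion the abstract alludes to is the exponential-utility formulation of Howard and Matheson: instead of minimizing expected total cost C, the agent minimizes E[exp(λC)] and compares policies by their certainty equivalent (1/λ)·ln E[exp(λC)], where λ > 0 encodes risk aversion and λ < 0 a risk-prone attitude. The sketch below (a hypothetical two-action example, not taken from the paper) shows how two actions with identical expected cost are separated by this criterion:

```python
import math

# Tiny SSP sketch (illustrative example, not from the paper):
# a single state 's' plus an absorbing, zero-cost goal.
#   'safe'  : cost 2, reaches the goal with certainty
#   'risky' : cost 1, reaches the goal w.p. 0.5, else stays in 's'
# Both actions have the same EXPECTED total cost (2), so a
# risk-neutral MDP is indifferent; the exponential-utility
# RSMDP criterion breaks the tie according to the risk attitude.

ACTIONS = {
    "safe":  (2.0, 1.0),   # (per-step cost, prob. of reaching the goal)
    "risky": (1.0, 0.5),
}

def certainty_equivalent(action, lam, iters=10_000):
    """Certainty equivalent of total cost under U(C) = exp(lam * C).

    Fixed-point iteration of the multiplicative Bellman equation
    W(s) = exp(lam*c) * (p * W(goal) + (1-p) * W(s)), W(goal) = 1,
    then returns (1/lam) * ln W(s).
    """
    cost, p = ACTIONS[action]
    w = 1.0
    for _ in range(iters):
        w = math.exp(lam * cost) * (p * 1.0 + (1.0 - p) * w)
    return math.log(w) / lam

for lam in (0.3, -0.3):   # lam > 0: risk-averse; lam < 0: risk-prone
    ces = {a: certainty_equivalent(a, lam) for a in ACTIONS}
    best = min(ces, key=ces.get)
    print(f"lam={lam:+.1f}  CE={ces}  -> prefer '{best}'")
```

With λ = +0.3 the high-variance 'risky' action has a certainty equivalent above 2, so 'safe' is preferred; with λ = −0.3 the preference reverses. Note that the iteration only converges when (1−p)·exp(λ·cost) < 1, the analogue of the contraction condition for SSPs.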

Keywords

Markov Decision Process · Expected Utility Theory · Risk Sensitive



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Renato Minami¹
  • Valdinei Freire da Silva¹

  1. Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (EACH-USP), São Paulo, Brazil
