Mathematical Methods of Operations Research, Volume 88, Issue 2, pp 161–184

Risk measurement and risk-averse control of partially observable discrete-time Markov systems

  • Jingnan Fan
  • Andrzej Ruszczyński
Original Article

Abstract

We consider risk measurement in controlled partially observable Markov processes in discrete time. We introduce a new concept of conditional stochastic time consistency and we derive the structure of risk measures enjoying this property. We prove that they can be represented by a collection of static law invariant risk measures on the space of functions of the observable part of the state. We also derive the corresponding dynamic programming equations. Finally, we illustrate the results on a machine deterioration problem.

Keywords

  • Partially observable Markov processes
  • Dynamic risk measures
  • Time consistency
  • Dynamic programming

Notes

Acknowledgements

Funding was provided by the National Science Foundation, Division of Mathematical Sciences (Grant No. 1312016).

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. RUTCOR, Rutgers University, Piscataway, USA
  2. Department of Management Science and Information Systems, Rutgers University, Piscataway, USA
