Analysis of Markov Decision Processes Under Parameter Uncertainty

Part of the Lecture Notes in Computer Science book series (LNPSE, volume 10497)

Abstract

Markov Decision Processes (MDPs) are a popular decision model for stochastic systems. Introducing uncertainty into the transition probability distributions by specifying upper and lower bounds for the transition probabilities yields the model of Bounded Parameter MDPs (BMDPs), which captures many practical situations with limited knowledge about a system or its environment. In this paper, the class of BMDPs is extended to Bounded Parameter Semi-Markov Decision Processes (BSMDPs). The main focus of the paper is the introduction and numerical comparison of different algorithms to compute optimal policies for BMDPs and BSMDPs; specifically, we introduce and compare variants of value and policy iteration.

The paper delivers an empirical comparison between different numerical algorithms for BMDPs and BSMDPs, with an emphasis on the required solution time.
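
The abstract refers to value iteration variants for BMDPs. The sketch below is an illustration only, not the implementation evaluated in the paper: it shows pessimistic (robust) value iteration for a discounted BMDP, where the inner step selects, within the given probability bounds, the transition distribution that minimises the expected next value by pushing the free probability mass towards the successors with the lowest current value. All names (robust_value_iteration, worst_case_distribution), the array layout and the stopping rule are assumptions made for this example.

```python
# Illustrative sketch only (not the authors' implementation): pessimistic
# value iteration for a Bounded Parameter MDP with discounted reward.
# Transition probabilities are only known to lie in [P_low, P_up]; the
# inner step picks the worst-case distribution within these bounds.
import numpy as np


def worst_case_distribution(p_low, p_up, values):
    """Distribution within [p_low, p_up] that minimises the expected value.

    Start from the lower bounds and assign the remaining probability mass
    to successor states in increasing order of their current value.
    """
    p = p_low.copy()
    remaining = 1.0 - p.sum()
    for s in np.argsort(values):            # cheapest successors first
        shift = min(p_up[s] - p[s], remaining)
        p[s] += shift
        remaining -= shift
        if remaining <= 1e-12:
            break
    return p


def robust_value_iteration(rewards, P_low, P_up, gamma=0.9, eps=1e-6):
    """Pessimistic value iteration for a BMDP.

    rewards[s, a]   immediate reward
    P_low[s, a, t]  lower bound on the probability of moving from s to t
    P_up[s, a, t]   upper bound on the same probability
    Returns the worst-case value vector and a greedy policy.
    """
    n_states, n_actions = rewards.shape
    V = np.zeros(n_states)
    while True:
        V_new = np.empty(n_states)
        policy = np.empty(n_states, dtype=int)
        for s in range(n_states):
            q = np.empty(n_actions)
            for a in range(n_actions):
                p = worst_case_distribution(P_low[s, a], P_up[s, a], V)
                q[a] = rewards[s, a] + gamma * p @ V
            policy[s] = int(np.argmax(q))
            V_new[s] = q.max()
        # standard stopping rule guaranteeing an epsilon-optimal value
        if np.max(np.abs(V_new - V)) < eps * (1.0 - gamma) / (2.0 * gamma):
            return V_new, policy
        V = V_new
```

The optimistic bound is obtained analogously by shifting the free mass towards the highest-valued successors; running both variants brackets the achievable value. The policy iteration variants mentioned in the abstract would instead evaluate a fixed policy under the worst-case distributions and then perform a greedy improvement step.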

Keywords

  • (Bounded Parameter) (Semi-)Markov Decision Process
  • Discounted reward
  • Average reward
  • Value iteration
  • Policy iteration


Notes

  1. In this and subsequent equations we consider continuous random variables for the sojourn times in the states, for which the integrals are well-defined. For discrete random variables, the integrals have to be replaced by sums and the densities by probabilities (a short example follows these notes).

  2. \(\epsilon \)-optimality means that the optimal value is reached up to \(\epsilon \).
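
As a small illustration of Note 1, and under the assumption of exponential discounting with rate \(\beta \) over the sojourn time \(\tau \) in a state (one common convention for semi-Markov models, not necessarily the exact convention used in the paper), the expected discount factor accumulated during a sojourn reads

\[
\mathbb{E}\left[e^{-\beta \tau }\right] = \int _0^{\infty } e^{-\beta t} f(t)\, \mathrm{d}t \quad \text{for a continuous sojourn time with density } f,
\]
\[
\mathbb{E}\left[e^{-\beta \tau }\right] = \sum _{t} e^{-\beta t}\, \Pr (\tau = t) \quad \text{for a discrete sojourn time.}
\]

Replacing the integral and the density by the sum and the probability mass function is exactly the substitution described in the note.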


Author information

Correspondence to Iryna Dohndorf.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Buchholz, P., Dohndorf, I., Scheftelowitsch, D. (2017). Analysis of Markov Decision Processes Under Parameter Uncertainty. In: Reinecke, P., Di Marco, A. (eds.) Computer Performance Engineering. EPEW 2017. Lecture Notes in Computer Science, vol. 10497. Springer, Cham. https://doi.org/10.1007/978-3-319-66583-2_1

  • DOI: https://doi.org/10.1007/978-3-319-66583-2_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66582-5

  • Online ISBN: 978-3-319-66583-2

  • eBook Packages: Computer Science, Computer Science (R0)