On undiscounted semi-Markov decision processes with absorbing states

  • Original Article
  • Mathematical Methods of Operations Research

Abstract

We consider finite (state and action spaces) semi-Markov decision processes (SMDPs) with absorbing states under the limiting ratio average (undiscounted) reward criterion, where all states but one are absorbing. We propose a realistic inspection model that fits naturally into this class of undiscounted SMDPs with absorbing states. We prove the existence of an optimal semi-stationary policy (i.e., a semi-Markov policy independent of decision epoch counts) and provide a linear programming algorithm to compute such an optimal policy.
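
To make the reward criterion concrete, the sketch below evaluates one fixed stationary policy in a toy SMDP with a single non-absorbing state (state 0). It assumes the limiting ratio average reward is the ratio of accumulated expected rewards to accumulated expected sojourn times as the number of decision epochs grows, which is a common definition in the SMDP literature and may differ in detail from the article's. The function names and the numerical data are hypothetical, and this is not the article's linear programming algorithm.

```python
def ratio_average(p, r, tau, n_epochs):
    """Ratio of expected reward to expected time over n_epochs decision epochs.

    p[j]   : transition probability from state 0 to state j (state 0 is the
             only non-absorbing state; states 1..k-1 are absorbing)
    r[j]   : expected reward per decision epoch in state j under the policy
    tau[j] : expected sojourn time per decision epoch in state j
    """
    k = len(p)
    dist = [1.0] + [0.0] * (k - 1)      # the process starts in state 0
    total_r = 0.0
    total_tau = 0.0
    for _ in range(n_epochs):
        total_r += sum(d * x for d, x in zip(dist, r))
        total_tau += sum(d * t for d, t in zip(dist, tau))
        stay = dist[0]
        # Mass in state 0 moves according to p; absorbed mass stays put.
        dist = [stay * p[0]] + [dist[j] + stay * p[j] for j in range(1, k)]
    return total_r / total_tau


def limiting_ratio(p, r, tau):
    """Closed-form limit of ratio_average as n_epochs tends to infinity."""
    if p[0] == 1.0:
        # The process never leaves state 0.
        return r[0] / tau[0]
    # Probability of eventually being absorbed in each absorbing state.
    q = [pj / (1.0 - p[0]) for pj in p[1:]]
    num = sum(qj * rj for qj, rj in zip(q, r[1:]))
    den = sum(qj * tj for qj, tj in zip(q, tau[1:]))
    return num / den


# Hypothetical data: one non-absorbing state and two absorbing states.
p = [0.5, 0.25, 0.25]
r = [1.0, 4.0, 2.0]
tau = [1.0, 2.0, 1.0]

print(ratio_average(p, r, tau, 10_000))
print(limiting_ratio(p, r, tau))
```

Under this assumed definition the limit is a ratio of expectations over the absorbing states, not an average of the per-state ratios r[j] / tau[j], so the actions chosen in different absorbing states interact through the shared denominator.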

Acknowledgments

The author is thankful to Dr. S. Sinha of Jadavpur University for many helpful suggestions. The author wishes to thank the unknown referees who have patiently gone through this paper and whose suggestions have improved its presentation and readability considerably.

Author information

Correspondence to Prasenjit Mondal.

About this article

Cite this article

Mondal, P. On undiscounted semi-Markov decision processes with absorbing states. Math Meth Oper Res 83, 161–177 (2016). https://doi.org/10.1007/s00186-015-0524-y

  • DOI: https://doi.org/10.1007/s00186-015-0524-y
