Abstract
This chapter is devoted to studying constrained semi-Markov decision processes with denumerable states and unbounded reward/cost rates. The performance criterion to be optimized is the expected reward obtained during a first passage time to some target set, subject to a constraint on the associated expected cost over this first passage time. The discount rate is state-action dependent, and the undiscounted case is allowed. We employ the Lagrange multiplier technique to establish the existence of a constrained optimal policy under suitable conditions and show that the constrained optimal policy randomizes between two stationary policies differing in at most one state.
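As a rough sketch of the Lagrange multiplier approach named above, in notation that is assumed here and not taken from the chapter: let \(V_r(i,\pi)\) and \(V_c(i,\pi)\) denote the expected reward and expected cost accumulated up to the first passage time into the target set, starting from state \(i\) under policy \(\pi\), and let \(d\) be a hypothetical constraint constant. Taking the constraint to be an upper bound on the cost (also an assumption), the constrained problem and its Lagrangian might be written as

\[
\begin{aligned}
&\text{maximize } V_r(i,\pi) \quad \text{subject to } V_c(i,\pi) \le d, \\
&L(\pi,\lambda) = V_r(i,\pi) - \lambda\,\bigl(V_c(i,\pi) - d\bigr), \qquad \lambda \ge 0 .
\end{aligned}
\]

For a fixed multiplier \(\lambda\), maximizing \(L(\pi,\lambda)\) over \(\pi\) is an unconstrained first passage problem with the combined rate \(r - \lambda c\), provided both expectations are finite. The result stated in the abstract is that, under the chapter's conditions, a constrained optimal policy can be obtained by randomizing between two stationary policies that differ in at most one state.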
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Huang, Y., Guo, X. (2012). Constrained Optimality for First Passage Criteria in Semi-Markov Decision Processes. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_11
DOI: https://doi.org/10.1007/978-0-8176-8337-5_11
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8336-8
Online ISBN: 978-0-8176-8337-5
eBook Packages: Mathematics and Statistics (R0)