Constrained Optimality for First Passage Criteria in Semi-Markov Decision Processes

Huang, Yonghui; Guo, Xianping

doi:10.1007/978-0-8176-8337-5_11

Part of the book series: Systems & Control: Foundations & Applications (SCFA)

Abstract

This chapter is devoted to studying constrained semi-Markov decision processes with denumerable states and unbounded reward/cost rates. The performance criterion to be optimized is the expected reward obtained during a first passage time to some target set, subject to a constraint on the associated expected cost over this first passage time. The discount rate is state-action dependent, and the undiscounted case is allowed. We employ the Lagrange multiplier technique to establish the existence of a constrained optimal policy under suitable conditions and show that the constrained optimal policy randomizes between two stationary policies differing in at most one state.
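
The problem described in the abstract can be sketched in symbols as follows. This is only an illustrative formulation: the notation (target set \(B\), first passage time \(\tau_B\), discount rate \(\alpha\), reward rate \(r\), cost rate \(c\), constraint bound \(d\), multiplier \(\gamma\), mixing probability \(p\)) and the direction of the constraint are assumptions made here and are not taken from the chapter.

```latex
% Assumed notation: i is the initial state, \pi a policy, and (x_t, a_t) the state-action process.
% Expected discounted reward and cost accumulated up to the first passage time \tau_B:
\[
V_r(i,\pi) = \mathbb{E}_i^{\pi}\!\left[\int_0^{\tau_B}
  e^{-\int_0^t \alpha(x_s,a_s)\,ds}\, r(x_t,a_t)\,dt\right],
\qquad
V_c(i,\pi) = \mathbb{E}_i^{\pi}\!\left[\int_0^{\tau_B}
  e^{-\int_0^t \alpha(x_s,a_s)\,ds}\, c(x_t,a_t)\,dt\right].
\]
% Constrained problem (the form "cost at most d" is an assumption):
\[
\max_{\pi}\; V_r(i,\pi) \quad \text{subject to} \quad V_c(i,\pi) \le d .
\]
% Lagrangian with multiplier \gamma \ge 0, to be maximized over \pi without the constraint:
\[
L^{\gamma}(i,\pi) = V_r(i,\pi) - \gamma\,\bigl(V_c(i,\pi) - d\bigr).
\]
% A constrained optimal policy of the kind described: two stationary policies f_1, f_2
% that differ in at most one state, mixed with some probability p \in [0,1]:
\[
\pi^{*} = p\,f_1 + (1-p)\,f_2 .
\]
```

Taking \(\alpha \equiv 0\) in this sketch corresponds to the undiscounted case that the abstract says is allowed.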

Author information

Authors and Affiliations

Sun Yat-Sen University, Guangzhou, 510275, China
Yonghui Huang & Xianping Guo

Corresponding author

Correspondence to Xianping Guo.

Editor information

Editors and Affiliations

Daniel Hernández-Hernández, Department of Probability and Statistics, Center for Research in Mathematics, Jalisco s/n, Guanajuato, 36000, Mexico
J. Adolfo Minjárez-Sosa, Department of Mathematics, University of Sonora, Rosales s/n, Hermosillo, 83000, Sonora, Mexico

About this chapter

Cite this chapter

Huang, Y., Guo, X. (2012). Constrained Optimality for First Passage Criteria in Semi-Markov Decision Processes. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_11

DOI: https://doi.org/10.1007/978-0-8176-8337-5_11
Published: 12 July 2012
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8336-8
Online ISBN: 978-0-8176-8337-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)