A Constrained Optimization Problem with Applications to Constrained MDPs

Optimization, Control, and Applications of Stochastic Systems

Part of the book series: Systems & Control: Foundations & Applications (SCFA)

Abstract

First, we introduce a constrained optimization problem whose objective function is defined on the product of a linear space and a convex set. For any fixed element of the linear space, the goal is to maximize the function over the subset of the convex set determined by a constraint on the function evaluated at another fixed element of the linear space. We give suitable conditions under which a constrained-optimal solution to this problem exists. We then apply these results to existing discrete- and continuous-time constrained Markov decision processes (MDPs) under the discounted and average criteria, establishing the existence of constrained-optimal policies and characterizing a constrained-optimal policy without the nonnegativity assumption on the costs required in the previous literature. Finally, we apply our results to discrete-time constrained MDPs with state-dependent discount factors.
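
One way to read the problem described above is sketched below. The symbols $X$, $K$, $G$, $r$, $c$, $\rho$, and $U$ are assumptions introduced here for illustration and are not taken from the chapter. The direction of the constraint inequality is likewise an assumption.

```latex
% Hypothetical notation: X is a linear space, K is a convex set,
% and G : X \times K \to \mathbb{R} is the objective function.
% Fix r, c \in X and a constraint constant \rho \in \mathbb{R}.
\[
  U := \{\, k \in K : G(c, k) \le \rho \,\}, \qquad
  \text{maximize } G(r, k) \text{ over } k \in U .
\]
% A point k^* is constrained-optimal if it is feasible and attains the supremum:
\[
  k^* \in U \quad \text{and} \quad G(r, k^*) = \sup_{k \in U} G(r, k).
\]
```

In a constrained MDP, one would expect $K$ to play the role of a convex set of policies (or occupation measures), $r$ the reward to be maximized, and $c$ the constrained cost. That correspondence is a reading of the abstract, not a statement of the chapter's definitions.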

Research supported by NSFC, GDUPS, RFDP, and FRFCU.

Corresponding author

Correspondence to Xianping Guo.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Guo, X., Wei, Q., Zhang, J. (2012). A Constrained Optimization Problem with Applications to Constrained MDPs. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_8
