A Constrained Optimization Problem with Applications to Constrained MDPs

Guo, Xianping; Wei, Qingda; Zhang, Junyu

doi:10.1007/978-0-8176-8337-5_8

Part of the book series: Systems & Control: Foundations & Applications (SCFA)

Abstract

First, we introduce a constrained optimization problem whose objective function is defined on the product of a linear space and a convex set. The goal is to maximize the function, with one variable from the linear space held fixed, over the subset of the convex set determined by a constraint on the function evaluated at another fixed variable from the linear space. We give suitable conditions under which a constrained-optimal solution to this problem exists. We then apply these results to the existing discrete- and continuous-time constrained Markov decision processes (MDPs) with the discounted and average criteria: we establish the existence of constrained-optimal policies and characterize a constrained-optimal policy without the nonnegativity assumption on the costs imposed in the previous literature. Furthermore, we apply our results to discrete-time constrained MDPs with state-dependent discount factors.
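
The problem described in the abstract can be sketched as follows. This is only an illustrative formulation: the symbols \(X\), \(D\), \(f\), \(x_0\), \(x_1\), and the constraint level \(\kappa\) are notation assumed here, not taken from the chapter, and the direction of the inequality is likewise an assumption.

```latex
% Assumed notation: X is a linear space, D is a convex set,
% f : X \times D \to \mathbb{R} is the objective function,
% x_0, x_1 \in X are fixed, and \kappa \in \mathbb{R} is a constraint level.
\[
  \text{maximize } f(x_0, d)
  \quad \text{over } d \in U := \{\, d \in D : f(x_1, d) \le \kappa \,\}.
\]
% A point d^* \in U is called constrained-optimal if
\[
  f(x_0, d^*) = \sup_{d \in U} f(x_0, d).
\]
```

In the MDP applications one would expect \(x_0\) and \(x_1\) to play the roles of reward and cost functions and \(D\) that of a convex set of policies, though the abstract does not spell this out.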

Research supported by NSFC, GDUPS, RFDP, and FRFCU.

Author information

Authors and Affiliations

Sun Yat-Sen University, Guangzhou, 510275, China
Xianping Guo, Qingda Wei & Junyu Zhang

Corresponding author

Correspondence to Xianping Guo.

Editor information

Editors and Affiliations

Department of Probability and Statistics, Center for Research in Mathematics, Jalisco s/n, Guanajuato, 36000, Mexico
Daniel Hernández-Hernández
Department of Mathematics, University of Sonora, Rosales s/n, Hermosillo, 83000, Sonora, Mexico
J. Adolfo Minjárez-Sosa

About this chapter

Cite this chapter

Guo, X., Wei, Q., Zhang, J. (2012). A Constrained Optimization Problem with Applications to Constrained MDPs. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_8

DOI: https://doi.org/10.1007/978-0-8176-8337-5_8
Published: 12 July 2012
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8336-8
Online ISBN: 978-0-8176-8337-5
eBook Packages: Mathematics and Statistics (R0)
