Abstract
We first introduce a constrained optimization problem whose objective function is defined on the product of a linear space and a convex set. With one element of the linear space held fixed, the goal is to maximize the function over a subset of the convex set; this subset is determined by the same function evaluated at a second fixed element of the linear space, together with a constraint. We give conditions under which a constrained-optimal solution exists. We then apply these results to existing discrete- and continuous-time constrained Markov decision processes (MDPs) under the discounted and average criteria: we establish the existence of constrained-optimal policies and characterize a constrained-optimal policy without the nonnegativity assumption on the costs imposed in the previous literature. Finally, we apply our results to discrete-time constrained MDPs with state-dependent discount factors.
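The problem described in the abstract can be sketched in symbols as follows. This is one reading of the abstract, not the chapter's own formulation: the names X, D, G, r, c, \kappa and U, and the direction of the inequality, are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the chapter):
%   X       a linear space
%   D       a convex set
%   G       a real-valued function on the product X x D
%   r, c    two fixed elements of X
%   \kappa  a real constraint constant
\[
  U := \{\, d \in D : G(c, d) \le \kappa \,\},
  \qquad
  \text{maximize } G(r, d) \ \text{ over } d \in U .
\]
% Under this reading, d^* is constrained-optimal when it is feasible
% and attains the supremum:
\[
  d^{*} \in U
  \quad\text{and}\quad
  G(r, d^{*}) = \sup_{d \in U} G(r, d).
\]
```

In the MDP applications, one plausible instance takes r and c to be reward and cost functions, d to be a policy, and G to be the expected discounted or average value of the policy under the given function. This identification is likewise an assumption.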
Research supported by NSFC, GDUPS, RFDP, and FRFCU.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Guo, X., Wei, Q., Zhang, J. (2012). A Constrained Optimization Problem with Applications to Constrained MDPs. In: Hernández-Hernández, D., Minjárez-Sosa, J. (eds) Optimization, Control, and Applications of Stochastic Systems. Systems & Control: Foundations & Applications. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8337-5_8
DOI: https://doi.org/10.1007/978-0-8176-8337-5_8
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8336-8
Online ISBN: 978-0-8176-8337-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)