The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes

Costa, O. L. V.; Dufour, F.

doi:10.1007/s00245-010-9099-4

Published: 03 March 2010

Applied Mathematics & Optimization, volume 62, pages 185–204 (2010)


Abstract

The main goal of this paper is to apply the so-called policy iteration algorithm (PIA) to the long-run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with a compact action space depending on the state variable. To this end, we first derive some important properties of a pseudo-Poisson equation associated with the problem. We then show that, under some classical hypotheses, the PIA converges to a solution satisfying the optimality equation, and that this optimal solution yields an optimal control strategy in feedback form for the average control problem of the continuous-time PDMP.
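For readers unfamiliar with the PIA, the sketch below illustrates the general evaluate-then-improve loop on a much simpler setting: a finite-state, discrete-time Markov decision process with an average cost criterion. It is not the algorithm of the paper, which works with PDMPs on a general Borel space and a pseudo-Poisson equation. All names (`solve_linear`, `evaluate`, `improve`, `policy_iteration`) and the two-state transition and cost data are hypothetical and chosen only for illustration. The sketch also assumes every stationary policy induces a single recurrent class, so that the evaluation equations have a unique solution once h(0) is fixed to 0.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def evaluate(policy, P, cost):
    """Policy evaluation: solve the finite-state Poisson equation
    g + h(s) = cost[s][a] + sum_t P[s][a][t] * h(t), normalised by h(0) = 0.
    Returns the average cost g and the relative value function h."""
    n = len(policy)
    A, b = [], []
    for s in range(n):
        a = policy[s]
        # Unknown vector is (g, h(1), ..., h(n-1)).
        row = [1.0] + [(1.0 if t == s else 0.0) - P[s][a][t] for t in range(1, n)]
        A.append(row)
        b.append(cost[s][a])
    x = solve_linear(A, b)
    return x[0], [0.0] + x[1:]


def improve(policy, h, P, cost, tol=1e-12):
    """Policy improvement: switch to a strictly better action where one exists."""
    new_policy = list(policy)
    for s in range(len(policy)):
        def q(a):
            return cost[s][a] + sum(p * ht for p, ht in zip(P[s][a], h))
        best = min(range(len(cost[s])), key=q)
        if q(best) < q(policy[s]) - tol:
            new_policy[s] = best
    return new_policy


def policy_iteration(P, cost, max_iter=100):
    """Alternate evaluation and improvement until the policy is stable."""
    policy = [0] * len(P)
    for _ in range(max_iter):
        g, h = evaluate(policy, P, cost)
        new_policy = improve(policy, h, P, cost)
        if new_policy == policy:
            return policy, g, h
        policy = new_policy
    raise RuntimeError("policy iteration did not converge")


# Hypothetical data: P[s][a][t] is the probability of moving from s to t under a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
cost = [[2.0, 0.5], [1.0, 3.0]]

policy, gain, bias = policy_iteration(P, cost)
```

In this finite setting the loop stops after finitely many improvements because there are only finitely many stationary policies. The convergence question in the general PDMP setting is more delicate and is the subject of the paper.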



Author information

Authors and Affiliations

Departamento de Engenharia de Telecomunicações e Controle, Escola Politécnica da Universidade de São Paulo, 05508 900, São Paulo, Brazil
O. L. V. Costa
IMB, Institut Mathématiques de Bordeaux, INRIA Bordeaux Sud Ouest, Team: CQFD, Universite Bordeaux I, 351 Cours de la Liberation, 33405, Talence Cedex, France
F. Dufour

Corresponding author

Correspondence to O. L. V. Costa.

Additional information

O.L.V. Costa received financial support from CNPq (Brazilian National Research Council), grant 301067/09-0.

F. Dufour was supported by the ARPEGE program of the French National Agency of Research (ANR), project “FAUTOCOES”, number ANR-09-SEGI-004.

About this article

Cite this article

Costa, O.L.V., Dufour, F. The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes. Appl Math Optim 62, 185–204 (2010). https://doi.org/10.1007/s00245-010-9099-4

Issue Date: October 2010