
The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes

Applied Mathematics & Optimization


Abstract

The main goal of this paper is to apply the so-called policy iteration algorithm (PIA) to the long run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with compact action space depending on the state variable. To this end, we first derive some important properties of a pseudo-Poisson equation associated with the problem. We then show that, under some classical hypotheses, the PIA converges to a solution satisfying the optimality equation, and that this optimal solution yields an optimal control strategy in feedback form for the average control problem for the continuous-time PDMP.
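To illustrate the evaluate-then-improve structure of a policy iteration algorithm, here is a minimal sketch for the classical finite-state, discrete-time average-cost problem. It is not the algorithm of the paper, which works with continuous-time PDMPs on a general Borel space and a pseudo-Poisson equation. The sketch assumes that every stationary policy induces a unichain transition matrix, so that the evaluation system has a unique solution once h(0) = 0 is fixed. All names (`solve_linear`, `evaluate`, `policy_iteration`, `P`, `c`) and the two-state data are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def solve_linear(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def evaluate(policy: List[int], P, c) -> Tuple[float, List[float]]:
    """Solve rho + h(s) = c(s, f(s)) + sum_j P(j | s, f(s)) h(j) with h(0) = 0."""
    n = len(policy)
    a, b = [], []
    for s in range(n):
        probs = P[s][policy[s]]
        # Unknowns are (rho, h(1), ..., h(n-1)); h(0) is pinned to zero.
        row = [1.0] + [(1.0 if j == s else 0.0) - probs[j] for j in range(1, n)]
        a.append(row)
        b.append(c[s][policy[s]])
    sol = solve_linear(a, b)
    return sol[0], [0.0] + sol[1:]


def policy_iteration(P, c, tol: float = 1e-9, max_iter: int = 100):
    """Alternate policy evaluation and improvement until the policy is stable."""
    n = len(P)
    policy = [0] * n  # arbitrary initial stationary policy
    for _ in range(max_iter):
        rho, h = evaluate(policy, P, c)
        new_policy = []
        for s in range(n):
            def q(act: int) -> float:
                return c[s][act] + sum(p * hj for p, hj in zip(P[s][act], h))
            best = min(range(len(P[s])), key=q)
            # Keep the current action unless another one is strictly better.
            if q(policy[s]) <= q(best) + tol:
                best = policy[s]
            new_policy.append(best)
        if new_policy == policy:
            return policy, rho, h
        policy = new_policy
    raise RuntimeError("policy iteration did not stabilise")


# Hypothetical data: P[s][a] is the transition row, c[s][a] the one-step cost.
P = [[[0.5, 0.5], [1.0, 0.0]], [[0.5, 0.5], [1.0, 0.0]]]
c = [[1.0, 2.0], [3.0, 3.5]]
policy, rho, h = policy_iteration(P, c)
print(policy, round(rho, 4))  # prints: [0, 1] 1.8333
```

In this toy instance the initial policy has average cost 2, and one improvement step reaches the policy [0, 1] with average cost 11/6, which the next evaluation confirms as stable. The tie-breaking rule that keeps the current action is a common way to prevent cycling between equally good policies.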



Author information

Corresponding author

Correspondence to O. L. V. Costa.

Additional information

O.L.V. Costa received financial support from CNPq (Brazilian National Research Council), grant 301067/09-0.

F. Dufour was supported by the ARPEGE program of the French National Agency of Research (ANR), project “FAUTOCOES”, number ANR-09-SEGI-004.


About this article

Cite this article

Costa, O.L.V., Dufour, F. The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes. Appl Math Optim 62, 185–204 (2010). https://doi.org/10.1007/s00245-010-9099-4

  • DOI: https://doi.org/10.1007/s00245-010-9099-4
