Impulsive Control for Continuous-Time Markov Decision Processes: A Linear Programming Approach

Applied Mathematics & Optimization

Abstract

In this paper, we investigate an optimization problem for continuous-time Markov decision processes with both impulsive and continuous controls. We consider the so-called constrained problem where the objective of the controller is to minimize a total expected discounted optimality criterion associated with a cost rate function while keeping other performance criteria of the same form, but associated with different cost rate functions, below some given bounds. Our model allows multiple impulses at the same time moment. The main objective of this work is to study the associated linear program defined on a space of measures including the occupation measures of the controlled process and to provide sufficient conditions to ensure the existence of an optimal control.

References

  1. Arapostathis, A., Borkar, V.S., Ghosh, M.K.: Ergodic Control of Diffusion Processes. Encyclopedia of Mathematics and Its Applications, vol. 143. Cambridge University Press, Cambridge (2012)

  2. Bhatt, A.G., Borkar, V.S.: Occupation measures for controlled Markov processes: characterization and optimality. Ann. Probab. 24(3), 1531–1562 (1996)

  3. Buckdahn, R., Goreac, D., Quincampoix, M.: Stochastic optimal control and linear programming approach. Appl. Math. Optim. 63(2), 257–276 (2011)

  4. Christensen, S.: On the solution of general impulse control problems using superharmonic functions. Stoch. Process. Appl. 124(1), 709–729 (2014)

  5. Costa, O.L.V., Dufour, F.: A linear programming formulation for constrained discounted continuous control for piecewise deterministic Markov processes. J. Math. Anal. Appl. 424(2), 892–914 (2015)

  6. Davis, M.H.A.: Markov Models and Optimization. Monographs on Statistics and Applied Probability, vol. 49. Chapman & Hall, London (1993)

  7. Dufour, F., Horiguchi, M., Piunovskiy, A.B.: The expected total cost criterion for Markov decision processes under constraints: a convex analytic approach. Adv. Appl. Probab. 44(3), 774–793 (2012)

  8. Dufour, F., Piunovskiy, A.B.: Impulsive control for continuous-time Markov decision processes. Adv. Appl. Probab. 47(1), 106–127 (2015)

  9. Guo, X., Hernández-Lerma, O.: Continuous-Time Markov Decision Processes: Theory and Applications. Stochastic Modelling and Applied Probability, vol. 62. Springer, Berlin (2009)

  10. Guo, X., Hernández-Lerma, O., Prieto-Rumeau, T.: A survey of recent results on continuous-time Markov decision processes. Top 14(2), 177–261 (2006). With comments and a rejoinder by the authors

  11. Guo, X., Piunovskiy, A.: Discounted continuous-time Markov decision processes with constraints: unbounded transition and loss rates. Math. Oper. Res. 36(1), 105–132 (2011)

  12. Helmes, K., Stockbridge, R.H.: Linear programming approach to the optimal stopping of singular stochastic processes. Stochastics 79(3–4), 309–335 (2007)

  13. Hernández-Lerma, O., Lasserre, J.B.: Discrete-Time Markov Control Processes. Applications of Mathematics, vol. 30. Springer, New York (1996)

  14. Hernández-Lerma, O., Lasserre, J.B.: Further Topics on Discrete-Time Markov Control Processes. Applications of Mathematics, vol. 42. Springer, New York (1999)

  15. Hordijk, A., van der Duyn Schouten, F.A.: Average optimal policies in Markov decision drift processes with applications to a queueing and a replacement model. Adv. Appl. Probab. 15(2), 274–303 (1983)

  16. Hordijk, A., van der Duyn Schouten, F.A.: Markov decision drift processes: conditions for optimality obtained by discretization. Math. Oper. Res. 10(1), 160–173 (1985)

  17. Jacod, J.: Multivariate point processes: predictable projection, Radon-Nikodým derivatives, representation of martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 31, 235–253 (1974/75)

  18. Jacod, J.: Calcul Stochastique et Problèmes de Martingales. Lecture Notes in Mathematics, vol. 714. Springer, Berlin (1979)

  19. Kitaev, M.Y., Rykov, V.V.: Controlled Queueing Systems. CRC Press, Boca Raton (1995)

  20. Kurtz, T.G., Stockbridge, R.H.: Existence of Markov controls and characterization of optimal Markov controls. SIAM J. Control Optim. 36(2), 609–653 (1998)

  21. Kushner, H.J., Dupuis, P.G.: Numerical Methods for Stochastic Control Problems in Continuous Time. Applications of Mathematics (New York), vol. 24. Springer, New York (1992)

  22. Kushner, H.J., Martins, L.F.: Numerical methods for stochastic singular control problems. SIAM J. Control Optim. 29(6), 1443–1475 (1991)

  23. Last, G., Brandt, A.: Marked Point Processes on the Real Line. Probability and Its Applications (New York). Springer, New York (1995)

  24. Piunovskiy, A., Zhang, Y.: Discounted continuous-time Markov decision processes with unbounded rates: the convex analytic approach. SIAM J. Control Optim. 49(5), 2032–2061 (2011)

  25. Piunovskiy, A.B.: Multicriteria impulsive control of jump Markov processes. Math. Methods Oper. Res. 60(1), 125–144 (2004)

  26. Prieto-Rumeau, T., Hernández-Lerma, O.: Ergodic control of continuous-time Markov chains with pathwise constraints. SIAM J. Control Optim. 47(4), 1888–1908 (2008)

  27. Prieto-Rumeau, T., Hernández-Lerma, O.: Selected Topics on Continuous-Time Controlled Markov Chains and Markov Games. ICP Advanced Texts in Mathematics, vol. 5. Imperial College Press, London (2012)

  28. Stockbridge, R.H.: Time-average control of martingale problems: a linear programming formulation. Ann. Probab. 18(1), 206–217 (1990)

  29. Yushkevich, A.A.: Continuous time Markov decision processes with interventions. Stochastics 9(4), 235–274 (1983)

  30. Yushkevich, A.A.: Markov decision processes with both continuous and impulsive control. In: Stochastic optimization (Kiev, 1984). Lecture Notes in Control and Information Sciences, vol. 81, pp. 234–246. Springer, Berlin (1986)

  31. Yushkevich, A.A.: Bellman inequalities in Markov decision deterministic drift processes. Stochastics 23(1), 25–77 (1987)

  32. Yushkevich, A.A.: Verification theorems for Markov decision processes with controllable deterministic drift, gradual and impulse controls. Teor. Veroyatnost. i Primenen. 34(3), 528–551 (1989)

Acknowledgments

This study has been carried out with financial support from the French State, managed by the French National Research Agency (ANR) in the frame of the “Investments for the future” Programme IdEx Bordeaux - CPU (ANR-10-IDEX-03-02).

Author information

Corresponding author

Correspondence to F. Dufour.

Appendices

Appendix 1: Proofs of Propositions 4.3 and 4.4

The next result provides a sufficient condition in terms of the finiteness of the occupation measure to ensure that the process is not explosive.

Lemma A.1

For any \(u\in \mathcal {U}\), \(\displaystyle \eta ^i_u({\mathbf {X}}\times \{\Delta \})\le 1+\frac{1}{\alpha } \int _{\mathbf {X}\times \mathbf {A}^{g}} \overline{q}(\mathbf {X}|x,a)\eta ^g_u(dx,da)+\eta ^i_u({\mathbf {X}}\times \mathbf {A}^i).\) If \(\eta ^{i}_{u}(\mathbf {X}\times \mathbf {A}^{i}_{\Delta })<\infty \), then \(\mu ^{i}_{u}(\mathbf {Y})<\infty \) and \(\mathbb {P}^{u}_{x_{0}}(T_{\infty }<\infty )=0\).

Proof

Note that

$$\begin{aligned} \eta ^i_u({\mathbf {X}}\times \{\Delta \})&=\widetilde{\mu }^i_u ( {\mathbf {X}}\times \{\Delta \}) =\sum _{j=1}^\infty \mu ^i_u \big ( \{ {\mathbf {y}}\in {\mathbf {Y}}:{\mathbf {y}}_j\in {\mathbf {X}}\times \{\Delta \} \} \big ) \nonumber \\&=\sum _{j=1}^\infty \mu ^i_u({\mathbf {Y}}_{j-1})=\mu ^i_u({\mathbf {Y}}) =\mathbb {E}^u_{x_0}\left[ \int _{]0,T_\infty [} e^{-\alpha s}\mu (ds,{\mathbf {Y}})\right] +u_0({\mathbf {Y}}|x_0). \end{aligned}$$

Since \(\nu =\nu _0+\nu _1\) is the predictable projection of \(\mu \) and \(\nu _1(ds,\cdot )\) is concentrated on \({\mathbf {Y}}^*\), we see that

$$\begin{aligned} \eta ^i_u({\mathbf {X}}\times \{\Delta \})&=\mathbb {E}^u_{x_0}\left[ \int _{]0,T_\infty [} e^{-\alpha s}\int _{\mathbf {A}^g}\overline{q}({\mathbf {X}}|\overline{x}(\xi _{s-}),a)\pi (da|s)ds\right] \\&\quad +\mathbb {E}^u_{x_0}\left[ \int _{]0,T_\infty [} e^{-\alpha s} \nu _1(ds,{\mathbf {Y}})\right] +1 \nonumber \\&=\frac{1}{\alpha }\int _{\mathbf {X}\times \mathbf {A}^{g}} \overline{q}(\mathbf {X}|x,a)\eta ^g_u(dx,da)+\mathbb {E}^u_{x_0}\left[ \int _{]0,T_\infty [} e^{-\alpha s} \nu _1(ds,{\mathbf {Y}}^*)\right] +1 \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}^u_{x_0}\left[ \int _{]0,T_\infty [} e^{-\alpha s} \nu _1(ds,{\mathbf {Y}}^*)\right] \le \mu ^i_u({\mathbf {Y}}^*)=\sum _{k=1}^\infty \mu ^i_u({\mathbf {Y}}_k). \end{aligned}$$

Finally, since \( \{\mathbf {y}\in \mathbf {Y} : \mathbf {y}_{j}\in \mathbf {X}\times \mathbf {A}^{i}\}=\bigcup _{k=j}^\infty {\mathbf {Y}}_k\),

$$\begin{aligned} \eta ^i_u({\mathbf {X}}\times \mathbf {A}^i)=\sum _{j=1}^\infty \mu ^i_u \big ( \{ {\mathbf {y}}\in {\mathbf {Y}}: {\mathbf {y}}_j\in {\mathbf {X}}\times \mathbf {A}^i \} \big ) =\sum _{j=1}^\infty j\mu ^i_u({\mathbf {Y}}_j)\ge \sum _{k=1}^\infty \mu ^i_u({\mathbf {Y}}_k), \end{aligned}$$

showing the first part of the result. To prove the last statement, observe first that for any \(j\in \mathbb {N}^*\), we have

$$\begin{aligned} \{\mathbf {y}\in \mathbf {Y} : \mathbf {y}_{j}\in \mathbf {X}\times \mathbf {A}^{i}_{\Delta }\}= & {} \{\mathbf {y}\in \mathbf {Y} : \mathbf {y}_{j}\in \mathbf {X}\times \mathbf {A}^{i}\} \mathop {\cup }\{\mathbf {y}\in \mathbf {Y} : \mathbf {y}_{j}\in \mathbf {X}\times \{\Delta \}\} \\= & {} \mathop {\cup }_{k=j}^{\infty } \mathbf {Y}_{k}\mathop {\cup }\mathbf {Y}_{j-1} \end{aligned}$$

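Consequently, \(\eta ^{i}_{u}(\mathbf {X}\times \mathbf {A}^{i}_{\Delta })=\widetilde{\mu }_{u}^{i}(\mathbf {X}\times \mathbf {A}^{i}_{\Delta }) =\sum _{j\in \mathbb {N}} (j+1) \mu _{u}^{i}(\mathbf {Y}_{j})\ge \mu _{u}^{i}(\mathbf {Y})\), so that \(\mu _{u}^{i}(\mathbf {Y})<\infty \). Now, we have that \(\displaystyle \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n=2}^{\infty } e^{-\alpha T_{n}} I_{\{T_{n}<T_{\infty }\}} \Bigg ] \le \mu _{u}^{i}(\mathbf {Y}) < \infty \). On the event \(\{T_{\infty }<\infty \}\), every term of this series with \(T_{n}<T_{\infty }\) is bounded below by \(e^{-\alpha T_{\infty }}>0\), so the series diverges there; since its expectation is finite, this forces \(\mathbb {P}^{u}_{x_{0}}(T_{\infty }<\infty )=0\), showing the last part of the result. \(\square \)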
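The counting identities used in this proof can be checked on a toy example. The sketch below is only illustrative: it assumes a hypothetical finite table `mass`, where `mass[k]` stands for \(\mu ^i_u(\mathbf {Y}_k)\), the mass carried by sequences with exactly \(k\) impulsive actions followed by \(\Delta \). It counts coordinates one by one rather than using the closed-form sums, so the two can be compared.

```python
def coordinate_masses(mass):
    """Sum, over all coordinates j >= 1, the mass of sequences whose j-th entry
    is an impulsive action or the stopping action Delta.

    mass[k] is a hypothetical stand-in for mu(Y_k): the mass carried by
    sequences with exactly k impulsive actions followed by Delta.
    """
    impulsive = 0.0  # stands for eta^i(X x A^i)
    stopping = 0.0   # stands for eta^i(X x {Delta})
    for k, m in enumerate(mass):
        # A sequence in Y_k has k impulsive coordinates, then Delta.
        pattern = ["impulse"] * k + ["stop"]
        for entry in pattern:
            if entry == "impulse":
                impulsive += m
            else:
                stopping += m
    return impulsive, stopping
```

With `mass = [0.5, 0.25, 0.125, 0.125]`, the first returned value agrees with \(\sum _j j\,\mu (\mathbf {Y}_j)\) and dominates \(\sum _{k\ge 1}\mu (\mathbf {Y}_k)\), while the sum of both values agrees with \(\sum _j (j+1)\mu (\mathbf {Y}_j)\).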

Proof of Proposition 4.3. Consider \(\Gamma \in \mathcal {B}(\mathbf {X})\). From Lemma A.1, \(\mathbb {P}^{u}_{x_{0}}(T_{\infty }=+\infty )=1\) and so, by the product formula for functions of bounded variation, we obtain

$$\begin{aligned} e^{-\alpha t} I_{\Gamma }(\overline{x}(\xi _{t}))= & {} I_{\Gamma }(\overline{x}(y_{1})) - \int _{0}^{t} \alpha e^{-\alpha s} I_{\Gamma }(\overline{x}(\xi _{s})) ds \\&+ \int _{]0,t]\times \mathbf {Y}} e^{-\alpha s} \Big [ I_{\Gamma }(\overline{x}(z)) - I_{\Gamma }(\overline{x}(\xi _{s-})) \Big ] \mu (ds,dz) . \end{aligned}$$

Therefore, taking expectations, letting \(t\rightarrow \infty \), and combining the bounded convergence theorem with the fact that \(\mu ^{i}_{u}(\mathbf {Y})<\infty \) (see Lemma A.1), we have

$$\begin{aligned} \eta _{u}^{g}(\Gamma \times \mathbf {A}^{g})&= \alpha \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{0}^{\infty } e^{-\alpha s} I_{\Gamma }(\overline{x}(\xi _{s})) ds \Bigg ] \nonumber \\&= \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{\mathbf {Y}} I_{\Gamma }(\overline{x}(y)) u_{0}(dy|x_{0}) \Bigg ]\nonumber \\&\quad + \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{]0,\infty [\times \mathbf {Y}} e^{-\alpha s} \Big [ I_{\Gamma }(\overline{x}(z)) - I_{\Gamma }(\overline{x}(\xi _{s-})) \Big ] \mu (ds,dz) \Bigg ]. \end{aligned}$$

Recalling the definition of \(\mu _{u}^{i}\) (see Eq. 9) and the fact that \(\nu \) is the predictable projection of \(\mu \), we obtain by using Lemma 3.2

$$\begin{aligned} \eta _{u}^{g}(\Gamma \times \mathbf {A}^{g})&= \int _{\mathbf {Y}} I_{\Gamma }(\bar{x}(y)) \mu _{u}^{i}(dy)\\&\quad - \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{0}^{\infty } e^{-\alpha s} \int _{\mathbf {A}^{g}} I_{\Gamma } (\overline{x}(\xi _{s})) \bar{q}(\mathbf {X}| \overline{x}(\xi _{s}),a) \pi (da |s) ds \Bigg ] \nonumber \\&- \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}}\int _{]T_{n}, T_{n+1}]} e^{-\alpha s} I_{\Gamma }(\overline{x}(\xi _{s-})) \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Bigg ], \end{aligned}$$

and so,

$$\begin{aligned} \eta _{u}^{g}(\Gamma \times \mathbf {A}^{g})&= \int _{\mathbf {Y}} I_{\Gamma }(\bar{x}(y)) \mu _{u}^{i}(dy)\\&\quad - \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{\mathbf {X}\times \mathbf {A}^{g}} I_{\Gamma } (x) \bar{q}(\mathbf {X}| x,a) \int _{0}^{\infty } e^{-\alpha s} \delta _{\overline{x}(\xi _{s})}(dx)\pi (da |s) ds \Bigg ] \nonumber \\&- \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}} I_{\Gamma }(\overline{x}(Y_{n})) \int _{]T_{n}, T_{n+1}]} e^{-\alpha s} \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Bigg ]. \end{aligned}$$

By using Fubini’s Theorem, we have that

$$\begin{aligned}&\mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{\mathbf {X}\times \mathbf {A}^{g}} I_{\Gamma } (x) \bar{q}(\mathbf {X}| x,a) \int _{0}^{\infty } e^{-\alpha s} \delta _{\overline{x}(\xi _{s})}(dx)\pi (da |s) ds \Bigg ]\\&\quad = \frac{1}{\alpha } \int _{\mathbf {X}\times \mathbf {A}^{g}} I_{\Gamma } (x) \bar{q}(\mathbf {X}| x,a) \eta _{u}^{g}(dx,da). \end{aligned}$$

Moreover, observe that

$$\begin{aligned} \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{]T_{n}, T_{n+1}]} e^{-\alpha s} \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Big | \mathcal {F}_{T_{n}} \Bigg ] = e^{-\alpha T_{n}} \int _{]0,\infty [} e^{-\alpha s} \psi _{n}(ds | H_{n}). \end{aligned}$$

Combining the last three equations, it follows that

$$\begin{aligned} \eta _{u}^{g}(\Gamma \times \mathbf {A}^{g})&= \int _{\mathbf {Y}} I_{\Gamma }(\bar{x}(y)) \mu _{u}^{i}(dy) - \frac{1}{\alpha } \int _{\mathbf {X}\times \mathbf {A}^{g}} I_{\Gamma } (x) \bar{q}(\mathbf {X}| x,a) \eta _{u}^{g}(dx,da) \nonumber \\&- \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}} e^{-\alpha T_{n}} I_{\Gamma }(\overline{x}(Y_{n})) \int _{]0,\infty [} e^{-\alpha s} \psi _{n}(ds | H_{n}) \Bigg ]. \end{aligned}$$

Finally, remark that \(I_{\Gamma }(\bar{x}(y))= \sum _{j=1}^{\infty } I_{\Gamma \times \{\Delta \}} (y_{j})\) for any \(y=\big (y_{1},y_{2},\ldots ,y_{j},\ldots \big )\in \mathbf {Y}\). Therefore, \(\displaystyle \int _{\mathbf {Y}} I_{\Gamma }(\bar{x}(y)) \mu _{u}^{i}(dy)= \eta _{u}^{i}(\Gamma \times \{\Delta \})\) showing the result. \(\square \)

Lemma A.2

Consider a fixed strategy \(u=(u_{n})_{n\in \mathbb {N}}\in \mathcal {U}\) with \(u_{n}=\big ( \psi _{n},\pi _{n},\gamma ^0_{n},\gamma ^1_{n} \big )\) for \(n\in \mathbb {N}^{*}\).

Then, for any \(n\in \mathbb {N}^{*}\), \(\Gamma \in \mathcal {B}(\mathbf {X}_{\Delta })\), \(t\in \mathbb {R}_{+}\), \(x\in \mathbf {X}\) and \(h_{n}\in \mathbf {H}_{n}\)

$$\begin{aligned} \widetilde{\gamma }^{0}_{n}(\Gamma \times \mathbf {A}^{i}_{\Delta } | h_{n},t,x)&= \delta _{x}(\Gamma ) + \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,a) \widetilde{\gamma }^{0}_{n}(dz,da | h_{n},t,x) ,\end{aligned}$$
(47)
$$\begin{aligned} \widetilde{\gamma }^{1}_{n}(\Gamma \times \mathbf {A}^{i}_{\Delta } | h_{n})&= \delta _{\bar{x}(y_{n})}(\Gamma ) + \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,a) \widetilde{\gamma }^{1}_{n}(dz,da | h_{n}) , \end{aligned}$$
(48)

where \(h_{n}=(y_0,\theta _1,y_1,\ldots ,\theta _{n},y_{n})\in \mathbf {H}_{n}\). Similarly, for any \(\Gamma \in \mathcal {B}(\mathbf {X}_{\Delta })\) and \(x\in \mathbf {X}\)

$$\begin{aligned} \widetilde{u}_{0}(\Gamma \times \mathbf {A}^{i}_{\Delta } | x)&= \delta _{x}(\Gamma ) + \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,a) \widetilde{u}_{0}(dz,da | x) . \end{aligned}$$
(49)

Proof

This lemma is a straightforward consequence of Lemma 9.4.3 in [14]. \(\square \)
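Equation (49) can be illustrated numerically on a finite model. The following sketch rests on several assumptions that are not taken from the paper: a finite state space, a single impulsive action with a hypothetical transition matrix `Q`, a stationary policy that chooses \(\Delta \) in state `z` with probability `stop_prob[z]`, and sets \(\Gamma \subset \mathbf {X}\) only (so that the stopping action contributes nothing to \(Q_{\Delta }(\Gamma |z,\Delta )\)). The expected visit counts are obtained by propagating the law of the chain, not by solving the equation, so the residual is a genuine check.

```python
def occupation_of_impulse_chain(Q, stop_prob, x0, n_steps=200):
    """Expected number of visits to each state before the stop action Delta.

    Q[z][x]      : hypothetical probability that one impulse moves z to x
    stop_prob[z] : probability that the stationary policy picks Delta in z
    """
    n = len(Q)
    # rho[x]: probability of being in x with Delta not yet chosen
    rho = [1.0 if x == x0 else 0.0 for x in range(n)]
    visits = [0.0] * n
    for _ in range(n_steps):
        for x in range(n):
            visits[x] += rho[x]
        rho = [sum(rho[z] * (1.0 - stop_prob[z]) * Q[z][x] for z in range(n))
               for x in range(n)]
    return visits


def characteristic_residual(Q, stop_prob, x0, visits):
    """Largest violation of m(x) = delta_{x0}(x) + sum_z m(z)(1 - stop(z)) Q(x|z)."""
    n = len(Q)
    worst = 0.0
    for x in range(n):
        rhs = (1.0 if x == x0 else 0.0) + sum(
            visits[z] * (1.0 - stop_prob[z]) * Q[z][x] for z in range(n))
        worst = max(worst, abs(visits[x] - rhs))
    return worst
```

For instance, with `Q = [[0.2, 0.8], [0.6, 0.4]]` and `stop_prob = [0.5, 0.5]`, the residual is negligible and the total mass on the stopping action, `sum(v * s for v, s in zip(visits, stop_prob))`, is close to one, since \(\Delta \) is chosen almost surely.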

Proof of Proposition 4.4. By using the fact that \(\nu \) is the predictable projection of \(\mu \) and Lemma 3.2, we have

$$\begin{aligned} \eta _{u}^{i}(\Gamma \times \mathbf {A}^{i}_{\Delta })&= \widetilde{u}_{0}(\Gamma \times \mathbf {A}^{i}_{\Delta }|x_{0}) \\&+ \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{]0,\infty [} e^{-\alpha s} \int _{\mathbf {A}^{g}}\int _\mathbf {X} \widetilde{\gamma }^{0}(\Gamma \times \mathbf {A}^{i}_{\Delta }|x,s) \bar{q}(dx | \overline{x}(\xi _{s-}),a) \pi (da|s) ds \Bigg ] \\&+ \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}}\int _{]T_{n}, T_{n+1}]} e^{-\alpha s} \widetilde{\gamma }^{1}_{n}(\Gamma \times \mathbf {A}^{i}_{\Delta }|H_{n}) \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Bigg ]. \end{aligned}$$

Now, by using Lemma A.2, it follows that

$$\begin{aligned}&\eta _{u}^{i}( \Gamma \times \mathbf {A}^{i}_{\Delta }) \\&\quad = \delta _{x_{0}}(\Gamma ) + \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,b) \widetilde{u}_{0}(dz,db|x_{0}) \\&\qquad + \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{]0,\infty [} e^{-\alpha s} \int _{\mathbf {A}^{g}}\int _\mathbf {X} \delta _{x}(\Gamma ) \bar{q}(dx | \overline{x}(\xi _{s-}),a) \pi (da|s) ds \Bigg ] \\&\qquad + \mathbb {E}^{u}_{x_{0}} \Bigg [ \int _{]0,\infty [} e^{-\alpha s} \int _{\mathbf {A}^{g}}\int _\mathbf {X} \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,b) \widetilde{\gamma }^{0}(dz,db|x,s) \bar{q}(dx | \overline{x}(\xi _{s-}),a) \pi (da|s) ds \Bigg ] \\&\qquad + \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}}\int _{]T_{n}, T_{n+1}]} e^{-\alpha s} I_{\Gamma }(\overline{x}(Y_{n})) \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Bigg ] \\&\qquad + \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}}\int _{]T_{n}, T_{n+1}]} e^{-\alpha s} \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,b) \widetilde{\gamma }^{1}_{n}(dz,db|H_{n}) \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Bigg ]. \end{aligned}$$

Consequently

$$\begin{aligned} \eta _{u}^{i}(\Gamma \times \mathbf {A}^{i}_{\Delta })&= \delta _{x_{0}}(\Gamma ) + \int _{\mathbf {X}_{\Delta }\times \mathbf {A}^{i}_{\Delta }} Q_{\Delta }(\Gamma | z,b) \eta _{u}^{i}(dz,db)\\&\quad + \frac{1}{\alpha } \int _{\mathbf {X} \times \mathbf {A}^{g}} \bar{q}(\Gamma \mathop {\cap }\mathbf {X} | x,a) \eta _{u}^{g}(dx,da) \\&+ \mathbb {E}^{u}_{x_{0}} \Bigg [ \sum _{n\in \mathbb {N}^{*}}\int _{]T_{n}, T_{n+1}]} e^{-\alpha s} I_{\Gamma }(\overline{x}(Y_{n})) \frac{\psi _{n}(ds-T_{n} | H_{n})}{\psi _{n}([s-T_{n},+\infty ] | H_{n})} \Bigg ] \end{aligned}$$

showing the result. \(\square \)

Appendix 2: Proof of Proposition 4.8

This appendix is dedicated to the proof of Proposition 4.8. We first need to derive some technical results. Throughout this appendix, we consider \(\pi \in \mathcal {P}^{g}(\mathbf {A}^{g}|\mathbf {X})\) and \(\varphi \in \mathcal {P}^{i}(\mathbf {A}^{i}_{\Delta }|\mathbf {X}_{\Delta })\) fixed. Let us introduce the stochastic kernel \(G_{\pi ,\varphi }\) on \(\mathbb {R}^{*}_{+}\times \mathbf {Y}\) given \(\mathbf {Y}\)

$$\begin{aligned} G_{\pi ,\varphi }(dt,dy| z)= \bar{q}^{\pi } R^{\varphi }(dy| \bar{x}(z)) e^{-t\bar{q}^{\pi }(\mathbf {X}|\bar{x}(z))} dt, \end{aligned}$$
(50)

and the stochastic kernel \(L_{\pi }\) on \(\mathbf {X}\) given \(\mathbf {Y}\)

$$\begin{aligned} L_{\pi }(dx|y)&= \frac{\delta _{\overline{x}(y)}(dx)}{\alpha +\bar{q}^{\pi }(\mathbf {X}|\bar{x}(y))}. \end{aligned}$$
(51)

For notational convenience, we denote

$$\begin{aligned} H_{\pi ,\varphi }=L_{\pi }\bar{q}^{\pi }R^{\varphi }. \end{aligned}$$
(52)

Lemma B.1

Let \(\gamma \in \mathcal {P}(\mathbf {Y})\). Then \(\widetilde{\gamma }\) is supported on \(\mathbb {K}^{i}_{\Delta }\) and \(\widetilde{\gamma }(\mathbf {X}\times \{\Delta \})=1\). Consider \(x\in \mathbf {X}\) and a randomized stationary policy \(\varphi \) for the model \(\mathcal {M}^{i}\). Then \(\widetilde{P}^{\varphi }(\mathbf {X}\times \{\Delta \}|x)\le 1\). Moreover, \(\widetilde{P}^{\varphi }(\mathbf {X}\times \{\Delta \}|x)=1\) if and only if \(P^{\varphi }(\mathbf {Y}|x)=1\).

Proof

Let \(j\in \mathbb {N}^{*}\). Observe that \(\{\mathbf {y}\in \mathbf {Y}: \mathbf {y}_{j}\in \mathbf {X}\times \{\Delta \}\}=\mathbf {Y}_{j-1}\) and the first assertion is clear. Regarding the second claim, we have \(P^{\varphi } \Big ( \big \{\mathbf {y}\in (\mathbb {K}^{i}_{\Delta })^{\infty }: \mathbf {y}_{j}\in \mathbf {X}\times \{\Delta \} \big \} |x \Big ) =P^{\varphi } \Big ( \mathbf {Y}_{j-1} |x \Big )\) for \(x\in \mathbf {X}\) since \(P^{\varphi }\) is the strategic measure for the model \(\mathcal {M}^{i}\) generated by \(\varphi \), showing the last part of the result. \(\square \)

Lemma B.2

For any \(\Upsilon \in \mathcal {B}(\mathbf {Y})\) and \(n\in \mathbb {N}^{*}\), we have

$$\begin{aligned} \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n}< \infty \}} e^{-\alpha T_{n}} \delta _{Y_{n}}(\Upsilon ) \Big ] =R^{\varphi } H^{n-1}_{\pi ,\varphi }(\Upsilon |x_{0}). \end{aligned}$$
(53)

Proof

Let us show the result by induction. Clearly, this equation holds for \(n=1\). Now, assume that Eq. (53) holds for n. Consider \(\Upsilon \in \mathcal {B}(\mathbf {Y})\). Then,

$$\begin{aligned} \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n+1}< \infty \}}&e^{-\alpha T_{n+1}} \delta _{Y_{n+1}}(\Upsilon ) \Big ] \\&=\mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n}< \infty \}} e^{-\alpha T_{n}} \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{\Theta _{n+1}< \infty \}} e^{-\alpha \Theta _{n+1}} \delta _{Y_{n+1}}(\Upsilon ) | \mathcal {F}_{T_{n}} \Big ] \Big ] \\&= \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n}< \infty \}} e^{-\alpha T_{n}} \int _{\mathbb {R}_{+}^{*}} e^{-\alpha s} G_{\pi ,\varphi }(ds,\Upsilon | Y_{n}) \Big ] \\&= \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n}< \infty \}} e^{-\alpha T_{n}} \int _{\mathbb {R}_{+}^{*}} e^{-\alpha s} \bar{q}^{\pi }R^{\varphi }(\Upsilon |\bar{x}(Y_{n})) e^{-s\bar{q}^{\pi }(\mathbf {X}|\bar{x}(Y_{n}))} ds \Big ] \\&= \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n}< \infty \}} e^{-\alpha T_{n}} H_{\pi ,\varphi }(\Upsilon | Y_{n}) \Big ] \\&= \int _{\mathbf {Y}} H_{\pi ,\varphi }(\Upsilon |z) R^{\varphi }H^{n-1}_{\pi ,\varphi }(dz|x_{0}), \end{aligned}$$

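which equals \(R^{\varphi } H^{n}_{\pi ,\varphi }(\Upsilon |x_{0})\) by definition of the composition of kernels. This is Eq. (53) for \(n+1\), showing the result. \(\square \)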
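The step from \(G_{\pi ,\varphi }\) to \(H_{\pi ,\varphi }\) in the induction rests on the Laplace transform of an exponential holding time, \(\int _{0}^{\infty } e^{-\alpha s}\, q\, e^{-qs}\, ds = q/(\alpha +q)\). A minimal numerical sketch of this identity is given below; the values of `alpha` and `q` are hypothetical, with `q` playing the role of \(\bar{q}^{\pi }(\mathbf {X}|\bar{x}(Y_{n}))\), and the composite Simpson rule is merely one convenient quadrature choice.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def discounted_jump_factor(alpha, q, n=20000):
    """Approximate int_0^inf exp(-alpha*s) * q * exp(-q*s) ds numerically.

    alpha > 0 is the discount rate and q > 0 a hypothetical total jump
    intensity; the integral is truncated where the integrand is negligible.
    """
    rate = alpha + q
    horizon = 40.0 / rate  # exp(-40) is far below the accuracy sought here
    return simpson(lambda s: q * math.exp(-rate * s), 0.0, horizon, n)
```

Calling `discounted_jump_factor(0.1, 2.0)` should agree with `2.0 / 2.1` up to quadrature error, which is the factor \(\bar{q}^{\pi }/(\alpha +\bar{q}^{\pi })\) carried by \(L_{\pi }\bar{q}^{\pi }\) in (52).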

Proposition B.3

The following three equalities hold:

$$\begin{aligned} \eta _{u^{\pi ,\varphi }}^{i}(dx,da)= & {} \widetilde{R}^{\varphi }(dx,da|x_{0}) + \sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi }\widetilde{H}_{\pi ,\varphi }(dx,da|x_{0}), \end{aligned}$$
(54)
$$\begin{aligned} \widehat{\eta }_{u^{\pi ,\varphi }}^{g}(dx)= & {} \alpha \sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi }(dx|x_{0}), \end{aligned}$$
(55)

and

$$\begin{aligned} \eta _{u^{\pi ,\varphi }}^{i}(\Gamma \times \{\Delta \}) = \widehat{\eta }_{u^{\pi ,\varphi }}^{g}(\Gamma ) +\frac{1}{\alpha } \int _{\mathbf {X}} I_{\Gamma } (x) \bar{q}^{\pi }(\mathbf {X}| x) \widehat{\eta }_{u^{\pi ,\varphi }}^{g}(dx). \end{aligned}$$
(56)

Proof

From the definition of \(\mu _{u^{\pi ,\varphi }}^{i}\) (see Eq. 9) and Lemma B.2, we have

$$\begin{aligned} \mu _{u^{\pi ,\varphi }}^{i}(dy)= & {} R^{\varphi }(dy|x_{0}) + \sum _{n=2}^{\infty } \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Big [ I_{\{T_{n}< \infty \}} e^{-\alpha T_{n}} \delta _{Y_{n}}(dy) \Big ] \\= & {} R^{\varphi }(dy|x_{0}) + \sum _{n=1}^{\infty } R^{\varphi } H^{n}_{\pi ,\varphi }(dy|x_{0}). \end{aligned}$$

Observe that \(\widetilde{H}_{\pi ,\varphi }= L_{\pi } \bar{q}^{\pi } \widetilde{R}^{\varphi }\). Since \(\eta _{u^{\pi ,\varphi }}^{i}(dx,da)=\widetilde{\mu }_{u^{\pi ,\varphi }}^{i}(dx,da)\), we easily obtain Eq. (54).

Moreover, we have

$$\begin{aligned}&\mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Bigg [ \int _{0}^{T_{\infty }} e^{-\alpha s} \delta _{\bar{x}(\xi _{s-})}(dx) ds \Bigg ]\\&\quad = \sum _{n=1}^{\infty } \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Bigg [ I_{\{T_{n}<\infty \}} \int _{]T_{n},T_{n+1}]} e^{-\alpha s} \delta _{\bar{x}(\xi _{s-})}(dx) ds \Bigg ] \\&\quad = \frac{1}{\alpha } \sum _{n=1}^{\infty } \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Bigg [ I_{\{T_{n}<\infty \}} e^{-\alpha T_{n}} \delta _{\bar{x}(Y_{n})}(dx) \big (1- e^{-\alpha \Theta _{n+1}}\big ) \Bigg ] \\&\quad = \sum _{n=1}^{\infty } \mathbb {E}^{u^{\pi ,\varphi }}_{x_{0}} \Bigg [ I_{\{T_{n}<\infty \}} e^{-\alpha T_{n}} \frac{\delta _{\bar{x}(Y_{n})}(dx)}{\alpha +\bar{q}^{\pi }(\mathbf {X}|\bar{x}(Y_{n}))} \Bigg ], \end{aligned}$$

and so by using the definition of \(L_{\pi }\) and Lemma B.2, we obtain (55).

Now, from Eq. (55) we get

$$\begin{aligned}&\widehat{\eta }_{u^{\pi ,\varphi }}^{g} (\Gamma ) +\frac{1}{\alpha } \int _{\mathbf {X}} I_{\Gamma } (x) \bar{q}^{\pi }(\mathbf {X}| x) \widehat{\eta }_{u^{\pi ,\varphi }}^{g}(dx) \\&\quad = \sum _{n=1}^{\infty } \int _{\mathbf {Y}} \frac{\alpha I_{\Gamma }(\bar{x}(y))}{\alpha +\bar{q}^{\pi } (\mathbf {X}|\bar{x}(y))} R^{\varphi } H^{n-1}_{\pi ,\varphi } (dy|x_{0})\\&\qquad +\sum _{n=1}^{\infty } \int _{\mathbf {Y}} \frac{I_{\Gamma }(\bar{x}(y)) \bar{q}^{\pi }(\mathbf {X}|\bar{x}(y))}{\alpha +\bar{q}^{\pi } (\mathbf {X}|\bar{x}(y))} R^{\varphi } H^{n-1}_{\pi ,\varphi } (dy|x_{0}) \\&\quad = \sum _{n=1}^{\infty } \int _{\mathbf {Y}} I_{\Gamma }(\bar{x}(y)) R^{\varphi } H^{n-1}_{\pi ,\varphi } (dy|x_{0}) = \widetilde{R}^{\varphi } (\Gamma \times \{\Delta \}|x_{0})\\&\qquad + \sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi } \widetilde{H}_{\pi ,\varphi } (\Gamma \times \{\Delta \}|x_{0}). \end{aligned}$$

Recalling (54), we have Eq. (56), showing the result. \(\square \)
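Equations (55) and (56) can be sanity-checked on a finite chain. The sketch below makes simplifying assumptions that are not part of the paper: there are no impulses, so that \(R^{\varphi }(\cdot |x)\) reduces to a Dirac mass and \(H_{\pi ,\varphi }(x|z)=q(x|z)/(\alpha +q(\mathbf {X}|z))\), and the jump intensities in `rates` are hypothetical. The series in (55) is summed term by term.

```python
def series_occupation(rates, alpha, x0, n_terms=2000):
    """Sum the kernel series of (55) on a finite chain without impulses.

    rates[z][x] : hypothetical jump intensity q(x | z) for x != z
                  (diagonal entries are ignored)
    Returns (eta_hat, visits), where visits[x] = sum_n H^{n-1}({x} | x0)
    and eta_hat[x] = alpha * sum_n H^{n-1} L({x} | x0).
    """
    n = len(rates)
    total = [sum(rates[z][x] for x in range(n) if x != z) for z in range(n)]
    row = [1.0 if x == x0 else 0.0 for x in range(n)]  # H^0(. | x0)
    visits = [0.0] * n
    for _ in range(n_terms):
        for x in range(n):
            visits[x] += row[x]
        # one application of H(x | z) = q(x | z) / (alpha + q(X | z))
        row = [sum(row[z] * rates[z][x] / (alpha + total[z])
                   for z in range(n) if z != x) for x in range(n)]
    # L(x | x) = 1 / (alpha + q(X | x)), as in (51)
    eta_hat = [alpha * visits[x] / (alpha + total[x]) for x in range(n)]
    return eta_hat, visits
```

For a two-state chain with rates \(a\) (from 0 to 1) and \(b\) (from 1 to 0) started at 0, the classical resolvent formula gives \(\widehat{\eta }=\big ((\alpha +b)/(\alpha +a+b),\, a/(\alpha +a+b)\big )\), which the series reproduces; `visits[x]` then matches the right-hand side of (56), namely `eta_hat[x] * (1 + total_rate[x] / alpha)`.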

Proof of Proposition 4.8. Observe that \(\widetilde{H}_{\pi ,\varphi }= L_{\pi } \bar{q}^{\pi } \widetilde{R}^{\varphi }\), and so

$$\begin{aligned} \eta _{u^{\pi ,\varphi }}^{i}(dx,da)&= \widetilde{R}^{\varphi }(dx,da|x_{0}) + \sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi } \bar{q}^{\pi } \widetilde{R}^{\varphi }(dx,da|x_{0}), \end{aligned}$$

and with (55) we get (19). The measure \(\widehat{\eta }_{u^{\pi ,\varphi }}^{g}\) is finite by definition and so, by using Proposition B.3 and Assumption (A1), we have that for any \(\Gamma \in \mathcal {B}(\mathbf {X})\)

$$\begin{aligned} \sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi }(\Gamma |x_{0}) <\infty \end{aligned}$$
(57)
$$\begin{aligned} \sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi }\widetilde{H}_{\pi ,\varphi }(\Gamma \times \{\Delta \}|x_{0}) <\infty \end{aligned}$$
(58)

Now, from Eq. (55) and the definition of \(H_{\pi ,\varphi }\) (see Eq. 52), we have that for any \(\Gamma \in \mathcal {B}(\mathbf {X})\)

$$\begin{aligned} \frac{1}{\alpha } \widehat{\eta }_{u^{\pi ,\varphi }}^{g} \bar{q}^{\pi } \widetilde{R}^{\varphi } (\Gamma \times \{\Delta \})&= \sum _{n\in \mathbb {N}^{*}} R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi } \bar{q}^{\pi } \widetilde{R}^{\varphi } (\Gamma \times \{\Delta \}|x_{0}) \nonumber \\&= \sum _{n\in \mathbb {N}} R^{\varphi } H^{n}_{\pi ,\varphi } \widetilde{H}_{\pi ,\varphi } (\Gamma \times \{\Delta \}|x_{0}). \end{aligned}$$
(59)

Moreover, observe that \(\displaystyle \frac{\bar{q}^{\pi }(\mathbf {X}|\bar{x}(y))}{\alpha +\bar{q}^{\pi } (\mathbf {X}|\bar{x}(y))} = 1 - \frac{\alpha }{\alpha +\bar{q}^{\pi } (\mathbf {X}|\bar{x}(y))}\) and so, for \(n\ge 2\)

$$\begin{aligned}&\int _{\mathbf {X}} I_{\Gamma }(x) \bar{q}^{\pi }(\mathbf {X}|x) R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi } (dx|x_{0})\nonumber \\&\quad = \int _{\mathbf {Y}} \frac{I_{\Gamma }(\bar{x}(y)) \bar{q}^{\pi }(\mathbf {X}|\bar{x}(y))}{\alpha +\bar{q}^{\pi } (\mathbf {X}|\bar{x}(y))} R^{\varphi } H^{n-1}_{\pi ,\varphi } (dy|x_{0}) \nonumber \\&\quad = R^{\varphi } H^{n-2}_{\pi ,\varphi } \widetilde{H}_{\pi ,\varphi } (\Gamma \times \{\Delta \}|x_{0}) - \alpha R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi } (\Gamma |x_{0}) \end{aligned}$$
(60)

and

$$\begin{aligned} \int _{\mathbf {X}} I_{\Gamma }(x) \bar{q}^{\pi }(\mathbf {X}|x) R^{\varphi } L_{\pi } (dx|x_{0})= & {} \int _{\mathbf {Y}} \frac{I_{\Gamma }(\bar{x}(y)) \bar{q}^{\pi }(\mathbf {X}|\bar{x}(y))}{\alpha +\bar{q}^{\pi } (\mathbf {X}|\bar{x}(y))} R^{\varphi }(dy|x_{0}) \nonumber \\= & {} \widetilde{R}^{\varphi } (\Gamma \times \{\Delta \}|x_{0}) - \alpha R^{\varphi } L_{\pi } (\Gamma |x_{0}). \end{aligned}$$
(61)

Consequently, by using the expression of \(\widehat{\eta }_{u^{\pi ,\varphi }}^{g}\) in (55) and Eqs. (60)–(61), we obtain

$$\begin{aligned} \frac{1}{\alpha } \int _{\mathbf {X}} I_{\Gamma }(x) \bar{q}^{\pi }(\mathbf {X}|x) \widehat{\eta }_{u^{\pi ,\varphi }}^{g} (dx|x_{0})= & {} \sum _{n=2}^{\infty } \int _{\mathbf {X}} I_{\Gamma }(x) \bar{q}^{\pi }(\mathbf {X}|x) R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi } (dx|x_{0}) \nonumber \\&+\int _{\mathbf {X}} I_{\Gamma }(x) \bar{q}^{\pi }(\mathbf {X}|x) R^{\varphi } L_{\pi } (dx|x_{0}) \nonumber \\= & {} \sum _{n\in \mathbb {N}} R^{\varphi } H^{n}_{\pi ,\varphi } \widetilde{H}_{\pi ,\varphi } (\Gamma {\times } \{\Delta \}|x_{0}) + \widetilde{R}^{\varphi } (\Gamma {\times } \{\Delta \}|x_{0}) \nonumber \\&- \alpha \sum _{n\in \mathbb {N}} R^{\varphi } H^{n}_{\pi ,\varphi } L_{\pi } (\Gamma |x_{0}). \end{aligned}$$
(62)

Note that the above calculations are possible since the quantities \(\sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi } L_{\pi }(\Gamma |x_{0})\) and \(\sum _{n=1}^{\infty } R^{\varphi } H^{n-1}_{\pi ,\varphi }\widetilde{H}_{\pi ,\varphi }(\Gamma \times \{\Delta \}|x_{0})\) are finite (see inequalities (57) and (58)). Recalling (55) and combining Eqs. (59) and (62) we obtain that

$$\begin{aligned}&\frac{1}{\alpha } \widehat{\eta }_{u^{\pi ,\varphi }}^{g} \bar{q}^{\pi } \widetilde{R}^{\varphi } (\Gamma \times \{\Delta \}) - \frac{1}{\alpha } \int _{\mathbf {X}} I_{\Gamma }(x) \bar{q}^{\pi }(\mathbf {X}|x) \widehat{\eta }_{u^{\pi ,\varphi }}^{g} (dx|x_{0})\\&\quad = - \widetilde{R}^{\varphi } (\Gamma \times \{\Delta \}|x_{0}) + \widehat{\eta }_{u^{\pi ,\varphi }}^{g}(\Gamma |x_{0}), \end{aligned}$$

showing the result. \(\square \)
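The final identity can be checked numerically in a simplified setting. The sketch below assumes, beyond what the paper states, a finite chain without impulses, for which \(\widetilde{R}^{\varphi }(\Gamma \times \{\Delta \}|x)=\delta _{x}(\Gamma )\) and the identity reads \(\frac{1}{\alpha }\big [\sum _{z}\widehat{\eta }(z)q(x|z)-q(\mathbf {X}|x)\widehat{\eta }(x)\big ]=\widehat{\eta }(x)-\delta _{x_{0}}(x)\). To keep the check independent of the series (55), \(\widehat{\eta }\) is computed by uniformization, a standard technique that is not used in the paper; the intensities in `rates` are hypothetical.

```python
def discounted_occupation(rates, alpha, x0, n_terms=5000):
    """Discounted occupation measure of a finite chain via uniformization.

    rates[z][x] : hypothetical jump intensity q(x | z) for x != z
    Uses eta = sum_k alpha * lam^k / (alpha + lam)^(k+1) * delta_{x0} P^k,
    with P = I + Q / lam and lam the largest total jump rate.
    """
    n = len(rates)
    total = [sum(rates[z][x] for x in range(n) if x != z) for z in range(n)]
    lam = max(total) or 1.0  # uniformization rate (1.0 if no state ever jumps)
    P = [[rates[z][x] / lam if x != z else 1.0 - total[z] / lam
          for x in range(n)] for z in range(n)]
    weight = alpha / (alpha + lam)
    ratio = lam / (alpha + lam)
    row = [1.0 if x == x0 else 0.0 for x in range(n)]
    eta = [0.0] * n
    for _ in range(n_terms):
        for x in range(n):
            eta[x] += weight * row[x]
        row = [sum(row[z] * P[z][x] for z in range(n)) for x in range(n)]
        weight *= ratio
    return eta, total


def balance_residual(rates, alpha, x0, eta, total):
    """Largest gap between the two sides of the balance identity above."""
    n = len(rates)
    worst = 0.0
    for x in range(n):
        inflow = sum(eta[z] * rates[z][x] for z in range(n) if z != x)
        lhs = (inflow - total[x] * eta[x]) / alpha
        rhs = eta[x] - (1.0 if x == x0 else 0.0)
        worst = max(worst, abs(lhs - rhs))
    return worst
```

On a three-state example such as `rates = [[0, 1, 2], [0.5, 0, 0.5], [4, 0, 0]]` with `alpha = 0.5`, the residual returned by `balance_residual` is at the level of rounding error and `eta` sums to one.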

Cite this article

Dufour, F., Piunovskiy, A.B. Impulsive Control for Continuous-Time Markov Decision Processes: A Linear Programming Approach. Appl Math Optim 74, 129–161 (2016). https://doi.org/10.1007/s00245-015-9310-8
