Two Approaches to Stochastic Optimal Control Problems with a Final-Time Expectation Constraint


In this article, we study and compare two approaches to solving stochastic optimal control problems with an expectation constraint on the final state. The case of a probability constraint is included in this framework. The first approach is based on a dynamic programming principle, and the second one uses Lagrange relaxation. We focus on discrete-time problems, but both approaches can also be applied to discretized continuous-time problems.




References

  1. Alais, J.C., Carpentier, P., De Lara, M.: Multi-usage hydropower single dam management: chance-constrained optimization and stochastic viability. Energy Syst. 1–24 (2015)

  2. Andrieu, L., Cohen, G., Vázquez-Abad, F.J.: Gradient-based simulation optimization under probability constraints. Eur. J. Oper. Res. 212(2), 345–351 (2011)

  3. Bouchard, B., Elie, R., Imbert, C.: Optimal control under stochastic target constraints. SIAM J. Control Optim. 48(5), 3501–3531 (2010)

  4. Bouchard, B., Elie, R., Touzi, N.: Stochastic target problems with controlled loss. SIAM J. Control Optim. 48(5), 3123–3150 (2010)

  5. Camilli, F., Falcone, M.: An approximation scheme for the optimal control of diffusion processes. RAIRO Modél. Math. Anal. Numér. 29(1), 97–122 (1995)

  6. Granato, G.: Optimisation de lois de gestion énergétique des véhicules hybrides. PhD thesis, École Polytechnique (2012)

  7. Henrion, R.: On the connectedness of probabilistic constraint sets. J. Optim. Theory Appl. 112(3), 657–663 (2002)

  8. Henrion, R., Strugarek, C.: Convexity of chance constraints with independent random variables. Comput. Optim. Appl. 41(2), 263–276 (2008)

  9. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms, Part 2: Advanced Theory and Bundle Methods. Springer, Berlin (1993)

  10. Kushner, H.J.: Numerical methods for stochastic control problems in continuous time. SIAM J. Control Optim. 28(5), 999–1048 (1990)

  11. Lemaréchal, C.: Lagrangian relaxation. In: Jünger, M., Naddef, D. (eds.) Computational Combinatorial Optimization. Lecture Notes in Computer Science, vol. 2241, pp. 112–156. Springer, Berlin (2001)

  12. Øksendal, B.: Stochastic Differential Equations: An Introduction with Applications. Universitext, Springer, New York (2003)

  13. Pfeiffer, L.: Optimality conditions for mean-field type optimal control problems. SFB Report 2015-015 (2015)

  14. Rockafellar, R.T.: Convex Analysis. Princeton Mathematical Series. Princeton University Press, Princeton (1970)

  15. Shapiro, A., Dentcheva, D., Ruszczyński, A.P.: Lectures on Stochastic Programming: Modeling and Theory. MPS-SIAM Series on Optimization. SIAM, Philadelphia (2009)

  16. Touzi, N.: Optimal Stochastic Control, Stochastic Target Problems, and Backward SDE. Fields Institute Monographs, Springer, Berlin (2012)

Acknowledgements


The author thanks J. Frédéric Bonnans for his advice and the two anonymous referees for useful remarks. The research leading to these results has received funding from the Gaspard Monge Program for Optimization and operations research (PGMO). The author gratefully acknowledges the Austrian Science Fund (FWF) for financial support under SFB F32, “Mathematical Optimization and Applications in Biomedical Sciences”.

Author information



Corresponding author

Correspondence to Laurent Pfeiffer.

Appendix: Generalities on Lagrange Relaxation


First Definitions

This section is a short introduction to some notions of convex analysis [14] and Lagrange relaxation [9, Chapter XII]. The notation used here is independent of the rest of the article. Let \(V:\mathbb {R}\rightarrow \mathbb {R}\cup \{ + \infty \}\). We denote by \(V^*\) its Legendre–Fenchel transform, defined for all \(\lambda \in \mathbb {R}\) by

$$\begin{aligned} V^*(\lambda ) := \sup _{z \in \mathbb {R}} \big \{ \lambda z - V(z) \big \}. \end{aligned}$$

The function \(V^*\) is convex and lower semi-continuous. If V is nondecreasing, as is the case for all the functions considered in this article, then \(V^*(\lambda )= + \infty \) for all \(\lambda <0\). By definition, for all z and \(\lambda \),

$$\begin{aligned} V(z) + V^*(\lambda ) \ge z \lambda . \end{aligned}$$
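The transform and the Fenchel–Young inequality above can be checked numerically on a grid. The following sketch uses a hypothetical quadratic V, not taken from the article, for which the transform is known in closed form: for \(V(z)=z^2\), \(V^*(\lambda )=\lambda ^2/4\).

```python
import numpy as np

# Numerical Legendre-Fenchel transform on a grid (illustrative sketch).
def fenchel_transform(V, z_grid):
    """Approximate V*(lam) = sup_z { lam*z - V(z) } over a finite grid."""
    return lambda lam: np.max(lam * z_grid - V(z_grid))

z_grid = np.linspace(-10.0, 10.0, 20001)   # step 1e-3, contains the maximizer
V = lambda z: z**2                          # hypothetical example
V_star = fenchel_transform(V, z_grid)

assert abs(V_star(2.0) - 1.0) < 1e-9       # V*(2) = 2^2/4 = 1, attained at z = 1

# Fenchel-Young inequality: V(z) + V*(lam) >= z*lam for all z, lam.
z, lam = 1.3, 2.0
assert V(z) + V_star(lam) >= z * lam
```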

The subdifferential \(\partial V\) is the set-valued map, possibly empty-valued, defined for all \(z \in \mathbb {R}\) by

$$\begin{aligned} \partial V(z) := \big \{ \lambda \in \mathbb {R}\,:\, V(z')-V(z) \ge \lambda (z'-z),\, \forall z' \in \mathbb {R}\big \}. \end{aligned}$$

It is easy to check that for all z and \(\lambda \),

$$\begin{aligned} V(z) + V^*(\lambda ) = z \lambda \Longleftrightarrow \lambda \in \partial V(z) \Longleftrightarrow z \in \partial V^*(\lambda ) \Longrightarrow V(z)=V^{**}(z), \end{aligned}$$

where \(V^{**}\) is the Legendre–Fenchel transform of \(V^*\), also called the biconjugate of V. Finally, we denote by \({{\mathrm{conv}}}(V)\) the convex envelope of V, defined as the greatest convex function smaller than or equal to V, and by \(\overline{{{\mathrm{conv}}}}(V)\) the l.s.c. convex envelope of V, defined as the greatest lower semi-continuous convex function smaller than or equal to V. It is easy to check that \({{\mathrm{conv}}}(V)^* = \overline{{{\mathrm{conv}}}}(V)^*= V^*\). By the Fenchel–Moreau–Rockafellar theorem, if \(\overline{{{\mathrm{conv}}}}(V)\) is proper (that is to say, if \(\overline{{{\mathrm{conv}}}}(V) (z) > - \infty \) for all z and \(\overline{{{\mathrm{conv}}}}(V)(z) < + \infty \) for at least one z), then

$$\begin{aligned} \overline{{{\mathrm{conv}}}}(V)= V^{**}. \end{aligned}$$
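Numerically, the biconjugate can be approximated by applying the discrete Legendre transform twice. The nonconvex W below is a hypothetical example, chosen so that its l.s.c. convex envelope differs from it: the envelope flattens the dip between the two minima at \(z=\pm 1\).

```python
import numpy as np

# Approximating W** by two successive discrete Legendre transforms.
z_grid = np.linspace(-3.0, 3.0, 6001)
lam_grid = np.linspace(-50.0, 50.0, 10001)

def conjugate(values, x_grid, y_grid):
    """Discrete transform: for each y in y_grid, sup_x { x*y - values[x] }."""
    return np.array([np.max(y * x_grid - values) for y in y_grid])

# Hypothetical nonconvex function with minima at z = -1 and z = 1.
W = np.minimum((z_grid - 1.0)**2, (z_grid + 1.0)**2)
W_star = conjugate(W, z_grid, lam_grid)
W_bistar = conjugate(W_star, lam_grid, z_grid)

i0 = np.argmin(np.abs(z_grid))        # index of z = 0
assert abs(W[i0] - 1.0) < 1e-9        # W(0) = 1
assert abs(W_bistar[i0]) < 1e-6       # W**(0) = 0: the envelope is flat on [-1, 1]
```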


Let us consider now an abstract family of optimization problems:

$$\begin{aligned} V(z) := \inf _{x \in S} \ f(x) \quad \text {s.t.}\, g(x) \ge z, \end{aligned}$$

where \(S \subset \mathbb {R}^k\) is a given non-empty set, with \(k \in \mathbb {N}^*\) and where the functions \(f:S \rightarrow \mathbb {R}\) and \(g:S \rightarrow \mathbb {R}\) are given. The function V is called the value function. Equivalently,

$$\begin{aligned} V(z) = \inf _{x \in S} \, \sup _{\lambda \ge 0} \ \big \{ f(x)- \lambda (g(x)-z) \big \}. \end{aligned}$$

Let us compute the Legendre–Fenchel transform of V. Let \(\lambda \ge 0\), then

$$\begin{aligned} V^*(\lambda ) = \sup _{z \in \mathbb {R}} \, \sup _{x \in S,\, g(x) \ge z} \big \{ \lambda z - f(x) \big \} = \sup _{x \in S} \big \{ \lambda g(x) - f(x) \big \} = - \inf _{x \in S} \big \{ f(x)- \lambda g(x) \big \}. \end{aligned}$$

Since V is nondecreasing, for all \(\lambda < 0\), \(V^*(\lambda ) = + \infty \). We call penalized problem the minimization problem in (8.8). By (8.8), \(V^{**}\) is obtained by inverting the operators inf and sup in (8.7):

$$\begin{aligned} V^{**}(z) = \sup _{\lambda \ge 0} \, \inf _{x \in S} \ \big \{ f(x)- \lambda (g(x)-z) \big \}. \end{aligned}$$
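When S is finite, this sup–inf can be evaluated directly by solving the penalized problem for each multiplier on a grid. The random costs below are a hypothetical toy instance, not data from the article; the sketch only verifies weak duality, \(V^{**}(z) \le V(z)\).

```python
import numpy as np

# Evaluating V**(z) = sup_{lam>=0} inf_{x in S} { f(x) - lam*(g(x) - z) }
# for a hypothetical finite instance S = {0, ..., 49}.
rng = np.random.default_rng(0)
f = rng.uniform(0.0, 1.0, size=50)   # f(x) for each x in S
g = rng.uniform(0.0, 1.0, size=50)   # g(x) for each x in S

def V(z):
    """Primal value: min f(x) s.t. g(x) >= z."""
    feasible = g >= z
    return f[feasible].min() if feasible.any() else np.inf

def V_biconjugate(z, lam_grid):
    """Dual value: the penalized problem is solved for each multiplier lam."""
    return max((f - lam * (g - z)).min() for lam in lam_grid)

lam_grid = np.linspace(0.0, 50.0, 501)
z = 0.5
# Weak duality: the dual value never exceeds the primal one.
assert V_biconjugate(z, lam_grid) <= V(z) + 1e-12
```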

Lemma 24

Let \(\lambda \ge 0\), let \(x \in S\), let \(z= g(x)\). Then,

$$\begin{aligned} \big \{ x \,\text {is a solution to (8.6) and}\, \lambda \in \partial V(z) \big \} \Longleftrightarrow \big \{ x \,\text {is a solution to (8.8)} \big \}. \end{aligned}$$

In this case, \(\lambda \in \partial V(z)\), \(z \in \partial V^*(\lambda )\), and \(V(z)= V^{**}(z)\).


Proof Let us assume that x is a solution to (8.6) and that \(\lambda \in \partial V(z)\). Then, by (8.4),

$$\begin{aligned} V^*(\lambda )= -\big (V(z)-\lambda z\big )= -\big (f(x)-\lambda g(x)\big ), \end{aligned}$$

therefore, x is a solution to (8.8). Conversely, let us assume that x is a solution to (8.8). Let \(x'\) be such that \(g(x') \ge z\). Then,

$$\begin{aligned} f(x') \ge f(x) + \lambda \big (g(x)-g(x')\big ) \ge f(x), \end{aligned}$$

which proves that x is a solution to (8.6). The last statement of the lemma is a direct consequence of (8.4). \(\square \)

The following lemma and Fig. 3 explain how to derive a lower and an upper estimate of \(V^{**}\) by solving two penalized problems. We also give a bound on the maximal error of these estimates.

Lemma 25

Let \(0 \le \lambda _1 < \lambda _2\), let \(x_1\) and \(x_2\) in S, let \(z_1= g(x_1)\) and \(z_2= g(x_2)\). We assume that \(x_1\) and \(x_2\) are solutions to (8.8) for \(\lambda =\lambda _1\) and \(\lambda = \lambda _2\), respectively. Then, \(z_1 \le z_2\). Moreover, as illustrated in Fig. 3 (left),

$$\begin{aligned} V^{**}(z)&\ge \check{V}(z) := \max _{i=1,2} \big \{ \lambda _i (z-z_i) + V(z_i) \big \}, \quad \forall z \in \mathbb {R}, \end{aligned}$$
$$\begin{aligned} V^{**}(z)&\le \hat{V}(z) := \frac{(z-z_1)V(z_2)+(z_2-z)V(z_1)}{z_2-z_1},\quad \forall z \in [z_1,z_2]. \end{aligned}$$

Finally, an upper bound of \(\hat{V}-\check{V}\) on \([z_1,z_2]\) is:

$$\begin{aligned} \left( \lambda _2 - \frac{V(z_2)-V(z_1)}{z_2-z_1} \right) \left( \frac{V(z_2)-V(z_1)}{z_2-z_1}- \lambda _1 \right) \frac{z_2-z_1}{\lambda _2-\lambda _1} \le \frac{1}{4}(\lambda _2-\lambda _1)(z_2-z_1). \end{aligned}$$


Proof Observe first that

$$\begin{aligned} f(x_1) - \lambda _1 g(x_1) \le f(x_2) - \lambda _1 g(x_2), \quad f(x_2) - \lambda _2 g(x_2) \le f(x_1) - \lambda _2 g(x_1). \end{aligned}$$

Summing these two inequalities, we obtain that

$$\begin{aligned} (\lambda _2-\lambda _1) g(x_1) \le (\lambda _2-\lambda _1) g(x_2), \end{aligned}$$

which proves that \(z_1 \le z_2\). By Lemma 24, \(x_1\) and \(x_2\) are solutions to (8.6) with \(z=z_1\) and \(z=z_2\), respectively; moreover, \(\lambda _1 \in \partial V(z_1)\), \(\lambda _2 \in \partial V(z_2)\), \(V(z_1)= V^{**}(z_1)\), and \(V(z_2)=V^{**}(z_2)\). The upper estimate on \(V^{**}\) follows, since \(V^{**}\) is convex. By definition and by Lemma 24,

$$\begin{aligned} V^{**}(z) \ge \lambda _1 z - V^*(\lambda _1)= \lambda _1 (z-z_1)+ V(z_1), \end{aligned}$$

and the same holds with \(\lambda _2\) and \(z_2\); this proves the lower estimate.

We give a geometrical proof of (8.13), illustrated in Fig. 3 (right). In this figure, the coordinates of the points \(I_1\) and \(I_2\) are \((z_1,V(z_1))\) and \((z_2,V(z_2))\), respectively. The bold lines \(\mathcal {D}_1\) and \(\mathcal {D}_2\) correspond to the lower estimate of \(V^{**}\); their slopes are \(\lambda _1\) and \(\lambda _2\), respectively. The upper estimate is the segment \([I_1,I_2]\). The greatest gap between the lower and the upper estimate is denoted by e.

The line \(\mathcal {D}_2'\) is the parallel to \(\mathcal {D}_2\) passing through \(I_1\). Finally, the points A and B are the intersections of \(\mathcal {D}_2'\) and \(\mathcal {D}_1\), respectively, with the vertical line through \(I_2\). Note that \(AB= (\lambda _2-\lambda _1)(z_2-z_1)\). Let us set \(\alpha = \frac{BI_2}{AB}\). We have \(\alpha \in [0,1]\) and \(e= \alpha (1-\alpha ) AB\). The value of \(\alpha \) is given by

$$\begin{aligned} \alpha = \frac{1}{\lambda _2-\lambda _1} \left( \frac{V(z_2)-V(z_1)}{z_2-z_1}- \lambda _1 \right) . \end{aligned}$$

The first inequality in (8.13) follows, and the second one follows from the inequality: \(\alpha (1-\alpha ) \le 1/4\) for all \(\alpha \in [0,1]\). \(\square \)

Fig. 3

Lower and upper estimates of \(V^{**}\)
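As a sketch of how Lemma 25 is used in practice, the two estimates and the gap bound can be computed from two penalized solutions. The numerical values below are hypothetical placeholders for \((\lambda _i, z_i = g(x_i), V(z_i))\), not data from the article.

```python
import numpy as np

# Piecewise-affine lower bound and chord upper bound on V** (Lemma 25).
lam1, lam2 = 0.5, 2.0    # two multipliers, lam1 < lam2 (hypothetical)
z1, V1 = 0.2, 1.0        # z1 = g(x1), V1 = V(z1) at the solution for lam1
z2, V2 = 0.8, 1.9        # z2 = g(x2), V2 = V(z2) at the solution for lam2

def lower(z):
    """Max of the two supporting lines of slopes lam1 and lam2."""
    return max(lam1 * (z - z1) + V1, lam2 * (z - z2) + V2)

def upper(z):
    """Chord between (z1, V(z1)) and (z2, V(z2))."""
    return ((z - z1) * V2 + (z2 - z) * V1) / (z2 - z1)

gap = max(upper(z) - lower(z) for z in np.linspace(z1, z2, 1001))
bound = 0.25 * (lam2 - lam1) * (z2 - z1)   # the bound of Lemma 25
assert gap <= bound
```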


We finish this section with two equivalent relaxed formulations of problem (8.6). We assume now that

$$\begin{aligned} V(-\infty ):= \lim _{z \rightarrow -\infty } V(z) > - \infty . \end{aligned}$$

This is equivalent to assuming that the unconstrained problem \(\inf _{x \in S} f(x)\) has a finite value. The value function V being nondecreasing, it is thus bounded from below, and so is its convex envelope. Moreover, since V is nondecreasing, its domain \(\{ z \,|\, V(z) < +\infty \}\) is an interval of the form \((-\infty ,\bar{z})\) or \((-\infty , \bar{z}]\); therefore, the convex envelope has the same domain.

Let us fix a random variable \(\zeta \), uniformly distributed on [0, 1]. We denote by \(\mathcal {M}(S,\zeta )\) the space of random variables with values in S and measurable with respect to \(\zeta \). We introduce a new family of optimization problems, called relaxed problems, with value \(V^{{{\mathrm{r}}}}(z)\):

$$\begin{aligned} V^{{{\mathrm{r}}}}(z) := \inf _{X \in \mathcal {M}(S,\zeta )} \mathbb {E} \big [ f(X) \big ] \quad \text {s.t. }\, \mathbb {E}\big [ g(X) \big ] \ge z. \end{aligned}$$

This family is linked to the family of problems (8.6): instead of taking a single decision x, we are allowed to let the decision depend on \(\zeta \), and the constraint only has to be satisfied in expectation.
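The benefit of randomizing can be seen on a minimal hypothetical two-action instance (not from the article): mixing a cheap decision with an expensive one satisfies the constraint in expectation at a strictly lower cost than any deterministic feasible decision.

```python
import numpy as np

# Hypothetical instance: action 0 is cheap with g = 0, action 1 is
# expensive with g = 1; the target level is z = 0.5.
f = np.array([0.0, 1.0])
g = np.array([0.0, 1.0])
z = 0.5

# Deterministic problem: min f(x) s.t. g(x) >= z forces the expensive action.
V_z = f[g >= z].min()                  # = 1.0

# Relaxed problem: put weight alpha on action 1 so that E[g(X)] = alpha >= z;
# the cheapest feasible mixture uses alpha = z.
V_relaxed = z * f[1] + (1 - z) * f[0]  # = 0.5 = conv(V)(z)

assert V_relaxed < V_z                 # randomization strictly helps here
```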

Lemma 26

The relaxed value function \(V^{{{\mathrm{r}}}}\) is the convex envelope of V:

$$\begin{aligned} V^{{{\mathrm{r}}}} = {{\mathrm{conv}}}(V). \end{aligned}$$


Proof Let z be such that \({{\mathrm{conv}}}(V)(z) < +\infty \). By Carathéodory’s theorem,

$$\begin{aligned} {{\mathrm{conv}}}(V)(z) = \inf \big \{ \alpha V(z_1) + (1-\alpha ) V(z_2) \,:\, \alpha \in [0,1],\ z= \alpha z_1 + (1-\alpha ) z_2 \big \}. \end{aligned}$$

Let \(\varepsilon >0\), let \(z_1 \in \mathbb {R}\), \(z_2 \in \mathbb {R}\), \(\alpha \in [0,1]\), \(x_1 \in S\), \(x_2 \in S\) be such that

$$\begin{aligned}&z= \alpha z_1 + (1-\alpha ) z_2, \\&\alpha V(z_1) + (1-\alpha ) V(z_2) \le {{\mathrm{conv}}}(V)(z) + \varepsilon /2, \\&f(x_1) \le V(z_1) + \varepsilon /2,\ g(x_1) \ge z_1, \\&f(x_2) \le V(z_2) + \varepsilon /2,\ g(x_2) \ge z_2. \end{aligned}$$

We set \(X= x_1\) if \(\zeta \in (0,\alpha )\) and \(X= x_2\) otherwise. The variable X is feasible for (8.17); therefore, \(V^{{{\mathrm{r}}}}(z) \le {{\mathrm{conv}}}(V)(z) + \varepsilon \). Letting \(\varepsilon \downarrow 0\), we obtain that \(V^{{{\mathrm{r}}}} \le {{\mathrm{conv}}}(V)\).

Let \(z \in \mathbb {R}\) be such that \(V^{{{\mathrm{r}}}}(z) < + \infty \), let \(\varepsilon >0\), let X be an \(\varepsilon \)-solution of (8.17). Let Z be the real random variable defined by \(Z= g(X)\). Then,

$$\begin{aligned} {{\mathrm{conv}}}(V)(z) \le \mathbb {E} \big [ V(Z) \big ] \le \mathbb {E} \big [ f(X) \big ] \le V^{{{\mathrm{r}}}}(z) + \varepsilon . \end{aligned}$$

Passing to the limit, we obtain that \({{\mathrm{conv}}}(V) \le V^{{{\mathrm{r}}}}\) and finally the equality. \(\square \)

Lemma 27

Let \(z \in \mathbb {R}\), let X be a relaxed optimal solution to problem (8.17). Let Z be a real random variable, measurable with respect to \(\zeta \) and such that

$$\begin{aligned} \mathbb {E}\big [ Z \big ] \ge z \quad \text {and} \quad Z \le g(X)\, \text { almost surely}. \end{aligned}$$

Then, X is a solution to the problem (8.6) with the level \(z=Z\) almost surely. Moreover, if \(\lambda \in \partial V^{{{\mathrm{r}}}}(z)\), then \(\lambda \in \partial V(Z)\) and X is a solution to the penalized problem (8.8) almost surely (that is to say, \(V^*(\lambda )= -(f(X)-\lambda g(X))\) almost surely).


Proof Let us prove the first part of the lemma. First, by Lemma 26,

$$\begin{aligned} \mathbb {E}\big [ f(X) \big ] = V^{{{\mathrm{r}}}}(z)= {{\mathrm{conv}}}(V)(z) \le {{\mathrm{conv}}}(V)\big (\mathbb {E}\big [ Z \big ]\big ) \le \mathbb {E}\big [ V(Z) \big ]. \end{aligned}$$

Moreover, \(V(Z) \le f(X)\) almost surely, since \(Z \le g(X)\) almost surely. Combined with (8.20), it follows that \(f(X)= V(Z)\) almost surely. Note that it follows also that \(V^{{{\mathrm{r}}}}(z)= \mathbb {E} \big [ V(Z) \big ]\).

Let us prove the second part of the lemma. Let \(\lambda \in \partial V^{{{\mathrm{r}}}}(z)\), using the fact that \(V^{{{\mathrm{r}}},*}=V^*\), we obtain that

$$\begin{aligned} \mathbb {E}\big [ \lambda Z- V^*(\lambda ) \big ]\ge \lambda z-V^{{{\mathrm{r}}},*}(\lambda ) = V^{{{\mathrm{r}}}}(z)= \mathbb {E}\big [V(Z)\big ]. \end{aligned}$$

Since \(\lambda Z - V^*(\lambda ) \le V(Z)\), we obtain that \(\lambda Z- V^*(\lambda ) = V(Z)\) almost surely and therefore that \(\lambda \in \partial V(Z)\) almost surely. Finally, X is a solution to the penalized problem almost surely by Lemma 24. \(\square \)

Let us denote now by \(\mathcal {P}(S)\) the set of probability measures on S. By an argument similar to the proof of Lemma 26, we can show that

$$\begin{aligned} V^{{{\mathrm{r}}}}(z)= \inf _{\mu \in \mathcal {P}(S)} \, \int _S f(x) \,\text {d}\mu (x) \quad \text {s.t. }\, \int _{S} g(x) \,\text {d}\mu (x) \ge z. \end{aligned}$$

Instead of taking one decision, we now optimize over probability measures on S, that is, over randomized decisions. The cost function and the constraint are linear with respect to the new optimization variable \(\mu \).

When \(V^{{{\mathrm{r}}}}\) is lower semi-continuous and \(V^{{{\mathrm{r}}}}(z) > -\infty \) for all z, it is equal to \(V^{**}\). It can then be computed with a dual approach, motivated by Lemma 24, and error estimates can be derived from Lemma 25. Note that the existence of optimal solutions to the penalized problems is required.


About this article


Cite this article

Pfeiffer, L. Two Approaches to Stochastic Optimal Control Problems with a Final-Time Expectation Constraint. Appl Math Optim 77, 377–404 (2018).



Keywords

  • Stochastic optimal control
  • Expectation and probability constraints
  • Dynamic programming
  • Lagrange relaxation

Mathematics Subject Classification

  • 90C15
  • 93E20