## Abstract

In this article, we study and compare two approaches to solving stochastic optimal control problems with an expectation constraint on the final state; the case of a probability constraint is included in this framework. The first approach is based on a dynamic programming principle, and the second one uses Lagrange relaxation. We focus on discrete-time problems, but both approaches can be applied to discretized continuous-time problems.


## References

- 1.
Alais, J.C., Carpentier, P., De Lara, M.: Multi-usage hydropower single dam management: chance-constrained optimization and stochastic viability. Energy Syst. 1–24 (2015)

- 2.
Andrieu, L., Cohen, G., Vázquez-Abad, F.J.: Gradient-based simulation optimization under probability constraints. Eur. J. Oper. Res. **212**(2), 345–351 (2011)

- 3.
Bouchard, B., Elie, R., Imbert, C.: Optimal control under stochastic target constraints. SIAM J. Control Optim. **48**(5), 3501–3531 (2010)

- 4.
Bouchard, B., Elie, R., Touzi, N.: Stochastic target problems with controlled loss. SIAM J. Control Optim. **48**(5), 3123–3150 (2010)

- 5.
Camilli, F., Falcone, M.: An approximation scheme for the optimal control of diffusion processes. RAIRO Modél. Math. Anal. Numér. **29**(1), 97–122 (1995)

- 6.
Granato, G.: Optimisation de lois de gestion énergétique des véhicules hybrides. PhD thesis, École Polytechnique (2012)

- 7.
Henrion, R.: On the connectedness of probabilistic constraint sets. J. Optim. Theory Appl. **112**(3), 657–663 (2002)

- 8.
Henrion, R., Strugarek, C.: Convexity of chance constraints with independent random variables. Comput. Optim. Appl. **41**(2), 263–276 (2008)

- 9.
Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms, Part 2: Advanced Theory and Bundle Methods. Springer, Berlin (1993)

- 10.
Kushner, H.J.: Numerical methods for stochastic control problems in continuous time. SIAM J. Control Optim. **28**(5), 999–1048 (1990)

- 11.
Lemaréchal, C.: Lagrangian relaxation. In: Jünger, M., Naddef, D. (eds.) Computational Combinatorial Optimization. Lecture Notes in Computer Science, vol. 2241, pp. 112–156. Springer, Berlin (2001)

- 12.
Øksendal, B.: Stochastic Differential Equations: An Introduction with Applications. Hochschultext/Universitext, Springer, New York (2003)

- 13.
Pfeiffer, L.: Optimality conditions for mean-field type optimal control problems. SFB Report 2015–015 (2015)

- 14.
Rockafellar, R.T.: Convex Analysis. Princeton Mathematical Series. Princeton University Press, Princeton (1970)

- 15.
Shapiro, A., Dentcheva, D., Ruszczyński, A.P.: Lectures on Stochastic Programming: Modeling and Theory. MPS-SIAM Series on Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2009)

- 16.
Touzi, N.: Optimal Stochastic Control, Stochastic Target Problems, and Backward SDE. Fields Institute Monographs, Springer, Berlin (2012)

## Acknowledgments

The author thanks J. Frédéric Bonnans for his advice and the two anonymous referees for useful remarks. The research leading to these results has received funding from the Gaspard Monge Program for Optimization and operations research (PGMO). The author gratefully acknowledges the Austrian Science Fund (FWF) for financial support under SFB F32, “Mathematical Optimization and Applications in Biomedical Sciences”.


## Appendix: Generalities on Lagrange Relaxation


### First Definitions

This section is a short introduction to some notions of convex analysis [14] and Lagrange relaxation [9, Chapter XII]. The notation used here is independent of the rest of the article. Let \(V:\mathbb {R}\rightarrow \mathbb {R}\cup \{ + \infty \}\). We denote by \(V^*\) its *Legendre–Fenchel transform*, defined for all \(\lambda \in \mathbb {R}\) by

$$V^*(\lambda ) = \sup _{z \in \mathbb {R}} \big \{ \lambda z - V(z) \big \}.$$

The function \(V^*\) is convex and lower semi-continuous, and if \(z \mapsto V(z)\) is nondecreasing, which is the case for the functions considered in this article, then \(V^*(\lambda )= + \infty \) for all \(\lambda <0\). By definition, for all *z* and \(\lambda \),

$$\lambda z \le V(z) + V^*(\lambda ).$$

The *subdifferential* \(\partial V\) is the multimapping, possibly empty-valued, defined for all \(z \in \mathbb {R}\) by

$$\partial V(z) = \big \{ \lambda \in \mathbb {R}\,:\, \forall z' \in \mathbb {R},\ V(z') \ge V(z) + \lambda (z'-z) \big \}.$$

It is easy to check that for all *z* and \(\lambda \),

$$\lambda \in \partial V(z) \;\Longleftrightarrow \; \lambda z - V(z) = V^*(\lambda ) \;\Longrightarrow \; \big ( z \in \partial V^*(\lambda ) \ \text {and} \ V(z) = V^{**}(z) \big ), \qquad (8.4)$$

where \(V^{**}\) is the Legendre–Fenchel transform of \(V^*\), also called the *biconjugate* of *V*. Finally, we denote by \({{\mathrm{conv}}}(V)\) the *convex envelope* of *V*, defined as the greatest convex function smaller than or equal to *V*, and by \(\overline{{{\mathrm{conv}}}}(V)\) the *l.s.c. convex envelope* of *V*, defined as the greatest lower semi-continuous convex function smaller than or equal to *V*. It is easy to check that \({{\mathrm{conv}}}(V)^* = \overline{{{\mathrm{conv}}}}(V)^*= V^*\). By the Fenchel–Moreau–Rockafellar theorem, if \(\overline{{{\mathrm{conv}}}}(V)\) is proper (that is to say, if \(\overline{{{\mathrm{conv}}}}(V) (z) > - \infty \) for all *z* and \(\overline{{{\mathrm{conv}}}}(V)(z) < + \infty \) for at least one *z*), then

$$\overline{{{\mathrm{conv}}}}(V) = V^{**}.$$
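As a small numerical illustration (our own, not from the article), the conjugate and biconjugate of a nonconvex step function — the typical shape of the value function under a probability constraint — can be approximated on a grid. All function names and grid bounds below are our choices.

```python
import numpy as np

def conjugate(grid, values, dual_grid):
    # Discrete Legendre-Fenchel transform: for each dual point,
    # take the sup over the grid of <dual, point> - value.
    return np.max(dual_grid[:, None] * grid[None, :] - values[None, :], axis=1)

# V is nondecreasing and nonconvex: V(z) = 0 for z <= 0, V(z) = 1 for z > 0,
# restricted to the grid [-2, 2].
z = np.linspace(-2.0, 2.0, 401)
v = np.where(z <= 0.0, 0.0, 1.0)

lam = np.linspace(0.0, 5.0, 501)       # only lambda >= 0 matters here
v_star = conjugate(z, v, lam)          # V*(lambda) = max(0, 2*lambda - 1) on this grid
v_bistar = conjugate(lam, v_star, z)   # V**: the l.s.c. convex envelope, 0 for z <= 0, z/2 on [0, 2]

i = np.argmin(np.abs(z - 1.0))
print(v[i], v_bistar[i])  # V(1) = 1 while V**(1) = 0.5: V is not convex at z = 1
```

The gap between `v[i]` and `v_bistar[i]` is exactly the convexification gap that the relaxation of the appendix is designed to handle.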

### Duality

Let us consider now an abstract family of optimization problems, parameterized by \(z \in \mathbb {R}\):

$$V(z) = \inf _{x \in S,\ g(x) \ge z} f(x), \qquad (8.6)$$

where \(S \subset \mathbb {R}^k\) is a given non-empty set, with \(k \in \mathbb {N}^*\), and where the functions \(f:S \rightarrow \mathbb {R}\) and \(g:S \rightarrow \mathbb {R}\) are given. The function *V* is called the *value function*. Equivalently,

$$V(z) = \inf _{x \in S}\, \sup _{\lambda \ge 0}\, \big \{ f(x) + \lambda \big ( z - g(x) \big ) \big \}. \qquad (8.7)$$

Let us compute the Legendre–Fenchel transform of *V*. Let \(\lambda \ge 0\); then

$$V^*(\lambda ) = \sup _{z \in \mathbb {R}}\ \sup _{x \in S,\ g(x) \ge z} \big \{ \lambda z - f(x) \big \} = \sup _{x \in S} \big \{ \lambda g(x) - f(x) \big \} = - \inf _{x \in S} \big \{ f(x) - \lambda g(x) \big \}. \qquad (8.8)$$

Since *V* is nondecreasing, for all \(\lambda < 0\), \(V^*(\lambda ) = + \infty \). We call *penalized problem* the minimization problem in (8.8). By (8.8), \(V^{**}\) is obtained by inverting the operators inf and sup in (8.7):

$$V^{**}(z) = \sup _{\lambda \ge 0}\ \inf _{x \in S}\ \big \{ f(x) + \lambda \big ( z - g(x) \big ) \big \}.$$
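The inf/sup inversion above can be checked by brute force on a finite decision set (the instance and names below are our own illustration, not data from the article): the sup-inf value is the biconjugate, which lies below the constrained value whenever V is nonconvex at z.

```python
import numpy as np

# Toy instance (ours): S has four decisions, with costs f and constraint values g.
f = np.array([0.0, 1.0, 3.0, 6.0])
g = np.array([0.0, 2.0, 3.0, 4.0])

def V(z):
    # Constrained problem: minimize f(x) subject to g(x) >= z.
    feasible = g >= z
    return f[feasible].min() if feasible.any() else np.inf

def V_bistar(z, lam_grid=np.linspace(0.0, 50.0, 5001)):
    # Biconjugate as sup over lambda >= 0 of the penalized values
    # inf_x { f(x) + lambda * (z - g(x)) }.
    return max((f + lam * (z - g)).min() for lam in lam_grid)

print(V(2.5), V_bistar(2.5))  # weak duality: V**(z) <= V(z)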

### Lemma 24

Let \(\lambda \ge 0\), let \(x \in S\), and let \(z= g(x)\). Then, *x* is a solution to problem (8.6) with \(\lambda \in \partial V(z)\) if and only if *x* is a solution to the penalized problem (8.8).

In this case, \(z \in \partial V^*(\lambda )\) and \(V(z)= V^{**}(z)\).

### Proof

Let us assume that *x* is a solution to (8.6) and that \(\lambda \in \partial V(z)\). Then, by (8.4),

$$f(x) - \lambda g(x) = V(z) - \lambda z = -V^*(\lambda ) = \inf _{x' \in S} \big \{ f(x') - \lambda g(x') \big \};$$

therefore, *x* is a solution to (8.8). Let us assume now that *x* is a solution to (8.8). Let \(x'\) be such that \(g(x') \ge z\). Then,

$$f(x') \ge f(x') - \lambda \big ( g(x') - z \big ) \ge f(x) - \lambda \big ( g(x) - z \big ) = f(x),$$

which proves that *x* is a solution to (8.6). The last statement of the lemma is a direct consequence of (8.4). \(\square \)

The following lemma and Fig. 3 explain how to derive a lower and an upper estimate of \(V^{**}\) by solving the penalized problem. We give a bound for the maximal error generated by these estimates.

### Lemma 25

Let \(0 \le \lambda _1 < \lambda _2\), let \(x_1\) and \(x_2\) be in *S*, and let \(z_1= g(x_1)\) and \(z_2= g(x_2)\). We assume that \(x_1\) and \(x_2\) are solutions to (8.8) for \(\lambda =\lambda _1\) and \(\lambda = \lambda _2\), respectively. Then, \(z_1 \le z_2\). Moreover, as illustrated in Fig. 3 (on the left), for all \(z \in [z_1,z_2]\),

$$\check{V}(z) := \max \big \{ V(z_1) + \lambda _1 (z - z_1),\ V(z_2) + \lambda _2 (z - z_2) \big \} \le V^{**}(z) \le V(z_1) + \frac{V(z_2)-V(z_1)}{z_2-z_1}\,(z-z_1) =: \hat{V}(z).$$

Finally, an upper bound of \(\hat{V}-\check{V}\) on \([z_1,z_2]\) is:

$$\sup _{z \in [z_1,z_2]} \big ( \hat{V}(z) - \check{V}(z) \big ) \le \alpha (1-\alpha ) (\lambda _2 - \lambda _1)(z_2 - z_1) \le \tfrac{1}{4} (\lambda _2 - \lambda _1)(z_2 - z_1), \qquad (8.13)$$

where \(\alpha := \frac{V(z_2) - V(z_1) - \lambda _1 (z_2 - z_1)}{(\lambda _2 - \lambda _1)(z_2 - z_1)} \in [0,1]\).

### Proof

Observe first that, by optimality of \(x_1\) and \(x_2\) for their respective penalized problems,

$$f(x_1) - \lambda _1 g(x_1) \le f(x_2) - \lambda _1 g(x_2) \quad \text {and} \quad f(x_2) - \lambda _2 g(x_2) \le f(x_1) - \lambda _2 g(x_1).$$

Summing these two inequalities, we obtain that

$$(\lambda _2 - \lambda _1)(z_2 - z_1) \ge 0,$$

which proves that \(z_1 \le z_2\). By Lemma 24, \(x_1\) and \(x_2\) are solutions to (8.6) with \(z=z_1\) and \(z=z_2\), respectively; moreover, \(\lambda _1 \in \partial V(z_1)\), \(\lambda _2 \in \partial V(z_2)\), \(V(z_1)= V^{**}(z_1)\), and \(V(z_2)=V^{**}(z_2)\). The upper estimate on \(V^{**}\) follows, since \(V^{**}\) is convex. By definition and by Lemma 24,

$$V^{**}(z) \ge \lambda _1 z - V^*(\lambda _1) = V(z_1) + \lambda _1 (z - z_1),$$

and the same holds with \(\lambda _2\) and \(z_2\); this proves the lower estimate.

We give a geometrical proof of (8.13), illustrated by Fig. 3 (on the right). On this figure, the coordinates of points \(I_1\) and \(I_2\) are resp. \((z_1,V(z_1))\) and \((z_2,V(z_2))\). The bold lines \(\mathcal {D}_1\) and \(\mathcal {D}_2\) correspond to the lower estimate of \(V^{**}\), their slopes are resp. \(\lambda _1\) and \(\lambda _2\). The upper estimate is the segment \([I_1,I_2]\). The greatest gap between the lower and the upper estimate is given by *e*.

The line \(\mathcal {D}_2'\) is the parallel to \(\mathcal {D}_2\) passing through \(I_1\). Finally, the points *A* and *B* are the intersections of \(\mathcal {D}_2'\) and \(\mathcal {D}_1\), respectively, with the vertical axis passing through \(I_2\). Note that \(AB= (\lambda _2-\lambda _1)(z_2-z_1)\). Let us set \(\alpha = \frac{BI_2}{AB}\). We have \(\alpha \in [0,1]\) and \(e= \alpha (1-\alpha ) AB\). The value of \(\alpha \) is given by

$$\alpha = \frac{V(z_2) - V(z_1) - \lambda _1 (z_2 - z_1)}{(\lambda _2 - \lambda _1)(z_2 - z_1)}.$$
The first inequality in (8.13) follows, and the second one follows from the inequality: \(\alpha (1-\alpha ) \le 1/4\) for all \(\alpha \in [0,1]\). \(\square \)
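The estimates of Lemma 25 can be verified numerically on a small instance (the data below are our own illustration): two penalized solves produce the lower and upper envelopes of \(V^{**}\), and the maximal gap stays below the bound (8.13).

```python
import numpy as np

# Toy instance (ours): a finite decision set with costs f and constraint values g.
f = np.array([0.0, 1.0, 3.0, 6.0])
g = np.array([0.0, 2.0, 3.0, 4.0])

def solve_penalized(lam):
    # Penalized problem: minimize f(x) - lam * g(x) over x in S.
    i = int(np.argmin(f - lam * g))
    return g[i], f[i]

lam1, lam2 = 0.6, 2.5
z1, f1 = solve_penalized(lam1)   # selects the decision with (f, g) = (1, 2)
z2, f2 = solve_penalized(lam2)   # selects the decision with (f, g) = (3, 3); note z1 <= z2

# Lower and upper estimates of V** on [z1, z2], as in Lemma 25:
zz = np.linspace(z1, z2, 1001)
lower = np.maximum(f1 + lam1 * (zz - z1), f2 + lam2 * (zz - z2))
upper = f1 + (f2 - f1) * (zz - z1) / (z2 - z1)

gap = float(np.max(upper - lower))
bound = 0.25 * (lam2 - lam1) * (z2 - z1)
print(gap, bound)  # the maximal gap respects the bound from (8.13)
```

In a bisection scheme on the multiplier, this bound tells how much refining the interval \([\lambda _1, \lambda _2]\) can still improve the approximation of \(V^{**}\).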

### Relaxation

We finish this section with two equivalent relaxed formulations of problems (8.6). We assume now that

$$\inf _{z \in \mathbb {R}} V(z) > -\infty .$$

This is equivalent to assuming that the unconstrained problem \(\inf _{x \in S} f(x)\) has a finite value. The value function *V* being nondecreasing, it is thus bounded from below, and so is its convex envelope. Moreover, since *V* is nondecreasing, its domain \(\{ z \,|\, V(z) < +\infty \}\) is an interval of the form \((-\infty ,\bar{z})\) or \((-\infty , \bar{z}]\); therefore, the convex envelope has the same domain.

Let us fix a random variable \(\xi \) with uniform distribution on [0, 1]. We denote by \(\mathcal {M}(S,\xi )\) the space of random variables with values in *S* that are measurable with respect to \(\xi \). We introduce a new family of optimization problems, called *relaxed problems*, with value \(V^{{{\mathrm{r}}}}(z)\):

$$V^{{{\mathrm{r}}}}(z) = \inf _{X \in \mathcal {M}(S,\xi ),\ \mathbb {E}[g(X)] \ge z} \mathbb {E}\big [ f(X) \big ]. \qquad (8.17)$$

This family is linked to the family of problems (8.6): instead of taking one deterministic decision *x*, we are allowed to let the decision depend on \(\xi \), and the constraint must only be satisfied “in expectation”.

### Lemma 26

The relaxed value function \(V^{{{\mathrm{r}}}}\) is the convex envelope of *V*:

$$V^{{{\mathrm{r}}}}(z) = {{\mathrm{conv}}}(V)(z), \quad \text {for all } z \in \mathbb {R}.$$

### Proof

Let *z* be such that \({{\mathrm{conv}}}(V)(z) < +\infty \). By Carathéodory's theorem,

$${{\mathrm{conv}}}(V)(z) = \inf \big \{ \alpha V(z_1) + (1-\alpha ) V(z_2) \,:\, \alpha \in [0,1],\ \alpha z_1 + (1-\alpha ) z_2 = z \big \}.$$

Let \(\varepsilon >0\), and let \(z_1 \in \mathbb {R}\), \(z_2 \in \mathbb {R}\), \(\alpha \in [0,1]\), \(x_1 \in S\), \(x_2 \in S\) be such that

$$\alpha z_1 + (1-\alpha ) z_2 = z, \quad g(x_1) \ge z_1, \quad g(x_2) \ge z_2, \quad \text {and} \quad \alpha f(x_1) + (1-\alpha ) f(x_2) \le {{\mathrm{conv}}}(V)(z) + \varepsilon .$$

We set \(X= x_1\) if \(\xi \in (0,\alpha )\), and \(X = x_2\) otherwise. The variable *X* is feasible for (8.17), and therefore \(V^{{{\mathrm{r}}}}(z) \le {{\mathrm{conv}}}(V)(z) + \varepsilon \). Passing to the limit, we obtain that \(V^{{{\mathrm{r}}}} \le {{\mathrm{conv}}}(V)\).

Conversely, let \(z \in \mathbb {R}\) be such that \(V^{{{\mathrm{r}}}}(z) < + \infty \), let \(\varepsilon >0\), and let *X* be an \(\varepsilon \)-solution of (8.17). Let *Z* be the real random variable defined by \(Z= g(X)\). Then, by Jensen's inequality and since \({{\mathrm{conv}}}(V)\) is nondecreasing,

$${{\mathrm{conv}}}(V)(z) \le {{\mathrm{conv}}}(V)\big ( \mathbb {E}[Z] \big ) \le \mathbb {E}\big [ {{\mathrm{conv}}}(V)(Z) \big ] \le \mathbb {E}\big [ V(Z) \big ] \le \mathbb {E}\big [ f(X) \big ] \le V^{{{\mathrm{r}}}}(z) + \varepsilon .$$

Passing to the limit, we obtain that \({{\mathrm{conv}}}(V) \le V^{{{\mathrm{r}}}}\), and finally the equality. \(\square \)

### Lemma 27

Let \(z \in \mathbb {R}\), and let *X* be an optimal solution to the relaxed problem (8.17). Let *Z* be a real random variable, measurable with respect to \(\xi \), such that

$$Z \le g(X) \ \text {almost surely} \quad \text {and} \quad \mathbb {E}[Z] \ge z.$$

Then, *X* is a solution to the problem (8.6) with the level \(z=Z\), almost surely. Moreover, if \(\lambda \in \partial V^{{{\mathrm{r}}}}(z)\), then \(\lambda \in \partial V(Z)\) almost surely and *X* is a solution to the penalized problem (8.8) almost surely (that is to say, \(V^*(\lambda )= -(f(X)-\lambda g(X))\) almost surely).

### Proof

Let us prove the first part of the lemma. First, by Lemma 26 and Jensen's inequality,

$$\mathbb {E}\big [ V(Z) \big ] \ge \mathbb {E}\big [ {{\mathrm{conv}}}(V)(Z) \big ] \ge {{\mathrm{conv}}}(V)\big ( \mathbb {E}[Z] \big ) \ge {{\mathrm{conv}}}(V)(z) = V^{{{\mathrm{r}}}}(z) = \mathbb {E}\big [ f(X) \big ]. \qquad (8.20)$$

Moreover, \(V(Z) \le f(X)\) almost surely, since \(Z \le g(X)\) almost surely. Combined with (8.20), it follows that \(f(X)= V(Z)\) almost surely. Note that it follows also that \(V^{{{\mathrm{r}}}}(z)= \mathbb {E} \big [ V(Z) \big ]\).

Let us prove the second part of the lemma. Let \(\lambda \in \partial V^{{{\mathrm{r}}}}(z)\); note that \(\lambda \ge 0\), since \(V^{{{\mathrm{r}}}}\) is nondecreasing. Using the fact that \(V^{{{\mathrm{r}}},*}=V^*\) and that \(\mathbb {E}[Z] \ge z\), we obtain that

$$\mathbb {E}\big [ \lambda Z - V^*(\lambda ) \big ] \ge \lambda z - V^*(\lambda ) = V^{{{\mathrm{r}}}}(z) = \mathbb {E}\big [ V(Z) \big ].$$

Since \(\lambda Z - V^*(\lambda ) \le V(Z)\), we obtain that \(\lambda Z- V^*(\lambda ) = V(Z)\) almost surely and therefore that \(\lambda \in \partial V(Z)\) almost surely. Finally, *X* is a solution to the penalized problem almost surely by Lemma 24. \(\square \)

Let us denote now by \(\mathcal {P}(S)\) the set of probability measures on *S*. With a proof similar to the one of Lemma 26, we can show that

$$V^{{{\mathrm{r}}}}(z) = \inf _{\mu \in \mathcal {P}(S),\ \int _S g \,\mathrm {d}\mu \ge z}\ \int _S f \,\mathrm {d}\mu .$$

Instead of taking one decision, we are allowed to take several decisions simultaneously. The cost function and the constraint are now linear with respect to the new optimization variable \(\mu \).

When \(V^{{{\mathrm{r}}}}\) is lower semi-continuous and such that \(V^{{{\mathrm{r}}}}(z) > -\infty \), for all *z*, then it is equal to \(V^{**}\). This means that it can be computed with a dual approach, motivated by Lemma 24, and that error estimates can be derived with Lemma 25. Note that we need the existence of optimal solutions to the penalized problems.
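To illustrate the relaxation concretely (with our own two-point instance; all names are ours), randomizing between two decisions via the auxiliary uniform variable attains the convexified value, strictly below the deterministic value when V is nonconvex at the target level:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two decisions (ours): 'a' with (f, g) = (0, 0) and 'b' with (f, g) = (1, 2).
# At level z = 1, the only feasible deterministic decision is 'b', so V(1) = 1.
f_a, g_a = 0.0, 0.0
f_b, g_b = 1.0, 2.0
z = 1.0

# Relaxed problem: take 'b' with probability alpha, chosen so that E[g(X)] = z.
alpha = (z - g_a) / (g_b - g_a)           # = 0.5
xi = rng.uniform(size=200_000)            # the uniform randomization device
take_b = xi < alpha
Ef = np.mean(np.where(take_b, f_b, f_a))  # approximately conv(V)(1) = 0.5 < V(1) = 1
Eg = np.mean(np.where(take_b, g_b, g_a))  # approximately 1.0: the constraint holds in expectation
print(Ef, Eg)
```

The Monte Carlo averages confirm that the constraint is met in expectation while the expected cost drops to the convex envelope, exactly the mechanism behind Lemma 26.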


## About this article

### Cite this article

Pfeiffer, L. Two Approaches to Stochastic Optimal Control Problems with a Final-Time Expectation Constraint.
*Appl Math Optim* **77**, 377–404 (2018). https://doi.org/10.1007/s00245-016-9378-9


### Keywords

- Stochastic optimal control
- Expectation and probability constraints
- Dynamic programming
- Lagrange relaxation

### Mathematics Subject Classification

- 90C15
- 93E20