A framework for randomized time-splitting in linear-quadratic optimal control

Veldman, D. W. M.; Zuazua, E.

doi:10.1007/s00211-022-01290-3


Open access
Published: 11 May 2022

Volume 151, pages 495–549, (2022)

Numerische Mathematik


Abstract

Inspired by the successes of stochastic algorithms in the training of deep neural networks and the simulation of interacting particle systems, we propose and analyze a framework for randomized time-splitting in linear-quadratic optimal control. In our proposed framework, the linear dynamics of the original problem is replaced by a randomized dynamics. To obtain the randomized dynamics, the system matrix is split into simpler submatrices and the time interval of interest is split into subintervals. The randomized dynamics is then found by selecting randomly one or more submatrices in each subinterval. We show that the dynamics, the minimal values of the cost functional, and the optimal control obtained with the proposed randomized time-splitting method converge in expectation to their analogues in the original problem when the time grid is refined. The derived convergence rates are validated in several numerical experiments. Our numerical results also indicate that the proposed method can lead to a reduction in computational cost for the simulation and optimal control of large-scale linear dynamical systems.


1 Introduction

Solving an optimal control problem for a large-scale dynamical system can be computationally demanding. This problem appears in numerous applications. One example is Model Predictive Control (MPC), which requires the solution of several optimal control problems on a receding time horizon [11, 18]. Another example is the training of Deep Neural Networks (DNNs), which can be approached as an optimal control problem for a large-scale nonlinear dynamical system, see, e.g., [4, 9, 10, 27, 29]. Because the computational cost for gradient-based deterministic optimization algorithms explodes on large training data sets, neural networks (NNs) are typically trained using stochastic optimization algorithms such as stochastic gradient descent or stochastic (mini-)batch methods, see, e.g., [6]. In such methods, the update direction for the parameters of the NN is not computed based on the complete training data set, but on a subset of the available training data that is chosen randomly in each iteration. It can be shown that such methods converge in expectation to a (local) minimum of the considered cost functional, see, e.g., [6].

These successes inspired the development of Random Batch Methods (RBMs) for the simulation of interacting particle systems [14, 15, 21]. Because the number of interactions between N particles is of order $N^2$, the forward simulation of a system with a large number of particles is computationally demanding. An RBM reduces the required computational cost by reducing the number of considered interactions as follows. First, the considered time interval is divided into a number of subintervals of length $\le h$. In each subinterval, particles are grouped in randomly chosen batches (of at least two particles) and only the interactions between particles in the same batch are considered. The number of considered interactions now grows as PN, where P is the size of the considered batches, and a significant reduction in computational time can be achieved when $P \ll N$. It can be shown that the expected error introduced by this process is proportional to $\sqrt{h}$, where h denotes (an upper bound on) the length of the considered time intervals, see [14].

The computation of optimal controls for interacting particle systems is even more computationally demanding than the forward simulation because it requires several simulations of the forward dynamics and the associated adjoint problem, see, e.g., [20]. Because the optimal control for the RBM-approximated dynamics can be computed significantly faster than the control for the original dynamics, it has been proposed in [18] to control the original system with the controls optimized for the RBM dynamics. The numerical experiments in [18] indeed indicate that this approach can lead to a reasonably good approximation of the control for the original system. In [18], the control of the original dynamics with the RBM-optimal controls is combined with an MPC strategy, which creates additional robustness against the errors introduced by the RBM-approximation. However, even for the simplest case that does not consider the combination with MPC, a formal proof that the optimal control computed for the RBM-approximated dynamics indeed converges to the optimal control for the original system for $h \rightarrow 0$ was not given.

In this paper, we study, motivated by the ideas from [18], the classical linear-quadratic (LQ) optimal control problem constrained by randomized dynamics. Extensions of these results to a nonlinear setting are not only of interest for the control of interacting particle systems as considered in [18], but also have applications in the training of certain DNNs which can be viewed as (the time discretization of) an optimal control problem, see, e.g., [4, 9, 10, 27, 29]. The results for the LQ problem in this paper form a starting point for the study of these more involved problem settings.

In this paper, we propose a framework for the simulation and optimal control of large-scale linear dynamical systems. In our proposed framework, the system matrix is split into submatrices and the time interval of interest is split into subintervals of length $\le h$. The randomized dynamics is then found based on the randomly selected submatrices in each subinterval. As in [14, 15, 21], we show that the randomized dynamics converges to the dynamics of the original system at a rate $\sqrt{h}$. The main contributions of this paper concern the LQ optimal control problem in which the original dynamics is replaced by these randomized dynamics. In particular, we show that the minimal values of the cost functional and the corresponding optimal controls for the RBM-dynamics converge (in $L^2$ and in expectation) to their analogues for the original dynamics when $h \rightarrow 0$. The derived convergence rates are validated by several numerical examples. Numerical results also indicate that the proposed method can lead to a reduction in computational cost.

The remainder of this paper is structured as follows. Section 2 contains a precise description of our proposed stochastic simulation method and a summary of the main results of the paper. Section 3 contains the detailed proofs of the convergence of the proposed method. The proposed method and the obtained convergence results are illustrated by several numerical examples in Sect. 4. The conclusions and discussions are presented in Sect. 5.

2 Proposed method and main results

2.1 Proposed method

We consider the evolution of a large-scale Linear Time Invariant (LTI) dynamical system of the form

$$\begin{aligned} {\dot{x}}(t) = Ax(t) + Bu(t), \quad x(0) = x_0, \end{aligned}$$

(1)

where the state x(t) evolves in ${\mathbb {R}}^N$, the control u(t) evolves in ${\mathbb {R}}^q$, $A \in {\mathbb {R}}^{N \times N}$ is the system matrix, $B \in {\mathbb {R}}^{N \times q}$ is the input matrix, and $x_0 \in {\mathbb {R}}^N$ is the initial condition.

A typical problem associated to the dynamics (1) is to find the optimal control $u^*(t)$ that minimizes the quadratic cost functional

$$\begin{aligned} J(u) = \frac{1}{2}\int _0^T \left( (x(t)-x_d(t))^\top Q (x(t)-x_d(t))+u(t)^\top R u(t) \right) \,\mathrm {d}t, \end{aligned}$$

(2)

where the given target trajectory $x_d(t)$ evolves in ${\mathbb {R}}^N$, the weighting matrix $Q \in {\mathbb {R}}^{N \times N}$ is symmetric and positive semi-definite, and the weighting matrix $R\in {\mathbb {R}}^{q \times q}$ is symmetric and positive definite. It is well known that the optimal control $u^*(t)$ exists and that it is unique, see, e.g., [17, 22].

Remark 1

When the state-dimension N is large, the optimal control $u^*(t)$ is typically computed using a gradient-based algorithm in which the gradient of J(u) is computed from the adjoint state $\varphi (t)$ that satisfies (see, e.g., [17])

$$\begin{aligned} -{\dot{\varphi }}(t) = A^\top \varphi (t) + Q (x(t) - x_d(t)), \quad \varphi (T) = 0, \end{aligned}$$

(3)

where x(t) is the solution of (1). Note that the adjoint state $\varphi (t)$ is computed by integrating (3) backward in time starting from the final condition $\varphi (T) = 0$. The gradient of the cost functional J(u) is then obtained as

$$\begin{aligned} \left( \nabla J(u)\right) (t) = B^\top \varphi (t) + R u(t). \end{aligned}$$

(4)

In our proposed randomized time-splitting method, the matrix A is written as the sum of M submatrices $A_m$

$$\begin{aligned} A = \sum _{m=1}^M A_m. \end{aligned}$$

(5)

Typically, the submatrices $A_m$ will be more sparse than the original matrix A. For ease of presentation, the results in this paper are presented under the following assumption.

Assumption 1

The submatrices $A_m$ in (5) are dissipative, i.e. $\langle x, A_m x \rangle \le 0$ for all $x \in {\mathbb {R}}^N$ and all $m \in \{1,2,\ldots ,M \}$.

Remark 2

Note that there always exists a constant $a > 0$ such that the matrices $A_m - aI$ are dissipative for $m \in \{1,2,\ldots ,M \}$. Assumption 1 is therefore not essential for the convergence of the proposed method, but without Assumption 1 the error estimates are less clean and grow exponentially in time. This idea is made more precise in Remark 9 in Sect. 3.2.

We then choose a temporal grid in the time interval [0, T]

$$\begin{aligned} 0 = t_0< t_1< t_2< \cdots< t_{K-1} < t_K = T, \end{aligned}$$

(6)

and denote

$$\begin{aligned} h_k = t_k - t_{k-1}, \quad h = \max _{k \in \{1,2,\ldots , K\}} h_k. \end{aligned}$$

(7)

In each of the K subintervals $[t_{k-1}, t_k)$, we randomly select a subset of indices in $\{1,2,\ldots , M \}$. The idea of the proposed method is to consider a linear combination of the submatrices $A_m$ with the indices that have been selected for each time interval. This can lead to a significant reduction in computational time when the submatrices $A_m$ are well-chosen and only a small number of submatrices $A_m$ are selected in each time interval.

To make this idea more precise, we enumerate all of the $2^M$ subsets of $\{1,2, \ldots , M \}$ as $S_1, S_2, \ldots , S_{2^M}$. Note that one of the subsets $S_{\omega }$ will be the empty set. To every subset $S_{\omega }$ ($\omega \in \Omega := \{1, 2, \ldots , 2^M \}$) we then assign a probability $p_{\omega }$ with which this subset is selected. This probability is the same in each of the time intervals $[t_{k-1},t_k)$. Because we select only one subset $S_{\omega }$ in each time interval, the probabilities $p_{\omega }$ should satisfy

$$\begin{aligned} \sum _{\omega = 1}^{2^M} p_{\omega } = 1. \end{aligned}$$

(8)

From the chosen probabilities $p_{\omega }$, we then compute the probability $\pi _m$ that an index $m \in \{1,2, \ldots , M \}$ is an element of the selected subset

$$\begin{aligned} \pi _m = \sum _{\omega \in \Omega _m} p_\omega , \quad \Omega _m = \{ \omega \in \{1,2, \ldots , 2^M \} \mid m \in S_{\omega } \}. \end{aligned}$$

(9)

Observe that $\Omega _m$ is the set of the indices $\omega $ of the sets $S_{\omega }$ that contain the index m. We need the following (weak) assumption on the selected probabilities $p_{\omega }$.

Assumption 2

The probabilities $p_{\omega }$ ($\omega \in \{1,2, \ldots ,2^M \}$) are assigned such that

Equation (8) is satisfied and
the probabilities $\pi _m$ defined in (9) are positive for all $m \in \{1,2, \ldots , M \}$.

In each of the K time intervals $[t_{k-1},t_k)$, we then randomly select an index $\omega _k \in \{1,2, \ldots , 2^M \}$ according to the chosen probabilities $p_\omega $ (and independently of the other indices $\omega _1, \omega _2, \ldots , \omega _{k-1}, \omega _{k+1}, \omega _{k+2}, \ldots , \omega _K$). The selected indices form a vector

$$\begin{aligned} {\varvec{\omega }} := (\omega _1, \omega _2, \ldots , \omega _K) \in \{1,2,\ldots ,2^M \}^K =: \Omega ^K. \end{aligned}$$

(10)

For the selected ${\varvec{\omega }}\in \Omega ^K$, we then define a piece-wise constant matrix $t \mapsto {\mathcal {A}}_h({\varvec{\omega }},t)$

$$\begin{aligned} {\mathcal {A}}_h({\varvec{\omega }}, t) = \sum _{m \in S_{\omega _k}} \frac{A_m}{\pi _m}, \quad t \in [t_{k-1}, t_k). \end{aligned}$$

(11)

The scaling by $1/\pi _m$ assures that the expected value of ${\mathcal {A}}_h$ is A because

$$\begin{aligned} \sum _{\omega =1}^{2^M} \sum _{m \in S_{\omega }} \frac{A_m}{\pi _m} p_\omega = \sum _{m=1}^M \sum _{\omega \in \Omega _m} \frac{A_m}{\pi _m} p_\omega = \sum _{m=1}^M \frac{A_m}{\pi _m}\pi _m = \sum _{m=1}^M A_m = A, \end{aligned}$$

(12)

where the first identity follows after interchanging the two summations using the definition of $\Omega _m$ in (9), the second from the definition of $\pi _m$ in (9), and the last identity from the decomposition of A in (5).

Example 1

In the simplest situation, we decompose the original matrix A into $M = 2$ matrices as $A = A_1 + A_2$. We then need to assign $2^M = 4$ probabilities $p_\ell $ to the subsets $S_1 = \{ 1 \}$, $S_2 = \{ 2 \}$, $S_3 = \{ 1,2 \}$, and $S_4 = \emptyset $. In this example, we choose $p_1 = p_2 = \tfrac{1}{2}$ and $p_3 = p_4 = 0$. This choice indeed satisfies Assumption 2 because $\pi _1 = p_1 + p_3 = \tfrac{1}{2} > 0$ and $\pi _2 = p_2 + p_3 = \tfrac{1}{2} > 0$. The matrix ${\mathcal {A}}_h({\varvec{\omega }},t)$ is thus either equal to $2A_1$ with probability $p_1 = \tfrac{1}{2}$ or equal to $2A_2$ with probability $p_2 = \tfrac{1}{2}$. The expected value of ${\mathcal {A}}_h$ is then indeed $\tfrac{1}{2} 2 A_1 + \tfrac{1}{2} 2 A_2 = A_1 + A_2 = A$.
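The computation in Example 1 can also be checked mechanically. The following is a minimal sketch, not part of the proposed method itself: the helper names (`all_subsets`, `inclusion_probabilities`, `randomized_matrix`, `expected_matrix`) and the specific $2 \times 2$ matrices $A_1$, $A_2$ are illustrative assumptions. It evaluates $\pi _m$ as in (9), builds the possible values of ${\mathcal {A}}_h$ as in (11), and verifies the identity (12) in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def all_subsets(M):
    """Enumerate all 2^M subsets of {1, ..., M} (including the empty set)."""
    idx = range(1, M + 1)
    return [frozenset(c) for r in range(M + 1) for c in combinations(idx, r)]

def inclusion_probabilities(p, M):
    """pi_m = sum of p[S] over all subsets S that contain m, as in (9)."""
    return {m: sum(prob for S, prob in p.items() if m in S)
            for m in range(1, M + 1)}

def randomized_matrix(S, A_parts, pi):
    """Sum of A_m / pi_m over m in S, as in (11); matrices are nested lists."""
    n = len(next(iter(A_parts.values())))
    out = [[Fraction(0)] * n for _ in range(n)]
    for m in S:
        for i in range(n):
            for j in range(n):
                out[i][j] += Fraction(A_parts[m][i][j]) / pi[m]
    return out

def expected_matrix(p, A_parts, pi):
    """Left-hand side of (12): sum over subsets of p[S] times the matrix for S."""
    n = len(next(iter(A_parts.values())))
    out = [[Fraction(0)] * n for _ in range(n)]
    for S, prob in p.items():
        R = randomized_matrix(S, A_parts, pi)
        for i in range(n):
            for j in range(n):
                out[i][j] += prob * R[i][j]
    return out

# Hypothetical dissipative splitting A = A_1 + A_2 (illustrative values only).
A_parts = {1: [[-1, 1], [1, -1]], 2: [[-1, 0], [0, -1]]}
A = [[-2, 1], [1, -2]]

# Probabilities of Example 1: p = 1/2 for {1} and {2}, zero otherwise.
p = {S: Fraction(0) for S in all_subsets(2)}
p[frozenset({1})] = Fraction(1, 2)
p[frozenset({2})] = Fraction(1, 2)

pi = inclusion_probabilities(p, 2)
E = expected_matrix(p, A_parts, pi)
print(pi[1], pi[2], E == A)
```

Using `Fraction` avoids rounding, so the comparison `E == A` tests (12) exactly rather than up to a tolerance.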

To reduce the computational cost for solving (1), the matrix A is replaced by ${\mathcal {A}}_h({\varvec{\omega }},t)$ in the RBM. For the selected vector of indices ${\varvec{\omega }} \in \Omega ^K$, we thus obtain a solution $t \mapsto x_h({\varvec{\omega }},t)$

$$\begin{aligned} {\dot{x}}_h({\varvec{\omega }},t) = {\mathcal {A}}_h({\varvec{\omega }},t) x_h({\varvec{\omega }},t) + B u(t), \quad x_h({\varvec{\omega }},0) = x_0. \end{aligned}$$

(13)

The main contribution of this paper concerns the optimal controls computed based on the RBM-dynamics (13). In particular, we consider the minimization of the functional

$$\begin{aligned} J_h({\varvec{\omega }}, u) = \frac{1}{2}\int _0^T \left( (x_h({\varvec{\omega }},t)-x_d(t))^\top Q (x_h({\varvec{\omega }},t)-x_d(t))+u(t)^\top R u(t) \right) \, \mathrm {d}t, \end{aligned}$$

(14)

over all $u \in L^2(0,T ; {\mathbb {R}}^q)$ subject to the dynamics (13). The minimizer of $J_h({\varvec{\omega }}, \cdot )$ depends on the selected indices ${\varvec{\omega }} \in \Omega ^K$ and is denoted by $u^*_h({\varvec{\omega }},t)$. Because R is positive definite, the minimizer $u^*_h({\varvec{\omega }},t)$ exists and is unique. As we will show in (52)–(54) in Sect. 3.1, the minimizers $u^*_h({\varvec{\omega }},t)$ are uniformly bounded because R is positive definite.

Remark 3

As for the original cost functional J(u) in (2), we can compute the optimal control $u^*_h({\varvec{\omega }},t)$ that minimizes $J_h({\varvec{\omega }},u)$ by a gradient-based algorithm. We can again compute the gradient of $J_h({\varvec{\omega }},u)$ from the adjoint state $\varphi _h({\varvec{\omega }},t)$ which satisfies

$$\begin{aligned} -{\dot{\varphi }}_h({\varvec{\omega }},t) = \left( {\mathcal {A}}_h({\varvec{\omega }},t) \right) ^\top \varphi _h({\varvec{\omega }},t) + Q(x_h({\varvec{\omega }},t) - x_d(t)), \quad \varphi _h({\varvec{\omega }},T) = 0. \nonumber \\ \end{aligned}$$

(15)

The gradient of $J_h({\varvec{\omega }},u)$ is then obtained as

$$\begin{aligned} \left( \nabla J_h({\varvec{\omega }},u) \right) (t) = B^\top \varphi _h({\varvec{\omega }},t) + Ru(t). \end{aligned}$$

(16)

Note that when the randomized dynamics for $x_h({\varvec{\omega }},t)$ in (13) can be solved faster than the original dynamics for x(t) in (1), the same reduction in computational cost is typically also obtained for the randomized adjoint Eq. (15) compared to the original adjoint Eq. (3). Because the computation of the optimal control $u^*(t)$ [resp. $u_h^*({\varvec{\omega }},t)$] requires several evaluations of the forward dynamics (1) [resp. (13)] and the adjoint Eq. (3) [resp. (15)], it is natural to expect the same relative speed-up for $u_h^*({\varvec{\omega }},t)$ (compared to $u^*(t)$) as for $x_h({\varvec{\omega }},t)$ (compared to x(t)). This idea is confirmed by the numerical experiments in Sect. 4.
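The adjoint-based gradient computation of Remark 3 can be illustrated on a scalar toy problem ($N = q = 1$). The sketch below rests on several assumptions that are not taken from the paper: the dynamics (13) and the adjoint Eq. (15) are discretized by one explicit Euler step per subinterval, the cost (14) by a Riemann sum, and the names `forward`, `cost`, `gradient`, as well as all numerical values, are hypothetical. With this discretization the terminal value `phi[K]` is of order $h$ instead of exactly zero, which is the discrete counterpart of $\varphi _h({\varvec{\omega }},T) = 0$.

```python
import random

def forward(a, b, u, x0, h):
    """Explicit Euler for x' = a_k x + b u_k on a uniform grid (scalar state)."""
    x = [x0]
    for k in range(len(u)):
        x.append(x[k] + h * (a[k] * x[k] + b * u[k]))
    return x

def cost(a, b, u, x0, h, q, r, xd):
    """Riemann-sum approximation of the cost functional (14)."""
    x = forward(a, b, u, x0, h)
    return 0.5 * h * sum(q * (x[k + 1] - xd) ** 2 + r * u[k] ** 2
                         for k in range(len(u)))

def gradient(a, b, u, x0, h, q, r, xd):
    """Discrete analogue of (15)-(16): backward sweep, then B^T phi + R u."""
    K = len(u)
    x = forward(a, b, u, x0, h)
    phi = [0.0] * (K + 1)
    phi[K] = h * q * (x[K] - xd)          # O(h) counterpart of phi(T) = 0
    for j in range(K - 1, 0, -1):          # integrate backward in time
        phi[j] = phi[j + 1] + h * (a[j] * phi[j + 1] + q * (x[j] - xd))
    return [b * phi[k + 1] + r * u[k] for k in range(K)]

# Scalar version of Example 1: a = a1 + a2, randomized coefficient 2*a1 or 2*a2.
rng = random.Random(1)
K, T = 50, 1.0
h = T / K
a1, a2 = -1.0, -0.5                        # illustrative dissipative parts
a = [2.0 * rng.choice((a1, a2)) for _ in range(K)]
b, x0, q, r, xd = 1.0, 1.0, 1.0, 1.0, 0.0

u = [0.0] * K
J_start = cost(a, b, u, x0, h, q, r, xd)
for _ in range(200):                       # plain gradient descent, fixed step
    g = gradient(a, b, u, x0, h, q, r, xd)
    u = [uk - 0.5 * gk for uk, gk in zip(u, g)]
J_end = cost(a, b, u, x0, h, q, r, xd)
print(J_start, J_end)
```

Each descent iteration costs one forward and one backward sweep, which is why a cheaper forward solve is expected to translate into a comparable saving for the optimal control computation.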

To conclude this subsection, we summarize the proposed approach to approximate the solution x(t) of (1) for a given control u(t) and/or the optimal control $u^*(t)$ that minimizes $J(\cdot )$ in (2) subject to (1) in Algorithm 1. The accuracy of the obtained solutions $x_h({\varvec{\omega }},t)$ and/or $u^*_h({\varvec{\omega }},t)$ depends on the chosen submatrices $A_m$ in Step 1, the chosen probabilities $p_\omega $ in Step 2, and the chosen time grid $t_0, t_1, \ldots , t_K$ in Step 3. This dependence is captured by the error estimates in the next subsection.
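For the forward simulation, the four steps referred to above can be sketched as follows. Algorithm 1 itself is not reproduced in this text, so the sketch relies only on the surrounding description (choose the submatrices, the probabilities, and the time grid, then sample ${\varvec{\omega }}$ and solve (13)). The function names, the use of Heun's method with a fixed number of substeps, the 0-based indices, and the example data are all assumptions made for illustration.

```python
import random

def matvec(A, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def randomized_matrices(A_parts, subsets, probs):
    """For every subset S, precompute sum_{m in S} A_m / pi_m as in (11)."""
    n = len(A_parts[0])
    # pi[m]: probability that index m is in the selected subset, see (9)
    pi = [sum(p for S, p in zip(subsets, probs) if m in S)
          for m in range(len(A_parts))]
    return [[[sum(A_parts[m][i][j] / pi[m] for m in S) for j in range(n)]
             for i in range(n)] for S in subsets]

def simulate(A_parts, subsets, probs, grid, x0, forcing, rng, substeps=20):
    """Sample omega, then integrate (13) with Heun's method per subinterval.

    forcing(t) returns the vector B u(t); indices m are 0-based here.
    """
    mats = randomized_matrices(A_parts, subsets, probs)
    K = len(grid) - 1
    omega = rng.choices(range(len(subsets)), weights=probs, k=K)
    x = list(x0)
    for k, w in enumerate(omega):
        A_k = mats[w]                      # constant on [t_k, t_{k+1})
        dt = (grid[k + 1] - grid[k]) / substeps
        for s in range(substeps):
            t = grid[k] + s * dt
            f1 = [a + b for a, b in zip(matvec(A_k, x), forcing(t))]
            x_pred = [xi + dt * fi for xi, fi in zip(x, f1)]
            f2 = [a + b for a, b in zip(matvec(A_k, x_pred), forcing(t + dt))]
            x = [xi + 0.5 * dt * (c + d) for xi, c, d in zip(x, f1, f2)]
    return omega, x

# Step 1: splitting A = A1 + A2 (illustrative dissipative matrices).
A1 = [[-1.0, 1.0], [1.0, -1.0]]
A2 = [[-1.0, 0.0], [0.0, -1.0]]
# Step 2: select {A1} or {A2}, each with probability 1/2 (as in Example 1).
subsets, probs = [(0,), (1,)], [0.5, 0.5]
# Step 3: uniform time grid on [0, T].
K, T = 50, 1.0
grid = [k * T / K for k in range(K + 1)]
# Step 4: sample omega and solve the randomized dynamics.
forcing = lambda t: [0.0, 1.0]
omega, x_h = simulate([A1, A2], subsets, probs, grid, [1.0, 0.0],
                      forcing, random.Random(0))

# Selecting all indices with probability one recovers the original dynamics.
_, x_ref = simulate([A1, A2], [(0, 1)], [1.0], grid, [1.0, 0.0],
                    forcing, random.Random(0))
print(x_h, x_ref)
```

In this small example no cost reduction can be expected; the potential benefit arises only when the selected combinations of $A_m$ are substantially sparser or otherwise cheaper to apply than A.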

It should be emphasized that we do not have that ${\mathbb {E}}[x_h(t)] = x(t)$ (for a fixed control u(t)) or that ${\mathbb {E}}[u^*_h(t)] = u^*(t)$. Repeating Step 4 in Algorithm 1 for different realizations of ${\varvec{\omega }}$ and averaging the obtained results leads to better approximations of ${\mathbb {E}}[x_h(t)]$ and/or ${\mathbb {E}}[u^*_h(t)]$ and can therefore only improve the approximation of x(t) and $u^*(t)$ to some extent. A better way to increase the accuracy of the proposed method is to repeat Algorithm 1 for a choice of submatrices $A_m$, probabilities $p_{\omega }$, and a time grid $t_0, t_1, \ldots , t_K$ that reduce the error estimates presented in the next subsection.

Remark 4

The presented framework is somewhat different from the problem setting considered in previous publications on RBMs for interacting particle systems, see, e.g., [14, 15, 18, 21]. Appendix A shows how these RBMs can be accommodated in our proposed framework.

2.2 Main results

The main results of this paper concern the effect of replacing the system matrix A in the original LQ optimal control problem (1)–(2) by the randomized matrix ${\mathcal {A}}_h({\varvec{\omega }},t)$ defined in (11). Clearly, the deviation of the randomized matrix ${\mathcal {A}}_h({\varvec{\omega }},t)$ from the original matrix A will influence the accuracy of the obtained results. The deviation of ${\mathcal {A}}_h({\varvec{\omega }},t)$ from A is measured by

$$\begin{aligned} \mathrm {Var}[{\mathcal {A}}] := \sum _{\omega =1}^{2^M} \left\| \sum _{m \in S_{\omega }} \frac{A_m}{\pi _m} - A \right\| ^2 p_\omega , \end{aligned}$$

(17)

where $\Vert \cdot \Vert $ denotes the operator norm. The quantity $\mathrm {Var}[{\mathcal {A}}]$ is thus the average squared distance of ${\mathcal {A}}_h({\varvec{\omega }},t)$ from A, weighted with the probabilities $p_1, p_2, \ldots , p_{2^M}$ with which different values of ${\mathcal {A}}_h({\varvec{\omega }},t)$ occur. Naturally, the error estimates below show that reducing $\mathrm {Var}[{\mathcal {A}}]$ will also reduce the errors introduced by the proposed randomized time-splitting method.

Example 1 (continued) We again consider the situation from Example 1 in which A is decomposed into $M=2$ submatrices as $A = A_1 + A_2$ and ${\mathcal {A}}_h({\varvec{\omega }},t)$ is either $2A_1$ or $2A_2$, both with probability $\tfrac{1}{2}$. We now compute the variance $\mathrm {Var}[{\mathcal {A}}]$ according to (17) and find

$$\begin{aligned} \mathrm {Var}[{\mathcal {A}}] = \Vert 2A_1 - A \Vert ^2 p_1 + \Vert 2A_2 - A \Vert ^2 p_2 = \Vert A_1 - A_2 \Vert ^2. \end{aligned}$$

(18)

Examples 2 and 3 in the following subsection further illustrate how $\mathrm {Var}[{\mathcal {A}}]$ depends on the decomposition of A into submatrices $A_m$ and the selected probabilities $p_{\omega }$.

Remark 5

When A is an approximation of an unbounded operator as in the examples in Sect. 4, it is natural to introduce an additional (invertible) weighting matrix W and compute

$$\begin{aligned} \mathrm {Var}_W[{\mathcal {A}}] := \sum _{\ell =1}^{2^M} \left\| \left( \sum _{m \in S_\ell } \frac{A_m}{\pi _m} - A \right) W \right\| ^2 p_\ell . \end{aligned}$$

(19)

Clearly, we want to choose W such that AW and the matrices $A_mW$ can be considered as approximations of bounded operators. In that case, $\mathrm {Var}_W[{\mathcal {A}}]$ is also an approximation of a finite quantity. A natural choice is $W = (A - \lambda I)^{-1}$ for some $\lambda $ in the resolvent set of A.

The first main result of this paper is an estimate for the difference

$$\begin{aligned} e_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x(t) \end{aligned}$$

(20)

between the RBM-dynamics (13) and the original dynamics (1).

Main result 1

Assume that Assumptions 1 and 2 hold and that the input u(t) in (1) is the same as the input u(t) in (13). Then

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le C_{[A,B,x_0,T,u]} h \mathrm {Var}[{\mathcal {A}}], \end{aligned}$$

(21)

for all $t \in [0,T]$.

The first main result follows directly from Theorem 1 in Sect. 3.2.

The expectation operator ${\mathbb {E}}$ is taken with respect to all possible outcomes ${\varvec{\omega }} \in \Omega ^K$. A precise definition will be given in Sect. 3.1. The constant $C_{[A,B,x_0,T,u]}$ can be taken as $(\Vert A\Vert T^2 + 2T)( |x_0 |+|Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2$. The estimate thus only depends on the used submatrices $A_m$, the probabilities $p_{\omega }$, and the used temporal grid $t_0,t_1,\ldots , t_K$ through $h\mathrm {Var}[{\mathcal {A}}]$ defined in (17). The proof of Main result 1 is inspired by the proofs of convergence of the RBM in [14, 15].

The estimate (21) shows that the expected squared error is proportional to the temporal grid spacing h. We can thus make the expected squared error in the forward dynamics arbitrarily small by reducing the grid spacing. Note that Markov’s inequality, see, e.g., [26], shows that

$$\begin{aligned} {\mathbb {P}}[|e_h({\varvec{\omega }},t) |> \varepsilon ] = {\mathbb {P}}[|e_h({\varvec{\omega }},t) |^2 > \varepsilon ^2] < \frac{{\mathbb {E}}[|e_h(t) |^2]}{\varepsilon ^2}. \end{aligned}$$

(22)

The probability that we select an ${\varvec{\omega }} \in \Omega ^K$ for which $|e_h({\varvec{\omega }},t) |$ exceeds any given threshold $\varepsilon > 0$ is thus controlled by ${\mathbb {E}}[|e_h(t) |^2]$. According to (21), we can make this probability as small as desired by choosing the temporal grid spacing h small enough. However, one should also keep in mind that decreasing h will increase the computational cost for the RBM-dynamics (13) and that the computational advantage of the RBM is lost when the required grid spacing is too small.
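Combining (21) and (22) gives a simple rule of thumb for the grid spacing. The snippet below is only an arithmetic illustration: the function names and the numerical values for the constant $C$, for $\mathrm {Var}[{\mathcal {A}}]$, and for the tolerances are hypothetical, and in practice the constant from (21) may be pessimistic.

```python
def markov_tail_bound(C, h, var_A, eps):
    """Upper bound on P[|e_h| > eps] obtained from (21) and (22), capped at 1."""
    return min(1.0, C * h * var_A / eps ** 2)

def grid_spacing_for(C, var_A, eps, delta):
    """Largest h for which the bound C * h * Var[A] / eps^2 does not exceed delta."""
    return delta * eps ** 2 / (C * var_A)

# Hypothetical numbers: C = 2, Var[A] = 4, error threshold 0.1, probability 5 %.
h_max = grid_spacing_for(C=2.0, var_A=4.0, eps=0.1, delta=0.05)
print(h_max, markov_tail_bound(2.0, h_max, 4.0, 0.1))
```

The required h scales like $\varepsilon ^2$, so halving the error threshold at a fixed probability level calls for a grid that is four times finer.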

Example 1 (continued) To illustrate why Main result 1 could be true, we again consider the situation from Example 1 in which A is decomposed as $A = A_1 + A_2$ and ${\mathcal {A}}_h({\varvec{\omega }},t)$ is equal to $2A_1$ or $2A_2$, both with probability $\tfrac{1}{2}$. We additionally assume that $u(t) \equiv 0$, that the time grid $t_k = k T / K$ ($k \in \{0,1,2, \ldots , K \}$) is uniform with grid spacing $h = T/ K$, and that $A_1$ and $A_2$ commute. Because $u(t) = 0$, the solution of (1) is $x(t) = e^{At}x_0$ and the solution of (13) is

$$\begin{aligned} x_h({\varvec{\omega }},T) = e^{2A_{\omega _K} h} \cdots e^{2A_{\omega _2} h} e^{2A_{\omega _1} h} x_0 = e^{2A_1T_1({\varvec{\omega }}) + 2A_2 T_2({\varvec{\omega }})} x_0. \end{aligned}$$

(23)

Here, $T_1({\varvec{\omega }})$ and $T_2({\varvec{\omega }})$ denote the times during which $A_1$ and $A_2$ are used, i.e.

$$\begin{aligned} T_1({\varvec{\omega }}) = \frac{T}{K}\sum _{\ell =1}^K \chi _1(\omega _\ell ), \quad T_2({\varvec{\omega }}) = \frac{T}{K}\sum _{\ell =1}^K \chi _2(\omega _\ell ), \end{aligned}$$

(24)

where the characteristic functions $\chi _1(\omega )$ and $\chi _2(\omega )$ are defined by the property that $\chi _i(\omega ) = 1$ when $\omega = i$ and $\chi _i(\omega ) = 0$ otherwise ($i \in \{ 1,2\}$). Note that the second identity in (23) uses that $A_1$ and $A_2$ commute. Because ${\mathbb {E}}[\chi _1] = {\mathbb {E}}[\chi _2] = \tfrac{1}{2}$, it follows that ${\mathbb {E}}[T_1] = {\mathbb {E}}[T_2] = T/2$. When we now consider the limit $K \rightarrow \infty $ (so $h \rightarrow 0$), the law of large numbers states that $T_1$ and $T_2$ converge to T/2 (in probability). The RHS of (23) thus converges (in probability) to $e^{AT}x_0 = x(T)$ for $K \rightarrow \infty $. Note that the convergence in Main result 1 is in expectation, which is stronger than convergence in probability.
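The commuting case discussed above is easy to reproduce numerically when $A_1$ and $A_2$ are diagonal, since all matrix exponentials then act entry by entry. The following sketch makes that additional assumption; the function name `commuting_example` and the diagonal entries are hypothetical. It compares the step-by-step product on the left of (23) with the closed form on the right, and returns the times $T_1$ and $T_2$ from (24).

```python
import math
import random

def commuting_example(a1, a2, x0, T, K, rng):
    """Randomized dynamics of Example 1 for diagonal A_1, A_2 and u = 0.

    a1, a2 hold the diagonals of A_1 and A_2 (so the matrices commute).
    """
    h = T / K
    omega = [rng.choice((1, 2)) for _ in range(K)]   # p_1 = p_2 = 1/2
    T1 = h * omega.count(1)                          # time during which A_1 is used
    T2 = h * omega.count(2)                          # time during which A_2 is used
    # Product of exponentials, applied interval by interval.
    x_steps = list(x0)
    for w in omega:
        a = a1 if w == 1 else a2
        x_steps = [math.exp(2.0 * ai * h) * xi for ai, xi in zip(a, x_steps)]
    # Closed form exp(2 A_1 T_1 + 2 A_2 T_2) x_0.
    x_closed = [math.exp(2.0 * p * T1 + 2.0 * q * T2) * xi
                for p, q, xi in zip(a1, a2, x0)]
    # Solution of the original dynamics, exp(A T) x_0.
    x_exact = [math.exp((p + q) * T) * xi for p, q, xi in zip(a1, a2, x0)]
    return T1, T2, x_steps, x_closed, x_exact

a1 = [-1.0, 0.0, -0.5]      # illustrative nonpositive diagonals (dissipative)
a2 = [0.0, -2.0, -0.5]
x0 = [1.0, 1.0, 1.0]
T1, T2, x_steps, x_closed, x_exact = commuting_example(
    a1, a2, x0, T=1.0, K=1000, rng=random.Random(0))
print(T1, T2)
print(x_closed, x_exact)
```

In the third component the two diagonals coincide, so the exponent equals $-(T_1 + T_2) = -T$ for every realization and the randomization introduces no error there; the first two components deviate from the exact solution by an amount governed by $T_1 - T/2$.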

We now present the main results aimed at the LQ optimal control problem constrained by randomized dynamics. Because the optimal control $u^*_h({\varvec{\omega }},t)$ depends on the selected indices ${\varvec{\omega }}$, we need the following result. The key difference with the first main result is that the input $u_h({\varvec{\omega }},t)$ may now depend on the randomly selected indices ${\varvec{\omega }}$. As will be explained at the start of Sect. 3, this makes the arguments for the convergence of the RBM in [14, 15] break down.

Note that replacing u(t) in (1) and (13) by $u_h({\varvec{\omega }},t)$ results in solutions $x({\varvec{\omega }},t)$ and $x_h({\varvec{\omega }},t)$ that now both depend on the selected indices ${\varvec{\omega }}$. The second main result now gives a bound for the expected value of the difference

$$\begin{aligned} e_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x({\varvec{\omega }},t). \end{aligned}$$

(25)

Main result 2

Consider any control $u_h : \Omega ^K \rightarrow L^2(0,T ; {\mathbb {R}}^q)$. Assume that Assumptions 1 and 2 are satisfied and let U be such that $|u_h({\varvec{\omega }}) |_{L^2(0,T; {\mathbb {R}}^q)} \le U$ for all ${\varvec{\omega }} \in \Omega ^K$, then

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le C_{[A,B,x_0,T,U]} h \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(26)

The second result follows directly from Theorem 2 in Sect. 3.3.

Just as in the first main result, the expectation is taken over all possible values of ${\varvec{\omega }} \in \Omega ^K$ and the constant $C_{[A,B,x_0,T,U]}$ does not depend on the chosen submatrices $A_m$ in (5), the chosen probabilities $p_{\omega }$, and the used temporal grid.

Using this result, we can now obtain a no-gap result which shows that the minimal value of the cost functional $J_h({\varvec{\omega }}, u_h^*({\varvec{\omega }}))$ is (in expectation) close to the minimal value $J(u^*)$ in the original problem when $h \mathrm {Var}[{\mathcal {A}}]$ is small enough.

Main result 3

Let $u^*(t)$ be the control that minimizes the cost functional J(u) in (2) and let $u_h^*({\varvec{\omega }},t)$ be the control that minimizes the cost functional $J_h({\varvec{\omega }}, u)$ in (14). Then

$$\begin{aligned} {\mathbb {E}}[|J_h(u^*_h) - J(u^*) |] \le C_{[A,B,x_0,Q,R,x_d,T]} \left( \sqrt{h \mathrm {Var}[{\mathcal {A}}]} + h \mathrm {Var}[{\mathcal {A}}] \right) . \end{aligned}$$

(27)

The third main result is identical to Theorem 3 in Sect. 3.4.

For $h\mathrm {Var}[{\mathcal {A}}]$ small enough, Main result 3 clearly implies that ${\mathbb {E}}[|J_h(u^*_h) - J(u^*) |] \le C_{[A,B,x_0,Q,R,x_d,T]} \sqrt{h \mathrm {Var}[{\mathcal {A}}]}$, which is also the rate that is observed in numerical experiments. We keep the second term on the RHS of (27) to assure that the estimate is valid for all values of $h\mathrm {Var}[{\mathcal {A}}]$, and not just for sufficiently small values of $h\mathrm {Var}[{\mathcal {A}}]$.

By Markov’s inequality, this result thus implies that, for any $\varepsilon > 0$, the probability that $|J_h({\varvec{\omega }}, u^*_h({\varvec{\omega }})) - J(u^*) |> \varepsilon $ can be made arbitrarily small by reducing the temporal grid spacing h.

The next main result shows that the optimal control for the RBM-problem $u_h^*({\varvec{\omega }})$ also converges (in expectation) to the optimal control of the original problem $u^*$ when $h \rightarrow 0$.

Main result 4

Let $u_h^*({\varvec{\omega }},t)$ be the minimizer of $J_h({\varvec{\omega }},\cdot )$ in (14) and $u^*(t)$ be the minimizer of J in (2), then

$$\begin{aligned} {\mathbb {E}}[|u_h^* - u^*|_{L^2(0,T; {\mathbb {R}}^q)}^2] \le C_{[A,B,x_0,Q,R,x_d,T]} h \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(28)

The fourth main result follows directly from Theorem 4 in Sect. 3.5.

The fourth main result justifies the use of the control $u^*_h({\varvec{\omega }})$, which is optimized for the RBM-dynamics, to control the original dynamics, as proposed in [18]. An almost immediate corollary of Main result 4 is that the trajectories of the original dynamics (1) resulting from the controls $u^*_h({\varvec{\omega }},t)$ and $u^*(t)$ will also be close to each other, see Corollary 2 in Sect. 3.5. This further justifies the strategy in [18].

When the control $u^*_h({\varvec{\omega }})$ is close to the control $u^*$ that is optimal for the original dynamics, the performance $J(u^*_h({\varvec{\omega }}))$ should also be close to the optimal performance $J(u^*)$. This idea is formalized by the fifth and last main result.

Main result 5

Let $u^*(t)$ be the control that minimizes the cost functional J(u) in (2) and let $u_h^*({\varvec{\omega }},t)$ be the control that minimizes the cost functional $J_h({\varvec{\omega }}, u)$ in (14). Then

$$\begin{aligned} {\mathbb {E}}[|J(u^*_h) - J(u^*) |] \le C_{[A,B,x_0,Q,R,x_d,T]}h \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(29)

The fifth main result is identical to Corollary 3 in Sect. 3.5. Main result 5 is proven as a corollary of Main result 4/Theorem 4.

The fifth main result is particularly important because it shows that the performance $J(u^*_h({\varvec{\omega }}))$ obtained with control $u^*_h({\varvec{\omega }})$ optimized for the randomized dynamics is close to the optimal performance $J(u^*)$ when $h\mathrm {Var}[{\mathcal {A}}]$ is sufficiently small. This further motivates strategies in which the original system is controlled by a control $u^*_h({\varvec{\omega }})$ that is optimized for the randomized dynamics, as was proposed in [18].

2.3 Further examples for $\mathrm {Var}[{\mathcal {A}}]$ and computational cost

The quantity $\mathrm {Var}[{\mathcal {A}}]$ describes how the derived estimates depend on the decomposition of A into submatrices and the selected probabilities $p_1, p_2, \ldots , p_{2^M}$. We therefore present two other examples that illustrate how $\mathrm {Var}[{\mathcal {A}}]$ depends on the decomposition of A into submatrices $A_m$ and the selected probabilities $p_{\omega }$.

Example 2

We decompose the matrix A into $M = 3$ parts $A = A_1 + A_2 + A_3$ and consider two choices for the probabilities $p_\omega $. In the first case, we only use one of the submatrices $A_m$ simultaneously. We thus assign probabilities $p_1 = p_2 = p_3 = \tfrac{1}{3}$ to the subsets $S_1 = \{ 1\}$, $S_2 = \{ 2 \}$, and $S_3 = \{ 3 \}$ and zero probability to the other 5 subsets of $\{1,2,3 \}$. We then have that $\pi _1 = \pi _2 = \pi _3 = \tfrac{1}{3}$ and the variance $\mathrm {Var}[{\mathcal {A}}]$ in (17) becomes

$$\begin{aligned} \mathrm {Var}[{\mathcal {A}}]&= \Vert 3A_1 - A \Vert ^2p_1 + \Vert 3A_2 - A \Vert ^2p_2 + \Vert 3A_3 - A \Vert ^2p_3 \nonumber \\&= \tfrac{1}{3} \left( \Vert 2 A_1 - A_2 - A_3 \Vert ^2 + \Vert 2 A_2 - A_1 - A_3 \Vert ^2 + \Vert 2 A_3 - A_1 - A_2 \Vert ^2\right) . \end{aligned}$$

(30)

In the second case, we always use two of the three submatrices $A_m$ simultaneously. We thus assign probabilities $p_4 = p_5 = p_6 = \tfrac{1}{3}$ to the subsets $S_4 = \{ 1,2\}$, $S_5 = \{ 2,3 \}$, and $S_6 = \{ 1,3 \}$ and zero probability to the other 5 subsets of $\{1,2,3 \}$. We then have that $\pi _1 = p_4 + p_6$, $\pi _2 = p_4 + p_5$, and $\pi _3 = p_5 + p_6$, so that $\pi _1 = \pi _2 = \pi _3 = \tfrac{2}{3}$. The variance $\mathrm {Var}[{\mathcal {A}}]$ in (17) becomes

$$\begin{aligned} \mathrm {Var}[{\mathcal {A}}]&= \Vert \tfrac{3}{2}(A_1+A_2) - A \Vert ^2p_4 + \Vert \tfrac{3}{2}(A_2+A_3) - A \Vert ^2p_5 + \Vert \tfrac{3}{2}(A_1+A_3) - A \Vert ^2p_6 \nonumber \\&= \tfrac{1}{3} \left( \Vert \tfrac{1}{2} (A_1 + A_2) - A_3 \Vert ^2 + \Vert \tfrac{1}{2} (A_2 + A_3) - A_1 \Vert ^2 + \Vert \tfrac{1}{2} (A_1 + A_3) - A_2 \Vert ^2\right) . \end{aligned}$$

(31)

Observe that $\Vert \tfrac{1}{2} (A_1 + A_2) - A_3 \Vert ^2 = \tfrac{1}{4} \Vert 2 A_3 - A_1 - A_2 \Vert ^2$ and that similar expressions relate the other terms in (30) and (31). The variance for the first case in (30) is thus four times larger than the variance for the second case in (31). Increasing the overlap between the possible values of ${\mathcal {A}}_h({\varvec{\omega }},t)$ thus reduces $\mathrm {Var}[{\mathcal {A}}]$ and will improve the accuracy of the proposed method. It is worth noting that similar observations have been made for domain decomposition methods, for which it is well-known that increasing the overlap between subdomains increases the convergence rate (see, e.g., [8, Section 1.5]). Note however that increasing the overlap will also reduce the sparsity of ${\mathcal {A}}_h(t)$ and thus also increase the computational cost. This will be illustrated further in Example 4 and the numerical examples in Sect. 4.
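The factor four can be checked numerically from the definition of $\mathrm {Var}[{\mathcal {A}}]$. The following Python sketch is an illustration only: the three symmetric $2 \times 2$ submatrices are hypothetical, and the helper names (`op_norm`, `variance`) are not taken from the paper. A symmetric matrix is stored as a triple $(a, b, c)$ representing $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$, so that the operator norm is available in closed form, and $\pi _m$ is computed as the total probability of the subsets containing m.

```python
import math

def add(X, Y):
    """Entrywise sum of two symmetric 2x2 matrices stored as (a, b, c)."""
    return tuple(x + y for x, y in zip(X, Y))

def scale(s, X):
    """Multiply a symmetric 2x2 matrix stored as (a, b, c) by the scalar s."""
    return tuple(s * x for x in X)

def op_norm(X):
    """Operator norm of [[a, b], [b, c]]: the largest absolute eigenvalue."""
    a, b, c = X
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return max(abs(mean + radius), abs(mean - radius))

def variance(mats, probs):
    """Var[A] = sum_omega p_omega * || sum_{m in S_omega} A_m / pi_m - A ||^2.

    mats  : list of submatrices A_1, ..., A_M
    probs : dict mapping a subset (tuple of 0-based indices) to its probability;
            subsets with zero probability are simply left out
    """
    M = len(mats)
    # pi_m is the total probability of the subsets that contain index m
    pi = [sum(p for S, p in probs.items() if m in S) for m in range(M)]
    A = (0.0, 0.0, 0.0)
    for X in mats:
        A = add(A, X)
    var = 0.0
    for S, p in probs.items():
        A_rand = (0.0, 0.0, 0.0)
        for m in S:
            A_rand = add(A_rand, scale(1.0 / pi[m], mats[m]))
        var += p * op_norm(add(A_rand, scale(-1.0, A))) ** 2
    return var

# Hypothetical symmetric negative semi-definite submatrices A_1, A_2, A_3
mats = [(-2.0, 1.0, -1.0), (-1.0, 0.0, -3.0), (-0.5, -0.5, -0.5)]

singles = {(0,): 1 / 3, (1,): 1 / 3, (2,): 1 / 3}      # first case, cf. (30)
pairs = {(0, 1): 1 / 3, (1, 2): 1 / 3, (0, 2): 1 / 3}  # second case, cf. (31)

var_single = variance(mats, singles)
var_pair = variance(mats, pairs)
print(var_single, var_pair, var_single / var_pair)
```

The printed ratio should equal 4 up to rounding, independently of the particular submatrices, because each norm in (31) is exactly half of the corresponding norm in (30).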

Example 3

It is not always optimal to choose uniform probabilities. To illustrate this, we assume $A=A_1+A_2$ has a block-diagonal decomposition

$$\begin{aligned} A = \begin{bmatrix} A_{11} &{}\quad 0 \\ 0 &{}\quad A_{22} \end{bmatrix}, \quad A_1 = \begin{bmatrix} A_{11} &{}\quad 0 \\ 0 &{}\quad 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 &{}\quad 0 \\ 0 &{}\quad A_{22} \end{bmatrix}. \end{aligned}$$

(32)

It is easy to verify that $\Vert \alpha A_1 + \beta A_2 \Vert = \max \{ |\alpha |\Vert A_1\Vert , |\beta |\Vert A_2\Vert \}$ for any $\alpha ,\beta \in {\mathbb {R}}$. We assign the (at this point undetermined) probability $p_1 = p$ to the subset $S_1 = \{ 1\}$, the probability $p_2 = 1-p$ to the subset $S_2 = \{ 2\}$, and probabilities $p_3 = p_4 = 0$ to the subsets $S_3 = \emptyset $ and $S_4 = \{1,2 \}$. It follows that $\pi _1 = p$ and $\pi _2 = 1-p$ and that

$$\begin{aligned} \mathrm {Var}[{\mathcal {A}}]&= \Vert \tfrac{1}{p}A_1 - A \Vert ^2 p + \Vert \tfrac{1}{1-p} A_2 - A \Vert ^2 (1-p) \nonumber \\&= \Vert \tfrac{1}{p}((1-p)A_1 - pA_2) \Vert ^2 p + \Vert \tfrac{1}{1-p} (pA_2 - (1-p)A_1) \Vert ^2 (1-p) \nonumber \\&= \Vert (1-p)A_1 - pA_2 \Vert ^2 \left( \tfrac{1}{p} + \tfrac{1}{1-p} \right) = \left\| \sqrt{\tfrac{1-p}{p}}A_1 + \sqrt{\tfrac{p}{1-p}}A_2 \right\| ^2 \nonumber \\&= \left( \max \left\{ \sqrt{\tfrac{1-p}{p}}\Vert A_1\Vert , \sqrt{\tfrac{p}{1-p}}\Vert A_2 \Vert \right\} \right) ^2. \end{aligned}$$

(33)

It is now easy to see that $\mathrm {Var}[{\mathcal {A}}]$ is minimal when $\sqrt{\tfrac{1-p}{p}}\Vert A_1\Vert = \sqrt{\tfrac{p}{1-p}}\Vert A_2 \Vert $. Solving this equation for p, we find the optimal probability

$$\begin{aligned} p^* = \frac{\Vert A_1\Vert }{\Vert A_1\Vert + \Vert A_2 \Vert }. \end{aligned}$$

(34)

We observe that the larger the submatrix $A_1$ is compared to $A_2$, the larger the probability p with which the submatrix $A_1$ is selected should be. Inserting the optimal probability $p^*$ in (34) into the expression for $\mathrm {Var}[{\mathcal {A}}]$, we find that

$$\begin{aligned} \mathrm {Var}[{\mathcal {A}}]^* = \Vert A_1 \Vert \Vert A_2 \Vert . \end{aligned}$$

(35)

With uniform probabilities, i.e., with $p = 1/2$, $\mathrm {Var}[{\mathcal {A}}] = \max \{ \Vert A_1 \Vert ^2, \Vert A_2 \Vert ^2 \}$, see (33). When $\Vert A_1 \Vert / \Vert A_2 \Vert \gg 1$ or $\Vert A_1 \Vert / \Vert A_2 \Vert \ll 1$, using the optimal probability $p^*$ in (34) can thus reduce $\mathrm {Var}[{\mathcal {A}}]$ significantly.
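The gain from the optimal probability is easy to tabulate. The following Python sketch evaluates the last expression in (33) directly; the norms $\Vert A_1 \Vert = 10$ and $\Vert A_2 \Vert = 1$ are hypothetical values chosen for illustration, and the brute-force scan over p serves only as a sanity check on (34).

```python
def variance(p, norm_A1, norm_A2):
    """Var[A] from (33) when S = {1} is selected with probability p."""
    return max((1.0 - p) / p * norm_A1**2, p / (1.0 - p) * norm_A2**2)

# Hypothetical norms with ||A_1|| much larger than ||A_2||
norm_A1, norm_A2 = 10.0, 1.0

p_star = norm_A1 / (norm_A1 + norm_A2)         # optimal probability (34)
var_star = variance(p_star, norm_A1, norm_A2)  # compare with ||A_1|| ||A_2|| in (35)
var_uniform = variance(0.5, norm_A1, norm_A2)  # max(||A_1||^2, ||A_2||^2)

# Brute-force scan over a grid in (0, 1)
grid = [k / 1000.0 for k in range(1, 1000)]
p_best = min(grid, key=lambda p: variance(p, norm_A1, norm_A2))

print(p_star, p_best, var_star, var_uniform)
```

For these values, the optimal choice reduces $\mathrm {Var}[{\mathcal {A}}]$ from 100 to 10, i.e., by the factor $\max \{ \Vert A_1 \Vert , \Vert A_2 \Vert \} / \min \{ \Vert A_1 \Vert , \Vert A_2 \Vert \}$.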

We conclude this section with two examples that illustrate the potential reduction in computational cost offered by the proposed randomized time-splitting method.

Example 4

Let $A \in {\mathbb {R}}^{N\times N}$ be a sparse symmetric negative semi-definite matrix with a bandwidth b, i.e. $[A]_{ij} = 0$ when $|i-j |> b$. Select $n_1,n_2,n_3 \in \{1,2,\ldots ,N \}$ such that $n_1 + n_2 + n_3 = N + 2b$. It is then possible to split A as $A = A_1 + A_2 + A_3$ with

$$\begin{aligned} A_1 = \begin{bmatrix} A_{11} &{}\quad 0 \\ 0 &{}\quad 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0_{n_1-b} &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad A_{22} &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0_{n_3-b} \end{bmatrix}, \quad A_3 = \begin{bmatrix} 0 &{}\quad 0 \\ 0 &{}\quad A_{33} \end{bmatrix}, \end{aligned}$$

(36)

where $A_{11} \in {\mathbb {R}}^{n_1 \times n_1}$, $A_{22} \in {\mathbb {R}}^{n_2 \times n_2}$, $A_{33} \in {\mathbb {R}}^{n_3 \times n_3}$, $0_n$ denotes an $n \times n$ zero matrix, and the 0’s denote zero matrices of appropriate size. We assign probabilities $p_1 = p_2 = p_3 = \tfrac{1}{3}$ to the subsets $S_1 = \{ 1\}$, $S_2 = \{ 2 \}$, and $S_3 = \{ 3 \}$ and zero probability to the other 5 subsets of $\{1,2,3 \}$. The computational cost for one time step with the matrix $A_1$ is $O(n_1^r)$, where $r \in [1,3]$ is a certain power that depends on b, the time discretization scheme, and the method used to solve the resulting linear systems. In particular, $r = 1$ when A is tridiagonal (i.e. when $b = 1$), $r = 3$ for an implicit time discretization scheme in which the resulting linear systems are solved by Gaussian elimination, and $r = 2$ for an implicit time discretization scheme in which the resulting linear systems are solved based on a precomputed Lower-Upper (LU) factorization. Similarly, the computational cost for one time step with the matrices $A_2$ or $A_3$ or with the full matrix A is $O(n_2^r)$ or $O(n_3^r)$ or $O(N^r)$, respectively. The proposed randomized time-splitting scheme is therefore expected to reduce the computational cost for one forward simulation (on the same temporal grid) by a factor

$$\begin{aligned} \frac{p_1n_1^r + p_2n_2^r + p_3n_3^r}{N^r}. \end{aligned}$$

(37)

When $b \ll N$, it is possible to choose $n_1 \approx n_2 \approx n_3 \approx N/3$, and the reduction in computational cost is then $\approx 1/3^r$. Note that the expected reduction in computational cost can only be observed when $n_1$, $n_2$, and $n_3$ are sufficiently large. As explained in Sect. 1, we expect that the computation of optimal controls is sped up by the same factor as the forward simulation.

Similarly as in the second case in Example 2, we also consider the situation in which the overlap is increased. We thus assign probabilities $p_4 = p_5 = p_6 = \tfrac{1}{3}$ to the subsets $S_4 = \{ 1,2\}$, $S_5 = \{ 2,3 \}$, and $S_6 = \{ 1,3 \}$ and zero probability to the other 5 subsets of $\{1,2,3 \}$. The cost of doing one time step with the matrices $A_1 + A_2$, $A_2 + A_3$, or $A_1 + A_3$ is then proportional to $(n_1 + n_2 -b)^r$, $(n_2 + n_3 - b)^r$, or $(n_1 + n_3)^r$, respectively. When $b \ll N$ and $n_1 \approx n_2 \approx n_3 \approx N/3$ the proposed randomized time-splitting scheme thus reduces the expected computational cost by a factor $2^r / 3^r$. Increasing the overlap thus increases the expected computational cost of the randomized time-splitting method by a factor $2^r$, but it also reduces $\mathrm {Var}[{\mathcal {A}}]$ by a factor 4, see Example 2. Choosing the level of overlap is thus a trade-off between accuracy and computational cost.
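The two cost factors can be evaluated for concrete sizes. The Python sketch below implements the ratio (37) for general block sizes; the function name `cost_factor` and the values of N, b, $n_1$, $n_2$, and $n_3$ are hypothetical choices that satisfy $n_1 + n_2 + n_3 = N + 2b$, and the model ignores constants hidden in the $O(\cdot )$ notation.

```python
def cost_factor(sizes, probs, N, r):
    """Expected cost of one randomized step relative to a step with A, cf. (37).

    sizes : effective sizes of the matrices that can be selected
    probs : matching selection probabilities
    """
    return sum(p * n**r for p, n in zip(probs, sizes)) / N**r

# Hypothetical problem size and bandwidth, with n1 + n2 + n3 = N + 2b
N, b = 3000, 1
n1, n2, n3 = 1000, 1002, 1000
third = [1 / 3, 1 / 3, 1 / 3]

for r in (1, 2, 3):
    # One submatrix at a time: sizes n1, n2, n3
    single = cost_factor([n1, n2, n3], third, N, r)
    # Two submatrices at a time: sizes n1+n2-b, n2+n3-b, n1+n3
    pair = cost_factor([n1 + n2 - b, n2 + n3 - b, n1 + n3], third, N, r)
    print(r, single, 1 / 3**r, pair, (2 / 3) ** r)
```

For $b \ll N$ the computed factors should lie close to $1/3^r$ and $2^r/3^r$, respectively.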

Example 5

When $A \in {\mathbb {R}}^{N \times N}$ is symmetric but not sparse, we can select $n_1, n_2, n_3 \in \{1,2, \ldots , N \}$ such that $n_1 + n_2 + n_3 = N$, and split A as $A = A_1 + A_2 + \cdots + A_6$ with

$$\begin{aligned}&A_1 = \begin{bmatrix} A_{11} &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad A_{22} &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad A_{33} \end{bmatrix}, \nonumber \\&A_4 = \begin{bmatrix} 0 &{}\quad A_{12} &{}\quad 0 \\ A_{21} &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{bmatrix}, \quad A_5 = \begin{bmatrix} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad A_{23} \\ 0 &{}\quad A_{32} &{}\quad 0 \end{bmatrix}, \quad A_6 = \begin{bmatrix} 0 &{}\quad 0 &{}\quad A_{13} \\ 0 &{}\quad 0 &{}\quad 0 \\ A_{31} &{}\quad 0 &{}\quad 0 \end{bmatrix}, \end{aligned}$$

(38)

where $A_{11} \in {\mathbb {R}}^{n_1 \times n_1}$, $A_{22} \in {\mathbb {R}}^{n_2 \times n_2}$, and $A_{33} \in {\mathbb {R}}^{n_3 \times n_3}$. The cost for doing one time step with $A_1$, $A_2$, or $A_3$ is $O(n_1^r)$, $O(n_2^r)$, and $O(n_3^r)$, respectively, with r as in Example 4. Similarly, the cost for doing one time step with $A_4$, $A_5$, or $A_6$ is $O((n_1+n_2)^r)$, $O((n_2+n_3)^r)$, and $O((n_1+n_3)^r)$, respectively. When we assign probabilities $\tfrac{1}{6}$ to the six singleton subsets of $\{1,2,\ldots , 6 \}$ and zero probability to all other subsets, the proposed randomized time-splitting scheme is expected to reduce the computational cost for one forward simulation (on the same temporal grid) by a factor

$$\begin{aligned} \frac{n_1^r + n_2^r + n_3^r + (n_1+n_2)^r + (n_2+n_3)^r + (n_1+n_3)^r}{6N^r} \approx \frac{1}{2}\left( \frac{1}{3^r} + \frac{2^r}{3^r} \right) , \end{aligned}$$

(39)

where the latter approximation holds when $n_1 \approx n_2 \approx n_3 \approx N/3$.
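As a worked evaluation of the right-hand side of (39), assuming $n_1 = n_2 = n_3 = N/3$ exactly and taking the three values of r mentioned in Example 4, we obtain

$$\begin{aligned} r = 1:&\quad \tfrac{1}{2}\left( \tfrac{1}{3} + \tfrac{2}{3} \right) = \tfrac{1}{2}, \nonumber \\ r = 2:&\quad \tfrac{1}{2}\left( \tfrac{1}{9} + \tfrac{4}{9} \right) = \tfrac{5}{18} \approx 0.28, \nonumber \\ r = 3:&\quad \tfrac{1}{2}\left( \tfrac{1}{27} + \tfrac{8}{27} \right) = \tfrac{1}{6} \approx 0.17. \end{aligned}$$

For comparison, the factor $1/3^r$ found for the sparse case in Example 4 takes the values $\tfrac{1}{3}$, $\tfrac{1}{9}$, and $\tfrac{1}{27}$, so the expected reduction in the dense case is smaller for each of these values of r.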

3 Convergence analysis

The proof of convergence for the RBM optimal control problem is divided into several stages.

In the first stage, we consider a control $u \in L^2(0,T; {\mathbb {R}}^q)$ that does not depend on the selected indices ${\varvec{\omega }}$. We then show that the expected difference between the RBM-dynamics (13) and the original dynamics (1) can be bounded in terms of $h \mathrm {Var}[{\mathcal {A}}]$ as in Main result 1. The proof of this statement is inspired by the results for interacting particle systems in [14, 15].

Because we will also need to deal with the optimal control $u^*_h({\varvec{\omega }},t)$ that minimizes $J_h({\varvec{\omega }}, \cdot )$, we consider a general family of controls $u_h({\varvec{\omega }},t)$ (with ${\varvec{\omega }} \in \Omega ^K$) in the second stage. This is a nontrivial extension of the results in the previous stage because the crucial idea in the proof for the first stage and in [14, 15] is that the solutions $x(t_{k-1})$ and $x_h({\varvec{\omega }},t_{k-1})$ do not depend on $\omega _k$ (the index that is used in the time interval $[t_{k-1}, t_k)$). This is clearly no longer the case when we insert an input $u_h({\varvec{\omega }},t)$ that depends on ${\varvec{\omega }}$, so also on $\omega _k$, into the dynamics (1) and (13). This problem is particularly clear when we consider the family of optimal controls $u_h^*({\varvec{\omega }})$ for which $u_h^*({\varvec{\omega }},t_{k-1})$ will depend on the choices for the ‘future’ indices $\omega _k, \omega _{k+1}, \ldots , \omega _K$.

In the third stage, we prove the no-gap condition presented in Main result 3. A crucial result for the proof is an auxiliary lemma (Lemma 1) that bounds the differences $J_h({\varvec{\omega }},u) - J(u)$ and $J_h({\varvec{\omega }},u_h({\varvec{\omega }})) - J(u_h({\varvec{\omega }}))$ (in expectation). For controls u that do not depend on ${\varvec{\omega }}$, a bound on $J_h({\varvec{\omega }},u) - J(u)$ can be obtained directly from Main result 1. For controls $u_h({\varvec{\omega }})$ that do depend on ${\varvec{\omega }}$, we need to use Main result 2 to find the bound on the expected difference $J_h({\varvec{\omega }},u_h({\varvec{\omega }})) - J(u_h({\varvec{\omega }}))$. For brevity, Lemma 1 considers controls $u_h({\varvec{\omega }})$ that depend on ${\varvec{\omega }}$ (which of course also covers the case in which the control does not depend on ${\varvec{\omega }}$). The no-gap condition (i.e., a bound on $J_h(u_h^*({\varvec{\omega }})) - J(u^*)$) can then be obtained using classical arguments from the calculus of variations and Lemma 1 applied to the optimal controls $u^*$ and $u_h^*({\varvec{\omega }})$.

In the fourth stage, we bound the difference between the RBM-optimal control $u^*_h({\varvec{\omega }})$ and the control $u^*$ optimized for the original dynamics. To this end, we first bound the expected difference between the gradients of $J_h({\varvec{\omega }}, \cdot )$ and J. The bound on the difference in the optimal controls then follows from classical arguments based on the $\alpha $-convexity of the functional $J_h({\varvec{\omega }},\cdot )$. Finally, the bound for the difference $J(u^*_h({\varvec{\omega }})) - J(u^*)$ follows easily from the previously derived bound on the difference between the optimal controls $u^*_h({\varvec{\omega }})$ and $u^*$.

The four stages discussed above will be proved in detail in Sects. 3.2–3.5. We first present some preliminaries in Sect. 3.1.

3.1 Preliminaries

We will use the following notation. The transpose of a real column vector x is denoted by $x^\top $. Similarly, the transpose of a real matrix A is denoted by $A^\top $. The entry in the i-th row and j-th column of A is denoted by $[A]_{ij}$. The standard Euclidean inner product of two vectors $x,y \in {\mathbb {R}}^N$ is denoted by $\langle x, y \rangle := x^\top y$. The corresponding norm is denoted by $|x |:= \sqrt{x^\top x}$. The (operator) norm of a matrix $A \in {\mathbb {R}}^{N \times N}$ is denoted by

$$\begin{aligned} \Vert A \Vert := \max _{|x |=1} |Ax |. \end{aligned}$$

(40)

We use $C_{[a,b, \ldots , d]}$ to denote a constant that only depends on the parameters $a,b, \ldots , d$. The value of $C_{[a,b, \ldots , d]}$ may vary from line to line. The $L^p$-norm of a function $u \in L^p(0,T; {\mathbb {R}}^q)$ (for $1 \le p < \infty $ and $p = \infty $) is defined as

$$\begin{aligned} |u |_{L^p(0,T;{\mathbb {R}}^q)} := \root p \of {\int _0^T |u(t) |^p \, \mathrm {d}t }, \quad |u |_{L^\infty (0,T;{\mathbb {R}}^q)} := \underset{t \in [0,T]}{\mathrm {ess\, sup}} \ |u(t) |. \end{aligned}$$

(41)

We now set up the precise probabilistic setting for our problem. The set $\Omega ^K$ defined in (10) is the natural sample space for the considered problem. To turn $\Omega ^K$ into a probability space, we assign a probability $p({\varvec{\omega }})$ to each ${\varvec{\omega }} \in \Omega ^K$ according to

$$\begin{aligned} p({\varvec{\omega }}) = p_{\omega _1} p_{\omega _2} \ldots p_{\omega _K}. \end{aligned}$$

(42)

Note that we use here that each index $\omega _k$ is chosen independently of the other indices $\omega _1, \omega _2, \ldots , \omega _{k-1}, \omega _{k+1}, \omega _{k+2}, \ldots , \omega _K$.

A random element on the sample space $\Omega ^K$ is a function $X : \Omega ^K \rightarrow V$ from the sample space $\Omega ^K$ to a vector space V. When $V = {\mathbb {R}}$, $X : \Omega ^K \rightarrow {\mathbb {R}}$ is also called a random variable. Note that we can embed V into $V^{\Omega ^K}$ by associating to each element $x \in V$ the constant function $X({\varvec{\omega }}) = x$ for all ${\varvec{\omega }} \in {\Omega ^K}$. Constant functions $X({\varvec{\omega }}) = x$ are called deterministic. Functions $X({\varvec{\omega }})$ that are not deterministic are called stochastic. The expectation operator ${\mathbb {E}}$ assigns to a random element $X \in V^{\Omega ^K}$ an element of the vector space V:

$$\begin{aligned} {\mathbb {E}}[X]&= \sum _{{\varvec{\omega }} \in \Omega ^K} X({\varvec{\omega }}) p({\varvec{\omega }}) \nonumber \\&= \sum _{\omega _1 =1}^{2^M} \sum _{\omega _2=1}^{2^M} \cdots \sum _{\omega _K =1}^{2^M} X(\omega _1, \omega _2, \ldots , \omega _K) p_{\omega _1} p_{\omega _2} \cdots p_{\omega _K}. \end{aligned}$$

(43)

It is immediate from this definition that ${\mathbb {E}}$ is linear. When $V = {\mathbb {R}}$, we also see that ${\mathbb {E}}[X] \ge 0$ when $X({\varvec{\omega }}) \ge 0$ for all ${\varvec{\omega }} \in \Omega ^K$.

Several random elements appear in the randomized splitting method outlined in Sect. 2.1. One example is the matrix ${\mathcal {A}}_h({\varvec{\omega }},t)$ defined in (11). When $t \in [t_{k-1}, t_k)$, ${\mathcal {A}}_h({\varvec{\omega }}, t)$ only depends on $\omega _k$. Therefore, the definitions in (43) and (11) show that (for $t \in [t_{k-1}, t_k)$)

$$\begin{aligned} {\mathbb {E}}[{\mathcal {A}}_h(t)]&= \sum _{{\varvec{\omega }} \in \Omega ^K} {\mathcal {A}}_h({\varvec{\omega }},t) p({\varvec{\omega }}) = \sum _{\omega _1 =1}^{2^M} \sum _{\omega _2=1}^{2^M} \cdots \sum _{\omega _K =1}^{2^M} \sum _{m \in S_{\omega _k}} \frac{A_m}{\pi _m} p_{\omega _1} p_{\omega _2} \cdots p_{\omega _K} \nonumber \\&= \sum _{\omega _k=1}^{2^M} \sum _{m \in S_{\omega _k}} \frac{A_m}{\pi _m} p_{\omega _k} = A, \end{aligned}$$

(44)

where the second to last identity follows from (8) and the last identity from (12). Again using that ${\mathcal {A}}_h({\varvec{\omega }},t)$ only depends on $\omega _k$ for $t \in [t_{k-1}, t_k)$, we also find that

$$\begin{aligned} {\mathbb {E}}[\Vert {\mathcal {A}}_h(t) - A \Vert ^2]&= \sum _{{\varvec{\omega }} \in \Omega ^K} \Vert {\mathcal {A}}_h({\varvec{\omega }},t) - A \Vert ^2 p({\varvec{\omega }}) \nonumber \\&= \sum _{\omega _k=1}^{2^M} \left\| \sum _{m \in S_{\omega _k}} \frac{A_m}{\pi _m} - A \right\| ^2 p_{\omega _k} = \mathrm {Var}[{\mathcal {A}}], \end{aligned}$$

(45)

where the last identity follows from the definition of $\mathrm {Var}[{\mathcal {A}}]$ in (17). Note that (45) holds for every time instant t and that ${\mathbb {E}}[\Vert {\mathcal {A}}_h(t) - A \Vert ^2]$ therefore does not depend on the considered time instant t.
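The identities (44) and (45) can be verified by brute-force enumeration of the sample space for a small example. The Python sketch below is illustrative: the diagonal splitting, the subsets, and their probabilities are hypothetical, each matrix is stored by its diagonal (so that the operator norm is the largest absolute entry), and $\pi _m$ is computed as the total probability of the subsets containing m, as in Example 2.

```python
import itertools

# Hypothetical splitting A = A_1 + A_2 of a diagonal matrix (diagonals only)
A_parts = [[-2.0, 0.0, -1.0], [0.0, -3.0, -1.0]]
A = [sum(col) for col in zip(*A_parts)]

# Subsets S_omega (0-based indices m) and their probabilities p_omega
subsets = [(0,), (1,), (0, 1)]
p = [0.5, 0.3, 0.2]
# pi_m: total probability of the subsets that contain index m
pi = [sum(p[w] for w, S in enumerate(subsets) if m in S) for m in range(len(A_parts))]

def A_h(w):
    """Diagonal of sum_{m in S_w} A_m / pi_m."""
    return [sum(A_parts[m][i] / pi[m] for m in subsets[w]) for i in range(len(A))]

def norm_diff_sq(w):
    """|| A_h - A ||^2 for diagonal matrices."""
    return max(abs(x - a) for x, a in zip(A_h(w), A)) ** 2

K, k = 3, 1  # K subintervals; consider a time t in the subinterval with 0-based index k
mean = [0.0] * len(A)
second_moment = 0.0
for omega in itertools.product(range(len(subsets)), repeat=K):
    prob = 1.0
    for w in omega:
        prob *= p[w]  # p(omega) = p_{omega_1} ... p_{omega_K}, see (42)
    mean = [mu + prob * x for mu, x in zip(mean, A_h(omega[k]))]
    second_moment += prob * norm_diff_sq(omega[k])

var_A = sum(p[w] * norm_diff_sq(w) for w in range(len(subsets)))
print(mean, A)               # sum over Omega^K versus A, cf. (44)
print(second_moment, var_A)  # sum over Omega^K versus Var[A], cf. (45)
```

Changing `k` does not change either result, in line with the observation that (45) does not depend on the considered time instant.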

Another random element is the solution $x_h : \Omega ^K \rightarrow L^2(0,T; {\mathbb {R}}^N)$ in (13). We will frequently use that $|x_h({\varvec{\omega }},t) |$ can be bounded as follows. First of all, observe that

$$\begin{aligned} \frac{d}{dt} |x_h({\varvec{\omega }},t) |^2 = 2\langle x_h({\varvec{\omega }},t), {\mathcal {A}}_h({\varvec{\omega }},t) x_h({\varvec{\omega }},t) + Bu(t) \rangle \le 2 |x_h({\varvec{\omega }},t) ||B u(t)|, \end{aligned}$$

(46)

where it was used that $\langle x, {\mathcal {A}}_h({\varvec{\omega }},t) x \rangle \le 0$ for all $x\in {\mathbb {R}}^N$ and ${\varvec{\omega }} \in \Omega ^K$ because of Assumption 1. Now observe that

$$\begin{aligned} \frac{d}{dt} |x_h({\varvec{\omega }},t) |= \frac{1}{2 |x_h({\varvec{\omega }},t)|} \frac{d}{dt} |x_h({\varvec{\omega }},t) |^2 \le |Bu(t) |, \end{aligned}$$

(47)

from which we conclude that

$$\begin{aligned} |x_h({\varvec{\omega }}) |_{L^\infty (0,T;{\mathbb {R}}^N)} \le |x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)}. \end{aligned}$$

(48)

For x(t), a similar derivation shows that

$$\begin{aligned} |x |_{L^\infty (0,T;{\mathbb {R}}^N)} \le |x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)}. \end{aligned}$$

(49)

We will also consider situations in which we apply an input $u_h({\varvec{\omega }},t)$ to the dynamics (1) and (13) that depends on ${\varvec{\omega }}$. The resulting solutions are then both random elements $x({\varvec{\omega }},t)$ and $x_h({\varvec{\omega }},t)$ which satisfy

$$\begin{aligned} {\dot{x}}({\varvec{\omega }},t)&= Ax({\varvec{\omega }},t) + Bu_h({\varvec{\omega }},t), \quad&x({\varvec{\omega }},0) = x_0, \end{aligned}$$

(50)

$$\begin{aligned} {\dot{x}}_h({\varvec{\omega }},t)&= {\mathcal {A}}_h({\varvec{\omega }},t)x_h({\varvec{\omega }},t) + Bu_h({\varvec{\omega }},t), \quad&x_h({\varvec{\omega }},0) = x_0. \end{aligned}$$

(51)

In this case we can obtain estimates similar to (48) and (49) with u and x replaced by $u_h({\varvec{\omega }})$ and $x({\varvec{\omega }})$, respectively.

The third important random element in this paper is the optimal control $u^*_h({\varvec{\omega }}, \cdot )$ that minimizes $J_h({\varvec{\omega }}, \cdot )$ in (14). The coercivity of the functional $J_h({\varvec{\omega }},\cdot )$ allows us to bound $|u^*_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^q)}$ as follows. Denote the smallest eigenvalue of the matrix R by $\alpha > 0$, then

$$\begin{aligned} \frac{\alpha }{2} |u^*_h({\varvec{\omega }}) |^2_{L^2(0,T;{\mathbb {R}}^q)} \le \frac{1}{2}\int _0^T u_h^*(t)^\top R u_h^*(t) \, \mathrm {d}t \le J_h({\varvec{\omega }}, u^*_h({\varvec{\omega }})) \le J_h({\varvec{\omega }}, 0), \end{aligned}$$

(52)

where the last inequality follows because $u^*_h({\varvec{\omega }})$ is the minimizer of $J_h({\varvec{\omega }},\cdot )$. Next, observe that

$$\begin{aligned} J_h({\varvec{\omega }}, 0)&\le \frac{1}{2}\int _0^T (x_h({\varvec{\omega }},t)-x_d(t))^\top Q (x_h({\varvec{\omega }},t)-x_d(t)) \, \mathrm {d}t \nonumber \\&\le \tfrac{1}{2}\Vert Q \Vert \left( |x_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)} + |x_d |_{L^2(0,T;{\mathbb {R}}^N)} \right) ^2 \nonumber \\&\le \tfrac{1}{2}\Vert Q \Vert \left( \sqrt{T} |x_0 |+ |x_d |_{L^2(0,T;{\mathbb {R}}^N)} \right) ^2 = C_{[x_0, Q, x_d,T]}, \end{aligned}$$

(53)

where $x_h({\varvec{\omega }},t)$ denotes the solution of (13) with $u(t) = 0$ and the last inequality follows from (48) together with $|x_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)} \le \sqrt{T} |x_h({\varvec{\omega }}) |_{L^\infty (0,T;{\mathbb {R}}^N)}$. Looking back at (52), we find

$$\begin{aligned} |u^*_h({\varvec{\omega }}) |^2_{L^2(0,T;{\mathbb {R}}^q)} \le C_{[x_0, Q, R, x_d, T]}. \end{aligned}$$

(54)

Finally, we repeat some standard definitions from the theory of convex optimization, see, e.g., [22]. A functional $J : V \rightarrow {\mathbb {R}}$ on a normed vector space V is $\alpha $-convex if there exists an $\alpha \ge 0$ such that for all $u, v \in V$ and $\theta \in [0, 1]$

$$\begin{aligned} J((1 -\theta )u + \theta v) \le (1 -\theta )J(u) + \theta J(v) - \tfrac{\alpha }{2} \theta (1-\theta ) |u -v |_V^2. \end{aligned}$$

(55)

One can easily verify that the functional $J_h({\varvec{\omega }}, \cdot )$ is $\alpha $-convex (for all ${\varvec{\omega }} \in \Omega ^K$) when we take $\alpha $ as the smallest eigenvalue of the positive definite matrix R. The Gâteaux-derivative of J at the point u in the direction v is denoted by $\delta J(u; v)$, i.e.

$$\begin{aligned} \delta J(u; v) := \lim _{h \rightarrow 0} \frac{J(u+hv) - J(u)}{h}. \end{aligned}$$

(56)

By subtracting J(u) from both sides of (55), dividing the resulting inequality by $\theta $, and then taking the limit $\theta \rightarrow 0$, we find the well-known inequality

$$\begin{aligned} J(v) \ge J(u) + \delta J(u; v - u) + \tfrac{\alpha }{2} |v-u |_V^2. \end{aligned}$$

(57)
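Written out, the intermediate step reads as follows (a sketch that assumes the Gâteaux derivative of J at u in the direction $v - u$ exists). Since $(1-\theta )u + \theta v = u + \theta (v-u)$, subtracting J(u) from both sides of (55) and dividing by $\theta \in (0,1]$ gives

$$\begin{aligned} \frac{J(u + \theta (v-u)) - J(u)}{\theta } \le J(v) - J(u) - \tfrac{\alpha }{2} (1-\theta ) |u -v |_V^2. \end{aligned}$$

The left-hand side converges to $\delta J(u; v-u)$ as $\theta \rightarrow 0$, and rearranging the resulting inequality yields (57).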

3.2 The forward dynamics with a deterministic input

In this subsection, we consider a deterministic u(t) and derive a bound for the error

$$\begin{aligned} e_h({\varvec{\omega }},t) := x_h({\varvec{\omega }},t) - x(t), \end{aligned}$$

(58)

where $x_h({\varvec{\omega }},t)$ and x(t) are the solutions of (13) and (1) resulting from the same input u(t), respectively.

Remark 6

It is important to stress that $x_h(t)$ is not an unbiased estimator for x(t), i.e., we do not have ${\mathbb {E}}[e_h(t)] = {\mathbb {E}}[x_h(t)] - x(t) = 0$. This can for example be observed when we write the error dynamics as

$$\begin{aligned} {\dot{e}}_h({\varvec{\omega }},t)&= {\mathcal {A}}_h({\varvec{\omega }},t)x_h({\varvec{\omega }},t) + Bu(t) - Ax(t) - Bu(t) \nonumber \\&= Ae_h({\varvec{\omega }},t) + ({\mathcal {A}}_h({\varvec{\omega }},t) - A)x_h({\varvec{\omega }},t), \quad e_h({\varvec{\omega }},0) = 0, \end{aligned}$$

(59)

where we have substituted $x(t) =x_h({\varvec{\omega }},t) - e_h({\varvec{\omega }},t)$. Taking the expected value in (59) we find

$$\begin{aligned} \frac{d}{dt}{\mathbb {E}}[e_h(t)] = A {\mathbb {E}}[e_h(t)] + {\mathbb {E}}[({\mathcal {A}}_h(t) - A)x_h(t)], \quad {\mathbb {E}}[e_h(0)] = 0. \end{aligned}$$

(60)

However, (60) does not imply that ${\mathbb {E}}[e_h(t)] = 0$ for all t because generally

$$\begin{aligned} {\mathbb {E}}[({\mathcal {A}}_h(t)-A)x_h(t)] \ne {\mathbb {E}}[{\mathcal {A}}_h(t) - A] {\mathbb {E}} [x_h(t)] = 0, \end{aligned}$$

(61)

where the equality follows because ${\mathbb {E}}[{\mathcal {A}}_h(t)] = A$, see (44). The two sides of (61) would be equal if ${\mathcal {A}}_h({\varvec{\omega }},t)$ and $x_h({\varvec{\omega }},t)$ were independent, but they are correlated by the dynamics (13). Note, however, that at the beginning of each time interval $[t_{k-1}, t_k)$, the value of ${\mathcal {A}}_h({\varvec{\omega }},t)$ changes and that ${\mathcal {A}}_h({\varvec{\omega }},t_{k-1})$ is independent of the values of ${\mathcal {A}}_h({\varvec{\omega }},t)$ for $t < t_{k-1}$ so that

$$\begin{aligned} {\mathbb {E}}[({\mathcal {A}}_h(t_{k-1})-A)x_h(t_{k-1})] = {\mathbb {E}}[{\mathcal {A}}_h(t_{k-1}) - A] {\mathbb {E}} [x_h(t_{k-1})] = 0, \end{aligned}$$

(62)

where the second identity again follows because ${\mathbb {E}}[{\mathcal {A}}_h(t)] = A$, see (44). This observation is crucial to obtain the main result of this subsection.
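A scalar example makes the bias explicit. As an illustrative assumption (not a case treated in the text), let $N = 1$, $u = 0$, $x_0 > 0$, and $A = a_1 + a_2$ with $a_1, a_2 \le 0$, and select $S = \{ 1 \}$ or $S = \{ 2 \}$ with probability $\tfrac{1}{2}$ each, so that ${\mathcal {A}}_h$ equals $2a_1$ or $2a_2$ on each subinterval. Because the indices $\omega _1, \ldots , \omega _K$ are independent, the expectation of the product of the K exponential factors is the product of their expectations, and the convexity of the exponential function gives

$$\begin{aligned} {\mathbb {E}}[x_h(t_K)] = \left( \tfrac{1}{2} e^{2a_1 h} + \tfrac{1}{2} e^{2a_2 h} \right) ^K x_0 \ge \left( e^{(a_1 + a_2)h} \right) ^K x_0 = e^{A t_K} x_0 = x(t_K), \end{aligned}$$

with strict inequality whenever $a_1 \ne a_2$. In this example, ${\mathbb {E}}[e_h(t_K)] > 0$ even though ${\mathbb {E}}[{\mathcal {A}}_h(t)] = A$.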

The main result in this subsection is the following.

Theorem 1

Assume that the input u(t) in (13) is deterministic and equal to the input u(t) in (1) and that Assumptions 1 and 2 hold, then

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le h \mathrm {Var}[{\mathcal {A}}] (\Vert A\Vert t^2 + 2t) (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2. \end{aligned}$$

(63)
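The estimate (63) can be sanity-checked in a case where the expectation is computable exactly. The Python sketch below is an illustration under stated assumptions and not one of the numerical experiments of the paper: it takes $N = 1$, $u = 0$, and $A = a_1 + a_2$ with hypothetical coefficients $a_1, a_2 \le 0$, and selects $2a_1$ or $2a_2$ with probability $\tfrac{1}{2}$ on each subinterval. The number of subintervals on which $2a_1$ is used is then binomially distributed, so the expectation is a finite sum.

```python
import math

def mean_square_error(a1, a2, T, K, x0=1.0):
    """Exact E[|x_h(T) - x(T)|^2] for x' = (a1 + a2) x with u = 0 (scalar case).

    On each of the K subintervals, 2*a1 or 2*a2 is used with probability 1/2,
    i.e. S = {1} or S = {2} with pi_1 = pi_2 = 1/2.
    """
    h = T / K
    x_exact = math.exp((a1 + a2) * T) * x0
    mse = 0.0
    for j in range(K + 1):  # j = number of subintervals that use 2*a1
        prob = math.comb(K, j) / 2**K
        x_rand = math.exp(h * (2 * a1 * j + 2 * a2 * (K - j))) * x0
        mse += prob * (x_rand - x_exact) ** 2
    return mse

def bound(a1, a2, T, K, x0=1.0):
    """Right-hand side of (63) at t = T with u = 0."""
    h = T / K
    var_A = (a1 - a2) ** 2  # 0.5*(2*a1 - a)^2 + 0.5*(2*a2 - a)^2 with a = a1 + a2
    return h * var_A * (abs(a1 + a2) * T**2 + 2 * T) * x0**2

a1, a2, T = -1.0, -0.2, 1.0  # hypothetical dissipative coefficients
for K in (10, 20, 40, 80):
    print(K, mean_square_error(a1, a2, T, K), bound(a1, a2, T, K))
```

In this example, the mean square error should stay below the bound and roughly halve each time K is doubled, consistent with the first-order dependence on h in (63).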

Proof

Observe that

$$\begin{aligned} {\dot{e}}_h({\varvec{\omega }},t)&= {\mathcal {A}}_h({\varvec{\omega }},t)x_h({\varvec{\omega }},t) + Bu(t) - Ax(t) - Bu(t) \nonumber \\&= {\mathcal {A}}_h({\varvec{\omega }},t ) e_h({\varvec{\omega }},t) + ({\mathcal {A}}_h({\varvec{\omega }},t) - A)x(t), \quad e_h({\varvec{\omega }},0) = 0, \end{aligned}$$

(64)

where the last equality follows after substituting $x_h({\varvec{\omega }},t) = x(t) + e_h({\varvec{\omega }},t)$.

Fix $t \in [0, T]$ and let $k \le K$ be such that $t \in [t_{k-1}, t_k)$.

Consider an arbitrary time instant $s \in [0,t)$ and let $\ell \in \{1,2, \ldots , k \}$ be such that $s \in [t_{\ell -1}, t_\ell )$. Then (64) shows that

$$\begin{aligned}&\frac{d}{ds} |e_h({\varvec{\omega }},s)|^2 = 2 \langle e_h({\varvec{\omega }},s), {\mathcal {A}}_h({\varvec{\omega }},s) e_h({\varvec{\omega }},s) \rangle + 2 \langle e_h({\varvec{\omega }},s), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x(s) \rangle \nonumber \\&\quad = 2 \langle e_h({\varvec{\omega }},s), {\mathcal {A}}_h({\varvec{\omega }},s) e_h({\varvec{\omega }},s) \rangle + 2 \langle e_h({\varvec{\omega }},t_{\ell -1}), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x(s) \rangle \nonumber \\&\qquad + 2 \langle \Delta e_h({\varvec{\omega }},s), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x(s) \rangle , \end{aligned}$$

(65)

where, in the second equality, we have introduced

$$\begin{aligned} \Delta e_h({\varvec{\omega }},s) := e_h({\varvec{\omega }},s) - e_h({\varvec{\omega }},t_{\ell -1}). \end{aligned}$$

(66)

The first term on the RHS of (65) is nonpositive due to Assumption 1. We thus find after taking the expected value in (65) that

$$\begin{aligned} \frac{d}{ds} {\mathbb {E}}[ |e_h(s) |^2 ]&\le 2 {\mathbb {E}}[ \langle e_h(t_{\ell -1}), ({\mathcal {A}}_h(s) - A) x(s) \rangle ] \nonumber \\&\quad + 2{\mathbb {E}}[\langle \Delta e_h(s), ({\mathcal {A}}_h(s) - A) x(s) \rangle ]. \end{aligned}$$

(67)

For the first term on the RHS of (67), observe that $e_h({\varvec{\omega }},t_{\ell -1}) = x_h({\varvec{\omega }},t_{\ell -1}) - x(t_{\ell -1})$ only depends on $\omega _1, \ldots , \omega _{\ell -1}$, so that

$$\begin{aligned}&{\mathbb {E}}[ \langle e_h(t_{\ell -1}), ({\mathcal {A}}_h(s) - A) x(s) \rangle ] = \sum _{{\varvec{\omega }} \in \Omega ^K} \langle e_h({\varvec{\omega }},t_{\ell -1}), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x(s) \rangle p({\varvec{\omega }}) \nonumber \\&\quad = \sum _{\omega _1=1}^{2^M} \cdots \sum _{\omega _{\ell -1}=1}^{2^M} \sum _{\omega _\ell =1}^{2^M} \bigg \langle e_h({\varvec{\omega }}, t_{\ell -1}) , \left( \sum _{m \in S_{\omega _\ell }} \frac{A_m}{\pi _m} - A \right) x(s) \bigg \rangle p_{\omega _1} \cdots p_{\omega _{\ell -1}}p_{\omega _\ell } \nonumber \\&\quad = \sum _{\omega _1=1}^{2^M} \cdots \sum _{\omega _{\ell -1}=1}^{2^M} \bigg \langle e_h({\varvec{\omega }}, t_{\ell -1}), \left( \sum _{\omega _\ell =1}^{2^M} \sum _{m \in S_{\omega _\ell }} \frac{A_m}{\pi _m}p_{\omega _\ell } - A \right) x(s) \bigg \rangle p_{\omega _1} \cdots p_{\omega _{\ell -1}} \nonumber \\&\quad = 0, \end{aligned}$$

(68)

where the second identity uses (8), the third identity follows from (8) and the fact that $e_h({\varvec{\omega }}, t_{\ell -1})$ does not depend on $\omega _\ell $, and the last identity follows because (12) shows that the factor between round brackets vanishes.

For the second term on the RHS of (67), we use that

$$\begin{aligned}&{\mathbb {E}}[ \langle \Delta e_h(s), ({\mathcal {A}}_h(s) - A) x(s) \rangle ] \le {\mathbb {E}}[|\Delta e_h(s)|\Vert {\mathcal {A}}_h(s) - A \Vert |x(s) |] \nonumber \\&\quad \le \sqrt{{\mathbb {E}}[|\Delta e_h(s) |^2] {\mathbb {E}}[\Vert {\mathcal {A}}_h(s) - A\Vert ^2 |x(s) |^2]} = \sqrt{{\mathbb {E}}[|\Delta e_h(s) |^2]} \sqrt{\mathrm {Var}[{\mathcal {A}}]} |x(s) |\nonumber \\&\quad \le \sqrt{{\mathbb {E}}[|\Delta e_h(s) |^2]} \sqrt{\mathrm {Var}[{\mathcal {A}}]} (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)}), \end{aligned}$$

(69)

where the first inequality follows from the Cauchy–Schwarz inequality in ${\mathbb {R}}^N$, the second inequality from the Cauchy–Schwarz inequality in the probability space, and the last inequality follows from (49).

We now claim that

$$\begin{aligned} {\mathbb {E}}[|\Delta e_h(s) |^2] \le h^2 \mathrm {Var}[{\mathcal {A}}] (\Vert A\Vert s + 1)^2(|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2. \end{aligned}$$

(70)

We will prove (70) at the end of the proof. Inserting the claim (70) into (69), we find

$$\begin{aligned} {\mathbb {E}}[ \langle \Delta e_h(s), ({\mathcal {A}}_h(s) - A) x(s) \rangle ] \le h \mathrm {Var}[{\mathcal {A}}] (\Vert A\Vert s + 1) (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2. \end{aligned}$$

(71)

Inserting (68) and (71) into (67) shows that

$$\begin{aligned} \frac{d}{ds} {\mathbb {E}}[ |e_h(s)|^2 ] \le 2 h \mathrm {Var}[{\mathcal {A}}](\Vert A\Vert s + 1) ( |x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2. \end{aligned}$$

(72)

Integrating (72) from $s = 0$ to $s = t$ using that $e_h({\varvec{\omega }},0) = 0$ now shows that

$$\begin{aligned} {\mathbb {E}}[ |e_h(t) |^2 ] \le h \mathrm {Var}[{\mathcal {A}}] (\Vert A\Vert t^2 + 2t) (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2, \end{aligned}$$

(73)

which is the desired estimate (63).
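For completeness, the only s-dependent factor in the bound on $\tfrac{d}{ds} {\mathbb {E}}[ |e_h(s)|^2 ]$ is $2(\Vert A\Vert s + 1)$, and its integral is elementary:

$$\begin{aligned} \int _0^t 2 (\Vert A\Vert s + 1) \, \mathrm {d}s = \Vert A\Vert t^2 + 2t. \end{aligned}$$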

It thus remains to show that (70) holds. Recall that, for $\tau \in [t_{\ell -1}, s)$, (66) shows that $\Delta e_h({\varvec{\omega }},\tau ) = e_h({\varvec{\omega }},\tau ) - e_h({\varvec{\omega }},t_{\ell -1})$. Using (59), we thus see that $\Delta e_h({\varvec{\omega }},\tau )$ is the solution of the ODE

$$\begin{aligned} \tfrac{d}{d\tau }\Delta e_h({\varvec{\omega }},\tau ) = {\dot{e}}_h({\varvec{\omega }},\tau ) = Ae_h({\varvec{\omega }},\tau ) + ({\mathcal {A}}_h({\varvec{\omega }},\tau ) - A)x_h({\varvec{\omega }},\tau ), \end{aligned}$$

(74)

with initial condition $\Delta e_h({\varvec{\omega }},t_{\ell -1}) = 0$. We therefore also have that

$$\begin{aligned} \frac{d}{d\tau } |\Delta e_h({\varvec{\omega }},\tau ) |= \frac{\langle \Delta e_h({\varvec{\omega }},\tau ), {\dot{e}}_h({\varvec{\omega }},\tau ) \rangle }{|\Delta e_h({\varvec{\omega }},\tau ) |} \le |Ae_h({\varvec{\omega }},\tau ) |+ |({\mathcal {A}}_h({\varvec{\omega }},\tau ) - A)x_h({\varvec{\omega }},\tau ) |. \end{aligned}$$

(75)

Using that $\Delta e_h({\varvec{\omega }},t_{\ell -1}) = 0$, integrating (75) from $\tau = t_{\ell -1}$ to $\tau = s$ yields

$$\begin{aligned} |\Delta e_h({\varvec{\omega }},s) |\le \int _{t_{\ell -1}}^s \left( \Vert A \Vert |e_h({\varvec{\omega }},\tau ) |+ |({\mathcal {A}}_h({\varvec{\omega }},\tau ) - A)x_h({\varvec{\omega }},\tau ) |\right) \, \mathrm {d}\tau . \end{aligned}$$

(76)

To bound $e_h({\varvec{\omega }},\tau )$, we apply the variation of constants formula to the error dynamics in (59) and obtain

$$\begin{aligned} |e_h({\varvec{\omega }},\tau ) |&= \left|\int _0^\tau e^{A(\tau -\sigma )} ({\mathcal {A}}_h({\varvec{\omega }},\sigma ) - A ) x_h({\varvec{\omega }},\sigma ) \, \mathrm {d}\sigma \right|\nonumber \\&\le \int _0^\tau \Vert {\mathcal {A}}_h({\varvec{\omega }},\sigma ) - A \Vert \,\mathrm {d}\sigma \ ( |x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)}), \end{aligned}$$

(77)

where we have used the bound for $x_h({\varvec{\omega }},\sigma )$ in (48) and that $\Vert e^{A(\tau -\sigma )}\Vert \le 1$ because Assumption 1 implies that A is dissipative. Using this result in (76), we find

$$\begin{aligned} |\Delta e_h({\varvec{\omega }},s) |\le \int _{t_{\ell -1}}^s g({\varvec{\omega }},\tau ) \, \mathrm {d}\tau \ (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)}), \end{aligned}$$

(78)

where we have again used the bound on $x_h({\varvec{\omega }},t)$ in (48) for the second term in (76) and introduced

$$\begin{aligned} g({\varvec{\omega }},\tau ) := \left( \Vert A \Vert \int _0^\tau \Vert {\mathcal {A}}_h({\varvec{\omega }},\sigma ) - A \Vert \, \mathrm {d}\sigma + \Vert {\mathcal {A}}_h({\varvec{\omega }},\tau ) - A \Vert \right) . \end{aligned}$$

(79)

Squaring both sides in (78) and taking the expectation, we find

$$\begin{aligned} {\mathbb {E}}[ |\Delta e_h(s) |^2]&\le {\mathbb {E}}\left[ \left( \int _{t_{\ell -1}}^s g(\tau ) \, \mathrm {d}\tau \right) ^2 \right] (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2 \nonumber \\&\le (s-t_{\ell -1}) \int _{t_{\ell -1}}^s {\mathbb {E}}[(g(\tau ))^2] \, \mathrm {d}\tau \ (|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2, \end{aligned}$$

(80)

where the second inequality follows from the Cauchy–Schwarz inequality in $L^2(t_{\ell -1},s)$. Now observe that (79) shows that

$$\begin{aligned} {\mathbb {E}}[(g(\tau ))^2]&= \Vert A \Vert ^2 \int _0^\tau \int _0^\tau {\mathbb {E}}[ \Vert {\mathcal {A}}_h(\sigma ) - A \Vert \Vert {\mathcal {A}}_h(\sigma ') - A \Vert ] \, \mathrm {d}\sigma \, \mathrm {d}\sigma ' \nonumber \\&\quad + 2 \Vert A \Vert \int _0^\tau {\mathbb {E}}[ \Vert {\mathcal {A}}_h(\sigma ) - A \Vert \Vert {\mathcal {A}}_h(\tau ) - A \Vert ] \, \mathrm {d}\sigma + {\mathbb {E}}[\Vert {\mathcal {A}}_h(\tau ) - A \Vert ^2]. \end{aligned}$$

(81)

Because ${\mathbb {E}}[\Vert {\mathcal {A}}_h(t) - A \Vert ^2] = \mathrm {Var}[{\mathcal {A}}]$ for all t, we also have that

$$\begin{aligned} {\mathbb {E}}[ \Vert {\mathcal {A}}_h(\sigma ) - A \Vert \Vert {\mathcal {A}}_h(\tau ) - A \Vert ] \le \sqrt{ {\mathbb {E}}[ \Vert {\mathcal {A}}_h(\sigma ) - A \Vert ^2] {\mathbb {E}}[\Vert {\mathcal {A}}_h(\tau ) - A \Vert ^2]} = \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(82)

Using this result in (81), we obtain

$$\begin{aligned} {\mathbb {E}}[(g(\tau ))^2] \le \mathrm {Var}[{\mathcal {A}}] (\Vert A \Vert \tau + 1)^2. \end{aligned}$$

(83)

Using this result again in (80), also using that $s-t_{\ell -1} \le h$ and $\tau \le s$, we find the claimed inequality (70). $\square $
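The estimate (73) can be checked numerically in the simplest possible setting. The following Python sketch is illustrative only and is not taken from the paper's experiments. It assumes a scalar system ($N = 1$) with $u \equiv 0$, a splitting $A = a_1 + a_2$ with $a_1, a_2 \le 0$, a uniform grid, and that exactly one of the two parts is selected on each subinterval with probability 1/2 (so that $\pi _1 = \pi _2 = 1/2$). Under these assumptions the expectation over all $2^K$ realizations reduces to a binomial sum, so no sampling is needed. The function names are hypothetical.

```python
import math


def mean_square_error(a_parts, x0, T, K):
    """Exact E[|x_h(T) - x(T)|^2] for the scalar system x' = (a_1 + a_2) x, u = 0.

    Assumption: on each of the K subintervals one index m in {1, 2} is drawn
    with probability 1/2 and the dynamics x' = (a_m / (1/2)) x is used.
    """
    a1, a2 = a_parts
    h = T / K
    exact = x0 * math.exp((a1 + a2) * T)
    mse = 0.0
    for j in range(K + 1):  # j = number of subintervals on which m = 1 is drawn
        prob = math.comb(K, j) / 2**K
        x_h = x0 * math.exp(h * (2 * a1 * j + 2 * a2 * (K - j)))
        mse += prob * (x_h - exact) ** 2
    return mse


def theorem1_bound(a_parts, x0, T, K):
    """Right-hand side of (73) with u = 0, evaluated at t = T."""
    a1, a2 = a_parts
    A = a1 + a2
    # Var[A] = E[|A_omega - A|^2] for the two equally likely values 2*a1, 2*a2
    var = 0.5 * (2 * a1 - A) ** 2 + 0.5 * (2 * a2 - A) ** 2
    h = T / K
    return h * var * (abs(A) * T**2 + 2 * T) * abs(x0) ** 2


if __name__ == "__main__":
    for K in (10, 20, 40):
        print(K, mean_square_error((-0.8, -0.2), 1.0, 1.0, K),
              theorem1_bound((-0.8, -0.2), 1.0, 1.0, K))
```

In this toy case the computed mean square error stays below the bound and decreases as the grid is refined. When $a_1 = a_2$ the two randomized values coincide with $A$, the variance vanishes, and so does the error.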

Some remarks regarding Theorem 1 are in order.

Remark 7

The error estimate in Theorem 1 involves the operator norm of the matrix A. This suggests that the expected error ${\mathbb {E}}[ |e_h(t) |^2]$ grows when we are considering better approximations A of an unbounded operator, which for example happens when we consider a discretization of a PDE and refine the spatial grid. However, Fig. 4a in Sect. 4 indicates that ${\mathbb {E}}[|e_h(t) |] \le C \sqrt{h \mathrm {Var}[{\mathcal {A}}]}$ for a constant C that does not increase (but even seems to decrease) when the spatial grid is refined.

A first step in understanding the infinite-dimensional case better is taken in Appendix B, where we prove that

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le 2ht \mathrm {Var}_W[{\mathcal {A}}] |W^{-1}x_0 |^2 \end{aligned}$$

(84)

under the additional assumptions that $u(t) \equiv 0$ and that all matrices $A_m$ commute pairwise. Here, W is any invertible matrix and $\mathrm {Var}_W[{\mathcal {A}}]$ is the weighted variance introduced in Remark 5. Observe that the operator norm $\Vert A \Vert $ does not appear in this estimate. The result from Appendix B extends naturally to an infinite dimensional setting in which all operators $A_m$ have the same domain $D(A_m) = D(A)$.

Recall from Remark 5 that a typical choice for W is $W = (A - \lambda I)^{-1}$ for some $\lambda $ in the resolvent set of A. For $|W^{-1}x_0 |$ to be bounded, we thus require that $x_0 \in D(A)$, where D(A) denotes the domain of the operator A. In an infinite dimensional setting we thus need an additional smoothness assumption on the initial condition $x_0$. Such conditions are typical for (deterministic) splitting algorithms, see e.g. [12, 13]. Further details can be found in Appendix B.

Remark 8

The error estimate in Theorem 1 is derived based on the error dynamics (64). Considering the error dynamics (59) leads to a less clean proof because instead of the 3 terms on the RHS of (65), we then get 4 terms

$$\begin{aligned} \frac{d}{ds} |e_h({\varvec{\omega }},s) |^2&= 2 \langle e_h({\varvec{\omega }},s), A e_h({\varvec{\omega }},s) \rangle + 2 \langle e_h({\varvec{\omega }},s), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x_h({\varvec{\omega }},s) \rangle \nonumber \\&= 2 \langle e_h({\varvec{\omega }},s), A e_h({\varvec{\omega }},s) \rangle + 2 \langle e_h({\varvec{\omega }},t_{\ell -1}), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x_h({\varvec{\omega }},t_{\ell -1}) \rangle \nonumber \\&\quad + 2 \langle \Delta e_h({\varvec{\omega }},s), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) x_h({\varvec{\omega }},s) \rangle \nonumber \\&\quad + 2 \langle e_h({\varvec{\omega }},s), ({\mathcal {A}}_h({\varvec{\omega }},s) - A) \Delta x_h({\varvec{\omega }},s) \rangle , \end{aligned}$$

(85)

where $\Delta e_h({\varvec{\omega }},s) := e_h({\varvec{\omega }},s) - e_h({\varvec{\omega }},t_{\ell -1})$ and $\Delta x_h({\varvec{\omega }},s) := x_h({\varvec{\omega }},s) - x_h({\varvec{\omega }},t_{\ell -1})$. This approach is closer to proofs for interacting particle systems in [14].

Note that the fourth term in (85) is needed because $x_h({\varvec{\omega }},s)$ is correlated to ${\mathcal {A}}_h({\varvec{\omega }},s)$ for $s\in (t_{\ell -1},t_\ell )$. Because x(s) is not correlated to ${\mathcal {A}}_h({\varvec{\omega }},s)$, it was not necessary to introduce such a term in (65). The proof of Theorem 1 based on the error dynamics (64) presented above is thus simpler than a proof based on (59).

Remark 9

When we look back at the proof of Theorem 1, we see that Assumption 1 is only used to ensure that the matrices A and ${\mathcal {A}}_h({\varvec{\omega }},t)$ are dissipative (for all ${\varvec{\omega }}$ with $p({\varvec{\omega }}) > 0$ and all $t \in [0,T]$). When Assumption 1 is not satisfied, there must exist a constant $a > 0$ such that ${\hat{A}} = A-aI$ and $\hat{{\mathcal {A}}}_h({\varvec{\omega }},t) = {\mathcal {A}}_h({\varvec{\omega }},t) - a I$ are dissipative (for all ${\varvec{\omega }}$ with $p({\varvec{\omega }}) > 0$ and all $t \in [0,T]$). Because ${\mathbb {E}}[{\mathcal {A}}_h(t)] = A$, it follows that ${\mathbb {E}}[\hat{{\mathcal {A}}}_h(t)] = {\mathbb {E}}[{\mathcal {A}}_h(t)] - aI = A - aI = {\hat{A}}$ and ${\mathbb {E}}[\Vert \hat{{\mathcal {A}}}_h(t) - {\hat{A}} \Vert ^2] = \mathrm {Var}[{\mathcal {A}}]$. When we let ${\hat{x}}(t)$ and ${\hat{x}}_h({\varvec{\omega }},t)$ denote the solutions generated by ${\hat{A}}$ and $\hat{{\mathcal {A}}}_h({\varvec{\omega }},t)$, respectively, we can now prove in a similar way as in Theorem 1 that the error ${\hat{e}}_h({\varvec{\omega }},t) = {\hat{x}}_h({\varvec{\omega }},t) - {\hat{x}}(t)$ can be bounded as

$$\begin{aligned} {\mathbb {E}}[|{\hat{e}}_h(t) |^2] \le h \mathrm {Var}[{\mathcal {A}}](\Vert {\hat{A}} \Vert t^2+2t)(|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2. \end{aligned}$$

(86)

Because $x(t) = e^{at}{\hat{x}}(t)$ and $x_h({\varvec{\omega }},t) = e^{at}{\hat{x}}_h({\varvec{\omega }},t)$, also

$$\begin{aligned} e_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x(t) = e^{at} {\hat{x}}_h({\varvec{\omega }},t) - e^{at} {\hat{x}}(t) = e^{at} {\hat{e}}_h({\varvec{\omega }},t). \end{aligned}$$

(87)

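Taking the squared norm in (87), so that $|e_h({\varvec{\omega }},t) |^2 = e^{2at} |{\hat{e}}_h({\varvec{\omega }},t) |^2$, then taking the expectation and using (86), we find

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le h e^{2at} \mathrm {Var}[{\mathcal {A}}](\Vert {\hat{A}} \Vert t^2+2t)(|x_0 |+ |Bu |_{L^1(0,T;{\mathbb {R}}^N)})^2. \end{aligned}$$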

(88)

The error estimate now grows exponentially in time.

3.3 The forward dynamics with a stochastic input

In this subsection, we prove a result similar to Theorem 1 for inputs $u_h({\varvec{\omega }},t)$ that are stochastic, i.e., which depend on ${\varvec{\omega }}$. We thus want to bound the error

$$\begin{aligned} e_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x({\varvec{\omega }},t), \end{aligned}$$

(89)

where $x_h({\varvec{\omega }},t)$ and $x({\varvec{\omega }},t)$ are the solutions of (51) and (50), respectively.

To this end, we consider the semigroup $e^{At}$ generated by the matrix A and the evolution operator $S_h({\varvec{\omega }},t,s)$ associated to ${\mathcal {A}}_h({\varvec{\omega }},t)$. The evolution operator $S_h({\varvec{\omega }},t,s)$ is defined by the property that for all vectors $x_s \in {\mathbb {R}}^N$ (and all $t \ge s$), $S_h({\varvec{\omega }},t,s)x_s$ is equal to the solution $y_h({\varvec{\omega }},t)$ of

$$\begin{aligned} {\dot{y}}_h({\varvec{\omega }},t) = {\mathcal {A}}_h({\varvec{\omega }},t) y_h({\varvec{\omega }},t), \quad y_h({\varvec{\omega }},s) = x_s. \end{aligned}$$

(90)

Remark 10

An explicit formula for the evolution operator $S_h({\varvec{\omega }}, t, s)$ can be obtained as follows. Let $0 \le s \le t \le T$ and let $\ell , k \in \{1,2, \ldots , K \}$ be selected such that

$$\begin{aligned} s \in [t_{\ell -1}, t_\ell ), \quad t \in [t_{k-1}, t_k). \end{aligned}$$

(91)

By restricting the given time grid $0 = t_0< t_1< t_2< \cdots< t_{K-1} < t_K = T$ to the interval [s, t], we obtain a grid with ${\tilde{K}} = k-\ell +1$ subintervals

$$\begin{aligned} {\tilde{t}}_0 := s< {\tilde{t}}_1 := t_\ell< {\tilde{t}}_2 := t_{\ell +1}< \cdots< {\tilde{t}}_{{\tilde{K}}-1} := t_{k-1} < {\tilde{t}}_{{\tilde{K}}} := t. \end{aligned}$$

(92)

The construction of the time grid ${\tilde{t}}_0, {\tilde{t}}_1, \ldots {\tilde{t}}_{{\tilde{K}}}$ is illustrated in Fig. 1. We also denote ${\tilde{h}}_p := {\tilde{t}}_p - {\tilde{t}}_{p-1}$ (for $p \in \{1,2, \ldots , {\tilde{K}} \}$) and introduce (for each $\omega \in \{1,2, \ldots , 2^M \}$)

$$\begin{aligned} {\mathcal {A}}_\omega := \sum _{m \in S_\omega } \frac{A_m}{\pi _m}. \end{aligned}$$

(93)

Because ${\mathcal {A}}_h({\varvec{\omega }},\tau ) = {\mathcal {A}}_{\omega _{p+\ell -1}}$ is constant for $\tau \in [{\tilde{t}}_{p-1}, {\tilde{t}}_p)$, it is now easy to see that (with the factors ordered so that later subintervals appear on the left)

$$\begin{aligned} S_h({\varvec{\omega }}, t,s) = e^{{\mathcal {A}}_{\omega _k} {\tilde{h}}_{{\tilde{K}}}} \cdots e^{{\mathcal {A}}_{\omega _{\ell +1}} {\tilde{h}}_2} e^{{\mathcal {A}}_{\omega _\ell } {\tilde{h}}_1} = \prod _{p=1}^{{\tilde{K}}} e^{{\mathcal {A}}_{\omega _{p+\ell -1}}{\tilde{h}}_p}. \end{aligned}$$

(94)

Under Assumption 1, all matrices ${\mathcal {A}}_{\omega _p}$ are dissipative and (94) shows that

$$\begin{aligned} \Vert S_h({\varvec{\omega }}, t,s) \Vert \le 1. \end{aligned}$$

(95)
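The product formula (94) translates directly into a short routine. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation: matrices are small nested lists, the matrix exponential is approximated by a truncated Taylor series (adequate only for small, well-scaled arguments), and the realization is passed as a zero-based list `omega` such that `mats[omega[j]]` is active on `[grid[j], grid[j+1])`. All function names and the example data are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(M, terms=30):
    """Truncated Taylor series for exp(M); an assumption suited to small M only."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, M)]
        result = [[r + v for r, v in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def evolution_operator(mats, omega, grid, s, t):
    """S_h(omega, t, s) as in (94) for 0 <= s <= t <= grid[-1]."""
    n = len(mats[0])
    S = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(len(grid) - 1):
        lo, hi = max(grid[j], s), min(grid[j + 1], t)
        if hi > lo:  # this subinterval overlaps [s, t] with length hi - lo
            step = [[v * (hi - lo) for v in row] for row in mats[omega[j]]]
            S = matmul(expm(step), S)  # later factors multiply from the left
    return S


# Illustrative data: A_1 = diag(-1, 0) and A_2 = [[0, 1], [-1, 0]] are both
# dissipative; with selection probability 1/2 each, the randomized matrices
# are 2*A_1 and 2*A_2.
mats = [[[-2.0, 0.0], [0.0, 0.0]], [[0.0, 2.0], [-2.0, 0.0]]]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
omega = [0, 1, 1, 0]
S = evolution_operator(mats, omega, grid, s=0.1, t=0.9)
```

Clipping each grid cell to `[s, t]` reproduces the step sizes ${\tilde{h}}_p$ of the restricted grid (92) without constructing that grid explicitly. For the example data, composing the operators over `[0.1, 0.6]` and `[0.6, 0.9]` gives the same matrix as `S` up to rounding, and `S` does not increase the Euclidean norm of a vector, in line with (95).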

Using the variation of constants formula, the solutions $x_h({\varvec{\omega }},t)$ and $x({\varvec{\omega }},t)$ can be expressed as

$$\begin{aligned} x_h({\varvec{\omega }},t)&= S_h({\varvec{\omega }},t,0) x_0 + \int _0^t S_h({\varvec{\omega }},t,s) Bu_h({\varvec{\omega }},s) \, \mathrm {d}s, \end{aligned}$$

(96)

$$\begin{aligned} x({\varvec{\omega }}, t)&= e^{At} x_0 + \int _0^t e^{A(t-s)} Bu_h({\varvec{\omega }},s) \, \mathrm {d}s. \end{aligned}$$

(97)

Subtracting (97) from (96) we find the following expression for the error $e_h({\varvec{\omega }},t)$

$$\begin{aligned} e_h({\varvec{\omega }},t) = E_h({\varvec{\omega }}, t,0) x_0 + \int _0^t E_h({\varvec{\omega }},t,s)Bu_h({\varvec{\omega }},s) \, \mathrm {d}s, \end{aligned}$$

(98)

where $E_h({\varvec{\omega }},t,s) = S_h({\varvec{\omega }},t,s) - e^{A(t-s)}$. The following corollary of Theorem 1 bounds $E_h({\varvec{\omega }},t,s)$ in expectation.

Corollary 1

Under Assumptions 1 and 2, we have that

$$\begin{aligned} {\mathbb {E}}[\Vert S_h(t,s) - e^{A(t-s)} \Vert ^2] \le (\Vert A \Vert T^2 + 2T) h \mathrm {Var}[{\mathcal {A}}], \end{aligned}$$

(99)

for all $0 \le s \le t \le T$.

Proof

Fix $s \in [0,T]$ and an initial condition $x_s \in {\mathbb {R}}^N$.

Define $y(t) = e^{A(t-s)}x_s$ and let $y_h({\varvec{\omega }},t)$ be the solution of (90), both for $t \in [s,T]$. We then apply Theorem 1 with $u(t) \equiv 0$ to the time-shifted solutions ${\tilde{y}}({\tilde{t}}) = y({\tilde{t}}+s)$ and ${\tilde{y}}_h({\varvec{\omega }}, {\tilde{t}}) = y_h({\varvec{\omega }},{\tilde{t}}+s)$ and the time-shifted matrix $\tilde{{\mathcal {A}}}_h({\varvec{\omega }},{\tilde{t}}) = {\mathcal {A}}_h({\varvec{\omega }},{\tilde{t}}+s)$ defined on ${\tilde{t}} \in [0, T-s]$. We thus conclude that (writing ${\tilde{t}} = t-s$)

$$\begin{aligned} {\mathbb {E}}[|y_h(t) - y(t) |^2] = {\mathbb {E}}[|{\tilde{y}}_h({\tilde{t}}) - {\tilde{y}}({\tilde{t}})|^2] \le h \mathrm {Var}[{\mathcal {A}}] (\Vert A\Vert {\tilde{t}}^2 + 2{\tilde{t}}) |x_s |^2. \end{aligned}$$

(100)

Noting that, by definition, $y(t) = e^{A(t-s)}x_s$ and $y_h({\varvec{\omega }},t) = S_h({\varvec{\omega }},t,s)x_s$, we find that (for $x_s \ne 0$)

$$\begin{aligned} {\mathbb {E}}\left[ \frac{|(S_h({\varvec{\omega }},t,s) - e^{A(t-s)})x_s |^2}{|x_s |^2} \right] \le h \mathrm {Var}[{\mathcal {A}}](\Vert A\Vert T^2 + 2T), \end{aligned}$$

(101)

where it was used that ${\tilde{t}} = t-s \le T$. The result now follows from the definition of the operator-norm. $\square $

Remark 11

In Appendix B, we prove a result similar to Corollary 1 under the additional assumption that all matrices $A_m$ commute pairwise. The result in Appendix B extends naturally to an infinite dimensional setting under the additional assumption that the domains of the operators $A_m$ are the same. This is not the case for Corollary 1 because the operator norm $\Vert A \Vert $ appears in (99).

We are now ready for the main result of this subsection.

Theorem 2

Consider any control $u_h : \Omega ^K \rightarrow L^2(0,T;{\mathbb {R}}^q)$. Assume that Assumptions 1 and 2 are satisfied and let U be such that

$$\begin{aligned} |Bu_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)} \le U, \end{aligned}$$

(102)

for all ${\varvec{\omega }} \in \Omega ^K$, then

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le (\Vert A \Vert T^2 + 2T) h \mathrm {Var}[{\mathcal {A}}] \left( |x_0 |+ U \sqrt{T} \right) ^2. \end{aligned}$$

(103)

Proof

Using the triangle inequality in (98), we find

$$\begin{aligned} |e_h({\varvec{\omega }},t) |&\le \Vert E_h({\varvec{\omega }},t,0) \Vert |x_0 |+ \int _0^t \Vert E_h({\varvec{\omega }},t,s) \Vert |Bu_h({\varvec{\omega }},s) |\, \mathrm {d}s \nonumber \\&\le \Vert E_h({\varvec{\omega }},t,0) \Vert |x_0 |+ \sqrt{\int _0^t \Vert E_h({\varvec{\omega }},t,s) \Vert ^2 \, \mathrm {d}s} |Bu_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)}, \end{aligned}$$

(104)

where the second inequality follows from the Cauchy–Schwarz inequality in $L^2(0,t)$. Squaring both sides and using the bound (102), we find

$$\begin{aligned} |e_h({\varvec{\omega }},t) |^2&\le \Vert E_h({\varvec{\omega }},t,0) \Vert ^2 |x_0 |^2 + U^2 \int _0^t \Vert E_h({\varvec{\omega }},t,s) \Vert ^2 \, \mathrm {d}s \nonumber \\&\quad + 2 U |x_0 |\Vert E_h({\varvec{\omega }},t,0) \Vert \sqrt{\int _0^t \Vert E_h({\varvec{\omega }},t,s) \Vert ^2 \, \mathrm {d}s}. \end{aligned}$$

(105)

In order to use the bound from Corollary 1 to estimate the last term, note that we can use the Cauchy–Schwarz inequality in the probability space to find

$$\begin{aligned} {\mathbb {E}}\left[ \Vert E_h(t,0) \Vert \sqrt{\int _0^t \Vert E_h(t,s) \Vert ^2 \, \mathrm {d}s} \right] \le \sqrt{{\mathbb {E}}[\Vert E_h(t,0) \Vert ^2 ] \int _0^t {\mathbb {E}}[ \Vert E_h(t,s) \Vert ^2] \, \mathrm {d}s} \end{aligned}$$

(106)

Taking the expected value in (105) and using that the bound on ${\mathbb {E}}[\Vert E_h(t,s) \Vert ^2]$ from Corollary 1 does not depend on t and s, we find

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le (|x_0 |+ U \sqrt{t})^2 (\Vert A \Vert T^2 + 2T) h \mathrm {Var}[{\mathcal {A}}], \end{aligned}$$

(107)

which gives the desired estimate. $\square $
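The passage from (105) to (107) can be written out as follows. As a shorthand used only here, let $c_h := (\Vert A \Vert T^2 + 2T) h \mathrm {Var}[{\mathcal {A}}]$ denote the bound from Corollary 1. Taking the expectation in (105), applying (106) to the cross term, and using ${\mathbb {E}}[\Vert E_h(t,s) \Vert ^2] \le c_h$ for all $0 \le s \le t$ gives

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2]&\le c_h |x_0 |^2 + U^2 t c_h + 2 U |x_0 |\sqrt{c_h \cdot t c_h} \nonumber \\&= c_h \left( |x_0 |+ U \sqrt{t} \right) ^2 \le c_h \left( |x_0 |+ U \sqrt{T} \right) ^2, \end{aligned}$$

where the last step uses $t \le T$.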

Remark 12

Because $\Omega ^K$ is finite, we can always find a constant U such that (102) is satisfied for a given $u_h: \Omega ^K \rightarrow L^2(0,T;{\mathbb {R}}^q)$. However, when we consider a family of temporal grids for which $h \rightarrow 0$, the constant U may depend on h (depending on the considered family of controls $u_h({\varvec{\omega }},t)$). Fortunately, we only need to apply Theorem 2 with $u_h({\varvec{\omega }},t) = u_h^*({\varvec{\omega }},t)$, where $u^*_h({\varvec{\omega }},t)$ is the control that minimizes the cost functional $J_h({\varvec{\omega }},\cdot )$ in (14). For this control, the coercivity of the cost functional $J_h({\varvec{\omega }}, \cdot )$ implies that the constant U can be chosen independent of the considered temporal grid, see (54).

Remark 13

Note that the estimate in Theorem 1 depends on the $L^1$-norm of the control, whereas the estimate in Theorem 2 depends through (102) on the $L^2$-norm. Setting $u_h({\varvec{\omega }},t) = u(t)$ in Theorem 2 therefore does not give the estimate in Theorem 1. This underlines the additional difficulty posed by stochastic controls.

3.4 A no-gap condition

With the results regarding forward dynamics from the previous two subsections, we are now ready to address the optimal control problem. The main result of this subsection is the no-gap condition in Theorem 3. To prove this result, we need the following technical lemma.

Lemma 1

Consider any control $u_h : \Omega ^K \rightarrow L^2(0,T;{\mathbb {R}}^q)$. Assume that Assumptions 1 and 2 hold and let $U > 0$ be such that (102) is satisfied. Then

$$\begin{aligned} {\mathbb {E}}[|J_h(u_h) - J(u_h) |] \le C_{[A,x_0, Q, x_d, T, U]} \left( \sqrt{h \mathrm {Var}[{\mathcal {A}}]} + h \mathrm {Var}[{\mathcal {A}}] \right) . \end{aligned}$$

(108)

Proof

Let $x({\varvec{\omega }},t)$ and $x_h({\varvec{\omega }},t)$ be the solutions of (50) and (51) for the considered control $u_h({\varvec{\omega }},t)$. For brevity, we write ${\tilde{x}}({\varvec{\omega }},t) = x({\varvec{\omega }},t) - x_d(t)$ and ${\tilde{x}}_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x_d(t)$. By definition of the cost functionals $J(\cdot )$ and $J_h({\varvec{\omega }},\cdot )$ in (2) and (14), we have

$$\begin{aligned}&J_h({\varvec{\omega }},u_h({\varvec{\omega }})) - J(u_h({\varvec{\omega }})) = \tfrac{1}{2}\int _0^T \left( {\tilde{x}}_h({\varvec{\omega }},t)^\top Q {\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}({\varvec{\omega }},t)^\top Q {\tilde{x}}({\varvec{\omega }},t) \right) \, \mathrm {d}t \nonumber \\&\quad = \int _0^T {\tilde{x}}({\varvec{\omega }},t)^\top Q ({\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}({\varvec{\omega }},t)) \, \mathrm {d}t \nonumber \\&\qquad +\tfrac{1}{2} \int _0^T({\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}({\varvec{\omega }},t))^\top Q ({\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}({\varvec{\omega }},t)) \, \mathrm {d}t \nonumber \\&\quad = \int _0^T \left( {\tilde{x}}({\varvec{\omega }},t)^\top Q e_h({\varvec{\omega }},t) +\tfrac{1}{2} e_h({\varvec{\omega }},t)^\top Q e_h({\varvec{\omega }},t) \right) \, \mathrm {d}t, \end{aligned}$$

(109)

where the last identity follows because $e_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x({\varvec{\omega }},t)= {\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}({\varvec{\omega }},t)$. Taking the absolute value and estimating the RHS, we find

$$\begin{aligned}&|J_h({\varvec{\omega }},u_h) - J(u_h({\varvec{\omega }})) |\le \Vert Q \Vert \int _0^T \left( |{\tilde{x}}({\varvec{\omega }},t) ||e_h({\varvec{\omega }},t)|+ \tfrac{1}{2} |e_h({\varvec{\omega }},t)|^2 \right) \, \mathrm {d}t \nonumber \\&\quad \le \Vert Q \Vert \left( |{\tilde{x}}({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)} |e_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)} + \tfrac{1}{2}|e_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)}^2 \right) . \end{aligned}$$

(110)
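The second identity in (109) rests on a pointwise algebraic identity for quadratic forms, $\tfrac{1}{2}(a^\top Q a - b^\top Q b) = b^\top Q (a-b) + \tfrac{1}{2}(a-b)^\top Q (a-b)$, which uses the symmetry of Q (assumed here, as is usual for the weight in a quadratic cost). The following Python sketch checks this identity for one illustrative choice of data; the function names and the numerical values are hypothetical.

```python
def quad(Q, a, b):
    """Bilinear form a^T Q b for a nested-list matrix Q."""
    return sum(a[i] * Q[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def split_difference(Q, x_h, x):
    """Return both sides of the pointwise identity behind (109)."""
    e = [p - q for p, q in zip(x_h, x)]  # plays the role of e_h
    lhs = 0.5 * (quad(Q, x_h, x_h) - quad(Q, x, x))
    rhs = quad(Q, x, e) + 0.5 * quad(Q, e, e)
    return lhs, rhs


Q = [[2.0, 1.0], [1.0, 3.0]]  # symmetric weight (illustrative values)
lhs, rhs = split_difference(Q, x_h=[1.5, -0.5], x=[1.0, 2.0])
```

Both sides agree for the symmetric matrix above. For a non-symmetric Q the two sides differ in general, because the cross terms $a^\top Q b$ and $b^\top Q a$ no longer coincide.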

Taking the expectation and using the Cauchy–Schwarz inequality, we find that

$$\begin{aligned}&{\mathbb {E}}[|J_h(u_h) - J(u_h) |] \nonumber \\&\quad \le \Vert Q \Vert \left( \sqrt{{\mathbb {E}}[|{\tilde{x}} |_{L^2(0,T;{\mathbb {R}}^N)}^2]} \sqrt{{\mathbb {E}}[|e_h |_{L^2(0,T;{\mathbb {R}}^N)}^2]} + \tfrac{1}{2}{\mathbb {E}}[|e_h |_{L^2(0,T;{\mathbb {R}}^N)}^2] \right) . \end{aligned}$$

(111)

Using the estimate from Theorem 2, we find

$$\begin{aligned} {\mathbb {E}}[|e_h |_{L^2(0,T;{\mathbb {R}}^N)}^2] = \int _0^T {\mathbb {E}}[ |e_h(t) |^2] \, \mathrm {d}t \le h \mathrm {Var}[{\mathcal {A}}] C_{[A,x_0,T,U]}. \end{aligned}$$

(112)

Because ${\tilde{x}}({\varvec{\omega }},t) = x({\varvec{\omega }}, t) - x_d(t)$, (49) shows that

$$\begin{aligned} |{\tilde{x}}({\varvec{\omega }}) |^2_{L^2(0,T;{\mathbb {R}}^N)} \le ( \sqrt{T}(|x_0 |+ |Bu_h({\varvec{\omega }}) |_{L^1(0,T;{\mathbb {R}}^N)}) + |x_d |_{L^2(0,T;{\mathbb {R}}^N)} )^2. \end{aligned}$$

(113)

Because $|Bu_h({\varvec{\omega }}) |_{L^1(0,T;{\mathbb {R}}^N)} \le \sqrt{T} |Bu_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)} \le \sqrt{T}U$, we see from (113) that ${\mathbb {E}}[|{\tilde{x}} |_{L^2(0,T;{\mathbb {R}}^N)}^2] \le C_{[x_0, x_d, T, U]}$. The result now follows by inserting this estimate and (112) into (111). $\square $

We are now ready to prove the main result of this section which can be considered as a no-gap condition for the RBM optimal control problem.

Theorem 3

Let $u^*(t)$ be the (deterministic) control that minimizes the cost functional J(u) in (2) and let $u_h^*({\varvec{\omega }},t)$ be the control that minimizes the cost functional $J_h({\varvec{\omega }}, u)$ in (14). Then

$$\begin{aligned} {\mathbb {E}}[|J_h(u^*_h) - J(u^*)|] \le C_{[A,B,x_0,Q,R,x_d,T]} \left( \sqrt{h \mathrm {Var}[{\mathcal {A}}]} + h \mathrm {Var}[{\mathcal {A}}] \right) . \end{aligned}$$

(114)

Proof

We have that

$$\begin{aligned} J(u^*)\le & {} J(u_h^*({\varvec{\omega }})) = J_h({\varvec{\omega }}, u_h^*({\varvec{\omega }})) + \delta ({\varvec{\omega }}) \nonumber \\\le & {} J_h({\varvec{\omega }},u^*) + \delta ({\varvec{\omega }}) = J(u^*) + \delta ({\varvec{\omega }}) + \varepsilon ({\varvec{\omega }}), \end{aligned}$$

(115)

where $\delta ({\varvec{\omega }}) = J(u^*_h({\varvec{\omega }})) - J_h({\varvec{\omega }}, u^*_h({\varvec{\omega }}))$ and $\varepsilon ({\varvec{\omega }}) = J_h({\varvec{\omega }},u^*) - J(u^*)$. Note that the first inequality follows because $u^*$ is the minimizer of J and the second inequality because $u_h^*({\varvec{\omega }})$ is the minimizer of $J_h({\varvec{\omega }}, \cdot )$. Subtracting $J(u^*) + \delta ({\varvec{\omega }})$ from the first, third, and fifth expressions in (115) shows that

$$\begin{aligned} -\delta ({\varvec{\omega }}) \le J_h({\varvec{\omega }},u^*_h({\varvec{\omega }})) - J(u^*) \le \varepsilon ({\varvec{\omega }}). \end{aligned}$$

(116)

Taking the absolute value, we find

$$\begin{aligned} |J_h({\varvec{\omega }},u^*_h({\varvec{\omega }})) - J(u^*) |\le \max \{|\delta ({\varvec{\omega }}) |, |\varepsilon ({\varvec{\omega }}) |\} \le |\delta ({\varvec{\omega }}) |+ |\varepsilon ({\varvec{\omega }}) |. \end{aligned}$$

(117)

Therefore also

$$\begin{aligned} {\mathbb {E}}[|J_h(u^*_h) - J(u^*) |] \le {\mathbb {E}}[|\delta |] + {\mathbb {E}}[|\varepsilon |]. \end{aligned}$$

(118)

Lemma 1 can now be used to find bounds for ${\mathbb {E}}[|\delta |] = {\mathbb {E}}[|J_h(u_h^*) - J(u_h^*) |]$ and ${\mathbb {E}}[|\varepsilon |] = {\mathbb {E}}[|J_h(u^*)-J(u^*) |]$.

For the bound on ${\mathbb {E}}[|\delta |]$, we use that (54) shows that there exists a constant such that $|Bu^*_h({\varvec{\omega }})|_{L^2(0,T;{\mathbb {R}}^N)} \le C_{[B,x_0,Q,R,x_d,T]}$ so that (102) is satisfied with a constant U that does not depend on the used temporal grid $t_0, t_1, \ldots , t_K$. Lemma 1 thus implies that

$$\begin{aligned} {\mathbb {E}}[|\delta |] \le C_{[A,B,x_0,Q,R,x_d,T]} \left( \sqrt{h \mathrm {Var}[{\mathcal {A}}]} + h \mathrm {Var}[{\mathcal {A}}] \right) . \end{aligned}$$

(119)

For the bound on ${\mathbb {E}}[|\varepsilon |]$, we can simply take $U = |Bu^* |_{L^2(0,T;{\mathbb {R}}^N)}$, which is a constant that only depends on the parameters $A,B,x_0,Q,R,x_d,T$ that define the deterministic problem (1)–(2). Lemma 1 thus also shows that

$$\begin{aligned} {\mathbb {E}}[|\varepsilon |] \le C_{[A,B,x_0,Q,R,x_d,T]} \left( \sqrt{h \mathrm {Var}[{\mathcal {A}}]} + h \mathrm {Var}[{\mathcal {A}}] \right) . \end{aligned}$$

(120)

Inserting (119) and (120) into (118) we find (114). $\square $
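The sandwich argument in (115)–(117) is purely algebraic and holds realization by realization. The Python sketch below illustrates it for a pair of toy scalar functionals that are not taken from the paper; the functions `J` and `J_h`, their minimizers, and the helper `sandwich_gap` are hypothetical and serve only to make the chain of inequalities concrete.

```python
def sandwich_gap(J, J_h, u_star, u_h_star):
    """Return (gap, delta, eps) as defined below (115) for one realization.

    Assumes u_star minimizes J and u_h_star minimizes J_h.
    """
    delta = J(u_h_star) - J_h(u_h_star)
    eps = J_h(u_star) - J(u_star)
    gap = J_h(u_h_star) - J(u_star)
    return gap, delta, eps


def J(u):
    """Toy 'original' functional, minimized at u = 1."""
    return (u - 1.0) ** 2 + 0.5


def J_h(u):
    """Toy 'randomized' functional, minimized at u = 1.1."""
    return 1.2 * (u - 1.1) ** 2 + 0.45


gap, delta, eps = sandwich_gap(J, J_h, u_star=1.0, u_h_star=1.1)
```

For these toy functionals the gap between the minimal values lies between $-\delta $ and $\varepsilon $, as in (116), so its absolute value is controlled by the two evaluation errors alone and no direct comparison of the two minimizers is needed.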

3.5 Convergence in the controls

In the last stage of our analysis of the RBM-optimal control problem, we bound the expected difference between the optimal control $u^*_h$ that minimizes $J_h$ in (14) and the optimal control $u^*$ for the original problem. The proof is based on the strong convexity of the functional $J_h$ in (14).

To prove the main result, we need the following lemma which bounds the difference between the Gâteaux derivative of $J_h$ and the Gâteaux derivative of J in expectation.

Lemma 2

For any deterministic control $u \in L^2(0,T;{\mathbb {R}}^q)$ and any stochastic perturbation $v_h : \Omega ^K \rightarrow L^2(0,T;{\mathbb {R}}^q)$,

$$\begin{aligned} {\mathbb {E}} [|\delta J_h(u; v_h) - \delta J(u;v_h) |] \le C_{[A,B,x_0,Q,x_d,T,u]} \sqrt{h \mathrm {Var}[{\mathcal {A}}]} \sqrt{{\mathbb {E}}[|v_h |_{L^2(0,T;{\mathbb {R}}^q)}^2]}. \end{aligned}$$

(121)

Proof

Let x(t) and $x_h({\varvec{\omega }},t)$ be the solutions of (1) and (13), respectively. Furthermore, denote

$$\begin{aligned} y({\varvec{\omega }}, t) = \int _0^t e^{A(t-s)}Bv_h({\varvec{\omega }},s) \, \mathrm {d}s, \qquad y_h({\varvec{\omega }}, t) = \int _0^t S_h({\varvec{\omega }},t,s)Bv_h({\varvec{\omega }},s) \, \mathrm {d}s. \end{aligned}$$

(122)

Directly from the definition of the Gâteaux derivative, we find that

$$\begin{aligned} \delta J(u,v_h({\varvec{\omega }}))&= \int _0^T \left( {\tilde{x}}(t)^\top Q y({\varvec{\omega }},t) + u(t)^\top R v_h({\varvec{\omega }},t) \right) \, \mathrm {d}t, \end{aligned}$$

(123)

$$\begin{aligned} \delta J_h({\varvec{\omega }}, u,v_h({\varvec{\omega }}))&= \int _0^T \left( {\tilde{x}}_h({\varvec{\omega }},t)^\top Q y_h({\varvec{\omega }},t) + u(t)^\top R v_h({\varvec{\omega }},t) \right) \, \mathrm {d}t, \end{aligned}$$

(124)

where we write ${\tilde{x}}(t) = x(t) - x_d(t)$ and ${\tilde{x}}_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x_d(t)$.

Subtracting (123) from (124), we find

$$\begin{aligned}&\delta J_h({\varvec{\omega }},u,v_h({\varvec{\omega }})) - \delta J(u,v_h({\varvec{\omega }})) \nonumber \\&\quad = \int _0^T \left( {\tilde{x}}_h({\varvec{\omega }},t)^\top Q y_h({\varvec{\omega }},t) - {\tilde{x}}(t)^\top Q y({\varvec{\omega }},t) \right) \, \mathrm {d}t \nonumber \\&\quad = \int _0^T \left( {\tilde{x}}_h({\varvec{\omega }},t)^\top Q (y_h({\varvec{\omega }},t) - y({\varvec{\omega }},t)) + ({\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}(t))^\top Q y({\varvec{\omega }},t) \right) \, \mathrm {d}t \nonumber \\&\quad = \int _0^T \left( {\tilde{x}}_h({\varvec{\omega }},t)^\top Q f_h({\varvec{\omega }},t) + e_h({\varvec{\omega }},t)^\top Q y({\varvec{\omega }},t) \right) \, \mathrm {d}t, \end{aligned}$$

(125)

where $e_h({\varvec{\omega }},t) = x_h({\varvec{\omega }},t) - x(t) = {\tilde{x}}_h({\varvec{\omega }},t) - {\tilde{x}}(t)$ and $f_h({\varvec{\omega }},t) = y_h({\varvec{\omega }},t) - y({\varvec{\omega }},t)$. Taking the absolute value, we find

$$\begin{aligned}&|\delta J_h({\varvec{\omega }},u,v_h({\varvec{\omega }})) - \delta J(u,v_h({\varvec{\omega }})) |\nonumber \\&\quad \le \Vert Q \Vert \int _0^T \left( |{\tilde{x}}_h({\varvec{\omega }},t)||f_h({\varvec{\omega }},t) |+ |e_h({\varvec{\omega }},t)||y({\varvec{\omega }},t) |\right) \, \mathrm {d}t. \end{aligned}$$

(126)

Using (48), we find the following bound for ${\tilde{x}}_h({\varvec{\omega }},t) = x_h({\varvec{\omega }}, t) - x_d(t)$

$$\begin{aligned} |{\tilde{x}}_h({\varvec{\omega }},t) |\le |x_h({\varvec{\omega }},t)|+ |x_d(t)|\le |x_0 |+ |B u |_{L^1(0,T;{\mathbb {R}}^N)} + |x_d(t) |. \end{aligned}$$

(127)

We thus have $|{\tilde{x}}_h({\varvec{\omega }},t) |\le C_{[B,x_0,x_d,T,u]}$ for all ${\varvec{\omega }} \in \Omega ^K$.

Taking the expectation in (126) using this result shows that

$$\begin{aligned}&{\mathbb {E}}[|\delta J_h(u,v_h) - \delta J(u,v_h) |] \nonumber \\&\quad \le \Vert Q \Vert \int _0^T \left( C_{[B,x_0,x_d,T,u]} {\mathbb {E}}[|f_h(t) |] + \sqrt{{\mathbb {E}}[|e_h(t) |^2]} \sqrt{{\mathbb {E}}[ |y(t) |^2]} \right) \, \mathrm {d}t, \end{aligned}$$

(128)

where the second term on the RHS follows from the Cauchy–Schwarz inequality.

Again using the notation $E_h({\varvec{\omega }},t,s) := S_h({\varvec{\omega }},t,s) - e^{A(t-s)}$, (122) shows that

$$\begin{aligned} f_h({\varvec{\omega }},t) = y_h({\varvec{\omega }},t) - y({\varvec{\omega }},t) = \int _0^t E_h({\varvec{\omega }},t,s) B v_h({\varvec{\omega }},s) \, \mathrm {d}s. \end{aligned}$$

(129)

Therefore,

$$\begin{aligned} {\mathbb {E}}[|f_h(t) |]&\le \int _0^t {\mathbb {E}}[\Vert E_h(t,s) \Vert |Bv_h(s) |] \, \mathrm {d}s \nonumber \\&\le \int _0^t \sqrt{{\mathbb {E}}[\Vert E_h(t,s) \Vert ^2]} \sqrt{{\mathbb {E}}[ |Bv_h(s) |^2]} \, \mathrm {d}s \nonumber \\&\le C_{[A,T]}\sqrt{h \mathrm {Var}[{\mathcal {A}}]} \int _0^t \sqrt{{\mathbb {E}}[|Bv_h(s)|^2]} \, \mathrm {d}s \nonumber \\&\le C_{[A,T]} \sqrt{h \mathrm {Var}[{\mathcal {A}}]} \sqrt{t} \sqrt{\int _0^t {\mathbb {E}}[|Bv_h(s) |^2] \, \mathrm {d}s} \nonumber \\&\le C_{[A,T]} \sqrt{h \mathrm {Var}[{\mathcal {A}}]} \sqrt{{\mathbb {E}}[|Bv_h|^2_{L^2(0,T;{\mathbb {R}}^N)}]}, \end{aligned}$$

(130)

where the second inequality follows from the Cauchy–Schwarz inequality in the probability space, the third inequality from Corollary 1, the fourth inequality from the Cauchy–Schwarz inequality in $L^2(0,t)$, and the last inequality from $t \le T$.

Because the control u(t) is deterministic, Theorem 1 shows that

$$\begin{aligned} {\mathbb {E}}[|e_h(t) |^2] \le h \mathrm {Var}[{\mathcal {A}}] C_{[A,B,x_0,T,u]}. \end{aligned}$$

(131)

Finally, note that

$$\begin{aligned} |y({\varvec{\omega }},t) |^2&\le \left( \int _0^t \Vert e^{A(t-s)}\Vert |Bv_h({\varvec{\omega }},s) |\, \mathrm {d}s \right) ^2 \nonumber \\&\le \int _0^t \Vert e^{A(t-s)}\Vert ^2 \, \mathrm {d}s \int _0^t |Bv_h({\varvec{\omega }},s) |^2 \, \mathrm {d}s \le t|Bv_h({\varvec{\omega }})|_{L^2(0,T;{\mathbb {R}}^N)}^2. \end{aligned}$$

(132)

Therefore, also

$$\begin{aligned} {\mathbb {E}}[|y(t) |^2] \le C_{[B,T]}{\mathbb {E}}[|v_h |_{L^2(0,T;{\mathbb {R}}^q)}^2] . \end{aligned}$$

(133)

Inserting (130), (131), and (133) into (128) completes the proof. $\square $

We are now ready to prove the convergence result for the optimal controls.

Theorem 4

Suppose that the functional $J_h({\varvec{\omega }},\cdot )$ in (14) is $\alpha $-convex for all ${\varvec{\omega }} \in \Omega ^K$. Let $u_h^*({\varvec{\omega }},t)$ be the minimizer of $J_h({\varvec{\omega }},\cdot )$ in (14) and $u^*(t)$ be the minimizer of J in (2), then

$$\begin{aligned} \alpha ^2 {\mathbb {E}}[|u_h^* - u^*|_{L^2(0,T; {\mathbb {R}}^q)}^2] \le C_{[A,B,x_0,Q,R,x_d,T]} h \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(134)

Proof

We apply (57) with $J(\cdot ) = J_h({\varvec{\omega }}, \cdot )$, $v = u^*_h({\varvec{\omega }})$, and $u = u^*$ to find

$$\begin{aligned} J_h({\varvec{\omega }},u^*_h({\varvec{\omega }})) \ge J_h({\varvec{\omega }},u^*) + \delta J_h({\varvec{\omega }}, u^*; u_h^*({\varvec{\omega }}) - u^*) + \tfrac{\alpha }{2} |u_h^*({\varvec{\omega }}) - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}. \end{aligned}$$

(135)

Because $u^*_h({\varvec{\omega }})$ is the minimizer of $J_h({\varvec{\omega }}, \cdot )$, $J_h({\varvec{\omega }},u_h^*({\varvec{\omega }})) \le J_h({\varvec{\omega }},u^*)$ and

$$\begin{aligned} 0 \ge \delta J_h({\varvec{\omega }},u^*; u_h^*({\varvec{\omega }}) - u^*) + \tfrac{\alpha }{2} |u_h^*({\varvec{\omega }}) - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}. \end{aligned}$$

(136)

Bringing $\delta J_h$ to the other side, taking the absolute value and then the expectation, yields

$$\begin{aligned} \frac{\alpha }{2}{\mathbb {E}}[|u_h^* - u^*|^2_{L^2(0,T;{\mathbb {R}}^q)}] \le {\mathbb {E}}[|\delta J_h(u^*; u_h^* - u^*) |]. \end{aligned}$$

(137)

Since $u^*$ is the minimizer of J, $\delta J(u^*; v) = 0$ for all perturbations $v \in L^2(0,T;{\mathbb {R}}^q)$. In particular, we have that $\delta J(u^*; u_h^*({\varvec{\omega }}) - u^*) = 0$ for all ${\varvec{\omega }} \in \Omega ^K$ so that also

$$\begin{aligned} \frac{\alpha }{2}{\mathbb {E}}[|u_h^* - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}] \le {\mathbb {E}}[|\delta J_h(u^*; u_h^* - u^*) - \delta J(u^*; u_h^* - u^* )|]. \end{aligned}$$

(138)

We now apply Lemma 2 to the RHS with $u = u^*$ and $v_h({\varvec{\omega }}) = u^*_h({\varvec{\omega }}) - u^*$, which shows that

$$\begin{aligned} \frac{\alpha }{2}{\mathbb {E}}[|u_h^* - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}] \le C_{[A,B,x_0,Q,x_d,T,u^*]} \sqrt{h \mathrm {Var}[{\mathcal {A}}]} \sqrt{{\mathbb {E}}[|u_h^* - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}]}. \end{aligned}$$

(139)

Next, we divide (139) by $\tfrac{1}{2} \sqrt{{\mathbb {E}}[|u_h^*-u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}]}$ to find

$$\begin{aligned} \alpha \sqrt{{\mathbb {E}}[|u_h^* - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}]} \le C_{[A,B,x_0,Q,x_d,T,u^*]} \sqrt{h \mathrm {Var}[{\mathcal {A}}]}. \end{aligned}$$

(140)

Squaring both sides we arrive at

$$\begin{aligned} \alpha ^2 {\mathbb {E}}[|u_h^* - u^* |^2_{L^2(0,T;{\mathbb {R}}^q)}] \le C_{[A,B,x_0,Q,x_d,T,u^*]} h \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(141)

The result follows because the optimal control $u^*(t)$ only depends on the parameters $A,B,x_0,Q,R,x_d$, and T that define the original problem (1)–(2). $\square $

We now point out two corollaries of Theorem 4 that are important when we use the control $u_h^*({\varvec{\omega }} ,t)$ (optimized for the RBM-dynamics) to control the original dynamics. For the first corollary, we introduce the notation

$$\begin{aligned} x^*_h({\varvec{\omega }}, t)&= e^{At}x_0 + \int _0^t e^{A(t-s)}Bu_h^*({\varvec{\omega }}, s) \, \mathrm {d}s, \end{aligned}$$

(142)

$$\begin{aligned} x^*(t)&= e^{At}x_0 + \int _0^t e^{A(t-s)}Bu^*(s) \, \mathrm {d}s, \end{aligned}$$

(143)

i.e., $x^*_h({\varvec{\omega }}, t)$ is the solution of the original dynamics (1) resulting from the control $u_h^*({\varvec{\omega }}, t)$ optimized for the RBM-dynamics and $x^*(t)$ is the solution of the original dynamics (1) resulting from the optimal control $u^*(t)$.

Corollary 2

Suppose that the functional $J_h({\varvec{\omega }},\cdot )$ in (14) is $\alpha $-convex for all ${\varvec{\omega }} \in \Omega ^K$ and let $x_h^*({\varvec{\omega }}, t)$ and $x^*(t)$ be as in (142) and (143), respectively. Then

$$\begin{aligned} \alpha ^2 {\mathbb {E}}[|x^*_h(t) - x^*(t) |^2] \le C_{[A,B,x_0,Q,R,x_d,T]} h \mathrm {Var}[{\mathcal {A}}], \end{aligned}$$

(144)

for all $t \in [0,T]$.

Proof

Note that

$$\begin{aligned} x^*_h({\varvec{\omega }}, t) - x^*(t) = \int _0^t e^{A(t-s)}B(u_h^*({\varvec{\omega }}, s)- u^*(s)) \, \mathrm {d}s. \end{aligned}$$

(145)

Therefore also

$$\begin{aligned}&|x^*_h({\varvec{\omega }}, t) - x^*(t) |\le \int _0^t \Vert e^{A(t-s)}\Vert \Vert B\Vert |u_h^*({\varvec{\omega }}, s)- u^*(s) |\, \mathrm {d}s \nonumber \\&\quad \le \Vert B \Vert |u_h^*({\varvec{\omega }})- u^* |_{L^1(0,T; {\mathbb {R}}^q)} \le \Vert B \Vert \sqrt{T} \, |u_h^*({\varvec{\omega }})- u^*|_{L^2(0,T; {\mathbb {R}}^q)}, \end{aligned}$$

(146)

where the second inequality uses that $\Vert e^{At}\Vert \le 1$ in view of Assumption 1. The result now follows after squaring this inequality, taking the expectation, and using (134). $\square $

Corollary 3

Suppose that the cost functional $J_h({\varvec{\omega }},\cdot )$ is $\alpha $-convex for all ${\varvec{\omega }} \in \Omega ^K$. Let $u^*(t)$ be the (deterministic) control that minimizes the cost functional J(u) in (2) and let $u_h^*({\varvec{\omega }},t)$ be the control that minimizes the cost functional $J_h({\varvec{\omega }}, u)$ in (14). Then

$$\begin{aligned} \alpha ^2 {\mathbb {E}}[|J(u^*_h) - J(u^*) |] \le C_{[A,B,x_0,Q,R,x_d,T]} h \mathrm {Var}[{\mathcal {A}}]. \end{aligned}$$

(147)

Proof

Denote $v_h({\varvec{\omega }},t) := u_h^*({\varvec{\omega }},t) - u^*(t)$ and $y({\varvec{\omega }},t) := \int _0^t e^{A(t-s)} Bv_h({\varvec{\omega }},s) \, \mathrm {d}s$. Because the considered functional is quadratic,

$$\begin{aligned} J(u^*_h({\varvec{\omega }})) - J(u^*)&= J(u^* + v_h({\varvec{\omega }})) - J(u^*) \nonumber \\&= \delta J(u^*, v_h({\varvec{\omega }})) + \delta ^2 J(v_h({\varvec{\omega }}), v_h({\varvec{\omega }})), \end{aligned}$$

(148)

where the Hessian $\delta ^2 J(v_h({\varvec{\omega }}), v_h({\varvec{\omega }}))$ is given by

$$\begin{aligned} \delta ^2 J(v_h({\varvec{\omega }}), v_h({\varvec{\omega }}))&= \frac{1}{2}\int _0^T \left( y({\varvec{\omega }},t)^\top Q y({\varvec{\omega }},t) + v_h({\varvec{\omega }},t)^\top R v_h({\varvec{\omega }},t) \right) \, \mathrm {d}t. \end{aligned}$$

(149)

Because $u^*$ is the minimizer of $J(\cdot )$, $\delta J(u^*, v) = 0$ for all $v \in L^2(0,T; {\mathbb {R}}^q)$. The first term on the RHS of (148) thus vanishes. Also observe that

$$\begin{aligned} \delta ^2 J(v_h({\varvec{\omega }}), v_h({\varvec{\omega }})) \le \tfrac{1}{2}\Vert Q \Vert |y({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^N)}^2 + \tfrac{1}{2}\Vert R \Vert |v_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^q)}^2. \end{aligned}$$

(150)

A similar estimate as (132) shows that $|y({\varvec{\omega }})|_{L^2(0,T;{\mathbb {R}}^N)}^2 \le C_{[B,T]}|v_h({\varvec{\omega }})|_{L^2(0,T;{\mathbb {R}}^q)}^2$. Combining these results in (148), we conclude

$$\begin{aligned} |J(u^*_h({\varvec{\omega }})) - J(u^*) |&= J(u^*_h({\varvec{\omega }})) - J(u^*) \nonumber \\&= \delta ^2 J(v_h({\varvec{\omega }}), v_h({\varvec{\omega }})) \le C_{[B,Q,R,T]}|v_h({\varvec{\omega }}) |_{L^2(0,T;{\mathbb {R}}^q)}^2. \end{aligned}$$

(151)

The result now follows after taking the expectation and using the result from Theorem 4 to bound ${\mathbb {E}}[|v_h |_{L^2(0,T;{\mathbb {R}}^q)}^2] = {\mathbb {E}}[|u^*_h - u^* |_{L^2(0,T;{\mathbb {R}}^q)}^2]$. $\square $

4 Numerical results

In this section, we apply our proposed method to three medium- to large-scale linear dynamical systems, each obtained after spatial discretization of a linear PDE.

4.1 A discretized 1D heat equation

We consider a controlled heat equation on the 1-D spatial domain $[-L, L]$,

$$\begin{aligned} y_t(t,\xi )&= y_{\xi \xi }(t,\xi ) + \chi _{[-L/3,0]}(\xi ) u(t), \quad \xi \in [-L,L], \end{aligned}$$

(152)

$$\begin{aligned} y_\xi (t,-L)&= y_\xi (t,L) = 0, \quad y(0,\xi ) = e^{-\xi ^2} + \xi ^2e^{-L^2}, \end{aligned}$$

(153)

where $\chi _{[-L/3,0]}(\xi )$ denotes the characteristic function for the interval $[-L/3,0]$. We want to compute the optimal control $u^*(t)$ that minimizes

$$\begin{aligned} {\mathcal {J}}(u) = \frac{100}{2}\int _0^T \int _{-L}^0 y(t,\xi )^2 \, \mathrm {d}\xi \, \mathrm {d}t + \frac{1}{2}\int _0^T u(t)^2 \, \mathrm {d}t. \end{aligned}$$

(154)

The spatial discretization of the dynamics (152)–(153) is made by finite differences and the cost functional in (154) is discretized by the trapezoid rule. We choose a uniform spatial grid with $N = 61$ grid points $\xi _i = (i-1)\Delta \xi - L$ ($i \in \{ 1,2, \ldots , N \}$), where $\Delta \xi = 2L/(N-1)$ is the grid spacing, and obtain a system of the form (1).

The resulting A-matrix is of the form

$$\begin{aligned} A = \frac{1}{\Delta \xi ^2}\begin{bmatrix} -2 &{} 2 &{} 0 &{} \cdots &{} 0 &{} 0 &{} 0 \\ 1 &{} -2 &{} 1 &{} &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} -2 &{} &{} 0 &{} 0 &{} 0 \\ \vdots &{} &{} &{} \ddots &{} &{} &{} \vdots \\ 0 &{} 0 &{} 0 &{} &{} -2 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} &{} 1 &{} -2 &{} 1 \\ 0 &{} 0 &{} 0 &{} \cdots &{} 0 &{} 2 &{} -2 \end{bmatrix}. \end{aligned}$$

(155)

Observe that A can be written as

$$\begin{aligned} A = \sum _{i=1}^{n} {\tilde{A}}_i, \end{aligned}$$

(156)

where the $n := N-1 = 60$ matrices ${\tilde{A}}_i \in {\mathbb {R}}^{N \times N}$ are zero except for the entries

$$\begin{aligned} \begin{bmatrix} [{\tilde{A}}_1]_{11} &{} [{\tilde{A}}_1]_{12} \\ [{\tilde{A}}_1]_{21} &{} [{\tilde{A}}_1]_{22} \end{bmatrix}&= \frac{1}{\Delta \xi ^2}\begin{bmatrix} -2 &{} 2 \\ 1 &{} -1 \end{bmatrix}, \\ \begin{bmatrix} [{\tilde{A}}_i]_{ii} &{} [{\tilde{A}}_i]_{i,i+1} \\ [{\tilde{A}}_i]_{i+1,i} &{} [{\tilde{A}}_i]_{i+1,i+1} \end{bmatrix}&= \frac{1}{\Delta \xi ^2}\begin{bmatrix} -1 &{} 1 \\ 1 &{} -1 \end{bmatrix}, \quad 2 \le i \le n-1, \\ \begin{bmatrix} [{\tilde{A}}_n]_{nn} &{} [{\tilde{A}}_n]_{n,n+1} \\ [{\tilde{A}}_n]_{n+1,n} &{} [{\tilde{A}}_n]_{n+1,n+1} \end{bmatrix}&= \frac{1}{\Delta \xi ^2}\begin{bmatrix} -1 &{} 1 \\ 2 &{} -2 \end{bmatrix}. \end{aligned}$$

One can easily verify that the matrices ${\tilde{A}}_i$ are dissipative. We now define the M submatrices $A_m$ (for $M = 1,2,3,4$) as

$$\begin{aligned} A_m = \sum _{i=i_{m-1}+1}^{i_m} {\tilde{A}}_i, \end{aligned}$$

(157)

where $i_m = nm/M$. Because of (156), it is easy to see that the submatrices $A_m$ satisfy (5). Because the submatrices ${\tilde{A}}_i$ are dissipative, the submatrices $A_m$ in (157) are dissipative and Assumption 1 is satisfied.
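The construction in (155)–(157) can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the factor $1/\Delta \xi ^2$ from (155) is included in every ${\tilde{A}}_i$ so that (156) holds, and dissipativity is checked in the trapezoid-weighted inner product, which is an assumption about the inner product intended in the text.

```python
import numpy as np

def heat_matrix(N, L):
    """Finite-difference Neumann Laplacian on [-L, L], as in (155)."""
    d = 2.0 * L / (N - 1)
    A = np.zeros((N, N))
    for i in range(N):
        A[i, i] = -2.0
        if i > 0:
            A[i, i - 1] = 1.0
        if i < N - 1:
            A[i, i + 1] = 1.0
    A[0, 1] = 2.0          # first row of (155)
    A[N - 1, N - 2] = 2.0  # last row of (155)
    return A / d**2

def edge_matrices(N, L):
    """The n = N - 1 matrices A~_i; entry i (0-based) couples nodes i and i+1."""
    d = 2.0 * L / (N - 1)
    n = N - 1
    mats = []
    for i in range(n):
        block = np.array([[-1.0, 1.0], [1.0, -1.0]])
        if i == 0:
            block[0] = [-2.0, 2.0]   # boundary block of A~_1
        if i == n - 1:
            block[1] = [2.0, -2.0]   # boundary block of A~_n
        At = np.zeros((N, N))
        At[i:i + 2, i:i + 2] = block / d**2
        mats.append(At)
    return mats

def submatrices(N, L, M):
    """Group consecutive A~_i into M submatrices A_m, as in (157)."""
    mats = edge_matrices(N, L)
    n = len(mats)
    assert n % M == 0, "n must be divisible by M"
    size = n // M
    return [sum(mats[m * size:(m + 1) * size]) for m in range(M)]

def max_weighted_eig(A_part):
    """Largest eigenvalue of W @ A_part for trapezoid weights W (assumed inner product)."""
    w = np.ones(A_part.shape[0])
    w[0] = w[-1] = 0.5
    S = w[:, None] * A_part  # row scaling, equals diag(w) @ A_part
    assert np.allclose(S, S.T)
    return np.linalg.eigvalsh(S).max()

A = heat_matrix(61, 1.5)
parts = submatrices(61, 1.5, 2)
print(np.allclose(parts[0] + parts[1], A), max_weighted_eig(parts[0]))
```

A nonpositive (up to rounding) value returned by `max_weighted_eig` indicates that the corresponding submatrix is dissipative in the weighted inner product.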

Example 6

For $M = 2$ and $N = 61$, we obtain the splitting of the A-matrix in (155) as $A = A_1 + A_2$, with

$$\begin{aligned} A_1 = \begin{bmatrix} A_{11} &{} 0_{31\times 30} \\ 0_{30\times 31} &{} 0_{30\times 30} \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0_{30\times 30} &{} 0_{30\times 31} \\ 0_{31\times 30} &{} A_{22} \end{bmatrix}, \end{aligned}$$

(158)

where $A_{11}$ and $A_{22}$ are the $31 \times 31$-matrices

$$\begin{aligned} A_{11}&= \frac{1}{\Delta \xi ^2}\begin{bmatrix} -2 &{} 2 &{} 0 &{} \cdots &{} 0 &{} 0 &{} 0 \\ 1 &{} -2 &{} 1 &{} &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} -2 &{} &{} 0 &{} 0 &{} 0 \\ \vdots &{} &{} &{} \ddots &{} &{} &{} \vdots \\ 0 &{} 0 &{} 0 &{} &{} -2 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} &{} 1 &{} -2 &{} 1 \\ 0 &{} 0 &{} 0 &{} \cdots &{} 0 &{} 1 &{} -1 \end{bmatrix}, \end{aligned}$$

(159)

$$\begin{aligned} A_{22}&= \frac{1}{\Delta \xi ^2}\begin{bmatrix} -1 &{} 1 &{} 0 &{} \cdots &{} 0 &{} 0 &{} 0 \\ 1 &{} -2 &{} 1 &{} &{} 0 &{} 0 &{} 0 \\ 0 &{} 1 &{} -2 &{} &{} 0 &{} 0 &{} 0 \\ \vdots &{} &{} &{} \ddots &{} &{} &{} \vdots \\ 0 &{} 0 &{} 0 &{} &{} -2 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} &{} 1 &{} -2 &{} 1 \\ 0 &{} 0 &{} 0 &{} \cdots &{} 0 &{} 2 &{} -2 \end{bmatrix}. \end{aligned}$$

(160)

We will present numerical results for four cases:

Case i
We decompose A into $M = 2$ submatrices and assign a probability $\tfrac{1}{2}$ to the subsets $\{ 1 \}$ and $ \{ 2 \}$ and a probability 0 to the subsets $\emptyset $ and $\{1,2 \}$.
Case ii
We decompose A into $M = 3$ submatrices and assign a probability $\tfrac{1}{3}$ to the subsets $\{ 1 \}$, $ \{ 2 \}$, and $\{ 3 \}$ and a probability 0 to the other subsets of $\{1,2,3 \}$.
Case iii
We decompose A into $M = 4$ submatrices and assign a probability $\tfrac{1}{4}$ to the subsets $\{ 1 \}$, $ \{ 2 \}$, $\{ 3 \}$, and $\{ 4 \}$ and a probability 0 to the other subsets of $\{1,2,3,4\}$.
Case iv
We decompose A into $M = 4$ submatrices and assign a probability $\tfrac{1}{2}$ to the subsets $\{ 1,3 \}$ and $ \{ 2,4 \}$ and a probability 0 to the other subsets of $\{1,2,3,4\}$.

In all 4 cases, we fix $N = 61$, $L = \tfrac{3}{2}$, and $T = \tfrac{1}{2}$.
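The four cases can be encoded as probability tables over subsets. The sketch below assumes that $\pi _m$ denotes the probability that the index m belongs to the selected subset (the sum of $p_\omega $ over all subsets containing m), which is how the normalization $A_m/\pi _m$ is used in the time stepping; the helper name `inclusion_probabilities` is hypothetical.

```python
from fractions import Fraction

def inclusion_probabilities(M, subset_probs):
    """pi_m = sum of p_omega over all subsets S_omega that contain m."""
    pi = {m: Fraction(0) for m in range(1, M + 1)}
    for subset, p in subset_probs.items():
        for m in subset:
            pi[m] += p
    return pi

half, third, quarter = Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)
# Subsets with probability 0 are simply omitted from the tables.
cases = {
    "i": (2, {frozenset({1}): half, frozenset({2}): half}),
    "ii": (3, {frozenset({k}): third for k in (1, 2, 3)}),
    "iii": (4, {frozenset({k}): quarter for k in (1, 2, 3, 4)}),
    "iv": (4, {frozenset({1, 3}): half, frozenset({2, 4}): half}),
}

for name, (M, probs) in cases.items():
    assert sum(probs.values()) == 1  # the subset probabilities sum to one
    print(name, inclusion_probabilities(M, probs))
```

Under this assumption, Cases i and iv both use each submatrix with probability $\tfrac{1}{2}$, but Case iv activates two of the four submatrices in every time interval.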

We use a uniform grid $0=t_0< t_1< \ldots< t_{K-1} < t_K = T$ with a uniform grid spacing h. We will present results for $h = 2^{-5}$, $2^{-7}$, $2^{-9}$, $2^{-11}$, $2^{-13}$, and $2^{-15}$. For each of the $K = T/h$ time intervals $[t_{k-1},t_k)$, we select an index $\omega _k$ according to the probabilities specified in Cases i–iv above. The state $x_h({\varvec{\omega }},t)$ that satisfies (13) is computed using a single Crank–Nicolson step in each time interval $[t_{k-1}, t_k)$. We use precomputed LU-factorizations of the matrices $I - \tfrac{h}{2}\sum _{m\in S_{\omega }}\tfrac{A_m}{\pi _m}$ (for subsets $S_{\omega }$ with a nonzero probability $p_{\omega }$) that need to be inverted frequently.
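The time stepping described above can be sketched as follows for the uncontrolled case $u(t) = 0$. This is an illustration under stated assumptions rather than the implementation used for the experiments: the function name `randomized_cn` is hypothetical, indices are 0-based, and instead of storing LU factorizations the sketch precomputes the full one-step matrix for every subset with `numpy.linalg.solve`, which is simpler but less economical for large N.

```python
import numpy as np

def randomized_cn(A_subs, subsets, probs, x0, h, K, rng):
    """One Crank-Nicolson step per interval with a randomly drawn subset (u = 0)."""
    M, N = len(A_subs), len(x0)
    # pi[m]: probability that submatrix m is active in a given interval
    pi = np.zeros(M)
    for S, p in zip(subsets, probs):
        for m in S:
            pi[m] += p
    I = np.eye(N)
    # Precompute the one-step matrix for every subset that can be drawn
    steps = []
    for S in subsets:
        A_S = sum(A_subs[m] / pi[m] for m in S)
        steps.append(np.linalg.solve(I - 0.5 * h * A_S, I + 0.5 * h * A_S))
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(K):
        w = rng.choice(len(subsets), p=probs)  # index omega_k for this interval
        x = steps[w] @ x
        traj.append(x.copy())
    return np.array(traj)

# Toy usage with two symmetric dissipative 2x2 submatrices (hypothetical data)
A1 = np.array([[-1.0, 1.0], [1.0, -1.0]])
A2 = np.array([[-0.5, 0.0], [0.0, -2.0]])
rng = np.random.default_rng(0)
traj = randomized_cn([A1, A2], [(0,), (1,)], [0.5, 0.5], [1.0, 0.0], 0.01, 50, rng)
print(traj[-1])
```

When the only admissible subset contains all indices, every $\pi _m$ equals 1 and the sketch reduces to the standard Crank–Nicolson scheme for the original dynamics.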

The optimal control $u_h^*({\varvec{\omega }},t)$ that minimizes $J_h({\varvec{\omega }},u)$ in (14) is computed with a gradient-descent algorithm. The gradient is computed using the adjoint state $\varphi _h({\varvec{\omega }},t)$, see Remark 3. The time discretization for the adjoint state equation (15) is done using the scheme proposed in [1] that leads to discretely consistent gradients. The iterates $u^k$ are computed as $u^{k+1} = u^k-\beta \nabla J_h({\varvec{\omega }},u^k)$. The step size $\beta $ is chosen such that $J_h({\varvec{\omega }}, u^k - \beta \nabla J_h({\varvec{\omega }},u^k))$ is minimal. The algorithm is terminated when the relative change in $J_h({\varvec{\omega }},u)$ is below $10^{-6}$.
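For a quadratic functional, the step size that minimizes the cost along the negative gradient has a closed form. The sketch below illustrates this on a generic discrete quadratic $J(u) = \tfrac{1}{2}u^\top H u - b^\top u + c$; the matrix `H`, the vector `b`, and the constant `c` are hypothetical stand-ins for the discretized functional, and the text does not state that the step size was computed this way. In the actual method the gradient would be obtained from the adjoint state rather than from an explicit matrix H.

```python
import numpy as np

def steepest_descent(H, b, c, u0, tol=1e-6, max_iter=10_000):
    """Gradient descent with exact line search for J(u) = 0.5 u'Hu - b'u + c."""
    J = lambda u: 0.5 * u @ H @ u - b @ u + c
    u = np.array(u0, dtype=float)
    J_old = J(u)
    iterations = 0
    for _ in range(max_iter):
        g = H @ u - b                  # gradient of J at the current iterate
        gHg = g @ H @ g
        if gHg == 0.0:                 # zero gradient: u is already optimal
            break
        beta = (g @ g) / gHg           # minimizes J(u - beta * g) over beta
        u = u - beta * g
        iterations += 1
        J_new = J(u)
        if abs(J_new - J_old) <= tol * abs(J_old):  # relative change in J
            break
        J_old = J_new
    return u, iterations

H = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
u_opt, iterations = steepest_descent(H, b, 5.0, [0.0, 0.0])
print(u_opt, iterations)
```

Note that a stopping rule based on the relative change in the cost limits the accuracy of the computed control, which affects how small the observed errors can become.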

The results for the four considered cases are displayed in Fig. 2. Because the obtained results depend on the randomly selected indices stored in ${\varvec{\omega }}$, each marker in the subfigures in Fig. 2 represents the average error or duration over 25 random realizations of ${\varvec{\omega }}$. The error bars represent the $2\sigma $-confidence interval estimated from these 25 realizations. The errors are computed w.r.t. the solutions x(t) and $u^*(t)$ that are computed on the same time grid as the corresponding solutions $x_h({\varvec{\omega }},t)$ and $u^*_h({\varvec{\omega }},t)$. The displayed errors therefore do not reflect the errors due to the temporal (or spatial) discretization but capture only the error introduced by the proposed randomized splitting method.

Because the matrices A and $A_m$ represent approximations of unbounded operators, the variance $\mathrm {Var}[{\mathcal {A}}]$ defined in (17) will grow unbounded when the mesh is refined. This is also reflected by the large values of $\mathrm {Var}[{\mathcal {A}}]$ given in Table 1. It is therefore more natural to consider the variance $\mathrm {Var}_W[{\mathcal {A}}]$ in (19) weighted by a matrix of the form $W = (A - \lambda I)^{-1}$. The values of $\mathrm {Var}_W[{\mathcal {A}}]$ are indeed much smaller than the values of $\mathrm {Var}[{\mathcal {A}}]$ in Table 1. The results at the end of this subsection (in Fig. 4) also indicate that the weighted variance $\mathrm {Var}_W[{\mathcal {A}}]$ reflects the behavior of the error better when the mesh is refined.

The error estimates in Theorems 1, 3, and 4 and in Corollary 3 are proportional to $h\mathrm {Var}[{\mathcal {A}}]$. We therefore plot the errors in Fig. 2a–d against $\sqrt{h\mathrm {Var}_W[{\mathcal {A}}]}$ (with $W = (A - 0.1 I)^{-1}$) and expect that the errors for the different cases will be (approximately) on one line.

Table 1 Values of $\mathrm {Var}[{\mathcal {A}}]$ and $\mathrm {Var}_W[{\mathcal {A}}]$ for $W = (A - \lambda I)^{-1}$ with $\lambda = 0.1$


Figure 2a shows the difference $|x_h({\varvec{\omega }},t) - x(t)|$ between the solutions x(t) and $x_h({\varvec{\omega }},t)$ of (1) and (13) with $u(t) = 0$. Recall that the markers in this figure indicate the average error observed over 25 realizations of ${\varvec{\omega }}$, and are thus estimates for ${\mathbb {E}}[\max _{t \in [0,T]}|x_h(t) - x(t) |]$. Because ${\mathbb {E}}[|x_h(t) - x(t) |] \le \sqrt{{\mathbb {E}}[ |x_h(t) - x(t)|^2]}$, we expect (based on the bound in Theorem 1) that the errors in Fig. 2a are proportional to $\sqrt{h\mathrm {Var}_W[{\mathcal {A}}]}$. This is indeed confirmed by Fig. 2a.

Figure 2b shows the difference $|u_h^* - u^* |_{L^2(0,T)}$ between the optimal controls $u^*(t)$ and $u_h^*({\varvec{\omega }},t)$ that minimize (2) and (14), respectively. Based on the estimate in Theorem 4, we again expect that the observed errors are proportional to $\sqrt{h\mathrm {Var}_W[{\mathcal {A}}]}$. This is indeed the case and the proportionality constants for the different cases are again (approximately) equal, which is also expected based on the error estimate in Theorem 4.

The convergence in the optimal controls in Fig. 2b is also illustrated in Fig. 3. This figure shows the optimal controls $u_h^*({\varvec{\omega }},t)$ obtained for 25 randomly selected realizations of ${\varvec{\omega }} \in \Omega ^K$ (light red) for the six considered grid spacings h of the temporal grid. The figure also shows the average of the 25 optimal controls $u^*_h({\varvec{\omega }},t)$ (dark red) and the optimal control $u^*(t)$ for the original system (black). Figure 3 indeed shows that the optimal controls $u_h^*({\varvec{\omega }},t)$ get closer to the optimal control $u^*(t)$ when the spacing of the temporal grid h is reduced. Especially in Fig. 3a, b, it is also clear that the average of the 25 optimal controls $u^*_h({\varvec{\omega }},t)$ (dark red) is not equal to the optimal control $u^*(t)$ for the original system (black). This indicates that ${\mathbb {E}}[u_h^*] \ne u^*$, see also Remark 6. This means that $u^*_h$ is a biased estimator for $u^*$ and that averaging several realizations of $u_h^*({\varvec{\omega }},t)$ can only improve the approximation of $u^*(t)$ to a limited extent. Note, however, that

$$\begin{aligned} |{\mathbb {E}}[u_h^*] - u^* |= |{\mathbb {E}}[u_h^* - u^*] |\le {\mathbb {E}}[|u^*_h - u^* |] \le \sqrt{{\mathbb {E}}[|u^*_h - u^* |^2]}, \end{aligned}$$

(161)

so that Theorem 4 shows that ${\mathbb {E}}[u_h^*] \rightarrow u^*$ at a rate of $\sqrt{h \mathrm {Var}[{\mathcal {A}}]}$. An analysis of the numerical results (that is not presented in Fig. 2) also indicates that the average of the 25 realizations of $u_h^*({\varvec{\omega }},t)$ converges to $u^*(t)$ at this rate.

Figure 2c, d illustrates the convergence of $J_h({\varvec{\omega }},u^*_h({\varvec{\omega }}))$ and $J(u^*_h({\varvec{\omega }}))$ to $J(u^*)$. Figure 2c illustrates the error estimate in Theorem 3 and shows that the optimality gap $|J_h({\varvec{\omega }},u^*_h({\varvec{\omega }})) - J(u^*) |$ is indeed proportional to $\sqrt{h\mathrm {Var}_W[{\mathcal {A}}]}$. The difference between the different cases is more visible than in Fig. 2a, b. Figure 2d illustrates the error estimate in Corollary 3, which shows that the suboptimality of the RBM-control $|J(u_h^*({\varvec{\omega }})) - J(u^*) |$ is proportional to $h\mathrm {Var}_W[{\mathcal {A}}]$. The convergence rate is now twice as high as in the previous cases and the relative error stabilizes around $10^{-5}$, which seems to be related to the tolerance of $10^{-6}$ used in the computation of the optimal controls.

Figure 2e, f shows the computational times for (one realization of) $x_h({\varvec{\omega }},t)$ and $u^*_h({\varvec{\omega }},t)$ in Cases i–iv and the computational time for the original problem (labeled ‘Original’). Note that the results have been generated on temporal grids with different grid spacings h and that the computational time generally increases when more time steps are used, i.e. when h is smaller. The figures indicate that $x_h({\varvec{\omega }},t)$ and $u^*_h({\varvec{\omega }},t)$ are not computed faster than the solutions x(t) and $u^*(t)$ of the original problem. The proposed method thus does not lead to any reduction in computational time in this example. This seems to be because the original A-matrix is quite small ($N = 61$) and sparse (A is tridiagonal). The examples in the following two subsections indicate that a reduction in computational cost is obtained when the state dimension N is significantly higher or when A has significantly more nonzero off-diagonal elements.

To conclude this example, we study the dependence of our results on the number of grid points N. This gives us some indication whether the RBM can also be applied to infinite-dimensional problems. In particular, the results give us some indication whether the proposed randomized splitting also works for the underlying PDE problem (152)–(154). As we also noted in Remarks 5 and 7, the main concerns are related to the operator norm of A, which appears in $\mathrm {Var}[{\mathcal {A}}]$ and in the estimate in Theorem 1 and which grows unbounded when the mesh is refined. These concerns also motivated the introduction of the weighted variance $\mathrm {Var}_W[{\mathcal {A}}]$, see Remark 5.

If the estimate in Theorem 1 indeed depends on $\Vert A \Vert $, the error $|x_h({\varvec{\omega }},t) - x(t) |$ divided by $\mathrm {Var}[{\mathcal {A}}]$ should grow when N is increased. Figure 4a shows that this is not the case, but that this ratio actually decreases when N is increased. However, when we divide the errors by $\mathrm {Var}_W[{\mathcal {A}}]$, the result seems to be independent of the mesh size. Figure 4b shows that the same trend is observed for the errors in the optimal control.

The numerical results in Fig. 4 match well with the result from Appendix B, where we prove an error estimate proportional to $\mathrm {Var}_W[{\mathcal {A}}]$ under the additional assumption that all matrices $A_m$ commute. This result also extends to an infinite-dimensional setting when the domains of the operators $A_m$ coincide. However, in the setting considered here, the matrices $A_m$ do not commute and are not approximations of operators with the same domains. Proving the convergence of the proposed randomized time-splitting method for the underlying PDE problem (152)–(154) is a challenging topic for future research.

4.2 A discretized 3D heat equation

We now consider a heat equation on a 3-D spatial domain $V = [-L, L]^3$,

$$\begin{aligned}&y_t(t,{\varvec{\xi }}) = \Delta y(t, {\varvec{\xi }}), \qquad \qquad \qquad&{\varvec{\xi }} \in [-L,L]^3, \end{aligned}$$

(162)

$$\begin{aligned}&\nabla y(t,{\varvec{\xi }}) \cdot {\mathbf {n}} = u(t), \quad \qquad&{\varvec{\xi }} \in S_{\mathrm {top}}, \end{aligned}$$

(163)

$$\begin{aligned}&\nabla y(t,{\varvec{\xi }}) \cdot {\mathbf {n}} = 0, \quad \qquad&{\varvec{\xi }} \in \partial V \backslash S_{\mathrm {top}}, \end{aligned}$$

(164)

$$\begin{aligned}&y(0,{\varvec{\xi }}) = e^{-|{\varvec{\xi }} |^2/(8L^2)}, \end{aligned}$$

(165)

where $\nabla $ and $\Delta $ are the gradient and Laplacian operators w.r.t. ${\varvec{\xi }}$, ${\mathbf {n}}$ is the outward pointing normal, and $S_{\mathrm {top}}$ denotes the top surface $S_{\mathrm {top}} = \{ (\xi _1, \xi _2, \xi _3) \in [-L,L]^3 \mid \xi _3 = L \}$. The control u(t) can be considered as a uniform heat load on the top surface. We want to compute the control $u^*(t)$ that minimizes

$$\begin{aligned} J = 1000 \int _0^T \iint _{S_{\mathrm {side}}} (y(t,{\varvec{\xi }}))^2 \, \mathrm {d}{\varvec{\xi }} \, \mathrm {d}t + \int _0^T (u(t))^2 \, \mathrm {d}t, \end{aligned}$$

(166)

where $S_{\mathrm {side}}= \{ (\xi _1, \xi _2, \xi _3) \in [-L,L]^3 \mid \xi _1 = -L \}$. We fix $L = 0.75$ and $T = 2$.

The spatial discretization of (162)–(166) is made by finite differences using $16 \times 16 \times 16$ grid points in the $\xi _1$-, $\xi _2$-, and $\xi _3$-directions. This leads to a model of the form (1)–(2) with $N = 16^3 = 4096$ states. The resulting A-matrix is again dissipative. We create the decomposition of A into submatrices $A_m$ by observing that A is diagonally dominant. In particular, we have that

$$\begin{aligned}{}[A]_{ii} = -\sum _{\begin{array}{c} j=1\\ j \ne i \end{array}}^N [A]_{ij}, \end{aligned}$$

(167)

where the off-diagonal elements $[A]_{ij}$ ($j \ne i$) are nonnegative and the diagonal elements $[A]_{ii}$ are negative. By associating a matrix ${\tilde{A}}_{ij} \in {\mathbb {R}}^{N\times N}$ to each pair (i, j) with $j > i$, we obtain a decomposition of A as

$$\begin{aligned} A = \sum _{\begin{array}{c} i,j=1\\ j > i \end{array}}^N {\tilde{A}}_{ij}, \end{aligned}$$

(168)

where the matrices ${\tilde{A}}_{ij}$ ($j > i$) are zero except for the entries

$$\begin{aligned} \begin{bmatrix} {[}{\tilde{A}}_{ij}]_{ii} &{} [{\tilde{A}}_{ij}]_{ij} \\ [{\tilde{A}}_{ij}]_{ji} &{} [{\tilde{A}}_{ij}]_{jj} \end{bmatrix} = [A]_{ij} \begin{bmatrix} -1 &{} 1\\ 1 &{} -1 \end{bmatrix} \end{aligned}$$

(169)

Because the off-diagonal elements $[A]_{ij} \ge 0$ ($j \ne i$), it is easy to verify that all the matrices ${\tilde{A}}_{ij}$ are dissipative. Also note that the matrix A contains many zero off-diagonal elements, so that many of the matrices ${\tilde{A}}_{ij}$ are zero. There are only $3(16-1)16^2 = 11{,}520$ nonzero off-diagonal elements $[A]_{ij}$ with $j > i$ and thus only 11,520 nonzero matrices ${\tilde{A}}_{ij}$. The 11,520 nonzero matrices ${\tilde{A}}_{ij}$ are randomly divided into M groups of (approximately) equal size. The matrices $A_m$ in (5) are formed by summing the matrices ${\tilde{A}}_{ij}$ in each group.
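The edge-based decomposition (168)–(169) and the random grouping can be sketched on a small grid. The sketch assumes a symmetric matrix with zero row sums and unit coupling between neighboring grid points (a graph Laplacian on an $n \times n \times n$ grid), which is a simplification of the actual finite-difference matrix; the function names are hypothetical and dense matrices are used only because n is small.

```python
import numpy as np

def grid_laplacian(n):
    """Symmetric matrix with zero row sums coupling neighbors on an n^3 grid."""
    N = n**3
    idx = np.arange(N).reshape(n, n, n)
    A = np.zeros((N, N))
    for axis in range(3):
        a = np.take(idx, np.arange(n - 1), axis=axis).ravel()
        b = np.take(idx, np.arange(1, n), axis=axis).ravel()
        for i, j in zip(a, b):
            A[i, j] += 1.0
            A[j, i] += 1.0
            A[i, i] -= 1.0
            A[j, j] -= 1.0
    return A

def edge_list(A):
    """Pairs (i, j) with j > i and a nonzero off-diagonal entry [A]_ij."""
    i, j = np.nonzero(np.triu(A, k=1))
    return list(zip(i.tolist(), j.tolist()))

def random_groups(A, M, rng):
    """Distribute the edge matrices A~_ij of (169) randomly over M submatrices."""
    edges = edge_list(A)
    order = rng.permutation(len(edges))
    subs = [np.zeros_like(A) for _ in range(M)]
    for pos, e in enumerate(order):
        i, j = edges[e]
        a = A[i, j]
        m = pos % M  # round-robin over a random order: near-equal group sizes
        subs[m][i, i] -= a
        subs[m][j, j] -= a
        subs[m][i, j] += a
        subs[m][j, i] += a
    return subs

A = grid_laplacian(4)
subs = random_groups(A, 5, np.random.default_rng(0))
print(len(edge_list(A)), np.allclose(sum(subs), A))
```

For an $n \times n \times n$ grid the number of neighbor pairs is $3(n-1)n^2$, which gives the count 11,520 for $n = 16$.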

We again consider uniform time grids with a grid spacing h. In each time interval $[t_{k-1}, t_k)$, we randomly use P of the M submatrices simultaneously. In our formalism, we thus assign a probability $1/{M \atopwithdelims ()P}$ to each of the ${M \atopwithdelims ()P}$ subsets of $\{1,2, \ldots ,M \}$ of size P. The states $x_h({\varvec{\omega }},t)$ and the optimal controls $u^*_h({\varvec{\omega }},t)$ are computed in the same way as for the example in the previous subsection.
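The uniform distribution over subsets of size P can be enumerated directly for small M. The sketch below (with hypothetical helper names) also computes the probability that a given index m is contained in the selected subset, assuming this is the quantity that normalizes the submatrices.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def uniform_subsets(M, P):
    """All subsets of {1, ..., M} of size P and their common probability."""
    subsets = list(combinations(range(1, M + 1), P))
    p = Fraction(1, comb(M, P))
    return subsets, p

def inclusion_probability(M, P, m):
    """Probability that index m is contained in the randomly selected subset."""
    subsets, p = uniform_subsets(M, P)
    return sum(p for S in subsets if m in S)

subsets, p = uniform_subsets(4, 2)
print(len(subsets), p, inclusion_probability(4, 2, 1))
```

Each index is contained in ${M-1 \atopwithdelims ()P-1}$ of the ${M \atopwithdelims ()P}$ subsets, so the inclusion probability equals P/M for every m.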

The obtained results are presented in Fig. 5. The average errors (indicated by the markers) and the 2$\sigma $-confidence intervals (indicated by the error bars) are now estimated based on 10 realizations of ${\varvec{\omega }}$. Figure 5a–d again shows the convergence rates expected based on our theoretical results, just as in Fig. 2a–d for the example in the previous subsection. We also observe that the errors are smaller when larger parts of A are used simultaneously, i.e., when P/M is larger.

Figure 5e, f also shows a computational advantage of the proposed method. Naturally, the computational advantage increases when the matrix ${\mathcal {A}}_h({\varvec{\omega }},t)$ is more sparse, i.e., when P/M is smaller. This situation is significantly different from the 1D heat equation considered in the previous subsection. For that example, the proposed method did not lead to any computational advantage. Apart from the larger state dimension N in the 3D example, this difference seems to be related to the more ‘dense interconnection structure’ of the 3D problem (in which every node is typically connected to 6 neighboring nodes) compared to the 1D problem (in which every node is connected to two neighboring nodes). This idea will be explored further in the next subsection in which we consider a model with an even denser interconnection structure.

4.3 A FE discretization of the fractional Laplacian

We consider a controlled fractional heat equation on a 1-D spatial domain $\xi \in [-L, L]$,

$$\begin{aligned} y_t(t,\xi )&= -(-d_\xi ^2)^sy(t,\xi ) + \chi _{[-L/3,0]}(\xi ) u_1(t) + \chi _{[L/3,2L/3]}(\xi ) u_2(t), \end{aligned}$$

(170)

$$\begin{aligned} y(t,-L)&= y(t,L) = 0, \quad y(0,\xi ) = e^{-\beta ^2\xi ^2} - e^{-\beta ^2L^2}, \end{aligned}$$

(171)

with the fractional power $s \in (0,1)$. We fix $s = 0.7$, $L = 5$, and $\beta = 0.4$. Note that the control $u(t) = [u_1(t), u_2(t)]^\top $ now has two components. Our aim is to compute the optimal control $u^*(t) = [u^*_1(t), u^*_2(t)]^\top $ that minimizes

$$\begin{aligned} {\mathcal {J}}(u) = \frac{100}{2}\int _0^T \int _{-L}^L y(t,\xi )^2 \, \mathrm {d}\xi \, \mathrm {d}t + \frac{1}{2}\int _0^T \left( u_1(t)^2 + u_2^2(t) \right) \, \mathrm {d}t. \end{aligned}$$

(172)

A Finite Element (FE) discretization of (170)–(171) with $N+1$ linear elements of equal length takes the form

$$\begin{aligned} E {\dot{x}}(t) = Ax(t) + Bu(t), \quad x(0) = x_0, \end{aligned}$$

(173)

where the state x(t) evolves in ${\mathbb {R}}^N$. Note that (173) now also contains the symmetric and positive definite mass matrix E and is thus not exactly of the form (1), but that the proposed method also applies to systems of this form. An explicit expression for the stiffness matrix A can be found in [5]. Because the fractional Laplacian is a nonlocal operator, all elements of A are nonzero. From the expressions for the coefficients of A in [5] we can verify that A is symmetric and diagonally dominant, i.e.

$$\begin{aligned} -[A]_{ii} > \sum _{\begin{array}{c} j=1 \\ j \ne i \end{array}}^N |[A]_{ij} |. \end{aligned}$$

(174)

We can now write

$$\begin{aligned} A = \sum _{\begin{array}{c} i,j=1 \\ j \ge i \end{array}}^N {\tilde{A}}_{ij} = \sum _{\begin{array}{c} i,j=1 \\ j > i \end{array}}^N {\tilde{A}}_{ij} + \sum _{i=1}^N {\tilde{A}}_{ii}, \end{aligned}$$

(175)

where the matrices ${\tilde{A}}_{ij} \in {\mathbb {R}}^{N \times N}$ ($j \ge i$) are zero except for the coefficients

$$\begin{aligned} \begin{bmatrix} [{\tilde{A}}_{ij}]_{ii} &{} [{\tilde{A}}_{ij}]_{ij} \\ [{\tilde{A}}_{ij}]_{ji} &{} [{\tilde{A}}_{ij}]_{jj} \end{bmatrix} = \begin{bmatrix} -|[A]_{ij} |&{} [A]_{ij} \\ [A]_{ij} &{} -|[A]_{ij} |\end{bmatrix}, \quad [{\tilde{A}}_{ii}]_{ii} = [A]_{ii} + \sum _{\begin{array}{c} j=1 \\ j \ne i \end{array}}^N |[A]_{ij} |.\qquad \end{aligned}$$

(176)

Again, it is easy to verify that the matrices ${\tilde{A}}_{ij}$ ($j \ge i$) are dissipative.

Now assume that N is divisible by some number P. We then decompose A into $M = P(P+1)/2$ submatrices $A_m$ as in (5) by setting

$$\begin{aligned} A_{m(p,q)} = \sum _{i=i_{p-1}+1}^{i_p} \sum _{j=i_{q-1}+1}^{i_q} {\tilde{A}}_{ij}, \quad q \ge p \in \{ 1,2, \ldots , P\}, \end{aligned}$$

(177)

where $i_p = pN/P$ and m(p, q) is a bijection

$$\begin{aligned} m: \{ (p,q) \in \{1,2, \ldots , P \}^2 \mid q \ge p \} \rightarrow \{1,2, \ldots , P(P+1)/2 \}. \end{aligned}$$

(178)

We thus effectively decompose A into blocks of size $N/P \times N/P$, but we treat the diagonal in such a way that all submatrices $A_m$ are dissipative.

We only use one of the matrices $A_m$ in each time interval $[t_{k-1},t_k)$ and thus assign uniform probabilities $2/(P(P+1))$ to each of the $M = P(P+1)/2$ subsets of $\{1,2, \ldots , M \}$ of size 1.
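The block decomposition (175)–(177) can be sketched for a generic symmetric, strictly diagonally dominant matrix. The sketch uses 0-based indices, a row-major ordering as one possible choice for the bijection m(p, q), and a random test matrix in place of the fractional-Laplacian stiffness matrix; all function names are hypothetical.

```python
import numpy as np

def block_index(P):
    """One possible bijection m(p, q) for q >= p (0-based, row-major order)."""
    pairs = [(p, q) for p in range(P) for q in range(p, P)]
    return {pq: m for m, pq in enumerate(pairs)}

def block_split(A, P):
    """Submatrices A_m of (177) for a symmetric, diagonally dominant A."""
    N = A.shape[0]
    assert N % P == 0, "N must be divisible by P"
    size = N // P
    m_of = block_index(P)
    subs = [np.zeros_like(A) for _ in range(len(m_of))]
    for i in range(N):
        # Diagonal remainder from (176); negative by the dominance condition (174)
        rest = A[i, i] + np.sum(np.abs(A[i])) - abs(A[i, i])
        subs[m_of[(i // size, i // size)]][i, i] += rest
        for j in range(i + 1, N):
            S = subs[m_of[(i // size, j // size)]]
            a = A[i, j]
            S[i, i] -= abs(a)
            S[j, j] -= abs(a)
            S[i, j] += a
            S[j, i] += a
    return subs

# Random symmetric, strictly diagonally dominant test matrix (hypothetical data)
rng = np.random.default_rng(1)
B = rng.standard_normal((12, 12))
A = 0.5 * (B + B.T)
np.fill_diagonal(A, 0.0)
np.fill_diagonal(A, -(np.abs(A).sum(axis=1) + 1.0))

subs = block_split(A, 4)
print(len(subs), np.allclose(sum(subs), A))
```

Because every ${\tilde{A}}_{ij}$ is symmetric and negative semidefinite, each resulting submatrix is dissipative regardless of how the pairs (i, j) are grouped; the block structure only determines which entries of A are active simultaneously.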

The results obtained for $N = 96$ are shown in Fig. 6. The markers and the error bars in this figure again indicate the average and $2\sigma $-confidence interval estimated from 10 realizations of ${\varvec{\omega }}$. Results are presented for $P=4$, 8, 16, and 32, which correspond to values of $M=10$, 36, 136, and 528, respectively. Note that the number of submatrices M is now much larger than in the previous two examples, and that also $h\mathrm {Var}[{\mathcal {A}}]$ and the relative errors are larger than in the previous examples. Figure 6b, c even shows relative errors that exceed 100%. However, we still observe the convergence rates predicted by the theoretical results in Sect. 3 in Fig. 6a–d. In particular, the convergence rate in Fig. 6d is again twice as high as in the other figures.

When we inspect the computational times in Fig. 6e, f, we see that increasing M decreases the computational time. In particular, solutions for $M = 528$ are typically computed 2–3 times faster than the solutions for the original dynamics. We expect that the computational advantage of the proposed method increases further when we increase the state dimension N.

5 Conclusions and discussions

5.1 Conclusions

We have proposed a general framework for randomized time-splitting in LQ optimal control problems. It has been shown that the dynamics, the minimal values of the cost functional, and the optimal control obtained with the proposed randomized time-splitting method converge in expectation to their analogues in the original problem when the grid spacing of the time grid goes to zero. The convergence rates in our theoretical results are also observed in three numerical examples.

In two of the three considered examples, the proposed method leads to a typical reduction in computational cost of a factor 2–3. Only in the first example of a heat equation on a 1-D spatial domain, no reduction in computational cost could be observed. This seems to be the case because the matrix A is not very large and already very sparse in this example.

5.2 Extension to unbounded operators

We have considered finite-dimensional systems in this paper, but the numerical examples in Sect. 4 are all obtained after spatial discretization of an infinite-dimensional system. A natural question is therefore whether our results can be extended to an infinite-dimensional setting. We already touched on this question in Remarks 5 and 7 and in Appendix B. In particular, at the end of Appendix B we indicate how results can be extended to an infinite dimensional setting under the (strong) additional assumptions that all operators $A_m$ commute and have the same domain $D(A_m)$.

It should be noted that the assumption that $D(A_m) = D(A)$ is very strong and will not be satisfied in many applications. A prototypical example is the splitting of an advection-diffusion problem with zero Dirichlet boundary conditions (represented by A) into an advective part (represented by $A_1$) and a diffusive part (represented by $A_2$). Functions in $D(A_2)$ can then satisfy the zero Dirichlet boundary conditions on the whole boundary, but the functions in $D(A_1)$ only satisfy the zero Dirichlet boundary conditions on the parts of the boundary where the velocity field is pointing inward. The analysis of the RBM becomes much more subtle in these kinds of situations. The numerical results in Fig. 4 also seem to indicate that the proposed randomized time-splitting method converges under weaker assumptions than the ones in Appendix B.

The technical difficulties encountered when weakening these assumptions are related to the difficulties in deterministic operator splitting with unbounded operators. These date back to the paper [28] by Trotter, and have been an active field of research since then, see, e.g., [12, 16, 19, 23, 24]. As the large literature on this topic indicates, determining the necessary conditions for the convergence of the proposed stochastic operator splitting method with unbounded operators is an interesting but challenging topic for future research.

5.3 Extension to nonlinear dynamics

Another important topic for future research is the extension of our results for the linear-quadratic optimal control problem to problems with nonquadratic cost functions constrained by nonlinear dynamics. This extension is particularly interesting because of the connections between the training of certain types of Deep Neural Networks (DNNs) and optimal control, see, e.g., [4, 9, 10, 27, 29], and is also important for the control of interacting particle systems, see [18].

In the most general setting, we would replace the linear dynamics (1) by the nonlinear dynamics

$$\begin{aligned} {\dot{x}}(t) = f(x(t), u(t)), \quad x(0) = x_0, \end{aligned}$$

(179)

where $f : {\mathbb {R}}^N \times {\mathbb {R}}^q \rightarrow {\mathbb {R}}^N$ is Lipschitz in the first variable x. As an analogue of (5), we then write (for $x \in {\mathbb {R}}^N$ and $u \in {\mathbb {R}}^q$)

$$\begin{aligned} f(x,u) = \sum _{m=1}^M f_m(x,u), \end{aligned}$$

(180)

for certain Lipschitz continuous functions $f_m : {\mathbb {R}}^N \times {\mathbb {R}}^q \rightarrow {\mathbb {R}}^N$. Similarly as in this paper, we choose a time grid $0 = t_0< t_1< t_2< \cdots < t_K = T$, enumerate the subsets $S_1, S_2, \ldots ,S_{2^M}$ of $\{1,2, \ldots , M \}$ and assign probabilities $p_1, p_2, \ldots , p_{2^M}$ to them, and randomly select a K-tuple ${\varvec{\omega }} = (\omega _1, \omega _2, \ldots , \omega _K)$ of indices $\omega _k \in \{1,2,\ldots 2^M \}$ according to the selected probabilities. We then consider the (typically simpler) dynamics

$$\begin{aligned} {\dot{x}}_h({\varvec{\omega }}, t) = \sum _{m \in S_{\omega _k}} \frac{f_m(x_h({\varvec{\omega }},t), u_h({\varvec{\omega }}, t))}{\pi _m}, \quad t \in [t_{k-1},t_k). \end{aligned}$$

(181)

Extending Theorem 1 (which considers the forward dynamics with a deterministic control $u_h({\varvec{\omega }},t) = u(t)$) to such a nonlinear setting seems possible along the lines of the results for interacting-particle systems in [14]. The main difficulty is in Theorem 2 where we use the variation of constants formula to obtain an estimate for a stochastic control $u_h({\varvec{\omega }},t)$ (which depends on the randomly selected indices in ${\varvec{\omega }}$). The variation of constants formula can be extended to a nonlinear setting, see, e.g., [7], but this leads to several additional complications which we aim to address in a future work.
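
The construction of the randomized dynamics (181) can be sketched in Python for a scalar state. This is only an illustration under stated assumptions: the splitting $f = f_1 + f_2$, the subsets, their probabilities, and the explicit Euler integrator are all hypothetical choices, indices are 0-based, and only the subsets with nonzero probability are listed. The probabilities $\pi _m$ are computed as the total probability of the subsets that contain m and are assumed to be positive.

```python
import math
import random


def inclusion_probabilities(subsets, probs, M):
    """pi_m: total probability of the listed subsets that contain index m."""
    return [sum(p for S, p in zip(subsets, probs) if m in S) for m in range(M)]


def randomized_rhs(fs, subset, pi):
    """Right-hand side of the randomized dynamics for one selected subset."""
    def rhs(x, u):
        return sum(fs[m](x, u) / pi[m] for m in subset)
    return rhs


def simulate(fs, subsets, probs, x0, u, T, K, substeps, rng):
    """Explicit Euler for the randomized dynamics on a uniform grid of K intervals."""
    pi = inclusion_probabilities(subsets, probs, len(fs))
    h = T / K
    dt = h / substeps
    x, t = x0, 0.0
    for _ in range(K):
        # One subset is drawn per interval [t_{k-1}, t_k) and kept fixed on it.
        subset = rng.choices(subsets, weights=probs, k=1)[0]
        rhs = randomized_rhs(fs, subset, pi)
        for _ in range(substeps):
            x = x + dt * rhs(x, u(t))
            t += dt
    return x


# Hypothetical example: f(x, u) = -x + (sin(x) + u), split into two parts.
fs = [lambda x, u: -x, lambda x, u: math.sin(x) + u]
subsets = [(0,), (1,), (0, 1)]
probs = [0.25, 0.25, 0.5]
x_T = simulate(fs, subsets, probs, x0=1.0, u=lambda t: 0.1,
               T=1.0, K=20, substeps=5, rng=random.Random(0))
```

The division by $\pi _m$ makes the randomized right-hand side unbiased: averaging `randomized_rhs` over the subsets with weights `probs` recovers $f(x,u)$ at any fixed $(x,u)$.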

When an analogue of Theorem 2 for nonlinear dynamics can be obtained, a bound on ${\mathbb {E}}[|J_h(u_h) - J(u_h) |]$ as in Lemma 1 should follow relatively easily from a Lipschitz condition on the integrand in the considered cost function. An analogue of the no-gap condition, i.e., a bound on ${\mathbb {E}}[|J_h(u_h^*) - J(u^*) |]$, can then be obtained using classical arguments from the calculus of variations and the bound on ${\mathbb {E}}[|J_h(u_h) - J(u_h) |]$, similarly as for the linear-quadratic case in Theorem 3.

With these results, the suboptimality gap ${\mathbb {E}}[|J(u^*_h) - J(u^*) |]$ can be bounded using the analogues of Lemma 1 and Theorem 3 as follows. We start by noting that the triangle inequality shows that

$$\begin{aligned} |J(u^*_h({\varvec{\omega }})) - J(u^*) |\le |J(u^*_h({\varvec{\omega }})) - J_h({\varvec{\omega }}, u^*_h({\varvec{\omega }})) |+ |J_h({\varvec{\omega }}, u^*_h({\varvec{\omega }})) - J(u^*) |.\nonumber \\ \end{aligned}$$

(182)

Taking the expectation in this inequality, we see that the first term on the RHS can be bounded using (the analogue of) Lemma 1 and the second term on the RHS can be bounded using (the analogue of) Theorem 3. We thus obtain a bound on ${\mathbb {E}}[|J(u^*_h) - J(u^*) |]$ that is of order $\sqrt{h}$. It is interesting to observe that this rate is slower than the rate of order h found for the linear-quadratic case in Corollary 3. This difference seems to occur because Corollary 3 relies on the strict convexity of the functional, which is lost in a setting in which the dynamics are nonlinear.
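
Written out, the argument takes the following form. The constants $C_1$ and $C_2$ are placeholders for the (not yet established) constants in the nonlinear analogues of Lemma 1 and Theorem 3, and it is assumed here that both analogues hold with rate $\sqrt{h}$:

$$\begin{aligned} {\mathbb {E}}[|J(u^*_h) - J(u^*) |] \le \underbrace{{\mathbb {E}}[|J(u^*_h) - J_h(u^*_h) |]}_{\le C_1 \sqrt{h}} + \underbrace{{\mathbb {E}}[|J_h(u^*_h) - J(u^*) |]}_{\le C_2 \sqrt{h}} \le (C_1 + C_2) \sqrt{h}. \end{aligned}$$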

5.4 Combination with model predictive control

As suggested in [18], it is natural to combine the proposed randomized time-splitting method with an MPC strategy. The resulting algorithm is essentially a receding horizon strategy, see, e.g., [2, 3, 25], but we now use the proposed stochastic time-splitting method to approximate the optimal controls that need to be computed in each step. An important element of such a receding horizon strategy is that the optimal control is computed based on the current state of the original dynamics (1). This creates a feedback mechanism that provides additional robustness against the errors introduced by the proposed stochastic time-splitting method.

The receding horizon strategy introduces two additional parameters in the control algorithm: the prediction horizon ${\hat{T}}$ and the control horizon $\tau $. When the prediction horizon ${\hat{T}}$ is too short, the difference between the controls computed on the prediction horizon $[0, {\hat{T}}]$ and the desired optimal control on $[0, \infty )$ will be large. Decreasing the control horizon $\tau $ strengthens the feedback mechanism of the MPC strategy, which will likely allow for larger errors in the proposed stochastic time-splitting method. This idea could be formalized further by deriving an explicit error estimate that demonstrates the interaction of the control horizon $\tau $ and $h\mathrm {Var}[{\mathcal {A}}]$ (which characterizes the accuracy of the proposed random time-splitting method).
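
The structure of such a receding horizon strategy can be sketched as follows. This is a generic skeleton, not the algorithm of [18]: the callbacks `solve_ocp` and `step_plant` are hypothetical, and the scalar demo uses a fixed feedback gain as a stand-in for a control computed with the randomized time-splitting method.

```python
import math


def receding_horizon(x0, n_cycles, T_hat, tau, solve_ocp, step_plant):
    """Generic receding-horizon loop.

    solve_ocp(x, T_hat): returns a control function u(s) on [0, T_hat], planned
        from the current state x (an approximate solver would be called here).
    step_plant(x, u, tau): returns the state of the original dynamics after
        applying u on [0, tau].
    """
    if not 0 < tau <= T_hat:
        raise ValueError("need 0 < tau <= T_hat")
    x = x0
    trajectory = [x0]
    for _ in range(n_cycles):
        u = solve_ocp(x, T_hat)      # plan on the prediction horizon
        x = step_plant(x, u, tau)    # apply the plan only on the control horizon
        trajectory.append(x)         # feedback: the next plan starts from x
    return trajectory


def demo_solve_ocp(x, T_hat, gain=2.0):
    """Stand-in for an (approximate) optimal control: constant u = -gain * x."""
    return lambda s: -gain * x


def demo_step_plant(x, u, tau, a=0.5):
    """Exact step of x' = a x + u, assuming u is constant on [0, tau]."""
    growth = math.exp(a * tau)
    return growth * x + (growth - 1.0) / a * u(0.0)


trajectory = receding_horizon(1.0, 20, T_hat=1.0, tau=0.1,
                              solve_ocp=demo_solve_ocp, step_plant=demo_step_plant)
```

In this demo the uncontrolled plant is unstable, and the state decays only because the control is recomputed from the measured state at the start of every control horizon.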

References

Apel, T., Flaig, T.G.: Crank–Nicolson schemes for optimal control problems with evolution equations. SIAM J. Numer. Anal. 50(3), 1484–1512 (2012). ISSN 0036-1429. https://doi.org/10.1137/100819333
Azmi, B., Kunisch, K.: On the stabilizability of the Burgers equation by receding horizon control. SIAM J. Control Optim. 54(3), 1378–1405 (2016). ISSN 0363-0129. https://doi.org/10.1137/15M1030352
Azmi, B., Kunisch, K.: Receding horizon control for the stabilization of the wave equation. Discrete Contin. Dyn. Syst. 38(2), 449–484 (2018). ISSN 1078-0947. https://doi.org/10.3934/dcds.2018021
Benning, M., Celledoni, E., Ehrhardt, M.J., Owren, B., Schönlieb, C.-B.: Deep learning as optimal control problems: models and numerical methods. J. Comput. Dyn. 6(2), 171–198 (2019). ISSN 2158-2491. https://doi.org/10.3934/jcd.2019009
Biccari, U., Hernández-Santamaría, V.: Controllability of a one-dimensional fractional heat equation: theoretical and numerical aspects. IMA J. Math. Control Inf. 36(4), 1199–1235 (2018). ISSN 0265-0754. https://doi.org/10.1093/imamci/dny025
Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Rev. 60(2), 223–311 (2018). ISSN 0036-1445. https://doi.org/10.1137/16M1080173
Brauer, F.: Perturbations of nonlinear systems of differential equations. J. Math. Anal. Appl. 14, 198–206 (1966). ISSN 0022-247X. https://doi.org/10.1016/0022-247X(66)90021-7
Dolean, V., Jolivet, P., Nataf, F.: An Introduction to Domain Decomposition Methods: Algorithms, Theory, and Parallel Implementation. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (2015). ISBN 978-1-611974-05-8. https://doi.org/10.1137/1.9781611974065.ch1
Esteve, C., Geshkovski, B.: Sparse approximation in learning via neural ODEs (2021)
Esteve, C., Geshkovski, B., Pighin, D., Zuazua, E.: Large-time asymptotics in deep learning (2021)
Grüne, L., Pannek, J.: Nonlinear Model Predictive Control. Communications and Control Engineering Series. Springer, Cham (2017). ISBN 978-3-319-46023-9; 978-3-319-46024-6. https://doi.org/10.1007/978-3-319-46024-6. Theory and algorithms, Second edition [of MR3155076]
Hansen, E., Ostermann, A.: Dimension splitting for evolution equations. Numer. Math. 108(4), 557–570 (2008). ISSN 0029-599X. https://doi.org/10.1007/s00211-007-0129-3
Ignat, L.I.: A splitting method for the nonlinear Schrödinger equation. J. Differ. Equ. 250(7), 3022–3046 (2011). ISSN 0022-0396. https://doi.org/10.1016/j.jde.2011.01.028
Jin, S., Li, L., Liu, J.-G.: Random batch methods (RBM) for interacting particle systems. J. Comput. Phys. 400, 108877 (2020). ISSN 0021-9991. https://doi.org/10.1016/j.jcp.2019.108877
Jin, S., Li, L., Liu, J.-G.: Convergence of random batch method for interacting particles with disparate species and weights (2020)
Kato, T.: Trotter’s product formula for an arbitrary pair of self-adjoint contraction semigroups. In: Topics in Functional Analysis (Essays Dedicated to M. G. Kreĭn on the Occasion of his 70th birthday), Adv. in Math. Suppl. Stud., vol. 3, pp. 185–195. Academic Press, New York (1978)
Kirk, D.E.: Optimal Control Theory: An Introduction. Dover (2004)
Ko, D., Zuazua, E.: Model predictive control with random batch methods for a guiding problem. Math. Models Methods Appl. Sci. 31(8), 1569–1592 (2021). ISSN 0218-2025. https://doi.org/10.1142/S0218202521500329
Lapidus, M.L.: Generalization of the Trotter-Lie formula. Integral Equ. Oper. Theory 4(3), 366–415 (1981). ISSN 0378-620X. https://doi.org/10.1007/BF01697972
Lee, E.B., Markus, L.: Foundations of Optimal Control Theory. John Wiley & Sons Inc, New York (1967)
Li, L., Xu, Z., Zhao, Y.: A random-batch Monte Carlo method for many-body systems with singular kernels. SIAM J. Sci. Comput. 42(3), A1486–A1509 (2020). ISSN 1064-8275. https://doi.org/10.1137/19M1302077
Minoux, M., Vajda, S.: Mathematical Programming: Theory and Algorithms. A Wiley-Interscience publication, Wiley (1986). ISBN 9780471901709. https://books.google.de/books?id=5kDvAAAAMAAJ
Neidhardt, H., Zagrebnov, V.A.: On error estimates for the Trotter-Kato product formula. Lett. Math. Phys. 44(3), 169–186 (1998). ISSN 0377-9017. https://doi.org/10.1023/A:1007494816401
Ostermann, A., Schratz, K.: Stability of exponential operator splitting methods for noncontractive semigroups. SIAM J. Numer. Anal. 51(1), 191–203 (2013). ISSN 0036-1429. https://doi.org/10.1137/110846580
Reble, M., Allgöwer, F.: Unconstrained model predictive control and suboptimality estimates for nonlinear continuous-time systems. Automatica J. IFAC 48(8), 1812–1817 (2012). ISSN 0005-1098. https://doi.org/10.1016/j.automatica.2012.05.067
Rohatgi, V.K., Ehsanes Saleh, A.K.M.: An Introduction to Probability and Statistics, 3rd edn. Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken (2015). ISBN 978-1-118-79964-2. https://doi.org/10.1002/9781118799635
Ruiz-Balet, D., Zuazua, E.: Neural ODE control for classification, approximation and transport (2021)
Trotter, H.F.: On the product of semi-groups of operators. Proc. Am. Math. Soc. 10, 545–551 (1959). ISSN 0002-9939. https://doi.org/10.2307/2033649
E, W.: A proposal on machine learning via dynamical systems. Commun. Math. Stat. 5(1), 1–11 (2017). ISSN 2194-6701. https://doi.org/10.1007/s40304-017-0103-z


Acknowledgements

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 694126-DyCon), the Alexander von Humboldt-Professorship program, the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Grant agreement No. 765579-ConFlex and the Transregio 154 Project “Mathematical Modelling, Simulation and Optimization Using the Example of Gas Networks”, project C08, of the German DFG, the grant PID2020-112617GB-C22, “Kinetic equations and learning control” of the Spanish MINECO, and the COST Action grant CA18232, “Mathematical models for interacting dynamics on networks” (MAT-DYN-NET).

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Chair in Dynamics, Control, and Numerics (Alexander-von-Humboldt Professorship), Department of Data Science, Friedrich-Alexander Universität (FAU) Erlangen-Nürnberg, Cauerstrasse 11, 91052, Erlangen, Germany
D. W. M. Veldman & E. Zuazua
Departamento de Matemáticas, Universidad Autónoma de Madrid, Ciudad Universitaria de Cantoblanco, 28049, Madrid, Spain
E. Zuazua
Chair of Computational Mathematics, Fundación Deusto, Av. de las Universidades 24, 48007, Bilbao, Basque-Country, Spain
E. Zuazua

Authors

D. W. M. Veldman
E. Zuazua

Corresponding author

Correspondence to D. W. M. Veldman.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Interacting particle systems in the proposed framework

In this appendix, we explain the connection of our framework to the previously proposed RBMs for interacting particle systems in [14, 15, 18, 21]. We consider a (linearized first-order) system of N interacting particles

$$\begin{aligned} {\dot{x}}_i(t) = \frac{1}{N-1} \sum _{\begin{array}{c} j=1\\ j\ne i \end{array}}^N a_{ij}(x_j(t) - x_i(t)), \quad x_i(0) = x_{0,i}, \quad i \in \{1,2,\ldots N \}, \end{aligned}$$

(A1)

where the $a_{ij} \in {\mathbb {R}}$ ($j \ne i$) are constants. To simplify the following exposition, we assume that the number of particles N is divisible by some number $P > 1$.

We discuss here one particular RBM called RBM-1 in [14], but other variants can be treated similarly. We first choose a time grid $0 = t_0< t_1< t_2< \cdots< t_{K-1} < t_K = T$ in the time interval [0, T]. In each time interval $[t_{k-1}, t_k)$, we then choose a random partition of the index set $\{1,2, \ldots , N \}$ into disjoint subsets ${\mathcal {B}}_r^k$ (also called batches) of size P ($r \in \{1,2, \ldots , N/P \}$). We consider only the interactions between particles that are in the same batch. To formalize this idea, note that, in each time interval $[t_{k-1}, t_k)$, every particle i is contained in precisely one batch ${\mathcal {B}}^k_{r(i,k)}$. We thus consider the dynamics

$$\begin{aligned} {\dot{x}}_{\mathrm {RBM},i}(t) = \frac{1}{P-1}\sum _{\begin{array}{c} j\in {\mathcal {B}}^k_{r(i,k)} \\ j \ne i \end{array}} a_{ij}(x_{\mathrm {RBM},j}(t) - x_{\mathrm {RBM},i}(t)), \quad x_{\mathrm {RBM},i}(0) = x_{0,i}. \end{aligned}$$

(A2)
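
A minimal Python sketch of the batch selection and of the right-hand side of (A2) is given below. It is an illustration only: particle labels are 0-based, the coefficients are stored in a nested list `a` with `a[i][j]` $= a_{ij}$, and the function names are hypothetical.

```python
import random


def random_batches(N, P, rng):
    """Random partition of {0, ..., N-1} into N/P batches of size P."""
    if N % P != 0:
        raise ValueError("N must be divisible by P")
    indices = list(range(N))
    rng.shuffle(indices)
    return [sorted(indices[r:r + P]) for r in range(0, N, P)]


def rbm_rhs(x, a, batches):
    """Right-hand side of the batch dynamics for one fixed partition.

    Each particle only interacts with the particles in its own batch,
    and the sum is scaled by 1/(P-1) instead of 1/(N-1). Requires P > 1.
    """
    P = len(batches[0])
    dx = [0.0] * len(x)
    for batch in batches:
        for i in batch:
            dx[i] = sum(a[i][j] * (x[j] - x[i]) for j in batch if j != i) / (P - 1)
    return dx


# Example: N = 6 particles in batches of size P = 2.
batches = random_batches(6, 2, random.Random(0))
```

A new call to `random_batches` would be made at the start of every time interval $[t_{k-1}, t_k)$. For $P = N$ there is a single batch and `rbm_rhs` reduces to the right-hand side of (A1).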

To connect this idea to our framework, we write (A1) in matrix form

$$\begin{aligned} {\dot{x}}(t) = A x(t), \quad x(0) = x_0, \quad A = \frac{1}{N-1} \sum _{\begin{array}{c} i,j = 1\\ i \ne j \end{array}}^N {\tilde{A}}_{ij}, \end{aligned}$$

(A3)

where $x(t) = [x_1(t), x_2(t), \ldots , x_N(t)]^\top $ and $x_0 = [x_{0,1}, x_{0,2}, \ldots , x_{0,N}]^\top $, and the entries of the matrices ${\tilde{A}}_{ij}$ ($j \ne i$) are zero except for the entries

$$\begin{aligned} \begin{bmatrix} [{\tilde{A}}_{ij}]_{ij}&[{\tilde{A}}_{ij}]_{ii} \end{bmatrix} = a_{ij} \begin{bmatrix} 1&-1 \end{bmatrix}. \end{aligned}$$

(A4)

Also the RBM-dynamics (A2) can be written in matrix form as

$$\begin{aligned} {\dot{x}}_{\mathrm {RBM}}(t) = {\mathcal {A}}_{\mathrm {RBM}}(t) x_{\mathrm {RBM}}(t), \quad x_{\mathrm {RBM}}(0) = x_0, \end{aligned}$$

(A5)

where

$$\begin{aligned} {\mathcal {A}}_{\mathrm {RBM}}(t) = \frac{1}{P-1} \sum _{r=1}^{N/P}\sum _{\{i,j\} \subseteq {\mathcal {B}}_r^k} {\tilde{A}}_{ij}, \quad t \in [t_{k-1}, t_k). \end{aligned}$$

(A6)

Note that the probability that two distinct indices i and j are in the same batch (i.e., the probability that $j \ne i$ is in the batch ${\mathcal {B}}^k_{r(i,k)}$) is $(P-1)/(N-1)$: once the index i has been fixed, $P-1$ places remain in ${\mathcal {B}}^k_{r(i,k)}$, and each of the $N-1$ other indices is equally likely to occupy them. This factor is also visible in the definitions of A and ${\mathcal {A}}_{\mathrm {RBM}}(t)$.
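
The role of this factor can be checked on a small example by enumerating all partitions and averaging the resulting matrices. The sketch below does this in exact arithmetic. It assumes that the sum over $\{i,j\} \subseteq {\mathcal {B}}_r^k$ in (A6) runs over ordered pairs with $i \ne j$, uses 0-based labels, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def partitions_into_batches(indices, P):
    """Yield every partition of `indices` into unordered batches of size P."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    # Fixing the batch of the smallest index first avoids duplicates.
    for others in combinations(rest, P - 1):
        remaining = [i for i in rest if i not in others]
        for tail in partitions_into_batches(remaining, P):
            yield [(first,) + others] + tail


def interaction_matrix(a, pairs, scale):
    """Sum of scale * A~_ij over ordered pairs (i, j), with entries as in (A4)."""
    N = len(a)
    A = [[Fraction(0)] * N for _ in range(N)]
    for i, j in pairs:
        A[i][j] += scale * a[i][j]
        A[i][i] -= scale * a[i][j]
    return A


def full_matrix(a):
    """The matrix A of the full dynamics, scaled by 1/(N-1)."""
    N = len(a)
    pairs = [(i, j) for i in range(N) for j in range(N) if i != j]
    return interaction_matrix(a, pairs, Fraction(1, N - 1))


def rbm_matrix(a, batches):
    """The batch matrix for one partition, scaled by 1/(P-1)."""
    P = len(batches[0])
    pairs = [(i, j) for b in batches for i in b for j in b if i != j]
    return interaction_matrix(a, pairs, Fraction(1, P - 1))


def expected_rbm_matrix(a, P):
    """Average of the batch matrices over all equally likely partitions."""
    N = len(a)
    parts = list(partitions_into_batches(list(range(N)), P))
    total = [[Fraction(0)] * N for _ in range(N)]
    for batches in parts:
        A_rbm = rbm_matrix(a, batches)
        for i in range(N):
            for j in range(N):
                total[i][j] += A_rbm[i][j] / len(parts)
    return total
```

For any coefficient matrix `a`, `expected_rbm_matrix(a, P)` should coincide with `full_matrix(a)`, which reflects that the batch dynamics reproduce the full dynamics on average.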

To make the connection to our proposed framework, we enumerate the $M = N(N-1)$ interaction matrices ${\tilde{A}}_{ij}$, i.e., we choose a bijection

$$\begin{aligned} {\mathfrak {m}} : \{ (i,j) \in \{1,2, \ldots , N \}^2 \mid i \ne j \} \rightarrow \{1,2, \ldots , N(N-1) \}, \end{aligned}$$

(A7)

and set

$$\begin{aligned} A_{{\mathfrak {m}}(i,j)} := \frac{1}{N-1} {\tilde{A}}_{ij}. \end{aligned}$$

(A8)

We then need to assign probabilities $p_\omega $ to the $2^M$ subsets $S_\omega $ of $\{1,2, \ldots , M \}$. Naturally, we only assign nonzero probabilities to subsets $S_\omega $ that correspond to a partition ${\dot{\cup }}_r {\mathcal {B}}_r= \{1,2, \ldots , N \}$, i.e. sets of the form

$$\begin{aligned} S_\omega = \{ {\mathfrak {m}}(i,j) \mid i \ne j\, \mathrm {and}\, \{ i,j\} \subseteq {\mathcal {B}}_r\, \mathrm {for\, some}\, r \}. \end{aligned}$$

(A9)

Standard combinatorics shows that there are

$$\begin{aligned} {\mathcal {N}} = \frac{N!}{(P!)^{N/P} \left( N/P \right) !}, \end{aligned}$$

(A10)

distinct partitions of N indices into N/P subsets of size P. We assign a probability $p_\omega = 1 /{\mathcal {N}}$ to each of the subsets of the form (A9).

It remains to compute the probabilities $\pi _m = \pi _{{\mathfrak {m}}(i,j)}$ defined in (9), i.e. to determine how many of the subsets $S_\omega $ of the form (A9) contain $m = {\mathfrak {m}}(i,j)$. When a certain batch ${\mathcal {B}}_{r^*}$ contains i and j ($j \ne i$) there are ${N-2 \atopwithdelims ()P-2}$ ways to fill the remaining positions in ${\mathcal {B}}_{r^*}$ with $P-2$ of the $N-2$ remaining indices. Once the indices in ${\mathcal {B}}_{r^*}$ are fixed, there are

$$\begin{aligned} {\mathcal {M}} = \frac{(N-P)!}{(P!)^{N/P-1} \left( N/P - 1\right) !}, \end{aligned}$$

(A11)

ways to distribute the remaining $N-P$ indices into $N/P-1$ subsets of size P. We thus conclude that

$$\begin{aligned} \pi _m = \frac{{N-2 \atopwithdelims ()P-2} {\mathcal {M}}}{{\mathcal {N}}} \end{aligned}$$

(A12)

Using the formulas for ${\mathcal {N}}$ and ${\mathcal {M}}$, it can be verified that

$$\begin{aligned} \pi _m = \frac{P-1}{N-1}. \end{aligned}$$

(A13)
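
The simplification from (A12) to (A13) can be checked in exact integer arithmetic. The following sketch evaluates (A10), (A11), and (A12) directly. The function names are hypothetical, and $P \ge 2$ is assumed.

```python
from fractions import Fraction
from math import comb, factorial


def num_partitions(N, P):
    """Number of partitions of N indices into N/P unordered batches of size P."""
    if N % P != 0:
        raise ValueError("N must be divisible by P")
    n = N // P
    return factorial(N) // (factorial(P) ** n * factorial(n))


def inclusion_probability(N, P):
    """Fraction of partitions in which a fixed pair of indices shares a batch."""
    # Choose the P-2 other members of the shared batch, then partition the
    # remaining N-P indices into N/P - 1 batches.
    favourable = comb(N - 2, P - 2) * num_partitions(N - P, P)
    return Fraction(favourable, num_partitions(N, P))
```

For instance, `inclusion_probability(96, 8)` evaluates to `Fraction(7, 95)`, in agreement with $(P-1)/(N-1)$.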

It is now easy to verify that the definition of ${\mathcal {A}}_h({\varvec{\omega }},t)$ in (11) is equivalent to the definition of ${\mathcal {A}}_{\mathrm {RBM}}(t)$ in (A6).

Appendix B: An alternative for Corollary 1

In this appendix, we will prove a result similar to Corollary 1 under the additional assumption that all matrices commute. The proof is quite intuitive and gives an idea about how the results in this paper can be generalized to an infinite dimensional setting.

The analysis in this appendix uses the following additional assumption.

Assumption 3

Suppose that the matrices $A_1, A_2, \ldots , A_M$ all commute pairwise, i.e.

$$\begin{aligned} A_{m}A_{m'} = A_{m'}A_m, \end{aligned}$$

(B14)

for all $m, m' \in \{1,2, \ldots , M \}$.

Also observe that for any two commuting dissipative matrices $X,Y \in {\mathbb {R}}^{N \times N}$ and vector $x_0 \in {\mathbb {R}}^N$ we have that

$$\begin{aligned} |e^Xx_0 - e^Yx_0 |&= \left|\int _0^1 \frac{d}{d\tau } e^{X \tau + Y(1-\tau )}x_0 \, \mathrm {d}\tau \right|\nonumber \\&\le \int _0^1 \Vert e^{X \tau + Y(1-\tau )} \Vert |(X-Y)x_0 |\mathrm {d}\tau \le |(X - Y)x_0 |, \end{aligned}$$

(B15)

where it was used that $X\tau + Y(1-\tau )$ is dissipative for $\tau \in [0,1]$ because X and Y are dissipative by assumption.

Theorem 5

Under Assumptions 1, 2, and 3, we have that

$$\begin{aligned} {\mathbb {E}}[|S_h(t,s)x_0 - e^{A(t-s)}x_0 |^2] \le 2 h(t-s) \mathrm {Var}_W[{\mathcal {A}}] |W^{-1}x_0 |^2, \end{aligned}$$

(B16)

for all $0 \le s \le t \le T$, all $x_0 \in {\mathbb {R}}^N$, and all invertible matrices W.

Proof

We use the notation from Remark 10, so $\ell $ and k are such that $s \in [t_{\ell -1}, t_\ell )$ and $t \in [t_{k-1}, t_k)$, ${\tilde{K}} = k - \ell + 1$, and

$$\begin{aligned} {\tilde{t}}_0 := s< {\tilde{t}}_1 := t_\ell< {\tilde{t}}_2 := t_{\ell +1}< \cdots< {\tilde{t}}_{{\tilde{K}}-1} := t_{k-1} < {\tilde{t}}_{{\tilde{K}}} := t, \end{aligned}$$

(B17)

see also Fig. 1. Furthermore, we denote ${\tilde{h}}_p := {\tilde{t}}_p - {\tilde{t}}_{p-1}$ for $p \in \{1,2, \ldots , {\tilde{K}} \}$ and denote ${\mathcal {A}}_\omega := \sum _{m \in S_\omega } A_m/\pi _m$ for $\omega \in \{1,2, \ldots ,2^M \}$. Note that ${\mathcal {A}}_h({\varvec{\omega }},\tau ) = {\mathcal {A}}_{\omega _p}$ for $\tau \in [{\tilde{t}}_{p-1}, {\tilde{t}}_p)$ and that ${\mathcal {A}}_\omega $ is dissipative for all $\omega \in \{1,2, \ldots , 2^M \}$ because of Assumption 1.

Because the matrices ${\mathcal {A}}_\omega $ (with $\omega \in \{1,2, \ldots , 2^M \}$) all commute pairwise due to Assumption 3, the formula for $S_h({\varvec{\omega }}, t,s)$ in (94) in Remark 10 reduces to

$$\begin{aligned} S_h({\varvec{\omega }}, t,s)x_0 = \exp \left( \sum _{p=1}^{{\tilde{K}}} {\mathcal {A}}_{\omega _{p+\ell -1}}{\tilde{h}}_p \right) x_0. \end{aligned}$$

(B18)

Because Assumption 1 implies that the matrix in the exponent in the formula above and A are both dissipative, (B15) can be applied to find that

$$\begin{aligned} |S_h({\varvec{\omega }}, t, s)x_0 - e^{A(t-s)}x_0 |\le \left|\sum _{p=1}^{{\tilde{K}}} \left( {\mathcal {A}}_{\omega _{p+\ell -1}} - A \right) {\tilde{h}}_p x_0 \right|, \end{aligned}$$

(B19)

where it was used that $\sum _{p=1}^{{\tilde{K}}} {\tilde{h}}_p = t-s$. Squaring this expression yields

$$\begin{aligned}&|S_h({\varvec{\omega }}, t, s)x_0 - e^{A(t-s)}x_0 |^2 \nonumber \\&\quad \le \sum _{p,p'=1}^{{\tilde{K}}} {\tilde{h}}_p {\tilde{h}}_{p'} \langle ({\mathcal {A}}_{\omega _{p+\ell -1}} - A) x_0, ({\mathcal {A}}_{\omega _{p'+\ell -1}} - A ) x_0 \rangle . \end{aligned}$$

(B20)

When we take the expected value, the terms with $p \ne p'$ disappear because

$$\begin{aligned}&{\mathbb {E}}[\langle ({\mathcal {A}}_{\omega _{p+\ell -1}} - A) x_0, ({\mathcal {A}}_{\omega _{p'+\ell -1}} - A ) x_0 \rangle ] \nonumber \\&\quad = \sum _{\omega = 1}^{2^M} \sum _{\omega ' = 1}^{2^M} \langle ({\mathcal {A}}_{\omega } - A) x_0, ({\mathcal {A}}_{\omega '} - A ) x_0 \rangle p_{\omega } p_{\omega '} \nonumber \\&\quad = \left\langle \sum _{\omega = 1}^{2^M} p_\omega ({\mathcal {A}}_\omega - A)x_0, \sum _{\omega '=1}^{2^M} p_{\omega '} ({\mathcal {A}}_{\omega '} - A)x_0 \right\rangle = \langle 0, 0 \rangle = 0, \end{aligned}$$

(B21)

where the first identity follows after writing $\omega = \omega _{p+\ell -1}$ and $\omega '= \omega _{p'+\ell -1}$ (which are selected independently for $p \ne p'$), and the second to last identity from (12) and (8). Therefore, only the terms with $p = p'$ remain after taking the expected value of (B20) and

$$\begin{aligned}&{\mathbb {E}}[|S_h(t, s)x_0 - e^{A(t-s)}x_0 |^2] \nonumber \\&\quad \le \sum _{\omega _\ell = 1}^{2^M} \sum _{\omega _{\ell +1} = 1}^{2^M} \cdots \sum _{\omega _{\ell +{\tilde{K}}-1} = 1}^{2^M} \sum _{p=1}^{{\tilde{K}}} {\tilde{h}}_p^2 |({\mathcal {A}}_{\omega _{p+\ell -1}} - A) x_0 |^2 p_{\omega _\ell } p_{\omega _{\ell +1}} \ldots p_{\omega _{\ell + {\tilde{K}}-1}} \nonumber \\&\quad = \sum _{p=1}^{{\tilde{K}}} {\tilde{h}}_p^2 \sum _{\omega =1}^{2^M} |({\mathcal {A}}_{\omega } - A)x_0 |^2 p_\omega . \end{aligned}$$

(B22)

The proof is completed with two straightforward observations. First of all, note that because ${\tilde{h}}_p \le h$

$$\begin{aligned} \sum _{p=1}^{{\tilde{K}}} {\tilde{h}}_p^2 \le \sum _{p=1}^{{\tilde{K}}} h {\tilde{h}}_p = h \sum _{p=1}^{{\tilde{K}}} {\tilde{h}}_p = h(t-s). \end{aligned}$$

(B23)

Secondly, we have that

$$\begin{aligned} \sum _{\omega =1}^{2^M} |({\mathcal {A}}_{\omega } - A)x_0 |^2 p_\omega&= \sum _{\omega =1}^{2^M} |({\mathcal {A}}_{\omega } - A)WW^{-1}x_0 |^2 p_\omega \nonumber \\&\le \sum _{\omega =1}^{2^M} \Vert ({\mathcal {A}}_{\omega } - A)W \Vert ^2 |W^{-1}x_0 |^2 p_\omega . \end{aligned}$$

(B24)

The result follows after inserting (B23) and (B24) into (B22). $\square $
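
The estimate can be illustrated numerically in the simplest commuting case, where all $A_m$ are diagonal with nonpositive entries. The sketch below computes the expected squared error exactly by enumerating all index tuples and compares it with the bound $h(t-s) \sum _\omega p_\omega \Vert ({\mathcal {A}}_\omega - A)W \Vert ^2 |W^{-1}x_0 |^2$ that results from inserting (B23) and (B24) into (B22). It assumes $W = I$, $s = 0$, $t = T$, a uniform grid, and that exactly one matrix is selected per interval with probability $1/M$ (so $\pi _m = 1/M$). The function names and the test data are hypothetical.

```python
import math
from itertools import product


def expected_squared_error(diags, x0, T, K):
    """Exact E|S_h(T,0)x0 - e^{AT}x0|^2 for diagonal matrices A_m = diag(diags[m])."""
    M, N = len(diags), len(x0)
    h = T / K
    A = [sum(d[i] for d in diags) for i in range(N)]
    total = 0.0
    for omega in product(range(M), repeat=K):
        # Diagonal matrices commute, so S_h is a single exponential as in (B18).
        # With pi_m = 1/M the selected matrix is scaled by M.
        err2 = 0.0
        for i in range(N):
            exponent = h * sum(M * diags[m][i] for m in omega)
            err2 += ((math.exp(exponent) - math.exp(A[i] * T)) * x0[i]) ** 2
        total += err2 / M ** K
    return total


def variance_bound(diags, x0, T, K):
    """h * T * sum_w p_w ||A_w - A||^2 * |x0|^2 with W = I."""
    M, N = len(diags), len(x0)
    A = [sum(d[i] for d in diags) for i in range(N)]
    # The spectral norm of a diagonal matrix is its largest absolute entry.
    var = sum(max(abs(M * d[i] - A[i]) for i in range(N)) ** 2 for d in diags) / M
    return (T / K) * T * var * sum(v * v for v in x0)


diags = [[-1.0, -0.5], [-0.2, -2.0]]
x0 = [1.0, 1.0]
error = expected_squared_error(diags, x0, T=1.0, K=8)
bound = variance_bound(diags, x0, T=1.0, K=8)
```

In this example `error` stays below `bound`, as the proof predicts. The enumeration visits $M^K$ tuples, so it is only feasible for small K.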

The proof of Theorem 5 extends naturally to an infinite dimensional setting as follows. Most of the definitions and notations from Sect. 2 remain unchanged, apart from the following.

The state and the control no longer evolve in the finite-dimensional spaces ${\mathbb {R}}^N$ and ${\mathbb {R}}^q$, but in the (potentially) infinite-dimensional Hilbert spaces X and U, respectively.
A and $A_m$ (with $m \in \{1,2, \ldots , M \}$) now represent the generators of $C_0$-semigroups $e^{At}$ and $e^{A_mt}$ on the Hilbert space X with domains D(A) and $D(A_m)$, respectively.
B is now a bounded linear operator from U to X.

For simplicity we assume that the domains of the operators $A_m$ are all the same and equal to the domain of A, i.e. $D(A_m) = D(A)$. For a value of $\lambda $ in the resolvent set of A, the resolvent $W = (A - \lambda I)^{-1}$ is a bounded operator $X \rightarrow D(A) \subset X$ with (unbounded) inverse $A - \lambda I$ and one now easily verifies that AW and $A_mW$ represent bounded operators on X, meaning that $\mathrm {Var}_W[{\mathcal {A}}]$ as introduced in Remark 5 is bounded. For $|W^{-1}x_0 |= |(A - \lambda I)x_0 |$ to be bounded, we require that $x_0 \in D(A)$. The proof of Theorem 5 can thus be applied in this setting with the additional assumption that $x_0 \in D(A)$. The proof remains effectively unchanged.

Note that when we want to use Theorem 5 to obtain a result similar to Theorem 2, we also need a smoothness assumption on the input operator B. In particular, similarly as (104) in Theorem 2, we would then like to bound

$$\begin{aligned} \int _0^t \left|(S_h({\varvec{\omega }},t,s) - e^{A(t-s)})Bu_h({\varvec{\omega }},s) \right|\, \mathrm {d}s, \end{aligned}$$

(B25)

which is only possible with Theorem 5 when $|W^{-1}Bu_h({\varvec{\omega }},s) |$ is finite. To this end one would typically require that the range of B is contained in D(A).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Veldman, D.W.M., Zuazua, E. A framework for randomized time-splitting in linear-quadratic optimal control. Numer. Math. 151, 495–549 (2022). https://doi.org/10.1007/s00211-022-01290-3


Received: 27 September 2021
Revised: 11 March 2022
Accepted: 09 April 2022
Published: 11 May 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s00211-022-01290-3

