1 Introduction

In this article we consider Model Predictive Control (MPC) for systems governed by linear parabolic PDEs. This approach is also known as moving horizon control or receding horizon control; we refer to the seminal monographs [9, 18] for a comprehensive presentation of the method. The core of every MPC step at time \(t_i\) consists in solving a parabolic PDE constrained optimization problem on the prediction horizon \([t_i,t_i+{\bar{T}}]\), where \({\bar{T}}>0\). To solve this problem efficiently we propose a time-adaptive residual based a-posteriori error control concept for the elliptic space-time reformulation of the optimality system of the PDE constrained optimization problem. The contribution of our paper and the novelty of our approach is two-fold:

  • it delivers an adaptive time discretization of the prediction horizon \([t_i,t_i+{\bar{T}}]\) using residual based a-posteriori error control concepts, and

  • it suggests two possible strategies for the choice \(\tau _i \le {\bar{T}}\) of the application horizon \([t_i,t_i+\tau _i]\subseteq [t_i,t_i+{{\bar{T}}}]\) for the current MPC step. The length \(\tau _i\) of the application horizon may be constant or chosen adaptively. In both cases the interval \([t_i,t_i+\tau _i]\) is discretized adaptively using the a-posteriori error control.

Our time-adaptive MPC algorithm works as follows, where the details of its formulation are given in Sect. 2.

[Algorithm 1: time-adaptive MPC algorithm]

Our adaptive concept is implemented in the first statement of the For-loop in Algorithm 1 and works as follows. In a first step, we rewrite the optimality conditions of the MPC optimization problem (1) as an elliptic equation for the state variable which is of second order in time and fourth order in space, and to which we then apply classical concepts from residual based a-posteriori error control for the time variable. This allows us to construct a time grid for the state which is related to the optimal state solution. The time grid is also used to discretize the application horizon \([t_i,t_i+\tau _i]\) in the current MPC step. The choice of \(\tau _i\) may be made according to steps 5-9 in Algorithm 1. The idea is based on [8] and is transferred here to a mixed formulation, where the a-posteriori error estimate is obtained from a semi-time discrete mixed form. For the fast computation of the adaptive time grid we use a coarse spatial discretization, where we assume that the structure of the temporal grid is insensitive to changes in the spatial resolution. This is verified heuristically by numerical examples in e.g. [2, 3]. In a second step, the resulting time grid is used for the numerical solution of the MPC optimization problem (1). Once the adaptive grid is obtained, we solve the coupled optimality system directly using a monolithic approach (see e.g. [20, Section 3.7]). Finally, the state is updated on the application horizon \([t_i,t_i+\tau _i]\) through the solution of the parabolic equation (25) with the optimal control \(u^N\) (more precisely the feedback value \(\phi ^N\)).
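The receding-horizon structure described above can be illustrated by a minimal sketch. It is not Algorithm 1 itself: it assumes a scalar model \(y' = a y + u\) in place of the PDE, a uniform grid where the method would construct an adaptive one, a constant application horizon, and a reduced (not monolithic) solve of the open-loop problem. The names `solve_open_loop` and `mpc` and all parameter values are hypothetical.

```python
import numpy as np

def solve_open_loop(y0, grid, a, alpha, y_d=0.0):
    """Discrete open-loop problem for y' = a*y + u (implicit Euler on `grid`)."""
    dt = np.diff(grid)
    n = dt.size
    # E y = dt*u + c encodes (y_j - y_{j-1})/dt_j = a*y_j + u_j
    E = np.diag(1.0 - a * dt) - np.eye(n, k=-1)
    c = np.zeros(n)
    c[0] = y0
    S = np.linalg.solve(E, np.diag(dt))   # control-to-state map
    s0 = np.linalg.solve(E, c)            # response to the initial value
    D = np.diag(dt)                       # quadrature weights of the cost
    # Normal equations of the reduced quadratic tracking cost
    return np.linalg.solve(S.T @ D @ S + alpha * D, -S.T @ D @ (s0 - y_d))

def mpc(y0, t_end, T_bar, tau, n_steps, a, alpha):
    """Receding-horizon loop with prediction horizon T_bar, application horizon tau."""
    t, y = 0.0, y0
    times, states = [t], [y]
    while t < t_end - 1e-9:
        # Assumption: uniform grid as a stand-in for the adaptive time grid
        grid = np.linspace(t, t + T_bar, n_steps + 1)
        u = solve_open_loop(y, grid, a, alpha)
        # Apply the computed control on the application horizon [t, t + tau]
        for j in range(1, n_steps + 1):
            if j > 1 and grid[j] > t + tau + 1e-9:
                break
            dt = grid[j] - grid[j - 1]
            y = (y + dt * u[j - 1]) / (1.0 - a * dt)
            times.append(grid[j])
            states.append(y)
        t = times[-1]
    return np.array(times), np.array(states)

times, states = mpc(y0=1.0, t_end=2.0, T_bar=1.0, tau=0.2, n_steps=10, a=1.0, alpha=0.1)
```

In this toy setting the uncontrolled model is unstable (\(a>0\)), so the decay of `states` is entirely due to the feedback generated by repeatedly re-solving the open-loop problem.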

Let us briefly comment on related literature. Since there is a vast body of literature on MPC, we concentrate here on contributions related to adaptivity in MPC. In [11, 14] the authors took advantage of the structure of the problem using Lyapunov functions and/or the turnpike property to construct adaptive grids for the MPC optimal control problem. The turnpike property (see e.g. [21]) is often a key tool to prove asymptotic stability of the MPC method and to find the minimal prediction horizon (see e.g. [5, 10, 13, 15]). Our ideas are related to [12], where a goal-oriented adaptive approach for the MPC optimal control problem is proposed. This paper appeared while we were editing the first version of our manuscript. However, the a-posteriori concepts proposed there differ from our approach, which relies on residual based a-posteriori error analysis for the elliptic space-time reformulation of the optimality systems appearing in every step of the MPC algorithm. Our method uses an error indicator with regard to the optimal state within each MPC subproblem.

The outline of this paper is as follows. In Sect. 2, we present the optimal control problem within the MPC framework and recall the basic idea of the MPC method. Further, we state the optimality conditions for the MPC subproblem. In Sect. 3 we describe the reformulation of the optimality system as an elliptic equation of second order in time and fourth order in space, as well as a mixed variational form. Additionally, we derive an a-posteriori error estimate for a semi-time discrete form. In Sect. 4, we propose the novel time-adaptive scheme in MPC. Finally, numerical tests are discussed in Sect. 5 and conclusions are drawn in Sect. 6.

2 Optimal Control Setting within the MPC Framework

2.1 Preliminaries

Let \(\varOmega \subset {\mathbb {R}}^n, n \in \{1,2,3\}\) be an open and bounded domain with Lipschitz boundary \(\partial \varOmega .\) The Lebesgue space of square integrable functions is denoted by \(L^2(\varOmega )\) with inner product \((u,v)_{L^2(\varOmega )}:= \int _\varOmega uv dx\) and norm \(\Vert u\Vert _{L^2(\varOmega )}:=(\int _\varOmega |u(x)|^2 dx)^{1/2}\) for \(u,v \in L^2(\varOmega )\). Further, let \(H^k(\varOmega )\) be defined by

$$\begin{aligned} H^k(\varOmega ) := \{u\in L^2(\varOmega ): u \text { has }\text {weak }\text {derivatives } D^{\beta }u \in L^2(\varOmega ) \text { for all } |\beta | \le k\} \end{aligned}$$

with \(k \in {\mathbb {N}}_0\) and equipped with the norm \(\Vert u\Vert _{H^{k}(\varOmega )} := (\sum _{|\beta | \le k} \Vert D^\beta u \Vert _{L^2(\varOmega )}^2)^{1/2}\) and

$$\begin{aligned} H_0^k(\varOmega ):= \{ u \in H^k(\varOmega ): D^\beta u = 0 \text { on } \partial \varOmega \text { in the sense of traces } (|\beta | \le k-1)\}. \end{aligned}$$

We use the notation \(H^{-1}(\varOmega )\) for the dual space of \(H_0^1(\varOmega )\) and denote \(\langle \cdot , \cdot \rangle _{H^{-1}(\varOmega ),H_0^1(\varOmega )}\) as the duality pairing of \(H^{-1}(\varOmega )\) with \(H_0^1(\varOmega )\). By \(| \cdot |_{H^1(\varOmega )}\) we denote the \(H^1\)-seminorm given by \(|u|_{H^1(\varOmega )} = \Vert \nabla u \Vert _{L^2(\varOmega )}\) for \(u\in H_0^1(\varOmega )\). We recall that the Poincaré constant is given by the smallest number \(c_p>0\) such that the Poincaré inequality

$$\begin{aligned} \Vert u\Vert _{L^2(\varOmega )} \le c_p \Vert \nabla u \Vert _{L^2(\varOmega )}, \; \forall u \in H_0^1(\varOmega ) \end{aligned}$$

is fulfilled. Thus, \(|.|_{H^1(\varOmega )}\) is a norm on \(H_0^1(\varOmega )\) equivalent to the norm \(\Vert .\Vert _{H^1(\varOmega )}\). For a given Banach space X and a given time \(T>0\), we denote by \(L^2((0 ,T);X)\) the space of measurable square integrable abstract functions with norm \(\Vert u\Vert _{L^2((0 ,{T});X)} := (\int _{0 }^{T} \Vert u(t)\Vert _X^2 dt )^{1/2}\). We define

$$\begin{aligned} W((0 ,T);H_0^1(\varOmega )):=\{v\in L^2((0 ,T);H_0^1(\varOmega )), v_t \in L^2((0 ,T);H^{-1}(\varOmega ))\}. \end{aligned}$$

Note that for a given function g in space-time, we use the short hand notation g(t) to indicate the time dependency and drop the space argument.

2.2 Model Predictive Control

In this section we specify our MPC setting of Algorithm 1. At time \(t_0\) we initialize our MPC algorithm and for convenience use a fixed length \({{\bar{T}}} > 0\) for the prediction horizon. At time instance \(t_i \ge t_0\) \((i\in {\mathbb {N}})\) this horizon is denoted by \([t_i,{{\bar{t}}}_i]\) with \({{\bar{t}}}_i := t_i+{{\bar{T}}}\). We denote with \(\tau _i\) the length of the application horizon at time instance \(t_i\), so that \([t_i,t_i+\tau _i] \subseteq [t_i,{{\bar{t}}}_i]\). The adaptive time grid at time instance \(t_i\) is denoted by \(\{t_i^j\}_{j=1}^{N}\), where we set \(t_i^1:=t_i\) and \(t_i^N:={{\bar{t}}}_i\). From here onwards we use \(t_i^N\) instead of \({{\bar{t}}}_i\) to denote the final time in the prediction horizon. We note that the value of \(\tau _i \) might be either constant or adaptive at each iteration due to our time-adaptive concept (see steps 5-9 in Algorithm 1).

The reduced cost functional over the domain \([t_i, t_i^N] \times \varOmega \), which is considered at the i-th time instance of the MPC algorithm for \(i=0,1,2\dots ,\) is given by

$$\begin{aligned} {\hat{J}}^N(u,t_i,y_i):=\int _{t_i}^{t_i^N} \ell (y_{[u,t_i,y_i]}(t),u(t))\,dt, \end{aligned}$$

where the function \(\ell \) in our applications is given by

$$\begin{aligned} \ell (y(t),u(t)):= \frac{1}{2} \Vert y(t) - y_d(t)\Vert _{L^2(\varOmega )}^2 + \frac{\alpha }{2}\Vert u(t)\Vert _{L^2(\varOmega )}^2. \end{aligned}$$

Here \(y_d \in L^2((t_i,t_i^N);\varOmega )\) denotes the desired state and \(\alpha > 0\) the prescribed regularization parameter. We note that other cost functionals could also be considered. The governing dynamics for the state \(y\equiv y_{[u,t_i,y_i]}\) is given by the linear parabolic partial differential equation

$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l} y_t-\nu \varDelta y = f+u &{}\text { in } (t_i,t_i^N]\times \varOmega ,\\ y\qquad \qquad \, = 0 &{}\text { on } (t_i,t_i^N]\times \partial \varOmega ,\\ y(t_i)\qquad ~ = y_i &{} \text { in } \varOmega , \end{array} \right. \end{aligned}$$

where \(\nu > 0\) is a given constant, f is a given source term and \(y_i\) is the given initial state which is obtained from the preceding MPC step. The function u will act as the control. The weak form of (4) reads: for given \(f \in L^2((t_i,t_i^N);\varOmega )\), \(y_i \in L^2(\varOmega )\) and \(u\in L^2((t_i,t_i^N);\varOmega )\), find a state \(y \in W((t_i,t_i^N);H_0^1(\varOmega ))\) satisfying \(y(t_i) = y_i\) such that

$$\begin{aligned} \langle y_t(t),v \rangle _{H^{-1}(\varOmega ),H_0^1(\varOmega )} + \nu \int _\varOmega \nabla y(t) \cdot \nabla v dx = \int _\varOmega ( f(t)+u(t))v dx \end{aligned}$$

holds for all \(v \in H_0^1(\varOmega )\) and almost everywhere in \((t_i,t_i^N]\). Problem (5) admits a unique weak solution, see e.g. [7, §7.1.2, Theorems 3 and 4]. It is therefore meaningful to consider the state y as a function of the control u, so that the cost functional in (2) in fact depends only on the control as independent variable.

Then, the open-loop control problem in the \(i\)-th optimization instance of the MPC method is given by

$$\begin{aligned} \min _{u \in L^2((t_i,t_i^N);\varOmega )} {\hat{J}}^N(u,t_i,y_i). \end{aligned}$$

It forms the core of every MPC step. In the next section we develop a time-adaptive concept for its numerical approximation.

2.3 Optimal Control Problem

In this section, we investigate the distributed optimal control problem which we consider in each level of the MPC framework. To ease the notation here we will consider a general finite horizon [0, T] instead of \([t_i,t_i^N]\). It is clear that in the setting of the previous section the optimal control problem (6) admits a unique solution \(u\in L^2((0,T);L^2(\varOmega ))\). Moreover, there exists a unique adjoint state \(p\in W((0,T); H^1_0(\varOmega )),\) which together with u and the state \(y\in W((0,T); H^1_0(\varOmega ))\) satisfies the optimality system consisting of the state equation

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} y_t-\nu \varDelta y &{}= f+u &{}\text { in } (0 ,T]\times \varOmega ,\\ y &{}= 0 &{}\text { on } [0 ,T] \times \partial \varOmega ,\\ y(0 ) &{}= y_0 &{} \text { in } \varOmega , \end{array} \right. \end{aligned}$$

the adjoint equation

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} -p_t-\nu \varDelta p &{}= y-y_d &{}\text { in } [0 ,T)\times \varOmega ,\\ p &{}= 0 &{}\text { on } [0 ,T] \times \partial \varOmega ,\\ p(T) &{}= 0 &{}\text { in } \varOmega , \end{array} \right. \end{aligned}$$

and the optimality condition

$$\begin{aligned} \alpha u + p=0\quad \text{ in } [0 ,T]\times \varOmega . \end{aligned}$$
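To make the coupled structure of (7)–(8)–(9) concrete, the following sketch assembles and solves a discretized optimality system in one shot. It rests on several assumptions that are not taken from the text: \(\varOmega =(0,1)\), finite differences in space in place of linear finite elements, implicit Euler on a uniform time grid, and dense linear algebra. The function name `solve_optimality_system` is hypothetical.

```python
import numpy as np

def solve_optimality_system(y0, y_d, f, nu, alpha, T, m):
    """Monolithic solve of a discretized version of (7)-(9) on (0,1) x (0,T).

    y0: (n,) initial state at interior nodes; y_d, f: (m, n) data at t_1..t_m.
    Returns states Y, adjoints P and controls U, each of shape (m, n).
    """
    n = y0.size
    h, dt = 1.0 / (n + 1), T / m
    # Assumption: 1D Dirichlet Laplacian by finite differences, A ~ -d^2/dx^2
    A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    # Implicit Euler: row block j of K @ Y is (y^j - y^{j-1})/dt + nu*A y^j
    K = np.kron(np.eye(m) - np.eye(m, k=-1), np.eye(n)) / dt
    K += np.kron(np.eye(m), nu * A)
    I = np.eye(m * n)
    rhs_state = f.reshape(-1).copy()
    rhs_state[:n] += y0 / dt              # initial condition enters the first step
    # Control eliminated via (9): u = -p/alpha.
    # State:   K Y + P/alpha = rhs_state;  adjoint: -Y + K^T P = -y_d
    system = np.block([[K, I / alpha], [-I, K.T]])
    sol = np.linalg.solve(system, np.concatenate([rhs_state, -y_d.reshape(-1)]))
    Y = sol[: m * n].reshape(m, n)
    P = sol[m * n:].reshape(m, n)
    return Y, P, -P / alpha

n, m, T, nu, alpha = 9, 10, 1.0, 0.1, 0.01
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
y0 = np.sin(np.pi * x)
Y, P, U = solve_optimality_system(y0, np.zeros((m, n)), np.zeros((m, n)), nu, alpha, T, m)
```

The transpose `K.T` reverses the direction of the time stepping, which mirrors the fact that the adjoint equation (8) is solved backward in time from \(p(T)=0\).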

Remark 1

We note that it is possible to consider control constraints, state constraints and control operators mapping abstract controls to feasible right hand sides in (6), see Sect. 3.5 for a discussion.

In the next section we rewrite the optimality system as an elliptic boundary value problem in space-time and exploit its elliptic structure to provide adaptive concepts for its solution. For this purpose we need the following higher regularity results for the weak solutions y of (7) and p of (8), respectively.

Lemma 1

(Higher regularity [7])

  (i) Let \(y_0 \in H_0^1(\varOmega )\) and let f, u, \(y_d \in L^2((0 ,T);\varOmega )\). Then, according to [7, §7.1.3. Theorem 5] the weak solution y of (4) and the weak solution p of (8) fulfill \(y,p \in L^2((0 ,T);H^2(\varOmega )) \cap L^\infty ((0 ,T);H_0^1(\varOmega ))\cap H^1((0 ,T);L^2(\varOmega ))\).

  (ii) Let \(y_0 \in H_0^1(\varOmega ) \cap H^3(\varOmega )\) and \(f,u, y_d \in L^2((0 ,T);H^2(\varOmega ))\cap H^1((0 ,T);L^2(\varOmega ))\). Further, let the compatibility assumption \((u+f)(0)+\nu \varDelta y_0 \in H_0^1(\varOmega )\) and \(y_d(T)\in H_0^1(\varOmega )\) hold true. Then according to [7, §7.1.3. Theorem 6] the weak solution y of (4) and the weak solution p of (8) fulfill \(y,p \in L^2((0 ,T);H^4(\varOmega ))\cap H^1((0 ,T);H^2(\varOmega ))\cap H^2((0 ,T);L^2(\varOmega ))\).

3 Reformulation of the Optimality System and Time Adaptivity

3.1 Reformulation of the Optimality System

Following along the lines of [8], we can reformulate the optimality system (7)–(8)–(9) as an elliptic equation of fourth order in space and second order in time involving only the state variable y. The adjoint state p as well as the control u are not present in this equation and will be computed by the coupled optimality system (7)–(8)–(9) afterwards. One can also reformulate the optimality system with respect to the adjoint p or the control u but in this work we are interested in an adaptive time grid for the state, compare also Sect. 3.5. To approximate the optimality conditions (7)–(8)–(9) we use an implicit Euler time integration and linear finite elements in space. This provides piecewise constant approximations with respect to time and piecewise linear and continuous approximations with respect to space of the state y, the adjoint state p and the control u.

In particular, the resulting elliptic equation is a two-point boundary value problem in space-time given by

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} -y_{tt}+\nu ^2\varDelta ^2 y +\frac{1}{\alpha } y &{}= \frac{1}{\alpha }y_d -f_t-\nu \varDelta f &{} \text { in } (0 ,T)\times \varOmega ,\\ y &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \nu \varDelta y &{}= -f &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \left( y_t-\nu \varDelta y \right) (T) &{}= f(T) &{}\text { in }\varOmega ,\\ y(0 ) &{}= y_0 &{}\text { in } \varOmega . \end{array} \right. \end{aligned}$$

We note that for \(\nu =1\) and \(f\equiv 0\) this setting coincides with the setting considered in [8]. Under higher regularity assumptions on the data, the following theorem shows that the optimal state y of (7)–(8)–(9) fulfills the elliptic equation (10) a.e. in space-time.

Theorem 1

Let \((y,u) \in W((0 ,T);H_0^1(\varOmega ))\times L^2((0 ,T);\varOmega )\) with associated adjoint \(p\in W((0 ,T);H_0^1(\varOmega ))\) denote the unique weak solution to (7)–(8)–(9). Further, let the assumptions of Lemma 1(ii) be fulfilled. Then, y satisfies (10) a.e. in space-time.


Proof

The proof follows along the lines of the proof of [8, Theorem 2.7] and uses differentiation and insertion of the equations (7)–(8)–(9). \(\square \)
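The formal computation behind Theorem 1 can be sketched as follows, assuming enough regularity to differentiate (7) in time and to apply \(\varDelta \) to it. From (9) and (7) we have \(p=-\alpha u = -\alpha (y_t-\nu \varDelta y - f)\). Inserting this into the left hand side of (8) gives

$$\begin{aligned} -p_t-\nu \varDelta p&= \alpha (y_{tt}-\nu \varDelta y_t - f_t) + \alpha (\nu \varDelta y_t-\nu ^2\varDelta ^2 y - \nu \varDelta f) \\&= \alpha (y_{tt}-\nu ^2\varDelta ^2 y - f_t-\nu \varDelta f). \end{aligned}$$

Equating this with \(y-y_d\) and dividing by \(\alpha \) yields the first line of (10). The end time condition in (10) corresponds to \(p(T)=0\), i.e. \(u(T)=0\) in (7), and the second boundary condition to \(p=0\) together with \(y_t=0\) on \(\partial \varOmega \).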

Let us homogenize (10). For this, let g be a function which fulfills the boundary conditions as well as initial and end time conditions of (10) and is sufficiently smooth. For example, g may be taken as the weak solution of (10) with zero right hand side and the same boundary conditions and the same initial condition. Let y satisfy (10). We define \({\tilde{y}}:=y-g\) and arrive at

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} -{\tilde{y}}_{tt}+\nu ^2\varDelta ^2 {\tilde{y}} + \frac{1}{\alpha } {\tilde{y}} &{}= {\tilde{y}}_d &{} \text { in } (0 ,T)\times \varOmega ,\\ {\tilde{y}} &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \nu \varDelta {\tilde{y}} &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \left( {\tilde{y}}_t-\nu \varDelta {\tilde{y}} \right) (T) &{}= 0 &{}\text { in }\varOmega ,\\ {\tilde{y}}(0 ) &{}= 0 &{}\text { in } \varOmega , \end{array} \right. \end{aligned}$$


where

$$\begin{aligned} {\tilde{y}}_d:= \frac{1}{\alpha } y_d - f_t - \nu \varDelta f + g_{tt} - \nu ^2 \varDelta ^2 g - \frac{1}{\alpha } g. \end{aligned}$$

Now, let us derive a weak formulation of (11). For this purpose we introduce the function space

$$\begin{aligned} H^{2,1}_0((0 ,T);\varOmega ):=\left\{ v\in H^{2,1}((0 ,T);\varOmega ): v(0 )=0 \text{ in } \varOmega \right\} , \end{aligned}$$


where

$$\begin{aligned} H^{2,1}((0 ,T);\varOmega ):=L^2((0 ,T);H^2(\varOmega )\cap H^1_0(\varOmega ))\cap H^1((0 ,T); L^2(\varOmega )). \end{aligned}$$

It is equipped with the norm

$$\begin{aligned} \Vert v\Vert _{H^{2,1}((0 ,T);\varOmega )}:= \left( \Vert v\Vert _{L^2((0 ,T);H^2(\varOmega ))}^2 + \Vert v\Vert _{H^1((0 ,T);L^2(\varOmega ))}^2 \right) ^{1/2}. \end{aligned}$$

We introduce the following symmetric bilinear form

$$\begin{aligned} A:H^{2,1}_0((0 ,T);\varOmega )\times H^{2,1}_0((0 ,T);\varOmega )\rightarrow {\mathbb {R}}, \end{aligned}$$
$$\begin{aligned} A(v_1,v_2):=\displaystyle \int _{0 }^T \int _\varOmega \left( (v_1)_t (v_2)_t + \nu ^2 \varDelta v_1 \varDelta v_2 + \frac{1}{\alpha } v_1 v_2 \right) dxdt + \displaystyle \int _\varOmega \nu \nabla v_1(T) \nabla v_2(T) dx, \end{aligned}$$

and linear form

$$\begin{aligned} L:H^{2,1}_0((0 ,T);\varOmega )\rightarrow {\mathbb {R}}, \quad L(v) := \int _0^T \int _\varOmega {\tilde{y}}_d v \; dxdt \end{aligned}$$

where \({\tilde{y}}_d\) is defined in (12).

Definition 1

(Weak formulation) The weak formulation of Eq. (11) is given by: find \({\tilde{y}} \in H_0^{2,1}((0 ,T);\varOmega ),\) which satisfies

$$\begin{aligned} A({\tilde{y}},v)=L(v)\quad \forall v\in H^{2,1}_0((0 ,T);\varOmega ). \end{aligned}$$

Existence of a solution to (13) and its relation to a solution to (10) is shown in the following theorem.

Theorem 2

Let y denote a solution to (10) and let g be a function which fulfills the boundary, initial and end time conditions in (10) and is sufficiently smooth. Then, \({\tilde{y}}=y-g\) is a solution to (13). On the other hand, if \({\tilde{y}}\) is a solution to (13) and the assumptions of Lemma 1(ii) are fulfilled, then \(y={\tilde{y}}+g\) satisfies (10) a.e. in space-time.


Proof

Assume y is a solution to (10). By Green’s formula and integration by parts it is straightforward to prove that \({\tilde{y}}=y-g\) satisfies (13). The converse direction follows by reversing these steps. \(\square \)
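The integration by parts behind Theorem 2 can be sketched as follows, under the assumption that \({\tilde{y}}\) is smooth enough for all terms to be defined. For \(v\in H^{2,1}_0((0 ,T);\varOmega )\) we use \(v(0)=0\), the end time condition in (11) and \(v=0\) on \(\partial \varOmega \) to obtain

$$\begin{aligned} -\int _0^T \int _\varOmega {\tilde{y}}_{tt} v \, dxdt&= \int _0^T \int _\varOmega {\tilde{y}}_t v_t \, dxdt - \int _\varOmega \nu \varDelta {\tilde{y}}(T) v(T) \, dx \\&= \int _0^T \int _\varOmega {\tilde{y}}_t v_t \, dxdt + \int _\varOmega \nu \nabla {\tilde{y}}(T) \nabla v(T) \, dx. \end{aligned}$$

Moreover, since \(v=0\) and \(\nu \varDelta {\tilde{y}}=0\) on \(\partial \varOmega \), applying Green’s formula twice gives

$$\begin{aligned} \int _0^T \int _\varOmega \nu ^2 \varDelta ^2 {\tilde{y}} \, v \, dxdt = \int _0^T \int _\varOmega \nu ^2 \varDelta {\tilde{y}} \varDelta v \, dxdt. \end{aligned}$$

Adding the zero order term \(\frac{1}{\alpha }{\tilde{y}} v\) reproduces \(A({\tilde{y}},v)\), while the right hand side of (11) tested with v is \(L(v)\).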

In order to show equivalence of the optimal control problem (6) over (0, T) to the weak formulation of (10) it remains to prove uniqueness of a solution.

Theorem 3

The solution \({\tilde{y}}\) to (13) is unique.


Proof

The proof follows along the lines of the proof of [8, Theorem 2.6] and uses the Lax-Milgram lemma (see e.g. [7, §6.2.1, Theorem 1]). \(\square \)

3.2 Mixed Formulation

In order to use piecewise linear, continuous finite elements for discretization and avoid the construction of finite element subspaces in \(H^2(\varOmega )\), we introduce an auxiliary variable \({\tilde{w}}:=-\nu \varDelta {\tilde{y}}\). This allows us to write (11) as a coupled system in \({\tilde{y}}\) and \({\tilde{w}}\) as

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} -{\tilde{y}}_{tt}-\nu \varDelta {\tilde{w}} +\frac{1}{\alpha } {\tilde{y}} &{}= {\tilde{y}}_d &{} \text { in } (0 ,T)\times \varOmega ,\\ \nu \varDelta {\tilde{y}} + {\tilde{w}} &{}= 0 &{} \text { in } (0 ,T)\times \varOmega ,\\ {\tilde{y}} &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ {\tilde{w}} &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \left( {\tilde{y}}_t-\nu \varDelta {\tilde{y}} \right) (T) &{}= 0 &{}\text { in }\varOmega ,\\ {\tilde{y}}(0 ) &{}= 0 &{}\text { in } \varOmega . \end{array} \right. \end{aligned}$$

We introduce the function spaces \(Y:=\{v \in H^1((0 ,T);H_0^1(\varOmega )): v(0 ) = 0 \text { in } \varOmega \}\), \(W:=L^2((0 ,T);H_0^1(\varOmega ))\) and the product space \(X:=Y\times W\). We note that the function space W is different from \(W((0,T);H_0^1(\varOmega )).\) Let us define the following bilinear form

$$\begin{aligned}&A_M:X \times X \rightarrow {\mathbb {R}}, \\&A_M(({\tilde{y}},{\tilde{w}}),(v_1,v_2)) = \displaystyle \int _{0 }^T \int _\varOmega {\tilde{y}}_{t}(v_1)_t+\nu \nabla {\tilde{w}} \nabla v_1 + \frac{1}{\alpha } {\tilde{y}} v_1 -\nu \nabla {\tilde{y}} \nabla v_2 + {\tilde{w}} v_2 dxdt \\&\qquad \qquad \qquad \qquad \qquad \qquad + \displaystyle \int _\varOmega \nu \nabla {\tilde{y}}(T)\nabla v_1(T) dx \end{aligned}$$

and linear form

$$\begin{aligned} L_M: X \rightarrow {\mathbb {R}}, \quad L_M(v_1,v_2) = \displaystyle \int _{0 }^T \int _\varOmega {\tilde{y}}_d v_1 dxdt. \end{aligned}$$

Definition 2

The weak formulation of the mixed formulation (14) is given by: find \(({\tilde{y}},{\tilde{w}}) \in X,\) which satisfies

$$\begin{aligned} A_M(({\tilde{y}},{\tilde{w}}),(v_1,v_2)) = L_M(v_1,v_2) \quad \forall (v_1,v_2) \in X. \end{aligned}$$

By analogy with Theorems 2 and 3 it can be shown that the mixed variational form (15) admits at most one solution and that the pair \(({\tilde{y}},{\tilde{w}})\) with \({\tilde{y}}\) denoting the unique solution to (11) and \({\tilde{w}}:=-\nu \varDelta {\tilde{y}}\) is a solution to the mixed variational form (15). This means that the unique solution to (11) defines the solution to the mixed variational form (15).

Note that

$$\begin{aligned} A_M((y,w),(y,w)) = \displaystyle \int _{0 }^T \int _\varOmega y_t^2 + \frac{1}{\alpha }y^2 + w^2 dxdt + \displaystyle \int _\varOmega \nu |\nabla y(T) |^2 dx \end{aligned}$$

holds. For this reason, we define an energy norm associated with the bilinear form \(A_M\) by

$$\begin{aligned} |||(y,w)|||:= \left( \int _{0 }^T \int _\varOmega y_t^2 + \frac{1}{\alpha } y^2 + w^2 dxdt \right) ^{1/2}. \end{aligned}$$

3.3 A-Posteriori Error Estimate for the Semi-Time Discrete Mixed Form

Let us now consider a semi-time discretization of (15) with respect to \({\tilde{y}}\) while the variable \({\tilde{w}}\) is kept continuous. We introduce a time grid \(0={\tilde{\tau }}_0< {\tilde{\tau }}_1< \dots < {\tilde{\tau }}_m = T\) with \(m \in {\mathbb {N}}\), time step sizes \(\varDelta {\tilde{\tau }}_i = {\tilde{\tau }}_i - {\tilde{\tau }}_{i-1}\) and time intervals \(I_i = ({\tilde{\tau }}_{i-1},{\tilde{\tau }}_i]\) for \(i=1,\dots ,m\). The time discrete space \(V^k\) is defined by

$$\begin{aligned} V^{k} = \{v \in C^0((0 ,T);H^1(\varOmega )): v|_{I_i} \in {\mathbb {P}}_1(I_i)\}, \end{aligned}$$

where \({\mathbb {P}}_1\) denotes the space of linear polynomials. We set \(Y^{k}:=V^k \cap Y\).

Definition 3

(Semi-time discrete mixed form) The semi-time discrete mixed variational form reads as: find \(({\tilde{y}}^k,{\tilde{w}}^k) \in Y^k \times W\) such that

$$\begin{aligned} A_M(({\tilde{y}}^k,{\tilde{w}}^k),(v_1,v_2)) = L_M(v_1,v_2) \quad \forall (v_1,v_2) \in Y^k\times W. \end{aligned}$$

With arguments similar to those used for (15) we may show that problem (16) admits a unique solution.

Let us now derive a residual based error estimate for the semi-time discrete mixed form (16). We associate with \(({\tilde{y}}^k,{\tilde{w}}^k)\) the residuals \(R_1^k \in Y^*\) and \(R_2^k \in W^*\) by

$$\begin{aligned} R_1^k(v_1) = \displaystyle \int _{0 }^T \int _\varOmega {\tilde{y}}_d v_1 - ({\tilde{y}}^k)_t (v_1)_t - \nu \nabla {\tilde{w}}^k \nabla v_1 - \frac{1}{\alpha } {\tilde{y}}^k v_1 dxdt - \displaystyle \int _\varOmega \nu \nabla {\tilde{y}}^k(T)\nabla v_1(T) dx \end{aligned}$$

and

$$\begin{aligned} R_2^k(v_2) = \displaystyle \int _{0 }^T \int _\varOmega \nu \nabla {\tilde{y}}^k \nabla v_2 - {\tilde{w}}^k v_2 dxdt. \end{aligned}$$

Next, we derive \(L^2\)-representations of \(R_1^k\) and \(R_2^k\) by elementwise integration by parts

$$\begin{aligned} R_1^k(v_1)&= \displaystyle \sum _{i=1}^m \int _{I_i} \int _\varOmega \left\{ {\tilde{y}}_d + ({\tilde{y}}^k)_{tt} + \nu \varDelta {\tilde{w}}^k - \frac{1}{\alpha } {\tilde{y}}^k \right\} v_1 dxdt\\&\qquad + \sum _{i=1}^m \int _\varOmega ({\tilde{y}}^k)_t v_1 \bigg \vert _{I_i} dx + \int _\varOmega \nu \varDelta {\tilde{y}}^k(T) v_1(T) dx \end{aligned}$$


and

$$\begin{aligned} R_2^k(v_2) = \sum _{i=1}^m \int _{I_i} \int _\varOmega \left\{ -\nu \varDelta {\tilde{y}}^k - {\tilde{w}}^k \right\} v_2 dxdt. \end{aligned}$$

The residual \(R_1^k\) fulfills the Galerkin orthogonality

$$\begin{aligned} R_1^k(v_1) = 0 \quad \forall v_1 \in Y^k \end{aligned}$$

and it further holds that

$$\begin{aligned} R_2^k(v_2) = 0 \quad \forall v_2 \in W. \end{aligned}$$

Moreover, for \(({\tilde{y}},{\tilde{w}})\in Y \times W\) and \(({\tilde{y}}^k,{\tilde{w}}^k)\in Y^k \times W\) it holds for all \((v_1,v_2) \in Y^k \times W\):

$$\begin{aligned} \begin{array}{r c l} A_M(({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k),(v_1,v_2)) &{} = &{} A_M(({\tilde{y}},{\tilde{w}}),(v_1,v_2)) - A_M(({\tilde{y}}^k,{\tilde{w}}^k),(v_1,v_2))\\ \qquad &{} = &{} L_M(v_1,v_2) - A_M(({\tilde{y}}^k,{\tilde{w}}^k),(v_1,v_2)) \quad = 0.\\ \qquad \end{array} \end{aligned}$$

Further, the residual equation holds true for all \((v_1,v_2) \in Y \times W\):

$$\begin{aligned} \begin{array}{r c l} A_M(({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k),(v_1,v_2)) &{} = &{} R_1^k(v_1) + R_2^k(v_2) \quad = R_1^k(v_1),\\ \qquad \end{array} \end{aligned}$$

where the last equality follows from (20). We are now in the position to derive a temporal residual based a-posteriori error estimate for the semi-time discrete mixed variational formulation (16).

Theorem 4

Let \(({\tilde{y}},{\tilde{w}}) \in X\) denote the solution to (15) and let \(({\tilde{y}}^k,{\tilde{w}}^k) \in Y^k\times W\) denote the solution to (16). Then, the following residual based a-posteriori error estimate holds true:

$$\begin{aligned} ||| ({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k)|||^2 \le C \eta ^2, \end{aligned}$$

with a constant \(C>0\) and

$$\begin{aligned} \eta ^2 = \sum _{i=1}^m \int _{I_i} \int _\varOmega (\varDelta {\tilde{\tau }}_i)^2 \left| {\tilde{y}}_d + ({\tilde{y}}^k)_{tt} + \nu \varDelta {\tilde{w}}^k - \frac{1}{\alpha } {\tilde{y}}^k \right| ^2 dxdt. \end{aligned}$$


Proof

We combine (19) with (21). For \(v_1 \in Y\), let \(I_Y^k v_1\) denote the approximation of \(v_1\) from \(Y^k\). Then we have

$$\begin{aligned}&A_M(({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k),(v_1,v_2)) = R_1^k(v_1 - I_Y^k v_1) \\&\quad =\displaystyle \sum _{i=1}^m \int _{I_i} \int _\varOmega r_{1,int}^k(v_1 - I_Y^k v_1) dxdt + \int _\varOmega \nu \varDelta {\tilde{y}}^k(T) (v_1 - I_Y^k v_1)(T) dx \\&\qquad + \displaystyle \sum _{i=1}^m \int _\varOmega ({\tilde{y}}^k)_t (v_1-I_Y^k v_1)\bigg \vert _{I_i} dx, \end{aligned}$$

where we use the notation \(r_{1,int}^k:= {\tilde{y}}_d + ({\tilde{y}}^k)_{tt} + \nu \varDelta {\tilde{w}}^k- \frac{1}{\alpha } {\tilde{y}}^k\). Note that the last two terms vanish since \((v_1-I_Y^k v_1)({\tilde{\tau }}_i) = 0\) for \(i=0,\dots ,m\). Using the Cauchy-Schwarz inequality, we can estimate

$$\begin{aligned} |A_M(({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k),(v_1,v_2))| \le \int _\varOmega \left( \sum _{i=1}^m \Vert r_{1,int}^k\Vert _{L^2(I_i)} \Vert v_1 - I_Y^k v_1 \Vert _{L^2(I_i)} \right) dx. \end{aligned}$$

Next, using standard interpolation properties (see e.g. [1, Theorem 1.7]), we arrive at

$$\begin{aligned} |A_M(({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k),(v_1,v_2))| \le \int _\varOmega \left( \sum _{i=1}^m \Vert r_{1,int}^k \Vert _{L^2(I_i)} \; c_1 \; \varDelta {\tilde{\tau }}_i | v_1 |_{H^1({\tilde{I}}_i)} \right) dx, \end{aligned}$$

where \({\tilde{I}}_i\) denotes the set of intervals which share a vertex with \(I_i\). We recall that \(| . |_{H^1}\) denotes the \(H^1\)-seminorm. Together with the Cauchy-Schwarz inequality for sums, we arrive at

$$\begin{aligned} \begin{array}{l l l} |A_M(({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k),(v_1,v_2))| \\ \qquad \qquad \qquad \qquad \le c_1 \displaystyle \int _\varOmega \left( \sum _{i=1}^m \Vert r_{1,int}^k \Vert _{L^2(I_i)}^2 (\varDelta {\tilde{\tau }}_i)^2 \right) ^{1/2} \left( \sum _{i=1}^m | v_1 |_{H^1({\tilde{I}}_i)}^2 \right) ^{1/2} dx \\ \qquad \qquad \qquad \qquad \le c_2 \displaystyle \int _\varOmega \left( \sum _{i=1}^m \Vert r_{1,int}^k \Vert _{L^2(I_i)}^2 (\varDelta {\tilde{\tau }}_i)^2 \right) ^{1/2} | v_1 |_{H^1(0,T)} dx \\ \qquad \qquad \qquad \qquad \le c_2 \left( \displaystyle \int _\varOmega \sum _{i=1}^m \Vert r_{1,int}^k \Vert _{L^2(I_i)}^2 (\varDelta {\tilde{\tau }}_i)^2 dx \right) ^{1/2} \left( \displaystyle \int _\varOmega | v_1 |_{H^1(0,T)}^2 dx \right) ^{1/2}, \end{array} \end{aligned}$$

where we use Hölder’s inequality in the last step. We note that

$$\begin{aligned} \left( \displaystyle \int _\varOmega | v_1 |_{H^1(0,T)}^2 dx \right) ^{1/2} \le \left( \int _0^T \int _\varOmega (v_1)_t^2 + \frac{1}{\alpha } v_1^2 + v_2^2 dxdt \right) ^{1/2} = ||| (v_1,v_2)|||. \end{aligned}$$

In (24) we choose \(v_1:= {\tilde{y}}-{\tilde{y}}^k\) and \(v_2 := {\tilde{w}}-{\tilde{w}}^k\) and denote \(e:=({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k)\), which leads to

$$\begin{aligned} |A_M(e,e)| \le c_2 \left( \int _\varOmega \sum _{i=1}^m \Vert r_{1,int}^k \Vert _{L^2(I_i)}^2 (\varDelta {\tilde{\tau }}_i)^2 dx \right) ^{1/2} \cdot ||| e |||. \end{aligned}$$

By the definition of the energy norm \(||| \cdot |||\), it follows that \(A_M(e,e) \ge |||e|||^2\), which yields the a-posteriori error estimate

$$\begin{aligned} |||e |||^2 \le C \left( \int _\varOmega \sum _{i=1}^m \Vert r_{1,int}^k \Vert _{L^2(I_i)}^2 (\varDelta {\tilde{\tau }}_i)^2 dx \right) . \end{aligned}$$

\(\square \)
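The interpolation step used in the proof can be checked numerically in a simple setting. The sketch below assumes nodal piecewise linear interpolation in time of a scalar function, for which \(\Vert v - I v\Vert _{L^2(I_i)} \le \frac{\varDelta {\tilde{\tau }}_i}{\pi } |v|_{H^1(I_i)}\) holds on each interval (a consequence of the one-dimensional Poincaré inequality for functions vanishing at both end points). The function name `check_interpolation_bound`, the test function and the grid are hypothetical choices, and integrals are approximated by a midpoint rule.

```python
import numpy as np

def check_interpolation_bound(v, dv, grid, n_quad=200):
    """Per interval: (L2 error of the linear interpolant, dt/pi * H1-seminorm of v)."""
    pairs = []
    for a, b in zip(grid[:-1], grid[1:]):
        # Midpoint-rule quadrature nodes and weight on [a, b]
        t = a + (np.arange(n_quad) + 0.5) * (b - a) / n_quad
        w = (b - a) / n_quad
        # Nodal linear interpolant of v on [a, b]
        Iv = v(a) + (v(b) - v(a)) * (t - a) / (b - a)
        err = np.sqrt(w * np.sum((v(t) - Iv) ** 2))
        semi = np.sqrt(w * np.sum(dv(t) ** 2))
        pairs.append((err, (b - a) / np.pi * semi))
    return pairs

grid = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
pairs = check_interpolation_bound(lambda t: np.exp(-5.0 * t),
                                  lambda t: -5.0 * np.exp(-5.0 * t), grid)
```

Each pair compares the local interpolation error with its bound; the factor \(\varDelta {\tilde{\tau }}_i\) in the bound is what produces the weight \((\varDelta {\tilde{\tau }}_i)^2\) in \(\eta ^2\).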

Remark 2

(Adaptive cycle) In order to construct an adaptive time grid, we follow the standard

solve \(\rightarrow \) estimate \(\rightarrow \) mark \(\rightarrow \) refine

cycle. In practice, we solve (16) using rectangular space-time finite elements. Then, the error in each time interval is estimated using (22). The intervals with the largest errors are marked using the Dörfler marking strategy [6]. For refinement, we bisect the marked intervals. We iterate this loop until the time grid has a prescribed number of time instances, e.g. N.
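The mark and refine steps of this cycle can be sketched as follows. The sketch does not solve (16): as a loudly labeled stand-in, the local indicators are computed from a given scalar residual function \(r(t)\) as \(\eta _i^2 \approx (\varDelta {\tilde{\tau }}_i)^3 r(t_{i-1/2})^2\), i.e. the summands of \(\eta ^2\) in Theorem 4 with a midpoint rule and without the spatial integral. The function names, the bulk parameter `theta` and the example residual are hypothetical.

```python
import numpy as np

def doerfler_mark(eta_sq, theta=0.5):
    """Smallest set of intervals whose indicators sum to >= theta * total."""
    order = np.argsort(eta_sq)[::-1]              # largest indicators first
    cum = np.cumsum(eta_sq[order])
    k = int(np.searchsorted(cum, theta * cum[-1])) + 1
    return order[:k]

def surrogate_indicators(grid, residual):
    """Stand-in for solve + estimate: eta_i^2 ~ dt_i^2 * int_{I_i} r(t)^2 dt."""
    dt = np.diff(grid)
    mid = 0.5 * (grid[:-1] + grid[1:])
    return dt**3 * residual(mid) ** 2

def adapt_time_grid(grid, residual, n_nodes, theta=0.5):
    """Repeat estimate -> mark -> bisect until the grid has n_nodes time instances."""
    grid = np.asarray(grid, dtype=float)
    while grid.size < n_nodes:
        eta_sq = surrogate_indicators(grid, residual)
        marked = doerfler_mark(eta_sq, theta)
        marked = marked[: n_nodes - grid.size]          # do not overshoot n_nodes
        mids = 0.5 * (grid[marked] + grid[marked + 1])  # bisect marked intervals
        grid = np.sort(np.concatenate([grid, mids]))
    return grid

# Example residual with a steep layer at t = 0
grid = adapt_time_grid([0.0, 0.5, 1.0], lambda t: np.exp(-50.0 * t), n_nodes=12)
```

For a residual concentrated near \(t=0\), the resulting grid clusters its nodes there and leaves the interval near \(t=1\) coarse, which is the qualitative behavior the adaptive cycle is meant to produce.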

Remark 3

(Heuristic assumption) Note that we derived an error estimate (22) for a time discrete formulation in y whereas w is kept continuous. In practice, we solve a fully space-time discrete mixed variational formulation, but still use the error estimate for the semi-time discrete form to construct an adaptive time grid. For this, we assume that the temporal discretization of \(y^k\) is insensitive with respect to the spatial discretization. In fact, numerical studies in [2, 3] show that temporal and spatial discretization decouple for the considered problem settings. In addition, we also assume that a temporal discretization of \(w^k\) does not strongly influence the error estimate. Of course, these heuristic assumptions might not hold in general. For this reason, we will in future research derive a-posteriori error estimates for a fully space-time discrete mixed variational form.

With the help of (22), we are able to refine the time grid by means of the residual of the system (14). This property constitutes the major building block for the time-adaptive approach in the MPC framework as discussed in Sects. 4 and 5.

3.4 State Equation with Depletion Term

Let us now consider an optimal control problem of the form (6), where an additional depletion term in the state equation appears as

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} y_t-\nu \varDelta y - \mu y &{}= f+u &{}\text { in } (0 ,T]\times \varOmega ,\\ y &{}= 0 &{}\text { on } [0 ,T] \times \partial \varOmega ,\\ y(0 ) &{}= y_0 &{} \text { in } \varOmega , \end{array} \right. \end{aligned}$$

with \(\mu > 0\). The reformulation of the associated optimality system into an elliptic equation and an associated mixed formulation, respectively, follows along the lines of Sects. 3.1 and 3.2. In particular, the mixed formulation reads as

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} -{\tilde{y}}_{tt}-\nu \varDelta {\tilde{w}} + 2 \nu \mu \varDelta {\tilde{y}} +\left( \frac{1}{\alpha }+\mu ^2\right) {\tilde{y}} &{}= {\tilde{y}}_d &{} \text { in } (0 ,T)\times \varOmega ,\\ \nu \varDelta {\tilde{y}} + {\tilde{w}} &{}= 0 &{} \text { in } (0 ,T)\times \varOmega ,\\ {\tilde{y}} &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ {\tilde{w}} &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \left( {\tilde{y}}_t-\nu \varDelta {\tilde{y}} - \mu {\tilde{y}}\right) (T) &{}= 0 &{}\text { in }\varOmega ,\\ {\tilde{y}}(0 ) &{}= 0 &{}\text { in } \varOmega . \end{array} \right. \end{aligned}$$

Let us define the bilinear form

$$\begin{aligned}&A_M^\mu :X \times X \rightarrow {\mathbb {R}}, \\&A_M^\mu (({\tilde{y}},{\tilde{w}}),(v_1,v_2)) = \displaystyle \int _{0 }^T \int _\varOmega ( {\tilde{y}}_{t}(v_1)_t+\nu \nabla {\tilde{w}} \nabla v_1 - 2 \nu \mu \nabla {\tilde{y}} \nabla v_1 +\left( \frac{1}{\alpha }+\mu ^2\right) {\tilde{y}} v_1 \\&\qquad \qquad \qquad \qquad \qquad \qquad \quad -\nu \nabla {\tilde{y}} \nabla v_2 + {\tilde{w}} v_2 )dxdt +\displaystyle \int _\varOmega \nu \nabla {\tilde{y}}(T)\nabla v_1(T) - \mu {\tilde{y}}(T) v_1 (T) dx \end{aligned}$$

and the linear form

$$\begin{aligned} L_M^\mu : X \rightarrow {\mathbb {R}}, \quad L_M^\mu (v_1,v_2) = \displaystyle \int _{0 }^T \int _\varOmega {\tilde{y}}_d v_1 dxdt, \end{aligned}$$

where \({\tilde{y}}_d:= \frac{1}{\alpha } y_d - f_t - \nu \varDelta f - \mu f + g_{tt} - \nu ^2 \varDelta ^2 g - 2 \nu \mu \varDelta g - (\frac{1}{\alpha } + \mu ^2) g\).

Definition 4

The weak formulation of the mixed formulation (26) is given by: find \(({\tilde{y}},{\tilde{w}}) \in X\), which satisfies

$$\begin{aligned} A_M^\mu (({\tilde{y}},{\tilde{w}}),(v_1,v_2)) = L_M^\mu (v_1,v_2) \quad \forall (v_1,v_2) \in X. \end{aligned}$$

The semi-time discrete mixed variational formulation then reads as

$$\begin{aligned} A_M^\mu (({\tilde{y}}^k,{\tilde{w}}^k),(v_1,v_2)) = L_M^\mu (v_1,v_2) \quad \forall (v_1,v_2) \in Y^k \times W. \end{aligned}$$

With similar arguments as in the previous sections, one can show existence of a unique solution of the involved equations provided sufficient regularity of the data.

In analogy to Theorem 4 we can derive a temporal residual based a-posteriori error estimate for (28).

Theorem 5

Let \(({\tilde{y}},{\tilde{w}}) \in X\) denote the solution to (27) and let \(({\tilde{y}}^k,{\tilde{w}}^k) \in Y^k\times W\) denote the solution to (28). Further, let \(\mu \le \nu /c_p^2\), where \(c_p\) denotes the Poincaré constant. Then, the following residual based a-posteriori error estimate holds true:

$$\begin{aligned} ||| ({\tilde{y}}-{\tilde{y}}^k,{\tilde{w}}-{\tilde{w}}^k)|||^2 \le C \eta ^2, \end{aligned}$$

with a constant \(C>0\) and

$$\begin{aligned} \eta ^2 = \sum _{i=1}^m \int _{I_i} \int _\varOmega (\varDelta {\tilde{\tau }}_i)^2 \left| {\tilde{y}}_d + ({\tilde{y}}^k)_{tt} + \nu \varDelta {\tilde{w}}^k -2 \nu \mu \varDelta {\tilde{y}}^k - \left( \frac{1}{\alpha } +\mu ^2 \right) {\tilde{y}}^k \right| ^2 dxdt.\quad \end{aligned}$$


The proof follows along the lines of the proof of Theorem 4. Note that it holds

$$\begin{aligned} A_M^\mu (({\tilde{y}},{\tilde{w}}),({\tilde{y}},{\tilde{w}}))&= \int _0^T \int _\varOmega {\tilde{y}}_t^2 -2 \nu \mu |\nabla {\tilde{y}}|^2 + \left( \frac{1}{\alpha } + \mu ^2 \right) {\tilde{y}}^2 + {\tilde{w}}^2 dxdt \nonumber \\&\quad + \int _\varOmega \nu |\nabla {\tilde{y}}(T)|^2 - \mu |{\tilde{y}}(T)|^2 dx. \end{aligned}$$

Using Green’s formula, the definition of \({\tilde{w}}\) and Young’s inequality, we can estimate the second summand in (31) by

$$\begin{aligned} \displaystyle \int _0^T \int _\varOmega - 2 \nu \mu |\nabla {\tilde{y}}|^2 dxdt= & {} \displaystyle \int _0^T \int _\varOmega 2 \nu \mu \varDelta {\tilde{y}} \; {\tilde{y}} \;dxdt = \int _0^T \int _\varOmega -2 \mu {\tilde{w}} {\tilde{y}} \; dxdt \\\ge & {} \displaystyle \int _0^T \int _\varOmega -2 \mu |{\tilde{w}}| \; |{\tilde{y}}| \; dxdt \ge \displaystyle \int _0^T \int _\varOmega - 4 \delta \mu ^2 {\tilde{y}}^2 - \frac{1}{4\delta } {\tilde{w}}^2 \; dxdt. \end{aligned}$$

With the choice \(\delta := \displaystyle \frac{1+2\alpha \mu ^2}{8 \alpha \mu ^2}\), it holds that \( -4\delta \mu ^2 + \frac{1}{\alpha } + \mu ^2 \ge 0\) and \( -\frac{1}{4\delta } + 1 \ge 0\).
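For completeness, both sign conditions can be verified by inserting the stated value of \(\delta \) (a short worked computation, assuming \(\alpha > 0\)):

$$\begin{aligned} 4\delta \mu ^2&= \frac{1+2\alpha \mu ^2}{2\alpha } = \frac{1}{2\alpha } + \mu ^2 ,&-4\delta \mu ^2 + \frac{1}{\alpha } + \mu ^2&= \frac{1}{2\alpha } \ge 0, \\ \frac{1}{4\delta }&= \frac{2\alpha \mu ^2}{1+2\alpha \mu ^2} ,&-\frac{1}{4\delta } + 1&= \frac{1}{1+2\alpha \mu ^2} \ge 0. \end{aligned}$$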

Using the Poincaré inequality, we can estimate the last term in (31) by

$$\begin{aligned} \int _\varOmega \nu |\nabla y(T)|^2 - \mu |y(T)|^2 dx \ge \left( \frac{\nu }{c_p^2} - \mu \right) \Vert y(T) \Vert _{L^2(\varOmega )}^2 \end{aligned}$$

with Poincaré constant \(c_p\). If \(\mu \le \nu / c_p^2\), then \(\displaystyle \int _\varOmega \nu |\nabla y(T)|^2 - \mu |y(T)|^2 dx \ge 0\). Thus, for \(\mu \le \nu / c_p^2\) it holds that

$$\begin{aligned} A_M^\mu (({\tilde{y}},{\tilde{w}}),({\tilde{y}},{\tilde{w}})) \ge ||| ({\tilde{y}},{\tilde{w}})|||^2. \end{aligned}$$

With this, the a-posteriori error estimate follows in analogy to Theorem 4. \(\square \)

3.5 Control Constraints, Abstract Controls and State Constraints

The case of partially supported controls and control constraints can be treated by switching to an elliptic system for the adjoint state p. In particular, we can consider linear and bounded control operators \(B: U \rightarrow L^2((0,T);H^{-1}(\varOmega ))\) mapping controls to feasible right hand sides, where U denotes a real Hilbert space, and control constraints \(u \in U_{\text {ad}} \subseteq U\), where \(U_{\text {ad}}\) denotes a convex, bounded and closed set of admissible controls. Under the corresponding regularity assumptions similar to those in Lemma 1, the associated optimality system can be reformulated into an elliptic equation of the form

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} -p_{tt} +\nu ^2 \varDelta ^2 p - B{\mathbb {P}}_{U_{\text {ad}}}\left\{ -\frac{1}{\alpha } B^*p\right\} &{}= f + \nu \varDelta y_d - (y_d)_t &{} \text { in } (0 ,T)\times \varOmega ,\\ p &{}= 0 &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ \nu \varDelta p &{}= y_d &{}\text { on } [0 ,T]\times \partial \varOmega ,\\ (-p_t - \nu \varDelta p)(0) &{}= y_0 - y_d(0) &{}\text { in }\varOmega ,\\ p(T) &{}= 0 &{}\text { in } \varOmega \end{array} \right. \end{aligned}$$

with \(B^*\) denoting the dual operator to B and \({\mathbb {P}}_{U_{\text {ad}}}\) denoting the projection operator onto the admissible control space. An a-posteriori error estimate can be derived analogously, see [3] for more details. Using a regularization of the projection operator, it is also possible to derive an elliptic equation for the state, see [17].

Further, we note that the procedure above can be extended to the treatment of state constraints, e.g. by adapting the approach of [16], that is, by considering the reduction to the elliptic space-time formulation for a state obeying state constraints. However, as a proof of concept, the present work does not incorporate additional constraints or other practically relevant control operators.

4 Time Adaptivity in MPC

In this section, we propose the use of a time-adaptive technique within MPC. In the classical application of MPC algorithms the length of the application horizon is fixed a priori and the prediction horizon is discretized equidistantly. This might not be ideal in practice. The choice of the length of the application horizon in each level of MPC is known to be a difficult issue. If the application horizon is chosen too long, the reaction to possible disturbances might be delayed. If it is chosen too short, the progress in the time domain is possibly unnecessarily slow and many open-loop subproblems need to be solved, leading to high computational effort. We also refer to e.g. [19] for a study of stability conditions related to the length of the application horizon.

The adaptive time grid, computed using (29) within the prediction horizon, can provide a possible choice for the application horizon length, since it places the time grid points according to the temporal dynamics of the optimal state.

In this work, we would like to answer the following questions:

(i) How to choose a time discretization for the prediction horizon \([t_i,t_i+{{\bar{T}}}]\) in each level i of the MPC?

(ii) How to choose efficiently the time discretization and length for the application horizon \([t_i, t_i+ \tau _i]\) in each level i of the MPC to implement the feedback control?

We aim at computing the temporal discretization such that it identifies the important dynamical structures according to the optimization goal. We propose an adaptive strategy which avoids unnecessarily small uniform temporal discretizations and allows for an efficient implementation. The proposed approach will lead to adaptive time discretizations which are related to the optimal state for each of the MPC subproblems.

The idea of adaptivity leads to different combinations using the error estimate (29). Here, we will deal with an adaptive grid in each subinterval for a fixed prediction horizon where the time discretization is computed on the fly. For a different adaptive concept based on goal-oriented adaptivity, see the recent work [12].

Therefore, for a given prediction interval \([t_i,t_i^{N}]\) at each MPC iteration i, we make use of the a-posteriori error estimation (29) for the state to compute an adaptive time grid within the current time horizon. Note that \(t_0 := 0\) is the initial time.

We consider the use of adaptive application horizons (flag\(=1\) in Algorithm 1) and compare it with the use of fixed application horizons (flag\(=2\) in Algorithm 1). The schemes are visualized in Figs. 1 and 2, respectively. The numerical performances of these approaches will be discussed in Sect. 5.

Fig. 1
figure 1

Scheme of an MPC approach with adaptive application horizons, flag\(=1\) in Algorithm 1: The blue dashed color refers to the grid at iteration i starting at time \(t_i\) till \(t_i^N\) whereas the red color refers to the next MPC level \(i+1\). The scheme visualizes the choice \(P=2\) in Algorithm 1. We note that the only guaranteed overlap of the time grids is for the second time instance at iteration i, which corresponds to the first time instance at iteration \(i+1\) (Color figure online)

Fig. 2
figure 2

Scheme of an MPC approach with fixed application horizons, flag\(=2\) in Algorithm 1: The blue dashed color refers to the grid at iteration i starting at time \(t_i\) till \(t_i^N\) whereas the red color refers to the next MPC level \(i+1\). The application horizon length \({\bar{\tau }}\) is fixed in each MPC step. We note that the discretization of the application horizon is formed by all adaptive time points of the prediction horizon which lie within the application horizon and a (possibly) additional time point \(t_i+{\bar{\tau }}\) (in black color), which then constitutes the first time point of the shifted prediction horizon (Color figure online)

For a given number of degrees of freedom N the algorithm distributes the time instances within the prediction horizon \([t_i,t_i^N]\) according to the error estimation (29), where we assume that all prediction horizons have the same length \(t_i^{N}-t_i = {\bar{T}}\). The resulting adaptive time grid at each time instance \(t_i\) is related to the optimal state of the corresponding open-loop subproblem of the current MPC step. Again, we assume that the heuristic assumptions of Remark 3 hold true which enables an efficient computation. The approach is summarized in Algorithm 1 in Sect. 1.

Remark 4

(Warm start) In order to make computations even more efficient, the information of the previous MPC iteration can be used as a warm start for the next MPC iteration. In particular, after a coarsening step of the previous adaptive time grid, this grid can be used as an initial adaptive time grid for the next prediction horizon. Furthermore, to improve the inner open-loop solver in each iteration one can use as initial control the one computed at the previous step.

Remark 5

(Efficiency under perturbations) This approach allows us to compute a suitable adaptive temporal grid for every iteration of the MPC method. The grid will, in general, not be equidistant. The approach is particularly sensitive to perturbations of the system. Specifically, we will consider in Sect. 5 perturbations of the initial condition and right hand side of the state equation (7) when applying the model predictive feedback value. This leads to a perturbed initial state for the next MPC iteration level. For this perturbed initial state we solve the elliptic system using the error indicator (30). Thus, the perturbations of the system enter the error indicator (30) through the perturbed state. It follows from the structure of the estimator that it is not able to distinguish whether a perturbation of the system drives the state y away from or closer to the desired state \(y_d\).

Remark 6

(Discretization of the application horizon) Once \(\tau _i\) is computed, independently from the choice of the flag in Algorithm 1, we take advantage of the adaptive grid already computed for the prediction horizon. Then, we take those adaptive time points within the application horizon as time discretization of the application horizon.
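The construction described in Remark 6 and in the captions of Figs. 1 and 2 can be sketched in a few lines of Python. This is an illustration under assumptions, not the implementation of Algorithm 1: the function name `application_grid` is hypothetical, and for flag\(=1\) we assume that the application horizon ends at the P-th time instance of the adaptive prediction grid (consistent with the case \(P=2\) shown in Fig. 1).

```python
import numpy as np

def application_grid(pred_grid, flag, P=2, tau_bar=None, tol=1e-12):
    """Time grid of the application horizon, taken from the prediction grid.

    flag == 1: adaptive length; ASSUMPTION: the horizon ends at the P-th
               time instance (1-based) of the adaptive prediction grid.
    flag == 2: fixed length tau_bar; the end point t_i + tau_bar is added
               if it is not already a grid point (cf. Fig. 2).
    """
    pred_grid = np.asarray(pred_grid, dtype=float)
    t_i = pred_grid[0]
    if flag == 1:
        t_end = pred_grid[P - 1]
    elif flag == 2:
        if tau_bar is None:
            raise ValueError("tau_bar is required for flag == 2")
        t_end = t_i + tau_bar
    else:
        raise ValueError("flag must be 1 or 2")
    # keep the adaptive points strictly before the end point, then close
    inside = pred_grid[pred_grid < t_end - tol]
    return np.append(inside, t_end)
```

The last entry of the returned grid is the starting time \(t_{i+1}\) of the next prediction horizon, and the difference between the last and first entries is \(\tau _i\).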

5 Numerical Example

In the following tests, we investigate numerically the time-adaptive MPC algorithm proposed in Sect. 4. In all numerical examples, the considered spatial domain is the open interval \(\varOmega = (0,1)\). In order to solve the mixed form (26), we introduce a partitioning of the space-time domain into regular rectangles and use \({\mathbb {Q}}_1\) space-time finite elements for discretization, where \({\mathbb {Q}}_1\) is the space of polynomials of separate degree up to 1. We solve the equation with a direct solver using a coarse spatial resolution. We further note that in our implementations, we do not perform a homogenization of the elliptic system, but we solve a mixed formulation of (10) such that in the error indicator (23) the pair \(({\tilde{y}},{\tilde{w}})\) denotes the solution to (10) and \({\tilde{y}}_d\) denotes the right-hand side in (10). Analogously, we proceed for the system with the depletion term. For the solution of the MPC open-loop subproblems, we use an implicit Euler scheme for the temporal discretization and piecewise linear and continuous finite elements for the spatial discretization for the state, adjoint state and control. This results in piecewise constant approximations with respect to time for the state y, the adjoint state p and the control u within the MPC open-loop subproblems. The optimal control problem is solved with a direct solver addressing the coupled optimality system for all time instances at once (monolithic approach, see e.g. [20, Section 3.7]), where we take as fine spatial resolution an equidistant discretization with \(\varDelta x = 1/100\). All coding is done in Matlab R2020b.

5.1 Test 1: Solution with a Layer at \(t=0.5\)

In this numerical test, we consider the optimal control of (4), where the cost functional tracks a time-dependent reference trajectory. We restrict the control horizon to [0, 1], since a larger control horizon does not change the quality of our results. The goal is to approximate the layer at time \(t=0.5\) well; afterwards the solution is smooth. The setting for this test example is taken from [8, Example 5.2], with the following choices: \(\nu = 1\) in (4) and \(\alpha = 1\) in (3). The example is built such that the exact optimal solution (y, u) to (6) over [0, 1] is given by

$$\begin{aligned} y(t,x) = \sin (\pi x) \text {atan} ((t-1/2)/\varepsilon ), \quad u(t,x) = -\sin (\pi x) \sin (\pi t). \end{aligned}$$

The initial condition is \(y_\circ (x) = \sin (\pi x) \text {atan}(-1/(2\varepsilon ))\). The functions f and \(y_d\) are chosen accordingly as

$$\begin{aligned} f(t,x)&= \sin (\pi x) \left( \varepsilon /(t^2 - t + \varepsilon ^2 + 1/4) + \pi ^2 \text {atan}((t-1/2)/(\varepsilon )) + \sin (\pi t) \right) ,\\ y_d(t,x)&= \sin (\pi x) \left( \text {atan}((t-1/2)/(\varepsilon )) + \pi \cos (\pi t) - \pi ^2 \sin (\pi t) \right) . \end{aligned}$$

For small values of \(\varepsilon \) (we use \(\varepsilon = 10^{-3}\)), the state y develops a very steep gradient at \(t = 0.5\), which can be seen in the left panel of Fig. 4.
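The consistency of this manufactured solution can be checked numerically. The following Python sketch evaluates the residuals of the state equation \(y_t - \nu \varDelta y = f + u\) and of the adjoint equation with central finite differences. The adjoint relations \(-p_t - \nu \varDelta p = y - y_d\) and \(p = -\alpha u\) are assumed here in the form suggested by the system in Sect. 3.5; the function names and the step size `h` are hypothetical choices for this illustration.

```python
import numpy as np

NU, ALPHA = 1.0, 1.0  # values used in Test 1

def y(t, x, eps):
    return np.sin(np.pi * x) * np.arctan((t - 0.5) / eps)

def u(t, x):
    return -np.sin(np.pi * x) * np.sin(np.pi * t)

def f(t, x, eps):
    return np.sin(np.pi * x) * (eps / (t**2 - t + eps**2 + 0.25)
                                + np.pi**2 * np.arctan((t - 0.5) / eps)
                                + np.sin(np.pi * t))

def y_d(t, x, eps):
    return np.sin(np.pi * x) * (np.arctan((t - 0.5) / eps)
                                + np.pi * np.cos(np.pi * t)
                                - np.pi**2 * np.sin(np.pi * t))

def state_residual(t, x, eps, h=1e-4):
    """y_t - nu*y_xx - f - u, derivatives by central differences."""
    y_t = (y(t + h, x, eps) - y(t - h, x, eps)) / (2 * h)
    y_xx = (y(t, x + h, eps) - 2 * y(t, x, eps) + y(t, x - h, eps)) / h**2
    return y_t - NU * y_xx - f(t, x, eps) - u(t, x)

def adjoint_residual(t, x, eps, h=1e-4):
    """-p_t - nu*p_xx - (y - y_d) with the ASSUMED relation p = -alpha*u."""
    p = lambda s, z: -ALPHA * u(s, z)
    p_t = (p(t + h, x) - p(t - h, x)) / (2 * h)
    p_xx = (p(t, x + h) - 2 * p(t, x) + p(t, x - h)) / h**2
    return -p_t - NU * p_xx - (y(t, x, eps) - y_d(t, x, eps))
```

For a moderate value such as \(\varepsilon = 0.1\) both residuals vanish up to the finite difference error. For \(\varepsilon = 10^{-3}\) the step size `h` would have to be much smaller than the layer width to resolve the state residual near \(t = 0.5\), which mirrors the difficulty that equidistant time grids face in this test.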

We compare the adaptive Algorithm 1 with a standard equidistant MPC approach. To start with, we first consider the choice flag\(=1\), i.e. the length of the application horizon is chosen adaptively with \(P=2\). In Fig. 3, we show tracking costs for different choices of the prediction horizon length \({\bar{T}}\) and number N of time points in each prediction horizon. For large enough N, the tracking costs become \(\Vert y-y_d\Vert _{L^2((0,T);\varOmega )}\approx 5.2\), where we observe that using the adaptive approach, this value is already reached with a small number N of time points for either of the choices for \({\bar{T}}\). For an exemplary visualization, we plot the tracking term over time for the choices \(N=9\) and \({\bar{T}} \in \{0.2,0.3\}\) in Fig. 3 (middle, right).

Fig. 3
figure 3

Test 1: Tracking term value \(\Vert y-y_d\Vert _{L^2((0,T);\varOmega )}\) (left) for increasing N for different prediction horizon lengths \({\bar{T}}\) comparing the adaptive approach flag\(=1\) with \(P=2\) with an equidistant approach; tracking term value over time \(\Vert y(t)-y_d(t)\Vert _{L^2(\varOmega )}\) for \(N=9,{\bar{T}}=0.2\) (middle) and \(N=9,{\bar{T}}=0.3\) (right) comparing the adaptive and equidistant time discretization

Further, for the choices flag\(=1\) with \(P=2\) and \({\bar{T}}=0.2,N=9\), the numerical state solutions of the controlled problem with the different MPC approaches are shown in the middle and right panel of Fig. 4. We can see that the standard MPC algorithm with equidistant time grids fails whereas using Algorithm 1 it is possible to capture the layer at \(t=0.5\) and the solution complies much better with the true open-loop state solution over [0, 1].

Fig. 4
figure 4

Test 1: True optimal state solution (left), MPC state solution y using a uniform time discretization (middle) and adaptive approach (right) with the choices flag\(=1,P=2,{\bar{T}}=0.2,N=9\)

Let us now provide more details about the temporal grids we obtained with the proposed adaptive scheme with the choices flag\(=1,P=2,{\bar{T}}=0.2,N=9\). The adaptive grid with a coarse and a fine spatial resolution is shown in the middle and right panel of Fig. 5. We observe that for this setting, the time adaptivity is very insensitive with respect to the spatial resolution, compare Remark 3. We note that the time discretization in Fig. 5 (middle) displays the adaptive time intervals where the MPC feedback value is applied.

Fig. 5
figure 5

Test 1: Uniform space-time grid with fine spatial resolution (left), adaptive grid with coarse (middle) and fine (right) spatial resolution for flag\(=1,P=2,{\bar{T}}=0.2,N=9\)

Examples of adaptive prediction horizons are shown in the top panels of Fig. 6. As a comparison, the uniform time horizons of the same lengths are shown in the bottom panels of Fig. 6 using the same number of degrees of freedom in each interval. It is clear that the a-posteriori error estimate (22) leads to a time grid associated with the open-loop optimal state which benefits the accuracy of the control problem.

Fig. 6
figure 6

Test 1: Adaptive prediction horizons for flag\(=1,P=2,{\bar{T}}=0.2,N=9\) (top), uniform prediction horizons according to the standard MPC approach (bottom), MPC iteration levels \(i=13\) (left), \(i=15\) (middle), \(i=24\) (right)

Moreover, we provide an error analysis for the computation of the approximate solutions using an adaptive and an equidistant approach, for different choices of degrees of freedom in time and prediction horizons. For this, we compute the error between the analytical optimal state solution to (6) on the finite time domain \([0,T]=[0,1]\) and its numerical approximation using the different MPC approaches measured in the \(L^2((0,T);\varOmega )-\)norm, compare Fig. 7 (left). We fixed the prediction horizon \({\bar{T}}\) and varied the number of time instances in each subinterval for the equidistant and the adaptive method. As one can see, with an equidistant grid we need a small prediction horizon and a large number of time instances to obtain an error of order \(10^{-1}\), whereas the adaptive method provides a more flexible approach for our choices of \({\bar{T}}\in \{0.1,0.2,0.3\}\). Depending on whether the layer at \(t=0.5\) is a time discretization point or not, the approximation quality can differ strongly, leading to the illustrated zig-zag behavior in the equidistant scheme. Since the exact location of the layer is usually not known a priori, an equidistant time grid approach can easily fail.

Fig. 7
figure 7

Test 1: \(L^2-\)error (left) and computational time in seconds (right) for the MPC approach with equidistant and adaptive time grids with the choices flag\(=1,P=2\)

In Fig. 7 (right) we compare the computational time in seconds of the standard MPC algorithm with that of Algorithm 1 with the choices flag\(=1\) and \(P=2\), including the computational time needed to create the adaptive time discretization within each MPC iteration. Clearly, obtaining a more accurate solution is computationally more expensive. However, the minimum error with the equidistant grid is 0.0872, computed in 25.87s, whereas the adaptive approach reaches an error of 0.0216 in 16.06s. This shows that, in this test, our method is more accurate and also computationally more efficient without any a-priori knowledge of the control problem.

Further, we provide results for the choice flag\(=2\) in Algorithm 1, i.e. the length of the application horizon is chosen to be fixed, whereas its time discretization is either adaptive or equidistant. We show in Fig. 8 the tracking term values for different choices for fixed application horizon lengths. In these settings, we make similar observations as for the results shown in Fig. 3.

Fig. 8
figure 8

Test 1: Tracking term value \(\Vert y-y_d\Vert _{L^2((0,T);\varOmega )}\) for increasing N for different prediction horizon lengths \({\bar{T}}\) comparing the adaptive approach flag\(=2\) with an equidistant approach for fixed application horizon lengths comparing the choices \({\bar{\tau }}={\bar{T}}/(N-1)\) (left), \({\bar{\tau }}=0.5 \cdot {\bar{T}}\) (middle), \({\bar{\tau }}=0.1 \cdot {\bar{T}}\) (right)

5.2 Test 2: State Equation with Depletion Term and Random Disturbances

In this numerical test, we consider an optimal control problem where the state dynamics are governed by (25) with \(\mu > 0\). Let us note that the Poincaré constant \(c_p\) and the first eigenvalue \(\lambda _1\) of the Laplace-Dirichlet operator are related by \(\lambda _1 = 1/c_p^2\) (see, e.g., [4, Proposition 8.4.3]). For the considered domain \(\varOmega = (0,1)\), the first eigenvalue \(\lambda _1\) of the Laplace-Dirichlet operator is given by \(\lambda _1 = \pi ^2\) (see, e.g., [4, Proposition 8.5.2]). Then, since Theorem 5 is applicable if \(\mu \le \nu / c_p^2\), for this setting it requires \(\mu \le \nu \cdot \pi ^2\). In this example, we set \(\nu =0.1\) and \(\mu = 5\). Thus, we consider an unstable case which goes beyond the assumptions of Theorem 5. Nevertheless, we will see that the numerical tests under this configuration still provide satisfactory results, very similar to a stable case with \(\mu \le \nu \pi ^2\) as required in Theorem 5. The initial condition for the state is chosen as \(y_\circ (x) \equiv 0\) and the source term in the state equation is set to \(f(t,x)\equiv 0\). The regularization parameter in the cost is chosen as \(\alpha = 10^{-3}\) and the desired state is given by

$$\begin{aligned} y_d(t,x) = -10 | x- 0.25| - 10 |x-0.75| + 10, \end{aligned}$$

which is a stationary state and shown in Fig. 9 (left). Thus, the goal of the optimal control problem is to steer the state y, which fulfills (25) in a weak sense, as close as possible to the desired state \(y_d\) and keep it there (for an infinite amount of time). In Fig. 9 (middle) we show the controlled state solution using Algorithm 1 with the choices flag=1, \(N=20, {\bar{T}}=0.5, {P=2}\) and plot the adaptive time grid for the first prediction horizon [0, 0.5] in Fig. 9 (right).
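The parameter regime of this test can be checked against the assumption of Theorem 5 with a few lines of Python. The variable names are hypothetical; the values are those stated above.

```python
import numpy as np

nu, mu = 0.1, 5.0

# First Dirichlet eigenvalue of the Laplacian on (0, 1) and Poincare constant
lambda_1 = np.pi**2
c_p = 1.0 / np.sqrt(lambda_1)

# Theorem 5 requires mu <= nu / c_p^2 = nu * pi^2
bound = nu / c_p**2
theorem5_applies = mu <= bound

def y_d(x):
    """Stationary desired state of Test 2."""
    x = np.asarray(x, dtype=float)
    return -10.0 * np.abs(x - 0.25) - 10.0 * np.abs(x - 0.75) + 10.0
```

Here `bound` is approximately 0.987, so \(\mu = 5\) violates the condition and `theorem5_applies` is `False`. The desired state vanishes at \(x=0\) and \(x=1\) and equals 5 on the plateau \([0.25, 0.75]\).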

Fig. 9
figure 9

Test 2: Desired state \(y_d\) (left), controlled state (middle), adaptive time grid for prediction horizon [0, 0.5] (right)

Fig. 10
figure 10

Scheme of MPC with disturbances

For a cheap computation of the adaptive time grid, we solve (26) with a coarse spatial resolution of \(\varDelta x = 1/4\), compare Remark 3. We observe a fine temporal discretization toward \(t=0\), where the initial state must be steered from \(y_\circ (x)=0\) as close as possible to the desired state.

In realistic scenarios, however, disturbances often enter the system, see Fig. 10 for a schematic presentation. Here, we focus on disturbances that occur at random time points \(\{\omega _\kappa \}_{\kappa =1}^K\) in the source term f and the current state \(y_i\) with random magnitudes \(\{(\chi _\kappa ,\psi _\kappa )\}_{\kappa =1}^K\), leading to a disturbed initial value \(y_{i+1}=y^N(t_{i+1})\) for the next MPC loop. In particular, if \(\omega _\kappa \in (t_i,t_i + \tau _i]\), i.e. if the current simulation window contains one of the random time instances, we consider the following disturbed state equation for implementing the model predictive feedback value:

$$\begin{aligned} \left\{ \begin{array}{ll@{\quad }l} y_t-\nu \varDelta y - \mu y &{}= f_{dist}+\phi ^N &{}\text { in } (t_i,t_i + \tau _i]\times \varOmega ,\\ y &{}= 0 &{}\text { on } (t_i,t_i + \tau _i] \times \partial \varOmega ,\\ y(t_i) &{}= y_i + y_{dist} &{} \text { in } \varOmega , \end{array} \right. \end{aligned}$$

where \(f_{dist}(t,x) \equiv - \chi _\kappa \) in \((t_i,t_i + \tau _i] \times \varOmega \) and \(y_{dist} (x) = - \psi _\kappa \sin (\pi x)\) in \(\varOmega \). In this example, we generate the random numbers once and run all tests for these values in order to make the experiments comparable. We consider \(K=4\) random time points \(\omega _1 = 3.51, \omega _2 = 4.73, \omega _3 = 5.85, \omega _4 = 8.30\) and values \(\chi _1 = 75.85, \chi _2 = 380.44, \chi _3 = 567.82, \chi _4 = 753.72\) and \(\psi _1 = 6.78, \psi _2 = 7.57, \psi _3 = 7.43, \psi _4 = 3.92\). In Fig. 11 (left) we show the decay of the tracking costs for an increasing number of time instances N per prediction horizon for three examples of prediction horizon lengths (\({\bar{T}}=0.2, 0.3, 0.4\)), comparing the adaptive approach of Algorithm 1 with flag\(=1\) and \(P=2\) and \(P=4\), respectively, with the standard uniform approach. In this example, we run the MPC loop until \(t_i=10\) for some \(i \in {\mathbb {N}}\), i.e. we cover a time domain of [0, 10]. We observe in this setting that the adaptive approach delivers smaller tracking term values than the equidistant approach. The greatest benefit of the adaptive approach is achieved when a small number of degrees of freedom in a comparatively large prediction horizon is considered, where the adaptive approach distributes the time discretization points according to the optimal state dynamics indicated through the error estimate (29). Note that fixing N and \({\bar{T}}\) can lead to different lengths of the application horizon \((t_i,t_i+\tau _i]\) in which the feedback value is applied in Algorithm 1, and thus to a different number of degrees of freedom for the whole considered time domain [0, 10]. Moreover, we note that choosing \(P=2\) in Algorithm 1 leads to smaller tracking values than choosing \(P=4\).
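The disturbance mechanism described above can be sketched as follows. This Python snippet is an illustration only: the function name `disturbance` is hypothetical, and we assume that a window \((t_i,t_i+\tau _i]\) without a random time point is left undisturbed and that at most one random time point falls into a window (the first match is used).

```python
import numpy as np

# Random time points and magnitudes as listed for Test 2
OMEGA = [3.51, 4.73, 5.85, 8.30]
CHI = [75.85, 380.44, 567.82, 753.72]
PSI = [6.78, 7.57, 7.43, 3.92]

def disturbance(t_i, tau_i, x):
    """Return (f_dist, y_dist) for the window (t_i, t_i + tau_i].

    f_dist is the constant source disturbance -chi_kappa and y_dist is the
    state disturbance -psi_kappa * sin(pi x) sampled at the points x.
    ASSUMPTION: windows without a random time point get zero disturbance.
    """
    x = np.asarray(x, dtype=float)
    for omega, chi, psi in zip(OMEGA, CHI, PSI):
        if t_i < omega <= t_i + tau_i:
            return -chi, -psi * np.sin(np.pi * x)
    return 0.0, np.zeros_like(x)
```

For example, a window starting at \(t_i = 3.5\) with \(\tau _i = 0.05\) contains \(\omega _1 = 3.51\) and therefore uses \(\chi _1\) and \(\psi _1\), whereas a window starting exactly at \(t_i = 3.51\) does not, since the interval is open on the left.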

Fig. 11
figure 11

Test 2: Tracking term value \(\Vert y-y_d\Vert _{L^2((0,T);\varOmega )}\) (left) for increasing N for different prediction horizons \({\bar{T}}\) comparing the adaptive approach with flag\(=1\),\(P=2\) (top) and \(P=4\) (bottom) with an equidistant approach; tracking term value over time \(\Vert y(t)-y_d(t)\Vert _{L^2(\varOmega )}\) (middle) for \(N=15, {\bar{T}}=0.3\); zoom-in (right)

The fact that in these settings the time-adaptive MPC approach leads to a closer tracking of the desired state than an equidistant approach with the same respective choices for P, N and \({\bar{T}}\) comes at the price of higher control costs, compare Fig. 12. The behavior of the adaptive control costs with increasing number N shown in Fig. 12 (left) is difficult to explain due to the nonlinear relation between the perturbations and N and \({\bar{T}}\).

Fig. 12
figure 12

Test 2: Control costs \(\Vert u\Vert _{L^2((0,T);\varOmega )}\) (left) for increasing N comparing the adaptive approach flag\(=1\) with \(P=2\) (top) and \(P=4\) (bottom) with an equidistant approach; control costs over time \(\Vert u(t)\Vert _{L^2(\varOmega )}\) for \(N=15\) and \({\bar{T}}=0.3\) (middle); zoom-in (right)

Finally, we provide some results using a fixed application horizon length in Fig. 13, i.e. we choose flag\(=2\) in Algorithm 1. In the presented plots, both adaptive and uniform time discretizations of the prediction and application horizon lead to similar tracking costs. In the setting with \(N=15,{\bar{T}}=0.3\), we observe that using the adaptive discretization, we detect the disturbance earlier in time.

Fig. 13
figure 13

Test 2: Tracking term value \(\Vert y-y_d\Vert _{L^2((0,T);\varOmega )}\) (left) for increasing N for different prediction horizon lengths \({\bar{T}}\) using a fixed application horizon length flag\(=2\) comparing adaptive and equidistant discretizations with \({\bar{\tau }}={\bar{T}}/(N-1)\); tracking term value over time \(\Vert y(t)-y_d(t)\Vert _{L^2(\varOmega )}\) for \(N=15,{\bar{T}}=0.3\) (middle); zoom-in (right)

6 Conclusions and Outlook

In this work we have proposed an approach to include time-adaptive discretization in the MPC framework. Our approach is fully flexible and relies on a reformulation of the optimal control problem into a second order in time and fourth order in space equation. Our approach does not require further assumptions on the control problem. The use of a-posteriori error estimates to generate the time grid in the MPC method is the important novelty of our work. Numerical tests have shown the efficiency of the method with respect to both accuracy and computational time. We also want to remark that our approach is particularly suitable when the solution exhibits a layer or when disturbances occur. Other experiments with mild temporal variations did not always show a clear difference between equidistant and adaptive grids. The a-posteriori error indicator delivers an appropriate adaptive time grid even when a coarse spatial resolution is used.

In the future, we plan to derive an a-posteriori error estimator for a fully space-time discrete form and to use that indicator for a fully adaptive and automatic MPC scheme, where the idea is to avoid an a-priori choice of the prediction horizon and/or the number of degrees of freedom in each sub-iteration. Another goal is to extend these results to nonlinear control problems and, as the dimension of the problem increases, to make use of efficient model reduction techniques, such as POD, to decrease the computational time.