1 Introduction

Optimal control and optimization problems are commonly approximated by numerical methods such as standard finite element methods (FEMs), mixed FEMs, space-time FEMs, finite volume element methods, spectral methods, and multigrid methods; see, e.g., [5, 8, 10, 16, 17, 24, 25, 26, 31]. Among these methods, FEMs occupy a central position.

For a control-constrained elliptic optimal control problem (OCP), the control variable has lower regularity than the state or co-state variable. Hence, most researchers approximate the control by piecewise constant functions and the state and co-state by piecewise linear functions. If the mesh size is h, the convergence order in the \(L^{2}\)-norm for the control or in the \(H^{1}\)-norm for the state and co-state is just \(\mathcal{O}(h)\); see, e.g., [2, 9, 12, 18]. When these techniques are applied to control-constrained parabolic OCPs with time step k, the corresponding convergence order is \(\mathcal{O}(h+k)\). To improve accuracy and efficiency, superconvergence and adaptive algorithms for FEMs have become a research focus. Superconvergence analysis improves the convergence order to \(\mathcal{O}(h^{\frac{3}{2}})\) or \(\mathcal{O}(h^{\frac{3}{2}}+k)\). Some superconvergence results of FEMs for linear and semilinear elliptic or parabolic OCPs can be found in [4, 6, 15, 27, 28, 29]. Adaptive FEMs for elliptic and parabolic OCPs have been investigated in [1, 11, 19, 32] and [3], respectively.

Hinze presents a variational discretization (VD) concept for control-constrained optimization problems in [13]. It can not only save computational cost but also improve the convergence order to \(\mathcal{O}(h^{2})\). In recent years, VD has been used to solve different kinds of constrained OCPs; for example, VD approximations of a convection-dominated diffusion OCP with control constraints and of linear parabolic OCPs with pointwise state constraints are investigated in [14] and [7], respectively.

In this paper, we consider VD approximation for constrained parabolic bilinear OCPs. The main purpose is to analyze the convergence and superconvergence. We are interested in the following control constrained parabolic bilinear OCP:

$$\begin{aligned} &\min_{u\in K}\frac{1}{2} \int _{0}^{T} \bigl( \bigl\Vert y(t,x)-y_{d}(t,x) \bigr\Vert ^{2}+\alpha \bigl\Vert u(t,x) \bigr\Vert ^{2} \bigr)\,dt, \end{aligned}$$
(1)
$$\begin{aligned} &y_{t}(t,x)-\operatorname{div}\bigl(A(x)\nabla y(t,x) \bigr)+u(t,x)y(t,x)=f(t,x), \quad t\in J,x\in \varOmega , \end{aligned}$$
(2)
$$\begin{aligned} &y(t,x)=0, \quad t\in J,x\in \partial \varOmega , \end{aligned}$$
(3)
$$\begin{aligned} &y(0,x)=y_{0}(x), \quad x\in \varOmega , \end{aligned}$$
(4)

where \(\alpha >0\) represents the weight of the cost of the control, \(\varOmega \subset \mathbb{R}^{2}\) is a convex bounded open set with smooth boundary ∂Ω and \(J=[0,T]\) (\(0< T<+\infty \)). The matrix \(A(x)=(a_{ij}(x))_{2\times 2}\in [W^{1, \infty }(\bar{\varOmega })]^{2\times 2}\) is symmetric and positive definite. Moreover, we assume that \(f(t,x)\in C(J;L^{2}(\varOmega ))\), \(y_{0}(x)\in H_{0}^{1}(\varOmega )\), and the set of admissible controls K is defined by

$$ K=\bigl\{ v(t,x)\in L^{\infty }\bigl(J;L^{2}(\varOmega )\bigr): a \leq v(t,x)\leq b, \text{ a.e. in } \varOmega , t\in J \bigr\} , $$

where \(0\leq a< b\) are real numbers.

In this paper, we adopt the notation \(L^{s}(J;W^{m,q}(\varOmega ))\) for the Banach space of all \(L^{s}\) integrable functions from J into \(W^{m,q}(\varOmega )\) with norm \(\|v\|_{L^{s}(J;W^{m,q}(\varOmega ))}=(\int _{0}^{T}\|v\|_{W^{m,q}(\varOmega )}^{s}\,dt)^{\frac{1}{s}}\) for \(s\in [1, \infty )\) and the standard modification for \(s=\infty \), where \(W^{m,q}(\varOmega )\) is the Sobolev space on Ω with norm \(\|\cdot \|_{W^{m,q}(\varOmega )}\) and semi-norm \(|\cdot |_{W^{m,q}( \varOmega )}\). We set \(H_{0}^{1}(\varOmega )\equiv \{v \in H^{1}( \varOmega ): v|_{\partial \varOmega } =0 \}\) and denote \(W^{m,2}( \varOmega )\) by \(H^{m}(\varOmega )\). Similarly, one can define \(H^{l}(J;W ^{m,q}(\varOmega ))\) and \(C^{k}(J;W^{m,q}(\varOmega ))\) (see, e.g., [22]). In addition, c or C denotes a generic positive constant.

The plan of our paper is as follows. In Sect. 2, we present VD approximation scheme for the model problem (1)–(4). In Sect. 3, we introduce some important intermediate variables and their error estimates. Convergence of the control variable is derived in Sect. 4. Superconvergence of the state and the co-state are established in Sect. 5. In Sect. 6, we present two numerical examples to illustrate our theoretical results.

2 VD approximation for parabolic bilinear OCP

In this section, we construct a VD approximation for (1)–(4). We abbreviate \(L^{p}(J;W^{m,q}(\varOmega ))\) and \(\|\cdot \|_{L^{p}(J;W^{m,q}(\varOmega ))}\) as \(L^{p}(W^{m,q})\) and \(\|\cdot \|_{L^{p}(W^{m,q})}\), respectively. Let \(W=H_{0}^{1}(\varOmega )\) and \(U=L^{2}(\varOmega )\). Moreover, we denote \(\|\cdot \|_{H^{m}( \varOmega )}\) and \(\|\cdot \|_{L^{2}(\varOmega )}\) by \(\|\cdot \|_{m}\) and \(\|\cdot \|\), respectively. Let

$$\begin{aligned} \begin{aligned}[b] &a(v,w)= \int _{\varOmega }(A\nabla v)\cdot \nabla w, \quad \forall v, w\in W, \\ &(f_{1},f_{2})= \int _{\varOmega }f_{1}\cdot f_{2}, \quad \forall f_{1}, f_{2}\in U. \end{aligned} \end{aligned}$$

According to the assumptions on A, we have

$$\begin{aligned} \begin{aligned}[b] a(v,v)\geq c \Vert v \Vert _{1}^{2}, \qquad \bigl\vert a(v,w) \bigr\vert \leq C \Vert v \Vert _{1} \Vert w \Vert _{1}, \quad \forall v, w\in W. \end{aligned} \end{aligned}$$

We recast (1)–(4) as the following weak formulation:

$$\begin{aligned} &\min_{u\in K}\frac{1}{2} \int _{0}^{T} \bigl( \Vert y-y_{d} \Vert ^{2}+\alpha \Vert u \Vert ^{2} \bigr)\,dt, \end{aligned}$$
(5)
$$\begin{aligned} &(y_{t},w)+a(y,w)+(u y,w)=(f,w), \quad \forall w\in W, t\in J, \end{aligned}$$
(6)
$$\begin{aligned} &y(0,x)=y_{0}(x), \quad \forall x\in \varOmega . \end{aligned}$$
(7)

It follows from [21] that the problem (5)–(7) has at least one solution \((y,u)\), and that if the pair \((y,u)\in (H^{2}(L^{2})\cap L^{2}(H^{1}) )\times K\) is a solution of (5)–(7), then there is a co-state \(p\in H^{2}(L^{2})\cap L^{2}(H^{1})\) such that the triplet \((y,p,u)\) satisfies the following optimality conditions:

$$\begin{aligned} &(y_{t},w)+a(y,w)+(u y,w)=(f,w), \quad \forall w\in W, t\in J, \end{aligned}$$
(8)
$$\begin{aligned} &y(0,x)=y_{0}(x), \quad \forall x\in \varOmega , \end{aligned}$$
(9)
$$\begin{aligned} &{-}(p_{t},q)+a(q,p)+(u p,q)=(y-y_{d},q), \quad \forall q\in W, t\in J, \end{aligned}$$
(10)
$$\begin{aligned} &p(T,x)=0, \quad \forall x\in \varOmega , \end{aligned}$$
(11)
$$\begin{aligned} &(\alpha u-y p,v-u)\geq 0, \quad \forall v\in K, t\in J. \end{aligned}$$
(12)

As in Ref. [27], we can easily prove the following lemma.

Lemma 2.1

Let \((y,p,u)\) be the solution of (8)–(12). Then

$$\begin{aligned} u=\min \biggl(\max \biggl(a,\frac{y p}{\alpha } \biggr),b \biggr). \end{aligned}$$
(13)
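The projection formula (13) acts pointwise, so on sampled values it reduces to a clipping operation. The following is a minimal sketch, assuming that y and p are available as arrays of values at the same points; the function name `project_control` and the sample numbers are illustrative and not part of the original method description.

```python
import numpy as np

def project_control(y, p, alpha, a, b):
    """Pointwise projection of y*p/alpha onto [a, b], cf. formula (13)."""
    return np.clip(np.asarray(y) * np.asarray(p) / alpha, a, b)

# Illustrative sample values (not taken from the paper).
y = np.array([0.5, -1.0, 2.0])
p = np.array([1.0, 1.0, 1.0])
u = project_control(y, p, alpha=1.0, a=0.0, b=1.0)
```

Values of \(yp/\alpha \) below a are raised to a, values above b are lowered to b, and the remaining values are left unchanged.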

Let \(\mathcal{T}^{h}\) be regular triangulations of Ω, such that \(\bar{\varOmega }=\bigcup_{\tau \in \mathcal{T}^{h}}\bar{\tau }\) and \(h=\max_{\tau \in \mathcal{T}^{h}}\{h_{\tau }\}\), where \(h_{\tau }\) is the diameter of the triangle element τ. Furthermore, we set

$$\begin{aligned} W_{h} &= \bigl\{ v_{h}\in C(\bar{\varOmega }):v_{h}|_{\tau }\in \mathbb{P}_{1},\forall \tau \in \mathcal{T}^{h}, v_{h}|_{\partial \varOmega }=0 \bigr\} , \end{aligned}$$

where \(\mathbb{P}_{1}\) denotes the space of polynomials of degree at most 1.

Let \(0=t_{0}< t_{1}<\cdots <t_{N}=T\), \(k_{n}=t_{n}-t_{n-1}\), \(n=1, 2, \ldots ,N\), \(k=\max_{1\leq n\leq N} \{k_{n}\}\). Set \(\varphi ^{n}= \varphi (t_{n},x)\) and

$$ d_{t}\varphi ^{n}=\frac{\varphi ^{n}-\varphi ^{n-1}}{k_{n}}, \quad n=1,2, \ldots ,N. $$

Moreover, we define for \(1\leq p<\infty \) the discrete time-dependent norms

$$ |\!|\!|\varphi |\!|\!|_{l^{p}(J;W^{m,q}(\varOmega ))}:= \Biggl(\sum _{n=1-l} ^{N-l}k_{n} \bigl\Vert \varphi ^{n} \bigr\Vert ^{p}_{W^{m,q}(\varOmega )} \Biggr) ^{\frac{1}{p}}, $$

where \(l=0\) for the control u and the state y and \(l=1\) for the co-state p, with the standard modification for \(p=\infty \). For convenience, we denote \(|\!|\!|\cdot |\!|\!|_{l^{p}(J;W^{m,q}(\varOmega ))}\) by \(|\!|\!|\cdot |\!|\!|_{l^{p}(W^{m,q})}\) and let

$$ l^{p}\bigl(H^{s}\bigr):= \bigl\{ f: |\!|\!|f |\!|\!|_{l^{p}(H^{s})}< \infty \bigr\} , \quad 1\leq p\leq \infty . $$
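As an illustration of the discrete time-dependent norm defined above, the following minimal sketch evaluates it from precomputed spatial norms \(\Vert \varphi ^{n} \Vert _{W^{m,q}(\varOmega )}\) and step sizes \(k_{n}\). The function name `discrete_time_norm` is hypothetical, and the caller is assumed to pass the time levels that match the index shift l (that is, levels 1 to N for u and y, shifted by one for p).

```python
import numpy as np

def discrete_time_norm(spatial_norms, steps, p=2.0):
    """Discrete l^p-in-time norm from spatial norms and time step sizes."""
    spatial_norms = np.asarray(spatial_norms, dtype=float)
    steps = np.asarray(steps, dtype=float)
    if np.isinf(p):
        # Standard modification for p = infinity: maximum over time levels.
        return float(spatial_norms.max())
    return float(np.sum(steps * spatial_norms**p) ** (1.0 / p))

# Illustrative values: two time levels with spatial norms 3 and 4.
value_l2 = discrete_time_norm([3.0, 4.0], [0.5, 0.5], p=2.0)
value_linf = discrete_time_norm([3.0, 4.0], [0.5, 0.5], p=np.inf)
```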

Then a possible VD approximation of (1)–(4) is as follows:

$$\begin{aligned} &\min_{u_{h}^{n}\in K} \frac{1}{2}\sum _{n=1}^{N}k_{n} \bigl( \bigl\Vert y _{h}^{n}-y_{d}^{n} \bigr\Vert ^{2}+\alpha \bigl\Vert u_{h}^{n} \bigr\Vert ^{2} \bigr), \end{aligned}$$
(14)
$$\begin{aligned} & \bigl(d_{t}y_{h}^{n},w_{h} \bigr)+a \bigl(y_{h}^{n},w_{h} \bigr)+ \bigl(u _{h}^{n}y_{h}^{n},w_{h} \bigr)= \bigl(f^{n},w_{h} \bigr), \quad \forall w_{h}\in W_{h}, n=1,2,\ldots ,N, \end{aligned}$$
(15)
$$\begin{aligned} &y_{h}^{0}(x)=y_{0}^{h}(x), \quad \forall x\in \varOmega , \end{aligned}$$
(16)

where \(y_{0}^{h}(x)=R_{h} (y_{0}(x) )\) and \(R_{h}\) is an elliptic projection operator which will be specified later.

For \(n=1,2,\ldots ,N\), the OCP (14)–(16) again has a solution \((y_{h}^{n},u_{h}^{n} )\), and if \((y _{h}^{n},u_{h}^{n} )\in W_{h} \times K\) is a solution of (14)–(16), then there is a co-state \(p_{h}^{n-1} \in W_{h}\) such that the triplet \((y_{h}^{n},p_{h}^{n-1},u_{h} ^{n} )\in W_{h}\times W_{h}\times K\) satisfies the following optimality conditions:

$$\begin{aligned} & \bigl(d_{t}y_{h}^{n},w_{h} \bigr)+a \bigl(y_{h}^{n},w_{h} \bigr)+ \bigl(u _{h}^{n}y_{h}^{n},w_{h} \bigr)= \bigl(f^{n},w_{h} \bigr), \quad \forall w_{h}\in W_{h}, \end{aligned}$$
(17)
$$\begin{aligned} &y_{h}^{0}(x)=y_{0}^{h}(x), \quad \forall x\in \varOmega , \end{aligned}$$
(18)
$$\begin{aligned} &{-} \bigl(d_{t}p_{h}^{n},q_{h} \bigr)+a \bigl(q_{h},p_{h}^{n-1} \bigr)+ \bigl(u _{h}^{n}p_{h}^{n-1},q_{h} \bigr)= \bigl(y_{h}^{n}-y_{d}^{n},q_{h} \bigr), \quad \forall q_{h}\in W_{h}, \end{aligned}$$
(19)
$$\begin{aligned} &p_{h}^{N}(x)=0, \quad \forall x\in \varOmega , \end{aligned}$$
(20)
$$\begin{aligned} & \bigl(\alpha u_{h}^{n}-y_{h}^{n}p_{h}^{n-1},v^{n}-u_{h}^{n} \bigr) \geq 0 , \quad \forall v\in K. \end{aligned}$$
(21)

Similar to (13), the variational inequality (21) can be equivalently rewritten as follows.

Lemma 2.2

Let \((y_{h},p_{h},u_{h})\) be the solution of (17)–(21). Then, for \(n=1,2,\ldots ,N\), we have

$$\begin{aligned} u_{h}^{n}=\min \biggl(\max \biggl(a, \frac{y_{h}^{n} p_{h}^{n-1}}{ \alpha } \biggr),b \biggr). \end{aligned}$$
(22)

Remark 2.1

It should be pointed out that we minimize over the infinite dimensional set K instead of minimizing over a finite dimensional subset of K in (21). Then we just need to solve the discrete equations (17)–(20) and obtain \(u_{h}\) from (22).

3 Error estimates of intermediate variables

Some useful intermediate variables and their important error estimates will be introduced in this section. For any control function \(v\in K\) and \(w_{h},q_{h}\in W_{h}\), let \(y_{h}^{n}(v),p_{h}^{n}(v) \in W_{h}\) for \(n=1,2,\ldots ,N\) satisfy the following system:

$$\begin{aligned} & \bigl(d_{t} y_{h}^{n}(v),w_{h} \bigr)+a \bigl(y_{h}^{n}(v),w_{h} \bigr)+ \bigl(v ^{n} y_{h}^{n}(v),w_{h} \bigr)= \bigl(f^{n},w_{h} \bigr), \end{aligned}$$
(23)
$$\begin{aligned} &y_{h}^{0}(v)=y_{0}^{h}(x), \quad \forall x\in \varOmega , \end{aligned}$$
(24)
$$\begin{aligned} &{-} \bigl(d_{t} p_{h}^{n}(v),q_{h} \bigr)+a \bigl(q_{h},p_{h}^{n-1}(v) \bigr)+ \bigl(v ^{n} p_{h}^{n-1}(v),q_{h} \bigr)= \bigl(y_{h}^{n}(v)-y_{d}^{n},q_{h} \bigr), \end{aligned}$$
(25)
$$\begin{aligned} &p_{h}^{N}(v)=0, \quad \forall x\in \varOmega . \end{aligned}$$
(26)

If \((y_{h},p_{h},u_{h})\) is the solution of (17)–(21), then \((y_{h},p_{h})=(y_{h}(u_{h}),p_{h}(u _{h}))\).

We introduce the elliptic projection operator \(R_{h}:W\rightarrow W _{h}\), which satisfies: for any \(\phi \in W\),

$$\begin{aligned} a(R_{h}\phi -\phi ,w_{h})=0, \quad \forall \phi \in W,w_{h}\in W_{h}. \end{aligned}$$
(27)

It has the following property (see e.g., [4]):

$$\begin{aligned} & \Vert R_{h} \phi -\phi \Vert _{s}\leq Ch^{2-s} \Vert \phi \Vert _{2}, \quad \forall \phi \in H^{2}(\varOmega ),s=0,1. \end{aligned}$$
(28)
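The estimate (28) with \(s=0\) can be checked numerically in a simplified setting. The sketch below is a one-dimensional analogue only, assuming \(\varOmega =(0,1)\), \(A=1\), piecewise linear elements on a uniform mesh and the test function \(\phi (x)=\sin (\pi x)\); none of these choices come from the paper, and the function name `ritz_projection_l2_error` is hypothetical. It solves \(a(R_{h}\phi ,w_{h})=a(\phi ,w_{h})\) for the hat-function basis and measures the \(L^{2}\) error by Gauss quadrature.

```python
import numpy as np

def ritz_projection_l2_error(n_cells):
    """L2 error of the P1 elliptic projection of sin(pi x) on (0, 1), 1D, A = 1."""
    h = 1.0 / n_cells
    nodes = np.linspace(0.0, 1.0, n_cells + 1)
    phi = lambda x: np.sin(np.pi * x)
    m = n_cells - 1  # number of interior nodes

    # Stiffness matrix of a(v, w) = int v' w' dx for the hat functions.
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    # Right-hand side a(phi, psi_i); exact because psi_i' is piecewise constant.
    vals = phi(nodes)
    rhs = (2.0 * vals[1:-1] - vals[:-2] - vals[2:]) / h

    coeffs = np.zeros(n_cells + 1)  # boundary values stay zero
    coeffs[1:-1] = np.linalg.solve(K, rhs)

    # L2 error with a 3-point Gauss rule on each cell.
    gp, gw = np.polynomial.legendre.leggauss(3)
    err_sq = 0.0
    for i in range(n_cells):
        x = nodes[i] + 0.5 * h * (gp + 1.0)
        lam = (x - nodes[i]) / h
        rh_phi = (1.0 - lam) * coeffs[i] + lam * coeffs[i + 1]
        err_sq += 0.5 * h * np.sum(gw * (rh_phi - phi(x)) ** 2)
    return np.sqrt(err_sq)

error_coarse = ritz_projection_l2_error(16)
error_fine = ritz_projection_l2_error(32)
```

Halving h should reduce the error by a factor close to 4, in line with the \(h^{2}\) factor in (28) for \(s=0\).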

Lemma 3.1

Let \((y,p,u)\) be the solution of (8)–(12) and \((y_{h}(u),p_{h}(u))\) be the discrete solution of (23)–(26) with \(v=u\). Suppose that \(u\in l^{2}(H^{1})\) and \(y,p\in l^{2}(H^{2})\cap H^{2}(L^{2})\cap H^{1}(H^{2})\). Then we have

$$\begin{aligned} \bigl|\!\bigl|\!\bigl|y_{h}(u)-y \bigr|\!\bigr|\!\bigr|_{l^{2}(L^{2})}+ \bigl|\!\bigl|\!\bigl|p_{h}(u)-p \bigr|\!\bigr|\!\bigr|_{l^{2}(L^{2})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(29)

Proof

Setting \(v=u\) in (23) and using Eq. (8) together with the definition (27) of the elliptic projection operator \(R_{h}\), we derive, for \(n=1,2,\ldots ,N\) and all \(w_{h}\in W_{h}\),

$$\begin{aligned} & \bigl(d_{t} y_{h}^{n}(u)-d_{t} R_{h} y^{n},w_{h} \bigr)+a \bigl(y _{h}^{n}(u)-R_{h} y^{n},w_{h} \bigr)+ \bigl(u^{n} \bigl(y_{h}^{n}(u)-R _{h} y^{n}\bigr),w_{h} \bigr) \\ &\quad = - \bigl(d_{t} R_{h} y^{n},w_{h} \bigr)-a \bigl(y^{n},w_{h} \bigr)- \bigl(u^{n}R_{h}y^{n},w_{h} \bigr)+ \bigl(f^{n},w_{h} \bigr) \\ &\quad = - \bigl(d_{t} R_{h} y^{n}-d_{t}y^{n},w_{h} \bigr)- \bigl(d_{t}y ^{n}-y_{t}^{n},w_{h} \bigr)- \bigl(u^{n} \bigl(R_{h}y^{n}-y^{n} \bigr),w _{h} \bigr). \end{aligned}$$
(30)

We note that

$$\begin{aligned} \begin{aligned}[b] & \bigl(d_{t} y_{h}^{n}(u)-d_{t}R_{h} y^{n},y_{h}^{n}(u)-R_{h}y^{n} \bigr) \\ &\quad \geq \frac{1}{k_{n}} \bigl( \bigl\Vert y_{h}^{n}(u)-R_{h}y^{n} \bigr\Vert ^{2}- \bigl\Vert y _{h}^{n}(u)-R_{h}y^{n} \bigr\Vert \bigl\Vert y_{h}^{n-1}(u)-R_{h}y^{n-1} \bigr\Vert \bigr) \end{aligned} \end{aligned}$$
(31)

and

$$\begin{aligned} a \bigl(y_{h}^{n}(u)-R_{h}y^{n},y_{h}^{n}(u)-R_{h}y^{n} \bigr)\geq \bigl(u^{n} \bigl(y_{h}^{n}(u)-R_{h}y^{n} \bigr),R_{h}y^{n}-y_{h} ^{n}(u) \bigr). \end{aligned}$$
(32)

Choosing \(w_{h}=y_{h}^{n}(u)-R_{h}y^{n}\) in (30), using (31)–(32) and Hölder’s inequality, multiplying both sides of (30) by \(k_{n}\), and summing over n from 1 to \(N^{*}\) (\(1\leq N^{*}\leq N\)), we get

$$\begin{aligned} & \bigl\Vert y_{h}^{N^{*}}(u)-R_{h}y^{N^{*}} \bigr\Vert \\ &\quad \leq \sum_{n=1}^{N^{*}} \bigl\Vert (R_{h}-I) \bigl(y^{n}-y^{n-1} \bigr) \bigr\Vert +\sum_{n=1}^{N^{*}} \bigl\Vert y ^{n}-y^{n-1}-k_{n}y_{t}^{n} \bigr\Vert \\ &\qquad {}+\sum_{n=1}^{N^{*}}k_{n} \bigl\Vert u ^{n} \bigl(R_{h}y^{n}-y^{n} \bigr) \bigr\Vert \\ &\quad \leq \sum_{n=1}^{N ^{*}}Ch^{2} \bigl\Vert y^{n}-y^{n-1} \bigr\Vert _{2}+ \sum_{n=1}^{N^{*}} \int _{t_{n-1}} ^{t_{n}} \bigl\Vert (t_{n-1}-t )y_{tt} \bigr\Vert \,dt \\ &\qquad {}+C\sum_{n=1} ^{N^{*}}k_{n} \bigl\Vert R_{h}y^{n}-y^{n} \bigr\Vert \\ &\quad \leq Ch^{2}\sum_{n=1}^{N ^{*}} \int _{t_{n-1}}^{t_{n}} \Vert y_{t} \Vert _{2}\,dt+k\sum_{n=1}^{N^{*}} \int _{t_{n-1}}^{t_{n}} \Vert y_{tt} \Vert \,dt+Ch^{2}\sum_{n=1}^{N^{*}}k _{n} \bigl\Vert y^{n} \bigr\Vert _{2} \\ &\quad \leq Ch^{2} \int _{0}^{t_{N^{*}}} \Vert y_{t} \Vert _{2}\,dt+k \int _{0}^{t_{N^{*}}} \Vert y_{tt} \Vert \,dt+Ch^{2} |\!|\!|y |\!|\!|_{l^{2}(H^{2})} \\ &\quad \leq C \bigl(h^{2} \Vert y_{t} \Vert _{L^{2}(H^{2})}+k \Vert y_{tt} \Vert _{L^{2}(L^{2})}+h ^{2} |\!|\!|y |\!|\!|_{l^{2}(H^{2})} \bigr). \end{aligned}$$
(33)

Hence

$$\begin{aligned} \bigl|\!\bigl|\!\bigl|y_{h}(u)-R_{h} y \bigr|\!\bigr|\!\bigr|_{l^{\infty }(L^{2})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(34)

It follows from (28) that

$$\begin{aligned} |\!|\!|R_{h} y-y |\!|\!|_{l^{2}(L^{2})}^{2}=\sum _{n=1}^{N}k_{n} \bigl\Vert R_{h} y ^{n}-y^{n} \bigr\Vert ^{2}\leq Ch^{4} |\!|\!|y |\!|\!|_{l^{2}(H^{2})}^{2}. \end{aligned}$$
(35)

According to (34)–(35) and the embedding theorem, we obtain

$$\begin{aligned} \bigl|\!\bigl|\!\bigl|y_{h}(u)-y \bigr|\!\bigr|\!\bigr|_{l^{2}(L^{2})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(36)

Similarly, we can derive

$$\begin{aligned} \bigl|\!\bigl|\!\bigl|p_{h}(u)-p \bigr|\!\bigr|\!\bigr|_{l^{2}(L^{2})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(37)

Therefore, (29) follows from (36) and (37). □
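The bound on the time-truncation term in (33) rests on the integral form of the Taylor remainder. As a short verification, assuming \(y_{tt}\) is integrable in time, integration by parts gives

$$\begin{aligned} \int _{t_{n-1}}^{t_{n}}(t_{n-1}-t)y_{tt}(t)\,dt &= \bigl[(t_{n-1}-t)y_{t}(t) \bigr]_{t_{n-1}}^{t_{n}}+ \int _{t_{n-1}}^{t_{n}}y_{t}(t)\,dt \\ &= -k_{n}y_{t}^{n}+y^{n}-y^{n-1}. \end{aligned}$$

Since \(|t_{n-1}-t|\leq k_{n}\leq k\) on \([t_{n-1},t_{n}]\), it follows that \(\Vert y^{n}-y^{n-1}-k_{n}y_{t}^{n} \Vert \leq k\int _{t_{n-1}}^{t_{n}} \Vert y_{tt} \Vert \,dt\), which is the form used in the third inequality of (33).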

4 Convergence analysis

In this section, we carry out the convergence analysis for the control variable. For ease of exposition, we set

$$\begin{aligned} \begin{aligned}[b] &J(u)=\frac{1}{2} \int _{0}^{T} \bigl( \Vert y-y_{d} \Vert ^{2}+\alpha \Vert u \Vert ^{2} \bigr)\,dt, \\ &J_{h} (u_{h} )=\frac{1}{2} \int _{0}^{T} \bigl( \Vert y_{h}-y _{d} \Vert ^{2}+\alpha \Vert u_{h} \Vert ^{2} \bigr)\,dt. \end{aligned} \end{aligned}$$

It can be shown that

$$\begin{aligned} \begin{aligned}[b] & \bigl(J^{\prime }(u),v \bigr)= \int _{0}^{T} (\alpha u-y p,v )\,dt, \\ & \bigl(J_{hk}^{\prime } (u_{h} ),v \bigr)=\sum _{n=1}^{N}k_{n} \bigl( \alpha u_{h}^{n}-y_{h}^{n}(u_{h})p_{h} ^{n-1}(u_{h}),v \bigr). \end{aligned} \end{aligned}$$

In many applications, the objective functional \(J(\cdot )\) is uniformly convex near the solution u (see, e.g., [23]), a property closely related to the second-order sufficient conditions of the control problem. This is assumed in many studies on numerical methods for the problem (see, e.g., [2]). Hence, if h and k are small enough, we can assume that \(J_{hk}(\cdot )\) is uniformly convex, namely, there is a positive constant c such that

$$\begin{aligned} c |\!|\!|u-v |\!|\!|_{l^{2}(L^{2})}^{2}\leq \bigl(J_{hk}^{\prime }(u)-J_{hk} ^{\prime } (v ),u-v \bigr), \quad \forall u,v\in K. \end{aligned}$$
(38)

Theorem 4.1

Let \((y,p,u)\) and \((y_{h},p_{h},u_{h} )\) be the solutions of (8)–(12) and (17)–(21), respectively. Assume that \(y_{h}(u)\), \(p\in l^{\infty }(L^{\infty })\) and all the conditions in Lemma 3.1 are valid. Then we have

$$\begin{aligned} |\!|\!|u-u_{h} |\!|\!|_{l^{2}(L^{2})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(39)

Proof

Setting \(v=u_{h}\) in (12) and \(v=u\) in (21), we obtain

$$\begin{aligned} (\alpha u,u-u_{h})\leq (y p,u-u_{h}), \quad \forall t \in J, \end{aligned}$$
(40)

and

$$\begin{aligned} \bigl(\alpha u_{h}^{n}-y_{h}^{n}p_{h}^{n-1},u^{n}-u_{h}^{n} \bigr) \geq 0, \quad n=1,2,\ldots ,N. \end{aligned}$$
(41)

From (38) and (40)–(41), we have

$$\begin{aligned} \begin{aligned}[b] c |\!|\!|u-u_{h} |\!|\!|^{2}_{l^{2}(L^{2})} &\leq \bigl(J^{\prime }_{hk}(u)-J ^{\prime }_{hk}(u_{h}),u-u_{h} \bigr) \\ & = \sum_{n=1}^{N}k_{n} \bigl(\alpha u^{n}-y_{h}^{n}(u)p_{h}^{n-1}(u),u^{n}-u_{h}^{n} \bigr) \\ &\quad {}-\sum_{n=1}^{N}k_{n} \bigl(\alpha u_{h}^{n}-y_{h}^{n}(u _{h})p_{h}^{n-1}(u_{h}),u^{n}-u_{h}^{n} \bigr) \\ & \leq \sum_{n=1}^{N}k_{n} \bigl(y^{n}p^{n}-y_{h}^{n}(u)p^{n},u^{n}-u_{h}^{n} \bigr) \\ &\quad {}+\sum_{n=1}^{N}k_{n} \bigl(y_{h}^{n}(u)p^{n}-y_{h}^{n}(u)p ^{n-1},u^{n}-u_{h}^{n} \bigr) \\ &\quad {}+\sum_{n=1}^{N}k_{n} \bigl(y _{h}^{n}(u)p^{n-1}-y_{h}^{n}(u)p_{h}^{n-1}(u),u^{n}-u_{h}^{n} \bigr) \\ &: = I_{1}+I_{2}+I_{3}. \end{aligned} \end{aligned}$$
(42)

According to Young’s inequality with ϵ and Lemma 3.1, \(I_{1}\) can be estimated as follows:

$$\begin{aligned} \begin{aligned}[b] I_{1}& = \sum _{n=1}^{N}k_{n} \bigl(p^{n} \bigl(y^{n}-y_{h}^{n}(u) \bigr),u ^{n}-u_{h}^{n} \bigr) \\ & \leq C(\epsilon ) \bigl|\!\bigl|\!\bigl|y-y_{h}(u) \bigr|\!\bigr|\!\bigr|_{l^{2}(L ^{2})}^{2}+\epsilon |\!|\!|u-u_{h} |\!|\!|_{l^{2}(L^{2})}^{2} \\ & \leq C(\epsilon ) \bigl(h^{2}+k \bigr)^{2}+\epsilon |\!|\!|u-u_{h} |\!|\!|_{l^{2}(L^{2})}^{2}. \end{aligned} \end{aligned}$$
(43)

For the second term \(I_{2}\), by using Young’s inequality with ϵ, we have

$$\begin{aligned} \begin{aligned}[b] I_{2}& = \sum _{n=1}^{N}k_{n} \bigl(y_{h}^{n}(u) \bigl(p^{n}-p ^{n-1} \bigr),u^{n}-u_{h}^{n} \bigr) \\ & \leq C(\epsilon )k^{2} |\!|\!|p_{t} |\!|\!|_{l ^{2}(L^{2})}^{2}+\epsilon |\!|\!|u-u_{h} |\!|\!|_{l^{2}(L^{2})}^{2}. \end{aligned} \end{aligned}$$
(44)

From Young’s inequality with ϵ and Lemma 3.1, we get

$$\begin{aligned} \begin{aligned}[b] I_{3}& = \sum _{n=1}^{N}k_{n} \bigl(y_{h}^{n}(u) \bigl(p^{n-1}-p _{h}^{n-1}(u) \bigr),u^{n}-u_{h}^{n} \bigr) \\ & \leq C(\epsilon ) \bigl|\!\bigl|\!\bigl|p-p _{h}(u) \bigr|\!\bigr|\!\bigr|_{l^{2}(L^{2})}^{2}+\epsilon |\!|\!|u-u_{h} |\!|\!|_{l^{2}(L^{2})} ^{2} \\ & \leq C(\epsilon ) \bigl(h^{2}+k \bigr)^{2}+\epsilon |\!|\!|u-u _{h} |\!|\!|_{l^{2}(L^{2})}^{2}. \end{aligned} \end{aligned}$$
(45)

Let ϵ be small enough, then (39) follows from (42)–(45). □
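The use of Young’s inequality with ϵ in (43) can be written out explicitly. The following is a sketch for \(I_{1}\), using the assumption \(p\in l^{\infty }(L^{\infty })\) of Theorem 4.1 and the elementary bound \(\beta \gamma \leq \frac{1}{4\epsilon }\beta ^{2}+\epsilon \gamma ^{2}\) for real numbers β and γ; the explicit constant is one admissible choice of \(C(\epsilon )\) and is not stated in the text:

$$\begin{aligned} I_{1} &\leq \sum_{n=1}^{N}k_{n} \bigl\Vert p^{n} \bigr\Vert _{L^{\infty }(\varOmega )} \bigl\Vert y^{n}-y_{h}^{n}(u) \bigr\Vert \bigl\Vert u^{n}-u_{h}^{n} \bigr\Vert \\ &\leq \frac{1}{4\epsilon }\max_{1\leq n\leq N} \bigl\Vert p^{n} \bigr\Vert _{L^{\infty }(\varOmega )}^{2} \bigl|\!\bigl|\!\bigl|y-y_{h}(u) \bigr|\!\bigr|\!\bigr|_{l^{2}(L^{2})}^{2}+\epsilon |\!|\!|u-u_{h} |\!|\!|_{l^{2}(L^{2})}^{2}. \end{aligned}$$

The terms \(I_{2}\) and \(I_{3}\) are treated in the same way, with \(y_{h}(u)\in l^{\infty }(L^{\infty })\) in place of p.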

5 Superconvergence analysis

In this section, we will derive superconvergence of the state and co-state variables.

Theorem 5.1

Let \((y,p,u)\) and \((y_{h},p_{h},u_{h})\) be the solutions of (8)–(12) and (17)–(21), respectively. Assume that \(y_{h}\in l^{\infty }(L^{\infty })\) and that all the conditions in Theorem 4.1 hold. Then we have

$$\begin{aligned} |\!|\!|R_{h} y-y_{h} |\!|\!|_{l^{2}(H^{1})}+ |\!|\!|R_{h} p-p_{h} |\!|\!|_{l^{2}(H^{1})} \leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(46)

Proof

From (8) and (17), for any \(w_{h}\in W_{h}\) and \(n=1,2,\ldots ,N\), we have

$$\begin{aligned} \begin{aligned}[b] & \bigl(y_{t}^{n}-d_{t} y_{h}^{n},w_{h} \bigr)+a \bigl(y^{n}-y_{h} ^{n},w_{h} \bigr)+ \bigl(u^{n}\bigl(y^{n}-y_{h}^{n} \bigr),w_{h} \bigr) \\ &\quad = \bigl(y_{h}^{n}\bigl(u_{h}^{n}-u^{n} \bigr),w_{h} \bigr). \end{aligned} \end{aligned}$$
(47)

According to the definition of \(R_{h}\), we get

$$\begin{aligned} \begin{aligned}[b] & \bigl(d_{t}R_{h}y^{n}-d_{t}y_{h}^{n},w_{h} \bigr)+a \bigl(R_{h}y ^{n}-y_{h}^{n},w_{h} \bigr)+ \bigl(u^{n}\bigl(R_{h}y^{n}-y_{h}^{n} \bigr),w_{h} \bigr) \\ &\quad = \bigl(d_{t}R_{h}y^{n}-d_{t}y^{n}+d_{t}y^{n}-y_{t}^{n}+u^{n} \bigl(R _{h}y^{n}-y^{n} \bigr)+y_{h}^{n}\bigl(u_{h}^{n}-u^{n} \bigr),w_{h} \bigr). \end{aligned} \end{aligned}$$
(48)

Note that

$$\begin{aligned} \begin{aligned}[b] & \bigl(d_{t} R_{h}y^{n}-d_{t} y_{h}^{n},R_{h}y^{n}-y^{n}_{h} \bigr) \\ &\quad \geq \frac{1}{2k_{n}} \bigl( \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert ^{2}- \bigl\Vert R_{h}y ^{n-1}-y^{n-1}_{h} \bigr\Vert ^{2} \bigr) \end{aligned} \end{aligned}$$
(49)

and

$$\begin{aligned} \bigl(d_{t}R_{h}y^{n}-d_{t}y^{n},R_{h}y^{n}-y^{n}_{h} \bigr) &\leq \bigl\Vert d_{t}R_{h}y^{n}-d_{t}y^{n} \bigr\Vert \bigl\Vert R_{h}y^{n}-y^{n} _{h} \bigr\Vert \\ &\leq Ch^{2} \bigl\Vert d_{t}y^{n} \bigr\Vert _{2} \bigl\Vert R_{h}y^{n}-y^{n} _{h} \bigr\Vert \\ &\leq Ch^{2}k_{n}^{-1} \int _{t_{n-1}}^{t_{n}} \Vert y_{t} \Vert _{2}\,dt \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert \\ &\leq Ch^{2}k_{n}^{-\frac{1}{2}} \Vert y_{t} \Vert _{L^{2}(t_{n-1},t_{n};H^{2}( \varOmega ))} \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert . \end{aligned}$$
(50)

In addition

$$\begin{aligned} \begin{aligned}[b] \bigl(d_{t}y^{n}-y_{t}^{n},R_{h}y^{n}-y^{n}_{h} \bigr) & = k_{n}^{-1} \bigl(y^{n}-y^{n-1}-k_{n} y_{t}^{n},R_{h}y^{n}-y^{n}_{h} \bigr) \\ & \leq k_{n}^{-1} \bigl\Vert y^{n}-y^{n-1}-k_{n}y_{t}^{n} \bigr\Vert \bigl\Vert R _{h}y^{n}-y^{n}_{h} \bigr\Vert \\ & = k_{n}^{-1} \biggl\Vert \int _{t_{n-1}}^{t _{n}}(t_{n-1}-s)y_{tt}(s)\,ds \biggr\Vert \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert \\ & \leq Ck_{n}^{\frac{1}{2}} \Vert y_{tt} \Vert _{L^{2}(t_{n-1},t_{n};L^{2}( \varOmega ))} \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert . \end{aligned} \end{aligned}$$
(51)

Choosing \(w_{h}=R_{h}y^{n}-y^{n}_{h}\) in (48), using (49)–(51) and Young’s inequality with ϵ, multiplying both sides of (48) by \(2k_{n}\), and summing over n from 1 to N, we get

$$\begin{aligned} & \bigl\Vert R_{h}y^{N}-y^{N}_{h} \bigr\Vert ^{2}+c\sum_{n=1}^{N}k _{n} \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert ^{2}_{1} \\ &\quad \leq C(\epsilon ) \bigl(h^{4} \Vert y_{t} \Vert _{L^{2}(H^{2})}^{2}+k^{2} \Vert y _{tt} \Vert _{L^{2}(L^{2})}^{2}+h^{4} |\!|\!|y |\!|\!|_{l^{2}(H^{2})}^{2}+ |\!|\!|u_{h}-u |\!|\!|_{l ^{2}(L^{2})}^{2} \bigr) \\ &\qquad {}+\epsilon \sum_{n=1}^{N}k_{n} \bigl\Vert R_{h}y^{n}-y^{n}_{h} \bigr\Vert ^{2}. \end{aligned}$$
(52)

From (39) and (52), we obtain

$$\begin{aligned} |\!|\!|R_{h}y-y_{h} |\!|\!|_{l^{2}(H^{1})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(53)

Similarly, we can prove

$$\begin{aligned} |\!|\!|R_{h}p-p_{h} |\!|\!|_{l^{2}(H^{1})}\leq C \bigl(h^{2}+k \bigr). \end{aligned}$$
(54)

Hence, (46) follows from (53)–(54). □

6 Numerical experiments

For an acceptable error Tol, we present the following VD approximation algorithm in which we have omitted the subscript h just for ease of exposition.

Algorithm 6.1

(VD approximation algorithm)

Step 1. Initialize \(u_{0}\).

Step 2. Solve the following equations:

$$\begin{aligned} \textstyle\begin{cases} (\frac{y^{i}_{n}-y^{i-1}_{n}}{k},w )+a (y^{i}_{n},w )+ (u^{i}_{n}y^{i}_{n},w )= (f^{i},w ), \\ y^{i}_{n},y^{i-1}_{n}\in W_{h},\quad \forall w\in W_{h}, \\ (\frac{p^{i-1}_{n}-p^{i}_{n}}{k},q )+a (q,p^{i-1} _{n} )+ (u^{i}_{n}p^{i-1}_{n},q )= (y^{i}_{n}-y ^{i}_{d},q ), \\ p^{i}_{n},p^{i-1}_{n}\in W_{h},\quad \forall q\in W_{h}, \\ u_{n+1}=\min (\max (a,\frac{y_{n} p_{n}}{\alpha }),b). \end{cases}\displaystyle \end{aligned}$$
(55)

Step 3. Calculate the iterative error: \(E_{n+1}=|\!|\!|u_{n+1}-u_{n}|\!|\!|_{l ^{2}(L^{2})}\).

Step 4. If \(E_{n+1}\leq \mathit{Tol}\), stop; else go to Step 2.
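The structure of Algorithm 6.1 can be sketched in a few lines of code. The following is not the AFEPack implementation used for the experiments: it replaces the 2D finite element discretization by a 1D finite-difference analogue on \((0,1)\) with \(A=1\), uses manufactured data in the style of Example 2, and keeps only the loop of Steps 1 to 4 with the pairing \(y^{i}p^{i-1}\) from (22). The function name `vd_fixed_point_1d`, the grid sizes and the tolerance are all assumptions made for illustration.

```python
import numpy as np

def vd_fixed_point_1d(m=31, N=32, alpha=1.0, a=0.0, b=1.0, tol=1e-10, max_iter=50):
    """Fixed-point loop of Algorithm 6.1 on a 1D finite-difference analogue."""
    h, k = 1.0 / (m + 1), 1.0 / N
    x = np.linspace(h, 1.0 - h, m)      # interior grid points
    t = np.linspace(0.0, 1.0, N + 1)    # time levels t_0, ..., t_N

    # Manufactured data: y = t*x(1-x), p = (1-t)*x(1-x) (1D analogue of Example 2).
    w = x * (1.0 - x)
    y_ex, p_ex = np.outer(t, w), np.outer(1.0 - t, w)
    u_ex = np.clip(y_ex * p_ex / alpha, a, b)
    f = w[None, :] + 2.0 * t[:, None] + u_ex * y_ex                    # y_t - y_xx + u*y
    y_d = y_ex - w[None, :] - 2.0 * (1.0 - t)[:, None] - u_ex * p_ex   # y + p_t + p_xx - u*p

    # Matrix of -d^2/dx^2 with homogeneous Dirichlet boundary conditions.
    L = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    base = np.eye(m) / k + L

    u = np.zeros((N + 1, m))            # Step 1: initial guess (row 0 is unused)
    for iteration in range(1, max_iter + 1):
        y = np.zeros((N + 1, m))        # y[0] = y_0 = 0 for this data
        p = np.zeros((N + 1, m))        # p[N] = 0 (terminal condition)
        for i in range(1, N + 1):       # Step 2: state equation, forward in time
            y[i] = np.linalg.solve(base + np.diag(u[i]), y[i - 1] / k + f[i])
        for i in range(N, 0, -1):       # Step 2: co-state equation, backward in time
            p[i - 1] = np.linalg.solve(base + np.diag(u[i]), p[i] / k + y[i] - y_d[i])
        u_new = np.zeros_like(u)
        u_new[1:] = np.clip(y[1:] * p[:-1] / alpha, a, b)   # Step 2: projection, cf. (22)
        err = np.sqrt(k * h * np.sum((u_new - u) ** 2))     # Step 3: discrete l^2(L^2) change
        u = u_new
        if err <= tol:                  # Step 4: stopping test
            break
    return u, y, y_ex, err, iteration

u, y, y_ex, err, iterations = vd_fixed_point_1d()
```

No separate discretization of the control appears in the loop: the control is recovered at every time level from the current state and co-state through the projection, which is the point made in Remark 2.1. Dense solves are used only to keep the sketch short; a sparse or tridiagonal solver would be the natural choice in practice.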

Let \(\varOmega =[0,1]\times [0,1]\), \(T=1\), \(\alpha =1\), \(a=0\), \(b=1\), and let \(A(x)\) be the identity matrix. We solve the following two examples with AFEPack; details can be found in [20]. We denote \(|\!|\!|\cdot |\!|\!|_{l^{2}(H^{1})}\) and \(|\!|\!|\cdot |\!|\!|_{l^{2}(L^{2})}\) by \(|\!|\!|\cdot |\!|\!|_{1}\) and \(|\!|\!|\cdot |\!|\!|\), respectively. The convergence rate is computed as \(\mathit{Rate}=\frac{\log (e_{i+1})-\log (e_{i})}{\log (h _{i+1})-\log (h_{i})}\), where \(e_{i}\) and \(e_{i+1}\) denote the errors for mesh sizes \(h=h_{i}\) and \(h=h_{i+1}\), respectively.
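The rate formula above translates directly into code. In the minimal sketch below, the function name `convergence_rates` is hypothetical and the error values are synthetic numbers chosen to decay exactly like \(h^{2}\); they are not taken from Table 1 or Table 2.

```python
import math

def convergence_rates(errors, mesh_sizes):
    """Observed rates between consecutive (mesh size, error) pairs."""
    return [
        (math.log(errors[i + 1]) - math.log(errors[i]))
        / (math.log(mesh_sizes[i + 1]) - math.log(mesh_sizes[i]))
        for i in range(len(errors) - 1)
    ]

# Synthetic errors behaving like 4*h^2 (illustration only, not measured data).
rates = convergence_rates([4.0e-2, 1.0e-2, 2.5e-3], [0.1, 0.05, 0.025])
```

For these synthetic inputs both computed rates equal 2 up to rounding, which is the value expected for an error of order \(h^{2}\).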

Example 1

The data are as follows:

$$\begin{aligned}& p(t,x)=(1-t)\sin (2\pi x_{1})\sin (2\pi x_{2}), \\& y(t,x)=t\sin (2\pi x_{1})\sin (2\pi x_{2}), \\& u(t,x)=\min\bigl(\max\bigl(0,y(t,x)p(t,x)\bigr),1\bigr), \\& f(t,x)=y_{t}(t,x)-\operatorname{div}\bigl(A(x) \nabla y(t,x) \bigr)+u(t,x)y(t,x), \\& y_{d}(t,x)=y(t,x)+p_{t}(t,x)+\operatorname{div} \bigl(A^{*}(x) \nabla p(t,x)\bigr)-u(t,x)p(t,x). \end{aligned}$$
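Since \(A(x)\) is the identity matrix in these experiments, the divergence terms in the data reduce to Laplacians, and for the product of sines above \(\Delta y=-8\pi ^{2}y\) and \(\Delta p=-8\pi ^{2}p\). The following sketch evaluates the data of Example 1 in closed form under that assumption; the function name `example1_data` is hypothetical.

```python
import numpy as np

def example1_data(t, x1, x2, alpha=1.0, a=0.0, b=1.0):
    """Closed-form data of Example 1, assuming A is the identity matrix."""
    s = np.sin(2.0 * np.pi * x1) * np.sin(2.0 * np.pi * x2)
    y = t * s
    p = (1.0 - t) * s
    u = np.clip(y * p / alpha, a, b)
    lap_y = -8.0 * np.pi**2 * y      # Laplacian of y
    lap_p = -8.0 * np.pi**2 * p      # Laplacian of p
    f = s - lap_y + u * y            # y_t - div(grad y) + u*y, with y_t = s
    y_d = y - s + lap_p - u * p      # y + p_t + div(grad p) - u*p, with p_t = -s
    return y, p, u, f, y_d

y, p, u, f, y_d = example1_data(0.5, 0.125, 0.125)
```

At \(t=0.5\) and \(x_{1}=x_{2}=0.125\) the sine product equals 0.5, so \(y=p=0.25\) and \(u=0.0625\), which lies strictly inside \([0,1]\).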

The errors on a sequence of uniform meshes are shown in Table 1, where we observe \(|\!|\!|u-u_{h}|\!|\!|=\mathcal{O} (h^{2}+k )\), \(|\!|\!|R_{h}y-y_{h}|\!|\!|_{1}=\mathcal{O} (h^{2}+k )\) and \(|\!|\!|R_{h}p-p_{h}|\!|\!|_{1}=\mathcal{O} (h^{2}+k )\). For \(h=\frac{1}{80}\), \(k=\frac{1}{640}\) and \(t=0.5\), the numerical solution \(u_{h}\) is shown in Fig. 1.

Figure 1. The numerical solution \(u_{h}\) at \(t=0.5\), Example 1

Table 1 Numerical results of Example 1

Example 2

The data are as follows:

$$\begin{aligned}& p(t,x)=(1-t)x_{1}(1-x_{1})x_{2}(1-x_{2}), \\& y(t,x)=tx_{1}(1-x_{1})x_{2}(1-x_{2}), \\& u(t,x)=\min\bigl(\max\bigl(0,y(t,x)p(t,x)\bigr),1\bigr), \\& f(t,x)=y_{t}(t,x)-\operatorname{div}\bigl(A(x) \nabla y(t,x) \bigr)+u(t,x)y(t,x), \\& y_{d}(t,x)=y(t,x)+p_{t}(t,x)+\operatorname{div} \bigl(A^{*}(x) \nabla p(t,x)\bigr)-u(t,x)p(t,x). \end{aligned}$$

The errors \(|\!|\!|u-u_{h}|\!|\!|\), \(|\!|\!|R_{h}y-y_{h}|\!|\!|_{1}\) and \(|\!|\!|R_{h}p-p_{h}|\!|\!|_{1}\) on a sequence of uniform meshes are shown in Table 2. For \(h=\frac{1}{80}\), \(k=\frac{1}{640}\) and \(t=0.5\), we plot the profile of \(u_{h}\) in Fig. 2.

Figure 2. The numerical solution \(u_{h}\) at \(t=0.5\), Example 2

Table 2 Numerical results of Example 2

From the numerical results in Example 1 and Example 2, we see that \(|\!|\!|u-u_{h}|\!|\!|\), \(|\!|\!|R_{h}y-y_{h}|\!|\!|_{1}\) and \(|\!|\!|R_{h}p-p_{h}|\!|\!|_{1}\) converge with second order. The numerical results are consistent with the theoretical results.

7 Conclusions

Although there has been extensive research on convergence and superconvergence of FEMs for various parabolic OCPs, it has mostly focused on linear or semilinear parabolic cases (see, e.g., [6, 10, 16, 26, 30]), where the convergence and superconvergence orders are \(\mathcal{O}(h+k)\) and \(\mathcal{O}(h^{\frac{3}{2}}+k)\), respectively. In recent years, VD has been used to deal with different OCPs in [7, 13, 14], but there is little work on bilinear OCPs. Hence, our results on convergence and superconvergence of VD for bilinear parabolic OCPs are new.