1 Introduction

In this paper, we propose norm error estimates that provide explicit values of error constants for the semi-discrete Galerkin approximation of the linear heat equation.

Let \(\Omega \subset {\mathbb {R}}^N(N\in {\mathbb {N}})\) be a bounded Lipschitz domain. \(L^2(\Omega )\) denotes the real Hilbert space endowed with inner product \((u,v)_{L^2(\Omega )}:=\int _{\Omega }u(x)v(x)dx\) and norm \(\Vert u\Vert _{L^2(\Omega )}:=\sqrt{(u,u)_{L^2(\Omega )}}\) for \(u,v\in L^2(\Omega )\). The real Hilbert space \(H^1_0(\Omega )\) is endowed with inner product \(a(u,v):=(\nabla u,\nabla v)_{L^2(\Omega )}\) and norm \(\Vert u\Vert _{H^{1}_{0}(\Omega )}:=\sqrt{a(u,u)}\) for \(u, v\in H^1_0(\Omega )\), where any function u in \(H^1_0(\Omega )\) vanishes on the boundary of \(\Omega \). Let \(H^{-1}(\Omega )\) be the dual space of \(H^1_0(\Omega )\) and \(\langle \cdot ,\cdot \rangle \) be the real dual product between \(H^{-1}(\Omega )\) and \(H^1_0(\Omega )\). We identify \(u\in H^1_0(\Omega )\) with \(u\in L^2(\Omega )\) and with \(u\in H^{-1}(\Omega )\) based on the Gelfand triple \(H^1_0(\Omega )\subset L^2(\Omega )=L^2(\Omega )^*\subset H^{-1}(\Omega )\) (all inclusions are dense with continuous injections), where \(L^2(\Omega )^*\) denotes the dual space of \(L^2(\Omega )\). Let \({\mathcal {A}}:H^1_0(\Omega )\rightarrow H^{-1}(\Omega )\) be defined by

$$\begin{aligned} \langle {\mathcal {A}}u,v\rangle =a(u,v)~\forall v\in H^1_0(\Omega ). \end{aligned}$$

We also define \({\mathcal {W}}=\{u\in H^1_0(\Omega )\mid {\mathcal {A}}u\in L^2(\Omega )\}\). The regularity of the functions in \({\mathcal {W}}\) depends on the shape of the domain \(\Omega \) (see, e.g., [4]).

For a parameter \(h>0\), let \(V_h\) denote a finite-dimensional subspace of \(H^1_0(\Omega )\). For \(u\in H^1_0(\Omega )\), we define the Ritz projection \(R_h:H^1_0(\Omega )\rightarrow V_h\) by

$$\begin{aligned} a(u-R_h u,v_h)=0~~\forall v_h\in V_h. \end{aligned}$$
(1)

Assume that the constant \(C_h\) satisfies

$$\begin{aligned} \Vert u-R_h u\Vert _{H^1_0(\Omega )}\le C_h \Vert {\mathcal {A}}u\Vert _{L^2(\Omega )}~~\forall u\in {\mathcal {W}}, \end{aligned}$$
(2)

where \(C_h\rightarrow 0\) as \(h\rightarrow 0\). Then, Aubin-Nitsche’s trick implies

$$\begin{aligned} \Vert u-R_h u\Vert _{L^2(\Omega )}\le C_h \Vert u-R_h u\Vert _{H^1_0(\Omega )}~~\forall u\in H^1_0(\Omega ). \end{aligned}$$
(3)
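
The duality argument behind (3) is standard; a brief sketch, with the auxiliary symbols \(e\) and \(\psi \) introduced here only for illustration, runs as follows. Let \(e:=u-R_h u\) and \(\psi :={\mathcal {A}}^{-1}e\in {\mathcal {W}}\), so that \({\mathcal {A}}\psi =e\in L^2(\Omega )\). Since \(a(R_h\psi ,e)=0\) by (1) and the symmetry of a, the estimate (2) applied to \(\psi \) gives

$$\begin{aligned} \Vert e\Vert _{L^2(\Omega )}^2=\langle {\mathcal {A}}\psi ,e\rangle =a(\psi -R_h\psi ,e)\le \Vert \psi -R_h\psi \Vert _{H^1_0(\Omega )}\Vert e\Vert _{H^1_0(\Omega )}\le C_h\Vert e\Vert _{L^2(\Omega )}\Vert e\Vert _{H^1_0(\Omega )}, \end{aligned}$$

and dividing by \(\Vert e\Vert _{L^2(\Omega )}\) (when it is nonzero) yields (3).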

The estimates (2) and (3) lead to useful inequalities for the numerical analysis of elliptic partial differential equations (PDEs) (see, e.g., [1]). In particular, explicit values of \(C_h\) play an important role in computer-assisted existence proofs for solutions to elliptic PDEs (see, e.g., [12]). Therefore, many estimates for obtaining such values have been proposed and applied to computer-assisted existence proofs for solutions to semi-linear elliptic PDEs (see, e.g., [6, 7, 8, 10, 11, 15, 17] and references therein).
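
As a concrete illustration of (2) and (3), the following Python sketch treats a one-dimensional model case that is an assumption of this example and is not taken from the text above: \(\Omega =(0,1)\), \(V_h\) is the space of continuous piecewise linear functions on a uniform mesh with n cells, and \(u(x)=\sin (\pi x)\). In this setting the Ritz projection coincides with the nodal interpolant, and \(C_h=h/\pi \) is an admissible constant in (2); both are standard one-dimensional facts used here as assumptions. The function name `ritz_errors_1d` is hypothetical.

```python
import math


def ritz_errors_1d(n, sub=4):
    """Errors of the P1 Ritz projection of u(x) = sin(pi x) on (0, 1).

    Assumes a uniform mesh with n cells; in 1D the Ritz projection R_h u
    is the piecewise linear nodal interpolant of u.
    Returns (H^1_0 error, L^2 error, mesh size h).
    """
    h = 1.0 / n
    # 3-point Gauss-Legendre rule on [-1, 1], applied on `sub` pieces per cell
    gauss = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]
    err_h1_sq = 0.0
    err_l2_sq = 0.0
    for i in range(n):
        left = i * h
        u_left = math.sin(math.pi * left)
        u_right = math.sin(math.pi * (left + h))
        slope = (u_right - u_left) / h  # derivative of R_h u on this cell
        piece = h / sub
        for j in range(sub):
            mid = left + (j + 0.5) * piece
            for xi, weight in gauss:
                x = mid + 0.5 * piece * xi
                w = 0.5 * piece * weight
                u_h = u_left + slope * (x - left)
                err_l2_sq += w * (math.sin(math.pi * x) - u_h) ** 2
                err_h1_sq += w * (math.pi * math.cos(math.pi * x) - slope) ** 2
    return math.sqrt(err_h1_sq), math.sqrt(err_l2_sq), h


err_h1, err_l2, h = ritz_errors_1d(8)
c_h = h / math.pi                        # assumed constant in (2) for this 1D case
norm_Au = math.pi ** 2 / math.sqrt(2.0)  # ||A u||_{L^2} = ||u''||_{L^2} for u = sin(pi x)
print("(2) holds:", err_h1 <= c_h * norm_Au)
print("(3) holds:", err_l2 <= c_h * err_h1)
```

The quadrature is refined inside each cell so that integration error stays well below the gap between the two sides of (3), which is small for this choice of u.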

In this paper, we propose two norm error estimates that provide the best possible error constants using only \(C_h\) in (2) for the semi-discrete Galerkin approximation of the linear heat equation. Let \(J=(t_0,t_1)~(0\le t_0<t_1<\infty )\). For any function \(v:J\times \Omega \rightarrow {\mathbb {R}}\), we use the shorthand \(v(t):=v(t,\cdot )\) and \(\partial _tv(t):=(\partial _tv)(t,\cdot )\), where \(\partial _t\) denotes the weak derivative with respect to \(t\in J\). For any real Hilbert space Y, \(L^2(J;Y)\) denotes the space of square-integrable functions \(J\ni t\mapsto v(t)\in Y\) endowed with the norm \(\Vert v\Vert _{L^2(J;Y)}:=\sqrt{\int _{J}\Vert v(s)\Vert _{Y}^2ds}\) for \(v\in L^2(J;Y)\). Let \(H^1(J;Y)\) denote the set of functions in \(L^2(J;Y)\) that are weakly differentiable on J with derivative in \(L^2(J;Y)\), endowed with the norm \(\Vert v\Vert _{H^1(J;Y)}=\sqrt{\int _{J}\left( \Vert v(s)\Vert _{Y}^2+\Vert \partial _s v(s)\Vert _{Y}^2\right) ds}\) for \(v\in H^1(J;Y)\). The function space \(C^0([t_0,t_1];L^2(\Omega ))\) is the set of continuous functions \([t_0,t_1]\ni t\mapsto v(t)\in L^2(\Omega )\). Let \(Z:=H^1(J;H^{-1}(\Omega ))\cap L^2(J;H^1_0(\Omega ))\) be endowed with the norm \(\Vert v\Vert _{Z}=\sqrt{\Vert v\Vert _{H^1(J;H^{-1}(\Omega ))}^2+\Vert v\Vert _{L^2(J;H^{1}_0(\Omega ))}^2}\). Let \(w_0\in L^2(\Omega )\) and \(f\in L^2(J;H^{-1}(\Omega ))\). We define the weak solution as the function \(w\in Z\) that satisfies the linear heat equation:

$$\begin{aligned} {\left\{ \begin{array}{ll} \langle \partial _t w(t),v\rangle +a(w(t),v)=\langle f(t),v\rangle ~~\forall v\in H^1_0(\Omega ),~\text{ a.e. }~t\in J\\ w(t_0)=w_0. \end{array}\right. } \end{aligned}$$
(4)

Let \(V_{J,h}:=H^1(J;V_h)\). We define the semi-discrete Galerkin approximation of (4) as the function \(w_h\in V_{J,h}\) that satisfies

$$\begin{aligned} {\left\{ \begin{array}{ll} \langle \partial _t w_h(t),v_h\rangle +a(w_h(t),v_h)=\langle f(t),v_h\rangle ~~\forall v_h\in V_h,~\text{ a.e. }~t\in J\\ w_h(t_0)={\hat{w}}_0, \end{array}\right. } \end{aligned}$$
(5)

where \({\hat{w}}_0\in V_h\) is any approximation of \(w_0\) in (4). Error estimates for the semi-discrete Galerkin approximation have been proposed in, for example, the \(L^2(\Omega )\), \(H^1(\Omega )\), \(L^{\infty }(\Omega )\), \(L^2(J;H^1_0(\Omega ))\), and \(L^2(J;L^2(\Omega ))\) norms (see, e.g., [16]). The regularity of \(w_0\) and f required for the convergence of the semi-discrete Galerkin approximation \(w_h\) to the weak solution w has also been studied. For instance, for \(w_0\in L^2(\Omega )\) and \(f\in L^2(J;H^{-1}(\Omega ))\), \(\Vert w-w_h\Vert _{Z}\rightarrow 0\) as \(h\rightarrow 0\) holds under some assumptions [2, Theorems 3.2 and 3.3]. In some of these studies, an \(L^2(J;L^2(\Omega ))\) norm error estimate of the form \(\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\le E_h\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\) is derived. An estimate of this form is called the parabolic Aubin-Nitsche’s trick (see, e.g., [2, Theorem 3.5]).

By contrast, few studies have addressed explicit values of the error constants. Nakao et al. pioneered the study of these constants and showed that, for w in (4) and \(w_h\) in (5),

$$\begin{aligned}&\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\le 2C_h\Vert f\Vert _{L^2(J;L^2(\Omega ))} \end{aligned}$$
(6)
$$\begin{aligned}&\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\le 4C_h\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}, \end{aligned}$$
(7)

where they assume that \(t_0=0\), \(w_0={\hat{w}}_0=0\), \(f\in L^2(J;L^2(\Omega ))\), and \(\Omega \) is a bounded convex polygonal or polyhedral domain [14, Theorems 4 and 5]. Furthermore, the estimates (6) and (7) have been applied to verified numerical computations for semi-linear parabolic PDEs [14]. Building on (6) and (7), further methods for verified numerical computations for semi-linear parabolic PDEs have been proposed (see, e.g., [5, 9, 13] and references therein).

In this paper, we provide sharp \(L^2(J;H^1_0(\Omega ))\) and \(L^2(J;L^2(\Omega ))\) norm error estimates, which contribute to improving methods for computer-assisted proofs for semi-linear parabolic PDEs, assuming \(w_0\in L^2(\Omega )\), \({\hat{w}}_0\in V_h\), and a bounded Lipschitz domain \(\Omega \). First, we derive an \(L^2(J;H^1_0(\Omega ))\) norm error estimate.

Theorem 1

For w and \(w_h\) defined by (4) and (5) with \(f\in L^2(J;L^2(\Omega ))\), we have

$$\begin{aligned} \Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\le \sqrt{\Vert w_0-{\hat{w}}_0\Vert _{L^2(\Omega )}^2+C_h^2\left( \Vert f\Vert _{L^2(J;L^2(\Omega ))}^2+\Vert {\hat{w}}_0\Vert _{H^1_0(\Omega )}^2\right) }. \end{aligned}$$

Corollary 1 follows immediately from Theorem 1 with \(w_0={\hat{w}}_0=0\).

Corollary 1

We use the same notation and assumptions as in Theorem 1 and assume that \(w_0={\hat{w}}_0=0\) in (4) and (5). Then, we obtain

$$\begin{aligned} \Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\le C_h\Vert f\Vert _{L^2(J;L^2(\Omega ))}. \end{aligned}$$

Next, we state the parabolic Aubin-Nitsche’s trick as the following theorem:

Theorem 2

For w and \(w_h\) defined by (4) and (5), we have

$$\begin{aligned} \Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\le \sqrt{\Vert R_h{\mathcal {A}}^{-1}(w_0-{\hat{w}}_0)\Vert _{H^1_0(\Omega )}^2+C_h^2\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2}. \end{aligned}$$

We define \(P_h :L^2(\Omega )\rightarrow V_h\) as

$$\begin{aligned} (u-P_h u,v_h)_{L^2(\Omega )}=0~~\forall v_h\in V_h. \end{aligned}$$
(8)

Because \(R_h{\mathcal {A}}^{-1}(w_0-{\hat{w}}_0)=0\) when \({\hat{w}}_0=P_h w_0\), Theorem 2 with \({\hat{w}}_0=P_h w_0\) leads to Corollary 2.
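
To see why this term vanishes, set \(g:=w_0-P_h w_0\) (an auxiliary symbol used only in this remark). For every \(v_h\in V_h\), the definitions (1) and (8), together with the identification of \(L^2(\Omega )\) with a subspace of \(H^{-1}(\Omega )\), give

$$\begin{aligned} a(R_h{\mathcal {A}}^{-1}g,v_h)=a({\mathcal {A}}^{-1}g,v_h)=\langle g,v_h\rangle =(w_0-P_h w_0,v_h)_{L^2(\Omega )}=0, \end{aligned}$$

and choosing \(v_h=R_h{\mathcal {A}}^{-1}g\) shows that \(\Vert R_h{\mathcal {A}}^{-1}g\Vert _{H^1_0(\Omega )}=0\).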

Corollary 2

We use the same notation and assumptions as in Theorem 2 and assume that \({\hat{w}}_0=P_h w_0\) in (5). Then, we obtain

$$\begin{aligned} \Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\le C_h\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}. \end{aligned}$$
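
Corollaries 1 and 2 can be checked numerically in a simple model case. The Python sketch below is an illustration under assumptions that are not part of the text above: \(\Omega =(0,1)\), \(J=(0,T)\) with \(T=1\), \(f(t,x)=\sin (\pi x)\), \(w_0={\hat{w}}_0=0\), continuous piecewise linear elements on a uniform mesh with n cells, and the one-dimensional constant \(C_h=h/\pi \). For this data the exact and semi-discrete solutions are \(c(t)\sin (\pi x)\) and \(c_h(t)\) times the nodal interpolant of \(\sin (\pi x)\), because the nodal vector of \(\sin (\pi x)\) is an eigenvector of the mass and stiffness matrices; all norms are then available in closed form. The function name `heat_errors_1d` is hypothetical.

```python
import math


def heat_errors_1d(n, T=1.0):
    """Errors of the P1 semi-discrete Galerkin solution of
    w_t - w_xx = sin(pi x) on (0, 1) x (0, T) with zero initial data.

    Returns (err_H, err_L, norm_f, h): the L^2(J;H^1_0) and L^2(J;L^2)
    norms of w - w_h, the L^2(J;L^2) norm of f, and the mesh size.
    """
    h = 1.0 / n
    cos_ph = math.cos(math.pi * h)
    alpha = math.pi ** 2                                    # exact decay rate
    beta = 6.0 * (1.0 - cos_ph) / (h * h * (2.0 + cos_ph))  # discrete decay rate

    # Spatial quantities for u = sin(pi x) and its nodal interpolant u_h = R_h u
    a_uu = math.pi ** 2 / 2.0        # a(u, u)
    a_hh = (1.0 - cos_ph) / (h * h)  # a(u_h, u_h), which equals a(u, u_h)
    m_uu = 0.5                       # (u, u)
    m_uh = a_hh / math.pi ** 2       # (u, u_h)
    m_hh = (2.0 + cos_ph) / 6.0      # (u_h, u_h)

    def time_integral(p, q):
        """Integral over (0, T) of (1 - exp(-p t)) (1 - exp(-q t)) / pi^4."""
        val = (T - (1.0 - math.exp(-p * T)) / p - (1.0 - math.exp(-q * T)) / q
               + (1.0 - math.exp(-(p + q) * T)) / (p + q))
        return val / math.pi ** 4

    # c(t) = (1 - exp(-alpha t)) / pi^2 and c_h(t) = (1 - exp(-beta t)) / pi^2
    i_cc = time_integral(alpha, alpha)  # integral of c * c
    i_ch = time_integral(alpha, beta)   # integral of c * c_h
    i_hh = time_integral(beta, beta)    # integral of c_h * c_h

    err_H_sq = i_cc * a_uu - 2.0 * i_ch * a_hh + i_hh * a_hh
    err_L_sq = i_cc * m_uu - 2.0 * i_ch * m_uh + i_hh * m_hh
    norm_f = math.sqrt(T / 2.0)
    return math.sqrt(err_H_sq), math.sqrt(err_L_sq), norm_f, h


err_H, err_L, norm_f, h = heat_errors_1d(8)
c_h = h / math.pi  # assumed constant in (2) for this 1D case
print("Corollary 1 holds:", err_H <= c_h * norm_f)
print("Corollary 2 holds:", err_L <= c_h * err_H)
```

Using closed forms avoids any time-stepping error, so the comparison isolates the semi-discrete error that the corollaries bound.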

Assuming that \(t_0=0\) and \(w_0={\hat{w}}_0=0\), Corollaries 1 and 2 immediately yield sharper estimates than (6) and (7). Each of the constants derived by Corollaries 1 and 2 should be the best possible in the sense that we only use the error constant \(C_h\) for the Ritz projection in (2).
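
As a worked comparison under the assumptions of both corollaries (\(t_0=0\), \(w_0={\hat{w}}_0=0\), and \(f\in L^2(J;L^2(\Omega ))\)), chaining the two estimates gives

$$\begin{aligned} \Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\le C_h\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\le C_h^2\Vert f\Vert _{L^2(J;L^2(\Omega ))}, \end{aligned}$$

whereas chaining (7) and (6) gives the constant \(4C_h\cdot 2C_h=8C_h^2\) in place of \(C_h^2\).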

In this paper, we prove Theorem 1 in Sect. 2 and Theorem 2 in Sect. 3.

2 Proof of Theorem 1

We provide the proof of Theorem 1.

Proof

For a.e. \(t\in J\), it follows from (5) with \(v_h=R_h(w-w_h)(t)\in V_h\) that

$$\begin{aligned}&\langle f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t), R_h (w-w_h)(t) \rangle \nonumber \\&=\langle f(t),R_h(w-w_h)(t) \rangle -a(w_h(t),R_h (w-w_h)(t))-\langle \partial _t w_h(t), R_h (w-w_h)(t) \rangle \nonumber \\&=\langle f(t),R_h(w-w_h)(t) \rangle -\langle f(t), R_h(w-w_h)(t) \rangle \nonumber \\&=0. \end{aligned}$$
(9)

From (4) with \(v=(w-w_h)(t)\),

$$\begin{aligned}&\langle \partial _t (w-w_h)(t),(w-w_h)(t)\rangle +a((w-w_h)(t),(w-w_h)(t))\\&=\langle f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t), (w-w_h)(t)\rangle \\&=\langle f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t), (I-R_h)(w-w_h)(t) \rangle \\&\quad +\langle f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t), R_h(w-w_h)(t) \rangle . \end{aligned}$$

The equality (9) yields

$$\begin{aligned}&\langle \partial _t (w-w_h)(t),(w-w_h)(t)\rangle +a((w-w_h)(t),(w-w_h)(t))\\&=\langle f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t), (I-R_h)(w-w_h)(t) \rangle \\&=a({\mathcal {A}}^{-1}(f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t)),(I-R_h)(w-w_h)(t))\\&=a((I-R_h){\mathcal {A}}^{-1}(f(t)-{\mathcal {A}}w_h(t)-\partial _t w_h(t)),(w-w_h)(t))\\&=a((I-R_h){\mathcal {A}}^{-1}(f(t)-\partial _t w_h(t)),(w-w_h)(t)), \end{aligned}$$

where the last equality holds because \((I-R_h)w_h(t)=0\) for \(w_h(t)\in V_h\). Because \(f(t)-\partial _tw_h(t)\in L^2(\Omega )\), it follows from (2) that

$$\begin{aligned}&\langle \partial _t (w-w_h)(t),(w-w_h)(t)\rangle +a((w-w_h)(t),(w-w_h)(t))\nonumber \\&\le \Vert (I-R_h){\mathcal {A}}^{-1}(f(t)-\partial _tw_h(t))\Vert _{H^1_0(\Omega )}\Vert (w-w_h)(t)\Vert _{H^1_0(\Omega )}\nonumber \\&\le C_h\Vert f(t)-\partial _t w_h(t)\Vert _{L^2(\Omega )}\Vert (w-w_h)(t)\Vert _{H^1_0(\Omega )}. \end{aligned}$$
(10)

Note that \(w-w_h\in Z\subset C^0([t_0,t_1];L^2(\Omega ))\) and

$$\begin{aligned} \left( \frac{dk}{dt}\right) (t)=2\langle \partial _t (w-w_h)(t),(w-w_h)(t)\rangle ~~t\in J \end{aligned}$$

are satisfied, where \(k(t):=\Vert (w-w_h)(t)\Vert _{L^2(\Omega )}^2\) (see, e.g., [3, Theorem 3 in Sect. 5.9]). Integrating both sides of (10) over J and applying the Cauchy-Schwarz inequality with respect to time yields

$$\begin{aligned}&\frac{1}{2}\Vert (w-w_h)(t_1)\Vert _{L^2(\Omega )}^2-\frac{1}{2}\Vert (w-w_h)(t_0)\Vert _{L^2(\Omega )}^2+\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2\nonumber \\&=\int _{J}\langle \partial _s (w-w_h)(s),(w-w_h)(s)\rangle ds+\int _{J}a((w-w_h)(s),(w-w_h)(s))ds\nonumber \\&\le C_h\Vert f-\partial _t w_h\Vert _{L^2(J;L^2(\Omega ))}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}. \end{aligned}$$
(11)
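
The last inequality in (11) combines (10) with the Cauchy-Schwarz inequality with respect to s; written out, the step reads

$$\begin{aligned}&\int _{J}C_h\Vert f(s)-\partial _s w_h(s)\Vert _{L^2(\Omega )}\Vert (w-w_h)(s)\Vert _{H^1_0(\Omega )}ds\\&\le C_h\sqrt{\int _{J}\Vert f(s)-\partial _s w_h(s)\Vert _{L^2(\Omega )}^2ds}\sqrt{\int _{J}\Vert (w-w_h)(s)\Vert _{H^1_0(\Omega )}^2ds}\\&=C_h\Vert f-\partial _t w_h\Vert _{L^2(J;L^2(\Omega ))}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}. \end{aligned}$$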

We now estimate \(\Vert f-\partial _t w_h\Vert _{L^2(J;L^2(\Omega ))}\). Taking \(v_h=\partial _t w_h(t)\in V_h\) in (5) gives

$$\begin{aligned} \Vert \partial _tw_h(t)\Vert _{L^2(\Omega )}^2+a(w_h(t),\partial _t w_h(t))=(f(t),\partial _t w_h(t))_{L^2(\Omega )} \end{aligned}$$

for a.e. \(t\in J\). Integrating over J yields

$$\begin{aligned} \Vert \partial _tw_h\Vert _{L^2(J;L^2(\Omega ))}^2+\int _{J}a(w_h(s),\partial _s w_h(s))ds=\int _{J}(f(s),\partial _s w_h(s))_{L^2(\Omega )}ds. \end{aligned}$$
(12)

Because \(w_h\in H^1(J;V_h)\), we have

$$\begin{aligned}&\int _{J}a(w_h(s), \partial _s w_h(s))ds \\&=\int _{J}\int _{\Omega }\nabla w_h(s,x)\cdot \nabla \partial _s w_h(s,x)dxds\\&=\int _{J}\frac{d}{ds}\left( \int _{\Omega }|\nabla w_h(s,x)|^2dx\right) ds-\int _{J}\int _{\Omega }\nabla \partial _s w_h(s,x)\cdot \nabla w_h(s,x)dxds\\&=\int _{J}\left( \frac{dg}{ds}\right) (s)ds-\int _{J}a(w_h(s),\partial _s w_h(s))ds, \end{aligned}$$

where \(g(t):=a(w_h(t),w_h(t))=\Vert w_h(t)\Vert _{H^1_0(\Omega )}^2\). Since \(w_h\in H^1(J;H^1_0(\Omega ))\subset C^0([t_0,t_1];H^1_0(\Omega ))\) (see, e.g., [3, Theorem 2 in Sect. 5.9]), we obtain

$$\begin{aligned} \int _{J}a(w_h(s), \partial _s w_h(s))ds&=\frac{1}{2}\int _{J}\left( \frac{dg}{ds}\right) (s)ds\nonumber \\&=\frac{1}{2}\left( \Vert w_h(t_1)\Vert _{H^1_0(\Omega )}^2-\Vert w_h(t_0)\Vert _{H^1_0(\Omega )}^2\right) . \end{aligned}$$
(13)

From (12) and (13),

$$\begin{aligned}&\Vert f-\partial _t w_h\Vert _{L^2(J;L^2(\Omega ))}^2 \nonumber \\&\quad =\Vert f\Vert _{L^2(J;L^2(\Omega ))}^2-2\int _{J}(f(s),\partial _s w_h(s))_{L^2(\Omega )}ds+\Vert \partial _tw_h\Vert _{L^2(J;L^2(\Omega ))}^2 \nonumber \\&\quad =\Vert f\Vert _{L^2(J;L^2(\Omega ))}^2-2\Vert \partial _tw_h\Vert _{L^2(J;L^2(\Omega ))}^2-2\int _{J}a(w_h(s),\partial _s w_h(s))ds+\Vert \partial _tw_h\Vert _{L^2(J;L^2(\Omega ))}^2\nonumber \\&\quad =\Vert f\Vert _{L^2(J;L^2(\Omega ))}^2-\Vert w_h(t_1)\Vert _{H^1_0(\Omega )}^2+\Vert w_h(t_0)\Vert _{H^1_0(\Omega )}^2-\Vert \partial _tw_h\Vert _{L^2(J;L^2(\Omega ))}^2\nonumber \\&\le \Vert f\Vert _{L^2(J;L^2(\Omega ))}^2+\Vert w_h(t_0)\Vert _{H^1_0(\Omega )}^2. \end{aligned}$$
(14)

It follows from (11), (14), and the inequality of arithmetic and geometric means, \(ab\le (a^2+b^2)/2\), that

$$\begin{aligned}&\frac{1}{2}\Vert (w-w_h)(t_1)\Vert _{L^2(\Omega )}^2+\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2 \\&\le \frac{1}{2}\Vert (w-w_h)(t_0)\Vert _{L^2(\Omega )}^2\\&\quad +C_h\sqrt{\Vert f\Vert _{L^2(J;L^2(\Omega ))}^2+\Vert w_h(t_0)\Vert _{H^1_0(\Omega )}^2}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\\&\le \frac{1}{2}\Vert (w-w_h)(t_0)\Vert _{L^2(\Omega )}^2+\frac{C_h^2}{2}\left( \Vert f\Vert _{L^2(J;L^2(\Omega ))}^2+\Vert w_h(t_0)\Vert _{H^1_0(\Omega )}^2\right) \\&\quad +\frac{1}{2}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2. \end{aligned}$$

Then,

$$\begin{aligned}&\Vert (w-w_h)(t_1)\Vert _{L^2(\Omega )}^2+\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2\\&\le \Vert (w-w_h)(t_0)\Vert _{L^2(\Omega )}^2+C_h^2\left( \Vert f\Vert _{L^2(J;L^2(\Omega ))}^2+\Vert w_h(t_0)\Vert _{H^1_0(\Omega )}^2\right) . \end{aligned}$$

Because \(w(t_0)=w_0\) and \(w_h(t_0)={\hat{w}}_0\), dropping the nonnegative term \(\Vert (w-w_h)(t_1)\Vert _{L^2(\Omega )}^2\) on the left-hand side completes the proof. \(\square \)

3 Proof of Theorem 2

We introduce the notation and a lemma used in the proof of Theorem 2. Because \(w-w_h\in Z\subset C^0([t_0,t_1];L^2(\Omega ))\), for \(t\in [t_0,t_1]\), we may define

$$\begin{aligned} z(t):={\mathcal {A}}^{-1}(w-w_h)(t)\in H^1_0(\Omega ),~~~z_h(t):=R_h{\mathcal {A}}^{-1}(w-w_h)(t)\in V_h. \end{aligned}$$
(15)

We first establish Lemma 1, which is used in the proof of Theorem 2.

Lemma 1

The function \(z_h\) defined by (15) is in \(H^1(J;H^1_0(\Omega ))\) and we have

$$\begin{aligned} \partial _tz_h=R_h{\mathcal {A}}^{-1}\partial _t(w-w_h). \end{aligned}$$

Proof

We first verify that \(R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)\in L^2(J;H^{1}_0(\Omega ))\). Since \(R_h{\mathcal {A}}^{-1}:H^{-1}(\Omega )\rightarrow V_h\) is a bounded operator, it suffices to show that \(\partial _t(w-w_h)\in L^2(J;H^{-1}(\Omega ))\). Because \(w\in Z\), we have \(\partial _t w\in L^2(J;H^{-1}(\Omega ))\). Because \(w_h\in H^1(J;V_h)\), we have \(\partial _t w_h\in L^2(J;V_h)\subset L^2(J;H^1_0(\Omega ))\), and through the Gelfand triple \(\partial _t w_h\) can be regarded as an element of \(L^2(J;H^{-1}(\Omega ))\). Hence \(\partial _t(w-w_h)\in L^2(J;H^{-1}(\Omega ))\), and therefore \(R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)\in L^2(J;V_h)\subset L^2(J;H^{1}_0(\Omega ))\).

Next, we show that \(z_h\in H^1(J;H^1_0(\Omega ))\) and \(\partial _tz_h=R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)\). Let \(C^{\infty }_0(J)\) denote the set of infinitely differentiable functions with compact support in J. For any \(\phi \in C^{\infty }_{0}(J)\), it follows that

$$\begin{aligned} \int _{J}\partial _s z_h(s)\phi (s)ds&=-\int _{J}z_h(s)\frac{d\phi }{ds}(s)ds\\&=-\int _{J}R_h{\mathcal {A}}^{-1}(w-w_h)(s)\frac{d\phi }{ds}(s)ds\\&=-R_h{\mathcal {A}}^{-1}\int _{J}(w-w_h)(s)\frac{d\phi }{ds}(s)ds, \end{aligned}$$

where the last equality follows from the boundedness of \(R_h{\mathcal {A}}^{-1}:H^{-1}(\Omega ) \rightarrow V_h\) (see, e.g., [18, Corollary 2 in Sect. 5 of Chapter V]). It follows from \(\partial _t(w-w_h)\in L^2(J;H^{-1}(\Omega ))\) and the boundedness of \(R_h{\mathcal {A}}^{-1}:H^{-1}(\Omega ) \rightarrow V_h\) that

$$\begin{aligned} \int _{J}\partial _s z_h(s)\phi (s)ds&=R_h{\mathcal {A}}^{-1}\int _{J}\partial _s(w-w_h)(s)\phi (s)ds\\&=\int _{J}R_h{\mathcal {A}}^{-1}\partial _s(w-w_h)(s)\phi (s)ds. \end{aligned}$$

Since \(R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)\in L^2(J;H^{1}_0(\Omega ))\), we have \(z_h\in H^1(J;H^1_0(\Omega ))\) and \(\partial _tz_h=R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)\). \(\square \)

Now, we prove Theorem 2.

Proof

For \(t>t_0\), substituting \(v=z_h(t)\) into (4) and \(v_h=z_h(t)\) into (5), and subtracting the results, yields

$$\begin{aligned}&\langle \partial _t(w-w_h)(t),z_h(t)\rangle +a((w-w_h)(t),z_h(t))\nonumber \\&=\langle \partial _tw(t),z_h(t)\rangle +a(w(t),z_h(t))-\left( \langle \partial _tw_h(t),z_h(t)\rangle +a(w_h(t),z_h(t))\right) \nonumber \\&=\langle f(t),z_h(t)\rangle -\langle f(t),z_h(t)\rangle \nonumber \\&=0. \end{aligned}$$
(16)

Because the bilinear form a is symmetric, it follows from (16) that for \(t>t_0\),

$$\begin{aligned} \Vert w(t)-w_h(t)\Vert _{L^2(\Omega )}^2&=((w-w_h)(t),(w-w_h)(t))_{L^2(\Omega )}\\&=a(z(t),(w-w_h)(t))\\&=a((z-z_h)(t),(w-w_h)(t))+a(z_h(t),(w-w_h)(t))\\&=a((z-z_h)(t),(w-w_h)(t))+a((w-w_h)(t), z_h(t))\\&=a((z-z_h)(t),(w-w_h)(t))-\langle \partial _t(w-w_h)(t), z_h(t)\rangle \\&=a((z-z_h)(t),(w-w_h)(t))-a({\mathcal {A}}^{-1}\partial _t(w-w_h)(t),z_h(t)). \end{aligned}$$

Because \(R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)=\partial _tz_h\) holds from Lemma 1, we obtain

$$\begin{aligned}&\Vert w(t)-w_h(t)\Vert _{L^2(\Omega )}^2\nonumber \\&=a((z-z_h)(t),(w-w_h)(t))-a(R_h{\mathcal {A}}^{-1}\partial _t(w-w_h)(t),z_h(t))\nonumber \\&=a((z-z_h)(t),(w-w_h)(t))-a(\partial _t z_h(t), z_h(t)). \end{aligned}$$
(17)

Integrating both sides of (17) for \(t\in J\), we obtain

$$\begin{aligned}&\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}^2\\&=\int _{J}a((z-z_h)(s),(w-w_h)(s))ds-\int _{J}a(\partial _s z_h(s),z_h(s))ds\\&\le \left| \int _{J}a((z-z_h)(s),(w-w_h)(s))ds\right| -\int _{J}a(\partial _s z_h(s),z_h(s))ds\\&\le \int _{J}|a((z-z_h)(s),(w-w_h)(s))|ds-\int _{J}a(\partial _s z_h(s),z_h(s))ds\\&\le \sqrt{\int _{J}\Vert (z-z_h)(s)\Vert _{H^1_0(\Omega )}^2ds}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}-\int _{J}a(\partial _s z_h(s),z_h(s))ds\\&=\sqrt{\int _{J}\Vert (I-R_h){\mathcal {A}}^{-1}(w-w_h)(s)\Vert _{H^1_0(\Omega )}^2ds}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}\\&\quad -\int _{J}a(\partial _s z_h(s),z_h(s))ds\\&\le C_h\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}-\int _{J}a(\partial _s z_h(s),z_h(s))ds, \end{aligned}$$

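where the last inequality follows from (2) applied to \(u=z(s)={\mathcal {A}}^{-1}(w-w_h)(s)\), which belongs to \({\mathcal {W}}\) because \((w-w_h)(s)\in L^2(\Omega )\) for \(s\in [t_0,t_1]\). Because \(z_h\in H^1(J;H^1_0(\Omega ))\) by Lemma 1, it follows from (13), with \(w_h\) replaced by \(z_h\), that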

$$\begin{aligned}&\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}^2\\&\le C_h\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}-\frac{1}{2}\left( \Vert z_h(t_1)\Vert _{H^1_0(\Omega )}^2-\Vert z_h(t_0)\Vert _{H^1_0(\Omega )}^2\right) \\&\le C_h\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}+\frac{1}{2}\Vert z_h(t_0)\Vert _{H^1_0(\Omega )}^2\\&=\sqrt{C_h^2\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}^2}+\frac{1}{2}\Vert z_h(t_0)\Vert _{H^1_0(\Omega )}^2\\&\le \frac{C_h^2}{2}\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2+\frac{1}{2}\Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}^2+\frac{1}{2}\Vert z_h(t_0)\Vert _{H^1_0(\Omega )}^2, \end{aligned}$$

where the last inequality follows from the inequality of arithmetic and geometric means. Therefore, we have

$$\begin{aligned} \Vert w-w_h\Vert _{L^2(J;L^2(\Omega ))}^2&\le \Vert z_h(t_0)\Vert _{H^1_0(\Omega )}^2+C_h^2\Vert w-w_h\Vert _{L^2(J;H^1_0(\Omega ))}^2. \end{aligned}$$

Because \(z_h(t_0)=R_h{\mathcal {A}}^{-1}(w_0-{\hat{w}}_0)\) by (15), the proof is complete. \(\square \)

4 Conclusion

We proposed \(L^2(J;H^1_0(\Omega ))\) and \(L^2(J;L^2(\Omega ))\) norm error estimates that provide explicit values of the error constants for the semi-discrete Galerkin approximation of the linear heat equation (4) in Theorems 1 and 2, respectively. Furthermore, we derived Corollaries 1 and 2 as special cases of Theorems 1 and 2, respectively. The estimates in Corollaries 1 and 2 are sharper than those given by Nakao et al. [14]. Moreover, the constants in these corollaries coincide with \(C_h\) in (2), which leads us to believe that our error estimates are, in this sense, the best possible. Therefore, our results contribute to the theoretical and numerical basis for computer-assisted existence proofs for solutions to semi-linear parabolic PDEs.