1 Introduction

Let \(T >0, u^0\in H,\) and consider the initial value problem of seeking \(u \in C((0,T];{\mathscr {D}}(A))\cap C([0,T];H)\) satisfying

$$\begin{aligned} \left\{ \begin{aligned}&u' (t) + Au(t) = f(t), \quad 0<t<T,\\&u(0)=u^0, \end{aligned} \right. \end{aligned}$$
(1.1)

with A a positive definite, selfadjoint, linear operator on a Hilbert space \((H, (\cdot , \cdot )) \) with domain \({\mathscr {D}}(A)\) dense in H and \(f: [0,T] \rightarrow H\) a given forcing term.

The backward difference formula (BDF) methods are popular for stiff differential equations, in particular, for parabolic equations. They are frequently implemented on nonuniform partitions for numerical efficiency.

For an integer \(N\geqslant 2,\) consider a partition \(0=t_0<t_1<\cdots <t_N=T\) of the time interval [0, T],  with time steps \(k_n:=t_n-t_{n-1}, n=1,\dotsc ,N.\) We recursively define a sequence of approximations \(u^n\) to the nodal values \( u(t_n)\) of the exact solution by the variable two-step BDF method,

$$\begin{aligned} D_2 u^{n}+A u^{n} =f^n, \quad n=2,\dotsc ,N, \end{aligned}$$
(1.2)

with \(f^n:=f(t_n),\) assuming that arbitrary starting approximations \(u^0\) and \(u^{1}\) are given. Here,

$$\begin{aligned} D_2 \upsilon ^n:=\Bigg (1+\tfrac{k_n}{k_{n-1}}\Bigg )\frac{\upsilon ^n-\upsilon ^{n-1}}{k_n} -\frac{k_n}{k_{n-1}}\frac{\upsilon ^n-\upsilon ^{n-2}}{k_n+k_{n-1}}. \end{aligned}$$
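For illustration, the difference quotient \(D_2\) is easy to probe numerically. The following Python sketch (a sanity check, not part of the analysis) implements the definition above and verifies that \(D_2 \upsilon^n\) reproduces the exact derivative \(\upsilon'(t_n)\) for polynomials of degree at most two, reflecting the second-order consistency of the variable two-step BDF formula; the node values used are arbitrary illustrative choices.

```python
def bdf2(v2, v1, v0, kn, kn1):
    """Variable two-step BDF difference quotient D_2 v^n, as defined above.

    v2, v1, v0 are v^n, v^{n-1}, v^{n-2}; kn, kn1 are k_n, k_{n-1}.
    """
    r = kn / kn1
    return (1 + r) * (v2 - v1) / kn - r * (v2 - v0) / (kn + kn1)

# D_2 is exact for polynomials of degree <= 2: D_2 v^n = v'(t_n)
t0, t1, t2 = 0.0, 0.4, 1.0                      # illustrative nonuniform nodes
k1, k2 = t1 - t0, t2 - t1
for v, dv in [(lambda t: t, lambda t: 1.0), (lambda t: t * t, lambda t: 2 * t)]:
    assert abs(bdf2(v(t2), v(t1), v(t0), k2, k1) - dv(t2)) < 1e-12
```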

Let \(| \cdot |\) denote the norm on H induced by the inner product \((\cdot , \cdot )\), and introduce on \(V:={\mathscr {D}}(A^{1/2})\) the norm \(\Vert \cdot \Vert \) given by \(\Vert \upsilon \Vert :=| A^{1/2} \upsilon |.\) We identify H with its dual, denote by \(V'\) the dual of V, and by \(\Vert \cdot \Vert _\star \) the dual norm on \(V',\) \(\Vert \upsilon \Vert _\star :=| A^{-1/2} \upsilon |.\) We shall use the notation \((\cdot , \cdot )\) also for the antiduality pairing between \(V'\) and V. For simplicity, we denote by \(\langle \cdot ,\cdot \rangle \) the inner product on \(V,\) \(\langle \upsilon , w\rangle :=(A^{1/2}\upsilon , A^{1/2}w).\)

1.1 Main result

We establish the following stability result.

Theorem 1.1

(Stability estimate) Let \(u^n\) satisfy (1.2), with \(u^0, u^1\in V\), and assume that

$$\begin{aligned} r_n:=\frac{k_n}{k_{n-1}}\leqslant r^\star \approx 1.9398,\quad n=2,\dotsc ,N; \end{aligned}$$
(1.3)

the bound \(r^\star \) is expressed in terms of the multipliers \(\delta =0.9672\) and \(\eta =-0.1793\) in (3.1); see also (4.17) for more precise values of the bound \(r^\star \) as well as of the multipliers. Then, the variable two-step BDF method (1.2) is stable in the sense that

$$\begin{aligned} |u^n|^2+\sum _{j=2}^{n}k_j\Vert u^{j}\Vert ^2 \leqslant C\textrm{e}^{c\varGamma _n}\Bigg (|u^0|^2+|u^1|^2 +k_2\Vert u^0\Vert ^2+k_2\Vert u^1\Vert ^2+\sum _{j=2}^{n}k_j\Vert f^{j}\Vert _\star ^2\Bigg ), \end{aligned}$$
(1.4)

\(n=2,\dotsc ,N.\) Here, \(\varGamma _n\) is a mesh-dependent quantity,

$$\begin{aligned} \varGamma _n:=\sum \limits _{j=2}^{n-2}[r_{j}-r_{j+2}]_{+}\quad \text {with}\quad [x]_+:=\max (x,0), \end{aligned}$$
(1.5)

and C, c denote generic constants, independent of T and the operator A, as well as of f and of the partition of the time interval.
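The numerical values quoted in the theorem are easy to reproduce from the closed-form expression (3.1) for \(r^\star (\delta ,\eta )\); the short Python check below (illustrative, not part of the paper) evaluates it for the multipliers above, and also for the single-multiplier case \(\eta =0\) with \(\delta =0.72349\) discussed in Remark 3.1.

```python
import math

def r_star(delta, eta):
    """Step-ratio bound r*(delta, eta) from formula (3.1)."""
    s = math.sqrt(1 + delta + 3 * eta)
    return s / (2 * math.sqrt(1 + eta) - s)

# the two-multiplier bound of Theorem 1.1
assert abs(r_star(0.9672, -0.1793) - 1.9398) < 5e-4
# the single-multiplier (eta = 0) bound of Emmrich, cf. Remark 3.1
assert abs(r_star(0.72349, 0.0) - 1.9104) < 5e-4
```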

Let us recall some partitions for which \(\varGamma _n\) is finite; see [12, p. 175]. If the sequence of the ratios \((r_n)\) is monotone (and bounded), then \(\varGamma _n\) is bounded; more precisely, \(\varGamma _n=0\) if \((r_n)\) is nondecreasing, and \(\varGamma _n=r_2+r_3-r_{n-1}-r_n\) if \((r_n)\) is decreasing. More generally, \(\varGamma _n\) is bounded in the practically reasonable case that the number of changes in monotonicity of the sequence \((r_n)\) is bounded, uniformly with respect to the number N of time steps. For partitions of the form \(t_i=(i/N)^\alpha ,\) with \(\alpha >1,\) the time steps \(k_i\) increase and the ratios \(r_i\) decrease to 1,  whence, in particular, \(r_i\leqslant r^\star \) except for a finite number of i.
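The behavior of \(\varGamma _n\) on such graded partitions can be checked directly; the following Python sketch (an illustrative computation, not from the paper, with arbitrarily chosen \(N\) and \(\alpha\)) builds the partition \(t_i=(i/N)^\alpha ,\) confirms that the ratios decrease, and verifies that \(\varGamma _N\) telescopes to \(r_2+r_3-r_{N-1}-r_N.\)

```python
N, alpha = 100, 2.0                      # illustrative choices
t = [(i / N) ** alpha for i in range(N + 1)]
k = [t[i] - t[i - 1] for i in range(1, N + 1)]   # k[0] = k_1, ..., k[N-1] = k_N
r = [k[i] / k[i - 1] for i in range(1, N)]       # r[0] = r_2, ..., r[N-2] = r_N

# Gamma_N = sum over j = 2, ..., N-2 of [r_j - r_{j+2}]_+
Gamma = sum(max(r[j] - r[j + 2], 0.0) for j in range(len(r) - 2))

# the ratios decrease toward 1, so the sum telescopes (note r_2 = 3 > r*
# here, consistent with "except for a finite number of i")
assert all(r[i] >= r[i + 1] for i in range(len(r) - 1))
assert abs(Gamma - (r[0] + r[1] - r[-2] - r[-1])) < 1e-12
```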

1.2 Main ingredients of the proof

We shall use the energy technique. Let \(r_n=k_n/k_{n-1}, n=2,\dotsc ,N,\) be the adjacent time step ratios. With the notation

$$\begin{aligned} \delta _k\upsilon ^n:=\upsilon ^n-\upsilon ^{n-k},\quad \omega _n:=\frac{1}{1+r_n},\quad \psi _n:=\Bigg (\frac{r_n}{1+r_n}\Bigg )^2, \end{aligned}$$

the backward difference quotient \(D_2 \upsilon ^{n}\) can be written in the form (cf. [2])

$$\begin{aligned} D_2 \upsilon ^{n}=\frac{1}{\omega _nk_n}\big (\delta _1\upsilon ^n-\psi _n\delta _2\upsilon ^n\big ). \end{aligned}$$
(1.6)
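The algebraic identity behind (1.6) can be confirmed numerically; the sketch below (a sanity check only, with arbitrary sample values) compares the original definition of \(D_2\) with the factored form (1.6).

```python
def d2_original(v2, v1, v0, kn, kn1):
    """D_2 v^n as defined in the introduction."""
    r = kn / kn1
    return (1 + r) * (v2 - v1) / kn - r * (v2 - v0) / (kn + kn1)

def d2_factored(v2, v1, v0, kn, kn1):
    """Form (1.6): D_2 v^n = (delta_1 v^n - psi_n delta_2 v^n) / (omega_n k_n)."""
    r = kn / kn1
    omega, psi = 1 / (1 + r), (r / (1 + r)) ** 2
    return ((v2 - v1) - psi * (v2 - v0)) / (omega * kn)

vals = (1.3, -0.7, 2.1)                  # arbitrary sample values v^n, v^{n-1}, v^{n-2}
assert abs(d2_original(*vals, 0.6, 0.4) - d2_factored(*vals, 0.6, 0.4)) < 1e-12
```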

Testing the BDF method (1.2) by \(2\omega _nk_n\left( u^{n} -\delta u^{n-1}-\eta u^{n-2}\right) \), with \(0<\delta <1\) and \(-1<\eta < 0\) two multipliers to be suitably chosen below, we obtain

$$\begin{aligned} {\mathscr {D}}_n+{\mathscr {A}}_n={\mathscr {F}}_n,\quad n=2,\dotsc ,N, \end{aligned}$$
(1.7)

with

$$\begin{aligned} \left\{ \begin{aligned}&{\mathscr {D}}_n:=2\omega _nk_n\left( D_2 u^n, u^{n} -\delta u^{n-1}-\eta u^{n-2}\right) ,\\&{\mathscr {A}}_n:=2\omega _nk_n\langle u^n, u^{n} -\delta u^{n-1}-\eta u^{n-2}\rangle ,\\&{\mathscr {F}}_n:=2\omega _nk_n\big (f^n, u^{n} -\delta u^{n-1}-\eta u^{n-2}\big ). \end{aligned} \right. \end{aligned}$$
(1.8)

The terms \({\mathscr {F}}_n\) on the right-hand side of (1.7), accounting for the forcing term f,  can be easily estimated from above by the generalized Cauchy–Schwarz and the weighted arithmetic–geometric mean inequalities. We shall estimate \({\mathscr {D}}_n\) from below, and subsequently the sum over all \({\mathscr {D}}_n,\) in Sect. 3.1, while in Sect. 3.2 we shall directly estimate the sum over all \({\mathscr {A}}_n\) from below rather than each term \({\mathscr {A}}_n\) separately. The key point in the estimate of the sum over all \({\mathscr {A}}_n\) is the positive definiteness of families of certain banded matrices; this property is described and established in Sect. 2.

1.3 Previous work

Stability of the A-stable two-step BDF method for parabolic equations on equidistant partitions can easily be established by the energy technique. The zero-stability property, and thus the stability for ODEs satisfying a Lipschitz condition, of the variable two-step BDF method is also well understood; a sufficient condition is \(r^\star <1+\sqrt{2}\approx 2.414\) in (1.3), and this bound is sharp; see [4, 7] as well as [10, p. 405]. In contrast, the analysis of the variable two-step BDF method for parabolic equations is cumbersome and still incomplete.

Grigorieff proved stability for linear parabolic equations, with bounds independent of \(\varGamma _n\), for \(r^\star \leqslant (1+\sqrt{3})/2\approx 1.366\) in [8, 9]. In [2], Becker established stability of the form (1.4) and derived error estimates for linear parabolic equations for \(r^\star \leqslant (2+\sqrt{13})/3\approx 1.8685\); see also [12, pp. 174–180]. Emmrich [5] further relaxed the bound to 1.9104 for semilinear parabolic equations. For stability estimates for the three-step BDF method, with a mesh-dependent quantity similar to \(\varGamma _n,\) we refer to [3].

In [2, 12] and [5], the method is tested by linear combinations of two terms, \(u^n\) and \(u^{n-1};\) here, to relax the condition on the ratios, we test, as mentioned, by linear combinations of all three terms that enter into the method, namely \(u^n, u^{n-1},\) and \(u^{n-2}.\) Furthermore, we directly estimate from below the sum of the terms accounting for the elliptic operator; this is in sharp contrast to Becker [2], Thomée [12], and Emmrich [5], where each of these terms is estimated separately; see Sect. 3.2.

Several stability estimates of a different kind, in which the difference quotient \((u^1-u^0)/k_1\) enters on the right-hand side, have been recently established both for linear and nonlinear parabolic equations, for bounds \(r^\star \) significantly larger than the optimal bound \(1+\sqrt{2}\) for zero-stability; see [11] and references therein. Notice that \((u^1-u^0)/k_1\) may enter implicitly, if, for instance, the starting value \(u^1\) is computed by employing one step of the trapezoidal method.

We establish key auxiliary results in Sect. 2 and provide the proof of Theorem 1.1 in Sect. 3. We motivate the choice of the multipliers \(\delta \) and \(\eta \) in Sect. 4.

2 Auxiliary results

Our main tool in the proof of the stability result in Theorem 1.1 will be the positive definiteness of families of certain banded matrices. This property will allow us to suitably estimate from below the sum over all terms \({\mathscr {A}}_n\) entering into (1.7).

For given real numbers \(\delta \) and \(\eta \leqslant 0,\) we are interested in properties of families of banded lower triangular \((n-1)\times (n-1)\) real matrices of the form

$$\begin{aligned} {\mathbb {L}}(r_2,\dotsc ,r_n):= \begin{pmatrix} \frac{1}{1+r_2} & & & & \\ -\delta \frac{\sqrt{r_3}}{1+r_3} & \frac{1}{1+r_3} & & & \\ -\eta \frac{\sqrt{r_3r_4}}{1+r_4} & -\delta \frac{\sqrt{r_4}}{1+r_4} & \frac{1}{1+r_4} & & \\ & \ddots & \ddots & \ddots & \\ & & -\eta \frac{\sqrt{r_{n-1}r_n}}{1+r_n} & -\delta \frac{\sqrt{r_n}}{1+r_n} & \frac{1}{1+r_n} \end{pmatrix} \end{aligned}$$
(2.1)

with positive entries \(r_2,\dotsc ,r_n\leqslant r,\) where r is a uniform positive upper bound, for all \(n\geqslant 4.\)

Lemma 2.1

(Property of matrices of the form (2.1)) Let \((\cdot ,\cdot )_2\) and \(\Vert \cdot \Vert _2\) denote the Euclidean inner product and norm, respectively, on \(\mathbb {R}^{n-1},\) and let c be a real constant. Then,

$$\begin{aligned} ({\mathbb {L}}(r_2,\dotsc ,r_n)x,x)_2\geqslant c\Vert x\Vert _2^2\quad \forall x\in \mathbb {R}^{n-1}, \end{aligned}$$
(2.2)

for all matrices of the form (2.1) and for all \(n\geqslant 4,\) if and only if

$$\begin{aligned} p(y)=\frac{1}{1+r}\big [1+\eta r-\delta \sqrt{r} y-2\eta r y^2\big ]\geqslant c\quad \forall y \in [-1,1]. \end{aligned}$$
(2.3)

As we shall see later on, the necessity of (2.3) is an easy consequence of well-known properties of the spectrum of symmetric, banded Toeplitz matrices.
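Before turning to the proof, the sufficiency direction of the lemma can be probed numerically. The sketch below (illustrative only, using numpy; the ratio sample is random) assembles \({\mathbb {L}}(r_2,\dotsc ,r_n)\) for ratios bounded by r and checks that the smallest eigenvalue of its symmetric part is at least \(c=\min _{[-1,1]}p,\) as (2.2) asserts.

```python
import numpy as np

# multipliers of Theorem 1.1 and a ratio bound r <= r*
delta, eta, r = 0.9672, -0.1793, 1.9398

rng = np.random.default_rng(0)
ratios = rng.uniform(0.1, r, size=40)          # r_2, ..., r_n, all <= r

m = len(ratios)
L = np.zeros((m, m))
for i in range(m):                             # assemble (2.1) row by row
    L[i, i] = 1.0 / (1.0 + ratios[i])
    if i >= 1:
        L[i, i - 1] = -delta * np.sqrt(ratios[i]) / (1.0 + ratios[i])
    if i >= 2:
        L[i, i - 2] = -eta * np.sqrt(ratios[i - 1] * ratios[i]) / (1.0 + ratios[i])

# c = min of the quadratic p over [-1, 1], cf. (2.3); a fine grid suffices
y = np.linspace(-1.0, 1.0, 200001)
c = np.min((1.0 + eta * r - delta * np.sqrt(r) * y - 2.0 * eta * r * y**2) / (1.0 + r))

lam_min = np.linalg.eigvalsh(0.5 * (L + L.T)).min()   # smallest eigenvalue of sym. part
assert c > 0 and lam_min >= c - 1e-9
```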

Proof

First, we shall prove that condition (2.3) implies the estimate (2.2). With

$$\begin{aligned} J:=\begin{pmatrix} \frac{1}{1+r_2} & & & \\ & \frac{1}{1+r_3} & & \\ & & \ddots & \\ & & & \frac{1}{1+r_n} \end{pmatrix}\quad \text {and}\quad G:=\begin{pmatrix} 0 & & & \\ \sqrt{r_3} & 0 & & \\ & \ddots & \ddots & \\ & & \sqrt{r_n} & 0 \end{pmatrix} \end{aligned}$$

the matrix \({\mathbb {L}}:={\mathbb {L}}(r_2,\dotsc ,r_n)\) in (2.1) can be rewritten as

$$\begin{aligned} {\mathbb {L}}=J-\delta JG-\eta JG^2. \end{aligned}$$

It suffices to consider the symmetric part \({\mathbb {L}}_s\) of the matrix \({\mathbb {L}},\)

$$\begin{aligned} {\mathbb {L}}_s=\frac{1}{2} (\mathbb {L}+\mathbb {L}^\top ) =J-\frac{\delta }{2} (JG+G^\top J)-\frac{\eta }{2} \big (JG^2+(G^\top )^2J\big ), \end{aligned}$$

since \(({\mathbb {L}}x,x)_2=({\mathbb {L}}_sx,x)_2.\) With \(K:=J^{1/2},\) we have

$$\begin{aligned} 2K^{-1}{\mathbb {L}}_sK^{-1} =2I-\delta \big (KGK^{-1}+K^{-1}G^\top K\big )-\eta \big (KG^2K^{-1}+K^{-1}(G^\top )^2K\big ). \end{aligned}$$

Letting

$$\begin{aligned} P:=KGK^{-1}= \begin{pmatrix} 0 & & & \\ \sqrt{\frac{1+r_2}{1+r_3}r_3} & 0 & & \\ & \ddots & \ddots & \\ & & \sqrt{\frac{1+r_{n-1}}{1+r_n}r_n} & 0 \end{pmatrix}, \end{aligned}$$

we can rewrite \(2K^{-1}{\mathbb {L}}_sK^{-1}\) in the form

$$\begin{aligned} 2K^{-1}{\mathbb {L}}_sK^{-1}=2I-\delta (P+P^\top )-\eta \big (P^2+(P^\top )^2\big ), \end{aligned}$$

i.e.,

$$\begin{aligned} 2K^{-1}{\mathbb {L}}_sK^{-1} =2I-\delta \sqrt{r}\frac{P+P^\top }{\sqrt{r}}-\eta r \frac{P^2+(P^\top )^2}{r}. \end{aligned}$$

Therefore, with

$$\begin{aligned} Z:=\frac{P}{\sqrt{r}} =\begin{pmatrix} 0 & & & \\ z_3 & 0 & & \\ & \ddots & \ddots & \\ & & z_n & 0 \end{pmatrix}, \end{aligned}$$

we have

$$\begin{aligned} 2K^{-1}{\mathbb {L}}_sK^{-1} =2I-\delta \sqrt{r} (Z+Z^\top )-\eta r \big (Z^2+(Z^\top )^2\big ). \end{aligned}$$

Using here the identity \(Z^2+(Z^\top )^2=(Z+Z^\top )^2-ZZ^\top -Z^\top Z,\) we see that

$$\begin{aligned} 2K^{-1}{\mathbb {L}}_sK^{-1}=2M-\eta r (2I-ZZ^\top -Z^\top Z) \end{aligned}$$
(2.4)

with the symmetric matrix M

$$\begin{aligned} M:=(1+\eta r)I-\delta \sqrt{r} Z_s-2\eta r Z_s^2, \end{aligned}$$
(2.5)

where \(Z_s:=(Z+Z^\top )/2\) is the symmetric part of the matrix Z.

Since \(\frac{r_i}{1+r_i}\leqslant \frac{r}{1+r}\) and \(1+r_{i-1}\leqslant 1+r\), we have \(z_i=\sqrt{\frac{r_i}{1+r_i}\frac{1+r_{i-1}}{r}}\leqslant 1,\)

$$\begin{aligned} 0<z_i\leqslant 1, \quad i=3,\dotsc ,n. \end{aligned}$$
(2.6)

To prove (2.2), we shall proceed in two steps: first we shall show that (2.6) implies that the diagonal matrix \(2I-ZZ^\top -Z^\top Z\) is positive semidefinite, and subsequently, using the Rayleigh quotient criterion, that the eigenvalues of the matrix M are bounded from below by \(c(1+r).\)

Now,

$$\begin{aligned} ZZ^\top =\begin{pmatrix} 0 & & & \\ & z_3^2 & & \\ & & \ddots & \\ & & & z_n^2 \end{pmatrix}\quad \text {and}\quad Z^\top Z=\begin{pmatrix} z_3^2 & & & \\ & \ddots & & \\ & & z_n^2 & \\ & & & 0 \end{pmatrix}, \end{aligned}$$

and, thus, the matrix \(2I-ZZ^\top -Z^\top Z\) is diagonal. In view of (2.6), its diagonal entries are nonnegative; consequently, this matrix is indeed positive semidefinite. Notice also that \(\eta \leqslant 0.\)

To complete the proof of (2.2), it remains to show that the eigenvalues of the symmetric matrix M are bounded from below by \(c(1+r).\) Now, the eigenvalues \(\mu _i\) and \(\lambda _i\) of the symmetric matrices M and \(Z_s\), respectively, are related by

$$\begin{aligned} \mu _i=1+\eta r-\delta \sqrt{r} \lambda _i-2\eta r \lambda _i^2=(1+r)p(\lambda _i); \end{aligned}$$
(2.7)

see (2.5) and (2.3).

Let us first show that \(\lambda _i\in [-1,1]\) via the Rayleigh quotient criterion. Indeed, for \(y=(y_2,y_3,\dotsc ,y_n)^\top \in \mathbb {R}^{n-1},\) we have

$$\begin{aligned}(Z_s y,y)_2=\sum _{i=3}^nz_iy_iy_{i-1},\end{aligned}$$

whence, in view of (2.6),

$$\begin{aligned}|(Z_s y,y)_2|\leqslant \frac{1}{2}\sum _{i=3}^n\big ((y_{i-1})^2+(y_i)^2\big )=\Vert y\Vert _2^2-\frac{1}{2} \big [(y_2)^2+(y_n)^2\big ].\end{aligned}$$

Therefore,

$$\begin{aligned}|\lambda _i|\leqslant \sup _{\begin{array}{c} y\in \mathbb {R}^{n-1}\\ y\ne 0 \end{array}}\frac{|(Z_s y,y)_2|}{\Vert y\Vert _2^2}\leqslant 1.\end{aligned}$$

Now, it follows immediately from (2.3) and (2.7) that the eigenvalues \(\mu _i\) of the symmetric matrix M are bounded from below by \(c(1+r).\) Thus, for \(x\in \mathbb {R}^{n-1},\)

$$\begin{aligned}\big (K^{-1}{\mathbb {L}}_sK^{-1}x,x)_2\geqslant (Mx,x)_2 \geqslant c(1+r)\Vert x\Vert _2^2,\end{aligned}$$

which, in combination with \(\Vert K^{-1}x\Vert _2^2\leqslant (1+r)\Vert x\Vert _2^2,\) yields the asserted estimate (2.2).

Next, we prove that condition (2.3) is necessary for (2.2).

It suffices to show that condition (2.3) is necessary for (2.2) for all matrices of the form (2.1) with \(r_2=\cdots =r_n=r.\) The symmetric part \({\mathbb {L}}_s(r,\dotsc ,r):=\big ({\mathbb {L}}(r,\dotsc ,r)+{\mathbb {L}}(r,\dotsc ,r)^\top \big )/2\) of the \((n-1)\times (n-1)\) matrix \({\mathbb {L}}(r,\dotsc ,r)\) is a symmetric pentadiagonal Toeplitz matrix with generating function g (see [1, 6]),

$$\begin{aligned} g(x):=\frac{1}{1+r} \big [1-\delta \sqrt{r}\cos x-\eta r\cos (2x)\big ],\quad x\in \mathbb {R}. \end{aligned}$$

Now, with p the polynomial of (2.3) and the change of variables \(y=\cos x,\) we have

$$\begin{aligned} g_{\min }:=\min _{x\in \mathbb {R}}g(x)=\min _{-1\leqslant y\leqslant 1}p(y). \end{aligned}$$

Assume that (2.3) is not satisfied; then, we would have \(g_{\min }<c.\) From Theorem 2.1, a simplified version of more general results for symmetric banded Toeplitz matrices, we would then infer that the matrices \({\mathbb {L}}_s(r,\dotsc ,r)\) possess, for sufficiently large dimension, eigenvalues less than c,  a contradiction to (2.2).\(\square \)

Theorem 2.1

(Grenander–Szegő theorem, and asymptotic behavior of extreme eigenvalues of symmetric, banded Toeplitz matrices; cf. [6, Theorems 6.1 and 6.6]) Let g be a nonconstant, real and even, \(2\pi \)-periodic, trigonometric polynomial. Then, the eigenvalues of all symmetric, banded, \(n\times n\) Toeplitz matrices \(T_n,\) with generating function g,  belong to the open interval \((g_{\min },g_{\max })\) with \(g_{\min }\) and \(g_{\max }\) the minimum and maximum of g,  respectively.

Let \(\lambda _1(T_n)\geqslant \lambda _2(T_n)\geqslant \cdots \geqslant \lambda _n(T_n)\) be the eigenvalues of \(T_n\) sorted in nonincreasing order. Then, for each fixed integer \(j\geqslant 1,\) we have

$$\begin{aligned}\lim _{n\rightarrow \infty }\lambda _j(T_n)=g_{\max }\quad \text {and}\quad \lim _{n\rightarrow \infty }\lambda _{n-j+1}(T_n)=g_{\min }.\end{aligned}$$
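Both parts of the theorem are easy to illustrate numerically; the following Python sketch (with arbitrarily chosen, nonconstant generating-function coefficients, labeled as an assumption) checks that the spectrum of a symmetric pentadiagonal Toeplitz matrix lies in \((g_{\min },g_{\max })\) and that the smallest eigenvalue approaches \(g_{\min }\) as the dimension grows.

```python
import numpy as np

# illustrative coefficients (an assumption, not from the paper); the
# generating function is g(x) = a0 + 2*a1*cos(x) + 2*a2*cos(2x), nonconstant
a0, a1, a2 = 1.0, -0.45, 0.08
x = np.linspace(0.0, np.pi, 200001)
g = a0 + 2 * a1 * np.cos(x) + 2 * a2 * np.cos(2 * x)
gmin, gmax = g.min(), g.max()

gaps = []
for n in (20, 80, 320):
    T = (a0 * np.eye(n)
         + a1 * (np.eye(n, k=1) + np.eye(n, k=-1))
         + a2 * (np.eye(n, k=2) + np.eye(n, k=-2)))
    ev = np.linalg.eigvalsh(T)                 # ascending eigenvalues
    assert gmin < ev[0] and ev[-1] < gmax      # spectrum inside (gmin, gmax)
    gaps.append(ev[0] - gmin)

# the smallest eigenvalue decreases toward gmin as n grows
# (T_n is a principal submatrix of T_{n+1}: Cauchy interlacing)
assert gaps[0] >= gaps[1] >= gaps[2] > 0
```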

Remark 2.1

The Grenander–Szegő theorem applies to Toeplitz matrices; see the first part of Theorem 2.1. Here, Lemma 2.1 can be viewed as a variant of the Grenander–Szegő theorem, applicable to a class of non-Toeplitz matrices.

3 Proof of Theorem 1.1

In this section, we prove Theorem 1.1.

Let us first recall a discrete version of Gronwall’s lemma that we will need in the sequel.

Lemma 3.1

(Discrete Gronwall inequality; Emmrich, [5]) Let \(\alpha _n,\beta _n,\xi _n,\varphi _n\) be nonnegative numbers, with a monotonically increasing sequence \((\xi _n)_{n\geqslant 2},\) satisfying the inequalities

$$\begin{aligned} \alpha _n+\beta _n\leqslant \sum _{i=2}^{n-1}\varphi _i\alpha _i+\xi _n,\quad n=2,3,\dotsc . \end{aligned}$$

Then, the following estimate is valid

$$\begin{aligned} \alpha _n+\beta _n\leqslant \xi _n\exp \Bigg (\sum _{i=2}^{n-1}\varphi _i\Bigg ), \quad n=2,3,\dotsc . \end{aligned}$$
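The lemma can be sanity-checked on synthetic data; the Python sketch below (illustrative, with arbitrary random coefficients) builds sequences that satisfy the hypothesis with equality and verifies the asserted exponential bound.

```python
import math, random

random.seed(1)
N = 50
phi = [random.uniform(0.0, 0.3) for _ in range(N)]   # phi_2, phi_3, ...
xi = [1.0 + 0.05 * n for n in range(N)]              # nondecreasing xi_n

alpha, beta = [], []
for n in range(N):
    s = sum(phi[i] * alpha[i] for i in range(n))     # sum_{i<n} phi_i alpha_i
    alpha.append(0.6 * (s + xi[n]))                  # any split with
    beta.append(0.4 * (s + xi[n]))                   # alpha_n + beta_n <= s + xi_n

for n in range(N):                                   # the Gronwall bound
    bound = xi[n] * math.exp(sum(phi[:n]))
    assert alpha[n] + beta[n] <= bound + 1e-9
```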

3.1 Estimation of the terms accounting for the difference quotient

Let us first focus on the first term on the left-hand side of (1.7).

Lemma 3.2

(Estimation of \({\mathscr {D}}_n\)) Assume that \(0<\delta<1,-1<\eta <0\) with \(2-\delta +2\eta \geqslant 0\), \(1+\delta +3\eta \geqslant 0\), and \(r_j\leqslant r, j=2,\dotsc ,N,\) with r such that

$$\begin{aligned} r\leqslant \frac{\sqrt{1+\delta +3\eta }}{2\sqrt{1+\eta }-\sqrt{1+\delta +3\eta }} =:r^\star (\delta ,\eta ). \end{aligned}$$
(3.1)

Then,

$$\begin{aligned} \sum _{j=2}^{n}{\mathscr {D}}_j\geqslant (1-\delta -\eta ) \Bigg ( (1-\psi _n)|u^n|^2-\psi _{n-1}|u^{n-1}|^2 -|u^1|^2 -\sum _{j=2}^{n-2}[\psi _{j}-\psi _{j+2}]_{+}|u^j|^2\Bigg ) -\big [-\eta +\left( 2-\delta +2\eta \right) \psi _2\big ]|\delta _1u^1|^2, \end{aligned}$$
(3.2)

\(n=3,\dotsc ,N.\) For \(n=2,\) (3.2) is also valid without the second and fourth terms on the right-hand side, i.e.,

$$\begin{aligned}{\mathscr {D}}_2\geqslant (1-\delta -\eta ) \Big ( (1-\psi _2)|u^2|^2-|u^1|^2\Big ) -\big [-\eta +\left( 2-\delta +2\eta \right) \psi _2\big ]|\delta _1u^1|^2.\end{aligned}$$

Proof

We shall estimate each term \({\mathscr {D}}_j\) from below separately and subsequently sum over j to obtain (3.2).

Using (1.6) and expanding \({\mathscr {D}}_n\) in (1.8), we have

$$\begin{aligned} {\mathscr {D}}_n=I_1^n+I_2^n+I_3^n+I_4^n+I_5^n+I_6^n \end{aligned}$$
(3.3)

with

$$\begin{aligned} \left\{ \begin{aligned} I_1^n&=2(\delta _1u^n,u^n),&I_2^n&=-2\psi _n(\delta _2u^n,u^n),&I_3^n&=-2\delta (\delta _1u^n,u^{n-1}),\\ I_4^n&=2\delta \psi _n(\delta _2u^n,u^{n-1}),&I_5^n&=-2\eta (\delta _1u^n,u^{n-2}),&I_6^n&=2\eta \psi _n(\delta _2u^n,u^{n-2}). \end{aligned} \right. \end{aligned}$$
(3.4)

Using the identities

$$\begin{aligned} 2(\delta _ku^n,u^n)=\delta _k|u^n|^2+|\delta _ku^n|^2,\quad 2(\delta _ku^n,u^{n-k})=\delta _k|u^n|^2-|\delta _ku^n|^2, \end{aligned}$$

we see that

$$\begin{aligned} I_1^n&=\delta _1|u^n|^2+|\delta _1u^n|^2,&I_2^n&=-\psi _n\big (\delta _2|u^n|^2+|\delta _2u^n|^2\big ),\\ I_3^n&=-\delta \big (\delta _1|u^n|^2- |\delta _1u^n|^2\big ),&I_6^n&=\eta \psi _n\big (\delta _2|u^n|^2-|\delta _2u^n|^2\big ). \end{aligned}$$

Furthermore, since \(\delta _2u^n=\delta _1u^n+\delta _1u^{n-1}\), we have

$$\begin{aligned} \begin{aligned} I_4^n&=2\delta \psi _n(\delta _1u^n+\delta _1u^{n-1},u^{n-1})\\&=\delta \psi _n\big (\delta _2|u^n|^2-|\delta _1u^n|^2 +|\delta _1u^{n-1}|^2\big ),\\ I_5^n&=-2\eta (\delta _1u^n,u^{n-2}) =-2\eta (\delta _2u^n,u^{n-2})+2\eta (\delta _1u^{n-1},u^{n-2})\\&=-\eta \big (\delta _1|u^n|^2-|\delta _2u^n|^2+|\delta _1u^{n-1}|^2\big ). \end{aligned} \end{aligned}$$

Collecting terms, we therefore obtain from (3.3) and (3.4)

$$\begin{aligned} {\mathscr {D}}_n= J_1^n+(1+\delta -\delta \psi _n)|\delta _1u^n|^2+(\delta \psi _n-\eta )|\delta _1u^{n-1}|^2 +(\eta -\eta \psi _n-\psi _n)|\delta _2u^n|^2\geqslant J_1^n+J_2^n \end{aligned}$$
(3.5)

with

$$\begin{aligned} J_1^n=(1-\delta -\eta )\big (\delta _1|u^n|^2-\psi _n\delta _2|u^n|^2\big ),\quad J_2^n=A_n|\delta _1u^n|^2-B_n|\delta _1u^{n-1}|^2, \end{aligned}$$

where

$$\begin{aligned} A_n:=1+\delta +2\eta -(2+\delta +2\eta )\psi _n,\quad B_n:=-\eta +(2-\delta +2\eta )\psi _n; \end{aligned}$$

in the derivation of the inequality in (3.5), we used the obvious estimate \(|\delta _2u^n|^2\leqslant 2|\delta _1u^n|^2+2|\delta _1u^{n-1}|^2\).

Now,

$$\begin{aligned} \begin{aligned} \sum _{j=2}^{n}J_1^j&=(1-\delta -\eta ) \Bigg (\left( 1-\psi _n\right) |u^n|^2-\psi _{n-1}|u^{n-1}|^2-|u^1|^2 -\sum _{j=2}^{n-2}\left( \psi _{j}-\psi _{j+2}\right) |u^j|^2\Bigg )\\&\quad +\,(1-\delta -\eta )\big (\psi _2|u^0|^2+\psi _3|u^1|^2\big ). \end{aligned} \end{aligned}$$

Hence, noting that \(\delta +\eta <1,\) we have

$$\begin{aligned} \sum _{j=2}^{n}J_1^j \geqslant (1-\delta -\eta ) \Bigg ( (1-\psi _n)|u^n|^2-\psi _{n-1}|u^{n-1}|^2 -|u^1|^2 -\sum _{j=2}^{n-2}[\psi _{j}-\psi _{j+2}]_{+}|u^j|^2\Bigg ). \end{aligned}$$
(3.6)

Moreover,

$$\begin{aligned} \sum _{j=2}^{n}J_2^j= \sum _{j=2}^{n}\big (A_j|\delta _1u^j|^2 -B_j|\delta _1u^{j-1}|^2\big ) = \sum _{j=2}^{n-1}(A_j-B_{j+1})|\delta _1u^j|^2 +A_n|\delta _1u^n|^2-B_2|\delta _1u^1|^2. \end{aligned}$$
(3.7)

We now show that \(A_j-B_{j+1}\geqslant 0\) whenever \(r_j\leqslant r\) for all j, with r satisfying (3.1). Since \(2+\delta +2\eta >0\) and \(2-\delta +2\eta \geqslant 0,\) \(A_j\) and \(B_j\) are decreasing and increasing functions of \(r_j\), respectively; thus,

$$\begin{aligned} \begin{aligned} A_j-B_{j+1}&\geqslant 1+\delta +2\eta -\left( 2+\delta +2\eta \right) \Big (\frac{r}{1+r}\Big )^2 +\eta -\left( 2-\delta +2\eta \right) \Big (\frac{r}{1+r}\Big )^2\\&=1+\delta +3\eta -4\left( 1+\eta \right) \Big (\frac{r}{1+r}\Big )^2. \end{aligned} \end{aligned}$$

In view of (3.1), there holds \(A_j-B_{j+1}\geqslant 0,\) and (3.7) yields

$$\begin{aligned} \sum _{j=2}^{n}J_2^j\geqslant -B_2|\delta _1u^1|^2. \end{aligned}$$
(3.8)

The asserted estimate (3.2) is an immediate consequence of (3.5), (3.6), and (3.8).\(\square \)
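The sign condition \(A_j-B_{j+1}\geqslant 0\) at the heart of this proof is easy to verify numerically; the Python sketch below (a sanity check only) confirms that \(A-B\) vanishes exactly at \(r=r^\star (\delta ,\eta )\) and is nonnegative for smaller ratios.

```python
import math

delta, eta = 0.9672, -0.1793
s = 1 + delta + 3 * eta
r_star = math.sqrt(s) / (2 * math.sqrt(1 + eta) - math.sqrt(s))  # formula (3.1)

psi = lambda rho: (rho / (1 + rho)) ** 2
A = lambda rho: 1 + delta + 2 * eta - (2 + delta + 2 * eta) * psi(rho)
B = lambda rho: -eta + (2 - delta + 2 * eta) * psi(rho)

# A is decreasing and B increasing in rho, so the worst case is rho = r_star,
# where A - B = 1 + delta + 3*eta - 4*(1+eta)*psi(r_star) = 0 exactly
assert abs(A(r_star) - B(r_star)) < 1e-12
for rho in (0.5, 1.0, 1.5, r_star):
    for rho2 in (0.5, 1.0, 1.5, r_star):
        assert A(rho) - B(rho2) >= -1e-12
```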

3.2 Estimation of the terms \(\pmb {\mathscr {A}}_n\) accounting for the elliptic operator

Here, we shall estimate the sum of the terms \({\mathscr {A}}_n\) from below. Lemma 2.1 plays a key role in the proof.

Lemma 3.3

(Estimation of \({\mathscr {A}}_n\)) Let \(\delta =0.9672\) and \(\eta =-0.1793\), and assume that \(r_j\leqslant r^\star (0.9672,-0.1793) \approx 1.9398, j=2,\dotsc ,N;\) see (3.1). Then,

$$\begin{aligned} \frac{1}{2} \sum _{j=2}^{n}{\mathscr {A}}_j \geqslant c_1\sum _{j=2}^{n}k_j\Vert u^{j}\Vert ^2-\delta \omega _2k_2\langle u^{2}, u^{1}\rangle -\eta \omega _2k_2\langle u^{2}, u^{0}\rangle -\eta \omega _3k_3\langle u^{3}, u^{1}\rangle , \end{aligned}$$
(3.9)

\(n=3,\dotsc ,N,\) with \(c_1=10^{-6};\) for \(n=2,\) (3.9) is also valid without the last term on the right-hand side.

Proof

We rewrite the sum on the left-hand side of (3.9) in the form

$$\begin{aligned} \frac{1}{2} \sum _{j=2}^{n}{\mathscr {A}}_j =\sum _{i,j=1}^{n-1}L_{ij}\langle u^{i+1}, u^{j+1}\rangle -\delta \omega _2k_2\langle u^{2}, u^{1}\rangle -\eta \omega _2k_2\langle u^{2}, u^{0}\rangle -\eta \omega _3k_3 \langle u^{3}, u^{1}\rangle , \end{aligned}$$
(3.10)

with \(L_{ij}\) the entries of the matrix \(L\in \mathbb {R}^{(n-1)\times (n-1)},\)

$$\begin{aligned} L:=\begin{pmatrix} \omega _2k_2 & & & & \\ -\delta \omega _3k_3 & \omega _3k_3 & & & \\ -\eta \omega _4k_4 & -\delta \omega _4k_4 & \omega _4k_4 & & \\ & \ddots & \ddots & \ddots & \\ & & -\eta \omega _nk_n & -\delta \omega _nk_n & \omega _nk_n \end{pmatrix}. \end{aligned}$$
(3.11)

With \({\mathbb {L}}(r_2,\dotsc ,r_n)\) the matrix in (2.1) and \(\varLambda \) the diagonal matrix

$$\begin{aligned} \varLambda :={{\,\textrm{diag}\,}}\Big (\frac{1}{k_2},\frac{1}{k_3},\dotsc ,\frac{1}{k_n}\Big ), \end{aligned}$$

it is easily seen that \({\mathbb {L}}(r_2,\dotsc ,r_n)=\varLambda ^{1/2}L\varLambda ^{1/2}.\)

It suffices to show that

$$\begin{aligned} ({\mathbb {L}}(r_2,\dotsc ,r_n)x,x)_2\geqslant c_1\Vert x\Vert _2^2\quad \forall x\in \mathbb {R}^{n-1}. \end{aligned}$$
(3.12)

Indeed, then, the first term on the right-hand side of (3.10) is larger than or equal to \(c_1\big (k_2\Vert u^2\Vert ^2+\cdots +k_n\Vert u^n\Vert ^2\big ),\) and thus (3.12) leads to the asserted estimate (3.9).

To see that (3.12) is valid for \(n\geqslant 4,\) we note that the quadratic polynomial p in (2.3), with r replaced by \(r^\star ,\) attains its minimum over \([-1,1]\) at \(y^\star =\frac{-\delta \sqrt{r^\star }}{4\eta r^\star }.\) For \(\delta =0.9672\) and \(\eta =-0.1793,\) we have \(r\leqslant r^\star (0.9672,-0.1793) \approx 1.9398\) by inequality (3.1), and indeed \(0<y^\star <1.\) Furthermore,

$$\begin{aligned} p(y^\star )\approx 7.3592\cdot 10^{-6}>c_1. \end{aligned}$$

Notice that (3.12) is valid also for \(n=2\) and \(n=3\). Indeed, for \(n=2,\) \(({\mathbb {L}}(r_2)x,x)_2 =\frac{1}{1+r_2}\Vert x\Vert _2^2\geqslant \frac{1}{1+r^\star }\Vert x\Vert _2^2\geqslant c_1\Vert x\Vert _2^2,\) and for \(n=3,\) \(({\mathbb {L}}(r_2,r_3)x,x)_2=\frac{1}{1+r_2}x_2^2-\delta \frac{\sqrt{r_3}}{1+r_3}x_2x_3+\frac{1}{1+r_3}x_3^2 \geqslant \big (\frac{1}{1+r^\star }-\frac{\delta }{2}\frac{\sqrt{r_3}}{1+r_3}\big )\Vert x\Vert _2^2 \geqslant \big (\frac{1}{1+r^\star }-\frac{\delta }{4}\big )\Vert x\Vert _2^2\geqslant c_1\Vert x\Vert _2^2.\)

Now, in view of (3.12), (3.10) and (2.2) lead to the asserted estimate (3.9).

For the motivation of the specific choice of the multipliers \(\delta \) and \(\eta \), see Sect. 4.\(\square \)

3.3 Proof of Theorem 1.1

Here, we use Lemmata 3.2 and 3.3, the discrete Gronwall inequality in Lemma 3.1, and elementary inequalities, to prove Theorem 1.1.

Replacing n by j in (1.7), summing from \(j=2\) to \(j=n,\) and using (3.2) and (3.9), we obtain

$$\begin{aligned} \begin{aligned}&\left( 1-\delta -\eta \right) \left( 1-\psi _n\right) |u^n|^2 +2c_1\sum _{j=2}^{n}k_j\Vert u^{j}\Vert ^2\\&\quad \leqslant \left( 1-\delta -\eta \right) \psi _{n-1}|u^{n-1}|^2+C \left( |u^0|^2+|u^1|^2\right) +\left( 1-\delta -\eta \right) \sum _{j=2}^{n-2}[\psi _{j}-\psi _{j+2}]_{+}|u^j|^2\\&\qquad +\,\sum _{j=2}^{n}{\mathscr {F}}_j+2\delta \omega _2k_2\langle u^{2}, u^{1}\rangle +2\eta \omega _2k_2\langle u^{2}, u^{0}\rangle +2\eta \omega _3k_3\langle u^{3}, u^{1}\rangle . \end{aligned} \end{aligned}$$

Now, the terms involving the forcing term or the starting approximations can be estimated by the Cauchy–Schwarz inequality and the elementary inequality

$$\begin{aligned} 2ab\leqslant \varepsilon a^2+\varepsilon ^{-1}b^2,\quad a,b\in \mathbb {R}, \end{aligned}$$

with \(\varepsilon >0\) small enough. We obtain

$$\begin{aligned} {\mathscr {F}}_j\leqslant \omega _jk_j\varepsilon _1^{-1}(1+\delta -\eta )\Vert f^j\Vert _\star ^2+ \omega _jk_j\varepsilon _1\left( \Vert u^j\Vert ^2+\delta \Vert u^{j-1}\Vert ^2-\eta \Vert u^{j-2}\Vert ^2\right) \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} 2|\langle u^i,u^j \rangle |\leqslant \varepsilon _2 \Vert u^i\Vert ^2+\varepsilon _2^{-1}\Vert u^j\Vert ^2,~~i=2,3,~~j=0,1, \end{aligned} \end{aligned}$$

with sufficiently small \(\varepsilon _1\) and \(\varepsilon _2,\) and we are led to the inequality

$$\begin{aligned} \begin{aligned} |u^n|^2 +c_1\sum _{j=2}^{n}k_j\Vert u^{j}\Vert ^2&\leqslant \frac{\psi _{n-1}}{1-\psi _{n}}\left| u^{n-1}\right| ^2 +C \big (|u^0|^2+|u^1|^2+k_2\Vert u^0\Vert ^2+k_2\Vert u^1\Vert ^2\big )\\&\quad +\,C\sum _{j=2}^{n-2}[\psi _{j}-\psi _{j+2}]_{+}|u^j|^2 +C\sum _{j=2}^{n}k_j\Vert f^j\Vert _\star ^2,~~n\geqslant 2. \end{aligned} \end{aligned}$$

Since \(\frac{\psi _{n-1}}{1-\psi _{n}}\leqslant \bar{c}<1\), and \([\psi _{j}-\psi _{j+2}]_{+}\leqslant C[r_{j}-r_{j+2}]_{+}\) (see [12, p. 179]), we have

$$\begin{aligned} |u^n|^2 +c_1\sum _{j=2}^{n}k_j\Vert u^{j}\Vert ^2\leqslant \bar{c}|u^{n-1}|^2 +C\left( |u^0|^2+|u^1|^2+k_2\Vert u^0\Vert ^2+k_2\Vert u^1\Vert ^2\right) +C\sum _{j=2}^{n-2}[r_{j}-r_{j+2}]_{+}|u^j|^2 +C\sum _{j=2}^{n}k_j\Vert f^j\Vert _\star ^2,\quad n\geqslant 2. \end{aligned}$$
(3.13)

Hence, we have

$$\begin{aligned} \begin{aligned} |u^n|^2\leqslant \bar{c}|u^{n-1}|^2+K_n,\quad n\geqslant 2, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} K_n=C\Bigg (|u^0|^2+|u^1|^2+k_2\Vert u^0\Vert ^2+k_2\Vert u^1\Vert ^2 +\sum _{j=2}^{n-2}[r_{j}-r_{j+2}]_{+}|u^j|^2+\sum _{j=2}^{n}k_j\Vert f^j\Vert _\star ^2\Bigg ). \end{aligned}$$

Let \(2\leqslant n_\star \leqslant n\) be such that \(|u^{n_\star }| =\max \nolimits _{1\leqslant \ell \leqslant n}|u^\ell |\). Setting \(n:=n_\star \) in the above inequality and using the fact that \(K_{n_\star }\leqslant K_n\), we get

$$\begin{aligned} |u^{n_\star }|^2\leqslant \bar{c}|u^{{n_\star }-1}|^2+K_{n_\star }\leqslant \bar{c}|u^{{n_\star }}|^2+K_n, \end{aligned}$$

which leads to

$$\begin{aligned} |u^{{n}-1}|^2\leqslant |u^{n_\star }|^2\leqslant \frac{1}{1-\bar{c}}K_n. \end{aligned}$$

Thus, (3.13) yields

$$\begin{aligned} |u^n|^2 +c_1\sum _{j=2}^{n}k_j\Vert u^{j}\Vert ^2 \leqslant \bar{c} |u^{n-1}|^2+K_n \leqslant \frac{1}{1-\bar{c}}K_n. \end{aligned}$$

Applying here the discrete Gronwall Lemma 3.1, we obtain the asserted stability estimate (1.4).\(\square \)

Remark 3.1

Proceeding as in the proof of Theorem 1.1, we can see that Emmrich's bound \(r^\star \approx 1.9104,\) corresponding to the multipliers \(\delta =0.72349\) and \(\eta =0,\) is optimal in the case of a single multiplier, as far as the positive definiteness of the corresponding matrices is concerned.

4 On the choice of the multipliers \(\delta \) and \(\eta \)

Here, we comment on the choice \(\delta =0.9672\) and \(\eta =-0.1793\) of the multipliers; we also give more precise values of the multipliers and of the bound \(r^\star \); see (4.17).

We recall that in our stability analysis we used two conditions on the bound r of the ratios, namely,

$$\begin{aligned} 0<r\leqslant r^\star (\delta ,\eta )=\frac{\sqrt{1+\delta +3\eta }}{2\sqrt{1+\eta }-\sqrt{1+\delta +3\eta }} \end{aligned}$$
(4.1)

and the positivity condition

$$\begin{aligned} P(y)=(1+r)p(y)=1+\eta r-\delta \sqrt{r}y-2\eta ry^2> 0\quad \forall y\in [-1,1] \end{aligned}$$
(P)

to estimate the terms accounting for the difference quotient and for the elliptic operator, respectively; see (3.1) and (2.3). Our goal here is to choose the multipliers \(\delta \) and \(\eta \) in such a way that both conditions, (4.1) and (P), are satisfied for values of r as large as possible.

Let us focus on the condition (P) and introduce the domain

$$\begin{aligned} D:=\big \{(\delta ,\eta ): 0<\delta<2,-1<\eta < 0,1+\delta +3\eta \geqslant 0\big \}=D_1\cup D_2 \end{aligned}$$
(4.2)

with

$$\begin{aligned} D_1:=\big \{(\delta ,\eta )\in D: \dfrac{3}{16}\delta ^2\leqslant -\eta \big \},\quad D_2:=\big \{(\delta ,\eta )\in D: \dfrac{3}{16}\delta ^2> -\eta \}; \end{aligned}$$

see Fig. 1. Notice that, instead of trying to determine optimal multipliers in the set of admissible multipliers, i.e., multipliers satisfying all conditions needed in our stability analysis, we find it more convenient to determine optimal multipliers in the larger domain D, in which only some of these conditions are automatically satisfied, and to check a posteriori that these multipliers are indeed admissible.

Fig. 1 The domains D (colored region), \(D_1,\) and \(D_2,\) as well as the segments \(L_t\); see (4.2) and (4.7)

Claim. For \((\delta , \eta )\in D,\) the positivity condition (P) is satisfied if and only if

$$\begin{aligned} r< h(\delta , \eta ):={\left\{ \begin{array}{ll} -\dfrac{1}{\eta }-\dfrac{\delta ^2}{8\eta ^2}, &{} (\delta , \eta )\in D_1, \\ \left( \dfrac{2}{\delta +\sqrt{\delta ^2+4\eta }}\right) ^2, &{} (\delta , \eta )\in D_2. \end{array}\right. } \end{aligned}$$
(4.3)

To see this, we consider separately the cases \((\delta , \eta )\in D_1\) and \((\delta , \eta )\in D_2\).

We write P in the form

$$\begin{aligned} P(y)=-2\eta r\Big (y+\frac{\delta }{4\eta \sqrt{r}}\Big )^2 +1+\eta r+\frac{\delta ^2}{8\eta },\quad y\in [-1,1]. \end{aligned}$$
(4.4)

Since the first term on the right-hand side is nonnegative, P is positive in \([-1,1]\) provided that \(1+\eta r+\frac{\delta ^2}{8\eta }\) is positive, i.e.,

$$\begin{aligned} r< -\dfrac{1}{\eta }-\dfrac{\delta ^2}{8\eta ^2}. \end{aligned}$$
(4.5)

Notice that (4.5) is also necessary if \(-\frac{\delta }{4\eta \sqrt{r}}\leqslant 1.\)

For \((\delta ,\eta )\in D_1,\) in case \(-\frac{\delta }{4\eta \sqrt{r}}> 1,\) i.e., for \(r<\frac{\delta ^2}{16\eta ^2},\) a seemingly milder condition for the positivity of P in \([-1,1]\) suffices, namely, \(P(1)> 0.\) However, we have

$$\begin{aligned}\frac{\delta ^2}{16\eta ^2}\leqslant -\dfrac{1}{\eta }-\dfrac{\delta ^2}{8\eta ^2} \iff \frac{3}{16}\delta ^2 \leqslant -\eta ,\end{aligned}$$

which is the motivation for the definition of \(D_1\).

Summarizing, for \((\delta ,\eta )\in D_1, P\) is positive in \([-1,1]\) if and only if (4.5) holds; this proves (4.3) for \((\delta ,\eta )\in D_1.\)

Next, we consider the case \((\delta , \eta )\in D_2\). For \(0<-\dfrac{\delta }{4\eta \sqrt{r}}\leqslant 1\), i.e., for \(r\geqslant \frac{\delta ^2}{16\eta ^2},\) we have

$$\begin{aligned}1+\eta r+\frac{\delta ^2}{8\eta }\leqslant 1+\dfrac{3\delta ^2}{16\eta }<0 \quad \text {since}\quad \dfrac{3}{16}\delta ^2> -\eta \quad \text {for}\quad (\delta ,\eta )\in D_2,\end{aligned}$$

and we easily infer from (4.4) that (P) is not satisfied.

For \(-\dfrac{\delta }{4\eta \sqrt{r}}>1\), i.e., for \(0<\sqrt{r}<-\dfrac{\delta }{4\eta }\), P is positive in \([-1,1]\) if and only if

$$\begin{aligned} P(1)=-\eta \left( \sqrt{r}+\frac{\delta }{2\eta }\right) ^2+1+\frac{\delta ^2}{4\eta }> 0. \end{aligned}$$

The discriminant \(\delta ^2+4\eta \) is positive for \((\delta ,\eta )\in D_2\), whence P(1), viewed as a quadratic polynomial in \(\sqrt{r}\), has the two real roots \(\sqrt{r}=\dfrac{\delta \pm \sqrt{\delta ^2+4\eta }}{-2\eta }\). In this case, we have

$$\begin{aligned}0<\sqrt{r}< \dfrac{\delta -\sqrt{\delta ^2+4\eta }}{-2\eta }<-\dfrac{\delta }{4\eta } \quad \text {since}\quad \dfrac{3}{16}\delta ^2> -\eta \quad \text {for}\quad (\delta ,\eta )\in D_2.\end{aligned}$$

Summarizing, for \((\delta , \eta )\in D_2, P\) is positive in \([-1,1]\) if and only if

$$\begin{aligned} r< \left( \dfrac{2}{\delta +\sqrt{\delta ^2+4\eta }}\right) ^2, \quad (\delta , \eta )\in D_2; \end{aligned}$$

this proves (4.3) for \((\delta ,\eta )\in D_2.\)
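The claim can also be checked numerically: for sample points in \(D_1\) and \(D_2\), condition (P) should hold on \([-1,1]\) for r slightly below the threshold \(h(\delta ,\eta )\) of (4.3) and fail slightly above it. The following sketch (the names `h` and `P_positive` are ours) does exactly this on a fine grid:

```python
import math

def h(delta, eta):
    """Threshold (4.3) for condition (P)."""
    if 3 * delta**2 / 16 <= -eta:                             # (delta, eta) in D1
        return -1 / eta - delta**2 / (8 * eta**2)
    return (2 / (delta + math.sqrt(delta**2 + 4 * eta)))**2   # (delta, eta) in D2

def P_positive(delta, eta, r, samples=20001):
    # check P(y) > 0 on a grid of [-1, 1] (endpoints included)
    return all(
        1 + eta * r - delta * math.sqrt(r) * y - 2 * eta * r * y * y > 0
        for y in (-1 + 2 * k / (samples - 1) for k in range(samples))
    )

for delta, eta in [(0.5, -0.2), (1.5, -0.2)]:  # sample points in D1 and D2
    rc = h(delta, eta)
    print(P_positive(delta, eta, 0.99 * rc), P_positive(delta, eta, 1.01 * rc))
    # expect: True False (for both sample points)
```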

Obviously, the mildest condition on r such that (4.1) and (4.3) are satisfied is

$$\begin{aligned} r< \max _{(\delta ,\eta )\in D} \min \{r^\star (\delta , \eta ), h(\delta , \eta )\}. \end{aligned}$$
(4.6)

It will be convenient to rewrite the expression on the right-hand side of (4.6). Since \(\frac{1+\eta }{2-\delta }\in [1/3, \infty )\) for \((\delta ,\eta )\in D,\) we let

$$\begin{aligned} t:=\frac{1+\eta }{2-\delta }\quad \text {with}\quad t\in \left[ 1/3, \infty \right) , \end{aligned}$$

and, for fixed t,  consider the secant segments \(L_t\subset D,\)

$$\begin{aligned} L_t:\quad \eta =t(2-\delta )-1\quad \text {for}\quad \eta \in (-1,0). \end{aligned}$$
(4.7)

Notice that the secant segment \(L_{1/3}\) is the segment \(1+\delta +3\eta =0\), and, as t increases from 1/3 to \(\infty ,\) the secant segment \(L_t\) rotates clockwise about the point \((2,-1)\) and approaches the right boundary of the domain D; hence the segments \(L_t\) sweep the whole domain D; see the colored part in Fig. 1. Consequently, (4.6) can be equivalently written in the form

$$\begin{aligned} r< \max _{t\in [1/3,\infty )} \max _{(\delta ,t(2-\delta )-1)\in L_t} \min \{r^\star (\delta , t(2-\delta )-1), h(\delta , t(2-\delta )-1)\}. \end{aligned}$$
(4.8)

From (4.7) and (4.1), we get

$$\begin{aligned} H(t):=r^\star (\delta ,t(2-\delta )-1) =\dfrac{\sqrt{(3t-1)(2-\delta )}}{2\sqrt{t(2-\delta )}-\sqrt{(3t-1)(2-\delta )}} =\dfrac{\sqrt{3t-1}}{2\sqrt{t}-\sqrt{3t-1}}. \end{aligned}$$
(4.9)

Analogously, in view of (4.3) and (4.7), we let

$$\begin{aligned} G(t):=\max _{\{\delta : (\delta ,t(2-\delta )-1)\in L_t\}} h(\delta , t(2-\delta )-1)\quad \text {for}\quad t\in \left[ 1/3, \infty \right) \end{aligned}$$
(4.10)

with

$$\begin{aligned} h(\delta , t(2-\delta )-1) ={\left\{ \begin{array}{ll} -\dfrac{1}{t(2-\delta )-1}-\dfrac{\delta ^2}{8\left( t(2-\delta )-1\right) ^2}, &{} (\delta ,t(2-\delta )-1)\in D_1, \\ \left( \dfrac{2}{\delta +\sqrt{\delta ^2+4\left( t(2-\delta )-1\right) }}\right) ^2, &{} (\delta ,t(2-\delta )-1)\in D_2. \end{array}\right. } \end{aligned}$$
(4.11)

According to (4.9) and (4.10), inequality (4.8) can be written as

$$\begin{aligned} r< \max _{t\in [1/3,\infty )} \min \{H(t), G(t)\}. \end{aligned}$$

Next, we consider the maximum of \(h(\delta , t(2-\delta )-1)\) in (4.11) over the segment \(L_t\).

For the points \((\delta ,t(2-\delta )-1)\in L_t\cap D_2\), according to (4.11), we have

$$\begin{aligned} \begin{aligned} \dfrac{\partial h(\delta ,t(2-\delta )-1)}{\partial \delta }&=\frac{-8\left( \sqrt{\delta ^{2} + 4\left( t(2-\delta )-1\right) }+\delta -2t\right) }{\left( \delta + \sqrt{\delta ^{2} + 4\left( t(2-\delta )-1\right) }\right) ^{3} \sqrt{\delta ^{2} + 4 \left( t(2-\delta )-1\right) }}\\&=\frac{4\left( \sqrt{\delta ^{2} \!+\! 4\left( t(2-\delta )-1\right) }\!+\!\delta - 2\right) ^2}{(2-\delta )\left( \delta + \sqrt{\delta ^{2} + 4\left( t(2-\delta )-1\right) }\right) ^{3} \sqrt{\delta ^{2} \!+\! 4\left( t(2-\delta )-1\right) }} \geqslant 0. \end{aligned} \end{aligned}$$

Notice that \(t(2-\delta )-1=\eta \in (-1,0)\) and \(\delta ^2+4\eta \) is positive. Therefore, \(h(\delta ,t(2-\delta )-1)\) is increasing with respect to \(\delta \) on the segment \(L_t\cap D_2\), which implies that its maximum over \(L_t\cap D_2\) is attained at the endpoint on the curve \(\dfrac{3}{16}\delta ^2=-\eta \). Notice also that this curve lies in \(D_1\). Hence, we only need to consider the points \((\delta , t(2-\delta )-1)\in D_1\).

For points \((\delta ,t(2-\delta )-1)\in L_t\cap D_1\), in view of (4.11), we see that

$$\begin{aligned} \begin{aligned} \dfrac{\partial h(\delta ,t(2-\delta )-1)}{\partial \delta }&=-\frac{1}{4\left( t(2-\delta )-1\right) ^{3}}\left[ \delta \left( t(2-\delta )-1\right) +\left( \delta ^{2} + 4\left( t(2-\delta )-1\right) \right) t\right] \\&=-\frac{1}{4\left( t(2-\delta )-1\right) ^{3}}\rho (\delta ) \end{aligned} \end{aligned}$$

with

$$\begin{aligned} t(2-\delta )-1=\eta \in (-1,0),\quad \rho (\delta ):=-\left( 4t^2-2t+1\right) \delta +8t^2-4t. \end{aligned}$$

Notice that \(\rho \) is a decreasing function of \(\delta \) since it is linear and \(4t^2-2t+1=4(t-\frac{1}{4})^2+\frac{3}{4}>0\).

If \(t\in [1/3, 1/2)\), then \(\rho (\delta )<\rho (0)=8t^2-4t<0\). Therefore, \(h(\delta ,t(2-\delta )-1)\) is decreasing with respect to \(\delta \) and attains its maximum on the secant segment \(L_t\) at \(\delta =0\). From (4.7), we infer that \(\eta =2t-1\). According to (4.3), we have

$$\begin{aligned} r< h(\delta ,\eta )=-\dfrac{1}{\eta }=\dfrac{1}{1-2t}. \end{aligned}$$
(4.12)

If \(t=1/2\), then \(\rho (\delta )=-\delta <0\), so that \(h(\delta ,-\delta /2)=\frac{2}{\delta }-\frac{1}{2}\) is decreasing with respect to \(\delta \) on \(L_{1/2}\) and tends to \(\infty \) as \(\delta \searrow 0\); the supremum is thus \(+\infty \). If \(t\in (1/2,\infty )\), then \(8t^2-4t>0\). The root \(\delta ^\star \) of \(\rho \) is

$$\begin{aligned} \delta ^\star =\dfrac{4t(2t-1)}{4t^2-2t+1}. \end{aligned}$$
(4.13)

If \(\delta \in (0,\delta ^\star )\), then \(\rho (\delta )>0\) and \(h(\delta ,t(2-\delta )-1)\) is increasing with respect to \(\delta \). If \(\delta \in (\delta ^\star ,2)\), then \(\rho (\delta )<0\) and \(h(\delta ,t(2-\delta )-1)\) is decreasing with respect to \(\delta \). Therefore, \(h(\delta ,t(2-\delta )-1)\) attains its maximum on the secant segment \(L_t\) at \(\delta ^\star \). From (4.7), we have

$$\begin{aligned} \eta ^\star =t\left( 2-\delta ^\star \right) -1=-\dfrac{\left( 2t-1\right) ^{2}}{4t^{2}-2t+1}. \end{aligned}$$
(4.14)

Therefore, from (4.3), we obtain

$$\begin{aligned} r< h(\delta ^\star ,\eta ^\star )=-\dfrac{1}{\eta ^\star }-\dfrac{\left( \delta ^\star \right) ^2}{8\left( \eta ^\star \right) ^2}=\dfrac{1}{2}+\dfrac{1}{2(2t-1)^2}. \end{aligned}$$
(4.15)

Combining (4.10), (4.12) and (4.15), we have

$$\begin{aligned} G(t)={\left\{ \begin{array}{ll} \dfrac{1}{1-2t}, ~~&{}t\in \left[ 1/3,1/2\right) , \\ +\infty , ~~&{}t=1/2, \\ \dfrac{1}{2}+\dfrac{1}{2(2t-1)^2},~~&{}t\in (1/2,+\infty ). \end{array}\right. } \end{aligned}$$
(4.16)
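The closed form (4.16) can be cross-checked against a brute-force maximization of h over the segment \(L_t\). The following sketch (the names `h` and `G` are ours) performs this comparison for the sample value \(t=0.8\):

```python
import math

def h(delta, eta):
    # threshold (4.3)
    if 3 * delta**2 / 16 <= -eta:
        return -1 / eta - delta**2 / (8 * eta**2)
    return (2 / (delta + math.sqrt(delta**2 + 4 * eta)))**2

def G(t):
    # closed form (4.16), branch t > 1/2
    return 0.5 + 0.5 / (2 * t - 1)**2

t = 0.8
# brute-force maximum of h over L_t: eta = t*(2 - delta) - 1, (delta, eta) in D
best = max(
    h(d, t * (2 - d) - 1)
    for d in (2 * k / 100000 for k in range(1, 100000))
    if -1 < t * (2 - d) - 1 < 0 and 1 + d + 3 * (t * (2 - d) - 1) >= 0
)
print(abs(best - G(t)) < 1e-6)  # True
```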

It is easily seen from (4.9) that H is increasing with respect to \(t\in [1/3,\infty ).\) Furthermore, in view of (4.16), G is increasing in the interval [1/3, 1/2) and decreasing in the interval \((1/2,\infty )\); see Fig. 2. Since \(H(t)<H(1/2)=1<3=G(1/3)\leqslant G(t)\) for \(t\in [1/3,1/2)\), the graphs of H and G do not intersect for \(t\in [1/3,1/2)\).

Fig. 2 The graphs of H and G; see (4.9) and (4.16)

On the other hand, since H is increasing and G is decreasing in \((1/2,\infty )\), the equation \(H(t)=G(t)\) has a unique solution in \((1/2,\infty )\). Indeed, from (4.9) and (4.16), for \(t\in (1/2,\infty )\), we have

$$\begin{aligned} \dfrac{\sqrt{3t-1}}{2\sqrt{t}-\sqrt{3t-1}}=\dfrac{1}{2}+\dfrac{1}{2(2t-1)^2},\quad t\in (1/2,\infty ), \end{aligned}$$

that is,

$$\begin{aligned} 23t^5-55t^4+55t^3-29t^2+8t-1=0,\quad t\in (1/2,\infty ). \end{aligned}$$

Notice that the above polynomial has only one real root, namely \(t \approx 0.794645365827\). Substituting this value of t in (4.13), (4.14), and (4.15), respectively, we obtain the optimal values

$$\begin{aligned} \delta ^\star \approx 0.967237837020572,\quad \eta ^\star \approx -0.179320334471962,\quad r^\star \approx 1.9398285699451. \end{aligned}$$
(4.17)

Let us mention that the multipliers \(\delta ^\star \) and \(\eta ^\star \) are admissible, i.e., they satisfy all conditions in our stability analysis; in particular, \(2-\delta ^\star +2\eta ^\star \geqslant 0,\) which is used in Lemma 3.2 but does not enter into the definition of the domain D.
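The values in (4.17) are reproducible with a few lines of code; the following sketch locates the root of the quintic by bisection on the bracketing interval (1/2, 1) and then evaluates (4.13)–(4.15), including the admissibility check \(2-\delta ^\star +2\eta ^\star \geqslant 0\) (all variable names here are ours):

```python
def f(t):
    # quintic whose root in (1/2, 1) determines the optimal t
    return 23*t**5 - 55*t**4 + 55*t**3 - 29*t**2 + 8*t - 1

# bisection on (0.5, 1): f(0.5) < 0 < f(1)
a, b = 0.5, 1.0
for _ in range(80):
    m = 0.5 * (a + b)
    if f(m) <= 0:
        a = m
    else:
        b = m
t = 0.5 * (a + b)

delta_star = 4 * t * (2 * t - 1) / (4 * t**2 - 2 * t + 1)  # (4.13)
eta_star = -(2 * t - 1)**2 / (4 * t**2 - 2 * t + 1)        # (4.14)
r_star = 0.5 + 0.5 / (2 * t - 1)**2                        # (4.15)

print(t, delta_star, eta_star, r_star)
# t ≈ 0.794645…, delta* ≈ 0.967238…, eta* ≈ -0.179320…, r* ≈ 1.939829…
print(2 - delta_star + 2 * eta_star >= 0)  # admissibility: True
```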