1 Introduction

Many highly effective methods for initial value problems in partial differential equations appear as parametric families of numerical schemes. These include exponential splittings (see, e.g., [8, 25]), where the free parameters constitute the coefficients of the splitting, and rational Krylov methods [20], where the free parameters are the poles of rational approximants.

A technique that uses symbolic algebra to develop bespoke finite difference methods preserving multiple local conservation laws was recently introduced in [14]. This approach has been further refined in [16], and new families of conservative schemes have been introduced for a range of partial differential equations (PDEs) in [13,14,15,16]. These numerical schemes feature certain free parameters that can be chosen arbitrarily without compromising the preservation of the conservation laws.

In all these cases, a convenient choice of the free parameters yields numerical solutions with superior accuracy. Coefficients of exponential splittings are typically determined a priori by algebraic means in the pursuit of high-order accuracy [28] and may be specialized for specific PDEs [31]. Optimal pole selection for rational Krylov methods remains an active area of research, with strategies including a priori choices based on analytical reasoning [17] and a posteriori fitting [7]. Optimal parameters for the finite difference methods in [13,14,15,16] are identified by a brute-force sweep through the entire parameter space, and comparisons against reference solutions show that suitable choices of the parameters yield errors up to 20 times smaller than existing methods on the proposed benchmark tests.

In practice, the optimal choice of such free parameters depends heavily on the initial conditions and may also vary with the time-step. Consequently, while the results in [13,14,15,16] highlight the potential advantages of choosing good parameters, no algorithm for identifying them is known. To overcome this issue, we propose a new approach for adaptively identifying optimal parameters for families of numerical schemes for PDEs, in cases where convenient values are not known a priori.

To obtain estimates of the optimal parameters, we adaptively minimize an estimate of the local error introduced by the time integrator. For this approach to be effective, we assume throughout the paper that the spatial approximation is accurate and that the error in the solution is mainly due to the time discretization. This is not too restrictive an assumption, as in many instances PDEs are approximated very accurately in space, for example by spectral semidiscretizations. In the case of finite difference schemes, this amounts to either considering higher-order discretizations in space, or restricting attention to cases where \(\varDelta t\gg \varDelta x\). Large time-steps reduce the computational expense and are generally desirable, barring potential stability concerns. In particular, \(\varDelta t\gg \varDelta x\) is a typical setting when using implicit schemes.

In the proposed approach, at each time-step of a single step numerical scheme, we seek to compute the optimal parameters that minimize the local error. This requires a reasonably accurate but inexpensive estimate of the local error and its dependence on the parameters. For an a posteriori estimate of the local error, we resort to the “defect” based approach outlined in [6]. In the context of backward error analysis, the defect measures the discrepancy between the differential equation satisfied by the numerical solution and the original equation [30]. Defect based error estimates have been utilized widely in the development of time-adaptive methods for ordinary differential equations (ODEs) (see, e.g., [12, 21]) and PDEs [3, 4, 6], but, to the best of our knowledge, these have not been employed for the estimation of optimal parameters.

In this paper we do not aim for adaptive time stepping. Indeed, unlike adaptive techniques for choosing time-steps, where the local error can be assumed to decrease monotonically with the time-step, in the proposed approach an optimization problem needs to be solved for finding the values of the parameters. The optimization problem for minimizing the local error estimate is solved in an efficient manner by computing the defect on a coarser (but still accurate) spatial grid, and utilizing an iterative method with a Gauss–Newton approximation to the Hessian for achieving fast local convergence [29].

We apply the new procedure to families of schemes introduced in [14] for the Korteweg–de Vries (KdV) equation and a nonlinear heat (NLH) equation. The main feature of these schemes is that each of them preserves specific discretizations of some conservation laws. However, since the discrete conservation laws also depend on the parameters, they cannot be preserved when the parameters change adaptively. Where the conservation of these quantities is of paramount importance, we suggest a more conservative version of the algorithm that uses fixed parameters derived from a sequence of values obtained adaptively.

Although the conservation properties of the schemes in [14] endow them with good stability and accuracy, applying the technique proposed in this paper achieves significantly higher accuracy with moderate overheads. Despite the defect based approach being asymptotic in the time-step, \(\varDelta t\), in practice the procedure also works well in the large time-step regime and, in some cases, confers a notable stability advantage.

In Sect. 2 we discuss the validity of a defect based approximation of the local error for the purpose of adaptively identifying optimal parameters. In Sect. 3 we outline the defect based approach used for finding optimal values of free parameters in a numerical scheme, and introduce the two algorithms briefly outlined above. In Sect. 4, we apply the new techniques to families of conservative schemes introduced in [14] for the KdV equation and the NLH equation, giving explicit expressions for the defect. In Sect. 5, we show numerical results that demonstrate the effectiveness of the proposed algorithms in finding good estimates of the optimal parameters, together with their higher accuracy and efficiency in comparison to a default choice of the parameters and other schemes from the literature. Conclusive remarks are presented in Sect. 6.

2 Defect Based Approximation of Local Error

We consider a PDE,

$$\begin{aligned} \partial _t u(t) = \mathcal {A}(u(t)), \qquad t \ge 0,\qquad u(0) = u_0 \in \mathcal {H}, \end{aligned}$$
(1)

written as an initial value problem on the Hilbert space \(\mathcal {H}\), where \(\mathcal {A}: \mathcal {H} \rightarrow \mathcal {H}\). Boundary conditions and non-autonomous PDEs can also be incorporated into our approach in a straightforward manner, as demonstrated with concrete examples in Sect. 4.2.

Following spatial discretization, the solution of (1) is approximated by the solution of the system of ODEs,

$$\begin{aligned} \mathcal {D}_t \varvec{u}(t) = A(\varvec{u}(t)), \qquad t \ge 0,\quad \varvec{u}(0) = \varvec{u}_0 \in \mathbb {R}^M, \end{aligned}$$
(2)

where here and henceforth \(\mathcal {D}_z\) denotes the total derivative with respect to z, and \(\varvec{u}(t)\) represents a finite dimensional approximation of u(t). For instance, this could involve a finite difference approximation on a uniform grid on the domain [a, b] with Dirichlet boundaries,

$$\begin{aligned} x_m=a+m\varDelta x,\qquad m=0,\ldots ,M+1,\qquad \varDelta x= (b-a)/(M+1). \end{aligned}$$

Let T be the final time of integration,

$$\begin{aligned} t_n=n\varDelta t,\qquad n=0,\ldots ,N,\quad \text {and} \quad \varDelta t=T/N \end{aligned}$$

be the time nodes and stepsize, respectively, \(u_{m,n}\) an approximation of \(u(x_m,\!t_n)\), and \(\varvec{u}_n\) the column vector whose m-th entry is \(u_{m,n}\).

The exact solution of (2) is described by the flow \(\mathcal {E}: \mathbb {R}^+ \times \mathbb {R}^M \rightarrow \mathbb {R}^M\),

$$\begin{aligned} \varvec{u}(t) = \mathcal {E}(t, \varvec{u}_0). \end{aligned}$$

Similarly, a single step numerical scheme for (2) can be described by the numerical flow,

$$\begin{aligned} \varvec{u}_{n+1} = \Phi (\varDelta t, \varvec{u}_n). \end{aligned}$$

Note that the numerical flow \(\Phi \) also exists for implicit methods, even if not specified in an explicit form.

In this manuscript, we consider numerical schemes in the form

$$\begin{aligned} \varvec{u}_{n+1} = \Phi (\varDelta t, \varvec{u}_n, \chi ), \qquad \Phi : \mathbb {R}^+ \times \mathbb {R}^M \times \varOmega \rightarrow \mathbb {R}^M, \end{aligned}$$
(3)

where \(\varOmega \) is a compact subset of \(\mathbb {R}^K\), and \(\Phi \) depends on a vector of free parameters \(\chi \in \varOmega \) that affect the accuracy of the scheme. In our theoretical discussion we assume that the vector field A, the exact flow \(\mathcal {E}\) and the numerical flow \(\Phi \) are smooth with respect to all arguments and that the method,

$$\begin{aligned} \varvec{u}_{n+1} = \Phi (\varDelta t, \varvec{u}_{n}, \chi (t_n)), \end{aligned}$$
(4)

is stable and convergent for arbitrary choices of \(\chi : \mathbb {R}^+ \rightarrow \varOmega \).

The local error in the numerical method (3) is defined as

$$\begin{aligned} \mathcal {L}(\varDelta t,\varvec{u}_n,\chi ) = \Phi (\varDelta t, \varvec{u}_n, \chi ) - \mathcal {E}(\varDelta t, \varvec{u}_n). \end{aligned}$$
(5)

In general, \(\mathcal {L}(\varDelta t,\varvec{u}_n,\chi )\) is not a computable quantity since the exact solution \(\mathcal {E}(\varDelta t, \varvec{u}_n)\) is not available in practice. Consequently, we resort to defect-based approximations (see [5, 6]) to obtain a posteriori estimates. The defect or residual of \(\Phi \),

$$\begin{aligned} \mathcal {R}(\varDelta t, \varvec{u}_n, \chi ) = \mathcal {D}_{\varDelta t} \Phi (\varDelta t, \varvec{u}_n, \chi ) - A(\Phi (\varDelta t, \varvec{u}_n, \chi )), \end{aligned}$$
(6)

quantifies the extent to which the numerical flow \(\Phi \) fails to satisfy (2).
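As a concrete illustration of definition (6) (a sketch of ours, not taken from [14]), the defect is directly computable whenever the derivative of the numerical flow with respect to \(\varDelta t\) is available in closed form. The snippet below assumes an explicit Euler flow for a hypothetical centred-difference semidiscretization of the heat equation with periodic boundaries; for this scheme \(\mathcal {D}_{\varDelta t}\Phi (\varDelta t,\varvec{u})=A(\varvec{u})\) exactly.

```python
import numpy as np

# Defect of the explicit Euler flow Phi(dt, u) = u + dt*A(u) for a
# semidiscretized PDE du/dt = A(u).  A below is a hypothetical
# centred-difference semidiscretization of the heat equation u_t = u_xx
# with periodic boundaries; for this scheme D_dt Phi = A(u) analytically.

M = 64
dx = 2 * np.pi / M
x = dx * np.arange(M)

def A(u):
    return (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2

def defect(dt, u):
    """R(dt, u) = D_dt Phi(dt, u) - A(Phi(dt, u)),  cf. (6)."""
    phi = u + dt * A(u)          # explicit Euler step
    d_dt_phi = A(u)              # exact derivative of Phi w.r.t. dt
    return d_dt_phi - A(phi)

u0 = np.sin(x)
R = defect(1e-3, u0)             # residual of the numerical flow in (2)
```

Since A is linear here, the defect reduces to \(-\varDelta t\, A(A(\varvec{u}))\), so it scales linearly with the time-step, consistent with a first order scheme.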

To study the accuracy of the defect (6) as an approximation of the local error, we resort to the following nonlinear variation-of-constant formula, valid for nonlinear parabolic PDEs and time-reversible equations [4, 11].

Lemma 1

(Gröbner–Alekseev formula) The analytical solutions of the following initial value problems

$$\begin{aligned}&\left\{ \begin{array}{l} \mathcal {D}_t\varvec{u}(t)=H(t,\varvec{u}(t))=G(\varvec{u}(t))+R(t,\varvec{u}(t)),\qquad 0\le t \le T,\\ \varvec{u}(0)=\varvec{u}_0, \end{array}\right. \\&\left\{ \begin{array}{l} \mathcal {D}_t\varvec{u}(t)=G(\varvec{u}(t)),\qquad 0\le t \le T,\\ \varvec{u}(0)=\varvec{u}_0, \end{array}\right. \end{aligned}$$

are related through the nonlinear variation-of-constants formula

$$\begin{aligned} \mathcal {E}_H(t,\varvec{u}_0)\!=\mathcal {E}_G(t,\varvec{u}_0)+\!\int _0^t\!\!\partial _2\mathcal {E}_G(t-\tau ,\mathcal {E}_H(\tau ,\varvec{u}_0))\cdot R(\tau ,\mathcal {E}_H(\tau ,\varvec{u}_0))\textrm{d}\tau , \,\,\, 0\le t\le T. \end{aligned}$$

Here and henceforth, \(\partial _k f\) denotes the Fréchet derivative of a function f with respect to the k-th argument.

Applying Lemma 1 to (2) and (6) yields the following formula for the local error (5),

$$\begin{aligned} \mathcal {L}(\varDelta t, \varvec{u}_n,\chi )&= \int _{0}^{\varDelta t} \partial _2 \mathcal {E}\left( \varDelta t- \tau , {\Phi }(\tau , \varvec{u}_n, \chi ) \right) \cdot \mathcal {R}(\tau , \varvec{u}_n, \chi )\, \textrm{d}\tau \nonumber \\&=: \int _0^{\varDelta t} \varTheta (\tau ,\varvec{u}_n,\chi )\,\textrm{d}\tau . \end{aligned}$$
(7)

Theorem 1

Given a numerical scheme \(\Phi \) (3) of order p, the defect-based estimator

$$\begin{aligned} L(\varDelta t,\varvec{u}_n,\chi ):=\frac{\varDelta t}{p+1}\mathcal {R}(\varDelta t,\varvec{u}_n,\chi ), \end{aligned}$$
(8)

is an asymptotically correct estimator of the local error (5), uniformly for \(\chi \in \varOmega \), i.e.

$$\begin{aligned} \Vert L(\varDelta t,\varvec{u}_n,\chi )-\mathcal {L}(\varDelta t,\varvec{u}_n,\chi )\Vert \le C\varDelta t^{p+2}, \end{aligned}$$

with C independent of \(\chi \).

Proof

Since the method is of order p, then \(\mathcal {L}(\varDelta t,\varvec{u}_n,\chi )=\mathcal {O}(\varDelta t^{p+1}),\) and \(\varTheta (\tau ,\varvec{u}_n,\chi )=\mathcal {O}(\tau ^{p})\) for any \(\varDelta t,\) \(\tau \in [0,\varDelta t]\), and \(\chi \in \varOmega .\) Taylor expanding \(\varTheta \) in \(\tau \) around 0, there exists \(\xi _1 \in [0,\tau ]\) such that

$$\begin{aligned} \varTheta (\tau ,\varvec{u}_n,\chi ) = \frac{\tau ^p}{p!}\partial _1^p\varTheta (0,\varvec{u}_n,\chi )+\frac{\tau ^{p+1}}{(p+1)!}\partial _1^{p+1}\varTheta (\xi _1,\varvec{u}_n,\chi ). \end{aligned}$$
(9)

Therefore,

$$\begin{aligned} \mathcal {L}(\varDelta t,\varvec{u}_n,\chi )&=\int _0^{\varDelta t}\varTheta (\tau ,\varvec{u}_n,\chi )\,\textrm{d}\tau \nonumber \\&= \frac{\varDelta t^{p+1}}{(p+1)!}\partial _1^p\varTheta (0,\varvec{u}_n,\chi ) + \int _0^{\varDelta t}\frac{\tau ^{p+1}}{(p+1)!}\partial _1^{p+1}\varTheta (\xi _1,\varvec{u}_n,\chi )\,\textrm{d}\tau \, . \end{aligned}$$
(10)

Due to smoothness of \(\varTheta \), \(\partial _1^{p+1}\varTheta \) is continuous and is bounded over the compact set \([0,\varDelta t] \times \varOmega \), so that we may define

$$\begin{aligned} \widetilde{C}=\max _{(\xi ,\chi )\in [0,\varDelta t]\times \varOmega }\left\| \partial _1^{p+1}\varTheta (\xi ,\varvec{u}_n,\chi ) \right\| < \infty . \end{aligned}$$

Note that by definition of \(\varTheta \) in (7), \(\varTheta (\varDelta t,\varvec{u}_n,\chi ) = \partial _2 \mathcal {E}\left( 0, {\Phi }(\varDelta t, \varvec{u}_n, \chi ) \right) \cdot \mathcal {R}(\varDelta t, \varvec{u}_n, \chi ) = \mathcal {R}(\varDelta t,\varvec{u}_n,\chi )\). Using (10) and applying (9) with \(\tau = \varDelta t\), for some \(\xi _1,\xi _2 \in [0,\varDelta t]\),

$$\begin{aligned}&\left\| \mathcal {L}(\varDelta t,\varvec{u}_n,\chi ) - \frac{\varDelta t}{p+1}\mathcal {R}(\varDelta t,\varvec{u}_n,\chi )\right\| = \left\| \mathcal {L}(\varDelta t,\varvec{u}_n,\chi ) - \frac{\varDelta t}{p+1}\varTheta (\varDelta t,\varvec{u}_n,\chi )\right\| \\&\quad = \left\| \int _0^{\varDelta t}\frac{\tau ^{p+1}}{(p+1)!}\partial _1^{p+1}\varTheta (\xi _1,\varvec{u}_n,\chi )\,\textrm{d}\tau -\frac{\varDelta t^{p+2}}{(p+1)!(p+1)}\partial _1^{p+1}\varTheta (\xi _2,\varvec{u}_n,\chi ) \right\| \\&\quad \le \left\| \int _0^{\varDelta t}\frac{\tau ^{p+1}}{(p+1)!} \partial _1^{p+1}\varTheta (\xi _1,\varvec{u}_n,\chi )\,\textrm{d}\tau \right\| + \left\| \frac{\varDelta t^{p+2}}{(p+1)!(p+1)}\partial _1^{p+1}\varTheta (\xi _2,\varvec{u}_n,\chi ) \right\| \\&\quad \le \left[ \frac{1}{(p+2)!}+\frac{1}{(p+1)!(p+1)}\right] \widetilde{C}\varDelta t^{p+2} =: C\varDelta t^{p+2}, \end{aligned}$$

where

$$\begin{aligned} C=\frac{2p+3}{(p+2)!(p+1)}\widetilde{C}, \end{aligned}$$

is independent of \(\varDelta t\) and \(\chi \). Therefore, indeed

$$\begin{aligned} \Vert L(\varDelta t,\varvec{u}_n,\chi )-\mathcal {L}(\varDelta t,\varvec{u}_n,\chi )\Vert \le C\varDelta t^{p+2}, \end{aligned}$$

uniformly for any value of \(\chi \in \varOmega \).
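The asymptotic correctness established by Theorem 1 can be checked numerically. The following sketch (ours, under the assumption of explicit Euler, order \(p=1\), applied to the scalar ODE \(u'=u\)) measures the gap between the estimator (8) and the true local error (5); the observed convergence order should approach \(p+2=3\).

```python
import numpy as np

# Numerical check of Theorem 1: for explicit Euler (order p = 1) on the
# scalar ODE u' = u, the estimator L = dt/(p+1) * R should differ from
# the true local error by O(dt^{p+2}) = O(dt^3).

p = 1
A = lambda u: u                        # vector field of (2), scalar case
exact = lambda dt, u: u * np.exp(dt)   # exact flow E(dt, u)

def estimator_gap(dt, u=1.0):
    phi = u + dt * A(u)                # Euler step Phi(dt, u)
    R = A(u) - A(phi)                  # defect (6), since D_dt Phi = A(u)
    L = dt / (p + 1) * R               # estimator (8)
    local_err = phi - exact(dt, u)     # local error (5)
    return abs(L - local_err)

gaps = [estimator_gap(dt) for dt in (1e-2, 5e-3, 2.5e-3)]
orders = [np.log2(gaps[i] / gaps[i + 1]) for i in range(2)]
# successive halvings of dt should reduce the gap by a factor close to 8
```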

Remark 1

Theorem 1 is a generalisation of the results of [6], where the defect based estimator (8) is shown to be an asymptotically correct estimator for the local error. An important distinction here is the correctness of this estimator uniformly with respect to the parameters \(\chi \), which is essential for applications in optimal parameter selection. Note that the compactness of the parameter set \(\varOmega \) is crucial for the proof, but in practice is typically not a severe restriction.

Remark 2

Formula (8) provides an estimate of the local error of the time integrator. Nevertheless, there is no guarantee that the global error behaves in a similar way. This may occur even when accurate space discretizations are considered, e.g., caused by instabilities and constraints on the ratio of the discretization stepsizes. However, in Sect. 5 we show an example where the proposed approach, based on finding the parameters \(\chi \) minimizing the quantity L in (8), prevents the occurrence of instabilities.

3 Defect Based Identification of Optimal Parameters

In this section, we propose the use of the defect based error estimate (8) for finding, at every time-step, optimal parameters \(\chi _n^* \in \varOmega \subset \mathbb {R}^K\), defined as

$$\begin{aligned} \chi _n^* = \text {arg min}_{\chi \in \varOmega } \left\| L(\varDelta t,\varvec{u}_n,\chi ) \right\| _{2}, \end{aligned}$$
(11)

where \(\varvec{u}_n\) and \(\varDelta t\) are fixed. The result of Theorem 1 assures us that

$$\begin{aligned} \mathcal {L}(\varDelta t,\varvec{u}_n,\chi _n^*) = L(\varDelta t,\varvec{u}_n,\chi _n^*) + \mathcal {O}({\varDelta t^{p+2}}), \end{aligned}$$

and guarantees that the choice of parameters \(\chi _n^*\) keeps the true local error, \(\mathcal {L}\), close to \(L(\varDelta t,\varvec{u}_n,\chi _n^*)\) in the asymptotic limit \(\varDelta t\rightarrow 0\) and, therefore, small.

Remark 3

The application of defect based error estimates for choosing optimal parameters differs from their application in the context of time-adaptivity in two crucial respects.

  1. 1.

    Since L is not monotonic in \(\chi \), and we are not interested in asymptotic limits for small \(\chi \) (unlike the case of \(\varDelta t\) in the context of time-adaptivity), the defect based estimate \(L(\varDelta t,\varvec{u}_n,\chi )\) needs to be computed for multiple values of \(\chi \) within an optimization routine.

  2. 2.

    The perturbation of L by a term \(\rho \) independent of \(\chi \) has no effect on \(\chi ^*\). This is in contrast to time-adaptivity, where we seek the largest \(\varDelta t^*\) such that \(\left\| L(\varDelta t^*,\varvec{u}_n,\chi ) \right\| _{2} < \delta \) for some user specified tolerance \(\delta \), and \(\rho \ne 0\) affects the choice of \(\varDelta t^*\).

The first observation in Remark 3 suggests that the application of defect based estimates for choosing optimal parameters can be prohibitively expensive. However, the second suggests that we can resort to inexpensive approximations of the defect and still hope to arrive at a good choice of parameters.

In Sect. 3.2, we see that under reasonable assumptions the number of optimization steps is not expected to be large, and only a few Gauss–Newton iterations are ever required. In the large time-step regime, the approximation of the defect on a coarse spatial grid also proves sufficient for the purposes described here. Overall, this leads to a procedure for identifying optimal parameters with very reasonable overheads, producing highly efficient schemes.

3.1 Optimization Procedure

In practice, we minimize the square of the defect,

$$\begin{aligned} \chi ^*_n = \text {arg min}_{\chi \in \varOmega } f(\chi ), \qquad f(\chi ) = \frac{1}{2} \left\| \mathcal {R}(\varDelta t,\varvec{u}_n,\chi ) \right\| _{2}^2, \end{aligned}$$

using the gradient,

$$\begin{aligned} g = \left( \mathcal {D}_\chi \mathcal {R}(\varDelta t,\varvec{u}_n,\chi )\right) ^\top \mathcal {R}(\varDelta t,\varvec{u}_n,\chi ), \end{aligned}$$
(12)

where \(\mathcal {D}_\chi \mathcal {R}\) is the Jacobian of the defect with respect to \(\chi \), and a Gauss–Newton approximation to the Hessian,

$$\begin{aligned} \nabla ^2 f \approx H_{\textrm{GN}}:= \left( \mathcal {D}_\chi \mathcal {R}(\varDelta t,\varvec{u}_n,\chi )\right) ^\top \left( \mathcal {D}_\chi \mathcal {R}(\varDelta t,\varvec{u}_n,\chi )\right) . \end{aligned}$$
(13)

Utilizing the Gauss–Newton Hessian in the context of trust region algorithms yields a sequence of parameters \(\chi _n^k\) that quickly converges to the optimal \(\chi ^*_n\), with reliable global convergence properties [26]. At the same time, the procedure remains relatively inexpensive for a small number of parameters, K, since we only need to compute the first derivatives of the defect with respect to \(\chi \). These can be computed either analytically or approximately using finite differences.
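A minimal sketch of one step of this procedure, with the Jacobian \(\mathcal {D}_\chi \mathcal {R}\) approximated by forward differences, might look as follows; the residual used here is a toy stand-in for the defect at fixed \((\varDelta t,\varvec{u}_n)\), not a scheme from [14].

```python
import numpy as np

# One Gauss-Newton step for f(chi) = 0.5*||R(chi)||^2, forming the
# gradient (12) and the Gauss-Newton Hessian (13) from a
# finite-difference Jacobian of the residual R.

def gauss_newton_step(R, chi, eps=1e-7):
    r = R(chi)
    K = chi.size
    J = np.empty((r.size, K))
    for k in range(K):                       # forward-difference Jacobian
        e = np.zeros(K); e[k] = eps
        J[:, k] = (R(chi + e) - r) / eps
    g = J.T @ r                              # gradient (12)
    H = J.T @ J                              # Gauss-Newton Hessian (13)
    return chi - np.linalg.solve(H, g)

# toy zero-residual problem with minimizer chi* = (1, 2)
R = lambda chi: np.array([chi[0] - 1.0, chi[1] - 2.0, (chi[0] - 1.0) * chi[1]])
chi = np.array([0.5, 1.5])
for _ in range(5):
    chi = gauss_newton_step(R, chi)
```

For zero-residual problems such as this one, the Gauss–Newton iteration converges locally at a quadratic rate, which is why only a handful of steps are needed in practice.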

The defect, \(\mathcal {R}(\varDelta t, \varvec{u}_n, \chi _n^k)\), is computed using (6). This requires the computation of a temporary solution, \(\widetilde{\varvec{u}}_{n+1}^k\!=\!\Phi (\varDelta t, \varvec{u}_n, \chi _n^k)\), and of \(\mathcal D_{\varDelta t} \Phi (\varDelta t,\varvec{u}_n, \chi _n^k)\), the latter of which can be computed analytically as outlined with concrete examples in Sect. 4.

Note that, in general, at each iteration a trust region algorithm may be used to compute \(\mathcal {R}\) at candidate parameters \(\chi =\widetilde{\chi }_n^{k+1}\) before deciding to accept or reject the candidate and/or update the trust region radius \(\varDelta _n^k\). For a detailed introduction to trust region algorithms, we refer the reader to [26].

3.2 Practical Considerations for Efficiency

Evaluations of the defect can be very expensive, as they require the computation of the temporary solution \(\widetilde{\varvec{u}}_{n+1}^k\) at every iteration, each of which is as expensive as a step of the original numerical solver \(\Phi \). In practice, therefore, we identify \(\chi _n^*\) by optimizing the defect on a coarse spatial grid, resorting to the fine computational grid only for evaluating \(\varvec{u}_{n+1}\) once \(\chi _n^*\) has been identified.

The coarse grid is obtained as a subgrid of the fine grid with resolution \(r \varDelta x\), with r an integer that divides \(M+1\). Let \(\mathcal {P}_r\) denote the projection operator from the fine grid to the coarse grid, defined as

$$\begin{aligned} \mathcal {P}_r:\mathbb {R}^{M}\rightarrow \mathbb {R}^{(M+1)/r+1},\qquad \widehat{\varvec{u}}_n=\mathcal {P}_r(\varvec{u}_n), \end{aligned}$$
(14)

where

$$\begin{aligned} \varvec{u}_n&\,=(u_{0,n}, u_{1,n},\ldots ,u_{i,n},\ldots , u_{M,n}, u_{M+1,n}), \\ \widehat{\varvec{u}}_n&\,=(u_{0,n},u_{r,n},\ldots ,u_{ir,n},\ldots , u_{M+1-r,n},u_{M+1,n}). \end{aligned}$$

At the k-th iteration of the Gauss–Newton algorithm, we evaluate the defect (6) on the coarse grid, as

$$\begin{aligned} \mathcal {R}(\varDelta t, \widehat{\varvec{u}}_n,\chi _n^k)=\mathcal {D}_{\varDelta t} \Phi (\varDelta t, \widehat{\varvec{u}}_n, \chi _n^k) - A(\Phi (\varDelta t, \widehat{\varvec{u}}_n, \chi _n^k)),\qquad \widehat{\varvec{u}}_n=\mathcal {P}_r(\varvec{u}_n). \end{aligned}$$

This requires the computation of \(\widetilde{\varvec{u}}_{n+1}^k = \Phi (\varDelta t, \widehat{\varvec{u}}_n, \chi _n^k)\). On a grid with resolution \(r\varDelta x\), the dimension of the problem is reduced by a factor r. This typically leads to a significant speedup in the computation of \(\chi _n^*\). This speedup is expected to be particularly pronounced in 2 or 3 dimensional problems, where the coarse grid is smaller by factors of \(r^2\) and \(r^3\), respectively, than the finer computational grid.
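In code, the projection (14) is simply strided indexing; the sketch below assumes a one-dimensional array of nodal values including both endpoints and an r that divides \(M+1\).

```python
import numpy as np

# Projection (14) onto a coarse subgrid of resolution r*dx: keep every
# r-th node, endpoints included.  Assumes r divides M + 1.

def project(u, r):
    # u holds nodal values at x_0, ..., x_{M+1}
    return u[::r]

M, r = 15, 4                      # r divides M + 1 = 16
u = np.linspace(0.0, 1.0, M + 2)  # M + 2 nodal values on the fine grid
u_hat = project(u, r)             # (M + 1)/r + 1 = 5 coarse values
```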

For a method of order p in space and time, an error term of \(\mathcal {O}(r^p\varDelta x^p)\) is introduced in the evaluation of the defect. This error is negligible if \(r\varDelta x\ll \varDelta t\) and is not expected to have a significant effect on the estimate of the optimal parameters \(\chi _n^*\) in light of Remark 3.

Remark 4

With larger values of r, the additional cost of identifying \(\chi _n^*\) becomes marginal, while the advantages of identifying good parameters can still be significant. We can expect, however, that once r is so large that \(r\varDelta x\ll \varDelta t\) no longer holds, spatial discretization errors will start to dominate and the computation of the defect may become too inaccurate to be useful. In light of these observations, as a rule of thumb, we recommend taking r as the largest divisor of \(M+1\) such that \(r<\varDelta t/\varDelta x\).
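This rule of thumb is straightforward to implement; the helper below is a hypothetical utility of ours, not part of [14].

```python
# Rule of thumb of Remark 4: pick the largest divisor r of M + 1
# satisfying r < dt/dx.

def coarsening_factor(M, dt, dx):
    divisors = [r for r in range(1, M + 2) if (M + 1) % r == 0]
    return max(r for r in divisors if r < dt / dx)

# e.g. M = 99, dx = 0.01, dt = 0.09: divisors of 100 below 9 -> r = 5
r = coarsening_factor(99, 0.09, 0.01)
```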

Lipschitz continuity. Further gains can be obtained by exploiting temporal smoothness of the optimal parameters \(\chi ^*\). For some methods it is reasonable to assume that the optimal parameters are described by a Lipschitz function \(\chi ^*(t)\), i.e. \(|\chi ^*_{n} - \chi ^*_{n-1}| \le \widetilde{C} \varDelta t\) for some \(\widetilde{C}<\infty \) independent of n. For small enough \(\widetilde{C}\varDelta t\), the previous optimal value is close to \(\chi ^*_{n}\) and therefore serves as a good first guess for the next time-step, \(\chi ^{0}_{n} = \chi ^*_{n-1}\): close enough to \(\chi ^{*}_{n}\) that the conditions of the trust-region method are satisfied and fast local convergence is assured. In such a case, it suffices to use the simple Newton-type iteration,

$$\begin{aligned} \chi ^{k+1}_{n} = \chi ^{k}_{n} - H_{\textrm{GN}}^{-1}\ g, \qquad \chi ^0_n = \chi ^*_{n-1}, \quad n>0, \end{aligned}$$
(15)

in place of the trust-region algorithm, where g and \(H_{\textrm{GN}}\) are given by (12) and (13), respectively. In practice, just a couple of steps of (15) suffice, with the exception of the first time-step (\(n=0\)), when the arbitrary initial guess may be very far from the optimal value, requiring more optimization steps and the use of trust regions.
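The warm-started iteration (15) can be sketched for a single parameter (\(K=1\)) as follows; the residual \(R_n\) is a hypothetical stand-in whose root drifts slowly in time, mimicking the Lipschitz assumption above.

```python
import numpy as np

# Warm-started Newton-type iteration (15) with K = 1, so that the
# gradient (12) and Gauss-Newton Hessian (13) reduce to scalars.  The
# residual R_n is a toy stand-in for the defect at time t_n whose root
# chi*(t) drifts slowly (Lipschitz in t).

def newton_type(R, dR, chi0, steps=2):
    chi = chi0
    for _ in range(steps):                  # a couple of steps suffice
        J = dR(chi)                         # scalar Jacobian D_chi R
        chi = chi - (J * R(chi)) / (J * J)  # (15) with K = 1
    return chi

dt, N = 0.01, 100
chi = 0.0                                   # guess at n = 0
history = []
for n in range(N):
    t = n * dt
    R_n = lambda c, t=t: c + 0.1 * c**3 - np.sin(t)
    dR_n = lambda c: 1.0 + 0.3 * c**2
    chi = newton_type(R_n, dR_n, chi)       # warm start: chi^0_n = chi*_{n-1}
    history.append(chi)
```

Because the root moves by only \(\mathcal {O}(\varDelta t)\) per step, two Newton updates per time-step keep the iterate locked onto the drifting optimum.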

The Gauss–Newton algorithm is iterated until the stopping criterion,

$$\begin{aligned} |\chi _n^{k+1}-\chi _n^{k}|<\textrm{tol}, \end{aligned}$$

is satisfied for a suitable tolerance. Then we set \({\chi }^*_n = \chi _n^{k+1}\) and the solution at the next time-step is obtained on the finer computational grid as

$$\begin{aligned} {\varvec{u}}_{n+1} = \Phi (\varDelta t, {\varvec{u}}_n, {\chi }^*_n). \end{aligned}$$

The overall procedure introduced in this section is summarized in Algorithm 1.

Assuming that the method \(\Phi \) in (4) is stable and convergent for a time-dependent choice of parameters \(\chi : \mathbb {R}^+ \rightarrow \varOmega \), it follows that the adaptive procedure is also stable and has the same order of convergence.

Algorithm 1
figure a

Adaptively identifying optimal parameters

3.3 Modifications for Conservative Schemes

Algorithm 1 is very effective in improving the accuracy of standard numerical methods with free parameters and of geometric integrators that preserve a structure independent of the parameters. However, some numerical schemes, e.g. those described in [14], preserve conservation laws that depend explicitly on the parameters. Since the optimal parameters generated by Algorithm 1 change from time-step to time-step, this undermines the preservation of the conservation laws of these schemes.

For these numerical schemes we present an alternative approach, which post-processes the time-dependent optimal parameters into constant-in-time values, e.g. their mean. This allows numerical schemes with parameter-dependent conservation laws to respect their conservation laws while still benefitting from low local errors. We modify Algorithm 1 as follows.

  1. 1.

    Project the initial condition on a grid with resolution \(r\varDelta x\).

  2. 2.

    Find the optimal \(\chi _0^*\) and use it to advance a step in time on the coarse grid. Iterate this step until the final time, obtaining the optimal parameters \(\chi _n^*\) for \(n=0,\ldots ,N-1\).

  3. 3.

    Compute the average optimal parameters as \(\overline{\chi }^* = \frac{1}{N} \sum _{n=0}^{N-1} \chi _n^*\).

  4. 4.

    Compute the numerical solution on the full computational grid using the average parameters, \(\varvec{u}_{n+1} = \Phi (\varDelta t, \varvec{u}_n, \overline{\chi }^*)\), for \(n=0,\ldots ,N-1\).

We summarize this conservative version of our procedure in Algorithm 2. Note that the numerical solution obtained on the coarse grid, \(\widehat{\varvec{u}}_n\), is used only for estimating the optimal parameters and later discarded.

Algorithm 2
figure b

Identifying fixed optimal parameters
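The four steps of the modified algorithm can be sketched structurally as follows; `step` and `optimal_chi` are hypothetical stand-ins for the numerical flow \(\Phi \) of (3) and the Gauss–Newton search of Sect. 3.1, not the actual schemes of [14].

```python
import numpy as np

# Structural sketch of Algorithm 2.  The coarse-grid pass serves only to
# collect the parameters chi*_n; its solution is discarded afterwards.

def algorithm2(u0, dt, N, r, step, optimal_chi):
    u_hat = u0[::r]                          # 1. project onto coarse grid
    chis = []
    for n in range(N):                       # 2. coarse pass: collect chi*_n
        chi_n = optimal_chi(dt, u_hat)
        chis.append(chi_n)
        u_hat = step(dt, u_hat, chi_n)
    chi_bar = np.mean(chis, axis=0)          # 3. average the parameters
    u = u0
    for n in range(N):                       # 4. fine pass with fixed chi_bar
        u = step(dt, u, chi_bar)
    return u, chi_bar

# toy ingredients: nodewise decay, with chi shifting the decay rate
step = lambda dt, u, chi: u * (1.0 - dt * (1.0 + chi))
optimal_chi = lambda dt, u: 0.1 * np.mean(u)
u0 = np.ones(16)
u, chi_bar = algorithm2(u0, 0.01, 50, 4, step, optimal_chi)
```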

Since our estimate of the optimal parameters in Algorithm 2 relies on the solution of the problem on a coarser grid, this algorithm is also affected by the accumulation of errors in space, which may be particularly pronounced for large values of r. On the other hand, the parameters \(\chi _n^*\) need not be identified with too high an accuracy since we only need a good average choice, \(\overline{\chi }^*.\)

Using the average parameters to solve the problem on the fine grid is a good option, particularly for solutions with simple and smooth dynamics (e.g. travelling waves), as the values obtained at the different time-steps are all reasonably close to each other.

Remark 5

Note that at the end of step 2, the full sequence of optimal parameters, \(\chi _0^*, \ldots , \chi _{N-1}^*\), and local error estimates are available to the user, and it is possible to consider alternatives more suitable than the average values of the parameters. This could be particularly useful in cases where the solution changes character substantially and the parameter values vary over a wide range during the integration. For example, one may choose the value corresponding to a particularly delicate stage of the time evolution.

4 Approximation of Defect for Specific Schemes

In this section we consider the application of the proposed approach to two partial differential equations, the KdV equation and a nonlinear heat equation, with suitable initial and boundary conditions. In particular, we present the computation of the defect (6) for numerical schemes introduced in [14], which is required in Algorithms 1 and 2.

We define the forward shifts in space and time,

$$\begin{aligned} S_{\varDelta x}(u_{m,n})=u_{m+1,n},\qquad S_{\varDelta t}(u_{m,n})=u_{m,n+1}, \end{aligned}$$

respectively, the forward difference and average operators,

$$\begin{aligned} {D}_{\varDelta x}=\frac{S_{\varDelta x}-I}{\varDelta x},\quad {D}_{\varDelta t}=\frac{S_{\varDelta t}-I}{\varDelta t},\quad \mu _{\varDelta x}=\frac{S_{\varDelta x}+I}{2},\quad \mu _{\varDelta t}=\frac{S_{\varDelta t}+I}{2}, \end{aligned}$$

and the centred difference operator,

$$\begin{aligned} D_{2k,\varDelta x}=D_{\varDelta x}^{2k}S_{\varDelta x}^{-k},\qquad D_{2k-1,\varDelta x}=D_{\varDelta x}^{2k-1}S_{\varDelta x}^{-k}\mu _{\varDelta x}, \end{aligned}$$

approximating the space derivatives of degree 2k and \(2k-1\), respectively, with second order accuracy. The action of these operators on vectors is defined entrywise. Moreover, we denote by \(\circ \) the Hadamard product, i.e. the entrywise multiplication of vectors.
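On a periodic grid, these operators reduce to index shifts. The sketch below (ours, on a hypothetical uniform grid over \([0,2\pi )\)) implements \(S_{\varDelta x}\), \(D_{\varDelta x}\), \(\mu _{\varDelta x}\) and the centred differences \(D_{1,\varDelta x}\), \(D_{2,\varDelta x}\), and checks their second order accuracy on \(u=\sin x\).

```python
import numpy as np

# Shift, forward difference and forward average operators acting
# entrywise on arrays over a periodic grid (np.roll implements S_dx).

M = 100
dx = 2 * np.pi / M
x = dx * np.arange(M)

S  = lambda u, k=1: np.roll(u, -k)        # shift S_dx^k (periodic)
D  = lambda u: (S(u) - u) / dx            # forward difference D_dx
mu = lambda u: (S(u) + u) / 2             # forward average mu_dx

def D1(u):
    """Centred first difference D_dx S_dx^{-1} mu_dx = (u_{m+1}-u_{m-1})/(2 dx)."""
    return D(mu(S(u, -1)))

def D2(u):
    """Centred second difference D_dx^2 S_dx^{-1}."""
    return D(D(S(u, -1)))

u = np.sin(x)
# second order approximations of u_x and u_xx; the Hadamard product
# u \circ v is just the entrywise product u * v in NumPy
err1 = np.max(np.abs(D1(u) - np.cos(x)))
err2 = np.max(np.abs(D2(u) + np.sin(x)))
```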

For the two equations considered here, families of second order finite difference methods depending on one or more free parameters have been introduced in [14], by means of a strategy based on the fact that, just as total divergences form the kernel of an Euler operator [27], there exists a discrete Euler operator whose kernel is the space of difference divergences [23, 24]. In this section we consider some of these families of schemes.

Remark 6

We restrict attention to the setting where \(\varDelta t>\varDelta x\), and the parameters \(\chi \) featured in the numerical schemes from [14] are \(\mathcal {O}(\varDelta t^2)\). These small parameters \(\chi \) correspond to perturbation terms that have no counterpart in the continuous problem, and whose contributions vanish in the limit \(\varDelta t \rightarrow 0\). Thus, it is reasonable to restrict our search to a neighbourhood of \(\textbf{0}\) of size \(\mathcal {O}(\varDelta t^2)\), \(\varOmega \subset \overline{B}_{C\varDelta t^2}(\textbf{0};\mathbb {R}^K)\), for some constant \(C>0\). Therefore, when applying the algorithms outlined in Sect. 3 we can expect that \(\chi =\textbf{0}\) is a reasonable initial guess, and we can find convenient values of the parameters by using the simple iteration (15) without the aid of trust region methods.

Remark 7

The schemes considered here are all implicit. We assume that \(\mathcal {J}\), the Jacobian operator defined by the partial derivatives of the scheme with respect to \(\varvec{u}_{n+1}\), is never singular. Solutions can then be obtained by iteration of Newton’s method.
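As an illustration of Remark 7 (our sketch, not a scheme from [14]), an implicit one-step method can be advanced by Newton iteration on the residual defining \(\varvec{u}_{n+1}\); here the implicit midpoint rule is applied to a hypothetical nonlinear ODE system, and the Jacobian is assumed nonsingular as in the remark.

```python
import numpy as np

# Newton's method for an implicit one-step scheme: solve F(v) = 0 for
# v = u_{n+1}, where F encodes the implicit midpoint rule for the toy
# system du/dt = A(u) with A(u) = -u^3 (entrywise).

def A(u):
    return -u**3

def dA(u):
    return np.diag(-3 * u**2)          # Jacobian of A

def implicit_midpoint_step(u, dt, iters=10):
    v = u.copy()                       # initial Newton guess
    for _ in range(iters):
        mid = (u + v) / 2
        F = v - u - dt * A(mid)        # residual of the implicit scheme
        J = np.eye(u.size) - dt / 2 * dA(mid)   # Jacobian of F w.r.t. v
        v = v - np.linalg.solve(J, F)  # Newton update
    return v

u1 = implicit_midpoint_step(np.array([1.0, 2.0]), 0.1)
```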

An important property of the numerical schemes considered here is that they preserve some conservation laws. Conservation laws are defined as total divergences,

$$\begin{aligned} \mathcal {D}_x F + \mathcal {D}_t G, \end{aligned}$$

that vanish on solutions of the PDE. The functions F and G are the flux and the density of the conservation law and depend on \(x\), \(t\), \(u\) and its derivatives. The methods in [14] preserve second order approximations of specific conservation laws in the form

$$\begin{aligned} D_{\varDelta x} \widetilde{F}(\varvec{x}, \varvec{u}_n, \varvec{u}_{n+1},\chi ) + D_{\varDelta t} \widetilde{G}(\varvec{x}, \varvec{u}_n, \chi )=0, \end{aligned}$$

where here and henceforth tildes represent approximations of the corresponding continuous terms, and \(\varvec{x}\) is the column vector whose m-th entry is \(x_m\).

4.1 Schemes for KdV Equation

In this section we apply the approach outlined in Sect. 3 to parametric families of schemes for the KdV equation,

$$\begin{aligned} u_t + \left( \tfrac{1}{2}u^2+u_{xx}\right) _x=0, \end{aligned}$$
(16)

with initial condition

$$u(x,0)=u_0(x).$$

For simplicity, we restrict attention to periodic or zero boundary conditions. However, as shown in Sect. 4.2, the entire discussion can also be adapted to boundary conditions of a different type. The KdV equation has infinitely many independent conservation laws. The first three, in increasing order,

$$\begin{aligned}&\mathcal {D}_xF_1+\mathcal {D}_tG_1\equiv \mathcal {D}_x\left( \tfrac{1}{2}u^2+u_{xx}\right) +\mathcal {D}_tu=0,\nonumber \\&\mathcal {D}_xF_2+\mathcal {D}_tG_2\equiv \mathcal {D}_x\left( \tfrac{1}{3}u^3+uu_{xx}-\tfrac{1}{2}u_x^2\right) +\mathcal {D}_t\left( \tfrac{1}{2}u^2\right) =0,\nonumber \\&\mathcal {D}_xF_3+\mathcal {D}_tG_3\equiv \mathcal {D}_x\left( \tfrac{1}{4}u^4\!+\!u_xu_t\!-\!uu_{xt} +\!u^2u_{xx}\!+\!u^2_{xx}\right) \!+\!\mathcal {D}_t\!\left( \tfrac{1}{3}u^3\!+\!uu_{xx}\right) \!=0, \end{aligned}$$
(17)

describe the local conservation laws of mass, momentum and energy, respectively. For this equation we define the semidiscrete operator A in (2) in the most natural way, as

$$\begin{aligned} A(\varvec{u}(t))=-D_{1,\varDelta x}\left( \tfrac{1}{2}\varvec{u}^2(t)+D_{2,\varDelta x}\varvec{u}(t)\right) . \end{aligned}$$
(18)

The results obtained are independent of the particular form of this spatial discretization, as we work under the assumption that the leading source of error is the time integration.
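A minimal sketch of (18) on a periodic grid (an assumption of ours; the helper names are hypothetical), checked against the smooth profile \(u=\sin x\), for which \(-\left(\tfrac{1}{2}u^2+u_{xx}\right)_x=\cos x-\sin x\cos x\):

```python
import numpy as np

def D1(u, dx):
    # centred first derivative, periodic grid
    return (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)

def D2(u, dx):
    # centred second derivative, periodic grid
    return (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2

def A(u, dx):
    # semidiscrete KdV operator (18): A(u) = -D1(u^2/2 + D2 u)
    return -D1(0.5 * u**2 + D2(u, dx), dx)
```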

Energy conserving methods

Fig. 1

10-point stencil for schemes EC\((\alpha )\) (19) and MC\((\beta ,\gamma )\) (22)

We consider here the following family of mass and energy-conserving methods described in [14],

$$\begin{aligned} D_{\varDelta t}(\varvec{u}_n)+D_{\varDelta x}\left( S_{\varDelta x}^{-1}\mu _{\varDelta x}\psi (\varvec{u}_n,\varvec{u}_{n+1},\alpha )\right) =0, \end{aligned}$$
(19)

where

$$\begin{aligned} \psi (\varvec{u}_n,\varvec{u}_{n+1},\alpha )=\tfrac{1}{6}(\varvec{u}_{n+1}^2+\varvec{u}_n^2+\varvec{u}_n\circ \varvec{u}_{n+1})+D_{2,\varDelta x}\mu _{\varDelta t}\varvec{u}_n+\alpha D_{\varDelta t}D_{1,\varDelta x}\varvec{u}_n. \end{aligned}$$

We denote by \(\text {EC}(\alpha )\) the schemes in Eq. (19), where, according to Remark 6, \(\alpha \) is a free parameter in a neighbourhood of 0, \(\varOmega \subset \overline{B}_{C\varDelta t^2}(0;\mathbb {R})=[-C\varDelta t^2,C\varDelta t^2]\), for some constant \(C>0\). For any choice of the parameter \(\alpha \in \varOmega \), the schemes EC\((\alpha )\) are defined on the 10-point stencil in Fig. 1; they are implicit and second order accurate. Moreover, they satisfy a discrete version of the mass conservation law in (17), given by their definition (19) with

$$\begin{aligned} \widetilde{F}_1=S_{\varDelta x}^{-1}\mu _{\varDelta x}\psi (\varvec{u}_n,\varvec{u}_{n+1},\alpha ),\qquad \widetilde{G}_1=\varvec{u}_n, \end{aligned}$$

and satisfy the discrete energy conservation law,

$$\begin{aligned} D_{\varDelta x}&\big (\widetilde{F}_3\big )+D_{\varDelta t}\big (\widetilde{G}_3\big )=0,\nonumber \\ \widetilde{F}_3=&\,\psi (\varvec{u}_n,\varvec{u}_{n+1},\alpha )\!\circ \! S_{\varDelta x}^{-1}\psi (\varvec{u}_n,\varvec{u}_{n+1},\alpha )+\alpha (D_{\varDelta t}\varvec{u}_n)\circ (S_{\varDelta x}^{-1}D_{\varDelta t}\varvec{u}_n)\nonumber \\&+S_{\varDelta x}^{-1}\big ((D_{\varDelta x}\mu _{\varDelta t}\varvec{u}_n)\circ (D_{\varDelta t}\mu _{\varDelta x}\varvec{u}_n)-(\mu _{\varDelta x}\mu _{\varDelta t}\varvec{u}_n)\circ (D_{\varDelta x}D_{\varDelta t}\varvec{u}_n)\big ),\nonumber \\ \widetilde{G}_3=&\,\tfrac{1}{3}\varvec{u}_n^3+\varvec{u}_n\circ D_{2,\varDelta x}\varvec{u}_n. \end{aligned}$$
(20)

Since these methods are implicit, an explicit expression for \(\Phi \) in (3) is not available. Nevertheless, an analytical expression of its time derivatives can be obtained by substituting (3) in (19) and differentiating. This yields,

$$\begin{aligned} \mathcal {D}_{\varDelta t} \Phi (\varDelta t,\varvec{u}_n,\alpha ) = -[\varDelta t\mathcal {J}]^{-1}(D_{1,\varDelta x}\psi (\varvec{u}_n,\varvec{u}_{n+1},0)), \end{aligned}$$
(21)

where \(\varvec{u}_{n+1}=\Phi (\varDelta t,\varvec{u}_n,\alpha )\), and \(\mathcal {J}\) denotes the Jacobian matrix defined in Remark 7.

Optimal values of \(\alpha \) are then computed according to (11), where the defect is calculated according to (6) with (18) and (21).

Remark 8

As noted in Sect. 3.3, since the value of \(\alpha \) changes at every time-step in Algorithm 1, the resulting method cannot preserve the conservation laws (19) and (20) of EC(\(\alpha \)), as these depend on \(\alpha \). However, since the boundary conditions are conservative, summing the entries of the vectors in (19) and (20) yields

$$\begin{aligned} \sum _m u_{m,n+1}&\,=\sum _m u_{m,n},\\ \sum _m \left( \tfrac{1}{3} u_{m,n+1}^3+u_{m,n+1} D_{2,\varDelta x}u_{m,n+1}\right)&\,=\sum _m \left( \tfrac{1}{3} u_{m,n}^3+u_{m,n} D_{2,\varDelta x}u_{m,n}\right) . \end{aligned}$$

Therefore, EC(\(\alpha \)) also preserves the following approximations of the global mass and energy:

$$\begin{aligned} \varDelta x\sum _m u_{m,n},\qquad \varDelta x\sum _m \left( \tfrac{1}{3} u_{m,n}^3+u_{m,n} D_{2,\varDelta x}u_{m,n}\right) . \end{aligned}$$

These two global invariants are independent of \(\alpha \), and therefore they are conserved by both algorithms introduced in Sect. 3.
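These invariants are easily evaluated in practice. As a sketch (names are ours): for the profile \(u=3\,\textrm{sech}^2(x/2)\), the soliton used later in Sect. 5.1, the continuous mass and energy are 12 and 14.4 respectively (values we computed analytically), and the discrete invariants reproduce them to second order:

```python
import numpy as np

def D2(u, dx):
    # centred second difference; np.roll wraps periodically, which is
    # harmless here because u vanishes at the ends of the interval
    return (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2

def global_mass(u, dx):
    # alpha-independent discrete mass
    return dx * np.sum(u)

def global_energy(u, dx):
    # alpha-independent discrete energy
    return dx * np.sum(u**3 / 3.0 + u * D2(u, dx))
```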

Momentum conserving methods

A two-parameter family of mass and momentum conserving schemes described in [14] is

$$\begin{aligned} D_{\varDelta t}(\varvec{u}_n)&+D_{\varDelta x}\left\{ \tfrac{1}{6}\big ((S_{\varDelta x}^{-1}\mu _{\varDelta t}\varvec{u}_n)^2 +(S_{\varDelta x}^{-1}\mu _{\varDelta t}\varvec{u}_n)\circ (\mu _{\varDelta t}\varvec{u}_n)\right. \nonumber \\&\left. +(\mu _{\varDelta t}\varvec{u}_n)^2\big )\!+\!D_{\varDelta x}^2S_{\varDelta x}^{-2} \mu _{\varDelta x}\mu _{\varDelta t}\varvec{u}_n\!+\!D_{\varDelta t}D_{\varDelta x}S_{\varDelta x}^{-1}(\beta \varvec{u}_n\!+\!\gamma D_{2,\varDelta x}\varvec{u}_n)\right\} =0. \end{aligned}$$
(22)

We denote by \(\text {MC}(\beta ,\gamma )\) the two-parameter family of schemes (22). For any value of the parameters \((\beta ,\gamma )\in \varOmega \subset \overline{B}_{C\varDelta t^2}(\textbf{0};\mathbb {R}^2)\), with \(C>0\), the schemes \(\text {MC}(\beta ,\gamma )\) are defined on the 10-point stencil in Fig. 1, are implicit and second order accurate. Solutions of MC\((\beta ,\gamma )\) satisfy the local mass conservation law given by (22), and the local momentum conservation law,

$$\begin{aligned} D_{\varDelta x}&\big (\widetilde{F}_2\big )+D_{\varDelta t}\big (\widetilde{G}_2\big )=0,\nonumber \\ \widetilde{F}_2=&\,\tfrac{1}{3}(\mu _{\varDelta t}\varvec{u}_n)\circ (S_{\varDelta x}^{-1}\mu _{\varDelta t}\varvec{u}_n)\circ (S_{\varDelta x}^{-1} \mu _{\varDelta t}\mu _{\varDelta x}\varvec{u}_n)\!+\!\tfrac{1}{2}(\mu _{\varDelta t}\varvec{u}_n)\circ (D_{2,\varDelta x}\mu _{\varDelta t}\varvec{u}_n)\\&+\tfrac{1}{2}(S_{\varDelta x}^{-1}\mu _{\varDelta t}\varvec{u}_n)\circ (D_{\varDelta x}^2\mu _{\varDelta t}\varvec{u}_n)-\tfrac{1}{2}(D_{1,\varDelta x}\varvec{u}_n)^2 +\beta \rho (\varvec{u}_n,\varvec{u}_{n+1})\nonumber \\&+\gamma \sigma (\varvec{u}_n,\varvec{u}_{n+1}),\nonumber \\ \widetilde{G}_2=&\,\tfrac{1}{2}\varvec{u}_n\circ \left( \varvec{u}_n+D_{2,\varDelta x}\left( \beta \varvec{u}_n+\gamma D_{2,\varDelta x}\varvec{u}_n\right) \right) ,\nonumber \end{aligned}$$
(23)

where

$$\begin{aligned} \rho (\varvec{u}_n,\!\varvec{u}_{n+1})&=S_{\varDelta x}^{-1}\big \{(\mu _{\varDelta x}\mu _{\varDelta t}\varvec{u}_n)\!\circ \!(D_{\varDelta x}D_{\varDelta t}\varvec{u}_n)\! -\!\tfrac{1}{2}D_{\varDelta t}\big ((\mu _{\varDelta x}\varvec{u}_n)\!\circ \!(D_{\varDelta x}\varvec{u}_n)\big )\!\big \},\\ \sigma (\varvec{u}_n,\!\varvec{u}_{n+1})&=\tfrac{1}{2}D_{\varDelta t}\big \{S_{\varDelta x}^{-1}\big ((D_{\varDelta x}\varvec{u}_n)\circ (D_{\varDelta x}^2\varvec{u}_n) \big )-\varvec{u}_n\circ (D_{\varDelta x}^3S_{\varDelta x}^{-2}\varvec{u}_n)\big \}\\&\quad +\left\{ (\mu _{\varDelta t}\varvec{u}_n)\circ (D_{\varDelta t}D_{\varDelta x}^3S_{\varDelta x}^{-2}\varvec{u}_n) -S_{\varDelta x}^{-1}\big ((D_{\varDelta x}\mu _{\varDelta t}\varvec{u}_n)\circ (D_{\varDelta t}D_{\varDelta x}^2\varvec{u}_n)\big )\right\} . \end{aligned}$$

Since these schemes are also fully implicit, we substitute (3) in (22) and differentiate in time. This yields,

$$\begin{aligned} \mathcal {D}_{\varDelta t} \Phi ( \varDelta t,\varvec{u}_n,\beta ,\gamma ) = [\varDelta t\mathcal {J}]^{-1}A(\mu _{\varDelta t}\varvec{u}_n). \end{aligned}$$
(24)

The defect and the local error estimate (8) are then computed using (18) and (24).

Remark 9

Summing the entries of the vectors in (22) and (23) shows that

$$\begin{aligned} \varDelta x\sum _m u_{m,n},\qquad \varDelta x\sum _m \left( \tfrac{1}{2}u_{m,n}(u_{m,n}+\beta D_{2,\varDelta x}u_{m,n}+\gamma D_{4,\varDelta x}u_{m,n})\right) , \end{aligned}$$

are the discretizations of the global mass and momentum, respectively. The discrete global mass is independent of the parameters, and therefore it is conserved by both Algorithms 1 and 2. However, the discrete global momentum and the local conservation laws (22) and (23) are only preserved by Algorithm 2.

4.2 Schemes for a Nonlinear Heat Equation

In this section we consider the nonlinear heat equation,

$$\begin{aligned} u_t= \tfrac{1}{2}( u^2)_{xx}, \end{aligned}$$
(25)

with initial condition and Dirichlet boundary conditions,

$$\begin{aligned} u(x,0) = u_0(x), \quad u(a,t) = \varphi _L(t), \quad u(b,t) = \varphi _R(t). \end{aligned}$$

Equation (25) has only two independent conservation laws,

$$\begin{aligned}&\mathcal {D}_xF_1+\mathcal {D}_tG_1\equiv \mathcal {D}_x(-uu_x)+\mathcal {D}_t(u)=0, \end{aligned}$$
(26)
$$\begin{aligned}&\mathcal {D}_xF_2+\mathcal {D}_tG_2\equiv \mathcal {D}_x\left( \tfrac{1}{2} u^2-xuu_x\right) +\mathcal {D}_t(xu)=0. \end{aligned}$$
(27)

We denote by CS\((\lambda )\) the one-parameter family of methods described in [14], given by

$$\begin{aligned} {D}_{\varDelta t}(\varvec{u}_n+\lambda D_{2,\varDelta x}\varvec{u}_n)=\tfrac{1}{2}D_{2,\varDelta x}(\varvec{u}_n\circ \varvec{u}_{n+1}). \end{aligned}$$
(28)

The scheme CS\((\lambda )\) has the following discrete versions of the conservation laws (26) and (27),

$$\begin{aligned}&\!\!D_{\varDelta x}\widetilde{F}_1+D_{\varDelta t}\widetilde{G}_1=0,\!\!\quad \widetilde{F}_1=-S_{\varDelta x}^{-1}\left( \tfrac{1}{2}\varvec{u}_n\circ \varvec{u}_{n+1}\!-\!\lambda D_{\varDelta t}\varvec{u}_n\right) ,\!\!\quad \widetilde{G}_1=\varvec{u}_n, \end{aligned}$$
(29)
$$\begin{aligned}&\!\!D_{\varDelta x}\widetilde{F}_2+D_{\varDelta t}\widetilde{G}_2=0,\qquad \widetilde{G}_2=\varvec{x}\circ (\varvec{u}_n+\lambda D_{2,\varDelta x}\varvec{u}_n),\nonumber \\&\widetilde{F}_2=\,S_{\varDelta x}^{-1}\left( \mu _{\varDelta x}(\tfrac{1}{2}\varvec{u}_n\circ \varvec{u}_{n+1})-(\mu _{\varDelta x}\varvec{x})\circ D_{\varDelta x}(\tfrac{1}{2}\varvec{u}_n\circ \varvec{u}_{n+1})\right) . \end{aligned}$$
(30)

In order to evaluate the defect, we consider a centred difference space discretization of (25) in the form (2) with

$$\begin{aligned} A(\varvec{u}(t),t)=\tfrac{1}{2}\left( \hat{D}_{2,\varDelta x}\varvec{u}^2(t)+\varphi _L^2(t)\varvec{e}_1+\varphi _R^2(t)\varvec{e}_M\right) , \end{aligned}$$
(31)

where \(\varvec{e}_j\) is the j-th unit vector, and we define \(\hat{D}_{2,\varDelta x} = D_{2,\varDelta x}\vert _{\varphi _L=\varphi _R=0}\) in order to isolate the contribution of the boundary conditions. The methods in (28) are linearly implicit, so they can be written in the form (3) with

$$\begin{aligned} \Phi (\varDelta t,\varvec{u}_n,\lambda )&=[\varDelta t\mathcal {J}]^{-1}\big \{\tfrac{\varDelta t}{2\varDelta x^2}\big (\varphi _L(t_{n+1})\varphi _L(t_n)\varvec{e}_1+\varphi _R(t_{n+1})\varphi _R(t_n)\varvec{e}_M\big )\nonumber \\&\quad +\!(1\!+\!\lambda \hat{D}_{2,\varDelta x})\varvec{u}_n\!-\!\lambda \big ((\varphi _L(t_{n+1})\!-\!\varphi _L(t_n))\varvec{e}_1\!+\!(\varphi _R(t_{n+1})\!-\!\varphi _R(t_n))\varvec{e}_M\big )\big \}. \end{aligned}$$
(32)

Differentiating (32) in time yields,

$$\begin{aligned} \mathcal {D}_{\varDelta t} \Phi (\varDelta t,\!\varvec{u}_n,\!\lambda )\!&=\! [\varDelta t\mathcal {J}]^{-1}\!\big \{\tfrac{1}{2}\hat{D}_{2,\varDelta x}(\varvec{u}_n\!\circ \! \varvec{u}_{n+1})\!-\!\lambda (\varphi _L'(t_{n+1})\varvec{e}_1\!+\!\varphi _R'(t_{n+1})\varvec{e}_M)\\&\quad +\tfrac{1}{2\varDelta x^2}\big [(\varphi _L(t_{n+1})+\varDelta t\varphi _L'(t_{n+1}))\varphi _L(t_n)\varvec{e}_1\\&\quad +(\varphi _R(t_{n+1})+\varDelta t\varphi _R'(t_{n+1}))\varphi _R(t_n)\varvec{e}_M\big ]\big \}, \end{aligned}$$

which, together with (31), is utilized in (8) and (6) to estimate the local error.

Remark 10

If the boundary conditions are conservative, summing the entries of the vectors in (29) and (30) yields

$$\begin{aligned} \sum _m u_{m,n+1}&= \sum _m u_{m,n},\\ \sum _m x_m(u_{m,n+1}+\lambda D_{2,\varDelta x}u_{m,n+1})&\,=\sum _m x_m(u_{m,n}+\lambda D_{2,\varDelta x}u_{m,n}), \end{aligned}$$

giving the following approximations of the global invariants

$$\begin{aligned} \varDelta x\sum _m u_{m,n},\qquad \varDelta x\sum _m x_m(u_{m,n}+\lambda D_{2,\varDelta x}u_{m,n}). \end{aligned}$$

Since the former is independent of \(\lambda \), this is a conserved invariant when using either of Algorithms 1 and 2. In general, the latter is conserved only by Algorithm 2. However, if the boundary conditions are such that

$$\begin{aligned} \sum _m x_mD_{2,\varDelta x}u_{m,n}=0, \end{aligned}$$

the conservation of

$$\begin{aligned} \varDelta x\sum _mx_mu_{m,n} \end{aligned}$$

is also guaranteed by Algorithm 1. On a uniform spatial grid, this is achieved, for example, with zero boundary conditions.
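Since CS\((\lambda )\) is linearly implicit, each step amounts to a single linear solve. The following sketch (implementation details are ours) takes a periodic grid, so that the boundary terms of (31)–(32) drop out; with such conservative data the discrete mass is conserved exactly, in line with Remark 10:

```python
import numpy as np

def cs_step(u_n, dt, dx, lam):
    # one step of (28), rearranged as the linear system
    # (I + lam*D2 - (dt/2) D2 diag(u_n)) u_{n+1} = (I + lam*D2) u_n;
    # periodic grid assumed (an assumption of ours)
    M = len(u_n)
    D2 = (np.eye(M, k=1) + np.eye(M, k=-1) - 2.0 * np.eye(M)) / dx**2
    D2[0, -1] += 1.0 / dx**2   # periodic wrap-around entries
    D2[-1, 0] += 1.0 / dx**2
    I = np.eye(M)
    lhs = I + lam * D2 - 0.5 * dt * D2 @ np.diag(u_n)
    rhs = (I + lam * D2) @ u_n
    return np.linalg.solve(lhs, rhs)
```

Mass conservation follows because the columns of the periodic `D2` sum to zero, so multiplying the system by \(\textbf{1}^T\) leaves \(\textbf{1}^T\varvec{u}_{n+1}=\textbf{1}^T\varvec{u}_n\).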

5 Numerical Examples

In this section we consider a range of benchmark problems for the KdV Eq. (16) and the nonlinear heat Eq. (25), and investigate the performance of the methods described in Sect. 4 with optimal parameters obtained by the two algorithms introduced in Sect. 3. Comparisons between different numerical schemes are based on:

  • Relative error in the solution at the final time \(t=T\), defined as

    $$\begin{aligned} \frac{\Vert \varvec{u}_N-u_\textrm{exact}(T)\Vert }{\Vert u_\textrm{exact}(T)\Vert }, \end{aligned}$$

    where \(\Vert \cdot \Vert \) denotes the discrete \(L^2\) norm and \(u_\textrm{exact}\) is the solution of (1).

  • Error in the variation of the global densities. If the method preserves the k-th conservation law,

    $$\begin{aligned} D_{\varDelta x} \widetilde{F}_k(\varvec{x},\varvec{u}_n,\varvec{u}_{n+1},\chi )+D_{\varDelta t} \widetilde{G}_k(\varvec{x},\varvec{u}_n,\chi )=0, \end{aligned}$$

    the error in the global variation of \(G_k\) is defined as

    $$\begin{aligned} \textrm{Err}_k\!=\!\varDelta x\!\! \max _{n=1,\ldots ,N}\left| (\varvec{e}_{M+1}\!-\!\varvec{e}_{1})^T\widetilde{F}_k(\varvec{x},\varvec{u}_n,\varvec{u}_{n+1},\chi )\!+\!\textbf{1}^T D_{\varDelta t}\widetilde{G}_k\!\left( \varvec{x},\varvec{u}_n,\chi \right) \right| , \end{aligned}$$
    (33)

    where \(\varvec{e}_j\) denotes the j-th column vector of the standard basis of \(\mathbb {R}^{M+1}\), and \(\textbf{1}\in \mathbb {R}^{M+1}\) is the column vector with all entries equal to one. When the boundary conditions are periodic, we consider instead the maximum error on the k-th global invariant, defined as

    $$\begin{aligned} \textrm{Err}_k=\varDelta x\max _{n=1,\ldots ,N}\left| \textbf{1}^T\left( \widetilde{G}_k\left( \varvec{x}, \varvec{u}_n,\chi \right) -\widetilde{G}_k\left( \varvec{x}, \varvec{u}_0,\chi \right) \right) \right| . \end{aligned}$$
    (34)
  • Computational cost, measured in terms of computation time in seconds.

For each of the families of schemes described in Sect. 4, we consider the following choices of the vector of free parameters:

  • \(\chi =\textbf{0}\), the default choice, fixed at every step.

  • \(\chi =\chi ^*\), the globally optimal fixed value that minimizes the solution error for the specific problem. This value is obtained by a brute force search based on empirical comparisons with the exact solution and is not available a priori. Hence, this choice of parameters does not constitute a practical numerical algorithm, and we do not report its computation time, which is excessive.

  • \(\{\chi _{r,n}^*\}_{n\geqslant 0}\), the sequence of values that minimize the local error estimate at each time-step, obtained using Algorithm 1 with projection on a grid with spatial resolution \(r\varDelta x\).

  • \(\chi =\overline{\chi }^*_r\), the fixed value obtained using Algorithm 2 with projection on a coarse grid with spatial resolution \(r\varDelta x\).

As discussed in Remark 6, we apply both Algorithms 1 and 2 with the simplified Gauss–Newton step (15).
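Since (15) is not reproduced here, the following is only a generic sketch of such a simplified step: a plain Gauss–Newton loop with a forward-difference Jacobian, no trust-region safeguard, and a toy quadratic `defect` standing in for the defect-based estimate being minimized (all names are ours):

```python
import numpy as np

def gauss_newton(defect, chi0, h=1e-7, tol=1e-10, max_iter=25):
    # minimize ||defect(chi)||^2 starting from chi0, without trust-region safeguards
    chi = np.array(chi0, dtype=float)
    for _ in range(max_iter):
        d0 = defect(chi)
        # forward-difference Jacobian, one column per parameter
        J = np.column_stack([(defect(chi + h * e) - d0) / h
                             for e in np.eye(len(chi))])
        step = np.linalg.lstsq(J, -d0, rcond=None)[0]
        chi = chi + step
        if np.linalg.norm(step) < tol:
            break
    return chi

# toy defect with a minimizer O(dt^2) away from 0, in the spirit of Remark 6
defect = lambda chi: np.array([chi[0] - 1e-3, 2.0 * (chi[1] + 2e-3)])
chi_opt = gauss_newton(defect, np.zeros(2))
```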

For all the experiments in this section, \(\varDelta t\) and \(\varDelta x\) are such that \(4<\varDelta t/\varDelta x<10\). In order to verify the validity of the observations in Remark 4, we apply the proposed algorithms with \(r=1,2,4,10\).

5.1 KdV Equation

In this section we solve the KdV equation (16), comparing the schemes in Sect. 4.1 with different choices of the parameters against schemes known in the literature, namely the multisymplectic method,

$$\begin{aligned} D_{\varDelta x}\left\{ \tfrac{1}{2}\mu _{\varDelta x}(\mu _{\varDelta x}\mu _{\varDelta t}u_{m-2,n})^2+D_{2,\varDelta x}\mu _{\varDelta t}u_{m-1,n}\right\} +D_{\varDelta t}\left\{ \mu _{\varDelta x}^3u_{m-2,n}\right\} =0, \end{aligned}$$

and the narrow box scheme,

$$\begin{aligned} D_{\varDelta x}\left\{ \tfrac{1}{2}(\mu _{\varDelta t}u_{m-1,n})^2+\mu _{\varDelta t}(D_{2,\varDelta x}u_{m-1,n})\right\} +D_{\varDelta t}\left\{ \mu _{\varDelta x}u_{m-1,n}\right\} =0, \end{aligned}$$

introduced in [1, 2]. Both of these schemes preserve a discrete conservation law of the mass, given by their definition, but no discrete conservation law of the momentum or the energy. These schemes are more compact than those defined in Sect. 4.1 and are not centred on the grid. Therefore, we evaluate the error in the global momentum and energy according to (34) with

$$\begin{aligned} \widetilde{G}_2\!=\!\tfrac{1}{2}(\mu _{\varDelta x}u_{m-1,n})^2,\,\,\, \widetilde{G}_3\!=\!\tfrac{1}{3}(\mu _{\varDelta x}u_{m-1,n})^3\!+\!(\mu _{\varDelta x}u_{m-1,n})D_{2,\varDelta x}(\mu _{\varDelta x}u_{m-1,n}). \end{aligned}$$

For methods EC(\(\alpha \)) (resp. MC(\(\beta ,\gamma \))) we evaluate the error (34) in the conservation of the momentum (resp. the energy) with \(\widetilde{F}_2\) and \(\widetilde{G}_2\) (resp. \(\widetilde{F}_3\) and \(\widetilde{G}_3\)) given in (23) (resp. (20)) with all parameters set to zero. As a first numerical test we consider the motion of a soliton. The initial condition is obtained from the exact solution on \(\mathbb {R}\),

$$\begin{aligned} u(x,t)=3\,\textrm{sech}^2\left( \tfrac{1}{2}(x-t+5)\right) , \end{aligned}$$

evaluated at \(t=0\). We solve this problem over \([a,b]=[-20,20]\) until the final time \(T=10\), on a grid with \(\varDelta x=0.05\) and \(\varDelta t=0.4\).
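As a sanity check, the residual of (16) on this profile can be evaluated with small central differences; the probe time, grid and difference step below are arbitrary choices of ours:

```python
import numpy as np

def u(x, t):
    # one-soliton solution of the KdV equation (16), unit speed
    return 3.0 / np.cosh(0.5 * (x - t + 5.0))**2

h = 1e-2
x = np.linspace(-10.0, 10.0, 41)
t = 2.0

def flux(y):
    # F = u^2/2 + u_xx, with u_xx from a central second difference
    uxx = (u(y + h, t) - 2.0 * u(y, t) + u(y - h, t)) / h**2
    return 0.5 * u(y, t)**2 + uxx

u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
F_x = (flux(x + h) - flux(x - h)) / (2.0 * h)
residual = np.max(np.abs(u_t + F_x))  # small, of size O(h^2)
```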

Fig. 2

One-soliton problem for KdV (16). Sequence of parameters \(\{\alpha _{r,n}^*\}_{n\geqslant 0}\) for EC(\(\alpha \)), (19), obtained using Algorithms 1 (left) and 2 (right) with different values of r

Fig. 3

One-soliton problem for KdV (16). Sequence of parameters \(\{(\beta _{r,n}^*,\gamma _{r,n}^*)\}_{n\geqslant 0}\) for MC(\(\beta ,\gamma \)), (22), obtained using Algorithms 1 (top) and 2 (bottom) with different values of r

Table 1 One-soliton problem for KdV (16)

We first find estimates for the optimal parameters of the schemes EC(\(\alpha \)) and MC(\(\beta ,\gamma \)) by applying the two new algorithms with projection (14) on grids with spatial resolution \(r\varDelta x\), \(r=1,2,4,10\). In Fig. 2 we show the sequences of optimal values given by Algorithms 1 and 2, respectively, at each time step of scheme EC. Similarly, in Fig. 3 we show the sequences obtained at each time step of the scheme MC.

Since the solution travels in a single direction with constant speed, after a few initial steps the sequences of parameters stabilize around constant values.

The sequences obtained by the two algorithms with \(r=1,2,4\) are all very close to each other (within a maximum distance of \(5\cdot 10^{-3}\)). In agreement with Remark 4, those obtained with \(r=10>\varDelta t/\varDelta x\) show the effect of the spatial error on the coarser grid; the accuracy of the solution can therefore be compromised when r is too large. Nevertheless, these parameters are not too far from the values obtained on the computational grid and still yield a good level of accuracy in all cases, while further reducing the computational overhead involved in the solution of the optimization problem.

In Table 1 we compare different schemes. The results obtained show that:

  • The values of the maximum error on the k-th global invariant, Err\(_k\), \(k=1,2,3\), evaluated according to (34), show that schemes EC and MC with fixed values of the parameters exactly preserve two conservation laws, and therefore two global invariants.

  • According to Remark 8, due to the conservative boundary conditions, the sequence of schemes EC\((\alpha ^*_{r,n})_{n\geqslant 0}\) preserves the global mass and the global energy. Similarly, according to Remark 9, schemes MC\((\beta ^*_{r,n},\gamma ^*_{r,n})_{n\geqslant 0}\) preserve the global mass but not the global momentum.

  • The computational cost of both of the proposed algorithms decreases as r increases, i.e., as the grid used for the solution of the optimization problem (see Eq. (14)) becomes coarser. Compared to solving the optimization problem on the same grid as the differential problem (\(r=1\)), the overall computational time is reduced to less than half when \(r=2\), and to less than a fourth when \(r=4\), making the computational cost of the methods proposed in this paper comparable to that of the other schemes in the literature. As expected, setting \(r=10\) reduces the computational time further.

  • All the approximations obtained with the sequences of parameters given by Algorithms 1 and 2 are more accurate than the solutions of the schemes from the literature and of EC(0) and MC(0,0). However, schemes EC(\(\alpha ^*\)) and MC(\(\beta ^*,\gamma ^*\)), where the parameters are obtained by brute force, are more accurate still. This shows that there exist sequences of parameters yielding higher accuracy than those obtained using the new algorithms. The reason is that our approach minimizes (8), an estimate of the local error, and reducing the local error does not necessarily minimize the global error.

  • The accuracy of the solutions obtained using the two new algorithms is only marginally affected by the different choices of \(r<\varDelta t/\varDelta x\). However, for \(r=10>\varDelta t/\varDelta x\) the spatial error has a visible effect on the sequences of parameters obtained, as is evident in Figs. 2 and 3. Here the accuracy of the solution is only marginally affected, but in general this may not be the case. In agreement with Remark 4, setting \(r=4\) gives the best compromise between reliability and speed of computation.

  • Algorithms 1 and 2 with \(r=4\) are significantly more efficient than the schemes from the literature: while computation times are comparable, the solution error is roughly three times lower.

Fig. 4

One-soliton problem for KdV (16). Left: Initial condition (dashed line) and solution of EC(\(\overline{\alpha }^*_{4})\). Right: Pointwise error for different schemes at the final time

In the left plot of Fig. 4 we show the initial profile and, as an example, the solution of EC(\(\overline{\alpha }_{4}^*\)) at the final time. In the right plot we show the absolute error given by EC(\(\overline{\alpha }_{4}^*\)) at every point, in comparison with EC(0), EC(\(\alpha ^*\)), and the narrow box scheme. For all schemes, the bulk of the error is located around the final position of the soliton and is due to a delay introduced by all the numerical schemes. However, the maximum error introduced by EC(\(\overline{\alpha }_{4}^*\)) is less than 40% of that given by EC(0) and by the narrow box scheme, and only slightly larger than the error given by EC(\(\alpha ^*\)).

As a second numerical test we consider the interaction of two solitons over \([a,b]=[-30,30]\) until time \(T=15\). The initial condition is obtained from the exact solution on \(\mathbb {R}\),

$$\begin{aligned} u(x,t)=\frac{12(c_1-c_2)(c_1\cosh ^2\xi _2+c_2\sinh ^2\xi _1)}{\left( (\sqrt{c_1}-\sqrt{c_2}) \cosh (\xi _1+\xi _2)+(\sqrt{c_1}+\sqrt{c_2})\cosh (\xi _1-\xi _2)\right) ^2}, \end{aligned}$$

with

$$\begin{aligned} \xi _1=\frac{c_1}{2}(x+d_1-c_1t),\quad \xi _2=\frac{c_2}{2}(x+d_2-c_2t). \end{aligned}$$

We set

$$\begin{aligned} c_1=2,\qquad c_2=1,\qquad d_1=17,\qquad d_2=10,\qquad \varDelta x=0.05,\qquad \varDelta t=0.25. \end{aligned}$$
Fig. 5

Two-soliton problem for KdV (16). Sequence of parameters \(\{\alpha _{r,n}^*\}_{n\geqslant 0}\) for EC(\(\alpha \)), (19), obtained using Algorithms 1 (left) and 2 (right) with different values of r

Fig. 6

Two-soliton problem for KdV (16). Sequence of parameters \(\{(\beta _{r,n}^*,\gamma _{r,n}^*)\}_{n\geqslant 0}\) for MC(\(\beta ,\gamma \)), (22), obtained using Algorithms 1 (left) and 2 (right) with different values of r

In Figs. 5 and 6 we show the sequences of parameters obtained at each time step of schemes EC and MC, respectively. Again, all the sequences obtained by solving the optimization problems on a grid up to four times coarser are very close (within a distance of \(6\cdot 10^{-3}\)) to those obtained on the computational grid.

We notice that the parameters change rapidly when the solitons interact, but vary only slowly before and after the interaction.

Table 2 Two-soliton problem for KdV (16)

In Table 2 we compare the accuracy, efficiency and conservation properties of different schemes. The observations made for the motion of a solitary wave also hold in this case. As before, the computational times of the new algorithms with \(r=4\) are comparable to those of the methods from the literature, and their greater efficiency is evident from much lower solution errors.

Fig. 7

Two-soliton problem for KdV (16). Left: Initial condition (dashed line) and solution of MC\((\beta _{4,n}^*,\gamma _{4,n}^*)_{n\ge 0}\). Right: Pointwise error for different schemes at the final time

In the left plot of Fig. 7, we show the initial condition together with the solution of the sequence MC\((\beta _{4,n}^*,\gamma _{4,n}^*)_{n\ge 0}\). In the right plot, we show the pointwise error of MC\((\beta _{4,n}^*,\gamma _{4,n}^*)_{n\ge 0}\), MC(0,0), and MC\((\beta ^*,\gamma ^*)\) at the final time. We omit the results for the multisymplectic and the narrow box scheme, as they are similar to those of MC(0,0). The methods obtained using the approaches introduced in this paper are very accurate around the final location of the faster soliton, where the bulk of the error given by MC(0,0) and by the schemes from the literature is located. However, the error around the slower soliton is larger and small oscillations can be seen far from the solitons. MC\((\beta ^*,\gamma ^*)\) gives a slightly smaller error around the slower soliton, but more oscillations are introduced.

5.2 Nonlinear Heat Equation

In this section we apply the new algorithms to the nonlinear heat Eq. (25) using the methods CS(\(\lambda \)) from Sect. 4.2.

We consider here two benchmark problems with (weak) energy solutions that are not classical solutions, as they have at least one point of non-differentiability. Although smoothness of the solution is assumed in Sect. 3, we show that the strategies introduced in this paper are also effective in this setting.

In order to converge, explicit and implicit finite difference methods for (25) in the literature typically require \(\varDelta t=\mathcal {O}(\varDelta x^2)\) and \(\varDelta t=\mathcal {O}(\varDelta x)\), respectively [9, 10, 18, 19, 22]. Under such small time-step restrictions, the Crank–Nicolson method applied to the semidiscretization (31) turns out to be very accurate and efficient. However, it fails to converge for the two benchmark tests in this section with \(\varDelta t>\varDelta x\). Such instabilities may also occur when using a CS(\(\lambda \)) method with the default choice of the parameter, \(\lambda =0\). In contrast, we find that the two algorithms proposed in Sect. 3, based on the optimization of a defect-based error estimate, are able to avoid these instabilities.

Fig. 8

NLH (25) with initial and boundary conditions (35). Sequence of parameters \(\{\lambda _{r,n}^*\}_{n\geqslant 0}\) for CS(\(\lambda \)), (28), obtained using Algorithms 1 (left) and 2 (right) with different values of r

The first benchmark problem is given by Eq. (25) with initial and boundary conditions,

$$\begin{aligned} u(x,0)=0,\qquad u(0,t)=t,\qquad u(5,t)=0,\qquad (x,t)\in [0,5]\times [0,3]. \end{aligned}$$
(35)

The solution of this problem is a linear wave travelling in an undisturbed medium with unit speed,

$$\begin{aligned} u_{\text {exact}}(x,t)=\left\{ \begin{array}{ll} t-x,&{} \text {if } t>x,\\ 0,&{} \text {otherwise.} \end{array}\right. \end{aligned}$$

We discretize the initial boundary value problem described by (25) and (35), choosing \(\varDelta x=0.025\) and \(\varDelta t=0.12\). The graphs in Fig. 8 show the sequences of parameters obtained from Algorithms 1 and 2 at each time step, where the search for the optimal parameters is carried out after applying the projection (14) with spatial resolution \(r\varDelta x\), \(r=1,2,4,10\). Although the values of the parameters given by the two algorithms differ over the first time-steps, for the smallest values of r the two procedures converge to the same value.

In Table 3 we compare the different choices for the parameter \(\lambda \). The error in the conservation laws is calculated according to (33) and the figures in the table show that these are preserved to machine accuracy by all the methods that use a fixed value of the free parameter.

The two algorithms introduced in this paper give equally accurate solutions. By increasing r, the computation time decreases, while the accuracy of the solution is only marginally affected. Moreover, both new algorithms avoid the instabilities that occur when \(\lambda =0\). This is illustrated on the left of Fig. 9 where, as an example, we plot the solutions of CS(\(\overline{\lambda }^*_{4}\)) (marks at every tenth grid point) and CS(0) at the final time.

On the right of Fig. 9, we plot the pointwise errors given by CS(\(\lambda ^*\)) and CS(\(\overline{\lambda }^*_{4}\)). The error obtained with \(\lambda =\overline{\lambda }^*_{4}\) is almost entirely located at the interface where the solution is not differentiable. With \(\lambda =\lambda ^*\), the \(L^2\) error is lower and the solution is more accurate around the interface; however, spurious oscillations appear where the true solution is smooth.

Table 3 NLH (25) with initial and boundary conditions (35)
Fig. 9 NLH (25) with initial and boundary conditions (35). Exact and numerical solutions of CS(\(\lambda \)), (28), with \(\lambda =0\) and \(\lambda =\overline{\lambda }_{4}^*\) (left). Solution error for \(\lambda =\overline{\lambda }_{4}^*\) and \(\lambda =\lambda ^*\) (right)

The second benchmark problem is (25) with

$$\begin{aligned} u(x,0)=\left( 1-\frac{x^2}{6}\right) _+,\qquad u(-6,t)=u(6,t)=0,\qquad (x,t)\in [-6,6]\times [0,9], \end{aligned}$$
(36)

where \(f_+=\max (f,0).\) The solution of this problem is the Barenblatt profile,

$$\begin{aligned} u_\textrm{exact}(x,t)=(t+1)^{-1/3}\left( 1-\frac{x^2}{6(t+1)^{2/3}}\right) _+. \end{aligned}$$

This solution has compact support and is not differentiable at the interface points, which move outward at a finite speed. We solve this problem with \(\varDelta x=0.02\) and \(\varDelta t=0.09\).

We show in Fig. 10 that in this case the two proposed algorithms generate sequences of parameters that approach a small negative value for all the considered values of r.

Table 4 shows that all the schemes preserve a discrete version of the global invariants. When the value of the parameter is fixed during the iteration, this is a consequence of the preservation of the local conservation laws. When a sequence of different parameters is used, it is instead due to the zero boundary conditions (see Remark 10). Although the conservation laws are then not preserved locally, this choice yields the lowest solution error.
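As an independent sanity check of the global invariant (the schemes themselves are not reproduced here), one can verify that a trapezoidal approximation of the mass of the exact Barenblatt profile is time-independent; the function names below are ours:

```python
import math

def barenblatt(x, t):
    """Barenblatt profile: exact solution of the second benchmark."""
    s = 1.0 - x * x / (6.0 * (t + 1.0) ** (2.0 / 3.0))
    return (t + 1.0) ** (-1.0 / 3.0) * max(s, 0.0)

def discrete_mass(t, dx=0.02, half_width=6.0):
    """Trapezoidal approximation of the integral of u(., t) over [-6, 6]."""
    n = int(round(2.0 * half_width / dx))
    vals = [barenblatt(-half_width + i * dx, t) for i in range(n + 1)]
    return dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

Substituting \(y = x/(\sqrt{6}\,(t+1)^{1/3})\) reduces the mass integral to \(\sqrt{6}\int (1-y^2)_+\,\textrm{d}y = 4\sqrt{6}/3\) for every t, so the discrete masses at t = 0 and t = 9 agree up to the quadrature error concentrated at the moving interface.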

The figures in Table 4 show that with increasing r the computation time decreases, while the accuracy is not significantly affected even for \(r=10\). This is because the sequences of parameters obtained for different values of r quickly approach each other after just a few time-steps (see Fig. 10). Once again, we find in Fig. 11 (left) that the sequence of parameters \(\{\lambda _{4,n}^*\}_{n\geqslant 0}\) obtained from Algorithm 1 (marks at every tenth grid point) allows us to avoid the numerical instability that occurs under the default choice, \(\lambda =0\).

On the right of Fig. 11 we show the solution errors of CS(\(\lambda _{4,n}^*\))\(_{n\geqslant 0}\) and of CS(\(\lambda ^*\)). The error obtained using the sequence of parameters given by Algorithm 1 is mainly located at the points where the solution is not differentiable. Although \(\lambda =\lambda ^*\) minimizes the \(L^2\) norm of the error over all fixed values of \(\lambda \), the error at the interface can be further reduced by changing the parameter at every time-step. Moreover, for \(\lambda =\lambda ^*\), relatively large error components appear near \(\pm 2.5\), where the solution is otherwise smooth. These are not visible in the solution errors obtained using either the adaptive sequences given by Algorithm 1 or the fixed parameters given by Algorithm 2.

Table 4 NLH (25) with initial and boundary conditions (36)
Fig. 10 NLH (25) with initial and boundary conditions (36). Sequences of parameters \(\{\lambda _{r,n}^*\}_{n\geqslant 0}\) for CS(\(\lambda \)), (28), obtained using Algorithms 1 (left) and 2 (right) with different values of r

Fig. 11 NLH (25) with initial and boundary conditions (36). Exact and numerical solutions of CS(\(\lambda \)), (28), with \(\lambda =0\) and \(\lambda =\{\lambda _{4,n}^*\}_{n\geqslant 0}\) (left). Solution error for \(\lambda =\{\lambda _{4,n}^*\}_{n\geqslant 0}\) and \(\lambda =\lambda ^*\) (right)

6 Conclusions

In this paper we have proposed two approaches for identifying an optimal method within a parameter-dependent family of numerical schemes, based on minimizing the defect as an estimate of the local error. The first approach adaptively selects a different value of the parameters at every time-step. The second approach derives fixed values of the parameters from such a sequence. The latter does not compromise any parameter-dependent conservation properties of geometric integrators.

The new algorithms solve an optimization problem at each time-step in order to identify the optimal values of the parameter. In principle, this can prohibitively increase the computational cost of the original method. However, in the large time-step regime, the optimization problem can be solved on coarser spatial grids without compromising the accuracy of the optimal parameters, significantly reducing the computational time.

The new approaches have been applied to families of schemes that preserve local conservation laws for the KdV equation and for a nonlinear heat equation. The proposed numerical tests show that, on the one hand, the new strategies effectively identify very accurate methods within each considered family of schemes; on the other hand, solving the optimization problem on a coarse grid greatly improves the efficiency of the new strategies. Overall, the computational time is comparable to that of other schemes in the literature, while the accuracy is substantially higher.