Asymptotic Behaviour of Time Stepping Methods for Phase Field Models

Adaptive time stepping methods for metastable dynamics of the Allen–Cahn and Cahn–Hilliard equations are investigated in the spatially continuous, semi-discrete setting. We analyse the performance of a number of first and second order methods, formally predicting step sizes required to satisfy specified local truncation error \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma $$\end{document}σ in the limit of small length scale parameter \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\epsilon \rightarrow 0$$\end{document}ϵ→0 during meta-stable dynamics. The formal predictions are made under stability assumptions that include the preservation of the asymptotic structure of the diffuse interface, a concept we call profile fidelity. In this setting, definite statements about the relative behaviour of time stepping methods can be made. Some methods, including all so-called energy stable methods but also some fully implicit methods, require asymptotically more time steps than others. The formal analysis is confirmed in computational studies. We observe that some provably energy stable methods popular in the literature perform worse than some more standard schemes. We show further that when Backward Euler is applied to meta-stable Allen–Cahn dynamics, the energy decay and profile fidelity properties for these discretizations are preserved for much larger time steps than previous analysis would suggest. The results are established asymptotically for general interfaces, with a rigorous proof for radial interfaces. It is shown analytically and computationally that for most reaction terms, Eyre type time stepping performs asymptotically worse due to loss of profile fidelity.


Introduction
The mathematical literature for computational methods for Allen-Cahn (AC) dynamics [2], and its higher order relative Cahn-Hilliard (CH) dynamics [6], is dominated by the proposal, use, and analysis of so-called energy stable schemes [12,25,26,29]. AC and CH dynamics are gradient flows on an energy functional, and the solution should decrease that energy in time. Energy stable schemes guarantee that decrease no matter what time step is chosen. This is a desirable property not shared by standard fully implicit, semi-implicit (IMEX), or exponential integrator time stepping methods. We will show in this work that some (but not all) fully implicit methods can outperform energy stable schemes when subject to fixed accuracy requirements. The recent article [30] gives especially clear evidence that when time steps are chosen appropriately, fully implicit methods are conditionally energy stable, and further that the large time steps allowed by energy stable schemes can come at the cost of significant loss of accuracy. We show that in the metastable dynamic regime of AC and CH, some fully implicit methods can take optimally sized time steps. By optimal, we mean the asymptotically largest time steps as the order parameter → 0 that satisfy a given local error tolerance. Here, represents the width of interfacial layers in metastable dynamics and, like the authors of [30], we use the form of the equations scaled so that these dynamics transpire in an O(1) time scale. When the dynamics are in this metastable regime, which dominates the time of typical phenomena of interest, definite statements about the behaviour of different time stepping methods can be made. This criteria does not take into account solver efficiency. However, we can make definite statements on how efficient solvers for nonlinear implicit time stepping need to be to outperform other methods.
A combination of asymptotic analysis and careful computational work backs up our claims. The computational codes that generate the results shown in the paper are available by e-mail request from the last author. In addition, we present a rigorous result for implicit time stepping for meta-stable AC dynamics in radial geometry that shows that asymptotically larger time steps can be taken than previous analysis would suggest. These time steps preserve the diffuse interface structure (a property that we call proflie fidelity) and also the energy decay property of the equations. This result is shown for a class of reaction terms. An interesting result in Sect. 6.2 shows that Eyre-type time stepping can perform asymptotically worse with most reaction terms, while implicit time stepping has uniform asymptotic behaviour over a class of reaction terms. This was predicted by the analysis and confirmed computationally.
Our study focuses on pure materials science applications rather than the use of Cahn-Hilliard equations to track interfaces in so-called diffuse interface methods [32] in which the CH dynamics are coupled to other physics. We consider the simplest form of AC and CH dynamics, whose Gamma limit (as → 0) is well understood and use that well known structure to gain insight into the behaviour of the schemes. The authors believe that the insight gained from these studies will also apply to schemes used for other materials science models which are less well understood.
We consider a number of first and second order time stepping schemes: the energy stable Eyre's method [14]; Backward (Implicit) Euler (BE) [17]; Trapezoidal Rule (TR) [17]; Second order Backward Differentiation Formula (BDF2) [17]; Secant [13]; standard semiimplicit (linear IMEX) methods of first and second order [3]; first and second order Scalar Auxiliary Methods (SAV) [25] for which a modified energy stability can be proved; and finally a second order Singular Diagonally Implicit Runge Kutta method with good stability properties (DIRK2) [17]. The resulting implicit systems are considered in the spatially continuous semi-discrete setting in a 2D periodic domain, with numerical validation done with a suitably refined Fourier spectral approximation. Time step schemes that result in nonlinear systems are solved with Newton's iterations using the Preconditioned Conjugate Gradient Solver (PCG) developed in [9] at each iteration. Adaptive time stepping is done based on a user-specified local error tolerance σ . The variation of the number of time steps with for fixed σ is predicted based on formal consideration of the local truncation error of the schemes in the metastable dynamics. The formal predictions are then validated in computational studies. With this criteria, first order BE performs better (asymptotically fewer time steps as → 0) than Eyre and first order IMEX and SAV. Second order TR and BDF2 perform better than Secant, DIRK2, and second order IMEX and SAV. The difference in both cases is asymptotically larger for CH than AC. These comparisons are also valid for computational time, using PCG counts as the measure, to similar accuracy. It is seen that optimal numbers of time steps are obtained when the dominant local truncation error is a higher order time derivative. This observation may have application in other systems with metastable dynamics. We observe that standard IMEX methods perform almost identically to SAV methods of the same order in the scenario we consider, at reduced computational cost.
It is observed that the global accuracy of BE is better than a naïve prediction based on the size of the local truncation error would suggest. A formal analysis of the scheme for the AC case shows that the dominant error made in one time step is asymptotically smaller than expected. This is due to a special structure of the local truncation error for BE, in which the asymptotically largest term lies in a strongly damped space.
We introduce the equations and numerical schemes in Sect. 2 with some introductory analysis. The scaling for AC and CH is chosen so that the metastable interface dynamics (approximate curvature motion for AC and Mullins-Sekerka flow [22] for CH) occurs in O(1) time. In Sect. 3 we examine the metastable dynamics of the equations and make predictions for the behaviour of the time steps with and local error tolerance σ under stability assumptions which are verified numerically in Sect. 4. We give an asymptotic analysis for the surprising accuracy and stability properties for BE with large time steps applied to AC in Sect. 5. In Sect. 6 we present the rigorous result for BE applied to AC with large time steps and also show the loss of profile fidelity for Eyre-type time stepping for most reaction terms. We end with a short discussion.

Equations and Schemes
We consider the simplest form of the AC dynamics for u(x, t) given by where f (u) = u 3 −u is the classical form of the reaction term. More general, smooth reaction terms are considered in Sect. 6. Non-smooth reaction terms and degenerate mobility are also of interest in some materials science applications and there are stability and convergence results for implicit time stepping applied to these problems in [4,5] for example. However, the asymptotic behaviour of time stepping schemes for these problems is not clear. CH dynamics is described by a higher order partial differential equation For computational simplicity, we consider the two-dimensional (2D) cases of these equations in a doubly periodic cell [0, 2π] 2 . The time scaling in the equations above is chosen to give sharp interface (as → 0) motion in O(1) time. The sharp interface limit yields curvature driven flow for AC and a nonlocal Mullins-Sekerka flow for CH [22]. Both types of dynamics have an associated energy functional where W (u) = 1 4 (u 2 −1) 2 and the reaction term f (u) = W (u). The energy E(t) is monotonic decreasing due to the gradient flow nature of the dynamics. For AC the gradient is in L 2 and for CH it is H −1 .

Backward Euler
We consider the simplest implicit scheme, first order Backward Euler (BE), also known as Implicit Euler. Applied to (1) keeping space continuous, we have where u n (x) approximates the exact solution u(x, t n ) and k n = t n+1 − t n is the time step.
We use the classical f (u) = u 3 − u as mentioned above. Dropping the subscript on the time step and the unknown solution at time level n + 1 we have the nonlinear problem for u given u n .

Definition 1 A time stepping scheme is said to have the energy decay property
This property could be conditional on the choice of time step size. Additionally, it could depend on u n . If a scheme has the energy decay property for any u n and k, the scheme is called unconditionally energy stable.  [30] with a different proof.
Proof The existence of u follows from the standard method of sub-/super-solutions applied to comparison functions −1 and +1. To establish uniqueness, we assume u 1 and u 2 are solutions. Then their difference w = u 1 − u 2 is a solution of where s takes values between u 1 and u 2 , and hence in [−1, 1]. Isolating w leads to the elliptic problem and if k < 2 / f ∞ then the corresponding elliptic operator is strictly positive and w is zero by the maximum principle. To establish energy decay, we take the inner product of (4) with the test function u − u n : From the Fundamental Theorem of Calculus we develop the expansion, Using this relation to eliminate f (u) yields the equality, which yields energy decay for k < 2 2 / f ∞ . The Theorem is also true when homogeneous Neumann boundary conditions are specified.
Thus we have existence of solutions to (4) for any time step size, and uniqueness and energy stability under the resitriction k ≤ 2 2 / f ∞ . This is true for any u n under the restrictions of the Theorem. We shall see in Sect. 6 that asympoticaly larger time steps k = o( ) can be taken when the dynamics are slow (interface motion) with locally unique, energy stable solutions. This is verified in computational tests.

Eyre's Method
An alternative first order scheme to fully implicit BE was proposed by Eyre [14]: The scheme is derived conceptually by keeping a convex part of the reaction term f (u) = u 3 − u implicit and a concave part explicit. In this sense, it is an IMEX method but an unusual one since a nonlinear term is kept implicit and a linear term is handled explicitly. The method has appealing properties: Theorem 2 (from [14]) The time step (5) has a unique solution u for any u n and k that is unconditionally energy stable.
Additional first order schemes considered are the SAV scheme [25] and a linear IMEX method [3]: with S > 0, sometimes called a stabilization term. We take S = 2 since that makes the left hand side a linearization about the far field values, but computational performance is relatively insensitive to S. The SAV scheme is energy stable with a modified energy. We use the same stabilization coefficient as above in the SAV scheme. There is a class of linearly implicit energy stable schemes [8,19,20] that require an asymptotically large stabilization term O( − p ) with p large and increasing from AC to CH and 2D to 3D for the analysis.
These methods are theoretically interesting but are extremely inaccurate and not useful for practical applications. We have further discussion of these schemes in Remark 3. All time stepping schemes can be applied to CH (2), with BE and Eyre shown below: In this case, BE is known to have unique solutions with the energy decay property when k < 3 [30] and Eyre is unconditionally energy stable [14].

Second Order Schemes
We also consider the second order methods Trapezoidal Rule (TR), Secant (S) [13], Second Order Backward Differencing (BDF2), and Second Order Singular Diagonal Implicit Runge Kutta (DIRK2) [17] methods. These are described below for u t = F (u) with and F (u) = − ΔΔu + Δf (u)/ for CH With this notation: Secant is a variant of TR with the term f (u) − f (u n ) replaced by where W is the energy term from (3). It is known to be conditionally energy stable [13]. For the simple form of W we have taken, the expression above can be factored explicitly. DIRK2 is a two stage method Both DIRK2 and BDF2 are A-stable, and so preferable to TR and Secant from the perspective of stiff ODE solver theory [17]. A second order linear IMEX method (SBDF2 [3]) and two variants of second order SAV methods based on BDF2 are also considered.
There are many other specialized schemes in the literature and we mention two second order unconditionally energy stable concave-convex splitting schemes here. They are nonlinear two step schemes but the nonlinear problem at each time step is convex. One is based on TR, with the cubic term handled implicitly and the linear reaction term extrapolated from two previous time steps [11]. The second is a variant of SBDF2, again with the cubic term handled implicitly and an additional moderate stabilizing term [31]. Both these schemes have the same asymptotic error behaviour as the Secant, DIRK2, and SBDF2 methods shown in detail below.

Spatial Discretization and Solution Procedure
The current work concentrates on the time stepping errors, and it is convenient to consider the semi-discrete, spatially continuous approximation. This idealization is approximated well by the Fourier spectral spatial discretization. The computational results shown have sufficient spatial resolution that spatial errors do not affect the results in the digits shown.
We use the Preconditioned Conjugate Gradient (PCG) solvers developed in [9] for the schemes involving nonlinear implicit problems. We note that there has been recent promising work in the use of preconditioned steepest descent with approximate line search in solving these nonlinear problems [7,15]. Another approach has been to recast the implicit step as a minimization problem [30]. Both these techniques have the advantage that they look for local solutions which can be unique and have energy decay even for large time steps, as shown rigorously in Sect. 6.
The computations in this work are done in a full 2D setting, rather than in a reduced dimensional radial setting as could be done, in order to give PCG iteration counts for the nonlinear time stepping methods that have meaning for more general computations. Note that the PCG counts are independent of spatial resolution when the problem is resolved.

Error Estimation and Adaptive Time Stepping
We perform two time steps of the same size k in order to use a specialized predictor u p for u n+2 .
where F (u) = Δu − f (u)/ 2 for AC and − ΔΔu + Δf (u)/ for CH as above. Time step sizes are adjusted so that The predictor u p is formally one order more accurate than the numerical approximation u n+2 from time stepping, up to fifth order. The predictor has an inherent dominant local error k 5 u ttttt /90 that is a pure time derivative of u, which is shown below in Sect. 3.1.3 to be a desirable property. For the one step methods, the time step is adjusted adaptively to maintain a local error below σ as described in [9]. For BDF2 and its linear variants, time steps are only adjusted by a factor of two. When time steps are reduced (using Hermite cubic interpolation for the restart value) or increased, four time steps are taken before checking the local error to allow relaxation of the initial error layer.

Metastable Dynamics
In our formulation, it is known that after a short time O( 2 ) solutions to AC tend to interfaces between regions of solution near the equilibrium values, u ≈ ±1.
with g(z) := tanh(z/ √ 2) and z = dist(x, Γ )/ , where Γ is the approximate interface with arc length parameter s moving with curvature motion (normal velocity equal to curvature). We fix its location at the u = 0 level set. The local coordinates (s, z) are shown in Fig. 1. This structural result on the metastable solution can be obtained with formal asymptotics. In the outer asymptotic region for AC the solution takes the form u = ±1 to all orders. Curvature motion as the limit → 0 has been proven rigorously [1,23].
CH has the same metastable solution structure (8) with normal interface velocity given by Mullins-Sekerka flow, in O(1) time in our scaling (2). We refer the reader to the review article [24] for details. It has been shown that numerical schemes can accurately approximate this limit with implicit time stepping with appropriate scaling of the time step with [16]. We will show that this limit can be taken with asymptotically larger time steps than in that analysis.
From (8), we see that time and space derivatives are large near the interface. Starting with we can take a time derivative to obtain: where V is the normal velocity at the point on Γ closest to x. Formally taking higher derivatives in this pattern yields: This is used to analyze the truncation error of the time stepping schemes.

Predicted Time Step Sizes for AC
A standard strategy for adaptive time stepping is to have a user specified local error tolerance of σ . The error for each time step is estimated and the time step adjusted so that there is an estimated error in that single time step less than σ . It is known that the dominant local truncation error for BE is k 2 u tt /2 which in metastable dynamics is O(k 2 / 2 ) from (9). The local truncation error restriction then requires time steps of size We now proceed to determine the expected behaviour of time steps with and σ from the other schemes. We can write the BE scheme (4) and Eyre's scheme (5) for AC in an instructive way Knowing that the truncation error for BE is O(k 2 / 2 ) we see that the truncation error for the Eyre scheme is dominated by the last term in its expression above, which has leading order Thus, the advantage of the Eyre scheme to be able to take large time steps and remain energy stable is never realized if accurate computational results are required. Reference [30] has an alternate way to view the loss of accuracy that does not highlight this asymptotic difference. The first order IMEX and SAV schemes have the same asymptotic behaviour as Eyre.

Remark 1
It is well known that when large time steps are taken with Eyre's method, the dynamics occur in a slower time scale. This is an exact result for AC [30], qualitative for CH [10]. In this work, time steps are restricted by a specified local error tolerance. Thus, we do not see a change in time scale for the results of Eyre's method, rather we see decreased time step size.

Remark 2
The formal local error analysis above relies on the stability of the schemes in metastable dynamics under the resulting time step restrictions. More than simple stability, the analysis requires that the time stepping preserves the asymptotic structure of the diffuse interface. This is the concept we have named profile fidelity. All predicitions described in this section lead to time stepping that preserves profile fidelity for the classical choice of f (u) = u 3 − u. We observe the predicted time step behaviour in and σ computationally. In Sect. 6.2 we show that for (most) other reaction terms, Eyre time stepping loses profile fidelity for time steps k = O( 3/2 ) and in these cases, k = O( 2 ) is needed for accuracy.

Remark 3
The first order, linearly implicit energy stable scheme for 2D AC is analyzed in [8]. The analysis requires a stabilization term of order −2 | ln |. If such a scheme were implemented, the time steps required for a local error tolerance of σ would be k = O( √ σ 5/2 /| ln |), prohibitively small for practical computation.
We can determine the dominant term in the local truncation errors of the second order schemes applied to AC: Table 1 Order predictions for the behaviour of the numerical schemes with local error tolerance σ in the metastable regime of AC dynamics Here, L is the local error, k is the time step size, and M is the number of time steps to reach a fixed end time We consider two second order SAV variants based on how an extrapolated approximation is computed. If the extrapolated value of u n+1 is taken as 2u n − u n−1 the scheme (referred to as SAV2-A) behaves similarly to SBDF2. If the extrapolated value is computed with a first order linear IMEX scheme as suggested in [25] (referred to as SAV2-B), the scheme has a local truncation error of order k 3 / 5 . The results are summarized in Table 1. It is clear that BE takes asymptotically (as → 0) fewer time steps than Eyre, although they are both first order in time step size. TR and BDF2 take asymptotically fewer time steps than Secant, DIRK2, SBDF2, SAV2-A and SAV2-B although they are all second order methods. The computations in Sect. 4 below show that these time step estimates correspond to real computational behaviour.

Remark 4
We predict the number M of time steps in Tables 1 and 2 and how it varies with and σ . As shown in Fig. 2 we are also predicting how a profile of time steps k(t) behaves with and σ .

Predicted Time Step Sizes for CH
The same local truncation analysis can be done for the CH in the metastable regime where the solution has the same interface structure (8) with the interface Γ moving approximately with Mullins-Sekerka flow in O(1) time. BE, TR, BDF2, and SBDF2 have the same error expressions as above, but Eyre, Secant and DIRK2 have local truncation errors when applied to CH listed below: where we have used the fact that the Laplacian Δ increases the size of terms by 1/ 2 near the interface. The first order IMEX and SAV schemes have the same asymptotic behaviour as Eyre. SAV2-A behaves similarly to SBDF2 as before, with SAV2-B worse by a power of as for the AC case above. The results are summarized in Table 2. The predictions in this table are validated in the numerical experiments in the next section. Although the methods all have the formal order of accuracy in terms of time step size, the behaviour as → 0 varies Table 2 Order predictions for the behaviour of the numerical schemes with local error tolerance σ in the metastable regime of CH dynamics Here, L is the local error, k is the time step size, and M is the number of time steps to reach a fixed end time significantly. Note that the gap between BE and the other first order schemes, and between TR/BDF2 and Secant/DIRK2/SBDF2/SAV2-A is wider for CH dynamics than it was for AC.

Discussion: The Source of Increased Local Error
In the metastable regime, the two terms in AC and CH (diffusion and nonlinear reaction) are both large but approximately cancel to give the slow dynamics. The methods with asymptotically (as → 0) small local errors (BE, TR, BDF2) have dominant truncation errors that are pure time derivatives of the solution, which inherit this high order cancellation. The other methods which have large local errors have truncation errors that involve the reaction term individually. This imbalance amplifies the size of the error. As an example, DIRK2 applied to u t = F (u) has an error proportional to F u 2 t . From this discussion, we believe the ranking of the schemes in this work will also apply to other nonlinear problems with metastable dynamics.

Allen-Cahn
We take initial conditions in the form of a radial front and compute with = 0.2, 0.1, 0,05 and 0.025. The benchmark for accuracy is the time T at which the value at the domain centre (π, π) changes from negative to positive. Except for the exponentially small (in ) derivative discontinuities at the periodic boundaries, the dynamics approximate the sharp interface limit of curvature motion of a circle shrinking to a point at the domain centre. The expectation from asymptotic analysis of the sharp interface limit is that This is confirmed by the numerical solutions below. A video of the dynamics is available [27]. Here, M is the total number of time steps taken (with the ratio to the value above in brackets), CG is the number of conjugate iterations (with the ratio to the number of time steps in brackets), E is the error in the benchmark time Here, M is the total number of time steps taken (with the ratio to the value above in brackets) and CG is the number of preconditioned conjugate gradient iterations (with the ratio to the number of time steps in brackets), E is the error in the benchmark time with * denoting a result correct to three decimal places

First Order Methods
The PCG approach is known to have bounded condition number under the scaling k = C 2 for BE with C < 1 [30] and we observe good behaviour in the example below even with C > 1 in the metastable regime. It is observed computationally in this work that the PCG for Eyre's method is independent of k and although the authors are not aware of a proof in the literature. PCG counts can be used as a proxy for computational time when comparing methods.
Results of the numerical experiments in which σ and were varied for BE and Eyre are shown in Tables 3 and 4. Spatial errors do not affect the digits shown in any of the computational results in this paper. Table 3 validates the second order O(k 2 ) local truncation error since the number of time steps was predicted to be M = O(1/ √ σ ) for both methods with constant, noting that √ 10 ≈ 3.16. Such results for other schemes and for the CH benchmark problem below are not shown, but verify the formal accuracy of the schemes. Table 4 validates the prediction of M = O(1/ ) for BE and M = O(1/ 3/2 ) for Eyre with σ constant, noting that 2 3/2 ≈ 2.83. Both tables validate the prediction that for the same local tolerance σ , Eyre involves more computational work than BE and gives less accurate answers. CG counts for both methods are small as expected. You see (unexpectedly) that the final accuracy of BE does not seem to degrade as → 0 for fixed σ . This is discussed in Sect. 5 below. Although BE does not guarantee energy stability, no step accepted by the local error tolerance exhibited an energy increase. Fig. 2 Time steps k for Allen-Cahn dynamics with and σ varied using BE (left) and DIRK2 (right). The time steps decrease in size as the simulation approaches the topological singularity at t ≈ 2. Note that for each method, the profiles k(t) have the same shape as σ and are varied and scale with these quantities according to our theoretical predictions. In particular, note that time steps decrease more quickly for BE as σ is decreased but more quickly for DIRK2 as is decreased, as we predict Here, M is the total number of time steps taken (with the ratio to the value above in brackets) and E is the error in the benchmark time For completeness, we show the time step sizes as a function of time for BE in Fig. 2 with and σ varied. As mentioned in Remark 4 our predictions for the behaviour of the time steps sizes k as and σ are varied describe a profile k(t).
We repeat the → 0 study for IMEX1 and SAV1 in Table 5. These methods require a fixed number of FFT calculations per time step to invert the constant coefficient linear implicit aspect of the schemes, with SAV1 requiring four times as many solves as IMEX1. It is seen that IMEX1 behaves almost identically to SAV1 and both are superior to Eyre's method when computational cost is considered. In the context of this study, there is no benefit from the theoretical guarantees of energy stable schemes and BE is the optimal (with our asymptotic definition) first order scheme with IMEX1 the runner up. This will remain true for other nonlinear solver strategies for BE as long as they require fewer than O(1/ √ ) iterations when adaptive time steps are taken.

Remark 5
Note that for the BE computation for = 0.025 we can still get reasonable accuracy taking σ = 10 −2 . In this case, the maximum value of k/ 2 is 14.6. Clearly, the theory which guarantees existence of solutions and energy decay for k < 2 [30] can be improved for metastable dynamics. This is explored in the analysis in Sects. 5 and 6 below.

Second Order Methods
The CG counts of all the nonlinear second order methods are relatively insensitive to , similar to the first order methods shown above. We show the number of time steps used for the seven methods in Table 6, for σ = 10 −4 fixed and varied. All second order methods give at least three digits of accuracy to the benchmark time with this tolerance σ . It is interesting to note that the slight change in the extrapolation procedure in the SAV2 schemes makes such a difference to their asymptotic performance. It is confirmation that merely considering the order of time stepping scheme and its theoretical energy stability properties is not the whole story.

Cahn-Hilliard
For the initial conditions we take 2 and compute with = 0.2, 0.1, 0,05 and 0.025. The dynamics approximate the sharp interface limit of two concentric circles, with the inner circle shrinking. As before, the benchmark is the time T at which the value at the domain centre (π, π) changes from negative to positive. A video of the dynamics is available [28] .

First Order Methods
Results of the numerical experiments in which is varied for the first order methods are shown in Table 7. These validate the prediction of M = O(1/ ) for BE and M = O(1/ 2 ) for Eyre and IMEX1 with σ constant. As for the AC case, SAV1 behaves similarly to IMEX1 at increased computational cost. For CH, the implicit problem for BE is more difficult to solve as → 0 with fixed σ , but it is still more accurate than Eyre stepping for equivalent computational cost. It will be asymptotically more efficient as long as the solution strategy for the nonlinear problem requires fewer than O(1/ ) iterations with adaptive time stepping.
As with AC, we see that BE does not suffer from global accuracy decrease as → 0.

Second Order Methods
The CG counts for the second order methods behave like those of BE with as shown above. We show the number of time steps used for the four methods in Table 8, for σ = 10 −4 fixed and varied. The superiority of TR and BDF2 is clearly seen, consistent with M = O(1/ ) Shown are the total number of time steps taken (with the ratio to the value above in brackets) Here, M is the total number of time steps taken (with the ratio to the value above in brackets), CG is the number of conjugate iterations (with the ratio to the number of time steps in brackets), and E is the error in the benchmark time with * denoting a result correct to three decimal places

Asymptotic Analysis of Properties of BE AC Solutions
The results in Table 4 present the accuracy for BE applied to AC with fixed local error tolerance σ = 10 −4 under various values of . It is remarkable the accuracy in the benchmark time does not degrade as → 0. This is unexpected, as a naïve prediction would be that the final accuracy scaled like Mσ where M is the number of time steps. It is clear that the resulting solution accuracy for the schemes under specified local error tolerance is a nontrivial question.
We present below the asymptotic analysis of a fully implicit BE time step (4) in two dimensions assuming the solution is in the meta-stable regime. That is, u n is approximately described as a curve x n (s) parametrized by arc length with normaln, dressed with the heteroclinic profile (8). We take the scaling k = c with c independent of , both sufficiently small depending only on the curve x n . We consider the formal asymptotics for the implicit time step u of (4) in this setting, anticipating that u will have the same local dependence u(s, z). Using where κ is the curvature of the interface, we find at leading order O( −1 ) that u has the same homoclinic structure around a new curve x(s). That is, with g(z) = tanh(z/ √ 2) and where we have changed coordinates to (s, z) with (x, y) = x(s) + zn based on the curve x(s) after the implicit time step. In the language of Remark 2 we predict that the scheme preserves profile fidelity and show below that this is asymptotically consistent.
In (10), v(z, s) is the correction to the leading order solution. We will identify the size and structure of this term below.
We take where ρ is the average normal speed through the time step. Recalling that k = c and the spatial scaling of z, we have A diagram is shown in Fig. 3. Note that the variation in normal direction appears in higher order asymptotic terms, so it is consistent in what follows to use the samen as normal direction for both curves, i.e. the same "z". Considering now the next order term O(1) in (4) with the forms (10) and (12): where L := ∂ 2 /∂z 2 + f (g)· and we have used the smallness of c for the cubic Taylor approximation of g(z) − g n (z) on the right hand side. We consider (13) at each s in the L 2 (R) orthogonal decomposition of G := span{g (z)} and G ⊥ . Note that g ∈ G ⊥ (this does not depend on the specific reaction term f = u 3 − u chosen here) and L has G as its kernel and has bounded inverse on G ⊥ from standard Fredhold theory [18]. and so it is seen that BE behaves like a second order method in this scaling. This explains the unexpected accuracy in AC BE computations as → 0.

Remark 6
Note that the error estimator (7) uses F (u) which sees the undamped dominant truncation error term, which is why the number of time steps behaves with in the manner predicted in Sect. 3.1.1. Thus for BE applied to AC in the metastable regime, the estimator asymptotically over-estimates the local errors actually made.
The formal asymptotic results can also be used to show that the implicit time steps in this scaling lead to energy decrease. Neglecting the O(c 2 ) terms in the interface motion, we have from (11) x n = x − kκn.
Using the identities for arc length parametrized curves |x s | = 1, κn = x ss and x s ·x sss = −κ 2 it follows by taking the s derivative of the equation above and the dot product with x s at each s that |x n,s | ≥ 1 + kκ 2 ≥ 1 = |x s |.
This shows that the metastable curve at time n is longer than at the next step n + 1. Since the energy E is proportional to curve length to highest order in the metastable regime [21], we have shown formally that implicit time stepping for AC has the energy decay property under this time step scaling. Large, accurate, fully implicit time steps can be taken in computations validated in Sect. 4. In the next section we show a closely related rigorous result in a radial geometry. The main result is in Proposition 1. A key ingredient is an identification of a dominant term in the space G that represents the interface motion, separate from heavily damped terms in the perpendicular space, as shown here. Care must be taken to control the size of terms which are formally neglected in this asymptotic analysis.

Rigorous Radial Analysis of AC With BE and Eyre Time Stepping
We derive rigorous asymptotic evolution of a radially symmetric profile for BE and first order Eyre-type methods for the Allen-Cahn equation in R 2 . Extensions to radial profiles in R d is immediate. More precisely we consider a splitting f = f + − f − and study the iterative scheme For simplicity, we assume that f is smooth, odd about u = 0, has precisely three simple zeros at u = ±1 and at u = 0, and tends to ±∞ as u → ±∞. This includes the classical choice of f (u) = u 3 − u but we consider other reaction terms in this class since Eyre's method can have quite different behaviour as shown in Sect. 6.2. The BE scheme corresponds to the choice f − ≡ 0 while Eyre-type schemes take f + , f − ≥ 0. We pose the problem on the affine space with u n ∈ Y as a given. The assumptions on f imply that the continuous 1D Allen-Cahn equation has a steady state solution which is heteroclinic to ±1; that is g → ±1 as z → ±∞. Considering R > 1, we modify this g at the order O(e −1/ ) so that g = 0 on (−∞, −1/ ) for 1. This introduces exponentially small residuals in the sequel that have no impact upon the salient results of our analysis.
We introduce z = r −R ε , the weighted inner product and the associated spaces L 2 R and H 1 R with B H 1 R (δ) the ball that is centered at the origin with radius δ in the space H 1 R . We rewrite the iterative equation as on the domain z ∈ (−R/ , ∞). We decompose u n and u as where R n and v n are taken as given and R and v are to be determined. The profile associated to u n is denoted g n and observe that it admits the expansion In the sequel we will enforce the orthogonality conditions v, g R = 0, v n , g n R = 0, (16) and denote the corresponding subspaces of L 2 R by X ⊥ and X ⊥ n respectively with the associated orthogonal projections Π and Π n .
At this point the analysis of the implicit and Eyre-type schemes diverges sufficiently that we approach them distinctly.

Backward Euler Estimates
For BE we take f − ≡ 0, f = f + , and write the iterative map as where we have introduced the linear operator and the nonlinearity The operator L is self-adjoint in the weighted inner product for which the eigenvalue problem takes the form subject to ∂ z ψ(−R/ ) = 0 and ψ → 0 as z → ∞. Since the profile g solves (14), it will be useful to compare L to the simpler operator arising as the linearization of (14) about g in L 2 (R). The operator L 0 is self-adjoint on L 2 (R), and since g is heteroclinic with g > 0, the Sturm-Liouville theory on L 2 (R) implies that L 0 has a simple, ground-state eigenvalue at λ = 0 with eigenfunction g and the remainder of the spectrum of L 0 is strictly positive, in particular L 0 is uniformly coercive on the space {g } ⊥ L 2 (R) . While L does not generically have a kernel, it does have an eigenspace with a small associated eigenvalue. However, for sufficiently small, it inherits the coercivity of L 0 . Lemma 1 Fix 0 > 0 sufficiently small, then there exists α > 0, independent of R ≥ 1 and of ∈ (0, 0 ), such that (20) Proof We defer the proof of L 2 R coercivity to the appendix. To extend coercivity to H 1 R we observe that so that for any t ∈ (0, 1) we may write Dropping the tilde, we have (20) with α independent of > 0 and R > 1.
We assume throughout our analysis that v H 1 R and v n H 1 R are uniformly bounded by δ 1. Returning to (17), we denote its right-hand side as F BE . To have the inversion of the operator on the left-hand side be contractive the term F BE must be approximately orthogonal to the small eigenspace of L. As Lemma 1 shows it is sufficient to be L R -orthogonal to g , the kernel of L 0 . To this end we determine R =R BE (v, v n , R n ) such that F BE ∈ X ⊥ , or equivalently Assuming this condition has been enforced we introduce and may rewrite the BE iteration in the equivalent formulation The key step is the introduction of the operator Π, the orthogonal projection onto X ⊥ . This is redundant when F BE ∈ X ⊥ , but preserves contractivity for choices of (v, v n , R − R n ) when it is not. Our goal is to show the function G BE is a contraction mapping and to develop asymptotic formula for R and v.

Lemma 2 The function R =R BE satisfies the implicit relation
where Moreover we have the Lipshitz estimate so long as kδ 2 2 .
Proof Due to parity considerations, we remark that g 2 R = R g 2 L 2 (R) , up to exponentially small terms. For brevity, and as an element of foreshadowing, we approximate (R − R n ) by k in the O-error terms. We address the terms in F BE and derive the following elementary estimates, For this scheme, F BE depends upon v only through N . Collecting terms in the orthogonality condition that are linear in R − R n and identifying relevant higher order terms yields the relation where b 1 is given in (24). Under the assumptions on k and δ we have the leading order result R − R n = −k/R. Substituting this relation into (30) yields the result (23).
To obtain the Lipschitz estimate we observe from the estimates above that The nonlinearity satisfies the Lipschitz properties Adding and subtracting N (ṽ), g R and using (29), we arrive at the estimates Imposing the condition kδ 2 2 yields (25).
To establish bounds on the map G BE defined in (22) we apply M to both sides of the relation and take the L 2 R inner product with respect to G BE . Using the coercivity estimate (20) we find Taking v, v n ∈ B H 1 R (δ) for δ 1 and recalling that the projection Π crucially cancels the leading order term in g − g n , we estimate For the BE system we examine distinguished limits k = s , for s ∈ (1, 2), which we call the large time-step regime, for which the H 1 R term is dominant on the left-hand side of (31). We drop the L 2 R term to find, Taking δ = s for any s > max{s/2, 2(s − 1)} then we determine that for sufficiently small. In particular, since s > max{s/2, 2(s − 1)} in the large time-stepping regime, we may take δ = k = s , so that, viewing G BE as a map on (v, v n ), we have , for all s in the large time-step regime.
Proposition 1 Fix 1 < s < 2, then in the distinguished limit k = s the function G BE defined in (22)

R (k) and is a strict contraction. In particular it has a unique solution
In particular there exists c > 0 such that for all v 0 ∈ B H 1 R (ck) and R 0 > 1 the sequence where b 1 > 0 is given by (24). Here N is the iteration number such that R N > 1 and R N +1 < 1.

Proof
We have established the mapping property. To establish the contractivity we must control the impact of f upon the projection Π through the motion of the front R. We assume that v,ṽ, v n ∈ B H 1 R (k) and denote R = R(v) andR = R(ṽ), with the associated front profiles denoted by g andg. The estimate (25) establishes thatR BE is Lipschitz with constant ckδ/ , which in the the large time-step regime reduces to ck 2 / . Following the proof of (25) we find that In the large time-step regime, using (20) we deduce the bound We wish to obtain a bound on the difference of G BE at two values of v: We first bound the difference The analysis is complicated by the fact that M is only uniformly invertible on the range of Π. To factor these projected inverses we act with M, observing where we used that fact thatMM −1Π =Π and henceMM −1Π + (I −Π) = I . Since the right-hand side of (36) lies in the range of Π we may invert boundedly, To recover the whole g BE we act with (I − Π) on (35) obtaining Adding (38) to (37) yields a regularized expression that accounts for the shifts in the projections The operators M −1 Π andM −1Π are bounded using (34), while where · R * denotes the operator norm from L 2 R into itself. The projections satisfy Applying these estimates to (39) and using (31) to estimate ΠF BE we obtain Finally we write and using (33), (40) estimate which is contractive so long as k 2 3 which holds with the large time-step regime. Within the large time-step regime the leading order iteration (30) simplifies as k k 2 / 2 and the dominant correction is given by the b 1 term. To compare to standard notation we rewrite the regime as 2 k = δ 1 and replace the internal parameter δ with k, the result is the large time-step interation (32).

Eyre-Type Iterations
For an Eyre iteration the map (15) where we have introduced the Eyre linear operator the explicit-term residual and the nonlinearity which we further decompose into implicit and explicit parts The operator L + is self-adjoint in the weighted inner product for which the eigenvalue problem takes the form subject to ∂ z ψ(−R/ ) = 0 and ψ → 0 as z → ∞. The coercivity estimate is substantially simpler than for BE as the operator L + is strictly positive without constraint.

Lemma 3
There exists α + > 0, independent of R ≥ 1, such that for all v ∈ H 1 R . Proof Since f + ≥ 0 the normalized ground-state eigenfunction ψ 0 of L + , satisfies Since the ground-state eigenvalue is strictly positive, this establishes the L 2 R coercivity of L + with α + = λ + 0 . The H 1 R coercivity follows as in Lemma 1.
We assume throughout our analysis that v H 1 R and v n H 1 R are uniformly bounded by δ 1. We denote the right-hand side of (41) by F E and introduce which is strictly contractive on the full space L 2 R , and re-write the Eyre iteration as For the Eyre iteration the role of the projection Π is diminished as M + is contractive without it. Our goal is to show the existence of a map R =R E (v, v n , R n ), for which to establish the contractive mapping properties of G E , and to develop asymptotic formula for R and v. We do this in the long time-stepping regime, k 2 , which has no lower bound for the Eyre scheme.

Lemma 4 Assume k
2 . There exists a smooth functionR E : such that the profile g = g(z; R) satisfies (45). The function R =R E satisfies the implicit relation where we have introduced the leading order Eyre time constant when f − ≡ 0. Moreover we have the Lipshitz estimate so long as δ 1.
Proof Due to parity considerations, we remark that g 2 R = R g 2 L 2 (R) , up to exponentially small terms. For brevity, and as an element of foreshadowing, we approximate (R − R n ) by 2 in the O-error terms. Addressing the terms in F BE one by one, we record With these reductions we can simplify the orthgonality condition, identifying terms that are linear in R − R n and most relevant higher order terms. The result is the balance where c − , introduced in (47) is positive since f − ≥ 0 by assumption. The largest terms and error terms come from the residual, and we kept the lower order constant on the left-hand side to emphasize that in the long time-stepping regime, the residual dominates the natural time-step term. Indeed, the iteration is independent of step size, k, given at leading order by (54).
To obtain the Lipshitz estimate we observe from the bounds above that dependence ofR E on v arises from the balance of the linear R − R n term in the residual against the nonlinearity. Since both these terms are multiplied by k/ 2 this factor cancels and we have the balance The nonlinearity satisfies the Lipshitz properties Adding and subtracting N (ṽ, v n ), g R and using (53), we arrive at the estimates Imposing the condition δ 1 yields (48).
We may now establish the main result on the Eyre sequence.

Proposition 2
There exists c > 0 such that for any k 2 ) and is a strict contraction, satisfying (55) In particular G E has a unique fixed point in that set, which we denote v n+1 . Moreover, if the Eyre balance parameter where the Eyre number, c E , is defined by where K + > 0 is defined in (65) and K + L −1 + Π > 0 is self-adjoint.
Proof To establish the contractivity of G E we follow the arguments for backward Euler, sketching only the differences. We introduce and derive the expression The operators M −1 + Π andM −1 +Π are bounded as L + has no small eigenvalues. Using (48) we estimate Similarly the projections satisfy Applying these estimates to (58,59) and following the proof of (48) to estimate ΠF E we obtain Finally we write and estimate the F E term from which the dominant contribution comes from the residual where we used (48) in the last inequality. In particular we deduce that Combining (60), (62) and (61), imposing δ = , and using k 2 we arrive at strict contractivity on B H 1 R (cδ) for any fixed c > 0. To establish bounds on the the fixed point v n+1 of G E (·; v n ) we observe from (43) that in the large time-stepping regime Using this result we expand Inverting M + we find, at leading order In particular we deduce that Arguing inductively, since the Eyre balance parameter γ < 1 and the functions are uniformly bounded for all R ≥ 1, we deduce that if δ : } N 0 are uniformly bounded, independent of 1 and k 2 for all n ≤ N so long as R n > 1 for all n = 0, . . . , N . To improve this bound we require Lipschitz estimates on the v n component of G E . To this end we find Here we introduce the quasi-steady parameter ρ ∈ (0, 1). Since |R n − R m | = O( 2−ρ ) for |n − m| ≤ N ρ :=≤ −ρ we infer that for all such n and m. For n > N ρ we define the quasi-equilibrium v n * := R n − R n−1 E n (z) where E n is the R = R n translate of Here the self-adjoint operator is well defined since L + Π • f − (g) H 1 R * = γ < 1 by assumption. Using the Lipschitz property (63) of G E and the quasi equilibrium relation for k = n − m , . . . , n. Since γ N ρ we deduce from an inductive argument that for all n > N s . Inserting this result in (46) we arrive at the leading order Eyre iteration (57).

Remark 7
There are two examples of particular relevance with the decomposition f + = (1 + β)u 3 and f − = u + βu 3 for β > 0. The choice β = 0 is classical and very degenerate, as in this case f − (u) = 1 and the corresponding Eyre balance parameter γ , defined in (56) is zero, and the Eyre number, (57) is 1. In this case it is possible to rewrite Eyre's method as backward Euler with a rescaled time. In particular the slow convergence to equilibrium will not be in evidence. For larger values of β the balance parameter increases from zero and the Eyre number decreases from 1. As the balance parameter increases through 1 we anticipate enhanced slowing of the front profile as the Eyre number tends to zero. The choice of non-zero β can be viewed as spurious, a deliberate attempt to foul the method. A more robust example of non-zero balance arises naturally through the model with β ≥ 1. This suggests the optimal decomposition f + = u 5 and f − = βu 3 . Here, unambiguously, increasing β increases the balance parameter and will lead to non-trivial enhanced slow-down with potential instability as γ increases through 1. These analytic predictions are validated in a computational study below.

Remark 8
To leading order, in the large time-stepping regime k 2 , the Eyre iteration recovers backward Euler with the substitution k → c E 2 . This reduces to the exact result for the case f (u) = u 3 − u and f − (u) = u, for which f − = 1, as the Eyre constant reduces to 1 since Π f − (g)g = Πg = 0.
The strong contractivity of G E with respect to v, given in (55), arises from the strong convexity with respect to v, but the slow evolution and marginal convergence to the quasiequilibrium, given in (63) arises from the balance between the implicit and explicit terms. The parameter γ measures this balance, with the quasi-equilibrium structure lost as γ increases towards 1. Indeed, since K + H 1 R * ∼ (1 − γ ) −1 , the Eyre constant will generically tend to zero as γ → 1.

Computational Validation of Remark 7
We perform computations for AC with the non-classical f (u) = u 5 − u 3 (which also leads to meta-stable dynamics of curvature motion) using the same initial conditions and accuracy criteria as described in Sect. 4.1. BE performs almost identically to the results shown in Tables 3 and 4 for the classical f (u) = u 3 − u in terms of accuracy and variation of time steps with and σ . This matches the theory in Sect. 6.1 which can be summarized as BE has profile fidelity when k = o( ).
When Eyre's method is applied to the dynamics with f (u) = u 5 − u 3 , with the natural splitting suggested in Remark 7, profile fidelity is lost as predicted. The formal prediction of Table 9 Computational results for the AC benchmark problem with fixed local error tolerance σ = 10 −4 and varied, using Eyre's method with reaction term f (u) = u 5 − u 3 Eyre with f (u) = u 5  Here, M is the total number of time steps taken (with the ratio to the value above in brackets) and E is the error in the benchmark time k = O( 3/2 ) which was seen computationally for f (u) = u 3 − u in Table 4 is not observed for f (u) = u 5 − u 3 . Rather, we see k = O( 2 ) as predicted by the theory in the previous section. The numerical results are shown in Table 9.

Summary and Future Work
We have identified the time step scaling for several first and second order schemes for AC and CH under the restriction of fixed local truncation error, σ . In particular, we derive the asymptotic behaviour of time-step number with σ and asymptotic parameter during metastable dynamics. These predictions are made under the assumption that the time steps preserve the asymptotic structure of the diffuse interface, a concept we refer to as profile fidelity. The predictions are verified in numerical experiments. We see that methods whose dominant local truncation error can be expressed as a pure time derivative have optimal asymptotic performance in this particular limit. BE, TR, and BDF2 all have this desirable property. We believe these methods will also have superior performance for other problems with metastable dynamics. Our numerical results show that BE performs better than expected and we have shown an explanation of this behaviour with formal asymptotics. The optimal fully implicit methods asymptotically computationally outperform all linearly implicit methods in the limit we consider. We present precise criteria on the computational cost of nonlinear solvers for this comparison. The provably energy stable first and second order SAV schemes had higher computational cost than standard IMEX methods for similar results. As a final result, we present a rigorous proof that large time steps with fully implicit BE can be taken with locally unique solutions that are energy stable. This is done for the 2D radial AC equation in meta-stable dynamics. Eyre-type iteration is also considered in this analytic framework, and it is shown that in general this approach loses profile stability unless very small time steps are taken.
Our work gives strong evidence that some fully implicit schemes for phase field models should be given more consideration and that provably energy stable schemes are inaccurate and give no benefits during meta-stable dynamics. Perhaps a hybrid scheme that switches between the two approaches depending on the dynamic regime should be considered.
Extending the analysis to the non-radial case and to CH is an interesting question. We observed that the question of global accuracy is not trivial in Sect. 5 and should be considered for other schemes. Accurate local error estimation for these problems is another interesting question to pursue.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. . Let L 0 be as defined in (19).
Proof Let φ be the minimizer of L 0 u, u over H 1 (R) subject to u L 2 = 1 and the fullline orthogonality u, ψ 0 L 2 (R) = 0. By scaling, the minima is attained with φ L 2 = 1 2 and satisfies L 0 φ = λφ, on [− , ], subject to Neumann boundary conditions φ x (± ) = 0, in addition to the full line orthogonality condition. The operator L 0 on the truncated domain has eigenvalues λ 0 < λ 1 < ... which are O(e −d ) far away from the eigenvalues of L 0 on the full line. In particular λ 0 may be negative, but the rest are uniformly positive. In L 2 we partition φ = βψ 0 + φ ⊥ , where ψ 0 is the L 2 ground state of L 0 and φ ⊥ ∈ L 2 is L 2 orthogonal to ψ 0 . Then we have On the other hand the orthogonality condition implies that where the subscript c denotes integration over R\[− , ] with the corresponding norms. In particular we deduce that |β| ≤ φ L 2 c g L 2 c ≤ g L 2 c φ L 2 . Since g decays exponentially at ±∞, is complementary norm is exponentially small in . From orthogonality of ψ 0 and φ ⊥ we have or equivalently and taking large enough we use these bound in (67) to show that α is exponentially close to λ 1 > 0.
To complete the proof of Lemma 1 we take sufficiently large to apply Lemma 5 and then take sufficiently small that |z| ≤ 1. Under these conditions L 2 and L 2 R (− , ) are equivalent norms, uniformly in , and we have uniform L 2 R (− , ) coercivity of L. Conversely, L is clearly L 2 R coercive on [−R/ , ∞)\[− , ] since f (g) is strictly positive there. Clearly L is uniformly coercive on function with more than half their L 2 R mass in [−R/ , ∞)\[− , ]. The g orthogonality condition implies approximate orthogonality to ψ 0 for large. From these we deduce the full L 2 R -coercivity of L over X .