Toward error estimates for general space-time discretizations of the advection equation

We develop new error estimates for the one-dimensional advection equation, considering general space-time discretization schemes based on RungeKutta methods and finite difference discretizations. We then derive conditions on the number of points per wavelength for a given error tolerance from these new estimates. Our analysis also shows the existence of synergistic space-time discretization methods that permit to gain one order of accuracy at a given CFL number. Our new error estimates can be used to analyze the choice of space-time discretizations considered when testing Parallel-in-Time methods.

of which considered solutions based on Parareal, a well known PinT algorithm proposed in [28], see for instance [35,4,3,6,32,29]. A geometric multigrid interpretation of Parareal was given in [20], followed by the genesis of the MGRIT algorithm [10], based on algebraic multigrid methods. MGRIT was first presented during a student paper competition at the Sixteenth Copper Mountain Conference on Multigrid Methods, from which [10] originates. A journal version of this work was also published later in [9]. In contrast to [20], in the first paper for MGRIT the authors consider a Full Approximation Scheme (FAS) interpretation of Parareal, and based on this introduce a variant of MGRIT, with additional coarse and fine relaxation steps (FCF-relaxation, in contrast with F-relaxation only for the basic MGRIT method). It was then proved in [19] that this variant with FCF-relaxation is identical to Parareal with overlap, where each additional CF relaxation adds one coarse time interval more in the overlap in Parareal. Many recent contributions focused on applying MGRIT to hyperbolic problems, see e.g. [25,26,5]. There are however also other PinT algorithms especially designed for hyperbolic problems, see e.g. ParaExp [13,14], the diagonalization technique [18], and waveform relaxation [40,39,17,16], and combinations thereof, see e.g. [21], based on earlier work in [43].
The advection equation has proven to be a critical test problem for PinT algorithms ( [2,11,26]). Many space-time discretizations can be used to solve the advection equation, some being more diffusive than others, and numerical diffusion has proven to help convergence of PinT algorithms like Parareal [34]. It thus seems that a compromise between numerical diffusion and accuracy of the numerical scheme is necessary for effective convergence of such PinT methods applied to the advection equation (and hyperbolic problems in general). Hence, care must be taken when validating PinT algorithms on the advection equation. In particular, when solving such problems with a PinT algorithm, it is important to know how much accuracy the computed solution actually contains, especially since PinT algorithms should be capable of computing over longer time intervals accurate solutions in parallel. It is therefore important to have reliable error estimates that permit informed decisions on which space-time discretization schemes can be chosen for solving the advection problem as a test problem for PinT algorithms.
There are already several error estimates available in the literature for classical space-time discretizations. For instance, in [24,Chapter 2], the authors list many error estimates for high order space discretizations, and derive also conditions on the number of points per wavelength needed to achieve a given level of accuracy. Similar results can also be found in [38,23]. The existing error bounds in the literature have however three major drawbacks: -Most of them focus on the phase error, and estimate the L 1 error of the numerical solution in Fourier space. -They involve only specific space discretizations (e.g. centered finite-differences in space), and there is no general methodology for arbitrary space-time discretizations.
-Very few studies also include the impact of the time discretization, and the influence of important numerical parameters, like the Courant-Friedrich-Lewy (CFL) number, are still not completely understood.
Our goal here is to complement the currently available error estimates in the literature, and to provide a general methodology to estimate the error in the approximate solution, depending on the space and time discretization scheme used. We will also derive from these new general error estimates conditions on the minimum number of points per wavelength needed to achieve a given error tolerance in space. Finally, our new error estimates allowed us to discover the existence of numerical schemes with the same order p in space and time that have together a p + 1 order accuracy if used with a given CFL number.
Our manuscript is organized as follows: we state and prove our main theoretical results in Sec. 2. In Sec. 3, we perform numerical experiments to validate the error estimates and illustrate the main consequences they imply. In Sec. 4, we discuss the application of some spacetime discretizations for Parareal in view of our new error estimates. Finally, we give a conclusion and brief outlook on future work in Sec. 5.

Discrete Fourier Analysis
We consider the advection equation where a is constant, solved numerically on a periodic spatial domain [0, L], discretized using a uniform mesh [0, ∆x, ..., L − ∆x] = [x 1 , .., x nx ] of n x points (∆x = L/n x ). Numerical time integration is performed decomposing [0, T ] using n t + 1 points in time, [0, ∆t, .., T ] = [t 0 , .., t nt ], with ∆t = T /n t . We denote by T P := L/a the temporal period of the problem, which depends on the advection velocity a. Finally, u 0 (x) is a sinusoidal function with wave-number 1 κ ∈ N * (i.e. wavelength L/κ) and constant amplitude A, i.e. 2

Main Theorem
To present our main error estimate for an arbitrary space-time discretization of the advection equation (1), we need to introduce our measure for the error and two key properties that characterize an arbitrary space and time discretization.
Definition 1 Let the numerical solution of (1) be represented at time t by We define its discrete norm by The choice of this particular norm will be discussed later in Sec. 2.4, in view of the main results of this paper and their proof. We consider the use of a finite difference discretization of order p for the space derivative in (1), whose Fourier symbol is 3 1 Rigorously speaking, the wave-number is the inverse of the wavelength, i.e. κ/L. Our κ here is a normalized wavenumber, but we decided to call it "wave-number" for simplicity.
2 Any other sinusoidal function of wave-number κ and amplitude A can be obtained using a shift of the cosine, which would simply lead to a shift in the x coordinate and therefore not change the results. 3 For an example see (27).
We also consider a general Runge-Kutta type time discretization of order q, whose stability function R satisfies 4 We then have as our main result: Theorem 1 Let a numerical discretization of the advection equation (1) satisfying (5), (6) be given, and assume that the wave number κ ∈ N, and the CFL number are constant. Then the discrete error of the numerical solution, evaluated at time T := n P T P with n P a real positive constant, and starting with initial condition (2), depends when n x → ∞ on the discretization orders (p, q) as follows: 1. If p < q (space discretization has lower order), then 2. If q < p (time discretization has lower order), then 3. If p = q (same order for space and time discretization), then We obtain as a consequence the following result on the required number of discretization points needed for a certain accuracy.
Corollary 1 Using the same hypotheses as in Theorem 1, for a given error tolerance tol small enough, the number of points required for the whole domain discretization can be estimated by where n 0 depends on the tolerance tol , the discretization orders (p, q), and r = min(p, q): 4 For an example, see (30).
1. If p < q (space discretization has lower order), then 2. If q < p (time discretization has lower order), then 3. If p = q (same order for space and time discretization), then

Preliminary definitions
Performing a non-scaled Discrete Fourier Transform (DFT) of the sinusoidal initial value u 0 corresponding to (2) yields 5 The position of the non-zero components (up to machine precision) is linked to the angular frequency of u 0 , We consider a space grid, so it is convenient to define the reduced angular frequency As the DFT is un-scaled, we have the discrete form of the Parseval relation nx j=1 which allows us to link the discrete norm of the numerical solution with its Fourier components by For example, if we apply that to our initial condition, we obtain Hence, the (normalized) norms are conserved from discrete to continuous space.
Space discretization. We consider any finite-difference discretization that will approximate the first space derivative in (1) using where the c d are the finite difference coefficients, and s is the stencil half-width. Using the method of lines for (1) allows us to build the system of ordinary differential equations where K represents the discrete space derivative operator. Because of the periodic boundary conditions and the constant velocity, the matrix K is circulant, and thus becomes a diagonal matrix D in Fourier space. Each coefficient of D is associated with one reduced angular frequency of the solution. The two diagonal coefficients concerning our initial condition are then where z adv is the Fourier symbol associated with the space discretization, that is For example, the second order centered finite difference scheme has the Fourier symbol More generally, z adv (θ) is an approximation of iθ up to order p, that is with α p+1 = 0, which is our original hypothesis (5) for Theorem 1 on the finite difference scheme for the space discretization of the advection equation (1).
Time discretization. We consider now any Runge-Kutta type time integration method used to solve (24). The numerical solution is then given by where u is the numerical solution u at the th time point, and R is the stability function of the time integration scheme. For example, the stability function of the Backward Euler method is which implies In general, the stability function is an approximation up to some order q of the exponential [27, Lemma 4.4, p.62], that is with β q+1 = 0, which is our original hypothesis (6) for Theorem 1 on the time integration scheme.

Proofs of the main results
Estimation of the Fourier error function. For any Runge-Kutta type method, since R is a rational function, its expression is conserved in Fourier space, and we havê Using (16) and (25) allows us to get Here, we used the short hand notation z(θ) := z adv (θ) and dropped the sub-scripts to be concise in the following computations. Equation (34) can be written using the CFL number aŝ Let us consider the exact 6 solution of (1), obtained with exact derivation in Fourier space and an exponential time integration of (24). The exact solution can be written in Fourier space aŝ Then, using (20) we can write Hence, in order to compute the discrete error, we have to estimate what we call the Fourier error function Asymptotic approximation. We first define We use the hypothesis that κ and n P are constant, and from the definitions we have This shows the equivalence It is worth mentioning that the CFL number σ can be set arbitrarily large or small, depending on the computation 7 . But the hypothesis in Theorem 1 that σ is constant mainly states that it does not vary with n x , n t or θ. This implies that θ and ζ scale with one-another, and similarly for n x and n t , i.e.
Next, we definẽ and then (36) becomeŝ Since we also have the equivalence ζ ∼ζ, we get for any integer r that To studyẼ(ζ), we first perform a Taylor expansion of R(ζ): using (6) and (41) we have Then, using the notation 8 we obtain Taking the expression to the power c, a last Taylor expansion yields which gives us Finally, using the definition in (37) and (38), we get Origins of discretization errors. We identify two main terms in the Taylor expansion of E(n x ): the first one with n −p x is the contribution of the space discretization error; the second one with n −q x is the contribution of the time discretization error. Hence depending on the values of p and q, three cases need to be considered: 1. p < q : the order in space is lower than the order in time, and the Fourier error expansion is dominated by the spatial term. 2. q < p : the order in time is lower than the order in space, and the Fourier error expansion is dominated by the temporal term. 3. q = p : same order in time and space, and the Fourier error expansion is dominated by a mixed term, including error in time and space.
We note from (38) that c is a purely imaginary number, so |e c | = 1, which indicates that this term can be ignored when evaluating the absolute value of the Fourier error function in (35). We also recall for what follows that Space discretization with lower order. When p < q, the Fourier error becomes which implies that (47) Then, combining (35) with (46), we get which proves the first part of Theorem 1.
Since n x → ∞, we have → 0 such that we can neglect the term O 1/n p+1 x , and for a given error tolerance tol small enough, we have which proves the first part of Corollary 1.
Time discretization with lower order. When q < p, the Fourier error becomes Again, the Fourier error function verifies a similar relation as in (47), so we get which proves the second part of Theorem 1, and we obtain similarly as from the first part which proves the second part of Corollary 1.
Space and time discretization with the same order. When p = q, the Fourier error function becomes Again, the Fourier error function verifies a similar relation as in (47), so we get which concludes the proof of Theorem 1, and for Corollary 1 we obtain proceeding as before

Discussion of the norm and the error measure
Choice of the norm. We comment now on the definition of the discrete norm in (4). The Parseval theorem (19) used in the proof of Theorem 1 requires to select an L 2 -type norm. Several solutions exist, and we give here the three most natural candidates: u 2 is the standard Euclidean vector norm, but this norm grows with the number of elements in the vector, and thus can not converge when n x → ∞ to a continuous norm of a given continuous solution. Thus, a scaled norm is necessary to build a coherent error estimate, a condition that is satisfied by both u and u L 2 . The latter is the counterpart of the continuous L 2 norm L |u(x)| 2 dx, but it varies with the domain length L. Using it for the error estimate would then induce a dependence on the domain length, which is generally avoided by setting L = 1 implicitly. In that case, u L 2 simply becomes the u norm, that has no dependence on L. Note that our discrete norm u is similar to the norm used to compute the Mean Square Error in statistics and the root-mean-square velocity in Computational Fluid Dynamics. And finally, our discrete norm u is more convenient to apply the discrete form of the Parseval theorem (19). Thus, we chose u so our error estimate does not depend on the dimensionality of the problem. Also, note that error estimates based on u L 2 or u 2 can be easily retrieved by multiplying the results of Theorem 1 by the appropriate factor.
Choice of the error measure. Once a discrete norm is chosen, defining a measure for the error is not straightforward. First, we defined a norm "in space", but our solution depends also on time. To handle this, there are many approaches in the scientific community, one commonly used consists in looking at the largest L 2 error in space over the simulation time interval. This can be interpreted as an L ∞ measure in time combined with an L 2 measure in space. For the advection equation, it is already known that the error necessarily grows with time, such that the maximum L 2 error is reached at the end of the simulation time interval (i.e. for t = T ). This motivates our choice when defining in (8). We used an absolute error measure, but a relative error measure could also have been used, with u ref being either u 0 or u exact . Note that choosing one or the other would have no impact on our error measurements if they are computed after an integer number of periods, and for those times, u 0 = u exact . Looking at the proof of Theorem 1, it is easy to see that with the same property for the results of Corollary 1. The relative error measure does not depend on the amplitude of the wave, which is an interesting property when the error is dominated by numerical diffusion and contains only one frequency: rel then indicates the percentage of decrease in the solution values 9 . But if one considers several wave-number components of the solution, then the absolute error measure is more appropriate, since it allows a comparison of errors with different weights, related to each wave-number amplitude. Thus, we decided to express our main results in the absolute error measure . 9 If we build an artificial solution u num = cu exact with 0 ≤ c < 1, then we have rel = (1 − c).

Illustration of the error estimates
We set a = L = 1, so all numerical integration parameters are fully determined by σ, n P and n x . In particular, n t is computed using the fact that with T P = L/a = 1. Test values for σ, n P and n x are set to have n t ∈ N. We compare the error estimates given when neglecting the big O term in the formulas of Theorem 1 to the effective error in numerically computed solutions of (1) in Fig. 1. On the left, a Backward Euler scheme is used in combination with a 6 th order centered finite difference discretization. In this case, the time discretization is of lower order, and can be estimated with the dominant term in (10) (dashed gray lines). We observe that the estimates are sharp for large values of n x , and this does not depend on the simulation time interval, represented by various values of n P . Similar results were also obtained for other configurations with lower order time discretization schemes, and also when the space discretization is of lower order. In Fig. 1 on the right, the SDIRK3 scheme of [1, Theorem 5] is used in combination with a 3 rd order upwind finite difference scheme. Here, since the time and space discretization have the same order, (11) was used to estimate by neglecting the big O term, considering a value of the CFL number σ = 8. Again, the theoretical error estimates are sharp for large values of n x , with various choices of n P .
Generally, we observed with all numerical experiments that estimating by neglecting the big O terms in Theorem 1 was a reasonably good approximation for error levels equal or lower than = 1e −2 . For low values of n x , the error reaches a plateau, with an error around = A/ √ 2 (black dotted lines in Fig. 1), which corresponds to the magnitude of the exact solution, hence the numerical approximation has no accuracy whatsoever.

Mesh requirements for a given error tolerance
The results given in Sec. 3.1 have shown the accuracy of the error estimates given in Theorem 1 for general sinusoidal initial conditions as soon as n x is large enough to achieve a relatively small value for . This does not apply directly to general initial conditions, but since any solution of (1) can be represented by a linear combination of sinusoidal functions, knowing the highest relevant wavenumber in the initial condition indicates the mesh size requirement to achieve a given error tolerance on tol for the numerical solution to be accurate. If the spatial mesh is fine enough to get an error tolerance tol for the wave-number κ max , then this will automatically apply to all lower wave-numbers.
This motivates Corollary 1, which estimates the minimum number of mesh points n x,min required to get a given error tolerance for a wave-number κ, after simulating n P temporal periods of the advection problem (1). In general, n x,min depends on the limiting order of the space-time discretization r = min(p, q), as indicated in (12). But three parameters also influence the number of mesh points n x,min required: the wavenumber κ, its associated amplitude A and the number n P of simulated temporal periods for (1). The term A 1/r n P 1/r κ 1+1/r shows that increasing n P , the wavenumber κ or the amplitude A will naturally increase n x,min or a given absolute error tolerance, but if higher order methods are used, this increase will be slower. In particular, one can never be better than "infinite" order of accuracy that would induce no dependency on n P and A, and a linear dependency on κ.
Finally, the factor n 0 in (12) depends only on the space-time discretization schemes. It is particularly important to evaluate the accuracy of numerical schemes, as it indicates the number of mesh points required to get an accuracy of tol for the largest unitary sinusoidal component of the solution (κ = A = 1), after only one temporal period of (1). This is the focus of the next paragraphs, where values for n 0 are computed for different discretization schemes. Because we focus on one given wavenumber, we consider the relative error tolerance rel tol which represents a percent of error relative to the initial solution (cf. definition of rel in (56), Section 2.4, § 2).
Lower order for the space discretization. We consider the case when the space discretization has the lower order, and compute n 0 ( rel tol = 0.01) for several classical finite difference schemes of various order. More details on those schemes are given in Appendix A. The corresponding n 0 values are given in Tab. 1, along with the α p+1 coefficients for each scheme. We observe a huge n 0 value for the first order method (almost two thousand), and increasing the order rapidly lowers n 0 . But after 4 th order, the reduction of n 0 when increasing p is not as important any more as before. These results strongly suggest that low order space discretizations should never be considered in practical numerical integrations of smooth solutions of the advection equation (1), given their poor ratio between accuracy and efficiency. Furthermore, higher order difference schemes up to a certain order can easily bring good accuracy at the  Table 1 Standard finite difference space disretizations with their α p+1 , and corresponding n 0 when the space discretization has the lower order, for a given relative error tolerance expense of a small increase in the computational cost of the numerical spatial derivative evaluation, and thus should be favored when solving the advection problem (1) for smooth solutions. Finally, we can also compare upwind and centered schemes with equal order (U2-C2 and U4-C4). For both cases, centered schemes required less mesh points than upwind ones ( +41% for U2 compared to C2), but this gap becomes smaller for higher order ( +11% for U4 compared to C4). This comparison can be completed considering computational cost (see the coefficients in Appendix A): upwind schemes require more floating point operations than centered ones, which makes them also more costly. Lower order for the time discretization. We now consider the case when the time discretization has the lower order, and compute n 0 ( rel tol = 0.01) for several classical time integration schemes of various orders. The corresponding n 0 values are given in Tab. 2, along with the β p+1 coefficient for each scheme. Among the methods considered some are classical (Forward and Backward Euler, Heun, Runge-Kutta of 4 th order (RK4), Trape-zoidal Rule, Gauss-Legendre) and can be found in many academic textbooks. The others come from the research literature: the Runge-Kutta methods of order 3 with 3 and 5 stages (RK33, RK53) and of order 2 with 3 stages (RK32) from [41], the Singly Diagonally Implicit Runge Kutta methods of order 2 (SDIRK2 and SDIRK2-2) and order 3 (SDIRK3) from [1], and of order 4 with 5 stages (SDIRK54) from [42,Sec. IV.6]. Each n 0 value was computed for a unitary CFL number σ = 1. In practice, given one effective σ value, one should consider the given n 0 multiplied by σ.
Like when we used lower order for the space discretization, first order methods in time require a huge number of mesh points to reach the desired accuracy, while going to higher order quickly reduces n 0 to the order of 10. Also, methods of the same order do not require necessarily the same number of mesh points for the given error tolerance. For instance, RK33 and RK53 are both 3 rd order methods, but RK53 has an n 0 less than half the one of RK33; the two additional stages of RK53 compared to RK33 provide not only a better accuracy than RK33, but also a better accuracy than the 4 th order method RK4. This is due to the chosen error tolerance ( rel tol = 0.01), which is not small enough to get the beneficial effects of the 4 th order. This would appear for lower rel tol values (e.g. n 0,RK4 ( rel tol = 1.0e −3 ) 16.9 compared to n 0,RK53 ( rel = 1.0e −3 ) 17.1).
Same order for space and time discretizations. Finally, we consider the case when p = q. In this case, one should consider (15) of Corollary 1, which depends on the following term governed by the CFL number σ: On the one hand, for low CFL numbers σ, the term multiplying β p+1 can become negligible, and we would have which would reduce to the case where the space discretization has the lower order. On the other hand, large values of the CFL number σ would make the α p+1 term negligible, and which would reduce to the case where the time discretization has the lower order. This indicates a result that one could naturally expect: for low CFL numbers σ, the error will be dominated by the error of the space discretization, and for large CFL numbers σ, it will be dominated by the error of the time discretization.

Synergistic space-time discretizations
In the case where the time and space discretization have the same order, the Taylor expansion of in (11) has its leading order term multiplied by C p α,β defined in (59). Under certain sign conditions for α p+1 and β q+1 , there exists a CFL number σ which makes the error term of order p vanish, leading to a higher order error term of at most order p + 1. If this choice exists, the optimal CFL number is In that case, we call the space-time discretization "synergistic", since the combination of the two schemes of order p allow us to obtain a scheme of at least global order p+1. Note that in addition, the p th order error term is canceled for all wave-numbers κ, which leads to the increase of order of accuracy for any initial condition represented on the spatial grid.
A special case of this phenomenon is well known: if we consider Forward Euler (FE, β 2 = −1/2) combined with first order Upwind finite differences (U1, α 2 = 1/2), this combination is synergistic for σ F E/U 1 = 1, and in fact not only the error term of order 1 is canceled, but all the error terms of higher order are also canceled, and the method becomes an exact solver.
There are other combinations of space and time discretization schemes which become synergetic. For instance, we can consider the SDIRK2-2 method from [1], combined with 2 nd order centered finite differences (C2) in space. This combination is synergistic and the second order error term is canceled for σ SDIRK2−2/C2 0.35. We illustrate this property by considering an initial condition of the form and simulating up to T T P . We plot in Fig. 2 the numerical error in the solution at t = T , as a function of n x , for three CFL values σ ∈ {σ /2, σ , 2σ }. We observe second order convergence for σ ∈ {0.70, 0.17}, and the error is naturally lower for the smaller CFL number. For σ = σ however, not only the errors are smaller than for the other two CFL numbers, but we also clearly observe the third order convergence in 1/n x . Note that we showed this example only for illustration purposes: considering such a low value σ for the CFL number defies the purpose of using an implicit method. Furthermore, the SDIRK2-2 method is in practice way less accurate than its counterpart SDIRK2 (see the n 0 values in Tab. 2), which is probably the reason why it is rarely used in the literature, in comparison to other SDIRK methods. However, this example is interesting as it illustrates the need to investigate the construction of new space-time discretizations intended to be synergistic, with a σ value that would meet some basic efficiency requirements, a subject we intend to further study.

Application to PinT methods
We consider now the application of the Parareal algorithm to the advection problem (1), for which one needs to define two solvers with different levels of accuracy: a fine solver F, which should compute a solution with the desired target level of accuracy, and a coarse solver G, which needs to be much faster than F, used to achieve time parallel computations. For more information on the Parareal algorithm, see [28,20,15]. We consider N = 50 time sub-intervals for Parareal (which corresponds to the number of processors that can be used for time parallelization) and ∆T = 0.25T P the length of those time sub-intervals, which gives a total computation time T = 12.5T P . We set the domain length to L = 2. At each iteration k of the Parareal algorithm, we measure its error by where u F n is the solution obtained applying F sequentially at the end of the n th time sub-interval, and u k n is the corresponding Parareal solution after k iterations.
We use first BE − C6 as our space-time discretization for the fine solver F, with σ F = 1. The coarse solver G is using the same BE − C6 scheme, but with Error E k ∞ n x = 1024, n t, F = 128 n x = 2048, n t, F = 256 n x = 4096, n t, F = 512 Fig. 3 Convergence of Parareal for the advection problem, using the BE-C6 space-time discretization, and varying the level of accuracy by using different numbers of mesh points n x and n t,F . Dashed gray lines represent E k ∞ , using a zero initial condition and a random initial guess with unitary norm, scaled down to 1e −14 .
a coarsening factor m = 32 for the time step, which induces σ G = 32. Note that the choice of implicit time integration simplifies greatly the coarsening for G, since the time integration is unconditionally stable for any time-step length. On the other hand, we chose C6 to have a non-dissipative space discretization scheme. We consider the sinusoidal function in (2) with A = κ = 1.
The error E k ∞ is plotted in Fig. 3 for the first iterations of Parareal, where we chose different levels of accuracy for F by using different values for n x and n t,F (number of fine time steps on each Parareal time sub-interval), keeping the CFL number constant for both solvers (σ F = 1, σ G = 32). We see two convergence behaviors on this plot: for early iterations, increasing the accuracy of the fine (and coarse solver) improves the convergence behavior of Parareal. For n x = 4096, Parareal even achieves an accuracy of E k ∞ = 0.01 compared to the fine solver after only 7 iterations. This result might at first glance encourage the use of Parareal for advection problems, if a high order non dissipative scheme is used for the space discretization, since apparently rapid convergence can be achieved. We see however also a second convergence behavior in Figure 3, which the three experiments have in common: after a certain number of iterations, their convergence behavior changes, and follows a new, common curve. One can make this convergence behavior appear right from the start if one starts Parareal iterations with a random initial guess, which contains all frequencies 10 , in contrast to our experiment setup where we only solve for one frequency in the initial condition and use the coarse propagator to obtain the initial guess.
To illustrate this, we use a zero initial condition and start the algorithm with an initial guess containing random values uniformly distributed in [0, 1], scaled such that u 0 n = 1. We represent the associated E k ∞ errors with the dashed gray curves in Figure 3, scaled down to machine precision. We see that the Parareal method follows these convergence curves after the initial iterations, due to the (random) roundoff error in the convergence curves in solid lines, amplified by the method. So starting with a random initial guess of order one instead of the coarse solve, none of the methods would have converged before reaching the maximum of 50 parareal iterations in this setting with N = 50 time sub-intervals, as it was predicted in [20, Theorem 5.6, Remark 5.9 and Figure 5.1 on the left] with an analysis at the continuous level for the advection equation. We next use our new results from Sec. 2 to investigate the accuracy of the solutions computed in this example, and to study what happens if one is using the truly needed higher order methods in time.
Limiting discretization scheme. As it was shown in Theorem 1, the error of a space-time discretization scheme is dominated by its lower order component. In our example, the BE-C6 scheme (q = 1, p = 6) is equivalent to any other discretization using a space discretization with p > 1 (e.g. C2, U3, . . . ). If we use Parareal with those lower order space discretizations, we obtain the same convergence curves as in Fig. 3. The important point is that in Fig. 3, the difference between the Parareal iterate and the fine solver is shown, and the fine solver accuracy is not at all questioned. The use of the high order "non-dissipative" scheme has however no impact on the global accuracy of the numerical solution, obtained by both applying F sequentially and Parareal in parallel. It is very important to study the error of the fine solver before trying to solve the spacetime discretized problem in a time parallel fashion.
Error of the fine solver. We thus now investigate the accuracy of the fine solver for the values of n x and n t,F we used. Since the error estimate increases linearly with the simulation time, it is important to check the error at the final simulation time, that is t = 12.5T P . We show in Tab. 3 the values of for various values of n x and n t,F . We see that the error levels are quite high, convergence is first order due to BE in time, and comparing to the error levels of Parareal with the fine solver in Fig. 3, we see that the accuracy achieved by Parareal compared to the fine solver is not really of interest for the numerical solution of the advection Table 3 Error of the fine solver with BE-C6 at t = 12.5T P , for an initial sinusoidal solution with A = κ = 1.  The issue of using higher order time-discretization. The simplest coarsening strategy used in Parareal consists in taking larger time steps of the fine solver scheme to obtain the coarse solver. In our example, we used a coarsening factor m = 32, making the coarse solver G thirty two times cheaper than F, which leads to the parallel speed-up of Parareal. This also induces an error ratio of m between the fine and coarse solver, as we can see from (10) in Theorem 1, and the fact that BE has q = 1. We now keep the same configuration, but use a higher order time discretization for F and G, that is SDIRK3. This allows us to increase the accuracy of the fine solver, as illustrated in Tab. 4. But this increase in accuracy brings some drawbacks. In particular, we have an error ratio of m 3 = 32 3 between the fine and coarse solver, while the cost ratio still stays the same. This will make Parareal convergence more difficult, since the coarse solver accuracy is more degraded, compared to F. We illustrate this by the plot of E k ∞ for the SDIRK3-C6 case in Fig. 4. We see that not only the initial convergence is degraded compared to the result in Figure 3, but also the error level of the fine solver can not be reached any more by the Parareal iteration before reaching the final 50th iteration, since again convergence for the one target frequency in the experiment changes to the non convergence of parareal with all frequencies present due to roundoff, as indicated by the gray dashed lines computed as for Error E k ∞ n x = 1024, n t, F = 128 n x = 2048, n t, F = 256 n x = 4096, n t, F = 512 Fig. 4 Convergence of Parareal for the advection problem, using the SDIRK3-C6 space-time discretization, and varying the level of accuracy, with different number of mesh points n x . Dashed gray lines represent E k ∞ , using a zero initial condition and a random initial guess with unitary norm, scaled down to 1e −12 .
accuracy also makes the true non convergence behavior worse when all frequencies are present.
The difficult choice for time integration. In practice, time integration will often have the lower order, since increasing the space-discretization order does not increase the computational cost too much, and can easily be achieved by increasing the space discretization stencil. In that case however, choosing an implicit time integration scheme will not be advisable for the advection problem, since the benefit of using very large time steps with implicit unconditional stability will be canceled by the σ q factor in (10) of Theorem 1. This leaves us with explicit time integrators for the fine solver in Parareal, which will then make it difficult to use an implicit scheme for the coarse solver, because it will be hardly possible to have a large cost ratio between an explicit F and an implicit G.
On the other hand, building a coarse solver with explicit time integration cannot be based on an increase of the time step, because of their intrinsic CFL limitation. One can use then coarsening in space, but previous analyses in the literature [33,29] show that one needs necessarily high order interpolation between the coarse and fine grids, and the coarse solver could then also suffer major efficiency loss in the context of massive space parallelization, not even talking about further high frequency error components introduced due to this process, which the iteration can not eliminate. This makes it very tricky for actual PinT methods based on multilevel techniques, and it seems unavoidable that one has to develop new techniques for the coarser levels when applying Parareal like methods to the advec-tion problem, or more generally, hyperbolic problems. In that spirit, interesting ideas have emerged recently in the scientific community for the design of coarse level propagators more suited for PinT integration of hyperbolic problems, see e.g. [36,5,31], and testing these new methods under the stress of a random initial guess with precise accuracy requirements of the computed solution is an important task. The difficulty of good coarse corrections is reminiscent of the tremendous difficulties faced when trying to solve Helmholtz problems using multilevel techniques, see for example [7,8].

Conclusion
We developed general error estimates for the linear advection equation which complement estimates already available in the literature. In particular, our new results allow us to estimate the discrete L 2 norm of the discretization error in the numerical solution for any general space-time discretization based on Runge-Kutta type time integration methods and finite difference space discretizations. We then used our new estimates to derive conditions on the minimum number of points per wavelength needed, which clearly show the interest of using high order discretization methods for smooth solutions of the advection equation.
We also discovered with our new error estimates the existence of synergistic space-time discretizations that allow us to gain one order of accuracy at a given CFL number. We have only shown one example for illustrative purposes, but this discovery opens up the path for developing new space-time discretization methods constructed specifically to have this property.
We have finally also shown that our new error estimates can be used to analyze the choice of space-time discretizations used in PinT applications, in order to validate them on the advection problem, which is a critical first step toward a more generalized application of PinT methods to larger scale hyperbolic problems. Our preliminary analysis for applying Parareal to the advection equation indicates limitations of current classical numerical schemes, and the need of developing new coarse time integrations adapted to hyperbolic problems. Table 5 Standard finite difference space disretizations coefficients for advection. Half of the coefficients for C8 are not indicated to reduce table width, as they can be easily deduced from the first coefficients (see others centered schemes).

ID
Finite differences formula A Space discretization schemes for advection We give in Tab. 5 the coefficients of the space discretization schemes investigated in Sec. 3.2. Most of them belong to common literature on finite-difference schemes. The U2 and U4 scheme are not very common, but used in [5].