A numerical study of variational discretizations of the Camassa-Holm equation

We present two semidiscretizations of the Camassa-Holm equation in periodic domains based on variational formulations and energy conservation. The first is a periodic version of an existing conservative multipeakon method on the real line, for which we propose efficient computation algorithms inspired by works of Camassa and collaborators. The second method, and of primary interest, is the periodic counterpart of a novel discretization of a two-component Camassa-Holm system based on variational principles in Lagrangian variables. Applying explicit ODE solvers to integrate in time, we compare the variational discretizations to existing methods over several numerical examples.


Introduction
The Camassa-Holm (CH) equation (1) u t − u txx + 3uu x − 2u x u xx − uu xxx = 0, was presented in [8] as a model for shallow water waves, where u = u(t, x) is the fluid velocity at position x at time t, and the subscripts denote partial derivatives with respect to these variables. Equation (1) can also be seen as a geodesic equation, see [45,22,23]. This paper focuses on numerical schemes that are inspired by this interpretation, and more specifically the flow map or Lagrangian point of view for the equation. We mention that the CH equation also turns up in models for hyperelastic rods [13,25,40], and that it is known to have appeared first in [28] as a member of a family of completelyintegrable evolution equations. Due to its rich mathematical structure and interesting properties, (1) has been widely studied. For instance it is bi-Hamiltonian [28], has infinitely many conserved quantities, see, e.g., [46], and its solutions may develop singularities in finite time even for smooth initial data, see, e.g., [19,20]. Moreover, serveral extensions and generalisations of the Camassa-Hom equation exist, but we will only consider one of them, which is now commonly referred to as the two-component Camassa-Holm (2CH) system. It was first introduced in [52,Eq. (43)], and can be written as That is (1) has been augmented with a term accounting for the contribution of the fluid density ρ = ρ(t, x), and paired with a conservation law for this density.
Since the paper [8] by Camassa and Holm there have been numerous works on (1), and its extension (2) has also been widely studied. Naturally, there has also been proposed a great variety of numerical methods with these equations in mind, and here we will mention just a handful of them. An adaptive finite volume method for peakons was introduced in [3]. In [36,14] finite difference schemes were proved to converge to dissipative solutions of (1), while invariant-preserving finite difference schemes for (1) and (2) were studied numerically in [48]. Pseudospectral, or Fourier collocation, methods for the CH equation were studied in [43,44], where in the latter paper the authors also proved a convergence result for the method. In [9,10,37,11] the authors consider particle methods for (1) based on its Hamiltonian formulation, which are shown to converge under suitable assumptions on the initial data. On a related note, a numerical method based on the conservative multipeakon solution [38] of (1) was presented in [41]. Furthermore, there have been proposed several Galerkin finite element methods for (1): an adaptive local discontinuous method was presented in [56], a Hamiltonian-conserving scheme was studied in [49], while [1] presented a Galerkin method with error estimates. There have also been proposed more geometrically oriented methods, such as a geometric finite difference scheme based on discrete gradient methods [17], and multi-symplectic methods for both (1) and (2) in [16,15]. Moreover, [18] presents a numerical method for (1) based on direct discretization of the equivalent Lagrangian system of [39]. Such a list can never be exhaustive, and for more numerical schemes we refer to the most recent papers mentioned above and the references therein.
In this paper however, we consider energy-preserving discretizations for (1) and (2), which are closely related to variational principles in [29] and [38]. In particular, we are interested in studying how well the discretizations in [29] and [38] serve as numerical methods. To this end, we will consider the initial value problem of (2), with periodic boundary conditions in order to obtain a computationally viable numerical scheme, i.e., x ∈ T, ρ(0, x) = ρ 0 (x), x ∈ T.
One of the hallmarks of the CH equation, and also the 2CH system, is the fact that even for smooth initial data, its solutions can develop singularities, also known as wave breaking. Specifically, this means that the wave profile u remains bounded, while the slope u x becomes unbounded from below. At the same time energy may concentrate on sets of measure zero. This scenario is now well understood and has been described in [19,20,24,30]. A fully analytical description of a solution which breaks is provided by the peakon-antipeakon example, see [38]. An important motivation for the discretizations derived in [29] and [38] was for them to be able to handle such singularity formation, and we will see examples of this in our final numerical simulations.
The variational derivation of the equation as a geodesic equation is based on Lagrangian variables, and the Lagrangian framework is an essential ingredient in the construction of global conservative solutions, see [6,39,32]. The other essential ingredient is the addition of an extra energy variable to the system of governing equations, which tracks the concentration of energy on sets of measure zero. Later we will see that these ingredients have all been accounted for in our discretization.
Next we will outline how the variational derivation of the CH equation is carried out in the periodic setting, before we turn to our discrete methods. In our setting, we take the period to be L > 0 such that for t ≥ 0. We introduce the characteristics y(t, ξ) and the Lagrangian variables y t (t, ξ) = u(t, y(t, ξ)) =: U (t, ξ), r(t, ξ) := ρ(t, y(t, ξ))y ξ (t, ξ).
Let us ignore r for the moment by setting r ≡ 0, which corresponds to studying the CH equation. A rather straight-forward discretization of the above variables comes from replacing the continuous parameter ξ by a discrete parameter ξ i for i in a set of indices. The pairs (y i , U i ) can then be considered as position and velocity pairs for a set of discrete particles. We want to derive the governing equations of the discrete system from an Euler-Lagrange principle. The system of equations will thus be fully determined once we have a corresponding Lagrangian L(y, U ).
We base the construction of the discrete Lagrangian on the continuous case. For the CH equation, a Lagrangian formulation is already available from the variational derivation of the equation. Let us briefly review this derivation. The motion of a particle, labeled by the variable ξ, is described by the function y(t, ξ). The velocity of the particle is given by y t (t, ξ) = U (t, ξ). The Eulerian velocity u is given in the same reference frame through u(t, y(t, ξ)) = U (t, ξ), and the energy is given by a scalar product in the Eulerian frame. For the CH equation, the scalar product ·, · is given by the H 1 -norm (5) u, u = T (u 2 + u 2 x ) dx = T y 2 t y ξ + y 2 tξ y ξ dξ.
Other choices of the scalar product lead to the Burgers or Hunter-Saxton equations, see Table 1. From the scalar product, we define the momentum m as the function name Burgers Hunter-Saxton Camassa-Holm equation u t + 3uu x = 0 (u t + uu x ) x = 1 2 u 2 not defined piecewise linear multi-peakons Table 1. Summary of norms and corresponding soliton-like solutions.
which satisfies (6) u for all v. Note that this scalar product is invariant with respect to relabeling of the particles, or right invariant in the terminology of [2]. This means that for any diffeomorphism, also called relabeling function, φ(ξ), the transformation y → y • φ and U → U • φ leaves the energy invariant: By Noether's theorem, this invariance leads to the conserved quantity m c = m • yy 2 ξ , which is presented as the first Euler theorem in [2]. We can recover the governing equation using the conserved quantity m c . We have We use the definition of u as y t = u • y and, after simplification, we obtain which is exactly (1).
One method for discretizing the CH equation comes from its multipeakon solution, as studied in [38]. This solution of (1) is a consequence of that the class of functions of the form is preserved by the equation. By deriving an ODE system for y i and U i which define the position and height of the peaks, we can deduce their values at any time t. Then, given the points (y i , U i ) for i ∈ {1, . . . , n}, we can reconstruct the solution on the whole line by joining these points with linear combinations of the exponentials e x and e −x . For the new scheme, we use instead a linear reconstruction, which is also the standard approach in finite difference methods. In this case we approximate the energy in Lagrangian variables using finite differences for y i and U i , and then the corresponding Euler-Lagrange equation defines their time evolution. Finally, we apply a piecewise linear reconstruction to interpolate (y i , U i ) for i ∈ {1, . . . , n}.
Comparing these two reconstruction methods, we face a trade-off in how we interpolate the points (y i , U i ). Although the piecewise exponential reconstruction provides an exact solution of (1), one may, in absence of additional information on the initial data, consider it less natural to use these catenary curves to join the points instead of the more standard linear interpolation. On the other hand, linear reconstruction may approximate the initial data better, but an additional error is introduced since piecewise linear functions are not preserved by the equation, see Figure 1.  Note that multipeakon solutions are not available in the case ρ = 0, cf. [21], and so the method based on linear reconstruction is the only scheme presented here for the 2CH system which is based on variational principles in Lagrangian coordinates. We remark that for the Hunter-Saxton equation, the soliton-like solutions are piecewise linear, being solutions of u xx = i∈Z U i δ(x − y i ). Thus, the linear and the exact soliton reconstruction coincide for the Hunter-Saxton equation. As a matter of fact, in [34] there has recently been developed a fully discrete numerical method for conservative solutions of the Hunter-Saxton equation which is primarily set in Eulerian coordinates, but employs characteristics to handle wave breaking.
The rest of this paper is organized as follows. In Section 2 we briefly recall the conservative multipeakon method introduced in [38], where a finite set of peakons serve as the particles discretizing the CH equation, and outline how the corresponding system is derived for the periodic case. Moreover, we present efficient algorithms for computing the right-hand sides of their respective ODEs, which are inspired by the fast summation algorithms of Camassa et al. for their particle methods [9,10]. Section 3 describes the new variational scheme in detail, and some emphasis is put on deriving fundamental solutions for a discrete momentum operator, which in turn allows for collisions between characteristics and hence wave breaking. Finally, in Section 4 we very briefly describe the methods we have chosen to compare with, before turning to a series of numerical examples of both quantitative and qualitative nature.

Conservative multipeakon scheme
An interesting feature of (1) on the real line is that it admits so-called multipeakon solutions, that is, solutions of the form defined by the ODE systeṁ for i ∈ {1, . . . , n}, where q i and p i can respectively be seen as the position and momentum of a particle labeled i. In this sense, q i is analogous to the discrete characteristic y i in the previous section. Several authors have studied the discrete system (8), in particular Camassa and collaborators who named it an integrable particle method, see for instance [7,9,37]. This system is Hamiltonian, and one of its hallmarks is that for initial data satisfying q i = q j for i = j and all p i having the same sign, one can find an explicit Lax pair, meaning the discrete system is in fact integrable. The Lax pair also serves as a starting point for studying general conservative multipeakon solutions with the help of spectral theory, see [26,27].
System (8) is however not suited as a numerical method for extending solutions beyond the collision of particles, which for instance occurs for the two peakon initial data with q 1 < q 2 , and p 2 < 0 < p 1 . Indeed, as |q 2 − q 1 | → 0, the momenta blow up as (p 1 , p 2 ) → (+∞, −∞), cf. [55,31]. Even though this happens at a rate such that the associated energy remains bounded, unbounded solution variables are not well suited for numerical computations. One alternative way of handling this is to include an algorithm which transfers momentum between particles which are close enough according to some criterion, see for instance [12]. However, we prefer to use the method presented next, where a different choice of variables, which remain bounded at collision-time, is introduced.
2.1. Real line version. In [38] the authors propose a method for computing conservative multipeakon solutions of the CH equation (1), based on the observation that between the peaks located at q i and q i+1 in (7), u satisfies the boundary value problem u − u xx = 0 with boundary conditions u(t, q i ) =: u i (t) and u(t, q i+1 (t)) =: u i+1 (t). Moreover, from the transport equation for the energy density one can derive the time evolution of H i which denotes the cumulative energy up to the point q i . Using y i instead of q i to denote the i th characteristic we then obtain the discrete systemẏ for i ∈ {1, . . . , n} with We note that the solution u is of the form u(t, x) = A i (t)e x + B i (t)e −x between the peaks y i and y i+1 with coefficients and where we for any grid function {v i } n i=0 have defined In order to compute the solution for x < y 1 and x > y n , one also introduces the convention (y 0 , u 0 ) = (−∞, 0) and (y n+1 , u n+1 ) = (∞, 0). We also have the relation which can be computed as (11) δH i =ū 2 i tanh(δy i ) + (δu i ) 2 coth(δy i ). Here we emphasize that the total energy H n+1 is then given by since H 0 = 0. Due to the explicit form of u we may compute P and Q as where we have defined For details on how such multipeakons can be used to obtain a numerical scheme for (1) we refer to [41].
2.1.1. Fast summation algorithm. We will present a periodic version of the above method to compare with our variational scheme. Before that, we note that the above method can be computationally expensive if one naively computes (13) for each i and j, amounting to a complexity of O(n 2 ) for computing the right-hand side of (9). Inspired by [9] and borrowing their terminology we shall propose a fast summation algorithm for computing (12) with complexity O(n). Indeed, this can be done by noticing that our P i and Q i share a similar structure with the right-hand sides of (8). To this end we make the splittings and note that f l i and f r i satisfy the recursions (14) a j := δH j cosh 2 (δy j ) +ū 2 j tanh(δy j ) 2 cosh(δy j ) , b j :=ū j δu j sinh 2 (δy j ) cosh(δy j ) .
Moreover we have the starting points for the recursions given by Clearly, computing f l and f r recursively is of complexity O(n), while adding and subtracting them to produce P and Q is also of complexity O(n), which yields the desired result.
2.2. Periodic version. Now for the periodic version of (9) there are only a few modifications needed. First of all we have to replace the "peakons at infinity" given by (y 0 , u 0 ) = (−∞, 0) and (y n+1 , u n+1 ) = (∞, 0) which in some sense define the domain of definition for the solution. The new domain will instead be located between the "boundary peakons" (y 0 , u 0 ) = (y n − L, u n ) and (y n , u n ). Thus, we are still free to choose n peakons, but we impose periodicity by introducing an extra peakon at y n − L with height u n . We also have to redefine H i , which now will denote the energy contained between y 0 and y i . Thus we have H 0 = 0, H n is the total energy of an interval of length L, while each H i can be computed as with δH i defined in (11). In addition, since the energy is now integrated over the interval [y 0 , y i ] we have to replace the evolution equation for H i witḣ where the last identity follows from the periodicity of P i by virtue of δy i , u i , and δH i being n-periodic.
Moreover, we have to replace e −|y−x| in P i and Q i with its periodic counterpart |y − x| ≤ L, and we now integrate over [y 0 , y n ] instead of R. This is analogous to the derivation of the periodic particle method in [10], and the numerical results of [37]. The courageous reader may verify that the calculations in [38] can be reused to a great extent. In the end we find that the expressions for P i and Q i are essentially the same, we only need to replace each occurrence of e −σij (yi−ȳj ) and its "derivative" with respect to y i , −σ ij e −σij (yi−ȳj ) , with respectively. To be precise, P i and Q i are given by and for j ∈ {1, . . . , n − 1}. To summarize, the periodic system readṡ for i ∈ {1, . . . , n}, and P i and Q i defined by (17), (18), and (19). [38,Ex. 4.2] we set n = 1, and use the periodicity to findȳ 0 = y 1 − L/2, δy 0 = L/2,ū 0 = u 1 , δu 1 = 0, and δH 0 = u 2 1 tanh(L/2). Plugging into (18) and (19) we find Then (20) yieldsu 1 = 0 andḢ 1 = 0, and setting u 1 (t) ≡ c we obtain y 1 (t) = y 1 (0) + ct. This shows that for n = 1, in complete analogy to the real line case, the evolution equation for H 1 decouples from the other equations, and we find that a periodic peakon travels with constant velocity c equal to its height at the peak.
(ii) With substantially more effort compared to (i) we could also consider n = 2 with antisymmetric initial datum for u to recover the periodic peakon-antipeakon solution computed in [16, pp. 5505-10].
2.2.1. Fast summation algorithm. Drawing further inspiration from [10] we propose a fast summation algorithm for the periodic scheme as well, and following their lead we use the infinite sum rather than the hyperbolic function representation of the periodic kernel. Using geometric series we find we find that the periodic P i and Q i can be written where in analogy to the full line case we have defined in addition to with a j and b j defined in (14). Defining g l i := g − i + f l i and g r i := g + i + f r i , these functions satisfy the recursions . . , n − 1}. Once more, the recursion allows us to compute P i and Q i with complexity O(n) rather than O(n 2 ) for the naive computation of each distinct P ij and Q ij in (18) and (19).

Variational finite difference Lagrangian discretization
Here we describe the method which is based on a finite difference discretization in Lagrangian coordinates, as derived in [29].
Denoting the number of grid cells n ∈ N, we introduce the grid points ξ i = i∆ξ for i ∈ {0, . . . , n − 1} and step size ∆ξ > 0 such that n∆ξ = L. These will serve as "labels" for our discrete characteristics y i (t) which can be regarded as approximations of y(t, ξ i ). In a similar spirit we introduce U i (t) and r i (t). For our discrete variables, the periodicity in the continuous case (4) translates into For a grid function f = {f i } i∈Z we introduce the forward difference operator D + defined by where we also have included the backward and central differences for future reference. We will use the standard Euclidean scalar product in R n scaled by the grid cell size ∆ξ to obtain a Riemann sum approximation of the integral on T. Moreover, we introduce the space R n per of sequences v = {v j } j∈Z satisfying v j+n = v j , and which is isomorphic to R n . For n-periodic sequences, the adjoint (or transpose) D of the discrete difference operator D is defined by the relation For instance, summation by parts shows that the differences in (22) The variational derivation of the scheme for the CH equation (1) is based on an approximation of the energy given by which corresponds to (5) in the continuous case. Following [29] we obtain a semidiscrete system which is valid also in the periodic case, namely Observe that in solving (24) we obtain approximations of the fluid velocity in Lagrangian variables since For the 2CH system (2) one has the identity in the continuous setting. Based on this we introduce the discrete identity which allows us to express the discrete density ρ i (t) as a function of D + y i (t) and the initial data. Accordingly, we have the approximate relation ρ(t, y(t, ξ i )) ≈ Furthermore, the energy of the discrete 2CH system contains an additional term compared to (23), and reads As a consequence, the semidiscrete system for the 2CH system is the same as (24), except that right-hand side of the second equation now becomes Note that (24) does not give an explicit expression for the time derivativeU , as a solution dependent operator has been applied to it. For D + y i ∈ R n per and an arbitrary sequence w = {w i } i∈Z ∈ R n per , let us define the discrete momentum Note that when D + y i = 1, (25) is a discrete version of the Sturm-Liouville operator Id − ∂ xx . The name momentum operator comes from the fact that the discrete energy can be written as the scalar product of A[D + y]U and U , which corresponds to (6). Moreover, as in [29] we find that (24) preserves the total momentum (26) where the final identity comes from telescopic cancellations and periodicity.

Presentation of the scheme for global in time solutions.
To follow [29] in obtaining a scheme which allows for global in time solutions, we have to invert the discrete momentum operator (25), and in the aforementioned paper this is done by finding a set of summation kernels, or fundamental solutions, g i,j , γ i,j , k i,j , and κ i,j satisfying where D j± denotes differences with respect to the index j. Let us for the moment assume that we have a corresponding set of kernels for the periodic case, namely and which are n-periodic in their index j for fixed i. The existence of such kernels will be justified in the next subsection.
In the end we want to derive a system which is equivalent to (24) for D + y j > 0 and which serves as a finite-dimensional analogue to [29,Eq. (4.42)]. Following the convention therein we decompose y j = ζ j + ξ j , which by (21) implies that ζ is nperiodic as well: ζ j+n = ζ j . Then, with appropriate modifications of the approach in [29], our system for j ∈ {0, . . . , n − 1} readṡ where we have defined and h j is defined to satisfy (30) 2h We note that we could have includedṙ j = 0 in (28), but since r does not appear in any of the other equations, we choose to omit it. Note that when considering the CH equation, ρ, and thus also r, vanishes identically. In the current setting, this only affects the presence of r in the identity (30).
Observe that R j and Q j in (29) are n-periodic by virtue of the kernels being nperiodic in j, and so it follows that (28) is of the formẊ j (t) = F j (X(t)), where X j+n (0) = X j (0) and F j+n (X) = F j (X). Then the integral form of (28) shows that X j+n (t) = X j (t), and so any solution of this equation must be n-periodic.
We also note that we can equivalently formulate (28) more in the spirit of [29,Eq. (4.42)] by defining and replace (28a) and (28c) to obtaiṅ where we have combined (28c) and (31) with the periodicity of U and R to get (32c). In this case we note that D + H j = h j for j ∈ {0, . . . , n − 1},Ḣ n (t) =Ḣ 0 (t) ≡ 0, and H n (t) = H n (0) is the total energy of the system. The energy H n is a reformulation of (23) in Lagrangian variables. Equation (32) is in fact our preferred version of the scheme, as it more closely resembles (20) and preserves the discrete energy H n identically.
An important observation is that the sequences defined in (29) solve which is equivalent to with corners. As shown in the next section, the matrix (34) is invertible whenever D + y j ≥ 0 for j ∈ {0, . . . , n − 1}. Thus, (33) provides a far more practical approach for computing the right hand side of (28) than the identities (29), especially for numerical methods, as there is no need to compute the kernels in (27). Indeed, if one uses an explicit method to integrate in time, given y, U and h we can solve (33) to obtain the corresponding R and Q.

3.2.
Inversion of the discrete momentum operator. The alert reader may wonder why we work with the 2n × 2n matrix (34) when inverting the operator (25) defined in only n points. This comes from the approach in [29] which enables the discretization to handle "discrete" wave breaking, i.e., D + y i = 0. By introducing a change of variables we rewrite the second order difference operator (25) as the first order matrix operator appearing in (27). Thus we avoid D + y in the denominator at the cost of increasing the size of the system.
When introducing the change of variables, we lose some desirable properties which would have made it easy to establish the invertibility of the matrix corresponding to (34) in the cases where D + y i ≥ c for some positive constant c. This would for instance be the case for discretizations of (3) where ρ 2 0 (x) ≥ d for a constant d > 0, since it is then known that wave breaking cannot occur, see [33,Thm. 4.5]. In particular, we lose symmetry of the matrix which would have enabled us to use the standard argument involving diagonal dominance, as used for instance in [48] for a discrete Helmholtz operator. Our matrix (34) is clearly not diagonally dominant, but it is still invertible, as shown in the following proposition. Proof. We will prove that the determinant of A[D + y] is bounded from below by a strictly positive constant. Thus it is never singular.
First we recall the matrix which played an essential part when inverting the discrete momentum operator on the full line in [29]. Below we will see that it plays a role in the periodic case as well, and we emphasize the property det A j = 1.
Turning back to A[D + y], we consider the rescaled matrix ∆ξA[D + y] in order to have the absolute values of the off-diagonal elements equal to one. We observe that this matrix is tridiagonal, with nonzero corners owing to the periodic boundary. Then, the clever argument in [51,Lem. 1] gives an identity for the determinant of a general matrix of this form, which in our case reads and where the last identity in (36) comes from det(Π 0 ) = 1. Next, we note that each factor A j in Π 0 can be written as Then we may expand Π 0 as which means that its trace can be expanded as Since all factors are nonnegative, we throw away most terms to obtain where the final identity follows from y n − y 0 = L. Hence, combining the above with n∆ξ = L we obtain the lower bound which clearly shows A[D + y] to be nonsingular for any n ∈ N.
To prove the existence of global solutions to the governing equations (28) or (32) by a fixed point argument, we have to establish Lipschitz continuity of the right-hand side. However, Lipschitz bounds for the inverse operator of A[D + y] are difficult to obtain directly. In particular, we see that the estimates in Proposition 3.1 rely on the positivity of the sequence D + y, which is difficult to impose in a fixed-point argument. Therefore, we will have to follow the approach developed in [29] where we introduce the fundamental solutions for the operator A[D + y] and propagate those in time together with the solution. Proposition 3.1 gives us the existence of the fundamental solutions in (27). Indeed, comparing the equations (27) with the matrix (34) one can verify that each of G i,j , Γ i,j , K i,j , and K i,j for i, j ∈ {0, . . . , n − 1}, 4n 2 in total, appears as a distinct entry in the inverse of ∆ξA[D + y].
We do not detail here the argument developed in [29] which shows the existence of global solutions to the semi-discrete system (28). In fact, the periodic case is of finite dimension and therefore easier to treat than the case of the real line. Instead we will devote most of the remaining paper to numerical results. Before that, we present nevertheless some interesting properties of the fundamental solutions that can be derived in the periodic case, and which show the connection to the fundamental solutions on the real line. Readers more interested in numerical results may skip to Section 4.

Properties of the fundamental solutions.
Here we present an alternative method for deriving the periodic fundamental solutions, more in line with the procedure in [29]. The construction is done in two steps, the first of which is to find the fundamental solutions on the infinite grid ∆ξZ as was done in [29]. Then it turns out that we can periodize these solutions to find fundamental solutions for the grid given by i∆ξ for 0 ≤ i < n. In this endeavor we only assume the periodicity y i+n = y i + L and D + y i ≥ 0, as was done in Proposition 3.1.
Due to the periodicity, we can think of the sequences {D + y j , U j , h j , r j } j∈Z being a repetition of {D + y j , U j , h j , r j } n−1 j=0 , such that D + y j+kn = D + y j for j, k ∈ Z, and similarly for the other entries. Furthermore, it enforces the relation This also leads to the upper bounds but note that this bound can only be attained if D + y i = 0 for every other index than the one achieving the maximum.
To find a fundamental solution g i,j for the operator (25) defined on the real line, that is g i,j which satisfies we consider the homogeneous operator equation By introducing the quantity Thus, if for any index i we prescribe values for g i and γ i−1 , the corresponding solution of (40) in any other index can be found by repeated multiplication with the matrix A i from (35) and its inverse. The eigenvalues and eigenvectors of A i are found in [29,Lem. 3.3], and we briefly state its eigenvalues and underline that λ + i λ − i = 1. By (39) we obtain the bound Thus, using the inequality which means that we may write To construct fundamental solutions for the operator we need to find the correct homogeneous solutions for our purpose, namely those with exponential decay. In [29] one used the asymptotic relation lim n→±∞ D + y i = 1 to deduce the existence of limit matrices, and the correct values to prescribe for g and γ were given by the eigenvectors of these matrices. The periodicity of D + y prevents us from applying the same procedure to the problem at hand, but fortunately it turns out that a different argument can be applied in our case. In fact, we can draw much inspiration from [53,Chap. 7] which treats Jacobi operators with periodic coefficients, since the operator (25) can be regarded as a particular case of such operators. However, we make some modifications in this argument for our setting, such as introducing the variable γ i from earlier, and using [g i , γ i−1 ] as the vector to be propagated instead of [g i , g i−1 ] . The reason for this is to ensure the nice properties of the transition matrix A i , such as symmetry and determinant equal to one, and to avoid problems with dividing by zero when D + y i = 0. See also the discussion leading up to [29,Lem. 3.3]. Proof. Let us follow [29,Eq. (3.23)] in defining the transition matrix By the n-periodicity of A i we find Φ j+n,i+n = Φ j,i , and since det A i = 1 it follows that det Φ j,i = 1.
The next step is to show that for any fixed i 0 , we can write Φ i,i0 as the product of a matrix with n-periodic coefficients and a matrix exponential, as in [53, p. 116].
We define the generalization of (37) which is clearly n-periodic and contains every possible instance of A i as a factor. Note that for any i, j ∈ Z, it follows from the properties of Φ j,i that we can write so Π j and Π i are similar matrices. Thus, among other properties, they have the same eigenvalues.
The matrix Π i in (42) will prove to be the key to the construction of fundamental solutions. Now we define A to be the matrix given by (35) for D + y j = 1, that is Since A i is the sum of the identity and a nonnegative matrix, we deduce the entrywise inequality Π i0 ≥ A > 0. Indeed, this follows from using (38) to bound A i corresponding to the maximal D + y i from below by A, while bounding the remaining A i from below by I. Then it follows from Wielandt's theorem [50, p. 675] that + ≥ λ + > 1, where + and λ + are the largest eigenvalues of Π i0 and A respectively. Moreover, as det(Π i0 ) = 1 this implies 1 > λ − ≥ − , where the superscript now indicates the smallest eigenvalue. In particular, this implies the existence of a matrix Q with eigenvalues ±q for some q > 0 such that Π i0 = exp(n∆ξQ). This can be seen as an alternative factorization of Π i0 , a sort of geometric mean raised to the power n. Note that due to (41) we find it natural to include ∆ξ in the exponent to ensure that q can be bounded from above and below by constants depending only on the period L instead of the grid parameter ∆ξ.
The matrix exponential is always invertible and so we may write for some matrix P i,i0 which necessarily satisfies P i0,i0 = I. We then verify that P i,i0 is n-periodic in i, From before we know that Π i0 has distinct eigenvalues ± = e ±n∆ξq = e ±Lq . Let us then denote the corresponding eigenvectors of Π i0 by v ± i0 , and define where g ± i corresponds to the Floquet solution of (40). Observe that Since we can find such eigenvectors for any 0 ≤ i 0 ≤ n − 1 it is clear that the homogeneous solutions can be written as wherep i+n =p i andp i+n =p i . Note that we may use the same Floquet solutions for i → −∞ as well. Indeed, This concludes the proof.
Hence, to obtain a fundamental solution centered at some 0 ≤ i ≤ n − 1, we need only combine the Floquet solutions (43) with decay in each direction in such a way as to satisfy the correct jump condition at i. As shown in [29], the solution is of the form j is the spatially constant Wronskian. Furthermore, the use of g i,j and γ i,j in [29] to construct fundamental solutions k i,j and κ i,j satisfying carries over directly.
Using the fundamental solutions found before we introduce the periodized kernels which are defined for i, j ∈ {0, . . . , n − 1} and n-periodic in j, e.g., G i,j+n = G i,j . By our previous analysis, the summands are exponentially decreasing in |m|, and so the series in (45) are well-defined. For instance, using (43) and (44) we may compute G i,j as the sum of two geometric series, Compare this expression for i = 0 to the definition of g p j in [36, p. 1658], and note that they coincide for D + y j ≡ 1 with our q∆ξ andp 0pj /W corresponding to their κ and c respectively. Using the fact that D + y j+mn = D + y j for m ∈ Z we observe that these functions satisfy the fundamental solution identity (27). Moreover, the identity (27) imposes two symmetry conditions and an anti-symmetry condition on (45), namely These can be derived in complete analogy to the proof of [29,Lem. 4.1], replacing the decay at infinity by periodicity to carry out the summation by parts without any boundary terms. Alternatively, one can use the structure of the matrix (34) and its inverse to show that (46) holds.

Numerical experiments
In this section we will test our numerical method presented in the previous section for both the CH equation (1) and the 2CH system (2), and compare it to existing methods. As these are only discretized in space, we just want to consider the error introduced by the spatial discretization. To this end we have chosen to use explicit solvers from the Matlab ODE suite to integrate in time, and in most cases this amounts to using ode45, the so-called go-to routine. Matlab's solvers estimate absolute and relative errors, and the user may set corresponding tolerances for these errors, AbsTol and RelTol, to control the accuracy of the solution. Our aim is to make the errors introduced by the temporal integration negligible compared to the errors stemming from the spatial discretization, and thus be able to compare the spatial discretization error of our schemes to those of existing methods. All experiments were performed using Matlab R2018b on a 2015 Macbook Pro with a 3.1 GHz Dual-Core Intel Core i7 processor.
For the examples where we have an exact reference solution, we would like to compare convergence rates for some fixed time t. To compute the error we have then approximated the H 1 -norm by a Riemann sum where u(x) is the reference solution, and u n (x) is the numerical solution for n = 2 k . The norm in (47) is interpolated on a reference grid x i = i∆x for ∆x = 2 −k0 L.
Here we ensure that k 0 is large enough compared to k for the approximation to be sufficiently close to the H 1 -norm, and in general we have found that taking k 0 ≥ 2 + max k k works well in our examples. We omit the second term of the summand in (47) to obtain the corresponding approximation of the L 2 -norm.
Note that for the schemes (20) and (32) set in Lagrangian coordinates, traveling waves in an initial interval will move away from this interval along their characteristics. To compare their solutions to schemes set in fixed Eulerian coordinates we consider only norm on the initial interval, and for the Lagrangian solution we use the periodicity to identify y(t, ξ) with a position on the initial interval. For instance, if the initial interval is [0, L], we identify the solution in the positions y and y modulo L.

4.1.
Review of the discretization methods. Here we briefly review the discretization methods used in the coming examples, and in particular we specify how the they have been interpolated on the reference grid. The schemes we use to compare with (32) can of course be just a small sample of existing methods, and we have chosen to compare with a subset of schemes which share some features with our variational scheme (32). As alluded to in the introduction, the conservative multipeakon scheme (9) from [38] shares much structure with (32), and so we found it natural to define its periodic version (20) for comparison. Furthermore, since the discrete energy (23) is defined using finite differences, we decided to implement some finite difference schemes, and here we included both conservative and dissipative methods to illustrate their features. Finally, we included a pseudospectral scheme, also known as Fourier collocation method, which has less in common with the other schemes. This is known to perform extremely well for smooth solutions, but we will see that it is less suited for solutions of peakon type.
We underline that even though these numerical schemes may have been presented with specific methods for integrating in time in their respective papers, for these examples we want to compare the error introduced by the spatial discretization only, and to treat all methods equally we choose a common explicit method as described before.
4.1.1. Conservative multipeakon scheme. As mentioned in the introduction, when defining the interpolant u n (t, x) for the multipeakon scheme (20) between the peaks located at y i , it is a piecewise combination of exponential functions. This in turn makes its derivative (u n ) x piecewise smooth, but discontinuous at the peaks. For the approximation of initial data, unless otherwise specified, we have chosen y i (0) = ξ i , U i (0) = u 0 (ξ i ), and computed H i (0) according to (15) for ξ i = i∆ξ and i ∈ {0, . . . , n − 1}.

4.1.2.
Variational finite difference Lagrangian scheme. We mentioned in Section 3 that it is computationally advantageous to solve the matrix system (33) when computing R and Q in the right-hand side of (32). Indeed, solving this nearly tridiagonal system should have a complexity close to O(n) when solved efficiently. In practice, we find that the standard Matlab backslash operator, or mldivide routine, is sufficient for our purposes, as it seems to scale approximately linearly with n in our experiments.
For the interpolant u n (t, x) we solve (32) for a given n to find y i (t) and U i (t), and define a piecewise linear interpolation. This makes (u n ) x piecewise constant with value D + U i /D + y i for x ∈ [y i , y i+1 ). Note that there is no trouble with dividing by zero as the corresponding intervals are empty. When applying the scheme to the 2CH system, we follow the convention in [29] with a piecewise constant interpolation ρ n (t, x) for the density, setting it equal to r i (0)D + y i (0)/D + y i (t) for x ∈ [y i (t), y i+1 (t)). For initial data, unless otherwise specified, we follow the multipeakon method in choosing computing h i (0) according to (30), and then compute H i (0) as (31).

4.1.3.
Finite difference schemes. As they remain a standard method for solving PDEs numerically, it comes as no surprise that several finite difference schemes have been proposed for the CH equation. We will consider the convergent dissipative schemes for (1) presented in [36,14], and the energy-preserving scheme for (1) and (2) studied numerically in [48]. The schemes in [36,48] are both based on the following reformulation of (1), for u = u(t, x) and m = m(t, x). Here u(t, ·) ∈ H 1 (T) means that m(t, ·) corresponds to a Radon measure on T. Then, with (48) as starting point, and grid points x j = j∆x, ∆x > 0 we may apply finite differences, as defined in (22), to obtain various semidiscretizations, specifically which is the discretization studied in [36] under the assumption of m initially being a positive Radon measure. In the same paper they also briefly mention three alternative evolution equations for m j , which for different reasons were troublesome in practice when integrating in time using the explicit Euler method. On the other hand, in [48] they use the following discretization, which coincides with (50b) except that they use a wider stencil when defining m j . A difference operator which approximates the r th derivative using exactly r + 1 consecutive grid points is called compact, cf. [4,Ch. 3]. Clearly, D ± and D − D + are compact difference operators, while D 0 and D 0 D 0 are not. As pointed out in [4,Ch. 7], noncompact difference operators are notorious for producing spurious oscillations, and this is exactly the problem reported in [36] for (50b). Similarly, in [48] the authors remark that oscillations may appear when the solution of (1) becomes less smooth. In these cases they propose an adaptive strategy of adding numerical viscosity to the scheme with the drawback that the discrete energy is no longer conserved. We have not incorporated such a strategy here, as we would like an energy-preserving finite difference scheme to compare with our energy-preserving variational discretizations.
An invariant-preserving discretization of the 2CH system (2) is also presented in [48], and using the notation (10) we can write it aṡ still with m j = u j −D 0 D 0 u j . The semidiscretizations (51) and (52) are conservative in the sense that both preserve the invariants which respectively correspond to the momentum and energy of the system, and have counterparts in (26) and (23) for the system (24). In addition, (52) preserves the discrete mass A somewhat more refined spatial discretization is employed in [14]. This method is based on yet another reformulation of (1), and its discretization reads Not only do they use a staggered grid, but they also use u ∨ 0 and u ∧ 0, the positive and negative parts of u respectively, to obtain the proper upwinding required for dissipation only in the D − u-part and not the u-part of the associated discrete energy Contrary to [36], this dissipative scheme allows for initial data of any sign for u. Moreover, this precarious choice of positive and negative parts of u can be linked to the traveling direction of the wave profile u: for u j+1/2 > 0 the solution moves to the right, thus values to the right have a greater influence on the solution and it is reasonable to use a forward difference. For u j+1/2 < 0 the solution travels to the left and one analogously uses a backward difference.
For u n corresponding to (49), (51), (52), and (55) we have made a piecewise interpolation between the grid points, meaning (u n ) x is piecewise constant. For initial data we define u i (0) = u 0 (x i ) for x i = i∆x, n∆x = L, and apply the corresponding discrete Helmholtz operator to produce m i (0) for the schemes (49), (51), and (52).
A comment on the inversion of the discrete Helmholtz operator. In both [36] and [14] they compute the Green's function corresponding to the discrete Helmholtz operator Id −D − D + using a difference equation. As [36] concerns the periodic CH equation, they periodize this function to obtain the periodic Green's function which can be restated in our variables as Notice the resemblance to the periodized exponential (16) in the continuous case. Defining the matrix M corresponding to the discrete Helmholtz operator with periodic boundary conditions, we have (M g p ) j = δ 0,j for the Kronecker delta δ i,j . Consequently, they compute u from m by a convolution In the implementation they compute this convolution efficiently by employing the fast Fourier transform (FFT) and the convolution theorem for the discrete Fourier transform (DFT), namely The discrete Fourier transform is more than an efficient tool for evaluating the convolution in this case. Indeed, the matrix M is circulant and thus diagonalizable using the DFT matrix, cf. [50, p. 379]. Defining the matrix V through V j,k = 1 √ n e i 2π n jk for j, k ∈ {0, . . . , n − 1} we find that V * M V is indeed a diagonal matrix containing the eigenvalues of M , j ∈ 0, . . . , n − 1.
A little rearrangement then shows (F n [g p ]) j = d −1 j , or equivalently g p j = (F −1 n [d −1 ]) j , revealing an alternative method for computing the periodic Green's function or computing u through u = F −1 n d −1 · F n [m] . This can also be seen by directly inserting F −1 n [F n [g p ]] for g p in the difference equation [36,14], rearranging coefficients and applying the DFT.
The above method is also convenient for computing the inverse of the noncompact discrete Helmholtz matrix in [48], as its eigenvalues j ∈ 0, . . . , n − 1 can be computed from circulant matrix theory. Hence, this is how we have implemented the computation of u in (49), (51), and of P in (55). In our examples, when compared to computing g p by solving the nearly tridiagonal system with M , the FFT method was consistently faster.

Pseudospectral (Fourier collocation) scheme.
Let us consider the so-called pseudospectral scheme used in the study of traveling waves for the CH equation in [43], see also [54] for an introduction to the general idea. This method is based on applying Fourier series to (1) and solving the resulting evolution equation in the frequency domain. Introducing the scaling factor a := L/2π and assuming the discretization parameter n to be even, the pseudospectral method for (1) with period L can be written (56) Here ikV means the pointwise product of the vectors i[− n 2 , . . . , n 2 − 1] and [V (− n 2 ), . . . , V ( n 2 − 1)], while F n and F −1 n are the discrete Fourier transform and its inverse, defined as The right-hand side of (56) can be efficiently computed by applying FFT, and to evaluate the resulting u and u x on the grid x j one computes u n ( Unfortunately, this scheme is prone to aliasing, and for this reason the authors of [44] propose a modified version of the scheme which employs the Orszag 2/3-rule. For more details on aliasing and the 2/3 rule we refer to [5,Ch. 11]. This 2/3 rule amounts to removing the Fourier coefficients V (k) corresponding to one third of the frequencies k before applying the inverse transform in (56), specifically those frequencies of largest absolute value. In our setting this means that we keep V (k) as is for k ∈ {− n 2 , . . . , n 2 − 1} satisfying |k| ≤ n 3 , while setting V (k) = 0 for the rest. For this dealiased scheme, which they call a Fourier collocation method, the authors in [44] prove convergence in L 2 -norm for sufficiently regular solutions of (1).
When interpolating u n on a denser grid containing x j we must use the corresponding real Fourier basis function for each frequency k to obtain the correct representation of the pseudospectral solution, which will always be smooth. For this we use the routine interpft which interpolates using exactly Fourier basis functions.

Example 1: Smooth traveling waves.
To the best of our knowledge, there are no explicit formulae for smooth traveling wave solutions of either (1) or (2), and to obtain such solutions we make use of numerical integration in the spirit of [15] and [16] Assuming ϕ = c and multiplying with (c − ϕ) the above equation can be rearranged to yield (c − ϕ) 2 (ϕ − ϕ) = 0, which after integration gives for some constant B ∈ R. For the right choice of B, this ordinary differential equation can be integrated numerically to give periodic smooth solutions. Inspired by [16] we have chosen c = −B = 3 and initial conditions ϕ(0) = 1, ϕ (0) = 0. To integrate we used ode45 with very strict tolerances, namely AbsTol = eps and RelTol = 100 eps, where eps = 2 −52 is the distance from 1.0 to the next double precision floating point number representable in Matlab. After integration we found the solution to have period p = 6.4695469424989, where the first ten decimal digits agree with the period found in the experiments section of [16].

4.2.2.
Numerical results for the CH equation. Figures 2 and 3 display numerical results for the smooth reference solution above after moving one period L to the right. As the traveling wave has velocity c = 3, this corresponds to integrating over a time period L/3. To integrate in time we have applied ode45 with parameters AbsTol = RelTol = 10 −10 .
To highlight the different properties of each scheme, Figure 2 displays u n and (u n ) x for the various schemes for the low number n = 2 4 and interpolated on a reference grid with step size 2 −10 L. It is apparent how the dissipative nature of the schemes (49) and (55) reduces the height of the traveling wave such that it lags behind the true solution. This effect is particularly severe for (49), which probably explains why its error displayed in Figure 3a is consistently the largest. The perhaps most obvious feature in Figure 2b is the large-amplitude deviations introduced by the discontinuities for the multipeakon scheme. As indicated by its decreasing H 1 -error in Figure 3a, the amplitudes of these discrepancies reduce as n increases. Figure 3a contains L 2 -and H 1 -errors of the interpolated solutions for n = 2 k with k ∈ {3, . . . , 14}, evaluated on a reference grid with k 0 = 16. Before commenting on these convergence results, we underline that the pseudospectral method (56) has not been included in the figure, as its superior performance for this example would make it hard to differ between the plots for the remaining methods. Indeed, this scheme displays so-called spectral convergence in both L 2 -and H 1 -norm, and exhibits an L 2 -error close to rounding error already for n = 2 6 .
For the remaining methods it is perhaps not surprising that the finite difference scheme (51) based on central differences in general has the smallest error in H 1norm for this smooth reference solution, exhibiting convergence orders of 2 and 1 for L 2 and H 1 respectively. However, we observe that the L 2 -error of the multipeakon scheme is consistently the lowest, but its convergence in H 1 is impeded by its irregular derivative. Moreover, it appears that for small n, i.e., n ≤ 2 5 in this setting, the variational scheme (32) performs better.  An observation regarding the execution time of ode45 is that the finite difference schemes (49), (51) and (55) seem to experience some tipping point around k = 11 where their running times tend to be of complexity O(n 2 ) rather than O(n), see Figure 3b. A closer look at the statistics for the time integrator in these cases reveals that from k = 11 and onwards, the solver starts experiencing failed attempts at satisfying the specified error tolerances, thus increasing the execution time. This does not occur for the schemes in Lagrangian coordinates, which exhibit execution times aligning well with the O(n)-reference line. A possible explanation for this could be that the semidiscrete schemes based in Lagrangian coordinates are easier to handle for time integrator. Indeed, the almost semilinear structure of the ODE system corresponding to (32) in [29] is key to its existence and uniqueness proofs, and perhaps this structure is advantageous also for the ODE solvers.
It is however important to impose sufficiently strict error tolerances for the temporal integration error to be negligible compared to the spatial discretization for the smallest step sizes. Furthermore, in the case of (51) it appears important to not have too large error tolerances irrespective of n to avoid oscillations in m i . Since the convolution with the periodic Green's function to compute u i from m i is a regularizing process, we observe from experiments that seemingly well-behaved u i may hide extremely oscillatory m i . This is not surprising, as m i is a discretization of what in general may be a measure, and thus much less regular than u i .
It should be emphasized that the multipeakon scheme (20) is considerably faster than the other schemes, which likely comes from it being the only scheme where no matrix equations are solved. Thus, the fast summation algorithm appears to benefit the multipeakon scheme in this direction.

4.2.3.
Computing reference solution for the 2CH system. Here we essentially follow the steps for the CH equation, with some slight modifications. We make the ansatz u(t, x) = ϕ(x − ct) and ρ(t, x) = ψ(x − ct). Plugging these into (2) leads to the system Integration of (57b) yields the relation for some constant A ∈ R. This expression makes sense as long as ϕ = c, which we will assume from now on. Using the above relation we can replace ϕ by (58) in (57a). Rearranging we get in a similar manner as for the CH equation, Integration gives for some constant B ∈ R. We follow [15] in choosing c = A = −B = 2 and initial conditions ϕ(0) = 0.5, ϕ (0) = 0. Proceeding as in the case of the CH equation we obtain the period p = 5.1475159326651 where the four first decimal digits agree with the four decimal places provided in [15].

4.2.4.
Numerical results for the 2CH system. For this example we have only compared the variational scheme (28) to (52), as these are the only methods presented in Section 4.1 applicable to the 2CH system. As in the experiment for the CH equation we want to measure the error after the wave has moved a distance L to the right, corresponding to one period. Since the velocity of the solution now is c = 2, this corresponds to integrating from t = 0 to t = L 2 , which we did using ode45 with AbsTol = RelTol = 10 −8 . Figure 4 shows the interpolants u n and ρ n for n = 2 4 , together with the exact solutions. The reader may wonder why we have interpolated ρ n differently for the two methods, as it is clear from Figure 4b that the variational scheme would be much closer to the reference solution if one had used a piecewise linear interpolation of ρ n . In fact, when looking at the convergence rates of ρ n in this case, the variational scheme exhibits rate 1 convergence for both piecewise linear and piecewise constant interpolations. Since the discrete density is computed using D + y which is piecewise constant when y is piecewise linear, we follow [29] in using a piecewise constant ρ n . For the scheme (52) on the other hand, the convergence rate actually depends on the interpolation, as piecewise linear ρ n has convergence rate 2, while piecewise constant interpolation gives approximate rate 1. It is then only reasonable to use the interpolation which performs better. These convergence rates are shown in Figure 5 together with execution times for ode45, where n = 2 k , k ∈ {3, . . . , 14}, and the reference grid has k 0 = 16. We observe that both schemes exhibit highly consistent rates, but the difference scheme (52) outperforms (32) convergence-wise by having smallest errors overall and rate

Example 2: Periodic peakon.
It is now well known, cf. [8], that a single peakon u(t, x) = ce −|x−x0−ct| is a weak solution of (1) with its peak at x = x 0 + ct. The periodic counterpart of this solution is valid for |x − x 0 − ct| ≤ L, and periodically extended outside this interval. This formula for the periodic peakon can in fact be deduced from (20) for n = 1, or found in, e.g., [47,Eq. (8.5)]. Setting x 0 = 1 2 , c = L = 1, and t = 0 we use this function as initial datum on [0, 1] for a numerical example. As the periodic multipeakon scheme reduces to exactly this peakon for n = 1, y 1 (0) = 1 2 and u 1 (0) = 1, we have chosen to omit this scheme for the experiment, and rather compare how well the other schemes approximate a peakon solution.
As one could expect, the schemes generally performed worse for this problem compared to the smooth traveling wave, and so we could reduce the tolerances for the time integrator to AbsTol = RelTol = 10 −8 with no change in leading digits for the errors. However, as the finite difference schemes, and especially the noncompact scheme (51), were quite slow when using ode45 for large values of n, we instead used the solver ode113 which proved to be somewhat faster in this case. Moreover, when computing the approximate H 1 -error (47) in this case, we encounter the problem of the reference solution derivative not being defined at the peak. To circumvent this issue, we measure the error at time t = L on a shifted reference grid. That is, we evaluate (47) on x i = (i + 1 2 )2 −k0 instead of x i = i2 −k0 for i ∈ {0, . . . , 2 k0 − 1} to ensure x i = 1 2 . We plot the solutions again for the relatively small n = 2 4 to highlight differences between the schemes in Figure 6.  Once more we observe that the dissipativity of the schemes (49) and (51) is quite severe for this step size and they fail to capture the shape of the peakon. The energy preserving difference scheme (51) is closer to the shape of the peakon, but exhibits oscillations which are particularly prominent in the derivative. On the other hand, the variational scheme manages to capture the shape of the peakon very well, and manages far better than the other schemes to capture the derivative of the reference solution after one period.
The above observations are reflected in Figure 7a which shows the rate of convergence. The errors for the dissipative schemes decrease, which is expected since both have been proven to converge in H 1 . However, this convergence is quite slow, with approximate rates of 0.6 and 0.25 for the L 2 -and H 1 -norms respectively.
The energy-preserving difference scheme (51) exhibits order 1 convergence in L 2norm, but the oscillations in the derivative put an end to any hope of H 1 -convergence. Indeed, the H 1 -seminorm of the error is larger than 0.2 irrespective of the step size. The oscillations are of course even more severe for m i , the discrete version of u−u xx which is actually solved for in the ODE.
The variational scheme (32) performs quite well, with convergence rate 1 in L 2 and H 1 -rates generally between 0.45 and 1. The exception is the transition from n = 2 9 to n = 2 10 where there was barely any decrease in the error, followed by a large decrease corresponding to a rate of 5 in n = 2 11 , and from here the H 1 -error is comparable in magnitude to the L 2 -error of (51). This jump is possibly connected to the discontinuity of the reference solution.
The pseudospectral scheme performs quite well for the L 2 -norm, and for larger n it has the smallest error of all the methods, with the dealiased scheme showing a better convergence rate which approaches 1.5. However, in H 1 -norm the scheme performs worse than the variational scheme, owing to the major oscillations close to the discontinuity in the reference solution. Note that we have only run this scheme for k ∈ {3, . . . , 10}, as opposed to k ∈ {3, . . . , 13} for the other methods. The reason for this is that for larger n this scheme needs a finer reference grid to have consistent convergence rates, as opposed to the other schemes, and in addition the run times are very long for larger n.
Remark 4.1. Here we have only run the schemes over one period for the traveling wave, but an additional issue for the schemes in Lagrangian coordinates becomes apparent if they are run for a long time with initial data containing a derivative discontinuity, such as the traveling peakon. Then one typically observes a clustering of characteristics, or particles, at the front of the traveling discontinuity, leaving less particles to resolve the rest of the wave profile. Indeed, this is also reported in the numerical results of [9] for their particle method, and the authors suggest that a redistribution algorithm may be applied when particles come too close. Such redistribution algorithms would be useful for (20) and (32) when running them for long times, but development of such tools fall outside the scope of this paper. It is however important to be aware of this phenomenon, as the clustering can lead to artificial numerical collisions of the characteristics when they become too close for the computer to distinguish them. In worst case this can lead to a breakdown of the initial ordering of the characteristics y i , which again ruins the structure of the ODE system, leading to wrong solutions or breakdown of the method.
for c = 1 and L = 2π. We want to evaluate the numerical solutions at t = 4.5, which is approximately the time when the two peaks have returned to their initial positions x = π/2 and x = 3π/2 after colliding once. Since this is a multipeakon solution, we may use the conservative multipeakon scheme (20) to provide a reference solution. Setting n = 2, choosing y 1 = π/2, y 2 = 3π/2, u 1 = −u 2 = 1, and computing H i for i = 1, 2 according to (11), we integrated in time using ode113 with the very stringent tolerances AbsTol = eps and RelTol = 100 eps. For the schemes in the comparison we used the same solver with AbsTol = RelTol = 10 −9 , and a reference grid with 2 16 equispaced points.
For this example we have omitted the dissipative schemes (49) and (55), as the former cannot handle initial data of this type, while the latter would produce an approximation of the dissipative solution which is identically zero after the collision. Figure 6 shows u n for the variational scheme (32), the finite difference scheme (51), and the pseudospectral scheme (56) with and without dealiasing, all for n = 2 6 . This is a numerical example which is especially ill-suited for the pseudospectral method (56), and illustrates the necessity of the dealiasing to have any form of convergence. From Figure 8 one could get the impression that the dealiased scheme will perform worse because of its large-amplitude oscillations near the point of collision. A possible explanation for this is that the peakon-antipeakon interaction is very localized at collision time, and in removing the high frequency components of the pseudospectral approximation one loses the only basis functions which are able to resolve these localized details, and we are left with oscillations caused by the remaining basis functions. However, the amplitude of these spurious oscillations will decrease as n increases, since we have more basis functions with high frequencies.
On the other hand, the phase error in (56) without dealiasing does not decrease as n increases. Indeed, after collision this approximation always attains larger amplitudes than the reference solution, and thus travels further, irrespective of n. This becomes apparent in Figure 9a where we see that there is no convergence at all for the method without dealiasing. To try and understand this behavior we turned to the energy of the solution, and even though there is no defined discrete energy in the derivation of this method, based on the discrete energy in (53) we used the following expression as an indicator of the energy, In Figure 9b we have plotted the relative change (E(t) − E(0))/E(0) in the discrete energies of (51) and (56) over time for n = 2 12 . Note that we have also added plots of the relative change in the u 2 -and u 2 x -parts of the energies. From these plots we observe that all three schemes behave similarly until collision time, after which the total energy of both version of the pseudospectral scheme increases, while the relative change in total energy for (51) remains of order 10 −11 . We observe that the u 2 xenergy, and thus also the total energy, increases far more for the dealiased scheme, which is probably caused by the oscillations near the collision point discussed earlier.
On the other hand, when considering the u 2 -energy, the scheme without dealiasing has a larger increase than the dealiased one which has a u 2 -energy closer to that of the finite difference scheme (51). This suggests that unless one applies dealiasing to the pseudospectral scheme (56), the peakon-antipeakon collision introduces an artificial increase in energy which ruins L 2 -convergence through a phase error, see Figure 9a. For the H 1 -norm one cannot expect convergence from any version of (56), as the collision introduces severe oscillations in the pseudospectral derivative.
Oscillatory behavior also explains the very slow decrease in H 1 -error for the invariantpreserving difference scheme (51), displaying a rate fluctuating around 0.15. For the L 2 -norm it exhibits a rate which approaches 0.5.
Meanwhile, the variational scheme (32) performs rather well for this example, having the smallest errors in both L 2 -and H 1 -norm, and displaying consistent rates of respectively 1 and 0.5. 4.5. Example 4: Collision-time initial datum. An interesting feature discussed in [29,Section 5.2] is that the variational discretization allows for irregular initial data. That is, pairs (u, µ), where µ may be a positive finite Radon measure, provide a complete description of the initial data and the corresponding solution of (1) in Eulerian coordinates. In particular, for the absolutely continuous part of µ one has and the cumulative energy µ((−∞, x)) can be a step function, which is connected to the well-studied peakon-antipeakon dynamics.
For example, at collision time, u may be identically zero and all energy is concentrated in the point of collision as a delta distribution, meaning the cumulative energy will be a step function centered at the collision. To be able to accurately represent the solution between the two peakons emerging from a collision, we have to "pack" sufficiently many characteristics into the collision point. To this end we introduce the initial characteristics y 0 (ξ) = y(0, ξ) and the initial cumulative energy This feature inspired the following variation of peakon-antipeakon initial data, where we initially have a system with period L = 8 and total energy E = 6 equally concentrated in the points x = 2 and x = 6 on the interval [0,8]. In Eulerian variables this reads As the conservative multipeakon method describes exactly the interaction of peakons, we may once more use it as reference solution. Setting n = 4, L = 8 in (20) we define initial data y 0 = 2 2 6 6 , U 0 = 0 0 0 0 , H 0 = 0 6 6 12 corresponding to two pairs of peakons respectively placed at x = 2 and x = 6 with energy 6 contained between the peakons in each pair. Note that the energy is double that of the energy prescribed for the variational scheme (32), since the factor 1 2 is not present in the definition of the energy for the multipeakon scheme. This was then integrated using ode45 with the same tolerances as for the reference solution in the previous example, AbsTol = eps and RelTol = 100 eps.
Then we measured the errors using (47) on the reference grid x i = 2 −16 L, and the results are displayed in Figure 10b. We found the decrease in error to be remarkably consistent, rate 1 in the L 2 -norm and approximately 0.5 for the H 1 -norm, and this is true for both the time t = 2 before the collision and time t = 4 after the collision. Figure 10a displays the characteristics for the solution with n = 2 6 together with the four trajectories of the peaks of the exact solution. Observe how the characteristics, initially clustered at the collision points x = 2 and x = 6 in accordance with (59), spread out between the pairs of peaks in the reference solution.
In Figure 11 we have plotted the solution for n = 2 6 and interpolated on a reference grid with 2 10 grid points. We observe that the interpolants match the shape of the exact solution quite well, even for the derivative, and have only a slight phase error. 4.6. Example 5: Sine initial datum for the CH equation. In the following example we will qualitatively compare how the variational scheme (28) and the conservative multipeakon scheme (20) handle smooth initial data which leads to wave breaking. We have chosen to consider u 0 (x) = sin(x) for x ∈ [0, 2π], since this is a simple, smooth periodic function which leads to singularity formation. Furthermore, it is antisymmetric around the point x = π, which will highlight another difference between the methods. Since we do not have a reference solution in this case, the comparison will be of a more qualitative nature than in the preceding examples.  Figure 11. Plot of the interpolated solutions u n and (u n ) x for collision-time initial datum at times t = 2 and t = 4 with n = 2 6 interpolated on a reference grid with 2 10 grid points.
As usual we chose y i (0) = ξ i = i2π/n and U i (0) = u 0 (ξ i ) for both schemes, and computed their corresponding initial cumulative energies H i (0) in their own respective ways. Then we have integrated from t = 0 to t = 6π using ode45 with AbsTol = RelTol = 10 −10 , and evaluated the interpolated functions on a finer grid with step size ∆x = 2 −10 L.
A striking difference in the methods is immediately seen from their characteristics for n = 2 4 and n = 2 6 displayed in Figure 12. Indeed, we find that the multipeakon method preserves the symmetry of the characteristics, and this we have observed for all values of n that we tested for whenever y i (0) were equally spaced for i ∈ (d) Figure 12. The characteristics for u 0 (x) = sin(x) for the conservative multipeakon scheme with n = 2 4 (a) and n = 2 6 (c), and for the variational scheme for n = 2 4 (b) and n = 2 6 (d). Note that we have also plotted y n (t) = y 0 (t) + L to highlight the periodicity.
{0, . . . , n − 1}. In particular, the characteristics starting at ξ 0 = 0 and ξ n/2 = π remain in the same position for all t. Indeed, this is a consequence of the fact that antisymmetry is preserved by (1), cf. [8], [19,Rem. 4.2]. On the other hand, for the variational scheme the characteristics have a slight drift which becomes more pronounced over time. This is especially clear for small n; compare for instance Figures 12b and 12d where in the former the drift is apparent for y n/2 from the start, while in the latter the characteristics look more similar to those of Figure 12c for a longer time. This introduces a small phase error at t = 6π for n = 2 6 . The clustered characteristics indicating the peaks in Figure 12d lie slightly to the left for the corresponding peaks in Figure 12c.
To visualize the solution corresponding to Figure 12d, we have plotted the interpolant u n at 30 equally spaced times in Figure 13. Here we see how the initial smooth profile breaks after about two seconds, leading to two peaked waves traveling in opposite directions until they collide and reflect at the boundary. Comparing Figure 13. The interpolant u n for u 0 (x) = sin(x) with n = 2 6 and 30 equally spaced times between t = 0 and t = 6π.
with Figure 12d we see how the emerging peaks and antipeaks correspond to clustering of characteristics with very high density, while the locations of smoother ridges and troughs coincide with less dense clusters.
Remark 4.2. A natural question arising from this example is whether one could have chosen a different discrete energy as a starting point for the variational discretization in order to obtain a scheme which respects the preservation of antisymmetry. One could for instance try to use symmetric differences such as the central difference from (22). However, there is the potential drawback of the oscillatory solutions associated with noncompact difference operators, cf. the discussion in [17, p. 1929].
In fact, an early prototype of the scheme (32), comprising only of (24) solved as an ODE system with solution dependent mass matrix, exhibited severe oscillations in front of the peak when applied to the periodic peakon example after replacing D + by D 0 . This indicates that some care has to be exercised when choosing the defining energy. 4.7. Example 6: Sine initial datum for the 2CH system. Our final example illustrates how the solution of the previous example changes when we instead solve the two-component Camassa-Holm system initial value problem (3) by adding a positive density ρ 0 initially. In this case we cannot compare with the multipeakon method (20), since multipeakons will not be a solution of the 2CH system. Moreover, this time we will use (28) instead of (32) to illustrate how the total energy is preserved when h i rather than H i is used to track the energy of the system.
(D + U i (0)) 2 + (r i (0)) 2 for i ∈ {0, . . . , n − 1} and n = 2 6 . Once more we integrate from t = 0 to t = 6π with ode45 and AbsTol = RelTol = 10 −10 . The results are displayed in Figure 14. Figure 14a shows the characteristics which appear to have less drift in this case compared to Figure 12d. Moreover, as expected from theory we see that there is no collision of characteristics, there is always a positive distance between them. The lack of singularity formation expressed in the plot of the interpolant u n in Figure  14b, where the wave profile remains smooth. There are no sharp peaks or antipeaks, only ridges and troughs where the characteristics are dense. In Figure 14d we see how the energy is transferred to the density ρ n when the crests and troughs of u n meet, as opposed to the CH equation in Figure 13 where the energy is concentrated in the point of collision.
Finally, Figure 14c shows the deviation in momentum I(t) − I(0) and in energy E(t) − E(0), where I(t) is defined in (26) and E(t) = H n (t) as defined by the sum in (31). In the other examples, H n was one of the solution variables for the variational scheme, just as for the multipeakon scheme. For both schemes, the corresponding evolution equation isḢ n = 0 and the energy is conserved by default.
Here the energy is a sum scaled by ∆ξ, hence it is a linear invariant of the ODE system. Since linear invariants are preserved by any Runge-Kutta method, cf. [35, Thm. 1.5], we expect this deviation to be of the order of rounding error in our example. Indeed, this is what we observe: for any tolerance we set for ode45 we found the energy deviation to be of order 10 −15 . On the other hand, this is not the case for the total momentum I(t), which is a sum of products of U i and D + y i and thus a quadratic invariant. In our results, the momentum deviation always scales with the tolerances of the solver. This is also the case in Figure 14c where it is near the tolerance 10 −10 . Speaking of these invariants, we point out the role reversal of the total momentum and energy for the variational scheme and the invariantpreserving finite difference scheme: the total energy is a linear invariant for (28) and a quadratic invariant for (52), while the total momentum is a quadratic invariant for (28) and a linear invariant for (52).

Summary
We have applied the novel variational Lagrangian scheme (32) to several numerical examples. In general it performed well and displayed consistent convergence rates. In particular we saw rate 1 in both L 2 -and H 1 -norm for smooth reference solutions, while for the more irregular peakon reference solutions we observed rate 1 in L 2norm and rate 0.5 in H 1 -norm. Due to its rather simple discretization of the energy, it comes as no surprise that other higher-order methods outperform (32) for smooth reference solutions. However, it is for the more irregular examples involving wave breaking that this scheme stands apart, exhibiting consistent convergence even in H 1 -norm where other methods may struggle with oscillations.
When it comes to extensions of this work there are several possible paths, and we mention those most apparent. An obvious question is whether the scheme could be improved by choosing a more refined discrete energy for the variational derivation, and if there are choices other than the multipeakons which lead to an integrable discrete system. Another extension is to make the method fully discrete, in the sense that one introduces a tailored method to integrate in time, preferably one that respects the conserved quantities of the system. Finally, one could consider developing a specific redistribution algorithm which can handle the potential clustering of characteristics and prevent artificial numerical collisions when such a Lagrangian method is run over long time intervals.