A maximisation technique for solitary waves: the case of the nonlocally dispersive Whitham equation

Recently, two different proofs for large and intermediate-size solitary waves of the nonlocally dispersive Whitham equation have been presented, using either global bifurcation theory or the limit of waves of large period. We give here a different approach by maximising directly the dispersive part of the energy functional, while keeping the remaining nonlinear terms fixed with an Orlicz-space constraint. This method is to our knowledge new in the setting of water waves. The constructed solutions are bell-shaped in the sense that they are even, one-sided monotone, and attain their maximum at the origin. The method initially considers weaker solutions than in earlier works, and is not limited to small waves: a family of solutions is obtained, along which the dispersive energy is continuous and increasing. In general, our construction admits more than one solution for each energy level, and waves with the same energy level may have different heights. Although a transformation in the construction hinders us from concluding the family with an extreme wave, we give a quantitative proof that the set reaches 'large' or 'intermediate-sized' waves.


INTRODUCTION
We propose in this paper a new and somewhat different variational method towards solitary waves in dispersive equations. Solitary waves in nonlinear dispersive settings go back to the discoveries of John Scott Russell, Boussinesq, Lord Rayleigh and Korteweg-de Vries in the 1800s, but a rigorous theory first emerged with works such as Lavrentieff's [21], Friedrichs-Hyers' [15] and Ter-Krikorov's [33] in the mid 20th century. Early on, limiting procedures for long but small waves, and perturbative expansions around flowing parallel streams were used. The small-amplitude theory was later extended by Beale using the Nash-Moser implicit function theorem [4], with which he established smooth parameter dependence for all waves of small amplitude. Amick and Toland similarly extended the existence results from small to large amplitudes, yielding for the water-wave equations a full family of waves with increasing maximal slope all the way up to the wave of greatest height [1,2]. Although variational formulations of the water wave problem existed much earlier, Turner seems to have been the first to develop a variational existence theory, simultaneously for periodic and solitary waves [36], based on joint work with Bona and Bose [7]. The latter two authors also gave a large-amplitude functional-analytic theory for solitary waves in nonlinear equations together with Benjamin [5]. At this time, Lions introduced the method of concentration-compactness [23], and Weinstein made use of it in his constructions of shallow-water solitary waves using constrained minimisation [37]. Trivial solutions of water-wave equations are generally global minimisers, so critical points are found as constrained minimisers or saddle points; a common construction has been $L^2$ or otherwise quadratically constrained minimisation in Sobolev spaces, see for example Buffoni [9], Groves-Wahlén [18] and Buffoni-Groves-Sun-Wahlén [10] on waves with surface tension, among other works.
For nonlinear equations with negative-order dispersion, such as the Whitham equation (1.1), with a convolution kernel $K$ given by the Fourier symbol $m(\xi) = \sqrt{\tanh(\xi)/\xi}$ (1.2), solitary waves were first obtained in [11] by Ehrnström-Groves-Wahlén. That proof uses a combination of $L^2$-constrained variation, minimisation sequences built on periodic minimisers, and Lions' concentration-compactness technique.
The symbol (1.2) describes the linear dispersion of gravity surface waves in water of finite depth, and as made precise by Emerald [14], the equation can be shown to be in this sense slightly superior to KdV and other models of the same order in the small-amplitude long-wave regime. Different and modified perturbative proofs for small-amplitude solitary waves in (1.1) have since been suggested, for example by Stefanov-Wright [32], who used an implicit-function type theorem, and Hildrum [19], who also included more irregular nonlinearities. But it was only with Truong-Wahlén-Wheeler [35] and Ehrnström-Nik-Walker [12] that large solitary waves for the Whitham equation were found. The regularity theory for highest waves, including solitary ones, had been established by Ehrnström-Wahlén in [13], and so had symmetry and decay by Bruell-Ehrnström-Pei in [8], but there were no proofs of large-amplitude existence in the solitary case.
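As a small numerical illustration of the long-wave comparison with KdV, the following sketch evaluates the dispersion symbol next to its second-order expansion around zero frequency. It assumes the standard Whitham symbol m(ξ) = sqrt(tanh(ξ)/ξ) for (1.2); the function names `whitham_symbol` and `kdv_symbol` are ours and purely illustrative.

```python
import math

def whitham_symbol(xi):
    """Assumed form of the symbol (1.2): m(xi) = sqrt(tanh(xi) / xi), with m(0) = 1."""
    if xi == 0.0:
        return 1.0
    return math.sqrt(math.tanh(xi) / xi)

def kdv_symbol(xi):
    """Second-order long-wave expansion of m around xi = 0 (KdV-type dispersion)."""
    return 1.0 - xi * xi / 6.0

# Tabulate both symbols: they agree for small xi and separate for large xi.
for xi in (0.1, 0.5, 1.0, 5.0):
    print(xi, whitham_symbol(xi), kdv_symbol(xi))
```

For large |ξ| the truncated expansion turns negative, while m stays positive and decays like |ξ|^{-1/2}; this decay is the negative-order dispersion referred to above.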
The proof presented in the paper at hand should be viewed as an alternative construction, for the Whitham equation, and for dispersive model equations in general. It is inspired by a method by Stefanov-Kevrekidis on solitary waves in Hertzian and monomer chains from [30,31], which we have adapted to the weakly dispersive case. Among the things that make the dispersive theory different are that the linear operator and equation are of opposite character from [30,31], and that the theory for 'large' solutions is unique to the dispersive case.
Returning to our problem, we shall be interested in travelling wave solutions $\eta(t, x) = \varphi(x - \mu t)$ with $\lim_{|z| \to \infty} \varphi(z) = 0$. This yields the nonlinear nonlocal equation (1.3). A function $\varphi$ shall be called a steady Whitham solution if it satisfies (1.3) almost everywhere. If $\varphi$ is furthermore in $L^p(\mathbb{R})$ for some $p \in [1, \infty)$, we shall call it a solitary wave. The idea behind our approach is the following: if the energy functional for a nonlinear equation is built from the terms $\varphi^2$, $\varphi L \varphi$ and $N(\varphi)$, with $L$ a linear dispersive operator and $N(\varphi)$ a primitive of the nonlinearity in the original equation, then critical points in a subspace of $L^2$ may be found through more than one form of constraint. Most commonly, the term $\varphi^2$ is fixed while minimising the remaining part of the functional; this yields an energetic type of stability, as described by Mielke in the work [24]. But the nonlinear term $N(\varphi)$, too, could be taken as a constraint, see for example Arnesen [3] and Zeng [38]. We for our part shall keep the combination $\varphi^2 + N(\varphi)$ fixed, while maximising the dispersive part $\varphi L \varphi$. This is made possible by working in an Orlicz space, which essentially equates the nonlinear function $\Psi(\varphi) = \varphi^2 + N(\varphi)$ with the function $|\varphi|^p$ in $L^p$-theory. For this to work, however, the above $\Psi$ will be cut off and extended at a point corresponding to the appearance of highest waves, which also coincides with the point where $\Psi$ ceases to be convex. We suggest this method as a possible alternative to other existence methods for solitary dispersive waves. Its distinct features are: it is based on $L^p$-theory instead of Sobolev theory; it immediately provides bell-shapedness of constrained maximisers (they are even, positive and one-sided monotone); and it is not restricted to solutions of small sizes. In our particular case, we have not been able to prove that we reach the highest wave via this method, but this seems to be a difficulty rather than a constraint of the method; and we establish that we reach at least medium-sized waves.
The paper starts in Section 2 with a walk-through of properties of the convolution kernel $K$, the functional, and Orlicz spaces. Some of these results are new, others are given alternative proofs. We introduce the symmetric rearrangement of a function, the Riesz convolution-rearrangement inequality, and the necessary prerequisites to prove that a maximiser of our functional solves the correct Euler-Lagrange equation.
Section 3 introduces the functional with the constraint that $\int (\alpha f^2 - \tfrac{1}{3} f^3)\,dx = 1$ for $0 \le f \le \alpha$ (and something else when $f > \alpha$). For each fixed $\alpha > 0$ we look for solutions in $L^2 \cap L^3$. Large values of $\alpha$ will correspond to small solutions in the original problem, and very small values of $\alpha$ will turn out not to correspond to physical solutions at all. By setting up a sequence of problems for $\operatorname{supp} f \subset [-2^l, 2^l]$, we show that the maximum $J_l$ is approximately $\alpha^{-1/2}$, with some sharper estimates. For each $l$ large enough, and $\alpha > \alpha_0 > 0$, we find a bell-shaped maximiser $f_l$ that satisfies the Euler-Lagrange equation and fulfils $f_l(0) < \alpha$ with a non-degeneracy condition. This is Lemma 3.5. The hardest part of the paper is Proposition 3.6, a detailed rearrangement proof to show that maximisers are bounded from above by a value close to $\alpha$, a result that later gives us some control on the size of solutions. The proof is technical, though elementary in technique, and is based on moving parts of the mass of a possible maximiser with too large supremum to obtain a contradiction. The proof contains a Slobodeckij-type difference characterisation (3.28) of the quadratic form $\langle f, K * f \rangle$, itself equivalent to a squared $H^{-1/4}(\mathbb{R})$-norm, that makes it possible to relate 'flatness' of $f$ to the functional $J(f)$. Section 3 additionally covers the limit $l \to \infty$, which is achieved through weak convergence arguments and properties of the compactly supported maximisers. As in many other investigations considering long waves converging towards solitary solutions, the limit as $l \to \infty$ need not be unique.
Section 4 is about the dependence of maximisers upon the parameter $\alpha$. We prove that $\alpha \to \infty$ corresponds to the small-amplitude limit of the maximisers, while $\alpha \to 0$ gives maximisers that are outside the valid regime of the Euler-Lagrange equation. An estimate for the largest obtained waves is given. At the threshold value $\alpha = \alpha_0$, there is a solution satisfying $f(0) \ge \alpha_0$, and another satisfying $g(0) \le \alpha_0$, but because of lack of uniqueness we cannot exclude that these are different, and so cannot reach the desired conclusion that in fact $f(0) = g(0) = \alpha_0$. That would correspond to a highest wave.
Finally, we go back to the original equation through a somewhat implicit transformation. The main result and estimates can be found in Theorem 5.2. We find an injective curve of bell-shaped solutions parameterised by $\alpha \in [\alpha_0, \infty)$, with the function $\alpha \mapsto \alpha J_\alpha^2 \in (1, \tfrac{3}{2})$ strictly decreasing and continuous with unit limit as $\alpha \to \infty$. We give some $L^p$-estimates of these waves, and show that the small waves converge to the expected bifurcation point for solitary waves.

PRELIMINARIES
Throughout this paper, $\lesssim$, $\gtrsim$ and $\simeq$ shall indicate (in)equalities that hold up to uniform positive factors. When the factors involved depend on some additional parameter or function, this will be indicated with subscripts such as $\lesssim_\mu$. We shall call an element in $L^p(\mathbb{R})$ bell-shaped if it lies in the closure of the set of even, continuous and positive functions which are decreasing on the positive half-axis. This is equivalent to requiring the same properties almost everywhere for a general, not necessarily continuous, function in $L^p(\mathbb{R})$. We furthermore define the Fourier transform $\mathcal{F}$ on the Schwartz space $S(\mathbb{R})$ of rapidly decaying smooth functions, extended by duality to $S'(\mathbb{R})$, the space of tempered distributions on $\mathbb{R}$. We shall write $\hat f$ interchangeably with $\mathcal{F}(f)$. The inverse $\mathcal{F}^{-1}$ is taken with the matching normalisation.
2.1. The family of kernels $\{K_\alpha\}_\alpha$. The kernel $K$ in (1.1) arises from the linear dispersion relation in the free-boundary Euler equations [20], where its symbol $m(\xi)$ describes the dependence of wave speed of a travelling wave-train upon its frequency $\xi$. Since $m$ is real and even, the operator $L = K *$ has a well-defined square root given by $\mathcal{F}(\sqrt{L} f) = \sqrt{m(\cdot)}\, \mathcal{F} f$. More generally, every symbol $m^{2\alpha}(\xi) = (\tanh(\xi)/\xi)^{\alpha}$, $\alpha \in (0, 1)$, gives rise to a corresponding operator $L_\alpha = K_\alpha *$, for which one has the following result.
(i) $K_\alpha$ is smooth outside the origin, and rapidly decaying: for each fixed $\alpha \in (0, 1)$ and $N \ge 1$, $K_\alpha(x)$ is bounded by a constant multiple of $|x|^{-N}$ for large $|x|$.
(ii) $K_\alpha$ is bell-shaped and strictly convex.
(iii) K α has unit operator norm:

Remark 2.2. There are several approaches towards the properties of $K_\alpha$. We choose here to give an elementary and direct proof. A more subtle method yielding somewhat more information about the kernels can be found in [13].
Proof.(i) To prove the decay and smoothness of K α , we write where 2 ), and satisfies ̺(ξ) = 1 for |ξ| < is smooth (in fact, real analytic) and strictly positive, we have that K 2 α can be dealt with using integration by parts.Taking the form of the multiplier m into account, one readily sees that where m l ,α is C ∞ function with |m l ,α (ξ)| l ,α |ξ| −l −α , l being an arbitrary positive integer.This proves the super-polynomial decay rate for K α .Also, differentiating under the integral sign in (2.3) yields that K 2 α is of class C l −1 outside the origin.Since l is arbitrary, we conclude that K α is in fact smooth on the same domain.
(ii) Note that the integral obtained by differentiating under the integral sign as in is not well-defined.Thus, even a formal argument is hard to invoke for the onesided monotonicity of K .To circumvent this difficulty, we consider instead of . This is an element of S ′ , and we show that its Fourier transform is positive definite.Hence, x D x K α (x) < 0 for all x = 0 and all α ∈ (0, 1].We carry out the calculation first for the Whitham kernel (the case α = 1 2 ).Relying on Bochner's Theorem [6], we want to prove that is absolutely continuous; then x D x K (x) = − F −1 (D ξ (ξm(ξ))) will be negative.Notice that D ξ (ξm(ξ)) is even in ξ, so that it is enough to consider non-negative ξ in the calculations to come.Now, which is positive definite in view of that [ξ → 1], [ξ → ξ/ sinh(ξ)] and m all are (for the latter facts, see [6]).For a general α ∈ (0, 1), a calculation like (2.4) shows that which is positive definite for 0 ≤ α ≤ 1, in view of that ξ/ sinh(ξ) and 1 − α are, and that m is infinitely divisible, meaning that any positive power of it is positive definite.
To prove the convexity of K α , one can use theory for Stieltjes and completely monotone functions [13].In the case α ≤ 1 2 , it is however possible with a more straightforward approach.As above, one calculates where ν is an arbitrary real number.The Fourier transforms of all factors appearing in the outmost parenthesis are explicitly known [25,34], and one can find a value ν such that the entire expression is positive definite. 1Since m(ξ) is infinitely divisible, the convexity of the kernel away from the origin is therefore guaranteed (in the case α ≤ 1/2, one can see directly from the explicit Fourier transforms that the convexity is strict).
(iii) The proof of this is directly given in the statement.
(iv) We deduct the homogeneous part of Kα .Write which is valid since Kα is even.Then K α (x) , and for no greater p.The second part is the Fourier transform of an L 1 -function with exponential decay, and hence smooth by a version of Schwartz's (Paley-Wiener) theorem [29].Since, by (i), K α has rapid decay, K α ∈ L p (R) exactly when p < 1/(1 − α).
As above, we shall use the convention that $K = K_{1/2}$. Recall that $L_\alpha$ is the operator with symbol $\hat K_\alpha$, defined for $\alpha \in (0, 1)$.
Lemma 2.3.The operator L α is an isomorphism L 2 (R) → H α (R), with its inverse given by the symbol K−α .For fixed α and and for fixed q > 1/α, guarantees that both the parentheses () 1 and () 2 are positive definite.

Proof. To see that L
) α is an isomorphism on L 2 , since it is a positive and bounded function which is also bounded away from the origin.
The $L^q$-$L^p$-estimate is almost a direct consequence of Young's inequality, which states that $\|f * g\|_{L^p} \le \|f\|_{L^q} \|g\|_{L^r}$ when $1 + \frac{1}{p} = \frac{1}{q} + \frac{1}{r}$ and $1 \le q, p, r \le \infty$. In view of that $K_\alpha \in L^r$ whenever $1/r \in (1 - \alpha, 1]$, the statement follows for $\frac{1}{q} - \frac{1}{p} < \alpha$. The same argument for $p = \infty$ gives the continuity $L^q \to L^\infty$ of $L_\alpha$ whenever $q > 1/\alpha$. For the case $\frac{1}{q} - \frac{1}{p} = \alpha$, one must use the generalisation of Young's inequality to the weak $L^{r,\infty}$-space, namely $\|f * g\|_{L^p} \lesssim \|f\|_{L^q} \|g\|_{L^{r,\infty}}$, which is valid for the same relation between $p$, $q$ and $r$ (although not for $p = \infty$ or $q = 1$, see for example [16, Thm 1.4.24]). Since $|x|^{\alpha - 1} \in L^{1/(1-\alpha),\infty}$, it follows from the decomposition of $K_\alpha(x)$ into a constant multiple of $|x|^{\alpha - 1}$ plus a function $f \in C^\infty(\mathbb{R})$, deduced in the proof of Lemma 2.1 (iv), and the decay of $K_\alpha$, that also $K_\alpha \in L^{r,\infty}$ for $r = 1/(1-\alpha)$. This proves the continuity $L^q \to L^p$ of $L_\alpha$ for $\frac{1}{q} - \frac{1}{p} = \alpha$.
Lemma 2.4. $L_\alpha$ preserves bell-shapedness, and the maximum of $L_\alpha f$ for any nonconstant bell-shaped function $f$ is attained only at the origin.
Proof. Recall that $L_\alpha$ acts by convolution with $K_\alpha$, which according to Lemma 2.1 is itself bell-shaped. The convolution of two even and positive functions is clearly even and positive, so it remains to show that $K_\alpha * f$ is decreasing on the positive half-axis if $f \in L^p(\mathbb{R})$ is bell-shaped, where $p$ is some number in $[1, \infty]$. Consider the difference $K_\alpha * f(x + h) - K_\alpha * f(x)$, where the evenness of $K_\alpha$ and $f$ is used to rewrite the integral. For $x > 0$ and $0 < h \ll 1$ the factors in the integrand have differing signs, whence the first assertion follows.
To see that the maximum of $L_\alpha f$ is attained only at one point, fix $x_0 > 0$ and consider $g_{x_0}(y) := f(x_0 + y)$. By the rearrangement inequality in [22, Thm 3.4], one has $K * f(x_0) = \int K(y)\, g_{x_0}(y)\,dy \le \int K(y)\, g_{x_0}^*(y)\,dy = \int K(y) f(y)\,dy = K * f(0)$, where we have used that the decreasing rearrangement of a translate of a bell-shaped function coincides with the function itself: $g_{x_0}^* = f$. Thus the function $x \mapsto K * f(x)$ achieves its maximum at $x = 0$. Moreover, since $K$ is strictly symmetric decreasing, equality is possible (again, according to [22, Thm 3.4]) only when $g_{x_0} = g_{x_0}^*$. But that would imply $f(x_0 + y) = f(y)$ for all $y$, meaning that $f$ is everywhere constant.
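The statement of Lemma 2.4 can be explored numerically on a grid. The following is a minimal sketch, not part of the proof: the kernel exp(-|x|) is a stand-in chosen only because it is even, positive and strictly decreasing in |x| (it is not the Whitham kernel), and the helper names `sample`, `convolve` and `is_bell_shaped` are hypothetical.

```python
import math

def sample(func, half_width, n):
    """Sample func at the 2n+1 symmetric grid points x_j = j*h, j = -n..n."""
    h = half_width / n
    return [func(j * h) for j in range(-n, n + 1)], h

def convolve(a, b, h):
    """Full discrete convolution of two samples, scaled by h (a Riemann sum)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j * h
    return out

def is_bell_shaped(values, tol=1e-12):
    """Even, non-negative and non-increasing away from the centre (up to tol)."""
    c = len(values) // 2
    even = all(abs(values[c + j] - values[c - j]) <= tol for j in range(c + 1))
    decreasing = all(values[c + j + 1] <= values[c + j] + tol for j in range(c))
    return even and decreasing and min(values) >= -tol

# Stand-in kernel (NOT the Whitham kernel): even, positive, strictly decreasing in |x|.
kernel, h = sample(lambda x: math.exp(-abs(x)), 5.0, 100)
# A bell-shaped, non-constant test function with compact support.
bump, _ = sample(lambda x: max(0.0, 1.0 - abs(x)), 5.0, 100)

smoothed = convolve(kernel, bump, h)
centre = len(smoothed) // 2
print(is_bell_shaped(smoothed), smoothed[centre] > smoothed[centre + 1])
```

On the grid, the convolution of two symmetric unimodal samples is again symmetric and unimodal, and the strict drop from the centre to its neighbour mirrors the claim that the maximum is attained only at the origin.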

2.2. Orlicz spaces.
The following subsection is a short introduction to Orlicz spaces. The reader will find here all the basic results used in this paper.
2.2.1.Distribution functions and convolution rearrangement inequalities.For a measurable function f : R → R, let be its distribution function.For every ϕ ∈ C 1 (R) with ϕ(0) = 0, one has (cf.[16,Eq.(1.1.7)])the layer cake representation formula (2.6) The non-increasing rearrangement, f * , of the function f is the inverse function of d f , provided that α → d f (α) is strictly decreasing.In general, we define f * : Then f * is a non-increasing function, and one furthermore has that Then f # provides us with a convenient way of characterizing bell-shapedness, namely, f is bell-shaped if and only if f # = f .The function f # is equidistributed with f , in the sense that (2.7) Indeed, for t > L, we have A result that will be essential to our investigation is the Riesz convolutionrearrangement inequality.This has later been generalized in several ways, see, e.g., [26,28]).Lemma 2.5 (Riesz).Let f , g and h be measurable real functions.Then (2.8) Remark 2.6.The integrals in (2.8) are interpreted in the extended sense that they may take infinite values.In particular, the integral on the left-hand is finite whenever the right-hand side is finite.
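For step functions, the symmetric rearrangement and the Riesz inequality of Lemma 2.5 can be checked exactly. The sketch below is illustrative only: it takes g to be the stand-in kernel exp(-|x|) (even and strictly decreasing, but not the Whitham kernel) and h = f, and all function names are hypothetical. A step function on N cells of width h has a rearrangement that is a step function on 2N cells of width h/2, and the double integral is computed from closed-form cell weights rather than by quadrature.

```python
import math

def symmetric_rearrangement(values):
    """Exact symmetric decreasing rearrangement of a step function.

    `values` are heights on N cells of width h. The result lists heights on
    2N cells of width h/2 centred at the origin, largest heights innermost.
    """
    ordered = sorted((abs(v) for v in values), reverse=True)
    return ordered[::-1] + ordered

def refine(values):
    """The same step function written on cells of half the width."""
    return [v for v in values for _ in range(2)]

def cell_weight(d, width):
    """Integral of exp(-|x - y|) over two cells whose left endpoints are d*width apart."""
    d = abs(d)
    if d == 0:
        return 2.0 * (width - 1.0 + math.exp(-width))
    return math.exp(-d * width) * (1.0 - math.exp(-width)) * (math.exp(width) - 1.0)

def quadratic_form(values, width):
    """Double integral of f(x) exp(-|x - y|) f(y) for a step function f."""
    n = len(values)
    return sum(values[i] * values[j] * cell_weight(i - j, width)
               for i in range(n) for j in range(n))

def integral(values, width, power):
    """Integral of |f|^power for a step function f."""
    return width * sum(abs(v) ** power for v in values)

h = 0.5
f = [0.0, 1.0, 3.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.5]
f_fine = refine(f)                    # f on cells of width h/2
f_sharp = symmetric_rearrangement(f)  # its rearrangement on cells of width h/2

print(quadratic_form(f_fine, h / 2), quadratic_form(f_sharp, h / 2))
```

The rearranged function has the same integrals of powers as f (equidistribution), while the quadratic form increases, since f is not a translate of its rearrangement.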

2.2.2. Young's functions. We mainly follow the exposition in the book of Rao and Ren [27], to which we refer the reader for further details.
We shall say that a function $\Psi \colon \mathbb{R} \to \mathbb{R}_+$ is a Young function if it is even, convex, and satisfies $\Psi(0) = 0$, $\lim_{x \to \infty} \Psi(x) = \infty$. In the literature, a continuous Young function that satisfies $\lim_{x \to 0} \Psi(x)/x = 0$ and $\lim_{x \to \infty} \Psi(x)/x = \infty$ (2.9) is sometimes called a nice Young function (or an $N$-function). We shall, however, only work with functions satisfying (2.9), and will therefore make these requirements also for Young functions.
2.2.3. Orlicz spaces. Orlicz spaces are function spaces built on a Young function $\Psi$. More specifically, for a Borel measure $\mu$ on $\mathbb{R}$, consider the set $L^\Psi$ of measurable functions $f$ for which $\int \Psi(\alpha f)\,d\mu < \infty$ for some $\alpha > 0$ (2.10). Identifying $\alpha$ with $1/\lambda$, one may define a norm via the so-called Minkowski gauge functional, $N_\Psi(f) = \inf\{\lambda > 0 : \int \Psi(f/\lambda)\,d\mu \le 1\}$ (2.11). With this definition the pair $(L^\Psi, N_\Psi)$ becomes a Banach space, called the Orlicz space for $\Psi$. The norm $N_\Psi$ is usually referred to as the gauge norm. A Young function $\Psi$ is said to satisfy a global $\Delta_2$-condition ($\Psi \in \Delta_2$, for short) if, for some constant $C > 0$, one has $\Psi(2x) \le C \Psi(x)$ (2.12) for all $x \ge 0$. In the case when $\Psi$ is strictly increasing, continuous and satisfies a $\Delta_2$-condition, one may alternatively take the identity $\int \Psi(f/N_\Psi(f))\,d\mu = 1$ (2.13) as a definition of the gauge norm $N_\Psi$. The following result is adapted from [27, p. 280] and shows that under additional regularity assumptions on the Young function $\Psi$, the gauge norm has directional derivatives.
Lemma 2.7.[27] Let Ψ be a differentiable and strictly convex Young function with a strictly positive derivative on the positive half-axis.
The next lemma relates the gauge norm to the distribution function.We shall make use of this in our construction of bell-shaped solutions.

Lemma 2.8. Let Ψ be a C 1 -Young function and let f
In particular, f * and f # both belong to L Ψ .Proof.By rescaling, we can assume that (2.6) and the fact that
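The gauge norm of Section 2.2.3 is easy to approximate numerically, which may help build intuition for the constraint used later. The sketch below assumes the standard Minkowski gauge definition, N_Ψ(f) = inf{λ > 0 : ∫ Ψ(f/λ) ≤ 1}, applied to step functions with respect to Lebesgue measure. The names `modular` and `gauge_norm` are hypothetical, and the Young function used is the power |y|^p rather than the paper's Ψ, whose extension beyond α is not reproduced here.

```python
import math

def modular(values, h, young, lam):
    """Integral of young(f / lam) for a step function f with cell width h."""
    return h * sum(young(v / lam) for v in values)

def gauge_norm(values, h, young, rel_tol=1e-12):
    """Smallest lam > 0 with modular(f / lam) <= 1, located by bisection."""
    if all(v == 0 for v in values):
        return 0.0
    hi = 1.0
    while modular(values, h, young, hi) > 1.0:   # grow until admissible
        hi *= 2.0
    lo = hi / 2.0
    while modular(values, h, young, lo) <= 1.0:  # shrink until inadmissible
        hi, lo = lo, lo / 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(values, h, young, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

f = [1.0, 2.0, 2.0]
norm_l2 = gauge_norm(f, 0.5, lambda y: abs(y) ** 2)
print(norm_l2, math.sqrt(4.5))
```

For Ψ(y) = |y|^p the gauge norm reduces to the usual L^p-norm, which is the sense in which Ψ plays the role of |·|^p; for a continuous, strictly increasing Ψ the modular evaluated at the gauge norm equals one.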

THE VARIATIONAL PROBLEM
In order to construct a solution to (1.3), one naturally considers a constrained optimization problem connected to it.Formally, if f is a maximizer of under the constraint that one obtains a solitary solution of the steady Whitham equation (1.3) by a rescaling argument (the wave speed arises from a Lagrange multiplier principle).The above problem, however, is not well-posed, since there is no finite supremum of (3.1) over functions fulfilling (3.2).One way to see this is by taking a dilated and scaled characteristic function χ N supported on the interval [−N , N ] that satisfies , where φ is supported in (0, 1), and fulfilling I (φ N ) = I (φ) = 1, the sum fulfils the constraint The quadratic energy J (χ N +φ N ) on the other hand scales like N , as the dilation in χ N contributes N to the functional, but the translation in φ N results only in a phase-shift.Therefore One way to remedy this is to consider functions that additionally satisfies As we shall show, the resulting problem is solvable and yields a fairly rich family of solutions of the Whitham equation.Technically, though, (3.2) poses challenges as it does not describe a function space.One way of dealing with this problem is to work in a ball in a regular Sobolev space H s (R), for which both (3.2) and (3.3) may be fulfilled.Variants of this approach have been used in [11,17] and other investigations, and yields small-amplitude solutions of the original problem.
3.1. The Orlicz space of constraints. Our approach does not per se rely on smallness. As is often done, we deal with the loss of compactness on $\mathbb{R}$ by considering a sequence of problems on increasing intervals. To handle (3.2) and (3.3), we enlarge the set of admissible functions by considering the Orlicz function $\Psi$ in (3.4), for fixed values of $\alpha > 0$. By varying $\alpha > 0$, we obtain a non-trivial family of waves, and we show that the relevant waves exist in an interval of the form $\alpha \in [\alpha_0, \infty)$, with $\alpha_0$ to be appropriately defined later. Generally speaking, small waves of the Whitham equation (1.3) correspond to maximizers with large $\alpha$ and vice versa.
in the sense that The estimates f L 2 (R) Proof.The function Ψ is of class C 2 by construction, with Ψ ′ ( f ) > 0 for f > 0 and Ψ ′′ ( f ) > 0 for all f = ±α.Hence, Ψ is strictly increasing and convex in the sense of a Young function.To see that it is indeed Young, note that (2.9) trivially holds.Similarly, the ∆ 2 -condition (2.12) can be easily checked.Then (2.10) defines an Orlicz space with norm given, equivalently, by (2.11) and (2.13).
To prove that this Orlicz space is, for given α > 0, isomorphic to the left-hand side is uniformly bounded from below by 1 12 f 3 .Combining these estimates we obtain Since for such f one has (N Ψ ( f )) p = 1 for all p, a rescaling argument yields (3.5).
Relative to $\Psi$, we now consider the problem of maximizing $J(f)$ under the constraint that $N_\Psi(f) = 1$ (3.6) for a given positive value of $\alpha$. Note that by scaling, the maximizers of (3.1), if any, under the constraint (3.6) are the same if one enlarges the constraint to include the whole closed ball $\{N_\Psi(f) \le 1\}$ in $L^\Psi$. To handle compactness, we consider first local versions of this maximization problem.

3.2. A family of local problems.
For the purpose of obtaining a convergent subsequence, we consider functions f supported on an interval [−2 l , 2 l ], l ∈ N.More precisely, in this section we find functions f = f l that realize the maximum of J ( f ) under the constraint that N Ψ ( f ) = 1, and that additionally satisfy The value of the parameter α > 0 will for now be held constant.Define so that J = max J under the constraint N Ψ ( f ) = 1 when a maximizer exists, and similarly Both J and J l depend on α.

Lemma 3.2. For any fixed value of α, one has
where the latter bound is uniform in α.
Proof.We already know that J l ≤ J .From Hausdorff-Young's inequality and Lemma 3.1, Also, it is clear from the definition that {J l } l is an increasing sequence.We will now show that (3.9) holds.Indeed, let ε > 0. Then there exists Thus (3.9) holds.
Our next lemma establishes a lower bound on J , consistent with the estimate J α −1/2 from Lemma 3.2.Most of the time, we shall just need a simple corollary of the below result, namely that J ≥ α −1/2 .Lemma 3.3.There exists an absolute and positive constant c 0 such that for all α > 0 and all l | log(α)|.In particular, J − 1 α is positive for any fixed α.
Proof.The exists a positive constant c > 0 such that ( tanh(ξ) ξ for any function f ≤ α satisfying the constraint N Ψ ( f ) = 1.To eliminate the αdependence in Ψ( f ) dx = 1, introduce a smooth test function q with supp(q) ⊂ [−1, 1], 0 ≤ q ≤ 1, and consider Then N Ψ ( f ) = 1 if and only if q 2 (x) − 1 3 q 3 (x) dx = 1, and it is evident that we can find q satisfying this additional assumption.Replacing f in (3.11) by f = αq(α 3 •), we find that provided that α 3 2 l ≥ 1, to satisfy the constraint that supp( f ) ⊂ [−2 l , 2 l ].Thus, there exists a positive constant c 0 such that For the general case of α ∈ (0, ∞), introduce the additional scaling where µ ≪ 1 is small parameter, and η is to satisfy the constraint On the other hand, (3.11) now becomes with η 1.Any small enough choice of µ 1 1+α 2 yields whenever l | log(α)|; thus the estimate includes also the case when α is small.This lower bound on J l is uniform in α because of how f ≤ α was constructed.Finally, as J ≥ J l by definition, the same bound is valid for J , independently of l .Lemma 3.4.For each l ≥ 0 there exists a bell-shaped maximizer f l of the local maximization problem fulfilling (3.8).
Proof.We start first with an argument that shows that any function f is at best no better than its rearrangement f # , as far as the local constrained maximization problem is concerned.
, and in view of Riesz's convolution-rearrangement inequality (2.8), we have that for all Schwartz functions g , Taking the supremum in (3.13) over all functions g such that g L 2 = 1, we obtain in view of that for such g one also has g # L 2 = 1.The supremum in (3.8) may thus be considered with respect to only bell-shaped functions.The eventual maximizer will then be bell-shaped as well.
We now prove that there is a maximizer of (3.8) in the subspace of bell-shaped functions.To that aim, let { f n } n be a maximizing sequence, that is, a sequence of bell-shaped functions with as n → ∞.By weak compactness, and up to passing to a subsequence, we may assume that Note that f 0 is bell-shaped (which can be seen by testing against characteristic functions) and, by the lower semicontinuity of the norm, . By weak compactness, it again follows that there is a function g 0 ∈ H 1/4 (R) such that g n g 0 .By uniqueness of weak limits, g 0 = K 1 4 * f 0 .In addition, from the integral representation of g n and the decay of K 1 4 proved in Lemma 2.1, for all |x| > 2 l +1 one has where N ≥ 1 is arbitrary.It follows that {g n } n is a compact sequence in L 2 (R) and hence has a convergent subsequence {g n m } m , that, by uniqueness of limits, converges to g 0 .In effect, By (3.14) we have N Ψ ( f 0 ) ≤ 1; a strict inequality would contradict the definition of J l and we conclude that N Ψ ( f 0 ) = 1, and f 0 is the bell-shaped maximizer sought for.
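The scaling f = αq(α³·) used in the proof of Lemma 3.3 removes α from the constraint integral, and this can be confirmed numerically. The following is a minimal sketch under stated assumptions: it uses Ψ(y) = αy² − y³/3 only on the range 0 ≤ y ≤ α described in the introduction, the profile `q` is a hypothetical choice with support in [−1, 1] and values in [0, 1] (it is not normalised to satisfy the constraint), and the integrals are approximated by the midpoint rule.

```python
def psi(y, alpha):
    """Psi(y) = alpha*y^2 - y^3/3, used here only on the range 0 <= y <= alpha."""
    return alpha * y * y - y ** 3 / 3.0

def q(x):
    """Hypothetical profile with support [-1, 1] and values in [0, 1]."""
    return max(0.0, 1.0 - x * x)

def midpoint_integral(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def constraint_integral(alpha):
    """Integral of Psi(f) for the scaled function f(x) = alpha * q(alpha**3 * x)."""
    half_width = alpha ** -3
    return midpoint_integral(lambda x: psi(alpha * q(alpha ** 3 * x), alpha),
                             -half_width, half_width)

# The alpha-free integral of q^2 - q^3/3 over [-1, 1].
reference = midpoint_integral(lambda y: q(y) ** 2 - q(y) ** 3 / 3.0, -1.0, 1.0)
print(reference, [constraint_integral(a) for a in (0.5, 1.0, 3.0)])
```

The printed values agree for every α, which is the change of variables y = α³x in numerical form; normalising q so that the reference integral equals one then yields N_Ψ(f) = 1 for all α simultaneously.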
Now that we have established the existence of maximizers 3 of the local problem (3.8), let us proceed to derive the corresponding Euler-Lagrange equation.Lemma 3.5.Every maximizer f l realizing (3.8) satisfies the Euler-Lagrange equation In addition, f l is non-degenerate in L 2 and L 3 ; there is a constant c 0 > 0 such that f l 3 (3.16) Also, the estimate 0 < f l (x) α, −2 l < x < 2 l , (3.17) holds uniformly for α > 0, and there exists α 0 such that f l (0) < α, for all α > α 0 .These estimates hold uniformly for l | log(α)|.
Proof.It is straightforward to show that for any non-negative function f , we have f Ψ ′ ( f ) ≃ Ψ( f ) (see also Lemma 3.7 below) and hence uniformly in α > 0 and l | log(α)|, whence the denominator in (3.15) is bounded away from zero.By Lemma 2.7, the Gateaux derivative of the function N Ψ (•) when N Ψ ( f l ) = 1 may be determined as Since f l is a constrained maximizer, we have 3 Recall that we have no proof of uniqueness of these.
for every In view of that h is free to vary only on the interval [−2 l , 2 l ], we have established (3.15).Next we prove the non-degeneracy in L 2 and L 3 .As K 1 4 L 1 = 1, we have that where the last inequality follows from Lemma 3.3.For the L 3 -bound, note that with equality only when f ≤ α.As Ψ( f l ) dx = 1, it follows that Now we prove (3.17).Note first that since f l is bell-shaped, and K is everywhere strictly positive, the left-hand side of (3.15) cannot vanish unless f l is identically zero, which is clearly not the case for a maximizer.Hence, the righthand side is also non-vanishing, and therefore f l (x) > 0 for x in the open interval (−2 l , 2 l ); outside this interval, the Euler-Lagrange equation (3.15) does not hold.Recall that 〈 f l , Ψ ′ ( f l )〉 1 (cf.Lemma 3.7) and J l > α −1/2 for l | log(α)| according to Lemma 3.3.So wherever f l > α, we have from (3.4) and (3.15) that uniformly in α > 0, where we have used that |K * f l | ≤ K L 1 f l ∞ .This inequality gives the uniform upper bound on f l ∞ in (3.17).Lastly we prove that f l (0) < α for all α sufficiently large.Assuming that f α (0) > α, we have according to (3.18) evaluated at x = 0, Lemma 2.3 and Lemma 3.1 that Clearly, this is a contradiction for all large enough α, whence f l (0) < α for all such α and l .
In the previous lemma we established that $\|f_l\|_{L^\infty} \lesssim \alpha$ uniformly in $\alpha > 0$ and $l \gtrsim |\log(\alpha)|$. The following proposition provides more precise information about this upper bound, pushing maximisers below the value $B$. While the idea of the proposition is simple, the proof requires quite a rigorous balancing of terms and estimates. This is to our knowledge one of the first quantitative size estimates for a 'small-amplitude' theory (establishing that the waves are in fact not small at all), and it takes up a sizeable part of the paper. We shall use it in the following, but it is strictly speaking not needed for the construction of very small waves. The reader who is predominantly interested in the construction method may move forward to Lemma 3.7.
Proposition 3.6. For any $\alpha > 0$ and $\varepsilon > 0$, there is an $l_0$, depending only on $\varepsilon$, such that for all $l > l_0$, the maximizers $f_l$ satisfy the bound (3.19), where $B = \frac{4}{\sqrt{3}} \cos\left(\frac{5\pi}{18}\right) \alpha \approx 1.48\alpha$ is the unique solution in $\mathbb{R}_+$ to $2\Psi(y) = y\Psi'(y)$.
Proof.Let ε > 0. For a contradiction, we assume that for any l 0 , there exists maximizers f l , l > l 0 , such that The idea of the proof is that y 2 /Ψ(y) has its maximum at B so that B is the optimal height for maximizing the L 2 -norm for fixed N Ψ -norm, as shown below.
Hence we can construct a function fl with supp( fl ) ⊂ [−2 l , 2 l ], fl (0) = B and N Ψ ( fl ) = 1 that is "flatter" than f l and such that fl L 2 > f l L 2 .Generally, the difference between K 1 4 * f L 2 and f L 2 is smaller the "flatter" and less oscillating f is (see (3.28)).The restriction to [−2 l , 2 l ] causes some technical difficulties due to the jump-discontinuity at the endpoints, but for large enough l we can show that contradicting the assumption that f l is a maximizer.
Let now l > 0 be such that For y > 0, we define Ψ(y) .
We then have that Now, in the whole interval (−a − δ, a + δ), one has in view of (3.23); and in (−a, a), where has f l (x) > B = f (x), one has This means that y 2 is relatively smaller than Ψ(y) in the range where f l resides.More precisely, in view of (3.24) and the above inequalities for h.We shall quantify this inequality.Using (3.24) and the definition of f , one finds the following explicit relationship between f l and f : Hence, by the Taylor expansion by an explicit calculation.It follows from (3.20) and Jensen's inequality that uniformly for all l sufficiently large (note also that there is a uniform upper bound on a, imposed by N Ψ ( f l ) = 1).Using the same kind of Taylor expansion, we can also get the following row of equalities, that will be used later.Here, the first equality is a rewrite of (3.24), and the last of (3.25).
Next we want to show that (3.28) Hence, we only need to show that the double integral in the second line in (3.29) is smaller than that expression.By evenness of f , f l and K , this integral is equal to where we shall concentrate on the inner integral.Hence, let h ≥ 0. For |x| ≥ a+δ, the function f is just a δ-translation of f l , so Hence, the inner integral in (3.30) reduces to We study 0 ≤ h ≤ 2a and h ≥ 2(a + δ) separately.In the following, we will make ample use of the definition of f , as well as translations and (even) changes of variables in the integrals to reduce and compare the terms.
The case 0 ≤ h ≤ 2a.In the simplest case, when h ∈ [0, 2a], one has a+δ by the changes of variables x + h → x → −x.So the correction term E l (h) from (3.31) in the case when h ∈ [0, 2a] is given by 2 Both latter terms are negative, as f l (• + h) ≥ B on [−a − h, −a] when h ≤ 2a, and f l (•) ≤ B on the same interval when h ≥ 0. To quantify the negative contribution, we relate it to the norm difference (3.26) via the expression one gets after the change of variables x + h → x → −x, and subsequent addition and subtraction of Therefore, going back to (3.30), when h ∈ [0, 2a], the correction to the L 2 -terms in (3.29) can be bounded as by (3.26).Moreover, by (3.20) and (3.17), there is a lower bound on a that depends only on ε and α; in particular it is uniform in l .
The case h ≥ 2a. We divide the integral a+δ (I) Note that a + δ − h = −a − δ exactly when h = 2(a + δ), so the behaviour of the middle integrals in the right-hand side will depend on whether 2a ≤ h ≤ 2(a + δ) or h ≥ 2(a + δ). We deal with the integrals in order. When x ∈ (−a−δ−h, a−δ−h), f (x + h) is constantly equal to B , and f (x) is a δ-translation of f l (x), so one has Using the changes of variables x + h → x → −x, the second integral in (I) may be rewritten as a copy of the fourth: The third integral is only present, or only contributing positively, when h ≥ 2(a + δ). In that case both x + h ≥ a + δ and x ≤ −a − δ on the interval, so we count its contribution as where χ is an indicator function. The fourth integral has already been shown to be a copy of the second, see (I2), but its behaviour is linked to the size of h in relation to 2(a + δ). When the interval [−a −δ+h, a +δ] is non-empty, both f (x +h) and f (x) equal B there, so the integral vanishes over that part; otherwise only one of these terms is constant. Therefore, by the change of variables x −δ → x → −x. Finally, the fifth can be shown to be a copy of the first: (3.33) This shall be compared with the term (3.31) in E l (h). We divide the latter term into parts, according to (3.36) where furthermore f (c 2 ) ≤ f (c 1 ) ≤ B and 0 ≤ r (h) − a + h ≤ 2δ for h ≥ 2a by definition of r (h). We may therefore use (3.27) to bound (3.36), finding that it is less than (3.37) Turning to the χ-part of (3.33), it is only present when h > 2(a + δ), in which case we are to balance it against the remaining −a a−h -integral in (3.34). Note that changes of variables can be used to re-express terms, for example, ) dx by the change x + h → −x + 2δ. Using similar identities and the mean-value theorem, one finds where As f l is bell-shaped, it follows that the expression above is negative. As K is positive and , summing up ∞ 0 E l (h)K (h) dh from the negative quantified contribution from (3.32), the positive quantified from (3.37), and the
negative from (3.38), we find that there is a constant 0 < C < 1, depending only on ε and α, such that for all l sufficiently large. From (3.29), we then get that The modified f . Because f has support in [−2 l −δ, 2 l +δ] it is not an admissible maximizer, and we now modify it to yield the desired contradiction. By (3.21) and the properties of f , there exists γ > 0 such that a + δ + γ < 2 l and Now set δ = δ + γ, and define fl by in which case it follows from (3.39) and (3.26) that > 0 for all l sufficiently large, contradicting the assumption that f l is a maximizer. To prove (3.40), note that Let f be a non-negative, bell-shaped function satisfying for all x ∈ R, and that for all k > 0, |{x : f (x) ≥ k}| ≤ 1/Ψ(k). In particular, there is an upper bound on δ that is independent of l and ε, say δ < C , and as l → ∞, uniformly in ε. This also implies that γ → 0. Hence the first two terms in (3.41) vanish and, to prove our claim, it is sufficient to show that the convergence lim As J l is bounded below and 〈 f l , Ψ ′ ( f l )〉 is bounded above, uniformly in l , we get from the Euler-Lagrange equation (3.15) that uniformly in l , where we used the regularity and decay properties of K (cf. Lemma 2.1). As ‖ f l ‖ L ∞ ≲ α uniformly in l , Ψ ′ (x) > 0 for all x > 0 and Ψ ′′ (x) > 0 for all x ≠ α, this implies that f l (x + h) − f l (x) → 0 as h → 0 uniformly in l and x. In particular, for any fixed R, uniformly in l . To deal with |x| > R, we improve on the estimate of the difference. By (3.42) we can pick R ≫ 1 such that f l (R) < α/2 for all l . Let |x| > R ≫ 1 and 0 < h ≪ 1. Then and, by Lemma 2.1 and the bell-shapedness of f l , uniformly in l for any N > 0. Picking any N > 1, it follows from the Euler-Lagrange equation (3.15) and the above estimates that uniformly in l and α. This proves the claim. As ε > 0 was arbitrary, this proves the result.
Proof. For $f < \alpha$ one has and, for $f \geq \alpha$, Since $\int_{\mathbb{R}} \Psi(f_l)\,dx = N_\Psi(f_l)$, the first part of the lemma and the lower bound on $\langle f_l, \Psi'(f_l)\rangle$ now follow from integrating the above expressions over $\mathbb{R}$.
For the upper bound, we have that uniformly in $\alpha > 0$ and $l$, where the last inequality follows from Lemma 3.1. As shown above, we have Letting $B$ be as in Proposition 3.6, we have that the integrand in the second integral is strictly negative for Assume therefore that $f_l(0) > B$. For any $\varepsilon > 0$, we have by Proposition 3.6 that for all $l$ sufficiently large. Hence In particular, $\langle f_l, \Psi'(f_l)\rangle < 2 + \varepsilon$ and, using this estimate and the regularity of $K$, we have by Lemma 3.5 that uniformly in $\alpha$ and $l$, where we used that $K$ is smooth away from the origin and rapidly decaying, and $K(x) \simeq |x|^{-1/2}$ for $|x| \ll 1$. As $\Psi' \in C^1$, this implies that if $f_l = \alpha$ at some point, then for any $0 < c < \alpha$, the quantity $|\{x : c \leq f_l(x) \leq \alpha\}|$ is uniformly bounded below by a positive constant depending only on $c$ and $\alpha$. This implies that there is a As $\varepsilon > 0$ above was arbitrary, we can choose $\varepsilon < C$ in (3.43) to conclude that $\langle f_l, \Psi'(f_l)\rangle < 2$ for all $l$ sufficiently large.
Corollary 3.8. For all α > 0 and all l sufficiently large, the family { f l } l satisfies the estimate uniformly in α > 0 and l > l 0 , where l 0 is as in Lemma 3.7.
Adding the integrals over f l ≥ α and f l < α together gives the uniform L 1 bound. As f l is bell-shaped, we have for x ∈ (0, 2 l ), This concludes the proof.

3.3. The limit as l → ∞. We now proceed to find the limit as l → ∞.
Lemma 3.9. For any $\alpha > 0$, any sequence $\{f_l\}_l$ of local maximizers from Lemma 3.5 (recall that we have no proof of uniqueness) has a subsequence $\{f_{l_k}\}_k$ that converges point-wise and in $L^p(\mathbb{R})$, $p > 1$, and the limit $f_\alpha$ is a non-trivial bell-shaped solution of the global constrained maximization problem (3.44). The function $f_\alpha$ furthermore satisfies the Euler-Lagrange equation (3.45).

Proof. By the upper bound on $\|f_l\|_{L^\infty}$ in Lemma 3.5 and the estimates in Corollary 3.8, we have that for any fixed choice of $\alpha > 0$, there exists an $l_0$ such that uniformly in $l > l_0$. Next, since $\{f_l\}_l$ belongs to the unit sphere of $L_\Psi$, and $\|f_l\|_{L^2(\mathbb{R})} \lesssim \alpha^{-1/2} N_\Psi(f_l)$ holds uniformly by Lemma 3.1, by weak compactness there is a weakly convergent subsequence such that, By testing against characteristic functions, one sees that $f_\alpha$ is bell-shaped as well.
We want to obtain convergence in the Euler-Lagrange equation (3.15).Since 〈 f l , Ψ ′ ( f l )〉 1 by Lemma 3.7, and J l is bounded from above and below by Lemmas 3.2 and 3.3 for any fixed α, we may without loss of generality assume that lim k 〈 f l k , Ψ ′ ( f l k )〉 = 0 and The above convergence for f l k furthermore yields that for every fixed x ∈ R, since K ∈ L q for q < 2. Thus, we have established the pointwise convergence x ∈ R.
Let g l = Ψ ′ ( f l ).Given x ∈ R, for all k sufficiently large such that |x| < 2 l k , we have that converges point-wise as well.We can now readily deduce the point-wise convergence of f l k , since Ψ ′ is strictly positive away from the origin.Indeed, for |x| ≤ 2 l the value of f l (x) is given by f l (x) = (Ψ ′ ) −1 • g l (x).More explicitly, whence the latter expression has a point-wise limit along the subsequence l k .By uniqueness, the point-wise and weak limits of f l k coincide, so that and (3.48) thus holds with f α and g α exchanged for f l and g l , respectively.Equivalently, g α = Ψ ′ ( f α ), so that (3.47) implies that This equality is valid on the entire real line, since for each x ∈ R, (3.48) will eventually hold as l k → ∞.
It is now time to discuss the existence and value of lim k→∞ 〈 f l k , Ψ ′ ( f l k )〉.By Lemma 3.7, we have with f l k (x) → f α (x) for all x.By (3.46), the sequence { f l } is dominated by a function that is in L p for all p > 1.As f l k converges point-wise, Lebesgue's dominated convergence theorem then implies that for all p > 1.In particular, we get that and by (3.49) we then have that Multiplying both sides by f α and integrating, we get . By (3.50), N Ψ ( f α ) = 1, and it follows that f α is a maximizer.
Proof. The Euler-Lagrange equation (3.45) and the smoothing effect of $K *$ imply that maximizers $f_\alpha$ are continuous, so if $f_\alpha(0) > \frac{4}{\sqrt{3}} \cos\left(\frac{5\pi}{18}\right) \alpha = B$, then The upper bound (3.51) then follows from Proposition 3.6 by taking $l \to \infty$. The bounds $1 < \langle f_\alpha, \Psi'(f_\alpha)\rangle < 2$, (3.52) and (3.53) then follow directly from Lemma 3.7, Corollary 3.8 and Lemma 3.5, respectively, as the equivalent bounds for the local maximizers $f_l$ were uniform in $l$. It only remains to show that the solutions $f_\alpha$ vanish in $L^\infty$ as $\alpha \to \infty$. For all $\alpha$ sufficiently large we have that $f_\alpha(0) < \alpha$, so by the Euler-Lagrange equation (3.45) applied to $x = 0$, and the bounds $J_\alpha \geq \alpha^{-1/2}$ and $\langle f_\alpha, \Psi'(f_\alpha)\rangle \simeq 1$, we obtain from Lemma 2.3 that for all $q > 2$. Interpolating between the $L^2$ and the $L^3$ bounds of Lemma 3.1 we get that $\|f_\alpha\|_{L^q} \lesssim \alpha^{1 - \frac{3}{q}}$, $q \in (2, 3)$. Since this is available for all $q \in (2, 3)$, we obtain the inequality +δ ≥ 0, with $0 < \delta \ll 1$ arbitrarily small and $C_\delta$ a positive constant depending on it. Since $f_\alpha(0) \leq \alpha$, the solution to the above inequality is $f_\alpha(0) \lesssim \alpha^{-1/2+\delta}$, with the implicit constant depending on $\delta$.
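The exponent $1 - 3/q$ in the interpolation step can be checked by simple bookkeeping. The sketch below assumes an $L^2$ bound of order $\alpha^{-1/2}$ (as quoted from Lemma 3.1 elsewhere in this section) and an $L^3$ bound of order $\alpha^{0}$; the latter is an assumption, inferred from the endpoint $q = 3$ of the stated exponent, since the statement of Lemma 3.1 is not fully available here. The function names are illustrative.

```python
from fractions import Fraction


def interpolation_weight(q, p0=Fraction(2), p1=Fraction(3)):
    """Return theta with 1/q = theta/p0 + (1 - theta)/p1."""
    return (1 / q - 1 / p1) / (1 / p0 - 1 / p1)


def alpha_exponent(q, e0=Fraction(-1, 2), e1=Fraction(0)):
    """Exponent of alpha in ||f||_q <= ||f||_p0^theta * ||f||_p1^(1 - theta),
    given bounds alpha**e0 in L^p0 and alpha**e1 in L^p1 (e1 = 0 is assumed)."""
    theta = interpolation_weight(q)
    return theta * e0 + (1 - theta) * e1


for q in (Fraction(9, 4), Fraction(5, 2), Fraction(11, 4)):
    # Columns: q, theta, interpolated exponent, and the claimed value 1 - 3/q.
    print(q, interpolation_weight(q), alpha_exponent(q), 1 - 3 / q)
```

Under these assumptions $\theta = 6/q - 2$, and the interpolated exponent $-\theta/2$ agrees with $1 - 3/q$ for every $q \in (2, 3)$.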

DEPENDENCE ON THE PARAMETER α
In this section we investigate the dependence on the parameter α for the maximizers f α . In particular we are interested in the maximizers that satisfy f α ≤ α, as other maximizers are not solutions of the Whitham equation. We therefore introduce the threshold parameter α 0 := inf{ξ > 0 : for any α ∈ (ξ, ∞) there exists a maximizer f α with f α (0) < α}. (4.1) From Corollary 3.10 it follows that α 0 exists and is a finite number. Below we will prove that α 0 > 0, meaning that there are maximizers found in this paper that either attain or exceed the height f α (0) = α. We will also prove that α 0 < 5 2 , which means that we have solutions also for intermediate (and perhaps small) values of α. Note that α is in fact in opposite relation to the wave-height of the final solution constructed in Section 5; the solutions vanish as α → ∞. Proof. Let α ∈ (0, ∞) and write f α = αq α (α 3 •). Then N Ψ ( f α ) = 1 implies that with 0 < q α ≤ α −1 max( f α ). From this we see that, in fact, for any α ∈ (0, ∞), we have N Ψ ( αq α ( α3 •)) = 1. Let α 1 < α. As K (ξ) is strictly decreasing in |ξ|, we have that K (α 3 1 ξ) > K (α 3 ξ) for all ξ ≠ 0, hence As α ∈ (0, ∞) was arbitrary, this proves that α → αJ 2 α is strictly decreasing. Similarly, we have that As |ξ K ′ (ξ)| is smooth and decaying and ‖q α ‖ 2 L 2 ≃ 1 uniformly in α, this proves continuity (using regularity properties of q α derived from the Euler-Lagrange equation, one can show that, in fact, the continuity is uniform).
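The monotonicity used in the proof above can be illustrated numerically. The sketch below assumes that $K(\xi)$ in the proof denotes the Fourier symbol of the Whitham kernel, $\sqrt{\tanh(\xi)/\xi}$; this formula is the standard one for the Whitham equation and is not stated in this section. The scale factors `s1 < s` stand in for the two scalings of $\xi$ being compared (the exponent on $\alpha$ is not needed for the check), and all function names are illustrative.

```python
import math


def whitham_symbol(xi):
    """Assumed Whitham symbol sqrt(tanh(xi)/xi), with limiting value 1 at xi = 0."""
    if xi == 0.0:
        return 1.0
    return math.sqrt(math.tanh(xi) / xi)


# Sample the symbol on a grid in (0, 30] and check strict decrease on the grid.
xis = [0.01 * k for k in range(1, 3001)]
values = [whitham_symbol(xi) for xi in xis]
strictly_decreasing = all(a > b for a, b in zip(values, values[1:]))


def scaled_gap(s1, s, xi):
    """Difference symbol(s1*xi) - symbol(s*xi); positive when 0 < s1 < s and xi != 0."""
    return whitham_symbol(s1 * xi) - whitham_symbol(s * xi)


print("strictly decreasing on grid:", strictly_decreasing)
print("gap at xi = 0.5 for s1 = 1, s = 8:", scaled_gap(1.0, 8.0, 0.5))
```

Since any positive power of a positive decreasing function is again decreasing, the same check applies to fractional powers of the symbol.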
We can also provide a rough upper bound on α 0 .The estimates in the below proof may be improved, but the purpose of the proposition is to establish maximizers with f α (0) < α also for intermediate values of α.Proof.From Lemma 3.2 we have J 2 α > α −1 and by Young's inequality As K is positive, strictly monotone on (0, ∞) and K L 1 = 1, there is an a > 0 such that K (a) = 1 and , we have that the first term has inverse Fourier transform 1/ 2π|x|, while the second term is integrable and exponentially decaying and hence has a real-analytic transform.Moreover, the inverse transform of the second term is clearly negative around the origin.Hence K (x) < 1 2π|x| for |x| ≤ a and a < 1 2π , and we get that It follows that K L 3/2 < 2 π + 1 2/3 , and from (4.4) we get that It follows that α 0 must satisfy this upper bound.
Lemma 4.4.For α > 0, let {α j } j ⊂ (0, ∞) be a sequence that converges to α and let f α j be a maximizer corresponding to α j .Then a subsequence { f α j k } k converges point-wise and in L p , simultaneously for all p ∈ (1, ∞], to a maximizer of the corresponding maximization problem for α.
Proof.For ease of notation, let f α j = f j .By Lemma 3.1, we have uniformly in j .Hence there is a subsequence { f j k } k and a f α ∈ L 2 such that This implies that converges point-wise.Let Ψ j be the Ψ-function corresponding to α j , and let for all k, so we can without loss of generality assume that lim also converges point-wise.We have that The right-hand side has a point-wise limit along the subsequence j k , hence f j k → f α also converges point-wise.Moreover, . By Corollary 3.10, we have the following estimate, uniform in j : Hence we get that uniformly in j .The right-hand side is in L p for all p > 1, so the point-wise convergence and Lebesgue's dominated convergence theorem gives that It follows that Multiplying by f α on both sides and integrating over R, we get that K 1 4 * f α L 2 = J α , and as N Ψ α ( f α ) = 1, this implies that f α is a maximizer.Corollary 4.5.At the threshold parameter α 0 , there exists a maximizer f α 0 satisfying f α 0 (0) ≤ α 0 and a maximizer g α 0 satisfying g α 0 (0) ≥ α 0 .
Proof. Let $\{\alpha_j\}_j$ be a monotonically decreasing sequence with $\alpha_j \searrow \alpha_0$. By the definition of $\alpha_0$, for all $j$ there are maximizers $f_j$ for $\alpha_j$ satisfying $f_j(0) < \alpha_j$. It follows from Lemma 4.4 that a subsequence of $\{f_j\}_j$ converges point-wise to a maximizer $f_{\alpha_0}$, which by the point-wise convergence must satisfy $f_{\alpha_0}(0) \leq \alpha_0$. Similarly, letting $\{\alpha_j\}_j \subset (0, \alpha_0)$ be a monotonically increasing sequence with $\alpha_j \nearrow \alpha_0$, we have, for each $j$, maximizers $g_j$ satisfying $g_j(0) \geq \alpha_j$, and we conclude that there is a maximizer $g_{\alpha_0}$ such that $g_{\alpha_0}(0) \geq \alpha_0$. This proves the result.

Remark 4.6. Note that if we have equality $f_{\alpha_0}(0) = \alpha_0$, then this will correspond to a solitary wave of maximal height for the Whitham equation (see Section 5). We would expect that $f_{\alpha_0} = g_{\alpha_0}$ in the above result, in which case $f_{\alpha_0}(0) = \alpha_0$. However, at present we have no uniqueness results or other results that preclude the possibility that the maximizers are distinct and $g_{\alpha_0}(0) > \alpha_0 > f_{\alpha_0}(0)$.

SOLITARY WAVES OF THE FULL-DISPERSION EQUATION
The following proposition shows that the range of bell-shaped solutions is confined to the 'natural' region $(0, \frac{\mu}{2}]$, where the end-points correspond to the zero solution and the Whitham highest wave, respectively. All solitary waves have super-critical wave speed bounded from above by twice the critical wave speed, a result in line with both [11] and [13].

Then $z$ is bell-shaped, too, and in fact it is strictly decreasing in $(0, \infty)$ in accordance with the proof of Lemma 2.4. From the quadratic equation $z = \mu\varphi - \varphi^2$ we conclude that $2\varphi(x) = \mu + \sqrt{\mu^2 - 4z(x)}$ for $x \in A_+$ and $2\varphi(x) = \mu - \sqrt{\mu^2 - 4z(x)}$ for $x \in A_-$, for some disjoint sets $A_\pm$ with $A_+ \cup A_- = [0, \infty)$. Assume for a contradiction that $A_+$ has positive measure. Then there are sets $A_+^1, A_+^2$ in $A_+$ of positive measure with $A_+^1$ to the left of $A_+^2$. Let $0 < x_1 < x_2$ be two arbitrary elements in $A_+^1$ and $A_+^2$, respectively. Since $\varphi$ is bell-shaped, we have $\varphi(x_1) \geq \varphi(x_2)$. But solving the inequality $\mu + \sqrt{\mu^2 - 4z(x_1)} \geq \mu + \sqrt{\mu^2 - 4z(x_2)}$ yields $z(x_2) \geq z(x_1)$, so that $z|_{A_+^2} \geq z|_{A_+^1}$. This contradicts the strict monotonicity of $z$, whence $A_+$ must have zero measure, and $\varphi$ is given by the minus branch, $\varphi = \varphi_-$, almost everywhere. Thus (5.1) holds, $\varphi(x) \leq \frac{\mu}{2}$, and $\varphi$ is non-negative by assumption since it is bell-shaped, which gives that it is non-zero in view of the fact that $z = K * \varphi$ is strictly monotone.
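The branch argument in the proof above can be made concrete with a small numerical sketch. It evaluates both roots of $z = \mu\varphi - \varphi^2$ along a strictly decreasing sequence of values of $z$ and shows that only the minus branch decreases together with $z$, which is what bell-shapedness of $\varphi$ requires. The wave speed `mu = 1.5` and the sample values of `z` are hypothetical and chosen only for illustration.

```python
import math


def branches(mu, z):
    """Both roots phi of z = mu*phi - phi**2, i.e. 2*phi = mu -/+ sqrt(mu**2 - 4*z)."""
    disc = math.sqrt(mu * mu - 4.0 * z)
    return (mu - disc) / 2.0, (mu + disc) / 2.0


mu = 1.5  # hypothetical wave speed in (1, 2)

# Hypothetical strictly decreasing samples of z along x > 0, with 0 < z <= mu**2 / 4.
z_samples = [0.5, 0.4, 0.25, 0.1, 0.01]

minus = [branches(mu, z)[0] for z in z_samples]
plus = [branches(mu, z)[1] for z in z_samples]

# Residuals of the quadratic equation for both branches (should be near zero).
residuals = [abs(mu * p - p * p - z)
             for z in z_samples
             for p in branches(mu, z)]

for z, lo, hi in zip(z_samples, minus, plus):
    print(f"z = {z:5.2f}   minus branch = {lo:.4f}   plus branch = {hi:.4f}")
```

Along the samples the minus branch decreases and stays below $\mu/2$, while the plus branch increases, so a one-sided monotone $\varphi$ cannot sit on the plus branch on a set of positive measure.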
In particular, there is an injective curve of solutions parameterized by α ∈ [α 0 , ∞), and the function 2 ) is strictly decreasing and continuous with the limit 1 achieved as α → ∞. For α > α 0 the waves satisfy 0 < ϕ < µ/2 and these waves are all smooth, while for α = α 0 the wave satisfies ϕ(0) ≤ µ/2, with potential equality. The solutions scale as with the estimate being uniform in α. In particular, the waves are small for large values of the parameter α in the sense that ‖ϕ‖ L p ≲ α −1 for α ≫ 1 and all p ∈ [1, ∞]. As α → ∞, one further has µ → 1, the bifurcation point for solitary waves.
Remark 5.3. The estimate for small waves could be improved using an additional $L^2$-bound. That would yield the slightly better $\|\varphi\|_{L^p} \lesssim \alpha^{-3/2}$ for $p \geq 2$.
To get the estimates for $\varphi$, note that by the definition of $\varphi$ and $\mu$, we have and (5.2) then follows directly from the fact that $\mu \in (1, 2)$. By Corollary 3.10, $\|f_\alpha\|_{L^1} \lesssim 1$ for $\alpha \gg 1$ and $\|f_\alpha\|_{L^\infty} \lesssim \alpha^{-1/2+\delta}$ for each fixed $\delta > 0$. Interpolation then gives the $L^p$-estimates for $\varphi$. As mentioned in Remark 5.3, this estimate can be improved for $p \geq 2$, for example by interpolation and the additional estimate $\alpha^{-1}$. Finally, the limit $\mu \to 1$ as $\alpha \to \infty$ follows directly from the definition of $\mu$ and Lemma 4.1.
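The interpolation between the $L^1$ and $L^\infty$ bounds rests on the elementary inequality $\|f\|_{L^p} \leq \|f\|_{L^1}^{1/p} \|f\|_{L^\infty}^{1-1/p}$. The sketch below checks it on a sampled bell-shaped profile using Riemann sums. The profile $0.3\,\mathrm{sech}^2(x)$, the grid, and the helper names are hypothetical and chosen only for illustration; they are not the waves constructed in the paper.

```python
import math


def lp_norm(values, dx, p):
    """Riemann-sum approximation of the L^p norm; p = math.inf gives the maximum."""
    if math.isinf(p):
        return max(abs(v) for v in values)
    return (sum(abs(v) ** p for v in values) * dx) ** (1.0 / p)


dx = 0.01
xs = [-20.0 + dx * k for k in range(4001)]

# Hypothetical bell-shaped profile: even, one-sided monotone, maximal at the origin.
f = [0.3 / math.cosh(x) ** 2 for x in xs]

norm_1 = lp_norm(f, dx, 1.0)
norm_inf = lp_norm(f, dx, math.inf)


def interpolated_bound(p):
    """Right-hand side ||f||_1^(1/p) * ||f||_inf^(1 - 1/p) of the interpolation inequality."""
    return norm_1 ** (1.0 / p) * norm_inf ** (1.0 - 1.0 / p)


for p in (1.5, 2.0, 4.0, 10.0):
    print(f"p = {p:4.1f}   ||f||_p = {lp_norm(f, dx, p):.5f}   bound = {interpolated_bound(p):.5f}")
```

With $\|f_\alpha\|_{L^1} \lesssim 1$ and $\|f_\alpha\|_{L^\infty} \lesssim \alpha^{-1/2+\delta}$, the same inequality gives $\|f_\alpha\|_{L^p} \lesssim \alpha^{(-1/2+\delta)(1-1/p)}$; the passage from $f_\alpha$ to $\varphi$ uses the scaling in the theorem, which is not reproduced here.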

Lemma 3.1. The function Ψ defined by (3.4) is a strictly convex, strictly increasing, C 2 -Young function for which (2.13) defines a norm. For any fixed value of α > 0, the corresponding Orlicz space L Ψ satisfies

(3.34) by the changes of variables. The $2\int_{-a-h}^{a-h}$-integral will be used for the $2\int_{-a-h}^{r(h)}$-integral in (3.33), and the $\int_{a-h}^{-a}$-integral for the $\chi$-part. We start with the former, writing (3.35), which is achieved by adding and subtracting $2\int_{-a-h}^{a-h} |f_l(x+h) - B|^2 \,dx$, where the negative term is expressed as $-2\int_{-a}^{a} |f_l(x) - B|^2 \,dx$. Note that the terms $B - f_l(x)$ and $f_l(x+h) - B$ in the $\int_{-a-h}^{a-h}$-integral are both positive, so the total contribution from that term is negative. Quantifying with the help of the mean value theorem, we have that (3.35) equals
Theorem 5.2. For any maximizer f α satisfying f α (0) ≤ α, the function , even and one-sided strictly monotone solution of the steady Whitham equation (1.3) with wave speed