1 Introduction and basic set-up

The intention of this review-type article is to put some of the key mathematical notions and solution results regarding the Fornberg–Whitham equation into perspective relative to each other. We thereby also strive to connect two so far largely parallel threads of research, because the same equation has also been studied under the name Burgers–Poisson equation. We neither attempt to elaborate on the history and physics behind this model equation, nor can we come anywhere near a complete overview of the mathematical results from more than 50 years of its analysis. Moreover, our attention was restricted to results from work published at the time of writing, and no systematic search of preprints was undertaken.

We discuss here the Fornberg–Whitham equation as it was introduced by Whitham in [31, Eq. (67)] as the integro-differential equation at the center of a shallow water wave model that is comparatively simple and yet showed indications of wave breaking (see also [29]). It featured later in Whitham’s book [32] in a section dedicated to breaking and peaking of waves, and a first systematic numerical study was published by Fornberg and Whitham in [13, Sect. 6].

Let us describe the formal set-up of the Cauchy problem. The wave height is described by a function of one-dimensional space and time \(u :\mathbb {R}\times [0,\infty [ \rightarrow \mathbb {R}\), \((x,t) \mapsto u(x,t)\). We will occasionally write u(t) to denote the function \(x \mapsto u(x,t)\). Upon rescaling (cf. Remark 1.1 below) we may write the equation without explicitly occurring additional model parameters in the form

$$\begin{aligned} u_t + u u_x + K*u_x = 0, \end{aligned}$$
(1)

where the convolution is in the x variable only and \(t > 0\). The convolution kernel is \(K(x) = \frac{e^{-|x|}}{2}\) and satisfies

$$\begin{aligned} K - K'' = \delta , \end{aligned}$$
(2)

which means that K is a fundamental solution of the operator \(1 - \partial _x^2\). In fact, we will occasionally have to interpret Eq. (1) in various weak forms—with distributional, entropy, or mild semigroup solution concepts—which stem from rewriting the left-hand side either as in

$$\begin{aligned} \partial _t u + \partial _x \left( \frac{u^2}{2} + K*u \right) = 0 \end{aligned}$$
(3)

or also in the form

$$\begin{aligned} \partial _t u + \partial _x \left( \frac{u^2}{2}\right) + K' * u = 0. \end{aligned}$$
(4)
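As a hedged numerical illustration of property (2), which is not part of the original exposition, the following Python sketch approximates \(K * f\) for \(f(x) = \cos (kx)\) by the trapezoidal rule and compares it with \(\cos (kx)/(1+k^2)\), the bounded solution of \(v - v'' = f\). The function names, the truncation of the integral to \([-40,40]\), and the sample values of k and x are our own assumptions.

```python
import math


def K(x):
    """Convolution kernel K(x) = exp(-|x|)/2 from the text."""
    return 0.5 * math.exp(-abs(x))


def conv_K(f, x, half_width=40.0, n=40000):
    """Approximate (K*f)(x) = int K(y) f(x - y) dy by the trapezoidal rule.

    The integral is truncated to [-half_width, half_width]; n is even, so
    the kink of K at y = 0 sits (up to rounding) on a grid node.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * K(y) * f(x - y)
    return total * h


def check_fundamental_solution(k=1.5, x=0.3):
    """Return (numerical K*cos(k.)(x), cos(k x)/(1 + k^2)) for comparison."""
    numeric = conv_K(lambda s: math.cos(k * s), x)
    exact = math.cos(k * x) / (1.0 + k * k)
    return numeric, exact
```

Calling `check_fundamental_solution()` returns two numbers that should agree up to the quadrature error; this reflects that convolution with K inverts \(1 - \partial _x^2\) on such functions.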

Based on the property (2), Eq. (1) emerged in [11] instead from a system of equations, which can be approached here in reverse direction upon rewriting (1) as \(u_t + u u_x = - K*u_x\) and putting \(v := - K * u\). Noting that \(v_x = - K * u_x\) and \((1 - \partial _x^2) v = - u\) we then obtain the following system of nonlinear partial differential equations

$$\begin{aligned} u_t + u u_x&= v_x,\\ v_{xx}&= v + u. \end{aligned}$$

It was the starting point of the model in [11] and was called the Burgers–Poisson system, while the analog of Eq. (1) derived from it was named the Burgers–Poisson equation. This name was also used in the key publication about global weak solutions in [15]. In the context of the current review article we prefer to stay with the name Fornberg–Whitham equation when referring to (1).

We will usually suppose an initial wave profile \(u_0 :\mathbb {R}\rightarrow \mathbb {R}\) to be given and require in addition

$$\begin{aligned} u|_{t=0} = u_0. \end{aligned}$$
(5)

Remark 1.1

(Rescaled and periodic variants of the Fornberg–Whitham equation)

  1. (i)

    Note that we followed here in (1) the sign convention for the convolution term as in [13, Equation (4)] (see Footnote 1) and [32, Sect. 13.14], but have applied a rescaling of the solution values in order to get rid of any additional constant factor in the nonlinear term. Replacing \(u(x,t)\) by \(-u(x,-t)\) transforms solutions of either sign variant of the equation into solutions for the other convention. Moreover, if u solves (1) and \(\lambda > 0\) is a constant, then \(v := u/\lambda \) is a solution to

    $$\begin{aligned} v_t + \lambda v v_x + K * v_x = 0, \end{aligned}$$

    which shows why we could bring the original model equation from [31] into the form (1). Such scalings and sign conventions have to be taken into account when comparing results about wave breaking that typically involve also quantitative aspects of the initial wave profile.

  2. (ii)

    Formally applying \(1 - \partial _x^2\) to (1) produces the third order partial differential equation

    $$\begin{aligned} u_t - u_{txx} - 3 u_x u_{xx} - u u_{xxx} + u u_x + u_x = 0. \end{aligned}$$

    Instead we will stay with the non-local integro-differential equation (1) or (3), because it corresponds to the original model and is also more suitable for the various solution concepts to be discussed.

  3. (iii)

    To study spatially periodic waves we change the x-domain to the one-dimensional torus group \(\mathbb {T}= \mathbb {R}/ \mathbb {Z}\) and may identify functions on \(\mathbb {T}\) with 1-periodic functions on \(\mathbb {R}\). This also requires an adaptation of the convolution kernel K (cf. [21, Sect. 3]), which is then given as the 1-periodic function on \(\mathbb {R}\) with \(K(x) = (e^x + e^{1-x})/(2 (e-1)) = \frac{\sqrt{e}}{e-1} \cosh (x - \frac{1}{2}) \) for \(0 \le x < 1\). Note that K is continuous but not \(C^1\) and the derivative \(K'\) is not continuous but in \(L^\infty \).
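The two closed-form expressions for the periodic kernel in (iii) can be cross-checked numerically. The sketch below also compares them with the truncated periodization \(\sum _{|n| \le N} K_{\mathbb {R}}(x+n)\) of the kernel \(K_{\mathbb {R}}(x) = e^{-|x|}/2\) on the real line; that the periodic kernel coincides with this periodization is an observation we add here for illustration (it follows from summing two geometric series), and all function names and the cut-off N are our own assumptions.

```python
import math


def K_line(x):
    """Kernel on the real line, K(x) = exp(-|x|)/2."""
    return 0.5 * math.exp(-abs(x))


def K_periodic_exp(x):
    """First closed form from Remark 1.1(iii), for 0 <= x < 1."""
    return (math.exp(x) + math.exp(1.0 - x)) / (2.0 * (math.e - 1.0))


def K_periodic_cosh(x):
    """Second closed form from Remark 1.1(iii), for 0 <= x < 1."""
    return math.sqrt(math.e) / (math.e - 1.0) * math.cosh(x - 0.5)


def K_periodized(x, n_max=60):
    """Truncated periodization: sum of K_line(x + n) over |n| <= n_max."""
    return sum(K_line(x + n) for n in range(-n_max, n_max + 1))
```

For sample points in \([0,1[\) all three functions should return the same value up to rounding, since the neglected tail of the sum is of size \(e^{-N}\).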

To simplify the presentation in the context of this review, we will give detailed formulations only for the Fornberg–Whitham equation in the form (1) and without periodicity assumptions. However, we will occasionally add remarks on the periodic case.

Before discussing in the following section the main solution concepts that have been employed for the Cauchy problem consisting of (1) and (5), let us remark that there are not many conserved quantities for solutions u (of sufficient regularity and with suitable integrability properties). The most obvious one is

$$\begin{aligned} \forall t \ge 0:\quad \int _\mathbb {R}u(x,t)\, dx = \int _\mathbb {R}u_0(x)\, dx, \end{aligned}$$

since integrating (1) with respect to x gives

$$\begin{aligned} \frac{d}{dt} \int u(x,t)\, dx= & {} \int \partial _t u(x,t) \, dx = - \int \Big (u(x,t) \partial _x u(x,t) + (K * \partial _x u(.,t))(x) \Big ) \, dx\\= & {} - \frac{1}{2} \int \partial _x (u(x,t)^2) \, dx - \int \partial _x (K * u(.,t))(x) \, dx = - \frac{1}{2} \cdot 0 - 0 = 0. \end{aligned}$$

A second conserved quantity is the spatial (real) \(L^2\) norm and stems from the skew-symmetry (see Footnote 2) of the operator \(v \mapsto K' * v = (K * v)'\) on \(L^2(\mathbb {R})\): Multiplying (1) by u, integrating with respect to x, and writing \(u u_t = \frac{d}{dt} (u^2/2)\), \(u^2 u_x = \frac{d}{dx}(u^3/3)\) we obtain

$$\begin{aligned} 0= & {} \frac{1}{2} \frac{d}{dt} \int u(x,t)^2 \, dx + \frac{1}{3} \int \frac{d}{dx}(u(x,t)^3)\, dx + \langle K' * u , u \rangle \\= & {} \frac{1}{2} \frac{d}{dt} {\left\| u(t) \right\| }_{L^2}^2 + \frac{1}{3} \cdot 0 + 0 = \frac{1}{2} \frac{d}{dt} {\left\| u(t) \right\| }_{L^2}^2, \end{aligned}$$

and thus (see also [14, Lemma 1])

$$\begin{aligned} \forall t \ge 0:\quad \int u(x,t)^2 \, dx = \int u_0(x)^2 \, dx. \end{aligned}$$
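The skew-symmetry used above has a simple discrete counterpart, which we sketch here under our own assumptions (a uniform grid on \([-8,8]\), Riemann sums, Gaussian sample profiles, and the convention \(K'(0) := 0\)). The matrix with entries \(K'(x_i - x_j)\) is antisymmetric, so the discrete pairing \(\langle K' * u , v \rangle \) changes sign when u and v are swapped and vanishes for \(u = v\).

```python
import math


def K_prime(x):
    """K'(x) = -sgn(x) exp(-|x|)/2 for x != 0; the value at 0 is set to 0."""
    if x == 0.0:
        return 0.0
    return -math.copysign(0.5 * math.exp(-abs(x)), x)


def sample_grid(a=-8.0, b=8.0, n=321):
    """Uniform grid of n points on [a, b] together with its mesh size."""
    h = (b - a) / (n - 1)
    return [a + i * h for i in range(n)], h


def pairing(u, v, xs, h):
    """Riemann sum for <K'*u, v> = int int K'(x - y) u(y) v(x) dy dx."""
    total = 0.0
    for xi, vi in zip(xs, v):
        inner = 0.0
        for yj, uj in zip(xs, u):
            inner += K_prime(xi - yj) * uj
        total += inner * vi
    return total * h * h
```

With `xs, h = sample_grid()` and grid samples `u`, `v` of two different profiles, `pairing(u, v, xs, h)` and `pairing(v, u, xs, h)` should sum to zero up to rounding, which mirrors the vanishing of \(\langle K' * u , u \rangle \) in the computation above.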

However, by the analysis in [23], the Fornberg–Whitham equation is known to be one of those members of a class of third-order nonlinear dispersive wave equations that are not completely integrable. Therefore, the key methods from geometric theories of infinite-dimensional dynamical systems that are available, e.g., for the Camassa–Holm equation, are not applicable in the case of the Fornberg–Whitham equation.

The structure of this article is as follows: Sect. 2 is devoted to a discussion and comparison of various strong and weak solution concepts for the Cauchy problem consisting of (1) and (5). In Sect. 3 we summarize the key well-posedness results for strong solutions and on blow-up in finite time in the form of wave breaking, where we also add one aspect of semi-boundedness at blow-up time. Section 4 discusses key results on weak entropy solutions and adds a brief investigation of mild solutions with their relations to the former. The final subsection then focusses on continuous traveling waves in relation to the weak or weak entropy solution concept.

2 Solution concepts for the Cauchy problem

2.1 Strong solutions

In purely classical terms, the minimum requirements for a function \(u :\mathbb {R}\times [0,\infty [ \rightarrow \mathbb {R}\) to count as a global solution of the Cauchy problem consisting of (1) and (5) would be as follows: u possesses first-order partial derivatives in \(\mathbb {R}\times \, ]0,\infty [\), the convolution \(K * u_x(.,t)\) is defined on \(\mathbb {R}\) for every \(t > 0\), Eq. (1) holds pointwise for every \((x,t) \in \mathbb {R}\times \, ]0,\infty [\), and \(u(x,0) = u_0(x)\) for all \(x \in \mathbb {R}\). We could also consider classical solutions on a finite time interval [0, T[ instead of \([0,\infty [\); the adaptations in the conditions described above are obvious.

Remark 2.1

In the context of partial differential conservation laws the term classical solution is also used, e.g., by Dafermos (cf. [9, Sect. 4.1]), for locally Lipschitz continuous functions u that satisfy the differential equation almost everywhere on the \((x,t)\)-domain. We could mimic this here with bounded and locally Lipschitz continuous functions, since \(K' \in L^1(\mathbb {R})\) so that the convolution \(K' * u\) is defined. This lies somewhat between typical weak solution concepts and what we will call strong solution below.

A somewhat restrictive, but more systematic and modern approach is to first identify some topological multiplicative algebra X of functions on the real line that is invariant under differentiation \(\partial _x\) and such that the operator of convolution with K acts continuously \(X \rightarrow X\). A standard example is \(X = H^\infty (\mathbb {R})\). Assuming \(u_0 \in X\), one then searches for a solution on [0, T[ of (1) and (5) in the sense (see Footnote 3) that \(u \in C^1(]0,T[, X) \cap C([0,T[, X)\) should satisfy \(u(0) = u_0\) and (1) holds as an equation in X for \(0< t < T\), i.e.,

$$\begin{aligned} \forall t \in \mathbb {R}, 0< t < T:\quad u'(t) + u(t) \partial _x u(t) + K * \partial _x u(t) = 0. \end{aligned}$$
(6)

In cases where T may be taken arbitrarily large we speak of a solution global in time.

Remark 2.2

For any u satisfying Eq. (6) in the above sense we have \(u'(t) = - u(t) u_x(t) - K * u_x(t)\) for all \(t > 0\), where the right-hand side belongs to C([0, T[, X). Therefore, \(u' = \partial _t u\) can be continuously extended to \(t = 0\) and we may thus specify \(u \in C^1([0,T[, X) \cap C([0,T[, X)\) from the outset.

The required invariance of X under differentiation makes it hard to obtain X itself as a Banach algebra, but an alternative is to resort to a scale of Banach spaces \(X^s\) (\(s \in [0,\infty [\)) with \(X^{s_2} \hookrightarrow X^{s_1}\), if \(s_1 \le s_2\) and differentiation being continuous \(X^{s+1} \rightarrow X^s\). The standard examples are Sobolev-type spaces, in particular, \(X^s = H^s(\mathbb {R})\) with \(X^0 = L^2(\mathbb {R})\). In the latter case, we also know that we obtain a Banach algebra, if \(s > 1/2\) (and we adapt the Sobolev norm by an appropriate constant factor; cf. [1, Theorem 4.39]). Observe that moreover, \(v \mapsto K*v\) is a continuous operator on \(H^s(\mathbb {R})\) for every \(s \ge 0\), since \(K \in L^1(\mathbb {R})\) (hence \(\widehat{K * v} = \widehat{K} \cdot \widehat{v}\) with \(\widehat{K}\) continuous and bounded). Therefore, the Sobolev spaces \(H^s(\mathbb {R})\) (with \(s > 1/2\)) provide an example of the following set-up.

Suppose \(s_0 \ge 0\) and \(X^s\) (\(s > s_0\)) is a scale of Banach algebras of function spaces on \(\mathbb {R}\), \(X^{s_2} \hookrightarrow X^{s_1}\) (\(s_0 < s_1 \le s_2\)), the product in \(X^s\) being pointwise multiplication of functions, and such that convolution by K acts continuously on every space and differentiation is continuous \(X^{s+1} \rightarrow X^s\). Let \(s > s_0\) and \(0 < T \le \infty \). A strong solution on the time interval [0, T[ of the Cauchy problem (1) and (5) with initial value \(u_0 \in X^{s + 1}\) is given by an element \(u \in C^1([0,T[,X^s) \cap C([0,T[, X^{s + 1})\) such that \(u(0) = u_0\) and (6) holds as an equation in \(X^{s}\) for \(0< t < T\).

A typical notion of well-posedness of the Cauchy problem (1) and (5) will require that, given any \(u_0 \in X^{s + 1}\), there is some \(0 < T \le \infty \) such that a unique solution u to (6) with \(u(0) = u_0\) exists in \(C^1([0,T[,X^s) \cap C([0,T[, X^{s + 1})\) and that the solution map \(u_0 \mapsto u\) is continuous, e.g., for every \(T_1 \in \, ]0,T[\) as a map between the Banach spaces \(X^{s +1} \rightarrow C([0,T_1], X^{s + 1})\), where the norm on \(C([0,T_1], X^{s + 1})\) is \(\sup _{0 \le t \le T_1} {\left\| u(t) \right\| }_{X^{s+1}}\) (the supremum exists, because \([0,T_1]\) is compact, and this was the reason for taking \(T_1 < T\)). A reasonable variant of the notion may speak of well-posedness on the closed finite time interval \([0,T_1]\), if the solution exists and is unique in \(C^1([0,T_1],X^s) \cap C([0,T_1], X^{s + 1})\) with continuity of \(u_0 \mapsto u\) as above.

Proofs of well-posedness typically establish a so-called a priori estimate of \(\sup _{0 \le t \le T_1} {\left\| u(t) \right\| }_{X^{s+1}} \) in terms of some concrete bounded function of the life span \(T_1\), the regularity s, and \({\left\| u_0 \right\| }_{X^{s+1}}\). In case of well-posedness, the maximal life span T associated with a given regularity \(s > s_0\) and an initial value \(u_0 \in X^{s+1}\) is the supremum of all \(T_1 > 0\) such that a (unique) solution exists in \(C^1([0,T_1],X^s) \cap C([0,T_1], X^{s + 1})\) with \(u(0) = u_0\). To have a unique strong solution global in time thus means that we have the maximal life span \(T = \infty \). On the other hand, a situation with finite maximal life span, i.e., \(T < \infty \), does lead to blow-up of the solution in finite time and is also the starting point for discussions of the question of wave breaking.

Blow-up of strong solutions and wave breaking: Suppose now that for given \(s > s_0\) and initial wave profile \(u_0 \in X^{s+1}\) we have maximal life span \(T < \infty \). Since \(u \in C^1([0,T[,X^s)\) and \(u \in C([0,T[,X^{s+1})\) at least one of the following two situations has to arise: (a) There is no continuous extension of \(t \mapsto u(t)\), \([0,T[\, \rightarrow X^{s+1}\) at \(t=T\); (b) \(t \mapsto u(t)\), \([0,T[\, \rightarrow X^s\) cannot be extended as a continuously differentiable map up to \(t = T\). We claim that (a) must hold, i.e.,

$$\begin{aligned} \text {the map }t \mapsto u(t), [0,T[\, \rightarrow X^{s+1},\hbox { cannot possess a continuous extension to }t = T,\nonumber \\ \end{aligned}$$
(7)

for we could otherwise extend u to a solution with a life span larger than T: Indeed, if (7) is false, then \(v_0 := \lim _{t \rightarrow T} u(t) \in X^{s+1}\) can serve as an initial value with some life time \(T_0 > 0\) for a unique solution \(v \in C^1([0,T_0], X^s) \cap C([0,T_0],X^{s+1})\). We patch the two solutions u and v into one function \(w \in C([0,T+T_0],X^{s+1})\), i.e., \(w(t) := u(t)\) (\(0 \le t < T\)) and \(w(t) := v(t - T)\) (\(T \le t \le T+T_0\)), which solves (6) for \(t \ne T\), because the equation is autonomous, and satisfies \(w \in C^1([0,T[\, \cup \, ]T, T+ T_0], X^s)\). We have to show that w is \(C^1\) at \(t = T\) and also solves (6) there. We clearly have \(w'(T+) := \lim _{t \downarrow T} w'(t) = v'(0)\). The limit from the left, \(w'(T-) := \lim _{t \uparrow T} w'(t)\), also exists, since again by (6) we may represent \(w'(t)\) in terms of w(t) and \(w_x(t)\) and both are continuous at \(t = T\) with values in \(X^s\) by the negation of (7). This yields \(w'(T-) = - v(0) \partial _x v(0) - K * \partial _x v(0) = \partial _t v(0) = v'(0)\). Thus, w is \(C^1\) also at \(t = T\) and the validity of (6) on all of \( [0,T+T_0]\) follows from continuity of all terms in it.

The consequence of the finiteness of the life span T expressed in (7) means that u(t) does not converge in \(X^{s+1}\) as \(t \rightarrow T\). In this generality we do not see how to assess whether \({\left\| u(t) \right\| }_{X^{s+1}}\) stays bounded or we have blow-up of the solution at \(t = T\) in the sense that

$$\begin{aligned} \limsup _{t \uparrow T} {\left\| u(t) \right\| }_{X^{s+1}} = \infty . \end{aligned}$$

In the most prominent case with the scale of Sobolev spaces \(H^s(\mathbb {R})\) (\(s > 1/2\)), we have that \(H^{s+1}(\mathbb {R})\) is continuously embedded in the space of bounded \(C^1\) functions with bounded derivative [10, Chapter IV, , Theorem 1]. Thus, we obtain that for any strong solution \(u \in C^1([0,T[,H^s(\mathbb {R})) \cap C([0,T[,H^{s+1}(\mathbb {R}))\) the norms \({\left\| u(t) \right\| }_{L^\infty }\) and \({\left\| \partial _x u(t) \right\| }_{L^\infty }\) are finite for every \(t \in [0,T[\). We say that wave breaking occurs for u at time \(T > 0\) (cf. [6, Definition 6.1]), if the wave itself remains bounded while its slope becomes unbounded at \(t = T\), i.e.

$$\begin{aligned} \sup _{t \in [0,T[} {\left\| u(t) \right\| }_{L^\infty } < \infty \quad \text {and} \quad \limsup _{t \uparrow T} {\left\| \partial _x u(t) \right\| }_{L^\infty } = \infty . \end{aligned}$$
(8)

An analysis of wave breaking should ideally address at least the following two issues:

  1. (a)

    Whether a finite maximal life span \(T < \infty \) for a strong solution u necessarily implies wave breaking for this solution at time T.

  2. (b)

    Identification of a certain class of initial wave profiles \(u_0\) such that the maximal life span of the corresponding strong solution is indeed finite. Hence wave breaking does definitely occur for the strong solutions with these initial values.

2.2 Weak(er) solution concepts

2.2.1 Weak solutions

We recall the standard procedure to obtain an interpretation of the Cauchy problem (1) and (5) in a weak or distributional sense: Suppose u is a classical solution, write out (1) in the form (3) or (4), multiply the equation by an arbitrary test function \(\phi \) from the space \(\mathcal {D}(\mathbb {R}^2) := C^\infty _\text {c}(\mathbb {R}^2)\) of smooth functions with compact support, and integrate with respect to x over all of \(\mathbb {R}\) and with respect to t over the half-line \([0,\infty [\); integration by parts and observing \(u(x,0) = u_0(x)\) then yields an integral identity, which reads

$$\begin{aligned}&\int _0^\infty \int _{-\infty }^\infty \Big ( - u(x,t) \partial _t \phi (x,t) - \frac{u^2(x,t)}{2} \partial _x \phi (x,t) + \big (K' *u(.,t)\big )(x) \phi (x,t) \Big ) \, dx \, dt \\&\quad = \int _{-\infty }^\infty u_0(x) \phi (x,0)\, dx. \end{aligned}$$

Since \(K' \in L^1(\mathbb {R})\), it does make sense also for measurable functions u and \(u_0\) such that u is bounded on \(\mathbb {R}\times [0,T]\) for every \(T > 0\) and \(u_0\) is (locally) bounded.

Definition 2.3

A measurable function \(u :\mathbb {R}\times [0,\infty [ \rightarrow \mathbb {R}\) that is bounded on \(\mathbb {R}\times [0,T]\) for every \(T > 0\) is called a weak solution of the Cauchy problem (1), (5) with initial value \(u_0 \in L^\infty (\mathbb {R})\), if

$$\begin{aligned}&\int _0^\infty \int _{-\infty }^\infty \Big ( u(x,t) \partial _t \phi (x,t) + \frac{u^2(x,t)}{2} \partial _x \phi (x,t) - \big (K' *u(.,t)\big )(x) \phi (x,t) \Big ) \, dx \, dt \nonumber \\&\quad + \int _{-\infty }^\infty u_0(x) \phi (x,0)\, dx = 0 \end{aligned}$$
(9)

holds for every test function \(\phi \in \mathcal {D}(\mathbb {R}^2)\).

In the situation of the above definition, let us extend u to a measurable function on all of \(\mathbb {R}^2\) by setting \(u(x,t) := 0\) for negative t. Upon writing \(K' * u = \partial _x(K * u)\) this leads to the distributional identity

$$\begin{aligned} \mathop {\mathrm {div}}\nolimits _{(x,t)} (A_1,A_2) = u_0 \otimes \delta \quad \text { on } \mathbb {R}^2, \end{aligned}$$

where \(A_1 := u^2/2 + K * u\) and \(A_2 := u\). We may thus deduce from [9, Lemma 1.3.3] (similarly as in the discussion in [9, Sect. 4.3]) the following, upon possibly modifying u on a set of measure zero: For any relatively compact open subset B of \(\mathbb {R}\), the map \(t \mapsto u(.,t)|_{B}\) is weak* continuous from \([0,\infty [\) into \(L^\infty (B)\). This implies that u may be considered a continuous map

$$\begin{aligned}{}[0,\infty [ \rightarrow \mathcal {D}'(\mathbb {R}), t \mapsto u(t), \end{aligned}$$

where u(t) is defined, for any \(t \ge 0\), by its action on test functions \(\varphi \in \mathcal {D}(\mathbb {R})\) as

$$\begin{aligned} \langle u(t) , \varphi \rangle := \int u(x,t) \varphi (x)\, dx. \end{aligned}$$

In particular, we obtain that \(\lim _{t \rightarrow 0} \int (u(x,t) - u_0(x)) \varphi (x)\, dx = 0\) for any test function \(\varphi \in \mathcal {D}(\mathbb {R})\), i.e.,

$$\begin{aligned} u(t) \rightarrow u_0 \text { in }\mathcal {D}'(\mathbb {R})\text { as }t\rightarrow 0,\quad \text {or simply, }u(0) = u_0\text { holds in }\mathcal {D}'(\mathbb {R}), \end{aligned}$$

which also explains in what sense the initial value is attained for a weak solution according to Definition 2.3. As the following remark illustrates, we cannot hope for a considerably stronger notion of continuity.

Remark 2.4

In general, even with B compact, the weak* continuity of a map \(v :[0,\infty [ \rightarrow L^\infty (B)\) does not imply strong continuity of v as a map into some \(L^p(B)\) although \(L^\infty (B) \subseteq L^p(B)\). (An example with \(B = [0,1]\) is provided by \(v :[0,\infty [ \rightarrow L^\infty ([0,1])\), where \(v(t)(x) := \exp (ix/t)\), if \(t > 0\), and \(v(0) := 0\): Strong discontinuity of v at \(t = 0\) is obvious from \({\left\| v(t) - v(0) \right\| }_{L^p} = 1\) for all \(t > 0\); for any \(f \in L^1([0,1])\), continuity of \(\langle v(t) , f \rangle = \int _0^1 \exp (ix/t) f(x)\, dx\) in \(t >0\) is clear; to check weak* continuity of v at \(t = 0\), suppose \(t_n > 0\), \(t_n \rightarrow 0\), and let f be arbitrary from the dense subspace \(C^1([0,1]) \subseteq L^1([0,1])\); integration by parts gives \(\langle v(t_n) , f \rangle \rightarrow 0\) and, since \({\left\| v(t_n) \right\| }_{L^\infty } = 1\), a standard variant of the Banach-Steinhaus theorem [33, Sect. 3.3, Proposition  2] yields the pointwise convergence \(v(t_n) \rightarrow 0 = v(0)\) on all of \(L^1([0,1])\), thus the weak* continuity of v at \(t = 0\).) The author would like to use this opportunity to correct a slight slip of argument in a related discussion in [22, the paragraph below Definition 2.1] for periodic solutions on the torus \(\mathbb {T}\). The claim \(\lim _{t \rightarrow 0} {\left\| u(t) - u_0 \right\| }_{L^1(\mathbb {T})} = 0\) made there is true, but the reasoning rests on the entropy condition, which provides the weak* continuity of \(t \mapsto u(t)\) and of \(t \mapsto u(t)^2\) (compare with the discussion leading to (14) below) and the Cauchy–Schwarz inequality then yields the upper bound \((\int |u(t) - u_0| dx)^2 \le \int |u(t) - u_0|^2 dx = \int u(t)^2 dx - 2 \int u(t) u_0 dx + \int u_0^2 dx \rightarrow 0\) as \(t \rightarrow 0\) and completes the argument.
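The counterexample in Remark 2.4 can be made tangible with a small numerical experiment. This is a sketch under our own assumptions: midpoint-rule quadrature, the sample function \(f(x) = 1 + x^2\) in the usage note, and hypothetical function names. For this f, integration by parts gives \(|\langle v(t) , f \rangle | \le t\,(|f(1)| + |f(0)| + \int _0^1 |f'|\,dx) = 4t\), whereas the \(L^p\) distance of v(t) to \(v(0) = 0\) stays equal to 1.

```python
import cmath


def v(t, x):
    """The oscillating function v(t)(x) = exp(i x / t) for t > 0."""
    return cmath.exp(1j * x / t)


def pairing_with(f, t, n=100000):
    """Midpoint-rule approximation of <v(t), f> = int_0^1 exp(ix/t) f(x) dx."""
    h = 1.0 / n
    return h * sum(v(t, (i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))


def lp_distance_to_zero(t, p=2, n=2000):
    """Midpoint-rule value of the L^p(0,1) norm of v(t) - v(0) = v(t)."""
    h = 1.0 / n
    integral = h * sum(abs(v(t, (i + 0.5) * h)) ** p for i in range(n))
    return integral ** (1.0 / p)
```

Evaluating `abs(pairing_with(lambda x: 1.0 + x * x, t))` for decreasing t should produce values bounded by roughly 4t, while `lp_distance_to_zero(t)` remains 1 up to rounding. This is the gap between weak* and strong continuity at \(t = 0\).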

A typical phenomenon with nonlinear hyperbolic conservation laws is non-uniqueness of weak solutions to the Cauchy problem, in particular, for the Burgers equation [9, Sect. 4.4]. As the Fornberg–Whitham equation is a non-local linear perturbation of the Burgers equation by the convolution term, it seems plausible that non-uniqueness is an issue (see Footnote 4) there as well. In any case, guided by the success of entropy (admissibility) conditions on weak solutions for pure partial differential conservation laws, such methods have also been employed for the Fornberg–Whitham equation.

2.2.2 Weak entropy solutions

Let us write Eq. (4) so as to resemble a scalar balance law, but with a non-local right-hand side as a “source”,

$$\begin{aligned} \partial _t u + \partial _x\left( \frac{u^2}{2}\right) = - K' * u . \end{aligned}$$

Introducing an entropy-entropy flux pair \(\eta , Q:\mathbb {R}\rightarrow \mathbb {R}\), where \(\eta \) is convex and \(Q'(z) = \eta '(z) z\) [9, Sect. 3.2], we obtain for any classical solution, \(\partial _t \eta (u) + \partial _x Q(u) = \eta '(u) \partial _t u + Q'(u) \partial _x u = \eta '(u) ( \partial _t u + u \partial _x u) = - \eta '(u) (K' * u)\), thus

$$\begin{aligned} \partial _t \eta (u) + \partial _x Q(u) + \eta '(u) (K' * u) = 0. \end{aligned}$$
(10)

We cannot expect this equation to extend to weak solutions as well, but we may note that for any bounded measurable function u, the various compositions with u appearing in this equation are defined as locally bounded (Lebesgue) measurable functions: Indeed, convexity of \(\eta \) implies that \(\eta \) is (locally Lipschitz) continuous and \(\eta '\) is increasing, hence both \(\eta \) and \(\eta '\) are Borel measurable and locally bounded; furthermore, \(z \mapsto Q'(z) = \eta '(z) z\) is measurable and locally bounded, hence also Q is locally Lipschitz continuous; therefore, in all the compositions \(\eta \circ u\), \(\eta ' \circ u\), and \(Q \circ u\), the left member is Borel measurable, hence the composition is Lebesgue measurable.

Recall the following notion of admissibility for a weak solution to the hyperbolic partial differential conservation law \(\partial _t u + \partial _x f(u) + g(u) = 0\) with entropy–entropy flux pair \(\eta \) and Q, \(Q'(z) = \eta '(z) f'(z)\): One replaces the differential equation \(\partial _t \eta (u) + \partial _x Q(u) + \eta '(u) g(u) = 0\) for classical solutions, the analog of (10), by the inequality \(\partial _t \eta (u) + \partial _x Q(u) + \eta '(u) g(u) \le 0\) and requires that it holds in the distributional sense, i.e., \(-(\partial _t \eta (u) + \partial _x Q(u) + \eta '(u) g(u))\) shall be equal to a non-negative measure; it even turns out that this inequality alone already implies that u is a weak solution (see, e.g., [9, the brief discussion following Definition 6.2.1]). Taking this as a guideline for the Fornberg–Whitham equation and replacing (10) accordingly, we will thus consider measurable locally bounded solutions u of the distributional inequality

$$\begin{aligned} \partial _t \eta (u) + \partial _x Q(u) + \eta '(u) (K' * u) \le 0 \end{aligned}$$
(11)

and obtain a concept that is compatible with, but in-between, those of weak and of strong solutions. Before implementing this in detail for the Cauchy problem (1) and (5), we will simplify matters by a typical reduction in the set of all possible entropies \(\eta \) used in the inequality (11), which is based on the observation that on finite intervals any convex function may be approximated by linear combinations of linear functions and functions of the form \(z \mapsto |z - \lambda |\) (cf. [19, discussion of Theorem 1.5.1, page 25] and [9, Sect. 6.2]). Namely, we need to consider only the so-called Kružkov entropy–entropy flux pairs [9, Eq. (6.2.6)] with parameter \(\lambda \in \mathbb {R}\) of the form

$$\begin{aligned} \eta (z) = |z - \lambda |, \quad Q(z) = \mathop {\mathrm {sgn}}(z - \lambda ) \frac{z^2 - \lambda ^2}{2} = \frac{1}{2} |z - \lambda | (z + \lambda ). \end{aligned}$$
(12)
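The two expressions for Q in (12) and the compatibility relation \(Q'(z) = \eta '(z) z\) for \(z \ne \lambda \) are easy to confirm numerically. The following sketch uses central differences; the function names, the step size, and the sample points in the usage note are our own assumptions.

```python
def sgn(s):
    """Sign function with sgn(0) = 0."""
    return (s > 0) - (s < 0)


def eta(z, lam):
    """Kruzkov entropy eta(z) = |z - lam|."""
    return abs(z - lam)


def Q(z, lam):
    """Entropy flux in the first form of (12)."""
    return sgn(z - lam) * (z * z - lam * lam) / 2.0


def Q_alt(z, lam):
    """Entropy flux in the second form of (12)."""
    return 0.5 * abs(z - lam) * (z + lam)


def central_diff(g, z, h=1e-6):
    """Central difference quotient approximating g'(z)."""
    return (g(z + h) - g(z - h)) / (2.0 * h)
```

For example, with `lam = 0.7` and any z at a distance larger than the step size from `lam`, the value `central_diff(lambda s: Q(s, lam), z)` should agree with `sgn(z - lam) * z` up to rounding, since Q is a quadratic polynomial on either side of \(\lambda \).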

We summarize the discussion so far in the following solution concept.

Definition 2.5

(Intermediate version) Let \(u_0 \in L^\infty (\mathbb {R})\). A measurable function \(u :\mathbb {R}\times [0,\infty [ \rightarrow \mathbb {R}\) that is bounded on \(\mathbb {R}\times [0,T]\) for every \(T > 0\) is called a weak entropy solution of the Cauchy problem (1) and (5), if

$$\begin{aligned} 0&\le \int _{0}^{\infty } \int _{-\infty }^{\infty } \Big ( |u(x,t) - \lambda | \partial _t \phi (x,t) + \mathop {\mathrm {sgn}}( u(x,t)-\lambda ) \frac{u^2(x,t) - \lambda ^2}{2}\partial _x \phi (x,t)\nonumber \\&\quad -\mathop {\mathrm {sgn}}(u(x,t)-\lambda ) \big ( K'*u(\cdot ,t) \big ) (x) \phi (x,t) \Big ) \,dx \,dt + \int _{-\infty }^{\infty } |u_0(x) - \lambda |\phi (x,0)\,dx \end{aligned}$$
(13)

holds for arbitrary \(\lambda \in \mathbb {R}\) and nonnegative test functions \(\phi \in \mathcal {D}(\mathbb {R}^2)\).

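Entropy solutions are weak solutions: It is easy to check that the condition (13) in Definition 2.5 implies (9), since for any given \(\phi \) we may choose \(\lambda = -r\) and \(\lambda = r\), where \(r > 0\) is sufficiently large such that \(|u| < r\) and \(|u_0| < r\) hold on the support of \(\phi \). Indeed, for nonnegative \(\phi \) the choice \(\lambda = -r\) in (13) shows that the left-hand side of (9) is nonnegative, while \(\lambda = r\) shows that it is nonpositive; the terms involving only r cancel upon integration by parts. By linearity in \(\phi \), the identity (9) then extends to arbitrary test functions. Thus, every weak entropy solution is a weak solution of the Cauchy problem.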

Remark 2.6

It is equivalent to add \(K' * \lambda = K * \lambda ' = 0\) in the convolution term of the integral (13), i.e., change \(\mathop {\mathrm {sgn}}(u(x,t)-\lambda ) \big ( K'*u(\cdot ,t) \big ) (x) \phi (x,t)\) to \(\mathop {\mathrm {sgn}}(u(x,t)-\lambda ) \big ( K'*(u(\cdot ,t)-\lambda ) \big ) (x) \phi (x,t)\) there. We note this only to clarify consistency with the formulae mentioned in [20, 22].

Observe that for any weak entropy solution u (with the entropy–entropy flux pair given in (12)), the term \(\eta '(u) (K' * u)\) in (11) is a bounded measurable function, so that \( \partial _t \eta (u) + \partial _x Q(u)\) is equal to some signed measure. We may therefore again invoke [9, Lemma 1.3.3], but now also with \(\eta (u)\) in place of u, where \(\eta \) is any convex function. In particular, we may choose \(\eta \) quadratic and obtain that \(t \mapsto u(t)\) and \(t \mapsto u(t)^2\) both induce weak* continuous maps from \([0,\infty [\) into \(L^\infty (B)\) for any relatively compact open subset B. We claim that

$$\begin{aligned} t \mapsto u(t) \text { is norm continuous } [0,\infty [ \rightarrow L^1(B). \end{aligned}$$
(14)

Let \(t_n, t_0 \ge 0\) with \(t_n \rightarrow t_0\) (\(n \rightarrow \infty \)). We have to show that \(v_n := u(t_n)|_B\) converges to \(v_0 := u(t_0)|_B\) in \(L^1(B)\), which follows directly from the Cauchy–Schwarz inequality and the weak*-convergences \(v_n \rightarrow v_0\) and \(v_n^2 \rightarrow v_0^2\), since

$$\begin{aligned}&\Big (\int _B |v_n - v_0| dx\Big )^2 \le \int _B 1^2 dx \cdot \int _B |v_n - v_0|^2 dx \\&\quad = |B| \Big ( \int _B v_n^2 dx - 2 \int _B v_n v_0 dx + \int _B v_0^2 dx\Big ) \rightarrow 0. \end{aligned}$$

The automatic continuity of weak entropy solutions with respect to time expressed in (14) suggests that with an initial value \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\) one might hope to obtain even \(u \in C([0,\infty [,L^1(\mathbb {R}))\) for the weak entropy solution. Such a set-up works fine with scalar (partial differential) conservation laws (cf. [9, Chapter VI]) and turns out to be well-suited also for the Cauchy problem of the Fornberg–Whitham equation as demonstrated in [15]. We therefore adapt Definition 2.5 accordingly, in particular, the initial value may then be required to be attained directly in the form \(u(0) = u_0\) and need not appear in the integral inequality.

Definition 2.7

Let \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\). A function \(u \in C([0,\infty [, L^1(\mathbb {R}))\) that is bounded on \(\mathbb {R}\times [0,T]\) for every \(T > 0\) is called a weak entropy solution of the Cauchy problem (1) and (5), if \(u(0) = u_0\) and

$$\begin{aligned} 0\le & {} \int _{0}^{\infty } \int _{-\infty }^{\infty } \Big ( |u(x,t) - \lambda | \partial _t \phi (x,t) + \mathop {\mathrm {sgn}}( u(x,t)-\lambda ) \frac{u^2(x,t) - \lambda ^2}{2}\partial _x \phi (x,t)\nonumber \\&-\mathop {\mathrm {sgn}}(u(x,t)-\lambda ) \big ( K'*u(\cdot ,t) \big ) (x) \phi (x,t) \Big ) \,dx \,dt \end{aligned}$$

holds for arbitrary \(\lambda \in \mathbb {R}\) and nonnegative test functions \(\phi \in \mathcal {D}(\mathbb {R}\times ]0,\infty [)\).

2.2.3 Mild solutions

The (inviscid) Burgers equation is just (1) without the convolution term, in which case an alternative approach is to extend the nonlinear map \(v \mapsto \partial _x (v^2/2)\), \(C^1_c(\mathbb {R}) \rightarrow L^1(\mathbb {R})\), to an accretive operator in \(L^1(\mathbb {R})\) and to show that it generates a continuous semigroup of nonlinear contractions on \(L^1(\mathbb {R})\). In case of an initial value \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\) the concept for the Cauchy problem based on this approach is equivalent to that of a weak entropy solution (cf. [9, Sect. 6.4] or [4, Sect. 5.5]).

Let us recall some of the basic notions from nonlinear operator theory [4, Chapter 3] involved here, but writing it out specifically for the Banach space \(L^1 := L^1(\mathbb {R})\). A general (possibly multi-valued) nonlinear operator G on \(L^1\) is defined by a relation \(G \subseteq L^1 \times L^1\). The value of G at \(u \in L^1\) is defined as the subset \(G (u) := \{ v \in L^1 \mid (u,v) \in G\}\), the domain is \(D(G) := \{ u \in L^1 \mid G(u) \ne \emptyset \}\), and the range is \(R(G) := \bigcup _{u \in D(G)} G(u)\). Thus, \(G = \{ (u,v) \mid u \in D(G), v \in G(u) \}\) and in the special case of only single-valued sets G(u) for all \(u \in D(G)\) this is the identification of a map with its graph. For \(G, F \subseteq L^1 \times L^1\) and \(\lambda \in \mathbb {R}\), we define \(\lambda G := \{ (u,\lambda v) \mid (u,v) \in G\}\), the sum \(G + F := \{(u,v+w) \mid (u,v) \in G, (u,w) \in F\}\), and the composition \(G \circ F := \{ (u,w) \mid \exists v \in L^1:(u,v) \in F \text { and } (v,w) \in G\}\). We also set \(G^{-1} := \{ (v,u) \mid (u,v) \in G\}\).
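The set-theoretic operations just introduced can be made concrete in a finite toy model. The following Python sketch is only an illustration, under the assumption that plain numbers stand in for elements of \(L^1\); all function names are ours and do not come from [4].

```python
# Toy model: a (possibly multi-valued) operator is a set of pairs (u, v).
# Numbers stand in for elements of L^1; all names are illustrative assumptions.

def value(G, u):
    """G(u) = {v : (u, v) in G}."""
    return {v for (a, v) in G if a == u}

def domain(G):
    """D(G) = {u : G(u) is nonempty}."""
    return {u for (u, _) in G}

def rng(G):
    """R(G) = union of G(u) over u in D(G)."""
    return {v for (_, v) in G}

def scale(lam, G):
    """lam * G = {(u, lam * v) : (u, v) in G}."""
    return {(u, lam * v) for (u, v) in G}

def add(G, F):
    """G + F = {(u, v + w) : (u, v) in G, (u, w) in F}."""
    return {(u, v + w) for (u, v) in G for (a, w) in F if a == u}

def compose(G, F):
    """G o F = {(u, w) : (u, v) in F and (v, w) in G for some v}."""
    return {(u, w) for (u, v) in F for (a, w) in G if a == v}

def inverse(G):
    """G^{-1} = {(v, u) : (u, v) in G}."""
    return {(v, u) for (u, v) in G}

G = {(0, 0), (1, 1), (1, 2)}   # multi-valued at u = 1
F = {(0, 1), (1, 5), (2, 0)}   # single-valued, i.e. the graph of a map
```

Note that the sum is only defined on \(D(G) \cap D(F)\), which the comprehension in `add` reflects automatically.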

A quasi-accretive nonlinear operator G on \(L^1\) can be characterized by the property that there exists some \(\omega > 0\) such that we have for \(0< \lambda < \frac{1}{\omega }\),

$$\begin{aligned} \forall (u_1,v_1), (u_2,v_2) \in G:\quad {\left\| u_1 - u_2 + \lambda (v_1 - v_2) \right\| }_{L^1} \ge (1 - \lambda \omega ) {\left\| u_1 - u_2 \right\| }_{L^1}, \end{aligned}$$

while G is accretive, if \({\left\| u_1 - u_2 + \lambda (v_1 - v_2) \right\| }_{L^1} \ge {\left\| u_1 - u_2 \right\| }_{L^1}\) holds for some (hence any) \(\lambda > 0\). An accretive operator G is said to be m-accretive, if \(R(I + G) = L^1\) (where I denotes the identity on \(L^1\)); a quasi-accretive operator G is quasi-m-accretive, if \(G + \omega I\) is m-accretive for some \(\omega > 0\).
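To get a feeling for the quasi-accretivity inequality, one may test it on a finite-dimensional analogue. The sketch below assumes vectors with the \(\ell ^1\) norm in place of \(L^1\) functions and uses the pointwise operator \(u \mapsto u^3 - \omega u\), which is our own illustrative choice and not an operator from the text. Pointwise one has \(a - b + \lambda (a^3 - b^3 - \omega (a - b)) = (a-b)(1 - \lambda \omega + \lambda (a^2 + ab + b^2))\), so the inequality holds for \(0< \lambda < 1/\omega \).

```python
import random

def l1(u):
    """Discrete l^1 norm (stand-in for the L^1 norm)."""
    return sum(abs(x) for x in u)

def G(u, omega=0.5):
    """Pointwise toy operator u -> u^3 - omega*u (an illustrative assumption)."""
    return [x**3 - omega * x for x in u]

def quasi_accretive_gap(u1, u2, lam, omega=0.5):
    """Left-hand side minus right-hand side of the quasi-accretivity inequality.

    A nonnegative value (up to rounding) means the inequality holds.
    """
    v1, v2 = G(u1, omega), G(u2, omega)
    lhs = l1([a - b + lam * (p - q) for a, b, p, q in zip(u1, u2, v1, v2)])
    rhs = (1.0 - lam * omega) * l1([a - b for a, b in zip(u1, u2)])
    return lhs - rhs

# Usage: random sample vectors and an admissible lambda in ]0, 1/omega[.
random.seed(1)
u1 = [random.uniform(-3, 3) for _ in range(20)]
u2 = [random.uniform(-3, 3) for _ in range(20)]
gap = quasi_accretive_gap(u1, u2, lam=1.0)
```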

A continuous semigroup of nonlinear operators (respectively, contractions) on \(L^1\) is a family \((S(t))_{t \ge 0}\) of maps \(S(t) :L^1 \rightarrow L^1\) such that \(S(0) = I\), \(S(t_1 + t_2) = S(t_1) \circ S(t_2)\) for all \(t_1, t_2 \ge 0\), the map \(t \mapsto S(t)(u_0)\) is continuous \([0,\infty [ \rightarrow L^1\) for every \(u_0 \in L^1\) (and, in case of contractions, \({\left\| S(t)(u_0) - S(t)(v_0) \right\| }_{L^1} \le {\left\| u_0 - v_0 \right\| }_{L^1}\) holds for all \(u_0, v_0 \in L^1\) and \(t \ge 0\)). The semigroup is said to be generated by the quasi-m-accretive nonlinear operator G, if for every \(u_0\) in the closure \(\overline{D(G)}\) of the domain of G, we have

$$\begin{aligned} S(t)(u_0) = \lim _{n \rightarrow \infty } \big ( I + \frac{t}{n} G \big )^{-n} (u_0). \end{aligned}$$
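The exponential formula above is the limit of implicit Euler steps. As a minimal sketch, assume the scalar toy operator \(G(u) = u^3\) on the real line (our choice for illustration, not an operator from the text); the resolvent \((I + hG)^{-1}\) is then computed by bisection, and the limit can be compared with the explicit solution \(u(t) = u_0 / \sqrt{1 + 2 u_0^2 t}\) of \(u' + u^3 = 0\).

```python
import math

def resolvent(v, h):
    """Solve w + h*w**3 = v for w, i.e. w = (I + h*G)^{-1}(v) with G(w) = w**3."""
    lo, hi = min(0.0, v), max(0.0, v)   # the unique root lies between 0 and v
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + h * mid**3 < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponential_formula(u0, t, n):
    """n-fold application of the resolvent with step t/n, i.e. (I + (t/n) G)^{-n}(u0)."""
    u = u0
    for _ in range(n):
        u = resolvent(u, t / n)
    return u

def exact(u0, t):
    """Explicit solution of u' = -u**3 with u(0) = u0."""
    return u0 / math.sqrt(1.0 + 2.0 * u0**2 * t)
```

Increasing `n` in `exponential_formula(1.0, 1.0, n)` should move the value closer to `exact(1.0, 1.0)`, with an error that is expected to decay like \(1/n\) for this first-order scheme.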

We are now ready to formulate a solution concept for (1) and (5) in terms of semigroups. Let A be the generator of the solution semigroup of contractions for the (inviscid) Burgers equation and denote by B the continuous linear convolution operator \(L^1(\mathbb {R}) \rightarrow L^1(\mathbb {R})\), \(u \mapsto B u := K' * u\) (cf. Lemma 3.8).

Definition 2.8

Suppose that \(A + B\) is quasi-m-accretive and generates the continuous semigroup \((S(t))_{t \ge 0}\) on \(L^1(\mathbb {R})\). If \(u_0 \in L^1(\mathbb {R})\), then \(u(t) := S(t)(u_0)\) (\(t \ge 0\)) defines the mild solution \(u \in C([0,\infty [,L^1(\mathbb {R}))\) of the Cauchy problem (1) and (5).

We recall from [9, Sect. 6.4] or [4, Sect. 3.3 and 5.5] that A is given as the closure of the set \(A_0 \subseteq L^1(\mathbb {R}) \times L^1(\mathbb {R})\), where \(A_0\) is defined to be the set of all pairs \((u,v) \in L^1(\mathbb {R}) \times L^1(\mathbb {R})\) with \(u^2/2 \in L^1(\mathbb {R})\) and satisfying

$$\begin{aligned} \int _\mathbb {R}\mathop {\mathrm {sgn}}( u(x)-\lambda ) \Big ( \frac{u^2(x) - \lambda ^2}{2}\partial _x \varphi (x) + v(x) \varphi (x)\Big )\, dx \ge 0 \end{aligned}$$

for every non-negative \(\varphi \in \mathcal {D}(\mathbb {R})\) and \(\lambda \in \mathbb {R}\).

3 Strong solutions and wave breaking

3.1 Existence and uniqueness of strong solutions for short time

The basic result on classical smooth solutions with initial and spatial \(H^\infty \) regularity was established in [28, Chapter 2], alongside the case of smooth periodic solutions in [28, Chapter 3]. The strategy of proof there is successive approximation, starting with \(u^{(0)}(x,t) := u_0(x)\) (\(x, t \in \mathbb {R}\)), in the form

$$\begin{aligned} \partial _t u^{(n)} + u^{(n-1)} \partial _x u^{(n)} + K' * u^{(n-1)} = 0, \quad u^{(n)} |_{t=0} = u_0, \end{aligned}$$

which requires solving, in each step, a linear hyperbolic equation for \(u^{(n)}\), given \(u^{(n-1)}\). Estimates along the characteristics then allow one to show convergence of the scheme as well as uniqueness, and lead to the following statement, which in particular gives a classical solution.

Theorem 3.1

If \(u_0 \in H^\infty (\mathbb {R})\), then there is some \(T > 0\) such that the Cauchy problem (1) and (5) possesses a unique solution \(u \in C^\infty ([0,T], H^\infty (\mathbb {R}))\).

Ten years later, the following unique existence result with initial and spatial \(H^{k+1}\) regularity (\(k \in \mathbb {N}\), thus \(k+1 \ge 2\)) was established in [11, Theorem 4.1], essentially by deriving a contraction argument for the map \(v \mapsto u\), where u solves

$$\begin{aligned} u_t + u u_x = - K' * v, \quad u|_{t=0} = u_0. \end{aligned}$$

Theorem 3.2

Let \(u_0 \in H^{k+1}(\mathbb {R})\) with \(k \in \mathbb {N}\). Then given any \(T > 0\), which is smaller than some positive bound depending on \({\left\| u_0 \right\| }_{H^{k+1}}\), the Cauchy problem (1) and (5) is uniquely solvable with \(u \in C([0,T],H^{k}(\mathbb {R})) \cap L^\infty ([0,T],H^{k+1}(\mathbb {R}))\).

As far as we understand the details of the proof in [11], it is implicit in its arguments that the actual solution regularity is better than just \(C([0,T],H^{k}(\mathbb {R})) \cap L^\infty ([0,T],H^{k+1}(\mathbb {R}))\), so that one obtains a strong solution. In fact, it follows from the equation that \(\partial _t u = - u \partial _x u - K' * u \in L^\infty ([0,T], H^k(\mathbb {R}))\), so that u is Lipschitz continuous as a map \([0,T] \rightarrow H^k(\mathbb {R})\).

For the periodic case, a similar result, but with spatial \(H^{s+1}\) regularity for general \(s \in \mathbb {R}\) with \(s > 1/2\) and solution \(u \in C([0,T], H^{s+1}(\mathbb {T}))\), was given in [17, Theorem 1]. In addition, continuous dependence of u on the initial data \(u_0 \in H^{s+1}(\mathbb {T})\) is noted there explicitly. Moreover, reasoning again via the equation we have that \(u \in C^1([0,T],H^s(\mathbb {T}))\) as well, hence u is a strong solution. The method of proof in [17] rests on Galerkin approximation and uses involved commutator and regularization techniques to derive the key energy estimates yielding convergence in appropriate function spaces. The well-posedness statement with spatial regularity \(H^{s+1}\) (\(s > 1/2\)) for periodic and non-periodic cases and even for a whole class of related equations is mentioned also in [27, Theorem 1], but there the proof is omitted and only a vague reference to a “standard iteration scheme combined with a closed energy estimate” is made.

The following result from [18, Theorem 1.1] holds for both cases, i.e., with the spatial variable in \(\mathbb {T}\) or \(\mathbb {R}\), and extends well-posedness to spatial regularity measured in the Besov scales \(B^{s+1}_{2,r}\) in place of merely \(H^{s+1} = B^{s+1}_{2,2}\).

Theorem 3.3

Let \(u_0 \in B^{s+1}_{2,r}\) with \(s > 1/2\), \(1< r < \infty \) or \(s +1 = 3/2\), \(r = 1\). Then for any \(0< T < c / {\left\| u_0 \right\| }_{B^{s+1}_{2,r}}\), where c is some positive constant depending only on s, the Cauchy problem (1) and (5) is uniquely solvable with \(u \in C([0,T],B^{s+1}_{2,r})\). Furthermore, the map \(u_0 \mapsto u\) is continuous \(B^{s+1}_{2,r} \rightarrow C([0,T],B^{s+1}_{2,r})\).

We obtain again also \(u \in C^1([0,T],B^{s}_{2,r})\) and thus have a strong solution. The proof starts with a regularizing sequence \((u_0^{(n)})_{n \in \mathbb {N}}\) of the initial value \(u_0\), putting \(u^{(0)} := 0\), and defining \(u^{(n)}\) (\(n \ge 1\)) successively as the solution of the linear hyperbolic Cauchy problem

$$\begin{aligned} \partial _t u^{(n)} + u^{(n-1)} \partial _x u^{(n)} = - K' * u^{(n-1)}, \quad u^{(n)}|_{t=0} = u_0^{(n)}. \end{aligned}$$

It is then shown that energy estimates hold for short enough time \(T > 0\) and allow for extraction of a convergent subsequence which can be used to define a solution. Again commutator estimates involving the regularization are crucial in the process.

Remark 3.4

In case of \(u_0 \in H^2(\mathbb {R})\) one can give an alternative proof for the unique existence of a short-time solution with spatial \(H^2\) regularity based on Kato’s semigroup approach for semi-linear evolution equations. This fact was indicated very briefly in [17, 18] after the basic well-posedness statements. The key elements and a sketch of this are provided in the introductory section of [16] and were worked out in more detail in the first part of the proof of Theorem 1 in [30].

3.2 Wave breaking for strong solutions

In contrast to well-posedness results, an analysis of wave breaking does not need to strive for statements with the lowest possible regularity of the initial value. In a way, it is even more impressive to see smooth initial wave profiles eventually leading to wave breaking.

The first clear indication that wave breaking may indeed happen for solutions of the Fornberg–Whitham equation was given already in [29], where a sketch of arguments was provided including a quantitative asymmetry condition in terms of the minimum and maximum slopes occurring in the initial wave profile (see also [32, Sect. 13.14]). The arguments for a wave breaking result given later in [28] picked up the basic strategy from [29], namely to look at the time development of the locations with minimum and maximum slope in a solution and to consider these as curves in the spatial domain. However, the reasoning in [28] is not mathematically complete, as explained in [7], where the first rigorous proof of a wave breaking result was achieved. A main issue was that one cannot guarantee a time-dependent choice of the minimal or maximal slope location that is smooth with respect to time. The key to overcoming this obstacle is a theorem on the evolution of extrema proved in [7, Theorem 2.1] (see also [6, Appendix 6.3.2] or [8, Page 104, Theorem 5]), which has by now become a standard tool in the analysis of wave breaking and which we therefore state here as a lemma.

Lemma 3.5

Let \(T > 0\) and \(v \in C^1([0,T[,H^2(\mathbb {R}))\). Then for every \(t \in [0,T[\) there is some \(\xi (t) \in \mathbb {R}\) such that

$$\begin{aligned} m(t) := \inf _{x \in \mathbb {R}} \partial _x v(x,t) = \partial _x v(\xi (t),t). \end{aligned}$$

The function \(t \mapsto m(t)\) is locally Lipschitz continuous, thus differentiable almost everywhere, and satisfies

$$\begin{aligned} m'(t) = \partial _t \partial _x v(\xi (t),t) \quad \text {for almost every } t \in \, ]0,T[. \end{aligned}$$

The analogous statement is true with the supremum in place of the infimum. A formulation for the periodic case is slightly simpler due to compactness of the torus [21, Lemma 3.1].
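The content of the lemma can be illustrated numerically for an explicitly given function. The sketch below assumes the sample function \(v(x,t) = (1+t) e^{-(x-t)^2}\), which is our own choice; for it the minimal slope is attained at \(\xi (t) = t + 1/\sqrt{2}\), and the grid-based values of \(m'(t)\) and of \(\partial _t \partial _x v(\xi (t),t)\) can be compared directly.

```python
import math

def dx_v(x, t):
    """Spatial derivative of the sample function v(x,t) = (1+t) * exp(-(x-t)^2)."""
    y = x - t
    return (1.0 + t) * (-2.0 * y) * math.exp(-y * y)

def dt_dx_v(x, t):
    """Mixed derivative d/dt d/dx of the same sample function."""
    y = x - t
    return (-2.0 * y + (1.0 + t) * (2.0 - 4.0 * y * y)) * math.exp(-y * y)

def min_slope(t, lo=-6.0, hi=8.0, n=14001):
    """Grid approximation of m(t) = inf_x dx_v(x,t) together with a minimizer xi(t)."""
    step = (hi - lo) / (n - 1)
    xs = (lo + i * step for i in range(n))
    xi = min(xs, key=lambda x: dx_v(x, t))
    return dx_v(xi, t), xi

def m_prime(t, h=0.01):
    """Central difference quotient approximating m'(t)."""
    return (min_slope(t + h)[0] - min_slope(t - h)[0]) / (2.0 * h)
```

For instance, `m_prime(1.0)` and `dt_dx_v(min_slope(1.0)[1], 1.0)` should agree up to discretization error, which is the relation \(m'(t) = \partial _t \partial _x v(\xi (t),t)\) for this particular v.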

The first part in the definition of wave breaking at time T according to (8) requires that \({\left\| u(t) \right\| }_{L^\infty }\) stays bounded as \(t \rightarrow T\). A nice proof of this fact for solutions with spatial \(H^2\) regularity is given in [16, Proposition 2] based on an adaptation of the above lemma for the extrema of v rather than of \(\partial _x v\). We recall the statement.

Proposition 3.6

If \(u_0 \in H^2(\mathbb {R})\) and \(T > 0\) is the maximal life span of the corresponding unique solution u, then we have

$$\begin{aligned} \sup _{t \in [0,T[} {\left\| u(t) \right\| }_{L^\infty } < \infty . \end{aligned}$$

To prove that wave breaking actually occurs one has to show that there is a certain class of initial values \(u_0 \in H^2(\mathbb {R})\) such that \({\left\| \partial _x u(t) \right\| }_{L^\infty }\) inevitably blows up as t approaches the maximal life span T. We sketch out a basic strategy for such a proof attempt employing Lemma 3.5:

Step 1: Suppose \(u_0 \in H^3(\mathbb {R})\) and \(T > 0\) is the maximal life span of the corresponding unique solution \(u \in C([0,T[,H^3(\mathbb {R})) \cap C^1([0,T[,H^2(\mathbb {R}))\). (Note that we had to assume \(H^3\) regularity in order to meet the regularity requirement \(C^1([0,T[,H^2)\) for the function v as in Lemma 3.5.) For every \(t \in [0,T[\) we define

$$\begin{aligned} m_1(t) := \inf _{x \in \mathbb {R}} \partial _x u(x,t), \quad m_2(t) := \sup _{x \in \mathbb {R}} \partial _x u(x,t) \end{aligned}$$

and \(\xi _1(t), \xi _2(t) \in \mathbb {R}\) such that

$$\begin{aligned} m_1(t) = \partial _x u(\xi _1(t),t), \quad m_2(t) = \partial _x u(\xi _2(t),t) \end{aligned}$$

holds. The regularity of u allows us to differentiate Eq. (1) with respect to x, which yields

$$\begin{aligned} u_{t x} + u_x^2 + u u_{x x} + K * u_{x x} = 0. \end{aligned}$$

Upon observing that \(u_{x x}(\xi _j(t),t) = 0\) holds by definition of \(\xi _j(t)\), we evaluate this equation at \((\xi _j(t),t)\) and obtain

$$\begin{aligned} m_j'(t) + m_j(t)^2 + (K * u_{x x}(t))(\xi _j(t)) = 0 \quad \text {for almost all } t \in [0,T[. \end{aligned}$$
(15)

The convolution term can be estimated from below upon an integration by parts (recalling \(K'(y) = - \mathop {\mathrm {sgn}}(y) e^{- |y|}/2\)) in the following way

$$\begin{aligned}&(K * u_{x x}(t))(\xi _j(t)) = \int _{-\infty }^\infty K'(y) u_x(\xi _j(t) - y, t)\, dy\\&\quad = \frac{1}{2} \int _{-\infty }^0 e^y u_x(\xi _j(t) - y, t)\, dy - \frac{1}{2} \int _0^\infty e^{-y} u_x(\xi _j(t) - y, t)\, dy\\&\quad \ge \frac{m_1(t)}{2} \int _{-\infty }^0 e^y\, dy - \frac{m_2(t)}{2} \int _0^\infty e^{-y} \, dy\\&\quad = \frac{1}{2} (m_1(t) - m_2(t)). \end{aligned}$$

Inserting this into (15) gives the two differential inequalities

$$\begin{aligned} m_j'(t) \le - m_j(t)^2 + \frac{1}{2} (m_2(t) - m_1(t)) \quad \text {for almost all } t \in \, ]0,T[ \text { and } j =1,2.\nonumber \\ \end{aligned}$$
(16)
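The lower bound \((K * u_{x x})(x) \ge \frac{1}{2}(m_1 - m_2)\) that leads to (16) holds at every point x, and this can be checked numerically for a fixed profile. The following sketch assumes the sample profile \(u(x) = -2x e^{-x^2}\) (our choice, with \(m_1 = -2\) and \(m_2 = 4 e^{-3/2}\)) and a plain trapezoidal rule on a truncated interval; it is a sanity check, not part of the proof.

```python
import math

def K(y):
    """Convolution kernel K(y) = exp(-|y|)/2."""
    return 0.5 * math.exp(-abs(y))

def u_x(x):
    """Slope of the sample profile u(x) = -2x exp(-x^2)."""
    return -2.0 * (1.0 - 2.0 * x * x) * math.exp(-x * x)

def u_xx(x):
    """Second derivative of the same sample profile."""
    return (12.0 * x - 8.0 * x**3) * math.exp(-x * x)

def conv_K_uxx(xi, L=12.0, n=4801):
    """Trapezoidal approximation of (K * u_xx)(xi) = int K(y) u_xx(xi - y) dy over [-L, L]."""
    h = 2.0 * L / (n - 1)
    total = 0.0
    for i in range(n):
        y = -L + i * h
        w = 0.5 if i in (0, n - 1) else 1.0   # trapezoidal end-point weights
        total += w * K(y) * u_xx(xi - y)
    return total * h

grid = [-4.0 + 0.01 * i for i in range(801)]
m1 = min(u_x(x) for x in grid)    # grid approximation of the minimal slope
m2 = max(u_x(x) for x in grid)    # grid approximation of the maximal slope
lower_bound = 0.5 * (m1 - m2)
```

Evaluating `conv_K_uxx` at a few sample points and comparing with `lower_bound` shows how much slack the estimate has for this particular profile.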

Step 2: Suppose that

$$\begin{aligned} m_1(0) + m_2(0) + S \le 0 \end{aligned}$$
(17)

holds for some \(S \ge 1\). Adding the two inequalities in (16) and observing \(m_1 \le m_2\) then yields

$$\begin{aligned} (m_1 + m_2)' \le -m_1^2 - m_2^2 + m_2 - m_1 = (m_2 - m_1)(1 + m_1 + m_2) - 2 m_2^2 \le - m_2^2, \end{aligned}$$

which therefore in combination with (17) gives

$$\begin{aligned} \forall t \in [0,T[:\quad m_1(t) + m_2(t) + S \le 0. \end{aligned}$$

We use this now in the inequality (16) for \(j=1\) and obtain

$$\begin{aligned} m_1'\le & {} - m_1^2 + \frac{m_2}{2} - \frac{m_1}{2} \le - m_1^2 + \frac{-S - m_1}{2} - \frac{m_1}{2} \\= & {} - \left( m_1 + \frac{1}{2}\right) ^2 + \frac{1}{4} - \frac{S}{2} \le - \left( m_1 + \frac{1}{2}\right) ^2, \end{aligned}$$

which also implies

$$\begin{aligned} \left( m_1 + \frac{1}{2}\right) ' \le - \left( m_1 + \frac{1}{2}\right) ^2. \end{aligned}$$

Step 3: Putting \(M(t) := m_1(t) + \frac{1}{2}\) we have \(M(0) = m_1(0) + \frac{1}{2} \le - S - m_2(0) + \frac{1}{2} < 0\) (since \(m_2(0) \ge 0\); otherwise, we could not have \(u_0 \in L^2(\mathbb {R})\)) and \(M'(t) \le - M(t)^2\), which means

$$\begin{aligned} \frac{d}{dt} \left( \frac{1}{M(t)} \right) = - \frac{M'(t)}{M(t)^2} \ge 1, \quad M(0) < 0, \end{aligned}$$

and thus implies

$$\begin{aligned} 0 \ge \frac{1}{M(t)} \ge \frac{1}{M(0)} + t \quad (0 \le t < 1/|M(0)| =: t_* \le T). \end{aligned}$$

We conclude that \(M(t) \rightarrow - \infty \) as \(0 < t \rightarrow t_*\), hence \(t_* \ge T\), thus \(t_* = T\), and \({\left\| \partial _x u(t) \right\| }_{L^\infty }\) cannot stay bounded as t approaches T.
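Step 3 is a comparison with the Riccati equation \(M' = -M^2\), whose solution with \(M(0) < 0\) is explicit. The small sketch below records the extremal case of equality and the resulting time \(t_* = 1/|M(0)|\); the function names are ours and the code only restates the formulas above.

```python
def breaking_time_bound(m1_0):
    """t_* = 1/|M(0)| with M(0) = m1(0) + 1/2, assuming M(0) < 0."""
    M0 = m1_0 + 0.5
    if M0 >= 0:
        raise ValueError("requires m1(0) + 1/2 < 0")
    return 1.0 / abs(M0)

def riccati_extremal(M0, t):
    """Solution of M' = -M^2 with M(0) = M0 < 0, valid for 0 <= t < 1/|M0|."""
    if not (M0 < 0 and 0 <= t < 1.0 / abs(M0)):
        raise ValueError("requires M0 < 0 and 0 <= t < 1/|M0|")
    return M0 / (1.0 + M0 * t)
```

For an initial minimal slope \(m_1(0) = -2\) one gets \(M(0) = -3/2\) and \(t_* = 2/3\); the extremal solution satisfies \(1/M(t) = 1/M(0) + t\) and tends to \(-\infty \) as t approaches \(t_*\).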

We may thus state the following wave breaking result corresponding to [7, Theorem 3.2] with two slight differences: First, availability of more general well-posedness results allows for less regular initial data; second, we discussed here only the specific convolution kernel \(K(x) = \exp (-|x|)/2\) and not the whole class of nonzero symmetric kernel functions \(K \in C(\mathbb {R}) \cap L^1(\mathbb {R})\) that are decreasing on \([0,\infty [\).

Theorem 3.7

If \(u_0 \in H^3(\mathbb {R})\) satisfies

$$\begin{aligned} \inf _{x \in \mathbb {R}} u_0'(x) + \sup _{x \in \mathbb {R}} u_0'(x) \le -1, \end{aligned}$$

then we observe wave breaking for the unique solution of the Cauchy problem (1) and (5) with initial value \(u_0\).
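Whether a concrete initial profile meets the slope condition of the theorem is easy to test numerically. As a hedged example, assume the family \(u_0(x) = -A x e^{-x^2}\) with amplitude \(A > 0\) (our choice; these profiles are Schwartz functions, hence in \(H^3(\mathbb {R})\)). Here \(\inf u_0' = -A\) and \(\sup u_0' = 2 A e^{-3/2}\), so the condition holds once A exceeds roughly 1.81.

```python
import math

def slope(x, A):
    """u0'(x) for the sample profile u0(x) = -A * x * exp(-x^2)."""
    return -A * (1.0 - 2.0 * x * x) * math.exp(-x * x)

def slope_sum(A, L=6.0, n=12001):
    """Grid approximation of inf u0' + sup u0' on [-L, L]."""
    xs = [-L + 2.0 * L * i / (n - 1) for i in range(n)]
    s = [slope(x, A) for x in xs]
    return min(s) + max(s)

def satisfies_breaking_condition(A):
    """True if inf u0' + sup u0' <= -1 holds on the grid."""
    return slope_sum(A) <= -1.0
```

With these definitions, `satisfies_breaking_condition(2.0)` is true while `satisfies_breaking_condition(1.0)` is false, which reflects that the asymmetry between steepest descent and steepest ascent has to be large enough in absolute terms.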

Note that the divergence in Step 3 of the above chain of reasoning ultimately rests on the extra condition (17) and this is the prototype of an initial wave profile asymmetry mentioned in the introduction to the current subsection. Unfortunately, a direct comparison of quantitative wave breaking conditions used in various results on wave breaking in the literature is somewhat impaired by the fact that these certainly have to depend on the exact conventions used for scaling and signs in the Fornberg–Whitham equation.

The reasoning in the wave breaking result of [16, Sect. 3] is similar to the above, but uses refined estimates in Steps 1 and 2 and gives a sufficient condition on the minimal and maximal slopes of \(u_0\) weighted by a real parameter from a bounded interval. For the periodic case, a wave breaking result along the lines of the above theorem is proved in [21, Sect. 3]. All the previous sufficient conditions on \(u_0\) have been shown in [30, Theorem 2] to be special cases of one more general condition that still leads to wave breaking. The proof departs from the above strategy after inequalities (16) at the end of Step 1 and succeeds in producing subtle bounds on an appropriate linear combination of \(m_1^2\) and \(m_2 - m_1\), which leads to a sufficient condition of the structure

$$\begin{aligned} m_1(0) < \min \left( - c_1, - c_2(1 + \sqrt{1 + c_3 m_2(0)})\right) \end{aligned}$$

with positive constants \(c_j\) depending on the precise scaling and sign conventions used in the Fornberg–Whitham equation.

For smooth periodic solutions the blow-up of \(u_x\) in finite time is shown in [27, Theorem 2], if \(- \inf _{x \in \mathbb {R}} u_0'(x)\) is sufficiently large, and a similar result is shown in [15, Sect. 4] for the non-periodic case with initial value \(u_0 \in C^1(\mathbb {R}) \cap L^1(\mathbb {R})\). Both of these proofs use elaborate estimates along characteristics and the sufficient conditions require in particular domination of \({\left\| u_0 \right\| }_{L^\infty }\) or \({\left\| u_0 \right\| }_{L^1}\), respectively. To justify these results strictly as proofs of wave breaking, one should also guarantee boundedness of \({\left\| u(t) \right\| }_{L^\infty }\) as t approaches the critical blow-up time \(t_*\). This is implicitly so in [15], since there u is supposed to be the unique weak entropy solution with initial value \(u_0\).

Some numerical case studies of wave breaking as the formation of shocks in weak solutions on the torus are contained in [22]. They suggest that only negative infinities of \(u_x\) are developing and that \(u_x\) stays bounded from above at the moment of wave breaking. This also finds support by the Oleinik type inequality proved in [15, Lemma 2.1] (see also (19) below) for weak entropy solutions on the real line and can be shown directly for strong solutions with spatial \(H^3\) regularity by calling on Lemma 3.5. We will discuss this below after first listing a few basic results about convolution with \(K'\) that will also be useful for the application of semigroup theory later on.

Lemma 3.8

The linear operator \(u \mapsto K'*u\)

(i) is bounded from \(L^q(\mathbb {R})\) to \(L^p(\mathbb {R})\) for all \(p, q \in \mathbb {R}\) with \(1 \le q \le p \le \infty \),

(ii) maps \(BV(\mathbb {R})\) into \(W^{1,\infty }(\mathbb {R}) \cap W^{1,1}(\mathbb {R})\),

(iii) and for any \(u \in L^\infty (\mathbb {R})\) one has that

$$\begin{aligned} \sup |\partial _x (K'*u)| \le 2 \Vert u \Vert _\infty . \end{aligned}$$
(18)

Proof

For the first point, as \(K' \in L^r(\mathbb {R})\) for every \(1 \le r \le \infty \), we obtain \(\Vert K'*u \Vert _p \le \Vert K' \Vert _r \Vert u \Vert _q\), if \(1 \le q := r p / (r + r p - p) \le p\), from Young’s convolution inequality [12, Proposition 8.7].

For the second point, we note that \(\partial _x (K' * u) = K' * Du\), where we may interpret Du as the BV derivative of u, which is a finite measure by assumption (cf. [25, Definition 7.1]). We can then apply the version of Young’s inequality for convolution with measures (cf. [12, Proposition 8.49]) to obtain (with \(1 \le p \le \infty \) arbitrary)

$$\begin{aligned} \Vert K'*Du\Vert _p \le |Du|(\mathbb {R}) \cdot \Vert K'\Vert _p, \end{aligned}$$

where |Du| denotes the total variation measure associated with Du.

For the last point, note that \(K'' = K - \delta \) and we therefore obtain

$$\begin{aligned} \sup |\partial _x (K' * u)|&= \Vert K'' * u \Vert _\infty \le \Vert K * u\Vert _\infty + \Vert u \Vert _\infty \\&\le (\Vert K \Vert _1 + 1) \Vert u \Vert _\infty = 2 \Vert u \Vert _\infty . \end{aligned}$$

\(\square \)
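The exponent bookkeeping in the first point of the proof and the norms of \(K'\) can be made explicit. The following sketch uses exact rational arithmetic for the Young exponent relation \(1/p + 1 = 1/r + 1/q\) and the closed form \(\Vert K' \Vert _r^r = 2^{1-r}/r\), which follows from \(|K'(x)| = e^{-|x|}/2\); the helper names are ours.

```python
from fractions import Fraction

def young_q(p, r):
    """Exponent q = r*p/(r + r*p - p) for finite p, r, so that 1/p + 1 = 1/r + 1/q."""
    p, r = Fraction(p), Fraction(r)
    return r * p / (r + r * p - p)

def norm_Kprime(r):
    """L^r norm of K'(x) = -sgn(x) exp(-|x|)/2 for 1 <= r < infinity.

    Uses int |K'|^r dx = 2 * int_0^inf (exp(-x)/2)^r dx = 2^(1-r) / r.
    """
    return (2.0 ** (1 - r) / r) ** (1.0 / r)
```

For example, \(p = 4\) and \(r = 2\) give \(q = 4/3\), and the values \(\Vert K' \Vert _1 = 1\) and \(\Vert K' \Vert _2 = 1/2\) are reproduced by `norm_Kprime`.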

Assuming initial data \(u_0 \in H^3(\mathbb {R}) \subset W^{1,1}(\mathbb {R}) \subset BV(\mathbb {R}) \subset L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\) there is some maximal life span \(T > 0\) of a unique strong solution \(u \in C([0,T[,H^3(\mathbb {R})) \cap C^1([0,T[,H^2(\mathbb {R}))\). In case \(T < \infty \), u ceases to be a strong solution in the form of wave breaking at time \(t = T\), i.e., \(\sup _{0 \le t < T}\Vert u(t)\Vert _\infty \) is bounded while \(\limsup _{t \uparrow T}\Vert u_x(t)\Vert _\infty = \infty \), in fact, \(\inf _{x \in \mathbb {R}} \partial _x u (x,t) \rightarrow - \infty \) as \(t \rightarrow T\) (cf. [16, Proposition 1]). Furthermore, there are sufficient conditions on the initial wave profile \(u_0\) to definitely cause \(T < \infty \), thus wave breaking occurs even for smooth initial values. Combined with the following proposition we may deduce that in case of wave breaking

$$\begin{aligned} \text {a shock in the spatial wave profile can only form as a downward jump} \end{aligned}$$

(in the direction of growing x).

Proposition 3.9

If \(u_0 \in H^3(\mathbb {R})\) and T is the maximal life span of the corresponding unique strong solution \(u \in C([0,T[, H^3(\mathbb {R})) \cap C^1([0,T[,H^2(\mathbb {R}))\) to (1) and (5), then

$$\begin{aligned} \sup _{0 \le t< T} \sup _{x \in \mathbb {R}} \partial _x u (x,t) < \infty . \end{aligned}$$

Proof

We put \(M(t) := \sup _{x \in \mathbb {R}} u_x(x,t)\) and may call on Lemma 3.5 to deduce the following three facts: M is differentiable almost everywhere on [0, T[; for every \(t \in [0,T[\) there exists \(\xi (t) \in \mathbb {R}\) such that \(M(t) = u_x(\xi (t),t)\); and we have the relation

$$\begin{aligned} M'(t) = u_{tx}(\xi (t),t) \quad \text {a.e. on } [0,T[. \end{aligned}$$

Noting that \(u_{xx}(\xi (t),t) = 0\) we obtain upon differentiation in (1) from (18) in Lemma 3.8

$$\begin{aligned} M'(t)= & {} - M(t)^2 - 0 - \partial _x \big (K *u_x(.,t)\big )(\xi (t))\\\le & {} -M(t)^2 + 2 \Vert u(t)\Vert _\infty \quad \text {for almost every }t \in [0,T[. \end{aligned}$$

By Proposition 3.6 we have \(c^2 := 2 \sup _{0 \le t< T} \Vert u(t)\Vert _\infty < \infty \) (with \(c \ge 0\)) and obtain

$$\begin{aligned} M'(t) \le c^2 - M(t)^2. \end{aligned}$$

Note that \(0 \le M(0) = \sup _{x \in \mathbb {R}} u_0'(x) < \infty \), since \(u_0 \in L^2(\mathbb {R}) \cap C^1(\mathbb {R})\) and \(u_0' \in H^2(\mathbb {R}) \subset L^\infty (\mathbb {R})\). Now consider the solution y to the initial value problem \(y(0) = M(0)\), \(y'(t) = c^2 - y(t)^2\): It is constant, if \(M(0) = c\); in case \(M(0) < c\), we have \(y(t) = c \tanh (c t + \alpha )\) with \(\tanh (\alpha ) = M(0)/c < 1\); and in case \(M(0) > c\), we have \(y(t) = c \coth (c t + \alpha )\) with \(\coth (\alpha ) = M(0)/c > 1\). In any case, y(t) exists for all \(t \in [0,T]\) and is bounded. Since \(M(0) = y(0)\), \(M' \le c^2 - M^2\), and \(y' = c^2 - y^2\), an application of the comparison theorem for ordinary differential equations (e.g., [2, Lemma 16.4]) yields \(M(t) \le y(t)\) for \(t \in [0,T[\), thus M is bounded from above. \(\square \)
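The explicit comparison solutions used in the proof can be written down and checked directly. The sketch below implements the three cases for \(y' = c^2 - y^2\), \(y(0) = M(0) \ge 0\), under the assumption \(c > 0\); the function name is ours.

```python
import math

def comparison_solution(M0, c, t):
    """Solution of y' = c^2 - y^2 with y(0) = M0 >= 0, assuming c > 0 and t >= 0."""
    if M0 == c:
        return c                                   # constant solution
    if M0 < c:
        alpha = math.atanh(M0 / c)                 # tanh(alpha) = M0 / c
        return c * math.tanh(c * t + alpha)
    alpha = math.atanh(c / M0)                     # coth(alpha) = M0 / c
    return c / math.tanh(c * t + alpha)            # c * coth(c t + alpha)
```

In every case the solution stays between \(M(0)\) and c for all \(t \ge 0\), so \(\max (M(0), c)\) is an upper bound for y and hence, by the comparison argument, for M.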

Remark 3.10

The above proposition does not tell whether the height of a downward shock that was formed due to wave breaking will stay bounded or decrease as time progresses. As the existence of bounded, piecewise smooth, traveling wave solutions with entropic jump discontinuities shows, we cannot in general expect a decrease of the shock height with time for entropy solutions in the sense of Definition 2.5 (see the paragraph on heteroclinic connections in [11, Sect. 3] or [20]).

4 Weak solutions from entropy concepts, semigroup methods, or traveling waves

4.1 Weak entropy solutions

First indications that the method of vanishing viscosity produces a convergent scheme seem to be given in [28, Chapter 5], although their notion of generalized solution remains vague and uniqueness is not addressed. The basic strategy of vanishing viscosity was later used to produce the following rigorous statement on weak entropy solutions for the Fornberg–Whitham equation in [11, Theorem 4.2], where spatial BV regularity is assumed. (The original formulation does not describe the relation of the solution u with the initial value \(u_0\), but we know from our previous discussion of the solution concepts that \(u(0) = u_0\) holds in the sense of \(u \in C([0,\infty [, L^1(\mathbb {R}))\), since \(u_0 \in BV(\mathbb {R}) \subseteq L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\).) The uniqueness follows in the proof given in [11] from an intermediate \(L^1\)-stability result (see also (20) below).

Theorem 4.1

If \(u_0 \in BV(\mathbb {R})\), then there is a unique weak entropy solution u (in the sense of Definition 2.7) to the Cauchy problem (1) and (5), which in addition satisfies \(u \in L^\infty _\text {loc}([0,\infty [,BV(\mathbb {R}))\).

This result was extended in [15, Theorem 1.2] to the general case described by the situation in Definition 2.7. We note that although [15, Definition 1.1] does not explicitly specify the precise quality assumed of the initial value \(u_0\) and the formulation in [15, Theorem 1.2] speaks only of \(u_0 \in L^1(\mathbb {R})\), we have some doubt whether the concept and all the proof details are true without having also \(u_0 \in L^\infty (\mathbb {R})\) a priori. In any case, our formulation of the main result with the a priori requirement \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\) is certainly covered by and coherent with [15].

Theorem 4.2

Given \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\), the Cauchy problem (1) and (5) has a unique weak entropy solution u (in the sense of Definition 2.7). It satisfies the Oleinik type inequality

$$\begin{aligned} \forall t > 0, \forall x, y \in \mathbb {R}, x < y :\quad u(y,t) - u(x,t) \le \left( \frac{1}{t} + 2 + 2t (1 + 2 e^t {\left\| u_0 \right\| }_{L^1}) \right) (y - x).\nonumber \\ \end{aligned}$$
(19)

Moreover, the following \(L^1\)-stability holds: If v is the weak entropy solution corresponding to the initial value \(v_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\), then

$$\begin{aligned} \forall t > 0:\quad {\left\| u(t) - v(t) \right\| }_{L^1} \le e^t {\left\| u_0 - v_0 \right\| }_{L^1}. \end{aligned}$$
(20)

Of course, as a corollary of (20) (with \(v_0 = 0\)) we obtain the following estimate for every \(t \ge 0\):

$$\begin{aligned} {\left\| u(t) \right\| }_{L^1} \le e^t {\left\| u_0 \right\| }_{L^1}. \end{aligned}$$

The proof of Theorem 4.2 is by a so-called flux-splitting method and uses approximate solutions based on discretized time steps and the solution semigroup for the Burgers equation applied in each interval between these time steps. Convergence is shown by fine techniques involving estimates for the Burgers semigroup and regularity properties of solutions to the Poisson equation. Continuity of u as a function into \(L^1\) follows from a tightness condition of the approximate solution sequence, which is shown via energy estimates establishing Hölder regularity of the characteristics along the way. Boundedness of weak entropy solutions combined with the \(L^1\) continuity of u with respect to time produces an integral inequality, which implies \(L^1\)-stability and hence also uniqueness.
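To convey the flavor of such a splitting, we add a generic numerical sketch. It is explicitly not the construction of [15]: we assume a first-order Godunov step for the Burgers part, an explicit Euler step for the nonlocal term \(-K' * u\) with a truncated discrete kernel, and zero values outside the grid; all names and parameters are ours. Under the CFL restriction used below, both substeps are controlled in the discrete \(L^1\) norm, so the discrete analogue of \({\left\| u(t) \right\| }_{L^1} \le e^t {\left\| u_0 \right\| }_{L^1}\) can be observed.

```python
import math

def godunov_flux(ul, ur):
    """Godunov flux for the Burgers flux f(u) = u^2/2."""
    return max(0.5 * max(ul, 0.0) ** 2, 0.5 * min(ur, 0.0) ** 2)

def burgers_step(u, dt, dx):
    """One conservative Godunov step; values outside the grid are taken to be zero."""
    ext = [0.0] + u + [0.0]
    flux = [godunov_flux(ext[i], ext[i + 1]) for i in range(len(ext) - 1)]
    return [u[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(len(u))]

def source_step(u, dt, dx, kprime):
    """Explicit Euler step for u_t = -K' * u with a truncated discrete kernel."""
    n, J = len(u), len(kprime) // 2
    out = []
    for i in range(n):
        conv = 0.0
        for j in range(-J, J + 1):
            if 0 <= i - j < n:
                conv += kprime[j + J] * u[i - j]
        out.append(u[i] - dt * dx * conv)
    return out

def l1_norm(u, dx):
    """Discrete L^1 norm on a uniform grid."""
    return dx * sum(abs(v) for v in u)

def split_solve(u0, dx, t_end, cfl=0.4):
    """Alternate Burgers and source steps up to time t_end (illustrative splitting)."""
    J = int(round(10.0 / dx))
    # K'(y) = -sgn(y) exp(-|y|)/2 sampled at y = j*dx, with value 0 at y = 0
    kprime = [0.0 if j == 0 else -math.copysign(0.5 * math.exp(-abs(j) * dx), j)
              for j in range(-J, J + 1)]
    u, t = list(u0), 0.0
    while t < t_end - 1e-12:
        vmax = max(max(abs(v) for v in u), 1e-8)
        dt = min(cfl * dx / vmax, 0.05, t_end - t)   # CFL-limited time step
        u = burgers_step(u, dt, dx)
        u = source_step(u, dt, dx, kprime)
        t += dt
    return u
```

A typical call samples a Gaussian on a grid, e.g. `u0 = [math.exp(-x * x) for x in xs]` with `xs` uniformly spaced by `dx = 0.1` on \([-10, 10]\), and then compares `l1_norm(split_solve(u0, dx, 0.5), dx)` with \(e^{0.5}\) times `l1_norm(u0, dx)`.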

The recent publication [26] states results partially parallel to Theorem 4.2 and independently sketches arguments based on vanishing viscosity solutions and compensated compactness. The solutions obtained are in coherence with the current setting, although the solution concepts given in [26, Definitions 2.6 and 2.7] fail to clarify details about the initial data and neither the definition of weak entropy solutions nor the main existence theorem [26, Theorem 3.6] include continuity aspects of the solution with respect to time.

Well-posedness and \(L^1\)-stability for periodic weak entropy solutions to the Fornberg–Whitham equation have been shown independently in [22, Sect. 2] along the lines of Kružkov’s original paper [24] and with an adaptation of an older technique by Fujita and Kato for the Navier–Stokes equation based on the analytic semigroup generated by \(- \varepsilon \partial _x^2\) on \(L^2(\mathbb {T})\).

Remark 4.3

We note that [14, Theorem 1] contains an \(L^1\)-stability statement analogous to (20), although for strong solutions and only for times of their common existence. There are also the following bounds for the spatial \(L^\infty \) norm of a strong solution u with existence time \(T > 0\) and initial value \(u_0 \in H^s(\mathbb {R})\) (\(s > 3/2\)), given in [14, Lemma 4],

$$\begin{aligned} \Vert (K * u_x)(t)\Vert _{L^\infty } \le \Vert u_0\Vert _{L^2} \quad \text {and}\quad \Vert u(t)\Vert _{L^\infty } \le \Vert u_0\Vert _{L^\infty } + t \Vert u_0\Vert _{L^2}. \end{aligned}$$

4.2 Mild solutions

Our goal here is to establish mild solutions and also the well-posedness of weak entropy solutions via the generation of a non-linear semigroup. The basic properties of the non-local term according to Lemma 3.8 allow us to see the following theorem almost as a direct application of the theory of semigroups on Banach spaces generated by non-linear operators (as described, e.g., in [4]).

Theorem 4.4

If \(u_0 \in L^1(\mathbb {R})\) then there exists a unique global mild solution \(u \in C([0,\infty [, L^1(\mathbb {R}))\) of the Cauchy problem (1) and (5) in the sense of semigroups.

Proof

As the non-local term is bounded in \(L^1\), and as the inviscid Burgers term generates a non-linear contraction semigroup in \(L^1\), we may consider the former a perturbation of the latter and apply the theory of semigroups generated from nonlinear operators. More precisely, let A be the non-linear operator associated with the Burgers equation as in [4, Sect. 3.3] or in [9, Sect. 6.4], and put \(Bu := K' * u\) with domain \(D(B) = L^1(\mathbb {R})\). By Lemma 3.8 we have the finite operator norm \(b:= \Vert B\Vert < \infty \) for B as linear map \(L^1(\mathbb {R}) \rightarrow L^1(\mathbb {R})\). We claim that \(A+B\) is quasi-m-accretive on \(L^1(\mathbb {R})\) (in the sense of [4, Sect. 3.1]).

To establish this, we first note that from the accretiveness of A and boundedness of B, we have for \(v_1, v_2 \in L^1(\mathbb {R})\) and \(0< \lambda < 1/b\),

$$\begin{aligned}&\Vert v_1 - v_2 + \lambda ( A(v_1) + B v_1 - A(v_2) - B v_2)\Vert _{1} \\&\quad \ge \Vert v_1 - v_2 + \lambda ( A(v_1) - A(v_2))\Vert _{1} - \lambda \Vert B (v_1 - v_2) \Vert _{1} \ge (1 - \lambda b) \Vert v_1 - v_2\Vert _{1}, \end{aligned}$$

hence \(A+B\) is quasi-accretive.

Second, appealing to [4, Proposition 3.3], the quasi-accretive operator \(A+B\) is quasi-m-accretive if we can show surjectivity of \(I + \lambda (A + B)\) for small \(\lambda > 0\), e.g. by proving solvability of the following equation in \(L^1(\mathbb {R})\) for u, given v:

$$\begin{aligned} (I + \lambda A)^{-1} (v) - (I + \lambda A)^{-1} (\lambda B u) = u. \end{aligned}$$

As long as \(\lambda < 1/b\), the left-hand side is a contraction, since by accretiveness of A,

$$\begin{aligned} \Vert (I + \lambda A)^{-1} (\lambda B u_1) - (I + \lambda A)^{-1} (\lambda B u_2) \Vert _{1} \le \Vert \lambda B u_1 -\lambda B u_2 \Vert _{1} \le \lambda b \Vert u_1 - u_2 \Vert _{1} \end{aligned}$$

and hence the equation is solvable.

For the quasi-m-accretive operator \(A+B\) we have by [4, Proposition 3.6] that its domain \(D(A+B)\) is dense in \(L^1(\mathbb {R})\). Therefore, the existence and uniqueness of a mild solution in the sense of non-linear semigroups with initial value \(u_0 \in \overline{D(A+B)} = L^1(\mathbb {R})\) now follows from [4, Corollary 4.1]. \(\square \)
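The proof above only uses that \(b\) is finite. As a hedged numerical aside, and assuming that the boundedness of B is obtained from Young's convolution inequality (Lemma 3.8 is not restated here, so this is an assumption), one has \(b \le \Vert K'\Vert _{L^1} = 1\), so that every \(0< \lambda < 1\) satisfies the smallness condition \(\lambda < 1/b\). The following Python sketch checks this on a uniform grid; the grid parameters and the Gaussian sample function are illustrative choices and not part of the original argument.

```python
import numpy as np

def K_prime(x):
    """Derivative of K(x) = exp(-|x|)/2 for x != 0 (the value 0 is assigned at x = 0)."""
    return -0.5 * np.sign(x) * np.exp(-np.abs(x))

def l1_norm(values, h):
    """Riemann-sum approximation of the L^1 norm on a uniform grid with spacing h."""
    return h * np.sum(np.abs(values))

def apply_B(u, x, h):
    """Discrete version of B u = K' * u, truncated to the symmetric grid x."""
    return h * np.convolve(K_prime(x), u, mode="same")

h = 0.02
x = h * np.arange(-1500, 1501)   # symmetric grid on [-30, 30]
u = np.exp(-x**2)                # hypothetical sample function in L^1

norm_K_prime = l1_norm(K_prime(x), h)      # approximates ||K'||_{L^1} = 1
norm_u = l1_norm(u, h)
norm_Bu = l1_norm(apply_B(u, x, h), h)     # approximates ||B u||_{L^1}

print("||K'||_1 approx:", norm_K_prime)
print("||B u||_1 / ||u||_1 approx:", norm_Bu / norm_u)
```

The printed ratio stays below the approximate value of \(\Vert K'\Vert _{L^1}\), in line with the estimate \(\Vert B u\Vert _1 \le b \Vert u\Vert _1\) used in the proof.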

Remark 4.5

In the course of the above proof we preferred to show directly that \(A+B\) is quasi-m-accretive, while alternatively, one could also just observe quasi-accretiveness of B and apply an appropriate variant of the basic perturbation result proved in [3, Theorem 3.2] (and mentioned also in [4, Theorem 3.1]).

The following result gives an independent proof of the well-posedness part from Theorem 4.2.

Theorem 4.6

If \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\), then the global mild solution \(u \in C([0,\infty [, L^1(\mathbb {R}))\) of the Cauchy problem (1) and (5) according to Theorem 4.4 is also a weak entropy solution.

Proof

Assuming now that \(u_0 \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\), the equivalence of the semigroup solution according to Theorem 4.4 and the entropy solution follows similarly as in the proof of [4, Theorem 5.6]. We indicate a few adaptations implementing the proof variant in our case: First, since we have shown that the range of \(I + \lambda (A + B)\) for small \(\lambda > 0\) is all of \(L^1(\mathbb {R})\), we may note that following [4, Theorem 4.3, Eq. (4.17)], the mild solution can be constructed as the limit of resolvents

$$\begin{aligned} u(t) = \lim _{n \rightarrow \infty } \left( I + \frac{t}{n}(A+B)\right) ^{-n} u_0 \end{aligned}$$

uniformly in t on compact intervals. Second, the resolvent-like bounds established above hold (with an appropriately changed constant \(b > 0\)) with respect to any \(L^p\)-norm, \(1 \le p \le \infty \), since this is true for the unperturbed operator A, the convolution operator B is bounded on \(L^p(\mathbb {R})\) as well (Lemma 3.8), and the above estimates for \(A+B\) were generic, i.e., without using special properties of the \(L^1\)-norm. Combining these facts, we find that the solutions \(u_\varepsilon \) to the \(\varepsilon \)-regularized difference equation as in [4, Eq. (5.125)], namely \((u_\varepsilon (t) - u_\varepsilon (t - \varepsilon ))/\varepsilon + A(u_\varepsilon (t)) + B u_\varepsilon (t) = 0\) for \(t > \varepsilon \) and \(u_\varepsilon (t) = u_0\) for \(t < 0\), satisfy

$$\begin{aligned} \Vert u_\varepsilon (t)\Vert _p \le e^{bt}\Vert u_0\Vert _p, \end{aligned}$$

uniformly in \(\varepsilon > 0\), and \(u_\varepsilon (t) \rightarrow u(t)\) as \(\varepsilon \rightarrow 0\), uniformly for t in a compact time interval. This uniform upper bound for \(u_\varepsilon \) allows us to enter the proof of [4, Theorem 5.6] at (5.128) and follow the line of arguments there up to the end with \(A+B\) always replacing A, which concludes the proof of our theorem. \(\square \)
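To illustrate the limit of resolvents used in the proof above, here is a minimal Python sketch for a scalar toy problem. The operators \(A u = a u\) with \(a \ge 0\) and \(B u = -b u\) are hypothetical stand-ins chosen only because the resolvent is explicit; they are not the Burgers operator and the convolution operator of the text. In this toy case the implicit Euler iterates \((I + \frac{t}{n}(A+B))^{-n} u_0\) converge to \(e^{-(a-b)t} u_0\) and obey the growth bound \(|u(t)| \le e^{bt} |u_0|\).

```python
import math

def implicit_euler(u0, t, n, a, b):
    """Apply the resolvent (I + (t/n)(A + B))^{-1} n times to u0, where
    A u = a*u (accretive for a >= 0) and B u = -b*u are scalar toy operators."""
    step = t / n
    if step * b >= 1.0:
        raise ValueError("the step size must satisfy (t/n) * b < 1")
    u = u0
    for _ in range(n):
        u = u / (1.0 + step * (a - b))   # one resolvent step
    return u

def exact(u0, t, a, b):
    """Exact solution of u' + (a - b) u = 0 with u(0) = u0."""
    return u0 * math.exp(-(a - b) * t)

# Hypothetical parameter values for the illustration.
a, b, t, u0 = 0.5, 1.0, 1.0, 1.0
approximations = [implicit_euler(u0, t, n, a, b) for n in (10, 100, 1000)]
errors = [abs(value - exact(u0, t, a, b)) for value in approximations]
print(errors)
```

The errors shrink roughly in proportion to \(1/n\), which is the expected first-order behaviour of the implicit Euler scheme.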

4.3 Continuous weak traveling wave solutions

In the theory of (local) scalar conservation laws it can be shown that continuous weak solutions are always entropy solutions ([9, Theorem 11.13.1]). The proof employs fine-tuned techniques from the theory of generalized characteristics and might be out of reach in our case of a nonlocal conservation law with the Fornberg–Whitham equation. However, for the special situation of traveling waves we show a related result below. Its hypothesis includes the case of the famous peakon solution with initial wave profile \(4 \exp (-|y|/2)/3\), which gives a weak solution to the Fornberg–Whitham equation (cf. [5, 13] or [20, Example 1.4]). In general for a traveling wave \(u(x,t) = v(x - ct)\) we do not want to require \(u_0 = v \in L^1(\mathbb {R})\), since this would exclude many interesting cases. Thus, in the following statement we do not assume that \(x \mapsto u(x,t)\) is integrable for every t and resort to Definition 2.5 instead of 2.7.

Proposition 4.7

Any weak traveling wave solution \((x,t) \mapsto u(x,t) = v(x - ct)\) with bounded and absolutely continuous (see Footnote 6) wave profile v is an entropy solution in the sense of Definition 2.5.

Proof

From the assumption that \(u(x,t) = v(x - ct)\) defines a weak solution it is not difficult (see Footnote 7) to derive the following equation, which holds in the sense of distributions as well as pointwise almost everywhere on \(\mathbb {R}\):

$$\begin{aligned} \left( \frac{(v - c)^2}{2} + K * v \right) ' = 0. \end{aligned}$$
(21)

We will show that for any \(\lambda \in \mathbb {R}\) and nonnegative test function \(\phi \) in \(C^\infty _\text {c}(\mathbb {R}^2)\),

$$\begin{aligned}&\int \limits _{0}^{\infty } \int \limits _{-\infty }^{\infty } |v(x - ct) - \lambda | \partial _t \phi (x,t) \,dx dt \\&\quad + \int \limits _{0}^{\infty } \int \limits _{-\infty }^{\infty } \mathop {\mathrm {sgn}}( v(x - ct)-\lambda ) \frac{v^2(x - ct) - \lambda ^2}{2}\partial _x \phi (x,t) \,dx dt\\&\quad - \int \limits _{0}^{\infty } \int \limits _{-\infty }^{\infty } \mathop {\mathrm {sgn}}(v(x - ct)-\lambda )K'*(v(\cdot - ct)) (x) \phi (x,t) \,dx dt \\&\quad + \int \limits _{-\infty }^{\infty } |v(x) - \lambda |\phi (x,0)\,dx = 0. \end{aligned}$$

Let us denote the four integral terms on the left-hand side by \(I_1, I_2, I_3, I_4\), respectively, i.e., we claim that \(I_1 + I_2 - I_3 + I_4 = 0\).

Fubini’s theorem and integration by parts with respect to t give

$$\begin{aligned} I_1= & {} \int \limits _{-\infty }^{\infty } \int \limits _{0}^{\infty } |v(x - ct) - \lambda | \partial _t \phi (x,t) \,dt dx\\= & {} \int \limits _{-\infty }^{\infty } \Big ( \int \limits _{0}^{\infty } \Big (\mathop {\mathrm {sgn}}(v(x - ct) - \lambda ) c v'(x - ct) \phi (x,t) \Big ) \,dt + |v(x - ct) - \lambda | \phi (x,t) |_{t=0}^{t = \infty } \Big )dx\\= & {} \int \limits _{-\infty }^{\infty } \int \limits _{0}^{\infty } \Big (\mathop {\mathrm {sgn}}(v(x - ct) - \lambda ) c v'(x - ct) \phi (x,t) \Big ) \,dt dx - \int \limits _{-\infty }^{\infty } |v(x) - \lambda | \phi (x,0) \,dx, \end{aligned}$$

where we already observe that the last term cancels \(I_4\).

For \(I_2\) we use that \(f(y) = \mathop {\mathrm {sgn}}(y - \lambda ) (y^2 - \lambda ^2)/2\) is locally Lipschitz continuous with derivative \(f'(y) = \mathop {\mathrm {sgn}}(y - \lambda ) y\) for \(y \ne \lambda \), and integrate by parts with respect to x to obtain

$$\begin{aligned} I_2= & {} \int \limits _{0}^{\infty } \int \limits _{-\infty }^{\infty } \mathop {\mathrm {sgn}}( v(x - ct)-\lambda ) \frac{v^2(x - ct) - \lambda ^2}{2}\partial _x \phi (x,t) \,dx dt\\= & {} - \int \limits _{0}^{\infty } \int \limits _{-\infty }^{\infty } \mathop {\mathrm {sgn}}( v(x - ct)-\lambda ) v(x - ct) v'(x - ct) \phi (x,t) \,dx dt. \end{aligned}$$

Summing up, we find

$$\begin{aligned}&I_1 + I_2 - I_3 + I_4\\&\quad = \int \limits _{0}^{\infty } \int \limits _{-\infty }^{\infty } \mathop {\mathrm {sgn}}( v(x - ct)-\lambda ) \underbrace{\Big ( (c - v(x - ct)) v'(x - ct) - (K' * v) (x - ct) \Big )}_{= - ((v-c)^2/2)' - K' * v = 0 \text { a.e.}} \phi (x,t) \,dx dt \\&\quad = 0. \end{aligned}$$

\(\square \)
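As a sanity check of Eq. (21) on the peakon profile \(v(y) = 4 \exp (-|y|/2)/3\) mentioned above, the following Python sketch evaluates \((v - c)^2/2 + K * v\) at a few points by trapezoidal quadrature. The wave speed \(c = 4/3\) is an assumption here (it is not stated explicitly in the text above), and the grid parameters and sample points are illustrative. A short hand computation with \(K * v = (16/9) e^{-|y|/2} - (8/9) e^{-|y|}\) suggests the constant value \(8/9\).

```python
import numpy as np

def K(x):
    """Convolution kernel K(x) = exp(-|x|)/2."""
    return 0.5 * np.exp(-np.abs(x))

def peakon(y):
    """Peakon wave profile v(y) = (4/3) exp(-|y|/2)."""
    return (4.0 / 3.0) * np.exp(-np.abs(y) / 2.0)

def conserved_quantity(xi, c=4.0 / 3.0, half_width=60.0, h=0.01):
    """Approximate (v(xi) - c)^2 / 2 + (K * v)(xi) by the trapezoidal rule
    on the truncated interval [-half_width, half_width]."""
    n = int(round(half_width / h))
    y = h * np.arange(-n, n + 1)
    integrand = K(xi - y) * peakon(y)
    conv = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return 0.5 * (peakon(xi) - c) ** 2 + conv

sample_points = [-3.0, -1.0, 0.0, 0.5, 2.0]   # includes the peak at 0
values = [conserved_quantity(xi) for xi in sample_points]
print(values)
```

Up to quadrature error the values agree with each other, so the derivative in (21) vanishes for this profile, including across the corner at \(y = 0\).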

The hypothesis of absolute continuity in the previous proposition certainly would allow for non-smoothness in v harsher than the Lipschitz continuous corner singularity in the example of the peakon solution. An interesting question is whether absolutely continuous functions v with a cusp at some location \(x_0 \in \mathbb {R}\), where the derivative is locally integrable but unbounded, qualify as initial values of weak traveling wave solutions u. If we have, in addition, \(v \in L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\) this can be ruled out immediately: The proof of Proposition 4.7 shows that u would also be a weak entropy solution in the sense of Definition 2.7 and we could construct a contradiction to the Oleinik type estimate (19) in Theorem 4.2 for any \(t > 0\) near the translated cusp location \(x_0 + c t\).

Example 4.8

In [5] the authors construct an interesting class of examples of bounded continuous traveling waves with a cusp and satisfying Eq. (1) in the pointwise classical sense everywhere on \(\mathbb {R}^2\) except for the straight line \(x = ct\). We consider the particular case with parameters \(A=0\) and \(c > 4/3\) in [5, Theorem 2.4(i), Theorem 2.5(iii), and Case III in Sect. 3] and obtain the traveling wave \(u(x,t) := v(x - ct)\), where v is a bounded continuous function on \(\mathbb {R}\) that is \(C^3\) off 0 and satisfies

$$\begin{aligned} \left( 1 - \frac{d^2}{d\xi ^2}\right) \left( \frac{(v - c)^2}{2}\right) ' + v' = 0 \quad \text { on } \mathbb {R}{\setminus }\{ 0\}. \end{aligned}$$
(22)

Furthermore, \(0 < v \le c\), \(v(0) = c\), \(\xi \mapsto v(\xi )\) is strictly increasing for \(\xi < 0\), \(v(-\xi ) = v(\xi )\), \(\lim _{\xi \rightarrow \pm \infty } v(\xi ) = 0\), and we have, with the constant \(b := 4 |c|^{3/2} \sqrt{c - 4/3}> 0\),

$$\begin{aligned} v(\xi ) = c - 2 b |\xi |^{1/2} + O(|\xi |) \quad \text {and}\quad v'(\xi ) = - b \mathop {\mathrm {sgn}}(\xi ) |\xi |^{-1/2} + O(1) \quad (\xi \rightarrow 0). \end{aligned}$$
(23)

Note that v is absolutely continuous. In fact, an inspection of the change of variables in the construction of [5, Sect. 3, Case III] shows that we have even \(v \in W^{1,1}(\mathbb {R}) \subset BV(\mathbb {R}) \subset L^1(\mathbb {R}) \cap L^\infty (\mathbb {R})\). Therefore, the argument presented above already shows that u cannot be a weak solution.

However, let us add here also a more direct reasoning why Eq. (22), valid pointwise for \(\xi \ne 0\), cannot guarantee that the initial wave profile v defines a global weak solution u of the Fornberg–Whitham equation. Similar arguments may be applicable to other cases of parameters in this example class as well.

We will show that the precise asymptotic information about v and \(v'\) near \(\xi = 0\) according to (23) allows us to draw the following conclusion: If v has all the properties specified above and the left-hand side of (22) is the restriction of a distribution on \(\mathbb {R}\) which vanishes on \(\mathbb {R}{\setminus }\{0\}\), then

$$\begin{aligned} \left( \frac{(v - c)^2}{2} + K * v \right) ' = - 4 b^2 K'. \end{aligned}$$
(24)

Since Eq. (24) is in contradiction to (21), we may then conclude that \(u(x,t) = v (x - ct)\) cannot define a weak solution of the Fornberg–Whitham equation.

To prove (24), we first note that due to (22) the distribution \((1 - \frac{d^2}{d\xi ^2}) ((v - c)^2/2)' + v'\) has support in the singleton set \(\{ 0\}\), thus equals a finite linear combination of derivatives of the Dirac distribution \(\delta \) (concentrated at \(\xi = 0\)). Recall that v is globally continuous, even \(C^3\) outside \(\xi = 0\), and by (23) the derivative \(v'\) is locally integrable; hence also \(((v-c)^2/2)' = (v-c) v'\) is locally integrable. Therefore the order of the derivatives of \(\delta \) can be at most 1, i.e., there are \(\lambda _0, \lambda _1 \in \mathbb {R}\) such that

$$\begin{aligned} \left( 1 - \frac{d^2}{d\xi ^2}\right) \Big (\frac{(v - c)^2}{2}\Big )' + v' = \lambda _0 \delta + \lambda _1 \delta '. \end{aligned}$$

Upon convolution with K we obtain

$$\begin{aligned} ((v - c)^2/2)' + K' * v = \lambda _0 K + \lambda _1 K', \end{aligned}$$

which implies \(((v - c)^2/2)' = \lambda _0 K + \lambda _1 K' - K' * v \in L^1(\mathbb {R})\) and upon integration over \(\mathbb {R}\) that

$$\begin{aligned} \lambda _0 = \int _{-\infty }^\infty (\lambda _0 K + \lambda _1 K' - K' * v) d \xi = \int _{-\infty }^\infty ((v - c)^2/2)' d\xi = 0, \end{aligned}$$

since \(\lim _{\xi \rightarrow \pm \infty } (v(\xi ) - c)^2/2 = c^2/2\). Thus, we are left with the equation

$$\begin{aligned} ((v - c)^2/2)' + K' * v = \lambda _1 K'. \end{aligned}$$
(25)

Considering again (23), when \(\xi \rightarrow 0\) we have

$$\begin{aligned}&((v(\xi ) - c)^2/2)' \\&\quad = (v(\xi ) - c) v'(\xi ) = (- 2 b |\xi |^{1/2} + O(|\xi |)) (- b \mathop {\mathrm {sgn}}(\xi ) |\xi |^{-1/2} + O(1))\\&\quad = 2 b^2 \mathop {\mathrm {sgn}}(\xi ) + O(|\xi |^{1/2}), \end{aligned}$$

hence \(((v-c)^2/2)'\) has a jump of height \(4 b^2\) at \(\xi = 0\). Recalling \(K'(\xi ) = - \exp (-|\xi |) \mathop {\mathrm {sgn}}(\xi )/2\), using the continuity of \(K' * v\) and of K when taking the differences in (25) as \(\xi \rightarrow \pm 0\) we finally conclude that \(4 b^2 = - \lambda _1\).
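The jump comparison in the last step can be reproduced numerically. The Python sketch below uses a model profile \(v(\xi ) = c - 2 b |\xi |^{1/2} + a |\xi |\) that matches (23) to leading order; the correction coefficient a and the chosen values of b and a are hypothetical and only serve to show that the \(O(|\xi |)\) term does not affect the jump. Since \(K' * v\) is continuous, the jump of \(((v-c)^2/2)'\) at 0 must equal \(\lambda _1\) times the jump of \(K'\), which is \(-1\).

```python
import math

def model_flux_derivative(xi, b, a=0.0):
    """(v - c) * v' for the model profile v = c - 2 b |xi|^(1/2) + a |xi|, xi != 0."""
    s = abs(xi)
    sign = math.copysign(1.0, xi)
    v_minus_c = -2.0 * b * math.sqrt(s) + a * s
    v_prime = sign * (-b / math.sqrt(s) + a)
    return v_minus_c * v_prime

def K_prime(xi):
    """K'(xi) = -sgn(xi) exp(-|xi|)/2 for xi != 0."""
    return -0.5 * math.copysign(1.0, xi) * math.exp(-abs(xi))

def lambda_1(b, a=0.0, eps=1e-10):
    """Ratio of the one-sided jumps at 0 of ((v-c)^2/2)' and of K', as in Eq. (25)."""
    jump_flux = model_flux_derivative(eps, b, a) - model_flux_derivative(-eps, b, a)
    jump_K_prime = K_prime(eps) - K_prime(-eps)
    return jump_flux / jump_K_prime

b, a = 1.5, 0.7   # hypothetical values for illustration
print(lambda_1(b, a), -4.0 * b**2)
```

Both printed numbers agree up to an error of order \(\sqrt{\varepsilon }\), consistent with \(\lambda _1 = - 4 b^2\).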

Remark 4.9

It can be shown (cf. [11, Sect. 3] or [20]) that there are bounded, piecewise smooth, traveling waves with an entropic jump discontinuity that are weak entropy solutions in the sense of Definition 2.5.