1 Introduction and main result

1.1 The CKN-inequality

Caffarelli, Kohn, and Nirenberg [5] introduced, among others, the family of functional inequalities

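$$\begin{aligned} \int _{\mathbb {R}^d}\frac{|\nabla v|^2}{ |x|^{2a}}\ \textrm{d} x\ge C_{a,b}\left( \int _{\mathbb {R}^d}\frac{|v|^q}{ |x|^{bq}}\ \textrm{d} x\right) ^{\frac{2}{q}} , \end{aligned}$$
(1.1)

nowadays known as the CKN-inequality, with dimension \(d\in \mathbb {N}\) and parameters \(a,b\in \mathbb {R}\) such that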

$$\begin{aligned} a<{\text {min}}\{0,b\}+\frac{d-2}{2} \qquad \text {and}\qquad 0\le b-a\le 1. \end{aligned}$$
(1.2)

We will call pairs \((a,b)\) satisfying (1.2) admissible. Note that the first condition reduces to \(2a<d-2\) in case \(d>2\). We use the convention that \(C_{a,b}\) denotes the optimal constant in (1.1). By a scaling argument, one can show that q has to satisfy

$$\begin{aligned} q=\frac{2d}{d-2+2(b-a)} . \end{aligned}$$
(1.3)
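The conditions (1.2) and the formula (1.3) are easy to evaluate for concrete parameters. The following minimal sketch is only an illustration; the function names `is_admissible` and `ckn_exponent` are our own, and the exponent is only returned when the denominator in (1.3) is positive.

```python
def is_admissible(d: int, a: float, b: float) -> bool:
    """Condition (1.2): a < min{0, b} + (d - 2)/2 and 0 <= b - a <= 1."""
    return a < min(0.0, b) + (d - 2) / 2 and 0.0 <= b - a <= 1.0


def ckn_exponent(d: int, a: float, b: float) -> float:
    """Exponent q from (1.3); only returned when the denominator is positive."""
    denominator = d - 2 + 2 * (b - a)
    if denominator <= 0:
        raise ValueError("q from (1.3) is not finite and positive for these parameters")
    return 2 * d / denominator


# Sobolev case a = b = 0 in d = 3: q equals the critical exponent 2d/(d-2).
sobolev_q = ckn_exponent(3, 0.0, 0.0)
# Hardy case a = 0, b = 1 in d = 3: q equals 2.
hardy_q = ckn_exponent(3, 0.0, 1.0)
```

The two sample evaluations correspond to the endpoints \(b-a=0\) and \(b-a=1\) of the admissible range in dimension three.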

Note that for admissible \((a,b)\) the exponent q ranges between 2 and \(2^*\). Here \(2^*\) denotes the critical Sobolev exponent with \(2^*=2d(d-2)^{-1}\) in case \(d\ge 3\) and \(2^*=\infty \) in case \(d=1,2\). In fact, the CKN-inequality contains the classical Sobolev inequality (\(a=b=0\), \(d\ge 3\)) as well as the Hardy inequality (\(a=0\), \(b=1\), \(d\ge 3\)) as special cases. If \((a,b)\) is admissible, v is allowed to be a function in \({\mathcal {D}}^{1}_a(\mathbb {R}^d)\), the completion of \(C_c^\infty (\mathbb {R}^d)\) with respect to the norm

$$\begin{aligned} v\mapsto \left( \int _{\mathbb {R}^d}\frac{|\nabla v|^2}{ |x|^{2a}}\ \textrm{d} x\right) ^{\frac{1}{2}} . \end{aligned}$$

Horiuchi [23] (\(d\ge 2\)) and Catrina and Wang [6] (\(d\ge 1\)) were able to complete the existing results on whether the optimal constant for (1.1) is attained. Indeed, among all admissible \((a,b)\), an affirmative result was proved in case

$$\begin{aligned} 0<b-a<1 \qquad \text {or}\qquad b=a\ge 0 , \end{aligned}$$
(1.4)

which is sharp. We will call admissible pairs \((a,b)\) satisfying (1.4) attainable and denote the set of optimizers of (1.1) by

$$\begin{aligned}{\mathcal {Z}}{:}{=}\{v\in {\mathcal {D}}^{1}_a(\mathbb {R}^d): (1.1)\text { becomes an equality}\}.\end{aligned}$$

If we restrict (1.1) to radial functions, that is, functions that only depend on the radial coordinate, and call the corresponding optimal constant \( C_{a,b}^*\), then, obviously, \(C_{a,b}\le C_{a,b}^*\). We will call an admissible pair \((a,b)\) symmetric if \(C_{a,b}=C_{a,b}^*\). Otherwise, symmetry breaking is said to occur. The constant \(C_{a,b}^*\) can be determined explicitly, and, for \(a\le b<a+1\), the set of radial optimizers is given by

$$\begin{aligned} \left\{ \lambda \frac{\mu ^{\sqrt{\Lambda }}(2q\Lambda )^{\frac{1}{q-2}}}{(1+|\mu \cdot |^{\sqrt{\Lambda }(q-2)})^{\frac{2}{q-2}}}\right\} _{\mu>0, \lambda \in \mathbb {R}} \text { with } \Lambda {:}{=}\left( \frac{d-2-2a}{2}\right) ^2>0 \end{aligned}$$
(1.5)

(see [6, p. 236 f.]). This set agrees with \({\mathcal {Z}}\) in case \((a,b)\) is symmetric [14]. The only exception is \((a,b)=(0,0)\), where the set of optimizers contains, in addition, the translates of functions in (1.5).

Admissible pairs \((a,b)\) with \(a\ge 0\) are well-known to be symmetric (see, e.g., [1, 26, 34], and [9]), so, if symmetry breaking occurs, then necessarily \(a<0\). The fact that symmetry breaking does occur for some parameters was observed by Catrina and Wang [6]. Thereafter, Felli and Schneider found an explicit curve that encloses a region where symmetry breaking occurs; see [18, Corollary 1.2] for \(d\ge 3\) and [16, Theorem 1.1] for \(d=2\). More specifically, among all attainable \((a,b)\), they showed that the pair \((a,b)\) is not symmetric if

$$\begin{aligned} \Lambda >4 \, \frac{d-1}{q^2-4}{=}{:}\Lambda _{FS} . \end{aligned}$$
(1.6)

Note that for fixed dimension d, \(\Lambda = \Lambda _{FS}\) describes a curve in the \((a,b)\)-plane since \(q=q(a,b)\) and \(\Lambda =\Lambda (a)\) are parametrized by \((a,b)\) and a, respectively. The condition (1.6) is trivially satisfied if \(d=1\), and hence symmetry is broken for all attainable \((a,b)\) in this case, which is in line with [6, Theorem 7.2]. After various partial results [13, 15, 27, 32], Dolbeault, Esteban, and Loss [14, Theorem 1.1] were able to settle the longstanding conjecture on the optimal symmetry range by proving that the value \(\Lambda _{FS}\) indeed separates the symmetry from the symmetry breaking region. More concretely, among all attainable \((a,b)\), they proved that the pair \((a,b)\) is symmetric if and only if

$$\begin{aligned} \Lambda \le \Lambda _{FS}. \end{aligned}$$
(1.7)

In fact, they established that all optimizers are radial in the symmetric case. Assuming \(a<0\), the case of equality in (1.7) then determines the FS-curve. At this point, let us briefly mention that being attainable and being symmetric are not disjoint properties. Indeed, we have \(C_{a,a+1}=C^*_{a,a+1}\) by continuity in case \(d\ge 2\) [6, Theorem 1.1(i), Theorem 7.5(i), Remark 3.4], so \((a,b)\) is symmetric for \(b=a+1\) but not attainable. (Note that the formula of \(C_{a,b}^*\) given in [6, Eq. (2.13)] holds for \(d=2\) as well.) If \(d=1\), this follows from [6, p. 254] as \(C_{a,b}^*-C_{a,b}\rightarrow 0\) for \(1+a-b\rightarrow 0\). On the other hand, \(C_{a,a}<C^*_{a,a}\) as computed in [6, Proof of Theorem 1.3(ii)], so admissible \((a,b)\) with \(b=a<0\) are neither symmetric nor attainable.
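To make the criterion (1.7) concrete, the following sketch evaluates \(\Lambda \) and \(\Lambda _{FS}\) for given parameters and solves \(\Lambda =\Lambda _{FS}\) for b at fixed \(a<0\). It is an illustration under the assumptions \(d\ge 2\) and \(q>2\); all function names are our own, and the formula for b on the FS-curve is obtained by combining (1.3) with equality in (1.7), not quoted from the references.

```python
import math


def ckn_exponent(d, a, b):
    """Exponent q from (1.3)."""
    return 2 * d / (d - 2 + 2 * (b - a))


def big_lambda(d, a):
    """Lambda = ((d - 2 - 2a)/2)^2 as in (1.5)."""
    return ((d - 2 - 2 * a) / 2) ** 2


def lambda_fs(d, q):
    """Felli-Schneider threshold Lambda_FS = 4(d - 1)/(q^2 - 4) from (1.6)."""
    return 4 * (d - 1) / (q * q - 4)


def is_symmetric(d, a, b):
    """Criterion (1.7), meant for attainable pairs (a, b) with d >= 2."""
    return big_lambda(d, a) <= lambda_fs(d, ckn_exponent(d, a, b))


def fs_curve_b(d, a):
    """Value of b with Lambda = Lambda_FS for given a < 0 and d >= 2."""
    lam = big_lambda(d, a)
    q = 2 * math.sqrt(1 + (d - 1) / lam)  # solve Lambda = 4(d-1)/(q^2-4) for q
    return a + d / q - (d - 2) / 2        # invert (1.3) for b
```

For instance, in dimension three the pair \((a,b)=(-1,-1/2)\) gives \(q=3\), \(\Lambda =9/4\), and \(\Lambda _{FS}=8/5\), so symmetry breaking occurs, whereas \((a,b)=(-1/10,2/5)\) gives \(\Lambda =9/25\le 8/5\) and is symmetric.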

1.2 Stability for the CKN-inequality—our main result

In this paper we are interested in the question of stability, that is, whether the closeness to 1 of the quotient of the two sides in (1.1) for some v implies the closeness of v to the set \({\mathcal {Z}}\).

For the Sobolev inequality (\(a=b=0\), \(d\ge 3\)), this question was raised by Brezis and Lieb [3]. Bianchi and Egnell [2] gave an affirmative answer with the gradient \(L^2\)-norm \(\Vert \nabla \cdot \Vert _{L^2(\mathbb {R}^d)}\) as a measure for the distance to the set of optimizers. They showed that this distance vanishes at least quadratically in the difference between 1 and the quotient of the two sides in (1.1). Bianchi and Egnell introduced a very robust technique, which has been adapted to many other functional inequalities; see, for instance, [29] for the Hardy–Sobolev inequality or [7] for the fractional Sobolev inequality. This technique is based on two ingredients, namely, a compactness theorem for optimizing sequences and a spectral analysis around an optimizer. For a recent quantitative variant of the basic Bianchi–Egnell argument, leading to an optimal dependence of the stability constants, we refer to [12]. Further progress on related questions can be found in [8, 24, 25]. For an introduction to the Sobolev inequality and its stability, see [20]. A quantitative version of the Hardy inequality (\(a=0\), \(b=1\), \(d\ge 3\)) appeared in [10].

For the CKN-inequality, a Bianchi–Egnell-type stability inequality was recently shown by Wei and Wu [35] in the interior of the symmetric regime (\(\Lambda <\Lambda _{FS}\)); see also [29] for an earlier contribution in case \(a=0\).

Our main result is a stability inequality on the boundary of the symmetric regime, that is, on the FS-curve \(\Lambda =\Lambda _{FS}\). Remarkably, while the Wei–Wu result in the interior of the symmetric regime involves a remainder term quadratic in the distance to the set of optimizers, our bound will involve a remainder term that is quartic in this distance. We will also show that this quartic vanishing is best possible. The reason is that in the spectral analysis part of the Bianchi–Egnell strategy additional zero modes appear, namely, zero modes that do not come from symmetries of the problem. [In passing, we note an inaccuracy in [35]; their quadratic stability result only holds in the parameter range excluding the FS-curve, as the inequality [35, Eq. (4.4)] breaks down due to the existence of non-trivial zero modes. This was also noticed in [11].]

Theorem 1

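(Degenerate stability of the CKN-inequality along the FS-curve) Let \((a,b)\in \mathbb {R}^2\) satisfy \(a<0\) and \(\Lambda =\Lambda _{FS}\) with \(d\ge 2\) and q given by (1.3). Then there is a constant \( c(q,d)>0\) such that for all \(v\in {\mathcal {D}}^{1}_a (\mathbb {R}^d)\),

$$\begin{aligned} &\int _{\mathbb {R}^d}\frac{|\nabla v|^2}{ |x|^{2a}}\ \textrm{d} x- C_{a,b}\left( \int _{\mathbb {R}^d}\frac{|v|^q}{ |x|^{bq}}\ \textrm{d} x\right) ^{\frac{2}{q}}\\ &\quad \ge c(q,d) \inf _{\chi \in {\mathcal {Z}}}\left( \int _{\mathbb {R}^d}\frac{|\nabla (v-\chi )|^2}{ |x|^{2a}}\ \textrm{d} x\right) ^{2}\left( \int _{\mathbb {R}^d}\frac{|\nabla v|^2}{ |x|^{2a}}\ \textrm{d} x\right) ^{-1} . \end{aligned}$$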

Moreover, the inequality is best possible with respect to the quartic vanishing of the distance to \({\mathcal {Z}}\), that is, there is a sequence \((v_n)_n\subset {\mathcal {D}}^1_a(\mathbb {R}^d)\setminus \{0\}\) with

and

This theorem establishes the stability of the CKN-inequality along the FS-curve, but only in a degenerate sense, where the distance to the set of optimizers vanishes faster than quadratically; compare (1.16). The interest in such degenerate stability of functional inequalities has been raised recently through a work by Engelstein, Neumayer, and Spolaor [17], who investigated the quantitative stability of the Yamabe problem for closed Riemannian manifolds. While non-degenerate stability (with a square of the distance to the set of optimizers) was proved for manifolds of a generic type, it was also shown that the manifold

(1.8)

with its standard product metric exhibits only degenerate stability, namely, with an (unspecified) power of the distance that is strictly larger than two. This example builds upon work by Schoen [30]. Recently, it was shown [19] that in the example (1.8) the sharp stability exponent is four. The same phenomenon was observed in other Sobolev-type inequalities. Let us stress that the example (1.8) indeed describes a degenerate scenario since the stability becomes non-degenerate (that is, one obtains a stability inequality with a quadratic distance to the optimizers) when varying the radius of the one-dimensional sphere in (1.8); see [20] for more details. Similarly, as we show in this paper, while non-degenerate stability was shown for the CKN-inequality in the interior of the symmetry region [35], only the weaker notion of degenerate stability with a fourth power in the distance is available along the FS-curve. Therefore, our result proves a loss of stability and highlights the phase transition occurring due to symmetry breaking.

The underlying mechanism for degenerate stability in [19] and in the present paper is similar. It is caused by the presence of zero modes of the Hessian of the deficit functional that do not come from symmetries of the problem. As we will explain below, there are various features of the CKN-setting that make the present analysis substantially harder than the one in [19].

We emphasize that the degeneracy along the FS-curve occurs only on a finite-dimensional subspace of \({\mathcal {D}}^{1}_a (\mathbb {R}^d)\), and hence a stronger stability result actually holds, with right side proportional to

$$\begin{aligned} &\inf _{\chi \in {\mathcal {Z}}}\left( \int _{\mathbb {R}^d}\frac{|\nabla (\Pi _d v-\chi )|^2}{ |x|^{2a}}\ \textrm{d} x\right) ^{2}\left( \int _{\mathbb {R}^d}\frac{|\nabla (\Pi _dv)|^2}{ |x|^{2a}}\ \textrm{d} x\right) ^{-1}\\ &\quad + \inf _{\chi \in {\mathcal {Z}}}\left( \int _{\mathbb {R}^d}\frac{|\nabla (\Pi _d^\perp v-\chi )|^2}{ |x|^{2a}}\ \textrm{d} x\right) , \end{aligned}$$

where \(\Pi _d\) is the orthogonal projection in \(H^1({\mathcal {C}})\) onto the d-dimensional subspace of non-trivial zero modes and \(\Pi _d^\perp {:}{=}1- \Pi _d\). (For the precise definition of non-trivial zero modes, we refer to Subsect. 1.4.) This follows by a slight modification of our proof, as in [20]. Such a mix of quadratic and quartic stability was first observed by Brigati, Dolbeault, and Simonov [4] in the setting of the log-Sobolev inequality on the sphere, which is yet another example of a degenerately stable functional inequality.

Let us conclude this subsection by highlighting in which respect the present paper goes beyond the above-mentioned works on degenerate stability. An obvious difference is that, in contrast to the inequalities covered in [19], the CKN-inequality contains integrals over a non-compact domain, and that the optimizers are non-constant functions. This leads to several technical complications, the crucial one being the verification of a certain secondary non-degeneracy condition, which we will describe in detail below. This part of the proof, whose analogue in the case of constant minimizers in [19] follows by a straightforward computation, is one of our main achievements here and takes up a significant part of this paper. It involves a series solution of a certain inhomogeneous second order equation and then the verification of certain positivity properties of an infinite series; see Sect. 5 and the proof of Proposition 4. We stress that our treatment is fully analytical and does not rely on numerical assistance. Finally, we want to stress a novel approach to deal with the quartic order expansion for the \(L^q\)-norm when \(2<q<4\), which would simplify and unify the different ad-hoc approaches employed in [19]. We hope this will be useful in other related works on degenerate stability.

1.3 A reformulation

As is common practice, we employ logarithmic coordinates to transform (1.1) to a Sobolev inequality on the cylinder \({\mathcal {C}}{:}{=}\mathbb {R}\times {\mathbb {S}}^{d-1}\) without weights. With the so-called Emden–Fowler transformation

$$\begin{aligned} v(r,\omega )=r^{a-\frac{(d-2)}{2}} \varphi (s,\omega ), \end{aligned}$$

where \(r=|x|\), \(s=-\log r\), and \(\omega =x/r\), we can write (1.1) as

$$\begin{aligned} \Vert \partial _s \varphi \Vert ^2_{L^2({\mathcal {C}})}+\Vert \nabla _\omega \varphi \Vert ^2_{L^2({\mathcal {C}})}+\Lambda \Vert \varphi \Vert ^2_{L^2({\mathcal {C}})}\ge C_{a,b} \Vert \varphi \Vert ^2_{L^q({\mathcal {C}})} \end{aligned}$$
(1.9)

for \(\varphi \in H^1({\mathcal {C}})\). Here \(\nabla _\omega \) denotes the gradient on \({\mathbb {S}}^{d-1}\). Via logarithmic variables the scaling invariance of the CKN-inequality turns into the translation invariance of the Sobolev inequality on the cylinder. Note that (1.9) is an equality if and only if (1.1) is, and, calling functions on \({\mathcal {C}}\) that depend only on s radial, the results on symmetry and symmetry breaking carry over to (1.9) as well.
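For radial functions, the passage from (1.1) to (1.9) can be checked numerically. The sketch below approximates the weighted gradient integral \(\int _0^\infty |v'(r)|^2 r^{d-1-2a}\,\textrm{d}r\) for \(v(r)=r^{a-\frac{d-2}{2}}\varphi (-\log r)\) and compares it with \(\int _{\mathbb {R}}(|\varphi '|^2+\Lambda \varphi ^2)\,\textrm{d}s\); the common factor \(|{\mathbb {S}}^{d-1}|\) is omitted on both sides. The Gaussian test profile, the grid, and the function name are our own illustrative choices.

```python
import math


def weighted_radial_energy(phi, dphi, d, a, s_max=12.0, n=40_000):
    """Midpoint-rule approximation of int_0^inf |v'(r)|^2 r^(d-1-2a) dr,
    where v(r) = r^(a-(d-2)/2) * phi(-log r) is the radial Emden-Fowler profile."""
    gamma = (d - 2) / 2 - a          # v(r) = r^(-gamma) * phi(s) with s = -log r
    ratio = math.exp(2 * s_max / n)  # geometric grid on [e^(-s_max), e^(s_max)]
    r_left = math.exp(-s_max)
    total = 0.0
    for _ in range(n):
        r_right = r_left * ratio
        r = 0.5 * (r_left + r_right)
        s = -math.log(r)
        # v'(r) by the chain rule, using ds/dr = -1/r
        dv = -r ** (-gamma - 1) * (gamma * phi(s) + dphi(s))
        total += dv * dv * r ** (d - 1 - 2 * a) * (r_right - r_left)
        r_left = r_right
    return total


# Gaussian test profile on the cylinder variable s (an illustrative choice).
phi = lambda s: math.exp(-s * s)
dphi = lambda s: -2 * s * math.exp(-s * s)

d, a = 3, -1.0
lam = ((d - 2 - 2 * a) / 2) ** 2
# For this profile both int |phi'|^2 ds and int phi^2 ds equal sqrt(pi/2).
cylinder_energy = math.sqrt(math.pi / 2) * (1 + lam)
radial_energy = weighted_radial_energy(phi, dphi, d, a)
```

The cross term \(2\sqrt{\Lambda }\,\varphi \varphi '\) that appears after the substitution integrates to zero, which is why only \(|\varphi '|^2\) and \(\Lambda \varphi ^2\) remain on the cylinder side.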

For symmetric, attainable \((a,b)\ne (0,0)\), equality in (1.9) is attained if and only if \(\varphi \) equals (up to a scalar multiple and a translation) the radial function

$$\begin{aligned} u{:}{=}\beta (\cosh (\alpha \, \ \cdot \ ))^{-\frac{2}{q-2}},\qquad \alpha {:}{=}\frac{q-2}{2} \sqrt{\Lambda },\qquad \beta {:}{=}\left( \frac{q}{2}\Lambda \right) ^{\frac{1}{q-2}}; \end{aligned}$$

see [14, Cor. 1.3] for a reference. By the Emden–Fowler transformation, the set \({\mathcal {Z}}\) is cast to

$$\begin{aligned}{\mathcal {M}}{:}{=}\{\lambda u(\ \cdot - t)\}_{\lambda ,t\in \mathbb {R}},\end{aligned}$$

the set of optimizers for (1.9). Some authors neglect the non-regular value \(\lambda =0\) to obtain a differentiable manifold [2] or drop the multiplication by a scalar multiple entirely [18]. For our purposes, it is convenient to keep the full set of optimizers when deriving orthogonality relations.

The values \(\alpha \) and \(\beta \) in the definition of u are chosen such that u is the unique even, positive function solving the Euler–Lagrange equation

$$\begin{aligned} -\partial _s^2 u+ \Lambda u =u^{q-1} \end{aligned}$$
(1.10)

on \({\mathcal {C}}\). We will use this and the following related equations frequently:

(1.11)
(1.12)
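That the profile u with the stated values of \(\alpha \) and \(\beta \) solves (1.10) can be confirmed by a direct computation, or numerically as in the following sketch. The residual of (1.10) is evaluated with a central second difference; the step size, the sample points, and the function names are our own assumptions.

```python
import math


def make_profile(q, lam):
    """Return u(s) = beta * cosh(alpha * s)^(-2/(q-2)) with alpha, beta as in the text."""
    alpha = (q - 2) / 2 * math.sqrt(lam)
    beta = (q / 2 * lam) ** (1 / (q - 2))
    return lambda s: beta * math.cosh(alpha * s) ** (-2 / (q - 2))


def euler_lagrange_residual(u, s, q, lam, h=1e-4):
    """Residual -u'' + lam*u - u^(q-1) at s, with u'' from a central difference."""
    second = (u(s + h) - 2 * u(s) + u(s - h)) / (h * h)
    return -second + lam * u(s) - u(s) ** (q - 1)


q, lam = 3.0, 1.6  # for d = 3 and q = 3 this is Lambda_FS = 4(d-1)/(q^2-4)
u = make_profile(q, lam)
residuals = [euler_lagrange_residual(u, s, q, lam) for s in (-2.0, -0.5, 0.0, 1.0, 3.0)]
```

All residuals vanish up to discretization error, in line with the identities \(\left( \frac{2}{q-2}\alpha \right) ^2=\Lambda \) and \(\beta ^{q-2}=\frac{q}{2}\Lambda \) that underlie the choice of \(\alpha \) and \(\beta \).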

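In the following, the norm in \(L^q({\mathcal {C}})\) is denoted \(\Vert \cdot \Vert _q\). In \(H^1({\mathcal {C}})\) we use the (\(\Lambda \)-dependent) norm

$$\begin{aligned} \Vert \varphi \Vert ^2{:}{=}\Vert \partial _s \varphi \Vert ^2_{L^2({\mathcal {C}})}+\Vert \nabla _\omega \varphi \Vert ^2_{L^2({\mathcal {C}})}+\Lambda \Vert \varphi \Vert ^2_{L^2({\mathcal {C}})} . \end{aligned}$$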

The inner products in \(L^2({\mathcal {C}})\) and \(H^1({\mathcal {C}})\) are \(\langle \cdot ,\cdot \rangle _2\) and \(\langle \cdot ,\cdot \rangle \), respectively. Moreover, we will consider a, b, and d satisfying (1.2), (1.4), and (1.7), that is, \((a,b)\) is admissible, attainable, and symmetric. As an immediate consequence, we know that \(2<q<2^*\) and \(d\ge 2\). In particular, the assumption \(d\ge 2\) in Theorem 1 is redundant and for clarity only. As the Sobolev inequality admits an additional symmetry, we exclude \((a,b)=(0,0)\). These assumptions on a, b, d, q, and \(\Lambda \) are standard and will be used throughout Sects. 2, 3, and 4. In the latter two sections we will restrict ourselves to the FS-curve, that is, \(\Lambda =\Lambda _{FS}\) for \(a<0\).

1.4 Strategy of the proof

We will prove Theorem 1 in the equivalent formulation on the cylinder presented in the previous subsection. This will appear in Corollary 6 below. We will also present some of the main ingredients that go into the proof of this corollary. Those are stated in three propositions, whose proofs will be given in the remaining sections of this paper.

Our basic technique in this paper will be the iterated Bianchi–Egnell strategy introduced in [19]: While Bianchi and Egnell project on the space of trivial zero modes of the Hessian of the deficit functional

$$\begin{aligned}{\mathcal {F}}(\varphi ){:}{=}\Vert \varphi \Vert ^2- C_{a,b} \Vert \varphi \Vert ^2_{q}, \qquad \varphi \in H^1({\mathcal {C}}),\end{aligned}$$

it is possible to project further on the nearest non-trivial zero mode. This leads to a Taylor-type expansion of the deficit to quartic order of the distance to the set of optimizers

$$\begin{aligned}{\text {dist}}(\varphi ,{\mathcal {M}}){:}{=}\inf _{\chi \in {\mathcal {M}}}\Vert \varphi -\chi \Vert ,\qquad \varphi \in H^1({\mathcal {C}}).\end{aligned}$$

In our first step, we consider a minimizing sequence \((u_n)_n\) for the functional inequality (1.9) and project it on the nearest trivial zero mode. As the CKN-inequality (1.1) is invariant under dilations, and hence (1.9) under translations, we will have to handle the emerging lack of compactness. The content of our first proposition is that the projection can be chosen to be orthogonal in \(H^1({\mathcal {C}})\) to

$$\begin{aligned} {\text {span}}\{u,\partial _s u\}, \end{aligned}$$

the trivial zero modes of the Hessian of \({\mathcal {F}}\) at u. We call these zero modes trivial, because they come from symmetries of \({\mathcal {F}}\), namely, from multiplication by a constant and from translations. The proposition will lead us to a decomposition of \((u_n)_n\) into an optimizer and a remainder term that converges to 0 in the \(H^1({\mathcal {C}})\)-norm.

Proposition 2

(Projection on the trivial zero modes of the Hessian) Let \((a,b)\in \mathbb {R}^2\setminus \{(0,0)\}\) be admissible (1.2), attainable (1.4), and symmetric (1.7) with \(d\ge 2\) and q given by (1.3). Let \((u_n)_n\) be a sequence in \(H^1({\mathcal {C}})\) such that

$$\begin{aligned} \Vert u_n\Vert ^2_q\rightarrow 1\qquad \text { and }\qquad \Vert u_n\Vert ^2\rightarrow C_{a,b} \qquad \text { for } n\rightarrow \infty . \end{aligned}$$
(1.13)

Then there are \(\lambda _n\in \mathbb {R}\setminus \{0\}\), \( t_n\in \mathbb {R}\), and \(r_n\in H^1({\mathcal {C}})\) such that, along a subsequence, we have

$$\begin{aligned} u_n(s,\omega )=\lambda _n (u+r_n)(s-t_n,\omega ), \qquad (s,\omega )\in {\mathcal {C}}, \end{aligned}$$
(1.14)

with \({\text {dist}}(u_n,{\mathcal {M}})=\Vert u_n-\lambda _n u(\ \cdot -t_n)\Vert =\Vert \lambda _n r_n\Vert \) and the following convergence and orthogonality properties.

  • Convergence properties: \(\Vert r_n\Vert \rightarrow 0\) and \(\lambda _n\rightarrow \lambda ^*\) hold for \(n\rightarrow \infty \) and some \(\lambda ^*\in \mathbb {R}\setminus \{0\}\) with \(|\lambda ^*|= \Vert u\Vert ^{-1}_q\).

  • Orthogonality properties: For all \(n\in \mathbb {N}\) we have

    $$\begin{aligned}\nonumber \langle r_n, u \rangle =\langle r_n, \partial _s u\rangle =0. \end{aligned}$$

We have written the orthogonality conditions in terms of the \(H^1({\mathcal {C}})\)-inner product. Using the equations (1.10) and (1.11), we can rewrite them in terms of the \(L^2({\mathcal {C}})\)-inner product as

$$\begin{aligned} \langle r_n, u^{q-1} \rangle _2=\langle r_n, (q-1)u^{q-2} \partial _s u\rangle _2=0. \end{aligned}$$
(1.15)

For \(\Lambda <\Lambda _{FS}\), the Hessian of the deficit functional \({\mathcal {F}}\) at u has trivial zero modes only. In case \(\Lambda =\Lambda _{FS}\), however, the Hessian of \({\mathcal {F}}\) admits non-trivial zero modes as well [18], that is, the kernel of the Hessian strictly contains \({\text {span}}\{u,\partial _s u\}\). While some authors refer to all zero modes that are not trivial as non-trivial zero modes, we will call a zero mode non-trivial if it lies in the orthogonal complement of the trivial zero modes. As we will see later, the space of non-trivial zero modes is given by the span of

where \(\omega _1,\dots ,\omega _d\) denote the Cartesian coordinates restricted to \({\mathbb {S}}^{d-1}\). These modes do not arise from symmetries of \({\mathcal {F}}\).

We will now see that, if we require the functional \({\mathcal {F}}\) to decay faster than the distance to the set of optimizers squared,

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{{\mathcal {F}}(u_n)}{{\text {dist}}(u_n,{\mathcal {M}})^2}=0, \end{aligned}$$
(1.16)

then a non-trivial zero mode contributes to \(u_n\) satisfying (1.13), and we are able to further expand the minimizing sequence \((u_n)_n\). It turns out that we can decompose the previous remainder \(r_n\) into the nearest non-trivial zero mode of the Hessian of \({\mathcal {F}}\) with decaying amplitude and a new remainder term that converges even faster in the \(H^1({\mathcal {C}})\)-norm. This new remainder can be chosen to be \(H^1({\mathcal {C}})\)-orthogonal to all zero modes of the Hessian.

Proposition 3

(Projection on the non-trivial zero modes of the Hessian) Let \((a,b)\in \mathbb {R}^2\) satisfy \(a<0\) and \(\Lambda =\Lambda _{FS}\) with \(d\ge 2\) and q given by (1.3). Let \((u_n)_n\) be a sequence in \(H^1({\mathcal {C}})\) satisfying (1.13) and (1.16). Then there are \(\lambda _n\in \mathbb {R}\setminus \{0\}\), \(t_n,\mu _n\in \mathbb {R}\), \(D_n\in {\mathcal {O}}(d)\), and \(R_n\in H^1({\mathcal {C}})\) such that, along a subsequence, we have

(1.17)

with \(\lambda _n\), \(t_n\), and u from Proposition 2 and the following additional convergence and orthogonality properties.

  • Convergence properties: \(\mu _n \rightarrow 0\) and \(\Vert R_n\Vert \rightarrow 0\) hold for \(n\rightarrow \infty \).

  • Orthogonality properties: For all \(n\in \mathbb {N}\) we have

Here \({\mathcal {O}}(d)\) denotes the set of orthogonal \(d\times d\) matrices.

Using the equations (1.10), (1.11), and (1.12), we can rewrite the orthogonality conditions as

(1.18)

It is not hard to show that, in fact, (1.16) and \(\Vert u_n\Vert ^2\rightarrow C_{a,b}\) for \(n\rightarrow \infty \) make the assumption \(\Vert u_n\Vert _q\rightarrow 1\) redundant.

The decomposition in Proposition 3 allows us to expand \({\mathcal {F}}\) to quartic order. In this way, we will obtain the following crucial asymptotic inequality, which will imply our main theorem.

Proposition 4

(Non-vanishing of the quartic order) Let \((a,b)\in \mathbb {R}^2\) satisfy \(a<0\) and \(\Lambda =\Lambda _{FS}\) with \(d\ge 2\) and q given by (1.3). There is an explicit constant

$$\begin{aligned} J(q,d)>0, \end{aligned}$$

which is given in (4.18), such that for every sequence \((u_n)_n\subset H^1({\mathcal {C}})\) satisfying (1.13) we have

$$\begin{aligned} \liminf _{n\rightarrow \infty } \frac{\Vert u_n\Vert ^2{\mathcal {F}}(u_n)}{{\text {dist}}(u_n,{\mathcal {M}})^4}\ge J(q,d). \end{aligned}$$
(1.19)

Moreover, the bound (1.19) is best possible in the sense that there is a sequence \((u_n)_n\) satisfying (1.13) for which equality is attained in (1.19).

The key point of this proposition is the strict inequality \(J(q,d)>0\). In the limit \(q\rightarrow 2^*\), it can be shown that \(J(q,d)\) vanishes, which is due to the additional translation symmetry in case \((a,b)=(0,0)\). The non-vanishing of \(J(q,d)\) for \(q<2^*\) can be viewed as a secondary non-degeneracy condition [19, 20]. We give a concrete definition in terms of variational derivatives of the deficit \({\mathcal {F}}\), which may be generalized to the setting of other functional inequalities.

Definition 5

Let \(\psi \in {\mathcal {M}}\). We say that the CKN-inequality satisfies the secondary non-degeneracy condition if

$$\begin{aligned} (\partial _\varepsilon ^4 {\mathcal {F}}) (\psi +\varepsilon (g+\varepsilon \varphi ))|_{\varepsilon =0}>0 \end{aligned}$$
(1.20)

for every non-trivial zero mode g in \( {\text {Ker}}(D^2_\psi {\mathcal {F}})\) and every \(\varphi \) that is \(H^1({\mathcal {C}})\)-orthogonal to \({\text {Ker}}(D^2_\psi {\mathcal {F}})\).

Note that

$$\begin{aligned} (\partial _\varepsilon ^4 {\mathcal {F}}) (\psi +\varepsilon (g+\varepsilon \varphi ))|_{\varepsilon =0}=12(D_\psi ^2{\mathcal {F}}(\varphi ,\varphi )+ D_\psi ^3{\mathcal {F}}(g,g,\varphi ))+D_\psi ^4{\mathcal {F}} (g,g,g,g), \end{aligned}$$
(1.21)

where we wrote the differentials in \(\psi \) as multilinear forms.
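The combinatorial factors in (1.21) come from the Taylor expansion of \({\mathcal {F}}\) along the path \(\psi +\varepsilon g+\varepsilon ^2\varphi \): the \(\varepsilon ^4\)-coefficient is \(\frac{1}{2} D^2_\psi {\mathcal {F}}(\varphi ,\varphi )+\frac{1}{2} D^3_\psi {\mathcal {F}}(g,g,\varphi )+\frac{1}{24} D^4_\psi {\mathcal {F}}(g,g,g,g)\), and multiplying by \(4!\) gives (1.21). The following sketch checks these factors on a one-dimensional toy example; the choice \(F(x)=x^5\) and all function names are our own and have nothing to do with the actual deficit functional.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def fourth_derivative_along_path(psi, g, phi):
    """d^4/d eps^4 of (psi + eps*g + eps^2*phi)^5 at eps = 0, computed exactly."""
    path = [psi, g, phi]
    result = [1]
    for _ in range(5):
        result = poly_mul(result, path)
    return 24 * result[4]  # 4! times the eps^4 coefficient


def formula_1_21(psi, g, phi):
    """Right side of (1.21) for the toy functional F(x) = x^5 on the real line."""
    d2 = 20 * psi**3  # F''(psi)
    d3 = 60 * psi**2  # F'''(psi)
    d4 = 120 * psi    # F''''(psi)
    return 12 * (d2 * phi * phi + d3 * g * g * phi) + d4 * g**4
```

Since the toy example uses integer arithmetic, the two sides agree exactly for integer inputs.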

As a consequence of the previous proposition, the secondary non-degeneracy condition can be verified, and we can prove, by contradiction, degenerate stability of quartic order. The following assertion is equivalent to Theorem 1 via the Emden–Fowler transformation.

Corollary 6

(Degenerate stability of a Sobolev inequality for a cylinder along the FS-curve) Let \((a,b)\in \mathbb {R}^2\) satisfy \(a<0\) and \(\Lambda =\Lambda _{FS}\) with \(d\ge 2\) and q given by (1.3). Then there is a constant \( c(q,d)>0\) such that for all \(\varphi \in H^1({\mathcal {C}})\),

$$\begin{aligned} {\mathcal {F}}(\varphi )\ge c(q,d)\frac{{\text {dist}}(\varphi ,{\mathcal {M}})^4}{\Vert \varphi \Vert ^2}. \end{aligned}$$

Moreover, the inequality is best possible with respect to the power four, that is, there is a sequence \((\varphi _n)_n\subset H^1({\mathcal {C}}){\setminus }\{0\}\) satisfying (1.13) with

$$\begin{aligned} \limsup _{n\rightarrow \infty } \frac{\Vert \varphi _n\Vert ^2\, {\mathcal {F}}(\varphi _n)}{{\text {dist}}(\varphi _n,{\mathcal {M}})^4} < \infty . \end{aligned}$$

Proof

We argue by contradiction. For fixed \((q,d)\), assume there is a sequence \((u_n)_n \subset H^1({\mathcal {C}})\) with

$$\begin{aligned}\lim _{n\rightarrow \infty } \frac{\Vert u_n\Vert ^2(\Vert u_n\Vert ^2-C_{a,b}\Vert u_n\Vert _q^2)}{\inf _{\chi \in {\mathcal {M}}}\Vert u_n-\chi \Vert ^4}= 0.\end{aligned}$$

By homogeneity, we may assume \(\Vert u_n\Vert ^2=C_{a,b}\). Using the CKN-inequality (1.9) for \(u_n\) and the bound \(\inf _{\chi \in {\mathcal {M}}}\Vert u_n-\chi \Vert \le \Vert u_n\Vert \), which holds since \(0\in {\mathcal {M}}\), we find that

$$\begin{aligned}0\le \liminf _{n\rightarrow \infty } \left( 1-\Vert u_n\Vert _q^2\right) \le \liminf _{n\rightarrow \infty } \frac{\Vert u_n\Vert ^2(\Vert u_n\Vert ^2-C_{a,b}\Vert u_n\Vert _q^2)}{\inf _{\chi \in {\mathcal {M}}}\Vert u_n-\chi \Vert ^4}=0.\end{aligned}$$

Hence, we have proved the required convergence properties for \((u_n)_n\) to apply Proposition 4, which leads to a contradiction.

The assertion that the stability inequality is best possible with respect to the power four follows immediately from the corresponding assertion in Proposition 4. In fact, this shows that the sequence can be chosen such that the \(\limsup \) in the assertion equals \(J(q,d)\).\(\square \)

If (1.20) did not hold, we would go on projecting on zero modes corresponding to \((\partial _\varepsilon ^{4} {\mathcal {F}}) (\psi +\varepsilon (g+\varepsilon \varphi ))|_{\varepsilon =0}\) and could formulate a similar next order non-degeneracy condition. Repeating this procedure, we would derive a non-degeneracy condition of higher order in every step. If one of these conditions was satisfied, the iteration scheme would end, and we would obtain a degenerate stability result with some exponent greater than four. Following [17], one could probably show that this procedure terminates after finitely many steps. We are not aware of an example of a degenerate stability result with a distance to the power six or higher.

The remainder of this paper consists of four sections. In Sects. 2, 3, and 4 we present the proofs of Propositions 2, 3, and 4, respectively. In Sect. 5 we provide the details of some results used in Sect. 4.

2 Projection on the trivial zero modes of the Hessian

2.1 Proof of Proposition 2

As we are dealing with the non-compact domain \({\mathcal {C}}=\mathbb {R}\times {\mathbb {S}}^{d-1}\), we can only expect relative compactness of optimizing sequences to hold up to non-compact symmetries. Following Lions’ concentration compactness principle [28], we only have to rule out two phenomena—vanishing and dichotomy—in order to find convergent subsequences up to symmetries. For absence of vanishing we refer to [6, Lemma 4.1] and references therein. Exploiting the Hilbert space structure of \(H^1({\mathcal {C}})\), dichotomy can be excluded in a standard way; see [6, Proof of Theorem 1.2(i)], for instance. As a result, given \((u_n)_n\subset H^1({\mathcal {C}})\) such that \(\Vert u_n\Vert _q^2\rightarrow 1\) and \(\Vert u_n\Vert ^2\rightarrow C_{a,b}\), there are \((t_n^*)_n\subset \mathbb {R}\) such that \(u_n(\ \cdot +t_n^*, \cdot \ )\) converges in \(H^1({\mathcal {C}})\). The limit is necessarily an optimizer of (1.9). In addition, the limit has unit \(L^q({\mathcal {C}})\)-norm. According to [14], the limit is a translate of \(\lambda ^* u\), where \(\lambda ^*\in \mathbb {R}\setminus \{0\}\) satisfies \(|\lambda ^*|= \Vert u\Vert ^{-1}_q\). By redefining the \(t_n^*\), we may assume that the limit is \(\lambda ^* u\). Consequently, we can write

$$\begin{aligned} u_n(s,\omega )=\lambda ^* (u+r_n^*)(s-t_n^*,\omega ), \qquad (s,\omega )\in {\mathcal {C}}, \end{aligned}$$
(2.1)

with \(r^*_n\in H^1({\mathcal {C}})\) satisfying \(\Vert r_n^*\Vert \rightarrow 0\) for \(n\rightarrow \infty \).

Note that \(t\mapsto \langle u(\ \cdot -t),u_n\rangle ^2\) is a continuous, non-negative function that vanishes at infinity. Therefore, it attains its maximum at some \(t_n\in \mathbb {R}\). Moreover, for any \(t\in \mathbb {R}\), \(\inf _\lambda \Vert u_n- \lambda u(\ \cdot -t)\Vert ^2 = \Vert u_n\Vert ^2 - \Vert u\Vert ^{-2} \langle u(\ \cdot -t),u_n\rangle ^2\), and the infimum is attained at \(\lambda =\Vert u\Vert ^{-2} \langle u(\ \cdot -t),u_n\rangle \). Thus, if we set \(\lambda _n:=\Vert u\Vert ^{-2} \langle u(\ \cdot -t_n),u_n\rangle \), we see that \({\text {dist}}(u_n,{\mathcal {M}})^2= \inf _{\lambda ,t} \Vert u_n- \lambda u(\ \cdot -t)\Vert ^2\) is attained at \((\lambda _n,t_n)\).
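The minimization over \(\lambda \) at fixed t is the usual projection onto a line in a Hilbert space. A finite-dimensional analogue, with the Euclidean inner product on \(\mathbb {R}^n\) standing in for \(\langle \cdot ,\cdot \rangle \) and hypothetical vectors `target` and `direction` in the roles of \(u_n\) and \(u(\ \cdot -t)\), may be sketched as follows.

```python
def dot(x, y):
    """Euclidean inner product, standing in for the H^1 inner product."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def best_multiple(target, direction):
    """Minimize |target - lam * direction|^2 over real lam.

    Returns the minimizer lam = <direction, target> / |direction|^2 and the
    minimal value |target|^2 - <direction, target>^2 / |direction|^2.
    """
    norm_sq = dot(direction, direction)
    overlap = dot(direction, target)
    lam = overlap / norm_sq
    return lam, dot(target, target) - overlap**2 / norm_sq


target = [3.0, 4.0, 1.0]
direction = [1.0, 0.0, 2.0]
lam, dist_sq = best_multiple(target, direction)
residual = [t_i - lam * d_i for t_i, d_i in zip(target, direction)]
```

The residual is orthogonal to `direction`, which is the finite-dimensional counterpart of the relation \(\langle r_n,u\rangle =0\); maximizing the squared overlap over t then minimizes the distance over both parameters.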

Define \({\tilde{r}}_n{:}{=}u_n(\ \cdot +t_n, \cdot \ )- \lambda _nu\in H^1({\mathcal {C}})\), and note that

$$\begin{aligned} \Vert {\tilde{r}}_n\Vert =\inf _{\chi \in {\mathcal {M}}}\Vert u_n-\chi \Vert \le \Vert u_n-\lambda ^* u(\ \cdot -t_n^*)\Vert =\Vert r_n^*\Vert =o_{n\rightarrow \infty }(1). \end{aligned}$$

We now deduce that \(\lambda _n\rightarrow \lambda ^*\) and \(t_n-t_n^*\rightarrow 0\). To this end, we first notice that

and \(\Vert u_n\Vert _q \rightarrow 1\) imply that \(|\lambda _n|\rightarrow \Vert u\Vert _q^{-1} = |\lambda ^*|\). Next, we use the decomposition (2.1) of \(u_n\) to find that

$$\begin{aligned} \lambda _n = \Vert u\Vert ^{-2} \langle u(\ \cdot -t_n),u_n\rangle = \lambda ^* \Vert u\Vert ^{-2} \langle u(\ \cdot - t_n),u(\ \cdot - t_n^*) \rangle + o_{n\rightarrow \infty }(1) . \end{aligned}$$

Taking the absolute value on both sides, we deduce that \(| \langle u(\ \cdot - t_n),u(\ \cdot - t_n^*) \rangle | \rightarrow \Vert u\Vert ^2\). Using \(\langle u(\ \cdot - t_n),u(\ \cdot - t_n^*)\rangle = \langle u(\ \cdot - t_n), u(\ \cdot - t_n^*)^{q-1}\rangle _{L^2({\mathcal {C}})} \ge 0\) (by (1.10) for \(u(\ \cdot - t_n^*)\) and the fact that \(u\ge 0\)), we deduce on the one hand that \(\lambda _n\rightarrow \lambda ^*\) and on the other hand that

$$\begin{aligned} \Vert u - u(\ \cdot +t_n^*-t_n) \Vert ^2 = 2\left( \Vert u\Vert ^2 - \langle u(\ \cdot - t_n),u(\ \cdot - t_n^*) \rangle \right) \rightarrow 0 . \end{aligned}$$

Since u is symmetric decreasing, we deduce that \(t_n^*-t_n\rightarrow 0\), as claimed.

Since \(\lambda _n\rightarrow \lambda ^*\ne 0\), up to dropping finitely many n, we may assume \(\lambda _n\not =0\). Defining \(r_n{:}{=}\lambda _n^{-1}{\tilde{r}}_n\) then provides the desired decomposition (1.14) with \(\Vert r_n\Vert \rightarrow 0\).

Note that \(J_n(t,\lambda ) {:}{=}\Vert u_n - \lambda u(\ \cdot - t)\Vert ^2\) inherits the differentiability in t from u. Moreover, it is a polynomial in \(\lambda \) and hence differentiable in \(\lambda \). This yields the desired orthogonality relations

$$\begin{aligned} 0&=\partial _\lambda J_n(t,\lambda )|_{(t,\lambda )=(t_n,\lambda _n)}=-2\lambda _n \langle r_n, u\rangle \,,\\ 0&=\partial _t J_n(t,\lambda )|_{(t,\lambda )=(t_n,\lambda _n)}= 2\lambda _n^2\langle r_n, \partial _s u\rangle \,. \end{aligned}$$
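The differentiation behind these relations can be sketched as follows, assuming (as throughout) that the inner product of \(H^1({\mathcal {C}})\) is invariant under translations in s. Since \(\partial _t [u(\ \cdot -t)]=-(\partial _s u)(\ \cdot -t)\),

$$\begin{aligned} \partial _\lambda J_n(t,\lambda )&=-2\langle u_n-\lambda u(\ \cdot -t),u(\ \cdot -t)\rangle ,\\ \partial _t J_n(t,\lambda )&=2\lambda \langle u_n-\lambda u(\ \cdot -t),(\partial _s u)(\ \cdot -t)\rangle . \end{aligned}$$

At \((t,\lambda )=(t_n,\lambda _n)\) we have \(u_n-\lambda _n u(\ \cdot -t_n)={\tilde{r}}_n(\ \cdot -t_n)=\lambda _n r_n(\ \cdot -t_n)\), and shifting both entries of the inner products by \(t_n\) gives the two expressions on the right sides.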

\(\square \)

3 Projection on the non-trivial zero modes of the Hessian

While the analysis in the previous section is relevant in the full range of admissible, attainable, and symmetric parameters, we now turn to properties that are specific for parameter values on the Felli–Schneider curve \(\Lambda =\Lambda _{FS}\). Recall that \(\omega _1,\dots ,\omega _d\) denote the Cartesian coordinates restricted to \({\mathbb {S}}^{d-1}\). These generate the space of spherical harmonics of degree 1. More generally, \((Y_{l,m})_m\) is defined to be the \(L^2({\mathbb {S}}^{d-1})\)-orthonormal basis of spherical harmonics of degree l. The degeneracy index m runs through a finite, l-dependent set, but we will not need a more detailed description for our purposes. Note that spherical harmonics of degree 0 are constant. For an introduction to spherical harmonics, we refer to [33, pp. 137–152].

3.1 Degeneracy along the Felli–Schneider curve

Let \(\psi =\lambda u(\ \cdot -t)\in {\mathcal {M}}\). We investigate the stability of the functional \({\mathcal {F}}\) around \(\psi \) in the classical way by determining the zeros of the Hessian of \({\mathcal {F}}\). The Euler–Lagrange equation (1.10) implies \(C_{a,b}=\Vert u\Vert ^2 \Vert u\Vert _q^{-2} =\Vert u\Vert _q^{q-2}\), which can be used to compute the Hessian of the functional \({\mathcal {F}}\). One finds that for all \(\varphi \in H^1({\mathcal {C}})\),

$$\begin{aligned}&D^2_\psi {\mathcal {F}} (\varphi )=\partial ^2_\varepsilon {\mathcal {F}} (\psi +\varepsilon \varphi )|_{\varepsilon =0}\\&\quad =2 \Bigg (\Vert \varphi \Vert ^2-(q-1)\int _{{\mathcal {C}}}u(\ \cdot -t)^{q-2}\varphi ^2\ \textrm{ d}(s,\omega )\\&\qquad +(q-2)\Vert u\Vert _q^{-q}\left( \int _{{\mathcal {C}}}u(\ \cdot -t)^{q-1}\varphi \ \textrm{ d}(s,\omega )\right) ^2\Bigg ). \end{aligned}$$

This quadratic form corresponds to a self-adjoint, lower bounded operator \({\mathcal {L}}_\psi \) in the Hilbert space \(L^2({\mathcal {C}})\) with form domain \(H^1({\mathcal {C}})\) and operator domain \(H^2({\mathcal {C}})\) in the sense that

$$\begin{aligned} D^2_\psi {\mathcal {F}} (\varphi ) =2 \langle \varphi , {\mathcal {L}}_{\psi } \varphi \rangle _2, \end{aligned}$$

where

$$\begin{aligned} {\mathcal {L}}_\psi {:}{=}-\partial _s^2-\Delta _\omega +\Lambda -(q-1)u(\ \cdot -t)^{q-2}+(q-2)\Vert u\Vert ^{-q}_q |u^{q-1}(\ \cdot -t)\rangle \langle u^{q-1}(\ \cdot -t)| . \end{aligned}$$

Here \(\Delta _\omega \) denotes the Laplace–Beltrami operator on \({\mathbb {S}}^{d-1}\), and \( |u^{q-1}(\ \cdot -t)\rangle \langle u^{q-1}(\ \cdot -t)|\) denotes the rank one projector onto \(u^{q-1}(\ \cdot -t)\) in \(L^2({\mathcal {C}})\). We stress that the inner product in the definition of the rank one projector is the one in \(L^2({\mathcal {C}})\), not in \(H^1({\mathcal {C}})\). We observe that the operator \({\mathcal {L}}_\psi \) is independent of \(\lambda \), and hence \({\mathcal {L}}_\psi ={\mathcal {L}}_{u(\ \cdot \,- t)}\).

Note that \(D^2_\psi {\mathcal {F}}\), and hence \({\mathcal {L}}_\psi \), is positive semi-definite by optimality of \(\psi \). Indeed, we find \({\mathcal {F}}(\psi )=0\) and, through the Euler–Lagrange equation (1.10), \(D_\psi {\mathcal {F}}=0\). Therefore, expanding \({\mathcal {F}}\) with \(q>2\) around \(\psi \) yields

$$\begin{aligned}0\le {\mathcal {F}}(\psi +\varepsilon \varphi )= \frac{\varepsilon ^2}{2}D^2_\psi {\mathcal {F}}(\varphi )+o_{\varepsilon \rightarrow 0}(\varepsilon ^2), \qquad \varphi \in H^1({\mathcal {C}}).\end{aligned}$$
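As a quick consistency check of the Hessian formula (a sketch, not part of the argument), one can test it on the scaling direction \(\varphi =u(\ \cdot -t)\), along which \({\mathcal {F}}\) vanishes identically. Using \(\Vert u\Vert ^2=\Vert u\Vert _q^q\), which is the identity behind \(C_{a,b}=\Vert u\Vert _q^{q-2}\), one finds

$$\begin{aligned} D^2_\psi {\mathcal {F}}(u(\ \cdot -t))&=2\left( \Vert u\Vert ^2-(q-1)\Vert u\Vert _q^q+(q-2)\Vert u\Vert _q^{-q}\Vert u\Vert _q^{2q}\right) \\&=2\left( \Vert u\Vert ^2-\Vert u\Vert _q^q\right) =0 . \end{aligned}$$

The Hessian is therefore degenerate in this direction, as it has to be.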

Next, we show that the kernel of \({\mathcal {L}}_\psi \) is given by

$$\begin{aligned} {\text {Ker}}({\mathcal {L}}_\psi )={\text {span}}\left\{ u(\ \cdot -t),\ \partial _s u(\ \cdot -t),\ \omega _i u^{\frac{q}{2}}(\ \cdot -t)\ (i=1,\dots ,d)\right\} . \end{aligned}$$
(3.1)

By means of \(-\Delta _\omega \omega _i=(d-1)\omega _i\) and the equations (1.10), (1.11), and (1.12), it can be verified easily that \({\mathcal {L}}_\psi \) vanishes on \(u(\ \cdot -t)\), \(\partial _s u(\ \cdot -t)\), and \(\omega _i u^{\frac{q}{2}}(\ \cdot -t)\), \(i=1,\dots ,d\), which are mutually orthogonal in both \(L^2({\mathcal {C}})\) and \(H^1({\mathcal {C}})\). The other direction follows from a computation by Felli and Schneider [18], which we briefly review here. After separating the radial and the angular part of an arbitrary solution \(\varphi \in H^1({\mathcal {C}})\) to \({\mathcal {L}}_\psi \varphi =0\), they reduced this equation to an eigenvalue problem involving a one-dimensional Schrödinger operator with Pöschl–Teller potential:

$$\begin{aligned} (-\partial _s^2-(q-1)u(\ \cdot -t)^{q-2})\Phi _l=\theta _l \Phi _l \end{aligned}$$
(3.2)

with \(\theta _l{:}{=}-(l(l+d-2)+\Lambda )\) and \(\Phi _l\in H^1(\mathbb {R})\) for every \(l\in \mathbb {N}_0\). The parameter l corresponds to the angular momentum, that is, the degree of the spherical harmonic in the expansion of \(\varphi \).
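The reduction can be sketched for a single angular mode. Assume \(l\ge 1\), so that the rank one term in \({\mathcal {L}}_\psi \) does not contribute, and use the standard eigenvalue relation \(-\Delta _\omega Y_{l,m}=l(l+d-2)Y_{l,m}\), which for \(l=1\) is the identity \(-\Delta _\omega \omega _i=(d-1)\omega _i\) used above. For \(\varphi (s,\omega )=\Phi _l(s)Y_{l,m}(\omega )\),

$$\begin{aligned} {\mathcal {L}}_\psi \varphi =\left( \left( -\partial _s^2+l(l+d-2)+\Lambda -(q-1)u(\ \cdot -t)^{q-2}\right) \Phi _l\right) Y_{l,m} , \end{aligned}$$

so \({\mathcal {L}}_\psi \varphi =0\) holds if and only if \(\Phi _l\) solves (3.2) with \(\theta _l=-(l(l+d-2)+\Lambda )\).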

We make use of the following facts about the operator appearing in (3.2).

Lemma 7

(Spectral analysis of lower eigenvalues) The lowest eigenvalue of the operator \(-\partial _s^2-(q-1)u(\ \cdot -t)^{q-2}\) in \(L^2(\mathbb {R})\) is \(-\frac{q^2}{4}\Lambda \) with corresponding eigenfunction \(u^{\frac{q}{2}}(\ \cdot -t)\). Its second eigenvalue is \(-\Lambda \) with corresponding eigenfunction \(\partial _s u(\ \cdot -t)\). These eigenvalues are simple.

Proof

These facts are well-known (see, for instance, [21, 4.2.2. Example: Pöschl–Teller potentials]), but it is easy to give an independent proof. Indeed, (1.12) says that \(u^{\frac{q}{2}}(\ \cdot -t)\) is a positive \(H^2(\mathbb {R})\)-solution of an eigenvalue equation involving the operator. By general Schrödinger operator theory, this implies that \(u^{\frac{q}{2}}(\ \cdot -t)\) is the ground state, \(-\frac{q^2}{4}\Lambda \) is the lowest eigenvalue, and this eigenvalue is simple. Similarly, (1.11) says that \(\partial _s u(\ \cdot -t)\) is a negative \(H^2((t,\infty ))\cap H^1_0((t,\infty ))\)-solution of an eigenvalue equation, so it is the ground state of the Dirichlet realization of the operator on \((t,\infty )\). Since, in one dimension, the eigenvalues of a Schrödinger operator with even potential are alternatingly those of the Neumann and the Dirichlet realization on the half-line, and the potential here is even about \(t\), we obtain the assertion about the second eigenvalue.\(\square \)
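The two eigenvalue equations can also be checked by hand. The following sketch takes \(t=0\) and assumes that (1.10) reads \(-\partial _s^2u+\Lambda u=u^{q-1}\), so that multiplying by \(\partial _s u\) and integrating (using the decay of u at infinity) gives the first integral \((\partial _s u)^2=\Lambda u^2-\frac{2}{q}u^q\). Both are assumptions about equations that are not reproduced in this section.

$$\begin{aligned} \partial _s^2 u^{\frac{q}{2}}&=\frac{q}{2}\left( \frac{q}{2}-1\right) u^{\frac{q}{2}-2}(\partial _s u)^2+\frac{q}{2}u^{\frac{q}{2}-1}\partial _s^2u\\&=\frac{q}{2}\left( \frac{q}{2}-1\right) u^{\frac{q}{2}-2}\left( \Lambda u^2-\frac{2}{q}u^q\right) +\frac{q}{2}u^{\frac{q}{2}-1}\left( \Lambda u-u^{q-1}\right) \\&=\frac{q^2}{4}\Lambda u^{\frac{q}{2}}-(q-1)u^{q-2}u^{\frac{q}{2}} . \end{aligned}$$

Hence \((-\partial _s^2-(q-1)u^{q-2})u^{\frac{q}{2}}=-\frac{q^2}{4}\Lambda u^{\frac{q}{2}}\). Differentiating the assumed form of (1.10) in s gives in the same way

$$\begin{aligned} \left( -\partial _s^2-(q-1)u^{q-2}\right) \partial _s u=-\Lambda \partial _s u . \end{aligned}$$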

Let us return to the study of equation (3.2). Note that we have neglected the rank one projector in \({\mathcal {L}}_\psi \) when writing (3.2), which can be justified as follows. When \(l\ge 1\), the contribution of the rank one operator vanishes, since \(u(\ \cdot -t)\) is a radial (that is, independent of \(\omega \)) function. When \(l=0\), the function \(\psi \) is in the kernel of \({\mathcal {L}}_\psi \), as we have already observed. When looking for other elements \(\varphi \) in the kernel, we may thus subtract a suitable multiple of \(\psi \) from \(\varphi \) and are led, in view of (1.10), to the equation (3.2) without rank one projector.

For \(l=0\) we have \(\theta _0=-\Lambda \), which, by Lemma 7, is the second eigenvalue of the operator. Thus, in this case the \(L^2(\mathbb {R})\)-solution space of (3.2) is one-dimensional and spanned by \(\partial _s u(\ \cdot -t)\). For \(l= 1\) we have \(\theta _1= -(d-1+\Lambda )=-\frac{q^2}{4} \Lambda \) (as \(\Lambda =\Lambda _{FS}\)), which, by Lemma 7, is the lowest eigenvalue of the operator. Thus, in this case the \(L^2(\mathbb {R})\)-solution space of (3.2) is one-dimensional and spanned by \(u^{\frac{q}{2}}(\ \cdot -t)\). Finally, for \(l\ge 2\) we have \(\theta _l<\theta _1\), and correspondingly there is no non-trivial \(L^2(\mathbb {R})\)-solution of (3.2).

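Multiplying these solutions with a basis of spherical harmonics of the appropriate degree, we see that the kernel of the Hessian is spanned by \(u(\ \cdot -t)\), \(\partial _s u(\ \cdot -t)\), and \(\omega _i u^{\frac{q}{2}}(\ \cdot -t)\), \(i=1,\dots ,d\), as claimed.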
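The arithmetic behind the comparison of the \(\theta _l\) with the eigenvalues from Lemma 7 is short. It is recorded here as a sketch, using only the relation \(\frac{q^2-4}{4}\Lambda _{FS}=d-1\) that characterizes the Felli–Schneider curve:

$$\begin{aligned} \theta _1&=-(d-1+\Lambda _{FS})=-\left( \frac{q^2-4}{4}+1\right) \Lambda _{FS}=-\frac{q^2}{4}\Lambda _{FS} ,\\ \theta _l-\theta _1&=-\left( l(l+d-2)-(d-1)\right) =-(l-1)(l+d-1)<0\qquad \text {for}\ l\ge 2 . \end{aligned}$$

In particular, \(\theta _l\) lies strictly below the bottom of the spectrum for every \(l\ge 2\).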

3.2 Proof of Proposition 3

Step 1. Proposition 2 is applicable as \(\Vert u_n\Vert ^2\rightarrow C_{a,b}\) and \(\Vert u_n\Vert _q\rightarrow 1\) for \(n\rightarrow \infty \). Passing to a subsequence, we thus obtain the decomposition

$$\begin{aligned}u_n=\lambda _n(u+r_n)(\ \cdot -t_n,\omega ), \qquad (s,\omega )\in {\mathcal {C}},\end{aligned}$$

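with the prescribed convergence and orthogonality properties. Defining \(\alpha _n\in \mathbb {R}^d\) by \(\alpha _{n,i}:=\Vert \omega _i u^{\frac{q}{2}}\Vert ^{-2}\langle r_n,\omega _i u^{\frac{q}{2}}\rangle \), \(i=1,\dots ,d\), and \(\mu _n:=|\alpha _n|\), we can choose an orthogonal matrix \(D_n\in {\mathcal {O}}(d)\) such that \(D_n\alpha _n = \mu _n e_d\). It follows that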

$$\begin{aligned} \mu _n \omega _d\circ D_n= \alpha _n\cdot \omega \qquad \text {for all}\ \omega \in \mathbb {S}^{d-1}, \end{aligned}$$

where \(\cdot \) denotes the scalar product in \(\mathbb {R}^d\). We will abuse the notation slightly by writing \(f\circ D_n=f (\ \cdot , \, D_n\ \cdot \ )\) for a function f defined on \({\mathcal {C}}\). We define \({\tilde{R}}_n\in H^1({\mathcal {C}})\) to be

$$\begin{aligned} {\tilde{R}}_n{:}{=}r_n\circ D_n^{-1}-\mu _n\omega _d u^{\frac{q}{2}}, \end{aligned}$$

so \({\tilde{R}}_n\circ D_n\) is the remainder term of projecting \(r_n\) onto \(\omega _i u^{\frac{q}{2}}\), \(i=1,\dots , d\), in \(H^1({\mathcal {C}})\). As \(D_n^{-1}\) only rotates the basis \(\{\omega _i\}_{i\in \{1,\dots ,d\}}\), the set \(\{\omega _i\circ D_n^{-1}\}_{i\in \{1,\dots ,d\}}\) spans the spherical harmonics of degree 1 as well. Therefore, we see that

$$\begin{aligned} \langle {\tilde{R}}_n,\omega _i u^{\frac{q}{2}}\rangle =0\qquad \text {for all}\ i=1,\dots ,d. \end{aligned}$$
(3.3)

Since \(\omega _d u^{\frac{q}{2}}\) is orthogonal to \({\text {span}}\{ u, \partial _s u \}\) in \(H^1({\mathcal {C}})\), we can apply the orthogonality conditions for \(r_n\) from Proposition 2 to obtain the relations

$$\begin{aligned} \langle {\tilde{R}}_n, u\rangle =\langle {\tilde{R}}_n, \partial _s u\rangle =0. \end{aligned}$$

Step 2. Turning to the convergence properties, as \(q>2\), we can expand

Applying the expansions above with \(x=r_n u^{-1}\) and

$$\begin{aligned} y=\frac{q(q-1)}{2\Vert u\Vert ^q_q}\int _{{\mathcal {C}}}u^{q-2}r_n^2\ \textrm{d}(s,\omega )+{\mathcal {O}}_{n\rightarrow \infty }(\Vert r_n\Vert ^{q\wedge 3}_{q\wedge 3}+\Vert r_n\Vert ^{q}_{q}) \end{aligned}$$

leads to

where the first order term in the penultimate step vanished due to orthogonality; see (1.15). In the last step, Sobolev embedding and \(\Vert r_n\Vert \rightarrow 0\) for \(n\rightarrow \infty \) simplified the error \(\Vert r_n\Vert ^{q\wedge 3}_{q\wedge 3}+\Vert r_n\Vert ^{q}_{q}={\mathcal {O}}_{n\rightarrow \infty } (\Vert r_n\Vert ^{q\wedge 3})\). Using Hölder’s inequality, we can verify that the Taylor expansion in use was indeed applicable as \( |y|^2={\mathcal {O}}_{n\rightarrow \infty }(\Vert r_n\Vert ^{q\wedge 3})\). Recalling that by orthogonality

$$\begin{aligned}\Vert u_n\Vert ^2=\lambda _n^2(\Vert u\Vert ^2+\Vert r_n\Vert ^2),\end{aligned}$$

we are able to expand \({\mathcal {F}}\) to quadratic order:

$$\begin{aligned} {\mathcal {F}}(u_n)=\Vert u_n\Vert ^2-C_{a,b}\Vert u_n\Vert _q^2 =\frac{\lambda _n^2}{2}D^2_u{\mathcal {F}}(r_n)+{\mathcal {O}}_{n\rightarrow \infty } (\Vert r_n\Vert ^{q\wedge 3}). \end{aligned}$$
(3.4)

Above, the terms of order zero vanish due to the optimality of u, and the rank one projector in \(D^2_u{\mathcal {F}}\) can be added as \(\langle r_n,u^{q-1}\rangle _2=0\).

Since \(\Vert r_n\Vert \rightarrow 0\) for \(n\rightarrow \infty \), the expansion (3.4) yields

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{D^2_u{\mathcal {F}}(r_n)}{2\Vert r_n\Vert ^2}=\lim _{n\rightarrow \infty }\frac{{\mathcal {F}}(u_n)}{{\text {dist}}(u_n, {\mathcal {M}})^2}=0, \end{aligned}$$
(3.5)

where we used our assumption (1.16) in the last step. We recall from Subsection 3.1 that the Hessian of \({\mathcal {F}}\) corresponds to an operator \({\mathcal {L}}_\psi \) in \(L^2({\mathcal {C}})\). Since its essential spectrum starts at \(\Lambda \) (by Weyl’s theorem; see, e.g., [21, Theorem 1.14]), we see that \(D^2_u{\mathcal {F}}\) is positive definite on the \(L^2({\mathcal {C}})\)-orthogonal complement of \({\text {Ker}}(D^2_u{\mathcal {F}})\) and has a spectral gap above 0, so

$$\begin{aligned} D^2_u{\mathcal {F}}|_{{\text {Ker}}(D^2_u{\mathcal {F}})^\perp }\ge {\tilde{c}}\Vert \cdot \Vert _2^2 \end{aligned}$$

for some \({\tilde{c}}>0\). The right side can be improved to the \(H^1({\mathcal {C}})\)-norm. To this end, let \(\delta >0\) and \(\varphi \in {\text {Ker}}(D^2_u{\mathcal {F}})^\perp \). As \(\Vert u\Vert ^{q-2}_{L^{\infty }(\mathbb {R})}\le \frac{q}{2}\Lambda \), we can bound

$$\begin{aligned} D^2_u{\mathcal {F}}(\varphi )\ge \delta \Vert \varphi \Vert ^2+\left( (1-\delta ){\tilde{c}} -\delta (q-1)\frac{q}{2}\Lambda \right) \Vert \varphi \Vert _2^2\ge \delta \Vert \varphi \Vert ^2 \end{aligned}$$

for \(\delta \) chosen small enough. Therefore, it follows immediately that

$$\begin{aligned} D^2_u{\mathcal {F}}|_{{\text {Ker}}(D^2_u{\mathcal {F}})^\perp }\asymp \Vert \cdot \Vert ^2, \end{aligned}$$

that is, the induced norms are equivalent. Moreover, we know that by (3.1) and by (3.3) and (1.12). Using this and (3.5), we find that

$$\begin{aligned} \frac{\Vert {\tilde{R}}_n\Vert ^2}{\Vert r_n\Vert ^2}\asymp \frac{D^2_u{\mathcal {F}}( {\tilde{R}}_n)}{\Vert r_n\Vert ^2}=\frac{D^2_u{\mathcal {F}}(r_n)}{\Vert r_n\Vert ^2}\rightarrow 0 \end{aligned}$$
(3.6)

for \(n\rightarrow \infty \). The orthogonality relations (3.3) and the equation (1.12) for \(u^{\frac{q}{2}}\) along with \(\Vert \omega _d\Vert ^2_{L^2({\mathbb {S}}^{d-1})}=|{\mathbb {S}}^{d-1}| d^{-1}\) imply

$$\begin{aligned} \Vert {\tilde{R}}_n\Vert ^2+\frac{q-1}{d}\Vert u^{q-1}\Vert _2^2 \mu _n^2 =\Vert r_n\Vert ^2. \end{aligned}$$

This and the asymptotics (3.6) show that \(\mu _n^2\Vert r_n\Vert ^{-2} \rightarrow d((q-1) \Vert u^{q-1}\Vert ^2_2)^{-1}\). It follows that \(\mu _n={\mathcal {O}}_{ n\rightarrow \infty }(\Vert r_n\Vert )\), and, unless \(r_n=0\), we have \(\mu _n\ne 0\) for all sufficiently large n. We finally set \(R_n:=\mu _n^{-1}{\tilde{R}}_n\) when \(\mu _n\ne 0\) (and \(R_n=0\) when \(\mu _n=0\)). Then \(\Vert R_n\Vert =o_{n\rightarrow \infty }(1)\), and the above orthogonality conditions for \({\tilde{R}}_n\) translate into orthogonality conditions for \(R_n\). \(\square \)
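The constant in the Pythagorean identity used in this proof can be traced as follows. The sketch assumes that the degree one zero mode is \(\omega _d u^{\frac{q}{2}}\), that the norm of \(H^1({\mathcal {C}})\) is the one induced by \(-\partial _s^2-\Delta _\omega +\Lambda \), and that (1.12) reads \((-\partial _s^2-(q-1)u^{q-2})u^{\frac{q}{2}}=-\frac{q^2}{4}\Lambda u^{\frac{q}{2}}\). With \(\Lambda =\Lambda _{FS}\), that is, \(d-1+\Lambda =\frac{q^2}{4}\Lambda \),

$$\begin{aligned} \Vert \omega _d u^{\frac{q}{2}}\Vert ^2&=\left\langle \omega _d u^{\frac{q}{2}},\left( -\partial _s^2+d-1+\Lambda \right) \omega _d u^{\frac{q}{2}}\right\rangle _2\\&=\left\langle \omega _d u^{\frac{q}{2}},\left( (q-1)u^{q-2}+d-1+\Lambda -\frac{q^2}{4}\Lambda \right) \omega _d u^{\frac{q}{2}}\right\rangle _2\\&=(q-1)\Vert \omega _d\Vert ^2_{L^2({\mathbb {S}}^{d-1})}\int _\mathbb {R}u^{2q-2}\ \textrm{d}s=\frac{q-1}{d}\Vert u^{q-1}\Vert _2^2 , \end{aligned}$$

where the last step uses \(\Vert u^{q-1}\Vert _2^2=|{\mathbb {S}}^{d-1}|\int _\mathbb {R}u^{2q-2}\ \textrm{d}s\).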

4 Non-vanishing of the quartic order

4.1 Quartic expansion of the deficit functional

As the denominator in (1.19) equals \(\lambda _n^4 \Vert r_n\Vert ^4\), which is comparable to \(\mu _n^4\) for \(n\rightarrow \infty \), we aim to expand the numerator to fourth order in \(|\mu _n|\) and expect lower order terms to vanish. In fact, the decomposition (1.17) formally leads to a quartic expansion of the functional \({\mathcal {F}}\). However, we cannot control arbitrary perturbations \(R_n\) in terms of \(\mu _n\) yet. We only know that \(\Vert R_n\Vert \rightarrow 0\) for \(n\rightarrow \infty \). In order to conduct perturbation theory, it would be more beneficial to have control of the \(L^\infty \)-norm of \(R_n\) as suggested in [19]. In general, an expansion of the \(L^{q}\)-norm up to fourth order requires \(q\ge 4\). However, we will circumvent this issue by splitting the domain of integration in the \(L^q\)-norm: The remainder term \(|R_n|\) is cut off by , which stems from the non-trivial zero mode, allowing expansions to arbitrary order. This approach for 1 instead of simplifies the computations for the quartic expansion in [19], and we expect it to be applicable to prove other degenerate stability results in the future.

Lemma 8

(Quartic order expansion of \({\mathcal {F}}\)) If \((u_n)_n\) is as in Proposition 3, we have in the notation of that proposition,

$$\begin{aligned} \lambda _n^{-2}{\mathcal {F}}(u_n)=\lambda _n^{-2}(\Vert u_n\Vert ^2-C_{a,b}\Vert u_n\Vert ^2_q) \ge (A)+(B)+{\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^5), \end{aligned}$$

where for some n-independent constant \(C<\infty \) we define

$$\begin{aligned} (A)&{:}{=}\mu _n^2\left( (1-C|\mu _n|^{(q-2)\wedge 1})\Vert R_n\Vert ^2-(q-1)\int _{{\mathcal {C}}}(u^{q-2}R_n^2+(q-2)\mu _n u^{2q-3}\omega _d^2R_n)\ \textrm{d}(s,\omega )\right) ,\\ (B)&{:}{=}\mu _n^4\frac{(q-1)(q-2)}{4}\left( \frac{q-1}{\Vert u\Vert _q^q}\left( \int _{{\mathcal {C}}}u^{2q-2}\omega _d^2\ \textrm{d}(s,\omega )\right) ^2-\frac{q-3}{3}\int _{{\mathcal {C}}}u^{3q-4}\omega _d^4\ \textrm{d}(s,\omega )\right) . \end{aligned}$$

It is instructive to consider this result in view of the secondary non-degeneracy condition in Definition 5. Setting \(\varphi =\mu _n R_n\) and \(g=\mu _n\omega _d u^{\frac{q}{2}}\), (A) corresponds (to leading order in \(\mu _n\)) to the bilinear and linear part in \(\varphi \) of (1.21), that is, \(2^{-1}(D_u^2{\mathcal {F}}(\varphi ,\varphi )+D_u^3{\mathcal {F}}(g,g,\varphi ))\), while (B) is the constant part \(24^{-1}D_u^4{\mathcal {F}}(g,g,g,g)\). Hence, the main goal of the next subsection is to bound them together from below by a positive constant.

Proof

As the prerequisites for Proposition 3 are satisfied, the decomposition (1.17) is available. Exploiting the translation and rotation invariance of the assumptions and the expansion in the theorem, we may assume without loss of generality that

We consider the set of points in \({\mathcal {C}}\) where and the one where , separately.

Step 1. In the set where we have

for n large enough. Then we can write

(4.1)

and expand the last term around 1. Applying the -norm to (4.1), the order of the error in \(|\mu _n|\) of the expansion on the right side is preserved since \(u^{q}\) is integrable. Thus, we may expand the -norm to arbitrary order. We will only need an expansion to quartic order:

(4.2)

where we absorbed some of the third and fourth order terms into the error. In particular, we used

Step 2. We can surely expand the \(L^q\)-norm to second order for \(q>2\) on :

(4.3)

where we used that \(u^{q-(q\wedge 3)}\lesssim 1\) uniformly in s in the argument of \({\mathcal {O}}_{n\rightarrow \infty }\). Denoting \(p\in \{q,3\wedge q\}\subset (2, 2^*)\), we realize that

(4.4)

As holds pointwise and is integrable for every \(m>0\), we obtain

(4.5)

Inserting (4.4) and (4.5) into (4.3) gives us a quartic expansion as in (4.2) but over and with an additional error \({\mathcal {O}}_{n\rightarrow \infty }(\Vert \mu _nR_n\Vert _q^q+\Vert \mu _nR_n\Vert _{q\wedge 3}^{q\wedge 3})\).

Step 3. By simple manipulations, we may reduce the overall error for the quartic expansion of the \(L^q({\mathcal {C}})\)-norm to \({\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^5+|\mu _n|^{q\wedge 3}\Vert R_n\Vert ^2)\). Using this expansion, we compute

where the whole first order term, the mixed term of second order, and the term including \(\omega _d^3\) of third order vanish due to orthogonality relations of the spherical harmonic \(\omega _d\) and \(R_n\); see Proposition 3. Together with the expansion of the \(H^1({\mathcal {C}})\)-norm,

we obtain the desired expansion of the functional \({\mathcal {F}}\) to quartic order. Note that the terms of order zero vanish by optimality of u, and the \(R_n\)-independent second order terms cancel due to equation (1.12). The stated lower bound now follows from replacing \({\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{q\wedge 3} \Vert R_n\Vert ^2)\) by a lower bound of the form \(-C|\mu _n|^{q\wedge 3} \Vert R_n\Vert ^2\), \(C<\infty \).\(\square \)

4.2 Bounding (A) + (B) from below

In order to prove Proposition 4, we are going to estimate \((A) + (B)\) from Lemma 8 from below by the leading order \(\mu _n^4\) times a positive (q, d)-dependent constant. To this end, we will frequently use two well-known identities:

$$\begin{aligned} \int _{{\mathbb {S}}^{d-1}}\omega _d^{n}\ \textrm{ d} \omega = |{\mathbb {S}}^{d-1}|\frac{\Gamma \left( \frac{d}{2}\right) \Gamma \left( \frac{n+1}{2}\right) }{\Gamma \left( \frac{1}{2}\right) \Gamma \left( \frac{d+n}{2}\right) }\mathbb {1}_{\{n\in 2\mathbb {N}_0\}}\text { and }\int _\mathbb {R}\frac{|\sinh (s)|^n}{\cosh ^\nu (s)}\ \textrm{ d} s= \frac{\Gamma \left( \frac{\nu -n}{2}\right) \Gamma \left( \frac{n+1}{2}\right) }{\Gamma \left( \frac{\nu +1}{2}\right) }\nonumber \\ \end{aligned}$$
(4.6)

for \(\nu >n\), \(n\in \mathbb {N}_0\). After passing to spherical coordinates, the first integral follows from [22, 3.621, Eq. 5], while the second one is a direct consequence of [22, 3.512, Eq. 2].
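As a sanity check of (4.6), the lowest cases can be evaluated by hand with \(\Gamma (x+1)=x\Gamma (x)\) and \(\Gamma (\frac{1}{2})=\sqrt{\pi }\). The following lines are only such a check; the values for \(n=2\) and \(n=4\) are the ones needed for the normalization in the spherical harmonics expansion.

$$\begin{aligned} n=2:\quad \frac{\Gamma \left( \frac{d}{2}\right) \Gamma \left( \frac{3}{2}\right) }{\Gamma \left( \frac{1}{2}\right) \Gamma \left( \frac{d}{2}+1\right) }&=\frac{1}{2}\cdot \frac{2}{d}=\frac{1}{d} ,\\ n=4:\quad \frac{\Gamma \left( \frac{d}{2}\right) \Gamma \left( \frac{5}{2}\right) }{\Gamma \left( \frac{1}{2}\right) \Gamma \left( \frac{d}{2}+2\right) }&=\frac{3}{4}\cdot \frac{4}{d(d+2)}=\frac{3}{d(d+2)} ,\\ n=0,\ \nu =2:\quad \frac{\Gamma (1)\Gamma \left( \frac{1}{2}\right) }{\Gamma \left( \frac{3}{2}\right) }&=2=\int _\mathbb {R}\frac{\textrm{d}s}{\cosh ^2(s)} . \end{aligned}$$

The first line reproduces \(\Vert \omega _d\Vert ^2_{L^2({\mathbb {S}}^{d-1})}=|{\mathbb {S}}^{d-1}|d^{-1}\).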

Let us start with (B). It is independent of \(R_n\) and can thus be computed explicitly by means of the identities (4.6). We find that

$$\begin{aligned} (B)=\mu _n^4\frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{4d^2}\frac{\Gamma \left( \frac{3q-4}{q-2}\right) \sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) } (q-1)(q-2) \left( \frac{q(5q-6)}{2(3q-2)}-\frac{d(q-3)}{(d+2)}\right) .\nonumber \\ \end{aligned}$$
(4.7)

In order to estimate (A), we will expand \(R_n\) in spherical harmonics,

$$\begin{aligned} R_n(s,\omega )=\sum _{l,m} a_{l,m}(s) Y_{l,m}(\omega ), \end{aligned}$$
(4.8)

where \(a_{l,m}\) are s-dependent coefficients and \((Y_{l,m})_m\) is an \(L^2({\mathbb {S}}^{d-1})\)-orthonormal basis of spherical harmonics of degree l, which we introduced in Sect. 3. We will choose

$$\begin{aligned} Y_{0,0}{:}{=}\frac{1}{\sqrt{|{\mathbb {S}}^{d-1}|}}\qquad \text { and }\qquad Y_{2,0}(\omega _d){:}{=}\sqrt{\frac{d^2(d+2)}{2(d-1)|{\mathbb {S}}^{d-1}|}} \left( \omega _d^2-\frac{1}{d}\right) , \end{aligned}$$
(4.9)

where the normalizing constants can be computed using the first identity from (4.6). The \(L^2({\mathbb {S}}^{d-1})\)-orthogonality of the spherical harmonics \(Y_{l,m}\) allows us to state the \(L^2({\mathcal {C}})\)-orthogonality conditions from Proposition 3 (see (1.18)) in terms of the coefficients:

(4.10)
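The normalizing constant of \(Y_{2,0}\) in (4.9) can be verified directly from the first identity in (4.6) with \(n=2\) and \(n=4\), that is, from \(\int _{{\mathbb {S}}^{d-1}}\omega _d^2\ \textrm{d}\omega =|{\mathbb {S}}^{d-1}|d^{-1}\) and \(\int _{{\mathbb {S}}^{d-1}}\omega _d^4\ \textrm{d}\omega =3|{\mathbb {S}}^{d-1}|(d(d+2))^{-1}\). A short computation gives

$$\begin{aligned} \int _{{\mathbb {S}}^{d-1}}\left( \omega _d^2-\frac{1}{d}\right) ^2\ \textrm{d}\omega =|{\mathbb {S}}^{d-1}|\left( \frac{3}{d(d+2)}-\frac{2}{d^2}+\frac{1}{d^2}\right) =|{\mathbb {S}}^{d-1}|\frac{2(d-1)}{d^2(d+2)} , \end{aligned}$$

which is the reciprocal of the square of the prefactor in (4.9), so that \(\Vert Y_{2,0}\Vert _{L^2({\mathbb {S}}^{d-1})}=1\).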

There will be two lemmas that provide sharp lower bounds on (A). The first estimate says that, except for \(l=0,m=0\) and \(l=2,m=0\), the contribution of the coefficients is bounded from below by 0. The other one takes care of the missing cases by a ‘completing the square’-argument.

Lemma 9

(Negligible energies) Inserting (4.8) for \(R_n\), we have for any \(n\in \mathbb {N}\),

$$\begin{aligned} (A)\ge \mu _n^4\sum _{l\in \{0,2\}}{\mathcal {E}}^{(l)}\left( \frac{a_{l,0}}{\mu _n}\right) , \end{aligned}$$
(4.11)

where, for \(l\in \{0,2\}\) and \(g\in H^1(\mathbb {R})\),

$$\begin{aligned}{\mathcal {E}}^{(l)}(g)&{:}{=}\int _\mathbb {R}((1-C|\mu _n|^{(q-2)\wedge 1})((\partial _s g)^2+(2d\mathbb {1}_{\{l=2\}}(l)+\Lambda )g^2)\\&\quad -(q-1)u^{q-2}g^2 -2 f^{(l)}g) \ \textrm{d}s\end{aligned}$$

with

$$\begin{aligned} f^{(l)}{:}{=}M u^{2q-3} \left( \sqrt{\frac{2(d-1)}{d+2}}\mathbb {1}_{\{l=2\}}(l)+\mathbb {1}_{\{l=0\}}(l)\right) , \qquad M{:}{=}(q-1)(q-2) \frac{\sqrt{|{\mathbb {S}}^{d-1}|}}{2d}. \end{aligned}$$

Here, to lighten the notation, we do not indicate the dependence of \({\mathcal {E}}^{(l)}\) on n. Since \(\mu _n\rightarrow 0\), this dependence will be weak.

Proof

By means of (4.9), we expand \(\omega _d^2\) in spherical harmonics:

$$\begin{aligned}\omega _d^2=(\omega _d^2-d^{-1})+d^{-1}=\sqrt{\frac{2(d-1)|{\mathbb {S}}^{d-1}|}{d^2(d+2)}}Y_{2,0}+\frac{\sqrt{|{\mathbb {S}}^{d-1}|}}{d}Y_{0,0}.\end{aligned}$$

As a result, expanding \(R_n\) in spherical harmonics as well, we can write (A) as

$$\begin{aligned} \nonumber \mu _n^2&\left( (1-C |\mu _n|^{(q-2)\wedge 1})\Vert R_n\Vert ^2-(q-1)\int _{{\mathcal {C}}}\left( u^{q-2}R_n^2\ +(q-2)\mu _n u^{2q-3}\omega _d^2R_n\right) \, \textrm{d}(s,\omega )\right) \\&\quad =\mu _n^2\int _\mathbb {R}\Bigg ( \sum _{l,m} \Bigg ( (1-C |\mu _n|^{(q-2)\wedge 1})((\partial _s a_{l,m})^2+(l(l+d-2)+\Lambda )a_{l,m}^2) \nonumber \\&\qquad -(q-1)u^{q-2} a_{l,m}^2 \Bigg )\nonumber \\&\qquad -(q-1)(q-2)\mu _n \frac{\sqrt{|{\mathbb {S}}^{d-1}|}}{d} \left( u^{2q-3}\sqrt{\frac{2(d-1)}{d+2}} a_{2,0}+ u^{2q-3} a_{0,0}\right) \Bigg ) \, \textrm{d}s \,. \end{aligned}$$
(4.12)

The terms with \((l,m)\in \{(0,0),(2,0)\}\) are the terms that appear on the right side in the lemma. We will now show that the remaining terms are bounded from below by zero. This will imply the lemma.

According to Lemma 7, the lowest eigenvalue of the operator \(-\partial ^2_s -(q-1)u^{q-2}\) in \(L^2(\mathbb {R})\) is \(-\frac{q^2}{4}\Lambda \). This, together with the bound \(\Vert u\Vert ^{q-2}_{L^\infty (\mathbb {R})}\le \frac{q}{2}\Lambda \), implies that, once n is so large that \(C|\mu _n|^{(q-2)\wedge 1}\le 1\),

$$\begin{aligned}&(1-C |\mu _n|^{(q-2)\wedge 1}) (-\partial ^2_s +l(l+d-2)+\Lambda )-(q-1)u^{q-2} \\&\quad \ge (1-C|\mu _n|^{(q-2)\wedge 1}) \left( -\partial ^2_s +l(l+d-2)+\Lambda -(q-1)u^{q-2} \right) \\&\qquad - C|\mu _n|^{(q-2)\wedge 1}(q-1)\frac{q}{2}\Lambda \\&\quad \ge (1-C|\mu _n|^{(q-2)\wedge 1}) \left( - \frac{q^2}{4}\Lambda + l(l+d-2) + \Lambda \right) -C|\mu _n|^{(q-2)\wedge 1}(q-1)\frac{q}{2}\Lambda \,. \end{aligned}$$

We have \(- \frac{q^2}{4}\Lambda + \Lambda = - \frac{q^2-4}{4} \Lambda _{FS} = - (d-1)\). Since for \(l\ge 2\) we have \(l(l+d-2)\ge 2d>d-1\), it follows that, if n is large enough, then for all \(l\ge 2\) we have

$$\begin{aligned} (1-C |\mu _n|^{(q-2)\wedge 1}) (-\partial ^2_s +l(l+d-2)+\Lambda )-(q-1)u^{q-2} \ge 0 . \end{aligned}$$
(4.13)

Therefore, we can bound the terms in the sum over (l, m) in (4.12) with \(l\ge 2\) by 0 from below; however, we will keep the summand with \((l,m)=(2,0)\).

We now bound the term with \(l=1\). We set and write

Here we used the Eq. (1.12) for , the third condition in (4.10) and the assumption \(\Lambda =\Lambda _{FS}\). Since is \(L^2(\mathbb {R})\)-orthogonal to , which is the ground state of the operator \(-\partial ^2_s -(q-1)u^{q-2}\) in \(L^2(\mathbb {R})\), and since the second eigenvalue of this operator is \(-\Lambda \), we can bound

This, together with \(\Vert u\Vert ^{q-2}_{L^\infty (\mathbb {R})}\le \frac{q}{2}\Lambda \), implies that, when \(C|\mu _n|^{(q-2)\wedge 1}\le 1\),

This is non-negative for all sufficiently large n. Thus, we conclude that (4.11) holds.\(\square \)

Lemma 9 motivates the study of the minimization problems

$$\begin{aligned} E^{(0)}&{:}{=}\inf \left\{ {\mathcal {E}}^{(0)}(g) :\ g\in H^1(\mathbb {R}), \langle u^{q-1}, g\rangle _{L^2(\mathbb {R})}=\langle u^{q-2}\partial _s u, g\rangle _{L^2(\mathbb {R})} = 0 \right\} , \\ E^{(2)}&{:}{=}\inf \left\{ {\mathcal {E}}^{(2)}(g) :\ g\in H^1(\mathbb {R}) \right\} . \end{aligned}$$

The energy functionals \({\mathcal {E}}^{(l)}\) are of the form quadratic plus linear, and therefore the corresponding minimization problems \(E^{(l)}\) are abstractly solvable by a ‘completion of the square’-argument. Producing concrete numerical values, however, is not straightforward, in particular for \(l=2\). These numerical values are necessary in order to verify the secondary non-degeneracy condition. We stress that this difficulty is already present in the model case where \(\mu _n\) is replaced by zero. To avoid distraction from the main idea of the proof, we state here the outcome of the ‘completion of the square’-argument and provide a sketch of the argument but defer the details for \(l=2\) to the next section.

Lemma 10

(Energy of the degree 0 solution) As \(n\rightarrow \infty \), we have

$$\begin{aligned} E^{(0)} = - \frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{4d^2}\frac{\Gamma \left( \frac{3q-4}{q-2}\right) \sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) }\frac{q(q-2)^3}{4(3q-2)}+ {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}).\end{aligned}$$

Lemma 11

(Energy of the degree 2 solution) As \(n\rightarrow \infty \), we have

$$\begin{aligned} E^{(2)}&= -\frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{4d^2}\frac{\Gamma \left( \frac{3q-4}{q-2}\right) \sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) }\frac{q(q-1)(q-2)(d-1)}{(d+2)P(-1)} \sum _{k=0}^\infty \left( P(k-\xi )- P(k)\right) \\&\quad + {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}) \end{aligned}$$

with a smooth function \(P:[-1,\infty )\rightarrow (0,\infty )\) defined by

$$\begin{aligned} P(x){:}{=}\frac{\Gamma \left( x+\frac{3}{2}\right) \Gamma \left( x+2{\mathfrak {b}}-1\right) \Gamma \left( x+2{\mathfrak {b}}\right) }{\Gamma \left( x+{\mathfrak {b}}-{\mathfrak {a}}+1\right) \Gamma \left( x+{\mathfrak {b}}+{\mathfrak {a}} +1\right) \Gamma \left( x+2{\mathfrak {b}}+\frac{1}{2}\right) },\qquad x\ge -1, \end{aligned}$$
(4.14)

and

$$\begin{aligned} {\mathfrak {a}} {:}{=}\frac{\sqrt{1+\frac{2d}{\Lambda }}}{q-2} , \qquad {\mathfrak {b}} {:}{=}\frac{2q-3}{q-2},\qquad \xi {:}{=}{\mathfrak {b}}-{\mathfrak {a}}. \end{aligned}$$
(4.15)

For the proof of these two lemmas, we consider for \(l\in \{0,2\}\) the one-dimensional Schrödinger operators

$$\begin{aligned} h_n^{(l)}{:}{=}(1-C|\mu _n|^{(q-2)\wedge 1})(-\partial _s^2+2d\mathbb {1}_{\{l=2\}}(l)+\Lambda )-(q-1)u^{q-2} . \end{aligned}$$
(4.16)

They can be considered as self-adjoint, lower semibounded operators in the Hilbert space \(L^2(\mathbb {R})\) with form domain \(H^1(\mathbb {R})\) and operator domain \(H^2(\mathbb {R})\).

Proof of Lemma 11

As we have seen in (4.13) in the proof of Lemma 9, the operator \(h_n^{(2)}\) is bounded from below by \(d+1+ {\mathcal {O}}(|\mu _n|^{(q-2)\wedge 1})\). Thus, for all sufficiently large n the operator is boundedly invertible. We can write for any \(g\in H^1(\mathbb {R})\),

$$\begin{aligned} {\mathcal {E}}^{(2)}(g)&= \Vert (h_n^{(2)})^{1/2} g - (h_n^{(2)})^{-1/2} f^{(2)} \Vert _{L^2(\mathbb {R})}^2 - \langle f^{(2)}, (h_n^{(2)})^{-1} f^{(2)} \rangle _{L^2(\mathbb {R})}\,. \end{aligned}$$

Thus, \(E^{(2)} = \inf _g {\mathcal {E}}^{(2)}(g) = - \langle f^{(2)}, (h_n^{(2)})^{-1} f^{(2)} \rangle _{L^2(\mathbb {R})}\), and the infimum is attained at the unique g that satisfies \((h_n^{(2)})^{1/2} g = (h_n^{(2)})^{-1/2} f^{(2)}\). The latter is equivalent to having \(g\in H^2(\mathbb {R})\) and \(h_n^{(2)} g = f^{(2)}\). We have not been able to find a simple explicit expression of this solution g (not even for \(\mu _n=0\)), but we have been able to find a power series representation of it, which allows us to deduce the formula stated in the lemma. We defer the details of this rather lengthy analysis to the next section. \(\square \)
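The ‘completion of the square’ used in this proof can be written out as follows. This is a sketch in the sense of quadratic forms, abbreviating \(h=h_n^{(2)}\) and \(f=f^{(2)}\), and assuming n is so large that h is positive and boundedly invertible. By the definition of \({\mathcal {E}}^{(2)}\), we have \({\mathcal {E}}^{(2)}(g)=\langle g,hg\rangle _{L^2(\mathbb {R})}-2\langle f,g\rangle _{L^2(\mathbb {R})}\), while

$$\begin{aligned} \Vert h^{1/2}g-h^{-1/2}f\Vert _{L^2(\mathbb {R})}^2=\langle g,hg\rangle _{L^2(\mathbb {R})}-2\langle f,g\rangle _{L^2(\mathbb {R})}+\langle f,h^{-1}f\rangle _{L^2(\mathbb {R})} . \end{aligned}$$

Subtracting \(\langle f,h^{-1}f\rangle _{L^2(\mathbb {R})}\) gives the stated identity, and the non-negative square is the only term that depends on g.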

The idea of the proof of Lemma 10 is similar to that of Lemma 11, and in this case, in fact, an explicit expression for the solution is available. There is a different complication, however, namely, in the first part of the proof. This comes from the fact that the operator \(h_n^{(0)}\) is not positive definite. Indeed, from Lemma 7 we know that at \(\mu _n=0\) its lowest two eigenvalues are \(-(\frac{q^2}{4}-1)\Lambda \) and 0. We need to remove these two unstable directions in order to obtain a boundedly invertible operator. As we will show, this is achieved by the orthogonality conditions in the definition of \(E^{(0)}\). This is not completely obvious since the functions in these orthogonality conditions are not eigenfunctions of the operator.

Proof of Lemma 10

Step 1. We denote by \(\Pi \) the \(L^2(\mathbb {R})\)-orthogonal projection onto the \(L^2(\mathbb {R})\)-orthogonal complement of \({\text {span}}\{u^{q-1},u^{q-2}\partial _s u\}\). We claim that for all sufficiently large n the operator \(\Pi h_n^{(0)}\Pi \), considered in the Hilbert space \({{\,\textrm{ran}\,}}\Pi \), is bounded from below by a positive constant and, consequently, boundedly invertible on that space. Since the bottom of the spectrum of \(\Pi h_n^{(0)}\Pi \) depends continuously on \(\mu _n\), it suffices to prove this assertion for the operator \(h^{(0)}\), defined similarly as \(h_n^{(0)}\) but with \(\mu _n\) set to zero.

By the computations in Subsection 3.1, we know that the Hessian of \({\mathcal {F}}\) at u, restricted to radial functions, is given by the operator \(h^{(0)} + (q-2)\Vert u\Vert _q^{-q} |u^{q-1}\rangle \langle u^{q-1}|\) in \(L^2(\mathbb {R})\). Since the Hessian is positive semidefinite and since

$$\begin{aligned} \Pi h^{(0)} \Pi = \Pi \left( h^{(0)} + (q-2)\Vert u\Vert _q^{-q} |u^{q-1}\rangle \langle u^{q-1}| \right) \Pi , \end{aligned}$$

we deduce that \(\Pi h^{(0)}\Pi \) is positive semidefinite. Since the essential spectrum of this operator starts at \(\Lambda >0\) (as before by Weyl’s theorem; see, e.g., [21, Theorem 1.14]), it suffices to prove that 0 is not an eigenvalue of \(\Pi h^{(0)}\Pi \). Thus, assume that \(\Pi h^{(0)}\Pi \varphi =0\) for some \(\varphi \in H^2(\mathbb {R})\). Then the above computation shows that \(\Pi \varphi \) lies in the kernel of the Hessian of \({\mathcal {F}}\) at u, restricted to radial functions. Hence, by (3.1), \(\Pi \varphi = c_1 u + c_2 \partial _s u\) for some constants \(c_1,c_2\in \mathbb {R}\). Since even and odd functions are mutually orthogonal, we deduce that

$$\begin{aligned} 0 = \langle u^{q-1},\Pi \varphi \rangle _{L^2(\mathbb {R})} = c_1 \int _\mathbb {R}u^{q} \ \textrm{d}s \text {~and~} 0 = \langle u^{q-2}\partial _s u,\Pi \varphi \rangle _{L^2(\mathbb {R})} = c_2 \int _\mathbb {R}u^{q-2} (\partial _s u)^2 \ \textrm{d}s . \end{aligned}$$

Thus, \(c_1=c_2=0\) and \(\Pi \varphi =0\). This means that \(\Pi h^{(0)}\Pi \) has trivial kernel on the space \({{\,\textrm{ran}\,}}\Pi \), as claimed.

Step 2. Now we can proceed similarly as in the proof of Lemma 11. We can write for any \(g\in H^1(\mathbb {R})\cap {{\,\textrm{ran}\,}}\Pi \),

$$\begin{aligned} {\mathcal {E}}^{(0)}(g) = \Vert (\Pi h_n^{(0)}\Pi )^{1/2} g - (\Pi h_n^{(0)}\Pi )^{-1/2} \Pi f^{(0)} \Vert _{{{\,\textrm{ran}\,}}\Pi }^2 - \langle \Pi f^{(0)}, (\Pi h_n^{(0)}\Pi )^{-1} \Pi f^{(0)} \rangle _{{{\,\textrm{ran}\,}}\Pi } , \end{aligned}$$

where we consider the natural inner product and norm that \({{\,\textrm{ran}\,}}\Pi \) inherits from \(L^2(\mathbb {R})\). We conclude that \(E^{(0)} = \inf _{g\in H^1(\mathbb {R})\cap {{\,\textrm{ran}\,}}\Pi } {\mathcal {E}}^{(0)}(g) = - \langle \Pi f^{(0)}, (\Pi h_n^{(0)}\Pi )^{-1} \Pi f^{(0)} \rangle _{{{\,\textrm{ran}\,}}\Pi }\), and the infimum is attained at the unique g that satisfies \((\Pi h_n^{(0)}\Pi )^{1/2} g = (\Pi h_n^{(0)}\Pi )^{-1/2} \Pi f^{(0)}\). The latter is equivalent to having \(g\in H^2(\mathbb {R})\cap {{\,\textrm{ran}\,}}\Pi \) and \(\Pi h_n^{(0)}\Pi g = \Pi f^{(0)}\). This means that

$$\begin{aligned} h_n^{(0)}\Pi g = f^{(0)} + c_3 u^{q-1} + c_4 u^{q-2}\partial _s u \end{aligned}$$
(4.17)

for some constants \(c_3,c_4\in \mathbb {R}\).
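The completion-of-the-square step can be illustrated in finite dimensions. The following is a minimal sketch, assuming the functional has the form \(\langle g, Hg\rangle - 2\langle f, g\rangle \) with a positive definite H, as the identity above suggests; the 2×2 matrix `H`, the vector `f`, and all helper names are hypothetical stand-ins and not objects from the paper.

```python
# Toy model: minimize <g, H g> - 2 <f, g> over g in R^2.
# H and f are hypothetical stand-ins for the operator and the right-hand side.

def matvec(H, g):
    """Matrix-vector product for a small dense matrix."""
    return [sum(H[i][j] * g[j] for j in range(len(g))) for i in range(len(H))]

def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

def energy(H, f, g):
    """Quadratic functional <g, H g> - 2 <f, g>."""
    return dot(g, matvec(H, g)) - 2.0 * dot(f, g)

def solve_2x2(H, f):
    """Solve H g = f for a 2x2 matrix by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [
        (H[1][1] * f[0] - H[0][1] * f[1]) / det,
        (H[0][0] * f[1] - H[1][0] * f[0]) / det,
    ]

H = [[2.0, 1.0], [1.0, 3.0]]  # symmetric and positive definite
f = [1.0, 2.0]

g_star = solve_2x2(H, f)       # candidate minimizer: solves H g = f
e_min = energy(H, f, g_star)   # value of the functional at the minimizer
e_formula = -dot(f, g_star)    # the closed form -<f, H^{-1} f>

# Any perturbation of g_star should increase the functional.
e_perturbed = energy(H, f, [g_star[0] + 0.1, g_star[1] - 0.2])
```

In this toy setting the minimal value coincides with \(-\langle f, H^{-1}f\rangle \), which mirrors the way \(E^{(0)}\) is expressed through the inverse of \(\Pi h_n^{(0)}\Pi \) on \({{\,\textrm{ran}\,}}\Pi \).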

Exploiting (1.10), we compute directly

$$\begin{aligned} h_n^{(0)}u&=-(q-2+C|\mu _n|^{(q-2)\wedge 1})u^{q-1}\,,\\ h_n^{(0)}u^{q-1}&=-(1-C|\mu _n|^{(q-2)\wedge 1})q(q-2)\Lambda u^{q-1}\\&\quad +\frac{(q-1)}{q}(2(q-2)-(3q-4)C|\mu _n|^{(q-2)\wedge 1})u^{2q-3}\,, \end{aligned}$$

and therefore, if we set

$$\begin{aligned} g=K u^{q-1}+Lu \end{aligned}$$

with constants \(K,L\in \mathbb {R}\) to be determined, then

$$\begin{aligned} h_n^{(0)}g&= \left( -(1-C|\mu _n|^{(q-2)\wedge 1})q(q-2)\Lambda K - (q-2+C|\mu _n|^{(q-2)\wedge 1}) L \right) u^{q-1} \\&\quad + \frac{(q-1)}{q}(2(q-2)-(3q-4)C|\mu _n|^{(q-2)\wedge 1}) K u^{2q-3}\,. \end{aligned}$$

If we choose

$$\begin{aligned} K{:}{=}\frac{ M q}{(2(q-2)-(3q-4)C|\mu _n|^{(q-2)\wedge 1})(q-1)}, \end{aligned}$$

then \(\frac{(q-1)}{q}(2(q-2)-(3q-4)C|\mu _n|^{(q-2)\wedge 1}) K u^{2q-3} = f^{(0)}\). Further, if we choose

$$\begin{aligned} L{:}{=}&- \frac{2q}{3q-2} \beta ^{q-2} K\,, \end{aligned}$$

then, using the second identity in (4.6) and the functional equation of the gamma function,

$$\begin{aligned} \langle u^{q-1}, g \rangle _{L^2(\mathbb {R})} = 0 \qquad \text {and}\qquad \langle u^{q-2}\partial _s u, g \rangle _{L^2(\mathbb {R})} = 0 . \end{aligned}$$

Here the second identity follows immediately by parity. Thus, we have found a solution \(g\in H^2(\mathbb {R})\cap {{\,\textrm{ran}\,}}\Pi \) of (4.17) for certain explicit but irrelevant constants \(c_3\) and \(c_4\). (Indeed, \(c_4=0\).)

It follows that

$$\begin{aligned} E^{(0)}&= - \langle \Pi f^{(0)}, (\Pi h_n^{(0)}\Pi )^{-1} \Pi f^{(0)} \rangle _{{{\,\textrm{ran}\,}}\Pi } = - \langle f^{(0)}, g \rangle _{L^2(\mathbb {R})}\\&= - M \int _\mathbb {R}u^{2q-3} \left( K u^{q-1}+Lu\right) \textrm{d}s \\&= - M K \frac{ \beta ^{3q-4}}{\alpha } \left( \frac{\Gamma \left( \frac{3q-4}{q-2}\right) \,\sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) } - \frac{2q}{3q-2} \frac{\Gamma \left( \frac{2q-2}{q-2}\right) \,\sqrt{\pi }}{\Gamma \left( \frac{2q-2}{q-2}+\frac{1}{2} \right) } \right) \\&= -M K\frac{ \beta ^{3q-4}}{\alpha } \frac{\Gamma \left( \frac{3q-4}{q-2}\right) \,\sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) } \frac{(q-2)^2}{2(q-1)(3q-2)} \,. \end{aligned}$$

Here we used the second identity from (4.6) and the functional equation of the gamma function. If we insert the asymptotics \(K=\frac{ M q}{2(q-2)(q-1)}+{\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})\) and recall the definition of M, we arrive at the assertion of the lemma. \(\square \)
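For the reader's convenience, here is a sketch of the algebra behind the last equality in the computation of \(E^{(0)}\). The abbreviation \(x{:}{=}\frac{3q-4}{q-2}\) is introduced only for this check, and only the functional equation \(\Gamma (x+1)=x\Gamma (x)\) is used. Since \(\frac{2q-2}{q-2}=x-1\), \(x-\frac{1}{2}=\frac{5q-6}{2(q-2)}\), and \(x-1=\frac{2(q-1)}{q-2}\), we have

$$\begin{aligned} \frac{\Gamma \left( x-1\right) }{\Gamma \left( x-\frac{1}{2}\right) }&= \frac{x-\frac{1}{2}}{x-1}\, \frac{\Gamma (x)}{\Gamma \left( x+\frac{1}{2}\right) } = \frac{5q-6}{4(q-1)}\, \frac{\Gamma (x)}{\Gamma \left( x+\frac{1}{2}\right) } \,,\\ 1-\frac{2q}{3q-2}\cdot \frac{5q-6}{4(q-1)}&= \frac{2(3q-2)(q-1)-q(5q-6)}{2(q-1)(3q-2)} = \frac{(q-2)^2}{2(q-1)(3q-2)} \,. \end{aligned}$$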

4.3 Proof of Proposition 4

Without changing the notation, we restrict ourselves to a subsequence along which the liminf on the left side of (1.19) is realized. We first consider the simple case where (along that chosen subsequence) \(\liminf _{n\rightarrow \infty } {\mathcal {F}}(u_n)/{{\,\textrm{dist}\,}}(u_n,{\mathcal {M}})^2>0\). Then, since by assumption \({\mathcal {F}}(u_n)\rightarrow 0\), we have \({{\,\textrm{dist}\,}}(u_n,{\mathcal {M}})\rightarrow 0\). Thus, \(\Vert u_n\Vert ^2/{{\,\textrm{dist}\,}}(u_n,{\mathcal {M}})^2\rightarrow \infty \), and therefore the left side of (1.19) is equal to \(\infty \).

We now consider the second case where \(\liminf _{n\rightarrow \infty } {\mathcal {F}}(u_n)/{{\,\textrm{dist}\,}}(u_n,{\mathcal {M}})^2=0\). Then, after passing to a subsequence, (1.16) is satisfied, and we may use the expansion from Lemma 8, which involves the two terms (A) and (B). The term (B) was computed in (4.7). Combining this with the bounds on (A) from Lemmas 9, 10, and 11, we obtain

$$\begin{aligned} \mu _n^{-4} \lambda _n^{-2} {\mathcal {F}}(u_n)&\ge \frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{d^2}\frac{\Gamma \left( \frac{3q-4}{q-2}\right) \sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) } \frac{q(5q-6)(q-1)}{2(3q-2)} J(q,d) \\&\quad + {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}) \,, \end{aligned}$$

where, abbreviating \(\xi ={\mathfrak {b}}-{\mathfrak {a}}\) as in Lemma 11, \(J(q,d)\) is defined by

$$\begin{aligned} J(q,d){:}{=}\frac{(q-2)(3q-2)}{2(5q-6)} \left( \frac{3q-4}{4(q-1)}- \frac{(q-3)d}{q(d+2)}- \frac{d-1}{d+2}\sum _{k=0}^\infty \frac{P(k-\xi )- P(k)}{P(-1)}\right) .\nonumber \\ \end{aligned}$$
(4.18)

Meanwhile, as a consequence of the results from Propositions 2 and 3, we find that

If we combine the previous two expansions, we arrive at the lower bound in (1.19).

Let us show that this bound is best possible in the sense that there is a sequence \((u_n)_n\) with the same properties as before, for which (1.19) is an equality. Indeed, it suffices to take

for an arbitrary sequence \((\mu _n)_n\) tending to 0. Here \(g^{(l)}\) are the functions in \(H^1(\mathbb {R})\) for which the infima in the definition of \(E^{(l)}\) are attained; see the proofs of Lemmas 10 and 11. With this choice of \((u_n)_n\), the bounds in Lemmas 9, 10, and 11 become saturated, and therefore for this sequence we have equality in (1.19), as claimed.

To complete the proof of the proposition, it remains to prove that \(J(q,d)>0\). This is equivalent to proving that \({\tilde{J}}(q,d)>0\), where

$$\begin{aligned} {\tilde{J}}(q,d) {:}{=}\frac{(3q-4)(d+2)}{4(q-1)(d-1)}- \frac{(q-3)d}{q(d-1)}- \frac{1}{P(-1)}\sum _{k=0}^\infty (P(k-\xi )- P(k))\,. \end{aligned}$$
(4.19)

We distinguish two cases.

Case 1. To bound \({\tilde{J}}(q,d)\), let us first assume that \(d>2, 2<q<2^*\) or \(d=2\), \(2.8<q <2^*\). Then Lemma 13 below is applicable, and we infer that P is strictly convex on the interval \([-1,\infty )\). We deduce that

$$\begin{aligned} \nonumber \sum _{k=0}^\infty \frac{\left( P(k-\xi )- P(k)\right) }{-\xi }&>\sum _{k=0}^\infty \frac{\left( P(k-1)- P(k)\right) }{-1} = -P(-1)\\ {}&>\frac{P(-1)}{-\xi } \left( \frac{(3q-4)(d+2)}{4(q-1)(d-1)}- \frac{(q-3)d}{q(d-1)}\right) , \end{aligned}$$
(4.20)

where we used

$$\begin{aligned} \xi -\left( \frac{(3q-4)(d+2)}{4(q-1)(d-1)}- \frac{(q-3)d}{q(d-1)}\right)&<1-\left( \frac{(3q-4)(d+2)}{4(q-1)(d-1)}- \frac{(q-3)d}{q(d-1)}\right) \nonumber \\ {}&=\frac{5(d-2)q^2-4(4d-3)q+12d}{4q(q-1)(d-1)}<0\,. \end{aligned}$$
(4.21)

Note that

$$\begin{aligned} \xi \in \left( \frac{d-2}{d-1},1\right) , \end{aligned}$$
(4.22)

which is a consequence of the strict monotonicity of \(\xi \) with respect to q and the values of \(\xi \) at the boundary points \(q=2\) and \(q=2^*\). The last step in (4.21) follows by computing the roots of the numerator, which is quadratic in q for \(d>2\). The roots are \(\frac{6}{5}\) and \(\frac{2d}{d-2}\), and thus for all admissible q the quadratic term is negative. In case \(d=2\), the numerator reduces to \(-20q+24\), which is negative for \(q>2\). Multiplying (4.20) by \(-\xi P(-1)^{-1}<0\), we conclude that \({\tilde{J}}(q,d)>0\) for the assumed parameter range of \((q,d)\).
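The sign of the numerator in (4.21) can also be read off from a factorization, which we record as a short check:

$$\begin{aligned} 5(d-2)q^2-4(4d-3)q+12d = (5q-6)\left( (d-2)q-2d\right) . \end{aligned}$$

For \(d>2\) and \(2<q<2^*=\frac{2d}{d-2}\), the first factor is positive and the second is negative. For \(d=2\), the second factor equals \(-4\), which recovers \(-20q+24\).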

Case 2. We now consider the remaining case \(d=2\), \(2<q\le 2.8\) and give a lower bound on \(E^{(2)}\) that is not optimal but more explicit than the optimal bound from Lemma 11. In fact, we give this bound in general dimension d and only specialize to \(d=2\) at the end. The argument is based on a rough version of the ‘completion of the square’-argument that we have used above. We recall from the proof of Lemma 9 or Lemma 11 that the operator \(h^{(2)}_n\) is bounded from below by \(d+1+ {\mathcal {O}}(|\mu _n|^{(q-2)\wedge 1})\). Thus, for all sufficiently large n the operator is invertible, and the operator norm of \((h_n^{(2)})^{-1}\) on \(L^2(\mathbb {R})\) is bounded by \((d+1)^{-1}+{\mathcal {O}}(|\mu _n|^{(q-2)\wedge 1})\). Moreover, we have seen that \(E^{(2)}= - \langle f^{(2)}, (h_n^{(2)})^{-1} f^{(2)} \rangle _{L^2(\mathbb {R})}\). Hence, we can bound, using the second identity in (4.6),

$$\begin{aligned} E^{(2)} \ge&- \Vert (h_n^{(2)})^{-1}\Vert _{op}\Vert f^{(2)}\Vert ^2_{L^2(\mathbb {R})} \\\ge&- \frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{4d^2}\frac{\Gamma \left( \frac{3q-4}{q-2}\right) \sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) }q(q-1)(q-2)\\&\times \frac{(d-1)}{(d+2)}\left( \frac{8(q-1)(3q-4)(d-1)}{(q+2)(7q-10)(d+1)}\right) \\ {}&+{\mathcal {O}}(|\mu _n|^{(q-2)\wedge 1}). \end{aligned}$$

Now assume \(d=2\), \(2<q\le 2.8\). Then this lower bound directly implies \({\tilde{J}}(q,2)>0\) in (4.19) as

$$\begin{aligned} \frac{3q-4}{q-1}-2\frac{q-3}{q}-\frac{8}{3}\frac{(q-1)}{(q+2)}\frac{(3q-4)}{(7q-10)}\ge 2+\frac{1}{7}-\frac{8}{3}\cdot \frac{3}{8}\cdot \frac{1}{2}>0, \end{aligned}$$
(4.23)

where we estimated every fraction using its monotonicity in q by its value at \(q=2\) or \(q=2.8\). (We note in passing that the above argument works as long as the left side of (4.23) is positive, that is, q is smaller than approximately 57.325.) This completes our discussion of Case 2 and therefore the proof of the proposition. \(\square \)
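The elementary inequality (4.23) and the remark on its range of validity can be sanity-checked numerically. The sketch below is only a floating-point check and not a proof; the function names, the sampling grid, and the bisection bracket \([3,100]\) are choices made here for illustration.

```python
def lhs_423(q):
    """Left side of (4.23) as a function of q (case d = 2)."""
    return (
        (3 * q - 4) / (q - 1)
        - 2 * (q - 3) / q
        - (8 / 3) * (q - 1) / (q + 2) * (3 * q - 4) / (7 * q - 10)
    )

# Lower bound stated on the right side of (4.23).
lower_bound = 2 + 1 / 7 - (8 / 3) * (3 / 8) * (1 / 2)

# Sample the range 2 < q <= 2.8 on a uniform grid (illustrative resolution).
grid = [2 + 0.8 * i / 1000 for i in range(1, 1001)]
min_on_grid = min(lhs_423(q) for q in grid)

def bisect_root(func, lo, hi, iterations=80):
    """Plain bisection; assumes func(lo) > 0 > func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Location where the left side of (4.23) changes sign.
root = bisect_root(lhs_423, 3.0, 100.0)
```

On the sampled grid the left side stays above the stated lower bound, and the sign change found by bisection lies between 57 and 58, in line with the parenthetical remark above.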

5 Solving a certain inhomogeneous second order equation

To complete the proof of our main theorem, we still need to prove Lemma 11, as well as show the strict convexity of P that we used in the proof of Proposition 4. This will be accomplished in the present section.

5.1 Proof of Lemma 11

In the previous section, we have reduced the proof of Lemma 11 to solving the equation

$$\begin{aligned} h_n^{(2)} g = f^{(2)} \end{aligned}$$

for \(g\in H^2(\mathbb {R})\) and computing

$$\begin{aligned} E^{(2)} = - \langle f^{(2)}, (h_n^{(2)})^{-1} f^{(2)} \rangle _{L^2(\mathbb {R})} = - \langle f^{(2)}, g \rangle _{L^2(\mathbb {R})} . \end{aligned}$$

The operator \(h_n^{(2)}\) is defined in (4.16). The solution g depends on n (as well as q and d). We stress that we already know the existence and uniqueness of this g thanks to the invertibility of \(h_n^{(2)}\). Since \(f^{(2)}\) is even, so is g. This motivates us to study the equation on the positive halfline.

Lemma 12

(The degree 2 solution) There are \(\eta \in \mathbb {R}\) and \((A_k)_k,(B_k)_k\subset \mathbb {R}\), depending on \(\mu _n\), q, and d, such that the affine space of \(H^2((0,\infty ))\)-solutions g of

$$\begin{aligned} (1-C|\mu _n|^{(q-2)\wedge 1})(-\partial _s^2g +(2d+\Lambda )g)-(q-1)u^{q-2}g = f^{(2)} \qquad \text {in}\ (0,\infty ) \end{aligned}$$
(5.1)

is parametrized by

$$\begin{aligned} g =\tau u^{\sqrt{1+\frac{2d}{\Lambda }}} \sum _{k=0}^\infty A_k \cosh ^{-2k}(\alpha \,\ \cdot \ )-\eta f^{(2)}\sum _{k=0}^\infty B_k \cosh ^{-2k}(\alpha \, \ \cdot \ ) \end{aligned}$$
(5.2)

with an arbitrary parameter \(\tau \in \mathbb {R}\). Moreover, we have

$$\begin{aligned} \lim _{s\rightarrow 0^+} \partial _s g(s)&= -2\alpha \tau \beta ^{\sqrt{1+\frac{2d}{\Lambda }}} \sqrt{\pi }\frac{\Gamma (2{\mathfrak {a}}+1)}{\Gamma ({\mathfrak {a}}+{\mathfrak {b}}-1)} \frac{\Gamma (1)}{\Gamma \left( \frac{3}{2}-{\mathfrak {b}} + {\mathfrak {a}}\right) } P_n \\&\quad + 2\alpha \eta M \beta ^{2q-3} \sqrt{\frac{2(d-1)}{d+2}} \sqrt{\pi }\frac{\Gamma ({\mathfrak {a}}+{\mathfrak {b}}+1)}{\Gamma (2{\mathfrak {b}}-1)} \frac{\Gamma (1+{\mathfrak {b}} - {\mathfrak {a}})}{\Gamma \left( \frac{3}{2}\right) } Q_n \,, \end{aligned}$$

where \({\mathfrak {a}}\) and \({\mathfrak {b}}\) are defined in (4.15), and where \(P_n\) and \(Q_n\) are certain numbers satisfying

$$\begin{aligned} P_n = 1+ {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}) , \qquad Q_n = 1+ {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}) . \end{aligned}$$

The fact that the solution space is one-dimensional is not surprising; since the equation is of second order, the \(H^2\)-requirement imposes a ‘boundary condition’ at infinity, while there is no boundary condition at the origin. For the proof of Lemma 11, we will impose the condition \(\lim _{s\rightarrow 0^+} \partial _sg(s)=0\) so that g extends by even reflection to an \(H^2\)-function on \(\mathbb {R}\). This determines the parameter \(\tau \) uniquely.

Proof

Step 1. In this step, we explain the overall idea of the proof and defer the rigorous justification of the manipulations to the next step.

The ansatz (5.2) for g is a difference of two power series – a homogeneous and an inhomogeneous formal solution of (5.1). The coefficients \(\eta \), \(A_k\), and \(B_k\) can be found by inserting the homogeneous and the inhomogeneous ansatz into the respective equation and comparing coefficients, as we explain now. Here (and only here) we use the symbol \(h_n^{(2)}\) not in the operator-theoretic sense but rather to denote the natural differential expression associated to it. For \({\mathfrak {c}}\in \{{\mathfrak {a}}, {\mathfrak {b}}\}\) we find that

$$\begin{aligned}&h_n^{(2)}( \cosh ^{-2(k+{\mathfrak {c}})}(\alpha s ))\\ {}&\quad =((1-C|\mu _n|^{(q-2)\wedge 1})(-4(k+{\mathfrak {c}})^2\alpha ^2\tanh ^2+2(k+{\mathfrak {c}})\alpha ^2 \cosh ^{-2}+\Lambda +2d)\\ {}&\qquad -(q-1)\beta ^{q-2}\cosh ^{-2})(\alpha s)\cosh ^{-2(k+{\mathfrak {c}})}(\alpha s) \\ {}&\quad = -(1-C|\mu _n|^{(q-2)\wedge 1})4\alpha ^2 (k(k+2{\mathfrak {c}})+{\mathfrak {c}}^2-{\mathfrak {a}}^2)\cosh ^{-2(k+{\mathfrak {c}})}(\alpha s)\\ {}&\qquad +\alpha ^2\left( (1-C|\mu _n|^{(q-2)\wedge 1})(k+{\mathfrak {c}})(4(k+{\mathfrak {c}})+2) -\frac{2q(q-1)}{(q-2)^2}\right) \cosh ^{-2(k+1+{\mathfrak {c}})}(\alpha s)\,. \end{aligned}$$

Thus, with \(C_k\in \{A_k, B_k\}\), a formal, termwise application of \(h_n^{(2)}\) yields

$$\begin{aligned} h_n^{(2)}&\left( \sum ^\infty _{k=0} C_k \cosh ^{-2(k+{\mathfrak {c}})}(\alpha s )\right) = \sum _{k=0}^\infty C_k h_n^{(2)}( \cosh ^{-2(k+{\mathfrak {c}})}(\alpha s ))\nonumber \\=&\alpha ^2(1-C|\mu _n|^{(q-2)\wedge 1}) \Bigg ( \sum _{k=0}^\infty \cosh ^{-2(k+{\mathfrak {c}})}(\alpha s )\Bigg ( -4C_k(k(k+2{\mathfrak {c}})+{\mathfrak {c}}^2-{\mathfrak {a}}^2)\nonumber \\ {}&+C_{k-1}\left( (k-1+{\mathfrak {c}})(4(k-1+{\mathfrak {c}})+2) -\frac{2q(q-1)}{(1-C|\mu _n|^{(q-2)\wedge 1})(q-2)^2}\right) \Bigg )\Bigg ), \end{aligned}$$
(5.3)

where we performed an index shift in the last step. Here we use the convention \(C_{-1}{:}{=}0\). If \({\mathfrak {c}}={\mathfrak {a}}\), we set expression (5.3) equal to 0 to determine \(C_k=A_k\) for the homogeneous formal solution by comparing the coefficients of \(\cosh ^{-2(k+{\mathfrak {a}})}\) for each \(k\in \mathbb {N}_0\). Similarly, for the inhomogeneous part, if \({\mathfrak {c}}={\mathfrak {b}}\), we set (5.3) equal to \(-\eta ^{-1}\cosh ^{-2 {\mathfrak {b}}}\) and compare the coefficients of \(\cosh ^{-2(k+{\mathfrak {b}})}\) for each \(k\in \mathbb {N}_0\) to find \(C_k=B_k\). In summary, this leads to the recursion relations

$$\begin{aligned} A_k&{:}{=}A_{k-1}G(k)=A_0\prod _{j=1}^k G(j)\,, \nonumber \\ B_k&{:}{=}B_{k-1}G(k+{\mathfrak {b}}-{\mathfrak {a}}) = B_0\prod _{j=1}^k G(j+{\mathfrak {b}}-{\mathfrak {a}}) \end{aligned}$$
(5.4)

with

$$\begin{aligned} G(k) {:}{=}\frac{(1-C|\mu _n|^{(q-2)\wedge 1})( k+{\mathfrak {a}}-1)(2(k+{\mathfrak {a}})-1)(q-2)^2-(q-1)q}{2(1-C|\mu _n|^{(q-2)\wedge 1})k(k+2{\mathfrak {a}})(q-2)^2}. \end{aligned}$$

Note that the coefficient of \(\cosh ^{-2k}\) vanishes in case \({\mathfrak {c}}={\mathfrak {a}}\), which is the reason for \(A_0\) being freely choosable. Put differently, fixing \(A_0{:}{=}1\), we obtain an additional degree of freedom \(\tau \). In contrast, \(B_0\) is determined by the inhomogeneity, or equivalently, if we set \(B_0{:}{=}1\), \(\eta \) is fixed and given by

$$\begin{aligned} \eta {:}{=}\frac{1}{4(1-C|\mu _n|^{(q-2)\wedge 1})\alpha ^2 \left( {\mathfrak {b}}^2-{\mathfrak {a}}^2\right) } . \end{aligned}$$
(5.5)

However, in the computation (5.3) the interchange of derivative and infinite sum has to be justified. First, note that the infinite sums given in (5.2) converge absolutely, and thus we can rearrange the terms of both sums. This is a consequence of the decay of the coefficients \(A_k\) and \(B_k\), which we will show in the next step. As the terms in the infinite sums still decay exponentially after differentiation if \(s\in (0,\infty )\), the termwise differentiated series converges uniformly in s on every open interval \(I\subset (0,\infty )\) that does not contain 0 at its boundary, and hence derivatives of the proposed g are well-defined. In particular, the formal solution is a classical, smooth solution on \((0,\infty )\). The convergence implies that g extends continuously to the origin. It follows that

$$\begin{aligned} (1-C|\mu _n|^{(q-2)\wedge 1})(2d+\Lambda )g-(q-1)u^{q-2}g - f^{(2)} \in L^2((0,\infty )) , \end{aligned}$$

and therefore \(g\in H^2((0,\infty ))\), as claimed. (Note that \(g\in H^2((0,\infty ))\) implies that \(g'\) extends continuously to the origin, which is not clear from the series representation.)

Step 2. We now justify the procedure outlined in Step 1, that is, we determine the asymptotics for \(A_k\). The asymptotics for \(B_k\) is analogous. We recall that, for \(A_0{:}{=}B_0{:}{=}1\) and \(k\in \mathbb {N}\), the coefficients \(A_k\) and \(B_k\) were defined recursively in (5.4) and \(\eta \) in (5.5).

Our analysis will be more precise than what is needed in Step 1 but will be useful in the proof of the next lemma. We factor

$$\begin{aligned} G(k) = G_0(k) \rho _n(k) \qquad \text {with}\qquad G_0(k) {:}{=}\frac{ \left( k+{\mathfrak {a}}+{\mathfrak {b}}-2\right) \left( k+{\mathfrak {a}}-{\mathfrak {b}}+\frac{1}{2}\right) }{k ( k+2{\mathfrak {a}}) } \end{aligned}$$

and note that, with \({\mathcal {O}}_{n\rightarrow \infty }\) describing a k-independent error,

$$\begin{aligned} \rho _n(k) = 1+ \frac{{\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})}{k^2} . \end{aligned}$$

Since there is a \(C'>0\) such that

$$\begin{aligned}\left| \log \left( 1+\frac{{\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})}{j^2}\right) \right| \le \frac{C' }{j^2}|\mu _n|^{(q-2)\wedge 1}\end{aligned}$$

for n large enough and all \(j\in \mathbb {N}\), the series \(\sum _{j\in \mathbb {N}} \log (\rho _n(j))\) converges absolutely, and hence \(\prod _{j\in \mathbb {N}} \rho _n(j)\) converges by continuity of the exponential function. It follows that for each n, \(P_n:= \lim _{k\rightarrow \infty } \prod _{j=1}^k \rho _n(j)\) exists, is non-zero, and satisfies \(P_n =1 + {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})\). Moreover, for any k,

$$\begin{aligned} \prod _{j=1}^k \rho _n(j) = P_n\left( 1 + \frac{{\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})}{k} \right) . \end{aligned}$$

Concerning the main term, we note that

$$\begin{aligned} \prod _{j=1}^k G_0(j)=\frac{({\mathfrak {a}}+{\mathfrak {b}}-1)_k}{(2{\mathfrak {a}}+1)_k} \frac{\left( {\mathfrak {a}}-{\mathfrak {b}}+\frac{3}{2}\right) _k}{\left( 1\right) _k} , \end{aligned}$$
(5.6)

with \((\ \cdot \ )_k=\Gamma (\ \cdot +k) (\Gamma (\ \cdot \ ))^{-1}\) being the Pochhammer symbol for \(k\in \mathbb {N}\). The crucial ingredient will be the following asymptotics for ratios of two Gamma functions

$$\begin{aligned} k^{d_1-d_2}\frac{\Gamma (k+d_2)}{\Gamma (k+d_1)}= 1+\frac{(d_2-d_1)(d_1+d_2-1)}{2k}+{\mathcal {O}}_{k\rightarrow \infty }(k^{-2}), \qquad d_1,d_2>0, \end{aligned}$$
(5.7)

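which can be found by Stirling’s approximation. Applying (5.7) to (5.6), we obtain

$$\begin{aligned} \prod _{j=1}^k G_0(j) = \frac{\Gamma (2{\mathfrak {a}}+1)}{\Gamma ({\mathfrak {a}}+{\mathfrak {b}}-1)} \frac{\Gamma (1)}{\Gamma \left( \frac{3}{2}-{\mathfrak {b}}+{\mathfrak {a}}\right) }\, k^{-\frac{3}{2}} \left( 1+{\mathcal {O}}_{k\rightarrow \infty }(k^{-1})\right) . \end{aligned}$$

Combining this with the behavior of \(\rho _n(k)\) yields

$$\begin{aligned} A_k = P_n\, \frac{\Gamma (2{\mathfrak {a}}+1)}{\Gamma ({\mathfrak {a}}+{\mathfrak {b}}-1)} \frac{\Gamma (1)}{\Gamma \left( \frac{3}{2}-{\mathfrak {b}}+{\mathfrak {a}}\right) }\, k^{-\frac{3}{2}} \left( 1+{\mathcal {O}}_{k\rightarrow \infty }(k^{-1})\right) \end{aligned}$$
(5.8)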

with an error that is uniform in n. The asymptotics for \(B_k\) is similar but with a shift by \({\mathfrak {b}}-{\mathfrak {a}}\) in every Gamma function and with a possibly different constant \(Q_n\) that also satisfies \(Q_n = 1 + {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})\). Using this behavior of the coefficients \(A_k\) and \(B_k\), the manipulations in Step 1 can be justified, and thus we proved the assertion of the lemma concerning the solution.
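The Pochhammer identity (5.6) and the Gamma-ratio asymptotics (5.7) lend themselves to a quick numerical check. The sketch below is an illustration under assumptions: the values `a = 1.5` and `b = 2.2` are hypothetical stand-ins for \({\mathfrak {a}}\) and \({\mathfrak {b}}\) (whose actual definitions are in (4.15)), as are `d1` and `d2`, and all function names are ours.

```python
import math

def gamma_ratio_scaled(k, d1, d2):
    """Left side of (5.7): k^(d1-d2) * Gamma(k+d2) / Gamma(k+d1)."""
    return math.exp(
        (d1 - d2) * math.log(k) + math.lgamma(k + d2) - math.lgamma(k + d1)
    )

def first_order(k, d1, d2):
    """Right side of (5.7) without the O(k^-2) remainder."""
    return 1.0 + (d2 - d1) * (d1 + d2 - 1.0) / (2.0 * k)

def log_pochhammer(x, k):
    """Logarithm of the Pochhammer symbol (x)_k = Gamma(x+k) / Gamma(x)."""
    return math.lgamma(x + k) - math.lgamma(x)

def product_G0(k, a, b):
    """Direct product of G_0(j) for j = 1, ..., k."""
    prod = 1.0
    for j in range(1, k + 1):
        prod *= (j + a + b - 2.0) * (j + a - b + 0.5) / (j * (j + 2.0 * a))
    return prod

def closed_form(k, a, b):
    """Right side of (5.6), evaluated in log space to avoid overflow."""
    return math.exp(
        log_pochhammer(a + b - 1.0, k)
        + log_pochhammer(a - b + 1.5, k)
        - log_pochhammer(2.0 * a + 1.0, k)
        - log_pochhammer(1.0, k)
    )

a, b = 1.5, 2.2      # hypothetical sample values, not taken from (4.15)
d1, d2 = 0.7, 2.3    # hypothetical positive parameters for (5.7)

# Constant that the rescaled product k^(3/2) * prod G_0(j) should approach.
limit_constant = math.gamma(2.0 * a + 1.0) / (
    math.gamma(a + b - 1.0) * math.gamma(a - b + 1.5)
)
rescaled = 20000 ** 1.5 * product_G0(20000, a, b)
```

For these sample values the direct product and the Pochhammer expression agree to rounding error, and the rescaled product is close to `limit_constant`, consistent with a decay of order \(k^{-3/2}\) for the coefficients.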

Step 3. We finally compute \(\lim _{s\rightarrow 0^+} \partial _s g(s)\). First, we note that for \(s>0\) we can differentiate termwise the series defining g and obtain

$$\begin{aligned} \partial _s g&= -2 \alpha \tau u^{\sqrt{1+\frac{2d}{\Lambda }}} \sum _{k=0}^\infty k A_k \cosh ^{-2k}(\alpha \ \cdot \ ) \tanh (\alpha \ \cdot \ )\\&\quad +\! 2\alpha \eta f^{(2)} \sum _{k=0}^\infty k B_k \cosh ^{-2k}(\alpha \ \cdot \ ) \tanh (\alpha \ \cdot \ ) + h \,, \end{aligned}$$

where h involves a series that converges absolutely on all of \([0,\infty )\) and satisfies \(\lim _{s\rightarrow 0^+} h(s)=0\). Since u and \(f^{(2)}\) are continuous at the origin, the claimed formula will follow from the fact that

(5.9)

provided the \(D_k\) are non-negative, and the limit on the right side exists. (Indeed, we apply this with \(D_k\in \{kA_k,kB_k\}\), recalling the asymptotics of \(A_k\) and \(B_k\) from (5.8).)

To prove (5.9), we set \(\cosh ^{-2}(\sigma ) {=}{:}e^{-\zeta }\) and notice that the assertion is equivalent to

The latter is a consequence of a well-known Abelian theorem corresponding to the measure \(\mu =\sum _{k=0}^\infty D_k \delta _k\) on \([0,\infty )\), noting that

A simple proof of the Abelian theorem (which is an application of the dominated convergence theorem) can be found, for instance, in [31, Theorem 10.2].\(\square \)
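To indicate why an Abelian theorem is the natural tool here, we add a heuristic sketch. It is not the argument of the proof, and it rests on the assumption, made for illustration, that \(k^{1/2}D_k\rightarrow c\) for some constant \(c\ge 0\); this is the decay suggested by applying (5.7) to (5.6). Comparing the sum with an integral, one expects, as \(\zeta \rightarrow 0^+\),

$$\begin{aligned} \sum _{k=0}^\infty D_k e^{-k\zeta } \approx c \int _0^\infty x^{-\frac{1}{2}} e^{-x\zeta } \ \textrm{d}x = c\, \Gamma \left( \tfrac{1}{2}\right) \zeta ^{-\frac{1}{2}} = c \sqrt{\pi }\, \zeta ^{-\frac{1}{2}} . \end{aligned}$$

Since \(\zeta = 2\log \cosh (\sigma ) = \sigma ^2+{\mathcal {O}}(\sigma ^4)\) and \(\tanh (\sigma )=\sigma +{\mathcal {O}}(\sigma ^3)\) as \(\sigma \rightarrow 0^+\), the product of \(\tanh (\sigma )\) with the sum is expected to tend to \(c\sqrt{\pi }\). This is consistent with the factor \(\sqrt{\pi }\) in the formula for \(\lim _{s\rightarrow 0^+}\partial _s g(s)\) in Lemma 12.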

Everything is now in place to compute \(E^{(2)}\) and thus to complete the proof of Lemma 11.

Proof of Lemma 11

Given the notation from Lemma 12, we fix the value \(\tau =\tau _n\) in such a way that \(\lim _{s\rightarrow 0^+} \partial _s g(s)=0\), for then the even extension of g is the unique \(H^2(\mathbb {R})\)-solution of \(h_n^{(2)}g=f^{(2)}\). This value is given by

$$\begin{aligned} \tau _n = \sqrt{\frac{2(d-1)}{d+2}}M \eta \beta ^{\xi (q-2)} \frac{2\Gamma ({\mathfrak {a}} + {\mathfrak {b}}-1)\Gamma \left( {\mathfrak {a}}-{\mathfrak {b}}+\frac{3}{2}\right) \Gamma ({\mathfrak {a}} + {\mathfrak {b}}+1)\Gamma ({\mathfrak {b}}- {\mathfrak {a}} +1)}{\sqrt{\pi }\Gamma (2{\mathfrak {b}}-1)\Gamma (2{\mathfrak {a}}+1)} \frac{Q_n}{P_n} . \end{aligned}$$

Since \(Q_n/P_n = 1+ {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})\) and \(\eta = (4\alpha ^2({\mathfrak {b}}^2-{\mathfrak {a}}^2))^{-1}(1+ {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}))\), we obtain the asymptotic behavior

$$\begin{aligned} \tau _n&= \sqrt{\frac{2(d-1)}{d+2}} \frac{M}{4\alpha ^2({\mathfrak {b}}^2 - {\mathfrak {a}}^2)} \beta ^{\xi (q-2)}\\&\quad \times \frac{2\Gamma ({\mathfrak {a}} + {\mathfrak {b}}-1)\Gamma \left( {\mathfrak {a}}-{\mathfrak {b}}+\frac{3}{2}\right) \Gamma ({\mathfrak {a}} + {\mathfrak {b}}+1)\Gamma ({\mathfrak {b}}- {\mathfrak {a}} +1)}{\sqrt{\pi }\Gamma (2{\mathfrak {b}}-1)\Gamma (2{\mathfrak {a}}+1)} \nonumber \\&\quad \times \left( 1+ {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1}) \right) . \end{aligned}$$

We write

$$\begin{aligned} \tau _n {=}{:}\sqrt{\frac{2(d-1)}{d+2}}M \eta \beta ^{\xi (q-2)} {\tilde{\tau }}_n. \end{aligned}$$

Inserting (5.2) from Lemma 12 and \(\tau _n\) into the expression for \(E^{(2)}\) gives

$$\begin{aligned} E^{(2)}&=-\int _{\mathbb {R}} g f^{(2)}\ \textrm{d}s \\ {}&=-\frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{d^2}q(q-1)^2\frac{(d-1)}{(d+2)}\alpha ^2\eta \\&\quad \times \sum _{k=0}^\infty \left( {\tilde{\tau }}_n A_k \frac{\Gamma \left( k+{\mathfrak {a}}+{\mathfrak {b}}\right) \sqrt{\pi }}{\Gamma \left( k+{\mathfrak {a}}+{\mathfrak {b}}+\frac{1}{2}\right) }- B_k \frac{\Gamma \left( k+2{\mathfrak {b}}\right) \sqrt{\pi }}{\Gamma \left( k+2{\mathfrak {b}}+\frac{1}{2}\right) }\right) \\ {}&=-\frac{\beta ^{3q-4}}{\alpha }\frac{|{\mathbb {S}}^{d-1}|}{4d^2}\frac{\Gamma \left( \frac{3q-4}{q-2}\right) \sqrt{\pi }}{\Gamma \left( \frac{3q-4}{q-2}+\frac{1}{2}\right) }q(q-1)(q-2)\frac{(d-1)}{(d+2)} \frac{1}{P(-1)}\sum _{k=0}^\infty \left( P(k-\xi )- P(k)\right) \\ {}&\quad + {\mathcal {O}}_{n\rightarrow \infty }(|\mu _n|^{(q-2)\wedge 1})\,. \end{aligned}$$

In the penultimate step, we interchanged integral and infinite sum, as the summands in the respective infinite sum have the same sign, and applied the second identity from (4.6). Due to the asymptotics (5.7) and (5.8) and their counterparts in the inhomogeneous setting, the infinite sums are absolutely summable, and thus we are allowed to rearrange the sums. \(\square \)

5.2 Convexity of P

We recall that the function \(P:[-1,\infty )\rightarrow \mathbb {R}\) was defined in (4.14). The following property is used in the proof of Proposition 4.

Lemma 13

(Strict convexity of P) Let \(d>2, 2<q<2^*\) or \(d=2\), \(2.8< q <2^*\). The function P is strictly convex on the interval \([-1,\infty )\).

The lemma will not cover \(d=2\), \(2<q\le 2.8\). Indeed, numerical computations suggest that for \(d=2\) and q close to 2, P fails to be convex.

Proof

We will show strict convexity by proving that the second derivative is positive. The idea is to split P into three factors, analyze their sign and that of their first and second derivatives, and conclude that the same pattern of signs can be observed for P. Be aware that P is a well-defined, smooth function on an open, \((q,d)\)-dependent set containing \([-1,\infty )\), as Gamma functions are positive, smooth functions on the positive real axis. In particular, we can differentiate at \(-1\). Before investigating the single factors, let us show how P inherits the property that differentiation alters the sign from its constituents.

Step 1. Let \(U\subset \mathbb {R}\) be an open subset and \(T_j:U\rightarrow \mathbb {R}\), \(j\in \{1,\dots ,J\}\), I-times differentiable functions for \(I\in \mathbb {N}_0\) and \(J\in \mathbb {N}\) with \((-1)^iT_j^{(i)}>0\) for all \(i\in \{0,\dots ,I\}\) and \(j\in \{1,\dots , J\}\). Here \((\ \cdot \ )^{(i)}\) denotes the i-th derivative. Applying the general Leibniz rule, we observe that

$$\begin{aligned}(-1)^I\left( \prod _{j=1}^J T_j\right) ^{(I)}=\sum _{m_1+m_2+\dots +m_J=I}\left( {\begin{array}{c}I\\ m_1,m_2,\dots ,m_J\end{array}}\right) \prod _{j=1}^J (-1)^{m_j}T_j^{(m_j)}>0\end{aligned}$$

since every factor \((-1)^{m_j}T_j^{(m_j)}>0\).

Step 2. We are now going to apply Step 1 with \(I=2\) and \(J=3\) to the smooth function P. To this end, we write

$$\begin{aligned}P(x)=\prod _{j=1}^{3}T_j(x)\qquad \text { with }\qquad T_j:\left( -\frac{3}{2},\infty \right) \rightarrow \mathbb {R},\, T_j(x){:}{=}\frac{\Gamma (x+\varepsilon _1^j)}{\Gamma (x+\varepsilon _2^j)},\end{aligned}$$

\(j\in \{1,2,3\}\), where we denote

$$\begin{aligned}\begin{array}{lll} \varepsilon _1^1{:}{=}\frac{3}{2}, &{} \varepsilon _1^2{:}{=}2{\mathfrak {b}}-1, &{} \varepsilon _1^3{:}{=}2{\mathfrak {b}},\\ \varepsilon _2^1{:}{=}{\mathfrak {b}}-{\mathfrak {a}} +1, &{} \varepsilon _2^2{:}{=}{\mathfrak {b}}+{\mathfrak {a}}+1, &{} \varepsilon _2^3{:}{=}2{\mathfrak {b}} +\frac{1}{2}. \end{array}\end{aligned}$$

Note that \(\frac{3}{2}\le \varepsilon _1^j< \varepsilon _2^j\) for all \(j\in \{1,2,3\}\) as \(2^{-1}<\xi <1\) for the given parameter range of q and d; see (4.22). In particular, \(T_j\) is a well-defined, positive function on \(\left( -\frac{3}{2},\infty \right) \) since the Gamma functions are only evaluated at positive entries.

Next, we will use monotonicity of the Digamma function \(\Psi (y){:}{=}\partial _y(\log (\Gamma (y)))\), \(y>0\), and its derivative, the Trigamma function \(\Psi _1\). The monotonicity behavior can be deduced in an elementary way from an integral formula for the Digamma function [36, 12.3], for instance. As \(\Psi \) is strictly monotonically increasing on the positive real axis, the inequality for the first derivative follows from

$$\begin{aligned}\partial _xT_j(x)=T_j(x)\left( \Psi (x+\varepsilon ^j_1)-\Psi (x+\varepsilon ^j_2)\right) <0\end{aligned}$$

for \(x>-\frac{3}{2}\) and \(j\in \{1,2,3\}\). Similarly, we compute

$$\begin{aligned}\partial ^2_x T_j(x)=T_j(x)\left( \left( \Psi (x+\varepsilon ^j_1)-\Psi (x+\varepsilon ^j_2)\right) ^2+\left( \Psi _1(x+\varepsilon ^j_1)-\Psi _1(x+\varepsilon ^j_2)\right) \right) >0\end{aligned}$$

for \(x>-\frac{3}{2}\) and \(j\in \{1,2,3\}\) using that \(\Psi _1\) is strictly monotonically decreasing on the positive real axis. Therefore, we can apply Step 1 to obtain \(\partial _x^2 P>0\).\(\square \)
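The sign pattern used in this proof can be observed numerically with finite differences. The sketch below is a plausibility check only: the shift pairs in `shifts` are hypothetical values satisfying \(\frac{3}{2}\le \varepsilon _1<\varepsilon _2\) and are not the actual \(\varepsilon _i^j\) from the proof, and the step sizes and the grid are illustrative choices.

```python
import math

def gamma_ratio(x, e1, e2):
    """T(x) = Gamma(x + e1) / Gamma(x + e2), for positive arguments."""
    return math.exp(math.lgamma(x + e1) - math.lgamma(x + e2))

def first_derivative(func, x, h=1e-4):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def second_derivative(func, x, h=1e-3):
    """Central finite-difference approximation of func''(x)."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

# Hypothetical shift pairs with 3/2 <= e1 < e2 (not the values from the proof).
shifts = [(1.5, 1.8), (2.0, 3.3), (3.0, 3.5)]

def factor(j):
    """Return the j-th factor T_j as a function of x."""
    e1, e2 = shifts[j]
    return lambda x: gamma_ratio(x, e1, e2)

def product(x):
    """P(x) modelled as the product of the three factors."""
    value = 1.0
    for e1, e2 in shifts:
        value *= gamma_ratio(x, e1, e2)
    return value

# Sample points in [-1, 10].
points = [-1.0 + 0.25 * i for i in range(45)]

factors_ok = all(
    factor(j)(x) > 0
    and first_derivative(factor(j), x) < 0
    and second_derivative(factor(j), x) > 0
    for j in range(len(shifts))
    for x in points
)
product_convex = all(second_derivative(product, x) > 0 for x in points)
```

For these sample shifts each factor is positive, decreasing, and convex on the sampled points, and so is their product, as Step 1 predicts.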