1 Introduction

We study solutions \(u:\Omega \rightarrow {\mathbb {R}}\) to the semilinear elliptic equation

$$\begin{aligned} {\left\{ \begin{array}{ll}\Delta u - f(u)=0 &{} \quad \text{ in } \quad \Omega \\ \partial _\nu u=0 &{} \quad \text{ on } \quad \partial \Omega \end{array}\right. } \end{aligned}$$
(1)

where \(\partial _\nu \) is the exterior normal derivative along \(\partial \Omega \) and \(f=F'\) is the derivative of a smooth Morse function F having finitely many non-degenerate critical points (meaning \(f'\) does not vanish at these points) and ends that decay no faster than sub-critically. Some of the configurations that F can have under these assumptions are illustrated in Fig. 1 below.

Our hypotheses on \(\Omega \) include strictly convex domains of \({\mathbb {R}}^N\) with smooth boundary, as well as closed manifolds with non-negative Ricci curvature. More precisely, we assume:

  1. (D)

    \(\Omega \) is bounded and one of the following types of domain:

    1. (D1)

      \(\partial \Omega \ne \emptyset \) is smooth and strictly convex with \({\text {Ric}}\ge 0\) on \(\Omega \), or

    2. (D2)

      \(\partial \Omega =\emptyset \) and \({\text {Ric}}\ge 0\).

Solutions to Eq. (1) are critical points of the energy functional

$$\begin{aligned} E(u)=\int _\Omega \frac{|\nabla u|^2}{2}+F(u), \quad u \in W^{1,2}(\Omega ). \end{aligned}$$
(2)

The Morse index of a solution u is defined as the number of negative eigenvalues of \(E''(u)(\cdot ,\cdot )\), i.e. the number of negative modes of the linearization of (1) around u. When the Morse index of u is zero we say that u is stable. Otherwise, we say that u is unstable.

The structure of stable solutions of (1) is well understood. It is known that under hypothesis (D), stable solutions are the constants corresponding to local minima of F (see [15, 24, 34] and Theorem 5.1). On the other hand, unstable solutions are abundant and their behavior is governed by the geometry of \(\Omega \) (see [13, 21, 39] for the case of the Allen–Cahn equation).

In the present work, we pursue a general study of the simplest unstable solutions of Eq. (1), often called ground states: these are solutions minimizing the energy among the set of all unstable solutions. Under the assumptions above, we characterize ground states as mountain pass critical points of Morse index 1. For some of the potentials we consider, this cannot be obtained immediately by means of classical mountain pass methods (i.e. the existence of a mountain pass barrier and a Palais-Smale-type condition) since they fail to satisfy these requirements. To overcome this problem we prove a priori estimates for solutions of Eq. (1). These estimates, which we believe are interesting in their own right, imply that

potentials that do not decay faster than quadratically behave exactly like potentials with no decay at all.

In this way, we can study unstable solutions for a general potential F with decay, as solutions for a modified potential \(F^*\) better suited for the classical mountain pass theorem (see Fig. 2 below).

Fig. 1 Some possible configurations of F.

Fig. 2 If F does not decay faster than quadratically and M is large enough, then F and \(F^*\) induce the same unstable solutions.

The mountain pass characterization of ground states allows us to derive some of their geometric properties. For example, we give simple conditions on F that guarantee that ground states are non-constant, and we show that ground states are symmetric when the domain \(\Omega \) is symmetric, e.g. a ball in \({\mathbb {R}}^N\) or the closed sphere \(S^N\).

We then proceed to obtain improved symmetry results given additional structure of the nonlinearity. More specifically, we concentrate on the Allen–Cahn equation

$$\begin{aligned} \varepsilon ^2 \Delta u- W'(u)=0 \end{aligned}$$

on \(S^N\), where W is a symmetric double well potential. In this case, we can improve the symmetry results for ground states to uniqueness up to ambient isometries. From this, we can draw a number of conclusions on the qualitative behavior of low energy solutions, partly in strong analogy with results for minimal hypersurfaces.

Firstly, in the case of the round sphere \(S^N\) we can show that the ground state has nodal set precisely the equator, which is the least area minimal hypersurface. Moreover, there is a gap of definite size to the next highest energy level achieved by a solution. This, in particular, provides precise information on the first few Allen–Cahn widths of the round sphere. These are a sequence \(\{c_\varepsilon (p)\}_{p \in {\mathbb {N}}}\) of min-max critical values for the energy functional \(\varepsilon E\) in the case of a double-well potential \(F(u)=\frac{W(u)}{\varepsilon ^2}\), and they were introduced in [27]. They are defined using topologically nontrivial p-dimensional families of functions in \(W^{1,2}\), inspired by the p-widths for the area functional introduced by Marques-Neves in [36] (see also the recent work of Dey [22]). We show that the first \(N+1\) widths are at the ground state level and there is a gap between these and the \((N+2)\)-th width. Under the analogy with minimal hypersurfaces, this is equivalent to saying that on \(S^N\) the first \(N+1\) widths of the area functional, as defined in [36], correspond to the area of equatorial spheres. Following this analogy for \(N=3\), one should expect that the 5-th width of \(S^3\) is attained by a solution related to the Clifford torus. We construct the candidate solution, but the problem of characterizing it as the one realizing the 5-th width remains open. This is related to the lack of an Urbano type theorem [43] for solutions of the Allen–Cahn equation.

In a slightly different direction, we also obtain some control on the bifurcation behavior of solutions to the Allen–Cahn equation as \(\varepsilon \rightarrow 0\). For the three-dimensional sphere, we determine the two largest values of \(\varepsilon >0\) at which new solutions appear. These are connected to low eigenvalues of the Laplacian. While at the first bifurcation only ground states appear from constant solutions, at the second bifurcation, perhaps surprisingly, at least two new solutions appear. One of them corresponds geometrically to the Clifford torus and the other to the intersection of \(S^3\) with two orthogonal hyperplanes.

Both our gap and bifurcation results depend in a quite subtle way on the symmetries of the ambient manifold. For instance, we are at present not able to obtain the corresponding gap results, in fact not even the improved symmetry result, in the case of Euclidean balls. The main difference from the case of the sphere is that, for Euclidean balls, the orbits of a point under the symmetry group of a ground state are much smaller than in the case of the sphere. Similarly, our bifurcation analysis currently applies only to the three-dimensional case, for related dimensional reasons.

The theory of ground states for semilinear elliptic equations in \({\mathbb {R}}^N\) has a long history and a vast literature, which we do not attempt to cover. The work of Berestycki and Lions [7] is one of the earliest and most influential in the area. They study positive solutions in \({\mathbb {R}}^N\) assuming F has a mountain-pass geometry in the sense that it has a non-degenerate minimum at 0 with \(F(0)=0\), is bounded from below by a subcritical polynomial, and there is a positive number \(t>0\), where \(F(t)=0\) (see hypotheses (1.1)–(1.3) in [7]). Some of their methods were based on the earlier work of Coleman, Glazer and Martin [20], which in turn was motivated by Coleman’s study of vacuum instability in Minkowski space [19]. Despite the fact that ground states are not local minima, in both [20] and [7] the scale invariance of \({\mathbb {R}}^N\) allows them to reduce the problem to a restricted minimization problem. In geometric terms, this is analogous to constructing a hypersurface with constant mean curvature \(H_0>0\), by first solving a problem of area minimization with a volume constraint, obtaining a sphere with curvature H and then rescaling the sphere so that it has curvature \(H_0\). In this way, it is possible to avoid the use of mountain pass arguments in the case of \({\mathbb {R}}^N\). Only more recently was the mountain pass characterization of the ground states proved in this setting [32, 33]. However, even in this case the scale invariance of \({\mathbb {R}}^N\) is used in the main arguments, which cannot be reproduced in general domains.

In recent years, there has been growing interest in proving similar results on domains that are not scale invariant, for example the Euclidean sphere \(\Omega =S^N\). For Differential Geometry and Geometric PDE, the question is relevant in light of strong analogies between minimal surfaces and solutions to the stationary Allen–Cahn equation [6, 17, 22, 27, 28]. For Mathematical Physics, the question appears once more in the study of vacuum instability in curved space-times. For example, in the case of De Sitter space, critical points of a quantity often called Euclidean action correspond to solutions of a semilinear elliptic equation over the Euclidean sphere \(S^4\). These minimizers, known as bounce solutions, are often assumed to have \({\text {O}}(4)\) symmetry, a monotone profile and Morse index 1, but no proof of these facts has been given [12, 37]. The methods we develop are also motivated by the intention of filling this gap in the theory. These applications are discussed in an article in preparation by the third author and Camargo-Molina [14].

2 Main results

Let \(F:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be a Morse function with finitely many critical points, at least one of which is a local maximum. Denote by \(c^-<c^+\) the smallest and largest critical points of F. Assume:

  1. (A)

    F satisfies one of the following:

    1. (A1)

      \(c^-\) and \(c^+\) are local minima of F,

    2. (A2)

      linear lower bound: \(\exists C>0\), such that

      $$\begin{aligned} -{\text {sgn}}(t)f(t)\le C(1+|t|), \ \ \forall t\in {\mathbb {R}}, \end{aligned}$$
    3. (A3)

      superlinear and subcritical decay:

      • \(c^-\) is a local minimum,

      • \(\lim _{t\rightarrow +\infty } f(t)/t=-\infty \),

      • \(\exists C>0, R_0>0, \rho \in [0,1/2)\) and \(p\in (2,\frac{2N}{N-2})\), such that

        $$\begin{aligned} -f(t)\le C(1+|t|^{p-1}), \ \ \forall t>R_0 \end{aligned}$$

        and

        $$\begin{aligned} -\rho tf(t)+F(t)-t^2\ge 0, \ \ \forall t>R_0. \end{aligned}$$

Remark

  • A simple application of the maximum principle shows that for ground states to exist it is necessary for F to have at least one local maximum.

  • (A1) implies (A2). However, we separate them for convenience in stating the results below. In fact, all the configurations in Fig. 1 are covered under (A2): F(t) is allowed to grow freely as \(|t|\rightarrow \infty \), but can decay at most quadratically. This includes potentials with an arbitrary number of wells, at arbitrary heights, with possibly some bottomless wells decaying no faster than quadratically.

  • (A3) only admits the configuration illustrated in the middle of the first row in Fig. 1, with possibly additional critical points. It is satisfied when F has the form \(F(t)=-c|t|^{p}+G(t)\), for t large enough, where \(g(t)=G'(t)=o(t^{p-1})\). In this case, one can choose \(\rho \in (1/p,1/2)\). The second inequality in (A3) is similar to the so-called Ambrosetti-Rabinowitz condition.
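As a quick sanity check of the last bullet, the following sketch evaluates the left-hand side of the second inequality in (A3) for the model potential \(F(t)=-c|t|^{p}\) with \(G\equiv 0\). The values \(c=1\), \(p=3\), \(\rho =0.4\) and the function names `ar_margin` and `threshold` are illustrative assumptions, not taken from the text. With these values the expression reduces to \(c(\rho p-1)t^{p}-t^2=0.2\,t^3-t^2\), which is non-negative for \(t\ge 5\).

```python
def ar_margin(t, c=1.0, p=3.0, rho=0.4):
    """Left-hand side -rho*t*f(t) + F(t) - t**2 of the second inequality in (A3),
    for the model potential F(t) = -c*t**p on t > 0 (illustrative choice, G = 0)."""
    F = -c * t**p
    f = -c * p * t ** (p - 1)  # f = F'
    return -rho * t * f + F - t**2


def threshold(c=1.0, p=3.0, rho=0.4):
    """Smallest t > 0 at which ar_margin becomes non-negative.

    Solves c*(rho*p - 1)*t**(p - 2) = 1; requires rho*p > 1 and p > 2.
    """
    return (1.0 / (c * (rho * p - 1.0))) ** (1.0 / (p - 2.0))


if __name__ == "__main__":
    print("threshold R_0 for the model potential:", threshold())
    for t in (2.0, 5.0, 10.0):
        print(t, ar_margin(t))
```

The requirement \(\rho p>1\) in `threshold` is exactly the lower bound \(\rho >1/p\) mentioned in the remark.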

Our first main result is the existence of ground states:

Theorem 2.1

Under hypotheses (A) and (D), Eq. (1) admits at least one ground state.

After this, we present a useful mountain pass characterization of ground states that will later lead us to understand their geometric properties better. In order to state this characterization, we must introduce the following definition:

Definition 2.2

Given \(u\in W^{1,2}(\Omega )\), a continuous map \(h:[-1,1]\rightarrow W^{1,2}(\Omega )\) joining \(h(-1)\) and \(h(+1)\) is said to be optimal at u with respect to E if

  1. a)

    \(u=h(0)\),

  2. b)

    \(E(u)=E(h(0))>E(h(t)),\) for all \(0<|t|\le 1\)

  3. c)

    the map E(h(t)), \(t\in [-1,1]\), is smooth near \(t=0\), and

  4. d)

    \(\frac{d^2}{dt^2}E(h(t))|_{t=0}<0\).

Our first variational characterization is the following, in particular generalizing [27, Theorem 2.1], which covers the case of the Allen–Cahn equation discussed in more detail below.

Theorem 2.3

Assume (D) and either (A1) or (A3). If u is a ground state of Eq. (1), then

  1. (1)

    u has Morse index 1 and is of mountain pass type with respect to E. More precisely, there exist constants \(s^-<s^+\), such that

    $$\begin{aligned} E(u)=\inf _{h\in \Gamma } \sup _{t\in [-1,1]} E(h(t))>E(s^\pm ) \end{aligned}$$

    where \(\Gamma =\{h:[-1,1]\rightarrow W^{1,2}(\Omega ) \ | \ h \text { continuous, with } h(\pm 1)=s^\pm \}.\)

  2. (2)

    Moreover, there is a continuous path \(h:[-1,1]\rightarrow W^{1,2}(\Omega )\) joining \(s^-\) and \(s^+\) which is optimal at u with respect to E.

In general, we have the following result:

Theorem 2.4

Assume (D) and (A). Let u be a ground state for Eq. (1). Then, there exists a Morse function \(F^*:{\mathbb {R}}\rightarrow {\mathbb {R}}\) with finitely many critical points, satisfying (A1) and such that (1) and (2) of Theorem 2.3 hold with \(E^*(\cdot )=\int _\Omega \frac{|\nabla \cdot |^2}{2}+F^*(\cdot )\) in place of E.

Some of the steps in the proof of Theorem 2.4 are interesting on their own. As mentioned in the introduction, we cannot guarantee a mountain pass geometry or a Palais-Smale condition for all the configurations in Fig. 1. We overcome this by proving a priori estimates for unstable solutions of (1). Informally, these estimates say that bottomless wells that decay no faster than quadratically behave exactly like finite wells.

More precisely, let \(u^-\) and \(u^+\) be the smallest and largest unstable critical points of F, respectively. For simplicity in stating the theorem below, denote \(k^-=\min (0,u^-)\) and \(k^+=\max (0,u^+)\). In particular, all local maxima of F lie in \([k^-,k^+]\), but there could be local minima of F outside of this interval. We have the following a priori estimates:

Theorem 2.5

Under hypotheses (A2) and (D), there exists a positive constant \(M_0=M_0(\Omega ,C,K)\) such that any unstable solution \(u \in W^{1,2}(\Omega )\) to (1) satisfies

$$\begin{aligned} \Vert u\Vert _{L^\infty (\Omega )}+\Vert \nabla u\Vert _{L^\infty (\Omega )}\le M_0, \end{aligned}$$

where \(K=\max \{|k^-|,k^+,\max _{[k^-,k^+]}|f|\}\) and C is the constant from hypothesis (A2).

Thanks to these estimates we can modify F outside of a compact region without changing the set of unstable solutions of (1), as long as we retain the same quadratic bound for the decay and we do not add new local maxima while doing so. We expect this estimate to be sharp, in the sense that it should not hold for potentials with superquadratic decay as a consequence of the classical work [3]. Nor does it hold for stable solutions, since F might have local minima outside of the interval \([-M_0,M_0]\). Theorem 2.5 also implies that the set of unstable solutions of (1) is compact (see Corollary 3.6). So one might expect that, generically, equations of this type admit only finitely many solutions (again, in contrast with the classical work [3]).

The mountain pass characterization from Theorems 2.3 and 2.4 also allows us to derive several geometric properties of ground states. For example, since the Morse index of a ground state must be one, in order to rule out constant ground states, it is enough for one of the smallest local maxima of F to have Morse index at least 2. This is the content of the following corollary.

Corollary 2.6

Assume (D) and (A). Let \(c\in {\mathbb {R}}\) be a local maximum of F that minimizes F among all local maxima of F. If \(F''(c)=f'(c)<-\lambda _1(\Omega )\), then c has Morse index greater than 1 as a critical point of E. In this case, ground states of (1) are non-constant.
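The criterion in the corollary can be made concrete on a round sphere. The sketch below assumes \(\Omega \) is the unit sphere \(S^N\), uses the standard spectrum \(k(k+N-1)\) of \(-\Delta \) with its usual multiplicities, and counts the negative eigenvalues of the linearization \(\varphi \mapsto -\Delta \varphi +f'(c)\varphi \) at a constant solution c. The function names and the cutoff `kmax` are illustrative choices, not part of the text.

```python
from math import comb


def sphere_spectrum(N, kmax):
    """Pairs (eigenvalue, multiplicity) of -Laplacian on the unit sphere S^N, k = 0..kmax.

    The k-th distinct eigenvalue is k*(k + N - 1); its multiplicity is the dimension
    of the space of harmonic homogeneous polynomials of degree k in N + 1 variables.
    """
    spectrum = []
    for k in range(kmax + 1):
        mult = comb(N + k, N) - (comb(N + k - 2, N) if k >= 2 else 0)
        spectrum.append((k * (k + N - 1), mult))
    return spectrum


def morse_index_of_constant(N, fprime_c, kmax=50):
    """Number of negative eigenvalues, with multiplicity, of -Lap + f'(c) on S^N.

    Only eigenvalues up to index kmax are inspected, so -f'(c) should stay
    below kmax*(kmax + N - 1) for the count to be complete.
    """
    return sum(m for lam, m in sphere_spectrum(N, kmax) if lam + fprime_c < 0)


if __name__ == "__main__":
    # On S^3 the first nonzero eigenvalue is 3: the index jumps from 1 to 5 once f'(c) < -3.
    for fp in (-1.0, -4.0):
        print(fp, morse_index_of_constant(3, fp))
```

On \(S^3\), for instance, a local maximum c with \(f'(c)=-4<-\lambda _1=-3\) has index 5 in this count, so it cannot be a ground state, in agreement with the corollary.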

We also prove that ground states are symmetric when the ambient space is symmetric. This is done by symmetrizing the optimal family obtained in the mountain pass characterization.

Theorem 2.7

Assume hypothesis (A), then:

  1. (1)

    if \(\Omega =B^N_R\) is a Euclidean ball in \({\mathbb {R}}^{N}\), ground states of (1) are Foliated Schwarz symmetric.

  2. (2)

    if \(\Omega =S_R^{N}\) is a Euclidean sphere, ground states of (1) are Schwarz symmetric.

We also study uniqueness of the ground state. Notice that in general, Eq. (1) might admit multiple ground states. However, it is still possible to show that ground states are unique in some important cases. We do this in particular for a symmetric double well potential on the sphere as discussed below. Some of our results in fact apply to more general non-linearities as made precise in the corresponding sections.

Consider the Allen–Cahn equation

$$\begin{aligned} \varepsilon ^2\Delta u -W'(u)=0 \quad \text{ in } \quad S^{N}, \end{aligned}$$
(3)

where \(\varepsilon >0\) and \(W(u)=(1-u^2)^2/4\). Solutions to this equation are the critical points of

$$\begin{aligned} E_\varepsilon (u)=\int _\Omega \varepsilon \frac{|\nabla u|^2}{2}+\frac{W(u)}{\varepsilon }. \end{aligned}$$

We have the following uniqueness result:

Theorem 2.8

The ground state u for Eq. (3) on \(S^{N}\) is unique up to rigid motions. The function u is odd and Schwarz symmetric. In addition, the ground state is non-constant if and only if \(\varepsilon \in (0,\varepsilon _1)\), where \(\varepsilon _1=\sqrt{-W''(0)/\lambda _1}\) and \(\lambda _1=\lambda _1(S^{N})\).

In other words, when \(\varepsilon \in (\varepsilon _1,\infty )\), the ground state of the Allen–Cahn equation on \(S^N\) is given by the constant solution 0, and it is non-constant with nodal set exactly along an equator, when \(\varepsilon \in (0, \varepsilon _1)\).

We also study bifurcation from the first energy level and the gap between the first and second energy levels of \(E_\varepsilon \). More precisely, denote the first and second critical energy levels of \(E_\varepsilon \) as \(a_\varepsilon \le b_\varepsilon \), respectively, i.e.

$$\begin{aligned} a_\varepsilon = \inf \{ E_\varepsilon (u) : u \in W^{1,2}(M), E_\varepsilon '(u) = 0, E_\varepsilon (u)>E_\varepsilon (\pm 1)\} \end{aligned}$$
(4)

and

$$\begin{aligned} b_\varepsilon = \inf \{ E_\varepsilon (u) : u \in W^{1,2}(M), E_\varepsilon '(u)=0, E_\varepsilon (u)>a_\varepsilon \}. \end{aligned}$$

With this notation, we have the following gap theorem:

Theorem 2.9

\(b_\varepsilon >a_\varepsilon \) if and only if \(\varepsilon \in (0,\varepsilon _1)\). Moreover, the energy level \(b_\varepsilon \) is realized by at least one solution which is non-constant provided \(\varepsilon \in (0,\varepsilon _2)\), where \(\varepsilon _2=\sqrt{-W''(0)/\lambda _2}\) and \(\lambda _2=\lambda _2(S^{N})\).
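For orientation, the thresholds \(\varepsilon _1\) and \(\varepsilon _2\) can be evaluated explicitly. The sketch below assumes the unit round sphere, for which the k-th distinct nonzero eigenvalue of \(-\Delta \) is \(k(k+N-1)\), and uses \(W''(0)=-1\) for \(W(u)=(1-u^2)^2/4\). The function names are illustrative. For \(N=3\) it reproduces the value \(\varepsilon _2=\frac{1}{2\sqrt{2}}\) quoted in Theorem 2.11.

```python
from math import isclose, sqrt


def sphere_eigenvalue(N, k):
    """k-th distinct eigenvalue of -Laplacian on the unit sphere S^N (k = 0, 1, 2, ...)."""
    return k * (k + N - 1)


def threshold_eps(N, k, W2_at_0=-1.0):
    """eps_k = sqrt(-W''(0) / lambda_k(S^N)) for k >= 1.

    The default W''(0) = -1 corresponds to W(u) = (1 - u^2)^2 / 4.
    """
    return sqrt(-W2_at_0 / sphere_eigenvalue(N, k))


if __name__ == "__main__":
    for N in (2, 3, 4):
        print(N, threshold_eps(N, 1), threshold_eps(N, 2))
    # N = 3: eps_1 = 1/sqrt(3) and eps_2 = 1/(2*sqrt(2))
    print(isclose(threshold_eps(3, 2), 1 / (2 * sqrt(2))))
```

In closed form, under the same assumptions, \(\varepsilon _1=1/\sqrt{N}\) and \(\varepsilon _2=1/\sqrt{2(N+1)}\).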

Denoting by \(c_\varepsilon (1) \le c_\varepsilon (2) \le c_\varepsilon (3) \le \dots \) the Allen–Cahn widths of the sphere as defined in [27, Section 3.2], we also prove the following gap theorem:

Theorem 2.10

For \(\varepsilon >0\) sufficiently small we have on the sphere \(S^N\) that

$$\begin{aligned} c_\varepsilon (1) = \dots = c_\varepsilon (N+1) < c_\varepsilon (N+2). \end{aligned}$$

In the case \(N=3\), we prove the following bifurcation result for ground states in the sphere:

Theorem 2.11

Let \(\varepsilon _2 = (\lambda _2(S^3))^{-1/2}=\frac{1}{2\sqrt{2}}\).

  1. (1)

    For any \(\varepsilon \in (\varepsilon _2,\varepsilon _1)\), the only nonconstant solutions of the Allen–Cahn equation are the ground states (which are unique up to rotations).

  2. (2)

    For \(\varepsilon <\varepsilon _2\), there are at least two families of solutions which are not radially symmetric. These families of solutions have the symmetries of the Clifford torus and a pair of orthogonal equators, respectively, and accumulate on these minimal surfaces as \(\varepsilon \downarrow 0\).

Remark

After the first version of this work was posted, F. Hiesmayr [31] proved a characterization result for solutions of the Allen–Cahn equation (3) with bounded energy in the style of Urbano [43], using a Frankel-type property for solutions to (3). Assuming uniform energy bounds, Hiesmayr’s result shows that, for sufficiently small \(\varepsilon >0\), the only solutions with Morse index at most 5 are the ground states and the symmetric solutions vanishing on the Clifford torus described in Theorem 2.11. As a corollary, using the solution to the Willmore Conjecture by Marques and Neves [35], Hiesmayr concludes that for small \(\varepsilon >0\) the fifth min-max width \(c_\varepsilon (5)\) for the Allen–Cahn energy is achieved by the latter solutions. For a comparison between the min-max widths \(c_\varepsilon (p)\) and the p-widths for the area functional, we refer to [22, 27].

Organization. Theorem 2.1 is proved in Sect. 5.2, Theorem 2.3 and Theorem 2.4 in Sect. 5.3, Theorem 2.5 in Sect. 3, Theorem 2.7 in Sect. 6, Theorem 2.8 in Sect. 7, Theorem 2.9 and Theorem 2.10 in Sect. 8, and Theorem 2.11 in Sect. 9.

2.1 Notation

\(B_R(p)\)

denotes the geodesic ball of radius R centered at \(p\in M\)

\(\nu \)

denotes the exterior unit normal along the boundary \(\partial \Omega \) of \(\Omega \)

|A|

denotes the N-dimensional Hausdorff measure of a subset \(A \subset M\), \(|A|={\mathcal {H}}^N(A)\)

\(\chi _A\)

denotes the characteristic function of a set \(A \subset M\), i.e. \(\chi _A(x)=1\) if \(x \in A\), and \(\chi _A(x)=0\) otherwise

\(\Vert f\Vert _{p}\)

denotes the \(L^p\) norm of f on M, i.e. \(\Vert f\Vert _p=\left( \int _M |f|^p\right) ^{1/p}\)

\(c^-\)

denotes the smallest critical point of F

\(c^+\)

denotes the largest critical point of F

\(c_1<\cdots <c_n\)

is the list of unstable critical points of F

\(k^-\)

denotes \(\min (0,c_1)\)

\(k^+\)

denotes \(\max (0,c_n)\)

\(\beta _{N}\)

denotes the volume of the N-dimensional sphere \(S^{N}\)

3 A priori estimates and compactness under (A2) and (D)

Given \(u \in W^{1,2}(M)\) a solution to (1) (hence smooth), let \(P:\Omega \rightarrow {\mathbb {R}}\), be defined in terms of u as

$$\begin{aligned} P=\frac{|\nabla u|^2}{2}-F(u). \end{aligned}$$

The proofs of the following two lemmas follow closely those in Chapter 5 of [42]. See also [25] for a related pointwise gradient bound.

Lemma 3.1

Assume \({\text {Ric}}\ge 0\), and u is a solution of (1). At points where \(|\nabla u|\ne 0\), the function P satisfies a maximum principle. Namely, it holds

$$\begin{aligned} \Delta P + |\nabla u|^{-2}\langle \nabla P , \nabla P - \nabla |\nabla u|^2 \rangle \ge 0. \end{aligned}$$
(5)

Proof

From \(\nabla P = |\nabla u|\nabla |\nabla u| - f(u)\nabla u \), it follows that

$$\begin{aligned} f(u)^2 = |\nabla u|^{-2} \langle \nabla P , \nabla P - \nabla |\nabla u|^2 \rangle + |\nabla |\nabla u||^2. \end{aligned}$$

From Bochner’s formula and \(\Delta u =f(u)\), we obtain

$$\begin{aligned} \Delta P&= |{\text {Hess}} u|^2 +\langle \nabla \Delta u, \nabla u \rangle + {\text {Ric}}(\nabla u,\nabla u) - f'(u)|\nabla u|^2 -f(u)\Delta u \\&= |{\text {Hess}} u|^2 -f(u)^2+ {\text {Ric}}(\nabla u,\nabla u). \end{aligned}$$

We obtain the inequality by combining both expressions, using \({\text {Ric}}(\nabla u,\nabla u) \ge 0\) and the standard inequality \(|{\text {Hess}} u|^2 - |\nabla |\nabla u||^2\ge 0\). \(\square \)
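A one-dimensional illustration may help explain why P is the natural quantity here. On an interval, Eq. (1) reads \(u''=f(u)\), and then \(P'=u'u''-f(u)u'=0\), so P is constant along solutions and (5) holds with equality. The sketch below checks this numerically with a classical Runge-Kutta integrator. The choice \(f(u)=u^3-u\) (the Allen–Cahn nonlinearity with \(\varepsilon =1\)), the initial data and the step size are illustrative assumptions.

```python
def f(u):
    """f = W' for the double-well W(u) = (1 - u^2)^2 / 4."""
    return u**3 - u


def F(u):
    """Potential with F' = f."""
    return (1.0 - u**2) ** 2 / 4.0


def P(u, v):
    """P = |u'|^2 / 2 - F(u), with v standing for u'."""
    return 0.5 * v**2 - F(u)


def rk4_step(u, v, h):
    """One classical Runge-Kutta step for the first-order system u' = v, v' = f(u)."""
    k1u, k1v = v, f(u)
    k2u, k2v = v + 0.5 * h * k1v, f(u + 0.5 * h * k1u)
    k3u, k3v = v + 0.5 * h * k2v, f(u + 0.5 * h * k2u)
    k4u, k4v = v + h * k3v, f(u + h * k3u)
    u_new = u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
    v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return u_new, v_new


def max_drift_of_P(u0=0.5, v0=0.0, h=1e-3, steps=5000):
    """Largest deviation of P from its initial value along the numerical solution."""
    u, v = u0, v0
    p0 = P(u, v)
    drift = 0.0
    for _ in range(steps):
        u, v = rk4_step(u, v, h)
        drift = max(drift, abs(P(u, v) - p0))
    return drift


if __name__ == "__main__":
    print("max |P - P(0)| along the trajectory:", max_drift_of_P())
```

Any drift reported by `max_drift_of_P` is discretization and rounding error. In higher dimensions P is no longer constant, and Lemma 3.1 replaces conservation by a maximum principle.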

Lemma 3.2

Let \(\partial \Omega \) be strictly convex. Assume u is a solution of (1) with either zero Dirichlet or zero Neumann boundary condition. Then, at a boundary point with \(|\nabla u|\ne 0\), we must have \(\partial _\nu P<0\).

Proof

For the Dirichlet case, notice that from the expression

$$\begin{aligned} f(u)=\Delta u= \langle \nabla _\nu \nabla u, \nu \rangle + \Delta _{\partial \Omega } u -\langle \mathbf {H}, \nabla u \rangle , \end{aligned}$$

where \(\mathbf {H}\) is the mean curvature vector of the level sets of u (see formula (4.68) in [42], minding the different normalization for the mean curvature) and since \(\nu \) and \(\nabla u\) are parallel, we obtain

$$\begin{aligned} f(u)=\bigg \langle \nu ,\frac{\nabla u}{|\nabla u|}\bigg \rangle \bigg (|\nabla u|^{-1} \langle \nabla _\nu \nabla u, \nabla u \rangle - \langle \mathbf {H} ,\nu \rangle |\nabla u|\bigg ). \end{aligned}$$

Multiplying by \(\langle \nu ,\frac{\nabla u}{|\nabla u|} \rangle \) and rearranging the terms we get the desired inequality as long as \(\partial \Omega \) is strictly mean-convex (this holds, in particular, when \(\partial \Omega \) is strictly convex)

$$\begin{aligned} 0>\langle \mathbf {H} ,\nu \rangle |\nabla u|^2= \partial _\nu \bigg ( \frac{|\nabla u|^2}{2} - F(u)\bigg )=\partial _\nu P. \end{aligned}$$

In the Neumann case, \(\nu \) and \(\nabla u\) are perpendicular along the boundary. We conclude

$$\begin{aligned} \partial _\nu P&= \bigg \langle \nu , \nabla \frac{|\nabla u|^2}{2} - f(u)\nabla u \bigg \rangle \\&= \langle \nu , \nabla _{\nabla u}\nabla u \rangle \\&= - \langle \nabla _{\nabla u}\nu ,\nabla u\rangle \\&= II_\nu (\nabla u,\nabla u)<0, \end{aligned}$$

where \(II_\nu \) is the second fundamental form of \(\partial \Omega \) with respect to the exterior normal \(\nu \), which is a negative definite quadratic form when \(\partial \Omega \) is strictly convex. \(\square \)

Corollary 3.3

Let \(\Omega \) be bounded and strictly convex with \({\text {Ric}}\ge 0\). Assume u is a solution of (1) with either zero Dirichlet or zero Neumann boundary condition, then the maximum of P in \({\overline{\Omega }}\) is attained at a critical point of u.

Remark

The proof of Corollary 3.3 relies on assumption (D) only, and not on the condition (A2) on the potential F.

Lemma 3.4

Let \(c_1,\dots ,c_n\) be the unstable critical points of F. If u is an unstable solution to (1) then

  1. (1)

    \([u_{\min },u_{\max }]\cap \{c_1,\dots ,c_n\}\ne \emptyset \) and

  2. (2)

    \(\max _{[u_{\min },u_{\max }]} F \le \max \{F(c_1),\dots F(c_n) \}\).

Proof

Item (1) follows by contradiction directly from the maximum principle. In fact, assuming \([u_{\min },u_{\max }]\cap \{c_1,\dots ,c_n\}=\emptyset \) implies there is at most one local minimum of F in \([u_{\min },u_{\max }]\). If there are no local minima, then \(f\) has a sign on this interval, contradicting \(\Delta u=f(u)\) either at the maximum or minimum of u (depending on the sign of \(f\)). On the other hand, if there is exactly one local minimum \(m\in [u_{\min },u_{\max }]\), then the maximum principle is not contradicted only if \(u=m\). However, in this case the solution is stable by Lemma 5.2.

Item (2) follows along similar lines. If there are no critical points outside of \([c_1,c_n]\) then F decreases with respect to its distance to \([c_1,c_n]\) and the result follows. If there are critical points outside of \([c_1,c_n]\) then these are all local minima and there is at most one on each side of \([c_1,c_n]\). As above, by the maximum principle, these critical points bound \(u_{\min }\) and \(u_{\max }\) accordingly. Then, in this case F decreases with respect to its distance to \([c_1,c_n]\) in \([u_{\min },u_{\max }]\). \(\square \)

Proposition 3.5

Assume (A2). There exists a positive constant \(B_0=B_0(\Omega ,C,K)\), such that for any u which is an unstable solution to (1) we have

$$\begin{aligned} \sup _{\Omega } |\nabla u| \le B_0, \end{aligned}$$

where \(K=\max \{|k^-|,k^+,\max _{[k^-,k^+]}|f|\}\) and C is given by hypothesis (A2).

Proof

Let \(p_0 \in {\overline{\Omega }}\) be a point where P attains its maximum and denote \(u_0=u(p_0)\). By Corollary 3.3 we have that \(\nabla u(p_0)=0\), so \(P(p_0)=-F(u_0)\).

Case \(u_0\in [k^-,k^+]\): We have the following inequalities on all of \(\Omega \),

$$\begin{aligned} \frac{|\nabla u|^2}{2}&\le F(u)-F(u_0) \\&\le F(c_k) - F(u_0) \\&\le |c_k-u_0| \max _{[k^-,k^+]}|f|\\&\le |k^+-k^-| \max _{[k^-,k^+]}|f|, \end{aligned}$$

where the first inequality follows from \(P\le P(p_0)\), the constant \(c_k \in \{c_1,\dots ,c_n\}\) is such that \(F(c_k)=\max \{F(c_1),\dots ,F(c_n)\}\) and the second inequality follows from Lemma 3.4.

Case \(u_0> k^+\): We subdivide the proof of this case into a series of claims.

Claim 1

\(u_{\max } =u_0\).

Proof of Claim 1

Let \(p^+ \in \Omega \) be such that \(u(p^+)=u_{\max }\). Clearly, \(u_0\le u_{\max }\) and \(\nabla u(p^+)=0\). To proceed by contradiction assume \(u_0<u_{\max }\). Since F is strictly decreasing in \([k^+,u_{\max }]\), we have \(F(u_{\max })< F(u_0)\). On the other hand, by our assumption on \(p_0\), we have \(-F(u_{\max })=P(p^+)\le P(p_0)=-F(u_0),\) which is a contradiction. This proves the claim. \(\square \)

Claim 2

For all \(\delta \in (0,1)\), the gradient bound \(|\nabla (u_0-u)| \le A(\delta ) |u_0-u|\) holds on the level set \(\{ k^+ \le u\le u_0-\delta (u_0-k^+)\}\), with \(A(\delta )=\sqrt{(2-\delta )C+ 2C\frac{(1+k^+ )}{\delta (u_0-k^+)} }.\)

Proof of Claim 2

In what follows we use the fact that the function \(u_0-u\) is positive on the level set \(\{ k^+ \le u\le u_0-\delta (u_0-k^+)\}\), which follows from the previous claim. In the steps below there are three inequalities, the first one follows from \(P\le P(p_0)\), the second one uses the linear bound on f and the fact that \(u_0 \ge u\ge k^+\ge 0\), and the third one uses \(u\in [k^+,u_0-\delta (u_0-k^+)].\)

$$\begin{aligned} |\nabla (u_0-u)|^2&= |\nabla u|^2\\&\le 2F(u)-2F(u_0) \\&= 2\int _{u}^{u_0} -f(s) ds \\&\le 2C\int _{u}^{u_0} (1+s) ds \\&= 2C(u_0-u) + C(u_0^2-u^2) \\&= C\bigg ( \frac{2 + (u_0+u)}{u_0-u} \bigg ) |u_0-u|^2\\&\le C\bigg ( \frac{2 + (2u_0-\delta (u_0-k^+))}{\delta (u_0-k^+)} \bigg ) |u_0-u|^2\\&= C\bigg ( (2-\delta )+ 2\frac{(1+k^+ )}{\delta (u_0-k^+)} \bigg ) |u_0-u|^2, \end{aligned}$$

which is what we wanted to prove. \(\square \)

Claim 3

There exists \(\delta \in (0,1)\), depending only on \(\Omega \) and C, such that for any unstable solution we have

$$\begin{aligned} u_0 \le k^+ + \frac{2d^2C(1+k^+)}{\delta \log (\delta )^2 +\delta (\delta -2)d^2C}, \end{aligned}$$

where \(d={\text {diam}}(\Omega )\).

Proof of Claim 3

Since u is an unstable solution, by Lemma 3.4 the set \(\{c_1\le u \le c_n\}\) must be non-empty. In particular, since \(u_{\max }=u_0\) by Claim 1, the level sets \(\{u=s\}\), with \(s\in [k^+,u_0]\) are non-empty.

Choose points \(q_0\in \{u=u_0-\delta (u_0-k^+)\}\) and \(q_1\in \{u=k^+\}\) and let \(\gamma : [0,1]\rightarrow \Omega \) be a minimizing geodesic joining \(q_0\) with \(q_1\) (which exists because \(\partial \Omega \) is strictly convex). We can always assume that \(\gamma \) is contained in the set \(\{k^+\le u \le u_0-\delta (u_0-k^+)\}\) by redefining \(q_0\) to be \(\gamma (t_0)\), where \(t_0\in [0,1)\) is the last time \(u(\gamma (t_0))=u_0-\delta (u_0-k^+)\), and \(q_1\) to be \(\gamma (t_1)\), where \(t_1\in (t_0,1]\) is the first time \(u(\gamma (t_1))=k^+\). If necessary, we can reparametrize this geodesic segment so that it is defined again over [0, 1] and \(|\gamma '(t)|={\text {dist}}(q_0,q_1)\le {\text {diam}}(\Omega )=d\), for all \(t\in [0,1]\).

The gradient inequality obtained in Claim 2 then holds along \(\gamma :[0,1]\rightarrow \{k^+\le u \le u_0-\delta (u_0-k^+)\}\) and denoting \((u_0-u)(t)=u_0-u(\gamma (t))\) it follows

$$\begin{aligned} (u_0-u)'(t)\le |u'(t)|\le |\gamma '(t)||\nabla u(\gamma (t))| \le d A(\delta ) (u_0-u)(t). \end{aligned}$$

From Gronwall’s inequality we obtain

$$\begin{aligned} u_0-k^+\le \delta (u_0-k^+) \exp ( d \times A(\delta )), \end{aligned}$$

which translates into \(1 \le \delta \exp ( d \times A(\delta ) )\), and then into \(-\log (\delta ) \le d A(\delta )\), after taking logarithm on both sides. Squaring the expression, substituting the explicit formula for \(A(\delta )\) and doing some simple arithmetic, one gets

$$\begin{aligned} (u_0-k^+)(\log (\delta )^2 +\delta d^2C -2 d^2C) \le \frac{2d^2C(1+k^+)}{\delta }. \end{aligned}$$

Finally, choosing \(\delta \in (0,1)\) small enough, we can guarantee that the quantity \(\log (\delta )^2 +\delta d^2C -2 d^2C\) is positive. This proves the claim. \(\square \)
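The two elementary steps in the proof above can be spelled out as follows. This is only a sketch: it assumes \(C>0\) and uses the shorthand \(w(t)=(u_0-u)(t)\), which is introduced here for illustration. Since \(w(0)=\delta (u_0-k^+)\), \(w(1)=u_0-k^+\) and \(w'\le dA(\delta )w\), the Gronwall step reads

$$\begin{aligned} \frac{d}{dt}\Big (e^{-dA(\delta )t}w(t)\Big )=e^{-dA(\delta )t}\big (w'(t)-dA(\delta )w(t)\big )\le 0 \quad \implies \quad w(1)\le e^{dA(\delta )}w(0). \end{aligned}$$

For the positivity, since \(\delta d^2C\ge 0\), one possible explicit threshold for \(\delta \in (0,1)\) is

$$\begin{aligned} \log (\delta )^2 +\delta d^2C -2 d^2C \ge \log (\delta )^2 -2 d^2C >0 \quad \text{ whenever } \quad \delta < \exp \big (-d\sqrt{2C}\big ). \end{aligned}$$

Dividing the last inequality of the proof by this positive quantity gives

$$\begin{aligned} u_0-k^+\le \frac{2d^2C(1+k^+)}{\delta \big (\log (\delta )^2 +\delta d^2C -2 d^2C\big )}=\frac{2d^2C(1+k^+)}{\delta \log (\delta )^2 +\delta (\delta -2)d^2C}, \end{aligned}$$

which is the bound stated in the claim.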

Claim 4

Let \(R_0=k^+ + \frac{2d^2C(1+k^+)}{\delta \log (\delta )^2 +\delta (\delta -2)d^2C}\) be the constant from the previous claim. Then

$$\begin{aligned} |\nabla u|^2 \le |k^+-k^-|\max _{[k^-,k^+]}|f| + C(R_0^2+R_0). \end{aligned}$$

Proof

We have the following inequalities

$$\begin{aligned} |\nabla u|^2&\le F(u)-F(u_0) \\&\le F(c_k) - F(u_0)\\&= \int _{c_k}^{k^+}-f(s)ds + \int _{k^+}^{u_0}-f(s)ds\\&\le |k^+-k^-|\max _{[k^-,k^+]}|f| + C\int _{0}^{R_0}(1+s)ds\\&\le |k^+-k^-|\max _{[k^-,k^+]}|f| + CR_0(1+R_0), \end{aligned}$$

where the first inequality follows from \(P\le P(p_0)\), the constant \(c_k \in \{c_1,\dots ,c_n\}\) is such that \(F(c_k)=\max \{F(c_1),\dots ,F(c_n)\}\) and the second inequality follows from Proposition 3.4. \(\square \)
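For completeness, the passage from the fourth to the fifth line of the chain above is the elementary computation

$$\begin{aligned} \int _{0}^{R_0}(1+s)ds = R_0+\frac{R_0^2}{2}\le R_0+R_0^2=R_0(1+R_0), \end{aligned}$$

so the constant multiplying C in the final bound is \(R_0^2+R_0\).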

Finally, notice that Claim 4 concludes the proposition when \(u_0>k^+\) since \(R_0\) depends only on \(\Omega \), C and \(k^+\).

Case \(u_0< k^-\): This case can be handled using exactly the same computations as above, replacing \(u_{\max }\), \(u_0-u\) and \(k^+\) in Claims 1-4 by \(u_{\min }\), \(u-u_0\) and \(k^-\), respectively. \(\square \)

Proof of Theorem 2.5

Let \(B_0=B_0(\Omega ,C,K_0)\) be the constant from Proposition 3.5, i.e. \(|\nabla u|\le B_0\) at all points of \(\Omega \). Since by Proposition 3.4 the level set \(\{c_1\le u\le c_n\}\) is non-empty, \({\overline{\Omega }}\) is compact and \(\{c_1,\dots ,c_n\}\subset [k^-,k^+]\), we obtain bounds on \(\Vert u\Vert _{L^\infty (\Omega )}\) in terms of \(\Omega \), K and \(B_0\). \(\square \)
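One way to make the last step explicit is sketched below. It assumes, as in the proof of Claim 3, that any two points of \({\overline{\Omega }}\) can be joined by a curve in \({\overline{\Omega }}\) of length at most \(d={\text {diam}}(\Omega )\); the point p is a label introduced here for illustration. Pick p with \(c_1\le u(p)\le c_n\). Integrating the gradient bound along such a curve from p to an arbitrary point x gives

$$\begin{aligned} |u(x)|\le |u(p)|+|u(x)-u(p)|\le \max \{|k^-|,|k^+|\}+dB_0, \end{aligned}$$

so that \(\Vert u\Vert _{L^\infty (\Omega )}\le \max \{|k^-|,|k^+|\}+dB_0\).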

Corollary 3.6

Under hypotheses (A2) and (D), the space of solutions of Eq. (1) is compact in the \(W^{1,2}(\Omega )\) topology.

Proof

This is an immediate consequence of the pointwise bounds obtained above. \(\square \)

4 Compactness under (A3)

Hypothesis (A3) assumes subcritical growth and a version of the Ambrosetti-Rabinowitz condition adapted to the case of zero Neumann boundary condition. The method of proof is standard, but we include it for the reader's convenience.

Lemma 4.1

Under hypothesis (A3), E satisfies the Palais-Smale condition.

Proof

Let \(\{u_n\}_n \subset W^{1,2}(\Omega )\) be a Palais-Smale sequence, i.e. \(\sup _n E(u_n)=M<+\infty \) and \(\Vert E'(u_n)\Vert \rightarrow 0\). Then

$$\begin{aligned}&M + \rho \Vert E'(u_n)\Vert \cdot (1+ \Vert u_n\Vert ^2_{W^{1,2}(\Omega )})\\&\ge M + \rho \Vert E'(u_n)\Vert \cdot \Vert u_n\Vert _{W^{1,2}(\Omega )}\\&\ge E(u_n) - \rho E'(u_n)(u_n) \\&=\int _\Omega \bigg (\frac{1}{2}-\rho \bigg ) |\nabla u_n|^2 + \int _\Omega -\rho u_n f(u_n) + F(u_n) \\&= \bigg (\frac{1}{2}-\rho \bigg )\Vert u_n\Vert ^2_{W^{1,2}(\Omega )}+ \int _\Omega -\rho u_n f(u_n) + F(u_n) - \bigg (\frac{1}{2}-\rho \bigg ) u_n^2\\&\ge \bigg (\frac{1}{2}-\rho \bigg )\Vert u_n\Vert ^2_{W^{1,2}(\Omega )}+ \int _\Omega -\rho u_n f(u_n) + F(u_n) - u_n^2 \\&\ge \bigg (\frac{1}{2}-\rho \bigg )\Vert u_n\Vert ^2_{W^{1,2}(\Omega )}+ \int _{\{u_n\le R_0\}} -\rho u_n f(u_n) + F(u_n) - u_n^2. \end{aligned}$$

Since the second term of the last line is bounded and \(\Vert E'(u_n)\Vert \rightarrow 0\), it follows that the sequence \(u_n\) is bounded in \(W^{1,2}(\Omega )\).

Using the Rellich-Kondrachov compactness theorem, we can pass to a subsequence which is convergent in \(L^q\), for \(q=2\) and \(q=p\), and weakly convergent in \(W^{1,2}(\Omega )\), to a limit function u. Finally, we have that the last line in

$$\begin{aligned} \int _{\Omega } |\nabla (u_n-u)|^2&=\int _{\Omega } \nabla u_n\nabla (u_n-u) - \nabla u\nabla u_n+|\nabla u|^2 \\&= E'(u_n)(u_n-u)-\int _\Omega f(u_n)(u_n-u)+\int _\Omega \big (|\nabla u|^2- \nabla u\nabla u_n\big ) \end{aligned}$$

goes to zero. For the first term, we use that \(E'(u_n)\) goes to zero and \(u_n-u\) is bounded in \(W^{1,2}(\Omega )\). For the second term, we can use Hölder’s inequality

$$\begin{aligned} \bigg |\int _\Omega f(u_n)(u_n-u)\bigg |\le \Vert f(u_n) \Vert _{L^{\frac{p}{p-1}}(\Omega )} \Vert u_n-u \Vert _{L^{p}(\Omega )}. \end{aligned}$$

Since |f(s)| is bounded by \(C(|s|^{p-1}+1)\), the term \(\Vert f(u_n) \Vert _{L^{\frac{p}{p-1}}(\Omega )}\) is bounded, while \(\Vert u_n-u \Vert _{L^{p}(\Omega )}\) goes to zero. Finally, the strong convergence in \(L^{2}(\Omega )\) and the weak convergence in \(W^{1,2}(\Omega )\) imply that the last two terms cancel in the limit. \(\square \)
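The two facts used at the end of the proof can be checked as follows; this is a sketch assuming \(p>1\), with the exponent \(q=\frac{p}{p-1}\) introduced here as a shorthand. By the convexity inequality \((a+b)^q\le 2^{q-1}(a^q+b^q)\) for \(a,b\ge 0\), and since \((p-1)q=p\) and \(q-1=\frac{1}{p-1}\),

$$\begin{aligned} \int _\Omega |f(u_n)|^{q}\le C^{q}\int _\Omega \big (|u_n|^{p-1}+1\big )^{q}\le 2^{\frac{1}{p-1}}C^{q}\Big (\Vert u_n\Vert ^p_{L^p(\Omega )}+|\Omega |\Big ), \end{aligned}$$

which is bounded because \(u_n\) converges in \(L^p(\Omega )\). For the gradient terms, the weak convergence of \(u_n\) to u in \(W^{1,2}(\Omega )\) gives

$$\begin{aligned} \int _\Omega \nabla u\nabla u_n \rightarrow \int _\Omega |\nabla u|^2 \quad \text{ as } \quad n\rightarrow \infty . \end{aligned}$$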

The following is an immediate consequence of the lemma above:

Corollary 4.2

Under hypotheses (A3) and (D), the space of solutions to Eq. (1) with energy bounded from above is compact in the \(W^{1,2}(\Omega )\) topology.

5 The mountain pass characterization

In this section we provide the proofs of Theorems 2.3 and 2.4.

5.1 Preliminary results and some technical lemmas

In this section we collect definitions and results that will be useful in the forthcoming sections.

Theorem 5.1

([15, 24, 34]) Under hypothesis (D), stable solutions to Eq. (1) are constant functions.

Lemma 5.2

Assume \(\Omega \) satisfies (D). If F is a Morse function, then the following are equivalent:

  1. (i)

    u is a non-degenerate local minimum of E.

  2. (ii)

    u is a solution of (1) with Morse index 0 (i.e. stable).

  3. (iii)

    u is constant equal to a local minimum of F.

Proof

i) \(\implies \) ii) is basic calculus of variations. To see ii) \(\implies \) iii), notice that by Theorem 5.1, u must be a constant function. Then, \(0=\Delta u=f(u)=F'(u)\), so u is a critical point of F. Since F is a Morse function, u is either a local minimum or a local maximum. When u is a local maximum then \(F''(u)<0\), and the linearization of (1), i.e. \(-\Delta + F''(u)\), has \(\phi \equiv 1\) as an eigenfunction with negative eigenvalue \(F''(u)\). Therefore, u must be a local minimum, i.e. \(F''(u)>0\). Finally, iii) \(\implies \) i) follows from the inequality \(E''(u)(v,v)=\int _{\Omega }|\nabla v|^2+F''(u)v^2\ge \int _{\Omega }F''(u)v^2>0\), for all \(v\ne 0\). \(\square \)
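As a quick check of the instability of constant local maxima, take a critical point c of F with \(F''(c)<0\) and test the second variation at \(u\equiv c\) in the constant direction \(v\equiv 1\) (the letter c is used here only for this illustration):

$$\begin{aligned} E''(c)(1,1)=\int _{\Omega }|\nabla 1|^2+F''(c)\cdot 1^2=F''(c)|\Omega |<0. \end{aligned}$$

Hence the constant solution \(u\equiv c\) has Morse index at least 1.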

The following is the classical Mountain Pass Theorem (see [29]):

Theorem 5.3

(Mountain pass) Let \(E:W^{1,2}(\Omega )\rightarrow {\mathbb {R}}\) be a \(C^2\) energy functional given by

$$\begin{aligned} E(u)=\int _\Omega \frac{|\nabla u|^2}{2}+F(u) \end{aligned}$$

and let \(u^-,u^+\in W^{1,2}(\Omega )\) be such that the following inequality holds

$$\begin{aligned} \inf _{h\in \Gamma } \sup _{t\in [-1,1]} E(h(t))=E_0>E(u^\pm ) \end{aligned}$$
(6)

where \(\Gamma =\{h\in C([-1,1],W^{1,2}(\Omega )): h(\pm 1)\equiv u^{\pm }\}.\)

Assume there exists a sequence \(\{h_n\}_{n\in {\mathbb {N}}}\subset \Gamma \), with

$$\begin{aligned} E_0=\lim _{n\rightarrow \infty } \sup _{t\in [-1,1]} E(h_n(t)) \end{aligned}$$

and such that E satisfies the Palais-Smale condition along \(\{h_n\}_{n\in {\mathbb {N}}}\) (see Definition 5.4 below).

Then, there exists \(u \in W^{1,2}(\Omega )\) such that:

  1. (a)

    u is a critical point of E,

  2. (b)

    \(u=\lim _{n\rightarrow \infty }h_n(t_n)\), for some sequence \(\{t_n\}_{n\in {\mathbb {N}}}\subset [-1,1]\).

  3. (c)

    \(E(u)=E_0,\)

  4. (d)

    u has Morse index at most 1. Moreover, if E does not admit degenerate critical points of Morse index 0, then u must have Morse index 1.

Definition 5.4

Under the hypotheses of the theorem above, we say that E satisfies the Palais-Smale condition along \(\{h_n\}_{n\in {\mathbb {N}}}\) if any sequence \(\{u_n\}_{n\in {\mathbb {N}}} \subset W^{1,2}(\Omega )\) such that:

  1. (1)

    \(u_n\in h_n([-1,1])\),

  2. (2)

    \(\lim _{n\rightarrow \infty }E(u_n)=E_0\) and

  3. (3)

    \(\lim _{n\rightarrow \infty }\Vert E'(u_n)\Vert =0,\)

has a convergent subsequence.

Remark 5.5

The following are standard situations in which Theorem 5.3 can be applied:

  1. (1)

    If \(u^-\) and \(u^+\) are both strict local minima of E in \(W^{1,2}(\Omega )\), then inequality (6) holds.

  2. (2)

    If \(u^-\) is a strict local minimum of E in \(W^{1,2}(\Omega )\) and there exists a ball \(B(u^-,\delta )\), for some \(\delta >0\), such that \(\inf _{W^{1,2}(\Omega ){\setminus } B(u^-,\delta )} E\le E(u^-)\), then there exists a \(u^+ \in W^{1,2}(\Omega ){\setminus } B(u^-,\delta )\) for which inequality (6) holds.

Lemma 5.6

(Deformation lemma) Let E, \(u^-\), \(u^+\) and \(E_0\) be as in Theorem 5.3. Assume that E does not admit degenerate critical points of index 0, and h is a family joining \(u^-\) with \(u^+\) which is optimal at \(u\in W^{1,2}(\Omega )\) with respect to E (see Definition 2.2). If \(E(u)=E_0\), then u is a critical point of E with Morse index equal to 1.

Proof

To see that u must be a critical point one uses the classical deformation argument from the mountain-pass theorem. It consists in pushing the path h in the direction of a \(v \in W^{1,2}(\Omega )\) such that \(E'(u)(v)<0\). We omit this part and refer to [23, Section 8.5], since the deformation argument we use later in this proof, to show that the Morse index cannot be greater than 1, uses a similar construction.

We prove now that the Morse index of u cannot be 0. Otherwise, u would be a strict local minimum of E in \(W^{1,2}(\Omega )\), since E does not admit degenerate critical points of index 0. In particular, for \(\delta >0\) small enough \(E(h(\delta ))>E(h(0))=E(u)\), which contradicts the assumption that h is optimal at u.

To reach a contradiction, assume now that the Morse index of u is greater than or equal to 2. In this case, we can construct a competitor family in \(\Gamma \) below the level \(E_0\) in the following way. Let \(v_1\) and \(v_2\) be the first two eigenfunctions of the linearization of Eq. (1) at u. By the Morse index assumption, we have \(E''(u)(v,v)<0\), for all nonzero \(v\in {\text {span}}(v_1,v_2)\). Let \(v=\alpha _1v_1+\alpha _2v_2\), where \(\alpha _1,\alpha _2\in {\mathbb {R}}\) will be chosen later. Define \(\gamma :{\mathbb {R}}\times [-1,1] \rightarrow W^{1,2}(\Omega )\) by

$$\begin{aligned}\gamma (s,t):=h(t) + sv, \quad (s,t) \in {\mathbb {R}}\times [-1,1].\end{aligned}$$

To finish the lemma, we just have to show that \(f(s,t)=E(\gamma (s,t))\) has a non-degenerate local maximum at \((s,t)=(0,0)\). If this is the case, then substituting a piece of h by a path going around (0, 0) would do the work.

To see that (0, 0) is a non-degenerate local maximum, notice that since \(\gamma (0,0)=h(0)=u\) is a critical point of E, the point (0, 0) is a critical point of f. Therefore, it is enough to show that, for the right choices of \(\alpha _1\) and \(\alpha _2\), the Hessian of f is negative definite at (0, 0). This is a simple computation:

$$\begin{aligned} \partial _{tt}^2 f(0,0)&=E''(u)(h'(0),h'(0))<0, \hbox { since } h \hbox { is optimal at } u,\\ \partial _{ss}^2 f(0,0)&=E''(u)(v,v)<0, \text { since }v\in {\text {span}}(v_1,v_2),\text { and}\\ \partial _{st}^2 f(0,0)&= E''(u)(h'(0),v) \\&=\int _{\Omega } \nabla h'(0) \nabla v + F''(u)h'(0)v\\&=\int _{\Omega } h'(0)[-\Delta v + F''(u)v] + \int _{\partial \Omega }h'(0)\partial _\nu v\\&=\int _{\Omega } h'(0)(\alpha _1\lambda _1 v_1 + \alpha _2\lambda _2 v_2) \\&=0, \end{aligned}$$

where we are using that \(\partial _\nu v=0\), since \(v_1\) and \(v_2\) are Neumann eigenfunctions with eigenvalues \(\lambda _1\) and \(\lambda _2\), and we are selecting \(\alpha _1\) and \(\alpha _2\), not both zero, so that the last integral is equal to 0. \(\square \)
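One admissible choice of the coefficients can be written down explicitly. The numbers \(a_1,a_2\) below are a notation introduced only for this sketch:

$$\begin{aligned} a_i=\lambda _i\int _{\Omega }h'(0)v_i \quad (i=1,2), \qquad (\alpha _1,\alpha _2)={\left\{ \begin{array}{ll}(a_2,-a_1) &{} \text{ if } (a_1,a_2)\ne (0,0),\\ (1,0) &{} \text{ otherwise. } \end{array}\right. } \end{aligned}$$

In both cases \(\alpha _1a_1+\alpha _2a_2=0\), which is the vanishing of the mixed derivative, and \(v=\alpha _1v_1+\alpha _2v_2\ne 0\) because \(v_1\) and \(v_2\) are linearly independent. The Hessian of f at (0, 0) is then diagonal with two negative entries, hence negative definite.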

We also recall the following standard result for bounded semilinear parabolic heat flows (see [16]):

Theorem 5.7

(Parabolic flow) Let \(u^{\pm }_t : [0,+\infty ) \rightarrow W^{1,2}(\Omega )\) be solutions to the parabolic equation

$$\begin{aligned} {\left\{ \begin{array}{ll}\partial _t u_t -\Delta u_t + f(u_t)=0 &{} \text { on } [0,+\infty )\times \Omega \\ \partial _\nu u_t =0 &{} \text { on } [0,+\infty )\times \partial \Omega . \end{array}\right. }\end{aligned}$$
(7)

Assume that \(u_t^-<u_t^+\), for all \(t\ge 0\) and \(\sup _{t\ge 0}\Vert u^{\pm }_t\Vert _{L^\infty (\Omega )}<+\infty \).

Then, given any \(v_0 \in W^{1,2}(\Omega )\), with \(u^-_0< v_0 < u^+_0\), there exists a unique solution \(v_t\) of (7) with initial condition \(v_0\) which exists for all \(t\ge 0\) and such that:

  1. (a)

    \(u^-_t< v_t < u^+_t\), for all \(t\ge 0\).

  2. (b)

    \(E(v_t)\) is strictly decreasing on \(t\ge 0\) (unless \(v_t\) is a solution to the stationary Eq. (1), in which case \(v_t\) is constant).

  3. (c)

    There exist a solution \(v_\infty \in W^{1,2}(\Omega )\) to the stationary Eq. (1) and a sequence of times \(t_k\rightarrow \infty \) such that \(v_{t_k}\rightarrow v_\infty \) strongly in \(W^{1,2}(\Omega )\).

The following lemma will allow us to construct general optimal families for unstable solutions:

Lemma 5.8

Let u be an unstable solution to (1) and \(\phi \) the first eigenfunction of the operator \(-\Delta +F''(u)\). Assume

  1. (1)

    \(u<c^+\) (resp. \(c^-<u\)).

Then, there exists \(\delta \in (0,1)\) and a continuous map \(h:[0,1]\rightarrow W^{1,2}(\Omega )\) (resp. \(h:[-1,0]\rightarrow W^{1,2}(\Omega )\)), such that

  1. (1)

    \(h(t)=u+t\phi \) for \(t\in [0,\delta ]\) (resp. \(t\in [-\delta ,0]\)),

  2. (2)

    \(\frac{d^2}{dt^2}E(h(t))|_{t=0}<0\),

  3. (3)

    \(E(u)=E(h(0))>E(h(t)),\) for all \(t\in (0,1]\) (resp. \(t\in [-1,0)\)),

  4. (4)

    h(1) (resp. \(h(-1)\)) is a constant which is a stable critical point of F.

Proof

We do the argument assuming \(u<c^+\). The case \(c^-<u\) is analogous.

Given \(\delta >0\), consider the smooth path \(h:[0,\delta )\rightarrow W^{1,2}(\Omega )\), given by \(h(t)=u+t\phi \). By our choice of \(\phi \), and since u is an unstable critical point of E, we have

  1. i.

    \(\frac{d}{dt}|_{t=0}E(h(t))=0\), and

  2. ii.

    \(\frac{d^2}{dt^2}E(h(t))|_{t=0}=\int _\Omega \phi (-\Delta \phi +F''(u)\phi )=-\lambda \int _\Omega \phi ^2<0\),

where \(\phi > 0\) everywhere on \(\Omega \) and \(\lambda >0\). Therefore, choosing \(\delta \in (0,1)\) small enough, we can assume \(E(u)=E(h(0))\) is a strict local maximum of E(h(t)) on \([0,\delta ]\), that \(\frac{d}{dt}|_{t=\delta }E(h(t)) \ne 0\), and that \(u<h(t)<c^+\), for all \(t\in [0,\delta ]\).
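The local behaviour of the energy along \(h(t)=u+t\phi \) can be read off from a second-order expansion. The following sketch assumes that E is of class \(C^2\) on \(W^{1,2}(\Omega )\), as in Theorem 5.3, and uses items i. and ii. above:

$$\begin{aligned} E(h(t))&=E(u)+tE'(u)(\phi )+\frac{t^2}{2}E''(u)(\phi ,\phi )+o(t^2)\\&=E(u)-\frac{\lambda t^2}{2}\int _\Omega \phi ^2+o(t^2), \qquad \frac{d}{dt}E(h(t))=-\lambda t\int _\Omega \phi ^2+o(t). \end{aligned}$$

Both \(E(h(t))<E(u)\) and \(\frac{d}{dt}E(h(t))\ne 0\) therefore hold for all sufficiently small \(t>0\).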

It remains to continuously extend the path from \(h(\delta )\) to a stable constant \(t^+\), with \(u<t^+\le c^+\), without increasing the energy above E(h(0)).

Let \(C^+=C^+(h(\delta ))\) be the subset consisting of all the \(v\in W^{1,2}(\Omega )\) such that

  • v is a solution to (1),

  • \(u\le v\le c^+\) and

there exists a continuous map \(h^+:[\delta ,1]\rightarrow W^{1,2}(\Omega )\) satisfying:

  • \(h^+(\delta )=h(\delta )\),

  • \(E(h(\delta )) \ge E(h^+(t))\), for \(t\in [\delta ,1]\) and

  • \(h^+(1)=v.\)

We want to show that \(C^+(h(\delta ))\ne \emptyset \) and that the minimum of E in \(C^+(h(\delta ))\) is attained by a stable solution.

We first prove that \(C^+\) is not empty. Since u and \(c^+\) are stationary solutions of (7) with \(u<h(\delta )<c^+\), it follows from Theorem 5.7 that there is a unique \(v\in C([0,\infty );W^{1,2}(\Omega ))\) which solves the parabolic equation (7) with initial condition \(h(\delta )\). By item c) of Theorem 5.7, for a large \(T^+\), \(v(T^+)\) is arbitrarily close in \(W^{1,2}(\Omega )\) to a solution \(v_\infty \) of (1). Item b) of Theorem 5.7 implies (noting that v(t) is not stationary, as \(h(\delta )\) is not a solution of (1)) that \(E(v_\infty ) < E(h(\delta ))\), and by continuity we can assume that \(v(T^+)\) belongs to a convex neighborhood of \(v_\infty \) in which \(E< E(h(\delta ))\). The path \(v:[0,T^+]\rightarrow W^{1,2}(\Omega )\) can be completed to a path arriving at \(v_\infty \), simply by joining \(v(T^+)\) to \(v_\infty \) with a straight line in \(W^{1,2}(\Omega )\). Finally, we reparametrize the concatenation of both paths over the interval \([\delta ,1]\). This shows \(v_\infty \) is in \(C^+\). We note that this yields \(\inf _{w \in C^+}E(w) \le E(v_\infty ) < E(h(\delta ))\).

Since the solutions of (1) in \(C^+\) are bounded between u and \(c^+\), classical Schauder estimates imply that, after passing to a subsequence, an energy minimizing sequence \(\{v_n\}_{n\in {\mathbb {N}}}\) in \(C^+\) must converge to some \(v_{\min } \in W^{1,2}(\Omega )\) which is also a solution of (1), with \(E(v_{\min })< E(h(\delta ))\). By arguing as before, for n large enough \(v_n\) is in a convex neighborhood of \(v_{\min }\) in which \(E< E(h(\delta ))\). Therefore, the path joining \(h(\delta )\) with \(v_n\) can be completed to a path arriving at \(v_{\min }\) through a straight line. This shows that \(v_{\min }\) also belongs to \(C^+\).

Finally, notice that \(v_{\min } \in C^+\) must be stable. Otherwise, we could repeat the argument and find a continuous path joining \(v_{\min }\) to a solution with energy strictly less than that of \(v_{\min }\). Connecting this path with the one joining \(h(\delta )\) with \(v_{\min }\), we would contradict that \(v_{\min }\) attains the minimum of E on the set \(C^+\). Therefore, \(v_{\min }\) must be stable and by Lemma 5.2 it is a constant equal to a local minimum of F and of E. \(\square \)

The proof of the following lemma is elementary and it is left to the reader.

Lemma 5.9

Let \(F:{\mathbb {R}}\rightarrow {\mathbb {R}}\) be a Morse function with finitely many critical points. Let \(c^+\) be the largest critical point of F and assume that \(c^+\) is a local maximum. Then, for any given \(M>c^+\), there exists a Morse function \(F^*:{\mathbb {R}}\rightarrow {\mathbb {R}}\) with finitely many critical points, such that:

  1. (1)

    \(F^*=F\) on \((-\infty ,M]\),

  2. (2)

    \(F^*\ge F\) on \((M,+\infty )\),

  3. (3)

    \(F^*\) only has one critical point \(c^*\) on the interval \((M,+\infty )\), which is a local minimum.

Remark 5.10

Similarly, if \(c^-\) is a local maximum, we can find \(F^*\) that coincides with F on the interval \([c^-,c^+]\) and that has at most one critical point on each side of this interval, which is a local minimum.

5.2 Proof of Theorem 2.1

By assumption F has at least one local maximum c. From \(F'(c)=f(c)=0\) and \(F''(c)<0\) and classical properties of the Laplacian operator, it is easy to check that the constant function \(u\equiv c\) is a solution to (1) and that for any constant direction \(a\ne 0\), we have \(E''(c)(a,a)<0\). Therefore, u is an unstable solution, i.e. the set of unstable solutions of (1) is not empty.

Take a sequence of unstable solutions \(\{u_n\}_{n\in {\mathbb {N}}}\) which is minimizing for E. From Corollary 3.6 and Corollary 4.2, after perhaps passing to a subsequence, these must converge to some u which is also a solution of (1). We just need to argue that u is also unstable. This is a consequence of the fact that the set of stable solutions is isolated, since these are strict local minimizers. In fact, by Lemma 5.2 a stable solution must be a constant equal to a local minimum of F (item iii)), and hence a non-degenerate local minimum of E (item i)).

5.3 Proofs of Theorems 2.3 and 2.4

Let u be a ground state of (1). We divide the proofs of these theorems into several cases depending on the hypothesis assumed on F. In each case, we first construct an optimal family for u, and then we prove u is the solution of a mountain pass problem. The cases are presented in such a way that information from earlier cases can be used in later cases.

Case 1: F satisfies (A1). In this case, \(c^\pm \) are both local minima of F. By the maximum principle we must have \(c^-<u<c^+\), so we can apply Lemma 5.8 twice to conclude there exist \(t^-\) and \(t^+\), which are local minima of E and F with \(t^-< u < t^+\), and a family h joining \(t^-\) with \(t^+\) optimal at u with respect to E. Moreover, \(c^-\le t^-< h(t) < t^+\le c^+\) for \(t \in (-1,1)\), in particular \(t^-< u=h(0) < t^+\). This follows from the construction of Lemma 5.8, for small |t|, and by the maximum principle for parabolic equations. This gives us the optimal family for this case.

Now, we check the hypotheses needed to apply the mountain pass theorem. By Item (1) in Remark 5.5, the inequality in the hypothesis of Theorem 5.3 holds for our choice of \(t^-\) and \(t^+\). Let \(E_0\) and \(\Gamma \) be as in the statement of Theorem 5.3. Given \(\{h_n\}_{n\in {\mathbb {N}}}\subset \Gamma \) we define its truncation to \([c^-,c^+]\) as \({\tilde{h}}_n(t)=\min (c^+,\max (c^-,h_n(t)))\). Since F is decreasing on \((-\infty ,c^-]\) and increasing on \([c^+,+\infty )\), it follows that after truncation the energy can only decrease, i.e. \(E({\tilde{h}}_n(t))\le E(h_n(t))\). In particular, there exists \(\{{\tilde{h}}_n\}_{n\in {\mathbb {N}}}\subset \Gamma \), such that \(E_0=\lim _{n\rightarrow \infty }\sup _{t\in [-1,1]}E({\tilde{h}}_n(t))\) and \(c^- \le {\tilde{h}}_n(t) \le c^+\). Since \(\{{\tilde{h}}_n\}_{n\in {\mathbb {N}}}\) is a family of bounded functions with bounded energy and F is bounded from below, it is easy to see that their Sobolev norm is bounded. Then, a simple application of the Rellich-Kondrachov Compactness Theorem gives us that the Palais-Smale condition holds along \(\{{\tilde{h}}_n\}_{n\in {\mathbb {N}}}\). We can apply Theorem 5.3 and conclude that there exists a critical point \(u_{0}\) of E, with \(E(u_{0})=E_0\). In addition, by Lemma 5.2, E does not admit degenerate stable critical points, therefore Theorem 5.3 implies \(u_{0}\) is unstable.
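The truncation step and the resulting Sobolev bound can be made explicit as follows. In this sketch, w stands for any \(h_n(t)\), \({\tilde{w}}=\min (c^+,\max (c^-,w))\) for its truncation, and \(m=\min _{[c^-,c^+]}F\); these symbols are introduced here only for illustration. Almost everywhere in \(\Omega \) one has

$$\begin{aligned} \nabla {\tilde{w}}={\left\{ \begin{array}{ll}\nabla w &{} \text{ on } \{c^-<w<c^+\},\\ 0 &{} \text{ elsewhere, } \end{array}\right. } \qquad F({\tilde{w}})\le F(w), \end{aligned}$$

where the second inequality uses the monotonicity of F outside \([c^-,c^+]\). Integrating gives \(E({\tilde{w}})\le E(w)\). Moreover,

$$\begin{aligned} \Vert {\tilde{w}}\Vert ^2_{W^{1,2}(\Omega )}=\int _\Omega |\nabla {\tilde{w}}|^2+{\tilde{w}}^2\le 2\big (E({\tilde{w}})-m|\Omega |\big )+\max \{|c^-|,|c^+|\}^2|\Omega |, \end{aligned}$$

so a uniform bound on the energies yields a uniform bound on the Sobolev norms.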

Finally, we must check that u is also a solution of the mountain pass problem described in the previous paragraph and has Morse index 1. In fact, u belongs to a family \(h \in \Gamma \), which implies \(E_0 \le E(u)\). On the other hand, u is a ground state and since \(u_0\) is unstable, we must have \(E(u)\le E(u_{0})=E_0\). We conclude that \(E(u)=E_0\) which is the mountain pass critical level. Finally, the existence of the family h optimal at u with respect to E, and Lemma 5.6, imply that u has Morse index 1.

This proves both Theorems 2.3 and 2.4 when F satisfies (A1).

Case 2: F satisfies (A2). If both \(c^-\) and \(c^+\) are stable we are in Case 1 above. Therefore, we can assume that \(c^-\) is stable and \(c^+\) is unstable (the remaining cases are similar and we comment on them at the end of this proof). Let \(M_0\) be the constant from Theorem 2.5. Then, \(u\le M_0\) and \(c^+\le M_0\). Fix \(M>M_0\), and let \(F^*\) and \(c^*\) be given by Lemma 5.9 for this choice of M. Let \(E^*(v)=\int _{\Omega }\frac{|\nabla v|^2}{2}+F^*(v)\). Since F and \(F^*\) coincide in the range of u, it follows that u is also an unstable critical point of \(E^*\). From the construction of \(F^*\) and Theorem 2.5, a function v is an unstable critical point of E if and only if it is an unstable critical point of \(E^*\). It follows that u is also a ground state for \(E^*\). Since \(F^*\) satisfies (A1), we are in the situation of Case 1, which implies Theorem 2.4 for F.

The first remaining case is: \(c^-\) unstable and \(c^+\) stable, which is symmetric to what we just did. Finally, there is the case in which both \(c^-\) and \(c^+\) are local maxima. To deal with this, we can use the estimates given by Theorem 2.5 to argue as in Lemma 5.9 also to the left of \(c^-\) (e.g. by applying it to \({\tilde{F}}(x)=F(-x)\) and then using \({\tilde{F}}^*\)). See Remark 5.10.

Case 3: F satisfies (A3). As before, we first construct a family which is optimal at u with respect to E. By the maximum principle \(c^-<u\). If \(u_{\max }<c^+\), then we can apply the exact same construction of Case 1, to obtain an optimal family joining \(t^-<t^+\), stable critical points of F, such that \(c^-\le t^-<u<t^+<c^+\). If instead \(c^-<c^+\le u_{\max }\), let \(t^-\) be the largest stable critical point of F such that \(t^-<u\). Choose M such that \(M>u_{\max }\ge c^+\) and \(F(t^-)\ge F(M)\), which exists by assumption (A3). Let \(F^*\) and \(c^*\) be given by Lemma 5.9 for this choice of M. Since \(u<M\) and \(F=F^*\) on \((-\infty ,M)\), it follows that u and \(t^-\) are critical points of \(E^*\) and \(t^-<u<c^*\). As in Case 1, applying Lemma 5.8 we obtain a family h joining \(t^-\) and \(c^*\) which is optimal at u with respect to \(E^*\). Since \(E^*\ge E\) by item (2) of Lemma 5.9 and \(E^*(u)=E(u)\), it follows that h is also optimal at u with respect to E.

Now we prove that u is a mountain pass critical point. By Lemma 4.1, E satisfies the Palais-Smale condition and by Remark 5.5 we have a mountain pass barrier in both cases considered in the previous paragraph. As in Case 1, applying Theorem 5.3 we obtain an unstable critical point at the same energy level as u. Once more, using Lemma 5.6, u has Morse index 1, and must be a mountain pass solution for paths joining \(t^-\) with \(t^+\), in the first case, and paths joining \(t^-\) with \(c^*\), in the second case. This finishes the proof of Theorem 2.3 when F satisfies (A3).

6 Symmetry of ground states

In this section we prove Theorem 2.7. We first discuss the symmetry of ground states assuming that \(\Omega \) is the unit sphere \(S^N\) in \({\mathbb {R}}^{N+1}\), and in the last subsection, the axial symmetry for ground states in the unit ball \(B_1^{N}\) in \({\mathbb {R}}^{N}\), following [5].

6.1 Symmetrization in the sphere

Let us briefly describe our strategy. Fix \(z_{0}\) in \(S^{N}\). We consider a continuous path h in \(W^{1,2}\) which is optimal for the min-max problem at a ground state u, in the sense of Definition 2.2. The symmetrization of this path with respect to \(z_0\), say \(h^*(t)\), is a new path with the same endpoints, and it satisfies \(E(h^*(t))\le E(h(t))\). On the other hand, the optimality of the path forces \(u=h(0)=h^*(0)\), provided we are able to ensure that the symmetrized path is continuous, which is one of the key issues that must be addressed.

One of the most natural notions of symmetrization in \(S^N\) is the symmetric decreasing rearrangement, which provides a radially symmetric and decreasing function \(u^*:S^N \rightarrow {\mathbb {R}}\), i.e. it can be written as a decreasing function of d(y), and such that \(\{u>t\}\) and \(\{u^*>t\}\) have the same measure.

Given a Borel function \(u: S^{N}\rightarrow {\mathbb {R}}\), the associated distribution function \({\mathcal {V}}_{u}: {\mathbb {R}}\rightarrow [0,\beta _N]\) is defined by

$$\begin{aligned} {\mathcal {V}}_{u}(s) = |\{y \in S^{N}: u(y)>s\}|. \end{aligned}$$
(8)

Denote by \(R_{u}(s)\) the unique nonnegative real number such that

$$\begin{aligned} |B_{R_{u}(s)}| = {\mathcal {V}}_{u}(s), \end{aligned}$$

i.e., such that any geodesic ball of radius \(R_{u}(s)\) has volume \({\mathcal {V}}_{u}(s)\). The symmetric decreasing rearrangement (or simply the symmetrization) of u, with respect to \(z_{0} \in S^{N}\), is the function \(u^{*}: S^{N} \rightarrow {\mathbb {R}}\), defined by

(9)

We remark that, inasmuch as

and \(R_{u}\) is nonincreasing, we may express \(u^*\) explicitly as

$$\begin{aligned} u^{*}(y) = \sup \left\{ s: {\mathcal {V}}_{u}(s)>|B_{d(y)}(z_{0})|\,\right\} . \end{aligned}$$

The main properties of the symmetrization are summarized below.

Proposition 6.1

Let \(\Omega =S^N\) and let \(u \in L^1(\Omega )\).

  1. (1)

    For any \(y,y'\in \Omega \),

    $$\begin{aligned} d(y) \le d(y') \implies u^*(y)\ge u^*(y'). \end{aligned}$$

    In particular, \(u^*\) is radially symmetric.

  2. (2)

    For any Borel function \(\Phi : {\mathbb {R}}\rightarrow {\mathbb {R}}\), if \(\Phi (u) \in L^1(\Omega )\), then \(\Phi (u^*) \in L^1(\Omega )\), and it holds

    $$\begin{aligned} \int _{\Omega }\Phi (u^*) = \int _{\Omega }\Phi (u). \end{aligned}$$
  3. (3)

    If \(u \in W^{1,p}(\Omega )\) for some \(1 \le p \le \infty \), then \(u^* \in W^{1,p}(\Omega )\), and it holds

    $$\begin{aligned} \Vert \nabla u^*\Vert _{p} \le \Vert \nabla u\Vert _{p}. \end{aligned}$$
  4. (4)

    Assume that \(u \in W^{1,p}(\Omega )\) for some \(1<p<\infty \) and that the set \(\{x \in \Omega : \nabla u(x) = 0 \}\) has zero N-dimensional Hausdorff measure. If

    $$\begin{aligned} \Vert \nabla u^*\Vert _p = \Vert \nabla u\Vert _p, \end{aligned}$$

    then, there is an isometry T of \(S^N\) such that \(u \circ T=u^*\).

Proof

The first property follows directly from the definition of \(u^{*}\). The proof of (2) can be found in [2], see Proposition 1.18. Finally, the proof of (3) and (4) is due to Brothers and Ziemer [11]. \(\square \)

As a consequence, we see that

$$\begin{aligned} E(u^*) = \int _\Omega \frac{|\nabla u^*|^2}{2} + F(u^*) \le \int _\Omega \frac{|\nabla u|^2}{2} + F(u) = E(u) \end{aligned}$$
(10)

for any \(u \in W^{1,2}(\Omega )\). Moreover, \(E(u^*)=E(u)\) if and only if \(\Vert \nabla u\Vert _2 = \Vert \nabla u^*\Vert _2\). If u is a non-constant solution to (1), then \(\{\nabla u = 0\}\) has zero \({\mathcal {H}}^N\) measure (see [26]), so from Item (4) above, we get \(u=u^*\), provided the equality \(\Vert \nabla u\Vert _2 = \Vert \nabla u^*\Vert _2\) holds.

Let h be the optimal path constructed in the proof of Theorem 2.4 (see Sect. 5.3), and consider its symmetrization \({h^*:[-1,1] \rightarrow W^{1,2}(S^N)}\) given by \(h^*(t) = (h(t))^*\). Even though \(E(h^*(t)) \le E(h(t))\) and \(h^*( \pm 1) = h(\pm 1)\), it is not straightforward that \(h^*\) is continuous, as \(u \mapsto u^*\) is not a continuous map on \(W^{1,2}(S^N)\). In fact, the continuity problem for the symmetrization was studied in \({\mathbb {R}}^N\) by Almgren and Lieb [1]. On one hand, they proved that the symmetrization is continuous as a map in \(W^{1,2}\), exactly at functions that satisfy a condition called co-area regularity, which is met for any \(C^{N-1,1}_{\mathrm {loc}}\) function in \({\mathbb {R}}^N\) (see [1, Theorem 5.2]). On the other hand, they constructed a dense set of functions, all in \(C^{N-1,\lambda }_{\mathrm {loc}}({\mathbb {R}}^N)\), which are not co-area regular.

We believe it is possible to extend the results of Almgren and Lieb to the symmetrization in \(S^N\). Since the optimal path h in Theorem 2.4 is constructed using eigenfunctions of the stability operator and a parabolic flow, the function h(t) is smooth (hence co-area regular) for every t, so it is natural to expect that the symmetrized path \(h^*\) is continuous. However, we have opted for a more economical approach, using a simpler notion of rearrangement called polarization. The relevant properties of polarization are that it approximates the symmetrization and at the same time is continuous in \(W^{1,2}\). The use of polarizations to prove symmetry of solutions of partial differential equations has appeared in a number of works (see [8, 40, 41, 45]).

6.2 Polarization

Recall that we are in the case \(\Omega =S^N\). Denote by \({\mathcal {H}}\) the family of closed half-spaces H of \({\mathbb {R}}^{N+1}\), such that \(0 \in \partial H\). For \(H\in {\mathcal {H}}\) consider the reflection \(\sigma _{H}: {\mathbb {R}}^{N+1}\rightarrow {\mathbb {R}}^{N+1}\) with respect to the hyperplane \(\partial H\). By fixing \(H\in {\mathcal {H}}\) we can define the polarization of a function \(u: \Omega \rightarrow {\mathbb {R}}\) with respect to the hyperplane \(\partial H\) as the function \(u_H:\Omega \rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} u_{H}(x) = \left\{ \begin{array}{lcc} \max \{u(x), u(\sigma _{H}(x))\},\quad x\in \Omega \cap H \\ \min \{u(x), u(\sigma _{H}(x))\},\quad x \in \Omega \backslash H \end{array} \right. \end{aligned}$$
(11)

The polarization compares the values of u on both sides of \(\partial H\), and keeps the larger value in H. We will denote by \({\mathcal {H}}_* \subset {\mathcal {H}}\) the set of closed half-spaces H for which \(z_0 \in H\). The following proposition gathers some useful properties of the polarization which will be used later in the paper.

Proposition 6.2

Let \(\Omega = S^N\) and let \(H \in {\mathcal {H}}\).

  1. (1)

    For any Borel function \(\Phi : {\mathbb {R}}\rightarrow {\mathbb {R}}\), if \(\Phi (u) \in L^1(\Omega )\), then \({\Phi (u_H) \in L^1(\Omega )}\), and it holds

    $$\begin{aligned} \int _{\Omega }\Phi (u_H) = \int _{\Omega }\Phi (u). \end{aligned}$$
  2. (2)

    If \(u \in W^{1,p}(\Omega )\) for some \(1 \le p \le \infty \), then \(u_H \in W^{1,p}(\Omega )\), and it holds

    $$\begin{aligned} \Vert \nabla u_H\Vert _{p} = \Vert \nabla u\Vert _{p}. \end{aligned}$$

These properties can be easily proved by noting that each \(\sigma _H\) is an isometry, and that \(u_H\) can be written explicitly in terms of u and \(u\circ \sigma _H\), see e.g. [2, Section 3.3 and Chapter 7].
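The definition (11) and item (1) of Proposition 6.2 can be illustrated on a discretized circle. The following is a minimal sketch, assuming \(N=1\), the half-space \(H=\{x_2 \ge 0\}\), and a sample grid that is symmetric under \(\sigma _H\); the grid, the test function `u` and the integrand `phi` are illustrative choices, not taken from the text.

```python
import math

def polarize(u):
    """Polarize samples of u on S^1 with respect to H = {x_2 >= 0}.

    u[i] is the value at the angle theta_i = 2*pi*(i + 0.5)/n with n even,
    so the reflection (x_1, x_2) -> (x_1, -x_2) maps sample i to sample
    n - 1 - i, and the samples with i < n/2 lie in H.
    """
    n = len(u)
    assert n % 2 == 0, "use an even number of samples"
    out = [0.0] * n
    for i in range(n):
        j = n - 1 - i                  # index of the reflected point
        if i < n // 2:                 # x in H: keep the larger value
            out[i] = max(u[i], u[j])
        else:                          # x outside H: keep the smaller value
            out[i] = min(u[i], u[j])
    return out

n = 64
theta = [2 * math.pi * (i + 0.5) / n for i in range(n)]
u = [math.sin(3 * t) + 0.5 * math.cos(t + 1.0) for t in theta]  # arbitrary test function
u_H = polarize(u)

phi = lambda s: s ** 4 - s ** 2        # arbitrary integrand Phi
sum_before = sum(phi(s) for s in u)    # discrete analogue of the integral of Phi(u)
sum_after = sum(phi(s) for s in u_H)   # discrete analogue of the integral of Phi(u_H)
```

On such a grid the polarization only swaps values within each pair \(\{x, \sigma _H(x)\}\), which is the discrete reason behind the identity \(\int \Phi (u_H) = \int \Phi (u)\). The sketch does not address item (2), which concerns gradients.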

As discussed above, another key feature of the polarization is

Proposition 6.3

For any closed half-space \(H \in {\mathcal {H}}\), the map \(u \in W^{1,p}(S^{N}) \mapsto u_H \in W^{1,p}(S^{N})\) is continuous.

We now discuss the connection between symmetrization and polarization which will be used in the proof of Theorem 2.7. First, we note that

$$\begin{aligned} u = u^* \iff u=u_H, \quad \text{ for } \text{ all } \quad H \in {\mathcal {H}}_*. \end{aligned}$$

In fact, if \(H \in {\mathcal {H}}_*\) and \(u=u_H\), then \(u \ge u \circ \sigma _H\) in \(\Omega \cap H\), and \(u \le u \circ \sigma _H\) in \(\Omega {\setminus } H\). If this holds true for any \(H \in {\mathcal {H}}_*\), then one readily checks that u is radially symmetric and decreasing in the radial direction, so that \(u=u^*\).

Finally, we will use the fact that one may approximate the symmetric decreasing rearrangement \(u^*\) by a sequence of polarizations strongly in \(L^2\) norm.

Theorem 6.4

([44]) Let \(\Omega = S^N\). There exists a sequence \(\{H_k\}\) in \({\mathcal {H}}_*\) such that, for any \(u \in W^{1,2}(\Omega )\), if \(\{u_k\} \subset W^{1,2}(\Omega )\) is defined by \(u_0=u\) and

$$\begin{aligned} u_{k+1} = (u_k)_{H_1,\ldots ,H_{k+1}}, \end{aligned}$$

then \(\Vert u_k - u^*\Vert _{L^2(\Omega )} \rightarrow 0\) and \(u_k \rightharpoonup u^*\) in \(W^{1,2}(\Omega )\).

Proof of Theorem 2.7 (2)

Without loss of generality, we may assume \(S^N_R = S^N\). We follow the notation introduced in Sect. 5.1. Let u be an unstable solution of least energy. By the proof of Theorem 2.4, there exists \(h \in \Gamma \) such that \(h(0) = u\), and

$$\begin{aligned} E(h(t)) < E(u) = E_1 = \inf _{\gamma \in \Gamma } \sup _{s \in [-1,1]} E(\gamma (s)), \ \text{ for } \text{ all } \ t \in [-1,1]{\setminus }\{0\}. \end{aligned}$$

Given a half-space \(H\in {\mathcal {H}}_*\), let

$$\begin{aligned} h_{H}(t): = (h(t))_{H}, \end{aligned}$$

be the polarization of the path h with respect to H. By Propositions 6.2 and 6.3, we have \(h_{H} \in \Gamma \) and \(E(h_{H}(s)) = E(h(s))\), for all \(s \in [-1,1]\). Consequently,

$$\begin{aligned} E(u_{H}) = \sup _{s\in [-1,1]} E(h_{H}(s)) = \sup _{s\in [-1,1]} E(h(s)) = E_1 \end{aligned}$$
(12)

We claim that \(h_{H}(0) = u_{H}\) is also a solution of (3). Indeed, if that is not the case, we can construct a new path \({\bar{h}} \in \Gamma \) by perturbing the original path in a neighborhood of \(u_{H}\) in the direction of a function \(\phi \) such that \(\langle E'(u_{H}), \phi \rangle < 0\). Then, there exists a small \(\delta >0\) such that \(E({\bar{h}}(t)) < E(h_{H}(t))\) for all \(|t|<\delta \), and \(E({\bar{h}}(t)) \le E(h_{H}(t))\) for all \(t \in [-1,1]\), hence \(\sup E({\bar{h}}) < \sup E(h) = E_1\), which contradicts (12).

By Theorem 6.4, there exists a sequence of closed half-spaces \(\{H_{k}\}_{k}\) such that the sequence \(\{u_k\}\) defined by \(u_0 = u\) and \(u_{k+1} := (u_{k})_{H_{1}\cdots H_{k+1}}\) converges to the symmetrization \(u^{*}\) strongly in \(L^{2}(S^{N})\). By the arguments above, we see that each \(u_{k}\) is an unstable solution of (1), with energy \(E_1\) and constant \(W^{1,2}\) norm. By the compactness of the solutions, after possibly passing to a subsequence, \(u_{k}\) converges to a solution \({\bar{u}} \in W^{1,2}(S^{N})\) strongly in \(W^{1,2}(S^{N})\). Therefore, \({\bar{u}} = u^{*}\) and

$$\begin{aligned} E(u^{*}) = \lim _{k\rightarrow \infty }E(u_{k}) = E(u). \end{aligned}$$

Since \(\int _{S^{N}} F(u^{*}) = \int _{S^{N}}F(u)\), we get \(\int _{S^{N}}|\nabla u^{*}|^{2} = \int _{S^{N}}|\nabla u|^{2}\). Since u is nonconstant, the set \(\{x \in S^N : \nabla u(x) = 0\}\) has zero measure (see [26]). By Proposition 6.1, we see that u and \(u^{*}\) agree, up to ambient isometries, and hence u is radially symmetric. \(\square \)

Remark 6.5

We believe that in the proof above \(u=u_{H}\) should hold for any H, which would imply the rotational symmetry and monotonicity of u. To achieve this, after proving that \(u_H\) is a solution, one could try to follow the steps in [5], where it is done for domains in \({\mathbb {R}}^N\) (see also [41, 45]). In fact, in Sect. 6.3, we use this approach to deal with the case of the Euclidean ball. These results rely on the strong maximum principle. The method we present above for the case of the sphere is slightly different and relies on the unique continuation property and the rigidity of the Pólya–Szegő inequality (see Proposition 6.1).

6.3 Symmetrization in the unit ball

Consider now the case \(\Omega = B_1^{N}\). In this case, we do not expect that ground states of (1) have radial symmetry, so we need to introduce a different type of symmetrization.

Our geometric motivation is the connection between Neumann solutions to the Allen–Cahn equation (3) and free boundary minimal hypersurfaces. Solutions whose nodal sets accumulate on such minimal hypersurfaces (satisfying a nondegeneracy condition) were constructed by Pacard and Ritoré in [39]. Conversely, for families of solutions with bounded energy as \(\varepsilon \downarrow 0\), the energy density accumulates on a (possibly singular) minimal hypersurface (see also [38] for the case of Neumann solutions). Since least area free boundary minimal hypersurfaces in \(B_1^N\) are flat equatorial disks, it seems reasonable to expect that the ground states in \(B_1^N\) inherit this symmetry.

Consequently, they are expected to be foliated Schwarz symmetric: this means that they are axially symmetric with respect to the axis generated by some \(z_0 \in S^{N-1}\), and decreasing with respect to the polar angle from this axis. Similar symmetry results were proved in [5] and [45], for Dirichlet solutions, and in [41] for Neumann solutions and a sublinear potential with a unique critical point.

Proof of Theorem 2.7 (1)

As mentioned in Remark 6.5 above, once we characterize any ground state u as a mountain pass type solution, one proves that its polarization \(u_H\) is a solution for any half-space H. Using the argument of [5] (see Lemma 2.5 and Theorem 2.6), we see that \(u=u_H\) for any half-space containing \(z_0 \in S^{N-1}\), where \(z_0 = \frac{x_0}{|x_0|}\) and \(x_0 \in B_1^N{\setminus } \{0\}\) is such that

$$\begin{aligned} u(x_0) = \max \{u(x) : x \in B_1^N, \ |x|=|x_0|\}. \end{aligned}$$

This proves that u is foliated Schwarz symmetric (with respect to \(z_0\)), and finishes the proof of Theorem 2.7 (1). \(\square \)

Remark 6.6

We believe that the proof of Theorem 2.7 for the sphere can be adapted to the case of the Euclidean ball as well. The corresponding rearrangement notion is called cap symmetrization [2, §7.5], and it is defined as the symmetric decreasing rearrangement in each sphere in \(B_1^N\) centered at the origin, see also [48]. In this case, one still needs to show that the solution is nonradial and derive a rigidity statement similar to item (4) in Proposition 6.1 for cap symmetrization under some condition on the nodal set of \(\nabla u\) in each sphere. A related rigidity result was obtained for Steiner symmetrization in [18] (on the other hand, see [48, Example 5.5]).

Remark 6.7

The results of [47] imply that for \(f(t) = t-|t|^{p-1}t\), for \({1<p<\frac{N+2}{N-2}}\) (or \(p>1\) for \(N=1,2\)), ground states are odd with respect to the hyperplane orthogonal to the axis of symmetry. This implies that \(\{u=0\}\) is precisely \(\{x \in B_1^N : x_N=0\}\). We expect the same to be true for the Allen–Cahn equation. In the next section, we discuss the analogous result in the case of \(S^N\).

7 The case of symmetric double well potentials on \(S^N\)

We now turn to study the case of symmetric multiple well potentials on \(S^N\) in more detail, aiming to prove Theorem 2.8. In addition to our previous assumptions (A1) and (D) we assume from here on that the right hand side in (1) satisfies

  1. (B)
    1. (i)

      \(f(t)=-f(-t)\) for any \(t \in {\mathbb {R}}\).

    2. (ii)

      \(-f(t)/t\) is decreasing in \(t \ge 0\).

Note that these conditions imply that \(t=0\) is the only local maximum of F, which is then a double well potential – its critical points are precisely \(\{c^-<0<c^+\}\). Moreover, (B) holds in particular for the standard double well potential \(W(u)/\varepsilon =(1-u^2)^2/4\varepsilon \), which yields the (parameter dependent) Allen–Cahn equation:

$$\begin{aligned} \varepsilon ^2\Delta u - W'(u)=0 \end{aligned}$$
(13)
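As a quick sanity check of assumption (B) for the standard double well potential, the sketch below takes \(\varepsilon = 1\) (an assumption made only to keep the example short) and tests the oddness of \(f=W'\) and the monotonicity of \(-f(t)/t\) on a hypothetical sample grid.

```python
def W(u):
    """Standard double well potential W(u) = (1 - u^2)^2 / 4 (with epsilon = 1)."""
    return (1.0 - u * u) ** 2 / 4.0

def f(t):
    """f = W', computed by hand: W'(t) = t^3 - t."""
    return t ** 3 - t

ts = [0.05 * k for k in range(1, 61)]          # sample points in (0, 3]

# (i): f is odd.
is_odd = all(abs(f(t) + f(-t)) < 1e-12 for t in ts)

# (ii): -f(t)/t = 1 - t^2 is decreasing for t > 0.
ratio = [-f(t) / t for t in ts]
is_decreasing = all(a > b for a, b in zip(ratio, ratio[1:]))

# Consistency check of the hand-computed derivative by central differences.
step = 1e-5
max_fd_error = max(abs((W(t + step) - W(t - step)) / (2 * step) - f(t)) for t in ts)
```

The zeros of `f` are \(-1, 0, 1\), matching the critical points \(\{c^-<0<c^+\}\) with \(c^{\pm } = \pm 1\).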

By Theorem 5.1, any nonconstant solution of a semilinear elliptic PDE on a compact manifold with positive Ricci curvature is unstable. In particular, if a least energy solution is nonconstant, then Theorem 2.3 implies that it is a min-max solution with Morse index 1, and there is an optimal path joining this solution to the stable (constant) solutions \(c^{\pm }\) of (1).

On the other hand, as noted in [27], for large \(\varepsilon \), the only solutions of (13) are the constant solutions. The proof is based on the following classical result, which also holds for solutions to the parameter dependent version of (1) under the assumptions on f stated above.

Theorem 7.1

([10]) Let \(\Omega \subset M\) be a domain with nonempty smooth boundary. The boundary value problem

$$\begin{aligned} \left\{ \begin{array}{rl} \Delta u- f(u) = 0 &{} \text{ in } \Omega \\ u > 0,&{} \text{ in } \Omega \\ u = 0,&{} \text{ on } \partial \Omega .\\ \end{array}\right. \end{aligned}$$

has at most one solution. Moreover a solution exists if, and only if, \(\lambda _1(\Omega )<-f'(0)\).

Consider now the case \(\Omega =S^{N}\), endowed with the round metric of constant curvature 1. It follows from the result above and the Faber-Krahn inequality that if (3) has a nonconstant solution, then

$$\begin{aligned} N=\lambda _1(S^N_+) < -f'(0), \end{aligned}$$

where \(S^{N}_+ = S^{N} \cap \{x \in {\mathbb {R}}^{N+1} : x_{N+1} >0\}\). Since \(\lambda _1(S^N_+)=\lambda _1(S^N)\), this condition coincides with the inequality stated in Corollary 2.6.

Conversely, if this inequality holds, then we can find a solution of (3) on \(S^{N}\) whose nodal set is the equator \(S^{N-1} \subset S^{N}\) by putting \(u(x) = u_+(x)\), if \(x \in S^{N}_+\), and \(u(x) = -u_+(-x)\), otherwise, where \(u_+(x)\) is the unique positive solution of the problem above on \(\Omega =S^{N}_+\). Furthermore, the uniqueness part of Theorem 7.1 guarantees that \(u_+\), and hence u, are radially symmetric.

We now collect some consequences of Theorem 7.1 and the maximum principle for rotationally symmetric solutions. Before we start, it is useful to note that assumption (ii) on f above implies in particular that

$$\begin{aligned} f(\theta t) \le \theta f(t) \end{aligned}$$

for \(t \ge 0\) and \(\theta \in (0,1)\). This in turn implies that if u is a non-negative solution to (1) then \(\theta u\) is a subsolution.
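For the reader's convenience, the step from (ii) to the subsolution property can be spelled out as follows. This is a sketch under the stated assumptions, for \(t>0\), \(\theta \in (0,1)\) and a solution \(u \ge 0\) of \(\Delta u = f(u)\); it uses only that \(f(t)/t\) is nondecreasing on \(t>0\), which is a restatement of (ii).

```latex
$$\begin{aligned}
&\frac{f(\theta t)}{\theta t} \le \frac{f(t)}{t}
  \quad \Longrightarrow \quad f(\theta t) \le \theta f(t),
  \qquad t>0, \ \theta \in (0,1), \\
&\Delta (\theta u) - f(\theta u) = \theta f(u) - f(\theta u) \ge 0
  \quad \text{wherever } u > 0 .
\end{aligned}$$
```

At points where \(u=0\) both sides vanish, since \(f(0)=0\) by (i).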

Lemma 7.2

Assume that F satisfies (A1) and (B). Let \(u :S^{N} \rightarrow {\mathbb {R}}\) be a non-constant solution to (1), which is rotationally symmetric about the \(e_{N+1}\)-axis and has connected nodal set. Then u is odd under the reflection at the hyperplane \(\{x_{N+1}=0\}\).

Proof

First note that the assertion follows from Theorem 7.1 once we know that \(\{u=0\}=\{x_{N+1}=0\}\), since u is a solution if and only if \(-u\) is a solution thanks to the assumption on f. Moreover, note that Theorem 2.7 implies that the nodal set \(\{u=0\}\) has to be connected, since u is monotone.

Let us assume now that \(\{u=0\} \ne \{x_{N+1}=0\}\). By the rotational symmetry of u it follows that \(\{u=0\}\) is contained in the interior of a hemisphere. In particular, there is a connected component \(\Omega \) of \(S^{N} {\setminus } \{u = 0\}\) such that \({\bar{\Omega }}\) is contained in the interior of a hemisphere. We can thus find some rotation \(R \in SO(N+1)\) such that

$$\begin{aligned} \Omega \cap R(\Omega ) = \emptyset \ \text {and} \ \partial \Omega \cap \partial (R (\Omega )) = \{z\} \end{aligned}$$

for some point \(z \in S^{N}\). We write \(\Omega ' = R (\Omega )\). By our assumption on the potential, we then have functions \(u,v :\Omega ' \rightarrow {\mathbb {R}}\) both solving (1) and such that \(u \ge 0\) in \(\Omega '\), \(u=0\) along \(\partial \Omega '\), and \(v > 0 \) in \({\bar{\Omega }}' {\setminus } \{z\}\), \(v(z)=0\). (Here we use that \({\bar{\Omega }}\) is contained in the interior of a hemisphere.) But this can be seen to be impossible using the maximum principle by an argument similar to that in [30, Corollary 7.4].

Here are the details. Since \(v >0\) in \({\bar{\Omega }}' {\setminus } \{z\}\) and \(\partial _\nu v (z) > 0\) for \(\nu \) the outward pointing normal of \(\Omega '\), we can choose \(\theta < 1\) such that

$$\begin{aligned} \theta u < v \ \text {in} \ {\bar{\Omega }}' {\setminus } \{z\}. \end{aligned}$$
(14)

As explained above, the assumptions on f imply that \(\theta u\) is a subsolution in \(\Omega '\). Since v is a solution, it follows from the maximum principle, that (14) continues to hold for \(\theta = 1\). (Note that we can not have \(\theta u = v\) for any \(\theta \) by construction.) At this stage an application of the Hopf boundary lemma implies that

$$\begin{aligned} \partial _\nu u(z) < \partial _\nu v(z) \end{aligned}$$

which is impossible by construction. \(\square \)

8 Integrability of the kernel and energy gap

In this section we prove Theorems 2.9 and 2.10. This follows from some fairly standard arguments once we have obtained the integrability of the kernel at rotationally symmetric, index 1 solutions. This is established in the following subsection.

8.1 The kernel at a rotationally symmetric solution

The goal of this subsection is to show that any function in the kernel of the linearized operator at a rotationally symmetric solution u with index 1 and nodal set the equator, is generated by an ambient isometry, i.e. the kernel of the linearized operator is integrable.

We keep making the assumptions (A1) and (B) from the preceding section. It is useful to note that the second assumption from (B) implies that

$$\begin{aligned} tf'(t)-f(t) \ \text {has a sign for} \ t>0. \end{aligned}$$

Assume now that u is a solution to (1) and denote by

$$\begin{aligned} L_u=-\Delta + f'(u) \end{aligned}$$

the linearized operator at u. If X is a Killing vector field on \(S^{N}\), then it is generated by a rotation with axis around some vector \(v \in S^{N}\). The function \(\phi _X=\langle X , \nabla u \rangle \) clearly lies in the kernel of \(L_u\). When u is rotationally symmetric, say about the \(e_{N+1}\)-axis, and odd under the reflection at the hyperplane \({\{x_{N+1}=0\}}\), the nodal set

$$\begin{aligned} \{ \phi _X =0\} = S_X, \end{aligned}$$

where \(S_X\) is an equator containing v and \(e_{N+1}\), provided \(\phi _X \ne 0\). The latter is equivalent to \(v\ne \pm e_{N+1}\).

Proposition 8.1

Let u be a non-constant, rotationally symmetric solution to (1) with index 1 and nodal set the equator. Then we have that

$$\begin{aligned} \dim \ker L_u = N. \end{aligned}$$

Proof

As above, after a rotation, we may assume that \(\{u=0\} = \{x_{N+1}=0\}\), i.e. u is rotationally symmetric about the \(e_{N+1}\)-axis and odd with respect to the reflection at the hyperplane \(\{x_{N+1}=0\}\). For \(v \in S^{N}\) denote by \(r_v :S^{N} \rightarrow S^{N}\) the reflection at the hyperplane \(v^\perp \), i.e.

$$\begin{aligned} r_v(x) = x - 2 \langle x , v \rangle v. \end{aligned}$$

For simplicity we write \(r_{j}=r_{e_j}\) for the reflection at the hyperplanes \(\{x_j=0\}\). Note that the function \(f'(u)\) is invariant under \(r_v\) if \(v \in S^{N-1} = S^{N} \cap \{x_{N+1} =0 \}\), since u is rotationally symmetric about the \(e_{N+1}\)-axis, or if \(v=e_{N+1}\), since u is odd with respect to \(r_{N+1}\) and \(f'\) is even. Since each \(r_v\) is also an isometry of \(S^{N}\), we find that each \(r_v\) commutes with \(L_u\), in particular it acts on \(\ker L_u\). Moreover, each \(r_v\) is an involution. In particular, we can decompose \(\ker L_u\) into the \(\pm 1\) eigenspaces of \(r_v\).
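The elementary properties of \(r_v\) used here (it preserves the sphere, it is an involution, and for \(v \perp e_{N+1}\) it fixes the last coordinate, so that functions depending only on \(x_{N+1}\) are invariant) can be checked directly. The following is a minimal sketch for \(N=3\) with hypothetical sample vectors.

```python
import math

def reflect(x, v):
    """Reflection r_v(x) = x - 2 <x, v> v at the hyperplane orthogonal to the unit vector v."""
    dot = sum(a * b for a, b in zip(x, v))
    return [a - 2.0 * dot * b for a, b in zip(x, v)]

def norm(x):
    return math.sqrt(sum(a * a for a in x))

x = [0.1, -0.7, 0.5, 0.5]      # a point of S^3 (squared entries sum to 1)
v = [0.6, 0.8, 0.0, 0.0]       # a unit vector in S^3 with v_4 = 0
e4 = [0.0, 0.0, 0.0, 1.0]

y = reflect(x, v)              # approximately (0.7, 0.1, 0.5, 0.5)
back = reflect(y, v)           # applying r_v twice returns x
z = reflect(x, e4)             # r_{N+1} flips the sign of the last coordinate
```

In particular, a function that is rotationally symmetric about the \(e_4\)-axis takes the same value at `x` and `y`, while an odd function of \(x_4\) changes sign between `x` and `z`.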

Claim 5

The \(-1\) eigenspace of \(r_{N+1}\) acting on \(\ker L_u\) is trivial.

Proof of Claim 5

Suppose we have \( \phi \in \ker L_u\) such that

$$\begin{aligned} \phi \circ r_{N+1}= - \phi . \end{aligned}$$

This implies that \(\phi = 0\) along \(\partial S^{N}_+\), where \(S^{N}_+=\{x \in S^{N} \ : \ x_{N+1}>0\}\).

Now, we can multiply the Eq. (1) by \(\phi \) on \(S^{N}_+\) and integrate by parts to obtain

$$\begin{aligned} \int _{S^{N}_+} \nabla u \nabla \phi = - \int _{S^{N}_+} \phi \Delta u = - \int _{S^{N}_+} f(u) \phi , \end{aligned}$$

since the boundary term vanishes as \(\phi = 0\) along \(\partial S^{N}_+\). Similarly, we can use the equation for \(\phi \) to find

$$\begin{aligned} \int _{S^{N}_+} \nabla u \nabla \phi = - \int _{S^{N}_+} u \Delta \phi = - \int _{S^{N}_+} f'(u) u \phi , \end{aligned}$$

using that \(u=0\) along \(\partial S^{N}_+\). Combining both of these we arrive at

$$\begin{aligned} \int _{S^{N}_+} (f'(u) u - f(u)) \phi =0 \end{aligned}$$
(15)

By our normalization of u we have that \(u > 0\) in \(S^{N}_+\), hence the second assumption on f implies that also

$$\begin{aligned} f'(u) u - f(u) > 0 \ \text {in} \ S^{N}_+. \end{aligned}$$
(16)

On the other hand, since u is assumed to have index 1, \(\phi \) is a second eigenfunction of \(L_u\). By Courant’s nodal domain theorem this implies that \(S^{N} {\setminus } \{ \phi =0\}\) has precisely two connected components. As remarked above we also know that \(\phi =0\) along \(\partial S^{N}_+\). Combining the last two pieces of information we find that

$$\begin{aligned} \phi \ge 0 \ \text {in} \ S^{N}_+ \end{aligned}$$
(17)

up to multiplying \(\phi \) by \(-1\). Combining (15), (16), and (17), we find that \(\phi =0\), which is precisely our claim. \(\square \)

We now prove a similar assertion for the reflections \(r_v\) with \(v \in S^{N-1}\).

Claim 6

Let \(v \in S^{N-1}\). Then the \(-1\) eigenspace of \(r_v\) acting on \(\ker L_u\) is one-dimensional.

Proof of Claim 6

Suppose that \(0\ne \phi \in \ker L_u\) has \(\phi \circ r_v = - \phi \) for some \(v \in S^{N-1}\). Then (up to multiplying \(\phi \) by \(-1\)) it follows as in the proof of the first claim above by the Courant nodal domain theorem that

$$\begin{aligned} \phi> 0 \ \text {in} \ \{x \in S^{N} \ : \langle x , v \rangle > 0\} \end{aligned}$$

and

$$\begin{aligned} \phi = 0 \ \text {along} \ \langle v \rangle ^\perp \cap S^{N}. \end{aligned}$$

This implies that \(\phi \) is a first Dirichlet eigenfunction for \(L_u\) on the hemisphere \(\{x \in S^{N} \ : \langle x , v \rangle \ge 0\}\). Let us now choose some \(w \in \langle v \rangle ^\perp \cap \langle e_{N+1} \rangle ^\perp \cap S^{N}\) and let X be a non-trivial Killing field generated by a rotation fixing w, i.e. \(X(w)=0\). Then we have the corresponding function \(\phi _X \in \ker L_u\) described earlier (see footnote 3). Moreover, we have that

$$\begin{aligned} \phi _X> 0 \ \text {in} \ \{x \in S^{N} \ : \langle x , v \rangle > 0\} \end{aligned}$$

and

$$\begin{aligned} \phi _X = 0 \ \text {along} \ \langle v \rangle ^\perp \cap S^{N}. \end{aligned}$$

In particular, it follows that \(\phi _X\) is a first Dirichlet eigenfunction as well. This implies that \(\phi \in \langle \phi _X \rangle \), which is what we claimed. \(\square \)

Let us now finish the proof using the above two claims. We denote by

$$\begin{aligned} V=\{ \phi _X \ : \ X \ \text {Killing field} \} \subset \ker L_u \end{aligned}$$

the N-dimensional subspace spanned by the eigenfunctions generated by rotations. We want to show that the above inclusion is an equality. Let \(\phi \in V^\perp \subset \ker L_u\) and assume that \(\phi \ne 0\).

Note that for \(v \in S^{N-1} \cup \{e_{N+1}\}\) the reflections \(r_v\) preserve V and act by isometries on \(\ker L_u\) (endowed with the \(L^2\) scalar product). In particular, each of these defines an involution on \(V^\perp \subset \ker L_u\). It follows from Claim 5 and Claim 6 that the \(-1\) eigenspace of any of these has to be trivial. This implies that \(\phi \in V^\perp \) is symmetric about the \(e_{N+1}\)-axis and invariant under the reflection at \(\{x_{N+1}=0\}\). In particular

$$\begin{aligned} \phi = c \ \text {along} \ S^{N-1} \end{aligned}$$

for some constant \(c \in {\mathbb {R}}\). We cannot have \(c=0\), since otherwise this would imply by the Courant nodal domain theorem that \(\phi \) has a sign in the upper and lower hemisphere, contradicting our computation from Claim 5 (and also the Hopf boundary lemma). But then, since \(\phi \) is not a first eigenfunction and is invariant under \(r_{N+1}\), there has to be some point \(z=(z',z_{N+1}) \in S^{N}_+\) with \(z_{N+1}>0\) and \(\phi (z)=0\). Since \(\phi \) is rotationally symmetric about the \(e_{N+1}\)-axis, this implies that the set

$$\begin{aligned} \{(z',z_{N+1}) \in S^{N} \} \subseteq \{\phi = 0\}. \end{aligned}$$

Since \(\phi \) is invariant under \(r_{N+1}\) this implies that also

$$\begin{aligned} \{(z',-z_{N+1}) \in S^{N} \} \subseteq \{\phi = 0\}. \end{aligned}$$

But this implies that \(S^{N} {\setminus } \{ \phi =0\}\) has at least three connected components contradicting Courant’s nodal domain theorem if \(\phi \ne 0\). \(\square \)

8.2 The energy gap for the Allen–Cahn equation

Thanks to Proposition 8.1 we can now provide the argument for Theorem 2.9. We argue by contradiction, essentially exploiting the fact that our discussion up to this point can be summarized by saying that the Allen–Cahn functional is Morse–Bott near the energy level \(a_\varepsilon \). Alternatively, one could invoke an appropriate version of the implicit function theorem.

Proof of Theorem 2.9

Assume that we have a sequence \((v_{j})_{j \in {\mathbb {N}}}\) of solutions to (3) with

$$\begin{aligned} E_\varepsilon (v_{j}) > a_\varepsilon \ \text {and} \ \lim _{j \rightarrow \infty } E_\varepsilon (v_{j}) = a_\varepsilon \end{aligned}$$

Note that Theorem 2.8 can be stated as

$$\begin{aligned} A_\varepsilon =\{ u \in W^{1,2}(S^{N}) \ : \ u \ \text {solves} \ (3) \ \text {and} \ E_\varepsilon (u)=a_\varepsilon \} = \{u_{0}\ \circ \ R \ : \ R \in O(N+1) \}, \end{aligned}$$

where \(u_0\) is rotationally symmetric, odd and monotone in the radial direction. In particular, the elements of \(A_\varepsilon \) are determined by where their maximum is, so \(A_\varepsilon \) is a sphere of dimension N. In addition, since \(A_\varepsilon \) is compact we have:

$$\begin{aligned} \inf _{R \in O(N+1)}\Vert v_{j} - (u_{0} \circ R) \Vert _{L^2} = \Vert v_{j} - (u_{0} \circ R_j) \Vert _{L^2} \end{aligned}$$
(18)

for some \(R_j \in O(N+1)\). Notice that since \(u_0\circ R_j\) minimizes the distance from \(v_j\) to \(A_\varepsilon \), it follows that \(v_j-u_0\circ R_j\) is orthogonal to the tangent space of \(A_\varepsilon \) at \(u_0\circ R_j\). By composing everything with \(R_j^{-1}\) we may assume that \(R_j={{\,\mathrm{id}\,}}\) for any \(j \in {\mathbb {N}}\).

By standard elliptic estimates, we have that \(v_{j} \rightarrow v \ \text {in} \ C^{\infty } \) and \(E_\varepsilon (v) = a_\varepsilon .\) It follows that \(v=u_{0}.\) Consider now the sequence of functions \(w_{j} = \alpha _{j} (v_{j}-u_{0}),\) with \(\alpha _{j}^{-1} = \Vert v_{j}-u_{0} \Vert _{L^2}\), so that \(\Vert w_{j}\Vert _{L^2}=1\). It follows from standard arguments that

$$\begin{aligned} w_{j} \rightarrow \phi \in \ker L_{u_{0}}, \end{aligned}$$

where the convergence is smooth and hence \(\Vert \phi \Vert _{L^2}=1\). But by the choice (18) we have that \(\phi \perp \ker L_{u_{0}}\) thanks to Proposition 8.1, which is a contradiction. \(\square \)

Thanks to the Palais–Smale condition satisfied by the Allen–Cahn functional, the proof of the gap for the Allen–Cahn widths does not rely on Theorem 2.9, contrary to the argument leading to the same assertion for the Almgren–Pitts widths.

Proof of Theorem 2.10

Suppose that

$$\begin{aligned} c_\varepsilon (1) = \dots = c_{\varepsilon }(N+2) \end{aligned}$$

for the Allen–Cahn widths on the sphere \(S^N\). Under this assumption it follows from [27, Theorem 3.3 (2)] that the cohomological \({\mathbb {Z}}/2\) index of the set

$$\begin{aligned} K_{c_{\varepsilon }(1)} = \{ u \in W^{1,2}(S^N) \ : \ E_\varepsilon '(u)=0, E_{\varepsilon }(u)=c_{\varepsilon }(1) \} \end{aligned}$$

satisfies

$$\begin{aligned} {{\,\mathrm{ind}\,}}_{{\mathbb {Z}}/2}(K_{c_\varepsilon (1)}) \ge N+2. \end{aligned}$$
(19)

On the other hand, we know that there is a \({\mathbb {Z}}/2\)-equivariant map

$$\begin{aligned} \phi :K_{c_\varepsilon (1)} \rightarrow S^N. \end{aligned}$$

Since \(H^{N+1}(S^N;{\mathbb {Z}}/2)=0\) this implies that

$$\begin{aligned} {{\,\mathrm{ind}\,}}_{{\mathbb {Z}}/2}(K_{c_\varepsilon (1)}) \le N+1 \end{aligned}$$

contradicting (19). \(\square \)

9 Bifurcation at the first positive critical level

In this section, we will study the bifurcation for solutions of (3) on \(S^3\) which occurs at \(\varepsilon = \varepsilon _1 = (\lambda _1(S_+^3))^{-1/2}\). We recall that the only solutions of (3) for any \(\varepsilon \ge \varepsilon _1\) are the constants \(\pm 1\) and 0. Denote by \(A_\varepsilon \subset W^{1,2}(S^3)\) the set of all unstable solutions of least energy in \(S^3\). By Theorem 2.8, the set \(A_\varepsilon \) is diffeomorphic to \(S^3\). Our goal is to prove Theorem 2.11.

We begin by constructing the families of nonradially symmetric solutions mentioned in Theorem 2.11 (2). We regard \(S^3\) as the set of all \((z,w) \in {\mathbb {C}}^2\) such that \({|z|^2+|w|^2=1}\). Let \({\mathcal {T}} \subset S^3\) be the Clifford torus, namely

$$\begin{aligned} {\mathcal {T}} = \{ x \in S^3 : x_1^2+x_2^2 = 1/2 = x_3^2 + x_4^2\}. \end{aligned}$$

It is a minimal surface and it bounds the solid torus

$$\begin{aligned} \Omega _{{\mathcal {T}}} = \{ x \in S^3 : x_1^2 + x_2^2 < 1/2\}. \end{aligned}$$
(20)

Moreover, \({\mathcal {T}}\) is the nodal set of the restriction of the harmonic polynomial \(p(x) = x_1^2 + x_2^2 - x_3^2 - x_4^2\) on \({\mathbb {R}}^4\) to the sphere. Since this restriction is an eigenfunction for \(\Delta =\Delta _{S^3}\) with associated eigenvalue \(\lambda _2(S^3)=8\), we get \(\lambda _1(\Omega _{{\mathcal {T}}}) = \lambda _2(S^3)=8\). Finally, we note that \({\mathcal {T}}\) is invariant by the isometry

$$\begin{aligned} s:(x_1,x_2,x_3,x_4) \in S^3 \mapsto (x_3,x_4,x_1,x_2) \in S^3, \end{aligned}$$

which switches \(\Omega _{{\mathcal {T}}}\) and the interior of its complement, and that \(\Omega _{{\mathcal {T}}}\) is invariant by the isometries

$$\begin{aligned} f_{\theta ,\rho }: (z,w) \in S^3 \subset {\mathbb {C}}^2 \mapsto (e^{i\theta }z,e^{i\rho }w) \in S^3, \end{aligned}$$

for all \(\theta ,\rho \in {\mathbb {R}}\). By the uniqueness of positive Dirichlet solutions, we conclude:

Proposition 9.1

For any \(\varepsilon \in (0,\varepsilon _2)\), there is a solution u of (3) whose nodal set is precisely the Clifford torus. Moreover, it is invariant by \(f_{\theta ,\rho }\), for all \(\theta ,\rho \in {\mathbb {R}}\), and it satisfies

$$\begin{aligned} u(w,z) = -u(z,w), \quad \text{ for } \text{ all } \quad (z,w) \in S^3. \end{aligned}$$

Proof

By Theorem 7.1, there is a unique positive solution \(v \in C^3(\Omega _{{\mathcal {T}}})\) of (3) in \(\Omega _{{\mathcal {T}}}\) which vanishes on \(\partial \Omega _{{\mathcal {T}}} = {\mathcal {T}}\), provided \(\varepsilon < \varepsilon _2\). Since \(v \circ f_{\theta ,\rho }\) also solves (3) on \(\Omega _{{\mathcal {T}}}\), we get \(v \circ f_{\theta ,\rho } = v\), for all \(\theta ,\rho \in {\mathbb {R}}\). Moreover,

$$\begin{aligned} {{\,\mathrm{SO}\,}}(2)\times {{\,\mathrm{SO}\,}}(2) = \{f_{\theta ,\rho }: (z,w) \mapsto (e^{i\theta }z,e^{i\rho }w) \, : \, (\theta ,\rho ) \in {\mathbb {R}}^2\} \end{aligned}$$

acts transitively on \({\mathcal {T}}\), hence the normal derivative of v is constant along the boundary \({\mathcal {T}}\).

Therefore, the solution \(u:S^3 \rightarrow {\mathbb {R}}\) with the desired properties is given by

$$\begin{aligned} u(x) = \left\{ \begin{array}{rl} v(x), &{} \text{ if } \ x \in {\bar{\Omega }}_{{\mathcal {T}}}, \\ -v(s(x)), &{}\text{ if }\ x \in S^3 {\setminus } {\bar{\Omega }}_{{\mathcal {T}}} \end{array} \right. . \end{aligned}$$

\(\square \)
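The symmetries of \(\Omega _{{\mathcal {T}}}\) used in this construction can be checked on sample points. The sketch below is only an illustration; the parametrization `point` of \(S^3\) and the sample angles are hypothetical choices, not taken from the text.

```python
import cmath
import math

def in_solid_torus(x):
    """Membership in Omega_T = {x in S^3 : x_1^2 + x_2^2 < 1/2}."""
    return x[0] ** 2 + x[1] ** 2 < 0.5

def swap(x):
    """The isometry s(x_1, x_2, x_3, x_4) = (x_3, x_4, x_1, x_2)."""
    return (x[2], x[3], x[0], x[1])

def torus_rotation(x, theta, rho):
    """The isometry f_{theta, rho}(z, w) = (e^{i theta} z, e^{i rho} w)."""
    z = cmath.exp(1j * theta) * complex(x[0], x[1])
    w = cmath.exp(1j * rho) * complex(x[2], x[3])
    return (z.real, z.imag, w.real, w.imag)

def point(a, alpha, beta):
    """Point of S^3 with |z| = cos(a) and |w| = sin(a), for a in [0, pi/2]."""
    return (math.cos(a) * math.cos(alpha), math.cos(a) * math.sin(alpha),
            math.sin(a) * math.cos(beta), math.sin(a) * math.sin(beta))

x = point(1.2, 0.3, 2.0)                 # |z|^2 = cos(1.2)^2 < 1/2, so x lies in Omega_T
x_swapped = swap(x)                      # lies in the complement of the closure of Omega_T
x_rotated = torus_rotation(x, 0.7, -1.9) # stays in Omega_T
```

Since `swap` exchanges the two sides of \({\mathcal {T}}\) and is an involution, the formula \(u = -v \circ s\) on the complement produces a function that changes sign under \(s\).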

Similarly, the set

$$\begin{aligned} {\mathcal {X}} = \{x \in S^3 : x_3\cdot x_4 = 0\}, \end{aligned}$$

is the union of two orthogonal equators, and the nodal set of the restriction of the harmonic polynomial \(x \mapsto x_3\cdot x_4\) to \(S^3\). Since this polynomial is also a Laplace eigenfunction associated to \(\lambda _2(S^3)\) and it is positive in the region \(\Omega _{{\mathcal {X}}} = \{x \in S^3 : x_3>0, x_4>0\}\), we see that \(\lambda _1(\Omega _{{\mathcal {X}}}) = \lambda _2(S^3)\). By Theorem 7.1, there is a unique positive Dirichlet solution \(u_{{\mathcal {X}}}\) of (3) in \(\Omega _{{\mathcal {X}}}\).
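The eigenvalue claims for the two quadratic polynomials can be verified by hand. The sketch below uses the standard fact that a homogeneous harmonic polynomial q of degree k on \({\mathbb {R}}^{N+1}\) restricts to an eigenfunction of \(-\Delta _{S^N}\) with eigenvalue \(k(k+N-1)\); the symbol q is notation introduced only for this computation.

```latex
$$\begin{aligned}
\Delta _{{\mathbb {R}}^4} \left( x_1^2 + x_2^2 - x_3^2 - x_4^2 \right) &= 2 + 2 - 2 - 2 = 0,
  \qquad \Delta _{{\mathbb {R}}^4} \left( x_3 x_4 \right) = 0, \\
-\Delta _{S^3} \left( q|_{S^3} \right) &= k(k+2)\, q|_{S^3} = 8\, q|_{S^3}
  \quad \text{for } k = \deg q = 2 .
\end{aligned}$$
```

Each restriction has a fixed sign on the corresponding domain (\(\Omega _{{\mathcal {T}}}\) or \(\Omega _{{\mathcal {X}}}\)) and vanishes on its boundary, so it is a first Dirichlet eigenfunction there, which gives the value 8 for the first Dirichlet eigenvalue.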

Observe that \(\Omega _{{\mathcal {X}}}\) is invariant by the isometry \(t(x_1,x_2,x_3,x_4) = (x_1,x_2,x_4,x_3)\), which interchanges the two orthogonal equators in \({\mathcal {X}}\). It is also invariant by any \(T \in \mathrm {O}(2)\) acting on the first two coordinates of \(x \in S^3\). Hence \(u_{{\mathcal {X}}} \circ t = u_{{\mathcal {X}}}\), and \(u_{{\mathcal {X}}}\) depends on \(x_3\) and \(x_4\) only. Therefore, we can extend \(u_{{\mathcal {X}}}\) to a solution \({\bar{u}}\) in \(S^3\) by odd reflections across the two equators in \({\mathcal {X}}\), namely

$$\begin{aligned} {\bar{u}}(x_1,x_2,x_3,x_4) = {\text {sgn}}(x_3x_4) \cdot u_{{\mathcal {X}}}(x_1,x_2,{\text {sgn}}(x_3)x_3,{\text {sgn}}(x_4)x_4). \end{aligned}$$

This concludes the construction of the second family of nonradially symmetric solutions for \(\varepsilon <\varepsilon _2\), and finishes the proof of Theorem 2.11 (2).
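The odd reflection formula for \({\bar{u}}\) can be made concrete as follows. This is only a sketch of the extension step: the function `stand_in` is a hypothetical placeholder that is positive on \(\Omega _{{\mathcal {X}}}\) and vanishes on its boundary; it is not the actual solution \(u_{{\mathcal {X}}}\).

```python
def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)

def odd_extension(u_X):
    """Extend a function given on {x_3 > 0, x_4 > 0} by odd reflections in x_3 and x_4."""
    def u_bar(x1, x2, x3, x4):
        return sgn(x3 * x4) * u_X(x1, x2, sgn(x3) * x3, sgn(x4) * x4)
    return u_bar

# Hypothetical stand-in profile (assumption): positive where x_3, x_4 > 0.
stand_in = lambda x1, x2, x3, x4: x3 * x4

u_bar = odd_extension(stand_in)
base = u_bar(0.5, 0.5, 0.5, 0.5)
flip_x3 = u_bar(0.5, 0.5, -0.5, 0.5)     # reflection across {x_3 = 0}
flip_both = u_bar(0.5, 0.5, -0.5, -0.5)  # composition of both reflections
on_equator = u_bar(0.5, 0.5, 0.0, 0.5)   # a point of the nodal set
```

The extension changes sign under each single reflection and is unchanged under their composition, so its nodal set contains both equators in \({\mathcal {X}}\).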

In order to prove Theorem 2.11 (1), we will first rule out other radially symmetric solutions. More precisely,

Lemma 9.2

If u is a nonconstant radially symmetric solution of (3) for \(\varepsilon \in [\varepsilon _{2},\varepsilon _{1})\), then \(u \in A_{\varepsilon }\).

Before proving the lemma above, we recall some facts about the first Dirichlet eigenvalue of certain domains in \(S^3\). For any geodesic ball \(B_\tau \subset S^3\), where \(\tau \in (0,\pi )\), we have (see e.g. [4])

$$\begin{aligned} \lambda _1(B_\tau ) = \left( \frac{\pi }{\tau }\right) ^2 - 1. \end{aligned}$$

In particular, \(\lambda _1(B_\tau ) \ge \lambda _2(S^3) = 8\) if, and only if, \(\tau \le \pi /3\). Consider also the spherical segment

$$\begin{aligned} \Omega _h = \{ x \in S^3 : {\text {dist}}(x,\{x_4=0\})< h\}= B_{\pi /2+h}(e_4) {\setminus } {\bar{B}}_{\pi /2-h}(e_4). \end{aligned}$$

for \(h \in (0,\pi /2)\). We claim that

$$\begin{aligned} \lambda _1(\Omega _h) \ge \left( \frac{\pi }{2h}\right) ^2 -1. \end{aligned}$$

In fact, if \(\phi \) is a positive eigenfunction corresponding to \(\lambda _1(\Omega _h)\), then we may write \(\phi (x) = f(r(x))\) for some function \(f \in C^\infty ([\pi /2-h,\pi /2+h])\) which vanishes on the boundary of its domain, where \(r(x) = {\text {dist}}(x,e_4)\). Then

$$\begin{aligned} \lambda _1(\Omega _h) = \frac{\int _{\Omega _h}|\nabla \phi |^2}{\int _{\Omega _h} |\phi |^2} = \frac{\int _{\pi /2-h}^{\pi /2+h} \sin ^2(t)f'(t)^2\,dt}{\int _{\pi /2-h}^{\pi /2+h} \sin ^2(t)f(t)^2\,dt}. \end{aligned}$$

Using

$$\begin{aligned} {\int _{\pi /2-h}^{\pi /2+h} \left( \frac{d}{dt}(\sin (t)f(t))\right) ^2\,dt} = {\int _{\pi /2-h}^{\pi /2+h} \sin ^2(t)f'(t)^2\,dt} + {\int _{\pi /2-h}^{\pi /2+h} \sin ^2(t)f(t)^2\,dt}, \end{aligned}$$

which follows by integration by parts and \(f(\pi /2-h)=0=f(\pi /2+h)\), and Wirtinger’s inequality, we obtain

$$\begin{aligned} \lambda _1(\Omega _h) = \frac{{\int _{\pi /2-h}^{\pi /2+h} \left( \frac{d}{dt}(\sin (t)f(t))\right) ^2\,dt}}{\int _{\pi /2-h}^{\pi /2+h} \left( \sin (t)f(t)\right) ^2\,dt}-1 \ge \left( \frac{\pi }{2h}\right) ^2 -1 . \end{aligned}$$
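For the reader's convenience, the identity used above can be verified directly. Write \(a=\pi /2-h\), \(b=\pi /2+h\) and \(g(t)=\sin (t)f(t)\); the names a, b and g are introduced here only for this verification. Since \(g' = \cos (t) f + \sin (t) f'\) and \(f(a)=0=f(b)\), integrating the cross term by parts gives

$$\begin{aligned} \int _a^b g'(t)^2\,dt&= \int _a^b \sin ^2(t)f'(t)^2\,dt + \int _a^b \cos ^2(t)f(t)^2\,dt + \int _a^b \sin (t)\cos (t)\,\frac{d}{dt}\left( f(t)^2\right) dt \\&= \int _a^b \sin ^2(t)f'(t)^2\,dt + \int _a^b \left( \cos ^2(t)-\cos (2t)\right) f(t)^2\,dt \\&= \int _a^b \sin ^2(t)f'(t)^2\,dt + \int _a^b \sin ^2(t)f(t)^2\,dt. \end{aligned}$$

Since g vanishes at both endpoints of an interval of length \(b-a=2h\), Wirtinger's inequality reads \(\int _a^b g'(t)^2\,dt \ge \left( \frac{\pi }{2h}\right) ^2\int _a^b g(t)^2\,dt\), which is the lower bound used in the last display.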

Proof of Lemma 9.2

By composing u with an isometry, we may assume that u is radially symmetric with respect to \(e_4 \in S^3\). The nodal set \(u^{-1}(0)\) is the union of concentric geodesic spheres centered at \(e_4\), and, by the maximum principle, \(u(\pm e_4) \ne 0\). Hence, the connected component of \(\{u^{2}> 0\}\) containing \(e_4\) is a geodesic ball \(B_\tau =B_\tau (e_4)\). By Theorem 7.1, we obtain \(\varepsilon _{2} \le \varepsilon < \lambda _1(B_\tau )^{-1/2}\), so \(\lambda _1(B_\tau ) < \lambda _2(S^3)\) and \(\tau > \pi /3\). Similarly, the connected component of \(\{u^2>0\}\) containing \(-e_4\) is a geodesic ball of radius \(>\pi /3\).

We claim that \(u^{-1}(0)\) is connected. In fact, if this is not the case, then \(\{u^2>0\}\) has a third connected component \(\Omega \) which is contained in the spherical segment \(\Omega _{\pi /6}\). Consequently,

$$\begin{aligned} \lambda _1(\Omega )>\lambda _1(\Omega _{\pi /6}) \ge \frac{\pi ^2}{(\pi /3)^2} -1 = 8 = \lambda _2(S^3). \end{aligned}$$

But, using Theorem 7.1 again, we get \(\varepsilon _2 \le \varepsilon < \lambda _1(\Omega )^{-1/2} \le \lambda _2(S^3)^{-1/2}\), which is a contradiction. This shows that \(u^{-1}(0)\) is a single geodesic sphere. By Lemma 7.2, we see that \(u^{-1}(0) = \partial B_{\pi /2}=\{x_4=0\}\). Hence, the uniqueness part of Theorem 7.1 yields \(u \in A_{\varepsilon }\). \(\square \)
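The role of the threshold \(\pi /3\) in the proof above can be made explicit as follows. Denote by \(\tau _+\) and \(\tau _-\) the radii of the components of \(\{u^2>0\}\) containing \(e_4\) and \(-e_4\), respectively (this notation is introduced here only for illustration). Since \(\tau _\pm > \pi /3\), any other component \(\Omega \) satisfies

$$\begin{aligned} \Omega \subset \{x \in S^3 : \tau _+< r(x)< \pi -\tau _-\} \subset \{x \in S^3 : \pi /3< r(x) < 2\pi /3\} = \Omega _{\pi /6}, \end{aligned}$$

and the bound on \(\lambda _1(\Omega _h)\) with \(h=\pi /6\) gives

$$\begin{aligned} \lambda _1(\Omega )> \left( \frac{\pi }{2\cdot \pi /6}\right) ^2 - 1 = 9-1 = 8, \quad \text{ so } \quad \varepsilon< \lambda _1(\Omega )^{-1/2} < 8^{-1/2} = \varepsilon _2. \end{aligned}$$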

Remark 9.3

For sufficiently small \(\varepsilon >0\), there are radially symmetric solutions of the Allen–Cahn equation on \(S^3\) which are not in \(A_\varepsilon \); see Example 1 in [30]. The previous lemma shows that \(\varepsilon <\varepsilon _2\) whenever such solutions exist.

We recall some facts about Lie group actions. Let G be a compact Lie group which acts differentiably on the right on a (Hilbert) manifold \({\mathcal {M}}\). For any \(u \in {\mathcal {M}}\), denote by

$$\begin{aligned} G_u = \{ g \in G : u \cdot g = u\} \quad \text{ and } \quad u \cdot G = \{u \cdot g: g \in G\} \end{aligned}$$

the isotropy group of u and the orbit of u, respectively. Then \(G_u\) is a closed Lie subgroup of G, and the quotient manifold \(G/G_u\) can be embedded into \({\mathcal {M}}\), with image \(u\cdot G\). In particular, \(G/G_u\) is diffeomorphic to \(u \cdot G\) and

$$\begin{aligned} \dim (u \cdot G) = \dim G - \dim G_u, \end{aligned}$$
(21)

see e.g. [9, §VI.1] for a proof.

The group of all orientation preserving isometries of \(S^3\) is the special orthogonal group \({{\,\mathrm{SO}\,}}(4)\). Since any \(T \in {{\,\mathrm{SO}\,}}(4)\) is a diffeomorphism of \(S^3\), this group acts on the Sobolev space \(W^{1,2}(S^3)\) on the right by composition, i.e. \(u \cdot T = u\circ T\). This yields a differentiable action \(W^{1,2}(S^{3}) \times {{\,\mathrm{SO}\,}}(4) \rightarrow W^{1,2}(S^{3})\). Moreover, \(u\cdot T\) is a solution of (3) whenever u is a solution, and the orbit of any \(u\in A_{\varepsilon }\) under this action is precisely the set \(A_{\varepsilon }\).
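As an illustration of (21), suppose (an assumption made here only for the sake of the example) that \(u \in A_\varepsilon \) is radially symmetric with respect to \(e_4\). Then every rotation fixing \(e_4\) preserves u, so the isotropy group \({{\,\mathrm{SO}\,}}(4)_u\) contains a copy of \({{\,\mathrm{SO}\,}}(3)\) and has dimension at least 3. Since \(\dim {{\,\mathrm{SO}\,}}(4) = 6\), formula (21) gives

$$\begin{aligned} \dim A_\varepsilon = \dim (u\cdot {{\,\mathrm{SO}\,}}(4)) = \dim {{\,\mathrm{SO}\,}}(4) - \dim {{\,\mathrm{SO}\,}}(4)_u \le 6-3 = 3. \end{aligned}$$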

We will need the following classification of Lie subgroups of \({{\,\mathrm{SO}\,}}(4)\).

Theorem 9.4

([46]) Let G be a connected Lie subgroup of \({{\,\mathrm{SO}\,}}(4)\) of dimension \(\ge 2\). Up to conjugation in \({{\,\mathrm{SO}\,}}(4)\), G is one of the following subgroups:

  1. (1)

    \({{\,\mathrm{SO}\,}}(4)\), if \(\dim G = 6\);

  2. (2)

    \(\mathrm {U}(2)\) (unitary complex \(2\times 2\) matrices acting on each \((z,w) \in S^3\) by matrix multiplication), if \(\dim G=4\);

  3. (3)

    \(\mathrm {SU}(2)\) (unitary complex \(2\times 2\) matrices with \(\det =1\) acting on each \((z,w) \in S^3\) by matrix multiplication), or \({{\,\mathrm{SO}\,}}(3) = {\{T \in {{\,\mathrm{SO}\,}}(4) : Te_4=e_4\}}\), where \(e_4=(0,i) \in S^3\), if \(\dim G =3\);

  4. (4)

    The torus \({{\,\mathrm{SO}\,}}(2)\times {{\,\mathrm{SO}\,}}(2)=\{f_{\theta ,\rho } : \theta ,\rho \in {\mathbb {R}}\}\), if \(\dim G=2\).

In particular \(\dim G \ne 5\).

Lemma 9.5

Let \(\varepsilon \in (\varepsilon _2,\varepsilon _1)\), and let u be a nonconstant solution of (3). If the group

$$\begin{aligned} G={{\,\mathrm{SO}\,}}(4)_{u} = \{T \in {{\,\mathrm{SO}\,}}(4) : u \circ T = u\} \end{aligned}$$

has dimension \(\ge 2\), then \(u \in A_\varepsilon \).

Proof

By Theorem 9.4, we cannot have \(\dim G \ge 4\). In fact, if this were the case, then after possibly replacing u with \(u \circ T\) for some isometry T (and consequently G with \(T^{-1}GT\)), G would be either \({{\,\mathrm{SO}\,}}(4)\) or \(\mathrm {U}(2)\). Both groups act transitively on \(S^3\), so u would be constant. Hence, it suffices to consider the following cases:

Case 1: \(\dim G = 3\).

Again, after possibly composing u with an isometry, we may assume G is either \(\mathrm {SU}(2)\) or \({{\,\mathrm{SO}\,}}(3)\). The former case cannot happen, as \(\mathrm {SU}(2)\) also acts transitively on \(S^{3}\). In the latter case, G acts by rotations in each hyperplane orthogonal to \((0,0,0,1)\in S^{3}\), so u is rotationally symmetric with respect to this point. By Lemma 9.2 we get \(u \in A_{\varepsilon }\).

Case 2: \(\dim G = 2\).

We will show that this case does not happen. By Theorem 9.4, we may assume that G is the group \({{\,\mathrm{SO}\,}}(2)\times {{\,\mathrm{SO}\,}}(2)\). The action of G on \(S^{3}\) has two orbit types: 2-dimensional tori given by the boundary of the region

$$\begin{aligned} \Omega _r = \{(z,w) \in S^{3}: |z| \le r\}, \quad \text{ for } \text{ some } \quad r\in (0,1), \end{aligned}$$

and the circles \(\{|z|=0\} \cap S^3\) and \(\{|z|=1\}\cap S^3\). If \(u \not \equiv \pm 1\), then u changes sign and we may pick \(x \in u^{-1}(0)\). By the maximum principle and \(x \cdot G \subset u^ {-1}(0)\), the orbit \(x\cdot G\) cannot be a circle. It follows that u has a nodal domain \(\Omega \) which is contained either in \(\Omega _{{\mathcal {T}}}\) (see (20)) or in its complement. In either case, domain monotonicity gives \(\lambda _1(\Omega ) \ge \lambda _1(\Omega _{{\mathcal {T}}}) = \lambda _2(S^3)\), so Theorem 7.1 implies \(\varepsilon < \lambda _1(\Omega )^{-1/2} \le \lambda _2(S^3)^{-1/2} = \varepsilon _2\), contradicting our assumption on \(\varepsilon \). \(\square \)

Lemma 9.6

Given any solution u of (3) for \(\varepsilon \in (\varepsilon _2,\varepsilon _1)\), the group \({G={{\,\mathrm{SO}\,}}(4)_u}\) has dimension \(\ge 2\).

Proof

Since \(W''(u) \ge -1 = W''(0)\), we have

$$\begin{aligned} \int _{S^3} \varepsilon |\nabla \phi |^2 + \frac{W''(u)}{\varepsilon } \phi ^2 \ge \int _{S^3} \varepsilon |\nabla \phi |^2 + \frac{W''(0)}{\varepsilon }\phi ^2, \quad \text{ for } \text{ any } \quad \phi \in W^{1,2}(S^3). \end{aligned}$$

If \(L_u\) and \(L_0\) are the linearizations of the Allen–Cahn operator at u and 0 respectively, the inequality above yields

$$\begin{aligned} \lambda _k(L_u) \ge \lambda _k(L_0), \quad \text{ for } \text{ all } \quad k \in {\mathbb {N}}. \end{aligned}$$

Since \(L_0 = -\varepsilon ^2\Delta + W''(0)= -\varepsilon ^2 \Delta -1\), we see that \(\lambda _k(L_0) = \varepsilon ^2\lambda _k(S^3) -1\). Using that \(\lambda _1(S^3)\) has multiplicity 4 and \(\varepsilon \in (\varepsilon _2,\varepsilon _1)\), we obtain

$$\begin{aligned} {{\,\mathrm{ind}\,}}(u) + \dim \ker (L_u) \le {{\,\mathrm{ind}\,}}(0) + \dim \ker (L_0) = 5. \end{aligned}$$

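By Theorem 5.1, we have \({{\,\mathrm{ind}\,}}(u) \ge 1\) unless \(u \equiv \pm 1\), in which case \(G={{\,\mathrm{SO}\,}}(4)\) and there is nothing to prove. Hence we may assume \(\dim \ker (L_u) \le 4\). Since the tangent vector of any curve through u in the orbit \(u\cdot {{\,\mathrm{SO}\,}}(4)\) is an element in \(\ker (L_u)\), we see that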

$$\begin{aligned} \dim (u\cdot {{\,\mathrm{SO}\,}}(4)) = \dim T_u(u\cdot {{\,\mathrm{SO}\,}}(4)) \le \dim \ker (L_u) \le 4. \end{aligned}$$

Along with (21) and \(\dim {{\,\mathrm{SO}\,}}(4) = 6\), this shows \(\dim ({{\,\mathrm{SO}\,}}(4)_u) \ge 2\). \(\square \)
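The count in the proof above can be spelled out using the standard spectrum of \(-\Delta \) on \(S^3\), whose distinct eigenvalues are \(k(k+2)\) with multiplicity \((k+1)^2\), \(k=0,1,2,\ldots \). We take \(\varepsilon _1 = \lambda _1(S^3)^{-1/2} = 3^{-1/2}\) and \(\varepsilon _2=\lambda _2(S^3)^{-1/2} = 8^{-1/2}\), consistent with the range in Remark 9.7 for \(N=3\). For \(\varepsilon \in (\varepsilon _2,\varepsilon _1)\), the eigenvalues of \(L_0\) satisfy

$$\begin{aligned} \varepsilon ^2\cdot 0 - 1< 0 \ (\text{multiplicity } 1), \qquad 3\varepsilon ^2-1 < 0 \ (\text{multiplicity } 4), \qquad 8\varepsilon ^2 - 1 > 0, \end{aligned}$$

so that \({{\,\mathrm{ind}\,}}(0) = 1+4 = 5\) and \(\dim \ker (L_0)=0\). Combined with (21), this gives

$$\begin{aligned} \dim ({{\,\mathrm{SO}\,}}(4)_u) = \dim {{\,\mathrm{SO}\,}}(4) - \dim (u\cdot {{\,\mathrm{SO}\,}}(4)) \ge 6 - 4 = 2. \end{aligned}$$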

Remark 9.7

The same argument gives a lower bound for the dimension of the isotropy group of a nonconstant solution u of (3) in \(S^{N}\), for \(\varepsilon \) in the range \((\lambda _2(S^{N})^{-1/2},\lambda _1(S^{N})^{-1/2})\). In fact, since \(\lambda _1(S^{N})\) has multiplicity \(N+1\) and \(\dim {{\,\mathrm{SO}\,}}(N+1) = \frac{N(N+1)}{2}\), for any such \(\varepsilon \) and u, the group \({{\,\mathrm{SO}\,}}(N+1)_u\) has dimension \(\ge \frac{(N-2)(N+1)}{2}=\dim ({{\,\mathrm{SO}\,}}(N))-1\).
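The arithmetic behind this bound can be sketched as follows, arguing exactly as in the proof of Lemma 9.6. For \(\varepsilon \) in the stated range, the negative eigenvalues of \(L_0\) come from the eigenvalue 0 of \(-\Delta \) (multiplicity 1) and from \(\lambda _1(S^N)\) (multiplicity \(N+1\)), so \({{\,\mathrm{ind}\,}}(u)+\dim \ker (L_u) \le N+2\) and, since \({{\,\mathrm{ind}\,}}(u) \ge 1\), \(\dim \ker (L_u) \le N+1\). Therefore

$$\begin{aligned} \dim ({{\,\mathrm{SO}\,}}(N+1)_u) \ge \frac{N(N+1)}{2} - (N+1) = \frac{(N-2)(N+1)}{2} = \frac{N(N-1)}{2} - 1. \end{aligned}$$

For \(N=3\) the right-hand side equals 2, which recovers Lemma 9.6.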

As a consequence of Lemmas 9.5 and 9.6, we see that the only solutions of (3) with \(\varepsilon \in (\varepsilon _2,\varepsilon _1)\) are the constants \(\pm 1\) and 0, and the least positive energy solutions in \(A_\varepsilon \). This finishes the proof of Theorem 2.11.