1 Introduction

Let B be the unit ball in \(\mathbb {R}^2\), let N be a natural number, and, for any map u in the Sobolev space \(H^1(B,\mathbb {R}^2)\), let

$$\begin{aligned} \mathbb {D}(u):=\int _{B}|\nabla u(x)|^2 \, dx \end{aligned}$$
(1.1)

be the Dirichlet energy of u. Ball showed in [1] that there is a minimizer of \(\mathbb {D}(\cdot )\) among functions in the class

$$\begin{aligned} {\mathcal {Y}} := \{y \in H^1(B;\mathbb {R}^2): \, \det \nabla y= 1 \ \text {a.e. in} \ B, \ y\arrowvert _{\partial B}=u_{_{\scriptscriptstyle {N}}}\}. \end{aligned}$$
(1.2)

Here, \(u_{_{\scriptscriptstyle {N}}}\) is the N-covering map given by

$$\begin{aligned} u_{N}(R\cos \alpha , R\sin \alpha ):=\frac{1}{\sqrt{N}}(R \cos N \alpha , R \sin N \alpha ), \end{aligned}$$

where \(0\le R \le 1\) and \(0\le \alpha < 2\pi \). The prefactor \(1/\sqrt{N}\) ensures that \(u_{_{\scriptscriptstyle {N}}}\) satisfies \(\det \nabla u_{_{\scriptscriptstyle {N}}}=1\) a.e., which, together with the observation that \(\nabla u_{_{\scriptscriptstyle {N}}}\) is essentially bounded, implies that \(u_{_{\scriptscriptstyle {N}}}\) belongs to \({\mathcal {Y}}\). This paper examines and finds evidence in support of the conjecture that \(u_{_{\scriptscriptstyle {N}}}\) is itself a singular (i.e. non-smooth), global minimizer of \(\mathbb {D}\) in \({\mathcal {Y}}\) by proving that \(u_{_{\scriptscriptstyle {N}}}\) is the unique global minimizer of \(\mathbb {D}\) with respect to generalized inner and outer variations. A generalized inner variation of \(u_{_{\scriptscriptstyle {N}}}\) in this case is formed by varying the independent variable only, i.e. by forming a map \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) where \(\varphi \) belongs to the class

$$\begin{aligned} \mathcal {A}(B) = \{\varphi \in H^1_{\text {id}}(B;\mathbb {R}^2): \, \det \nabla \varphi = 1 \ \text {a.e. in} \ B\}, \end{aligned}$$
(1.3)

and where the notation \(H^1_{\text {id}}(B;\mathbb {R}^2)\) refers to maps in \(H^1(B;\mathbb {R}^2)\) that agree in the sense of trace with the identity map on \(\partial B\). An outer variation of \(u_{_{\scriptscriptstyle {N}}}\), on the other hand, is a map of the form \(\phi \circ u_{_{\scriptscriptstyle {N}}}\), where \(\phi \) belongs to \(\mathcal {A}(B(0,1/\sqrt{N}))\) and B(0, r) is the ball in \(\mathbb {R}^2\) of radius r and centre 0.
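
For orientation, these properties of \(u_{_{\scriptscriptstyle {N}}}\) follow from a short computation in plane polar coordinates: writing \(e(\alpha ):=(\cos \alpha , \sin \alpha )^T\), letting J denote the \(2\times 2\) matrix representing a rotation of \(\pi /2\) radians anticlockwise, and using the tensor product notation of Sect. 1.1, one finds that

$$\begin{aligned} \nabla u_{_{\scriptscriptstyle {N}}}(x) = \frac{1}{\sqrt{N}}\, e(N\alpha )\otimes e(\alpha ) + \sqrt{N}\, Je(N\alpha )\otimes Je(\alpha ) \quad \text {for} \ x=Re(\alpha ), \ R>0, \end{aligned}$$

so that \(\det \nabla u_{_{\scriptscriptstyle {N}}}=1\) and \(|\nabla u_{_{\scriptscriptstyle {N}}}|^2=N+1/N\) away from the origin. In particular, \(\mathbb {D}(u_{_{\scriptscriptstyle {N}}})=\pi (N+1/N)\), a value which reappears in (3.25) below.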

We also study the corresponding problem in the case of energies with subquadratic growth, a typical example of which would be a functional such as the p-Dirichlet energy. In the subquadratic setting the construction of generalised inner variations is much more delicate, so we postpone its description until Sect. 5 of the paper, where, it should be noted, we shall rely on the ideas of Barchiesi et al. [4].


The problem of minimizing \(\mathbb {D}\) in \({\mathcal {Y}}\) is of interest for several reasons, the first of which is that this problem automatically delivers a singular minimizer whenever \(N\ge 2\). An argument of Ball in [3, Section 2.3] can be adapted to establish that, when \(N \ge 2\), no member of \({\mathcal {Y}}\) is \(C^1\), and so, in particular, the minimizer of \(\mathbb {D}\) in \({\mathcal {Y}}\) cannot be \(C^1\). The constraint \(\det \nabla y= 1\) a.e. in B is instrumental in limiting the regularity in this way: without it, it is easy to show that the unique global minimizer of \(\mathbb {D}\) among maps y agreeing with \(u_{_{\scriptscriptstyle {N}}}\) on \(\partial B\) is the smooth harmonic extension of \(u_{_{\scriptscriptstyle {N}}}|_{\partial B}\). Nor can topology be ignored: it turns out that any y in \({\mathcal {Y}}\) is continuous and hence, by topological degree theory, is such that, roughly speaking, y(B) covers the ball \(B(0,1/\sqrt{N})\) N times. When \(N=1\), the minimizer of \(\mathbb {D}\) in \({\mathcal {Y}}\) is trivially the identity map, which coincides with \(u_{_1}\) in the notation introduced above. Thus we focus on non-trivial changes in topology by choosing \(N\ge 2\).

The second reason to study \(\mathbb {D}\) in detail is that, formally at least, \(u_{_{\scriptscriptstyle {N}}}\) is a critical point of an appropriately perturbed version of \(\mathbb {D}\) in \({\mathcal {Y}}\). To be precise, it can be checked directly that \(u_{_{\scriptscriptstyle {N}}}\) solves the Euler-Lagrange equation associated with the energy

$$\begin{aligned} \mathbb {D}(u) + \int _{B} 2p_{\scriptscriptstyle {N}} \ln R \, \det \nabla u \,dx, \end{aligned}$$
(1.4)

where \(p_{\scriptscriptstyle {N}}:=N-1/N\) is constant. (See [6, Proposition 3.4] in the case \(N=2\); the case for general N follows similarly.) The prefactor of \(\det \nabla u\) in (1.4) is a Lagrange multiplier traditionally used to encode the requirement that functions in \({\mathcal {Y}}\) obey the constraint \(\det \nabla u = 1\) a.e. We note that the theory which underpins Lagrange multipliers in the case of nonlinear elasticity has been established in [8, 14, 15, 21,22,23]. The main difference between that setting and ours lies in the lack of invertibility of maps belonging to \({\mathcal {Y}}\), which, as indicated above, can be thought of as being N-to-1, whereas the maps considered in the works cited above are assumed to be invertible, with sufficiently regular inverses. Thus their results do not necessarily provide for the existence of a Lagrange multiplier for our problem in the case of the Dirichlet energy \(\mathbb {D}\). Nevertheless, the direct calculation alluded to above conforms to the conclusions of both [15, Theorem 3.1] and [8, Theorem 4.1], for example, without necessarily obeying all their hypotheses. This observation may be of independent interest.
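
For orientation, the weak form of the Euler–Lagrange equation associated with (1.4), obtained by using the two-dimensional expansion \(\det (F+tG)=\det F + t\,\mathrm{cof}\,F \cdot G + t^2 \det G\), reads

$$\begin{aligned} \int _{B} \bigl (\nabla u + p_{\scriptscriptstyle {N}} \ln R \, \mathrm{cof}\,\nabla u\bigr )\cdot \nabla \eta \, dx = 0 \quad \text {for all} \ \eta \in C^{\infty }_c(B;\mathbb {R}^2), \end{aligned}$$

where \(R=|x|\), and it is in this sense that \(u_{_{\scriptscriptstyle {N}}}\) can be checked to be a critical point of (1.4).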

It is straightforward to check that, when \(\varphi \in \mathcal {A}(B)\), \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) has finite Dirichlet energy, agrees with \(u_{_{\scriptscriptstyle {N}}}\) on \(\partial B\), and satisfies \(\det \nabla (u_{_{\scriptscriptstyle {N}}}\circ \,\varphi ) =1\) a.e. in B, and that the same assertions are true of outer variations \(\phi \circ u_{_{\scriptscriptstyle {N}}}\), where \(\phi \in \mathcal {A}(B(0,1/\sqrt{N}))\). We remark that \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) represents a quite general form of inner variation that is appropriate to the constraint \(\det \nabla v=1\) a.e. imposed on members of \(\mathcal {A}(B)\). In the setting of nonlinear elasticity more generally, where the constraint \(\det \nabla v>0\) a.e. is enforced, inner variations of admissible maps v often take the form \(v(x+\epsilon \phi (x))\), where \(\phi \) is a smooth, compactly supported function and \(\epsilon \) is chosen sufficiently small that \(\det \nabla (x+\epsilon \phi (x))\) is bounded away from 0. See [2] together with [5, Theorem A.1]; see also [3, Section 2.4]. In the incompressible setting considered in this paper, however, maps of the form \(\varphi (x)=x+\epsilon \phi (x)\), with \(\phi \) smooth and compactly supported in B, belong to \(\mathcal {A}(B)\) only if \(\phi (x)=0\) for all x. Thus this particular form of inner variation appears to be too restrictive.
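
To see why the incompressibility constraint is so rigid in this respect, note that, since the determinant is a quadratic function of its argument in two dimensions,

$$\begin{aligned} \det \nabla \bigl (x+\epsilon \phi (x)\bigr ) = 1 + \epsilon \, \mathrm{div}\,\phi (x) + \epsilon ^2 \det \nabla \phi (x) \quad \text {a.e. in} \ B, \end{aligned}$$

so membership of \(\mathcal {A}(B)\) requires \(\mathrm{div}\,\phi + \epsilon \det \nabla \phi = 0\) a.e., a far more demanding condition than the positivity of the determinant needed in the compressible setting.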

The first main result of this paper, Theorem 2.2, is the following.

Theorem

Let \(u_{_{\scriptscriptstyle {N}}}\) be the N-covering map and let \(\varphi \) be a map in \(H^{1}_{\text {id}}(B;\mathbb {R}^2)\) which satisfies \(\det \nabla \varphi =1\) a.e. in B. Then

$$\begin{aligned} \int _{B} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2\, dx \ge \int _{B} |\nabla u_{_{\scriptscriptstyle {N}}}|^{2} dx, \end{aligned}$$
(1.5)

with equality if and only if \(\varphi \) is the identity map. In particular, \(u_{_{\scriptscriptstyle {N}}}\) is the unique global minimizer with respect to inner variations of the Dirichlet energy.

The corresponding result for outer variations, Theorem 4.3, demonstrates that, within the class of outer variations, \(u_{_{\scriptscriptstyle {N}}}\) is the unique global minimizer of the Dirichlet energy among maps agreeing with \(u_{_{\scriptscriptstyle {N}}}\) on \(\partial B\), i.e. \(\mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}}) \ge \mathbb {D}(u_{_{\scriptscriptstyle {N}}})\) for all \(\phi \) in the class \(\mathcal {A}(B(0,1/\sqrt{N}))\), with equality if and only if \( \phi \) is the identity map.

It has to be checked that inner and outer variations in this setting are genuinely different, and this is done rigorously in Proposition 6.1. It is therefore natural that the approaches needed to prove Theorems 2.2 and 4.3 differ too. Sections 2 and 3 describe our approach to the inner variation problem, which proceeds by writing \(\mathbb {D}(u_{_{\scriptscriptstyle {N}}}\circ \varphi )\) as a functional in \(\psi \), a suitably defined inverse of \(\varphi \), and by solving a series of problems of isoperimetric type. The approach needed in the case of outer variations is adapted from [30], and, although isoperimetry is again involved, the method does not apply to inner variations. Taken together, these two results clearly do not settle the question of whether \(u_{_{\scriptscriptstyle {N}}}\) minimizes \(\mathbb {D}\) in \({\mathcal {Y}}\). Nonetheless, we believe that Theorems 2.2 and 4.3 are interesting intermediate steps.

Our approach to inner variations can be generalized to functionals with subquadratic p-growth. In this setting, the extension of \(\mathcal {A}(B)\) is a delicate matter, and we rely heavily on the results of [4] concerning the class which they refer to as \(\mathcal {A}_p\). The details can be found in Sect. 5, which deals with functionals of the form

$$\begin{aligned} \mathbb {E}(u):=\int _{B} f(|\nabla u(x)|)\,dx, \end{aligned}$$
(1.6)

where f is convex, of class \(C^1(\mathbb {R}^+)\), and such that \(f'(t) \ge 0\) for \(t>0\). We suppose that, for some \(1<p<2\), there is a constant \(C>0\) such that \(\frac{1}{C}t^p \le f(t) \le C(1+t^p)\) for all \(t > 0\). The result corresponding to Theorem 2.2 is Theorem 5.7, which we record here for completeness.

Theorem

Let the functional \(\mathbb {E}\) be given by (1.6) and let \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) be a generalized inner variation of \(u_{_{\scriptscriptstyle {N}}}\). Then \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi ) \ge \mathbb {E}(u_{_{\scriptscriptstyle {N}}})\), with equality if and only if \(\varphi ={\text {id}}\).

We remark that since \(u_{_{\scriptscriptstyle {N}}}\) is a non-affine, positively one-homogeneous function, \(\nabla u_{_{\scriptscriptstyle {N}}}(x)\) is discontinuous at the point \(x=0\), a feature it has in common with many of the examples of singular minimizers in the higher dimensional calculus of variations. In brief, [17, 24, 27, 32] all construct one-homogeneous minimizers of smooth, strongly convex functionals, the last of these being in the smallest possible, and hence optimal, dimensions \(m=3,n=2\). Here, m and n are, respectively, the dimensions of the domain and codomain of the energy minimizing map. The works [24, 32] also contain cogent summaries of the wider theory, which the reader should consult for further details. To relate some of this to the problem considered in our paper, we note that in the dimensions \(m=n=2\), and without a constraint on the determinant, minimizers should be smooth. This follows from standard results about harmonic functions (or indeed from the classical result [25, Theorem 1.10.4 (iii)] in the more general case of strongly convex, quadratic growth integrands), and from [28] in the case that the candidate minimizer is supposed to be one-homogeneous. In the presence of the determinant constraint, however, these results cannot apply to members of \({\mathcal {Y}}\) when \(N\ge 2\).

The paper concludes by focusing in detail on the case \(N=2\), where we examine a rich subclass of \({\mathcal {Y}}\) consisting of maps of the form \(v=h\,\circ \,u_2\,\circ \,g\), where h and g are self-maps of the balls \(B(0,1/\sqrt{2})\) and B respectively. In fact, the maps g and h are generated by flows in a way that is made precise in Sect. 6.2, and because of this there is a natural parameter in v, which we call \(\delta \), on which both \(h=h(z;\delta )\) and \(g=g(y;\delta )\) depend smoothly, and for which it holds that \(g(y;0)=y\) for all y in B and \(h(z;0)=z\) for all z in \(B(0,1/\sqrt{2})\). This enables us to write

$$\begin{aligned} v(y;\delta ) = h(u_2(g(y;\delta ));\delta ), \end{aligned}$$
(1.7)

which is such that \(v(\cdot ;0)=u_2\), and so \(\mathbb {D}(v(\cdot ;\delta ))\) depends naturally on \(\delta \) and obeys \(\mathbb {D}(v(\cdot ;0))=\mathbb {D}(u_2)\). The final main result of the paper, Theorem 6.3, concerns maps of the type \(v(\cdot ,\delta )\) and can be described as follows. Note that J stands for the \(2\times 2\) matrix representing a rotation of \(\pi /2\) radians anticlockwise.

Theorem

Let \(v(\cdot ,\delta )\) be given by (1.7) and let \(\Xi =J \nabla \xi \) and \(\Sigma =J \nabla \sigma \) be smooth, divergence-free maps with compact support in \(B(0,1/\sqrt{2}){\setminus }\{0\}\) and \(B {\setminus } \{0\}\) respectively. Suppose that h and g are solutions of the following equations

$$\begin{aligned}&\left\{ \begin{array}{ll}\partial _{\delta }h(z,\delta ) =J\nabla \xi (h(z,\delta )) &{}\quad z \in B(0,1/\sqrt{2}), \ \delta \in (-\delta _0,\delta _0) \\ h(z,0)=z &{}\quad z \in B(0,1/\sqrt{2}),\end{array}\right. \\&\left\{ \begin{array}{ll} \partial _{\delta }g(y,\delta )=J\nabla \sigma (g(y,\delta )) &{}\quad y \in B, \ \delta \in (-\delta _0,\delta _0) \\ g(y,0)=y &{}\quad y \in B. \end{array}\right. \end{aligned}$$

Then

  1. (a)

    \(\left. \partial _{\delta }\right| _{\delta =0}D(v(\cdot ,\delta )) = 0\), and

  2. (b)

    it holds that

    $$\begin{aligned} \left. \partial _\delta ^2\right| _{\delta =0}D(v(\cdot ,\delta )) \ge 4 \int _B \left\{ (\sigma - \xi \circ u_2)_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (\sigma - \xi \circ u_2)_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy, \end{aligned}$$
    (1.8)

    and \(\left. \partial _\delta ^2\right| _{\delta =0}D(v(\cdot ,\delta ))=0\) only if

    $$\begin{aligned} D(v(\cdot ,\delta )) =D(u_2) \quad |\delta | < \delta _0. \end{aligned}$$

In particular, the map \(u_2\) is a local minimizer of the Dirichlet energy with respect to all variations of the form \(v(\cdot ,\delta )\).

1.1 Notation

For non-zero x, we will write \(\overline{x}= x/|x|\). Note then that for any weakly differentiable map \(\varphi \) we may write \(\nabla \varphi (x) = \varphi _{_R} \otimes \overline{x}+ \varphi _{\tau } \otimes J\overline{x}\), where J is the \(2 \times 2\) matrix representing a rotation of \(\pi /2\) radians anticlockwise. For any two vectors a, b in \(\mathbb {R}^2\), the \(2 \times 2\) matrix \(a\otimes b\) is defined by its action \(a\otimes b \ v:=(b\cdot v)a\) on any v in \(\mathbb {R}^2\). In plane polar coordinates \((R, \alpha )\), the angular derivative \(\psi _\tau \) is \((\psi _\alpha )/R\) and the radial derivative is \(\psi _{_{\small {R}}}\). The inner product of two matrices \(A_1\) and \(A_2\) will be denoted by \(A_1 \cdot A_2 = \mathrm{tr}\,(A_1^T A_2)\) and the norm by \(|A_1|^2=\mathrm{tr}\,(A_1^T A_1)\). The two-dimensional Lebesgue measure of a (measurable) set \(E \subset \mathbb {R}^2\) will be written |E| provided it is unambiguous to do so, and otherwise as \(\mathcal {L}^2(E)\). Throughout the paper, we will use \(C_r\) to represent the circle of radius r and centre 0. The rest of the notation is either standard or is explained where it is introduced.
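
We also record two elementary consequences of this decomposition which will be useful in what follows: since \(\{\overline{x}, J\overline{x}\}\) is an orthonormal basis of \(\mathbb {R}^2\),

$$\begin{aligned} |\nabla \varphi |^2 = |\varphi _{_R}|^2 + |\varphi _{\tau }|^2 \quad \text {and} \quad \det \nabla \varphi = J\varphi _{_R}\cdot \varphi _{\tau } \quad \text {a.e.} \end{aligned}$$

The first follows from \(|a\otimes b|^2=|a|^2|b|^2\) and \(\overline{x}\cdot J\overline{x}=0\); the second can be checked directly from the decomposition.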

2 The Dirichlet energy of inner variations of \(u_{_{\scriptscriptstyle {N}}}\)

Let \(\varphi \) belong to \(\mathcal {A}(B)\) and consider the composed map \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \). A short calculation shows that

$$\begin{aligned} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2 = \frac{1}{N}(X_1(\varphi )^2 + X_2(\varphi )^2) + N(Y_1(\varphi )^2+Y_2(\varphi )^2), \end{aligned}$$
(2.1)

where \(X_1(\varphi )=\overline{\varphi }\cdot \varphi _{_R}\), \(X_2(\varphi )=\overline{\varphi }\cdot \varphi _\tau \), \(Y_1(\varphi )=J \overline{\varphi }\cdot \varphi _{_R}\) and \(Y_2(\varphi )=J \overline{\varphi }\cdot \varphi _\tau \). The right-hand side of (2.1) can be rewritten in terms of \(\nabla \varphi ^T \overline{\varphi }\) and \(\nabla \varphi ^T J\overline{\varphi }\) to give

$$\begin{aligned} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2 = \frac{1}{N}|\nabla \varphi ^T \overline{\varphi }|^2 + N|\nabla \varphi ^T J \overline{\varphi }|^2. \end{aligned}$$
(2.2)
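
For the reader's convenience, here is a sketch of the calculation: writing \(\beta \) for the polar angle of \(\varphi (x)\), so that \(\overline{\varphi }(x)=e(\beta )\), the polar form of \(\nabla u_{_{\scriptscriptstyle {N}}}\) recorded in the Introduction and the chain rule give

$$\begin{aligned} \nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi ) = \nabla u_{_{\scriptscriptstyle {N}}}(\varphi )\,\nabla \varphi = \frac{1}{\sqrt{N}}\, e(N\beta )\otimes \nabla \varphi ^T\overline{\varphi }+ \sqrt{N}\, Je(N\beta )\otimes \nabla \varphi ^T J\overline{\varphi }, \end{aligned}$$

and since \(e(N\beta )\) and \(Je(N\beta )\) are orthonormal, taking squared norms yields (2.2); expressing \(\nabla \varphi ^T\overline{\varphi }\) and \(\nabla \varphi ^T J\overline{\varphi }\) in the basis \(\{\overline{x},J\overline{x}\}\) then gives (2.1).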

For later use, we note that for \(N\ge 1\)

$$\begin{aligned} \frac{1}{\sqrt{N}} |\nabla \varphi | \le |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )| \le \sqrt{N} |\nabla \varphi |. \end{aligned}$$
(2.3)
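
This is a consequence of (2.2) and the fact that \(\{\overline{\varphi },J\overline{\varphi }\}\) is orthonormal almost everywhere, so that

$$\begin{aligned} |\nabla \varphi ^T\overline{\varphi }|^2 + |\nabla \varphi ^T J\overline{\varphi }|^2 = |\nabla \varphi |^2 \quad \text {a.e. in} \ B; \end{aligned}$$

bounding each of the coefficients in (2.2) below by 1/N and above by N then gives (2.3).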

Define \(W: {\mathbb {S}}^1 \times \mathbb {R}^{2 \times 2} \times \mathbb {N}\rightarrow \mathbb {R}\) by

$$\begin{aligned} W(\overline{x}, F, N) = \frac{1}{N} |F^T \overline{x}|^2 + N|F^T J \overline{x}|^2. \end{aligned}$$
(2.4)

Note that, from (2.2) and (2.4),

$$\begin{aligned} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2 = W({\overline{\varphi }},\nabla \varphi , N). \end{aligned}$$
(2.5)

In the following we shall need to refer to the inverse, \(\psi \), say, of a map \(\varphi \) belonging to \(\mathcal {A}(B)\). The existence and regularity of a continuous map \(\psi \) satisfying \(\psi (\varphi (x))=x\) for a.e. x in B were established in [31, Lemma 6 and Theorem 8], as was the validity of the expression \(\nabla \psi (\varphi (x))=(\nabla \varphi (x))^{-1}\). Bearing the constraint \(\det \nabla \varphi =1\) a.e. in mind, this immediately implies that \(\nabla \psi (\varphi (x))=\mathrm{adj}\,\nabla \varphi (x)\) a.e., and hence, via [31, Theorem 2 (ii)], that \(\nabla \psi \in L^2(B;\mathbb {R}^{2\times 2})\).

Proposition 2.1

Let W be given by (2.4) and let \(\varphi \) lie in \(\mathcal {A}(B)\), so that \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) is an inner variation of the N-covering map. Let \(\psi =\varphi ^{-1}: B \mapsto B\) be the inverse of \(\varphi \), as described above. Then \(\psi \) also belongs to \(\mathcal {A}(B)\), and

$$\begin{aligned} W({\overline{\varphi }},\nabla \varphi , N) = W({\overline{y}},\nabla \psi ^T(y),1/N) \end{aligned}$$
(2.6)

pointwise a.e., where \(y:=\varphi (x)\). In particular, the Dirichlet energy of \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) is given by

$$\begin{aligned} \int _{B} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2\, dx = I(\psi ,N), \end{aligned}$$
(2.7)

where

$$\begin{aligned} I(\psi ,N):=\int _{B} W(\overline{y}, \nabla \psi ^T(y), 1/N) \, dy. \end{aligned}$$
(2.8)

Proof

Apply [31, Theorem 8] together with the constraint \(\det \nabla \varphi (x)=1\) a.e. to obtain \(\nabla \psi (\varphi (x)) = \mathrm{adj}\,\nabla \varphi (x)\) a.e. in B. Letting \(y=\varphi (x)\) and using the identity \(\mathrm{adj}\,A = J^T A^T J\) gives \(\nabla \varphi (x)^T = J \nabla \psi (y) J^T\), and, since \(J^T=-J\), it follows that \(\nabla \varphi ^T(x)\overline{\varphi }(x) = J^T\nabla \psi (y) J {\overline{y}}\) and \(\nabla \varphi ^T(x)J \overline{\varphi }(x) = J \nabla \psi (y) {\overline{y}}\). Now, \(v \mapsto \pm Jv\) is an isometry of the plane, so

$$\begin{aligned} W({\overline{\varphi }},\nabla \varphi , N)&= \frac{1}{N}|\nabla \varphi ^T \overline{\varphi }|^2 + N|\nabla \varphi ^T J \overline{\varphi }|^2 \\&= \frac{1}{N}|J^T\nabla \psi (y) J {\bar{y}}|^2 + N|J \nabla \psi (y) {\bar{y}}|^2 \\&= \frac{1}{N}|\nabla \psi (y) J {\bar{y}}|^2 + N|\nabla \psi (y) {\bar{y}}|^2 \\&= W({\bar{y}},\nabla \psi (y)^T,1/N), \end{aligned}$$

which proves (2.6). Integrating the expression above with respect to x and then applying the change of variables formula [31, Theorem 2 (ii)] leads to (2.7). \(\square \)

Using Proposition 2.1, we can now study the Dirichlet energy of the map \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) by considering \(I(\psi ,N)\). The latter can be expressed very simply in polar coordinates as

$$\begin{aligned} I(\psi ,N) = \int _{B} N |\psi _{_R}|^2 + \frac{1}{N} |\psi _\tau |^2 \, dy \end{aligned}$$
(2.9)

and hence, by convexity, the inequality

$$\begin{aligned} I(\psi ,N) \ge I({\text {id}},N) + 2N \int _{B} (\psi _{_R}-\overline{y})\cdot \overline{y}\, dy + \frac{2}{N}\int _{B} (\psi _{\tau }-J \overline{y})\cdot J\overline{y}\, dy \end{aligned}$$
(2.10)

must hold. By a direct calculation using the boundary condition \(\psi (y)=y\) for \(y \in \partial B\), the two integrals combine to give

$$\begin{aligned} I(\psi ,N) \ge I({\text {id}},N)+ 2(N-1/N)\left( \pi - \int _{B} \psi \cdot \overline{y}\, d\nu (y)\right) , \end{aligned}$$
(2.11)

where \(d\nu \) corresponds to the measure \(d\mathcal {L}^2(y)/|y|\). Defining

$$\begin{aligned} F(\psi ) : = \int _{B} \psi \cdot \overline{y}\, d\nu (y), \end{aligned}$$
(2.12)

we claim that if \(\psi \) belongs to \(\mathcal {A}(B)\) then \(F(\psi ) \le \pi \). Supposing for now that this is established, we immediately obtain from (2.11) that \(I(\psi ,N) \ge I({\text {id}},N)\), which, thanks to (2.7), implies that

$$\begin{aligned} \int _{B} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2\, dx \ge \int _{B} |\nabla u_{_{\scriptscriptstyle {N}}}|^2\, dx. \end{aligned}$$
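
For later reference we record the direct calculation behind the passage from (2.10) to (2.11). Writing \(y=Re(\alpha )\), integrating by parts and using \(\psi (y)=y\) on \(\partial B\) together with the \(2\pi \)-periodicity in \(\alpha \),

$$\begin{aligned} \int _{B} \psi _{_R}\cdot \overline{y}\, dy&= \int _0^{2\pi }\int _0^1 \partial _R\psi \cdot e(\alpha )\, R \, dR \, d\alpha = 2\pi - F(\psi ), \\ \int _{B} \psi _{\tau }\cdot J\overline{y}\, dy&= \int _0^{2\pi }\int _0^1 \partial _\alpha \psi \cdot Je(\alpha )\, dR \, d\alpha = F(\psi ), \end{aligned}$$

while \(\int _B \overline{y}\cdot \overline{y}\, dy = \int _B J\overline{y}\cdot J\overline{y}\, dy = \pi \). Inserting these values into the right-hand side of (2.10) gives \(2N(\pi -F(\psi ))-\tfrac{2}{N}(\pi -F(\psi ))=2(N-1/N)(\pi -F(\psi ))\), which is the second term in (2.11).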

Our goal is now to prove the following result.

Theorem 2.2

Let \(u_{_{\scriptscriptstyle {N}}}\) be the N-covering map and let \(\varphi \) be a map in \(H^{1}_{\text {id}}(B;\mathbb {R}^2)\) which satisfies \(\det \nabla \varphi =1\) a.e. in B. Then

$$\begin{aligned} \int _{B} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2\, dx \ge \int _{B} |\nabla u_{_{\scriptscriptstyle {N}}}|^{2} dx, \end{aligned}$$
(2.13)

with equality if and only if \(\varphi \) is the identity map. In particular, \(u_{_{\scriptscriptstyle {N}}}\) is the unique global minimizer with respect to inner variations of the Dirichlet energy.

A short proof of Theorem 2.2 which draws together all its supporting results will be given at the end of Sect. 3.

3 Bounds on \(F(\psi )\)

It is enough to establish the stronger result that, for any admissible \(\psi \) and almost every \(r \in (0,1)\), the functional

$$\begin{aligned} F(\psi ;r): = \frac{1}{r}\int _{\partial B(0,r)} \psi (y) \cdot \overline{y}\,d{\mathcal {H}}^1, \end{aligned}$$
(3.1)

when subject to the constraint

$$\begin{aligned} \frac{1}{2}\int _{\partial B(0,r)} J \psi \cdot \psi _{\tau } \,d{\mathcal {H}}^1 = \pi r^2, \end{aligned}$$
(3.2)

obeys \(F(\psi ;r ) \le 2\pi r\). Integrating (3.1) over \(r \in (0,1)\) will then yield the claim that \(F(\psi )\le \pi \).
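
Indeed, writing \(y=re(\alpha )\) and \(d\mathcal {L}^2(y)=r\,dr\,d\alpha \) in (2.12),

$$\begin{aligned} F(\psi ) = \int _0^1\int _0^{2\pi } \psi (re(\alpha ))\cdot e(\alpha )\, d\alpha \, dr = \int _0^1 F(\psi ;r)\, dr, \end{aligned}$$

so that \(F(\psi ;r)\le 2\pi r\) for a.e. \(r \in (0,1)\) gives \(F(\psi )\le \int _0^1 2\pi r \, dr = \pi \), as claimed.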

The constrained variational problem of maximising \(F(\cdot ;r)\) subject to (3.2) is of normal (or non-degenerate), isoperimetric type, and can be solved by introducing a Lagrange multiplier. Before justifying this assertion rigorously, we first gather some basic facts about admissible maps. In the following, \(\psi \arrowvert _{C_r}\) denotes the restriction of the continuous map \(\psi \) to the set \(C_r=\partial B(0,r)\).

Proposition 3.1

Let \(\psi \) be admissible, that is, let \(\psi \) belong to the Sobolev space \(H^1_{\text {id}}(B;{\mathbb {R}}^2)\) and satisfy \(\det \nabla \psi = 1\) a.e. in B. Then:

  1. (a)

    \(\psi \) has a continuous representative;

  2. (b)

    for \({\mathcal {L}}^1\)-a.e. \(r \in (0,1)\), \(\psi \arrowvert _{C_r}\) belongs to \(H^1(C_r;\mathbb {R}^2)\) and is \({\mathcal {H}}^1\)-a.e. \(1-1\);

  3. (c)

    \(\psi \) is measure-preserving, and the expression (3.2) holds for \({\mathcal {L}}^1\)-a.e \(r \in (0,1)\);

  4. (d)

    the topological degree \(d(\psi ,B_r,w)\in \{0,1\}\) for all \(w \in \mathbb {R}^2 {\setminus } \psi (C_r)\), and the set \(U_1:=\{w \in \mathbb {R}^2 {\setminus } \psi (C_r): \ d(\psi ,B_r,w)=1\}\) coincides with \(\psi (B_r)\);

  5. (e)

    \(U_1\) has a generalized exterior normal \({\tilde{\nu }}(\psi (y))=\frac{J^T \psi _{\tau }(y)}{|\psi _{\tau }(y)|}\) for a.e. \(y \in C_r\), and \(\partial U_1 =\psi (C_r)\).

Proof

  1. (a)

    This is [31, Theorem 5] or [16, Theorem 2.3.2]; alternatively, see [13, Cor. 5.19].

  2. (b)

    The first assertion follows from [26, Proposition 2.8 (iii) and (v)]: see Remark 2 after the statement of [26, Proposition 2.8]. The second part follows by first applying [31, Lemma 5.1] to deduce that \(\psi \) is \(1-1\) on a set of full \(\mathcal {L}^2\)-measure in B. It must therefore be the case that for a.e. \(r \in (0,1)\), \(\psi \arrowvert _{C_r}\) is \({\mathcal {H}}^1\)-a.e. \(1-1\).

  3. (c)

    That \(\psi \) is measure-preserving can be deduced from [31, Theorem 7]. Alternatively, one can argue as follows. Let \(G \subset B\) be open and such that \(|\partial G|=0\). First note that, since \(\psi \) is open (because it has a continuous inverse, \(\varphi \)) and preserves \(\partial B\), the degree \(d(\psi ,B,x)\) is well-defined at any point x in \(B=\psi (B)\) and is equal to 1 for such x. By [13, Proposition 5.25], we note that

    $$\begin{aligned} \int _{G} \det \nabla \psi (y) \, dy = \int _{\mathbb {R}^2} d(\psi ,G,x) \, dx. \end{aligned}$$
    (3.3)

    By properties of the degree, \(d(\psi ,G,x)=0\) if \(x \notin \psi (G)\). The degree \(d(\psi ,G,x)\) is not defined when \(x \in \psi (\partial G)\), but, since \(\psi \) has the N-property (by [13, Theorem 5.32]), this set is null. For x in \(\psi (G)\) we can apply excision and \(d(\psi ,B,x)=1\) to conclude that \(d(\psi ,G,x)\) and the characteristic function \(\chi _{\psi (G)}(x)\) are equal almost everywhere. Inserting this into (3.3) together with \(\det \nabla \psi =1\) gives the desired result for open sets. The result for general measurable G follows by approximation.

    Finally, to prove (3.2) it is clearly enough to show that \((*) \ \int _{B_r} \det \nabla \psi \, dx= \frac{1}{2}\int _{C_r} J \psi \cdot \psi _{\tau } \,d{\mathcal {H}}^1\) for smooth maps \(\psi \) and then apply an approximation argument. The latter expression is a natural consequence of the fact that \(\det \nabla \psi \) is a null Lagrangian, and is hence a divergence. In this case, \(\det \nabla \psi = (1/2) \mathrm{div}\,(\psi _2 J \nabla \psi _1-\psi _1 J \nabla \psi _2)\), which can be integrated using Green’s theorem to give \((*)\). Equation (3.2) is now immediate from \((*)\) and \(\det \nabla \psi = 1\).

  4. (d)

    By [7, Proposition 1, part (v)], \(\psi \) obeys condition (INV), and hence by [26, Lemma 3.5, part (ii)], \(d(\psi ,B_r,w)\in \{0,1\}\) for all \(w \in \mathbb {R}^2 {\setminus } \psi (C_r)\). Hence, by [26, Lemma 3.5], \(U_1\) coincides with the topological image of \(B_r\) under \(\psi \) which, given that \(\psi \) is continuous, is the same thing as \(\psi (B_r)\).

  5. (e)

    This is [26, Eq. 3.14] adapted to the case at hand. Alternatively, see [7, Lemma 1, Eq. 2.6]. \(\square \)

We first deal with the question of the existence of a maximiser of the functional \(F(\cdot ;r)\) among suitable functions in the set \(\mathcal {C}(r)\) described by (3.5) below. Let \(r \in (0,1)\) be such that Proposition 3.1(b)–(c) apply to \(\psi |_{C_r}\). Define

$$\begin{aligned} A(\psi |_{C_r}) = \frac{1}{2}\int _{C_r} J \psi \cdot \psi _{\tau } \,d{\mathcal {H}}^1 \end{aligned}$$
(3.4)

and let

$$\begin{aligned} {\mathcal {C}}(r) = \{ \psi |_{C_r}\in H^1(C_r;\mathbb {R}^2): \psi \in {\mathcal {A}},\ \ A(\psi |_{C_r})=\pi r^2\}. \end{aligned}$$
(3.5)

We can assume that for almost every r in (0, 1), sequences \((\psi ^{(n)}\restriction _{C_r})_{n \in {\mathbb {N}}}\) in \({\mathcal {C}}(r)\) are bounded in \(H^1\)-norm. To see this, consider the full sequence \((\psi ^{(n)})_{n \in {\mathbb {N}}}\) in \({\mathcal {A}}\). Without loss of generality \(||\psi ^{(n)}||_{H^1(B)}\) is uniformly bounded, and hence, in particular, so must \(||{\psi ^{(n)}_{\tau }\restriction _{C_r}}||_{L^2(C_r)}\) be for a.e. r in (0, 1). Note that \(\partial _{\tau }\) and restriction to \(C_r\) commute, so we can write \((\psi |_{C_r})_{\tau }={\psi _{\tau }}\restriction _{C_r}\) for any function \(\psi |_{C_r}\) in \(\mathcal {C}(r)\). Next, since \(F(\cdot ,r)\) is clearly bounded above, we can choose \((\psi ^{(n)}\restriction _{C_r})_{n \in {\mathbb {N}}}\) such that

$$\begin{aligned} F(\psi ^{(n)}|_{C_r};r) \rightarrow \sup \{ F(\psi |_{C_r};r): \ \psi |_{C_r}\in \mathcal {C}(r)\} \end{aligned}$$

as \(n \rightarrow \infty \). By the foregoing argument, we can extract a subsequence (not relabelled) such that \(\psi ^{(n)}\restriction _{C_r} \rightharpoonup v\) for some function v in \(H^1(C_r;\mathbb {R}^2)\). It is then immediate, by Sobolev embedding, that \(F(\psi ^{(n)}\restriction _{C_r};r) \rightarrow F(v;r)\), and, by Sobolev embedding together with the weak convergence \(\psi _{\tau }^{(n)}\restriction _{C_r} \rightharpoonup v_{\tau }\), that \(A(\psi ^{(n)}\restriction _{C_r} ) \rightarrow A(v)\). Hence \(F(v;r)=\sup _{\mathcal {C}(r)} F(\psi |_{C_r};r)\) and \(A(v) = \pi r^2\). Further, by again considering the full sequence \((\psi ^{(n)})_{n \in {\mathbb {N}}}\) in \({\mathcal {A}}\), which we may assume, without loss of generality, to converge weakly to \(\psi \) in \(\mathcal {A}(B)\), it follows that \(v=\psi |_{C_r}\). Hence, for almost every \(r \in (0,1)\), a maximiser \(\psi |_{C_r}\) of \(F(\cdot ;r)\) exists in \(\mathcal {C}(r)\).

Introduce plane polar coordinates on \(C_r\) via \(y = r e(\alpha )\) for \(0 \le \alpha \le 2 \pi \), where \(e(\alpha ) = (\cos \alpha , \sin \alpha )^{T}\). We take z belonging to \(\psi (B_r)\) and, for \(0 \le \alpha < 2\pi \), define the maps \(\rho (\cdot )\) and \(\sigma (\cdot )\) by

$$\begin{aligned} \rho (\alpha )&= |\psi (r e(\alpha ))-z|\quad \text {and} \end{aligned}$$
(3.6)
$$\begin{aligned} e(\sigma (\alpha ))&= \frac{\psi (r e(\alpha ))-z}{\rho (\alpha )}. \end{aligned}$$
(3.7)

By Lemma A.1, \(\rho \) and \(\sigma \) can be chosen such that they both belong to \(H^1([0,2\pi ),\mathbb {R})\), and, by Lemma A.2, \(\sigma \) obeys \(\sigma (0)=\sigma _0\) and \(\sigma (2\pi )=2\pi +\sigma _0\) for some \(\sigma _0\). Notice that if either \(\rho (\alpha ) > r\) for all \(\alpha \) or \(\rho (\alpha ) < r\) for all \(\alpha \) then \(\psi (B_r)\) would have measure strictly greater than or, respectively, strictly less than \(|B_r|\), contradicting Proposition 3.1(c). Hence, and without loss of generality, we may assume that \(\rho (0)=r\). The quantity \(\sigma _0\) must remain free at this stage; it will, in fact, parametrize the two families of extremals described in Lemma 3.3.

In terms of \(\rho \) and \(\sigma \),

$$\begin{aligned} A(\psi |_{C_r}) = (1/2)\int _{0}^{2\pi } \rho ^2(\alpha ) \sigma '(\alpha )\, d\alpha \end{aligned}$$
(3.8)

and

$$\begin{aligned} F(\psi |_{C_r};r)=\int _{0}^{2\pi } \rho (\alpha ) \cos (\sigma (\alpha )-\alpha ) \, d\alpha . \end{aligned}$$
(3.9)
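
These formulae follow on substituting \(\psi (re(\alpha ))=z+\rho (\alpha )e(\sigma (\alpha ))\) into (3.1) and (3.4): for example,

$$\begin{aligned} F(\psi |_{C_r};r)=\int _0^{2\pi }\bigl (z+\rho (\alpha ) e(\sigma (\alpha ))\bigr )\cdot e(\alpha )\, d\alpha = \int _0^{2\pi }\rho (\alpha )\cos (\sigma (\alpha )-\alpha )\, d\alpha , \end{aligned}$$

since \(\int _0^{2\pi }e(\alpha )\, d\alpha =0\); the expression (3.8) for \(A(\psi |_{C_r})\) is obtained in the same way, the contribution of the constant z vanishing by the periodicity of \(\psi |_{C_r}\).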

By identifying \(\psi |_{C_r}\) with the pair of functions \((\rho , \sigma )\) as described above, it is convenient to alter the notation slightly so that \(A(\rho , \sigma ):=A(\psi |_{C_r})\) and \(F(\rho , \sigma ):=F(\psi |_{C_r};r)\). We now determine conditions necessary for \(\psi |_{C_r}= (\rho , \sigma )\) to be a stationary point of F amongst maps which obey the constraint \(A(\rho , \sigma )=\pi r^2\).

Lemma 3.2

Let \((\rho , \sigma )\) be determined from \(\psi |_{C_r}\) as described in (3.6) and (3.7) above, and suppose that \((\rho , \sigma )\) is a stationary point of F with respect to perturbations which obey \(A({\tilde{\rho }}, {\tilde{\sigma }}) = \pi r^2\). Then, for some constant \(\lambda \), the coupled ODEs

$$\begin{aligned} \lambda \rho \sigma ' + \cos (\sigma - \alpha )&= 0 \end{aligned}$$
(3.10)
$$\begin{aligned} \lambda \rho ' + \sin (\sigma - \alpha )&= 0 \end{aligned}$$
(3.11)

hold almost everywhere on \([0,2\pi ]\).

Proof

Let \(\epsilon _1\) and \(\epsilon _2\) be real, let \(\rho _1\), \(\rho _2\), \(\sigma _1\) and \(\sigma _2\) be smooth and compactly supported in \([0,2\pi ]\), and consider the function

$$\begin{aligned} \gamma (\epsilon _1,\epsilon _2):=A(\rho +\epsilon _1\rho _1+\epsilon _2 \rho _2, \sigma +\epsilon _1\sigma _1+\epsilon _2 \sigma _2). \end{aligned}$$

Now \(\gamma _{\epsilon _2}(0,0)=\int _{0}^{2\pi } \rho \sigma ' \rho _2 + (1/2)\rho ^2 \sigma _2' \, d\alpha \). Suppose for a contradiction that \(\gamma _{\epsilon _2}(0,0)=0\) for every choice of \(\rho _2\), \(\sigma _2\). Then it easily follows that \(\rho \sigma '=0\) and hence that \(A(\rho , \sigma )=0\), contradicting the constraint \(A(\rho , \sigma )=\pi r^2\). Hence we can choose \(\rho _2\) and \(\sigma _2\) so that \(\gamma _{\epsilon _2}(0,0) \ne 0\), and, via the implicit function theorem, find a smooth function \(g: (-1,1) \rightarrow \mathbb {R}\) such that \(g(0)=0\) and \(\gamma (\epsilon ,g(\epsilon ))=\gamma (0,0)=\pi r^2\) for all sufficiently small \(\epsilon \). Differentiating the latter expression gives \(\gamma _{\epsilon _1}(0,0) + g'(0) \gamma _{\epsilon _2}(0,0) = 0\), which will be of use shortly. The variations \((\rho ^{\epsilon },\sigma ^{\epsilon }) := (\rho ,\sigma )+\epsilon (\rho _1,\sigma _1)+g(\epsilon )(\rho _2,\sigma _2)\) obey the constraint and so are admissible in the sense given in the statement of the lemma. In particular, \(\partial _{\epsilon }\arrowvert _{\epsilon = 0} F(\rho ^{\epsilon },\sigma ^{\epsilon })=0\), which we compute as follows (and where, for brevity, we set \(\Sigma :=\sigma (\alpha ) - \alpha \) and then suppress \(\alpha \) whenever possible):

$$\begin{aligned} \partial _{\epsilon }\arrowvert _{\epsilon = 0}F(\rho ^{\epsilon },\sigma ^{\epsilon })&= \int _{0}^{2\pi } (\rho _1+g'(0)\rho _2)\cos \Sigma - \rho (\sigma _1+g'(0)\sigma _2) \sin \Sigma \, d\alpha \nonumber \\&= \int _{0}^{2\pi } \rho _1\cos \Sigma - \rho \sigma _1 \sin \Sigma \, d\alpha + \lambda \gamma _{\epsilon _1}(0,0) \end{aligned}$$
(3.12)

provided we set \(\lambda =-(\int _{0}^{2\pi } \rho _2 \cos \Sigma - \rho \sigma _2 \sin \Sigma \, d\alpha ) /\gamma _{\epsilon _2}(0,0)\). Inserting \(\gamma _{\epsilon _1}(0,0)=\int _{0}^{2\pi } \rho \sigma ' \rho _1 + (1/2)\rho ^2 \sigma _1' \, d\alpha \) into (3.12) gives

$$\begin{aligned} 0&= \int _{0}^{2\pi } \rho _1 (\cos \Sigma + \lambda \rho \sigma ') + (\lambda /2)\rho ^2 \sigma _1' - \rho \sigma _1 \sin \Sigma \, d\alpha \\&= \int _{0}^{2\pi } \rho _1 (\cos \Sigma + \lambda \rho \sigma ') - \sigma _1 \rho (\lambda \rho ' + \sin \Sigma ) \, d\alpha , \end{aligned}$$

from which (3.10) follows immediately, as does \( \rho (\lambda \rho ' + \sin \Sigma )= 0\) a.e. on \([0,2\pi ]\). Since \(\psi |_{C_r}\) is, by Proposition 3.1(b), \(1-1\) \({\mathcal {H}}^1\)-a.e., we may assume that \(\rho \) vanishes only on an \({\mathcal {H}}^1\)-null subset of \([0,2\pi ]\), and hence (3.11) must hold almost everywhere. \(\square \)

It remains to identify the solutions to (3.10) and (3.11), and thereby calculate the extreme values of \(F(\psi |_{C_r};r)\) among \(\psi |_{C_r}\) in \(\mathcal {C}(r)\).

Lemma 3.3

Let \((\rho , \sigma )\) be determined from \(\psi |_{C_r}\) as described in (3.6) and (3.7) above, and suppose that (3.10) and (3.11) hold. Then \(-2\pi r \le F(\psi |_{C_r};r) \le 2\pi r\).

Proof

Recall that \(\Sigma (\alpha ) = \sigma (\alpha ) - \alpha \). By altering \(\sigma '\) and \(\rho '\) on a set of measure zero if necessary, we can assume that (3.10) and (3.11) hold pointwise on \([0,2\pi )\). Clearly, \(\lambda \ne 0\). Moreover, since \(\Sigma \) and \(\rho \) are absolutely continuous, a simple bootstrapping argument using (3.10) and (3.11) implies that \(\rho \) and \(\Sigma \) are smooth away from the set \(\{\alpha \in [0,2\pi ): \ \rho (\alpha )=0\}\). Since it is long, we break the rest of the proof into several steps.

  • Step 1. Let \(k\ne 0\) be a constant to be determined shortly, and let

    $$\begin{aligned} \rho&= -k Y \sec \Sigma , \end{aligned}$$
    (3.13)
    $$\begin{aligned} \tan \Sigma&= Y'/Y, \end{aligned}$$
    (3.14)

    provided \(\cos \Sigma \ne 0\) and \(Y \ne 0\). Let Y solve \(Y''+Y=(\lambda k)^{-1}\) on \((0,2 \pi )\) subject to \(Y(0)=1\) and \(Y'(0)=\tan \sigma _0\), where, to start with, we suppose that \(\cos \sigma _0 \ne 0\). The motivation for transforming variables in this way is explained in Remark 3.4 below. The aim is to show that with \(\rho \), \(\Sigma \) and Y so arranged, both (3.10) and (3.11) automatically hold. To that end, note that by first differentiating (3.14) and then using \(Y''+Y=(\lambda k)^{-1}\),

    $$\begin{aligned} \sec ^2(\Sigma ) \sigma '= & {} \frac{YY''-Y'^2}{Y^2} + 1 + \tan ^2 \Sigma \nonumber \\= & {} (\lambda k)^{-1}\frac{1}{Y}. \end{aligned}$$
    (3.15)

    Bearing (3.13) in mind, the latter expression rearranges to give (3.10).

    To see that (3.11) holds, differentiate (3.13) and use (3.14) and its derivative to obtain

    $$\begin{aligned} \rho '&= -k Y \sin (\Sigma ) \sec ^2(\Sigma ) \Sigma ' - k Y' \sec (\Sigma ) \\&= -k Y \sin (\Sigma )\left( \frac{Y'}{Y}\right) '-k Y \tan (\Sigma ) \sec (\Sigma ) \\&= -k Y \sin (\Sigma )\left( \frac{YY''-Y'^2}{Y^2} + 1 + \tan ^2 \Sigma \right) \\&= -\lambda ^{-1} \sin (\Sigma ). \end{aligned}$$

    (It may help to notice that the bracketed term in the penultimate line also appears in (3.15), and is therefore \((\lambda k Y)^{-1}\).)

  • Step 2. Eliminating \(\Sigma \) from (3.13) and (3.14) gives

    $$\begin{aligned} \rho =|k|(Y^2+Y'^2)^{1/2}. \end{aligned}$$
    (3.16)

    When \(\cos \sigma _0 \ne 0\), we impose \(\rho (0)=r\) by setting \(k=-r\cos \sigma _0\) in (3.13) (and by recalling that \(Y(0)=1\)). This gives rise to two possible sets of equations expressing \(\cos \Sigma \) and \(\sin \Sigma \) purely in terms of Y as follows:

    $$\begin{aligned} \cos \Sigma&= \frac{\pm Y}{(Y^2+Y'^2)^{1/2}} \end{aligned}$$
    (3.17)
    $$\begin{aligned} \sin \Sigma&= \frac{\pm Y'}{(Y^2+Y'^2)^{1/2}}, \end{aligned}$$
    (3.18)

    the ± sign to be interpreted as \(\text {sgn}\,(\cos \sigma _0)\). Suppose for now that \(\cos \sigma _0 >0\). Note that from (3.16), (3.17) and (3.18), we can express the components of \(\psi |_{C_r}\) as

    $$\begin{aligned} \rho \cos \sigma&= r \cos (\sigma _0) ( Y \cos \alpha - Y' \sin \alpha ) \end{aligned}$$
    (3.19)
    $$\begin{aligned} \rho \sin \sigma&= r \cos (\sigma _0) (Y' \cos \alpha + Y \sin \alpha ). \end{aligned}$$
    (3.20)

    Since

    $$\begin{aligned} Y=(\lambda k)^{-1} + (1-(\lambda k)^{-1})\cos \alpha + \tan \sigma _0 \sin \alpha , \end{aligned}$$
    (3.21)

    these simplify to give

    $$\begin{aligned} \psi |_{C_r}(\alpha )= \left( \begin{array}{c} r \cos \sigma _0 +\lambda ^{-1} \\ r \sin \sigma _0 \end{array} \right) - \lambda ^{-1} e_{_R}(\alpha ). \end{aligned}$$
    (3.22)

    The same equation results when \(\cos \sigma _0 < 0\) is assumed at the outset.

  • Step 3. To determine \(\lambda \) we impose the constraint \(A(\rho ,\sigma )=\pi r^2\) as follows. By the description of A given in (3.8), and by using (3.10), (3.13), and (3.21), we see that

    $$\begin{aligned} A(\rho , \sigma )= & {} \frac{1}{2} \int _{0}^{2\pi } \rho \rho \sigma ' \, d\alpha \\= & {} -\frac{1}{2\lambda } \int _{0}^{2\pi } \rho \cos \Sigma \, d\alpha \\= & {} -\frac{1}{2\lambda } \int _{0}^{2\pi } -kY \, d\alpha \\= & {} \pi /\lambda ^2, \end{aligned}$$

    and hence \(\lambda =\pm r^{-1}\). It is immediate from (3.9), (3.13), and (3.21) that \(F(\psi |_{C_r};r)=2\pi \lambda ^{-1}\), whence the claim in the statement of the lemma in the case that \(\cos \sigma _0 \ne 0\).

  • Step 4. We now deal with the case \(\cos \sigma _0 = 0\). Firstly, note that by sending \(\cos \sigma _0 \rightarrow 0\) in (3.22) we obtain four distinct maps (corresponding to \(\lambda ^{-1} = \pm r\), \(\sigma _0=\pi /2\) or \(3\pi /2\)) which, by a direct calculation, obey (3.10), (3.11) and the relevant boundary conditions. The point of this final step is to verify that these are the only such solutions.

Let us fix \(\sigma _0=\pi /2\), let (3.13) and (3.14) hold and again suppose that Y satisfies \(Y''+Y=(\lambda k)^{-1}\) on \((0,2 \pi )\). In order that \(\rho (0)=r\) holds, (3.13) implies that \(Y(0)=0\). Suppose for a contradiction that \(\Sigma (\alpha )=\pi /2\) on \([0,\delta )\) for some \(\delta >0\). Then (3.13) implies that \(Y(\alpha )=0\), and hence \(Y+Y''=0\) on \([0,\delta )\), which, since \((\lambda k)^{-1} \ne 0\), is a contradiction. Hence there is \(\delta > 0\) such that \(\Sigma (\alpha ) \ne \pi /2\) if \(\alpha \in (0,\delta )\). The calculations in Step 1 imply that (3.10) and (3.11) hold on \((0,\delta )\). Now, by writing (3.14) as \(Y' \cos \Sigma - Y \sin \Sigma =0\), differentiating this expression and applying the equation \(Y''+Y=(\lambda k)^{-1}\), we have

$$\begin{aligned} (Y\cos \Sigma + Y' \sin \Sigma )(\Sigma '+1)&= (\lambda k)^{-1}\cos \Sigma \end{aligned}$$
(3.23)

on \((0,\delta )\). Note that the prefactor of \(\Sigma '+1\) is continuous, with limit \(Y'(0)\) as \(\alpha \rightarrow 0+\).

Suppose first that \(Y'(0)=0\). Then \(Y=(\lambda k)^{-1}(1-\cos \alpha )\), so from (3.14) \(\cos \Sigma =\pm (1-\cos \alpha )^{1/2}/\sqrt{2}\), and hence \(\sin \Sigma = \pm (1+\cos \alpha )^{1/2}/\sqrt{2}\) (again by using (3.14)). In particular, \(\rho = \pm \sqrt{2}\lambda ^{-1}(1-\cos \alpha )^{1/2}\), making \(\rho (0)=r\) impossible.

Hence we may assume that \(Y'(0)\ne 0\). By letting \(\alpha \rightarrow 0+\) in (3.23), it follows that \(\Sigma '(0+)=-1\), with obvious notation. By (3.13), l’Hôpital’s rule, and the condition \(\rho (0)=r\), \(r=kY'(0)/\Sigma '(0+)\). Without loss of generality take \(k=r\), so that \(Y'(0)=\Sigma '(0+)=-1\) follows, and hence \(Y=(\lambda k)^{-1}(1-\cos \alpha ) - \sin \alpha \). With \(k=r\), we see that (3.16) becomes \(\rho = r(Y^2+Y'^2)^{1/2}\) and hence, from (3.13), that the versions of (3.17) and (3.18) with negative prefactors on the right-hand sides must be used. The components of \(\psi |_{C_r}\) are then

$$\begin{aligned} \rho \cos \sigma&= -r ( Y \cos \alpha - Y' \sin \alpha ) \\ \rho \sin \sigma&= -r (Y' \cos \alpha + Y \sin \alpha ). \end{aligned}$$

Using the expression for Y just derived, we obtain (3.22) evaluated at \(\sigma _0=\pi /2\). A similar procedure in the case that \(\sigma _0=3\pi /2\) produces the corresponding version of (3.22). The proof can be concluded by arguing as in Step 3 that \(\lambda ^{-1} = \pm r\) and substituting in (3.22). \(\square \)

Thus we see that the extremals of \(F(\cdot ;r)\) on \(\mathcal {C}(r)\) are in the form of two families of maps which take circles to circles, each family being parametrized by the change in polar angle \(\sigma _0\). Note also that the condition that \(r e_1\) maps to \(r e(\sigma _0)\) is achieved by effectively ‘pivoting’ the original circle \(C_r\) about a suitably chosen point.

Remark 3.4

Here we explain the origin of the transformation (3.13) and (3.14). For argument’s sake, suppose that \(\cos \sigma _0 >0\). Then the assumption that both \(\rho \) and \(\sigma \) are absolutely continuous, together with the conditions \(\rho (0)=r\) and \(\Sigma (0)=\sigma _0\), implies that \(\cos \Sigma (\alpha ) > 0\) and \(\rho (\alpha ) > 0\) provided \(\alpha \) is sufficiently small. Using (3.11) and (3.10) gives \(\rho '/(\rho \sigma ') = \tan \Sigma \), which on writing \(\sigma ' = 1+ \Sigma '\) and integrating yields

$$\begin{aligned} \rho (\alpha ) = -k \sec \Sigma \exp \left( \int _{0}^{\alpha } \tan \Sigma (\bar{\alpha }) \, d\bar{\alpha }\right) , \end{aligned}$$
(3.24)

where \(k=- r \cos \sigma _0\). Let \(\theta = \int _{0}^{\alpha } \tan \Sigma (\bar{\alpha })\, d\bar{\alpha }\) and insert (3.24) into (3.10) to obtain

$$\begin{aligned} 1 - \lambda k (\Sigma ' + 1) e^\theta \sec ^2 \Sigma = 0. \end{aligned}$$

Now \(\theta ' = \tan \Sigma \) and \(\theta '' = \sec ^2 \Sigma \ \Sigma '\), so

$$\begin{aligned} 1 - \lambda k (\theta '' + \theta '^2 + 1)e^{\theta }= 0, \end{aligned}$$

which, on letting \(Y(\alpha )=e^{\theta }\), transforms into the equation

$$\begin{aligned} 1 - \lambda k (Y'' + Y) = 0. \end{aligned}$$

In terms of Y, (3.24) becomes (3.13) and the equation \(\theta ' = \tan \Sigma \) becomes (3.14).

Remark 3.5

Early in the proof of Lemma 3.3 it is asserted that the functions \(\rho \) and \(\sigma \) are smooth away from the set \(\{\alpha \in (0,2\pi ): \ \rho (\alpha )=0\}\). This qualification is necessary in the sense that, for certain \(\sigma _0\), there exists \(\alpha _0\) such that \(\rho (\alpha _0)=0\), and \(\rho \), \(\sigma \) fail to be smooth in a neighbourhood of \(\alpha _0\). To see this, consider the case that \(\cos \sigma _0 \ne 0\), so that \(\rho \) obeys (3.16) and Y is given by (3.21). Solving \(\rho (\alpha )=0\) gives \(\cos \alpha = 1-\lambda k\) and \(\sin \alpha = - \lambda k \tan \sigma _0\). It can be checked that if \(\cos \sigma _0\) and \(\lambda \) are of the same sign then \(\rho (\alpha )=0\) is impossible, but that when \(\cos \sigma _0 < 0\), \(k=-r \cos \sigma _0\) (see Step 2 of Lemma 3.3) and \(\lambda =r^{-1}\), the equations for \(\rho (\alpha _0)=0\) are solved by \(\alpha _0=\pi /3\) when \(\sigma _0 = 2\pi /3\). Letting \(\alpha = \pi /3 + \epsilon \), we find that \(\rho (\epsilon ) = \sqrt{2}r\sqrt{1-\cos \epsilon }\), \(\cos \Sigma = - \sqrt{1-\cos \epsilon }/\sqrt{2}\) and \(\sin \Sigma = -\sin (\epsilon ) (\sqrt{2}\sqrt{1-\cos \epsilon })^{-1}\), none of which is smooth in a neighbourhood of \(\epsilon =0\). By contrast, the quantities \(\rho \cos \Sigma = -r(1-\cos \epsilon )\) and \(\rho \sin \Sigma = - r \sin \epsilon \), which are the building blocks of \(\psi |_{C_r}\), are clearly smooth in \(\epsilon \), and hence in \(\alpha \).

As promised at the end of Sect. 2, we now give the proof of Theorem 2.2.

Proof

Let \(\varphi \) be as described in the statement of the theorem. By Proposition 2.1, Eq. (2.9) and inequality (2.11), it follows that

$$\begin{aligned} \int _{B} |\nabla (u_{_{\scriptscriptstyle {N}}}\circ \varphi )|^2\, dx \ge \pi (N+1/N) + 2(N-1/N)\left( \pi - \int _0^1 F(\psi ;r)\,dr\right) , \end{aligned}$$
(3.25)

where \(\psi =\varphi ^{-1}\) and the functional F is given by (3.1). By the argument preceding Lemma 3.2, the results of Lemma 3.3 apply to \(F(\psi ;r)\) for almost every r in (0, 1). In particular, \(F(\psi ;r)\le 2 \pi r\) a.e., from which it clearly follows that the second term in (3.25) is nonnegative for \(N\ge 1\). Finally, since the term \(\pi (N+1/N)\) in (3.25) is exactly the Dirichlet energy of \(u_{_{\scriptscriptstyle {N}}}\) on the unit ball, (2.13) follows.

To prove that \(u_{_{\scriptscriptstyle {N}}}\) is the unique global minimizer of the Dirichlet energy with respect to inner variations, we note that if (2.13) holds with equality then, in particular, (2.11), and hence (2.10), must also hold with equality. Since the integrand of \(I(\psi ,N)\) in (2.9) is a strictly convex function of \(\nabla \psi \), (2.10) holds with equality if and only if \(\nabla \psi = \nabla {\text {id}}\) a.e., that is, if and only if \(\psi \) is the identity map, whence \(\varphi \) must be the identity map too. \(\square \)

Remark 3.6

Note that equality in (2.13) also implies that \(F(\psi ;r)=2\pi r\) for almost every r, so, for such r, it must be that \(\psi =\psi |_{C_r}\) for an appropriate choice of \(\sigma _0\) and with \(\lambda ^{-1}=r\), as described in (3.22) (and in the notation introduced in (3.6) and (3.7)). Supposing this to be the case, it follows that

$$\begin{aligned} \psi (R,\alpha ) = R(e({\tilde{\sigma }}(R))+e(\alpha )-e_1) \end{aligned}$$

for a suitable, weakly differentiable map \({\tilde{\sigma }}\) satisfying \({\tilde{\sigma }}(1)=0\). A short calculation then reveals that \(\det \nabla \psi =1\) a.e. if and only if \({\tilde{\sigma }}\) is a.e. zero, and hence that \(\psi \) is equivalent to the identity map. In this way the uniqueness part of the proof of Theorem 2.2 can be deduced without appealing to strict convexity.

4 \(\mathbf {u_{_{\scriptscriptstyle {N}}}}\) is a minimizer within the class of outer variations

We now apply a technique of Sivaloganathan and Spector [30] to demonstrate that \(u_{_{\scriptscriptstyle {N}}}\) is a minimizer of the Dirichlet energy within the class of outer variations. In the rest of this section we will write \(B_{\scriptscriptstyle {N}}:=B(0,1/\sqrt{N})\) for brevity.

Definition 4.1

(Outer variation) Let \(\phi \in H^1_{\text {id}}(B_{\scriptscriptstyle {N}},\mathbb {R}^2)\) satisfy \(\det \nabla \phi = 1\) a.e. in \(B_{\scriptscriptstyle {N}}\). Then \(\phi \circ u_{_{\scriptscriptstyle {N}}}\) is an outer variation of \(u_{_{\scriptscriptstyle {N}}}\).

By Proposition 3.1, we may assume that \(\phi \) is continuous and measure-preserving, and that for a.e. \(r \in (0,1/\sqrt{N})\), \(\phi \arrowvert _{C_r}\) belongs to \(H^1(C_r;\mathbb {R}^2)\) and is \({\mathcal {H}}^1\)-a.e. \(1-1\). We have need of the following technical result.

Proposition 4.2

Let \(\phi \in H^1_{\text {id}}(B_{\scriptscriptstyle {N}},\mathbb {R}^2)\) satisfy \(\det \nabla \phi = 1\) a.e. in \(B_{\scriptscriptstyle {N}}\). Then there exists a null set \(\omega \subset (0,1/\sqrt{N})\) such that for all \(r \in (0,1/\sqrt{N}){\setminus } \omega \):

  1. (i)

    \(\phi (C_r)\) is \(\mathcal {H}^1\)-measurable and \(\mathcal {H}^1(\phi (C_r)) \ge 2\pi r\);

  2. (ii)

    \(J \phi _{{\scriptscriptstyle {R}}}(z) \cdot \phi _{\tau }(z)=1\) for \(\mathcal {H}^1\)-a.e. \(z \in C_r\);

  3. (iii)

    \(\mathcal {H}^1(\phi (C_r)) \le \int _{C_r} |\phi _{\tau }(z)| \, d\mathcal {H}^1(z)\).

Proof

  1. (i)

    Since \(\phi \) agrees with the identity on \(\partial B_{\scriptscriptstyle {N}}\), [31, Lemma 5(i)] implies that \(\phi \) is \(1-1\) almost everywhere. Let \(\mathcal {U}:=\{z \in B_{\scriptscriptstyle {N}}{\setminus } \phi (C_r): d(\phi ,B(0,r),z)\ge 1\}\). Then properties of the degree easily imply that \(\mathcal {U}\subset \phi (B(0,r))\). By [13, Theorem 5.21], the fact that \(\phi \) is \(1-1\) almost everywhere and the constraint \(\det \nabla \phi =1\) a.e., we can assume that the set

    $$\begin{aligned} {\widetilde{B_{\scriptscriptstyle {N}}}}:=\{x \in B_{\scriptscriptstyle {N}}: \ \nabla \phi (x) \ \text {exists classically}, \ \phi ^{-1}(\phi (x))=\{x\} \ \text {and} \ \det \nabla \phi (x)=1\} \end{aligned}$$

    is of full measure in \(B_{\scriptscriptstyle {N}}\). Let \(x \in {\widetilde{B_{\scriptscriptstyle {N}}}} \cap B(0,r)\) and define \(z=\phi (x)\). By [13, Lemma 5.9], there is \(r_0>0\) such that, for all \(r_1 \in (0,r_0]\), \(\phi (x+h) \ne \phi (x)\) if \(0<|h|\le r_1\) and \(d(\phi ,B(x,r_1),z)=\text {sgn}\, \det \nabla \phi (x)=1\). The former implies in particular that \(z \notin \phi (\partial B(x,r_1))\). By taking \(r_1\) sufficiently small, we may assume that \(B(x,r_1)\subset B(0,r)\). It is clear that \(z \notin \phi (C_r)\), so by the excision and domain decomposition properties of the degree we have

    $$\begin{aligned} d(\phi ,B(0,r),z) = d(\phi , B(x,r_1),z)+ d(\phi ,B(0,r) {\setminus } \overline{B(x,r_1)},z). \end{aligned}$$

    It must be that \(d(\phi ,B(0,r) {\setminus } \overline{B(x,r_1)},z)=0\), since otherwise there is \({\tilde{x}} \in B(0,r) {\setminus } \overline{B(x,r_1)}\) such that \(\phi ({\tilde{x}})=z\), contradicting our assumption that z has no other preimages in \(B_{\scriptscriptstyle {N}}\) besides x. Hence \(d(\phi ,B(0,r),z)=1\), so \(x \in \mathcal {U}\), and we conclude that \(\phi (B(0,r))\) and the set \(\mathcal {U}:=\{z \in B_{\scriptscriptstyle {N}}: d(\phi ,B(0,r),z)\ge 1\}\) differ only by a set of \(\mathcal {L}^2\)-measure zero. By part (c) of Proposition 3.1, \(\phi \) preserves the measure of B(0, r), so that \(\mathcal {L}^2(\mathcal {U})=\pi r^2\). Now, by arguing as in [26, Lemma 3.5, Step 3], \(\phi (C_r)=\partial ^*\mathcal {U}\) \(\mathcal {H}^1\)-a.e., and hence, by the isoperimetric inequality [12, Theorem 3.2.43],

    $$\begin{aligned} \mathcal {H}^1(\phi (C_r)) \ge 2 \sqrt{\pi } (\mathcal {L}^2(\mathcal {U}))^{\frac{1}{2}} =2\pi r. \end{aligned}$$
  2. (ii)

    Let \(C_r^*=\{z \in C_r: \ J \phi _{{\scriptscriptstyle {R}}}(z) \cdot \phi _{\tau }(z)\ne 1\}\), \(\omega _1:=\{r \in (0,1/\sqrt{N}): \ \mathcal {H}^1(C_r^*)>0\}\) and \(E=\cup _{r \in \omega _1} C_r^*\). Since \(\det \nabla \phi =1\) a.e., it follows that \(\mathcal {L}^2(E)=0\). On the other hand,

    $$\begin{aligned} \mathcal {L}^2(E)=\int _{\omega _1} \mathcal {H}^1(C_r^*) \, dr, \end{aligned}$$

    which implies that \(\omega _1\) is null. Hence, by excluding r from \(\omega _1\), we ensure that \(J \phi _{{\scriptscriptstyle {R}}}(z) \cdot \phi _{\tau }(z)=1\) for \(\mathcal {H}^1\)-a.e. \(z \in C_r\).

  3. (iii)

    By Proposition 3.1(b), we can assume without loss of generality that \(\phi \arrowvert _{C_r}\) belongs to \(H^1(C_r,\mathbb {R}^2)\). The stated inequality now follows by applying [26, Proposition 2.7]. \(\square \)

Following [30, Lemma 3.5], we note that, by the Cauchy–Schwarz inequality and Proposition 4.2(ii), for a.e. \(r \in (0,1/\sqrt{N})\),

$$\begin{aligned} |\phi _{{\scriptscriptstyle {R}}}(z)| \ge \frac{1}{|\phi _{\tau }(z)|} \quad \mathcal {H}^1-\text {a.e.} \ z \in C_r. \end{aligned}$$
(4.1)

We now state and prove our result concerning outer variations.

Theorem 4.3

Let \(\phi \circ u_{_{\scriptscriptstyle {N}}}\) be an outer variation of \(u_{_{\scriptscriptstyle {N}}}\). Then

$$\begin{aligned} \mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}}) \ge \mathbb {D}(u_{_{\scriptscriptstyle {N}}}) \quad \text {with equality if and only if } \phi = {\text {id}}. \end{aligned}$$
(4.2)

Proof

A short calculation shows that

$$\begin{aligned} \mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}}) = \int _B N |\phi _{\tau }(u_{_{\scriptscriptstyle {N}}}(y))|^2 + \frac{|\phi _{{\scriptscriptstyle {R}}}(u_{_{\scriptscriptstyle {N}}}(y))|^2}{N} \, dy, \end{aligned}$$

to which, by [13, Theorem 5.35], we can apply the change of variables

$$\begin{aligned} \int _B N |\phi _{\tau }(u_{_{\scriptscriptstyle {N}}}(y))|^2 + \frac{|\phi _{{\scriptscriptstyle {R}}}(u_{_{\scriptscriptstyle {N}}}(y))|^2}{N} \, dy&= \int _B \left( N |\phi _{\tau }(u_{_{\scriptscriptstyle {N}}}(y))|^2 + \frac{|\phi _{{\scriptscriptstyle {R}}}(u_{_{\scriptscriptstyle {N}}}(y))|^2}{N}\right) \det \nabla u_{_{\scriptscriptstyle {N}}}(y) \, dy \\&= \int _{\mathbb {R}^2} \left( N |\phi _{\tau }(z)|^2 + \frac{|\phi _{{\scriptscriptstyle {R}}}(z)|^2}{N}\right) d(u_{_{\scriptscriptstyle {N}}},B,z) \, dz \\&= \int _{B_{\scriptscriptstyle {N}}} N^2 |\phi _{\tau }(z)|^2 + |\phi _{{\scriptscriptstyle {R}}}(z)|^2 \, dz. \end{aligned}$$

Here we have used the fact that \(d(u_{_{\scriptscriptstyle {N}}},B,z)=N\) if \(z \in B_{\scriptscriptstyle {N}}\) and \(d(u_{_{\scriptscriptstyle {N}}},B,z)=0\) if \(z \in \mathbb {R}^2 {\setminus } {\overline{B_{\scriptscriptstyle {N}}}}\). This is easily deduced by noting that \(u_{_{\scriptscriptstyle {N}}}\) agrees with the smooth function \(v_{_{\scriptscriptstyle {N}}}(z):=z^N/\sqrt{N}\) on \(\partial B\), expressed here for brevity in terms of \(z \in \mathbb {C}\), so that \(d(u_{_{\scriptscriptstyle {N}}},B,z)=d(v_{_{\scriptscriptstyle {N}}},B,z)\) for all \(z \in \mathbb {R}^2 {\setminus } \partial B_{\scriptscriptstyle {N}}\), and then by computing \(d(v_{_{\scriptscriptstyle {N}}},B,z)\) directly. Now let \(f_{\scriptscriptstyle {N}}: \mathbb {R}^+ \rightarrow \mathbb {R}^+\) be given by

$$\begin{aligned} f_{\scriptscriptstyle {N}}(t)=N^2t^2+\frac{1}{t^2}, \end{aligned}$$

and notice that \(f_{\scriptscriptstyle {N}}\) is strictly convex on \(\mathbb {R}^+\) and strictly increasing on the interval \((1/\sqrt{N},\infty )\). It follows by the calculation above and (4.1) that

$$\begin{aligned} \mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}})&= \int _{B_{\scriptscriptstyle {N}}} N^2 |\phi _{\tau }(z)|^2 + |\phi _{{\scriptscriptstyle {R}}}(z)|^2 \, dz \\&\ge \int _{B_{\scriptscriptstyle {N}}} N^2 |\phi _{\tau }(z)|^2 + \frac{1}{|\phi _{\tau }(z)|^2} \, dz \\&= \int _0^{1/\sqrt{N}} \int _{C_r} f_{\scriptscriptstyle {N}}(|\phi _{\tau }(z)|) \, d\mathcal {H}^1(z) \, dr \\&\ge \int _0^{1/\sqrt{N}} 2\pi r \, f_{\scriptscriptstyle {N}}\left( \frac{1}{2\pi r}\int _{C_r} |\phi _{\tau }(z)| \, d\mathcal {H}^1(z)\right) \, dr, \end{aligned}$$

where Jensen’s inequality has been applied to pass from the third to the fourth lines above. By Proposition 4.2(i) and (iii), the argument of \(f_{\scriptscriptstyle {N}}\) in the last line above is at least 1, and since \(f_{\scriptscriptstyle {N}}\) is increasing on \((1/\sqrt{N},\infty )\), we must have

$$\begin{aligned} \mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}}) \ge \int _0^{1/\sqrt{N}} 2\pi r \, f_{\scriptscriptstyle {N}}(1)\, dr = \frac{\pi }{N}\, f_{\scriptscriptstyle {N}}(1). \end{aligned}$$

But, by inspection, \(\mathbb {D}(u_{_{\scriptscriptstyle {N}}})=\frac{\pi }{N} f_{\scriptscriptstyle {N}}(1)\), which proves the inequality stated in (4.2).

If \(\mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}})=\mathbb {D}(u_{_{\scriptscriptstyle {N}}})\) then all the inequalities derived above become equalities, implying that

  1. (a)

    Equation (4.1) must hold with equality a.e. in \(B_{\scriptscriptstyle {N}}\), and hence, by the condition for equality in the Cauchy–Schwarz inequality, there is a scalar-valued function k(z), say, such that \(J\phi _{{\scriptscriptstyle {R}}}(z) = k(z) \phi _{\tau }(z)\) for a.e. z in \(B_{\scriptscriptstyle {N}}\);

  2. (b)

    \(\int _{C_r} |\phi _{\tau }(z)|\, d\mathcal {H}^1(z) = 2 \pi r\) must also hold for a.e. \(r \in (0,1/\sqrt{N})\), which when integrated over r in that range gives

    $$\begin{aligned} \int _{B_{\scriptscriptstyle {N}}} |\phi _{\tau }(z)| \, dz = \frac{\pi }{N}; \end{aligned}$$
    (4.3)
  3. (c)

    by (a) and the change of variables carried out at the start of the proof, \(\mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}}) = \int _{B_{\scriptscriptstyle {N}}} f_{\scriptscriptstyle {N}}(|\phi _{\tau }(z)|) \, dz\).

Hence (c) and (4.3) give

Since \(f_{\scriptscriptstyle {N}}\) is strictly convex, Jensen’s inequality tells us that

with equality if and only if for a.e. z. But, by (4.3), this implies that \(|\phi _{\tau }(z)|=1\) for a.e. z, and hence from (a) together with the constraint \(J \phi _{{\scriptscriptstyle {R}}}(z) \cdot \phi _{\tau }(z)=1\) a.e., we obtain that \(k(z)=1\) a.e. Thus we can write

$$\begin{aligned} \nabla \phi (z) = -J \phi _{\tau }(z) \otimes {\bar{z}} + \phi _{\tau }(z) \otimes J {\bar{z}} \quad \text {a.e.} \;\, z \in B_{\scriptscriptstyle {N}}, \end{aligned}$$

which has the property that \(\mathrm{cof}\,\nabla \phi (z) = \nabla \phi (z)\) a.e. Thus, by Piola’s identity, \(\phi \) is a weak solution of Laplace’s equation which agrees with the identity on \(\partial B_{\scriptscriptstyle {N}}\), so it must be that \(\phi ={\text {id}}\) in \(B_{\scriptscriptstyle {N}}\). \(\square \)
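The degree computation used in the proof above is also easy to illustrate numerically: \(d(u_{_{\scriptscriptstyle {N}}},B,z)\) is the winding number of \(u_{_{\scriptscriptstyle {N}}}\arrowvert _{\partial B}\) about z. The sketch below is an illustration only; it assumes numpy, and the sample points are arbitrary. It approximates the winding number by summing argument increments along a discretisation of the image curve.

```python
import numpy as np

def winding_number(curve, z):
    """Approximate winding number of a closed discretised curve about z."""
    w = curve - z
    increments = np.angle(w[1:] / w[:-1])        # small argument increments
    return int(np.rint(increments.sum() / (2 * np.pi)))

N = 3
alpha = np.linspace(0.0, 2 * np.pi, 4001)
image = np.exp(1j * N * alpha) / np.sqrt(N)      # u_N restricted to the unit circle

print(winding_number(image, 0.1 + 0.2j))         # a point of B_N: prints N
print(winding_number(image, 0.7 + 0.0j))         # a point outside the closure of B_N: prints 0
```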

Remark 4.4

From the proof above, we see that

$$\begin{aligned} \frac{1}{N}\mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}}) = \int _{B_{\scriptscriptstyle {N}}} N|\phi _{\tau }(z)|^2 +\frac{ |\phi _{{\scriptscriptstyle {R}}}(z)|^2}{N} \, dz. \end{aligned}$$
(4.4)

Now compare this with the corresponding expression for an inner variation, (2.9), which we reprint here for convenience,

$$\begin{aligned} \mathbb {D}(u_{_{\scriptscriptstyle {N}}}\circ \psi ) = \int _{B} \frac{1}{N} |\psi _\tau (y)|^2 +N |\psi _{{\scriptscriptstyle {R}}}(y)|^2 \, dy. \end{aligned}$$

The technique used in Theorem 4.3 does not apply to the functional \(\mathbb {D}(u_{_{\scriptscriptstyle {N}}}\circ \psi )\), and neither do the methods used to prove Theorem 2.2 apply to the functional \(\mathbb {D}(\phi \circ u_{_{\scriptscriptstyle {N}}})\). In each case, it seems to be the weighting of the radial and angular derivatives that determines the approach required.

5 Extension to functionals with p-growth for \(1<p < 2\)

It is natural to ask whether the techniques of the preceding section carry over to functionals besides the Dirichlet energy. The functionals we have in mind are of the form

$$\begin{aligned} \mathbb {E}(u):=\int _{B} f(|\nabla u(x)|)\,dx, \end{aligned}$$
(5.1)

where f is convex, of class \(C^1(\mathbb {R}^+)\), and such that \(f'(t) \ge 0\) for \(t>0\). In addition, so that the setting of the problem is \(W^{1,p}(B;\mathbb {R}^2)\), we suppose that there is a constant \(C>0\) such that \(\frac{1}{C}t^p \le f(t) \le C(1+t^p)\) for all \(t > 0\). When \(p>2\), the analysis of \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi )\) carries over from that given for \(\mathbb {D}(u_{_{\scriptscriptstyle {N}}}\circ \varphi )\) with only minor changes, so we do not address that question here. We focus on the case \(1<p<2\), where a suitable analogue of the function space \(\mathcal {A}(B)\) introduced in (1.3) has to be constructed with some care. The chief difficulties are that, in contrast to members of \(\mathcal {A}(B)\), a typical map \(\varphi \in W^{1,p}(B;\mathbb {R}^2)\), with \(p<2\), obeying \(\det \nabla \varphi = 1\) a.e. in B and \(\varphi \arrowvert _{\partial B} = {\text {id}}\) need not be continuous, Eq. (3.2) need not hold, nor need \(\varphi \) be invertible in the sense described just before Proposition 2.1. Recall that the continuity and invertibility of \(\varphi \in \mathcal {A}(B)\) were needed to apply results depending on the topological degree and to transform the energy functional \(\int _{B} W(\bar{\varphi },\nabla \varphi ,N)\, dx\) into a more tractable form involving \(\psi :=\varphi ^{-1}\) (see e.g. (2.7) and (2.8)), while Eq. (3.2) led to the area constraint in the study of the functional \(F(\psi )\) described in Sect. 3.

Fortunately, thanks to works of Müller and Spector [26], Henao and Mora-Corral [18,19,20] and Barchiesi et al. [4], there is a substantial framework which provides a suitable candidate for \(\mathcal {A}(B)\). In short, the required invertibility and other properties (such as the validity of Eq. (3.2), for example) can be found in the class which Barchiesi, Henao and Mora-Corral refer to in [4] as \(\mathcal {A}_p\). We shall recall the definition of \(\mathcal {A}_p\) from [4] below, describe some of its properties, and note how a supplementary condition, given later, ensures that the local inverse of [4] is, in this setting at least, effectively an inverse on the entire image domain.

5.1 Extending the class \(\mathcal {A}(B)\)

Let \(p \in (1,2)\) and let \({\mathcal {U}}_{\varphi }\) be the class of ‘good’ open sets defined in [4, Definition 2.7]. Following [4], define the class \(\mathcal {A}_p\) on the set B as:

$$\begin{aligned} \mathcal {A}_p&:=\{\varphi \in W^{1,p}(B,\mathbb {R}^2): \mathrm{adj}\,\nabla \varphi \, \varphi \in L^1_{\text {loc}}(B;\mathbb {R}^2), \ \det \nabla \varphi \ne 0 \ a.e., \ \mathrm {Det}\,\nabla \varphi = \det \nabla \varphi , \\&\qquad \text {and} \ d(\varphi ,B(x,r),\cdot ) \ge 0 \ a.e. \ \text {for all} \ r>0 \ \text {for which} \ B(x,r) \in {\mathcal {U}}_{\varphi }\}. \end{aligned}$$

For later use, we recall that if \(\varphi \in W^{1,p}(B,\mathbb {R}^2)\) then its distributional determinant \(\mathrm {Det}\,\nabla \varphi \) obeys

$$\begin{aligned} \langle \mathrm {Det}\,\nabla \varphi , \eta \rangle = -\frac{1}{2}\int _{B} \varphi (x)\cdot \mathrm{cof}\,\nabla \varphi (x) \nabla \eta (x) \, dx \quad \eta \in C_c^\infty (B). \end{aligned}$$
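For smooth maps the distributional and pointwise determinants agree, and this can be checked directly on an example. The sketch below is our own illustration, assuming sympy; the particular \(\varphi \) and \(\eta \) are arbitrary choices, with \(\eta \) merely vanishing on \(\partial B\) in place of a genuine \(C_c^\infty \) bump so that both integrals are exact. It evaluates both sides of the identity above in polar coordinates.

```python
import sympy as sp

x1, x2, r, t = sp.symbols('x1 x2 r t', real=True)

phi = sp.Matrix([x1 + x2**2, x2])        # a smooth map with det(grad phi) = 1
eta = (1 - x1**2 - x2**2)**3             # test function vanishing on the boundary

grad_phi = phi.jacobian([x1, x2])
cof = sp.Matrix([[ grad_phi[1, 1], -grad_phi[1, 0]],
                 [-grad_phi[0, 1],  grad_phi[0, 0]]])       # 2x2 cofactor matrix
grad_eta = sp.Matrix([sp.diff(eta, x1), sp.diff(eta, x2)])

lhs = -(phi.dot(cof * grad_eta)) / 2     # integrand of <Det grad phi, eta>
rhs = grad_phi.det() * eta               # integrand of the pointwise determinant

polar = {x1: r*sp.cos(t), x2: r*sp.sin(t)}
I = lambda f: sp.integrate(sp.integrate(f.subs(polar) * r, (t, 0, 2*sp.pi)), (r, 0, 1))
print(I(lhs), I(rhs))                    # both print pi/4
```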

We also recall the definition of Müller and Spector’s condition (INV), as given in [26, Definition 3.2] in terms of a bounded, open domain \(\Omega \subset \mathbb {R}^n\) with Lipschitz boundary:

Definition 5.1

We say that \(u: \Omega \rightarrow \mathbb {R}^n\) satisfies (INV) provided that for every \(a \in \Omega \) there exists an \(\mathcal {L}^1\) null set \(N_a\) such that, for all \(r \in (0,r_a) {\setminus } N_a\), \(u\arrowvert _{\partial B(a,r)}\) is continuous,

  1. (i)

    \(u(x) \in \text {im}_{\text {T}}(u,B(a,r)) \cup u(\partial B(a,r))\) for \(\mathcal {L}^n\)-a.e. \(x \in \overline{B(a,r)}\), and

  2. (ii)

    \(u(x) \in \mathbb {R}^n {\setminus } \text {im}_{\text {T}}(u,B(a,r))\) for \(\mathcal {L}^n\)-a.e. \(x \in \Omega {\setminus } B(a,r)\).

Here \(r_a:=\text {dist}\,(a,\partial \Omega )\).

The topological image \(\text {im}_{\text {T}}(u,B(a,r))\) is given by

$$\begin{aligned} \text {im}_{\text {T}}(u,B(a,r)) := \{y \in \mathbb {R}^n {\setminus } u(\partial B(a,r)): \ d(u,\partial B(a,r),y) \ne 0\} \end{aligned}$$

whenever \(u\arrowvert _{\partial B(a,r)}\) is continuous: see [26, Section 3] or [4, Section 3]. We will need to apply condition (INV) to the precise representative of a function in \(W^{1,p}\). One can either follow [26, Proposition 2.8], or, as we do here, give the formulation of [4, Proposition 2.2].

Proposition 5.2

Let \(1\le p < n\) and \(u \in W^{1,p}(\Omega ;\mathbb {R}^n)\). Let \(p^*:=np/(n-p)\) be the Sobolev conjugate exponent. Denote by P the set of points \(x_0 \in \Omega \) where the following property fails: there exists \(u^*(x_0) \in \mathbb {R}^n\) such that

$$\begin{aligned} \lim _{r \rightarrow 0^+} \frac{1}{|B(x_0,r)|}\int _{B(x_0,r)} |u(x)-u^*(x_0)|^{p^*} \, dx = 0. \end{aligned}$$

Then \(\text {cap}_{p}(P)=0\).

Here, \(\text {cap}_{p}\) refers to the p-capacity of a set: see [11, Section 4.7], for example.

We now form the subclass \(\tilde{\mathcal {A}_{p}}\) as follows, where condition (INV) is understood in the sense given above with \(\Omega =B\) and \(n=2\):

$$\begin{aligned} \tilde{\mathcal {A}_{p}}:=\{\varphi \in \mathcal {A}_{p}(B): \ \varphi \in W^{1,p}_{{\text {id}}}(B;\mathbb {R}^2),\ \varphi ^* \ \text {satisfies} \ \text {(INV)} \ \text {in} \ B, \ \det \nabla \varphi =1 \ a.e.\}. \end{aligned}$$
(5.2)

By [26, Lemma 3.3], condition (INV) is stable under weak convergence in \(W^{1,p}(B,\mathbb {R}^2)\), and, by [4, Proposition 6.1], the weak (in \(L^1\)) limit of the sequence \((\det \nabla \varphi ^{(j)})_{j \in \mathbb {N}}\) coincides a.e. with \(\det \nabla \varphi \) provided \((\varphi ^{(j)})_{j \in \mathbb {N}} \subset \mathcal {A}_p\), \((\det \nabla \varphi ^{(j)})_{j \in \mathbb {N}}\) is equiintegrable, and \((\varphi ^{(j)})_{j \in \mathbb {N}}\) is bounded in \(W^{1,p}(B,\mathbb {R}^2)\). In the present case \(\det \nabla \varphi ^{(j)} = 1\) a.e. for each j, so the sequence \((\det \nabla \varphi ^{(j)})_{j \in \mathbb {N}}\) is trivially equiintegrable and its limit is 1; [4, Proposition 6.1] then yields \(\varphi \in \mathcal {A}_{p}\) with \(\det \nabla \varphi =1\) a.e. Thus, in summary, \(\tilde{\mathcal {A}_p}\) is closed in the weak \(W^{1,p}(B;\mathbb {R}^2)\) topology.

Definition 5.3

Let \(\varphi \in \tilde{\mathcal {A}_p}\). Then a generalized inner variation of \(u_{_{\scriptscriptstyle {N}}}\) is a function of the form \(u_{_{\scriptscriptstyle {N}}}\circ \varphi : B \rightarrow \mathbb {R}^2\).

Since \(\tilde{\mathcal {A}_p}\) is weakly closed, it follows from Sobolev embedding that the class of generalized inner variations is also weakly closed. With this in mind, an application of the direct method of the calculus of variations yields the following.

Proposition 5.4

Let \(\mathbb {E}\) be given by (5.1). Then there is a minimizer \(\varphi \in \tilde{\mathcal {A}_p}\) of \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi )\).

Proof

Note that \(\tilde{\mathcal {A}_p}\) contains the identity map, so, in particular, the set of generalized inner variations is nonempty. Take a sequence \((\varphi ^{(j)})_{j \in \mathbb {N}}\) such that \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi ^{(j)}) \searrow \inf \{\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi ): \ \varphi \in \tilde{\mathcal {A}_p}\}\). By (2.3) and the assumed p-growth of f, it follows that, for a subsequence, \(\varphi ^{(j)} \rightharpoonup \varphi \) in \(W^{1,p}(B,\mathbb {R}^2)\), and, by the argument above, that \(\varphi \in \tilde{\mathcal {A}_p}\). The convexity of \(\mathbb {E}\) finishes the proof. \(\square \)

Next, we wish to define a suitable inverse of \(\varphi \in \tilde{\mathcal {A}_{p}}\). According to [26, Lemma 3.4], if \(\varphi ^*\) satisfies (INV) on B then \(\varphi \) is \(1-1\) a.e. on B. (See also [4, Lemma 5.1(a)] for the equivalence of (INV) and \(1-1\) a.e. on sets \(U \in \mathcal {U}_{\varphi }\).) Take a family of open sets \((U_j)_{j \in \mathbb {N}} \subset \mathcal {U}_{\varphi }\) with the property that \(\cup _{j=1}^{\infty } U_j = B\) and \(U_{j} \subset \subset U_{j+1}\) for all j. By [4, Lemma 2.20], there are radii \(r_j \nearrow 1\) such that the choice \(U_j:=B(0,r_j)\) works, for example. Since \(\varphi \) is \(1-1\) a.e., it is clearly the case that \(\varphi \) is \(1-1\) a.e. on each \(U_j\). Hence, in the notation of [4, Proposition 5.3], the family \((U_j)_{j\in \mathbb {N}}\) belongs to \(\mathcal {U}_{\varphi }^\text {in}\), and so, by the same result, \((\varphi \arrowvert _{U_j})^{-1}\) exists and belongs to \(W^{1,1}(\text {im}_{\text {T}}(\varphi ,U_j);\mathbb {R}^2)\), with

$$\begin{aligned} \nabla (\varphi \arrowvert _{U_j})^{-1}(y) =(\nabla \varphi ((\varphi \arrowvert _{U_j})^{-1}(y)))^{-1} \quad a.e. \end{aligned}$$
(5.3)

for each \(j \in \mathbb {N}\). By [4, Lemma 5.1 (b)], the topological images are nested in the sense that \(\text {im}_{\text {T}}(\varphi ,U_j) \subset \text {im}_{\text {T}}(\varphi ,U_{j+1})\) a.e., and, by [4, Lemma 5.18 (c)], they exhaust B up to a set of measure zero. For a.e. \(y \in B\), we take \(\psi (y) := (\varphi \arrowvert _{U_j})^{-1}(y)\) if \(y \in \text {im}_{\text {T}}(\varphi ,U_j)\). Since the sets \(\text {im}_{\text {T}}(\varphi ,U_j)\) are nested a.e., \((\varphi \arrowvert _{U_{j+1}})^{-1}\) agrees a.e. with \((\varphi \arrowvert _{U_j})^{-1}\) on \(\text {im}_{\text {T}}(\varphi ,U_j)\), so \(\psi \) is well defined up to a null set. For clarity, we record the definition of \(\psi \) here.

Definition 5.5

Let \(\varphi \in \tilde{\mathcal {A}_p}\) and let \((U_j)_{j \in \mathbb {N}} \subset \mathcal {U}_{\varphi }\) be a nested family of sets which satisfies \(B=\cup _{j \in \mathbb {N}} U_j\). Define the function \(\psi \) on \(B=\cup _{j \in \mathbb {N}} \text {im}_{\text {T}}(\varphi ,U_j)\) by \(\psi (y)=(\varphi \arrowvert _{U_j})^{-1}(y)\) whenever \(y \in \text {im}_{\text {T}}(\varphi ,U_j)\). For \(y \in \partial B\), set \(\psi (y)=y\).

Since \(\det \nabla \varphi = 1\) a.e., (5.3) implies that

$$\begin{aligned} \nabla (\varphi \arrowvert _{U_{j}})^{-1}(y) = \mathrm{cof}\,\nabla \varphi ((\varphi \arrowvert _{U_{j}})^{-1}(y))^{T}. \end{aligned}$$
(5.4)

By considering the functions \(\nabla \psi \, \chi _{\text {im}_{\text {T}}(\varphi ,U_j)}\), using (5.4), and applying the area formula [4, Proposition 2.7] with \({\mathcal {N}}_{U_j}=1\) a.e. (the last of which follows from [4, Lemma 5.1]), it follows that

$$\begin{aligned} \int _{\text {im}_{\text {T}}(\varphi ,U_j)}|\nabla \psi (y)|^p \,dy&= \int _{\text {im}_{\text {T}}(\varphi ,U_j)} |\mathrm{cof}\,\nabla \varphi ((\varphi \arrowvert _{U_{j}})^{-1}(y))^{T}|^p \, dy \\&= \int _{U_j} |\nabla \varphi (x)|^p \,dx \le \int _{B} |\nabla \varphi (x)|^p \,dx. \end{aligned}$$

Hence, by monotone convergence, \(\nabla \psi \in L^p(B,\mathbb {R}^2)\). By a similar argument, this time with the functions \(W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T(y),1/N) \, \chi _{\text {im}_{\text {T}}(\varphi ,U_j)}\), it follows that

$$\begin{aligned} \int _{B} W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T(y),1/N) \, dy = \int _{B} W^{\frac{1}{2}}(\bar{\varphi }(x),\nabla \varphi (x),N) \, dx. \end{aligned}$$
(5.5)
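The pointwise fact behind the two change-of-variables identities above is elementary: if A is a \(2\times 2\) matrix with \(\det A =1\), then \(A^{-1}=(\mathrm{cof}\,A)^T\) and \(|A^{-1}|=|\mathrm{cof}\,A|=|A|\), so the p-Dirichlet integrand is unchanged when \(\nabla \varphi \) is replaced by the gradient of the local inverse. A minimal random-sampling check (assuming numpy; the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    A = rng.normal(size=(2, 2))
    A /= np.sqrt(abs(np.linalg.det(A)))          # rescale so that |det A| = 1
    if np.linalg.det(A) < 0:
        A[0] *= -1.0                             # flip a row so that det A = +1
    cofA = np.array([[A[1, 1], -A[1, 0]], [-A[0, 1], A[0, 0]]])
    assert np.allclose(np.linalg.inv(A), cofA.T)
    assert np.allclose(np.linalg.norm(np.linalg.inv(A)), np.linalg.norm(A))
print("for det A = 1: A^{-1} = (cof A)^T and |A^{-1}| = |A|")
```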

Proposition 5.6

Let \(\varphi \in \tilde{\mathcal {A}_p}\). Then the inverse \(\psi \) of \(\varphi \) given in Definition 5.5 satisfies \(\mathrm {Det}\,\nabla \psi =1\) as a distribution. In particular, (3.2) holds.

Proof

Let \(\eta \in C^{\infty }_{c}(B)\) be a scalar-valued test function. Since \({\text {spt}}\,\eta \subset \subset B\), there is \(j \in \mathbb {N}\) such that \({\text {spt}}\,\eta \subset \subset \text {im}_{\text {T}}(\varphi ,U_j)\). Then

$$\begin{aligned} \langle \mathrm {Det}\,\nabla \psi , \eta \rangle&= -\frac{1}{2} \int _{B} \psi (y) \cdot \mathrm{cof}\,\nabla \psi (y) \nabla \eta (y) \, dy \\&= -\frac{1}{2} \int _{\text {im}_{\text {T}}(\varphi ,U_j)} (\varphi \arrowvert _{U_j})^{-1}(y) \cdot \mathrm{cof}\,\nabla (\varphi \arrowvert _{U_j})^{-1}(y)\nabla \eta (y) \, dy \\&= -\frac{1}{2} \int _{\text {im}_{\text {T}}(\varphi ,U_j)} (\varphi \arrowvert _{U_j})^{-1}(y) \cdot \nabla \varphi ^T ((\varphi \arrowvert _{U_j})^{-1}(y))\nabla \eta (y) \, dy \\&= -\frac{1}{2} \int _{U_j} x \cdot (\nabla \varphi \arrowvert _{U_j}(x))^{T}\nabla \eta (\varphi \arrowvert _{U_j}(x)) \, dx \\&= -\frac{1}{2} \int _{U_j} x \cdot \nabla (\eta \circ \varphi \arrowvert _{U_j}) \, dx \\&= \int _{U_j} \eta \circ \varphi \arrowvert _{U_j} \, dx \\&= \int _{\text {im}_{\text {T}}(\varphi ,U_j)} \eta (y) \, dy = \int _B \eta (y) \, dy, \end{aligned}$$

which, since \(\eta \) was arbitrary, proves the first part. To show that (3.2) holds, note that by [4, Proposition 2.23], for each given set \(U \in \mathcal {U}_{\psi }\) there exists a family \((\eta _{\delta })_{\delta > 0}\) of smooth, compactly supported test functions such that \(\eta _\delta \rightarrow \chi _{U}\) pointwise in B and such that

$$\begin{aligned} \langle \mathrm {Det}\,\nabla \psi , \eta _\delta \rangle \rightarrow \frac{1}{2}\int _{\partial U} \psi (y)\cdot \mathrm{cof}\,\nabla \psi (y) \nu (y) \, d\mathcal {H}^1(y). \end{aligned}$$
(5.6)

According to [4, Lemma 2.20], \(B(0,r) \in \mathcal {U}_{\psi }\) for a.e. \(0<r<1\). Taking \(U=B(0,r)\) in (5.6), the left-hand side converges to \(\pi r^2\) by the first part of the proposition. Noting that \(\nu (y)={\bar{y}}\), and using the expression \(\mathrm{cof}\,\nabla \psi (y) {\bar{y}} = J^T\psi _{\tau }\), the right-hand side of (5.6) equals \(\frac{1}{2}\int _{C_r} J\psi \cdot \psi _{\tau } \, d\mathcal {H}^1\), whence (3.2). \(\square \)

We are now in a position to apply the techniques of Sects. 2 and 3 of this paper to prove the following result.

Theorem 5.7

Let the functional \(\mathbb {E}\) be given by (5.1) and let \(u_{_{\scriptscriptstyle {N}}}\circ \varphi \) be a generalized inner variation of \(u_{_{\scriptscriptstyle {N}}}\). Then \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi ) \ge \mathbb {E}(u_{_{\scriptscriptstyle {N}}})\), with equality if and only if \(\varphi ={\text {id}}\).

Proof

Let \(\varphi \in \tilde{\mathcal {A}_p}\). Recall (2.5) and note that, by (5.5), \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi )\) can be written

$$\begin{aligned} \mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi )&= \int _{B} f(W^{\frac{1}{2}}(\bar{\varphi }(x),\nabla \varphi (x), N)) \, dx \\&= \int _{B} f(W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T(y), 1/N)) \, dy, \end{aligned}$$

where \(\psi \) is the inverse of \(\varphi \), as described above. By the convexity of f,

$$\begin{aligned} \int _{B} f(W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T, 1/N)) \, dy&\ge \int _{B} f(W^{\frac{1}{2}}({\bar{y}},{\mathbf {1}}, 1/N)) \, dy \ \nonumber \\&\quad + \int _{B} f'(W^{\frac{1}{2}}({\bar{y}},{\mathbf {1}}, 1/N))(W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T, 1/N)\nonumber \\&\quad -W^{\frac{1}{2}}({\bar{y}},{\mathbf {1}}, 1/N)) \, dy. \end{aligned}$$
(5.7)

A direct calculation shows that \(W^{\frac{1}{2}}({\bar{y}},{\mathbf {1}}, 1/N)=(N+\frac{1}{N})^{\frac{1}{2}}\) is constant, and in polar coordinates

$$\begin{aligned} W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T, 1/N) = \left( N |\psi _{_R}|^2 + \frac{1}{N} |\psi _\tau |^2\right) ^{\frac{1}{2}}. \end{aligned}$$

Let \(g(a,b) = (N |a|^2 + \frac{1}{N} |b|^2)^{\frac{1}{2}}\) for any \(a,b \in \mathbb {R}^2\) and note that g is convex. In particular,

$$\begin{aligned} g(\psi _{_R},\psi _\tau )&\ge g({\bar{y}},J{\bar{y}}) + Dg({\bar{y}},J{\bar{y}})\cdot (\psi _{_R}-{\bar{y}},\psi _\tau - J{\bar{y}}) \nonumber \\&= \left( N+\frac{1}{N}\right) ^{\frac{1}{2}} + \left( N+\frac{1}{N}\right) ^{-\frac{1}{2}}\left( N{\bar{y}}\cdot (\psi _{_R} - {\bar{y}})+\frac{1}{N}J{\bar{y}} \cdot (\psi _\tau - J{\bar{y}})\right) . \end{aligned}$$
(5.8)

Integrating and applying the boundary condition \(\psi \arrowvert _{\partial B} = {\text {id}}\), it follows that

$$\begin{aligned} \int _{B}W^{\frac{1}{2}}({\bar{y}},\nabla \psi ^T, 1/N)\, dy \ge \int _{B} W^{\frac{1}{2}}({\bar{y}},{\mathbf {1}}, 1/N)\, dy + \left( N+\frac{1}{N}\right) ^{-\frac{1}{2}}\left( N-\frac{1}{N}\right) ^{\frac{1}{2}}(\pi - F(\psi )), \end{aligned}$$
(5.9)

where \(F(\psi )\) is given by (2.12). The constraint (3.2) is in force, and the arguments of Sect. 3 continue to hold with \(W^{1,p}\) in place of \(H^1\) throughout (with the exception of part (a) of Proposition 3.1; this is not needed, now that (3.2) has been established above by different means). In particular, by Lemma 3.3, we see that \(F(\psi ) \le \pi \), and hence the rightmost term in (5.9) is nonnegative. Since \(f'\ge 0\), it follows that the second line in (5.7) is nonnegative, from which the inequality \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi ) \ge \mathbb {E}(u_{_{\scriptscriptstyle {N}}})\) results.

To prove that the identity map is the unique minimizer, first suppose that \(\mathbb {E}(u_{_{\scriptscriptstyle {N}}}\circ \varphi ) = \mathbb {E}(u_{_{\scriptscriptstyle {N}}})\). Then, in particular, (5.8) holds with equality for a.e. y in B. The same calculation which proves the convexity of g also shows that if \(x:=(a,b) \in \mathbb {R}^4\) then \(D^2g(x)[\xi ,\xi ]=0\) if and only if x and \(\xi \) are proportional. Equality in (5.8) therefore implies that \(D^2g({\bar{y}},J{\bar{y}})[\xi ,\xi ]=0\) with \(\xi :=(\psi _{_R}-{\bar{y}},\psi _{\tau }-J{\bar{y}})\), and hence, by the previous remark, that \((\psi _{_R},\psi _{\tau }) =k(y)({\bar{y}},J{\bar{y}})\) for some function k(y) and a.e. y in B. Since \(\det \nabla \psi =1\) a.e., it follows that \(k^2(y)=1\) a.e., and hence that \(\nabla \psi (y) \in SO(2)\) for a.e. y. By a version of Liouville’s theorem (see e.g. [29]), \(\nabla \psi \) is smooth and everywhere equal to a constant matrix, and hence, via the boundary condition, \(\psi ={\text {id}}\). Thus \(\varphi ={\text {id}}\). \(\square \)
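The convexity step (5.8) lends itself to a quick numerical sanity check. The sketch below is an illustration only, assuming numpy; the value of N, the random sampling and the tolerance are our own choices. It verifies the subgradient inequality for \(g(a,b)=(N|a|^2+\frac{1}{N}|b|^2)^{1/2}\) at the point \(({\bar{y}},J{\bar{y}})\).

```python
import numpy as np

N = 3
J = np.array([[0.0, -1.0], [1.0, 0.0]])
g = lambda a, b: np.sqrt(N * a @ a + (b @ b) / N)
c = np.sqrt(N + 1.0 / N)                         # g(ybar, J ybar)

rng = np.random.default_rng(0)
for _ in range(10000):
    th = rng.uniform(0, 2 * np.pi)
    ybar = np.array([np.cos(th), np.sin(th)])    # a random unit vector standing in for \bar y
    psiR, psitau = rng.normal(size=2), rng.normal(size=2)
    lhs = g(psiR, psitau)
    rhs = c + (N * ybar @ (psiR - ybar) + (J @ ybar) @ (psitau - J @ ybar) / N) / c
    assert lhs >= rhs - 1e-12
print("subgradient inequality (5.8) holds on all random samples")
```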

6 A class of variations in which \(u_2\) is a local minimizer

We now focus on the double-covering map, \(u_2\), with two goals in mind: (a) establishing that there are maps in the admissible class \({\mathcal {Y}}\) that are not inner variations of \(u_2\), and (b) demonstrating that \(u_2\) is a local minimizer in a large subclass of \(\mathcal {A}\) whose description we give in Sect. 6.2. The proofs of both facts rely, to differing extents, on the ability to generate self-maps of the balls B or \(B'\) using flows, the ideas for which come from [10] and the references therein.

6.1 Admissible maps that are not inner variations of \(u_2\)

The following class of counterexamples applies to the case \(N=2\), i.e. to the double-covering map, but the principle extends to general N-covering maps. To be clear, we seek maps v belonging to \(H^1(B;\mathbb {R}^2)\) which obey both \(\det \nabla v = 1\) a.e. in B and \(v\arrowvert _{\partial B} = u_{2}\), but which cannot be expressed as inner variations \(u_2 \circ \varphi \) where \(\varphi \) belongs to \(\mathcal {A}(B)\).

Proposition 6.1

There exists a smooth diffeomorphism \(\psi \) of \(B' = B(0,1/\sqrt{2})\) such that \(\psi (0)=p_0\ne 0\), where \(p_0\) lies in \(B'\), \(\psi \arrowvert _{\partial B'} = {\text {id}}\) and \(\det \nabla \psi = 1\) in \(B'\). The map v defined by \(v=\psi \circ u_{2}\) is then admissible, i.e. \(v \in {\mathcal {Y}}\), but is not an inner variation of \(u_2\).

Proof

To begin with, an argument of Dacorogna and Moser implies that a diffeomorphism \(\psi \) with the properties stated above exists. Specifically, by [10, Remark after Theorem 7], one first chooses a diffeomorphism \(\psi _1\), say, of \(B'\) which permutes the points 0 and \(p_0\) and which preserves the boundary \(\partial B'\), but which does not necessarily preserve area. By [10, Theorem 7], one then finds a second diffeomorphism, \(\psi _2\), say, of \(B'\) which preserves the boundary, fixes 0 and \(p_0\) and which, in addition, satisfies \(\det \nabla \psi _2 = (\det \nabla \psi _1)^{-1}\). It then suffices to take \(\psi =\psi _2 \circ \psi _1\). It should be clear that \(v:=\psi \circ u_{2}\) lies in \(H^1(B;\mathbb {R}^2)\), agrees with \(u_2\) on \(\partial B\) and obeys \(\det \nabla v=1\) on \(B{\setminus } \{0\}\), so that \(v \in {\mathcal {Y}}\). Moreover, by construction, \(v(0)=p_0\) and, in fact, 0 is the unique preimage of \(p_0\) in B.

To clarify the argument we introduce the following notation. Given sets X and Y, a map \(f: X \rightarrow Y\) and points \(a \in X\), \(b \in Y\), let \(a \overset{f}{\leftrightarrow } b\) express the statement that \(f(x)=b\) if and only if \(x=a\). By definition of \(u_2\) and \(\psi \) we therefore have \(0 \overset{u_2}{\leftrightarrow } 0 \overset{\psi }{\leftrightarrow } p_0\), and hence, since \(v=\psi \circ u_2\) by definition, \(0 \overset{v}{\leftrightarrow } p_0\). Now let \(y_1\) and \(y_2\) be the local inverses of \(u_2\) defined on \(B'\), so that \(u_2(y_i(p_0))=p_0\) for \(i=1,2\). Note that because \(p_0 \ne 0\), neither of \(y_1(p_0),y_2(p_0)\) is zero and \(y_1(p_0) \ne y_2(p_0)\). Let \(\varphi \in \mathcal {A}\), where the class \(\mathcal {A}\) is defined in (1.3). Since \(d(\varphi ,B,y)=1\) for all y in B, it must in particular be that there are points \(x_1,x_2\) in B such that \(\varphi (x_i) = y_i(p_0)\) for \(i=1,2\). Moreover, \(x_1 \ne x_2\) because \(y_1(p_0) \ne y_2(p_0)\). Hence, and with obvious notation, \(x_i \overset{\varphi }{\rightarrow } y_i(p_0) \overset{u_2}{\rightarrow } p_0\). (See Fig. 1.) If we now suppose for a contradiction that \(v=u_2 \circ \varphi \) then we must have \(x_i \overset{v}{\rightarrow } p_0\) for \(i=1,2\). But \(0 \overset{v}{\leftrightarrow } p_0\), which implies \(x_1=x_2=0\) and thereby contradicts \(x_1 \ne x_2\). Since \(\varphi \) was arbitrary, we conclude that v is not expressible as \(u_2 \circ \varphi \). \(\square \)
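In complex notation the two preimages used in the proof are explicit: writing \(u_2(z)=z^2/(\sqrt{2}|z|)\), every nonzero \(p_0 \in B'\) has exactly the two preimages \(\pm \sqrt{2}\,|p_0|\,e^{i\arg (p_0)/2}\) in B. A quick check (assuming numpy; the chosen \(p_0\) is arbitrary):

```python
import numpy as np

def u2(z):                                # the double-covering map on B \ {0}
    return z**2 / (np.sqrt(2) * abs(z))

p0 = 0.3 + 0.2j                           # any nonzero point of B' = B(0, 1/sqrt(2))
y1 = np.sqrt(2) * abs(p0) * np.exp(1j * np.angle(p0) / 2)
y2 = -y1

print(abs(u2(y1) - p0), abs(u2(y2) - p0))     # both ~ 0
print(abs(y1) < 1, y1 != y2)                  # the preimages lie in B and differ
```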

The key to the result above is the use of ‘outer variations’ of \(u_2\) of the form \(\psi \circ u_2\), and it naturally leads us to consider compositions of the form \(\psi \circ u_2 \circ \varphi \) where \(\psi \) and \(\varphi \) are measure-preserving maps of the balls \(B'\) and B respectively. By further requiring that \(\mathrm{tr}\,(\psi )=\text {id}\) and \(\mathrm{tr}\,(\varphi )=\text {id}\), it should be clear that \(\psi \circ u_2 \circ \varphi \) belongs to \({\mathcal {Y}}\). In this way, we can generate a rich subclass of \({\mathcal {Y}}\) which can be analysed in a neighbourhood of \(u_2\). This is the topic of the next subsection.

Fig. 1 The points \(y_1\) and \(y_2\) are such that \(u_2^{-1}(p_0)=\{y_1,y_2\}\), and the points \(x_1\) and \(x_2\) obey \(\varphi (x_i)=y_i\) for \(i=1,2\). Thus the map \(u_2 \circ \varphi \) takes \(x_1\) and \(x_2\) to \(p_0\). In particular, \(p_0\) always has two preimages in B, and so the map v constructed in Proposition 6.1 cannot be of the form \(u_2 \circ \varphi \).

6.2 Local minimality of \(u_2\) in the class \(S' \circ u_2 \circ S^{-1}\)

In this section we focus on admissible maps \(v=v(\cdot ,\delta )\) of the form

$$\begin{aligned} v=h\,\circ \,u_2\,\circ \,g, \end{aligned}$$
(6.1)

where \(h \in S'\), \(g=G^{-1}\) with \(G \in S\), and where the classes \(S'\) and S are defined in terms of flows, as follows. Firstly, let

$$\begin{aligned} T(B):=\{\Sigma \in C_c^{\infty }(B{\setminus }\{0\},\mathbb {R}^2): \ \mathrm{div}\,\Sigma =0 \ \text {in } B\}, \end{aligned}$$

with a corresponding description for \(T(B')\). Then we define \(S'\) and S respectively by

$$\begin{aligned} S'&:=\left\{ h: B' \times (-\delta _0,\delta _0) \rightarrow B': \ \exists \ \Xi \in T(B') \ \text {and} \ \delta _0> 0 \;\, \text {s.t.} \;\, \begin{array}{ll}\partial _{\delta }h(z,\delta ) =\Xi (h(z,\delta )) &{}\quad z \in B', \ \delta \in (-\delta _0,\delta _0) \\ h(z,0)=z &{}\quad z \in B'\end{array}\right\} \end{aligned}$$

and

$$\begin{aligned} S&:=\left\{ G: B \times (-\delta _0,\delta _0) \rightarrow B: \ \exists \ \Sigma \in T(B) \ \text {and} \ \delta _0> 0 \;\, \text {s.t.} \;\, \begin{array}{ll} \partial _{\delta }G(y,\delta )=\Sigma (G(y,\delta )) &{}\quad y \in B, \ \delta \in (-\delta _0,\delta _0) \\ G(y,0)=y &{}\quad y \in B \end{array}\right\} . \end{aligned}$$

Now, by [9, Lemma 14.11], maps belonging to \(S'\) are self-maps of \(B'\) which obey (i) \(\det \nabla _z h(z,\delta )=1\) for all \(z \in B'\), \(|\delta | < \delta _0\) and (ii) \(h(z,\delta )=z\) for all \(z \in \partial B'\), \(|\delta | < \delta _0\). Similarly, maps in S are smooth self-maps of B with unit Jacobian and which agree with the identity on \(\partial B\). In particular, by the inverse function theorem, any map G belonging to S is smooth, invertible and agrees with the identity on \(\partial B\), so that \(g=G^{-1}\) is well-defined. Letting the set \(S^{-1}\) be

$$\begin{aligned} S^{-1}=\{G^{-1}: \ G \in S\}, \end{aligned}$$

the notation in the title of the section is now self-explanatory, and, by inspection, the map v given by (6.1) belongs to \({\mathcal {Y}}\).
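To make the flow construction concrete, the sketch below (an illustration only, assuming numpy) takes a radial stream function \(\sigma (y)=\chi (|y|)\) with \(\chi \) a bump supported in an annulus, for which the flow of \(\Sigma =J\nabla \sigma \) can be written down exactly: each circle \(|y|=r\) is rotated by the angle \(\delta \chi '(r)/r\). The resulting map \(G(\cdot ,\delta )\) fixes a neighbourhood of 0 and of \(\partial B\) and has unit Jacobian, as a finite-difference check confirms. The bump profile and all numerical parameters are our own choices.

```python
import numpy as np

a, b = 0.2, 0.8                   # sigma is supported in the annulus a < |y| < b

def chi(r):                       # smooth bump on (a, b), extended by zero
    out = np.zeros_like(r)
    inside = (r > a) & (r < b)
    out[inside] = np.exp(-1.0 / ((r[inside] - a) * (b - r[inside])))
    return out

def chi_prime(r):
    out = np.zeros_like(r)
    inside = (r > a) & (r < b)
    u = (r[inside] - a) * (b - r[inside])
    out[inside] = chi(r)[inside] * (a + b - 2.0 * r[inside]) / u**2
    return out

def G(y, delta):
    """Flow of Sigma = J grad(sigma), sigma(y) = chi(|y|): rotate the circle
    |y| = r by the angle delta * chi'(r) / r (r is constant along the flow)."""
    r = np.linalg.norm(y, axis=-1, keepdims=True)
    ang = delta * chi_prime(r) / np.maximum(r, 1e-12)
    c, s = np.cos(ang), np.sin(ang)
    return np.concatenate([c * y[..., :1] - s * y[..., 1:],
                           s * y[..., :1] + c * y[..., 1:]], axis=-1)

rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(600, 2))
pts = pts[np.linalg.norm(pts, axis=1) < 0.99]          # sample points of B

h, delta = 1e-5, 0.3
e1, e2 = np.array([h, 0.0]), np.array([0.0, h])
dG1 = (G(pts + e1, delta) - G(pts - e1, delta)) / (2 * h)
dG2 = (G(pts + e2, delta) - G(pts - e2, delta)) / (2 * h)
det = dG1[:, 0] * dG2[:, 1] - dG1[:, 1] * dG2[:, 0]

print(np.max(np.abs(det - 1.0)))                       # close to 0: unit Jacobian (finite-difference error only)
outer = np.linalg.norm(pts, axis=1) > b
print(np.max(np.abs(G(pts, delta) - pts)[outer]))      # 0.0: identity outside the support of sigma
```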

Now recall that

$$\begin{aligned} \mathbb {D}(v)=\int _B |\nabla v(x)|^2 \, dx. \end{aligned}$$

When v is expressed using (6.1), a short calculation shows that

$$\begin{aligned} \mathbb {D}(v)=\int _B |\nabla h(u_2(y))\nabla u_2(y) \mathrm{adj}\,\nabla G(y)|^2 \, dy, \end{aligned}$$
(6.2)

where \(G=g^{-1}\). Indeed, it is clear from (6.1) and the definition of \(\mathbb {D}(v)\) that

$$\begin{aligned} \mathbb {D}(v) = \int _B |\nabla h(u_2(g(x)))\nabla u_2(g(x)) \nabla g(x)|^2 \, dx. \end{aligned}$$

By making the substitution \(y=g(x)\) and using the fact that \(\nabla g(x) = \mathrm{adj}\,\nabla G(y)\) when \(\det \nabla g(x)=1\) and \(G=g^{-1}\), the expression (6.2) results.

Lemma 6.2

Let \(\mathbb {D}(v)\) be as in (6.2) with h a smooth self-map of \(B'\) and G a smooth self-map of B. Then \(\mathbb {D}(v)\) can be written as

$$\begin{aligned} \mathbb {D}(v)&=\int _B 2 |h_{\tau }(u_2(y))|^2|G_{_{\small {R}}}(y)|^2+\frac{1}{2}|h_s(u_2(y))|^2|G_{\tau }(y)|^2 \nonumber \\&\quad - 2 (h_{\tau }(u_2(y))\cdot h_s(u_2(y))) (G_{\tau }(y) \cdot G_{_{\small {R}}}(y)) \, dy. \end{aligned}$$
(6.3)

Furthermore, on letting

$$\begin{aligned} p(y)&= \frac{G_{\tau }(y) \cdot G_{_{\small {R}}}(y)}{|G_{_{\small {R}}}(y)|^2}, \end{aligned}$$
(6.4)
$$\begin{aligned} q(y)&= \frac{h_{\tau }(u_2(y))\cdot h_s(u_2(y))}{|h_s(u_2(y))|^2}, \end{aligned}$$
(6.5)

and defining the function \(\Psi : \mathbb {R}^+ \rightarrow \mathbb {R}^+\) by \(\Psi (t)=t+\frac{1}{t}\), \(\mathbb {D}(v)\) takes the form

$$\begin{aligned} \mathbb {D}(v) = \int _{B} \frac{1}{2} |h_s(u_2(y))|^2|G_{_{\small {R}}}(y)|^2(p(y)-2q(y))^2 +\Psi \left( \frac{2|G_{_{\small {R}}}(y)|^2}{|h_s(u_2(y))|^2}\right) \, dy. \end{aligned}$$
(6.6)

Proof

The first expression, (6.3), for \(\mathbb {D}(v)\) follows by substituting the three expressions, valid for \(y \ne 0\),

$$\begin{aligned} \nabla h(u_2(y))&= h_s(u_2(y)) \otimes \overline{u_2(y)}+ h_\tau (u_2(y)) \otimes J\overline{u_2(y)},\\ \nabla u_2 (y)&= \frac{1}{\sqrt{2}} \overline{u_2(y)}\otimes \overline{y}+ \sqrt{2} \, J\overline{u_2(y)}\otimes J \overline{y}, \\ \mathrm{adj}\,\nabla G(y)&= J\overline{y}\otimes J G_{_{\small {R}}}(y) - \overline{y}\otimes J G_{\tau }(y), \end{aligned}$$

where \({\overline{z}}:=\frac{z}{|z|}\) for any \(z \in \mathbb {R}^2{\setminus }\{0\}\), into (6.2).

In order to see (6.6), we begin by abbreviating \(b_1=G_{_{\small {R}}}(y)\), \(b_2=G_{\tau }(y)\), \(a_1=h_s(u_2(y))\) and \(a_2=h_{\tau }(u_2(y))\). In these terms, \(\det \nabla h(u_2(y)) = J a_1 \cdot a_2\) and \(\det \nabla G(y) = J b_1 \cdot b_2\), and since both h and G are \(1-1\), measure-preserving maps by hypothesis, it follows that \(J a_1 \cdot a_2=1\) and \(J b_1 \cdot b_2=1\) throughout the domain B. In particular, note that none of \(a_1, a_2, b_1\) and \(b_2\) can vanish at any point y in B. Hence the decompositions

$$\begin{aligned} b_2&= p \, b_1 + \frac{J b_1}{|b_1|^2} \\ a_2&= q \, a_1 + \frac{J a_1}{|a_1|^2} \end{aligned}$$

must hold, where \(p=\frac{b_1 \cdot b_2}{|b_1|^2}\) and \(q=\frac{a_1 \cdot a_2}{|a_1|^2}\) are the shorthand versions of (6.4) and (6.5) introduced above. Using these expressions in (6.3) shows that

$$\begin{aligned} \mathbb {D}(v)&= \int _B 2 |b_1|^2 \left| q \, a_1 + \frac{J a_1}{|a_1|^2}\right| ^2 +\frac{1}{2} |a_1|^2\left| p \, b_1 + \frac{J b_1}{|b_1|^2}\right| ^2 - 2 pq |a_1|^2 |b_1|^2 \, dy \\&= \int _B \frac{1}{2} |a_1|^2|b_1|^2 (p^2 + 4q^2 - 4 pq) + \frac{2|b_1|^2}{|a_1|^2}+\frac{|a_1|^2}{2|b_1|^2} \,dy, \end{aligned}$$

which, on reverting to the original notation, is (6.6). \(\square \)
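The algebra which takes (6.3) to (6.6) can be confirmed symbolically. The sketch below (assuming sympy; the symbols stand for \(a_1=h_s\), \(a_2=h_{\tau }\), \(b_1=G_{_{\small {R}}}\), \(b_2=G_{\tau }\) at a fixed point) imposes the decompositions used in the proof and checks that the two integrands agree identically.

```python
import sympy as sp

a11, a12, b11, b12, p, q = sp.symbols('a11 a12 b11 b12 p q', real=True)
a1, b1 = sp.Matrix([a11, a12]), sp.Matrix([b11, b12])
J = sp.Matrix([[0, -1], [1, 0]])

# the decompositions from the proof; they force J a1 . a2 = J b1 . b2 = 1
a2 = q * a1 + J * a1 / a1.dot(a1)
b2 = p * b1 + J * b1 / b1.dot(b1)

lhs = 2 * b1.dot(b1) * a2.dot(a2) + sp.Rational(1, 2) * a1.dot(a1) * b2.dot(b2) \
      - 2 * a1.dot(a2) * b1.dot(b2)                       # integrand of (6.3)
Psi = lambda s: s + 1 / s
rhs = sp.Rational(1, 2) * a1.dot(a1) * b1.dot(b1) * (p - 2*q)**2 \
      + Psi(2 * b1.dot(b1) / a1.dot(a1))                  # integrand of (6.6)

print(sp.simplify(lhs - rhs))     # prints 0
```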

Let \(\Xi : B' \rightarrow B'\) be a smooth, divergence-free map with compact support in the punctured ball \(B'{\setminus } 0\). Let h belong to \(S'\), that is, \(h(z,\delta )\) is a one-parameter family of maps, defined for \(z \in B'\) and \(|\delta | <\delta _0\), for some \(\delta _0>0\) by the system

$$\begin{aligned} {\left\{ \begin{array}{ll}h'(z,\delta ) = \Xi (h(z,\delta )) &{} \text {if} \;\, z \in B'\quad \text {and}\quad |\delta | < \delta _0 \\ h(z,0) =z &{} \text {if} \;\, z \in B', \end{array}\right. } \end{aligned}$$
(6.7)

where \(h'(z,\delta ):=\partial _{\delta }h(z,\delta )\). Similarly, let \(\Sigma : B \rightarrow B\) be a smooth, divergence-free map with compact support in the punctured ball \(B {\setminus } 0\). Let \(G=g^{-1}\) be a one-parameter family of maps, defined for \(y \in B\) and \(|\delta | <\delta _0\), by the system

$$\begin{aligned} {\left\{ \begin{array}{ll}G'(y,\delta ) = \Sigma (G(y,\delta )) &{} \text {if} \;\, y \in B\quad \text {and}\quad |\delta | < \delta _0 \\ G(y,0) =y &{} \text {if} \;\, y \in B, \end{array}\right. } \end{aligned}$$
(6.8)

where \(G'(y,\delta ):=\partial _{\delta }G(y,\delta )\).

Since \(\Xi \) and \(\Sigma \) are smooth and divergence-free, we have \(\det \nabla _z h(z,\delta )=1\) if \(z \in B'\), \(\det \nabla _y G(y,\delta )=1\) if \(y \in B\), and higher derivatives of \(h(\cdot ,\delta )\) and \(G(\cdot ,\delta )\) with respect to \(\delta \) can be calculated directly from (6.7) and (6.8). Finally, since \(\Xi \) and \(\Sigma \) are divergence-free, it is well known that potentials \(\sigma \) and \(\xi \) exist such that

$$\begin{aligned} \Xi (z)&= J \nabla _z \xi (z) \quad z \in B', \\ \Sigma (y)&= J \nabla _y \sigma (y) \quad y \in B. \end{aligned}$$

With \(G(y,\delta )\) and \(h(z,\delta )\) defined by (6.8) and (6.7), let

$$\begin{aligned} g(y,\delta )=G^{-1}(y,\delta ) \quad y \in B, \ |\delta | < \delta _0, \end{aligned}$$
(6.9)

and write

$$\begin{aligned} v(x,\delta ) =h(u_2(g(x,\delta )),\delta ) \quad x \in B, \ |\delta | < \delta _0. \end{aligned}$$
(6.10)

Suppressing the dependence on \(\delta \) for brevity, this is merely (6.1) with the particular choices \(g(x)=g(x,\delta )\) and \(h(z)=h(z,\delta )\), but now equipped with an associated evolution which enables us to calculate the second variation \(\left. \partial _{\delta }^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta )).\)

The goal of this section is to prove the following result.

Theorem 6.3

Let \(\Sigma =J\nabla \sigma \) and \(\Xi =J\nabla \xi \) be smooth, divergence free maps belonging, respectively, to \(C^{\infty }_c(B{\setminus } 0,\mathbb {R}^2)\) and \(C^{\infty }_c(B'{\setminus } 0,\mathbb {R}^2)\), and let hGg and v be defined by (6.7), (6.8), (6.9) and (6.10) respectively. Then

  1. (a)

    \(\left. \partial _{\delta }\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta )) = 0\);

  2. (b)

    it holds that

    $$\begin{aligned} \left. \partial _\delta ^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta )) \ge 4 \int _B \left\{ (\sigma - \xi \circ u_2)_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (\sigma - \xi \circ u_2)_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy, \end{aligned}$$
    (6.11)

    and \(\left. \partial _\delta ^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta ))=0\) only if

    $$\begin{aligned} \mathbb {D}(v(\cdot ,\delta )) =\mathbb {D}(u_2) \quad |\delta | < \delta _0. \end{aligned}$$

In particular, the map \(u_2\) is a local minimizer of the Dirichlet energy with respect to all variations of the form \(v(y,\delta )\).

The proof of Theorem 6.3 relies on a number of auxiliary results, which we now present.

Lemma 6.4

Let \(h(z,\delta )\) and \(G(y,\delta )\) obey (6.7) and (6.8) respectively, and define

$$\begin{aligned} a_1(y,\delta )&=h_s(u_2(y),\delta ) \end{aligned}$$
(6.12)
$$\begin{aligned} b_1(y,\delta )&= G_{_{\small {R}}}(y,\delta ) \end{aligned}$$
(6.13)
$$\begin{aligned} a_2(y,\delta )&=h_{\tau }(u_2(y),\delta ) \end{aligned}$$
(6.14)
$$\begin{aligned} b_2(y,\delta )&= G_{\tau }(y,\delta ). \end{aligned}$$
(6.15)

For brevity, write \(b_1'(y,\delta )=\partial _{\delta }b_1(y,\delta )\), \(b_1''(y,\delta )=\partial _{\delta }^2b_1(y,\delta )\), and similarly for \(a_1(y,\delta )\), \(b_2(y,\delta )\) and \(a_2(y,\delta )\). Then the following hold for \(y \in B\) and \(-\delta _0< \delta < \delta _0\):

  1. (a)

    \(b_1(y,0)=\overline{y}\), \(a_1(y,0) = \overline{u_2(y)}\);

  2. (b)

    \(b_2(y,0)=J\overline{y}\), \(a_2(y,0) = J\overline{u_2(y)}\);

  3. (c)

    \(b_1'(y,0)=\Sigma _{{\scriptscriptstyle {R}}}(y)\), \(a_1'(y,0)=\Xi _s(u_2(y))\);

  4. (d)

    \(b_2'(y,0)=\Sigma _{\tau }(y)\), \(a_2'(y,0)=\Xi _{\tau }(u_2(y)).\)

Proof

  1. (a)

    Since \(G(y,0)=y\), the relation \(G_{_{\small {R}}}(y,0)=\overline{y}\), whose left-hand side is \(b_1(y,0)\) by definition, is immediate. Similarly, \(h_s(z,0)=\overline{z}\) for any \(z \ne 0\), so take \(z=u_2(y)\) and recall the definition of \(a_1(y,0)\). The proof of part (b) follows similarly.

  2. (c)

    Consider (6.8) and take the derivative with respect to the radial variable R. Since the derivatives in \(\delta \) and R commute, we have

    $$\begin{aligned} b_1'(y,\delta ) = \nabla \Sigma (G(y,\delta )) b_1(y,\delta ) \quad y \in B, \ |\delta | < \delta _0. \end{aligned}$$
    (6.16)

    By taking \(\delta =0\), applying (a) and the fact that \(G(y,0)=y\) for \(y \in B\), it follows that

    $$\begin{aligned} b_1'(y,0)=\nabla \Sigma (y) \overline{y}\quad y \in B. \end{aligned}$$

    The right-hand side of the equation above is \(\Sigma _{_{\small {R}}}(y)\), as claimed. Similarly, \(h_s'(z,0)=\nabla \Xi (z)\overline{z}=\Xi _s(z)\), in which we set \(z=u_2(y)\) to obtain \(a_1'(y,0)=\Xi _s(u_2(y))\). Notice that the condition \(y \ne 0\) is precisely what is needed to ensure \(z=u_2(y)\ne 0\). The proof of part (d) follows similarly. \(\square \)

The next three lemmas gather together and simplify various quantities which will shortly be of use and which exploit the divergence-free nature of \(\Xi \) and \(\Sigma \) appearing in (6.7) and (6.8). The first of these results is an identity that is easily deduced, so we present only a sketch of the proof. The second and third results are more involved.

Lemma 6.5

Let \(\sigma \) belong to the class \(C_c^{\infty }(B{\setminus } 0,\mathbb {R})\). Then

$$\begin{aligned} \int _B \sigma _{\tau {\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} \sigma _{\tau } -\sigma _{{\scriptscriptstyle {R}}}\sigma _{\tau {\scriptscriptstyle {R}}\tau } + \sigma _{\tau {\scriptscriptstyle {R}}}^2 - \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\sigma _{\tau \tau } \, dy = 0. \end{aligned}$$
(6.17)

Proof

Equation (6.17) follows from the facts that

$$\begin{aligned} \int _B \sigma _{\tau {\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} \sigma _{\tau } - \sigma _{{\scriptscriptstyle {R}}}\sigma _{\tau {\scriptscriptstyle {R}}\tau } \, dy&=0,\quad \text {and} \end{aligned}$$
(6.18)
$$\begin{aligned} \int _B \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\sigma _{\tau \tau } - \sigma _{\tau {\scriptscriptstyle {R}}}^2\, dy&=0, \end{aligned}$$
(6.19)

each of which can be established through a relatively straightforward sequence of integrations by parts. \(\square \)
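The identity (6.17) is easy to test on a concrete example. The sketch below (assuming sympy) uses the conventions \(\sigma _{{\scriptscriptstyle {R}}}=\partial _R\sigma \) and \(\sigma _{\tau }=\frac{1}{R}\partial _\theta \sigma \), and a test function of our own choosing which vanishes to third order at \(R=0\) and \(R=1\) in place of a genuine \(C_c^\infty \) bump, so that the integral can be evaluated exactly.

```python
import sympy as sp

R, th = sp.symbols('R theta', positive=True)
sigma = R**3 * (1 - R)**3 * sp.cos(th)          # a concrete test function

d_R = lambda f: sp.diff(f, R)
d_tau = lambda f: sp.diff(f, th) / R            # angular derivative per unit arc length

s_R, s_tau = d_R(sigma), d_tau(sigma)
s_RR, s_tauR = d_R(s_R), d_R(s_tau)
s_tautau, s_tauRtau = d_tau(s_tau), d_tau(s_tauR)
s_tauRR = d_R(s_tauR)

integrand = s_tauRR*s_tau - s_R*s_tauRtau + s_tauR**2 - s_RR*s_tautau
value = sp.integrate(sp.integrate(integrand * R, (th, 0, 2*sp.pi)), (R, 0, 1))
print(sp.simplify(value))                        # prints 0
```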

Lemma 6.6

Let \(b_1(y,\delta )\) be given by (6.13), where the corresponding map \(G(y,\delta )\) evolves according to (6.8) and where \(\Sigma =J \nabla \sigma \) is a smooth, divergence-free map with compact support in the punctured ball \(B{\setminus } 0\). Similarly, let \(a_1(y,\delta )\) be given by (6.12), where the corresponding map \(h(z,\delta )\) evolves according to (6.7) and where \(\Xi =J \nabla \xi \) is a smooth, divergence-free map with compact support in the punctured ball \(B'{\setminus } 0\). In addition, let

$$\begin{aligned} X(y):=\xi (u_2(y)) \quad y \in B, \end{aligned}$$
(6.20)

and note that \(X \in C_c^{\infty }(B{\setminus } 0,\mathbb {R})\). Then

  1. (a)

    \(|\Sigma _{_{\small {R}}}|^2 = \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2 + \sigma _{\tau {\scriptscriptstyle {R}}}^2\);

  2. (b)

    \(|\Xi _s(u_2(y))|^2 = 4 X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2+X_{\tau {\scriptscriptstyle {R}}}^2\), and

  3. (c)

    the following identities hold:

    $$\begin{aligned} \int _{B} b_1(y,0) \cdot b_1''(y,0)+|b_1'(y,0)|^2 \, dy&= \int _B \frac{\sigma _{{\scriptscriptstyle {R}}}^2}{R^2} + \frac{\sigma _{\tau \tau }\sigma _{{\scriptscriptstyle {R}}}}{R} + \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2 + \sigma _{\tau {\scriptscriptstyle {R}}}^2 \, dy, \end{aligned}$$
    (6.21)
    $$\begin{aligned} \int _{B} a_1(y,0) \cdot a_1''(y,0)+|a_1'(y,0)|^2 \, dy&= \int _B \frac{4 X_{{\scriptscriptstyle {R}}}^2}{R^2} + \frac{X_{\tau \tau }X_{{\scriptscriptstyle {R}}}}{R} + 4 X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2 + X_{\tau {\scriptscriptstyle {R}}}^2 \, dy. \end{aligned}$$
    (6.22)

Proof

  1. (a)

    We have \(\Sigma = \sigma _{{\scriptscriptstyle {R}}} J\overline{y}- \sigma _{\tau } \overline{y}\), so \(\Sigma _{{\scriptscriptstyle {R}}} = \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} J\overline{y}- \sigma _{\tau {\scriptscriptstyle {R}}} \overline{y}\), and part (a) is immediate.

  2. (b)

    Replacing \(\sigma (y)\) and y with \(\xi (z)\) and z respectively gives \(\Xi _s(z) = \xi _{ss}(z) J\overline{z}- \xi _{\tau s}(z) \overline{z}\). By (6.20), \(\xi _{ss}(u_2(y))=2X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\) and \(\xi _{\tau s}(u_2(y))=X_{\tau {\scriptscriptstyle {R}}}\), whence the stated expression for \(|\Xi _s(u_2(y))|^2.\)

  3. (c)

    We focus on calculating \(b_1(y,0)\cdot b_1''(y,0)\), since \(|b_1'(y,0)|^2=|\Sigma _{{\scriptscriptstyle {R}}}(y)|^2\) has already been dealt with in part (a). Given (6.8), it follows by (6.16) and parts (a) and (c) of Lemma 6.4 that, for \(i=1,2\), the \(i^{th}\) component of \(b_1''(y,0)\) obeys

    $$\begin{aligned} ( b_1''(y,0))_i = \nabla ^2 \Sigma _i(y)[\Sigma (y),\overline{y}]+(\nabla \Sigma (y)\Sigma _{{\scriptscriptstyle {R}}}(y))_i. \end{aligned}$$

    Suppressing the dependence of \(\Sigma \) on y for brevity, we have

    $$\begin{aligned} \nabla \Sigma&= \Sigma _{{\scriptscriptstyle {R}}} \otimes \overline{y}+\Sigma _{\tau } \otimes J\overline{y}, \end{aligned}$$

    so that

    $$\begin{aligned} (\nabla \Sigma \, \Sigma _{{\scriptscriptstyle {R}}})_i = (\Sigma _i)_{{\scriptscriptstyle {R}}} (\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}) +(\Sigma _i)_{\tau } (J\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}). \end{aligned}$$

    Moreover,

    $$\begin{aligned} \nabla ^2 \Sigma _i = ((\Sigma _i)_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} \overline{y}+ (\Sigma _i)_{\tau {\scriptscriptstyle {R}}} J\overline{y}) \otimes \overline{y}+ \left( (\Sigma _i)_{{\scriptscriptstyle {R}}\tau } \overline{y}+ \frac{(\Sigma _i)_{{\scriptscriptstyle {R}}}}{R}J\overline{y}+(\Sigma _i)_{\tau \tau } J\overline{y}- \frac{(\Sigma _i)_{\tau }}{R} \overline{y}\right) \otimes J\overline{y}\end{aligned}$$

    and hence

    $$\begin{aligned} \nabla ^2 \Sigma _i[\Sigma (y),\overline{y}] = (\Sigma _i)_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} (\Sigma \cdot \overline{y}) + \left( (\Sigma _i)_{{\scriptscriptstyle {R}}\tau } - \frac{(\Sigma _i)_{\tau }}{R}\right) (\Sigma \cdot J\overline{y}). \end{aligned}$$

    Therefore, with the summation convention in force,

    $$\begin{aligned} b_1(y,0) \cdot b_1''(y,0)&= \overline{y}_i \nabla ^2 \Sigma _i[\Sigma ,\overline{y}]+\overline{y}_i(\nabla \Sigma \, \Sigma _{{\scriptscriptstyle {R}}})_i \\&= (\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}})(\Sigma \cdot \overline{y}) + \left( \overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}\tau } - \frac{\overline{y}\cdot \Sigma _{\tau }}{R}\right) (\Sigma \cdot J\overline{y}) + (\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}})^2 \\&\quad + (\overline{y}\cdot \Sigma _{\tau })(J\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}) \end{aligned}$$

    Given that \(\Sigma =J\nabla \sigma \), it is straightforward to check that \(\Sigma \cdot \overline{y}= -\sigma _{\tau }\), \(\Sigma \cdot J\overline{y}= \sigma _{{\scriptscriptstyle {R}}}\), and that

    $$\begin{aligned} \overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}&= -\sigma _{\tau {\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} \end{aligned}$$
    (6.23)
    $$\begin{aligned} \overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}\tau }&= -\frac{\sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}}{R} -\sigma _{\tau {{\scriptscriptstyle {R}}} \tau } \end{aligned}$$
    (6.24)
    $$\begin{aligned} \overline{y}\cdot \Sigma _{\tau }&= -\sigma _{\tau \tau } -\frac{\sigma _{{\scriptscriptstyle {R}}}}{R}\end{aligned}$$
    (6.25)
    $$\begin{aligned} \overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}&=-\sigma _{\tau {\scriptscriptstyle {R}}}\quad \text {and}\end{aligned}$$
    (6.26)
    $$\begin{aligned} J\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}&=\sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}. \end{aligned}$$
    (6.27)

    Hence

    $$\begin{aligned} b_1(y,0) \cdot b_1''(y,0) + |b_1'(y,0)|^2&= \sigma _{\tau }\sigma _{\tau {\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} -\sigma _{{\scriptscriptstyle {R}}} \left( \frac{\sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}}{R} + \sigma _{\tau {{\scriptscriptstyle {R}}} \tau }\right) \\&\quad + \frac{\sigma _{{\scriptscriptstyle {R}}}}{R}\left( \sigma _{\tau \tau } +\frac{\sigma _{{\scriptscriptstyle {R}}}}{R}\right) -\sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\left( \sigma _{\tau \tau } +\frac{\sigma _{{\scriptscriptstyle {R}}}}{R}\right) \\&\quad + 2\sigma _{\tau {\scriptscriptstyle {R}}}^2 + \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2. \end{aligned}$$

    Integrating this expression over B, applying Lemma 6.5 and using the fact that \( \int _B \frac{\sigma _{{\scriptscriptstyle {R}}} \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}}{R} \, dy = 0\) yields (6.21). Similarly,

    $$\begin{aligned} a_1(y,0) \cdot a_1''(y,0) + |a_1'(y,0)|^2&= \xi _{\tau }\xi _{\tau s s} -\xi _{s} \left( \frac{\xi _{ss}}{s} + \xi _{\tau {s} \tau }\right) + \frac{\xi _{s}}{s}\left( \xi _{\tau \tau } +\frac{\xi _{s}}{s}\right) \\&\quad -\xi _{s s}\left( \xi _{\tau \tau } +\frac{\xi _{s}}{s}\right) + 2\xi _{\tau s}^2 + \xi _{s s}^2, \end{aligned}$$

    where each term on the right-hand side is evaluated at \(u_2(y)\). Using (6.20), we have \(\xi _{\tau }(u_2(y))=X_{\tau }(y)/\sqrt{2}\), \(\xi _{s}(u_2(y))=\sqrt{2} X_{{\scriptscriptstyle {R}}}(y)\), and, suppressing the arguments for brevity, \(\xi _{\tau s s}=\sqrt{2}X_{\tau {\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\), \(\xi _{ss}=2 X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\), \(\xi _{\tau s\tau }=X_{\tau {\scriptscriptstyle {R}}\tau } /\sqrt{2}\), \(\xi _{\tau \tau }=X_{\tau \tau }/2\) and \(\xi _{\tau s}=X_{\tau {\scriptscriptstyle {R}}}\). Hence, with \(s=|u_2(y)|=R/\sqrt{2}\), we have

    $$\begin{aligned} a_1(y,0) \cdot a_1''(y,0) + |a_1'(y,0)|^2&= X_{\tau } X_{\tau {\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} - X_{{\scriptscriptstyle {R}}}X_{\tau {\scriptscriptstyle {R}}\tau }+X_{\tau {\scriptscriptstyle {R}}}^2 - X_{\tau \tau }X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}-\frac{8 X_{{\scriptscriptstyle {R}}}X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}}{R} \\&\quad + \frac{4X_{{\scriptscriptstyle {R}}}^2}{R^2} + \frac{X_{\tau \tau }X_{{\scriptscriptstyle {R}}}}{R}+4 X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2 + X_{\tau {\scriptscriptstyle {R}}}^2. \end{aligned}$$

    Integrating this expression over B, applying Lemma 6.5 with X in place of \(\sigma \), and using the fact that \( \int _B \frac{X_{{\scriptscriptstyle {R}}} X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}}{R} \, dy = 0\), we obtain (6.22). \(\square \)

By employing the shorthand notation \(a_i(\delta ):=a_i(y,\delta )\) and \(b_i(\delta ):=b_i(y,\delta )\) for \(i=1,2\), and using (6.6) from Lemma 6.2, we can write

$$\begin{aligned} \mathbb {D}(v(\cdot ,\delta ))=\int _B \frac{1}{2} |a_1(\delta )|^2|b_1(\delta )|^2 (p(\delta )-2q(\delta ))^2 + \Psi (r(\delta )) \, dy, \end{aligned}$$
(6.28)

where

$$\begin{aligned} p(\delta )&=\frac{b_1(\delta )\cdot b_2(\delta )}{|b_1(\delta )|^2}, \end{aligned}$$
(6.29)
$$\begin{aligned} q(\delta )&= \frac{a_1(\delta )\cdot a_2(\delta )}{|a_1(\delta )|^2},\end{aligned}$$
(6.30)
$$\begin{aligned} r(\delta )&:=\frac{2 |b_1(\delta )|^2}{|a_1(\delta )|^2}. \end{aligned}$$
(6.31)

Lemma 6.7

Let \(\mathcal {L}(\delta )=\frac{1}{2} |a_1(\delta )|^2|b_1(\delta )|^2 (p(\delta )-2q(\delta ))^2 + \Psi (r(\delta ))\) be the integrand of \(\mathbb {D}(v(\cdot ,\delta ))\) as it appears in (6.28), where pqr are given by (6.29), (6.30) and (6.31) respectively. Then

$$\begin{aligned} \int _B \mathcal {L}'(0) \, dy&= \int _B 3(X_{\tau {\scriptscriptstyle {R}}} - \sigma _{\tau {\scriptscriptstyle {R}}}) \, dy \end{aligned}$$
(6.32)

and

$$\begin{aligned} \int _B \mathcal {L}''(0) \, dy&= \int _B 4(\sigma _{\tau {\scriptscriptstyle {R}}} - 4 X_{\tau {\scriptscriptstyle {R}}})(\sigma _{\tau {\scriptscriptstyle {R}}} - X_{\tau {\scriptscriptstyle {R}}}) + 3 \left( \frac{\sigma _{{\scriptscriptstyle {R}}}^2}{R^2}+\frac{\sigma _{\tau \tau }\sigma _{{\scriptscriptstyle {R}}}}{R}+\sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2+\sigma _{\tau {\scriptscriptstyle {R}}}^2\right) \nonumber \\&\quad -3\left( \frac{4X_{{\scriptscriptstyle {R}}}^2}{R^2}+\frac{X_{\tau \tau }X_{{\scriptscriptstyle {R}}}}{R}+4X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2+X_{\tau {\scriptscriptstyle {R}}}^2\right) \nonumber \\&\quad +\left( \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}-\frac{\sigma _{{\scriptscriptstyle {R}}}}{R}-\sigma _{\tau \tau }-4X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}+X_{\tau \tau }+\frac{4 X_{{\scriptscriptstyle {R}}}}{R}\right) ^2 \, dy. \end{aligned}$$
(6.33)

Proof

By Lemma 6.4, it is immediate that \(p(0)=0\), \(q(0)=0\) and \(r(0)=2\), and, by a direct calculation, that

$$\begin{aligned} \mathcal {L}'(0)&=\Psi '(2)r'(0) \\&= 3 (b_1(0)\cdot b_1'(0) - a_1(0)\cdot a_1'(0)) \\&= 3 (\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}(y) - \overline{u_2(y)}\cdot \Xi _s(u_2(y))) \\&= 3 (-\sigma _{\tau {\scriptscriptstyle {R}}}(y)+\xi _{\tau s}(u_2(y))) \\&= 3 (X_{\tau {\scriptscriptstyle {R}}}(y)-\sigma _{\tau {\scriptscriptstyle {R}}}(y)), \end{aligned}$$

the equivalence \(\xi _{\tau s}(u_2(y)))=X_{\tau {\scriptscriptstyle {R}}}(y)\) having been established during the proof of Lemma 6.6. This proves (6.32).

To demonstrate (6.33), first note that

$$\begin{aligned} \mathcal {L}''(0) = (p'(0)-2q'(0))^2 + \Psi ''(2)(r'(0))^2+\Psi '(2)r''(0). \end{aligned}$$
(6.34)

Now, by a direct calculation followed by an application of Lemma 6.4,

$$\begin{aligned} p'(0)&= b_1(0) \cdot b_2'(0) + b_1'(0) \cdot b_2(0) -2 (b_1(0)\cdot b_2(0))(b_1(0)\cdot b_1'(0)) \\&= \overline{y}\cdot \Sigma _{\tau }(y) + J\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}(y) \\&= \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} - \frac{\sigma _{{\scriptscriptstyle {R}}}}{R} - \sigma _{\tau \tau }, \end{aligned}$$

where (6.25) and (6.27) have been used to replace terms in \(\Sigma \) with terms in \(\sigma \).

Similarly,

$$\begin{aligned} q'(0)&= \xi _{ss}(u_2(y)) - \frac{\xi _{s}(u_2(y))}{s} - \xi _{\tau \tau }(u_2(y)) \\&= 2X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}} - \frac{X_{\tau \tau }}{2} - \frac{2X_{{\scriptscriptstyle {R}}}}{R}, \end{aligned}$$

where the conversion to terms in X is again made using facts already established in Lemma 6.6. Hence

$$\begin{aligned} \int _B (p'(0)-2q'(0))^2 \, dy = \int _B \left( \sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}-\frac{\sigma _{{\scriptscriptstyle {R}}}}{R}-\sigma _{\tau \tau }-4X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}+X_{\tau \tau }+\frac{4 X_{{\scriptscriptstyle {R}}}}{R}\right) ^2 \, dy. \end{aligned}$$
(6.35)

Next,

$$\begin{aligned} \Psi ''(2)(r'(0))^2+\Psi '(2)r''(0)&= \frac{1}{4}\left( 4 b_1(0)\cdot b_1'(0) - 4 a_1(0) \cdot a_1'(0)\right) ^2 \\&\quad + \frac{3}{4}\Big \{4(b_1 \cdot b_1')'(0)-4(a_1 \cdot a_1')'(0) \\&\quad + 16((a_1(0)\cdot a_1'(0))^2-(a_1(0)\cdot a_1'(0))(b_1(0) \cdot b_1'(0)))\Big \} \\&= 4(\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}}- \overline{u_2(y)}\cdot \Xi _s (u_2(y)))^2 + 12(\overline{u_2(y)}\cdot \Xi _s (u_2(y)))^2 \\&\quad + 3(b_1 \cdot b_1')'(0)-3(a_1 \cdot a_1')'(0) \\&\quad - 12 (\overline{u_2(y)}\cdot \Xi _s (u_2(y)))(\overline{y}\cdot \Sigma _{{\scriptscriptstyle {R}}})\\&= 4 \sigma _{\tau {\scriptscriptstyle {R}}}^2 + 16 \xi _{\tau s}(u_2(y))^2 - 20 \sigma _{\tau {\scriptscriptstyle {R}}}\xi _{\tau s}(u_2(y)) \\&\quad + 3(b_1 \cdot b_1')'(0)-3(a_1 \cdot a_1')'(0) \\&=4(\sigma _{\tau {\scriptscriptstyle {R}}} - 4 X_{\tau {\scriptscriptstyle {R}}})(\sigma _{\tau {\scriptscriptstyle {R}}} - X_{\tau {\scriptscriptstyle {R}}}) + 3(b_1 \cdot b_1')'(0)\\&\quad -3(a_1 \cdot a_1')'(0). \end{aligned}$$

By (6.21) and (6.22),

$$\begin{aligned} \int _B \Psi ''(2)(r'(0))^2+\Psi '(2)r''(0) \, dy&= \int _B 4(\sigma _{\tau {\scriptscriptstyle {R}}} - 4 X_{\tau {\scriptscriptstyle {R}}})(\sigma _{\tau {\scriptscriptstyle {R}}} - X_{\tau {\scriptscriptstyle {R}}}) \nonumber \\&\quad + 3 \left( \frac{\sigma _{{\scriptscriptstyle {R}}}^2}{R^2}+\frac{\sigma _{\tau \tau }\sigma _{{\scriptscriptstyle {R}}}}{R}+\sigma _{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2+\sigma _{\tau {\scriptscriptstyle {R}}}^2\right) \nonumber \\&\quad -3\left( \frac{4X_{{\scriptscriptstyle {R}}}^2}{R^2}+\frac{X_{\tau \tau }X_{{\scriptscriptstyle {R}}}}{R}+4X_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}^2+X_{\tau {\scriptscriptstyle {R}}}^2\right) \, dy, \end{aligned}$$
(6.36)

which, when combined with (6.34) and (6.35) yields (6.33). \(\square \)

We now replace X and \(\sigma \) in (6.33) with their Fourier series representations

$$\begin{aligned} X(y)&= \frac{A_0(R)}{2}+\sum _{j \ge 1} A_j(R) \cos (j\theta ) \end{aligned}$$
(6.37)
$$\begin{aligned} \sigma (y)&= \frac{a_0(R)}{2}+\sum _{j \ge 1} a_j(R) \cos (j\theta ) + b_j(R)\sin (j \theta ) \end{aligned}$$
(6.38)

where \(y=(R \cos \theta , R\sin \theta )\) and where \(a_0, A_0\) and, for \(j = 1,2,\ldots \), \(a_j, b_j\) and \(A_j\) are smooth functions of \(R \in (0,1)\). Note that, by (6.20), X is an even function of y, and so its Fourier series must be as stated. The following result records the effect of this substitution into (6.33), but in a new set of variables which arise naturally from \(a_j,b_j\) and \(A_j\).

Proposition 6.8

Let X and \(\sigma \) be represented by (6.37) and (6.38) respectively. Let \(R=e^t\) for \(t \in I:=(-\infty ,0]\) and define functions

$$\begin{aligned} w_j(t)&:=\frac{b_j(e^t)}{e^t} \quad j \in \mathbb {N}, \ t \in I \end{aligned}$$
(6.39)
$$\begin{aligned} z_j(t)&:=\frac{a_j(e^t)}{e^t} \quad j \in \mathbb {N}_0, \ t \in I \end{aligned}$$
(6.40)
$$\begin{aligned} Z_j(t)&:=\frac{A_j(e^t)}{e^t} \quad j \in \mathbb {N}_0, \ t \in I. \end{aligned}$$
(6.41)

Define the function \(\mathcal {O}(f;j)\) on \(C^{\infty }(I) \times \mathbb {N}_0\) by

$$\begin{aligned} \mathcal {O}(f;j):=(j^2-4)(j^2-1)f^2 + (5j^2+8){\dot{f}}^2+4 {\ddot{f}\,}^2 \end{aligned}$$
(6.42)

where \({\dot{f}}\) denotes differentiation with respect to the variable t. Then, in these terms, the expression (6.33) takes the form

$$\begin{aligned} \int _B \mathcal {L}''(0) \, dy = \frac{\pi }{2}\int _I \mathcal {O}(z_0-Z_0;0)\,dt + \pi \sum _{j \in \mathbb {N}} \int _I \mathcal {O}(z_j-Z_j;j) + \mathcal {O}(w_j;j) \, dt. \end{aligned}$$
(6.43)

Proof

Firstly, it is straightforward to check using elementary properties of Fourier series that when X and \(\sigma \) are expressed in terms of (6.37) and (6.38), (6.33) gives

$$\begin{aligned} \int _B \mathcal {L}''(0)\, dy&= 2 \pi \int _0^1 \frac{3}{4}\left( \frac{a_0'^2}{R^2}+a_0''^2\right) -3\left( \frac{A_0'^2}{R^2}+A_0''^2\right) \\&\quad + \frac{1}{4}\left( a_0''-\frac{a_0'}{R}+\frac{4A_0'}{R}-4A_0''\right) ^2 \, R dR \\&\quad + \pi \sum _{j =1}^{\infty } \int _0^1 4j^2 \left\{ \left( \left( \frac{b_j}{R}\right) '\right) ^2 + \left( \left( \frac{a_j}{R}\right) '-4\left( \frac{A_j}{R}\right) '\right) \left( \left( \frac{a_j}{R}\right) '-\left( \frac{A_j}{R}\right) '\right) \right\} \\&\quad + 3\left\{ \left( \frac{a_j'}{R}\right) ^2+\left( \frac{b_j'}{R}\right) ^2 +a_j''^2+b_j''^2+j^2\left( \left( \left( \frac{a_j}{R}\right) '\right) ^2-\frac{a_j a_j'}{R^3}\right) \right. \\&\quad \left. + j^2\left( \left( \left( \frac{b_j}{R}\right) '\right) ^2-\frac{b_j b_j'}{R^3}\right) \right\} \\&\quad - 3\left\{ 4\left( \frac{A_j'}{R}\right) ^2+4A_j''^2+j^2\left( \left( \left( \frac{A_j}{R}\right) '\right) ^2-\frac{A_j A_j'}{R^3}\right) \right\} \\&\quad + \left\{ b_j''-\frac{b_j'}{R}+j^2 \frac{b_j}{R^2}\right\} ^2 \\&\quad + \left\{ a_j''-\frac{a_j'}{R}+j^2 \frac{a_j}{R^2} -4A_j''+4\frac{A_j'}{R}-j^2\frac{A_j}{R^2} \right\} ^2 \, R \,dR. \end{aligned}$$

Now we use (6.39), (6.40) and (6.41) to write \(a_j'=z_j+\dot{z_j}\), \(a_j''=({\dot{z}}_j+\ddot{z}_j)/R\) and \((a_j/R)'={\dot{z}}_j/R\), with similar conversions for \(A_j\) and \(b_j\). Using also the facts that

$$\begin{aligned} \int _I u\dot{u}\, \, dt = \int _I \dot{u} \ddot{u} \, dt = 0\quad \text{ and }\quad \int _Iu\ddot{v}\, \, dt = \int _I \ddot{u} v\, \, dt = -\int _I \dot{u} \dot{v}\, \, dt, \end{aligned}$$

where u and v are any of \(w_j, z_j, Z_j\), we find, after simplifying and re-arranging terms, that

$$\begin{aligned} \begin{aligned}&\int _B{{\mathcal {L}}}''(0) dy = 2\pi \int _I(\ddot{z}_0 - {\ddot{Z}}_0)^2 + 2(\dot{z}_0 - \dot{Z}_0)^2 + (z_0 - Z_0)^2\,\, dt \\&\quad +\pi \sum _{j\in {\mathbb {N}}}\int _I \left\{ 4\ddot{z}_j^2 + (5j^2+8)\dot{z}_j^2 +(j^2-1)(j^2-4)z_j^2\right\} \\&\quad + \left\{ 4\ddot{w}_j^2 + (5j^2+8)\dot{w}_j^2 +(j^2-1)(j^2-4)w_j^2\right\} \\&\quad +\left\{ 4{\ddot{Z}}_j^2 + (5j^2+8)\dot{Z}_j^2 +(j^2-1)(j^2-4)Z_j^2\right\} \\&\quad -2\left\{ 4\ddot{z}_j{\ddot{Z}}_j + (5j^2 + 8)\dot{z}_j\dot{Z}_j + (j^2-1)(j^2-4)z_jZ_j\right\} \,dt. \end{aligned} \end{aligned}$$
(6.44)

It is now clear that the integrand of the first term in (6.44) is equal to \(\frac{1}{4}{\mathcal O}(z_0 - Z_0; 0)\), which, given the prefactor \(2\pi \) in (6.44), accounts for the factor \(\frac{\pi }{2}\) in (6.43), and that the integrand of the sum over j equals \({{\mathcal {O}}}(z_j - Z_j; j) + {{\mathcal {O}}}(w_j; j)\); this leads directly to Eq. (6.43) as stated. \(\square \)
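
For completeness, we note that the conversions used at the start of the proof follow from the chain rule: writing, as in (6.40), \(a_j(R)=R\,z_j(\ln R)\), we obtain

$$\begin{aligned} a_j'(R)=\frac{d}{dR}\big (R\,z_j(\ln R)\big )=z_j+{\dot{z}}_j, \qquad a_j''(R)=\frac{d}{dR}\big (z_j+{\dot{z}}_j\big )=\frac{{\dot{z}}_j+\ddot{z}_j}{R}, \qquad \Big (\frac{a_j}{R}\Big )'=\frac{{\dot{z}}_j}{R}, \end{aligned}$$

with the analogous formulae for \(A_j\) and \(b_j\) in terms of \(Z_j\) and \(w_j\), while the measure transforms according to \(R\,dR=e^{2t}\,dt\).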

Corollary 6.9

Let the functions \(\sigma \) and X belong to \(C_c^{\infty }(B{\setminus } 0)\) and be related to \(\int _B \mathcal {L}''(0)\, dy\) through (6.33). Then

$$\begin{aligned} \int _{B} \mathcal {L}''(0) \, dy \ge 4 \int _B \left\{ (X-\sigma )_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy. \end{aligned}$$
(6.45)

Proof

To show (6.45), begin by noting the lower bounds

$$\begin{aligned} \mathcal {O}(z_0-Z_0;0)&\ge 4(\ddot{z}_0-{\ddot{Z}}_0)^2 + 4({\dot{z}}_0-{\dot{Z}}_0)^2 \end{aligned}$$
(6.46)
$$\begin{aligned} \mathcal {O}(z_j-Z_j;j)&\ge 4\{ (j^2+1)({\dot{z}}_j-{\dot{Z}}_j)^2 + (\ddot{z}_j-{\ddot{Z}}_j)^2\} \end{aligned}$$
(6.47)
$$\begin{aligned} \mathcal {O}(w_j;j)&\ge 4\{ (j^2+1){\dot{w}}_j^2 + \ddot{w}_j^2\}, \end{aligned}$$
(6.48)

the last two of which hold for all \(j \in \mathbb {N}\). All three follow from (6.42), since \((j^2-4)(j^2-1) \ge 0\) and \(5j^2+8 \ge 4(j^2+1)\) for every \(j \in \mathbb {N}_0\). A short calculation using the Fourier decompositions (6.37) and (6.38), and the changes of variables (6.39), (6.40) and (6.41), shows that

$$\begin{aligned} \int _B \left\{ (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy&= \frac{\pi }{2} \int _I ({\dot{z}}_0-{\dot{Z}}_0)^2 + (\ddot{z}_0-{\ddot{Z}}_0)^2 \, dt\nonumber \\&\quad + \pi \sum _{j=1}^{\infty } \int _I ({\dot{z}}_j-{\dot{Z}}_j)^2+ (\ddot{z}_j-{\ddot{Z}}_j)^2 + {\dot{w}}_j^2 + \ddot{w}_j^2 \, dt \end{aligned}$$
(6.49)

and

$$\begin{aligned} \int _B \left\{ (X-\sigma )_{\tau {\scriptscriptstyle {R}}}\right\} ^2 \, dy&=\pi \sum _{j=1}^{\infty } \int _I j^2 ({\dot{z}}_j-{\dot{Z}}_j)^2+ j^2{\dot{w}}_j^2 \, dt. \end{aligned}$$
(6.50)

Inserting inequalities (6.46), (6.47) and (6.48) into (6.43), and using (6.49) and (6.50), yields (6.45). \(\square \)
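
To spell out the final step: inserting (6.46), (6.47) and (6.48) into (6.43) bounds \(\int _B \mathcal {L}''(0)\, dy\) from below by

$$\begin{aligned} 4\left[ \frac{\pi }{2}\int _I ({\dot{z}}_0-{\dot{Z}}_0)^2 + (\ddot{z}_0-{\ddot{Z}}_0)^2 \, dt + \pi \sum _{j=1}^{\infty } \int _I (j^2+1)\big \{({\dot{z}}_j-{\dot{Z}}_j)^2+{\dot{w}}_j^2\big \} + (\ddot{z}_j-{\ddot{Z}}_j)^2 + \ddot{w}_j^2 \, dt\right] , \end{aligned}$$

which, by (6.49) and (6.50), is exactly \(4 \int _B \left\{ (X-\sigma )_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy\), i.e. the right-hand side of (6.45).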

It may help at this point to take stock of the results obtained so far, which have established the lower bound (1.8), i.e.

$$\begin{aligned} \left. \partial _\delta ^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta )) \ge 4 \int _B \left\{ (\sigma - \xi \circ u_2)_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (\sigma - \xi \circ u_2)_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy, \end{aligned}$$

for variations \(v(\cdot ,\delta )\) given by (6.10). The functions \(\sigma \) and \(X=\xi \circ u_2\) are connected to the evolution of \(v(\cdot , \delta )\) for \(|\delta | < \delta _0\) through the systems (6.7) and (6.8). In proving Theorem 6.3 we are faced with two possibilities: either the right-hand side of (1.8) is strictly positive or it is not. The strictly positive case is easily dealt with, while the degenerate case in which

$$\begin{aligned} \int _B \left\{ (\sigma - \xi \circ u_2)_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (\sigma - \xi \circ u_2)_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy = 0 \end{aligned}$$

requires further analysis. This is the purpose of the next three results, after which we will finally turn to the proof of Theorem 6.3.

Lemma 6.10

Let \(\xi \in C_c^\infty (B{\setminus } 0,\mathbb {R})\) and suppose \(\sigma (y)=\xi (u_2(y))\) for all \(y \in B\). Then

$$\begin{aligned} \nabla _z \xi (u_2(y))=\sqrt{2} \sigma _{{\scriptscriptstyle {R}}}(y)\overline{u_2(y)} + \frac{1}{\sqrt{2}}\sigma _{\tau }(y)J\overline{u_2(y)} \quad y \in B. \end{aligned}$$
(6.51)

Proof

From \(\sigma (y)=\xi (u_2(y))\) we have \(\nabla \sigma (y) = \nabla _z \xi (u_2(y))\nabla u_2(y)\). Postmultiplying both sides of this expression by \((\nabla u_2(y))^{-1}\), which, since \(\det \nabla u_2=1\) a.e., coincides with

$$\begin{aligned} \mathrm{adj}\,\nabla u_2(y) = \frac{1}{\sqrt{2}} J\overline{y}\otimes J\overline{u_2(y)}+ \sqrt{2} \, \overline{y}\otimes \overline{u_2(y)}, \end{aligned}$$

yields (6.51). \(\square \)
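
In coordinates, and treating gradients as row vectors as in the chain rule \(\nabla \sigma (y) = \nabla _z \xi (u_2(y))\nabla u_2(y)\), the postmultiplication can be carried out explicitly. Using the polar decomposition \(\nabla \sigma (y)=\sigma _{{\scriptscriptstyle {R}}}(y)\,\overline{y}+\sigma _{\tau }(y)\,J\overline{y}\) (the same decomposition that appears in the proof of Lemma 6.11 below), the identity \(v\,(a\otimes b)=(v\cdot a)\,b\), and the relations \(\overline{y}\cdot J\overline{y}=0\) and \(|\overline{y}|=|J\overline{y}|=1\), we find

$$\begin{aligned} \nabla _z \xi (u_2(y))&=\big (\sigma _{{\scriptscriptstyle {R}}}(y)\,\overline{y} + \sigma _{\tau }(y)\,J\overline{y}\big )\left( \frac{1}{\sqrt{2}} J\overline{y}\otimes J\overline{u_2(y)}+ \sqrt{2} \, \overline{y}\otimes \overline{u_2(y)}\right) \\&= \sqrt{2}\,\sigma _{{\scriptscriptstyle {R}}}(y)\,\overline{u_2(y)} + \frac{1}{\sqrt{2}}\,\sigma _{\tau }(y)\,J\overline{u_2(y)}, \end{aligned}$$

which is (6.51).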

Lemma 6.11

Suppose that \(h(z,\delta )\) and \(G(y,\delta )\) evolve according to (6.7) and (6.8) respectively, and let \(\Xi =J\nabla _z \xi \) and \(\Sigma =J\nabla _y \sigma \) where \(\sigma (y) = \xi (u_2(y))\). Then

$$\begin{aligned} h(u_2(y),\delta ) = u_2(G(y,\delta )) \quad y \in B, \ |\delta | < \delta _0. \end{aligned}$$

Proof

Let \(H(y,\delta )=u_2(G(y,\delta ))\). Since \(G(y,0)=y\) for all y in B, it is immediate that

$$\begin{aligned} H(y,0) = u_2(G(y,0)) =u_2(y) \quad \text {if} \;\, y \in B \end{aligned}$$

and, since \(h(z,0)=z\) for all z in \(B'\),

$$\begin{aligned} h(u_2(y),0)=u_2(y) \quad \text {if} \;\, y \in B. \end{aligned}$$

Hence \(H(y,0)=h(u_2(y),0)\) if y belongs to B. We will now show that

$$\begin{aligned} H'(y,\delta ) = \Xi (H(y,\delta )) \quad y \in B, \ |\delta | < \delta _0, \end{aligned}$$

and, since the same equation is satisfied by \(h(u_2(y),\delta )\), we can conclude by the uniqueness of solutions to such ODEs that \(H(y,\delta )=h(u_2(y),\delta )\) for \(y \in B\) and \(|\delta | < \delta _0\).

In the following, we abbreviate \(H(y,\delta )\) to H, and, later, \(G(y,\delta )\) to G; we also make use of the notation \({\overline{z}} = z/|z|\) for \(z \ne 0\). By definition of \(H(y,\delta )\),

$$\begin{aligned} H'(y,\delta )&= \nabla u_2(G(y,\delta )) G'(y,\delta ) \\&= \left( \frac{\overline{u_2(G)}}{\sqrt{2}} \otimes {\overline{G}} + \sqrt{2}J\overline{u_2(G)} \otimes J {\overline{G}} \right) J\nabla \sigma (G) \\&= \left( \frac{{\overline{H}}}{\sqrt{2}} \otimes \overline{G} + \sqrt{2}J{\overline{H}} \otimes J {\overline{G}} \right) J\left( \sigma _{{\scriptscriptstyle {R}}}(G) {\overline{G}} + \sigma _\tau (G) J {\overline{G}}\right) \\&= \sqrt{2} \sigma _{{\scriptscriptstyle {R}}}(G) J{\overline{H}} - \frac{\sigma _\tau (G)}{\sqrt{2}} {\overline{H}} \\&= J \left( \sqrt{2} \sigma _{{\scriptscriptstyle {R}}}(G) {\overline{H}} + \frac{\sigma _\tau (G)}{\sqrt{2}} J \overline{H}\right) \\&= J \nabla \xi (u_2(G)) \\&= J \nabla \xi (H) \\&= \Xi (H), \end{aligned}$$

where, in order to pass from the fifth to the sixth line of the calculation, we have applied Lemma 6.10 with \(G(y,\delta )\) in place of y. The conclusion now follows. \(\square \)
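
In this calculation, the individual steps rest on the identity \((a\otimes b)v=(b\cdot v)a\) together with \({\overline{G}}\cdot J{\overline{G}}=0\), \(|{\overline{G}}|=|J{\overline{G}}|=1\) and \(J^2=-\mathbf{1}\). For instance, since \(J\big ( \sigma _{{\scriptscriptstyle {R}}}(G) {\overline{G}} + \sigma _\tau (G) J {\overline{G}}\big )=\sigma _{{\scriptscriptstyle {R}}}(G) J{\overline{G}} - \sigma _\tau (G) {\overline{G}}\), the passage from the third to the fourth line amounts to

$$\begin{aligned} \left( \frac{{\overline{H}}}{\sqrt{2}} \otimes \overline{G} + \sqrt{2}J{\overline{H}} \otimes J {\overline{G}} \right) \big (\sigma _{{\scriptscriptstyle {R}}}(G) J{\overline{G}} - \sigma _\tau (G) {\overline{G}}\big ) = \sqrt{2}\, \sigma _{{\scriptscriptstyle {R}}}(G)\, J{\overline{H}} - \frac{\sigma _\tau (G)}{\sqrt{2}}\, {\overline{H}}. \end{aligned}$$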

The consequences of Lemma 6.11 are quite strong, as we now show.

Proposition 6.12

Assume that the conditions of Lemma 6.11 are in force, and let \(v(y,\delta )\) be given by (6.10). Then

$$\begin{aligned} \mathbb {D}(v(\cdot ,\delta )) = \mathbb {D}(u_2) \quad |\delta | < \delta _0. \end{aligned}$$

Proof

By Lemma 6.11, \(h(u_2(y),\delta )=u_2(G(y,\delta ))\) for \(y \in B\) and \(|\delta | < \delta _0\). Differentiating with respect to y gives

$$\begin{aligned} \nabla h(u_2(y),\delta )\nabla u_2(y) = \nabla u_2(G(y,\delta )) \nabla G(y,\delta ), \end{aligned}$$

and hence, by (6.2),

$$\begin{aligned}\mathbb {D}(v(\cdot ,\delta ))&= \int _B |\nabla h(u_2(y),\delta ) \nabla u_2(y) \, \mathrm{adj}\,\nabla G(y,\delta )|^2 \, dy \\&= \int _B | \nabla u_2(G(y,\delta )) \nabla G(y,\delta ) \, \mathrm{adj}\,\nabla G(y,\delta )|^2 \, dy \\&= \int _B | \nabla u_2(G(y,\delta ))|^2 \, dy\\&= \int _B | \nabla u_2(x)|^2 \, dx \\&= \mathbb {D}(u_2). \end{aligned}$$

In the above, we have used the fact that \(\det \nabla G(y,\delta )=1\) together with the change of variables \(x=g(y,\delta )\), where g is given by (6.9). \(\square \)
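
For clarity, the passage from the second line to the third in the display above uses the two-dimensional identity

$$\begin{aligned} \nabla G(y,\delta )\, \mathrm{adj}\,\nabla G(y,\delta ) = \big (\det \nabla G(y,\delta )\big )\,\mathbf{1} = \mathbf{1}, \end{aligned}$$

while the passage from the third line to the fourth is the change of variables \(x=g(y,\delta )\) referred to above.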

We are now in a position to prove Theorem 6.3.

Proof

Part (a) of Theorem 6.3 follows from Eq. (6.32) in Lemma 6.7 and the fact that

$$\begin{aligned} \int _B X_{\tau {\scriptscriptstyle {R}}} - \sigma _{\tau {\scriptscriptstyle {R}}} \, dy =0 \end{aligned}$$

which holds because \(\int _B f_{\tau {\scriptscriptstyle {R}}} \, dy = 0\) for any smooth function f with compact support in the set \(B{\setminus } 0\).
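
To see this, write \(f_{\tau } = R^{-1}\partial _\theta f\) and \(f_{{\scriptscriptstyle {R}}}=\partial _{{\scriptscriptstyle {R}}} f\) in polar coordinates, a convention consistent with the Fourier identities (6.49) and (6.50). Then, for any such f,

$$\begin{aligned} \int _B f_{\tau {\scriptscriptstyle {R}}} \, dy = \int _0^1 \int _0^{2\pi } \partial _{{\scriptscriptstyle {R}}}\Big ( \frac{\partial _\theta f}{R}\Big ) \, d\theta \, R\, dR = \int _0^1 \partial _{{\scriptscriptstyle {R}}}\Big (\frac{1}{R}\int _0^{2\pi } \partial _\theta f \, d\theta \Big )\, R \, dR = 0, \end{aligned}$$

since \(\int _0^{2\pi } \partial _\theta f \, d\theta =0\) for any function which is \(2\pi \)-periodic in \(\theta \).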

To prove part (b), we apply a suitable dominated convergence argument twice to the expression (6.28) for \(\mathbb {D}(v(\cdot ,\delta ))\), differentiating under the integral sign, to deduce that

$$\begin{aligned} \left. \partial _\delta ^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta ))&=\int _B \mathcal {L}''(0) \, dy \nonumber \\&\ge 4 \int _B \left\{ (X-\sigma )_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy, \end{aligned}$$
(6.52)

after (6.45) of Corollary 6.9 has been applied. If the lower bound in (6.52) is positive then a standard argument implies that, for all sufficiently small \(\delta \ne 0\), the inequality \(\mathbb {D}(v(\cdot ,\delta )) > \mathbb {D}(v(\cdot ,0))=\mathbb {D}(u_2)\) must hold. Note that we have implicitly used part (a) of Theorem 6.3 here.
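
For the reader's convenience, here is a sketch of that standard argument, under the natural reading of part (a) as the vanishing of the first \(\delta \)-derivative of \(\mathbb {D}(v(\cdot ,\delta ))\) at \(\delta =0\): whenever \(\left. \partial _\delta ^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta ))>0\), which is certainly the case when the lower bound in (6.52) is positive, Taylor's theorem gives

$$\begin{aligned} \mathbb {D}(v(\cdot ,\delta )) = \mathbb {D}(u_2) + \frac{\delta ^2}{2}\left. \partial _\delta ^2\right| _{\delta =0}\mathbb {D}(v(\cdot ,\delta )) + o(\delta ^2) \quad \text {as } \delta \rightarrow 0, \end{aligned}$$

so that \(\mathbb {D}(v(\cdot ,\delta )) > \mathbb {D}(u_2)\) for all sufficiently small \(\delta \ne 0\).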

If, on the other hand, the second derivative \(\partial _{\delta }^2\arrowvert _{_{\delta =0}}\mathbb {D}(v(\cdot ,\delta ))\) vanishes then

$$\begin{aligned} \int _B \left\{ (X-\sigma )_{\tau {\scriptscriptstyle {R}}}\right\} ^2 + \left\{ (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\right\} ^2 \, dy = 0, \end{aligned}$$

and so \((X-\sigma )_{\tau {\scriptscriptstyle {R}}}=0\) and \( (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}=0\) both hold in B. Together, these imply that \(\nabla (X-\sigma )\) satisfies

$$\begin{aligned} \partial _{{\scriptscriptstyle {R}}}\nabla (X-\sigma ) = 0 \quad y \in B. \end{aligned}$$
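
Indeed, using the polar decomposition of the gradient that appears in the proofs of Lemmas 6.10 and 6.11, and noting that \(\overline{y}\) and \(J\overline{y}\) do not depend on R,

$$\begin{aligned} \partial _{{\scriptscriptstyle {R}}}\nabla (X-\sigma ) = \partial _{{\scriptscriptstyle {R}}}\big ( (X-\sigma )_{{\scriptscriptstyle {R}}}\,\overline{y} + (X-\sigma )_{\tau }\,J\overline{y}\big ) = (X-\sigma )_{{\scriptscriptstyle {R}}{\scriptscriptstyle {R}}}\,\overline{y} + (X-\sigma )_{\tau {\scriptscriptstyle {R}}}\,J\overline{y} = 0. \end{aligned}$$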

But then \(\nabla (X-\sigma )\) is a function of the polar angle alone, and since \(\nabla (X-\sigma )\) vanishes near \(\partial B\), we have \(\nabla (X-\sigma )(y)=0\) for all y in B. Thus \(X-\sigma \) is a constant, which, since X and \(\sigma \) have compact support in B, must be zero. Hence \(X=\sigma \) in B. Now the conditions of Lemma 6.11 are satisfied, so, by Proposition 6.12, we must have \(\mathbb {D}(v(\cdot ,\delta ))=\mathbb {D}(u_2)\) for \( |\delta | < \delta _0\). \(\square \)