1 Introduction

Since the seminal work of Ball [3], the phenomenon of cavitation in nonlinear elasticity has been studied by many authors, with significant advances [9, 10, 15] having been made in the case that an appropriately defined surface energy be part of the cost of deforming a material. In this note we consider the original case of a purely bulk energy

$$\begin{aligned} I(u)=\int _{\Omega } W\left( \nabla u(x)\right) \,dx, \end{aligned}$$
(1.1)

where as usual \(u: \Omega \subset \mathbb {R}^{n} \rightarrow \mathbb {R}^{n}\) represents a deformation of an elastic material occupying the domain \(\Omega \) in a reference configuration, and where \(n=2\) or \(n=3\). Our goal is to give a straightforward, explicit characterization of those affine boundary conditions of the form

$$\begin{aligned} u_{\lambda }(x):=\lambda x, \end{aligned}$$

where \(\lambda \) is a positive parameter, which obey the quasiconvexity inequality

$$\begin{aligned} I(u) \ge I(u_{\lambda }). \end{aligned}$$
(1.2)

In the case of radial mappings [3] it is this inequality which must be violated in order that a global minimizer of I might cavitate (i.e., where a hole is created in the deformed material), a crucial ingredient of which is the application of a large enough stretch on \(\partial \Omega \) (i.e., taking \(\lambda \) sufficiently large). When deformations are not restricted to any particular type we are still interested in whether the quasiconvexity inequality holds for a given \(\lambda \) since it rules out the possibility that a global energy minimizer cavitates. Thus the largest \(\lambda \) for which (1.2) holds is sometimes referred to as a critical load. Our chief inspiration for this work is [14], where bounds for the critical load are given in terms of constants appearing in certain isoperimetric inequalities. We use a different technique to find an explicit lower bound on the critical load in the two and three dimensional settings. The main results in this direction are summarised in Theorems 2.10 and 3.5.

Our method also yields conditions on \(\nabla u\) for the inequality (1.2) to be close to an equality in the sense that if \(\delta (u):=I(u)-I(u_{\lambda })\) is small and positive then, in the two dimensional case

$$\begin{aligned} \int _{\Omega }\min \left\{ |\nabla u- \lambda \mathbf{1}|^2,|\nabla u - \lambda \mathbf{1}|^q\right\} \,dx \le c\, \delta (u), \end{aligned}$$
(1.3)

where \(1<q<2\) is an exponent governing the growth of the stored-energy function W appearing in (1.1). See Theorem 2.11 for the latter. The corresponding condition in three dimensions is

$$\begin{aligned} \int _{\Omega } |\nabla u - \lambda \mathbf{1}|^{q} \,dx \le c \delta (u), \end{aligned}$$

where \(2<q <3\): see Theorem 3.6 for details. In both cases the Friesecke, James and Müller rigidity estimate [8, Theorem 3.1] (see also [5, Theorem 1.1]) is used in conjunction with the boundary condition to recover information apparently lost in deriving sufficient conditions for (1.2). We also note that these conditions are invariant under the elasticity scaling in which a function v(x), say, is replaced by \(v^{\epsilon }(x)=\frac{1}{\epsilon }v(\epsilon x)\), where \(\epsilon > 0\). This is important in view of the example in [17, Section 1]. The latter says, among other things, that, in the absence of surface energy, a deformation which cavitates at just one point in the material can have the same energy as another deformation with infinitely many cavities.

The setting we work in is motivated by [15] in the sense that we impose condition (INV), a topological condition which is explained later. Cavitation problems must be posed in function spaces containing discontinuous functions. In particular, Sobolev spaces of the form \(W^{1,q}(\Omega ,\mathbb {R}^{n})\) with \(q \ge n\) are not appropriate, since their members are necessarily continuous. In the case \(q>n\) this follows from the Sobolev embedding theorem, while if \(q=n\) then well-known results [18, 19], applying to maps u with \(\det \nabla u> 0\) a.e., imply that u has a continuous representative. Thus we work in \(W^{1,q}(\Omega ,\mathbb {R}^{n})\), where \(n-1 < q < n\), and in so doing we are able to take advantage of existing results, including, but not limited to, those of [15].

The stored-energy functions we consider in the two dimensional case have the form

$$\begin{aligned} W(A) := |A|^{q} + h(\det A) \end{aligned}$$

where \(1 < q < 2\) and where \(h :\mathbb {R}\rightarrow [0,+\infty ]\) satisfies

  • (H1) h is convex and \(C^{1}\) on \((0,+\infty )\);

  • (H2) \(\lim _{t \rightarrow 0+} h(t) = +\infty \) and \(\liminf _{t \rightarrow \infty } \frac{h(t)}{t} > 0\);

  • (H3) \(h(t) = + \infty \) if \(t \le 0\).

In three dimensions the appropriate class of W is detailed in Sect. 3. In both cases we define a set of admissible deformations

$$\begin{aligned} \mathcal {A}_{\lambda }:=\left\{ u \in W^{1,q}(\Omega ,\mathbb {R}^{n}): \ u = u_{\lambda }\ \text {on} \ \partial \Omega , \ \det \nabla u > 0 \ \mathrm {a.e.\, in}\; \Omega \right\} . \end{aligned}$$
(1.4)

It is made clear in [3] and [16] that when \(\lambda \) is sufficiently large there are maps \(u_{0}\) belonging to \(\mathcal {A}_{\lambda }\) of the form

$$\begin{aligned} u_{0}(x) = r(|x|)\frac{x}{|x|}, \end{aligned}$$

with \(r(0) > 0\), such that

$$\begin{aligned} I(u_{0}) < I(u_{\lambda }). \end{aligned}$$
(1.5)

The growth of h(t) for large values of t is pivotal in ensuring that such an inequality can hold. Thus the integrand W is not (\(W^{1,q}\)-)quasiconvex at \(\lambda \mathbf{1}\). The loss of quasiconvexity is typically associated with so-called cavitating maps like \(u_{0}\), whose distributional Jacobian \(\mathrm {Det}\,\nabla u_{0}\) is proportional to a Dirac mass, a remark first made by Ball in [3].

For later use, we recall that the distributional Jacobian of a mapping in \(W^{1,p}(\Omega ,\mathbb {R}^n)\), with \(p > n^2/(n+1)\), is defined by

$$\begin{aligned} (\mathrm {Det}\,\nabla u)(\varphi ) = -\frac{1}{n} \int _{\Omega } \nabla \varphi \cdot (\mathrm{adj}\,\nabla u)u \,dx, \end{aligned}$$

where \(\varphi \) belongs to \(C_{0}^{\infty }(\Omega )\). When u is \(C^{2}\) the distributional Jacobian coincides with the Jacobian \(\det \nabla u\). The same is true if, more generally, \(u \in W^{1,p}(\Omega )\) with \(p \ge n^{2}/(n+1)\) and \(\mathrm {Det}\,\nabla u\) is a function (see [11]).

The paper is arranged as follows: after a short explanation of notation, we consider the two and three dimensional cases separately in Sects. 2 and 3 respectively. Subsection 2.1 contains the bulk of the estimates needed for (1.3); the relevant estimates in the three dimensional case draw on these results and are presented succinctly in Sect. 3. Along the way, we give a slight improvement of [20, Lemma 2.15], and, as a byproduct of our work in three dimensions we are led to a conjecture concerning the quasiconvexity of a certain function which, to the best of our knowledge, has not yet been considered in the literature.

1.1 Notation

We denote the \(n \times n\) real matrices by \(\mathbb {R}^{n \times n}\) and the identity matrix by \(\mathbf{1}\). Throughout, \(\Omega \subset \mathbb {R}^n\) is a fixed, bounded domain with Lipschitz boundary, \(B(a,R)\) represents the open ball in \(\mathbb {R}^{n}\) centred at a with radius \(R>0\) and \(S(a,R):=\partial B(a,R)\). Other standard notation includes \(\mathcal {L}^n\) for the Lebesgue measure in \(\mathbb {R}^n\).

The inner product of two matrices \(A,B \in \mathbb {R}^{n \times n}\) is \(A \cdot B := \mathrm{tr}\,(A^{T}B)\). This obviously holds for vectors too. Accordingly, we make no distinction between the norm of a matrix and that of a vector: both are defined by \(|\nu |:=(\nu \cdot \nu )^{\frac{1}{2}}\). For any \(n \times n\) matrix we write \(\mathrm{adj}\,A:= (\mathrm{cof}\,A)^{T}\), while \(\mathrm{tr}\,A\) and \(\det A\) denote, as usual, the trace and determinant of A, respectively. Other notation will be introduced when it is needed.

2 The two dimensional case

The relevance of the distributional Jacobian to the loss of quasiconvexity can be seen using the following argument, the first part of which is due originally to Ball [2]. Firstly, the convexity of \(A \mapsto |A|^{q}\) and of h implies that

$$\begin{aligned} W(\nabla u) \ge W(\lambda \mathbf{1}) + q |\lambda \mathbf{1}|^{q-2}\lambda \mathbf{1} \cdot (\nabla u - \lambda \mathbf{1})+h'(\lambda ^{2})(\det \nabla u- \lambda ^{2}), \end{aligned}$$

which, when \(u \in \mathcal {A}_{\lambda }\), can be integrated over \(\Omega \); the result is

$$\begin{aligned} I(u) \ge I(u_\lambda ) + h'(\lambda ^{2}) \int _{\Omega } (\det \nabla u - \det \nabla u_{\lambda })\, dx. \end{aligned}$$
(2.1)

Clearly, if the integral with prefactor \(h'(\lambda ^{2})\) vanishes, that is if

$$\begin{aligned} \int _{\Omega } (\det \nabla u - \det \nabla u_{\lambda }) \, dx =0, \end{aligned}$$
(2.2)

then \(I(u) \ge I(u_{\lambda })\) follows. This can be ensured, for example, by imposing further conditions on u guaranteeing that

$$\begin{aligned} \int _{\Omega } f\left( u(x)\right) \, \det \nabla u(x)\,dx = \int _{\mathbb {R}^2}f(y) \, \text {deg}(\bar{u},\partial \Omega ,y)\,dy \end{aligned}$$
(2.3)

for any bounded continuous function f, where \(\bar{u}\) represents the trace of u, here assumed to possess a continuous representative in order that the degree is well-defined. The idea behind this originates in Šverák’s work [18], and was later refined by Müller et al. [12]. As Šverák remarks in [18], (2.3) clearly excludes cavitation by choosing f with support in the created cavity. We note that (2.3) is a key ingredient in Šverák’s proof of the existence of a representative for u that is continuous outside a set of Hausdorff dimension \(n-p\), where \(p>n-1\) is the Sobolev exponent appearing in the class \(\mathcal {A}_{p,q}^+\) he works in: see [18] for further details of that rich theory. It turns out that the discrepancy between \(\int _{\Omega } \det \nabla u \,dx\) and \(\int _{\Omega } \det \nabla u_{\lambda }\,dx\) can be measured using \(\mathrm {Det}\,\nabla u\) and interpreted in terms of cavitation provided some additional conditions are imposed on u. To explain this we follow the approach in [14] and appeal to a result in [15] that is couched in terms of Müller and Spector’s condition (INV). We now recall the definition of condition (INV), which is stated in terms of a general dimension n and domain \(\Omega \).

Definition 2.1

[15, Definition 3.2] The map \(u: \Omega \rightarrow \mathbb {R}^n\) satisfies condition (INV) provided that for every \(a \in \Omega \) there exists an \(\mathcal {L}^1\)-null set \(N_{a}\) such that, for all \(R \in \left( 0,\text {dist}\,(a,\partial \Omega )\right) \setminus N_{a}\), \(u|_{S(a,R)}\) is continuous,

  (i) \(u(x) \in \mathrm {im}\,_{T}\left( u,B(a,R)\right) \cup u\left( S(a,R)\right) \) for \(\mathcal {L}^{n}\)-a.e. \(x \in \overline{B(a,R)}\), and

  (ii) \(u(x) \in \mathbb {R}^n {\setminus } \mathrm {im}\,_{T}\left( u,B(a,R)\right) \) for \(\mathcal {L}^{n}\)-a.e. \(x \in \Omega {\setminus } B(a,R)\).

The topological image of \(B(a,R)\) under the mapping u, \(\mathrm {im}\,_{T}\left( u,B(a,R)\right) \), is defined below.

Lemma 2.2

[15, Lemma 8.1] Let \(u \in W^{1,q}(\Omega ;\mathbb {R}^n)\) with \(q > n-1\). Suppose that \(\det \nabla u > 0\) a.e. in \(\Omega \) and that \(u^*\), the precise representative of u, satisfies condition (INV). Then \(\mathrm {Det}\,\nabla u \ge 0\) and hence \(\mathrm {Det}\,\nabla u\) is a Radon measure. Furthermore,

$$\begin{aligned} \mathrm {Det}\,\nabla u = \det \nabla u \,\mathcal {L}^{n} + m \end{aligned}$$
(2.4)

where m is singular with respect to Lebesgue measure and, for \(\mathcal {L}^{1}\)-a.e. \(R \in (0,\text {dist}\,(a,\partial \Omega ))\),

$$\begin{aligned} (\mathrm {Det}\,\nabla u )\left( B(a,R)\right) = \mathcal {L}^{n}\left( \mathrm {im}\,_{T}\left( u,B(a,R)\right) \right) .\end{aligned}$$
(2.5)

Remark 2.3

Under the assumption that the perimeter of \(\mathrm {im}\,_{T}(u,\Omega )\) is finite it can be shown that the singular part of \(\mathrm {Det}\,\nabla u\) is a sum of Dirac masses. Thus the left-hand side of (2.6) below is \(-1 \times \) (volume of cavities created by the deformation u). See [15, Theorem 8.4] for more details.

Remark 2.4

Since m is singular with respect to Lebesgue measure, and in view of \(\mathrm {Det}\,\nabla u \ge 0\), it is clear that \(m \ge 0\).

Reverting to the two dimensional case \(\Omega \subset \mathbb {R}^2\), the assumption that \(u \in W^{1,q}(\Omega )\) for \(q>1\) implies (by Sobolev embedding) that \(u\arrowvert _{S(a,R)}\) is continuous for \(\mathcal {L}^{1}\)-a.e. \(R \in \left( 0,\text {dist}\,(a,\partial \Omega )\right) \). Hence, for such R, the topological image

$$\begin{aligned} \mathrm {im}\,_{T}\left( u,B(a,R)\right) = \left\{ y \in \mathbb {R}^2 {\setminus } u\left( S(a,R)\right) : \ \deg \left( u,S(a,R),y\right) \ne 0\right\} \end{aligned}$$

is well-defined. Following [14], we extend u by setting it equal to \(u_{\lambda }\) on \(B(0,M){\setminus } \bar{\Omega }\), where M is chosen so that \(\bar{\Omega } \subset B(0,M)\), and we assume that the extension satisfies condition (INV) on B(0, M). It is then straightforward to check, using the definition of the distributional Jacobian, its representation through [15, Lemma 8.1] and (2.5), that

$$\begin{aligned} -m(\bar{\Omega }) = \int _{\Omega } (\det \nabla u - \det \nabla u_{\lambda }) \, dx \end{aligned}$$
(2.6)
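As a rough numerical illustration of (2.6), one can take \(\Omega \) to be the unit disc in \(\mathbb {R}^2\) and consider a radial map \(u_{0}(x)=r(|x|)\frac{x}{|x|}\) with the hypothetical profile \(r(R)=\sqrt{c^2+(\lambda ^2-c^2)R^2}\), where \(0<c<\lambda \). This profile is an assumption made purely for illustration (it satisfies \(r(1)=\lambda \) and \(r(0)=c\)) and is not taken from [3] or [16]. For a planar radial map \(\det \nabla u_{0}=r(R)r'(R)/R\), so the sketch below integrates the Jacobian in polar coordinates and compares the deficit with the cavity area \(\pi c^2\).

```python
import math

def radial_profile(R, lam, c):
    """Hypothetical cavitating profile with r(0) = c and r(1) = lam."""
    return math.sqrt(c**2 + (lam**2 - c**2) * R**2)

def radial_profile_derivative(R, lam, c):
    """Derivative r'(R) of the profile above."""
    return (lam**2 - c**2) * R / radial_profile(R, lam, c)

def jacobian_integral(lam, c, steps=10_000):
    """Midpoint rule for the integral of det(grad u0) over the unit disc.

    In polar coordinates det(grad u0) = r(R) r'(R) / R, so the area
    element 2*pi*R dR cancels the factor 1/R.
    """
    dR = 1.0 / steps
    total = 0.0
    for k in range(steps):
        R = (k + 0.5) * dR
        total += radial_profile(R, lam, c) * radial_profile_derivative(R, lam, c) * dR
    return 2.0 * math.pi * total

def jacobian_deficit(lam, c):
    """Integral of det(grad u_lambda) minus integral of det(grad u0)."""
    return math.pi * lam**2 - jacobian_integral(lam, c)
```

For this particular profile the deficit should agree with \(\pi c^2\), the area of the disc of radius c opened at the origin, which is what (2.6) predicts for \(m(\bar{\Omega })\).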

Finally, by applying (2.6) to inequality (2.1), we obtain

$$\begin{aligned} I(u) \ge I( u_\lambda ) - h'(\lambda ^2)\, m(\bar{\Omega }). \end{aligned}$$
(2.7)

It is clear that when \(h'(\lambda ^2) \le 0\) or \(m(\bar{\Omega })=0\) we have \(I(u) \ge I(u_{\lambda })\). Summarising the above, we have the following:

Proposition 2.5

Suppose that \(W(A)=|A|^{q} + h(\det A)\), where h satisfies (H1)–(H3), and where \(q > 1\). Let B(0, M) contain \(\bar{\Omega }\) and denote by \(u^{\text {e}}\) the extension of u to \(B(0,M) {\setminus } \Omega \) defined by

$$\begin{aligned} u^{\text {e}}(x) := \left\{ \begin{array}{l l} u(x) &{}\quad \text {if} \ x \in \Omega , \\ u_{\lambda }(x) &{}\quad \text {if} \ x \in B(0,M) {{\setminus }} \Omega . \end{array}\right. \end{aligned}$$

Assume that \(u^{\text {e}}\) satisfies the hypotheses of [15, Lemma 8.1] in the case that \(n=2\). Then if \(\int _{\Omega } \det \nabla u \,dx= \int _{\Omega }\det \nabla u_{\lambda }\,dx\) or if \(h'(\lambda ^{2}) \le 0\), the inequality \(I(u) \ge I(u_{\lambda })\) holds.

The rest of this section handles the case \(h' (\lambda ^2) > 0\) and \(m(\bar{\Omega }) > 0\), where m is given by (2.6), which is the situation not covered by Proposition 2.5. The following is a slightly improved version of a lemma by Zhang which, although stated here for general n, will only be needed in the case \(n=2\).

Lemma 2.6

(Adaptation of [20, Lemma 2.15]) For \(1 < q < 2\), \(M > 0\), and \(A, B \in \mathbb {R}^{n \times n}\) with \(0 < |A| \le M\),

$$\begin{aligned} |A+B|^q - |A|^q - q |A|^{q-2} A \cdot B \ge \left\{ \begin{array}{l l} C_{1}(M,q) |B|^2 &{}\quad \text {if} \ |B| \le M, \\ C_{2}(q) |B|^q &{}\quad \text {if} \ |B| \ge M, \end{array}\right. \end{aligned}$$

where the constants \(C_{1}(M,q)\) and \(C_{2}(q)\) are given by

$$\begin{aligned} C_{1}(M,q)&= \frac{1}{2 (2M)^{2-q}}, \end{aligned}$$
(2.8)
$$\begin{aligned} C_{2}(q)&= \frac{1}{2(2^{2-q})}. \end{aligned}$$
(2.9)

Proof

The only part which requires proof is the constant \(C_{2}(q)\) since it is larger than the original version \(\tilde{C}_{2}(q):=\frac{1}{2(3^{2-q})}\) given in [20, Lemma 2.15]. The constant \(\tilde{C}_{2}(q)\) appears in [20, Eq. (2.23)] as a prefactor in the estimate

$$\begin{aligned} \int _{0}^{1} \frac{(1-s)|B|^2}{|A+sB|^{2-q}}\,ds \ge \tilde{C}_{2}(q)|B|^q \end{aligned}$$

under the assumption that \(|B| \ge M\). Now, in terms of \(\tau : = |B|/M\),

$$\begin{aligned} \frac{(1-s)|B|^2}{|A+sB|^{2-q}}&\ge \frac{(1-s)|B|^2}{(M+s|B|)^{2-q}} \\&= \frac{(1-s) M^q \tau ^2}{(1+s\tau )^{2-q}} \\&\ge \frac{(1-s) \tau ^{2-q} |B|^q}{(1+\tau )^{2-q}}. \end{aligned}$$

Since \(\tau \ge 1\), the quantity \(\frac{\tau ^{2-q}}{(1+\tau )^{2-q}}\) is bounded below by \(1/2^{2-q}\). Upon integration, the lower bound

$$\begin{aligned} \int _{0}^{1} \frac{(1-s)|B|^2}{|A+sB|^{2-q}}\,ds \ge \frac{|B|^q}{2(2^{2-q})} \end{aligned}$$

follows. \(\square \)
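The scalar part of this argument can be checked numerically. The following is a minimal sketch, not part of the original proof: it integrates the second line of the chain above, \((1-s)M^q\tau ^2/(1+s\tau )^{2-q}\), by a midpoint rule and compares the result with \(C_{2}(q)|B|^q\), where \(|B|=\tau M\). The function names and the choice of quadrature are assumptions made for illustration.

```python
import math

def c2(q):
    """Constant C_2(q) = 1 / (2 * 2^(2-q)) from (2.9)."""
    return 1.0 / (2.0 * 2.0 ** (2.0 - q))

def scalar_integral(tau, q, M=1.0, steps=20_000):
    """Midpoint rule for int_0^1 (1-s) M^q tau^2 / (1 + s*tau)^(2-q) ds."""
    ds = 1.0 / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        total += (1.0 - s) * M**q * tau**2 / (1.0 + s * tau) ** (2.0 - q) * ds
    return total

def lower_bound(tau, q, M=1.0):
    """Right-hand side C_2(q) |B|^q with |B| = tau * M."""
    return c2(q) * (tau * M) ** q
```

For \(\tau \ge 1\) and \(1<q<2\) one expects `scalar_integral(tau, q)` to be at least `lower_bound(tau, q)`, in line with the final estimate of the proof.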

Let \(u \in \mathcal {A}_{\lambda }\). Applying Lemma 2.6 to \(A:=\lambda \mathbf{1}\) and \(B:=\nabla u-\lambda \mathbf{1}\), we find that with \(M:=|A|=\sqrt{2}\lambda \),

$$\begin{aligned} |\nabla u|^{q} \ge |\lambda \mathbf{1}|^q + q |\lambda \mathbf{1}|^{q-2}(\nabla u - \lambda \mathbf{1}) \cdot \lambda \mathbf{1} +F_{M}(\nabla u - \lambda \mathbf{1}) \end{aligned}$$
(2.10)

where the function \(F_{M}: \mathbb {R}^{2 \times 2} \rightarrow \mathbb {R}\) is defined by

$$\begin{aligned} F_{M}(B) := \left\{ \begin{array}{l l} C_{1}(M,q) |B|^2 &{}\quad \text {if}\; |B| \le M, \\ C_{2}(q) |B|^q &{}\quad \text {if}\; |B| \ge M. \end{array}\right. \end{aligned}$$

Now

$$\begin{aligned} \left| \nabla u - \lambda \mathbf{1}\right| \ge \text {dist}\,\left( \nabla u, \lambda SO(2)\right) \end{aligned}$$

and since, by polar factorization,

$$\begin{aligned} \text {dist}\,\left( \nabla u, \lambda SO(2)\right) =\left| \sqrt{\nabla u^T \nabla u}-\lambda \mathbf{1}\right| =\left| \left( \lambda _{1}(\nabla u), \lambda _{2}(\nabla u)\right) - (\lambda ,\lambda )\right| , \end{aligned}$$

where \(0<\lambda _{1}(\nabla u) \le \lambda _{2}(\nabla u)\) are the singular values of \(\nabla u\), we have

$$\begin{aligned} \left| \nabla u - \lambda \mathbf{1}\right| \ge \left| \Lambda -\Lambda _{0}\right| . \end{aligned}$$
(2.11)

Here, \(\Lambda := (\lambda _{1},\lambda _{2})\), where we leave out the dependence on \(\nabla u\) for clarity, and \(\Lambda _{0}:=(\lambda ,\lambda )\). Next, define \(f_{M}: \mathbb {R}^+ \rightarrow \mathbb {R}^+\) by

$$\begin{aligned} f_{M}(t) := \min \left\{ C_{1}(M,q)t^2, C_{2}(q)t^q\right\} , \end{aligned}$$
(2.12)

where \(C_{1}(M,q)\) and \(C_{2}(q)\) are as in (2.8) and (2.9), respectively.

Remark 2.7

We note that \(f_{M}\) is continuous on \(\mathbb {R}^+\) and \(C_{1}(M,q)t^2=C_{2}(q)t^q\) if and only if \(t=M\). Thus the growth of \(f_{M}\) switches from quadratic on [0, M] to q-growth on \([M,+\infty )\). We remark that the continuity is a consequence of the improved (i.e. increased) value for \(C_{2}(q)\) provided in Lemma 2.6. More importantly, a larger value for \(C_{2}(q)\) makes our estimate of the critical load more accurate: see (2.32), for example.
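The crossover described in Remark 2.7 is easy to confirm with a few lines of code. The sketch below simply transcribes (2.8), (2.9) and (2.12); the function names are illustrative choices and not notation from the paper.

```python
import math

def c1(M, q):
    """C_1(M, q) = 1 / (2 (2M)^(2-q)), as in (2.8)."""
    return 1.0 / (2.0 * (2.0 * M) ** (2.0 - q))

def c2(q):
    """C_2(q) = 1 / (2 * 2^(2-q)), as in (2.9)."""
    return 1.0 / (2.0 * 2.0 ** (2.0 - q))

def f_M(t, M, q):
    """f_M(t) = min{C_1 t^2, C_2 t^q}, as in (2.12), for t >= 0."""
    return min(c1(M, q) * t**2, c2(q) * t**q)
```

With \(M=\sqrt{2}\lambda \), evaluating both branches at \(t=M\) should give the same value, while the quadratic branch is the smaller one for \(t<M\) and the q-growth branch is the smaller one for \(t>M\).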

Then, by combining (2.11) and (2.12) with the definition of \(F_{M}\), we obtain

$$\begin{aligned} F_{M}(\nabla u -\lambda \mathbf{1}) \ge f_{M}(|\Lambda -\Lambda _{0}|). \end{aligned}$$
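Since \(f_{M}\) is nondecreasing, the inequality just displayed rests on (2.11). A small numerical sanity check of (2.11) for \(2 \times 2\) matrices with positive determinant might look as follows; the closed-form expression for the singular values and the random sampling are conveniences assumed for this sketch rather than anything used in the paper.

```python
import math
import random

def singular_values(F):
    """Singular values (ascending) of a 2x2 matrix F = ((a, b), (c, d))."""
    (a, b), (c, d) = F
    norm_sq = a * a + b * b + c * c + d * d
    det = a * d - b * c
    total = math.sqrt(norm_sq + 2.0 * abs(det))            # lambda_1 + lambda_2
    gap = math.sqrt(max(norm_sq - 2.0 * abs(det), 0.0))    # lambda_2 - lambda_1
    return (total - gap) / 2.0, (total + gap) / 2.0

def gradient_gap(F, lam):
    """Frobenius norm |F - lam * 1|."""
    (a, b), (c, d) = F
    return math.sqrt((a - lam) ** 2 + b * b + c * c + (d - lam) ** 2)

def singular_gap(F, lam):
    """Euclidean norm |Lambda - Lambda_0| with Lambda_0 = (lam, lam)."""
    s1, s2 = singular_values(F)
    return math.hypot(s1 - lam, s2 - lam)

def random_matrix_with_positive_det(rng, scale=3.0):
    """Rejection-sample a 2x2 matrix with det > 0 (hypothetical helper)."""
    while True:
        F = ((rng.uniform(-scale, scale), rng.uniform(-scale, scale)),
             (rng.uniform(-scale, scale), rng.uniform(-scale, scale)))
        if F[0][0] * F[1][1] - F[0][1] * F[1][0] > 0.0:
            return F
```

For a rotation scaled by \(\lambda \) the quantity `singular_gap` vanishes while `gradient_gap` does not, which shows that (2.11) can be strict.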

Therefore, by (2.10),

$$\begin{aligned} |\nabla u|^{q} \ge |\lambda \mathbf{1}|^q + q |\lambda \mathbf{1}|^{q-2}(\nabla u - \lambda \mathbf{1}) \cdot \lambda \mathbf{1} +f_{\sqrt{2} \lambda }(|\Lambda -\Lambda _{0}|). \end{aligned}$$

Integrating this, applying the definition of the stored-energy function W, using

$$\begin{aligned} \int _{\Omega } (\nabla u -\lambda \mathbf{1}) \, dx = 0, \end{aligned}$$

and recalling that \(\det \nabla u=\lambda _1\lambda _2\), gives

$$\begin{aligned} I(u) \ge \int _{\Omega } \big (|\lambda \mathbf{1}|^{q} + f_{\sqrt{2} \lambda }\left( |\Lambda -\Lambda _{0}|\right) +h(\lambda _{1}\lambda _{2})\big )\,dx. \end{aligned}$$
(2.13)

Then in view of the convexity of h we get

$$\begin{aligned} I(u)-I(u_\lambda )\ge & {} \int _{\Omega } f_{\sqrt{2}\lambda }(|\Lambda -\Lambda _{0}|)\,dx+ \int _{\Omega } \big (h(\lambda _{1}\lambda _{2})-h(\lambda ^2)\big )\,dx\\\ge & {} \int _{\Omega } f_{\sqrt{2}\lambda }(|\Lambda -\Lambda _{0}|)\,dx+h'(\lambda ^2) \int _{\Omega } \big (\lambda _{1}\lambda _{2}-\lambda ^2\big )\,dx. \end{aligned}$$

As has already been observed, we need only consider \(h'(\lambda ^2)>0\), since Proposition 2.5 covers the case \(h'(\lambda ^2)\le 0\).

Note that

$$\begin{aligned} f_{\sqrt{2}\lambda }(|\Lambda -\Lambda _{0}|) + h'(\lambda ^2)(\lambda _{1}\lambda _{2}-\lambda ^2) = \mathcal {G}_{1}^\lambda (\Lambda ) + \mathcal {G}_{2}^\lambda (\Lambda ), \end{aligned}$$

where

$$\begin{aligned} \mathcal {G}_{1}^\lambda (\Lambda ):=f_{\sqrt{2}\lambda }(|\Lambda -\Lambda _{0}|) + h'(\lambda ^2)(\lambda _{1}-\lambda )(\lambda _{2}-\lambda ) \end{aligned}$$
(2.14)

and

$$\begin{aligned} \mathcal {G}_{2}^\lambda (\Lambda ):=\lambda h'(\lambda ^2)(\lambda _{1}+\lambda _{2}-2\lambda ), \end{aligned}$$
(2.15)

so that we have

$$\begin{aligned} I(u)-I(u_\lambda ) \ge \int _{\Omega } \mathcal {G}_{1}^\lambda (\Lambda )\,dx +\int _{\Omega } \mathcal {G}_{2}^\lambda (\Lambda )\,dx. \end{aligned}$$

The rest of this section is devoted to finding conditions on \(\lambda \) which ensure that

$$\begin{aligned} \int _{\Omega } \mathcal {G}_{i}^\lambda (\Lambda )\,dx \ge 0\quad \text {for } i=1,2. \end{aligned}$$

The following result, in which inequality (2.16) is part of [2, Lemma 5.3], allows us to deal with the term involving \(\mathcal {G}_{2}^\lambda \). We give a short elementary proof here to keep the paper self-contained; we also give a refined version of the estimate (2.16) which provides an ‘excess term’ [an estimate of the difference between the two sides of the inequality (2.16)]: see (2.17) below.

Lemma 2.8

Let \(u \in W^{1,1}(\Omega ,\mathbb {R}^2)\) satisfy \(u=u_{\lambda }\) on \(\partial \Omega \) and suppose that \(\det \nabla u > 0\) a.e. in \(\Omega \). Then

$$\begin{aligned} \int _{\Omega } \big (\lambda _{1}+\lambda _{2}\big )\,dx \ge 2\lambda \, \mathcal {L}^2(\Omega ), \end{aligned}$$
(2.16)

where \(0< \lambda _1\le \lambda _2\) denote the singular values of \(\nabla u\). Moreover,

$$\begin{aligned} \int _{\Omega } \big (\lambda _{1}+\lambda _{2}- 2\lambda \big ) \,dx \ge \int _{\Omega } \psi (u,\lambda )\,dx, \end{aligned}$$
(2.17)

where

$$\begin{aligned} \psi (u,\lambda ) :=\frac{2\lambda ^2(\mathrm{curl}\,u)^2}{\left( (\mathrm{curl}\,u)^2 + \max \left\{ 4\lambda ^2, (\mathrm{div}\,u)^2\right\} \right) ^{\frac{3}{2}}}. \end{aligned}$$

Proof

We first give a direct proof of (2.16).

The singular value decomposition theorem (see e.g., [6, Theorem 13.3]) yields

$$\begin{aligned} \nabla u = R D(\lambda _{1},\lambda _{2}) Q, \end{aligned}$$

where \(R, Q \in O(2)\) and

$$\begin{aligned} D(\lambda _{1},\lambda _{2}):=\left( \begin{array}{c@{\quad }c} \lambda _1 &{} 0 \\ 0 &{} \lambda _2 \end{array}\right) . \end{aligned}$$

Hence

$$\begin{aligned} \mathrm{tr}\,\nabla u = \mathrm{tr}\,\left( QR D(\lambda _{1},\lambda _{2})\right) . \end{aligned}$$

Since \(QR \in O(2)\), it must be of the form

$$\begin{aligned} QR = \left( \begin{array}{c@{\quad }c} \cos \sigma &{} {\pm } \sin \sigma \\ \sin \sigma &{} {\mp } \cos \sigma \end{array}\right) , \end{aligned}$$

therefore

$$\begin{aligned} \mathrm{tr}\,\nabla u = \cos \sigma (\lambda _{1} \mp \lambda _{2}). \end{aligned}$$

It can now be checked that

$$\begin{aligned} \mathrm{tr}\,\nabla u \le \lambda _{1} + \lambda _{2}. \end{aligned}$$

Then integrating the latter expression over \(\Omega \) and using the fact that the weak derivative satisfies

$$\begin{aligned} \int _{\Omega } \mathrm{tr}\,\nabla u \, dx = \int _{\Omega } \mathrm{tr}\,\nabla u_{\lambda } \, dx = 2\lambda \,\mathcal {L}^2(\Omega ) \end{aligned}$$

yields (2.16).

To prove (2.17), let \(\xi \in \mathbb {R}^{2\times 2}\), denote by \(\lambda _1(\xi ), \lambda _2(\xi )\) the singular values of \(\xi \) and define the function \(\varphi : \mathbb {R}^{2 \times 2} \rightarrow [0,+\infty )\) by

$$\begin{aligned} \varphi (\xi ) := \lambda _1(\xi )+\lambda _2(\xi ). \end{aligned}$$
(2.18)

Notice that

$$\begin{aligned} \varphi (\xi ) =\sqrt{|\xi |^2+2 \det \xi }. \end{aligned}$$
(2.19)

Then by applying the standard identity

$$\begin{aligned} g(1)=g(0)+g'(0)+\int _{0}^{1}(1-s)g''(s)\,ds \end{aligned}$$

to the function \(g(s):=\varphi \left( (1-s)\lambda \mathbf{1}+ s \xi \right) \) defined for \(s\in [0,1]\), we obtain

$$\begin{aligned} \varphi (\xi )&= \varphi (\lambda \mathbf{1}) + \mathrm{tr}\,(\xi -\lambda \mathbf{1}) \nonumber \\&\quad + \int _{0}^{1}(1-s)\frac{\varphi ^{2}(\omega (s))\varphi ^{2}(\xi -\lambda \mathbf{1})-((\omega (s)+\mathrm{cof}\,\omega (s))\cdot (\xi -\lambda \mathbf{1}))^{2}}{\varphi ^{3}(\omega (s))}\,ds, \end{aligned}$$
(2.20)

where

$$\begin{aligned} \omega (s) := (1-s)\lambda \mathbf{1} + s \xi \quad \text {for} \quad 0 \le s \le 1. \end{aligned}$$

For later use we note that the term

$$\begin{aligned} X\left( \omega (s),\xi -\lambda \mathbf{1}\right) :=\frac{\varphi ^{2}\left( \omega (s)\right) \varphi ^{2}(\xi -\lambda \mathbf{1})-\left( \left( \omega (s)+\mathrm{cof}\,\omega (s)\right) \cdot (\xi -\lambda \mathbf{1})\right) ^{2}}{\varphi ^{3}\left( \omega (s)\right) } \end{aligned}$$

can be rewritten as

$$\begin{aligned} X\left( \omega (s),\xi -\lambda \mathbf{1}\right) = \frac{\left( \mathrm{{atr}}\left( \omega (s)\right) \mathrm{tr}\,(\xi -\lambda \mathbf{1}) - \mathrm {atr}(\xi -\lambda \mathbf{1}) \mathrm{tr}\,\left( \omega (s)\right) \right) ^{2}}{\varphi ^{3}\left( \omega (s)\right) }. \end{aligned}$$
(2.21)

Here, \(\mathrm {atr}(\eta )\) denotes the antitrace of any \(\eta \in \mathbb {R}^{2\times 2}\) and is defined by \(\mathrm {atr}(\eta ) := \eta _{12}-\eta _{21}\). Note that, thanks to (2.21), \(X(\cdot ,\cdot ) \ge 0\) for all \(\xi \) and \(s\in [0,1]\), so that by letting \(\xi =\nabla u\) in (2.20) we obtain an alternative proof of (2.16).

Then (2.17) follows by calculating the terms in (2.21). Letting \(\xi = \nabla u\) again, we have \(\omega (s) = \lambda \mathbf{1} + s(\nabla u - \lambda \mathbf{1})\), and

$$\begin{aligned} \mathrm{{atr}} (\nabla u - \lambda \mathbf{1})&= \mathrm{curl}\,u \\ {\mathrm{tr}\,}(\nabla u - \lambda \mathbf{1})&= \mathrm{div}\,u - 2 \lambda \\ \mathrm{{atr}}\left( \omega (s)\right)&= s \,\mathrm{curl}\,u \\ {\mathrm{tr}\,}\left( \omega (s)\right)&= s \, \mathrm{div}\,u+(1-s)2\lambda . \end{aligned}$$

This gives

$$\begin{aligned} X\left( \omega (s),\xi -\lambda \mathbf{1}\right) =\frac{4\lambda ^2 (\mathrm{curl}\,u)^2}{\varphi ^{3}\left( \lambda \mathbf{1}+s(\nabla u -\lambda \mathbf{1})\right) }. \end{aligned}$$
(2.22)

Now

$$\begin{aligned} \varphi ^2(\eta ) = \left( \mathrm {atr}(\eta )\right) ^2+\left( \mathrm{tr}\,(\eta )\right) ^{2}, \end{aligned}$$

so we have

$$\begin{aligned} \varphi ^{2}\left( \lambda \mathbf{1}+s(\nabla u -\lambda \mathbf{1})\right) = s^2 (\mathrm{curl}\,u)^2 + \left( s\,\mathrm{div}\,u + 2 (1-s)\lambda \right) ^2. \end{aligned}$$

Since the function

$$\begin{aligned} p:s \mapsto \left( s\,\mathrm{div}\,u + 2 (1-s)\lambda \right) ^2 \end{aligned}$$

is convex, its maximum on the interval [0, 1] must be \(\max \left\{ p(0),p(1)\right\} \). Hence

$$\begin{aligned} \varphi ^{2}(\lambda \mathbf{1}+s(\nabla u -\lambda \mathbf{1}))&\le (\mathrm{curl}\,u)^2 + \max \{4\lambda ^2, (\mathrm{div}\,u)^2\} \end{aligned}$$

uniformly in s. Therefore (2.22) gives

$$\begin{aligned} X\left( \omega (s),\xi -\lambda \mathbf{1}\right) \ge \frac{4\lambda ^2 (\mathrm{curl}\,u)^2}{\left( (\mathrm{curl}\,u)^2 + \max \left\{ 4\lambda ^2, (\mathrm{div}\,u)^2\right\} \right) ^{\frac{3}{2}}}. \end{aligned}$$

Inserting this into (2.20), recalling that

$$\begin{aligned} \lambda _1+\lambda _2-2\lambda =\varphi (\nabla u)-\varphi (\lambda \mathbf{1}), \end{aligned}$$

and carrying out what becomes a trivial integration yields (2.17). \(\square \)
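The proof shows that the integrand inequality behind (2.17) in fact holds pointwise: for a matrix \(\xi \) with \(\det \xi \ge 0\), the quantity \(\varphi (\xi )-\mathrm{tr}\,\xi \) dominates the expression defining \(\psi \), with \(\mathrm{div}\,u\) and \(\mathrm{curl}\,u\) read off as the trace and antitrace of \(\xi \). The following sketch checks this on sample matrices. It uses the identity \(\varphi ^2=\mathrm{tr}^2+\mathrm{atr}^2\) from the proof; the function names are hypothetical.

```python
import math
import random

def phi(F):
    """phi(F) = sqrt(tr(F)^2 + atr(F)^2) = sqrt(|F|^2 + 2 det F)."""
    (a, b), (c, d) = F
    return math.hypot(a + d, b - c)

def psi(F, lam):
    """Pointwise excess term from Lemma 2.8, with F playing the role of grad u."""
    (a, b), (c, d) = F
    curl = c - b  # only curl^2 enters, so the sign convention is immaterial
    div = a + d
    denom = (curl**2 + max(4.0 * lam**2, div**2)) ** 1.5
    return 2.0 * lam**2 * curl**2 / denom

def excess(F):
    """phi(F) - phi(lam 1) - tr(F - lam 1), which simplifies to phi(F) - tr F."""
    (a, b), (c, d) = F
    return phi(F) - (a + d)

def random_matrix(rng, scale=3.0):
    """Random 2x2 matrix with entries in [-scale, scale] (hypothetical helper)."""
    return ((rng.uniform(-scale, scale), rng.uniform(-scale, scale)),
            (rng.uniform(-scale, scale), rng.uniform(-scale, scale)))
```

For the rotation through a right angle and \(\lambda =1\), for instance, the excess equals 2 while \(\psi =8/8^{3/2}\), so the estimate is far from sharp in that case.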

We now return to the estimate of \(\mathcal {G}_{2}^\lambda \). Since we are working under the assumption \(h'(\lambda ^2)>0\) and since \(\lambda >0\), the prefactor \(\lambda h'(\lambda ^2)\) in (2.15) is positive, so applying Lemma 2.8 gives

$$\begin{aligned} \int _{\Omega } \mathcal {G}_{2}^\lambda (\Lambda )\,dx \ge 0, \end{aligned}$$
(2.23)

as desired.

To deal with the term involving \(\mathcal {G}_{1}^\lambda \) we find an explicit condition on \(\lambda \) which ensures that \(\mathcal {G}_{1}^\lambda (\Lambda ) \ge 0\) holds pointwise for \(\Lambda \in \mathbb {R}^{++}\) where

$$\begin{aligned} \mathbb {R}^{++}:=\left\{ x\in \mathbb {R}^2: \ x_{1}, x_{2} > 0\right\} . \end{aligned}$$

Lemma 2.9

The function

$$\begin{aligned} \mathcal {G}_{1}^\lambda (\Lambda )=f_{\sqrt{2}\lambda }\left( |\Lambda -\Lambda _{0}|\right) + h'(\lambda ^2)(\lambda _{1}-\lambda )(\lambda _{2}-\lambda ) \end{aligned}$$

is pointwise nonnegative on \(\mathbb {R}^{++}\) provided

$$\begin{aligned} C_{1}\left( \sqrt{2}\lambda ,q\right) \ge h'(\lambda ^2)/2, \end{aligned}$$
(2.24)

and

$$\begin{aligned} \frac{C_{2}(q)}{ h'(\lambda ^2)\lambda ^{2-q}} \ge (q-1)^{(q-1)/2}q^{-q/2}. \end{aligned}$$
(2.25)

Moreover, inequality (2.25) implies (2.24).

Proof

We divide the proof into two parts, the first of which is devoted to proving the sufficiency of (2.24) and (2.25).

Part 1 To shorten notation set \(Y:=h'(\lambda ^2)\). Let \(\Lambda -\Lambda _0=(\lambda _1-\lambda ,\lambda _2-\lambda )=(\rho \cos \mu , \rho \sin \mu )\) and let \(C_{1}:=C_{1}(\sqrt{2}\lambda ,q)\) and \(C_{2}:=C_{2}(q)\), as defined in (2.8) and (2.9) respectively. Let \(G(\rho ,\mu ):=\mathcal {G}_{1}^\lambda (\Lambda )\) and note that (using (2.12) with \(M=\sqrt{2}\lambda \))

$$\begin{aligned} G(\rho ,\mu ) = \left\{ \begin{array}{l@{\quad }l}C_{1}\rho ^2+Y\rho ^2 \sin \mu \cos \mu &{}\quad \text {if} \ \rho \le \sqrt{2}\lambda \\ C_{2}\rho ^q + Y\rho ^2 \sin \mu \cos \mu &{}\quad \text {if} \ \rho \ge \sqrt{2}\lambda . \end{array}\right. \end{aligned}$$
(2.26)

Firstly, if \(\rho \le \sqrt{2}\lambda \) then \(G(\rho , \mu ) \ge 0\) if and only if \(C_{1}+Y\sin \mu \cos \mu \ge 0\) for all \(\mu \). Whence \(C_{1} -Y/2 \ge 0\), which is (2.24). We henceforth suppose that (2.24) holds.

Inequality (2.25) essentially prevents \(G(\rho ,\mu )\) from vanishing at any point in \(\mathbb {R}^{++}\) outside the set \(B(\Lambda _{0},\sqrt{2}\lambda ) \cap \mathbb {R}^{++}\). By symmetry, we need only consider \(\mu \in [-\pi /4,\pi /4]\), and since \(G(\rho ,\mu )\ge 0\) if \(0 \le \mu \le \pi /4\), we can restrict attention to \(-\pi /4 < \mu \le 0\). Moreover, since \(G(\rho ,0)\) is obviously nonnegative, we can also exclude \(\mu =0\). Now, in view of (2.24), the only way \(G(\rho ,\mu )\) can vanish is if \(\rho \ge \sqrt{2}\lambda \). In the region \(\rho \ge \sqrt{2}\lambda \), \(-\pi /4 < \mu < 0\)

$$\begin{aligned} G(\rho ,\mu )=C_{2}\rho ^q-Y|\sin \mu \cos \mu |\rho ^2, \end{aligned}$$

and since \(1 < q < 2\), it must be that \(G(\rho ,\mu ) < 0\) for sufficiently large \(\rho \) and each fixed \(\mu \). Also, since \(G(\rho ,\mu )\) is continuous and since, by (2.24), \(G(\sqrt{2}\lambda ,\mu ) \ge 0\), it follows that

$$\begin{aligned} \bar{\rho }(\mu ) := \inf \left\{ \rho \ge \sqrt{2}\lambda : \ C_{2}\rho ^q-Y|\sin \mu \cos \mu |\rho ^2=0\right\} \end{aligned}$$

is well-defined. Thus \(\bar{\rho }(\mu )\) satisfies

$$\begin{aligned} C_{2}\bar{\rho }(\mu )^q-Y|\sin \mu \cos \mu |\bar{\rho }(\mu )^2 = 0. \end{aligned}$$
(2.27)

Now, if the point \(\left( \bar{\rho }(\mu ) \cos \mu + \lambda , \bar{\rho }(\mu ) \sin \mu + \lambda \right) \) lies in the interior of \(\mathbb {R}^{++}\) then, by making \(\rho \) slightly larger, we ensure \(G(\rho ,\mu )<0\). Since \(-\pi /4<\mu < 0\), the inclusion

$$\begin{aligned} \left( \bar{\rho }(\mu ) \cos \mu + \lambda , \bar{\rho }(\mu ) \sin \mu + \lambda \right) \in \mathbb {R}^{++} \end{aligned}$$

is prevented when and only when

$$\begin{aligned} \bar{\rho }(\mu ) \ge \rho ^{*}(\mu ), \end{aligned}$$
(2.28)

where \(\rho ^{*}(\mu )\) satisfies \(\rho ^{*}(\mu ) \sin \mu + \lambda = 0\) and \(-\pi /4<\mu <0\).

Using (2.27) and the definition of \(\rho ^{*}\), inequality (2.28) is equivalent to

$$\begin{aligned} \frac{C_{2}}{Y \lambda ^{2-q}} \ge \underbrace{\cos \mu |\sin \mu |^{q-1}}_{=:e(\mu )}, \end{aligned}$$
(2.29)

where \(-\pi /4 < \mu < 0\). It can be checked that

$$\begin{aligned} \max _{(-\pi /4,0)}e = (q-1)^{(q-1)/2}q^{-q/2}, \end{aligned}$$
(2.30)

the maximum occurring at \(\mu \) such that \(\cos ^{2}\mu =1/q\). Inequality (2.25) now follows.
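For the reader's convenience, the two computations behind (2.29) and (2.30) can be sketched as follows, using only (2.27), (2.28), the definition of \(\rho ^{*}\), and the positivity of \(C_{2}\) and \(Y\). Since \(-\pi /4<\mu <0\), we have \(\cos \mu >0\) and \(|\sin \mu \cos \mu |=\cos \mu |\sin \mu |\), so that

$$\begin{aligned} \bar{\rho }(\mu )^{2-q} = \frac{C_{2}}{Y\cos \mu \,|\sin \mu |}, \qquad \rho ^{*}(\mu )^{2-q} = \frac{\lambda ^{2-q}}{|\sin \mu |^{2-q}}. \end{aligned}$$

Because \(2-q>0\), inequality (2.28) holds if and only if the first quantity is at least the second, and multiplying through by \(\cos \mu \,|\sin \mu |\,\lambda ^{q-2}\) gives (2.29). For (2.30), set \(t:=|\sin \mu |\in (0,1/\sqrt{2})\), so that \(e=(1-t^{2})^{1/2}t^{q-1}\) and

$$\begin{aligned} \frac{d}{dt}\ln e = \frac{q-1}{t}-\frac{t}{1-t^{2}} = 0 \quad \Longleftrightarrow \quad t^{2}=\frac{q-1}{q}, \end{aligned}$$

which is admissible because \((q-1)/q<1/2\) when \(q<2\). At this point \(\cos ^{2}\mu =1/q\) and

$$\begin{aligned} e = q^{-1/2}\left( \frac{q-1}{q}\right) ^{(q-1)/2} = (q-1)^{(q-1)/2}q^{-q/2}. \end{aligned}$$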

Part 2 We prove that (2.25) implies (2.24). First note that dividing both sides of (2.25) by \(2^{(2-q)/2}\) gives

$$\begin{aligned} \frac{C_{1}(\sqrt{2}\lambda ,q)}{Y} \ge \underbrace{\left( \frac{(q-1)^{q-1}q^{-q}}{2^{2-q}}\right) ^{1/2}}_{=:y(q)}. \end{aligned}$$
(2.31)

Let \(\gamma (q)=2 \ln y(q)\) and calculate \(\gamma '(q)=\ln \left( 2\left( 1-\frac{1}{q}\right) \right) \). Now \(1<q<2\), so \(2\left( 1-\frac{1}{q}\right) \in (0,1)\), and hence \(\gamma '(q) < 0\) on (1, 2). It follows that y is a decreasing function of q on (1, 2), and since \(y(q) \rightarrow \frac{1}{2}\) as \(q \rightarrow 2-\), the right-hand side of (2.31) is bounded below by \(\frac{1}{2}\). Hence (2.24) holds. \(\square \)
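The derivative and the limit used in Part 2 can be checked directly from the definition of y in (2.31); the following is only a verification sketch:

$$\begin{aligned} \gamma (q)&= (q-1)\ln (q-1) - q\ln q - (2-q)\ln 2, \\ \gamma '(q)&= \big (\ln (q-1)+1\big ) - \big (\ln q+1\big ) + \ln 2 = \ln \left( 2\left( 1-\frac{1}{q}\right) \right) , \\ \lim _{q \rightarrow 2-} y(q)&= \left( \frac{1^{1}\cdot 2^{-2}}{2^{0}}\right) ^{1/2} = \frac{1}{2}. \end{aligned}$$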

We now draw the preceding discussions together in the following result, whose statement, in contrast to that of Proposition 2.5, does not rely on the imposition of condition (INV).

Theorem 2.10

Let the stored energy function \(W: \mathbb {R}^{2 \times 2} \rightarrow [0,+\infty ]\) be given by

$$\begin{aligned} W(A) := |A|^{q}+ h(\det A), \end{aligned}$$

where \(1< q < 2\) and \(h:\mathbb {R}\rightarrow [0,+\infty ]\) satisfies (H1)–(H3). Let \(\lambda >0\) be such that

$$\begin{aligned} \frac{1}{2^{3-q}h'(\lambda ^2)\lambda ^{2-q}} \ge (q-1)^{(q-1)/2}q^{-q/2}. \end{aligned}$$
(2.32)

Then any \(u \in \mathcal {A}_{\lambda }\) satisfies \(I(u) \ge I(u_{\lambda })\).

2.1 Error estimates

In this section we are interested in understanding the properties of those \(u \in \mathcal A_\lambda \) such that \(I(u)-I(u_{\lambda })\) is small and positive. Hence we focus on the case \(h'(\lambda ^2)>0\) to which the results of the previous section apply. Accordingly, we impose the hypotheses of Theorem 2.10 and strengthen inequality (2.32) to read

$$\begin{aligned} \frac{1}{2^{3-q}h'(\lambda ^2) \lambda ^{2-q}} > (q-1)^{(q-1)/2}q^{-q/2}. \end{aligned}$$
(2.33)

The main result of this subsection is the following.

Theorem 2.11

Assume that (2.33) holds. Then there is a constant \(c=c(\Omega ,\lambda ,q)>0\) such that for every \(u \in \mathcal {A}_{\lambda }\)

$$\begin{aligned} \int _{\Omega }\min \left\{ |\nabla u - \lambda \mathbf{1}|^2,|\nabla u - \lambda \mathbf{1}|^q\right\} \,dx \le c\, \delta (u), \end{aligned}$$
(2.34)

where \(\delta (u):=I(u)-I(u_{\lambda })\). Moreover,

$$\begin{aligned} \lambda \, h'(\lambda ^2) \int _{\Omega } \frac{2\lambda ^2(\mathrm{curl}\,u)^2}{\left( (\mathrm{curl}\,u)^2 + \max \left\{ 4\lambda ^2, (\mathrm{div}\,u)^2\right\} \right) ^{\frac{3}{2}}}\,dx \le \delta (u).\end{aligned}$$
(2.35)

The proof of Theorem 2.11 is given in stages below. In view of

$$\begin{aligned} \int _{\Omega }\mathcal {G}_{1}^\lambda (\Lambda ) \,dx+\int _{\Omega }\mathcal {G}_{2}^\lambda (\Lambda ) \,dx \le \delta (u), \end{aligned}$$
(2.36)

the idea is that if \(\delta (u)\) is small then the same must be true of the two (necessarily nonnegative) terms in the left-hand side of (2.36). The first inequality, (2.34), follows from a smallness assumption on \(\int _{\Omega }\mathcal {G}_{1}^\lambda (\Lambda )\,dx\): see Proposition 2.14 below, while inequality (2.35) is a consequence of small \(\int _{\Omega } \mathcal {G}_{2}^\lambda (\Lambda )\,dx\) and follows in a straightforward way from (2.17).

We remark that an inequality like (2.35) is not available in the three dimensional case, or at least we could not derive it. The chief difficulty is the lack of an explicit expression for \(\lambda _{1}(\xi ) + \lambda _{2}(\xi )+\lambda _{3}(\xi )\) for \(\xi \in \mathbb {R}^{3\times 3}\): cf. (2.18) and (2.19).

We now turn to inequality (2.34). To this end we introduce the function \(g:[0,+\infty )\rightarrow [0,+\infty )\) defined by

$$\begin{aligned} g(t):={\left\{ \begin{array}{ll} \displaystyle \frac{t^2}{2} &{}\quad \text {if }\; 0\le t \le 1,\\ \displaystyle \frac{t^q}{q} + \frac{1}{2}-\frac{1}{q} &{}\quad \text {if }\; t \ge 1. \end{array}\right. } \end{aligned}$$
(2.37)

For later use we notice that g is convex.
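A short way to see the convexity, sketched here using only (2.37) and \(q>1\), is to compute the derivative on each branch:

$$\begin{aligned} g'(t)={\left\{ \begin{array}{ll} t &{}\quad \text {if }\; 0\le t \le 1,\\ t^{q-1} &{}\quad \text {if }\; t \ge 1. \end{array}\right. } \end{aligned}$$

The two branches of g agree at \(t=1\) (both equal 1/2), as do the two branches of \(g'\) (both equal 1). Hence g is \(C^{1}\) on \([0,+\infty )\) with nondecreasing derivative, and so g is convex.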

Lemma 2.12

Let (2.33) hold. Then there is a constant \(c_0=c_0(\lambda ,q)>0\) such that

$$\begin{aligned} \mathcal {G}_{1}^\lambda (\Lambda ) \ge c_0\, g(|\Lambda -\Lambda _{0}|) \quad \text {on }\mathbb {R}^{++} \end{aligned}$$
(2.38)

where g is as in (2.37).

Proof

It is clear from the last part of the proof of Lemma 2.9 that inequality (2.33) implies that (2.24) holds with strict inequality. Thus

$$\begin{aligned} \mathcal {G}_{1}^\lambda (\Lambda ) \ge c|\Lambda -\Lambda _{0}|^{2} \quad \quad \mathrm {if} \ |\Lambda - \Lambda _{0}| \le \sqrt{2}\lambda \end{aligned}$$
(2.39)

for some constant \(c>0\).

Reusing the notation \(\Lambda -\Lambda _{0}= \rho (\cos \mu , \sin \mu )\) and \(G(\rho ,\mu ):=\mathcal {G}_{1}^\lambda (\Lambda )\), the case \(\rho \ge \sqrt{2}\lambda \) can be handled as follows. Let \(\epsilon >0\) and write

$$\begin{aligned} G(\rho ,\mu )&= C_{2}\rho ^q - Y|\sin \mu \cos \mu | \rho ^2 \\&= (C_{2}-\epsilon )\rho ^q -Y|\sin \mu \cos \mu |\rho ^2 + \epsilon \rho ^q, \end{aligned}$$

where \(Y:=h'(\lambda ^2)\). By applying the reasoning in the proof of Lemma 2.9 to the function

$$\begin{aligned} \tilde{G}(\rho ,\mu ): = (C_{2}-\epsilon )\rho ^q -Y|\sin \mu \cos \mu |\rho ^2, \end{aligned}$$

we see that \(\tilde{G}(\rho ,\mu )\ge 0\) provided

$$\begin{aligned} \frac{C_{2}-\epsilon }{Y \lambda ^{2-q}} \ge (q-1)^{(q-1)/2}q^{-q/2}. \end{aligned}$$
(2.40)

Inequality (2.33) clearly implies that the left-hand side of (2.40) with \(\epsilon =0\) exceeds the right-hand side of (2.40) by a fixed amount; thus, if \(\epsilon >0\) is sufficiently small, inequality (2.40) holds. Hence

$$\begin{aligned} \mathcal {G}_{1}^\lambda (\Lambda ) \ge \epsilon |\Lambda -\Lambda _{0}|^{q} \quad \quad \mathrm {if} \ |\Lambda - \Lambda _{0}| \ge \sqrt{2}\lambda . \end{aligned}$$
(2.41)

Inequalities (2.39) and (2.41) are easily combined to give (2.38). \(\square \)
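One possible way to carry out this combination, sketched here using only (2.37), (2.39), (2.41) and \(1<q<2\), rests on the two elementary bounds

$$\begin{aligned} g(t)\le \frac{t^{2}}{2} \quad \text {and}\quad g(t)\le \frac{t^{q}}{q} \quad \text {for every}\quad t\ge 0. \end{aligned}$$

For \(t\le 1\) the first is an equality and the second follows from \(t^{2}\le t^{q}\) and \(1/2<1/q\). For \(t\ge 1\) the second follows from \(1/2-1/q<0\), while the first holds because both sides equal 1/2 at \(t=1\) and \(g'(t)=t^{q-1}\le t\). Consequently, with c as in (2.39) and \(\epsilon \) as in (2.41),

$$\begin{aligned} \mathcal {G}_{1}^\lambda (\Lambda )&\ge c|\Lambda -\Lambda _{0}|^{2} \ge 2c\, g(|\Lambda -\Lambda _{0}|) \quad \mathrm {if} \ |\Lambda - \Lambda _{0}| \le \sqrt{2}\lambda , \\ \mathcal {G}_{1}^\lambda (\Lambda )&\ge \epsilon |\Lambda -\Lambda _{0}|^{q} \ge q\epsilon \, g(|\Lambda -\Lambda _{0}|) \quad \mathrm {if} \ |\Lambda - \Lambda _{0}| \ge \sqrt{2}\lambda , \end{aligned}$$

so that (2.38) holds with, for instance, \(c_{0}=\min \{2c,q\epsilon \}\).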

We will see that inequality (2.34) is a consequence of the \(L^2+L^q\) rigidity estimate [5, Theorem 1.1], or of [13, Proposition 2.3]. We recall here the following variant (see [1, Lemma 3.1]) which is suitable for our purposes.

Lemma 2.13

Let \(U\subset \mathbb {R}^n\) be a bounded domain with Lipschitz boundary. Let \(\lambda >0\) and g be as in (2.37). There exists a constant \(c=c(U,\lambda ,q)>0\) with the following property: for every \(v\in W^{1,q}(U;\mathbb {R}^n)\) there is a constant rotation \(R\in SO(n)\) satisfying

$$\begin{aligned} \int _U g(|\nabla v-\lambda R|)\,dx\le c \int _U g\left( \mathrm{dist}\left( \nabla v, \lambda SO(n)\right) \right) \,dx. \end{aligned}$$

Proof

Once we observe that, thanks to [8, Theorem 3.1] we can find \(c=c(U)>0\) such that for every \(w\in W^{1,2}(U;\mathbb {R}^n)\) there is a constant rotation \(R\in SO(n)\) satisfying

$$\begin{aligned} \int _U |\nabla w-\lambda R|^2\,dx\le c \int _U \mathrm{dist}^2\left( \nabla w, \lambda SO(n)\right) \,dx, \end{aligned}$$

the proof then closely follows that of [1, Lemma 3.1]. \(\square \)

Proposition 2.14

There is a constant \(c=c(\Omega ,\lambda ,q)>0\) such that

$$\begin{aligned} \int _{\Omega }\min \left\{ |\nabla u- \lambda \mathbf{1}|^2,|\nabla u - \lambda \mathbf{1}|^q\right\} \,dx \le c \delta (u). \end{aligned}$$
(2.42)

Proof

Throughout this proof c denotes a generic strictly positive constant possibly depending on \(\Omega \), \(\lambda \), and q. By (2.23) and (2.36) we have

$$\begin{aligned} \int _{\Omega }\mathcal {G}_{1}^\lambda (\Lambda ) \,dx\le \delta (u). \end{aligned}$$

Hence on recalling that

$$\begin{aligned} |\Lambda - \Lambda _{0}|=\text {dist}\,\left( \nabla u, \lambda SO(2)\right) , \end{aligned}$$

and by appealing to Lemma 2.12, we get

$$\begin{aligned} c_0 \int _\Omega g\left( \text {dist}\,\left( \nabla u,\lambda SO(2)\right) \right) \,dx\le \delta (u). \end{aligned}$$

Then Lemma 2.13 provides us with \(c>0\) and \(R\in SO(2)\) such that

$$\begin{aligned} \int _\Omega g(|\nabla u -\lambda R|)\,dx\le c\,\delta (u). \end{aligned}$$
(2.43)

We claim that

$$\begin{aligned} |\mathbf{1}-R|^2\le c\, \delta (u). \end{aligned}$$
(2.44)

By virtue of the convexity of g, combining Jensen’s inequality with (2.43) gives

$$\begin{aligned} g\left( \frac{1}{\mathcal {L}^2(\Omega )}\int _\Omega |\nabla u -\lambda R|\,dx\right) \le c\,\delta (u). \end{aligned}$$
(2.45)

Set \(\tilde{u}:= u/\lambda \) and \(\tilde{z}:=\frac{1}{\mathcal {L}^2(\Omega )}\int _\Omega (\tilde{u}-Rx)\,dx\). Then by Poincaré’s inequality together with the continuity of the trace operator we obtain

$$\begin{aligned} \int _{\partial \Omega }|\tilde{u} -Rx -\tilde{z}|\,d\mathcal H^1 \le c \int _\Omega |\nabla \tilde{u}-R|\,dx, \end{aligned}$$

and hence, since \(\tilde{u}=x\) on \(\partial \Omega \), we deduce that

$$\begin{aligned} \int _{\partial \Omega }\left| (\mathbf{1}-R)x -\tilde{z}\right| \,d\mathcal H^1 \le c \int _\Omega \left| \nabla \tilde{u}-R\right| \,dx. \end{aligned}$$
(2.46)

Arguing as in the proof of [1, Lemma 3.3], we apply [1, Lemma 3.2] to deduce that there exists a universal constant \(\sigma >0\) such that

$$\begin{aligned} |\mathbf{1}-R|\le \sigma \min _{z\in \mathbb {R}^2}\int _{\partial \Omega }\left| (\mathbf{1}-R)x -z\right| \,d\mathcal H^1. \end{aligned}$$
(2.47)

Combining (2.46) and (2.47) gives

$$\begin{aligned} |\mathbf{1}-R|\le & {} c \int _\Omega |\nabla \tilde{u}-R|\,dx\\= & {} \frac{c}{\lambda } \int _\Omega |\nabla u-\lambda R|\,dx, \end{aligned}$$

and therefore

$$\begin{aligned} |\mathbf{1}-R|^2 \le c \left( \frac{1}{\mathcal {L}^2(\Omega )}\int _\Omega |\nabla u-\lambda R|\,dx\right) ^2. \end{aligned}$$
(2.48)

Then to prove (2.44) we need to distinguish two cases.

(i) \(\displaystyle \int _\Omega |\nabla u-\lambda R|\,dx\le \mathcal {L}^2(\Omega )\). By definition \(g(t)={t^2}/{2}\) for \(t\le 1\), so that (2.45) and (2.48) immediately yield

$$\begin{aligned} |\mathbf{1}-R|^2 \le c\, g\left( \frac{1}{\mathcal {L}^2(\Omega )}\int _\Omega |\nabla u-\lambda R|\,dx\right) \le c\,\delta (u). \end{aligned}$$

(ii) \(\displaystyle \int _\Omega |\nabla u-\lambda R|\,dx> \mathcal {L}^2(\Omega )\). Since \(g(t)>{1}/{2}\) when \(t>1\), we have

$$\begin{aligned} |\mathbf{1}-R|^2\le & {} 2(|\mathbf{1}|^2+|R|^2) \\< & {} c\, g\left( \frac{1}{\mathcal {L}^2(\Omega )}\int _\Omega |\nabla u-\lambda R|\,dx\right) \\\le & {} c\,\delta (u), \end{aligned}$$

hence the claim is proved.

We now notice that the convexity of g together with its definition entails

$$\begin{aligned} g(s+t)\le c\Big (g(s)+t^2\Big )\quad \text {for every}\quad s,t\ge 0 \end{aligned}$$

and for some \(c>0\). Indeed we have

$$\begin{aligned} g(s+t)\le & {} 4\, g\Big (\frac{s+t}{2}\Big )\\\le & {} 2 \big (g(s)+g(t)\big )\\\le & {} 2 \bigg (g(s)+\frac{t^2}{q}\bigg ). \end{aligned}$$

Then choosing R as in (2.43) and combining the latter with (2.44) implies

$$\begin{aligned} \int _\Omega g(|\nabla u-\lambda \mathbf{1}|)\,dx= & {} \int _\Omega g(|\nabla u-\lambda R+ \lambda R-\lambda \mathbf{1}|)\,dx\nonumber \\\le & {} c\left( \int _\Omega g(|\nabla u-\lambda R|)\,dx+\lambda ^2\,|\mathbf{1}-R|^2\right) \nonumber \\\le & {} c\,\delta (u). \end{aligned}$$
(2.49)

Finally, since we can find \(c>0\) such that

$$\begin{aligned} \min \{t^2,t^q\} \le c\,g(t)\quad \text {for every}\quad t\ge 0, \end{aligned}$$

we obtain

$$\begin{aligned} \int _{\Omega }\min \left\{ |\nabla u - \lambda \mathbf{1}|^2,|\nabla u - \lambda \mathbf{1}|^q\right\} \,dx \le c \delta (u), \end{aligned}$$

which is the desired estimate. \(\square \)
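The elementary bound \(\min \{t^2,t^q\} \le c\,g(t)\) invoked in the last step of the proof can be verified with the explicit (not necessarily optimal) choice \(c=2\); the following sketch uses only (2.37) and \(1<q<2\):

$$\begin{aligned} 0\le t\le 1:&\quad \min \{t^{2},t^{q}\} = t^{2} = 2g(t), \\ t\ge 1:&\quad \min \{t^{2},t^{q}\} = t^{q}, \qquad g(t) = \frac{t^{q}}{q}-\left( \frac{1}{q}-\frac{1}{2}\right) \ge \frac{t^{q}}{q}-\left( \frac{1}{q}-\frac{1}{2}\right) t^{q} = \frac{t^{q}}{2}. \end{aligned}$$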

Remark 2.15

Using (2.49) and the definition of g we obtain

$$\begin{aligned} \int _{|\nabla u-\lambda \mathbf{1}|\le 1}|\nabla u-\lambda \mathbf{1}|^2\,dx\le c \int _\Omega g(|\nabla u-\lambda \mathbf{1}|)\,dx\le c\,\delta (u). \end{aligned}$$
(2.50)

Then recalling that \(q<2\), Hölder’s inequality combined with (2.50) yields

$$\begin{aligned} \int _{|\nabla u-\lambda \mathbf{1}|\le 1}|\nabla u-\lambda \mathbf{1}|^q\,dx\le \mathcal {L}^2(\Omega )^{1-\frac{q}{2}} \left( \int _{|\nabla u-\lambda \mathbf{1}|\le 1}|\nabla u-\lambda \mathbf{1}|^2\,dx\right) ^\frac{q}{2} \le c\,\delta (u)^\frac{q}{2}.\quad \end{aligned}$$
(2.51)

On the other hand we clearly have

$$\begin{aligned} \int _{|\nabla u-\lambda \mathbf{1}|>1}|\nabla u-\lambda \mathbf{1}|^q\,dx\le c\, \int _\Omega g(|\nabla u-\lambda \mathbf{1}|)\,dx\le c\,\delta (u). \end{aligned}$$
(2.52)

Therefore (2.51) and (2.52) together give

$$\begin{aligned} \int _\Omega |\nabla u-\lambda \mathbf{1}|^q\,dx\le c\, \Big (\delta (u)^{\frac{q}{2}}+\delta (u)\Big ), \end{aligned}$$

which on applying Poincaré’s inequality finally implies

$$\begin{aligned} \Vert u-u_\lambda \Vert _{W^{1,q}(\Omega ;\mathbb {R}^2)}^q \le c\, \Big (\delta (u)^{\frac{q}{2}}+\delta (u)\Big ). \end{aligned}$$
(2.53)

If \(\lambda \) satisfies (2.33) then from (2.53) we can conclude that \(u_\lambda \) is the unique global minimiser of I among all maps u in \(\mathcal A_\lambda \) and, moreover, that \(u_\lambda \) lies in a potential well.

3 The three dimensional case

In this section we seek conditions analogous to those obtained in the two dimensional case ensuring that \(u_{\lambda }\) is the unique global minimizer of the energy I associated with an appropriately defined stored-energy function. For simplicity we focus on the function \(W: \mathbb {R}^{3 \times 3} \rightarrow [0,+\infty ]\) given by

$$\begin{aligned} W(A):=|A|^{q}+\gamma |A|^{2} + Z(\mathrm{cof}\,A)+h(\det A), \end{aligned}$$
(3.1)

where \(2<q<3\), \(\gamma >0\) is a fixed constant, \(Z: \mathbb {R}^{3 \times 3} \rightarrow [0,+\infty )\) is convex and \(C^1\), and h has properties (H1)–(H3).

Applying [14, Lemma A.1] to \(A \mapsto |A|^q\) gives

$$\begin{aligned} |\nabla u|^q \ge |\lambda \mathbf{1}|^q+q|\lambda \mathbf{1}|^{q-2} \lambda \mathbf{1}\cdot (\nabla u -\lambda \mathbf{1})+\kappa |\nabla u-\lambda \mathbf{1}|^q, \end{aligned}$$
(3.2)

where

$$\begin{aligned} 2^{2-q} \le \kappa \le q2^{1-q} . \end{aligned}$$
(3.3)

Moreover, we clearly have

$$\begin{aligned} \gamma |\nabla u|^2 \ge \gamma |\lambda \mathbf{1}|^2+2\gamma \lambda \mathbf{1}\cdot (\nabla u -\lambda \mathbf{1})+\gamma |\nabla u-\lambda \mathbf{1}|^2. \end{aligned}$$
(3.4)
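In fact (3.4) holds with equality. As a one-line check, writing \(A=\nabla u\) and \(B=\lambda \mathbf{1}\) (shorthand introduced only for this remark) and expanding the square of the Euclidean norm gives

$$\begin{aligned} |A|^{2} = |B+(A-B)|^{2} = |B|^{2} + 2B\cdot (A-B) + |A-B|^{2}, \end{aligned}$$

and multiplying by \(\gamma \) yields (3.4).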

Therefore, by gathering (3.2) and (3.4) and appealing to the convexity of Z and h, we obtain

$$\begin{aligned} W(\nabla u)\ge & {} W (\nabla u_{\lambda }) + q|\lambda \mathbf{1}|^{q-2}\lambda \mathbf{1}\cdot (\nabla u - \lambda \mathbf{1}) + \kappa |\nabla u-\lambda \mathbf{1}|^{q} \nonumber \\&+ 2\gamma \lambda \mathbf{1} \cdot (\nabla u - \lambda \mathbf{1}) + \gamma |\nabla u - \lambda \mathbf{1}|^{2} \nonumber \\&+ D_{A}Z(\mathrm{cof}\,\lambda \mathbf{1}) \cdot (\mathrm{cof}\,\nabla u - \mathrm{cof}\,\lambda \mathbf{1})\nonumber \\&+ h'(\lambda ^{3})(\det \nabla u-\lambda ^3), \end{aligned}$$
(3.5)

for any \(u \in \mathcal A_\lambda \), where \(\mathcal {A}_{\lambda }\) is the class of admissible maps given by (1.4) with \(n=3\). Integrating (3.5) and using the facts that both \(\nabla u\) and \(\mathrm{cof}\,\nabla u\) are null Lagrangians in \(W^{1,q}(\Omega ,\mathbb {R}^{3})\) for \(q \ge 2\), we obtain

$$\begin{aligned} I(u)-I(u_{\lambda }) \ge \int _{\Omega } \big (\kappa |\nabla u - \lambda \mathbf{1}|^q+\gamma |\nabla u - \lambda \mathbf{1}|^2 + h'(\lambda ^3)(\det \nabla u - \lambda ^3)\big )\,dx \end{aligned}$$
(3.6)

By analogy with Proposition 2.5 we can deal with the case \(h'(\lambda ^{3})\le 0\) by imposing condition (INV) on a suitably defined extension of u, as follows.

Proposition 3.1

Suppose that \(W: \mathbb {R}^{3 \times 3} \rightarrow [0,+\infty ]\) is given by

$$\begin{aligned} W(A):=|A|^{q}+\gamma |A|^{2} + Z(\mathrm{cof}\,A)+h(\det A) \end{aligned}$$

where \(2<q<3\), \(\gamma >0\) is a fixed constant, \(Z: \mathbb {R}^{3 \times 3} \rightarrow [0,+\infty )\) is convex and \(C^1\), and h has properties (H1)–(H3). Let B(0, M) contain \(\bar{\Omega }\) and denote by \(u^{\text {e}}\) the extension of u to \(B(0,M) {\setminus } \Omega \) defined by

$$\begin{aligned} u^{\text {e}}(x) := \left\{ \begin{array}{l l} u(x) &{}\quad \text {if} \ x \in \Omega , \\ u_{\lambda }(x) &{}\quad \text {if} \ x \in B(0,M) {\setminus } \Omega . \end{array}\right. \end{aligned}$$

Assume that \(u^{\text {e}}\) satisfies the hypotheses of [15, Lemma 8.1] in the case that \(n=3\). Then if \(\int _{\Omega } \det \nabla u \,dx= \int _{\Omega }\det \nabla u_{\lambda }\,dx\) or if \(h'(\lambda ^{3}) \le 0\), the inequality \(I(u) \ge I(u_{\lambda })\) holds.

Proof

By (3.6) it is enough to show that \(h'(\lambda ^3)\int _{\Omega }(\det \nabla u - \lambda ^3)\,dx \ge 0\). The argument which precedes Proposition 2.5 implies that the integral term is not greater than zero, which when coupled with the assumption \(h'(\lambda ^3)\le 0\) easily gives the desired inequality. \(\square \)

Let \(0<\lambda _{1}\le \lambda _2 \le \lambda _3\) be the singular values of \(\nabla u\) and define the vectors \(\Lambda :=(\lambda _{1},\lambda _{2},\lambda _{3})\) and \(\Lambda _{0}:=(\lambda ,\lambda ,\lambda )\). Recall that

$$\begin{aligned} |\nabla u - \lambda \mathbf{1}| \ge |\Lambda - \Lambda _{0}|; \end{aligned}$$

then (3.6) implies

$$\begin{aligned} I(u)-I(u_{\lambda }) \ge \int _{\Omega } \big (\kappa |\Lambda - \Lambda _{0}|^q+\gamma |\Lambda - \Lambda _{0}|^2 + h'(\lambda ^3)(\lambda _1\lambda _2\lambda _3 - \lambda ^3)\big )\,dx \end{aligned}$$
(3.7)

The next three results are devoted to the case \(h'(\lambda ^3)>0\).

Lemma 3.2

Let W be as in (3.1) and let \(u\in \mathcal A_\lambda \). Then

$$\begin{aligned} I(u) - I(u_{\lambda }) \ge \int _{\Omega } \big (\mathcal {F}_{1}^\lambda (\Lambda )+\mathcal {F}_{2}^\lambda (\Lambda )\big )\,dx, \end{aligned}$$
(3.8)

where

$$\begin{aligned} \mathcal {F}_{1}^\lambda (\Lambda ) := \kappa |\Lambda -\Lambda _{0}|^{q}+h'(\lambda ^3)({\lambda }_{1}-\lambda )({\lambda }_{2}-\lambda )({\lambda }_{3}-\lambda ) \end{aligned}$$

and

$$\begin{aligned} \mathcal {F}_{2}^\lambda (\Lambda ) := \gamma |\Lambda -\Lambda _{0}|^{2} + \lambda h'(\lambda ^{3}) \sum _{i < j} ({\lambda }_{i}-\lambda )({\lambda }_{j}-\lambda ). \end{aligned}$$

Proof

For brevity we write \(\hat{\lambda }_{i} := \lambda _{i} - \lambda \) for \(i=1,2,3\). It follows that

$$\begin{aligned} \det \nabla u -\lambda ^{3} = \hat{\lambda }_{1}\hat{\lambda }_{2}\hat{\lambda }_{3} + \lambda \sum _{i<j}\hat{\lambda }_{i} \hat{\lambda }_{j} + \lambda ^{2}\sum _{i=1}^{3}\hat{\lambda }_{i}. \end{aligned}$$
(3.9)

Inserting this into (3.7) gives

$$\begin{aligned} I(u)-I(u_{\lambda }) \ge \int _{\Omega }\left( \mathcal {F}^\lambda _{1}(\Lambda )+\mathcal {F}^\lambda _{2}(\Lambda )\right) \,dx+\lambda ^{2}h'(\lambda ^{3})\int _{\Omega } \sum _{i=1}^{3} \hat{\lambda }_{i} \,dx. \end{aligned}$$

Since the last integral may be written as

$$\begin{aligned} \int _{\Omega } \sum _{i=1}^{3} \hat{\lambda }_{i} \,dx = \int _{\Omega } (\lambda _{1}+\lambda _{2}+\lambda _{3}-3\lambda )\,dx, \end{aligned}$$

we can apply [2, Lemma 5.3] again to deduce that

$$\begin{aligned} \int _{\Omega } (\lambda _{1}+\lambda _{2}+\lambda _{3})\,dx \ge 3\lambda \, \mathcal {L}^3(\Omega ). \end{aligned}$$

Hence since \(h'(\lambda ^{3}) > 0\), (3.8) holds. \(\square \)
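The expansion (3.9) used in this proof is the elementary identity for a product of three shifted terms. As a sketch, writing \(\lambda _{i}=\lambda +\hat{\lambda }_{i}\) and using \(\det \nabla u=\lambda _{1}\lambda _{2}\lambda _{3}\),

$$\begin{aligned} \lambda _{1}\lambda _{2}\lambda _{3}&= (\lambda +\hat{\lambda }_{1})(\lambda +\hat{\lambda }_{2})(\lambda +\hat{\lambda }_{3}) \\&= \lambda ^{3} + \lambda ^{2}\big (\hat{\lambda }_{1}+\hat{\lambda }_{2}+\hat{\lambda }_{3}\big ) + \lambda \big (\hat{\lambda }_{1}\hat{\lambda }_{2}+\hat{\lambda }_{1}\hat{\lambda }_{3}+\hat{\lambda }_{2}\hat{\lambda }_{3}\big ) + \hat{\lambda }_{1}\hat{\lambda }_{2}\hat{\lambda }_{3}. \end{aligned}$$

The cubic and quadratic terms in the \(\hat{\lambda }_{i}\) are absorbed into \(\mathcal {F}_{1}^\lambda \) and \(\mathcal {F}_{2}^\lambda \) respectively, which leaves only the linear term to be controlled.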

By analogy with the strategy leading to Lemma 2.9, we now find conditions on \(\lambda \) in terms of \(\kappa \), \(\gamma \) and q ensuring that

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathcal {F}_{1}^\lambda (\Lambda ) \ge 0\\ \mathcal {F}_{2}^\lambda (\Lambda ) \ge 0 \end{array}\right. } \quad \text {for every}\quad \Lambda \in \mathbb {R}^{+++}, \end{aligned}$$

where \(\mathbb {R}^{+++} := \{x\in \mathbb {R}^3:x_{i} > 0 \ \text {for} \ i=1,2,3\}\).

Lemma 3.3

The functions \(\mathcal {F}_{1}^\lambda (\Lambda )\) and \(\mathcal {F}_{2}^\lambda (\Lambda )\) are pointwise nonnegative on \(\mathbb {R}^{+++}\) provided

$$\begin{aligned} \frac{\kappa }{h'(\lambda ^{3})\lambda ^{3-q} } \ge (q-2)^{(q-2)/2}q^{-q/2} \end{aligned}$$
(3.10)

and

$$\begin{aligned} \frac{\gamma }{\lambda h'(\lambda ^{3})} \ge \frac{1}{2}. \end{aligned}$$
(3.11)

Proof

In the following we let \(Y:=h'(\lambda ^3) > 0\) for brevity. We write

$$\begin{aligned} (\hat{\lambda }_{1},\hat{\lambda }_{2},\hat{\lambda }_{3}) = \rho (\cos \phi \sin \theta , \sin \phi \sin \theta , \cos \theta ), \end{aligned}$$

where \(\rho \ge 0\) and \(0 \le \theta \le \pi \), \(0 \le \phi \le 2\pi \). In terms of \(\rho , \theta \) and \(\phi \) we have \(\mathcal {F}_{1}^\lambda (\Lambda ) = F_{1}(\rho , \theta , \phi )\), where

$$\begin{aligned} F_{1}(\rho , \theta , \phi ) := \kappa \rho ^{q} + \frac{Y \rho ^{3}}{4} \sin 2 \phi \sin 2 \theta \sin \theta . \end{aligned}$$
(3.12)
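Formula (3.12) follows from the double-angle identity \(\sin 2x=2\sin x\cos x\); as a brief check, note that \(|\Lambda -\Lambda _{0}|=\rho \) and

$$\begin{aligned} \hat{\lambda }_{1}\hat{\lambda }_{2}\hat{\lambda }_{3}&= \rho ^{3}\,(\cos \phi \sin \phi )\,(\sin \theta \cos \theta )\,\sin \theta \\&= \rho ^{3}\cdot \frac{\sin 2\phi }{2}\cdot \frac{\sin 2\theta }{2}\cdot \sin \theta = \frac{\rho ^{3}}{4}\sin 2\phi \sin 2\theta \sin \theta . \end{aligned}$$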

Since the singular values of \(\nabla u\) are ordered as \(\lambda _{1} \le \lambda _{2} \le \lambda _{3}\) the same applies to the \(\hat{\lambda }_{i}\) for \(i=1,2,3\); hence in particular \(\hat{\lambda }_{1} \le \hat{\lambda }_{2}\). The latter implies \(\phi \in [\pi /4,5\pi /4]\). Now if \(\sin 2 \phi \cos \theta \ge 0\) then the stated result would be immediate from (3.12). Therefore we assume \(\sin 2\phi \cos \theta < 0\) in what follows, which in view of the restriction \(\pi /4 \le \phi \le 5 \pi /4\) implies either that \(\phi \in [\pi /2,\pi ]\) when \(\cos \theta > 0\) or that \(\phi \in [\pi /4,\pi /2] \cup [\pi ,5 \pi /4]\) when \(\cos \theta < 0\). For later use we will let S be the set of \((\theta ,\phi )\) satisfying these restrictions.

Let

$$\begin{aligned} \bar{\rho }(\theta , \phi ) := \inf \left\{ \rho > 0: \ F_{1}(\rho ,\theta , \phi ) =0\right\} \end{aligned}$$

and note that \(\bar{\rho }\) is well-defined because, in view of

$$\begin{aligned} F_{1}(\rho ,\theta ,\phi ) = \rho ^{q}\left( \kappa - \frac{Y\rho ^{3-q}}{4}\left| \sin 2 \phi \sin 2 \theta \sin \theta \right| \right) , \end{aligned}$$

where \(q < 3\), there is always at least one positive solution to the equation \(F_{1}(\rho ,\theta ,\phi )=0\). Moreover, it is clear that \(\bar{\rho }\) satisfies

$$\begin{aligned} \frac{4\kappa }{Y} {\bar{\rho }}^{q-3}(\theta ,\phi ) = |\sin 2 \phi \sin 2 \theta \sin \theta |. \end{aligned}$$
(3.13)

Next, let us call \(\rho ^{*}(\theta ,\phi ) \ge 0\) an exit radius if

$$\begin{aligned} \Lambda _{0} + \rho ^{*} (\cos \phi \sin \theta , \sin \phi \sin \theta , \cos \theta ) \in \partial \mathbb {R}^{+++}. \end{aligned}$$

Thus \(\rho ^{*}={\rho _{i}}^{*} > 0\) for at least one i, where

$$\begin{aligned} \lambda +{\rho _{1}}^{*} \sin \theta \cos \phi&= 0, \\ \lambda + {\rho _{2}}^{*}\sin \theta \sin \phi&= 0, \\ \lambda + {\rho _{3}}^{*} \cos \theta&= 0. \end{aligned}$$

In order that \(\mathcal {F}_{1}^\lambda (\Lambda ) \ge 0\) for \(\Lambda \in \mathbb {R}^{+++}\) it should now be clear that \(\bar{\rho }\) must exceed the largest exit radius, i.e., \(\bar{\rho }(\theta ,\phi )\ge \max \{{\rho _{1}}^{*},{\rho _{2}}^{*},{\rho _{3}}^{*}\}\) for each pair \((\theta , \phi )\) in S. Rearranging this, we obtain the following sufficient condition:

$$\begin{aligned} \frac{4 \kappa }{\lambda ^{3-q}Y} \ge \max \{s_{1},s_{2},s_{3}\}, \end{aligned}$$
(3.14)

where

$$\begin{aligned} s_{1}&:= \sup _{(\theta ,\phi ) \in S_{1}}\frac{|\sin 2 \phi \sin 2 \theta \sin \theta |}{|\cos \phi \sin \theta |^{3-q}}, \\ s_{2}&:= \sup _{(\theta , \phi ) \in S_{2}}\frac{|\sin 2 \phi \sin 2 \theta \sin \theta |}{|\sin \phi \sin \theta |^{3-q}} ,\\ s_{3}&:= \sup _{(\theta , \phi ) \in S_{3}}\frac{|\sin 2 \phi \sin 2 \theta \sin \theta |}{|\cos \theta |^{3-q}}. \end{aligned}$$

Here, \(S_{i}=\left\{ (\theta ,\phi ) \in S: \ {\rho _{i}}^{*} > 0\right\} \) for \(i=1,2,3\).

To find \(s_{1}\): Let

$$\begin{aligned} m_{1}(\theta , \phi ) := 4 |\sin \phi ||\cos \phi |^{q-2}|\cos \theta ||\sin \theta |^{q-1}, \end{aligned}$$

so that \(s_1=\max _{S_1}m_1\). Note that \({\rho _{1}}^{*} = -\lambda (\cos \phi \sin \theta )^{-1} > 0\) implies \(\pi /2 < \phi \le 5\pi /4\) (recall that \(\phi \in [\pi /4,5\pi /4]\)), which when combined with the restriction \((\theta , \phi ) \in S\) implies \(\phi \in [\pi /2,\pi ]\) when \(\cos \theta > 0\) or \(\phi \in [\pi ,5 \pi /4]\) when \(\cos \theta < 0\). Thus we need only consider these values of \(\phi \) when maximizing \(m_{1}(\theta ,\phi )\) over \(S_{1}\). Define \(f(\phi ) := |\sin \phi ||\cos \phi |^{q-2}\) and note that

$$\begin{aligned} \max _{S_{1}} m_{1} = 4 \max _{0 \le \theta \le \pi } \left| e(\theta )\right| \max _{[\pi /2,5\pi /4]} f(\phi ), \end{aligned}$$

where the function e is defined in (2.29) and its maximum is given by (2.30). Thus

$$\begin{aligned} \max _{S_{1}} m_{1} = 4 (q-1)^{(q-1)/2}q^{-q/2} \max _{[\pi /2,5\pi /4]} f(\phi ). \end{aligned}$$

A short calculation shows that f is maximized when \(\phi \) satisfies \(\cos \phi =-\left( (q-2)/(q-1)\right) ^{\frac{1}{2}}\), which is only possible when \(\phi \) belongs to \([\pi /2,3\pi /4]\). (It is easy to check that f is monotonic on \([\pi ,5\pi /4]\) and that its maximum in this range is smaller than the maximum over the range \([\pi /2,\pi ]\)). Hence

$$\begin{aligned} \max _{[\pi /2,5\pi /4]} f(\phi )= (q-1)^{-\frac{1}{2}}\left( \frac{q-2}{q-1}\right) ^{\frac{q-2}{2}}, \end{aligned}$$
(3.15)

which gives

$$\begin{aligned} \max _{S_{1}} m_{1} = 4 (q-2)^{(q-2)/2}q^{-q/2}. \end{aligned}$$

To find \(s_{2}\): We claim that \(s_{2}=s_{1}\). Let

$$\begin{aligned} m_{2}(\theta ,\phi ) := 4 |\sin \phi |^{q-2}|\cos \phi | |\sin \theta |^{q-1}|\cos \theta | \end{aligned}$$

and note that \(s_{2}=\max _{S_{2}}m_{2}\). By definition, \((\theta , \phi ) \in S_{2}\) are such that \(\rho _{2}^{*} > 0\), so \(\sin \phi <0\), from which (given that \((\theta , \phi ) \in S\)) it follows that \(\pi < \phi \le 5 \pi /4\). We have \(m_{2}(\theta ,\phi ) = 4\left| e(\theta )\right| \tilde{f}(\phi )\), where the function e was defined in (2.29) and

$$\begin{aligned} \tilde{f}(\phi ) = |\sin \phi |^{q-2}|\cos \phi |. \end{aligned}$$

It is straightforward to check that the maximum of the function \(\tilde{f}\) occurs at \(\phi \) such that \(\sin \phi = -\left( (q-2)/(q-1)\right) ^{\frac{1}{2}}\) and \(\cos \phi = -(q-1)^{-\frac{1}{2}}\), and that consequently \(\max \tilde{f}=\max f\), where f is as defined in the previous paragraph. It follows that \(s_{2}=s_{1}\).

To find \(s_{3}\): We claim \(s_{3}=s_{1}\). Let

$$\begin{aligned} m_{3}(\theta , \phi ) := 2 |\sin 2 \phi ||\cos \theta |^{q-2}\sin ^{2}\theta \end{aligned}$$

so that \(s_{3}=\max _{S_{3}}m_{3}\). Define \(r(\theta )= |\cos \theta |^{q-2}\sin ^{2}\theta \). Note that r is symmetric about \(\theta =\pi /2\), so it suffices to consider just its restriction to \([0,\pi /2]\). A short calculation shows that the maximum of r occurs at \(\theta \) satisfying \(\sin ^{2}\theta = 2/q\). Thus

$$\begin{aligned} \max _{S_{3}} m_{3} = 4 (q-2)^{(q-2)/2}q^{-q/2}. \end{aligned}$$
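The "short calculation" for r can be sketched as follows, on \([0,\pi /2]\) and using only the definition of r and \(2<q<3\):

$$\begin{aligned} r'(\theta )&= \cos ^{q-3}\theta \,\sin \theta \,\big (2\cos ^{2}\theta -(q-2)\sin ^{2}\theta \big ) = 0 \quad \Longleftrightarrow \quad \sin ^{2}\theta =\frac{2}{q},\ \ \cos ^{2}\theta =\frac{q-2}{q}, \\ \max r&= \left( \frac{q-2}{q}\right) ^{(q-2)/2}\frac{2}{q} = 2(q-2)^{(q-2)/2}q^{-q/2}. \end{aligned}$$

Since \(|\sin 2\phi |\le 1\), with equality at, for example, \(\phi =\pi /4\) (an admissible value when \(\cos \theta <0\)), this gives \(\max _{S_{3}}m_{3}=2\cdot 1\cdot 2(q-2)^{(q-2)/2}q^{-q/2}\), in agreement with the value displayed above.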

Condition (3.10) follows by inserting \(s_{1}\) into (3.14).

Finally, (3.11) follows by writing \(\mathcal {F}^\lambda _{2}\) in terms of the coordinates \(\Lambda =\Lambda _{0} + \rho (l_{1},l_{2},l_{3})\) where \(l_{1}^{2}+l_{2}^{2}+l_{3}^{2}=1\), giving

$$\begin{aligned} \mathcal {F}_{2}^\lambda (\Lambda ) = \rho ^{2} \left( \gamma + \lambda h'(\lambda ^3) \big (l_{1}l_{2}+l_{1}l_{3}+l_{2}l_{3}\big )\right) . \end{aligned}$$

The minimum of \(l_{1}l_{2}+l_{1}l_{3}+l_{2}l_{3}\) among unit vectors \((l_{1},l_{2},l_{3})\) is \(-1/2\). Hence \(\mathcal {F}^\lambda _{2}\) is pointwise nonnegative provided (3.11) holds. \(\square \)
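The value \(-1/2\) quoted at the end of the proof can be confirmed by expanding a square; the following is a brief sketch for unit vectors \((l_{1},l_{2},l_{3})\):

$$\begin{aligned} 0\le (l_{1}+l_{2}+l_{3})^{2} = 1 + 2\big (l_{1}l_{2}+l_{1}l_{3}+l_{2}l_{3}\big ) \quad \Longrightarrow \quad l_{1}l_{2}+l_{1}l_{3}+l_{2}l_{3}\ge -\frac{1}{2}, \end{aligned}$$

with equality exactly when \(l_{1}+l_{2}+l_{3}=0\), for instance at \((l_{1},l_{2},l_{3})=(1,-1,0)/\sqrt{2}\).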

Remark 3.4

It is worth pointing out that the quadratic term in the definition of W cannot be omitted if our method of proof is to work. Nor could this be remedied by considering any adjusted form of \(F_{1}, F_{2}\), such as

$$\begin{aligned} \widehat{F}_{1}(\rho ,\theta ,\phi ) := \rho ^{q}\mu \kappa - \frac{Y\rho ^{3}}{4}|\sin 2 \phi \sin 2 \theta \sin \theta | \end{aligned}$$

and

$$\begin{aligned} \widehat{F}_{2}(\rho ,\theta ,\phi ): = \rho ^q (1-\mu )\kappa + \lambda \rho ^2 h'(\lambda ^3) \big (l_{1}l_{2}+l_{1}l_{3}+l_{2}l_{3}\big ), \end{aligned}$$

for some \(\mu \in (0,1)\). Indeed, since \(q>2\), the first term in \(\widehat{F}_{2}\) would be dominated by the term involving \(l_1l_2+l_1l_3+l_2l_3\) for sufficiently small \(\rho \), and this would prevent the pointwise inequality \(\widehat{F}_{2} \ge 0\).

The foregoing results imply a three dimensional analogue of Theorem 2.10:

Theorem 3.5

Let the stored energy function \(W: \mathbb {R}^{3 \times 3} \rightarrow [0,+\infty ]\) be given by

$$\begin{aligned} W(A):=|A|^{q}+\gamma |A|^{2} + Z(\mathrm{cof}\,A)+h(\det A), \end{aligned}$$

where \(2<q< 3\), \(\gamma >0\) is a fixed constant, \(Z:\mathbb {R}^{3\times 3} \rightarrow [0,+\infty )\) is convex and \(C^1\), and \(h:\mathbb {R}\rightarrow [0,+\infty ]\) satisfies (H1)–(H3). Let \(\lambda >0\) be such that

$$\begin{aligned} \frac{\kappa }{h'(\lambda ^{3})\lambda ^{3-q} } \ge (q-2)^{(q-2)/2}q^{-q/2}, \end{aligned}$$
(3.16)

where \(\kappa \) is as per (3.3) and

$$\begin{aligned} \frac{\gamma }{\lambda h'(\lambda ^3)} \ge \frac{1}{2}. \end{aligned}$$
(3.17)

Then any \(u \in \mathcal {A}_{\lambda }\) satisfies \(I(u) \ge I(u_{\lambda })\).

Let us briefly compare the result of Theorem 3.5 with [14, Theorem 4.1]. The latter asserts that under suitable smoothness and convexity assumptions on h, a linear deformation \(u(x)=Lx\), \(u: \Omega \rightarrow \mathbb {R}^3\), is a global minimizer of I provided

$$\begin{aligned} h'(\det L) |L|^{3-q} \le \frac{c_{1}}{\alpha }. \end{aligned}$$

Here, \(\alpha \) and \(c_{1}\) are constants which arise in their careful analysis (see [14, Section 3, Remark 2]). Inequalities (3.16) and (3.17) say, in the particular case \(L=\lambda \mathbf{1}\), that the affine map \(u_{\lambda }\) is a global minimizer of I provided

$$\begin{aligned} h'(\det L) |L|^{3-q} \le \min \big \{3^{(3-q)/2}(q-2)^{(2-q)/2}q^{q/2}\kappa ,\, 2(3^{(3-q)/2})\,\lambda ^{2-q}\gamma \big \}. \end{aligned}$$

Thus our result mirrors that of [14] and it produces constants which are explicit up to the inequality (3.3) obeyed by \(\kappa \). In fact (see Footnote 5), \(\kappa \) varies very nearly linearly as a function of q on the interval [2, 3], the approximation \(\kappa (q)\approx 3-q + (2-\sqrt{2})(q-2)\) being accurate to within 0.025 for q in (2, 3) and exact at the endpoints.
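The rewriting of (3.16) and (3.17) in terms of L used in this comparison can be sketched as follows, assuming that \(|\cdot |\) denotes the Euclidean (Frobenius) norm, so that \(|L|=\sqrt{3}\lambda \) and \(\det L=\lambda ^{3}\) when \(L=\lambda \mathbf{1}\), and that \(h'(\lambda ^{3})>0\):

$$\begin{aligned} (3.16)&\ \Longleftrightarrow \ h'(\lambda ^{3})\lambda ^{3-q} \le (q-2)^{(2-q)/2}q^{q/2}\kappa , \\ (3.17)&\ \Longleftrightarrow \ h'(\lambda ^{3})\lambda \le 2\gamma \ \Longleftrightarrow \ h'(\lambda ^{3})\lambda ^{3-q} \le 2\lambda ^{2-q}\gamma , \\ h'(\det L)|L|^{3-q}&= 3^{(3-q)/2}\,h'(\lambda ^{3})\lambda ^{3-q}. \end{aligned}$$

Multiplying the first two inequalities by \(3^{(3-q)/2}\) and taking the smaller of the two right-hand sides gives the stated bound.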

3.1 Error estimates

In the three dimensional case error estimates follow an analogous pattern to those given in Sect. 2.1, as we now show. Let \(\lambda >0\) be such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \frac{\kappa }{h'(\lambda ^{3})\lambda ^{3-q} } > (q-2)^{(q-2)/2}q^{-q/2}, \\ \\ \displaystyle \frac{\gamma }{\lambda h'(\lambda ^{3})} \ge \frac{1}{2}. \end{array}\right. } \end{aligned}$$
(3.18)

Theorem 3.6

Assume that (3.18) holds. Then there is a constant \(c=c(\Omega ,\lambda ,q)>0\) such that for every \(u \in \mathcal {A}_{\lambda }\)

$$\begin{aligned} \int _{\Omega }|\nabla u - \lambda \mathbf{1}|^q\,dx \le c\, \delta (u), \end{aligned}$$
(3.19)

where \(\delta (u):=I(u)-I(u_{\lambda })\).

Proof

Throughout this proof c denotes a generic strictly positive constant possibly depending on \(\Omega \), \(\lambda \), and q.

The second inequality in (3.18) ensures that

$$\begin{aligned} \int _\Omega \mathcal {F}_1^\lambda (\Lambda )\,dx\le \delta (u) \quad \text {for every}\quad u\in \mathcal A_\lambda , \end{aligned}$$
(3.20)

while the first (strict) inequality in (3.18) yields

$$\begin{aligned} \mathcal {F}_1^\lambda (\Lambda )\ge c|\Lambda -\Lambda _0|^q \quad \text {on}\, \mathbb {R}^{+++}, \end{aligned}$$
(3.21)

for some \(c>0\). To prove (3.21) we make use of the same notation as in the proof of Lemma 3.3. Let \(\epsilon >0\) and observe that

$$\begin{aligned} \mathcal {F}_1^\lambda (\Lambda )=&F_{1}(\rho , \theta , \phi ) = \kappa \rho ^{q} - \frac{Y \rho ^{3}}{4} |\sin 2 \phi \sin 2 \theta \sin \theta | \\ =&(\kappa -\epsilon )\rho ^{q} - \frac{Y \rho ^{3}}{4} |\sin 2 \phi \sin 2 \theta \sin \theta | +\epsilon \rho ^q. \end{aligned}$$

By applying the reasoning in the proof of Lemma 3.3 to the function

$$\begin{aligned} \tilde{F}_{1}(\rho , \theta , \phi ):= (\kappa -\epsilon )\rho ^{q} - \frac{Y \rho ^{3}}{4} |\sin 2 \phi \sin 2 \theta \sin \theta |, \end{aligned}$$

we see that \(\tilde{F}_{1}\ge 0\) provided

$$\begin{aligned} \frac{\kappa -\epsilon }{\lambda ^{3-q}Y}\ge (q-2)^{(q-2)/2}q^{-q/2}. \end{aligned}$$
(3.22)

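Since \(Y:=h'(\lambda ^3)\), the first (strict) inequality in (3.18) guarantees that (3.22) holds once \(\epsilon >0\) is chosen sufficiently small. For such \(\epsilon \) we have \(\mathcal {F}_1^\lambda (\Lambda )=\tilde{F}_{1}+\epsilon \rho ^q\ge \epsilon \rho ^q\), which is (3.21) expressed in the coordinates of Lemma 3.3.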

Gathering (3.20), (3.21) and recalling that

$$\begin{aligned} |\Lambda -\Lambda _0| = \text {dist}\,(\nabla u,\lambda \,SO(3)), \end{aligned}$$

we thus obtain

$$\begin{aligned} \int _\Omega \text {dist}\,^q\left( \nabla u,\lambda \,SO(3)\right) \,dx\le c \delta (u) \quad \text {for every}\; u\in \mathcal A_\lambda . \end{aligned}$$
(3.23)

Then invoking the rigidity estimate [8, Theorem 3.1] we find \(c=c(\Omega )>0\) such that for every \(u\in \mathcal A_\lambda \) there is a constant rotation \(R\in SO(3)\) satisfying

$$\begin{aligned} \int _\Omega |\nabla u-\lambda R|^q \,dx\le c\, \delta (u). \end{aligned}$$
(3.24)

We now claim that

$$\begin{aligned} |\mathbf{1}-R|^q\le c\, \delta (u). \end{aligned}$$

Combining Jensen’s inequality with (3.24) gives

$$\begin{aligned} \left( \int _\Omega |\nabla u -\lambda R|\,dx\right) ^q \le c\,\delta (u). \end{aligned}$$
(3.25)
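The Jensen step can be spelled out as follows; this is only a sketch, in which \(f:=|\nabla u-\lambda R|\) is shorthand introduced here for convenience. Since \(t\mapsto t^{q}\) is convex on \([0,\infty )\) for \(q\ge 1\), Jensen's inequality applied to the normalised measure \(dx/\mathcal {L}^3(\Omega )\) gives

$$\begin{aligned} \left( \frac{1}{\mathcal {L}^3(\Omega )}\int _\Omega f\,dx\right) ^q \le \frac{1}{\mathcal {L}^3(\Omega )}\int _\Omega f^q\,dx, \quad \text {that is,}\quad \left( \int _\Omega f\,dx\right) ^q \le \mathcal {L}^3(\Omega )^{q-1}\int _\Omega f^q\,dx \le c\,\delta (u), \end{aligned}$$

where the last inequality is (3.24) and the factor \(\mathcal {L}^3(\Omega )^{q-1}\) has been absorbed into the generic constant c.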

Set \(\tilde{u}:= u/\lambda \) and \(\tilde{z}:=\frac{1}{\mathcal {L}^3(\Omega )}\int _\Omega (\tilde{u}-Rx)\,dx\). Then by Poincaré’s inequality together with the continuity of the trace operator we obtain

$$\begin{aligned} \int _{\partial \Omega }|\tilde{u} -Rx -\tilde{z}|\,d\mathcal H^2 \le c \int _\Omega |\nabla \tilde{u}-R|\,dx, \end{aligned}$$

and hence, since \(\tilde{u}=x\) on \(\partial \Omega \), we deduce

$$\begin{aligned} \int _{\partial \Omega }\left| (\mathbf{1}-R)x -\tilde{z}\right| \,d\mathcal H^2 \le c \int _\Omega |\nabla \tilde{u}-R|\,dx. \end{aligned}$$
(3.26)

Arguing as in the proof of [1, Lemma 3.3], we apply [1, Lemma 3.2] to deduce that there exists a universal constant \(\sigma >0\) such that

$$\begin{aligned} |\mathbf{1}-R|\le \sigma \min _{z\in \mathbb {R}^3}\int _{\partial \Omega }|(\mathbf{1}-R)x -z|\,d\mathcal H^2. \end{aligned}$$
(3.27)

Combining (3.26) and (3.27) gives

$$\begin{aligned} |\mathbf{1}-R|\le & {} c \int _\Omega |\nabla \tilde{u}-R|\,dx\\= & {} \frac{c}{\lambda } \int _\Omega |\nabla u-\lambda R|\,dx, \end{aligned}$$

and therefore by (3.25) we achieve

$$\begin{aligned} |\mathbf{1}-R|^q \le c \left( \int _\Omega |\nabla u-\lambda R|\,dx\right) ^q \le c\, \delta (u), \end{aligned}$$
(3.28)

as claimed.

Finally, choosing R as in (3.24) and combining the latter with (3.28) implies

$$\begin{aligned} \int _\Omega |\nabla u-\lambda \mathbf{1}|^q\,dx= & {} \int _\Omega |\nabla u-\lambda R+ \lambda R-\lambda \mathbf{1}|^q\,dx\nonumber \\\le & {} c\left( \int _\Omega |\nabla u-\lambda R|^q\,dx+\lambda ^q\,|\mathbf{1}-R|^q\right) \nonumber \\\le & {} c\,\delta (u), \end{aligned}$$

which is (3.19). \(\square \)
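For the reader's convenience, the middle inequality in the last display of the proof can be justified as follows. This is a sketch: the explicit constants below are ours and are absorbed into c in the proof. For matrices A, B and \(q\ge 1\), the convexity of \(t\mapsto t^{q}\) gives \(|A+B|^q\le 2^{q-1}\left( |A|^q+|B|^q\right) \). Taking \(A=\nabla u-\lambda R\) and \(B=\lambda (R-\mathbf{1})\), which is constant in x, and integrating over \(\Omega \) yields

$$\begin{aligned} \int _\Omega |\nabla u-\lambda \mathbf{1}|^q\,dx \le 2^{q-1}\left( \int _\Omega |\nabla u-\lambda R|^q\,dx + \mathcal {L}^3(\Omega )\,\lambda ^q\,|\mathbf{1}-R|^q\right) . \end{aligned}$$

The first term on the right is controlled by (3.24) and the second by (3.28).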

Remark 3.7

If \(\lambda \) satisfies (3.18), then (3.19) shows that, in this case too, \(u_\lambda \) is the unique global minimiser of I among all maps u in \(\mathcal A_\lambda \) and, moreover, that \(u_\lambda \) lies in a potential well.

We end this section by remarking that condition (3.17) can be removed from the statement of Theorem 3.5 if a certain conjecture holds, namely that the function

$$\begin{aligned} A \mapsto P(A) := \sum _{i < j} \lambda _{i}(A)\lambda _{j}(A) - \lambda \sum _{i=1}^{3}\lambda _{i}(A) \end{aligned}$$

is quasiconvex at \(\lambda \mathbf{1}\). (Here \(\lambda _i(A)\), \(i=1,2,3\), denote, as usual, the singular values of \(A\in \mathbb {R}^{3\times 3}\).) Standard results (see, e.g., [6, Theorem 5.39 (ii)]) imply that

$$\begin{aligned} A \mapsto \sum _{i < j} \lambda _{i}(A)\lambda _{j}(A) \end{aligned}$$

is polyconvex and hence quasiconvex, but it remains to be seen whether subtracting the term \(\lambda \sum _{i=1}^{3}\lambda _{i}(A)\) destroys the quasiconvexity at \(\lambda \mathbf{1}\). We conjecture that it does not.

To see why the quasiconvexity of P at \(\lambda \mathbf{1}\) might matter, note that from (3.9) we can write

$$\begin{aligned} \det \nabla u - \lambda ^{3} = \hat{\lambda }_{1}\hat{\lambda }_{2}\hat{\lambda }_{3} + \lambda \sum _{i < j} \hat{\lambda }_{i}\hat{\lambda }_{j} + \lambda ^2 \sum _{i=1}^{3}\hat{\lambda }_{i}. \end{aligned}$$

Recalling that \(\hat{\lambda }_{i} := \lambda _{i}-\lambda \) for \(i=1,2,3\), where each \(\lambda _{i}\) is as before, the quadratic and linear terms in the last line can be expanded and recast as

$$\begin{aligned} \lambda \sum _{i < j} \hat{\lambda }_{i}\hat{\lambda }_{j} + \lambda ^2 \sum _{i=1}^{3}\hat{\lambda }_{i} = \lambda \sum _{i<j}\lambda _{i}\lambda _{j}- \lambda ^2 \sum _{i=1}^{3}\lambda _{i}, \end{aligned}$$

whose right-hand side we recognise as \(\lambda P(\nabla u)\). In summary, we have shown that

$$\begin{aligned} \det \nabla u - \lambda ^{3} = \hat{\lambda }_{1}\hat{\lambda }_{2}\hat{\lambda }_{3}+ \lambda P(\nabla u). \end{aligned}$$
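The recasting of the quadratic and linear terms can be verified by direct expansion. The shorthand \(s_{1}:=\sum _{i=1}^{3}\lambda _{i}\) and \(s_{2}:=\sum _{i<j}\lambda _{i}\lambda _{j}\) is introduced only for this check. Since each index belongs to exactly two of the three pairs \(i<j\),

$$\begin{aligned} \sum _{i<j}\hat{\lambda }_{i}\hat{\lambda }_{j} = \sum _{i<j}(\lambda _{i}-\lambda )(\lambda _{j}-\lambda ) = s_{2}-2\lambda s_{1}+3\lambda ^{2}, \qquad \sum _{i=1}^{3}\hat{\lambda }_{i} = s_{1}-3\lambda , \end{aligned}$$

and hence

$$\begin{aligned} \lambda \left( s_{2}-2\lambda s_{1}+3\lambda ^{2}\right) + \lambda ^{2}\left( s_{1}-3\lambda \right) = \lambda s_{2}-\lambda ^{2}s_{1} = \lambda \left( s_{2}-\lambda s_{1}\right) , \end{aligned}$$

which is \(\lambda P(\nabla u)\) by the definition of P.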

Inserting this into (3.6) gives (on dropping the term with prefactor \(\gamma \), since it will no longer be needed)

$$\begin{aligned} I(u)-I(u_{\lambda })\ge & {} \int _\Omega \big (\kappa |\nabla u -\lambda \mathbf{1}|^{q} + h'(\lambda ^3) \hat{\lambda }_{1}\hat{\lambda }_{2}\hat{\lambda }_{3} \big )\,dx + \lambda h'(\lambda ^{3})\int _{\Omega } P(\nabla u) \,dx\\= & {} \int _{\Omega } \mathcal {F}_{1}^\lambda (\Lambda )\,dx + \lambda h'(\lambda ^{3}) \int _{\Omega } P(\nabla u) \,dx. \end{aligned}$$

If P were quasiconvex at \(\lambda \mathbf{1}\) then the second integral would by definition satisfy

$$\begin{aligned} \int _{\Omega } P(\nabla u) \,dx \ge \int _{\Omega } P(\lambda \mathbf{1}) \,dx\end{aligned}$$

for any Lipschitz u which agrees with \(u_{\lambda }\) on the boundary of \(\Omega \). This, when coupled with a straightforward approximation argument based on the estimate (see Footnote 6)

$$\begin{aligned} \left| P(A)\right| \le |A|^{2}+3 \lambda |A|, \end{aligned}$$

further implies

$$\begin{aligned} \int _{\Omega } P(\nabla u) \,dx \ge \int _{\Omega } P(\lambda \mathbf{1}) \,dx\end{aligned}$$

for any u in \(W^{1,q}(\Omega )\), \(q \ge 2\), whose trace agrees with \(u_{\lambda }\) on \(\partial \Omega \). Finally, a short calculation shows that \(P(\lambda \mathbf{1})=0\), so that the right-hand side of the last inequality vanishes. Thus the only condition needed in order to conclude that \(I(u) \ge I(u_{\lambda })\) would be (3.16), which ensures the non-negativity of the integral involving \(\mathcal {F}^\lambda _{1}\).
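The short calculation referred to above runs as follows. All three singular values of \(\lambda \mathbf{1}\) equal \(\lambda \), so the definition of P gives

$$\begin{aligned} P(\lambda \mathbf{1}) = \sum _{i<j}\lambda \cdot \lambda - \lambda \sum _{i=1}^{3}\lambda = 3\lambda ^{2}-3\lambda ^{2} = 0. \end{aligned}$$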