1 Introduction

In this article we are concerned with the detailed analysis of certain convex integration solutions which arise in the modeling of solid-solid, diffusionless phase transformations in shape-memory materials. We seek to precisely analyze the regularity properties of these constructions in a simple, two-dimensional, geometrically linear model case.

Shape-memory materials undergo a solid-solid, diffusionless phase transition upon temperature change (see e.g. [6] and the references given there): in the high temperature phase, the austenite phase, the materials form very symmetric lattices. Upon cooling, the symmetry of the lattice is reduced and the material transforms into the martensitic phase. Due to the loss of symmetry, there are different variants of martensite, which make these materials very flexible at low temperatures and give rise to a variety of different microstructures. Mathematically, it has proven very successful to model this behavior variationally in a continuum framework as the following minimization problem [4]:

$$\begin{aligned} \min \int _{\varOmega } W(\nabla y, \theta ) dx. \end{aligned}$$

Here \(\varOmega \subset \mathbb{R}^{n}\) is the reference configuration of the undeformed material. The mapping \(y:\varOmega \rightarrow \mathbb{R} ^{n}\) describes the deformation of the material with respect to the reference configuration. It is assumed to be of a suitable Sobolev regularity. The function \(W:\mathbb{R}^{n\times n} \times \mathbb{R} \rightarrow [0,\infty )\) denotes the energy density of a given deformation gradient \(M\in \mathbb{R}^{n \times n}\) at a certain temperature \(\theta \in \mathbb{R}\). Due to frame indifference, \(W\) is required to be invariant with respect to rotations, i.e.,

$$\begin{aligned} W(QM,\theta ) = W(M,\theta )\quad \mbox{for all } Q\in \mathit{SO}(n),\ \theta \in \mathbb{R},\ M\in \mathbb{R}^{n\times n}. \end{aligned}$$

As a model for shape-memory materials, the energy density further reflects the physical properties of these materials. In particular, it is assumed that at high temperatures \(\theta > \theta _{c}\) the energy density \(W\) has a single minimum (modulo \(\mathit{SO}(n)\) symmetry), which (upon normalization) we may assume to be given by the \(\mathit{SO}(n)\) orbit of \(\alpha (\theta ) \mathit{Id}\), where \(\alpha : \mathbb{R} \rightarrow (0,\infty )\) with \(\alpha (\theta _{c})=1\) (cf. [3]). This is the (austenite) energy well at temperature \(\theta \). Upon lowering the temperature below the critical temperature \(\theta _{c}\), the function \(W\) displays a (discrete) multi-well behavior (modulo \(\mathit{SO}(n)\)): there exist finitely many matrices \(U_{1}(\theta ),\ldots ,U_{m}(\theta ) \in \mathbb{R}^{n\times n}_{+}\), \(m\in \mathbb{N}\), such that

$$\begin{aligned} W(M,\theta ) = 0\quad \Leftrightarrow\quad M \in \bigcup _{j=1}^{m}\mathit{SO}(n) U _{j}(\theta ). \end{aligned}$$

The matrices \(U_{j}(\theta )\) represent the variants of martensite at temperature \(\theta <\theta _{c}\) and are referred to as the (martensite) energy wells. At the critical temperature \(\theta = \theta _{c}\) both the austenite and the martensite wells are energy minimizers.

In the sequel, we assume that \(\theta <\theta _{c}\) is fixed, so that only the variants of martensite are energy minimizers. We seek to study the quantitative behavior of minimizers for energies of the type (1). Here we make the following simplifications:

  1. (i)

Reduction to the \(m\)-well problem. Instead of studying the full variational problem (1), we focus only on exact minimizers. Restricting to the low temperature regime, this implies that we seek solutions to the differential inclusion

    $$\begin{aligned} \nabla y \in \bigcup_{j=1}^{m} \mathit{SO}(n) U_{j}(\theta ), \end{aligned}$$

    for some \(\theta < \theta _{c}\).

  2. (ii)

Small deformation gradient case, geometric linearization. We further modify (2) and assume that \(\nabla y\) is close to the identity. This allows us to linearize the problem around this constant value (cf. Chap. 11 in [6] and also [5] for a comparison of the linearized and the nonlinear theories). Instead of considering (2), we are thus led to the inclusion problem

    $$\begin{aligned} e(\nabla u):=\frac{\nabla u + (\nabla u)^{T}}{2} \in \{e_{1},\dots ,e _{m}\}. \end{aligned}$$

The symmetrized gradient \(e(\nabla u)\) represents the infinitesimal displacement strain associated with the displacement \(u\), which is defined as \(u(x):=y(x)-x\). The symmetric matrices \(e_{1},\ldots ,e_{m} \in \mathbb{R}^{n\times n}\) are the exactly stress-free strains representing the variants of martensite. After rescaling, we may assume that they are of size of order one. While this procedure linearizes the geometry of the problem (by replacing the symmetry group \(\mathit{SO}(n)\) by an invariance with respect to the linear space \(\operatorname{Skew}(n)\)), the differential inclusion (3) preserves the inherent physical nonlinearity which arises from the multi-well structure of the problem. In order to ensure the validity of the geometric linearization assumption, in the sequel we will pay particular attention to deriving solutions with bounded displacement gradients of order one, which hence have skew parts of order one (recall the normalization that our exactly stress-free strains are of order one). For more detailed comments on this, we refer to the discussion of the \(L^{\infty }\) bounds for \(\nabla u\) below (Q2), to (ii) in Sect. 1.3 and to Algorithm 30 in Sect. 3.2 and the discussion following it.

  3. (iii)

Reduction to two dimensions and the hexagonal-to-rhombic phase transformation. Seeking a model case that is as simple as possible, we restrict attention to two dimensions and to a specific two-dimensional phase transformation, the hexagonal-to-rhombic phase transformation (this serves for instance as a model in studying materials such as \(\mbox{Mg}_{2}\mbox{Al}_{4} \mbox{Si}_{5}\mbox{O}_{18}\) or Mg-Cd alloys undergoing a (three-dimensional) hexagonal-to-orthorhombic transformation, cf. [11, 34], and also for closely related materials such as \(\mbox{Pb}_{3}(\mbox{VO}_{4})_{2}\), which undergo a (three-dimensional) hexagonal-to-monoclinic transformation, cf. [11, 40, 41]). From a microscopic point of view, the hexagonal-to-rhombic phase transformation occurs if a hexagonal atomic lattice is transformed into a rhombic atomic lattice. From a continuum point of view, we model it via solutions to the differential inclusion

    $$ \begin{aligned} & u: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}, \\ &\frac{1}{2}\bigl(\nabla u + (\nabla u)^{T}\bigr) \in K \quad\mbox{a.e. in }\varOmega , \end{aligned} $$

    where \(\varOmega \subset \mathbb{R}^{2}\) is a bounded Lipschitz domain and

    $$ \begin{aligned} &K:=\bigl\{ e^{(1)}, e^{(2)}, e^{(3)}\bigr\} \quad\mbox{with} \\ &e^{(1)}:= \begin{pmatrix} 1 & 0 \\ 0& -1 \end{pmatrix} ,\quad e^{(2)}:= \frac{1}{2} \begin{pmatrix} -1 & \sqrt{3} \\ \sqrt{3}& 1 \end{pmatrix} ,\quad e^{(3)}:= \frac{1}{2} \begin{pmatrix} -1 & -\sqrt{3} \\ -\sqrt{3}& 1 \end{pmatrix} . \end{aligned} $$

We note that all the matrices in \(K\) are trace-free, which corresponds to the (infinitesimal) volume preservation of the transformation. We further note that the set \(K\) is “large”: its convex hull is a two-dimensional set in the three-dimensional ambient space of two-by-two symmetric matrices, i.e., it spans the full two-dimensional plane of symmetric, trace-free matrices (cf. Lemma 10).
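As a numerical sanity check (an illustrative sketch using numpy, not part of the construction), one can verify that the three wells are symmetric and trace-free, that their barycenter is the zero matrix (the special boundary datum \(M=0\) discussed below), and that their convex hull is a non-degenerate triangle:

```python
import numpy as np

s3 = np.sqrt(3.0)
e1 = np.array([[1.0, 0.0], [0.0, -1.0]])
e2 = 0.5 * np.array([[-1.0, s3], [s3, 1.0]])
e3 = 0.5 * np.array([[-1.0, -s3], [-s3, 1.0]])

# All three wells are symmetric and trace-free (infinitesimal volume preservation).
for e in (e1, e2, e3):
    assert np.allclose(e, e.T) and abs(np.trace(e)) < 1e-12

# Their barycenter is the zero matrix.
assert np.allclose(e1 + e2 + e3, 0.0)

# In the coordinates (E_11, E_12) of a trace-free symmetric 2x2 matrix E, the
# difference vectors e2 - e1 and e3 - e1 are linearly independent, so conv(K)
# is a non-degenerate (two-dimensional) triangle.
coords = lambda e: np.array([e[0, 0], e[0, 1]])
D = np.column_stack([coords(e2 - e1), coords(e3 - e1)])
assert np.linalg.matrix_rank(D) == 2
```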

In the sequel, we study the problem (4), (5) and investigate regularity properties of its solutions.

1.1 Main Result

The geometrically linearized hexagonal-to-rhombic phase transformation is a very flexible transformation, which allows for numerous exact solutions to the associated three-well problem (4) with different types of boundary data. Here the simplest possible solutions are so-called simple laminates, for which the strain is a one-dimensional function \(e(\nabla u)(x)=f(x\cdot n)\) for some vector \(n \in S^{1}\) and for which

$$ f(x\cdot n) \in \bigl\{ e^{(i_{1})}, e^{(i_{2})}\bigr\} \quad \mbox{a.e. in } \varOmega , \ i_{1}, i_{2} \in \{1,2,3\} \mbox{ and } i_{1} \neq i_{2}, $$

i.e., \(e(\nabla u)\) only attains two values. The possible directions of these laminates, as given by the vector \(n \in S^{1}\), are (up to sign reversal) six discrete values which arise as the symmetrized rank-one directions between the energy wells: for each \(i_{1},i_{2} \in \{1,2,3\}\) with \(i_{1} \neq i_{2}\) there exists (up to sign reversal, exchange of the roles of \(a_{i_{1},i_{2}}\) and \(n_{i_{1},i_{2}}\), and renormalization) exactly one pair \((a_{i_{1},i_{2}},n_{i_{1},i_{2}}) \in (\mathbb{R}^{2} \setminus \{0\})\times S^{1}\) with the property that

$$\begin{aligned} e^{(i_{1})} - e^{(i_{2})} = a_{i_{1},i_{2}} \odot n_{i_{1},i_{2}} := \frac{1}{2}(a_{i_{1}, i_{2}}\otimes n_{i_{1},i_{2}} + n_{i_{1},i_{2}} \otimes a_{i_{1},i_{2}}). \end{aligned}$$

The possible vectors are collected in Lemma 16.
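Such a pair can be computed explicitly from the spectral decomposition of the difference matrix: writing \(a\) and \(n\) as combinations of the eigenvectors, the cross terms of the symmetrized product cancel. The following sketch (the helper `sym_rank_one` is our own illustration, not from the text) carries this out for the pair \((e^{(1)}, e^{(2)})\):

```python
import numpy as np

def sym_rank_one(E, tol=1e-12):
    """Decompose a symmetric 2x2 matrix E with det(E) <= 0 as
    E = (a x n + n x a)/2 with |n| = 1, via the spectral decomposition."""
    lam, V = np.linalg.eigh(E)                # lam[0] <= lam[1]
    assert lam[0] <= tol and lam[1] >= -tol, "needs det(E) <= 0"
    v_minus, v_plus = V[:, 0], V[:, 1]
    # With a = sp*v+ + sm*v- and n proportional to sp*v+ - sm*v-, the cross
    # terms cancel: a (.) n = lam[1] v+ x v+ + lam[0] v- x v- = E.
    sp, sm = np.sqrt(max(lam[1], 0.0)), np.sqrt(max(-lam[0], 0.0))
    a, n = sp * v_plus + sm * v_minus, sp * v_plus - sm * v_minus
    scale = np.linalg.norm(n)
    return a * scale, n / scale               # rescale so that |n| = 1

s3 = np.sqrt(3.0)
e1 = np.array([[1.0, 0.0], [0.0, -1.0]])
e2 = 0.5 * np.array([[-1.0, s3], [s3, 1.0]])

a, n = sym_rank_one(e1 - e2)
assert np.allclose(0.5 * (np.outer(a, n) + np.outer(n, a)), e1 - e2)
assert abs(np.linalg.norm(n) - 1.0) < 1e-9
```

Note that a symmetrized rank-one decomposition of a symmetric matrix exists precisely when its determinant is non-positive, which here holds for all three pairwise differences of the wells.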

In addition to these “simple” constructions, there are further exact solutions to the three-well problem associated with the hexagonal-to-rhombic phase transformation, e.g., there are patterns involving all three variants as depicted in Figs. 24 and 25 in the Appendix.

In the sequel, we study solutions to the hexagonal-to-rhombic phase transformation with affine boundary conditions, i.e., we consider \(u\in W^{1,p}_{\text{loc}}(\mathbb{R}^{2})\) with \(p\in (2,\infty ]\) such that

$$ \begin{aligned} & u: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}, \\ &\nabla u = M\quad \mbox{a.e. in } \mathbb{R}^{2}\setminus \varOmega , \\ &\frac{1}{2}\bigl(\nabla u + (\nabla u)^{T}\bigr) \in K \quad\mbox{a.e. in }\varOmega . \end{aligned} $$

Here we investigate the rigidity/non-rigidity of the problem by asking whether it has non-affine solutions:

  1. (Q1)

    Are there (non-affine) solutions to (6) with \(M \in \mathbb{R}^{2\times 2}\)?

Clearly, a necessary condition for this is that \(e(M)\in \operatorname{conv}(K)\). Using the method of convex integration, Müller and Šverák [43] (cf. also the Baire category arguments of [21, 22]) constructed multiple solutions to related differential inclusions, displaying the existence of a variety of solutions to the problem. Noting that these techniques are applicable to our set-up of the three-well problem ensures that for any \(M\) with \(e(M)\in \operatorname{intconv}(K)\) there exists a non-affine solution to (6). We point out that in the setting of the geometrically linearized hexagonal-to-rhombic phase transformation, all convex hulls coincide (see Lemma 10), which allows us to work with the convex hull instead of the lamination, rank-one or quasiconvex hulls.
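Since \(\operatorname{conv}(K)\) is a triangle, the necessary condition \(e(M)\in \operatorname{intconv}(K)\) can be tested via barycentric coordinates: \(e(M)\) lies in the interior if and only if all three coordinates are positive. A minimal sketch (the helper `barycentric` is our own illustration, not from the text):

```python
import numpy as np

s3 = np.sqrt(3.0)
wells = [np.array([[1.0, 0.0], [0.0, -1.0]]),
         0.5 * np.array([[-1.0, s3], [s3, 1.0]]),
         0.5 * np.array([[-1.0, -s3], [-s3, 1.0]])]

def barycentric(M):
    """Barycentric coordinates of e(M) with respect to the triangle conv(K)."""
    E = 0.5 * (M + M.T)
    assert abs(np.trace(E)) < 1e-12, "e(M) must be trace-free to lie in conv(K)"
    # Solve sum_i lam_i e^(i) = E together with sum_i lam_i = 1: a 3x3 linear
    # system in the coordinates (E_11, E_12) plus the affine constraint.
    A = np.array([[w[0, 0] for w in wells],
                  [w[0, 1] for w in wells],
                  [1.0, 1.0, 1.0]])
    return np.linalg.solve(A, np.array([E[0, 0], E[0, 1], 1.0]))

lam = barycentric(np.zeros((2, 2)))   # the barycenter M = 0: strictly interior
assert np.allclose(lam, 1.0 / 3.0)
lam = barycentric(wells[0])           # a well itself lies on the boundary
assert np.allclose(lam, [1.0, 0.0, 0.0])
```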

In general, however, these convex integration solutions are very “wild” in the sense that they do not possess strong regularity properties (cf. [26]). As our inclusion (6) is motivated by a physical problem, a natural question concerns the relevance of this multitude of solutions:

  1. (Q2)

    Are all the convex integration solutions physically relevant? Or are they only mathematical artifacts? Is there a mechanism distinguishing between the “only mathematical” and the “really physical” solutions?

Guided by the physical problem at hand and the literature on these problems, natural criteria to consider are surface energy constraints and surface energy regularizations of the minimization problem (1) (cf. for instance [9, 10, 14, 35, 36]). Here the presence of the higher order surface energy contributions gives rise to a length scale, which is also reflected in physical experiments (e.g., branching structures are predicted); cf. also our follow-up work [50] for an energetic point of view on convex integration solutions. On the level of our differential inclusion, surface energy constraints translate into regularity constraints and lead to the question of whether unphysical convex integration solutions have a natural regularity threshold. Here an immediate regularity property of solutions to (6) is that \(e(\nabla u)\in L^{\infty }( \mathbb{R}^{2})\). With slightly more care, it is also possible to obtain solutions with the property that \(u\in W^{1,\infty }_{\text{loc}}( \mathbb{R}^{2})\) (this follows in an abstract way from the stability arguments of Chap. 3 in [33], but in this text we will implement it by an explicit construction; in particular, this allows us to explicitly control the size of the skew part, see Proposition 34, thus justifying the use of the geometrically linearized theory of elasticity). However, prior to this work it was not known whether these solutions can enjoy more regularity, i.e., whether for instance there are convex integration solutions with \(\nabla u \in W^{s,p}(\varOmega )\) for some \(s>0\), \(p\geq 1\). Our main result shows that this is indeed the case: there are exact solutions to the (geometrically linearized version of the) minimization problem (1) which enjoy better regularity properties. This makes convex integration solutions potentially also interesting from an energy minimization point of view for energies involving elastic and surface energy contributions (cf. the experimental work of Inamura [29] for first experimental results capturing rather wild microstructures which might be related to convex integration). We emphasize, however, that our findings do not answer the question (Q2) on the physical relevance of convex integration solutions. They should be viewed as a first attempt at approaching this question.

Motivated by these questions, in this article, we study the regularity of a specific convex integration construction and obtain higher Sobolev regularity properties for the resulting solutions:

Theorem 1

Let \(\varOmega \subset \mathbb{R}^{2}\) be a bounded Lipschitz domain. Let \(K\) be as in (5) and let \(M\in \mathbb{R}^{2\times 2}\) be such that \(e(M):= \frac{M+M^{T}}{2} \in \operatorname{intconv}(K)\). Then there exist a value \(\theta _{0}\in (0,1)\), depending only on \(\frac{\operatorname{dist}(e(M), \partial \operatorname{conv}(K))}{ \operatorname{dist}(e(M),K)}\), and a displacement \(u: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}\) with \(u\in W^{1,\infty }_{loc}(\mathbb{R} ^{2})\) such that (6) holds and such that \(\nabla u\in W ^{s,p}_{loc}(\mathbb{R}^{2})\cap L^{\infty }(\mathbb{R}^{2})\) for all \(s\in (0,1)\), \(p\in (1,\infty )\) with \(sp< \theta _{0}\).

Let us comment on this result: to the best of our knowledge it represents the first \(W^{s,p}\) higher regularity result for convex integration solutions arising in differential inclusions for shape-memory materials. In addition to providing a regularity result for the displacement gradient \(\nabla u\), we also show higher \(W^{s,p}\) regularity for the characteristic functions of the martensitic phases. Invoking results of Sickel [52], this also implies bounds on the dimension of the singular set of the characteristic functions (cf. Remark 6). This in turn can be interpreted as a measure of the “fractality” of the constructed solutions. In this context, we also point out that our solutions are “piecewise affine” in the sense of Sect. 4 in [33], i.e., at almost every point of the domain, the iterative convex integration solution turns into an exact solution after a finite number of steps (where the number of steps however depends on the respective point). This entails that although our solutions are expected to be “wild”, they are not as wild as solutions to other convex integration schemes, where it might be necessary to iterate infinitely often at each point in the domain.

The given quantitative dependences for \(\theta _{0}\) are not optimal in the specific constants. While it is certainly possible to improve on these numeric values, a more interesting question concerns the qualitatively expected dependences: Is it necessary that \(\theta _{0}\) depends on \(\frac{\operatorname{dist}(e(M), \partial \operatorname{conv}(K))}{\operatorname{dist}(e(M),K)}\)?

Since for \(M\in \mathbb{R}^{2\times 2}\) with \(e(M)\in \partial \operatorname{conv}(K)\) there are no non-affine solutions to (6), it is natural to expect that convex integration constructions deteriorate for matrices \(M\) with \(e(M)\) approaching the boundary of \(\operatorname{conv}(K)\). The precise dependence on the behavior towards the boundary is however less intuitive. In this context, it is interesting to note that the regularity threshold \(\theta _{0}>0\) does not depend on the distance to the boundary of \(K\), but rather on the angle formed between the initial matrix \(e(M)\) and the boundary of \(\operatorname{conv}(K)\). This is in agreement with the intuition that the larger the angle is, the better the convex integration algorithm becomes, as it moves the values of the iterations which are used to construct the displacement \(u\) further into the interior of \(K\). In the interior of \(K\) it is possible to use larger length scales, which increases the regularity of solutions. Whether this dependence in the attainable product \(sp\) is necessary, or whether \(sp\) can be chosen independently of the angle with only the value of the corresponding norm deteriorating for smaller angles, is an interesting open question. In a follow-up work [51], we introduce a different construction which provides a uniform lower bound on the attainable regularity. That construction, however, is no longer “piecewise affine”.

We remark that in the very special case of additional symmetries it is possible to construct much better solutions. An example is given in the appendix for the case \(M=0\) (cf. also [48] and [11]). For these boundary data and specific domain geometries we show that it is possible to construct a solution \(u\) of the associated differential inclusion such that \(e(\nabla u) \in K\) and \(e(\nabla u)\in BV\). The skew part of the displacement gradient however diverges (and is “unphysical” in this sense). For the hexagonal-to-rhombic phase transformation, the boundary datum \(M=0\), which corresponds to \(e(M)\) lying exactly in the barycenter of the three energy wells, is the only example with such substantially improved regularity properties that we are aware of (cf. [19] for similar examples in the geometrically nonlinear setting). The high symmetry situation with the improved solutions is thus very non-generic and requires very strong symmetries. It is an important and challenging open question whether it is possible to exploit further symmetries and thus to construct further solutions with these much better regularity properties.

1.2 Literature and Context

A fascinating problem in the study of solid-solid, diffusionless phase transformations modeling shape-memory materials is the dichotomy between rigidity and non-rigidity. Since the work of Müller and Šverák [43], who adapted the convex integration method of Gromov [27, 28] and Nash-Kuiper [39, 44] to the situation of solid-solid phase transformations, and the work of Dacorogna and Marcellini [21, 22], it is known that under suitable conditions on the convex hulls of the energy wells, there is a very large set of possible minimizers to (1) (cf. also [53] and [33] for a comparison of these two methods). More precisely, the set of minimizers forms a residual set (in the Baire sense) in the associated function spaces. However, in general convex integration solutions are “wild”; they do not enjoy very good regularity properties. This has been proven rigorously for the case of the geometrically nonlinear two-well problem [25, 26], the geometrically nonlinear three-well problem in three dimensions (the “cubic-to-tetragonal phase transformation”) [17, 32] and (under additional assumptions) for the geometrically linear six-well problem (the “cubic-to-orthorhombic phase transformation”) [49]. In these works it has been shown that, on the one hand, convex integration solutions exist if the displacement gradient is only assumed to be \(L^{\infty }\) regular. If, on the other hand, the displacement gradient is \(BV\) regular (or a replacement of this), then solutions are very rigid and for most constant matrices \(M\) the analogue of (6) does not possess a solution.

Thus, in these specific examples convex integration solutions cannot exist at \(BV\) regularity for the displacement gradient; at this regularity solutions are rigid. At \(L^{\infty }\) regularity they are, however, flexible and a multitude of solutions exists. As in

  • the related (though much more complicated) situation of the Onsager conjecture for Euler’s equations (cf. for instance [8, 24, 30, 53] and the references therein),

  • the situation of isometric embeddings (cf. [16, 23]),

  • the situation of orientation preserving Young measures (cf. [37]),

  • the situation of elliptic equations (cf. [1, 38]),

it is hence natural to adopt a more quantitative point of view and to ask whether there is a regularity threshold which distinguishes between the rigid and the flexible regime.

It is the purpose of this article to make a first, very modest step towards understanding this dichotomy by analyzing the \(W^{s,p}\) regularity of a (known) convex integration scheme in a model case that is as simple as possible.

1.3 Main Ideas

In our construction of solutions to the differential inclusion (6) we follow the ideas of Müller and Šverák [43] (in the version of [47]) and argue by an iterative convex integration algorithm. For the hexagonal-to-rhombic transformation this is particularly simple, since the laminar convex hull equals the convex hull of the wells and since all matrices in the convex hull are symmetrized rank-one-connected with the wells (cf. Lemma 10). As a consequence it is possible to construct piecewise affine solutions (in the language of [33], Chap. 4). This simplifies the convergence of the iterative construction drastically. It is one of the reasons for studying the hexagonal-to-rhombic phase transformation as a model problem.

Yet, in spite of the (relative) simplicity of obtaining convergence of the iterative construction to a solution of (6), and hence of showing existence, substantially more care is needed in addressing regularity. In this context we argue by an interpolation result (cf. Theorem 2 and Proposition 66): while the \(BV\) norms of our approximating displacements \(u_{k}:\mathbb{R}^{2} \rightarrow \mathbb{R}^{2}\) increase exponentially, the \(L^{1}\) norms of successive differences decrease exponentially. If the threshold \(\theta _{0}>0\) is chosen appropriately, the \(W^{s,p}\) norm for \(0< sp< \theta _{0}\) is controlled by an interpolation of the \(BV\) and the \(L^{1}\) norms, which can be balanced so as to be uniformly bounded. This is based on the interpolation results from [13], which characterize Besov spaces as interpolation spaces of \(L^{p}\) and \(BV\) spaces, and is similar to the strategy of, for instance, [18] or [53]. To derive the associated bounds, we have to make the iterative algorithm quantitative in several ways, which distinguishes it from the “usual” convex integration schemes:

  1. (i)

    Tracking the error in strain space. In order to iterate the convex integration construction, it is crucial not to leave the interior of the convex hull of \(K\) in the iterative modification steps. In qualitative convex integration algorithms, it suffices to use errors which become arbitrarily small, and to invoke the openness of \(\operatorname{intconv}(K)\). As the admissible error in strain space is however coupled to the length scales of the convex integration constructions (cf. Lemma 21) and as these in turn are directly reflected in the solutions’ regularity properties, in our quantitative algorithm we have to keep track of the errors in strain space very carefully. Here we seek to maximize the possible length scales (and hence the aspect ratio of the building block constructions) without leaving \(\operatorname{intconv}(K)\) in each iteration step. This leads to the distinction of various possible cases (the “stagnant”, the “push-out”, the “parallel” and the “rotated” case, cf. Notation 25, Definition 29 and Algorithm 27). In these we quantitatively prescribe the admissible error according to the given geometry in strain space.

  2. (ii)

    Controlling the skew part without destroying the structure of (i). Seeking to construct \(W^{1,\infty }\) solutions, we have to control the skew part of our construction. Due to the results of Kirchheim, it is known that this is generically possible (cf. [33], Chap. 3). However, in our quantitative construction, we cannot afford to arbitrarily change the direction of the rank-one connection which is chosen in the convex integration algorithms at an arbitrary iteration step. This would entail \(BV\) bounds which could not be compensated by the exponentially decreasing \(L^{1}\) bounds in the interpolation argument. Hence we have to devise a detailed description of controlling the skew part (cf. Algorithm 30).

  3. (iii)

    Precise covering construction. In order to carry out our convex integration scheme we have to prescribe an iterative covering of our domain by constructions which successively modify a given gradient. As our construction in Lemma 21 relies on triangles, we have to ensure that there is a class of triangles which can be used for these purposes (cf. Section 4). Seeking to control both the \(L^{1}\) and the \(BV\) norms of the resulting convex integration solutions, we have to control competing requirements: On the one hand, we have to quantitatively control the overall perimeter (which can be viewed as a measure of the BV norm of \(\nabla u_{k}\)) of the covering at a given iteration step of the convex integration algorithm. This crucially depends on the specific case (“rotated” or “parallel”) in which we are. On the other hand, we have to ensure that a sufficiently large volume fraction of the underlying domain is covered by our building block constructions (which however costs surface energy) in order to obtain good \(L^{1}\) bounds. That it is possible to satisfy both requirements simultaneously is the content of Proposition 45.

For a further discussion of the quantitative aspects of our convex integration scheme and the differences with respect to more standard convex integration algorithms we refer to the discussion in Sect. 3.2 following the presentation of Algorithms 27 and 30.
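The balancing mechanism behind the interpolation argument described at the start of this subsection can be sketched numerically. We stress that the rates \(\rho\) and \(\delta\) below are hypothetical placeholders (not the constants of the actual construction), and that we schematically use the endpoint \(L^{1}\) form of the interpolation inequality:

```python
import math

# Hypothetical per-step rates: the BV norms of the iterates grow like rho**k,
# while the L1 norms of successive differences decay like delta**k.
rho, delta = 8.0, 0.25

# Interpolation heuristic: ||v||_{W^{theta,q}} <= C ||v||_{L1}^{1-theta} * ||v||_{BV}^{theta},
# so the k-th increment contributes ~ (delta**(1-theta) * rho**theta)**k.
# This is summable iff delta**(1-theta) * rho**theta < 1, i.e. iff theta < theta0 with
theta0 = math.log(1.0 / delta) / (math.log(rho) + math.log(1.0 / delta))

for theta in (0.25 * theta0, 0.5 * theta0, 0.9 * theta0):
    rate = delta ** (1.0 - theta) * rho ** theta
    assert rate < 1.0                      # geometric, hence summable
    total = rate / (1.0 - rate)            # bound on sum_{k >= 1} rate**k
    assert math.isfinite(total)

# Just above the threshold the increments are no longer summable.
assert delta ** (1.0 - 1.1 * theta0) * rho ** (1.1 * theta0) > 1.0
```

With these sample rates the threshold is \(\theta_{0} = \log 4 / (\log 8 + \log 4) = 0.4\); improving the decay rate \(\delta\) or slowing the \(BV\) growth \(\rho\) raises the attainable regularity, which is exactly why the algorithm tracks both quantities carefully.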

1.4 Organization of the Article

The remainder of the article is organized as follows: After briefly collecting preliminary results in the next section (interpolation results, results on the convex hull of the hexagonal-to-rhombic phase transition), in Sect. 3 we begin by describing the convex integration scheme which we employ. Here we first recall the main ingredients of the qualitative scheme (Sect. 3.1) and then introduce our more quantitative algorithms in Sect. 3.2. As this algorithm crucially relies on the existence of an appropriate covering, we present an explicit construction of this in Sect. 4. Here we also address quantitative covering estimates for the perimeter and the volume. The ingredients from Sects. 3 and 4 are then combined in Sect. 5, where we prove Theorem 1 for a specific class of domains. In Sect. 6 we explain how this can be generalized to arbitrary Lipschitz domains. Finally, in the Appendix, we recall a very special symmetry based construction for a solution to (6) with \(M=0\) with much better regularity properties for \(e(\nabla u)\) but with unbounded skew part.

2 Preliminaries

In this section we collect preliminary results which will be relevant in the sequel. We begin by stating the interpolation results of [13] on which our \(W^{s,p}\) bounds rely. Next, in Sect. 2.2 we recall general facts on matrix space geometry and in particular apply this to the hexagonal-to-rhombic phase transformation and its convex hulls.

2.1 An Interpolation Inequality and Sickel’s Result

Seeking to show higher Sobolev regularity for convex integration solutions, we rely on norm characterizations of \(W^{s,p}\) Sobolev functions. Here we recall the following two results: an interpolation characterization [13] and a geometric characterization of the regularity of characteristic functions [52]:

Theorem 2

(Interpolation with BV, [13])

We have the following interpolation results:

  1. (i)

Let \(p\in [2,\infty )\) and assume that \(\frac{1}{q} = \frac{1- \theta }{p} + \theta \) for some \(\theta \in (0,1)\). Then

    $$\begin{aligned} \|u\|_{W^{\theta ,q}(\mathbb{R}^{n})} \leq C \|u\|_{L^{p}(\mathbb{R} ^{n})}^{1-\theta } \|u\|_{BV(\mathbb{R}^{n})}^{\theta } . \end{aligned}$$
  2. (ii)

Let \(p\in (1,2]\) and let \(\frac{1}{q} = \frac{1-\theta }{p} + \theta \) for some \(\theta \in (0,1)\). Let further \((\theta _{1}, q_{1}) \in (0,1)\times (1,\infty )\) be such that

    $$\begin{aligned} \frac{1}{q_{1}} &= \frac{1-\theta _{1}}{2} + \theta _{1}, \\ \bigl(\theta , q^{-1}\bigr) &= \tau (0,1-) + (1-\tau ) \bigl(\theta _{1}, q_{1}^{-1}\bigr), \end{aligned}$$

for some \(\tau \in (0,1)\), where 1− denotes an arbitrary positive number slightly less than 1. Then,

    $$\begin{aligned} \| u \|_{W^{\theta , q}(\mathbb{R}^{n} )} \leq C \bigl(\|u\|_{L^{1+}( \mathbb{R}^{n})}^{\frac{\tau }{1-\theta }} \|u\|_{L^{2}(\mathbb{R} ^{n})}^{1-\frac{\tau }{1-\theta }} \bigr)^{1-\theta } \|u \|_{BV( \mathbb{R}^{n})}^{\theta }, \end{aligned}$$

with \(1+ := (1-)^{-1}\).

Before proceeding to the proof of Theorem 2, we present an immediate corollary of it: For functions which are “essentially” characteristic functions we obtain the following unified result:

Corollary 3

Let \(u:\mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) be a function such that

$$\begin{aligned} \|u\|_{L^{\infty }(\mathbb{R}^{n})} < \infty\quad \textit{and}\quad \bigl|u(x)\bigr| \geq c_{0}>0 \quad\textit{for a.e. } x \in \operatorname{supp}(u). \end{aligned}$$

Then, for any \(p\in (1,\infty )\) we have that

$$\begin{aligned} \|u\|_{W^{\theta ,q}(\mathbb{R}^{n})} \leq C \biggl( \frac{\|u\|_{L ^{\infty }(\mathbb{R}^{n})}}{c_{0}} \biggr)^{ (1-\frac{1}{p} )(1- \theta )} \|u\|_{L^{p}(\mathbb{R}^{n})}^{1-\theta } \|u \|_{BV( \mathbb{R}^{n})}^{\theta }, \end{aligned}$$

where \(\frac{1}{q} = \frac{1-\theta }{p} + \theta \) and \(\theta \in (0,1)\).

In the sequel, we will mainly rely on Corollary 3, since in our applications (e.g. in Propositions 66, 69) we will mainly deal with functions which are “essentially” characteristic functions.

Proof of Corollary 3

By virtue of Theorem 2 (i) and (7), it suffices to consider the regime \(p\in (1,2)\). In this case the statement follows from a combination of (8) and the fact that for functions satisfying (9) we have

$$\begin{aligned} \|u\|_{L^{p_{1}}(\mathbb{R}^{n})}^{\sigma } \|u\|_{L^{p_{2}}( \mathbb{R}^{n})}^{1-\sigma } \leq \biggl( \frac{\|u\|_{L^{\infty }( \mathbb{R}^{n})}}{c_{0}} \biggr)^{1-\frac{1}{r}} \|u\|_{L^{r}( \mathbb{R}^{n})}, \end{aligned}$$

for \(1< p_{1} \leq r \leq p_{2}\), \(r^{-1}= \sigma p_{1}^{-1} + (1- \sigma )p_{2}^{-1} \) and \(\sigma \in (0,1)\). We postpone a proof of (11) to the end of this proof, and observe first that it indeed suffices to show (11) to conclude the claim of (10). To this end, we note that the exponents in (8) obey the relation

$$\begin{aligned} \frac{1}{p} = \frac{1}{1+} \frac{\tau }{1-\theta } + \frac{1}{2} \biggl(1- \frac{\tau }{1-\theta } \biggr). \end{aligned}$$

This in turn is a consequence of the three identities

$$\begin{aligned} \frac{1}{q} = \frac{1}{1+} \tau + (1-\tau ) \frac{1+\theta _{1}}{2},\qquad \frac{1}{q} = \frac{1-\theta }{p} + \theta , \qquad \theta = (1-\tau ) \theta _{1}. \end{aligned}$$

Here we note that \(\frac{\tau }{1-\theta }=1-\frac{(1-\tau )(1-\theta _{1})}{1-\theta }\in (0,1)\). Hence (11) (applied to \(r=p\), \(p_{1} = 1+\), \(p_{2} = 2\) and \(\sigma = \frac{\tau }{1-\theta }\)) together with (8) yields the claim of (10).
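This exponent bookkeeping can be verified with exact rational arithmetic, treating the exponent 1+ as its limiting value 1 (an illustrative check of the algebra, not part of the proof):

```python
from fractions import Fraction as F
import random

random.seed(0)
# Check that the three identities
#   1/q = tau * 1 + (1 - tau) * (1 + theta1)/2   ("1+" replaced by its limit 1),
#   1/q = (1 - theta)/p + theta,
#   theta = (1 - tau) * theta1
# force the claimed relation
#   1/p = sigma * 1 + (1 - sigma) * 1/2   with   sigma = tau/(1 - theta).
for _ in range(100):
    tau = F(random.randint(1, 99), 100)
    theta1 = F(random.randint(1, 99), 100)
    theta = (1 - tau) * theta1
    inv_q = tau + (1 - tau) * (1 + theta1) / 2   # first identity
    inv_p = (inv_q - theta) / (1 - theta)        # solved from the second identity
    sigma = tau / (1 - theta)                    # third identity used here
    assert inv_p == sigma + (1 - sigma) * F(1, 2)
```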

It thus remains to prove (11). To this end, we observe that for any \(r \in [1,\infty ]\)

$$\begin{aligned} \|u\|_{L^{r}(\mathbb{R}^{n}) } &\geq c_{0}^{1-\frac{1}{r}} \|u \|_{L ^{1}(\mathbb{R}^{n})}^{\frac{1}{r}}, \\ \|u\|_{L^{r}(\mathbb{R}^{n}) } &\leq C_{1}^{1-\frac{1}{r}} \|u \|_{L ^{1}(\mathbb{R}^{n})}^{\frac{1}{r}}, \end{aligned}$$

where for abbreviation we have set \(C_{1}:=\|u\|_{L^{\infty }( \mathbb{R}^{n})}\). With this we infer

$$\begin{aligned} \|u\|_{L^{p_{1}}(\mathbb{R}^{n})}^{\sigma } \|u\|_{L^{p_{2}}( \mathbb{R}^{n})}^{1-\sigma } &\leq C_{1}^{ ( 1-\frac{1}{p_{1}} ) \sigma } \|u\|_{L^{1}(\mathbb{R}^{n})}^{\frac{ \sigma }{p_{1}}} C_{1}^{ (1-\frac{1}{p_{2}} ) (1-\sigma )} \|u\|_{L^{1}(\mathbb{R}^{n})}^{\frac{1-\sigma }{p_{2}}} \\ & \leq C^{1-\frac{1}{r}}_{1} \|u\|_{L^{1}(\mathbb{R}^{n})}^{ \frac{1}{r}} \leq C^{1-\frac{1}{r}}_{1} c_{0}^{\frac{1}{r}-1} \|u\| _{L^{r}(\mathbb{R}^{n})} \\ & = \biggl(\frac{C_{1}}{c_{0}} \biggr)^{1-\frac{1}{r}} \|u\|_{L^{r}( \mathbb{R}^{n})}. \end{aligned}$$

This concludes the argument. □
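The elementary inequality (11) can also be sanity-checked numerically. A minimal sketch, assuming a piecewise constant \(u\) on unit-volume cells whose nonzero values lie in \([c_{0},\|u\|_{L^{\infty }}]\) (all sample parameters are arbitrary):

```python
import random

random.seed(1)

def lp(vals, p):
    # L^p norm of a function that is constant on unit-volume cells
    return sum(abs(v) ** p for v in vals) ** (1.0 / p)

c0 = 0.5
for _ in range(200):
    # u is "essentially a characteristic function": |u| lies in {0} or [c0, ||u||_inf]
    vals = ([0.0] * random.randint(1, 20)
            + [random.uniform(c0, 3.0) for _ in range(random.randint(1, 20))])
    p1, p2 = random.uniform(1.01, 1.5), 2.0
    sigma = random.uniform(0.01, 0.99)
    r = 1.0 / (sigma / p1 + (1 - sigma) / p2)   # 1/r = sigma/p1 + (1-sigma)/p2
    lhs = lp(vals, p1) ** sigma * lp(vals, p2) ** (1 - sigma)
    rhs = (max(vals) / c0) ** (1 - 1.0 / r) * lp(vals, r)
    assert lhs <= rhs * (1 + 1e-12)
```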

After this discussion, we come to the proof of Theorem 2:

Proof of Theorem 2

If \(p\geq 2\), the interpolation result is a special case of Theorem 1.4 in [13] (where in the notation of [13] we have chosen \(s=0\), \(t=\theta \)): Indeed, for \(\gamma < 1-\frac{1}{n}\) and \((s,p)\) satisfying \((s-1)p^{\ast } \frac{1}{n} = \gamma -1\) with \(p^{\ast }\) being the dual exponent of \(p\), the estimate in Theorem 1.4 from [13] reads

$$\begin{aligned} \|u\|_{B^{t}_{q,q}(\mathbb{R}^{n})} \leq C\|u\|_{B^{s}_{p,p}( \mathbb{R}^{n})}^{1-\theta } \|u\|_{BV(\mathbb{R}^{n})}^{\theta }, \end{aligned}$$

where

$$\begin{aligned} \frac{1}{q} = \frac{1-\theta }{p} + \theta , \qquad t=(1-\theta ) s + \theta . \end{aligned}$$

We note that in the setting of Theorem 2 the estimate (12) is applicable, as in the notation of [13] and with dimension \(n\) we have that \(\gamma := - \frac{p}{p-1}\frac{1}{n} + 1= 1- \frac{1}{n} - \frac{1}{p-1} \frac{1}{n}< 1- \frac{1}{n}\), which implies the validity of (7). The simplification from (12) to (7) is then a consequence of the facts that

  • for \(s\notin \mathbb{Z}\) we have \(W^{s,p}(\mathbb{R}^{n}) = B_{p,p} ^{s}(\mathbb{R}^{n})\) (cf. [7, 12]),

  • and for \(p\geq 2\) the embedding \(L^{p}(\mathbb{R}^{n}) \hookrightarrow B_{p,p}^{0}(\mathbb{R}^{n})\) is valid (Theorem 2.41 in [2]).

This concludes the argument for (i).

To obtain (ii), we combine (i) with an additional interpolation inequality, which becomes necessary, as the inclusion \(L^{p}( \mathbb{R}^{n}) \hookrightarrow B_{p,p}^{0}(\mathbb{R}^{n})\) is no longer valid for \(p\in (1,2)\). Hence, we rely on the following interpolation estimate (cf. Lemma 3 in [7])

$$\begin{aligned} \|u\|_{\tilde{F}^{s}_{r,l}(\mathbb{R}^{n})} \leq C \|u\|_{\tilde{F} ^{s_{0}}_{p_{0},l_{0}}(\mathbb{R}^{n})}^{\tau } \|u\|_{\tilde{F}^{s _{1}}_{p_{1},l_{1}}(\mathbb{R}^{n})}^{1-\tau }, \end{aligned}$$

which is valid for \(-\infty < s_{0} <s_{1}<\infty \), \(0< l,l_{0},l_{1} \leq \infty \), \(0< p_{0},p_{1}\leq \infty \), \(0< \tau < 1\) with

$$\begin{aligned} s = \tau s_{0} + (1-\tau ) s_{1}, \qquad r^{-1} = \tau p_{0}^{-1} + (1- \tau ) p_{1}^{-1}. \end{aligned}$$

Here the spaces \(\tilde{F}^{s}_{r,l}\) denote the (modified) Triebel-Lizorkin spaces from [7]. The main advantage of the estimate (13), which goes back to Oru [45], is that there are no conditions on the relations between \(l\), \(l_{0}\), \(l_{1}\) in this estimate. In particular, we can choose \(l_{0}=2\), \(l_{1} = p_{1}\) and \(l=r\). Using that

  • \(\tilde{F}^{s}_{r,2}(\mathbb{R}^{n}) = L^{s,r}(\mathbb{R}^{n})\) for \(s\in \mathbb{R}\), \(1< r<\infty \) and that for this range \(L^{0,r}( \mathbb{R}^{n})=L^{r}(\mathbb{R}^{n})\),

  • \(\tilde{F}^{s}_{r,r}(\mathbb{R}^{n})= W^{s,r}(\mathbb{R}^{n})\) for \(0< s<\infty \), \(s\notin \mathbb{Z}\), \(1\leq r <\infty \),

we can simplify (13) to yield

$$\begin{aligned} \|u\|_{W^{s,r}(\mathbb{R}^{n})} \leq C \|u\|_{L^{p_{0}}(\mathbb{R} ^{n})}^{\tau } \|u\|_{W^{s_{1},p_{1}}(\mathbb{R}^{n})}^{1-\tau }, \end{aligned}$$

which is valid for \(0< s_{1} < \infty \), \(1< p_{0} < \infty \), \(1< p_{1} <\infty \) with

$$\begin{aligned} s = (1-\tau ) s_{1}, \qquad r^{-1} = \tau p_{0}^{-1} + (1-\tau ) p_{1}^{-1}. \end{aligned}$$

We apply (14) with \(p_{0}=1+\), \(s=\theta \), \(r=q\) and \((s_{1},p_{1})=(\theta _{1},q_{1})\) lying on the boundary of the interpolation region from (i) (cf. the blue region in Fig. 1), i.e.,

$$\begin{aligned} \biggl(\frac{1}{q},\theta \biggr) &= \tau (1-,0 ) + (1- \tau ) \biggl( \frac{1}{q_{1}},\theta _{1} \biggr), \\ \biggl(\frac{1}{q_{1}},\theta _{1} \biggr) &= (1-\theta _{1}) \biggl(\frac{1}{2},0 \biggr) + \theta _{1}(1,1). \end{aligned}$$

In particular, these equations uniquely determine \(\tau \in (0,1)\). Hence, we obtain

$$\begin{aligned} \|u\|_{W^{\theta ,q}} \lesssim \|u\|_{L^{1+}}^{\tau } \|u \|_{W^{\theta _{1},q_{1}}}^{1-\tau } \lesssim \|u\|_{L^{1+}}^{\tau } \|u\|_{L^{2}} ^{(1-\tau )(1-\theta _{1})} \|u\|_{BV}^{(1-\tau )\theta _{1}}. \end{aligned}$$

We conclude the proof of (ii) by noting that \((1-\tau )\theta _{1}= \theta \) and that

$$\begin{aligned} 0< \frac{(1-\tau )(1-\theta _{1})}{1-\theta }= 1- \frac{\tau }{1-\theta }. \end{aligned}$$
Fig. 1

For functions which are “essentially” characteristic functions in the sense that condition (9) of Corollary 3 holds, we obtain the interpolation inequality (10), which is valid in the whole coloured region in the figure (green and blue). Here the blue region is already covered by Theorem 2(i). In order to also obtain the green region, we have to be able to simplify the statement of (8), which in general is only possible for functions which are “essentially” characteristic functions. In our application of Corollary 3 (cf. Propositions 66, 69), we will restrict ourselves to the region to the left of the line connecting \((\theta _{0},1)\) with \((0,0)\). Remark 5 shows that a bound on the right hand side of (10) for a specific value \((\theta _{0},1)\) already allows one to deduce a bound for all exponents \((\tilde{\theta },q)\) on the associated line connecting \((\theta _{0},1)\) with \((0, \infty )\) (Color figure online)


As an alternative to the interpolation approach, a more geometric criterion for regularity is given by Sickel:

Theorem 4

(Sickel, [52])

Let \(\theta \in (0,1)\) and \(q\in [1,\infty )\) be arbitrary but fixed. Let \(E\subset \mathbb{R}^{n}\) be a bounded set satisfying

$$\begin{aligned} \int _{0}^{1}\delta ^{-\theta q}\bigl|(\partial E)_{\delta }\bigr| \frac{d \delta }{\delta }< \infty , \end{aligned}$$

where

$$\begin{aligned} (\partial E)_{\delta }:= \bigl\{ x \in E: \operatorname{dist}(x,\partial E) \leq \delta \bigr\} . \end{aligned}$$

Then, \(\chi _{E}\in W^{\theta ,q}(\mathbb{R}^{n})\).

Although this theorem provides good geometric intuition and could have been used as an alternative means of proving Theorem 1, we do not pursue this further in the sequel, but postpone its discussion to future work.
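For a set with smooth boundary one expects \(|(\partial E)_{\delta }| \sim \delta \), so Sickel's criterion predicts \(\chi _{E}\in W^{\theta ,q}\) precisely when \(\theta q<1\). A small numerical illustration for the unit disc, where the collar area is exactly \(\pi (1-(1-\delta )^{2})\) (the discretization parameters are ad hoc):

```python
import math

def collar_area(delta):
    # |(dE)_delta| for the unit disc E: area of the annulus 1-delta <= |x| <= 1
    return math.pi * (1.0 - (1.0 - delta) ** 2)

def sickel_integral(theta, q, n=200000):
    # midpoint rule for the integral from 0 to 1 of delta^{-theta q} |(dE)_delta| ddelta/delta
    h = 1.0 / n
    return sum(((k + 0.5) * h) ** (-theta * q - 1.0) * collar_area((k + 0.5) * h) * h
               for k in range(n))

theta, q = 0.3, 2.0   # theta*q = 0.6 < 1, so the integral is finite
approx = sickel_integral(theta, q)
# closed form: pi * (2/(1 - theta q) - 1/(2 - theta q))
exact = math.pi * (2.0 / (1.0 - theta * q) - 1.0 / (2.0 - theta * q))
assert abs(approx / exact - 1.0) < 0.05
```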

Remark 5

We note that the estimate (15) in Theorem 4 yields a condition on the product \(\theta q>0\), while, at first sight, Theorem 2 and Corollary 3 pose restrictions on \(\theta \) and \(q\) individually. As we are dealing with bounded (or even characteristic) functions, we however observe that it is also possible to obtain an analogous condition on the product \(\theta q\) in Theorem 2 and Corollary 3: Indeed, assume that \(u\in L^{\infty }(\mathbb{R}^{2})\) is such that for some \(\theta _{0} \in (0,1)\) the product

$$\begin{aligned} \|u\|_{L^{1}(\mathbb{R}^{2})}^{1-\theta _{0}}\| u\|_{BV(\mathbb{R}^{2})} ^{\theta _{0}} \end{aligned}$$

is bounded. Then, we claim that for

$$\begin{aligned} q\in (1,\infty ), \quad \tilde{\theta }:=\theta _{0} q^{-1}\quad \mbox{and for } p= \frac{1-\tilde{\theta }}{\tilde{\theta }} \frac{\theta _{0}}{1- \theta _{0}}, \end{aligned}$$

also the product

$$\begin{aligned} \|u\|_{L^{p}(\mathbb{R}^{2})}^{1-\tilde{\theta }}\| u\|_{BV( \mathbb{R}^{2})}^{\tilde{\theta }} \end{aligned}$$

is bounded. To derive this, we first observe that the \(L^{\infty }\) bound for \(u\) allows us to infer that for any \(p\in (1,\infty )\)

$$\begin{aligned} \|u\|_{L^{p}(\mathbb{R}^{2})} \leq \|u\|_{L^{\infty }(\mathbb{R}^{2})} ^{1-\frac{1}{p}} \|u\|_{L^{1}(\mathbb{R}^{2})}^{\frac{1}{p}}. \end{aligned}$$

As a consequence, we deduce that

$$ \begin{aligned}[b] \|u\|_{L^{p}(\mathbb{R}^{2})}^{1-\tilde{\theta }}\| u \|_{BV( \mathbb{R}^{2})}^{\tilde{\theta }} &\leq \|u\|^{(1-\frac{1}{p})(1- \tilde{\theta })}_{L^{\infty }(\mathbb{R}^{2})} \|u \|_{L^{1}( \mathbb{R}^{2})}^{\frac{1-\tilde{\theta }}{p}}\| u\|_{BV(\mathbb{R} ^{2})}^{\tilde{\theta }} \\ & = \|u\|^{1-\frac{\tilde{\theta }}{\theta _{0}}}_{L^{\infty }( \mathbb{R}^{2})} \bigl( \|u\|_{L^{1}(\mathbb{R}^{2})}^{1-\theta _{0}} \| u\|_{BV(\mathbb{R}^{2})}^{\theta _{0}} \bigr)^{\frac{ \tilde{\theta }}{\theta _{0}}}. \end{aligned} $$

Here we have made use of the specific choices of exponents from (17) and the boundedness of \(u\), which allowed us to invoke (18). This concludes the argument for the claim.
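The exponent bookkeeping behind (17) and (19) can be verified with exact rational arithmetic; a minimal sketch (the sample values of \(\theta _{0}\) and \(q\) are arbitrary):

```python
from fractions import Fraction as F

for theta0 in (F(1, 3), F(1, 2), F(7, 10)):
    for q in (F(3, 2), F(2), F(5)):
        tt = theta0 / q                               # theta~ = theta0/q, cf. (17)
        p = (1 - tt) / tt * theta0 / (1 - theta0)     # p from (17)
        # L^1 exponent in (19): (1-theta~)/p = (1-theta0) * theta~/theta0
        assert (1 - tt) / p == (1 - theta0) * tt / theta0
        # L^inf exponent in (19): (1-1/p)(1-theta~) = 1 - theta~/theta0
        assert (1 - 1 / p) * (1 - tt) == 1 - tt / theta0
```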

Thus, relying on the bound (19), we infer that given a bound on (16), we obtain that for all exponents \(q\), \(\tilde{\theta }\), \(p\) from (17)

$$\begin{aligned} \|u\|_{W^{\tilde{\theta },q}(\mathbb{R}^{n})} \leq C \|u\|^{1-\frac{ \tilde{\theta }}{\theta _{0}}}_{L^{\infty }(\mathbb{R}^{2})} \bigl( \|u\|_{L^{1}(\mathbb{R}^{2})}^{1-\theta _{0}}\| u\|_{BV(\mathbb{R}^{2})} ^{\theta _{0}} \bigr)^{\frac{\tilde{\theta }}{\theta _{0}}}. \end{aligned}$$

Here we applied Theorem 2 (or Corollary 3), for which we noted that the respective exponents are admissible. On the one hand, this is the desired analogue of the condition from Theorem 4 and allows us to obtain a whole family of \(W^{\theta ,q}\) bounds for \(u\), where \(\theta q < \theta _{0}\). On the other hand, it shows that although \(p=1\) is not admissible in Theorem 2 and Corollary 3, for our purposes, it still suffices to consider the case \(p=1\) and to prove a control for (16), which then gives the full range of expected exponents in the form of the estimate (20).

Remark 6

(Fractal Packing Dimension)

Following Sickel [52], Proposition 3.3 (cf. also [31], Theorem 2.2) we remark that for a characteristic function its \(W^{s,p}\) regularity has direct consequences on the packing dimension (cf. [31, 42]), which we denote by \(\dim _{P}\), of its boundary: If for some set \(E \subset \mathbb{R} ^{n}\) its characteristic function \(\chi _{E}\) satisfies \(\chi _{E} \in W^{s,p}(\mathbb{R}^{n})\) for some \(s>0\) and \(1\leq p <\infty \), then

$$\begin{aligned} \dim _{P}\bigl(S_{\delta }(\partial E)\bigr) \leq \min \{n,n-sp+\delta \}, \end{aligned}$$

where

$$\begin{aligned} S_{\delta }(\partial E):= \bigl\{ & x \in \partial E: \ \exists \mu >0 \mbox{ such that } \forall \epsilon , \ 0 < \epsilon \leq 1, \ \exists A_{\epsilon }, A_{\epsilon }' \mbox{ satisfying } \\ & A_{\epsilon } \subset B_{\epsilon }(x)\cap E, \ A_{\epsilon }' \subset B_{\epsilon }(x)\cap E^{c} \mbox{ and } |A_{\epsilon }||A _{\epsilon }'| \geq \mu \epsilon ^{2n+\delta } \bigr\} , \end{aligned}$$

Here \(B_{\epsilon }(x):=\{x'\in \mathbb{R}^{n}: |x-x'|\leq \epsilon \}\) and \(E^{c}\) denotes the complement of \(E\).

2.2 Matrix Space Geometry

Before discussing our convex integration scheme, we recall some basic notions and properties of the hexagonal-to-rhombic phase transformation, which we will use in the sequel.

We begin by introducing notation for the symmetric and antisymmetric part of two matrices.

Definition 7

(Symmetric and Antisymmetric Parts)

Let \(M\in \mathbb{R}^{n\times n}\). We denote the uniquely determined symmetric and antisymmetric parts of \(M\) by

$$\begin{aligned} M= e(M) + \omega (M), \quad e(M) := \frac{1}{2}\bigl(M^{T} + M \bigr), \ \omega (M) := \frac{1}{2}\bigl(M - M^{T}\bigr). \end{aligned}$$

2.2.1 Lamination Convexity Notions

Relying on the notation from Definition 7, in the sequel we discuss the different notions of lamination convexity. Here we distinguish between the usual lamination convex hull (defined by successive rank-one iterations) and the symmetrized lamination convex hull (defined by successive symmetrized rank-one iterations):

Definition 8

(Lamination Convex Hull, Symmetrized Lamination Convex Hull)

We define the following notions of lamination convex hulls:

  1. (i)

    Let \(U\subset \mathbb{R}^{n\times n}\). Then we set

    $$\begin{aligned} \mathcal{L}^{0}(U) &:= U, \\ \mathcal{L}^{k}(U) &:= \bigl\{ M\in \mathbb{R}^{n\times n}: M= \lambda A+ (1- \lambda ) B \mbox{ with } A-B = a \otimes n, \lambda \in [0,1], \\ & \quad \quad A,B \in \mathcal{L}^{k-1}(U)\bigr\} , \quad k\geq 1, \\ U^{lc} &:= \bigcup_{k=0}^{\infty } \mathcal{L}^{k}(U). \end{aligned}$$

    We refer to \(U^{lc}\) as the laminar convex hull of\(U\) and to \(\mathcal{L}^{k}(U)\) as the laminates of order at most\(k\).

  2. (ii)

    Let \(U\subset \mathbb{R}^{n \times n}_{sym}\). Then we define

    $$\begin{aligned} \mathcal{L}^{0}_{sym}(U) &:= U, \\ \mathcal{L}^{k}_{sym}(U) &:= \bigl\{ M\in \mathbb{R}^{n\times n}: M= \lambda A+ (1-\lambda ) B \mbox{ with } A-B = a \odot n, \lambda \in [0,1], \\ & \quad \quad A,B \in \mathcal{L}^{k-1}_{sym}(U)\bigr\} , \quad k \geq 1, \\ U^{lc}_{sym} &:= \bigcup_{k=0}^{\infty } \mathcal{L}^{k}_{sym}(U). \end{aligned}$$

    Here \(a\odot b:= \frac{1}{2}(a\otimes b + b\otimes a)\). We refer to \(U^{lc}_{sym}\) as the symmetrized laminar convex hull of\(U\) and to \(\mathcal{L}^{k}_{sym}(U)\) as the symmetrized laminates of order at most\(k\).

  3. (iii)

    We denote the convex hull of a set \(U\subset \mathbb{R}^{m}\) by \(\operatorname{conv}(U)\).

Remark 9

We note that if \(U \subset \mathbb{R}^{n\times n}\) or \(U\subset \mathbb{R}^{n\times n}_{sym}\) is (relatively) open, then \(U^{lc}\) or \(U^{lc}_{sym}\), respectively, is also (relatively) open.

Lemma 10

(Convex Hull = Laminar Convex Hull)

Let \(K\) be as in (5). Then

$$\begin{aligned} K_{sym}^{lc} = \operatorname{conv}(K) = \mathcal{L}^{2}_{sym}(K). \end{aligned}$$

Moreover, each element \(e\in \operatorname{intconv}(K)\) is symmetrized rank-one connected with each element in \(K\).


Proof

The first point follows from an observation of Bhattacharya (cf. [6] and also Lemma 4 in [49]). The second point either follows from a direct calculation or by an application of Lemma 11 below. □

The following lemma establishes a relation between rank-one connectedness and symmetrized rank-one connectedness. It in particular shows that in two dimensions all symmetric trace-free matrices are pairwise symmetrized rank-one connected.

Lemma 11

(Rank-One vs Symmetrized Rank-One Connectedness)

Let \(e_{1},e_{2} \in \mathbb{R}^{n\times n}_{sym}\) with \(\operatorname{tr}(e_{1})=0=\operatorname{tr}(e_{2})\). Then the following statements are equivalent:

  1. (i)

    There exist vectors \(a\in \mathbb{R}^{n}\setminus \{0\}\), \(n \in S^{n-1}\) such that

    $$\begin{aligned} e_{1}-e_{2} = a\odot n. \end{aligned}$$
  2. (ii)

    There exist matrices \(M_{1},M_{2} \in \mathbb{R}^{n\times n}\) and vectors \(a\in \mathbb{R}^{n}\setminus \{0\}\), \(n \in S^{n-1}\) such that

    $$\begin{aligned} M_{1} - M_{2} &= a\otimes n, \\ e(M_{1}) &= e_{1}, e(M_{2}) = e_{2}. \end{aligned}$$
  3. (iii)

    \(\operatorname{rank}(e_{1}-e_{2})\leq 2\).


Proof

We refer to [49], Lemma 9 for a proof of this statement. □

This lemma allows us to view symmetrized rank-one connectedness as essentially equivalent to rank-one connectedness.

2.2.2 Skew Parts

We discuss some properties of the skew symmetric parts associated with the rank-one connections which occur for points in \(\operatorname{intconv}(K)\). To this end, we introduce the following identification:

Notation 12

(Skew Symmetric Matrices)

As the two-dimensional skew symmetric matrices are all of the form

$$\begin{aligned} S= \begin{pmatrix} 0 & \tilde{\omega } \\ -\tilde{\omega } & 0 \end{pmatrix} \quad\mbox{for some } \tilde{\omega } \in \mathbb{R}, \end{aligned}$$

we use the mapping \(S\mapsto \tilde{\omega }\) to identify \(\operatorname{Skew}(2)\) with ℝ. We define an ordering on \(\operatorname{Skew}(2)\) by the corresponding ordering on ℝ, i.e.,

$$\begin{aligned} S_{1} = \begin{pmatrix} 0 & \tilde{\omega }_{1} \\ -\tilde{\omega }_{1} & 0 \end{pmatrix} \leq S_{2} = \begin{pmatrix} 0 & \tilde{\omega }_{2} \\ -\tilde{\omega }_{2} & 0 \end{pmatrix} \end{aligned}$$

if \(\tilde{\omega }_{1} \leq \tilde{\omega }_{2}\).

We begin by estimating the symmetric and skew-symmetric parts of a symmetrized rank-one connection:

Lemma 13

Let \(a \in \mathbb{R}^{2}\), \(n \in S^{1}\) with \(a \cdot n=0\). Then

$$\begin{aligned} \|a \odot n \| =|a|/2=\bigl\| \omega (a \otimes n)\bigr\| . \end{aligned}$$

Here \(\|\cdot \|\) denotes the spectral norm, i.e., \(\|A\|:= \sup_{|e|=1}|A e|\), where \(|\cdot |\) denotes the \(\ell _{2}\) norm.


Proof

Since \(a \bot n\), we obtain that

$$\begin{aligned} (a \odot n) n &=a/2 = \frac{|a|}{2} \frac{a}{|a|} ,\qquad (a \odot n) \frac{1}{|a|} a = \frac{|a|}{2} n. \end{aligned}$$

As \(n\), \(\frac{1}{|a|}a\) forms an orthonormal basis, this shows that \(\|a \odot n\|=|a|/2\).

Similarly, we obtain that

$$\begin{aligned} \omega (a \otimes n) n &= \frac{|a|}{2} \frac{a}{|a|}, \qquad \omega (a \otimes n) \frac{1}{|a|}a = - \frac{|a|}{2} n, \end{aligned}$$

and hence \(\| \omega (a \otimes n)\|=|a|/2\). □
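The identities of Lemma 13 are easy to confirm numerically; a minimal sketch, with the spectral norm computed as the largest singular value (the sampling ranges are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
spec = lambda M: np.linalg.norm(M, 2)   # spectral norm (largest singular value)
for _ in range(100):
    t = rng.uniform(0.0, 2.0 * np.pi)
    s = rng.uniform(0.1, 5.0)                      # s = |a|
    n = np.array([np.cos(t), np.sin(t)])           # unit vector n
    a = s * np.array([-np.sin(t), np.cos(t)])      # a orthogonal to n
    sym = 0.5 * (np.outer(a, n) + np.outer(n, a))  # a (.) n, the symmetrized product
    skew = 0.5 * (np.outer(a, n) - np.outer(n, a)) # omega(a x n), the skew part
    assert abs(spec(sym) - s / 2.0) < 1e-9
    assert abs(spec(skew) - s / 2.0) < 1e-9
```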

Using the previous result, we can control the size of the skew part which occurs in rank-one connections with \(K\):

Lemma 14

For all matrices \(N\) with \(e(N)\in \operatorname{intconv}(K)\) and with \(N\) rank-one connected with a matrix \(e^{(j)} \in K\), it holds that

$$\begin{aligned} \bigl\| \omega (N)\bigr\| \leq 10. \end{aligned}$$


Proof

For each \(e\in \operatorname{intconv}(K)\) there are exactly two matrices \(M_{e,i}^{\pm }\) such that

$$\begin{aligned} e\bigl(M_{e,i}^{\pm }\bigr)= e\quad \mbox{and}\quad \operatorname{rank}\bigl(M_{e,i}^{\pm } - e^{(i)}\bigr) = 1 \quad\mbox{for } i\in \{1,2,3\}. \end{aligned}$$

Let \(e-e^{(i)}= \frac{1}{2}(a \otimes n + n \otimes a)\) for some \(a \in \mathbb{R}^{2}\setminus \{0\}, n \in S^{1}\). Then, \(\omega (M _{e,i}^{\pm })= \omega (M_{e,i}^{\pm }) - \omega (e^{(i)})\) is explicitly given by \(\pm \frac{1}{2} (a \otimes n -n \otimes a)\). Thus, Lemma 13 implies

$$\begin{aligned} |a|/2= \bigl|\bigl(e-e^{(i)}\bigr) n\bigr| \leq \bigl\| e-e^{(i)}\bigr\| , \end{aligned}$$

since \(n \in S^{1}\) and \(a \cdot n =0\) by the trace-free condition. As \(\operatorname{conv}(K)\) is a compact set, \(e-e^{(i)}\) is uniformly bounded. Moreover, the diameter of \(\operatorname{conv}(K)\) is less than five, which yields the desired bound. □

In the interest of accessibility to a larger audience, the following subsections are phrased in standard matrix space formulation. It would have equally been possible to phrase these results in terms of conformal and anti-conformal coordinates, which would allow for a more concise formulation of some statements. For a work using these methods, see for instance [1].

2.2.3 Geometry of the Hexagonal-to-Rhombic Phase Transformation

In this subsection, we discuss the specific matrix space geometry of the hexagonal-to-rhombic phase transformation. To this end we decompose each matrix of the form \(\begin{pmatrix} \alpha & \beta \\ \beta & -\alpha \end{pmatrix}\) into a component in the \(v_{1}= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\) direction and a component in the \(v_{2}= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\) direction, which essentially corresponds to introducing conformal and anti-conformal coordinates.

With this notation we make the following observations:

Lemma 15

Let \(v_{1}, v_{2} \in \mathbb{R}^{2\times 2}_{sym}\) be as above, and \(\varphi \in \mathbb{R}\). Then

$$\begin{aligned} \cos (\varphi ) \begin{pmatrix} 1 & 0 \\ 0 &- 1 \end{pmatrix} + \sin (\varphi ) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} & = \begin{pmatrix} \cos (\varphi ) & -\sin (\varphi ) \\ \sin (\varphi ) & \cos (\varphi ) \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \\ &= \begin{pmatrix} \cos (\varphi ) & -\sin (\varphi ) \\ \sin (\varphi ) & \cos (\varphi ) \end{pmatrix} v_{1}. \end{aligned}$$

Furthermore, we have that

$$\begin{aligned} \begin{pmatrix} \cos (\varphi ) & -\sin (\varphi ) \\ \sin (\varphi ) & \cos (\varphi ) \end{pmatrix} v_{1} &= \begin{pmatrix} \cos (\frac{\varphi }{2}+\frac{\pi }{4}) \\ \sin (\frac{\varphi }{2}+\frac{\pi }{4}) \end{pmatrix} \otimes \begin{pmatrix} \sin (\frac{\varphi }{2} + \frac{\pi }{4}) \\ -\cos (\frac{\varphi }{2}+\frac{\pi }{4}) \end{pmatrix} \\ & \quad + \begin{pmatrix} \sin (\frac{\varphi }{2} + \frac{\pi }{4}) \\ -\cos (\frac{\varphi }{2}+\frac{\pi }{4}) \end{pmatrix} \otimes \begin{pmatrix} \cos (\frac{\varphi }{2}+\frac{\pi }{4}) \\ \sin (\frac{\varphi }{2}+\frac{\pi }{4}) \end{pmatrix} . \end{aligned}$$


Proof

Using the trigonometric identities

$$\begin{aligned} \cos (\phi ) &=\sin \biggl(\phi +\frac{\pi }{2}\biggr)= 2 \cos \biggl( \frac{\phi }{2}+\frac{ \pi }{4}\biggr)\sin \biggl(\frac{\phi }{2}+ \frac{\pi }{4}\biggr), \\ \sin (\phi ) &=-\cos \biggl(\phi +\frac{\pi }{2}\biggr)= \sin ^{2} \biggl(\frac{\phi }{2}+\frac{ \pi }{4}\biggr)-\cos ^{2}\biggl( \frac{\phi }{2}+\frac{\pi }{4}\biggr), \quad \phi \in \mathbb{R}, \end{aligned}$$

an immediate computation shows the claim. □
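Both identities of Lemma 15 can be confirmed numerically; a minimal sketch with random angles, writing \(\psi := \frac{\varphi }{2}+\frac{\pi }{4}\):

```python
import numpy as np

v1 = np.array([[1.0, 0.0], [0.0, -1.0]])
v2 = np.array([[0.0, 1.0], [1.0, 0.0]])
for phi in np.random.default_rng(1).uniform(-np.pi, np.pi, 50):
    R = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    # first identity: cos(phi) v1 + sin(phi) v2 = R(phi) v1
    assert np.allclose(np.cos(phi) * v1 + np.sin(phi) * v2, R @ v1)
    # second identity: R(phi) v1 = a x n + n x a with psi = phi/2 + pi/4
    psi = phi / 2.0 + np.pi / 4.0
    a = np.array([np.cos(psi), np.sin(psi)])
    n = np.array([np.sin(psi), -np.cos(psi)])
    assert np.allclose(R @ v1, np.outer(a, n) + np.outer(n, a))
```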

In other words, Lemma 15 allows us to identify all lines in matrix space (through the origin) by their rotation angle. In particular, this gives a simple description of the possible rank-one connections between the energy wells (cf. also Fig. 2(a)). In our application to the hexagonal-to-rhombic phase transformation we have to take into account that non-trivial differences of the matrices \(e^{(1)}\), \(e^{(2)}\), \(e^{(3)}\) lie on the sphere of radius \(\sqrt{3}\) in matrix space (with respect to the spectral norm), which yields slightly different normalization factors for \(a\):

Fig. 2

The possible normals between the wells (cf. Lemma 16) (left) and the angles that arise in Lemma 18 (right). The figure on the left depicts the possible normals between the wells; the respective pairs \((a_{ij},n_{ij})\) are marked in the same color. We note that by symmetry it is also possible to pass from \((a,n)\) to \((-a,-n)\). After normalizing appropriately, symmetry further allows one to exchange the roles of \(a\), \(n\). The figure on the right depicts the normals which arise in the decomposition of differences \(e-e^{(i)}\) with \(e^{(i)}\in K\) and \(e\in C_{d}\) (cf. Lemma 18). The colors represent the well with which \(e\in C_{d}\) is connected: red corresponds to the cone at \(e^{(1)}\), blue to the cone at \(e^{(2)}\) and black to the cone at \(e^{(3)}\). As \(C_{d}\) does not contain the full convex hull of \(K\) (cf. Fig. 3), only vectors which lie within the colored zones arise as possible normals (in particular, there is a gap between these vectors and the vectors which arise in the decomposition of differences of the wells) (Color figure online)

Lemma 16

Let \(K\) be as in (5). Then we have that

$$\begin{aligned} e^{(1)} - e^{(2)} &= a_{12}\odot n_{12}, \\ e^{(1)}-e^{(3)} &= a_{13}\odot n_{13}, \\ e^{(2)}-e^{(3)} &= a_{23}\odot n_{23}, \end{aligned}$$

with (up to rotation symmetry by an angle of \(\pi \))

$$\begin{aligned} a_{12} &:= -2\sqrt{3} \begin{pmatrix} \cos (\frac{5 \pi }{12} + \frac{\pi }{4}) \\ \sin (\frac{5 \pi }{12} + \frac{\pi }{4}) \end{pmatrix} = \begin{pmatrix} \sqrt{3} \\ -3 \end{pmatrix} , \qquad n_{12}:= \begin{pmatrix} \sin (\frac{5 \pi }{12} + \frac{\pi }{4}) \\ -\cos (\frac{5 \pi }{12} + \frac{\pi }{4}) \end{pmatrix} =\frac{1}{2} \begin{pmatrix} \sqrt{3} \\ 1 \end{pmatrix} , \\ a_{13}&:= -2\sqrt{3} \begin{pmatrix} \cos (\frac{7 \pi }{12} + \frac{\pi }{4}) \\ \sin (\frac{7 \pi }{12} + \frac{\pi }{4}) \end{pmatrix} = \begin{pmatrix} 3 \\ -\sqrt{3} \end{pmatrix} , \qquad n_{13}:= \begin{pmatrix} \sin (\frac{7 \pi }{12} + \frac{\pi }{4}) \\ -\cos (\frac{7 \pi }{12} + \frac{\pi }{4}) \end{pmatrix} =\frac{1}{2} \begin{pmatrix} 1 \\ \sqrt{3} \end{pmatrix} , \\ a_{23} &:= 2\sqrt{3} \begin{pmatrix} 0 \\ 1 \end{pmatrix} , \qquad n_{23} := \begin{pmatrix} 1 \\ 0 \end{pmatrix} . \end{aligned}$$


Proof

This is a consequence of Lemma 15 and the form of the matrices in (5). □
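The explicit pairs \((a_{ij},n_{ij})\) can be cross-checked: the angle form matches the vector form, each difference is trace free (i.e. \(a\perp n\)), and the differences satisfy the cocycle relation \((e^{(1)}-e^{(2)})+(e^{(2)}-e^{(3)})=e^{(1)}-e^{(3)}\). A minimal sketch:

```python
import numpy as np

def sym(a, n):
    # the symmetrized product a (.) n
    return 0.5 * (np.outer(a, n) + np.outer(n, a))

s3 = np.sqrt(3.0)
a12, n12 = np.array([s3, -3.0]),      0.5 * np.array([s3, 1.0])
a13, n13 = np.array([3.0, -s3]),      0.5 * np.array([1.0, s3])
a23, n23 = np.array([0.0, 2.0 * s3]), np.array([1.0, 0.0])

# the angle form of a_12 agrees with the explicit vector
assert np.allclose(a12, -2.0 * s3 * np.array([np.cos(5 * np.pi / 12 + np.pi / 4),
                                              np.sin(5 * np.pi / 12 + np.pi / 4)]))
# cocycle relation: (e1 - e2) + (e2 - e3) = e1 - e3
assert np.allclose(sym(a12, n12) + sym(a23, n23), sym(a13, n13))
# all three differences are trace free
for a, n in ((a12, n12), (a13, n13), (a23, n23)):
    assert abs(np.trace(sym(a, n))) < 1e-12
```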

With Lemma 15 at hand, we can also compute the possible (symmetrized) rank-one connections which occur between each well and any possible matrix in \(\operatorname{conv}(K)\):

Lemma 17

Let \(e^{(i)}\in K\) and let \(e\in \operatorname{conv}(K)\). Let \(\varphi \in (-\pi ,\pi ]\) denote the angle from the decomposition from Lemma 15 for the matrix \(\frac{e^{(i)}-e}{\|e^{(i)}-e\|}\), where \(\|\cdot \|\) denotes the spectral matrix norm. Then,

$$\begin{aligned} \varphi \in \left \{ \textstyle\begin{array}{l@{\quad}l} (\frac{5\pi }{6}, \frac{7\pi }{6}) &\textit{if } i=1, \\ (-\frac{\pi }{2},-\frac{\pi }{6}) &\textit{if } i=2, \\ (\frac{\pi }{6},\frac{\pi }{2}) &\textit{if } i=3, \end{array}\displaystyle \right . \end{aligned}$$

and

$$\begin{aligned} e^{(i)}- e = \bigl\| e^{(i)}-e\bigr\| \bigl(a_{i}(\varphi ) \otimes n_{i}(\varphi ) + n_{i}(\varphi ) \otimes a_{i}(\varphi ) \bigr), \end{aligned}$$

where

$$\begin{aligned} a_{i}(\varphi ) := \begin{pmatrix} \sin (\frac{\varphi }{2} + \frac{\pi }{4}) \\ -\cos (\frac{\varphi }{2}+\frac{\pi }{4}) \end{pmatrix} , \qquad n_{i}(\varphi ) := \begin{pmatrix} \cos (\frac{\varphi }{2}+\frac{\pi }{4}) \\ \sin (\frac{\varphi }{2}+\frac{\pi }{4}) \end{pmatrix} . \end{aligned}$$


Proof

This is a direct consequence of Lemma 15 and of the fact that the set \(K\) forms an equilateral triangle in strain space. □

As an immediate consequence of Lemma 16 and Lemma 17, we infer the following result, which is graphically illustrated in Fig. 2(b):

Lemma 18

Let \(K\) be as in (5), let \(\tilde{C}_{d,j}:= \operatorname{conv}\{e^{(j)}, P_{d}, Q P_{d} Q^{T}, Q^{2} P_{d} (Q ^{2})^{T}\}\) with \(j\in \{1,2,3\}\), \(Q\) being a rotation by \(\frac{2\pi }{3}\), and

$$\begin{aligned} P_{d} := \frac{1-d}{2} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} , \end{aligned}$$

where \(d \in (0,\frac{1}{2})\). Further, let \(C_{d}\) be the star-shaped domain given by their union, depicted in Fig. 3:

$$\begin{aligned} C_{d}:=\bigcup_{j=1}^{3} \tilde{C}_{d,j}, \end{aligned}$$
Fig. 3

The colored domain represents the interior of the star \(C_{d}\) from Lemma 18. As not the full cone of rank-one directions between the wells is attained (cf. (22)), the angle between different cones is bounded strictly away from zero and \(\pi \) (Color figure online)

Let \(\hat{e}, \bar{e} \in C_{d}\). Suppose that

$$\begin{aligned} \hat{e}-e^{(i_{1})} &= \bigl\| \hat{e}-e^{(i_{1})} \bigr\| a_{i_{1}}\odot n_{i _{1}}, \\ \bar{e}-e^{(i_{2})} &= \bigl\| \bar{e}-e^{(i_{2})} \bigr\| a_{i_{2}}\odot n_{i _{2}}, \end{aligned}$$

with \(e^{(i_{1})}, e^{(i_{2})} \in K\) and \(e^{(i_{1})} \neq e^{(i_{2})}\). Define \(\alpha _{m_{1},m_{2}} \in (0,\pi )\) as

$$\begin{aligned} \cos (\alpha _{m_{1},m_{2}}) := (m_{1},m_{2}), \end{aligned}$$

where

$$\begin{aligned} m_{l} \in \biggl\{ \frac{a_{i_{l}}}{\|a_{i_{l}}\|}, \frac{n_{i_{l}}}{ \|n_{i_{l}}\|} \biggr\} , \quad l \in \{1,2\}. \end{aligned}$$

Then there exists a constant \(C=C(d)\in (0,\pi /4)\) such that

$$\begin{aligned} |\alpha _{m_{1},m_{2}}| \in \bigcup_{l=1}^{6} \biggl(C(d) + \frac{ \pi }{6}(l-1), \frac{\pi }{6}l - C(d) \biggr). \end{aligned}$$


Proof

Arguing as for (21) by using the definition of the set \(C_{d}\), we infer that the angles \(\varphi \) which occur in the representation from Lemma 15 for \(\frac{e-e^{(i)}}{\|e-e^{(i)}\|}\) with \(e\in C_{d}\) and \(e^{(i)}\in K\) satisfy

$$\begin{aligned} \varphi \in \left \{ \textstyle\begin{array}{l@{\quad}l} (\frac{5\pi }{6}+C(d), \frac{7\pi }{6} - C(d)) &\mbox{if } i=1, \\ (-\frac{\pi }{2}+C(d),-\frac{\pi }{6}-C(d)) &\mbox{if } i=2, \\ (\frac{\pi }{6}+C(d),\frac{\pi }{2}-C(d)) &\mbox{if } i=3. \end{array}\displaystyle \right . \end{aligned}$$

Here \(C(d)\in (0,\pi /3)\) is a constant, which depends only on \(d\). The associated symmetrized rank-one connection is determined by \(\varphi \) as stated in Lemma 17. Applied to the situation in Lemma 18 this implies that \(a_{i_{1}}\), \(n_{i_{1}}\), \(a _{i_{2}}\), \(n_{i_{2}}\) are expressed in terms of \(\varphi \) (as in Lemma 17). Since \(d>0\), the sectors parametrized by \(\varphi \) however do not overlap for \(i_{1}\neq i_{2}\). As a consequence of this and of the options in (22), only the claimed angles \(\alpha _{m_{1},m_{2}}\) occur. □

3 The Convex Integration Algorithm

In this section we present and analyze our convex integration algorithm (cf. Algorithms 27 and 30). Our discussion of this consists of four parts: First in Sect. 3.1 we introduce a replacement construction in which a displacement gradient can be modified (cf. Lemmas 1923). Here we follow Otto’s Minneapolis lecture notes [47] and refer to this construction as a version of Conti’s construction (cf. [15, 20] (Appendix) but also [33]).

Next in Sect. 3.2 we explain how the Conti construction can be exploited to formulate the convex integration algorithm (cf. Algorithms 27 and 30). Here we deviate from the more common qualitative algorithms by precisely prescribing error estimates in strain space, by specifying a covering construction and by controlling the skew part quantitatively.

In Sect. 3.3 we analyze our algorithms and show that they are well-defined (cf. Proposition 31). We further provide a control on the skew part of the resulting construction (cf. Proposition 34).

Finally, in Sect. 3.4 we use Algorithms 27 and 30 to deduce the existence of solutions to the inclusion problem (6), cf. Proposition 36.

We remark that our version of the convex integration scheme is based on particular properties of our set of strains: For the hexagonal-to-rhombic phase transition the laminar convex hull equals the convex hull (cf. Lemma 10). Moreover, we can connect any matrix in \(\operatorname{intconv}(K)\) with the wells \(K\) (cf. Lemma 11). For a general inclusion problem this is no longer possible and hence more sophisticated arguments are necessary. In spite of the restricted applicability of the scheme, we have decided to focus on the hexagonal-to-rhombic phase transformation, as it yields one of the simplest instances of convex integration and illustrates the difficulties and ingredients which have to be dealt with in proving higher Sobolev regularity in the simplest possible set-up.

3.1 The Replacement Construction

In this section we describe the replacement construction that allows us to modify constant gradients by replacing them with an affine construction that preserves the boundary values. Moreover, the resulting new gradients are controlled (cf. Lemma 23).

We will make use of a construction by Otto [47], see also the video at [46], which is a variant of a construction by Conti [15]. We will recall the construction here in detail since it is not publicly available in printed form.

Lemma 19

(Variable Conti Construction)

Let \(\varOmega =(-1,1)^{2}\). Let \(\lambda \in (0,1)\). Define

$$\begin{aligned} & M_{0}=2 \begin{pmatrix} 0 & 1 \\ 0 &0 \end{pmatrix} ,\qquad M_{4}=2 \begin{pmatrix} 0 & 0 \\ -1& 0 \end{pmatrix} , \\ & M_{1}^{\lambda }= 2 \begin{pmatrix} 0 & \frac{-1+\lambda }{\lambda } \\ \frac{1-\lambda }{\lambda } & 0 \end{pmatrix} ,\qquad M_{2}^{\lambda }=\frac{-2(1-\lambda )}{1-(1-\lambda )^{2}} \begin{pmatrix} 1 & \lambda -1 \\ 1-\lambda & -1 \end{pmatrix} , \\ & M_{3}^{\lambda }=Q^{T}M_{2}^{\lambda }Q,\quad Q= \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . \end{aligned}$$

Then there exists \(u: \mathbb{R}^{2}\rightarrow \mathbb{R}^{2}\) Lipschitz such that

In our applications we make use of a version of this construction with slightly different conventions:

Corollary 20

Let \(\varOmega =(-1,1)^{2}\) and \(\lambda \in (0,1)\). Then there exists \(u: \mathbb{R}^{2}\rightarrow \mathbb{R}^{2}\) Lipschitz such that


$$\begin{aligned} \begin{aligned} &M_{0}=\frac{1}{\lambda } \begin{pmatrix} 0 & 1 \\ 0 &0 \end{pmatrix} ,\qquad M_{1}=\frac{1}{1-\lambda } \begin{pmatrix} 0 & -1 \\ 1 &0 \end{pmatrix} ,\qquad M_{2}=\frac{1}{1-\lambda ^{2}} \begin{pmatrix} 1 & -\lambda \\ \lambda &-1 \end{pmatrix} , \\ & M_{3}=\frac{1}{1-\lambda ^{2}} \begin{pmatrix} -1 & -\lambda \\ \lambda &1 \end{pmatrix} ,\qquad M_{4}=\frac{1}{\lambda } \begin{pmatrix} 0 & 0 \\ -1 &0 \end{pmatrix} , \end{aligned} \end{aligned}$$

and the volumes of the level sets of\(\nabla u\) (total volume 4) satisfy

$$\begin{aligned} & \bigl|\bigl\{ x: \nabla u(x)=M_{i}\bigr\} \bigr|= \textstyle\begin{cases} 2\lambda & i=0,4, \\ 2(1-\lambda )^{2} & i=1, \\ (1-\lambda )(1+\lambda ) & i=2,3. \end{cases}\displaystyle \end{aligned}$$

Proof of Corollary 20

We apply the construction of Lemma 19 with \(1-\lambda \) in place of \(\lambda \) and multiply the resulting function by \(\frac{1}{2 \lambda }\). We remark that the matrices of this corollary are the ones given on p. 56 of [47]. □
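As a quick consistency check of Corollary 20 (a Python sketch with numpy; the helper names are ours, not from the text): since \(u\) vanishes outside \(\varOmega \), the volume-weighted average of the matrices \(M_{0},\dots ,M_{4}\) must vanish, and the level-set volumes must add up to \(|\varOmega |=4\).

```python
import numpy as np

def conti_matrices(lam):
    """The matrices M_0, ..., M_4 of Corollary 20."""
    return [np.array([[0, 1], [0, 0]]) / lam,
            np.array([[0, -1], [1, 0]]) / (1 - lam),
            np.array([[1, -lam], [lam, -1]]) / (1 - lam**2),
            np.array([[-1, -lam], [lam, 1]]) / (1 - lam**2),
            np.array([[0, 0], [-1, 0]]) / lam]

def level_set_volumes(lam):
    """Volumes of the level sets {grad u = M_i}, i = 0, ..., 4."""
    return [2 * lam, 2 * (1 - lam)**2,
            (1 - lam) * (1 + lam), (1 - lam) * (1 + lam), 2 * lam]

for lam in (0.25, 0.5, 0.9):
    Ms, vols = conti_matrices(lam), level_set_volumes(lam)
    # the level sets exhaust the square (-1,1)^2 of volume 4 ...
    assert abs(sum(vols) - 4) < 1e-12
    # ... and, since u vanishes outside the square, the mean of grad u is zero
    assert np.allclose(sum(v * M for v, M in zip(vols, Ms)) / 4, 0)
```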

Proof of Lemma 19

In order to construct the function \(u\), we prescribe the value of \(u\) at the points \((0,\pm \lambda )\), \((\pm \lambda , 0)\) and then consider linear interpolations. This then yields a piecewise affine Lipschitz map. It remains to verify that all matrices are as claimed in the lemma (cf. also Fig. 4).

Fig. 4
figure 4

The level sets of \(\nabla u\) in Conti’s construction of Lemma 19 for \(\lambda =\frac{1}{2}\)

We start with the ansatz given in Fig. 5. The value of \(u\) at the points \((\pm \lambda ,0)\), \((0,\pm \lambda )\) is chosen in such a way that linear interpolation in the triangles on the sides of the square in Fig. 4 yields \(\nabla u \in \{M _{0},M_{4}\}\), i.e.,

$$\begin{aligned} u(\pm \lambda ,0) &= \pm 2 \begin{pmatrix} 0 \\ 1-\lambda \end{pmatrix} , \\ u(0, \pm \lambda )&= \pm 2 \begin{pmatrix} -1+\lambda \\ 0 \end{pmatrix} . \end{aligned}$$

By linear interpolation on the inner diamond we hence obtain that

$$\begin{aligned} & \nabla u \begin{pmatrix} 2 \lambda & 0 \\ 0 & 2 \lambda \end{pmatrix} =4 \begin{pmatrix} 0 & -1 +\lambda \\ 1-\lambda & 0 \end{pmatrix} \\ & \quad\Rightarrow\quad \nabla u =2 \begin{pmatrix} 0 & \frac{-1+\lambda }{\lambda } \\ \frac{1-\lambda }{\lambda } & 0 \end{pmatrix} . \end{aligned}$$

This is precisely the matrix \(M_{1}^{\lambda }\) from Lemma 19. It remains to check the value of \(\nabla u\) on the triangles which interpolate between the sides of the inner diamond and the corners of the outer square. By symmetry it suffices to consider the lower left triangle. Using again linear interpolation, there \(\nabla u\) has to satisfy

$$\begin{aligned} & \nabla u \begin{pmatrix} 1 & 1-\lambda \\ 1-\lambda & 1 \end{pmatrix} = \begin{pmatrix} 2(1-\lambda ) & 0 \\ 0 & -2(1-\lambda ) \end{pmatrix} \\ &\quad\Rightarrow\quad \nabla u =\frac{2(1-\lambda )}{1-(1-\lambda )^{2}} \begin{pmatrix} 1 & \lambda -1 \\ 1-\lambda & -1 \end{pmatrix} . \end{aligned}$$

Setting \(\lambda = 1/2\) this equals

$$\begin{aligned} \nabla u = \frac{2}{3} \begin{pmatrix} 2& -1 \\ 1 & -2 \end{pmatrix} , \end{aligned}$$

which is the matrix \(M_{2}^{\lambda }\) from Lemma 19 evaluated at \(\lambda =\frac{1}{2}\).

Fig. 5
figure 5

In a symmetric ansatz we prescribe \(\nabla u\) on the triangles with corners \((\pm 1, \pm 1)\), \((\pm \lambda ,0)\) and \((0,\pm \lambda )\). This leaves four triangles and a diamond-shaped region where \(\nabla u\) remains to be determined
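The interpolation computations of this proof can be cross-checked numerically. The following Python sketch (with numpy; variable names are ours) solves the two linear systems for \(\nabla u\) from the prescribed values of \(u\) at the diamond vertices:

```python
import numpy as np

lam = 0.3  # any value in (0, 1)

# Values of u at the diamond vertices, as prescribed in the proof:
u = {(lam, 0.0): np.array([0.0, 2 * (1 - lam)]),
     (-lam, 0.0): np.array([0.0, -2 * (1 - lam)]),
     (0.0, lam): np.array([2 * (-1 + lam), 0.0]),
     (0.0, -lam): np.array([2 * (1 - lam), 0.0])}

# Inner diamond: grad u maps the two diagonals onto the differences of the
# prescribed boundary values.
X = np.diag([2 * lam, 2 * lam])
B = np.column_stack([u[(lam, 0.0)] - u[(-lam, 0.0)],
                     u[(0.0, lam)] - u[(0.0, -lam)]])
G1 = B @ np.linalg.inv(X)
assert np.allclose(G1, 2 * np.array([[0, (-1 + lam) / lam],
                                     [(1 - lam) / lam, 0]]))  # = M_1^lambda

# Lower left corner triangle with vertices (-1,-1), (0,-lam), (-lam,0);
# u vanishes at the corner (-1,-1).
X = np.column_stack([np.array([0.0, -lam]) - np.array([-1.0, -1.0]),
                     np.array([-lam, 0.0]) - np.array([-1.0, -1.0])])
B = np.column_stack([u[(0.0, -lam)], u[(-lam, 0.0)]])
G2 = B @ np.linalg.inv(X)
c = 2 * (1 - lam) / (1 - (1 - lam) ** 2)
assert np.allclose(G2, c * np.array([[1, lam - 1], [1 - lam, -1]]))
```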


Using this construction as a basic building block, the following lemma allows us to replace a general matrix \(M \in \mathbb{R}^{2\times 2}\) and to restrict the replacement matrices to an \(\epsilon \)-neighborhood of a rank-one line passing through \(M\).

Lemma 21

(Deformed Conti Construction, p. 57 of [47])

Let\(M\), \(M_{0}\), \(M_{1}\)be given matrices such that

$$\begin{aligned} & M = \frac{1}{4}M_{0} + \frac{3}{4}M_{1}, \\ & M_{1}-M_{0}= a \otimes n, \quad a \cdot n =0. \end{aligned}$$

Then, for every\(\epsilon >0\)there exist matrices\(\tilde{M}_{1}\), \(\tilde{M}_{2}\), \(\tilde{M}_{3}\), \(\tilde{M}_{4}\)with

$$ \begin{aligned} & |\tilde{M}_{1}- M_{1}|< \epsilon ,\qquad |\tilde{M}_{2}-M_{2}|< \epsilon ,\qquad |\tilde{M}_{3}-M_{2}|< \epsilon ,\qquad |\tilde{M}_{4}-M|< \epsilon , \\ & \mbox{where}\quad M_{2}=\frac{1}{5}M_{0} + \frac{4}{5}M_{1}, \end{aligned} $$

a rectangle\(\varOmega \subset \mathbb{R}^{2}\)of aspect ratio\(\delta =\frac{\epsilon }{20 |a|}\)and a Lipschitz map\(u: \mathbb{R} ^{2}\rightarrow \mathbb{R}^{2}\)such that

$$\begin{aligned} &\nabla u = M\quad \textit{in } \mathbb{R}^{2} \setminus \varOmega , \\ &\nabla u \in \{M_{0}, \tilde{M}_{1}, \ldots , \tilde{M}_{4}\}, \\ &\bigl|\bigl\{ x: \nabla u (x)=M_{0}\bigr\} \bigr| \geq \frac{1}{8}| \varOmega |. \end{aligned}$$

Furthermore, the level sets of\(\nabla u\)are given by the union of at most 16 triangles.


Proof of Lemma 21

Replacing \(u\) by \(u - Mx\), it suffices to consider the case \(M=0\). Furthermore, rotating the rectangular domain by \(x \mapsto Qx\), \(Q \in \mathit{SO}(2)\), and scaling \(u \mapsto \frac{1}{|a|} u\), we may assume that \(n=(0,1)\) and \(a=(-1,0)\). Hence,

$$\begin{aligned} M_{0}= \frac{1}{\lambda } \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} ,\qquad M_{1}= \frac{1}{1-\lambda } \begin{pmatrix} 0 & -1 \\ 0 & 0 \end{pmatrix} ,\qquad M_{2}= \frac{1}{1-\lambda ^{2}} \begin{pmatrix} 0 & -\lambda \\ 0 & 0 \end{pmatrix} , \end{aligned}$$

where \(\lambda =\frac{1}{4}\) and \(-\frac{\lambda }{1-\lambda ^{2}}= - \frac{4}{15}=\frac{1}{5} (4) + \frac{4}{5}(-\frac{4}{3})\). Applying the construction of Corollary 20 (with \(\lambda = \frac{1}{4}\) instead) rescaled by \(\frac{2}{\lambda }\), we obtain a Lipschitz function \(v : \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}\), which vanishes outside the rectangle \(\tilde{\varOmega }=(-1,1)^{2}\) and satisfies

$$\begin{aligned} \bigl|\bigl\{ x: \nabla v(x)=M_{0}\bigr\} \bigr| \geq \frac{1}{8}|\tilde{ \varOmega }|. \end{aligned}$$

However, the values of \(\nabla v\) as given in (23) in Corollary 20 are not yet in an \(\epsilon \)-neighborhood of \(\{M,M_{1},M_{2}\}\). Hence, we consider the following change of coordinates and the following modified displacement:

$$\begin{aligned} (y_{1},y_{2}) &= \bigl(y_{1}(x),y_{2}(x) \bigr):=(x_{1}, \delta x_{2}), \\ \bigl(u_{1}(x),u_{2}(x)\bigr) &:= \bigl(\delta v_{1}(x), \delta ^{2} v_{2}(x)\bigr). \end{aligned}$$

We remark that this transforms the domain \(\tilde{\varOmega }=(-1,1)^{2}\) into the domain \((-1,1)\times (-\delta ,\delta )\) and moreover note that this scaling preserves volume fractions. Rewriting \(\nabla _{x} v(x)\) into \(\nabla _{y} u\) yields

$$\begin{aligned} \nabla _{y} u (y) = \begin{pmatrix} \delta \partial _{x_{1}} v_{1} & \partial _{x_{2}} v_{1} \\ \delta ^{2} \partial _{x_{1}} v_{2} & \delta \partial _{x_{2}} v_{2} \end{pmatrix} \Biggm| _{(x_{1}, x_{2}/\delta )} = \begin{pmatrix} 0 & \partial _{x_{2}} v_{1} \\ 0 & 0 \end{pmatrix} +\mathcal{O}\bigl(\delta |\nabla _{x} v|\bigr), \end{aligned}$$

which in particular leaves \(M_{0}\) invariant. Letting \(\delta \) be sufficiently small, we thus obtain the desired \(\epsilon \)-closeness. Undoing the initial rescaling with \(|a|\) leads to the precise requirement

$$\begin{aligned} \delta |\nabla _{x} v| |a| \leq \epsilon . \end{aligned}$$

This implies the claimed ratio for \(\varOmega \) by noting that \(|\nabla _{x} v| \leq 20\). □
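The effect of the anisotropic rescaling can be checked numerically. The following sketch (Python with numpy, for \(\lambda =\frac{1}{4}\); function names are ours) applies the gradient transformation above to the matrices of Corollary 20:

```python
import numpy as np

def squeeze(M, delta):
    """Gradient transformation induced by y = (x1, delta*x2) and
    u = (delta*v1, delta**2 * v2):
    grad_y u = diag(delta, delta^2) grad_x v diag(1, 1/delta)."""
    return np.diag([delta, delta**2]) @ M @ np.diag([1.0, 1.0 / delta])

lam = 0.25
M0 = np.array([[0, 1], [0, 0]]) / lam
M1 = np.array([[0, -1], [1, 0]]) / (1 - lam)
M2 = np.array([[1, -lam], [lam, -1]]) / (1 - lam**2)
M3 = np.array([[-1, -lam], [lam, 1]]) / (1 - lam**2)
M4 = np.array([[0, 0], [-1, 0]]) / lam

delta = 1e-6
# M_0 is left exactly invariant ...
assert np.allclose(squeeze(M0, delta), M0)
# ... while the other matrices collapse, up to O(delta), onto upper-triangular
# matrices: M_1 and M_4 onto the matrices of the proof and onto M = 0,
assert np.allclose(squeeze(M1, delta), [[0, -1 / (1 - lam)], [0, 0]], atol=1e-5)
assert np.allclose(squeeze(M4, delta), np.zeros((2, 2)), atol=1e-5)
# and both M_2 and M_3 onto the same matrix (whence |~M_3 - M_2| < eps):
target = np.array([[0, -lam / (1 - lam**2)], [0, 0]])
assert np.allclose(squeeze(M2, delta), target, atol=1e-5)
assert np.allclose(squeeze(M3, delta), target, atol=1e-5)
```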

Remark 22

We remark that both the side ratio \(\delta \) as well as the error \(\epsilon \) remain unchanged under rescalings of the form \(\mu u(\frac{x}{ \mu })\) (as this leaves the gradient invariant).

We now show how to apply Lemma 21 to the setting of symmetric matrices in our three-well problem (5):

Lemma 23

(Application to the three-well-problem, p. 60 ff. of [47])

Suppose that\(M \in \mathbb{R}^{2\times 2}\)with

$$ e(M):=\frac{1}{2}\bigl(M+M^{T}\bigr) \in \operatorname{intconv}(K) $$

and let\(\epsilon _{0}\leq \frac{\operatorname{dist}(e(M), \partial \operatorname{conv}(K))}{100}\). Let\(e^{(i)}\)with\(i\in \{1,2,3\}\)be such that

$$\begin{aligned} \bigl|e(M)-e^{(i)}\bigr| \leq \operatorname{dist}\bigl(e(M),K \bigr) + 4 \epsilon _{0}. \end{aligned}$$

Then, for every \(0<\epsilon <\epsilon _{0}\) there exist a Lipschitz function \(u:\mathbb{R}^{2}\rightarrow \mathbb{R}^{2}\), a rectangular domain \(\varOmega \) (with side ratio \(1:\delta \) and \(\delta =\frac{\epsilon }{20|e(M)-e^{(i)}|}\)), and matrices \(\tilde{M}_{0}, \dots , \tilde{M}_{4}\) whose symmetric parts are \(e^{(i)}\) and \(\tilde{e}_{1}, \dots ,\tilde{e}_{4} \in \operatorname{intconv}(K)\), respectively, such that

$$\begin{aligned} & u(x)= Mx \quad\textit{on } \mathbb{R}^{2} \setminus \varOmega , \\ & e(\nabla u)\in \bigl\{ e^{(i)},\tilde{e}_{1}, \dots , \tilde{e}_{4}\bigr\} \subset \operatorname{conv}(K) \quad\textit{in } \varOmega , \\ & \bigl|\bigl\{ x\in \varOmega : e(\nabla u) (x)= e^{(i)}\bigr\} \bigr|/|\varOmega |=\frac{1}{4}, \\ &\nabla u \in \{\tilde{M}_{0}, \dots , \tilde{M}_{4}\} \quad\textit{with } |\tilde{M}_{4}-M|\leq \epsilon . \end{aligned}$$

Remark 24

We point out that the strain \(e^{(i)}\) chosen in (26) is not required to be the well closest to \(e(M)\); it suffices that its distance to \(e(M)\) is not much larger than the distance to the closest well. This avoids constantly changing wells when the symmetrized part of a matrix lies very close to the midpoint between two wells. This becomes important in our quantitative analysis in Sects. 4 and 5, since the “rotated” case behaves considerably worse than the “parallel” case.


Proof of Lemma 23

Let \(M\) and \(e^{(i)}\) be given. Since \(e^{(1)}\), \(e^{(2)}\), \(e^{(3)}\) are arranged in an equilateral triangle with side length \(\sqrt{3}\) (with respect to the spectral norm) and as (26) holds, there exists \(\tilde{e}_{1} \in \operatorname{intconv}(K)\) such that

$$\begin{aligned} e(M)=\frac{1}{4}e^{(i)} + \frac{3}{4} \tilde{e}_{1}. \end{aligned}$$

Next let \(S:=\omega (M) \in \operatorname{Skew}(2)\) and let \(\tilde{S} \in \operatorname{Skew}(2)\) be a matrix to be determined below. Then we obtain

$$\begin{aligned} & M= \frac{1}{4}\bigl(e^{(i)}+S+3\tilde{S}\bigr)+ \frac{3}{4}(\tilde{e}_{1}+S- \tilde{S}), \\ & \bigl(e^{(i)}+S+3\tilde{S}\bigr) - (\tilde{e}_{1}+S- \tilde{S})= e^{(i)}- \tilde{e}_{1} +4 \tilde{S}. \end{aligned}$$

Since we are in two dimensions, any two symmetric, trace-free matrices are symmetrized rank-one connected (cf. Lemma 11). Thus, there exist vectors \(a\in \mathbb{R}^{2}\setminus \{0\}\), \(n\in S^{1}\) such that

$$\begin{aligned} e^{(i)}- \tilde{e}_{1} = \frac{1}{2}(a \otimes n + n \otimes a). \end{aligned}$$

Furthermore, as \(\operatorname{tr}(e^{(i)})= \operatorname{tr}( \tilde{e}_{1})\), \(a\) and \(n\) are orthogonal. Choosing

$$\begin{aligned} \tilde{S}:= \frac{1}{8}(a \otimes n - n \otimes a) = \frac{1}{4} \omega (a\otimes n)\quad \mbox{or}\quad \tilde{S}:=- \frac{1}{4} \omega (a \otimes n), \end{aligned}$$

we thus obtain that the matrices

$$\begin{aligned} M_{0}:=\bigl(e^{(i)}+S+3\tilde{S}\bigr), \qquad M_{1}:=(\tilde{e}_{1}+S-\tilde{S}) \end{aligned}$$

are rank-one connected (with difference \(a \otimes n\) or \(n \otimes a\), respectively) and

$$\begin{aligned} M= \frac{1}{4}M_{0} + \frac{3}{4}M_{1}. \end{aligned}$$

We may hence apply the construction of Lemma 21 with \(M\), \(M_{0}\), \(M_{1}\) as defined above. Noting that \(\|e(M)-e^{(i)} \|= \frac{|a|}{2} \) (cf. Lemma 13), Lemma 21 implies the statement on the side ratio for \(\varOmega \). Finally, we note that the \(\epsilon \)-closeness of the matrices \(\tilde{M}_{1}, \ldots , \tilde{M}_{4}\) also implies that their symmetric parts are \(\epsilon \)-close. □
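The decomposition used in the proof can be tested numerically. The following Python sketch (with our own helper names) recovers \(a\), \(n\) from a trace-free symmetric matrix via its spectral decomposition and verifies the resulting rank-one connection:

```python
import numpy as np

def sym_rank_one(E):
    """Decompose a trace-free symmetric 2x2 matrix E as (a@n + n@a)/2 with
    a.n = 0, via the spectral decomposition E = mu (v+ v+^T - v- v-^T)."""
    mu, V = np.linalg.eigh(E)          # eigenvalues (-mu, mu), mu >= 0
    vm, vp = V[:, 0], V[:, 1]
    n = (vp + vm) / np.sqrt(2)
    a = 2 * mu[1] * (vp - vm) / np.sqrt(2)
    return a, n

# A hypothetical trace-free strain difference e^(i) - ~e_1:
E = np.array([[0.6, -0.8], [-0.8, -0.6]])
a, n = sym_rank_one(E)
assert abs(a @ n) < 1e-12
assert np.allclose(0.5 * (np.outer(a, n) + np.outer(n, a)), E)

# With ~S := omega(a@n)/4, the difference of M_0 = e^(i) + S + 3~S and
# M_1 = ~e_1 + S - ~S equals E + 4~S = a@n (the skew part S cancels):
St = 0.25 * 0.5 * (np.outer(a, n) - np.outer(n, a))
diff = E + 4 * St
assert np.allclose(diff, np.outer(a, n))
assert abs(np.linalg.det(diff)) < 1e-12    # rank one
```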

Notation 25

In the preceding Lemma 23 the matrices \(\tilde{M} _{0}, \ldots , \tilde{M}_{4}\) obey the same (convexity) relations as the ones in Lemma 21, where for the matrices \(M_{0}\) and \(M_{1}\) we insert the ones from (29), cf. Figs. 6 and 7. The error estimates in (24) thus

  • motivate us to refer to the matrix \(\tilde{M}_{4}\) as stagnant (with respect to the replaced matrix \(M\)).

  • The matrices \(\tilde{M}_{1}\), \(\tilde{M}_{2}\), \(\tilde{M}_{3}\) will also be called pushed-out matrices (with the factors \(\frac{4}{3}\) and \(\frac{16}{15}\) respectively), since by construction

    $$ \frac{4}{3} \bigl\vert e(M)-e^{(i)} \bigr\vert -\epsilon \leq \bigl\vert e( \tilde{M}_{1})-e^{(i)} \bigr\vert \leq \frac{4}{3} \bigl\vert e(M)-e^{(i)} \bigr\vert + \epsilon , $$

    and similarly for the other matrices.

In order to emphasize the dependence on \(M\), we also use the notation

$$\begin{aligned} \tilde{M}_{0}(M),\dots ,\tilde{M}_{4}(M). \end{aligned}$$

Although the matrices \(\tilde{M}_{0},\dots ,\tilde{M}_{4}\) also depend on the choice of \(e^{(i)}\), in the sequel we will often suppress this additional dependence for convenience as the reference well will be clear in most of our applications.
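The push-out factors can be read off from the convex relations \(e(M)=\frac{1}{4}e^{(i)}+\frac{3}{4}\tilde{e}_{1}\) and \(M_{2}=\frac{1}{5}M_{0}+\frac{4}{5}M_{1}\):

```latex
\begin{aligned}
e(M)-e^{(i)} &= \tfrac{3}{4}\bigl(\tilde{e}_{1}-e^{(i)}\bigr)
  \quad\Longrightarrow\quad
  \bigl|\tilde{e}_{1}-e^{(i)}\bigr| = \tfrac{4}{3}\bigl|e(M)-e^{(i)}\bigr|, \\
e(M_{2})-e^{(i)} &= \tfrac{4}{5}\bigl(\tilde{e}_{1}-e^{(i)}\bigr)
  \quad\Longrightarrow\quad
  \bigl|e(M_{2})-e^{(i)}\bigr| = \tfrac{16}{15}\bigl|e(M)-e^{(i)}\bigr|.
\end{aligned}
```

Together with the \(\epsilon \)-closeness of \(e(\tilde{M}_{1})\) to \(\tilde{e}_{1}\) and of \(e(\tilde{M}_{2})\), \(e(\tilde{M}_{3})\) to \(e(M_{2})\), this yields the stated two-sided bounds.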

Fig. 6
figure 6

The horizontal axis corresponds to the upper right component of the matrix: \(M\sim 0, M_{0}\sim \frac{1}{\lambda }\), \(M_{1}\sim -\frac{1}{1- \lambda }\), \(M_{2}\sim - \frac{\lambda }{1-\lambda ^{2}}\). Here for \(\lambda =\frac{1}{4}\)

Fig. 7
figure 7

Relative positions of the symmetric part of the matrices inside the convex hull. Along the dashed rank-one line, the ordering of the matrices here is the same as in Fig. 6

We refer to the construction of Lemma 23 as the \((\epsilon ,\delta )\) Conti construction with respect to \(M\), \(e^{(i)}\). If some of these parameters are clear from the context, we occasionally omit them in the sequel.

We emphasize that in our construction in Lemma 23, we have the choice between two different solutions, which differ in the sign of their skew symmetric component and thus in the choice of the corresponding rank-one connection (cf. (28)). This freedom of choice is a central ingredient in the control over the skew symmetric part of the iterated constructions. We summarize this observation in the following corollary.

Corollary 26

Let\(M\), \(e^{(i)}\), \(\epsilon _{0}\), \(\epsilon \)be as in Lemma23, and\(a\), \(n\)as given in (27) in the proof of Lemma23. Then there exist two Lipschitz functions\(u_{+}, u_{-}: \mathbb{R}^{2}\rightarrow \mathbb{R}^{2}\)such that on the set where\(e(\nabla u_{\pm }) =e^{(i)}\)

$$\begin{aligned} \omega (\nabla u_{\pm })= \omega (M) \pm \frac{3}{4} \omega (a \otimes n) =: \omega (M) \pm \hat{S}. \end{aligned}$$

Furthermore, up to an error of size\(\epsilon \)the skew parts on the other level sets are given by

$$\begin{aligned} \omega (M), \quad \omega (M) \pm \frac{1}{3}\hat{S} , \quad \omega (M) \pm \frac{1}{15} \hat{S}. \end{aligned}$$


Proof of Corollary 26

From (29) we read off the skew symmetric parts of \(M_{0}\), \(M_{1}\). The skew symmetric part of \(M_{2}:= \frac{1}{5}M_{0} + \frac{4}{5}M_{1}\) then follows by linearity. The result is now a consequence of Lemma 21. □
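As an elementary cross-check of the fractions \(\frac{1}{3}\) and \(\frac{1}{15}\), the skew offsets relative to \(\omega (M)\) can be tracked as exact multiples of \(\omega (a\otimes n)\) (a small Python computation with our own naming):

```python
from fractions import Fraction as F

# Skew offsets relative to omega(M), as multiples of omega(a@n), read off
# from (29): omega(M_0) - omega(M) = 3~S, omega(M_1) - omega(M) = -~S,
# with ~S = omega(a@n)/4, and M_2 = (1/5) M_0 + (4/5) M_1.
St = F(1, 4)
w0 = 3 * St                        # offset on the M_0 level set
w1 = -St                           # offset on the M_1 level set
w2 = F(1, 5) * w0 + F(4, 5) * w1   # offset on the M_2 (and M_3) level sets
Shat = w0                          # ^S = (3/4) omega(a@n)

assert Shat == F(3, 4)
assert w1 / Shat == F(-1, 3)       # omega(M) -/+ (1/3) ^S
assert w2 / Shat == F(-1, 15)      # omega(M) -/+ (1/15) ^S
```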

3.2 The Convex Integration Algorithm

In this subsection we formulate our convex integration algorithm. It consists of two parts, Algorithms 27 and 30. The first part (Algorithm 27) determines the symmetric part of the iterated displacement vector field, while the second part (Algorithm 30) deals with the choice of the “correct” skew component.

After formulating the algorithms, we prove their well-definedness (i.e. show that it is indeed possible to iterate this construction as claimed).

In the whole section we assume that the domain \(\varOmega \) and the matrix \(M\) in (6) fit together in the sense that \(\varOmega = Q_{ \beta }[0,1]^{2}\), where \(Q_{\beta }\) is the rotation of the Conti construction from Lemma 21 for \(M\) (and the closest energy well \(e^{(i)}\)). We emphasize here the rotation angle \(\beta \), which will become important in our analysis. These “special” domains will play the role of the essential building blocks of the construction of convex integration solutions in general Lipschitz domains (cf. Sect. 6).

We define our convex integration scheme:

Algorithm 27

(Quantitative Convex Integration Algorithm, I)

We consider the following construction:

Step 0::

State space and data.

  1. (a)

    State space. Our state space is given by

    $$\begin{aligned} SP_{j}:=\bigl(j,u_{j},\{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}, e_{j} ^{(p)},\epsilon _{j}, \delta _{j}\bigr). \end{aligned}$$

    Here \(j \in \mathbb{N}\) and \(u_{j}: \varOmega \rightarrow \mathbb{R}^{2}\) is a piecewise affine function. The sets

    $$\begin{aligned} \varOmega _{j,k}\subset \varOmega \cap \{\nabla u_{j}=\mathrm{const} \}\cap \bigl\{ e(\nabla u_{j})\notin K\bigr\} \end{aligned}$$

    are closed triangles, which form a (up to null sets) disjoint, finite partition of the level sets of \(\nabla u_{j}\), for which \(e(\nabla u _{j}) \notin K\). Let \(\varOmega _{j}:= \bigcup_{k=1}^{J_{j}} \varOmega _{j,k}\) denote the set on which \(e(\nabla u_{j})\) is not yet in one of the energy wells.

    The function

    $$\begin{aligned} e^{(p)}_{j}:\varOmega \rightarrow K \end{aligned}$$

    is constant on each of the sets \(\varOmega _{j,k}\). It essentially keeps track of the well closest to \(e(\nabla u_{j}|_{\varOmega _{j,k}})\) for each \(j\), \(k\).

    The functions

    $$\begin{aligned} \epsilon _{j}, \delta _{j}: \varOmega \rightarrow \mathbb{R}, \end{aligned}$$

    are constant on each set \(\varOmega _{j,k}\) and vanish in \(\varOmega \setminus \varOmega _{j}\). They correspond to the error and side ratio in the Conti construction which is to be applied in \(\varOmega _{j,k}\). The functions \(\epsilon _{j}\), \(\delta _{j}\) are coupled by the relation

    $$\begin{aligned} \delta _{j} = \frac{\epsilon _{j}}{10^{2} d_{K}}, \mbox{ where } d_{K}:= \operatorname{dist}\bigl(e(M),K\bigr). \end{aligned}$$

    Hence, in the following (update) steps, we will mainly focus on \(\epsilon _{j}\) and assume that \(\delta _{j}\) is modified accordingly.

  2. (b)

    Data. Let \(M\in \mathbb{R}^{2\times 2}\) with \(e(M)\in \operatorname{intconv}(K)\). Let \(\varOmega = Q_{\beta }[0,1]^{2}\) with \(Q_{\beta }\) denoting the rotation associated with \(M\) (cf. explanations above). Further set

    $$\begin{aligned} d_{0} &:= \operatorname{dist}\bigl(e(M), \partial \operatorname{conv}(K)\bigr), \\ \epsilon _{0} &:= \min \biggl\{ \frac{d_{0}}{100}, \frac{1}{1600} \biggr\} , \qquad \delta _{0} := \frac{\epsilon _{0}}{10^{2} d_{K}}. \end{aligned}$$
Step 1::

Initialization, definition of \(SP_{1}\). We consider the data from Step 0 (b) and in addition define

$$\begin{aligned} u_{0}(x) &=Mx- \omega (M)x, \\ e^{(p)}_{0} &=\operatorname*{argmin}_{e \in K} \bigl|e(M)- e\bigr|. \end{aligned}$$

In the case of non-uniqueness in the above minimization problem, we arbitrarily choose any of the possible options.

Possibly dividing \(\delta _{0}\) by a factor of at most 100, we may assume that \(K_{0,0}:=\delta _{0}^{-1}\in \mathbb{N}\). We cover \(\varOmega =Q_{\beta }[0,1]^{2}\) by \(K_{0,0}\) translated copies of the \((\epsilon _{0},\delta _{0})\) Conti construction with respect to \(\nabla u_{0}\) and \(e^{(p)}_{0}\) (cf. Notation 25), which are disjoint up to null sets. We denote these sets by \(R_{0,1}^{1},\dots , R_{0,K_{0,0}}^{1}\). The covering \(\varOmega = \bigcup_{l=1}^{K_{0,0}} R_{0,l}^{1}\) with (up to null sets) disjoint sets \(R_{0,l}^{1}\), \(l\in \{1,\dots ,K_{0,0}\}\), is indeed possible, since by definition of the domain \(\varOmega \) these rectangles are parallel to one of the sides of \(\varOmega \) and since \(\delta _{0}^{-1} \in \mathbb{N}\). We apply Step 2(b) on these sets. As a consequence we obtain \(SP_{1}\).

Step 2::

Update. Let \(SP_{j}\) be given. Let \(M_{j,k}:=\nabla u _{j}|_{\varOmega _{j,k}}\) for some \(k\in \{1,\dots , J_{j}\}\). We explain how to update \(u_{j}\) and \(\epsilon _{j}\), \(\delta _{j}\) on \(\varOmega _{j,k}\).

We seek to apply the construction of Lemma 23 with \(\epsilon _{j,k}:=\epsilon _{j}|_{\varOmega _{j,k}}\), \(\delta _{j,k}:=\delta _{j}|_{\varOmega _{j,k}}\) and

$$\begin{aligned} e^{(p)}_{j,k}, \ M_{j,k} \end{aligned}$$

in a part of \(\varOmega _{j,k}\). To this end, we cover the domain \(\varOmega _{j,k}\) by a union of finitely many (up to null sets) disjoint triangles and rectangles. The rectangles are chosen as translated and rescaled versions of the domains in the \((\epsilon _{j,k}, \delta _{j,k})\) Conti construction with respect to the matrices from (31). We denote these rectangles by \(R_{j,l}^{k}\), \(l\in \{1,\dots ,K_{j,k}\}\), for some \(K_{j,k}\in \mathbb{N}\) and require that they cover at least a fixed volume fraction \(v_{0}>0\) of the overall volume of \(\varOmega _{j,k}\) (which is always possible, cf. Sect. 4 for our precise covering algorithm).

We define new sets \(\tilde{\varOmega }_{j+1,l}^{k}\), \(l\in \{1,\dots , \tilde{K}_{j,k}\}\): These are given by the triangles which are in \(\varOmega _{j,k}\setminus \bigcup_{l=1}^{K_{j,k}}R_{j,l}^{k}\) and by the triangles which form the level sets of the deformed Conti constructions on the rectangles \(R_{j,l}^{k}\).

  1. (a)

    For \(x\in \varOmega _{j,k}\setminus \bigcup_{l=1}^{K _{j,k}}R_{j,l}^{k}\) we define

    $$\begin{aligned} u_{j+1}(x) &:=u_{j}(x), \\ \epsilon _{j+1}(x) &:= \epsilon _{j}(x) \quad \bigl(\mbox{and hence } \delta _{j+1}(x):= \delta _{j}(x)\bigr), \\ e^{(p)}_{j+1}(x) &:= e^{(p)}_{j}(x). \end{aligned}$$

    Further we set \(\varOmega _{j+1,l}^{k}:=\tilde{\varOmega }_{j+1,l}^{k}\). Carrying this out for all \(k\in \{1,\dots ,J_{j}\}\) hence yields a collection of triangles

    $$ \bigl\{ \varOmega _{j+1,l}^{k}\bigr\} _{k\in \{1,\dots ,J_{j}\},l\in \{1,\dots ,K_{j,k} \}} $$

    covering \(\varOmega _{j}\setminus \bigcup_{k=1}^{J_{j}}\bigcup_{l=1}^{K_{j,k}}R_{j,l}^{k}\).

  2. (b)

    In the sets \(R_{j,l}^{k}\) we apply the Conti construction with the matrices from (31). In this application we choose the skew part according to Algorithm 30. With \(\tilde{\varOmega }_{j+1,l}^{k} \subset \bigcup_{k=1}^{J_{j}} \bigcup_{l=1}^{K_{j,k}} R_{j,l}^{k}\) as defined in Step 2 (a), we define \(u_{j+1}|_{\tilde{\varOmega }_{j+1,l}^{k}}\) as the function from the corresponding Conti construction. More precisely, in each of the rectangles \(R_{j,l}^{k}\) the matrix \(M_{j,k}\) has been replaced by the matrices

    $$\begin{aligned} \tilde{M}_{0}(M_{j,k}), \dots , \tilde{M}_{4}(M_{j,k}), \end{aligned}$$

    with \(e(\tilde{M}_{0}(M_{j,k})) = e^{(p)}_{j,k}\). For each \(x\in \tilde{\varOmega }_{j+1,l}^{k}\) with \(\tilde{\varOmega }_{j+1,l}^{k}\) as above, we define

    $$\begin{aligned} \epsilon _{j+1}(x):= \textstyle\begin{cases} \epsilon _{0} &\mbox{for } \nabla u_{j+1}|_{\tilde{\varOmega }_{j+1,k}} \in \{\tilde{M}_{1}(M_{j,k}),\dots ,\tilde{M_{3}}(M_{j,k})\}, \\ \epsilon _{j}(x)/2 &\mbox{for } \nabla u_{j+1}|_{\tilde{\varOmega }_{j+1,k}} = \tilde{M}_{4}(M_{j,k}), \\ 0 &\mbox{for } \nabla u_{j+1}|_{\tilde{\varOmega }_{j+1,k}} = \tilde{M}_{0}(M_{j,k}). \end{cases}\displaystyle \end{aligned}$$

    For the definition of \(\delta _{j+1}\) we recall its coupling with \(\epsilon _{j+1}\). We further set

    $$\begin{aligned} e^{(p)}_{j+1}(x) := \textstyle\begin{cases} \operatorname*{argmin} _{i\in \{1,2,3\}} & \{|e(\nabla u_{j+1})|_{ \tilde{\varOmega }_{j+1,k}} - e^{(i)}|\} \\ &\mbox{for } \nabla u_{j+1}|_{\tilde{\varOmega }_{j+1,k}} \in \{ \tilde{M}_{1}(M_{j,k}),\dots ,\tilde{M_{3}}(M_{j,k})\}, \\ e^{(p)}_{j}(x) &\mbox{for } \nabla u_{j+1}|_{\tilde{\varOmega }_{j+1,k}} = \tilde{M}_{4}(M_{j,k}), \\ e^{(p)}_{j}(x) &\mbox{for } \nabla u_{j+1}|_{\tilde{\varOmega }_{j+1,k}} = \tilde{M}_{0}(M_{j,k}). \end{cases}\displaystyle \end{aligned}$$

    Here we choose an arbitrary minimizer if there is non-uniqueness. Finally, we possibly split each of the sets \(\tilde{\varOmega }_{j+1,l}^{k} \subset \bigcup_{l=1}^{K_{j,k}}R_{j,l}^{k}\) into at most four smaller triangles (cf. Sect. 4.2) and add them to the collection \(\{\varOmega _{j+1,l}^{k}\}_{k\in \{1,\dots , J_{j}\}, l \in \{1,\dots , K_{j,k}\}}\). Upon relabeling this yields a new collection \(\{\varOmega _{j+1,k}\}_{k\in \{1, \dots , J_{j+1}\}}\).

As a result of Steps 2 (a) and (b) we obtain \(SP_{j+1}\).
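The bookkeeping of Step 2 (b) can be summarized in a small schematic (Python; the names and the explicit wells are our illustrative choices, consistent with the equilateral arrangement of side length \(\sqrt{3}\) used in the proof of Lemma 23):

```python
import numpy as np

# Illustrative wells for the hexagonal-to-rhombic transition, arranged in an
# equilateral triangle (an assumption for this sketch):
E1 = np.array([[1.0, 0.0], [0.0, -1.0]])
E2 = 0.5 * np.array([[-1.0, np.sqrt(3)], [np.sqrt(3), 1.0]])
E3 = 0.5 * np.array([[-1.0, -np.sqrt(3)], [-np.sqrt(3), 1.0]])
WELLS = [E1, E2, E3]

def closest_well(e):
    """Index of the well closest to the strain e (Frobenius norm, for simplicity)."""
    return min(range(3), key=lambda i: np.linalg.norm(e - WELLS[i]))

def update(role, eps_j, p_j, e_child, eps0):
    """Schematic (eps_{j+1}, e^(p)_{j+1}) assignment on a child triangle,
    depending on which Conti matrix the child carries."""
    if role == "pushed-out":   # ~M_1, ~M_2, ~M_3: reset eps, re-select the well
        return eps0, closest_well(e_child)
    if role == "stagnant":     # ~M_4: halve eps, keep the reference well
        return eps_j / 2, p_j
    if role == "well":         # ~M_0: e(grad u) lies in K, never modified again
        return 0.0, p_j
    raise ValueError(role)

assert update("stagnant", 0.01, 0, E1, 0.01) == (0.005, 0)
assert update("pushed-out", 0.005, 0, 0.9 * E2, 0.01) == (0.01, 1)
assert update("well", 0.01, 2, E3, 0.01) == (0.0, 2)
```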

While this algorithm prescribes the symmetric part of the iteration, we complement it with an algorithm which defines the choice of the skew part. Here the main objectives are to keep the resulting skew parts uniformly bounded (which is necessary, if we seek to obtain bounded solutions to (6)) and simultaneously to ensure the choice of the “right” rank-one direction (cf. Sect. 5, Lemma 63). Here the rank-one direction has to be chosen “correctly” in the sense that the successive Conti constructions are not rotated too much with respect to one another (which corresponds to the “parallel” case, cf. Definition 29).

In order to make this precise, we introduce two definitions: the first (Definition 28) allows us to introduce an “ordering” on the triangles in \(\{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) for different values of \(j\in \mathbb{N}\). With this at hand, we then define the notions of being parallel or rotated (cf. Definition 29).

Definition 28

Let \(D\in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) for \(j\geq 1\). Then a triangle \(\hat{D} \subset D\) is a descendant of\(D\)of order\(l\), if \(\hat{D}\in \{\varOmega _{j+l,k}\}_{k\in \{1,\dots ,J_{j+l} \}}\) is (part of) a level set of \(\nabla u_{j+l}\) and is obtained from \(D\) by an \(l\)-fold application of the update step of Algorithm 27 (where we specify the covering to be the one described in Sect. 4). The set of descendants of\(D\)of order\(l\) is denoted by \(\mathcal{D}_{l}(D)\). We define \(\mathcal{D}(D):=\bigcup_{l=1}^{\infty }\mathcal{D}_{l}(D)\).

A triangle \(\bar{D} \in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) is a predecessor of order\(l\)of\(D\), if \(D\in \mathcal{D}_{l}( \bar{D})\). We then write \(\bar{D}\in \mathcal{P}_{l}(D)\) and also use the notation \(\mathcal{P}(D)\) for the set of all predecessors of \(D\).

With this we define the parallel and the rotated cases:

Definition 29

Let \(e^{(p)}_{j,k}\) be as in Algorithm 27. Let \(D\in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) for \(j\geq 1\). Let \(j_{0}\neq 0\) be the smallest index for which there exists \(\bar{D}\in \mathcal{P}_{j_{0}}(D)\) with \(\bar{D} \neq D\) (i.e. \(\bar{D}\) was the last triangle in Algorithm 27 to which Step 2 (b) was applied instead of Step 2 (a)). Then, if for a.e. \(x\in D\)

$$\begin{aligned} e^{(p)}_{j}(x) = e^{(p)}_{j-j_{0}}(x), \end{aligned}$$

we say that in step\(j\)the triangle\(D\)is in the parallel case. If there is no possible confusion, we also just refer to \(D\) as in the parallel case.

If for a.e. \(x \in D\)

$$\begin{aligned} e^{(p)}_{j}(x) \neq e^{(p)}_{j-j_{0}}(x), \end{aligned}$$

we say that in step\(j\)the triangle\(D\)is in the rotated case. If there is no possible confusion, we also just refer to \(D\) as in the rotated case.

Let us comment on this definition: Intuitively, its objective is to describe whether successive Conti constructions can be chosen as essentially parallel or whether they are necessarily substantially rotated with respect to each other (hence, these notions will also play a crucial role in Sect. 4, where we construct our precise covering). More precisely, let \(SP_{j}\) be as in Algorithm 27 and let \(j\), \(j_{0}\), \(D\), \(\bar{D}\) be as in Definition 29. Then, at the iteration step \(j_{0}\) the triangle \(\bar{D}\) was a subset of one of the Conti rectangles \(R_{j-j_{0},l}^{k}\). Thus, \(u_{j-j_{0}}\) is modified according to the Conti construction with respect to \(\nabla u_{j-j_{0}}|_{\bar{D}}\), \(e_{j-j_{0}}^{(p)}|_{\bar{D}}\) in this domain. In particular, the difference of the matrices \(e(\nabla u_{j-j_{0}}|_{\bar{D}})\), \(e_{j-j_{0}}^{(p)}|_{\bar{D}}\) determines a direction \(e\) in strain space (up to a choice of the skew direction (cf. Corollary 26) this is directly related to the orientation of the Conti rectangle \(R_{j-j_{0},l}^{k}\)). By virtue of Corollary 20 all of the new matrices \(e(\tilde{M}_{0}(\nabla u_{j-j_{0}}|_{\bar{D}})),\dots \), \(e( \tilde{M}_{4}(\nabla u_{j-j_{0}}|_{\bar{D}}))\) essentially lie on the line \(e\) in strain space. Hence the direction which is determined by the difference of \(e^{(p)}_{j-j_{0}+1}|_{D}\) and \(e(\nabla u_{j-j_{0}+1}|_{D})\), is still essentially parallel to the directions \(e\) (in strain space). As by definition (we are now in Step 2(a) of Algorithm 27) the values of \(e^{(p)}_{j-j _{0} + l}|_{D}\) and of \(\nabla u_{j-j_{0} +l}|_{D}\) do not change further until \(l=j_{0}\) is reached, the requirement in (32) implies that the direction \(e\) spanned by \(e(\nabla u_{j-j_{0}}|_{\bar{D}})\), \(e^{(p)}_{j-j_{0}}|_{\bar{D}}\) and the one spanned by \(e(\nabla u_{j}|_{D})\), \(e^{(p)}_{j}|_{D}\) are essentially parallel (cf. Lemma 37 and Remark 38 for the precise statements). 
If we choose the correct skew directions in Step 2(b) of Algorithm 27, we can hence ensure that the successive Conti constructions are essentially parallel, if (32) is satisfied.

We remark that for this argument to hold and for it to yield new, significant information, it was necessary in Definition 29 to mod out the cases in which Step 2(a) was active, i.e., \(\mathcal{P}_{l}(D) = \{D\}\), as during these there are no changes.

If (33) holds, then the directions of the successive Conti constructions are necessarily substantially rotated with respect to each other (cf. Lemma 39 for the precise bounds). In this case we cannot substantially improve the situation to being more parallel by choosing the skew part appropriately in Corollary 26. Thus, in the sequel, we will exploit these instances as possibilities to control the size of the skew part and to use this, if necessary, to change the sign of the skew direction. The precise formulation of this is the content of Algorithm 30.

Algorithm 30

(Quantitative Convex Integration Algorithm, II)

Let \(\varOmega \), \(u_{j}:\varOmega \rightarrow \mathbb{R}^{2}\) and \(SP_{j}\) for \(j\geq 1\) be as in Algorithm 27. We further consider

$$\begin{aligned} \omega _{j}:\varOmega \rightarrow \operatorname{Skew}(2). \end{aligned}$$

This function will be defined to be piecewise constant on \(\varOmega \) and to be constant on each triangle \(\varOmega _{j,k}\). It will define the skew part of \(\nabla u_{j}\) on \(\varOmega _{j,k}\), i.e.,

$$\begin{aligned} \omega (\nabla u_{j}|_{\varOmega _{j,k}}) = \omega _{j}|_{\varOmega _{j,k}}. \end{aligned}$$
Step 1::

Initialization. Let \(M\) be as in Step 1 in Algorithm 27. Then we define

$$\begin{aligned} \omega _{0}(x) = 0 \mbox{ for a.e. } x \in \varOmega . \end{aligned}$$

In the initialization step of Algorithm 27 we choose \(\omega _{1}\) arbitrarily.

Step 2::

Update. Let \(j\in \mathbb{N}, j\geq 1\). Let \(\omega _{j}\) and \(\varOmega _{j,k}\) be given. Suppose that \(\tilde{\varOmega }_{j+1,l} ^{k}\) with \(\tilde{\varOmega }_{j+1,l}^{k}\in \mathcal{D}_{1}(\varOmega _{j,k})\) is constructed from \(\varOmega _{j,k}\) by our covering argument (cf. Step 2 in Algorithm 27). Then we define \(\omega _{j+1}\) as follows:

  1. (a)

    If \(\tilde{\varOmega }_{j+1,l}^{k}\) is not part of one of the Conti constructions in the covering, then we set

    $$\begin{aligned} \omega _{j+1}|_{\tilde{\varOmega }_{j+1,l}^{k}} = \omega _{j}|_{\varOmega _{j,k}}. \end{aligned}$$
  2. (b)

    If \(\tilde{\varOmega }_{j+1,l}^{k}\) is part of one of the Conti constructions in the covering, then by Algorithm 27 we seek to apply the construction of Lemma 23 with scale \(\epsilon _{j}|_{\varOmega _{j,k}}\) and \(e^{(p)}_{j}|_{\varOmega _{j,k}}\), \(\nabla u_{j}|_{\varOmega _{j,k}}\). Thus, by Corollary 26 we have two possible choices for the skew part of \(\nabla u_{j+1}\). These are determined by their sign. To define the sign, let \(j_{0}\in \mathbb{N}\) be the smallest integer such that \(D:=\mathcal{P}_{j_{0}}(\varOmega _{j,k}) \neq \varOmega _{j,k}\). We then choose the sign of the new skew direction \(\omega _{j+1}|_{\tilde{\varOmega }_{j+1,l}^{k}}\) (and hence determine the whole corresponding skew part) according to

    $$\begin{aligned} &\operatorname{sgn}(\omega _{j+1}|_{\tilde{\varOmega }_{j+1,l}^{k}} - \omega _{j}|_{\tilde{\varOmega }_{j+1,l}^{k}}) \\ &\quad := \left\{ \textstyle\begin{array}{l@{\quad}l} \operatorname{sgn}(\omega _{j}|_{\varOmega _{j,k}}- \omega _{j-j_{0}}|_{ \varOmega _{j,k}}) & \mbox{if } e^{(p)}_{j}|_{\varOmega _{j,k}} = e^{(p)} _{j-j_{0}}|_{D}, \\ -1 & \mbox{if } e^{(p)}_{j}|_{\varOmega _{j,k}} \neq e^{(p)}_{j-j _{0}}|_{D} \wedge \omega _{j}|_{\varOmega _{j,k}}\geq 0, \\ 1 & \mbox{if } e^{(p)}_{j}|_{\varOmega _{j,k}} \neq e^{(p)}_{j-j _{0}}|_{D} \wedge \omega _{j}|_{\varOmega _{j,k}}< 0. \end{array}\displaystyle \right. \end{aligned}$$

After having carried out the relabeling step, in which we pass from \(\tilde{\varOmega }_{j+1,l}^{k}\) to \(\varOmega _{j+1,l}\), the function \(\omega _{j+1}\) is constant on each of the triangles in \(\varOmega _{j+1,l}\). Together with Algorithm 27 this completes the construction of \(\nabla u_{j+1}\).
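The sign rule in Step (b) above can be condensed into a short sketch. This is an illustrative Python transcription only (all names are ours, not from the algorithm; the convention \(\operatorname{sgn}(0)=+1\) is fixed here for definiteness, which the text leaves open):

```python
# Hedged sketch of the sign update in Step (b) of the Update (Algorithm 30).
# `omega_j` and `omega_prev` stand for omega_j|_{Omega_{j,k}} and
# omega_{j-j_0}|_{Omega_{j,k}}; `parallel` encodes whether e^{(p)}_j agrees
# with e^{(p)}_{j-j_0} on D (the "parallel" case of the case distinction).

def sign(x: float) -> int:
    """Sign convention sgn(0) = +1 (any fixed convention works for the sketch)."""
    return 1 if x >= 0 else -1

def new_skew_sign(omega_j: float, omega_prev: float, parallel: bool) -> int:
    """Sign of omega_{j+1} - omega_j prescribed by the case distinction."""
    if parallel:
        # keep the previous skew direction: sgn(omega_j - omega_{j-j_0})
        return sign(omega_j - omega_prev)
    # rotated case: push the skew part back towards zero
    return -1 if omega_j >= 0 else 1
```

The rotated branch always drives the skew part towards zero, which is what prevents the skew components from accumulating.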

Let us comment on these algorithms: Due to the structure of the convex hulls (Lemma 10), our convex integration algorithm produces a (countably) piecewise affine solution (in contrast to the solutions obtained by means of an in-approximation scheme). This is reflected in the fact that the displacement \(u_{j}\) is not further modified in the piecewise polygonal domains in \(\varOmega \setminus \varOmega _{j}\). The preceding algorithm differs from a non-quantitative version of a convex integration scheme in several aspects:

  • We consider finite coverings of \(\varOmega \setminus \varOmega _{j}\) instead of directly covering the whole domain.

  • We prescribe the choice of \(\epsilon _{j}\) quantitatively.

  • We prescribe the skew part quantitatively.

These points are central in our higher regularity argument: As we seek to prove higher regularity by means of the interpolation result from Theorem 2 or Corollary 3, we have to control the BV norm of the resulting displacement gradients. However, with a countably infinite (self-similar) covering of the whole domain this is in general not possible (the total perimeter of the covering triangles is in general unbounded). Hence we only consider finite coverings, which produce a controlled (but growing) BV norm and simultaneously allow us to cover a sufficiently large volume fraction \(v_{0}\) of our domain \(\varOmega _{j}\). That it is possible to satisfy these two competing aims is the content of the covering results of the next sections (cf. Propositions 52 and 55). This finite covering of \(\varOmega _{j,k}\) is the reason for the splitting of Step 2 into two parts: Part (a) deals with the triangles which are not covered by Conti constructions and are in this sense “errors” (in the sense that \(u_{j}\) is not modified there), while part (b) deals with the part of the domain that is covered by Conti constructions, on which \(u_{j}\) is modified.
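The perimeter obstruction for exhaustive self-similar coverings can already be seen for dyadic squares. The following minimal illustration (squares instead of the algorithm's triangles, but the same mechanism) shows that covering the unit square by ever finer tiles makes the total perimeter blow up:

```python
# Why countably infinite self-similar coverings are forbidden here: covering
# the unit square exactly by 4**k dyadic squares of side 2**-k produces a
# total perimeter 4 * 2**k, which is unbounded as k -> infinity.
# (Illustration only; the algorithm's coverings use triangles, but the same
# perimeter obstruction applies.)

def total_perimeter(k: int) -> float:
    n_squares = 4 ** k            # number of dyadic squares of side 2**-k
    per_square = 4 * 2 ** (-k)    # perimeter of each such square
    return n_squares * per_square  # = 4 * 2**k
```

Hence any covering strategy that refines indefinitely while insisting on full coverage loses all BV control, which is exactly why the algorithm covers only a fixed volume fraction per step.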

The specification of \(\epsilon _{j}\) is of key relevance as well. It distinguishes in a quantitative way whether a new rank-one connection is rotated or not with respect to the corresponding last rank-one connection. In our BV estimate this leads to different bounds (cf. the perimeter estimates in Propositions 52 and 55). In particular we cannot afford substantial rotations, as long as \(\epsilon _{j}\ll \epsilon _{0}\) is very small, since this would yield superexponential growth for the BV norms, which cannot be compensated in our estimates (cf. Fig. 8 and the corresponding explanations for the intuition behind this).

Fig. 8 Covering a rectangle \(R_{1,\delta _{0}}\) of side lengths 1 and \(\delta _{0}\) by (a) a parallel rectangle of half its aspect ratio, (b) an orthogonal rectangle of aspect ratio \(r=\delta _{j}\) (which could for instance be \(r=\delta _{0}/2\))

Due to the relation between the size of the scales \(\delta _{j}\) (which itself is directly coupled to the admissible error \(\epsilon _{j}\)) and our regularity estimates, we in general seek to choose the value of \(\epsilon _{j}\) as large as possible without leaving \(\operatorname{intconv}(K)\). By the intercept theorem, it is always possible to choose \(\epsilon _{j}\) to be “relatively large” in the push-out steps (cf. Notation 25). However, for stagnant matrices, this is no longer possible. Here we have to ensure a choice of \(\epsilon _{j}\) which is summable in \(j\in \mathbb{N}\) (in Algorithm 27 we choose it geometrically decaying), in order to avoid leaving \(\operatorname{intconv}(K)\). These considerations lead to the case distinction in the definition of \(\epsilon _{j+1}\) in Step 2(b) of Algorithm 27.
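The role of the geometric decay on stagnant parts can be checked numerically. The following sketch (illustrative names, not the algorithm's notation) verifies that with \(\epsilon _{j+1}=\epsilon _{j}/2\) the total accumulated drift in strain space stays below \(\epsilon _{0}\), so stagnant steps alone cannot push the iteration out of \(\operatorname{intconv}(K)\):

```python
# Sketch: halving the admissible error in every stagnant step keeps the
# accumulated drift summable, since eps_0 * sum_{m>=1} 2**-m = eps_0.

def total_stagnant_drift(eps0: float, n_steps: int) -> float:
    drift, eps = 0.0, eps0
    for _ in range(n_steps):
        eps /= 2.0        # geometric decay: eps_{j+1} := eps_j / 2
        drift += eps      # worst-case displacement in strain space this step
    return drift          # strictly below eps0 for every n_steps
```

A merely summable but slowly decaying choice would work for well-definedness too; the geometric rate is what later feeds into the BV estimates.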

Finally, the quantitative prescription of the skew part is central to deduce the quantitative BV bound of Lemma 63, as we have to take care that, as long as we remain “parallel” in strain space (cf. Definition 29), we approximately preserve the same skew direction. This is necessary to prevent the Conti constructions from being substantially rotated with respect to each other if \(\epsilon _{j}\) is very small and constitutes a crucial ingredient in the derivation of our perimeter and BV estimates in Sects. 4 and 5 (cf. Fig. 8 for the intuition behind this).

The normalization of the initial skew part is convenient (though not necessary).

3.3 Well-Definedness of Algorithms 27 and 30

We now proceed to prove that Algorithms 27 and 30 are well-defined. Here in particular, it is crucial to show that with our choice of the admissible error \(\epsilon _{j}\), we do not leave \(\operatorname{intconv}(K)\) in the iteration except to attain one of the energy wells in \(K\) (cf. Proposition 31). Moreover, we seek to construct solutions to (6) which are Lipschitz regular. These points are the content of the following two Propositions 31 and 34, which deal with the symmetric and anti-symmetric parts respectively. To show these we will rely on several auxiliary observations.

3.3.1 Symmetric Part

We begin by discussing the symmetric part and by showing that in our construction it does not leave \(\operatorname{intconv}(K)\), except to reach \(K\).

Proposition 31

(Symmetric Part)

Define
$$ d: \mathbb{R}^{2 \times 2} \rightarrow [0,\infty ], \qquad N \mapsto \operatorname{dist}\bigl(e(N), \partial \operatorname{conv}(K) \bigr), $$

and let \(SP_{j}\) and \(M\) be as in Algorithm 27. Then for every \(j,k \in \mathbb{N}\) and every domain \(\varOmega _{j,k}\in \{ \varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) there holds

$$\begin{aligned} d(\nabla u_{j}|_{\varOmega _{j,k}}) \geq \min \biggl\{ \frac{1}{16},d(M) \biggr\} - 2(\epsilon _{0} - \epsilon _{j}|_{\varOmega _{j,k}}). \end{aligned}$$

In particular, for all \(j\geq 1\) it holds that \(\nabla u_{j}(x) \in \operatorname{intconv}(K)\) for almost all \(x\in \varOmega _{j}\).


Proof

We prove the statement inductively. For \(j=0\), we note that this holds since \(\epsilon _{j}=\epsilon _{0}\) and \(\nabla u_{0}=M\).

Let thus \(\nabla u_{j}|_{\varOmega _{j,k}}=:M_{j,k}\) be given. We only show that the result remains true for \(j+1\) in the regions in which the Conti construction is applied, as in the other regions it holds by the induction hypothesis (since \(\epsilon _{j+1}=\epsilon _{j}\) there). Let \(\tilde{M}_{0}(M_{j,k}),\dots , \tilde{M}_{4}(M_{j,k})\) be the matrices by which \(M_{j,k}\) is replaced in the application of the Conti construction of Lemma 23. We consider first the pushed out matrices (see also Notation 25). If the edge of \(\partial \operatorname{conv}(K)\) closest to \(\tilde{M}_{l}(M _{j,k})\), \(l=1,2,3\), is different from the edge closest to \(M_{j,k}\), then by construction \(d(\tilde{M}_{l}(M_{j,k}))\geq 1/16\). It thus remains to discuss the situation in which this is not the case. In this situation, by the intercept theorem and the induction hypothesis, for \(l=1,2,3\) (for which \(\epsilon _{j+1}|_{\tilde{\varOmega }_{j+1,l}^{k}}= \epsilon _{0}\)), it holds that

$$\begin{aligned} d\bigl(\tilde{M}_{l}(M_{j,k})\bigr) &\geq \frac{16}{15}d(M_{j,k})- \epsilon _{0} \\ &\geq \frac{1}{15}d(M_{j,k}) - \epsilon _{0} + \min \biggl\{ \frac{1}{16},d(M) \biggr\} -2( \epsilon _{0} - \epsilon _{j}|_{\varOmega _{j,k}}) \\ &\geq \min \biggl\{ \frac{1}{16},d(M) \biggr\} + \frac{98}{15} \epsilon _{0} - 3\epsilon _{0} \\ &\geq \min \biggl\{ \frac{1}{16},d(M) \biggr\} . \end{aligned}$$

Here we used the definition of \(\epsilon _{0}\) (cf. Step 0 (b)) and the induction hypothesis combined with the bound

$$\begin{aligned} d(M_{j,k}) &\geq \min \biggl\{ d(M),\frac{1}{16} \biggr\} -2 \epsilon _{0}=98 \epsilon _{0}. \end{aligned}$$

Finally, for \(\tilde{M}_{4}(M_{j,k})\) we estimate

$$\begin{aligned} d\bigl(\tilde{M}_{4}(M_{j,k})\bigr) &\geq d(M_{j,k})- \epsilon _{j}|_{\varOmega _{j,k}} \geq \min \biggl\{ d(M), \frac{1}{16} \biggr\} +(2\epsilon _{j}|_{ \varOmega _{j,k}}- \epsilon _{j}|_{\varOmega _{j,k}})-2\epsilon _{0} \\ & = \min \biggl\{ d(M), \frac{1}{16} \biggr\} + 2(\epsilon _{j+1}|_{ \tilde{\varOmega }_{j+1,l}^{k}} -\epsilon _{0}). \end{aligned}$$

This concludes the proof. □
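The push-out chain above can be sanity-checked numerically. The following sketch assumes, consistently with the identity \(\min \{1/16,d(M)\}-2\epsilon _{0}=98\epsilon _{0}\) used in the proof, the normalization \(\epsilon _{0}=\min \{1/16,d(M)\}/100\); the numerical values are illustrative, not from the text:

```python
# Sanity check of Proposition 31: the intercept-theorem estimate
#   d(M~_l) >= (16/15) d(M_{j,k}) - eps_0,  l = 1, 2, 3,
# keeps the pushed-out matrices at distance >= min{1/16, d(M)} from the
# boundary of conv(K), provided d(M_{j,k}) >= 98 * eps_0.

def pushed_out_lower_bound(d_jk: float, eps0: float) -> float:
    return (16.0 / 15.0) * d_jk - eps0

d_M = 0.05                              # illustrative distance of the datum M
eps0 = min(1.0 / 16.0, d_M) / 100.0     # assumed normalization (see lead-in)
threshold = min(1.0 / 16.0, d_M)        # the bound to be preserved
```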

3.3.2 Skew Symmetric Part

In order to deal with the skew part and to show its boundedness, we need several auxiliary results. These are targeted at controlling the maximal number of push-out steps in the parallel case (cf. Lemma 33), where the notions “parallel” and “rotated” are used as in Definition 29. With the control of the maximal number of push-out steps at hand, we can then present a bound on the skew part of the gradients from Algorithms 27 and 30 (cf. Proposition 34). Together with the boundedness of the symmetrized gradient this yields the uniform \(L^{\infty }\) bounds on \(\nabla u_{j}\).

We begin by estimating the distance to the wells.

Lemma 32

Let \(SP_{j}\) be the \(j\)-th step of the convex integration construction obtained in Proposition 36. Then for every level set \(\varOmega _{j,k}\) it holds

$$\begin{aligned} &\operatorname{dist}\bigl(e(\nabla u_{j}|_{\varOmega _{j,k}}), K\bigr) \geq \min \{d _{K}, 1/8\} - 2(\epsilon _{0}-\epsilon _{j}|_{\varOmega _{j,k}}), \end{aligned}$$

where \(d_{0}\), \(d_{K}\), \(\epsilon _{0}\) are as in Step 0(b) in Algorithm 27.

The statement of this lemma is very similar to the result of Proposition 31. However, instead of controlling the distance to the boundary, we here estimate the distance to the wells. This can be substantially larger than the distance to the boundary.


Proof

The proof follows along the same lines as that of Proposition 31. We note that the statement is true for \(j=0\) (by the definition of \(\epsilon _{0}\)) and proceed by induction. Let thus \(M_{j,k}:=\nabla u_{j}|_{\varOmega _{j,k}}\) be given. With slight abuse of notation we set \(\epsilon _{j}:=\epsilon _{j}|_{\varOmega _{j,k}}\). It suffices to show that the values of \(\nabla u_{j+1}\), which were obtained from \(M_{j,k}\) by an application of the Conti construction, still satisfy the desired estimates (in the domains in which \(u_{j}\) is unchanged, the estimate holds by the induction assumption). The application of the Conti construction yields matrices \(\tilde{M} _{0}(M_{j,k}), \ldots ,\tilde{M}_{4}(M_{j,k})\). As \(e(\tilde{M}_{0}(M _{j,k}))\in K\), we only consider the other matrices. We consider the matrices \(\tilde{M}_{1}(M_{j,k}),\ldots ,\tilde{M}_{3}(M_{j,k})\), which are constructed by “pushing-out” (cf. Notation 25). Without loss of generality (cf. the argument in Proposition 31), we only discuss the case in which the closest well for \(e(\tilde{M}_{i}(M_{j,k}))\) is the same as for \(e(M_{j,k})\). For \(i\in \{1,2,3\}\) we have

$$\begin{aligned} \operatorname{dist}\bigl(e\bigl(\tilde{M}_{i}(M_{j,k}) \bigr),K\bigr) &\geq \frac{16}{15} \operatorname{dist}\bigl(e(M_{j,k}),K \bigr)-\epsilon _{0} \\ & \geq d_{K} + \frac{1}{15}\operatorname{dist} \bigl(e(M_{j,k}),K\bigr) - \epsilon _{0} - 2(\epsilon _{0}-\epsilon _{j}) \\ & \geq d_{K} + \frac{1}{15} d(M_{j,k}) - \epsilon _{0} - 2(\epsilon _{0}-\epsilon _{j}) \\ & \geq d_{K}. \end{aligned}$$

Here we used the induction assumption for \(M_{j,k}\) as well as the estimate for \(d(M_{j,k}) \) from Proposition 31 and the definition of \(\epsilon _{0}\).

For \(\tilde{M}_{4}(M_{j,k})\) we estimate

$$\begin{aligned} \operatorname{dist}\bigl(e\bigl(\tilde{M}_{4}(M_{j,k}) \bigr),K\bigr) &\geq \operatorname{dist}\bigl(e(M_{j,k}),K\bigr)- \epsilon _{j} \\ & \geq d_{K} - \epsilon _{j} - 2(\epsilon _{0}-\epsilon _{j}) \\ & \geq d_{K} - 2(\epsilon _{0}-\epsilon _{j+1}). \end{aligned}$$

Here we used the definition of \(\epsilon _{j+1}:= \epsilon _{j}/2\) on the subset of the Conti construction on which \(\tilde{M}_{4}(M_{j,k})\) is attained. □

Using Lemma 32 and recalling Definitions 28 and 29, we bound the maximal number of possible push-out steps in the parallel situation:

Lemma 33

Let \(SP_{j}\) and \(\omega _{j}\) be as in Algorithms 27 and 30. Assume that \(D\in \{ \varOmega _{j_{0} + n,k}\}_{k\in \{1,\dots ,J_{j_{0}+n}\}}\) and suppose that the construction of \(D\) from \(\bar{D}\in \mathcal{P}_{n}(D)\) involves \(k\in \mathbb{N}\cup \{0\}\) push-out steps (cf. Notation 25). Further assume that for a.e. \(x\in D\) and for all \(r \in \{1,\dots ,n\}\)

$$\begin{aligned} e^{(p)}_{j_{0} + r}(x)= e^{(p)}_{j_{0}}(x) . \end{aligned}$$

Then there exists a number \(N_{0}=N_{0}(d_{K})\) such that \(0\leq k \leq N_{0}\).


Proof

The proof relies on the definition of \(\epsilon _{0}\) and the control on the distance to the wells, which was obtained in Lemma 32. Indeed, let \(\varOmega _{j_{0}+l,m}\in \{\varOmega _{j_{0}+l,\tilde{m}}\}_{ \tilde{m}\in \{1,\dots ,J_{j_{0}+l}\}}\) with \(\varOmega _{j_{0}+l,m} \subset \bar{D}\) be arbitrary but fixed. Without loss of generality, we assume that in all the iteration steps \(j_{0},\dots ,j_{0}+l\) Step 2(b) occurs on our respective domain (as there is no change if Step 2(a) occurs, and as we are only interested in the maximal number of steps in which a specific change, i.e., a push-out, occurs). Let \(M_{j}:= \nabla u_{j}|_{\varOmega _{j_{0}+l,m}}\) be given. Suppose that a matrix \(M_{j_{0}+n+1}\) is obtained from \(M_{j_{0}+n}\) for some \(n\in \{1, \dots ,l\}\) by push-out and that \(M_{j_{0}+n}\) is obtained from \(M_{j_{0}}\) by stagnating \(n\) times. Then,

$$\begin{aligned} \operatorname{dist}\bigl(e(M_{j_{0}+n+1}),e_{j_{0}}^{(p)} \bigr) &\geq \frac{16}{15}\operatorname{dist}\bigl(e(M_{j_{0}+n}),e_{j_{0}}^{(p)} \bigr) - \epsilon _{0} \\ &\geq \frac{16}{15} \operatorname{dist}\bigl(e(M_{j_{0}}), e_{j_{0}}^{(p)}\bigr) - \frac{16}{15}\epsilon _{j_{0}} \sum_{j=1}^{n}2^{-j} - \epsilon _{0} \\ &\geq \frac{16}{15} \operatorname{dist}\bigl(e(M_{j_{0}}), e_{j_{0}}^{(p)}\bigr) - \frac{32}{15}\epsilon _{j_{0}}- \epsilon _{0} \\ & \geq \frac{101}{100}\operatorname{dist}\bigl(e(M_{j_{0}}), e_{j_{0}} ^{(p)}\bigr). \end{aligned}$$

Here we have used (34), the result of Lemma 32, the fact that each consecutive stagnation step decreases the value of \(\epsilon _{j}\) by a factor \(2^{-1}\) and the definition of \(\epsilon _{0}\). Thus, defining \(k\) as the number of push-out steps, we infer that

$$\begin{aligned} \operatorname{dist}\bigl(e(M_{j_{0}+l}),e_{j_{0}}^{(p)} \bigr) &\geq \biggl( \frac{101}{100} \biggr)^{k} \operatorname{dist}\bigl(e(M_{j_{0}}),e_{j _{0}}^{(p)} \bigr) \geq \biggl( \frac{101}{100} \biggr)^{k} (d_{K} - \epsilon _{0}) \\ &\geq \biggl( \frac{101}{100} \biggr)^{k} \frac{99}{100} d_{K}, \end{aligned}$$

where \(d_{K}\) is defined as in Step 0 (b) in Algorithm 27. Therefore, by Step 2 of Algorithm 27 (i.e. the update for \(e^{(p)}_{j}\)) after at most

$$\begin{aligned} N_{0}:= \frac{\log (\frac{3}{d_{K}})}{\log ( \frac{101}{100} )} \end{aligned}$$

push-out steps, we are no longer in the parallel case. This yields the desired upper bound. □
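The counting bound of Lemma 33 can be checked numerically. In the sketch below the well distance \(d_{K}=0.1\) is an illustrative value and the diameter bound 3 is the one used in the proof; the function names are ours:

```python
# Sketch of the push-out counting in Lemma 33: each push-out step multiplies
# the distance to the "parallel" well by at least 101/100, starting from at
# least (99/100) * d_K, while that distance can never exceed the diameter
# bound 3.  Hence at most N_0 ~ log(3/d_K) / log(101/100) push-outs fit.

import math

def max_pushouts(d_K: float) -> float:
    return math.log(3.0 / d_K) / math.log(101.0 / 100.0)

def distance_after(k: int, d_K: float) -> float:
    # lower bound (101/100)**k * (99/100) * d_K from the displayed estimate
    return (101.0 / 100.0) ** k * (99.0 / 100.0) * d_K
```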

Relying on the previous lemma, we obtain a uniform bound on the skew part:

Proposition 34

(Skew Symmetric Part)

Let \(SP_{j}\) and \(\omega _{j}\) be as in Algorithms 27 and 30. Suppose that \(N_{0}>0\) is the number from Lemma 33. Define \(\bar{C}:=\max \{ 100, 20 (N _{0} +1) (1+\epsilon _{0}) \}\). Then,

$$\begin{aligned} \bigl|\omega _{j}(x)\bigr| \leq \bar{C} + 2(\epsilon _{0} - \epsilon _{j}) \quad\textit{for all } x\in \varOmega _{j}, \end{aligned}$$

and
$$\begin{aligned} \bigl|\omega _{j}(x)\bigr| \leq 2 \bar{C} \quad\textit{for all } x\in \varOmega \setminus \varOmega _{j}. \end{aligned}$$


Proof

We prove the claims inductively and note that \(\omega _{0}=0\) satisfies them. We first discuss (35) and show that it remains true for \(\omega _{j}\) with \(j\in \mathbb{N}\). To this end, let \(l\in \mathbb{N}\) and \(D\in \{\varOmega _{j+l,k}\}_{k\in \{1,\dots ,J _{j+l}\}}\). For brevity we set \(M_{j}:= \nabla u_{j}|_{D}\), \(\tilde{\omega }_{j}:=\omega _{j}|_{D}\) (and recall that \(\omega _{j}|_{D} = \omega (\nabla u_{j}|_{D})\)) and first assume that \(\tilde{\omega } _{j} \leq 0\) (see Notation 12). We begin by making the following additional assumption:

Assumption 35

We suppose that the skew matrix \(\tilde{\omega }_{j+l}\) is derived from \(\tilde{\omega }_{j}\) by an \(l\)-fold application of Algorithms 27 and 30, where in the Conti construction of Corollary 26 we always choose the positive skew direction.

We point out that this situation can occur both in the parallel and in the rotated case, but it ensures that the skew direction is not changed in this process. In other words, Assumption 35 implies that the sign of the skew direction, which is chosen in Corollary 26, remains fixed. We hence refer to this situation as the “fixed sign case”. We further introduce the auxiliary functions

$$\begin{aligned} N_{l,1}, N_{l,2}, N_{l}: \varOmega \rightarrow \mathbb{N}\cup \{0\}, \end{aligned}$$

with \(N_{l}:= N_{l,1}+N_{l,2}\). Here for given \(l\in \mathbb{N}\) and \(\tilde{\omega }_{j+l}\), we define \(N_{l,1}\) as the number of \(4/3\) push-out steps in the process of obtaining \(\tilde{\omega }_{j+l}\) from \(\tilde{\omega }_{j}\), and \(N_{l,2}\) as the number of \(16/15\) push-out steps. By Lemma 33 we know that \(0\leq N_{l}\leq N_{0}\).

Step 1: Upper bound in the fixed sign case. We first deal with the upper bound for \(\tilde{\omega }_{j+l}\). To this end we note that

$$\begin{aligned} \tilde{\omega }_{j+1} \leq \frac{4}{3} \operatorname{dist} \bigl(e(\nabla u _{j})|_{D},K\bigr) + \epsilon _{j}|_{D} + \tilde{\omega }_{j}. \end{aligned}$$

We iterate this estimate:

$$\begin{aligned} \tilde{\omega }_{j+l} \leq \frac{4}{3} N_{l,1} 3 + \frac{16}{15} N _{l,2} 3 + (N_{l,1}+N_{l,2}) \epsilon _{0} + \epsilon _{0} N_{l} \sum _{m=1}^{l} 2^{-m} + \tilde{\omega }_{j}. \end{aligned}$$

Here we used the estimate \(\operatorname{dist}(e(\nabla u_{j})|_{D},K) \leq 3\) and the fact that in each push-out step an error \(\epsilon _{0}\) is possible, while in each stagnant step the error is decreased by a factor two. Recalling the definition of \(\bar{C}\) and using that \(N_{l}\leq N_{0}\), this implies

$$\begin{aligned} \tilde{\omega }_{j+l} \leq \bar{C}/2 + \tilde{\omega }_{j}. \end{aligned}$$

Using that \(\tilde{\omega }_{j}\leq 0\), therefore allows us to conclude that

$$\begin{aligned} \tilde{\omega }_{j+l} \leq \bar{C}/2. \end{aligned}$$
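The iterated estimate of Step 1 can be verified numerically. The sketch below transcribes the displayed bound with illustrative parameter values (\(N_{0}=5\), \(\epsilon _{0}=0.01\) are our choices, not from the text) and checks it against \(\bar{C}/2\):

```python
# Sketch of the Step 1 bound: at most N_0 push-out steps, each contributing
# at most (4/3)*3 or (16/15)*3 plus an eps_0 error, while stagnant steps only
# add geometrically decaying errors.  The total stays below C_bar / 2.

def skew_upper_bound(n1: int, n2: int, eps0: float, l: int) -> float:
    geom = sum(2.0 ** -m for m in range(1, l + 1))     # sum_{m=1}^{l} 2**-m < 1
    return (4.0 / 3.0) * 3.0 * n1 + (16.0 / 15.0) * 3.0 * n2 \
        + (n1 + n2) * eps0 + eps0 * (n1 + n2) * geom

def c_bar(n0: int, eps0: float) -> float:
    # C_bar := max{100, 20 (N_0 + 1)(1 + eps_0)} as in Proposition 34
    return max(100.0, 20.0 * (n0 + 1) * (1.0 + eps0))
```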

Step 2: Lower bound in the fixed sign case. Still working under the assumptions from above, we now bound the negative part of \(\tilde{\omega }_{j+l}\). Here we show that

$$\begin{aligned} \tilde{\omega }_{j+l} \geq -\bar{C} - 2(\epsilon _{0} - \epsilon _{j+l}|_{D}). \end{aligned}$$

We first consider the push-out steps. Let \(\tilde{M}_{1}(M_{j+l-1})\), \(\tilde{M}_{2}(M_{j+l-1})\), \(\tilde{M}_{3}(M_{j+l-1})\) be the push-out matrices in the corresponding Conti construction of Algorithms 27 and 30. Their skew parts are contained in \(\epsilon _{0}\) neighborhoods of

$$\begin{aligned} \omega (M_{j+l-1}) + \frac{1}{3} \hat{S}, \qquad\omega (M_{j+l-1}) + \frac{1}{15} \hat{S}. \end{aligned}$$

By Lemma 13, Lemma 14 and Proposition 31, we obtain that

$$\begin{aligned} 10 \geq \hat{S} \geq \frac{3}{8}|a \odot n| \geq \frac{3}{4} \operatorname{dist}\bigl(e(M_{j+l-1}), e^{(p)}_{j+l-1}|_{D} \bigr) \geq \frac{3}{4} d(M_{j+l-1}) \geq \frac{3}{4}\cdot 98 \epsilon _{0}. \end{aligned}$$

Here \(d:\mathbb{R}^{2\times 2} \rightarrow \mathbb{R}\) denotes the function from Proposition 31, and \(a\otimes n\) is the rank-one connection which appears in the Conti construction. In the last estimate we have used the estimate from Proposition 31. Hence the skew parts of \(\tilde{M}_{1}(M_{j+l-1})\), \(\tilde{M}_{2}(M_{j+l-1})\), \(\tilde{M}_{3}(M _{j+l-1})\) are respectively bounded by

$$\begin{aligned} \omega \bigl(\tilde{M}_{i}(M_{j+l-1})\bigr) &\geq \tilde{ \omega }_{j+l-1} + \frac{1}{15}\cdot \frac{3}{4}\cdot 98 \epsilon _{0} - \epsilon _{0} \\ & \geq -\bar{C} - 2(\epsilon _{0} -\epsilon _{j+l}|_{D}) + \frac{1}{15}\cdot \frac{3}{4}\cdot 98 \epsilon _{0} - \epsilon _{0} \geq -\bar{C} \quad\mbox{for } i\in \{1,2,3\}, \end{aligned}$$

which shows the claimed estimate (35) with \(\epsilon _{j}=\epsilon _{0}\). For \(\tilde{M}_{4}(M_{j+l-1})\) we have that

$$\begin{aligned} \omega \bigl(\tilde{M}_{4}(M_{j+l-1})\bigr) &\geq -\bar{C} - 2(\epsilon _{0} - \epsilon _{j+l-1}|_{D}) - \epsilon _{j+l-1}|_{D} \\ & \geq -\bar{C} - 2(\epsilon _{0} - \epsilon _{j+l}|_{D}), \end{aligned}$$

which also proves the desired result. This concludes the proof of (35) in the fixed sign case.

Step 3: Sign change. Let \(j+l+1\) be the first index at which the sign of the difference of the skew parts changes according to Algorithm 30. By Assumption 35 and by the definition of our Algorithms 27 and 30, this can only be the case if \(\tilde{\omega }_{j+l}\geq 0\). The definition of \(\bar{C}\) ensures that \(\tilde{M}_{4}(M_{j+l+1})\) obeys the upper bound

$$\begin{aligned} \omega \bigl(\tilde{M}_{4}(M_{j+l+1})\bigr) \leq \frac{\bar{C}}{2}+\epsilon _{0} \leq \bar{C}. \end{aligned}$$

For the pushed out parts, \(\tilde{M}_{1}(M_{j+l+1})\), \(\tilde{M}_{2}(M _{j+l+1})\), \(\tilde{M}_{3}(M_{j+l+1})\), we argue similarly as we did in Step 1, but now with a change of signs: By the intercept theorem, the resulting skew parts become strictly smaller than the one of \(\tilde{\omega }_{j+l}\) (potentially they even become negative). This then improves the upper bound (37). For the lower bound we argue as in Step 1 but with reversed sign in Assumption 35. This concludes the proof of (35).

Step 4: Proof of (36). In order to obtain the estimate (36), we notice that the skew parts associated with values of \(e(\nabla u_{j})\in K\) may on the one hand be strictly larger than the bound given in (35). But on the other hand, they are derived as an \(\tilde{M}_{0}\) matrix in one of the Conti constructions, in which matrices satisfying (35) are modified. This implies that at most a gain of 5 in the modulus of the corresponding skew part is possible, which yields the bound (36). As these domains are not further modified in the convex integration algorithm this bound cannot deteriorate in the course of the application of Algorithms 27 and 30.  □

3.4 Existence of Convex Integration Solutions

Finally, in this last subsection, we show that Algorithms 27 and 30 can be used to deduce the existence of solutions to our problem (6).

Proposition 36

(Convex Integration Solutions)

Let \(M \in \mathbb{R}^{2\times 2}\) with \(e(M) \in \operatorname{intconv}(K)\). Let \(\varOmega \subset \mathbb{R}^{2}\) be open and bounded. Then there exists a Lipschitz function \(u: \mathbb{R} ^{2} \rightarrow \mathbb{R}^{2}\) such that

$$\begin{aligned} & \nabla u =M \textit{ a.e. in }\mathbb{R}^{2}\setminus \varOmega , \\ & e(\nabla u) \in K \textit{ a.e. in } \varOmega . \end{aligned}$$


Proof

We apply Algorithm 27 with \(\bar{M}:= M-\omega (M)\). By the results of Propositions 31 and 34 this algorithm is well-defined and can be iterated with \(j \rightarrow \infty \). This yields a sequence of functions \(u_{j}:\mathbb{R}^{2} \rightarrow \mathbb{R}^{2}\) with bounded gradients (with \(\|\nabla u_{j}\|_{L^{\infty }(\mathbb{R}^{2})}\) depending on \(d_{K}\), cf. Lemma 33). We prove the convergence of this sequence and show that the limiting function \(u_{0}\) solves (5) with boundary data \(\bar{M}\).

We note that for \(k\geq j\)

$$\begin{aligned} \nabla u_{k}(x) = \nabla u_{j}(x) \quad\mbox{for a.e. }x\mbox{ in } \varOmega \setminus \varOmega _{j}. \end{aligned}$$

Moreover,
$$\begin{aligned} \nabla u_{k} = \bar{M} \quad\mbox{a.e. in } \mathbb{R}^{2} \setminus \overline{ \varOmega }. \end{aligned}$$

By construction \(\nabla u_{j}\) is bounded, hence \(\nabla u_{j} \rightharpoonup \nabla u_{0}\) in the \(L^{\infty }_{loc}(\mathbb{R}^{2})\) weak-∗ and the \(L^{2}_{loc}(\mathbb{R}^{2})\) weak topologies. By Poincaré’s inequality \(u_{j} \rightarrow u_{0}\) in \(L^{2}_{loc}(\mathbb{R}^{2})\). We observe that Step 2 in Algorithm 27 decreases the total volume of the \(\varOmega _{j}\), i.e., of the part of \(\varOmega \) on which \(e(\nabla u_{j})\) does not yet attain one of the wells:

$$\begin{aligned} |\varOmega _{j}|= \bigl|\bigl\{ x \in \varOmega : e(\nabla u_{j})\notin K\bigr\} \bigr| \leq \biggl(1-v _{0}\frac{7}{8} \biggr)^{j}|\varOmega |. \end{aligned}$$

Combined with (38) and the \(L^{\infty }\) bound, this implies the desired convergence \(\nabla u_{j} \rightarrow \nabla u _{0} \) with respect to the \(L^{2}_{loc}(\mathbb{R}^{2})\) topology, where \(\nabla u_{0} \in L^{\infty }(\mathbb{R}^{2})\) is a solution to the problem (5) with boundary data \(\bar{M}\).

Defining \(u(x):=u_{0}(x) + \omega (M)x\) hence concludes the proof of Proposition 36. □
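The geometric volume decay used in the convergence proof is easy to check numerically. In the sketch below the covered fraction \(v_{0}=0.5\) is an illustrative value (the text only requires some fixed \(v_{0}>0\)):

```python
# Sketch of the volume decay in the proof of Proposition 36: each application
# of Step 2 covers at least the fraction v0 * 7/8 of the remaining set
# Omega_j, so |Omega_j| <= (1 - v0 * 7/8)**j * |Omega| -> 0 geometrically.

def remaining_volume(j: int, v0: float, vol_omega: float = 1.0) -> float:
    return (1.0 - v0 * 7.0 / 8.0) ** j * vol_omega
```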

In Sects. 4 and 5 we present a more refined analysis of this construction algorithm. In particular, we give an explicit quantitative construction for the covering procedure from Step 2 in Algorithm 27.

4 Covering Constructions

In this section we present the details of the coverings which we use in Algorithms 27 and 30. Here we pursue two (partially) competing objectives: Given a triangle \(D\),

  • we seek to cover an as large as possible volume fraction of it, but at least a given fixed volume fraction, \(v_{0}>0\).

  • We have to control the perimeters of the triangles in the resulting new covering.

In the context of these considerations, it turns out that the parallel and the rotated cases (cf. Definition 29) differ quantitatively and hence have to be discussed separately. This can be understood when considering possible coverings of rectangles by parallel or rotated rectangles.

We illustrate this in two extreme situations (cf. Fig. 8): Given a rectangle \(R_{1,\delta _{0}}\) with sides of length 1 and \(\delta _{0}\), we seek to cover it with rectangles which have a fixed side ratio \(r\) and whose long sides are either parallel or orthogonal to the long side of the original rectangle \(R_{1,\delta _{0}}\). In order to illustrate the differences between these situations, we assume for instance that \(r=\delta _{0}/2\). In the situation in which the original rectangle \(R_{1,\delta _{0}}\) is covered by rectangles whose long side is parallel to the long side of \(R_{1,\delta _{0}}\), the covering can be achieved by splitting \(R_{1,\delta _{0}}\) along its central line as illustrated in Fig. 8(a). Thus, the resulting perimeter (we view it as a measure of the \(BV\) energy of the characteristic functions in the Conti covering), which is necessary to cover the volume of \(R_{1,\delta _{0}}\), is bounded by twice the perimeter of \(R_{1,\delta _{0}}\). If the long sides of the covering rectangles of ratio \(\delta _{0}/2\) are however orthogonal to the long side of \(R_{1,\delta _{0}}\), the covering of \(R_{1,\delta _{0}}\) can only be achieved by \(2\delta _{0}^{-2}\) small rectangles of side lengths \(\delta _{0}\) and \(\delta _{0}^{2}/2\) (cf. Fig. 8(b)). The necessary perimeter for this covering is thus proportional to \(\delta _{0}^{-1} \operatorname{Per}(R _{1,\delta _{0}})\).

For a small value of \(\delta _{0}\) this makes a substantial difference and accounts for the losses in the estimates for the rotated situation.

The differences between the parallel and the rotated situation become even more apparent if we consider a sequence of coverings: Here we start with the rectangle \(R_{1,\delta _{0}}\) and first consider an iterative covering of it by parallel rectangles, which in the \(j\)-th iteration step are of side ratio \(\delta _{j}:=2^{-j}\delta _{0}\) (and such that the long side is parallel to the long side of \(R_{1,\delta _{0}}\)). The desired covering of \(R_{1,\delta _{0}}\) in the iteration step \(k\) can be achieved by splitting the rectangles from the covering at the iteration step \(k-1\) along their central lines. In each iteration step the overall perimeter increases at most by a factor two, so that after \(j\) iteration steps the overall perimeter can be estimated by

$$\begin{aligned} 2^{j} \operatorname{Per}(R_{1,\delta _{0}}). \end{aligned}$$

If, in comparison, we consider the case in which the covering rectangles are rotated in every step by \(\pi /2\) with respect to the preceding rectangles, and again choose a ratio \(\delta _{j}:= 2^{-j}\delta _{0}\) in the \(j\)-th iteration step, we inductively obtain a bound of the form

$$\begin{aligned} \Biggl(\prod_{l=1}^{j} \delta _{l}^{-1} \Biggr) \operatorname{Per}(R_{1,\delta _{0}}) = \delta _{0}^{-j} \Biggl(\prod_{l=1}^{j} 2^{l} \Biggr)\operatorname{Per}(R_{1,\delta _{0}}) \end{aligned}$$

for the overall perimeter after the \(j\)-th step. In contrast to the parallel situation this has superexponential behavior in \(j\).

If we consider the \(\pi /2\) rotated situation with fixed ratio \(\delta _{j}=\delta _{0}\), this bound improves to an exponential bound of the form

$$\begin{aligned} \delta _{0}^{-j} \operatorname{Per}(R_{1,\delta _{0}}). \end{aligned}$$

Hence, the estimates in the rotated situation are substantially worse than the ones in the parallel situation. In order to avoid superexponential behavior, we have to take care that the rotated case can only occur if the value of \(\delta _{j}\) is controlled from below. These heuristics a posteriori justify our careful choice of \(\epsilon _{j}\) and \(\omega _{j}\) in Algorithms 27 and 30.
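The three growth regimes from the heuristics above can be compared numerically. The sketch below transcribes the displayed bounds directly (illustrative function names; \(\delta _{0}=0.1\) is our example value):

```python
# Perimeter growth after j covering iterations, per the three heuristics:
#   parallel halving:                2**j            (exponential)
#   rotated, delta_j = 2**-j*delta0: prod delta_l**-1 (superexponential)
#   rotated, fixed ratio delta0:     delta0**-j      (exponential, large base)

def per_parallel(j: int) -> float:
    return 2.0 ** j

def per_rotated_shrinking(j: int, delta0: float) -> float:
    prod = 1.0
    for l in range(1, j + 1):
        prod *= (2.0 ** l) / delta0   # delta_l**-1 with delta_l = 2**-l * delta0
    return prod

def per_rotated_fixed(j: int, delta0: float) -> float:
    return delta0 ** (-j)
```

The step-to-step growth factor of `per_rotated_shrinking` is \(2^{j+1}/\delta _{0}\), which itself grows in \(j\): this is the superexponential behavior the algorithm must avoid.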

Although the level sets of the Conti construction consist of triangles and hence our coverings \(\{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) will be coverings of triangles by triangles (instead of the previously described rectangular coverings), the heuristics from above still persist.

Motivated by these heuristic considerations, in the sequel we seek to provide covering results and associated \(BV\) bounds, which can be applied in Algorithms 27, 30. We organize the discussion of this as follows: In Sect. 4.1, we introduce some of the fundamental objects (cf. Definitions 40 and 43) and formulate the main covering result (Proposition 45). Here we consider a similar distinction into a parallel and a rotated situation as described in the above heuristics (cf. Definition 43). With the class of triangles from Definition 40 at hand we distinguish several different cases and discuss different covering scenarios. The respective coverings are tailored to the specific situation and are made such that we do not leave our class of triangles during the iteration. Their discussion is the content of Sects. 4.24.5. Finally, the various different cases are combined in Sect. 4.6 to provide the proof of Proposition 45.

4.1 Preliminaries

In this section we introduce the central objects of our covering (cf. Definitions 40 and 43) and state our main covering result (Proposition 45).

As a preparation for the main part of this section, we begin by discussing auxiliary results on matrix space geometry. We first estimate the angle formed in strain space between two matrices:

Lemma 37

Let \(\tilde{d}>0\). Let \(M^{(1)}\), \(M^{(2)}\) be matrices with \(\operatorname{dist}(e(M^{(i)}),e ^{(1)})\geq \tilde{d}\) and let \(a^{(i)} \otimes n^{(i)}\) be the associated rank-one connection to \(e^{(1)}+ \hat{S}_{i}\), where the skew matrices \(\hat{S}_{i}\) are as in Corollary 26. Suppose that \(|M^{(1)}-M^{(2)}| \leq \epsilon _{j} \ll \tilde{d}\). Then the angle \(\alpha _{1}\) between \(n^{(1)}\) and \(n^{(2)}\) satisfies

$$\begin{aligned} |\alpha _{1}| \leq 3 \frac{\epsilon _{j}}{\tilde{d}}. \end{aligned}$$

Remark 38

Applied to a triangle \(D \in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j} \}}\) in the parallel case (cf. Definition 29), Lemma 37 implies that the rotation angle \(\alpha _{1}\), with which the consecutive Conti constructions are rotated with respect to each other (and which is defined as in Lemma 37), is bounded by

$$\begin{aligned} |\alpha _{1}| \leq 3 \frac{\epsilon _{j}|_{D}}{\operatorname{dist}(e( \nabla u_{j})|_{D}, K)} \leq 300 \delta _{j}|_{D}. \end{aligned}$$

Here \(\epsilon _{j}\), \(\delta _{j}\) are the functions from the convex integration Algorithm 27.

Proof of Lemma 37

As sketched in Fig. 9, we may estimate

$$\begin{aligned} |\alpha _{1}| \leq 1.5 |\tan (\alpha _{1})| \leq 3 \frac{\epsilon _{j}}{ \tilde{d}}, \end{aligned}$$

where \(\epsilon _{j}\) is the error in matrix space. Here the first estimate follows by a Taylor approximation and by noting that \(|\alpha _{1}|\) is small, so that in particular \(|\alpha _{1}| \leq \frac{ \pi }{6}\) (in which range the tangent is invertible and for which the Taylor expansion is valid).

Fig. 9

The angle between rank-one connections of nearby matrices is small. In particular, the associated rectangle constructions from Lemma 21 are close to being parallel
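The elementary estimate \(|\alpha _{1}| \leq 1.5 |\tan (\alpha _{1})|\) on the range \(|\alpha _{1}| \leq \frac{\pi }{6}\), used in the proof above, can be cross-checked by a quick numerical sanity check (a Python sketch, not part of the argument; the function name is ours):

```python
import math

def taylor_bound_holds(alpha):
    """Check |alpha| <= 1.5 * |tan(alpha)|, as used in the proof of
    Lemma 37 for |alpha| <= pi/6."""
    return abs(alpha) <= 1.5 * abs(math.tan(alpha))

# Sample the whole admissible range [-pi/6, pi/6].
samples = [(-1) ** k * j * (math.pi / 6) / 1000
           for j in range(1001) for k in (0, 1)]
assert all(taylor_bound_holds(a) for a in samples)
```

Indeed, on this range one even has \(|\tan (\alpha _{1})| \geq |\alpha _{1}|\), so the factor \(1.5\) leaves room for the Taylor error.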


Next we observe the following bounds on the rotation angles:

Lemma 39


Let \(D\in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) for \(j\geq 1\). Let \(\hat{D} \in \mathcal{D}_{1}(D)\). Assume that the triangle \(\hat{D}\) is in the rotated case (cf. Definition 29). Let \(\alpha _{1}\) denote the angle between the long sides of the current and the following Conti constructions. Then, we have that

$$\begin{aligned} 0< C\delta _{0}< |\alpha _{1}| \leq \pi -C\delta _{0}. \end{aligned}$$


This is an immediate consequence of Lemma 18. □

With these auxiliary results at hand, we proceed to the discussion of our central covering objects. In order to define our set of covering triangles, \(\{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\), we consider a subclass of triangles with, for our purposes, suitable properties. Here we cannot ensure that all domains appearing in our covering argument are right-angled triangles for which one of the other angles is approximately of size \(\delta _{j}\) (due to the presence of the green triangles in the Conti construction in Fig. 10). However, the following definition provides a family of sets with similar properties. This will allow us to formulate a precise, iterative covering result.

Fig. 10

Triangles in the undeformed Conti construction (Color figure online)

Definition 40

Let \(\delta _{0}\) be as in Algorithm 27. The triangle \(D\) is said to be \(\delta \)-good with respect to a reference direction\(n\in S^{1}\), for \(\delta \in (0,\delta _{0}]\), if

  1. One angle, \(\alpha \), satisfies \(\alpha \in \delta [\frac{1}{10},1000]\),

  2. The other two angles are contained in \(\frac{\pi }{2} + 2 \delta [-1000, 1000]\),

  3. One of the long sides encloses an angle in \(\delta [-1000,1000]\) with \(n\).

We refer to the long side of the triangle which satisfies the requirement of 3.) as the direction of \(D\). We also say that the triangle \(D\) is oriented parallel to \(n\) or that \(D\) is aligned to \(n\).

If a triangle \(D\) satisfies 1.) and 2.) but not necessarily 3.) we call it \(\delta \)-good (which allows for a possible change of orientation).
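For concreteness, conditions 1.) and 2.) of Definition 40 can be encoded in a few lines. The following Python sketch (the helper names are ours; we assume \(\delta \) is small enough that the smallest angle is the distinguished one) checks them for a triangle given by its vertices:

```python
import math

def _angle(P, Q, R):
    """Interior angle at vertex P of the triangle PQR."""
    v = (Q[0] - P[0], Q[1] - P[1])
    w = (R[0] - P[0], R[1] - P[1])
    c = (v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w))
    return math.acos(max(-1.0, min(1.0, c)))

def satisfies_conditions_1_and_2(A, B, C, delta):
    """Conditions 1.) and 2.) of Definition 40: one angle in
    delta*[1/10, 1000], the other two within 2000*delta of pi/2."""
    small, mid, large = sorted(
        (_angle(A, B, C), _angle(B, A, C), _angle(C, A, B)))
    return (delta / 10 <= small <= 1000 * delta
            and abs(mid - math.pi / 2) <= 2000 * delta
            and abs(large - math.pi / 2) <= 2000 * delta)

# A thin, almost right-angled triangle with opening angle delta:
delta = 1e-3
assert satisfies_conditions_1_and_2((0, 0), (1, 0), (1, math.tan(delta)), delta)
```

Condition 3.) additionally compares one of the long sides with the reference direction \(n\) and is omitted in this sketch.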

Remark 41

We note that if \(\alpha \) is small, both long sides could satisfy condition 3.) at the same time. In this case both directions are valid as directions of \(D\).

In our construction one prominent reference direction is obtained from Conti’s construction, as detailed in the following definition.

Definition 42

Let \(D\) be a level set of \(\nabla u_{j}\) and let \(e= e^{(p)}_{j}|_{D}\) be the reference well. Let further \(n \in S^{1}\) be the direction of the long side of the Conti rectangle from Step 2 in Algorithm 27 on \(D\). We say that \(n\) is the direction of the relevant Conti construction (at step\(j\)), i.e. the construction by which \({D}\) is (in part) covered. A \(\delta \)-good triangle is parallel to Conti’s construction if one of its long sides is parallel to \(n\).

In the sequel, we will give a precise covering result, which shows that in the \(j\)th step of our convex integration Algorithms 27 and 30, we may assume that only very specific triangles are present in the collection \(\{\varOmega _{j,k}\} _{k\in \{1,\dots ,J_{j}\}}\) as (parts of) level sets of \(\nabla u_{j}\). To this purpose we define the following classes of triangles:

Definition 43

Let \(SP_{j}\) be as in Algorithms 27 and 30. Then, a triangle \(D_{j} \in \{\varOmega _{j,k}\}_{k\in \{1, \dots ,J_{j}\}}\) is in the case:

  1. (P1), if it is \(\delta _{j}|_{D_{j}}\)-good with direction \(n \in S^{1}\), where \(n\) denotes the direction of the relevant Conti construction.

  2. (P2), if \(\delta _{j}|_{D_{j}} = \delta _{0}\) and \(\delta _{j-1}|_{D_{j}}\neq \delta _{0}\) and if \(D_{j}\) is \(\delta _{j-1}|_{D_{j}}\)-good with direction \(n \in S^{1}\), where \(n\) denotes the direction of the relevant Conti construction.

  3. (R1), if \(\delta _{j}|_{D_{j}}=\delta _{0}\), the triangle is \(\delta _{0}\)-good and if it forms an angle \(\beta \) with \(C\delta _{0} \leq \beta \leq \frac{\pi }{2}- C \delta _{0}\) with respect to the direction of the relevant Conti construction (cf. Lemma 39).

  4. (R2), if \(\delta _{j}|_{D_{j}}=\delta _{0}\), the triangle is \(\delta _{j-1}|_{D_{j}}\)-good with \(\delta _{j-1}|_{D_{j}} \neq \delta _{0}\) and if it forms an angle \(\beta \) with \(C\delta _{0} \leq \beta \leq \frac{\pi }{2}- C \delta _{0}\) with respect to the direction of the relevant Conti construction.

  5. (R3), if \(\delta _{j}|_{D_{j}}=\delta _{0}\) and if the triangle is right angled and such that

    1. (a) the other angles are bounded from below and above by \(C \delta _{0}\) and \(\frac{\pi }{2}- C \delta _{0}\),

    2. (b) one of its sides is parallel to the orientation of the relevant Conti construction.

Remark 44

The cases above, as stated, are not distinct since we allow for a factor in our definition of being \(\delta \)-good (cf. Definition 40). For instance, there might be triangles which are in both of the cases (P1) and (P2). However, in such situations the constructions and perimeter estimates are also comparable. In situations in which the estimates would differ significantly and where \(\delta _{j-1} \ll \delta _{j}=\delta _{0}\), the above definitions yield distinct cases.

Let us comment on this classification: The basic distinction criterion separating the triangles into the different cases is given by checking whether the corresponding triangles are roughly aligned (as in the cases (P1), (P2)) or whether they are substantially rotated (as in the cases (R1), (R2)) with respect to the direction of the relevant Conti construction (the case (R3) is a special "error situation", which does not entirely fit into this heuristic consideration). Roughly speaking, this determines whether we are in a situation analogous to the first or to the second picture in Fig. 8. This distinction is necessary, since otherwise the arising surface energy cannot be controlled in a form which is sufficiently strong for our purposes.

This distinction (essentially) coincides with our definition of the parallel and the rotated cases (cf. Definition 29): If a triangle \({D_{j}} \in \{\varOmega _{j,k}\} _{k\in \{1,\dots ,J_{j}\}}\) is in the parallel case in step \(j\), then the directions of the Conti construction, which gave rise to \({D_{j}}\), and of the relevant Conti construction at step \(j\) (i.e. the construction by which \({D_{j}}\) is (in part) covered) are essentially parallel (cf. Lemma 37 and Remark 38). Letting \(j_{0}\in \mathbb{N}\) denote the index from Definition 29 and assuming that \(j_{0}=1\), there are three possible scenarios for the relation of \(\delta _{j}|_{D_{j}}\) and \(\delta _{j-1}|_{D_{j}}\):

  1. (i) \(\delta _{j}|_{D_{j}} = \delta _{j-1}|_{D_{j}}/2\). In this case \(\nabla u_{j}|_{{D_{j}}}\) was produced as the stagnant matrix (cf. Notation 25) in the iteration step \(j-1\). Here, our covering construction will ensure that \({D_{j}}\) is in case (P1) (not exclusively, cf. Remark 44, but as one option).

  2. (ii) \(\delta _{j}|_{D_{j}}= \delta _{0}\) but \(\delta _{j-1}|_{D_{j}}\neq \delta _{0}\). This can for instance occur in a parallel push-out step. Here, our covering construction will ensure that \({D_{j}}\) is in case (P2) (not exclusively (depending on the value of \(\delta _{j-1}\)), cf. Remark 44, but as one option).

  3. (iii) \(\delta _{j}|_{D_{j}}= \delta _{0}= \delta _{j-1}|_{{D_{j}}}\). This case can for instance occur in two successive push-out steps. Here, our covering ensures that \({D_{j}}\) is in the case (P1).

If a triangle \({D_{j}} \in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) is in the rotated case in step \(j\), then the direction of the Conti construction, which gave rise to \({D_{j}}\), and the direction of the relevant Conti construction at step \(j\) are necessarily substantially rotated with respect to each other (cf. Lemma 39). Again assuming that the index \(j_{0}=1\) (where \(j_{0}\) denotes the index from Definition 29), we now distinguish two cases for the relation between \(\delta _{j}|_{ {D_{j}}}\) and \(\delta _{j-1}|_{D_{j}}\): Here we first note that necessarily (by definition of the rotated case, as occurring only after a push-out step) we have \(\delta _{j}|_{D_{j}} = \delta _{0}\). Then there are two options for \(\delta _{j-1}|_{D_{j}}\):

  1. (i) \(\delta _{j-1}|_{D_{j}} = \delta _{0}\). This case can for instance occur in the situation of two successive push-out steps. In this case our covering ensures that \({D_{j}}\) can be taken to be in the case (R1).

  2. (ii) \(\delta _{j-1}|_{D_{j}} \neq \delta _{0}\). This case can for instance occur in the case in which \(\nabla u_{j-1}|_{D_{j}}\) is produced in a stagnant and \(\nabla u_{j}|_{D_{j}}\) in a push-out step. In this situation our covering ensures that \({D_{j}}\) can be taken to be in the case (R2).

The case (R3) only occurs as an error case as a consequence of our specific covering procedure for the triangles of the types (R1) and (R2).
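Schematically, and ignoring the overlap of the cases noted in Remark 44, the case discussion above can be summarized as follows (a Python sketch under our reading of the scenarios; the function name and signature are ours):

```python
def classify(delta_j, delta_jm1, delta_0, rotated):
    """Schematic classification of a triangle D_j following
    Definition 43, based on delta_j|_{D_j}, delta_{j-1}|_{D_j} and on
    whether D_j is in the parallel or the rotated case. The special
    error case (R3) is produced by the covering construction itself
    and is not reachable from this data alone."""
    if rotated:
        # In the rotated case necessarily delta_j|_{D_j} = delta_0.
        assert delta_j == delta_0
        return "R1" if delta_jm1 == delta_0 else "R2"
    if delta_j == delta_0 and delta_jm1 != delta_0:
        return "P2"  # e.g. after a parallel push-out step
    return "P1"      # stagnant step, or two successive push-out steps

d0 = 0.1
assert classify(d0 / 4, d0 / 2, d0, rotated=False) == "P1"  # scenario (i)
assert classify(d0, d0 / 2, d0, rotated=False) == "P2"      # scenario (ii)
assert classify(d0, d0, d0, rotated=False) == "P1"          # scenario (iii)
assert classify(d0, d0, d0, rotated=True) == "R1"
assert classify(d0, d0 / 2, d0, rotated=True) == "R2"
```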

We relate the different cases to the heuristics given at the beginning of Sect. 4 (cf. Fig. 8). We view the cases (P1) and (R1) as the "model cases" without and with substantial rotation, corresponding to the parallel and orthogonal (triangular) situations depicted in Fig. 8. In both cases (P1) and (R1) the aspect ratio of the given triangle \(D_{j}\) is roughly of order \(\delta _{j}|_{D_{j}}\) (i.e. the quotient of its shortest and its longest side is roughly of that order) and we seek to cover it with a Conti construction of comparable ratio \(\delta _{j}|_{D_{j}}\).

The cases (P2) and (R2) are situations in which the underlying triangle \(D_{j}\) is roughly of side ratio \(\delta _{j-1}|_{D_{j}}\) (i.e., the quotient of its shortest and its longest side is roughly of that order), where we however seek to cover the triangle with Conti constructions of ratio \(\delta _{0}\). This mismatch is a consequence of our construction of the function \(\delta _{j}|_{D_{j}}\) in Algorithm 27: here we prescribe that the matrices which are pushed out (cf. Notation 25) are allowed to have an error tolerance of \(\delta _{0}\). In particular, it may occur that \(\delta _{j-1}|_{D_{j}}\ll \delta _{j}|_{D_{j}}=\delta _{0}\), which is the situation described in (P2), (R2) either without or with substantial rotation.

The case (R3) is a consequence of how we deal with “remainders” in our covering constructions for the cases (R1), (R2).

Our main result of the present section states that it is possible to find a covering of the level sets which respects Algorithms 27 and 30, such that only the specific triangles from Definition 43 occur. Moreover, we provide bounds for the remaining uncovered “bad” volume and the resulting perimeters.

Proposition 45


Let \(\varOmega =Q_{\beta }[0,1]^{2}\), where \(\beta \) is the rotation of the Conti construction adapted to the matrix \(M\) from Algorithm 27. Let \(u_{j}\) be as in Algorithms 27 and 30. Then, there exists a covering \(\{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) such that only triangles of the classes (P1), (P2) and (R1)–(R3) occur and such that

  1. (i) \(|\varOmega \setminus \varOmega _{j}| \leq (1-\frac{7}{8}v_{0})^{j}|\varOmega |\),

  2. (ii) \(\sum_{k=1}^{J_{j+1}}\operatorname{Per}(\varOmega _{j+1,k}) \leq C \delta _{0}^{-1} \sum_{k=1}^{J_{j}}\operatorname{Per}(\varOmega _{j,k})\).

Here \(v_{0} \in (0,1)\) is a small constant, which is independent of \(\epsilon _{0}\), \(d_{0}\) and \(d_{K}\).
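Part (i) of Proposition 45 yields geometric decay of the uncovered volume. A short Python sketch illustrates how quickly the bound decays in \(j\) (the value of \(v_{0}\) below is illustrative, not the constant of the proposition):

```python
def uncovered_volume_bound(j, v0, omega_volume=1.0):
    """Upper bound (1 - (7/8) v0)^j |Omega| from Proposition 45(i)
    for the volume of Omega \\ Omega_j after j covering steps."""
    return (1 - 7 * v0 / 8) ** j * omega_volume

# With the illustrative choice v0 = 0.01 the bound roughly halves
# every 80 steps, since log(2) / -log(1 - 7/800) is about 79.
bounds = [uncovered_volume_bound(j, 0.01) for j in (0, 80, 160)]
assert bounds[0] > bounds[1] > bounds[2]
```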

In the remainder of this section we seek to prove this result and to construct the associated covering. To this end, in Sect. 4.2 we first explain that the “natural covering” of the Conti construction, which is achieved by splitting it into its level sets, satisfies the requirements of Proposition 45. In particular, this implies that the covering, which is obtained in Step 1 of Algorithm 27, satisfies the properties of Proposition 45 (the resulting triangles are of the types (P1), (P2) or (R1), (R2)). Hence, in the remaining part of the section, it suffices to prove that given a triangle of the type (P1)–(R3), we can construct a covering for it which obeys the claims of Proposition 45. To this end, in Sect. 4.3, we first describe a general construction, on which we heavily rely in the sequel. With this construction at hand, in Sect. 4.4 and its subsections we then deal with the cases (P1), (P2), in which there is no substantial rotation involved. Subsequently, we discuss the cases (R1)–(R3) with non-negligible rotations in Sect. 4.5. Finally, in Sect. 4.6 we provide the proof of Proposition 45.

The generalization to more generic domains is detailed in Sect. 6.

4.2 Covering the Conti Construction by Triangles

We begin with our covering construction by explaining that a Conti construction of ratio \(\delta _{j}|_{D_{j}}\) can be divided into a finite number of triangles which are all of the types (P1), (P2) and (R1)–(R3).

Lemma 46

Let \(SP_{j}\) be as in Algorithm 27 and let \(D_{j} \in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\). Suppose that \(R\subset D_{j}\) is a Conti rectangle of ratio \(\delta _{j}|_{D_{j}}\). Let \(M_{1},\dots ,M_{4}\) denote the gradients occurring in the Conti construction with the same convention as in Notation 25. Then all level sets in \(R\) on which \(M_{4}\) is attained can be decomposed into (at most two) triangles which are of the type (P1). The level sets with \(M_{1}\), \(M_{2}\), \(M_{3}\) can be decomposed into triangles of the type (P1), (P2) or (R1), (R2).


We recall that (after a suitable splitting into in total 16 triangles as depicted in Fig. 10) all except for four triangles in the undeformed Conti construction (cf. Corollary 20) are axis-parallel and have aspect ratio approximately \(1:\delta _{j}|_{D_{j}}\) (with a factor depending on \(\lambda \); for \(\lambda =\frac{1}{4}\) a factor in the interval \((1/4,4)\) is more than sufficient). After rescaling the \(x_{2}\)-axis by \(\delta _{j}|_{D_{j}}\) (as in Lemma 21), these aspect ratios are then comparable to \(1:\frac{\delta _{j}|_{D_{j}}}{2}\), i.e. \(1: \delta _{j+1}|_{R\setminus F}\), where \(F\) denotes the union of the non-axis-parallel level sets in the deformed configuration. Hence, all the axis-parallel triangles are of the type (P1) or (R1). The triangles in which \(M_{4}\) is attained are of type (P1), as the rotation angle of the next Conti construction in the parallel case is controlled by virtue of Lemma 37 (cf. Remark 38).

It remains to discuss the remaining triangles contained in \(F\) (green in Fig. 10). These are again \(\delta _{j}|_{D_{j}}\)-good by a similar estimate on the aspect ratios, and by an estimate on the angle of rotation with respect to the \(x_{1}\)-axis. Since, by Step 2(b) of our convex integration Algorithm 27, \(\delta _{j+1}|_{F}=\delta _{0}\), they are in general of the type (P2) or (R2) if \(\delta _{j}|_{D_{j}} \neq \delta _{0}\), but could also be of the type (P1) or (R1) if \(\delta _{j}|_{D_{j}} = \delta _{0}\). □

Remark 47

We observe that Lemma 46 in particular implies that the triangles which are obtained in Step 1 in Algorithm 27, all satisfy the claim of Proposition 45. Hence, in the sequel it suffices to provide a covering algorithm which preserves this property.

4.3 A Basic Building Block

We begin our iterative covering statements by presenting a general building block, which we will frequently use in the sequel. Given a triangle \(D\), we seek to reduce the discussion to that of a rectangle \(R_{2}\), whose long side is aligned with the direction of \(D\) and whose volume is comparable to that of the original triangle. Only in the covering of this rectangle will the situations (P1), (P2) and (R1), (R2) differ. For the case (R3) we argue differently.

Proposition 48

Let \(D\) be a \(\delta \)-good triangle with \(0< \delta \leq \delta _{0}\). Let further \(\tilde{R}_{2}\) be a rectangle of aspect ratio \(r\in [\frac{1}{10},1000] \delta \), whose long side is aligned to the direction of \(D\). Then there exists a rescaled and translated copy \(R_{2}\subset D\) of the rectangle \(\tilde{R}_{2}\) (of aspect ratio \(1:r\)), three of whose corners lie on \(\partial D\), and such that:

  1. (i) \(|R_{2}| \geq 10^{-6}|D|\).

  2. (ii) One corner divides a side of the triangle in the ratio \(\frac{2}{3}:\frac{1}{3}\).

  3. (iii) The set \(D \setminus R_{2}\) consists of at most 100 \(\delta \)-good triangles which are aligned with the direction of \(D\).


Let \(D\) be a given \(\delta \)-good triangle and let \(\alpha \) denote the corresponding angle from Definition 40(1). Without loss of generality we may assume that the triangle \(D\) is aligned with the \(x_{1}\)-axis, that the tip of the triangle lies at the origin and that (after rescaling) the \(x_{1}\)-axis-parallel side is given by the interval \([0,1]\times \{0\}\) (cf. Fig. 11).

Fig. 11

Fitting the parallel rectangle \(R_{2}\) (green box) into \(D\). The resulting remaining triangles are by construction again \(\delta \)-good. The partition of the box is shown in Fig. 12 (Color figure online)

Let \(P_{1}=(\frac{2}{3},0)\) and let \(g\) be the line of slope \(-r\) through \(P_{1}\). Then \(g\) intersects \(\partial D\) in exactly one other point \(P_{2}\). Being aligned along the \(x_{1}\)-axis, the rectangle \(R_{2}\) is then uniquely determined by requiring that \(P_{1}\) and \(P_{2}\) are two of its corners. By construction it has aspect ratio \(1:r\).

In order to infer the bound on the volume, we compute the coordinates of \(P_{2}=(\frac{2r}{3(\tan (\alpha ) + r)}, \frac{2 r \tan (\alpha )}{3(\tan (\alpha ) + r)})\). Hence the volume of \(R_{2}\) is given by

$$\begin{aligned} |R_{2}|= \biggl( \frac{2}{3}- \frac{2r}{3(\tan (\alpha ) + r)} \biggr) \frac{2 r \tan (\alpha )}{3(\tan (\alpha ) + r)} = \frac{2}{3} \frac{ \tan (\alpha )}{r+\tan (\alpha )} \frac{2}{3} \frac{r\tan (\alpha )}{r+ \tan (\alpha )}. \end{aligned}$$

Since the volume of \(D\) is comparable to \(\frac{\tan (\alpha )}{2}\), this results in a volume fraction of approximately

$$\begin{aligned} \frac{|R_{2}|}{|D|} \geq & \frac{ \frac{2}{3} \frac{\tan (\alpha )}{r+ \tan (\alpha )} \frac{2}{3} \frac{r\tan (\alpha )}{r+\tan (\alpha )}}{\frac{ \tan (\alpha )}{2}} = \frac{2}{9} \frac{r \tan (\alpha )}{(\frac{r+ \tan (\alpha )}{2})^{2}} \\ &\geq \frac{1}{9} \frac{\min (r,\tan (\alpha ))}{\max (r, \tan ( \alpha ))}. \end{aligned}$$

Using the fact that \(r\geq \frac{\delta }{10}\) and \(\alpha ,\arctan (r) \leq 1000\delta \), we infer the desired estimate on the volume fraction. In particular, we note that it is independent of \(\delta \).
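This volume bound can be cross-checked numerically (a Python sketch, not part of the proof, under the simplifying assumption that \(D\) is the right-angled triangle with vertices \((0,0)\), \((1,0)\), \((1,\tan (\alpha ))\)):

```python
import math

def volume_fraction(alpha, r):
    """|R_2| / |D| for D = triangle (0,0), (1,0), (1, tan(alpha)),
    with P1 = (2/3, 0) and P2 on the slanted side y = tan(alpha) * x,
    following the computation in the proof of Proposition 48."""
    t = math.tan(alpha)
    x2 = 2 * r / (3 * (t + r))           # x-coordinate of P2
    y2 = t * x2                          # P2 lies on y = tan(alpha) x
    return (2 / 3 - x2) * y2 / (t / 2)   # |R_2| / |D|, |D| = tan(alpha)/2

# Sweep the admissible ranges alpha, r in delta * [1/10, 1000]:
delta = 1e-5
grid = [delta * s for s in (0.1, 1.0, 10.0, 100.0, 1000.0)]
worst = min(volume_fraction(a, r) for a in grid for r in grid)
assert worst >= 1e-6   # consistent with Proposition 48(i)
```

For instance, for \(r = \tan (\alpha )\) the computed fraction is exactly \(\frac{2}{9}\).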

Adding a vertical line through \(P_{1}\) and a horizontal line through \(P_{1}+(0,\frac{2}{3}\tan (\alpha )) \in \partial D\), we obtain an axis-parallel triangle of opening angle \(\alpha \) to the left of \(R_{2}\), another axis-parallel triangle of opening angle \(\alpha \) above \(R_{2}\), a four-sided box \(B\) to the right of \(R_{2}\) and a triangle similar to \(D\) above the box (cf. Fig. 11).

As \(D\) is \(\delta \)-good, so are the above mentioned three triangles and it hence remains to discuss the box \(B\) on the right of \(R_{2}\) (cf. Fig. 12). By construction the bottom side of \(B\) is axis parallel and of length \(\frac{1}{3}\) and the left side is also axis-parallel and of length \(\frac{2}{3}\tan (\alpha )\). Furthermore, since \(D\) is \(\delta \)-good, the angle on the bottom right of \(B\) is given by \(\frac{\pi }{2} - \gamma \) for some \(\gamma \in \delta [-2000,2000]\) and in particular \(|\gamma | \leq \frac{1}{10}\). Hence, the length of the axis-parallel top side differs from \(\frac{1}{3}\) by \(|\tan (\gamma ) \frac{2}{3}\tan (\alpha )| \leq \frac{1}{10}\).

Fig. 12

As the angle at the bottom right is very close to \(\frac{\pi }{2}\), the box is well approximated by a rectangle of side-lengths \(\frac{1}{3}:\frac{2}{3}\tan (\alpha )\). Partitioning the box into \(N\) slices of the same height, an estimate on the tangent of the opening angles \(\gamma _{j}\) of the corresponding rectangles shows that \(\tan (\gamma _{j}) \approx \frac{2}{N} \tan (\alpha )\). Choosing \(N \in \{1,2,3\}\) appropriately, we thus obtain \(\delta \)-good triangles

Introducing further horizontal lines, we may partition \(B\) into \(N\) boxes with three axis-parallel sides of height \(\frac{2}{3N}\tan ( \alpha )\) and length close to \(\frac{1}{3}\) (cf. Fig. 12). Bisecting along the diagonals, we hence obtain opening angles \(\gamma _{j}\) with

$$\begin{aligned} \tan (\gamma _{j})\approx \frac{\frac{2}{3N}\tan (\alpha )}{ \frac{1}{3}} = \frac{2}{N} \tan (\alpha ). \end{aligned}$$

Choosing \(N \in \{1,2,3\}\) appropriately and noting that the remaining angle is either \(\frac{\pi }{2} -\gamma \) (same as \(D\)) or a right-angle, all obtained triangles are \(\delta \)-good with respect to \(n\). □
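The slicing rule for the box \(B\) (cf. Fig. 12) can likewise be made concrete. The following Python sketch reflects our reading of the argument, with the admissible window for opening angles taken from Definition 40:

```python
import math

def slice_count(alpha, delta):
    """Choose N in {1, 2, 3} such that slicing the box B (height
    (2/3) tan(alpha), width about 1/3) into N equal horizontal slices
    and bisecting along the diagonals yields opening angles gamma_j
    with tan(gamma_j) = (2/N) tan(alpha) inside [delta/10, 1000*delta]."""
    for N in (1, 2, 3):
        t = 2 * math.tan(alpha) / N
        if delta / 10 <= t <= 1000 * delta:
            return N
    raise ValueError("no admissible N in {1, 2, 3}")

delta = 1e-5
assert slice_count(delta / 10, delta) == 1      # thin triangles: no slicing
assert slice_count(1000 * delta, delta) == 3    # widest case: three slices
```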

Remark 49

We note that the above quotient

$$\begin{aligned} \frac{\min (r,\tan (\alpha ))}{\max (r, \tan (\alpha ))} \end{aligned}$$

is symmetric in \(\tan (\alpha )\), \(r\) and penalizes them being of different size. A similar mechanism can be observed when trying to fit axis-parallel rectangles of different aspect ratios \(r_{1}\), \(r_{2}\) inside each other. Letting \(1:r_{1}\) be the side lengths of the exterior rectangle, the interior rectangle has side lengths \(a:ar_{2}\), where the constraint \(a \leq 1\) limits the volume ratio to

$$\begin{aligned} \frac{a^{2} r_{2}}{r_{1}} \leq \frac{r_{2}}{r_{1}}, \end{aligned}$$

and the constraint \(ar_{2} \leq r_{1} \Leftrightarrow a \leq \frac{r_{1}}{r_{2}}\) limits the volume ratio to

$$\begin{aligned} \frac{a^{2} r_{2}}{r_{1}} \leq \frac{\frac{r_{1}^{2}}{r_{2}^{2}} r _{2} }{r_{1}}= \frac{r_{1}}{r_{2}}. \end{aligned}$$

This illustrates that the triangular situation is comparable to the rectangular situation, which we introduced as our heuristic model situation in the beginning of Sect. 4.
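The rectangular model computation of this remark condenses into a one-line formula; a small Python check (the function name is ours):

```python
def max_nested_fraction(r1, r2):
    """Largest area fraction which an axis-parallel rectangle of
    aspect ratio 1:r2 (sides a and a*r2) can occupy inside an
    axis-parallel rectangle with sides 1 and r1; the constraints are
    a <= 1 and a*r2 <= r1."""
    a = min(1.0, r1 / r2)
    return a * a * r2 / r1

# The optimal fraction is min(r2/r1, r1/r2), penalizing a mismatch
# of the two aspect ratios in either direction:
assert abs(max_nested_fraction(0.1, 0.2) - 0.5) < 1e-12
assert abs(max_nested_fraction(0.2, 0.1) - 0.5) < 1e-12
assert abs(max_nested_fraction(0.1, 0.1) - 1.0) < 1e-12
```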

We further explain how, given a box \(R\) with some rotation angle with respect to the \(x_{1}\)-axis, we construct a block of the type \(R_{2}\).

Lemma 50

Let \(0<\delta \leq \delta _{0}\) and let \(R\) be a rectangle of aspect ratio \(1:r_{0}\) for \(r_{0} \in \delta [\frac{1}{10},10]\). Suppose further that the direction \(n \in S^{1}\) of the long side of \(R\) encloses an angle \(\beta \in \delta [-1000,1000]\) with the \(x_{1}\)-axis. Then there exists an axis-parallel rectangle \(R_{2}\) which is parallel to the \(x_{1}\)-axis and of aspect ratio \(1:r\), where \(r \in \delta [\frac{1}{10}, 1000]\), such that

  1. (i) \(R \subset R_{2}\) and \(|R| \geq 10^{-6}|R_{2}|\),

  2. (ii) The set \(R_{2} \setminus R\) can be decomposed into at most 100 \(\delta \)-good triangles, whose direction is either \(n\) or \(e_{1}\).


The construction is sketched in Figs. 13, 14, 15 and 16.

Fig. 13

Constructing an axis-parallel rectangle starting from \(R\). We begin with the inner rectangle \(R\) (which encloses an angle \(\beta \) with the \(x_{1}\)-axis) and successively add the eight outer rectangles to obtain the box \(R_{2}\) from Lemma 50. The explicit construction of the four white boxes, which together with \(R\) form the inner "cross", is described in detail in Figs. 15 and 16. All the outer rectangles are of aspect ratio approximately \(1:\delta _{j}\) and can hence be decomposed into \(\delta _{j}\)-good triangles (Color figure online)

Fig. 14

Labeling of points. The figure illustrates the successive addition of further points, which results in the inner cross structure. In this construction we have to choose the points in a way which ensures that the triangles, which are formed by bisecting the boxes along the diagonals, still remain of the types (P1), (P2) and (R1), (R2)

Fig. 15

Adding \(\delta \)-good triangles on the left and right, we can achieve axis-parallel boundaries

Fig. 16

Adding \(\delta _{j}\)-good triangles on the top and bottom, we can achieve axis-parallel boundaries. Since \(\beta \) might include a large or small factor in the definition, we may either allow an opening angle \(\gamma _{2}+\beta \) or introduce a horizontal line to obtain two triangles with opening angles \(\gamma _{2}\) and \(\beta \)

After a translation, rescaling and reflection with respect to the \(x_{2}\)-axis, we may assume that \(\beta \geq 0\) and that the corners of \(R\) are given by

$$\begin{aligned} P_{1} &=(0,0), \\ P_{2} &=\bigl(1,\tan (\beta )\bigr), \\ P_{3} &=\bigl(1,\tan (\beta )\bigr)+r_{0} \sqrt{1+\tan ( \beta )^{2}}\bigl(-\sin ( \beta ),\cos (\beta )\bigr) \\ &= \bigl(1,\tan (\beta )\bigr) + r_{0} \bigl(-\tan (\beta ),1\bigr), \\ P_{4} &=r_{0}\bigl(-\tan (\beta ),1\bigr), \end{aligned}$$

where we used that

$$\begin{aligned} \cos (\beta ) \sqrt{1+\tan (\beta )^{2}} &= 1, \\ \sin (\beta ) \sqrt{1+\tan (\beta )^{2}} &= \tan (\beta ). \end{aligned}$$
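The corner formulas and the simplification above can be verified numerically. In particular, \(P_{1}P_{2}P_{3}P_{4}\) is indeed a rectangle whose sides meet at right angles and have length ratio exactly \(r_{0}\) (a Python sketch with illustrative admissible parameter values):

```python
import math

beta, r0 = 0.01, 0.002            # illustrative admissible values
tb = math.tan(beta)

# The trigonometric identities used in the simplification:
assert abs(math.sqrt(1 + tb ** 2) * math.cos(beta) - 1) < 1e-12
assert abs(math.sqrt(1 + tb ** 2) * math.sin(beta) - tb) < 1e-12

# Corners of R in their simplified form:
P1 = (0.0, 0.0)
P2 = (1.0, tb)
P3 = (1.0 - r0 * tb, tb + r0)
P4 = (-r0 * tb, r0)

side1 = (P2[0] - P1[0], P2[1] - P1[1])   # long side
side2 = (P4[0] - P1[0], P4[1] - P1[1])   # short side

# The sides are orthogonal and their length ratio is exactly r0:
assert abs(side1[0] * side2[0] + side1[1] * side2[1]) < 1e-12
assert abs(math.hypot(*side2) / math.hypot(*side1) - r0) < 1e-12
```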

In the following, we add quadrilaterals with three axis-parallel sides with aspect ratios \(r_{0}\) and \(r_{2}=\tan (\gamma _{2})\) for

$$\begin{aligned} \gamma _{2} \in \delta \biggl(\frac{3}{10}, \frac{1000}{3}\biggr), \end{aligned}$$

to be chosen later.

We begin by adding quadrilaterals on the left and right by inserting the following four points (cf. Figs. 14 and 15)

$$\begin{aligned} Q_{1} &= P_{4} - (1,0), \\ Q_{2} &= Q_{1} - (0, r_{0})= P_{1} - \bigl(1 + r_{0}\tan (\beta ),0\bigr), \\ Q_{3} &= P_{2} + (1,0), \\ Q_{4} &= Q_{3} + (0, r_{0})= P_{3} + \bigl(1 + r_{0} \tan (\beta ),0\bigr). \end{aligned}$$

We, in particular, note that the lines \(\overline{Q_{1} Q_{2}}\) and \(\overline{Q_{3}Q_{4}}\) are parallel to the \(x_{2}\)-axis. Furthermore, the triangles \(Q_{2}P_{1}P_{4}\) and \(P_{4}Q_{1}Q_{2}\) have opening angles \(\arctan (r_{0})\), are parallel to the \(x_{1}\)-axis and are either right-angled or have an angle \(\frac{\pi }{2} - \beta \). Hence all of these triangles are \(\delta \)-good with direction \(e_{1}\). Similar observations hold for the triangles which are constructed from \(P_{2}\), \(P_{3}\), \(Q_{3}\), \(Q_{4}\).

Following a similar approach, we add the points

$$\begin{aligned} Q_{5} &= P_{1} - (0,r_{2}), \\ Q_{6} &= Q_{5} + (1, 0)= P_{2} - \bigl(0, r_{2} + \tan (\beta )\bigr), \\ Q_{7} &= P_{2} + (0,r_{2}) , \\ Q_{8} &= Q_{7} - (1,0)= P_{3} + \bigl(0, r_{2}+ \tan (\beta )\bigr). \end{aligned}$$

Here, the aspect ratio \(r_{2}\) is chosen flexibly to account for the facts that our construction is horizontally of a length which is slightly larger than 3, and that the rotated rectangle has height \(r_{0}+\tan (\beta )\geq r_{0}\). By symmetry, we may restrict ourselves to discussing the rectangle \(P_{1}Q_{5}Q_{6}P_{2}\). The axis-parallel right angled triangle \(P_{1}Q_{5}Q_{6}\) has opening angle \(\gamma _{2}\) (as defined in (39)) and is thus \(\delta \)-good. For the remaining triangle \(P_{1}Q_{6}P_{2}\), we distinguish two cases:

  • If \(\beta \in \delta [\frac{1}{10}, 1000]\), we additionally introduce the point \(Q_{9}= (1,0)\) and note that \(P_{1}Q_{9}P_{2}\) is \(\delta \)-good and axis-parallel, as are \(P_{1}Q_{9}Q_{6}\) and \(Q_{6}Q_{5}P_{1}\) (which both have an opening angle \(\gamma _{2}\)).

  • If \(0\leq \beta \leq \delta \frac{1}{10}\), we note that by our restriction on \(\gamma _{2}\),

$$\begin{aligned} \beta + \gamma _{2} \in \delta \biggl[\frac{1}{10}, 1000\biggr], \end{aligned}$$

    which ensures that \(P_{1}Q_{6}P_{2}\) is \(\delta \)-good and parallel to the long side \(n\) of \(R\).

Finally, we complete our thus far roughly cross-shaped construction to the desired axis-parallel rectangle \(R_{2}\) by adding four rectangles as in Fig. 13 (the green rectangles there). These rectangles have side lengths

$$\begin{aligned} &\bigl(1+ r_{0}\tan (\beta )\bigr) : r_{2}, \end{aligned}$$
$$\begin{aligned} &1 : \bigl(r_{2}+\tan (\beta )\bigr). \end{aligned}$$

We consider the first rectangle with side ratio as in (40). Since \(r_{0} \tan (\beta ) \leq 2 (1000\delta )^{2} < 0.1\) (which follows from the bounds for \(\delta _{0}\)), we can estimate the aspect ratio from above and below by \(1:\frac{r_{2}}{1.1}\) and \(1:r_{2}\), respectively. Bisecting this rectangle along the diagonals then results in \(\delta \)-good axis-parallel right triangles, provided

$$\begin{aligned} r_{2} \in \biggl[1.1 \arctan \biggl(\delta \frac{1}{10}\biggr), \arctan (1000 \delta )\biggr]. \end{aligned}$$

This is satisfied due to the assumptions on \(\gamma _{2}\), since \(\frac{x}{1.1} \leq \arctan (x) \leq x\) on the considered domain.

For the second rectangle with ratio as in (41), we again distinguish two cases:

  • If \(\beta \in \delta [\frac{1}{10}, 1000]\), we divide the rectangle by a horizontal line through \(P_{1}\), which yields two rectangles of lengths \(1:\tan (\beta )\), \(1:r_{2}\), which are \(\delta \)-good.

  • If \(\beta \in \delta [0,\frac{1}{10}]\), we note that by the same argument as above

    $$\begin{aligned} r_{2} + \tan (\beta ) \in \biggl[\arctan \biggl(\delta \frac{1}{10}\biggr), \arctan (1000 \delta )\biggr]. \end{aligned}$$

    Thus the aspect ratio \(1: (r_{2}+\tan (\beta ))\) results in \(\delta \)-good axis-parallel triangles.

We conclude by noting that the resulting axis-parallel rectangle \(R_{2}\) (which is the entire rectangle in Fig. 13) has side lengths

$$\begin{aligned} \bigl(1+ 2 +r_{0}\tan (\beta )\bigr) : \bigl(r_{0}+ 2 r_{2}+ \tan (\beta )\bigr). \end{aligned}$$

Again estimating \(r_{0}\tan (\beta )< 0.1\), this yields suitable triangles, provided

$$\begin{aligned} r_{0}+ 2 r_{2}+ \tan (\beta ) \in \biggl[3.1 \arctan \biggl(\delta \frac{1}{10}\biggr), 3\arctan (1000 \delta )\biggr]. \end{aligned}$$

Again this is satisfied due to our restrictions on \(\gamma _{2}\), \(r_{0}\) and \(\beta \).

We thus obtain a large family of admissible values \(r_{2}\). We note that the aspect ratio of \(R_{2}\) is comparable to \(2r_{2}+ \tan (\beta )+r _{0}\), as is the area of \(R_{2}\). Hence, as a particular choice, we may take \(r_{2}\) comparable to \(\frac{r_{0}+\tan (\beta )}{2}\) (within a factor 3 to ensure that \(\gamma _{2}\) satisfies the above restriction). Then the aspect ratio is comparable to

$$\begin{aligned} 1: \frac{r_{0}+\tan (\beta )}{2}, \end{aligned}$$

and the volume ratio is comparable to

$$\begin{aligned} \frac{|R|}{|R_{2}|}\geq \frac{r_{0}}{3 \cdot 2(r_{0}+\tan (\beta ))}= \frac{1}{6} \frac{r_{0}}{r_{0}+\tan (\beta )}. \end{aligned}$$

Since \(r_{0} \geq \arctan (\delta \frac{1}{10})\) and \(\beta \leq 1000 \delta \), this quotient may be estimated from below by \(\frac{1}{10000}\). □

Remark 51

One should think of the rectangle \(R\) in Lemma 50 as a Conti construction, which we seek to fit into a triangle \(D\) as in Proposition 48. In doing so, we however have to be careful, since in the cases (P1), (P2) we have to avoid creating new triangles which are substantially rotated with respect to the original one (cf. Fig. 17 and the explanations at the beginning of the next section). The box construction of Lemma 50 ensures this.

Fig. 17

Problems which could arise in the covering algorithm: In our parallel covering result we have to avoid rotated triangles as the aspect ratios are very small for these

4.4 Covering in the Cases (P1), (P2)

In this section we explain how, given a triangle \(D_{j} \in \{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) of type (P1) or (P2), we can cover it by a combination of the relevant Conti constructions and some remaining triangles, which are again of the types (P1), (P2) and (R1), (R2). Moreover, we seek to achieve two partially competing objectives: on the one hand, we have to control from below the volume of \(D_{j}\) which is covered by Conti constructions. On the other hand, we aim at keeping the resulting overall perimeter of the new covering geometry as small as possible. The construction of a covering which balances these two objectives is the content of Proposition 52, which is the main result of this section.

Motivated by the heuristic considerations at the beginning of Sect. 4 (cf. Fig. 8(a)), we expect that in the cases (P1) and (P2), in which there is no substantial rotation with respect to the relevant Conti construction, the two competing objectives of sufficient volume coverage (Proposition 52(1)) and of a good perimeter bound (Proposition 52(3)) can be satisfied with a surface energy which is independent of \(\delta _{j}\) and \(\delta _{0}\). Indeed, it is possible to show that in the situation without substantial rotation, in each iteration step the overall perimeter of the covering of a triangle is comparable to the perimeter of the original triangle, up to the loss of a controlled universal factor.

Proposition 52

Let \(D_{j}\) be as in (P1), (P2) with \(j\geq 1\). Then there exists a covering and a constant \(C>1\) (independent of \(\delta _{0}\)) such that:

  1.

    A volume fraction of at least \(10^{-12}|D_{j}|\) is covered by finitely many rescaled and translated Conti constructions from Lemma 21. The Conti constructions can again be covered by finitely many triangles of the types occurring in the cases (P1), (P2) and (R1), (R2), where \(j\) is replaced by \(j+1\).

  2.

    The complement of the Conti constructions is covered by finitely many triangles occurring in the cases (P1), (P2), where \(j\) is replaced by \(j+1\).

  3.

    The overall surface energy of the new triangles \(D_{j+1,l}\in \mathcal{D}_{1}(D_{j})\) is controlled by

    $$\begin{aligned} \sum_{D_{j+1,l}\in \mathcal{D}_{1}(D_{j})}\operatorname{Per}(D _{j+1,l}) \leq C \operatorname{Per}(D_{j}). \end{aligned}$$

In the proof of Proposition 52 we have to be careful in the choice of the covering, in order to keep all the resulting triangles parallel to the direction of \(D_{j}\) or parallel to the relevant Conti construction (cf. Definition 42). This is necessary to ensure a covering such that the sum of the resulting perimeters is comparable to the original perimeter; in particular no factor of \(\delta _{0}\) occurs here. We emphasize that this alignment with the directions of the original triangle or the relevant Conti construction is a central point: if a (substantial) rotation angle with respect to these directions were to be obtained (e.g. as illustrated in Fig. 17, where the covering gives rise to triangles which are rotated by an angle of \(\frac{\pi }{2}\)), we would inevitably fall into cases similar to the situations described in (R1), (R2), however with a ratio \(\delta _{j}\) which might be substantially smaller than \(\delta _{0}\). As explained at the beginning of Sect. 4, this would entail a growth of the perimeters of the covering by a factor \(\delta _{j}^{-1}\). As a consequence our \(BV\) estimate from Sect. 5 would become a superexponential bound, which could no longer be compensated by the only exponential \(L^{1}\) decay. This would hence destroy all hopes of deducing good higher regularity estimates for the convex integration solutions.
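The dichotomy between exponential and superexponential perimeter growth can be made concrete in a toy computation. The model below is purely illustrative and not part of the argument: we assume a fixed per-step factor \(\delta _{0}^{-1}\) in the controlled case, and a shrinking ratio \(\delta _{k} = \delta _{0}^{k}\) (hence a factor \(\delta _{0}^{-k}\) in step \(k\)) in the uncontrolled case.

```python
# Toy comparison (illustrative assumption: delta_k = delta0**k in the
# uncontrolled case; these values are not taken from the text).

def factor_controlled(delta0, j):
    """Perimeter factor if every step costs delta0**-1 (exponential in j)."""
    return delta0 ** (-j)

def factor_uncontrolled(delta0, j):
    """Perimeter factor if step k costs delta_k**-1 with delta_k = delta0**k;
    the product is delta0**-(1+2+...+j), superexponential in j."""
    return delta0 ** (-(j * (j + 1)) // 2)

delta0 = 0.5
r = delta0 ** 2          # a sample exponential L^1-type decay rate per step

# Exponential growth is beaten by exponential decay ...
assert all(factor_controlled(delta0, j) * r ** j < 1 for j in range(1, 20))
# ... but superexponential growth eventually wins against it.
assert factor_uncontrolled(delta0, 15) * r ** 15 > 1
```

If the rotated cases produced ever thinner triangles, the product of the per-step factors would dominate any fixed exponential decay; this is exactly what the parallel alignment preserved in Proposition 52 rules out.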

The remainder of this section is organized into three parts: We first discuss the covering constructions for the cases (P1) and (P2) separately in Sects. 4.4.1 and 4.4.2. Then in Sect. 4.4.3 we combine these cases, in order to provide the proof of Proposition 52.

4.4.1 The Case (P1)

We begin by explaining the covering in the case (P1).

Lemma 53

Let \(D\) be a \(\delta _{j}\)-good triangle oriented along the \(x_{1}\)-axis. Let \(R\) be a rectangle of aspect ratio \(1:\delta _{j}/2\) such that its long axis is rotated by an angle of \(\beta \in \delta _{j}[-10,10]\) with respect to the \(x_{1}\)-axis. Then there exists a covering of \(D\) by

  1. (i)

    \(K_{j}\), with \(K_{j}\in [1,100]\), \(\delta _{j}\)-good, up to null-sets disjoint triangles \(D_{l}\), \(l\in \{1,\ldots ,K_{j}\}\), which are either oriented along the \(x_{1}\)-axis or along the long side of \(R\),

  2. (ii)

    a rescaled and translated copy \(\tilde{R}\) of \(R\), such that \(|\tilde{R}| \geq 10^{-6}|D|\).


Moreover,

$$\begin{aligned} \operatorname{Per}(\tilde{R}) + \sum_{l=1}^{K_{j}} \operatorname{Per}(D_{l}) \leq C\operatorname{Per}(D), \end{aligned}$$

where \(C>1\) is a universal constant (in particular independent of \(\delta _{j}\) and \(\delta _{0}\)).


Proof

We first invoke Lemma 50 with \(R\) and \(\delta = \delta _{j}\). This yields an axis-parallel box \(\tilde{R}_{2}\) of side ratio \(r\in [\frac{1}{10},10]\delta _{j}\). This box \(\tilde{R}_{2}\) is admissible in Proposition 48. An application of this proposition with \(D\), \(\tilde{R}_{2}\) and \(\delta =\delta _{j}\) hence yields a covering of \(D\) by \(\delta _{j}\)-good triangles which all have \(e_{1}\) as their direction, and a box \(R_{2}\), which is covered as described in Lemma 50. We note that the triangles within \(R_{2}\) are thus also \(\delta _{j}\)-good and have as their directions either \(e_{1}\) or the long side of \(R\). The estimate on the perimeter follows, since all the covering triangles have perimeter controlled by \(\operatorname{Per}(D)\) and as \(K_{j} \leq 100\). The estimate on the volume fraction is a consequence of Proposition 48(i) and Lemma 50(i). □

4.4.2 The Case (P2)

As in the case (P1) we have the following main covering result:

Lemma 54

Let \(D\) be a \(\delta _{j-1}\)-good, axis-parallel triangle. Let \(R\) be a rectangle of aspect ratio \(1: \delta _{0}\), whose long side is rotated with respect to the axis by an angle \(\beta \in \delta _{0}[-1000,1000]\). Then there exists a covering of \(D\) by

  1. (i)

    \(M_{j}\), with \(M_{j} \in [1,100]\), \(\delta _{j-1}\)-good triangles \(D_{l}\), \(l\in \{1,\dots ,M_{j}\}\), which are parallel to the \(x_{1}\)-axis,

  2. (ii)

    \(K_{j}:= \frac{\delta _{0}}{\delta _{j-1}}\) translated, disjoint and rescaled copies \(\tilde{R}_{k}\) of \(R\) with the property that

    $$ \Biggl|\bigcup_{k=1}^{K_{j}}\tilde{R}_{k}\Biggr| \geq 10^{-6}|D|, $$
  3. (iii)

    \(\tilde{M}_{j}\), with \(\tilde{M}_{j} \in [1,100]\), \(\delta _{0}\)-good triangles \(\tilde{D}_{l}\), \(l\in \{1,\dots , \tilde{M}_{j}\}\), which are either parallel to the \(x_{1}\)-axis or parallel to the long side of \(\tilde{R}\).


Moreover,

$$\begin{aligned} \sum_{k=1}^{K_{j}}\operatorname{Per}( \tilde{R}_{k}) + \sum_{l=1}^{M_{j}} \operatorname{Per}(D_{l}) + \sum_{l=1} ^{\tilde{M}_{j}} \operatorname{Per}(\tilde{D}_{l}) \leq C \operatorname{Per}(D), \end{aligned}$$

where \(C>1\) is a universal constant (in particular independent of \(\delta _{j}\) and \(\delta _{0}\)).


Proof

We apply Lemma 50 with \(\delta =\delta _{0}\) and the box \(R\). This yields a box \(\tilde{R}_{2}\) of ratio approximately \(\delta _{0}\) and a box \(\tilde{R}\subset \tilde{R}_{2}\) which is a translated and rescaled copy of \(R\) of volume comparable to the volume of \(\tilde{R}_{2}\). Stacking \(K_{j}:= \frac{\delta _{0}}{\delta _{j-1}}\) translated copies of the box \(\tilde{R}_{2}\) along the \(x_{1}\)-axis next to each other and denoting the individual boxes by \(\tilde{R} _{2,k}\) (each containing a translated copy \(\tilde{R}_{k}\) of \(R\)), yields as their union a new box \(\bar{R}_{2}\) of aspect ratio \(1:\delta _{j-1}\) (cf. Fig. 18). With respect to this rectangle \(\bar{R}_{2}\) and with \(\delta =\delta _{j-1}\) we now apply Proposition 48, which yields a rectangle \(R_{2}\) of the same aspect ratio as that of \(\bar{R}_{2}\). As the volume of each \(\tilde{R}_{k}\) is comparable to the volume of \(\tilde{R}_{2,k}\), the claim (ii) of Lemma 54 follows from Proposition 48, since this ensures that \(R_{2}\) has volume comparable to \(D\).

Fig. 18

The stacking construction of the boxes. Each of the smaller boxes is a suitably translated copy of the rectangle \(\tilde{R}_{2}\), which is roughly of aspect ratio \(1:\delta _{0}\). In each of these we insert a rescaled and translated version of the construction from Lemma 50, cf. Fig. 13

It remains to bound the perimeters. Here we only estimate the sum of the perimeters of the rectangles \(\tilde{R}_{2,k}\), as the remaining parts of the covering are controlled by a multiple of this. We note that

  • each rectangle \(\tilde{R}_{2,k}\) has perimeter bounded by

    $$\begin{aligned} \operatorname{Per}(\tilde{R}_{2,k})\leq C \frac{\delta _{j-1}}{\delta _{0}} \operatorname{Per}(D), \end{aligned}$$
  • there are \(\frac{\delta _{0}}{\delta _{j-1}}\)-many axis parallel boxes \(\tilde{R}_{2,k}\).


Hence,

$$\begin{aligned} \sum_{k=1}^{K_{j}} \operatorname{Per}( \tilde{R}_{2,k})\leq C \frac{ \delta _{j-1}}{\delta _{0}} \frac{\delta _{0}}{\delta _{j-1}} \operatorname{Per}(D) \leq C \operatorname{Per}(D). \end{aligned}$$

This concludes the proof. □

The main difference of Lemma 54 with respect to Proposition 48 is the step in which we bridge the mismatch in the ratios of the triangle \(D\) (ratio \(\delta _{j-1}\)) and the given box \(R\) (ratio \(\delta _{0}\)). Here we pass from a box of ratio approximately \(\delta _{0}\) (which is prescribed for \(R\) and hence for \(\tilde{R}_{2}\)) to a box with ratio approximately \(\delta _{j-1}\) (for \(\bar{R}_{2}\)) by stacking the translates \(\tilde{R}_{2,k}\) of this box next to each other.
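The bookkeeping of this stacking step can be checked by elementary arithmetic. The sketch below uses the illustrative values \(\delta _{0} = 0.1\), \(\delta _{j-1} = 0.01\) (any pair with \(\delta _{0}/\delta _{j-1} \in \mathbb{N}\) works) and normalizes the long side of a single box to 1:

```python
def stack_boxes(delta0, delta_prev, width=1.0):
    """Stack K = delta0/delta_prev copies of a box of aspect ratio 1:delta0
    (long side `width`) next to each other along the x_1-axis.

    Returns the aspect ratio (short:long) of the union and the total
    perimeter of the K individual boxes.
    """
    K = round(delta0 / delta_prev)       # number of stacked copies
    height = delta0 * width              # common height of all copies
    union_width = K * width              # width of the union
    aspect = height / union_width        # = delta_prev, as claimed
    per_single = 2 * (width + height)    # perimeter of one copy
    return aspect, K * per_single

aspect, total_per = stack_boxes(delta0=0.1, delta_prev=0.01)
# the union is a box of ratio 1:delta_{j-1}, as used in the proof
```

Multiplying the perimeter of a single copy (of order \(\delta _{j-1}/\delta _{0}\) relative to the union) by the number of copies \(\delta _{0}/\delta _{j-1}\) exhibits the cancellation behind the perimeter bound.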

4.4.3 Proof of Proposition 52

Using the results from Sects. 4.4.1 and 4.4.2 we can now address the proof of Proposition 52.


Proof

The first property of the proposition follows from Lemma 46 in combination with Lemma 53(ii) (in the case (P1)) or Lemma 54(ii) (in the case (P2)). In particular, by Lemma 46 all the triangles, which are used to cover the Conti constructions, are \(\delta _{j+1}\)-good with respect to the relevant Conti construction. The second property is a consequence of Lemma 53(i) combined with Lemma 50 (in the case (P1)) or Lemma 54(i), (iii) (in the case (P2)). We emphasize that all these triangles are either parallel to the original triangle \(D\) or to the relevant Conti construction, implying that both the angles and the orientations are within the admissible margins. Finally, the bound on the perimeters follows from the corresponding claims in Lemmata 53 and 54. □

4.5 Covering in the Cases (R1)–(R3)

In this section we deal with the covering in the cases (R1)–(R3). As in Sect. 4.4 we seek to simultaneously control the perimeter of the resulting covering and the volume of the domain, which is covered by Conti constructions. Motivated by the discussion from the beginning of Sect. 4, we however expect that it is unavoidable to produce estimates in which the ratio \(\delta _{0}\) appears.

With this expectation, we are less careful in our covering constructions and for instance do not seek to preserve the direction \(n\), in which the corresponding \(\delta _{j}\)-good triangles are oriented. Yet, we still heavily rely on Proposition 48 and only modify the construction within the block \(R_{2}\). This will give rise to certain new “error triangles”, which are of the type (R3). In analogy to Proposition 52 we have:

Proposition 55

Let \(D_{j}\) be as in (R1)–(R3) with \(j\geq 1\). Then there exists a covering and a constant \(C>0\) independent of \(\delta _{0}\) such that:

  1.

    A volume fraction of at least \(10^{-6}|D_{j}|\) is covered by finitely many rescaled and translated Conti constructions. The Conti constructions can again be covered by finitely many triangles of the types occurring in the cases (P1), (P2) and (R1)–(R3), where \(j\) is replaced by \(j+1\).

  2.

    The complement of the Conti constructions is covered by finitely many triangles occurring in the cases (P1), (P2) and (R1)–(R3), where \(j\) is replaced by \(j+1\).

  3.

    The overall surface energy of the new triangles \(D_{j+1,l}\in \mathcal{D}_{1}(D_{j})\) is controlled by

    $$\begin{aligned} \sum_{D_{j+1,l}\in \mathcal{D}_{1}(D_{j})}\operatorname{Per}(D _{j+1,l}) \leq C \delta _{0}^{-1} \operatorname{Per}(D_{j}). \end{aligned}$$

As in Proposition 52 the proof of this statement is based on separate discussions of the cases (R1), (R2), (R3) and can be deduced by combining the results of Lemmas 56, 58, 59, 46 and Proposition 48. Since this does not involve new ingredients, we restrict our attention to the discussion of the cases (R1)–(R3) and omit the details of the proof of Proposition 55. The analysis of the cases (R1)–(R3) is the content of the following subsections.

4.5.1 The Case (R1)

The covering result for the case (R1) is very similar to the one from the case (P1). It only deviates from this by the construction within the rectangle \(R_{2}\):

Lemma 56

Let \(\beta \in (C \delta _{0}, \frac{\pi }{2}-C\delta _{0})\). Assume that \(D\) is a \(\delta _{0}\)-good triangle oriented parallel to the \(x_{1}\)-axis. Let \(\bar{R}\) be a rectangle of aspect ratio \(1:\delta _{0}\), which encloses an angle \(\beta \) with respect to the orientation of \(D\). Then \(D\) can be covered by the union of

  1. (i)

    \(M_{j}\), with \(M_{j} \in [1,100]\), \(\delta _{0}\)-good triangles \(D_{1,k}\), which are aligned with the direction of \(D\),

  2. (ii)

    \(0< K_{j} \leq C \delta _{0}^{-2}\) many translated, up to null-sets disjoint and rescaled copies \(R_{2,k}\) of the rectangle \(\bar{R}\), whose union covers a volume of size at least \(10^{-6}|D|\),

  3. (iii)

    \(0\leq L_{j} \leq C \delta _{0}^{-2}\) many triangles \(D_{2,k}\) which are of the type (R3).

Here \(C>1\) is a universal constant. The overall perimeter of the resulting triangles and rectangles is controlled by

$$\begin{aligned} \sum_{k=1}^{M_{j}} \operatorname{Per}(D_{1,k})+\sum_{k=1} ^{K_{j}}\operatorname{Per}(R_{2,k}) + \sum _{k=1}^{L_{j}} \operatorname{Per}(D_{2,k}) \leq C \delta _{0}^{-1} \operatorname{Per}(D). \end{aligned}$$

Remark 57

By Lemma 39 the angles which occur in our constructions always satisfy the bound \(\beta \in (C \delta _{0}, \frac{\pi }{2}-C \delta _{0})\).


Proof

Let \(c_{1}\in (1/4,1)\), \(c_{2} \in (1,4)\). We begin by stacking \(K_{j} \in [c_{1}, c_{2}]\delta _{0}^{-2}\cap \mathbb{N}\) many rectangles \(\tilde{R}_{2,k}\), which are translated copies of \(\bar{R}\), next to each other in such a way that their lowest corners lie on the \(x_{1}\)-axis (cf. Fig. 19). Let \(\tilde{R}_{2}\) denote the enveloping axis-parallel rectangle. By adapting the constants \(c_{1}\), \(c_{2}\) we can arrange that \(\tilde{R}_{2}\) has an aspect ratio \(r\) allowing for an application of Proposition 48 with \(\delta = \delta _{0}\), \(\tilde{R}_{2}\) and \(r\). This yields a rectangle \(R_{2}\) of aspect ratio \(r\). Thus, the set \(D\setminus R_{2}\) consists of the triangles described in (i). Moreover \(R_{2}\) is covered as in Fig. 19 by \(K_{j}\)-many rectangles \(R_{2,k}\) with aspect ratio \(\delta _{0}\) and by a comparable number of “error” triangles \(D_{2,k}\). By definition, the constant \(K_{j}\) satisfies the bounds in (ii). Using elementary geometry, we calculate that

$$\begin{aligned} \Biggl| \bigcup_{k=1}^{K_{j}}R_{2,k}\Biggr| \geq \frac{1}{10}|R_{2}|. \end{aligned}$$

This implies the claim of (ii).

Fig. 19

The covering of the box \(R_{2}\) of ratio \(1:\delta _{0}\). The dashed rectangles correspond to the \(K_{j}\) stacked (rescaled and translated) copies of \(\bar{R}\), which we denote by \(R_{2,k}\). Their envelope is a (rescaled) copy of \(\tilde{R}_{2}\), which we denote by \(R_{2}\). It is the rectangle which is returned as the output of Proposition 48. The parts of \(R_{2}\) which are not covered by the rectangles \(R_{2,k}\) consist of the triangles \(D_{2,k}\)

We note that the error triangles \(D_{2,k}\) are all right angle triangles. Moreover, one of the other angles coincides with the rotation angle \(\beta \). At least one of the triangles’ sides is parallel to the orientation of the rectangles \(R_{2,k}\). Thus the triangles \(D_{2,k}\) are of the type (R3).

The bound on the perimeters of the rectangles \(R_{2,k}\) and of the triangles \(D_{2,k}\) results from the following observations:

  • The number \(K_{j}\) of rectangles \(R_{2,k}\) and the number \(L_{j}\) of triangles \(D_{2,k}\) are bounded by \(C \delta _{0}^{-2}\).

  • The perimeter of each of the rectangles and each of the triangles is controlled: \(\operatorname{Per}(R_{2,k}) + \operatorname{Per}(D_{2,k}) \leq C \delta _{0} \operatorname{Per}(D)\).

  • There are at most 100 triangles \(D_{1,k}\), each of which has a perimeter controlled by \(\operatorname{Per}(D)\).


Hence, in total,

$$\begin{aligned} \sum_{k=1}^{M_{j}}\operatorname{Per}(D_{1,k}) + \sum_{k=1}^{K_{j}}\operatorname{Per}(R_{2,k}) + \sum_{k=1}^{L_{j}} \operatorname{Per}(D_{2,k}) &\leq C \delta _{0}\operatorname{Per}(D) \delta _{0}^{-2} \\ & \leq C \delta _{0}^{-1} \operatorname{Per}(D). \end{aligned}$$

This concludes the proof. □
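The counting behind the \(\delta _{0}^{-1}\) loss can be replayed numerically. In this sketch the constant \(C = 4\) and the normalization \(\operatorname{Per}(D) = 1\) are illustrative assumptions, not values from the text:

```python
def r1_budget(delta0, per_D=1.0, C=4.0):
    """Case (R1) bookkeeping: about C*delta0**-2 rectangles (and a
    comparable number of error triangles), each of perimeter at most
    C*delta0*per_D; the total is of order delta0**-1 * per_D."""
    K = int(C * delta0 ** -2)            # number of rectangles R_{2,k}
    per_each = C * delta0 * per_D        # perimeter of one R_{2,k}
    return K * per_each                  # total perimeter of the R_{2,k}

for d in (0.1, 0.01, 0.001):
    # one power of delta0 cancels: quadratically many pieces, each
    # linearly small, give only a delta0**-1 loss (margin 17 > C**2
    # absorbs the integer truncation and floating-point rounding)
    assert r1_budget(d) <= 17.0 / d
```

The same cancellation appears in the displayed estimate: quadratically many rectangles times a linearly small perimeter leaves a single inverse power of \(\delta _{0}\).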

4.5.2 The Case (R2)

The case (R2) is the rotated analogue of the case (P2). As we are in a rotated case, we can afford to be less careful about preserving orientations and proceed similarly as in the case (R1). Again the main issue is the covering of the rectangle \(R_{2}\). However, in contrast to the case (R1), we now have to deal with a mismatch between the ratio of the triangle \(D\) (with ratio \(\delta _{j} \neq \delta _{0}\)) and the ratio of the Conti construction (with ratio \(\delta _{0}\)). Similarly as in the case (P2) we overcome this issue by a “stacking construction”, which compensates the mismatch.

Lemma 58

Let \(\beta \in (C \delta _{0}, \frac{\pi }{2}-C \delta _{0})\). Assume that \(D\) is a \(\delta _{j}\)-good triangle with direction parallel to the \(x_{1}\)-axis. Let \(\bar{R}\) be a rectangle of side ratio \(\delta _{0}\), which encloses an angle \(\beta \) with respect to the long side of \(D\). Then \(D\) can be covered by the union of

  1. (i)

    \(M_{j}\), with \(M_{j} \in [1,100]\), \(\delta _{0}\)-good triangles \(D_{1,k}\), which are aligned with the direction of \(D\),

  2. (ii)

    \(0< K_{j}\leq C \delta _{0}^{-1}\delta _{j}^{-1}\) many translated and rescaled copies \(R_{2,k}\) of the rectangle \(\bar{R}\), whose union covers a volume of size at least \(10^{-6}|D|\),

  3. (iii)

    \(0< L_{j}\leq C \delta _{0}^{-1}\delta _{j}^{-1}\) many triangles \(D_{2,k}\) which are of the type (R3).

There exists a universal constant \(C>1\) such that the perimeter of the resulting triangles and rectangles is bounded by

$$\begin{aligned} \sum_{k=1}^{M_{j}}\operatorname{Per}(D_{1,k}) + \sum_{k=1}^{K_{j}}\operatorname{Per}(R_{2,k}) + \sum_{k=1}^{L_{j}} \operatorname{Per}(D_{2,k}) \leq C \delta _{0}^{-1} \operatorname{Per}(D). \end{aligned}$$


Proof

We construct a box \(\tilde{R}_{2}\) as in the case (R1) but now by stacking \(K_{j} \in [c_{1},c_{2}](\delta _{0} \delta _{j})^{-1}\cap \mathbb{N}\) many of the boxes \(\bar{R}\) next to each other, where \(c_{1}\in (1/4,1)\) and \(c_{2}\in (1,4)\). We denote these stacked boxes by \(\tilde{R}_{2,k}\) and define \(\tilde{R}_{2}\) as the enveloping axis-parallel rectangle. By adapting the values of \(c_{1}\), \(c_{2}\) it is possible to obtain a ratio \(r\) for \(\tilde{R}_{2}\) (cf. Fig. 20), which is admissible in applying Proposition 48 with \(\delta =\delta _{j}\), \(\tilde{R}_{2}\) and \(r\). This yields a box \(R_{2}\), which is covered by rescaled copies \(R_{2,k}\) of the rectangles \(\tilde{R}_{2,k}\) and by “error” triangles \(D_{2,k}\). By construction and by elementary geometry (as in (43)) these satisfy the requirements in (ii), (iii). By Proposition 48 also (i) holds true.

Fig. 20

The figure shows a (rescaled) copy of the enveloping rectangle \(R_{2}\) and the stacked (and rescaled) copies of the rectangle \(\bar{R}\), which we denote by \(R_{2,k}\). The triangles correspond to the ones which we denote by \(D_{2,k}\) in Lemma 58(iii)

It remains to estimate the perimeter of the union of the rectangles \(R_{2,k}\) and the triangles \(D_{2,k}\). To this end, we note that:

  • Each rectangle \(R_{2,k}\) has perimeter controlled by \(C\delta _{j} \operatorname{Per}(D)\).

  • There are at most \(C \delta _{j}^{-1}\delta _{0}^{-1}\) many such rectangles \(R_{2,k}\).

  • The perimeters of the error triangles \(D_{2,k}\) are up to a factor controlled by the perimeters of the rectangles \(R_{2,k}\).

Thus, the resulting perimeter is up to a constant bounded by

$$\begin{aligned} &\sum_{k=1}^{M_{j}}\operatorname{Per}(D_{1,k}) + \sum_{k=1}^{K_{j}}\operatorname{Per}(R_{2,k}) + \sum_{k=1}^{L_{j}} \operatorname{Per}(D_{2,k}) \\ &\quad \leq C \delta _{j} \operatorname{Per}(D) \bigl(\delta _{j}^{-1} \delta _{0}^{-1}\bigr) = C \delta _{0}^{-1} \operatorname{Per}(D). \end{aligned}$$

This concludes the argument of the lemma. □

4.5.3 The Case (R3)

We deal with the error triangles from the previous steps. All of them are right angle triangles in which the other two angles are bounded from below by \(C \delta _{0}\) and from above by \(\frac{\pi }{2}-C \delta _{0}\). We show that in this situation we can reduce to two model cases, which we discuss below. This allows us to obtain the following result:

Lemma 59

Let \(D\) be a triangle of type (R3). Let \(R\) be a rectangle of side ratio \(\delta _{0}\), which is parallel to one of the sides of \(D\). Then it is possible to cover \(D\) by finitely many scaled and translated copies of itself and by finitely many translated and scaled copies \(R_{k}\) of the rectangle \(R\) such that

  1. (i)

    \(|\bigcup_{k=1}^{K_{j}} R_{k}| \geq 10^{-6}|D|\),

  2. (ii)

    \(\sum_{k=1}^{K_{j}} \operatorname{Per}(R_{k}) \leq \frac{C}{\delta _{0}}\operatorname{Per}(D)\).

Our main ingredient in proving this is the following lemma:

Lemma 60

(Covering of Triangles by Rectangles)

Let \(D_{1,m}\) denote a right angle triangle in which the sides enclosing the right angle are of side lengths 1 and \(m\). Assume that \(m\in (0,50 \delta ^{-1})\). Let \(R\) be a rectangle which has side ratio \(\delta \in (0,1)\). Assume that the longer side of \(R\) is parallel to the side of the triangle \(D_{1,m}\) which is of length \(m\).

Then there exist a number \(L=L(m)\) and disjoint, rescaled and translated copies \(R_{k}\) of \(R\) with the properties that:

  1. (i)

    \(|\bigcup_{k=1}^{L}R_{k}| \geq 10^{-2}|D_{1,m}|\).

  2. (ii)

    The sum of the perimeters of the rectangles \(R_{k}\) satisfies

    $$\begin{aligned} \sum_{k=1}^{L}\operatorname{Per}(R_{k}) \leq \frac{C(1+m\delta )}{\delta }\leq \frac{C}{\delta }\operatorname{Per}(D_{1,m}). \end{aligned}$$

Remark 61

We remark that for our application the bound on \(m\) does not impose an additional requirement. Indeed, the triangles of type (R3) only occur as artifacts of the coverings in Lemmas 56 and 58. Here we may estimate \(m\) by \(\tan (\beta )\). For \(\beta =\frac{\pi }{2}-\delta \), a Taylor expansion of \(\tan \bigl(\tfrac{\pi }{2}-\delta \bigr)= \frac{\sin (\frac{ \pi }{2}-\delta )}{\cos (\frac{\pi }{2}-\delta )} = \frac{\cos (\delta )}{\sin (\delta )} = \delta ^{-1}+O(\delta )\) entails the desired estimate \(m < 50\delta ^{-1}\).

Before explaining Lemma 60, we show how our main covering result, Lemma 59, can be reduced to the situation of Lemma 60.

Proof of Lemma 59

We first claim that without loss of generality \(D\) can be assumed to be of type (R3) with \(R\) being parallel to one of the short sides of the triangle \(D\). Indeed, if \(R\) is parallel to the long side of the triangle \(D\), then this side is opposite of the right angle of the triangle. In this case, we split the triangle \(D\) into two smaller triangles \(D^{(1)}\), \(D^{(2)}\) by connecting the corner, at which \(D\) has its right angle, by the shortest line to the long side. The resulting triangles \(D^{(1)}\), \(D^{(2)}\) have the same angles as the original triangle (and in particular satisfy the non-degeneracy conditions for the angles, which are required in condition (R3)), but are now such that \(R\) is parallel to one of their short sides.

After this reduction, we seek to apply Lemma 60 with \(\delta =\delta _{0}\) for each of the triangles \(D^{(1)}\), \(D^{(2)}\). To this end we note that as \(\beta _{0} \geq C\delta _{0}\), we have that \(m\leq C \delta _{0}^{-1}\). As a consequence, Lemma 60 yields the desired result (by observing that \(\operatorname{Per}(D ^{(1)})+ \operatorname{Per}(D^{(2)}) \leq 2 \operatorname{Per}(D)\)). □
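That the altitude split produces two triangles with exactly the angles of the original one (so that the (R3) non-degeneracy conditions are inherited) can be checked by a short coordinate computation. The leg lengths 1 and 7 below are arbitrary illustrative values:

```python
import math

def angle_at(V, U, W):
    """Interior angle of a triangle at vertex V, given the other vertices."""
    a1 = math.atan2(U[1] - V[1], U[0] - V[0])
    a2 = math.atan2(W[1] - V[1], W[0] - V[0])
    d = abs(a1 - a2) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def angles(P, Q, R):
    """Sorted interior angles of the triangle PQR."""
    return sorted((angle_at(P, Q, R), angle_at(Q, P, R), angle_at(R, P, Q)))

a, b = 1.0, 7.0                     # legs; right angle at the origin O
O, A, B = (0.0, 0.0), (a, 0.0), (0.0, b)
s = a * a / (a * a + b * b)         # parameter of the altitude foot on A->B
H = (a + s * (0.0 - a), s * b)      # foot of the altitude from O

# both sub-triangles are similar to the original one
assert max(abs(x - y) for x, y in zip(angles(O, A, B), angles(O, A, H))) < 1e-9
assert max(abs(x - y) for x, y in zip(angles(O, A, B), angles(O, B, H))) < 1e-9
```

Since the angle sets of \(D^{(1)}\), \(D^{(2)}\) coincide with that of \(D\), both pieces again satisfy the lower and upper angle bounds \(C\delta _{0}\) and \(\frac{\pi }{2}-C\delta _{0}\).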

Proof of Lemma 60

We construct the desired covering by a “greedy” type algorithm. We begin by fitting in the largest possible copy \(R_{1}\) of \(R\) which touches the side of length \(m\) of the triangle \(D_{1,m}\) (cf. Fig. 21). As the rectangle \(R_{1}\) has to have a side ratio of \(1:\delta \), its side lengths can be computed explicitly to be \(l_{1}= \frac{m}{1+m\delta }\), \(l_{2} = \delta l_{1}\). Choosing \(R_{1}\) as the first rectangle in the desired covering, we have created a decomposition of the triangle \(D_{1,m}\) into three parts, the rectangle \(R_{1}\) and two triangles which are self-similar to the original triangle:

$$\begin{aligned} D_{1,m}= R_{1} \cup D_{a_{1}, a_{1} m} \cup D_{(1-a_{1}),(1-a_{1})m}. \end{aligned}$$

Here \(a_{1}:= l_{2}\) and hence \(1-a_{1}= \frac{1}{1+\delta m}\) denote the similarity factors with respect to the original triangle \(D_{1,m}\). We iterate this procedure in the new triangle \(D_{(1-a_{1}),(1-a _{1})m}\), while ignoring the (smaller) triangle \(D_{a_{1},a_{1} m}\). After \(L\) steps of this algorithm we have obtained \(L\) (up to null sets) disjoint rectangles \(R_{1},\ldots,R_{L}\). We claim that if \(L\) is chosen sufficiently large, the covering \(\bigcup_{k=1}^{L}R_{k}\) has the desired properties. Indeed, we choose \(L\) such that \((\frac{1}{1+ \delta m} )^{L}\in (\frac{1}{4}, \frac{1}{2})\) and first note that the construction of the rectangles \(R_{k}\) is based on a self-similar iterative process with similarity factor \(\lambda = \frac{1}{1+\delta m}\). Thus, we infer that

$$\begin{aligned} \Biggl|\bigcup_{k=1}^{L}R_{k}\Biggr| &= |R_{1}| \sum_{k=0}^{L-1} \lambda ^{2 k} = |R_{1}| \frac{1-\lambda ^{2L}}{1-\lambda ^{2}} \geq |R_{1}| \frac{1}{2}\frac{1}{1-\lambda ^{2}} = \frac{1}{2} \frac{m}{2+ \delta m} \\ &\geq \frac{1}{200} m. \end{aligned}$$

Here we used the disjointness (up to null sets) of the covering, the choice of \(L\), the value of \(\lambda \) and the assumption \(\delta m \leq 50\). As \(|D_{1,m}|= \frac{1}{2}m\), this yields the first claim. Similarly,

$$\begin{aligned} \sum_{k=1}^{L}\operatorname{Per}(R_{k}) &= \operatorname{Per}(R _{1}) \sum_{k=0}^{L-1} \lambda ^{ k} = \operatorname{Per}(R_{1}) \frac{1- \lambda ^{L}}{1-\lambda } \leq \operatorname{Per}(R_{1}) \frac{1}{1- \lambda } \\ &\leq 4 \frac{1}{ \delta } \leq 4 \frac{1}{ \delta } \operatorname{Per}(D_{1,m}). \end{aligned}$$

Here we used that \(\operatorname{Per}(R_{1})\leq 4 l_{1} = \frac{4m}{1+\delta m}\), that \(\frac{1}{1-\lambda } = \frac{1+\delta m}{\delta m}\) and that \(\operatorname{Per}(D_{1,m})\geq 1\). □

Fig. 21

The triangles \(D_{1,m}\) for \(m\sim 1\) (left) and \(m\sim \delta \) (right). Coverings of at least half the volume for \(m>1\) and \(m\sim \delta <1\)
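The greedy construction and the two claims of Lemma 60 can be replayed numerically. The sketch below follows the proof; the sample values \(m = 5\), \(\delta = 0.05\) are illustrative (any pair with \(\delta m \leq 50\) works), and the explicit constant 8 in the perimeter check is our choice, not taken from the text:

```python
def greedy_cover(m, delta):
    """Greedy covering of the right triangle D_{1,m} (legs 1 and m) by
    rectangles of side ratio 1:delta whose long sides are parallel to
    the leg of length m.  Returns (covered area, total perimeter, L)."""
    lam = 1.0 / (1.0 + delta * m)        # similarity factor per step
    l1 = m / (1.0 + delta * m)           # long side of the first rectangle
    area, per = delta * l1 * l1, 2.0 * (1.0 + delta) * l1
    covered = perimeter = 0.0
    L = 0
    while lam ** L > 0.5:                # stop once lam**L <= 1/2
        covered += area * lam ** (2 * L)
        perimeter += per * lam ** L
        L += 1
    return covered, perimeter, L

m, delta = 5.0, 0.05                     # illustrative; delta*m = 0.25 <= 50
covered, perimeter, L = greedy_cover(m, delta)

assert covered >= 1e-2 * (0.5 * m)       # claim (i): at least 10^-2 |D_{1,m}|
assert perimeter <= 8.0 / delta         # claim (ii) with the constant C = 8
```

For these moderate values four rectangles already cover well over half of \(|D_{1,m}| = 2.5\); the guaranteed bounds of the lemma are far from sharp in this regime.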


4.6 Proof of Proposition 45

We initialize the construction by applying Step 1 in Algorithms 27 and 30. As these initial triangles are obtained as the level sets of a Conti construction with ratio \(\delta _{0}\), they all form \(\delta _{0}\)-good triangles (cf. Lemma 46) and hence satisfy the properties of the theorem. It therefore remains to argue that this is preserved in our constructions from Sects. 4.4 and 4.5. Given one of the triangles \(D_{j}\) as in the theorem, the results of Propositions 52 and 55 ensure this, once the rotation angle of the successive Conti constructions is controlled. This, however, is achieved by virtue of Remark 38 and Lemma 39. □

5 Quantitative Analysis

After having recalled the qualitative construction of convex integration solutions in Sect. 3, we now focus on controlling the scheme quantitatively. Here we rely on the quantitative covering results from Sect. 4 (cf. Propositions 52 and 55), which allow us to obtain bounds on the \(BV\) norm of the iterates \(u_{k}\) and the corresponding characteristic functions associated with the well \(e^{(i)}\in K\) (Lemma 63). Combined with an \(L^{1}\) estimate and the interpolation inequality from Theorem 2 or from Corollary 3, this then yields the desired \(W^{s,q}\) regularity of the characteristic function of the phases.
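Schematically, the interplay between the two estimates can be summarized as follows. This is a heuristic sketch: the exponent \(\theta \) and the precise form of the interpolation inequality are as provided by Theorem 2 and Corollary 3, not reproduced here.

```latex
% BV growth (Lemma 63) and L^1 decay (Lemma 64):
%   \|\chi_k^{(i)}\|_{BV} \le C_1 (C_0 \delta_0^{-1})^k,
%   \|\chi_k^{(i)} - \chi_{k+1}^{(i)}\|_{L^1} \le C (1 - \tfrac{7}{8}v_0)^k.
% An interpolation inequality of the form
\[
  \|f\|_{W^{s,q}} \leq C \|f\|_{BV}^{\theta}\, \|f\|_{L^{1}}^{1-\theta},
  \qquad \theta = \theta(s,q) \in (0,1),
\]
% applied to f = \chi_k^{(i)} - \chi_{k+1}^{(i)} produces a factor of the order
\[
  \bigl(C_0 \delta_0^{-1}\bigr)^{\theta k}
  \Bigl(1 - \tfrac{7}{8} v_0\Bigr)^{(1-\theta)k},
\]
% which is summable in k (and hence yields W^{s,q} bounds for \chi^{(i)})
% provided s, q are chosen so small that
\[
  \theta \log\bigl(C_0 \delta_0^{-1}\bigr)
  + (1-\theta)\log\Bigl(1 - \tfrac{7}{8} v_0\Bigr) < 0 .
\]
```

In particular, the exponential (rather than superexponential) growth of the \(BV\) norms is exactly what makes this balance possible.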

As in Sect. 3.2, given a matrix \(M\) with \(e(M)\in \operatorname{intconv}(K)\), we here assume that \(\varOmega := Q_{\beta }[0,1]^{2}\), where \(Q_{\beta }\) is the rotation, which describes how the Conti construction with respect to \(M\) and \(e^{(p)}_{0}\) is rotated with respect to the \(x_{1}\)-axis. This special case will play the role of a crucial building block in the situation of more general domains (cf. Sect. 6).

We begin by defining the characteristic functions associated with the corresponding wells:

Definition 62

(Characteristic Functions)

We define the characteristic functions, \(\chi _{k}^{(1)}\), \(\chi _{k}^{(2)}\), \(\chi _{k}^{(3)}\) associated with \(e^{(1)}\), \(e^{(2)}\), \(e^{(3)}\) in the \(k\)th step of the Conti construction as

$$\begin{aligned} \chi _{k}^{(i)}(x)= \left\{ \textstyle\begin{array}{l@{\quad}l} 1 & \mbox{if } e(\nabla u_{k})(x) = e^{(i)}, \\ 0 & \mbox{else,} \end{array}\displaystyle \right. \quad i\in \{1,2,3\}. \end{aligned}$$

We denote their point-wise a.e. limits as \(k\rightarrow \infty \) by \(\chi ^{(i)}\), \(i\in \{1,2,3\}\).

We emphasize that these point-wise limits exist, since for a.e. point \(x\in \varOmega \) there exists an index \(k_{x}\in \mathbb{N}\) such that \(x\in \varOmega \setminus \varOmega _{k_{x}}\). By our convex integration algorithm and by Definition 62, the value of \(\chi _{l}^{(i)}(x)\) remains fixed for \(l\geq k_{x}\).

Using the covering results from Sect. 4, we can address the \(BV\) bounds for the characteristic functions \(\chi _{k} ^{(i)}\), \(i\in \{1,2,3\}\):

Lemma 63

(\(BV\) Control)

Let \(u_{j}:\varOmega \rightarrow \mathbb{R}^{2}\), \(\chi _{j}^{(1)},\chi _{j} ^{(2)},\chi ^{(3)}_{j}\) denote the displacement and characteristic functions which are obtained in the \(j\)-th step of the convex integration scheme from Proposition 36. Let \(\delta _{0}\) be as in Algorithm 27, Step 0 (b). Then, there exist constants \(C_{0}, C_{1}>0\) (independent of \(\delta _{0}\)) such that

$$\begin{aligned} \bigl\| \chi _{j}^{(i)}\bigr\| _{BV(\varOmega )} \leq C_{1} \bigl(C_{0}\delta _{0}^{-1}\bigr)^{j} \quad\textit{for } i\in \{1,2,3\}. \end{aligned}$$


Proof

We first deduce the following iterative bound for the size of the \(BV\) norm:

$$\begin{aligned} \sum_{k=1}^{J_{j}}\sum _{\varOmega _{j+1,l}\in \mathcal{D}_{1}(\varOmega _{j,k})} \operatorname{Per}(\varOmega _{j+1,l}) \leq C \delta _{0}^{-1} \sum_{k=1}^{J_{j}} \operatorname{Per}(\varOmega _{j,k}). \end{aligned}$$

To this end, let \(\varOmega _{j,k}\) be a triangle from the covering \(\{\varOmega _{j,k}\}_{k\in \{1,\dots ,J_{j}\}}\) at the \(j\)-th iteration step. In particular, \(e(\nabla u_{j})\) is constant on \(\varOmega _{j,k}\). We apply Algorithms 27 and 30 and in these specify the choice of our covering to be the one of Proposition 52 or the one of Proposition 55. In order to bound the resulting \(BV\) norm, we distinguish two cases:

  1. (a)

    The parallel case. Assume that \(\varOmega _{j,k}\) is of the type (P1) or (P2) (which, by the explanations below Definition 43 holds in the parallel case). Thus, the Conti construction in the \(j\)th and \((j+1)\)th step are nearly aligned in the direction of their degeneracy. In this case, Proposition 52 is applicable and implies that

    $$\begin{aligned} \sum_{\varOmega _{j+1,l}\in \mathcal{D}_{1}(\varOmega _{j,k})} \operatorname{Per}( \varOmega _{j+1,l}) \leq C \operatorname{Per}(\varOmega _{j,k}), \end{aligned}$$

    for some absolute constant \(C>0\).

  2. (b)

    The rotated case. Assume that \(\varOmega _{j,k}\) is of the type (R1)–(R3) (which, by the explanations below Definition 43, holds in the rotated case). That is, the Conti constructions in the \(j\)th and \((j+1)\)th steps are not aligned; by Lemma 39 there are even lower bounds on the degree of alignment. In this case, Proposition 55 is applicable and yields that

    $$\begin{aligned} \sum_{\varOmega _{j+1,l}\in \mathcal{D}_{1}(\varOmega _{j,k})} \operatorname{Per}( \varOmega _{j+1,l}) \leq C \delta _{0}^{-1} \operatorname{Per}(\varOmega _{j,k}). \end{aligned}$$

Combining the two estimates (45) and (46) and summing over all domains \(\varOmega _{j,k}\) for fixed \(j\) implies (44). From this we infer that

$$\begin{aligned} \sum_{k=1}^{J_{j}}\operatorname{Per}( \varOmega _{j,k}) \leq C_{1} \bigl(C_{0}\delta _{0}^{-1}\bigr)^{j}. \end{aligned}$$

As by construction

$$\begin{aligned} \sum_{i=1}^{3}\bigl\| \chi _{j}^{(i)} \bigr\| _{BV(\varOmega )} \leq C \sum_{k=1}^{J_{j}} \operatorname{Per}(\varOmega _{j,k})\leq C_{1} \bigl(C_{0}\delta _{0}^{-1}\bigr)^{j}, \end{aligned}$$

we therefore obtain the statement of the lemma. □
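The case analysis above can be condensed into a small numerical sketch. All constants below (`C_par`, `C_rot`, `delta_0`) and the case sequences are hypothetical placeholders, not values from the construction; the point is only that the worst-case growth of the total perimeter is \((C_{0}\delta _{0}^{-1})^{j}\), attained when every refinement step is of the rotated type (b).

```python
# Sketch of the perimeter recursion in the proof of Lemma 63.
# Case (a), "parallel": Per_{j+1} <= C_par * Per_j
# Case (b), "rotated":  Per_{j+1} <= (C_rot / delta_0) * Per_j
C_par, C_rot, delta_0 = 4.0, 4.0, 1e-2   # hypothetical constants

def perimeter_bound(cases):
    """Upper bound on sum_k Per(Omega_{j,k}) after the given case sequence."""
    per = 1.0
    for c in cases:
        per *= C_par if c == "a" else C_rot / delta_0
    return per

j = 10
C0 = max(C_par, C_rot)
worst = perimeter_bound("b" * j)          # every step rotated
mixed = perimeter_bound("ab" * (j // 2))  # alternating parallel/rotated
assert worst <= (C0 / delta_0) ** j * (1 + 1e-12)
assert mixed < worst                      # parallel steps only improve the bound
```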

Using the explicit construction of our convex integration scheme, we further estimate the difference of two successive iterates in the \(L^{1}\) norm:

Lemma 64

(\(L^{1}\) Control)

Let \(u_{k}:\varOmega \rightarrow \mathbb{R}^{2}\), \(\chi _{k}^{(1)},\chi _{k} ^{(2)},\chi ^{(3)}_{k}\) denote the displacement and characteristic functions which are obtained in the \(k\)-th step of the convex integration scheme from Lemma 23. Then,

$$\begin{aligned} \bigl\| \chi _{k}^{(i)}- \chi _{k+1}^{(i)} \bigr\| _{L^{1}(\varOmega )} \leq C \biggl(1- \frac{7}{8} v_{0} \biggr)^{k} \quad\textit{for } i\in \{1,2,3\}. \end{aligned}$$

Remark 65

In our realization of the covering argument, which is described in Sect. 4, we have chosen \(v_{0} = 10^{-6}\). In particular, it is independent of the boundary condition \(M\) in (6).


Proof

The proof follows immediately from the Conti construction and the observations that

$$\begin{aligned} |\varOmega _{j}|\leq C \biggl(1-\frac{7}{8}v_{0} \biggr)^{j} |\varOmega |, \end{aligned}$$

and that \(\chi _{k}^{(i)}(x) = \chi _{j}^{(i)}(x)\) for a.e. \(x\in \varOmega \setminus \varOmega _{j}\), if \(k\geq j\). □

Combining Lemmas 63 and 64 with Theorem 2 or Corollary 3 yields the following regularity result (see also Fig. 22 for the parameter \(\theta_{0}\)):

Fig. 22

By interpolation, our decay and growth bounds for the \(L^{1}\) and \(BV\) norms of the differences \(\chi _{j+1}^{(i)}-\chi _{j} ^{(i)}\) yield Cauchy sequences in the \(W^{\theta ,q}\) spaces inside the triangular region. Here, we use that the functions under consideration are characteristic functions and hence all \(L^{p}\) norms with \(1\leq p <\infty \) can be compared.

Proposition 66

(Regularity of Convex Integration Solutions)

Let \(M\in \mathbb{R}^{2\times 2}\) with \(e(M)\in \operatorname{intconv}(K)\) and assume that \(\varOmega =Q_{\beta }[0,1]^{2}\). Let \(\delta _{0}>0\) be as in Step 0(b) in Algorithm 27. Let \(u:\varOmega \rightarrow \mathbb{R}^{2}\) be a convex integration solution obtained according to Algorithms 27 and 30 and described in Proposition 36. Then it is possible to obtain

$$\begin{aligned} \chi ^{(i)} \in W^{s,q} \end{aligned}$$

for all \(s \in (0,1)\), \(q\in (1,\infty )\) with \(0< s q < \theta _{0}\) and \(\theta _{0}=\frac{\ln (1-\frac{7}{8}v_{0})}{\ln (1-\frac{7}{8}v_{0})+ \ln (\delta _{0})-\ln (C_{0})}\). Here \(C_{0}>0\) is an absolute constant which does not depend on \(\delta _{0}\) and \(M\).


Proof

By Remark 5 it suffices to control a weighted product of the \(L^{1}\) and the \(BV\) norms. Combining Lemmas 63 and 64 to this end yields that

$$\begin{aligned} \bigl\| \chi _{j+1}^{(i)}-\chi _{j}^{(i)} \bigr\| _{L^{1}(\mathbb{R}^{2})}^{1-\theta }\bigl\| \chi _{j+1}^{(i)}-\chi _{j}^{(i)}\bigr\| _{BV(\mathbb{R}^{2})}^{\theta } \leq C \biggl(1- \frac{7}{8}v_{0} \biggr)^{j (1-\theta )} \biggl( \frac{C _{0}}{\delta _{0}} \biggr)^{j \theta } . \end{aligned}$$

Choosing \(\theta \in (0,\theta _{0})\) with \(\theta _{0}:= \frac{\ln (1- \frac{7}{8}v_{0})}{\ln (1-\frac{7}{8}v_{0})+\ln (\delta _{0})-\ln (C _{0})} \) hence yields geometric decay for the right hand side of (47). Invoking Remark 5 and using (47) hence implies that

$$\begin{aligned} \bigl\| \chi _{j+1}^{(i)}-\chi _{j}^{(i)} \bigr\| _{W^{s,q}(\mathbb{R}^{2})} &\leq C _{s,q} \biggl( \biggl(1-\frac{7}{8}v_{0} \biggr)^{j (1-\theta _{1})} \biggl(\frac{C_{0}}{\delta _{0}} \biggr)^{j \theta _{1}} \biggr)^{\frac{s}{ \theta _{1}}} \end{aligned}$$

for any \(\theta _{1} \in (0,\theta _{0})\) and any pair \((s,q)\in (0,1) \times (1,\infty )\) with \(0< s q \leq \theta _{1}\). Therefore, a telescoping argument entails that the sequence \(\{\chi _{j}^{(i)}\}_{j\in \mathbb{N}}\) forms a Cauchy sequence in \(W^{s,q}(\mathbb{R}^{2})\). Hence completeness yields that \(\chi ^{(i)}\in W^{s,q}\) for all \(0< sq < \theta _{0}\). □
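For completeness, the threshold \(\theta_0\) arises from a one-line computation (assuming \(\delta_0 < C_0\), so that \(\ln\delta_0-\ln C_0<0\)): the right-hand side of (47) decays geometrically if and only if its base is smaller than one, i.e.

```latex
% With a := \ln(1-\tfrac{7}{8}v_0) < 0 and b := \ln\delta_0 - \ln C_0 < 0:
\begin{aligned}
(1-\theta)\,a + \theta\,(\ln C_0 - \ln \delta_0) < 0
  &\;\Longleftrightarrow\; a - \theta\,(a+b) < 0 \\
  &\;\Longleftrightarrow\; \theta < \frac{a}{a+b}
   = \frac{\ln (1-\frac{7}{8}v_{0})}{\ln (1-\frac{7}{8}v_{0})
     +\ln \delta _{0}-\ln C_{0}} = \theta_0 ,
\end{aligned}
```

where the last equivalence uses \(a+b<0\), so that dividing by \(a+b\) reverses the inequality.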

Remark 67


We remark that the quantitative dependences in Proposition 66 are clearly non-optimal. Parameters which could be varied to improve this include, for instance:

  • Varying the volume fraction \(\lambda \in (0,1)\) in the Conti construction from Corollary 20.

  • Choosing a sharper relation between \(\epsilon _{j}\) and \(\delta _{j}\) and modifying the \(j\)-dependence of \(\epsilon _{j}\) (for instance by only using summability for the stagnant matrices instead of the geometric decay, which is prescribed in Step 2(b) of Algorithm 27).

This would however not change the qualitative behavior of the estimates.

A qualitatively different behavior would arise if, in the proof of Lemma 63, only case (a) occurred. Then, based on our construction in Proposition 52 and on Lemma 63, the \(\delta _{0}\) dependence in Proposition 66 would improve: the choice of the product of the exponents \(s\), \(q\) in Proposition 66 would not depend on \(\delta _{0}\), but would be uniform in the whole triangle \(\operatorname{intconv}(K)\). Only the value of the \(W^{s,q}\) norm would then deteriorate with \(\delta _{0}\).

We remark, however, that since a matrix in \(\operatorname{intconv}(K)\) is a convex combination of all three values \(e^{(1)}\), \(e^{(2)}\), \(e^{(3)}\), the described construction necessarily involves instances of case (b). It is conceivable that, by controlling the number of these steps, it could be possible to improve the dependence of \(s\), \(q\) on \(\delta _{0}\). It is unclear (and perhaps rather unlikely) whether it is possible to completely remove it with the described convex integration scheme.
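To get a feeling for these quantitative dependences, one can evaluate \(\theta_0\) from Proposition 66 numerically. Here \(v_0=10^{-6}\) as in Remark 65, while the value \(C_0=1\) and the sample choices of \(\delta_0\) are hypothetical placeholders:

```python
import math

def theta_0(delta_0, v0=1e-6, C0=1.0):
    """The exponent theta_0 from Proposition 66 (requires delta_0 < C0)."""
    a = math.log(1 - 7 / 8 * v0)  # ln(1 - 7/8 v0) < 0
    return a / (a + math.log(delta_0) - math.log(C0))

for d in (1e-1, 1e-2, 1e-4):
    t = theta_0(d)
    assert 0 < t < 1
    print(f"delta_0 = {d:g}: theta_0 = {t:.3g}")

# theta_0 shrinks as delta_0 does: the admissible products s*q are tiny.
assert theta_0(1e-4) < theta_0(1e-2) < theta_0(1e-1)
```

Even for moderate values of \(\delta_0\) the exponent is of order \(10^{-7}\), reflecting the non-optimality discussed above.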

Remark 68

(Fractal Dimension)

We emphasize that in accordance with Remark 6 the \(W^{s,p}\) regularity of \(\chi ^{(i)}\) for \(i\in \{1,2,3\}\) has direct implications on the (packing) dimension of the boundary of the sets \(\{x\in \mathbb{R}^{2}: \chi ^{(i)}(x)=1\}\).

Similarly, we obtain bounds on the displacement and the infinitesimal strain tensor:

Proposition 69

Let \(\theta _{0}\in (0,1)\) be the exponent from Proposition 66. Then for all \(s\in (0,1)\), \(p \in (1,\infty )\) with \(sp <\theta _{0}\) there exist solutions \(u:\varOmega \rightarrow \mathbb{R}^{2}\) of (6) with

$$\begin{aligned} \nabla u - M \in W^{s,p}\bigl(\mathbb{R}^{2}\bigr). \end{aligned}$$


Proof

The proof is along the lines of the proof of Proposition 66. However, instead of estimating \(\chi _{j+1}^{(i)} - \chi _{j}^{(i)}\), we bound \(\nabla u_{j+1}-\nabla u_{j}\). Here the \(BV\) bound follows from the bound for \(\chi _{j+1}^{(i)}\) by noting the uniform boundedness of \(\nabla u_{j}\) (cf. Proposition 34) and the fact that in the estimate for \(\chi _{j+1}^{(i)}\) we already used the whole resulting perimeter. Hence

$$\begin{aligned} \|\nabla u_{j+1}- \nabla u_{j}\|_{BV(\mathbb{R}^{2})} \leq C \bigl\| \chi _{j+1}^{(i)}- \chi _{j}^{(i)} \bigr\| _{BV(\mathbb{R}^{2})} . \end{aligned}$$

For the \(L^{1}\) estimate we use the \(L^{\infty }\) bound for \(\nabla u_{j}\) (which follows from Proposition 34 and the fact that \(e(\nabla u_{j}) \in \overline{ \operatorname{conv}(K)}\)) in combination with the fact that in the \(j\)th iteration step \(\nabla u_{j}\) is only changed on a volume fraction of \((1-\frac{7}{8}v_{0} )^{j}\). Thus,

$$\begin{aligned} \|\nabla u_{j+1}- \nabla u_{j}\|_{L^{1}(\varOmega )} \leq C \max_{x\in \mathbb{R}^{2}}\bigl|\nabla u_{j}(x)\bigr| \biggl(1- \frac{7}{8}v _{0} \biggr)^{j}. \end{aligned}$$

Hence the same interpolation as above yields the \(W^{s,p}(\mathbb{R} ^{2})\) regularity of \(\nabla u - M\), which implies the desired result. □

6 General Domains

In this section we explain how to construct the desired “regular” convex integration solutions in arbitrary Lipschitz domains by using the bounds from the special cases which were discussed in Sect. 5. In this context our main result is the following:

Proposition 70

Assume that \(\varOmega \subset \mathbb{R}^{2}\) is a bounded Lipschitz domain and suppose that \(M \in \mathbb{R}^{2\times 2}\) with \(e(M)\in \operatorname{intconv}(K)\). Let \(\beta \in [0,2\pi )\) be the angle with which the Conti construction for \(M\) is rotated with respect to the \(x_{1}\)-axis, and let \(\chi _{k}^{(i)}\) be defined as in Definition 62. Let \(\theta _{0}>0\) be the \(W^{s,p}\) exponent for the regularity of \(\chi ^{(i)}\) with respect to the rotated unit square \(Q_{\beta }[0,1]^{2}\) adapted to \(M\), i.e. let \(\theta _{0}\) be such that for all \(s\in (0,1)\), \(p\in (1,\infty ]\) with \(0< sp <\theta _{0}\) and for some \(\mu (s,p)\in (0,1)\) we have

$$\begin{aligned} \bigl\| \chi ^{(i)}_{k+1}-\chi _{k}^{(i)} \bigr\| _{BV(Q_{\beta }[0,1]^{2})}^{\theta }\bigl\| \chi ^{(i)}_{k+1}-\chi _{k}^{(i)}\bigr\| _{L^{1}(Q_{\beta }[0,1]^{2})}^{1- \theta } \leq C(s,p)\mu (s,p)^{k}. \end{aligned}$$

Then there exists a constant \(C(\varOmega ,M,s,p)\) and a family of subsets \(\bar{\varOmega }_{k}\subset \varOmega \) such that

  1. (i)

    \(\bar{\varOmega }_{k}:= \bigcup_{l=1}^{k} \bigcup_{m=1}^{K_{l}} Q_{l}^{m}\), where \(Q_{l}^{m}:= ( [0, \lambda _{l}]^{2} + x_{l,m} )\) are (up to null-sets) disjoint cubes with \(x_{l,m} \in \varOmega \) and \(\lambda _{l}:= 2^{-l}\) such that

    $$\begin{aligned} \bar{\varOmega }_{k} \nearrow \varOmega \quad\textit{in } L^{1} \bigl(\mathbb{R}^{2}\bigr) \end{aligned}$$

    (in the sense of the convergences of their characteristic functions),

  2. (ii)

    for \(\tilde{\chi }^{(i)}_{k}(x):= \sum_{l=1}^{k} \sum_{m=1}^{K_{l}} \chi _{k}^{(i)}( \frac{x-x_{l,m}}{\lambda _{l}})\chi _{\bar{\varOmega }_{k}}(x)\) the estimate (49) remains valid for all \(s\), \(p\) with \(s\in (0,1)\), \(p \in (1,\infty ]\) and \(0< sp<\theta _{0}\). In the dependences, however, the constant \(C(s,p)\) is replaced by \(C(\varOmega ,M,s,p)\) and \(\mu (s,p)\) is replaced by \(C(\theta )\mu (s,p)\). Here \(\theta =\theta (s,p)\) is the interpolation exponent associated with \(s,p>0\).

As an immediate consequence we infer the following corollary:

Corollary 71

Suppose that \(\varOmega \subset \mathbb{R}^{2}\) is a bounded Lipschitz domain and assume that \(M\in \mathbb{R}^{2\times 2}\) with \(e(M)\in \operatorname{intconv}(K)\). Let \(\beta \in [0,2\pi )\) be the angle with which the Conti construction for \(M\) is rotated with respect to the \(x_{1}\)-axis. Let \(\theta _{0}>0\) be the limiting \(W^{s,p}\) exponent for the regularity of \(\chi ^{(i)}\) with respect to the rotated unit square \(Q_{\beta }[0,1]^{2}\) adapted to \(M\). Then for all \(s,p>0\) with \(0< sp<\theta _{0}\)

  1. (i)

    the point-wise limit \(\tilde{\chi }^{(i)}\) of the functions \(\tilde{\chi }^{(i)}_{k}\) satisfies

    $$\begin{aligned} \bigl\| \tilde{\chi }^{(i)}\bigr\| _{W^{s,p}(\varOmega )} \leq C(\varOmega , M, s,p). \end{aligned}$$
  2. (ii)

    there exist solutions \(u\) to (6) with

    $$\begin{aligned} \|\nabla u\|_{W^{s,p}(\varOmega )} \leq C(\varOmega , M, s,p). \end{aligned}$$

Proof of Corollary 71

We first note that the point-wise limit \(\tilde{\chi }^{(i)}\) exists, since \(\bar{\varOmega }_{k} \rightarrow \varOmega \) and since \(\chi _{k}^{(i)} \rightarrow \chi ^{(i)}\) in a point-wise sense as \(k\rightarrow \infty \). With this at hand, the proof of Corollary 71 follows from Proposition 70 by interpolation in an analogous way as explained in Proposition 66. We therefore omit the details of the proof of the corollary. □

We proceed to the proof of Proposition 70. Here we argue by covering our general domain \(\varOmega \) by the special domains from Sect. 5 (Steps 1 and 2). On each of the special domains, we apply the construction from Sect. 5 (cf. also Algorithms 27 and 30). In order to obtain a sequence with bounded \(W^{s,p}\) norm, we however do not refine to arbitrarily fine scales immediately, but proceed iteratively (cf. Step 3). A central point here is to control the necessary number of cubes at each scale (Claim 72), since this has to be balanced with the corresponding energy contribution (cf. Step 4). To this end, we use a “volume argument”, which by the Lipschitz regularity of the domain allows us to infer information on the number of cubes on each scale (cf. Proof of Claim 72).

Proof of Proposition 70

Step 1: Covering of a general Lipschitz domain. We may assume that \(M=0\) and first consider the case of \(\varOmega \) being a domain which is bounded by the \(x_{2}\)-axis, the segment \([0,1]\times \{0\}\), a Lipschitz graph \(f:[0,1]\rightarrow \mathbb{R}\) and the segment \(\{1\}\times [0,f(1)]\). By symmetry we may further assume that \(f(x_{1})\geq 0\) for all \(x_{1}\in [0,1]\). For general domains \(\varOmega \), by the compactness and Lipschitz regularity, we may locally reduce to a similar case, where \(f\) is a Lipschitz curve, but not necessarily a graph. However, all arguments in the following extend to that case as well.

Step 2: Counting cubes. Let \(\tilde{\varOmega }_{l}:= \bigcup_{k=1}^{K_{l}} Q_{l}^{k}\), where \(Q_{l}^{k}\subset \varOmega \) are (up to zero sets) disjoint, grid cubes of an axis-parallel grid of grid size \(\lambda _{l}:=2^{-l}\). We choose \(K_{l}\in \mathbb{N}\) maximal. Thus, by definition we have that \(\tilde{\varOmega }_{l} \subset \tilde{\varOmega }_{l+1}\subset \varOmega \) for all \(l\in \mathbb{N}\). In the limit \(l\rightarrow \infty \) the sets \(\tilde{\varOmega }_{l}\) eventually cover the whole set \(\varOmega \) (which we assume to be as in Step 1).

We estimate the number of the cubes which are contained in the sets \(\tilde{\varOmega }_{l+1}\setminus \tilde{\varOmega }_{l}\). For these we claim:

Claim 72

The set \(\tilde{\varOmega }_{l+1}\setminus \tilde{\varOmega }_{l}\) contains at most \(C_{f} \lambda _{l+1}^{-1}\) of the grid cubes \(Q_{l+1}^{k} \subset \varOmega \).

Proof of Claim 72

Indeed, we first observe that for a sufficiently large constant \(C_{f}\) (depending on \(f\), cf. Remark 73) and for a sufficiently large value of \(l\in \mathbb{N}\) every point \(x\in \varOmega \) in the subgraph \(S(f,l)\) of \(f-C_{f} \lambda _{l}\) is contained in a cube \(Q_{l}^{k} \subset \tilde{\varOmega }_{l}\). Hence at least a volume of size

$$\begin{aligned} \bigl|S(f,l) \cap \varOmega \bigr|:= \int _{0}^{1}f(x)dx - C_{f} \lambda _{l}, \end{aligned}$$

is completely covered by cubes of size \(\lambda _{l}\). There may be additional cubes of size \(\lambda _{l}\) contained in \(\tilde{\varOmega } _{l}\). As however only a volume of size \(C_{f} \lambda _{l}\) is left and as each cube has volume \(\lambda _{l}^{2}\), the number of these additional cubes is controlled by

$$\begin{aligned} \#\bigl\{ \mbox{grid cubes of size }\lambda _{l}\mbox{ in } \tilde{ \varOmega }_{l} \setminus S(f,l)\bigr\} \leq 2 C_{f} \lambda _{l} \lambda _{l}^{-2} \leq 2 C _{f} \lambda _{l}^{-1}. \end{aligned}$$

Combining this with the observation that

$$\begin{aligned} \bigl|\varOmega \cap S(f,l+1)\bigr|-\bigl|\varOmega \cap S(f,l)\bigr| = C_{f} (\lambda _{l}- \lambda _{l+1}) = C_{f} \lambda _{l+1}, \end{aligned}$$

which implies that \(S(f,l+1)\) has at most \(C_{f} \lambda _{l+1}^{-1}\) more cubes of size \(\lambda _{l+1}\) than \(S(f,l)\), we infer that \(\tilde{\varOmega }_{l+1}\setminus \tilde{\varOmega }_{l}\) contains at most \(4 C_{f} \lambda _{l+1}^{-1}\) cubes of size \(\lambda _{l+1}\). □
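The counting in Claim 72 can be illustrated with a small numerical sketch. The boundary graph `f`, its Lipschitz constant `L`, and the containment test below are hypothetical stand-ins for the objects in the proof; the assertion only checks that the number of new grid cubes per refinement level is of order \(\lambda _{l+1}^{-1} = 2^{l+1}\) (with a generous constant):

```python
import math

L = 0.3 * 2 * math.pi  # Lipschitz constant of the sample graph below
def f(x):
    # hypothetical Lipschitz upper boundary f: [0,1] -> (0,1)
    return 0.6 + 0.3 * math.sin(2 * math.pi * x)

def cubes(l):
    """Indices (i, j) of grid cubes of side 2**-l contained in the subgraph of f."""
    lam = 2.0 ** (-l)
    out = set()
    for i in range(2 ** l):
        fmin = f(i * lam) - L * lam  # Lipschitz lower bound on [i*lam, (i+1)*lam]
        for j in range(max(0, int(fmin / lam))):
            out.add((i, j))
    return out

for l in range(3, 9):
    coarse, fine = cubes(l), cubes(l + 1)
    # cubes of side 2**-(l+1) not contained in a cube of the coarser level
    new = [q for q in fine if (q[0] // 2, q[1] // 2) not in coarse]
    assert len(new) <= 16 * 2 ** (l + 1)  # at most C * lambda_{l+1}^{-1} new cubes
```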

Step 3: Definition of the algorithm. We use the following definitions

$$\begin{aligned} \hat{\varOmega }_{1}:=\tilde{\varOmega }_{1},\qquad \hat{ \varOmega }_{l}:= \tilde{\varOmega }_{l} \setminus \bigcup _{j=1}^{l-1} \tilde{\varOmega }_{j} \quad\mbox{for } l\geq 2. \end{aligned}$$

With this we set (as illustrated in Fig. 23)

$$\begin{aligned} \bar{\varOmega }_{k}:=\bigcup_{j=1}^{k} \hat{\varOmega }_{j}. \end{aligned}$$

We recall that by Claim 72 we have that \(\bar{\varOmega } _{k+1}\setminus \bar{\varOmega }_{k} \) is a union of at most \(C C_{f} \lambda _{k+1}^{-1}\) cubes of side lengths \(\lambda _{k+1}\).

Fig. 23

The covering of \(\varOmega \) by squares of decreasing sizes defines the set \(\bar{\varOmega }_{k}\).

Denoting by \(u_{k}\) the displacement in step \(k\) of Algorithms 27 and 30 with initialization as in Step 1, we define the displacement \(\tilde{u}_{k}|_{\bar{\varOmega }_{k}}\) on \(\bar{\varOmega }_{k}\) in the \(k\)th step as

$$\begin{aligned} \tilde{u}_{k}(x):= \left\{ \textstyle\begin{array}{l@{\quad}l} \lambda _{l} u_{k}(\lambda _{l}^{-1} (x-x_{l,k})) &\mbox{for } x \in Q_{l}^{k} \subset \hat{\varOmega }_{l} \cap \bar{\varOmega }_{k}, \\ 0 &\mbox{for } x \notin \bar{\varOmega }_{k}, \end{array}\displaystyle \right. \end{aligned}$$

where \(x_{l,k}\in Q_{l}^{k}\) denotes the center of the cube \(Q_{l}^{k}\). We observe that \(\tilde{u}_{k}\) is a Lipschitz function (since \(M=0\)). We define \(\tilde{\chi }_{k}^{(i)}\) as the associated characteristic function for the well \(e^{(i)}\), i.e.,

$$\begin{aligned} \tilde{\chi }_{k}^{(i)}(x):= \left\{ \textstyle\begin{array}{l@{\quad}l} 1 &\mbox{if } e(\nabla \tilde{u}_{k}) (x) = e^{(i)}, \\ 0 &\mbox{else}. \end{array}\displaystyle \right. \end{aligned}$$

Step 4: Energy estimate. We note that if

$$\begin{aligned} E_{k,1}:=\bigl\| \chi _{k+1}^{(i)}-\chi _{k}^{(i)}\bigr\| _{BV([0,1]^{2})}^{\theta }\bigl\| \chi _{k+1}^{(i)}-\chi _{k}^{(i)} \bigr\| _{L^{1}([0,1]^{2})}^{1-\theta } \leq C \mu (s,p)^{k}, \end{aligned}$$

scaling implies that

$$\begin{aligned} E_{k,l}:=\bigl\| \chi _{k+1}^{(i)}-\chi _{k}^{(i)}\bigr\| _{BV([0,\lambda _{l}]^{2})} ^{\theta }\bigl\| \chi _{k+1}^{(i)}-\chi _{k}^{(i)} \bigr\| _{L^{1}([0,\lambda _{l}]^{2})} ^{1-\theta } \leq C \mu (s,p)^{k} \lambda _{l}^{2-\theta }. \end{aligned}$$
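The scaling behind this step is elementary: writing \(\chi_{\lambda}(x) := \chi (x/\lambda_{l})\), a change of variables gives

```latex
\|\chi_{\lambda}\|_{L^{1}([0,\lambda_{l}]^{2})}
   = \lambda_{l}^{2}\, \|\chi\|_{L^{1}([0,1]^{2})},
\qquad
|D\chi_{\lambda}| \bigl([0,\lambda_{l}]^{2}\bigr)
   = \lambda_{l}\, |D\chi| \bigl([0,1]^{2}\bigr),
```

so the interpolation product picks up the factor \(\lambda_{l}^{\theta }\cdot \lambda_{l}^{2(1-\theta )} = \lambda_{l}^{2-\theta }\). (The \(L^{1}\) contribution to the \(BV\) norm scales like \(\lambda_{l}^{2}\leq \lambda_{l}\) and is thus dominated by the perimeter term.)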

Hence, we estimate

$$\begin{aligned} E_{k} &:=\bigl\| \tilde{\chi }_{k+1}^{(i)}-\tilde{\chi }_{k}^{(i)}\bigr\| _{BV( \varOmega _{k+1})}^{\theta }\bigl\| \tilde{\chi }_{k+1}^{(i)}-\tilde{\chi }_{k} ^{(i)} \bigr\| _{L^{1}(\varOmega _{k+1})}^{1-\theta } \\ & \leq \sum_{l=1}^{k} E_{k,l} \#\bigl\{ Q_{l}^{k} \subset \hat{\varOmega }_{l} \cap \bar{\varOmega }_{k}: Q_{l}^{k} \mbox{ is a grid cube of size } \lambda _{l}\bigr\} \\ & \stackrel{\text{Claim 72}}{\leq } C C_{f} \sum _{l=1}^{k} E_{k,l} \lambda _{l}^{-1}= C C_{f} \sum _{l=1}^{k} \mu (s,p)^{k} \lambda _{l}^{2-\theta } \lambda _{l}^{-1} \\ &= C C_{f} \mu (s,p)^{k}\sum _{l=1}^{k} 2^{-l(1-\theta )} \\ & \leq C(\theta ) C_{f} \mu (s,p)^{k} \rightarrow 0 \quad\mbox{as } k \rightarrow \infty . \end{aligned}$$

Thus, for \(s\), \(p\) as above, the sequences \(\tilde{\chi }_{k}^{(i)}\) are still Cauchy in \(W^{s,p}\). This concludes the proof.  □
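The two mechanisms in the last display can be checked numerically: the geometric series in \(l\) is bounded uniformly in \(k\), so the decay \(\mu (s,p)^{k}\) survives the summation. The values of `mu` and `theta` below are hypothetical placeholders with \(\mu <1\), \(\theta <1\):

```python
# Sanity check: E_k <= C * mu^k * sum_{l=1}^{k} 2^{-l(1-theta)},
# and the l-sum is bounded uniformly in k.
mu, theta = 0.5, 0.5  # hypothetical placeholders

def inner_sum(k):
    return sum(2.0 ** (-l * (1 - theta)) for l in range(1, k + 1))

r = 2.0 ** (-(1 - theta))
geom_limit = r / (1 - r)  # full geometric series sum_{l >= 1} r^l
for k in (1, 10, 100):
    assert inner_sum(k) <= geom_limit + 1e-12

def E(k):
    return mu ** k * inner_sum(k)

assert E(100) < E(10) < E(1)  # geometric decay in k survives the l-sum
```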

Remark 73

The constant \(C_{f}\) from Claim 72 can be controlled by \(C [f]_{C^{0,1}([0,1])}\), for some universal constant \(C>1\).