On the Energy Scaling Behaviour of a Singularly Perturbed Tartar Square

In this article we derive an (almost) optimal scaling law for a singular perturbation problem associated with the Tartar square. As in Winter (Eur J Appl Math 8(2):185–207, 1997), Chipot (Numer Math 83(3):325–352, 1999), our upper bound quantifies the well-known construction which is used in the literature to prove the flexibility of the Tartar square in the sense of the flexibility of approximate solutions to the differential inclusion. The main novelty of our article is the derivation of an (up to logarithmic powers matching) ansatz free lower bound which relies on a bootstrap argument in Fourier space and is related to a quantification of the interaction of a nonlinearity and a negative Sobolev space in the form of “a chain rule in a negative Sobolev space”. Both the lower and the upper bound arguments give evidence of the involved “infinite order of lamination”.


Introduction
In this article we study a singularly perturbed variational problem for a differential inclusion associated with the Tartar square. The Tartar square, $T_4$, and more generally its siblings the $T_N$-structures, are well-known sets in matrix space with important ramifications in the calculus of variations and the theoretical study of differential inclusions [12,28,32,48–50,62], the theory of partial differential equations, in particular as building blocks for convex integration schemes, ranging from elliptic and parabolic equations [46,47,63] to equations of fluid dynamics [21,24,61], and with various consequences for applications, for instance for the analysis of certain phase transformations [10,60] and related differential inclusions [43,52]. For further applications and implications we refer to the lecture notes and survey articles [32,33,45,53].

The Tartar square and the "stress-free" setting
Let us recall the "stress-free" set-up and some properties of our problem. The Tartar square, which was introduced in several places in the literature [1,7,51,59,64] (see also the survey articles from above), is the following set $K \subset \mathbb{R}^{2\times 2}$: It displays a striking dichotomy between rigidity and flexibility for the associated differential inclusion. On the one hand, it is easily shown (for convenience, a proof is recalled in Section 3) that any solution $u \in W^{1,\infty}(\Omega)$ to the differential inclusion
$$\nabla u \in K \text{ a.e. in } \Omega \tag{2}$$
is rigid: it is an affine function whose gradient is equal to one of the four matrices $A_1, \dots, A_4$. On the other hand, the Tartar square is flexible on the level of approximate solutions: Indeed, it is possible to find sequences $(u_j)_{j\in\mathbb{N}} \subset W^{1,\infty}(\Omega)$ such that $\operatorname{dist}(\nabla u_j, K) \to 0$ in measure and such that no subsequence of $(\nabla u_j)_{j\in\mathbb{N}}$ converges in measure (to a constant gradient in $K$; see Section 2 as well as [45, Chapter 2.5] and [11,65] for qualitative and quantitative versions of this construction). Moreover, it is known that arbitrarily small perturbations of the Tartar square enjoy even stronger flexibility in the sense that if $K_\delta \subset \mathbb{R}^{2\times 2}$ is an arbitrarily small, open neighbourhood of $K$ in $\mathbb{R}^{2\times 2}$, then there are infinitely many, non-affine solutions to the differential inclusion $\nabla u \in K_\delta$ which can be obtained by the method of convex integration [48]. These rigidity and flexibility aspects are mirrored in the algebraic properties of the set $K$: On the one hand, the set $K$ does not have any rank-one connections, i.e. for any $i, j \in \{1, \dots, 4\}$ with $i \neq j$ it holds that $\operatorname{rk}(A_i - A_j) = 2 > 1$. This excludes "trivial solutions" to (2) such as simple laminates. It furthermore directly implies that the lamination convex hull $K^{lc}$ of $K$ is trivial. On the other hand, however, the rank-one convex hull is non-trivial: where $\operatorname{conv}(\cdot)$ denotes the convex hull and The set $K^{rc} = K^{qc}$ is obtained by laminates of infinite order.
Thus, the outlined properties make the Tartar square a prototypical model problem for studying more detailed properties of the dichotomy between rigidity and flexibility.

The singularly perturbed problem and a scaling law
Motivated by the long-term goal of understanding the described dichotomy and related dichotomies in the study of shape-memory alloys more precisely and quantitatively [16,22,23,25–27,31,48,54,57,58], and inspired by the observation in [56] that the scaling behaviour of associated singularly perturbed problems gives certain upper bounds on the possible regularity of wild convex integration solutions, we here study the minimal energy scaling of a singularly perturbed Tartar square. Let us emphasize that in this context upper bound constructions are well-known and had earlier been quantified in [65] and also in [11]. We repeat these estimates in Section 2 for completeness. The main novelty of our work consists in proving (essentially) matching lower scaling bounds.
Let us outline the setting of this. We begin by noting that the differential inclusion (2) for the Tartar square can be rewritten in terms of characteristic functions indicating the "phase" the gradient is in where $\chi_j \in \{0, 1\}$ for $j = 1, \dots, 4$ and $\chi_1 + \chi_2 + \chi_3 + \chi_4 = 1$.
Using this formulation and motivated by Hooke's law, we consider the following elastic energy: Here $u : [0, 1]^2 \to \mathbb{R}^2$ is the "deformation" and the functions $\chi_j$ are subject to the constraints from (4). Moreover, on the left hand side of (5) (and also in the following arguments) we have used the abbreviation The elastic energy thus measures the deviation of a given deformation from being a solution to the differential inclusion (2). We emphasize that, due to the flexibility of approximate solutions, the vanishing of the elastic energy along some sequence $(u_j)_{j\in\mathbb{N}}$ does not entail that along a subsequence the gradients $(\nabla u_j)_{j\in\mathbb{N}}$ converge in measure to a constant map (in $K$).
Heading towards a scaling result for the Tartar square, for $F \in K^{qc}$ arbitrary but fixed, we set Associated with this definition we consider two natural choices for the possible classes of deformations among which we minimize: Fixing the mean value of $\nabla u$ we consider In this case we always (for instance, for the elastic energy (5)) identify $[0, 1]^2$ with the torus $\mathbb{T}^2$ of side length one and in addition also assume that the phase indicators $\chi_j$ are one-periodic functions. As an alternative, we fix affine boundary conditions for $u$ and study for $F \in K^{qc}$ and $b \in \mathbb{R}^2$ In order to regain some rigidity, we add a singular perturbation. More precisely, modelling the "surface energy" by for every sufficiently small parameter $\epsilon > 0$, we consider the total energy The surface energy, being a higher order term, regularizes the problem by penalizing fine oscillations of the phase indicators and hence provides some compactness in the problem (for fixed $\epsilon > 0$). We emphasize that there are various possible choices in the surface energy penalization ranging from diffuse interface models as in [18,19,39], to sharp interface models as in [9] directly penalizing the oscillation of $u$, or, closer to our choice, penalizations based on the phase indicator functions $\chi$ as in [5,6,34,35,55] and in [4, Chapter 12]. While in principle also other types of surface energies, e.g. on other Besov scales, could be considered, if correspondingly rescaled, they are expected to yield analogous results (see for instance [56] where explicitly different Sobolev scales are compared) but possibly require different tools (such as Littlewood-Paley arguments instead of direct Fourier tools).
On the one hand, driven by the modelling point of view that the given choice of surface energy can be viewed as a quantitative form of a "physical surface energy penalization" and, on the other hand, in order to prevent additional layers of difficulties and to highlight our main ideas in a model setting, we here focus on the $L^2$-based, phase indicator model described above.
Seeking to study quantitatively "how rigid" or "how flexible" the Tartar square is, we are interested in deriving a scaling law for the minimal (total) energy as $\epsilon \to 0$. As our main result, we obtain the following (up to exponents of logarithms) matching upper and lower scaling bounds:
Theorem 1. Let $\chi_j$ for $j = 1, \dots, 4$ be as in (4) and let $E_\epsilon$ be defined as in (11). For every $\epsilon > 0$ and $\nu \in (0, 1)$ let where $c = c_\nu > 0$ depends on $\nu > 0$ and $C > 0$ is a universal constant. Let $X$ be either given by Assume that $F \in K^{qc} \setminus K$ and define Then, there exists an $\epsilon_0 \in (0, e^{-1})$ such that for every $\nu \in (0, 1)$ and for any $\epsilon \in (0, \epsilon_0)$ it holds that Here the constants $0 < C_1 \le C_2$ are independent of $\epsilon$ but may depend on the choice of $F$ and $0 < \epsilon_0 < e^{-1}$.
Let us comment on this result. In contrast to other phase transition problems, our scaling law is not polynomial in the small parameter $\epsilon > 0$ but of an order which vanishes more slowly as $\epsilon \to 0$ than any polynomial in $\epsilon > 0$. This is due to the fact that we are dealing with infinite order laminates. While any finite order laminate is expected to have a scaling that is polynomial in $\epsilon$, our problem becomes degenerate in that an infinite order laminate requires strictly more oscillation than any finite order laminate. In this sense, Theorem 1 captures and quantifies the infinite order of lamination in our problem and thus distinguishes it from many other scaling laws in the literature on phase transformations.
The infinite order of lamination is also directly reflected in our proof of Theorem 1. On the one hand, it directly enters in the (well-known, here quantified) upper bound construction; on the other hand, it also enters in a more subtle way in the main novelty of our article: the lower bound, for which we use a bootstrap iteration argument. Although the lower bound necessitates an ansatz-free argument, it is still strongly reminiscent of the upper bound construction and also the staircase laminate argument from [17]. Seeking to mimic the rigidity argument for the "stress-free" differential inclusion (2) in which one uses that, by the absence of rank-one connections in the set $K$, the $\partial_1 u_1$ and the $\partial_2 u_2$ components of any solution to (2) "determine" each other, we are led to an interesting Fourier space "chain rule problem" in negative-order Sobolev spaces. More precisely, due to the diagonal structure of the matrices in $K$, the elastic energy yields quantitative control on the partial Riesz transforms $R_2 \chi_{1,1}$ and $R_1 \chi_{2,2}$, where the Riesz transform is defined as $R_j := \partial_j (-\Delta)^{-1/2}$, $j \in \{1, 2\}$, with $(-\Delta)$ denoting the Laplacian on periodic functions. Since only partial Riesz transforms of the respective functions are controlled, a priori the control of $R_2 \chi_{1,1}$ and $R_1 \chi_{2,2}$ does not suffice to conclude direct $L^2$ bounds on the functions $\chi_{1,1}$, $\chi_{2,2}$ but only allows one to localize them in conical domains around the coordinate axes in Fourier space. As in the exact differential inclusion, the two functions are however not independent: due to the absence of rank-one connections in $K$ they are functions of each other, i.e. there is a polynomial $g$ such that $\chi_{1,1} = g(\chi_{2,2})$ and vice versa. Hence, one may hope to deduce information on the Riesz transform $R_1 \chi_{1,1}$ by the information on $R_1 \chi_{2,2} = R_1 g(\chi_{1,1})$ and vice versa. We view this as a (quantitative) "type of chain rule in a negative Sobolev space".
The quantification of this "chain rule type argument in a negative Sobolev space" (see Lemma 4) thus provides the central step in our argument. Careful quantitative bootstrap type estimates of this then allow us to close the estimates for the lower bound.
All in all, our result (and model) thus serves as an extreme case compared to other scaling laws for differential inclusions in phase transformations in that it is an "extremely expensive" construction in which the energy scaling law is no longer algebraic.

Relation to the literature
Our result should be viewed in the context of scaling laws in the calculus of variations in general and more specifically in the modelling of shape-memory alloys and related phase transformation problems (see [37] and [45] for surveys on this). In the context of the modelling of shape-memory alloys, scaling laws, providing some insights on the possible behaviour of energy minimizers, have been deduced in various settings [3,5,6,8,9,14,15,20,34–36,38–42,55]. For certain models, in subsequent steps, even finer properties (such as for instance almost periodicity results) have been derived [13]. While these methods have provided important insight into many physically relevant problems, none of the known scaling bounds deal with problems in which a dichotomy between rigidity and flexibility is known. Our problem thus addresses a weak form of this dichotomy for the first time in a model case. Moreover, we emphasize that while our result does not directly model a martensitic phase transformation, it is strongly motivated by the commonly used differential inclusions and the arising microstructures as, for instance, used in describing these problems in the stress-free setting [2,4]. We emphasize that, for instance, in the (geometrically linearized) cubic-to-monoclinic phase transformation, it was shown that closely related $T_3$-structures appear [10,60], for whose more quantitative analysis our investigation seems to be a natural preliminary step.

Outline of the article
The remainder of the article is structured as follows: in Section 2 we first provide a quantitative version of the (well-known) upper bound construction for the flexibility of the Tartar square. In Section 4, after briefly recalling the (well-known) rigidity argument for the Tartar square and auxiliary properties of the associated elastic energy in Section 3, as the main novelty of our article, we complement the scaling of the upper bound construction with a (nearly) matching lower bound.

Notation
Throughout the article, when writing $a \sim b$ we mean that $c^{-1} b \le a \le c b$ where $c$ is a fixed constant which is independent of $\epsilon > 0$. Analogously, $a \lesssim b$ and $a \gtrsim b$ stand for $a \le c b$ and $a \ge c b$.
Given $d, m, n \in \mathbb{N}$ and a summable function $f : \Omega \subset \mathbb{R}^d \to \mathbb{R}^{m\times n}$, we denote the average of $f$ on $\Omega$ by For every $\mathbb{T}^2$-periodic function $f$ : its Fourier transform. We will denote with $x$ the space variable and with $k$ the frequency variable. If there is no ambiguity, we will also use the notation $\hat{f} = \mathcal{F}(f)$.

An Upper Bound Construction
In this section we quantify the total energy of the well-known construction of infinite orders of laminations which is used in the literature to prove flexibility of the Tartar square (see, for instance, [45, Section 2.5]). We stress that this quantitative construction had first been quantified in the literature in [11,65] (for closely related continuum and finite element models) and that we recall it for completeness here. This construction is an example of a sequence with vanishing elastic energy which is not strongly compact. Balancing the elastic and the surface energy terms through a parameter optimization, as in [11,65], we obtain an upper bound (in terms of scaling) of our perturbed problem both for the affine and the periodic settings.

Quantification of the total energy of the infinite-order laminate
In what follows we take into account zero boundary conditions, that is we will define $u_\epsilon \in A^{aff}_0$ and $\chi_\epsilon \in X^{aff}$ for every $\epsilon > 0$ and quantify The argument is completely analogous for any other affine boundary datum $F$. Further, the construction directly provides the upper bound estimate of Theorem 1 for both $E^{aff}_\epsilon$ and $E^{per}_\epsilon$ by taking the $\mathbb{T}^2$-periodic extension of $u_\epsilon$ and $\chi_\epsilon$.
Since K does not have rank-one connections we make use of some auxiliary matrices (in particular the matrices from (3)) to build laminates of higher and higher order, reducing the volume fraction of the region in which the gradients differ from elements of K but increasing the surface energy.
First-order laminate. Let $0 < r_1 < \frac{1}{2}$ be an arbitrarily small parameter to be determined such that $\frac{1}{r_1}$ is an integer. We resolve the boundary datum as a laminate (and a cut-off layer) with gradients Any other rank-1-convex combination of elements of $K^{qc}$ would lead to an analogous construction. Thanks to the rank-1-connection between $B_2$ and $B_1$, we define the continuous function $v^{(1)}$ and consider, without relabeling, its $r_1$-periodic (in the $x_1$ variable) extension on $[0, 1]^2$. We then use a cut-off argument to attain zero boundary conditions on the whole where $\varphi(t) = \max(0, \min(2t, 1))$. We also set $\chi^{(1)}$ It is convenient to view the elastic energy as the sum of two different terms. One corresponds to the volume-fraction of the auxiliary states $B_1$ and $B_2$ and it is proportional to the area of $\{x \in [0, 1]^2 : \nabla u^{(1)}$ The other contribution is given by the cut-off and it is proportional to the area of $[0, 1] \times [0, \frac{r_1}{2}]$. Hence, The surface energy is proportional to the sum of the perimeters of

Fig. 1. On the left the first-order laminate construction $u^{(1)}$. The shaded regions represent the cut-off areas. On the right the projection of $\nabla u^{(1)}$ onto $K$.

Second-order laminate. Let $0 < r_2 < \frac{r_1}{2}$ be an arbitrary parameter such that $\frac{r_1}{r_2}$ is an integer. From the fact that in each rectangle in which $\nabla u^{(1)} = B_1, B_2$, we replace $u^{(1)}$ with a simple laminate (up to cut-off) having gradients $A_1, P_1$ and $A_3, P_3$ respectively, attaining boundary conditions $u^{(1)}$. Namely, we take $v^{(2)}$ to be continuous and such that We consider its $r_2$-periodic (in the and repeating this analogous construction in all the other parts of the rectangle in which $\nabla u^{(1)} = B_1, B_2$, using the rank-1-connection between $A_3$ and $P_3$ where The elastic energy is given by the sum of two terms; one is proportional to the area of $\{x \in [0, 1]^2 : \nabla u^{(2)}(x) = P_1, P_3\}$ and the other is given by the cut-off. The energy of the cut-off of the current step gives a contribution of order $r_2$ for every rectangle in which $\nabla u^{(1)} = B_1, B_2$, of which there are $\frac{2}{r_1}$ many; namely, $E_{el}(u^{(2)}, \chi^{(2)}$ The surface energy is controlled by the perimeters of the rectangles in which $\nabla u^{(2)}$ is constant; indeed the rank-1-convexity of the cut-off process yields that, connecting $\nabla u^{(2)}$ to the boundary data (i.e., $B_1$ and $B_2$), the projection of $\nabla u^{(2)}$ changes at most once. There are $\frac{4}{r_1 r_2}$ such rectangles each of perimeter smaller than $2 r_1$. Hence,

Fig. 2. On the left the second-order laminate construction $u^{(2)}$. The shaded regions represent the cut-off areas. On the right the projection of $\nabla u^{(2)}$ onto $K$.

through an iterative procedure starting from $u^{(2)}$. Thanks to the relation we replace $u^{(m-1)}$, in the rectangles in which $\nabla u^{(m-1)} = P_j$, with an $r_m$-periodic laminate of gradients $A_j$ and $P_j$ obtaining $u^{(m)} \in W^{1,\infty}_0([0, 1]^2; \mathbb{R}^2)$ after a cut-off argument to attain $u^{(m-1)}$ at the boundary of each rectangle. Here $r_m$ is an arbitrarily small parameter with $0 < r_m < \frac{r_{m-1}}{2}$ and $\frac{r_{m-1}}{r_m}$ integer. We then set For every $m \ge 3$ we have that The volume fraction of the cut-off regions of the $m$-th step is $2 \frac{r_m}{r_{m-1}}$. Thus its contribution in the elastic energy is $2 r_m$ The surface energy of the $m$-th step is proportional to the sum of the perimeters of the rectangles in which $\nabla u^{(m)} = P_1, \dots, P_4$. Denoting with $R_m \in \mathbb{N}$ the number of such rectangles, we have Since the perimeter of each rectangle is $r_{m-1}$ we get The total energy of the construction above is therefore and it depends on $\{r_j\}_{j=1}^m$.
In order to obtain a good upper bound, we determine the optimal choice of such parameters in terms of r 1 .
Comparing the terms $r_1$ and $\frac{r_2}{r_1}$ we get $r_2 \le r_1^2$. Since the energy depends on $r_2$ in only another term, that is $\frac{r_3}{r_2}$, the choice $r_2 \sim r_1^2$ is optimal. Working inductively, we get $r_j \sim r_1^j$. Thus, we denote with $u_{m,r}$ and $\chi_{m,r}$ the functions $u^{(m)}$ and $\chi^{(m)}$ defined as above, corresponding to $r_j = r^j$, where $r > 0$ is a small parameter. Hence, we have

Determination of the length scale
From the analysis performed above we obtain the following result, which provides an upper (scaling) bound for Theorem 1.

Proposition 1. Let $E_\epsilon$ and $r(\epsilon)$ be defined as in (13) and (12) respectively. For every $\epsilon > 0$ small enough and every $F \in K^{qc} \setminus K$, there exist $u_\epsilon \in A_F$ and $\chi_\epsilon \in X$ such that
where $C > 0$ is a constant depending on $F$.
Proof. As already noticed, it is sufficient to consider affine boundary conditions. The result comes from a parameter optimization in terms of $\epsilon$ for the constructions $u_{m,r}$ and $\chi_{m,r}$ defined in Section 2.1. Determine first the optimal length scale $r_\epsilon$ for the $m$-th iteration by comparing the terms $r$ and $\epsilon r^{-m}$ in (15), obtaining $r_\epsilon \sim \epsilon^{\frac{1}{m+1}}$. We now look for the optimal order of iterations $m_\epsilon$.
for some $c > 0$, which yields the result for $F = 0$ by (15) by taking $u_\epsilon = u_{m_\epsilon, r_\epsilon}$ and $\chi_\epsilon = \chi_{m_\epsilon, r_\epsilon}$.
The construction corresponding to a non-zero boundary datum differs from $u^{(m)}$ and $\chi^{(m)}$ only in the first step, i.e. $m = 1$, being then completely analogous. Thus it does not affect the scaling of $E_\epsilon$, i.e. the constant $c$ can be chosen independently of $F$. Hence the result is proved.
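The parameter optimization in the proof can be explored numerically. The following is only a sketch under stated assumptions: as a stand-in for the bound (15) it uses the model energy $2^{-m} + r + \epsilon r^{-m}$, where the term $2^{-m}$ (for the remaining volume fraction of auxiliary states) and all constants are assumptions, and only the comparison of $r$ with $\epsilon r^{-m}$ is taken from the text. The function names are hypothetical.

```python
import math

def model_energy(log_eps: float, m: int) -> float:
    """Assumed model bound 2^{-m} + r + eps * r^{-m}, evaluated at r = eps^{1/(m+1)}.

    Working with log(eps) avoids underflow for very small eps.
    """
    r = math.exp(log_eps / (m + 1))      # r = eps^{1/(m+1)}, hence eps * r^{-m} = r
    return 2.0 ** (-m) + 2.0 * r

def best_order(log_eps: float, m_max: int = 200):
    """Return (m, energy) minimizing the model energy over 1 <= m <= m_max."""
    candidates = ((m, model_energy(log_eps, m)) for m in range(1, m_max + 1))
    return min(candidates, key=lambda pair: pair[1])

if __name__ == "__main__":
    for exponent in (10, 50, 200, 800):
        log_eps = -exponent * math.log(10.0)          # eps = 10^{-exponent}
        m, energy = best_order(log_eps)
        # For a scaling of the form exp(-c |log eps|^{1/2}) this ratio stays bounded.
        ratio = -math.log(energy) / math.sqrt(-log_eps)
        print(exponent, m, round(ratio, 3))
```

In this model the optimal order of lamination grows like $|\log \epsilon|^{1/2}$, which is consistent with an energy scaling that is neither polynomial nor logarithmic in $\epsilon$.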

Remark 1.
We note that $r(\epsilon)$ is smaller than any logarithmic scale and greater than any power of $\epsilon$. Indeed, given $0 < \alpha \le 1$ we get
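One way to see the two comparisons is the following short computation. It is a sketch under the assumption that the scale from (12) has the form $r(\epsilon) = \exp(-C|\log \epsilon|^{1/2})$ with a constant $C > 0$; the abbreviation $t := |\log \epsilon|$ is introduced here only for illustration.

```latex
\begin{align*}
\frac{r(\epsilon)}{\epsilon^{\alpha}}
  &= \exp\bigl(\alpha t - C t^{1/2}\bigr) \longrightarrow \infty
  && \text{as } t = |\log \epsilon| \to \infty, \\
r(\epsilon)\, |\log \epsilon|^{\alpha}
  &= \exp\bigl(-C t^{1/2} + \alpha \log t\bigr) \longrightarrow 0
  && \text{as } t \to \infty,
\end{align*}
% since t dominates t^{1/2}, and t^{1/2} dominates log t, for large t.
```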

Remark 2.
We do not claim that our constant c > 0 is optimal. It is expected that this depends on the finer properties of the upper bound construction, e.g. on using branched constructions instead of direct laminations. Since the value of the constant c > 0 is not the main emphasis of our scaling result, we do not pursue this further in this article.

A Qualitative Rigidity Argument and Some Auxiliary Results for the Elastic Energy
In this section, we recall an argument for the exactly stress-free rigidity of the Tartar square which will serve as our guideline for the lower bound estimate. Additionally, we will recall the expression of the elastic energy in Fourier space for different affine boundary conditions which will become a central ingredient in our quantitative lower bound arguments.

A qualitative rigidity argument
We recall a qualitative rigidity argument which we will mimic in our lower bound estimate.
Note that this is a particular case of the general fact that any Lipschitz solution of $\nabla u \in K$ with $K \subset \mathbb{R}^{n\times m}$ of cardinality 4 whose elements are not rank-1-connected is trivial (see [12, Theorem 7]). We also refer to the discussion in [64] (and in particular the section "on the consequences of separate convexity") for conditions on $T_4$ structures in diagonal matrices.

Remark 3.
We remark that for the exact differential inclusion from Proposition 2, the differential inclusion directly implies that any solution $u \in W^{1,1}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ automatically also satisfies $u \in W^{1,\infty}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ since $K \subset \mathbb{R}^{2\times 2}$ is compact. This explains the restriction to $u \in W^{1,\infty}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ in Proposition 2 compared to the natural choice of $u \in W^{1,1}_{loc}(\mathbb{R}^2, \mathbb{R}^2)$ for the minimization problem from Theorem 1 as a large set in which the energies are defined.

Elastic energy in Fourier space
We give the expression of the elastic energy $E^{per}_{el}$ defined in (7) in Fourier space with periodic boundary conditions, following a standard approach in the literature [5,35]. It will be useful in the sequel to rewrite $E_{el}$ in terms of the diagonal entries of $\chi$, that is,

Lemma 1. Let $E^{per}_{el}$ be defined in (7) and $\{\chi_j\}$ be as in (4) and $\mathbb{T}^2$-periodic. Then for every $F \in \mathbb{R}^{2\times 2}_{sym}$ and $\chi \in X^{per}$ it holds that
Proof. We first notice that We can thus rewrite $E^{per}_{el}(\chi, F)$ in Fourier space as follows By minimizing (17) in $\hat{v}$, we obtain for every test function $w \in L^2(\mathbb{T}^2; \mathbb{R}^2)$, which implies that This is solved by $\hat{v}$ Substituting these values into (17) we get the result.
In view of the lower-bound estimate which is formulated in Theorem 2 (see the proof in Section 4.5), it is worth noting that from the characterization given in Lemma 1 we have

Remark 4.
We highlight that, phrased in different words, the elastic energy controls the partial Riesz transforms . Related directional derivative control can also be found in [5,6,35,41]. As stressed in [5, Section 3.2], such an only partial control however is related to hyperbolic type equations and cannot be transferred to full $L^2$ control in general. In order to deduce the desired scaling law, these bounds thus need to be complemented with the structural conditions on the wells. Here the crucial argument consists in the respective "determinedness" of the two functions $\chi_{1,1}$ and $\chi_{2,2}$ which is captured quantitatively in Proposition 3 (and originates from the absence of rank-one connections in $K$).
We further remark that also in more qualitative arguments related to compensated compactness and Morrey's conjecture, strong, optimal bounds on Riesz transforms have played a major role; see [44] and also [30].
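The failure of partial Riesz transform control to give full $L^2$ control can be illustrated on a simple laminate. The sketch below is not taken from the article: it assumes the convention that $R_j$ acts as the Fourier multiplier $-i k_j/|k|$ (set to zero at $k = 0$) on a uniform grid of the torus, and the names `riesz_multiplier`, `l2_norm` and `laminate` are hypothetical.

```python
import numpy as np

def riesz_multiplier(f: np.ndarray, j: int) -> np.ndarray:
    """Apply the multiplier -i k_j / |k| (assumed convention, 0 at k = 0) to a grid function."""
    n = f.shape[0]
    freqs = np.fft.fftfreq(n, d=1.0 / n)               # integer frequencies
    k1, k2 = np.meshgrid(freqs, freqs, indexing="ij")  # axis 0 <-> k1, axis 1 <-> k2
    norm = np.hypot(k1, k2)
    norm[0, 0] = 1.0                                    # avoid division by zero at k = 0
    symbol = -1j * (k1, k2)[j - 1] / norm
    return np.fft.ifft2(symbol * np.fft.fft2(f))

def l2_norm(f: np.ndarray) -> float:
    """Normalized L^2 norm on the torus of side length one."""
    return float(np.sqrt(np.mean(np.abs(f) ** 2)))

n = 64
x1 = (np.arange(n) + 0.5) / n
profile = np.where((4 * x1) % 1.0 < 0.5, 1.0, -1.0)    # mean-zero square wave, 4 periods
laminate = np.repeat(profile[:, None], n, axis=1)       # depends on x1 only

if __name__ == "__main__":
    print("||f||     =", l2_norm(laminate))
    print("||R_2 f|| =", l2_norm(riesz_multiplier(laminate, 2)))   # numerically zero
    print("||R_1 f|| =", l2_norm(riesz_multiplier(laminate, 1)))   # same size as ||f||
```

A function oscillating only in $x_1$ has its Fourier mass on the $k_1$-axis, so $R_2$ does not see it at all, while $R_1$ preserves its $L^2$ norm. Smallness of one partial Riesz transform therefore only localizes the Fourier mass near an axis.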

A Bootstrap Argument and a Proof of the Lower Bound
In this section, we turn to the proof of the lower bound of Theorem 1. To this end, we will make use of a bootstrap argument which again highlights the infinite order of lamination in our solutions.
In the following Sections 4.1-4.4 and 4.5, we first consider the case of periodic data and prove a lower bound for $E^{per}_\epsilon$. Noting that $E^{per}_\epsilon \lesssim E^{aff}_\epsilon$ then also leads to the desired lower bound result of Theorem 1 in the case of affine boundary conditions (see Section 4.5 for the details).

A chain rule argument in H −1
In this section, we prove the following main result which will be used to prove the lower bound of Theorem 1 in Section 4.5: then there exists $0 < \epsilon_0 < e^{-1}$ depending on $d$ such that for any $\epsilon \in (0, \epsilon_0)$ and $\nu > 0$ there exists $C_\nu > 0$ with Throughout this section, all the norms are restricted to $\mathbb{T}^2$. Also, for the sake of simplicity, we assume $g$ and $h$ to be polynomials of the same degree $d \in \mathbb{N}$, $d \ge 2$. Without loss of generality, we may further assume that $f_1 \neq 0 \neq f_2$ since else the statement follows directly.
We will use that the first inequality in (18) can be phrased in the following two equivalent formulations This will yield intermediate bounds on the sets where the $L^2$-mass of $\hat{f}_1$ and $\hat{f}_2$ concentrate which will lead to (19) thanks to a bootstrap argument.

Remark 5.
In our application to the singularly perturbed Tartar square, we will apply Proposition 3 with $f_2 = \chi_{1,1}$ and $f_1 = \chi_{2,2}$ (see Section 4.5). In this case, due to the lack of rank-one directions in $K$, it is possible to find corresponding polynomials $g$ and $h$ for which the relations $f_2 = g(f_1)$ and $f_1 = h(f_2)$ hold. These can be chosen via interpolation, e.g.
Such choices of $g$ and $h$ work both in the Dirichlet and in the periodic settings since from the symmetry of Tartar's square $h(0) = g(0) = 0$. We emphasize that in the setting of the Tartar square the choice of the functions $g$, $h$ is extremely non-unique. We refer to Section 4.5 for the details of the application of Proposition 3 to the Tartar square.
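The interpolation can be made concrete as follows. This is a sketch under an explicit assumption: it uses the common normalization of the Tartar square with diagonal matrices $\operatorname{diag}(-1,-3)$, $\operatorname{diag}(-3,1)$, $\operatorname{diag}(1,3)$, $\operatorname{diag}(3,-1)$, which need not coincide with the labelling in (1). The helper names `lagrange` and `no_rank_one_connections` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Assumed normalization: diagonal entries (a_11, a_22) of the four wells.
WELLS = [(-1, -3), (-3, 1), (1, 3), (3, -1)]

def lagrange(nodes, values):
    """Return the Lagrange interpolation polynomial through (nodes[i], values[i])."""
    def p(t):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(nodes, values)):
            term = Fraction(yi)
            for j, xj in enumerate(nodes):
                if j != i:
                    term *= Fraction(t - xj, xi - xj)
            total += term
        return total
    return p

def no_rank_one_connections(wells):
    """For diagonal matrices, rank(A_i - A_j) = 2 iff both diagonal entries differ."""
    return all(p[0] != q[0] and p[1] != q[1] for p, q in combinations(wells, 2))

a11 = [w[0] for w in WELLS]
a22 = [w[1] for w in WELLS]
g = lagrange(a22, a11)   # chi_{1,1} = g(chi_{2,2}) on the wells
h = lagrange(a11, a22)   # chi_{2,2} = h(chi_{1,1}) on the wells

if __name__ == "__main__":
    print(no_rank_one_connections(WELLS))
    print([g(t) for t in a22], g(0), h(0))
```

The interpolation nodes are pairwise distinct exactly because the wells have no rank-one connections, so both cubic polynomials exist. In this normalization they are odd, which gives $g(0) = h(0) = 0$.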
Since the statements in this subsection are always symmetric between $f_1$ and $f_2$ (with only $g$ replaced by $h$), in this subsection we only state the results for $f_1$ but emphasize that they are also valid for $f_2$. The symmetry is first broken in Section 4.3 in which we will then state and use both the bounds for $f_1$ and $f_2$.

Lemma 2.
Let $f_1$, $f_2$ and $g$ be as in the statement of Proposition 3. Then, for every $\mu, \mu_2 > 0$ there hold where $C > 0$ is a constant depending on $\|f_1\|_{L^\infty}$ and $g$. A symmetric statement holds for $f_2$ and $h$.
Proof. We divide the proof into three steps, for which the first two steps are general results on arbitrary functions and where only in the last step these results are applied to the specific function f 1 .
Step 1. Following the argument in [35, proof of Lemma 4.3], we first show that for a general function f : T 2 → R with f L ∞ < ∞ and anyμ > 0 it holds that Indeed, arguing as in [35,proof of Lemma 4.3], for every c ∈ R 2 we have that for every L > 0. Integrating over ∂ B L with |c| = L, we deduce that Choosing L =μ −1 , we infer the claim.
Step 2. Next we show that for a general function f : T 2 → R with the notation from above, for μ, μ 2 > 0 as above and for j ∈ {1, 2} Indeed, without loss of generality, we may consider j = 1. Passing to the frequency space, we obtain Step 3. Finally, we conclude the desired result by applying the bounds from Steps 1 and 2 to f = f 1 andμ = μ 2 and recalling the bounds from (20).
Next, we seek to improve the control on the Fourier supports of f 1 and f 2 iteratively. To this end, as a crucial observation, we use that the Fourier support of f 2 is essentially obtained through a nonlinear function interacting with f 1 . If f 1 were such that χ 1,μ,μ 2 (D) f 1 ∈ L ∞ (T 2 ), and with M := max{ f 1 L ∞ , χ 1,μ,μ 2 (D) f 1 L ∞ }, this would be a consequence of the local Lipschitz continuity of g: Indeed, by (20), the fact that f 2 = g( f 1 ) and the triangle inequality, we would obtain The same estimate would follow if g was globally a Lipschitz function, without requiring any further assumptions on f 1 . In our application we work with nonlinear functions g which are only locally Lipschitz (cubic polynomials) and we do not a priori know thatf 1 ∈ L ∞ (T 2 ). Hence, even though f 1 ∈ L ∞ , in our setting, we cannot directly proceed as in (21), since Fourier multipliers are in general not bounded as maps from L ∞ to L ∞ . Yet we can still control the left-hand-side of (21) in a similar way, obtaining a (small) loss (see Corollary 1).
More precisely, in order to remedy the lack of L ∞ bounds for χ 1,μ,μ 2 (D) f 1 and hence the lack of direct Lipschitz continuity arguments, we make use of Calderón-Zygmund estimates in L p spaces with p ∈ (1, ∞) and interpolation. While this gives rise to a small loss, it will provide our replacement of (21) in Corollary 1.
with C > 0 being a constant depending on f 1 L ∞ , g and d. A symmetric statement holds for f 2 and h.
Proof. It is sufficient to prove the statement for g(t) = t d for some d ∈ N, d ≥ 2.
Using the fact that 1)-homogeneous, by Hölder's inequality we obtain for any γ ∈ (0, 1). By means of L p interpolation (see for instance [29, Proposition 1.1.14]) we get Invoking Hölder's inequality and the explicit form of G(a, b), we further infer that In order to bound the L p norms for p ∈ { 2+2γ γ , (2+2γ )(d−1) γ } involving the multiplier contributions in the above expressions, we invoke the quantitative L p -L p boundedness of Fourier multipliers by means of (a version of) the Marcinkiewicz multiplier theorem (see [29,Corollary 6.2.5]) in combination with the transference principle (see, for instance, [29,Theorem 4.3.7]): Indeed, all our multipliers m(D) are of the form χ 1,μ,μ 2 (D) and 1 − χ 1,μ,μ 2 (D) and hence satisfy for j ∈ {1, 2} and all k ∈ R 2 \ {(0, 0)} the quantitative bounds with a uniformly (in > 0) bounded constant A > 0. This thus implies that for each j ∈ {0, . . . , d − 1} Here 1 < C(γ , d) ≤ C( d γ ) 12 (which follows from [29, Theorem 6.2.4 and Corollary 6.2.5]) and is, in particular, independent of μ and μ 2 . Combining the previous inequalities we obtain (22) As a direct generalization of the previous result, we state an immediate corollary (our replacement of the estimate (21)) which we will use in the next subsection.

Corollary 1.
Let $f_1$, $f_2$, $g$ and $h$ be as in the statement of Proposition 3. Let $d \ge 1$ denote the degree of the polynomials $g$, $h$. Then for every $\mu, \mu_2 > 0$ and any $\gamma \in (0, 1)$ there hold with $C_0 > 0$ being a constant depending on $\|f_1\|_{L^\infty}$, $g$, $h$ and $d$. A symmetric statement holds for $f_2$ and $h$.
We stress that the constant C 0 introduced in Corollary 1 is the same as that of Proposition 3 and it is chosen to be greater than 2C + 2C 2 + 2, where C and C are the constants of Lemmas 2 and 3 , respectively.

A bootstrap argument
In this section, we carry out our main bootstrap argument. Let us informally explain the strategy of this, before formulating the precise results in the following lemmas. It consists of three main steps:
Step 1: The starting point. As our starting point, we note that Lemma 2 contains the information that the $L^2$-mass of the states $f_1$ and $f_2$ concentrate (in the frequency space) on the truncated cones $C_{1,\mu,\mu_2}$ and $C_{2,\mu,\mu_2}$, respectively (Fig. 3). It allows us to control the mass of $f_1$, $f_2$ outside of these cones. This information is a direct consequence of the inequalities in (18), which correspond to elastic energy and surface energy controls.
Step 2: Exploiting the "determinedness" of $f_2$ in terms of $f_1$ in the form of the estimate (21). As a next step, we seek to improve the bounds on the mass concentration of $f_1$ and $f_2$ and to iteratively also control the mass of $f_1$ and $f_2$ inside of the cones, except for possible concentrations at the origin. To this end, we use that the estimate (20) and its symmetric version for $f_2$ can be improved by noting that $f_2 = g(f_1)$ and $f_1 = h(f_2)$ with $g$ and $h$ two polynomials of degree $d \ge 1$. Here, an estimate of the type (21) is crucial, since it allows us to compare the Fourier supports of $f_1$ and $f_2 = g(f_1)$ by viewing (21) as In particular, this implies that the Fourier support of $f_2$ in the cone $C_{2,\mu,\mu_2}$ is determined by the interaction of the nonlinearity $g$ and the Fourier support of $f_1$ in the cone $C_{1,\mu,\mu_2}$. More precisely, assuming that $g$ is a polynomial of degree $d \ge 1$ and recalling that multiplication in real space turns into convolution in Fourier space, we obtain that the support of $\mathcal{F}(g(\chi_{1,\mu,\mu_2}(D) f_1))$ is contained in the $d$-fold Minkowski sum of the support of $\mathcal{F}(\chi_{1,\mu,\mu_2}(D) f_1)$:

$$\operatorname{supp}(\mathcal{F}(g(\chi_{1,\mu,\mu_2}(D) f_1))) \subset S_d(\operatorname{supp}(\mathcal{F}(\chi_{1,\mu,\mu_2}(D) f_1))).$$

We hence infer that

$$\operatorname{supp}(\mathcal{F}(g(\chi_{1,\mu,\mu_2}(D) f_1))) \subset \{k \in \mathbb{Z}^n : |k_1| < 4d\mu\mu_2\}$$

(see Fig. 4).

Fig. 4. Illustration of the argument in Step 2 of our bootstrap scheme. The red and blue hashed regions depict $C_{2,\mu,\mu_3}$ and $C_{1,\mu,\mu_2}$, respectively, whereas the hashed light-red region on the left depicts $C_{2,\mu,\mu_2}$. The shaded light-blue region on the left represents the Minkowski sum of $C_{1,\mu,\mu_2}$ with itself (obtained as a bound on the mass of the convolution). This implies that the mass inside $C_{2,\mu,\mu_2}$ actually concentrates in the smaller red cone $C_{2,\mu,\mu_3}$ instead of the original cone $C_{2,\mu,\mu_2}$ (colour figure online)
Combining this with (24) implies that the support of $\mathcal{F} f_2$ in $C_{2,\mu,\mu_2} \setminus C_{2,\mu,\mu_3}$ is quantitatively controlled in terms of the elastic and surface energies and thus that, in a quantitative sense, $\mathcal{F} f_2$ is essentially supported in the smaller cone $C_{2,\mu,\mu_3}$. These observations are made precise and quantified in Lemma 4 below. Technically, this step involves slight losses in the estimates due to the fact that our nonlinearities are not globally Lipschitz continuous and Calderón-Zygmund type arguments as in Lemma 3 are required.
Step 3: Iteration. Due to the symmetry of the properties of $f_1$ and $f_2$ it is then possible to obtain a new estimate of the type (21), now with reversed roles for $f_1$ and $f_2$ and for $f_2$ localized to the smaller cone $C_{2,\mu,\mu_3}$ Repeating the Fourier support argument from above with reversed roles for $f_1$ and $f_2$ then also implies that the mass of $f_1$ must concentrate on a smaller cone $C_{1,\mu,\mu_4}$ with $0 < \mu_4 < \mu_3$ (Fig. 5). Finally, iterating this process, we obtain that the states $f_1$ and $f_2$ concentrate in smaller and smaller cones in frequency space with corresponding $L^2$-errors which

In the following, we make this heuristic argument precise. To this end, from now on, for some $\alpha \in (0, \frac{1}{4})$ arbitrary but fixed, we define $\mu = \epsilon^{\alpha}$ and $\mu_2 = \epsilon^{-1+2\alpha}$.
The reason for this choice of the parameters will become clear at the final stage of the argument; it will allow us to rewrite the right-hand side of (19) as a multiple of the total energy.
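The Fourier support observation from Step 2 can be checked numerically on a toy example. The following minimal sketch is not part of the argument; it assumes NumPy and uses hypothetical choices for the grid size, the modes of `f` and the polynomial `g` (here $g(t) = t^3 - 2t$, so $d = 3$). It builds a trigonometric polynomial on the torus whose frequencies are thin in the $k_1$-direction and verifies that the Fourier support of `g(f)` is contained in the $d$-fold Minkowski sum of the support of `f`, where the origin is added to the support to account for the lower-order terms of `g`.

```python
import numpy as np


def fourier_support(values, tol=1e-9):
    """Set of integer frequencies (k1, k2) whose Fourier coefficient exceeds tol."""
    n = values.shape[0]
    coeffs = np.fft.fft2(values) / values.size
    freqs = np.fft.fftfreq(n, d=1.0 / n).astype(int)  # 0, 1, ..., -2, -1
    return {(int(freqs[i]), int(freqs[j])) for i, j in np.argwhere(np.abs(coeffs) > tol)}


def minkowski_sum(support, times):
    """`times`-fold Minkowski sum of a finite set of frequencies with itself."""
    result = {(0, 0)}
    for _ in range(times):
        result = {(a1 + b1, a2 + b2) for (a1, a2) in result for (b1, b2) in support}
    return result


# Hypothetical toy data: a 64 x 64 periodic grid and three cosine modes.
n, d = 64, 3
x = np.arange(n) / n
X1, X2 = np.meshgrid(x, x, indexing="ij")  # axis 0 <-> k1, axis 1 <-> k2
modes = {(1, 5): 1.0, (-2, 7): 0.5, (0, -6): -0.8}
f = sum(c * np.cos(2 * np.pi * (k1 * X1 + k2 * X2)) for (k1, k2), c in modes.items())

# A polynomial nonlinearity of degree d = 3 without constant term.
g_of_f = f**3 - 2.0 * f

support_f = fourier_support(f)
support_g = fourier_support(g_of_f)
bound = minkowski_sum(support_f | {(0, 0)}, d)

max_k1_f = max(abs(k1) for k1, _ in support_f)
max_k1_g = max(abs(k1) for k1, _ in support_g)
print("support of g(f) inside Minkowski bound:", support_g <= bound)
print("max |k1| for f and g(f):", max_k1_f, max_k1_g)
```

The grid is chosen large enough (all frequencies of `g(f)` stay below $n/2$) so that no aliasing occurs; on a coarser grid the discrete Fourier transform would fold high frequencies back and the inclusion could fail for purely numerical reasons.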
Lemma 4. Let $f_1$, $g$ be as in the statement of Proposition 3 and let $d \ge 1$ be the degree of $g$. Let $C_0 > 0$ and $\gamma > 0$ be the constants from Lemma 3 and let $\alpha \in (0, \frac{1}{4})$ be as above. Then it holds that where

Proof. By the choice of the parameters $\mu$ and $\mu_2$ we get $\max_{k \in C_{1,\mu,\mu_2}}$ By the properties of the Fourier transform and convolution, from the fact that $g$ is a polynomial of degree $d$ and from (27) we have that Now we define $\chi_{1,\epsilon}$ to be the characteristic function of $\{k \in \mathbb{R}^2 : |k_1| > 4d\epsilon^{-1+3\alpha}\}$. From (28) and (23) we infer that This, together with the fact that $|\chi_{2,\mu,\mu_2} - \chi_{2,\mu,\mu_3}| \le \chi_{1,\epsilon} \chi_{2,\mu,\mu_2}$, yields Thus, the triangle inequality, (20) and (29) give the result: using that $C_0 \ge C$.
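For orientation, we record the elementary computation behind the frequency threshold appearing in the proof of Lemma 4. With the choice $\mu = \epsilon^{\alpha}$ and $\mu_2 = \epsilon^{-1+2\alpha}$, the bound $4d\mu\mu_2$ from Step 2 of the heuristic discussion becomes

$$\mu\mu_2 = \epsilon^{\alpha} \cdot \epsilon^{-1+2\alpha} = \epsilon^{-1+3\alpha}, \qquad \text{so that} \qquad 4d\mu\mu_2 = 4d\epsilon^{-1+3\alpha}.$$

Since $\alpha \in (0, \frac{1}{4})$, the exponent satisfies $-1+3\alpha < -\frac{1}{4} < 0$, so this threshold grows as $\epsilon \to 0$, while it is smaller than $\mu_2 = \epsilon^{-1+2\alpha}$ by the factor $4d\epsilon^{\alpha}$.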
Next, we iterate this and thus obtain that $f_1$ and $f_2$ can always be approximated by functions with smaller and smaller support in Fourier space (see Fig. 5).
Lemma 5. Let $f_1$, $f_2$, $g$, $h$ be as in the statement of Proposition 3, let $d \ge 1$ be the degree of $g$, $h$ and let for some universal constant $M > 1$. Let $C_0 > 0$ and $\gamma > 0$ be the constants from Lemma 3 and let $\alpha \in (0, \frac{1}{4})$ be as above. Then, for every $m \in \mathbb{N}$ there holds where $m_e = 2\lfloor \frac{m+2}{2} \rfloor$ and $m_o = 2\lfloor \frac{m+1}{2} \rfloor + 1$ are respectively the lower even and odd parts of $m + 2$.

Proof. We reason by induction. The induction basis is provided by Lemma 4.
Without loss of generality we may assume $m \in 2\mathbb{N}$, thus $m_e = m + 2$, $m_o = m + 1$ and also $(m-1)_e = m$, $(m-1)_o = m + 1$. Assume the inductive hypothesis to hold true. We now show that the statement remains valid in the step from $m - 1$ to $m$.
Step 1. Here, by the triangle inequality, the fact that $f_1 = h(f_2)$ and (32) we get

Step 2. We now reason similarly as in the proof of Lemma 4. From $\max_{k \in C_{2,\mu,\mu_{m+1}}}$ Let $\chi_{2,\epsilon}$ denote the characteristic function of $\{k \in \mathbb{R}^2 : |k_2| > 4\mu_{m+1}\mu\}$. Thus, from the fact that $|\chi_{1,\mu,\mu_m} - \chi_{1,\mu,\mu_{m+2}}| \le \chi_{2,\epsilon} \chi_{1,\mu,\mu_m}$ and recalling (34), we obtain Thus, by the triangle inequality We control the first term on the right-hand side above by means of the inductive hypothesis (32) and the second by the statement of Lemma 3 (applied to $f_2$ and $h$) and again (32), that is Hence, recalling that C 0 ≥ 2 + 2C 2 , we infer which combined with (33) gives the result.
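The index bookkeeping in Lemma 5 and its proof can be sanity-checked mechanically. The following short sketch (the function names are hypothetical and only serve this illustration) computes the lower even and odd parts of $m + 2$ and confirms, for a range of even $m$, the identities used at the beginning of the induction step.

```python
def lower_even_part(n):
    """Largest even integer not exceeding n."""
    return 2 * (n // 2)


def lower_odd_part(n):
    """Largest odd integer not exceeding n."""
    return 2 * ((n - 1) // 2) + 1


def cone_indices(m):
    """Return (m_e, m_o), the lower even and odd parts of m + 2."""
    return lower_even_part(m + 2), lower_odd_part(m + 2)


# For even m: m_e = m + 2, m_o = m + 1, (m - 1)_e = m and (m - 1)_o = m + 1.
for m in range(2, 21, 2):
    assert cone_indices(m) == (m + 2, m + 1)
    assert cone_indices(m - 1) == (m, m + 1)
```

In particular, passing from $m - 1$ to $m$ for even $m$ changes only the even index (from $m$ to $m + 2$), which matches the fact that only one of the two cones is shrunk in each iteration step.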

Proof of Proposition 3
In this section, we conclude the proof of Proposition 3 by combining all the bounds from Sections 4.2-4.3.

Proof of Proposition 3. Step 1: Choices of parameters in the bootstrap iteration.
Consider $m \in 2\mathbb{N}$. From Lemma 5 we deduce We first identify the number of iterations $m$ such that $\mu_{m+2} < 1$, so that the left-hand side of (35) reduces to The condition $\mu_{m+2} < 1$ corresponds to For (36) to hold for some $m \in \mathbb{N}$ a necessary condition is that $\alpha$ satisfies $\alpha|\log(\epsilon)| > \log(Md)$. Under this assumption on $\alpha$ (which will be satisfied by our choice of $\alpha$ in terms of $\epsilon$, see below), (36) is satisfied for $m > \frac{1}{\alpha} - 2$. In particular, a possible choice for $m$ thus is given by m = 2 1 2α . For such a choice of $m$ we arrive at We now take γ = α 2 . Since $\alpha$ is a small parameter (to be determined) 1 2 < (1 − α 2 ) 1 α < 1. Thus, recalling the definition of $\mu = \epsilon^{\alpha}$ and $\mu_2 = \epsilon^{-1+2\alpha}$, we get

Step 2: Optimization. We can further improve the right-hand side of (37) by noticing that for every $\nu > 0$ and for some constant $c > 0$ depending on $d$ and $\nu > 0$ which gives Optimizing in $\alpha$, we choose $(4C_0 e^{c})^{\alpha^{-1-\nu}} \sim \epsilon^{-2\alpha}$, that is Notice that taking for instance $\alpha = |\log(\epsilon)|^{-\frac{1}{2+\nu}}$ the conditions on $\alpha$ are satisfied for every $\epsilon \in (0, \epsilon_0)$, where $0 < \epsilon_0 < e^{-1}$ depends only on $d$. We eventually obtain, for every $\nu > 0$,

Remark 6. We emphasize that there is no particular reason to choose the power $\frac{1}{2}$ in the exponent on the right-hand side of (37). It would have been possible to produce any power in $(0, 1)$. As this does not play a major role in our estimates below, we have simply chosen this power for convenience.
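To get a feeling for the sizes of the parameters in the two steps above, the following sketch evaluates the choice $\alpha = |\log(\epsilon)|^{-\frac{1}{2+\nu}}$ for one concrete $\epsilon$ and checks the necessary condition $\alpha|\log(\epsilon)| > \log(Md)$ as well as the requirement $m > \frac{1}{\alpha} - 2$. Several ingredients are assumptions made only for this illustration and are not taken from the text: the numerical values of $M$, $d$, $\nu$ and $\epsilon$, the rounding convention $m = 2\lceil \frac{1}{2\alpha} \rceil$, and the recursion $\mu_{k+1} = Md\mu\mu_k$ (starting from $\mu_2$) that is used to evaluate $\log \mu_{m+2}$.

```python
import math


def bootstrap_parameters(eps, nu, M, d):
    """Evaluate the parameter choices of the bootstrap iteration for given eps."""
    log_inv_eps = abs(math.log(eps))
    alpha = log_inv_eps ** (-1.0 / (2.0 + nu))

    # Assumption: round 1/(2*alpha) up, which makes m an even integer.
    m = 2 * math.ceil(1.0 / (2.0 * alpha))

    # Necessary condition alpha * |log(eps)| > log(M * d).
    necessary_condition = alpha * log_inv_eps > math.log(M * d)

    # Assumption: mu_{k+1} = M*d*mu*mu_k with mu = eps**alpha and
    # mu_2 = eps**(-1 + 2*alpha), hence
    # log(mu_{m+2}) = m*log(M*d) + (m*alpha - 1 + 2*alpha)*log(eps).
    # Working with logarithms avoids overflow for very small eps.
    log_mu_m_plus_2 = m * math.log(M * d) + (m * alpha - 1.0 + 2.0 * alpha) * math.log(eps)

    return {
        "alpha": alpha,
        "m": m,
        "necessary_condition": necessary_condition,
        "log_mu_m_plus_2": log_mu_m_plus_2,
    }


# Hypothetical sample values.
params = bootstrap_parameters(eps=1e-30, nu=0.5, M=2.0, d=3)
print(params)
```

Under these assumptions a negative value of `log_mu_m_plus_2` corresponds to $\mu_{m+2} < 1$. The example also illustrates that $\alpha < \frac{1}{4}$ forces $\epsilon$ to be very small for this choice of $\alpha$, since $\alpha$ decays only like a negative power of $|\log(\epsilon)|$.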

Application to Tartar's square and proof of the lower bound from Theorem 1
We now consider the case $f_1 = \chi_{2,2}$ and $f_2 = \chi_{1,1}$, where the phase indicators $\chi_j$ are defined as in (4). Using the lower bound from Proposition 3 we derive the following lower bound for the elastic energy, which, in particular, yields the proof of the lower bound in Theorem 1 for the periodic setting. We refer to the argument below, which then also allows us to transfer this to the case of affine boundary conditions.

Theorem 2. Let $E_\epsilon$ be as in (11) and $r_\nu(\epsilon)$ as in (12). Let $F \in K^{qc}$. Assume that $E^{per}_\epsilon(\chi, F) \le 1$. Then, for every $\nu \in (0, 1)$ and for every $\chi_j$ for $j = 1, \dots, 4$ as in (4) it holds that $r_\nu(\epsilon) \operatorname{dist}^2(F, K) \lesssim E^{per}_\epsilon(\chi, F)^{\frac{1}{2}}$ for every $\epsilon \in (0, \epsilon_0)$.

We set $r_\nu(\epsilon) := \exp(-C|\log(\epsilon)|^{\frac{1}{2}+\nu})$. By Proposition 3 we infer for every $\nu \in (0, 1)$, where $\bar{\chi}$ is the mean of $\chi$. In order to conclude the argument, we seek to provide a bound on $\operatorname{dist}(\chi, F)$. To this end, we invoke the boundary conditions and make use of Jensen's inequality applied to the elastic energy: For instance, for every u ∈ A Now, invoking the triangle inequality in conjunction with (39), (40) and the observation that $\|\chi - F\|^2_{L^2} \ge |\mathbb{T}^2| \operatorname{dist}^2(F, K)$, it follows that for any boundary datum $F \in K^{qc} \subset \mathbb{R}^{2 \times 2}$, we have dist 2 (F, K) r ν ( ) −

It is worth remarking that the lower bound of Theorem 2 trivially holds in the case $F \notin K^{qc}$ since in this case from relaxation theory one has $E^{per}_{el}(\chi, F) \ge c > 0$ for some constant $c$ depending on $F$. It should also be noted that, when $F \in K^{qc}$, the condition $E_\epsilon(\chi, F) \le 1$ is not restrictive when this lower bound is applied to deduce that of Theorem 1, due to the upper bound from Proposition 1.
Theorem 2 combined with Proposition 1 proves the main result of this paper, Theorem 1, in the periodic setting.
Last but not least, we now also transfer the lower bound estimate to the case of affine boundary data:

Proof of the lower bound of Theorem 1 in the case of affine boundary conditions. We first note that $A^{aff}_F \subset A^{per}_F$. Since $(L^\infty \cap BV)(\mathbb{T}^2) \subset (L^\infty \cap BV)((0, 1)^2)$, this implies that for each $\chi \in (L^\infty \cap BV)(\mathbb{T}^2)$, it holds that Recalling the trace theorem for $BV$ functions, we further note that any function in $(L^\infty \cap BV)((0, 1)^2)$ can also be viewed as a function in $(L^\infty \cap BV)(\mathbb{T}^2)$ by periodic extension. Hence, (41) yields that $E^{per}_\epsilon(\chi, F) \le 40 E^{aff}_\epsilon(\chi, F)$.