1 Introduction

A challenging question in the study of non-linear partial differential equations is to find which non-linear functionals are well-behaved with respect to weak convergence, which represents the typical topology consistent with physical measurements and has satisfactory compactness properties. In the context of the Calculus of Variations, answering this question amounts, roughly speaking, to describing semi-continuity properties of functionals

$$\begin{aligned} \mathscr {E}[w]=\int _\varOmega f(w(x)){\text {d}}\!x \end{aligned}$$
(1)

with respect to weak convergence in certain weakly closed, convex subsets \(\mathfrak {C}\), say, of \({\text {L}}^p\)-spaces, \(1<p<\infty \), under growth conditions

$$\begin{aligned} 0\leqslant f\leqslant c(|\cdot |^p+1) \end{aligned}$$
(2)

on the integrands f. Such subsets \(\mathfrak {C}\) can account for differential constraints and boundary conditions. Modulo terms removed for simplicity of exposition, such functionals could model, for instance, the energy arising from the deformation of a solid body \(\varOmega \), viewed as a sufficiently regular open subset of \(\mathbb {R}^n\), where f is a continuous energy density map characterized by the constitutive properties of the material. In accordance with the Direct Method in the Calculus of Variations, imposing a suitable bound from below on f ensures existence and weak compactness of minimizing sequences \(w_j\). The appropriate continuity property of \(\mathscr {E}\) in this case is that of lower semi-continuity with respect to weak convergence in \({\text {L}}^p\)

$$\begin{aligned} w_j\rightharpoonup w\implies \liminf _{j\rightarrow \infty }\mathscr {E}[w_j]\ge \mathscr {E}[w], \end{aligned}$$

which, if satisfied, implies existence of a minimizer \(w\in \mathfrak {C}\).

It is well-known that if \(\mathfrak {C}\) consists of the whole of \({\text {L}}^p\), then \(\mathscr {E}\) is weakly sequentially lower semi-continuous if and only if f satisfying (2) is convex. Of course, convexity of f is sufficient for lower semi-continuity (always understood as weakly sequential throughout this note) in any reasonable class \(\mathfrak {C}\), but it is hardly necessary in general. For instance, if \(\mathfrak {C}\) is the space of weak gradients in \({\text {L}}^2\) and f is a quadratic form, then one can easily show that f being positive on rank-one matrices implies lower semi-continuity. This example, which we will later come back to in more generality, is of particular relevance, as it provides the insight for a second convexity condition, which is necessary for lower semi-continuity with the constraint \(w=\nabla u\): if \(\mathscr {E}\) is lower semi-continuous, then f is convex along rank-one lines. In particular, for integrands f of class \({\text {C}}^2\), this is equivalent to the so-called Legendre–Hadamard ellipticity condition

$$\begin{aligned} \frac{\partial ^2f(X)}{\partial X_{ij}\partial X_{\alpha \beta }}a_ia_\alpha b_jb_\beta \ge 0\quad \text {for all } X,a,b, \end{aligned}$$

where summation over repeated indices is adopted. From this point of view, lower semi-continuity of \(\mathscr {E}\) acting on gradients reflects a semi-convexity condition on f. Indeed, it was shown by Morrey in [22] that lower semi-continuity of \(\mathscr {E}\) is equivalent to quasiconvexity of f, i.e., the Jensen-type inequality

$$\begin{aligned} f(\eta )\leqslant \fint _Q f(\eta +\nabla u(x)){\text {d}}\!x \end{aligned}$$

holds for all \(\eta \) and all smooth maps u with compact support in the open cube Q. On one hand, the quasiconvexity assumption is a plausible constitutive relation for energy functionals arising in solid mechanics [5]; on the other hand, it is but a minor improvement of the lower semi-continuity concept, which makes it particularly difficult to check in applications. The counterexample of Šverák [32] rules out the possibility of quasiconvexity being a type of directional convexity (see also [7, Ex. 3.5] for the case of higher order gradients). A tractable sufficient condition for quasiconvexity is polyconvexity, i.e., f is a convex function of the minors, also introduced by Morrey in [22] in connection with lower semi-continuity and used by Ball to obtain existence theorems under very mild growth conditions, giving very satisfactory existence results in non-linear elasticity [4]. The fact that quasiconvexity does not imply polyconvexity is much easier to see, at least in higher dimensions, and follows from an old observation of Terpstra concerning quadratic forms [37] (see also [2, 6] and the references therein).
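The gap between convexity and convexity along rank-one lines can be seen on the simplest minor. The following is a minimal sketch, not taken from the works cited above: it checks in exact arithmetic that the 2×2 determinant is affine along a rank-one line \(t\mapsto X+t\,a\otimes b\) while failing the convexity inequality at a midpoint. The helper names `det2`, `rank_one_line`, and `second_difference` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def det2(X):
    """Determinant of a 2x2 matrix given as nested lists."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]


def rank_one_line(X, a, b, t):
    """Return X + t * (a tensor b) for 2-vectors a and b."""
    return [[X[i][j] + t * a[i] * b[j] for j in range(2)] for i in range(2)]


def second_difference(X, a, b, h=Fraction(1)):
    """Second difference at t = 0 of t -> det(X + t a(x)b).

    The map is a polynomial of degree at most two in t, so the second
    difference vanishes exactly when the map is affine in t.
    """
    def g(t):
        return det2(rank_one_line(X, a, b, t))

    return g(h) - 2 * g(Fraction(0)) + g(-h)


# Affine along a rank-one line (illustrative data).
affine_defect = second_difference([[1, 2], [3, 4]], [1, -2], [3, 5])

# Convexity fails between X = diag(1, -1) and Y = diag(-1, 1): the midpoint is 0.
X = [[1, 0], [0, -1]]
Y = [[-1, 0], [0, 1]]
midpoint = [[Fraction(X[i][j] + Y[i][j], 2) for j in range(2)] for i in range(2)]
convexity_fails = det2(midpoint) > Fraction(det2(X) + det2(Y), 2)
```

Here `affine_defect` is zero because the quadratic term in t is the determinant of a rank-one matrix, and `convexity_fails` is true because the determinant is 0 at the midpoint and -1 at both endpoints.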

The above considerations show that a considerable amount of work was devoted to the treatment of lower semi-continuity in the case when \(\mathfrak {C}\) consists of gradients (see [1, 19] and the monographs [14, 27]). However, for instance in continuum mechanics, it is often the case that \(\mathfrak {C}\) consists of those \({\text {L}}^p\)-fields w that satisfy a linear, typically under-determined, partial differential constraint, say \(\mathcal {A}w=0\), an assumption that we make henceforth. Examples arise in elasticity, plasticity, elasto-plasticity, electromagnetism, and others. The \(\mathcal {A}\)-free framework originates in the pioneering work of Murat and Tartar in compensated compactness [23, 33, 34] and can be correlated with the question of finding energy functionals that are continuous with respect to weak convergence in \(\mathfrak {C}\) [24]. The latter question was also studied in generality by Ball, Currie, and Olver in [7], leading to the generalization of polyconvexity to the case where energy functionals depend on higher order derivatives. In this case, the definition of quasiconvexity extends mutatis mutandis [20]. As to the question of lower semi-continuity, the analysis of the case when f is a quadratic form (see, e.g., [36, Ch. 17] or [35, Thm. 2]) reveals a different necessary condition of directional convexity, namely with respect to the so-called wave cone of \(\mathcal {A}\). It was shown by Dacorogna in [12, Thm. I.2.3] that, in order to have lower semi-continuity, it is sufficient to assume the following generalization of quasiconvexity, namely that

$$\begin{aligned} f(\eta )\leqslant \fint _Qf(\eta +w(x)){\text {d}}\!x \end{aligned}$$

for all \(\eta \) and all bounded w such that \(\int _Qw=0\) and \(\mathcal {A}w=0\). However, it is not clear whether this condition is necessary. More recently, Fonseca and Müller showed in [16] that if one assumes in addition that the fields w are periodic, in which case f is called \(\mathcal {A}\)-quasiconvex, then one indeed obtains a necessary and sufficient condition (under suitable growth assumptions on f). Their result holds under the assumption that the symbol map \(\mathcal {A}(\cdot )\) of \(\mathcal {A}\) is a constant rank matrix-valued field away from 0. This condition, introduced in [30, Def. 1.5] to prove coerciveness inequalities for non-elliptic systems, was first used in the context of compensated compactness by Murat and ensures, as noted on [23, p.502], the continuity of the map

$$\begin{aligned} 0\ne \xi \mapsto \text {Proj}_{\ker \mathcal {A}(\xi )}, \end{aligned}$$
(3)

making tools from pseudo-differential calculus available. In the absence of the constant rank assumption, little is known about the lower semi-continuity problem. One of the few results in this direction was proved by Müller in [25], answering a long standing question of Tartar (see also [18] for a generalization).

In the proof of the main result of [16], considerable difficulty is encountered when proving sufficiency of \(\mathcal {A}\)-quasiconvexity. One reason for this is the absence of potential functions for \(\mathcal {A}\), which, if available, should allow one to test with compactly supported functions in the definition of \(\mathcal {A}\)-quasiconvexity and, perhaps, use more standard methods.

The main result of the present work is to show that the existence of such a potential in Fourier space is equivalent to the constant rank condition.

Theorem 1

Let \(\mathcal {A}\) be a linear, homogeneous differential operator with constant coefficients on \(\mathbb {R}^n\). Then \(\mathcal {A}\) has constant rank if and only if there exists a linear, homogeneous differential operator \(\mathbb {B}\) with constant coefficients on \(\mathbb {R}^n\) such that

$$\begin{aligned} \ker \mathcal {A}(\xi )=\mathrm {im\,}\mathbb {B}(\xi ) \end{aligned}$$
(4)

for all \(\xi \in \mathbb {R}^n\setminus \{0\}\).

Here \(\mathcal {A}(\cdot )\), \(\mathbb {B}(\cdot )\) denote the (tensor-valued) symbol maps of, respectively, \(\mathcal {A},\mathbb {B}\). We say that \(\mathcal {A}\) has constant rank if the map \(0\ne \xi \mapsto {\mathrm{rank}\,}\mathcal {A}(\xi )\) is constant (see Sect. 2 for detailed notation). We will regard \(\mathbb {B}\) as the potential and \(\mathcal {A}\) as the annihilator, although this terminology is not standard.

It is important to mention that the algebraic relation (4) does not, in general, imply for vector fields w that

$$\begin{aligned} \mathcal {A}w=0\implies w=\mathbb {B}u\qquad \text { for some }u. \end{aligned}$$
(5)

To see this, simply take \(\mathcal {A}=\nabla ^k\). In turn, if we impose restrictions on w that allow for usage of the Fourier transform, (5) can be shown to hold (Lemma 2). As a consequence, standard arguments in the Calculus of Variations lead to the fact that a map f is \(\mathcal {A}\)-quasiconvex if and only if

$$\begin{aligned} f(\eta )\leqslant \fint _Qf(\eta +\mathbb {B}u(x)){\text {d}}\!x \end{aligned}$$

for all \(\eta \) and all smooth vector fields u supported in an open cube Q (Corollary 1). It is also the case that, under the constant rank condition, the notions of \(\mathcal {A}\)-quasiconvexity [16, Def. 3.1] and Dacorogna’s \(\mathcal {A}\)-\(\mathbb {B}\)-quasiconvexity [12, Eq. (A.12)] coincide. In particular, one can define \(\mathcal {A}\)-quasiconvexity via integration over arbitrary domains (Lemma 3). As a consequence, the lower semi-continuity properties of functionals (1) in the (asymptotically \(\mathcal {A}\)-free) topologies considered in [3, 16], which are natural from the point of view of compensated compactness theory, rely only on the structure of the potential \(\mathbb {B}\).

In fact, we will show that the \(\mathcal {A}\)-quasiconvex relaxation of a continuous integrand can be described in terms of \(\mathbb {B}\) only. From this point of view, it is natural to investigate the Young measures generated by sequences satisfying differential constraints [16, Sec. 4], as they efficiently describe the minimization of energies that are not lower semi-continuous. We recall that the role of parametrized measures for non-convex problems in the Calculus of Variations was first recognized by Young in the pioneering works [39,40,41]. See the monographs [26, 27] for a modern, detailed exposition.

Roughly speaking, for \(1<p<\infty \), we consider a sequence \(w_j\) converging weakly in \({\text {L}}^p\) which is asymptotically \(\mathcal {A}\)-free and generates a Young measure \(\varvec{\nu }\). Technically speaking, it suffices to take \(\mathcal {A}w_j\) to be strongly compact in \({\text {W}}^{-k,p}_{{\text {loc}}}\), where k is the order of \(\mathcal {A}\). This is (slightly more general than) the topology considered in [16, Rk. 4.2(i)] and is consistent with the topology considered in compensated compactness (see, e.g., [36, Thm. 17.3], which essentially deals with the case of linear Euler–Lagrange equations). In this setting, we will show that the Young measure \(\varvec{\nu }\) is generated by a sequence of smooth maps \(\mathbb {B}u_j\), modulo a shift by the barycentre (Proposition 1).

To sum up, under the constant rank condition on the annihilator \(\mathcal {A}\), the objects characterizing the lower semi-continuous relaxation of functionals defined on \(\mathcal {A}\)-free vector fields (i.e., \(\mathcal {A}\)-quasiconvex envelopes and \(\mathcal {A}\)-free Young measures) can be described only in terms of the potential \(\mathbb {B}\) constructed in Theorem 1. From this point of view, it is the author’s opinion that the study of functionals

$$\begin{aligned} \mathscr {E}[w]=\int _\varOmega f(x,w(x)){\text {d}}\!x\text { for }\mathcal {A}w=0\quad \text { and }\quad \mathscr {F}[u]=\int _\varOmega f(x,\mathbb {B}u(x)){\text {d}}\!x \end{aligned}$$

is essentially dual (strictly under the constant rank condition). See also [13] and the Appendix of [12].

Since testing with the appropriate quantity is fundamental in the study of partial differential equations, we hope that the observations made in this work will increase the flexibility of analyzing functionals in either class described above. On the other hand, the functional \(\mathscr {F}\) seems better suited for incorporating boundary conditions, which will be pursued elsewhere.

This paper is organized as follows: In Sect. 2 we prove the main Theorem 1, in Sect. 3 we prove that \(\mathcal {A}\)-quasiconvexity can be tested with compactly supported fields \(w=\mathbb {B}u\) (Corollary 1), and in Sect. 4 we prove that \(\mathcal {A}\)-free Young measures are shifts of Young measures generated by sequences \(\mathbb {B}u_j\).

2 Proof of Theorem 1

We take a moment to clarify notation. By a k-homogeneous, linear differential operator \(\mathcal {A}\) on \(\mathbb {R}^n\) from W to X we mean

$$\begin{aligned} \mathcal {A}w{:}{=}\sum _{|\alpha |=k}\partial ^\alpha \mathcal {A}_\alpha w\qquad \text {for }w:\mathbb {R}^n\rightarrow W, \end{aligned}$$
(6)

where \(\mathcal {A}_\alpha \in \mathrm {Lin}(W,X)\) for all multi-indices \(\alpha \) such that \(|\alpha |=k\), for finite dimensional inner product spaces W, X. We also define the (Fourier) symbol map

$$\begin{aligned} \mathcal {A}(\xi ){:}{=}\sum _{|\alpha |=k}\xi ^\alpha \mathcal {A}_\alpha \in \mathrm {Lin}(W,X)\qquad \text {for }\xi \in \mathbb {R}^n. \end{aligned}$$

We also recall the condition mentioned above that \(\mathcal {A}\) is of constant rank if there exists a natural number r such that

$$\begin{aligned} \mathrm {rank}\mathcal {A}(\xi )=r\qquad \text {for all }\xi \in \mathbb {R}^n\setminus \{0\}. \end{aligned}$$
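To make the constant rank condition concrete, the sketch below evaluates the rank of a few symbols at sample frequencies, using exact Gaussian elimination. It is only an illustration: a finite sample cannot prove constancy of the rank, and the operator behind `diag_symbol`, namely \(w\mapsto (\partial _1w_1,\partial _2w_2)\) on \(\mathbb {R}^2\), is a hypothetical example chosen here because its rank visibly jumps. All function names are assumptions of this sketch.

```python
import itertools
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) computed by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            factor = A[i][c] / A[r][c]
            A[i] = [x - factor * y for x, y in zip(A[i], A[r])]
        r += 1
    return r


def div_symbol(xi):
    """Symbol of the divergence on R^3, a 1x3 matrix."""
    return [list(xi)]


def curl_symbol(xi):
    """Symbol of the curl on R^3, the 3x3 matrix of w -> xi x w."""
    x, y, z = xi
    return [[0, -z, y], [z, 0, -x], [-y, x, 0]]


def diag_symbol(xi):
    """Symbol of the hypothetical operator w -> (d_1 w_1, d_2 w_2) on R^2."""
    return [[xi[0], 0], [0, xi[1]]]


def nonzero_samples(dim, values=(-1, 0, 1)):
    """All non-zero vectors with entries in `values`."""
    return [xi for xi in itertools.product(values, repeat=dim) if any(xi)]


div_ranks = {rank(div_symbol(xi)) for xi in nonzero_samples(3)}
curl_ranks = {rank(curl_symbol(xi)) for xi in nonzero_samples(3)}
diag_ranks = {rank(diag_symbol(xi)) for xi in nonzero_samples(2)}
```

On these samples the divergence and the curl each produce a single rank value, whereas the diagonal operator produces two, so it cannot be of constant rank.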

As to the resolution of Theorem 1, we recall the notion of (Moore–Penrose) generalized inverse, introduced independently in [8, 21, 28], to which we refer plainly as the pseudo-inverse, although the terminology is not standard. For a matrix \(M\in \mathbb {R}^{N\times m}\), its pseudo-inverse \(M^\dagger \) is the unique \({m\times N}\) matrix defined by the relations

$$\begin{aligned} MM^\dagger M=M,\quad M^\dagger MM^\dagger =M^\dagger ,\quad (MM^\dagger )^*=MM^\dagger ,\quad (M^\dagger M)^*=M^\dagger M, \end{aligned}$$

where \(M^*\) denotes the adjoint (transpose) of M. Equivalently, the pseudo-inverse is determined by the geometric property that \(MM^\dagger \) and \(M^\dagger M\) are orthogonal projections onto \(\mathrm {im\,}M\) and \((\ker M)^\perp \) respectively. We refer the reader to the monograph [10] for more detail on generalized inverses.
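The four defining relations are easy to test on a small example. The sketch below, with hypothetical helper names, checks them in exact arithmetic for the column matrix \(M=(1,2)^*\), whose pseudo-inverse is \(M^*/|M|^2\), and shows that a different left inverse of M satisfies the first two relations but not the symmetry of \(MM^\dagger \).

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def is_pseudo_inverse(M, P):
    """Check the four Penrose relations for a candidate P."""
    MP, PM = matmul(M, P), matmul(P, M)
    return (matmul(MP, M) == M
            and matmul(PM, P) == P
            and transpose(MP) == MP
            and transpose(PM) == PM)


M = [[1], [2]]                                  # a 2x1 matrix
candidate = [[Fraction(1, 5), Fraction(2, 5)]]  # M^* / |M|^2
other_left_inverse = [[1, 0]]                   # also satisfies P M = 1

penrose_ok = is_pseudo_inverse(M, candidate)
penrose_fails = not is_pseudo_inverse(M, other_left_inverse)
```

Both flags are true here: the second candidate fails only because \(MP\) is not symmetric, which is the condition that makes \(MM^\dagger \) an orthogonal projection.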

With these considerations in mind, it is easy to see that the projection map \(\mathbb {P}\in {\text {C}}^\infty (\mathbb {R}^n\setminus \{0\},\mathrm {Lin}(W,W))\) defined in (3) can be represented as

$$\begin{aligned} \mathbb {P}(\xi )={\text {Id}}_W-\mathcal {A}^\dagger (\xi )\mathcal {A}(\xi )\quad \text {for }\xi \in \mathbb {R}^n\setminus \{0\}. \end{aligned}$$
(7)

The smoothness of \(\mathbb {P}\) is well-known [16, Prop. 2.7]; for a proof using pseudo-inverses see [29, Sec. 4]. By the basic properties of pseudo-inverses, it is easy to see that, with the choice \(\mathbb {B}=\mathbb {P}\), we have that (4) holds; however, the tensor-valued map \(\mathbb {P}\) is 0-homogeneous, hence not polynomial in general. In particular, \(\mathbb {P}\) cannot define a differential operator.

On the other hand, motivated by a similar construction in [38, Rk. 4.1], one can speculate that \(\mathbb {P}\) and, in fact, \(\mathcal {A}^\dagger (\cdot )\) are rational functions. This is indeed the case, as a consequence of the main result of Decell in [15], building on the fundamental result of Penrose [28, Thm. 2] and the Cayley–Hamilton Theorem.

Theorem 2

(Decell [15, Thm. 3]) Let \(M\in \mathbb {R}^{N\times m}\) and denote by

$$\begin{aligned} p(\lambda ){:}{=}(-1)^N\left( a_0\lambda ^N+a_1\lambda ^{N-1}+\cdots +a_N\right) \quad \text {for }\lambda \in \mathbb {R}\end{aligned}$$

the characteristic polynomial of \(MM^*\), where \(a_0=1\). Define

$$\begin{aligned} r{:}{=}\max \{j\in \mathbb {N}:a_j\ne 0\}. \end{aligned}$$
(8)

Then, if \(r=0\), we have that \(M^\dagger =0\); else

$$\begin{aligned} M^\dagger =-a^{-1}_rM^*\left[ a_0(MM^*)^{r-1}+a_1(MM^*)^{r-2}+\cdots +a_{r-1}{\text {Id}}_{N\times N}\right] . \end{aligned}$$
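The formula can be tried out directly. The following is a minimal sketch in exact rational arithmetic, under two assumptions that are not part of the statement above: the coefficients \(a_j\) are computed with the Faddeev–LeVerrier recursion (any method for the characteristic polynomial would do), and r is taken as the largest index with \(a_j\ne 0\), since the coefficients alternate in sign. The function names are hypothetical.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def add_scaled_identity(A, c):
    """Return A + c * Id for a square matrix A."""
    n = len(A)
    return [[A[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]


def char_coeffs(H):
    """Coefficients a_0 = 1, a_1, ..., a_N of det(lambda*Id - H).

    Uses the Faddeev-LeVerrier recursion; up to the factor (-1)^N this is
    the characteristic polynomial in the normalization of the theorem.
    """
    n = len(H)
    coeffs = [Fraction(1)]
    Mk = identity(n)
    for k in range(1, n + 1):
        HM = matmul(H, Mk)
        ck = -sum(HM[i][i] for i in range(n)) / k
        coeffs.append(ck)
        Mk = add_scaled_identity(HM, ck)
    return coeffs


def decell_pinv(M):
    """Pseudo-inverse of M via Decell's formula, as a list of rows."""
    M = [[Fraction(x) for x in row] for row in M]
    Mt = transpose(M)
    H = matmul(M, Mt)                       # H = M M^*
    a = char_coeffs(H)
    r = max(j for j, aj in enumerate(a) if aj != 0)
    if r == 0:
        return [[Fraction(0)] * len(M) for _ in range(len(M[0]))]
    S = identity(len(H))                    # Horner scheme, starts at a_0 * Id
    for j in range(1, r):
        S = add_scaled_identity(matmul(H, S), a[j])
    return [[-x / a[r] for x in row] for row in matmul(Mt, S)]


pinv_column = decell_pinv([[1], [2]])
pinv_diagonal = decell_pinv([[2, 0], [0, 3]])
```

For the column matrix the result is \(M^*/|M|^2\), and for the invertible diagonal matrix it is the usual inverse, as one expects from the Penrose relations.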

Proof

(Proof of Theorem 1, sufficiency) Suppose that \(\mathcal {A}\) has constant rank. We put \(M{:}{=}\mathcal {A}(\xi )\) in the above Theorem for \(\xi \in \mathbb {R}^n\setminus \{0\}\), and abbreviate \(\mathcal {H}(\xi ){:}{=}\mathcal {A}(\xi )\mathcal {A}^*(\xi )\). The first, perhaps most crucial, observation is that \(r(\xi )\), as defined by (8), equals the number of non-zero eigenvalues of \(MM^*\), which equals the number of non-zero singular values of M. This is, in turn, equal to \({\mathrm{rank}\,}M\), which is independent of \(\xi \) by the constant rank assumption on \(\mathcal A\).

Therefore, if \(r(\xi )=r=0\), we have that \(\mathcal {A}(\xi )=0_{N\times m}\), \(\mathcal {A}^\dagger (\xi )=0_{m\times N}\), so we can simply choose \(\mathbb {B}(\xi )={\text {Id}}_W\), which satisfies (4) and gives rise to a linear, 0-homogeneous differential operator. Otherwise, if \(r(\xi )=r>0\), we obtain

$$\begin{aligned} \mathcal {A}^\dagger (\xi )=-a_r(\xi )^{-1}\mathcal {A}^*(\xi )\left[ a_0(\xi )\mathcal {H}(\xi )^{r-1}+a_1(\xi )\mathcal {H}(\xi )^{r-2}+\cdots +a_{r-1}(\xi ){\text {Id}}_{X}\right] . \end{aligned}$$

It is easy to see that \(\mathcal {H}(\cdot )\) is a tensor-valued polynomial in \(\xi \). The scalar fields \(a_j\), \(j=1\ldots r\), are such that \(a_j(\xi )\) is a coefficient of the characteristic polynomial of \(\mathcal {H}(\xi )\), hence a linear combination of minors. In particular, \(a_j\) are scalar-valued polynomials in \(\xi \).

It then follows that, with \(\mathbb {P}\) as in (3),

$$\begin{aligned} \mathbb {B}(\xi ){:}{=}a_r(\xi )\mathbb {P}(\xi ) =a_r(\xi ){\text {Id}}_W-a_r(\xi )\mathcal {A}^\dagger (\xi )\mathcal {A}(\xi )\quad \text {for }\xi \in \mathbb {R}^n \end{aligned}$$
(9)

defines a tensor-valued polynomial that satisfies (4). In particular, (9) gives rise to a linear differential operator. To check that it is homogeneous, it suffices to see that \(a_r(\cdot )\) is a linear combination of minors of the same order of \(\mathcal {H}(\cdot )\), which is homogeneous since \(\mathcal {A}(\cdot )\) is. \(\square \)
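As a worked illustration of (9), which is a hand computation added here and not part of the argument, take \(\mathcal {A}={\text {div}}\) on \(\mathbb {R}^2\), so that \(W=\mathbb {R}^2\), \(X=\mathbb {R}\), and \(\mathcal {A}(\xi )=(\xi _1\ \xi _2)\). Then \(\mathcal {H}(\xi )=|\xi |^2\) has characteristic polynomial \(|\xi |^2-\lambda =(-1)(\lambda -|\xi |^2)\), hence \(a_0=1\), \(a_1(\xi )=-|\xi |^2\), and \(r=1\). Theorem 2 gives \(\mathcal {A}^\dagger (\xi )=|\xi |^{-2}\xi \), and, writing \(\xi ^\perp {:}{=}(\xi _2,-\xi _1)\),

$$\begin{aligned} \mathbb {B}(\xi )=-|\xi |^2{\text {Id}}_W+\xi \otimes \xi =-\begin{pmatrix} \xi _2^2 & -\xi _1\xi _2\\ -\xi _1\xi _2 & \xi _1^2 \end{pmatrix}=-\xi ^\perp \otimes \xi ^\perp . \end{aligned}$$

This is a 2-homogeneous polynomial with \(\mathrm {im\,}\mathbb {B}(\xi )=\mathrm {span}\{\xi ^\perp \}=\ker \mathcal {A}(\xi )\) for \(\xi \ne 0\), in agreement with (4). Note that the first order symbol \(\xi \mapsto \xi ^\perp \), viewed as a map from \(\mathbb {R}\) to \(\mathbb {R}^2\), has the same image, so the operator produced by (9) need not be of the lowest possible order.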

The necessity of the constant rank condition in Theorem 1 follows from the following Lemma and the Rank–Nullity Theorem.

Lemma 1

Let \(S\subset \mathbb {R}^n\) be a set of positive Lebesgue measure and P, Q be two matrix-valued polynomials on \(\mathbb {R}^n\). Suppose that there exists s such that

$$\begin{aligned} {\mathrm{rank}\,}P(\xi )+{\mathrm{rank}\,}Q(\xi )=s\qquad \text { for }\xi \in S. \end{aligned}$$

Then both P and Q have constant rank in S.

Proof

We abbreviate \(R_P{:}{=}{\mathrm{rank}\,}P\), \(R_Q{:}{=}{\mathrm{rank}\,}Q\) and assume for contradiction that \(R_P\) is not constant in S. Say \(R_P(S)=\{r_1,r_1+1\ldots ,r_2\}\) for natural numbers \(r_1<r_2\). We also write \(\mathrm {M}_d\) for the map that has input a matrix and returns (a vector of) all its minors of order d. In particular, \(\mathrm {M}_dP\), \(\mathrm {M}_dQ\) are vector-valued polynomials on \(\mathbb {R}^n\). We then have that

$$\begin{aligned} R_P^{-1}(\{r_1,r_1+1\ldots r_2-1\})\subset \{\xi \in \mathbb {R}^n:\mathrm {M}_{r_2}P(\xi )=0\}, \end{aligned}$$

so that either \(\mathrm {M}_{r_2}P\equiv 0\) (which is not the case by definition of \(r_2\)) or \(R_P^{-1}(\{r_1,r_1+1\ldots r_2-1\})\) is Lebesgue-null. On the other hand,

$$\begin{aligned} R_P^{-1}(\{r_2\})\cap S&=R_Q^{-1}(\{s-r_2\})\cap S\\&\subset R_Q^{-1}(\{s-r_2,s-r_2+1,\ldots s-r_1-1\})\\&\subset \{\xi \in \mathbb {R}^n:\mathrm {M}_{s-r_1}Q(\xi )=0\}, \end{aligned}$$

which is Lebesgue-null by the same argument. Since

$$\begin{aligned} S=[R_P^{-1}(\{r_1,r_1+1,\ldots r_2-1\})\cap S]\cup [R_P^{-1}(\{r_2\})\cap S], \end{aligned}$$

it follows that S is Lebesgue-null and we arrive at a contradiction. \(\square \)

It is natural to ask the reversed question, whether a constant rank operator \(\mathbb {B}\) admits an exact annihilator \(\mathcal {A}\). This is indeed the case, as can be shown by a simple modification of the argument above:

Remark 1

Let \(\mathbb {B}\) be a linear, homogeneous, differential operator of constant rank on \(\mathbb {R}^n\) from V to W. Then, we can choose \(M{:}{=}\mathbb {B}(\xi )\) for \(\xi \in \mathbb {R}^n\setminus \{0\}\) in Theorem 2, so that

$$\begin{aligned} \mathcal {A}(\xi ){:}{=}a_r(\xi )\left[ {\text {Id}}_W-\mathbb {B}(\xi )\mathbb {B}^\dagger (\xi )\right] \quad \text {for }\xi \in \mathbb {R}^n \end{aligned}$$

satisfies (4) and gives rise to a differential operator. In particular, the formula is consistent with [38, Eq. (4.3)]. This fact can be used to extend the \({\text {L}}^1\)-estimates in [9, 38] from elliptic to constant rank operators.
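For instance (a hand computation added for illustration), let \(\mathbb {B}=\mathrm {grad}\) on \(\mathbb {R}^2\), so that \(V=\mathbb {R}\), \(W=\mathbb {R}^2\), and \(\mathbb {B}(\xi )=\xi \) viewed as a column. Then \(\mathbb {B}(\xi )\mathbb {B}^*(\xi )=\xi \otimes \xi \) has characteristic polynomial \(\lambda ^2-|\xi |^2\lambda \), hence \(a_1(\xi )=-|\xi |^2\), \(a_2=0\), \(r=1\), and \(\mathbb {B}^\dagger (\xi )=|\xi |^{-2}\xi ^*\). With \(\xi ^\perp {:}{=}(\xi _2,-\xi _1)\), the formula above yields

$$\begin{aligned} \mathcal {A}(\xi )=-|\xi |^2\left[ {\text {Id}}_W-\frac{\xi \otimes \xi }{|\xi |^2}\right] =-\xi ^\perp \otimes \xi ^\perp ,\qquad \ker \mathcal {A}(\xi )=\mathrm {span}\{\xi \}=\mathrm {im\,}\mathbb {B}(\xi )\quad \text {for }\xi \ne 0. \end{aligned}$$

Since \(\mathcal {A}(\xi )w=-(\xi ^\perp \cdot w)\xi ^\perp \), this second order annihilator has the same symbol kernel as the scalar \({\text {curl}}\) in two dimensions, whose symbol is \(w\mapsto -\xi ^\perp \cdot w\).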

We conclude the discussion of algebraic properties with two remarks: Firstly, it is quite convenient that the two constructions presented are explicitly computable. On the other hand, performing the computations on simple examples, e.g., involving only \({\text {div}}\), \(\mathrm {grad}\), \({\text {curl}}\), one easily notices that the operators constructed via our formulas are often overcomplicated. Perhaps more computationally efficient methods, e.g., in the spirit of [38, Sec. 4.2], can be developed.

3 \(\mathcal {A}\)-quasiconvexity

The relevance of Theorem 1 for analysis can be seen, for instance, from the fact that periodic \(\mathcal {A}\)-free fields have differential structure:

Lemma 2

Let \(\mathcal {A}\), \(\mathbb {B}\) be linear, homogeneous, differential operators of constant rank with constant coefficients on \(\mathbb {R}^n\) from W to X, and from V to W, respectively. Assume that (4) holds. Then for all \(w\in {\text {C}}^\infty (\mathbb {T}_n,W)\) such that \(\mathcal {A}w=0\) and \(\int _{\mathbb {T}_n}w(x){\text {d}}\!x=0\), there exists \(u\in {\text {C}}^\infty (\mathbb {T}_n,V)\) such that \(w=\mathbb {B}u\). Similarly, for all \(w\in \mathscr {S}(\mathbb {R}^n,W)\) such that \(\mathcal {A}w=0\), there exists \(u\in \mathscr {S}(\mathbb {R}^n,V)\) such that \(w=\mathbb {B}u\).

Here \(\mathbb {T}_n\) denotes the n-dimensional torus, identified in an obvious way with (a quotient of) \([0,1]^n\). The Fourier transform is defined as

$$\begin{aligned} \hat{u}(\xi ){:}{=}\int _{\mathbb {T}_n} u(x){\text {e}}^{-2\pi {\text {i}}x\cdot \xi }{\text {d}}\!x, \end{aligned}$$
(10)

for \(\xi \in \mathbb {Z}^n\) and \(u\in {\text {C}}^\infty (\mathbb {T}_n)\). In addition, \(\mathscr {S}(\mathbb {R}^n)\) denotes the Schwartz class of rapidly decreasing functions on \(\mathbb {R}^n\), where the Fourier transform is defined also by (10), with the amendment that the integral is taken over \(\mathbb {R}^n\).

Proof

Let \(w\in {\text {C}}^\infty (\mathbb {T}_n,W)\) have zero average and satisfy \(\mathcal {A}w=0\), so that

$$\begin{aligned} w(x)=\sum _{\xi \in \mathbb {Z}^n\setminus \{0\}}\hat{w}(\xi ){\text {e}}^{2\pi {\text {i}}x\cdot \xi }, \end{aligned}$$

for \(x\in \mathbb {T}_n\), where the coefficients \(\hat{w}(\xi )\in \ker \mathcal {A}(\xi )\) decay faster than any polynomial as \(|\xi |\rightarrow \infty \). We define

$$\begin{aligned} u(x){:}{=}\sum _{\xi \in \mathbb {Z}^n\setminus \{0\}}\mathbb {B}^\dagger (\xi )\hat{w}(\xi ){\text {e}}^{2\pi {\text {i}}x\cdot \xi }, \end{aligned}$$

for \(x\in \mathbb {T}_n\), which is smooth by homogeneity of \(\mathbb {B}^\dagger (\cdot )\): say \(\mathbb {B}\) has order l, then \(\mathbb {B}^\dagger (\cdot )\) is \((-l)\)-homogeneous. We can thus differentiate the sum term by term to obtain

$$\begin{aligned} \mathbb {B}u(x)&=(2\pi {\text {i}})^l\sum _{\xi \in \mathbb {Z}^n\setminus \{0\}}\mathbb {B}(\xi )\mathbb {B}^\dagger (\xi )\hat{w}(\xi ){\text {e}}^{2\pi {\text {i}}x\cdot \xi }\\&=(2\pi {\text {i}})^l\sum _{\xi \in \mathbb {Z}^n\setminus \{0\}}\hat{w}(\xi ){\text {e}}^{2\pi {\text {i}}x\cdot \xi }\\&=(2\pi {\text {i}})^lw(x), \end{aligned}$$

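where the exactness relation (4) is used in the second equality, along with the geometric properties of the pseudo-inverse. Replacing u with \((2\pi {\text {i}})^{-l}u\), which is again smooth, we obtain \(w=\mathbb {B}u\), and the proof of the first case is complete.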

We give an analogous argument for the case when \(w\in \mathscr {S}(\mathbb {R}^n,W)\) is \(\mathcal {A}\)-free. We have the pointwise relation \(\mathcal {A}(\xi )\hat{w}(\xi )=0\), so that (4) implies that \(\hat{w}(\xi )\in \mathrm {im\,}\mathbb {B}(\xi )\) and we can define

$$\begin{aligned} \hat{u}(\xi ){:}{=}\mathbb {B}^\dagger (\xi )\hat{w}(\xi ), \end{aligned}$$

which satisfies the required properties. \(\square \)
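The key algebraic step of the proof, \(\mathbb {B}(\xi )\mathbb {B}^\dagger (\xi )\hat{w}(\xi )=\hat{w}(\xi )\) for \(\hat{w}(\xi )\in \ker \mathcal {A}(\xi )\), can be checked on a single Fourier mode. The sketch below assumes the illustrative pair \(\mathcal {A}={\text {div}}\) on the two-dimensional torus and the first order potential with symbol \(\mathbb {B}(\xi )=(\xi _2,-\xi _1)\); this choice, the function names, and the omission of the factors \(2\pi {\text {i}}\) are assumptions of the example.

```python
from fractions import Fraction


def perp(xi):
    """Rotate xi = (xi_1, xi_2) to (xi_2, -xi_1); this is the column B(xi)."""
    return (xi[1], -xi[0])


def in_kernel_of_div(xi, w_hat):
    """Check A(xi) w_hat = xi . w_hat = 0 for the divergence symbol."""
    return xi[0] * w_hat[0] + xi[1] * w_hat[1] == 0


def B_pinv_apply(xi, w_hat):
    """Apply B^dagger(xi) = perp(xi)^T / |xi|^2 to w_hat (a scalar result)."""
    p = perp(xi)
    numerator = Fraction(p[0] * w_hat[0] + p[1] * w_hat[1])
    return numerator / (xi[0] ** 2 + xi[1] ** 2)


def B_apply(xi, u_hat):
    """Apply the 2x1 symbol B(xi) to the scalar coefficient u_hat."""
    p = perp(xi)
    return (p[0] * u_hat, p[1] * u_hat)


xi = (3, -2)
w_hat = (-4, -6)                    # equals 2 * perp(xi), hence divergence-free
u_hat = B_pinv_apply(xi, w_hat)     # Fourier coefficient of the potential
recovered = B_apply(xi, u_hat)

# A coefficient outside the kernel is not recovered: B B^dagger projects it away.
not_free = (3, -2)
projected = B_apply(xi, B_pinv_apply(xi, not_free))
```

The coefficient `w_hat` is reproduced exactly, while the non-solenoidal coefficient `not_free` is mapped to zero, reflecting that \(\mathbb {B}(\xi )\mathbb {B}^\dagger (\xi )\) is the orthogonal projection onto \(\mathrm {im\,}\mathbb {B}(\xi )\).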

We conclude this Section by showing that one can test with compactly supported smooth maps in the definition of \(\mathcal {A}\)-quasiconvexity.

Corollary 1

Let \(\mathcal {A},\,\mathbb {B}\) be as in Lemma 2 and \(f:W\rightarrow \mathbb {R}\) be Borel measurable and locally bounded. Then

$$\begin{aligned} Q_\mathcal {A}f(\eta )&{:}{=}\inf \bigg \{\int _{\mathbb {T}_n}f(\eta +w(x)){\text {d}}\!x:w\in {\text {C}}^\infty (\mathbb {T}_n,W),\mathcal {A}w=0,\int _{\mathbb {T}_n}w(x){\text {d}}\!x=0\bigg \},\\ Q^\mathbb {B}f(\eta )&{:}{=}\inf \bigg \{\int _{[0,1]^n}f(\eta +\mathbb {B}u(x)){\text {d}}\!x:u\in {\text {C}}^\infty _c((0,1)^n,V)\bigg \} \end{aligned}$$

are equal for all \(\eta \in W\). Moreover, if \(\mathbb {B}\) has order l and \(\alpha \in [0,1)\), we have

$$\begin{aligned} Q_\mathcal {A}f(\eta )=\inf \bigg \{\int _{[0,1]^n}f(\eta +\mathbb {B}u(x)){\text {d}}\!x:u\in {\text {C}}^\infty _c((0,1)^n,V),\Vert u\Vert _{{\text {C}}^{l-1,\alpha }}<\varepsilon \bigg \} \end{aligned}$$
(11)

for any \(\eta \in W\) and \(\varepsilon >0\).

The proof follows standard arguments; in particular we follow [14, Prop. 5.13] and [17, Thm. 4.2] and include the proof for completeness of the present work.

Proof

It is obvious that \(Q_\mathcal {A}f\leqslant Q^\mathbb {B}f\). To prove the opposite inequality, let \(\varepsilon >0\), \(\eta \in W\), and w be a periodic field as in the definition of \(Q_\mathcal {A}f(\eta )\). We will construct \(v\in {\text {C}}^\infty _c((0,1)^n,V)\) such that

$$\begin{aligned} \int _{[0,1]^n}f(\eta +\mathbb {B}v(x)){\text {d}}\!x\leqslant \int _{[0,1]^n}f(\eta +w(x)){\text {d}}\!x+\varepsilon . \end{aligned}$$
(12)

By Lemma 2, we have that \(w=\mathbb {B}u\) for a periodic field \(u\in {\text {C}}^\infty (\mathbb {T}_n,V)\). Say, as before, that \(\mathbb {B}\) has order l and define \(u_N(x){:}{=}N^{-l}u(Nx)\) for N sufficiently large. This does not change the value of the integral over the cube. Next, let \(\delta >0\) be sufficiently small and truncate to obtain \(u^\delta _N{:}{=}\rho ^\delta u_N\), where \(\rho ^\delta \in {\text {C}}^\infty _c([0,1]^n)\) is such that \(\rho ^\delta (x)=1\) if \(\mathrm {dist}(x,\partial [0,1]^n)>\delta \) and \(|\nabla ^j\rho ^\delta |\leqslant C\delta ^{-j}\) for \(j=0\ldots l\) and some constant \(C>0\). We impose \(\delta N\ge 1\) and leave \(\delta \) to be determined. It follows, for \(c_1\ge 1\) depending on \(\mathbb {B}\) only, that

$$\begin{aligned} |\mathbb {B}u^\delta _N|&\leqslant |\rho ^\delta \mathbb {B}u_N|+c_1\sum _{j=1}^{l}|\nabla ^{j}\rho ^\delta ||\nabla ^{l-j} u_N|\\&\leqslant c_1C\left( \Vert \mathbb {B}u\Vert _{{\text {L}}^\infty }+\sum _{j=1}^l(\delta N)^{-j} \Vert \nabla ^{l-j}u\Vert _{{\text {L}}^\infty }\right) \\&\leqslant c_1C\left( \Vert \mathbb {B}u\Vert _{{\text {L}}^\infty }+\sum _{j=0}^{l-1}\Vert \nabla ^{j}u\Vert _{{\text {L}}^\infty }\right) {=}{:}c_1C\Vert u\Vert _{{\text {W}}^{\mathbb {B},\infty }}. \end{aligned}$$

Say f is bounded by \(M>0\) on \({\text {B}}(0,|\eta |+c_1C\Vert u\Vert _{{\text {W}}^{\mathbb {B},\infty }})\). Hence, if we choose \(\delta \) such that \(\mathscr {L}^n\left( \{x\in [0,1]^n:\mathrm {dist}(x,\partial [0,1]^n)\leqslant \delta \}\right) \leqslant M^{-1}\varepsilon \), we obtain

$$\begin{aligned} \int _{[0,1]^n}f(\eta +\mathbb {B}u^\delta _N(x)){\text {d}}\!x&\leqslant \int _{\mathrm {dist}(x,\partial [0,1]^n)<\delta }M{\text {d}}\!x+\int _{[0,1]^n}f(\eta +\mathbb {B}u_N(x)){\text {d}}\!x\\&\leqslant M\times M^{-1}\varepsilon +\int _{[0,1]^n}f(\eta +w(x)){\text {d}}\!x, \end{aligned}$$

which implies (12) with \(v{:}{=}u_N^\delta \). To prove the equality of the two envelopes, we distinguish two cases: If \(Q_\mathcal {A}f(\eta )>-\infty \), we can choose w such that

$$\begin{aligned} \int _{[0,1]^n}f(\eta +w(x)){\text {d}}\!x \leqslant Q_\mathcal {A}f(\eta )+\varepsilon , \end{aligned}$$

and we conclude that \(Q_\mathcal {A}f(\eta )=Q^\mathbb {B}f(\eta )\) by (12) since \(\varepsilon >0\) is arbitrary. If \(Q_\mathcal {A}f(\eta )=-\infty \), we choose w such that

$$\begin{aligned} \int _{[0,1]^n}f(\eta +w(x)){\text {d}}\!x\leqslant -\varepsilon ^{-1}, \end{aligned}$$

so that we can conclude by (12) that \(Q^\mathbb {B}f(\eta )=-\infty \).

To prove (11), we need only show that the infimum is smaller than the envelope. Firstly, note as above that by replacing u with \(u_N(x)=N^{-l}u(Nx)\), where u is extended by periodicity to \(\mathbb {R}^n\), the value of the integral does not change. It suffices to choose N large enough so that \(u_N\) has small \({\text {C}}^{l-1,\alpha }\)-norm. Note that for \(j=0\ldots l-1\) we have

$$\begin{aligned} \Vert \nabla ^ju_N\Vert _{\infty }=N^{j-l}\Vert \nabla ^ju\Vert _{\infty }, \end{aligned}$$

which can clearly be made arbitrarily small.

Finally, to check the Hölder bound, say that \(\{z_i+[0,N^{-1}]^n\}_{i=1}^{N^n}\) is a covering of \([0,1]^n\) by cubes of side-length \(N^{-1}\) that can only touch at their boundaries and let \(x,y\in [0,1]^n\). If x, y lie in the same cube \(z_i+[0,N^{-1}]^n\), we have that

$$\begin{aligned} |\nabla ^{l-1}u_N(x)-\nabla ^{l-1}u_N(y)|&=N^{-1}|\nabla ^{l-1}u(Nx-z_i)-\nabla ^{l-1}u(Ny-z_i)|\\&\leqslant \Vert \nabla ^l u\Vert _{\infty }|x-y|\\&\leqslant (\sqrt{n}N^{-1})^{1-\alpha }\Vert \nabla ^l u\Vert _{\infty }|x-y|^\alpha , \end{aligned}$$

which can be made small since \(1-\alpha >0\). If x, y lie in different cubes, which we label \(Q_x,Q_y\), let \(\bar{x}\in \partial Q_x\cap (x,y)\), \(\bar{y}\in \partial Q_y\cap (x,y)\), so that \(|x-y|\ge |x-\bar{x}|+|y-\bar{y}|\), \(|x-\bar{x}|,|y-\bar{y}|\leqslant \sqrt{n}N^{-1}\), and all derivatives of \(u_N\) vanish near \(\bar{x},\bar{y}\). Using these facts and the previous step we get

$$\begin{aligned} |\nabla ^{l-1}u_N(x)-\nabla ^{l-1}u_N(y)|&\leqslant |\nabla ^{l-1}u_N(x)-\nabla ^{l-1}u_N(\bar{x})|\\&\quad +|\nabla ^{l-1}u_N(y)-\nabla ^{l-1}u_N(\bar{y})|\\&\leqslant (\sqrt{n}N^{-1})^{1-\alpha }\Vert \nabla ^l u\Vert _{\infty }\left( |x-\bar{x}|^\alpha +|y-\bar{y}|^\alpha \right) \\&\leqslant (\sqrt{n}N^{-1})^{1-\alpha }\Vert \nabla ^l u\Vert _{\infty }2^{1-\alpha }|x-y|^\alpha , \end{aligned}$$

where the last inequality follows by concavity and monotonicity of \(0\leqslant t\mapsto t^\alpha \). The proof is complete. \(\square \)
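The scaling \(u_N(x)=N^{-l}u(Nx)\) used twice in the proof can be checked numerically in one dimension. The sketch below assumes the illustrative choices \(l=2\) and \(u(x)=\sin (2\pi x)\), approximates derivatives by central differences, and compares the sup norms of \(u_N\) and \(u_N'\) on a grid with the predicted values \(N^{j-l}\Vert \nabla ^ju\Vert _\infty \) for \(j=0,1\). All names and parameters are assumptions of the example.

```python
import math

L_ORDER = 2  # assumed order l of the operator B (illustrative choice)


def u(x):
    """A smooth 1-periodic function with sup|u| = 1 and sup|u'| = 2*pi."""
    return math.sin(2 * math.pi * x)


def u_N(x, N):
    """Rescaled function u_N(x) = N^(-l) u(N x)."""
    return N ** (-L_ORDER) * u(N * x)


def derivative(g, h=1e-6):
    """Central finite-difference approximation of g'."""
    return lambda x: (g(x + h) - g(x - h)) / (2 * h)


def sup_norm(g, N, points_per_period=8):
    """Maximum of |g| over a uniform grid of [0, 1] resolving all N periods."""
    n = points_per_period * N
    return max(abs(g(i / n)) for i in range(n + 1))


def scaling_check(N):
    """Return (sup|u_N|, sup|u_N'|) measured on the grid."""
    g = lambda x: u_N(x, N)
    return sup_norm(g, N), sup_norm(derivative(g), N)


measured = {N: scaling_check(N) for N in (1, 5, 10)}
predicted = {N: (N ** (0 - L_ORDER) * 1.0, N ** (1 - L_ORDER) * 2 * math.pi)
             for N in (1, 5, 10)}
```

Within the accuracy of the finite differences, the measured values agree with the predicted ones, so both norms decay as N grows while \(u_N''(x)=u''(Nx)\) keeps its size; this is why the integral in (11) is unaffected by the rescaling.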

Remark 2

Using the argument in Corollary 1, one can show for constant rank operators \(\mathcal {A}\) that \(\mathcal {A}\)-quasiconvexity, as defined by Fonseca and Müller in [16, Def. 3.1], coincides with \(\mathcal {A}\)-\(\mathbb {B}\)-quasiconvexity, as introduced by Dacorogna in [12, 13] (to be precise, in the original definition of \(\mathcal {A}\)-\(\mathbb {B}\)-quasiconvexity, the operator \(\mathbb {B}\) is assumed to be of first order, but this is only a minor technical restriction). In this case, it is not difficult to prove that [13, Thm. 4] is essentially unconditional. A proof of this fact will be given elsewhere.

We also have that \(\mathcal {A}\)-quasiconvexity can be defined by integrals over arbitrary domains, instead of cubes.

Lemma 3

Let \(\mathcal {A},\,\mathbb {B}\) be as in Lemma 2 and \(f:W\rightarrow \mathbb {R}\) be Borel measurable, locally bounded, and \(\mathcal {A}\)-quasiconvex, and \(\varOmega \) be a bounded open set. Then

$$\begin{aligned} f(\eta )\leqslant \fint _\varOmega f(\eta +\mathbb {B}v(y)){\text {d}}\!y \end{aligned}$$

for all \(\eta \in W\) and \(v\in {\text {C}}^\infty _c(\varOmega ,V)\).

The proof follows from a simple argument in the Calculus of Variations [14, Prop. 5.11].

Proof

Fix \(\eta \in W\), \(v\in {\text {C}}^\infty _c(\varOmega ,V)\), extended by zero to \(\mathbb {R}^n\). Writing \(C{:}{=}(0,1)^n\), the argument in the proof of Corollary 1 gives that

$$\begin{aligned} f(\eta )\leqslant \int _C f(\eta +\mathbb {B}u(x) ){\text {d}}\!x \end{aligned}$$

for all \(u\in {\text {C}}^\infty _c(C,V)\). For sufficiently small \(\varepsilon >0\), we can find \(x_0\in \mathbb {R}^n\) such that \(x_0+\varepsilon \varOmega \subset C\). We define

$$\begin{aligned} u(x){:}{=}\varepsilon ^l v\left( \dfrac{x-x_0}{\varepsilon }\right) , \end{aligned}$$

so that

$$\begin{aligned} f(\eta )&\leqslant \int _C f(\eta +\mathbb {B}u(x)){\text {d}}\!x=|C\setminus (x_0+\varepsilon \varOmega )|f(\eta )+\int _{x_0+\varepsilon \varOmega }f(\eta +\mathbb {B}u(x)){\text {d}}\!x\\&=(1-\varepsilon ^n|\varOmega |)f(\eta )+\int _\varOmega f(\eta +\mathbb {B}v(y))\varepsilon ^n{\text {d}}\!y. \end{aligned}$$
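
The two equalities in this display can be spelled out as follows (a brief sketch, assuming, as in Lemma 2, that \(\mathbb {B}\) is homogeneous of order l with constant coefficients). The first equality uses only that \(u\) is supported in \(x_0+\varepsilon \varOmega \), so that \(\mathbb {B}u=0\) on \(C\setminus (x_0+\varepsilon \varOmega )\). For the second, the substitution \(y=(x-x_0)/\varepsilon \) gives \({\text {d}}\!x=\varepsilon ^n{\text {d}}\!y\), while the factor \(\varepsilon ^l\) in the definition of u compensates for the l derivatives:

$$\begin{aligned} \mathbb {B}u(x)=\varepsilon ^{l}\varepsilon ^{-l}(\mathbb {B}v)\left( \dfrac{x-x_0}{\varepsilon }\right) =\mathbb {B}v(y),\qquad |C\setminus (x_0+\varepsilon \varOmega )|=1-\varepsilon ^n|\varOmega |. \end{aligned}$$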

Rearranging the terms we obtain the conclusion. \(\square \)

4 \(\mathcal {A}\)-free Young measures

We recall the definition of oscillation Young measures, while also giving a simplified variant of the Fundamental Theorem of Young measures.

Theorem 3

(FTYM, [26, 27]) Let \(\varOmega \subset \mathbb {R}^n\) be a bounded, open set and \(z_j\in {\text {L}}^1(\varOmega ,\mathbb {R}^d)\) be a bounded sequence in \({\text {L}}^1\). Then there exists a subsequence (not relabeled) and a weakly-* measurable map \(\varvec{\nu }:\varOmega \rightarrow \mathcal {P}({\mathbb {R}^d})\) (or parametrized measure \(\varvec{\nu }=(\nu _x)_{x\in \varOmega }\)) such that for all \(f\in {\text {C}}(\varOmega \times \mathbb {R}^d)\) we have that

$$\begin{aligned} \liminf _{j\rightarrow \infty }\int _\varOmega f(x,z_j(x)){\text {d}}\!x\ge \int _{\varOmega }\langle f(x,\cdot \,),\nu _x\rangle {\text {d}}\!x \end{aligned}$$

Moreover,

$$\begin{aligned} \lim _{j\rightarrow \infty }\int _\varOmega f(x,z_j(x)){\text {d}}\!x= \int _{\varOmega }\langle f(x,\cdot \,),\nu _x\rangle {\text {d}}\!x \end{aligned}$$

if and only if the sequence \(f(\,\cdot ,z_j)\) is uniformly integrable.

Above, \(\mathcal {P}(\mathbb {R}^d)\) denotes the space of probability measures on \(\mathbb {R}^d\). In the notation of Theorem 3, we say that \(z_j\) generates the Young measure \(\varvec{\nu }\) (in symbols, \(z_j\overset{\mathbf {Y}}{\rightarrow }\varvec{\nu }\)). We also recall that a sequence \(z_j\) is said to be uniformly integrable if and only if for all \(\varepsilon >0\), there exists \(\delta >0\) such that for all Borel sets \(E\subset \varOmega \), we have that

$$\begin{aligned} \mathscr {L}^n(E)<\delta \implies \sup _j\int _E|z_j|{\text {d}}\!x<\varepsilon , \end{aligned}$$

or, equivalently, if

$$\begin{aligned} \lim _{\alpha \rightarrow \infty }\sup _j\int _{\{|z_j|>\alpha \}}|z_j|{\text {d}}\!x=0. \end{aligned}$$

If \(|z_j|^p\) is uniformly integrable, we say that \(z_j\) is p-uniformly integrable.
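
As a simple illustration (a standard fact, recalled here only for orientation): if \(z_j\) is bounded in \({\text {L}}^r(\varOmega ,\mathbb {R}^d)\) for some \(r>1\), then it is uniformly integrable, since \(|z_j|\leqslant \alpha ^{1-r}|z_j|^r\) on \(\{|z_j|>\alpha \}\) and hence

$$\begin{aligned} \sup _j\int _{\{|z_j|>\alpha \}}|z_j|{\text {d}}\!x\leqslant \alpha ^{1-r}\sup _j\int _\varOmega |z_j|^r{\text {d}}\!x\rightarrow 0\quad \text { as }\alpha \rightarrow \infty . \end{aligned}$$

Applied to \(|z_j|^p\), the same computation shows that a sequence bounded in \({\text {L}}^r\) for some \(r>p\) is p-uniformly integrable.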

Lemma 4

[16, Prop. 2.4] Let \(z_j\) generate a Young measure \(\varvec{\nu }\) and \(\tilde{z}_j\rightarrow \tilde{z}\) in measure. Then \(z_j+\tilde{z}_j\) generates the Young measure \(\varvec{\mu }\) given by \(\mu _x=\nu _x\star \delta _{\tilde{z}(x)}\) for \(\mathscr {L}^n\)-a.e. \(x\), i.e.,

$$\begin{aligned} \langle \varphi ,\mu _x\rangle =\langle \varphi (\,\cdot +\tilde{z}(x)),\nu _x\rangle \end{aligned}$$

for any \(\varphi \in {\text {C}}_0\).
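
For instance (a standard illustrative example; the specific sequences are chosen here only for concreteness), on \(\varOmega =(0,1)\) the oscillating sequence \(z_j(x){:}{=}\mathrm {sign}(\sin (2\pi jx))\) generates the homogeneous Young measure \(\nu _x=\tfrac{1}{2}\delta _{-1}+\tfrac{1}{2}\delta _{1}\). Taking \(\tilde{z}_j(x)=\tilde{z}(x){:}{=}x\), Lemma 4 then yields that \(z_j+\tilde{z}_j\) generates the Young measure determined by

$$\begin{aligned} \langle \varphi ,\mu _x\rangle =\tfrac{1}{2}\varphi (x-1)+\tfrac{1}{2}\varphi (x+1),\quad \text { that is, }\quad \mu _x=\tfrac{1}{2}\delta _{x-1}+\tfrac{1}{2}\delta _{x+1}. \end{aligned}$$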

The following is an extension of [16, Lem. 2.15]. The first two steps of the present proof are almost a repetition of their arguments, which we include since the original proof only covers first order annihilators \(\mathcal {A}\).

Proposition 1

Let \(\mathcal {A}\), \(\mathbb {B}\) be as in Lemma 2 and have orders k, l, respectively, \(\varOmega \subset \mathbb {R}^n\) be a bounded Lipschitz domain, and \(1<p<\infty \). Let \(w_j,w\in {\text {L}}^p(\varOmega ,W)\) be such that

$$\begin{aligned} w_j\rightharpoonup w&\text { in }{\text {L}}^p(\varOmega ,W),\\ \mathcal {A}w_j\rightarrow \mathcal {A}w&\text { in }{\text {W}}^{-k,p}_{{\text {loc}}}(\varOmega ,X),\\ w_j\overset{\mathbf {Y}}{\rightarrow }\varvec{\nu }. \end{aligned}$$

Then there exists a sequence \(u_j\in {\text {C}}^\infty _c({\varOmega },V)\) such that

$$\begin{aligned} \mathbb {B}u_j\rightharpoonup 0&\text { in }{\text {L}}^p(\varOmega ,W),\\ \mathbb {B}u_j+w\overset{\mathbf {Y}}{\rightarrow }\varvec{\nu }. \end{aligned}$$

Moreover, \(u_j\) can be chosen such that \((\mathbb {B}u_j)_j\) is p-uniformly integrable.

A Young measure \(\varvec{\nu }\) satisfying the assumptions of Proposition 1 is said to be an \(\mathcal {A}\)-free Young measure.

Proof

By Lemma 4 and linearity we can assume that \(w=0\). We will identify maps defined on \(\varOmega \) with their extensions by zero to full-space without mention. Uniform integrability considerations strictly refer to sequences defined on \(\varOmega \).

Step I. We construct p-uniformly integrable \(\tilde{w}_j\in {\text {C}}^\infty _c(\varOmega ,W)\) such that \(\tilde{w}_j\rightharpoonup 0\) in \({\text {L}}^p(\varOmega ,W)\), \(\mathcal {A}\tilde{w}_j\rightarrow 0\) in \({\text {W}}^{-k,q}(\mathbb {R}^n,X)\) for some \(1<q<p\), and \(\tilde{w}_j\) generates \(\varvec{\nu }\).

Recall the truncation operators, defined for \(\alpha >0\) by

$$\begin{aligned} \tau _\alpha A{:}{=}{\left\{ \begin{array}{ll} A&{}\text { if }|A|\leqslant \alpha \\ \alpha A/|A|&{}\text { if }|A|>\alpha , \end{array}\right. } \end{aligned}$$

which are clearly Carathéodory integrands. By Theorem 3, we have that

$$\begin{aligned} \lim _{\alpha \rightarrow \infty }\lim _{j\rightarrow \infty }\int _\varOmega |\tau _\alpha w_j|^p{\text {d}}\!x&=\lim _{\alpha \rightarrow \infty }\int _{\varOmega }\int _{W}|\tau _\alpha A|^p{\text {d}}\!\nu _x(A){\text {d}}\!x\\&=\int _{\varOmega }\int _{W}|A|^p{\text {d}}\!\nu _x(A){\text {d}}\!x<\infty , \end{aligned}$$

so that we can choose a diagonal subsequence \(\alpha _j\uparrow \infty \) such that \(\int _\varOmega |\tau _{\alpha _j}w_j|^p{\text {d}}\!x\) converges to the pth moment of \(\varvec{\nu }\). It also follows from Theorem 3 that \((\tau _{\alpha _j}w_j)_j\) is p-uniformly integrable.

We now show that \(\tau _{\alpha _j}w_j\) generates \(\varvec{\nu }\). Since \(w_j\) converges weakly in \({\text {L}}^p(\varOmega ,W)\), it converges weakly in \({\text {L}}^1\), hence is uniformly integrable, so that \(\tau _{\alpha _j}w_j-w_j\rightarrow 0\) in measure. It also follows by elementary manipulations that \(\tau _{\alpha _j}w_j-w_j\rightharpoonup 0\) in \({\text {L}}^p\), so that, indeed, \(\tau _{\alpha _j}w_j\) generates \(\varvec{\nu }\) by Lemma 4.

Let \(1<q<p\). We have that

$$\begin{aligned} \Vert \tau _{\alpha _j}w_j-w_j\Vert _{{\text {L}}^q(\varOmega ,W)}^q\leqslant \int _{\{|w_j|>\alpha _j\}}2^q|w_j|^q{\text {d}}\!x\leqslant 2^q\alpha _j^{q-p}\int _{\{|w_j|>\alpha _j\}}|w_j|^p{\text {d}}\!x\rightarrow 0, \end{aligned}$$

so that \(\mathcal {A}\tau _{\alpha _j}w_j\rightarrow 0\) in \({\text {W}}^{-k,q}_{{\text {loc}}}(\varOmega ,X)\). We also record that \(\tau _{\alpha _j}w_j\) is precompact in \({\text {W}}^{-1,q}(\varOmega ,W)\), so that \(D^\beta \tau _{\alpha _j} w_j\rightarrow 0\) in \({\text {W}}^{-k,q}(\varOmega ,X)\) for \(|\beta |<k\).
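
The second inequality in the \({\text {L}}^q\) estimate above uses only that \(q<p\). As a brief sketch, on the set \(\{|w_j|>\alpha _j\}\) we have

$$\begin{aligned} |w_j|^q=|w_j|^{q-p}|w_j|^p\leqslant \alpha _j^{q-p}|w_j|^p, \end{aligned}$$

and the integral of the right-hand side tends to zero because \(\alpha _j^{q-p}\rightarrow 0\), while \(\sup _j\Vert w_j\Vert _{{\text {L}}^p(\varOmega ,W)}<\infty \) by weak convergence.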

We can therefore choose a sequence of cut-off functions \(\rho _j\in {\text {C}}^\infty _c(\varOmega ,[0,1])\) such that \(\rho _j\uparrow 1\) in \(\varOmega \) and \(\Vert \rho _j \mathcal {A}\tau _{\alpha _j}w_j\Vert _{{\text {W}}^{-k,q}(\mathbb {R}^n,X)}\rightarrow 0\) and

$$\begin{aligned} \mathcal {A}(\rho _j\tau _{\alpha _j}w_j)=\rho _j\mathcal {A}\tau _{\alpha _j}w_j+\sum _{m=1}^k B_m[D^m\rho _j,D^{k-m}\tau _{\alpha _j}w_j]\rightarrow 0\quad \text { in }{\text {W}}^{-k,q}(\mathbb {R}^n,X), \end{aligned}$$

where \(B_m\) are fixed bi-linear pairings given by the Leibniz rule. To see that this is possible, consider \(\varOmega _s{:}{=}\{x\in \varOmega :\mathrm {dist}(x,\partial \varOmega )<s\}\) for \(s>0\), and let \(s_j\downarrow 0\) be a sequence that will be determined. We require that \(\rho _j=1\) in \(\varOmega \setminus \varOmega _{s_j}\), \(\rho _j=0\) in \(\varOmega _{s_{j}/2}\) and \(|D^{m}\rho _j|\leqslant cs_j^{-m}\), \(m=1,\ldots ,k\). It is easy to see that the sum above is controlled in \({\text {W}}^{-k,q}\) by

$$\begin{aligned} \sum _{m=1}^k\Vert D^m\rho _j\Vert _{{\text {L}}^\infty }\Vert D^{k-m}\tau _{\alpha _j}w_j\Vert _{{\text {W}}^{-k,q}}\leqslant c\sum _{m=1}^ks_j^{-m}\Vert D^{k-m}\tau _{\alpha _j}w_j\Vert _{{\text {W}}^{-k,q}}, \end{aligned}$$

so that it suffices to choose any \(s_j\ge \max _{m=1,\ldots ,k}\Vert D^{k-m}\tau _{\alpha _j}w_j\Vert _{{\text {W}}^{-k,q}}^{1/(2m)}\downarrow 0\) as \(j\rightarrow \infty \). Alternatively, one can consider a different cut-off sequence \(\rho _i\uparrow 1\) and employ a diagonalization argument.
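
To check that such a choice of \(s_j\) works (a short verification, where \(a_{j,m}\) is a shorthand introduced here for \(\Vert D^{k-m}\tau _{\alpha _j}w_j\Vert _{{\text {W}}^{-k,q}}\)): since \(s_j\ge a_{j,m}^{1/(2m)}\), we have \(s_j^{-m}\leqslant a_{j,m}^{-1/2}\) whenever \(a_{j,m}>0\), and the terms with \(a_{j,m}=0\) vanish, so that

$$\begin{aligned} c\sum _{m=1}^ks_j^{-m}a_{j,m}\leqslant c\sum _{m=1}^ka_{j,m}^{1/2}\rightarrow 0\quad \text { as }j\rightarrow \infty , \end{aligned}$$

by the convergence \(D^\beta \tau _{\alpha _j}w_j\rightarrow 0\) in \({\text {W}}^{-k,q}\) for \(|\beta |<k\) recorded above.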

We define

$$\begin{aligned} \tilde{w}_j{:}{=}(\rho _j\tau _{\alpha _j}w_j)\star \eta _{\varepsilon (j)}, \end{aligned}$$

where \(\eta _{\varepsilon (j)}\) denotes a standard sequence of (radial, positive) mollifiers and \(\varepsilon (j)\downarrow 0\) is such that \(\tilde{w}_j\in {\text {C}}^\infty _c(\varOmega ,W)\) and, therefore, \(\mathcal {A}\tilde{w}_j\rightarrow 0\) in \({\text {W}}^{-k,q}(\mathbb {R}^n,X)\). The latter convergence follows since, for all \(\varphi \in {\text {C}}^\infty _c(\mathbb {R}^n,W)\) with \(\Vert \varphi \Vert _{{\text {W}}^{k,q}}\leqslant 1\),

$$\begin{aligned} \langle \mathcal {A}\tilde{w}_j,\varphi \rangle&=\langle \mathcal {A}(\rho _j\tau _{\alpha _j}w_j),\varphi \star \eta _{\varepsilon (j)}\rangle \leqslant \Vert \mathcal {A}(\rho _j\tau _{\alpha _j}w_j)\Vert _{{\text {W}}^{-k,q}}\Vert \varphi \star \eta _{\varepsilon (j)}\Vert _{{\text {W}}^{k,q}}\\&\leqslant \Vert \mathcal {A}(\rho _j\tau _{\alpha _j}w_j)\Vert _{{\text {W}}^{-k,q}}\rightarrow 0. \end{aligned}$$

It is also clear that \(\Vert \tilde{w}_j-\tau _{\alpha _j}w_j\Vert _{{\text {L}}^p}\rightarrow 0\), so that \(\tilde{w}_j\) is p-uniformly integrable, converges weakly to 0 in \({\text {L}}^p\), and generates \(\varvec{\nu }\).

Step II. We project \(\tilde{w}_j\) on the kernel of \(\mathcal {A}\) in \(\mathbb {R}^n\) and show that \(\mathbb {P}\tilde{w}_j\) are p-uniformly integrable in \(\varOmega \), converge weakly to zero in \({\text {L}}^p\), and generate \(\varvec{\nu }\). Here the \({\text {L}}^2\)-orthogonal projection operator \(\mathbb {P}\) is given by the multiplier in (7),

$$\begin{aligned} \widehat{\mathbb {P}w}(\xi ){:}{=}\mathbb {P}(\xi )\hat{w}(\xi )=[{\text {Id}}_W-\mathcal {A}^\dagger (\xi )\mathcal {A}(\xi )]\hat{w}(\xi )\quad \text { for }w\in \mathscr {S}(\mathbb {R}^n,W). \end{aligned}$$

Since the symbol \(\mathbb {P}(\cdot )\) is homogeneous of degree zero, \(\mathbb {P}\) is a singular integral operator of convolution type; in particular \(\mathbb {P}\) maps Schwartz functions to Schwartz functions. Moreover, we have that

$$\begin{aligned} \mathscr {F}\left( \tilde{w}_j-\mathbb {P}\tilde{w}_j\right) (\xi )=\mathcal {A}^\dagger (\xi )\mathcal {A}(\xi )\mathscr {F}\tilde{w}_j(\xi )=\mathcal {A}^\dagger \left( \dfrac{\xi }{|\xi |}\right) \dfrac{\widehat{\mathcal {A}\tilde{w}_j}(\xi )}{|\xi |^k}, \end{aligned}$$

so that, by boundedness of singular integrals on \({\text {L}}^q\)

$$\begin{aligned} \Vert \tilde{w}_j-\mathbb {P}\tilde{w}_j\Vert _{{\text {L}}^q(\mathbb {R}^n,W)}\leqslant c\left\| \mathscr {F}^{-1}\left( \frac{\widehat{\mathcal {A}\tilde{w}_j}}{|\cdot |^k}\right) \right\| _{{\text {L}}^q(\mathbb {R}^n,X)}=c\Vert \mathcal {A}\tilde{w}_j\Vert _{{\text {W}}^{-k,q}(\mathbb {R}^n,X)}\rightarrow 0. \end{aligned}$$
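
The multiplier identity used here can be checked as follows (a sketch, assuming that \(\mathcal {A}^\dagger (\cdot )\) is homogeneous of degree \(-k\), as is the case for the pseudo-inverse of a symbol that is homogeneous of degree k, and that the Fourier transform is normalized so that \(\widehat{\mathcal {A}w}(\xi )=\mathcal {A}(\xi )\hat{w}(\xi )\), with constant factors absorbed into \(\mathcal {A}(\xi )\)):

$$\begin{aligned} [{\text {Id}}_W-\mathbb {P}(\xi )]\mathscr {F}\tilde{w}_j(\xi )=\mathcal {A}^\dagger (\xi )\mathcal {A}(\xi )\mathscr {F}\tilde{w}_j(\xi )=|\xi |^{-k}\mathcal {A}^\dagger \left( \dfrac{\xi }{|\xi |}\right) \widehat{\mathcal {A}\tilde{w}_j}(\xi ). \end{aligned}$$

The factor \(\mathcal {A}^\dagger (\xi /|\xi |)\) is homogeneous of degree zero, which is what allows the use of boundedness of singular integrals, while \(|\xi |^{-k}\) accounts for the negative Sobolev norm.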

It immediately follows by Lemma 4 that \(\mathbb {P}\tilde{w}_j\) generates \(\varvec{\nu }\). To see that \(\mathbb {P}\tilde{w}_j\rightharpoonup 0\) in \({\text {L}}^p(\varOmega ,W)\), we note that, since \(\mathbb {P}\) is (pointwise) self-adjoint, we have, for any \(g\in {\text {L}}^{p/(p-1)}(\varOmega ,W)\),

$$\begin{aligned} \int _\varOmega \langle g,\mathbb {P}\tilde{w}_j\rangle {\text {d}}\!x =\int _\varOmega \langle \mathbb {P}g,\tilde{w}_j\rangle {\text {d}}\!x\rightarrow 0, \end{aligned}$$

since \(\mathbb {P}g\in {\text {L}}^{p/(p-1)}(\varOmega ,W)\) by boundedness of singular integrals.

To see that \(\mathbb {P}\tilde{w}_j\) is p-uniformly integrable, we use the idea in [16, Lem. 2.14.(iv)]. We first note, by boundedness of \(\mathbb {P}\) on \({\text {L}}^p\), that

$$\begin{aligned} \sup _j\Vert \mathbb {P}\tilde{w}_j-\mathbb {P}\tau _\alpha \tilde{w}_j\Vert _{{\text {L}}^p(\mathbb {R}^n,W)}\leqslant c\sup _j\Vert \tilde{w}_j-\tau _\alpha \tilde{w}_j\Vert _{{\text {L}}^p(\mathbb {R}^n,W)}\rightarrow 0\quad \text { as }\alpha \rightarrow \infty \end{aligned}$$

by p-uniform integrability of \(\tilde{w}_j\). Note that for each fixed \(\alpha \), \(\mathbb {P}\tau _\alpha \tilde{w}_j\) is bounded in \({\text {L}}^r\) for any \(p<r<\infty \), hence is p-uniformly integrable. Let \(\varepsilon >0\). We choose \(\alpha >0\) such that

$$\begin{aligned} \sup _j\Vert \mathbb {P}\tilde{w}_j-\mathbb {P}\tau _{\alpha } \tilde{w}_j\Vert _{{\text {L}}^p(\mathbb {R}^n,W)}<\varepsilon \end{aligned}$$

and also choose \(\delta >0\) such that for each Borel set \(E\subset \varOmega \) with \(\mathscr {L}^n(E)<\delta \), we have that \(\int _E|\mathbb {P}\tau _\alpha \tilde{w}_j|^p{\text {d}}\!x<\varepsilon ^p\) for all j. It follows that for all such E,

$$\begin{aligned} \int _E |\mathbb {P}\tilde{w}_j|^p{\text {d}}\!x\leqslant 2^{p-1}\left( \sup _j\int _E |\mathbb {P}\tilde{w}_j-\mathbb {P}\tau _\alpha \tilde{w}_j|^p{\text {d}}\!x+\sup _j\int _E |\mathbb {P}\tau _\alpha \tilde{w}_j|^p{\text {d}}\!x\right) <(2\varepsilon )^p, \end{aligned}$$

where the right hand side is independent of j. The second step is concluded.
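
The factor \(2^{p-1}\) in the last estimate comes from the elementary convexity inequality (a standard fact, recalled here for convenience), valid for \(a,b\in W\) and \(p\ge 1\),

$$\begin{aligned} |a+b|^p=2^p\left| \dfrac{a+b}{2}\right| ^p\leqslant 2^{p-1}\left( |a|^p+|b|^p\right) , \end{aligned}$$

applied pointwise with \(a=\mathbb {P}\tilde{w}_j-\mathbb {P}\tau _\alpha \tilde{w}_j\) and \(b=\mathbb {P}\tau _\alpha \tilde{w}_j\).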

Step III. Using Lemma 2, we can write \(\mathbb {P}\tilde{w}_j=\mathbb {B}u_j\), where \( \hat{u}_j(\xi ){:}{=}\mathbb {B}^\dagger (\xi )\widehat{\mathbb {P}\tilde{w}_j}(\xi ) \), so that \(u_j\in \mathscr {S}(\mathbb {R}^n,V)\). It remains to cut off \(u_j\) suitably.

Since \(\mathbb {B}\) has order l, we first note that

$$\begin{aligned} \widehat{D^lu}(\xi )=\mathbb {B}^\dagger (\xi )\widehat{\mathbb {B}u}(\xi )\otimes \xi ^{\otimes l}, \end{aligned}$$

so that \(\mathbb {B}u\mapsto D^{l}u\) is a singular integral operator of convolution type. It follows that \(D^l u_j\) is bounded in \({\text {L}}^p(\mathbb {R}^n)\) (recall here that \(\mathbb {B}u_j=\mathbb {P}\tilde{w}_j\) is bounded in \({\text {L}}^p\) as \(\tilde{w}_j\in {\text {C}}^\infty _c(\varOmega ,W)\) is a weakly convergent sequence), so \(u_j\) is bounded in \({\text {W}}^{l,p}(\varOmega ,V)\).

By compactness of the embedding \({\text {W}}^{l,p}(\varOmega )\hookrightarrow {\text {W}}^{l-1,p}(\varOmega )\), we have \(u_j\rightarrow u\) in \({\text {W}}^{l-1,p}(\varOmega ,V)\). Since \(\mathbb {B}u_j\rightharpoonup 0\), we have that \(\mathbb {B}u=0\). On the other hand, \(u=\mathscr {F}^{-1}[\mathbb {B}^\dagger (\cdot )]\star (\mathbb {B}u)=0\), so that \(D^{l-m}u_j\rightarrow 0\) in \({\text {L}}^p(\varOmega )\) for \(m=1,\ldots ,l\).

We now proceed similarly to Step I. Let \(\rho _j\in {\text {C}}^\infty _c(\mathbb {R}^n)\) be such that \(\rho _j=1\) in \(\varOmega \setminus \varOmega _{s_j}\) and \(|D^{m}\rho _j|\leqslant cs_j^{-m}\), \(m=1,\ldots ,l\), where

$$\begin{aligned} s_j{:}{=}\max _{m=1,\ldots ,l}\Vert D^{l-m}u_j\Vert _{{\text {L}}^p(\varOmega )}^{1/(2m)}\rightarrow 0. \end{aligned}$$

We can then estimate

$$\begin{aligned} \Vert \mathbb {B}u_j-\mathbb {B}(\rho _j u_j)\Vert _{{\text {L}}^p(\varOmega )}&\leqslant \Vert (1-\rho _j)\mathbb {B}u_j\Vert _{{\text {L}}^p(\varOmega )}+\sum _{m=1}^l\Vert B_m[D^m\rho _j,D^{l-m}u_j]\Vert _{{\text {L}}^p(\varOmega )}\\&\leqslant \Vert \mathbb {B}u_j\Vert _{{\text {L}}^p(\varOmega _{s_j})}+c\sum _{m=1}^ls^{-m}_j\Vert D^{l-m}u_j\Vert _{{\text {L}}^p(\varOmega )}, \end{aligned}$$

which tends to zero by p-uniform integrability of \(\mathbb {B}u_j\) and the choice of \(s_j\). Here \(B_m\) is another collection of bi-linear pairings given by the product rule. Since \(\mathbb {B}u_j-\mathbb {B}(\rho _ju_j)\rightarrow 0\) in \({\text {L}}^p(\varOmega ,W)\), it follows from Step II and Lemma 4 that \(\mathbb {B}(\rho _ju_j)\) converges weakly to zero in \({\text {L}}^p(\varOmega ,W)\), is p-uniformly integrable, and generates \(\varvec{\nu }\). The proof is complete. \(\square \)