1 Introduction

For \(n \in \mathbb {N}\), consider a finite set \(A \subseteq \mathbb {R}^n\). We study continuous functions \(u :\mathbb {R}^n \rightarrow \mathbb {R}\) such that the weak gradient \(\nabla u\) satisfies \(\nabla u \in \mathrm {BV}_\mathrm {loc}(\mathbb {R}^n; \mathbb {R}^n)\) and \(\nabla u(x) \in A\) for almost every \(x \in \mathbb {R}^n\). This means that whenever \(\Omega \subseteq \mathbb {R}^n\) is open and bounded, the sets \(\left\{ x \in \Omega :\nabla u(x) = a \right\} \), for \(a \in A\), form a Caccioppoli partition of \(\Omega \) as discussed, e.g., by Ambrosio et al. [1,  Sect. 4.4]. The theory of Caccioppoli partitions therefore applies and gives some information on the structure of \(\nabla u\) and of u. The fact that we are dealing with a gradient, however, gives rise to a better theory, especially under additional assumptions on the geometry of A. We work with the following notions in this paper.

Definition 1

A set \(A \subset \mathbb {R}^n\) is called convex independent if no \(a \in A\) belongs to the convex hull of \(A \setminus \{a\}\). It is called affinely independent if no \(a \in A\) belongs to the affine span of \(A \setminus \{a\}\).

If either of these conditions is satisfied, then we can prove statements on the regularity of u that finite Caccioppoli partitions do not share in general. In fact, we will see that u is locally piecewise affine away from a closed, countably \({\mathcal {H}}^{n - 1}\)-rectifiable set (if A is convex independent) or away from a closed \({\mathcal {H}}^{n - 1}\)-null set (if A is affinely independent).

In order to make this more precise, we introduce some notation. Given \(r > 0\) and \(x \in \mathbb {R}^n\), we write \(B_r(x)\) for the open ball of radius r centred at x. Given \(a \in \mathbb {R}^n\), the function \(\lambda _a :\mathbb {R}^n \rightarrow \mathbb {R}\) is defined by \(\lambda _a(x) = a \cdot x\) for \(x \in \mathbb {R}^n\). Given two functions \(v, w :\mathbb {R}^n \rightarrow \mathbb {R}\), we write \(v \wedge w\) and \(v \vee w\), respectively, for the functions with \((v \wedge w)(x) = \min \{v(x), w(x)\}\) and \((v \vee w)(x) = \max \{v(x), w(x)\}\) for \(x \in \mathbb {R}^n\).

Definition 2

Given a function \(u :\mathbb {R}^n \rightarrow \mathbb {R}\), the regular set of u, denoted by \({\mathcal {R}}(u)\), consists of all \(x \in \mathbb {R}^n\) such that there exist \(a, b \in \mathbb {R}^n\), \(c \in \mathbb {R}\), and \(r > 0\) with \(u = \lambda _a \wedge \lambda _b + c\) in \(B_r(x)\) or \(u = \lambda _a \vee \lambda _b + c\) in \(B_r(x)\). The singular set of u is its complement \({\mathcal {S}}(u) = \mathbb {R}^n \setminus {\mathcal {R}}(u)\).

The condition for \({\mathcal {R}}(u)\) allows the possibility that \(a = b\), in which case u is affine near x. If \(a \ne b\), then it is still piecewise affine near x. Obviously \({\mathcal {R}}(u)\) is an open set and \({\mathcal {S}}(u)\) is closed.

It would be reasonable to include functions consisting of more than two affine pieces in the definition of \({\mathcal {R}}(u)\), for example \((\lambda _{a_1} \wedge \lambda _{a_2}) \vee \lambda _{a_3} + c\) for \(a_1, a_2, a_3 \in \mathbb {R}^n\) and \(c \in \mathbb {R}\). For the results of this paper, however, this would make no difference, so we choose the simpler definition.

For \(s \ge 0\), we denote the s-dimensional Hausdorff measure in \(\mathbb {R}^n\) by \({\mathcal {H}}^s\). The notation \(\mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n)\) is used for the space of functions with weak gradient in \(\mathrm {BV}_\mathrm {loc}(\mathbb {R}^n; \mathbb {R}^n)\). Thus the hypotheses of the following theorems are identical to the assumptions at the beginning of the introduction.

Theorem 3

Suppose that A is a finite, affinely independent set. Let \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n)\) with \(\nabla u(x) \in A\) for almost every \(x \in \mathbb {R}^n\). Then \({\mathcal {H}}^{n - 1}({\mathcal {S}}(u)) = 0\).

Theorem 4

Suppose that A is a finite, convex independent set. Let \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n)\) with \(\nabla u(x) \in A\) for almost every \(x \in \mathbb {R}^n\). Then \({\mathcal {S}}(u)\) is countably \({\mathcal {H}}^{n - 1}\)-rectifiable.

For \(n = 2\), Theorem 3 was proved in a previous paper [10]. For higher dimensions, the result is new. Theorem 4 is new even for \(n = 2\). For \(n = 1\), both statements are easy to prove.

The results are optimal in terms of the Hausdorff measures involved. Furthermore, the assumption of convex/affine independence is necessary. Indeed, there are examples of finite sets \(A \subseteq \mathbb {R}^2\) and functions \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^2)\) with \(\nabla u(x) \in A\) almost everywhere such that

  • \({\mathcal {H}}^2({\mathcal {S}}(u)) > 0\); or

  • \({\mathcal {H}}^1({\mathcal {S}}(u)) > 0\) and A is convex independent; or

  • \({\mathcal {H}}^s({\mathcal {S}}(u)) = \infty \) for any \(s < 1\) and A is affinely independent.

All of these can be found in the author’s previous paper [10].

Our results are easily adapted to functions defined on an open set \(\Omega \subseteq \mathbb {R}^n\) with the corresponding properties. For the sake of simplicity, we do not discuss this case in detail.

Apart from being of obvious geometric interest, functions as described above appear in problems from materials science. They naturally arise as limits in \(\Gamma \)-convergence theories in the spirit of Modica and Mortola [8, 9] for quantities such as

$$\begin{aligned} \int _\Omega \left( \epsilon |\nabla ^2 u|^2 + \frac{W(\nabla u)}{\epsilon }\right) \, dx, \end{aligned}$$
(1)

where \(\Omega \subseteq \mathbb {R}^n\) is an open set and \(W :\mathbb {R}^n \rightarrow [0, \infty )\) is a function with \(A = W^{-1}(\{0\})\). Functionals of this sort appear in certain models for the surface energy of nanocrystals [7, 13, 14]. For \(\Omega \subseteq \mathbb {R}^2\), functions \(u \in \mathrm {BV}^2(\Omega )\) with \(\nabla u \in \{(\pm 1, 0), (0, \pm 1)\}\) have also been used by Cicalese et al. [3] for a different sort of \(\Gamma \)-limit arising from a model for frustrated spin systems.

Functionals similar to (1), but for maps \(u :\Omega \rightarrow \mathbb {R}^n\), also appear in certain models for phase transitions in elastic materials (see, e.g., the seminal paper of Ball and James [2] or the introduction into the theory by Müller [11]). In this context, due to the frame indifference of the underlying models, the set \(W^{-1}(\{0\})\) is typically not finite. Sometimes, however, the frame indifference is disregarded (as in the paper by Conti et al. [4]), or the theory gives a limit with \(\nabla u \in \mathrm {BV}(\Omega ; A)\) for a finite set \(A \subseteq \mathbb {R}^{n \times n}\) anyway (such as in recent results of Davoli and Friedrich [5, 6]). In such a case, Theorem 3 and Theorem 4 are potentially useful, as they apply to the components (or other one-dimensional projections) of u.

In the proof of Theorem 3, we use some of the tools from the author’s previous paper [10]. In particular, we will analyse the intersections of the graph of u with certain hyperplanes in \(\mathbb {R}^{n + 1}\). We will see that these intersections correspond to the graphs of functions with \((n - 1)\)-dimensional domains and with properties similar to u. The key ideas from the previous paper, however, are specific to \(\mathbb {R}^2\), so we eventually use different arguments. In this paper, we use the theory of \(\mathrm {BV}_\mathrm {loc}(\mathbb {R}^n; \mathbb {R}^n)\) to a much greater extent. The central argument will consider approximate jump points of \(\nabla u\). Near such a point, we know that u is close to a piecewise affine function in a measure theoretic sense by definition. We then use an induction argument (with induction over n) to show that u is in fact piecewise affine near \({\mathcal {H}}^{n - 1}\)-almost every approximate jump point.

We also need to analyse points where \(\nabla u\) has an approximate limit; these points are of interest for the proofs of both Theorems 3 and 4. This part of the analysis is significantly simpler and relies on the fact that for any \(a \in A\), the function \(v(x) = u(x) - a \cdot x\) has some monotonicity properties.

In the rest of the paper, we study a fixed function \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n)\) with \(\nabla u(x) \in A\) for almost every \(x \in \mathbb {R}^n\). Then u is automatically Lipschitz continuous. Since we are interested only in the local properties of u, we may assume that it is also bounded. (Otherwise we can modify it outside of a bounded set with the construction described in [10,  Sect. 6].) We define the function \({\varvec{U}}:\mathbb {R}^n \rightarrow \mathbb {R}^{n + 1}\) by

$$\begin{aligned} {\varvec{U}}(x) = \begin{pmatrix} x \\ u(x) \end{pmatrix}, \quad x \in \mathbb {R}^n. \end{aligned}$$

We use the notation \({{\,\mathrm{graph}\,}}(u) = {\varvec{U}}(\mathbb {R}^n)\) for the graph of u.

As we sometimes work with points in \(\mathbb {R}^{n + 1}\) (especially points on \({{\,\mathrm{graph}\,}}(u)\)) and their projections onto \(\mathbb {R}^n\) simultaneously, we use the following notation. A generic point in \(\mathbb {R}^{n + 1}\) is denoted by \({\varvec{x}}= (x_1, \dotsc , x_{n + 1})^T\), and then we write \(x = (x_1, \dotsc , x_n)^T\). Thus \({\varvec{x}}= ({\begin{matrix} x \\ x_{n + 1} \end{matrix}})\). We think of elements of \(\mathbb {R}^n\) and of \(\mathbb {R}^{n + 1}\) as column vectors, and this is sometimes important, as we use them as columns in certain matrices.

As our function satisfies in particular the condition \(\nabla u \in \mathrm {BV}_\mathrm {loc}(\mathbb {R}^n; \mathbb {R}^n)\), the theory of this space will of course be helpful. In this context, we mostly follow the notation and terminology of Ambrosio et al. [1]. We also use several of the results found in this book.

2 Approximate faces and edges of the graph

In this section, we decompose \(\mathbb {R}^n\) into three sets \({\mathcal {F}}\), \({\mathcal {E}}\), and \({\mathcal {N}}\). These are defined such that we expect regularity in \({\mathcal {F}}\) under the assumptions of either of the main theorems, and also in \({\mathcal {E}}\) under the assumptions of Theorem 3. The third set, \({\mathcal {N}}\), will be an \({\mathcal {H}}^{n - 1}\)-null set. The sets \({\mathcal {F}}\) and \({\mathcal {E}}\) are characterised, up to \({\mathcal {H}}^{n - 1}\)-null sets, by the condition that \(\nabla u\) has an approximate limit or an approximate jump, respectively. Since much of our analysis examines \({{\,\mathrm{graph}\,}}(u)\), it is also convenient to think of \({\mathcal {F}}\) as the set of points where the graph behaves approximately like the (n-dimensional) faces of a polyhedral surface, whereas \({\mathcal {E}}\) corresponds to approximate (\((n - 1)\)-dimensional) edges.

First, however, we define three related sets \({\mathcal {F}}'\), \({\mathcal {E}}'\), and \({\mathcal {N}}'\). The sets \({\mathcal {F}}\), \({\mathcal {E}}\), and \({\mathcal {N}}\) will be derived from these later.

Consider the set \({\mathcal {F}}' \subseteq \mathbb {R}^n\), comprising all points \(x \in \mathbb {R}^n\) such that there exists \(a \in \mathbb {R}^n\) satisfying

$$\begin{aligned} \lim _{\rho \searrow 0} \frac{1}{{\mathcal {H}}^n(B_\rho (x))} \int _{B_\rho (x)} |\nabla u(y) - a| \, dy = 0. \end{aligned}$$

In other words, this is the set of all points where \(\nabla u\) has an approximate limit a. It is then clear that \(a \in A\). The complement \(\mathbb {R}^n \setminus {\mathcal {F}}'\) is called the approximate discontinuity set of \(\nabla u\).

Furthermore, let \({\mathcal {E}}'\) be the set of all \(x \in \mathbb {R}^n\) such that there exist \(a_-, a_+ \in \mathbb {R}^n\) with \(a_- \ne a_+\) and there exists \(\eta \in S^{n - 1}\) such that

$$\begin{aligned} \lim _{\rho \searrow 0} \frac{1}{{\mathcal {H}}^n(B_\rho ^-(x, \eta ))} \int _{B_\rho ^-(x, \eta )} |\nabla u(y) - a_-| \, dy = 0 \end{aligned}$$

(2)

and

$$\begin{aligned} \lim _{\rho \searrow 0} \frac{1}{{\mathcal {H}}^n(B_\rho ^+(x, \eta ))} \int _{B_\rho ^+(x, \eta )} |\nabla u(y) - a_+| \, dy = 0, \end{aligned}$$

(3)

where \(B_\rho ^\pm (x, \eta ) = \left\{ y \in B_\rho (x) :\pm (y - x) \cdot \eta > 0 \right\} \).

This is the approximate jump set of \(\nabla u\). Again, the points \(a_-, a_+\) belong to A.

According to a result by Federer and Vol’pert (which can be found in the book by Ambrosio et al. [1,  Theorem 3.78]), there exists an \({\mathcal {H}}^{n - 1}\)-null set \({\mathcal {N}}' \subseteq \mathbb {R}^n\) such that

$$\begin{aligned} \mathbb {R}^n = {\mathcal {F}}' \cup {\mathcal {E}}' \cup {\mathcal {N}}'. \end{aligned}$$

Furthermore, the set \({\mathcal {E}}'\) is countably \({\mathcal {H}}^{n - 1}\)-rectifiable.

Given \(x \in \mathbb {R}^n\) and \(\rho > 0\), we define the function \(u_{x, \rho } :\mathbb {R}^n \rightarrow \mathbb {R}\) with

$$\begin{aligned} u_{x, \rho }({\tilde{x}}) = \frac{1}{\rho }\left( u(x + \rho {\tilde{x}}) - u(x)\right) \end{aligned}$$

for \({\tilde{x}} \in \mathbb {R}^n\). For x fixed, the family of functions \((u_{x, \rho })_{\rho > 0}\) is clearly bounded in \(C^{0, 1}(K)\) for any compact set \(K \subseteq \mathbb {R}^n\). Therefore, the Arzelà–Ascoli theorem implies that there exists a sequence \(\rho _k \searrow 0\) such that \(u_{x, \rho _k}\) converges locally uniformly. If we have in fact a limit for \(\rho \searrow 0\), then we write

$$\begin{aligned} T_x u = \lim _{\rho \searrow 0} u_{x, \rho } \end{aligned}$$

and call this limit the tangent function of u at x.
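As a minimal illustration of these rescalings (our own example, not taken from the paper): for \(u(x) = |x_1|\) on \(\mathbb {R}^2\), which equals \(\lambda _{(1, 0)} \vee \lambda _{(-1, 0)}\), the tangent function at the origin is u itself, while at a point off the crease \(\{x_1 = 0\}\) it is linear.

```python
# Illustrative example (ours, not from the paper): u(x) = |x_1| on R^2.
# The rescalings u_{x,rho}(xt) = (u(x + rho*xt) - u(x))/rho converge to the
# tangent function: u itself at the origin, and xt -> xt_1 at x = (1, 0).
def u(x):
    return abs(x[0])

def u_resc(x, rho, xt):
    return (u((x[0] + rho * xt[0], x[1] + rho * xt[1])) - u(x)) / rho

grid = [(i / 4.0, j / 4.0) for i in range(-8, 9) for j in range(-8, 9)]
for rho in (1.0, 0.1, 0.01):
    # T_0 u = u: the rescaling is exact because u is 1-homogeneous
    assert all(abs(u_resc((0.0, 0.0), rho, xt) - abs(xt[0])) < 1e-12 for xt in grid)
# T_x u is linear once rho * |xt_1| stays below the distance from x to the crease
assert all(abs(u_resc((1.0, 0.0), 0.01, xt) - xt[0]) < 1e-12 for xt in grid)
```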

If \(x \in {\mathcal {F}}'\) and \(a \in A\) is the approximate limit of \(\nabla u\) at x, then for any sequence \(\rho _k \searrow 0\), the limit of \(u_{x, \rho _k}\) can only be \(\lambda _a\). Hence in this case, there exists a tangent function \(T_x u\), which is exactly this function. Similarly, if \(x \in {\mathcal {E}}'\), then \(T_x u\) exists and

$$\begin{aligned} T_x u({\tilde{x}}) = {\left\{ \begin{array}{ll} \lambda _{a_-}({\tilde{x}}) &{} \text {if } {\tilde{x}} \cdot \eta < 0, \\ \lambda _{a_+}({\tilde{x}}) &{} \text {if } {\tilde{x}} \cdot \eta \ge 0. \end{array}\right. } \end{aligned}$$

Because \(T_x u\) is a continuous function, this means that

$$\begin{aligned} \eta = \pm \frac{a_+ - a_-}{|a_+ - a_-|}. \end{aligned}$$
(4)

Then we conclude that \(T_x u = \lambda _{a_-} \wedge \lambda _{a_+}\) or \(T_x u = \lambda _{a_-} \vee \lambda _{a_+}\), depending on the sign.
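The continuity argument behind (4) can be illustrated numerically. In the following hypothetical 2D example (ours, not part of the argument), the piecewise function \(T({\tilde{x}}) = \lambda _{a_-}({\tilde{x}})\) for \({\tilde{x}} \cdot \eta < 0\) and \(\lambda _{a_+}({\tilde{x}})\) otherwise is continuous across the interface \(\{{\tilde{x}} \cdot \eta = 0\}\) exactly when \(a_+ - a_-\) is orthogonal to every tangent direction of the interface, i.e. parallel to \(\eta \):

```python
import math

# Hypothetical 2D illustration (ours): the piecewise-linear function with
# slopes a_-, a_+ on the two sides of {x . eta = 0} is continuous exactly
# when a_+ - a_- is parallel to eta, which is the content of (4).
def jump_on_interface(a_minus, a_plus, eta):
    """Mismatch |(a_+ - a_-) . t| along the unit tangent t of {x . eta = 0}."""
    t = (-eta[1], eta[0])                     # spans eta-perp in R^2
    d = (a_plus[0] - a_minus[0], a_plus[1] - a_minus[1])
    return abs(d[0] * t[0] + d[1] * t[1])

a_m, a_p = (1.0, 0.0), (0.0, 1.0)             # jump a_+ - a_- = (-1, 1)
eta_par = (-1 / math.sqrt(2), 1 / math.sqrt(2))   # eta parallel to a_+ - a_-
assert jump_on_interface(a_m, a_p, eta_par) < 1e-12   # continuous
assert jump_on_interface(a_m, a_p, (1.0, 0.0)) > 0.9  # any other eta: a jump
```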

If we consider the functions \(a_-, a_+ :{\mathcal {E}}' \rightarrow A\) and \(\eta :{\mathcal {E}}' \rightarrow S^{n - 1}\) such that (2) and (3) are satisfied on \({\mathcal {E}}'\), then the previously used result [1,  Theorem 3.78] also implies that

$$\begin{aligned} D\nabla u \llcorner {\mathcal {E}}' = (a_+ - a_-) \otimes \eta \, {\mathcal {H}}^{n - 1} \llcorner {\mathcal {E}}'. \end{aligned}$$

(5)

Let \(\gamma = \min \left\{ |a - b| :a, b \in A, \, a \ne b \right\} \). Then for any Borel set \(\Omega \subseteq \mathbb {R}^n\), we conclude that

$$\begin{aligned} |D\nabla u|(\Omega ) \ge \gamma {\mathcal {H}}^{n - 1}({\mathcal {E}}' \cap \Omega ). \end{aligned}$$

Hence \({\mathcal {E}}'\) has locally finite \({\mathcal {H}}^{n - 1}\)-measure.

Now define

$$\begin{aligned} {\mathcal {F}}= \left\{ x \in {\mathcal {F}}' :\lim _{\rho \searrow 0} \rho ^{1 - n} |D\nabla u|(B_\rho (x)) = 0 \right\} . \end{aligned}$$

Then standard results [1,  Theorem 2.56 and Lemma 3.76] imply that \({\mathcal {H}}^{n - 1}({\mathcal {F}}' \setminus {\mathcal {F}}) = 0\).

Recall the map \({\varvec{U}}:\mathbb {R}^n \rightarrow \mathbb {R}^{n + 1}\) defined in the introduction. Set \({\mathcal {F}}^* = {\varvec{U}}({\mathcal {F}})\) and \({\mathcal {E}}^\dagger = {\varvec{U}}({\mathcal {E}}')\). Then \({\mathcal {E}}^\dagger \) is a countably \({\mathcal {H}}^{n - 1}\)-rectifiable subset of \(\mathbb {R}^{n + 1}\). Hence at \({\mathcal {H}}^{n - 1}\)-almost every \({\varvec{x}}\in {\mathcal {E}}^\dagger \), the measure \({\mathcal {H}}^{n - 1} \llcorner {\mathcal {E}}^\dagger \) has a tangent measure [1,  Theorem 2.83] of the form \({\mathcal {H}}^{n - 1} \llcorner T_{{\varvec{x}}} {\mathcal {E}}^\dagger \), where \(T_{{\varvec{x}}} {\mathcal {E}}^\dagger \) is an \((n - 1)\)-dimensional linear subspace of \(\mathbb {R}^{n + 1}\) (the approximate tangent space of \({\mathcal {E}}^\dagger \) at \({\varvec{x}}\)). Let \({\mathcal {E}}^*\) be the set of all \({\varvec{x}}\in {\mathcal {E}}^\dagger \) where this is the case. Furthermore, let \({\mathcal {E}}= {\varvec{U}}^{-1}({\mathcal {E}}^*)\). Then \({\mathcal {E}}' \setminus {\mathcal {E}}\) is an \({\mathcal {H}}^{n - 1}\)-null set.

Thus if we define \({\mathcal {N}}= \mathbb {R}^n \setminus ({\mathcal {F}}\cup {\mathcal {E}})\), then \({\mathcal {N}}\) is an \({\mathcal {H}}^{n - 1}\)-null set and we have the disjoint decomposition

$$\begin{aligned} \mathbb {R}^n = {\mathcal {F}}\cup {\mathcal {E}}\cup {\mathcal {N}}. \end{aligned}$$

For later use, we also define \({\mathcal {N}}^* = {\varvec{U}}({\mathcal {N}})\).

3 Proof of Theorem 4

In this section we prove our second main result, Theorem 4. The proof is based on the following proposition, which will also be useful for the proof of Theorem 3 later on.

Proposition 5

Suppose that \(A \subseteq \mathbb {R}^n\) is finite and convex independent. Let \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n)\) be a function with \(\nabla u(x) \in A\) for almost all \(x \in \mathbb {R}^n\). Then there exist \(r > 0\) and \(\epsilon > 0\) with the following property. Suppose that there exists \(a \in A\) such that

$$\begin{aligned} {\mathcal {H}}^n(\left\{ x \in B_1(0) :\nabla u(x) \ne a \right\} ) \le \epsilon \end{aligned}$$
(6)

and

$$\begin{aligned} |D\nabla u|(B_1(0)) \le \epsilon . \end{aligned}$$

Then \(\nabla u(x) = a\) for almost every \(x \in B_r(0)\).

Proof

Because A is convex independent, there exists \(\omega \in S^{n - 1}\) such that

$$\begin{aligned} a \cdot \omega < \min _{b \in A \setminus \{a\}} b \cdot \omega . \end{aligned}$$

(For example, we may choose a point \(b_0\) in the convex hull of \(A \setminus \{a\}\) that minimises the distance to a. Then \(\omega = (b_0 - a)/|b_0 - a|\) has this property.) As A is finite, there also exists \(\delta \in (0, 1)\) such that the inequality \(a \cdot \xi \le \min _{b \in A \setminus \{a\}} (b \cdot \xi )\) holds even for \(\xi \) in the cone

$$\begin{aligned} C = \left\{ \xi \in \mathbb {R}^n :\xi \cdot \omega \ge \delta |\xi | \right\} . \end{aligned}$$
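The construction of \(\omega \) can be sketched numerically. In the following hypothetical 2D example (ours, for illustration only), \(A \setminus \{a\}\) consists of two points, so its convex hull is a segment and the nearest point \(b_0\) is found in closed form:

```python
import math

# Hypothetical 2D example (ours): a = (0,0) and A \ {a} = {b1, b2}, so
# conv(A \ {a}) is the segment [b1, b2].  Project a onto it to get b0, then
# omega = (b0 - a)/|b0 - a| separates a strictly from A \ {a}.
a, b1, b2 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)

d = (b2[0] - b1[0], b2[1] - b1[1])
t = ((a[0] - b1[0]) * d[0] + (a[1] - b1[1]) * d[1]) / (d[0] ** 2 + d[1] ** 2)
t = max(0.0, min(1.0, t))                         # clamp to the segment
b0 = (b1[0] + t * d[0], b1[1] + t * d[1])
norm = math.hypot(b0[0] - a[0], b0[1] - a[1])
omega = ((b0[0] - a[0]) / norm, (b0[1] - a[1]) / norm)

dot = lambda p, q: p[0] * q[0] + p[1] * q[1]
assert dot(a, omega) < min(dot(b1, omega), dot(b2, omega))   # strict separation
```

For larger sets \(A \setminus \{a\}\), the projection \(b_0\) would require a small quadratic program, but the separation property of \(\omega \) is the same.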

Consider the function \(v :\mathbb {R}^n \rightarrow \mathbb {R}\) with \(v(x) = u(x) - a \cdot x\) for \(x \in \mathbb {R}^n\). Then for any \(\xi \in C\),

$$\begin{aligned} \xi \cdot \nabla v(x) = \xi \cdot \nabla u(x) - a \cdot \xi \ge 0 \end{aligned}$$

almost everywhere. Thus v is monotone along lines parallel to \(\xi \). (This is true for every such line by the continuity of v.) Furthermore, for almost every \(x \in \mathbb {R}^n\), we find that either \(\nabla u(x) = a\) or \(\omega \cdot \nabla v(x) > 0\).

Suppose that \(\nabla u = a\) does not hold almost everywhere in \(B_r(0)\). Then \(\omega \cdot \nabla v > 0\) on a set of positive measure in \(B_r(0)\), so v is not constant there; hence there exist \(x_-, x_+ \in B_r(0)\) with \(v(x_-) < v(x_+)\). Define

$$\begin{aligned} C_- = (x_- - C) \cap B_1(0) \quad \text {and} \quad C_+ = (x_+ + C) \cap B_1(0). \end{aligned}$$

Then for any \(x' \in C_-\) and \(x'' \in C_+\), we conclude that

$$\begin{aligned} v(x') \le v(x_-) < v(x_+) \le v(x''). \end{aligned}$$

We now foliate a part of \(B_1(0)\) by line segments parallel to \(\omega \). For \(R \in (0, 1]\), let \(Z_R = \left\{ x \in B_R(0) :\omega \cdot x = 0 \right\} \). For every \(z \in Z_R\), consider the line segment

$$\begin{aligned} L_z = \left\{ z + t\omega :-\frac{1}{2} \le t \le \frac{1}{2} \right\} . \end{aligned}$$

Provided that r is chosen sufficiently small, we can find \(R \in (0, 1]\) such that

$$\begin{aligned} \left\{ z - \frac{\omega }{2} :z \in Z_R \right\} \subseteq C_- \quad \text {and} \quad \left\{ z + \frac{\omega }{2} :z \in Z_R \right\} \subseteq C_+. \end{aligned}$$

Hence for any \(z \in Z_R\),

$$\begin{aligned} v\left( z + \frac{\omega }{2}\right) - v\left( z - \frac{\omega }{2}\right) \ge v(x_+) - v(x_-) > 0. \end{aligned}$$

In particular, the restriction of v to the line segment \(L_z\) is not constant. For \(z \in Z_R\), define \(\Lambda _z = \left\{ x \in L_z :\nabla u(x) = a \right\} \). Then it follows that \({\mathcal {H}}^1(\Lambda _z) < 1\) for \({\mathcal {H}}^{n - 1}\)-almost all \(z \in Z_R\).

On the other hand, because of (6), we also know that

$$\begin{aligned} {\mathcal {H}}^{n - 1}\left( \left\{ z \in Z_R :{\mathcal {H}}^1(\Lambda _z) = 0 \right\} \right) \le \epsilon . \end{aligned}$$

Thus if we define \(Z' = \left\{ z \in Z_R :0< {\mathcal {H}}^1(\Lambda _z) < 1 \right\} \), then

$$\begin{aligned} {\mathcal {H}}^{n - 1}(Z') \ge {\mathcal {H}}^{n - 1}(Z_R) - \epsilon . \end{aligned}$$

Set \(c = \min _{b \in A \setminus \{a\}} |a - b|\). For \({\mathcal {H}}^{n - 1}\)-almost every \(z \in Z'\), the function \(t \mapsto \nabla u(z + t\omega )\) belongs to \(\mathrm {BV}\bigl ((-\frac{1}{2}, \frac{1}{2}); \mathbb {R}^n\bigr )\) and its total variation is at least c. Hence [1,  Theorem 3.103] implies that

$$\begin{aligned} |D\nabla u|(B_1(0)) \ge c {\mathcal {H}}^{n - 1}(Z') \ge c\bigl ({\mathcal {H}}^{n - 1}(Z_R) - \epsilon \bigr ). \end{aligned}$$

If \(\epsilon \) is sufficiently small, then this means in particular that \(|D\nabla u|(B_1(0)) > \epsilon \). Thus we have proved the contrapositive of Proposition 5. \(\square \)

Proof of Theorem 4

We show that \({\mathcal {F}}\subseteq {\mathcal {R}}(u)\). To this end, fix \(x \in {\mathcal {F}}\) and consider the rescaled functions \(u_{x, \rho }\) for \(\rho > 0\). Since \(x \in {\mathcal {F}}\), we know that \(\nabla u_{x, \rho } \rightarrow a\) in \(L^1(B_1(0))\) as \(\rho \searrow 0\) for some \(a \in A\). Furthermore, since

$$\begin{aligned} |D\nabla u_{x, \rho }|(B_1(0)) = \rho ^{1 - n} |D\nabla u|(B_\rho (x)) \rightarrow 0 \end{aligned}$$

as \(\rho \searrow 0\), the function \(u_{x, \rho }\) satisfies the inequalities of Proposition 5 for \(\rho \) sufficiently small. Hence \(\nabla u_{x, \rho }({\tilde{x}}) = a\) for almost every \({\tilde{x}} \in B_r(0)\), which implies that

$$\begin{aligned} u({\tilde{x}}) = u(x) + a \cdot ({\tilde{x}} - x) \end{aligned}$$

for all \({\tilde{x}} \in B_{\rho r}(x)\). Hence \(x \in {\mathcal {R}}(u)\).

We conclude that \({\mathcal {S}}(u) \subseteq {\mathcal {E}}\cup {\mathcal {N}}\). As seen in Sect. 2, the set \({\mathcal {E}}\) is countably \({\mathcal {H}}^{n - 1}\)-rectifiable and \({\mathcal {N}}\) is an \({\mathcal {H}}^{n - 1}\)-null set. Thus Theorem 4 follows. \(\square \)

4 Specialising to a regular n-simplex

The rest of the paper is devoted to the proof of Theorem 3. This proof relies on the same general strategy as the proof of Theorem 4 to some extent: we use some monotonicity properties of u again, together with the theory of \(\mathrm {BV}\)-functions, to show that if u is close to \(\lambda _{a_1} \wedge \lambda _{a_2}\) (for two points \(a_1, a_2 \in A\)) in a cube \(Q\) centred at 0, and if \(|D\nabla u|(Q)\) is not much larger than the corresponding quantity for \(\lambda _{a_1} \wedge \lambda _{a_2}\), then u actually coincides with \(\lambda _{a_1} \wedge \lambda _{a_2}\), up to a translation and the addition of a constant, in a smaller set. (The same applies to \(\lambda _{a_1} \vee \lambda _{a_2}\).) The details, however, are much more involved than in the proof of Theorem 4.

Instead of considering any affinely independent set A, it is convenient in this analysis to assume that \(a_0, \dotsc , a_n \in \mathbb {R}^n\) are the corners of a regular n-simplex of side length \(\sqrt{2n + 2}\) centred at 0, and that \(A = \{a_0, \dotsc , a_n\}\). This means in particular that

$$\begin{aligned} \sum _{i = 0}^n a_i = 0. \end{aligned}$$
(7)

We further assume that the matrix with columns \(a_0 - a_1, \dotsc , a_0 - a_n\) has a positive determinant. Theorem 3 can then be reduced to this situation by composing u with an affine transformation. The details are given on page 26 below.

As it is sometimes convenient to permute \(a_0, \dotsc , a_n\) cyclically, we regard \(0, \dotsc , n\) as members of \(\mathbb {Z}_{n + 1} = \mathbb {Z}/ (n + 1)\mathbb {Z}\) in this context. Thus \(a_{i + n + 1} = a_i\).

The condition that our simplex has side length \(\sqrt{2n + 2}\) means that \(|a_i| = \sqrt{n}\) for every \(i \in \mathbb {Z}_{n + 1}\). Indeed, by the calculations of Parks and Wills [12], the dihedral angle of the regular n-simplex is \(\arccos \frac{1}{n}\). As each \(a_i\) is orthogonal to one of the faces, this means that \(a_i \cdot a_j = - \frac{1}{n} |a_i| |a_j|\) for \(i \ne j\), and therefore \(2n + 2 = |a_i - a_j|^2 = \frac{2n + 2}{n} |a_i| |a_j|\). From this we conclude that

$$\begin{aligned} |a_i| = \sqrt{n} \end{aligned}$$

for \(i \in \mathbb {Z}_{n + 1}\) and

$$\begin{aligned} a_i \cdot a_j = -1 \end{aligned}$$

for \(i \ne j\).

In the following arguments, we mostly study u in terms of \({{\,\mathrm{graph}\,}}(u)\). With A chosen as above, we have a positively oriented, orthonormal basis \((\varvec{\nu }_1, \dotsc , \varvec{\nu }_{n + 1})\) of \(\mathbb {R}^{n + 1}\) (defined shortly) consisting of vectors normal to the expected faces of \({{\,\mathrm{graph}\,}}(u)\). (This is the reason why we choose this specific set A.) To facilitate our analysis, we will often represent \({{\,\mathrm{graph}\,}}(u)\) with respect to this basis, or equivalently, apply the corresponding linear transformation in \(\mathrm {SO}(n + 1)\).

The vectors \(\varvec{\nu }_i\) are defined as follows. For \(i \in \mathbb {Z}_{n + 1}\), let

$$\begin{aligned} \varvec{\nu }_i = \frac{1}{\sqrt{n + 1}} \begin{pmatrix} -a_i \\ 1 \end{pmatrix}. \end{aligned}$$

Then

$$\begin{aligned} |\varvec{\nu }_i|^2 = \frac{|a_i|^2 + 1}{n + 1} = 1, \end{aligned}$$

whereas for \(i \ne j\),

$$\begin{aligned} \varvec{\nu }_i \cdot \varvec{\nu }_j = \frac{a_i \cdot a_j + 1}{n + 1} = 0. \end{aligned}$$

Hence \((\varvec{\nu }_1, \dotsc , \varvec{\nu }_{n + 1})\) is indeed an orthonormal basis of \(\mathbb {R}^{n + 1}\). Furthermore,

$$\begin{aligned} \begin{aligned} \det \begin{pmatrix} -a_1 &{} \cdots &{} -a_{n + 1} \\ 1 &{} \cdots &{} 1 \end{pmatrix}&= \det \begin{pmatrix} a_0 - a_1 &{} \cdots &{} a_0 - a_n &{} -a_0 \\ 0 &{} \cdots &{} 0 &{} 1 \end{pmatrix} \\&= \det \begin{pmatrix} a_0 - a_1&\cdots&a_0 - a_n \end{pmatrix}. \end{aligned} \end{aligned}$$

(In the first step, we have used the fact that \(a_{n + 1} = a_0\) and subtracted the last column from each of the other columns of the matrix.) Hence the above assumption guarantees that the basis \((\varvec{\nu }_1, \dotsc , \varvec{\nu }_{n + 1})\) gives the standard orientation of \(\mathbb {R}^{n + 1}\).
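The identities of this section can be spot-checked numerically. The sketch below (our own sanity check, not part of the argument) realises the vertices in the standard way inside the hyperplane \(\{y \in \mathbb {R}^{n + 1} :y_1 + \cdots + y_{n + 1} = 0\}\), which is isometric to \(\mathbb {R}^n\), and verifies (7), \(|a_i| = \sqrt{n}\), \(a_i \cdot a_j = -1\), the side length \(\sqrt{2n + 2}\), and the orthonormality \(\varvec{\nu }_i \cdot \varvec{\nu }_j = (a_i \cdot a_j + 1)/(n + 1) = \delta _{ij}\):

```python
from fractions import Fraction

# Sanity check (ours).  Integer model b_i = (n+1) e_i - (1, ..., 1) in the
# hyperplane {sum y = 0} of R^(n+1); then a_i . a_j = b_i . b_j / (n + 1),
# and an isometry onto R^n preserves all inner products.
def check_simplex(n):
    b = [[(n + 1) * (i == k) - 1 for k in range(n + 1)] for i in range(n + 1)]
    dot = lambda v, w: Fraction(sum(x * y for x, y in zip(v, w)), n + 1)
    assert all(sum(col) == 0 for col in zip(*b))     # eq. (7): sum of a_i is 0
    for i in range(n + 1):
        for j in range(n + 1):
            g = dot(b[i], b[j])
            assert g == (n if i == j else -1)        # |a_i|^2 = n, a_i.a_j = -1
            if i != j:
                # side length: |a_i - a_j|^2 = n + 2 + n = 2n + 2
                assert dot(b[i], b[i]) - 2 * g + dot(b[j], b[j]) == 2 * n + 2
            # nu_i . nu_j = (a_i . a_j + 1)/(n + 1) = delta_ij
            assert (g + 1) / (n + 1) == (1 if i == j else 0)

for n in (2, 3, 6):
    check_simplex(n)
```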

We now use the notation \(\lambda _i = \lambda _{a_i}\), recalling that this is the linear function with \(\lambda _i(x) = a_i \cdot x\) for \(x \in \mathbb {R}^n\). For \(i \in \mathbb {Z}_{n + 1}\), we set

$$\begin{aligned} {\mathcal {F}}_i = \left\{ x \in {\mathcal {F}} :T_x u = \lambda _i \right\} . \end{aligned}$$

Thus we have the disjoint decomposition

$$\begin{aligned} {\mathcal {F}}= \bigcup _{i \in \mathbb {Z}_{n + 1}} {\mathcal {F}}_i. \end{aligned}$$

Furthermore, we define \({\mathcal {F}}_i^* = {\varvec{U}}({\mathcal {F}}_i)\).

Of course \({\varvec{U}}:\mathbb {R}^n \rightarrow {{\,\mathrm{graph}\,}}(u)\) is a bi-Lipschitz map. Thus in order to understand \({\mathcal {F}}\), \({\mathcal {E}}\), or \({\mathcal {F}}_i\), it suffices to study \({\mathcal {F}}^*\), \({\mathcal {E}}^*\), or \({\mathcal {F}}_i^*\) and how \({\varvec{U}}^{-1}\) transforms them. In particular, the following is true.

Lemma 6

For any Borel set \(\Omega \subseteq \mathbb {R}^n\),

$$\begin{aligned} {\mathcal {H}}^{n - 1}({\mathcal {E}}^* \cap (\Omega \times \mathbb {R})) = \sqrt{\frac{n + 1}{2}} {\mathcal {H}}^{n - 1}({\mathcal {E}}\cap \Omega ) = \frac{1}{2} |D\nabla u|(\Omega ). \end{aligned}$$

Proof

We use the area formula [1,  Theorem 2.91]. Hence we need to calculate the Jacobian of \({\varvec{U}}\) restricted to the approximate tangent spaces of \({\mathcal {E}}\).

More precisely, since \({\mathcal {E}}\) is countably \({\mathcal {H}}^{n - 1}\)-rectifiable, there exists an approximate tangent space \(T_x {\mathcal {E}}\) at \({\mathcal {H}}^{n - 1}\)-almost every \(x \in {\mathcal {E}}\). Because \({\varvec{U}}\) is Lipschitz continuous, the tangential derivative \(d^{\mathcal {E}}{\varvec{U}}(x)\) (i.e., the derivative of the restriction of \({\varvec{U}}\) to \(T_x {\mathcal {E}}\)) exists at \({\mathcal {H}}^{n - 1}\)-almost every \(x \in {\mathcal {E}}\) [1,  Theorem 2.90]. We write \(L^*\) for the adjoint of a linear operator L. Then

$$\begin{aligned} J_{\mathcal {E}}{\varvec{U}}(x) = \sqrt{\det \bigl ((d^{\mathcal {E}}{\varvec{U}}(x))^* \circ d^{\mathcal {E}}{\varvec{U}}(x)\bigr )} \end{aligned}$$

is the Jacobian of \({\varvec{U}}\) at x with respect to \(T_x{\mathcal {E}}\). The area formula implies that

$$\begin{aligned} {\mathcal {H}}^{n - 1}({\varvec{U}}({\mathcal {E}}\cap \Omega )) = \int _{{\mathcal {E}}\cap \Omega } J_{\mathcal {E}}{\varvec{U}}(x) \, d{\mathcal {H}}^{n - 1}. \end{aligned}$$

Thus in order to prove the first identity, it suffices to show that

$$\begin{aligned} J_{\mathcal {E}}{\varvec{U}}(x) = \sqrt{\frac{n + 1}{2}} \end{aligned}$$

for \({\mathcal {H}}^{n - 1}\)-almost every \(x \in {\mathcal {E}}\).

To this end, consider \(x \in {\mathcal {E}}\). Because of (4), we know that \(T_x{\mathcal {E}}= (a_i - a_j)^\perp \) for some \(i, j \in \mathbb {Z}_{n + 1}\) with \(i \ne j\) at \({\mathcal {H}}^{n - 1}\)-almost every such point. For \(\xi \in (a_i - a_j)^\perp \), we know that

$$\begin{aligned} \frac{1}{\rho }(u(x + \rho \xi ) - u(x)) = u_{x, \rho }(\xi ) \rightarrow T_x u(\xi ) \end{aligned}$$

as \(\rho \searrow 0\). The convergence is in fact uniform on compact subsets of \((a_i - a_j)^\perp \). Moreover, since \(T_x u = \lambda _i \wedge \lambda _j\) or \(T_x u = \lambda _i \vee \lambda _j\), its restriction to \((a_i - a_j)^\perp \) is linear with \(T_x u(\xi ) = a_i \cdot \xi \). Hence \(d^{\mathcal {E}}u(x)\) exists, and so does \(d^{\mathcal {E}}{\varvec{U}}(x)\). We calculate

$$\begin{aligned} d^{\mathcal {E}}{\varvec{U}}(x) \xi = \begin{pmatrix} \xi \\ a_i \cdot \xi \end{pmatrix}. \end{aligned}$$

For simplicity, we assume that \(i = n - 1\) and \(j = n\). The space \((a_i - a_j)^\perp \) is spanned by the vectors \(a_0, \dotsc , a_{n - 2}\). Suppose that we choose an orthonormal basis \((\epsilon _0, \dotsc , \epsilon _{n - 2})\) of \(T_x{\mathcal {E}}\). Let \(L :T_x{\mathcal {E}}\rightarrow T_x {\mathcal {E}}\) denote the linear operator that maps \(\epsilon _k\) to \(a_k\) for \(k = 0, \dotsc , n - 2\). Then \(d^{\mathcal {E}}{\varvec{U}}(x) \circ L\) is represented by the matrix

$$\begin{aligned} M_1 = \begin{pmatrix} a_0 &{} \cdots &{} a_{n - 2} \\ a_0 \cdot a_{n - 1} &{} \cdots &{} a_{n - 2} \cdot a_{n - 1} \end{pmatrix} = \begin{pmatrix} a_0 &{} \cdots &{} a_{n - 2} \\ -1 &{} \cdots &{} -1 \end{pmatrix} \end{aligned}$$

with respect to the above basis. Hence

$$\begin{aligned} J_{\mathcal {E}}{\varvec{U}}(x) = \sqrt{\frac{\det (M_1^T M_1)}{\det (L^* \circ L)}}. \end{aligned}$$

We write \(I_k\) for the identity \(k \times k\)-matrix. Then

$$\begin{aligned} M_1^T M_1 = \begin{pmatrix} a_0 \cdot a_0 + 1 &{} \cdots &{} a_0 \cdot a_{n - 2} + 1 \\ \vdots &{} \ddots &{} \vdots \\ a_{n - 2} \cdot a_0 + 1 &{} \cdots &{} a_{n - 2} \cdot a_{n - 2} + 1 \end{pmatrix} = (n + 1) I_{n - 1} \end{aligned}$$

and \(\det (M_1^T M_1) = (n + 1)^{n - 1}\).

As L maps an \((n - 1)\)-cube of side length 1 to the parallelepiped spanned by \(a_0, \dotsc , a_{n - 2}\), we know that \(\sqrt{\det (L^* \circ L)}\) is the \((n - 1)\)-volume of the latter. Thus if \(M_2\) is the \(n \times (n - 1)\)-matrix with columns \(a_0, \dotsc , a_{n - 2}\), then

$$\begin{aligned} \det (L^* \circ L) = \det (M_2^T M_2). \end{aligned}$$

We further compute

$$\begin{aligned} M_2^T M_2 = \begin{pmatrix} a_0 \cdot a_0 &{} \cdots &{} a_0 \cdot a_{n - 2} \\ \vdots &{} \ddots &{} \vdots \\ a_{n - 2} \cdot a_0 &{} \cdots &{} a_{n - 2} \cdot a_{n - 2} \end{pmatrix} = \begin{pmatrix} n &{} -1 &{} \cdots &{} -1 \\ -1 &{} n &{} &{} \vdots \\ \vdots &{} &{} \ddots &{} -1 \\ -1 &{} \cdots &{} -1 &{} n \end{pmatrix}. \end{aligned}$$

In order to calculate the determinant, we first subtract the first row of this matrix from each of the other rows. We obtain

$$\begin{aligned} \begin{aligned} \det (M_2^T M_2)&= \det \begin{pmatrix} n &{} -1 &{} \cdots &{} \cdots &{} -1 \\ -(n + 1) &{} n + 1 &{} 0 &{} \cdots &{} 0 \\ -(n + 1) &{} 0 &{} n + 1 &{} &{} \vdots \\ \vdots &{} \vdots &{} &{} \ddots &{} 0 \\ -(n + 1) &{} 0 &{} \cdots &{} 0 &{} n + 1 \end{pmatrix} \\&= (n + 1)^{n - 2} \det \begin{pmatrix} n &{} -1 &{} \cdots &{} \cdots &{} -1 \\ -1 &{} 1 &{} 0 &{} \cdots &{} 0 \\ -1 &{} 0 &{} 1 &{} &{} \vdots \\ \vdots &{} \vdots &{} &{} \ddots &{} 0 \\ -1 &{} 0 &{} \cdots &{} 0 &{} 1 \end{pmatrix}. \end{aligned} \end{aligned}$$

In the last matrix, we now add to the first row the sum of all the other rows. Thus

$$\begin{aligned} \det (M_2^T M_2) = (n + 1)^{n - 2} \det \begin{pmatrix} 2 &{} 0 &{} \cdots &{} \cdots &{} 0 \\ -1 &{} 1 &{} 0 &{} \cdots &{} 0 \\ -1 &{} 0 &{} 1 &{} &{} \vdots \\ \vdots &{} \vdots &{} &{} \ddots &{} 0 \\ -1 &{} 0 &{} \cdots &{} 0 &{} 1 \end{pmatrix} = 2(n + 1)^{n - 2}. \end{aligned}$$

Hence

$$\begin{aligned} J_{\mathcal {E}}{\varvec{U}}(x) = \sqrt{\frac{\det (M_1^T M_1)}{\det (M_2^T M_2)}} = \sqrt{\frac{n + 1}{2}}. \end{aligned}$$
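These determinant identities are easy to check numerically. The following sketch (a sanity check only, not part of the argument) verifies \(\det (M_1^T M_1) = (n + 1)^{n - 1}\), \(\det (M_2^T M_2) = 2(n + 1)^{n - 2}\) and the resulting value of the Jacobian for small n, using only the Gram relations \(a_k \cdot a_k = n\) and \(a_k \cdot a_l = -1\) for \(k \ne l\).

```python
import numpy as np

for n in range(2, 8):
    # Gram matrix of a_0, ..., a_{n-2}: diagonal entries n, off-diagonal -1
    G = (n + 1) * np.eye(n - 1) - np.ones((n - 1, n - 1))
    # M_1^T M_1 adds 1 to every entry of G, since the extra row contributes
    # (a_k . a_{n-1})(a_l . a_{n-1}) = (-1)(-1) = 1
    M1TM1 = G + np.ones((n - 1, n - 1))
    assert np.isclose(np.linalg.det(M1TM1), (n + 1) ** (n - 1))
    assert np.isclose(np.linalg.det(G), 2 * (n + 1) ** (n - 2))
    # hence the Jacobian equals sqrt((n + 1)/2)
    J = np.sqrt(np.linalg.det(M1TM1) / np.linalg.det(G))
    assert np.isclose(J, np.sqrt((n + 1) / 2))
```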

In order to prove the second identity, we recall that \(|a_i - a_j| = \sqrt{2n + 2}\) for \(i \ne j\). Hence \(|D\nabla u|(\Omega ) = \sqrt{2n + 2} {\mathcal {H}}^{n - 1}({\mathcal {E}}\cap \Omega ) = 2{\mathcal {H}}^{n - 1}({\mathcal {E}}^* \cap (\Omega \times \mathbb {R}))\) according to (5). \(\square \)

5 Slicing the graph

We still assume that A consists of the corners of the regular n-simplex from Sect. 4, and that \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n)\) is bounded and satisfies \(\nabla u(x) \in A\) for almost every \(x \in \mathbb {R}^n\). In this section, we analyse the graph of u. In particular, we examine intersections of \({{\,\mathrm{graph}\,}}(u)\) with hyperplanes perpendicular to one of the vectors \(\varvec{\nu }_i\). We will see that almost all such intersections can be represented as the graphs of functions in \(\mathrm {BV}_\mathrm {loc}^2(P)\), where

$$\begin{aligned} P = \left\{ y \in \mathbb {R}^n :y_1 + \cdots + y_n = 0 \right\} , \end{aligned}$$

and with gradient taking one of n different values almost everywhere. That is, we have a function with properties similar to u, but with an \((n - 1)\)-dimensional domain. This observation will eventually make it possible to prove Theorem 3 with the help of an induction argument.

We use some tools from the author’s previous paper [10] in this section. Given \(i \in \mathbb {Z}_{n + 1}\), let \(\Phi _i :\mathbb {R}^{n + 1} \rightarrow \mathbb {R}^{n + 1}\) be the linear map with

$$\begin{aligned} \Phi _i({\varvec{x}}) = \begin{pmatrix} \varvec{\nu }_{i + 1} \cdot {\varvec{x}}\\ \vdots \\ \varvec{\nu }_{i + n + 1} \cdot {\varvec{x}}\end{pmatrix}, \end{aligned}$$

so that \(\Phi _i(\varvec{\nu }_{i + k})\) is the k-th standard basis vector in \(\mathbb {R}^{n + 1}\). For \(t \in \mathbb {R}\), let

$$\begin{aligned} \Gamma _i(t) = \left\{ y \in \mathbb {R}^n :\begin{pmatrix} y \\ t \end{pmatrix} \in \Phi _i({{\,\mathrm{graph}\,}}(u)) \right\} . \end{aligned}$$

This corresponds to the intersection of \({{\,\mathrm{graph}\,}}(u)\) with a hyperplane orthogonal to \(\varvec{\nu }_i\) after rotation by \(\Phi _i\), or in other words, a slice of \({{\,\mathrm{graph}\,}}(u)\).
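For readers checking the computations numerically: the later formulas \(\nu _i = -a_i/\sqrt{n + 1}\) and the expression for \(\Phi _i(\varvec{N})\) in Sect. 6 suggest the normalisation \(\varvec{\nu }_j = (-a_j, 1)/\sqrt{n + 1}\) from the earlier sections. Under this assumption (which should be verified against the original definition), the sketch below confirms that the \(\varvec{\nu }_j\) form an orthonormal basis, so that \(\Phi _i(\varvec{\nu }_{i + k})\) is indeed the k-th standard basis vector.

```python
import numpy as np

for n in range(2, 6):
    # realise the simplex corners a_0, ..., a_n in R^n via the hyperplane
    # {x in R^{n+1} : x_1 + ... + x_{n+1} = 0}
    E = np.sqrt(n + 1) * (np.eye(n + 1) - np.ones((n + 1, n + 1)) / (n + 1))
    U, _, _ = np.linalg.svd(E)
    a = U[:, :n].T @ E                    # column i is a_i
    # assumed normalisation: nu_j = (-a_j, 1) / sqrt(n + 1)
    nu = np.vstack([-a, np.ones((1, n + 1))]) / np.sqrt(n + 1)
    assert np.allclose(nu.T @ nu, np.eye(n + 1))   # orthonormal basis of R^{n+1}
```

Since the components of \(\Phi _i({\varvec{x}})\) are the inner products \(\varvec{\nu }_{i + k} \cdot {\varvec{x}}\), orthonormality immediately gives \(\Phi _i(\varvec{\nu }_{i + k}) = e_k\).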

We further define the functions

$$\begin{aligned} {\underline{g}}_i(y) = \sup \left\{ t \in \mathbb {R} :u(t\nu _i + y_1 \nu _{i + 1} + \cdots + y_n \nu _{i + n}) > \frac{t + y_1 + \cdots + y_n}{\sqrt{n + 1}} \right\} \end{aligned}$$

and

$$\begin{aligned} {\overline{g}}_i(y) = \inf \left\{ t \in \mathbb {R} :u(t\nu _i + y_1 \nu _{i + 1} + \cdots + y_n \nu _{i + n}) < \frac{t + y_1 + \cdots + y_n}{\sqrt{n + 1}} \right\} . \end{aligned}$$

Note that for a fixed \(y \in \mathbb {R}^n\), the set

$$\begin{aligned} \left\{ t \in \mathbb {R} :u(t\nu _i + y_1 \nu _{i + 1} + \cdots + y_n \nu _{i + n}) = \frac{t + y_1 + \cdots + y_n}{\sqrt{n + 1}} \right\} \end{aligned}$$

corresponds to the intersection of \({{\,\mathrm{graph}\,}}(u)\) with a line parallel to \(\nu _i\), so the functions \({\underline{g}}_i\) and \({\overline{g}}_i\) tell us something about the geometry of \({{\,\mathrm{graph}\,}}(u)\) as well.

The following properties of \({\underline{g}}_i\) and \({\overline{g}}_i\) have been proved elsewhere for \(n = 2\) [10,  Lemma 16]. The proof carries over to higher dimensions as well. We therefore do not repeat it here.

Lemma 7

For any \(i \in \mathbb {Z}_{n + 1}\), the following statements hold true.

  (i)

    The function \({\underline{g}}_i\) is lower semicontinuous and \({\overline{g}}_i\) is upper semicontinuous.

  (ii)

    The identity \({\underline{g}}_i = {\overline{g}}_i\) holds almost everywhere in \(\mathbb {R}^n\).

  (iii)

    For any \(y \in \mathbb {R}^n\), the inequality \({\underline{g}}_i(y) \le {\overline{g}}_i(y)\) holds true and

    $$\begin{aligned} \{y\} \times [{\underline{g}}_i(y), {\overline{g}}_i(y)] \subseteq \Phi _i({{\,\mathrm{graph}\,}}(u)). \end{aligned}$$
  (iv)

    Let \(t \in \mathbb {R}\) and \(y \in \mathbb {R}^n\). Then \(y \in \Gamma _i(t)\) if, and only if, \({\underline{g}}_i(y) \le t \le {\overline{g}}_i(y)\).

  (v)

    For all \(y \in \mathbb {R}^n\) and all \(\zeta \in (0, \infty )^n\), the inequality \({\overline{g}}_i(y + \zeta ) \le {\underline{g}}_i(y)\) is satisfied; and if equality holds, then

    $$\begin{aligned} {\underline{g}}_i(y) = {\underline{g}}_i(y + s\zeta ) = {\overline{g}}_i(y + s\zeta ) = {\overline{g}}_i(y + \zeta ) \end{aligned}$$

    for all \(s \in (0, 1)\).

  (vi)

    For all \(y \in \mathbb {R}^n\) and all \(\zeta \in [0, \infty )^n\), the inequalities \({\underline{g}}_i(y) \ge {\underline{g}}_i(y + \zeta )\) and \({\overline{g}}_i(y) \ge {\overline{g}}_i(y + \zeta )\) are satisfied.

Now consider the hyperplane \(P \subseteq \mathbb {R}^n\) given by

$$\begin{aligned} P = \left\{ y \in \mathbb {R}^n :y_1 + \cdots + y_n = 0 \right\} \end{aligned}$$

and its unit normal vector

$$\begin{aligned} \sigma = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \in \mathbb {R}^n. \end{aligned}$$

Let \(e_1, \dotsc , e_n\) be the standard basis vectors of \(\mathbb {R}^n\) and define

$$\begin{aligned} b_i = \sigma - \sqrt{n} e_i \end{aligned}$$

for \(i = 1, \dotsc , n\). Then

$$\begin{aligned} |b_i|^2 = n - 1 \end{aligned}$$

and

$$\begin{aligned} b_i \cdot b_j = -1 \end{aligned}$$

for \(i \ne j\). Hence \(b_1, \dotsc , b_n\) are the corners of a regular \((n - 1)\)-simplex in P centred at 0 with side length \(\sqrt{2n}\). (Indeed the construction is similar to that of the standard \((n - 1)\)-simplex.) Thus they are the \((n - 1)\)-dimensional counterparts to \(a_0, \dotsc , a_n\).
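The stated properties of \(b_1, \dotsc , b_n\) can be verified directly; the following numerical sketch (an illustration, not part of the argument) checks that each \(b_i\) lies in P, that \(|b_i|^2 = n - 1\) and \(b_i \cdot b_j = -1\) for \(i \ne j\), and that the side length is \(\sqrt{2n}\).

```python
import numpy as np

for n in range(2, 8):
    sigma = np.ones(n) / np.sqrt(n)
    b = [sigma - np.sqrt(n) * np.eye(n)[k] for k in range(n)]
    for i in range(n):
        assert np.isclose(b[i].sum(), 0)            # b_i lies in P
        assert np.isclose(b[i] @ b[i], n - 1)       # |b_i|^2 = n - 1
        for j in range(i + 1, n):
            assert np.isclose(b[i] @ b[j], -1)      # b_i . b_j = -1
            assert np.isclose(np.linalg.norm(b[i] - b[j]), np.sqrt(2 * n))
```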

Given a function \(f :P \times \mathbb {R}\rightarrow \mathbb {R}\), we write \({\tilde{\nabla }} f\) for its gradient with respect to the variable \(p \in P\). We want to show the following.

Proposition 8

Let \(i \in \mathbb {Z}_{n + 1}\). Then there exists a function \(f_i :P \times \mathbb {R}\rightarrow \mathbb {R}\) such that for almost every \(t \in \mathbb {R}\),

  • The function \(p \mapsto f_i(p, t)\) belongs to \(\mathrm {BV}_\mathrm {loc}^2(P)\) and \({\tilde{\nabla }} f_i(p, t) \in \{b_1, \dotsc , b_n\}\) for \({\mathcal {H}}^{n - 1}\)-almost every \(p \in P\); and

  • Its graph is \(\Gamma _i(t)\), that is, \(\Gamma _i(t) = \left\{ p + f_i(p, t) \sigma :p \in P \right\} \).

Before we can prove this result, we need a few lemmas.

Lemma 9

Let \(i \in \mathbb {Z}_{n + 1}\). Suppose that \(t \in \mathbb {R}\) and \(y_-, y_+ \in \Gamma _i(t)\). Then

$$\begin{aligned} \bigl (y_- + [0, \infty )^n\bigr ) \cap \bigl (y_+ - [0, \infty )^n\bigr ) \subseteq \Gamma _i(t). \end{aligned}$$

Proof

We first prove that

$$\begin{aligned} \bigl (y_- + (0, \infty )^n\bigr ) \cap \bigl (y_+ - (0, \infty )^n\bigr ) \subseteq \Gamma _i(t). \end{aligned}$$

Let

$$\begin{aligned} y \in \bigl (y_- + (0, \infty )^n\bigr ) \cap \bigl (y_+ - (0, \infty )^n\bigr ). \end{aligned}$$

Define \(\zeta _- = y - y_-\) and \(\zeta _+ = y_+ - y\). Then \(\zeta _-, \zeta _+ \in (0, \infty )^n\). According to Lemma 7, this means that

$$\begin{aligned} t \ge {\underline{g}}_i(y_-) \ge {\overline{g}}_i(y_- + \zeta _-) = {\overline{g}}_i(y) \ge {\underline{g}}_i(y) = {\underline{g}}_i(y_+ - \zeta _+) \ge {\overline{g}}_i(y_+) \ge t. \end{aligned}$$

Hence \(y \in \Gamma _i(t)\). By the semicontinuity of \({\underline{g}}_i\) and \({\overline{g}}_i\), we also conclude that

$$\begin{aligned} {\underline{g}}_i(y) \le t \le {\overline{g}}_i(y) \end{aligned}$$

for all \(y \in \bigl (y_- + [0, \infty )^n\bigr ) \cap \bigl (y_+ - [0, \infty )^n\bigr )\). \(\square \)

Lemma 10

Let \(i \in \mathbb {Z}_{n + 1}\). Let \(t \in \mathbb {R}\) and \(p \in P\). Suppose that

$$\begin{aligned} \left\{ s \in \mathbb {R} :p + s\sigma \in \Gamma _i(t) \right\} = [s_-, s_+]. \end{aligned}$$

Then

$$\begin{aligned} \Gamma _i(t) \cap \bigl (p + s_- \sigma - (0, \infty )^n\bigr ) = \emptyset \end{aligned}$$

and

$$\begin{aligned} \Gamma _i(t) \cap \bigl (p + s_+ \sigma + (0, \infty )^n\bigr ) = \emptyset . \end{aligned}$$

Proof

Let \(y \in p + s_- \sigma - (0, \infty )^n\). Choose \(s < s_-\) such that \(y \in p + s\sigma - (0, \infty )^n\) as well. Then Lemma 7 implies that

$$\begin{aligned} {\underline{g}}_i(y) \ge {\overline{g}}_i(p + s\sigma ) \ge {\underline{g}}_i(p + s\sigma ) \ge {\overline{g}}_i(p + s_-\sigma ) \ge t. \end{aligned}$$
(8)

Moreover, since \(p + s\sigma \not \in \Gamma _i(t)\), we know that \({\underline{g}}_i(p + s\sigma ) \ne t\). Therefore, we do not have equality everywhere in (8). Hence \(y \not \in \Gamma _i(t)\). The proof of the second statement is similar. \(\square \)

Lemma 11

There exists a constant C such that the following holds true. Suppose that \(v :\mathbb {R}^n \rightarrow \mathbb {R}\) is smooth and bounded with \(a_j \cdot \nabla v > -1\) for all \(j \in \mathbb {Z}_{n + 1}\) and \(\sup _{\mathbb {R}^n} |v| \le M\). Let \(i \in \mathbb {Z}_{n + 1}\). Let \(\phi :P \times \mathbb {R}\rightarrow \mathbb {R}\) be the unique function such that

$$\begin{aligned} \begin{pmatrix} p + \phi (p, t) \sigma \\ t \end{pmatrix} \in \Phi _i({{\,\mathrm{graph}\,}}(v)) \end{aligned}$$

for \(p \in P\) and \(t \in \mathbb {R}\). Then

$$\begin{aligned} |{\tilde{\nabla }} \phi (p, t)| \le \sqrt{n} \end{aligned}$$
(9)

for all \(p \in P\) and \(t \in \mathbb {R}\). Moreover, for any \(R > 0\),

$$\begin{aligned} \int _{-R}^R \int _{P \cap B_R(0)} |{\tilde{\nabla }}^2 \phi | \, d{\mathcal {H}}^{n - 1} \, dt \le C \int _{B_{C(M + R)}(0)} |\nabla ^2 v| \, dx. \end{aligned}$$

Since the proof of this statement is lengthy, we postpone it to the next section. We now prove Proposition 8.

Proof of Proposition 8

Let \(t \in \mathbb {R}\) and \(p \in P\). Since u is bounded, the line

$$\begin{aligned} \left\{ t\varvec{\nu }_i + \sum _{k = 1}^n (p_k + s\sigma _k) \varvec{\nu }_{i + k} :s \in \mathbb {R} \right\} \end{aligned}$$

(which is not horizontal) must intersect \({{\,\mathrm{graph}\,}}(u)\). Hence there exists \(s \in \mathbb {R}\) with \(p + s\sigma \in \Gamma _i(t)\).

If there are \(s_-, s_+ \in \mathbb {R}\) with \(s_- < s_+\) such that \(p + s_- \sigma \in \Gamma _i(t)\) and \(p + s_+ \sigma \in \Gamma _i(t)\), then Lemma 9 implies that \(\Gamma _i(t)\) has non-empty interior, denoted by \(\mathring{\Gamma }_i(t)\). Because of Lemma 7.(v), we know that \({\underline{g}}_i(y) = {\overline{g}}_i(y) = t\) for every \(y \in \mathring{\Gamma }_i(t)\). Hence for \(t_1 \ne t_2\), it follows that \(\mathring{\Gamma }_i(t_1) \cap \mathring{\Gamma }_i(t_2) = \emptyset \). Therefore, there can only be countably many \(t \in \mathbb {R}\) such that \(\mathring{\Gamma }_i(t) \ne \emptyset \). For all other values, we see that \(\Gamma _i(t)\) is the graph of a function over P. We denote this function by \(f_i(\cdot , t)\).

We extend \(f_i\) arbitrarily to the remaining values of t.

If t is such that \(\mathring{\Gamma }_i(t) = \emptyset \), then Lemma 10 shows that for every \(y \in \Gamma _i(t)\), the set \(\Gamma _i(t)\) is between the cones \(y + (0, \infty )^n\) and \(y - (0, \infty )^n\). Clearly there exists \(L > 0\) such that

$$\begin{aligned} \left\{ p + s \sigma :p \in P, s \in \mathbb {R}\text { with } |s| > L|p| \right\} \subseteq (0, \infty )^n \cup (-(0, \infty )^n). \end{aligned}$$

It follows that \(f_i(\cdot , t)\) is Lipschitz continuous with Lipschitz constant at most L.

Next we employ an approximation argument in conjunction with Lemma 11. Using a standard mollifier, we can find a sequence of smooth, uniformly bounded functions \(v_k :\mathbb {R}^n \rightarrow \mathbb {R}\) such that \(v_k \rightarrow u\) locally uniformly as \(k \rightarrow \infty \) and [1,  Proposition 3.7]

$$\begin{aligned} |D\nabla u|(\Omega ) = \lim _{k \rightarrow \infty } \int _\Omega |\nabla ^2 v_k| \, dx \end{aligned}$$

whenever \(\Omega \subseteq \mathbb {R}^n\) is an open, bounded set with \(|D\nabla u|(\partial \Omega ) = 0\). It is then easy to modify \(v_k\) (e.g., replacing it with \((1 - 1/k)v_k\)) such that in addition, it satisfies \(a_j \cdot \nabla v_k > -1\) in \(\mathbb {R}^n\) for every \(j \in \mathbb {Z}_{n + 1}\). Hence Lemma 11 applies to \(v_k\).

From the above convergence, it follows that for any sequence of points \({\varvec{x}}_k \in {{\,\mathrm{graph}\,}}(v_k)\), if \({\varvec{x}}_k \rightarrow {\varvec{x}}\) as \(k \rightarrow \infty \), then \({\varvec{x}}\in {{\,\mathrm{graph}\,}}(u)\). If we define \(\phi _k\) as in Lemma 11, then for any fixed \(t \in \mathbb {R}\), the functions \(\phi _k(\cdot , t)\) are uniformly bounded in \(C^{0, 1}(P \cap B_R(0))\) for any \(R > 0\). Hence there is a subsequence that converges locally uniformly. If t is such that \(\Gamma _i(t)\) is the graph of \(f_i(\cdot , t)\), then it is clear that the limit of any such subsequence must coincide with \(f_i(\cdot , t)\). Hence in this case, we have the locally uniform convergence \(\phi _k(\cdot , t) \rightarrow f_i(\cdot , t)\) as \(k \rightarrow \infty \). The second inequality in Lemma 11 implies that

$$\begin{aligned} \limsup _{k \rightarrow \infty } \int _{-R}^R \int _{P \cap B_R(0)} |{\tilde{\nabla }}^2 \phi _k| d{\mathcal {H}}^{n - 1} \, dt < \infty \end{aligned}$$

for any \(R > 0\). By Fatou’s lemma,

$$\begin{aligned} \int _{-R}^R \liminf _{k \rightarrow \infty } \int _{P \cap B_R(0)} |{\tilde{\nabla }}^2 \phi _k| d{\mathcal {H}}^{n - 1} \, dt < \infty . \end{aligned}$$

Therefore, for almost every \(t \in (-R, R)\), there exists a subsequence \((\phi _{k_\ell }(\cdot , t))_{\ell \in \mathbb {N}}\) converging to \(f_i(\cdot , t)\) locally uniformly and such that

$$\begin{aligned} \limsup _{\ell \rightarrow \infty } \int _{P \cap B_R(0)} |{\tilde{\nabla }}^2 \phi _{k_\ell }| d{\mathcal {H}}^{n - 1} < \infty . \end{aligned}$$

We conclude that \(f_i(\cdot , t) \in \mathrm {BV}^2(P \cap B_R(0))\) for almost all \(t \in (-R, R)\). Applying this argument for a sequence \(R_\ell \rightarrow \infty \), we then see that \(f_i(\cdot , t) \in \mathrm {BV}_\mathrm {loc}^2(P)\) for almost all \(t \in \mathbb {R}\).

We finally need to show that \({\tilde{\nabla }} f_i(p, t) \in \{b_1, \dotsc , b_n\}\) for almost every \(t \in \mathbb {R}\) and \({\mathcal {H}}^{n - 1}\)-almost every \(p \in P\).

Consider the function \(w_i :\mathbb {R}^n \rightarrow \mathbb {R}\) with

$$\begin{aligned} w_i(x) = \frac{u(x) - a_i \cdot x}{\sqrt{n + 1}}, \quad x \in \mathbb {R}^n. \end{aligned}$$

Then for every \(t \in \mathbb {R}\),

$$\begin{aligned} \begin{aligned} \Gamma _i(t) \times \{t\}&= \Phi _i\bigl (\left\{ {\varvec{x}}\in {{\,\mathrm{graph}\,}}(u) :{\varvec{x}}\cdot \varvec{\nu }_i = t \right\} \bigr ) \\&= \Phi _i\left( \left\{ \begin{pmatrix} x \\ u(x) \end{pmatrix} :x \in \mathbb {R}^n \text { with } w_i(x) = t \right\} \right) . \end{aligned} \end{aligned}$$

Note further that \({\mathcal {F}}_i\) coincides up to an \({\mathcal {H}}^n\)-null set with \(\left\{ x \in \mathbb {R}^n :\nabla w_i(x) = 0 \right\} \). Let \({\mathcal {Z}} \subset \mathbb {R}^n\) denote the set of all points where u is not differentiable. By Rademacher’s theorem, this is an \({\mathcal {H}}^n\)-null set. Hence the coarea formula gives

$$\begin{aligned} 0 = \int _{{\mathcal {F}}_i \cup {\mathcal {Z}}} |\nabla w_i| \, dx = \int _{-\infty }^\infty {\mathcal {H}}^{n - 1}\bigl (w_i^{-1}(\{t\}) \cap ({\mathcal {F}}_i \cup {\mathcal {Z}})\bigr ) \, dt. \end{aligned}$$

In particular, for almost all \(t \in \mathbb {R}\),

$$\begin{aligned} {\mathcal {H}}^{n - 1}\bigl (w_i^{-1}(\{t\}) \cap ({\mathcal {F}}_i \cup {\mathcal {Z}})\bigr ) = 0. \end{aligned}$$

As the map \({\varvec{U}}\) (defined in the introduction) is Lipschitz continuous, we conclude that \({\varvec{U}}\left( w_i^{-1}(\{t\}) \cap ({\mathcal {F}}_i \cup {\mathcal {Z}})\right) \) is an \({\mathcal {H}}^{n - 1}\)-null set, too. Therefore, for \({\mathcal {H}}^{n - 1}\)-almost all \(y \in \Gamma _i(t)\), the unique point \(x \in \mathbb {R}^n\) with

$$\begin{aligned} \Phi _i({\varvec{U}}(x)) = \begin{pmatrix} y \\ t \end{pmatrix} \end{aligned}$$

belongs to \(\mathbb {R}^n \setminus {\mathcal {Z}}\) and satisfies \(\nabla u(x) \in A \setminus \{a_i\}\).

To put it differently, for almost every \(t \in \mathbb {R}\), the following holds true: for \({\mathcal {H}}^{n - 1}\)-almost every \(p \in P\), the derivative of u exists at the point

$$\begin{aligned} \Theta (p, t) = t\nu _i + \sum _{k = 1}^n (p_k + f_i(p, t) \sigma _k) \nu _{i + k} \end{aligned}$$

and belongs to \(A \setminus \{a_i\}\). Furthermore, we know that \(f_i(\cdot , t)\) is differentiable at \({\mathcal {H}}^{n - 1}\)-almost every p by Rademacher’s theorem. At a point \(p \in P\) where both statements hold true, we can differentiate the equation

$$\begin{aligned} u(\Theta (p, t)) = \frac{t + \sqrt{n} f_i(p, t)}{\sqrt{n + 1}}. \end{aligned}$$

(The right-hand side is the \((n + 1)\)-st component of

$$\begin{aligned} t\varvec{\nu }_i + \sum _{k = 1}^n (p_k + f_i(p, t) \sigma _k) \varvec{\nu }_{i + k} = \Phi _i^{-1}\begin{pmatrix} p + f_i(p, t) \sigma \\ t \end{pmatrix} \end{aligned}$$

because \(p \in P\) and by the definition of \(\sigma \).) For any \(\varpi \in P\), we thus obtain

$$\begin{aligned} -\left( \sum _{k = 1}^n \varpi _k a_{i + k} + \varpi \cdot {\tilde{\nabla }} f_i(p, t) \sum _{k = 1}^n \sigma _k a_{i + k}\right) \cdot \nabla u(\Theta (p, t)) =\sqrt{n} \, \varpi \cdot {\tilde{\nabla }} f_i(p, t). \end{aligned}$$

If \(\nabla u(\Theta (p, t)) = a_j\) for some \(j \ne i\), then this simplifies to

$$\begin{aligned} - (n + 1) \varpi _{j - i} - \frac{1}{\sqrt{n}} \varpi \cdot {\tilde{\nabla }} f_i(p, t) = \sqrt{n} \varpi \cdot {\tilde{\nabla }} f_i(p, t). \end{aligned}$$

Hence

$$\begin{aligned} \varpi \cdot {\tilde{\nabla }} f_i(p, t) = -\sqrt{n} \, \varpi _{j - i} = b_{j - i} \cdot \varpi . \end{aligned}$$

We therefore conclude that \({\tilde{\nabla }} f_i(p, t) = b_{j - i}\) at such a point. \(\square \)
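The final step is a linear equation for \(\varpi \cdot {\tilde{\nabla }} f_i(p, t)\). As a numerical sanity check (writing m for the index \(j - i\) and assuming, without loss of generality, \(i = 0\)), the sketch below solves that equation for random test vectors \(\varpi \in P\) and confirms that the solution is \(-\sqrt{n} \, \varpi _{j - i} = b_{j - i} \cdot \varpi \).

```python
import numpy as np

rng = np.random.default_rng(0)
for n in range(2, 7):
    sigma = np.ones(n) / np.sqrt(n)
    for m in range(1, n + 1):              # m plays the role of j - i
        b_m = sigma - np.sqrt(n) * np.eye(n)[m - 1]
        w = rng.standard_normal(n)
        w -= w.mean()                      # a test vector in the hyperplane P
        # solve -(n + 1) w_m - x / sqrt(n) = sqrt(n) x for x
        x = -(n + 1) * w[m - 1] / (np.sqrt(n) + 1 / np.sqrt(n))
        assert np.isclose(x, -np.sqrt(n) * w[m - 1])
        assert np.isclose(x, b_m @ w)      # x = b_{j-i} . w, since sigma . w = 0
```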

6 Proof of Lemma 11

In this section we give the postponed proof of Lemma 11. To this end, we first need another lemma.

Lemma 12

Let \(\gamma _{ik} \in \mathbb {R}\) for \(i \in \mathbb {Z}_{n + 1}\) and \(k = 1, \dotsc , n\), and let \(\Lambda \) denote the \((n \times n)\)-matrix with columns

$$\begin{aligned} \sum _{i \in \mathbb {Z}_{n + 1}} \gamma _{ik} a_i, \quad k = 1, \dotsc , n. \end{aligned}$$

Then

$$\begin{aligned} \det (\Lambda ) = (-1)^n (n + 1)^{\frac{n - 1}{2}} \det \begin{pmatrix} \gamma _{01} &{} \cdots &{} \gamma _{0n} &{} 1 \\ \vdots &{} &{} \vdots &{} \vdots \\ \gamma _{n1} &{} \cdots &{} \gamma _{nn} &{} 1 \end{pmatrix}. \end{aligned}$$

Proof

Let M denote the \(((n + 1) \times (n + 1))\)-matrix with columns

$$\begin{aligned} \sum _{i \in \mathbb {Z}_{n + 1}} \gamma _{ik} \varvec{\nu }_i, \quad k = 1, \dotsc , n, \quad \text {and} \quad \sum _{i \in \mathbb {Z}_{n + 1}} \varvec{\nu }_i. \end{aligned}$$

Then, since \((\varvec{\nu }_1, \dotsc , \varvec{\nu }_{n + 1})\) is a positively oriented orthonormal basis of \(\mathbb {R}^{n + 1}\), we conclude that

$$\begin{aligned} \det (M) = \det \begin{pmatrix} \gamma _{01} &{} \cdots &{} \gamma _{0n} &{} 1 \\ \vdots &{} &{} \vdots &{} \vdots \\ \gamma _{n1} &{} \cdots &{} \gamma _{nn} &{} 1 \end{pmatrix}. \end{aligned}$$

On the other hand,

$$\begin{aligned} M = \frac{1}{\sqrt{n + 1}} \begin{pmatrix} &{}&{}&{} 0 \\ &{} -\Lambda &{} &{} \vdots \\ &{}&{}&{} 0 \\ m_1 &{} \cdots &{} m_n &{} n + 1 \end{pmatrix}, \end{aligned}$$

where \(m_k = \sum _{i \in \mathbb {Z}_{n + 1}} \gamma _{ik}\). Here we use the properties of \(a_0, \dotsc , a_n\), including Eq. (7). Hence

$$\begin{aligned} \det (M) = (-1)^n (n + 1)^{\frac{1 - n}{2}} \det (\Lambda ). \end{aligned}$$

The claim follows immediately. \(\square \)
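Since the sign in Lemma 12 depends on the orientation of the chosen bases, the following sketch only checks the identity up to sign: it realises \(a_0, \dotsc , a_n\) in \(\mathbb {R}^n\) through the hyperplane \(\{x \in \mathbb {R}^{n + 1} :x_1 + \cdots + x_{n + 1} = 0\}\) and compares \(|\det (\Lambda )|\) with \((n + 1)^{(n - 1)/2}\) times the augmented determinant for random coefficients \(\gamma _{ik}\).

```python
import numpy as np

rng = np.random.default_rng(3)
for n in range(2, 6):
    # simplex corners a_0, ..., a_n in R^n, via the hyperplane {sum x = 0}
    E = np.sqrt(n + 1) * (np.eye(n + 1) - np.ones((n + 1, n + 1)) / (n + 1))
    U, _, _ = np.linalg.svd(E)
    a = U[:, :n].T @ E                     # column i is a_i
    # Gram matrix: diagonal n, off-diagonal -1
    assert np.allclose(a.T @ a, (n + 1) * np.eye(n + 1) - 1)
    gamma = rng.standard_normal((n + 1, n))
    Lam = a @ gamma                        # columns: sum_i gamma_{ik} a_i
    aug = np.hstack([gamma, np.ones((n + 1, 1))])
    assert np.isclose(abs(np.linalg.det(Lam)),
                      (n + 1) ** ((n - 1) / 2) * abs(np.linalg.det(aug)))
```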

Proof of Lemma 11

Suppose that \({\varvec{x}}= ({\begin{matrix} x \\ v(x)\end{matrix}}) \in {{\,\mathrm{graph}\,}}(v)\). Then

$$\begin{aligned} {\varvec{N}} = \frac{1}{\sqrt{1 + |\nabla v|^2}} \begin{pmatrix} -\nabla v(x) \\ 1 \end{pmatrix} \end{aligned}$$

is a normal vector to \({{\,\mathrm{graph}\,}}(v)\) at \({\varvec{x}}\). We compute

$$\begin{aligned} \Phi _i({\varvec{N}}) = \frac{1}{\sqrt{(n + 1) (1 + |\nabla v|^2)}} \begin{pmatrix} 1 + a_{i + 1} \cdot \nabla v(x) \\ \vdots \\ 1 + a_{i + n + 1} \cdot \nabla v(x) \end{pmatrix}. \end{aligned}$$

Under the assumptions of Lemma 11, all the components of this vector are positive. Hence it is not parallel to \((0, \dotsc , 0, 1)^T\) and not perpendicular to \((1, \dotsc , 1, 0)^T\). The implicit function theorem therefore implies that

$$\begin{aligned} \Phi _i({{\,\mathrm{graph}\,}}(v)) \cap \left\{ {\varvec{y}} \in \mathbb {R}^{n + 1} :y_{n + 1} = t \right\} \end{aligned}$$

is a smooth \((n - 1)\)-dimensional manifold for every \(t \in \mathbb {R}\) and that the function \(\phi \) from the statement of the lemma is smooth.

If we define \(\varvec{\Xi }:P \times \mathbb {R}^2 \rightarrow \mathbb {R}^{n + 1}\) such that

$$\begin{aligned} \varvec{\Xi }(p, s, t) = t\varvec{\nu }_i + \sum _{k = 1}^n (p_k + s \sigma _k) \varvec{\nu }_{i + k} \end{aligned}$$

for \(p \in P\) and \(s, t \in \mathbb {R}\), then \(\phi \) is characterised by the condition that

$$\begin{aligned} \varvec{\Xi }(p, \phi (p, t), t) \in {{\,\mathrm{graph}\,}}(v) \end{aligned}$$

for all \(t \in \mathbb {R}\) and \(p \in P\). Hence, writing \(\Xi \) for the vector of the first n components of \(\varvec{\Xi }\) and \(\Xi _{n + 1}\) for its last component,

$$\begin{aligned} v\bigl (\Xi (p, \phi (p, t), t)\bigr ) = \Xi _{n + 1}(p, \phi (p, t), t). \end{aligned}$$
(10)

We now differentiate this equation.

We compute

$$\begin{aligned} \frac{\partial \Xi }{\partial t} = \nu _i = - \frac{a_i}{\sqrt{n + 1}}, \quad \frac{\partial \Xi _{n + 1}}{\partial t} = \frac{1}{\sqrt{n + 1}}. \end{aligned}$$

For \(\varpi \in P\),

$$\begin{aligned} \varpi \cdot {\tilde{\nabla }} \Xi = -\frac{1}{\sqrt{n + 1}} \sum _{k = 1}^n \varpi _k a_{i + k}, \quad \varpi \cdot {\tilde{\nabla }} \Xi _{n + 1} = \frac{1}{\sqrt{n + 1}} \sum _{k = 1}^n \varpi _k = 0. \end{aligned}$$

Finally,

$$\begin{aligned} \frac{\partial \Xi }{\partial s} = \sum _{k = 1}^n \sigma _k \nu _{i + k} = - \frac{1}{\sqrt{n^2 + n}} \sum _{k = 1}^n a_{i + k} = \frac{a_i}{\sqrt{n^2 + n}}, \quad \frac{\partial \Xi _{n + 1}}{\partial s} = \sqrt{\frac{n}{n + 1}}. \end{aligned}$$

We define \(\Theta (p, t) = \Xi (p, \phi (p, t), t)\). Differentiating (10), we now conclude that

$$\begin{aligned} \left( \frac{1}{\sqrt{n}} \frac{\partial \phi }{\partial t}(p, t) - 1\right) a_i \cdot \nabla v(\Theta (p, t)) = \sqrt{n} \frac{\partial \phi }{\partial t}(p, t) + 1 \end{aligned}$$

and

$$\begin{aligned} \left( \frac{1}{\sqrt{n}} \varpi \cdot {\tilde{\nabla }} \phi (p, t) a_i - \sum _{k = 1}^n \varpi _k a_{i + k}\right) \cdot \nabla v(\Theta (p, t)) = \sqrt{n} \varpi \cdot {\tilde{\nabla }} \phi (p, t). \end{aligned}$$
(11)

Hence

$$\begin{aligned} \frac{\partial \phi }{\partial t}(p, t) = \sqrt{n} \frac{a_i \cdot \nabla v(\Theta (p, t)) + 1}{a_i \cdot \nabla v(\Theta (p, t)) - n} \end{aligned}$$
(12)

and

$$\begin{aligned} \varpi \cdot {\tilde{\nabla }} \phi (p, t) = \sqrt{n} \frac{\sum _{k = 1}^n \varpi _k a_{i + k} \cdot \nabla v(\Theta (p, t))}{a_i \cdot \nabla v(\Theta (p, t)) - n}. \end{aligned}$$

Fix \(t \in \mathbb {R}\) and \(p \in P\). Since \(\nabla v(\Theta (p, t))\) is in the interior of the convex hull of the set \(\left\{ a_j :j \in \mathbb {Z}_{n + 1} \right\} \), there exist \(\tau _j \in (0, 1)\) for \(j \in \mathbb {Z}_{n + 1}\) such that

$$\begin{aligned} \sum _{j \in \mathbb {Z}_{n + 1}} \tau _j = 1 \end{aligned}$$

and

$$\begin{aligned} \nabla v(\Theta (p, t)) = \sum _{j \in \mathbb {Z}_{n + 1}} \tau _j a_j. \end{aligned}$$

Then

$$\begin{aligned} a_i \cdot \nabla v(\Theta (p, t)) - n = n \tau _i - \sum _{j \ne i} \tau _j - n = (n + 1) (\tau _i - 1), \end{aligned}$$

while

$$\begin{aligned} \sum _{k = 1}^n \varpi _k a_{i + k} \cdot \nabla v(\Theta (p, t)) = \sum _{k = 1}^n \varpi _k \left( n\tau _{i + k} - \sum _{j \ne i + k} \tau _j\right) = (n + 1) \sum _{k = 1}^n \varpi _k \tau _{i + k}. \end{aligned}$$

We further note that

$$\begin{aligned} \tau _{i + 1}^2 + \cdots + \tau _{i + n}^2 \le (\tau _{i + 1} + \cdots + \tau _{i + n})^2 = (1 - \tau _i)^2. \end{aligned}$$

The Cauchy–Schwarz inequality therefore implies that

$$\begin{aligned} \left| \sum _{k = 1}^n \varpi _k a_{i + k} \cdot \nabla v(\Theta (p, t))\right| \le (n + 1) (1 - \tau _i) |\varpi |. \end{aligned}$$

It follows that

$$\begin{aligned} |\varpi \cdot {\tilde{\nabla }} \phi (p, t)| \le \sqrt{n} |\varpi |, \end{aligned}$$

and inequality (9) is proved.
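The chain of estimates leading to (9) can be tested numerically: using the identities \(a_i \cdot \nabla v(\Theta ) - n = (n + 1)(\tau _i - 1)\) and \(\sum _{k} \varpi _k a_{i + k} \cdot \nabla v(\Theta ) = (n + 1) \sum _{k} \varpi _k \tau _{i + k}\) derived above, the sketch below samples random barycentric coordinates \(\tau \) and test vectors \(\varpi \in P\) (taking \(i = 0\) for simplicity) and checks the bound \(|\varpi \cdot {\tilde{\nabla }} \phi | \le \sqrt{n} |\varpi |\).

```python
import numpy as np

rng = np.random.default_rng(1)
for trial in range(500):
    n = int(rng.integers(2, 7))
    tau = rng.dirichlet(np.ones(n + 1))    # barycentric coordinates, tau_j > 0
    w = rng.standard_normal(n)
    w -= w.mean()                          # test vector in P
    # w . grad phi = sqrt(n) (n+1) sum_k w_k tau_k / ((n+1)(tau_0 - 1))
    grad_phi_w = np.sqrt(n) * (n + 1) * (w @ tau[1:]) / ((n + 1) * (tau[0] - 1))
    assert abs(grad_phi_w) <= np.sqrt(n) * np.linalg.norm(w) + 1e-9
```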

In order to prove the second statement of Lemma 11, we need to differentiate (11) again with respect to p. We write \(\Lambda : M\) for the Frobenius inner product between two matrices \(\Lambda \) and M. We also drop the arguments (pt) in the derivatives of \(\phi \) and in \(\Theta \). Then for all \(\varpi , \xi \in P\),

$$\begin{aligned}&\sqrt{\frac{n + 1}{n}} (\xi \otimes \varpi ) : {\tilde{\nabla }}^2 \phi \\&\quad = \frac{\left( \frac{\xi \cdot {\tilde{\nabla }} \phi }{\sqrt{n}} a_i - \sum _{k = 1}^n \xi _k a_{i +k}\right) \otimes \left( \frac{\varpi \cdot {\tilde{\nabla }} \phi }{\sqrt{n}} a_i - \sum _{k = 1}^n \varpi _k a_{i +k}\right) }{n - a_i \cdot \nabla v(\Theta )} : \nabla ^2 v(\Theta ). \end{aligned}$$

As we have already seen that \(|{\tilde{\nabla }} \phi | \le \sqrt{n}\), it follows that there is a constant \(C_1 = C_1(n)\) such that

$$\begin{aligned} |{\tilde{\nabla }}^2 \phi | \le \frac{C_1|\nabla ^2 v(\Theta )|}{n - a_i \cdot \nabla v(\Theta )}. \end{aligned}$$

Choose an orthonormal basis \((\eta _1, \dotsc , \eta _{n - 1})\) of P. Next we examine the derivative \(d\Theta \), and more specifically, its determinant.

Define \(\gamma _{1k}, \dotsc , \gamma _{nk}\), for \(k = 1, \dotsc , n - 1\), such that \(\eta _k = (\gamma _{1k}, \dotsc , \gamma _{nk})^T\). For \(t \in \mathbb {R}\) and \(p \in P\), we also define

$$\begin{aligned} \gamma _{n + 1, k}(p, t) = - \frac{1}{\sqrt{n}} \eta _k \cdot {\tilde{\nabla }} \phi (p, t), \quad k = 1, \dotsc , n - 1, \end{aligned}$$

and

$$\begin{aligned} \gamma _{n + 1, n}(p, t) = 1 - \frac{1}{\sqrt{n}} \frac{\partial \phi }{\partial t}(p, t). \end{aligned}$$

Finally, we set \(\gamma _{\ell n} = 0\) for \(\ell = 1, \dotsc , n\). We compute

$$\begin{aligned} \eta _k \cdot {\tilde{\nabla }} \Theta (p, t) = \frac{1}{\sqrt{n + 1}} \left( \frac{1}{\sqrt{n}} \eta _k \cdot {\tilde{\nabla }} \phi (p, t) a_i - \sum _{\ell = 1}^n \gamma _{\ell k} a_{i + \ell }\right) \end{aligned}$$

and

$$\begin{aligned} \frac{\partial \Theta }{\partial t}(p, t) = \left( \frac{1}{\sqrt{n}} \frac{\partial \phi }{\partial t}(p, t) - 1\right) \frac{a_i}{\sqrt{n + 1}}. \end{aligned}$$

Hence we can represent \(d\Theta \) by the matrix with columns

$$\begin{aligned} -\sum _{\ell = 1}^{n + 1} \frac{\gamma _{\ell k} a_{i + \ell }}{\sqrt{n + 1}}, \quad k = 1, \dotsc , n, \end{aligned}$$

with respect to the basis of \(P \times \mathbb {R}\) induced by \(\eta _1, \dotsc , \eta _{n - 1}\). Lemma 12 now tells us that

$$\begin{aligned} \begin{aligned} \det (d\Theta )&= \pm \frac{1}{\sqrt{n + 1}} \det \begin{pmatrix} \gamma _{11} &{} \cdots &{} \gamma _{1n} &{} 1 \\ \vdots &{}&{} \vdots &{} \vdots \\ \gamma _{n + 1, 1} &{} \cdots &{} \gamma _{n + 1, n} &{} 1 \end{pmatrix} \\&= \pm \frac{1}{\sqrt{n + 1}} \det \begin{pmatrix} \gamma _{11} &{} \cdots &{} \gamma _{1, n - 1} &{} 0 &{} 1 \\ \vdots &{}&{} \vdots &{} \vdots &{} \vdots \\ \gamma _{n1} &{} \cdots &{} \gamma _{n, n - 1} &{} 0 &{} 1 \\ \gamma _{n + 1, 1} &{} \cdots &{} \gamma _{n + 1, n - 1} &{} \gamma _{n + 1, n} &{} 1 \end{pmatrix} \\&= \mp \sqrt{\frac{n}{n + 1}} \gamma _{n + 1, n} \det \begin{pmatrix} \gamma _{11} &{} \cdots &{} \gamma _{1, n - 1} &{} \sigma _1 \\ \vdots &{}&{} \vdots &{} \vdots \\ \gamma _{n1} &{} \cdots &{} \gamma _{n, n - 1} &{} \sigma _n \end{pmatrix}. \end{aligned} \end{aligned}$$

As \((\eta _1, \dotsc , \eta _{n - 1}, \sigma )\) form an orthonormal basis of \(\mathbb {R}^n\), we find that

$$\begin{aligned} |\det (d\Theta )| = \sqrt{\frac{n}{n + 1}} |\gamma _{n + 1, n}| = \frac{1}{\sqrt{n + 1}} \left| \sqrt{n} - \frac{\partial \phi }{\partial t}\right| . \end{aligned}$$

Recalling (12), we now obtain

$$\begin{aligned} |\det (d\Theta )| = \frac{\sqrt{n^2 + n}}{n - a_i \cdot \nabla v(\Theta )}. \end{aligned}$$
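The last identity is a one-line computation from (12); as a sanity check, the sketch below confirms it numerically, writing g for the scalar \(a_i \cdot \nabla v(\Theta )\), which lies in \((-1, n)\) under the assumptions of Lemma 11.

```python
import numpy as np

rng = np.random.default_rng(2)
for n in range(2, 7):
    for g in rng.uniform(-1.0, n - 0.1, 10):   # g stands for a_i . grad v
        phi_t = np.sqrt(n) * (g + 1) / (g - n)              # equation (12)
        det_dTheta = abs(np.sqrt(n) - phi_t) / np.sqrt(n + 1)
        assert np.isclose(det_dTheta, np.sqrt(n**2 + n) / (n - g))
```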

We also note that the map \(\Theta \) is injective. Given \(R > 0\), we therefore compute

$$\begin{aligned} \begin{aligned}&\int _{-R}^R \int _{P \cap B_R(0)} |{\tilde{\nabla }}^2 \phi | \, d{\mathcal {H}}^{n - 1} \, dt \\&\quad \le C_1 \int _{-R}^R \int _{P \cap B_R(0)} \frac{|\nabla ^2 v(\Theta )|}{n - a_i \cdot \nabla v(\Theta )} \, d{\mathcal {H}}^{n - 1} \, dt \\&\quad = \frac{C_1}{\sqrt{n^2 + n}} \int _{-R}^R \int _{P \cap B_R(0)} |\nabla ^2 v(\Theta )| |\det (d\Theta )| \, d{\mathcal {H}}^{n - 1} \, dt \\&\quad = \frac{C_1}{\sqrt{n^2 + n}} \int _{\Theta ((P \cap B_R(0)) \times (-R, R))} |\nabla ^2 v| \, dx. \end{aligned} \end{aligned}$$
(13)

It remains to examine the set \(\Theta ((P \cap B_R(0)) \times (-R, R))\). Recall that we have the assumption \(\sup _{\mathbb {R}^n} |v| \le M\) in Lemma 11. Thus (10) implies that

$$\begin{aligned} |\Xi _{n + 1}(p, \phi (p, t), t)| \le M. \end{aligned}$$

Since

$$\begin{aligned} \Xi _{n + 1}(p, \phi (p, t), t) = \frac{t + \sqrt{n} \phi (p, t)}{\sqrt{n + 1}}, \end{aligned}$$

this means that

$$\begin{aligned} |\phi (p, t)| \le M \sqrt{\frac{n + 1}{n}} + \frac{R}{\sqrt{n}} \end{aligned}$$

when \(t \in (-R, R)\). Hence there exists a constant \(C_2 = C_2(n)\) such that

$$\begin{aligned} |\Theta (p, t)| \le C_2(M + R) \end{aligned}$$

for all \(p \in P \cap B_R(0)\) and all \(t \in (-R, R)\). Thus (13) implies the second inequality of Lemma 11. \(\square \)

7 Proof of Theorem 3

In this section we combine the previous results to prove the second main theorem. We first consider a function \(u \in \mathrm {BV}_\mathrm {loc}^2(\mathbb {R}^n) \cap L^\infty (\mathbb {R}^n)\) such that \({{\,\mathrm{graph}\,}}(u)\) is close to the graph of \(\lambda _i \wedge \lambda _j\) or \(\lambda _i \vee \lambda _j\) in a cube in \(\mathbb {R}^{n + 1}\) with edges parallel to \(\varvec{\nu }_1, \dotsc , \varvec{\nu }_{n + 1}\). We will give a condition which implies that such a function actually coincides with \(\lambda _i \wedge \lambda _j\) or \(\lambda _i \vee \lambda _j\) up to a constant in part of the domain.

For \(i, j \in \mathbb {Z}_{n + 1}\) with \(i \ne j\) and for \(r, R > 0\), we define

$$\begin{aligned} Q_{ij}(r, R) = \Biggl \{\sum _{k \in \mathbb {Z}_{n + 1}} c_k \varvec{\nu }_k :c_i, c_j \in (-r, r) \text { and } c_k \in (-R, R) \text { for } k \not \in \{i, j\}\Biggr \}. \end{aligned}$$

Again we consider the map \({\varvec{U}}:\mathbb {R}^n \rightarrow \mathbb {R}^{n + 1}\) with \({\varvec{U}}(x) = ({\begin{matrix} x \\ u(x) \end{matrix}})\) for \(x \in \mathbb {R}^n\). Recall that \({\mathcal {E}}^* = {\varvec{U}}({\mathcal {E}})\). The following is the key statement for the proof of Theorem 3.

Proposition 13

Let \(n \in \mathbb {N}\). For any \(\delta > 0\) there exists \(\epsilon > 0\) with the following properties. Let \(i, j \in \mathbb {Z}_{n + 1}\) with \(i \ne j\). Suppose that \(|u(0)| \le \epsilon \) and either

$$\begin{aligned} |u- \lambda _i \wedge \lambda _j| \le \epsilon \quad \text {in } {\varvec{U}}^{-1}\bigl (Q_{ij}(1,1)\bigr ) \end{aligned}$$
(14)

or

$$\begin{aligned} |u - \lambda _i \vee \lambda _j| \le \epsilon \quad \text {in } {\varvec{U}}^{-1}\bigl (Q_{ij}(1,1)\bigr ). \end{aligned}$$
(15)

Then

$$\begin{aligned} {\mathcal {H}}^{n - 1}\bigl ({\mathcal {E}}^* \cap Q_{ij}(\textstyle \frac{1}{4}, 1)\bigr ) \ge 2^{n - 1}(1 - \delta ). \end{aligned}$$
(16)

If, in addition,

$$\begin{aligned} {\mathcal {H}}^{n - 1}\bigl ({\mathcal {E}}^* \cap Q_{ij}(1, 1)\bigr ) \le 2^{n - 1}(1 + \epsilon ), \end{aligned}$$
(17)

then there exist \(\alpha , \beta \in \mathbb {R}\) such that

$$\begin{aligned} u = (\lambda _i + \alpha ) \wedge (\lambda _j + \beta ) \quad \text {in } \textstyle {\varvec{U}}^{-1}\bigl (Q_{ij}(\frac{1}{2}, \frac{1}{2})\bigr ) \end{aligned}$$
(18)

or

$$\begin{aligned} u = (\lambda _i + \alpha ) \vee (\lambda _j + \beta ) \quad \text {in } \textstyle {\varvec{U}}^{-1}\bigl (Q_{ij}(\frac{1}{2}, \frac{1}{2})\bigr ). \end{aligned}$$
(19)
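For orientation, note where the constant \(2^{n - 1}\) comes from. Assuming, as in the rest of this section, that \({{\,\mathrm{graph}\,}}(\lambda _k)\) is the hyperplane orthogonal to \(\varvec{\nu }_k\) for each \(k \in \mathbb {Z}_{n + 1}\), the model function \(\lambda _i \wedge \lambda _j\) has its edge on \({{\,\mathrm{graph}\,}}(\lambda _i) \cap {{\,\mathrm{graph}\,}}(\lambda _j)\), which is the \((n - 1)\)-dimensional subspace spanned by the vectors \(\varvec{\nu }_k\) with \(k \not \in \{i, j\}\). Hence

$$\begin{aligned} {\mathcal {H}}^{n - 1}\bigl ({{\,\mathrm{graph}\,}}(\lambda _i) \cap {{\,\mathrm{graph}\,}}(\lambda _j) \cap Q_{ij}(r, R)\bigr ) = (2R)^{n - 1} \end{aligned}$$

for all \(r, R > 0\); with \(R = 1\) this equals \(2^{n - 1}\). Inequalities (16) and (17) therefore say that \({\mathcal {E}}^*\) occupies almost exactly as much of the box as the edge of the model function.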

Before we can prove Proposition 13, we need a few more lemmas. First we need some more information on the functions \(f_i\) from Proposition 8. Recall that \(f_i(\cdot , t) \in \mathrm {BV}_\mathrm {loc}(P)\) for almost all \(t \in \mathbb {R}\).

Given \(i \in \mathbb {Z}_{n + 1}\) and given \(t \in \mathbb {R}\) such that \(f_i(\cdot , t) \in \mathrm {BV}_\mathrm {loc}(P)\), let \({\mathcal {D}}_i'(t) \subseteq P\) denote the approximate jump set of \(f_i(\cdot , t)\). Thus this set is defined analogously to \({\mathcal {E}}'\), but for the function \(f_i(\cdot , t)\) instead of u. Furthermore, we set

$$\begin{aligned} {\mathcal {D}}_i^\dagger (t) = \left\{ p + f_i(p, t) \sigma :p \in {\mathcal {D}}_i'(t) \right\} , \end{aligned}$$

in analogy to \({\mathcal {E}}^\dagger \).

Lemma 14

Let \(i \in \mathbb {Z}_{n + 1}\). For almost any \(t \in \mathbb {R}\),

$$\begin{aligned} {\mathcal {D}}_i^\dagger (t) \times \{t\} \subseteq \Phi _i({\mathcal {E}}^* \cup {\mathcal {N}}^*). \end{aligned}$$

Hence for any \(t_1, t_2 \in \mathbb {R}\) and any Borel set \(\Omega \subseteq \mathbb {R}^n\),

$$\begin{aligned} \int _{t_1}^{t_2} {\mathcal {H}}^{n - 2}({\mathcal {D}}_i^\dagger (t) \cap \Omega ) \, dt \le {\mathcal {H}}^{n - 1}\bigl ({\mathcal {E}}^* \cap \Phi _i^{-1}(\Omega \times (t_1, t_2))\bigr ). \end{aligned}$$

Proof

Let \(p \in P\) and \(t \in \mathbb {R}\). Set

$$\begin{aligned} {\varvec{x}}= \Phi _i^{-1}\begin{pmatrix} p + f_i(p, t) \sigma \\ t \end{pmatrix}. \end{aligned}$$

If \({\varvec{x}}\in {\mathcal {F}}^*\), then Proposition 5 implies that \({{\,\mathrm{graph}\,}}(u)\) coincides with a hyperplane in a neighbourhood of \({\varvec{x}}\). If that hyperplane is perpendicular to \(\varvec{\nu }_i\), then \(p + f_i(p, t) \sigma \in \mathring{\Gamma }_i(t)\) and t belongs to the null set identified in Proposition 8. Otherwise, the function \(f_i(\cdot , t)\) is affine near p, and hence \(\Phi _i({\varvec{x}})\) cannot belong to \({\mathcal {D}}_i^\dagger (t) \times \{t\}\). This implies the first claim.

The second claim is now a consequence of the coarea formula [1, Theorem 2.93], applied to the function \({\varvec{y}} \mapsto y_{n + 1}\) and to the countably \({\mathcal {H}}^{n - 1}\)-rectifiable set \(\Phi _i({\mathcal {E}}^* \cup {\mathcal {N}}^*) \cap (\Omega \times [t_1, t_2])\). \(\square \)

Lemma 15

Let \(k \in \{1, \dotsc , n\}\). Suppose that \({\underline{s}}, {\overline{s}} \in \mathbb {R}\) with \({\underline{s}} < {\overline{s}}\). For \(z \in \mathbb {R}^{n - 1}\), define \(\ell _z(s) = (z_1, \dotsc , z_{k - 1}, s, z_k, \dotsc , z_{n - 1})^T\) for \(s \in [{\underline{s}}, {\overline{s}}]\), and \(L_z = \left\{ \ell _z(s) :{\underline{s}} \le s \le {\overline{s}} \right\} \). Fix \(i \in \mathbb {Z}_{n + 1}\). Then for \({\mathcal {H}}^{n - 1}\)-almost every \(z \in \mathbb {R}^{n - 1}\), either

$$\begin{aligned} {\underline{g}}_i(y) = {\overline{g}}_i(y) = {\underline{g}}_i(y') = {\overline{g}}_i(y') \end{aligned}$$

for all \(y, y' \in L_z\), or there exists \({\varvec{y}} \in L_z \times \mathbb {R}\) such that

$$\begin{aligned} {\overline{g}}_i(\ell _z({\overline{s}})) \le y_{n + 1} \le {\underline{g}}_i(\ell _z({\underline{s}})) \end{aligned}$$

and \({\varvec{y}} \in \Phi _i({\mathcal {E}}^*)\).

Proof

Consider the projection \(\Pi :\mathbb {R}^{n + 1} \rightarrow \mathbb {R}^n\) given by \(\Pi ({\varvec{y}}) = y\) for \({\varvec{y}} \in \mathbb {R}^{n + 1}\). Set \(\Psi _i = \Pi \circ \Phi _i\). Then for \(j \in \mathbb {Z}_{n + 1}\) with \(j \ne i\) and for \({\varvec{x}}\in {\mathcal {F}}_j^*\), it is clear that \(J_{{\mathcal {F}}^*}\Psi _i({\varvec{x}}) = 0\). (Indeed, the vectors \(\varvec{\nu }_{j + 1}, \dotsc , \varvec{\nu }_{j + n}\) provide an orthonormal basis of \(T_{{\varvec{x}}} {\mathcal {F}}^*\), and we compute \(d\Psi _i(\varvec{\nu }_i) = 0\).) Hence the area formula gives \({\mathcal {H}}^n(\Psi _i({\mathcal {F}}_j^*)) = 0\). Using the coarea formula, we conclude that for \({\mathcal {H}}^{n - 1}\)-almost every \(z \in \mathbb {R}^{n - 1}\),

$$\begin{aligned} {\mathcal {H}}^1(L_z \cap \Psi _i({\mathcal {F}}_j^*)) = 0 \end{aligned}$$
(20)

for all \(j \ne i\). Furthermore, since \({\mathcal {E}}^*\) is an \({\mathcal {H}}^{n - 1}\)-rectifiable set and \({\mathcal {H}}^{n - 1}({\mathcal {N}}^*) = 0\), we also know that for \({\mathcal {H}}^{n - 1}\)-almost every \(z \in \mathbb {R}^{n - 1}\),

$$\begin{aligned} {\mathcal {H}}^1(L_z \cap \Psi _i({\mathcal {E}}^*)) = 0 \end{aligned}$$
(21)

and

$$\begin{aligned} L_z \cap \Psi _i({\mathcal {N}}^*) = \emptyset . \end{aligned}$$
(22)

Consider a point \(z \in \mathbb {R}^{n - 1}\) such that (20), (21), and (22) hold true. Recall that by Lemma 7, a point \({\varvec{y}} \in \mathbb {R}^{n + 1}\) belongs to \(\Phi _i({{\,\mathrm{graph}\,}}(u))\) if, and only if, \({\underline{g}}_i(y) \le y_{n + 1} \le {\overline{g}}_i(y)\). Also recall that

$$\begin{aligned} {{\,\mathrm{graph}\,}}(u) = {\mathcal {E}}^* \cup {\mathcal {N}}^* \cup \bigcup _{j \in \mathbb {Z}_{n + 1}} {\mathcal {F}}_j^*. \end{aligned}$$

From (20)–(22) we therefore infer that for \({\mathcal {H}}^1\)-almost all \(y \in L_z\),

$$\begin{aligned} \begin{pmatrix} y \\ t \end{pmatrix} \in \Phi _i({\mathcal {F}}_i^*) \quad \text {for all } t \in [{\underline{g}}_i(y), {\overline{g}}_i(y)]. \end{aligned}$$
(23)

Consider \({\varvec{y}} \in \Phi _i({\mathcal {F}}_i^*)\) with \(y \in L_z\). Then, setting \({\varvec{x}}= \Phi _i^{-1}({\varvec{y}})\), we have the locally uniform convergence \(u_{x, \rho } \rightarrow \lambda _i\) as \(\rho \searrow 0\). Hence for any compact set \(K \subseteq \mathbb {R}^{n + 1}\) and any \(\epsilon > 0\) there exists \(\rho _0 > 0\) such that

$$\begin{aligned} \frac{1}{\rho }({{\,\mathrm{graph}\,}}(u) - {\varvec{x}}) \cap K \subseteq \left\{ {\tilde{{\varvec{x}}}} \in \mathbb {R}^{n + 1} :{{\,\mathrm{dist}\,}}({\tilde{{\varvec{x}}}}, {{\,\mathrm{graph}\,}}(\lambda _i)) < \epsilon /2 \right\} \end{aligned}$$

for all \(\rho \in (0, \rho _0]\). Equivalently,

$$\begin{aligned} \Phi _i({{\,\mathrm{graph}\,}}(u) - {\varvec{x}}) \cap \rho \Phi _i(K) \subseteq \mathbb {R}^n \times (-\rho \epsilon /2, \rho \epsilon /2). \end{aligned}$$

Recall that \(e_1, \dotsc , e_n\) are the standard basis vectors in \(\mathbb {R}^n\). Also recall that for any \({\tilde{y}} \in \mathbb {R}^n\), the points \(({\tilde{y}}, {\underline{g}}_i({\tilde{y}}))\) and \(({\tilde{y}}, {\overline{g}}_i({\tilde{y}}))\) belong to \(\Phi _i({{\,\mathrm{graph}\,}}(u))\). It follows that there exists \(r_0 > 0\) such that for all \(r \in (0, r_0]\),

$$\begin{aligned} \bigl |{\underline{g}}_i(y \pm r e_k) - {\underline{g}}_i(y)\bigr | \le r\epsilon \quad \text {and} \quad \bigl |{\overline{g}}_i(y \pm r e_k) - {\overline{g}}_i(y)\bigr | \le r\epsilon \end{aligned}$$

and \(|{\underline{g}}_i(y) - {\overline{g}}_i(y)| \le r\epsilon \). Thus

$$\begin{aligned} \frac{\partial }{\partial y_k}{\underline{g}}_i(y) = 0 \quad \text {and} \quad \frac{\partial }{\partial y_k}{\overline{g}}_i(y) = 0 \end{aligned}$$

and \({\underline{g}}_i(y) = {\overline{g}}_i(y)\). Note that this is true for \({\mathcal {H}}^1\)-almost all \(y \in L_z\). We now claim that

$$\begin{aligned} {\underline{g}}_i(\ell _z({\underline{s}})) \ge {\overline{g}}_i(y) \ge {\underline{g}}_i(y) \ge {\overline{g}}_i(\ell _z({\overline{s}})) \end{aligned}$$
(24)

for all \(y \in L_z \setminus \{\ell _z({\underline{s}}), \ell _z({\overline{s}})\}\). Indeed, given \(y = \ell _z(s)\) with \(s \in ({\underline{s}}, {\overline{s}})\), we may choose \(s_- \in ({\underline{s}}, s)\) and \(s_+ \in (s, {\overline{s}})\) such that \({\underline{g}}_i(\ell _z(s_-)) = {\overline{g}}_i(\ell _z(s_-))\) and \({\underline{g}}_i(\ell _z(s_+)) = {\overline{g}}_i(\ell _z(s_+))\). Then (24) follows from Lemma 7.(vi).

If (23) holds for all \(y \in L_z\), then we immediately conclude that \({\underline{g}}_i\) and \({\overline{g}}_i\) are constant and coincide on \(L_z\), i.e., we have the first alternative from the statement of the lemma. If there exists \(y \in L_z\) such that (23) does not hold true, then by the above observations, we know that

$$\begin{aligned} \begin{pmatrix} y \\ t \end{pmatrix} \not \in \Phi _i({\mathcal {F}}_i^*) \end{aligned}$$

holds in fact for all \(t \in [{\underline{g}}_i(y), {\overline{g}}_i(y)]\). (Otherwise, we would conclude that \({\underline{g}}_i(y) = {\overline{g}}_i(y)\), which would give an immediate contradiction.) Moreover, because (23) still holds true almost everywhere on \(L_z\), there exists a sequence \(({\tilde{y}}_m)_{m \in \mathbb {N}}\) in \(L_z\) such that \(y = \lim _{m \rightarrow \infty } {\tilde{y}}_m\) and such that (23) holds for every \({\tilde{y}}_m\). Hence \({\underline{g}}_i({\tilde{y}}_m) = {\overline{g}}_i({\tilde{y}}_m)\). We set \({\tilde{t}}_m = {\underline{g}}_i({\tilde{y}}_m)\). Extracting a subsequence if necessary, we may assume that \(y_{n + 1} = \lim _{m \rightarrow \infty } {\tilde{t}}_m\) exists. Set \({\varvec{y}} = ({\begin{matrix} y \\ y_{n + 1} \end{matrix}})\). Then \(\Phi _i^{-1}({\varvec{y}})\) belongs to the boundary of \({\mathcal {F}}_i^*\) relative to \({{\,\mathrm{graph}\,}}(u)\).

Proposition 5 implies that \({\mathcal {F}}_i^*\) is an open set relative to \({{\,\mathrm{graph}\,}}(u)\), and its relative boundary is contained in \({\mathcal {E}}^* \cup {\mathcal {N}}^*\). Because of (22), it follows that \(\Phi _i^{-1}({\varvec{y}}) \in {\mathcal {E}}^*\). Moreover, (24) implies that

$$\begin{aligned} {\overline{g}}_i(\ell _z({\overline{s}})) \le y_{n + 1} \le {\underline{g}}_i(\ell _z({\underline{s}})). \end{aligned}$$

Thus \({\varvec{y}}\) has the properties from the second alternative in the statement. \(\square \)

Lemma 16

Let \(i \in \mathbb {Z}_{n + 1}\). Suppose that \(G \subseteq \mathbb {R}^n\) is a connected set such that \(G \cap \Gamma _i(t) = \emptyset \) for all \(t \in (-1, 1)\). Then either \({\underline{g}}_i(y) \ge 1\) for all \(y \in G\) or \({\overline{g}}_i(y) \le -1\) for all \(y \in G\).

Proof

Assume that there exists \(y_0 \in G\) such that \({\underline{g}}_i(y_0) < 1\). Since \(G \cap \Gamma _i(t) = \emptyset \) for all \(t \in (-1, 1)\), this implies that

$$\begin{aligned} -1 \ge {\overline{g}}_i(y_0) \ge {\underline{g}}_i(y_0) \end{aligned}$$

by Lemma 7.(iv).

Given \(t \in (-1, 1)\), define

$$\begin{aligned} H_t = \left\{ y \in G :{\overline{g}}_i(y) \ge t \right\} . \end{aligned}$$

Because \({\overline{g}}_i\) is upper semicontinuous by Lemma 7, this is a closed set relative to G. Moreover, if \(y \in H_t\), it follows that

$$\begin{aligned} {\overline{g}}_i(y) \ge {\underline{g}}_i(y) \ge 1, \end{aligned}$$

because \(G \cap \Gamma _i(t') = \emptyset \) for all \(t' \in (-1, 1)\). By the lower semicontinuity of \({\underline{g}}_i\), this means that there exists \(\rho > 0\) such that \({\overline{g}}_i \ge {\underline{g}}_i \ge t\) in \(B_\rho (y)\). Hence \(H_t\) is also open relative to G. Since G is connected and \(y_0 \not \in H_t\), it follows that \(H_t = \emptyset \). This is true for all \(t \in (-1, 1)\), so \({\overline{g}}_i(y) \le -1\) for all \(y \in G\). \(\square \)

We now have everything in place for the proof of Proposition 13.

Proof of Proposition 13

We use induction on n. The statement is clear for \(n = 1\). We now assume that \(n \ge 2\) and that the statement holds true for \(n - 1\).
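Indeed, for \(n = 1\) the measure \({\mathcal {H}}^{n - 1} = {\mathcal {H}}^0\) is the counting measure and \(2^{n - 1} = 1\), so (16) and (17) reduce to

$$\begin{aligned} {\mathcal {H}}^0\bigl ({\mathcal {E}}^* \cap Q_{ij}(\textstyle \frac{1}{4}, 1)\bigr ) \ge 1 - \delta \quad \text {and} \quad {\mathcal {H}}^0\bigl ({\mathcal {E}}^* \cap Q_{ij}(1, 1)\bigr ) \le 1 + \epsilon . \end{aligned}$$

For \(\delta , \epsilon \in (0, 1)\), these conditions say that \({\mathcal {E}}^*\) consists of exactly one point in \(Q_{ij}(1, 1)\), lying in \(Q_{ij}(\frac{1}{4}, 1)\); on either side of this corner, \({{\,\mathrm{graph}\,}}(u)\) follows a single line, which is exactly the situation described by (18) or (19).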

For simplicity, we assume that \(i = 1\) and \(j = 2\). We also assume that (14) holds true; the proof is similar under the assumption (15).

Let

$$\begin{aligned} \Lambda = \bigl ((-\infty , 0] \times \{0\} \times \mathbb {R}^{n - 2}\bigr ) \cup \bigl (\{0\} \times (-\infty , 0] \times \mathbb {R}^{n - 2}\bigr ). \end{aligned}$$

The graph of \(\lambda _1 \wedge \lambda _2\) is the union of two half-hyperplanes meeting at a right angle. In fact, it is easy to see that

$$\begin{aligned} \Phi _0({{\,\mathrm{graph}\,}}(\lambda _1 \wedge \lambda _2)) = \Lambda \times \mathbb {R}. \end{aligned}$$

Let

$$\begin{aligned} \epsilon ' = \epsilon \sqrt{\frac{n}{n + 1}}. \end{aligned}$$

Under the assumptions of the proposition, the set \(\Phi _0({{\,\mathrm{graph}\,}}(u)) \cap (-1, 1)^{n + 1}\) is between \((\Lambda - \epsilon ' \sigma ) \times \mathbb {R}\) and \((\Lambda + \epsilon ' \sigma ) \times \mathbb {R}\), i.e.,

$$\begin{aligned} \Phi _0({{\,\mathrm{graph}\,}}(u)) \cap (-1, 1)^{n + 1} \subseteq \bigcup _{- \epsilon ' \le s \le \epsilon '} (\Lambda + s \sigma ) \times \mathbb {R}. \end{aligned}$$

Set \(s_0 = \sqrt{\frac{n}{n + 1}} u(0)\). Then \(|s_0| \le \epsilon '\) by the assumption that \(|u(0)| \le \epsilon \). Moreover, we compute

$$\begin{aligned} \Phi _0\begin{pmatrix} 0 \\ u(0) \end{pmatrix} = \frac{u(0)}{\sqrt{n + 1}} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = s_0 \begin{pmatrix} \sigma \\ \frac{1}{\sqrt{n}} \end{pmatrix}. \end{aligned}$$
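In particular, the last \(\Phi _0\)-coordinate of this point of \(\Phi _0({{\,\mathrm{graph}\,}}(u))\) satisfies

$$\begin{aligned} \Bigl |\frac{s_0}{\sqrt{n}}\Bigr | = \frac{|u(0)|}{\sqrt{n + 1}} \le \frac{\epsilon }{\sqrt{n + 1}}, \end{aligned}$$

which is less than 1 as soon as \(\epsilon < \sqrt{n + 1}\).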

Assuming that \(\epsilon < \sqrt{n + 1}\), we infer that \({\overline{g}}_0(s_0\sigma ) > -1\) and \({\underline{g}}_0(s_0\sigma ) < 1\). Using Lemmas 7.(v) and 16, we conclude that

$$\begin{aligned} {\underline{g}}_0(y) \ge 1 \quad \text {for } y\in (-1, 1)^n \cap \bigcup _{s <- \epsilon '} (\Lambda + s\sigma ) \end{aligned}$$

and

$$\begin{aligned} {\overline{g}}_0(y) \le -1 \quad \text {for } y \in (-1, 1)^n \cap \bigcup _{s > \epsilon '} (\Lambda + s\sigma ). \end{aligned}$$

Now consider the function \(f_0 :P \times \mathbb {R}\rightarrow \mathbb {R}\) from Proposition 8. For almost every \(t \in (-1, 1)\), the graph of \(f_0(\cdot , t)\), which is given by \(\Gamma _0(t)\), is between \(\Lambda - \epsilon ' \sigma \) and \(\Lambda + \epsilon ' \sigma \) in the hypercube \((-1, 1)^n\).

Define \(\mu _1, \mu _2 :P \rightarrow \mathbb {R}\) by \(\mu _1(p) = b_1 \cdot p\) and \(\mu _2(p) = b_2 \cdot p\) for \(p \in P\) (where \(b_1\) and \(b_2\) are the vectors defined on page 12). Let \(F_t :P \rightarrow \mathbb {R}^n\) be the map with \(F_t(p) = p + f_0(p, t) \sigma \) for \(p \in P\). Then the preceding statements amount to the inequality

$$\begin{aligned} |f_0(\cdot , t) - \mu _1 \wedge \mu _2| \le \epsilon ' \quad \text {in } F_t^{-1}\bigl ((-1, 1)^n\bigr ). \end{aligned}$$

Moreover, the condition \(|f_0(0, t)|\le \epsilon '\) is clearly satisfied. Hence we may apply the induction hypothesis to the function \(f_0(\cdot , t)\). We thereby obtain the inequality

$$\begin{aligned} {\mathcal {H}}^{n - 2}\bigl ({\mathcal {D}}_0^\dagger (t) \cap \textstyle \bigl ((-\frac{1}{4}, \frac{1}{4})^2 \times (-1, 1)^{n - 2}\bigr )\bigr ) \ge 2^{n - 2} (1 - \delta ) \end{aligned}$$
(25)

for almost all \(t \in (-1, 1)\), provided that \(\epsilon \) is sufficiently small. Using Lemma 14, we therefore obtain inequality (16). This proves the first statement of Proposition 13.

In order to prove the second statement, assume now that (17) holds true. Then

$$\begin{aligned} \int _{-1}^1 {\mathcal {H}}^{n - 2}\bigl ({\mathcal {D}}_0^\dagger (t) \cap (-1, 1)^n\bigr ) \, dt \le 2^{n - 1} (1 + \epsilon ). \end{aligned}$$

Recall that we also have inequality (25), and we may now assume that \(\delta \) is arbitrarily small. Hence there exist \(t_- \in (-1, -\frac{1}{2})\) and \(t_+ \in (\frac{1}{2}, 1)\) such that

$$\begin{aligned} {\mathcal {H}}^{n - 2}\bigl ({\mathcal {D}}_0^\dagger (t_\pm ) \cap (-1, 1)^n\bigr ) \le 2^{n - 2} (1 + 3\delta + 4\epsilon ). \end{aligned}$$
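The existence of such \(t_\pm \) follows from an averaging argument: if, say, the bound failed for every \(t \in (\frac{1}{2}, 1)\), then (25) on \((-1, \frac{1}{2})\) and the failed bound on \((\frac{1}{2}, 1)\) would yield

$$\begin{aligned} \int _{-1}^1 {\mathcal {H}}^{n - 2}\bigl ({\mathcal {D}}_0^\dagger (t) \cap (-1, 1)^n\bigr ) \, dt > \frac{3}{2} \, 2^{n - 2}(1 - \delta ) + \frac{1}{2} \, 2^{n - 2}(1 + 3\delta + 4\epsilon ) = 2^{n - 1}(1 + \epsilon ), \end{aligned}$$

contradicting the preceding integral bound; the argument for \(t_-\) is symmetric.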

By the induction hypothesis, if \(\delta \) and \(\epsilon \) are sufficiently small, then

$$\begin{aligned} f_0(\cdot , t_\pm ) = (\mu _1 + \alpha _\pm ) \wedge (\mu _2 + \beta _\pm ) \quad \text {in } F_{t_\pm }^{-1}\bigl (\textstyle \left( -\frac{1}{2}, \frac{1}{2}\right) ^n\bigr ) \end{aligned}$$

for certain numbers \(\alpha _-, \alpha _+, \beta _-, \beta _+ \in \mathbb {R}\). Therefore, there exist \(y_-, y_+ \in \mathbb {R}^2 \times \{0\}^{n - 2}\) such that

$$\begin{aligned} \Gamma _0(t_\pm ) \cap \textstyle \left( -\frac{1}{2}, \frac{1}{2}\right) ^n = (y_\pm + \Lambda ) \cap \left( -\frac{1}{2}, \frac{1}{2}\right) ^n. \end{aligned}$$

Clearly, by the above observations on \(\Phi _0({{\,\mathrm{graph}\,}}(u))\), this implies that \(y_\pm \in B_{\epsilon '}(0)\). We assume that \(\epsilon ' \le \frac{1}{4}\).

If \(y_- = y_+\), then

$$\begin{aligned} \Gamma _0(t) \cap \textstyle \left( -\frac{1}{2}, \frac{1}{2}\right) ^n = (y_+ + \Lambda ) \cap \left( -\frac{1}{2}, \frac{1}{2}\right) ^n \end{aligned}$$

for every \(t \in (t_-, t_+)\) as well (because any other point \(y \in (-\frac{1}{2}, \frac{1}{2})^n \setminus (y_+ + \Lambda )\) satisfies either \({\underline{g}}_0(y) \ge t_+\) or \({\overline{g}}_0(y) \le t_-\) by Lemma 7). In this case, we conclude that (18) holds true. Thus it now suffices to show that \(y_- = y_+\).

We argue by contradiction here. Suppose that \(y_- \ne y_+\). We assume that in fact the first components \(y_{1-}\) and \(y_{1+}\) are different. The arguments are similar if \(y_{2-} \ne y_{2+}\).

If \(y_{1-} \ne y_{1+}\), then for any \(z \in (-\frac{1}{2}, -\frac{1}{4}) \times (-\frac{1}{2}, \frac{1}{2})^{n - 2}\), it follows that

$$\begin{aligned} {\underline{g}}_0\begin{pmatrix} y_{1-} \\ z \end{pmatrix} \le t_- \le {\overline{g}}_0 \begin{pmatrix} y_{1-} \\ z \end{pmatrix} \end{aligned}$$

and

$$\begin{aligned} {\underline{g}}_0\begin{pmatrix} y_{1+} \\ z \end{pmatrix} \le t_+ \le {\overline{g}}_0\begin{pmatrix} y_{1+} \\ z \end{pmatrix}. \end{aligned}$$

Since \(t_- < t_+\), it is therefore not true that \({\underline{g}}_0\) and \({\overline{g}}_0\) are constant with \({\underline{g}}_0 = {\overline{g}}_0\) on \([y_{1+}, y_{1-}] \times \{z\}\). Lemma 15 now implies that for \({\mathcal {H}}^{n - 1}\)-almost every \(z \in (-\frac{1}{2}, -\frac{1}{4}) \times (-\frac{1}{2}, \frac{1}{2})^{n - 2}\), the set \([y_{1+}, y_{1-}] \times \{z\} \times [t_-, t_+]\) intersects \(\Phi _0({\mathcal {E}}^*)\). It follows that

$$\begin{aligned} {\mathcal {H}}^{n - 1}\left( \Phi _0({\mathcal {E}}^*) \cap \left( \textstyle (-1, 1) \times \left( -\frac{1}{2}, -\frac{1}{4}\right) \times (-1, 1)^{n - 1}\right) \right) \ge \frac{1}{4}. \end{aligned}$$

Furthermore, because of (16), we obtain the estimate

$$\begin{aligned} {\mathcal {H}}^{n - 1}\bigl ({\mathcal {E}}^* \cap Q_{12}(1, 1)\bigr ) \ge 2^{n - 1} (1 - \delta ) + \frac{1}{4}. \end{aligned}$$
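Combined with hypothesis (17), this would force

$$\begin{aligned} 2^{n - 1}(1 + \epsilon ) \ge 2^{n - 1}(1 - \delta ) + \frac{1}{4}, \quad \text {i.e.,} \quad \delta + \epsilon \ge 2^{-n - 1}. \end{aligned}$$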

If \(\delta + \epsilon < 2^{-n - 1}\), then this contradicts hypothesis (17). \(\square \)

Finally we can prove the second main result with the help of Propositions 5 and 13.

Proof of Theorem 3

Suppose that \(A \subseteq \mathbb {R}^n\) is affinely independent. Then A contains at most \(n + 1\) elements. If there are fewer, then we can add additional elements to A such that it remains affinely independent. Thus we may assume without loss of generality that the size of A is exactly \(n + 1\).

Now suppose that \(A = \{{\tilde{a}}_0, \dotsc , {\tilde{a}}_n\}\). Consider \(M \in \mathbb {R}^{n \times n}\) and \(c \in \mathbb {R}^n\) such that \(M{\tilde{a}}_i + c = a_i\) for \(i = 0, \dotsc , n\). Then the function \(v :\mathbb {R}^n \rightarrow \mathbb {R}\) with \(v(x) = u(M^T x) + c \cdot x\) has the property that \(\nabla v(x) \in \{a_0, \dotsc , a_n\}\) for almost all \(x \in \mathbb {R}^n\). Hence we may assume that A consists of the vectors \(a_0, \dotsc , a_n\).
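Indeed, since both \(\{{\tilde{a}}_0, \dotsc , {\tilde{a}}_n\}\) and \(\{a_0, \dotsc , a_n\}\) are affinely independent, such M and c exist, and by the chain rule,

$$\begin{aligned} \nabla v(x) = M \nabla u(M^T x) + c \in \left\{ M {\tilde{a}}_0 + c, \dotsc , M {\tilde{a}}_n + c \right\} = \{a_0, \dotsc , a_n\} \end{aligned}$$

for almost every \(x \in \mathbb {R}^n\).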

Now for the sets \({\mathcal {F}}\), \({\mathcal {E}}\), and \({\mathcal {N}}\) as defined in Sect. 2, Proposition 5 implies that \({\mathcal {F}}\subseteq {\mathcal {R}}(u)\) by the same arguments as in the proof of Theorem 4.

For \(x \in {\mathcal {E}}\), the functions \(u_{x, \rho }\) converge locally uniformly to \(\lambda _i \wedge \lambda _j\) or to \(\lambda _i \vee \lambda _j\) as \(\rho \searrow 0\) for some \(i, j \in \mathbb {Z}_{n + 1}\) with \(i \ne j\). Moreover, the approximate tangent space of \({\mathcal {E}}^*\) exists at the point \({\varvec{U}}(x)\). Clearly this approximate tangent space is \({{\,\mathrm{graph}\,}}(\lambda _i) \cap {{\,\mathrm{graph}\,}}(\lambda _j)\). Hence for \(\rho \) sufficiently small, the function \(u_{x, \rho }\) satisfies the hypotheses of Proposition 13, including (17). It follows that \(u_{x, \rho }\) satisfies (18) or (19). In particular, it is regular near 0, and hence \(x \in {\mathcal {R}}(u)\).

Thus \({\mathcal {S}}(u) \subseteq {\mathcal {N}}\), which is an \({\mathcal {H}}^{n - 1}\)-null set. \(\square \)