Functions with Bounded Hessian–Schatten Variation: Density, Variational, and Extremality Properties

Ambrosio, Luigi; Brena, Camillo; Conti, Sergio

doi:10.1007/s00205-023-01938-w

Open access
Published: 20 November 2023

Volume 247, article number 111, (2023)

Archive for Rational Mechanics and Analysis

Luigi Ambrosio¹,
Camillo Brena¹ &
Sergio Conti²

Abstract

In this paper we analyze in detail a few questions related to the theory of functions with bounded p-Hessian–Schatten total variation, which are relevant in connection with the theory of inverse problems and machine learning. We prove an optimal density result, relative to the p-Hessian–Schatten total variation, of continuous piecewise linear (CPWL) functions in any space dimension d, using a construction based on a mesh whose local orientation is adapted to the function to be approximated. We show that not all extremal functions with respect to the p-Hessian–Schatten total variation are CPWL. Finally, we prove the existence of minimizers of certain relevant functionals involving the p-Hessian–Schatten total variation in the critical dimension $d=2$.

1 Introduction

In applied sciences, an inverse problem is the process of reconstructing an unknown signal (in practice, a causal factor) from a set of (possibly noisy) observations. Of notable importance is the subclass of linear inverse problems. A linear inverse problem is specified by three main objects:

a hypothesis space $\mathcal {S}$ in which we look for the signal $f^*\in \mathcal {S}$,
a linear forward operator $\nu :\mathcal {S}\rightarrow \mathbb {R}^N$ which is a proxy for the process of data collection,
an array ${y}\in \mathbb {R}^N$ which contains the observed data, to which ${\nu }(f^*)$ should be close.

Therefore, the inverse problem consists in (approximately) recovering the unknown signal $f^*$ from the observed data y. Also, the problem can be reformulated in variational terms as

$$\begin{aligned} f^* \in \mathop {\textrm{arg min}}\limits _{f\in \mathcal {S}} \lambda \mathcal E\left( {\nu }(f),{y}\right) + \mathcal {R}(f), \end{aligned}$$

(0.1)

where

$\mathcal E:\mathbb {R}^N\times \mathbb {R}^N\rightarrow \mathbb {R}$ is a convex loss function used to measure the data discrepancy,
$\mathcal {R}:\mathcal {S}\rightarrow \mathbb {R}$ is the regularization functional, mainly used to enforce known structure and regularity on the reconstructed signal,
$\lambda >0$ is a parameter that governs the interplay between fidelity to the data and regularity.

Three key effects of the presence of the regularization functional ${\mathcal {R}}$ are: the enhancement of the stability of the problem, the alleviation of the ill-posedness of the problem and the possibility to invoke the “representer theorem”, which provides a parametric form for solutions of (0.1) and has been recently extended to rather general frameworks, see [8, 9, 19, 20]. Roughly (and under suitable assumptions), the just-mentioned abstract result characterizes the set of solutions of (0.1) as linear combinations of the extreme points of

$$\begin{aligned} \{f\in \mathcal {S}: \mathcal {R}(f)\leqq 1\} \end{aligned}$$

(0.2)

(which is the unit ball associated to the regularization functional). This strongly motivates the interest in finding and studying the extreme points of the set in (0.2).

In this paper, we are going to study problems arising from a particular, yet general, choice of the items appearing in the functional in (0.1). In particular,

a)
the hypothesis space consists of the functions $f:\Omega \rightarrow {\mathbb {R}}$ with bounded p-Hessian–Schatten variation (see item b)), for some $\Omega \subseteq {\mathbb {R}}^d$ open. This space coincides with Demengel’s space [11] of functions with bounded Hessian, which was introduced to study models of plastic deformations of solids and has also proven useful in the context of image processing; the norm we adopt, however, is specific and allows for optimal approximation results by continuous and piecewise affine functions when $p=1$;
b)
the regularizing term is the p-Hessian–Schatten variation $|{\textrm{D}}_p^2 \,\cdot \,|(\Omega )$, which coincides with the relaxation of the functional (here and in what follows $|\,\cdot \,|_p$ denotes the p-Schatten norm),
$$\begin{aligned} |{\textrm{D}}_p^2 f|(\Omega ):=\int _{\Omega } |\nabla ^2 f|_p\textrm{d}\mathscr {L}^d\qquad \text {for every }f\in C^2(\Omega ); \end{aligned}$$
This is a variant of the classical second-order total variation [5]. It was inspired by [6, 14,15,16,17] and used in [10, 18];
c)
in the critical case $d=2$ we consider as linear forward operator the evaluation functional at certain points $x_1,\ldots ,x_N\in {\mathbb {R}}^2$, with observed data $(y_1,\ldots ,y_N)\in {\mathbb {R}}^N$;
d)
still in the critical case, the error term is taken to be an $\ell ^q$ norm, i.e.
$$\begin{aligned} {\mathcal {E}} (f):=\Vert (f(x_i)-y_i)_{i=1,\ldots ,N}\Vert _{\ell ^q}. \end{aligned}$$
e)
the tunable parameter is $\lambda \in (0,\infty ]$, where by convention $\lambda =\infty $ imposes a perfect fit with the data.

In view of the discussion above, some questions arise naturally.

i)
The description of the extremal points of the ball (cf. (0.2))
$$\begin{aligned} \{f:\Omega \rightarrow {\mathbb {R}}:|{\textrm{D}}^2_p f|(\Omega )\leqq 1\} \end{aligned}$$
(0.3)
modulo additive affine functions (since the Hessian–Schatten seminorm is invariant under the addition of affine functions, this factorization is necessary). A reasonable description of these extremal points was given in [2], under the assumption that a certain density conjecture holds true. Namely, it was proved that if $\mathrm{CPWL}$ functions are dense in energy in the space of functions with bounded Hessian–Schatten variation, then all extremal points, which obviously lie on the sphere, are found in the closure of the $\mathrm{CPWL}$ extremal points (and this last set is rather manageable, see [2]). Here and below, a $\mathrm{CPWL}$ (Continuous and PieceWise Linear) function is a piecewise affine function, affine on certain simplexes. In Section 2 we give a positive answer to the just mentioned conjecture, which was proved in [2] only in the two-dimensional case, with a different, more constructive, strategy. As any CPWL function can be exactly represented by a neural network with rectified linear unit (ReLU) activation functions [4], our result (Theorem 2.4) in particular implies approximability of any function whose Hessian has bounded total variation by means of neural networks with ReLU activation functions, with convergence of the 1-Hessian–Schatten norm.
ii)
Again with respect to the extremal points of the set described in (0.3), one may wonder whether all the extremal points are $\mathrm{CPWL}$. By a delicate measure-theoretic analysis, in Section 3 we show that the answer is negative: functions whose graphs are cut cones are extremal, modulo affine functions, and these functions are not $\mathrm{CPWL}$ if $d\geqq 2$. In connection with this negative answer, as for compact convex sets exposed points are dense in the class of extreme points, it would be interesting to know whether cut cones are also exposed, namely whether there exist linear continuous functionals attaining their minimum, when restricted to the closed unit ball of the Hessian–Schatten seminorm, only at a cut cone.
iii)
In the two-dimensional case, one may wonder whether the functional (0.1) admits minimizers, with the choice of error and regularizing term described above. In Section 4 we give a positive answer, for a large set of choices of the parameters $\lambda $, p and q.
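The remark in item i) that CPWL functions are exactly representable by ReLU networks can be illustrated in dimension one, where the cut cone $(1-|x|)_+$ is itself CPWL. The following Python sketch is only an illustration of ours (the names `relu`, `hat` and `hat_relu` are hypothetical and do not come from [4]); it checks the elementary identity $(1-|x|)_+=\mathrm{ReLU}(x+1)-2\,\mathrm{ReLU}(x)+\mathrm{ReLU}(x-1)$ on a grid.

```python
def relu(t: float) -> float:
    """Rectified linear unit."""
    return max(t, 0.0)


def hat(x: float) -> float:
    """One-dimensional cut cone (1 - |x|)_+."""
    return max(1.0 - abs(x), 0.0)


def hat_relu(x: float) -> float:
    """The same function as a ReLU network with one hidden layer:
    three hidden units with biases +1, 0, -1 and output weights 1, -2, 1."""
    return relu(x + 1.0) - 2.0 * relu(x) + relu(x - 1.0)


# Grid on [-3, 3] with dyadic spacing, so the arithmetic is exact in floating point.
samples = [k / 8.0 for k in range(-24, 25)]
max_gap = max(abs(hat(x) - hat_relu(x)) for x in samples)
print(max_gap)
```

This representation is specific to $d=1$: as recalled in item ii), for $d\geqq 2$ the cut cone is not CPWL, so it can only be approximated by such networks.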

Now we pass to a more detailed description of the content of the paper. Namely, we examine separately the answers to items i), ii) and iii) above and we sketch their proofs.

1.1 Density of CPWL functions

In Section 2 we address the problem of density in energy $|{\textrm{D}}^2_1\,\cdot \,|(\Omega )$ of $\mathrm{CPWL}$ functions in the set of functions with bounded Hessian–Schatten variation. Our main result is Theorem 2.2, stated for $C^2$ targets, from which the localized version, Theorem 2.4, follows for targets with finite p-Hessian–Schatten variation. The proof of Theorem 2.2 relies heavily on a fine study of triangulations of ${\mathbb {R}}^d$ and consists morally of three parts.

Part 1 is Section 2.1 and deals with general properties of triangulations (considered as couples of sets, the set of vertices and the set of elements), the most important ones being the Delaunay, non degeneracy and uniformity properties (items (a), (b) and (c) of Definition 2.7). Roughly speaking, the Delaunay property states that given an element of the triangulation, no vertex of the triangulation lies inside the circumsphere of the given element; it entails regularity properties, among them, the fact that angles in the elements are not too small. This leads to the non degeneracy property, crucial to estimate geometric quantities related to an element in terms of the volume of the given element. Finally, uniformity states that the vertices of the triangulation are uniformly distributed, in the sense that the maximum size of a ball which contains no vertex is bounded by a constant times the minimal distance between two vertices. The main results are Lemma 2.9, which allows us to obtain a Delaunay triangulation starting from a uniform set of vertices, and Lemma 2.13, which studies Delaunay triangulations whose vertices locally coincide with a rotation of a rescaling of the lattice ${\mathbb {Z}}^d$.
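To make the Delaunay property concrete, the following Python sketch tests it in the plane with the classical in-circle determinant. It is a two-dimensional illustration only, under our own assumptions: the helper names `orientation`, `in_circumcircle` and `is_delaunay` are hypothetical, and the code does not reproduce the construction of Section 2.1.

```python
def orientation(a, b, c):
    """Twice the signed area of the triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, q):
    """True if q lies strictly inside the circumcircle of the triangle (a, b, c)."""
    rows = []
    for v in (a, b, c):
        dx, dy = v[0] - q[0], v[1] - q[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    # Multiplying by the orientation makes the test independent of vertex order.
    return det * orientation(a, b, c) > 0


def is_delaunay(vertices, triangles):
    """Check that no vertex lies strictly inside the circumcircle of any element."""
    for (i, j, k) in triangles:
        for m, q in enumerate(vertices):
            if m in (i, j, k):
                continue
            if in_circumcircle(vertices[i], vertices[j], vertices[k], q):
                return False
    return True


# A convex quadrilateral and its two possible triangulations.
quad = [(0.0, 0.0), (3.0, 0.0), (2.0, 2.0), (0.0, 3.0)]
good = [(0, 1, 2), (0, 2, 3)]   # split along the diagonal from (0,0) to (2,2)
bad = [(0, 1, 3), (1, 2, 3)]    # split along the other diagonal
print(is_delaunay(quad, good), is_delaunay(quad, bad))
```

In the second triangulation the vertex $(2,2)$ lies inside the circumcircle of the triangle with vertices $(0,0)$, $(3,0)$, $(0,3)$, so the Delaunay property fails; flipping the diagonal restores it.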

Part 2 is Section 2.2 which aims to construct a “good” triangulation (in the sense of Part 1) that locally follows a prescribed orientation. The outcome is Theorem 2.14 and the main difficulty in its proof lies in “gluing” the various sub-triangulations to allow for the variable orientation (see Fig. 3).

Part 3 is the proof of the density result, Section 2.3. We exploit the outcome of Part 2 to build a triangulation that locally follows the orientation given by the Hessian $\nabla ^2 w$ of the target w, in the sense that it is given by an orthonormal basis of eigenvectors for $\nabla ^2 w$. Then we take u, the affine interpolation of w with respect to this triangulation, which will be a good approximation. The contribution of the Hessian–Schatten variation of u on regions in which the orientation of the triangulation is constant (and hence adapted to the Hessian of w) is estimated thanks to the good choice of the orientation, whereas the contribution around the boundaries of these regions, i.e. where the gluing took place, is controlled by the regularity properties of the triangulation and the smallness of these regions.

1.2 Extremality of cones

In Section 3, we prove that functions whose graphs are cut cones are extremal with respect to the Hessian–Schatten total variation seminorm. Namely, we prove that functions defined as

$$\begin{aligned} f^\textrm{cone}(x):=(1-|x|)_+ \end{aligned}$$

are extremal modulo affine functions, in the sense that, if, for some $\lambda \in (0,1)$

$$\begin{aligned} f^\textrm{cone}=\lambda f_1+(1-\lambda ) f_2 \end{aligned}$$

with

$$\begin{aligned} |{\textrm{D}}_p^2 f_1|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f_2|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d), \end{aligned}$$

for some $p\in [1,\infty )$, then $f_1$ and $f_2$ are equal to $f^\textrm{cone}$, up to affine functions (Theorem 3.1).

Our strategy is as follows. First, we set $f_i^\textrm{rad}$ to be the radial symmetrization of $f_i$, for $i=1,2$. As $f^\textrm{cone}$ is radial, a simple computation yields that still

$$\begin{aligned} f^\textrm{cone}=\lambda f_1^\textrm{rad}+(1-\lambda ) f_2^\textrm{rad}\end{aligned}$$

and

$$\begin{aligned} |{\textrm{D}}_p^2 f_1^\textrm{rad}|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f_2^\textrm{rad}|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d). \end{aligned}$$

This implies with not much effort that $f_i^\textrm{rad}=f^\textrm{cone}$, up to affine terms, thanks to the explicit computation of Hessian–Schatten total variation of radial functions (Proposition 1.13).

The bulk of the proof is then to show that whenever f is such that $f^\textrm{rad}=f^\textrm{cone}$ and $|{\textrm{D}}_p^2 f|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)$, then f is equal to $f^\textrm{cone}$, up to affine terms. In other words, in the case $f^\textrm{rad}=f^\textrm{cone}$, we have rigidity of the property that $|{\textrm{D}}^2_p f^\textrm{rad}|({\mathbb {R}}^d)\leqq |{\textrm{D}}^2_p f|({\mathbb {R}}^d)$ stated in Lemma 1.10.

Case $p=1$ is dealt with in Proposition 3.5. For its proof, a key remark is the fact that, if $\varvec{\Delta }$ denotes the distributional Laplacian, then $\int _{B_1}\varvec{\Delta } (f(U\,\cdot \,))$ is independent of $U\in SO({\mathbb {R}}^d)$. Hence, by $f^\textrm{rad}=f^\textrm{cone}$, we have that

$$\begin{aligned} \int _{B_1}\varvec{\Delta } f= \int _{B_1}\varvec{\Delta } f^\textrm{cone}=-|{\textrm{D}}_1^2 f^\textrm{cone}|(B_1) =-|{\textrm{D}}_1^2 f|(B_1), \end{aligned}$$

where the second equality is obtained by explicit computation (or by concavity of $f^\textrm{cone}$ in $B_1$). This then implies that (on the right-hand side there is the total variation of the matrix-valued measure ${\textrm{D}}\nabla f$ with respect to the 1-Schatten norm)

$$\begin{aligned} \int _{B_1}\textrm{d}\,\mathrm{{tr}}({\textrm{D}}\nabla f)=- \int _{B_1}\textrm{d}|{\textrm{D}}\nabla f|_1, \end{aligned}$$

so that $\mathrm{{tr}}({\textrm{D}}\nabla f)=-|{\textrm{D}}\nabla f|_1$ almost everywhere, which implies that the eigenvalues of ${\textrm{D}}\nabla f$ are all nonpositive, almost everywhere (Lemma 3.3), by rigidity in the inequality $|\mathrm{{tr}}(A)|\leqq |A|_1$. Then, by Lemma 3.2, it follows that f has a continuous concave representative in $B_1$. Finally we exploit concavity to obtain the pointwise bound $f\geqq f^\textrm{cone}$ in $B_1$, which, combined with the integral equality $f^\textrm{rad}=f^\textrm{cone}$, implies the claim.
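The rigidity in the inequality $|\mathrm{{tr}}(A)|\leqq |A|_1$ used above can be checked by hand on symmetric $2\times 2$ matrices. The following Python sketch is an illustration of ours (the function names are hypothetical): it computes the quantity $|A|_1+\mathrm{{tr}}(A)$, which equals twice the sum of the positive eigenvalues and therefore vanishes exactly when A has no positive eigenvalue.

```python
import math


def eigenvalues_sym2(a: float, b: float, c: float) -> tuple:
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius


def schatten1_sym2(a: float, b: float, c: float) -> float:
    """1-Schatten norm: sum of the absolute values of the eigenvalues."""
    low, high = eigenvalues_sym2(a, b, c)
    return abs(low) + abs(high)


def trace_gap(a: float, b: float, c: float) -> float:
    """|A|_1 + tr(A): nonnegative, and zero iff all eigenvalues are nonpositive."""
    return schatten1_sym2(a, b, c) + (a + c)


print(trace_gap(-2.0, 1.0, -1.0))  # both eigenvalues negative: gap is (numerically) zero
print(trace_gap(1.0, 0.0, -2.0))   # eigenvalues -2 and 1: gap equals 2
```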

Case $p\in (1,\infty )$ is dealt with in Proposition 3.6, where we reduce ourselves to the case $p=1$: namely, we show that the information $|{\textrm{D}}_p^2 f|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)$, coupled with $f^\textrm{rad}=f^\textrm{cone}$, self-improves to $|{\textrm{D}}_1^2 f|({\mathbb {R}}^d)=|{\textrm{D}}_1^2 f^\textrm{cone}|({\mathbb {R}}^d)$, whence we can use what was proved in the case $p=1$. This reduction is done by treating separately the absolutely continuous and singular parts of $|{\textrm{D}}_p^2 f|$. The former is treated by exploiting the strict convexity of the p-Schatten norm together with the scaling property of the map $p\mapsto |{\textrm{D}}_p^2 f^\textrm{cone}|$, whereas the latter is treated by Alberti’s rank 1 Theorem [1], in conjunction with the fact that the p-Schatten norm of rank 1 matrices is independent of p.

1.3 Solutions to the minimization problem

In Section 4 we restrict ourselves to the two-dimensional Euclidean space. Indeed, we want to exploit the continuity of functions with bounded Hessian–Schatten variation in dimension 2 ([2], see Proposition 1.11) to have a meaningful evaluation functional and define, for $\Omega \subseteq {\mathbb {R}}^2$ open (cf. (0.1)), $\mathcal {F}_\lambda :L^1_{\textrm{loc}}(\Omega )\rightarrow [0,\infty ]$ by

$$\begin{aligned} \mathcal {F}_\lambda (f)= |{\textrm{D}}_1^2 f|(\Omega )+\lambda \Vert (f(x_i)-y_i)_{i=1,\ldots ,N}\Vert _{\ell ^1}, \end{aligned}$$

(0.4)

where $x_1,\ldots ,x_N\in \Omega $ are distinct points and $y_1,\ldots ,y_N\in {\mathbb {R}}$. Also, we are adopting the convention that $\infty \,\cdot \,0=0$, hence, if $\lambda =\infty $, we have $\mathcal {F}_{\infty }:L^1_{\textrm{loc}}(\Omega )\rightarrow [0,\infty ]$,

$$\begin{aligned} \mathcal {F}_{\infty }(f)={\left\{ \begin{array}{ll} |{\textrm{D}}^2_1 f|(\Omega )\qquad &{}\text { if }f(x_i)=y_i\quad \text { for }\quad i=1,\ldots ,N,\\ \infty \qquad &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

Notice that $\mathcal {F}_\lambda $ is the sum of the regularizing term $|{\textrm{D}}^2_1 f|$ and the weighted (by $\lambda $) error term $\lambda \Vert (f(x_i)-y_i)_{i=1,\ldots ,N}\Vert _{\ell ^1}$ and that $\mathcal {F}_\lambda $ can be seen as a relaxed version of $\mathcal {F}_{\infty }$.

In Section 4, we will consider slightly more general functionals, see (4.1), but for the sake of clarity we reduce ourselves to a particular case in this introduction. Our aim is to prove existence of minimizers of $\mathcal {F}_\lambda $ (Theorem 4.2). Notice that in higher ($\geqq 3$) dimension, $\mathcal {F}_\lambda $ is not well defined (by the lack of continuity), and, even if we try to define it imposing continuity on its domain, minimizers do not exist in general, as the infimum of $\mathcal {F}_\lambda $ is always zero. To see this last claim, simply exploit the scaling property of the Hessian–Schatten total variation (or use Proposition 1.13) for functions of the kind $x\mapsto \sum _{i=1}^N y_i(1-|x-x_i|/r)_+$ as $r\searrow 0$.
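The scaling claim can be checked on a single cut cone. The following Python sketch evaluates the closed form that we obtain by inserting $g(s)=h(1-s/r)_+$ into formula (1.2) of Proposition 1.13, namely $|h|\,d\omega _d\, r^{d-2}\big (1+(d-1)^{1/p-1}\big )$. This computation and the names `unit_ball_volume` and `cut_cone_variation` are ours and are meant only as an illustration under the assumption $d\geqq 2$.

```python
import math


def unit_ball_volume(d: int) -> float:
    """Lebesgue measure omega_d of the unit ball of R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def cut_cone_variation(d: int, p: float, r: float, h: float = 1.0) -> float:
    """Closed form derived from (1.2) for x -> h * (1 - |x| / r)_+ on R^d, d >= 2."""
    # Singular part: g' jumps by |h| / r at s = r, weighted by s^(d-1).
    singular = r ** (d - 1) * abs(h) / r
    # Absolutely continuous part: g'' = 0 and |g'| = |h| / r on (0, r), so the
    # l^p norm is (d - 1)^(1/p) * |h| / r (note that 1.0 / p is 0.0 for p = math.inf).
    smooth = (d - 1) ** (1.0 / p) * abs(h) / r * r ** (d - 1) / (d - 1)
    return d * unit_ball_volume(d) * (singular + smooth)


for r in (1.0, 0.1, 0.01):
    print(r, cut_cone_variation(2, 1.0, r), cut_cone_variation(3, 1.0, r))
```

In dimension 2 the value does not depend on r (for $h=1$ it equals $4\pi $, which is numerically the critical value $\lambda _c$ introduced below), whereas for $d\geqq 3$ it is proportional to $r^{d-2}$ and vanishes as $r\searrow 0$, in agreement with the claim that the infimum is zero.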

We sketch now the proof of the existence of minimizers of $\mathcal {F}_\lambda $. There are two key steps. We denote $\lambda _c:=4\pi $, the “critical” value for $\lambda $.

Step 1. First we prove existence of minimizers of $\mathcal {F}_{\lambda }$, for $\lambda \in [0,\lambda _c]$. This is done via the direct method of calculus of variations, after we prove relative compactness of minimizing sequences and semicontinuity of this functional. Compactness, proved in Proposition 4.9, is mostly due to the estimates of [2], see Proposition 1.11. Semicontinuity is then proved in Lemma 4.8 and here the choice of $\lambda \in [0,\lambda _c]$ plays a role. The key idea is that, given a point $x_i$ and a converging sequence $f_k\rightarrow f$, either $|{\textrm{D}}^2_1 f_k|$ concentrates at $x_i$ or it does not. In the former case (Lemma 4.7), as a part of $|{\textrm{D}}_1^2 f_k|$ concentrates at $x_i$ (and $|{\textrm{D}}_1^2 f|(x_i)=0$, being points of codimension 2), we experience a drop in the regularizing term of the functional, and this drop is enough to offset the lack of convergence of the evaluation term $f_k(x_i)$ in the error term. In the latter case (Lemma 4.7 again), we have instead convergence of $k\mapsto f_k(x_i)$.

Step 2. We prove the existence of minimizers of $\mathcal {F}_\lambda $, for $\lambda \in [\lambda _c,\infty ]$. By Step 1, we can take a minimizer f of $\mathcal {F}_{\lambda _c}$. Then we modify f to obtain ${\tilde{f}}$ satisfying

$$\begin{aligned}{} & {} |{\textrm{D}}^2_1 {\tilde{f}}|(\Omega )\leqq |{\textrm{D}}_1^2 f|(\Omega )+\lambda _c \Vert (f(x_i)-y_i)_i\Vert _{\ell ^1}\\{} & {} \qquad \text {and}\qquad {\tilde{f}}(x_i)=y_i\text { for }i=1,\ldots ,N. \end{aligned}$$

Such a modification is obtained by adding to f a suitable linear combination of “cut-cones”, namely functions $x\mapsto y_i(1-|x-x_i|/\bar{r})_+$ for $\bar{r}$ small enough. As ${\tilde{f}}$ has a perfect fit with the data, for any $\lambda $,

$$\begin{aligned} \mathcal {F}_\lambda ({\tilde{f}})=\mathcal {F}_{\lambda _c}({\tilde{f}})\leqq \mathcal {F}_{\lambda _c}(f), \end{aligned}$$

where the inequality is due to the construction of ${\tilde{f}}$. Now, as $\mathcal {F}_\lambda \geqq \mathcal {F}_{\lambda _c}$ (here the choice $\lambda \in [\lambda _c,\infty ]$ plays a role) and as f is a minimizer of $\mathcal {F}_{\lambda _c}$, we see that ${\tilde{f}}$ is a minimizer of $\mathcal {F}_\lambda $.

Therefore, putting together what was seen in Step 1 and in Step 2, we have that for every $\lambda \in [0,\infty ]$ there exists a minimizer of $\mathcal {F}_\lambda $.

2 Preliminaries

In this short section we first recall basic facts about Hessian–Schatten seminorms and then in Section 1.3 we add an explicit formula to compute Hessian–Schatten variations of radial functions.

2.1 Schatten norms

We recall basic facts about Schatten norms, see [2] and the references therein.

Definition 1.1

(Schatten norm) Let $p\in [1,\infty ]$. If $M\in {\mathbb {R}}^{d\times d}$ and $s_1(M),\ldots , s_d(M)\geqq 0$ denote the singular values of M (counted with their multiplicity), we define the Schatten p-norm of M by

$$\begin{aligned} |M|_p:=\Vert (s_1(M),\ldots ,s_d(M))\Vert _{\ell ^p}. \end{aligned}$$

We recall that the scalar product between $M,\,N\in {\mathbb {R}}^{d\times d}$ is defined by

$$\begin{aligned} M\,\cdot \, N:=\mathrm{{tr}}(M^t N)=\sum _{i,\,j=1,\ldots ,d} M_{i,j}N_{i,j} \end{aligned}$$

and induces the Hilbert–Schmidt norm. Next, we enumerate several properties of the Schatten norms that shall be used throughout the paper.

Proposition 1.2

The family of Schatten norms satisfies the following properties.

i)
If $M\in {\mathbb {R}}^{d\times d}$ is symmetric, then its singular values $s_1(M),\ldots ,s_d(M)$ are equal to $|\lambda _1(M)|,\ldots ,|\lambda _d(M)|$, where $\lambda _1(M),\ldots ,\lambda _d(M)$ denote the eigenvalues of M (counted with their multiplicity). Hence $|M|_p=\Vert (\lambda _1(M),\ldots ,\lambda _d(M))\Vert _{\ell ^p}$.
ii)
If $M\in {\mathbb {R}}^{d\times d}$ and $N\in O({\mathbb {R}}^d)$, then $|M N|_p=|N M|_p=|M|_p$.
iii)
If $M,\,N\in {\mathbb {R}}^{d\times d}$, then $|M N|_p\leqq |M|_p |N|_p$.
iv)
If $M\in {\mathbb {R}}^{d\times d}$, then $|M|_p=\sup _N M\,\cdot \, N$, where the supremum is taken among all $N\in {\mathbb {R}}^{d\times d}$ with $|N|_{p'}\leqq 1$, for $p'$ the conjugate exponent of p, defined by $1/p+1/p'=1$.
v)
If M has rank 1, then $|M|_p$ coincides with the Hilbert–Schmidt norm of M for every $p\in [1,\infty ]$.
vi)
If $p\in (1,\infty )$, then the Schatten p-norm is strictly convex.
vii)
If $M\in {\mathbb {R}}^{d\times d}$, then $|M|_p\leqq C|M|_q$, where $C=C(d,p,q)$ depends only on d, p and q.
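Definition 1.1 and items ii) and v) of Proposition 1.2 can be tested numerically in the simplest case $d=2$. The following Python sketch is an illustration of ours, restricted to $2\times 2$ matrices so that the singular values have a closed form; the function names are hypothetical.

```python
import math


def singular_values_2x2(m) -> tuple:
    """Singular values of a 2x2 matrix, using
    s1^2 + s2^2 = (Hilbert-Schmidt norm)^2 and s1 * s2 = |det M|."""
    (a, b), (c, d) = m
    q = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(q * q - 4.0 * det * det, 0.0))
    return math.sqrt((q + disc) / 2.0), math.sqrt(max((q - disc) / 2.0, 0.0))


def schatten_norm_2x2(m, p: float) -> float:
    """Schatten p-norm: the l^p norm of the vector of singular values."""
    s1, s2 = singular_values_2x2(m)
    if math.isinf(p):
        return max(s1, s2)
    return (s1 ** p + s2 ** p) ** (1.0 / p)


def matmul_2x2(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


rank_one = ((2.0, 4.0), (1.0, 2.0))  # the tensor product of (2, 1) and (1, 2)
generic = ((1.0, 2.0), (3.0, 4.0))
theta = 0.7
rotation = ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))

for p in (1.0, 2.0, math.inf):
    print(p,
          schatten_norm_2x2(rank_one, p),                       # independent of p
          schatten_norm_2x2(generic, p),
          schatten_norm_2x2(matmul_2x2(generic, rotation), p))  # same as previous
```

For the rank one matrix every Schatten norm coincides with the Hilbert–Schmidt norm, here equal to 5, while multiplying by a rotation leaves all Schatten norms unchanged up to rounding.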

Definition 1.3

($L^r$-Schatten norm) Let $p,\,r\in [1,\infty ]$ and let $M\in C_{\textrm{c}}({\mathbb {R}}^d)^{d\times d}$. We define the Schatten (p, r)-norm of M by

$$\begin{aligned} \Vert M\Vert _{p,r}:=\Vert |M|_p\Vert _{L^r({\mathbb {R}}^d)}. \end{aligned}$$

2.1.1 Poincaré inequalities

We recall basic facts about Poincaré inequalities.

Definition 1.4

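Let $A\subseteq {\mathbb {R}}^d$ be a domain. We say that A supports Poincaré inequalities if for every $q\in [1,d)$ there exists a constant $C=C(A,q)$ depending on A and q such that

$$\begin{aligned} \Vert f-f_A\Vert _{L^{q^*}(A)}\leqq C\Vert \nabla f\Vert _{L^q(A)}\qquad \text {for every }f\in W^{1,q}(A), \end{aligned}$$

where $f_A$ denotes the mean of f over A and $1/{q^*}=1/q-1/d$.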

2.2 Hessian–Schatten total variation

For this section fix $\Omega \subseteq {\mathbb {R}}^d$ open and $p\in [1,\infty ]$. We let $p'$ denote the conjugate exponent of p. Now we recall the definition of Hessian–Schatten total variation and some basic properties, see [2] and the references therein.

Definition 1.5

(Hessian–Schatten variation) Let $f\in L^1_{\textrm{loc}}(\Omega )$. For every $A\subseteq \Omega $ open we define

$$\begin{aligned} |{\textrm{D}}^2_p f|(A):=\sup _F \int _A \sum _{i,\,j=1,\ldots ,d} f \partial _i\partial _j F_{i,j}\textrm{d}\mathscr {L}^d, \end{aligned}$$

(1.1)

where the supremum runs among all $F\in C_{\textrm{c}}^\infty (A)^{d\times d}$ with $\Vert F\Vert _{p',\infty }\leqq 1$. We say that f has bounded p-Hessian–Schatten variation in $\Omega $ if $|{\textrm{D}}^2_p f|(\Omega )<\infty $.

Remark 1.6

If f has bounded p-Hessian–Schatten variation in $\Omega $, then the set function defined in (1.1) is the restriction to open sets of a finite Borel measure, that we still call $|{\textrm{D}}^2_p f|$. This can be proved with a classical argument, building upon [12] (see also [3, Theorem 1.53]).

By its very definition, the p-Hessian–Schatten variation is lower semicontinuous with respect to convergence in distributions.$\blacksquare $

For any couple $p,\,q\in [1,\infty ]$, f has bounded p-Hessian–Schatten variation if and only if f has bounded q-Hessian–Schatten variation and, moreover,

$$\begin{aligned} C^{-1}|{\textrm{D}}^2_p f|\leqq |{\textrm{D}}^2_q f|\leqq C|{\textrm{D}}^2_p f| \end{aligned}$$

for some constant $C=C(d,p,q)$ depending only on d, p and q. This is due to equivalence of matrix norms.

The next proposition connects Definition 1.5 with Demengel’s space of functions with bounded Hessian [11], namely Sobolev functions whose partial derivatives are functions of bounded variation. We shall use ${\textrm{D}}$ to denote the distributional derivative, to keep the distinction with $\nabla $ notation (used also for gradients of Sobolev functions).

Proposition 1.7

[2, Proposition 9] Let $f\in L^1_{\textrm{loc}}(\Omega )$. Then the following are equivalent:

f has bounded Hessian–Schatten variation in $\Omega $,
$f\in W^{1,1}_{\textrm{loc}}(\Omega )$ and $\nabla f\in {\textrm{BV}}_{\textrm{loc}}(\Omega ;{\mathbb {R}}^d)$ with $|{\textrm{D}}\nabla f|(\Omega )<\infty $.

If this is the case, then, as measures,

$$\begin{aligned} |{\textrm{D}}^2_p f|=\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\bigg |_p |{\textrm{D}}\nabla f|. \end{aligned}$$

In particular, there exists a constant $C=C(d,p)$ depending only on d and p such that

$$\begin{aligned} C^{-1}|{\textrm{D}}\nabla f|\leqq |{\textrm{D}}^2_p f|\leqq C|{\textrm{D}}\nabla f| \end{aligned}$$

as measures.

Proposition 1.8

[2, Proposition 11] Let $f\in L^1_{\textrm{loc}}(\Omega )$. Then, for every $A\subseteq \Omega $ open, it holds

$$\begin{aligned} |{\textrm{D}}^2_p f|(A)=\inf \left\{ \liminf _k \int _A |\nabla ^2 f_k|_p\textrm{d}\mathscr {L}^d\right\} \end{aligned}$$

where the infimum is taken among all sequences $(f_k)\subseteq C^\infty (A)$ such that $f_k\rightarrow f$ in $L^1_{\textrm{loc}}(A)$. If moreover $f\in L^1(A)$, the convergence in $L^1_{\textrm{loc}}(A)$ above can be replaced by convergence in $L^1(A)$.

In the statement of the next lemma and in the sequel we denote by $B_\varepsilon (A)$ the open $\varepsilon $-neighbourhood of $A\subseteq {\mathbb {R}}^d$.

Lemma 1.9

[2, Lemma 12] Let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Let also $A\subseteq {\mathbb {R}}^d$ open and $\varepsilon >0$ with $B_\varepsilon (A)\subseteq \Omega $. Then, if $\rho \in C_{\textrm{c}}({\mathbb {R}}^d)$ is a convolution kernel with ${\textrm{supp}\,}\rho \subseteq B_\varepsilon (0)$, it holds

$$\begin{aligned} |{\textrm{D}}_p^2 (\rho *f)|(A)\leqq |{\textrm{D}}_p^2 f|(B_\varepsilon (A)). \end{aligned}$$

In the same spirit of Lemma 1.9, we have the following lemma.

Lemma 1.10

Let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Assume that $A\subseteq \Omega $ is open and invariant under the action of $SO({\mathbb {R}}^d)$. For any $U\in SO({\mathbb {R}}^d)$ the function $f_U:=f(U\,\cdot \,)$ satisfies $|{\textrm{D}}^2_p f_U|(A)= |{\textrm{D}}^2_p f|(A).$ In particular, setting

$$\begin{aligned} f^\textrm{rad}:=\int _{SO({\mathbb {R}}^d)} f_U\,\textrm{d}\mu _d(U), \end{aligned}$$

where $\mu _d$ is the normalized Haar measure on $SO({\mathbb {R}}^d)$, by convexity one has

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{rad}|(A)\leqq |{\textrm{D}}^2_p f|(A). \end{aligned}$$

Proof

The proof is very similar to the one of Lemma 1.9 above i.e. [2, Lemma 12], but we sketch it anyway for the reader’s convenience and for future reference.

We take any $F\in C_{\textrm{c}}^\infty (A)^{d\times d}$ with $\Vert F\Vert _{p',\infty }\leqq 1$ and we set $G:=U F(U^t\,\cdot \,) U^t$. A straightforward computation shows that

$$\begin{aligned} \sum _{i,j}\partial _i\partial _j G_{i,j}(x)=\sum _{i,\,j}(\partial _i\partial _j F_{i,j})(U^tx) \end{aligned}$$

and that $G\in C_{\textrm{c}}^\infty (A)^{d\times d}$ with $\Vert G\Vert _{p',\infty }\leqq 1$. Then we compute, by a change of variables,

$$\begin{aligned} \int _A \sum _{i,\,j} f_U\partial _i\partial _j F_{i,j}\textrm{d}\mathscr {L}^d&=\int _A f (x) \sum _{i,\,j}(\partial _i\partial _j F_{i,j})(U^t x)\textrm{d}\mathscr {L}^d(x)\\ {}&=\int _A f (x) \sum _{i,\,j}(\partial _i\partial _j G_{i,j})(x)\textrm{d}\mathscr {L}^d(x). \end{aligned}$$

In particular,

$$\begin{aligned} \bigg | \int _A \sum _{i,j} f_U\partial _i\partial _j F_{i,j}\textrm{d}\mathscr {L}^d(x)\bigg |\leqq |{\textrm{D}}^2_p f|(A). \end{aligned}$$

Now, by Fubini’s Theorem,

$$\begin{aligned} \int _A \sum _{i,j} f^\textrm{rad}\partial _i\partial _j F_{i,j}\textrm{d}\mathscr {L}^d&=\int _{SO({\mathbb {R}}^d)}\int _A f_U \sum _{i,j}\partial _i\partial _j F_{i,j}\textrm{d}\mathscr {L}^d\textrm{d}\mu _d(U)\\ {}&\leqq \int _{SO({\mathbb {R}}^d)} |{\textrm{D}}^2_p f|(A)\textrm{d}\mu _d(U)= |{\textrm{D}}^2_p f|(A), \end{aligned}$$

whence the claim as F was arbitrary. $\quad \square $

Proposition 1.11

(Sobolev embedding, [2, Proposition 13]) Let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Then

$$\begin{aligned}&f\in L^{d/(d-2)}_{\textrm{loc}}(\Omega )\cap W_{\textrm{loc}}^{1,d/(d-1)}(\Omega )\qquad{} & {} \text {if }d\geqq 3,\\&f\in L^\infty _{\textrm{loc}}(\Omega )\cap W^{1,2}_{\textrm{loc}}(\Omega )\qquad{} & {} \text {if }d= 2,\\&f\in L^\infty _{\textrm{loc}}(\Omega )\cap W^{1,\infty }_{\textrm{loc}}(\Omega )\qquad{} & {} \text {if }d= 1 \end{aligned}$$

and, if $d=2$, f has a continuous representative.

More explicitly, for every $A\subseteq \Omega $ bounded domain that supports Poincaré inequalities and $r\in [1,\infty )$, there exist $C=C(A,r)$ and an affine map $g=g(A,f)$ such that, setting ${\tilde{f}}:=f-g$, it holds that

$$\begin{aligned}&\Vert {\tilde{f}}\Vert _{L^{d/(d-2)}(A)}+\Vert \nabla {\tilde{f}}\Vert _{L^{d/(d-1)}(A)}\leqq C|{\textrm{D}}^2 f|(A)\qquad{} & {} \text {if }d\geqq 3,\\&\Vert {\tilde{f}}\Vert _{L^r(A)}+\Vert \nabla {\tilde{f}}\Vert _{L^2(A)}\leqq C|{\textrm{D}}^2 f|(A)\qquad{} & {} \text {if }d= 2,\\&\Vert {\tilde{f}}\Vert _{L^\infty (A)}+\Vert \nabla {\tilde{f}}\Vert _{L^\infty (A)}\leqq C|{\textrm{D}}^2 f|(A)\qquad{} & {} \text {if }d= 1. \end{aligned}$$

Lemma 1.12

(Rigidity, [2, Lemma 15]) Let $f,\,g\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $ and assume that

$$\begin{aligned} |{\textrm{D}}^2_p (f+g)|(\Omega )=|{\textrm{D}}^2_p f|(\Omega )+|{\textrm{D}}^2_p g|(\Omega ). \end{aligned}$$

Then

$$\begin{aligned} |{\textrm{D}}^2_p (f+g)|=|{\textrm{D}}^2_p f|+|{\textrm{D}}^2_p g| \end{aligned}$$

as measures on $\Omega $.

2.3 Hessian–Schatten variation of radial functions

The following result is new and aims at computing the Hessian–Schatten variation of radial functions. This will be needed in Sections 3 and 4. Notice also that, as expected, the contribution involving the singular part of $|{\textrm{D}}g'|$ in (1.2) below does not depend on p.

In the proof we shall use the auxiliary function $F:(0,R)\times {\mathbb {R}}^2\rightarrow [0,\infty )$

$$\begin{aligned} F(s,(v_1,v_2)):=d\omega _d\Vert (s v_2,v_1,\ldots ,v_1)\Vert _{\ell ^p}s^{d-2}, \end{aligned}$$

where $v_1$ is repeated $d-1$ times and $\omega _d:=\mathscr {L}^d(B_1)$ (d will be the dimension of the Euclidean ambient space). Notice that F is continuous, convex and 1-homogeneous with respect to the $(v_1,v_2)$ variable. Therefore, for intervals $(r_1,r_2)\subseteq (0,R)$, the functional

$$\begin{aligned} \Phi _{(r_1,r_2)}(\mu ):=\int _{(r_1,r_2)}F\bigg (s,\frac{\textrm{d}{\mu }}{\textrm{d}{|\mu |}}\bigg )\textrm{d}|\mu |= \int _{(r_1,r_2)}F\bigg (s,\frac{\textrm{d}{\mu }}{\textrm{d}{\lambda }}\bigg )\textrm{d}\lambda \qquad \hbox { whenever}\ |\mu |\ll \lambda , \end{aligned}$$

defined on ${\mathbb {R}}^2$-valued measures $\mu $ makes sense and is convex. Furthermore, the Reshetnyak lower semicontinuity theorem (e.g. [3, Theorem 2.38]) grants its lower semicontinuity with respect to weak convergence in duality with $C_{\textrm{c}}((r_1,r_2))$.

Proposition 1.13

Let $d\geqq 2$ and let $g\in L^1_{\textrm{loc}}((0,R))$ be such that $\int _0^r s^{d-1}|g(s)|\textrm{d}s<\infty $ for every $r\in (0,R)$. Define $f(\,\cdot \,):=g(|\,\cdot \,|)\in L^1_{\textrm{loc}}(B_R(0))$.

Assume that f has bounded Hessian–Schatten total variation in $B_R(0)$. Then $g\in W^{1,1}_{\textrm{loc}}((0,R))$ and $g'\in {\textrm{BV}}_{\textrm{loc}}((0,R))$. Write the decomposition ${\textrm{D}}g'={\textrm{D}}^s g'+g''\mathscr {L}^1$, where ${\textrm{D}}^s g'\perp \mathscr {L}^1$. Then, for every $r\in (0,R]$ and $p\in [1,\infty ]$, one has

$$\begin{aligned} |{\textrm{D}}^2_p f|(B_r(0))= d\,\omega _d \bigg (\int _{(0,r) }s^{d-1}\textrm{d}{|{\textrm{D}}^s g'|( s)}+\int _0^r \Vert (s g''(s),g'(s),\ldots ,g'(s))\Vert _{\ell ^p} s^{d-2}\textrm{d}s\bigg ). \end{aligned}$$

(1.2)

Conversely, assume that $g\in W^{1,1}_{\textrm{loc}}((0,R))$ and $g'\in {\textrm{BV}}_{\textrm{loc}}((0,R))$, and, with the same notation above, that

$$\begin{aligned} \int _{(0,R)} s^{d-1}\textrm{d}{|{\textrm{D}}^s g'|( s)}+\int _0^R\Vert (s g''(s),g'(s),\ldots ,g'(s))\Vert _{\ell ^p} s^{d-2}\textrm{d}s<\infty . \end{aligned}$$

Then f has bounded Hessian–Schatten total variation in $B_R(0)$ and the Hessian–Schatten variation of f is computed as above.
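As a sanity check of formula (1.2), the following minimal sketch evaluates its right-hand side numerically in the smooth case ${\textrm{D}}^s g'=0$ and compares it with two cases that can be computed by hand: $g(s)=s^2$, where $\nabla ^2 f=2\,\textrm{Id}$ has Schatten p-norm $2d^{1/p}$, and the cone $g(s)=s$ in dimension 2. The function names, the midpoint rule and the number of quadrature nodes are illustrative choices, not part of the original argument.

```python
import math

def unit_ball_volume(d):
    """omega_d, the Lebesgue measure of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def lp_norm(values, p):
    """ell^p norm of a finite sequence, with p = math.inf allowed."""
    if p == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def radial_hs_variation(dg, d2g, d, p, r, n=20000):
    """Right-hand side of (1.2) when D^s g' = 0, via the midpoint rule.

    dg and d2g are callables returning g'(s) and g''(s) (hypothetical helpers).
    """
    h = r / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        vec = [s * d2g(s)] + [dg(s)] * (d - 1)
        total += lp_norm(vec, p) * s ** (d - 2) * h
    return d * unit_ball_volume(d) * total

# g(s) = s^2, i.e. f(x) = |x|^2, whose Hessian is 2 Id everywhere.
d, p, r = 3, 2, 1.5
numeric = radial_hs_variation(lambda s: 2 * s, lambda s: 2.0, d, p, r)
exact = 2 * d ** (1 / p) * unit_ball_volume(d) * r ** d

# Cone g(s) = s in dimension 2: the formula gives 2 * pi * r for every p.
cone = radial_hs_variation(lambda s: 1.0, lambda s: 0.0, 2, 1, 1.0)
print(numeric, exact, cone)
```

In the cone example only the tangential entries $g'(s)$ contribute, which is why the value does not depend on p when $d=2$.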

Proof

Let $r\in (0,R)$. Let $\rho _k$ be radial Friedrichs mollifiers for ${\mathbb {R}}^d$ and define $f_k:=\rho _k*f$. As $f_k$ is still radial, we write $f_k(\,\cdot \,)=g_k(|\,\cdot \,|)$, where $g_k\in C^\infty ((0,r))$. As $f_k\rightarrow f$ in $L^1 (B_r(0))$, we have $g_k\rightarrow g$ in $L^1_{\textrm{loc}}((0,r))$. Now we compute, on $B_r(0)$,

$$\begin{aligned} \nabla ^2 f_k(x)=g_k''(|x|)\frac{x\otimes x}{|x|^2}+g_k'(|x|)\frac{|x|^2 \textrm{Id}-x\otimes x}{|x|^3}. \end{aligned}$$

Notice that the eigenvalues of the matrix appearing at the right hand side of the equation above are $g_k''(|x|)$ with multiplicity 1 and $g_k'(|x|)/|x|$ with multiplicity $d-1$, the eigenvectors being x and a basis of $x^\perp $. Therefore, by Proposition 1.7, on $B_r(0)$ one has

$$\begin{aligned} |{\textrm{D}}^2_p f_k|=|x|^{-1}\big \Vert \big (|x|g_k''(|x|),g_k'(|x|),\ldots ,{g_k'(|x|)}\big )\big \Vert _{\ell ^p}\mathscr {L}^d \geqq |g_k''(|x|)|\mathscr {L}^d.\qquad \end{aligned}$$

(1.3)
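The eigenvalue claim used for (1.3) can be checked numerically at a sample point. The sketch below is only illustrative: the point x, the values standing in for $g_k'(|x|)$ and $g_k''(|x|)$, and the helper names are arbitrary assumptions. It builds the matrix from the formula for $\nabla ^2 f_k$ and applies it to x and to a vector orthogonal to x.

```python
import math

def radial_hessian(x, dg, d2g):
    """Hessian of f(x) = g(|x|) at x != 0, given g'(|x|) = dg and g''(|x|) = d2g."""
    n = len(x)
    s = math.sqrt(sum(c * c for c in x))
    return [[d2g * x[i] * x[j] / s**2
             + dg * ((s**2 if i == j else 0.0) - x[i] * x[j]) / s**3
             for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

x = [1.0, 2.0, 2.0]              # sample point with |x| = 3
dg, d2g = 0.7, -1.3              # arbitrary sample values of g'(3) and g''(3)
H = radial_hessian(x, dg, d2g)

radial = matvec(H, x)            # should equal d2g * x
tangent = [2.0, -1.0, 0.0]       # orthogonal to x
tangential = matvec(H, tangent)  # should equal (dg / |x|) * tangent
print(radial, tangential)
```

The radial direction carries the eigenvalue $g_k''(|x|)$, and every direction in $x^\perp $ carries $g_k'(|x|)/|x|$, in agreement with the multiplicities 1 and $d-1$ stated above.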

As $|{\textrm{D}}_p^2 f_k|(B_r(0))$ is uniformly bounded by Lemma 1.9, we obtain the claimed membership for g, letting eventually $r\nearrow R$.

To prove the inequality $\geqq $ in (1.2), it is enough to compute $|{\textrm{D}}^2_p f|(A_{r_1,r_2})$, where we define the open annulus

$$\begin{aligned} A_{r_1,r_2}:=B_{r_2}(0)\setminus \bar{B}_{r_1}(0) \end{aligned}$$

for $[r_1,r_2]\subseteq (0,R)$. Also, there is no loss of generality in assuming that $r_1$ and $r_2$ are such that $|{\textrm{D}}g'|(\{r_1\})=|{\textrm{D}}g'|(\{r_2\})=0$, as well as $|{\textrm{D}}\nabla f|(\partial A_{r_1,r_2})=0$, hence we will tacitly assume this condition in what follows.

From (1.3), with the notation $\mu _g:=(g'\mathscr {L}^1,{\textrm{D}}g')$, we get

$$\begin{aligned} |{\textrm{D}}^2_p f_k|(A_{r_1,r_2})=\int _{A_{r_1,r_2}}|{\textrm{D}}^2_pf_k|(x)\textrm{d}\mathscr {L}^d(x)=\Phi _{(r_1,r_2)}(\mu _{g_k}). \end{aligned}$$

Now notice that Lemma 1.9 and our choice of radii grant $|{\textrm{D}}^2_p f|(A_{r_1,r_2})=\lim _k |{\textrm{D}}^2_p f_k|(A_{r_1,r_2})$, so that the lower semicontinuity of $\Phi $ together with the weak* convergence of $\mu _{g_k}$ to $\mu _g$ grants

$$\begin{aligned} |{\textrm{D}}^2_p f|(A_{r_1,r_2})&\geqq \Phi _{(r_1,r_2)}(\mu _g)\\&=d\,\omega _d \bigg (\int _{(r_1,r_2)} s^{d-1}\textrm{d}{|{\textrm{D}}^s g'|( s)}+\int _{r_1}^{r_2} \Vert (s g''(s),g'(s),\dots ,g'(s))\Vert _{\ell ^p} s^{d-2}\textrm{d}s\bigg ). \end{aligned}$$

Letting $r_1\rightarrow 0$ and $r_2\rightarrow r$ provides the inequality $\geqq $ in (1.2).

Now we prove the converse implication and inequality. This time we denote by $(\rho _k)$ a sequence of Friedrichs mollifiers on ${\mathbb {R}}$ and we set $g_k:=\rho _k*g$ and $f_k(\,\cdot \,):=g_k(|\,\cdot \,|)$. Notice that, with our choice of the radii, $|\mu _{g_k}|((r_1,r_2))$ converges to $|\mu _g|((r_1,r_2))$ as $k\rightarrow \infty $; therefore, invoking the Reshetnyak continuity theorem (e.g. [3, Theorem 2.39]), we get

$$\begin{aligned}&|{\textrm{D}}^2_p f|(A_{r_1,r_2})\leqq \liminf _k |{\textrm{D}}^2_p f_k|(A_{r_1,r_2})= \liminf _k \Phi _{(r_1,r_2)}(\mu _{g_k})\\&\quad =\Phi _{(r_1,r_2)}(\mu _g)\leqq \Phi _{(0,R)}(\mu _g)\\&\quad = d\,\omega _d\biggl (\int _{(0,R)} s^{d-1}\textrm{d}{|{\textrm{D}}^s g'|( s)} +\int _0^R\Vert (s g''(s),g'(s),\ldots ,g'(s))\Vert _{\ell ^p} s^{d-2}\textrm{d}s\biggr ). \end{aligned}$$

Letting $r_1\rightarrow 0$ and $r_2\rightarrow R$ gives that f has bounded Hessian–Schatten total variation in $B_R(0)\setminus \{0\}$. To conclude, obtaining also the converse inequality in (1.2), we need just to apply the classical Lemma 1.14 below to f and to the partial derivatives of f, taking into account the mutual absolute continuity of $|{\textrm{D}}^2_p f|$ and $|{\textrm{D}}\nabla f|$ (Proposition 1.7). $\quad \square $

Lemma 1.14

Let $B_R(0)\subseteq {\mathbb {R}}^d$, $d\geqq 2$ and let $h\in W^{1,1}(B_R(0){\setminus }\{0\})$ (resp. $h\in {\textrm{BV}}(B_R(0){\setminus }\{0\})$). Then $h\in W^{1,1}(B_R(0))$ (resp. $h\in {\textrm{BV}}(B_R(0))$ and $|{\textrm{D}}h|(\{0\})=0$).

Proof

By a truncation argument, we can assume with no loss of generality that h is bounded. Then, the approximation of h by the functions $h_k=h(1-\psi _k)\in W^{1,1}(B_R(0))$ (resp. ${\textrm{BV}}(B_R(0))$), where $\psi _k\in C^1_{\textrm{c}}(B_{1/k}(0))$ satisfy $|\nabla \psi _k|\leqq 2 k$, $0\leqq \psi _k\leqq 1$ and $\psi _k=1$ in a neighbourhood of 0, together with the Leibniz rule, provides the result. $\quad \square $
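The role of the assumption $d\geqq 2$ can be made explicit with the following estimate, given here as a sketch for bounded h and for k so large that $B_{1/k}(0)\subseteq B_R(0)$. When the Leibniz rule is applied to $h_k=h(1-\psi _k)$, the term in which the derivative falls on the cut-off is controlled by

$$\begin{aligned} \int _{B_R(0)}|h|\,|\nabla \psi _k|\,\textrm{d}\mathscr {L}^d\leqq 2k\,\Vert h\Vert _{L^\infty }\,\mathscr {L}^d(B_{1/k}(0))=2\,\omega _d\,\Vert h\Vert _{L^\infty }\,k^{1-d}, \end{aligned}$$

which tends to 0 as $k\rightarrow \infty $ because $d\geqq 2$; for $d=1$ the same bound would not decay.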

3 Density of CPWL Functions

We recall the definition of continuous piecewise linear ($\mathrm {CPWL}$) functions. In view of this definition, we recall that a simplex in ${\mathbb {R}}^d$ is the convex hull of $d+1$ points (called vertices of the simplex) that do not lie on a hyperplane, and a face of a simplex is the convex hull of a subset of its vertices.

Definition 2.1

Let $\Omega \subseteq {\mathbb {R}}^d$ be open and let $f\in C(\Omega )$. We say that f is $\mathrm {CPWL}$ (or $f\in \textrm{CPWL}(\Omega )$) if there exists a decomposition of ${\mathbb {R}}^d$ into d-dimensional simplexes $\{P_k\}_{k\in {\mathbb {N}}}$, such that

(i)
$P_k\cap P_h$ is either empty or a common face of $P_k$ and $P_h$, for every $h\ne k$;
(ii)
for every k, the restriction of f to $P_k\cap \Omega $ is affine;
(iii)
the decomposition is locally finite, in the sense that for every ball B, only finitely many $P_k$ intersect B.

The main theorem of this section is the following density result:

Theorem 2.2

For any $w\in C^2({\mathbb {R}}^d)$ there exists a sequence $(u_j)\subseteq \textrm{CPWL}({\mathbb {R}}^d)$ with $u_j\rightarrow w$ in the $L^\infty _{\textrm{loc}}({\mathbb {R}}^d)$ topology and such that for any bounded open set $\Omega \subseteq {\mathbb {R}}^d$ with $\mathscr {L}^d(\partial \Omega )=0$,

$$\begin{aligned} \lim _{j\rightarrow \infty } |{\textrm{D}}^2_1u_j|(\Omega )= |{\textrm{D}}^2_1 w|(\Omega ). \end{aligned}$$

Recall that, as explained in [2, Remark 22], because of lower semicontinuity the exponent $p=1$ is the only meaningful exponent in a density result as above, namely this sharp approximation by $\textrm{CPWL}$ functions is not possible for the energy $|{\textrm{D}}^2_p f|$ when $p>1$.

We defer the proof of Theorem 2.2 to Section 3.3, after having studied properties of “good” triangulations in Sections 3.1 and 3.2. Namely, we aim to construct triangulations of ${\mathbb {R}}^d$ which locally follow a prescribed orientation. The general scheme is illustrated in Fig. 2. In each of the large squares it coincides with a rotation of a triangulation of $\varepsilon {\mathbb {Z}}^d$; the difficulty resides in the interpolation region between different squares. In Section 3.1 we discuss standard material on general properties of triangulations. In Section 3.2 we present the specific construction; the key result is Theorem 2.14. This is then used to prove density in Theorem 2.2.

First, we start with a brief discussion around the result of Theorem 2.2. We recall the following extension result [2, Lemma 17]; its last claim is immediate once one also takes into account Proposition 1.11:

Lemma 2.3

Let $\Omega :=(0,1)^d\subseteq {\mathbb {R}}^d$ and let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Then there exist an open neighbourhood $\tilde{\Omega }$ of $\bar{\Omega }$ and ${\tilde{f}}\in L^1_{\textrm{loc}}(\tilde{\Omega })$ with bounded Hessian–Schatten variation in $\tilde{\Omega }$ such that

$$\begin{aligned} |{\textrm{D}}^2_1 {\tilde{f}}|(\partial \Omega )=0 \end{aligned}$$

(2.1)

and

$$\begin{aligned} {\tilde{f}}=f\qquad \hbox { a.e.\ on}\ \Omega . \end{aligned}$$

In particular, $f\in L^1(\Omega )$.

The next result gives a positive answer to [2, Conjecture 1], partially proved in the two-dimensional case in [2, Theorem 21]. The proof is based on Theorem 2.2 and a diagonal argument.

Theorem 2.4

Let $\Omega :=(0,1)^d\subseteq {\mathbb {R}}^d$. Then $\mathrm {CPWL}$ functions are dense with respect to the energy $|{\textrm{D}}^2_1\,\cdot \,|(\Omega )$ in the space

$$\begin{aligned} \{f\in L^1_{\textrm{loc}}(\Omega ):f\text { has bounded Hessian--Schatten variation in }\Omega \} \end{aligned}$$

with respect to the $L^1(\Omega )$ topology. Namely, for any $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $, there exists $\{f_k\}_k\subseteq \textrm{CPWL}(\Omega )$ with $f_k\rightarrow f$ in $L^1(\Omega )$ and $|{\textrm{D}}_1^2 f_k|(\Omega )\rightarrow |{\textrm{D}}_1^2 f|(\Omega )$.

Proof

Take f as in the statement, and let $\tilde{f}$ be given by Lemma 2.3. By using smooth cut-off functions, there is no loss of generality in assuming that $\tilde{f}$ is compactly supported in $\tilde{\Omega }$, hence, in particular, ${\tilde{f}}\in L^1({\mathbb {R}}^d)$. Also, we see that we can assume that $\mathscr {L}^d(\partial \tilde{\Omega })=0$.

Now let $({\tilde{f}}_k)\subseteq C_{\textrm{c}}^\infty ({\mathbb {R}}^d)$ be mollifications of ${\tilde{f}}$ by means of compactly supported mollifiers. Notice that $\tilde{f}_k\rightarrow {\tilde{f}}$ in $L^1({\mathbb {R}}^d)$ and $|{\textrm{D}}^2_1 {\tilde{f}}_k|(\tilde{\Omega })=|{\textrm{D}}^2_1 {\tilde{f}}_k|({\mathbb {R}}^d)\rightarrow |{\textrm{D}}^2_1 {\tilde{f}}|({\mathbb {R}}^d) =|{\textrm{D}}^2_1 {\tilde{f}}|(\tilde{\Omega })$, thanks to Proposition 1.9 and lower semicontinuity. Now, for any k, let $({\tilde{f}}_{k,h})\subseteq \textrm{CPWL}({\mathbb {R}}^d)$ be given by Theorem 2.2 for ${\tilde{f}}_k$. With a diagonal argument, we obtain $(g_\ell )\subseteq \textrm{CPWL}({\mathbb {R}}^d)$ with $g_\ell \rightarrow {\tilde{f}}$ in $L^1(\tilde{\Omega })$ and such that $|{\textrm{D}}_1^2 g_\ell |(\tilde{\Omega })\rightarrow |{\textrm{D}}^2_1 {\tilde{f}}|(\tilde{\Omega })$. By lower semicontinuity, the fact that $|{\textrm{D}}_1^2 g_\ell |(\tilde{\Omega })\rightarrow |{\textrm{D}}^2_1 {\tilde{f}}|(\tilde{\Omega })$ and (2.1), it easily follows that

$$\begin{aligned} |{\textrm{D}}_1^2 g_\ell |(\Omega )\rightarrow |{\textrm{D}}^2_1 {\tilde{f}}|(\Omega )=|{\textrm{D}}^2_1 f|(\Omega ). \end{aligned}$$

Clearly, $g_\ell \rightarrow f$ in $L^1(\Omega )$, so that the proof is concluded. $\quad \square $

Remark 2.5

Let $\Omega :=(0,1)^d$. As a consequence of Theorem 2.4, the description of the extremal points of the unit ball with respect to the $|{\textrm{D}}_1^2\,\cdot \,|(\Omega )$ seminorm obtained in [2, Theorem 25] remains in place in arbitrary dimension. In a slightly imprecise way, the result states that $\mathrm {CPWL}$ extremal points are dense in 1-Hessian–Schatten energy in the set of extremal points with respect to the $L^1(\Omega )$ topology. Notice that the description of $\mathrm {CPWL}$ extremal points is made explicit in [2, Proposition 23]. $\blacksquare $

Remark 2.6

The set of extremal points is not closed with respect to the convergence considered here. For example, with $d=2$, one can easily check that the function $g(x):=\max \{1-\Vert x\Vert _{\ell ^\infty },0\}$ is extremal, but the function $G_0(x):=g(x+e_1)+g(x-e_1)$ is not. Indeed, $G_0=\frac{1}{2} (2\,g(\cdot + e_1) + 2\,g (\cdot -e_1))$, with $|{\textrm{D}}_p^2G_0|({\mathbb {R}}^2)=|{\textrm{D}}_p^2(2\,g(\cdot + e_1))|({\mathbb {R}}^2)= |{\textrm{D}}_p^2(2\,g(\cdot - e_1))|({\mathbb {R}}^2)$. For $h\in (0,1/4)$ we then define $G_h:{\mathbb {R}}^2\rightarrow {\mathbb {R}}$ by

$$\begin{aligned} G_h(x):=\max \bigl \{1-\Vert x-(1+h)e_1\Vert _{\ell ^\infty }, 1-\Vert x+(1+h)e_1\Vert _{\ell ^\infty }, -\textrm{dist}_{\ell ^\infty }(x, \partial R_h)\bigr \} \end{aligned}$$

if $x\in R_h:=[-2-h,2+h]\times [-1,1]$, and $G_h(x)=0$ if $x\in {\mathbb {R}}^2{\setminus } R_h$ (see Fig. 1). Then each $G_h$ is CPWL, is extremal, and $G_h\rightarrow G_0$ uniformly with $|{\textrm{D}}_p^2G_h|({\mathbb {R}}^2)\rightarrow |{\textrm{D}}_p^2G_0|({\mathbb {R}}^2)$ for any $p\in [1,\infty ]$, but $G_0$ is not extremal.

Let us briefly comment on the proof of extremality of $G_h$ (the same argument implies extremality of g). If $G_h=\lambda f+(1-\lambda ) f'$, with $\lambda \in (0,1)$ and $|{\textrm{D}}_p^2f|({\mathbb {R}}^2)= |{\textrm{D}}_p^2f'|({\mathbb {R}}^2)= |{\textrm{D}}_p^2G_h|({\mathbb {R}}^2)$, then by Lemma 1.12 the support of $|{\textrm{D}}_p^2f|$ is contained in the support of $|{\textrm{D}}_p^2G_h|$, so that f (after choosing the continuous representative) is affine in each of the sets on which $G_h$ is affine. Adding an irrelevant affine function, we can reduce to the case that $f=0$ outside $R_h$. Using the fact that if two affine functions coincide on three non-collinear points then they coincide everywhere, one obtains $f=aG_h$, where $a:=f((1+h)e_1)\in {\mathbb {R}}$ (see Fig. 1); by equality of the norms $a=\pm 1$. Similarly, $f'=\pm G_h$, so that by $G_h=\lambda f+(1-\lambda ) f'$ we obtain $G_h=f=f'$.$\blacksquare $

3.1 General properties of triangulations

We define a triangulation of ${\mathbb {R}}^d$ as a pair of two sets, the first one, V, containing the vertices (nodes), the second one, E, containing the elements, which are nondegenerate compact simplexes with pairwise disjoint interior. Each simplex is the convex hull of its $d+1$ vertices. One further requires a compatibility condition that ensures that neighbouring elements share a complete face (and not a strict subset of a face). We remark that there is a large literature which studies this in the more general framework of simplicial complexes. For the present application the metric and regularity properties are crucial, so in this section we present the few relevant properties in a self-contained way.

Definition 2.7

A triangulation of ${\mathbb {R}}^d$ is a pair (V, E), with $V\subseteq {\mathbb {R}}^d$ and $E\subseteq {\mathcal {P}}({\mathbb {R}}^d)$ such that

i)
for every $e\in E$, e has non empty interior and there is $v_e\subseteq {\mathbb {R}}^d$ with $\# v_e=d+1$ and $e=\textrm{conv}\,(v_e)$;
ii)
$V=\bigcup _{e\in E} v_e$;
iii)
for any $e,\,e'\in E$ one has $e\cap e'=\textrm{conv}\,(v_e\cap v_{e'})$;
iv)
$\bigcup _{e\in E}e={\mathbb {R}}^d$.

We introduce four regularity properties:

(a)
The triangulation has the Delaunay property if for each $e\in E$, the unique open ball B with $v_e\subseteq \partial B$ obeys $B\cap V=\emptyset $.
(b)
The triangulation is $c_*$-non degenerate, for some $c_*>0$, if $(\textrm{diam}\, e)^d\leqq c_*\mathscr {L}^d(e)$ for all $e\in E$.
(c)
The set $V\subseteq {\mathbb {R}}^d$ is $(\bar{c}, \varepsilon )$-uniform, for some $\bar{c},\,\varepsilon >0$, if $|x-y|\geqq \varepsilon /\bar{c}$ for all $x\in V,\,y\in V$ with $x\ne y$ and $B_{\bar{c}\varepsilon }(q)\cap V\ne \emptyset $ for all $q\in {\mathbb {R}}^d$.
(d)
The triangulation is locally finite if, for every ball B, only finitely many elements of E intersect B.

Condition iii) states that two distinct elements of E are either disjoint or share a face of dimension between 0 and $d-1$; in particular distinct elements have disjoint interior. Notice that $\textrm{conv}\,(\emptyset )=\emptyset $.

The Delaunay property (a) states that the circumscribed sphere to each simplex does not contain any other vertex, and implies $\partial e\cap V=v_e$ for all $e\in E$. It can be interpreted as a statement that the vertices have been matched to form simplexes in an “optimal” way.

The non-degeneracy property (b) states that simplexes are uniformly non-degenerate, so that the affine bijection that maps e onto the standard simplex has a uniformly bounded condition number. It implies that there is $C=C(c_*,d)$ such that for any $e\in E$, any $x\in v_e$, any $F\in {\mathbb {R}}^d$ one has

$$\begin{aligned} |F|\leqq C(c_*,d) \sum _{y\in v_e\setminus \{x\}} \frac{|F\cdot (y-x)|}{|y-x|}. \end{aligned}$$

(2.2)

The uniformity property (c) of a set V of vertices ensures (for Delaunay triangulations) that all sides of all elements have length comparable to $\varepsilon $. Also, property (c) immediately implies property (d), as it forces V to be a locally finite set.
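Properties (a) and (b) can be tested directly on a finite patch of a planar triangulation. The following is a minimal sketch for $d=2$ only; the function names and the tolerance are assumptions made for illustration, and a finite patch is of course not a triangulation of ${\mathbb {R}}^2$ in the sense of Definition 2.7.

```python
import math
from itertools import combinations

def circumcircle(a, b, c):
    """Centre and radius of the circle through three non-collinear points of R^2."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    det = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / det
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / det
    return (ux, uy), math.dist((ux, uy), a)

def has_delaunay_property(vertices, elements, tol=1e-9):
    """Property (a): no vertex lies in the open circumscribed ball of any element."""
    for tri in elements:
        centre, radius = circumcircle(*tri)
        if any(math.dist(centre, v) < radius - tol for v in vertices):
            return False
    return True

def nondegeneracy_constant(elements):
    """Smallest c_* with (diam e)^2 <= c_* * area(e) for all elements (d = 2)."""
    worst = 0.0
    for a, b, c in elements:
        diam = max(math.dist(p, q) for p, q in combinations((a, b, c), 2))
        area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2
        worst = max(worst, diam**2 / area)
    return worst

# Patch of Z^2: every unit square is split along the same diagonal.
vertices = [(i, j) for i in range(3) for j in range(3)]
elements = []
for i in range(2):
    for j in range(2):
        elements.append(((i, j), (i + 1, j), (i + 1, j + 1)))
        elements.append(((i, j), (i + 1, j + 1), (i, j + 1)))

# A kite split along its long diagonal: each circumscribed ball contains a vertex.
kite = [(0, 0), (4, 0), (2, 1), (2, -1)]
bad = [((0, 0), (4, 0), (2, 1)), ((0, 0), (4, 0), (2, -1))]

print(has_delaunay_property(vertices, elements), nondegeneracy_constant(elements))
print(has_delaunay_property(kite, bad))
```

In the lattice patch the fourth corner of each square lies on the circumscribed circle but not inside the open ball, so property (a) holds even though the Delaunay triangulation of these vertices is not unique.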

Remark 2.8

Let (V, E) be a triangulation that has the Delaunay property (property (a)) and is $(\bar{c},\varepsilon )$-uniform (property (c)). Then $\textrm{diam}(e)\leqq 2\bar{c} \varepsilon $, for any $e\in E$.$\blacksquare $

Proof

Take $e\in E$ and let $q\in {\mathbb {R}}^d$ and $r\in (0,\infty )$ be such that $v_e\subseteq \partial B_r(q)$. By the Delaunay property, $V\cap B_r(q)=\emptyset $, so that, by $(\bar{c},\varepsilon )$-uniformity, $\bar{c}\varepsilon >r\geqq \textrm{diam}(e)/2$. $\quad \square $

We next show how given the set of vertices V one can abstractly obtain a good triangulation. The construction is standard up to a perturbation argument. As we could not find a reference with the complete result, we prove it.

Lemma 2.9

Let $V\subseteq {\mathbb {R}}^d$ be uniform in the sense of property (c) of Definition 2.7. Then there is $E\subseteq {\mathcal {P}}({\mathbb {R}}^d)$ such that (V, E) is a triangulation of ${\mathbb {R}}^d$ with the Delaunay property (a).

Proof

We define $f:{\mathbb {R}}^d\rightarrow [0,\infty ]$ by

$$\begin{aligned} f(x):={\left\{ \begin{array}{ll} |x|^2\qquad &{}\text { if } x\in V,\\ \infty \qquad &{}\text { otherwise.} \end{array}\right. } \end{aligned}$$

Let g be the convex envelope of f, which is CPWL (see Lemma 2.10 below). Moreover, notice that

$$\begin{aligned} g(x)=|x|^2=f(x)\qquad \text {for every }x\in V. \end{aligned}$$

Let $q\in {\mathbb {R}}^d$, $\mu \in {\mathbb {R}}$ be such that

$$\begin{aligned} A:=\{x: g(x)=\mu +2x\cdot q \} \end{aligned}$$

(2.3)

has nonempty interior. Notice that A is compact, convex and coincides with the closure of its interior, and $g(x)> \mu +2x\cdot q$ for every $x\in {\mathbb {R}}^d{\setminus } A$. Also, we set

$$\begin{aligned} w:=\{x\in V: \mu +2x\cdot q = |x|^2 \}= A\cap V, \end{aligned}$$

(2.4)

then,

$$\begin{aligned} \mu +2x\cdot q < |x|^2 \qquad \text {for all } x\in V\setminus w. \end{aligned}$$

Now we show that $\textrm{ext}\,(A)\subseteq V$ so that $\textrm{ext}\,(A)\subseteq w$ and hence $A=\textrm{conv}\,(w)$ with $\#w\geqq d+1$ (as A has nonempty interior). Take indeed $p\in \textrm{ext}\,(A)$ and assume $p\notin V$. Then, take a minimal set of points $\{p_1,\dots ,p_{k}\}\subseteq V$ such that $(p,g(p))\in \textrm{conv}\,\big ((p_1,f(p_1)),\dots , (p_{k},f(p_{k}))\big )$ (this is possible by (2.7) of Lemma 2.10 below). As $p\in \textrm{ext}\,(A)$, up to reordering, we can assume that $p_1\notin A$, hence by $g(p_1)>\mu +2p_1\cdot q$ we have that $g(p)>\mu +2p\cdot q$, a contradiction.

The above equations can be rewritten as

$$\begin{aligned} |x-q|^2=\mu +|q|^2 \qquad \text {for all }x\in w \end{aligned}$$

and

$$\begin{aligned} |x-q|^2>\mu +|q|^2 \qquad \text {for all } x\in V\setminus w. \end{aligned}$$

We set $r:=\sqrt{\mu +|q|^2}$, so that these conditions are $w\subseteq \partial B_r(q)$ and $V\cap B_r(q)=\emptyset $, so that the set w has the Delaunay property.
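The passage from the supporting affine function $\mu +2x\cdot q$ to the ball $B_r(q)$ can be illustrated in the plane. The sketch below, with hypothetical helper names and exact rational arithmetic, solves $\mu +2x\cdot q=|x|^2$ for three points of ${\mathbb {Z}}^2$ and compares $|x-q|^2$ with $r^2=\mu +|q|^2$.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def supporting_plane(points):
    """Solve mu + 2 x.q = |x|^2 for three non-collinear points of R^2 (Cramer's rule)."""
    rows = [[Fraction(1), Fraction(2 * x), Fraction(2 * y)] for x, y in points]
    rhs = [Fraction(x * x + y * y) for x, y in points]
    base = det3(rows)
    sol = []
    for col in range(3):
        m = [row[:] for row in rows]
        for k in range(3):
            m[k][col] = rhs[k]
        sol.append(det3(m) / base)
    return sol[0], (sol[1], sol[2])

def squared_distance(x, q):
    """|x - q|^2 for points of R^2."""
    return (x[0] - q[0]) ** 2 + (x[1] - q[1]) ** 2

w = [(0, 0), (1, 0), (0, 1)]
mu, q = supporting_plane(w)
r2 = mu + q[0] ** 2 + q[1] ** 2            # r^2 = mu + |q|^2
on_sphere = [squared_distance(x, q) for x in w]
outside = squared_distance((2, 0), q)       # another lattice point
print(mu, q, r2, on_sphere, outside)
```

For this choice of w the three points lie on $\partial B_r(q)$ with $q=(1/2,1/2)$, while the lattice point $(2,0)$ satisfies the strict inequality $|x-q|^2>\mu +|q|^2$.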

Notice then that for every $x\in V$, there is at least one set A as in (2.3) with nonempty interior and with $x\in A\cap V$ (this set was called w): this follows from the fact that g is CPWL.

Any decomposition of those elements A in (2.3) with nonempty interior into non degenerate simplexes with vertices in w leads to a pair (V, E) with all 4 claimed properties of triangulations, except for iii) of Definition 2.7. In the rest of the proof we show by a perturbation argument that a decomposition exists such that property iii), which relates neighbouring pieces in which g is affine, also holds.

We first remark that property iii) is automatically true if g is non degenerate, in the sense that each A is a simplex, which is the same as $\#w=d+1$ (we are going to add a few details about this in the sequel of the proof). In turn, this is true if for every choice of $X:=\{x_1, \ldots , x_{d+2}\}\subseteq V$ the $d+2$ points $\{(x,g(x))\}_{x\in X}\in {\mathbb {R}}^{d+1}$ do not lie in a d-dimensional hyperplane, so that (2.4) cannot hold for all $x\in X$.

We fix an enumeration $\varphi :V\rightarrow {\mathbb {N}}\setminus \{0,1\}$ and recall that V is $(\bar{c},\varepsilon )$-uniform. For any $\rho \in (0,\varepsilon \wedge 1]$ we consider $f_\rho :{\mathbb {R}}^d\rightarrow [0,\infty ]$ defined by

$$\begin{aligned} f_\rho (x):={\left\{ \begin{array}{ll} |x|^2+\rho ^{\varphi (x)}\qquad &{}\text { if } x\in V,\\ \infty \qquad &{}\text { otherwise.} \end{array}\right. } \end{aligned}$$

For a given set $X:=\{x_1,\ldots , x_{d+2}\}\subseteq V$ consider the $d+2$ equations

$$\begin{aligned} \mu +2x_i\cdot q =|x_i|^2+\rho ^{\varphi (x_i)}\qquad \text {for } i=1,\ldots , d+2 \end{aligned}$$

(2.5)

in the $d+1$ unknowns $(\mu ,q)$. The affine map $T:{\mathbb {R}}^{d+1}\rightarrow {\mathbb {R}}^{d+2}$ defined by $T_i(\mu ,q):=\mu +2x_i\cdot q - |x_i|^2$ has an image which is at most $d+1$ dimensional, hence contained in a set of the form $\{\Xi \in {\mathbb {R}}^{d+2}: \Xi \cdot \nu =a\}$ for some $\nu \in S^{d+1}$, $a\in {\mathbb {R}}$ (which depend on X). If the system (2.5) has a solution, then

$$\begin{aligned} \sum _{i=1}^{d+2} \nu _i \rho ^{\varphi (x_i)}=a. \end{aligned}$$

As $|\nu |=1$ and the exponents are all distinct, this is a nontrivial polynomial equation in $\rho $, and has at most finitely many solutions. As there are countably many possible choices of the set $X\subseteq V$, for all but countably many values of $\rho $ no such system has a solution. Therefore we can choose $\rho _j\searrow 0$ such that (2.5) has no solution for any choice of X with $X=\{x_1,\ldots ,x_{d+2}\}\subseteq V$.
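The effect of the perturbation $\rho ^{\varphi (x)}$ can be seen on the four corners of a unit square in ${\mathbb {Z}}^2$, which are cocircular, so that the unperturbed lifted points are coplanar. The sketch below checks coplanarity through a determinant in exact arithmetic; the enumeration values assigned by `phi` and the choice $\rho =1/2$ are arbitrary assumptions for illustration.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def lifted_points_coplanar(points, heights):
    """True if the d+2 lifted points (x, h) lie in a common hyperplane of R^{d+1}."""
    matrix = [[Fraction(1)] + [Fraction(c) for c in x] + [Fraction(h)]
              for x, h in zip(points, heights)]
    return det(matrix) == 0

X = [(0, 0), (1, 0), (0, 1), (1, 1)]
phi = {X[0]: 2, X[1]: 3, X[2]: 4, X[3]: 5}   # hypothetical values of the enumeration
rho = Fraction(1, 2)

unperturbed = [x * x + y * y for x, y in X]
perturbed = [x * x + y * y + rho ** phi[(x, y)] for x, y in X]
print(lifted_points_coplanar(X, unperturbed), lifted_points_coplanar(X, perturbed))
```

With these values the perturbed heights are no longer coplanar, so the system (2.5) has no solution for this X, which is the non-degeneracy the argument needs.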

Fix now an index j and let $g_{\rho _j}$ be the convex envelope of $f_{\rho _j}$. Notice that if $\rho _j$ is sufficiently small (that we are going to assume from here on), then, as V is discrete and $|x|^2$ is strictly convex,

$$\begin{aligned} g_{\rho _j}(x)=|x|^2+\rho _j^{\varphi (x)}=f_{\rho _j}(x)\qquad \text {for every }x\in V. \end{aligned}$$

Our choice of $\rho _j$ implies that for every j, for every choice of $X:=\{x_1, \ldots , x_{d+2}\}\subseteq V$ the $d+2$ points $\{(x,g_{\rho _j}(x))\}_{x\in X}\in {\mathbb {R}}^{d+1}$ do not lie in a d-dimensional hyperplane. Now pick $\mu ,\,q$ such that

$$\begin{aligned} A:=\{x: g_{\rho _j}(x)=\mu +2x\cdot q \} \end{aligned}$$

has nonempty interior (the function $g_{\rho _j}$ is CPWL, by Lemma 2.10 below). By non-degeneracy, arguing as above, $A=\textrm{conv}\,(w)$, with $\#w=d+1$ and $\textrm{Int}(A)\cap V=\emptyset $. We define $E_j$ as the family of those sets.

Let us justify why $(V,E_j)$ is a triangulation of ${\mathbb {R}}^d$. It is enough to show that property iii) holds. Take then $e_1,e_2\in E_j$ (with vertices $w_1,w_2$), so that there exist two affine functions $L_1,L_2$ such that $g_{\rho _j}=L_i$ on $e_i $ and $g_{\rho _j}>L_i$ on ${\mathbb {R}}^d\setminus e_i $, for $i=1,2$. Assume that $\xi \in e_1\cap e_2$, so that $L_1(\xi )=g_{\rho _j}(\xi )=L_2(\xi )$. Take a minimal set $\{\zeta _1,\dots ,\zeta _k\}\subseteq w_2$ with $\xi \in \textrm{conv}\,(\{\zeta _1,\dots ,\zeta _k\})$. As for every $a=1,\dots ,k$, $L_2(\zeta _a)=g_{\rho _j}(\zeta _a)\geqq L_1(\zeta _a)$, it follows that for every $a=1,\dots ,k$, $g_{\rho _j}(\zeta _a)=L_1(\zeta _a)$ hence $\{\zeta _1,\dots ,\zeta _k\}\subseteq w_1\cap w_2$, so that we have verified property iii).

Now take $e\in E_j$ and consider the associated set of vertices w. The conditions

$$\begin{aligned} \mu +2x\cdot q = |x|^2 +\rho _j^{\varphi (x)}\geqq |x|^2 \qquad \text {for all }x\in w \end{aligned}$$

and

$$\begin{aligned} \mu +2x\cdot q\leqq |x|^2 +\rho _j^{\varphi (x)}\leqq |x|^2+\rho _j^2\qquad \text {for all } x\in V \end{aligned}$$

lead to

$$\begin{aligned} |x-q|^2\leqq \mu +|q|^2\qquad \text {for all }x\in w \end{aligned}$$

and

$$\begin{aligned} \rho _j^2+|x-q|^2\geqq \mu +|q|^2\qquad \text {for all } x\in V. \end{aligned}$$

Therefore $w\subseteq \overline{B}_r(q)$, and either $r\leqq \rho _j$ or $V\cap B_{r-\rho _j}(q)=\emptyset $, where $r:=\sqrt{\mu +|q|^2}$. By uniformity of the grid, necessarily $r-\rho _j< \bar{c} \varepsilon $, which gives $\textrm{diam}(e)\leqq 2r< 2\bar{c}\varepsilon +2\rho _j\leqq 2(\bar{c}+1)\varepsilon $.

For any $x\in V$, the possible choices of e with $x\in v_e$ are restricted by $\textrm{diam}(e)< 2(\bar{c}+1)\varepsilon $, which implies $v_e\subseteq V\cap B_{2(\bar{c}+1)\varepsilon }(x)$. As the grid is uniform, the latter set is finite, with a bound depending only on $\bar{c}$. Therefore for any $x\in V$ we can choose a subsequence of $\rho _j$ such that the set

$$\begin{aligned} \{e\in E_j: x\in v_e\} \end{aligned}$$

is, after finitely many steps, constant. As there are countably many $x\in V$, we can choose a common diagonal subsequence. Along this sequence, for any bounded set K the set $\{e\in E_j: e\subseteq K\}$ is, after finitely many steps, constant. Property iii) holds for $E_j$, and therefore for those sets. Therefore we obtain a common set E with all desired properties. We remark that indeed the Delaunay property follows from the construction of E and the discussion of the first part of the proof: indeed, if $e\in E$, it is easy to see that there exists an affine function coinciding with g on e. $\quad \square $

We next present the result on the regularity of convex envelopes used above.

Lemma 2.10

Let $V\subseteq {\mathbb {R}}^d$ be a uniform set of vertices, in the sense of item (c) of Definition 2.7. Let $f:V\rightarrow [0,\infty )$ be superlinear, in the sense that

$$\begin{aligned} \lim _{x\in V,\ |x|\rightarrow \infty } \frac{f(x)}{|x|}=\infty . \end{aligned}$$

(2.6)

Let $g:{\mathbb {R}}^d\rightarrow [0,\infty )$ be the convex envelope of f (f is extended by $\infty $ to ${\mathbb {R}}^d\setminus V$). Then g is $\mathrm {CPWL}$. Moreover,

$$\begin{aligned} \{(x,g(x)):x\in {\mathbb {R}}^d\}\subseteq \textrm{conv}\,(\{(x,f(x)):x\in V\}) \end{aligned}$$

(2.7)

(notice that we are not taking the closure of the convex hull at the right hand side).
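A one-dimensional toy version may help to visualise the statement. In the sketch below (an illustration on a finite set of integers, not the setting of the lemma itself) the convex envelope of a function defined on finitely many points is the piecewise linear interpolation of the lower convex hull of its graph, computed with a monotone chain scan. The function names and sample data are assumptions.

```python
def lower_convex_hull(points):
    """Vertices of the lower convex hull of planar points (monotone chain scan)."""
    hull = []
    for p in sorted(points):
        # Pop the last vertex while it lies on or above the segment hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def envelope(points, x):
    """Value at x of the convex envelope, by linear interpolation on the lower hull."""
    hull = lower_convex_hull(points)
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("x lies outside the convex hull of the sample points")

V = range(-4, 5)
quadratic = [(n, n * n) for n in V]                        # strictly convex data
raised = [(n, n * n) if n != 1 else (1, 5) for n in V]     # one value pushed up

print(lower_convex_hull(quadratic))
print(envelope(quadratic, 0.5), envelope(raised, 1))
```

For strictly convex data every sample point is a vertex of the hull, so the envelope agrees with the data there; when one value is raised, the envelope passes strictly below it and that point is not a vertex.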

Remark 2.11

It is easy to verify the following:

(i)
That V is uniform implies that g is real-valued.
(ii)
That the assumption of superlinearity is necessary. Indeed, consider $d=2$, $V={\mathbb {Z}}^2$, $f(x)=|x|$. Obviously $g(x)\geqq |x|$. For any $x\in {\mathbb {Q}}^2$ there is $n\in {\mathbb {N}}\setminus \{0\}$ such that $xn\in {\mathbb {Z}}^2$, which implies $g(x)\leqq (1-\frac{1}{n}) f(0)+\frac{1}{n} f(xn)=|x|$, so that $g(x)=|x|$ on ${\mathbb {Q}}^2$. As g is a real-valued convex function, it is continuous. We conclude $g(x)=|x|$ on ${\mathbb {R}}^2$, which is not $\mathrm {CPWL}$.

Proof of Lemma 2.10

For $r\in (0,\infty )$, we write

$$\begin{aligned} f_r(x):={\left\{ \begin{array}{ll} f(x) &{} \text { if } x\in V\cap B_r,\\ \infty &{} \text { otherwise,} \end{array}\right. } \end{aligned}$$

and let $g_r\geqq g$ be the convex envelope of $f_r$. Since V is uniform, any set $V\cap B_r$ is finite, and therefore $g_r$ is $\mathrm {CPWL}$ on $\textrm{conv}\,(V\cap B_r)$, and infinity outside. If $r\geqq \bar{c} \varepsilon $, with $\bar{c},\,\varepsilon >0$ the constants from item (c) of Definition 2.7, the set $V\cap B_r$ is nonempty.

We shall show below that for any $r>0$ there is $R>0$ such that $g=g_R$ on $B_{r/4}$. This implies that g is CPWL on $B_{r/4}$ for any r, and therefore the assertion. The choice of R (which depends on f and r) is done in (2.9) below.

For $r\geqq \bar{c}\varepsilon $ we define $\alpha _r:=\max f(V\cap [-r,r]^d)$. We first prove that if $R/\sqrt{d}> r\geqq 4\bar{c} \varepsilon $ then

$$\begin{aligned} g_R(x)\leqq \alpha _{r} \text { for all } x\in B_{r/2}. \end{aligned}$$

(2.8)

To see this, let $q_1,\ldots , q_{2^d}$ denote the vertices of the cube $[-1,1]^d$. By uniformity of V, for each i we can pick $p_i\in V\cap B_{\bar{c} \varepsilon }((r-\bar{c} \varepsilon )q_i)$. One checks that $B_{r/2}\subseteq (r-2\bar{c} \varepsilon )[-1,1]^d\subseteq \textrm{conv}\,(\{p_1,\dots , p_{2^d}\})$. As $p_i\in V\cap [-r,r]^d\subseteq V\cap B_R$, we have $g_R(p_i)\leqq f(p_i)\leqq \alpha _{r}$ for all i, and therefore $g_R\leqq \alpha _{r}$ on $B_{r/2}$, which proves (2.8).

We next show that, if R is chosen sufficiently large, then $g_R=g$ on $B_{r/4}$. By convexity, (2.8), and $g_R\geqq 0$ we obtain $\textrm{Lip}(g_R;B_{r/4})\leqq 4\alpha _r/r$. As $g_R$ is CPWL in $B_{r/4}$, for any $y\in B_{r/4}$ there is an affine function $a:{\mathbb {R}}^d\rightarrow {\mathbb {R}}$ such that $y\in T_a:=\{g_R=a\}\cap B_{r/4}$ and $T_a$ has nonempty interior. The Lipschitz bound on $g_R$ then carries over to a, and we obtain $|\nabla a|\leqq 4\alpha _{r}/r$. By convexity of $g_R$, we have $a\leqq g_R$, so that $a\leqq f$ on $V\cap B_R$. In order to obtain the same inequality outside $B_R$, we consider any x with $|x|\geqq R\geqq r$. Then, recalling $y\in T_a\subseteq B_{r/4}$,

$$\begin{aligned} a(x)\leqq a(y)+|\nabla a|\,|x-y| \leqq \alpha _r + \frac{4\alpha _{r} }{r}\Big (|x|+\frac{r}{4}\Big ) \leqq \frac{6\alpha _{r}}{r} |x|. \end{aligned}$$

Finally, by (2.6) we can choose $R> \sqrt{d} r$ such that

$$\begin{aligned} f(x)\geqq \frac{6\alpha _r}{r} |x| \qquad \text { for all } x\in V\setminus B_R. \end{aligned}$$

(2.9)

Therefore $a\leqq f$ everywhere, which implies $a\leqq g \leqq g_R$, and in turn $g=g_R$ on $T_a$ and therefore on $B_{r/4}$.

We prove now (2.7). Take $x\in {\mathbb {R}}^d$, so that, by what was proved above, $g(x)=g_R(x)$ for some $R>0$. Now notice that the epigraph of $g_R$ coincides with the convex hull of the epigraph of $f_R$ (here we are using that the convex hull of the epigraph of $f_R$ is closed), so that the conclusion is easily achieved. $\square $

We next investigate in more detail Delaunay triangulations such that V locally coincides with ${\mathbb {Z}}^d$ (possibly up to translations and rotations). We show in Lemma 2.13 below that the elements necessarily are the “natural” ones. Before that, we recall some basic properties of ${\mathbb {Z}}^d$, where, as usual, for $F\in {\mathbb {R}}^{d\times d}$, $A\subseteq {\mathbb {R}}^d$, $p\in {\mathbb {R}}^d$, we set $p+FA:=\{p+Fa: a\in A\}$.

Remark 2.12

The following hold.

i)
Let $R\in SO({\mathbb {R}}^d)$ and let $\varepsilon \in (0,\infty )$. Then $\textrm{dist}(x,\varepsilon R {\mathbb {Z}}^d)\leqq \varepsilon \sqrt{d}/2$ for any $x\in {\mathbb {R}}^d$.
ii)
If $v\subseteq {\mathbb {Z}}^d$, $\#v=d+1$, then either v is contained in a $(d-1)$-dimensional affine subspace, or
$$\begin{aligned} \mathscr {L}^d(\textrm{conv}\,v)\geqq \frac{1}{d!}. \end{aligned}$$
iii)
If $w\subseteq {\mathbb {Z}}^d$, $\#w=d$, then either w is contained in a $(d-2)$-dimensional affine subspace, or
$$\begin{aligned} {\mathcal {H}}^{d-1}(\textrm{conv}\,w)\geqq \frac{1}{(d-1)!}. \end{aligned}$$
(2.10)

Proof

To prove the first item, we can change coordinates to assume that $R=\mathrm Id$, and then, by scaling, we see that we can assume $\varepsilon =1$. For each $i=1,\dots ,d$ we select $z_i\in {\mathbb {Z}}$ with $|x_i-z_i|\leqq \frac{1}{2}$, so that $z\in {\mathbb {Z}}^d$ and

$$\begin{aligned} |x-z|=\Big ({\sum \nolimits _{i=1}^d (x_i-z_i)^2}\Big )^{1/2}\leqq \sqrt{d}/2. \end{aligned}$$

For the second one, by translation we can assume $0\in v$. The volume of the simplex $\textrm{conv}\,v$ is given by 1/d! times the absolute value of the determinant of the matrix whose columns are the vectors of $v\setminus \{0\}$. As each component of each vector is integer, the determinant is an integer. Hence it is either 0, or at least 1.

The proof of the third item is similar. Again, assume $0\in w$. At least one $e_i$ is not contained in the linear space generated by w. We apply the second item to $v:=w\cup \{e_i\}$, and obtain that the volume of $T:=\textrm{conv}\,v$ is either zero or at least 1/d!. Since the volume of T is also given by 1/d times the area of $\textrm{conv}\,w$ times the distance of $e_i$ to the space generated by w, which is at most 1 since $0\in w$, we obtain (2.10). $\quad \square $
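Items i) and ii) of Remark 2.12 can be tested numerically. The following is a minimal Python sketch, assuming $R=\textrm{Id}$ and $\varepsilon =1$ as in the proof; the helper names `dist_to_lattice`, `det` and `simplex_volume` are ours and do not appear in the paper. The determinant is computed exactly over the rationals, so the dichotomy "zero or at least 1/d!" can be checked without rounding errors.

```python
import math
import random
from fractions import Fraction

def dist_to_lattice(x):
    """Distance from x to Z^d, rounding each coordinate to a nearest integer."""
    return math.sqrt(sum((xi - round(xi)) ** 2 for xi in x))

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(a) for a in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return sign * result

def simplex_volume(vertices):
    """Volume of conv(vertices) for d+1 points in R^d: |det| / d!."""
    base = vertices[0]
    edges = [[p[i] - base[i] for i in range(len(base))] for p in vertices[1:]]
    return abs(det(edges)) / math.factorial(len(base))
```

Sampling random points and random integer simplices with these helpers is only a sanity check of the statement, not a substitute for the proof.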

Lemma 2.13

Let (V, E) be a triangulation of ${\mathbb {R}}^d$ with the Delaunay property and let $B_r(q)$ be a ball such that $ V\cap B_r(q)= \varepsilon R {\mathbb {Z}}^d\cap B_r(q)$, for some $\varepsilon >0$ and $R\in {SO}({\mathbb {R}}^d)$. If $e\in E$ is such that $e\cap B_{r-\sqrt{d} \varepsilon }(q)\ne \emptyset $, then there is a unique $y\in \varepsilon R ({\mathbb {Z}}+\frac{1}{2})^d$ such that $v_e\subseteq y+\varepsilon R \{-\frac{1}{2},\frac{1}{2}\}^d$, characterized by $v_e\subseteq \partial B_{\varepsilon \sqrt{d}/2}(y)$.

We remark that the assumption $e\cap B_{r-\sqrt{d} \varepsilon }(q)\ne \emptyset $ implies $r>\sqrt{d} \varepsilon $.

Proof

By scaling and a change of coordinates it suffices to consider the case $\varepsilon =1$, $R=\textrm{Id}$. Let e be as in the statement, and let $B_\rho (y)$ be such that $v_e\subseteq \partial B_\rho (y)$. By the Delaunay property, using also the assumption in force here,

$$\begin{aligned} B_\rho (y)\cap {\mathbb {Z}}^d\cap B_r(q)\subseteq B_\rho (y)\cap V=\emptyset ; \end{aligned}$$

(2.11)

by $e\cap B_{r-\sqrt{d}}(q)\ne \emptyset $ and $e\subseteq \overline{B}_\rho (y)$ we have

$$\begin{aligned} |q-y|<r-\sqrt{d}+\rho \qquad \text {(and }r>\sqrt{d}). \end{aligned}$$

(2.12)

We want to show now that $\rho =\sqrt{d}/2$.

First, we assume (by contradiction) that $\rho >\sqrt{d}/2$. We show that this possibility cannot occur. We define $\rho ':=\min \{\rho ,r, (r+\rho -|q-y|)/2\}$. Condition (2.12) implies $\rho '>\sqrt{d}/2$ and the definition of $\rho '$ gives

$$\begin{aligned} |q-y|\leqq - 2\rho '+r+\rho =( r-\rho ')+(\rho -\rho '), \end{aligned}$$

so that there exists $y'\in \overline{B}_{r-\rho '}(q)\cap \overline{B}_{\rho -\rho '}(y)$ (we adopt the convention that $\overline{B}_0(x)=\{x\}$). The point $y'$ obeys then $B_{\rho '}(y')\subseteq B_r(q)\cap B_\rho (y)$ and therefore, recalling (2.11), $B_{\rho '}(y')\cap {\mathbb {Z}}^d=\emptyset $, which contradicts $\rho '>\sqrt{d}/2$ (Remark 2.12(i)).

Hence $\rho \leqq \sqrt{d}/2$, so that, using also (2.12), $\overline{B}_\rho (y)\subseteq B_r(q)$, and therefore, recalling (2.11), $B_\rho (y)\cap {\mathbb {Z}}^d=\emptyset $ and $v_e\subseteq {\mathbb {Z}}^d$. We define $z\in {\mathbb {Z}}^d$ by choosing for each i a component $z_i\in {\mathbb {Z}}$ which minimizes $|z_i-y_i|$; notice that $|z_i-y_i|\leqq 1/2$. As $B_\rho (y)\cap {\mathbb {Z}}^d=\emptyset $, we have $|z-y|\geqq \rho $. By minimality of $z_i$, for any $x\in v_e\subseteq {\mathbb {Z}}^d$ and any i we have $|x_i-y_i|\geqq |z_i-y_i|$, which by $x\in \partial B_\rho (y)$ implies $\rho =|x-y|\geqq |z-y|\geqq \rho $. Therefore, equality holds throughout and

$$\begin{aligned} \rho =|x-y|=|z-y| \hbox { and } |x_i-y_i|=|z_i-y_i| \qquad \text {for every } i\in \{1,\ldots , d\} \hbox { and } x\in v_e. \end{aligned}$$

Assume that there exists i with $|z_i-y_i|<\frac{1}{2}$, so that $|z_i-x_i|<1$ for all $x\in v_e$. As $x_i, z_i\in {\mathbb {Z}}$, this implies $x_i=z_i$ for all $x\in v_e$, hence $v_e$ is contained in a $(d-1)$-dimensional subspace of ${\mathbb {R}}^d$. As e is non degenerate (i.e. has non empty interior), this is impossible, hence $|z_i-y_i|=\frac{1}{2}$ for all i. We conclude that $\rho =\sqrt{d}/2$ and then $v_e\subseteq y+\{-\frac{1}{2},\frac{1}{2}\}^d$, which also implies the membership of y to $(\mathbb {Z}+1/2)^d$ by $v_e\subseteq {\mathbb {Z}}^d$.$\quad \square $
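The conclusion of Lemma 2.13 can be illustrated in the model case $V={\mathbb {Z}}^d$, $\varepsilon =1$, $R=\textrm{Id}$. The sketch below is a hedged illustration only: the function name `classify` is hypothetical, and the code does not reproduce the Delaunay argument. For a centre $y\in ({\mathbb {Z}}+\frac{1}{2})^d$ it lists the lattice points inside the open ball of radius $\sqrt{d}/2$ and those on its boundary, using exact rational arithmetic on squared distances.

```python
import itertools
import math
from fractions import Fraction

def classify(y):
    """Split nearby integer points into those inside the open ball
    B_{sqrt(d)/2}(y) and those on its boundary sphere."""
    d = len(y)
    rho2 = Fraction(d, 4)                      # squared radius d/4
    reach = math.isqrt(d) // 2 + 2             # generous search window
    ranges = [range(math.floor(c) - reach, math.floor(c) + reach + 1) for c in y]
    inside, on_sphere = [], []
    for z in itertools.product(*ranges):
        dist2 = sum((Fraction(zi) - yi) ** 2 for zi, yi in zip(z, y))
        if dist2 < rho2:
            inside.append(z)
        elif dist2 == rho2:
            on_sphere.append(z)
    return inside, on_sphere
```

For a half-integer centre one expects the open ball to be free of lattice points and the sphere to carry exactly the $2^d$ corners of the unit cube centred at y, which is the set $y+\{-\frac{1}{2},\frac{1}{2}\}^d$ of the lemma.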

3.2 Construction of the triangulation

We write $Q_\ell (x):=x+(-\ell /2,\ell /2)^d$ and $Q_\ell :=Q_\ell (0)$. Notice the factor 1/2, i.e. $\ell $ is the length of the edge of the open cube $Q_\ell (x)$.

The aim of this section is to prove the following (see Figs. 2 and 3 for illustrations):

Theorem 2.14

For any $d\geqq 2$ there is $C_\mathrm G=C_\mathrm G(d)$ with the following property.

Let $0<\varepsilon <\delta $ with $\delta \geqq C_\mathrm G\varepsilon $, and let $R:\delta {\mathbb {Z}}^d\rightarrow {SO}({\mathbb {R}}^d)$. Then there is a triangulation (V, E) of ${\mathbb {R}}^d$, in the sense of Definition 2.7, with the following properties:

i)
Regularity: The triangulation has the Delaunay property (property (a)), is $C_\mathrm G$-non degenerate (property (b)), and is $(C_\mathrm G,\varepsilon )$-uniform (property (c)).
ii)
Orientation: for each $z\in \delta {\mathbb {Z}}^d$ one has $V\cap Q_{\delta -C_\mathrm G\varepsilon }(z)= \varepsilon R(z){\mathbb {Z}}^d\cap Q_{\delta -C_\mathrm G\varepsilon }(z)$.

We start by proving that in a single cube we can construct a set of vertices V which coincides with $\varepsilon {\mathbb {Z}}^d$ on the boundary, with a rotation of the same lattice inside, and which is uniform and non-degenerate, in a sense made precise in the statement below. This will then be used to prove Theorem 2.14.

Lemma 2.15

Let $z\in {\mathbb {R}}^d$, $\varepsilon >0$, $R\in {SO}({\mathbb {R}}^d)$, $M\in {\mathbb {N}}$ with $M\geqq 6+2 d$. Then there is $V\subseteq {\mathbb {R}}^d$ with the following properties:

i)
Orientation: $V{\setminus } Q_{M\varepsilon }(z)=\varepsilon {\mathbb {Z}}^d{\setminus } Q_{M\varepsilon }(z)$ and $V\cap Q_{(M-2)\varepsilon }(z)=R\varepsilon {\mathbb {Z}}^d\cap Q_{(M-2)\varepsilon }(z)$;
ii)
$(2d,\varepsilon )$-uniformity: for any $q\in {\mathbb {R}}^d$ we have $B_{2d\varepsilon }(q)\cap V\ne \emptyset $; for any $x\ne y\in V$ we have $|x-y|\geqq \varepsilon /(2d)$;
iii)
Non-degeneracy: There is $C'=C'(d)$ such that if $v\subseteq V$, $\#v=d+1$, v is not contained in a $(d-1)$-dimensional affine subspace, and there is a ball $B_r(y)$ with $v\subseteq \partial B_{r}(y)$, $B_r(y)\cap V=\emptyset $, then $\mathscr {L}^d(\textrm{conv}\,v)\geqq \varepsilon ^d/C'$.

Proof

We divide the proof into several steps.

Step 1: general setting. To simplify notation we denote by $Q_\textrm{out}:=Q_{M\varepsilon }(z)$ the outer cube, by $Q_\textrm{in}:=Q_{(M-2)\varepsilon }(z)$ the inner cube, and by $Q_\textrm{mid}:=Q_{(M-1)\varepsilon }(z)$ the intermediate one (see Fig. 4). We set $V_\textrm{out}:= \varepsilon {\mathbb {Z}}^d\setminus Q_\textrm{out}$; $V_\textrm{in}:= R\varepsilon {\mathbb {Z}}^d\cap \overline{Q_\textrm{in}}$, and shall construct below a finite set $V_\textrm{mid}\subseteq Q_{(M-\frac{1}{2})\varepsilon }(z){\setminus } Q_{(M-\frac{3}{2})\varepsilon }(z)$ such that

$$\begin{aligned} V:=V_\textrm{in}\cup V_\textrm{out}\cup V_\textrm{mid}\end{aligned}$$

has the desired properties. Property i) is true for any choice of $V_\textrm{mid}$. Next we deal with ii), and leave the more delicate treatment of iii) to the end.

We show that for any $q\in {\mathbb {R}}^d$ one has $B_{2d\varepsilon }(q)\cap (V_\textrm{in}\cup V_\textrm{out})\ne \emptyset $. Consider first the case $q\in Q_\textrm{mid}$. Let $q'$ be the point of $\overline{Q}_{(M-2-\sqrt{d})\varepsilon }(z)$ closest to q. This implies

$$\begin{aligned} |q-q'| \leqq \frac{1}{2}\sqrt{d}(1+\sqrt{d})\varepsilon \end{aligned}$$

(2.13)

and $B_{\sqrt{d}\varepsilon /2}(q')\subseteq Q_\textrm{in}$. By Remark 2.12, we can take $p\in R\varepsilon {\mathbb {Z}}^d\cap \overline{B}_{\sqrt{d}\varepsilon /2}(q')\subseteq V_\textrm{in}$. Since by (2.13)

$$\begin{aligned} 2 d\varepsilon > |q-q'|+\sqrt{d}\varepsilon /2 \end{aligned}$$

we have $p\in \overline{B}_{\sqrt{d}\varepsilon /2}(q')\subseteq B_{2d\varepsilon }(q)$, and the first assertion in ii) is proved in this case. In the case $q\not \in Q_\textrm{mid}$ we argue similarly, projecting onto ${\mathbb {R}}^d{\setminus } Q_{(M+\sqrt{d})\varepsilon }(z)$, with ${\mathbb {R}}^d{\setminus } Q_\textrm{out}$ instead of $\overline{Q}_\textrm{in}$. Therefore the first assertion in ii) is true for any choice of $V_\textrm{mid}$.

It remains to choose $V_\textrm{mid}$ so that the property $|x-y|\geqq \varepsilon /(2d)$ for all $x\ne y\in V$ (i.e. the second assertion in ii)) is preserved, and iii) holds. In order to understand the strategy (cf. iii)), consider a set v and a ball $B_r(y)$ such that

$$\begin{aligned} v\subseteq V \text { with }\#v=d+1,\quad v\subseteq \partial B_r(y),\quad V\cap B_r(y)=\emptyset . \end{aligned}$$

(2.14)

The construction strategy of $V_\textrm{mid}$ then will ensure that:

(a)
sets v as in (2.14) cannot contain elements of both $V_\textrm{in}$ and $V_\textrm{out}$;
(b)
for any choice of v as in (2.14), with additionally $v\subseteq V_\textrm{in}\cup V_\textrm{mid}$ or $v\subseteq V_\textrm{out}\cup V_\textrm{mid}$, it holds that v is either contained in a $(d-1)$-dimensional affine subspace or obeys $\mathscr {L}^d(\textrm{conv}\,v)\geqq \varepsilon ^d/C'$.

Step 2: construction of $U_\varepsilon $. We show here that there is a finite set $U_\varepsilon \subseteq \partial Q_\textrm{mid}$ such that if the set $V_\textrm{mid}$ is constructed picking exactly one point of each $B_{\varepsilon /(4d)}(u)$, for $u\in U_\varepsilon $, then (a) and the second assertion in ii) hold. The specific choice of the points of $B_{\varepsilon /(4d)}(u)$, for $u\in U_\varepsilon $, will be made in Step 3 to ensure (b) (and hence iii), by (a)).

We let $U_\varepsilon :=\partial Q_\textrm{mid}\cap (\frac{1}{d}\varepsilon {\mathbb {Z}}^d+p)$, where $p:=z-\frac{M-1}{2}\varepsilon \sum _i e_i$ is a vertex of $Q_\textrm{mid}$. The shift p is chosen so that the set is nonempty; we recall that $Q_\textrm{mid}$ is a cube of side length $(M-1)\varepsilon \in \varepsilon {\mathbb {Z}}$, but the centre z is a generic point in ${\mathbb {R}}^d$.

Assume now that $V_\textrm{mid}$ is chosen so that it contains exactly one point of each $ B_{\varepsilon /(4d)}(u)$, for $u\in U_\varepsilon $. We claim that then V satisfies also the second assertion in ii). Let indeed $x,y\in V$, $x\ne y$. If both are in $V_\textrm{in}$, or both in $V_\textrm{out}$, then $|x-y|\geqq \varepsilon $. If both are in $V_\textrm{mid}$, then there are $u_x\ne u_y\in U_\varepsilon $ with $|u_x-x|+|u_y-y|\leqq \varepsilon /(2d)$. As $u_x-u_y\in \frac{1}{d}\varepsilon {\mathbb {Z}}^d{\setminus }\{0\}$, we obtain

$$\begin{aligned} |x-y|\geqq |u_x-u_y|-|u_x-x|-|u_y-y|\geqq \varepsilon /(2d). \end{aligned}$$

In the other cases, we use

$$\begin{aligned} \textrm{dist}(V_\textrm{out},V_\textrm{mid})\geqq \textrm{dist}(\partial Q_\textrm{out},\partial Q_\textrm{mid})-\varepsilon /(4d)=\varepsilon /2-\varepsilon /(4 d)\geqq \varepsilon /4 \end{aligned}$$

and similarly $\textrm{dist}(V_\textrm{in},V_\textrm{mid})\geqq \varepsilon /4$ to conclude. This proves the second assertion in ii).

We finally check that (a) holds. Let $v\subseteq V$ be as in (2.14). Assume by contradiction that v contains elements of both $V_\textrm{in}$ and $V_\textrm{out}$. Then the sphere $\partial B_r(y)$ intersects both $\partial Q_\textrm{out}$ and $\partial Q_\textrm{in}$. We show that there exists $x'\in \partial Q_\textrm{mid}$ such that $B_{\varepsilon /2}(x')\subseteq B_r(y)$. Assume first $y\in Q_\textrm{mid}$. Let $y'\in \partial B_r(y)\cap \partial Q_\textrm{out}$, and choose $x'\in [y,y']\cap \partial Q_\textrm{mid}$. Then $|x'-y'|\geqq \varepsilon /2$, so that

$$\begin{aligned} |x'-y|=|y-y'|-|x'-y'|\leqq r-\varepsilon /2 \end{aligned}$$

and $B_{\varepsilon /2}(x')\subseteq B_r(y)$. If instead $y\not \in Q_\textrm{mid}$, we select $y'\in \partial B_r(y)\cap \partial Q_\textrm{in}$, and proceed analogously. Let x be the point in $U_\varepsilon $ closest to $x'$. As every component $x_i$ is the element of $\frac{1}{d}\varepsilon {\mathbb {Z}}+p_i$ closest to $x_i'$, we have $|x-x'|\leqq \sqrt{d}\varepsilon /(2d)=\varepsilon /(2\sqrt{d})$. As $\frac{1}{2}> \frac{1}{4d}+\frac{1}{2\sqrt{d}}$, we obtain $B_{\varepsilon /(4d)}(x)\subseteq B_{\varepsilon /2}(x')\subseteq B_r(y)$. As $x\in U_\varepsilon $, there is a point of $V_\textrm{mid}$ in $B_{\varepsilon /(4d)}(x)$, which contradicts the condition $V\cap B_r(y)=\emptyset $ stated in (2.14). Therefore this cannot happen, and hence (a) holds.
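Two elementary numerical inequalities have been used so far in this proof: $2d\varepsilon >|q-q'|+\sqrt{d}\varepsilon /2$ (with the bound (2.13) on $|q-q'|$) and $\frac{1}{2}>\frac{1}{4d}+\frac{1}{2\sqrt{d}}$. A short Python sketch, with function names of our own choosing and all lengths measured in units of $\varepsilon $, makes it easy to check them for a range of dimensions.

```python
import math

def covering_margin(d):
    """2d minus (bound (2.13) on |q - q'| plus sqrt(d)/2), in units of eps.
    A positive value means the ball of radius sqrt(d)*eps/2 around q'
    is contained in the ball of radius 2d*eps around q."""
    return 2 * d - (0.5 * math.sqrt(d) * (1 + math.sqrt(d)) + math.sqrt(d) / 2)

def ball_margin(d):
    """1/2 minus (1/(4d) + 1/(2 sqrt(d))), in units of eps.
    A positive value means B_{eps/(4d)}(x) fits inside B_{eps/2}(x')."""
    return 0.5 - (1 / (4 * d) + 1 / (2 * math.sqrt(d)))
```

Both margins are positive for every $d\geqq 2$ that one tries, while `ball_margin(1)` is negative, which is consistent with the standing assumption $d\geqq 2$.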

Step 3: choice of the elements of $V_\textrm{mid}$. We write $\{u_1,\ldots , u_J\}:=U_\varepsilon $ and iteratively for every j pick a point $z_j\in B_{\varepsilon /(4d)}(u_j)$ which ensures (b). We collect in $V_\textrm{mid}^j:=\{z_1, \ldots , z_j\}$ the points chosen in the first j steps, and at the end we will use $V_\textrm{mid}:=V_\textrm{mid}^J$. Fix

$$\begin{aligned} \ell := 1+2d, \end{aligned}$$

(2.15)

the reason for this specific choice will be clear later.

An admissible set of vertices at stage j is a set v with $\#v=d+1$ such that there is $q\in \partial Q_\textrm{mid}$ with $v\subseteq B_{\ell \varepsilon }(q)$, $\mathscr {L}^d(\textrm{conv}\,v)>0$, and either $v\subseteq V_\textrm{mid}^j\cup V_\textrm{in}$ or $v\subseteq V_\textrm{mid}^j\cup V_\textrm{out}$.

An admissible face at stage j is a set w with $\#w=d$ such that there is $q\in \partial Q_\textrm{mid}$ with $w\subseteq B_{\ell \varepsilon }(q)$, ${\mathcal {H}}^{d-1}(\textrm{conv}\,w)>0$, and either $w\subseteq V_\textrm{mid}^j\cup V_\textrm{in}$ or $w\subseteq V_\textrm{mid}^j\cup V_\textrm{out}$. We denote by $N_w:=\#(w\cap V_\textrm{mid}^j)$ the number of items of w in $V_\textrm{mid}^j$, clearly $N_w\leqq d$.

We intend to show that there are $\alpha ,\,\beta ,\,\gamma ,\,C_\mathrm F>0$ (depending only on d) such that we can choose $z_j\in B_{\varepsilon /(4d)}(u_j)$ iteratively with the following two properties:

(i)
If v is an admissible set of vertices at stage j, then
$$\begin{aligned} \mathscr {L}^d(\textrm{conv}\,v) \geqq \beta \varepsilon ^{d}. \end{aligned}$$
(2.16)
(ii)
If w is an admissible face at stage j, then
$$\begin{aligned} {\mathcal {H}}^{d-1}(\textrm{conv}\,w) \geqq \frac{\alpha ^{N_w}}{C_\mathrm F}\varepsilon ^{d-1}. \end{aligned}$$
(2.17)

The key to the choice of $z_j$, which eventually leads to (2.16) at stage j building upon (2.17) at stage $j-1$, is the following geometric observation. If v is an admissible set of vertices at stage j, and it contains the point $z_j$, then $w:=v{\setminus }\{z_j\}$ is an admissible face at stage $j-1$ and for any $q\in w$ we have

$$\begin{aligned} \mathscr {L}^d(\textrm{conv}\,v ) = \frac{1}{d}|(z_j-q)\cdot \nu _w| {\mathcal {H}}^{d-1}(\textrm{conv}\,w) \end{aligned}$$

(2.18)

where $\nu _w$ is a unit normal to the affine space generated by w. The factor ${\mathcal {H}}^{d-1}(\textrm{conv}\,w)$ will be estimated via (2.17) at stage $j-1$, the choice of $z_j$ needs to ensure that the first factor is not too small, for any possible choice of w.
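Identity (2.18) is the familiar "one over d times height times base" formula for the volume of a simplex. As a sanity check, the following Python sketch compares both sides for $d=3$; the helper names are ours, and the restriction to three dimensions (where the normal can be computed with a cross product) is an assumption made only to keep the example short.

```python
import math

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def tetra_volume(z, w):
    """Left-hand side of (2.18): volume of conv(w + [z]) via the triple product / 3!."""
    q, a, b = w
    return abs(dot(sub(z, q), cross(sub(a, q), sub(b, q)))) / 6

def height_times_area(z, w):
    """Right-hand side of (2.18) for d = 3: (1/3) |(z - q).nu_w| H^2(conv w)."""
    q, a, b = w
    n = cross(sub(a, q), sub(b, q))
    norm = math.sqrt(dot(n, n))
    area = norm / 2                      # area of the triangle conv w
    nu = [c / norm for c in n]           # unit normal to the plane through w
    return abs(dot(sub(z, q), nu)) * area / 3
```

The two functions agree up to rounding for any non-degenerate face w, whichever vertex of w is used as the base point q.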

Now we start choosing $z_1,\ldots ,z_J$. As stated before, we proceed by iteration. Assume that we have already chosen $z_1,\ldots ,z_{j-1}$, we want to choose $z_j$ (if $j=1$ we use $V_\textrm{mid}^0=\emptyset $). Let w be an admissible face at stage $j-1$ such that $w\subseteq B_{(2\ell +1/(4d))\varepsilon }(u_j)$. If no such face exists, choose $z_j:=u_j$. Since no two points in V are at distance smaller than $\varepsilon /(2d)$ (by ii)), the number of possible choices of w is bounded by a number K which depends only on d. Let $w_1, \ldots , w_K$ be these possible choices. We choose $z_j$ such that

$$\begin{aligned} |(z_j-p_k)\cdot \nu _{w_k}|\geqq \gamma \varepsilon \end{aligned}$$

(2.19)

for all $k=1,\ldots , K$ and an arbitrary choice of $p_k\in w_k$ (the condition does not depend on the choice of $p_k$, as $\nu _{w_k}$ is orthogonal to $p_k-p_k'$ for any $p_k$, $p_k'\in w_k$). We show now why we can choose such $z_j$. We observe that

$$\begin{aligned} \mathscr {L}^d\big (\{z\in B_{\varepsilon /(4d)}(u_j): |(z-p_k)\cdot \nu _{w_k}|< \gamma \varepsilon \}\big ) \leqq 2\gamma \varepsilon \left( \frac{\varepsilon }{2d}\right) ^{d-1} =\gamma 2^{2-d}d^{1-d}\varepsilon ^d \end{aligned}$$

and thus the total volume of these sets is controlled by $ K \gamma 2^{2-d}d^{1-d}\varepsilon ^d$. Then we choose $\gamma $ such that this expression equals $\frac{1}{2}\mathscr {L}^d(B_{\varepsilon /(4d)}(u_j))$ and hence we have a suitable $z_j$. Continuing in this way, we have thus constructed $V_\textrm{mid}^J$.
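The volume argument above is not constructive, but it suggests a simple randomized way to find an admissible $z_j$. The sketch below is an illustration under our own assumptions, not the authors' procedure: the function names are hypothetical, each face is passed as a pair `(p, nu)` of a point and a unit normal, and points are drawn by rejection sampling. The value of `gamma_for` follows the choice of $\gamma $ made in the text, with $\varepsilon $ scaled to 1.

```python
import math
import random

def unit_ball_volume(d):
    """Lebesgue measure of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def gamma_for(d, K):
    """gamma solving K * gamma * 2^(2-d) * d^(1-d) = (1/2) |B_{1/(4d)}|."""
    half_ball = 0.5 * unit_ball_volume(d) * (1 / (4 * d)) ** d
    return half_ball / (K * 2 ** (2 - d) * d ** (1 - d))

def offset(z, p, nu):
    """|(z - p) . nu|, the distance of z from the hyperplane through p with normal nu."""
    return abs(sum((zi - pi) * ni for zi, pi, ni in zip(z, p, nu)))

def pick_point(u, eps, faces, rng, max_tries=10_000):
    """Sample z in B_{eps/(4d)}(u) with offset >= gamma * eps from every face."""
    if not faces:                        # no admissible face nearby: take z = u
        return list(u)
    d = len(u)
    gamma = gamma_for(d, len(faces))
    radius = eps / (4 * d)
    for _ in range(max_tries):
        z = [ui + rng.uniform(-radius, radius) for ui in u]
        if math.dist(z, u) >= radius:    # outside the open ball, reject
            continue
        if all(offset(z, p, nu) >= gamma * eps for p, nu in faces):
            return z
    raise RuntimeError("no admissible point found")
```

Since the excluded slabs cover at most half of the ball, each sample that lands in the ball is accepted with probability at least one half, so the loop typically ends after a few iterations.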

It remains to show by induction that the points we constructed have the properties (2.16) and (2.17). Assume first $j=0$, and recall $V_\textrm{mid}^0=\emptyset $, so that $N_w=0$. By Remark 2.12, (2.16) and (2.17) hold provided $C_\mathrm F\geqq (d-1)!$ and $\beta \leqq 1/d!$. Assume now that (2.16) and (2.17) hold at stage $j-1$, we are going to prove that they hold also at stage j.

Let v be an admissible set of vertices at stage j. If $z_j\not \in v$, then v was already admissible at stage $j-1$, hence (2.16) holds. Then we assume that $z_j\in v$, so that $w:=v\setminus \{z_j\}$ is an admissible face at stage $j-1$ and $v\subseteq B_{\ell \varepsilon }(q)\subseteq B_{2\ell \varepsilon }(z_j)\subseteq B_{(2\ell +1/(4d))\varepsilon }(u_j)$, where $q\in \partial Q_\textrm{mid}$ is given by the admissibility of v. In particular, $w\subseteq B_{(2\ell +1/(4d))\varepsilon }(u_j)$, so that (2.19) holds for w in place of $w_k$. By (2.17) at stage $j-1$, (2.18), (2.19) and $N_w\leqq d$ we have, provided $\alpha \leqq 1$,

$$\begin{aligned} \mathscr {L}^d(\textrm{conv}\,v )= \frac{1}{d} |(z_j-p)\cdot \nu _w| {\mathcal {H}}^{d-1}(\textrm{conv}\,w) \geqq \frac{\gamma \alpha ^{d}}{C_\mathrm Fd} \varepsilon ^d \end{aligned}$$

for any $p\in w$, so that setting $\beta := \min \{\gamma \alpha ^{d} /(C_\mathrm Fd),1/d!\}$ we obtain (2.16).

Let w be an admissible face at stage j. As above, by the inductive assumption it suffices to consider the case $z_j\in w$. Assume $w\subseteq V_\textrm{mid}^j\cup V_\textrm{in}$, the other case is analogous and will not be treated. Being w admissible, $w\subseteq B_{\ell \varepsilon }(q)$, for some $q\in \partial Q_\textrm{mid}$. Let $q'$ be the point of $\partial Q_{(M-4-\sqrt{d})\varepsilon }(z)$ closest to q, so that $|q-q'|\leqq \sqrt{d} (3+\sqrt{d})\varepsilon /2$, and choose $p_*\in \varepsilon R {\mathbb {Z}}^d\cap \overline{B}_{\varepsilon \sqrt{d}/2}(q')\subseteq \overline{Q}_{(M-4)\varepsilon }(z) $ (Remark 2.12). By the choice of $\ell $ made in (2.15), we get

$$\begin{aligned} |p_*-q|\leqq |p_*-q'|+|q'-q|\leqq (\sqrt{d}+3\sqrt{d} + d )\varepsilon /2< (\ell -1)\varepsilon . \end{aligned}$$

Then the 2d points $p_*\pm \varepsilon Re_i$ are all in $B_{\ell \varepsilon }(q)\cap V_\textrm{in}$, and at least one of them is not in the affine space generated by $w\setminus \{z_j\}$. Denote it by p, and set

$$\begin{aligned} {\hat{w}}:= \bigl (w \setminus \{z_j\}\bigr )\cup \{p\}. \end{aligned}$$

Then ${\hat{w}}$ is an admissible face at stage $j-1$, with $N_{{\hat{w}}}=N_{w}-1$ and ${\mathcal {H}}^{d-1}(\textrm{conv}\,{\hat{w}})\ne 0$, so that (2.17) holds for ${\hat{w}}$. Further, ${\hat{w}}\subseteq B_{\ell \varepsilon }(q)\subseteq B_{2\ell \varepsilon }(z_j)\subseteq B_{(2\ell +1/(4d))\varepsilon }(u_j)$ implies that ${\hat{w}}$ is one of the faces $w_1,\ldots , w_K$ considered for (2.19), so that the choice of $z_j$ implies that (2.19) holds for ${\hat{w}}$.

We compute the volume of the simplex with vertices in ${\hat{w}}\cup \{z_j\}=w\cup \{p\}$ in two different ways:

$$\begin{aligned} |(z_j-p)\cdot \nu _{{\hat{w}}}| {\mathcal {H}}^{d-1}(\textrm{conv}\,{\hat{w}})= |(z_j-p)\cdot \nu _{w}| {\mathcal {H}}^{d-1}(\textrm{conv}\,w). \end{aligned}$$

By (2.19) and (2.17) for ${\hat{w}}$, recalling that $z_j,p\in B_{\ell \varepsilon }(q)$ implies $|z_j-p|\leqq 2\ell \varepsilon $, we obtain

$$\begin{aligned} {\mathcal {H}}^{d-1}(\textrm{conv}\,w) \geqq \frac{1}{2\ell \varepsilon }|(z_j-p)\cdot \nu _{{\hat{w}}}| {\mathcal {H}}^{d-1}(\textrm{conv}\,{\hat{w}})\geqq \frac{\gamma }{2\ell }\alpha ^{N_{{\hat{w}}}}\varepsilon ^{d-1}/C_\mathrm F\end{aligned}$$

which concludes the proof of (2.17) with $\alpha :=\min \{1, \gamma /(2\ell )\}$. $\quad \square $

At this point we conclude the proof of Theorem 2.14.

Proof of Theorem 2.14

Set

$$\begin{aligned} \ell :=2 d\qquad \text {and}\qquad M:=\lfloor \delta /\varepsilon \rfloor -4\ell , \end{aligned}$$

so that $Q_{M\varepsilon }\subseteq Q_\delta $, with

$$\begin{aligned} \textrm{dist}(Q_{M\varepsilon },\partial Q_\delta )\geqq 2\ell \varepsilon . \end{aligned}$$

(2.20)

We first select a background lattice,

$$\begin{aligned} V^0:= \varepsilon {\mathbb {Z}}^d \setminus \bigcup _{z\in \delta {\mathbb {Z}}^d} Q_{M\varepsilon }(z). \end{aligned}$$

For each $z\in \delta {\mathbb {Z}}^d$, if $C_\mathrm G\geqq 7+2d+4\ell $ we can use (by $M\geqq C_\mathrm G-1-4\ell $) Lemma 2.15 to obtain a set $V_z$ such that $V_z\cap Q_{(M-2)\varepsilon }(z)=R(z)\varepsilon {\mathbb {Z}}^d\cap Q_{(M-2)\varepsilon }(z)$, and $V_z{\setminus } Q_{M\varepsilon }(z)= \varepsilon {\mathbb {Z}}^d{\setminus } Q_{M\varepsilon }(z)$. We then set

$$\begin{aligned} V:= V^0\cup \bigcup _{z\in \delta {\mathbb {Z}}^d} (V_z\cap Q_\delta (z)) = V^0\cup \bigcup _{z\in \delta {\mathbb {Z}}^d} (V_z\cap Q_{M\varepsilon }(z)). \end{aligned}$$

This set obviously has the orientation property stated in ii), provided that $C_\mathrm G\geqq 4\ell +3$.

We show that for any $x\ne y\in V$, one has $|x-y|\geqq \varepsilon /\ell $. Indeed, if there is $z\in \delta {\mathbb {Z}}^d$ with $x,y\in V_z$ then item ii) of Lemma 2.15 implies $|x-y|\geqq \varepsilon / \ell $. If $x,y\in V^0$ then $|x-y|\geqq \varepsilon $. We are left with the case $x\in Q_{M\varepsilon }(z)$ and $y\in Q_{M\varepsilon }(z')$ for some $z\ne z'\in \delta {\mathbb {Z}}^d$, which implies $|x-y|\geqq 2\textrm{dist}(Q_{M\varepsilon },\partial Q_\delta )\geqq 4\ell \varepsilon \geqq \varepsilon /\ell $, by (2.20).

We next similarly show that for any $q\in {\mathbb {R}}^d$ one has $V\cap B_{\ell \varepsilon }(q)\ne \emptyset $. If there is $z\in \delta {\mathbb {Z}}^d$ such that $q\in Q_{(M+2\ell )\varepsilon }(z)$ then $B_{\ell \varepsilon }(q)\subseteq Q_\delta (z)$, and the required property follows from item ii) of Lemma 2.15, since $V\supseteq V_z\cap Q_\delta (z)$. If not, then $B_{\ell \varepsilon }(q)$ does not intersect any $Q_{M\varepsilon }(z)$, so that $B_{\ell \varepsilon }(q)\cap V^0= B_{\ell \varepsilon }(q)\cap \varepsilon {\mathbb {Z}}^d$, which is nonempty by Remark 2.12.

This proves that the set V is $(\ell ,\varepsilon )$-uniform, in the sense of Property (c) of Definition 2.7. By Lemma 2.9 there is a set E so that (V, E) is a triangulation with the Delaunay property.

It only remains to show that (V, E) is non-degenerate. Let $e\in E$ be a simplex, and let $\partial B_r(q)\supseteq v_e$ be its circumscribed sphere. By the Delaunay property $B_r(q)\cap V=\emptyset $, by the $(\ell ,\varepsilon )$-uniformity proven above this implies $r< \ell \varepsilon $. If there is $z\in \delta {\mathbb {Z}}^d$ such that $q\in Q_{(M+2\ell )\varepsilon }(z)$ then $v_e\subseteq V_z$, and item iii) of Lemma 2.15 implies $\mathscr {L}^d(e)\geqq \varepsilon ^d/C'$. Otherwise $v_e\subseteq V^0\subseteq \varepsilon {\mathbb {Z}}^d$, and since $\mathscr {L}^d(e)>0$ by Remark 2.12 we obtain $\mathscr {L}^d(e)\geqq \varepsilon ^d/d!$. This concludes the proof, with $C_\mathrm G:=\max \{ 7+2d+4\ell ,4\ell +3, C', d!\}$. $\quad \square $
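The bookkeeping of the parameters $\ell $ and M in this proof can be made concrete. The following Python sketch, with a function name of our own choosing, computes $\ell =2d$, $M=\lfloor \delta /\varepsilon \rfloor -4\ell $ and the distance between $Q_{M\varepsilon }$ and $\partial Q_\delta $ for given d, $\varepsilon $, $\delta $. Rational arithmetic is used so that the floor and the comparison in (2.20) are exact.

```python
import math
from fractions import Fraction

def grid_parameters(d, eps, delta):
    """Return (ell, M, gap), where gap = dist(Q_{M eps}, boundary of Q_delta)."""
    eps, delta = Fraction(eps), Fraction(delta)
    ell = 2 * d
    M = math.floor(delta / eps) - 4 * ell
    gap = (delta - M * eps) / 2          # both cubes are centred at the origin
    return ell, M, gap
```

For example, with $d=2$, $\varepsilon =3/100$ and $\delta =1$ one can check that the gap is at least $2\ell \varepsilon $, as in (2.20), and that M satisfies the requirement $M\geqq 6+2d$ of Lemma 2.15.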

3.3 Proof of the main result

We now recall how one can use a triangulation to define continuous, piecewise affine approximations.

Lemma 2.16

Let (V, E) be a triangulation of ${\mathbb {R}}^d$. For any $w:V\rightarrow {\mathbb {R}}$ there is a unique $u\in C^0({\mathbb {R}}^d)$ which coincides with w on V and is affine on each $e\in E$.

If the triangulation is $c_*$-non degenerate, and if moreover w is obtained as the restriction to V of a $C^2({\mathbb {R}}^d)$ function that we still denote w, then the function u obtained above obeys

$$\begin{aligned} \Vert \nabla u\Vert _{L^\infty (e)}\leqq C \Vert \nabla w\Vert _{L^\infty (e)} \end{aligned}$$

(2.21)

and

$$\begin{aligned} \Vert \nabla w-\nabla u\Vert _{L^\infty (e)}\leqq C\textrm{diam}(e) \Vert \nabla ^2 w\Vert _{L^\infty (e)} \end{aligned}$$

(2.22)

for all $e\in E$, with C depending on $c_*$ and d.

Proof

For each $e\in E$ one defines $u_e:e\rightarrow {\mathbb {R}}$ by $u_e=w$ on $v_e$ and as the affine interpolation in the rest of $e=\textrm{conv}\,(v_e)$. To prove existence of u we only need to check that $u_e=u_{e'}$ on $e\cap e'$, for any pair $e\ne e'\in E$. Assume $e\cap e'\ne \emptyset $. Then $e\cap e'=\textrm{conv}\,(v_e\cap v_{e'})$. As $u_e=u_{e'}$ on $v_e\cap v_{e'}$, and both are affine in $\textrm{conv}\,(v_e\cap v_{e'})$, they coincide on $e\cap e'$. This concludes the proof of the first assertion.

To prove the two estimates, we focus on an element $e\in E$ and let G be the constant gradient of u on e. For any pair $x,\,y\in v_e$,

$$\begin{aligned} G(y-x)=u(y)-u(x)=w(y)-w(x)=\int _0^1 \nabla w(x+t(y-x)) (y-x) \textrm{d}t, \end{aligned}$$

(2.23)

which implies

$$\begin{aligned} |G(y-x)|\leqq \Vert \nabla w\Vert _{L^\infty (e)} |y-x|. \end{aligned}$$

With (2.2) we obtain (2.21).

To prove the last estimate, we pick any $\xi \in e$ and rewrite (2.23) as

$$\begin{aligned}\begin{aligned} (G-\nabla w(\xi ))(y-x)&=\int _0^1 \left( \nabla w(x+t(y-x))-\nabla w(\xi )\right) (y-x) \textrm{d}t. \end{aligned} \end{aligned}$$

By the mean-value theorem $|\nabla w(\eta )-\nabla w(\xi )|\leqq \textrm{diam}(e)\Vert \nabla ^2w\Vert _{L^\infty (e)}$ for any $\eta \in e$, so that

$$\begin{aligned} |(G-\nabla w(\xi ))(y-x)|\leqq \textrm{diam}(e)\Vert \nabla ^2w\Vert _{L^\infty (e)} |y-x|. \end{aligned}$$

With (2.2) we obtain (2.22). $\quad \square $
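The construction of Lemma 2.16 is easy to carry out on a single element. As a minimal sketch in dimension two (the function name `affine_gradient` and the test function are our assumptions), the code below computes the constant gradient G of the affine interpolant on a triangle by solving the two linear equations $G(x_k-x_0)=w(x_k)-w(x_0)$, $k=1,2$, which are the discrete form of (2.23).

```python
import math

def affine_gradient(tri, w):
    """Gradient G of the affine function agreeing with w at the vertices of tri."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    a, b = x1 - x0, y1 - y0              # first edge vector
    c, e = x2 - x0, y2 - y0              # second edge vector
    f1 = w(x1, y1) - w(x0, y0)
    f2 = w(x2, y2) - w(x0, y0)
    det = a * e - b * c                  # nonzero for a non-degenerate triangle
    # Cramer's rule for the system [[a, b], [c, e]] G = (f1, f2)
    return ((f1 * e - b * f2) / det, (a * f2 - f1 * c) / det)
```

For $w(x,y)=x^2+xy$ on the right triangle with vertices $(0,0)$, $(h,0)$, $(0,h)$ one finds $G=(h,0)$, and the difference $\nabla w-G$ at the vertices is bounded by $\textrm{diam}(e)\Vert \nabla ^2w\Vert _{L^\infty (e)}$, in line with (2.22). In general the constant in (2.22) depends on how degenerate the element is.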

We are ready to prove our main result, Theorem 2.2.

Proof of Theorem 2.2

Before entering into the proof of the theorem, we stress that we are going to use the fact that for a piecewise affine function $u_j$,

$$\begin{aligned} |{\textrm{D}}^2_1 u_j|=|{\textrm{D}}\nabla u_j|. \end{aligned}$$

(2.24)

This follows from the fact that $u_j$ is piecewise affine, hence the distributional derivative ${\textrm{D}}\nabla u_j$ is only of jump type, so that the density of ${\textrm{D}}\nabla u_j$ with respect to $|{\textrm{D}}\nabla u_j|$ is a rank 1 matrix, and hence we can use item v) of Proposition 1.2 in conjunction with Proposition 1.7.

Fix two sequences $\delta _j\rightarrow 0$, $\varepsilon _j\rightarrow 0$, with $\delta _j>0$, $\varepsilon _j>0$, and $\varepsilon _j/\delta _j\rightarrow 0$. For each j and each $z\in \delta _j{\mathbb {Z}}^d$ we select a matrix $R_z\in {SO}({\mathbb {R}}^d)$ such that $R_z^t\nabla ^2w(z)R_z$ is diagonal, and let $(V_j,E_j)$ be the grid constructed in Theorem 2.14 with these parameters. We define $u_j$ as the piecewise affine interpolation of w, constructed as in Lemma 2.16. This concludes the construction.

In order to prove convergence and the energy bound, it suffices to work in a large ball $B_r$, with $\Omega \subseteq B_{r/2}$. For large j, we can assume $C_\mathrm G\varepsilon _j\leqq \delta _j\leqq r/(2d)$. Here and below $C_\mathrm G$ is the (fixed) constant from Theorem 2.14, we can assume $C_\mathrm G>2\sqrt{d}$. We use C for a generic constant that depends only on d (and $C_\mathrm G$) and may vary from line to line. By Lemma 2.16 one immediately obtains a uniform Lipschitz bound on $u_j$,

$$\begin{aligned} \Vert \nabla u_j\Vert _{L^\infty (B_{2r})} \leqq C \Vert \nabla w\Vert _{L^\infty (B_{3r})}. \end{aligned}$$

By the uniformity property of the grid, for any $x\in B_r$ and any j there is $y\in V_j$ with $|x-y|\leqq C_\mathrm G\varepsilon _j$, therefore

$$\begin{aligned} \Vert w-u_j\Vert _{L^\infty (B_r)} \leqq C_\mathrm G\varepsilon _j (\Vert \nabla u_j\Vert _{L^\infty (B_{2r})}+\Vert \nabla w\Vert _{L^\infty (B_{2r})})\rightarrow 0. \end{aligned}$$

This proves local uniform convergence.

Since $\nabla ^2w$ is continuous, one has that

$$\begin{aligned} \omega _\rho :=\sup \bigl \{|\nabla ^2w(x)-\nabla ^2w(y)|: x,\,y\in B_{2r},\, |x-y|\leqq \rho \sqrt{d} \bigr \} \end{aligned}$$

(2.25)

converges to zero as $\rho \rightarrow 0$.

The estimate of the energy is done separately in the interior of the cubes, where the grid is regular, and in the boundary regions. We start from the boundary, where the grid is irregular. As $\nabla w$ is continuous, equation (2.22) in Lemma 2.16 allows us to estimate $ |[\nabla u_j]|$, the jump in $\nabla u_j$ across the boundary between two neighbouring elements e and $e'$ which intersect $B_r$, and gives

$$\begin{aligned} |[\nabla u_j]|\leqq C \varepsilon _j \Vert \nabla ^2w\Vert _{L^\infty (B_{2r})} \qquad \,\hbox {in\, all}\,e\quad \hbox {with}\quad e\cap B_r\ne \emptyset , \end{aligned}$$

where we also used Remark 2.8. Using non-degeneracy and uniformity of the triangulation to control the volume of e, we obtain

$$\begin{aligned} |{\textrm{D}}\nabla u_j|(\partial e)\leqq C {\mathcal {H}}^{d-1}(\partial e) \max |[\nabla u_j]|(\partial e)\leqq C \mathscr {L}^d(e) \Vert \nabla ^2w\Vert _{L^\infty (B_{2r})} \end{aligned}$$

for all elements $e\in E_j$ with $e\subseteq B_{r}$. Fix now $z\in \delta _j{\mathbb {Z}}^d$ such that $Q_{\delta _j}(z)\cap \Omega \ne \emptyset $. Summing the previous condition over all elements $e\in E_j$ with $e\cap \overline{Q}_{\delta _j}(z) {\setminus } Q_{\delta _j-4C_\mathrm G\varepsilon _j}(z)\ne \emptyset $ leads to

$$\begin{aligned} \begin{aligned}&|{\textrm{D}}\nabla u_j|( \overline{Q}_{\delta _j}(z)\setminus Q_{\delta _j-4C_\mathrm G\varepsilon _j}(z))\\&\quad \leqq C \mathscr {L}^d(Q_{\delta _j+4C_\mathrm G\varepsilon _j}(z)\setminus Q_{\delta _j- 8C_\mathrm G\varepsilon _j}(z)) \Vert \nabla ^2w\Vert _{L^\infty (B_{2r})} \\&\quad \leqq C((\delta _j+4C_\mathrm G\varepsilon _j)^d-(\delta _j- 8C_\mathrm G\varepsilon _j)^d) \Vert \nabla ^2w\Vert _{L^\infty (B_{2r})}\\&\quad \leqq C \delta _j^{d-1}\varepsilon _j\, \Vert \nabla ^2w\Vert _{L^\infty (B_{2r})}, \end{aligned} \end{aligned}$$

(2.26)

provided j is large enough, since $\varepsilon _j\ll \delta _j$. Here we used that $\textrm{diam}(e)\leqq 2 C_\mathrm G\varepsilon _j$ for every $e\in E_j$, since the triangulation $(V_j,E_j)$ is $(C_\mathrm G,\varepsilon _j)$-uniform and has the Delaunay property.

We next estimate the energy inside $ Q_{\delta _j-3C_\mathrm G\varepsilon _j}(z)$, for some $z\in \delta _j{\mathbb {Z}}^d\cap B_{r}$. Let $H_z:=\nabla ^2w(z)$, and recall that $R_z$ was chosen so that $R_z^tH_zR_z=\textrm{diag}(\lambda _1,\dots , \lambda _d)$ for some $\lambda \in {\mathbb {R}}^d$, which implies $|H_z|_1=\sum _{i=1}^d|\lambda _i|$, see items i) and ii) of Proposition 1.2. In the next estimates we write briefly $\delta $ and $\varepsilon $ for $\delta _j$ and $\varepsilon _j$.

For any element $e\in E_j$ with $e\cap Q_{\delta -2C_\mathrm G\varepsilon }(z)\ne \emptyset $, we can select $p_e\in e\cap Q_{\delta -2C_\mathrm G\varepsilon }(z)$. Then $B_{C_\mathrm G\varepsilon /2}(p_e){\subseteq Q_{C_\mathrm G\varepsilon }(p_e)}\subseteq Q_{\delta -C_\mathrm G\varepsilon }(z)$, so that the orientation property of Theorem 2.14 gives $B_{C_\mathrm G\varepsilon /2}(p_e)\cap V_j=B_{C_\mathrm G\varepsilon /2}(p_e)\cap \varepsilon R_z{\mathbb {Z}}^d$. Recalling $C_\mathrm G>2\sqrt{d}$, by applying Lemma 2.13 with $q=p_e$, $r=C_\mathrm G\varepsilon /2$, there exists $y\in \varepsilon R_z({\mathbb {Z}}+\frac{1}{2})^d$ such that $v_e\subseteq y+\varepsilon R_z\{-\frac{1}{2},\frac{1}{2}\}^d$. Let $F_y:=\nabla w(y)$. For all $x\in v_e$, Taylor's formula with remainder in integral form and (2.25) yield

$$\begin{aligned} w(x)= w(y)+F_y(x-y)+\frac{1}{2} H_z(x-y)\cdot (x-y) +R(x) \end{aligned}$$

(this can be seen as the definition of $R(\,\cdot \,)$) with

$$\begin{aligned} |R(x)|\leqq & {} d\varepsilon ^2|\nabla ^2 w(y)-H_z|+ \int _0^1 |\nabla ^2 w(x+t(y-x))\nonumber \\{} & {} \quad -\nabla ^2 w(y)| \, |y-x|^2 \textrm{d}t \leqq C \varepsilon ^2 \omega _\delta . \end{aligned}$$

(2.27)

As $x-y=\sum _i \varepsilon \gamma _i R_ze_i$, with $\gamma _i\in \{-\frac{1}{2},\frac{1}{2}\}$, recalling that $R_z^tH_zR_z=\textrm{diag}(\lambda _1,\ldots , \lambda _d)$ we have

$$\begin{aligned} H_z(x-y)\cdot (x-y) =\varepsilon ^2\sum _{i,\,k=1}^d\gamma _i\gamma _k\, e_i\cdot R_z^t H_z R_z e_k =\frac{1}{4} \varepsilon ^2\sum _{i=1}^d\lambda _i \end{aligned}$$

which does not depend on the $\gamma _i$, and therefore is the same for all $x\in v_e$. Hence

$$\begin{aligned}\begin{aligned} w(x)&=w(y)+F_y(x-y)+\frac{1}{8} \varepsilon ^2 \sum _{i=1}^d \lambda _i+R(x)\qquad \text {for all }x\in v_e.\end{aligned} \end{aligned}$$
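The fact that the quadratic term takes the same value at every vertex of the rotated cube can be checked numerically. The following is a minimal sketch in Python, under illustrative assumptions that are not part of the paper: $d=2$, a rotation $R_z$ of angle `theta`, and arbitrarily chosen eigenvalues and mesh size. The function name is hypothetical.

```python
import math
from itertools import product

def quadratic_form_at_vertices(theta, lambdas, eps):
    """Evaluate H(x-y).(x-y) at the vertices x = y + eps*R*gamma, gamma in {-1/2, 1/2}^2.

    Illustrative assumptions: d = 2, R is the rotation by the angle theta,
    and H = R diag(lambdas) R^t, so that R^t H R is diagonal.
    """
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    l1, l2 = lambdas
    # H = R diag(l1, l2) R^t, written out entrywise
    H = [[R[i][0] * l1 * R[k][0] + R[i][1] * l2 * R[k][1] for k in range(2)]
         for i in range(2)]
    values = []
    for gamma in product((-0.5, 0.5), repeat=2):
        # v = x - y = eps * R * gamma
        v = [eps * (R[i][0] * gamma[0] + R[i][1] * gamma[1]) for i in range(2)]
        Hv = [H[i][0] * v[0] + H[i][1] * v[1] for i in range(2)]
        values.append(Hv[0] * v[0] + Hv[1] * v[1])
    return values

# Hypothetical sample data: angle 0.7, eigenvalues (3, -1.5), mesh size 0.01
vals = quadratic_form_at_vertices(theta=0.7, lambdas=(3.0, -1.5), eps=0.01)
# Value predicted by the identity above: (1/4) * eps^2 * (lambda_1 + lambda_2)
expected = 0.25 * 0.01 ** 2 * (3.0 - 1.5)
```

All four entries of `vals` should agree with `expected` up to rounding, which reflects that the signs $\gamma _i$ enter only through $\gamma _i^2=\frac{1}{4}$.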

The function $u_j$ is affine on the element e; assume it has the form $u_j(\xi )=a_e+G_e\xi $ for $\xi \in e$. As $u_j=w$ on $v_e$, for every pair $x,x'\in v_e$ we obtain

$$\begin{aligned} G_e(x-x')=u_j(x)-u_j(x')=w(x)-w(x')=F_y(x-x')+R(x)-R(x'). \end{aligned}$$

Recalling that e is a non-degenerate simplex by (2.2), from (2.27) and the identity just proved we obtain

$$\begin{aligned} |G_e-F_y|\leqq C\varepsilon \omega _\delta . \end{aligned}$$

(2.28)

In summary, if $e\in E_j$ obeys $e\cap Q_{\delta -2C_\mathrm G\varepsilon }(z)\ne \emptyset $ then there exists $y_e\in \varepsilon R_z({\mathbb {Z}}+\frac{1}{2})^d$ with $v_e\subseteq y_e+\varepsilon R_z\{-\frac{1}{2},\frac{1}{2}\}^d$, and the vector $G_e:=\nabla {u_j}_{|e}$ obeys (2.28).

Consider now some $y\in \varepsilon R_z({\mathbb {Z}}+\frac{1}{2})^d$ such that $(y+R_z Q_\varepsilon )\cap Q_{\delta -4C_\mathrm G\varepsilon }(z)\ne \emptyset $. If $e,\,e'$ are two elements with $v_e,\,v_{e'}\subseteq y+R_z \overline{Q}_\varepsilon $, then (by $C_\mathrm G>\sqrt{d}$) both intersect $Q_{\delta -2C_\mathrm G\varepsilon }(z)$, so that the above discussion applies and (2.28) gives $|G_e-G_{e'}|\leqq C\varepsilon \omega _\delta $, having used that the above discussion forces $y=y_e$ (since $y,\,y_e\in \varepsilon R_z({\mathbb {Z}}+\frac{1}{2})^d$ and $y\ne y_e$ imply that $(y+R_z \overline{Q}_\varepsilon )\cap (y_e+\varepsilon R_z\{-\frac{1}{2},\frac{1}{2}\}^d)\supseteq v_e$ has at most dimension $d-1$) and analogously $y=y_{e'}$. In particular, those elements constitute a decomposition of $y+R_z Q_\varepsilon $. Arguing as before, summing over all pairs,

$$\begin{aligned} |{\textrm{D}}\nabla u_j|(y+R_zQ_\varepsilon )\leqq C \varepsilon ^{d-1} \max |G_e-G_{e'}| \leqq C \varepsilon ^d\omega _\delta . \end{aligned}$$

(2.29)

In order to estimate the contribution from the boundary of these cubes, let $y'=y\pm \varepsilon R_z e_i$ be the centre of one of the neighbouring small cubes. Since $C_\mathrm G>2\sqrt{d}$, $y'+R_z Q_\varepsilon \subseteq Q_{\delta -2C_\mathrm G\varepsilon }(z)$, so that (2.28) holds for any element $e''$ contained in $y'+R_z \overline{Q}_\varepsilon $ (with $e''$ in place of e and $y'$ in place of y). As the common boundary has area $\varepsilon ^{d-1}$,

$$\begin{aligned}\begin{aligned} |{\textrm{D}}\nabla u_j|(\partial (y+R_zQ_\varepsilon ))&\leqq C \varepsilon ^d\omega _\delta + \sum _{y'\in y+R_z\varepsilon \{\pm e_1,\dots , \pm e_d\}} \varepsilon ^{d-1} |F_y-F_{y'}|. \end{aligned} \end{aligned}$$

As we did before, we represent $F_{y'}-F_y=\nabla w(y')-\nabla w(y)$ with Taylor’s theorem

$$\begin{aligned} F_{y'}=F_y +H_z(y'-y)+R'(y',y) \qquad \text {and}\qquad |R'(y',y)|\leqq C \varepsilon \omega _\delta \end{aligned}$$

(this can be seen as the definition of $R'(\,\cdot ,\,\cdot \,)$) to obtain

$$\begin{aligned} |{\textrm{D}}\nabla u_j|(\partial (y+R_zQ_\varepsilon ))&\leqq C \varepsilon ^d\omega _\delta + \sum _{y'\in y+R_z\varepsilon \{\pm e_1,\dots , \pm e_d\}} \varepsilon ^{d-1} |H_z(y'-y)|\\&= C \varepsilon ^d\omega _\delta + 2\varepsilon ^d |H_z|_1 \leqq C \varepsilon ^d\omega _\delta +2\int _{y+R_zQ_\varepsilon } |\nabla ^2 w|_1\textrm{d}\mathscr {L}^d, \end{aligned}$$

(2.30)

where we used that the $R_ze_i$ are eigenvectors of $H_z$ by the choice of $R_z$, the definition of the Schatten norm and in the final step (2.25). Let

$$\begin{aligned} A_z:=\left\{ y\in \varepsilon R_z\left( {\mathbb {Z}}+\frac{1}{2}\right) ^d: (y+R_zQ_\varepsilon )\cap Q_{\delta -4C_\mathrm G\varepsilon }(z)\ne \emptyset \right\} . \end{aligned}$$

Summing over all $y\in A_z$, taking into account (2.29) and (2.30) and recalling that the boundaries between the cubes appear twice in the sum, gives

$$\begin{aligned} |{\textrm{D}}\nabla u_j|(Q_{\delta -4C_\mathrm G\varepsilon }(z)) \leqq C \delta ^d\omega _\delta +\int _{Q_\delta (z)} |\nabla ^2w|_1\textrm{d}\mathscr {L}^d \end{aligned}$$

and combining with (2.26)

$$\begin{aligned} |{\textrm{D}}\nabla u_j|(\overline{Q}_{\delta }(z)) \leqq C \delta ^d\left( \omega _\delta +\frac{\varepsilon }{\delta }\Vert \nabla ^2w\Vert _{L^\infty (B_{2r})}\right) +\int _{Q_\delta (z)} |\nabla ^2w|_1\textrm{d}\mathscr {L}^d. \end{aligned}$$

Summing over all z such that $Q_\delta (z)\cap \Omega \ne \emptyset $, and inserting back the indices j,

$$\begin{aligned} |{\textrm{D}}\nabla u_j|(\Omega ) \leqq C |(\Omega )_{\delta _j}| \left( \omega _{\delta _j}+\frac{\varepsilon _j}{\delta _j}\Vert \nabla ^2 w\Vert _{L^\infty (B_{2r})}\right) +\int _{(\Omega )_{\delta _j}} |\nabla ^2w|_1 \textrm{d}\mathscr {L}^d \end{aligned}$$

where $(\Omega )_\rho :=\{x\in {\mathbb {R}}^d: \textrm{dist}(x,\Omega )\leqq \rho \sqrt{d}\}$. Taking the limit $j\rightarrow \infty $, and recalling that $\delta _j\rightarrow 0$, $\omega _{\delta _j}\rightarrow 0$ and $\varepsilon _j/\delta _j\rightarrow 0$, concludes the proof (recalling (2.24)). $\quad \square $

4 Extremality of Cones

In this section we consider functions of the kind

$$\begin{aligned} f^\textrm{cone}(x):=(1-|x|)_+. \end{aligned}$$

(3.1)

It is clear that our forthcoming discussion will apply also to slightly different functions, e.g. $a(1-b|x-x_0|)_+$ for $a,b\in {\mathbb {R}}$ with $b> 0$ and $x_0\in {\mathbb {R}}^d$, but this will not make much difference, as one can reduce to the particular case of (3.1) via a change of coordinates and a rescaling. Notice that, by Proposition 1.13, if $d\geqq 2$,

$$\begin{aligned} |{\textrm{D}}_p^2 f^{\textrm{cone}}|(B_r(0))= d\omega _d\big ((d-1)^{1/p-1}(r\wedge 1)^{d-1}+\chi _{(1,\infty )}(r)\big ). \end{aligned}$$

(3.2)

Our aim is to investigate the extremality of functions of this kind with respect to p-Hessian–Schatten seminorms, for $p\in [1,\infty ]$. It turns out that these functions are extremal, and we now state our main result in this direction. Its proof is deferred to Section 4.3 and will follow easily from the results of Sections 4.1 and 4.2, taking into account also Section 2.3.

Theorem 3.1

Let $d\geqq 2$ and let $p\in [1,\infty )$. Let $f_1,\,f_2\in L^1_{\textrm{loc}}({\mathbb {R}}^d)$ with bounded Hessian–Schatten variation in ${\mathbb {R}}^d$ such that

$$\begin{aligned} |{\textrm{D}}^2_p f_1|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f_2|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d) \end{aligned}$$

and such that for some $\lambda \in (0,1)$,

$$\begin{aligned} f^\textrm{cone}=\lambda f_1+(1-\lambda ) f_2. \end{aligned}$$

Then $f_1$ and $f_2$ are equal to $f^\textrm{cone}$, up to affine terms: there exist affine functions $L_1,L_2:{\mathbb {R}}^d\rightarrow {\mathbb {R}}$ such that $f_i=f^\textrm{cone}+L_i$ for $i=1,\,2$.

Notice that Theorem 3.1 is stated only for $d\geqq 2$. Indeed, for $d=1$, it is easy to see that $f^{\textrm{cone}}$ is not extremal in the sense described in the statement of the theorem.

To simplify the notation, as in this section we consider only balls centred at the origin, we omit the centre of the ball, i.e. $B_r:=B_r(0)$. Before going on, we recall that given $f\in L^1_{\textrm{loc}}({\mathbb {R}}^d)$, we denote by $f^\textrm{rad}$ the function given by Lemma 1.10. As an explicit expression, notice that

(3.3)

Notice also that $f^\textrm{rad}(x)=g(|x|)$ for g(r) given by the right hand side of (3.3) with r in place of |x|.

4.1 Convexity

We prove that if a function $f\in L^1_{\textrm{loc}}({\mathbb {R}}^d)$ is such that $f^\textrm{rad}=f^\textrm{cone}$ and $|{\textrm{D}}^2_p f|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d)$, then f coincides with the cone up to a linear term. The case $p=1$ is treated in Proposition 3.5, using the fact that the absolutely continuous part of ${\textrm{D}}\nabla f$ has a sign, which makes f concave inside the unit ball. The case $p>1$ is treated in Proposition 3.6, using strict convexity of the p-Schatten norm to show that the absolutely continuous part of ${\textrm{D}}\nabla f$ is a scalar multiple of the absolutely continuous part of ${\textrm{D}}\nabla f^\textrm{cone}$, and then scaling to reduce to the $p=1$ case.

First, we need a couple of lemmas. The first is an extension of a well known criterion to recognize convexity.

Lemma 3.2

Let $\Omega \subseteq {\mathbb {R}}^d$ be open and convex and let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Assume that ${\textrm{D}}\nabla f\geqq 0$ (as a measure with values in symmetric matrices). Then f has a representative which is continuous and convex.

Proof

The property of having a continuous representative is clearly local. Since $\Omega $ is open and convex, a continuous function $g:\Omega \rightarrow {\mathbb {R}}$ is convex if and only if it is convex in a neighbourhood of any point. Therefore it suffices to prove the assertion in a neighbourhood of any point, so that we can assume $f\in W^{1,1}(\Omega )$ with $\nabla f\in {\textrm{BV}}(\Omega ;{\mathbb {R}}^d)$, by Proposition 1.11 and Proposition 1.7.

Let $x\in \Omega $, and pick $r>0$ such that $Q_{4r}(x)\subseteq \Omega $ (we write here $Q_\ell (y):=y+(-\ell ,\ell )^d$). Fix a mollifier $\eta _\varepsilon \in C^\infty _{\textrm{c}}(B_\varepsilon ;[0,\infty ))$, with $\varepsilon \leqq r$, and define $f_\varepsilon :=\eta _\varepsilon *f\in C^\infty (Q_{3r}(x))$. Then an immediate computation yields ${\textrm{D}}\nabla f_\varepsilon =\eta _\varepsilon *{\textrm{D}}\nabla f\geqq 0$ in $Q_{3r}(x)$, therefore $f_\varepsilon $ is convex in $Q_{3r}(x)$. Further, $f_\varepsilon \rightarrow f$ in $W^{1,1}(Q_{3r}(x))$. It remains to show that $f_\varepsilon $ (possibly after passing to a subsequence) converges uniformly in $Q_{r}(x)$, which implies the conclusion in $Q_r(x)$ and therefore in a neighbourhood of any point of $\Omega $.

We now prove uniform convergence in $Q_r(x)$; the argument is classical, see e.g. the proof of [13, Theorem 7.6]. Passing to a subsequence, $f_{\varepsilon _j}\rightarrow f$ pointwise almost everywhere. Pick $\bar{x}\in Q_{r/2}(x)$ such that the sequences $f_{\varepsilon _j}(\bar{x})$ and $f_{\varepsilon _j}(y)$, for any vertex y of $Q_{2 r}(\bar{x})\subseteq Q_{3r}(x)$, are bounded (as we can assume them to be convergent), and let $M=M_{\bar{x},r}$ be the common bound. By convexity, $f_{\varepsilon _j}\leqq M$ on $\bar{Q}_{2 r}(\bar{x})$. To prove the uniform lower bound, we observe that for any $w\in Q_{2r}(\bar{x}){\setminus }\{\bar{x}\}$ there is $z\in \partial Q_{2r}(\bar{x})$ such that $\bar{x}$ is in the interior of the segment joining w with z. As convexity implies monotonicity of the difference quotients,

$$\begin{aligned} \frac{f_{\varepsilon _j}(\bar{x})-f_{\varepsilon _j}(w)}{|\bar{x}-w|} \leqq \frac{f_{\varepsilon _j}(z)-f_{\varepsilon _j}(\bar{x})}{|z-\bar{x}|} \leqq \frac{2M}{2r}, \end{aligned}$$

where in the last step we used $|z-\bar{x}|\geqq 2r$. Since $f_{\varepsilon _j}(\bar{x})\geqq -M$ and $|w-\bar{x}|\leqq 2r\sqrt{d}$ we have $f_{\varepsilon _j}(w)\geqq -(1+2\sqrt{d})M$. Passing to the smaller cube $Q_r(x)$ and using again monotonicity of the difference quotients we obtain $\textrm{Lip}(f_{\varepsilon _j}; Q_{r}(x))\leqq C'M$ for all j, so that $f_{\varepsilon _j}$ converges uniformly in $Q_r(x)$ to a continuous convex function, which coincides almost everywhere with f. This concludes the proof. $\quad \square $

The next lemma builds upon Lemma 3.2 and gives an integral characterization of convexity, which is more manageable, and follows from the rigidity in the inequality $|\mathrm{{Tr}}A|\leqq |A|_1$.

Lemma 3.3

Let $\Omega \subseteq {\mathbb {R}}^d$ be open and let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Then

$$\begin{aligned} |{\textrm{D}}^2_1f|(\Omega )\geqq |\mathrm{Tr}\,{\textrm{D}}\nabla f(\Omega )|. \end{aligned}$$

(3.4)

Assume now that equality in (3.4) holds. Then

either $|{\textrm{D}}^2_1f|(\Omega )=\mathrm{Tr}\,{\textrm{D}} \nabla f(\Omega )$ and then f has a representative which is continuous and convex,
or $|{\textrm{D}}^2_1f|(\Omega )=-\mathrm{Tr}\,{\textrm{D}} \nabla f(\Omega )$ and then f has a representative which is continuous and concave.

Proof

We can assume that $\mathrm{Tr}\,{\textrm{D}}\nabla f(\Omega )\geqq 0$, otherwise one replaces f by $-f$.

Let now $A\in {\mathbb {R}}^{d\times d}$ be a symmetric matrix and let $\lambda _1,\ldots ,\lambda _d$ denote its eigenvalues. By item i) of Proposition 1.2,

$$\begin{aligned} |A|_1=\sum _{i=1}^d|\lambda _i|\geqq \sum _{i=1}^d\lambda _i= \mathrm{{Tr}}A \end{aligned}$$

and equality holds if and only if $\lambda _i\geqq 0$ for all i, which is the same as $A\geqq 0$ as a symmetric matrix.
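The inequality $|A|_1\geqq \mathrm{Tr}A$ and its equality case can be illustrated on $2\times 2$ symmetric matrices, whose eigenvalues have a closed form. The following Python sketch uses hypothetical helper names and sample matrices chosen only for illustration.

```python
import math

def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]] (closed form)."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def schatten1(a, b, c):
    """Schatten 1-norm: sum of the absolute values of the eigenvalues."""
    return sum(abs(lam) for lam in eigenvalues_sym2(a, b, c))

def trace_gap(a, b, c):
    """|A|_1 - Tr A, which is nonnegative and vanishes exactly when A >= 0."""
    return schatten1(a, b, c) - (a + c)

# Sample matrices (illustrative only)
gap_psd = trace_gap(2.0, 1.0, 2.0)         # eigenvalues 1 and 3, so A >= 0
gap_indefinite = trace_gap(1.0, 2.0, 1.0)  # eigenvalues -1 and 3
```

For the positive semidefinite sample the gap vanishes, while for the indefinite one it equals twice the absolute value of the negative eigenvalue.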

By Proposition 1.7 (in particular, $|{\textrm{D}}^2_1f|\ll |{\textrm{D}}\nabla f|$ and $\mathrm{Tr}\,{\textrm{D}}\nabla f\ll |{\textrm{D}}\nabla f|$),

$$\begin{aligned} |{\textrm{D}}^2_1f|(\Omega ) =\int _\Omega \bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\bigg |_1 \textrm{d}|{\textrm{D}}\nabla f| \geqq \int _\Omega \textrm{Tr}\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}} \textrm{d}|{\textrm{D}}\nabla f|= \mathrm{{Tr}}{\textrm{D}}\nabla f(\Omega ), \end{aligned}$$

which proves the bound (3.4). If equality holds, then

$$\begin{aligned} \bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\bigg |_1= \textrm{Tr} \frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\qquad |{\textrm{D}}\nabla f|\text {-a.e.} \end{aligned}$$

so that

$$\begin{aligned} \frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\geqq 0\qquad |{\textrm{D}}\nabla f|\text {-a.e.} \end{aligned}$$

which means that ${\textrm{D}}\nabla f\geqq 0$ as a matrix-valued measure, so that the conclusion then follows by Lemma 3.2. $\quad \square $

4.2 Extremality with respect to spherical averaging

In this section we consider only the case $d\geqq 2$, since its results are auxiliary to the proof of Theorem 3.1, which holds only for $d\geqq 2$. We start with some explicit computations involving the Hessian–Schatten total variation of $f^\textrm{cone}$. First, by Proposition 1.7, $f^\textrm{cone}\in W^{1,1}({\mathbb {R}}^d)$ with $\nabla f^\textrm{cone}\in {\textrm{BV}}({\mathbb {R}}^d;{\mathbb {R}}^d)$. More precisely,

$$\begin{aligned} \nabla f^\textrm{cone}(x)=-\chi _{B_1}(x)\frac{x}{|x|}. \end{aligned}$$

This computation is easily justified by locality, as $f^\textrm{cone}$ is smooth on $B_1\setminus \{0\}$ and on ${\mathbb {R}}^d\setminus \bar{B}_1$. Now we claim that

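$$\begin{aligned} {\textrm{D}}\nabla f^\textrm{cone}=-\frac{|x|^2 \textrm{Id}-x\otimes x}{|x|^3}\,\mathscr {L}^d\llcorner B_1+x\otimes x\,{\mathscr {H}}^{d-1}\llcorner \partial B_1. \end{aligned}$$

(3.5)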

Taking into account that ${\textrm{D}}\nabla f^\textrm{cone}$ does not charge points, this formula is easily justified on ${\mathbb {R}}^d{\setminus } \partial B_1$ by locality, as above. As for the singular part, on $\partial B_1$, it is enough to use the representation formula for the singular part of differentials of vector valued functions of bounded variation, see e.g. [3]; indeed, the unit outer normal to $\partial B_1$ at x is x and the jump of $\nabla f^\textrm{cone}$ at $x\in \partial B_1$ is exactly x.

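Taking traces, we have that

$$\begin{aligned} \mathrm{Tr}\,{\textrm{D}}\nabla f^\textrm{cone}=-\frac{d-1}{|x|}\,\mathscr {L}^d\llcorner B_1+{\mathscr {H}}^{d-1}\llcorner \partial B_1, \end{aligned}$$

so that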

$$\begin{aligned} \int _{B_r}\textrm{d}\,\mathrm{Tr}\,{\textrm{D}}\nabla f^\textrm{cone}= - d\omega _d r^{d-1}\chi _{(0,1]}(r) \qquad \forall r>0. \end{aligned}$$

(3.6)
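Formula (3.6) can be checked numerically by integrating in polar coordinates. The sketch below, in Python, assumes that the absolutely continuous part of $\mathrm{Tr}\,{\textrm{D}}\nabla f^\textrm{cone}$ has density $-(d-1)/|x|$ on $B_1$ and that its singular part is ${\mathscr {H}}^{d-1}$ restricted to $\partial B_1$, consistently with (3.12) and with the jump computation above. The function names and the midpoint quadrature are illustrative choices.

```python
import math

def unit_ball_volume(d):
    """omega_d, the Lebesgue measure of the unit ball of R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def trace_measure_of_ball(d, r, steps=20000):
    """Approximate Tr D grad f_cone (B_r) for the open ball B_r, d >= 2.

    Absolutely continuous part: density -(d-1)/|x| on B_1, integrated in
    polar coordinates with the midpoint rule. Singular part: H^{d-1} on
    the unit sphere, which is contained in B_r only when r > 1.
    """
    surface = d * unit_ball_volume(d)  # H^{d-1}(unit sphere) = d * omega_d
    upper = min(r, 1.0)
    h = upper / steps
    # integral over (0, upper) of -(d-1) * s^(d-2) ds
    radial = sum(-(d - 1) * ((k + 0.5) * h) ** (d - 2) for k in range(steps)) * h
    total = surface * radial
    if r > 1.0:
        total += surface
    return total

# Sample evaluations in dimension d = 3 (illustrative choices of r)
value_inside = trace_measure_of_ball(3, 0.5)   # (3.6) predicts -3*omega_3/4 = -pi
value_outside = trace_measure_of_ball(3, 2.0)  # (3.6) predicts 0
```

For $r>1$ the negative contribution of the absolutely continuous part is exactly cancelled by the jump part on $\partial B_1$, which is the content of the factor $\chi _{(0,1]}(r)$.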

Recall that by Lemma 1.10, $|{\textrm{D}}^2_p f^\textrm{rad}|({\mathbb {R}}^d)\leqq |{\textrm{D}}^2_p f|({\mathbb {R}}^d)$. The next lemma states that this inequality is somehow rigid.

Lemma 3.4

Let $p\in [1,\infty ]$. Let $f \in L^1_{\textrm{loc}}({\mathbb {R}}^d)$ with bounded Hessian–Schatten variation and assume that

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{rad}|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f|({\mathbb {R}}^d). \end{aligned}$$

(3.7)

Then, for every $r>0$ one has

$$\begin{aligned} \begin{aligned} |{\textrm{D}}_p^2 f|(B_r)=|{\textrm{D}}_p^2 f^\textrm{rad}|(B_r)&,\ |{\textrm{D}}_p^2 f|(\partial B_r)=|{\textrm{D}}_p^2 f^\textrm{rad}|(\partial B_r)\\ {}&\text { and }|{\textrm{D}}_p^2 f|({\mathbb {R}}^d\setminus \bar{B}_r)=|{\textrm{D}}_p^2 f^\textrm{rad}|({\mathbb {R}}^d\setminus \bar{B}_r). \end{aligned} \end{aligned}$$

(3.8)

Proof

First notice that thanks to Lemma 1.10, for any $\varepsilon >0$,

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{rad}|(B_r)\leqq |{\textrm{D}}^2_p f|(B_r)&,\ |{\textrm{D}}^2_p f^\textrm{rad}|(B_{r+\varepsilon }\setminus \bar{B}_{r-\varepsilon })\leqq |{\textrm{D}}^2_p f|(B_{r+\varepsilon }\setminus \bar{B}_{r-\varepsilon })\\ {}&\text {and } |{\textrm{D}}^2_p f^\textrm{rad}|({\mathbb {R}}^d\setminus \bar{B}_r)\leqq |{\textrm{D}}^2_p f|({\mathbb {R}}^d\setminus \bar{B}_r) \end{aligned}$$

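so that, by regularity of measures, letting $\varepsilon \searrow 0$,

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{rad}|(\partial B_r)\leqq |{\textrm{D}}^2_p f|(\partial B_r). \end{aligned}$$

Then we can compute, by the inequalities above and exploiting (3.7),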

$$\begin{aligned} |{\textrm{D}}^2_p f|({\mathbb {R}}^d)&=|{\textrm{D}}^2_p f^\textrm{rad}|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f^\textrm{rad}|(B_r)+|{\textrm{D}}^2_p f^\textrm{rad}|(\partial B_r)+|{\textrm{D}}^2_p f^\textrm{rad}|({\mathbb {R}}^d\setminus \bar{B}_r)\\ {}&\leqq |{\textrm{D}}^2_p f|(B_r)+|{\textrm{D}}^2_p f|(\partial B_r)+|{\textrm{D}}^2_p f|({\mathbb {R}}^d\setminus \bar{B}_r)=|{\textrm{D}}_p^2 f|({\mathbb {R}}^d), \end{aligned}$$

so that equality holds throughout and therefore we obtain (3.8). $\quad \square $

Now we state and prove the main results of this section, splitting the case $p=1$ and the case $p\in (1,\infty )$. Recall that $|{\textrm{D}}^2_1 f^\textrm{cone}|({\mathbb {R}}^d\setminus \bar{B}_1)=0$ according to (3.5).

Proposition 3.5

Let $f \in L^1_{\textrm{loc}}({\mathbb {R}}^d)$ with bounded Hessian–Schatten variation and assume that

$$\begin{aligned} f^\textrm{rad}=f^\textrm{cone}\qquad \text {and}\qquad |{\textrm{D}}^2_1f|({\mathbb {R}}^d)=|{\textrm{D}}^2_1f^\textrm{cone}|({\mathbb {R}}^d). \end{aligned}$$

(3.9)

Then f is equal to $f^\textrm{cone}$ up to a linear term: there exists $\alpha \in {\mathbb {R}}^d$ such that

$$\begin{aligned} f(x)=f^\textrm{cone}(x)+\alpha \cdot x \qquad \text {for a.e. }x\in {\mathbb {R}}^d. \end{aligned}$$

Proof

Let $r>0$ and let $U\in SO({\mathbb {R}}^d)$. By Lemma 1.10, $f_U:=f(U\,\cdot \,)$ has finite Hessian–Schatten total variation. Also, for any radial function $g\in C_{\textrm{c}}^\infty ({\mathbb {R}}^d)$ one has

$$\begin{aligned} \int _{{\mathbb {R}}^d}f_U\Delta g\textrm{d}\mathscr {L}^d=\int _{{\mathbb {R}}^d}f (\Delta g)_{U^t}\textrm{d}\mathscr {L}^d=\int _{{\mathbb {R}}^d}f\Delta g\textrm{d}\mathscr {L}^d, \end{aligned}$$

so that, integrating both sides with respect to $\textrm{d}\mu _d(U)$ and using Fubini’s Theorem,

$$\begin{aligned} \int _{{\mathbb {R}}^d}f^\textrm{rad}\Delta g\textrm{d}\mathscr {L}^d= \int _{{\mathbb {R}}^d}f\Delta g\textrm{d}\mathscr {L}^d. \end{aligned}$$

Then, as $f^\textrm{rad}=f^\textrm{cone}$ and integrating by parts,

$$\begin{aligned} \int _{{\mathbb {R}}^d}g\, \textrm{d}\,\mathrm{Tr}\,{\textrm{D}}\nabla f^\textrm{cone}= \int _{{\mathbb {R}}^d}g\, \textrm{d}\,\mathrm{Tr}\,{\textrm{D}}\nabla f. \end{aligned}$$

Therefore, by an approximation argument, recalling the explicit computation (3.6), we obtain that

$$\begin{aligned} \int _{B_r} \textrm{d}\,\mathrm{Tr}\,{\textrm{D}}\nabla f=- d\omega _d r^{d-1}\chi _{(0,1]}(r) \qquad \forall r>0. \end{aligned}$$

In particular, taking into account (3.2) and (3.8) (recall that we can use (3.8) thanks to the standing assumption (3.9))

$$\begin{aligned} -\mathrm{{Tr}}{\textrm{D}}\nabla f(B_1)=d\omega _d=|{\textrm{D}}_1^2 f^\textrm{cone}|(B_1)= |{\textrm{D}}_1^2 f|(B_1). \end{aligned}$$

Now Lemma 3.3 can be applied, to obtain that the function f has a continuous and concave representative in $B_1$ that, without loss of generality, we still denote by f. By (3.8) again, f is affine on ${\mathbb {R}}^d{\setminus }\bar{B}_1$, say $f(x)=\alpha \,\cdot \,x+\beta $ for $x\in {\mathbb {R}}^d{\setminus }\bar{B}_1$, for some $\alpha \in {\mathbb {R}}^d$ and $\beta \in {\mathbb {R}}$. Now $f^\textrm{rad}=f^\textrm{cone}$ forces $\beta =0$.

Setting also ${\tilde{f}}(x):=f(x)-\alpha \,\cdot \, x$, we conclude the proof by showing ${\tilde{f}}=f^\textrm{cone}$. Notice that still ${\tilde{f}}$ is continuous and concave on $B_1$ and ${\tilde{f}}^\textrm{rad}=f^\textrm{cone}$. Notice that this last fact implies ${\tilde{f}}(0)=1$.

Now, for any $\sigma \in \partial B_1$, define ${\tilde{f}}_\sigma (s):={\tilde{f}}(s \sigma )$ for $s\in [0,\infty )$, a function continuous and concave in $[0,1)$ with ${\tilde{f}}_\sigma (0)=1$. Notice that for ${\mathscr {H}}^{d-1}$-a.e. $\sigma \in \partial B_1$, ${\tilde{f}}_\sigma \in W^{1,1}_{\textrm{loc}}((0,\infty ))$. This can be seen either with a change of coordinates and the characterization of Sobolev functions on lines or by approximation, using repeatedly integration in polar coordinates. Hence, for ${\mathscr {H}}^{d-1}$-a.e. $\sigma \in \partial B_1$, the function ${\tilde{f}}_\sigma $ has a continuous representative in $[1,\infty )$. Now, for ${\mathscr {H}}^{d-1}$-a.e. $\sigma \in \partial B_1$, ${\tilde{f}}_\sigma $ vanishes a.e. in $(1,\infty )$ (as ${\tilde{f}}$ vanishes identically on ${\mathbb {R}}^d\setminus \bar{B}_1$), which implies $\tilde{f}_\sigma (s)\rightarrow 0$ as $s\uparrow 1$ and that the continuous representative vanishes in $[1,\infty )$. Then, exploiting continuity and concavity, for ${\mathscr {H}}^{d-1}$-a.e. $\sigma \in \partial B_1$, ${\tilde{f}}_\sigma (s)\geqq (1-s)$ for $s\in [0,1]$. Hence ${\tilde{f}}\geqq f^\textrm{cone}$ $\mathscr {L}^d$-a.e. on $B_1$. Since ${\tilde{f}}^\textrm{rad}=f^\textrm{cone}$, the nonnegative function ${\tilde{f}}-f^\textrm{cone}$ has vanishing rotational average, whence ${\tilde{f}}=f^\textrm{cone}$ on $B_1$. $\quad \square $

Proposition 3.6

Let $p\in [1,\infty )$. Let $f \in L^1_{\textrm{loc}}({\mathbb {R}}^d)$ with bounded Hessian–Schatten variation and assume that

$$\begin{aligned} f^\textrm{rad}=f^\textrm{cone}\qquad \text {and}\qquad |{\textrm{D}}^2_p f|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d). \end{aligned}$$

(3.10)

Then f is equal to $f^\textrm{cone}$ up to a linear term: there exists $\alpha \in {\mathbb {R}}^d$ such that

$$\begin{aligned} f(x)=f^\textrm{cone}(x)+\alpha \cdot x \qquad \text {for a.e. }x\in {\mathbb {R}}^d. \end{aligned}$$

Proof

We focus on the case $p>1$ as the case $p=1$ has already been proved in Proposition 3.5. Let now $g:=\frac{1}{2} (f+f^\textrm{cone})$. Recalling (3.8), $|{\textrm{D}}^2_p g|({\mathbb {R}}^d{\setminus } \bar{B}_1)=0$. Still, $g^\textrm{rad}=f^\textrm{cone}$, so that, by Lemma 1.10 and (3.10),

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d)\leqq |{\textrm{D}}^2_p g|({\mathbb {R}}^d)\leqq \frac{1}{2}|{\textrm{D}}^2_p f|({\mathbb {R}}^d)+\frac{1}{2}|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d), \end{aligned}$$

hence equality holds throughout and therefore g satisfies (3.10) in place of f.

We next decompose ${\textrm{D}}\nabla f$ in absolutely continuous and singular part, use that the singular one has a rank one density with respect to the total variation, and show that the absolutely continuous one is proportional to the one of ${\textrm{D}}\nabla f^\textrm{cone}$. We are going to use the theory of functions of bounded variation throughout, see e.g. [3]. The superscript s denotes the singular part of a measure with respect to $\mathscr {L}^d$. We have a $\mathscr {L}^d$-negligible Borel set $N\subseteq B_1$ such that . Also , being , by (3.5). In addition

hence equality holds throughout and in particular, . Now, recall that and , also , by (3.5). Therefore, by Proposition 1.7,

$$\begin{aligned} |{\textrm{D}}^2_p g|(B_1)&=|{\textrm{D}}^2_p g|(N)+\int _{B_1\setminus N}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla g}}{\textrm{d}{|{\textrm{D}}\nabla g|}}\bigg |_p\textrm{d}{|{\textrm{D}}\nabla g|}\\&=|{\textrm{D}}^2_p g|(N)+\int _{B_1\setminus N}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla g}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p\textrm{d}{\mathscr {L}^d}\\ {}&=|{\textrm{D}}^2_p g|(N)+\frac{1}{2}\int _{B_1\setminus N}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}+\frac{\textrm{d}{{\textrm{D}}\nabla f^\textrm{cone}}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p\textrm{d}{\mathscr {L}^d}\\&\leqq \frac{1}{2}|{\textrm{D}}^2_p f|(N)+\frac{1}{2}\int _{B_1\setminus N}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p+\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f^\textrm{cone}}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p\textrm{d}{\mathscr {L}^d}\\&\leqq \frac{1}{2}|{\textrm{D}}^2_p f|(B_1)+\frac{1}{2}|{\textrm{D}}^2_p f^\textrm{cone}|(B_1)= |{\textrm{D}}^2_p g|(B_1), \end{aligned}$$

where we also used (3.10) for f and g and (3.8) in the last equality. Hence equality holds throughout, so that

$$\begin{aligned} \bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}+\frac{\textrm{d}{{\textrm{D}}\nabla f^\textrm{cone}}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p=\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p+\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f^\textrm{cone}}}{\textrm{d}{\mathscr {L}^d}}\bigg |_p\qquad \mathscr {L}^d\hbox { -a.e.\ on}\ B_1. \end{aligned}$$

By strict convexity of the p-Schatten norm (item vi) of Proposition 1.2), and the fact (by (3.5)) that the density of ${\textrm{D}}\nabla f^\textrm{cone}$ with respect to $\mathscr {L}^d$ is nonzero $\mathscr {L}^d$-a.e. on $B_1$, we have that for some Borel map $t:B_1\rightarrow [0,\infty )$,

$$\begin{aligned} \frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}=t\frac{\textrm{d}{{\textrm{D}}\nabla f^\textrm{cone}}}{\textrm{d}{\mathscr {L}^d}}\qquad \mathscr {L}^d\hbox { -a.e.\ on}\ B_1. \end{aligned}$$

(3.11)

Now, by (3.5), for $q\in [1,\infty ]$,

$$\begin{aligned} \bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f^\textrm{cone}}}{\textrm{d}{\mathscr {L}^d}}(x)\bigg |_q=\bigg |-\frac{|x|^2 \textrm{Id}-x\otimes x}{|x|^3}\bigg |_q=\frac{(d-1)^{1/q}}{|x|}\qquad \mathscr {L}^d\text {-a.e. on }B_1. \end{aligned}$$

(3.12)
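Formula (3.12) can be tested for $q=1$ and $q=2$ without an eigenvalue solver. The Python sketch below relies on the observation that the matrix in (3.12) equals $-P/|x|$, where $P$ is the orthogonal projection onto the hyperplane orthogonal to $x$; hence it is negative semidefinite, its Schatten 1-norm is minus its trace, and its Schatten 2-norm is the Frobenius norm. The sample point and the function names are hypothetical.

```python
import math

def cone_hessian(x):
    """Matrix -(|x|^2 Id - x (x) x)/|x|^3 at a point x with 0 < |x| < 1."""
    d = len(x)
    n2 = sum(t * t for t in x)
    n3 = n2 ** 1.5
    return [[-((n2 if i == k else 0.0) - x[i] * x[k]) / n3 for k in range(d)]
            for i in range(d)]

def schatten_norms_q1_q2(x):
    """Return (|M|_1, |M|_2) for M = cone_hessian(x).

    Uses (without verifying it numerically) that M = -P/|x| is negative
    semidefinite, so |M|_1 = -Tr M; |M|_2 is the Frobenius norm of M.
    """
    M = cone_hessian(x)
    d = len(x)
    norm1 = -sum(M[i][i] for i in range(d))
    norm2 = math.sqrt(sum(M[i][k] ** 2 for i in range(d) for k in range(d)))
    return norm1, norm2

# Sample point in the unit ball of R^3 (illustrative choice)
x = [0.3, -0.2, 0.4]
norm1, norm2 = schatten_norms_q1_q2(x)
r = math.sqrt(sum(t * t for t in x))
# (3.12) predicts norm1 = (d-1)/r and norm2 = sqrt(d-1)/r with d = 3
```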

Then, by (3.11) and (3.12) (with $q=1$ and $q=p$),

$$\begin{aligned} \bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}{}(x)\bigg |_p&=t(x)\frac{(d-1)^{1/p}}{|x|}={(d-1)^{1/p-1}}t(x)\frac{d-1}{|x|}\\&={(d-1)^{1/p-1}}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{\mathscr {L}^d}}{}(x)\bigg |_1\qquad \mathscr {L}^d\hbox { -a.e.\ on}\ B_1. \end{aligned}$$

Therefore, by Proposition 1.7,

$$\begin{aligned} |{\textrm{D}}^2_p f|(B_1\setminus N)={(d-1)^{1/p-1}}|{\textrm{D}}^2_1 f|(B_1\setminus N). \end{aligned}$$

(3.13)

On the singular set N, by Proposition 1.7 and Alberti’s rank 1 Theorem together with item v) of Proposition 1.2,

$$\begin{aligned} |{\textrm{D}}^2_p f|(N)=\int _{N}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\bigg |_p\textrm{d}|{\textrm{D}}\nabla f|=\int _{N}\bigg |\frac{\textrm{d}{{\textrm{D}}\nabla f}}{\textrm{d}{|{\textrm{D}}\nabla f|}}\bigg |_1\textrm{d}|{\textrm{D}}\nabla f|= |{\textrm{D}}^2_1 f|(N). \end{aligned}$$

(3.14)

Therefore, by (3.13), (3.14) and (3.8), taking into account that $d\geqq 2$ and $p\geqq 1$ (hence $1\leqq (d-1)^{1-1/p}$),

$$\begin{aligned} \begin{aligned} |{\textrm{D}}_1^2 f|(B_1)&=|{\textrm{D}}^2_1 f|(B_1\setminus N)+|{\textrm{D}}^2_1 f|(N)\\&=(d-1)^{1-1/p}|{\textrm{D}}^2_p f|(B_1\setminus N)+|{\textrm{D}}^2_p f|(N)\\&\leqq (d-1)^{1-1/p}\big (|{\textrm{D}}^2_p f|(B_1\setminus N)+|{\textrm{D}}^2_p f|(N)\big )\\&=(d-1)^{1-1/p}|{\textrm{D}}_p^2 f|(B_1)\\&=(d-1)^{1-1/p}|{\textrm{D}}_p^2f ^\textrm{cone}|(B_1)=|{\textrm{D}}_1^2f ^\textrm{cone}|(B_1) \end{aligned} \end{aligned}$$

(3.15)

where the last equality follows from (3.2). Recalling (3.8) and arguing exactly as for (3.14) for the first and third equalities,

$$\begin{aligned} |{\textrm{D}}^2_1 f|(\partial B_1)=|{\textrm{D}}^2_p f|(\partial B_1)=|{\textrm{D}}^2_p f^\textrm{cone}|(\partial B_1)=|{\textrm{D}}^2_1 f^\textrm{cone}|(\partial B_1). \end{aligned}$$

(3.16)

Then, by (3.8), exploiting (3.15) and (3.16)

$$\begin{aligned}&|{\textrm{D}}_1^2 f|({\mathbb {R}}^d)=|{\textrm{D}}_1^2 f|(B_1)+|{\textrm{D}}_1^2 f|(\partial B_1)\leqq |{\textrm{D}}_1^2 f^\textrm{cone}|(B_1)\\&\quad +|{\textrm{D}}_1^2 f^\textrm{cone}|(\partial B_1)=|{\textrm{D}}_1^2 f^\textrm{cone}|({\mathbb {R}}^d). \end{aligned}$$

Recalling Lemma 1.10 together with (3.10), the inequality above yields that f satisfies (3.9), so that the conclusion follows from Proposition 3.5. $\quad \square $

4.3 Proof of the main result

Proof of Theorem 3.1

Let $f_1$ and $f_2$ be as in the statement and recall (3.3), so that we can define $f^\textrm{rad}_i$ for $i=1,2$. As $f^{\textrm{cone}}$ is already a radial function, we still have $\lambda f^\textrm{rad}_1+(1-\lambda )f^\textrm{rad}_2=f^\textrm{cone}$. Now we compute, using Lemma 1.10 and the assumption,

$$\begin{aligned}&|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d)=|{\textrm{D}}^2_p (\lambda f_1^\textrm{rad}+(1-\lambda )f_2^\textrm{rad})|({\mathbb {R}}^d)\\&\quad \leqq \lambda |{\textrm{D}}_p^2 f_1^\textrm{rad}|({\mathbb {R}}^d)+ (1-\lambda ) |{\textrm{D}}^2_p f^\textrm{rad}_2|({\mathbb {R}}^d)\\&\quad \leqq \lambda |{\textrm{D}}_p^2 f_1|({\mathbb {R}}^d)+ (1-\lambda ) |{\textrm{D}}^2_p f_2|({\mathbb {R}}^d)\\&\quad =\lambda |{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)+(1-\lambda ) |{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)\\&\quad =|{\textrm{D}}^2_p f^\textrm{cone}|({\mathbb {R}}^d), \end{aligned}$$

hence equality holds throughout. Therefore,

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{rad}_i|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f_i|({\mathbb {R}}^d)\qquad \text {for }i=1,2, \end{aligned}$$

and

$$\begin{aligned} |{\textrm{D}}^2_p (\lambda f_1^\textrm{rad}+(1-\lambda )f_2^\textrm{rad})|({\mathbb {R}}^d)= |{\textrm{D}}_p^2 (\lambda f_1^\textrm{rad})|({\mathbb {R}}^d)+ |{\textrm{D}}^2_p ((1-\lambda ) f^\textrm{rad}_2)|({\mathbb {R}}^d) \end{aligned}$$

so that, by Lemma 1.12,

$$\begin{aligned} |{\textrm{D}}^2_p f^\textrm{cone}|=\lambda |{\textrm{D}}^2_p f_1^\textrm{rad}|+(1-\lambda )|{\textrm{D}}^2_p f_2^\textrm{rad}| \end{aligned}$$

(3.17)

as measures on ${\mathbb {R}}^d$. As $f_1^\textrm{rad}$ and $f_2^\textrm{rad}$ are radial functions with bounded Hessian–Schatten variation, by Proposition 1.13, $f_i^\textrm{rad}(x)=g_i(|x|)$ for $g_i\in W^{1,1}_{\textrm{loc}}((0,\infty ))$. Similarly, $f^\textrm{cone}(x)=g^\textrm{cone}(|x|)=(1-|x|)_+$; notice that $\lambda g_1+(1-\lambda ) g_2=g^\textrm{cone}$. Then, using repeatedly the representation formula of Proposition 1.13 and (3.17),

$$\begin{aligned}&|{\textrm{D}}_p^2 f^\textrm{cone}|(B_1)=d\omega _d\int _0^1 \Vert (0,(g^\textrm{cone})',\ldots ,(g^\textrm{cone})')\Vert _{\ell ^p} s^{d-2}\textrm{d}s\\&\quad \leqq d\omega _d\bigg (\lambda \int _0^1 \Vert (0,g_1',\ldots ,g_1')\Vert _{\ell ^p} s^{d-2}\textrm{d}s \\&\quad + (1-\lambda ) \int _0^1 \Vert (0,g_2',\ldots ,g_2')\Vert _{\ell ^p} s^{d-2}\textrm{d}s\bigg )\\&\quad \leqq \lambda |{\textrm{D}}_p^2 f^\textrm{rad}_1|(B_1)+(1-\lambda )|{\textrm{D}}_p^2 f^\textrm{rad}_2|(B_1)=|{\textrm{D}}_p^2 f^\textrm{cone}|(B_1), \end{aligned}$$

hence equality holds throughout. In particular, as we have obtained

$$\begin{aligned} d\omega _d\int _0^1 \Vert (0,g_i',\dots ,g_i')\Vert _{\ell ^p} s^{d-2}\textrm{d}s=|{\textrm{D}}^2_p f_i^\textrm{rad}|(B_1)\qquad \text {for }i=1,\,\,2, \end{aligned}$$

exploiting the representation formula of Proposition 1.13, we have that $g_1'$ and $g_2'$ are constant on (0, 1). Also, by (3.17), and the representation formula of Proposition 1.13 again, $g_1'$ and $g_2'$ vanish identically on $(1,\infty )$. Recall also that $g_i\in W^{1,1}_{\textrm{loc}}((0,\infty ))$, so that $g_i$ has a continuous representative, for $i=1,\,2$. Hence, there exist $\alpha _1,\alpha _2\in {\mathbb {R}}$ and $\beta _1,\,\beta _2\in {\mathbb {R}}$ such that

$$\begin{aligned} g_i(s)=\alpha _i (1-s)_++\beta _i. \end{aligned}$$

Now, $\lambda g_1+(1-\lambda ) g_2=g^\textrm{cone}$ forces $\lambda \alpha _1+(1-\lambda )\alpha _2=1$, whereas

$$\begin{aligned} |\alpha _i||{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f_i^\textrm{rad}|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)\qquad \text {for }i=1,2 \end{aligned}$$

forces $|\alpha _1|=|\alpha _2|=1$. Hence, $\alpha _1=\alpha _2=1$.

Therefore, to sum up, we have, for $i=1,\,2$,

$$\begin{aligned} f_i^\textrm{rad}=f^\textrm{cone}+\beta _i, \end{aligned}$$

so that

$$\begin{aligned} |{\textrm{D}}^2_p f_i^\textrm{rad}|({\mathbb {R}}^d)=|{\textrm{D}}_p^2 f^\textrm{cone}|({\mathbb {R}}^d)=|{\textrm{D}}^2_p f_i|({\mathbb {R}}^d). \end{aligned}$$

Notice that $f_i^\textrm{rad}-\beta _i=(f_i-\beta _i)^\textrm{rad}$. Now we use Proposition 3.6 to infer that

$$\begin{aligned} f_i(x)-\beta _i=f^\textrm{cone}(x)+a_i\,\cdot \,x\qquad \text {for a.e.\ }x\in {\mathbb {R}}^d, \end{aligned}$$

hence the proof is concluded with $L_i(x):=a_i\,\cdot \, x+\beta _i$. $\quad \square $

5 Solutions of the Minimization Problem

In this section we restrict ourselves to the two-dimensional case $d=2$. Recall that, by Proposition 1.11, functions with bounded Hessian–Schatten variation are continuous in dimension 2; hence the evaluation functionals in (4.1) below are meaningful (we implicitly take the continuous representative whenever possible).

Fix $\Omega \subseteq {\mathbb {R}}^2$ open, and fix $x_1,\ldots , x_N \in \Omega $ distinct test points and fix also $y_1,\ldots ,y_N\in {\mathbb {R}}$. For $\lambda \in [0,\infty ]$ and $p,\,q\in [1,\infty ]$ we consider the functional

$$\begin{aligned}&\mathcal {F}_\lambda ^{p,q}:L^1_{\textrm{loc}}(\Omega )\rightarrow [0,\infty ]\qquad \text {defined as}\\&\mathcal {F}_\lambda ^{p,q}(f) :=|{\textrm{D}}^2_p f|(\Omega )+\lambda \Vert (f(x_i)-y_i)_{i=1,\ldots ,N}\Vert _{\ell ^q}, \end{aligned}$$

(4.1)

where we adopt the convention that $\infty \,\cdot \, 0=0$. Notice that if $p=q=1$, we have that $\mathcal {F}^{1,1}_\lambda =\mathcal {F}_\lambda $, where $\mathcal {F}_\lambda $ is defined in (0.4) in the Introduction.

Our aim is to establish conditions under which $\mathcal {F}_\lambda ^{p,q}$ has minimizers, i.e. conditions ensuring that the infimum

$$\begin{aligned} \inf _{f\in L^1_{\textrm{loc}}(\Omega )} \mathcal {F}^{p,q}_\lambda (f) \end{aligned}$$

is attained. It turns out that for many values of $\lambda ,\,p,\,q$, minimizers indeed exist. Here we state our main results in this direction.

Theorem 4.1

Let $p,\,q\in [1,\infty ]$ and let $\lambda \in [0, 2^{1/p-1} 4\pi ]$. Then there exists a minimizer of $\mathcal {F}_\lambda ^{p,q}$.

Theorem 4.2

Let $\lambda \in [0,\infty ]$. Then there exists a minimizer of $\mathcal {F}_\lambda ^{1,1}$.

Theorems 4.1 and 4.2 will follow easily from the results of Section 5.1. We defer their proofs to Section 5.2.

5.1 Auxiliary results

For the next lemma, we recall again that functions with bounded Hessian–Schatten variation in dimension 2 are automatically continuous. Hence, the evaluation functional at 0 appearing in the infimum in (4.2) below is meaningful. The purpose of this lemma is to provide us with “bump” functions whose Hessian–Schatten total variation is almost optimal.

Lemma 4.3

Let $p\in [1,\infty ]$. Then it holds that

$$\begin{aligned} \inf \left\{ |{\textrm{D}}^2_p f|({\mathbb {R}}^2): f\in L^1_{\textrm{loc}}({\mathbb {R}}^2)\,\text {with compact support and }f(0)=1\right\} =2^{1+1/p}\pi . \end{aligned}$$

(4.2)

In particular, thanks to (3.2), the infimum is attained by the cut cone $x\mapsto (1-|x|)_+$ when $p=1$.

Proof

For $\varepsilon \in (0,1)$, define $f_\varepsilon (x)=(1-|x|^{\varepsilon })\vee 0$. By Proposition 1.13,

$$\begin{aligned} |{\textrm{D}}^2_p f_\varepsilon |({\mathbb {R}}^2)=2\pi \bigg (\int _{0}^1 s^{\varepsilon -1}\Vert (\varepsilon (\varepsilon -1),\varepsilon )\Vert _{\ell ^p}\textrm{d}s+\varepsilon \bigg )\rightarrow 2^{1+1/p}\pi \qquad \text {as }\varepsilon \searrow 0, \end{aligned}$$

so that we have $\leqq $ in (4.2).

We now prove the opposite inequality in (4.2). Take $f\in L^1_{\textrm{loc}}({\mathbb {R}}^2)$, compactly supported, with bounded Hessian–Schatten variation and such that $f(0)=1$. We have to prove that $|{\textrm{D}}^2_p f|({\mathbb {R}}^2)\geqq 2^{1+1/p}\pi $. Using Lemma 1.9 and Lemma 1.10, we see that we can assume with no loss of generality that $f\in C^\infty _{\textrm{c}}({\mathbb {R}}^2)$ and that $f$ is radial, say $f(x)=g(|x|)$, with $g(0)=1$ and $g'_+(0)=0$. Now, by Proposition 1.7 and the inequality $|a|+|b|\leqq 2^{1-1/p}(|a|^p+|b|^p)^{1/p}$, we obtain that

$$\begin{aligned} |{\textrm{D}}^2_p f|({\mathbb {R}}^2)\geqq 2^{1/p-1}|{\textrm{D}}^2_1 f|({\mathbb {R}}^2). \end{aligned}$$

Hence, it is enough to show the claim in the case $p=1$, i.e. we have to show that $|{\textrm{D}}^2_1 f|({\mathbb {R}}^2)\geqq 4\pi $. We compute now that

$$\begin{aligned}&\int _0^\infty s|g''|\textrm{d}s\geqq \int _0^\infty sg''\textrm{d}s=-\int _0^\infty g'\textrm{d}s=1\\&\qquad \text {and}\qquad \int _0^\infty |g'|\textrm{d}s\geqq -\int _0^\infty g'\textrm{d}s=1, \end{aligned}$$

so that, by Proposition 1.13,

$$\begin{aligned} |{\textrm{D}}^2_1 f|({\mathbb {R}}^2)=2\pi \int _0^\infty \big (s|g''|+|g'|\big )\textrm{d}s\geqq 4\pi . \end{aligned}$$

$\quad \square $
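As a numerical illustration of the first half of the proof (a sketch, not part of the original argument): since $\int _0^1 s^{\varepsilon -1}\textrm{d}s=1/\varepsilon $, the displayed expression for $|{\textrm{D}}^2_p f_\varepsilon |({\mathbb {R}}^2)$ has a closed form in $\varepsilon $ and $p$, which can be compared with the value $2^{1+1/p}\pi $ in (4.2). The function names below are hypothetical and chosen only for this example.

```python
import math


def schatten_tv_power_bump(eps: float, p: float) -> float:
    """Closed form of |D^2_p f_eps|(R^2) for f_eps(x) = max(1 - |x|**eps, 0).

    Evaluates 2*pi*( ||(eps*(eps-1), eps)||_p * int_0^1 s**(eps-1) ds + eps ),
    using int_0^1 s**(eps-1) ds = 1/eps (valid for eps > 0).
    """
    a, b = abs(eps * (eps - 1.0)), abs(eps)
    if math.isinf(p):
        norm = max(a, b)
    else:
        norm = (a ** p + b ** p) ** (1.0 / p)
    return 2.0 * math.pi * (norm / eps + eps)


def infimum_value(p: float) -> float:
    """Right-hand side of (4.2): 2**(1 + 1/p) * pi, with 1/p = 0 for p = inf."""
    inv_p = 0.0 if math.isinf(p) else 1.0 / p
    return 2.0 ** (1.0 + inv_p) * math.pi


if __name__ == "__main__":
    for p in (1.0, 2.0, math.inf):
        values = [schatten_tv_power_bump(eps, p) for eps in (0.5, 0.1, 0.001)]
        print(p, values, infimum_value(p))
```

For $p=1$ the closed form equals $2\pi ((2-\varepsilon )+\varepsilon )=4\pi $ for every $\varepsilon \in (0,1]$, consistent with the cut cone ($\varepsilon =1$) attaining the infimum; for $p>1$ the values only approach $2^{1+1/p}\pi $ as $\varepsilon \searrow 0$.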

The existence of “good bump functions” granted by Lemma 4.3 allows us to prove, in Proposition 4.4 below, that for $\lambda $ large enough the infimum of $\mathcal {F}_\lambda ^{p,q}$ does not depend on $\lambda $, namely that minimizing $\mathcal {F}_\lambda ^{p,q}$ asymptotically promotes the perfect fit with the data.

Proposition 4.4

Let $p,\,q\in [1,\infty ]$ and let $\lambda \in [2\pi 2^{1/p}{ N}^{1-1/q},\infty ]$. Then

$$\begin{aligned} \inf _{f\in L^1_{\textrm{loc}}(\Omega )} \mathcal {F}^{p,q}_\lambda (f)=\inf _{f\in L^1_{\textrm{loc}}(\Omega )} \mathcal {F}_{\infty }^{p,q}(f). \end{aligned}$$

In particular, in this range of $\lambda $, the infima are also independent of q.

Proof

We take $r\in (0,\infty )$ small enough so that $\textrm{dist}(x_i,x_j)> 3 r$ if $i\ne j$. Let $\varepsilon \in (0,1)$. For $i=1,\ldots ,N$, by Lemma 4.3 and a scaling argument, we take $g_i\in C_{\textrm{c}}({\mathbb {R}}^2)$ with $g_i(x_i)=1$, ${\textrm{supp}\,}g_i\subseteq B_{r}(x_i)$ and $|{\textrm{D}}^2_p g_i|({\mathbb {R}}^2)\leqq 2^{1+1/p}\pi +\varepsilon $.

Then we consider $f\in L^1_{\textrm{loc}}(\Omega )$ and we set

$$\begin{aligned} {\tilde{f}}:=f-\sum _i (f(x_i)-y_i)g_i. \end{aligned}$$

(4.3)

Notice that ${\tilde{f}}(x_i)=y_i$ for every $i=1,\ldots ,N$ and that

$$\begin{aligned} |{\textrm{D}}^2_p {\tilde{f}}|(\Omega )&\leqq |{\textrm{D}}^2_p f|(\Omega )+(2^{1+1/p}\pi +\varepsilon ) \sum _{i=1}^N |f(x_i)-y_i|\\ {}&=|{\textrm{D}}^2_p f|(\Omega )+(2^{1+1/p}\pi +\varepsilon ) \Vert (f(x_i)-y_i)_i\Vert _{\ell ^1}\\ {}&\leqq |{\textrm{D}}^2_p f|(\Omega )+ (2^{1+1/p}\pi +\varepsilon ) N^{1-1/q}\Vert (f(x_i)-y_i)_i\Vert _{\ell ^q}. \end{aligned}$$

Therefore, since $\varepsilon \in (0,1)$ and $f\in L^1_{\textrm{loc}}(\Omega )$ are arbitrary, we have that

$$\begin{aligned} \inf _{f\in L^1_{\textrm{loc}}(\Omega )} \mathcal {F}^{p,q}_{\infty }(f)\leqq \inf _{f\in L^1_{\textrm{loc}}(\Omega )} \mathcal {F}_{\lambda }^{p,q}(f) \qquad \text {whenever }\lambda \geqq 2\pi 2^{1/p}N^{1-1/q}. \end{aligned}$$

Since also $\mathcal {F}_{\infty }^{p,q}\geqq \mathcal {F}^{p,q}_{\lambda }$ pointwise on $L^1_{\textrm{loc}}(\Omega )$, the opposite inequality between the infima is trivial, and the claim is proved for our choice of $\lambda $. $\square $
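The correction (4.3) used in this proof can be sketched numerically. The example below is only an illustration under stated assumptions: it uses rescaled cut cones as bumps $g_i$ (the choice made later in the proof of Theorem 4.2, whereas the proof above only needs almost optimal bumps from Lemma 4.3), and the test points, data and function $f$ are hypothetical. It checks that ${\tilde{f}}$ fits the data exactly and that the residual vector satisfies $\Vert \cdot \Vert _{\ell ^1}\leqq N^{1-1/q}\Vert \cdot \Vert _{\ell ^q}$.

```python
import math


def cut_cone(center, radius):
    """Rescaled cut cone x -> (1 - |x - center| / radius)_+, equal to 1 at center."""
    cx, cy = center

    def g(x):
        return max(1.0 - math.hypot(x[0] - cx, x[1] - cy) / radius, 0.0)

    return g


def perfect_fit_correction(f, points, targets, radius):
    """Return (f_tilde, residuals) with f_tilde = f - sum_i (f(x_i) - y_i) g_i."""
    bumps = [cut_cone(x_i, radius) for x_i in points]
    residuals = [f(x_i) - y_i for x_i, y_i in zip(points, targets)]

    def f_tilde(x):
        return f(x) - sum(res * g(x) for res, g in zip(residuals, bumps))

    return f_tilde, residuals


def lq_norm(v, q):
    """The l^q norm of a finite sequence, with q = inf allowed."""
    if math.isinf(q):
        return max(abs(t) for t in v)
    return sum(abs(t) ** q for t in v) ** (1.0 / q)


# Hypothetical test points x_i, data y_i and starting function f.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
targets = [1.0, -2.0, 0.5]
f = lambda x: 0.3 * x[0] - 0.1 * x[1] + 0.2

# Radius r with dist(x_i, x_j) > 3 r, so the supports B_r(x_i) are disjoint.
r = min(math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]) / 4.0
f_tilde, residuals = perfect_fit_correction(f, points, targets, r)
```

Because the supports of the bumps are pairwise disjoint, each correction term affects the value at exactly one test point, which is what makes the fit exact.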

The following lemma estimates how much the evaluation functional at $x$ differs from the average functional on $B_r(x)$, and hence allows us to quantify the error we make when replacing the evaluation functional with another functional that has the advantage of being continuous with respect to a weaker notion of convergence.

Lemma 4.5

Let $f\in L^1_{\textrm{loc}}(\Omega )$ with bounded Hessian–Schatten variation in $\Omega $. Let also $B=B_r(x)\subseteq \Omega $ such that $2 B:=B_{2r}(x)\subseteq \Omega $. Then, if $p\in [1,\infty ]$,

(4.4)

Proof

We can assume with no loss of generality that $x=0$. By approximation of r from below, we can also assume that $|{\textrm{D}}^2_1 f|(\partial B)=0$. Hence, using Proposition 1.8 and Lemma 1.10, we can assume in addition that f is radial and $f\in C^{\infty }(2B)$, say $f(\,\cdot \,)=g(|\,\cdot \,|)$. Notice that $g'_{+}(0)=0$. We then compute

so that

(4.5)

We stick for the moment to the case $p=1$. We use Proposition 1.13 to compute

$$\begin{aligned} \begin{aligned} |{\textrm{D}}^2_1 f|(2B\setminus B)&=2\pi \int _r^{2 r} s|g''|+|g'|\textrm{d}s,\\ |{\textrm{D}}^2_1 f|( B)&=2\pi \int _0^r s|g''|+|g'|\textrm{d}s \end{aligned} \end{aligned}$$

(4.6)

and we take $\xi \in (r,2 r)$ such that

$$\begin{aligned} r | g'|(\xi )\leqq \int _r^{2 r} |g'|\textrm{d}s. \end{aligned}$$

(4.7)

Now we write $\{g'>0\}\cap (0,\xi )=\bigcup _k I_k$ and $\{g'<0\}\cap (0,\xi )=\bigcup _k J_k$, where $I_k$ and $J_k$ are countably many pairwise disjoint open intervals. Notice that if $t\in \partial I_k$ for some $k$, then either $t=\xi $ or $g'(t)=0$. Then, if we take $I_k$ such that $\xi \in \partial I_k$,

$$\begin{aligned} \int _{I_k} s|g''|\textrm{d}s\geqq -\int _{I_k} s g''\textrm{d}s=\int _{I_k} g'\textrm{d}s- \xi g'(\xi )=\int _{I_k} |g'|\textrm{d}s- \xi |g'|(\xi ), \end{aligned}$$

whereas if we take $I_k$ such that $\xi \notin \partial I_k$,

$$\begin{aligned} \int _{I_k} s|g''|\textrm{d}s\geqq -\int _{I_k} s g''\textrm{d}s=\int _{I_k} g'\textrm{d}s=\int _{I_k} |g'|\textrm{d}s. \end{aligned}$$

Similar inequalities hold in the case of an interval of the type $J_k$. Therefore, summing over all intervals $I_k$ and $J_k$,

$$\begin{aligned} \int _0^{2 r}s|g''|\textrm{d}s\geqq \int _0^\xi s |g''|\textrm{d}s\geqq \int _0^\xi |g'|\textrm{d}s-\xi |g'|(\xi ), \end{aligned}$$

so that, by the choice of $\xi $ due to (4.7),

$$\begin{aligned} \int _0^r |g'|\textrm{d}s\leqq \int _0^\xi |g'|\textrm{d}s \leqq \int _0^{2r}s |g''|\textrm{d}s+\xi |g'|(\xi )\leqq \int _0^{2r}s |g''|\textrm{d}s+ 2\int _r^{2 r} |g'|\textrm{d}s. \end{aligned}$$

Then, using also (4.5) and (4.6),

whence the claim for $p=1$. For the general case, simply notice that $|{\textrm{D}}^2_1f|(B)\leqq 2^{1-1/p}|{\textrm{D}}^2_p f|(B)$ and the same holds for $2B\setminus B$, by the $\ell ^1$–$\ell ^p$ inequality and Proposition 1.7. $\quad \square $
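The one-dimensional inequality at the heart of the proof above, $\int _0^r |g'|\textrm{d}s\leqq \int _0^{2r}s |g''|\textrm{d}s+ 2\int _r^{2 r} |g'|\textrm{d}s$, can be sanity-checked numerically. The following is a minimal sketch assuming a hypothetical profile $g(s)=\cos (3s)$ (so that $g'(0)=0$) and a plain midpoint quadrature; it is a check on one example, not a proof.

```python
import math


def midpoint_integral(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def check_radial_inequality(dg, d2g, r):
    """Return (lhs, rhs) of int_0^r |g'| <= int_0^{2r} s|g''| + 2 int_r^{2r} |g'|."""
    lhs = midpoint_integral(lambda s: abs(dg(s)), 0.0, r)
    rhs = (midpoint_integral(lambda s: s * abs(d2g(s)), 0.0, 2.0 * r)
           + 2.0 * midpoint_integral(lambda s: abs(dg(s)), r, 2.0 * r))
    return lhs, rhs


# Hypothetical profile g(s) = cos(3 s): g'(s) = -3 sin(3 s), g''(s) = -9 cos(3 s).
dg = lambda s: -3.0 * math.sin(3.0 * s)
d2g = lambda s: -9.0 * math.cos(3.0 * s)
lhs, rhs = check_radial_inequality(dg, d2g, r=1.0)
```

For this profile the left-hand side can also be computed by hand as $\int _0^3|\sin u|\,\textrm{d}u=1-\cos 3$, which gives an independent check on the quadrature.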

Remark 4.6

Notice that the constant ${1}/{(4\pi )}$ in front of $|{\textrm{D}}^2_p f|(B)$ in (4.4) is, in a sense, optimal. This can be seen by considering the sequence of functions $f_\varepsilon $ used to prove Lemma 4.3.$\blacksquare $

In view of Lemma 4.5, it is not surprising that, given a weakly convergent sequence $f_k\rightharpoonup f$, in duality with the space $L^\infty _{\textrm{c}}(\Omega )$ of $L^\infty $ functions with compact (essential) support, we can estimate how much the evaluation functional at $x$ fails to converge in terms of the concentration of Hessian–Schatten total variation at $x$.

Lemma 4.7

Let $f\in L^1_{\textrm{loc}}(\Omega )$ and let $(f_k)\subseteq L^1_{\textrm{loc}}(\Omega )$ such that $f_k\rightharpoonup f$ in duality with $L^\infty _{\textrm{c}}(\Omega )$ with $\sup _k|{\textrm{D}}^2_p f_k|(A)<\infty $ for any open set $A\Subset \Omega $. Then, f has locally bounded Hessian–Schatten variation in $\Omega $ and for any $x\in \Omega $ one has

$$\begin{aligned} \limsup _k |f(x)-f_k(x)|\leqq \frac{2^{1-1/p}}{4\pi }\lim _{r\searrow 0}\limsup _k |{\textrm{D}}^2_p f_k|(B_{r}(x)). \end{aligned}$$

(4.8)

Proof

First, take a non relabelled subsequence so that $\lim _k |f(x)-f_k(x)|$ exists and equals the $\limsup _k$ at the left hand side of (4.8).

We assume that there exists $r_1>0$ small enough so that $B_{r_1}(x)\subseteq \Omega $ and moreover that $\limsup _k|{\textrm{D}}^2 f_k|(B_{r_1}(x))<\infty $, otherwise there is nothing to show. By lower semicontinuity this implies that f has bounded Hessian–Schatten variation in $B_{r_1}(x)$. We extract a further non relabelled subsequence such that, for some finite measure $\mu $ on $B_{r_1}(x)$, $|{\textrm{D}}^2_p f_k|\rightharpoonup \mu $ in duality with $C_{\textrm{c}}(B_{r_1}(x))$.

Let now $r\in (0,r_1/2)$. Then,

Now notice that by continuity of f the first summand converges to 0 as $r\searrow 0$, whereas, by the convergence assumption the second summand converges to 0 as $k\rightarrow \infty $. Also, by Lemma 4.5, we bound the third summand as follows

To conclude, it is enough to notice that

$$\begin{aligned} \limsup _{r\searrow 0}\limsup _k|{\textrm{D}}^2_p f_k | (B_{2 r}(x)\setminus B_r(x))\leqq \lim _{r\searrow 0} \mu (\bar{B}_{2 r}(x)\setminus B_r(x))=0. \end{aligned}$$

$\quad \square $

By using the results above, we can prove the lower semicontinuity of $\mathcal {F}_\lambda ^{p,q}$. In the case $q=1$, notice that the argument used in the proof of Proposition 4.4 together with the next result can be used to show that $\mathcal {F}^{p,1}_\lambda $ is precisely the relaxed functional of $\mathcal {F}_\infty ^{p,1}$ when $\lambda =2^{1+1/p}\pi $.

Lemma 4.8

Let $p,\,q\in [1,\infty ]$ and let $\lambda \in [0,2^{1/p-1} 4\pi ]$. Then $\mathcal {F}_\lambda ^{p,q}$ is lower semicontinuous with respect to weak convergence in duality with $L^\infty _{\textrm{c}}(\Omega )$.

Proof

Let $(f_k)\subseteq L^1_{\textrm{loc}}(\Omega )$ be such that $f_k\rightharpoonup f$ in duality with $L^\infty _{\textrm{c}}(\Omega )$, for some $f\in L^1_{\textrm{loc}}(\Omega )$. We have to prove that

$$\begin{aligned} \mathcal {F}_\lambda ^{p,q}(f)\leqq \liminf _k \mathcal {F}_\lambda ^{p,q}(f_k). \end{aligned}$$

First, extract a non relabelled subsequence such that $\mathcal {F}_\lambda ^{p,q}(f_k)$ has a limit, as $k\rightarrow \infty $, which equals the right hand side of the inequality above. Then, we can assume that $\liminf _k |{\textrm{D}}^2_p f_k|(\Omega )<\infty $, otherwise there is nothing to show. Hence f has bounded Hessian–Schatten variation in $\Omega $ and, up to the extraction of a non relabelled subsequence, we can assume that $|{\textrm{D}}^2_p f_k|\rightharpoonup \mu $ in duality with $C_{\textrm{c}}(\Omega )$ for some finite measure $\mu $ on $\Omega $. Even though $\mu $ depends on p, we do not make this dependence explicit. Also, we extract a non relabelled subsequence such that for every $i=1,\ldots ,N$, $|f(x_i)-f_k(x_i)|$ has a (finite) limit as $k\rightarrow \infty $.

Notice that for every $z\in \Omega $ one has

$$\begin{aligned} \mu (\{z\})\leqq \lim _{r\searrow 0}\limsup _k |{\textrm{D}}_p^2 f_k|(\bar{B}_r(z)){\leqq } \lim _{r\searrow 0}\mu (\bar{B}_r(z))=\mu (\{z\}). \end{aligned}$$

(4.9)

We compute, as $|{\textrm{D}}^2_p f|(\{z\})=0$ for every $z\in \Omega $,

$$\begin{aligned}{} & {} \mathcal {F}_\lambda ^{p,q}(f)=|{\textrm{D}}^2_p f|(\Omega )+\lambda \Vert (f(x_i)-y_i)_i\Vert _{\ell ^q}\nonumber \\{} & {} \quad =\lim _{r\searrow 0}|{\textrm{D}}^2_p f|\bigg (\Omega \setminus \bigcup _{i=1}^N\bar{B}_{r}(x_i)\bigg )+\lambda \Vert (f(x_i)-y_i)_i\Vert _{\ell ^q}. \end{aligned}$$

(4.10)

By lower semicontinuity,

$$\begin{aligned} |{\textrm{D}}^2_p f|\bigg (\Omega \setminus \bigcup _{i=1}^N\bar{B}_{r}(x_i)\bigg )\leqq \liminf _k |{\textrm{D}}^2_p f_k|\bigg (\Omega \setminus \bigcup _{i=1}^N\bar{B}_{r}(x_i)\bigg ) \end{aligned}$$

so that by (4.9)

$$\begin{aligned} \lim _{r\searrow 0}|{\textrm{D}}^2_p f|\bigg (\Omega \setminus \bigcup _{i=1}^N\bar{B}_{r}(x_i)\bigg )\leqq \liminf _k |{\textrm{D}}^2_p f_k|(\Omega )-\sum _{i=1}^N\mu (\{x_i\}). \end{aligned}$$

(4.11)

Also, by Lemma 4.7 and (4.9),

$$\begin{aligned} \lim _k \Vert (f(x_i)-f_k(x_i))_i\Vert _{\ell ^q}\leqq \lim _k \Vert (f(x_i)-f_k(x_i))_i\Vert _{\ell ^1}\leqq \frac{2^{1-1/p}}{4\pi } \sum _{i=1}^N \mu (\{x_i\}), \end{aligned}$$

so that

$$\begin{aligned} \Vert (f(x_i)-y_i)_i\Vert _{\ell ^q}\leqq \frac{2^{1-1/p}}{4\pi } \sum _i \mu (\{x_i\})+\liminf _k \Vert (f_k(x_i)-y_i)_i\Vert _{\ell ^q}.\qquad \end{aligned}$$

(4.12)

Inserting (4.11) and (4.12) into (4.10) we obtain, by the superadditivity of the $\liminf $,

$$\begin{aligned} \mathcal {F}_\lambda ^{p,q}(f)\leqq \liminf _k \mathcal {F}_\lambda ^{p,q}(f_k)+\bigg (\lambda \frac{2^{1-1/p}}{4\pi }-1\bigg ) \sum _i \mu (\{x_i\}), \end{aligned}$$

whence the claim by the choice of $\lambda $. $\quad \square $
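The interplay between the range of $\lambda $ in Lemma 4.8 and the range in Proposition 4.4 is plain arithmetic and can be made explicit. The sketch below (hypothetical function names, standard library only) computes the upper bound $2^{1/p-1}4\pi $ for lower semicontinuity, the lower bound $2\pi 2^{1/p}N^{1-1/q}$ for the perfect-fit regime, and the coefficient $\lambda \frac{2^{1-1/p}}{4\pi }-1$ multiplying $\sum _i\mu (\{x_i\})$ in the last display of the proof.

```python
import math


def _inv(t: float) -> float:
    """1/t with the convention 1/inf = 0."""
    return 0.0 if math.isinf(t) else 1.0 / t


def lsc_threshold(p: float) -> float:
    """Upper end of the lambda range in Lemma 4.8: 2**(1/p - 1) * 4 * pi."""
    return 2.0 ** (_inv(p) - 1.0) * 4.0 * math.pi


def perfect_fit_threshold(p: float, q: float, n_points: int) -> float:
    """Lower end of the lambda range in Proposition 4.4: 2*pi*2**(1/p)*N**(1-1/q)."""
    return 2.0 * math.pi * 2.0 ** _inv(p) * n_points ** (1.0 - _inv(q))


def concentration_coefficient(lam: float, p: float) -> float:
    """Coefficient lam * 2**(1 - 1/p) / (4*pi) - 1 in front of sum_i mu({x_i})."""
    return lam * 2.0 ** (1.0 - _inv(p)) / (4.0 * math.pi) - 1.0


if __name__ == "__main__":
    for p in (1.0, 2.0, math.inf):
        print(p, lsc_threshold(p), perfect_fit_threshold(p, 1.0, 5),
              perfect_fit_threshold(p, 2.0, 5))
```

The two thresholds coincide when $q=1$ (both equal $2^{1+1/p}\pi $, that is $4\pi $ for $p=1$), whereas for $q>1$ and $N\geqq 2$ the perfect-fit threshold is strictly larger, leaving a gap between the two ranges.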

Weak relative compactness of minimizing sequences for $\mathcal {F}_\lambda ^{p,q}$ is obtained through a classical argument; the only (slight) technical difficulty lies in possibly irregular domains $\Omega $.

Lemma 4.9

Let $p,\,q\in [1,\infty ]$ and let $\lambda \in [0,\infty ]$. Then there exist a minimizing sequence $(f_k)$ for $\mathcal {F}_\lambda ^{p,q}$ and a function $f\in L^1_{\textrm{loc}}(\Omega )$ such that $f_k\rightharpoonup f$ in duality with $L^\infty _{\textrm{c}}(\Omega )$.

Proof

We assume $\lambda >0$, the case $\lambda =0$ being trivial. We also assume that $\Omega $ is connected, as we can do the modifications independently in each connected component of $\Omega $. Let now $(f_k)\subseteq L^1_{\textrm{loc}}(\Omega )$ be a minimizing sequence for $\mathcal {F}_\lambda ^{p,q}$. In particular, the sequence $(|{\textrm{D}}^2 f_k|(\Omega ))$ is bounded as well as the sequence $(|f_k(x_i)|)$, for every $i=1,\ldots ,N$. Now we are going to modify $(f_k)$ to obtain a new sequence $(\tilde{f}_k)\subseteq L^1_{\textrm{loc}}(\Omega )$ that is still minimizing and is locally uniformly bounded in $\Omega $.

There are two cases to be considered:

(a)
$N\geqq 3$ and there are three points $x_{i_1},x_{i_2},x_{i_3}\in \{x_1,\ldots , x_N\}$ such that $x_{i_2}-x_{i_1}$ and $x_{i_3}-x_{i_1}$ are linearly independent.
(b)
either $N=0$ or all the points $x_i$ are on a line $\{t v+c:t\in {\mathbb {R}}\} \subseteq {\mathbb {R}}^2$, for some $v\in {\mathbb {R}}^2{\setminus }\{0\}$ and $c\in {\mathbb {R}}^2$.

We treat the two cases separately.

Case (a). In this case no modification is needed: indeed, we show that $(f_k)$ is locally uniformly bounded in $\Omega $. Take a compact set $K\subseteq \Omega $. For $\varepsilon :=\frac{1}{2} \textrm{dist}(K,\partial \Omega )$ we select points $q_0, q_1, \ldots , q_M\in K$ such that $K\subseteq \cup _j B_\varepsilon (q_j)$, then curves $\gamma _j\subseteq \Omega $ joining $q_j$ to $q_0$, and finally curves $\hat{\gamma }_i\subseteq \Omega $ joining $x_i$ to $q_0$. Let

$$\begin{aligned} K':=\bigcup _{j=0}^M \overline{B}_\varepsilon (q_j) \cup \bigcup _{j=1}^M \gamma _j \cup \bigcup _{i=1}^N \hat{\gamma }_i. \end{aligned}$$

Then $\cup _i\{x_i\}\cup K\subseteq K'\subseteq \Omega $, and $K'$ is compact and connected. Therefore, to prove uniform boundedness of $(f_k)$ on K, we can assume with no loss of generality that all points $x_i$ belong to K and that K is connected.

Now we take $\delta \in (0,1)$ small enough so that $\Omega ':={B_{{2}\delta }}(K)$ satisfies $\overline{\Omega '}\subseteq \Omega $. Hence $\Omega '$ is a connected domain. We now show that $\Omega '$ is a (bounded) John domain; it then follows that $\Omega '$ satisfies Poincaré inequalities, by [7, Lemma 3.1 and Theorem 5.1] and the trivial inequality

that holds for every $f\in L^1(\Omega ')$ and $q\in [1,\infty )$. Fix any $p_0\in K$. We have to show that there exist $0<\alpha \leqq \beta $ such that for every $p\in \Omega '$, there exists a rectifiable curve $\gamma :[0,l(\gamma )]\rightarrow \Omega '$, parametrized by arc length, joining p to $p_0$ and such that $l(\gamma )\leqq \beta $ and

$$\begin{aligned} \textrm{dist}(\gamma (t),\partial \Omega ')\geqq \frac{\alpha t}{l(\gamma )}\qquad \text {for every }t\in [0,l(\gamma )]. \end{aligned}$$

(4.13)

To prove this, notice first that there exists $\beta '>0$ such that for every $p\in K$, there exists a rectifiable curve $\gamma $, parametrized by arc length, joining p to $p_0$, with image contained in $B_{\delta }(K)\subseteq \Omega '$ and length bounded by $\beta '$. This follows from the connectedness of $B_\delta (K)$ and the compactness of K (simply take a finite covering of K by balls of radius $\delta $ and centre in K and consider the rectifiable curves with image in $B_{\delta }(K)$ joining the centres of these balls); also, $\gamma $ satisfies (4.13) with $\alpha :=\delta $. Then the claim for arbitrary $p\in \Omega '$ follows: indeed, for any $p\in \Omega '{\setminus } K$, we have $p\in B_{2\delta }(q)$ for some $q\in K$; we then join the radial curve connecting p to q to the curve connecting q to $p_0$ obtained as before, so that $l(\gamma )\leqq {2}\delta + \beta '=:\beta $ and moreover $\gamma $ still satisfies (4.13) (with $\alpha =\delta $ as before): indeed, for $t\in [0,|p-q|]$,

$$\begin{aligned} \textrm{dist}(\gamma (t),\partial \Omega ')\geqq {2}\delta -|p-q|+t\geqq {2}\delta \frac{t}{|p-q|}\geqq {2}\delta \frac{t}{l(\gamma )}, \end{aligned}$$

whereas for $t\in [|p-q|,l(\gamma )]$, (4.13) follows as before.

Take also $\psi \in C_{\textrm{c}}^\infty ({\mathbb {R}}^2)$ such that ${\textrm{supp}\,}\psi \subseteq \Omega '$ and $\psi =1$ on a neighbourhood of K. By Proposition 1.11 and standard calculus rules, the sequence $(|{\textrm{D}}^2 (\psi {\hat{f}}_k)|({\mathbb {R}}^2))$ is bounded, where ${\hat{f}}_k=f_k-g_k$ with $g_k$ a suitable affine perturbation. Therefore, by [11, Proposition 3.1] and the compactness of the support of $\psi \hat{f}_k$, we have that $\psi \hat{f}_k$ are uniformly bounded in $L^\infty ({\mathbb {R}}^2)$; in particular, $\hat{f}_k$ are uniformly bounded in $L^\infty (K)$. Now, as $|g_k(x_i)|=|\hat{f}_k(x_i)-f_k(x_i)|$ are bounded for every $i=i_1,\,i_2,\,i_3$, it is easy to infer, by the assumption in (a), that the perturbations $g_k$ are uniformly bounded on K. Hence $\Vert f_k\Vert _{L^\infty (K)}$ is bounded and, since K is arbitrary, the claim follows by weak compactness.

Case (b). If $N\leqq 2$, there is an affine function $f_*$ with $f_*(x_i)=y_i$ for all i, and therefore $\mathcal {F}_\lambda ^{p,q}(f_*)=0$. We can therefore assume $N\geqq 3$. Let $v^\perp $ be a unit vector orthogonal to v, and choose $\varepsilon \in (0,1)$ sufficiently small that $x_0:=x_1+\varepsilon v^\perp \in \Omega $. Define

$$\begin{aligned} \tilde{f}_k(x):=f_k(x)-\frac{1}{\varepsilon }f_k(x_0) (x-x_1)\cdot v^\perp . \end{aligned}$$

As $\mathcal {F}_\lambda ^{p,q}(\tilde{f}_k)= \mathcal {F}_\lambda ^{p,q}(f_k)$, this is also a minimizing sequence, with the additional property that $\tilde{f}_k(x_0)=0$ for all k. The conclusion follows then from the argument of the previous case. $\quad \square $
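The normalization used in Case (b) can be illustrated with a short sketch. The data below are hypothetical (a sample function $f$ standing in for $f_k$, and three collinear points with direction $v=(3/5,4/5)$); the code checks the two facts used in the proof: the modified function vanishes at $x_0=x_1+\varepsilon v^\perp $, and its values at the collinear test points are unchanged, because the subtracted term is linear and vanishes on the line.

```python
import math


def normalize_off_line(f, x1, v_perp, eps):
    """Return (f_tilde, x0) with f_tilde(x) = f(x) - f(x0)/eps * (x - x1) . v_perp.

    Here x0 = x1 + eps * v_perp and v_perp is assumed to be a unit vector.
    """
    x0 = (x1[0] + eps * v_perp[0], x1[1] + eps * v_perp[1])
    slope = f(x0) / eps

    def f_tilde(x):
        offset = (x[0] - x1[0]) * v_perp[0] + (x[1] - x1[1]) * v_perp[1]
        return f(x) - slope * offset

    return f_tilde, x0


# Hypothetical collinear test points x_i = x_1 + t_i * v, with v = (3/5, 4/5).
v = (0.6, 0.8)
v_perp = (-0.8, 0.6)  # unit vector orthogonal to v
x1 = (0.5, -1.0)
points = [(x1[0] + t * v[0], x1[1] + t * v[1]) for t in (0.0, 1.0, 2.5)]

# Arbitrary sample function playing the role of f_k.
f = lambda x: math.sin(x[0]) + x[1] ** 2 + 1.0
f_tilde, x0 = normalize_off_line(f, x1, v_perp, eps=0.1)
```

Since only a linear function is subtracted, the Hessian (and hence the Hessian–Schatten total variation) is unaffected, which is why the modified sequence remains minimizing.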

5.2 Proof of the main results

Having proved the results in Section 5.1, Theorems 4.1 and 4.2 follow in an immediate, classical way.

Proof of Theorem 4.1

The statement is proved by the direct method of the calculus of variations, using Lemma 4.8 and Lemma 4.9. $\quad \square $

Proof of Theorem 4.2

Let $\lambda _c{:=}4\pi $. We argue as in Proposition 4.4, starting from a minimizer f of $\mathcal {F}^{1,1}_{\lambda _c}$ granted by Theorem 4.1. We modify f by subtracting $\sum _i (f(x_i)-y_i)g_i$, where this time the $g_i$ are rescaled cut cones (see (4.3)), in such a way that

$$\begin{aligned} \tilde{f}:=f-\sum _i (f(x_i)-y_i)g_i \end{aligned}$$

has a perfect fit with the data. Since $|{\textrm{D}}^2_1 g_i|({\mathbb {R}}^2)=4\pi $ (recall e.g. Lemma 4.3), one has

$$\begin{aligned} \mathcal {F}^{1,1}_\infty ({\tilde{f}})\leqq |{\textrm{D}}^2_1\tilde{f}|(\Omega )\leqq |{\textrm{D}}^2_1 f|(\Omega )+\sum _i |{\textrm{D}}^2_1 g_i|({\mathbb {R}}^2)|f(x_i)-y_i|\\ =\mathcal {F}^{1,1}_{\lambda _c}(f). \end{aligned}$$

This, taking the inequality $\mathcal {F}^{1,1}_\lambda \leqq \mathcal {F}^{1,1}_\infty $ into account, proves that $\tilde{f}$ is a minimizer of $\mathcal {F}^{1,1}_\lambda $ for any $\lambda \geqq \lambda _c$. $\quad \square $

Data availability

The only data relevant to this paper are contained in the paper itself.

References

1. Alberti, G.: Rank one property for derivatives of functions with bounded variation. Proc. R. Soc. Edinb. Sect. A Math. 123(2), 239–274, 1993
2. Ambrosio, L., Aziznejad, S., Brena, C., Unser, M.: Linear inverse problems with Hessian–Schatten total variation. Preprint arXiv:2210.04077 (2022)
3. Ambrosio, L., Fusco, N., Pallara, D.: Functions of Bounded Variation and Free Discontinuity Problems. Clarendon Press, Oxford (2000)
4. Arora, R., Basu, A., Mianjy, P., Mukherjee, A.: Understanding deep neural networks with rectified linear units. arXiv preprint arXiv:1611.01491 (2016)
5. Aziznejad, S., Campos, J., Unser, M.: Measuring complexity of learning schemes using Hessian–Schatten total variation. arXiv preprint arXiv:2112.06209 (2021)
6. Bergounioux, M., Piffet, L.: A second-order model for image denoising. Set-Valued Var. Anal. 18(3–4), 277–306, 2010
7. Bojarski, B.: Remarks on Sobolev imbedding inequalities. In: Laine, I., Sorvali, T., Rickman, S. (Eds.) Complex Analysis Joensuu 1987, pp. 52–68. Springer, Berlin (1988)
8. Boyer, C., Chambolle, A., De Castro, Y., Duval, V., De Gournay, F., Weiss, P.: On representer theorems and convex regularization. SIAM J. Optim. 29(2), 1260–1281, 2019
9. Bredies, K., Carioni, M.: Sparsity of solutions for variational inverse problems with finite-dimensional data. Calc. Var. Part. Differ. Equ. 59(1), 1–26, 2020
10. Campos, J., Aziznejad, S., Unser, M.: Learning of continuous and piecewise-linear functions with Hessian total-variation regularization. IEEE Open J. Signal Process. 3, 36–48, 2021
11. Demengel, F.: Fonctions à hessien borné. Annales de l’Institut Fourier 34(2), 155–190, 1984
12. De Giorgi, E., Letta, G.: Une notion générale de convergence faible pour des fonctions croissantes d’ensemble. Annali della Scuola Normale Superiore di Pisa - Classe di Scienze, 4e série 4(1), 61–99 (1977)
13. Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. CRC Press, Boca Raton (2015)
14. Hinterberger, W., Scherzer, O.: Variational methods on the space of functions of bounded Hessian for convexification and denoising. Computing 76(1–2), 109–133, 2006
15. Knoll, F., Bredies, K., Pock, T., Stollberger, R.: Second order total generalized variation (TGV) for MRI. Magn. Reson. Med. 65(2), 480–491, 2011
16. Lefkimmiatis, S., Unser, M.: Poisson image reconstruction with Hessian–Schatten-norm regularization. IEEE Trans. Image Process. 22(11), 4314–4327, 2013
17. Lefkimmiatis, S., Ward, J.P., Unser, M.: Hessian–Schatten-norm regularization for linear inverse problems. IEEE Trans. Image Process. 22(5), 1873–1888, 2013
18. Pourya, M., Goujon, A., Unser, M.: Delaunay-triangulation-based learning with Hessian total-variation regularization. arXiv preprint arXiv:2208.07787 (2022)
19. Unser, M.: A unifying representer theorem for inverse problems and machine learning. Found. Comput. Math. 21(4), 941–960, 2021
20. Unser, M., Aziznejad, S.: Convex optimization in sums of Banach spaces. Appl. Comput. Harmon. Anal. 56, 1–25, 2022

Acknowledgements

The first two authors wish to thank Shayan Aziznejad, Michele Benzi and Michael Unser for inspiring conversations around the topic of this paper and acknowledge the support of PRIN MIUR project “Gradient flows, Optimal Transport and Metric Measure Structures” and the Balzan project led by the first author. The third author wishes to thank Matteo Focardi and Flaviana Iurlano for interesting discussions on Section 3 and acknowledges the support of the Deutsche Forschungsgemeinschaft through project 211504053/SFB1060. The authors wish to thank Gian Paolo Leonardi for comments leading to the investigation contained in Remark 2.6 and the referee for useful comments. Funding was provided by Ministero dell’Istruzione, dell’Università e della Ricerca (Grant No. 2017TEXA3H).

Funding

Open access funding provided by Scuola Normale Superiore within the CRUI-CARE Agreement.

Author information

Authors and Affiliations

Scuola Normale Superiore, Piazza dei Cavalieri 7, 56126, Pisa, Italy
Luigi Ambrosio & Camillo Brena
Institut für Angewandte Mathematik, Universität Bonn, 53115, Bonn, Germany
Sergio Conti

Corresponding author

Correspondence to Luigi Ambrosio.

Additional information

Communicated by G. Dal Maso.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ambrosio, L., Brena, C. & Conti, S. Functions with Bounded Hessian–Schatten Variation: Density, Variational, and Extremality Properties. Arch Rational Mech Anal 247, 111 (2023). https://doi.org/10.1007/s00205-023-01938-w

Received: 03 March 2023
Accepted: 11 October 2023
Published: 20 November 2023
DOI: https://doi.org/10.1007/s00205-023-01938-w
