1 Introduction

The study of the optimal transport problem:

$$\begin{aligned} \min _{\pi \in \Pi (\lambda ,\mu )}\int _{{\mathbb {R}}^d\times {\mathbb {R}}^d} c(x-y)\,{\textrm{d}}\pi \end{aligned}$$
(1.1)

is well established. Here \(\lambda ,\mu \) are two (finite) non-negative measures on \({\mathbb {R}}^d\) satisfying \(\lambda ({\mathbb {R}}^d)=\mu ({\mathbb {R}}^d)\). We refer the reader to [26] and [23] for an introduction and overview of the literature. When solutions take the form of a transport map, \(\pi = (\textrm{Id} \times T)_\#\lambda \), minimisers are characterised, under mild assumptions, by the Euler-Lagrange equation

$$\begin{aligned} \mu (T(x)) \textrm{det}(\textrm{D}T(x))=\lambda (x) \end{aligned}$$
(1.2)

as well as the additional structure condition

$$\begin{aligned} T(x)=x+ \nabla c^*(\textrm{D}\phi ), \end{aligned}$$
(1.3)

where \(\phi \) is a c-concave function and \(c^*\) denotes the convex conjugate of c. Assuming \(\mu \sim \lambda \sim 1\) and linearising the geometric nonlinearity in (1.2), that is, formally expanding \(\textrm{det}(\textrm{Id}+A)=1+\textrm{tr}\,A+\ldots \), we find that

$$\begin{aligned} \textrm{div}\,\nabla c^{*}(\textrm{D}\phi )=\lambda -\mu . \end{aligned}$$
(1.4)
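The expansion leading to (1.4) can be recorded explicitly. This is a purely formal sketch: writing \(T(x)=x+\nabla c^*(\textrm{D}\phi )\) as in (1.3) and using \(\mu \circ T\approx \mu \) (as T is close to the identity), the first-order expansion of (1.2) reads

$$\begin{aligned} \lambda&=\mu (T)\,\textrm{det}(\textrm{D}T)=\mu (T)\,\textrm{det}\left( \textrm{Id}+\textrm{D}\nabla c^*(\textrm{D}\phi )\right) \\&\approx \big (1+(\mu -1)\big )\big (1+\textrm{tr}\,\textrm{D}\nabla c^*(\textrm{D}\phi )\big )\approx 1+(\mu -1)+\textrm{div}\,\nabla c^*(\textrm{D}\phi ), \end{aligned}$$

and discarding all quadratic terms yields exactly (1.4).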

Thus, at least formally, we expect solutions of (1.1) to be well approximated by solutions of (1.4). Note that in general (1.4) is still a nonlinear equation; we nevertheless refer to the passage from (1.1) to (1.4) as geometric linearisation, since it is the geometric nonlinearity \(\textrm{det}\,\textrm{D}T\) that has been linearised. The aim of this paper is to make this connection rigorous. We show the following:

Theorem 1.1

Let \(1<p<\infty \). Suppose \(c:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) is a strongly p-convex cost function of controlled-duality p-growth, that is satisfying (1.11)–(1.14). Let \(\pi \) be a minimiser of (1.1) for some non-negative measures \(\lambda \), \(\mu \) satisfying \(\lambda ({\mathbb {R}}^d)=\mu ({\mathbb {R}}^d)\). Denote

$$\begin{aligned} E(R)&:= \frac{1}{|B_{R}|}\int _{(B_{R}\times {\mathbb {R}}^d)\cup ({\mathbb {R}}^d\times B_{R})}c(x-y)\,{\textrm{d}}\pi \\ D(R)&:= \frac{1}{|B_{R}|} W_{p}^p(\lambda \llcorner B_R,\kappa _{\lambda ,R}\,{\textrm{d}}x\llcorner B_R)+\frac{R^p}{\kappa _{\lambda ,R}^{p-1}}(\kappa _{\lambda ,R}-1)^p\\&\quad +\frac{1}{|B_{R}|} W_{p}^p(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)+\frac{R^p}{\kappa _{\mu ,R}^{p-1}}(\kappa _{\mu ,R}-1)^p. \end{aligned}$$

Here \(\kappa _{\lambda ,R} = \frac{\lambda (B_R)}{|B_R|}\) and \(\kappa _{\mu ,R} = \frac{\mu (B_R)}{|B_R|}\). Then, for every \(\tau >0\), there exists \(\varepsilon (\tau )>0\) such that if \({E(4)+D(4)\le \varepsilon }\), then there exist a radius \(R\in (2,3)\), a constant \(c\in {\mathbb {R}}\) and \(\phi \) satisfying

$$\begin{aligned} -\textrm{div}\,\nabla c^*(\textrm{D}\phi ) = c \text { in } B_R \end{aligned}$$

such that

$$\begin{aligned} \int _{(B_1\times {\mathbb {R}}^d)\cup ({\mathbb {R}}^d\times B_1)}c(x-y-\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}\pi \lesssim \tau E(4)+ D(4). \end{aligned}$$
(1.5)

Moreover

$$\begin{aligned} \sup _{B_1}|\textrm{D}\phi |^{p'}+\int _{B_R}|\textrm{D}\phi |^{p'}\,{\textrm{d}}x\lesssim E(4)+D(4). \end{aligned}$$

Our assumptions on the cost function are explained in detail in Sect. 1.1. Theorem 1.1 states that if at some scale the local transportation cost E(R) (and the data term D(R)) is small, then at a smaller scale the transport plan is well approximated by \(\nabla c^*(\textrm{D}\phi )\), in the sense that estimate (1.5) holds. Note in particular that as a consequence of (1.5) and (1.13), also

$$\begin{aligned} \int _{(B_1\times {\mathbb {R}}^d)\cup ({\mathbb {R}}^d\times B_1)}|x-y-\nabla c^*(\textrm{D}\phi )|^p \,{\textrm{d}}\pi \lesssim \tau E(4)+D(4). \end{aligned}$$

Thus Theorem 1.1 makes the intuition leading to (1.4) rigorous.

Traditionally, (1.1) has been approached via the study of (1.2) using the theory of fully nonlinear elliptic equations developed by Caffarelli, see e.g. [1, 2] and the references therein. Recently, an alternative approach using variational techniques has been developed by Goldman and Otto in [9]. There, partial \(C^{1,\alpha }\)-regularity for solutions to (1.1) in the case of Hölder-regular densities \(\lambda \), \(\mu \) and quadratic cost function \(c(x-y)=\frac{1}{2} |x-y|^2\) was proven. The key tool in the proof was a version of Theorem 1.1 in this setting. In later papers, continuous densities [6], rougher measures [8], more general cost functions (albeit still close to the quadratic cost functional) [21], as well as almost-minimisers of the quadratic cost functional [21] were considered. The quadratic version of Theorem 1.1 was also used to provide a more refined linearisation result of (1.2) in the quadratic set-up in [8] and of a similar statement in the context of optimal matching in [7]. Finally, quadratic versions of Theorem 1.1 played a key role in disproving the existence of a stationary cyclically monotone Poisson matching in 2-d [14].

We remark that very little is known about the regularity of minimisers of (1.1) already in the simplest degenerate/singular case \(c(x-y)=\frac{|x-y|^p}{p}\). In order to extend the techniques of [9] to this setting, an essential first step is Theorem 1.1. This result will also play a key role in extending the results of [14] to p-costs with \(p\ne 2\) [15].

The strategy of proof is similar to that used in [8], albeit with a number of simplifications. Further, we point the reader to [16], where a detailed account of the proof of Theorem 1.1 and the motivation behind the strategy is given in the quadratic case. The heart of the proof is contained in Sect. 7. The key insight, Lemma 7.2, is a consequence of the strong p-convexity, which allows us to estimate (up to error terms) the left-hand side of the main estimate in Theorem 1.1 by a sum of two terms: on the one hand, the difference between the local transportation energy of \(\pi \) and a dual form of the energy of \(\phi \):

$$\begin{aligned} \int _{(B_R\times {\mathbb {R}}^d)\cup ({\mathbb {R}}^d\times B_R)} c(x-y)\,{\textrm{d}}\pi -\int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x, \end{aligned}$$
(1.6)

and on the other hand

$$\begin{aligned} \int _{\{\exists t:X(t)\in B_R\}}\int _{\tau }^{\sigma }\langle y-x-\nabla c^*(\textrm{D}\phi (X(t))),\textrm{D}\phi (X(t))\rangle \,{\textrm{d}}t \,{\textrm{d}}\pi . \end{aligned}$$
(1.7)

Here we write \(X(t)=(1-t)x+ty\) and set \({\tau = \inf \{t\in [0,1]:X(t)\in B_R\}}\), as well as \(\sigma = \sup \{t\in [0,1]:X(t)\in B_R\}\). Lemma 7.2 replaces the quasi-orthogonality property, which was employed in the quadratic case. The quasi-orthogonality property relied on expanding the squares, which is not an available tool if \(p\ne 2\).

From formal calculations, which we make rigorous in Lemma 7.5, (1.7) will be small if \(\phi \) solves the Neumann problem

$$\begin{aligned}&-\textrm{div}\,\nabla c^{*}(\textrm{D}\phi )=\mu -\lambda \quad \text { in } B_R\end{aligned}$$
(1.8)
$$\begin{aligned}&\nabla c^*(\textrm{D}\phi )\cdot \nu = g_R-f_R \quad \text { on } \partial B_R, \end{aligned}$$
(1.9)

where \(f_R,g_R\) are functions tracking the location of \(X(\tau )\) and \(X(\sigma )\), respectively. We formally define \(f_R,g_R\) in (2.3). However, as written, the problem is not well-posed and solutions do not possess sufficient regularity to carry out the necessary estimates. Hence, we will actually work with approximations of \(f_R\) and \(g_R\), which we construct in Sect. 5. Controlling the error made in this approximation will require us to enlarge the domain on which we work and to choose a suitable radius \(R\in (2,3)\). This explains the presence of R in the formulation of Theorem 1.1. The estimate of (1.7) is finally carried out in Lemma 7.5 in Sect. 7.

The idea in estimating (1.6) is to relate the first term with the value of a localised optimal transportation problem. Then an appropriate competitor can be constructed using \(\phi \) in order to estimate (1.6). We carry out the first step in Sect. 4, while the second is obtained in Lemma 7.4 in Sect. 7.

In order to carry out the estimates of both (1.6) and (1.7), two ingredients are essential. We require elliptic regularity estimates, which follow from the strict \(p^\prime \)-convexity of \(c^*\). We explain how to obtain these in Sect. 1.2. In the quadratic case, the relevant equation is the Laplace equation. Solutions are then harmonic and thus very regular; the proof in [8] requires \(C^3\)-regularity of solutions! Already in the case \(c(x-y)=\frac{|x-y|^p}{p}\) with \(p\ne 2\), the best regularity known for solutions of (1.4) in general is \(C^{1,\beta }\)-regularity for some \(\beta >0\) [18, 24, 25]. Thus, at various places in the proof, more careful estimates are needed.

The second ingredient is an \(L^\infty \)-bound for minimisers of (1.1) in the small-energy regime, see Sect. 3. In the quadratic case, this relies on the monotonicity (in the classical sense) of solutions. In the non-quadratic case, c-monotonicity needs to be used directly. Focusing on p-homogeneous convex cost functions with \(p>1\), \(L^\infty \)-bounds were obtained in [10]. Note that [10] obtained \(L^\infty \)-bounds in all energy regimes, whereas the \(L^\infty \)-bounds obtained in this paper only cover the small-energy regime. A further difference is that in this paper we obtain the \(L^\infty \)-bound as a consequence of the strong p-convexity of the energy, whereas [10] relies on the homogeneity of the cost. Nevertheless, in the small-energy regime and for cost functions covered by the assumptions both of this paper and of [10], the obtained bounds agree. In particular, this is the case for the important example of the p-cost \(c(x-y)=\frac{1}{p}|x-y|^p\) with \(p>1\).

Finally, we comment on why we restrict our attention to cost functions of the form \(c(x-y)\). This is due to the fact that our proof relies on the availability of a dynamical formulation. As (1.7) hints, we want to identify points \((x,y)\in \textrm{spt}\,\pi \) with the trajectory \({X(t)=(1-t)x+ty}\). This is related to the Benamou–Brenier formulation of optimal transport, cf. [3], which in our case states that (1.1) can alternatively be characterised as

$$\begin{aligned} \min _{(j,\rho )}\left\{ \int c\left( \frac{\,{\textrm{d}}j_t}{\,{\textrm{d}}\rho _t}\right) \,{\textrm{d}}\rho :\partial _t \rho _t+\textrm{div}\,j_t = 0,\, \rho _0=\lambda ,\, \rho _1=\mu \right\} \end{aligned}$$
(1.10)

Here \(\frac{\,{\textrm{d}}j_t}{\,{\textrm{d}}\rho _t}\) denotes the Radon-Nikodym derivative. We refer the reader to (2.2) for an explanation of how to make sense of (1.10) rigorously. This alternative, dynamical formulation of optimal transport is only available for costs of the form \(c(x-y)\) with c convex.
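To indicate how a coupling gives rise to an admissible pair in (1.10) (a formal sketch, ignoring measurability issues): given \(\pi \in \Pi (\lambda ,\mu )\) and \(X(t)=(1-t)x+ty\), set

$$\begin{aligned} \rho _t = X(t)_\#\pi ,\qquad j_t = X(t)_\#\big ((y-x)\,\pi \big ). \end{aligned}$$

Then \(\rho _0=\lambda \), \(\rho _1=\mu \) and \(\partial _t\rho _t+\textrm{div}\,j_t=0\) holds in the distributional sense, while Jensen's inequality and the convexity of c give \(\int c(\frac{{\textrm{d}}j_t}{{\textrm{d}}\rho _t})\,{\textrm{d}}\rho _t\le \int c(y-x)\,{\textrm{d}}\pi \); for costs symmetric under \(z\mapsto -z\), such as the p-cost, the right-hand side is the transport cost of \(\pi \).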

The outline of the paper is as follows. In the remainder of this section, we make precise our assumptions on the cost function (Sect. 1.1) and collect the elliptic regularity statements we require (Sect. 1.2). After fixing notation and recalling some elementary results on optimal transportation in Sect. 2, we establish the ingredients needed for the proof of Theorem 1.1, which is given in Sect. 7: in Sect. 3, we prove the \(L^\infty \)-estimate. In Sect. 4, we prove a localisation result for optimal transportation costs. Next, we approximate the boundary data of (1.8) in Sect. 5. For technical reasons, we also need to localise the data term D, which we do in Sect. 6.

1.1 Assumptions on the cost function and its dual

In this section, we explain our assumptions on the cost function c. In order to keep the statements of our results short, the conditions (1.11)–(1.14) below will be assumed to hold throughout the entire paper. We consider cost functions modeled on the p-cost \(c(x)=|x|^p\) with \(p>1\); this is also the primary example of cost functions we have in mind. We emphasize, however, that the cost functions we consider need not be homogeneous. In fact, the assumptions we impose are standard within elliptic regularity theory, see e.g. [5, 20]. Let \(p\in (1,\infty )\). We consider a \(C^1\) cost function \(c:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) satisfying the following properties: there is \(\Lambda \ge 1\) such that

  1. (i)

    c is strongly p-convex: for any \(x,y\in {\mathbb {R}}^d\) and \(\tau \in [0,1]\),

    $$\begin{aligned} \Lambda ^{-1} \tau (1-\tau ) V_p(x,y)+c(\tau x+(1-\tau )y)\le \tau c(x)+(1-\tau )c(y), \end{aligned}$$
    (1.11)

    where

    $$\begin{aligned} V_p(x,y)=(|x|^2+|y|^2)^\frac{p-2}{2}|x-y|^2. \end{aligned}$$
  2. (ii)

    c has p-growth: for any \(x\in {\mathbb {R}}^d\),

    $$\begin{aligned} \Lambda ^{-1} |x|^p\le c(x)\le \Lambda |x|^p. \end{aligned}$$
    (1.12)
  3. (iii)

    for any \(x,y\in {\mathbb {R}}^d\),

    $$\begin{aligned} |c(x)-c(y)|\le \Lambda U_p(x,y) \end{aligned}$$
    (1.13)

    where

    $$\begin{aligned} U_p(x,y)= (|x|+|y|)^{p-1}|x-y|. \end{aligned}$$
  4. (iv)

    c satisfies controlled p-growth: For any \(x,y\in {\mathbb {R}}^d\),

    $$\begin{aligned} |\nabla c(x)-\nabla c(y)|\le \Lambda (|x|+|y|)^{p-2}|x-y|. \end{aligned}$$
    (1.14)
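As a sanity check for (1.11): for the quadratic cost \(c(x)=|x|^2\), condition (1.11) holds with \(\Lambda =1\) and is in fact an identity, since \(V_2(x,y)=|x-y|^2\) and expanding the square gives

$$\begin{aligned} \tau |x|^2+(1-\tau )|y|^2-|\tau x+(1-\tau )y|^2=\tau (1-\tau )|x-y|^2. \end{aligned}$$

For the p-cost \(c(x)=|x|^p\) with \(p\ne 2\), conditions (1.11)–(1.14) hold with some \(\Lambda =\Lambda (p)\ge 1\); see e.g. [5, 20].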

If the choice of p is clear from the context, we will write \(V=V_p\) and \(U=U_p\). We further note the following inequality, valid for any \(z_1,z_2,z_3\in {\mathbb {R}}^d\) and with implicit constants depending only on p and d,

$$\begin{aligned} |V_p(z_1,z_2)-V_p(z_1,z_3)|\lesssim (|z_1|+ |z_2|+|z_3|)^{p-1}|z_2-z_3|. \end{aligned}$$
(1.15)

(1.15) follows from writing

$$\begin{aligned} V_p(z_1,z_2)-V_p(z_1,z_3) = \int _0^1 \langle \nabla _2 V_p(z_1,s z_2+(1-s)z_3), z_2-z_3\rangle \,{\textrm{d}}s. \end{aligned}$$

Here \(\nabla _2\) denotes a derivative with respect to the second variable of \(V_p\). From elementary calculations

$$\begin{aligned} |\nabla _2 V_p(z_1,s z_2+(1-s)z_3)|\lesssim (|z_1|+|z_2|+|z_3|)^{p-1}, \end{aligned}$$

which gives (1.15).

(1.11) and the fact that c is \(C^1\) imply that for any \(x,y\in {\mathbb {R}}^d\),

$$\begin{aligned}&c(x)\ge c(y)+\langle \nabla c(y),x-y\rangle +\Lambda ^{-1} V(x,y),\end{aligned}$$
(1.16)
$$\begin{aligned}&\langle \nabla c(x)-\nabla c(y),x-y\rangle \ge \Lambda ^{-1} V(x,y). \end{aligned}$$
(1.17)

This can be seen by arguing as in the 2-convex, 2-growth case, which can be found for example in [12, Chapter IV.4.1].

We require some information on the convexity properties of the convex conjugate, which follows by adaptation of the 2-convex, 2-growth theory in [22] and [13]. For the convenience of the reader, we provide proofs of the statements we require. Introduce the convex conjugate \(c^*\), defined on \({\mathbb {R}}^d\) via

$$\begin{aligned} c^*(\xi )= \sup _{x} \langle \xi ,x\rangle - c(x). \end{aligned}$$

We remark that since c is strongly convex, \(C^1\) and superlinear at infinity, \(c^*\) is strongly convex, \(C^1\) and superlinear at infinity. Moreover \(\nabla c\) and \(\nabla c^*\) are homeomorphisms of \({\mathbb {R}}^d\) and \(\nabla c^*= (\nabla c)^{-1}\) [22, Theorem 26.5]. Note that due to (1.12), we have for any \(\xi \in {\mathbb {R}}^d\),

$$\begin{aligned} c^*(\xi )\le \sup _x \left\{ \langle \xi ,x\rangle -\Lambda ^{-1} |x|^p\right\} \lesssim |\xi |^{p^\prime }. \end{aligned}$$

A lower bound can be obtained similarly and we deduce:

$$\begin{aligned} |\xi |^{p'}\lesssim c^*(\xi )\lesssim |\xi |^{p'}. \end{aligned}$$
(1.18)
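As a concrete illustration of (1.18) (not needed for the proofs): for the model cost \(c(x)=|x|^p/p\) in one dimension, the conjugate is \(c^*(\xi )=|\xi |^{p'}/p'\), and this can be checked numerically by brute-forcing the supremum in the definition of \(c^*\). The grid bounds below are ad-hoc choices, valid for moderate \(|\xi |\).

```python
def c(x, p):
    # model p-cost, c(x) = |x|^p / p
    return abs(x) ** p / p

def c_star_numeric(xi, p, lo=-10.0, hi=10.0, n=200_001):
    # brute-force the convex conjugate sup_x ( xi*x - c(x) ) on a grid;
    # for moderate |xi| the maximiser x* = sign(xi)|xi|^{1/(p-1)} lies in [lo, hi]
    step = (hi - lo) / (n - 1)
    return max(xi * (lo + i * step) - c(lo + i * step, p) for i in range(n))

p = 3.0
q = p / (p - 1)  # the dual exponent p'
for xi in (0.5, 1.0, 2.0):
    exact = abs(xi) ** q / q  # closed-form conjugate of |x|^p / p
    assert abs(c_star_numeric(xi, p) - exact) < 1e-3
```

The same brute-force routine also confirms the two-sided bound (1.18) with constants close to 1, as expected for the model cost.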

Due to the strong p-convexity of c, \(c^*\) satisfies for any \(\xi _1,\xi _2\in {\mathbb {R}}^d\),

$$\begin{aligned} |\nabla c^*(\xi _1)-\nabla c^*(\xi _2)|\lesssim (|\xi _1|+|\xi _2|)^{p'-2}|\xi _1-\xi _2|. \end{aligned}$$
(1.19)

Indeed, for \(\xi _1,\xi _2\in {\mathbb {R}}^d\), using (1.17) with the choice \({x=\nabla c^*(\xi _1)}\), \(y=\nabla c^*(\xi _2)\) and the Cauchy-Schwarz inequality,

$$\begin{aligned} V_p(\nabla c^*(\xi _1),\nabla c^*(\xi _2))\lesssim \langle \xi _1-\xi _2,\nabla c^*(\xi _1)-\nabla c^*(\xi _2)\rangle \le |\nabla c^*(\xi _1)-\nabla c^*(\xi _2)||\xi _1-\xi _2|. \end{aligned}$$

Re-arranging, we have

$$\begin{aligned} |\nabla c^*(\xi _1)-\nabla c^*(\xi _2)|\lesssim |\xi _1-\xi _2|(|\nabla c^*(\xi _1)|^2+|\nabla c^*(\xi _2)|^2)^{\frac{2-p}{2}}. \end{aligned}$$
(1.20)

We claim that for any \(\xi \in {\mathbb {R}}^d\),

$$\begin{aligned} |\xi |^{p^\prime -1}\lesssim |\nabla c^*(\xi )|\lesssim |\xi |^{p^\prime -1}. \end{aligned}$$
(1.21)

Then (1.20) gives (1.19). Since \(\nabla c^*= (\nabla c)^{-1}\) and both maps are homeomorphisms, to show (1.21), it suffices to show that for any \(x\in {\mathbb {R}}^d\),

$$\begin{aligned} |x|^{p-1}\lesssim |\nabla c(x)|\lesssim |x|^{p-1}. \end{aligned}$$
(1.22)

Fix \(x\in {\mathbb {R}}^d\). Since difference quotients of convex functions are non-decreasing, for any \(h\in {\mathbb {R}}^d\), applying also (1.13),

$$\begin{aligned} \Big \langle \nabla c(x),\frac{h}{|h|}\Big \rangle \le \frac{c(x+h)-c(x)}{|h|}\lesssim \frac{(|x|+|h|)^{p}}{|h|}. \end{aligned}$$

Applying the above with h replaced by th and choosing t such that \(|t h|= |x|\), as h was arbitrary, this gives the second inequality in (1.22). Note that in particular \(\nabla c(0)=0\). Thus, using (1.17) with \(y=0\) and the Cauchy-Schwarz inequality gives

$$\begin{aligned} |x|^p\lesssim \langle \nabla c(x),x\rangle \le |\nabla c(x)||x|. \end{aligned}$$

After rearranging, this proves the first inequality in (1.22). Hence (1.19) is established.

Finally, it follows from (1.21) and the fundamental theorem of calculus that for \(\xi _1,\xi _2\in {\mathbb {R}}^d\),

$$\begin{aligned} |c^*(\xi _1)-c^*(\xi _2)|= \Big |\int _0^1 \langle \nabla c^*(s\xi _1+(1-s)\xi _2),\xi _1-\xi _2\rangle \,{\textrm{d}}s\Big |\lesssim U_{p'}(\xi _1,\xi _2). \end{aligned}$$
(1.23)

We also require that \(c^*\) is strongly \(p^\prime \)-convex, that is, for some \(c_0=c_0(p,\Lambda )>0\),

$$\begin{aligned} c_0 \tau (1-\tau ) V_{p'}(\xi _1,\xi _2)+c^*(\tau \xi _1+(1-\tau )\xi _2)\le \tau c^*(\xi _1)+(1-\tau )c^*(\xi _2). \end{aligned}$$
(1.24)

Indeed, we can use Taylor’s theorem and (1.14) to obtain for any \(x,y\in {\mathbb {R}}^d\) and some \(C=C(p,n)>0\),

$$\begin{aligned} c(y)&= c(x)+\langle \nabla c(x),y-x\rangle + \int _0^1 \langle \nabla c(x+t(y-x))-\nabla c(x),y-x\rangle \textrm{d}t\nonumber \\&\le c(x)+\langle \nabla c(x),y-x\rangle + C (|x|+|x-y|)^{p-2}|y-x|^2. \end{aligned}$$
(1.25)

In order to estimate the integral, we used a well-known estimate, see e.g. [4, 11]. Recall the Fenchel-Young inequality in the form

$$\begin{aligned} c(\xi ) +c^*(x)\ge \langle \xi ,x\rangle \quad \forall \xi , x\in {\mathbb {R}}^d, \end{aligned}$$
(1.26)

with equality if and only if \(\xi = \nabla c^*(x)\). Hence, with the choice \(x=\nabla c^*(\xi _1)\), (1.25) gives

$$\begin{aligned} -c^*(\xi _2)&= -\sup _y \{ \langle \xi _2,y\rangle -c(y)\}\nonumber \\&\le - c^*(\xi _1)-\sup _{y}\left\{ \langle \xi _2-\xi _1,y\rangle -C(|\nabla c^*(\xi _1)|+|\nabla c^*(\xi _1)-y|)^{p-2} |y-\nabla c^*(\xi _1)|^2\right\} . \end{aligned}$$
(1.27)

Note that the supremum is nothing but \(\left( C V_p(\nabla c^*(\xi _1),\cdot -\nabla c^*(\xi _1))\right) ^*(\xi _2-\xi _1)\). Noting that for \(x,y\in {\mathbb {R}}^d\),

$$\begin{aligned} (|x|+|y|)^{p-2}|y|^2 \lesssim {\left\{ \begin{array}{ll} |y|^p &{}\quad \text {if } \{|x|\le |y|, p\ge 2\} \text { or } \{|x|\ge |y|, p\le 2\}\\ |x|^{p-2} |y|^2 &{} \quad \text {if } \{|x|\ge |y|, p\ge 2\} \text { or } \{|x|\le |y|, p\le 2\} \end{array}\right. } \end{aligned}$$

and arguing case by case as for (1.18), this shows

$$\begin{aligned} \left( (|x|+|\cdot |)^{p-2}|\cdot |^2\right) ^*(\xi )\gtrsim (|x|^{p-1}+|\xi |)^{p^\prime -2}|\xi |^2. \end{aligned}$$

In particular, recalling (1.21), we deduce

$$\begin{aligned} V_p(\nabla c^*(\xi _1),\cdot )^*(\xi _2-\xi _1)\gtrsim (|\nabla c^*(\xi _1)|^{p-1}+|\xi _1-\xi _2|)^{p^\prime -2} |\xi _1-\xi _2|^2\gtrsim V_{p^\prime }(\xi _1,\xi _2). \end{aligned}$$

Employing the fact that for any convex \(f:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) and any \(x,\xi \in {\mathbb {R}}^d\), it holds that \({f(\cdot -x)^*(\xi ) = f^*(\xi )+\langle x,\xi \rangle }\), we finally conclude,

$$\begin{aligned} \left( C V_p(\nabla c^*(\xi _1),\cdot -\nabla c^*(\xi _1))\right) ^*(\xi _2-\xi _1)\gtrsim V_{p^\prime }(\xi _1,\xi _2)+\langle \nabla c^*(\xi _1),\xi _2-\xi _1\rangle . \end{aligned}$$

Combining this estimate with (1.27) gives (1.24).
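For the model cost this inequality can be spot-checked numerically. The sketch below takes \(p=3\) (so \(p'=3/2\)) in one dimension, where \(c^*(\xi )=|\xi |^{3/2}/(3/2)\); the constant \(c_0\) is an ad-hoc choice for these configurations, not an optimal constant.

```python
def c_star(xi, q=1.5):
    # conjugate of the model cost |x|^3/3 in one dimension: |xi|^{3/2}/(3/2)
    return abs(xi) ** q / q

def V(xi1, xi2, q=1.5):
    # V_{p'}(xi1, xi2) = (xi1^2 + xi2^2)^{(p'-2)/2} |xi1 - xi2|^2
    return (xi1 ** 2 + xi2 ** 2) ** ((q - 2) / 2) * (xi1 - xi2) ** 2

def gain(xi1, xi2, tau):
    # convexity gain of c* along the segment between xi1 and xi2
    return tau * c_star(xi1) + (1 - tau) * c_star(xi2) \
        - c_star(tau * xi1 + (1 - tau) * xi2)

# spot-check the strong p'-convexity inequality with the ad-hoc constant c0 = 0.05
c0 = 0.05
assert gain(2.0, 1.0, 0.5) >= c0 * 0.5 * 0.5 * V(2.0, 1.0)
assert gain(-1.0, 1.0, 0.25) >= c0 * 0.25 * 0.75 * V(-1.0, 1.0)
```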

1.2 Regularity assumptions on the dual system

Let \(R\in (2,3)\). In this section, we state the regularity results we require for distributional solutions \(\phi \in W^{1,p'}(B_R)\) of the equation

$$\begin{aligned} -\textrm{div}\,\nabla c^*(\textrm{D}\phi )&= c_g \quad \text { on } B_R \end{aligned}$$
(1.28)
$$\begin{aligned} \nabla c^*(\textrm{D}\phi )\cdot \nu&= g \quad \text { on } \partial B_R, \end{aligned}$$
(1.29)

where \(g\in L^{p}(\partial B_R)\) and \(c_g\) satisfies the compatibility condition \(|B_R|c_g = \int _{\partial B_R} g\). Here \(\nu \) denotes the outward-pointing normal vector on \(\partial B_R\). We will show that solutions exist and are unique up to a constant. Hence, we usually normalise solutions by requiring that \(\int _{B_R} \phi = 0\). Fixing \(g\in L^p(\partial B_R)\), we denote by \(\phi ^r\) the solution of (1.28)–(1.29) with data \(g^r\) satisfying \(\int _{B_R} \phi ^r=0\), where \(g^r\) denotes the convolution of g with a smooth convolution kernel on \(\partial B_R\) at scale r.

Lemma 1.2

If c satisfies (1.11)–(1.14), then solutions \(\phi \) of (1.28)–(1.29) exist, are unique up to a constant, and the following statements hold:

  1. (i)

    \(\phi \) satisfies the following energy estimates:

    $$\begin{aligned}&\int _{B_R}|\textrm{D}\phi |^{p'}\,{\textrm{d}}x\lesssim \int _{\partial B_R} |g|^p, \end{aligned}$$
    (1.30)
    $$\begin{aligned}&\int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x\lesssim \int _{\partial B_R} |g|^p. \end{aligned}$$
    (1.31)
  2. (ii)

    \(\phi \) is Lipschitz regular in the interior of \(B_R\): For any \(r<R\),

    $$\begin{aligned} \sup _{x\in B_r} |\textrm{D}\phi |^{p^\prime } \lesssim _{R-r} \int _{\partial B_R} |g|^p. \end{aligned}$$
    (1.32)
  3. (iii)

    The difference between \(\phi \) and \(\phi ^r\) is controlled: There exists \(s=s(n,p)>0\) such that

    $$\begin{aligned} \int _{B_R} |\textrm{D}\phi -\textrm{D}\phi ^r|^{p'}\,{\textrm{d}}x\lesssim r^s\int _{\partial B_R} |g|^p. \end{aligned}$$
    (1.33)
  4. (iv)

    \(\textrm{D}\phi ^r\) is Hölder-regular up to the boundary: For any \(\beta \in (0,1)\),

    $$\begin{aligned} r^\beta [\textrm{D}\phi ^r]_{C^{0,\beta }(B_R)}^{p'}+\sup _{B_R} |\textrm{D}\phi ^r|^{p'}\lesssim \frac{1}{r^{d-1}}\int _{\partial B_R} |g|^{p}. \end{aligned}$$
    (1.34)
  5. (v)

    Let \(\beta \in (0,1)\). Whenever \(\textrm{D}\phi \in C^{0,\beta }(B)\) for some ball B,

    $$\begin{aligned}{}[c^*(\textrm{D}\phi )+c(\nabla c^*(\textrm{D}\phi ))]_{C^{0,\beta }(B)} \lesssim \Vert \textrm{D}\phi \Vert _{L^\infty (B)}^{p'-1}[\textrm{D}\phi ]_{C^{0,\beta }(B)}. \end{aligned}$$

Proof

Note that in light of the results of Sect. 1.1, \(c^*\) is \(p^\prime \)-convex and satisfies controlled \(p^\prime \)-growth. Hence, the statements we need to prove are largely standard.

The existence and uniqueness up to a constant of solutions follows from the direct method. Testing the weak formulation of (1.28) with \(\phi \) and applying (1.24) in combination with Hölder's inequality, the trace estimate and the Poincaré inequality (recall that \(\int _{B_R} \phi = 0\)) gives:

$$\begin{aligned} \Vert \textrm{D}\phi \Vert _{L^{p^\prime }(B_R)}^{p^\prime }&\lesssim \int _{B_R} \langle \nabla c^*(\textrm{D}\phi ),\textrm{D}\phi \rangle \,{\textrm{d}}x = \int _{\partial B_R} g \phi \le \Vert \phi \Vert _{L^{p^\prime }(\partial B_R)}\Vert g\Vert _{L^{p}(\partial B_R)}\\&\lesssim \Vert \phi \Vert _{W^{1,p^\prime }(B_R)}\Vert g\Vert _{L^{p}(\partial B_R)}\lesssim \Vert \textrm{D}\phi \Vert _{L^{p^\prime }(B_R)}\Vert g\Vert _{L^{p}(\partial B_R)}. \end{aligned}$$

Re-arranging this gives (1.30). Using (1.12) and (1.21), (1.30) implies that also

$$\begin{aligned} \int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x\lesssim \int _{B_R} |\nabla c^*(\textrm{D}\phi )|^p\,{\textrm{d}}x\lesssim \int _{B_R}|\textrm{D}\phi |^{p^\prime }\,{\textrm{d}}x\lesssim \int _{\partial B_R} |g|^p, \end{aligned}$$

that is (1.31) holds.

(1.32) is proven in [19] and [17]. As the proofs are quite involved, we do not comment on them here. Instead we turn to (1.33). We focus on the case \(p^\prime \le n\), as the other case is easier. Testing the equations for \(\phi \) and \(\phi ^r\) with \(\phi -\phi ^r\) and applying (1.24) and Hölder's inequality, we find

$$\begin{aligned} \int _{B_R} V_{p^\prime }(\textrm{D}\phi ,\textrm{D}\phi ^r)&\lesssim \int _{B_R} \langle \nabla c^*(\textrm{D}\phi ){-}\nabla c^*(\textrm{D}\phi ^r),\textrm{D}\phi {-}\textrm{D}\phi ^r\rangle \,{\textrm{d}}x= \int _{\partial B_R} (g-g^r)(\phi -\phi ^r)\\&\lesssim \Vert g-g^r\Vert _{L^{\frac{p^\prime (n-1)}{n(p^\prime -1)}}(\partial B_R)}\Vert \phi -\phi ^r\Vert _{L^{\frac{p^\prime (n-1)}{n-p^\prime }}(\partial B_R)}. \end{aligned}$$

By a standard trace estimate and Poincaré’s inequality

$$\begin{aligned} \Vert \phi -\phi ^r\Vert _{L^{\frac{p^\prime (n-1)}{n-p^\prime }}(\partial B_R)}\lesssim \Vert \phi -\phi ^r\Vert _{W^{1,p^\prime }(B_R)}\lesssim \Vert \textrm{D}\phi -\textrm{D}\phi ^r\Vert _{L^{p^\prime }(B_R)}. \end{aligned}$$

Note \(\frac{p^\prime (n-1)}{n(p^\prime -1)} =\frac{(n-1)p}{n}\). By standard properties of convolution,

$$\begin{aligned} \Vert g-g^r\Vert _{L^\frac{p(n-1)}{n}(\partial B_R)}\lesssim r^\frac{1}{n-1} \Vert g\Vert _{L^p(\partial B_R)}. \end{aligned}$$

If \(p\le 2\), since \(V_{p^\prime }(\textrm{D}\phi ,\textrm{D}\phi ^r)\gtrsim |\textrm{D}\phi -\textrm{D}\phi ^r|^{p^\prime }\), combining the above estimates and re-arranging concludes the proof. If \(p^\prime \le 2\), we apply Hölder's inequality to see

$$\begin{aligned} \int _{B_R} V_{p^\prime }(\textrm{D}\phi ,\textrm{D}\phi ^r) \gtrsim \Vert \textrm{D}\phi -\textrm{D}\phi ^r\Vert _{L^{p^\prime }(B_R)}^2\,\Vert |\textrm{D}\phi |+|\textrm{D}\phi ^r|\Vert _{L^{p^\prime }(B_R)}^{p^\prime -2}. \end{aligned}$$

Due to (1.30) and standard properties of convolution,

$$\begin{aligned} \Vert |\textrm{D}\phi |+|\textrm{D}\phi ^r|\Vert _{L^{p^\prime }(B_R)}\lesssim \Vert \textrm{D}\phi \Vert _{L^{p^\prime }(B_R)}+\Vert \textrm{D}\phi ^r\Vert _{L^{p^\prime }(B_R)}\lesssim \Vert g\Vert _{L^p(\partial B_R)}^{p-1}, \end{aligned}$$

this again gives (1.33).

(1.34) follows from [19] and [17]. Again the proof is involved and we do not comment on it here.

Finally, we note by direct calculation using (1.13), (1.23), (1.19) and (1.21) that

$$\begin{aligned}{}[c^*(\textrm{D}\phi )+c(\nabla c^*(\textrm{D}\phi ))]_{C^{0,\beta }(B)} \lesssim \Vert \textrm{D}\phi \Vert _{L^\infty (B)}^{p'-1}[\textrm{D}\phi ]_{C^{0,\beta }(B)}. \end{aligned}$$

This concludes the proof. \(\square \)

2 Preliminaries

2.1 General notation

Throughout, we let \(1<p<\infty \). \(B_r(x)\) denotes the ball of radius \(r>0\) centered at \(x\in {\mathbb {R}}^d\). We further write \(B_r=B_r(0)\) and \(B=B_1(0)\). C denotes a generic constant that may change from line to line; relevant dependencies on \(\Lambda \), say, will be denoted by \(C(\Lambda )\). We say \(a\lesssim b\) and \(a\gtrsim b\) if there exists a constant \(C>0\), depending only on d, p and \(\Lambda \), such that \(a\le C b\) and \(a\ge C b\), respectively.

Given \(\Omega \subset {\mathbb {R}}^d\), we denote by \([\cdot ]_{C^{0,\alpha }}\), the \(\alpha \)-Hölder-seminorm. Given \(\alpha \in (0,\infty )\), \(L^p(\Omega )\) and \(W^{\alpha ,p}(\Omega )\) denote the usual Lebesgue and (fractional) Sobolev spaces. If \(\mu \) is a measure on \({\mathbb {R}}^d\), \(\mu \llcorner \Omega \) denotes its restriction to \(\Omega \). We say a couple of measures \(\lambda \), \(\mu \) on \({\mathbb {R}}^d\) is admissible if \(\lambda ,\mu \) are non-negative finite measures satisfying \(\lambda ({\mathbb {R}}^d)=\mu ({\mathbb {R}}^d)\).

Given \(R>0\), we let \(\Pi _R(x)=R\frac{x}{|x|}\) be the projection onto \(\partial B_R\) and define for every measure \(\rho \) on \({\mathbb {R}}^d\) the projected measure on \(\partial B_R\), \({{\hat{\rho }}} = \Pi _R \# \rho \), i.e.

$$\begin{aligned} \int \xi \,{\textrm{d}}{{\hat{\rho }}} = \int \xi \left( R\frac{x}{|x|}\right) \,{\textrm{d}}\rho (x). \end{aligned}$$
(2.1)

A set \(\Omega \subset {\mathbb {R}}^d\times {\mathbb {R}}^d\) is said to be c-cyclically monotone if for any \(N\in {\mathbb {N}}\) and any points \((x_1,y_1),\ldots ,(x_N,y_N)\in \Omega \), there holds

$$\begin{aligned} \sum _{i=1}^N c(x_i-y_i)\le \sum _{i=1}^N c(x_i-y_{i+1}), \end{aligned}$$

where we identify \(y_{N+1}=y_1\).

A function \(f:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) is called c-concave if there exists a function \(g:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} f(x)=\inf _{y\in {\mathbb {R}}^d} c(x-y)-g(y) \end{aligned}$$

for all \(x\in {\mathbb {R}}^d\).

2.2 Optimal transportation

We recall some definitions and facts about optimal transportation; see [26] for more details. For this subsection, the full strength of our assumptions (1.11)–(1.14) is not needed. In fact, assuming that c is lower semi-continuous, convex and satisfies the p-growth condition (1.12) would be sufficient.

Given a measure \(\pi \) on \({\mathbb {R}}^d\times {\mathbb {R}}^d\), we denote its marginals by \(\pi _1\) and \(\pi _2\), respectively. The set of measures on \({\mathbb {R}}^d\times {\mathbb {R}}^d\) with marginals \(\pi _1\) and \(\pi _2\) is denoted \(\Pi (\pi _1,\pi _2)\). Given two positive measures \(\lambda \) and \(\mu \) with compact support and equal mass, we define

$$\begin{aligned} W_c(\lambda ,\mu )=\min _{\pi _1 = \lambda ,\pi _2=\mu } \int c(x-y)\,{\textrm{d}}\pi . \end{aligned}$$

While our notation is reminiscent of the Wasserstein distance, and in fact gives the (p-th power of the) Wasserstein p-distance in the case \({c(x-y) = |x-y|^p}\), in general \(W_c\) is not a distance on measures. Under our hypotheses, an optimal coupling always exists; moreover, a coupling \(\pi \) is optimal if and only if its support is c-cyclically monotone.
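For intuition, the optimality criterion can be probed in a toy discrete setting: in one dimension, with a strictly convex cost \(c(x-y)\), the monotone (sorted) matching is optimal, and a brute-force search over all bijections confirms this. The example below uses the model cost \(|z|^p/p\); it is an illustration only, not part of the theory.

```python
import itertools
import random

def cost(z, p=3.0):
    # model p-cost
    return abs(z) ** p / p

def matching_cost(xs, ys, p=3.0):
    return sum(cost(x - y, p) for x, y in zip(xs, ys))

def brute_force_min(xs, ys, p=3.0):
    # minimise over all couplings supported on a bijection
    return min(matching_cost(xs, perm, p) for perm in itertools.permutations(ys))

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(6)]
ys = [random.uniform(-1.0, 1.0) for _ in range(6)]
# the monotone matching (sorted against sorted) achieves the minimum,
# as predicted by c-cyclical monotonicity of optimal supports
monotone = matching_cost(sorted(xs), sorted(ys))
assert abs(monotone - brute_force_min(xs, ys)) < 1e-12
```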

Moreover, we note the following triangle-type inequality:

Lemma 2.1

Let \(\varepsilon \in (0,1)\). There is \(C(\varepsilon )>0\) such that for any admissible measures \(\mu _1, \mu _2, \mu _3\) it holds that

$$\begin{aligned} W_c(\mu _1,\mu _3)\le (1+\varepsilon ) W_c(\mu _1,\mu _2)+C(\varepsilon ) W_c(\mu _2,\mu _3). \end{aligned}$$

Proof

Due to the gluing lemma, see e.g. [23, Lemma 5.5], there exists \(\sigma \), a positive measure on \({\mathbb {R}}^d\times {\mathbb {R}}^d\times {\mathbb {R}}^d\), with marginal \(\pi _1\) on the first two variables and marginal \(\pi _2\) on the last two variables. Here \(\pi _1\) and \(\pi _2\) are the optimal couplings, with respect to \(W_c\), between \(\mu _1\) and \(\mu _2\), and between \(\mu _2\) and \(\mu _3\), respectively. Set \(\gamma \) to be the marginal of \(\sigma \) with respect to the first and third variables. Then \(\gamma \in \Pi (\mu _1,\mu _3)\). It follows using the convexity of c and the triangle inequality in \(L^p(\gamma )\) that for any \(t\in (0,1)\),

$$\begin{aligned} W_c(\mu _1,\mu _3)&\le \left( \int c(x-z)\,{\textrm{d}}\gamma \right) ^\frac{1}{p}\le \left( \int t c\left( \frac{x-y}{t}\right) +(1-t)c\left( \frac{y-z}{1-t}\right) \,{\textrm{d}}\gamma \right) ^\frac{1}{p}\\&\le \left( \int \left( \left( t c\left( \frac{x-y}{t}\right) \right) ^\frac{1}{p}+\left( (1-t)c\left( \frac{y-z}{1-t}\right) \right) ^\frac{1}{p}\right) ^p\,{\textrm{d}}\gamma \right) ^\frac{1}{p}\\&\le \left( \int t c\left( \frac{x-y}{t}\right) \,{\textrm{d}}\gamma \right) ^\frac{1}{p}+\left( \int (1-t) c\left( \frac{y-z}{1-t}\right) \,{\textrm{d}}\gamma \right) ^\frac{1}{p}. \end{aligned}$$

Using (1.12) and recalling the definition of \(\gamma \), we deduce

$$\begin{aligned} W_c(\mu _1,\mu _3)\le \left( \Lambda ^2 t^{1-p}\right) ^\frac{1}{p} W_c(\mu _1,\mu _2)+\left( \Lambda ^2 (1-t)^{1-p}\right) ^\frac{1}{p}W_c(\mu _2,\mu _3). \end{aligned}$$

Choosing t sufficiently close to 1, this gives the desired estimate. \(\square \)
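In the model case \(c(x)=|x|^p\) (so that \(\Lambda =1\) in (1.12)), the pointwise inequality underlying this proof, \(c(x-z)\le (1+\varepsilon )c(x-y)+C(\varepsilon )c(y-z)\), admits the explicit constant \(C(\varepsilon )=(1+1/\eta )^{p-1}\) with \(\eta =(1+\varepsilon )^{1/(p-1)}-1\). The following sketch (our own illustration, not part of the argument; the parameter choices are arbitrary) samples random triples and verifies this numerically.

```python
# Sanity check (illustration only): for c(x) = |x|^p the triangle-type
# inequality of Lemma 2.1 holds pointwise with the explicit constant
# C(eps) = (1 + 1/eta)^(p-1), eta = (1+eps)^(1/(p-1)) - 1, because
# (s+t)^p <= (1+eta)^(p-1) s^p + (1+1/eta)^(p-1) t^p for s, t >= 0.
import math
import random

def triangle_constant(p: float, eps: float) -> float:
    eta = (1.0 + eps) ** (1.0 / (p - 1.0)) - 1.0
    return (1.0 + 1.0 / eta) ** (p - 1.0)

def verify(p: float, eps: float, trials: int = 2000, dim: int = 3) -> bool:
    C = triangle_constant(p, eps)
    rng = random.Random(0)
    for _ in range(trials):
        x, y, z = ([rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3))
        lhs = math.dist(x, z) ** p
        rhs = (1 + eps) * math.dist(x, y) ** p + C * math.dist(y, z) ** p
        if lhs > rhs + 1e-12:
            return False
    return True

assert all(verify(p, eps) for p in (1.5, 2.0, 3.0) for eps in (0.1, 0.5))
```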

We require also the following consequence of Lemma 2.1.

Corollary 2.2

Let \(\mu _1,\mu _2\) be admissible measures. Then

$$\begin{aligned} W_c(\mu _1,\mu _2)\lesssim W_c(\mu _1+\mu _2,2\mu _2). \end{aligned}$$

Proof

Using Lemma 2.1 and sub-additivity of \(W_c\), we note for any \(\delta >0\),

$$\begin{aligned} W_c(\mu _1,\mu _2)&\le (1+\delta ) W_c\left( \mu _1,\frac{1}{2} (\mu _1+\mu _2)\right) +C(\delta )W_c\left( \frac{1}{2} (\mu _1+\mu _2),\mu _2\right) \\&= (1+\delta ) W_c\left( \frac{1}{2} \mu _1,\frac{1}{2} \mu _2\right) +C(\delta )W_c\left( \frac{1}{2} (\mu _1+\mu _2),\mu _2\right) \\&\le \frac{1+\delta }{2} W_c\left( \mu _1,\mu _2\right) +C(\delta )W_c\left( \mu _1+\mu _2,2\mu _2\right) . \end{aligned}$$

Re-arranging gives the result. \(\square \)
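Explicitly, the re-arrangement in the last step reads, for fixed \(\delta \in (0,1)\),

```latex
\frac{1-\delta}{2}\,W_c(\mu_1,\mu_2) \le C(\delta)\,W_c(\mu_1+\mu_2,2\mu_2)
\quad\Longrightarrow\quad
W_c(\mu_1,\mu_2) \le \frac{2C(\delta)}{1-\delta}\,W_c(\mu_1+\mu_2,2\mu_2),
```

so that, for instance, \(\delta =\frac{1}{2}\) gives the admissible implicit constant \(4C(\frac{1}{2})\).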

We remark that the Benamou–Brenier formula (1.10) needs to be interpreted via duality in general; that is, we set

$$\begin{aligned} \int c\left( \frac{\,{\textrm{d}}j_t}{\,{\textrm{d}}\rho _t}\right) \,{\textrm{d}}\rho _t = \sup _{\zeta \in C^0_c({\mathbb {R}}^d;{\mathbb {R}}^d)} \left\{ \int \zeta \cdot \,{\textrm{d}}j_t - \int c^*(\zeta )\,{\textrm{d}}\rho _t\right\} . \end{aligned}$$
(2.2)

Given \(O\subset {\mathbb {R}}^d\), let \(\kappa _{\mu ,O}\) denote the normalising constant for which \(W_{c}(\mu \llcorner O,\kappa _{\mu ,O}\,{\textrm{d}}x \llcorner O)\) is well-defined (that is, for which both measures have the same mass), namely \(\kappa _{\mu ,O} = \frac{\mu (O)}{|O|}\). If \(O=B_R\), we write \(\kappa _{\mu ,R} = \kappa _{\mu ,B_R}\).

It will be convenient to denote \(\#_R = (B_R\times {\mathbb {R}}^d)\cup ({\mathbb {R}}^d\times B_R)\). We recall the definition of the quantities that we use to measure smallness:

$$\begin{aligned} E(R)&:= \frac{1}{|B_R|} \int _{\#_R} c(x-y)\,{\textrm{d}}\pi ,\\ D(R)&:= \frac{1}{|B_R|} W_p^p(\lambda \llcorner B_R,\kappa _{\lambda ,R}\,{\textrm{d}}x\llcorner B_R)+\frac{R^p}{\kappa _{\lambda ,R}^{p-1}}(\kappa _{\lambda ,R}-1)^p \\&\quad +\frac{1}{|B_R|} W_p^p(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R)+\frac{R^p}{\kappa _{\mu ,R}^{p-1}}(\kappa _{\mu ,R}-1)^p. \end{aligned}$$

We will find it convenient to work with the trajectories \(X(t) = (1-t)x+ty\), \(t\in [0,1]\), travelling from x at time 0 to y at time 1. In this context, it is useful to work on the domain

$$\begin{aligned} \Omega _R = \{(x,y)\in \#_3:\exists t\in [0,1] \text { s.t. } X(t)\in {{\overline{B}}}_R\}. \end{aligned}$$

To every trajectory \(X\in \Omega \), we associate entering and exiting times of \(B_R\):

$$\begin{aligned} \sigma _R&:= \min \{t\in [0,1]:X(t)\in {{\overline{B}}}_R\}\\ \tau _R&:= \max \{t\in [0,1]:X(t)\in {{\overline{B}}}_R\}. \end{aligned}$$

Often, we will drop the subscripts and denote \(\Omega = \Omega _R\), \(\sigma =\sigma _R\) and \(\tau = \tau _R\). Further, we will need to track trajectories entering and leaving \(B_R\). This is achieved through the non-negative measures \(f_R\) and \(g_R\) concentrated on \(\partial B_R\) and defined by the relations

$$\begin{aligned} \int \zeta \,{\textrm{d}}f_R= & {} \int _{\Omega \cap \{X(\sigma )\in \partial B_R\}} \zeta (X(\sigma ))\,{\textrm{d}}\pi ,\nonumber \\ \int \zeta \,{\textrm{d}}g_R= & {} \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \zeta (X(\tau ))\,{\textrm{d}}\pi . \end{aligned}$$
(2.3)

Note that the set of trajectories \(\Omega \cap \{X(\sigma )\in \partial B_R\}\) implicitly defines a Borel measurable subset of \({\mathbb {R}}^d\times {\mathbb {R}}^d\), namely its pre-image under the mapping \((x,y)\mapsto X\), which is continuous from \({\mathbb {R}}^d\times {\mathbb {R}}^d\) into \(C^0([0,1])\). Thus, the integrals in (2.3) are well-defined. We will often use similar observations without further justification.

2.3 Estimating radial projections

We record a technical estimate concerning radial projections that we will require.

Lemma 2.3

For \(R>0\), there exists \(1\ge \varepsilon (d)>0\) such that for every \(g\ge 0\) with \(\textrm{Spt}\, g\subset B_{(1+\varepsilon )R}{\setminus } B_{(1-\varepsilon )R}\) we have

$$\begin{aligned} R^{1-d}\left( \int g\right) ^p\lesssim \int _{\partial B_R} {{\hat{g}}}^p\lesssim (\sup g)^{p-1} \int |R-|x||^{p-1}g. \end{aligned}$$

Here \({{\hat{g}}}\) is the radial projection of g defined in (2.1).

Proof

By scaling we may assume \(R=\sup g =1\). The first inequality is then a direct consequence of Jensen’s inequality.

For the second inequality, note that if \(\varepsilon \ll 1\), \(\sup _{\partial B_1} |{{\hat{g}}}|\ll 1\), since we assume \({\textrm{Spt}\, g\subset B_{1+\varepsilon }{\setminus } B_{1-\varepsilon }}\). Fix \(\omega \in \partial B_1\) and set \(\psi (r)=r^{d-1}g(r\omega )\) for \(r>0\). Then we have \({0\le \psi \le (1+\varepsilon )^{d-1}\le 2}\) and

$$\begin{aligned} \int _0^\infty \psi = {{\hat{g}}}(\omega ). \end{aligned}$$

We conclude that for \(\omega \in \partial B_1\),

$$\begin{aligned} \int _0^\infty |1-r|^{p-1} r^{d-1}g(r\omega )\ge \min _{0\le {{\tilde{\psi }}}\le 2, \int {{\tilde{\psi }}} = {{\hat{g}}}(\omega )} \int _0^\infty |1-r|^{p-1} {{\tilde{\psi }}}(r)\gtrsim {{\hat{g}}}(\omega )^p. \end{aligned}$$

The last inequality holds, since the minimiser of

$$\begin{aligned} \min _{0\le {{\tilde{\psi }}}\le 2, \int {{\tilde{\psi }}} = {{\hat{g}}}(\omega )} \int _0^\infty |1-r|^{p-1} {{\tilde{\psi }}}(r) \end{aligned}$$

is given by \(2I\left( |r-1|\le \frac{1}{4} {{\hat{g}}}(\omega )\right) \). \(\square \)
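To make the final bound quantitative: writing \(m={{\hat{g}}}(\omega )\), the value of this minimum is

```latex
\int_0^\infty |1-r|^{p-1}\,2\,I\Bigl(|r-1|\le\tfrac{m}{4}\Bigr)\,\mathrm{d}r
  = 4\int_0^{m/4} s^{p-1}\,\mathrm{d}s
  = \frac{4}{p}\Bigl(\frac{m}{4}\Bigr)^{p}
  \gtrsim_p m^{p}.
```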

3 An \(L^\infty \)-bound on the displacement

A key point in our proof will be that trajectories do not move very much. Since we assume \(E(4)\ll 1\), this is evidently true on average. However, we need to control the length of trajectories not just on average, but in a pointwise sense. We establish this in this section. In the quadratic case, the proof in [9] relies on the fact that 2-monotonicity is equivalent to standard monotonicity. In our setting this is not available and we hence provide a different proof, which relies heavily on the strong p-convexity of c.

Lemma 3.1

Let \(1<p<\infty \). Let \(\pi \) be a coupling between two admissible measures \(\lambda \) and \(\mu \). Assume that \(\textrm{Spt}\,\pi \) is c-cyclically monotone and that \({E(4)+D(4)\ll 1}\). Then for every \((x,y)\in \textrm{Spt}\,\pi \cap \#_3\), we have

$$\begin{aligned} |x-y|\lesssim \left( E(4)+D(4)\right) ^\frac{1}{p+d}. \end{aligned}$$
(3.1)

As a consequence, for \((x,y)\in \textrm{Spt}\,\pi \) and \(t\in [0,1]\),

$$\begin{aligned} x\in B_{3} \text { or } y\in B_3 \Rightarrow (1-t)x+ty\in B_{4}. \end{aligned}$$
(3.2)

In the proof of Lemma 3.1 we require the following technical result, which we state independently as we will need it again later.

Lemma 3.2

Let \(1<p<\infty \) and \(0< \alpha <1\). For every \(R>0\), \(\xi \in C^{0,\alpha }(B_R)\) and \(\mu \) supported in \(B_R\) with \(\mu (B_R)\sim |B_R|\),

$$\begin{aligned} \left|\int _{B_R}\xi (\,{\textrm{d}}\mu -\kappa _{\mu ,R}\,{\textrm{d}}x)\right|\le&[\xi ]_{C^{0,\alpha }(B_R)} W_{c}(\mu ,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)^\frac{\alpha }{p} R^\frac{2d(p-\alpha )}{p}. \end{aligned}$$
(3.3)

In case \(\alpha =1\), (3.3) holds with \(C^{0,1}\) replaced by \(C^1\). Further, if in addition we have \({\xi \in C^{\lfloor {p-1}\rfloor ,p-\lfloor {p-1}\rfloor }(B_R)}\), there is \(C>0\) such that

$$\begin{aligned} \left|\int _{B_R} \xi (\,{\textrm{d}}\mu -\kappa _{\mu ,R}\,{\textrm{d}}x)\right|&\le C\sum _{i=1}^{\lfloor {p-1}\rfloor }\left( \kappa _{\mu ,R}\int |\textrm{D}^i\xi |^\frac{p}{p-i}\,{\textrm{d}}x\right) ^\frac{p-i}{p} W_c(\mu ,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)^\frac{i}{p}\\&\quad + [\xi ]_{C^{\lfloor {p-1}\rfloor ,p-\lfloor {p-1}\rfloor }} W_c(\mu ,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R). \end{aligned}$$

Proof

Integrate the estimate

$$\begin{aligned} |\xi (x)-\xi (y)|\le [\xi ]_{C^{0,\alpha }}|x-y|^\alpha \end{aligned}$$

against an optimal transport plan \(\pi \) between \(\mu \) and \(\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R\) to find,

$$\begin{aligned} \left| \int _{B_R}\xi (\,{\textrm{d}}\mu -\kappa _{\mu ,R}\,{\textrm{d}}x)\right| \le&[\xi ]_{C^{0,\alpha }(B_R)} \int _{B_R}|x-y|^\alpha \,{\textrm{d}}\pi . \end{aligned}$$

Applying Hölder's inequality and using (1.12), the result follows.

To obtain the second estimate, we proceed similarly, but start with the estimate

$$\begin{aligned} \left|\xi (x)-\xi (y)-\sum _{|\alpha |=1}^{\lfloor {p-1}\rfloor }\textrm{D}^\alpha \xi (y)\frac{(x-y)^\alpha }{\alpha !}\right|\le [\xi ]_{C^{\lfloor {p-1}\rfloor ,p-\lfloor {p-1}\rfloor }(B_R)} |x-y|^{p}. \end{aligned}$$

The result follows using (1.12) and using Hölder to estimate

$$\begin{aligned} \int |\textrm{D}^\alpha \xi (y)||x-y|^{|\alpha |} \,{\textrm{d}}\pi \le \left( \kappa _{\mu ,R}\int |\textrm{D}^{|\alpha |}\xi |^\frac{p}{p-|\alpha |}\,{\textrm{d}}x\right) ^\frac{p-|\alpha |}{p} W_c(\mu ,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)^\frac{|\alpha |}{p}. \end{aligned}$$

\(\square \)

We proceed to prove Lemma 3.1.

Proof of Lemma 3.1

Fix \((x,y)\in \textrm{Spt}\,\pi \cap \#_3\). Without loss of generality we may assume that \((x,y)\in B_{3}\times {\mathbb {R}}^d\).

Step 1. Barrier points exist in all directions: In this step we show that in all directions we may find points \((x',y')\in \textrm{Spt}\,\pi \) with \(x'\approx y'\). To be precise, we show that there is \(M=M(p,d,\Lambda )>0\) such that for any unit vector \(\varvec{n}\in {\mathbb {R}}^d\) and all \(r\ll 1\), there is \((x',y')\in \textrm{Spt}\,\pi \cap (B_r(x+2r \varvec{n})\times {\mathbb {R}}^d)\) such that

$$\begin{aligned} c(x'-y')\le \frac{ME(4)}{r^d}. \end{aligned}$$

Assume, for contradiction, that for any \(M>0\), there are \(\varvec{n}\in {\mathbb {R}}^d\) and \(r>0\) such that for all \((x',y')\in \textrm{Spt}\,\pi \cap (B_r(x+2r \varvec{n})\times {\mathbb {R}}^d)\), \(c(x'-y')\ge \frac{M E(4)}{r^d}\). Let \(\eta \) be a non-negative, smooth cut-off supported in \(B_r(x+2r \varvec{n})\) satisfying

$$\begin{aligned} \sum _{i=1}^{\lfloor {p-1}\rfloor } r^i\sup |\textrm{D}^i\eta |+r^{p}[\eta ]_{C^{\lfloor {p-1}\rfloor ,p-\lfloor {p-1}\rfloor }}\lesssim 1. \end{aligned}$$

Then

$$\begin{aligned} E(4)&\gtrsim \int \int \eta (x)c(x-y)\,{\textrm{d}}\pi (x,y)\ge \int \int \frac{M E(4)}{r^d}\eta (x)\,{\textrm{d}}\pi (x,y)\\&= \frac{M E(4)}{r^d}\int \eta (x) \,{\textrm{d}}\lambda (x). \end{aligned}$$

However, due to Lemma 3.2, applied to \(\lambda \), and noting \(\kappa _{\lambda ,4}\sim 1\),

$$\begin{aligned} \left| \int \eta \,{\textrm{d}}\lambda -\kappa _{\lambda ,4}\int \eta \,{\textrm{d}}x\right| \lesssim \sum _{i=1}^{\lfloor {p-1}\rfloor } r^{\frac{d(p-i)}{p}-i}D(4)^\frac{i}{p}+r^{-p} D(4). \end{aligned}$$

Normalising \(\eta \) such that \(\int _{B_r(x+2r \varvec{n})}\eta \,{\textrm{d}}x\sim r^d\), we can guarantee \(\kappa _{\lambda ,4}\int \eta \,{\textrm{d}}x\sim \kappa _{\lambda ,4} r^d\sim r^d\). Ensuring \(D(4)\ll r^{p+d}\), so that \(\sum _{i=1}^{\lfloor {p-1}\rfloor } r^{\frac{d(p-i)}{p}-i}D(4)^\frac{i}{p}+r^{-p} D(4) \ll r^d\), we may thus conclude

$$\begin{aligned} E(4)\gtrsim \frac{M E(4)}{r^d} r^d = M E(4). \end{aligned}$$

As M was arbitrary, this is a contradiction.
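For the reader's convenience, the smallness condition \(D(4)\ll r^{p+d}\) used above indeed renders the error terms of order \(o(r^d)\): for \(1\le i\le \lfloor p-1\rfloor \),

```latex
r^{\frac{d(p-i)}{p}-i}\,D(4)^{\frac{i}{p}}
  \ll r^{\frac{d(p-i)}{p}-i}\,r^{\frac{i(p+d)}{p}}
  = r^{\frac{d(p-i)+i(p+d)}{p}-i}
  = r^{d},
\qquad
r^{-p}D(4) \ll r^{-p}\,r^{p+d} = r^{d}.
```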

Step 2. Building barriers: In this step, we show that if we are given points

$$\begin{aligned} {(x',y')\in \textrm{Spt}\,\pi \cap (B_r(x+2r \varvec{n})\times {\mathbb {R}}^d)} \end{aligned}$$

such that \(c(x'-y')\le \frac{ME(4)}{r^d}\) for some \(M=M(p,d,\Lambda )>0\), then there is a cone \(C_{x,x'}\) with vertex \(x'+r\rho (x'-x)\) for some \(\rho =\rho (p,d,\Lambda )>0\), aperture \(\alpha =\alpha (p,d,\Lambda )\) and axis \(x'-x\) such that \(y\not \in C_{x,x'}\).

Without loss of generality, we may assume that \(x'-x\) points in the \(e_d\) direction. Moreover, by considering the cost \(c(\cdot )-c(0)\) instead, we may assume that \(c(0)=0\). Suppose for a contradiction that

$$\begin{aligned} y\in C_{x,x'}=x'+\{a\in {\mathbb {R}}^{d-1}\times {\mathbb {R}}^+:d(a,\Gamma )\le \alpha (|{{\overline{a}}}-x'|-r \rho )\} \end{aligned}$$

for some \(\alpha ,\rho >0\) to be determined. Here \(\Gamma = \{t(x'-x):t\ge 0\}\) and \({{\overline{a}}}\) denotes the orthogonal projection of a point \(a\in {\mathbb {R}}^{d-1}\times {\mathbb {R}}^+\) onto \(\Gamma \). We want to show that then

$$\begin{aligned} c(x-y)\ge c(x'-y)+c(x-y'). \end{aligned}$$
(3.4)

(3.4) is a contradiction to the c-monotonicity of \(\pi \) and hence proves the stated claim.

We note that we may assume \(x=0\). Indeed, setting \(z = y-x\), \(z'=y'-x\) and \({{\tilde{z}}}=x'-x\), (3.4) becomes

$$\begin{aligned} c(-z)\ge c({{\tilde{z}}}-z)+c(-z') \end{aligned}$$

with \(c({{\tilde{z}}}-z')\le \frac{M E(4)}{r^d}\) and \(z\in C_{0,{{\tilde{z}}}}\), which we recognise as precisely the situation we are in if \(x=0\) (Fig. 1).

Fig. 1: Geometric situation in Step 2

Taking \(\rho \ge 4\), we then estimate, using the growth assumption (1.13) and the strong p-convexity assumption (1.11),

$$\begin{aligned} c(-y)&\ge c(-{{\overline{y}}})+c(x'-y')-\Lambda U(-y,-{{\overline{y}}})\\&\ge \frac{|{{\overline{y}}}|}{|-{{\overline{y}}}+x'|}c (-{{\overline{y}}}+x')+\frac{\lambda |x'|}{|{{\overline{y}}} |}V(-{{\overline{y}}},0)-\Lambda U(-y,-{{\overline{y}}})\\&\ge c(-{{\overline{y}}}+x'){+}c(x')+\frac{\lambda |x'|}{ |{{\overline{y}}}|}V(-{{\overline{y}}},0){+}\frac{\lambda | {{\overline{y}}}{-}2x'|}{|{{\overline{y}}}-x'|}V(-{{\overline{y}}} +x',0)-\Lambda U(-y,-{{\overline{y}}})\\&\ge c(-y+x')+c(-x')-\Lambda U(-y+x',-{{\overline{y}}}+x') +\frac{\lambda |x'|}{|{{\overline{y}}}|}V(-{{\overline{y}}},0)\\&\quad +\frac{\lambda |{{\overline{y}}}-2x'|}{|{{\overline{y}}} -x'|}V(-{{\overline{y}}}+x',0)-\Lambda U(-y,-{{\overline{y}}})\\&\ge c(-y+x')+c(-y')-\Lambda U(-y+x',-{{\overline{y}}}+x') +\frac{\lambda |x'|}{|{{\overline{y}}}|}V(-{{\overline{y}}},0)\\&\quad +\frac{\lambda |{{\overline{y}}}-2x'|}{|{{\overline{y}}} -x'|}V(-{{\overline{y}}}+x',0)-\Lambda U(-y,-{{\overline{y}}})-\Lambda U(-x',-y')\\ \end{aligned}$$

In particular, it suffices to show

$$\begin{aligned}&c(x'-y')+ \frac{\lambda |x'|}{|{{\overline{y}}}|}V(-{{\overline{y}}}, 0)+\frac{\lambda |{{\overline{y}}}-2x'|}{|{{\overline{y}}} -x'|}V(-{{\overline{y}}}+x',0)\nonumber \\&\quad \ge \Lambda U(-y,-{{\overline{y}}})+\Lambda U (-y+x',-{{\overline{y}}}+x')+\Lambda U(-x',-y'). \end{aligned}$$
(3.5)

We note that, if \(\rho \ge 8\), \(|{{\overline{y}}}-2x'|\ge \frac{1}{2}|{{\overline{y}}}|\). Then we can estimate

$$\begin{aligned} \frac{\lambda |x'|}{|{{\overline{y}}}|}V(-{{\overline{y}}},0) +\frac{\lambda |{{\overline{y}}}-2x'|}{|{{\overline{y}}}-x'|}V(-{{\overline{y}}}+x',0)&= \lambda (|x'| |{{\overline{y}}}|^{p-1}+|{{\overline{y}}} -2 x'| |{{\overline{y}}}-x'|^{p-1})\\&\gtrsim |{{\overline{y}}}|^{p}. \end{aligned}$$

Further, if \(E(4)\le \varepsilon r^{d+1}\),

$$\begin{aligned}&\Lambda U(-y,-{{\overline{y}}})+\Lambda U(-y+x',-{{\overline{y}}}+x')+\Lambda U(-x',-y')\\&\quad \le 2\Lambda (2|y|)^{p-1}|y-{{\overline{y}}}| +\Lambda \left( |x'|+|y'|\right) ^{p-1} |x'-y'|\\&\quad \lesssim |y|^{p-1}\alpha |{{\overline{y}}}-x'|+\frac{M E}{r^d} (2r)^{p-1}\\&\quad \lesssim \alpha |{{\overline{y}}}|^p+\varepsilon |{{\overline{y}}}|^p . \end{aligned}$$

Thus choosing \(\alpha \), \(\varepsilon >0\), sufficiently small, we find (3.5) holds, proving our claim.

Step 3. Proving the \(L^\infty \)-bounds: Choose \(r=c (E(4)+D(4))^{\frac{1}{p+d}}\) for a sufficiently large constant \(c>0\). Selecting c(d) directions \(n_i\), by Step 1 and Step 2 we obtain points

$$\begin{aligned} {(x',y')\in \textrm{Spt}\,\pi \cap (B_r(x+2 r n_i)\times {\mathbb {R}}^d)} \end{aligned}$$

and cones \((C_{x,x_i'})_{i\le c(d)}\) with vertices \(x_i'+r\rho (x_i'-x)\), aperture \(\alpha \) and axis \(x_i'-x\) such that for some \(c(\alpha )>0\),

$$\begin{aligned} {y\not \in \cup _i C_{x,x_i'}} \text { and }{\mathbb {R}}^d\setminus B_{(\rho +c(\alpha ))r}(x)\subset \cup _i C_{x,x_i'}. \end{aligned}$$

In particular, we have

$$\begin{aligned} |y-x|\le (\rho +c(\alpha ))r\lesssim (E(4)+D(4))^\frac{1}{p+d}, \end{aligned}$$

that is (3.1).

(3.2) is a direct consequence of (3.1), concluding the proof. \(\square \)

We record two consequences of Lemma 3.1 we will use later.

Corollary 3.3

Under the assumptions of Lemma 3.1, it holds that

$$\begin{aligned}{} & {} \int _2^3 \int _{\Omega \cap \{\exists t\in [0,1]:X(t)\in \partial B_R\}} c(x-y)\,{\textrm{d}}\pi \,{\textrm{d}}R\lesssim (E(4)+D(4))^{1+\frac{1}{p+d}},\\{} & {} \int _2^3 \int _{\Omega } I(\{\exists t\in [0,1]:X(t)\in \partial B_R\})\,{\textrm{d}}\pi \,{\textrm{d}}R \lesssim (E(4)+D(4))^\frac{1}{p+d}. \end{aligned}$$

Proof

We use Lemma 3.1 to deduce there is \(C>0\) such that

$$\begin{aligned}&\int _2^3 \int _{\Omega \cap \{\exists t\in [0,1]:X(t)\in \partial B_R\}} c(x-y)\,{\textrm{d}}\pi \,{\textrm{d}}R\\&\quad \le \int _2^3 \int _{(B_{7/2}\setminus B_{3/2})\times (B_{7/2}\setminus B_{3/2})}I(\{||x|-R|\le C(E(4)+D(4))^\frac{1}{p+d}\}) c(x-y)\,{\textrm{d}}\pi \,{\textrm{d}}R\\&\quad \lesssim (E(4)+D(4))^\frac{1}{p+d} \int _{\#_4} c(x-y)\,{\textrm{d}}\pi \\&\quad \lesssim (E(4)+D(4))^{1+\frac{1}{p+d}}. \end{aligned}$$

Further, again using Lemma 3.1, there is \(C>0\) such that,

$$\begin{aligned}&\int _2^3 \int _\Omega I(\{\exists t\in [0,1]:X(t)\in \partial B_R\})\,{\textrm{d}}\pi \,{\textrm{d}}R\\&\quad \le \int _2^3 \pi (\Omega \cap \{||x|-R|\le C(E(4)+D(4))^\frac{1}{p+d}\})\,{\textrm{d}}R\\&\quad \le \int _2^3 \lambda (\{||x|-R|\le C(E(4)+D(4))^\frac{1}{p+d}\})\,{\textrm{d}}R\\&\quad \le \int _{B_{7/2}\setminus B_{3/2}} \int _2^3 I(||x|-R|\le C(E(4)+D(4))^\frac{1}{p+d})\,{\textrm{d}}R\,{\textrm{d}}\lambda \\&\quad \lesssim (E(4)+D(4))^\frac{1}{p+d} \lambda (B_4)\\&\quad \lesssim (E(4)+D(4))^\frac{1}{p+d}. \end{aligned}$$

\(\square \)
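The averaging mechanism in the last proof is elementary: a fixed point x can only satisfy \(||x|-R|\le \delta \) for an interval of radii R of length at most \(2\delta \). A small numerical sketch of this layer-cake bound (our own illustration, with arbitrary choices of \(\delta \) and the sample points):

```python
# Illustration: for fixed r = |x|, the set {R in [2,3] : |r - R| <= delta}
# has Lebesgue measure at most 2*delta -- the bound behind Corollary 3.3.
import random

def layer_measure(r: float, delta: float, n: int = 20000) -> float:
    """Riemann-sum approximation of the measure of {R in [2,3]: |r-R| <= delta}."""
    h = 1.0 / n
    return sum(h for i in range(n) if abs(r - (2.0 + (i + 0.5) * h)) <= delta)

rng = random.Random(1)
delta = 0.05
for _ in range(50):
    r = rng.uniform(0.0, 4.0)
    assert layer_measure(r, delta) <= 2 * delta + 1e-3
```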

4 A localisation result

In order to prove Theorem 1.1, we need to use optimality in a localised way, as the quantity we need to estimate is a local quantity. In general, given a minimiser \(\pi \) of optimal transport with cost function c between two measures \(\lambda \) and \(\mu \), it is not true that the localised transport cost of \(\pi \) is approximately equal to the optimal transport cost between the localised measures \(\lambda \llcorner B_R\) and \(\mu \llcorner B_R\). In other words, it is in general not the case that

$$\begin{aligned} \int _{\Omega } c(x-y)\,{\textrm{d}}\pi \approx W_c(\lambda \llcorner B_R,\mu \llcorner B_R). \end{aligned}$$

However, if we take into account the entry points of trajectories entering \(B_R\) (which we denoted \(f_R\), c.f. (2.3)) and the exit points of trajectories exiting \(B_R\) (which we denoted \(g_R\), c.f. (2.3)), the values are close as we show in the next lemma.

Lemma 4.1

Let \(\lambda ,\mu \) be admissible measures. Suppose \(\pi \in \Pi (\lambda ,\mu )\) minimises (1.1). Let \(R\in [2,3]\) and define \(f_R,g_R\) as in (2.3). Then for any \(\tau ,\delta >0\), there is \(\varepsilon >0\) such that if \(E(4)+D(4)\le \varepsilon \), then

$$\begin{aligned} \int _\Omega c(x-y)\,{\textrm{d}}\pi \le (1+\delta ) W_c(\lambda \llcorner B_R+f_R,\mu \llcorner B_R+ g_R)+\tau \left( E(4) + D(4)\right) \end{aligned}$$

Proof

Introduce the weakly continuous family of probability measures \(\{\lambda _z\}_{z\in \partial B_R}\) such that

$$\begin{aligned} \int _{\Omega \cap \{X(\sigma )\in \partial B_R\}} \zeta (x,X(\sigma ))\pi (\,{\textrm{d}}x\,{\textrm{d}}y) = \int _{\partial B_R}\int \zeta (x,z)\lambda _z(\,{\textrm{d}}x)f_R(\,{\textrm{d}}z) \end{aligned}$$

for any test function \(\zeta \) on \({\mathbb {R}}^d\times {\mathbb {R}}^d\). Likewise, introduce \(\{\mu _w\}_{w\in \partial B_R}\) via

$$\begin{aligned} \int _{\Omega \cap \{X(\tau )\in \partial B_R\}}\zeta (X(\tau ),y)\pi (\,{\textrm{d}}x\,{\textrm{d}}y) = \int _{\partial B_R}\int \zeta (w,y)\mu _w(\,{\textrm{d}}y)g_R(\,{\textrm{d}}w). \end{aligned}$$

Let \({{\overline{\pi }}}\) be an optimal plan for \(W_c(\lambda \llcorner B_R+f_R,\mu \llcorner B_R+g_R)\). Define a competitor \({{\tilde{\pi }}}\) for \(\pi \) by requiring the following formula to hold for any test function \(\zeta \) on \({\mathbb {R}}^d\times {\mathbb {R}}^d\),

$$\begin{aligned} \int \zeta (x,y){{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y)&= \int _{\Omega ^c}\zeta (x,y)\,{\textrm{d}}\pi (x,y) + \int _{B_R\times B_R} \zeta (x,y){{\overline{\pi }}}(\,{\textrm{d}}x\,{\textrm{d}}y)\nonumber \\&\quad + \int _{\partial B_R\times B_R}\int \zeta (x,y)\lambda _z(\,{\textrm{d}}x){{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}y)\nonumber \\&\quad + \int _{B_R\times \partial B_R}\zeta (x,y)\mu _w(\,{\textrm{d}}y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}w) \nonumber \\&\quad + \int _{\partial B_R\times \partial B_R} \int \int \zeta (x,y)\mu _w(\,{\textrm{d}}y)\lambda _z(\,{\textrm{d}}x){{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}w)\nonumber \\&= I + II + III + IV + V. \end{aligned}$$
(4.1)

In order to see that \({{\tilde{\pi }}}\in \Pi (\lambda ,\mu )\), by symmetry it suffices to check that the first marginal is \(\lambda \). Hence test (4.1) against \(\zeta (x)\). We begin by noting that due to the definition of \(\mu _w\) and using that \({{\overline{\pi }}}\) is supported in \({{\overline{B}}}_R\),

$$\begin{aligned} II + IV = \int _{B_R\times {\mathbb {R}}^d}\zeta (x){{\overline{\pi }}}(\,{\textrm{d}}x\,{\textrm{d}}y) = \int _{B_R}\zeta (x)\lambda (\,{\textrm{d}}x) = \int _{\Omega \cap \{X(\sigma )\in B_R\}} \zeta (x)\pi (\,{\textrm{d}}x \,{\textrm{d}}y). \end{aligned}$$

Similarly, using also the definition of \(f_R\),

$$\begin{aligned} III + V&= \int _{\partial B_R\times {\mathbb {R}}^d} \int \zeta (x)\lambda _z(\,{\textrm{d}}x){{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}y) = \int _{\partial B_R} \zeta (z)f_R(\,{\textrm{d}}z) \\&= \int _{\Omega \cap \{X(\sigma )\in \partial B_R\}} \zeta (x)\pi (\,{\textrm{d}}x \,{\textrm{d}}y). \end{aligned}$$

In particular, we have shown

$$\begin{aligned} \int \zeta (x){{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) = \int \zeta (x)\pi (\,{\textrm{d}}x\,{\textrm{d}}y) = \int \zeta (x)\lambda (\,{\textrm{d}}x) \end{aligned}$$

as desired.

Using optimality of \(\pi \) in the form

$$\begin{aligned} \int _{\Omega } c(x-y)\,{\textrm{d}}\pi + \int _{\Omega ^c} c(x-y)\,{\textrm{d}}\pi \le \int c(x-y)\,{\textrm{d}}{{\tilde{\pi }}} \end{aligned}$$

and testing (4.1) against \(\zeta (x,y)= c(x-y)\), we learn

$$\begin{aligned}&\int _\Omega c(x-y)\,{\textrm{d}}\pi \\&\quad \le \int _{B_R\times B_R} c(x-y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) + \int _{\partial B_R\times B_R} \int c(x-y)\lambda _z(\,{\textrm{d}}x){{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}y) \\&\qquad +\int _{B_R\times \partial B_R} c(x-y)\mu _w(\,{\textrm{d}}y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}w) \\&\qquad + \int _{\partial B_R\times \partial B_R} \int \int c(x-y)\mu _w(\,{\textrm{d}}y)\lambda _z(\,{\textrm{d}}x) {{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}w)\\&\quad = \int _{B_R\times B_R} f_1 {{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) + \int _{\partial B_R\times B_R} f_2 {{\overline{\pi }}}(\,{\textrm{d}}z\,{\textrm{d}}y) + \int _{B_R\times \partial B_R} f_3 {{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}w) \\&\qquad + \int _{\partial B_R\times \partial B_R} f_4 {{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}w) \end{aligned}$$

As in the proof of Lemma 2.1, for any \(\delta >0\), there is \(C_\delta >0\) such that for any xyz,

$$\begin{aligned} c(x-z)\le (1+\delta ) c(x-y) + C_\delta c(y-z). \end{aligned}$$

Using this in combination with the fact that \(\lambda _z\), \(\mu _w\) are probability measures we deduce

$$\begin{aligned} f_2&\le (1+\delta )c(z-y) + C(\delta ){{\tilde{f}}}_2, \quad f_3 \le (1+\delta )c(x-w) + C(\delta ){{\tilde{f}}}_3\\ f_4&\le (1+\delta )c(z-w)+C(\delta ) {{\tilde{f}}}_4, \end{aligned}$$

where

$$\begin{aligned} {{\tilde{f}}}_2(z,y)= & {} \int c(x-z)\lambda _z(\,{\textrm{d}}x),\quad {{\tilde{f}}}_3(x,w) = \int c(w-y)\mu _w(\,{\textrm{d}}y),\\ {{\tilde{f}}}_4(z,w)= & {} {{\tilde{f}}}_2(z,y) + {{\tilde{f}}}_3(x,w). \end{aligned}$$

In particular, we deduce

$$\begin{aligned}&\int _\Omega c(x-y)\,{\textrm{d}}\pi \\&\quad \le (1+\delta )\int _{{{\overline{B}}}_R\times {{\overline{B}}}_R}c(x-y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) + C(\delta )\int _{\partial B_R\times B_R} {{\tilde{f}}}_2 {{\overline{\pi }}}(\,{\textrm{d}}z\,{\textrm{d}}y) \\&\qquad + C(\delta ) \int _{B_R\times \partial B_R} {{\tilde{f}}}_3 {{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}w)+ C(\delta ) \int _{\partial B_R\times \partial B_R} {{\tilde{f}}}_4{{\overline{\pi }}}(\,{\textrm{d}}z \,{\textrm{d}}w)\\&\quad = (1+\delta )\int _{{{\overline{B}}}_R\times {{\overline{B}}}_R}c(x-y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) + 2 C(\delta )\int _{\partial B_R\times {\mathbb {R}}^d} \int c(x-z) \lambda _z(\,{\textrm{d}}x){{\overline{\pi }}}(\,{\textrm{d}}z\,{\textrm{d}}y) \\&\qquad + 2C(\delta )\int _{{\mathbb {R}}^d\times \partial B_R} \int c(w-y)\mu _w (\,{\textrm{d}}y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}w)\\&\quad = (1+\delta )\int _{{{\overline{B}}}_R\times {{\overline{B}}}_R}c(x-y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) + 2 C(\delta )\int _{\partial B_R} \int c(x-z)\lambda _z(\,{\textrm{d}}x)f_R(\,{\textrm{d}}z) \\&\qquad + 2C(\delta )\int _{\partial B_R} \int c(w-y) \mu _w(\,{\textrm{d}}y)g_R(\,{\textrm{d}}w)\\&\quad = (1+\delta )\int _{{{\overline{B}}}_R\times {{\overline{B}}}_R}c(x-y){{\overline{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y) + 2 C(\delta )\int _{\Omega \cap \{X(\sigma )\in \partial B_R\}} c(x-X(\sigma ))\,{\textrm{d}}\pi \\&\qquad + 2C(\delta )\int _{\Omega \cap \{ X(\tau ) \in \partial B_R\}} c(X(\tau )-y)\,{\textrm{d}}\pi . \end{aligned}$$

In order to obtain the second to last line we used the admissibility of \({{\overline{\pi }}}\). Now note on the one hand, that due to optimality of \({{\overline{\pi }}}\),

$$\begin{aligned} \int _{{{\overline{B}}}_R\times {{\overline{B}}}_R} c(x-y)\,{\textrm{d}}{{\overline{\pi }}} = W_c(\lambda \llcorner B_R + f_R,\mu \llcorner B_R + g_R). \end{aligned}$$

On the other hand, on \(\Omega \cap \{X(\sigma )\in \partial B_R\} \cap \{X(\tau )\in \partial B_R\}\) we have, for some \(\rho _1, \rho _2\ge 0\) with \(\rho _1 + \rho _2\le 1\), due to convexity of c and \(c(0)=0\),

$$\begin{aligned} c(x-X(\sigma )) + c(X(\tau )-y) = c(\rho _1(x-y)) + c(\rho _2(x-y)) \le c(x-y). \end{aligned}$$
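Indeed, since c is convex with \(c(0)=0\), we have \(c(\rho v)=c(\rho v+(1-\rho )\,0)\le \rho \,c(v)+(1-\rho )c(0)=\rho \,c(v)\) for \(\rho \in [0,1]\); together with \(c\ge 0\) this gives

```latex
c(\rho_1(x-y))+c(\rho_2(x-y))
  \le (\rho_1+\rho_2)\,c(x-y)
  \le c(x-y).
```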

Thus, we have shown

$$\begin{aligned}&\int _\Omega c(x-y)\,{\textrm{d}}\pi \\&\quad \le (1+\delta ) W_c(\lambda \llcorner B_R + f_R,\mu \llcorner B_R+g_R)+C(\delta ) \int _{\Omega \cap (\{X(\sigma )\in \partial B_R\}\cup \{X(\tau )\in \partial B_R\})} c(x-y)\,{\textrm{d}}\pi \\&\quad = (1+\delta ) W_c(\lambda \llcorner B_R + f_R,\mu \llcorner B_R+g_R)+C(\delta ) \int _{\Omega \cap \{\exists t\in [0,1]:X(t)\in \partial B_R\}} c(x-y)\,{\textrm{d}}\pi \\&\quad \le (1+\delta ) W_c(\lambda \llcorner B_R + f_R,\mu \llcorner B_R+g_R)+C(\delta ) (E(4)+D(4))^{1+\frac{1}{p+d}}. \end{aligned}$$

To obtain the last line, we used Corollary 3.3. Choosing \(\varepsilon \) sufficiently small the result follows. \(\square \)

5 Approximating the boundary data

Before we can implement the \(c^*\)-harmonic approximation, we face another problem. Lemma 4.1 suggests that the \(c^*\)-harmonic function \(\phi \) we should use in Theorem 1.1 is given as a solution of the following Neumann problem:

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{div}\,\nabla c^*(\textrm{D}\phi )=\mu -\lambda &{} \quad \text {in } B_R\\ \nabla c^*(\textrm{D}\phi )\cdot \nu = g_R-f_R &{} \quad \text {on } \partial B_R. \end{array}\right. } \end{aligned}$$

However, \(f_R\), \(g_R\), as well as \(\lambda ,\mu \), are not sufficiently smooth for \(\phi \) to make sense as a weak solution, and we will not be able to apply the regularity results of Lemma 1.2 as they stand. Hence, we will approximate \(f_R\), \(g_R\) by suitable \(L^p(\partial B_R)\)-functions \({{\bar{f}}}_R\) and \({{\bar{g}}}_R\) and will replace \(\mu -\lambda \) with the constant \(c=\int _{\partial B_R} (g_R-f_R)\). After choosing a suitable radius \(R\in [2,3]\), this approximation is given by the following result:

Lemma 5.1

Let \(\tau >0\). Let \(\lambda ,\mu \) be admissible measures on \({\mathbb {R}}^d\). There is \(\varepsilon >0\) such that if \(E(4)+D(4)\le \varepsilon \), then for every \(R\in [2,3]\) there exist non-negative functions \({{\overline{f}}}_R\), \({{\overline{g}}}_R\) such that

$$\begin{aligned}&W_c(f_R,{{\overline{f}}}_R)+W_c(g_R,{{\overline{g}}}_R)\lesssim \tau E(4)+ D(4),\\&\int _2^3 \int _{\partial B_R} \left( {{\overline{g}}}_R^p+{{\overline{f}}}_R^p\right) \,{\textrm{d}}R\lesssim E(4)+D(4). \end{aligned}$$

Here \(f_R, g_R\) are the measures defined in (2.3).

Proof

By symmetry it suffices to focus on the terms involving g.

We begin by constructing \({{\overline{g}}}_R\). Let \({{\overline{\pi }}}\) be optimal for \(W_c(\mu \llcorner B_4,\kappa _{\mu ,4} \,{\textrm{d}}z\llcorner B_4)\). Extend \({{\overline{\pi }}}\), which is supported on \(B_4\times B_4\), by \(\frac{\mu \llcorner B_4^c\otimes \mu \llcorner B_4^c}{\mu (B_4^c)}\) to \({\mathbb {R}}^d\times {\mathbb {R}}^d\). Note that this extension, still denoted \({{\overline{\pi }}}\), has marginals \(\mu \) and \(\kappa _{\mu ,4}\,{\textrm{d}}z\llcorner B_4+\mu \llcorner B_4^c\). Further, due to the definition of D, \(W_c(\mu \llcorner B_4,\kappa _{\mu ,4} \,{\textrm{d}}z \llcorner B_4)\lesssim D(4)\). Introduce the family of probability measures \(\{{{\overline{\pi }}}(\cdot |y)\}_{y\in {\mathbb {R}}^d}\) by asking that for every test function \(\zeta \)

$$\begin{aligned} \int \zeta (y,z){{\overline{\pi }}}(\,{\textrm{d}}z|y)\mu (\,{\textrm{d}}y) = \int \zeta (y,z){{\overline{\pi }}}(\,{\textrm{d}}y \,{\textrm{d}}z). \end{aligned}$$

Let \(\pi \) be the minimiser of \(W_c(\lambda ,\mu )\). Then define \({{\tilde{\pi }}}\) on \({\mathbb {R}}^d\times {\mathbb {R}}^d\times {\mathbb {R}}^d\) by the formula

$$\begin{aligned} \int \zeta (x,y,z){{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y \,{\textrm{d}}z) = \int \int \zeta (x,y,z){{\overline{\pi }}}(\,{\textrm{d}}z|y)\pi (\,{\textrm{d}}x \,{\textrm{d}}y) \end{aligned}$$

valid for any test function \(\zeta \). We note that with respect to the (xy) variables \({{\tilde{\pi }}}\) has marginal \(\pi \), while with respect to the (yz) variables \({{\tilde{\pi }}}\) has marginal \({{\overline{\pi }}}\).

Fix \(R\in [2,3]\). Extend a trajectory \(X\in \Omega \) in a piecewise affine fashion by setting for \(t\in [1,2]\),

$$\begin{aligned} X(t) = (t-1)z+(2-t)y. \end{aligned}$$

Note that the distribution \(g^\prime \) of the endpoint of those trajectories that exit \({{\overline{B}}}_R\) during the time interval [0, 1] is given by

$$\begin{aligned} \int \zeta \,{\textrm{d}}g^\prime = \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \zeta (z){{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y \,{\textrm{d}}z). \end{aligned}$$
(5.1)

Note that due to Lemma 3.1, \(y=X(1)\in B_4\) for any trajectory X that contributes to (5.1). Since \({{\overline{\pi }}}(B_4\times B_4^c) = 0\), we deduce that also \(z=X(2)\in B_4\) and hence that \(g^\prime \) is supported in \(B_4\). In particular, we may estimate for any \(\zeta \ge 0\), using that the second marginal of \({{\overline{\pi }}}\) is \(\kappa _{\mu ,4}\,{\textrm{d}}z\llcorner B_4+\mu \llcorner B_4^c\),

$$\begin{aligned} \int \zeta \,{\textrm{d}}g^\prime \le \int _{\{z\in B_4\}} \zeta (z){{\overline{\pi }}}(\,{\textrm{d}}y \,{\textrm{d}}z) = \kappa _{\mu ,4}\int _{B_4}\zeta . \end{aligned}$$

This shows that \(g^\prime \) has a density, still denoted \(g^\prime \), satisfying \(g^\prime \le \kappa _{\mu ,4}\) and allows us to conclude the construction of \({{\overline{g}}}_R\) by defining

$$\begin{aligned} \int \zeta \,{\textrm{d}}{{\overline{g}}}_R = \int \zeta \left( R \frac{z}{|z|}\right) g^\prime (\,{\textrm{d}}z). \end{aligned}$$

We now turn to establishing the claimed estimates for \({{\overline{g}}}_R\). Note that, directly from the definitions of \({{\tilde{\pi }}}\), \(g^\prime \) and \({{\overline{g}}}_R\), an admissible plan for \(W_c(g_R,{{\overline{g}}}_R)\) is given by the pushforward of \({{\tilde{\pi }}}\) restricted to \(\Omega \cap \{X(\tau )\in \partial B_R\}\) under \((x,y,z)\mapsto \left( X(\tau ),R\frac{z}{|z|}\right) \), that is, the plan acting on test functions \(\zeta \) via

$$\begin{aligned} \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \zeta \left( X(\tau ),R\frac{z}{|z|}\right) \,{\textrm{d}}{{\tilde{\pi }}}. \end{aligned}$$

Indeed, due to the definition of \({{\tilde{\pi }}}\) and the definition of \(g_R\) in (2.3), for any test function \(\zeta \),

$$\begin{aligned} \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \zeta (X(\tau ))\,{\textrm{d}}{{\tilde{\pi }}} = \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \zeta (X(\tau ))\,{\textrm{d}}\pi = \int \zeta \,{\textrm{d}}g_R. \end{aligned}$$

On the other hand, using (5.1) and the definition of \({{\overline{g}}}_R\),

$$\begin{aligned} \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \zeta \left( R \frac{z}{|z|}\right) \,{\textrm{d}}{{\tilde{\pi }}} = \int \zeta \left( R\frac{z}{|z|}\right) \,{\textrm{d}}g^\prime = \int \zeta \,{\textrm{d}}{{\overline{g}}}_R. \end{aligned}$$

In particular, using the p-growth of c (1.12),

$$\begin{aligned} W_c(g_R,{{\overline{g}}}_R)\lesssim \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} c\left( X(\tau )-R\frac{z}{|z|}\right) \,{\textrm{d}}{{\tilde{\pi }}}\lesssim \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \left|X(\tau )-R\frac{z}{|z|}\right|^p\,{\textrm{d}}{{\tilde{\pi }}}. \end{aligned}$$

Noting that \(|X(\tau )-R\frac{z}{|z|}|\le 2 |X(\tau )-z|\), we deduce

$$\begin{aligned} \left|X(\tau )-R\frac{z}{|z|}\right|^p \lesssim |x-y|^p+|y-z|^p. \end{aligned}$$
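For completeness, we sketch the elementary chain behind the last two displays: since \(|X(\tau )|=R\),

$$\begin{aligned} \left|X(\tau )-R\frac{z}{|z|}\right|\le |X(\tau )-z|+\left|z-R\frac{z}{|z|}\right| = |X(\tau )-z|+\big ||z|-|X(\tau )|\big |\le 2|X(\tau )-z|, \end{aligned}$$

and, as \(X(\tau )\) lies on the segment connecting x and y, \(|X(\tau )-z|\le |x-y|+|y-z|\); raising to the p-th power yields the claim.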

Thus, we deduce using once again the p-growth of c (1.12) and Corollary 3.3,

$$\begin{aligned} W_c(g_R,{{\overline{g}}}_R) \lesssim \int _{\Omega \cap \{\exists t\in [0,1]:X(t)\in \partial B_R\}} c(x-y)\,{\textrm{d}}\pi + D(4)\lesssim E(4)^{1+\frac{1}{p+d}}+D(4). \end{aligned}$$

Since \(E(4)\le \varepsilon \), choosing \(\varepsilon \) sufficiently small ensures \(E(4)^{1+\frac{1}{p+d}}\le \tau E(4)\), so that the first estimate holds.

Noting \(\sup g^\prime \le \kappa _{\mu ,4}\lesssim 1\), in order to prove the second inequality, it suffices to prove

$$\begin{aligned} \int _2^3 \int |R-|x||^{p-1} \,{\textrm{d}}g^\prime \,{\textrm{d}}R \lesssim E(4)+D(4) \end{aligned}$$

and to apply Lemma 2.3. The support condition of Lemma 2.3 is satisfied by \(g^\prime \) due to Lemma 3.1. Note that by definition of \(g^\prime \),

$$\begin{aligned} \int |R-|x||^{p-1}\,{\textrm{d}}g^\prime&= \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} ||z|-R|^{p-1}{{\tilde{\pi }}}(\,{\textrm{d}}x\,{\textrm{d}}y \,{\textrm{d}}z)\\&\lesssim \int _{\{y\in B_4\}\cap \{\min _{[0,1]}|X|\le R\le \max _{[0,1]} |X|\}} |x{-}y|^{p-1}{+}|y{-}z|^{p-1}{{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y\,{\textrm{d}}z). \end{aligned}$$

In order to obtain the second line, we observed that since \(|X(\tau )|= R\), it holds that \({||z|-R|\le |x-y|+ |y-z|}\). In addition we noted that, since \(X(\tau )\in \partial B_R\), we have \({\min _{[0,1]}|X|\le R\le \max _{[0,1]} |X|}\) and \(X(1)\in B_4\) due to Lemma 3.1. Integrating over R, this gives

$$\begin{aligned}&\int _2^3 \int ||z|-R|^{p-1} \,{\textrm{d}}g^\prime \,{\textrm{d}}R \\&\quad \lesssim \int _{\{y\in B_4\}} (\max _{[0,1]} |X|-\min _{[0,1]} |X|)\left( |x-y|^{p-1}+|y-z|^{p-1}\right) {{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y \,{\textrm{d}}z)\\&\quad \le \int _{\{y\in B_4\}} |x-y|\left( |x-y|^{p-1}+|y-z|^{p-1}\right) {{\tilde{\pi }}}(\,{\textrm{d}}x \,{\textrm{d}}y \,{\textrm{d}}z)\\&\quad \lesssim \int _{\{y\in B_4\}} |x-y|^p\pi (\,{\textrm{d}}x \,{\textrm{d}}y) + \int |y-z|^p {{\overline{\pi }}}(\,{\textrm{d}}y \,{\textrm{d}}z)\\&\quad \le E(4)+D(4). \end{aligned}$$

The second-to-last line was obtained by applying Young’s inequality. This concludes the proof. \(\square \)

6 Restricting the data

As we see from Lemma 5.1, we will not be able to work on \(B_4\) directly, but will have to pass to a smaller ball \(B_R\) for some suitably chosen \(R\in [2,3]\). Hence, for a well-chosen R, we need to control D(R), while at the moment we only control D(4). Unfortunately, this does not follow immediately from the definition but requires a technical proof utilising ideas of the previous sections. The outcome of these considerations is the following lemma:

Lemma 6.1

For any non-negative measure \(\mu \) there is \(\varepsilon >0\) such that if \(D(4)\le \varepsilon \), then

$$\begin{aligned} \int _2^3 \left( W_{c}(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)+\frac{1}{\kappa _{\mu ,R}}(\kappa _{\mu ,R}-1)^p\right) \,{\textrm{d}}R\lesssim D(4). \end{aligned}$$

Proof

Fix \(R\in [2,3]\). In this proof \(\pi \) will denote the optimal transference plan for the problem \({W_{c}(\mu \llcorner B_4,\kappa _{\mu ,4}\,{\textrm{d}}x\llcorner B_4)}\). Note that due to the \(L^\infty \)-bound Lemma 3.1, if \(X(0)\in B_R\), then \(X(1)\in B_4\) and if \(X(1)\in B_R\), then \(X(0)\in B_4\). Define the measures \(0\le f^\prime \le \kappa _{\mu ,4}\) on \({{\overline{B}}}_R\) and \(0\le g^\prime \le \kappa _{\mu ,4}\) on \({{\overline{B}}}_4\setminus B_R\), which record where exiting and entering trajectories end up by asking that for all test functions \(\zeta \),

$$\begin{aligned} \int \zeta \,{\textrm{d}}f^\prime&:=\int _{\Omega \cap \{X(0)\not \in B_R\}\cap \{X(1)\in B_R\}}\zeta (X(1))\,{\textrm{d}}\pi \\ \int \zeta \,{\textrm{d}}g^\prime&:=\int _{\Omega \cap \{X(0)\in B_R\}\cap \{X(1)\not \in B_R\}}\zeta (X(1))\,{\textrm{d}}\pi . \end{aligned}$$

Introduce the mass densities

$$\begin{aligned} \kappa _f = \frac{f^\prime ({\mathbb {R}}^d)}{|B_R|}\le \kappa _{\mu ,4}, \qquad \kappa _g = \frac{g^\prime ({\mathbb {R}}^d)}{|B_R|}. \end{aligned}$$

We use Lemma 2.1 to deduce

$$\begin{aligned}&W_c(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)\\&\quad \lesssim W_c(\mu \llcorner B_R,\kappa _{\mu ,4}\,{\textrm{d}}x\llcorner B_R- f^\prime +g^\prime ) + W_c(\kappa _{\mu ,4}\,{\textrm{d}}x\llcorner B_R-f^\prime \\&\qquad + g^\prime ,(\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R+g^\prime )+ W_c((\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R+g^\prime ,\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)\\&\quad = I + II + III. \end{aligned}$$

Restricting \(\pi \) to trajectories that start in \(B_R\) gives an admissible plan for I. Consequently,

$$\begin{aligned} I=W_c(\mu \llcorner B_R,\kappa _{\mu ,4}\,{\textrm{d}}x\llcorner B_R-f^\prime + g^\prime )\le W_c(\mu \llcorner B_4,\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_4)\le D(4). \end{aligned}$$

Since II will be estimated in the same way as III, but is slightly more tricky, we first estimate III. In order to estimate III, introduce the projection \({{\hat{g}}}\) of \(g^\prime \) onto \(\partial B_R\) via (2.1). Using Lemma 2.1, we deduce

$$\begin{aligned} III&\lesssim W_c((\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x\llcorner B_R+{{\hat{g}}},\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)+W_c({{\hat{g}}}, g^\prime ). \end{aligned}$$
(6.1)

Regarding the first term, we claim that

$$\begin{aligned} W_c((\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R+{{\hat{g}}},\kappa _{\mu ,R}\,{\textrm{d}}x \llcorner B_R)\lesssim \int _{\partial B_R}({{\hat{g}}})^p. \end{aligned}$$

Indeed, an admissible density-flux pair \((\rho ,j)\) for the Benamou–Brenier formulation (1.10) is given by

$$\begin{aligned} {\left\{ \begin{array}{ll} \rho _t = (\kappa _{\mu ,4}-\kappa _f+t\kappa _g)\,{\textrm{d}}x\llcorner B_R +(1-t){{\hat{g}}}\\ j_t = \nabla c^*(\textrm{D}\phi ) \,{\textrm{d}}x \llcorner B_R, \end{array}\right. } \end{aligned}$$

where \(\phi \) solves

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{div}\,\nabla c^*(\textrm{D}\phi ) = \kappa _g &{}\quad \text {in } B_R\\ \nu \cdot \nabla c^*(\textrm{D}\phi ) = {{\hat{g}}} &{}\quad \text {on } \partial B_R. \end{array}\right. } \end{aligned}$$

We find, writing \(s=(\kappa _{\mu ,4}-\kappa _f+t\kappa _g)\), for any \(\zeta \) supported in \(B_R\),

$$\begin{aligned} \int \zeta \,{\textrm{d}}j_t -\int c^*(\zeta )\,{\textrm{d}}\rho _t&= \int _{B_R} \zeta \cdot \nabla c^*(\textrm{D}\phi )-c^*(\zeta )s\,{\textrm{d}}x\nonumber \\&\le \int _{B_R} s\, c \left( \frac{1}{s}\nabla c^*(\textrm{D}\phi )\right) \,{\textrm{d}}x. \end{aligned}$$
(6.2)

To obtain the second line, we used the Fenchel-Young inequality. Assuming \(|s-1|\ll 1\) for now, using the p-growth of c (1.12) it is straightforward to see that

$$\begin{aligned} \int _{B_R}\int _0^1 s c \left( \frac{1}{s} \nabla c^*(\textrm{D}\phi )\right) \,{\textrm{d}}t \,{\textrm{d}}x \lesssim \int _{B_R} c\left( \nabla c^*(\textrm{D}\phi )\right) \,{\textrm{d}}x \lesssim \int _{\partial B_R} ({{\hat{g}}})^p. \end{aligned}$$

To obtain the last inequality, we used the energy inequality (1.31). We now estimate \(\int ({{\hat{g}}})^p\) and \(W_c({{\hat{g}}},g^\prime )\) arguing exactly as in the proof of Lemma 5.1 in order to conclude

$$\begin{aligned} \int {{\hat{g}}}^p+W_c({{\hat{g}}},g^\prime )\lesssim D(4). \end{aligned}$$

Since \(\kappa _{\mu ,R} = \kappa _{\mu ,4}-\kappa _f+ \kappa _g\) and \(|\kappa _{\mu ,4}-1|\lesssim D(4)^\frac{1}{p}\), to deduce that \(|s-1|\ll 1\) and to conclude the estimate of III, it suffices to show \(\kappa _f^p+\kappa _g^p\lesssim D(4)\). By symmetry it suffices to consider \(\kappa _g\). Since \({{\hat{g}}}\) is supported on \(\partial B_R\), using Young’s inequality, we find

$$\begin{aligned} \kappa _g^p = \frac{{{\hat{g}}}({\mathbb {R}}^d)^p}{|B_R|^p}\le \frac{|\partial B_R|^{p-1}}{|B_R|^p}\int _{\partial B_R}({{\hat{g}}})^p\lesssim D(4). \end{aligned}$$
(6.3)
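For the reader’s convenience, the middle inequality in (6.3) is an instance of Jensen’s (equivalently, Hölder’s) inequality on \(\partial B_R\):

$$\begin{aligned} {{\hat{g}}}({\mathbb {R}}^d)=\int _{\partial B_R}{{\hat{g}}}\le |\partial B_R|^{\frac{1}{p^\prime }}\left( \int _{\partial B_R}({{\hat{g}}})^p\right) ^\frac{1}{p}, \quad \text {so that}\quad {{\hat{g}}}({\mathbb {R}}^d)^p\le |\partial B_R|^{p-1}\int _{\partial B_R}({{\hat{g}}})^p. \end{aligned}$$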

This concludes the estimate for III.

It remains to estimate II. Using the subadditivity of \(W_c\), we have

$$\begin{aligned}&W_c(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime + g^\prime ,(\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R+g^\prime )\\&\quad \le W_c(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime ,(\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R)+W_c(g^\prime ,g^\prime )\\&\quad = W_c(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime ,(\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x\llcorner B_R ). \end{aligned}$$

We want to proceed exactly as we did in the estimate for III, the only delicate issue being that we do not have \(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime \ge c>0\). This is necessary in (6.2). However, this can be remedied by using Corollary 2.2 to deduce

$$\begin{aligned}&W_c(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime , (\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R)\\&\quad \lesssim W_c((2\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R-f^\prime , 2(\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R). \end{aligned}$$

Note that, since \(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime \ge 0\), using (6.3) and choosing \(\varepsilon \) sufficiently small,

$$\begin{aligned} (2\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R-f^\prime \ge \kappa _{\mu ,4}-\kappa _f\ge \frac{1}{2}\kappa _{\mu ,4}\ge c>0. \end{aligned}$$

In order to obtain the last inequality, we used the definition of D and chose \(\varepsilon \) sufficiently small. This allows us to proceed using the same argument as for III to conclude

$$\begin{aligned} W_c(\kappa _{\mu ,4}\,{\textrm{d}}x \llcorner B_R-f^\prime , (\kappa _{\mu ,4}-\kappa _f)\,{\textrm{d}}x \llcorner B_R)\lesssim D(4). \end{aligned}$$

This completes the proof. \(\square \)

7 The \(c^*\)-harmonic approximation result

The goal of this section is to prove the \(c^*\)-harmonic approximation result. With the results of the previous sections in hand, we can give a more precise version of Theorem 1.1. In particular, we can make explicit the problem that \(\phi \) solves.

To this end, in light of Lemma 5.1 and Lemma 6.1, given \(\tau >0\), we fix \(R\in (2,3)\) such that there exist non-negative \({{\overline{f}}}_R\), \({{\overline{g}}}_R\) such that

$$\begin{aligned}&W_c(f_R,{{\overline{f}}}_R)+W_c(g_R,{{\overline{g}}}_R)\lesssim \tau E(4)+D(4)\nonumber \\&D(R)+\int _{\partial B_R} {{\overline{f}}}_R^p+{{\overline{g}}}_R^p \lesssim E(4)+D(4). \end{aligned}$$

Then let \(\phi \) be a solution with \(\int _{B_R}\phi \,{\textrm{d}}x = 0\) of

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{div}\,\nabla c^*(\textrm{D}\phi ) = c_R &{}\quad \text {in } B_R\\ \nabla c^*(\textrm{D}\phi )\cdot \nu = {{\overline{g}}}_R-{{\overline{f}}}_R &{}\quad \text {on } \partial B_R. \end{array}\right. } \end{aligned}$$
(7.1)

where \(c_R = |B_R|^{-1}\left( \int _{\partial B_R} {{\overline{g}}}_R-{{\overline{f}}}_R\right) \) is the constant so that (7.1) is well-posed. We emphasize that while we do not make the dependence explicit in our notation, \(\phi \) depends on the choice of radius R.

With this notation in place, we state a precise version of our main result.

Theorem 7.1

For every \(\tau >0\), there exist positive constants \(\varepsilon (\tau ),C(\tau )>0\) such that if \(E(4)+D(4)\le \varepsilon (\tau )\), then there exists \(R\in (2,3)\) such that

$$\begin{aligned} \int _{\Omega _1} c(x-y-\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}\pi \le \tau E(4) + C(\tau ) D(4), \end{aligned}$$

where \(\phi \) solves (7.1).

The proof will be a direct consequence of the lemmata we prove in the following subsections. We begin in Sect. 7.1 by using strong p-convexity in order to bound a quantity related to the left-hand side of the estimate in Theorem 7.1 by a difference of energies, as well as two error terms. The first error term arises from the approximation of the boundary data, while the second error term comes from passing to the perspective of trajectories. We construct a competitor to estimate the difference of energies and estimate the two error terms in Sect. 7.2. Collecting estimates, we conclude the proof of Theorem 7.1 in Sect. 7.3.

7.1 Quasi-orthogonality

The key observation in order to prove Theorem 7.1 is contained in the following elementary lemma, which relies on the strong p-convexity of c.

Lemma 7.2

For any \(\pi \in \Pi (\lambda ,\mu )\) and \(\phi \) continuously differentiable in \({{\overline{B}}}_R\), there is a constant \(c(p,\Lambda )>0\) such that

$$\begin{aligned}&c(p,\Lambda )\int _\Omega \int _\sigma ^\tau V\left( \dot{X}(t),\nabla c^*(\textrm{D}\phi (X(t)))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \le \int _\Omega c(x-y)\,{\textrm{d}}\pi - \int _{B_R}c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x \\&\qquad - \int _\Omega \int _\sigma ^\tau \langle \dot{X}(t)-\nabla c^*(\textrm{D}\phi (X(t))),\textrm{D}\phi (X(t))\rangle \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\qquad + \int _{B_R}c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x-\int _{\Omega }\int _\sigma ^\tau c(\nabla c^*(\textrm{D}\phi (X(t))))\,{\textrm{d}}t\,{\textrm{d}}\pi . \end{aligned}$$

Proof

We apply (1.16) with \(x = \dot{X}(t)\) and \(y=\nabla c^*(\textrm{D}\phi (X(t)))\). Noting that we have \({\nabla c(\nabla c^*(\textrm{D}\phi )) = \textrm{D}\phi }\) and \(\int _\Omega \int _\sigma ^\tau c(\dot{X}(t))\,{\textrm{d}}t\,{\textrm{d}}\pi \le \int _\Omega c(x-y)\,{\textrm{d}}\pi \), this gives the desired result. \(\square \)
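To spell this out (assuming that the strong p-convexity (1.16) takes the form \(c(x)-c(y)-\langle \nabla c(y),x-y\rangle \ge c(p,\Lambda )V(x,y)\)), the pointwise inequality being integrated is

$$\begin{aligned} c(p,\Lambda )V\left( \dot{X}(t),\nabla c^*(\textrm{D}\phi (X(t)))\right)&\le c(\dot{X}(t))-c\left( \nabla c^*(\textrm{D}\phi (X(t)))\right) \\&\quad -\langle \dot{X}(t)-\nabla c^*(\textrm{D}\phi (X(t))),\textrm{D}\phi (X(t))\rangle ; \end{aligned}$$

integrating in t and \(\pi \), bounding \(\int _\Omega \int _\sigma ^\tau c(\dot{X}(t))\,{\textrm{d}}t\,{\textrm{d}}\pi \le \int _\Omega c(x-y)\,{\textrm{d}}\pi \) and adding and subtracting \(\int _{B_R}c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x\) gives the stated grouping of terms.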

7.2 Error estimates

We would like to apply Lemma 7.2 with \(\phi \) solving (7.1). In order to do so, we require \(\phi \in C^1({{\overline{B}}}_R)\). However note that \({{\overline{g}}}_R\) and \({{\overline{f}}}_R\) will in general not be sufficiently smooth to ensure that \(\phi \in C^1({{\overline{B}}}_R)\). Thus, we approximate them using mollification. To be precise, let \(0<r\ll 1\) and denote by \({{\overline{f}}}_R^r\) and \({{\overline{g}}}_R^r\), respectively, the convolutions of \({{\overline{f}}}_R\) and \({{\overline{g}}}_R\) with a smooth convolution kernel on \(\partial B_R\) at scale r. Set \(\phi ^r\) to be the solution with \(\int _{B_R} \phi ^r\,{\textrm{d}}x = 0\) of

$$\begin{aligned} {\left\{ \begin{array}{ll} -\textrm{div}\,\nabla c^*(\textrm{D}\phi ^r) = c^r &{}\quad \text {in } B_R\\ \nabla c^*(\textrm{D}\phi ^r)\cdot \nu = {{\overline{g}}}_R^r-{{\overline{f}}}_R^r &{}\quad \text {on } \partial B_R. \end{array}\right. } \end{aligned}$$
(7.2)

Here \(c^r= |B_R|^{-1}\left( \int _{\partial B_R} {{\overline{g}}}_R^r-{{\overline{f}}}_R^r\right) \) is the constant such that (7.2) is well-posed.

We begin by showing that replacing \({{\overline{f}}}_R\) and \({{\overline{g}}}_R\) with \({{\overline{f}}}_R^r\) and \({{\overline{g}}}_R^r\), respectively, is not detrimental to the left-hand side of the estimate in Lemma 7.2.

Lemma 7.3

For every \(0<\tau \) there exist \(\varepsilon (\tau ), C(\tau ), r_0(\tau )>0\), such that if it holds that \({E(4)+D(4)\le \varepsilon (\tau )}\) and \(0<r\le r_0\), then there exists \(R\in [2,3]\) such that if \(\phi \) solves (7.1) and \(\phi ^r\) solves (7.2), then

$$\begin{aligned}&\int _{\Omega _{3/2}} \int _{\sigma _{3/2}}^{\tau _{3/2}} V\left( \dot{X}(t),\nabla c^*(\textrm{D}\phi (X(t)))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi {-}\int _{\Omega _{3/2}} \int _{\sigma _{3/2}}^{\tau _{3/2}}\! V\left( \dot{X}(t),\nabla c^*(\textrm{D}\phi ^r(X(t)))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim \tau (E(4)+D(4)). \end{aligned}$$

Proof

Write \(\xi (x) = \nabla c^*(\textrm{D}\phi (x))\), \(\xi ^r(x) = \nabla c^*(\textrm{D}\phi ^r(x))\). We focus on the case \(p\le 2\). The case \(p>2\) follows by similar arguments, but is easier. In light of (1.15), the controlled \(p^\prime \)-growth of \(c^*\) (1.19) and using Hölder, we find

$$\begin{aligned}&\left|\int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}} V\left( \dot{X}(t),\xi (X(t))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi -\int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}} V\left( \dot{X}(t),\xi ^r(X(t))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi \right|\\&\quad \lesssim \int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}} |\xi (X(t))-\xi ^r(X(t))|\left( |\dot{X}(t)|+|\xi (X(t))|+ |\xi ^r(X(t))|\right) ^{p-1}\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim \int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}} |\textrm{D}\phi (X(t))-\textrm{D}\phi ^r(X(t))|\left( |\textrm{D}\phi (X(t))|+ |\textrm{D}\phi ^r( X(t))|\right) ^{p^\prime -1}\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \le \left( \int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}}|\textrm{D}\phi (X(t))-\textrm{D}\phi ^r(X(t))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \right) ^\frac{1}{p^\prime }\\&\qquad \times \left( \int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}}|\textrm{D}\phi (X(t))|^{p^\prime }+|\textrm{D}\phi ^r (X(t))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \right) ^\frac{1}{p}. \end{aligned}$$

Note that due to Lemma 3.1, if \(X\in \Omega _{3/2}\), then \(X(0)\in B_{7/8}\). Thus using elliptic regularity in the form of (1.32) and (1.34),

$$\begin{aligned}&\Big |\int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}}|\textrm{D}\phi (X(t))-\textrm{D}\phi ^r(X(t))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\qquad -\int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}}|\textrm{D}\phi (X(0))-\textrm{D}\phi ^r(X(0))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \Big |\\&\quad \lesssim \left( [\textrm{D}\phi ]_{C^{0,\beta }(B_{7/8})}+[\textrm{D}\phi ^r]_{C^{0,\beta } (B_{7/8})}\right) \left( \sup _{B_{7/8}}|\textrm{D}\phi |+|\textrm{D}\phi ^r|\right) ^{p^\prime -1}\\&\qquad \times \int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}} |X(t)-X(0)|^\beta \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim c(r)(E(4)+D(4)) \pi (\Omega _{3/2})(E(4)+D(4))^\frac{\beta }{p+d}\lesssim c(r)(E(4)+D(4))^{1+\frac{\beta }{p+d}}. \end{aligned}$$

In order to obtain the last line, we used the \(L^\infty \)-bound Lemma 3.1. Moreover, using Lemma 3.2 and elliptic regularity in the form of (1.33),

$$\begin{aligned}&\int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}} |\textrm{D}\phi (X(0))-\textrm{D}\phi ^r(X(0))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \le \int _{B_{7/8}} |\textrm{D}\phi (x)-\textrm{D}\phi ^r(x)|^{p^\prime }\,{\textrm{d}}\mu \\&\quad \le \Big |\int _{B_{7/8}} |\textrm{D}\phi (x)-\textrm{D}\phi ^r(x)|^{p^\prime }(\,{\textrm{d}}\mu -\kappa _{\mu ,R}\,{\textrm{d}}x)\Big |+\kappa _{\mu ,R}\int _{B_R}|\textrm{D}\phi (x)-\textrm{D}\phi ^r(x)|^{p^\prime }\,{\textrm{d}}x\\&\quad \lesssim \left( \sup _{B_{7/8}} |\textrm{D}\phi |^{p^\prime -1}+|\textrm{D}\phi ^r|^{p^\prime -1}\right) \left( [\textrm{D}\phi ]_{C^{0,\beta }(B_{(3/2+R)/2})}+[\textrm{D}\phi ^r]_{C^{0,\beta }(B_{(3/2+R)/2})}\right) \\&\qquad \times W_{c}(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R)^\frac{\beta }{p}+ r^s(E(4)+D(4))\\&\quad \lesssim c(r)(E(4)+D(4))^{1+\frac{\beta }{p}} + r^s(E(4)+D(4)). \end{aligned}$$

Arguing similarly, that is first replacing X(t) with X(0), at the cost of making an error of size \(c(r)\left( E(4)+D(4)\right) ^{1+\frac{\beta }{p+d}}\), and then replacing \(\,{\textrm{d}}\mu \) with \(\,{\textrm{d}}x\), making an error \(c(r)(E(4)+D(4))^{1+\frac{\beta }{p}}+r^s(E(4)+D(4))\), we find

$$\begin{aligned}&\int _{\Omega _{3/2}}\int _{\sigma _{3/2}}^{\tau _{3/2}}|\textrm{D}\phi (X(t))|^{p^\prime }+|\textrm{D}\phi ^r(X(t))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim r^s(E(4)+D(4))+ c(r)(E(4)+D(4))^{1+\frac{\beta }{p+d}}. \end{aligned}$$

Collecting estimates and choosing r as well as \(\varepsilon (\tau )\) sufficiently small, the desired result follows. \(\square \)

We now turn to estimating each of the three terms on the right-hand side of the estimate in Lemma 7.2 in turn. We will see that the second and third terms are errors that arise from the approximation of the boundary data and from passing to the perspective of trajectories, respectively. Accordingly, estimating them will be essentially routine. In contrast, estimating the first term requires us to construct an appropriate competitor to \(\pi \).

Lemma 7.4

For every \(0<\tau \) there exists \(\varepsilon (\tau ),C(\tau ),r_0(\tau )>0\) such that if it holds that \({E(4)+D(4)\le \varepsilon (\tau )}\) and \(0<r\le r_0\), then there exists \(R\in [2,3]\) such that if \(\phi ^r\) solves (7.2), then

$$\begin{aligned}&\int _\Omega c(x-y)\,{\textrm{d}}\pi - \int _{B_R} c(\nabla c^*(\textrm{D}\phi ^r))\,{\textrm{d}}x \lesssim \tau E(4)+D(4). \end{aligned}$$

Proof

We note, in the case \(p\le 2\), using the p-growth of c (1.13), the \((p^\prime -1)\)-growth of \(\nabla c^*\) (1.19) and Hölder,

$$\begin{aligned}&\int _{B_R} c(\nabla c^*(\textrm{D}\phi ^r))-c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x\\&\quad \lesssim \int _{B_R} |\nabla c^*(\textrm{D}\phi ^r)-\nabla c^*(\textrm{D}\phi )|(|\nabla c^*(\textrm{D}\phi ^r)|+|\nabla c^*(\textrm{D}\phi )|)^{p-1}\,{\textrm{d}}x\\&\quad \lesssim \int _{B_R} |\textrm{D}\phi -\textrm{D}\phi ^r|\left( |\textrm{D}\phi |+|\textrm{D}\phi ^r|\right) ^{p^\prime -1}\,{\textrm{d}}x\\&\quad \lesssim \Vert \textrm{D}\phi -\textrm{D}\phi ^r\Vert _{L^{p^\prime }(B_R)}\left( \Vert \textrm{D}\phi \Vert _{L^{p^\prime }(B_R)}+\Vert \textrm{D}\phi ^r\Vert _{L^{p^\prime }(B_R)}\right) ^{p^\prime -1}\\&\quad \lesssim r^\frac{s}{p^\prime } (E(4)+D(4)). \end{aligned}$$

To obtain the last line, we used the elliptic estimates (1.30) and (1.33). In case \(p\ge 2\) a similar estimate holds by the same argument. Due to the localisation result of Lemma 4.1 and the \(L^\infty \)-bound in the form of Corollary 3.3,

$$\begin{aligned}&\int _\Omega c(x-y)\,{\textrm{d}}\pi \le W_c(\lambda \llcorner B_R+f_R,\mu \llcorner B_R+g_R)+2\int _{\Omega \cap \{\exists t\in [0,1]:X(t)\in \partial B_R \} }c(x-y)\,{\textrm{d}}\pi \\&\quad \le W_c(\lambda \llcorner B_R+f_R,\mu \llcorner B_R+g_R)+C\left( \tau E(4)+D(4)\right) . \end{aligned}$$

In particular, combining the previous two estimates and choosing r sufficiently small, it suffices to prove

$$\begin{aligned}&W_c(\lambda \llcorner B_R +f_R,\mu \llcorner B_R+g_R)- \int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x \lesssim \tau E(4)+D(4). \end{aligned}$$

Using Lemma 2.1, we obtain for \(\delta \in (0,1)\) to be fixed

$$\begin{aligned}&W_c(\lambda \llcorner B_R+f_R,\mu \llcorner B_R+g_R)\\&\quad \le (1+\delta ) W_c(\kappa _{\lambda ,R}\,{\textrm{d}}x\llcorner B_R+{{\overline{f}}}_R,\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R+{{\overline{g}}}_R)\\&\qquad + c(\delta ) \left( W_c(\lambda \llcorner B_R,\kappa _{\lambda ,R} \,{\textrm{d}}x\llcorner B_R)+W_c(\mu \llcorner B_R,\kappa _{\mu ,R} \,{\textrm{d}}x\llcorner B_R)\right) \\&\qquad +c(\delta )\left( W_c(f_R,{{\overline{f}}}_R)+W_c(g_R,{{\overline{g}}}_R)\right) . \end{aligned}$$

Noting that due to the definition of D and our choice of R,

$$\begin{aligned}&W_c(\lambda \llcorner B_R,\kappa _{\lambda ,R} \,{\textrm{d}}x\llcorner B_R)+W_c(\mu \llcorner B_R,\kappa _{\mu ,R} \,{\textrm{d}}x\llcorner B_R)+W_c(f_R,{{\overline{f}}}_R)+W_c(g_R,{{\overline{g}}}_R)\\&\quad \lesssim \tau E(4)+D(4), \end{aligned}$$

we claim that for some \(C>0\),

$$\begin{aligned} W_c(\kappa _{\lambda ,R} \,{\textrm{d}}x\llcorner B_R+{{\overline{f}}}_R,\kappa _{\mu ,R} \,{\textrm{d}}x\llcorner B_R+{{\overline{g}}}_R)\le (1+CD(4)^\frac{1}{p}) \int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x. \end{aligned}$$
(7.3)

Collecting estimates, choosing first \(\delta \) and r small, then \(\varepsilon \) small, once (7.3) is established, the proof is complete.

Establishing (7.3) is easy to do using the Benamou–Brenier formulation (1.10). For \(t\in [0,1]\) introduce the non-singular, non-negative measure

$$\begin{aligned} \rho _t = t(\kappa _{\mu ,R} \,{\textrm{d}}x \llcorner B_R+{{\overline{g}}}_R)+(1-t)(\kappa _{\lambda ,R} \,{\textrm{d}}x\llcorner B_R+{{\overline{f}}}_R), \end{aligned}$$

and the vector-valued measure

$$\begin{aligned} j_t = \nabla c^*(\textrm{D}\phi )\,{\textrm{d}}x\llcorner B_R. \end{aligned}$$

Note that (7.1) can be rewritten as

$$\begin{aligned} \frac{\,{\textrm{d}}}{\,{\textrm{d}}t} \int \zeta \,{\textrm{d}}\rho _t = \int \nabla \zeta \cdot \,{\textrm{d}}j_t \end{aligned}$$

for all test functions \(\zeta \), that is \((j_t,\rho _t)\) satisfy the continuity equation in the sense of distributions. The Benamou–Brenier formula (1.10) gives

$$\begin{aligned} W_c(\kappa _{\lambda ,R} \,{\textrm{d}}x \llcorner B_R+{{\overline{f}}}_R,\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R+{{\overline{g}}}_R)\le \int _0^1 \int c\left( \frac{\,{\textrm{d}}j_t}{\,{\textrm{d}}\rho _t}\right) \,{\textrm{d}}\rho _t \,{\textrm{d}}t. \end{aligned}$$

Here the right-hand side needs to be interpreted in the sense of (2.2). Since \(j_t\) is supported in \(B_R\) it suffices to consider \(\zeta \) supported in \(B_R\). Then by definition of \((j_t,\rho _t)\) and the Fenchel-Young inequality for any \(s>0\),

$$\begin{aligned} \int \zeta \,{\textrm{d}}j_t-\int c^*(\zeta )\,{\textrm{d}}\rho _t&= \int _{B_R}\zeta \cdot \nabla c^*(\textrm{D}\phi )- c^*(\zeta ) (t\kappa _{\mu ,R}+(1-t)\kappa _{\lambda ,R})\,{\textrm{d}}x\\&\le \int _{B_R}s c^*(\zeta ){+}s c\left( \frac{1}{s} \nabla c^*(\textrm{D}\phi )\right) {-}c^*(\zeta )(t\kappa _{\mu ,R}+(1-t)\kappa _{\lambda ,R})\,{\textrm{d}}x. \end{aligned}$$

Choosing \(s = t\kappa _{\mu ,R}+(1-t)\kappa _{\lambda ,R}\) and integrating in t, we deduce

$$\begin{aligned} W_c(\kappa _{\lambda ,R}\,{\textrm{d}}x\llcorner B_R+{{\overline{f}}}_R,\kappa _{\mu ,R} \,{\textrm{d}}x\llcorner B_R+{{\overline{g}}}_R)\le \int _{B_R}\int _0^1 s\, c\left( \frac{\nabla c^*(\textrm{D}\phi )}{s}\right) \,{\textrm{d}}t\,{\textrm{d}}x. \end{aligned}$$
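The scaled form of the Fenchel-Young inequality used here (and in (6.2)) is elementary: for any \(s>0\) and any vectors \(\zeta \), v,

$$\begin{aligned} \zeta \cdot v = s\left( \zeta \cdot \frac{v}{s}\right) \le s\, c^*(\zeta )+s\, c\left( \frac{v}{s}\right) , \end{aligned}$$

applied in the present setting with \(v = \nabla c^*(\textrm{D}\phi )\).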

Now

$$\begin{aligned} \int _{B_R}\int _0^1 s\, c\left( \frac{\nabla c^*(\textrm{D}\phi )}{s}\right) \,{\textrm{d}}t\,{\textrm{d}}x&\le \int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x {+} \int _{B_R}\int _0^1 (s{-}1) c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x\,{\textrm{d}}t\\&\quad + \int _{B_R}\int _0^1 s \left( c\left( \frac{\nabla c^*(\textrm{D}\phi )}{s}\right) -c(\nabla c^*(\textrm{D}\phi ))\right) \,{\textrm{d}}t \,{\textrm{d}}x. \end{aligned}$$

Note that

$$\begin{aligned} |s-1|\lesssim D(4)^\frac{1}{p}. \end{aligned}$$

Further using (1.13) and (1.12),

$$\begin{aligned}&\int _{B_R}\int _0^1 s \left( c\left( \frac{\nabla c^*(\textrm{D}\phi )}{s}\right) -c(\nabla c^*(\textrm{D}\phi ))\right) \,{\textrm{d}}t \,{\textrm{d}}x\\&\quad \lesssim \int _{B_R}\int _0^1 \left|1- \frac{1}{s}\right|\left( 1+\frac{1}{s}\right) ^{p-1} |\nabla c^*(\textrm{D}\phi )|^{p} \,{\textrm{d}}t\,{\textrm{d}}x \lesssim D(4)^\frac{1}{p} \int _{B_R} c(\nabla c^*(\textrm{D}\phi ))\,{\textrm{d}}x. \end{aligned}$$

Thus the proof of (7.3) is complete. \(\square \)

We turn to the second term on the right-hand side of Lemma 7.2. This term will be small due to the definition of \(\phi ^r\).

Lemma 7.5

For every \(0<\tau \) there exists \(\varepsilon (\tau ),C(\tau ),r_0(\tau )>0\) such that if it holds that \({E(4)+D(4)\le \varepsilon (\tau )}\) and \(0<r\le r_0\), then there exists \(R\in [2,3]\) such that if \(\phi ^r\) solves (7.2), then

$$\begin{aligned}&\int _\Omega \int _{\sigma }^\tau \langle \dot{X}(t)-\nabla c^*(\textrm{D}\phi ^r(X(t))),\textrm{D}\phi ^r(X(t))\rangle \,{\textrm{d}}t\,{\textrm{d}}\pi \lesssim \tau E(4)+D(4). \end{aligned}$$

Proof

Note that \(\frac{\,{\textrm{d}}}{\,{\textrm{d}}t} \phi ^r(X(t)) = \langle \dot{X}(t),\textrm{D}\phi ^r(X(t))\rangle \). Thus, using that \(\pi \in \Pi (\lambda ,\mu )\) as well as the definitions of \(f_R\) and \(g_R\),

$$\begin{aligned} \int _\Omega \int _\sigma ^\tau \langle \dot{X}(t),\textrm{D}\phi ^r(X(t))\rangle \,{\textrm{d}}t\,{\textrm{d}}\pi&= \int _\Omega \phi ^r(X(\tau ))-\phi ^r(X(\sigma ))\,{\textrm{d}}\pi \\&= \int _{\Omega \cap \{X(\tau )\in \partial B_R\}} \phi ^r(X(\tau ))\,{\textrm{d}}\pi +\int _{\{y\in B_R\}} \phi ^r(y)\,{\textrm{d}}\pi \\&\quad -\int _{\Omega \cap \{X(\sigma )\in \partial B_R\}}\phi ^r(X(\sigma ))\,{\textrm{d}}\pi - \int _{\{x\in B_R\}}\phi ^r(x)\,{\textrm{d}}\pi \\&= \int _{B_R}\phi ^r \,{\textrm{d}}(\mu -\lambda ) + \int _{\partial B_R} \phi ^r \,{\textrm{d}}(g_R-f_R). \end{aligned}$$

On the other hand, as in Lemma 7.3, at cost of an error \(c(r)(E(4)+D(4))^{1+\frac{\beta }{p+d}}\), we may replace \(\textrm{D}\phi ^r(X(t))\) with \(\textrm{D}\phi ^r(x)\) in the expression

$$\begin{aligned} \int _\Omega \int _\sigma ^\tau \langle \nabla c^*(\textrm{D}\phi ^r(X(t))),\textrm{D}\phi ^r(X(t))\rangle \,{\textrm{d}}t\,{\textrm{d}}\pi \end{aligned}$$

and \(\int _\Omega \int _\sigma ^\tau \,{\textrm{d}}t\,{\textrm{d}}\pi \) with \(\int _{B_R}\,{\textrm{d}}x\) at the cost of a further error \(c(r)(E(4)+D(4))^{1+\frac{\beta }{p}}\). Thus it suffices to consider

$$\begin{aligned} \int _{B_R} \langle \nabla c^*(\textrm{D}\phi ^r),\textrm{D}\phi ^r\rangle \,{\textrm{d}}x = c^r\int _{B_R} \phi ^r \,{\textrm{d}}x+\int _{\partial B_R} \phi ^r\,{\textrm{d}}\left( {{\overline{g}}}^r_R-{{\overline{f}}}^r_R\right) . \end{aligned}$$
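Indeed, this identity follows by testing (7.2) with \(\phi ^r\), that is, by integrating by parts:

$$\begin{aligned} \int _{B_R} \langle \nabla c^*(\textrm{D}\phi ^r),\textrm{D}\phi ^r\rangle \,{\textrm{d}}x&= \int _{B_R}\left( -\textrm{div}\,\nabla c^*(\textrm{D}\phi ^r)\right) \phi ^r\,{\textrm{d}}x+\int _{\partial B_R}\phi ^r\, \nu \cdot \nabla c^*(\textrm{D}\phi ^r)\\&= c^r\int _{B_R}\phi ^r\,{\textrm{d}}x+\int _{\partial B_R}\phi ^r\,{\textrm{d}}\left( {{\overline{g}}}_R^r-{{\overline{f}}}_R^r\right) . \end{aligned}$$

In fact, the first term on the right-hand side vanishes due to the normalisation \(\int _{B_R}\phi ^r\,{\textrm{d}}x=0\).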

Collecting estimates, we have shown

$$\begin{aligned}&\int _\Omega \int _{\sigma }^\tau \langle \dot{X}(t)-\nabla c^*(\textrm{D}\phi ^r(X(t))),\textrm{D}\phi ^r(X(t))\rangle \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim \int _{B_R} \phi ^r \,{\textrm{d}}(\mu -\lambda -c^r)+\int _{\partial B_R} \phi ^r \,{\textrm{d}}(g_R-{{\overline{g}}}_R^r+f_R-{{\overline{f}}}_R^r)+c(r) \left( E(4)+D(4)\right) ^{1+\frac{\beta }{p+d}} \\&\quad = I + II + III. \end{aligned}$$

We find using Lemma 3.2, Young’s inequality and elliptic regularity in the form of (1.34) as well as (1.30),

$$\begin{aligned} I&\le |\int _{B_R} \phi ^r \,{\textrm{d}}(\mu -\kappa _{\mu ,R}-\lambda +\kappa _{\lambda ,R})|+ |\int _{B_R} \phi ^r \,{\textrm{d}}(\kappa _{\mu ,R}-\kappa _{\lambda ,R}-c^r)|\\&\lesssim \sup _{B_R}|\textrm{D}\phi ^r|\left( W_{c}(\lambda \llcorner B_R,\kappa _{\lambda ,R}\,{\textrm{d}}x\llcorner B_R)^\frac{1}{p}+W_{c}(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R)^\frac{1}{p}\right) \\&\quad +|\kappa _{\mu ,R}-\kappa _{\lambda ,R}-c^r|\Vert \phi ^r\Vert _{\mathrm L^p(B_R)} \\&\lesssim \tau E(4)+ c(\tau ,r) D(4)+c(\tau )|\kappa _{\mu ,R}-\kappa _{\lambda ,R}-c^r|^{p^\prime }. \end{aligned}$$

Noting that as \(r\rightarrow 0\),

$$\begin{aligned} c^r=|B_R|^{-1} \left( {{\overline{g}}}_R^r(\partial B_R)-{{\overline{f}}}_R^r(\partial B_R)\right) \rightarrow |B_R|^{-1}\left( {{\overline{g}}}_R(\partial B_R)-{{\overline{f}}}_R(\partial B_R)\right) = \kappa _{\mu ,R}-\kappa _{\lambda ,R}, \end{aligned}$$

we deduce, choosing \(r_0\) sufficiently small, that

$$\begin{aligned} I =\int _{B_R} \phi ^r \,{\textrm{d}}(\mu -\lambda -c^r)\lesssim \tau E(4)+D(4). \end{aligned}$$

Finally consider II. By symmetry it suffices to consider the terms involving g. We estimate, denoting by \((\phi ^r)^r\) the convolution of \(\phi ^r\) with the convolution kernel used to construct \({{\overline{g}}}_R^r\),

$$\begin{aligned} \int _{\partial B_R} \phi ^r \,{\textrm{d}}(g_R-{{\overline{g}}}_R^r) = \int _{\partial B_R} \phi ^r-(\phi ^r)^r\,{\textrm{d}}{{\overline{g}}}_R+\int _{\partial B_R}\phi ^r \,{\textrm{d}}(g_R-{{\overline{g}}}_R). \end{aligned}$$
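This splitting uses a duality property of mollification; assuming \({{\overline{g}}}^r_R=\rho _r*{{\overline{g}}}_R\) for a symmetric kernel \(\rho _r\) (an assumption on the construction of the boundary data), Fubini's theorem gives the following sketch:

```latex
% Mollifier duality (sketch; assumes a symmetric kernel \rho_r):
\int_{\partial B_R} \phi^r \,\mathrm{d}\overline{g}^r_R
  = \int_{\partial B_R}\int_{\partial B_R} \rho_r(x-y)\,\phi^r(x)
      \,\mathrm{d}\overline{g}_R(y)\,\mathrm{d}\mathcal{H}^{d-1}(x)
  = \int_{\partial B_R} (\phi^r)^r \,\mathrm{d}\overline{g}_R.
% Adding and subtracting \int_{\partial B_R}\phi^r\,\mathrm{d}\overline{g}_R
% then produces the two terms on the right-hand side.
```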

Now note that a standard mollification argument shows

$$\begin{aligned} \int _{\partial B_R}\phi ^r-(\phi ^r)^r \,{\textrm{d}}{{\overline{g}}}_R\lesssim r^\frac{1}{p}\Vert \textrm{D}\phi ^r\Vert _{\mathrm L^{p^\prime }(B_R)}\Vert {{\overline{g}}}_R\Vert _{\mathrm L^p(\partial B_R)}\lesssim r^\frac{1}{p} (E(4)+D(4)), \end{aligned}$$

and that moreover using Lemma 3.2 and Young’s inequality,

$$\begin{aligned} \int _{\partial B_R} \phi ^r\,{\textrm{d}}(g_R-{{\overline{g}}}_R)\lesssim [\textrm{D}_{\textrm{tan}}\phi ^r]_{C^{0,\beta }(\partial B_R)} W_{c}({{\overline{g}}}_R,g_R)^\frac{1}{p}\lesssim c(r)(\tau E(4)+D(4)). \end{aligned}$$

Thus, collecting estimates and first choosing \(r_0\) sufficiently small, then \(\varepsilon \) small, we conclude the proof. \(\square \)

We next estimate the third term on the right-hand side of the estimate in Lemma 7.2.

Lemma 7.6

For every \(0<\tau \) there exist \(\varepsilon (\tau ),C(\tau ),r_0(\tau )>0\) such that if \({E(4)+D(4)\le \varepsilon (\tau )}\) and \(0<r\le r_0\), then there exists \(R\in [2,3]\) such that if \(\phi ^r\) solves (7.2), then

$$\begin{aligned}&\int _{B_R}c(\nabla c^*(\textrm{D}\phi ^r))\,{\textrm{d}}x-\int _\Omega \int _{\sigma }^\tau c(\nabla c^*(\textrm{D}\phi ^r(X(t))))\,{\textrm{d}}t\,{\textrm{d}}\pi \lesssim \tau E(4)+D(4). \end{aligned}$$

Proof

Set \(\xi = c(\nabla c^*(\textrm{D}\phi ^r))\). Then

$$\begin{aligned}&\int _{B_R} c(\nabla c^*(\textrm{D}\phi ^r))\,{\textrm{d}}x-\int _\Omega \int _\sigma ^\tau c(\nabla c^*(\textrm{D}\phi ^r(X(t))))\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad =(1-\kappa _{\mu ,R})\int _{B_R}\xi \,{\textrm{d}}x+\left( \kappa _{\mu ,R} \int _{B_R}\xi \,{\textrm{d}}x-\int _{B_R\times {\mathbb {R}}^d} \xi \,{\textrm{d}}\pi \right) \\&\qquad +\int _{B_R\times {\mathbb {R}}^d}\xi \,{\textrm{d}}\pi -\int _\Omega \int _\sigma ^\tau \xi (X(t))\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad = I + II + III. \end{aligned}$$

Using (1.31) and Young’s inequality, we find

$$\begin{aligned} I \le D(4)^\frac{1}{p} (E(4)+D(4))\lesssim \tau E(4) + D(4). \end{aligned}$$
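To spell out the absorption step (a sketch using only the smallness assumption \(E(4)+D(4)\le \varepsilon \)): if \(\varepsilon \le \min (1,\tau ^p)\), then \(D(4)^{1/p}\le \varepsilon ^{1/p}\le \min (1,\tau )\), so that

```latex
% Absorption via smallness: D(4)^{1/p} \le \varepsilon^{1/p} \le \min(1,\tau)
D(4)^{\frac{1}{p}}\bigl(E(4)+D(4)\bigr)
  = D(4)^{\frac{1}{p}}E(4) + D(4)^{\frac{1}{p}}D(4)
  \le \tau\,E(4) + D(4).
```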

Employing Lemma 3.2 we deduce

$$\begin{aligned} II\lesssim \Vert \xi \Vert _{C^{0,\beta }({{\overline{B}}}_R)}W_{c}(\mu \llcorner B_R,\kappa _{\mu ,R}\,{\textrm{d}}x\llcorner B_R)^\frac{\beta }{p}. \end{aligned}$$

It is straightforward to check that \(\Vert \xi \Vert _{C^{0,\beta }({{\overline{B}}}_R)}\lesssim \Vert \textrm{D}\phi ^r\Vert _{C^{0,\beta }({{\overline{B}}}_R)}\left( \sup _{{{\overline{B}}}_R}|\textrm{D}\phi ^r|\right) ^{p^\prime -1}\), so that using (1.34) and Young’s inequality,

$$\begin{aligned} II\lesssim c(r)(E(4)+D(4))D(4)^\frac{\beta }{p}\lesssim \tau E(4)+D(4). \end{aligned}$$
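The Hölder bound on \(\xi \) used above can be sketched via the chain rule. Assuming the growth \(|\textrm{D}^2c^*(z)|\lesssim |z|^{p^\prime -2}\) (an assumption consistent with, but not identical to, the controlled-duality growth (1.11)–(1.14)) and writing \(F=c\circ \nabla c^*\), the duality identity \(\nabla c\circ \nabla c^*=\textrm{id}\) gives:

```latex
% Chain rule for F = c \circ \nabla c^*, using \nabla c(\nabla c^*(z)) = z:
\mathrm{D}F(z) = \mathrm{D}^2 c^*(z)\,\nabla c(\nabla c^*(z)) = \mathrm{D}^2 c^*(z)\,z,
\qquad |\mathrm{D}F(z)| \lesssim |z|^{p'-2}\,|z| = |z|^{p'-1}.
% Hence, for x, y \in \overline{B}_R,
|\xi(x)-\xi(y)|
  \lesssim \Bigl(\sup_{\overline{B}_R}|\mathrm{D}\phi^r|\Bigr)^{p'-1}
    |\mathrm{D}\phi^r(x)-\mathrm{D}\phi^r(y)|
  \le \Bigl(\sup_{\overline{B}_R}|\mathrm{D}\phi^r|\Bigr)^{p'-1}
    [\mathrm{D}\phi^r]_{C^{0,\beta}(\overline{B}_R)}\,|x-y|^{\beta}.
```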

In order to estimate III, we first find

$$\begin{aligned}&I(X(0)\in B_R)\xi (X(0))-I(X(t)\in B_R)\xi (X(t))\\&\quad \le I(\exists s\in [0,1]:X(s)\in \partial B_R, X(0)\in {{\overline{B}}}_R)\xi (X(0))\\&\qquad + I(\forall s\in [0,1]\; X(s)\in B_R)\left( \xi (X(0))-\xi (X(t))\right) . \end{aligned}$$

Thus, using also (1.34) and Jensen’s inequality,

$$\begin{aligned} III&\le \int _0^1\int I(X(0)\in B_R)\xi (X(0))-I(X(t)\in B_R)\xi (X(t))\,{\textrm{d}}\pi \,{\textrm{d}}t\\&\le \sup _{{{\overline{B}}}_R}|\xi |\int _0^1\int I(\exists s\in [0,1]:X(s)\in \partial B_R, X(0)\in {{\overline{B}}}_R)\,{\textrm{d}}\pi \,{\textrm{d}}t\\&\quad + [\xi ]_{C^{0,\beta }({{\overline{B}}}_R)}\int _0^1 \int I(\forall s\in [0,1]\; X(s)\in B_R)|X(t)-X(0)|^\beta \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\lesssim c(r)(E(4)+D(4))\pi (\exists t\in [0,1]:X(t)\in \partial B_R)+c(r)(E(4)+D(4))\int _\Omega |x-y|^\beta \,{\textrm{d}}\pi \\&\lesssim c(r) (E(4)+D(4))^{1+\frac{1}{p+d}}+c(r)(E(4)+D(4))^{1+\frac{\beta }{p}}. \end{aligned}$$

In order to obtain the last line we used Corollary 3.3. Collecting estimates and choosing first \(r_0\) small, then \(\varepsilon \) small, we conclude the proof. \(\square \)

7.3 Proof of Theorem 7.1

We are now ready to prove Theorem 7.1.

Proof of Theorem 7.1

Applying Lemma 7.2 to \(\phi ^r\) and collecting the output of Lemma 7.3, Lemma 7.4, Lemma 7.5 and Lemma 7.6, we have shown that for any \(0<\tau \) there exist \(\varepsilon ,C>0\) such that if \(E(4)+D(4)\le \varepsilon \), then

$$\begin{aligned} \int _{\Omega _{3/2}} \int _{\sigma _{3/2}}^{\tau _{3/2}}V\left( \dot{X}(t)-\nabla c^*(\textrm{D}\phi (X(t)))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi \le \tau E(4)+C D(4). \end{aligned}$$

Arguing as in Lemma 7.3, we may replace \(\textrm{D}\phi (X(t))\) by \(\textrm{D}\phi (X(0))\) at the cost of an error of size \(\left( E(4)+D(4)\right) ^{1+\frac{\beta }{p+d}}\). Noting that due to Lemma 3.1, \(\#_1\subset \Omega _{3/2}\), we find

$$\begin{aligned} \int _{\#_1}V\left( x-y-\nabla c^*(\textrm{D}\phi (x))\right) \,{\textrm{d}}\pi \le \int _{\Omega _{3/2}}\int _0^1 V\left( \dot{X}(t)-\nabla c^*(\textrm{D}\phi (X(0)))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi . \end{aligned}$$

Now using (1.23), elliptic regularity (1.32), as well as the \(L^\infty \)-bound in the form of Corollary 3.3,

$$\begin{aligned}&\int _{\Omega _{3/2}\cap (\{\sigma _{3/2}>0\} \cup \{\tau _{3/2}<1\})}\int _{\sigma _{3/2}}^{\tau _{3/2}} V\left( \dot{X}(t)-\nabla c^*(\textrm{D}\phi (X(0)))\right) \,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim \int _{\Omega _{3/2}\cap (\{\sigma _{3/2}>0\} \cup \{\tau _{3/2}<1\})}\int _{\sigma _{3/2}}^{\tau _{3/2}} |\dot{X}(t)|^p+|\textrm{D}\phi (X(0))|^{p^\prime }\,{\textrm{d}}t\,{\textrm{d}}\pi \\&\quad \lesssim (E(4)+D(4)) \pi (\Omega _{3/2}\cap (\{\sigma _{3/2}>0\} \cup \{\tau _{3/2}<1\}))\\&\quad \lesssim (E(4)+D(4))^{1+\frac{1}{p+d}}. \end{aligned}$$

Thus, we conclude

$$\begin{aligned} \int _{\#_1}V(x-y-\nabla c^*(\textrm{D}\phi (x)))\,{\textrm{d}}\pi \lesssim \tau E(4)+D(4). \end{aligned}$$

This concludes the proof in the case \(p\ge 2\). In the case \(p\le 2\), an application of Hölder’s inequality combined with (1.12) concludes the proof. \(\square \)