1 Introduction

Let \(\Omega \) be an open convex subset of \({\mathbb {R}}^n\). Let \(\Pi \) be a group that acts freely on \(\Omega \) by affine transformations in such a way that there is a compact \(E\subseteq \Omega \) satisfying \(\Omega = \Pi E\), i.e., \(M=\Omega /\Pi \) is a smooth compact manifold. Assume also that \(\Omega \) admits a proper convex function, \(\Phi _0\), such that its Hessian tensor

$$\begin{aligned} \frac{\mathrm{d}^2\Phi _0}{\mathrm{d}x_i \mathrm{d}x_j} \mathrm{d}x_i\otimes \mathrm{d}x_j \end{aligned}$$

is \(\Pi \)-invariant. The action on \(\Omega \subset {\mathbb {R}}^n\) induces an action on \(\mathrm{d}\Phi _0(\Omega )\subset ({\mathbb {R}}^n)^*\), where \(\mathrm{d}\Phi _0:\Omega \rightarrow ({\mathbb {R}}^n)^*\) is the usual derivative of \(\Phi _0\). This action is defined by the relation

$$\begin{aligned} \mathrm{d}\Phi _0(x)=p \Leftrightarrow \mathrm{d}\Phi _0(\gamma (x)) = \gamma .p, \end{aligned}$$
(1)

where \(p\in \mathrm{d}\Phi _0(\Omega )\) and \(\gamma \in \Pi \). Let \(\mu \) and \(\nu \) be locally finite \(\Pi \)-invariant measures (throughout this paper all measures are assumed to be Borel) on \(\Omega \) and \(\mathrm{d}\Phi _0(\Omega )\), respectively. Assuming \(\mu \) has a density f and \(\nu \) has a density g, we will consider the equation

$$\begin{aligned} g\left( \mathrm{d}\Phi (x)\right) \det \left( \Phi _{ij}(x)\right) = c f(x), \end{aligned}$$
(2)

for a suitable constant \(c>0\). We will demand of a solution that it is convex and that its Hessian tensor is invariant under \(\Pi \). We will say that an absolutely continuous measure \(\mu = \rho \mathrm{d}x\) is non-degenerate if for any compact \(E\subset \Omega \) it holds that \(\rho \ge c_E > 0\). Recall also that a convex (not necessarily smooth) function, \(\Phi \), is an Alexandrov solution of (2) if the multivalued map \(\mathrm{d}\Phi \) satisfies

$$\begin{aligned} \int _{A} \mu = c\int _{\mathrm{d}\Phi (A)} \nu \end{aligned}$$

for all measurable \(A\subset \Omega \). Note that in this setting \(\mu \) and \(\nu \) do not have to be absolutely continuous. Our main theorem is

Theorem 1.1

Let \(\Omega \), \(\Pi \), and \(\Phi _0\) satisfy the conditions above. Assume that \(\mu \) and \(\nu \) are locally finite \(\Pi \)-invariant measures on \(\Omega \) and \(\mathrm{d}\Phi _0(\Omega )\), respectively. Then there is a unique constant \(c>0\) such that (2) admits an Alexandrov solution of the form \(\Phi = \Phi _0+u\) for a \(\Pi \)-invariant function u. If \(\nu \) is absolutely continuous with full support, then the solution is unique up to an additive constant. Moreover, if \(\mu \) and \(\nu \) are both absolutely continuous with non-degenerate \(C^{k,\alpha }\) densities, then \(\Phi \) is \(C^{k+2,\alpha }\).

See the end of Sect. 2.1 for examples of \(\Omega \), \(\Pi \), and \(\Phi \) satisfying the assumptions of Theorem 1.1. For now, we just note that Theorem 1.1 covers the class of flat Riemannian manifolds. In dimension two, this class contains the two-torus and the Klein bottle. In dimension three the number of examples is ten. Moreover, with the extensions to the orbifold case given in Sect. 7, Theorem 1.1 covers the class of space groups, of which there are 17 examples in dimension two and 230 in dimension three [1]. For examples where the affine transformations are not volume preserving, see [22, p. 287].
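As a sanity check on the setup of Theorem 1.1, the following minimal Python sketch treats the simplest case: \(n=1\), \(\Omega ={\mathbb {R}}\), \(\Pi ={\mathbb {Z}}\) acting by translations, and \(\Phi _0(x)=x^2/2\), so that \(\mathrm{d}\Phi _0(x)=x\) and the generator acts on the dual line by \(p\mapsto p+1\). The density f, the choice of Lebesgue measure for \(\nu \) (that is, \(g=1\)), and all function names are assumptions made for illustration and are not taken from the paper. With these choices both measures have mass 1 on a fundamental domain, so \(c=1\) and (2) reads \(\Phi ''=f\).

```python
import math

def f(x):
    # Hypothetical 1-periodic density of mu with unit mass per period.
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * x)

def period_mass(density, n=2000):
    # Midpoint rule over one fundamental domain [0, 1) of the Z-action.
    return sum(density((i + 0.5) / n) for i in range(n)) / n

def u(x):
    # Closed-form 1-periodic function with u'' = f - 1 (valid for this f only).
    return -math.cos(2.0 * math.pi * x) / (8.0 * math.pi ** 2)

def Phi(x):
    # Candidate solution Phi = Phi_0 + u with Phi_0(x) = x^2 / 2.
    return 0.5 * x * x + u(x)

def second_difference(F, x, h=1e-3):
    # Central finite-difference approximation of F''(x).
    return (F(x + h) - 2.0 * F(x) + F(x - h)) / (h * h)

mu_mass = period_mass(f)   # mass of mu on [0, 1)
nu_mass = 1.0              # Lebesgue mass of one period of p -> p + 1

# Largest deviation from Phi'' = f on a sample of points in [0, 1).
residual = max(abs(second_difference(Phi, i / 50) - f(i / 50)) for i in range(50))
print(mu_mass, nu_mass, residual)
```

In this sketch, rescaling f by a constant would have to be absorbed by c, because \(\Phi '(x+1)-\Phi '(x)=1\) fixes the total mass of \(\Phi ''\) over a period; this is the one-dimensional shadow of the mass condition in Remark 1.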

1.1 Geometric Formulation

Theorem 1.1 is a reformulation of a geometric result (Theorem 1.2) regarding Monge–Ampère equations on compact Hessian manifolds. To state it we will need some terminology from affine geometry. An affine manifold is a topological manifold M equipped with a distinguished atlas \((U_i,x^i)\) such that the transition maps \(x^i \circ (x^j)^{-1}\) are affine. Equivalently, an affine manifold is a smooth manifold equipped with a flat torsion-free connection \(\nabla \) on TM. The coordinates in the distinguished atlas are often referred to as affine coordinates on M. A function, f, on M is said to be affine (or convex) if it is affine (convex) in the affine coordinates or, equivalently, its second derivative with respect to \(\nabla \)

$$\begin{aligned} \nabla \mathrm{d}f = \frac{\mathrm{d}^2f}{\mathrm{d}x_i \mathrm{d}x_j}\mathrm{d}x_i\otimes \mathrm{d}x_j \end{aligned}$$
(3)

vanishes (is semi-positive). A Hessian metric on an affine manifold M is a Riemannian metric g which is locally of the form (3). In other words, there is a covering \(\{U_i\}\) of M and smooth functions \(\{\phi _i: U_i\rightarrow {\mathbb {R}}\}\) such that

$$\begin{aligned} g = \nabla \mathrm{d}\phi _i. \end{aligned}$$
(4)

A Hessian manifold, \((M,\{\phi _i\})\), is an affine manifold M together with a Hessian metric \(\{\phi _i\}\). For short we write \(\phi \) instead of the collection \(\{\phi _i\}\). Note that as a consequence of (4), we have that \(\phi _i - \phi _j\) is locally affine where it is defined. We will explain in Sect. 2 how the data \(\{\phi _i - \phi _j\}\) define a principal \({\mathbb {R}}\)-bundle \(L\rightarrow M\) that respects the affine structure on M (affine \({\mathbb {R}}\)-bundle for short). We will say that Hessian metrics defining the same affine \({\mathbb {R}}\)-bundle lie in the same Kähler class, and will occasionally refer to a Hessian manifold only using the data \((M,L)\) without giving reference to a specific Hessian metric.

An affine manifold is special if the transition maps preserve the Euclidean volume form on \({\mathbb {R}}^n\) or, equivalently, if the holonomy associated to \(\nabla \) sits in \(\text {SL}(n,{\mathbb {R}})\). An important property of special Hessian manifolds is that the real Monge–Ampère measure of the Hessian metric

$$\begin{aligned} \text {MA}(\phi ) = \det \left( \frac{\mathrm{d}^2\phi }{\mathrm{d}x_i \mathrm{d}x_j}\right) \mathrm{d}x_1\wedge \cdots \wedge \mathrm{d}x_n \end{aligned}$$
(5)

is invariant under coordinate transformations and globally defines a measure on M. Indeed, differential equations involving this operator have been studied in a number of papers. Existence and uniqueness for associated Monge–Ampère equations on special Hessian manifolds were first given by Cheng and Yau [8], and Delanoë [12] extended the result to general Hessian manifolds, under smoothness assumptions. Further, also using the continuity method, it is shown in [5] that \(f\in C^{0,\alpha }\) along with a two-sided bound on f suffices. A key point of this paper is that a variational approach yields existence of weak solutions for a wider class of measures; in particular, we do not need to assume that \(f>0\). Further, the variational approach also generalizes in a straightforward manner to equations with a Kähler–Einstein-like structure. In this paper we will explain that, although the expression in (5) is only well defined as a measure when M is special, fixing an absolutely continuous measure \(\nu \) on a certain dual manifold makes it possible to define a Monge–Ampère operator on general Hessian manifolds. This is in contrast to the approach in [5] where the operator in (5) is generalized to a non-special setting by considering 2-densities. More precisely, we will explain that the data \((M,L)\) defines a dual Hessian manifold \(M^*\). This is essentially the same construction found in the literature on the Strominger–Yau–Zaslow picture of mirror symmetry (see for example [2, pp. 428–429]). Given a measure \(\nu \) on \(M^*\) and a Hessian metric \(\phi \) on M, we define a \(\nu \)-Monge–Ampère measure \(\text {MA}_\nu (\phi )\) (see Definition 2.22) and consider the equation

$$\begin{aligned} \text {MA}_\nu (\phi ) = \mu \end{aligned}$$
(6)

for measures \(\mu \) on M. We will also introduce a concept of weak solutions to this equation. The majority of Sect. 3 is devoted to the proof of the following main theorem.

Theorem 1.2

Let \((M,\phi _0)\) be a compact Hessian manifold. Let \(\mu \) and \(\nu \) be probability measures on M and \(M^*\), respectively. Then there is a continuous function u on M, such that \(\phi = \phi _0 + u \) solves (6) in the weak sense. If \(\nu \) is absolutely continuous and \(\phi _1\) and \(\phi _2\) are solutions to (6), then \(\phi _2-\phi _1\) is constant. If, in addition, \(\mu \) and \(\nu \) are absolutely continuous with non-degenerate \(C^{k,\alpha }\)-densities for some \(k\in {\mathbb {N}}\) and \(\alpha \in (0,1)\), then the solution is \(C^{k+2,\alpha }\).

Remark 1

The constant c in Theorem 1.1 is determined by the fact that \(\mu \) and \(c\nu \) should define measures of equal mass on \(\Omega /\Pi \) and \(\Omega ^*/\Pi \). In Theorem 1.2 this obstruction is handled by demanding that both \(\mu \) and \(\nu \) are probability measures.

Remark 2

In [5] Caffarelli and Viaclovsky consider a certain type of Monge–Ampère equations on non-special Hessian manifolds, namely equations of the form

$$\begin{aligned} \det (\phi _{ij}) = \rho ^2, \end{aligned}$$

where \(\rho \) is a density on M. In other words, they consider equations involving the expression \(\det (\phi _{ij})\) which transforms as the square of a density on M. We stress that our approach is different. The \(\nu \)-Monge–Ampère operator defines a measure on M regardless of whether M is special or not.

It will follow from the construction that if M is special then \(M^*\) is special. If we choose \(\nu \) as the canonical \(\nabla \)-parallel measure on \(M^*\), then (6) reduces to the standard inhomogeneous Monge–Ampère equation on special manifolds considered in the literature.

While Theorem 1.1 considers affine actions on domains in Euclidean space respecting a convex exhaustion function, Theorem 1.2 considers abstract Hessian manifolds. These two points of view are equivalent by a theorem of Shima (see [20, Theorem B, p. 386]). An important aspect of this work is that most of the paper is formulated for abstract Hessian manifolds and that we adapt the framework of optimal transport to suit this setting. Our motivation for this comes from mirror symmetry and tropical geometry (in particular the framework of the Strominger–Yau–Zaslow, Gross–Wilson, and Kontsevich–Soibelman conjectures [2]). In this framework dual affine (singular) manifolds appear as the “large complex limits” of “mirror dual” complex/symplectic manifolds and the corresponding Kähler–Einstein metrics (solving complex Monge–Ampère equations) are expected to converge to solutions of real Monge–Ampère equations on the singular affine manifolds in question. Hopefully, the present approach can in the future be extended to such singular (and possibly non-compact) affine manifolds where Shima’s theorem does not hold.

Finally, we remark that the local geometry of smooth measured metric spaces of the form \((M,\nabla \mathrm{d}\phi ,\mu )\) where \(\phi \) and \(\mu \) are related as in Theorem 1.2 has recently been studied by Klartag and Kolesnikov [16]. It is interesting to note that our approach shows that a pair of measures \((\mu ,\nu )\) with smooth densities on M and \(M^*\) determines a pair of measured metric spaces \((M,\nabla \mathrm{d}\phi ,\mu )\) and \((M^*,\nabla ^* \mathrm{d}\phi ^*,\nu )\) of the form studied in [16] related by Legendre transform.

1.2 Optimal Transport Interpretation

The connection between optimal transport and solutions to the Monge–Ampère equation on \({\mathbb {R}}^n\) was discovered independently by Brenier on one hand [4] and Knott and Smith on the other hand [17]. Two generalizations of this that provide an important background for the present paper are Cordero-Erausquin’s paper on optimal transport of measures on \({\mathbb {R}}^n\) invariant under the additive action by \({\mathbb {Z}}^n\) [10] and McCann’s theorem on optimal transport on general Riemannian manifolds [18].

One of the key points of the present paper is to show that Eq. (6), defined on Hessian manifolds, also fits nicely into the theory of optimal transport. Recall that an optimal transport problem is given by two probability spaces \((X,\mu )\) and \((Y,\nu )\) together with a cost function \(c:X\times Y\rightarrow {\mathbb {R}}\). We will explain in Sect. 4 how the data \((M,L)\) determines a cost function \(c=c_{(M,L)}:M\times M^*\rightarrow {\mathbb {R}}\). This means that a Hessian manifold \((M,L)\) together with two measures \(\mu \) and \(\nu \) on M and \(M^*\), respectively, determines an optimal transport problem. Moreover, by construction, the differentials of \(\{\phi _i\}\), \(x\mapsto \mathrm{d}\phi _i|_x\), induce a diffeomorphism, which we will denote \(\mathrm{d}\phi \), from M to \(M^*\). We have the following theorem with respect to this interpretation.

Theorem 1.3

Let \((M,L)\) be a compact Hessian manifold. Let \(\mu \) and \(\nu \) be probability measures on M and \(M^*\), respectively. Assume \(\phi \) is a smooth strictly convex section of L such that

$$\begin{aligned} \text {MA}_\nu (\phi ) = \mu . \end{aligned}$$

Then \(\mathrm{d}\phi \) is the optimal transport map determined by \(M,M^*,\mu ,\nu \), and the cost function induced by \((M,L)\).

In the classical case of optimal transport, when \(X={\mathbb {R}}^n\) and \(Y=({\mathbb {R}}^n)^*\), the cost function is given by \(-\langle \cdot ,\cdot \rangle \) where \(\langle \cdot ,\cdot \rangle \) is the standard pairing of \({\mathbb {R}}^n\) with \(({\mathbb {R}}^n)^*\). Our setting is a generalization of this in the sense that the cost function induced by a Hessian manifold \((M,L)\) is induced by a pairing-like object \([\cdot ,\cdot ]\). However, \([\cdot ,\cdot ]\) will not be a bi-linear function on \(M\times M^*\). Instead it will be a (piecewise) bi-linear section of a certain affine \({\mathbb {R}}\)-bundle over \(M\times M^*\). We suspect that this might turn out to be important when setting up a similar framework in the setting of the Strominger–Yau–Zaslow, Gross–Wilson, and Kontsevich–Soibelman conjectures explained above.
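The classical cost \(-\langle x,p\rangle \) can be illustrated in the most elementary situation: finitely many equal-weight atoms on \({\mathbb {R}}\) and on its dual line. The sketch below is only an illustration under these assumptions (the point sets and all names are hypothetical and do not come from the paper). It finds the optimal assignment by brute force and compares it with the monotone (order-preserving) assignment, which in one dimension is the map one expects from the derivative of a convex function.

```python
from itertools import permutations

xs = [0.3, -1.2, 2.0, 0.9]   # hypothetical atoms of mu on R
ps = [1.5, -0.4, 0.1, 2.2]   # hypothetical atoms of nu on the dual line

def total_cost(assignment):
    # assignment[i] is the index of the target paired with xs[i];
    # the cost of a pair (x, p) is -x * p, the 1D version of -<x, p>.
    return sum(-xs[i] * ps[j] for i, j in enumerate(assignment))

# Brute force over all bijections between the two point sets.
best = min(permutations(range(len(ps))), key=total_cost)
brute_force_map = {xs[i]: ps[j] for i, j in enumerate(best)}

# Monotone assignment: k-th smallest x goes to k-th smallest p.
monotone_map = dict(zip(sorted(xs), sorted(ps)))

print(brute_force_map == monotone_map)
```

That the two assignments agree is an instance of the rearrangement inequality; it is the discrete one-dimensional counterpart of the statement that the optimal map for this cost is monotone.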

While the results of Cordero-Erausquin and McCann cited above are concerned with optimal transport with respect to a cost function given by the squared distance function of a Riemannian metric (in the case of Cordero-Erausquin: the Euclidean metric on \({\mathbb {R}}^n\)), we stress that in the present setting the cost function is not a priori related to any Riemannian metric. However, we will explain in Sect. 4 that if \((M,L)\) is special then L determines a flat Riemannian metric on M that is compatible with the affine structure. Moreover, it turns out that when \((M,L)\) is special, M and \(M^*\) are equivalent as affine manifolds. We will show that under this identification the induced cost function (defined on \(M\times M^*\)) is given by the squared distance function determined by a certain flat Riemannian metric on M, hence proving

Theorem 1.4

Let \((M,L)\) be a compact special Hessian manifold, \(\mu \) and \(\nu \) probability measures on M and \(M^*\), respectively. Then the cost function determined by \((M,L)\) is the squared distance function \(d^2/2\) of the flat Riemannian metric on M induced by L. Hence, equation (6) is equivalent to the optimal transport problem determined by \(\mu \), \(\nu \), and \(d^2/2\), where d is the distance function of the flat Riemannian metric on M induced by L.

Remark 3

We make a remark here about whether or not Theorem 1.2 in the setting of special Hessian manifolds follows from McCann’s theorem on optimal transport on Riemannian manifolds. In the light of Theorem 1.4, and given the existence of a flat Riemannian metric on M compatible with the affine structure, Theorem 1.2 in the setting of special Hessian manifolds follows from McCann’s theorem. However, the existence of a flat Riemannian metric on M compatible with the affine structure is only evident after solving the Monge–Ampère equation in Theorem 1.2 (see the proof of Proposition 4.8 for details).

1.3 The Legendre Transform

To formulate the Kantorovich type functional, we generalize the Legendre transform from \({\mathbb {R}}^n\) to the setting of Hessian metrics on affine manifolds. A Legendre transform of Hessian metrics on manifolds has appeared elsewhere in the literature, see, e.g., [2, 22]. In this setting, the Legendre transform is formulated in terms of the flat torsion-free connection \(\nabla \) of the tangent bundle TM. It is shown that the connection \(\nabla ^* = 2\nabla _\phi - \nabla \), where \(\nabla _\phi \) denotes the Levi–Civita connection defined by the Hessian metric, is also a flat torsion-free connection on TM, defining a dual affine structure on M.

We attempt to take a more global approach to constructing the Legendre transform on a Hessian manifold \((M,\phi )\). The crucial observation is that the affine structure on M allows one to define local affine functions (or more generally, affine sections to the principal \({\mathbb {R}}\)-bundle \(L\rightarrow M\) defined by \(\phi \)) on M, which in turn can be used to define the Legendre transform by a supremum formula. A difficulty lies in that generally an affine manifold does not allow any global non-trivial affine sections. In this paper this is dealt with by passing to the universal cover of the compact Hessian manifold \((M,\phi )\), which by [20] can be realized as a convex set \(\Omega \subset {\mathbb {R}}^n\) with a convex exhaustion function \(\Phi \). The key advantage of this approach, compared to that of [2, 22], is that the supremum formula allows the definition of a projection operator P mapping continuous sections to convex sections. To illustrate this point, we note that the Legendre transform in [2, 22], being defined as a change of connection on TM, is purely local, and in \({\mathbb {R}}\) reduces to the expression (for a smooth strictly convex \(\phi :{\mathbb {R}}\rightarrow {\mathbb {R}}\))

$$\begin{aligned} \phi ^{*}(\phi '(x)) = \phi '(x)x - \phi (x). \end{aligned}$$
(7)

However, issues arise when attempting to take the Legendre transform of a non-convex function f, the one immediately relevant for our purposes being that \(f^{**}\) does not define a projection operator from the space of continuous functions on \({\mathbb {R}}\) to convex functions. However, the slight modification (sometimes called the Legendre–Fenchel transform) of the above expression to

$$\begin{aligned} \phi ^*(p) = \sup _x px - \phi (x) \end{aligned}$$
(8)

allows immediately the definition of the projection \(f \mapsto f^{**}\). A main contribution of this paper is that we generalize (8) instead of (7), giving us such a projection operator. It is this projection operator that allows us to give a variational formulation of the Monge–Ampère equation, formulated in terms of a Kantorovich functional with continuous functions as domain.
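The projection property of \(f\mapsto f^{**}\) can be checked numerically on \({\mathbb {R}}\). The following is a minimal sketch of a discretized version of (8), assuming finite grids for x and p and a double-well test function of our own choosing; none of the names or parameters below come from the paper.

```python
def legendre(values, points, dual_points):
    # Discrete version of (8): for each p, the max over grid points x of p*x - f(x).
    return [max(p * x - v for x, v in zip(points, values)) for p in dual_points]

xs = [i / 50 - 2 for i in range(201)]    # grid on [-2, 2]
ps = [j / 4 - 30 for j in range(241)]    # grid of slopes on [-30, 30]

# Hypothetical non-convex test function: a double well with minima at x = -1, 1.
f = [(x * x - 1) ** 2 for x in xs]

f_star = legendre(f, xs, ps)             # f*  on the p-grid
f_proj = legendre(f_star, ps, xs)        # f** on the x-grid

# Applying the projection a second time should change nothing.
f_proj_again = legendre(legendre(f_proj, xs, ps), ps, xs)

print(f[100], f_proj[100])               # values at x = 0 before and after projecting
```

On these grids \(f^{**}\le f\) everywhere, \(f^{**}\) vanishes on \([-1,1]\) (the well is filled in), and \((f^{**})^{**}=f^{**}\) up to rounding, which is the behavior one expects of a projection onto convex functions.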

Using the variational formulation, the existence and uniqueness of solutions are reduced to a question regarding existence and uniqueness of minimizers of functionals, and a main result in this direction (which implicitly can also be found in [8]) is a compactness result for Hessian metrics in a fixed Kähler class.

Theorem 1.5

Let \((M,L)\) be a compact Hessian manifold. Then the space of convex sections of L modulo \({\mathbb {R}}\) is compact, in the topology of uniform convergence modulo \({\mathbb {R}}\).

1.4 Further Results

Using Theorem 1.5, we outline in Sect. 5 how functionals mimicking the Ding- and Mabuchi functionals in complex geometry can be shown to have minimizers, this also giving existence and uniqueness results for a Kähler–Einstein-like equation on Hessian manifolds. The main theorem in this regard can be formulated as follows.

Theorem 1.6

Let \((M,L,\phi _0)\) be a compact Hessian manifold, let \(\nu \) be an absolutely continuous probability measure of full support on \(M^*\), let \(\mu \) be a probability measure on M, and let \(\lambda \in {\mathbb {R}}\). Then the equation

$$\begin{aligned} \text {MA}_\nu \phi = e^{-\lambda (\phi - \phi _0)} \mu \end{aligned}$$
(9)

has a solution.

We wish to point out that in contrast to the complex setting, solutions to (9) do not define Einstein metrics, in the sense that (9) is not a reformulation of the Einstein equation \(\text {Ric}g = \lambda g\). However, as mentioned above the geometric properties of solutions to Eq. (9) have very recently been studied by Klartag and Kolesnikov [16]. Moreover, when \(M={\mathbb {R}}^n\), (9) has been studied as a twisted Kähler–Einstein equation on a corresponding toric manifold (see [3, 26]) and strong existence results have been given in [11]. When M is the real torus with the standard affine structure, (9) has been studied as an analogue of a twisted singular Kähler–Einstein equation in [15]. In the case when \(\lambda >0\) we will also show uniqueness of solutions to (9). When \(\lambda <0\), solutions are not unique in general. Nevertheless, with the variational approach outlined here one gets a set of distinguished solutions, namely the minimizing ones.

Further, although this paper is chiefly concerned with the case where \(M = \Omega /\Pi \) is a manifold, we outline in Sect. 7 how the results can be extended to an orbifold setting.

1.5 Atomic Measures

We also include a section on atomic measures, and show that the only convex sections \(\phi \) whose Monge–Ampère measure has finite support are the piecewise affine ones (Theorem 1.7).

Corresponding to a piecewise affine section of L is a (locally) piecewise affine function \(\Phi \) on \(\Omega \). The singular locus of \(\Phi \) defines a quasi-periodic tiling of \(\Omega \) (with respect to \(\Pi \)) by convex polytopes. This means that solving Monge–Ampère equations with atomic data corresponds to finding quasi-periodic tilings of the covering space. In the case of real tori \(M = {\mathbb {R}}^n/{\mathbb {Z}}^n\), for \(n=2,3\) this is related to the computational work in [6, 7, 27].

The main points of this section are the following theorems.

Theorem 1.7

We call a probability measure \(\mu \) on M atomic if \(\mu = \sum _{i=1}^N \lambda _i \delta _{x_i}\). Let \(\nu \) be an absolutely continuous probability measure of full support on \(M^*\). Then

$$\begin{aligned} \text {MA}_\nu \phi \text { is atomic} \Leftrightarrow \phi \, \text {is piecewise affine.} \end{aligned}$$
(10)

Theorem 1.8

Any Hessian metric \(\phi _0\) on a compact Hessian manifold \((M,L,\phi _0)\) can be approximated uniformly by a piecewise affine section.

Remark 4

We point out that the above two theorems seem to be a phenomenon specific to the compact Hessian setting, in the sense that the corresponding statements are false both in \({\mathbb {R}}^n\) and on compact Kähler manifolds. In \({\mathbb {R}}^n\), \(n\ge 2\) we may take \(\phi = \Vert x\Vert \), which is not piecewise affine, but where \(\text {MA}\phi = \delta _0\).

Further, if the Monge–Ampère measure of an \(\omega \)-plurisubharmonic function u on a compact Kähler manifold \((X,\omega )\) is discrete (see [9]), the current \(\omega + dd^c u\) does not necessarily vanish. To see this, one can take X to be complex projective space \({\mathbb {P}}^n\), and let \(\omega \) correspond to \(dd^c \log |z|^2\) on a dense embedding \({\mathbb {C}}^n \subset {\mathbb {P}}^n\). Then \(\omega \) is \({\mathbb {C}}^*\)-invariant, and descends to the Fubini–Study form on \({\mathbb {P}}^{n-1}\). Hence \(\omega \ne 0\) on the dense set \({\mathbb {C}}^n\subset {\mathbb {P}}^n\) away from the origin, but the Monge–Ampère mass is concentrated on 0.

Also note that Theorem 1.8 can be seen as analogous to an approximation result in [13], stating that an \(\omega \)-plurisubharmonic function on a compact Kähler manifold \((X,\omega )\) can be written as a decreasing sequence of \(\omega \)-plurisubharmonic functions with analytic singularities. However we obtain uniform convergence instead. To the best of our knowledge this is the first such result in the setting of Hessian manifolds.
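A one-dimensional illustration of Theorems 1.7 and 1.8 may be helpful. The sketch below assumes the circle \({\mathbb {R}}/{\mathbb {Z}}\) with smooth potential \(x^2/2\) on the cover and \(\nu \) equal to Lebesgue measure, so that the Monge–Ampère measure is simply the second derivative; these choices and all names are ours, not the paper's. It takes the maximum of the tangent lines of \(x^2/2\) at the lattice points \(j/n\), checks that this piecewise affine function satisfies the same quasi-periodicity as \(x^2/2\), that its kink at \(x=1/2\) carries a slope jump of 1, and that its uniform distance to \(x^2/2\) is at most \(1/(8n^2)\).

```python
def phi_pw(x, n=1, k_max=10):
    # Maximum of the tangent lines of x^2/2 at the lattice points j/n.
    # k_max truncates the supremum; this is enough for |x| <= k_max - 1.
    return max((j / n) * x - (j / n) ** 2 / 2
               for j in range(-k_max * n, k_max * n + 1))

def u_pw(x, n=1):
    # Difference to the smooth potential x^2/2; it equals -dist(x, Z/n)^2 / 2.
    return phi_pw(x, n) - x * x / 2

samples = [i / 97 - 2 for i in range(4 * 97 + 1)]   # sample points in [-2, 2]

# Quasi-periodicity: the same cocycle as x^2/2, Phi(x + 1) - Phi(x) = x + 1/2.
cocycle_error = max(abs(phi_pw(x + 1) - phi_pw(x) - (x + 0.5)) for x in samples)

# For n = 1 there is one kink per period, at x = 1/2, with slope jump 1.
h = 0.25
slope_jump = ((phi_pw(0.5 + h) - phi_pw(0.5)) / h
              - (phi_pw(0.5) - phi_pw(0.5 - h)) / h)

# Uniform distance to x^2/2 for finer and finer lattices.
sup_dist = {n: max(-u_pw(x, n) for x in samples) for n in (1, 2, 4, 8)}
print(cocycle_error, slope_jump, sup_dist)
```

In this example the second derivative of the piecewise affine function is a sum of point masses at the kinks, and refining the lattice gives piecewise affine functions converging uniformly to the smooth potential while staying in the same quasi-periodicity class.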

2 Geometric Setting

Definition 2.1

(affine \({\mathbb {R}}\)-bundle) An affine \({\mathbb {R}}\)-bundle over an affine manifold M is an affine manifold L and a map \(\tau : L\rightarrow M\) such that the fibers of \(\tau \) have the structure of affine manifolds isomorphic to \({\mathbb {R}}\) and such that there is a collection of local trivializations \(\{(U_i,p_i)\}\) such that the transition maps \(p_i\circ p_j^{-1}: U_j\cap U_i\times {\mathbb {R}}\rightarrow U_i\times {\mathbb {R}}\) are of the form

$$\begin{aligned} (x,y)\mapsto (x',y+\alpha _{ij}(x)) \end{aligned}$$

for some affine transition functions \(\alpha _{ij}\) on \(U_i\cap U_j\).

Remark 5

It follows that an affine \({\mathbb {R}}\)-bundle is a principal \({\mathbb {R}}\)-bundle compatible with the affine structure on M.

A section \(s:M\rightarrow L\) of an affine \({\mathbb {R}}\)-bundle is affine (or convex) if it is represented by affine (convex) functions in the trivializations.

Note that if g is a Hessian metric on M induced by \(\{\phi _i\}\), then (4) implies that \(\phi _i - \phi _j\) is affine for any i, j. Putting \(\alpha _{ij}=\phi _i-\phi _j\) defines an affine \({\mathbb {R}}\)-bundle over M in which \(\{\phi _i\}\) is a convex section. We will often refer to a Hessian manifold as \((M,L,\phi )\) where L is the affine \({\mathbb {R}}\)-bundle associated to \(\{\phi _i\}\) and \(\phi \) is the convex section in L defined by \(\{\phi _i\}\). We will also refer to \(\phi \) both as a weak Hessian metric, and as a convex section to L interchangeably. We will say that L is positive if it admits a smooth and strictly convex section. This is consistent with the terminology used in the complex geometric setting, as well as the tropical setting [19]. Also, in analogy with the setting of Kähler manifolds we make the following notational definition.

Definition 2.2

If \(\phi \) and \(\phi _0\) are convex sections to the same affine \({\mathbb {R}}\)-bundle \(L\rightarrow M\), we say that \(\phi \) lies in the Kähler class of \(\phi _0\).

Let \(\pi :\Omega \rightarrow M\) be the universal covering of M. By pulling back \(\nabla \) with the covering map we get that \(\Omega \) is also an affine manifold. The pullback of L defines an affine \({\mathbb {R}}\)-bundle over \(\Omega \). Let us denote this bundle K and let \(\pi ^*\phi \) be the pullback of \(\phi \) to K. Let \(\Gamma (\Omega ,K)\) be the space of global affine sections in K. We have the following basic

Proposition 2.3

Any local affine section of an affine \({\mathbb {R}}\)-bundle over a simply connected manifold \(\Omega \) may be uniquely extended to a global affine section.

Proof

Assume s is defined in a neighborhood of \(x\in \Omega \). To define s(y) for \(y\in \Omega \), let \(\gamma \) be a curve in \(\Omega \) from x to y. Cover \(\gamma \) with open balls \(B_i\) each contained in some local trivialization of L. In each ball there is a unique way of extending s. Moreover, replacing \(\gamma \) with a perturbation of \(\gamma \) allows us to use the same cover, \(\{B_i\}\). This means, since \(\Omega \) is simply connected, that s(y) does not depend on \(\gamma \). \(\square \)

Proposition 2.3 says that \(\Gamma (\Omega ,K)\) is isomorphic (as an affine manifold) to the space of affine functions on \({\mathbb {R}}^n\), which is isomorphic to \(({\mathbb {R}}^n)^*\times {\mathbb {R}}\), (see Remark 8). In particular \(\Gamma (\Omega ,K)\) is non-empty.

Remark 6

If \(y_1\) and \(y_2\) are two points in the same fiber of an affine \({\mathbb {R}}\)-bundle, then, since the structure group acts additively, their difference, \(y_1-y_2\), is a well-defined real number. Consequently, if \(s_1\) and \(s_2\) are sections of an affine \({\mathbb {R}}\)-bundle over a manifold M, then \(s_1-s_2\) defines a function on M. Generalizing this observation to sections \(s_1,s_2\) of the affine \({\mathbb {R}}\)-bundles \(L_1,L_2\), we see that the set of affine \({\mathbb {R}}\)-bundles over M naturally carries the structure of an \({\mathbb {R}}\)-vector space.

Taking \(q\in \Gamma (\Omega ,K)\) we may consider the pullback \(\pi ^*\phi \) of \(\phi \) to K and define

$$\begin{aligned} \Phi _{q} = \pi ^*\phi - q. \end{aligned}$$

Since both \(\pi ^*\phi \) and q are sections of K, \(\Phi _q\) is a well-defined function on \(\Omega \). Moreover, \(\nabla \mathrm{d}\Phi _q = \nabla \mathrm{d}(\pi ^*\phi )\). This means the Hessian of \(\Phi _q\) is strictly positive and defines the same metric as the one given by the pullback of the Hessian metric \(\nabla \mathrm{d}\phi \) on M. We conclude that any Hessian metric on an affine manifold may be expressed as the Hessian of a global function on the covering space. Now, by a theorem by Shima (see [20, Theorem B, p. 386]), the covering space of any compact Hessian manifold may be embedded as a convex subset in \({\mathbb {R}}^n\). Convexity of the covering space implies that \(\Phi _q\) is convex. Moreover, it readily follows from the proof in [20] that, for some choice of \(q_0\), \(\Phi _{q_0}\) is an exhaustion function of \(\Omega \).

2.1 A Dual Hessian Manifold

In the notation of the previous section we have

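[Figure a: diagram of the bundles \(K\rightarrow \Omega \) and \(L\rightarrow M\) together with the covering map \(\Omega \rightarrow M\)]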

where \(\Omega \) is the universal covering space of M and K is the pullback of L under the covering map. In this section we will define a dual diagram

[Figure b: the corresponding dual diagram with \(K^*\rightarrow \Omega ^*\) and \(L^*\rightarrow M^*\)]

with dual objects \(K^*\), \(\Omega ^*\), \(L^*\), and \(M^*\) where \(M^*\) will turn out to give (under suitable assumptions) another Hessian manifold which we will refer to as the dual Hessian manifold.

Definition 2.4

Let \(K^*\) be the subset of \(\Gamma (\Omega ,K)\) given by all \(q\in \Gamma (\Omega ,K)\) such that \(\Phi _q:\Omega \rightarrow {\mathbb {R}}\) is bounded from below and proper.

Remark 7

If \(M={\mathbb {R}}^n\) and L is the trivial affine \({\mathbb {R}}\)-bundle \({\mathbb {R}}^n\times {\mathbb {R}}\), then \(\phi \) is a strictly convex function on \({\mathbb {R}}^n\) and \(K^*\) is given by the affine functions on \({\mathbb {R}}^n\) such that their derivative is in the gradient image of \(\phi \).
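To make Remark 7 concrete, here is a minimal one-dimensional sketch, assuming the strictly convex function \(\phi (x)=\log (1+e^x)\) on \({\mathbb {R}}\), whose derivative takes values in the interval (0, 1). For an affine function \(q(x)=px+b\), the function \(\Phi _q=\phi -q\) should be proper and bounded from below exactly when p lies in (0, 1). The test function, the heuristic criterion, and all names are our own illustrative assumptions.

```python
import math

def softplus(x):
    # phi(x) = log(1 + e^x), computed stably; phi' takes values in (0, 1).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def Phi_q(x, p, b=0.0):
    # Phi_q = phi - q for the affine function q(x) = p*x + b.
    return softplus(x) - (p * x + b)

def looks_proper(p, radius=200.0):
    # Heuristic numerical test (not a proof): Phi_q should be large at both
    # ends of a long interval if it is proper and bounded from below.
    return Phi_q(-radius, p) > 10.0 and Phi_q(radius, p) > 10.0

slopes = [-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5]
in_K_star = {p: looks_proper(p) for p in slopes}
print(in_K_star)
```

Only the slopes strictly between 0 and 1 pass the test, in line with the description of \(K^*\) through the gradient image of \(\phi \); the boundary slopes 0 and 1 fail because \(\Phi _q\) then stays bounded along one end of the line.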

Lemma 2.5

The set \(K^*\subset \Gamma (\Omega ,K)\) is non-empty and open. Moreover, it depends only on \((M,L)\).

Proof

As mentioned at the end of the previous section, by [21], \(\Phi _q\) is an exhaustion function for a suitable choice of q. This means \(K^*\) is non-empty. To see that \(K^*\) is open, assume \(q\in K^*\) and note that since \(\Phi _q\) is bounded from below and proper it admits a minimizer \(x_0\in \Omega \). Let U be a neighborhood of \(x_0\). Since \(\Phi _q\) is strictly convex, \(\Phi _q(x)>\epsilon |x-x_0|-C\) outside U. We have that for any \(q'\) close to q, \(|q-q'|<\epsilon |x-x_0|/2+C'\) for some \(C'\). This means

$$\begin{aligned} \Phi _{q'} = \Phi _q + q-q'> \Phi _q - \frac{\epsilon }{2}|x-x_0| - C' > \frac{1}{2}\Phi _q -C/2 -C' \end{aligned}$$

outside U. Since \(\Phi _q/2\) is proper and bounded from below if and only if \(\Phi _q\) is proper and bounded from below, it follows that \(\Phi _{q'}\) is proper and bounded from below; hence \(q'\in K^*\).

Finally, let \(\phi \) and \(\psi \) be two Hessian metrics of the same affine \({\mathbb {R}}\)-bundle. Then \(\Phi _q-\Psi _q = \pi ^*\phi -\pi ^*\psi \) is a continuous function on \(\Omega \) that descends to M. This means it is bounded. We conclude that \(\Phi _q\) is bounded from below and proper if and only if \(\Psi _q\) is bounded from below and proper. \(\square \)

Note that, given \(C\in {\mathbb {R}}\), we may consider the map on \(\Gamma (\Omega ,K)\) given by

$$\begin{aligned} q\mapsto q+C. \end{aligned}$$
(11)

This defines a smooth, free, and proper action by \({\mathbb {R}}\) on \(\Gamma (\Omega ,K)\). Moreover, \(\Phi _q\) is proper if and only if \(\Phi _{q+C} = \Phi _q - C\) is proper, hence the action preserves \(K^*\).

Definition 2.6

We define \(\Omega ^*\) to be the quotient \(K^*/{\mathbb {R}}\).

Remark 8

We here give a way to explicitly identify \(\Omega \) and \(\Omega ^*\) with compatible embeddings in \({\mathbb {R}}^n\) and \(({\mathbb {R}}^n)^*\), respectively. Fixing a point \(q_0\in K^*\), we may write any \(q\in K^*\) as \(q = q_0 + (q-q_0)\). Since \(q - q_0\) is an affine function this yields an identification \(\Gamma (\Omega ,K) \overset{q_0}{\simeq } \Gamma (\Omega ,0)\), where 0 denotes the trivial affine \({\mathbb {R}}\)-bundle over \(\Omega \). Further, choosing a point \(x_0 \in \Omega \) and a basis for \(T_{x_0} \Omega \) yields an embedding \(i_1: \Omega \rightarrow {\mathbb {R}}^n\), and thus also an identification \(\Gamma (\Omega ,0) \overset{T_{x_0}\Omega }{\simeq } \Gamma ({\mathbb {R}}^n,0) \simeq ({\mathbb {R}}^n)^* \times {\mathbb {R}}\). This provides an embedding \(i_2: \Omega ^* \rightarrow ({\mathbb {R}}^n)^*\). In fact, as will be explained later, \(d(\pi ^* \phi )\) yields a map \(\Omega \rightarrow \Omega ^*\), and the identification can be summarized as saying that the following diagram commutes.

figure c

Lemma 2.7

The quotient map \( K^*\rightarrow \Omega ^* \) gives \(K^*\) the structure of an affine \({\mathbb {R}}\)-bundle over \(\Omega ^*\).

Proof

First of all, note that the fibers of the quotient map are affine submanifolds of \(K^*\) isomorphic to \({\mathbb {R}}\). Moreover, there is a global affine trivialization of \(K^*\) over \(\Omega ^*\). To see this, recall that by Remark 8, \(K^*\) is isomorphic to a subset of \(({\mathbb {R}}^n)^*\times {\mathbb {R}}\). The action on \(K^*\) given by (11) extends to all of \(({\mathbb {R}}^n)^*\times {\mathbb {R}}\), where it is given by \((a,b)\mapsto (a,b+C)\). In particular, the quotient map is the same as the projection map on the first factor. We conclude that the identification of \(K^*\) with the subset of \(({\mathbb {R}}^n)^*\times {\mathbb {R}}\) defines a global trivialization of \(K^*\) over \(\Omega ^*\). \(\square \)

Now, let \(\Pi \) be the fundamental group of M, acting on \(\Omega \) by deck transformations. This action extends to an action on K. To see this, note that the total space of K can be embedded in \(\Omega \times L\) as the submanifold

$$\begin{aligned} \{ (x,y)\in \Omega \times L:\pi x= \tau y \}. \end{aligned}$$

The action by \(\Pi \) on K is then given by \(\gamma (x,y) = (\gamma x,y)\). If q is an affine section of K, then its conjugate \(\gamma \circ q \circ \gamma ^{-1}\) is also an affine section of K. We get an action of \(\Pi \) on \(\Gamma (\Omega ,K)\) defined by

$$\begin{aligned} \gamma .q = \gamma \circ q \circ \gamma ^{-1}. \end{aligned}$$

Lemma 2.8

The action by \(\Pi \) on \(\Gamma (\Omega ,K)\) commutes with the action by \({\mathbb {R}}\). Moreover, if \(\phi \) is a convex section of L and \(q\in \Gamma (\Omega ,K)\), then the action satisfies

$$\begin{aligned} \Phi _{\gamma .q} = \Phi _q\circ \gamma ^{-1}. \end{aligned}$$

Finally, \(q\in K^*\) if and only if \(\gamma .q\in K^*\).

Proof

First of all, if we have two points in the same fiber of K, \((x,y_1)\) and \((x,y_2)\), then

$$\begin{aligned} \gamma (x,y_1)-\gamma (x,y_2) = (\gamma x, y_1) - (\gamma x, y_2) = y_1-y_2 = (x,y_1)-(x,y_2). \end{aligned}$$
(12)

In particular, if \(q_1,q_2\in \Gamma (\Omega ,K)\), then \(q_1=q_2+C\) if and only if \(\gamma . q_1 = \gamma . q_2+C\). This proves the first point of the lemma. Note that this implies

$$\begin{aligned} \gamma \circ (\pi ^*\phi ) - \gamma \circ q = (\pi ^*\phi ) -q. \end{aligned}$$

Since \(\pi ^*\phi \) descends to a convex section of L, we have \(\gamma \circ (\pi ^*\phi )\circ \gamma ^{-1} = \pi ^*\phi \). This means

$$\begin{aligned} \Phi _{\gamma .q}= & {} \pi ^*\phi - \gamma \circ q\circ \gamma ^{-1} \nonumber \\= & {} \gamma \circ (\pi ^*\phi )\circ \gamma ^{-1} - \gamma \circ q\circ \gamma ^{-1}= (\pi ^*\phi ) \circ \gamma ^{-1} -q\circ \gamma ^{-1} = \Phi _q\circ \gamma ^{-1} \end{aligned}$$

proving the second point of the lemma. For the last point of the lemma, note that \(\Phi _q\) is bounded from below if and only if \(\Phi _q\circ \gamma ^{-1}\) is bounded from below. Moreover, any invertible affine transformation of \({\mathbb {R}}^n\) is proper and has proper inverse. This means \(\Phi _q\) is proper if and only if \(\Phi _q\circ \gamma ^{-1}\) is proper. \(\square \)

From the first and third points of Lemma 2.8, we have that \(\Pi \) acts on \(K^*\) and \(\Omega ^*\).

Definition 2.9

We define

$$\begin{aligned} L^*&= K^*/\Pi \\ M^*&= \Omega ^*/\Pi . \end{aligned}$$

Remark 9

It is clear from the definition that the actions by \(\Pi \) on \(K^*\) and \(\Omega ^*\) are affine. However, at this point it is not clear that they are free. We will prove in the next section that K and \(K^*\) are diffeomorphic and that the action on K and the action on \(K^*\) are the same up to conjugation. This will imply that the quotients in Definition 2.9 are affine manifolds.

In many examples \(\Omega \) and \(\Phi _q\) are explicit. The action of \(\Pi \) on \(K^*\) is then explicitly described by the following lemma.

Lemma 2.10

Let \(\gamma \in \Pi \) and \(q \in K^*\). Then

$$\begin{aligned} \gamma .q = q + \Phi _q-\Phi _q\circ \gamma ^{-1}. \end{aligned}$$

Proof

From the second point of Lemma 2.8 we get

$$\begin{aligned} \Phi _q-\Phi _q\circ \gamma ^{-1} = \Phi _q-\Phi _{\gamma .q} = \gamma .q-q \end{aligned}$$

proving the lemma. \(\square \)

Example 2.11

Let \(M={\mathbb {R}}^n\), L be the trivial affine \({\mathbb {R}}\)-bundle, \({\mathbb {R}}^n\times {\mathbb {R}}\) over M, and \(\phi \) be any smooth strictly convex function on M. Then \(\Pi \) is trivial, \(M^* = \Omega ^* = \mathrm{d}\phi (M)\) and \(L^*\) is the trivial affine \({\mathbb {R}}\)-bundle, \(M^*\times {\mathbb {R}}\), over \(M^*\).

Example 2.12

Let M be the standard torus \({\mathbb {R}}^n/{\mathbb {Z}}^n\). Let \(\phi \) and L be the data defined by the Euclidean metric on M, in other words \(\Phi _{q_0} = |x|^2/2\) for some \(q_0\in \Gamma (\Omega ,K)\). Now, any \(q\in \Gamma (\Omega ,K)\) is given by \(q_0+A\) for some affine function \(A=\langle x ,a \rangle + b\) on \(\Omega \) (\(a\in {\mathbb {R}}^n\), \(b\in {\mathbb {R}}\)). This means \(\Phi _q = \Phi _{q_0} - \langle x ,a \rangle - b\) is bounded from below and proper for all \(q\in \Gamma (\Omega ,K)\) and we get that \(K^*=\Gamma (\Omega ,K)\cong {\mathbb {R}}^n\times {\mathbb {R}}\). The deck transformations act by lattice translations. Given a deck transformation \(\gamma _m: x\mapsto x + m\) for \(m\in {\mathbb {Z}}^n\), we calculate \(\gamma _m.q\) by

$$\begin{aligned} \gamma _m.q= & {} q + \Phi _q - \Phi _q \circ \gamma _m^{-1} \\= & {} q + \frac{|x|^2}{2} - \langle x , a \rangle - b - \left( \frac{|x-m|^2}{2} - \langle x-m ,a \rangle - b\right) \\= & {} q + \langle x,m \rangle - \frac{|m|^2}{2} - \langle m,a \rangle \\= & {} q_0 + \langle x,a+m \rangle + b - \frac{|m|^2}{2} - \langle m,a \rangle . \end{aligned}$$

In particular \(\Pi \) acts on \(\Omega ^*=K^*/{\mathbb {R}}\) by translations and \(M^*\) is isomorphic (as a smooth manifold) to the standard torus \({\mathbb {R}}^n/{\mathbb {Z}}^n\).
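The computation in Example 2.12 can be checked numerically. The following is a minimal sketch, assuming the identification of \(q = q_0 + \langle x,a \rangle + b\) with the pair (a, b); the helper names `phi`, `action_from_lemma`, `action_closed_form` and `max_discrepancy` are hypothetical and introduced only for illustration. It compares \(\Phi _q - \Phi _q\circ \gamma _m^{-1}\) from Lemma 2.10 with the closed form \(\langle x,m \rangle - |m|^2/2 - \langle m,a \rangle \) at random sample points.

```python
import random

def phi(x, a, b):
    """Phi_q(x) = |x|^2/2 - <x,a> - b for q = q_0 + <x,a> + b."""
    return 0.5 * sum(xi * xi for xi in x) - sum(xi * ai for xi, ai in zip(x, a)) - b

def action_from_lemma(x, m, a, b):
    """(gamma_m.q - q)(x) computed as Phi_q(x) - Phi_q(x - m), as in Lemma 2.10."""
    shifted = [xi - mi for xi, mi in zip(x, m)]
    return phi(x, a, b) - phi(shifted, a, b)

def action_closed_form(x, m, a):
    """Closed form <x,m> - |m|^2/2 - <m,a> from the displayed computation."""
    dot_xm = sum(xi * mi for xi, mi in zip(x, m))
    dot_ma = sum(mi * ai for mi, ai in zip(m, a))
    return dot_xm - 0.5 * sum(mi * mi for mi in m) - dot_ma

def max_discrepancy(n=3, trials=200, seed=0):
    """Largest difference between the two expressions over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-2, 2) for _ in range(n)]
        a = [rng.uniform(-2, 2) for _ in range(n)]
        b = rng.uniform(-2, 2)
        m = [rng.randint(-3, 3) for _ in range(n)]
        diff = action_from_lemma(x, m, a, b) - action_closed_form(x, m, a)
        worst = max(worst, abs(diff))
    return worst
```

The difference is affine in x with linear part m, which is why the induced action on the slope a is the translation \(a\mapsto a+m\).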

The manifolds in the above two examples are special; however, as the example below illustrates, our definitions also work for non-special manifolds.

Example 2.13

Consider the action by \({\mathbb {Z}}\) on the positive real numbers generated by \(y\mapsto 2y\). Let \(M = {\mathbb {R}}_+/2^{{\mathbb {Z}}}\) be the quotient and \(\phi \) and L be the data defined by the metric \(dy\otimes dy/y^2\) on M, in other words \(\Phi _{q_0} = -\log (y)\) for some \(q_0\in \Gamma (\Omega ,K)\). We see that \(-\log y - \langle y,a \rangle - b\) is bounded from below and proper if and only if \(a<0\). This means \(\Omega ^*\) consists of all \(q=q_0 + \langle y,a \rangle + b\) where \(a<0\). Given a deck transformation \(\gamma _m: y\mapsto 2^m y\), we calculate \(\gamma _m. q\) by

$$\begin{aligned} \gamma _m.q= & {} q + \Phi _q - \Phi _q\circ \gamma _m^{-1} \\= & {} q -\log y -\langle y,a \rangle - b - (-\log 2^{-m} y - \langle 2^{-m} y,a \rangle - b)\\= & {} q_0 + \langle y,2^{-m} a \rangle + b - m\log 2. \end{aligned}$$

In particular, if we identify an element \(q=q_0 + \langle y,a \rangle + b\) in \(\Omega ^*\) with \(a<0\), then the action by \(\Pi \) on \(\Omega ^*\) is described by \(\gamma _m:a\mapsto 2^{-m} a\) and \(M^*\cong {\mathbb {R}}_-/2^{\mathbb {Z}}\).
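The same kind of numerical check applies to Example 2.13. The sketch below assumes the identification of \(q = q_0 + \langle y,a \rangle + b\) with the pair (a, b), \(a<0\); the function names are hypothetical. It verifies that \(\Phi _q - \Phi _q\circ \gamma _m^{-1}\) equals \((2^{-m}-1)ay - m\log 2\), so that the slope transforms as \(a\mapsto 2^{-m}a\).

```python
import math
import random

def phi(y, a, b):
    """Phi_q(y) = -log y - a*y - b on the positive half-line."""
    return -math.log(y) - a * y - b

def action_from_lemma(y, m, a, b):
    """(gamma_m.q - q)(y) = Phi_q(y) - Phi_q(2^{-m} y), as in Lemma 2.10."""
    return phi(y, a, b) - phi(2.0 ** (-m) * y, a, b)

def action_closed_form(y, m, a):
    """Closed form (2^{-m} - 1) a y - m log 2."""
    return (2.0 ** (-m) - 1.0) * a * y - m * math.log(2.0)

def new_coefficients(a, b, m):
    """Slope and constant of gamma_m.q - q_0 for q = q_0 + a*y + b."""
    return 2.0 ** (-m) * a, b - m * math.log(2.0)

def max_discrepancy(trials=200, seed=0):
    """Largest difference between the two expressions over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        y = rng.uniform(0.1, 10.0)
        a = rng.uniform(-3.0, -0.1)   # a < 0, so Phi_q is proper and bounded below
        b = rng.uniform(-2.0, 2.0)
        m = rng.randint(-4, 4)
        diff = action_from_lemma(y, m, a, b) - action_closed_form(y, m, a)
        worst = max(worst, abs(diff))
    return worst
```

Note that the sign of the slope is preserved, consistent with the statement that \(\Pi \) preserves the set where \(a<0\).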

2.2 Legendre Transform

We begin by defining the Legendre transform of a section of \(L\rightarrow M\) as a section of \(-K^*\rightarrow \Omega ^*\). In Proposition 2.15 we show that it is equivariant, in other words that it descends to a section of \(-L^*\rightarrow M^*\).

Definition 2.14

(Legendre transform on the cover) Let \((M,L)\) be a Hessian manifold. Then the Legendre transform of a continuous section \(\phi \) of L is the convex section of the affine \({\mathbb {R}}\)-bundle \(-K^* \rightarrow \Omega ^*\) defined by

$$\begin{aligned} \phi ^*(p) := -q + \sup _{x\in \Omega } q(x) - \phi (x) = -q + \sup _{x\in \Omega } -\Phi _q(x), \end{aligned}$$
(13)

where \(q\in K^*\) is any point in the fiber over \(p\in \Omega ^*\).

To see that the Legendre transform is well defined, we must verify that it is independent of choice of q, but this follows immediately since any other choice can be written as \(q' = q + m\) for some \(m\in {\mathbb {R}}\), and thus

$$\begin{aligned} -q' + \sup _{x\in \Omega } q'(x) - \phi (x) = -q - m + \sup _{x\in \Omega } q(x) + m - \phi (x) = \phi ^*(p). \end{aligned}$$

Also note that the \(\sup \) in (13) is always attained, since \(p\in \Omega ^*\) means that \(\Phi _q\) is bounded from below and proper.

Remark 10

Note that over a simply connected manifold \(\Omega \), any point \(q\in \Gamma (\Omega ,K)\) defines a global affine trivialization of the affine \({\mathbb {R}}\)-bundle \(K\rightarrow \Omega \). Since \(\phi = \phi - q + q\), the representation of \(\phi \) as function in this trivialization is simply \(\Phi _q = \phi - q\). Thus the Legendre transform over a simply connected manifold can be viewed as taking the \(\sup \) in different trivializations of the affine \({\mathbb {R}}\)-bundle \(K\rightarrow \Omega \).

Remark 11

As in Remark 8 fix \(x_0\in \Omega \), a basis of \(T_{x_0}\Omega \) and \(q_0\in K^*\). For each \(p\in \Omega ^*\), let L(p) be the unique element q in the fiber above p such that \(q_0(x_0)=q(x_0)\). Then L defines an affine section of \(K^*\). Moreover, using the identification of \(\Omega \) with a subset of \({\mathbb {R}}^n\), \(L(p)-q_0\) may be identified with an element in \(({\mathbb {R}}^n)^*\). Letting \(q=L(p)\) and plugging this into (13) gives

$$\begin{aligned} \phi ^*(p)= & {} -L(p) + \sup _{x\in \Omega } -\Phi _{L(p)} = -L(p) + \sup _{x\in \Omega } (q-q_0) - \Phi _{q_0}\nonumber \\= & {} -L(p) + \Phi _{q_0}^*(L(p)-q_0), \end{aligned}$$
(14)

where \(\Phi _{q_0}^*\) denotes the Legendre transform of \(\Phi _{q_0}\), seen as a bona fide convex function on a convex domain in \({\mathbb {R}}^n\). We conclude that

$$\begin{aligned} \phi ^*+L = \Phi _{q_0}^*. \end{aligned}$$

Proposition 2.15

The Legendre transform \(\phi ^*\) is \(\Pi \)-equivariant, i.e., \(\phi ^*(\gamma .p) = \gamma .\phi ^*(p)\) for all \(\gamma \in \Pi \).

Proof

Fix \(\gamma \in \Pi \). By Lemma 2.8, we have \(\Phi _{\gamma .q} = \Phi _q \circ \gamma ^{-1}\). Thus

$$\begin{aligned} \phi ^*(\gamma .p)= & {} -\gamma .q + \sup _{\Omega } -\Phi _q\circ \gamma ^{-1} = -\gamma .q + \sup _{\Omega } -\Phi _q\nonumber \\= & {} \gamma .(-q + \sup _{\Omega } -\Phi _q) = \gamma .\phi ^*(p). \end{aligned}$$
(15)

\(\square \)

We will now define a map \(\mathrm{d}\phi :\Omega \rightarrow \Omega ^*\). It turns out that if M is a compact manifold, and \(\phi \) is smooth, then this map is a diffeomorphism. The map will also be equivariant. This will guarantee that the action of \(\Pi \) on \(\Omega ^*\) induces a smooth quotient manifold \(\Omega ^*/\Pi =M^*\). Moreover, the map will also provide a diffeomorphism between M and \(M^*\) proving that they are equivalent as smooth manifolds.

Definition 2.16

Let \((M,L,\phi )\) be a Hessian manifold and \(x\in \Omega \). Locally there is a unique affine section tangent to \(\pi ^*\phi \) at x. By Proposition 2.3 this extends to a global affine section \(q\in K^*\) (thus satisfying \(\Phi _q(x)=\mathrm{d}\Phi _q(x)=0\)). We define \(d(\pi ^* \phi )(x)\) as the image of q in \(\Omega ^*\).

Lemma 2.17

Let \((M,L,\phi )\) be a compact Hessian manifold. Then \(d (\pi ^*\phi ): \Omega \rightarrow \Omega ^*\) is a diffeomorphism.

Proof

As in Remark 8, we may identify \(d(\pi ^* \phi )\) with the map \(\mathrm{d}\Phi _{q_0}\). But since \(\Phi _{q_0}\) is smooth and strictly convex, this yields a diffeomorphism. \(\square \)

Moreover, we have

Lemma 2.18

The map \(d (\pi ^*\phi ): \Omega \rightarrow \Omega ^*\) is equivariant. In other words

$$\begin{aligned} d (\pi ^*\phi )(\gamma (x)) = \gamma .d(\pi ^*\phi )(x). \end{aligned}$$

Proof

By Lemma 2.8, \(\Phi _{\gamma .q}=\Phi _q\circ \gamma ^{-1}\). Hence, if q is tangent to \(\pi ^* \phi \) at x, then \(\gamma .q\) is tangent to \(\pi ^* \phi \) at \(\gamma (x)\). \(\square \)

Theorem 2.19

The quotient \(M^*=\Omega ^*/\Pi \) is an affine manifold and \(\mathrm{d}\phi :M\rightarrow M^*\) is a diffeomorphism.

Proof

By Lemmas 2.17 and 2.18 there is an equivariant diffeomorphism between \(\Omega \) and \(\Omega ^*\). Since the action by \(\Pi \) on \(\Omega \) induces a smooth quotient manifold, so does the action by \(\Pi \) on \(\Omega ^*\). This means \(M^*\) is an affine manifold. Moreover, we get the following commutative diagram:

figure d

and since the top row is a diffeomorphism, so is the bottom. This means M and \(M^*\) are equivalent as smooth manifolds. \(\square \)

Using this diffeomorphism we also get an analogue of the real Legendre transform, in the sense that we can affinely identify the bidual \(M^{**}\) with M.

Lemma 2.20

Let \((M,L,\phi )\) be a compact Hessian manifold. Then the bidual \((M^{**},\phi ^{**})\) and \((M,\phi )\) are isomorphic as Hessian manifolds.

Proof

Using the identification of Remark 8 twice, we have the following commutative diagram:

figure e

where \(z_0\) is some choice of affine section \(z_0 \in \Gamma (\Omega ^*,K^*)\). But taking \(z_0\) as in Remark 11, we have that \(\Phi _{z_0} = (\Phi _{q_0})^*\). By standard properties of smooth strictly convex functions, we have that \(\mathrm{d}\Phi _{q_0}^* \circ \mathrm{d}\Phi _{q_0} = Id\). This shows that the identity map \(i_1(\Omega ) \rightarrow i_3(\Omega ^{**})\) is equivariant, and hence M and \(M^{**}\) are equivalent as affine manifolds. Further, since \(\Phi _{q_0}^{**} = \Phi _{q_0}\) as convex functions, the equivalence indeed holds also in the Hessian category. \(\square \)

Note that by the above identification with the classical Legendre transform on the cover \(\Omega \), we immediately inherit several properties from the corresponding properties of the Legendre transform in \({\mathbb {R}}^n\). In particular the above identification yields an identification of the bidual \(M^{**} = M\). By taking the double Legendre transform \(\Phi _{q_0}^{**}\) as a real function, we get a convex function on \(\Omega \) such that its Hessian tensor is \(\Pi \)-invariant. This descends to a Hessian metric on M, and \(\Phi _{q_0}^{**} = \Phi _{q_0}\). Furthermore, this construction is valid for any continuous section s, and hence we may define a projection operator taking continuous sections of \(L\rightarrow M\) to convex sections of \(L\rightarrow M\). By slight abuse of notation (i.e., identifying the bidual \(M^{**} = M\), see Lemma 2.20), we denote this projection by double Legendre transformation, and the following proposition follows.

Proposition 2.21

Let \((M,L)\) be a compact Hessian manifold and \(\phi \) a convex section of L. Then

$$\begin{aligned} \phi ^{**}(x)&= \sup _{q \in \Gamma (\Omega ,K), \Phi _q \ge 0} q(x) \end{aligned}$$
(16)
$$\begin{aligned} \phi ^{**}(x)&= \phi (x) , \end{aligned}$$
(17)

where on the right-hand side we have identified x with an arbitrary point over x in the cover. Furthermore, for any continuous section of L, we have that

$$\begin{aligned} s^{**} \le s \end{aligned}$$
(18)

pointwise.
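The inequality (18) can be illustrated with a discrete Legendre transform on the real line. This is only a sketch under simplifying assumptions: it works with an ordinary function on a grid rather than a section of L, and the double-well function \(s(x) = (x^2-1)^2\), the grids, and the names `legendre` and `biconjugate_demo` are choices made here for illustration.

```python
def legendre(values, xs, ps):
    """Discrete Legendre transform: for each p, max over grid x of p*x - f(x)."""
    return [max(p * x - fx for x, fx in zip(xs, values)) for p in ps]

def biconjugate_demo():
    """Return the grid, a non-convex function s, and its discrete biconjugate."""
    xs = [i / 100 for i in range(-200, 201)]   # grid on [-2, 2]
    ps = [j / 10 for j in range(-300, 301)]    # slopes on [-30, 30]
    s = [(x * x - 1.0) ** 2 for x in xs]       # double well, not convex
    s_star = legendre(s, xs, ps)               # s^* on the slope grid
    s_bistar = legendre(s_star, ps, xs)        # s^** back on the x grid
    return xs, s, s_bistar
```

In this model case the biconjugate is the convex envelope: it vanishes between the two wells, where s itself is strictly positive, and never exceeds s.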

Moreover, by standard properties of convex functions, for any convex (not necessarily strictly convex) section \(\phi \), \(\mathrm{d}\Phi _{q_0}\) has an inverse defined almost everywhere, namely \(d(\Phi _{q_0}^*)\). This means that, under the identification above, \(d(\phi ^*)\) is an inverse of \(\mathrm{d}\phi \) defined almost everywhere on \(M^*\). We will denote this map \(T_\phi \). Moreover, by standard properties for convex functions, for any continuous \(\Pi \)-invariant function v (see for example Lemma 2.7 in [3])

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}(\Phi _{q_0}+tv)^* = -v\circ (\mathrm{d}\Phi _{q_0}^*). \end{aligned}$$

It follows that

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}(\phi +tv)^*(p) = -v(T_\phi (p)). \end{aligned}$$
(19)
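Formula (19) can be tested by finite differences in the simplest model case. The sketch below assumes \(\Phi _{q_0}(x) = x^2/2\) on the real line (so that \(T_\phi (p) = p\)) and the periodic perturbation \(v(x) = \cos (2\pi x)\); the grid size, step t, and function names are illustrative choices, not part of the construction above.

```python
import math

def conjugate(p, t, grid):
    """(Phi + t v)^*(p) on a grid, with Phi(x) = x^2/2 and v(x) = cos(2 pi x)."""
    return max(p * x - 0.5 * x * x - t * math.cos(2 * math.pi * x) for x in grid)

def derivative_check(p, t=1e-3, n=40001):
    """Central difference in t at t = 0 versus the prediction -v(T(p)) = -v(p)."""
    grid = [-2.0 + 4.0 * i / (n - 1) for i in range(n)]   # uniform grid on [-2, 2]
    finite_diff = (conjugate(p, t, grid) - conjugate(p, -t, grid)) / (2 * t)
    predicted = -math.cos(2 * math.pi * p)
    return finite_diff, predicted
```

For small t the perturbed function remains strictly convex, so the supremum is attained at a unique point near p and the two values agree up to discretization error.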

We end this section with the following definition.

Definition 2.22

Let \((M,L)\) be a Hessian manifold and \(\nu \) a probability measure on \(M^*\). We define the \(\nu \)-Monge–Ampère measure of a convex section \(\phi \) in L as

$$\begin{aligned} \text {MA}_\nu (\phi ) = (T_\phi )_* \nu . \end{aligned}$$

Remark 12

It is interesting to note that there is no apparent complex geometric analogue of the \(\nu \)-Monge–Ampère measure, except in the case when M is special and \(\nu \) is the unique parallel probability measure, in which case \(\text {MA}_\nu \) reduces to the standard Monge–Ampère operator considered in [8].

3 Solvability of Monge–Ampère Equations

We are now ready to give proofs of Theorems 1.1 and 1.2. We begin with the following definition.

Definition 3.1

For a Hessian manifold \((M,L,\phi _0)\), the affine Kantorovich functional is the functional \(F: C^0(M) \rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} F(u) = \int _{M} u \mathrm{d}\mu + \int _{M^*} \left[ (u+\phi _0)^* - \phi _0^* \right] \mathrm{d}\nu . \end{aligned}$$
(20)

By abuse of notation, if \(\phi \) is a convex section of L, we also write

$$\begin{aligned} F(\phi ) = \int _{M} (\phi - \phi _0) \mathrm{d}\mu + \int _{M^*} (\phi ^* - \phi _0 ^*) \mathrm{d}\nu . \end{aligned}$$
(21)

Remark 13

Note that (21) only depends on \(\phi _0\) up to a constant. In particular the minimizers of (21) are independent of \(\phi _0\). We stress that this is not the classical Kantorovich functional induced by the Riemannian metric \(\nabla \mathrm{d}\phi _0\). Rather, it is determined by the affine structure on M together with L.

Proposition 3.2

Let \((M,L,\phi _0)\) be a compact Hessian manifold. Let \(\mu \) and \(\nu \) be probability measures on M and \(M^*\), respectively. Then F admits a convex minimizer. If \(\nu \) is absolutely continuous with full support and if \(\phi _0\) and \(\phi _1\) are minimizers of F, then \(\phi _1-\phi _0\) is constant. If, in addition, \(\mu \) and \(\nu \) are absolutely continuous with non-degenerate \(C^{k,\alpha }\) densities for some \(k\in {\mathbb {N}}\) and \(\alpha \in (0,1)\), then any minimizer \(\phi \) is in \(C^{k+2,\alpha }\).

Before we prove this we will explain how it implies Theorems 1.1 and 1.2. The main point is given by the following characterization of the minimizers of F.

Proposition 3.3

Let \((M,L,\phi _0)\) be a compact Hessian manifold, and let \(\mu \) and \(\nu \) be probability measures on M and \(M^*\). Assume \(\nu \) is absolutely continuous and let \(\phi \) be a convex minimizer of (21). Then

$$\begin{aligned} (T_\phi )_* \nu = \mu , \end{aligned}$$

where \(T_\phi \) is the map defined at the end of the previous section.

Proof

Let v be a continuous function on M. First of all, we claim that

$$\begin{aligned} \sup _{p\in M^*} |(\phi +tv)^*(p) - \phi ^*(p)| \le \sup _{x\in M} |tv(x)|. \end{aligned}$$
(22)

We defer the proof of this claim to the end of the proof of the existence part of Proposition 3.2. The dominated convergence theorem and (22) then give

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}F(\phi +tv) = \int _M v \mathrm{d}\mu + \int _{M^*} \frac{\mathrm{d}}{\mathrm{d}t} (\phi +tv)^* \mathrm{d}\nu . \end{aligned}$$
(23)

By (19) and since \(\nu \) is absolutely continuous, we have that \(\nu \)-almost everywhere

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} (\phi +tv)^*(p)=-v(T_\phi (p)). \end{aligned}$$

Applying this to the second integral above and making the change of variables \(x=T_\phi (p)\), we get

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}F(\phi +tv) = \int _M v (\mu -(T_\phi ) _* \nu ). \end{aligned}$$

Since \(\phi \) is a minimizer of F, this has to vanish for any v, and hence \(\mu -(T_\phi ) _* \nu =0\). \(\square \)

Combining Propositions 3.2 and 3.3, Theorems 1.1 and 1.2 follow by the following arguments. Note that since \(\pi \), \(\tilde{\pi }\) are covering maps, we may consider the pullbacks \(\pi ^* \mu \), \(\tilde{\pi }^* \nu \) as invariant measures on \(\Omega , \Omega ^*\), and moreover any invariant measures on \(\Omega ,\Omega ^*\) arise in this way. Moreover, by definition, \(T_\phi :M^* \rightarrow M\) is induced by the (equivariant) partially defined inverse of \(\mathrm{d}\Phi _{q_0}:\Omega \rightarrow \Omega ^*\). It then follows that

$$\begin{aligned} \pi ^*\mu = (\mathrm{d}\Phi _{q_0})^{-1}_* \tilde{\pi }^* \nu \end{aligned}$$
(24)

if and only if

$$\begin{aligned} \mu = (T_\phi )_* \nu . \end{aligned}$$

Under the assumption that \(\nu \) is absolutely continuous, (24) is equivalent to \(\Phi _{q_0}\) being an Alexandrov solution to (2) (see for example Lemma 4.2 in [24]). This means Theorem 1.1 is a direct consequence of Propositions 3.2 and 3.3.

We now turn to the proof of Proposition 3.2. To establish existence of minimizers we will need a \(C^0\)-estimate and a Lipschitz bound on (normalized) convex sections of L, which together imply the following theorem, using Arzela–Ascoli.

Theorem 1.5

Let \((M,L)\) be a compact Hessian manifold. Then the space of convex sections of L modulo \({\mathbb {R}}\) is compact, in the topology of uniform convergence modulo \({\mathbb {R}}\).

Proposition 3.4

(Uniform \(C^0\) estimate). Let \((M,L,\phi _0)\) be a compact Hessian manifold. Then any \(\phi \) in the Kähler class of \(\phi _0\), normalized such that \(\sup (\phi - \phi _0) = 0\), satisfies \(|\phi - \phi _0| \le C\) for some constant C depending only on \(\phi _0\).

Proof

Fix \(\phi \), and let \(u = \phi -\phi _0 \in C^0(M)\). Being the difference of two convex sections, u has a Hessian in the Alexandrov sense. Fix \(E \subset {\mathbb {R}}^n\) as a relatively compact convex set containing a fundamental domain of M in \(\Omega \). Let \(x_0\) and \(x_1\) be points where u attains its supremum and infimum, respectively, and identify \(x_0,x_1\) with any lifts to E. Then the affine curve \(x_t = (1-t) x_0 + t x_1\) lies in E. Letting \(f(t) = u(x_t)\) we have \(f'(0) = f'(1) = 0\), and thus, letting \(\nabla ^2\) denote the Alexandrov Hessian given by the embedding \(\Omega \subseteq {\mathbb {R}}^n\) endowed with the Euclidean metric, we have

$$\begin{aligned} \begin{aligned} f(t)&= f(0) + \int _0^t \left( \int _0^s f''(l) dl \right) \mathrm{d}s \\&= f(0) + \int _0^t \left( \int _0^s \langle x'_l , \nabla ^2 u|_{x_l} x'_l\rangle \mathrm{d}l \right) \mathrm{d}s \\&\ge f(0) - \int _0^t \left( \int _0^s \langle x'_l , \nabla ^2 \phi _0|_{x_l} x'_l\rangle \mathrm{d}l \right) \mathrm{d}s \\&\ge f(0) - Ct^2, \end{aligned} \end{aligned}$$
(25)

where the first inequality follows since \(\nabla ^2 u = \nabla ^2 \phi - \nabla ^2 \phi _0\) and \(\nabla ^2 \phi \ge 0\) by convexity. The constant C depends only on \(\phi _0\) and the (bounded) diameter of E. For \(t=1\) this yields that \(\sup u - \inf u \le C\), and the proposition follows. \(\square \)
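The estimate can be made concrete on the circle \({\mathbb {R}}/{\mathbb {Z}}\) with \(\Phi _{q_0}(x) = x^2/2\). This is a sketch under assumptions that are not taken from the text above: in this model case, choosing lifts with \(|x_1-x_0|\le 1/2\), the argument of (25) gives \(\sup u - \inf u \le 1/8\) for every periodic u with \(u''\ge -1\). The two test functions and the helper `oscillation` are illustrative choices.

```python
import math

def oscillation(u, n=2000):
    """sup u - inf u sampled on the grid {0, 1/n, ..., 1} of one period."""
    vals = [u(i / n) for i in range(n + 1)]
    return max(vals) - min(vals)

def cosine_perturbation(x):
    """u with u'' = -cos(2 pi x) >= -1, so x^2/2 + u stays convex."""
    return math.cos(2 * math.pi * x) / (4 * math.pi ** 2)

def extremal_perturbation(x):
    """u = -dist(x, Z)^2 / 2, with u'' = -1 away from the half-integers."""
    r = x - round(x)
    return -0.5 * r * r
```

The cosine perturbation has oscillation \(1/(2\pi ^2)\), well below 1/8, while the second function attains the value 1/8, so the constant obtained from (25) cannot be improved in this example.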

Virtually the same proof can be used to give a locally uniform Lipschitz bound.

Proposition 3.5

Let \((M,L,\phi _0)\) be a compact Hessian manifold, and let \(\phi \) be a convex section of L. Then for \(u := \phi - \phi _0\) we have \(\Vert u\Vert _{Lip} \le C\) on any compact \(E\subset \Omega \), for some constant depending only on \((M,L,\phi _0)\) and E, where \(\Vert .\Vert _{Lip}\) denotes the Lipschitz constant with respect to the Euclidean metric on \({\mathbb {R}}^n\).

Proof

Fix a compact set \(E \subset \Omega \). We may without loss of generality assume that E is a convex set containing a fundamental domain. For any \(x\in E\) there is a Euclidean open ball \(B(x,r)\) of radius r such that \(B(x,r) \subset \Omega \), and by compactness a finite number of such balls cover E, and we let \(U=\bigcup _{i=1}^N B(x_i,r_i)\) be their union. It follows that \(x + tv\in U\) for all \(x\in E\), \(t\in [-2,2]\) and \(\Vert v\Vert \le \delta := \min r_i/3 > 0\). Now fix \(x\in E\) arbitrarily and fix v arbitrarily such that \(\Vert v\Vert \le \delta \). Consider the function

$$\begin{aligned} f(t) := u(x + tv) \end{aligned}$$
(26)

as a function of t, twice differentiable in the Alexandrov sense, and defined on some open interval V such that \([0,1] \subset V\). Now assume that \(du_x(v) = A > 0\) for some A, or equivalently, \(f'(0) = A\). We then have

$$\begin{aligned} \begin{aligned} f(t)&= f(0) + \int _0^t \left[ f'(0) + \int _0^s f''(\tau ) \mathrm{d}\tau \right] \mathrm{d}s \\&= f(0) + tA + \int _0^t\int _0^s \langle v , \nabla ^2 u|_{x+\tau v} v\rangle \mathrm{d}\tau \mathrm{d}s \\&\ge f(0) + tA + \int _0^t \int _0^s -C_{\phi _0} \mathrm{d}\tau \mathrm{d}s \\&= f(0) + tA - \frac{t^2 C_{\phi _0}}{2} \end{aligned} \end{aligned}$$
(27)

for some constant \(C_{\phi _0}\) depending only on \(\phi _0\) and \(\delta \). Then setting \(t=1\) we get

$$\begin{aligned} A \le f(1) - f(0) + C_{\phi _0} \le \sup u - \inf u + C_{\phi _0}, \end{aligned}$$
(28)

and by Proposition 3.4 we get a uniform upper bound on A. Replacing v by \(-v\) yields also a uniform lower bound, which then gives the desired bound on \(\Vert u\Vert _{Lip}\). \(\square \)

Using these a priori estimates the existence of a minimizer can be established.

Proof of the existence part of Proposition 3.2

Let \(\phi _k\) be a minimizing sequence of F, and define \(u_k := \phi _k - \phi _0 \in C^0(M)\). First we note that the functional F is invariant under the map \(\phi \mapsto \phi + C\), and hence we may assume that the sequence is normalized such that \(\int _M (\phi _k - \phi _0 )\mathrm{d}\mu = \int _M u_k \mathrm{d}\mu = 0\). Second we note that since \(F(\phi ^{**}) \le F(\phi )\), we may assume that the \(\phi _k\) lie in the Kähler class of \(\phi _0\). Then, since \(u_k\) is continuous with \(\int _M u_k \mathrm{d}\mu = 0\), it follows that \(\sup _M u_k \ge 0 \ge \inf _M u_k\), and hence by Proposition 3.4 \(\Vert u_k\Vert _{C^0(M)}\) is uniformly bounded. Furthermore, \(\Vert u_k\Vert _{Lip}\) is uniformly bounded by Proposition 3.5. By the Arzela–Ascoli theorem we can thus extract a subsequence such that \(u_k \rightarrow u\) in \(C^0(M)\), and thus also \(\phi _k \rightarrow \phi := \phi _0 + u\). To show that \(\phi \) is indeed a minimizer of F it remains to show that F is continuous as a map \(C^0(M) \rightarrow {\mathbb {R}}\). To show this it suffices to show that \(\phi ^* = \lim (\phi _k)^*\). But this follows from the general claim that \(|\inf f - \inf g| \le \sup |f - g|\), since \(|\phi ^*(p) - \phi _k ^*(p)| = |\inf _{x\in \Omega } \Phi _q(x) - \inf _{x\in \Omega } \Phi _{k,q}(x)|\), where \(\Phi _{k,q} = \pi ^*\phi _k - q\). To show the claim, assume that \(\inf f \le \inf g\), and let \(x_\epsilon \) be such that \(\inf f \ge f(x_\epsilon ) - \epsilon \). Then we have that

$$\begin{aligned} -|\inf f - \inf g | = \inf f - \inf g \ge f(x_\epsilon ) - \epsilon - g(x_\epsilon ) \ge -\epsilon - \sup |f-g|.\nonumber \\ \end{aligned}$$
(29)

Letting \(\epsilon \rightarrow 0\) yields the claim. \(\square \)
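The elementary claim \(|\inf f - \inf g| \le \sup |f - g|\) used above is easy to test on finite samples. The following sketch replaces functions by lists of values on a common finite set; the helper names and the random test data are hypothetical and serve only as an illustration.

```python
import random

def inf_gap_and_sup_diff(f_vals, g_vals):
    """Return |inf f - inf g| and sup |f - g| for two functions on a finite set."""
    gap = abs(min(f_vals) - min(g_vals))
    sup_diff = max(abs(a - b) for a, b in zip(f_vals, g_vals))
    return gap, sup_diff

def check_claim(trials=500, size=50, seed=1):
    """Check the inequality on random pairs (f, g) with g a perturbation of f."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(-5, 5) for _ in range(size)]
        g = [fi + rng.uniform(-1, 1) for fi in f]
        gap, sup_diff = inf_gap_and_sup_diff(f, g)
        if gap > sup_diff + 1e-12:
            return False
    return True
```

Applied to \(f = \Phi _q\) and \(g = \Phi _{k,q}\), whose difference is \(u - u_k\), this is exactly what gives uniform convergence of the Legendre transforms.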

Proof of regularity part of Proposition 3.2

Fix a point \(x\in \Omega \), a point \(p\in \Omega ^*\) and a small open ball \(B(x,r)\ni x\). Then, since \(\phi \) solves a Monge–Ampère equation, it follows that \(\mathrm{d}\Phi _p: B(x,r) \rightarrow \mathrm{d}\Phi _p(B(x,r))\) is a Brenier map for an optimal transportation of restrictions of \(\mu \) and \(\nu \). By Caffarelli’s regularity theory [25, Thm 4.14], since \(\mathrm{d}\Phi _p(B(x,r))\) is a convex domain, it follows that \(\Phi _p \in C^{2,\alpha }(B(x,r))\), and thus also that \(\pi ^*u = \Phi _p - \Phi _{0,p} \in C^{2,\alpha }(B(x,r))\). Fixing a relatively compact set E, covering \(\bar{E}\) with balls \(B(x,r / 2)\), and passing to a finite subcover yields that \(u \in C^{2,\alpha }(E)\). The same argument yields the \(C^{k+2,\alpha }\) result. \(\square \)

Uniqueness of minimizers follows from a convexity argument.

Proof of uniqueness part of Proposition 3.2

Assume that there are two minimizers \(\psi _0,\psi _1\), both normalized such that \(\int _M (\psi _i - \phi _0) \mathrm{d}\mu =0\), and let \(\psi _t = (1-t)\psi _0 + t\psi _1\). Then

$$\begin{aligned} \begin{aligned} \inf \Psi _{t,p} = \inf \left[ (1-t) (\pi ^*\psi _0 - p) + t(\pi ^*\psi _1 - p) \right] \ge (1-t) \inf \Psi _{0,p} + t \inf \Psi _{1,p}, \end{aligned} \end{aligned}$$
(30)

and hence \(\psi ^*_t \le (1-t)\psi ^*_0 + t\psi _1^*\) holds pointwise. It follows that \(F(\psi _t) \le F(\psi _0)\). Now assume that \(\psi ^*_t(p) < \psi ^*_0(p)\) at some point p. By continuity this then holds also on some open set \(U \subset M^*\). But using the pointwise inequality on \(M^*{\setminus } U\) and strict inequality on U we get that \(\int _{M^*}( \psi ^*_t - \phi ^*_0 ) d\nu < \int _{M^*}( \psi ^*_0 - \phi ^*_0 )d\nu \), contradicting the minimality of \(\psi _0\). Hence \(\psi ^*_t = \psi ^*_0\) for all t, and uniqueness follows. \(\square \)

Using the uniqueness result we are also able to show continuity of the inverse Monge–Ampère operator.

Theorem 3.6

Let \((M,L,\phi _0)\) be a compact Hessian manifold, and let \(\nu \) be a fixed absolutely continuous probability measure on \(M^*\), with full support. If \(\mu _i \rightarrow \mu \) are probability measures converging in the weak\(^*\) topology, then the solutions \(\phi _i\) of \(\text {MA}_\nu \phi _i = \mu _i\), normalized such that \(\int _{M^*} ( \phi ^*_i - \phi ^*_0 )\mathrm{d}\nu = 0\), converge in the \(C^0\)-topology to the solution \(\phi \) of \(\text {MA}_\nu \phi = \mu \) with the same normalization.

Proof

We claim that Theorem 1.5 yields that up to subsequence \(\phi _i \rightarrow \bar{\phi }\) in the \(C^0\) topology. Indeed, as in the existence proof of Proposition 3.2, we have that \(\phi _i^*\) has a converging subsequence, and the claim then follows from the continuity of the Legendre transform. Furthermore, note that in fact \(\int _{M^*} (\bar{\phi }^* - \phi _0 ^*) d\nu = 0\).

Let \(F_i\), F denote the Kantorovich functionals corresponding to \(\mu _i, \mu \), and let \(\phi \) be the solution to \(\text {MA}_\nu \phi = \mu \), normalized such that \(\int _{M^*} (\phi ^* - \phi _0^*) d\nu = 0\). Then, by minimality of \(\phi _i\) for \(F_i\), we have that

$$\begin{aligned} \begin{aligned} F_i(\phi _i)&\le F_i(\phi ) = \int _M (\phi - \phi _0)\mathrm{d}\mu _i \\&= \int _M (\phi -\phi _0) \mathrm{d}\mu + \int _M (\phi -\phi _0)(\mathrm{d}\mu _i - \mathrm{d}\mu ). \end{aligned} \end{aligned}$$
(31)

Since \(\mu _i \rightarrow \mu \) and \(\phi -\phi _0\) is bounded and continuous, taking limits we obtain \(\limsup F_i(\phi _i) \le F(\phi )\). On the other hand we have

$$\begin{aligned} \begin{aligned} F_i(\phi _i)&= \int _M (\phi _i -\phi _0)\mathrm{d}\mu _i \\&= \int _M(\bar{\phi } - \phi _0)\mathrm{d}\mu + \int _M(\phi _i - \bar{\phi })\mathrm{d}\mu _i + \int _M(\bar{\phi } - \phi _0) (\mathrm{d}\mu _i - \mathrm{d}\mu ) \\&= F(\bar{\phi }) + \int _M(\phi _i - \bar{\phi })\mathrm{d}\mu _i + \int _M(\bar{\phi } - \phi _0) (\mathrm{d}\mu _i - \mathrm{d}\mu ). \end{aligned} \end{aligned}$$
(32)

Since \(\phi _i \rightarrow \bar{\phi }\) and \(\mu _i\) is of mass 1, we have that \(\left| \int _M(\phi _i - \bar{\phi })\mathrm{d}\mu _i\right| \le \sup _M |\phi _i-\bar{\phi }| \rightarrow 0\), and by weak-\(*\) convergence we have that \(\int _M(\bar{\phi } - \phi _0) (\mathrm{d}\mu _i - \mathrm{d}\mu ) \rightarrow 0\). Taking limits we thus obtain that \(F(\bar{\phi }) = \lim F_i(\phi _i) \le F(\phi )\), which shows that \(\bar{\phi }\) is a minimizer of F. By the uniqueness part of Proposition 3.2 it follows that \(\bar{\phi } = \phi \), and consequently \(\phi _i \rightarrow \phi \). \(\square \)

4 The Pairing and Optimal Transport

Let \(M_1\) and \(M_2\) be two affine manifolds. Consider their product \(M_1\times M_2\) and let \(q_1\) and \(q_2\) be the projections of \(M_1\times M_2\) onto \(M_1\) and \(M_2\), respectively. Assume \(L_1\) and \(L_2\) are affine \({\mathbb {R}}\)-bundles over \(M_1\) and \(M_2\), respectively. Then there is a natural affine \({\mathbb {R}}\)-bundle over \(M_1\times M_2\) given by

$$\begin{aligned} L_1\boxplus L_2 = q_1^* L_1 + q_2^* L_2. \end{aligned}$$

Given a Hessian manifold \((M,L,\phi )\) we will show that \(L\boxplus -L^*\) has a canonical section. We will use the notation \([\cdot ,\cdot ]\) for this section and it will play the same role as the standard pairing between \({\mathbb {R}}^n\) and \(({\mathbb {R}}^n)^*\) in the classical Legendre transform. The definition will be given in terms of a section in \(K\boxplus -K^*\). We will then show that this section defines a section in \(L\boxplus -L^*\). Indeed, the actions of \(\Pi \) on K and \(K^*\) define an action by \(\Pi \times \Pi \) on \(K\boxplus -K^*\) given by

$$\begin{aligned} (\gamma _1,\gamma _2) (y-q) = \gamma _1(y) - \gamma _2.q \end{aligned}$$

and we may recover \(L\boxplus -L^*\) as the quotient \(K\boxplus -K^*/\Pi \times \Pi \).

Definition 4.1

Let \((M,L,\phi )\) be a Hessian manifold and \(K\rightarrow \Omega \) and \(K^*\rightarrow \Omega ^*\) be the associated objects defined in the previous section. Given \((x,p)\in \Omega \times \Omega ^*\), let q be a point in the fiber of \(K^*\) over p. We define

$$\begin{aligned}{}[x,p] = \sup _{\gamma \in \Pi } \gamma .q(x)-q. \end{aligned}$$

Remark 14

To see that this is well defined we must verify that it is independent of the choice of q. But this follows immediately since any other choice can be written as \(q' = q + C\) for some \(C\in {\mathbb {R}}\) and thus \( \gamma .q'(x)-q' = \gamma .q(x)-q. \)

Lemma 4.2

The pairing \([\cdot ,\cdot ]\) descends to a section of \(L\boxplus -L^*\).

Proof

We need to prove that \([\cdot ,\cdot ]\) is equivariant, in other words that

$$\begin{aligned} {[}\gamma _1(x),\gamma _2.p] = (\gamma _1,\gamma _2)[x,p] \end{aligned}$$

for all \(\gamma _1,\gamma _2\in \Pi \), \(x\in M\) and \(p\in M^*\). Now,

$$\begin{aligned} {[}\gamma _1(x),p]= & {} \sup _{\gamma \in \Pi } \gamma .q(\gamma _1(x))-q \\= & {} \sup _{\gamma \in \Pi } \gamma _1(\gamma _1^{-1}\gamma .q(x))-q \\= & {} \sup _{\gamma \in \Pi } \gamma _1(\gamma .q(x))-q \\= & {} (\gamma _1,id)[x,p], \end{aligned}$$

where the second equality follows from

$$\begin{aligned} \gamma .q(\gamma _1(x))= & {} \gamma \circ q \circ \gamma ^{-1}\circ \gamma _1(x) = \gamma _1 \circ (\gamma _1^{-1}\circ \gamma ) \circ q \circ (\gamma _1^{-1} \circ \gamma )^{-1}(x)\\= & {} \gamma _1(\gamma _1^{-1}\gamma .q(x)) \end{aligned}$$

and the third equality follows from the substitution of \(\gamma \) by \(\gamma _1^{-1}\gamma \). Moreover,

$$\begin{aligned}{}[x,\gamma _2.p] = \sup _{\gamma \in \Pi } \gamma .\gamma _2.q(x) - \gamma _2.q = \sup _{\gamma \in \Pi } \gamma .q(x)-\gamma _2.q = (id, \gamma _2)[x,p], \end{aligned}$$

where the second equality is given by substituting \(\gamma \) by \(\gamma \gamma _2\). This proves the lemma. \(\square \)

Lemma 4.3

Assume \((M,L)\) is a compact Hessian manifold and \(\phi \) is a continuous section of L. Then

$$\begin{aligned} \phi ^*(p) = \sup _{x\in M} [x,p]-\phi (x). \end{aligned}$$
(33)

Moreover, \(\mathrm{d}\phi \) is defined at a point \(x\in M\) and \(\mathrm{d}\phi (x)=p\) if and only if p is the unique point in \(M^*\) such that

$$\begin{aligned} \phi ^*(p)=[x,p]-\phi (x). \end{aligned}$$

Proof

The right-hand side of (33) is given by

$$\begin{aligned} \sup _{x\in M} \sup _{\gamma \in \Pi } \gamma .q(x) - q - \phi (x)= & {} -q +\sup _{x\in M} \sup _{\gamma \in \Pi } -\Phi _{\gamma .q}(x) \\= & {} -q +\sup _{x\in M} \sup _{\gamma \in \Pi } -\Phi _q(\gamma ^{-1}(x)) \\= & {} -q +\sup _{x\in \Omega } -\Phi _q(x) = \phi ^*(p), \end{aligned}$$

where, as usual, q is a point in the fiber above p. To prove the second point, note that \(\mathrm{d}\phi (x)=p\) if and only if there is \({\tilde{x}}\in \Omega \) above x and \(q\in K^*\) above p such that

$$\begin{aligned} \mathrm{d}\Phi _q({\tilde{x}}) = 0. \end{aligned}$$

By standard properties of convex functions this is true if and only if

$$\begin{aligned} \Phi _q^*(0) = - \Phi _q({\tilde{x}}). \end{aligned}$$

Since \(\Phi _q^*(0) \ge -\Phi _q(x') \) for any \(x'\in \Omega \), we get

$$\begin{aligned} \Phi _q^*(0) = \sup _{\gamma \in \Pi } -\Phi _q\circ \gamma ^{-1}({\tilde{x}}). \end{aligned}$$
(34)

Using the notation in Remark 11, we have \(\Phi _q^* = \phi ^*+L\). Putting \(q=L(p)\) we get, using that \(\Phi _q\circ \gamma ^{-1} = \Phi _{\gamma .q}\), that (34) is equivalent to

$$\begin{aligned} \phi ^*(p)= & {} -q +\sup _{\gamma \in \Pi } -\Phi _{\gamma .q}({\tilde{x}}) \\= & {} \sup _{\gamma \in \Pi }-q + \gamma .q({\tilde{x}}) - \phi (x) \\= & {} [x,p]-\phi (x). \end{aligned}$$

\(\square \)

4.1 Recap: Kantorovich Problem of Optimal Transport

Let X and Y be topological manifolds, \(\mu \) and \(\nu \) (Borel) probability measures on X and Y, respectively, and c be a real-valued continuous function on \(X\times Y\). Then the associated problem of optimal transport is to minimize the quantity

$$\begin{aligned} I_c(\gamma ) = \int _{X\times Y} c(x,y) \gamma \end{aligned}$$

over all probability measures \(\gamma \) on \(X\times Y\) such that its first and second marginals coincide with \(\mu \) and \(\nu \), respectively. A probability measure \(\gamma \) with the above property is called a transport plan. Under regularity assumptions (see [25]) on \(\mu \), \(\nu \), and c, \(I_c\) will admit a minimizer \(\bar{\gamma }\) which is supported on the graph of a certain map \(T:X\rightarrow Y\) called the optimal transport map. If this is the case, then \(\bar{\gamma }\) is determined by T and

$$\begin{aligned} \bar{\gamma } = (\text {id}\otimes T)_* \mu , \end{aligned}$$

where \(\text {id}\) is the identity map on X.
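For finitely supported measures the passage from a transport map to its transport plan can be made concrete. The following is a minimal sketch under the assumption that \(\mu \) is stored as a dictionary of point masses and T is an ordinary function; the names `plan_from_map` and `marginals`, and the particular map T, are illustrative and do not come from the text.

```python
from collections import defaultdict

def plan_from_map(mu, T):
    """Push mu forward under x -> (x, T(x)); mu maps points to masses."""
    plan = defaultdict(float)
    for x, mass in mu.items():
        plan[(x, T(x))] += mass
    return dict(plan)

def marginals(plan):
    """Return the first and second marginals of a discrete plan."""
    first, second = defaultdict(float), defaultdict(float)
    for (x, y), mass in plan.items():
        first[x] += mass
        second[y] += mass
    return dict(first), dict(second)

mu = {0: 0.25, 1: 0.25, 2: 0.5}

def T(x):
    # Hypothetical map: sends the points 0 and 2 to the same target.
    return x % 2

gamma_bar = plan_from_map(mu, T)
first, second = marginals(gamma_bar)
```

The first marginal recovers \(\mu \), while the second marginal is the push-forward of \(\mu \) under T, so the plan is admissible precisely when that push-forward equals \(\nu \).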

Remark 15

See the introductions of [24, 25] for very good heuristic interpretations of transport plans and transport maps.

Remark 16

Assume two cost functions c and \(c'\) satisfy

$$\begin{aligned} c'(x,y) = c(x,y) + f(x) + g(y), \end{aligned}$$
(35)

where \(f\in L^1(\mu )\) and \(g\in L^1(\nu )\) are functions on X and Y, respectively. Then they determine the same optimal transport problem in the sense that

$$\begin{aligned} I_{c'}(\gamma )= & {} \int _{X\times Y} c' \gamma \\= & {} \int _{X\times Y} c \gamma + \int _{X\times Y} f \gamma + \int _{X\times Y} g \gamma \\= & {} \int _{X\times Y} c \gamma + \int _X f \mu + \int _Y g \nu \\= & {} I_{c}(\gamma ) + C, \end{aligned}$$

where C is a constant independent of \(\gamma \). In particular, \(I_{c}\) and \(I_{c'}\) have the same minimizers. Motivated by this we will say that two cost functions c and \(c'\) are equivalent if (35) holds for some integrable functions f and g on X and Y, respectively.

Two important cases are worth mentioning. The first is when \(X=Y\) is a Riemannian manifold and \(c(x,y) = d^2(x,y)/2\) where d is the distance function induced by the Riemannian metric. The other, which can in fact be seen as a special case of this, is when \(X={\mathbb {R}}^n\) and \(Y=({\mathbb {R}}^n)^*\) and \(c(x,y)=-\langle x,y \rangle \), where \(\langle \cdot ,\cdot \rangle \) is the standard pairing between X and Y. Now, let d be the distance function of the standard Riemannian metric on \({\mathbb {R}}^n\). This metric induces an isomorphism of X and Y and

$$\begin{aligned} d(x,y)^2/2 = |x-y|^2/2 = |x|^2/2 - \langle x,y \rangle + |y|^2/2. \end{aligned}$$

In other words \(-\langle \cdot ,\cdot \rangle \) and \(d(\cdot ,\cdot )^2/2\) are equivalent as cost functions, as long as \(\mu \) and \(\nu \) have finite second moments.
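The equivalence of the two cost functions can be checked by hand on a small discrete problem. The sketch below assumes uniform measures on four points of the real line, so that transport plans can be restricted to permutations, and finds the optimal assignment by brute force; the point sets and the function names are illustrative choices, not part of the text.

```python
from itertools import permutations

def quadratic(x, y):
    return (x - y) ** 2 / 2

def pairing(x, y):
    return -x * y

def total_cost(cost, xs, ys, perm):
    """Cost of the assignment x_i -> y_{perm[i]}."""
    return sum(cost(xs[i], ys[j]) for i, j in enumerate(perm))

def best_assignment(cost, xs, ys):
    """Brute-force minimizer over all permutations (small inputs only)."""
    return min(permutations(range(len(ys))),
               key=lambda perm: total_cost(cost, xs, ys, perm))

xs = [0.0, 1.0, 2.5, 4.0]
ys = [3.0, -1.0, 0.5, 2.0]
sigma_quadratic = best_assignment(quadratic, xs, ys)
sigma_pairing = best_assignment(pairing, xs, ys)

# The two totals differ by the same constant for every assignment.
offset = sum(x * x for x in xs) / 2 + sum(y * y for y in ys) / 2
```

Both costs select the same assignment, and for every permutation the two totals differ by the constant `offset`, which is the discrete counterpart of the terms \(|x|^2/2\) and \(|y|^2/2\) integrated against \(\mu \) and \(\nu \).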

To see the relation between our setup and optimal transport we need to consider the Kantorovich dual of the problem of optimal transport. Let f be a continuous function on X. Its c-transform is the function on Y given by

$$\begin{aligned} f^c(y) = \sup _{x\in X} -c(x,y) - f(x). \end{aligned}$$

Moreover, if \(x\in X\) satisfies

$$\begin{aligned} f^c(y) = -c(x,y) - f(x) \end{aligned}$$

for a unique \(y\in Y\), then the c-differential of f, \(d^c f\), is defined at x and \(d^cf(x) = y\).
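As an illustration of the c-transform and the c-differential, one may discretize the classical case \(c(x,y) = -xy\) on the real line, where the c-transform reduces to the Legendre transform. This is a minimal sketch on finite grids; the grids, the tolerance `tol`, and the function names are assumptions made for the example.

```python
def c_transform(f, cost, xs, ys):
    """Discrete c-transform: f^c(y) = max over x in xs of -cost(x, y) - f(x)."""
    return [max(-cost(x, y) - f(x) for x in xs) for y in ys]

def c_differential(f, cost, xs, ys, x, tol=1e-9):
    """Return the unique grid point y with f^c(y) = -cost(x, y) - f(x), else None."""
    fc = c_transform(f, cost, xs, ys)
    hits = [y for y, v in zip(ys, fc) if abs(v - (-cost(x, y) - f(x))) <= tol]
    return hits[0] if len(hits) == 1 else None

def cost(x, y):
    return -x * y

def f(x):
    return x * x / 2

xs = [i / 100 for i in range(-300, 301)]   # grid on [-3, 3]
ys = [j / 10 for j in range(-10, 11)]      # grid on [-1, 1]
fc = c_transform(f, cost, xs, ys)
```

For this choice of f the discrete c-transform agrees with \(y^2/2\) on the grid, and the c-differential at a point x returns the slope of f at x whenever that slope lies on the grid of y-values.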

The dual formulation of the problem of optimal transport above is to minimize the quantity

$$\begin{aligned} J(f) = \int _X f \mu + \int _Y f^c \nu \end{aligned}$$

over all continuous functions on X. Let \(\Pi (\mu ,\nu )\) be the set of transport plans. We have the following:

Theorem 4.4

(See Theorem 5.10 in [25]). Let X, Y, \(\mu \), \(\nu \), and c be defined as above. Then, under certain regularity conditions (see [25] for details)

$$\begin{aligned} \inf _\gamma I(\gamma ) = -\inf _f J(f). \end{aligned}$$
(36)

Moreover, both I and J admit minimizers and the minimizer of I is supported on the graph of \(T= d^c f\) where f is the minimizer of J.

4.2 Affine \({\mathbb {R}}\)-Bundles and Cost Functions

Definition 4.5

Let \((M,L,\phi _0)\) be a compact Hessian manifold. We say that the associated cost function on \(M\times M^*\) is

$$\begin{aligned} c(x,y) = -[x,y]+\phi _0(x)+\phi _0^*(y). \end{aligned}$$

Remark 17

Let \(\phi \) be another convex section of L. Then \(\phi - \phi _0\) and \(\phi ^* - \phi _0^*\) are continuous functions on M and \(M^*\), respectively. Let \(c'\) be the cost function induced by \((M,L,\phi )\). Then

$$\begin{aligned} c'(x,y) = c(x,y) + (\phi -\phi _0)(x) + (\phi ^*-\phi _0^*)(y). \end{aligned}$$

This means \((M,L,\phi _0)\) and \((M,L,\phi )\) determine equivalent cost functions (in the sense of Remark 16). We conclude that under this equivalence the induced cost function on a Hessian manifold only depends on the data \((M,L)\).

Now, \(\phi \mapsto f = \phi - \phi _0 \) defines a map from the space of continuous sections of L to the space of continuous functions on M.

Lemma 4.6

Let

$$\begin{aligned} f = \phi - \phi _0. \end{aligned}$$

Then

$$\begin{aligned} f^c = \phi ^*-\phi _0^* \end{aligned}$$

Moreover, \(d^c f\) is defined if and only if \(\mathrm{d}\phi \) is defined and \(d^c f(x) = \mathrm{d}\phi (x)\) for all x where they are defined.

Proof

Using the first point in Lemma 4.3 we have

$$\begin{aligned} f^c(p)= & {} \sup _{x\in M} -c(x,p) - f(x) = \sup _{x\in M} [x,p]-\phi (x) - \phi _0^*(p) \\= & {} \phi ^*(p) - \phi _0^*(p), \end{aligned}$$

proving the first part of the lemma. For the second part note that

$$\begin{aligned} f(x) + f^c(p) + c(x,p) = \phi (x) + \phi ^*(p) - [x,p]; \end{aligned}$$

hence \(f^c(p) = -c(x,p)-f(x)\) if and only if \(\phi ^*(p) = [x,p] - \phi (x)\). Combining this with the second point of Lemma 4.3 proves the second part. \(\square \)

Theorem 1.3

Let \((M,L)\) be a compact Hessian manifold. Let \(\mu \) and \(\nu \) be probability measures on M and \(M^*\), respectively. Assume \(\phi \) is a smooth strictly convex section of L such that

$$\begin{aligned} \text {MA}_\nu (\phi ) = \mu . \end{aligned}$$

Then \(\mathrm{d}\phi \) is the optimal transport map determined by \(M,M^*,\mu ,\nu \), and the cost function induced by \((M,L)\).

Proof

Let \(\phi _0\) be a convex section of L and c be the cost function induced by \((M,L,\phi _0)\). By the first part of Lemma 4.6, if

$$\begin{aligned} f = \phi - \phi _0, \end{aligned}$$

then \(F(\phi )=J(f)\). If \(\phi \) is a minimizer of F, then f is a minimizer of J. By the second part of Lemma 4.6, \(\mathrm{d}\phi = d^cf \), which by Theorem 4.4 is the optimal transport map determined by \(\mu \), \(\nu \), and c. To see that our setting satisfies the conditions in Theorem 5.10 in [25], note that c is continuous and M and \(M^*\) are compact manifolds (hence Polish spaces). By compactness and continuity the integrability properties in 5.10(i) and 5.10(iii) are satisfied. Moreover, by Lemma 4.6 the c-gradient of \(f^c\) is defined (as a single valued map) almost everywhere. \(\square \)

Now, when \((M,L)\) is special we may take \(\mu \) and \(\nu \) to be the unique parallel probability measures on M and \(M^*\), respectively. By Theorem 1.2 there is a smooth, strictly convex section \(\phi \) of L satisfying

$$\begin{aligned} \text {MA}_\nu (\phi ) = \mu . \end{aligned}$$

Then \(\Phi _q\), for some \(q\in K^*\) defines a convex exhaustion function on a convex subset of \({\mathbb {R}}^n\) and \(\det (\Phi _{ij})\) is constant. By Jörgens theorem [23] \(\Phi _q\) is a quadratic form and \(\Omega ={\mathbb {R}}^n\). This means \(\Phi \) induces an equivariant flat metric on \(\Omega \) and hence a flat metric on M. We conclude that any positive affine \({\mathbb {R}}\)-bundle over a special Hessian manifold M induces a flat Riemannian metric on M.

Further, we have

Lemma 4.7

Let \((M,L)\) be a compact special Hessian manifold. Then M and \(M^*\) are equivalent as affine manifolds.

Proof

Let \(\mu \) and \(\nu \) be the unique parallel probability measures on M and \(M^*\), respectively, and let \(\phi \) be the solution to

$$\begin{aligned} \text {MA}_\nu (\phi ) = \mu . \end{aligned}$$

By Jörgens theorem \(\Phi _q\) is a quadratic form. In particular, \(\mathrm{d}\Phi _q:\Omega \rightarrow \Omega ^*\) is an affine map. This means \(\mathrm{d}\phi :M\rightarrow M^*\) is affine and since it is also a diffeomorphism (by Theorem 2.19) this proves the lemma. \(\square \)

In the following proposition and corollary, we use the isomorphism in Lemma 4.7 to identify M and \(M^*\).

Proposition 4.8

Let \((M,L)\) be a compact special Hessian manifold. Let \(\mu \) and \(\nu \) be the unique parallel probability measures on M and \(M^*\), respectively, and let \(\phi \) be the smooth strictly convex section of L satisfying

$$\begin{aligned} \text {MA}_{\nu }(\phi ) = \mu . \end{aligned}$$

Then the cost function induced by \((M,L,\phi )\) is the squared distance function induced by the flat Riemannian metric determined by L.

Proof

Fixing \(q_0\in K^*\) and letting \(q=L(p)\) as in Remark 11, we get

$$\begin{aligned} -c(x,p)= & {} [x,p] - \phi (x)-\phi ^*(p) \\= & {} \sup _{\gamma \in \Pi } \gamma .q(x) - q - \phi (x)-\phi ^*(p) \\= & {} \sup _{\gamma \in \Pi } \gamma .q(x) - \gamma .q_0(x) -\Phi _{\gamma .q_0}(x) - \Phi _{q_0}^*(p) \\= & {} \sup _{\gamma \in \Pi } -\Phi _{q_0}(\gamma ^{-1}(x)) + \langle \gamma ^{-1}(x),p\rangle - \Phi _{q_0}^*(p). \end{aligned}$$

By Jörgens theorem [23] \(\Omega ={\mathbb {R}}^n\) and for a suitable choice of \(q_0\in \Gamma (X,L)\), we have

$$\begin{aligned} \Phi _{q_0}(x) = x^T \frac{Q}{2} x \end{aligned}$$

for some symmetric real \(n\times n\) matrix Q. This means \(\Omega ^*=({\mathbb {R}}^n)^*\) and

$$\begin{aligned} \Phi _{q_0}^*(p) = p \frac{Q^{-1}}{2} p^T. \end{aligned}$$

Under the identification \(p=\mathrm{d}\Phi _{q_0}(x_2) = x_2^TQ\), and writing \(x_1\) for x, we get

$$\begin{aligned} -\Phi _{q_0}(x_1) + \langle x_1,p\rangle - \Phi _{q_0}^*(p)= & {} -x_1^T\frac{Q}{2}x_1 + px_1 - p\frac{Q^{-1}}{2}p^T \\= & {} -x_1^T\frac{Q}{2}x_1 + x_2^TQx_1 - x_2^T \frac{Q}{2} x_2 \\= & {} -(x_1-x_2)^T \frac{Q}{2} (x_1-x_2). \end{aligned}$$

In other words

$$\begin{aligned} c(x,p) = -\sup _{\gamma \in \Pi } - (\gamma (x_1)-x_2)^T \frac{Q}{2} (\gamma (x_1)-x_2) = \inf _{\gamma \in \Pi } (\gamma (x_1)-x_2)^T \frac{Q}{2} (\gamma (x_1)-x_2) = d(x,p)^2/2. \end{aligned}$$

This proves the proposition. \(\square \)
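The resulting cost can be evaluated numerically in the simplest special case. The sketch below assumes, purely for illustration, that \(\Pi = {\mathbb {Z}}^n\) acts on \({\mathbb {R}}^n\) by integer translations (so that M is a flat torus), that Q is symmetric positive definite, and that the infimum over \(\Pi \) is attained among translations with entries bounded by a small `radius`. The function name and these restrictions are assumptions, not part of the proposition.

```python
from itertools import product

def torus_cost(x1, x2, Q, radius=2):
    """Approximate inf over m in Z^n of (x1 + m - x2)^T (Q/2) (x1 + m - x2).

    Only translations m with |m_i| <= radius are searched, which suffices
    when x1, x2 lie in the unit cube and Q is reasonably well conditioned.
    """
    n = len(x1)
    best = float("inf")
    for m in product(range(-radius, radius + 1), repeat=n):
        v = [x1[i] + m[i] - x2[i] for i in range(n)]
        value = sum(v[i] * Q[i][j] * v[j] for i in range(n) for j in range(n)) / 2
        best = min(best, value)
    return best

identity = [[1.0, 0.0], [0.0, 1.0]]
cost_value = torus_cost([0.9, 0.1], [0.1, 0.9], identity)
```

With the identity matrix the two points are closest across the boundary of the unit square, so the cost is governed by the wrapped displacement rather than the Euclidean one in \({\mathbb {R}}^2\).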

Theorem 1.4

Let \((M,L)\) be a compact special Hessian manifold, \(\mu \) and \(\nu \) probability measures on M and \(M^*\), respectively. Then the cost function determined by \((M,L)\) is the squared distance function \(d^2/2\) of the flat Riemannian metric on M induced by L. Hence, Eq. (6) is equivalent to the optimal transport problem determined by \(\mu \), \(\nu \) and \(d^2/2\), where d is the distance function of the flat Riemannian metric on M induced by L.

Proof

By Proposition 4.8, \(d^2/2\) is the cost function induced by \((M,L,\phi )\). The theorem then follows from Theorem 1.3. \(\square \)

5 Einstein–Hessian Metrics

To illustrate that the use of the Legendre transform is not limited to the inhomogeneous Monge–Ampère equation considered in the preceding section, we here consider an analogue of the Kähler–Einstein equation on complex manifolds, and give a variational proof of the existence of solutions.

We first give some brief background on Kähler–Einstein metrics in the complex setting as motivation. For a Kähler manifold \((X,\omega )\), let \(\omega _\varphi \) denote the form \(\omega _\varphi = \omega + dd^c \varphi \), which we assume to be a Kähler form. We call \(\omega _\varphi \) a Kähler–Einstein metric if the equation

$$\begin{aligned} \text {Ric}\omega _\varphi = \lambda \omega _\varphi \end{aligned}$$
(37)

holds for some real constant \(\lambda \). Taking cohomology, we see that for (37) to hold for some \(\varphi \) we must have \(\lambda [\omega ] = c_1(X)\), where \(c_1(X)\) denotes the first Chern class of X, and hence (by the \(dd^c\)-lemma) we have that \(\text {Ric}\,\omega - \lambda \omega = dd^c f\) for some function \(f:X\rightarrow {\mathbb {R}}\). One can show that (37) is then equivalent to solving the complex Monge–Ampère equation

$$\begin{aligned} (\omega + dd^c \varphi )^n = e^{f - \lambda \varphi } \omega ^n. \end{aligned}$$
(38)

We here consider an analogue of (38) in the setting of a compact Hessian manifold.

Theorem 1.6

Let \((M,L,\phi _0)\) be a compact Hessian manifold, let \(\nu \) be an absolutely continuous probability measure of full support on \(M^*\), let \(\mu \) be a probability measure on M, and let \(\lambda \in {\mathbb {R}}\). Then the equation

$$\begin{aligned} \text {MA}_\nu \phi = e^{-\lambda (\phi - \phi _0)} \mu \end{aligned}$$
(9)

has a solution.

To prove Theorem 1.6 we will define a functional analogous to the Ding functional in complex geometry. It will be a modified version of the affine Kantorovich functional used in previous sections and solutions to (9) are stationary points of this functional. Moreover, we will provide an additional proof of Theorem 1.6 using an alternative functional, analogous to the Mabuchi functional in complex geometry.

Definition 5.1

Fix \(\mu _0 \in {\mathcal {P}}(M)\) and \(\nu \in {\mathcal {P}}(M^*)\), where \({\mathcal {P}}(M)\) denotes the space of probability measures on M. We let \(D:C^0(M,L) \rightarrow {\mathbb {R}}\) and \(M: {\mathcal {P}}(M)\rightarrow {\mathbb {R}}\) be defined by

$$\begin{aligned} D(\phi )&= \int _{M^*} \left( \phi ^* - \phi _0^*\right) \mathrm{d}\nu - \frac{1}{\lambda } \log \int _M e^{-\lambda (\phi -\phi _0)} \mu _0 \end{aligned}$$
(39)
$$\begin{aligned} M(\mu )&= \lambda \inf _{\phi \in C^0(M,L)} F_{\mu ,\nu }(\phi ) + \int _M \log \frac{\mu }{\mu _0} \mathrm{d}\mu , \end{aligned}$$
(40)

where \(F_{\mu ,\nu }\) denotes the affine Kantorovich functional (21).

Remark 18

Note that the term \(\int _M \log \frac{\mu }{\mu _0} \mathrm{d}\mu \) is precisely the relative entropy of \(\mu \) with respect to \(\mu _0\). Thus, the functional M is finite only when \(\mu \) has a density with respect to \(\mu _0\). In the following we will denote this density by \(\rho \), i.e., \(\mu = \rho \mu _0\).

We proceed by analyzing the two functionals D and M separately in the following two subsections. The key point in both subsections is that existence of minimizers to D and M will follow from the compactness result of Theorem 1.5.

5.1 The Ding Functional

Lemma 5.2

D descends to a functional on \(C^0(M,L)/{\mathbb {R}}\).

Proof

It is immediate to verify that D is invariant under the action \(\phi \mapsto \phi + c\) for any \(c\in {\mathbb {R}}\). \(\square \)

Using the above lemma, in what follows we may choose a normalization of \(\phi \) such that \(\int _M e^{-\lambda (\phi -\phi _0)}\mu _0 = 1\).

Lemma 5.3

D is continuous as a map \(C^0(M,L) \rightarrow {\mathbb {R}}\), and thus also as a map \(C^0(M,L)/{\mathbb {R}}\rightarrow {\mathbb {R}}\).

Proof

Let \(\phi _i \rightarrow \phi \) in the \(C^0\)-topology. The continuity of the first term was established in Proposition 3.2, and continuity of the second term follows from the dominated convergence theorem. \(\square \)

Lemmas 5.2 and 5.3 then immediately give the following corollary.

Corollary 5.4

D has a convex minimizer.

Proof

The existence of a continuous minimizer follows from compactness and continuity. To show that the minimizer can be taken convex, it suffices to show that \(D(\phi ^{**}) \le D(\phi )\). But the first term of D is unchanged by double Legendre transform, by the equality \(\phi ^{***}= \phi ^*\). Further, since \(\phi ^{**} \le \phi \) for any section of L, we also have

$$\begin{aligned} \frac{-1}{\lambda } \log \int _M e^{-\lambda (\phi ^{**}-\phi _0)} \mu _0 \le \frac{-1}{\lambda } \log \int _M e^{-\lambda (\phi -\phi _0)} \mu _0 \end{aligned}$$

for any \(\lambda \ne 0\). \(\square \)
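The two facts about the Legendre transform used in this proof, \(\phi ^{**}\le \phi \) and \(\phi ^{***}=\phi ^*\), have exact analogues for the discrete Legendre transform on finite grids. The following sketch checks them for a non-convex double-well function on the real line; the grids and the test function are illustrative assumptions and do not model sections of L.

```python
def legendre(values, xs, ps):
    """Discrete Legendre transform: p -> max over the grid xs of x*p - values(x)."""
    return [max(x * p - v for x, v in zip(xs, values)) for p in ps]

xs = [i / 50 for i in range(-100, 101)]    # grid on [-2, 2]
ps = [j / 10 for j in range(-300, 301)]    # slopes in [-30, 30]
f = [(x * x - 1) ** 2 for x in xs]         # non-convex double well

f1 = legendre(f, xs, ps)                   # f^*
f2 = legendre(f1, ps, xs)                  # f^**
f3 = legendre(f2, xs, ps)                  # f^***
```

Here `f2` lies below `f` everywhere and strictly below it between the two wells, while `f1` and `f3` coincide up to rounding, which mirrors why passing to \(\phi ^{**}\) leaves the first term of D unchanged.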

To show that (9) has a solution it thus suffices to show that minimizers of D are characterized by (9).

Proposition 5.5

Let \(\phi \) be a convex section of L. Then \(D: C^0(M,L)\rightarrow {\mathbb {R}}\) is Gateaux differentiable at \(\phi \) and

$$\begin{aligned} dD|_\phi = -\text {MA}_\nu \phi + e^{-\lambda (\phi -\phi _0)} \mu _0. \end{aligned}$$
(41)

Proof

The Gateaux differential of the first term was established in 3.3. Consider the perturbation \(\phi \mapsto \phi + tv\) of \(\frac{-1}{\lambda } \log \int _M e^{-\lambda (\phi -\phi _0)} \mu _0\) by a continuous function v, where we normalize \(\int _M e^{-\lambda (\phi -\phi _0)} \mu _0 = 1\). Since M is compact, the dominated convergence theorem gives that

$$\begin{aligned} \begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}|_{t=0} \frac{-1}{\lambda }\log \int _M e^{-\lambda (\phi -\phi _0 + tv)}\mu _0 = \frac{-1}{\lambda }\int _M \frac{\mathrm{d}}{\mathrm{d}t}|_{t=0} e^{-\lambda (\phi -\phi _0 + tv)} \mu _0 = \int _M v e^{-\lambda (\phi -\phi _0)} \mu _0 \end{aligned} \end{aligned}$$
(42)

and the proposition follows. \(\square \)
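The derivative of the second term of D can be sanity-checked on a finitely supported reference measure. In the sketch below, \(\mu _0\) is assumed to be a vector of weights on finitely many points and \(u = \phi - \phi _0\) a vector of values; no normalization of the integral is imposed, so the derivative carries the normalizing denominator explicitly. All names are illustrative.

```python
import math

def log_term(u, weights, lam):
    """-(1/lam) * log( sum_i w_i * exp(-lam * u_i) ), with u standing for phi - phi_0."""
    total = sum(w * math.exp(-lam * ui) for w, ui in zip(weights, u))
    return -math.log(total) / lam

def log_term_derivative(u, v, weights, lam):
    """Directional derivative of log_term at u in the direction v."""
    dens = [w * math.exp(-lam * ui) for w, ui in zip(weights, u)]
    return sum(vi * d for vi, d in zip(v, dens)) / sum(dens)

def central_difference(u, v, weights, lam, h=1e-6):
    """Finite-difference approximation of the same directional derivative."""
    up = log_term([a + h * b for a, b in zip(u, v)], weights, lam)
    down = log_term([a - h * b for a, b in zip(u, v)], weights, lam)
    return (up - down) / (2 * h)

weights = [0.2, 0.3, 0.5]
u = [0.1, -0.4, 0.7]
v = [1.0, -2.0, 0.5]
```

When the normalization \(\int _M e^{-\lambda (\phi -\phi _0)}\mu _0 = 1\) holds, the denominator in `log_term_derivative` equals one and the expression reduces to the right-hand side of (42).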

We now have the necessary ingredients to give a

Proof of Theorem 1.6

Note that the case where \(\lambda = 0\) is Theorem 1.2. When \(\lambda \ne 0\), the theorem follows from Corollary 5.4 and Proposition 5.5. \(\square \)

Uniqueness of solutions to (9) when \(\lambda < 0\) is also quite easy to show.

Proposition 5.6

The solution to (9) is unique modulo \({\mathbb {R}}\), for \(\lambda < 0\).

Proof

For simplicity assume that \(\lambda = -1\), let \(\psi _0,\psi _1\) be convex sections of L, and let \(\psi _t = (1-t) \psi _0 + t\psi _1\) for \(t\in (0,1)\). Note that \(e^{\psi _t-\phi _0}\in L^p(\mu _0)\) for any \(p\in [1,\infty ]\), by continuity. Using Hölder's inequality we have that

$$\begin{aligned} \int _M e^{(1-t)(\psi _0 - \phi _0) + t (\psi _1-\phi _0)}\mathrm{d}\mu _0 \le \left( \int _M e^{\psi _0 - \phi _0}\mathrm{d}\mu _0 \right) ^{1-t} \left( \int _M e^{\psi _1-\phi _0}\mathrm{d}\mu _0 \right) ^t \end{aligned}$$
(43)

and taking logarithms shows that \(t\mapsto \log \int _M e^{(1-t)(\psi _0 - \phi _0) + t (\psi _1-\phi _0)}\mathrm{d}\mu _0\) is convex in t. But, by the uniqueness part of Proposition 3.2, \(\int _{M^*} (\psi _t^* - \phi _0^*) \mathrm{d}\nu \) is strictly convex in t unless \(\psi _0 -\psi _1\) is constant, and hence uniqueness follows. \(\square \)

5.2 The Mabuchi Functional

We also outline how to achieve the same results as above using the Mabuchi functional.

Proposition 5.7

If \(\nu \) has full support, then M is lower semicontinuous.

Proof

By Theorem 3.6, the first term is continuous. Further the lower semicontinuity of relative entropy is well known. \(\square \)

Proposition 5.8

If \(\nu \) has full support and \(\mu \) has a density \(\rho \) with respect to \(\mu _0\), then M is Gateaux differentiable, and

$$\begin{aligned} dM|_{\mu } = \lambda ( \phi -\phi _0 ) + \log \rho , \end{aligned}$$
(44)

where \(\phi \) is the unique solution to \(\text {MA}_\nu \phi = \mu \).

Proof

The Gateaux differential of relative entropy \(d\left( \int _M \rho \log \rho \mathrm{d}\mu _0\right) = \log \rho \) is well known.

For the first term, let \({\dot{\mu }}\) be a perturbation of \(\mu \), i.e., a measure such that \(\int _M {\dot{\mu }} = 0\). Consider the function \(f(t) = \inf _\phi F_{\mu + t{\dot{\mu }},\nu }(\phi ) := \inf _\phi F_t(\phi )\), i.e., the Mabuchi functional along the one-parameter family of measures given by \({\dot{\mu }}\), defined on some open interval around \(t=0\). Then \(F_t\) is convex (indeed, linear) in t, and since the space of convex sections modulo \({\mathbb {R}}\) is compact we have by Danskin's theorem that f has directional derivatives at \(t=0\). In fact \(f'_{\pm }(0) = \int (\phi _\pm - \phi _0 ) \mathrm{d}{\dot{\mu }}\) where \(\phi _\pm \) are some minimizers of \(F_0\). But by the uniqueness part of Proposition 3.2, \(F_0\) has a unique convex minimizer \(\phi \), and thus \(f'(0) = \int (\phi -\phi _0 ) \mathrm{d}{\dot{\mu }}\), from which the proposition follows. \(\square \)

Proof of Theorem 1.6

First we note that since M is compact, by Prokhorov's theorem \({\mathcal {P}}(M)\) is also compact. Hence, by lower semicontinuity and since \(M(\mu )\) is not identically \(\infty \) (e.g., \(M(\mu _0)< \infty \)), there is some minimizer \(\mu \). Further we have that \(\mu \) is absolutely continuous with respect to \(\mu _0\), since otherwise \(M(\mu ) = \infty \). Thus \(\mu = \rho \mu _0\), and by Proposition 5.8 we have that \(\text {MA}_\nu \phi = \rho \mu _0\) and \(-\lambda (\phi - \phi _0) = \log \rho \). Taking exponentials yields the theorem. \(\square \)

6 Atomic Measures and Piecewise Affine Sections

Definition 6.1

We call a convex section \(\phi : M\rightarrow L\) piecewise affine if for any compact set \(K\subset \Omega \), it holds that \(\Phi _p|_K\) is piecewise affine.

Note that the above definition simply means that \(\Phi \) can locally be written as the \(\sup \) of finitely many affine sections \(p_i \in \Gamma (\Omega ,K)\). Note however that it is not a priori clear that this is equivalent to taking the \(\sup \) over all deck transformations of finitely many \(p_i \in L^*\); however, this is essentially the content of the following theorem.

Theorem 1.7

We call a probability measure \(\mu \) on M atomic if \(\mu = \sum _{i=1}^N \lambda _i \delta _{x_i}\). Let \(\nu \) be an absolutely continuous probability measure of full support on \(M^*\). Then

$$\begin{aligned} \text {MA}_\nu \phi \text { is atomic} \Leftrightarrow \phi \, \text {is piecewise affine.} \end{aligned}$$
(10)

Note that although the measure \(\text {MA}_\nu \phi \) depends on the choice of reference \(\nu \), the condition that \(\text {MA}_\nu \phi = 0\) outside a finite set is in fact independent of \(\nu \) as long as \(\nu \) has a non-vanishing density. To see this, one can use the same identification (2) to identify the Monge–Ampère measure with a measure on the cover \(\Omega \). But this implies that \(\text {MA}_\nu \phi = 0\) if and only if \(\det (\Phi _p)_{ij} = 0\), which is independent of \(\nu \).
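The correspondence between piecewise affine functions and atomic measures is easiest to see in dimension one. The sketch below assumes, as a local model only, a convex function on the real line given as the maximum of finitely many affine functions, with \(\nu \) taken to be Lebesgue measure, so that the Monge–Ampère measure is the distributional second derivative and each atom has mass equal to the jump in slope. The function names and the example are illustrative and ignore the periodicity present on M.

```python
def phi(x, lines):
    """Evaluate Phi(x) = max_i (a_i * x + b_i) for lines given as pairs (a_i, b_i)."""
    return max(a * x + b for a, b in lines)

def monge_ampere_atoms(lines, tol=1e-12):
    """Kink locations of Phi together with the slope jump (atom mass) at each."""
    atoms = {}
    for i, (a1, b1) in enumerate(lines):
        for a2, b2 in lines[i + 1:]:
            if a1 == a2:
                continue
            x = (b2 - b1) / (a1 - a2)           # where the two lines cross
            top = phi(x, lines)
            if abs(a1 * x + b1 - top) > tol:    # crossing lies below the graph
                continue
            active = [a for a, b in lines if abs(a * x + b - top) <= tol]
            atoms[x] = max(active) - min(active)
    return atoms

lines = [(-1.0, 0.0), (0.0, 0.0), (1.0, -1.0)]  # Phi(x) = max(-x, 0, x - 1)
atoms = monge_ampere_atoms(lines)
```

In this example the measure consists of two atoms located at the kinks of \(\Phi \), and the total mass equals the difference between the largest and smallest slope.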

In the section that follows we will, by abuse of notation, use \(\mu \) to denote both the measure on M, and its periodic lift to the universal cover \(\tilde{M}\), which we identify with a fixed convex domain in \({\mathbb {R}}^n\). We also identify \(K^*\) with a fixed convex domain in \(({\mathbb {R}}^n)^*\times {\mathbb {R}}\) by fixing \(q_0\in K^*\), and letting p correspond to \(q_0 - p\). We further let \(\Phi := \Phi _{q_0}\) and \(\Phi _p = \Phi - p\).

Proof

Fix an atomic measure \(\mu \), and let \(\phi \) be the solution to (6). Further fix a compact set \(K \subset \Omega \); we aim to show that \(\Phi |_K\) is piecewise affine. First we note that we may write

$$\begin{aligned} \Phi (x) = \sup _{p\, \text {affine}, p \le \Phi } p(x). \end{aligned}$$
(45)

We claim that it suffices to restrict the \(\sup \) to

$$\begin{aligned} \Phi (x) = \sup _{p(x_i)\le \Phi (x_i)} p(x) \end{aligned}$$
(46)

as a \(\sup \) over all points \(x_i \in \text {supp}\mu \). To see this, let \(\tilde{\Phi }\) be the function defined in the right-hand side of (46). We immediately have that \(\tilde{\Phi } \ge \Phi \), by which we have that \(\tilde{\Phi }^*\le \Phi ^*\). But for any point \(x_i \in \text {supp}\mu \), we also have the reverse inequality, i.e., \(\tilde{\Phi }(x_i) \le \Phi (x_i)\). Combining these observations yields that \(F(\tilde{\Phi }) \le F(\Phi )\), and by the uniqueness result in Proposition 3.2, since \(\Phi \) is a minimizer, we have that \(\tilde{\Phi } = \Phi \).

Next we observe that since \(\Phi \) is a continuous convex function, for any \(x\in K\) the \(\sup \) is attained at some \(p \in L^*\) satisfying \(\Phi (x) = p(x)\), by the Hahn–Banach theorem. More precisely, the \(\sup \) is attained precisely when \(p \in \partial \Phi (x)\). It follows that we may further restrict the \(\sup \) to, for any \(x\in K\),

$$\begin{aligned} \Phi (x) = \sup _{p(x_i) \le \Phi (x_i), p\in \partial \Phi (K)} p(x). \end{aligned}$$
(47)

Furthermore, the subdifferential image \(\partial \Phi (K)\) is compact in \( K^*\), by [14], Remark 6.2.3. Now fix \(p\in L^*\), i.e., p such that \(\Phi _p\) exhausts \(\Omega \). Then there is an open set \(V_p \ni p\) such that \(\inf _{q\in V_p} \Phi _q(x)\) also exhausts \(\Omega \) (Lemma 6.2), and by compactness we may cover \(\partial \Phi (K)\) by a finite collection \(V_{p_j}\) of such open sets. It follows that the function

$$\begin{aligned} f(y) = \inf _{x\in K} \inf _{p\in \partial \Phi (x)} \Phi _p(y) \ge \inf _{p \in \cup V_{p_j}} \Phi _p(y) \end{aligned}$$
(48)

also exhausts \(\Omega \). But for any \(y\in \Omega \), \(p\in \partial \Phi (K)\) we have that \(p(y) \le \Phi (y) - f(y)\), and hence we may for \(x\in K\) restrict the \(\sup \) of (46) to

$$\begin{aligned} \Phi (x) = \sup _{p(x_i) \le \Phi (x_i), p\in \partial \Phi (K), x_i \in \{f \le 1\} } p(x). \end{aligned}$$
(49)

But the set \(\{f\le 1\}\) is compact, since f exhausts \(\Omega \), so \(\{x \in \text {supp}\mu , f\le 1\}\) is finite, and thus \(\Phi \) is piecewise affine on K. \(\square \)

We provide below a lemma for convex functions on \({\mathbb {R}}^n\), which was used in the proof of Theorem 1.7.

Lemma 6.2

Let \(f:\Omega \rightarrow {\mathbb {R}}\) be a convex exhaustion function from an open set \(\Omega \subseteq {\mathbb {R}}^n\). Assume that \(\Omega ^* = \{f^*(p) < \infty \}\) is an open set. Then for any \(p_0 \in \Omega ^*\), there is an open set \(U\ni p_0\) such that

$$\begin{aligned} g(x) := \inf _{U} f(x) - p(x) \end{aligned}$$
(50)

exhausts \(\Omega \).

Proof

Without loss of generality, we assume that \(p_0 = 0 \in \Omega ^*\). Further we denote \(f_p(x) = f(x) - p(x)\). We claim that we may take \(U = B_\delta (0)\), where \(\delta > 0\) is small enough that \(B_{2\delta }(0)\) is relatively compact in \(\Omega ^*\). The lemma follows if we can show that \(\{g(x) \le c\}\) is a closed bounded set and that \(\{g(x) \le c\} \cap \partial \Omega = \emptyset \) for every c, since then \(\{g \le c\}\) is a compact set in \({\mathbb {R}}^n\) contained in the open set \(\Omega \). We thus proceed by showing the following three claims.

  1. g has bounded level sets.

  2. g is continuous.

  3. \(\{g(x) \le c\} \cap \partial \Omega = \emptyset \).

Claim 1

Fix \(q\in B_\delta (0)\), and fix \(x_0\in \Omega \) arbitrarily, and for simplicity we assume that \(x_0 = 0 \in \Omega \). Let \(x\in \Omega \) be arbitrary, and let r be the linear function \(r(y) = \delta \frac{\langle x, y \rangle }{\Vert x\Vert }\). Note that \(q + r\in B_{2\delta }(0)\), and hence by assumption \(f^*(q+r) \le C_\delta \) for some constant \(C_\delta \) depending only on \(\delta \), by continuity of \(f^*\). We then have

$$\begin{aligned} \begin{aligned} f_q(x)&= f_{q+r}(x) + r(x) = f_{q+r}(x) + \delta \Vert x\Vert \\&\ge \delta \Vert x\Vert + \inf f_{q+r}(y) = \delta \Vert x\Vert - f^*(q+r) \ge \delta \Vert x\Vert - C_\delta , \end{aligned} \end{aligned}$$
(51)

and the first claim follows.

Claim 2

Let C be the closure of \(B_\delta (0)\), and note that \(g(x) = \min _{p\in C} f_p(x)\) by continuity and compactness. Thus \(g(x) = f_{p_x}(x)\) for some \(p_x \in C\). Fix \(x\in \Omega \), and let \(x_i\) be any sequence converging \(x_i \rightarrow x\). Then \(g(x_i) = f_{p_i}(x_i)\) for some \(p_i \in C\), and by compactness we have that \(p_i \rightarrow p\) for some \(p\in C\), up to subsequence. But then we also have that

$$\begin{aligned} g(x) \le f_p(x) = \lim f_{p_i}(x_i) = \lim g(x_i), \end{aligned}$$
(52)

by continuity of \(f_p(x)\) in (p, x). Hence g is lower semicontinuous in x. Upper semicontinuity follows from the fact that g is defined as an \(\inf \) of a family of convex functions.

Claim 3

The first two claims show that \(\{g(x) \le c\}\) is a compact set in \({\mathbb {R}}^n\). To show the third claim it suffices to show that \(g(x_i) \rightarrow \infty \) for any sequence such that \(x_i \rightarrow x\in \partial \Omega \), where we may assume that \(x_i\) is bounded. But for any \(p\in B_\delta (0)\) we have that \(f_p(x_i) = f(x_i) - p(x_i) \ge f(x_i) - \delta \Vert x_i\Vert \ge f(x_i) - C_\delta \), since \(\Vert x_i\Vert \) is bounded. It follows that \(g(x_i) \rightarrow \infty \).

\(\square \)

Theorem 1.8

Any Hessian metric \(\phi _0\) on a compact Hessian manifold \((M,L,\phi _0)\) can be approximated uniformly by piecewise affine sections.

Proof

We fix the reference measure \(\nu = vol(\phi _0^*)\) as the Riemannian volume form corresponding to \(\phi _0^*\), and approximate \(\mu = \text {MA}_\nu \phi _0\) by atomic measures \(\mu _i\rightarrow \mu \). Then the solutions to \(\text {MA}_\nu \phi _i = \mu _i\) are piecewise affine, and by Theorem 3.6 we have that \(\phi _i \rightarrow \phi _0\) uniformly. \(\square \)

We also note a geometric consequence of Theorem 1.7: any piecewise affine convex function \(\Phi : \Omega \rightarrow {\mathbb {R}}\) corresponds to a tiling of \(\Omega \) by convex polytopes. Hence solving \(\text {MA}_\nu \phi = \mu \) for an atomic measure \(\mu \) yields a quasi-periodic tiling of \(\Omega \).
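The correspondence between piecewise affine convex functions and tilings can be illustrated by a small sketch. It assumes a function on \({\mathbb {R}}^2\) given as a finite maximum \(\Phi (x) = \max _i (\langle p_i, x\rangle - c_i)\); the cell of index i is the set where the i-th affine piece attains the maximum, which is an intersection of half-planes and hence a convex polyhedron. The names `evaluate`, `cell_labels` and `EXAMPLE_PIECES` are hypothetical, and the example data is not taken from the paper. The group action and quasi-periodicity are not modeled here.

```python
from typing import List, Sequence, Tuple

# An affine piece x -> <p, x> - c, stored as (slope p in R^2, offset c).
Affine = Tuple[Tuple[float, float], float]

# Illustrative data: Phi(x) = max(|x_1|, |x_2|), whose cells are four cones.
EXAMPLE_PIECES: List[Affine] = [
    ((1.0, 0.0), 0.0),
    ((-1.0, 0.0), 0.0),
    ((0.0, 1.0), 0.0),
    ((0.0, -1.0), 0.0),
]


def evaluate(pieces: Sequence[Affine],
             x: Tuple[float, float]) -> Tuple[float, int]:
    """Return (Phi(x), index of the first active affine piece at x)."""
    best_value, best_index = float("-inf"), -1
    for i, (p, c) in enumerate(pieces):
        value = p[0] * x[0] + p[1] * x[1] - c
        if value > best_value:
            best_value, best_index = value, i
    return best_value, best_index


def cell_labels(pieces: Sequence[Affine], xs: Sequence[float],
                ys: Sequence[float]) -> List[List[int]]:
    """Label each grid point (x, y) by the piece that is active there."""
    return [[evaluate(pieces, (x, y))[1] for x in xs] for y in ys]
```

Points on a common face of two cells are assigned to the piece with the smaller index; this tie-breaking rule is a convention of the sketch only.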

7 Orbifolds

In this section we outline a generalization of the main results to the setting of orbifolds. Throughout this section the setup is that of a compact affine manifold M together with a properly discontinuous affine action of a finite group G on M. We let \(X=M/G\), regarded as a Hausdorff topological space; since the action of G is not assumed to be free, X is not in general a manifold.

We call \(X = M/G\) a Hessian orbifold if M comes equipped with a G-equivariant Hessian metric \(\phi \). Note that given such a metric, the affine \({\mathbb {R}}\)-bundle \(L\rightarrow M\) yields a principal \({\mathbb {R}}\)-bundle \(L/G \rightarrow M/G\), and we denote a Hessian orbifold by the data \((M,L,G)\).

Note that sections of \(L/G\rightarrow M/G\) are simply G-equivariant sections of \(L\rightarrow M\). Letting \((M^*,L^*)\) denote the dual manifold of \((M,L)\), we may, as in the manifold setting, define a dual action of G on \((M^*,L^*)\) and construct in precisely the same way a dual compact Hessian orbifold \((M^*,L^*,G^*)\). Note that \(G^* = G\) as groups; however, we use a superscripted \(*\) to indicate that G acts differently on \((M^*,L^*)\).

The extension of Theorem 1.2 can be formulated as follows:

Theorem 7.1

Let \(\mu \),\(\nu \) be probability measures on the compact Hessian orbifolds \((M,L,G),(M^*,L^*,G^*)\), respectively. Then the equation

$$\begin{aligned} \text {MA}_\nu \phi = \mu \end{aligned}$$
(53)

has a solution. Equivalently, given G- and \(G^*\)-invariant probability measures \(\mu ,\nu \) on \(M,M^*\), respectively, there is a G-equivariant solution \(\phi : M\rightarrow L\) to

$$\begin{aligned} \text {MA}_\nu \phi = \mu . \end{aligned}$$
(54)

The technique used to prove the above theorem follows the same principle as in the manifold setting. Instead of producing solutions to (53) directly, one may look for equivariant solutions to a Monge–Ampère equation on the covering manifolds \(M,M^*\). The key point is the correspondence, in the manifold setting, between Hessian metrics on M and equivariant convex exhaustion functions on the universal cover \(\Omega \); the extension to the orbifold setting amounts to additionally requiring equivariance with respect to G. A subtle point, however, is that to guarantee that the Kantorovich functional is somewhere finite, it is crucial that we integrate against finite measures on \(M,M^*\), whereas the corresponding measures on \(\Omega ,\Omega ^*\) are only locally finite. In the manifold setting this correspondence between locally finite measures and probability measures is given by pushing forward under the local homeomorphisms given by the covering map \(\Omega \rightarrow M\). In the orbifold setting, however, the quotient map \(M \rightarrow M/G\) is not a covering map, and does not give local homeomorphisms near the fixed points of the action of G. Thus, there seems to be no obvious way to construct a probability measure on X given a locally finite measure on \(\Omega \). This lack of correspondence for probability measures is also the reason why we are not able to deal with infinite groups G.

Since we assume that \(X = M/G\) is the quotient of a compact Hessian manifold by a finite group, we may push forward any probability measure on M by the quotient map to obtain a probability measure on X (and similarly for \(M^*\)), and any probability measure on X arises in this way. Hence, given G- and \(G^*\)-invariant probability measures \(\mu ,\nu \) on \(M,M^*\), pushing forward to probability measures \(\mu _X,\nu _X\) yields a Kantorovich functional

$$\begin{aligned} F(\phi ) = \int _X (\phi - \phi _0) \mathrm{d}\mu _X + \int _{X^*} (\phi ^* - \phi _0^*) \mathrm{d}\nu _X. \end{aligned}$$
(55)

The arguments in the manifold setting can then be repeated, mutatis mutandis, to yield the existence of a convex minimizer of F, corresponding to a convex minimizer of the Kantorovich functional on \(M,M^*\) under the constraint of G-equivariance.
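To make the shape of such a functional concrete, the following is a minimal sketch of a finite, flat toy model. It is not the functional (55) itself: the reference terms \(\phi _0,\phi _0^*\) are dropped, the measures are assumed to be finite sums of point masses on \({\mathbb {R}}\), and \(\phi ^*\) is taken to be the discrete Legendre transform \(\phi ^*(p) = \max _i (p x_i - \phi (x_i))\). All names (`legendre`, `kantorovich`) are hypothetical.

```python
from typing import Sequence


def legendre(xs: Sequence[float], phi: Sequence[float], p: float) -> float:
    """Discrete Legendre transform phi*(p) = max_i (p * x_i - phi_i)."""
    return max(p * x - v for x, v in zip(xs, phi))


def kantorovich(xs: Sequence[float], phi: Sequence[float],
                mu: Sequence[float], ps: Sequence[float],
                nu: Sequence[float]) -> float:
    """Toy functional F(phi) = sum_i mu_i phi_i + sum_j nu_j phi*(p_j).

    mu and nu are assumed to be probability vectors supported on the
    points xs and ps, respectively.
    """
    primal = sum(m * v for m, v in zip(mu, phi))
    dual = sum(n * legendre(xs, phi, p) for n, p in zip(nu, ps))
    return primal + dual
```

For example, with `xs = ps = [-1.0, 0.0, 1.0]` and uniform weights, the values `phi = [0.5, 0.0, 0.5]` of \(x^2/2\) give \(F = 2/3\), which is the lower bound \(\sum _i \mu _i x_i^2\) obtained from Young's inequality \(\phi (x_i) + \phi ^*(x_i) \ge x_i^2\). Adding a constant to `phi` leaves F unchanged, because both weight vectors sum to one. Restricting to values with \(\phi (-x) = \phi (x)\) would loosely mimic an equivariance constraint for the group of order two acting by \(x \mapsto -x\); this analogy is an illustration only.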