1 Introduction

The incompressible semi-geostrophic equations (SG) model the large-scale dynamics of rotational atmospheric flows. They can be viewed as a low Rossby number limit of the primitive equations, and are used by meteorologists to diagnose irregularities in simulated Navier–Stokes flows of the atmosphere on length scales of the order of tens of kilometres (see Cullen [10] and Visram et al. [53]). First proposed by Eliassen [20] in 1949, and subsequently developed by Hoskins [31] in 1975, the semi-geostrophic equations have attracted significant attention from the mathematical community over the past twenty years owing partly to their connection with optimal transport theory (see [1, 2, 4, 9, 11,12,13, 21,22,23,24, 39,40,41, 47]).

In this paper we consider SG in geostrophic coordinates, associated to flows on an arbitrary convex bounded (physical) domain \(\varOmega \subset \mathbb {R}^{3}\), which we interpret as the active transport equation

$$\begin{aligned} \partial _t \alpha _t + \mathscr {W}[\alpha _t] \cdot \nabla \alpha _t =0 \end{aligned}$$
(1)

for the time-dependent measure-valued map \(\alpha \), which is known as the potential vorticity. The connection between SG and optimal transport is contained in the nonlocal divergence-free velocity field \(\mathscr {W}[\alpha ]\), which is often called the geostrophic velocity. Let \(\mathscr {B}[\alpha _t]\) denote the unique mean-zero convex function whose gradient is the optimal transport map between the Lebesgue measure on \(\varOmega \) and the Borel measure \(\alpha _t\) with respect to the quadratic cost, and let \(\mathscr {B}[\alpha _t]^*\) denote its Legendre-Fenchel transform on \(\varOmega \). At each time t, \(\mathscr {W}[\alpha ]\) is given by

$$\begin{aligned} \mathscr {W}[\alpha _t]:=J(\mathrm {id}_{\mathbb {R}^3}-\nabla \mathscr {B}[\alpha _t]^*), \end{aligned}$$

where \(\mathrm {id}_{\mathbb {R}^3}\) denotes the identity on \(\mathbb {R}^3\) and

$$\begin{aligned} J:=\left( \begin{array}{c c c} 0 &{} -1 &{} 0 \\ 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ \end{array} \right) . \end{aligned}$$
(2)

Guided by the work of Cullen and Purser [14], this connection was first established rigorously by Benamou and Brenier. In [4], those authors proved the existence of global-in-time weak solutions of (1) for initial measures which are absolutely continuous with respect to the Lebesgue measure and have compactly-supported \(L^{p}\) density for \(p>3\). This result was extended by Lopes Filho and Nussenzveig Lopes in [41] to the case where \(p\ge 1\), and by Loeper [40] and later Feldman and Tudorascu [23] to the case where the initial measure need only have compact support in \(\mathbb {R}^3\). In [22, Proposition 4.14], Feldman and Tudorascu use Ambrosio and Gangbo’s abstract techniques for Hamiltonian ODEs in Wasserstein space [3] to prove that when the initial measure is an arbitrary convex combination of Dirac masses there exists a global-in-time solution that maintains the discrete structure of the initial data. The two known results regarding the uniqueness of solutions of (1) are the local-in-time uniqueness of Hölder continuous periodic solutions proved in [40], and weak-strong uniqueness under uniform convexity proved in [24]. Otherwise, the problem of uniqueness of solutions remains open.

The main contribution of this paper is an alternative proof of the existence of global-in-time weak solutions of (1) for arbitrary compactly-supported initial measures, which uses recently developed techniques from semi-discrete optimal transport to treat the case where the initial measure is discrete (see Sect. 3 and, in particular, Theorems 3.5 and 3.6). For a wide class of discrete initial measures our result recovers [22, Proposition 4.14] with improved time regularity (twice continuously differentiable rather than Lipschitz) and uniqueness. More significantly, our application of semi-discrete optimal transport to SG illuminates an explicit and intuitive connection between geostrophic coordinates and corresponding flows in the physical domain \(\varOmega \). It also gives a constructive way of determining solutions explicitly, and it forms the basis of an effective numerical scheme, as we illustrate in Sect. 7.

1.1 SG in geostrophic coordinates and semi-discrete optimal transport

In this section, we describe our approach to studying (1) using semi-discrete optimal transport, which is the special case of optimal transport in which the source measure is absolutely continuous with respect to the Lebesgue measure and the target measure is discrete.

In recent years, semi-discrete optimal transport theory has seen significant expansion in its theoretical foundations (see [5, 15, 17, 30, 32, 36, 38, 42,43,44,45]). It has also been applied to many diverse problems in the sciences, both within fluid dynamics [28, 37] and elsewhere such as materials science [6, 7, 33], economics [27, Chapter 5], crowd dynamics [34] and image interpolation [36]. Inspired by ideas in the original work of Cullen and Purser [14] on piecewise constant solutions of semi-geostrophic slice models, we use semi-discrete optimal transport to analyse (1) in the special case where the initial potential vorticity \(\alpha _0=\overline{\alpha }\) is a discrete measure, i.e.,

$$\begin{aligned} \overline{\alpha }=\sum _{i=1}^{N}m_{i}\delta _{\overline{z}_{i}} \end{aligned}$$

for some \(m_{i}>0\) and \(\overline{z}_{i}\in \mathbb {R}^{3}\). We show (Theorem 3.5) that for well-prepared discrete initial data (see Definition 3.4) there exists a corresponding discrete solution \(t\mapsto \alpha _{t}\) of (1) of the form

$$\begin{aligned} \alpha _{t}=\sum _{i=1}^{N}m_{i}\delta _{z_{i}(t)}, \end{aligned}$$
(3)

where the trajectories \(z_i\) are twice continuously differentiable. Denoting by \(\mathscr {L}^3\) the Lebesgue measure on \(\mathbb {R}^3\) and by \(\mathscr {L}^3\llcorner \varOmega \) its restriction to \(\varOmega \), the optimal transport map between \(\mathscr {L}^3\llcorner \varOmega \) and \(\alpha _t\) given by (3) is a piecewise constant function

$$\begin{aligned} T=\sum _{i=1}^Nz_i\mathbb {1}_{C_i}, \end{aligned}$$

where \(\{C_i\}_{i=1}^N\) is a tessellation of \(\varOmega \) by convex sets, known as the optimal Laguerre tessellation (see Definitions 2.4 and 2.5) generated by the seed vector \(\mathbf {z}:=(z_1,\ldots ,z_N)\) subject to the mass constraint

$$\begin{aligned} \mathscr {L}^3(C_i)=m_i\quad \forall \, i\in \{1,\ldots ,N\}. \end{aligned}$$

Letting \(x_i(\mathbf {z})\) denote the centroid of the Laguerre cell \(C_i(\mathbf {z})\), one can show (see Lemma 4.2) that the time-dependent measure-valued map defined by (3) is a weak solution of (1) if and only if the trajectories \(z_1,\ldots ,z_N\) satisfy the ODE initial value problem (IVP)

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{dz_{i}}{dt}=J(z_{i}-x_{i}(\mathbf {z})),\\ z_i(0)=\overline{z}_i, \end{array}\right. } \end{aligned}$$
(4)

for \(i\in \{1,\ldots ,N\}\). At each time t, the seeds \(z_i(t)\) in geostrophic space generate a Laguerre tessellation of the physical domain \(\varOmega \) (see Fig. 1 for a 2D illustration).

Fig. 1

Typical snapshot at a time t of a 2D discrete solution \(\alpha \) of (1) and the corresponding tessellation of the physical domain \(\varOmega \), which is taken here to be a square subset of \(\mathbb {R}^2\). The right-hand plot shows a configuration of \(N=8\) seeds \(\mathbf {z}(t)=(z_1(t),\ldots ,z_8(t))\) in 2D geostrophic space, on which the measure \(\alpha _t\) is supported. The left-hand plot shows the (approximate) Laguerre tessellation of \(\varOmega \) generated by \(\mathbf {z}(t)\) subject to the constraint that all cells have the same mass. Black lines represent Laguerre cell boundaries and the circle within each cell \(C_i\) represents its centroid \(x_i(\mathbf {z}(t))\)
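
To make the ODE-IVP (4) concrete, the following Python sketch integrates it in the simplest case \(N=1\), where the single Laguerre cell is all of \(\varOmega \) and the centroid \(x_1\) is therefore the fixed centroid of \(\varOmega \). The unit-cube domain, the function names and the choice of a classical Runge-Kutta integrator are illustrative assumptions and are not taken from the paper.

```python
import math

def apply_J(v):
    """Apply the matrix J from (2): (a, b, c) -> (-b, a, 0)."""
    return (-v[1], v[0], 0.0)

def rhs(z, centroid):
    """Right-hand side of (4) for a single seed: J (z - x)."""
    return apply_J(tuple(zi - xi for zi, xi in zip(z, centroid)))

def rk4_step(z, centroid, dt):
    """One classical Runge-Kutta step for dz/dt = J (z - centroid)."""
    def shift(a, k, s):
        return tuple(ai + s * ki for ai, ki in zip(a, k))
    k1 = rhs(z, centroid)
    k2 = rhs(shift(z, k1, dt / 2), centroid)
    k3 = rhs(shift(z, k2, dt / 2), centroid)
    k4 = rhs(shift(z, k3, dt), centroid)
    return tuple(
        zi + dt / 6 * (a + 2 * b + 2 * c + d)
        for zi, a, b, c, d in zip(z, k1, k2, k3, k4)
    )

def integrate_single_seed(z0, centroid, t_end, steps):
    """Integrate (4) with N = 1 from time 0 to t_end."""
    z = tuple(float(c) for c in z0)
    dt = t_end / steps
    for _ in range(steps):
        z = rk4_step(z, centroid, dt)
    return z

# Assumed domain: the unit cube (0,1)^3, whose centroid is (1/2, 1/2, 1/2).
CENTROID = (0.5, 0.5, 0.5)
z_final = integrate_single_seed((2.0, 0.5, 0.3), CENTROID, math.pi / 2, 200)
```

In this toy case the seed rotates about the vertical axis through the centroid while its third component stays fixed, so after a quarter turn `z_final` is close to `(0.5, 2.0, 0.3)`.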

We obtain a solution of (1) for an arbitrary compactly-supported initial measure \(\overline{\alpha }\) by generating a sequence \((\overline{\alpha }^N)_{N\in \mathbb {N}}\) of well-prepared discrete measures converging to \(\overline{\alpha }\) in the Wasserstein 2-distance, evolving each of these discrete measures according to the corresponding ODE-IVP (4), and using compactness in the space of continuous measure-valued maps to pass to the limit as \(N\rightarrow \infty \). As such, our construction method can be thought of as a meshless or particle method. In contrast to the proof given in [4], and later generalised in [40, 41], the discretisation occurs in the spatial domain rather than in the time domain. Note that in [13] Cullen, Gangbo and Pisante analyse a variant of SG using a spatial discretisation different from the one considered in this paper. Analytically, the essential benefit of the discretisation of the initial measure \(\overline{\alpha }\) is that the study of the active transport equation (1), whose velocity field is in general only of class \(BV_{\mathrm {loc}}\), is replaced by the study of the ODE-IVP (4), whose right-hand side is continuously differentiable \(\mathscr {L}^{3N}\)-almost everywhere. Mollifications of the vector field and related quantities used in [4, 40, 41], as well as the abstract techniques for Hamiltonian ODEs in Wasserstein space used in [22], are therefore avoided, resulting in a more direct solution procedure.

1.2 Background on the semi-geostrophic equations

In this section, we briefly describe how Eq. (1) is derived. In their traditional Eulerian formulation in a fixed spatial domain \(\varOmega \subset \mathbb {R}^{3}\), the (non-dimensionalised) semi-geostrophic equations are given by the coupled system

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \partial _{t}u_{g}+(u\cdot \nabla )u_{g}=-Ju_{a}, \\ \displaystyle \partial _{t}\theta +(u\cdot \nabla )\theta =0, \\ \nabla \cdot u=0, \end{array} \right. \end{aligned}$$
(5)

where \(u_{g}:=(u_{g, 1}, u_{g, 2}, 0)^{\mathrm {T}}\) is the geostrophic velocity field, u is the Eulerian velocity field of the fluid, \(u_{a}:=u-u_{g}\) is the ageostrophic velocity field, \(\theta \) is the potential temperature, and the matrix J, defined by (2), encodes planetary rotation. Importantly, the hydrodynamic and thermodynamic fields \(u_{g}\) and \(\theta \) are linked through the fluid pressure p by the identity

$$\begin{aligned} \nabla p=\left( \begin{array}{c} u_{g, 2} \\ -u_{g, 1} \\ \theta \end{array} \right) . \end{aligned}$$
(6)

When posed on a suitably smooth bounded domain \(\varOmega \subset \mathbb {R}^3\), the system (5) is typically supplemented with the no-flux boundary condition \(u\cdot n=0\) on \(\partial \varOmega \), where n is the outward unit normal field on the boundary, which is sufficient to ensure that fluid points remain in the domain \(\varOmega \) for all times. Define the geopotential P pointwise by

$$\begin{aligned} P(x, t):=p(x, t)+\frac{1}{2}(x_{1}^{2}+x_{2}^{2}) \end{aligned}$$

for \(x\in \varOmega \) and \(t\ge 0\). The system (5) can then be expressed as

$$\begin{aligned} \frac{\partial }{\partial t}\nabla P+(U[\nabla P]\cdot \nabla )\nabla P=J(\nabla P-\mathrm {id}_{\varOmega }), \end{aligned}$$
(7)

where \(\mathrm {id}_{\varOmega }\) denotes the identity map on \(\varOmega \), and \(U:\nabla P\mapsto u\) is the formal solution operator associated to the (time-independent) div-curl boundary-value problem

$$\begin{aligned} \left\{ \begin{array}{l} \nabla \wedge (D^{2}Pu)=\nabla \wedge J(\nabla P-\mathrm {id}_{\varOmega }) \quad \text {in}\quad \varOmega , \\ \nabla \cdot u=0 \quad \text {in}\quad \varOmega , \\ u\cdot n=0 \quad \text {on}\quad \partial \varOmega . \end{array} \right. \end{aligned}$$
(8)

By way of this simple change of dependent variable, SG can be viewed as an inhomogeneous active transport equation (7) whose unknown \(\nabla P\) is a time-dependent conservative vector field on \(\varOmega \). The Eulerian velocity field is then formally defined through the action of the solution operator U. This change of dependent variable also highlights a substantial mathematical difficulty one faces when constructing solutions of (7): for the boundary-value problem (8) to be of elliptic type at each time t, \(P(\cdot ,t)\) must be strictly convex.

The state-of-the-art regarding the existence of solutions of SG in Eulerian coordinates is due to Ambrosio et al. [1]. Using the \(W^{2, 1}_{\mathrm {loc}}\)-regularity of Alexandrov solutions of a class of Dirichlet boundary-value problems for the Monge–Ampère equation established in [16], the authors proved the existence of global-in-time distributional solutions of SG in Eulerian coordinates posed on smooth convex domains \(\varOmega \subset \mathbb {R}^{3}\) for a class of initial geopotentials \(P_0\) satisfying

However, for such solutions, the support of the pushforward measure is the whole space \(\mathbb {R}^3\) at each time t. The relation (6) then implies that the temperature field \(\theta \) satisfies \(\theta (\cdot , t)\notin L^{\infty }(\varOmega )\). Interpreted physically, this means that the atmospheric fluid is arbitrarily hot on sets of positive measure at all times. At the time of writing, the existence of either local-in-time or global-in-time distributional solutions of (7) for physical initial data \(P_0\) satisfying \(\nabla P_{0}(\varOmega )\subset \subset \mathbb {R}^{3}\) remains open.

Since the pioneering work of Hoskins [31], Cullen and Purser [14], and Benamou and Brenier [4] on the semi-geostrophic equations, it has become customary to regard \(\nabla P(\cdot , t)\) formally as a diffeomorphism between \(\varOmega \) and its image \(\nabla P(\varOmega , t)\) for each time t. The system (5) is then transformed to the time-dependent coordinate system determined by \(\nabla P\), known as geostrophic coordinates. It is a remarkable property of SG that, as shown in [4], this formal change of coordinates yields a closed equation which is free of the field u. Indeed, under the assumption that \(\nabla P\) is a smooth solution of (7) and \(P(\cdot , t)\) is strictly convex at each time t, it can be shown that the time-dependent pushforward measure

$$\begin{aligned} \alpha _t:=\nabla P(\cdot ,t)_\#\left( \mathscr {L}^3\llcorner \varOmega \right) \end{aligned}$$

is a distributional solution of the active transport equation (1).

1.3 Outline of the paper

We begin in Sect. 2 with a brief introduction to semi-discrete optimal transport theory. Section 3 contains the statement of the following existence results whose novel proofs are the main contribution of this paper:

  1. discrete geostrophic solutions with well-prepared discrete initial data exist, are unique, and are defined by trajectories that are twice continuously differentiable in time (see Definition 3.4 and Theorem 3.5);

  2. Lipschitz-in-time solutions of SG in geostrophic coordinates with arbitrary compactly-supported initial measure can be constructed as the uniform limit of a sequence of discrete geostrophic solutions that are twice continuously differentiable in time (Theorem 3.6).

These results are proved in Sects. 4 and 5 respectively. Section 6 contains the explicit calculation of two exact solutions of SG in geostrophic coordinates, as well as a brief discussion on equilibrium solutions. Finally, in Sect. 7, we illustrate the theory developed in the paper by simulating a 2D semi-geostrophic flow in geostrophic coordinates, and we plot the corresponding Laguerre tessellations of the physical domain \(\varOmega \).

1.4 Notation

Let \(d\in \mathbb {N}\). We denote by \(\mathbb {R}^d_{>}\) the subset of \(\mathbb {R}^d\) consisting of all vectors whose components are positive. For \(i\in \{1,\ldots ,d\}\), the \(i^{\text {th}}\) canonical basis vector in \(\mathbb {R}^d\) is denoted by \(e_i\). Let \(A\subseteq \mathbb {R}^d\) be a Borel set. We denote the identity map on A by \(\mathrm {id}_A\), and the characteristic function of A by \(\mathbb {1}_A\). We denote the interior of A by \(\mathrm {Int}(A)\) and the boundary of A by \(\partial A\).

Measures We denote by \(\mathscr {L}^d\) the Lebesgue measure on \(\mathbb {R}^d\) and by \(\mathscr {L}^d\llcorner A\) its restriction to A. The set of Borel probability measures on A is denoted by \(\mathscr {P}(A)\). Given a Borel map \(T:A\rightarrow \mathbb {R}^d\) and a measure \(\mu \in \mathscr {P}(A)\), the pushforward of \(\mu \) by T is denoted by \(T_\#\mu \) and is defined by \(T_\#\mu (B)=\mu (T^{-1}(B))\) for all Borel sets \(B\subseteq \mathbb {R}^d\). The set of Borel probability measures on \(\mathbb {R}^d\) with compact support is denoted by \(\mathscr {P}_c(\mathbb {R}^d)\). For any \(p\in [1,+\infty )\), \(\mathscr {P}_p(\mathbb {R}^d)\) denotes the set of all Borel probability measures on \(\mathbb {R}^d\) with finite moments of order p, equipped with the Wasserstein p-distance \(W_p\). This is defined for \(\mu ,\,\nu \in \mathscr {P}_p(\mathbb {R}^d)\) by

$$\begin{aligned} W_p(\mu ,\nu ):=\inf \left\{ \int _{\mathbb {R}^d\times \mathbb {R}^d} |x-y|^p\,\mathrm {d}\gamma (x,y)\,:\, \gamma \in \mathscr {P}(\mathbb {R}^d\times \mathbb {R}^d),\,{\pi _x}_\#\gamma =\mu \text{, }\,{\pi _y}_\#\gamma =\nu \right\} ^{\frac{1}{p}}, \end{aligned}$$

where \(\pi _x\) and \(\pi _y\) denote the projections onto the first and second variables, respectively. Throughout this paper spaces of probability measures are understood to be equipped with the Wasserstein 2-distance unless otherwise stated.
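
As a small illustration of the definition of \(W_p\), the Python sketch below evaluates it in a special case where the infimum is known in closed form: two measures on \(\mathbb {R}\) (so \(d=1\)) that each consist of n atoms of equal mass 1/n. In that setting the monotone coupling, which matches sorted atoms, is optimal for every \(p\ge 1\). The function name and the restriction to this special case are assumptions made for illustration.

```python
def wasserstein_1d(xs, ys, p=2):
    """W_p between two uniform empirical measures on the real line.

    Both measures must have the same number of atoms, each of mass 1/n.
    The optimal coupling in 1D pairs the sorted atoms in order.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    n = len(xs)
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / n
    return cost ** (1.0 / p)

# Translating a measure by 1 moves every atom by 1, so W_2 = 1.
shifted = wasserstein_1d([0.0, 1.0], [1.0, 2.0])
```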

Convex functions Given a convex function \(f:A\rightarrow \mathbb {R}\), the subdifferential of f is the set-valued function, mapping from A into the set of subsets of \(\mathbb {R}^d\), defined by

$$\begin{aligned} \partial f (x)=\left\{ y\in \mathbb {R}^d \ \Big \vert \ y\cdot (z-x)\le f(z)-f(x)\quad \forall \,z\in A \right\} . \end{aligned}$$

The Legendre-Fenchel transform of f is the function \(f^*:\mathbb {R}^d\rightarrow \mathbb {R}\) defined by

$$\begin{aligned} f^*(y)=\underset{x\in A}{\sup }\left\{ x\cdot y-f(x)\right\} . \end{aligned}$$
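
The supremum in this definition can be approximated by a maximum over a finite grid. The Python sketch below does this in one dimension for \(f(x)=\frac{1}{2}x^2\) on \(A=[-2,2]\); the grid, the function names and the choice of f are illustrative assumptions. Because A is bounded, \(f^*(y)=\frac{1}{2}y^2\) only for \(|y|\le 2\), and the transform grows linearly beyond that range.

```python
def legendre_fenchel(f, grid, y):
    """Grid approximation of f*(y) = sup_{x in A} (x*y - f(x)), with d = 1."""
    return max(x * y - f(x) for x in grid)

def quadratic(x):
    return 0.5 * x * x

# Uniform grid on A = [-2, 2] with spacing 0.001 (an assumed discretisation).
GRID = [-2.0 + 4.0 * k / 4000 for k in range(4001)]

inside = legendre_fenchel(quadratic, GRID, 1.0)   # maximiser x = 1 lies in A
outside = legendre_fenchel(quadratic, GRID, 3.0)  # maximiser is the endpoint x = 2
```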

Test functions We denote by \(\mathcal {D}(\mathbb {R}^d)\) the space of test functions \(C^{\infty }_c(\mathbb {R}^d)\) equipped with the standard semi-norm topology (see, for example, [26]).

Physical domain Throughout this paper (with the exception of Sect. 7) \(\varOmega \) is taken to be an arbitrary convex open bounded subset of \(\mathbb {R}^3\), and, without loss of generality, we use the normalisation convention that \(\mathscr {L}^3(\varOmega )=1\) so that all measures under consideration are probability measures.

2 Semi-discrete optimal transport

In this section we review some basic aspects of semi-discrete optimal transport theory. For further information on semi-discrete optimal transport see [44, Section 4] and, for more on optimal transport theory in greater generality, see [48, 51, 52].

Given a target measure \(\nu \in \mathscr {P}_2(\mathbb {R}^3)\), a Borel map \(T:\varOmega \rightarrow \mathbb {R}^3\) is said to be an optimal transport map between \(\mathscr {L}^3\llcorner \varOmega \) and \(\nu \) with respect to the quadratic cost \(c:\mathbb {R}^3\times \mathbb {R}^3\rightarrow \mathbb {R}\) given by

$$\begin{aligned} c(x,y)=|x-y|^2 \end{aligned}$$

if it minimises the transport cost

$$\begin{aligned} \int _{\varOmega }|x-T(x)|^2\,\mathrm {d}x \end{aligned}$$

subject to the constraint that

$$\begin{aligned} T_\#\left( \mathscr {L}^3\llcorner \varOmega \right) =\nu . \end{aligned}$$

The problem of finding an optimal transport map given source and target measures is known as the Monge problem. For any \(\nu \in \mathscr {P}_2(\mathbb {R}^3)\) such a map T exists, is unique, and can be expressed as the gradient of a convex function \(\varPhi \) belonging to the Sobolev space \(H^1(\varOmega )\) (see for example [48, Theorem 1.22] or [51, Theorem 2.12]).

Definition 2.1

We define the operator \(\mathscr {B}:\mathscr {P}_2(\mathbb {R}^3)\rightarrow H^1(\varOmega )\) to be that which sends any given \(\nu \in \mathscr {P}_2(\mathbb {R}^3)\) to the unique mean-zero convex function in \(H^1(\varOmega )\) whose gradient is the unique optimal transport map between \(\mathscr {L}^3\llcorner \varOmega \) and \(\nu \) with respect to the quadratic cost.

The stability of optimal transport, including the continuity of \(\mathscr {B}\), has been studied in, for example, [5, 38, 43] and [52, Theorem 5.20 and Corollary 5.23]. In [43] it is shown that for any bounded set \(A\subset \mathbb {R}^3\), the restriction of \(\mathscr {B}\) to \(\mathscr {P}(A)\) is Hölder continuous.

Theorem 2.2

(cf. [43, Theorem 3.1]) Let \(A\subset \mathbb {R}^3\) be bounded. There exists a constant \(C>0\), which depends only on A and \(\varOmega \), such that for all \(\mu ,\ \nu \in \mathscr {P}(A)\),

$$\begin{aligned} \Vert \nabla \mathscr {B}[\mu ]-\nabla \mathscr {B}[\nu ]\Vert _{L^2(\varOmega ;\mathbb {R}^3)} \leqslant CW_2(\mu ,\nu )^{\frac{2}{15}}. \end{aligned}$$

Semi-discrete optimal transport (see, for example, [32, 44, Section 4], [48, Section 6.4.2]) refers to the special case where, for some \(N\in \mathbb {N}\), the target measure \(\nu \) belongs to the class

$$\begin{aligned} \mathscr {Q}^N\left( \mathbb {R}^3\right) :=\left\{ \nu =\sum _{i=1}^Nm_i\delta _{z_i}\ \Big \vert \ \mathbf {z}\in D,\ \mathbf {m}=(m_1,\ldots ,m_N)\in \mathbb {R}^N_{>}\ \text {and}\ \sum _{i=1}^Nm_i=1\ \right\} , \end{aligned}$$
(9)

where

$$\begin{aligned} D:=\left\{ \mathbf {z}=(z_1,\ldots ,z_N)\in \mathbb {R}^{3N}\ \big \vert \ z_i\ne z_j\text { whenever } i\ne j\right\} . \end{aligned}$$
(10)

Definition 2.3

We call a vector \(\mathbf {m}=(m_1,\ldots ,m_N)\in \mathbb {R}^N_>\) such that \(\sum _{i=1}^Nm_i=1\) a mass vector, and a vector \(\mathbf {z}\in D\) a seed vector. A measure \(\nu \in \mathscr {Q}^N\left( \mathbb {R}^3\right) \) given by

$$\begin{aligned} \nu =\sum _{i=1}^Nm_i\delta _{z_i} \end{aligned}$$

is said to have mass vector \(\mathbf {m}=(m_1,\ldots ,m_N)\) and seed vector \(\mathbf {z}=(z_1,\ldots ,z_N)\).

Let \(\nu \in \mathscr {Q}^N\left( \mathbb {R}^3\right) \) have mass vector \(\mathbf {m}=(m_1,\ldots ,m_N)\) and seed vector \(\mathbf {z}=(z_1,\ldots ,z_N)\). A map \(T:\varOmega \rightarrow \mathbb {R}^3\) satisfies the pushforward constraint

$$\begin{aligned} T_\#\left( \mathscr {L}^3\llcorner \varOmega \right) =\nu \end{aligned}$$

if and only if it has the form

$$\begin{aligned} T=\sum _{i=1}^Nz_i\mathbb {1}_{C_i}, \end{aligned}$$

where \(\{C_i\}_{i=1}^N\) is a tessellation of \(\varOmega \) by measurable sets \(C_i\) such that

$$\begin{aligned} \mathscr {L}^3(C_i)=m_i\qquad \forall \, i\in \{1,\ldots ,N\}. \end{aligned}$$

Hence the optimal transport problem between \(\mathscr {L}^3\llcorner \varOmega \) and \(\nu \in \mathscr {Q}^N(\mathbb {R}^3)\) is reduced to an optimal partitioning problem. Moreover, the unique optimal partition, and corresponding transport map, can be characterised using the notion of Laguerre tessellations.

Definition 2.4

(Laguerre tessellation) Given a seed vector \(\mathbf {z}=(z_1,\ldots ,z_N)\in \mathbb {R}^{3N}\) and a weight vector \(\mathbf {w}=(w_1,\ldots ,w_N)\in \mathbb {R}^N\), the Laguerre tessellation of \(\varOmega \) generated by the pair \((\mathbf {z},\mathbf {w})\) is defined to be the family

$$\begin{aligned} \{C_i(\mathbf {z},\mathbf {w})\}_{i=1}^N, \end{aligned}$$

where \(C_i(\mathbf {z},\mathbf {w})\) are Laguerre cells defined by

$$\begin{aligned} C_i(\mathbf {z},\mathbf {w})=\left\{ x\in \varOmega \ :\ |x-z_i|^2-w_i\leqslant |x-z_j|^2-w_j\quad \forall \, j\in \{1,\ldots ,N\}\right\} . \end{aligned}$$
(11)

Note that any Laguerre cell is convex since it is the intersection of finitely many half-spaces with the convex set \(\varOmega \). In particular, Laguerre cells that do not intersect \(\partial \varOmega \) are polyhedra. Moreover, for any seed vector \(\mathbf {z}\), weight vector \(\mathbf {w}\) and indices \(i\ne j\), the intersection \(C_i(\mathbf {z},\mathbf {w})\cap C_j(\mathbf {z},\mathbf {w})\) is contained in the 2-dimensional plane

$$\begin{aligned} \left\{ x\in \mathbb {R}^3\ :\ |x-z_i|^2-w_i=|x-z_j|^2-w_j\right\} . \end{aligned}$$
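
Definition (11) translates directly into a membership test: a point x belongs to the cell whose index minimises \(|x-z_i|^2-w_i\). The following Python sketch, with hypothetical function names and with ties broken in favour of the smallest index, illustrates this and shows how increasing a weight enlarges the corresponding cell. It does not check that x lies in \(\varOmega \).

```python
def laguerre_cell_index(x, seeds, weights):
    """Index i minimising |x - z_i|^2 - w_i, as in (11).

    Ties (points on a cell boundary) are assigned to the smallest index.
    """
    def power(i):
        return sum((a - b) ** 2 for a, b in zip(x, seeds[i])) - weights[i]
    return min(range(len(seeds)), key=power)

seeds = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
x = (0.4, 0.0, 0.0)

# With equal weights the cells are Voronoi cells, so x belongs to seed 0.
voronoi_owner = laguerre_cell_index(x, seeds, [0.0, 0.0])
# Raising w_2 to 0.5 moves the interface from x_1 = 0.5 to x_1 = 0.25.
weighted_owner = laguerre_cell_index(x, seeds, [0.0, 0.5])
```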

Using the Kantorovich Duality Theorem (see, for example, [48, Section 1.2]), one can show that the optimal transport cost between \(\mathscr {L}^3\llcorner \varOmega \) and \(\nu \) is the supremum over all weight vectors \(\mathbf {w}=(w_1,\ldots ,w_N)\in \mathbb {R}^N\) of the Kantorovich functional \(g:\mathbb {R}^N\rightarrow \mathbb {R}\) defined by

$$\begin{aligned} g(\mathbf {w})=\sum _{i=1}^N\underset{C_i(\mathbf {z},\mathbf {w})}{\int }|x-z_i|^2\,\mathrm {d}x+\sum _{i=1}^N\big (m_i-\mathscr {L}^3\left( C_i(\mathbf {z},\mathbf {w})\right) \big ) w_i. \end{aligned}$$
(12)

The Kantorovich functional g is concave, and maximisers \(\mathbf {w}\in \mathbb {R}^N\) of g exist and satisfy

$$\begin{aligned} \mathscr {L}^3(C_i(\mathbf {z},\mathbf {w}))=m_i\qquad \forall \, i\in \{1,\ldots ,N\}. \end{aligned}$$
(13)

(See, for example, [44, Theorem 40].) We call such \(\mathbf {w}\) optimal weight vectors. Note that \(w_i=\psi (z_i)\) where \(\psi \) is an optimal Kantorovich potential. Given an optimal weight vector \(\mathbf {w}\in \mathbb {R}^N\), the unique optimal transport map from \(\mathscr {L}^3\llcorner \varOmega \) to \(\nu \) is given by

$$\begin{aligned} T=\sum _{i=1}^Nz_i\mathbb {1}_{C_i(\mathbf {z},\mathbf {w})}. \end{aligned}$$

In particular, if we define the function \(\varPhi :\varOmega \rightarrow \mathbb {R}\) by

$$\begin{aligned} \varPhi (x)=\frac{1}{2}|x|^2-\frac{1}{2}\underset{i}{\min }\{|x-z_i|^2-w_i\}, \end{aligned}$$
(14)

then \(T=\nabla \varPhi \) and, using the notation introduced in Definition 2.1,

$$\begin{aligned} \mathscr {B}[\nu ]=\varPhi -\int _{\varOmega }\varPhi (x)\,\mathrm {d}x. \end{aligned}$$
(15)

This follows from the Gangbo-McCann Theorem (see, for example, [48, Theorem 1.17]).
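
The identity \(T=\nabla \varPhi \) can also be checked by hand. As a short verification, fix i and take x in the interior of \(C_i(\mathbf {z},\mathbf {w})\), where by (11) the minimum in (14) is attained at the index i. Expanding the square gives

$$\begin{aligned} \varPhi (x)=\frac{1}{2}|x|^2-\frac{1}{2}\left( |x-z_i|^2-w_i\right) =x\cdot z_i-\frac{1}{2}|z_i|^2+\frac{1}{2}w_i, \qquad \text {so}\qquad \nabla \varPhi (x)=z_i=T(x). \end{aligned}$$

The same expansion shows that \(\varPhi (x)=\max _i\{x\cdot z_i-\frac{1}{2}|z_i|^2+\frac{1}{2}w_i\}\) is a maximum of finitely many affine functions, and hence convex.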

The Monge problem is a non-convex optimisation problem. As outlined above, in the case of semi-discrete optimal transport this can be replaced by an unconstrained, finite dimensional optimisation problem (maximising g), which is numerically tractable. As we demonstrate in Sect. 7, this is one motivation for using semi-discrete optimal transport to construct solutions of SG in geostrophic coordinates.
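
To indicate how the maximisation of g might look in practice, the Python sketch below runs fixed-step gradient ascent on the weights. It relies on the standard expression \(\partial g/\partial w_i=m_i-\mathscr {L}^3(C_i(\mathbf {z},\mathbf {w}))\) for the partial derivatives of g, which is consistent with (13) but is not stated above. Everything else is an assumption made for illustration: the toy domain is the interval (0, 1) rather than a subset of \(\mathbb {R}^3\), cell masses are approximated by counting quadrature points, and the plain gradient ascent is a stand-in rather than the method used in Sect. 7.

```python
def power_distance(x, z, w):
    """|x - z|^2 - w for points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, z)) - w

def cell_masses(points, seeds, weights):
    """Fraction of quadrature points falling in each Laguerre cell."""
    counts = [0] * len(seeds)
    for x in points:
        i = min(range(len(seeds)),
                key=lambda k: power_distance(x, seeds[k], weights[k]))
        counts[i] += 1
    return [c / len(points) for c in counts]

def optimal_weights(points, seeds, masses, step=0.3, iterations=100):
    """Fixed-step gradient ascent on g, normalised so the last weight is 0."""
    w = [0.0] * len(seeds)
    for _ in range(iterations):
        vol = cell_masses(points, seeds, w)
        # Ascent direction: m_i minus the current (approximate) cell mass.
        w = [wi + step * (mi - vi) for wi, mi, vi in zip(w, masses, vol)]
    return [wi - w[-1] for wi in w]

# Toy 1D analogue (an assumption): Omega = (0, 1) sampled at 1000 midpoints.
points = [((k + 0.5) / 1000,) for k in range(1000)]
seeds = [(0.2,), (0.8,)]
masses = [0.25, 0.75]
weights = optimal_weights(points, seeds, masses)
```

In this toy problem the cell interface sits at \(0.5+(w_1-w_2)/1.2\), so the exact optimal weights satisfy \(w_1-w_2=-0.3\); the computed `weights` agree with this up to the quadrature resolution.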

2.1 Optimal weight map

Optimal weight vectors are not unique. Indeed, let \(\nu \in \mathscr {Q}^N(\mathbb {R}^3)\) have mass vector \(\mathbf {m}\) and seed vector \(\mathbf {z}\), and let \(\mathbf {e}=(1,\ldots ,1)\in \mathbb {R}^N\). Using (11) it is easy to see that for any \(\lambda \in \mathbb {R}\),

$$\begin{aligned} C_i(\mathbf {z},\mathbf {w})=C_i(\mathbf {z},\mathbf {w}+\lambda \mathbf {e})\qquad \forall \, i\in \{1,\ldots ,N\}. \end{aligned}$$

Hence, by the characterisation of optimal weight vectors (13), \(\mathbf {w}\in \mathbb {R}^N\) is optimal if and only if every vector in \(\mathbf {w}+\mathrm {span}\{\mathbf {e}\}\) is optimal. Conversely, if \(\mathbf {w},\,\widetilde{\mathbf {w}}\in \mathbb {R}^N\) and \(\mathbf {w}-\widetilde{\mathbf {w}}\notin \mathrm {span}\{\mathbf {e}\}\), then the pairs \((\mathbf {z},\mathbf {w})\) and \((\mathbf {z},\widetilde{\mathbf {w}})\) define distinct Laguerre tessellations, so at least one of \(\mathbf {w}\) and \(\widetilde{\mathbf {w}}\) is not optimal. In particular, it is easy to deduce that there is a unique optimal weight vector whose \(N^{\text {th}}\) component is zero. This leads to the following definition, where \(e_i\) denotes the \(i^{\text {th}}\) canonical basis vector in \(\mathbb {R}^N\).

Definition 2.5

Given a fixed mass vector \(\mathbf {m}\in \mathbb {R}^N\), we define the optimal weight map \(\mathbf {w}_*:D\rightarrow \mathbb {R}^N\) to be that which sends each seed vector \(\mathbf {z}\in D\) to the unique optimal weight vector \(\mathbf {w}_*(\mathbf {z})\in \mathrm {span}\{e_1,\ldots ,e_{N-1}\}\). We refer to the family \(\{C_i(\mathbf {z},\mathbf {w}_*(\mathbf {z}))\}_{i=1}^N\) as the optimal Laguerre tessellation of \(\varOmega \) generated by \(\mathbf {z}\in D\).

Note that there are many possible definitions of the optimal weight map, which all yield the same definition of an optimal Laguerre tessellation. For instance, a natural choice of range space would be \(\mathrm {span}\{\mathbf {e}\}^{\perp }\). We choose the range space \(\mathrm {span}\{e_1,\ldots ,e_{N-1}\}\) so that subsequent arguments concerning the regularity of \(\mathbf {w}_*\) can be carried out using only the canonical basis of Euclidean space.

3 Statement of existence theorems

After stating some preliminary definitions, we state the two existence results (Theorems 3.5 and 3.6) for which we give novel proofs. See Definition 2.1 and Eq. (9) for the definitions of the operator \(\mathscr {B}\) and the space \(\mathscr {Q}^N(\mathbb {R}^3)\), respectively.

Definition 3.1

(Geostrophic energy) The geostrophic energy functional \(E:\mathscr {P}_c(\mathbb {R}^3)\rightarrow \mathbb {R}\) is defined by

$$\begin{aligned} E[\nu ]:=\int _{\varOmega }\left( \frac{1}{2}\left( (\partial _1\mathscr {B}[\nu ](x)-x_1)^2+(\partial _2\mathscr {B}[\nu ](x)-x_2)^2\right) -x_3\partial _3\mathscr {B}[\nu ](x)\right) \,\mathrm {d}x. \end{aligned}$$

Note that the geostrophic energy is traditionally written in terms of the geostrophic velocity field \(u_{g}=(u_{g, 1}, u_{g, 2}, 0)^{T}\) and the potential temperature \(\theta \) as

$$\begin{aligned} E(u_g,\theta )=\int _{\varOmega }\left( \frac{1}{2}\left( u^2_{g,1}+u^2_{g,2}\right) -x_3\theta \right) \,\mathrm {d}x. \end{aligned}$$

Remark 3.2

The geostrophic energy functional is continuous on \(\mathscr {P}(K)\) for any compact set \(K\subset \mathbb {R}^3\). To see this first note that, for any \(\nu \in \mathscr {P}_c(\mathbb {R}^3)\),

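$$\begin{aligned} E[\nu ]=\frac{1}{2}W_2^2\left( \mathscr {L}^3\llcorner \varOmega ,\nu \right) -\frac{1}{2}\Vert \partial _3\mathscr {B}[\nu ]\Vert ^2_{L^2(\varOmega )}-\frac{1}{2}\int _{\varOmega }x_3^2\,\mathrm {d}x. \end{aligned}$$
(16)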

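Suppose that \((\nu ^N)_{N\in \mathbb {N}}\subset \mathscr {P}(K)\) is a sequence which converges to a measure \(\nu \in \mathscr {P}(K)\) as \(N\rightarrow \infty \). Since \(W_2\) is a metric on \(\mathscr {P}(K)\),

$$\begin{aligned} \lim _{N\rightarrow \infty }W_2\left( \mathscr {L}^3\llcorner \varOmega ,\nu ^N\right) =W_2\left( \mathscr {L}^3\llcorner \varOmega ,\nu \right) \end{aligned}$$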

and, by continuity of \(\mathscr {B}\) on \(\mathscr {P}(K)\) (Theorem 2.2),

$$\begin{aligned} \lim _{N\rightarrow \infty }\Vert \partial _3\mathscr {B}[\nu ^N]\Vert _{L^2(\varOmega )} =\Vert \partial _3\mathscr {B}[\nu ]\Vert _{L^2(\varOmega )}. \end{aligned}$$

Hence

$$\begin{aligned} \lim _{N\rightarrow \infty }E[\nu ^N]=E[\nu ]. \end{aligned}$$
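
The decomposition of the energy behind this argument can be verified pointwise. As a brief check, write \(\mathscr {B}=\mathscr {B}[\nu ]\); completing the square in the third component gives, for almost every \(x\in \varOmega \),

$$\begin{aligned} \frac{1}{2}\left( (\partial _1\mathscr {B}-x_1)^2+(\partial _2\mathscr {B}-x_2)^2\right) -x_3\partial _3\mathscr {B}=\frac{1}{2}|\nabla \mathscr {B}-x|^2-\frac{1}{2}(\partial _3\mathscr {B})^2-\frac{1}{2}x_3^2. \end{aligned}$$

Integrating over \(\varOmega \), the first term on the right becomes half the squared Wasserstein 2-distance between the Lebesgue measure restricted to \(\varOmega \) and \(\nu \), because \(\nabla \mathscr {B}[\nu ]\) is the optimal transport map between them. The second term becomes \(-\frac{1}{2}\Vert \partial _3\mathscr {B}[\nu ]\Vert ^2_{L^2(\varOmega )}\), and the third is a constant independent of \(\nu \).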

Definition 3.3

(Geostrophic solution) Let \(T\in (0,\infty )\) and let \(\overline{\alpha }\in \mathscr {P}_c(\mathbb {R}^3)\). We say that \(\alpha \in C([0,T];\mathscr {P}_c(\mathbb {R}^3))\) is a weak solution of the 3D incompressible semi-geostrophic equations in geostrophic coordinates on [0, T] with initial measure \(\overline{\alpha }\) if

$$\begin{aligned}&\int _0^{T}\int _{\mathbb {R}^3}\left( \partial _t \varphi (z,t)+Jz\cdot \nabla \varphi (z,t)\right) \,\mathrm {d}\alpha _t(z)\,\mathrm {d}t-\int _0^{T}\int _{\varOmega }J x\cdot \nabla \varphi \big (\nabla \mathscr {B}[\alpha _t](x),t\big )\,\mathrm {d}x\,\mathrm {d}t \nonumber \\&\quad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad =\int _{\mathbb {R}^3}\varphi (z,T)\,\mathrm {d}\alpha _{T}(z)-\int _{\mathbb {R}^3}\varphi (z,0)\,\mathrm {d}\overline{\alpha }(z), \end{aligned}$$
(17)

for all \(\varphi \in \mathcal {D}(\mathbb {R}^3\times \mathbb {R})\), where the matrix J is defined by (2). In what follows, we will refer to such a map \(\alpha \) as a geostrophic solution. In particular, if there exists \(N\in \mathbb {N}\), a (k-times continuously differentiable) map \(\mathbf {z}=(z_1,\ldots ,z_N):[0,T]\rightarrow \mathbb {R}^{3N}\) and a mass vector \(\mathbf {m}\in \mathbb {R}^N\) such that

$$\begin{aligned} \alpha _t=\sum _{i=1}^Nm_i\delta _{z_i(t)}\in \mathscr {Q}^N(\mathbb {R}^3) \end{aligned}$$
(18)

for all \(t\in [0,T]\), we will refer to \(\alpha \) as a (k-times continuously differentiable) discrete geostrophic solution. If the corresponding geostrophic energy \(E[\alpha _t]\) is constant in time, then we say that \(\alpha \) is energy-conserving.

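The definition of a geostrophic solution of SG coincides with Loeper’s definition of a weak measure solution [40, Definition 2.2]. Moreover, if \(\alpha \) is a geostrophic solution and \(\alpha _t\ll \mathscr {L}^3\) for all \(t\in [0,T]\) then, since

$$\begin{aligned} \left( \nabla \mathscr {B}[\alpha _t]^*\right) _\#\alpha _t=\mathscr {L}^3\llcorner \varOmega , \end{aligned}$$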

by a change of variables,

$$\begin{aligned} \int _{\varOmega }Jx\cdot \nabla \varphi \big (\nabla \mathscr {B}[\alpha _t](x),t\big )\,\mathrm {d}x=\int _{\mathbb {R}^3}J\nabla \mathscr {B}[\alpha _t]^*(z)\cdot \nabla \varphi (z,t)\, \mathrm {d}\alpha _t(z)\qquad \forall \, \varphi \in \mathcal {D}(\mathbb {R}^3\times \mathbb {R}) \end{aligned}$$

for all \(t\in [0,T]\). Hence \(\alpha \) is a distributional solution of the active transport equation

$$\begin{aligned} \partial _t\alpha +J\big (\text {id}_{\mathbb {R}^3}-\nabla \mathscr {B}[\alpha ]^*\big )\cdot \nabla \alpha =0. \end{aligned}$$
(19)

This is the setting considered in [4].

Definition 3.4

(Well-preparedness) Let \(N\in \mathbb {N}\). We say that a discrete probability measure \(\beta \in \mathscr {Q}^N(\mathbb {R}^3)\) given by

$$\begin{aligned} \beta =\sum _{i=1}^Nm_i\delta _{z_i} \end{aligned}$$

is well-prepared for SG in geostrophic coordinates if there exists \(r>0\) such that

$$\begin{aligned} |(z_i-z_j)\cdot e_3|\geqslant r\quad \forall \, i\ne j. \end{aligned}$$
(20)

In other words, \(\beta \) is well-prepared if the seeds \(z_i\) lie in distinct horizontal planes.
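Condition (20) is straightforward to test numerically. The following is a minimal Python sketch, assuming the seeds are stored as rows of an array with the third column holding the vertical coordinate \(z_i\cdot e_3\) and that \(N\geqslant 2\). The function names are illustrative and do not come from the paper.

```python
import numpy as np

def well_prepared_gap(seeds):
    """Return the smallest vertical gap min_{i != j} |(z_i - z_j) . e_3| (needs N >= 2)."""
    heights = np.sort(np.asarray(seeds, dtype=float)[:, 2])
    # After sorting, the minimum pairwise gap is attained by adjacent heights.
    return float(np.min(np.diff(heights)))

def is_well_prepared(seeds):
    """Condition (20) holds for some r > 0 exactly when the smallest gap is positive."""
    return well_prepared_gap(seeds) > 0.0

seeds_ok = [(0.0, 0.0, 0.1), (1.0, 2.0, 0.4), (-1.0, 0.5, 0.9)]
seeds_bad = [(0.0, 0.0, 0.1), (1.0, 2.0, 0.1)]  # two seeds share a horizontal plane
```

Here `well_prepared_gap` returns the largest admissible value of \(r\) in (20), so it can also be used to quantify how far a configuration is from collision.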

Well-preparedness of the initial data ensures that the seeds do not collide and therefore do not enter the set on which the right-hand side of the ODE in (4) is discontinuous. This allows us to prove the following theorem.

Theorem 3.5

(Existence of discrete geostrophic solutions, cf. [22, Proposition 4.14]) Let \(\varOmega \subset \mathbb {R}^3\) be open, bounded and convex, and fix \(N\in \mathbb {N}\), \(N\ge 2\). For any \(T\in (0,\infty )\) and any well-prepared discrete probability measure \(\overline{\alpha }\in \mathscr {Q}^N(\mathbb {R}^3)\), there exists a unique twice continuously differentiable discrete geostrophic solution \(\alpha \in C([0,T];\mathscr {P}_c(\mathbb {R}^3))\) with initial measure \(\overline{\alpha }\). Moreover, this solution is energy-conserving.

In Example 6.1, we show that Theorem 3.5 extends easily to the case \(N=1\) by deriving an explicit expression for the solution.

The uniqueness and regularity of solutions attained in Theorem 3.5 are significant in the context of numerical analysis since these factors determine the convergence of the corresponding numerical method. As mentioned in the introduction, the result of Theorem 3.5 is analogous to [22, Proposition 4.14], but with the following differences. In [22, Proposition 4.14], the physical domain \(\varOmega \) need not be convex, and the initial measure can be any convex combination of Dirac masses. However, with these slightly weaker hypotheses, the seed trajectories \(\mathbf {z}\) are only known to be Lipschitz in time and are not known to be unique.

Theorem 3.6

(Existence of geostrophic solutions) Let \(\varOmega \subset \mathbb {R}^3\) be open, bounded and convex. For any \(T\in (0,\infty )\) and any \(\overline{\alpha }\in \mathscr {P}_c(\mathbb {R}^3)\), there exists an energy-conserving geostrophic solution \(\alpha \in C^{0,1}([0,T];\mathscr {P}_c(\mathbb {R}^3))\) with initial measure \(\overline{\alpha }\) and a sequence \((\alpha ^N)_{N\in \mathbb {N}}\) of twice continuously differentiable discrete geostrophic solutions which converges uniformly in \(C([0,T];\mathscr {P}_c(\mathbb {R}^3))\) to \(\alpha \):

$$\begin{aligned} \lim _{N\rightarrow \infty }\sup _{t\in [0,T]}W_2(\alpha ^N_t,\alpha _t)=0. \end{aligned}$$

The existence of geostrophic solutions with arbitrary compactly supported initial measure was first proved by Loeper [40, Theorem 2.3] and later by Feldman and Tudorascu in the context of weak Lagrangian solutions (see [23, Theorem 3.2]). Being uniform in time, as opposed to pointwise in time, the convergence obtained in Theorem 3.6 is stronger than that obtained in these previous works. Since we work only with compactly supported measures, the spatial convergence obtained in Theorem 3.6 is equivalent to that obtained in [23]. As in [23], the limit point obtained in Theorem 3.6 is Lipschitz in time. Note that the corresponding theorems in [23, 40] do not include the hypothesis that \(\varOmega \) is convex. We include this hypothesis for technical reasons relating to the regularity of the optimal centroid map (see Definition 4.1 and Remark 4.12).

Remark 3.7

(Conservation of transport cost) From the proof of Lemma 4.14 it is easy to deduce that each discrete solution \(\alpha ^N\) conserves not only the geostrophic energy but also the transport cost. Therefore any solution \(\alpha \) constructed as the uniform limit of discrete solutions also conserves the transport cost by continuity of the Wasserstein distance. This is essentially due to the Hamiltonian structure of Eq. (1) (see [3, Example 8.1(c)]) and can be seen as a special case of [3, Theorem 5.2]. Conservation of the transport cost for weak Lagrangian solutions is proved in [22, Corollary 5.2]. For clarity, we include an independent proof based on our semi-discrete solution procedure.

4 Proof of Theorem 3.5

Fix \(N\in \mathbb {N}, N\geqslant 2\). (The case \(N=1\) is discussed in Example 6.1.) To construct a twice continuously differentiable discrete geostrophic solution with well-prepared initial measure \(\overline{\alpha }\in \mathscr {Q}^N(\mathbb {R}^3)\), we substitute the expression (18) into the transport equation (17) and, by appropriate choice of test functions, derive an ODE-IVP for the paths \(\mathbf {z}=(z_1,\ldots ,z_N)\). We then prove in Proposition 4.4 that this ODE-IVP has a unique \(C^2\)-solution. Conversely, we show that any \(C^1\)-solution of the ODE-IVP gives rise to an energy-conserving discrete geostrophic solution via the formula (18), from which Theorem 3.5 follows. The most involved part of the proof of Theorem 3.5 is the proof of the regularity of the optimal centroid map, which we now define.

Definition 4.1

Define the set

$$\begin{aligned} \widetilde{D}:=\left\{ (\mathbf {z},\mathbf {w})\in \mathbb {R}^{3N}\times \mathbb {R}^N\ \big \vert \ \mathbf {z}\in D,\ \mathscr {L}^3(C_i(\mathbf {z},\mathbf {w}))>0\ \forall \,i\in \{1,\ldots ,N\}\right\} , \end{aligned}$$

where D is the set of seed vectors defined by (10). \(\widetilde{D}\) is the set of generators of Laguerre tessellations of \(\varOmega \) with no zero-mass cells. We define the centroid map \(\overline{\mathbf {x}}:\widetilde{D}\rightarrow \varOmega ^N\), \(\overline{\mathbf {x}}=\left( \overline{x}_1,\ldots ,\overline{x}_N\right) \) by

$$\begin{aligned} \overline{x}_i(\mathbf {z},\mathbf {w})=\frac{1}{\mathscr {L}^3\left( C_i(\mathbf {z},\mathbf {w})\right) }\underset{C_i(\mathbf {z},\mathbf {w})}{\int }x\,\mathrm {d}x,\qquad i\in \{1,\ldots ,N\}. \end{aligned}$$

Moreover, given a fixed mass vector \(\mathbf {m}\), we define the optimal centroid map \(\mathbf {x}=\left( x_1,\ldots ,x_N\right) :D\rightarrow \varOmega ^N\) by

$$\begin{aligned} \mathbf {x}(\mathbf {z}):=\overline{\mathbf {x}}(\mathbf {z},\mathbf {w}_*(\mathbf {z})), \end{aligned}$$

where \(\mathbf {w}_*\) is the optimal weight map (Definition 2.5).

We now characterise k-times continuously differentiable discrete geostrophic solutions in terms of solutions of an ODE-IVP involving the optimal centroid map.

Lemma 4.2

Let \(\overline{\alpha }\in \mathscr {Q}^N(\mathbb {R}^3)\) be given by

$$\begin{aligned} \overline{\alpha }=\sum _{i=1}^Nm_i\delta _{\overline{z}_i}. \end{aligned}$$
(21)

A map \(\alpha :[0,T]\rightarrow \mathscr {Q}^N(\mathbb {R}^3)\) given by

$$\begin{aligned} \alpha _t=\sum _{i=1}^Nm_i\delta _{z_i(t)}\quad \forall \,t\in [0,T] \end{aligned}$$
(22)

is a k-times continuously differentiable discrete geostrophic solution with initial measure \(\overline{\alpha }\), for \(k\in \mathbb {N}\), if and only if the map \(\mathbf {z}=(z_1,\ldots ,z_N):[0,T]\rightarrow \mathbb {R}^{3N}\) is a k-times continuously differentiable solution of the ODE-IVP

$$\begin{aligned} {\left\{ \begin{array}{ll} \ \dot{\mathbf {z}}=W(\mathbf {z}),\\ \ \mathbf {z}(0)=\overline{\mathbf {z}}, \end{array}\right. } \end{aligned}$$
(23)

where \(\overline{\mathbf {z}}=(\overline{z}_1,\ldots ,\overline{z}_N)\),

$$\begin{aligned} W(\mathbf {z}):=J_N(\mathbf {z}-\mathbf {x}(\mathbf {z})), \end{aligned}$$
(24)

and \(J_N\in \mathbb {R}^{3N\times 3N}\) is the block diagonal matrix

$$\begin{aligned} J_N:=\mathrm {diag}(J,\ldots ,J). \end{aligned}$$

Proof

Suppose that \(\mathbf {z}=(z_1,\ldots ,z_N):[0,T]\rightarrow \mathbb {R}^{3N}\) is a k-times continuously differentiable solution of (23) and let \(\alpha \) be given by (22). Since \(\mathbf {z}\) is continuous, \(\alpha \in C([0,T];\mathscr {P}_c(\mathbb {R}^3))\). We must check that \(\alpha \) satisfies (17). Letting \(\varphi \in \mathcal {D}(\mathbb {R}^3\times \mathbb {R})\) be arbitrary, we have

$$\begin{aligned}&\int _0^{T}\int _{\mathbb {R}^3}\left( \partial _t \varphi (z,t)+Jz\cdot \nabla \varphi (z,t)\right) \,\mathrm {d}\alpha _t(z)\,\mathrm {d}t-\int _0^{T}\int _{\varOmega }J x\cdot \nabla \varphi \big (\nabla \mathscr {B}[\alpha _t](x),t\big )\,\mathrm {d}x\,\mathrm {d}t\\&\quad =\sum ^N_{i=1}m_i\int _0^{T}\left( \partial _t\varphi (z_i(t),t)+J\big (z_i(t)-{x}_i(\mathbf {z}(t))\big )\cdot \nabla \varphi (z_i(t),t)\right) \,\mathrm {d}t\\&\quad =\sum ^N_{i=1}m_i\int _0^{T}\left( \partial _t\varphi (z_i(t),t)+\dot{z}_i(t)\cdot \nabla \varphi (z_i(t),t)\right) \,\mathrm {d}t\\&\quad =\sum ^N_{i=1}m_i\int _0^{T}\dfrac{d {}}{d {t}}\varphi (z_i(t),t)\,\mathrm {d}t\\&\quad =\sum ^N_{i=1}m_i\varphi (z_i(T),T)-\sum ^N_{i=1}m_i\varphi (z_i(0),0)\\&\quad =\int _{\mathbb {R}^3}\varphi (z,T)\,\mathrm {d}\alpha _{T}(z)-\int _{\mathbb {R}^3}\varphi (z,0)\,\mathrm {d}\overline{\alpha }(z), \end{aligned}$$

as required.

Conversely, suppose that \(\alpha :[0,T]\rightarrow \mathscr {Q}^N(\mathbb {R}^3)\) given by (22) is a k-times continuously differentiable discrete geostrophic solution with initial measure \(\overline{\alpha }\), in the sense of Definition 3.3. Then substitution of (22) into (17) yields

$$\begin{aligned}&\sum ^N_{i=1}m_i\int _0^{T}\left( \partial _t\varphi (z_i(t),t)+J\big (z_i(t)-{x}_i(\mathbf {z}(t))\big )\cdot \nabla \varphi (z_i(t),t)\right) \,\mathrm {d}t \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad =\sum _{i=1}^Nm_i\Big (\varphi \big (z_i(T),T\big )-\varphi (\overline{z}_i,0)\Big ) \end{aligned}$$
(25)

for all \(\varphi \in \mathcal {D}(\mathbb {R}^3\times \mathbb {R})\). We now show that the paths \(z_i\) can be separated, and deduce that the ODE-IVP (23) is satisfied by \(\mathbf {z}=(z_1,\ldots ,z_N)\) by choosing test functions which isolate each path. Since \(\alpha _t\in \mathscr {Q}^N(\mathbb {R}^3)\) for each \(t\in [0,T]\), and \(\mathbf {z}\) is continuous on [0, T] by hypothesis, the map

$$\begin{aligned} t\mapsto d(t):=\min _{i\ne j}|z_i(t)-z_j(t)| \end{aligned}$$

is positive and continuous on the compact interval [0, T] and, therefore, attains a positive minimum on [0, T]. In other words, by hypothesis, there exists \(d_*>0\) such that

$$\begin{aligned} \min _{t\in [0,T]}\min _{i\ne j}|z_i(t)-z_j(t)|\geqslant d_*. \end{aligned}$$

Hence, for fixed \(t_0\in (0,T)\) and fixed \(i\in \{1,\ldots ,N\}\), there exists an open interval \(I\subset (0,T)\) containing \(t_0\) and an open set \(U_i\subset \mathbb {R}^3\) such that

$$\begin{aligned} z_i(I)\subset U_i\qquad \text {and}\qquad \bigcup _{j\ne i}z_j(I)\cap U_i=\emptyset . \end{aligned}$$

For a fixed coordinate index \(k\in \{1,2,3\}\), consider a test function \(\varphi _{i,k}\in \mathcal {D}(\mathbb {R}^3\times \mathbb {R})\) of the form \(\phi \psi _{i,k}\), where \(\phi \in C^{\infty }_c(I)\) and \(\psi _{i,k}\in C^{\infty }_c(\mathbb {R}^3)\) satisfies

$$\begin{aligned} \psi _{i,k}(z)=z\cdot e_k \quad \forall \,z \in U_i,\qquad \text {and}\qquad \bigcup _{j\ne i}z_j(I)\cap \mathrm {supp}(\psi _{i,k})=\emptyset . \end{aligned}$$

By the Chain Rule, for all \(t\in I\),

$$\begin{aligned} \partial _t\varphi _{i,k}(z_j(t),t)&={\left\{ \begin{array}{ll} \dfrac{d {}}{d {t}}\big (\phi (t)\psi _{i,k}(z_i(t))\big )-\phi (t)\dot{z}_i(t)\cdot e_k \quad &{}\text {if }j=i,\\ 0 &{}\text {if }j\ne i. \end{array}\right. } \end{aligned}$$

Hence, with this particular choice of test function, Eq. (25) becomes

$$\begin{aligned} \int _I m_i\dfrac{d {}}{d {t}}\big (\phi (t)\psi _{i,k}(z_i(t))\big )\,\mathrm {d}t +\int _I m_i\left( -\dot{z}_i(t)+J\left( z_i(t)-x_i(\mathbf {z}(t))\right) \right) \cdot e_k\phi (t)\,\mathrm {d}t=0. \end{aligned}$$

Since \(\phi \in C^{\infty }_c(I)\), the first integral is zero. By varying over all \(k\in \{1,2,3\}\) and all \(\phi \in C^{\infty }_c(I)\) we obtain the pointwise statement that

$$\begin{aligned} \dot{z}_i(t)=J\left( z_i(t)-x_i(\mathbf {z}(t))\right) \end{aligned}$$

for all \(t\in I\), in particular for \(t=t_0\). In a similar way, it can be shown that the initial condition \(z_i(0)=\overline{z}_i\) is satisfied. Repeating this for all \(t_0\in (0,T)\), and all \(i\in \{1,\ldots ,N\}\), we deduce that \(\mathbf {z}=(z_1,\ldots ,z_N)\) satisfies the ODE-IVP (23). \(\square \)
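To make the right-hand side of (23) concrete, the following minimal Python sketch assembles \(J_N\) and evaluates \(W(\mathbf {z})=J_N(\mathbf {z}-\mathbf {x}(\mathbf {z}))\). The function names are illustrative, and the centroid values are placeholders supplied by hand rather than computed from an optimal Laguerre tessellation.

```python
import numpy as np

J = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 0.0]])

def block_J(N):
    """J_N = diag(J, ..., J) as a (3N x 3N) matrix."""
    return np.kron(np.eye(N), J)

def velocity(z, centroids):
    """W(z) = J_N (z - x(z)) for seeds z and centroids x(z), both of shape (N, 3)."""
    z = np.asarray(z, dtype=float)
    x = np.asarray(centroids, dtype=float)
    N = z.shape[0]
    return (block_J(N) @ (z - x).reshape(3 * N)).reshape(N, 3)

z = np.array([[1.0, 0.0, 0.2], [0.0, 2.0, 0.7]])
x = np.array([[0.5, 0.5, 0.3], [0.5, 0.5, 0.6]])  # placeholder centroids (assumption)
w = velocity(z, x)
```

Because the third row of \(J\) is zero, the third component of every seed velocity vanishes, so the vertical coordinates of the seeds do not change along solutions of (23).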

Remark 4.3

The ODE-IVP (23) is precisely the discrete analogue of the active transport equation (1). Indeed, let \(\nu \in \mathscr {Q}^N(\mathbb {R}^3)\) have seed vector \(\mathbf {z}=(z_1,\ldots ,z_N)\) and consider the extension by \(+\infty \) of the convex function \(\mathscr {B}[\nu ]\) to \(\mathbb {R}^3\), which we will again denote by \(\mathscr {B}[\nu ]\). For each \(i\in \{1,\ldots ,N\}\), \(x_i(\mathbf {z})\) is the centroid of the subdifferential of \(\mathscr {B}[\nu ]^*\) evaluated at the point \(z_i\). To see this, first recall that for any convex function \(f:\varOmega \rightarrow \mathbb {R}\) and any constant \(c\in \mathbb {R}\)

$$\begin{aligned} \partial f^*(z)=\partial (f+c)^*(z)\qquad \forall \, z\in \mathbb {R}^3. \end{aligned}$$

Therefore, the characterisation of \(\mathscr {B}[\nu ]\) given by (14) and (15) implies that, for each \(i\in \{1,\ldots ,N\}\),

$$\begin{aligned} \partial (\mathscr {B}[\nu ])^*(z_i)=\partial \left( \varPhi -\int _{\varOmega }\varPhi \,\mathrm {d}x\right) ^*(z_i)=\partial \varPhi ^*(z_i), \end{aligned}$$

where, letting \(\mathbf {w}_*(\mathbf {z})=(w_1,\ldots ,w_N)\),

$$\begin{aligned} \varPhi (x)=\left\{ \begin{array}{ll} \frac{1}{2}|x|^2-\frac{1}{2}\underset{j}{\min }\left\{ |x-z_j|^2-w_j\right\} &{}\quad \text {if}\; x\in \varOmega ,\\ +\infty &{}\quad \text {if}\; x\notin \varOmega . \end{array}\right. \end{aligned}$$

Note that

$$\begin{aligned} \varPhi ^*(z_i)=\frac{1}{2}\left( |z_i|^2-w_i\right) \qquad \forall \, i\in \{1,\ldots ,N\}. \end{aligned}$$

By the characterisation of the subdifferential of \(\varPhi \) in terms of its Legendre-Fenchel transform (see, for example, [51, Proposition 2.4]), we have

$$\begin{aligned} \partial \varPhi ^*(z_i)&=\left\{ x\in \varOmega \ :\ \varPhi (x)+\varPhi ^*(z_i)=x\cdot z_i\right\} \\&=\left\{ x\in \varOmega \ :\ |x-z_i|^2-w_i=\min _{j}\left\{ |x-z_j|^2-w_j\right\} \right\} \\&=C_i(\mathbf {z},\mathbf {w}_*(\mathbf {z})). \end{aligned}$$

Hence

$$\begin{aligned} \partial \left( \mathscr {B}[\nu ]\right) ^*(z_i)=C_i(\mathbf {z},\mathbf {w}_*(\mathbf {z})), \end{aligned}$$

as required. This is consistent with the ‘geometric interpretation’ of a geostrophic solution described in [40, p.803].
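The description of the cells \(C_i(\mathbf {z},\mathbf {w})\) as the sets on which \(|x-z_i|^2-w_i\) is minimal can be sketched numerically. The following Python example is a rough illustration under stated assumptions: \(\varOmega \) is taken to be the unit cube, it is sampled on a regular grid of midpoints, and the weights are chosen by hand (they are not the optimal weights \(\mathbf {w}_*(\mathbf {z})\)). All function names are hypothetical.

```python
import numpy as np

def laguerre_labels(points, seeds, weights):
    """Index i of the cell C_i containing each point: argmin_j |x - z_j|^2 - w_j."""
    x = np.asarray(points, dtype=float)[:, None, :]   # shape (P, 1, 3)
    z = np.asarray(seeds, dtype=float)[None, :, :]    # shape (1, N, 3)
    w = np.asarray(weights, dtype=float)[None, :]     # shape (1, N)
    power = np.sum((x - z) ** 2, axis=2) - w
    return np.argmin(power, axis=1)

def cell_centroids(points, seeds, weights):
    """Approximate each cell centroid by averaging the sample points in that cell."""
    points = np.asarray(points, dtype=float)
    labels = laguerre_labels(points, seeds, weights)
    # An empty cell would produce NaN here; the example below has none.
    return np.array([points[labels == i].mean(axis=0) for i in range(len(seeds))])

# Midpoint grid on the unit cube (assumed domain), 10 points per axis.
g = (np.arange(10) + 0.5) / 10
grid = np.stack(np.meshgrid(g, g, g, indexing="ij"), axis=-1).reshape(-1, 3)

seeds = [(0.0, 0.0, 0.25), (0.0, 0.0, 0.75)]
weights = [0.0, 0.0]
centroids = cell_centroids(grid, seeds, weights)
```

With equal weights the two cells are separated by the plane \(x\cdot e_3=1/2\), so the approximate centroids lie at heights \(1/4\) and \(3/4\).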

Having characterised k-times continuously differentiable discrete geostrophic solutions in terms of solutions of the ODE-IVP (23), we now aim to show that solutions of (23) exist and that the corresponding geostrophic solutions are energy-conserving. After proving some preliminary results (Lemma 4.10 and Lemma 4.13), we prove the following proposition.

Proposition 4.4

For \(\overline{\mathbf {z}}=(\overline{z}_1,\ldots ,\overline{z}_N)\in \mathbb {R}^{3N}\) satisfying

$$\begin{aligned} \min _{i\ne j}\vert (\overline{z}_i-\overline{z}_j)\cdot e_3\vert \geqslant r \end{aligned}$$
(26)

for some \(r>0\), there exists a unique \(C^2\)-solution of (23) on the interval [0, T] for any \(T\in (0,\infty )\).

By Lemma 4.2, this gives rise to a unique discrete geostrophic solution with initial measure (21) via the formula (22). As we will see in Sect. 5, the condition (26) on the initial data is not restrictive for the purpose of constructing a geostrophic solution with arbitrary initial measure in \(\mathscr {P}_c(\mathbb {R}^3)\). We begin by explaining its relevance.

We show below that the map W, defined by (24), is continuously differentiable, and therefore locally Lipschitz on D. However, as we demonstrate in Remark 4.11, W does not, in general, admit continuous extension to \(\mathbb {R}^{3N}\). In order to apply the Picard-Lindelöf Existence Theorem to obtain a unique solution of (23) on a given time interval [0, T], it is therefore necessary to ensure that any solution trajectory is bounded away from the boundary of D. This is guaranteed a priori if \(\overline{\mathbf {z}}\) satisfies (26). Indeed, since the third row of the matrix J is zero, if \(\overline{\mathbf {z}}\in \mathbb {R}^{3N}\) satisfies (26) and \(\mathbf {z}\) is a solution of the corresponding ODE-IVP (23), then \(\mathbf {z}(t)\) satisfies (26) for all times t.

We now prove the claimed regularity of the map W in the case where the domain \(\varOmega \) is open, bounded and convex (see Remark 4.12 for a discussion of the case where \(\varOmega \) is non-convex). To do so, we use the following results regarding the regularity of the centroid map \(\overline{\mathbf {x}}\) (see Definition 4.1) and the volume map \(\mathbf {V}\) (see Definition 4.8), as well as the structure of the matrix of partial derivatives of \(\mathbf {V}\) with respect to \(\mathbf {w}\), which can be described using the notions of a Laplacian matrix and the dual graph of a Laguerre tessellation.

Definition 4.5

(Dual graph) Given a Laguerre tessellation \(\{C_i\}_{i=1}^N\) of \(\varOmega \), for each \(i\in \{1,\ldots ,N\}\) the set of neighbours of i is defined to be the set

$$\begin{aligned} N_i:=\left\{ j\in \{1,\ldots ,N\}\ \vert \ j\ne i,\ C_i\cap C_j\ne \emptyset \right\} . \end{aligned}$$

The (undirected) graph \((V,E)\) given by

$$\begin{aligned} V=\{1,\ldots ,N\},\qquad E=\big \{\{i,j\}\subset \{1,\ldots ,N\}:j\in N_i\big \}, \end{aligned}$$

is referred to as the dual graph of the Laguerre decomposition \(\{C_i\}_{i=1}^N\).

Definition 4.6

(Laplacian matrix [46]) Given a weighted graph \(G=(V,E,h)\), where \(V=\{1,\ldots ,N\}\) and \(h:E\rightarrow \mathbb {R}\), the adjacency matrix of G, \(A_G\in \mathbb {R}^{N\times N}\), is given by

$$\begin{aligned} (A_G)_{ij}={\left\{ \begin{array}{ll} h(\{i,j\})\quad &{}\text {if}\quad \{i,j\}\in E,\\ 0\quad &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

The degree matrix of G, \(D_G\in \mathbb {R}^{N\times N}\), is the diagonal matrix such that

$$\begin{aligned} (D_G)_{ii}=\sum \limits _{j=1}^N(A_G)_{ij}. \end{aligned}$$

The Laplacian matrix of G, \(L_G\in \mathbb {R}^{N\times N}\), is then given by

$$\begin{aligned} L_G=D_G-A_G. \end{aligned}$$
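Definition 4.6 translates directly into a few lines of code. The sketch below is illustrative only: it assumes vertices are labelled from 0 and that the edge weights are passed as a dictionary, neither of which is a convention taken from the paper.

```python
import numpy as np

def laplacian(n_vertices, weighted_edges):
    """L_G = D_G - A_G for edges given as {(i, j): h} with 0-based vertex labels."""
    A = np.zeros((n_vertices, n_vertices))
    for (i, j), h in weighted_edges.items():
        A[i, j] = h   # the graph is undirected, so the
        A[j, i] = h   # adjacency matrix is symmetric
    D = np.diag(A.sum(axis=1))
    return D - A

edges = {(0, 1): 2.0, (1, 2): 0.5}   # a path graph on three vertices
L = laplacian(3, edges)
```

Each row of the resulting matrix sums to zero, which is why the constant vector always lies in the kernel of a Laplacian matrix.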

Theorem 4.7

(cf. [15, Proposition 2 and Lemma 2], [32, Theorem 4.1], [44, Theorem 45]) Let \(f\in C(\varOmega )\cap W^{1,1}(\varOmega )\) and define \(F=(F_1,\ldots ,F_N):\widetilde{D}\rightarrow \mathbb {R}^{N}\) by

$$\begin{aligned} F_i(\mathbf {z},\mathbf {w}):=\int \limits _{C_i(\mathbf {z},\mathbf {w})}f(x)\,\mathrm {d}x. \end{aligned}$$

Then F is continuously differentiable. In particular, for \((\mathbf {z},\mathbf {w})\in \widetilde{D}\) let \(G=(V,E)\) denote the dual graph of the Laguerre tessellation \(\{C_i(\mathbf {z},\mathbf {w})\}_{i=1}^N\) and define \(h:E\rightarrow \mathbb {R}\) by

$$\begin{aligned} h(\{i,j\})=\frac{1}{2|z_j-z_i|}\int _{C_i\cap C_j}f(x)\,\mathrm {d}\mathscr {H}^2(x). \end{aligned}$$

Then the matrix \(D_{\mathbf {w}}F(\mathbf {z},\mathbf {w})\) of partial derivatives of F with respect to \(\mathbf {w}\) evaluated at \((\mathbf {z},\mathbf {w})\) is the Laplacian matrix of the weighted graph \(G=(V,E,h)\).

Note that the expressions for the partial derivatives of F with respect to the weights given above differ by a factor of 2 from those given in [15] due to our choice of quadratic cost function. Note also that the assumption that the seed locations are generic with respect to the cost, which is used in the proof of [44, Theorem 45], is not needed for the quadratic cost if the cells are non-empty.

Definition 4.8

We define the volume map \(\mathbf {V}:D\times \mathbb {R}^N\rightarrow \mathbb {R}^N,\ \mathbf {V}=(V_1,\ldots ,V_N),\) by

$$\begin{aligned} V_i(\mathbf {z},\mathbf {w})=\mathscr {L}^3\left( C_i(\mathbf {z},\mathbf {w})\right) . \end{aligned}$$

By combining Theorem 4.7 with the quotient rule for derivatives, we immediately obtain the following corollary.

Corollary 4.9

(cf. [15, Proposition 2], [32, Theorem 4.1], [8, Lemma 2.4]) The volume map \(\mathbf {V}\) and the centroid map \(\overline{\mathbf {x}}\) (see Definition 4.1) are continuously differentiable on \(\widetilde{D}\). In particular, for \((\mathbf {z},\mathbf {w})\in \widetilde{D}\), let \(G=(V,E)\) denote the dual graph of the Laguerre tessellation \(\{C_i(\mathbf {z},\mathbf {w})\}_{i=1}^N\) and define \(h:E\rightarrow \mathbb {R}\) by

$$\begin{aligned} h(\{i,j\})=\frac{\mathscr {H}^2(C_i\cap C_j)}{2|z_j-z_i|}, \end{aligned}$$

where \(\mathscr {H}^2\) denotes the 2-dimensional Hausdorff measure on \(\mathbb {R}^3\). Then the matrix \(D_{\mathbf {w}}\mathbf {V}(\mathbf {z},\mathbf {w})\) of partial derivatives of \(\mathbf {V}\) with respect to \(\mathbf {w}\) evaluated at \((\mathbf {z},\mathbf {w})\) is the Laplacian matrix of the weighted graph \(G=(V,E,h)\).

The optimal centroid map \(\mathbf {x}\) (see Definition 4.1) is the composition of the centroid map \(\overline{\mathbf {x}}\) with the optimal weight map \(\mathbf {w}_*\) (see Definition 2.5). To show that \(\mathbf {x}\) is continuously differentiable we now prove that \(\mathbf {w}_*\) is continuously differentiable. At the time of writing, we believe this result to be novel.

Lemma 4.10

Given a fixed mass vector \(\mathbf {m}\), the corresponding optimal weight map \(\mathbf {w}_*:D\rightarrow \mathbb {R}^N\) is continuously differentiable.

Proof

Let \(A\in \mathbb {R}^{N\times (N-1)}\) be the matrix

$$\begin{aligned} A=\left( \begin{array}{c} \mathbf {I}_{N-1}\\ \mathbf {0} \end{array}\right) , \end{aligned}$$

where \(\mathbf {I}_{N-1}\) denotes the \((N-1)\times (N-1)\) identity matrix. Define the residual map \(\mathbf {r}:D\times \mathbb {R}^{N-1}\rightarrow \mathbb {R}^{N-1}\) by

$$\begin{aligned} \mathbf {r}(\mathbf {z},\mathbf {w})=A^{\mathrm {T}} \left( \mathbf {V}(\mathbf {z},A \mathbf {w})-\mathbf {m} \right) . \end{aligned}$$

Observe that \(AA^{\mathrm {T}}\mathbf {w}_*=\mathbf {w}_*\) so for all \(\mathbf {z}\in D\)

$$\begin{aligned} \mathbf {r}(\mathbf {z},A^{\mathrm {T}}\mathbf {w}_*(\mathbf {z}))=0. \end{aligned}$$

Now fix an arbitrary seed vector \(\mathbf {z}\in D\). Since \(m_i>0\) for each \(i\in \{1,\ldots ,N\}\), \((\mathbf {z},\mathbf {w}_*(\mathbf {z}))\in \widetilde{D}\). By Corollary 4.9, \(\mathbf {V}\) is therefore continuously differentiable in a neighbourhood of \((\mathbf {z},\mathbf {w}_*(\mathbf {z}))\). The residual map \(\mathbf {r}\) is therefore continuously differentiable in a neighbourhood of \((\mathbf {z},A^{\mathrm {T}}\mathbf {w}_*(\mathbf {z}))\) and

$$\begin{aligned} D_{\mathbf {w}}\mathbf {r}(\mathbf {z},A^{\mathrm {T}}\mathbf {w}_*(\mathbf {z}))=A^{\mathrm {T}}D_{\mathbf {w}}V(\mathbf {z},\mathbf {w}_*(\mathbf {z}))A. \end{aligned}$$

The weighted Laplacian matrix \(D_{\mathbf {w}}V(\mathbf {z},\mathbf {w}_*(\mathbf {z}))\) has positive entries and corresponds to a connected graph. Hence it is symmetric, positive semi-definite and its kernel is \(\mathrm {span}\{\mathbf {e}\}\), where \(\mathbf {e}=(1,\ldots ,1)\in \mathbb {R}^N\) (see for example [46, Section 2.4]). The symmetric matrix \(L=D_{\mathbf {w}} \mathbf {r}(\mathbf {z},A^{\mathrm {T}}\mathbf {w}_*(\mathbf {z}))\) is therefore invertible. Indeed, if \(\mathbf {y} \in \mathbb {R}^{N-1}\),

$$\begin{aligned} \mathbf {y}^{\mathrm {T}} L \mathbf {y} = 0 \quad \Longleftrightarrow \quad \mathbf {y}^{\mathrm {T}} A^{\mathrm {T}} D_{\mathbf {w}} \mathbf {V}(\mathbf {z},\mathbf {w}_*(\mathbf {z})) A \mathbf {y} = 0 \quad \Longleftrightarrow \quad A \mathbf {y} \in \mathrm {span}\{\mathbf {e}\} \quad \Longleftrightarrow \quad \mathbf {y} = \mathbf {0}. \end{aligned}$$

By the Implicit Function Theorem (see, for example, [18, Theorem 10.2.1, p.270]) applied to \(\mathbf {r}\) at the point \((\mathbf {z},A^{\mathrm {T}}\mathbf {w}_*(\mathbf {z}))\), the function \(A^{\mathrm {T}}\mathbf {w}_*:D\rightarrow \mathbb {R}^{N-1}\) is therefore continuously differentiable in a neighbourhood of \(\mathbf {z}\). Since \(AA^{\mathrm {T}}\mathbf {w}_*=\mathbf {w}_*\), it follows that the optimal weight map \(\mathbf {w}_*\) is continuously differentiable in a neighbourhood of \(\mathbf {z}\), as required. \(\square \)
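The linear-algebra step in the proof, namely that deleting the last row and column of a connected graph Laplacian yields an invertible matrix, can be checked on a small example. The following Python sketch uses a hand-picked Laplacian of a path graph on three vertices as a stand-in for \(D_{\mathbf {w}}\mathbf {V}\). It is an assumption made for illustration and is not derived from an actual tessellation.

```python
import numpy as np

# Laplacian of a connected path graph on 3 vertices with positive edge weights.
L = np.array([[ 2.0, -2.0,  0.0],
              [-2.0,  2.5, -0.5],
              [ 0.0, -0.5,  0.5]])
N = L.shape[0]
A = np.vstack([np.eye(N - 1), np.zeros((1, N - 1))])  # A = (I_{N-1}; 0)

reduced = A.T @ L @ A            # equals the leading (N-1) x (N-1) block of L
kernel_check = L @ np.ones(N)    # the constant vector e lies in the kernel of L
eigs_full = np.linalg.eigvalsh(L)           # eigenvalues in ascending order
eigs_reduced = np.linalg.eigvalsh(reduced)  # all strictly positive here
```

The full matrix has a single zero eigenvalue, whereas the reduced matrix is positive definite. This is the property that allows the Implicit Function Theorem to be applied to the residual map \(\mathbf {r}\).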

Remark 4.11

(Counterexample 1) Combining Corollary 4.9 with Lemma 4.10 we see that the optimal centroid map \(\mathbf {x}\) is continuously differentiable on its domain D. However, it does not, in general, admit continuous extension to \(\mathbb {R}^{3N}\). We demonstrate this by means of a simple counterexample in the case \(N=2\). Let \(\varOmega \subset \mathbb {R}^3\) be the ball of volume 1 centred at the origin, fix the mass vector \(\mathbf {m}=(1/2,1/2)\), and consider the map \(\mathbf {z}=(z_1,z_2):(-1,1)\rightarrow \mathbb {R}^6\) given by

$$\begin{aligned} z_1(s)=(s,0,0),\qquad z_2(s)=-z_1(s)\qquad \forall \,s\in (-1,1). \end{aligned}$$

Letting \(x_*\) denote the centroid of the spherical cap

$$\begin{aligned} C:=\{x\in \varOmega \ :\ x_1\geqslant 0\}, \end{aligned}$$

it can easily be shown that the corresponding optimal centroids are given by

$$\begin{aligned} x_1(\mathbf {z}(s))=\left\{ \begin{array}{ll} x_*&{}\quad \text {if}\; s\in (0,1)\\ -x_*&{}\quad \text {if}\; s\in (-1,0) \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} x_2(\mathbf {z}(s))=-x_1(\mathbf {z}(s))\quad \forall \, s\in (-1,0)\cup (0,1). \end{aligned}$$

Since \(x_*\ne 0\), the map \(\mathbf {x}\circ \mathbf {z}=\left( x_1\circ \mathbf {z},x_2\circ \mathbf {z}\right) \) does not admit a continuous extension to the whole interval \((-1,1)\). Since \(\mathbf {z}\) is continuous on \((-1,1)\), this means that the optimal centroid map \(\mathbf {x}\) does not admit a continuous extension at the point \((0,0)\in \mathbb {R}^6\), precisely where the two seeds are the same.
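The size of the jump in this counterexample can be made explicit. The short Python sketch below relies on the standard fact that the centroid of a solid half-ball of radius \(\rho \) lies at distance \(3\rho /8\) from its flat face. The variable and function names are illustrative.

```python
import math

rho = (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)   # radius of the ball of volume 1
x_star = (3.0 * rho / 8.0, 0.0, 0.0)           # centroid of the half-ball {x_1 >= 0}

def optimal_centroid_1(s):
    """x_1(z(s)) for s != 0 in the two-seed example of Remark 4.11."""
    if s == 0.0:
        raise ValueError("the seeds coincide at s = 0, outside the domain D")
    sign = 1.0 if s > 0.0 else -1.0
    return tuple(sign * c for c in x_star)

# One-sided limits of the first coordinate differ by 2 * (3 rho / 8) = 3 rho / 4.
jump = optimal_centroid_1(1e-9)[0] - optimal_centroid_1(-1e-9)[0]
```

The jump is independent of how small \(|s|\) is, which is precisely the obstruction to a continuous extension at \(s=0\).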

Remark 4.12

(Counterexample 2) Corollary 4.9 does not hold in general if \(\varOmega \) is non-convex. Indeed, let \(\varOmega \subset \mathbb {R}^3\) be the domain \((0,1)^3\setminus [1/2,1)^3\), let \(\mathbf {w}=(0,0)\) and consider seed vectors given by the map \(\mathbf {z}=(z_1,z_2):(-1/2,1/2)\rightarrow \mathbb {R}^6\) defined by

$$\begin{aligned} z_1(s)=(0,0,s),\qquad z_2(s)=(0,0,1+s). \end{aligned}$$

For \(s\in (-1/2,1/2)\), the cell \(C_1(\mathbf {z}(s),\mathbf {w})\) is then given by

$$\begin{aligned} C_1(\mathbf {z}(s),\mathbf {w})=\left\{ x\in \varOmega \ :\ x\cdot e_3\le \frac{1}{2}+s\right\} , \end{aligned}$$

and has volume

$$\begin{aligned} V_1(\mathbf {z}(s),\mathbf {w})= \left\{ \begin{array}{ll} \frac{1}{2}+s&{}\quad \text {if}\; s\in (-\frac{1}{2},0]\\ \frac{1}{2}+\frac{3}{4}s&{}\quad \text {if}\; s\in (0,\frac{1}{2}). \end{array}\right. \end{aligned}$$

The map \(s\mapsto V_1(\mathbf {z}(s),\mathbf {w})\) is not differentiable at \(s=0\) which, since \(\mathbf {z}\) is differentiable, implies that \(V_1\) is not differentiable at the point \((\mathbf {z}(0),\mathbf {w})\).
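The kink in the volume can be observed numerically. The following Python sketch evaluates the piecewise formula above by splitting the cell at the height \(x\cdot e_3=1/2\), below which the cross-section of \(\varOmega \) has area 1 and above which it has area \(3/4\). The function name is hypothetical.

```python
def cell_volume(s):
    """V_1(z(s), w) for the notched cube (0,1)^3 minus [1/2,1)^3, with -1/2 < s < 1/2."""
    if not -0.5 < s < 0.5:
        raise ValueError("s must lie in (-1/2, 1/2)")
    height = 0.5 + s                        # the cell is {x in Omega : x_3 <= height}
    below = min(height, 0.5) * 1.0          # full unit cross-section for x_3 < 1/2
    above = max(height - 0.5, 0.0) * 0.75   # notched cross-section of area 3/4 above
    return below + above

h = 1e-6
left_slope = (cell_volume(0.0) - cell_volume(-h)) / h    # one-sided difference quotient
right_slope = (cell_volume(h) - cell_volume(0.0)) / h    # one-sided difference quotient
```

The two one-sided difference quotients approximate 1 and \(3/4\) respectively, so they do not agree at \(s=0\).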

If \(\varOmega \) is open, bounded and convex, we can use Corollary 4.9 to deduce that the optimal centroid map \(\mathbf {x}\) is continuously differentiable and, therefore, locally Lipschitz. While it is known that the volume map \(\mathbf {V}\) is Lipschitz in \(\mathbf {w}\) even when the domain \(\varOmega \) is non-convex (see, for example, [44, Proposition 41]), to prove that \(\mathbf {x}\) is locally Lipschitz for non-convex domains would require a finer analysis of the regularity of the centroid map \(\overline{\mathbf {x}}\) and the volume map \(\mathbf {V}\) with respect to \(\mathbf {z}\).

We now prove that for any finite time horizon T, solutions of the ODE-IVP (23) on the interval [0, T] are bounded and have bounded first derivatives. Our immediate application of these a priori estimates is to prove that such solutions exist (Proposition 4.4). As we show in Sect. 5, these estimates also yield the necessary compactness of the corresponding sequence of measure-valued maps to pass to the limit as \(N\rightarrow \infty \).

Lemma 4.13

(A priori estimates) If \(\mathbf {z}=(z_1,\ldots ,z_N)\) is a \(C^1\)-solution of (23) on the interval [0, T] then, for each \(i\in \{1,\ldots ,N\}\),

$$\begin{aligned} |z_i(t)|&\leqslant |\overline{z}_i|+ RT \quad&\forall \, t\in [0,T], \end{aligned}$$
(27)
$$\begin{aligned} |\dot{z}_i(t)|&\leqslant |\overline{z}_i|+ R(1+T)\quad&\forall \, t\in (0,T), \end{aligned}$$
(28)

where \(R=R(\varOmega )>0\) is such that \(\varOmega \subset B_{R}(0)\subset \mathbb {R}^3\).

Proof

Since the matrix J is skew symmetric and has operator norm 1, for any \(i\in \{1,\ldots ,N\}\) we have

$$\begin{aligned} \dfrac{d {}}{d {t}}|z_i|^2=2z_i\cdot \dot{z}_i=-2z_i\cdot Jx_i(\mathbf {z})\leqslant 2R|z_i| \end{aligned}$$

on [0, T]. The maximal solution of the ODE-IVP

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{d {y}}{d {t}}=2Ry^{1/2}\\ y(0)=|\overline{z}_i|^2 \end{array}\right. } \end{aligned}$$

on the interval [0, T] is given by

$$\begin{aligned} y(t)=(|\overline{z}_i|+Rt)^2. \end{aligned}$$

(Note that this is the unique solution unless \(\overline{z}_i=0\).) Applying a comparison lemma such as [49, Theorem 13.2] then establishes (27). Hence,

$$\begin{aligned} |\dot{z}_i|=| J(z_i-x_i(\mathbf {z}))| \leqslant |z_i|+ |x_i(\mathbf {z})| \leqslant |\overline{z}_i|+ RT +R \end{aligned}$$

on [0, T], which establishes (28). \(\square \)
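The estimates (27) and (28) can be sanity-checked in the simplest setting. The Python sketch below treats a single seed and assumes that \(\varOmega \) is the unit cube, so that the only cell is \(\varOmega \) itself and its centroid \(c\) is constant. The ODE then reduces to \(\dot{z}=J(z-c)\), whose solution is a rotation about the vertical axis through \(c\). The domain, the initial seed and the time horizon are illustrative assumptions, not data from the paper.

```python
import numpy as np

J = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
c = np.array([0.5, 0.5, 0.5])   # centroid of the unit cube (assumed domain)
R = np.sqrt(3.0)                # the unit cube is contained in the ball B_R(0)
z0 = np.array([2.0, 0.0, 0.3])  # illustrative initial seed
T = 5.0

def rotation(t):
    """exp(tJ): rotation by angle t about the vertical axis."""
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0,        0.0,       1.0]])

def z(t):
    """Solution of dz/dt = J (z - c) with z(0) = z0."""
    return c + rotation(t) @ (z0 - c)

times = np.linspace(0.0, T, 501)
positions = np.array([z(t) for t in times])
speeds = np.array([np.linalg.norm(J @ (p - c)) for p in positions])
bound_27 = np.linalg.norm(z0) + R * T          # right-hand side of (27)
bound_28 = np.linalg.norm(z0) + R * (1.0 + T)  # right-hand side of (28)
```

In this example both bounds hold with considerable room to spare, which is expected since (27) and (28) are worst-case estimates that ignore the rotational structure of the flow.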

Using the regularity results and the a priori estimates established above, we now prove Proposition 4.4.

Proof of Proposition 4.4

Combining Corollary 4.9 with Lemma 4.10 we see that the optimal centroid map \(\mathbf {x}\) is continuously differentiable on its domain D. The map W is, therefore, locally Lipschitz on D. Now, let \(\overline{\mathbf {z}}=(\overline{z}_1,\ldots ,\overline{z}_N)\in D\) satisfy (26), let \(\overline{M}>0\) be such that

$$\begin{aligned} \max _i\{|\overline{z}_i|\}<\overline{M} \end{aligned}$$

and let \(T\in (0,\infty )\). Then

$$\begin{aligned} \overline{B_{\frac{r}{2}}(\overline{\mathbf {z}})}\subset D \end{aligned}$$

and, for all \(\mathbf {z}\in B_{\frac{r}{2}}(\overline{\mathbf {z}})\),

$$\begin{aligned} |W(\mathbf {z})|&\le |\mathbf {z}|+|\mathbf {x}(\mathbf {z})| \le \frac{r}{2}+|\overline{\mathbf {z}}|+|\mathbf {x}(\mathbf {z})| \le \frac{r}{2}+N^{\frac{1}{2}}\left( \overline{M}+R\right) \\&\le \frac{r}{2}+N^{\frac{1}{2}}\left( \overline{M}+R(1+T)\right) =:M. \end{aligned}$$

Let

$$\begin{aligned} T_0=\min \left\{ T,\frac{r}{2M}\right\} . \end{aligned}$$

By the Picard-Lindelöf Existence Theorem (see, for example, [50, Theorem 2.2]) there exists a unique \(C^1\)-solution of (23) on the interval \([0,T_0]\).

Since the third row of the matrix J is zero, by (26)

$$\begin{aligned} \min _{i\ne j}\vert (z_i(T_0)-z_j(T_0))\cdot e_3\vert \geqslant r. \end{aligned}$$

Hence

$$\begin{aligned} \overline{B_{\frac{r}{2}}(\mathbf {z}(T_0))}\subset D. \end{aligned}$$

By the a priori estimate (27) proved in Lemma 4.13, for all \(\mathbf {z}\in B_{\frac{r}{2}}(\mathbf {z}(T_0))\),

$$\begin{aligned} |W(\mathbf {z})|&\le |\mathbf {z}|+|\mathbf {x}(\mathbf {z})| \le \frac{r}{2}+|\mathbf {z}(T_0)|+|\mathbf {x}(\mathbf {z})| \le \frac{r}{2}+N^{\frac{1}{2}}\left( \overline{M}+R(1+T_0)\right) \\&\le \frac{r}{2}+N^{\frac{1}{2}}\left( \overline{M}+R(1+T)\right) =M. \end{aligned}$$

Applying the Picard-Lindelöf Existence Theorem again we conclude that there exists a unique \(C^1\)-solution of (23) on the interval \([0,\min \{T,2T_0\}]\). Repeating this process, we obtain a unique \(C^1\)-solution \(\mathbf {z}\) of (23) on the interval [0, T]. Since the function W is \(C^1\) on \(\mathbf {z}([0,T])\), \(\mathbf {z}\) is in fact \(C^2\). \(\square \)
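The continuation argument terminates after finitely many steps because the step length \(T_0\) depends only on \(N\), \(r\), \(\overline{M}\), \(R\) and \(T\), and not on the current position of the seeds. The following Python sketch computes \(M\), \(T_0\) and the resulting number of steps for some illustrative parameter values. The function name and the numbers are assumptions chosen for demonstration.

```python
import math

def continuation_steps(N, r, M_bar, R, T):
    """Return (M, T0, number of Picard-Lindelof steps needed to cover [0, T])."""
    M = r / 2.0 + math.sqrt(N) * (M_bar + R * (1.0 + T))  # uniform bound on |W|
    T0 = min(T, r / (2.0 * M))                            # uniform step length
    return M, T0, math.ceil(T / T0)

M, T0, steps = continuation_steps(N=2, r=0.2, M_bar=1.0, R=1.0, T=3.0)
```

The step count grows as \(r\) decreases, reflecting the fact that the local existence interval shrinks when the seeds start closer to the boundary of D.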

To conclude this section we prove Theorem 3.5. We begin by proving that any continuously differentiable discrete geostrophic solution is energy-conserving.

Lemma 4.14

Any continuously differentiable discrete geostrophic solution is energy-conserving.

Proof

Suppose that \(\alpha \in C([0,T];\mathscr {P}_c(\mathbb {R}^3))\) is a continuously differentiable discrete geostrophic solution with initial measure

$$\begin{aligned} \overline{\alpha }=\sum _{i=1}^Nm_i\delta _{\overline{z}_i}\in \mathscr {Q}^N(\mathbb {R}^3). \end{aligned}$$

By Lemma 4.2, \(\alpha \) is then given by

$$\begin{aligned} \alpha _t=\sum _{i=1}^Nm_i\delta _{z_i(t)}\qquad \forall \, t\in [0,T], \end{aligned}$$

where, for each \(i\in \{1,\ldots ,N\}\),

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{z}_i=J(z_i-x_i(\mathbf {z})),\\ z_i(0)=\overline{z}_i. \end{array}\right. } \end{aligned}$$
(29)

Moreover, letting \(\widetilde{C}_i(t)=C_i(\mathbf {z}(t),\mathbf {w}_*(\mathbf {z}(t)))\) for \(t\in [0,T]\), we have

$$\begin{aligned} \nabla \mathscr {B}[\alpha _t]=\sum _{i=1}^Nz_i(t)\mathbb {1}_{\widetilde{C}_i(t)}. \end{aligned}$$

Then, by Eq. (16),

$$\begin{aligned} E[\alpha _t]=\frac{1}{2}\sum _{i=1}^N\int _{\widetilde{C}_i(t)}|x-z_i(t)|^2\,\mathrm {d}x-\frac{1}{2}\sum _{i=1}^Nm_i(z_i(t)\cdot e_3)^2-\frac{1}{2}\int _{\varOmega }x_3^2\,\mathrm {d}x. \end{aligned}$$
(30)

We now show that the function \(t\mapsto E[\alpha _t]\) is differentiable on the interval (0, T) and that its derivative is zero, from which it follows that \(\alpha \) is energy-conserving. Trivially, the third term of (30) is constant in time. By (29), since the third row of the matrix J is zero,

$$\begin{aligned} \dfrac{d {}}{d {t}}(z_i(t)\cdot e_3)=0\quad \forall \,i\in \{1,\ldots ,N\},\, t\in (0,T). \end{aligned}$$

Therefore the time derivative of the second term of (30) is also zero on (0, T). Now define the function \(\zeta :\varOmega \times [0,T]\rightarrow \mathbb {R}\) by

$$\begin{aligned} \zeta (x,t):=\frac{1}{2}\sum _{i=1}^N\mathbb {1}_{\widetilde{C}_i(t)}(x)|x-z_i(t)|^2, \end{aligned}$$

so that the first term of (30) is precisely \(\int _{\varOmega }\zeta (x,t)\,\mathrm {d}x\). Observe that for any fixed \(s\in (0,T)\), the function \(t\mapsto \zeta (x,t)\) is continuously differentiable at s for \(\mathscr {L}^3\)-almost every x in \(\varOmega \). Indeed, since the set \(\partial \widetilde{C}_i(s)\cap \partial \widetilde{C}_j(s)\) is contained in a 2-dimensional plane for each \(i\ne j\) (see Sect. 2),

$$\begin{aligned} \mathscr {L}^3\left( \varOmega \setminus \bigcup _{i=1}^N\mathrm {Int}\left( {\widetilde{C}}_i(s)\right) \right) =\mathscr {L}^3\left( \bigcup _{i\ne j}\left( \partial \widetilde{C}_i(s)\cap \partial \widetilde{C}_j(s)\right) \cup \partial \varOmega \right) =0. \end{aligned}$$

For x in the interior of the cell \(\widetilde{C}_i(s)\),

$$\begin{aligned} \partial _t\zeta (x,s)&=-(x-z_i(s))\cdot \dot{z}_i(s) =-(x-z_i(s))\cdot J(z_i(s)-x_i(\mathbf {z}(s))). \end{aligned}$$

By hypothesis, \(\mathbf {z}(s)\in D\), so continuity at s of the map \(t\mapsto \partial _t\zeta (x,t)\) follows from Corollary 4.9 and Lemma 4.10. By Lemma 4.13, \(\partial _t\zeta (\cdot ,s)\) is uniformly bounded on \(\varOmega \). Hence, by the Mean Value Theorem and the Dominated Convergence Theorem, we may exchange the order of differentiation and integration and, using (29) and the skew-symmetry of the matrix J, we obtain

$$\begin{aligned} \dfrac{d {}}{d {t}}\left( \int _{\varOmega }\zeta (x,t)\,\mathrm {d}x\right) \bigg \vert _{t=s}&=\int _{\varOmega }\partial _t\zeta (x,s)\,\mathrm {d}x =-\sum _{i=1}^N\int _{\widetilde{C}_i(s)}(x-z_i(s))\cdot \dot{z}_i(s)\,\mathrm {d}x\\&=-\sum _{i=1}^Nm_i\big (x_i(\mathbf {z}(s))-z_i(s)\big )\cdot J\big (z_i(s)-x_i(\mathbf {z}(s))\big )=0, \end{aligned}$$

which completes the proof. \(\square \)
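As a quick check of the final equality in the proof, the vanishing of each summand can be read off from the explicit form of J. For any \(v=(v_1,v_2,v_3)\in \mathbb {R}^3\),

$$\begin{aligned} v\cdot Jv=v_1(-v_2)+v_2v_1+v_3\cdot 0=0, \end{aligned}$$

and the \(i^{\text {th}}\) summand in the last display of the proof equals \(m_i\, v\cdot Jv\) with \(v=z_i(s)-x_i(\mathbf {z}(s))\).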

Proof of Theorem 3.5

Since \(\overline{\alpha }\) is well-prepared in the sense of Definition 3.4, by Proposition 4.4 the corresponding ODE-IVP (23) has a unique \(C^2\)-solution. By Lemma 4.2, this gives rise to a unique twice continuously differentiable discrete geostrophic solution with initial measure \(\overline{\alpha }\) via the formula (22). This solution is energy-conserving by Lemma 4.14. \(\square \)

5 Proof of Theorem 3.6

We now construct a geostrophic solution with arbitrary initial measure \(\overline{\alpha }\in \mathscr {P}_c(\mathbb {R}^3)\). We begin by proving the existence of a sequence \((\overline{\alpha }^N)_{N\in \mathbb {N}}\) of well-prepared discrete probability measures (see Definition 3.4) converging to \(\overline{\alpha }\) with respect to the Wasserstein 2-distance (Lemma 5.1). The existence of a sequence of discrete geostrophic solutions \(\alpha ^N\) with initial measures \(\overline{\alpha }^N\), respectively, is then guaranteed by Theorem 3.5. To obtain a geostrophic solution with initial measure \(\overline{\alpha }\), we then apply the Arzelà-Ascoli Compactness Theorem combined with the continuity of \(\mathscr {B}\) (Theorem 2.2) and pass to the limit as \(N\rightarrow \infty \).

Lemma 5.1

Let \(\beta \in \mathscr {P}_c(\mathbb {R}^3)\). There exists a compact set \(U\subset \mathbb {R}^3\), a sequence of discrete probability measures \(\beta ^N\in \mathscr {Q}^N(\mathbb {R}^3)\) given by

$$\begin{aligned} \beta ^N=\sum _{i=1}^Nm^N_i\delta _{z^N_i} \end{aligned}$$

and a sequence of positive real numbers \(r_N>0\) such that

$$\begin{aligned}&\mathrm {supp}(\beta ^N)\subset U\quad \forall \, N\in \mathbb {N},&\underset{N\rightarrow \infty }{\lim }W_2(\beta ^N,\beta )=0, \end{aligned}$$
(31)

and

$$\begin{aligned} |(z^N_i-z^N_j)\cdot e_3|\geqslant r_N \quad \forall \, i\ne j. \end{aligned}$$
(32)

Proof

Take a compact set \(U \subset \mathbb {R}^3\) such that \(\text {supp}(\beta )\subset U\) and a sequence of discrete probability measures

$$\begin{aligned} \tilde{\beta }^N=\sum _{i=1}^Nm^N_i\delta _{\tilde{z}^N_i} \in \mathscr {Q}^N(\mathbb {R}^3), \end{aligned}$$

whose support is contained in U, such that

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }W_2(\tilde{\beta }^N,\beta )=0. \end{aligned}$$
(33)

Such a sequence exists by, for example, [44, Lemma 10].

For each \(N \in \mathbb {N}\), let \(z_i^N \in U\), \(i \in \{1,\ldots ,N\}\), satisfy the following: \((z_i^N-z_j^N)\cdot e_3 \ne 0\) for all \(i,j\in \{1,\ldots ,N\}\), \(i \ne j\); \(z_i^N = \mathrm {argmin}_{z \in \{z_1^N,\ldots ,z_N^N\}} |z - \tilde{z}^N_i|\) for all \(i \in \{1,\ldots ,N\}\); \(|z_i^N - \tilde{z}^N_i|<1/N\) for all \(i \in \{1,\ldots ,N\}\). Define

$$\begin{aligned} \beta ^N:=\sum _{i=1}^Nm_i^N\delta _{z_i^N}. \end{aligned}$$

Then

$$\begin{aligned} W^2_2(\beta ^N,\tilde{\beta }^N) = \sum _{i=1}^N m_i^N |z_i^N - \tilde{z}^N_i|^2 \le \frac{1}{N^2}. \end{aligned}$$
(34)

Combining (33) and (34) completes the proof. \(\square \)
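The perturbation step can be illustrated numerically. The following Python sketch shows one possible way of moving given seeds \(\tilde{z}^N_i\) by less than 1/N so that their third components become pairwise distinct. It is not the construction used in the proof: the function names `separate_heights` and `coupling_cost` and the particular choice of vertical shifts are assumptions made for illustration, and the sketch does not enforce the conditions \(z_i^N\in U\) or the argmin condition.

```python
def separate_heights(points):
    """Return copies of `points` whose third coordinates are pairwise distinct.

    `points` is a list of (x, y, z) tuples playing the role of the seeds
    z~_i^N. Point i is shifted vertically by i * delta, where delta is chosen
    (an illustrative assumption) so that no two heights can collide and
    every shift is smaller than 1/N.
    """
    n = len(points)
    heights = sorted(p[2] for p in points)
    # Smallest positive gap between distinct heights (infinite if all agree).
    gaps = [b - a for a, b in zip(heights, heights[1:]) if b > a]
    gap = min(gaps) if gaps else float("inf")
    delta = 0.5 * min(1.0 / n**2, gap / n)
    return [(x, y, z + i * delta) for i, (x, y, z) in enumerate(points)]


def coupling_cost(masses, old, new):
    """Cost sum_i m_i |z_i - z~_i|^2 of moving each seed to its perturbation.

    This is an upper bound for the squared Wasserstein 2-distance between
    the two discrete measures.
    """
    return sum(
        m * sum((a - b) ** 2 for a, b in zip(p, q))
        for m, p, q in zip(masses, new, old)
    )
```

For N seeds with masses summing to 1, every shift is below 1/N, so `coupling_cost` is at most \(1/N^2\), in line with (34).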

Now fix \(T\in (0,\infty )\) and \(\overline{\alpha }\in \mathscr {P}_c(\mathbb {R}^3)\). By Lemma 5.1 there exists a sequence of well-prepared discrete probability measures \(\overline{\alpha }^N\in \mathscr {Q}^N(\mathbb {R}^3)\), given by

$$\begin{aligned} \overline{\alpha }^N:=\sum _{i=1}^Nm^N_i\delta _{\overline{z}^N_i}, \end{aligned}$$

and a positive constant \(\overline{M}>0\) such that

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }W_2(\overline{\alpha }^N,\overline{\alpha })=0 \qquad \text {and} \qquad \underset{N\in \mathbb {N}}{\bigcup }\text {supp}(\overline{\alpha }^N)\subset B_{\overline{M}}(0). \end{aligned}$$

By Theorem 3.5, for each \(N\in \mathbb {N}\) there exists a unique twice continuously differentiable discrete geostrophic solution \(\alpha ^N\in C([0,T];\mathscr {P}_c(\mathbb {R}^3))\) with initial measure \(\alpha ^N_0=\overline{\alpha }^N\), given by

$$\begin{aligned} \alpha ^N_t=\sum ^N_{i=1}m^N_i\delta _{z^N_i(t)} \quad \forall \, t\in [0,T], \end{aligned}$$

where \(\mathbf {z}^N=(z^N_1,\ldots ,z^N_N):[0,T]\rightarrow \mathbb {R}^{3N}\) is a twice continuously differentiable solution of the ODE-IVP (23) with initial condition

$$\begin{aligned} \mathbf {z}^N(0)=(\overline{z}_1^N,\ldots ,\overline{z}_N^N). \end{aligned}$$

Lemma 5.2

(Compactness, cf. [40, Lemma 2.5], [22, Theorem 3.2]) The sequence \((\alpha ^N)_{N\in \mathbb {N}}\) has a uniformly convergent subsequence in \(C\left( [0,T];\mathscr {P}_{c}(\mathbb {R}^3)\right) \). In particular, there exists a strictly increasing function \(k:\mathbb {N}\rightarrow \mathbb {N}\) and a Lipschitz map \(\alpha \in C^{0,1}\left( [0,T];\mathscr {P}_{c}(\mathbb {R}^3)\right) \) such that

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }\sup _{t\in [0,T]} W_2(\alpha _t^{k(N)},\alpha _t)=0. \end{aligned}$$

Moreover, there exists \(R_1>0\) such that for all \(t\in [0,T]\) and all \(N\in \mathbb {N}\)

$$\begin{aligned} \mathrm {supp}(\alpha ^N_t),\, \mathrm {supp}(\alpha _t)\subset K:=\overline{B_{R_1}(0)}. \end{aligned}$$

Proof

The a priori estimates (27) on the paths \(z^N_i\) mean that the metric space \((\mathscr {P}_{c}(\mathbb {R}^3),W_2)\) can be replaced by the compact metric space \((\mathscr {P}(K),W_2)\), where \(K\subset \mathbb {R}^3\) is the closed ball of radius \(R_1:=\overline{M}+R(\varOmega )T\) centred at the origin. Since K is compact, \(W_1\) and \(W_2\) are equivalent metrics on \(\mathscr {P}(K)\) (see [48, p.179]), so it is sufficient to prove that \((\alpha ^N)_{N\in \mathbb {N}}\) has a uniformly convergent subsequence in \(C\left( [0,T];(\mathscr {P}(K),W_1)\right) \) whose limit is Lipschitz with respect to the \(W_1\) metric. Moreover, since the target space \((\mathscr {P}(K),W_1)\) is a compact metric space, by the Ascoli-Arzelà Theorem (see for example [48, p.10, Box 1.7]), such a subsequence exists if for every \(\varepsilon > 0\), there exists \(\delta (\varepsilon ) >0\) such that, whenever \(s,t \in [0,T]\) and \(|t-s|<\delta (\varepsilon )\), \(W_1(\alpha _t^{N},\alpha _s^{N})<\varepsilon \) for all N.

For \(\mu ,\,\nu \in \mathscr {P}(K)\),

$$\begin{aligned} W_1(\mu ,\nu ) = \sup \left\{ \int _K \phi \, \mathrm {d}(\mu -\nu ) \ \Big \vert \ \phi : K \rightarrow \mathbb {R},\ \phi \text { is 1-Lipschitz} \right\} . \end{aligned}$$
(35)

(See for example [51, Theorem 1.14].) Let \(\phi : K \rightarrow \mathbb {R}\) be 1-Lipschitz and let \(t,s\in [0,T]\). By the a priori estimate (28) we have

$$\begin{aligned} \int _K \phi \, \mathrm {d}(\alpha _t^{N}-\alpha _s^{N})&= \sum _{i=1}^Nm^N_i\Big (\phi \big (z^N_i(t)\big )-\phi \big (z^N_i(s)\big )\Big ) \\&\le \sum _{i=1}^Nm^N_i \left| z^N_i(t) - z^N_i(s)\right| \\&\le \sum _{i=1}^Nm^N_i L |t-s| \\&= L |t-s|, \end{aligned}$$

where \(L=\overline{M}+R(1+T)\). Using the characterisation of the Wasserstein 1-distance given by (35) we obtain

$$\begin{aligned} W_1(\alpha ^N_t,\alpha ^N_s)\le L|t-s|. \end{aligned}$$
(36)

Choosing \(\delta (\varepsilon )=\varepsilon /L\), the Ascoli-Arzelà Theorem guarantees the existence of a uniformly convergent subsequence \((\alpha ^{k(N)})_{N\in \mathbb {N}}\). Denote by \(\alpha \) its limit. Using the Lipschitz estimate (36) and the triangle inequality in \((\mathscr {P}(K),W_1)\), for all \(N\in \mathbb {N}\) and all \(s,t\in [0,T]\) we have

$$\begin{aligned} W_1(\alpha _t,\alpha _s)\le W_1(\alpha _t,\alpha _t^{k(N)})+W_1(\alpha _s,\alpha _s^{k(N)})+L|t-s|. \end{aligned}$$

Passing to the limit as \(N\rightarrow \infty \), we see that \(\alpha \) is Lipschitz with respect to \(W_1\) as required. \(\square \)

We conclude this section with the proof of Theorem 3.6.

Proof of Theorem 3.6

Let \(\alpha \in C^{0,1}([0,T];\mathscr {P}_c(\mathbb {R}^3))\) be the limit point of the sequence \((\alpha ^{k(N)})_{N\in \mathbb {N}}\) obtained in Lemma 5.2. First, we prove that \(\alpha \) satisfies the transport equation (17). For clarity, given \(\beta \in C([0,T];\mathscr {P}(K))\) and \(\varphi \in \mathcal {D}(\mathbb {R}^3\times \mathbb {R})\) we define

$$\begin{aligned} \mathscr {F}[\beta ,\varphi ]:=&\int _0^{T}\int _{K}\big (\partial _t \varphi (z,t)+Jz\cdot \nabla \varphi (z,t)\big )\,\mathrm {d}\beta _t(z)\,\mathrm {d}t-\int _0^{T}\int _{\varOmega }Jx\cdot \nabla \varphi \big (\nabla \mathscr {B}[\beta _t](x),t\big )\,\mathrm {d}x\,\mathrm {d}t\\&-\left( \int _{K}\varphi (z,T)\,\mathrm {d}\beta _{T}(z)-\int _{K}\varphi (z,0)\,\mathrm {d}\beta _{0}(z)\right) . \end{aligned}$$

Since the space

$$\begin{aligned} \widetilde{\mathcal {D}}:=\{\varphi =\phi \psi \ \vert \ \phi \in C^{\infty }_c(\mathbb {R}),\ \psi \in C^{\infty }_c(\mathbb {R}^3)\} \end{aligned}$$

is a dense subspace of \(\mathcal {D}(\mathbb {R}^3\times \mathbb {R})\) (see [26]), to show that \(\alpha \) satisfies (17) it is enough to check that \(\mathscr {F}[\alpha ,\varphi ]=0\) for any \(\varphi \in \widetilde{\mathcal {D}}\). Moreover, for each \(N\in \mathbb {N}\) we have

$$\begin{aligned} \mathscr {F}[\alpha ^N,\varphi ]&=0\qquad \forall \, \varphi \in \widetilde{\mathcal {D}}. \end{aligned}$$

By the triangle inequality, it is therefore sufficient to show that

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }\big \vert \mathscr {F}[\alpha ^{k(N)},\varphi ]-\mathscr {F}[\alpha ,\varphi ]\big \vert =0\qquad \forall \, \varphi \in \widetilde{\mathcal {D}}. \end{aligned}$$
(37)

Since K is compact, \(W_1\) and \(W_2\) are equivalent metrics on \(\mathscr {P}(K)\). The sequence \((\alpha ^{k(N)})_{N\in \mathbb {N}}\) therefore converges to \(\alpha \) uniformly in \(C([0,T];(\mathscr {P}(K),W_1))\). By the characterisation of \(W_1\) given by (35), this implies that for any Lipschitz function \(\eta :K\rightarrow \mathbb {R}\),

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }\sup _{t\in [0,T]}\left\{ \int _K\eta (z)\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\right\} =0. \end{aligned}$$
(38)

For \(\varphi =\phi \psi \in \widetilde{\mathcal {D}}\), where \(\phi \in C^{\infty }_c(\mathbb {R})\) and \(\psi \in C^{\infty }_c(\mathbb {R}^3)\), we have

$$\begin{aligned} \int _0^{T}\int _{K}\partial _t \varphi (z,t)\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\,\mathrm {d}t&= \int _0^{T} \phi ^{\prime }(t)\left( \int _{K}\psi (z)\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\right) \,\mathrm {d}t\\&\leqslant T\sup _{t\in [0,T]}|\phi ^{\prime }(t)|\sup _{t\in [0,T]}\bigg \vert \int _{K}\psi (z)\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\bigg \vert \end{aligned}$$

and

$$\begin{aligned} \int _0^{T}\int _{K}Jz\cdot \nabla \varphi (z,t)\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\,\mathrm {d}t&=\int _0^{T} \!\phi (t)\left( \int _{K}\!Jz\cdot \nabla \psi (z)\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\right) \mathrm {d}t\\&\leqslant T\sup _{t\in [0,T]}|\phi (t)|\sup _{t\in [0,T]}\bigg \vert \int _{K}Jz\cdot \nabla \psi (z) \,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\bigg \vert . \end{aligned}$$

Therefore, by (38),

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }\int _0^{T}\int _{K}\big (\partial _t \varphi (z,t)+Jz\cdot \nabla \varphi (z,t)\big )\,\mathrm {d}(\alpha ^{k(N)}_t-\alpha _t)(z)\,\mathrm {d}t=0. \end{aligned}$$
(39)

Moreover, since uniform convergence implies pointwise convergence, we have

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }\left( \int _K\varphi (z ,T)\,\mathrm {d}(\alpha _{T}-\alpha ^{k(N)}_{T})(z)-\int _K\varphi (z ,0)\,\mathrm {d}(\alpha _{0}-\alpha ^{k(N)}_{0})(z)\right) =0. \end{aligned}$$
(40)

Finally, let

$$\begin{aligned} L_{\varphi }=\underset{t\in [0,T]}{\sup }\underset{x\in \mathbb {R}^3}{\sup }|D^2\varphi (x,t)| \end{aligned}$$

and, again, let \(R>0\) be such that \(\varOmega \subset B_R(0)\). By the Cauchy-Schwarz inequality in \(L^2(\varOmega ;\mathbb {R}^3)\) and the Hölder continuity of \(\mathscr {B}\) (Theorem 2.2 and [43, Theorem 3.1]), for some constant \(C>0\) depending only on \(\varOmega \) and K,

$$\begin{aligned}&\int _0^T\int _{\varOmega }Jx\cdot \left( \nabla \varphi (\nabla \mathscr {B}[\alpha _t](x),t)-\nabla \varphi \big (\nabla \mathscr {B}[\alpha ^{k(N)}_t](x),t\big )\right) \, \mathrm {d}x\, \mathrm {d}t\\&\leqslant T R L_{\varphi }\sup _{t\in [0,T]}\Vert \nabla \mathscr {B}\left[ \alpha _t\right] -\nabla \mathscr {B}[\alpha ^{k(N)}_t]\Vert _{L^2(\varOmega ;\mathbb {R}^3)}\\&\leqslant T R L_{\varphi }C\left( \sup _{t\in [0,T]}W_2(\alpha _t,\alpha ^{k(N)}_t)\right) ^{\frac{2}{15}}. \end{aligned}$$

Hence,

$$\begin{aligned} \underset{N\rightarrow \infty }{\lim }\int _0^{T}\int _{\varOmega }Jx\cdot \left( \nabla \varphi (\nabla \mathscr {B}[\alpha _t](x),t)-\nabla \varphi \big (\nabla \mathscr {B}[\alpha ^{k(N)}_t](x),t\big )\right) \, \mathrm {d}x\, \mathrm {d}t=0. \end{aligned}$$
(41)

By combining (39), (40) and (41), we see that (37) holds.

To complete the proof we note that each discrete solution \(\alpha ^{k(N)}\) is energy-conserving so, by continuity of the geostrophic energy functional E on \(\mathscr {P}(K)\) (see Remark 3.2), for any \(t\in [0,T]\)

$$\begin{aligned} E[\alpha _t]=\lim _{N\rightarrow \infty }E[\alpha ^{k(N)}_t]=\lim _{N\rightarrow \infty }E[\overline{\alpha }^{k(N)}]=E[\overline{\alpha }], \end{aligned}$$

which means that \(\alpha \) is also energy-conserving. \(\square \)

6 Explicit solutions

Here we consider two special cases (Examples 6.1 and 6.2) where we can obtain explicit expressions for discrete geostrophic solutions. In Example 6.3 we show that the Lebesgue measure restricted to \(\varOmega \) is an equilibrium solution of (1) and that its optimal quantisers are equilibrium solutions of the corresponding ODE-IVP (23).

Example 6.1

(A single mass) First we consider the case of a single Dirac mass. This example has been discussed in both [21, Section 5] and [23, Section 2.2] in the context of the 2-dimensional semi-geostrophic equations on the physical domain \(B_1(0)\subset \mathbb {R}^2\). In contrast with previous approaches, which use approximations of the Dirac mass by characteristic functions on balls, our solution follows immediately from the characterisation of discrete geostrophic solutions in terms of the ODE-IVP (23).

Let \(\varOmega \subset \mathbb {R}^3\) be open, bounded and convex, denote by \(x_{\varOmega }\) the centroid of \(\varOmega \), and let \(\overline{z}\in \mathbb {R}^3\). By an argument analogous to the proof of Lemma 4.2, a map \(\alpha \in C([0,T];\mathscr {P}_c(\mathbb {R}^3))\) given by

$$\begin{aligned} \alpha _t=\delta _{z(t)} \end{aligned}$$

is a k-times continuously differentiable discrete geostrophic solution with initial measure \(\overline{\alpha }=\delta _{\overline{z}}\) if and only if \(z:[0,T]\rightarrow \mathbb {R}^3\) is a k-times continuously differentiable solution of the ODE

$$\begin{aligned} \dot{z}=J(z-x_{\varOmega }) \end{aligned}$$
(42)

satisfying the initial condition \(z(0)=\overline{z}\). Such a map z is unique and is given by

$$\begin{aligned} z(t)=\mathrm {e}^{tJ}\left( \overline{z}-x_{\varOmega }\right) +x_{\varOmega }\qquad \forall \, t\in [0,T]. \end{aligned}$$

Moreover,

$$\begin{aligned} \nabla \mathscr {B}[\alpha _t]=z(t)\qquad \forall \, t\in [0,T]. \end{aligned}$$

Let \(z_3=z\cdot e_3\). By Eq. (16), the corresponding geostrophic energy satisfies

$$\begin{aligned} \dfrac{d {}}{d {t}}\left( E[\alpha _t]\right)&=\dfrac{d {}}{d {t}}\left( \frac{1}{2}\int _{\varOmega }|x-z(t)|^2\, \mathrm {d}x-\frac{1}{2}z^2_3(t)-\frac{1}{2}\int _{\varOmega }x^2_3\, \mathrm {d}x\right) \\&=-\int _{\varOmega }(x-z(t))\cdot \dot{z}(t)\, \mathrm {d}x -z_3(t)\dot{z}_3(t)\\&=(z(t)-x_{\varOmega })\cdot J(z(t)-x_{\varOmega })=0. \end{aligned}$$

Hence \(\alpha _t\) is energy-conserving.
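The explicit formula for z can be evaluated directly, since \(\mathrm {e}^{tJ}\) is a rotation by angle t about the \(e_3\)-axis. The following Python sketch does this; the function names `rotate` and `single_mass_path` are illustrative choices, and the centroid \(x_{\varOmega }\) is assumed to be supplied by the caller.

```python
import math


def rotate(theta, v):
    """Apply exp(theta * J) to v: rotation by theta about the e_3 axis."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


def single_mass_path(t, z0, centroid):
    """Evaluate z(t) = exp(tJ)(z0 - x_Omega) + x_Omega.

    `centroid` stands for x_Omega and is an input here (an assumption of
    this sketch); it is not computed from the domain.
    """
    d = tuple(a - b for a, b in zip(z0, centroid))
    return tuple(a + b for a, b in zip(rotate(t, d), centroid))
```

The third component of `single_mass_path(t, z0, centroid)` is independent of t, and the horizontal distance from the seed to the centroid is constant, which reflects the skew-symmetry of J.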

Example 6.2

(Two masses in a ball) Let \(\varOmega \subset \mathbb {R}^3\) be the ball of volume 1 centred at the origin, and let

$$\begin{aligned} \overline{\alpha }=m\delta _{\overline{z}_1}+(1-m)\delta _{\overline{z}_2}, \end{aligned}$$

where \(\overline{z}_{1}, \overline{z}_{2}\in \mathbb {R}^3\) are distinct and \(m\in (0,1/2]\). To construct a geostrophic solution with initial measure \(\overline{\alpha }\) we make the ansatz

$$\begin{aligned} \alpha _t=m\delta _{z_1(t)}+(1-m)\delta _{z_2(t)}, \end{aligned}$$

where each \(z_i:[0,T]\rightarrow \mathbb {R}^3\) is continuously differentiable. This yields the ODE-IVP

$$\begin{aligned} {\left\{ \begin{array}{ll} \ \dot{z}_i=J(z_i-x_i(z_1,z_2)),\\ \ z_i(0)=\overline{z}_i, \end{array}\right. } \end{aligned}$$
(43)

for \(i\in \{1,2\}\), where \(x_i(z_1,z_2)\) denotes the centre of mass of the \(i^{\text {th}}\) cell in the optimal Laguerre tessellation generated by \((z_1,z_2)\).

Because \(\varOmega \) is a ball, Laguerre tessellations of \(\varOmega \) generated by two seeds are easily characterised. Indeed, for any given \((z_1,z_2)\in \mathbb {R}^{6}\) such that \(z_1\ne z_2\), the boundary between the Laguerre cells \(C_1\) and \(C_2\) generated by \((z_1,z_2)\) is necessarily the intersection of the ball with a plane perpendicular to the vector \(z_1-z_2\). Hence \(C_1\) is the spherical cap of mass m whose base has outward pointing normal vector \((z_2-z_1)/|z_2-z_1|\), and \(C_2\) is its complement in \(\varOmega \). The centre of mass of the Laguerre cell \(C_1\) must therefore lie some positive distance r, depending only on the mass m, along the axis defined by the vector \(z_1-z_2\) about which \(C_1\) is rotationally symmetric. Hence,

$$\begin{aligned} x_1(z_1,z_2)=r(m)\frac{z_1-z_2}{|z_1-z_2|}. \end{aligned}$$
(44)

Moreover, the centre of mass of the whole ball is the origin so

$$\begin{aligned} mx_1+(1-m)x_2=0 \quad \implies \quad x_2=\frac{mr(m)}{m-1}\frac{z_1-z_2}{|z_1-z_2|}. \end{aligned}$$
(45)

Using these observations, (43) becomes

$$\begin{aligned}&\dot{z}_1= J\left( z_1-r\frac{z_1-z_2}{|z_1-z_2|}\right) , \end{aligned}$$
(46)
$$\begin{aligned}&\dot{z}_2=J\left( z_2-\frac{mr}{m-1}\frac{z_1-z_2}{|z_1-z_2|}\right) . \end{aligned}$$
(47)

Subtracting (47) from (46) yields the following ODE-IVP for the difference \(Z:=z_1-z_2\):

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{Z}=J\left( Z-q\frac{Z}{|Z|}\right) ,\\ Z(0)=\overline{z}_1-\overline{z}_2=:\overline{Z}, \end{array}\right. } \end{aligned}$$
(48)

where

$$\begin{aligned} q=q(m)=\left( \frac{1}{1-m}\right) r(m) \end{aligned}$$

is a positive constant determined by m alone. To solve (48), we first note that due to the skew symmetry of J

$$\begin{aligned} \dfrac{d {}}{d {t}}|Z|^2=2Z\cdot \dot{Z}=0\quad \implies \quad |Z(t)|=|\overline{Z}| \quad \forall \, \ t\in [0,T]. \end{aligned}$$

Hence, (48) becomes

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{Z}(t)=\left( 1-\frac{q}{|\overline{Z}|}\right) JZ(t),\\ Z(0)=\overline{Z}. \end{array}\right. } \end{aligned}$$

Its solution is the map \(Z:[0,T]\rightarrow \mathbb {R}^3\) given by

$$\begin{aligned} Z(t)=e^{\omega t J}\overline{Z}, \end{aligned}$$

where \(\omega =1-q/|\overline{Z}|\). That is,

$$\begin{aligned} Z(t)=\left( \begin{array}{c c c} \cos (\omega t) &{} -\sin (\omega t) &{} 0\\ \sin (\omega t) &{} \cos (\omega t) &{} 0\\ 0 &{} 0 &{} 1 \end{array}\right) \overline{Z}. \end{aligned}$$

Hence, Eqs. (46) and (47) decouple and we obtain two linear inhomogeneous ODE-IVPs for \(z_1\) and \(z_2\). Noting that \(\omega \ne 1\) since \(q>0\), we use Duhamel’s formula and find that

$$\begin{aligned} z_1(t)&= \mathrm {e}^{tJ}\overline{z}_1-\frac{r}{\omega -1}\left( e^{\omega tJ}-\mathrm {e}^{tJ}\right) \frac{\overline{z}_1-\overline{z}_2}{|\overline{z}_1-\overline{z}_2|},\\ z_2(t)&= \mathrm {e}^{tJ}\overline{z}_2+\frac{mr}{(1-m)(\omega -1)}\left( e^{\omega tJ}-\mathrm {e}^{tJ}\right) \frac{\overline{z}_1-\overline{z}_2}{| \overline{z}_1-\overline{z}_2|}. \end{aligned}$$

To conclude, recalling the expressions for the centroids given by (44) and (45), we note that \(x_1\) and \(x_2\) simply rotate anti-clockwise around the vertical coordinate axis with angular frequency \(\omega \).
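The closed-form expressions for \(z_1\) and \(z_2\) can be checked against the right-hand sides of (46) and (47). The Python sketch below evaluates both; the function names are illustrative, and the centroid distance r(m) is treated as an input rather than computed from the cap geometry (an assumption of this sketch).

```python
import math


def J(v):
    """Apply the matrix J: (v1, v2, v3) -> (-v2, v1, 0)."""
    return (-v[1], v[0], 0.0)


def rot(theta, v):
    """Apply exp(theta * J): rotation by theta about the e_3 axis."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


def two_mass_solution(t, z1_0, z2_0, m, r):
    """Closed-form z_1(t), z_2(t); r stands for r(m), supplied by the caller."""
    Z0 = [a - b for a, b in zip(z1_0, z2_0)]
    norm = math.sqrt(sum(c * c for c in Z0))
    u = [c / norm for c in Z0]
    omega = 1.0 - r / ((1.0 - m) * norm)  # omega = 1 - q / |Z(0)|
    # (exp(omega t J) - exp(t J)) applied to the unit vector u
    diff = [a - b for a, b in zip(rot(omega * t, u), rot(t, u))]
    c1 = -r / (omega - 1.0)
    c2 = m * r / ((1.0 - m) * (omega - 1.0))
    z1 = tuple(p + c1 * d for p, d in zip(rot(t, z1_0), diff))
    z2 = tuple(p + c2 * d for p, d in zip(rot(t, z2_0), diff))
    return z1, z2


def two_mass_rhs(z1, z2, m, r):
    """Right-hand sides of (46) and (47) at the state (z1, z2)."""
    Z = [a - b for a, b in zip(z1, z2)]
    norm = math.sqrt(sum(c * c for c in Z))
    x1 = [r * c / norm for c in Z]
    x2 = [m * r / (m - 1.0) * c / norm for c in Z]
    return (J([a - b for a, b in zip(z1, x1)]),
            J([a - b for a, b in zip(z2, x2)]))
```

For instance, when \(m=1/2\) the cell \(C_1\) is a hemisphere, whose centroid lies at distance \(3\rho /8\) from the centre of a ball of radius \(\rho \); for the ball of volume 1, \(\rho =(3/(4\pi ))^{1/3}\). A central finite difference of `two_mass_solution` in t can then be compared with `two_mass_rhs`.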

Example 6.3

(Equilibrium solutions) The Lebesgue measure restricted to \(\varOmega \) is an equilibrium solution of (1). Indeed, let \(\alpha \) denote this measure. Then \(\nabla \mathscr {B}[\alpha ]=\mathrm {id}_{\varOmega }\), which implies that \(\nabla \mathscr {B}[\alpha ]^*(x)=x\ \forall \, x\in \varOmega \), which in turn implies that \(\mathscr {W}[\alpha ]=0\) on \(\mathrm {supp}(\alpha )\). Let

$$\begin{aligned} \alpha ^N=\sum _{i=1}^Nm_i\delta _{z_i} \end{aligned}$$

be an optimal quantiser of the Lebesgue measure restricted to \(\varOmega \). By, e.g., [19, Proposition 3.1], [29, Corollary 4.3], the seeds \(\mathbf {z}=(z_1,\ldots ,z_N)\) generate a centroidal Voronoi tessellation of \(\varOmega \), i.e.,

$$\begin{aligned} x_i(\mathbf {z})=z_i\quad \forall \, i\in \{1,\ldots ,N\}. \end{aligned}$$

This means that \(W(\mathbf {z})=0\) so \(\mathbf {z}\) is an equilibrium solution of the ODE-IVP (23). The behaviour of the Lebesgue measure restricted to \(\varOmega \) under the dynamics of Eq. (1) therefore agrees exactly with that of its optimal discrete approximants.

7 Numerical simulations

One advantage of the constructive existence proof given in this paper is that it leads naturally to a meshfree numerical method and, moreover, yields partial information about the convergence of this method. To be precise, given an initial measure \(\overline{\alpha } \in \mathscr {P}_c(\mathbb {R}^3)\), we can construct a numerical approximation of a solution of the semi-geostrophic equations (17) as follows:

  1. Approximate \(\overline{\alpha }\) by a discrete measure \(\overline{\alpha }^N=\sum _{i=1}^N m_i \delta _{\overline{z}_i}\). This leads to the semi-discrete numerical scheme (continuous in time, discrete in space) used in the proof of Theorem 3.6, where an exact solution \(\alpha ^N=\sum _{i=1}^N m_i \delta _{z_i(t)}\) of (17) with initial condition \(\overline{\alpha }^N\) is constructed by solving the system of ODEs \(\dot{\mathbf {z}}=W(\mathbf {z})\) given in Eq. (23). To turn this into a bona fide numerical method we require a further discretisation:

  2. Use a time-stepping scheme to approximately solve the ODE \(\dot{\mathbf {z}}=W(\mathbf {z})\). Every evaluation of the vector field W requires a further numerical approximation; to evaluate \(W(\mathbf {z}(t))\) we must solve the semi-discrete transport problem in order to compute the centroids \(\mathbf {x}(\mathbf {z}(t))\).

We demonstrate the viability of this numerical method by giving an example in two dimensions, implemented in MATLAB. Due to a lack of space we postpone the three-dimensional implementation and a more thorough numerical study to a further paper. Before giving the example we briefly discuss some implementation issues and convergence.

ODE solver We used the explicit Runge-Kutta scheme RK4 [35, Example 5.13] to solve the ODE \(\dot{\mathbf {z}}=W(\mathbf {z})\). For larger simulations it may be better to use a linear multistep method [35, Section 5.9] since each evaluation of the vector field W is expensive (linear multistep methods only require one new vector field evaluation per time step, whereas RK4 requires four per time step).
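The time-stepping step can be sketched as follows. This is a generic Python illustration of the classical RK4 scheme for \(\dot{\mathbf {z}}=W(\mathbf {z})\), not the MATLAB code used for the simulations. The names `rk4_step`, `integrate` and `single_mass_field` are hypothetical, and the vector field used here is the cheap single-mass field from (42); in the actual method each call to W would require a semi-discrete transport solve.

```python
import math


def rk4_step(W, z, h):
    """One classical RK4 step for z' = W(z); z is a flat list of floats."""
    k1 = W(z)
    k2 = W([a + 0.5 * h * b for a, b in zip(z, k1)])
    k3 = W([a + 0.5 * h * b for a, b in zip(z, k2)])
    k4 = W([a + h * b for a, b in zip(z, k3)])
    # Four evaluations of W per step, combined with weights 1, 2, 2, 1.
    return [a + h * (b + 2.0 * c + 2.0 * d + e) / 6.0
            for a, b, c, d, e in zip(z, k1, k2, k3, k4)]


def integrate(W, z0, h, n_steps):
    """Advance z' = W(z) from z0 over n_steps uniform steps of size h."""
    z = list(z0)
    for _ in range(n_steps):
        z = rk4_step(W, z, h)
    return z


def single_mass_field(centroid):
    """Stand-in vector field z -> J(z - centroid) from the single-mass ODE.

    Assumption: used only as an inexpensive test case; it replaces the
    centroid map x(z), which would need an optimal transport solver.
    """
    def W(z):
        return [-(z[1] - centroid[1]), z[0] - centroid[0], 0.0]
    return W
```

With \(h=0.01\) and 500 steps, `integrate` can be compared against the exact rotation \(\mathrm {e}^{tJ}\) at \(t=5\) to gauge the time-discretisation error in isolation.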

Semi-discrete transport solver The semi-discrete transport problem can be solved by maximising the concave function g (see Eq. (12)) as described in Sect. 2. We did this in MATLAB using a quasi-Newton method (the MATLAB function fminunc). Every evaluation of g and its gradient requires a Laguerre diagram to be computed, which we did using the MATLAB function power_bounded [25]. This simple method, which is described in more detail in [6, Algorithm 1 and Section 4], was sufficient for our proof of concept simulations here. In general, however, to maximise g it would be much faster to use the damped Newton method from [32], especially for large 3D simulations.

Convergence By Theorem 3.6 the sequence \((\alpha ^N)_{N \in \mathbb {N}}\) generated by the semi-discrete scheme has a subsequence that converges (uniformly with respect to the Wasserstein metric) to a solution of the semi-geostrophic equations (17) with initial condition \(\overline{\alpha }\). In particular, if Eq. (17) has a unique weak solution with initial measure \(\overline{\alpha }\), then the whole sequence of approximations \(\alpha ^N\) converges to the true solution. Local-in-time uniqueness of Hölder continuous periodic solutions of (17) was proved in [40], but is not known in general. By the conservation of the transport cost (see Remark 3.7) it is easy to see that the whole sequence \((\alpha ^N)_{N\in \mathbb {N}}\) also converges in the very special case where the initial measure \(\overline{\alpha }\) is the Lebesgue measure on \(\varOmega \). (Note that the Lebesgue measure is an equilibrium solution of (17) as discussed in Example 6.3.) In general, proving convergence of the whole sequence \((\alpha ^N)_{N\in \mathbb {N}}\) is beyond the scope of this paper, as is proving convergence of the fully discrete scheme. We will study these in a future paper.

Example 7.1

(Gaussian initial condition) Let \(\varOmega = [-1,1] \times [-1,1]\) and let the initial measure \(\overline{\alpha }\) be absolutely continuous with respect to the Lebesgue measure with density \(C \exp (-|x|^2) \mathbb {1}_{\varOmega }\), where \(C>0\) is the normalisation constant satisfying

$$\begin{aligned} \overline{\alpha }(\mathbb {R}^2) = \mathscr {L}^2(\varOmega ) \quad \Longleftrightarrow \quad C = 4 \left( \int _\varOmega \exp (-|x|^2) \, \mathrm {d}x \right) ^{-1}. \end{aligned}$$
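As a small numerical aside, the constant C can be evaluated in closed form, because the Gaussian density factorises over the square and \(\int _{-1}^{1}\mathrm {e}^{-s^2}\,\mathrm {d}s=\sqrt{\pi }\,\mathrm {erf}(1)\). The Python sketch below does this; the function name `gaussian_normalisation` is an illustrative choice.

```python
import math


def gaussian_normalisation():
    """Return C = 4 / int_{[-1,1]^2} exp(-|x|^2) dx.

    The double integral is the square of the one-dimensional integral
    int_{-1}^{1} exp(-s^2) ds = sqrt(pi) * erf(1).
    """
    one_d = math.sqrt(math.pi) * math.erf(1.0)
    return 4.0 / one_d**2
```

The value can be cross-checked against any simple quadrature rule for the one-dimensional integral.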

Note that we have dropped the previous normalisation convention that \(\overline{\alpha }(\mathbb {R}^2) = \mathscr {L}^2(\varOmega )=1\), which is not necessary for the analysis; we simply require that \(\overline{\alpha }(\mathbb {R}^2) = \mathscr {L}^2(\varOmega )\). We approximated \(\overline{\alpha }\) by a discrete measure of the form \(\overline{\alpha }^N = \sum _{i=1}^N m_i \delta _{\overline{z}_i}\) with \(N=2000\) seeds using 1000 iterations of Lloyd’s algorithm [19]. We approximated the solution of the ODE-IVP (23) on the time interval [0, T] with \(T=5\) using the Runge-Kutta method RK4 with uniform time step size \(h=0.01\). For the numerical maximisation of g we used the following stopping condition: for all \(i \in \{1,\ldots ,N\}\),

$$\begin{aligned} \left| \frac{\partial g}{\partial w_i} \right|< 10^{-2} \, \varepsilon \min _{j} m_j \quad \Longleftrightarrow \quad |m_i-|C_i|| < 10^{-2} \, \varepsilon \min _{j} m_j \end{aligned}$$

with \(\varepsilon =0.1\). This ensures that the areas of the Laguerre cells \(C_i\) are correct to within \(0.1 \%\).
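The stopping condition is straightforward to express in code. The following Python sketch is an assumption-laden illustration, not the MATLAB implementation: the function name `transport_converged` is hypothetical, and the target masses \(m_i\) and the current cell areas \(|C_i|\) are assumed to be available as plain lists.

```python
def transport_converged(masses, cell_areas, eps=0.1):
    """Check |m_i - |C_i|| < 1e-2 * eps * min_j m_j for every cell i.

    `masses` are the target areas m_i and `cell_areas` the areas of the
    current Laguerre cells; how these are computed is outside this sketch.
    """
    tol = 1e-2 * eps * min(masses)
    return all(abs(m - a) < tol for m, a in zip(masses, cell_areas))
```

With `eps=0.1` the tolerance is \(10^{-3}\min _j m_j\), so every cell area is within \(0.1\%\) of the smallest target mass.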

The results are shown in Fig. 2 at time steps \(t=0,0.5,1,3,4,5\). The black dots in the first and third rows are the (approximate) seed locations \(z_i\). The polygons in the second and fourth rows are the (approximate) Laguerre cells \(C_i\). The cells are coloured according to their target areas \(m_i\), where blue corresponds to small cells and yellow corresponds to large cells. Note that all the seeds start off in \(\varOmega \) but they are not confined there.

Fig. 2 Results of Example 7.1. The seeds \(z_i\) (black dots) and the corresponding Laguerre cells (coloured polygons) are illustrated at time steps \(t=0,0.5,1,3,4,5\) for a Gaussian initial measure. The Laguerre cells are coloured according to their area (small cells in blue, large cells in yellow)

As an accuracy check, we repeated these simulations with a smaller time step size of \(h=0.005\) and a finer optimal transport tolerance of \(\varepsilon =0.05\). We found that the x- and y-coordinates of the seeds \(z_i\) at the final time \(T=5\) agreed with our previous results to within \(10^{-3}\). Recall from Remark 3.7 that the exact dynamics (23) preserves the transport cost.

In our simulations (with \(h=0.01\), \(\varepsilon =0.1\)) the transport cost was conserved to within \(7.5 \times 10^{-7}\) over the time steps \(t_n = nh\), \(n \in \{0,1,\ldots ,500\}\).