1 Introduction

Ensemble-based data assimilation needs relevant ensemble sampling tools. Indeed, the quality and spread of these ensembles have deep implications for the quality of the data assimilation [Dufée et al 2022] and, until recently, the so-called covariance inflation tools have mostly relied on unsuitable linear Gaussian frameworks [Tandeo et al 2020, Resseguier et al 2020a]. A promising alternative is the generation of ensembles through a stochastic remapping of the physical space.

Consider a random mapping T, acting at every infinitesimal time step, such that \(T_t(x)-x\) is interpreted as a “location perturbation” expressed by

$$\displaystyle \begin{aligned} T_t(x) = x + a(t,x)\varDelta t + e_i(t,x)\varDelta \eta_i(t),{} \end{aligned} $$
(1.1)

where \(a(t,x), e_i(t,x)\in \mathbb {R}^n\). In Eq. (1.1), \(a(t,x)\) controls deterministic location shifts, and \(\varDelta \eta _i(t)\sim \mathcal {N}(0,\varDelta t)\) random ones. At every time step, this random mapping T shall induce a perturbation to any tensor field \(\theta (t)\) [Zhen et al 2022]. For instance, one can perturb a differential form \(\theta (t)\) applying \(\theta (t)\to T_t^*\theta (t)\) with \(T_t^*\) the associated pull-back operator.
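
To make Eq. (1.1) concrete, the following minimal Python sketch draws one realisation of the remapping for a set of points. The function name `sample_remap`, the callables `a` and `e`, and the array shapes are illustrative assumptions, not notation from the cited works.

```python
import numpy as np

def sample_remap(t, x, a, e, dt, rng):
    """Draw one realisation of T_t(x) = x + a(t,x) dt + e_i(t,x) d_eta_i (Eq. 1.1).

    x : (npts, n) array of positions
    a : callable (t, x) -> drift, shape (npts, n)            [assumed interface]
    e : callable (t, x) -> noise modes, shape (m, npts, n)   [assumed interface]
    """
    modes = e(t, x)                                            # (m, npts, n)
    # One scalar increment per mode, d_eta_i ~ N(0, dt).
    d_eta = rng.normal(0.0, np.sqrt(dt), size=modes.shape[0])
    # Sum over the mode index i.
    shift = a(t, x) * dt + np.tensordot(d_eta, modes, axes=1)
    return x + shift, d_eta

rng = np.random.default_rng(0)
x = np.array([[0.0, 0.0], [1.0, 2.0]])
a = lambda t, x: np.zeros_like(x)                   # no deterministic shift
e = lambda t, x: np.stack([np.ones_like(x), x])     # two hypothetical modes
x_new, d_eta = sample_remap(0.0, x, a, e, dt=1e-2, rng=rng)
```

Each call returns the displaced points together with the increments that produced them, so that the same increments can be reused when perturbing several fields consistently.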

A rigorous mathematical definition and calculation of \(T_t\) and \(T_t^*\) can be obtained in terms of stochastic flows of diffeomorphisms and their Lie derivatives [e.g., Bethencourt De Leon 2021]. Yet, to rapidly assess \(T_t^*\theta \), a Taylor expansion and Itô’s lemma can be used. Given coordinates \((x^1,\ldots ,x^n)\), when \(\theta \) is a differential \(k-\)form, it can be written as

$$\displaystyle \begin{aligned} \theta = \sum_{i_1<\ldots<i_k}f^{i_1,\ldots,i_k}dx^{i_1}\wedge\dots\wedge dx^{i_k}{{,}} \end{aligned} $$
(1.2)

with f a semimartingale smooth in space. Then

$$\displaystyle \begin{aligned} T_t^*\theta = \sum_{i_1<\ldots<i_k}f^{i_1,\ldots,i_k}(T_t(x))T_t^*(dx^{i_1}\wedge\dots\wedge dx^{i_k}){{,}} \end{aligned} $$
(1.3)

leading to a compact expression

$$\displaystyle \begin{aligned} T_t^*\theta = \theta + \mathcal{M}(t,\theta)\varDelta t + \mathcal{N}_i(t,\theta)\varDelta\eta_i(t), \end{aligned} $$
(1.4)

with some differential \(k-\)forms \(\mathcal {M}(t,\theta )\) and \(\mathcal {N}_i(t,\theta )\). Appendix 5 provides definitions of \(\mathcal {M}\) and \(\mathcal {N}\) [see Zhen et al 2022, Appendix B for a full proof].

Hereafter, we present and discuss the potential of this random mapping scheme, in particular how \(\theta \) and the parameters a and \(e_i\) can be prescribed to ensure that certain quantities, e.g. mass, vorticity, helicity or energy, are conserved.

Several examples of \(T_t^*\theta \) can indeed be considered. For instance, when \(\theta = f\) is a function (differential \(0-\)form),

$$\displaystyle \begin{aligned} {} (T_t^*\theta) =& f + \underbrace{\Big{(}a^j \partial_{x^j} f + \tfrac{1}{2}e_i^pe_i^q \partial_{x^p} \partial_{x^q} f\Big{)} }_{=\mathcal{M}} \varDelta t + \underbrace{e_i^p \partial_{x^p} f}_{=\mathcal{N}_i} \varDelta\eta_i. \end{aligned} $$
(1.5)
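
The drift \(\mathcal {M}\) of Eq. (1.5) can be checked numerically in one dimension. The sketch below is a hedged illustration: it replaces the Gaussian increment by the two-point surrogate \(\varDelta \eta = \pm \sqrt {\varDelta t}\), which shares its mean and variance, and all numerical values are arbitrary assumptions.

```python
import math

def pullback_0form_mean(f, x, a, e, dt):
    """Average of f(T_t(x)) over the two-point noise d_eta = +/- sqrt(dt)."""
    s = math.sqrt(dt)
    return 0.5 * (f(x + a * dt + e * s) + f(x + a * dt - e * s))

# Arbitrary test values (assumptions); a and e are the values of a(t,x), e(t,x) at x.
x, a, e, dt = 0.3, 0.7, 0.5, 1e-4
f, df, d2f = math.sin, math.cos, lambda y: -math.sin(y)

mean_increment = pullback_0form_mean(f, x, a, e, dt) - f(x)
# M * dt from Eq. (1.5) with a single noise mode.
drift_M = (a * df(x) + 0.5 * e**2 * d2f(x)) * dt
```

The two quantities agree up to a remainder of order \(\varDelta t^2\), which is the sense in which the Taylor expansion and Itô’s lemma yield Eq. (1.5).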

And when \(\theta = fdx^1\wedge \dots \wedge dx^n\) (differential n-form), it then follows

$$\displaystyle \begin{aligned} {} T_t^*\theta =& \Big{\{}f + \Big{(}(\partial_{x^p}a^p+\tfrac{1}{2}J_i)f + ( a^p + e^p_i \partial_{x^q} e^q_{i} ) \partial_{x^p} f +\tfrac{1}{2}e^p_ie^q_i \partial_{x^p} \partial_{x^q} f \Big{)}\varDelta t \\ &+ ( \partial_{x^p} e^p_{i} f+ e^p_i \partial_{x^p} f )\varDelta\eta_i\Big{\}}dx^1\wedge\dots\wedge dx^n . \end{aligned} $$
(1.6)
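
For \(n=1\) the pull-back of \(\theta = f\,dx\) is simply \(f(T_t(x))\,T_t'(x)\,dx\), which gives a direct numerical check of the drift in Eq. (1.6). The sketch below is an illustration under stated assumptions: the coefficient functions are hypothetical, the noise is the two-point surrogate \(\varDelta \eta = \pm \sqrt {\varDelta t}\), and \(J_i\) vanishes identically in one dimension.

```python
import math

# Hypothetical smooth coefficients and density on the real line (n = 1).
a, da = (lambda x: 0.3 * math.cos(x)), (lambda x: -0.3 * math.sin(x))
e, de = (lambda x: 1.0 + 0.5 * math.sin(x)), (lambda x: 0.5 * math.cos(x))
f = lambda x: math.exp(-x * x)
df = lambda x: -2.0 * x * math.exp(-x * x)
d2f = lambda x: (4.0 * x * x - 2.0) * math.exp(-x * x)

def pullback_density(x, dt, sign):
    """Coefficient of T_t^*(f dx) = f(T_t(x)) T_t'(x) dx for d_eta = sign * sqrt(dt)."""
    d_eta = sign * math.sqrt(dt)
    T = x + a(x) * dt + e(x) * d_eta            # remapped position
    dT = 1.0 + da(x) * dt + de(x) * d_eta       # Jacobian of the remapping
    return f(T) * dT

x, dt = 0.4, 1e-4
mean_increment = 0.5 * (pullback_density(x, dt, +1) + pullback_density(x, dt, -1)) - f(x)
# Drift of Eq. (1.6) with n = 1 and J_i = 0.
drift = (da(x) * f(x) + (a(x) + e(x) * de(x)) * df(x) + 0.5 * e(x) ** 2 * d2f(x)) * dt
```

The extra terms compared with the 0-form case come from the Jacobian factor, i.e. from the transformation of the volume element itself.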

Finally, when \(\theta = f^j dx^j = \sum _{j=1}^n f^j dx^{j}\) is a differential 1-form, we have

$$\displaystyle \begin{aligned} {} T_t^*\theta =& \Big{\{}f^j + ( a^p \partial_{x^p} f^j + \tfrac{1}{2} e^p_ie^q_i \partial_{x^p} \partial_{x^q} f^j + \partial_{x^j} a^p f^p + \partial_{x^j}e^p_{i} e^q_i \partial_{x^q} f^p)\varDelta t \\ &+ (e_i^p \partial_{x^p} f^j + \partial_{x^j} e^p_{i} f^p )\varDelta\eta_i \Big{\}}dx^j . \end{aligned} $$
(1.7)
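
As a consistency check added here (it is not part of the cited derivations), note that for \(n=1\) a 1-form is also an n-form, so Eq. (1.7) should reduce to Eq. (1.6). Writing \(\partial \) for \(\partial _{x^1}\) and dropping the component indices, Eq. (1.7) becomes

$$\displaystyle \begin{aligned} T_t^*\theta =& \Big{\{}f + ( a\, \partial f + \tfrac{1}{2} e_i e_i\, \partial^2 f + \partial a\, f + e_i\, \partial e_i\, \partial f)\varDelta t + (e_i\, \partial f + \partial e_i\, f )\varDelta\eta_i \Big{\}}dx , \end{aligned} $$

which is Eq. (1.6) with \(J_i = 0\); indeed the quantity \(J_i = \partial _{x^p} e_{i}^p \partial _{x^q} e_{i}^q - \partial _{x^p}e_{i}^q \partial _{x^q}e_{i}^p\) vanishes identically when \(p=q=1\).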

2 Induced Stochastic PDE

From the expression of \(T_t^*\theta \), an SPDE can be derived from an original PDE when \(\theta \) is a differential form. Suppose \(S{{{ }^d}}\) is the full state variable of the deterministic dynamical system:

$$\displaystyle \begin{aligned} \frac{\partial S{{{}^d}}}{\partial t} = g(S{{{}^d}}). \end{aligned} $$
(2.1)

Let \(f{{{ }^d}}\) be a component or a collection of components of \(S{{{ }^d}}\). We then associate \(f{{{ }^d}}\) to a differential form \(\theta {{{ }^d}}\), i.e. there is an invertible map \(\mathcal {F}\) that maps the space of \(f{{{ }^d}}\) to the space of \(\theta {{{ }^d}}\), such that \(\mathcal {F}(f{{{ }^d}}) = \theta {{{ }^d}}\). Typically, if \(f{{{ }^d}}\) is a tracer, it is often associated to the 0-form \(\theta {{{ }^d}}=f{{{ }^d}}\). If \(f{{{ }^d}}\) is the density \(\rho {{{ }^d}}\), we might associate the n-form \(\theta {{{ }^d}} = \rho {{{ }^d}} \; dx^{i_1}\wedge \dots \wedge dx^{i_n}\). More generally, \( \theta {{{ }^d}}\), and thus \(\mathcal {F}\), can be prescribed to ensure that certain quantities—such as mass, energy, circulation—are conserved [Zhen et al 2022, section 3.3]. Consider the propagation equation for \(f{{{ }^d}}\)

$$\displaystyle \begin{aligned} \mathtt{d}f{{{}^d}} = g^f(S{{{}^d}})\mathtt{d}t.{} \end{aligned} $$
(2.2)

It implies a propagation equation for \(\theta {{{ }^d}}\):

$$\displaystyle \begin{aligned} \mathtt{d}\theta{{{}^d}} = g^\theta(S{{{}^d}})\mathtt{d}t.{} \end{aligned} $$
(2.3)

We will now stochastically perturb the above deterministic dynamics. Let us denote \(S,f\) and \(\theta \) the semimartingale solutions of this randomized dynamics. The proposed discrete-time perturbation at each time step consists of the following two steps:

$$\displaystyle \begin{aligned} \tilde{\theta}(t+\varDelta t) =& \theta(t) + g^\theta(S(t))\varDelta t{{,}} \end{aligned} $$
(2.4)
$$\displaystyle \begin{aligned} \theta(t+\varDelta t) =& T_t^*\tilde{\theta}(t+\varDelta t){{,}} \end{aligned} $$
(2.5)

with \(T_t^*\tilde {\theta }(t+\varDelta t) = \tilde {\theta }(t+\varDelta t) + \mathcal {M}(t,\tilde {\theta }(t+\varDelta t))\varDelta t + \mathcal {N}_i(t,\tilde {\theta }(t+\varDelta t) )\varDelta \eta _i(t) + o(\varDelta t)\) for the associated differential forms \(\mathcal {M}(t,\tilde {\theta })\) and \(\mathcal {N}_i(t,\tilde {\theta })\).
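
The two-step scheme above can be sketched for a 0-form on a periodic one-dimensional grid, where the pull-back reduces to evaluating the forecast field at the remapped positions. This is a minimal illustration, not the authors' implementation: the function name `forecast_then_remap`, the use of linear interpolation, the single noise mode, and the test fields are all assumptions.

```python
import numpy as np

def forecast_then_remap(theta, x, g, a, e, dt, rng, period=1.0):
    """One step of the two-step scheme for a 0-form on a periodic 1-D grid.

    theta : (N,) field values at the grid points x
    g     : callable, deterministic tendency g(theta)            [assumed interface]
    a, e  : callables of x giving the drift and one noise mode   [assumed interface]
    """
    theta_tilde = theta + g(theta) * dt             # deterministic forecast step
    d_eta = rng.normal(0.0, np.sqrt(dt))            # d_eta ~ N(0, dt)
    x_mapped = x + a(x) * dt + e(x) * d_eta         # T_t(x), Eq. (1.1)
    # For a 0-form the pull-back is a composition: (T_t^* theta)(x) = theta(T_t(x)).
    return np.interp(x_mapped, x, theta_tilde, period=period)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 64, endpoint=False)
theta0 = np.sin(2 * np.pi * x)
no_tendency = lambda th: np.zeros_like(th)
member = forecast_then_remap(theta0, x, no_tendency,
                             a=lambda x: 0.0 * x,
                             e=lambda x: 0.1 * (1.0 + 0.5 * np.cos(2 * np.pi * x)),
                             dt=1e-2, rng=rng)
```

Calling the function repeatedly with independent random generators yields ensemble members. For n-forms or 1-forms the remapping step would also need the Jacobian factors of \(T_t\), which this sketch does not include.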

The first step of the scheme, Eq. (2.4), is deterministic, so \(\|\tilde {\theta }(t+\varDelta t) - \theta (t)\|\) scales as \(O(\varDelta t)\): there is no noise term to induce a scaling in \(O(\sqrt {\varDelta t})\). Therefore, it can be assumed that there exists \(C > 0\) so that \(\|\mathcal {M}(t,\tilde {\theta }(t+\varDelta t)) - \mathcal {M}(t,\theta (t))\| < C\varDelta t\) and \(\|\mathcal {N}_i(t,\tilde {\theta }(t+\varDelta t)) - \mathcal {N}_i(t,\theta (t))\| < C\varDelta t\), for \(\varDelta t\) small enough. Accordingly,

$$\displaystyle \begin{aligned} T_t^*\tilde{\theta}(t+\varDelta t) =& \tilde{\theta}(t+\varDelta t) + \Big{(}\mathcal{M}(t,\theta(t)) + \mathcal{O}(\varDelta t)\Big{)}\varDelta t \\ & + \Big{(}\mathcal{N}_i(t,\theta(t)) + \mathcal{O}(\varDelta t)\Big{)}\varDelta\eta_i(t) + o(\varDelta t){{,}} \\ =& \tilde{\theta}(t+\varDelta t) + \mathcal{M}(t,\theta(t))\varDelta t + \mathcal{N}_i(t,\theta(t))\varDelta \eta_i(t) + o(\varDelta t){{.}} {} \end{aligned} $$
(2.6)

Therefore,

$$\displaystyle \begin{aligned} \theta(t + \varDelta t) = \theta(t) + g^\theta(S(t))\varDelta t + \mathcal{M}(t,\theta(t))\varDelta t + \mathcal{N}_i(t,\theta(t))\varDelta \eta_i + o(\varDelta t). {} \end{aligned} $$
(2.7)

It suggests the following stochastic propagation equation for \(\theta \):

$$\displaystyle \begin{aligned} \mathtt{d}\theta = g^\theta(S)\mathtt{d}t + \mathcal{M}(t,\theta)\mathtt{d}t + \mathcal{N}_i(t,\theta)\mathtt{d}\eta_i.{} \end{aligned} $$
(2.8)

Since there is a 1-1 correspondence between \(\theta \) and f, Eq. (2.8) also suggests a stochastic propagation equation for f, which can be written as

$$\displaystyle \begin{aligned} \mathtt{d}f = g^f(S)\mathtt{d}t + \mathcal{M}^f(f)\mathtt{d}t + \mathcal{N}_i^f(f)\mathtt{d}\eta_i.{} \end{aligned} $$
(2.9)

We denote the additional terms in Eq. (2.9) by

$$\displaystyle \begin{aligned} \mathtt{d}_s f := \mathcal{M}^f(f)\mathtt{d}t + \mathcal{N}_i^f(f)\mathtt{d}\eta_i. \end{aligned} $$
(2.10)

Then Eq. (2.9) can be written as:

$$\displaystyle \begin{aligned} \mathtt{d}f = g^f(S)\mathtt{d}t + \mathtt{d}_sf. \end{aligned} $$
(2.11)

3 Comparison with Other Perturbation Schemes

As obtained above, \(\mathtt {d}_sf\) is completely determined by \(T_t^*\theta \) and is not directly related to the original dynamics, Eq. (2.2). Once the expression of T in Eq. (1.1) and the choice of the differential form \(\theta {{{ }^d}}\) are fixed, the perturbation term \(\mathtt {d}_sf\) is prescribed. However, the choice of \(\theta {{{ }^d}}\) is left to the user, and may then be related to the original dynamics.

In the following, we thus demonstrate that both the stochastic advection by Lie transport (SALT) equation [Holm 2015] and the location uncertainty (LU) equation [Mémin 2014, Resseguier et al 2017; 2020b] can be properly recovered using the proposed perturbation scheme.

3.1 Comparison with the LU Equations

The Reynolds transport theorem is central to the LU setting. It expresses an integral conservation equation for the transport of any conserved quantity within a fluid, and connects it to the corresponding differential equation. A link between the proposed perturbation approach and the LU formulation is therefore expected to involve differential n-forms. But first, we consider a key ingredient of LU: the stochastic material derivative of functions (differential 0-forms).

3.1.1 0-Forms in the LU Framework

Dropping the forcing terms, the LU equation for compressible and incompressible flows reads [Resseguier et al 2017]

$$\displaystyle \begin{aligned} \partial_t f + \boldsymbol{w}^{\star}\cdot \nabla f =& \nabla\cdot(\tfrac{1}{2}\boldsymbol{a}\nabla f) - \boldsymbol{\sigma}\dot{\boldsymbol{B}}\cdot\nabla f{{,}} {} \end{aligned} $$
(3.1)
$$\displaystyle \begin{aligned} \boldsymbol{w}^{\star} =& \boldsymbol{w} - \tfrac{1}{2}(\nabla\cdot \boldsymbol{a})^{\top} + \boldsymbol{\sigma}(\nabla\cdot\boldsymbol{\sigma})^{\top}{{,}} {} \end{aligned} $$
(3.2)

where \(\boldsymbol {a} =\boldsymbol {\sigma }_{\bullet k} \boldsymbol {\sigma }_{\bullet k}^T \) and f can be any quantity that is assumed to be transported by the flow, i.e. \(Df/Dt=0\) where \(D/Dt\) is the Itô material derivative. For instance, f could be the velocity (dropping forces in the SPDE), the temperature, or the buoyancy.

Separating the terms of the SPDE related to the deterministic dynamics from those associated with the stochastic scheme, we obtain

$$\displaystyle \begin{aligned} \mathtt{d}^{\text{LU}}f = g^f(S)\mathtt{d}t + \mathtt{d}_s^{\text{LU}}f, \end{aligned} $$
(3.3)

where

$$\displaystyle \begin{aligned} {} g^f(S) =& -\boldsymbol{w}\cdot \nabla f{{,}} \end{aligned} $$
(3.4)
$$\displaystyle \begin{aligned} \mathtt{d}_s^{\text{LU}}f =& - (\boldsymbol{w}^{\star}-\boldsymbol{w})\cdot \nabla f\mathtt{d}t - \boldsymbol{\sigma} \mathtt{d}\boldsymbol{B}\cdot\nabla f + \nabla\cdot(\tfrac{1}{2}\boldsymbol{a}\nabla f)\mathtt{d}t{{.}} {} \end{aligned} $$
(3.5)

On the other hand, our proposed scheme applied to a 0-form \(\theta = f\) (Eq. (1.5)) gives:

$$\displaystyle \begin{aligned} \mathtt{d}_sf = \Big{(}a^p \partial_{x^p} f + \tfrac{1}{2}e_i^pe_i^q \partial_{x^p} \partial_{x^q} f\Big{)}\mathtt{d} t + e_i^p \partial_{x^p} f \mathtt{d}\eta_i{{.}} {} \end{aligned} $$
(3.6)

To physically interpret this equation, we rewrite:

$$\displaystyle \begin{aligned} \frac{ \mathtt{d}_sf}{\mathtt{d} t } = - V^p \partial_{x^p} f + \partial_{x^p} \left((\tfrac{1}{2}e_i^pe_i^q) \partial_{x^q} f \right){{,}} {} \end{aligned} $$
(3.7)

where

$$\displaystyle \begin{aligned} V^p = -a^p + \tfrac{1}{2} \partial_{x^q}(e_i^pe_i^q) - e_i^p \frac{\mathtt{d}\eta_i}{\mathtt{d} t }{{.}} \end{aligned} $$
(3.8)
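
The step from Eq. (3.6) to Eq. (3.7) is a product-rule rearrangement; the short check below is added for the reader and uses only the symbols already defined. Expanding the diffusion term of Eq. (3.7) gives

$$\displaystyle \begin{aligned} \partial_{x^p} \left((\tfrac{1}{2}e_i^pe_i^q) \partial_{x^q} f \right) = \tfrac{1}{2} \partial_{x^p}(e_i^pe_i^q) \partial_{x^q} f + \tfrac{1}{2}e_i^pe_i^q \partial_{x^p} \partial_{x^q} f{{,}} \end{aligned} $$

while the advection term reads

$$\displaystyle \begin{aligned} - V^p \partial_{x^p} f = a^p \partial_{x^p} f - \tfrac{1}{2} \partial_{x^q}(e_i^pe_i^q) \partial_{x^p} f + e_i^p \frac{\mathtt{d}\eta_i}{\mathtt{d} t } \partial_{x^p} f{{.}} \end{aligned} $$

Since \(e_i^pe_i^q\) is symmetric in \((p,q)\), the two terms involving \(\partial (e_i^pe_i^q)\) cancel after relabelling the summed indices, and the sum is Eq. (3.6) divided by \(\mathtt {d}t\).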

Terms of advection and diffusion are recognized. The matrix \(\tfrac {1}{2} e_i e_i^T\) is symmetric non-negative and represents a diffusion matrix. The p-th component of the advecting velocity \(V^p\) is composed of the drift \(-a^p\), a correction \(\tfrac {1}{2} \partial _{x^q}(e_i^pe_i^q)\), and a stochastic advecting velocity \(-e_i^p \frac {\mathtt {d}\eta _i}{\mathtt {d} t }\).

Direct calculation shows that Eq. (3.5) coincides with Eq. (3.7) when \(\boldsymbol {a} =\boldsymbol {\sigma }_{\bullet k} \boldsymbol {\sigma }_{\bullet k}^T = e_i e_i^T \), \(\boldsymbol {\sigma }\mathtt {d}\boldsymbol {B} = - e_i \mathtt {d}\eta _i\), and

$$\displaystyle \begin{aligned} T_t(x) = x + e^q_i \partial_{x_q} e_{i} \varDelta t +e_i\varDelta\eta_i = x - \boldsymbol{w}_S^c \varDelta t + ( - \boldsymbol{w}_S^c \varDelta t - \boldsymbol{\sigma}\varDelta \boldsymbol{B} ), {} \end{aligned} $$
(3.9)

where

$$\displaystyle \begin{aligned} \boldsymbol{w}_S^c = - \tfrac{1}{2} (\boldsymbol{\sigma}_{\bullet k} \cdot \nabla) \boldsymbol{\sigma}_{\bullet k} = - \tfrac{1}{2}(\nabla\cdot \boldsymbol{a})^{\top} + \tfrac 12 \boldsymbol{\sigma}(\nabla\cdot\boldsymbol{\sigma})^{\top} . \end{aligned} $$
(3.10)
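
The second equality in Eq. (3.10) follows from \(\partial _{x^q}(\sigma ^p\sigma ^q) = \sigma ^q\partial _{x^q}\sigma ^p + \sigma ^p\partial _{x^q}\sigma ^q\). It can also be checked numerically, as in the sketch below for a single noise mode in two dimensions. The test field `sigma`, the helper `jacobian` and the evaluation point are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def sigma(x):
    """One hypothetical smooth noise mode sigma_{.k} in 2-D (an arbitrary test field)."""
    return np.array([np.sin(x[0]) * np.cos(x[1]), 0.5 * x[0] * x[1] + np.cos(x[0])])

def jacobian(field, x, h=1e-5):
    """Central-difference Jacobian: J[p, q] = d field^p / d x^q."""
    cols = []
    for q in range(x.size):
        dx = np.zeros_like(x)
        dx[q] = h
        cols.append((field(x + dx) - field(x - dx)) / (2.0 * h))
    return np.stack(cols, axis=1)

x0 = np.array([0.4, -0.7])
s, Js = sigma(x0), jacobian(sigma, x0)

lhs = -0.5 * Js @ s                                   # -1/2 (sigma . grad) sigma
a_tensor = lambda x: np.outer(sigma(x), sigma(x))     # a = sigma sigma^T for a single mode
# (div a)^p = d_q a^{pq}: differentiate row p of a and sum the diagonal over q.
div_a = np.array([np.trace(jacobian(lambda x, p=p: a_tensor(x)[p], x0)) for p in range(2)])
div_s = np.trace(Js)                                  # div sigma
rhs = -0.5 * div_a + 0.5 * s * div_s
```

With several modes the same check applies mode by mode, since both sides of the identity are sums over k.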

The LU equation can thus be derived by choosing \(\theta = f\) and \(T_t\) given by Eq. (3.9). Note that the term \((-\boldsymbol {w}_S^c \varDelta t - \boldsymbol {\sigma }\varDelta {\boldsymbol {B}} ) = ( \tfrac {1}{2} e^q_i \partial _{x_q} e_{i} \varDelta t + e_i\varDelta \eta _i)\) is the Itô noise plus its Itô-to-Stratonovich correction. Hence, it corresponds to the Stratonovich noise \(e_i \circ \mathrm {d}\eta _i\) of the flow associated with \(T_t\). The additional drift \(-\boldsymbol {w}_S^c \varDelta t\) is different in nature. It is related to the advection correction \(\boldsymbol {w}_S^c \cdot \nabla f\) in the LU setting. Indeed, in the LU framework, the Itô drift, \(\boldsymbol {w}\), is seen as the resolved large-scale velocity. That is why, in this framework, the deterministic dynamics (3.4) involve the Itô drift, \(\boldsymbol {w}\), and why, under the LU derivation, the advected velocity is assumed to be given by the Itô drift, \(\boldsymbol {w}\). It differs from the Stratonovich drift \(\boldsymbol {w}_S=\boldsymbol {w}+\boldsymbol {w}_S^c\), used as the advected velocity in the SALT approach or in Mikulevicius and Rozovskii [2004] (where the Stratonovich drift is denoted u). Interested readers are referred to [Resseguier et al 2020b, Appendix A] for a discussion of these assumptions and for the complete table of correspondences between SALT and LU notations. Note however that in all these approaches, the advecting velocity is always the Stratonovich drift. This can be seen, e.g., in the Stratonovich form of the LU equations, derived in [Resseguier 2017, Appendix 10.1] and [Resseguier et al 2020a, 6.1.3]:

$$\displaystyle \begin{aligned} \partial_t f + \boldsymbol{w}_S\cdot \nabla f =& - (\boldsymbol{\sigma} \circ \dot{\boldsymbol{B}})\cdot \nabla f, \end{aligned} $$
(3.11)

where \(\boldsymbol {\sigma }\circ \dot {\boldsymbol {B}}\) is the Stratonovich noise of the SPDE. Since the advecting velocity \(\boldsymbol {w}_S\) and the resolved velocity \(\boldsymbol {w}\) differ by a drift \(\boldsymbol {w}_S^c\), the term \(\boldsymbol {w}_S^c \cdot \nabla f\) is interpreted as an advection correction, being part of the stochastic scheme (3.5). Accordingly, the remapping \( T_t\) involves an additional drift \( - \boldsymbol {w}_S^c \varDelta t \) .

To further interpret Eq. (3.9), the inverse flow can be considered:

$$\displaystyle \begin{aligned} T_t^{-1}(x) = x - e_i \varDelta\eta_i = x + \boldsymbol{\sigma}\varDelta {\boldsymbol{B}} .{} \end{aligned} $$
(3.12)
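
That Eq. (3.12) inverts Eq. (3.9) up to \(o(\varDelta t)\) can be verified numerically in one dimension. The sketch below is a hedged illustration: the amplitude `e` is a hypothetical test function, there is a single noise mode, and the increment is set to \(\sqrt {\varDelta t}\) so that \(\varDelta \eta ^2\) equals \(\varDelta t\) up to rounding.

```python
import math

e = lambda x: 1.0 + 0.5 * math.sin(x)      # hypothetical noise amplitude (1-D, one mode)
de = lambda x: 0.5 * math.cos(x)           # its derivative

def T(x, dt, d_eta):
    """Remapping of Eq. (3.9) in 1-D: x + e e' dt + e d_eta."""
    return x + e(x) * de(x) * dt + e(x) * d_eta

def T_inv(x, d_eta):
    """Candidate inverse of Eq. (3.12): x - e d_eta."""
    return x - e(x) * d_eta

x, dt = 0.3, 1e-4
d_eta = math.sqrt(dt)
y = T_inv(x, d_eta)
residual = T(y, dt, d_eta) - x             # composition with the drift of Eq. (3.9)
naive = (y + e(y) * d_eta) - x             # same composition without the drift term
```

The residual is of order \(\varDelta t^{3/2}\), whereas dropping the drift \(e^q_i \partial _{x_q} e_{i} \varDelta t\) leaves an error of order \(\varDelta t\): the drift in Eq. (3.9) is exactly what makes the inverse flow driftless.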

If \(T_t\) represents a necessary perturbation to match, at each time step, a true solution, \(T_t^{-1}\) measures the difference, at each time step, between this true solution and a model forecast. Therefore, the LU equation can be derived using the proposed perturbation scheme, choosing \(\theta = f\) and assuming that a true solution differs from a model forecast by a displacement prescribed by Eq. (3.12).

3.1.2 n-Forms in the LU Framework

The LU physical justification relies on a stochastic interpretation of fundamental conservation laws, typically conservation of extensive properties (i.e. integrals of functions over a spatial volume) like momentum, mass, matter and energy [Resseguier et al 2017]. These extensive properties can be expressed by integrals of differential \(n-\)forms. For instance, the mass and the momentum are integrals of the differential \(n-\)forms \(\rho dx^1\wedge \dots \wedge dx^n\) and \(\rho \boldsymbol {w} dx^1\wedge \dots \wedge dx^n\), respectively. In the LU framework, a stochastic version of the Reynolds transport theorem [Resseguier et al 2017, Eq. (28)] is used to deal with these differential \(n-\)forms \(\theta =fdx^1\wedge \dots \wedge dx^n\). Assuming an integral conservation \(\frac {d}{dt}\int _{V(t)} f = 0 \) on a spatial domain \(V(t)\) transported by the flow, it leads to the following SPDE:

$$\displaystyle \begin{aligned} \frac{Df}{Dt}+ \nabla \cdot ( \boldsymbol{w}^{\star}+\boldsymbol{\sigma}\dot{\boldsymbol{B}} )f = \frac{d}{dt}\left<\int_0^t D_t f, \int_0^t \nabla \cdot \boldsymbol{\sigma}\dot{\boldsymbol{B}} \right> = (\nabla \cdot \boldsymbol{\sigma}_{\bullet i} )(\nabla \cdot \boldsymbol{\sigma}_{\bullet i} )^T f{{,}} {} \end{aligned} $$
(3.13)

where \(D/Dt\) denotes the Itô material derivative. Forcing terms are dropped for the sake of readability. This SPDE can be rewritten using the expression of that material derivative (Eq. (9) and (10) of Resseguier et al [2017]):

$$\displaystyle \begin{aligned} \partial_t f + \nabla \cdot (\boldsymbol{w}_S f) =& \tfrac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla f) +\tfrac{1}{2} \nabla \cdot ( \boldsymbol{\sigma}_{\bullet i} (\nabla \cdot \boldsymbol{\sigma}_{\bullet i})^T f) - \nabla \cdot ( \boldsymbol{\sigma}\dot{\boldsymbol{B}} f){{.}} \end{aligned} $$
(3.14)

The original deterministic equation and stochastic perturbation correspond to

$$\displaystyle \begin{aligned} g^f(S) =& -\nabla \cdot (\boldsymbol{w} f) {{,}} \end{aligned} $$
(3.15)
$$\displaystyle \begin{aligned} \mathtt{d}_s^{\text{LU}}f \!=\!& (- \nabla \cdot (\boldsymbol{w}_S^c f) \!+\!\tfrac{1}{2} \nabla \cdot (\boldsymbol{a} \nabla f) \!+\!\tfrac{1}{2} \nabla \cdot ( \boldsymbol{\sigma}_{\bullet i} ( \nabla \cdot \boldsymbol{\sigma}_{\bullet i})^T f) ) \mathtt{d}t \!-\! \nabla \cdot ( \boldsymbol{\sigma}\mathtt{d} \boldsymbol{B} f){{,}} \end{aligned} $$
(3.16)
$$\displaystyle \begin{aligned} =& -\nabla \cdot ( ( -(\tfrac 12 \nabla \cdot \boldsymbol{a})^T \mathtt{d}t + \boldsymbol{\sigma}\mathtt{d} \boldsymbol{B} ) f) + \nabla \cdot (\tfrac{1}{2}\boldsymbol{a} \nabla f) \mathtt{d}t{{.}} \end{aligned} $$
(3.17)

We can now compare these LU equations to our new stochastic scheme applied to the n-form \(\theta = fdx^1\wedge \dots \wedge dx^n\) (Eq. (1.6)). This implies that

$$\displaystyle \begin{aligned} \mathtt{d}_sf =& \Big{(}(\partial_{x^p}a^p+\tfrac{1}{2}J_i)f + ( a^p + e^p_i \partial_{x^q} e^q_{i} ) \partial_{x^p} f +\tfrac{1}{2}e^p_ie^q_i \partial_{x^p} \partial_{x^q} f \Big{)}\mathtt{d} t \\ &+ ( \partial_{x^p} e^p_{i} f+ e^p_i \partial_{x^p} f )\mathtt{d}\eta_i, {} \end{aligned} $$
(3.18)

where \(J_i = \partial _{x^p} e_{i}^p \partial _{x^q} e_{i}^q - \partial _{x^p}e_{i}^q \partial _{x^q}e_{i}^p\). Rewritten, it leads to:

$$\displaystyle \begin{aligned} \frac{ \mathtt{d}_sf}{\mathtt{d} t } = - \partial_{x^p}\left(\tilde{V}^p f\right) + \partial_{x^p} \left((\tfrac{1}{2}e_i^pe_i^q) \partial_{x^q} f \right){{,}} {} \end{aligned} $$
(3.19)

where

$$\displaystyle \begin{aligned} \tilde{V}^p = {V}^p - (e_i^p \partial_{x^q} e_i^q) = -a^p + \tfrac{1}{2} ( \partial_{x^q}e_i^pe_i^q - e_i^p \partial_{x^q} e_i^q ) - e_i^p \frac{\mathtt{d}\eta_i}{\mathtt{d} t }{{.}} \end{aligned} $$
(3.20)

Again, an advection-diffusion equation is recognized, but of a different nature. Indeed, as expected for an n-form, the PDE is similar to a density conservation equation. Moreover, the advecting drift is slightly different, to take into account the cross-correlations between \(f(T_t(x))\) and \(T_t^*(dx^{1}\wedge \dots \wedge dx^{n})\).

Identifying \(\boldsymbol {a} =\boldsymbol {\sigma }_{\bullet k} \boldsymbol {\sigma }_{\bullet k}^T = e_i e_i^T \) and \(\boldsymbol {\sigma }\mathtt {d}\boldsymbol {B} = - e_i \mathtt {d}\eta _i\),

$$\displaystyle \begin{aligned} \tilde{V}^p = -a^p + \tfrac{1}{2} ( \partial_{x^q}e_i^pe_i^q - e_i^p \partial_{x^q} e_i^q ) - e_i^p \frac{\mathtt{d}\eta_i}{\mathtt{d} t } = \Big{(}- (\tfrac 12 \nabla \cdot \boldsymbol{a})^T +\boldsymbol{\sigma}\dot{\boldsymbol{B}}\Big{)}^p {{,}} \end{aligned} $$
(3.21)

i.e.

$$\displaystyle \begin{aligned} a^p = \tfrac{1}{2} ( \partial_{x^q}e_i^pe_i^q - e_i^p \partial_{x^q} e_i^q ) + \tfrac 12 \partial_{x^q}(e_i^pe_i^q) = e_i^q \partial_{x^q} e_i^p{{.}} \end{aligned} $$
(3.22)

The remapping thus reads

$$\displaystyle \begin{aligned} T_t(x) = x + e^q_i \partial_{x_q} e_{i} \varDelta t +e_i\varDelta\eta_i = x - \boldsymbol{w}_S^c \varDelta t + ( - \boldsymbol{w}_S^c \varDelta t - \boldsymbol{\sigma}\varDelta \boldsymbol{B} ), {} \end{aligned} $$
(3.23)

as already derived for differential \(0-\)forms in the LU framework (Eq. (3.9)). Therefore, the proposed perturbation mapping can also encompass the LU framework for \(n-\)forms, and its capacity, given by the Reynolds transport theorem, to deal with extensive properties.

Moreover, for incompressible flows, the LU equations further impose that

$$\displaystyle \begin{aligned} \begin{cases} \nabla\cdot\boldsymbol{\sigma} = 0{{,}} \\ \nabla\cdot\nabla\cdot\boldsymbol{a} = 0 {{.}} \end{cases}{} \end{aligned} $$
(3.24)

Translating it into our present notation, it reads as

$$\displaystyle \begin{aligned} \begin{cases} \partial_{x_p} e^p_{i} = 0{{,}} \text{ for each }i \\ \partial_{x_p}\partial_{x_q}(e^p_ie^q_i) = 0{{.}} \end{cases} \end{aligned} $$
(3.25)

A straightforward calculation shows that Eq. (3.24) is equivalent to \(T_t^*\theta = \theta \) for \(\theta = dx^1\wedge \dots \wedge dx^n\). Such a result is expected, since the constraints of Eq. (3.24) are obtained from the LU density conservation.
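
The calculation can be sketched as follows; this is a reader's check based on Eqs. (1.6) and (3.22), not a reproduction of the authors' derivation. Setting \(f = 1\) in Eq. (1.6) gives

$$\displaystyle \begin{aligned} T_t^*\theta = \Big{\{}1 + (\partial_{x^p}a^p+\tfrac{1}{2}J_i)\varDelta t + \partial_{x^p} e^p_{i} \varDelta\eta_i\Big{\}}dx^1\wedge\dots\wedge dx^n , \end{aligned} $$

so the noise term vanishes if and only if \(\partial _{x^p} e^p_{i} = 0\) for each i. Under this condition, and with \(a^p = e_i^q \partial _{x^q} e_i^p\) from Eq. (3.22),

$$\displaystyle \begin{aligned} \partial_{x^p}a^p+\tfrac{1}{2}J_i = \partial_{x^p}e_i^q \partial_{x^q}e_i^p - \tfrac{1}{2} \partial_{x^p}e_i^q \partial_{x^q}e_i^p = \tfrac{1}{2} \partial_{x^p}e_i^q \partial_{x^q}e_i^p = \tfrac{1}{2} \partial_{x^p}\partial_{x^q}(e^p_ie^q_i) , \end{aligned} $$

which vanishes exactly when the second condition of Eq. (3.24) holds.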

3.2 The SALT Perturbation Scheme

Holm [2015] derived the original SALT equation following a stochastically constrained variational principle \(\delta S = 0\), for which

$$\displaystyle \begin{aligned} \begin{cases} &S(u,q) = \int \ell(u,q)\mathtt{d}t{{,}}\\ &\mathtt{d}q + {\pounds}_{\mathtt{d}x_t}q = 0{{,}} \end{cases}{} \end{aligned} $$
(3.26)

where \(\ell (u,q)\) is the Lagrangian of the system, \({\pounds }\) is the Lie derivative, and \(x_t(x)\) is defined by (using our notation)

$$\displaystyle \begin{aligned} x_t(x) = x_0(x) + \int_{0}^tu(x,s)\mathtt{d}s - \int_{0}^te_i(x)\circ\mathtt{d}\eta_i(s),{} \end{aligned} $$
(3.27)

in which u is the velocity vector field. The \(\circ \) means that the integral is defined in the Stratonovich sense, instead of the Itô sense. Hence, \(\mathtt {d}x_t = u(x,t)\mathtt {d}t - e_i\circ \mathtt {d}\eta _i\) refers to an infinitesimal stochastic tangent field on the domain. We can express \(\mathtt {d}x_t = T_t(x) - x + u\mathtt {d}t\). Note the difference between Itô’s notation and Stratonovich’s notation, i.e. \(e_i\circ \mathtt {d}\eta _i\neq e_i\mathtt {d}\eta _i\). The initial expression of \(T_t\) essentially follows Itô’s notation. In this subsection, therefore, \(T_t(x) \neq x - e_i\varDelta \eta _i\). Instead, \(T_t(x) = x +\frac {1}{2} e^p_i \partial _{x_p} e_{i} \varDelta t - e_i\varDelta \eta _i\).
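
The Itô-to-Stratonovich correction quoted above can be illustrated numerically. The sketch below compares one step of the stochastic Heun scheme, which is consistent with Stratonovich calculus, against the Itô expression with and without the correction term. The amplitude `e`, the single noise mode, and the increment \(\varDelta \eta = \sqrt {\varDelta t}\) are assumptions made for the illustration.

```python
import math

e = lambda x: 1.0 + 0.5 * math.sin(x)      # hypothetical noise amplitude (1-D, one mode)
de = lambda x: 0.5 * math.cos(x)           # its derivative

def heun_stratonovich(x, d_eta):
    """One Heun predictor-corrector step of dx = -e(x) o d_eta (Stratonovich)."""
    x_pred = x - e(x) * d_eta
    return x - 0.5 * (e(x) + e(x_pred)) * d_eta

def ito_with_correction(x, dt, d_eta):
    """T_t(x) = x + 1/2 e e' dt - e d_eta, the Ito form quoted in the text."""
    return x + 0.5 * e(x) * de(x) * dt - e(x) * d_eta

x, dt = 0.3, 1e-4
d_eta = math.sqrt(dt)
gap_corrected = heun_stratonovich(x, d_eta) - ito_with_correction(x, dt, d_eta)
gap_plain = heun_stratonovich(x, d_eta) - (x - e(x) * d_eta)
```

With the correction the two steps differ by a term of order \(\varDelta t^{3/2}\), while the uncorrected Itô step differs by a term of order \(\varDelta t\), which would accumulate into a spurious drift over many steps.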

In the second equation of Eq. (3.26), q is assumed to be a quantity advected by the flow. q can correspond to any differential form that is not uniquely determined by the velocity (since the SALT equation for the velocity is usually determined by the first equation of Eq. (3.26)). Holm [2015] evaluates the Lie derivative \({\pounds }_{\mathtt {d}x_t}q\) using Cartan’s formula:

$$\displaystyle \begin{aligned} {\pounds}_{\mathtt{d}x_t}q = d(i_{\mathtt{d}x_t}q) + i_{\mathtt{d}x_t}dq. \end{aligned} $$
(3.28)

This Lie derivative \({\pounds }_{\mathtt {d}x_t}q\) corresponds to \(T_t^*q - q + f^{q}(S)\mathtt {d}t\), if we assume that the deterministic forecast of q is simply the advection of q by u. More generally, \({\pounds }_{\mathtt {d}x_t - u\mathtt {d}t} q = T_t^*q - q\). Therefore, the SALT equation for q is the same as our perturbation for q. Note that Cartan’s formula cannot be directly applied to calculate the Lie derivative if \(\mathtt {d}x_t\) is expressed in Itô’s notation.

Within the SALT setting, the velocity u comes from the first equation of Eq. (3.26). For most cases, the velocity u is associated with the momentum, a differential \(1-\)form \(\mathrm {m} = u^j dx^j = u^1dx^1 + \ldots + u^ndx^n\). When the Lagrangian includes the kinetic energy, Holm [2015] observed that the stochastic noises contribute a term \({\pounds }_{\mathtt {d}x_t}\theta \), where \(\theta \) is a differential \(1-\)form related to the momentum \(1-\)form. In particular, \(\theta = \mathrm {m}\) for the “Stratonovich stochastic Euler-Poincaré flow” example, and \(\theta = \mathrm {m} + R^j dx^j\) for the “Stochastic Euler-Boussinesq equations of a rotating stratified incompressible fluid”.

As already pointed out, the operator \({\pounds }_{\mathtt {d}x_t}\) is closely related to \(T_t^*\), and the SALT momentum equation can thus also be derived using our proposed perturbation scheme by properly choosing \(\theta \), without relying on Lagrangian mechanics.

Another way to appreciate the correspondence with SALT is to look at the final SPDE. If we choose \(\theta \) to be a differential 1-form representing the momentum f, i.e. \(\theta = f^j dx^j\), we obtain from Eq. (1.7) [more details in Zhen et al 2022]:

$$\displaystyle \begin{aligned} \mathtt{d}_sf^j = \,\,& ( a^p \partial_{x^p} f^j + \tfrac{1}{2} e^p_ie^q_i \partial_{x^p} \partial_{x^q} f^j + \partial_{x^j} a^p f^p + \partial_{x^j}e^p_{i} e^q_i \partial_{x^q} f^p)\mathtt{d} t \\ &+ (e_i^p \partial_{x^p} f^j + \partial_{x^j} e^p_{i} f^p )\mathtt{d}\eta_i{{.}} {} \end{aligned} $$
(3.29)

Regrouping the terms for physical interpretation, this reads:

$$\displaystyle \begin{aligned} \frac{ \mathtt{d}_sf^j}{\mathtt{d} t } &= - V^p \partial_{x^p} f^j + \partial_{x^p} \left((\tfrac{1}{2}e_i^pe_i^q) \partial_{x^q} f^j \right)\\ & \quad + \partial_{x^j} \left(a^p + e_i^p \frac{\mathtt{d}\eta_i}{\mathtt{d} t }\right) f^p +\partial_{x^j}e^p_{i} e^q_i \partial_{x^q} f^p{{.}} \end{aligned} $$
(3.30)

The last two terms of the right-hand side complete the advection-diffusion terms already appearing in (3.7). The first one, \(\partial _{x^j} \left (a^p + e_i^p \frac {\mathtt {d}\eta _i}{\mathtt {d} t }\right )f^p\), is reminiscent of the additional terms appearing in SALT momentum equations [Holm 2015, Resseguier et al 2020b]. The second term, \(\partial _{x^j}e^p_{i} e^q_i \partial _{x^q} f^p\), comes from cross-correlation in Itô notation.

4 Conclusion

As demonstrated, both SALT and LU equations can be recovered using a prescribed definition of a random diffeomorphism \(T_t\) used to perturb the physical space. However, compared with the SALT and LU settings, the proposed perturbation scheme does not directly rely on any particular physics. Hence, the random mapping is more flexible and can be applied to any PDE. Interestingly, similarities and differences can then be identified and studied between the proposed use of the random diffeomorphism and the existing stochastic physical SALT and LU settings. For instance, the proposed derivation provides an interesting interpretation of the operator \({\pounds }_{\mathtt {d}x_t - u\mathtt {d}t}\), appearing in the SALT equation. This term can indeed represent an infinitesimal forecast error at every forecast time step.

To apply the proposed perturbation scheme to any specific model, the diffeomorphism parameters a and \(e_i\) must be determined specifically. Hence it is necessary to learn these parameters from existing data, experimental runs, or additional physical considerations. This framework naturally provides new perspectives to generate ensembles through constrained stochastic mappings applied in the physical space.