1 Introduction

The introduction of stochasticity in fluid dynamics has recently been the subject of intense research effort. This approach involves using random processes to model, for example, unresolved scales, or to take into account neglected physical effects. A stochastic formulation for the fluid flow introduces a probabilistic basis for modelling unresolved scales. This is different from the deterministic causal modelling which is difficult to achieve in practice due, for instance, to unknown initial conditions. In addition, some phenomena such as energy backscattering are directly accessible as stochastic processes. Another usage of stochastic modelling is to generate ensembles of realizations of the model. This facilitates the analysis of model uncertainty quantification for different low-resolution computational simulations and their usage to approximate the true state of the fluid, instead of using a single high-resolution numerical simulation.

Some stochastic schemes have been proposed in the literature by considering a variety of ad-hoc perturbations. However, a principled approach is desirable. The formulation of stochastic dynamical systems based on physical principles has recently been proposed in various settings. For a review and classification of approaches to stochastic parameterisation based on physical principles, see [4]. The present work treats two additional approaches. The first one, called stochastic advection by Lie transport (SALT), relies on the variational principle for fluid dynamics [16]. The second one, called modelling under location uncertainty (LU), is derived from Newton’s principle [23]. Both frameworks introduce stochasticity into the Lagrangian specification of the flow field, rather than directly into the Eulerian frame.

In the deterministic case, it is known that three-dimensional fluid flows may trigger a cascade of dynamics across multiple length and time scales. This multiscale behavior poses considerable challenges in the computational simulation using standard Navier-Stokes equations (see e.g. [7], [24], [5]). When modelling turbulence numerically, specialised discretisation methods are needed to decompose the underlying partial differential equations into a very large number of ordinary differential equations. Alternative approaches have been introduced where the Navier-Stokes dynamics in the Fourier space is mimicked using a finite number of variables, say \(u^1, u^2, \ldots , u^N\). The Fourier space is divided into N shells, and each shell \(\mathrm {s}_i\) comprises the set of wave vectors \( \underline {s}\) with magnitude \(| \underline {s}|\in (s_02^i, s_02^{i+1})\). Each \(u^{i}\) satisfies an ODE and it represents the magnitude of the velocity field on a length scale of \(s_i^{-1}\) ([15], [7]). The quadratic nonlinearity in the Navier-Stokes equations produces triads of interacting vector Fourier modes within each shell. Shell models involving multiple triads have had considerable success in modelling energy and helicity cascades, as well as modelling intermittency in chaotic dynamical systems [6, 10, 9]. Simplified shell models with only a few triads date back to the 1970s and have provided major insight into fluid modal interaction. Even the dynamical system representation of Euler’s fluid equations on a single triad has been quite insightful, see e.g. [28, 29].
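As a minimal illustration of the shell decomposition described above, the following Python sketch assigns a wave vector \(\underline{s}\) to its shell index via \(i=\lfloor \log_2(|\underline{s}|/s_0)\rfloor\). The function names `shell_index` and `group_into_shells` and the default value `s0 = 1.0` are illustrative assumptions; they are not taken from the references or from the code accompanying this paper.

```python
import math


def shell_index(wave_vector, s0=1.0):
    """Return the index i with s0 * 2**i <= |s| < s0 * 2**(i + 1).

    The shells in the text are open intervals; a magnitude lying exactly
    on a boundary s0 * 2**i is assigned to shell i here (an assumption).
    """
    magnitude = math.sqrt(sum(c * c for c in wave_vector))
    if magnitude == 0.0:
        raise ValueError("the zero wave vector belongs to no shell")
    return math.floor(math.log2(magnitude / s0))


def group_into_shells(wave_vectors, s0=1.0):
    """Group wave vectors into a dict {shell index: list of wave vectors}."""
    shells = {}
    for s in wave_vectors:
        shells.setdefault(shell_index(s, s0), []).append(s)
    return shells
```

In a shell model, each shell obtained in this way would then be represented by a single variable \(u^i\) rather than by its individual wave vectors.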

More recently, the problem of correctly parameterizing effects of small-scale physical processes together with the need for probabilistic ensemble forecasting and uncertainty quantification has led to modern stochastic approaches in the study of turbulence using reduced order shell models. In this work we will explore reduced order models for SALT and LU models obtained by projecting onto helical basis functions [6, 28, 29]. These helical basis functions, defined as eigenfunctions of the curl operator, enable one to construct reduced order stochastic models of fluid flow with a simplified nonlinear interaction. As we will see, under projection onto the basis of helical triad modes, both LU and SALT result in the same reduced order model and this projected model conserves helicity, but it does not conserve energy. Because of this coincidence in projecting the SALT and LU models onto the helical basis, a second reduced order scheme with a strong energy conservation property inspired by [17] and known as the energy preserving stochastic triad (EST) model will be proposed for comparison.

While the EST model is not of transport type, it will provide a comparison between two different classes of stochastic dynamical systems. The two classes treated here are (1) the helicity preserving stochastic triad (HST) (comprising both LU and SALT on the helical basis) and (2) the energy preserving stochastic triad (EST) of [17] projected onto the helical basis. The solution behaviour of the HST model will be compared to that of the EST model for several data assimilation objectives formulated on the helical triad modes. For classical deterministic models one obtains a system of ordinary differential equations. For stochastic dynamics, however, a set of stochastic differential equations (SDEs) is obtained [8, 14, 25].

The goal of the data assimilation procedure in this context is twofold: firstly, it is used to calibrate the uncertainty of the model (the amplitude of the noise). Secondly, once the calibration is complete, the particle filtering methodology can be used to reduce the uncertainty. We want the distribution of the fluctuations to be properly approximated. In the absence of stochasticity, all particles would go in the same direction and the initial spread would rapidly disappear because of the hyperbolic character of the model. In the absence of a reasonable spread, the particle filter methodology will eventually collapse. For this reason, we need to introduce stochasticity into the system that correctly characterises the fluctuation dynamics. In particular, we want to find the type of noise amplitude and the stochastic parameters for which the distribution of the output samples is reasonably uniform.

Structure of the Paper

In Sect. 2 we introduce the triad models for incompressible flows modelled by the Euler equation in its deterministic and stochastic form. To this end, we introduce the stochastic parametrisation paradigms. Building upon these models for the 3D Euler equation, we then present reduced order triad models derived from the original equations. The derivation follows the classical approach for triad models from the literature, which has been successfully employed in the deterministic case. Our full derivations, including those for the stochastically parametrised models, can be found in Appendix 2. The data assimilation experiments are carried out in Sect. 3. We first briefly explain the standard particle filter methodology, and then in Sect. 3.1 we present the results of the main numerical studies in this work. These are:

  • The model realisations of the stochastic models which we propose in this work are presented in Sect. 3.1.3 for different realisations of the noise. Further, we numerically confirm here the physical conservation properties of the models that motivated their theoretical design.

  • The model statistics for a large number of stochastic realisations are presented in Sect. 3.1.4. Crucially, the evolution of the model statistics in time reveals differences between the two stochastic models on the level of the individual triad energies which go beyond the conservation properties. Moreover, we show here the (in-)stability of either model with respect to large noise parameters.

  • In Sect. 3.1.5 we exhibit the results of the data assimilation performed using the transition kernels derived from the theoretical models developed in this paper. We show here that using the stochastic transition kernels associated with our proposed stochastic models improves the particle filtering procedure and produces efficient ensemble evolutions that are well-suited for data assimilation purposes.

In Sect. 4 we describe our conclusions on this topic. We conclude the paper with a number of appendices: in Appendix 1 one can find a list of notations and standard identities, in Appendix 2 we present a detailed derivation of shell models (deterministic and stochastic), in Appendix 3 we introduce some supplementary numerics related to the noise amplitude calibration.

Code Availability

The code corresponding to the numerical experiments in this paper is archived in [22]. The GitHub repository is located at https://github.com/alobbe/stochastic-triads.

2 Reduced Order Models for Incompressible Fluids

2.1 Reduced Order Models for the 3D Euler Equation

The 3D Euler equations model incompressible inviscid fluid dynamics. These equations may be written by using the Leray operator \(\mathcal {P}\) to project onto the divergence-free part of its operand as

$$\displaystyle \begin{aligned} \begin{aligned} \frac{{\partial} \mathbf{v}}{{\partial} t} &= \mathcal{P}\Big( \mathbf{v} \times \mathrm{curl}\mathbf{v}\Big) \\&= \mathcal{P} \bigg( \frac{\delta E}{\delta \mathbf{v}} \times \frac{\delta C}{\delta \mathbf{v}} \bigg) \\ \mbox{with conserved Energy } E( \mathbf{v}) &= \int_{\mathbb{R}^3} \frac{1}{2}\mathcal{P}\mathbf{v}\cdot \mathbf{v}\mathit{d}^{\mathit{3}}\mathit{x} \\ \mbox{and conserved helicity } C( \mathbf{v}) &= \frac{1}{2}\int_{\mathbb{R}^3} \mathbf{v} \cdot \mathrm{curl}\,\mathbf{v}\, d^3x \, \end{aligned} {} \end{aligned} $$
(1)

where \(\delta /\delta \mathbf {v}\) represents variational derivative with respect to the fluid velocity \(\mathbf {v}\).

Following [9, 10] we use a Galerkin expansion in orthogonal vector modes that are eigenfunctions of the curl operator. Assume the fluid is contained in a periodic box \(\mathcal {D}\subset \mathbb {R}^3\) of side length \(L>0\). Then the velocity \({\mathbf {v}}\) and vorticity \(\boldsymbol {\omega }:=\nabla \times {\mathbf {v}}\) may be expanded in circularly polarised or helical modes \({\mathbf {h}}_{\pm }({\mathbf {k}})\exp (i{\mathbf {k}}\cdot {\mathbf {x}})\), with the wave vectors \({\mathbf {k}}\in \mathcal {K}:=(2\pi /L)\mathbb {Z}^3\). The modes shall be orthogonal, i.e.

$$\displaystyle \begin{aligned} \int_{\mathcal{D}}{\mathbf{h}}_{s_p}({\mathbf{p}})\exp(i{\mathbf{p}}\cdot{\mathbf{x}})\cdot [{\mathbf{h}}_{s_q}({\mathbf{q}})\exp(i{\mathbf{q}}\cdot{\mathbf{x}})]^* \,\mathrm{d}{\mathbf{x}} = C\delta_{{\mathbf{p}},{\mathbf{q}}}\delta_{s_p,s_q}\,;\; C>0 \text{ const}. \end{aligned} $$
(2)

The complex vector amplitudes \({\mathbf {h}}_{\pm }({\mathbf {k}})\) should satisfy \({\mathbf {k}}\cdot {\mathbf {h}}_{\pm }({\mathbf {k}})=0\) and \(i{\mathbf {k}}\times {\mathbf {h}}_{\pm }({\mathbf {k}})=\pm |{\mathbf {k}}|{\mathbf {h}}_{\pm }({\mathbf {k}})\). A convenient choice of basis for the \({\mathbf {h}}_{\pm }({\mathbf {k}})\) is then given by

$$\displaystyle \begin{aligned} {\mathbf{h}}_{\pm}({\mathbf{k}})&:=\boldsymbol{\nu}\times\boldsymbol{\kappa}\pm i\boldsymbol{\nu}, \; \text{ with } \; \boldsymbol{\kappa}:={\mathbf{k}}/k, \; \boldsymbol{\nu} := {\mathbf{k}}\times\boldsymbol{\Gamma}/|{\mathbf{k}}\times\boldsymbol{\Gamma}|,\; \boldsymbol{\Gamma}:=\text{ const}, {} \end{aligned} $$
(3)

for which \(|{\mathbf {h}}_\pm ({\mathbf {k}})|{ }^2:={\mathbf {h}}_\pm ({\mathbf {k}})\cdot {\mathbf {h}}_\pm ({\mathbf {k}})^*=2\) and \({\mathbf {h}}_{\pm }({\mathbf {k}})\cdot {\mathbf {h}}_{\mp }({\mathbf {k}})^*=0\).

At this point, one notices the key feature of the helical modes \({\mathbf {h}}_{\pm }({\mathbf {k}})\exp (i{\mathbf {k}}\cdot {\mathbf {x}})\) which greatly simplifies the analysis of modal expansions of the 3D Euler and related equations, such as 3D Navier-Stokes. Namely, the helical modes \({\mathbf {h}}_{\pm }({\mathbf {k}})\exp (i{\mathbf {k}}\cdot {\mathbf {x}})\) are divergence-free eigenfunctions of the curl operator. Specifically,

$$\displaystyle \begin{aligned} \nabla \cdot {\mathbf{h}}_s({\mathbf{k}}) e^{i{\mathbf{k}}\cdot{\mathbf{x}}} = i{\mathbf{k}}\cdot{\mathbf{h}}_s({\mathbf{k}}) e^{i{\mathbf{k}}\cdot{\mathbf{x}}} = 0 \end{aligned} $$
(4)

and

$$\displaystyle \begin{aligned} \nabla \times {\mathbf{h}}_s({\mathbf{k}}) e^{i{\mathbf{k}}\cdot{\mathbf{x}}} = i{\mathbf{k}}\times{\mathbf{h}}_s({\mathbf{k}})e^{i{\mathbf{k}}\cdot{\mathbf{x}}} = s |{\mathbf{k}}|{\mathbf{h}}_s({\mathbf{k}})e^{i{\mathbf{k}}\cdot{\mathbf{x}}} . {} \end{aligned} $$
(5)
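The algebraic properties of the basis (3) can be checked numerically. The sketch below builds \({\mathbf {h}}_{\pm }({\mathbf {k}})\) with NumPy for a given wave vector; the function name `helical_basis` and the default choice \(\boldsymbol {\Gamma }=(0,0,1)\) are assumptions made for illustration only. Note that the construction requires \({\mathbf {k}}\) not to be parallel to \(\boldsymbol {\Gamma }\).

```python
import numpy as np


def helical_basis(k, gamma=(0.0, 0.0, 1.0)):
    """Return (h_plus, h_minus) from Eq. (3) for a real wave vector k."""
    k = np.asarray(k, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    k_norm = np.linalg.norm(k)
    k_cross_gamma = np.cross(k, gamma)
    cross_norm = np.linalg.norm(k_cross_gamma)
    if k_norm == 0.0 or cross_norm <= 1e-12 * k_norm * np.linalg.norm(gamma):
        raise ValueError("k must be nonzero and not parallel to gamma")
    kappa = k / k_norm                  # unit vector along k
    nu = k_cross_gamma / cross_norm     # unit vector orthogonal to k
    real_part = np.cross(nu, kappa)     # nu x kappa, also a unit vector
    return real_part + 1j * nu, real_part - 1j * nu
```

For example, for \({\mathbf {k}}=(1,2,2)\) with \(|{\mathbf {k}}|=3\), one can confirm with `np.cross` and `np.dot` that `1j * np.cross(k, h_plus)` equals `3 * h_plus`, that `np.dot(k, h_plus)` vanishes, and that `np.dot(h_plus, np.conj(h_plus))` equals 2, in agreement with (4), (5) and the normalisation stated above. The dot product is taken without conjugation, as in the text, so the conjugate is applied explicitly.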

See [9, 10, 28, 29] for more information about how this Galerkin decomposition into divergence-free eigenfunctions of the curl is used as a standard tool in the analysis of 3D solution behaviour of the deterministic Euler and Navier-Stokes fluid equations. In particular, the helical mode expansions in Eqs. (6) are the source of the popular shell models as finite-dimensional expansions of turbulent fluid dynamics. Thus, this expansion provides a useful framework for studying low-dimensional stochastic models of 3D Navier-Stokes turbulence.

In terms of the basis of helical modes \({\mathbf {h}}_{\pm }({\mathbf {k}})\exp (i{\mathbf {k}}\cdot {\mathbf {x}})\), the divergence-free fluid velocity \({\mathbf {v}}({\mathbf {x}},t)\) and vorticity \(\boldsymbol {\omega }({\mathbf {x}},t)\) are expressed in [28, 29] in terms of complex vector amplitudes \({\mathbf {u}}({\mathbf {k}},t),\boldsymbol {\varpi }({\mathbf {k}},t)\in \mathbb {C}^3\), respectively,

$$\displaystyle \begin{aligned} \begin{aligned} {\mathbf{v}}({\mathbf{x}},t) &:= \sum_{{\mathbf{p}}} {\mathbf{u}}({\mathbf{p}},t)e^{i{\mathbf{p}}\cdot{\mathbf{x}}} := \sum_{{\mathbf{p}}} \sum_{s_p=\pm} a_{s_p}({\mathbf{p}},t){\mathbf{h}}_{s_p}({\mathbf{p}}) e^{i{\mathbf{p}}\cdot{\mathbf{x}}} \,,\\ \boldsymbol{\omega}({\mathbf{x}},t) &:= \sum_{{\mathbf{q}}} \boldsymbol{\varpi}({\mathbf{q}},t)e^{i{\mathbf{q}}\cdot{\mathbf{x}}} :=\sum_{{\mathbf{q}}} \sum_{s_q=\pm} s_q|{\mathbf{q}}|\, a_{s_q}({\mathbf{q}},t){\mathbf{h}}_{s_q}({\mathbf{q}}) e^{i{\mathbf{q}}\cdot{\mathbf{x}}} \,. \end{aligned} {} \end{aligned} $$
(6)

Here, the choice

$$\displaystyle \begin{aligned} {\mathbf{u}}({\mathbf{k}},t) := a_+({\mathbf{k}},t){\mathbf{h}}_+({\mathbf{k}}) + a_-({\mathbf{k}},t){\mathbf{h}}_-({\mathbf{k}}) = \sum_{s_k=\pm} a_{s_k}({\mathbf{k}},t){\mathbf{h}}_{s_k}({\mathbf{k}}), {} \end{aligned} $$
(7)

with \(a_s^*({\mathbf {k}}) = a_s( - {\mathbf {k}}) \) [29], was made, so that (5) implies

$$\displaystyle \begin{aligned} \boldsymbol{\varpi}({\mathbf{k}},t):= |{\mathbf{k}}| \Big(a_+({\mathbf{k}},t){\mathbf{h}}_+({\mathbf{k}}) - a_-({\mathbf{k}},t){\mathbf{h}}_-({\mathbf{k}})\Big) = |{\mathbf{k}}|\!\!\sum_{s_k=\pm} s_k\, a_{s_k}({\mathbf{k}},t){\mathbf{h}}_{s_k}({\mathbf{k}}). {} \end{aligned} $$
(8)

The conservation laws for the Euler fluid kinetic energy and helicity—expressed as integrals over the spatially periodic box \({\mathcal D}\)—can be evaluated in Fourier space via Parseval’s theorem, as follows,

$$\displaystyle \begin{aligned} \begin{aligned} \frac{1}{2}\int_{\mathcal D} |{\mathbf{v}}({\mathbf{x}},t)|{}^2 \, d^3x &{=} \frac{1}{2} \sum_{{\mathbf{k}}} {\mathbf{u}}({\mathbf{k}},t)\cdot {\mathbf{u}}^*({\mathbf{k}},t) {=} \sum_{{\mathbf{k}}} \sum_{s_k=\pm} a_{s_k}({\mathbf{k}},t)\cdot a^*_{s_k}({\mathbf{k}},t) \,,\\ \int_{\mathcal D} {\mathbf{v}}({\mathbf{x}},t)\cdot \mathrm{curl} {\mathbf{v}}({\mathbf{x}},t) \, d^3x &= \sum_{{\mathbf{k}}} {\mathbf{u}}({\mathbf{k}},t) \cdot \boldsymbol{\varpi}^*({\mathbf{k}},t) \\ &= \sum_{{\mathbf{k}}}\sum_{s_k=\pm} ks_k \,a_{s_k}({\mathbf{k}},t)a^*_{s_k}({\mathbf{k}},t) \,{\mathbf{h}}_{s_k}({\mathbf{k}}) \cdot {\mathbf{h}}^*_{s_k}({\mathbf{k}}) \\ &= 2 \sum_{{\mathbf{k}}}\sum_{s_k=\pm} ks_k \,a_{s_k}({\mathbf{k}},t)a^*_{s_k}({\mathbf{k}},t) \,. \end{aligned} {} \end{aligned} $$
(9)

Expanding the terms of the Euler equation in curl form (57), we obtain the Euler equations for the coefficients \(a_{s_k}({\mathbf {k}},t)\). For all \({\mathbf {k}}\in \mathcal {K}\), \(s_k\in \{+,-\}\) we have

$$\displaystyle \begin{aligned} \partial_t a_{s_k}({\mathbf{k}}, t) {=} -\frac{1}{4} \sum_{{\mathbf{p}}+{\mathbf{q}}+{\mathbf{k}}=0}\sum_{s_p, s_q} (s_p|{\mathbf{p}}|-s_q|{\mathbf{q}}|)a_{s_p}^*({\mathbf{p}}, t)a_{s_q}^*({\mathbf{q}}, t){\mathbf{h}}_{s_p}^*({\mathbf{p}}){\times}{\mathbf{h}}_{s_q}^*({\mathbf{q}})\cdot {\mathbf{h}}_{s_k}^*({\mathbf{k}}). {} \end{aligned} $$
(10)

For the explicit derivation of Eq. (10) see section “Deterministic Euler” in Appendix 2.

The elementary interactions in Fourier space take place between triads of wave vectors such that \({\mathbf {k}} + {\mathbf {p}} + {\mathbf {q}} = 0\), as exemplified in Eq. (10) above. There are two degrees of freedom per wave vector, \((a_+ ,a_-)\), so eight different types of interaction are allowed according to the value of the triplet \((s_k ,s_p ,s_q ) = ( \pm 1, \pm 1, \pm 1)\). Consider a fixed triple of wave vectors \({\mathbf {k}}, {\mathbf {p}}, {\mathbf {q}}\in \mathcal {K}\) such that \({\mathbf {k}}+{\mathbf {p}}+{\mathbf {q}}=0\) and a fixed triple \(s_k,s_p,s_q\in \{+,-\}\). This gives rise to three coefficients \(a_{s_k}({\mathbf {k}},t), a_{s_p}({\mathbf {p}},t), a_{s_q}({\mathbf {q}},t)\), which we compactly summarise into the complex vector

$$\displaystyle \begin{aligned} {\mathbf{a}} = (a_{s_k}, a_{s_p}, a_{s_q})\in\mathbb{C}^3\,. \end{aligned}$$

The dynamics of \({\mathbf {a}}\) is determined by the three equations obtained from (10)

$$\displaystyle \begin{aligned} \frac{{\mathrm{d}} a_{s_k}}{{\mathrm{d}} t} &= g(s_p|{\mathbf{p}}| - s_q|{\mathbf{q}}|) a_{s_p}^*a_{s_q}^*+ R, \end{aligned} $$
(11)
$$\displaystyle \begin{aligned} \frac{{\mathrm{d}} a_{s_p}}{{\mathrm{d}} t} &= g(s_q|{\mathbf{q}}| - s_k|{\mathbf{k}}|) a_{s_q}^*a_{s_k}^*+R, \end{aligned} $$
(12)
$$\displaystyle \begin{aligned} \frac{{\mathrm{d}} a_{s_q}}{{\mathrm{d}} t} &= g(s_k|{\mathbf{k}}| - s_p|{\mathbf{p}}|) a_{s_k}^*a_{s_p}^*+R, \end{aligned} $$
(13)

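Here R denotes the remaining terms of the sums in (10). Retaining only the summand associated with the chosen triad and cycling through \(k,p,q\) yields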

$$\displaystyle \begin{aligned} \frac{{\mathrm{d}} a_{s_k}}{{\mathrm{d}} t} &= g(s_p|{\mathbf{p}}| - s_q|{\mathbf{q}}|) a_{s_p}^*a_{s_q}^*, \end{aligned} $$
(14)
$$\displaystyle \begin{aligned} \frac{{\mathrm{d}} a_{s_p}}{{\mathrm{d}} t} &= g(s_q|{\mathbf{q}}| - s_k|{\mathbf{k}}|) a_{s_q}^*a_{s_k}^*, \end{aligned} $$
(15)
$$\displaystyle \begin{aligned} \frac{{\mathrm{d}} a_{s_q}}{{\mathrm{d}} t} &= g(s_k|{\mathbf{k}}| - s_p|{\mathbf{p}}|) a_{s_k}^*a_{s_p}^*, \end{aligned} $$
(16)

with the constant complex scalar

$$\displaystyle \begin{aligned} g := -\frac{1}{4}{\mathbf{h}}_{s_p}^*({\mathbf{p}})\times{\mathbf{h}}_{s_q}^*({\mathbf{q}})\cdot {\mathbf{h}}_{s_k}^*({\mathbf{k}}). {} \end{aligned} $$
(17)
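A hedged numerical illustration of (17): the sketch below evaluates g for a closed triad using the basis (3) and can be used to check that g is unchanged when the whole triad is rescaled. The helper names `helical_vector` and `coupling_g`, and the choice \(\boldsymbol {\Gamma }=(0,0,1)\), are assumptions introduced here, not notation from the references.

```python
import numpy as np


def helical_vector(k, sign, gamma=(0.0, 0.0, 1.0)):
    """h_s(k) from Eq. (3); sign is +1 or -1, k must not be parallel to gamma."""
    k = np.asarray(k, dtype=float)
    kappa = k / np.linalg.norm(k)
    nu = np.cross(k, np.asarray(gamma, dtype=float))
    nu = nu / np.linalg.norm(nu)
    return np.cross(nu, kappa) + sign * 1j * nu


def coupling_g(k, p, q, signs, gamma=(0.0, 0.0, 1.0)):
    """g = -(1/4) (h_p^* x h_q^*) . h_k^* for a triad with k + p + q = 0."""
    k, p, q = (np.asarray(v, dtype=float) for v in (k, p, q))
    if not np.allclose(k + p + q, 0.0):
        raise ValueError("the wave vectors must form a closed triad")
    s_k, s_p, s_q = signs
    h_k = helical_vector(k, s_k, gamma)
    h_p = helical_vector(p, s_p, gamma)
    h_q = helical_vector(q, s_q, gamma)
    # The dot product is bilinear (no implicit conjugation), as in the text.
    return -0.25 * np.dot(np.cross(np.conj(h_p), np.conj(h_q)), np.conj(h_k))
```

Calling `coupling_g(k, p, q, signs)` and `coupling_g(c * k, c * p, c * q, signs)` for any scale factor `c > 0` should return the same complex number, because the unit vectors \(\boldsymbol {\kappa }\) and \(\boldsymbol {\nu }\) in (3) do not depend on the magnitude of the wave vector.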

The equations corresponding to the single triad interaction of type \((s_k, s_p, s_q)\) with \({\mathbf {k}} + {\mathbf {p}}+{\mathbf {q}}=0\) thus have the complex vector form also derived in [28],

$$\displaystyle \begin{aligned} \frac{{\mathrm{d}}{\mathbf{a}}}{{\mathrm{d}} t} = g {\mathbf{a}}^* \times \mathbb{D}{\mathbf{a}}^* = g ({\mathbf{a}} \times \mathbb{D}{\mathbf{a}})^*, {} \end{aligned} $$
(18)

with the constant diagonal matrix

$$\displaystyle \begin{aligned} \mathbb{D}:=\mathrm{diag}\big(s_k|{\mathbf{k}}|, s_p|{\mathbf{p}}|, s_q|{\mathbf{q}}| \big). \end{aligned} $$
(19)

The factor g defined in (17) above can be calculated from (3), which shows that it depends on the shape and the orientation of the wave-vector triad, but not on its scale, since the real and imaginary parts of the complex helical vector amplitudes \({\mathbf {h}}_{s_k}\) are unit vectors. Moreover, \(\mathbb {D}{\mathbf {a}}\) can be seen to represent the \((s_k, s_p, s_q)\) components of the vorticity vector amplitude \(\boldsymbol {\varpi }\) through Eq. (8) above. Two conservation laws for the real-valued triad energy and helicity follow immediately from Eq. (18), as

$$\displaystyle \begin{aligned} \frac{d}{dt}({\mathbf{a}}\cdot{\mathbf{a}}^*) = 0 \quad \mbox{and}\quad \frac{d}{dt}({\mathbf{a}}\cdot\mathbb{D}{\mathbf{a}}^*) = 0 \,. {} \end{aligned} $$
(20)
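The two identities in (20) can also be verified in components from Eqs. (14) to (16). The following short computation is a sketch; the shorthand \(T:=\mathrm {Re}\big (g\,a_{s_k}^*a_{s_p}^*a_{s_q}^*\big )\) is introduced here for convenience only. Since \(\mathbb {D}\) is real, the modal energy of the first mode satisfies

$$\displaystyle \begin{aligned} \frac{d}{dt}|a_{s_k}|^2 = 2\,\mathrm{Re}\Big(a_{s_k}^*\frac{{\mathrm{d}} a_{s_k}}{{\mathrm{d}} t}\Big) = 2\,(s_p|{\mathbf{p}}| - s_q|{\mathbf{q}}|)\,T, \end{aligned} $$

and the other two modes follow by cyclic permutation, with the same factor T. Summing the three contributions, without and with the weights \(s_k|{\mathbf {k}}|, s_p|{\mathbf {p}}|, s_q|{\mathbf {q}}|\), gives

$$\displaystyle \begin{aligned} \frac{d}{dt}({\mathbf{a}}\cdot{\mathbf{a}}^*) &= 2T\Big[(s_p|{\mathbf{p}}| - s_q|{\mathbf{q}}|) + (s_q|{\mathbf{q}}| - s_k|{\mathbf{k}}|) + (s_k|{\mathbf{k}}| - s_p|{\mathbf{p}}|)\Big] = 0, \\ \frac{d}{dt}({\mathbf{a}}\cdot\mathbb{D}{\mathbf{a}}^*) &= 2T\Big[s_k|{\mathbf{k}}|(s_p|{\mathbf{p}}| - s_q|{\mathbf{q}}|) + s_p|{\mathbf{p}}|(s_q|{\mathbf{q}}| - s_k|{\mathbf{k}}|) + s_q|{\mathbf{q}}|(s_k|{\mathbf{k}}| - s_p|{\mathbf{p}}|)\Big] = 0. \end{aligned} $$

Both cancellations hold for any complex value of g.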

The dynamical system in Eq. (18) is similar to rigid body dynamics, with the real angular momentum and angular velocity replaced by the complex angular momentum \(({\mathbf {a}})\) and the complex angular velocity \((\mathbb {D}{\mathbf {a}})\), and with the real moment of inertia \(\mathbb {I}=\mathbb {D}^{-1}\) relating the two complex quantities (see Footnote 1).
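The conservation laws (20) can be observed numerically. The sketch below integrates the single-triad equations in the component form (14) to (16) with a classical fourth-order Runge-Kutta step. The values of `g`, `D` and the initial amplitudes used in any run are free choices, and the function names are illustrative assumptions rather than part of the code accompanying this paper.

```python
import numpy as np


def triad_rhs(a, g, D):
    """Right-hand side of Eqs. (14)-(16); D = (s_k|k|, s_p|p|, s_q|q|)."""
    ac = np.conj(a)
    return g * np.array([
        (D[1] - D[2]) * ac[1] * ac[2],
        (D[2] - D[0]) * ac[2] * ac[0],
        (D[0] - D[1]) * ac[0] * ac[1],
    ])


def rk4_step(a, dt, g, D):
    """One classical Runge-Kutta step for the deterministic triad."""
    k1 = triad_rhs(a, g, D)
    k2 = triad_rhs(a + 0.5 * dt * k1, g, D)
    k3 = triad_rhs(a + 0.5 * dt * k2, g, D)
    k4 = triad_rhs(a + dt * k3, g, D)
    return a + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(a0, g, D, dt=1e-3, n_steps=2000):
    """Integrate the triad from a0 and return the final amplitudes."""
    a = np.array(a0, dtype=complex)
    for _ in range(n_steps):
        a = rk4_step(a, dt, g, D)
    return a


def triad_energy(a):
    """a . a^*, the triad energy of Eq. (20)."""
    return float(np.sum(np.abs(a) ** 2))


def triad_helicity(a, D):
    """a . D a^*, the triad helicity of Eq. (20)."""
    return float(np.sum(D * np.abs(a) ** 2))
```

Comparing `triad_energy` and `triad_helicity` before and after a call to `integrate` should show changes at the level of the time-stepping error only, since the Runge-Kutta scheme does not preserve the invariants exactly.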

2.2 Stochastic Parametrizations for the 3D Euler Equation

In the following we introduce a reduced order model for the stochastic parametrizations introduced through the Stochastic Advection by Lie Transport paradigm as well as the Location Uncertainty paradigm. We explain below the rationale of these parametrizations:

2.2.1 Modelling Under the Stochastic Advection by Lie Transport Principle

The SALT equations were derived in [16] using a Stratonovich stochastic version of Hamilton’s variational principle [18] in combination with Kraichnan’s scalar turbulence model based on Stratonovich stochastic Lagrangian paths [20]. The application of Hamilton’s principle with an imposed stochastic Lie transport constraint implied an Euler-Poincaré equation for the fluid motion [18]. The 3D SALT Euler equations for divergence-free fluid velocity \({\mathbf {v}}({\mathbf {x}},t)\) are given by

$$\displaystyle \begin{aligned} \begin{aligned} &\,\mathrm{d} {\mathbf{v}} + (\mathrm{d} {\mathbf{x}}_t \cdot \nabla) {\mathbf{v}} + v_j\nabla \,\mathrm{d} {\mathbf{x}}_t^j = - \nabla \,\mathrm{d} p \,,\\ &\quad \mbox{with}\quad \,\mathrm{d} {\mathbf{x}}_t = {\mathbf{v}}{dt} + \sum_i \boldsymbol{\xi}_i({\mathbf{x}})\circ \,\mathrm{d} W_t^i\,. \end{aligned} {} \end{aligned} $$
(21)

As discussed in [16], this motion equation yields a Kelvin-Noether circulation theorem for the stochastic system

$$\displaystyle \begin{aligned} \,\mathrm{d}\oint_{c({\mathbf{x}}_t)} {\mathbf{v}}\cdot d{\mathbf{x}} = -\oint_{c({\mathbf{x}}_t)}\nabla \,\mathrm{d} p \cdot d{\mathbf{x}} = 0\,. {} \end{aligned} $$
(22)

This stochastic Kelvin circulation theorem has the same form as that for the deterministic system, except that each line element of the material loop in Kelvin’s theorem follows the Stratonovich stochastic Lagrangian path, \({\mathbf {x}}_t\).

The real vectors \(\boldsymbol {\xi }_i\) comprise the time-independent noise amplitudes which are to be determined from data assimilation. The \(W^i\) are independent (one-dimensional) standard Brownian motions and \(\circ \) denotes stochastic integration in the Stratonovich sense (see Footnote 2). The curl form of the SALT Euler motion equation in (21) is obtained from (56) and given by

$$\displaystyle \begin{aligned} \,\mathrm{d} {\mathbf{v}} - \,\mathrm{d} {\mathbf{x}}_t \times \mathrm{curl}{\mathbf{v}} + \nabla ({\mathbf{v}} \cdot \,\mathrm{d} {\mathbf{x}}_t) = -\nabla \,\mathrm{d} p. {} \end{aligned} $$
(23)

The motion equation (23) and its curl, which yields the SALT vorticity equation, together imply a formula for the evolution of the helicity of the flow, \(\Lambda \), defined as

$$\displaystyle \begin{aligned} \Lambda:=\int_{\mathcal D} {\mathbf{v}} \cdot \mathrm{curl}{\mathbf{v}} \,d^3x\,. {} \end{aligned} $$
(24)

Upon applying the divergence theorem, one finds

$$\displaystyle \begin{aligned} \,\mathrm{d} \Lambda = - \int_{{\partial}\mathcal D} \widehat{\mathbf{n}}\,\cdot \Big(({\mathbf{v}} \cdot \mathrm{curl}{\mathbf{v}}) \,\mathrm{d} {\mathbf{x}}_t + \mathrm{curl}{\mathbf{v}} \, \,\mathrm{d} p \Big)\,dS\,. {} \end{aligned} $$
(25)

Thus, in a periodic 3D domain, or in an infinite 3D domain with asymptotically vanishing boundary conditions, the SALT motion equation in (21) or (23) preserves the helicity, \(\Lambda \), defined in (24). However, a glance at the SALT motion equation in (23) informs us that it will not preserve the kinetic energy, since, even with the usual fluid boundary conditions, \(\mathrm {div}{\mathbf {v}}=0\) removes only the gradient terms from the energy balance, leaving

$$\displaystyle \begin{aligned} \tfrac12\,\mathrm{d} \|{\mathbf{v}}\|{}_{L^2}^2=\int_{\mathcal D} {\mathbf{v}} \cdot \,\mathrm{d} {\mathbf{x}}_t \times\mathrm{curl}{\mathbf{v}} \,d^3x\ne0\,. {} \end{aligned} $$
(26)

2.2.2 Modelling Under the Location Uncertainty Principle

The Location Uncertainty principle consists in decomposing the flow trajectory \(\mathbf {x}\colon \Omega \times \mathbb {R}^{+}\rightarrow \Omega \) over a bounded domain, \(\Omega \subset \mathbb {R}^{3}\)

$$\displaystyle \begin{aligned} {} \mathrm{d}{\mathbf{x}}_{t} = \mathbf{v}\left({\mathbf{x}}_{t},t\right)\,\mathrm{d} t + \boldsymbol{\sigma}\left({\mathbf{x}}_{t},t\right)\mathrm{d}{\mathbf{W}}_{t} \end{aligned} $$
(27)

in terms of \({\mathbf {v}}\left ({\mathbf {x}}_{t},t\right )\), a smooth-in-time component of the (Lagrangian) velocity, and a noise term \(\boldsymbol {\sigma }\left ({\mathbf {x}}_{t},t\right )\mathrm {d}{\mathbf {W}}_{t}\), which is to be understood here in the Itô sense and which accounts for the unresolved processes. The Wiener process \({\mathbf {W}}_{t}\) is an H-valued (cylindrical) Brownian motion, where H is the Hilbert space of square integrable functions. The noise is then properly defined as the application of a symmetric Hilbert-Schmidt integral kernel \(\boldsymbol {\sigma }_t\boldsymbol {f}\left (\boldsymbol {x}\right )=\int _{{\mathcal S}}\breve {\sigma }\left (\boldsymbol {x},\boldsymbol {y},t\right )\boldsymbol {f}\left (\boldsymbol {y}\right )\,\mathrm {d}\boldsymbol {y}\) to the H-valued cylindrical Wiener process \(\mathbf {W}\) as

$$\displaystyle \begin{aligned} {} \left(\boldsymbol{\sigma}_t\mathrm{d}{\mathbf{W}}_{t}\right)^{i} \left(\boldsymbol{x}\right)= \int_{{\mathcal S}} \breve{\sigma}_{ik}\left(\boldsymbol{x},\boldsymbol{y},t\right) \mathrm{d} W_{t}^{k}\left(\boldsymbol{y}\right) \mathrm{d}\boldsymbol{y}, \end{aligned} $$
(28)

The role of the integrable kernel \(\breve {\sigma }\) is to impose a spatial correlation on the small-scale component. It leads to the covariance tensor \(\boldsymbol {Q}\)

$$\displaystyle \begin{aligned} \begin{array}{rcl} Q_{ij}\left(\boldsymbol{x},\boldsymbol{y},t,s\right) & =&\displaystyle \mathbb{E}\left[ \left(\boldsymbol{\sigma}_t \mathrm{d}{\mathbf{W}}_{t}\left(\boldsymbol{x}\right)\right)^{i} \left(\boldsymbol{\sigma}_s\mathrm{d}{\mathbf{W}}_{s}\left(\boldsymbol{y}\right) \right) ^{j} \right] \\ & =&\displaystyle \delta\left(t-s\right)\mathrm{d}t \int_{{\mathcal S}} \breve{\sigma}_{ik}\left(\boldsymbol{x},\boldsymbol{z},t\right) \breve{\sigma}_{kj}\left(\boldsymbol{z},\boldsymbol{y},s\right) \,\mathrm{d}\boldsymbol{z}, \end{array} \end{aligned} $$

of the centered Gaussian process \(\boldsymbol {\sigma }_{t}\mathrm {d}{\mathbf {W}}_{t} \sim \mathcal {N}\left ( 0,\mathbf {Q}\mathrm {d}t \right )\). The diagonal part of the covariance tensor per unit of time, referred to as the variance tensor \(\mathbf {a}\), is a positive definite matrix defined by \(\mathbf {a} (\mathbf {x}, t)\delta (t-t')\mathrm {d} t = \mathbf {Q}(\mathbf {x},\mathbf {x},t,t')\); it quantifies the strength of the noise and has the dimension of a viscosity, \(\mathrm {m}^2\mathrm {s}^{-1}\). Since Q is a compact, self-adjoint, positive definite operator on H, it admits eigenfunctions \(\boldsymbol {\xi }_{n}\left (\boldsymbol {\cdot },t\right )\) with (strictly) positive eigenvalues \(\lambda _{n}\left (t\right )\) satisfying \(\sum _{n\in \mathbb {N}}\lambda _{n}\left (t\right )<+\infty \). As a consequence, the noise and the variance tensor \(\boldsymbol {a}\) can be expressed through the spectral representation

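$$\displaystyle \begin{aligned} \begin{array}{rcl} \boldsymbol{\sigma}_t\mathrm{d}{\mathbf{W}}_{t} \left(\boldsymbol{x} \right) & =&\displaystyle \sum_{n\in\mathbb{N}}\lambda_n^{1/2}\left(t\right)\boldsymbol{\xi}_{n}\left(\boldsymbol{x},t\right)\mathrm{d}\beta_{n}, \\ \boldsymbol{a}\left(\boldsymbol{x},t\right) & =&\displaystyle \sum_{n\in\mathbb{N}}\lambda_n\left(t\right)\boldsymbol{\xi}_{n}\left(\boldsymbol{x},t\right)\boldsymbol{\xi}_{n}^{\scriptscriptstyle T}\left(\boldsymbol{x},t\right), \end{array} \end{aligned} $$
(29)
(30)

where the \(\beta _{n}\) are independent standard one-dimensional Brownian motions.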
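The spectral representation suggests a simple way to sample the noise once a finite number of modes is retained. The following sketch assumes a truncation to N modes, time-independent eigenpairs sampled at M spatial points, and the standard expression \(\boldsymbol {a}=\sum _n\lambda _n\boldsymbol {\xi }_n\boldsymbol {\xi }_n^{\scriptscriptstyle T}\) for the variance tensor; the array layout and the function names are illustrative assumptions.

```python
import numpy as np


def noise_increment(lam, xi, dt, rng):
    """Sample sigma dW at all grid points from a truncated spectral sum.

    lam : (N,) positive eigenvalues lambda_n
    xi  : (N, M, d) eigenfunctions xi_n at M points, with d components
    dt  : time step; each d(beta_n) is drawn from N(0, dt)
    rng : numpy.random.Generator
    """
    d_beta = rng.normal(0.0, np.sqrt(dt), size=lam.shape)
    return np.einsum("n,nmd->md", np.sqrt(lam) * d_beta, xi)


def variance_tensor(lam, xi):
    """Variance tensor a(x) = sum_n lambda_n xi_n(x) xi_n(x)^T, shape (M, d, d)."""
    return np.einsum("n,nmi,nmj->mij", lam, xi, xi)
```

With these conventions, the empirical covariance of many samples of `noise_increment` at a fixed point should approach `variance_tensor(lam, xi) * dt` at that point, which is one way to sanity-check a chosen set of modes.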

The rate of change of a scalar q integrated over a volume \(V_{t}\) transported by the flow is given by the stochastic Reynolds transport theorem, introduced in [23],

$$\displaystyle \begin{aligned} {} \mathrm{d}\int_{V_{t}}q\left( \boldsymbol{x},t \right)\,\mathrm{d}\boldsymbol{x} = \int_{V_{t}} \left\lbrace\mathrm{D}_{t}q+ q\nabla\boldsymbol{\cdot}\left[ {\mathbf{v}}^{\star}\,\mathrm{d}t + \boldsymbol{\sigma}_t\mathrm{d}{\mathbf{W}}_{t} \right]\right\rbrace\left( \boldsymbol{x},t \right) \,\mathrm{d}\boldsymbol{x}, \end{aligned} $$
(31)

with the transport operator

$$\displaystyle \begin{aligned} {} \mathrm{D}_{t}q = \mathrm{d}_{t}q + \left[ {\mathbf{v}}^{\star} \,\mathrm{d}t + \boldsymbol{\sigma}_t\,\mathrm{d}{\mathbf{W}}_{t} \right] \boldsymbol{\cdot}\nabla q - \frac{1}{2}\nabla\boldsymbol{\cdot}\left(\mathbf{a}\nabla q\right) \,\mathrm{d}t. \end{aligned} $$
(32)

In this formula, the first component of the right-hand side is the increment in time at a fixed location of the process q, that is \(\mathrm {d}_{t}q=q\left ({\mathbf {x}}_{t},t+\mathrm {d}t\right ) -q\left ({\mathbf {x}}_{t},t\right )\), playing the role of a derivative in time for a non differentiable process. The effective velocity \({\mathbf {v}}^{\star }\) is defined as

$$\displaystyle \begin{aligned} {} {\mathbf{v}}^{\star} = \mathbf{v} - \frac{1}{2}\nabla\boldsymbol{\cdot}\,\mathbf{a} + \boldsymbol{\sigma}_t^{\ast}\left(\nabla\boldsymbol{\cdot}\,\boldsymbol{\sigma}_t\right), \end{aligned} $$
(33)

where the velocity component \({\mathbf {v}}_{s} = \frac {1}{2}\nabla \boldsymbol {\cdot }\,\mathbf {a}\) results from the noise inhomogeneities. For the incompressible homogeneous noise considered in this work, \({\mathbf {v}}^{\star } = \mathbf {v}\). Moreover, the diffusion term exactly balances the energy brought by the noise. With the Stratonovich convention and a homogeneous noise, the transport operator takes a simplified form similar to the material derivative:

$$\displaystyle \begin{aligned} {} \mathrm{D}_{t}q = \mathrm{d}_{t}q + ({\mathbf{v}} \,\mathrm{d}t + \boldsymbol{\sigma}_t\,\circ\mathrm{d}{\mathbf{W}}_{t}) \boldsymbol{\cdot}\nabla q. \end{aligned} $$
(34)

For a divergence free homogeneous noise, the Euler equation, in the LU framework, can then be defined as:

$$\displaystyle \begin{aligned} \mathrm{d}_{t}{\mathbf{v}} + ({\mathbf{v}} \,\mathrm{d}t + \boldsymbol{\sigma}_t\,\circ\mathrm{d}{\mathbf{W}}_{t}) \boldsymbol{\cdot}\nabla {\mathbf{v}} = -\nabla dp_t, \;\;\nabla\boldsymbol{\cdot} {\mathbf{v}} =0, \end{aligned} $$
(35)

where \(dp_t\) denotes the pressure increment, composed of a finite variation term and a martingale pressure term. With the Leray projection, \(\mathbb {P}\), this pressure term can be removed and we obtain the inertial form of the Euler equation:

$$\displaystyle \begin{aligned} \mathrm{d}_{t}{\mathbf{v}} + \mathbb{P}\bigl((\mathrm{d} {\mathbf{x}}_{t}\boldsymbol{\cdot}\nabla) {\mathbf{v}}\bigr) = 0, \quad \;\nabla\boldsymbol{\cdot} {\mathbf{v}} =0. \end{aligned} $$
(36)

2.3 Triad Model Comparison

The reduced order model for a single triad interaction of the full-scale 3D SALT Euler and 3D LU Euler equations is obtained by projecting the continuous stochastic Euler models onto the helical modes, in the same fashion as for the deterministic equation (18). To this end, we introduce an additional Stratonovich stochastic term into the transport velocity in (7) as

$$\displaystyle \begin{aligned} \,\mathrm{d} {\mathbf{x}}_t({\mathbf{k}},t) := \big(a_+({\mathbf{k}},t){\mathbf{h}}_+ + a_-({\mathbf{k}},t){\mathbf{h}}_-\big)\, \mathrm{d}t + \sum_i \big(b_+^i({\mathbf{k}}){\mathbf{h}}_+ + b_-^i({\mathbf{k}}){\mathbf{h}}_-\big)\circ dW_t^i\,, {} \end{aligned} $$
(37)

where the \({\mathbf {k}}\)-dependent complex vector \({\mathbf {b}}({\mathbf {k}}):=(b_{s_k},b_{s_p},b_{s_q})^T\in \mathbb {C}^3\) represents the time-independent noise amplitude which is to be determined from data assimilation, similar to the continuous stochastic models (21). Enumerating the equation for a single triad then yields, after rearranging using exchange symmetry in \(({\mathbf {k}},{\mathbf {p}},{\mathbf {q}})\), the matrix equation

$$\displaystyle \begin{aligned} \begin{aligned} \,\mathrm{d} \begin{bmatrix} {a}_{s_k} \\ {a}_{s_p} \\ {a}_{s_q} \end{bmatrix} = g \begin{bmatrix} 0 & -qs_q{a}_{s_q} & ps_p {a}_{s_p} \\ qs_q {a}_{s_q} & 0 & -ks_k {a}_{s_k} \\ -ps_p {a}_{s_p} & ks_k {a}_{s_k} & 0 \end{bmatrix}^* \begin{bmatrix} {a}_{s_k}\, \mathrm{d}t \ +\ b_{s_k} \circ dW_t\\ {a}_{s_p}\, \mathrm{d}t \ +\ b_{s_p} \circ dW_t \\ {a}_{s_q} \, \mathrm{d}t \ +\ b_{s_q} \circ dW_t \end{bmatrix}^* \,. \end{aligned} {} \end{aligned} $$
(38)

Upon applying the previous steps for the deterministic case to the stochastic velocity in (37), the single triad interaction dynamics for the SALT case would emerge as, cf. Eq. (18),

$$\displaystyle \begin{aligned} \,\mathrm{d}{\mathbf{a}} = g \Big({\mathbf{a}}({\mathbf{k}},t)dt + {\mathbf{b}}({\mathbf{k}})\circ dW_t\Big)^* \times \mathbb{D} {\mathbf{a}}^*\,. {} \end{aligned} $$
(39)

The details of this computation can be found in section “LU Euler” in Appendix 2 for the 3D LU Euler model and in section “SALT Euler” in Appendix 2 for the 3D SALT Euler model.

Remarkably, the stochastic triad interaction equation (39) still preserves the triad helicity \({\mathbf {a}}\cdot \mathbb {D}{\mathbf {a}}^*\). Hence we name this model the helicity preserving stochastic triad (HST) model. Note that in both equations we use a single source of noise (one Brownian motion drives the entire system).

It is readily checked that the HST triad evolution (39) preserves the helicity. We now examine the diffusion coefficients:

$$\displaystyle \begin{aligned} {\mathbf{a}}^* \times \mathbb{D} {\mathbf{b}} = \begin{bmatrix} a_p^* s_q q b_q - a_q^* s_p p b_p \\ a_q^* s_k k b_k - a_k^* s_q q b_q \\ a_k^* s_p p b_p - a_p^* s_k k b_k \end{bmatrix} \,,\;\; {\mathbf{b}} \times \mathbb{D}{\mathbf{a}}^* = \begin{bmatrix} b_p s_q q a_q^* - b_q s_p p a_p^* \\ b_q s_k k a_k^* - b_k s_q q a_q^* \\ b_k s_p p a_p^* - b_p s_k k a_k^* \end{bmatrix} . \end{aligned} $$
(40)

Taking the difference

$$\displaystyle \begin{aligned} {\mathbf{a}}^* \times \mathbb{D} {\mathbf{b}} - {\mathbf{b}} \times \mathbb{D}{\mathbf{a}}^* = \begin{bmatrix} a_p^* (s_q q + s_p p) b_q - a_q^* (s_p p + s_q q) b_p \\ a_q^* (s_k k + s_q q) b_k - a_k^* (s_q q + s_k k) b_q \\ a_k^* (s_p p + s_k k) b_p - a_p^* (s_k k + s_p p) b_k \end{bmatrix}. \end{aligned} $$
(41)

Writing \(\rho :=\operatorname {Tr}{\mathbb {D}}\) we get

$$\displaystyle \begin{aligned} {\mathbf{a}}^* \times \mathbb{D} {\mathbf{b}} - {\mathbf{b}} \times \mathbb{D}{\mathbf{a}}^* = \begin{bmatrix} a_p^* (\rho-s_k k) b_q - a_q^* (\rho-s_k k) b_p \\ a_q^* (\rho - s_p p) b_k - a_k^* (\rho- s_p p) b_q \\ a_k^* (\rho - s_q q) b_p - a_p^* (\rho-s_q q) b_k \end{bmatrix}. \end{aligned} $$
(42)

Hence the difference term becomes

$$\displaystyle \begin{aligned} {\mathbf{a}}^* \times \mathbb{D} {\mathbf{b}} - {\mathbf{b}} \times \mathbb{D}{\mathbf{a}}^* = (\rho \operatorname{Id} - \mathbb{D})({\mathbf{a}}^* \times {\mathbf{b}}). \end{aligned} $$
(43)
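The chain of identities (40) to (43) can be checked numerically. The following is a minimal sketch, assuming NumPy; the random vectors and the random diagonal of \(\mathbb {D}\) are hypothetical test data, and the cross product on the right-hand side is taken as \({\mathbf {a}}^*\times {\mathbf {b}}\), which coincides with \(({\mathbf {a}}\times {\mathbf {b}})^*\) when \({\mathbf {b}}\) is real.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical test data: complex amplitudes a, b and a real diagonal matrix D.
a = rng.normal(size=3) + 1j * rng.normal(size=3)
b = rng.normal(size=3) + 1j * rng.normal(size=3)
D = np.diag(rng.normal(size=3))   # stands in for diag(s_k k, s_p p, s_q q)
rho = np.trace(D)

# Left-hand side of (41): a^* x D b - b x D a^*
lhs = np.cross(a.conj(), D @ b) - np.cross(b, D @ a.conj())

# Right-hand side: (rho Id - D)(a^* x b)
rhs = (rho * np.eye(3) - D) @ np.cross(a.conj(), b)

identity_holds = bool(np.allclose(lhs, rhs))
print(identity_holds)
```

The check uses the unconjugated (bilinear) cross product throughout, which is how the components in (40) are written.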

Since the projections of the LU and SALT models onto a single triad are indistinguishable, we introduce a different model that conserves energy on a single triad to enable a comparison between energy conserving and helicity conserving models.

Energy-Preserving Stochastic Triad (EST) Model

We introduce below a modified version of the HST triad equation (39) that introduces stochasticity into the vorticity instead of into the transport velocity and thereby conserves the energy. This is inspired by the full-scale model introduced in [17]. The reduced model reads

$$\displaystyle \begin{aligned} \,\mathrm{d}{\mathbf{a}} = -\,{g} {\mathbf{a}}^* \times \mathbb{D}\Big({\mathbf{a}}({\mathbf{k}},t)\, \mathrm{d}t + {\mathbf{b}}({\mathbf{k}})\circ dW_t\Big)^*\,. {} \end{aligned} $$
(44)

We call this model the energy preserving stochastic triad (EST). Written in matrix form Eq. (44) becomes

$$\displaystyle \begin{aligned} \begin{aligned} \,\mathrm{d} \begin{bmatrix} {a}_{s_k} \\ {a}_{s_p} \\ {a}_{s_q} \end{bmatrix} = {g} \begin{bmatrix} 0 & -{a}_{s_q} & {a}_{s_p} \\ {a}_{s_q} & 0 & -{a}_{s_k} \\ -{a}_{s_p} & {a}_{s_k} & 0 \end{bmatrix}^* \begin{bmatrix} ks_k\big({a}_{s_k}\, \mathrm{d}t \ +\ b_{s_k}({\mathbf{k}}) \circ dW_t\big) \\ ps_p\big({a}_{s_p}\, \mathrm{d}t \ +\ b_{s_p}({\mathbf{p}}) \circ dW_t \big) \\ qs_q\big({a}_{s_q} \, \mathrm{d}t \ +\ b_{s_q}({\mathbf{q}}) \circ dW_t\big) \end{bmatrix}^* \,. \end{aligned} {} \end{aligned} $$
(45)

The exchange symmetry between the two models HST and EST in the placement of the noise in Eqs. (39) and (44) is apparent already in the exchange symmetry between velocity and vorticity in Euler’s fluid equations (1).

Deviation from the Conservation Laws

We can write the equations for the deviation from the conservation laws, which is present in both models. The HST model deviates from energy conservation by

$$\displaystyle \begin{aligned} \,\mathrm{d}_t E_{\text{HST}} = g {\mathbf{b}}\cdot(\mathbb{D}{\mathbf{a}}^*\times {\mathbf{a}}^*) \circ \,\mathrm{d} W_t \end{aligned} $$
(46)

whereas the EST model deviates from helicity conservation by

$$\displaystyle \begin{aligned} \,\mathrm{d}_t {H}_{\text{EST}} = g \mathbb{D}{\mathbf{b}}\cdot (\mathbb{D}{\mathbf{a}}^*\times {\mathbf{a}}^*) \circ \,\mathrm{d} W_t. \end{aligned} $$
(47)

This can be seen as follows: to obtain the energy we dot the HST equation with \({\mathbf {a}}^*\), to obtain the helicity we dot the EST equation with \(\mathbb {D}{\mathbf {a}}^*\), and in each case we use the standard identities

$$\displaystyle \begin{aligned} &{\mathbf{a}}^* \cdot ({\mathbf{b}} \times \mathbb{D}{\mathbf{a}}^*) = {\mathbf{b}} \cdot (\mathbb{D}{\mathbf{a}}^*\times {\mathbf{a}}^*) \end{aligned} $$
(48)
$$\displaystyle \begin{aligned} &\mathbb{D}{\mathbf{a}}^* \cdot ( {\mathbf{a}}^* \times \mathbb{D} {\mathbf{b}} ) = \mathbb{D}{\mathbf{b}}\cdot(\mathbb{D}{\mathbf{a}}^*\times {\mathbf{a}}^*). \end{aligned} $$
(49)

Therefore, \({\mathbf {b}}\) respects the right scaling and no further scale adjustments between the HST and EST noise amplitudes need to be performed in order to compare the models.
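The conservation statements and the identities (48) and (49) lend themselves to a quick numerical sanity check. The sketch below assumes NumPy, a real noise amplitude \({\mathbf {b}}\), and hypothetical random values for \({\mathbf {a}}\), \({\mathbf {b}}\) and the diagonal of \(\mathbb {D}\); the factor g and overall signs are omitted since they do not affect orthogonality.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical test data (not taken from the paper).
a = rng.normal(size=3) + 1j * rng.normal(size=3)
b = rng.normal(size=3)                 # real noise amplitudes (assumption)
D = np.diag(rng.normal(size=3))        # stands in for diag(s_k k, s_p p, s_q q)
ac = a.conj()

hst_noise = np.cross(b, D @ ac)        # noise direction of the HST model (39)
est_noise = np.cross(ac, D @ b)        # noise direction of the EST model (44)

# HST noise is orthogonal to D a^*, so it does not change the helicity.
hst_helicity_rate = np.dot(D @ ac, hst_noise)
# EST noise is orthogonal to a^*, so it does not change the energy.
est_energy_rate = np.dot(ac, est_noise)

# Identities (48) and (49), using the bilinear (unconjugated) dot product.
triple = np.cross(D @ ac, ac)
id48 = bool(np.isclose(np.dot(ac, hst_noise), np.dot(b, triple)))
id49 = bool(np.isclose(np.dot(D @ ac, est_noise), np.dot(D @ b, triple)))

print(abs(hst_helicity_rate), abs(est_energy_rate), id48, id49)
```

Note that `np.dot` does not conjugate its first argument, which matches the way the products are written in (46) to (49).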

3 Data Assimilation Comparison

In this section, we perform a comparative study of the two reduced order models (HST and EST) introduced above by using data assimilation tools. The particular methodology that we make use of is that of particle filters. We will first briefly explain the particle filtering methodology in a generic framework:

Let X and Z be two processes defined on a given probability space \((\Omega , \mathcal {F}, \mathbb {P})\). The process X is usually called the signal process or the truth and Z is the observation process. In this paper, X is the pathwise solution of a (deterministic) shell model. The pair of processes \((X,Z)\) forms the basis of the nonlinear filtering problem, which consists in finding the best approximation of the posterior distribution of the signal \(X_t\) given the observations \(Z_1, Z_2, \ldots , Z_{t_{n}}\). The posterior distribution of the signal at time t is denoted by \(\pi _t\). We let \(d_X\) be the dimension of the state space and \(d_Z\) be the dimension of the observation space. This mixed continuous-discrete time framework can be embedded into a fully discrete framework, whereby one is interested in computing the conditional probability law of the signal at the observation times. In other words, one wants to compute the conditional distribution \(\pi _{{t_n}}\) of \(X(t_{n})\) given the data \(Z(t_{1}),Z(t_{2}),\ldots ,Z(t_{n})\). The process X is assumed to be a Markov process, and we will denote by \(\mathcal {K}_n\) its transition kernel, that is

$$\displaystyle \begin{aligned} \mathcal{K}_n:\mathbb{R}^{d_X}\times\mathcal{B}(\mathbb{R}^{d_X}) \rightarrow [0,1], \ \mathcal{K}_n(x, B) = \mathbb{P}(X_{t_n} \in B|X_{t_{n-1}}=x) \end{aligned} $$
(50)

for any Borel measurable set \(B \in \mathcal {B}(\mathbb {R}^{d_X})\) and \(x\in \mathbb {R}^{d_X}\). The process Z models noisy measurements of the truth, using the so-called observation operator \(\mathscr {H}: \mathbb {R}^{d_X} \rightarrow \mathbb {R}^{d_Z}\):

$$\displaystyle \begin{aligned} Z_{n} = \mathscr{H}(X_{t_n})+ V_n \end{aligned} $$
(51)

where \(\{V_n\}_{n\geq 0}\) are independent identically distributed random variables that represent the measurement noise and \(\mathscr {H}\) is a Borel-measurable function. In this paper we will assume that \(\{V_n\}_{n\geq 0}\) have standard normal distributions, but the same methodology can be applied to more general distributions. Observations are incorporated into the system at assimilation times. The following recursion formula holds (see [2])

$$\displaystyle \begin{aligned} {} \pi_n = g_n \star \pi_{n-1}\mathcal{K}_n \end{aligned} $$
(52)

where ‘\(\star \)’ denotes the projective product (see e.g. Definition 10.4 in [2]).
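As a reading aid, the recursion (52) can be split into a prediction step and an update step. The display below is a sketch that assumes the usual definition of the projective product as a normalised reweighting of a measure; the symbols \(p_n\) (the predicted measure) and A (a Borel set) are introduced here for illustration only, with \(g_n\) playing the role of the likelihood weight, and the precise statement should be checked against [2].

$$\displaystyle \begin{aligned} p_n(A) = \int_{\mathbb{R}^{d_X}} \mathcal{K}_n(x, A)\,\pi_{n-1}(\mathrm{d}x), \qquad \pi_n(A) = (g_n\star p_n)(A) = \frac{\int_A g_n(x)\,p_n(\mathrm{d}x)}{\int_{\mathbb{R}^{d_X}} g_n(x)\,p_n(\mathrm{d}x)}. \end{aligned} $$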

In the following, we compare approximations of the posterior distribution of the signal using particle filters. These are sequential Monte Carlo methods which generate approximations of the posterior distribution \(\pi _t\) using sets of particles. That is, they generate approximations that are (random) measures of the form

$$\displaystyle \begin{aligned} \pi_n \approx \displaystyle\sum_{\ell} \mathrm{w}_{n}^{\ell}\delta({x_{n}^{\ell}}), \end{aligned}$$

where \(\delta \) is the Dirac delta distribution, \(\mathrm {w}_{n}^1, \mathrm {w}_{n}^2, \ldots \) are the weights of the particles and \(x_{n}^1, x_{n}^2, \ldots \) are their corresponding positions. Particle filters are used to make inferences about the signal process by using Bayes’ theorem, the time-evolution induced by the signal X, and the observation process Z.

In a standard particle filter, the particles evolve between assimilation times according to the law of the signal. As we explain below, at each assimilation time the observation is incorporated into the system through the likelihood function:

$$\displaystyle \begin{aligned} g_t^{z_t}\!:\mathbb{R}^{d_X} {\rightarrow} \mathbb{R}_+, \ g_t^{z_t}(x) \,{=}\, g_t(z_t {-} \mathscr{H}(x)) \ \ \text{ such that }\ \ \mathbb{P}(Z_t {\in} dz_t | X_t \,{=}\, x)\,{=}\, g_t^{z_t}(x)dz_t \end{aligned} $$
(53)

and all particles are weighted depending on the likelihood of their corresponding position, given the observation. More precisely, the particle \(\ell \) is given the weight \(\mathrm {w}_n^{\ell } = g_n^{Z_n}(x_n^{\ell })\). Heuristically, the particle weight measures how close the particle trajectory is to the signal trajectory. A selection procedure is then applied to the set of weighted particles. Particles with higher conditional likelihood (guided by the observation) have higher weights and will be multiplied, while those with small likelihoods will be eliminated. For the basic particle filter, this is done by sampling with replacement from the population of particles, with corresponding probabilities proportional to their weights.
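The weighting and selection steps just described can be sketched in a few lines. This is a minimal illustration, assuming NumPy, a Gaussian observation error with covariance `obs_cov`, and multinomial resampling; the function name `sir_update` and its arguments are hypothetical and not taken from the authors' code.

```python
import numpy as np

def sir_update(particles, observation, obs_operator, obs_cov, rng):
    """One assimilation step: Gaussian likelihood weights, then resampling.

    particles    : array of shape (N, d_X), one row per particle
    observation  : array of shape (d_Z,)
    obs_operator : callable mapping a state to R^{d_Z}
    obs_cov      : (d_Z, d_Z) observation error covariance (assumed Gaussian)
    """
    predicted = np.array([obs_operator(x) for x in particles])
    innov = observation - predicted
    cov_inv = np.linalg.inv(obs_cov)
    # Log-likelihood of each particle, up to an additive constant.
    log_w = -0.5 * np.einsum("ni,ij,nj->n", innov, cov_inv, innov)
    # Subtract the maximum before exponentiating for numerical stability.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Sample with replacement, with probabilities proportional to the weights.
    idx = rng.choice(len(particles), size=len(particles), replace=True, p=w)
    return particles[idx], w

# Usage with three scalar particles and a direct observation of the state.
rng = np.random.default_rng(0)
particles = np.array([[0.0], [1.0], [5.0]])
resampled, weights = sir_update(
    particles, np.array([1.0]), lambda x: x, np.eye(1), rng
)
print(weights)
```

In the toy usage the particle at 1.0 coincides with the observation and therefore receives the largest weight.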

A Monte Carlo implementation of the transition kernel of the signal may not always yield good approximations. In many situations one replaces the original transition kernel with likelihood-informed importance proposals, leading to much better approximations. One situation in which this is necessary is when the original process is actually deterministic (aside from the initial position, which is assumed to be random). This is the case in our paper.

To overcome the collapse of the particle filter when using deterministic transition kernels, one can use a Markov Chain Monte Carlo procedure that leaves the deterministic dynamics invariant. This procedure can be costly and might not always introduce enough spread into the sample. In this paper, we propose a different approach, which we illustrate numerically in Sect. 3.1.2 below. In particular, we propose two different transition kernels based on the physical conservation properties:

  • The transition kernel associated with the HST model equation (38). We will denote this transition kernel by \(\mathcal {K}^{\text{HST}}\). As we have explained above, this transition kernel preserves the helicity of the system.

  • The transition kernel associated with the EST model equation (45). We will denote this transition kernel by \(\mathcal {K}^{\text{EST}}\). As we have explained above, this transition kernel preserves the energy of the system.

3.1 Numerical Studies

3.1.1 Numerical Implementation

The models are discretised using the stochastic SSPRK3 scheme which is documented, for example, in [11]. In our specific case, for instance, the HST model is discretised as

$$\displaystyle \begin{aligned} {\mathbf{q}}_1^{n} &= {\mathbf{a}}_{n} + g ({\mathbf{a}}_n^*\times \mathbb{D}{\mathbf{a}}_n^*) \Delta t + g({\mathbf{b}}\times\mathbb{D}{\mathbf{a}}_n^*)\Delta W \\ {\mathbf{q}}_2^{n} &= (3/4) {\mathbf{a}}_{n} + (1/4)({\mathbf{q}}_1^{n}+ g (({\mathbf{q}}_1^{n})^*\times \mathbb{D}({\mathbf{q}}_1^{n})^*) \Delta t + g({\mathbf{b}}\times\mathbb{D}({\mathbf{q}}_1^{n})^*)\Delta W) \\ {\mathbf{a}}_{n+1} &= (1/3) {\mathbf{a}}_{n} + (2/3)({\mathbf{q}}_2^{n}+ g (({\mathbf{q}}_2^{n})^*\times \mathbb{D}({\mathbf{q}}_2^{n})^*) \Delta t + g({\mathbf{b}}\times\mathbb{D}({\mathbf{q}}_2^{n})^*)\Delta W) \end{aligned} $$

where \(\Delta t\) denotes the timestep and \(\Delta W\) the increment of the driving Brownian motion. Further, \({\mathbf {a}}_n\) is the approximate complex vector amplitude at time \(t = n\Delta t\). The EST model is discretised completely analogously. For the numerical simulations we chose the following triad throughout. We set

$$\displaystyle \begin{aligned} {\mathbf{k}}=[1,0,0],\;\; {\mathbf{p}}=[0,-1,1],\;\; {\mathbf{q}}=[-1,1,-1], \end{aligned} $$
(54)

with parities \(s_k=1, s_p=-1, s_q=-1\) and the initial value \({\mathbf {a}}_0 = \frac {1}{\sqrt {3}}[1,1,1]\). We set the parameter \(\boldsymbol {\Gamma }=[1,1,1]\) and used a time step size of \(\Delta t=0.0005\).
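The scheme above can be sketched directly in code. The following is a minimal illustration, assuming NumPy, a real noise amplitude \({\mathbf {b}}\), \(\mathbb {D}=\operatorname {diag}(s_k|{\mathbf {k}}|, s_p|{\mathbf {p}}|, s_q|{\mathbf {q}}|)\), and a placeholder coupling `g = 1.0` (the actual value of g is defined elsewhere in the paper and is not reproduced here). The function names are hypothetical.

```python
import numpy as np

def hst_increment(a, b, D, g, dt, dW):
    """Increment g (a dt + b dW)^* x D a^* of the HST model (39)."""
    transport = (a * dt + b * dW).conj()
    return g * np.cross(transport, D @ a.conj())

def ssprk3_step(a, b, D, g, dt, dW):
    """One step of the three-stage scheme, same dW in all stages."""
    q1 = a + hst_increment(a, b, D, g, dt, dW)
    q2 = 0.75 * a + 0.25 * (q1 + hst_increment(q1, b, D, g, dt, dW))
    return a / 3.0 + (2.0 / 3.0) * (q2 + hst_increment(q2, b, D, g, dt, dW))

def helicity(a, D):
    return float(np.real(np.dot(a, D @ a.conj())))

def energy(a):
    return float(np.real(np.dot(a, a.conj())))

# Triad (54) with parities s_k = 1, s_p = -1, s_q = -1.
k = np.array([1.0, 0.0, 0.0])
p = np.array([0.0, -1.0, 1.0])
q = np.array([-1.0, 1.0, -1.0])
s = np.array([1.0, -1.0, -1.0])
D = np.diag(s * np.array([np.linalg.norm(v) for v in (k, p, q)]))

a0 = np.ones(3, dtype=complex) / np.sqrt(3.0)
b = np.array([0.1, 0.05, 0.01])   # full noise case used later in the text
dt = 0.0005
g = 1.0                           # placeholder value (assumption)

rng = np.random.default_rng(0)
a_sto, a_det = a0.copy(), a0.copy()
for _ in range(200):
    dW = rng.normal(scale=np.sqrt(dt))
    a_sto = ssprk3_step(a_sto, b, D, g, dt, dW)
    a_det = ssprk3_step(a_det, np.zeros(3), D, g, dt, 0.0)

helicity_drift = abs(helicity(a_sto, D) - helicity(a0, D))
energy_drift_det = abs(energy(a_det) - energy(a0))
print(helicity_drift, energy_drift_det)
```

The two printed drifts give a quick indication of how well the discrete scheme respects the helicity in the stochastic run and the energy in the deterministic run; conservation is only approximate at the discrete level.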

3.1.2 Data Assimilation for the Deterministic Model

We illustrate the failure of the particle filter with deterministic transition kernel in Fig. 1. In this case, the particle filtering is performed for an ensemble of \(n=25\) particles evolving according to the deterministic triad dynamics. The initial ensemble is spread around the initial value \({\mathbf {a}}_0\) of the signal according to a Gaussian distribution with standard deviation \(1/\sqrt {600}\) and, in particular, does not contain the true initial point. Data assimilation is performed every 10 time units and the observations are taken from the modal energies of the truth with an observation error \(\eta \) distributed as \( \eta \sim \mathcal {N}(\mathbf {0}, \mathrm {C}) \) with covariance \(\mathrm {C}=\operatorname {diag}(0.005^2, 0.05^2, 0.05^2)\). We observe that both the bias and the RMSE keep increasing with time to values much larger than the observation error. Moreover, the number of distinct particles decreases rapidly: after 30 steps, a single particle remains, and it is not the true particle since the true particle was not part of the initial cloud. In short, the particle filter fails in this setting.
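The three diagnostics used here (bias, RMSE, and number of distinct particles) can be computed as in the sketch below. The exact definitions are not spelled out in this section, so the formulas are an assumption: bias is taken as the ensemble mean minus a reference value, and RMSE as the root of the mean squared deviation of the particles from that reference. The function name is hypothetical.

```python
import numpy as np

def ensemble_diagnostics(ensemble, reference):
    """Bias, RMSE and number of distinct particles (assumed definitions).

    ensemble  : (N, d) array, one row of observed quantities per particle
    reference : (d,) array, e.g. the observation or the truth
    """
    ensemble = np.asarray(ensemble, dtype=float)
    reference = np.asarray(reference, dtype=float)
    bias = ensemble.mean(axis=0) - reference
    rmse = np.sqrt(((ensemble - reference) ** 2).mean(axis=0))
    # Resampling duplicates particles; count the distinct rows that remain.
    n_unique = np.unique(ensemble, axis=0).shape[0]
    return bias, rmse, n_unique

# Toy ensemble with one duplicated particle.
bias, rmse, n_unique = ensemble_diagnostics(
    [[1.0, 2.0], [1.0, 2.0], [3.0, 4.0]], [2.0, 2.0]
)
print(bias, rmse, n_unique)
```

With a deterministic kernel, duplicated particles stay identical forever, so `n_unique` can only decrease from one assimilation step to the next.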

Fig. 1

Deterministic Triad Model. The horizontal axis shows time. (a) Evolution of the modal energies (colored lines) as well as the total energy (dashed line) and helicity (dash-dotted line). The deterministic model exhibits continually oscillating modal energies. The simulation also confirms the conservation of energy and helicity. (b-e) Data assimilation for the deterministic model using the particle filter. (b) Evolution of the energy of mode \({\mathbf {p}}\) of the signal (grey line) and evolution of the energy of mode \({\mathbf {p}}\) for the particle ensemble (blue lines). Noisy observations (black stars) are made and assimilated every 10 time units. (c) The number of unique particle positions in the filtering ensemble. (d) The bias of the particle ensemble with respect to the observations. (e) The RMSE of the particle ensemble with respect to the observations

3.1.3 Reduced Order Model Realisations

The deterministic model in Fig. 1a exhibits continually oscillating triad amplitudes. Plotted are the modal energies: writing \({\mathbf {a}}=[a_k, a_p, a_q]\), we call the real value \(a_k^{\phantom {*}}a_k^*\) the energy of mode k, and similarly for the two other modes.

We simulate the model realisations for different noise scenarios. We simulate the effect of noise in each single mode. Let the noise amplitude vector be \({\mathbf {b}}=[b_k, b_p, b_q]\). Then we simulate the two models for \({\mathbf {b}}=[b_k=0.1, b_p=0, b_q=0]\), \({\mathbf {b}}=[b_k=0, b_p=0.1, b_q=0]\), and \({\mathbf {b}}=[b_k=0, b_p=0, b_q=0.1]\). The trajectories of the modal energies for \(n=20\) realisations of the driving noise for each scenario are shown in Fig. 2. We also simulate the case of full noise for the noise amplitude vector \({\mathbf {b}}=[b_k=0.1, b_p=0.05, b_q=0.01]\) which was calibrated to the data assimilation objective using the procedure explained in section “Calibration of the Noise Amplitude” in Appendix 3. The ensemble of \(n=20\) realisations of the driving noise in the full noise case is depicted in Fig. 2d and h. In all cases, it can be observed that the mean energy amplitudes of the modes are dampened in both stochastic models. Furthermore, we can experimentally verify the conservation of triad energy for the EST model and the conservation of triad helicity for the HST model.

Fig. 2

Model realisations for both stochastic triad models. Plotted are the modal energies (colored lines), total energy (black line) and helicity (grey line). The respective thick lines are the ensemble means, and the thin lines represent the different stochastic realisations. (a + e) The noise coefficient \({\mathbf {b}} = [0.1,0,0]\). (b + f) The noise coefficient \({\mathbf {b}} = [0,0.1,0]\). (c+g) The noise coefficient \({\mathbf {b}} = [0,0,0.1]\). (d + h) The noise coefficient \({\mathbf {b}} = [0.1,0.05,0.01]\)

3.1.4 Model Statistics

Figure 3 shows various statistics of the generated ensembles of \(n=1000\) particles for the HST and EST triad models in the full noise case introduced above. We plot the ensemble mean, standard deviation, skew, and kurtosis.
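Pointwise-in-time ensemble statistics of this kind can be computed as in the sketch below. The normalisation conventions are an assumption, since the section does not state them: population (biased) moments are used, and the kurtosis is the plain fourth standardised moment rather than the excess kurtosis. The function name is hypothetical.

```python
import numpy as np

def ensemble_moments(samples):
    """Mean, standard deviation, skewness and kurtosis along the ensemble axis.

    samples : (N, T) array holding N realisations at T output times.
    Times at which the ensemble has zero spread yield nan or inf
    for skewness and kurtosis.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    centred = samples - mean
    var = (centred ** 2).mean(axis=0)
    std = np.sqrt(var)
    skew = (centred ** 3).mean(axis=0) / std ** 3
    kurt = (centred ** 4).mean(axis=0) / var ** 2
    return mean, std, skew, kurt

# Toy ensemble: 3 realisations at 2 times.
mean, std, skew, kurt = ensemble_moments([[-1.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
print(mean, std, skew, kurt)
```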

Fig. 3

Evolution of statistical moments of the modal energies for both stochastic models in the full noise case. The statistics are computed pointwise in time from an ensemble of 1000 realisations up to a final time of 150. (a + e) Ensemble mean. (b + f) Ensemble standard deviation. (c + g) Ensemble skew. (d + h) Ensemble kurtosis

The effect of large noise coefficients is exemplified in Fig. 4. We observe that the HST model explodes whereas the EST model is more tolerant to large noise coefficients, and even in the extreme case, does not become unstable in the mean. The ensemble means are computed from \(n=500\) realisations, using the noise coefficient \({\mathbf {b}} = [0.0,1.0,0.0]\). Moreover, to stress the EST model, we also ran the same experiment with a noise coefficient of \({\mathbf {b}} = [0.0,10.0,0.0]\) for the EST model alone.

Fig. 4

The effect of large noise coefficients on the mean. Evolution of the mean modal energies (colored lines), mean total energy (black), and mean helicity (grey) for the HST (a) and EST (b) model with noise coefficient \({\mathbf {b}} = [0,1,0]\). (c) Evolution of the mean modal energies, mean total energy, and mean helicity for the EST model with the strong noise coefficient \({\mathbf {b}} = [0,10,0]\)

The ensemble mean for a large number of particles, \(n=20{,}000\), is shown in Fig. 5. We can observe that, compared to Fig. 3a and e, the oscillations after time 40 are reduced for a very large number of particles. Hence, we believe that the system stabilizes in the mean to stationary modal energies as the limiting effect of the noise.

Fig. 5

Evolution of mean modal energies for a very large number of realisations for the EST (a) and HST (b) models. The mean is computed from \(20{,}000\) particles in the full noise case

3.1.5 Data Assimilation

Using the two stochastic models in the full noise case described above, we perform the data assimilation tests using the following framework:

The signal process (the truth) is given by the deterministic triad model. The observations are the modal energies of the deterministic model, observed every 10 time units, and perturbed by noise of the form

$$\displaystyle \begin{aligned} \eta\sim\mathcal{N}(\mathbf{0}, \mathrm{C}), \end{aligned} $$
(55)

where the covariance matrix \(\mathrm {C}\in \mathbb {R}^{3\times 3}\) is chosen to be the diagonal matrix \(\mathrm {C}=\operatorname {diag}(0.005^2, 0.05^2, 0.05^2)\).
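Generating one such noisy observation from a triad state can be sketched as follows. This assumes NumPy and takes the modal energies as \(|a_k|^2, |a_p|^2, |a_q|^2\); the function name `observe_modal_energies` is hypothetical.

```python
import numpy as np

def observe_modal_energies(a, obs_cov, rng):
    """Return the modal energies of state a perturbed by N(0, obs_cov) noise."""
    energies = np.abs(a) ** 2
    noise = rng.multivariate_normal(np.zeros(len(energies)), obs_cov)
    return energies + noise

# Covariance from the text and the initial state of the numerical section.
C = np.diag([0.005 ** 2, 0.05 ** 2, 0.05 ** 2])
a = np.ones(3, dtype=complex) / np.sqrt(3.0)

rng = np.random.default_rng(0)
z = observe_modal_energies(a, C, rng)
print(z)
```

Because the noise is Gaussian, an observed energy can occasionally be negative when the true modal energy is close to zero.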

We use the sequential importance resampling (SIR) particle filter to assimilate the periodically observed signal process under the influence of observation noise. The particles evolve according to the stochastic triad models. Figures 6 and 7 show the results of filtering the ensemble of \(n=100\) particles of the EST and HST triad models, respectively. The ensembles are assessed in terms of the bias and RMSE statistics. We analyse the comparison details below:

Fig. 6

Filtering experiment for EST model using SIR particle filter. (a-c) Ensemble evolution for 100 particles in mode k (a, red), p (b, blue), and q (c, green). The signal (grey) is the deterministic model and the observations (black stars) are noisy and taken and assimilated every 10 time units. (d-f) Bias of the filtering ensemble. (g-h) RMSE of the filtering ensemble

Fig. 7

Filtering experiment for HST model using SIR particle filter. (a-c) Ensemble evolution for 100 particles in mode k (a, red), p (b, blue), and q (c, green). The signal (grey) is the deterministic model and the observations (black stars) are noisy and taken and assimilated every 10 time units. (d-f) Bias of the filtering ensemble. (g-h) RMSE of the filtering ensemble

3.1.5.1 Mode k

This is the least energetic of all the modes, which is why we observe it with the least amount of measurement noise. The cloud of particles is well placed around the truth even with the small sample. The bias remains small for both the HST and the EST versions, and it reduces significantly when observations are assimilated more frequently (see Fig. 9 in Appendix 3) as well as when we use a larger number (500) of particles (see Fig. 10 in Appendix 3). The RMSE remains small in all cases and decreases (though not substantially) when the DA step is small.

3.1.5.2 Modes p and q

These are the two energetic modes of the system. We used here a measurement noise that is one order of magnitude larger. Despite this, the results remain equally good. The cloud of particles provides a good envelope for the truth at all times. This validates the choice of the stochasticity: the uncertainty is properly modelled. For both models the bias can become very large, reaching \(30\%\) of the size of the oscillations for the HST model and \(25\%\) of the size of the oscillations for the EST model. As expected, it is drastically reduced when observations are assimilated more frequently. The RMSE for mode p is also large but substantially smaller for mode q. The addition of more intermediate DA steps or more particles has a less pronounced effect for the q mode.

Remark 1

We record the Effective Sample Size (ESS) for a typical run (for both EST and HST) in Fig. 11. As usual, the ESS is computed just before the application of the resampling procedure. The ESS is seen to decay dramatically from 100 down to single-digit values at most time instances.
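For reference, a common way to compute the ESS from the particle weights is the inverse of the sum of squared normalised weights. The sketch below assumes this standard formula, which the section does not state explicitly; the function name is hypothetical.

```python
import numpy as np

def effective_sample_size(weights):
    """ESS = 1 / sum_i w_i^2 for weights normalised to sum to one."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

# Equal weights give ESS = N; a single dominant particle gives ESS close to 1.
ess_uniform = effective_sample_size(np.ones(100))
ess_degenerate = effective_sample_size([1.0, 0.0, 0.0])
print(ess_uniform, ess_degenerate)
```

Under this formula, an ESS in the single digits for an ensemble of 100 means that only a handful of particles carry almost all of the weight before resampling.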

Remark 2

We record the results over 10 independent runs of the filtering experiment for the EST model with 500 ensemble members in Fig. 12. More precisely, in graphs 12a, b and c each, we plot the mean across the 10 independent filtering runs together with the evolution of the signal and the individual ensemble means for each mode. The mean bias as well as the envelope obtained from the independent runs are shown in graphs 12d, e and f. The same is shown for the RMSE in graphs 12g, h and i. Compared with a single run of the same experiment reported in Fig. 10 we observe the approximations are now near perfect (the statistical error has been drastically reduced).

4 Conclusions

The introduction of stochasticity into the deterministic triad models leads to two new stochastic models. Stochasticity is introduced in a principled way (rather than ad-hoc). It starts with a full scale fluid dynamic model which is randomly perturbed. At the full scale, the stochastic parametrisation models the small-scale effects in fluid dynamics modelling. In particular, it efficiently captures the high-frequency small-scale dynamics and correctly correlates it with the slow, large-scale fluid motion. In addition, it is constrained to conserve either the helicity or the kinetic energy of the system. This inspires two different stochastic triad models of Euler type which we compare using data assimilation procedures based on particle filtering. The methodology we employ can be used as a benchmark when analysing new types of stochastic parametrisations: ours is the first study that assesses the efficiency of stochastic parametrisations from a data assimilation perspective.

The introduction of stochasticity ensures that the correct spread (one that preserves the physical properties of the system) is introduced in the ensemble of particles. In its absence, the particle filter degenerates quite rapidly: after a few DA steps, a single particle survives the culling procedure, and it does not offer a good approximation of the truth. A purely deterministic transition kernel does not work, generating a rapid degeneracy of the particle filter.

The two stochastic systems (one which preserves helicity and the other one which preserves energy) are analysed using a standard particle filter. There is no need for additional procedures (such as tempering, nudging, or jittering). They perform equally well from the viewpoint of DA: both the RMSE and the bias are drastically reduced and stabilised when the noise is carefully calibrated. The two different stochastic kernels require different noise calibrations in order to perform well in similar data assimilation scenarios. This is somewhat expected, given that the underlying stochastic parametrisations preserve different physical quantities.