Variational Stochastic Parameterisations and their Applications to Primitive Equation Models

We present a numerical investigation into the stochastic parameterisations of the Primitive Equations (PE) using the Stochastic Advection by Lie Transport (SALT) and Stochastic Forcing by Lie Transport (SFLT) frameworks. These frameworks were chosen due to their structure-preserving introduction of stochasticity, which decomposes the transport velocity and fluid momentum into their drift and stochastic parts, respectively. In this paper, we develop a new calibration methodology to implement the momentum decomposition of SFLT and compare with the Lagrangian path methodology implemented for SALT. The resulting stochastic Primitive Equations are then integrated numerically using a modification of the FESOM2 code. For certain choices of the stochastic parameters, we show that SALT causes an increase in the eddy kinetic energy field and an improvement in the spatial spectrum. SFLT also shows improvements in these areas, though to a lesser extent. SALT does, however, have the drawback of an excessive downwards diffusion of temperature.


Introduction
Uncertainty can be present in ocean models due to a number of factors including, but not limited to: smallscale processes not resolved by the grid; observation error; model error; numerical error and unrealistic viscosities imposed to ensure numerical stability. Several stochastic parameterisation techniques [PZ14,Ber05,Mem14,Hol15,HH21] have been proposed recently as ways of representing uncertainty in ocean models. Because these parameterisations are probabilistic, it is possible to generate ensemble forecasts [CCH + 19, CCH + 20, Cot20, UJPD21] with associated means and variances, which can then be applied to data assimilation. This work will focus on two frameworks which introduce stochasticity in a way that preserves certain fundamental and desirable properties of fluid flows. These frameworks are: Stochastic Advection by Lie Transport (SALT) [Hol15] and Stochastic Forcing by Lie Transport (SFLT) [HH21]. Both SALT and SFLT are derived from variational principles, from which we may observe the geometric structure of the fluid equations and the conservation laws which are inherited.
The key assumption of SALT is the decomposition of transport velocity into a slow mean part and a fast, rapidly fluctuating part around the mean. In the limit of high fluctuation frequency, one can use homogenisation theory to transform the rapidly-fluctuating component to a sum of stochastic vector fields [CGH17]. Thus, the modification from the deterministic flow is the addition of stochastic vector fields to the transport velocity. This stochastic modification has been shown [Hol15] to preserve the Kelvin circulation theorem and the advection equation for potential vorticity. In the case where buoyancy obeys an advection relation, the potential vorticity is conserved along particle paths. However, SALT violates energy conservation since stochastic Hamiltonians are introduced into the variational principle. The application of the SALT in quasi-geostrophic (QG) models and the 2D Euler equations has been investigated before in [CCH + 20, CCH + 19, Cot20]. However, these models are too simplistic to be used in operational ocean simulations, and the majority of ocean codes (e.g. MOM5 [GBB + 00], ICON [Kor17], MITgcm [MAH + 97], FESOM2 [DSWJ16]) solve the Primitive Equations (PE). For this reason, if SALT is to be employed for use in practical applications, it must be adapted for use in PE. This introduces additional features to the model as compared the QG or 2D Euler: in PE there are advected quantities such as temperature and salinity, which in the SALT framework are advected by the stochastic velocity. There is, moreover, a subtlety in the pressure arising from the imposition of a semi-martingale Lagrange multiplier in the incompressibility condition of the variational principle [SC21].
An alternative stochasatic parameterisation is the more recent SFLT framework [HH21]. Derived via a Lagrange-d'Alembert principle, SFLT allows the addition of arbitrary stochastic forcings to the evolution equations of the momentum and of the advected quantities. This modification differs from SALT, as stochasticity is added in the variational principle after taking variations of the Hamiltonian for the deterministic system . By considering the Lie-Poisson bracket of the system, we choose the forcing to be of a particular form that preserves, on every realisation of the noise, the original (deterministic) Hamiltonian. For PE, the Hamiltonian is given in eq. (2). However, the addition of energy preserving forces will modify the Kelvin circulation theorem. In the current work, we will consider the case where the stochastic forcing is in the energy conserving form and applied to the momentum equation. As in the SALT case, stochastic pressure terms will appear in the momentum equation due to the imposition of semi-martingale Lagrange multiplier in the incompressibility constraint. Prior to the present work, SFLT has not been implemented into numerical models.
The rest of the paper is structured as follows. In section 2, we derive PE with both SALT and SFLT from a variational principle and we show the conservation properties from the resulting equations. In section 3, we consider calibration procedures to calculate the stochastic parameters of SALT and SFLT. In particular, we use the Lagrangian paths method of [CCH + 20] but also consider a simpler technique, that of Eulerian differences, which we propose is more appropriate for use in SFLT. In section 4, we present numerical results of applying SALT and SFLT to FESOM2 [DSWJ16] (see appendix A), demonstrating the different effects of these stochastic frameworks and the sensitivity to the choice of parameters.

Variational Principles for Stochastic Primitive Equations
Variational principles may be used to derive systems of fluid equations [HMR98,HSS09] which obey conservation laws such as the Kelvin circulation theorem. To derive the Primitive Equations from a variational principle, the appropriate Lagrangian is [HSS09]: where u = (u, v) is the horizontal velocity vector field, R is the Coriolis potential, which satisfies curl R = f (y)ẑ with f (y) = 2Ω cos y and Ω = 2π/day is the rotational frequency of the earth. T and S are the temperature and salinity respectively; these are tracers advected by the fluid. D is the Jacobian of the flow map g t that maps a fluid particle at initial position x 0 to its position x t = g t x 0 at time t. V is the potential energy, which has explicit dependence on T and S, as well as on the vertical coordinate z. The three-dimensional velocity shall be denoted v = (u, w).
In order to obtain the correct hydrostatic balance condition the potential energy should obey ∂V ∂z (T, S, z) = g(1 + b) where the partial derivative is taken with respect to z at constant T, S. b is the buoyancy, given by the equation of state b = b(T, S, z).
It is convenient here to use the Clebsch version of the variational principle [CH09] in Hamiltonian form. The Hamiltonian is given by Legendre transformation as h(m h , D, T, S) := u, δl δu − l(u, D, T, S) where m h := δl δu = D (u + R) is the horizontal momentum. We have also defined the inner product p, q = p · qd 3 x. We shall use the same angle-bracket notation for all such pairings, when p and q are dual variables, e.g. vector field and 1-form density; or a scalar and a density. The Hamiltonian can be written explicitly as: In the Clebsch variational principle when SALT or SFLT are present, the (3-dimensional) transport velocity dχ is defined to be a stochastic process. The form of dχ is defined using Lagrange-multiplier constraints to impose the transport equations (d + L dχ ) a = 0, where a ∈ {D, T, S}, see [SW68]. Here we remark that for clarity, we denote by an italic d the spatial differential and a straight red d for the stochastic time-increment. L dχ denotes the Lie derivative, which is a differential operator with a form that depends on the object on which it acts. We remark here that there is a slight abuse of notation and we shall write D as a short-hand for Dd 3 x so that this is a density 3-form and the Lie derivative is given by L dχ D = ∇ · (dχ D). T and S are scalars, so we have L dχ T := dχ · ∇T and similarly for S. In order to obtain the incompressibility of the transport velocity dχ, we include an additional constraint to set D = 1 where the Lagrange multiplier will be interpreted as the pressure. Since the Hamiltonian h only depends on the horizontal momentum m h , we need to include an extra constraint so that the vertical component of the momentum is set to zero; this will give us hydrostatic balance.
The defining feature of SALT is that the transport velocity is the sum of the drift velocity and a number of stochastic corrections to the drift: where ξ i (x, t) are arbitrary vector fields. We remark here that eq. (3) is a stochastic process at fixed Eulerian points x and we do not solve for this process explicitly. dχ is distinct from the particle trajectories x t , which evolve in time according to dx t = v(x t , t)dt + i ξ i (x t , t) • dW i t and will be used during calibration procedures in section 3. We can impose the form of the transport velocity specified in eq. (3) by including in the action some additional stochastic Hamiltonians i h i (m h ) • dW i t where the horizontal component of the parameters is given by ξ h i (x, t) = δhi δm h . The three-dimensional momentum is denoted m = (m h , m 3 ). We note that in principle ξ i may depend on time; however, we shall henceforth assume for simplicity that ξ i = ξ i (x) is a function of space only. When h i are independent of m h , we have the relation dχ(x, t) := v(x, t)dt, so that dχ reduces to the original deterministic transport.
SFLT is included [HH21] via a Lagrange-d'Alembert term δdχ, F added to the variation of the action δS. Since this is added after variations of the action are taken, the forcing F can in principle be arbitrary. Overall, the variational principle takes the following form: The first two lines of eq. (4) are what would be included in the unmodified variational principle. dζ is a Lagrange multiplier, enforcing m 3 = 0 and after taking variations can be interpreted as the vertical component of the stochastic transport velocity. Indeed, we may expand dζ = wdt + i ξ (z) i • dW i t ; note that here dζ is varied and so the third component of ξ i is treated as a variable in the action, whereas the horizontal components are treated as fixed parameters. The final term on the top line enforces incompressibility, and the Lagrange multiplier dP must be stochastic since a semi-martingale Lagrange multiplier is required to enforce a condition on the semi-martingale D, see [SC21]. On the second line the quantities α, β, γ are Lagrange multipliers enforcing the fact that D, T, S are advected quantities. The final line contains the modifications required to include SALT or SFLT; we shall not in practice use both SALT and SFLT together, but for compactness of the presentation we include them together here. The first modification, giving SALT, consists of a sum of N ξ Hamiltonians multiplied by Stratonovich noise. The second, additional term is a Lagrange-d'Alembert term which introduces a shift F in the momentum. We remark that by including further Lagrange-d'Alembert terms such as δα, dF D or δβ, dF T etc. we may add arbitrary forcings to the right-hand side of the equations for the advected tracers. However, we do not consider this here.
The equations resulting from the variational principle δS = 0 are: The diamond in eq. (5c) is a binary operator acting on two variables that are dual with respect to the inner product ·, · (e.g. or scalar and density) and giving a 1-form density. Explicitly, for two dual variables p, q and an arbitrary vector field X : the diamond is defined by the relation p q, X = − p, L X q . We can compute these explicitly as follows: We note that the form of dχ as given in eq. (3) is not an input to the variational principle, but a consequence of it. Indeed, we obtain eq. (3) by defining v := δh δm h , w and ξ i : , we can split the incompressibility condition into its drift part and stochastic oscillations. Thus we are able to compute w, ξ (z) i in terms of u and ξ (h) respectively: Boundary conditions at z = 0 allow us to integrate eq. (7) in the vertical direction. To obtain the momentum equation we apply (d + L dχ ) to both sides of eq. (5c) and use the fact that the Lie derivative obeys a Leibniz rule with respect to the diamond operator. After some re-arranging, we obtain: We shall show in section 2.2 that the SFLT terms will conserve energy if we require that the momentum shift F takes a particular form, which is that it satisfies (d + L dχ ) F = L v dΦ, for some stochastic process dΦ. In this work, we shall assume further that dΦ has the form dΦ = I φ I • dB I t for some spatially dependent parameters φ I and with B I t being a set of independent Brownian motions. Because the momentum m = (m h , 0) has only horizontal components, we shall assume that φ I also have only horizontal components. Moreover, we can expand the pressure in terms of its drift component and Brownian increments: dP = pdt Thus, writing m = u + R and expanding dχ in terms of v and ξ i , we find that eq. (8) becomes: The first line of eq. (9) contains the terms of the deterministic momentum equation, the second line contains the SALT terms and the final line contains the SFLT contributions. Equation (9) is a three-dimensional equation, but the third component is the (diagnostic) hydrostatic balance condition rather than a prognostic evolution equation for w. In the cases of SALT and SFLT hydrostatic balance includes additional constraints on the stochastic parts of the pressure dP : where we have the definitions of the shifted stochastic pressure terms p i := p i + ξ i · R and p I := p I − v · φ I . We solve eq. (10) by imposing the following surface pressure boundary conditions: where η is the free surface height. The boundary condition on p is that used in the linear free surface approximation, which is employed in FESOM2 [DSWJ16]. ψ i and ψ I are functions only of the horizontal direction and are arbitrary. They may be used to introduce some stochastic atmospheric forcing at the ocean surface, but we do not consider this in the present work. For simplicity we shall set ψ i = ψ I = 0 for all i, I. Solving eq. (10) with the boundary conditions in eq. (11) gives us the following: A more exact condition on the deterministic pressure would be p| z=η = 0. Using this gives almost the same result for p except that the upper limit of the integral will instead be η.
The equation for the evolution of the free surface height η is obtained by integrating the incompressibility condition and using appropriate surface boundary conditions. For the linear free surface approximation we take w| z=0 dt = dη; at the bottom boundary z = −H(x, y) we have dχ| z=−H · ∇ (z + H) = 0. Thus, integrating the incompressibility condition in the vertical direction from z = −H to z = 0 we find, in the linear free surface case: Again, the more exact boundary condition would be dχ| z=η · ∇ (z − η) = dη ad in this case eq. (13) is modified by udt → udt + i ξ i • dW i t and the upper limit of the integral will be η rather than 0. However, for our numerical simulations we use the linear free surface.
From eqs. (5e) and (5f) we have the advection equations: for temperature and salinity respectively. The horizontal component of the momentum equation eq. (9), along with the solutions eq. (10) for pressure (with the equation of state b = b(T, S, z)), the incompressibility conditions eq. (7), the tracer advection equations eqs. (14) and (15) and the linear free surface equation eq. (13) give us a complete set of fluid equations, the Primitive Equations with SALT and SFLT.

Conservation Laws
The key benefit of the SALT and SFLT frameworks is that they retain some of the fundamental conservation properties possessed by the deterministic equations. By writing the Primitive Equations in the geometric form given in eqs. (5d) to (5f) and (8), we may demonstrate the effect of the stochastic frameworks on these conservation laws. First, we consider energy conservation. The total energy is equal to the Hamiltonian, as given in eq. (2). For convenience of notation, we defineh(m, D, T, S, w) = h(m h , D, T, S) + m 3 , w . h and h are equal on solutions of the equations, but we have δh δm = v. By direct calculation, the time evolution of the energy is given by: Thus, the energy conservation property is violated by the stochastic terms. The two terms on the right-hand side of the pairing in eq. (16) come from SALT and SFLT respectively. However, as shown in [HH21], the energy deviation from SFLT can be nullified by choosing (d + L dχ ) F = I L v φ I • dB I t for some parameters φ I (x). Indeed, by the anti-symmetry of the vector field commutator: where the square bracket [·], denotes the commutator of vector fields. Thus, energy conservation is broken by SALT but preserved by a class of stochastic forcing in SFLT. In the remainder of the paper, we shall assume the stochasticity introduced by SFLT are in the energy preserving form.
The next conservation law we consider is the Kelvin circulation theorem. The evolution of the circulation corresponding to eq. (8) is given by: where C(t) is a closed loop moving with the transport velocity dχ. We see that SALT affects the circulation theorem only by modifying the advection of the loop; thus the circulation theorem for SALT is the same as in the deterministic case, but with the circulation considered around a stochastically-transported loop. Therefore, circulation is generated only by buoyancy gradients being misaligned with the vertical direction.
In SFLT, on the other hand, there are additional forces introduced, which generate the circulation of fluid momentum.
The evolution of potential vorticity associated with eq. (8) can be expressed as where ω := curl m h /D is the relative vorticity, ω I = curl φ I is the stochastic vorticity generated by SFLT and q := 1 D ω · ∇b is the potential vorticity. Similar to the Kelvin circulation theorem, SALT introduces stochasticity in the transport velocity dχ, while SFLT introduces stochastic forces that act on the advection of fluid potential vorticity. If we assume that the buoyancy has no explicit dependence on the vertical coordinate, i.e. ∂b ∂z = 0, then q is purely advected by the flow in the absence of SFLT.
3 Calibration of the stochastic parameters

Lagrangian Paths
In order to calibrate the parameters ξ i used in SALT we propose to use the method of Lagrangian paths introduced in [CCH + 19, CCH + 20]. First, we perform a fine-grid model run, which we shall take to be the 'truth'. Resulting from this run we get an output velocity v(x, t) saved at times t ∈ {t 1 , ..., t N −1 , t N }, where the time interval between subsequent sample times, t i+1 − t i , is greater than the velocity decorrelation time, defined to be the smallest τ at which the auto-correlation function C(τ ) is less than e −1 . Suppose the fine-grid resolution is M times that of the coarse grid, in which case the coarse-grid time step is given by ∆t c := M ∆t f , where ∆t f is the time step for the fine-grid model run. In order to compute Lagrangian paths we also save v(x, t) at t ∈ {t i , t i + ∆t f , ..., t i + (M − 1)∆t f } for each i = 1, ..., N .
To obtain the corresponding coarse-grid velocityv(x, t) from v(x, t), we apply a coarse-graining operator to v(x, t), which consists of a local average over fine-grid points, to obtain a velocityv(x, t) defined on the coarse grid. Considering a distribution of tracer particles whose initial positions x r 0 are the (threedimensional) coordinates of the coarse-grid nodes (enumerated by r), we compute Lagrangian paths on the fine grid and coarse grid respectively: where x f r (t i + M ∆t f ) and x r c (t i + M ∆t f ) are the Lagrangian paths computed as integral curves of v f and v respectively; the integral is carried out over one coarse-grid time-step, which is equivalent to M fine-grid time steps. We can then define the difference ∆x r,i = ∆x(t i , x r 0 ) := x r f (t r + M ∆t f ) − x c (t r + M ∆t f ) and apply the method of [HJS07] to compute the Empirical Orthogonal Functions (EOFs). To summarise, we subtract off the time mean to define ∆x r,i := ∆x r,i − 1 N N −1 i=0 ∆x r,i . In the x-direction we then have a matrix with components ∆x r,i . From this we construct the matrix Λ (x) which has components Λ  j (x) = δ ij , where the sum is over all grid points. We apply the same process to the ycomponent ∆y r,i to obtain N eigenvectors in the y-direction, which we denote a (y) i . We do not compute the eigenvectors for the z-direction since these will be obtained from the incompressibility condition.
We remark that the method we have used here, in which we compute the EOFs of each component of ∆x separately, is different from the method found in other sources e.g. [HLB96], in which the components are computed together and we obtain a set of two-component eigenvectors a i immediately, with one eigenvalue λ i corresponding to each of these EOFs. However, this method was attempted for SALT runs in the current set-up and the results of model runs were less successful. For this reason we have chosen to compute the components separately.
Thus, in our case we have N eigenvectors in each of the horizontal directions and these will have associated eigenvalues λ gives an indication of how much of the variance is captured by the corresponding eigenvector. Therefore, we choose to scale the parameters so that ξ in order to ensure that the different methods for computing ξ i may be compared fairly, we require that the L 2 -norm of the sum be the same for each method. Thus we impose the following: where γ is a constant with units ms −1/2 , which we shall choose later; V tot is the total volume of the domain. and N ξ ≤ N is the number of EOFs we choose to keep for our model runs. The total integral, denoted by angle brackets, is defined by a, b := x a(x) · b(x)V (x). We can achieve the required properties by choosing the following scaling: where V (x) is the volume of the grid cell located at x and we have defined λ tot := i ). After computing the horizontal components in this way, ξ (h) i are then smoothed to zero near the boundaries in order to enforce the impermeability condition at the boundary, ξ (h) i · n = 0, where n is the normal to the boundary and ξ For the z-component we use the incompressibility condition eq. (7) along with the impermeability condition ξ i · ∇ (z + H) = 0 at the lower boundary z = −H to obtain: where ∇ (h) = ( ∂ ∂x , ∂ ∂y ) is the horizontal gradient. This method for computing the vertical component of ξ i is applicable to any system of fluid equations with an incompressibility condition. We could, alternatively, compute all three components of ξ i as EOFs of the three components of ∆x. However, the resulting threecomponent vector ξ i will not be guaranteed to be divergence-free. We would then need to subtract off the divergent part ξ i → ξ i = ξ i − ∇∆ −1 (∇ · ξ i ) where ∆ −1 is the inverse Laplacian. However, computing the divergent part of the vector ξ i is computationally expensive; moreover, the components of ξ i computed in this way will not be guaranteed to be orthogonal with respect to ·, · . Thus in this paper we consider only the ξ i for which the vertical components are computed from integrating the incompressibility condition.

Eulerian Differences
To calibrate the parameters φ I used in SFLT we propose an alternative method by using differences in fixed Eulerian coordinates. Consider the deterministic momentum equation given by: and the SFLT equation: where the notation(·) are used on the variables of the SFLT equations to emphasise the difference between deterministic and stochastic variables. The goal of the stochastic parameterisation is to decompose the "true" fluid flow to a slow drift component and a rapid fluctuating component whose amplitude can be estimated from data. In the example of estimating the momentum fluctuation dΦ ofm h , we denote the slow drift component asm h and we seek the solution to the minimisation problem Assuming D, T and S do not have rapidly fluctuating components, the minimisation problem becomes We see that this minimisation problem can be solved by taking dΦ = m h −m h dt = (u −ū) dt. Therefore, we define the differences ∆x r,I := ∆x(t I , for I = 1, ..., N . We then assume the expansion dΦ = N I=1 φ I • dB I t . As before, we subtract the time-mean to obtain ∆x r,I = ∆x r,I − 1 N N −1 I=0 x r,I and then compute the EOFs exactly as we did in section 3.1 to get our parameters φ I .
In both methods we initially compute horizontal components of the stochastic parameters using EOFs, but for SALT there is the additional step of integrating the incompressibility condition to obtain the vertical component. The vertical component is not needed for φ I since it is a part of the decomposition of the fluid momentum m, the vertical part of which vanishes in the Primitive Equations. In fully three-dimensional models in which the vertical component of the momentum is non-zero, the Eulerian differences of the momenta will be a three-dimensional object and one can compute all three components of the parameters φ I using EOFs.
We can also consider using Eulerian differences as an option for ξ i in SALT. This effectively means approximating the fine-grid Lagrangian path by taking only one time-step in the coarse grid: We can expect that this will be a reasonable approximation for small M , but for larger M the Lagrangian paths method will diverge from the Eulerian differences. In our numerical investigations in SALT we shall consider ξ i computed from both the Lagrangian paths method and Eulerian differences method. For SFLT we also consider φ I computed from Lagrangian paths (for completeness) as well as those computed by Eulerian differences as described above.

Results
We   are imposed at all boundaries. The model is spun up for three years from zero initial velocity and an initial temperature profile given by T (z) = T 0 + λ αρ0 (1 − β) tanh z z0 + β z H , which is based on the test case described in [RDH + 12, SDB + 16]. We take T 0 = 25 • C, β = 0.05, λ = 5kgm −3 , z 0 = 300m, ρ 0 = 1030kgm −3 , and α = 0.00025K −1 . For simplicity, salinity is kept constant and we use a linear equation of state which depends only on temperature: b = −α(T − 10 • C). The flow is driven by a wind forcing in the upper layer given by τ (x, y) = −τ0∆z0 ρ0 cos πy 15 • x, where ∆z 0 = 10m is the thickness of the upper layer; τ 0 = 0.2ms −2 is the wind strength. The vertical discretisation consists of 23 layers, with layer thicknesses increasing with depth. For the horizontal discretisation we take a fine grid of spacing 1/4 • and a coarse grid of spacing 1/2 • . At the latitudes we are considering, 1/4 • corresponds to an eddy-permitting model, while 1/2 • may be considered non-eddy resolving, see [Hal13]. We run the deterministic model on the fine grid and the coarse grid, and carry out the SALT and SFLT runs on the coarse grid. All coarse-grid runs are begun from the same initial condition, being the final time snapshot after the three-year spin-up period; the fine-grid run is begun from the end of the three-year spin-up on the fine grid. We save data in each case at intervals of 15 days, over a time period of 10 years, for a total of 240 snapshots. From the fine grid data we have the 'truth' velocity v f . To this we apply a coarse-grainingv; we then follow the procedures outlined in section 3 to compute ξ (h) i and φ I . However, there is no canonical choice for how the coarse-graining should be done. We consider a filter defined by an equally-weighted nine-point average over nearest neighbours, and we denote this filter F; this filter, applied once, has a width equal to the spacing on the coarse grid, i.e. 1/2 • . The coarse-graining will then be done by applying this filter N f ilt times successively, then projecting onto the coarse grid. Thus, the smoothing filter applied N f ilt times will be denoted F N f ilt ; this has a width N f ilt /2 degrees with a stronger weighting for points closer to the centre of the filter. We consider the cases N f ilt = 1, 4, 32.
From the deterministic model run, we have velocities saved at 240 time snapshots, so we can use these to compute 240 EOFs. We do this for both the Lagrangian paths method and the Eulerian differences method, for each of the three choices of N f ilt ; this gives a total of six sets of parameters. In our model runs we shall choose to keep N ξ = N φ = 32 of these parameters for each run. In fig. 1 we plot the square-root of the sum of the squares of these parameters (before re-scaling by γ) as a field in space. From fig. 1 it appears the differences between Lagrangian paths or Eulerian differences are minimal. We remark that here the time-steps on the fine and coarse grids differ only by a factor of 2; it is expected that if a bigger difference in resolution is used, then more steps will be needed in computing the Lagrangian paths and therefore the corresponding parameters will differ more substantially. The number of times we apply the smoothing operator, however, has a much greater effect and we see significantly different fields with N f ilt = 32 than we do with N f ilt = 4 or N f ilt = 1. Indeed, it appears from fig. 1 that the weaker filter causes the parameters to be more strongly concentrated around the western boundary, whereas for the stronger filter the parameters are spread more across the domain.
The cumulative spectra of the EOFs are shown in fig. 2. These spectra show us how many EOFs are needed to capture a given percentage of the total variability; or conversely, how much variance is captured by a given number of EOFs. We show in each case how much variability is captured by using 32 EOFs. In all cases the Lagrangian paths method gives a slightly higher variability captured, though the difference is small, especially for the smaller values of N f ilt . A much bigger variability is captured, however, in the N f ilt = 32 case when compared with the N f ilt = 1 case.
We implemented SALT and SFLT into FESOM2 (see appendix A) and ran the model with each choice of parameters and with the appropriate re-scaling as detailed above. For all SALT runs we use N ξ = 32 with the scaling γ = 2 × 10 −3 ms −1/2 . For SFLT we also take N φ = 32 but scale the parameters with γ = 10 2 ms −1/2 . This re-scaling is chosen empirically taking γ to be the largest value possible that will not result in model blow-up. It appears that the magnitude of parameters that we are able to use for SFLT is much higher. This is possibly due to the fact that SFLT does not involve any direct modification of the tracer equation. SALT, on the other hand, includes an advection of the temperature by the stochastic transport velocity; using higher values for this velocity may destabilise the tracer equation and cause model blow-up.
The results of these runs are shown in figs. 3 to 5. Figure 3 shows the eddy kinetic energy (EKE), defined by E = 1 2 |u − u | 2 , where u is the time-averaged velocity. We notice that the eddy kinetic energy is significantly less in the coarse-grid deterministic run than it is in the fine-grid run. This is probably due to the fact that small scales are less present in the coarse-grid flow, and in the coarse-grid model the viscosity used is greater and so kinetic energy is dissipated at a faster rate. However, when we include SALT there is, for most choices of ξ i , a notable increase in EKE across the domain, particularly around the western boundary. The exception is in the cases in which the coarse-grained velocitiesv used to calculate ξ i are defined with only one application of the smoothing operator, as shown in panels (c) and (d) in fig. 3. This could be because, from fig. 2, the inclusion of 32 ξ i captures a smaller amount of the total variability; it may also be that the effect of the ξ i s is more spread out across the domain, as shown in fig. 1, which overall has a greater impact than having them more highly concentrated in one region. For SFLT there is only a modest improvement in the EKE field, and the effect is similar for all choices of the parameters. In all cases there appears to be little difference between the Eulerian differences method and the Lagrangian paths method when the same N f ilt is used.
We can also consider the spatial spectra, as shown in fig. 4. There we see that the 1/4 • run contains higher EKE at all scales than the low-resolution run. Every SALT run succeeds in increasing the energy at almost all scales and in shifting the spectrum towards that of the 1/4 • run. The most significant improvements are seen in the run with Eulerian parameters computed with N f ilt = 32; in contrast, there is only a small change from the deterministic run when the N f ilt = 1 Eulerian parameters are used. For SFLT the improvement is again less noticeable, with all choices of parameters only giving a slight increase in EKE at all scales.
Since we are working with the Primitive Equations, the buoyancy can have a large effect on the fluid flow. We therefore consider the temperature, which determines buoyancy directly via the linear equation of state. Figure 5 shows vertical temperature profiles at the end of the ten-year run. In the coarse-grid model there is a slightly lower average temperature in the upper layers of the fluid, and slightly higher temperatures in the lower layers. However, with SALT included there is, for some choices of parameters, a significant reduction in temperature in the upper layers, while at lower depths the temperature increases relative to the deterministic model. Considering the time series of spatially averaged temperature at z = −5m and z = −1350m in Figure 6, we see the downwards diffusion effects are persistent in time. In the deterministic case we see that the coarse-grid model has a stronger downwards diffusion of temperature than the fine-grid run. The inclusion of SALT also accelerates this downward-diffusion effect. It therefore appears that the calibrated stochastic terms we have included in the temperature equation with SALT cause a downwards-diffusion effect. Indeed, an additional SALT run (not shown), in which the stochastic terms were not included in the temperature advection, did not display this downwards diffusion behaviour. Thus, further investigation will be required in order to determine how to avoid the excessive downwards diffusion in the tracer equation while maintaining a positive effect on the EKE field. SFLT has very little effect on the temperature field when compared to that of the low-resolution model. This is expected however since there are no direct stochastic effects in the temperature equation. Comparing with SALT runs where the temperature downwards-diffusion effect is present against the SFLT runs, we believe that the temperature is the dominant force for the evolution of velocity, at least at the resolutions we have considered here. Then, the limited effects on EKE by the SFLT framework are explained as it does not affect the driving temperature fields directly. It remains part of future work to consider the case where SFLT is added to the temperature field.

Summary and Discussion
This work lays the groundwork for the application of two relatively new stochastic parameterisation frameworks to the Primitive Equations. The first, SALT, has hitherto only been applied to simple idealised ocean models such as QG and 2D Euler. The second, SFLT, had not been investigated numerically prior to the present work. We have demonstrated some of the desirable theoretical properties of the stochastic Primitive Equations with the noise added in these ways. Notably, the preservation of a circulation theorem for SALT and energy conservation for SFLT. We have proposed to calculate the parameters ξ i governing SALT and    φ I governing SFLT by two different methods: Lagrangian paths and Eulerian differences. We find that there are no significant differences between the two methods, either in the parameters themselves or in the results of model runs. In this case it is preferable to use the Eulerian differences method, as the parameters in this case are computationally less expensive to compute. However, we have used a set-up in which the fine-grid resolution is only is only 2 times the coarse-grid resolution. However, using a larger ratio of grid resolutions would mean more time-steps are needed in the Lagrangian paths and so may give different EOFs that differ more significantly than what we have observed here. We do observe, however, that there are large sensitivities to the choice of smoothing used in defining the coarse-grained velocity, from which the parameters are calculated. In the SALT case, the model runs using parameters calculated with a strong smoothing filter show a significant improvement in the eddy kinetic energy field at all depths, as well as in the eddy kinetic energy spectrum. In the SFLT case, the improvement in EKE field and EKE spectrum are more modest compared to the improvement by SALT due to the lack of direct stochastic effects to the driving temperature fields. Considering the temperature profile, however, we observe that SALT causes significant additional downward diffusion when compared with the deterministic model. It remains an open problem to devise a method to avoid this effect. The answer may lie in a different method for configuring the parameters ξ i or it may be the case that this is a property intrinsic to SALT. In either case, further study is needed in this direction.
The stochastic parameterisation frameworks considered in this paper distils all uncertainties of the ocean models into the stochastic parameters ξ i and φ I . However, the effects of these stochastic parameterisations could be limited by the model, both physically and numerically. Examples of the limiting factors for the Primitive Equations are the forcing from the temperature field and artificial viscosity imposed for numerical stability. The interplay between numerical effects such as artificial viscosity and stochastic parameterisation is particularly interesting for future work. This is due to different numerical viscosity are imposed at different mesh resolutions to numerical stability which influences the calibration process. Thus, we expect there are limits to the effects of SALT and SFLT for low-resolution simulations where viscosity are dominant. In highresolution simulations, we expect to see further effects of stochasticity as the influence of viscosity diminishes. After all, the problem of stochastic parameterisations are not just model-dependent, it also dependent on the numerical method solving it.
College for their thoughtful comments and discussions. We acknowledge the Alfred Wegener Institute for the use of their computing facilities. The authors are grateful for partial support, as follows. RH for the EPSRC scholarship (Grant No. EP/R513052/1); and SP for the EPSRC Centre for Doctoral Training in the Mathematics of Planet Earth (Grant No. EP/L016613/1). Finally, we thank the anonymous reviewer for giving useful feedback on our manuscript.

A Numerical Implementation
In order to apply SALT and SFLT to FESOM2 we adapt the time-stepping scheme to include the appropriate stochastic terms. Details of the original (deterministic) time-stepping are given in [DSWJ16]. We modify the scheme from FESOM2 to a two-step Heun-type method [BBT04]; we choose this because of the use of Stratonovich integrals, to which the Heun method converges. The first step in the method is to compute the modified pressure:p where ∆W i n+1 and ∆B I n+1 are independent, normally-distributed random variables with mean 0 and variance ∆t. For the sake of conciseness we shall assume that the buoyancy depends only on temperature T , and that salinity is kept constant; however, extending the method to include additional tracers should be straightforward. The advective, diffusive and pressure parts of the momentum right-hand-side are then computed: ∆û n+1 =R n+1/2 − ∇ (p n h + η n ) ∆t + D u n , ∆û n+1 (30) whereR n+1/2 is an Adams-Bashforth interpolation of the advective and Coriolis terms. In fact we havê R n+1/2 = 3 2 + R n − 1 2 + R n−1 , whereR n = R [v n ∆t, u n ] + i R [ξ i , u n ] − ∇ (h) ξ i · u n ∆W i n+1 − I R v n , φ I ∆B I n+1 and R [v, u] := −∇ · (vu) − f × v. D includes the horizontal and vertical diffusion terms, as well as the external wind forcing.
The change in free surface height ∆η n+1 is computed implicitly: Once this has been solved we can finally compute the stepped-forward horizontal velocity: Then we solve for the total layer thicknessh, which in the continuous case is the same as the free surface height η; in the discrete case, however, they are different and we compute: In our present set-up we then set the free-surface height as a linear interpolation of the total layer heights: where θ ∈ [0, 1] is an arbitrary parameter, which we set equal to 1.
Since we have the horizontal velocity we may compute the vertical velocity: The newly-computed three-dimensional velocity, along with the stochastic SALT velocity, is then used to advect the tracer: T n+3/2 = T n+1/2 − R T T n+1/2 , T n−1/2 ,v n+1 ∆t + ξ i ∆W n+3/2 + K T n+1/2 where R T denotes the advection scheme and K is the diffusion. From these steps we compute intermediate valuesX n+1 := û n+1 ,η n+1 ,ĥ n+3/2 ,T 3/2 from values at the previous two time steps: u n , u n−1 ,h n+1/2 , T n+1/2 , T n−1/2 . We may write this schematically as:X n+1 = X n + F X n , X n−1 where F is an operator representing the computations outlined above. For the corrector step we follow the same steps as above, to compute F X n+1 , X n and we have the overall evolution given by: This method differs from the usual Heun method because the right-hand side depends on the previous two time-steps, rather than just the previous one. It remains to prove that adding the stochasticity with this method does converge to the required Stratonovich integrals.