1 Introduction

Ocean general circulation models used at climatic scales are limited, for evident computational reasons, to horizontal resolutions that are too coarse to correctly resolve ocean mesoscale and sub-mesoscale eddies, even on large computational infrastructures. The horizontal resolution of the most recent climate ocean models is of the order of the Rossby radius of deformation. These models are hence in the so-called eddy-permitting regime and can only partially resolve the mesoscale (i.e. 10–100 km) eddy field. They suffer, however, from strong limitations. In particular, they are unable to accurately reproduce large-scale structures such as the eastward turbulent jet in an idealized double-gyre configuration.

Recent parameterizations have shown significant improvements in coarse-resolution models when compared to high-resolution reference solutions [2]. However, this remains an important topic of research, as the current generation of parameterizations is not fully able to represent the effects of the unresolved scales on the large-scale flow structures.

A wide range of subgrid parameterizations relies on eddy viscosity, such as Laplacian and biharmonic schemes [16, 10, 4, 3]. It has been shown in [9] that including only these (hyper)viscosities in coarse-resolution models often causes too much dissipation and results in an artificial energy sink at large scales. In general, even eddy-permitting models are not energetic enough and, as a result, the long-time average of a coarse model’s variable of interest departs substantially from the long-time average of high-resolution models subsampled at the same scale. This is the main motivation of the present work. In particular, we would like to answer the following question: how can we reduce the excessive loss of resolved kinetic energy due to viscosity while simultaneously ensuring numerical stability?

We propose a simple affine parameterization of (hyper)viscosity. The (bi-)Laplacian operator \(\Delta ^p f\) is replaced by \( \Delta ^p \left (f - f' \right )\), where f′ is a field of the same dimension as f that does not depend upon time. We interpret this method as a mathematical regularization technique that guides the solutions towards prior information. We frame f′ as the solution of an optimal control problem whose goal is to reproduce statistics computed from a reference solution or from observations. We present a method to solve this optimal control problem.

We test the proposed method with an idealized double-gyre configuration. For that purpose, we release with this article a fast, concise, and CPU-GPU portable PyTorch implementation of a multi-layer quasi-geostrophic model on a rectangular domain. We implement and test our optimization procedure within this setting.

This article is organized as follows. In Sect. 2 we present the double-gyre quasi-geostrophic model we use and detail its implementation. In Sect. 3 we present our modified viscosity parameterization, and in Sect. 4 we show and discuss numerical results.

2 Double Gyre Quasi-Geostrophic Model

2.1 Governing Equations

We use the same multi-layer quasi-geostrophic model in a non-periodic rectangular domain as in [6]. Here, we only give a brief review of this system. The quasi-geostrophic pressure and potential vorticity (PV) are stacked in three isopycnal layers. We adopt vector forms to denote the layered pressure and PV:

$$\displaystyle \begin{aligned} \mathbf{p} = \begin{bmatrix} p_1\\ p_2\\ p_3 \end{bmatrix} , \quad \mathbf{q} = \begin{bmatrix} q_1\\ q_2\\ q_3 \end{bmatrix} \ . \end{aligned}$$

The forced and damped quasi-geostrophic (QG) equations can then be written as

$$\displaystyle \begin{aligned} & \partial_t \mathbf{q} = \frac{1}{f_0}J (\mathbf{q}, \mathbf{p}) + f_0 B \mathbf{e} + \frac{1}{f_0} \big( a_2 \Delta - a_4 \Delta^2 \big) (\Delta \mathbf{p}), {} \end{aligned} $$
(1)
$$\displaystyle \begin{aligned} & \big(\Delta - f_0^2 A \big)\mathbf{p} = f_0 \mathbf{q} - f_0 \beta (y - y_0) , {} \end{aligned} $$
(2)

where \(\Delta = \partial _{xx}^2 + \partial _{yy}^2\) denotes the horizontal Laplacian, \(\Delta ^2\) the bi-Laplacian operator, \(J(a, b) = \partial _x a \, \partial _y b - \partial _x b \, \partial _y a\) stands for the Jacobian operator, \(f_0 + \beta (y - y_0)\) is the Coriolis parameter under the beta-plane approximation with the meridional axis center \(y_0\), and \(a_2\) and \(a_4\) are the Laplacian and biharmonic viscosity coefficients. Parameters of the configuration are listed in Tables 1 and 2 in the Appendix. The second term on the right-hand side of Eq. (1) represents the external forcing applied on the different layers. In this work, we only consider an idealized case in which the ocean basin is driven by a stationary and symmetric wind stress \(\vec {\tau } = (\tau ^x, \tau ^y)\) at the surface and by a linear Ekman stress at the bottom. In that case, the forcing term can be specified by

$$\displaystyle \begin{aligned} B = \begin{bmatrix} \frac{1}{H_1} & \frac{-1}{H_1} & 0 & 0 \\ 0 & \frac{1}{H_2} & \frac{-1}{H_2} & 0 \\ 0 & 0 &\frac{1}{H_3} & \frac{-1}{H_3}\\ \end{bmatrix},\ \quad \mathbf{e} = \begin{bmatrix} \partial_x \tau^y - \partial_y \tau^x\\ 0\\ 0\\ \frac{\delta_{\mathrm{ek}}}{2 |f_0|} \Delta p_3 \end{bmatrix},\ \quad \vec{\tau} = \tau_0 \begin{bmatrix} -\cos{}(2 \pi y / L_y)\\ 0 \end{bmatrix}, \end{aligned}$$

where \(\tau _0\) is the magnitude of the surface wind stress, \(H_k\) is the background thickness of layer k, and \(\delta _{\mathrm {ek}}\) is the bottom Ekman layer thickness. The vertical stratification of such a model is described by the term \(- f_0^2 A \mathbf {p}\) in Eq. (2) with

$$\displaystyle \begin{aligned} A = \begin{bmatrix} \frac{1}{H_1 g_{1.5}^{\prime}} & \frac{-1}{H_1 g_{1.5}^{\prime}} & 0\\ \frac{-1}{H_2 g_{1.5}^{\prime}} & \frac{1}{H_2}\left( \frac{1}{g_{1.5}^{\prime}} + \frac{1}{g_{2.5}^{\prime}}\right) & \frac{-1}{H_2 g_{2.5}^{\prime}}\\ 0 &\frac{-1}{H_3 g_{2.5}^{\prime}} & \frac{1}{H_3 g_{2.5}^{\prime}} \end{bmatrix} , \end{aligned}$$

where \(g_{k+0.5}^{\prime }\) is the reduced gravity defined across the interface between layers k and k + 1. A multi-layered generalization of this model can be found in [5]. Note also that such a multi-layered model can be considered as a vertically discretized approximation of the continuously stratified QG system [17] with \(\partial _z (f_0 \partial _z \mathbf {p} / \vec {N}^2) \approx -f_0 A \mathbf {p}\) approximated by finite differences, and in which \(\vec {N}\) denotes the buoyancy (or Brunt–Väisälä) frequency.
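As a minimal sketch, the matrix A can be assembled and inspected with NumPy. The layer thicknesses and reduced gravities below are placeholder values chosen only for illustration (they are not taken from Table 1), and the function name `stratification_matrix` is hypothetical. Each row of A sums to zero, so a depth-independent pressure is left untouched by the stratification term. The eigen-decomposition of A yields the layer-to-mode change of coordinates.

```python
import numpy as np

def stratification_matrix(H, g_prime):
    """Assemble the 3x3 matrix A from thicknesses H_k and reduced gravities g'_{k+0.5}."""
    H1, H2, H3 = H
    g15, g25 = g_prime
    return np.array([
        [1 / (H1 * g15), -1 / (H1 * g15), 0.0],
        [-1 / (H2 * g15), (1 / g15 + 1 / g25) / H2, -1 / (H2 * g25)],
        [0.0, -1 / (H3 * g25), 1 / (H3 * g25)],
    ])

# Placeholder values (assumed for illustration, not taken from Table 1).
H = (350.0, 750.0, 2900.0)   # layer thicknesses in m
g_prime = (0.025, 0.0125)    # reduced gravities in m s^-2

A = stratification_matrix(H, g_prime)

# Eigen-decomposition A = C_m2l @ diag(lam) @ C_l2m (layers <-> vertical modes).
lam, C_m2l = np.linalg.eig(A)
C_l2m = np.linalg.inv(C_m2l)
```

In mode coordinates, the coupled operator \(\Delta - f_0^2 A\) becomes three decoupled operators \(\Delta - f_0^2 \lambda _i\), one per eigenvalue \(\lambda _i\) of A.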

Table 1 Common parameters for all the models
Table 2 Grid-dependent parameters

2.2 Pytorch Implementation

To facilitate numerical developments and benefit from built-in automatic differentiation, we develop a PyTorch [12] implementation of the above-described multilayer QG model. For this purpose, we closely follow the strategy of [7]:

  1. We use a regular numerical grid with finite differences.

  2. We solve the PV advection equation (1) on the whole domain except the boundaries. We use a standard 5-point finite difference scheme for the (bi-)Laplacian and the energy-enstrophy conservative Arakawa-Lamb scheme for the Jacobian [1].

  3. We apply a vertical change of coordinates to Eq. (2), which becomes a set of three inhomogeneous Helmholtz equations. We solve these equations with the spectral Discrete Sine Transform (DST) method, and we add the corresponding homogeneous Helmholtz equation solutions to ensure mass conservation.

  4. We update the boundary values of the potential vorticity q using Eq. (2).
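The spectral solve of step 3 can be illustrated for a single vertical mode with a small NumPy sketch: a Helmholtz problem \((\Delta - \lambda ) u = r\) with homogeneous Dirichlet boundary conditions, discretized with the 5-point Laplacian, is diagonalized by the type-I discrete sine transform. For readability the transform is written here as an explicit sine matrix rather than a fast transform. The function names are hypothetical, the numerical values are arbitrary, and the homogeneous solutions added for mass conservation are not included.

```python
import numpy as np

def sine_matrix(n):
    """Symmetric DST-I matrix S, which satisfies S @ S = (n + 1) / 2 * identity."""
    k = np.arange(1, n + 1)
    return np.sin(np.pi * np.outer(k, k) / (n + 1))

def laplacian_5pt(u, dx, dy):
    """5-point Laplacian of interior values u, assuming u = 0 on the boundary."""
    up = np.pad(u, 1)  # zero padding plays the role of the Dirichlet boundary
    return ((up[2:, 1:-1] - 2 * u + up[:-2, 1:-1]) / dx**2
            + (up[1:-1, 2:] - 2 * u + up[1:-1, :-2]) / dy**2)

def solve_helmholtz(rhs, dx, dy, lam):
    """Solve (laplacian_5pt - lam) u = rhs for interior values, with lam >= 0."""
    nx, ny = rhs.shape
    Sx, Sy = sine_matrix(nx), sine_matrix(ny)
    # Eigenvalues of the 1D second-difference operator with Dirichlet boundaries.
    ex = (2 * np.cos(np.pi * np.arange(1, nx + 1) / (nx + 1)) - 2) / dx**2
    ey = (2 * np.cos(np.pi * np.arange(1, ny + 1) / (ny + 1)) - 2) / dy**2
    rhs_hat = Sx @ rhs @ Sy                           # forward sine transform
    u_hat = rhs_hat / (ex[:, None] + ey[None, :] - lam)
    return (Sx @ u_hat @ Sy) * (2 / (nx + 1)) * (2 / (ny + 1))  # inverse transform

# Illustrative check on a small grid with arbitrary nondimensional values.
rng = np.random.default_rng(0)
rhs = rng.standard_normal((16, 24))
u = solve_helmholtz(rhs, dx=1.0, dy=1.0, lam=0.5)
residual = laplacian_5pt(u, 1.0, 1.0) - 0.5 * u - rhs
```

The residual vanishes up to round-off because the sine vectors are exact eigenvectors of the discrete Dirichlet Laplacian. A fast sine transform would replace the dense matrix products in practice.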

Detailed equations and numerical routine design choices can be found in [7]. We use a Heun (second-order Runge–Kutta) time-stepping scheme instead of the leapfrog scheme used by [7].
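For reference, one step of the Heun scheme for a generic equation dq/dt = rhs(q) can be sketched as follows. The function name `heun_step` and the toy right-hand side are illustrative assumptions. In the QG model, rhs would stand for the right-hand side of Eq. (1) combined with the elliptic solve of Eq. (2).

```python
def heun_step(rhs, q, dt):
    """One Heun (explicit trapezoidal, second-order Runge-Kutta) step for dq/dt = rhs(q)."""
    k1 = rhs(q)              # slope at the current state
    k2 = rhs(q + dt * k1)    # slope at the Euler-predicted state
    return q + 0.5 * dt * (k1 + k2)

# Toy usage on dq/dt = -q (not the QG model): integrate from t = 0 to t = 1.
q = 1.0
for _ in range(10):
    q = heun_step(lambda x: -x, q, 0.1)
```

Unlike leapfrog, this scheme needs only the current state and has no computational mode to filter, at the price of two right-hand-side evaluations per step.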

For the sake of numerical efficiency, we follow the recommendation of [14]: we compile computationally demanding routines and simplify finite difference calculations by reducing the number of multiplications as much as possible. We end up with a very concise code (less than 300 lines) that only depends upon the NumPy and PyTorch libraries. This implementation will be open-sourced at the time of publication.

2.3 Eddy-Resolving and Eddy-Permitting Regimes

We consider two spatial settings for our simulations:

  1. The eddy-resolving regime, our high-resolution reference with a 5 km resolution.

  2. The eddy-permitting regime, our low-resolution setting with a 40 km resolution.

Parameters for these two regimes are listed in Table 2 in the Appendix.

Shevchenko and Berloff [15] studied the differences between the flows obtained in these two regimes. The high-resolution eddy-resolving model shows a well-pronounced eastward jet fuelled by circulating mesoscale eddies, while the low-resolution eddy-permitting model does not produce a proper eastward jet, as shown in Fig. 1. Temporal statistics differ significantly between high- and low-resolution simulations.

Fig. 1

(Top) high-resolution and (bottom) low-resolution top-layer snapshots after 400 years of integration starting from zero velocity. Velocities are in m s−1 and PV in s−1

3 Proposed Modified Viscosity

3.1 Motivation

At both resolutions, we use biharmonic viscosity as in [16, 10, 4, 3], essentially because it is less dissipative at large scales than Laplacian viscosity and therefore better preserves large-scale structures. However, hyperviscosity remains much too dissipative in the “eddy-permitting” regime [9]. This excessive dissipation kills the eastward jet that is present in the high-resolution solution and that we expect to see in such a double-gyre quasi-geostrophic model. Figure 2 shows a sequence of snapshots of the low-resolution model initialized with a downsampled snapshot of the high-resolution model (see Appendix for details on downsampling). After as few as three years, the eastward jet has almost disappeared, showing that the model is too dissipative. Lowering the hyperviscosity coefficient by a factor of 10 does not solve this problem, and creates spurious gradients in the potential vorticity, as shown in Fig. 2. These numerical artifacts are due to a poor representation of the direct enstrophy cascade, which causes small-scale vorticity gradients to pile up at the cut-off frequency, together with aliasing effects.

Fig. 2

(Left) Initial condition: high-resolution snapshot on the low-resolution grid. (Center and right) Zonal velocity and potential vorticity (PV) snapshots after 3 years of integration at low resolution with Eqs. (1, 2) with (top) standard hyper-viscosity and (bottom) 10 times smaller hyper-viscosity. Aliasing effects are visible in the potential vorticity snapshots integrated with low hyper-viscosity

3.2 Modified Viscosity

Here we propose a simple affine modification of the hyperviscosity parameterization. We add a bias to the term \(\Delta \mathbf {p}\) in Eq. (1), which becomes \( \Delta \left (\mathbf {p} - \mathbf {p}' \right ) \), where \(\mathbf {p}'\) is a field of the same dimension as \(\mathbf {p}\) that does not depend upon time. The PV advection equation with hyperviscosity becomes

$$\displaystyle \begin{aligned} & \partial_t \mathbf{q} = \frac{1}{f_0}J (\mathbf{q}, \mathbf{p}) + f_0 B \mathbf{e} + \frac{1}{f_0} \big( a_2 \Delta - a_4 \Delta^2 \big) \left(\Delta (\mathbf{p} - \mathbf{p}')\right).{} \end{aligned} $$
(3)

The elliptic equation (2) remains unchanged.
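Since the operators involved are linear and \(\mathbf {p}'\) does not depend upon time, the dissipative term of Eq. (3) splits into the dissipative term of Eq. (1) and a stationary contribution. This is a direct expansion rather than an additional modeling assumption:

$$\displaystyle \begin{aligned} \frac{1}{f_0} \big( a_2 \Delta - a_4 \Delta^2 \big) \left(\Delta (\mathbf{p} - \mathbf{p}')\right) = \frac{1}{f_0} \big( a_2 \Delta - a_4 \Delta^2 \big) (\Delta \mathbf{p}) - \frac{1}{f_0} \big( a_2 \Delta - a_4 \Delta^2 \big) (\Delta \mathbf{p}') . \end{aligned}$$

The last term depends only on \(\mathbf {p}'\) and on the viscosity coefficients, so Eq. (3) can equivalently be read as Eq. (1) with an additional time-independent PV forcing.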

The goal of this additional term is to reproduce a relevant time-averaged pressure field derived from observations or high-resolution solutions. For example, the high-resolution average \(\overline {{\mathbf {p}}_{\mathrm {HR}}}\) can be downsampled to the targeted coarse grid resolution, giving \(\overline {{\mathbf {p}}_{\mathrm {HR}}}\downarrow \), and we want the average \(\overline {{\mathbf {p}}_{\mathrm {LR}}}\) of the modified low-resolution model to be as close as possible to the high-resolution reference \(\overline {{\mathbf {p}}_{\mathrm {HR}}}\downarrow \).

We face here an optimal control problem, as the low-resolution average is a function of the control parameter \(\mathbf {p}'\). We state it with the following least-squares formulation

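$$\displaystyle \begin{aligned} {\mathbf{p}}^{\prime}_{\mathrm{opt}} &= \underset{\mathbf{p}'}{\mathrm{arg\,min}} \ \mathcal{F}\left( \mathbf{p}'\right) , {} \end{aligned} $$
(4)
$$\displaystyle \begin{aligned} \mathcal{F}\left( \mathbf{p}'\right)&= \big\| \ \overline{{\mathbf{p}}_{\mathrm{LR}}}\left( \mathbf{p}'\right) - \overline{{\mathbf{p}}_{\mathrm{HR}}}\downarrow \big\|{}^2 . {} \end{aligned} $$
(5)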

This optimization problem is a priori non-convex, and we should not expect to find a global optimum. In the following, we propose a numerical procedure to find a heuristic approximation \(\hat {\mathbf {p}'}\) of the optimal solution \({\mathbf {p}}^{\prime }_{\mathrm {opt}}\).
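To make the cost functional concrete, the following NumPy sketch evaluates the squared distance between a low-resolution mean and a downsampled high-resolution mean. The block-averaging operator is an assumption made for illustration (the downsampling actually used is described in the Appendix). The function names are hypothetical and the fields are synthetic.

```python
import numpy as np

def downsample(field, factor):
    """Block-average the last two (horizontal) axes by an integer factor."""
    *lead, ny, nx = field.shape
    blocks = field.reshape(*lead, ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(-3, -1))

def cost(p_lr_mean, p_hr_mean, factor):
    """Squared L2 distance between the LR mean and the downsampled HR mean."""
    diff = p_lr_mean - downsample(p_hr_mean, factor)
    return float(np.sum(diff**2))

# Synthetic stand-ins for the mean pressures (3 layers); 40 km / 5 km gives factor 8.
rng = np.random.default_rng(0)
p_hr_mean = rng.standard_normal((3, 64, 64))
p_lr_mean = downsample(p_hr_mean, 8) + 0.1

value = cost(p_lr_mean, p_hr_mean, 8)
```

In the actual problem, each evaluation of the low-resolution mean requires a long model integration, which is why a cheap heuristic is preferred over a generic optimizer.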

The implementation of this modified hyperviscosity is simple and computationally cheap. We precompute \(\Delta \mathbf {p}'\) and subtract it from \(\Delta \mathbf {p}\) at each time-integration step. It increases the integration time of the advection equation (1) by less than 1% on CPUs and GPUs.
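A minimal NumPy sketch of this precompute-and-subtract step is given below. It is not the released implementation: the boundary treatment (edge replication), the function names, and the nondimensional coefficient values are simplifying assumptions made for illustration.

```python
import numpy as np

def lap(f, dx):
    """5-point Laplacian on the last two axes (edge replication is an assumption)."""
    pad = [(0, 0)] * (f.ndim - 2) + [(1, 1), (1, 1)]
    fp = np.pad(f, pad, mode="edge")
    return (fp[..., 2:, 1:-1] + fp[..., :-2, 1:-1]
            + fp[..., 1:-1, 2:] + fp[..., 1:-1, :-2] - 4 * f) / dx**2

def viscous_term(p, lap_p_ref, a2, a4, f0, dx):
    """(1/f0) (a2 Lap - a4 Lap^2) applied to Lap(p - p'), with Lap(p') precomputed."""
    z = lap(p, dx) - lap_p_ref   # the only extra work per time step
    lap_z = lap(z, dx)
    return (a2 * lap_z - a4 * lap(lap_z, dx)) / f0

# Arbitrary nondimensional values for illustration (not the paper's parameters).
rng = np.random.default_rng(0)
p = rng.standard_normal((3, 16, 16))
p_ref = rng.standard_normal((3, 16, 16))
a2, a4, f0, dx = 1.0, 0.1, 1.0, 1.0

lap_p_ref = lap(p_ref, dx)       # computed once, before time stepping
modified = viscous_term(p, lap_p_ref, a2, a4, f0, dx)
standard = viscous_term(p, 0.0, a2, a4, f0, dx)
```

Passing a zero reference recovers the standard term, and the modified term vanishes when the pressure equals the reference. The per-step overhead is a single array subtraction.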

3.3 Modified Viscosity Regularization

The continuously stratified QG equations can be rewritten in a variational formulation [8] with a Hamiltonian \(\mathcal {J}\) defined as

$$\displaystyle \begin{aligned} \mathcal{J} (\mathbf{p}) = \frac{1}{2} \int_\Omega \frac{1}{f_0} |\nabla \mathbf{p}|{}^2 + \frac{f_0}{N^2} (\partial_z \mathbf{p})^2 . \end{aligned}$$

Our model is a discretized version of the continuous stratification. Since we add an external wind forcing term and we use an energy conservative Arakawa advection scheme, we need to add some viscosity or hyperviscosity to dissipate energy. In a variational formulation, these (hyper-)viscous terms become the following penalization

$$\displaystyle \begin{aligned} \frac{1}{2}\int_{\Omega} a_2 |\Delta\mathbf{p}|{}^2 + a_4 \left| \nabla\left( \Delta \mathbf{p} \right) \right|{}^2, \end{aligned}$$

added to the Hamiltonian \(\mathcal {J} (\mathbf {p})\) to produce a smooth solution. The gradient-norm penalization of the Laplacian of \(\mathbf {p}\) guides the minimization toward solutions with a smooth Laplacian. Hyperviscosity corresponds to the Laplacian-norm penalization and enforces a solution of minimum Laplacian norm. The parameters \(a_2\) and \(a_4\) quantify the strength of these regularization constraints.

Here, we simply propose to replace it with the following penalization

$$\displaystyle \begin{aligned}\frac{1}{2}\int_{\Omega} a_2 |\Delta (\mathbf{p} - \mathbf{p}')|{}^2 + a_4 \left| \nabla \left(\Delta (\mathbf{p} - \mathbf{p}')\right)\right|{}^2 \ . \end{aligned}$$

We now penalize \((\mathbf {p} - \mathbf {p}')\) instead of \(\mathbf {p}\), meaning that we guide the solution toward a possibly non-smooth reference \(\mathbf {p}'\) that will produce the correct large-scale behavior.

3.4 Iterative Procedure

Here we present a method to find a solution to the optimization problem (4). A natural guess for \({\mathbf {p}}^{\prime }_{\mathrm {opt}}\) is \(\overline {{\mathbf {p}}_{\mathrm {HR}}}\downarrow \). We solve the equations with this choice and compute the average pressure \(\overline {{\mathbf {p}}_{\mathrm {LR}}}\). Results are shown in Fig. 4. It is a good first guess, but the difference \(\overline {{\mathbf {p}}_{\mathrm {HR}}}\downarrow - \overline {{\mathbf {p}}_{\mathrm {LR}}} \) is still large.

We propose the following iterative procedure to find a better guess for \({\mathbf {p}}^{\prime }_{\mathrm {opt}}\). In the following we assume that we are at low resolution, i.e. \(\mathbf {p} = {\mathbf {p}}_{\mathrm {LR}}\) and \(\overline {\mathbf {p}} = \overline {{\mathbf {p}}_{\mathrm {LR}}}\) unless explicitly stated otherwise.

  • We set \({\mathbf {p}}^{\prime }_0 = 0\) and we compute the average pressure \(\overline {\mathbf {p}}_0\) by solving the standard equations (1, 2), i.e. without modified viscosity.

  • Choose \(k \in \left ] 0, 1 \right ]\).

  • Start with \({\mathbf {p}}^{\prime }_1 = \overline {{\mathbf {p}}_{\mathrm {HR}}}\downarrow \).

  • Evolve the ensemble for n years and compute the corresponding average pressure \(\overline {\mathbf {p}}_1\) with ensemble average.

  • For n = 1…:

    • Set \({\mathbf {p}}^{\prime }_{n+1} = {\mathbf {p}}^{\prime }_{n} + k \left ( \overline {{\mathbf {p}}_{\mathrm {HR}}}\downarrow - \overline {\mathbf {p}}_n\right )\).

    • Evolve the ensemble for n years and compute new average pressure \(\overline {\mathbf {p}}_{n+1}\).

  • Return \({\mathbf {p}}^{\prime }_{n}\) and \(\overline {\mathbf {p}}_{n}\).

There is no theoretical guarantee that this procedure converges, but we observe in the next section that it converges with the double-gyre QG model that we use.
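The update rule of the procedure above can be sketched in a few lines. In this sketch, the expensive step "evolve the ensemble and average" is replaced by a hypothetical affine toy response, `toy_mean_pressure`, chosen only so that the example runs instantly. It says nothing about the behavior of the QG model itself, and the function names are illustrative.

```python
import numpy as np

def relative_sq_error(p_bar, target):
    """Relative squared error between a mean pressure and the target."""
    return float(np.sum((p_bar - target) ** 2) / np.sum(target ** 2))

def fit_reference(mean_pressure, target, k=0.7, n_iter=30):
    """Fixed-point iteration p'_{n+1} = p'_n + k (target - mean_pressure(p'_n))."""
    p_prime = target.copy()              # first guess: the downsampled HR mean
    p_bar = mean_pressure(p_prime)       # mean of the model run with this guess
    errors = [relative_sq_error(p_bar, target)]
    for _ in range(n_iter):
        p_prime = p_prime + k * (target - p_bar)
        p_bar = mean_pressure(p_prime)
        errors.append(relative_sq_error(p_bar, target))
    return p_prime, p_bar, errors

# Toy stand-in for "evolve the ensemble and average": an affine response.
def toy_mean_pressure(p_prime):
    return 0.5 * p_prime + 0.1

target = np.linspace(-1.0, 1.0, 16).reshape(4, 4)
p_prime, p_bar, errors = fit_reference(toy_mean_pressure, target)
```

For this toy response, the mismatch is multiplied by 1 − 0.5k at each iteration, which illustrates how the relaxation factor k controls whether the iteration contracts.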

4 Results and Discussion

4.1 Statistics

We use ensemble averages to compute the statistics. To create ensembles of size N, we start from a zero solution and spin up the models for 100 years with a timestep of 1200 s to reach statistically steady states as in [13]. Then we run the models for 500 years and save 10 snapshots a year to get 5000 snapshots, and we randomly select N snapshots out of these 5000 snapshots. The ensemble averages are simply averages over these N ensemble members, which we evolve in parallel. Such ensemble averages are denoted with \(\overline {\bullet }\) in the following, i.e. the average pressure is denoted by \(\overline {\mathbf {p}}\), average velocity by \(\overline {u}\), etc.
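The bookkeeping of this protocol can be sketched as follows. The 365-day year, the ensemble size N = 32, sampling without replacement, and the synthetic snapshot array are assumptions made for illustration only.

```python
import numpy as np

SECONDS_PER_YEAR = 365 * 86400            # assumes 365-day years
dt = 1200                                 # time step in s
steps_per_year = SECONDS_PER_YEAR // dt   # 26280 steps
spinup_steps = 100 * steps_per_year       # 2628000 steps for the 100-year spin-up

n_snapshots = 500 * 10                    # 500 years, 10 snapshots a year
N = 32                                    # ensemble size (placeholder value)

rng = np.random.default_rng(0)
# Synthetic stand-in for the saved snapshots: (snapshot, layer, y, x).
snapshots = rng.standard_normal((n_snapshots, 3, 8, 8))

# Draw N distinct snapshots as initial ensemble members (no replacement assumed).
idx = rng.choice(n_snapshots, size=N, replace=False)
members = snapshots[idx]
ensemble_mean = members.mean(axis=0)      # the overline average over members
```

Stacking the members along a leading batch axis is what allows them to be evolved in parallel with array operations.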

4.2 Iterative Procedure

We test the iterative procedure described in Sect. 3.4 with the double-gyre model presented in Sect. 2 in the eddy-permitting regime. We use n = 10 years to evolve the ensemble after each iterate. We compute the reference pressure average \(\overline {\mathbf {p}}_{\mathrm {HR}}\) with the same model in the eddy-resolving regime.

Figure 3 shows the relative square error \( \| \overline {\mathbf {p}}_{n} - \overline {{\mathbf {p}}_{HR}}\downarrow \|{ }^2 / \|\overline {{\mathbf {p}}_{HR}}\downarrow \|{ }^2\) at each iteration of the procedure with k = 1 and k = 0.7. The procedure converges with k = 0.7 and oscillates with k = 1.

Fig. 3

Evolution of the relative square error \( \frac {\| \overline {\mathbf {p}}_{n} - \overline {{\mathbf {p}}_{HR}}\downarrow \|{ }^2}{ \|\overline {{\mathbf {p}}_{HR}}\downarrow \|{ }^2}\) over the iterations of the procedure

Fig. 4

Top-layer average pressure (top) and velocity (bottom) of (left-to-right) proposed model at low-resolution, reference, and the difference between the two

Figure 4 shows the output average pressure \(\overline {\mathbf {p}}_{n}\) of the iterative procedure, the reference \(\overline {\mathbf {p}}_{\mathrm {HR}}\) and the difference between the two, as well as the same quantities for the zonal velocity \(\overline {u}\). Our model can reproduce the eastward jet produced by the high-resolution reference model. The kinetic energy spectra shown in Fig. 5 also show the improvement of our model compared to the low-resolution model. Finally, Fig. 6 shows high-resolution and low-resolution snapshots as well as a snapshot of the proposed model at low resolution. Our model effectively produces the eastward jet and a re-circulation zone around it where eddies are created. Artifacts can also be observed in the zonal velocity and potential vorticity on the right of Fig. 6. They are likely Rossby waves created by the harmonic regularization terms, which remain an artificial constraint, but this needs to be studied further.

Fig. 5

Average top-layer kinetic-energy spectra of the models at high-resolution (HR), at low-resolution (LR) and at low-resolution with proposed modified viscosity. The decreasing slope of the spectrum of the proposed model is much closer to the high-resolution reference

Fig. 6

PV and zonal velocity snapshots from (left-to-right) high-resolution, low-resolution and proposed model at low-resolution

5 Conclusion

We presented a simple modified-viscosity scheme for coarse-resolution ocean modeling that we derived and tested on a double-gyre multi-layer quasi-geostrophic model. We interpret it as a modified regularization technique that guides the solution toward a reference rather than producing an overly smooth solution in the eddy-permitting regime. The technique requires solving an optimization problem, and we presented a procedure to find a good guess for its solution. We showed that it converges to a reasonable solution that reproduces the input reference fairly well.

While this method mimics the average of the high-resolution model, it only partially reproduces the variability and higher-order statistics of the high-resolution model. We see in Fig. 6 that our model’s snapshots resemble the averages. In future work, we will consider using this method as a deterministic basis for stochastic parameterizations such as Location-Uncertainty [11].