1 Introduction

The climate is a nonequilibrium system whose dynamics is primarily driven by the uneven absorption of solar radiation, which is mainly absorbed near the surface and in the tropical latitudes, rather than aloft and in the mid-high latitudes, respectively. The system reacts to such an inhomogeneity in the local energy input through a complex set of instabilities and feedbacks affecting its dynamical processes and thermodynamic and radiative fluxes. Such processes lead to an overall reduction in the temperature gradients inside the system and allow for the establishment of approximate steady-state conditions [1, 2].

An example of the re-equilibration mechanism can be described as follows. The large-scale energy transport, which tends to reduce the temperature difference between low and high latitudes, is mainly performed by atmospheric disturbances in the form of synoptic and (to a lesser extent) planetary eddies. These are, in turn, fuelled by the baroclinic conversion of available potential energy into kinetic energy, which is then dissipated by friction. The presence of large temperature differences between low and high latitudes is itself responsible for the existence of a reservoir of available potential energy, which is continuously replenished thanks to the inhomogeneity of the radiative energy budget across the globe [1]. This is the core of the celebrated Lorenz energy cycle, which provides a powerful representation of the climate as an engine [3]. The thermodynamic viewpoint on the climate allows one to define its efficiency and irreversibility [4,5,6,7,8,9].

The reconstruction, interpretation, and analysis of observational data (now facilitated by the recent advances in data science); analytical tools borrowed (mostly) from mathematics and physics; and numerical simulations all contribute to our understanding of the climate. This task is exceedingly demanding since the system features nontrivial variability on a vast range of temporal and spatial scales and, furthermore, our ability to observe it has changed enormously over time. Additionally, the presence of periodic as well as irregular fluctuations in the boundary conditions does not allow the climate to reach an exact steady state [10, 11].

One of the features of the numerical investigation of the climate system is the reliance on hierarchies of models. In other terms, climate phenomena are investigated using a full range of models going from low dimensional ones to state-of-the-art Earth system models, able to represent with higher precision many aspects of the climate. See a discussion of the meaning and of the use of hierarchies of climate models in [11,12,13,14]. It is important to remark that, given the multiscale nature of the climate system, the heterogeneity of its subdomains, and the number of the active physical, chemical, and biological processes, the endeavour of constructing a model able to directly simulate all of them appears as a Sisyphean task, whilst, instead, the parametrisation of the effect of the unresolved scales on those that are explicitly simulated is an essential component of any reasonable model of the Earth system [15,16,17].

Low-order models have played, and still play today, a very important role in improving our understanding of geophysical flows. Apart from the landmark 3-dimensional model developed by Lorenz in 1963 [18] starting from the truncation of the equations describing the Rayleigh–Bénard convection introduced by Saltzman [19], simple models have been key to scientific advances in oceanography [20,21,22], dynamical meteorology [23,24,25], climate dynamics [26,27,28,29], turbulence [30,31,32,33], and convection [34, 35], among others. Additionally, low-order models supplemented by stochastic forcings have also provided the backbone of the stochastic theory of climate [36,37,38]. Aside from sheer mathematical curiosity, many of these models were created in order to shed some light on specific problems by using a physically meaningful benchmark tool that is easier to analyse mathematically and faster to simulate numerically than more complex models.

1.1 The Lorenz ’96 model

Of special relevance for the present study is the now-celebrated Lorenz ’96 model [25, 39] (hereafter L96), whose structure is briefly recapitulated below. The model consists of a lattice of N gridpoints, whose state is described by a variable. The model has periodic boundary conditions, so that it can loosely be interpreted as describing the properties of the atmosphere along a latitudinal circle. The model features in an extremely simplified—almost metaphorical—way the main processes of the atmosphere: forcing, dissipation, and advection. Two versions of the model have been proposed: a one-level version, where the dynamics takes place in a limited range of spatial and temporal scales, and a two-level version, where the lattice is augmented in order to describe dynamical processes on smaller spatial and temporal scales. The two-level version of the L96 model has the especially attractive property that the time-scale separation between the fast and the slow variables can be controlled by modulating one parameter.

The L96 model has rapidly gained relevance among geoscientists, physicists, and applied mathematicians, as it has become a benchmark testbed for parametrisations [40,41,42,43,44,45,46], for studying extreme events [47,48,49,50], for developing data assimilation schemes [51,52,53,54], for developing ensemble forecasting techniques [55,56,57], for studying the properties of Lyapunov exponents and covariant Lyapunov vectors [58,59,60,61], for developing and testing ideas in nonequilibrium statistical mechanics [62,63,64,65,66], and for investigating bifurcations [67,68,69,70,71,72,73]. By looking at these references, the reader can find a very thorough analysis of the properties of the L96 model.

The one-level L96 model can be written as:

$$\begin{aligned} \frac{\mathrm{d}X_k}{\mathrm{d}t}=X_{k-1}(X_{k+1}-X_{k-2})-\gamma X_k+F, \end{aligned}$$
(1)

with boundary conditions:

$$\begin{aligned} X_{k-N}=X_{k+N}=X_k. \end{aligned}$$
(2)

Here \(k=1,\ldots ,N\) is the index of the gridpoints defining the lattice, the nonlinear term defines a nontrivial process of advection, F is an external forcing, and \(\gamma \) (usually set to one) modulates the dissipation. In the unforced and inviscid regime, i.e. setting \(F=\gamma =0\), the energy of the system, expressed as the sum of the squares of the variables, is conserved:

$$\begin{aligned} \frac{\mathrm{d}E}{\mathrm{d}t}=\frac{\mathrm{d}}{\mathrm{d}t}\sum _{k=1}^{N}\frac{X_k^2}{2}=\sum _{k=1}^{N}X_k\frac{\mathrm{d}X_k}{\mathrm{d}t}=0. \end{aligned}$$
(3)

If \(\gamma =1\) and \(N\gg 1\), the model’s attractor is a fixed point for \(0\le F\le 8/9\). As F is increased, the fixed point \(X_1=X_2=\cdots = X_N = F\) loses stability as the system undergoes bifurcations leading to quasi-periodic behaviour for moderate values of F and chaotic behaviour for \(F\ge 5.0\) [39]. This is only a rough description of the complexity of the bifurcations taking place in the L96 model as F is changed: as discussed in detail in [70,71,72], the properties of the system depend on N in a very nontrivial way in the regime of moderate forcing. In the regime of strong forcing and developed turbulence, instead, some sort of universality emerges, as the L96 model is strongly chaotic when \(F\ge 8\) and its properties are extensive with respect to the number of nodes N [65]. By and large, the mechanism of instability of the L96 model boils down to exchanges of energy between the symmetric state and the perturbations away from it, as is particularly clear in the case of the linear instability analysis [39]. Nonetheless, the L96 model clearly features only one form of energy, which we may refer to as kinetic.
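The structure of Eqs. (1)–(3) can be illustrated with a minimal NumPy sketch. The function names, the seed, and the choice \(N=36\) are illustrative assumptions and not part of the original formulation; the periodic boundary conditions of Eq. (2) are implemented with `np.roll`.

```python
import numpy as np

def l96_tendency(x, forcing=8.0, gamma=1.0):
    """Right-hand side of Eq. (1) on a periodic lattice.

    np.roll(x, 1)[k] is x[k-1] and np.roll(x, -1)[k] is x[k+1], so the
    periodic boundary conditions of Eq. (2) are applied automatically.
    """
    advection = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2))
    return advection - gamma * x + forcing

def energy_tendency(x, forcing, gamma):
    """dE/dt = sum_k X_k dX_k/dt, as in Eq. (3)."""
    return float(np.dot(x, l96_tendency(x, forcing, gamma)))

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
x = rng.standard_normal(36)      # N = 36 is an illustrative choice

# Unforced, inviscid limit: the advective term alone should conserve energy,
# so this prints a value at round-off level.
print("dE/dt for F = gamma = 0:", energy_tendency(x, forcing=0.0, gamma=0.0))

# With gamma = 1, the symmetric state X_k = F has zero tendency.
residual = np.max(np.abs(l96_tendency(np.full(36, 0.5), forcing=0.5, gamma=1.0)))
print("max |dX/dt| at X_k = F:", residual)
```

The cancellation in the first check follows from shifting the summation index by one gridpoint, which is only possible because the lattice is periodic.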

1.2 This paper

In this paper, we propose an extension of the L96 model whereby a second variable is attached to each gridpoint, representing, metaphorically, the local thermodynamical properties, which are advected by the dynamical variables of the L96 model and undergo forcing and diffusion. The model proposed here features a meaningful definition of energy that includes the kinetic part already present in the L96 model plus a potential part associated with the fluctuations of the temperature in the domain. The fundamental advantage of the new model, analysed in detail in Sect. 2, is that it features an energy cycle that allows for conversion between the kinetic and potential forms of energy. Conceptually, this mirrors the change one has in going from a one-layer quasi-geostrophic model, where only barotropic processes are possible, to a two-layer quasi-geostrophic model, which, instead, features a coupling between dynamical and thermodynamic processes via baroclinic conversion [74]. Note that the Lorenz ’63 model [18] and, more completely so, its extensions to higher-order modes [75] feature a nontrivial energetics where exchanges take place between potential and kinetic energy [76]. The energy of the system defines the symplectic component that contributes, together with the metric one, to defining the evolution of the model [47].

The rest of the paper is structured as follows. Section 2 provides a thorough introduction to the one-level version of the model presented here. The evolution equations are presented together with the rationale on which the model is based. Additionally, a detailed analysis of its mechanical and thermodynamic properties is carried out. Finally, the linear stability analysis of the symmetric fixed point is also described. In Sect. 3, we present the results of a large set of numerical integrations of the model, aimed at exploring its properties in a rather vast range of values of two main parameters, which control the input of energy in the kinetic and potential form. We discuss the thermodynamics of the model in terms of mean values and the fluctuations of the main terms describing the energetics of the model. Such a physical characterisation of the model is complemented by the analysis of how the first Lyapunov exponent depends on the two considered parameters, in order to separate the regions where the asymptotic dynamics of the system takes place on a regular attractor from those where it takes place on a strange attractor [67, 77]. We will discover a nontrivial interplay between the two sources of forcing applied to the model. We then perform a preliminary analysis for assessing to what extent the system obeys extensive chaos. In Sect. 4, we summarise the main features of the model and the results obtained so far, and propose future lines of investigation. As in the case of the original L96 model, the model introduced here can be formulated in a two-level fashion (see Appendix A), with nontrivial couplings among different levels and variables and with a fairly sophisticated energetics, which is conceptually rather similar to the one described by the Lorenz energy cycle in the atmosphere. The analysis of the properties of the two-level model will not be performed in this paper and will be the subject of future studies.

2 Model formulation and properties

We want to extend the standard L96 model presented in the introduction by adding a second set of variables for all the gridpoints \(k=1,\ldots ,N\). The goal is to construct a toy model able to describe in a very simple yet conceptually correct way the interaction between dynamical and thermodynamical processes of the atmosphere. The evolution equations of the model we propose in this contribution are the following:

$$\begin{aligned} \frac{\mathrm{d}X_k}{\mathrm{d}t}= & {} X_{k-1}(X_{k+1}-X_{k-2})-\alpha \theta _k-\gamma X_k+F, \end{aligned}$$
(4)
$$\begin{aligned} \frac{\mathrm{d}\theta _k}{\mathrm{d}t}= & {} X_{k+1}\theta _{k+2}-X_{k-1}\theta _{k-2}+\alpha X_k-\gamma \theta _k+G, \end{aligned}$$
(5)

with \(k=1,\ldots ,N\). The variable \(\theta _k\) can be loosely interpreted as temperature at the grid-point k. The boundary conditions are defined as

$$\begin{aligned} X_{k-N}= & {} X_{k+N}=X_k,\nonumber \\ \theta _{k-N}= & {} \theta _{k+N}=\theta _k. \end{aligned}$$
(6)

The variable \(\theta _k\) undergoes a constant forcing G, a linear dissipation term, and a nonlinear term representing, loosely speaking, the advection performed by the X variables. Additionally, \(\theta _k\) and \(X_k\) are linearly coupled through a term proportional to \(\alpha \). The purpose of this coupling is to represent, in a very simplified way, the effect of correlated thermal and dynamical fluctuations on the dynamics, which allow for an exchange between kinetic and potential energy associated with thermal fluctuations, as discussed below. The introduction of a term proportional to \(\alpha \) is the only—yet important—modification in the dynamics of the X variables for this model as compared to the classical L96 model, see Eq. (1). In what follows, we consider \(F,G,\alpha ,\gamma \ge 0\).

The coupling between the X and the \(\theta \) variables is constructed in such a way that in the unforced and inviscid limit (\(F=G=\gamma =0\)) the total energy of the system

$$\begin{aligned} E=K+P=\sum _{k=1}^{N}\left( \frac{X_k^2}{2}+\frac{\theta _k^2}{2}\right) \end{aligned}$$

given by the sum of its kinetic and potential components, is conserved:

$$\begin{aligned} \frac{\mathrm{d}E}{\mathrm{d}t}=\frac{\mathrm{d}K}{\mathrm{d}t}+\frac{\mathrm{d}P}{\mathrm{d}t}=\sum _{k=1}^{N}X_k\frac{\mathrm{d}X_k}{\mathrm{d}t}+\sum _{k=1}^{N}\theta _k\frac{\mathrm{d}\theta _k}{\mathrm{d}t}=0, \end{aligned}$$
(7)

whereas, in general, K and P are not separately conserved. The quadratic functional form of the potential energy is inspired by the fact that the available potential energy in the global circulation of the atmosphere is approximately proportional to the variance of the temperature fluctuations [1, 3, 78]. The dynamical role of the function E is explored in the next section.
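The conservation property of Eq. (7) can be checked numerically. The following is a minimal sketch under stated assumptions: the function name `tendency`, the seed, and \(N=36\) are illustrative choices, and the check is performed on a random state rather than on a trajectory.

```python
import numpy as np

def tendency(x, theta, F=0.0, G=0.0, alpha=1.0, gamma=1.0):
    """Right-hand sides of Eqs. (4)-(5); np.roll enforces Eq. (6)."""
    x_m1, x_p1, x_m2 = np.roll(x, 1), np.roll(x, -1), np.roll(x, 2)
    th_p2, th_m2 = np.roll(theta, -2), np.roll(theta, 2)
    dx = x_m1 * (x_p1 - x_m2) - alpha * theta - gamma * x + F
    dtheta = x_p1 * th_p2 - x_m1 * th_m2 + alpha * x - gamma * theta + G
    return dx, dtheta

rng = np.random.default_rng(1)      # arbitrary seed (assumption)
x = rng.standard_normal(36)         # N = 36 is an illustrative choice
theta = rng.standard_normal(36)

# Unforced, inviscid limit F = G = gamma = 0, with alpha = 1.
dx, dtheta = tendency(x, theta, F=0.0, G=0.0, alpha=1.0, gamma=0.0)
dK = float(np.dot(x, dx))           # dK/dt = sum_k X_k dX_k/dt
dP = float(np.dot(theta, dtheta))   # dP/dt = sum_k theta_k dtheta_k/dt

# K and P change individually, but their sum does not (up to round-off).
print("dK/dt =", dK, " dP/dt =", dP, " dE/dt =", dK + dP)
```

In this limit the advective terms drop out of both budgets, so the two printed tendencies are equal and opposite and come entirely from the coupling proportional to \(\alpha \).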

2.1 Mechanics

By definition, the time derivative of any smooth observable \(\Psi (X_1,\ldots ,X_N,\theta _1,\ldots ,\theta _N)\) is obtained by applying the generator of the Koopman operator to \(\Psi \), as follows:

$$\begin{aligned}\frac{\mathrm {d}\Psi }{\mathrm {d}t}=\mathscr {L}[\Psi ].\end{aligned}$$

We will now show that the linear operator \(\mathscr {L}\) can be written as the sum of a contribution coming from a symplectic (indeed, quasi-symplectic, for the reasons detailed at the end of this section) term and a contribution coming from a gradient term. Indeed, we can write:

$$\begin{aligned} \frac{\mathrm {d}\Psi }{\mathrm {d}t} =\mathscr {L}[\Psi ]=\left\{ \Psi ,E\right\} +\langle \Psi ,\Gamma \rangle \end{aligned}$$
(8)

where \(\left\{ A,B\right\} =-\left\{ B,A\right\} \) is a suitably defined Poisson bracket for the functions A and B, whilst \(\langle A,B\rangle =\langle B,A\rangle \) gives the gradient contribution. The evolution Eqs. (4)–(5) are obtained by setting \(\Psi = X_i\) and \(\Psi =\theta _i\), \(i=1,\ldots ,N\), respectively.

We have that \(\langle A,B\rangle =\left( \partial _{X_i} A\right) \left( \partial _{X_i} B\right) +\left( \partial _{\theta _i} A\right) \left( \partial _{\theta _i} B\right) \), where we use the Einstein convention for the indices. The function \(\Gamma \) defining the gradient contribution to the dynamics is:

$$\begin{aligned} \Gamma =-\gamma E + F \sum _{k=1}^N X_k + G \sum _{k=1}^N \theta _k=-\gamma E +F \Xi + G \Theta , \end{aligned}$$
(9)

where \(\Xi =\sum _{k=1}^N X_k \) and \(\Theta =\sum _{k=1}^N \theta _k\). It is clear that such a component describes the irreversible dynamics, as it vanishes in the unforced, inviscid limit \(F=G=\gamma =0\).

We then discuss the symplectic term associated with the Poisson bracket. We have that

$$\begin{aligned} \begin{aligned} \left\{ A,B\right\}&=\left( \partial _{X_i} A\right) J_{ij} \left( \partial _{X_j} B\right) + \left( \partial _{\theta _i} A\right) Y_{ij} \left( \partial _{\theta _j} B\right) \\&\quad +\left( \partial _{X_i} A\right) L_{ij} \left( \partial _{\theta _j} B\right) +\left( \partial _{\theta _i} A\right) M_{ij} \left( \partial _{X_j} B\right) \end{aligned} \end{aligned}$$
(10)

where

$$\begin{aligned} \begin{aligned} J_{ij}&=X_{i-1}\delta _{i+1,j}- X_{i-1}\delta _{i-2,j}\\ Y_{ij}&=X_{i+1}\delta _{i+2,j}-X_{i-1}\delta _{i-2,j}\\ L_{ij}&=-\alpha \delta _{i,j}\\ M_{ij}&=\alpha \delta _{i,j}\\ \end{aligned} \end{aligned}$$
(11)

It is clear that the energy E is the generator of time translations according to the symplectic contribution, and the antisymmetry of the Poisson brackets enforces the corresponding conservation law already discussed in Eq. (7).
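The decomposition of Eqs. (8)–(11) can be verified against the evolution equations directly. The sketch below builds the matrices of Eq. (11) explicitly and checks that \(\{\Psi ,E\}+\langle \Psi ,\Gamma \rangle \) with \(\Psi =X_i\) and \(\Psi =\theta _i\) reproduces Eqs. (4)–(5). All function names, the parameter values, and the random test state are assumptions made for illustration.

```python
import numpy as np

def bracket_matrices(x, alpha):
    """Matrices J, Y, L, M of Eq. (11) for a given state x (periodic indices)."""
    n = x.size
    i = np.arange(n)
    J = np.zeros((n, n))
    Y = np.zeros((n, n))
    J[i, (i + 1) % n] += x[(i - 1) % n]   # X_{i-1} delta_{i+1,j}
    J[i, (i - 2) % n] -= x[(i - 1) % n]   # -X_{i-1} delta_{i-2,j}
    Y[i, (i + 2) % n] += x[(i + 1) % n]   # X_{i+1} delta_{i+2,j}
    Y[i, (i - 2) % n] -= x[(i - 1) % n]   # -X_{i-1} delta_{i-2,j}
    return J, Y, -alpha * np.eye(n), alpha * np.eye(n)

def tendency_from_brackets(x, theta, F, G, alpha, gamma):
    """{Psi, E} + <Psi, Gamma> for Psi = X_i and Psi = theta_i."""
    J, Y, L, M = bracket_matrices(x, alpha)
    grad_gamma_x = -gamma * x + F          # d(Gamma)/dX_i, from Eq. (9)
    grad_gamma_th = -gamma * theta + G     # d(Gamma)/dtheta_i
    dx = J @ x + L @ theta + grad_gamma_x  # grad E = (X, theta)
    dth = Y @ theta + M @ x + grad_gamma_th
    return dx, dth

def tendency_explicit(x, theta, F, G, alpha, gamma):
    """Eqs. (4)-(5) written out term by term."""
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - alpha * theta - gamma * x + F
    dth = (np.roll(x, -1) * np.roll(theta, -2) - np.roll(x, 1) * np.roll(theta, 2)
           + alpha * x - gamma * theta + G)
    return dx, dth

rng = np.random.default_rng(2)             # arbitrary seed (assumption)
x, theta = rng.standard_normal(12), rng.standard_normal(12)
params = dict(F=2.0, G=1.0, alpha=1.0, gamma=1.0)   # illustrative values

a = tendency_from_brackets(x, theta, **params)
b = tendency_explicit(x, theta, **params)
max_diff = max(np.max(np.abs(a[0] - b[0])), np.max(np.abs(a[1] - b[1])))

# {E, E}: the bracket of the energy with itself should vanish.
J, Y, L, M = bracket_matrices(x, params["alpha"])
bracket_EE = float(x @ J @ x + theta @ Y @ theta + x @ L @ theta + theta @ M @ x)
print("max difference:", max_diff, " {E,E} =", bracket_EE)
```

Dense matrices are used here only for transparency; for large N the shifted-array form of the explicit tendency is the more economical choice.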

We remark that, in the inviscid and unforced limit, the system is not Hamiltonian because the Poisson brackets do not fulfil the Jacobi identity \(\{A,\{B,C\}\}+\{C,\{A,B\}\}+\{B,\{C,A\}\}=0\). Because of this and of the fact that \(\{\Gamma ,E\}\ne 0\), the system given in Eqs. (4)–(5) is not metriplectic, i.e. it does not belong to the standard generalisation of Hamiltonian systems to the dissipative case [79]. Note that the dynamics of dissipative fluids is, instead, metriplectic [80], and so is the dynamics of the (extended) Lorenz ’63 model, which is in fact derived from the Rayleigh–Bénard equations through systematic modal truncation [47]. The lack of an underlying Hamiltonian skeleton confirms the well-known fact that the L96 model cannot be easily related to any model of fluid flows.

2.2 Thermodynamics

Using Eqs. (8)–(9), we obtain the time evolution of the energy of the system:

$$\begin{aligned} \frac{\mathrm {d}E}{\mathrm {d}t}=\langle E,\Gamma \rangle =-2\gamma E + F \Xi + G \Theta = \Gamma -{\gamma E}, \end{aligned}$$

which implies that, at steady state, \(2\gamma {\bar{E}} = F{\bar{\Xi }} + G {\bar{\Theta }}\), where \({\bar{\Psi }}\) is the long-term average of the quantity \(\Psi \).

We next analyse the separate budget of the kinetic and potential energy. By inserting K and P in Eq. (8), we obtain:

$$\begin{aligned} \frac{\mathrm {d}K}{\mathrm {d}t}= & {} \sum \limits _{k=1}^N X_k\frac{\mathrm {d}X_k}{\mathrm {d}t}=I_\mathrm{K}-D_\mathrm{K}+C_\mathrm{P,K},\quad I_\mathrm{K}=F \Xi ,\quad D_\mathrm{K}=2\gamma K,\quad \nonumber \\&C_\mathrm{P,K}=-\sum \limits _{k=1}^N \left( \alpha X_k \theta _k \right) , \end{aligned}$$
(12)
$$\begin{aligned} \frac{\mathrm {d}P}{\mathrm {d}t}= & {} \sum \limits _{k=1}^N \theta _k\frac{\mathrm {d}\theta _k}{\mathrm {d}t}=I_\mathrm{P}-D_\mathrm{P}-C_\mathrm{P,K},\quad I_\mathrm{P}=G \Theta ,\quad D_\mathrm{P}=2\gamma P, \end{aligned}$$
(13)

where \(I_\mathrm{K}\) (\(I_\mathrm{P}\)) is the rate of input of kinetic (potential) energy, \(D_\mathrm{K}\) (\(D_\mathrm{P}\)) is the dissipation rate of kinetic (potential) energy, and \(C_\mathrm{P,K}\) is the conversion rate from potential to kinetic energy. We remark that the input and dissipation of energy in either kinetic or potential form is due to the metric component of the dynamics. Instead, the conversion of energy between the potential and kinetic form is controlled by the Poisson brackets given in Eqs. (10)–(11). Nonetheless, the components J and Y of the Poisson brackets, which describe advection, do not give any net contribution. Equations (12)–(13) describe the energetics of the model presented in this work, which is represented by the diagram shown in Fig. 1. One can draw a parallel between the energetics of this model and the Lorenz energy cycle of the atmosphere, where, as is well known, the input of energy comes almost entirely through the potential energy channel via baroclinic forcing associated with the differential heating of low versus high latitude regions [1, 3]. The two-level version of the model introduced in this paper features an energetics that is conceptually closer to the one of the true atmosphere because it is able to describe energy cascades across scales on top of energy conversion processes, see “Appendix A”.
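The budget terms of Eqs. (12)–(13) are simple to evaluate for any instantaneous state. The sketch below (illustrative names, seed, and parameter values; not the authors' code) computes them and checks that they close the kinetic and potential energy budgets obtained from the model tendencies.

```python
import numpy as np

def tendency(x, theta, F, G, alpha, gamma):
    """Right-hand sides of Eqs. (4)-(5) with periodic indices."""
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - alpha * theta - gamma * x + F
    dth = (np.roll(x, -1) * np.roll(theta, -2) - np.roll(x, 1) * np.roll(theta, 2)
           + alpha * x - gamma * theta + G)
    return dx, dth

def budget_terms(x, theta, F, G, alpha, gamma):
    """Input, dissipation and conversion rates of Eqs. (12)-(13)."""
    return {
        "I_K": F * x.sum(),                      # F * Xi
        "D_K": gamma * np.dot(x, x),             # 2 * gamma * K
        "I_P": G * theta.sum(),                  # G * Theta
        "D_P": gamma * np.dot(theta, theta),     # 2 * gamma * P
        "C_PK": -alpha * np.dot(x, theta),       # potential -> kinetic
    }

rng = np.random.default_rng(3)                   # arbitrary seed (assumption)
x, theta = rng.standard_normal(36), rng.standard_normal(36)
p = dict(F=2.0, G=1.0, alpha=1.0, gamma=1.0)     # illustrative values

dx, dth = tendency(x, theta, **p)
t = budget_terms(x, theta, **p)

dK_direct = float(np.dot(x, dx))
dP_direct = float(np.dot(theta, dth))
dK_budget = float(t["I_K"] - t["D_K"] + t["C_PK"])
dP_budget = float(t["I_P"] - t["D_P"] - t["C_PK"])
print("kinetic budget residual:  ", dK_direct - dK_budget)
print("potential budget residual:", dP_direct - dP_budget)
```

Both residuals are at round-off level because the advective contributions associated with J and Y sum to zero over the periodic lattice.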

Fig. 1 Energetics of the model presented in Eqs. (4)–(5). We indicate the fluxes of energy in and out of the reservoirs of kinetic (K) and potential (P) energy. Dashed lines represent input of energy; dotted lines represent energy dissipation; and solid lines represent energy conversion terms. The direction of the arrows indicates a positive energy flux. See text for details.

At steady-state conditions, one has \(2\gamma \bar{K} = F {\bar{\Xi }} + \bar{C}\) and \(2\gamma \bar{P} = G {\bar{\Theta }} -\bar{C}\), which relate the size of the reservoirs of kinetic and potential energy to the intensity of the acting forcings and of the energy exchange. We can also introduce a notion of efficiency of this model, \(\eta =\bar{C}/(F\bar{\Xi }+G\bar{\Theta })=\bar{C}/(2\gamma \bar{E})\), which relates the amount of energy exchanged between the two reservoirs of energy to the total energy input. Since \(X_k^2+\theta _k^2\ge 2|X_k\theta _k|\) \(\forall k\), we have that \(2 E\ge |C|/\alpha \). Therefore, \(|\eta |\le \alpha /\gamma \), which provides a constraint on the efficiency of the system. Note that \(\eta \) is positive if, on average, energy is converted from potential to kinetic, and negative otherwise.
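The constraint on the efficiency can be traced step by step. The following is a sketch of the chain of inequalities, assuming \(\alpha >0\) and \(\bar{E}>0\), with C denoting \(C_\mathrm{P,K}\) and the overbar the long-term average:

$$\begin{aligned} 2E&=\sum _{k=1}^{N}\left( X_k^2+\theta _k^2\right) \ge 2\sum _{k=1}^{N}|X_k\theta _k|\ge 2\left| \sum _{k=1}^{N}X_k\theta _k\right| =\frac{2|C|}{\alpha }\ge \frac{|C|}{\alpha },\\ |\eta |&=\frac{|\bar{C}|}{2\gamma \bar{E}}\le \frac{\overline{|C|}}{2\gamma \bar{E}}\le \frac{2\alpha \bar{E}}{2\gamma \bar{E}}=\frac{\alpha }{\gamma }. \end{aligned}$$

Read this way, the last step of the first line leaves a factor of two unused, so the stated constraint appears to hold with some margin; this observation is our reading of the inequalities rather than a statement made in the text.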

Note that \(K\ge 0\) and \(P\ge 0\) by definition. As a result, if \(G,\gamma > 0\) and \(F=0\), one has \({\bar{C}}\ge 0\) (on average, we have an energy flux from potential to kinetic; see Footnote 1), whereas if \(F,\gamma > 0\) and \(G=0\), one has \(\bar{C}\le 0\) (on average, we have an energy flux from kinetic to potential). If \(F=G=0\), instead, we have that \(\bar{K}=\bar{P}=\bar{E}=0\), where the origin is a stable fixed point for the system.

2.3 Linear stability analysis

We investigate the linear stability of the system around the fixed point corresponding to the symmetric solution \(X_k=\mathrm{const}\) and \(\theta _k=\mathrm{const}\) for all \(k=1,\ldots ,N\). By plugging this ansatz into Eqs. (4)–(5), one gets:

$$\begin{aligned} X_k= & {} {\tilde{X}}=\frac{\gamma F-\alpha G}{\gamma ^2+\alpha ^2} \forall k, \end{aligned}$$
(14)
$$\begin{aligned} \theta _k= & {} {\tilde{\theta }}=\frac{\alpha F+ \gamma G}{\gamma ^2+\alpha ^2} \forall k. \end{aligned}$$
(15)
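Eqs. (14)–(15) follow from the fact that the advective terms vanish on a uniform state, leaving the linear system \(\gamma {\tilde{X}}+\alpha {\tilde{\theta }}=F\), \(\alpha {\tilde{X}}-\gamma {\tilde{\theta }}=-G\). A minimal check, with illustrative function names and parameter values that are not taken from the paper:

```python
import numpy as np

def symmetric_fixed_point(F, G, alpha, gamma):
    """Uniform fixed point of Eqs. (14)-(15)."""
    denom = gamma**2 + alpha**2
    return (gamma * F - alpha * G) / denom, (alpha * F + gamma * G) / denom

def tendency(x, theta, F, G, alpha, gamma):
    """Right-hand sides of Eqs. (4)-(5) with periodic indices."""
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - alpha * theta - gamma * x + F
    dth = (np.roll(x, -1) * np.roll(theta, -2) - np.roll(x, 1) * np.roll(theta, 2)
           + alpha * x - gamma * theta + G)
    return dx, dth

p = dict(F=3.0, G=1.5, alpha=0.7, gamma=1.2)     # illustrative values (assumption)
x_fp, th_fp = symmetric_fixed_point(**p)

n = 36                                           # illustrative lattice size
dx, dth = tendency(np.full(n, x_fp), np.full(n, th_fp), **p)
residual = max(np.max(np.abs(dx)), np.max(np.abs(dth)))
print("fixed point:", x_fp, th_fp, " max |tendency|:", residual)

# Special case used later in the text: gamma = alpha = 1 and F = G.
print(symmetric_fixed_point(F=2.0, G=2.0, alpha=1.0, gamma=1.0))
```

For \(\gamma =\alpha =1\) and \(F=G\), the function returns \({\tilde{X}}=0\) and \({\tilde{\theta }}=F\).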

Taking inspiration from [39] (see Footnote 2), we then investigate the linear stability of this solution by substituting \(X_k={\tilde{X}}+A\exp (\sigma t)\exp (\mathrm {i}\kappa k-\mathrm {i}\omega t)\) and \(\theta _k={\tilde{\theta }}+B\exp (\sigma t)\exp (\mathrm {i}\kappa k-\mathrm {i}\omega t)\) in Eqs. (4)–(5), where \({\tilde{X}}\) and \({\tilde{\theta }}\) have been defined in Eqs. (14) and (15), respectively; A and B are complex constants, \(\sigma \) is a real number defining the growth rate (if positive) of the amplitude of the wave, whilst \(\kappa \) is the wavenumber and \(\omega \) is the angular frequency of the wave. Neglecting terms that are quadratic in the wave amplitude, one obtains:

$$\begin{aligned} (\sigma -\mathrm {i}\omega )A= & {} {\tilde{X}} A \left( \exp (\mathrm {i}\kappa )-\exp (-2\mathrm {i}\kappa ) \right) -\alpha B -\gamma A \end{aligned}$$
(16)
$$\begin{aligned} (\sigma -\mathrm {i}\omega )B= & {} 2\mathrm {i}{\tilde{X}} B \sin (2\kappa )+2\mathrm {i}{\tilde{\theta }} A \sin (\kappa )+ \alpha A-\gamma B, \end{aligned}$$
(17)

We exclude the trivial solution \(A=B=0\) and, thanks to linearity, we set \(A=1\) (only the ratio \(b=B/A\) is indeed relevant). We separate real and imaginary part in the previous equations and obtain:

$$\begin{aligned}&\displaystyle \sigma ={\tilde{X}} \left( \cos (\kappa )-\cos (2\kappa )\right) -\alpha \mathbf {Re} \{b\} -\gamma \end{aligned}$$
(18)
$$\begin{aligned}&\displaystyle \omega =-{\tilde{X}} \left( \sin (\kappa )+\sin (2\kappa )\right) +\alpha \mathbf {Im}\{b\} \end{aligned}$$
(19)
$$\begin{aligned}&\displaystyle \sigma \mathbf {Re}\{b\}+\omega \mathbf {Im}\{b\} =-2{\tilde{X}} \mathbf {Im}\{b\} \sin (2\kappa )+ \alpha -\gamma \mathbf {Re}\{b\}, \end{aligned}$$
(20)
$$\begin{aligned}&\displaystyle \sigma \mathbf {Im}\{b\}-\omega \mathbf {Re}\{b\} =2{\tilde{X}} \mathbf {Re}\{b\} \sin (2\kappa )+2{\tilde{\theta }} \sin (\kappa )-\gamma \mathbf {Im}\{b\}, \end{aligned}$$
(21)

where \(\mathbf {Re}\{x\}\) and \(\mathbf {Im}\{x\}\) are the real and imaginary part of the complex number x, respectively. The conditions leading to the bifurcation associated with the loss of stability of the fixed point given in Eqs. (14)–(15) can be derived by setting \(\sigma =0\) in Eqs. (18)–(21) and finding \(\omega \), \(\kappa \), \(\mathbf {Re}\{b\}\), and \(\mathbf {Im}\{b\}\) as a function of the parameters F, G, \(\alpha \), and \(\gamma \). In the case \(\omega \ne 0\), the onset of the neutral wave corresponds to a Hopf bifurcation.

Solving the previous Eqs. (18)–(21) and finding the expression of \(\sigma \) and \(\omega \) as a function of \(\kappa \) and of the parameters \(F,G,\alpha \), and \(\gamma \) gives the dispersion relation of the waves. Additionally, obtaining the real and imaginary part of b allows for understanding the relative amplitude of the waves in the X and \(\theta \) variables.
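One way to obtain the dispersion relation numerically is to read Eqs. (16)–(17) as a \(2\times 2\) eigenvalue problem for the vector (A, B), with eigenvalue \(s=\sigma -\mathrm {i}\omega \). The sketch below follows this route; it is an illustration under stated assumptions (function names, the \(\kappa \) grid, and the selection of the fastest-growing branch are our choices, and \(\alpha >0\) is assumed so that the ratio \(b=B/A\) is well defined).

```python
import numpy as np

def stability_matrix(kappa, F, G, alpha=1.0, gamma=1.0):
    """Matrix M(kappa) such that s * (A, B) = M(kappa) (A, B), from Eqs. (16)-(17)."""
    denom = gamma**2 + alpha**2
    x_fp = (gamma * F - alpha * G) / denom       # Eq. (14)
    th_fp = (alpha * F + gamma * G) / denom      # Eq. (15)
    return np.array([
        [x_fp * (np.exp(1j * kappa) - np.exp(-2j * kappa)) - gamma, -alpha],
        [2j * th_fp * np.sin(kappa) + alpha, 2j * x_fp * np.sin(2 * kappa) - gamma],
    ])

def dispersion(kappa, F, G, alpha=1.0, gamma=1.0):
    """Return (sigma, omega, b) for the fastest-growing branch at wavenumber kappa."""
    vals, vecs = np.linalg.eig(stability_matrix(kappa, F, G, alpha, gamma))
    i = int(np.argmax(vals.real))
    s = vals[i]                                  # s = sigma - i * omega
    b = vecs[1, i] / vecs[0, i]                  # relative amplitude B / A
    return s.real, -s.imag, b

def max_growth_rate(F, G, alpha=1.0, gamma=1.0, n_kappa=181):
    """Largest sigma over a grid of wavenumbers in [0, pi] (continuum approximation)."""
    kappas = np.linspace(0.0, np.pi, n_kappa)
    return max(dispersion(k, F, G, alpha, gamma)[0] for k in kappas)

# Neutral wave in the special case gamma = alpha = 1, F = G = sqrt(2), kappa = pi/2.
sigma, omega, b = dispersion(np.pi / 2, np.sqrt(2.0), np.sqrt(2.0))
print("sigma =", sigma, " omega =", omega, " b =", b)

# The symmetric fixed point is linearly stable just below and unstable just above.
print(max_growth_rate(1.3, 1.3), max_growth_rate(1.5, 1.5))
```

Scanning `max_growth_rate` over F and G gives a quick linear estimate of the stability boundary, which can then be compared with the sign of the first Lyapunov exponent obtained from direct simulations.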

Note that the linear stability analysis of the L96 model can be obtained by setting \(\alpha =0\), \(\gamma =1\) in Eq. (16) and neglecting, instead, the \(\theta \) variables. One then recovers the result first presented in [39] and discussed in greater detail in [70]. In what follows, we consider \(F\ge 0\); an analysis of the dynamics occurring for \(F<0\) has been presented in [72]. It is possible to derive the minimal value of F such that the fixed point of the system loses stability and, correspondingly, to obtain the wavelength and frequency of the emerging neutral wave. One finds that, taking a continuum approximation (\(N\rightarrow \infty \)), the neutral wave is realised when \(F=F_\mathrm{crit}=8/9\), where the critical wavenumber is \(\kappa =\kappa _\mathrm{crit}=\arccos (1/4)\), and the critical frequency is \(\omega _\mathrm{crit}=-F_\mathrm{crit}(\sin (\kappa _\mathrm{crit})+\sin (2\kappa _\mathrm{crit}))\approx -1.29\). If one assumes that the gridpoints are arranged along a latitudinal circle where the longitude increases with the index of the gridpoints (note that the periodic boundary conditions of the system impose a toroidal topology), the crest of the neutral wave moves westward, because the phase velocity \(v_\mathrm{p}=\omega _\mathrm{crit}/\kappa _\mathrm{crit}\approx -0.98\) is negative. Instead, the group velocity is \(v_\mathrm{g}=\partial \omega _\mathrm{crit}/\partial \kappa |_{\kappa =\kappa _\mathrm{crit}}=-F_\mathrm{crit}(\cos (\kappa _\mathrm{crit})+2\cos (2\kappa _\mathrm{crit})) \approx 1.33>0\), so that the wave packets have an eastward propagation.
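The critical values quoted for the L96 model follow from the growth rate \(\sigma =F(\cos (\kappa )-\cos (2\kappa ))-1\), obtained from Eq. (18) with \(\alpha =0\), \(\gamma =1\), and \({\tilde{X}}=F\): the bracket is maximal where \(\cos (\kappa )=1/4\). The arithmetic can be reproduced with the standard library alone (variable names are illustrative):

```python
import math

# Wavenumber maximising cos(kappa) - cos(2 kappa), i.e. cos(kappa) = 1/4.
kappa_crit = math.acos(0.25)

# sigma = F * (cos(kappa) - cos(2 kappa)) - 1 = 0 at the most unstable wavenumber.
F_crit = 1.0 / (math.cos(kappa_crit) - math.cos(2.0 * kappa_crit))

# Frequency, phase velocity and group velocity of the neutral wave.
omega_crit = -F_crit * (math.sin(kappa_crit) + math.sin(2.0 * kappa_crit))
v_phase = omega_crit / kappa_crit
v_group = -F_crit * (math.cos(kappa_crit) + 2.0 * math.cos(2.0 * kappa_crit))

print(f"F_crit     = {F_crit:.4f}")       # 8/9
print(f"omega_crit = {omega_crit:.2f}")
print(f"v_phase    = {v_phase:.2f}")
print(f"v_group    = {v_group:.2f}")
```

The opposite signs of the two velocities are what distinguish the westward motion of the crests from the eastward motion of the wave packets.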

As a result of the presence of the coupling between the X and \(\theta \) variables, it is hard to find for the model introduced in this paper an explicit expression for the conditions supporting the presence of a neutral wave, even in the special cases where either F or G vanishes. A simple solution is instead found if one takes \(\gamma =\alpha =1\) and \(F=G\), which implies \({\tilde{X}} = 0\) and \({\tilde{\theta }} = F\). One then obtains the following results when imposing \(\sigma =0\) and taking the continuum approximation: \(\mathbf {Re}\{b\}=-1\), \(\omega = \mathbf {Im}\{b\}=\sqrt{2}\), \(\kappa =\arcsin (\sqrt{2}/F)\). This indicates that \(F_\mathrm{crit}=\sqrt{2}\), \(\kappa _\mathrm{crit}=\pi /2\) (corresponding to a critical wavelength of 4), and \(\omega _\mathrm{crit}=\sqrt{2}\). Therefore, the phase velocity of the neutral wave \(v_\mathrm{p}=\omega _\mathrm{crit}/\kappa _\mathrm{crit}=2\sqrt{2}/\pi \) is positive, corresponding to an eastward motion of the wave crests. Since \(\omega _\mathrm{crit}=F_\mathrm{crit}\sin (\kappa _\mathrm{crit})\), we have \(v_\mathrm{g}=\partial \omega _\mathrm{crit}/\partial \kappa |_{\kappa =\kappa _\mathrm{crit}}=F_\mathrm{crit}\cos (\kappa _\mathrm{crit})=0\), implying no net motion of the wave packets.
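The special case can be worked out explicitly. Setting \(\sigma =0\), \(\gamma =\alpha =1\), \({\tilde{X}}=0\), and \({\tilde{\theta }}=F\) in Eqs. (18)–(21) gives, as a sketch of the intermediate steps:

$$\begin{aligned} \text {Eq. (18):}\quad&0=-\mathbf {Re}\{b\}-1\;\Rightarrow \;\mathbf {Re}\{b\}=-1,\\ \text {Eq. (19):}\quad&\omega =\mathbf {Im}\{b\},\\ \text {Eq. (20):}\quad&\omega \,\mathbf {Im}\{b\}=1-\mathbf {Re}\{b\}=2\;\Rightarrow \;\omega ^2=2,\\ \text {Eq. (21):}\quad&-\omega \,\mathbf {Re}\{b\}=2F\sin (\kappa )-\mathbf {Im}\{b\}\;\Rightarrow \;\sin (\kappa )=\frac{\omega }{F}=\frac{\sqrt{2}}{F}. \end{aligned}$$

Choosing the root \(\omega =\sqrt{2}\), a real wavenumber exists only if \(F\ge \sqrt{2}\), and the threshold is first reached where \(\sin (\kappa )=1\), i.e. at \(\kappa =\pi /2\).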

3 Results

Many scientific questions can be addressed regarding the model introduced above. Building on the large literature on the L96 model discussed in the introduction, and taking into account the extra features of the current model, we can mention the following lines of investigation:

  • Analysis of the bifurcations leading the system from a fixed point to periodic and quasi-periodic behaviour and then to a chaotic regime as the forcing is increased;

  • Systematic investigation of the predictability of the system—e.g. analysis of the finite-time and asymptotic Lyapunov exponents and the corresponding covariant Lyapunov vectors as a function of the two forcing parameters F and G;

  • Systematic investigation of the energetics of the system as a function of the two forcing parameters F and G;

  • Analysis of the signal propagation through waves in the quasi-periodic and weakly chaotic regime;

  • Definition of asymptotic scaling laws for the properties of the system for large values of F and G;

  • Detection and analysis of chaos extensivity as the number of gridpoints \(N\rightarrow \infty \);

  • Extension of the model to multiple scales and analysis of dynamics and of the energetics of scale-to-scale interaction.

Obviously, it is impossible to address all these aspects with a high level of detail in the present paper. Since this is the first time this model is proposed to the scientific community, rather than focusing on one or a few of the aspects above, we will present some preliminary results that partially address each of them, in the hope of stimulating the reader to investigate them in greater detail. Further studies by the authors that focus specifically on some of the aspects mentioned above will be reported elsewhere.

All the numerical integrations presented below are performed using a Dormand–Prince method with adaptive time step, a spin-up time of 100 time units, and runs of 1000 time units. We make use of the Python module JiTCODE [81], an extension of SciPy’s ODE that allows one to numerically simulate ordinary differential equations and to compute quantities of interest such as Lyapunov exponents. All results have been double checked and confirmed using the MATLAB function ode45, where integrations are performed using an explicit Runge–Kutta (4,5) pair with adaptive time step [82].
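For readers who wish to reproduce a run without JiTCODE, a comparable setup can be sketched with SciPy's `solve_ivp`, whose `"RK45"` method is a Dormand–Prince pair with adaptive time step. This is not the authors' code: the function names, tolerances, initial-condition amplitude, seed, and output sampling are all assumptions, and only the spin-up and run lengths are taken from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, state, F, G, alpha, gamma):
    """Eqs. (4)-(5) for the stacked state (X_1..X_N, theta_1..theta_N)."""
    n = state.size // 2
    x, theta = state[:n], state[n:]
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - alpha * theta - gamma * x + F
    dth = (np.roll(x, -1) * np.roll(theta, -2) - np.roll(x, 1) * np.roll(theta, 2)
           + alpha * x - gamma * theta + G)
    return np.concatenate([dx, dth])

def integrate(F, G, n=36, alpha=1.0, gamma=1.0, spinup=100.0, length=1000.0,
              n_out=10001, rtol=1e-8, atol=1e-10, seed=0):
    """Discard a spin-up segment, then return (t, states) sampled on a regular grid."""
    rng = np.random.default_rng(seed)
    state0 = 0.1 * rng.standard_normal(2 * n)    # small random initial condition
    args = (F, G, alpha, gamma)
    spin = solve_ivp(rhs, (0.0, spinup), state0, method="RK45",
                     args=args, rtol=rtol, atol=atol)
    run = solve_ivp(rhs, (0.0, length), spin.y[:, -1], method="RK45", args=args,
                    rtol=rtol, atol=atol, t_eval=np.linspace(0.0, length, n_out))
    return run.t, run.y

# Short demonstration run; the text uses spinup=100 and length=1000.
t, y = integrate(F=2.0, G=1.0, spinup=10.0, length=20.0, n_out=201)
energy = 0.5 * np.sum(y**2, axis=0)              # E = K + P at each output time
print("time-mean energy over the short run:", energy.mean())
```

A convenient sanity check of such a setup is the case \(F=G=0\), where the total energy must decay exactly as \(\exp (-2\gamma t)\) because the advective and coupling terms conserve it.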

Fig. 2 First Lyapunov exponent of the one-level model with \(N=36\) gridpoints as a function of F and G. (a) Detail for \(0\le F,G\le 3\). (b) Full range for \(0\le F,G\le 10\). In all simulations, \(\gamma =\alpha =1\).

3.1 Transition to chaos and predictability

A simple yet fundamentally correct way to characterise at a qualitative level the dynamical properties of a system is to investigate to what extent its evolution is sensitive to its initial conditions. Roughly speaking, the first Lyapunov exponent of a system measures the asymptotic rate of growth or decay of the distance between two orbits that are initialised in the attractor of the system at infinitesimal distance from each other [83]. Similarly, the sum of the first p Lyapunov exponents gives the asymptotic average rate of growth or decay of the p-volume defined by \(p+1\) orbits that are initialised in the attractor of the system at infinitesimal distance from each other [77]. Indeed, for a Q-dimensional continuous-time dynamical system, it is possible to compute Q Lyapunov exponents \(\lambda _1,\ldots ,\lambda _Q\), where the customary ordering is such that \(\lambda _j\le \lambda _k\) if \(k\le j\) [84].

If \(\lambda _1<0\), the attractor is a fixed point, whilst if \(\lambda _1=0\) the attractor is periodic or quasi-periodic. Finally, the presence of a positive first Lyapunov exponent is strong evidence that the system is chaotic, and the value of this exponent determines quantitatively how rapidly two nearby trajectories diverge from each other. In this case, there is at least one \(\lambda _j=0\), \(j>1\), which corresponds to the direction of the flow [77, 83].
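The definition of \(\lambda _1\) in terms of two nearby orbits suggests a simple estimator: evolve a reference and a perturbed trajectory, periodically rescale their separation, and average the logarithmic growth. The sketch below implements this two-trajectory scheme with a fixed-step fourth-order Runge–Kutta integrator. It is a swapped-in illustration, not the algorithm of [84] used through JiTCODE, and the function names, step size, perturbation size, and renormalisation interval are assumptions.

```python
import numpy as np

def rhs(state, F, G, alpha=1.0, gamma=1.0):
    """Eqs. (4)-(5) for the stacked state (X_1..X_N, theta_1..theta_N)."""
    n = state.size // 2
    x, theta = state[:n], state[n:]
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - alpha * theta - gamma * x + F
    dth = (np.roll(x, -1) * np.roll(theta, -2) - np.roll(x, 1) * np.roll(theta, 2)
           + alpha * x - gamma * theta + G)
    return np.concatenate([dx, dth])

def rk4_step(state, dt, F, G):
    """One classical Runge-Kutta step (fixed step, unlike Dormand-Prince)."""
    k1 = rhs(state, F, G)
    k2 = rhs(state + 0.5 * dt * k1, F, G)
    k3 = rhs(state + 0.5 * dt * k2, F, G)
    k4 = rhs(state + dt * k3, F, G)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def first_lyapunov(F, G, n=36, dt=0.01, spinup=20.0, length=50.0,
                   d0=1e-8, renorm_every=10, seed=0):
    """Two-trajectory estimate of lambda_1 with periodic renormalisation."""
    rng = np.random.default_rng(seed)
    ref = 0.1 * rng.standard_normal(2 * n)
    for _ in range(int(round(spinup / dt))):     # discard the transient
        ref = rk4_step(ref, dt, F, G)
    direction = rng.standard_normal(2 * n)
    pert = ref + d0 * direction / np.linalg.norm(direction)
    n_renorm = int(round(length / (dt * renorm_every)))
    log_growth = 0.0
    for _ in range(n_renorm):
        for _ in range(renorm_every):
            ref = rk4_step(ref, dt, F, G)
            pert = rk4_step(pert, dt, F, G)
        diff = pert - ref
        dist = np.linalg.norm(diff)
        log_growth += np.log(dist / d0)
        pert = ref + diff * (d0 / dist)          # rescale separation back to d0
    return log_growth / (n_renorm * renorm_every * dt)

# With F = G = 0 the origin is a stable fixed point and the linearised
# dynamics has eigenvalues -gamma +/- i*alpha, so lambda_1 should be close to -1.
print("lambda_1 estimate for F = G = 0:", first_lyapunov(0.0, 0.0, spinup=5.0, length=20.0))
```

For production runs a tangent-linear integration with orthonormalisation is preferable, since it also gives access to the rest of the spectrum; the two-trajectory scheme only targets the leading exponent.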

Figure 2 shows the estimate of \(\lambda _1\) for \(N=36\) as a function of F and G in the range \(0\le F, G \le 10\). We remark that the Python JiTCODE module allows for the computation of the full spectrum of Lyapunov exponents using the algorithm proposed in [84]. The system has a negative \(\lambda _1\) for small values of the forcings, as expected; see Fig. 2a. We remark that if \(G=0\), \(\lambda _1\le 0\) for \(F\le 1.4\), whereas for the L96 model \(F_\mathrm{crit}=8/9\). This indicates that the presence of a mechanism of energy transfer between kinetic and potential energy, together with a new channel of dissipation (for potential energy), leads to higher stability for the system. We observe that \(\lambda _1\) depends in a very nontrivial way on both F and G: the system’s behaviour depends delicately on how the energy is injected into it, because the dynamics of the X and \(\theta \) variables is, in fact, quite distinct. Forcing the system through the kinetic energy channel is very different from forcing it through the potential energy channel. We also observe that the theoretical prediction of \(F_\mathrm{crit}=\sqrt{2}\) for \(F=G\) agrees with what is shown in Fig. 2a.
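The threshold \(F_\mathrm{crit}=\sqrt{2}\) for \(F=G\) can be checked numerically through a linear stability analysis of the uniform fixed point. The sketch below rests on two assumptions that are not stated in this section: the form of the model equations used in `tendency`, and the resulting fixed point \(X_k=0\), \(\theta _k=F\) for \(F=G\), \(\alpha =\gamma =1\). The Jacobian is approximated by central differences; `max_growth_rate` is a hypothetical helper name.

```python
import numpy as np

def tendency(state, F, G, alpha=1.0, gamma=1.0):
    """Assumed one-level model tendency; state = [X_1..X_N, theta_1..theta_N]."""
    N = state.size // 2
    X, th = state[:N], state[N:]
    dX = (np.roll(X, 1) * (np.roll(X, -1) - np.roll(X, 2))
          - alpha * th - gamma * X + F)
    dth = (np.roll(X, -1) * np.roll(th, -2) - np.roll(X, 1) * np.roll(th, 2)
           + alpha * X - gamma * th + G)
    return np.concatenate([dX, dth])

def max_growth_rate(F, N=36, eps=1e-6):
    """Largest real part of the Jacobian eigenvalues at the F = G fixed point."""
    fixed = np.concatenate([np.zeros(N), np.full(N, F)])   # X_k = 0, theta_k = F
    if not np.allclose(tendency(fixed, F, F), 0.0):
        raise ValueError("state is not a fixed point of the assumed equations")
    Q = 2 * N
    jac = np.empty((Q, Q))
    for j in range(Q):
        e = np.zeros(Q)
        e[j] = eps
        # Central difference; exact up to round-off for a quadratic tendency.
        jac[:, j] = (tendency(fixed + e, F, F) - tendency(fixed - e, F, F)) / (2 * eps)
    return np.linalg.eigvals(jac).real.max()

# The fixed point loses stability between these two values, bracketing sqrt(2).
print(max_growth_rate(1.40), max_growth_rate(1.43))
```

Under these assumptions the growth rate changes sign between \(F=1.40\) and \(F=1.43\), in line with the value \(\sqrt{2}\approx 1.414\) quoted in the text.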

Many other interesting features appear. If we increase F from zero to 3 whilst keeping \(G=1.7\), the asymptotic dynamics of the system changes first from quasi-periodic to a fixed point, then again to quasi-periodic, which then alternates with chaotic behaviour. Indeed, one can observe two complex tongue-like structures in Fig. 2a for \(F\ge 1.7\), \(G\ge 0.5\), which indicate the presence of a very nontrivial set of bifurcations in those regions of the parameter space, defining the transition between the quasi-periodic behaviour (the light orange region) and the chaotic regime (the dark orange and red regions in Fig. 2a).

Zooming out towards a larger range of values for F and G, the intuitive argument that increasing either F or G makes the system less predictable becomes more robust, even though there are regions where a destructive interference between the two forms of forcing is clear (in terms of reduced values of \(\lambda _1\)); compare the two troughs near the diagonal in Fig. 2b.

We remark that it is reasonable to expect that, as in the case of the L96 model [70,71,72], in the regime of moderate forcing the position and nature of the bifurcations will depend delicately on the number of nodes N, so that one should expect modifications, especially in Fig. 2a, when performing simulations for a value of N other than the value of 36 considered here. Instead, as shown in Sect. 3.4, one finds some indication of universality associated with the continuum limit \(N\rightarrow \infty \) when sufficiently strong forcing is considered.

3.2 Energetics

It is useful to investigate the long-term average of the terms in Eqs. (12)–(13) as a function of F and G, see Fig. 3. The lack of equivalence between applying forcing to the X vs to the \(\theta \) variables becomes extremely clear when looking at the conversion term (panel e). \({\bar{C}}\) is positive for large values of G and moderate values of F, and negative vice versa. The absolute value of \({\bar{C}}\) increases with F (G) if G (F) is kept constant. The zero isoline strongly deviates from the diagonal and indicates that if \(F=G\) there is a net transfer of energy from kinetic to potential. The zero isoline of \({\bar{C}}\) coincides with the ridge in the value of \(\lambda _1\) shown in Fig. 2b, indicating that the condition of no net energy exchange between the two reservoirs of energy corresponds to a state where instabilities are rather strong. The zero isoline of the efficiency \(\eta \) (see panel f), by definition, coincides with that of \(\bar{C}\). The absolute value of the efficiency grows with the asymmetry of the forcing and peaks for moderate intensity of either F or G, suggesting (see Sect. 3.5) that the energy conversion becomes less efficient when stronger forcings are considered.

The behaviour of the other thermodynamical quantities is somewhat unsurprising, as both input and dissipation of kinetic (potential) energy increase with F (G). We remark that, once again, the response of the system to the two individual forcings is quantitatively different. It should be noted that, for \(F\le 2\) and \(G\ge 2\), the net input of kinetic energy is negative (with the dissipation of kinetic energy being, by definition, positive). This indicates a very nontrivial impact of the thermodynamic variables on the dynamical ones, which are the only ones performing advection. As a result, there is an additional mechanism of energy loss for the system, whilst all the energy input takes place through the potential energy channel. Instead, when considering low values of G, the potential energy input is always positive, yet small.

Fig. 3
a \(F \frac{1}{N} {\bar{\Xi }} = F \frac{1}{N}\sum _{k=1}^N X_k\); b \(-2\gamma \frac{1}{N} {\bar{K}} = -\gamma \frac{1}{N}\sum _{k=1}^N X_k^2\); c \(G \frac{1}{N} \bar{\Theta } = G \frac{1}{N}\sum _{k=1}^N \theta _k\); d \(-2 \frac{1}{N} \gamma {\bar{P}} = -\gamma \frac{1}{N}\sum _{k=1}^N \theta _k^2\); e \(\frac{1}{N} {\bar{C}}=-\frac{1}{N}\sum _{k=1}^N \left( \alpha X_k \theta _k \right) \); f \(\eta =\bar{C}/(2\gamma \bar{E})\) as a function of F and G. In all simulations \(\gamma =\alpha =1\)
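The quantities shown in the panels of Fig. 3 can be computed from stored time series of the state variables, following the definitions in the caption. The sketch below assumes that `X` and `theta` are arrays of shape (number of time samples, N) and that time averages are plain sample means; the function name and the dictionary keys are illustrative.

```python
import numpy as np

def energetics(X, theta, F, G, alpha=1.0, gamma=1.0):
    """Long-term averages of the energy budget terms (per site) and efficiency."""
    N = X.shape[1]
    K = 0.5 * np.sum(X**2, axis=1).mean()          # mean kinetic energy
    P = 0.5 * np.sum(theta**2, axis=1).mean()      # mean potential energy
    Xi = np.sum(X, axis=1).mean()                  # mean of sum_k X_k
    Theta = np.sum(theta, axis=1).mean()           # mean of sum_k theta_k
    C = -alpha * np.sum(X * theta, axis=1).mean()  # mean conversion term
    E = K + P                                      # mean total energy
    return {
        "kinetic_input": F * Xi / N,                  # panel a
        "kinetic_dissipation": -2.0 * gamma * K / N,  # panel b
        "potential_input": G * Theta / N,             # panel c
        "potential_dissipation": -2.0 * gamma * P / N,  # panel d
        "conversion": C / N,                          # panel e
        "efficiency": C / (2.0 * gamma * E),          # panel f
    }
```

Evaluating this function on a grid of (F, G) values, one trajectory per grid point, would give the inputs for maps of the kind described in this section.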

3.3 Waves amidst chaos

We highlight some qualitative features of the dynamics of the model that indicate the presence of wave-like structures amidst chaos in the regime of moderate forcing. Figure 4 shows some examples of the evolution of the system in the case of \(N=36\) sectors with \(F=10\), \(G=0\) (panel a); \(F=10\), \(G=10\) (panel b); and \(F=0\), \(G=10\) (panel c). We use a Hovmöller-type diagram [85], where time is on the vertical axis and the variables \(X_k\) and \(\theta _k\), \(k=1,\ldots ,36\) are on the horizontal axis. This diagram is particularly well suited to appreciating wave-like structures, as it is easy to visualise wave crests. If the forcing on the \(\theta \) variables is switched off, the X variables behave similarly to the case of the L96 model, where, amidst chaos, the clear signature of a westward propagating phase velocity can be found, as already observed in [39] and recently discussed in [73]. As can be guessed from the evolution equations, the \(\theta \) variables feature weaker variability and a similar pattern of wave crests, as they are advected by the X variables and receive energy from them. The situation is qualitatively similar when both the X and \(\theta \) variables are forced, but, quite naturally, the fluctuations of the \(\theta \) variables are stronger than in the previous case. Note that in the case analysed here of \(F=G=10\), the wave crests travel in the opposite direction with respect to what we have found for the neutral wave emerging for \(F=G=\sqrt{2}\), see Sect. 2.3. Therefore, the presence of a turbulent background radically changes the kinematics of the waves. If, instead, the forcing acts on the \(\theta \) variables only, the wave crests have a much less clear direction of propagation, both for the \(\theta \) and for the X variables, where the latter feature a much lower variability, as expected. In other terms, the setup where F vanishes is characterised by absolute instability, with little or no advection of anomalies, whereas the other two cases are characterised by convective instability, where anomalies are spatially advected [86].

In the following sections, we discuss in more quantitative terms the differences emerging when forcing the X variables only, the \(\theta \) variables only, or all variables.

3.4 Chaos extensivity

Ruelle [87] proposed that systems with short-range interaction can feature extensive chaos, because large domains can be hierarchically partitioned into smaller, weakly interacting subdomains with similar properties. One way to test for chaos extensivity is to analyse the finite-size scaling of the Lyapunov exponents. Specifically, one plots the spectrum of Lyapunov exponents obtained for different values of the system size Q (in our case, \(Q=2N\)) against the rescaled index \(x=(j-1/2)/Q\) and tests whether a universal curve is obtained in the limit of large values of Q [65, 88]. We remark that chaos extensivity implies that the ratio between the Kaplan–Yorke dimension of the attractor [89], also referred to as the Lyapunov dimension [67], and Q tends to a constant as \(Q\rightarrow \infty \).
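The two diagnostics mentioned above, the rescaled index \(x=(j-1/2)/Q\) and the Kaplan–Yorke dimension, are straightforward to compute once a Lyapunov spectrum is available. The sketch below uses the standard Kaplan–Yorke formula; the function names are illustrative, and the spectrum in the usage example is made up for demonstration and is not a result from the paper.

```python
import numpy as np

def rescaled_index(Q):
    """Rescaled index x_j = (j - 1/2) / Q for j = 1, ..., Q."""
    j = np.arange(1, Q + 1)
    return (j - 0.5) / Q

def kaplan_yorke_dimension(spectrum):
    """Kaplan-Yorke (Lyapunov) dimension of a spectrum of Lyapunov exponents."""
    lam = np.sort(np.asarray(spectrum, dtype=float))[::-1]   # descending order
    csum = np.cumsum(lam)
    if csum[0] < 0.0:
        return 0.0                  # even the first exponent is negative
    if csum[-1] >= 0.0:
        return float(lam.size)      # the sum never becomes negative
    # p is the largest number of leading exponents with a non-negative sum.
    p = int(np.max(np.nonzero(csum >= 0.0)[0])) + 1
    return p + csum[p - 1] / abs(lam[p])

# Illustrative spectrum (not from the paper): dimension 2 + (1 + 0) / 2.
print(kaplan_yorke_dimension([1.0, 0.0, -2.0]))
```

To test extensivity, one would plot each spectrum against `rescaled_index(2 * N)` for several N and compare `kaplan_yorke_dimension(spectrum) / (2 * N)` across system sizes.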

In order to prove convincingly the extensive nature of chaos in the system analysed here, one should test such a property for all values of F and G. This is beyond the scope of this paper. Yet, preliminary results do support extensivity for the three reference cases \(F=10\), \(G=0\); \(F=G=10\); \(F=0\), \(G=10\) shown in Panels (a)–(c) of Fig. 4. Indeed, we have here performed simulations with \(N=18\), \(N=36\), and \(N=72\) and, as shown in Fig. 4d, the Lyapunov exponents spectra seem to collapse to universal curves as N grows: the Lyapunov spectra plotted against their respective rescaled indices can hardly be visually distinguished. This is especially encouraging in view of the clear evidence for chaos extensivity in the L96 model [60, 65].

Fig. 4
Slice of 10 time units of the evolution of the system with \(N=36\) for \(F=10\), \(G=0\) (a); \(F=10\), \(G=10\) (b); and \(F=0\), \(G=10\) (c). Time is on the vertical axis. On the horizontal axis, the first 36 variables are the \(X_k\), \(k=1,\ldots ,36\), followed by the variables \(\theta _k\), \(k=1,\ldots ,36\). d Extensivity of the system—spectrum of Lyapunov exponents as a function of the rescaled index \(x=(j-1/2)/(2N)\) for \(F=10\), \(G=0\) (red); \(F=10\), \(G=10\) (black); \(F=0\), \(G=10\) (blue). Solid lines: \(N=72\); dots: \(N=36\); crosses: \(N=18\). In all simulations, \(\alpha =\gamma =1\)

3.5 Scaling laws for strong forcings

As thoroughly analysed in [65], in the one-layer L96 model the average energy per unit site scales to a very high degree of approximation as \(F^{1.33}\) for large values of F. The origin of such a scaling law is still unknown. We report some preliminary results of scaling laws obtained for the current model in some special configurations of parameters. We have performed long integrations (1000 time units) at steady state for three sets of experiments:

  1. \(F=2^j\), \(j=3,\ldots ,14\); \(G=0\);

  2. \(F=G=2^j\), \(j=3,\ldots ,14\);

  3. \(G=2^j\), \(j=3,\ldots ,14\); \(F=0\);

which correspond to applying a forcing of increasing strength on the X variables only, on both the X and the \(\theta \) variables, or on the \(\theta \) variables only, respectively. These are regimes of forcing where, as in the case of the L96 model [65], one might expect chaos extensivity to apply to a very good approximation, see Sect. 3.4. We obtain the following approximate asymptotic scaling laws, which are rather accurate when F and/or G are larger than 256:

  1. \(\bar{K},\bar{E}\propto F^{1.33}\), \(\bar{\Xi }\propto F^{0.33}\), \(\bar{\Theta }\propto F^{-0.28}\), \(\bar{P}=-\bar{C}/2\propto F^{0.71}\), \(\lambda _1\propto F^{0.66}\);

  2. \(\bar{K},\bar{E},\bar{P}\propto F^{1.33}\), \(\bar{P}\approx 0.7\bar{K}\), \(\bar{C}<0,|\bar{C}|\propto F^{0.70}\), \(\bar{\Xi },\bar{\Theta }\propto F^{0.33}\), \(\bar{\Theta }\approx 0.7\bar{\Xi }\), \(\lambda _1\propto F^{0.66}\);

  3. \(\bar{P},\bar{E}\propto G^{1.50}\), \(\bar{K}=\bar{C}/2\propto G^{1.00}\), \(\bar{\Theta }\propto G^{0.50}\), \(\bar{\Xi }\propto G^{0.00}\), \(\lambda _1\propto G^{0.50}\);

where the uncertainty is 0.01 for all the numbers above. As is clear from these scaling relations, and in agreement with what one could intuitively guess by looking at Fig. 4, forcing the system through the X variables is rather different from forcing it through the \(\theta \) variables, and the interplay between the two reservoirs of energy is far from trivial. If the forcing is applied to only one set of variables, the energy cycle is more enhanced, ceteris paribus, when the \(\theta \) variables undergo the forcing. Indeed, the reservoir of total energy and the conversion \(\bar{C}\) of energy between the two forms are larger than for the corresponding case of forcing applied solely to the X variables. The behaviour of the quantities \(\bar{\Xi }\) and \(\bar{\Theta }\) is also extremely different in the two cases, implying that the forcing impacts the spatially coherent fluctuations of the variables in qualitatively different ways. If \(F=G\), the ratio of the average size of the two reservoirs of energy is a constant, with \(\bar{K}\) being larger than \(\bar{P}\) (and, correspondingly, \(\bar{\Xi }\) being larger than \(\bar{\Theta }\)), and the average flux of energy goes from kinetic to potential. We recall that the dynamics of the \(F=G\) case is characterised by convective instability, similarly to the case where \(G=0\); compare Fig. 4a and b. The reason why the forcing through the kinetic channel dominates in the special case of \(F=G\) is still unclear and should be further investigated.
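Exponents of the kind listed above are typically estimated with a least-squares fit in log-log space, restricted to the strongly forced regime. The sketch below is one possible way to do this and is not necessarily the procedure used by the authors; the function name and the default threshold of 256 (taken from the remark on where the laws become accurate) are assumptions, and the data in the usage example are synthetic.

```python
import numpy as np

def fit_power_law(forcing, values, min_forcing=256.0):
    """Fit |values| ~ prefactor * forcing**exponent for forcing >= min_forcing."""
    forcing = np.asarray(forcing, dtype=float)
    values = np.asarray(values, dtype=float)
    mask = forcing >= min_forcing
    # A straight-line fit of log|values| against log(forcing) gives the exponent.
    exponent, intercept = np.polyfit(
        np.log(forcing[mask]), np.log(np.abs(values[mask])), 1
    )
    return exponent, np.exp(intercept)

# Synthetic data for illustration only (not results from the paper).
F = 2.0 ** np.arange(3, 15)
E_synthetic = 0.5 * F ** 1.33
exponent, prefactor = fit_power_law(F, E_synthetic)
print(exponent, prefactor)
```

Taking the absolute value before the logarithm allows the same routine to be applied to quantities such as \(\bar{C}\), which can be negative.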

In all cases, the amount of energy that is converted between the two forms becomes a negligible fraction of the total incoming energy in the limit of large forcing. In other terms, the efficiency of the model \(\eta =\bar{C}/{(2\gamma \bar{E})}\) tends to zero if either F or G tends to infinity, even if the absolute value of \(\bar{C}\) tends to infinity. It is then unsurprising that when considering the limit of large F, regardless of whether G is also increased, we obtain for the X variables results that agree with those found for the one-layer L96 model; compare with [65]. In this regard, a useful piece of information is obtained by looking at the properties of \(\lambda _1\) in the large forcing limit. One obtains that in scenarios 1 and 2, \(\lambda _1\propto F^{0.66}\), which is again in excellent agreement with what is obtained for the L96 model (including the pre-exponential factor). Scenarios 1 and 2 seem to feature rather similar dynamics, the main difference between the two being the strength of the fluctuations of the \(\theta \) variables; compare with panels a and b of Fig. 4.
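The vanishing of the efficiency in the strong forcing limit follows directly from the exponents reported in this section. As a short worked check (each exponent carries the stated uncertainty of 0.01, so the combined exponents are approximate):

```latex
\[
|\eta| = \frac{|\bar{C}|}{2\gamma \bar{E}} \propto
\begin{cases}
F^{0.71}/F^{1.33} = F^{-0.62}, & \text{scenario 1 } (G=0),\\[2pt]
F^{0.70}/F^{1.33} = F^{-0.63}, & \text{scenario 2 } (F=G),\\[2pt]
G^{1.00}/G^{1.50} = G^{-0.50}, & \text{scenario 3 } (F=0).
\end{cases}
\]
```

In all three cases the exponent is negative, so \(\eta \rightarrow 0\) as the forcing grows even though \(|\bar{C}|\) itself increases.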

The growth of \(\lambda _1\) with G is slower when F is set to zero, where we have absolute instability. We can gain a qualitative understanding of the different impact on \(\lambda _1\) of changes in the value of G vs F by comparing panels a, b and c of Fig. 4, which nonetheless describe weaker regimes of forcing (what follows also holds in the case of stronger applied forcing).

4 Conclusions

Simple and conceptual models have proved extremely useful for better understanding the dynamics of climate as a whole as well as of its individual components. Indeed, their usefulness ranges from serving as testbeds for developing new methods in data analysis, data assimilation, and model testing; to supporting the definition of new metrics for testing more complex models; to providing valuable insights into the basic active physical mechanisms and most prominent mathematical features.

The model presented in this paper goes in this direction and has been constructed in order to provide a new layer of physical complexity to the L96 model by adding a new variable to each gridpoint of the model. This variable can be loosely interpreted as a local temperature and allows for the establishment of a complex energetics for the system, encompassing energy input, output, and conversion. Two forms of energy are present in the system, a kinetic one and a potential one. We are also able to introduce a notion of efficiency, which is useful for studying the conversion of energy from one form to the other. The energetics of the model is reminiscent of that of the real atmosphere. Extending previous analyses, we have provided a fairly complete analysis of the mechanics of the new model by separating a quasi-symplectic and a metric component of its dynamical structure. The energy of the system is used to construct the antisymmetric evolution operator, whose corresponding brackets are not true Poisson brackets because they do not obey the Jacobi identity; hence, the symplectic structure is not complete.

We have then performed a preliminary analysis of some of the key aspects of the new model by investigating how its properties change as a function of the two parameters that control the input of kinetic and potential energy. We have studied, in a special case, the Hopf bifurcation leading to the onset of the neutral wave from the fixed point solution. The interplay between the two forcings is extremely nontrivial in the weak forcing regime, where much needs to be explored regarding the transition from fixed point to quasi-periodic to chaotic asymptotic states, and one expects that the structure and position of the bifurcations might depend delicately on the number of modes included in the system, similarly to the case of the L96 model. When considering regimes associated with stronger forcing, the system exhibits extensive chaos, even if there is clear evidence of wave-like structures emerging in the context of an overall strongly chaotic flow. Understanding the interplay between ordered wave-like structures and turbulence seems of great interest.

The system reacts differently depending on how we force it. The nature of the flow is impacted because absolute vs convective instability dominates if we force the system through the potential energy vs kinetic energy channel, respectively. The mechanism of energy conversion ensures that the variables that are not directly forced also feature nontrivial variability. If the strength of the forcing is the same in the two channels, the kinetic energy channel ends up being more efficient: the dynamics is characterised by convective instability, and, on average, energy is transferred from the kinetic to the potential form. The reason for this behaviour is still unclear. Similarly to the case of the L96 model, it is possible to obtain accurate power laws describing how some of the fundamental dynamical and thermodynamical properties of the system scale with the forcing parameters in the limit of very strong forcings.

The analysis presented here is only a first step in the direction of better understanding the properties of this model, which we believe has the potential of being of great interest for investigations in areas like statistical physics, nonlinear dynamics, data assimilation, mechanics, model reduction techniques, and extreme events.

Finally, again along the lines of the L96 model, we have introduced a two-level version of the model (see Appendix A), which allows for studying multi-scale dynamics. Its energetics resembles, conceptually, that of the atmosphere, where the Lorenz energy cycle succinctly describes the input and output of energy in the kinetic and potential forms, as well as the conversion between the two forms and between energy compartments at small vs large scales. The study of the properties of this model, which is a fortiori extremely promising in the fields above, will be carried out in future work.