Stochastic Hamilton equations were introduced along parallel lines with the deterministic canonical theory in Bismut (1982). These results were later extended to include reduction by symmetry in Lázaro-Camí and Ortega (2008). Reduction by symmetry of expected-value stochastic variational principles for Euler–Poincaré equations was developed in Arnaudon et al. (2014) and Chen et al. (2015). Stochastic variational principles were also used in constructing stochastic variational integrators in Bou-Rabee and Owhadi (2009).
The inclusion of noise in fluid equations has a long history in the scientific literature. For reviews and recent advances in stochastic turbulence models, see Kraichnan (1994) and Gawedzki and Kupiainen (1996); and in the analysis of stochastic Navier-Stokes equations, see Mikulevicius and Rozovskii (2001). These studies of the stochastic Navier-Stokes equation are fundamental in the analysis of fluid turbulence. Expected-value stochastic variational principles leading to the derivation of the Navier-Stokes motion equation for incompressible viscous fluids have been investigated in Arnaudon and Cruzeiro (2012). For further references, we refer to Holm (2015), on which the present work is based. This work has recently had a sequence of further developments, which we now briefly sketch.
(1) In Holm (2015), the extension of geometric mechanics to include stochasticity in nonlinear fluid theories was accomplished by using Hamilton’s variational principle, constrained to enforce stochastic Lagrangian fluid trajectories arising from the stochastic Eulerian vector field
$$\begin{aligned} v(x,t,dW) := u(x,t)\,\hbox {d}t + \sum _{i=1}^N \xi _i (x) \circ dW^i(t) \,, \end{aligned}$$
(2.1)
regarded as a decomposition into the sum of a drift velocity u(x, t) and a sum over stochastic terms. Imposing this decomposition as a constraint on the variations in Hamilton’s principle for fluid dynamics (Holm et al. 1998) led in Holm (2015) to applications in a variety of fluid theories, particularly for geophysical fluid dynamics (GFD).
(2) The same stochastic fluid dynamics equations derived in Holm (2015) were also discovered in Cotter et al. (2017) to arise in a multi-scale decomposition of the deterministic Lagrange-to-Euler flow map into a slow large-scale mean and a rapidly fluctuating small-scale map. Homogenisation theory was used to derive effective slow stochastic particle dynamics for the resolved mean part, thereby justifying the stochastic fluid partial differential equations in the Eulerian formulation. The application of rigorous homogenisation theory required assumptions of mildly chaotic fast small-scale dynamics, as well as a centring condition, according to which the mean of the fluctuating deviations was small, when pulled back to the mean flow.
Cotter et al. (2017) justified regarding the Eulerian vector field in (2.1) as a genuine decomposition of the fluid velocity into a sum of drift and stochastic parts, rather than simply a perturbation of the dynamics meant to model unknown effects in uncertainty quantification. As a genuine decomposition of the solution, one expects the properties of the fluid equations with stochastic transport noise to closely track the properties of the unapproximated solutions of the fluid equations. For example, if the unapproximated model equations are Hamiltonian, then the model equations with stochastic transport noise should also be Hamiltonian, as shown in Holm (2015).
(3) Crisan et al. (2017) showed that the same stochastic fluid dynamics derived in Holm (2015) naturally arises from an application of a stochastic Lagrange-to-Euler map to Newton’s second law for a Lagrangian domain of fluid, acted on by an external force. In addition, local well-posedness in regular spaces and a Beale-Kato-Majda blow-up criterion are proved in Crisan et al. (2017) for the stochastic model of the 3D Euler fluid equation for incompressible flow derived in Holm (2015). Thus, the analytical properties of the 3D Euler fluid equations with stochastic transport noise derived in Holm (2015) closely mimic the corresponding analytical properties of the original deterministic 3D Euler fluid equations.
(4) Inspired by spatiotemporal observations from satellites of the trajectories of objects drifting near the surface of the ocean in the National Oceanic and Atmospheric Administration’s “Global Drifter Program”, Gay-Balmaz and Holm (2017) developed data-driven stochastic models of geophysical fluid dynamics (GFD) with non-stationary spatial correlations representing the dynamical behaviour of oceanic currents. These models were derived using reduction by symmetry of stochastic variational principles, leading to stochastic Hamiltonian systems, whose momentum maps, conservation laws and Lie–Poisson bracket structures were used in developing the new stochastic Hamiltonian models of GFD with nonlinearly evolving stochastic properties.
The present section incorporates stochasticity into finite dimensional mechanical systems admitting Lie group symmetry reduction, by using the standard Clebsch variational method for deriving cotangent lifted momentum maps. We review the standard approach to Lie group reduction by symmetry for finite dimensional systems in Sect. 2.1 and incorporate noise into this approach in Sect. 2.2. Next, we describe the semidirect extension in Sect. 2.3 and study the associated Fokker–Planck equations and stationary distributions in Sect. 2.4. The primary examples from classical mechanics with symmetry will be the free rigid body and the heavy top under gravity.
Euler–Poincaré Reduction
Classical mechanical systems with symmetry can often be understood geometrically in the context of Lagrangian or Hamiltonian reduction, by lifting the motion m(t) on the configuration manifold M to a Lie symmetry group via the action of the symmetry group G on the configuration manifold, by setting \(m(t)=g(t)m(0)\), where the multiplication is to be understood as the action of G on M. This procedure lifts the solution of the motion equation from a curve \(m(t)\in M\) to a curve \(g(t)\in G\), see Marsden and Ratiu (1999) and Holm (2008). The simplest case is when \(M= G\). This case, called Euler–Poincaré reduction, will be described in the present section.
In the Lagrangian framework, reduction by symmetry may be implemented in Hamilton’s principle via restricted variations in the reduced variational principle arising from variations on the corresponding Lie group. In the standard approach, for an arbitrary variation \(\delta g\) of a curve \(g(t)\in G\) in a Lie group G, the left-invariant reduced variables are \(\xi := g^{-1}\dot{g}\in {\mathfrak {g}}\) in the Lie algebra \({\mathfrak {g}}= T_eG\). Their variations arise from variations on the Lie group and are given by
$$\begin{aligned} \delta \xi = \dot{\eta }+ \mathrm {ad}_\xi \eta \,, \end{aligned}$$
for \(\eta := g^{-1}\delta g\). Here, the operation \(\mathrm {ad}:{\mathfrak {g}}\times {\mathfrak {g}}\rightarrow {\mathfrak {g}}\) represents the adjoint action of the Lie algebra on itself via the Lie bracket, denoted equivalently as \(\mathrm {ad}_\xi \eta = [\xi ,\eta ]\), and we will freely use either notation throughout the text. If the Lagrangian \(L(g,\dot{g})\) is left-invariant under the action of G, the restricted variations \(\delta \xi \) of the reduced Lagrangian \(L(e,g^{-1}\dot{g})=:l(\xi )\) inherited from admissible variations of the solution curves on the group yield the Euler–Poincaré equation
$$\begin{aligned} \frac{\text{ d }}{\hbox {d}t}\frac{\partial l(\xi )}{\partial \xi } + \mathrm {ad}^*_\xi \frac{\partial l(\xi )}{\partial \xi }=0 \,. \end{aligned}$$
(2.2)
In this equation, \(\mathrm{ad}^*: {\mathfrak {g}}\times {\mathfrak {g}}^*\rightarrow {\mathfrak {g}}^*\) is the dual of the adjoint Lie algebra action, ad. That is, \(\langle \mathrm{ad}^*_\xi \mu ,\eta \rangle =\langle \mu ,\mathrm {ad}_\xi \eta \rangle \) for \(\mu \in \mathfrak {g}^*\) and \(\xi ,\eta \in \mathfrak {g}\), under the non-degenerate pairing \(\langle \,\cdot \,,\, \cdot \,\rangle : {\mathfrak {g}}\times {\mathfrak {g}}^*\rightarrow \mathbb {R}\). Throughout this paper, we will restrict ourselves to semi-simple Lie algebras, so that the pairing is given by the Killing form, defined as
$$\begin{aligned} \kappa (\xi ,\eta ) := \mathrm {Tr}\left( \mathrm {ad}_\xi \mathrm {ad}_\eta \right) \, . \end{aligned}$$
(2.3)
In terms of the structure constants of the Lie algebra denoted as \(c_{jk}^i\) for a basis \(e_i,\,i=1,\dots ,\mathrm{dim}(\mathfrak {g})\), so that \([e_j,e_k]=c_{jk}^ie_i\), in which \(\xi =\xi ^ie_i\) and \(\eta =\eta ^je_j\), the Killing form takes the explicit form
$$\begin{aligned} \mathrm {Tr}(\mathrm {ad}_\xi \mathrm {ad}_\eta )= c_{im}^nc_{jn}^m \xi ^i \eta ^j\, . \end{aligned}$$
An important property of this pairing is its bi-invariance, written as
$$\begin{aligned} \kappa (\xi , \mathrm {ad}_\zeta \eta ) = \kappa (\mathrm {ad}_\xi \zeta ,\eta )\, , \end{aligned}$$
(2.4)
for every \(\xi ,\eta ,\zeta \in \mathfrak {g}\). This pairing allows us to identify the Lie algebra with its dual, as the Killing form of semi-simple Lie algebras is non-degenerate. We will also use the property for compact Lie algebras, that the Killing form is negative definite and thus induces a norm on the Lie algebra, \(\Vert \xi \Vert ^2:=-\kappa (\xi ,\xi )\). This function turns out to always be an invariant function on the coadjoint orbit, for every semi-simple Lie algebra; that is, it is a Casimir function. Indeed, an invariant function is in the kernel of the Lie–Poisson bracket \(\{F,C\}(\mu ) := \kappa \left( \mu , \left[ \frac{\partial F}{\partial \mu },\frac{\partial C}{\partial \mu }\right] \right) \), \(\forall F\in C({\mathfrak {g}}^*, \mathbb {R})\). For \(C(\mu )= \frac{1}{2} \kappa (\mu ,\mu )\) it is straightforward to check using the bi-invariance (2.4) that this is true for any function F. In general, it is difficult to find other independent Casimir functions of semi-simple Lie algebras; see Perelomov and Popov (1968) and Thiffeault and Morrison (2000). Of course, the theory of semi-simple Lie algebras is standard and well developed, see for example Varadarajan (1984). However, for the sake of clarity, we will express the abstract notations of adjoint and coadjoint actions with respect to the Killing form. We may then identify \({\mathfrak {g}}^* \cong {\mathfrak {g}}\) for each semi-simple Lie algebra we treat here.
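As a concrete check of (2.3) and (2.4), the following Python sketch evaluates the Killing form of \(\mathfrak {so}(3)\) from its structure constants. It assumes the basis in which \([e_i,e_j]=\epsilon _{ijk}e_k\); the function names and the use of NumPy are illustrative choices made here and are not part of the original text.

```python
import numpy as np

def so3_structure_constants():
    """Return c[i, j, k] = c_{ij}^k = epsilon_{ijk}, so that [e_i, e_j] = c_{ij}^k e_k."""
    c = np.zeros((3, 3, 3))
    c[0, 1, 2] = c[1, 2, 0] = c[2, 0, 1] = 1.0
    c[0, 2, 1] = c[2, 1, 0] = c[1, 0, 2] = -1.0
    return c

def bracket(c, a, b):
    """Lie bracket [a, b]^k = c_{ij}^k a^i b^j (the cross product for so(3))."""
    return np.einsum("i,j,ijk->k", a, b, c)

def ad_matrix(c, xi):
    """Matrix of ad_xi in the basis e_i: entry [n, m] is xi^i c_{im}^n."""
    return np.einsum("i,imn->nm", xi, c)

def killing(c, xi, eta):
    """Killing form kappa(xi, eta) = Tr(ad_xi ad_eta), as in (2.3)."""
    return np.trace(ad_matrix(c, xi) @ ad_matrix(c, eta))
```

In this basis one finds \(\kappa (\xi ,\eta ) = -2\,\xi \cdot \eta \), so the induced norm is \(\Vert \xi \Vert ^2 = 2|\xi |^2\), and the invariance (2.4) reduces to the cyclic property of the scalar triple product.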
We now turn to the equivalent Clebsch formulation of the Euler–Poincaré equations via a constrained Hamilton’s principle, which we will use for implementing the noise in these systems. The Clebsch formulation of the Euler–Poincaré equation and its corresponding Lie–Poisson bracket on the Hamiltonian side has been explored extensively in ideal fluid dynamics (Holm and Kupershmidt 1983; Marsden and Weinstein 1983) and more recently in optimal control problems (Gay-Balmaz and Ratiu 2011) and stochastic fluid dynamics (Holm 2015). This earlier work should be consulted for detailed derivations of Clebsch formulations of Euler–Poincaré equations in the contexts of ideal fluids and optimal control problems. We will briefly sketch the Clebsch approach, as specialised to the applications treated here, since we will rely on it for the introduction of noise in finite-dimensional mechanical systems by following the approach of Holm (2015) for stochastic fluid dynamics. We first introduce the Clebsch variables \(q\in {\mathfrak {g}}\) and \(p\in {\mathfrak {g}}^*\), where p will be a Lagrange multiplier which enforces the dynamical evolution of q given by the Lie algebra action of \(\xi \in {\mathfrak {g}}\), as \(\dot{q}+ \mathrm {ad}_\xi q= 0\). Note the similarity of this equation with the constrained variations of the Lagrangian reduction theory. The Clebsch method in fluid dynamics (resp. optimal control) introduces auxiliary equations for advected quantities (resp. Lie algebra actions on state variables) as constraints in the Hamilton (resp. Hamilton-Pontryagin) variational principle \(\delta S = 0\) with constrained action
$$\begin{aligned} S(\xi ,q,p) =\int l(\xi ) \hbox {d}t + \int \langle p, \dot{q} + \mathrm {ad}_\xi q \rangle \hbox {d}t\,. \end{aligned}$$
(2.5)
Taking free variations of S with respect to \(\xi ,q\) and p yields a set of equations for these three variables which can be shown to be equivalent to the Euler–Poincaré equation (2.2). The relation between the Lie algebra vector \(\xi \in {\mathfrak {g}}\) and the phase-space variables \((q,p)\in T_e^*G\) is given by the variation of the action S in (2.5) with respect to the velocity \(\xi \). This variation yields the momentum map, \(\mu : T_e^*G \rightarrow {\mathfrak {g}}^*\), given explicitly by
$$\begin{aligned} \mu := \frac{\partial l(\xi )}{\partial \xi } = \mathrm {ad}^*_qp\, . \end{aligned}$$
(2.6)
Unless specified otherwise, we will always use the notation \(\mu \) for the conjugate variable to \(\xi \). This version of the Clebsch theory is especially simple, as the Clebsch variables are also in the Lie algebra \({\mathfrak {g}}\). In general, it is enough for them to be in the cotangent bundle of a manifold \(T^*M\) on which the group G acts by cotangent lifts. In this more general case, the adjoint and coadjoint actions must be replaced by their corresponding actions on \(T^*M\), but the method remains the same. Another generalisation, which will be useful for us later, allows the Lagrangian to depend on both \(\xi \) and q. In this case, the Euler–Poincaré equation will acquire additional terms depending on q and the Clebsch approach will be equivalent to semidirect product reduction (Holm et al. 1998). We will consider a simple case of this extension in Sect. 2.3 and in the treatment of the heavy top in Sect. 6.
Structure Preserving Stochastic Deformations
We are now ready to deform the Euler–Poincaré equation (2.2) by introducing noise in the constrained Clebsch variational principle (2.5). In order to do this stochastic deformation, we introduce n independent Wiener processes \(W_t^i\) indexed by \(i=1,2,\dots ,n,\) and their associated stochastic potential fields \(\Phi _i(q,p) \in \mathbb {R}\) which are prescribed functions of the Clebsch phase-space variables, (q, p). The stochastic processes used here are standard Wiener processes, as discussed, e.g., in Chen et al. (2015) and Ikeda and Watanabe (2014). The number of stochastic processes can be arbitrary, but we will assume it is equal to the dimension of the Lie algebra, \(n=\mathrm{dim}({\mathfrak {g}})\). The constrained stochastic variational principle is then given by
$$\begin{aligned} S(\xi ,q,p) =\int l(\xi ) \hbox {d}t + \int \langle p,\circ dq + \mathrm {ad}_\xi q\, \hbox {d}t \rangle + \int \sum _{i=1}^n \Phi _i(q,p)\circ dW^i_t\, . \end{aligned}$$
(2.7)
In the stochastic action integral (2.7) and hereafter, the multiplication symbol \(\circ \) denotes a stochastic integral in the Stratonovich sense (Kloeden and Platen 1991). The Stratonovich formulation is the only choice of stochastic integral that admits the classical rules of calculus (e.g., integration by parts, the change of variables formula, etc.). Therefore, the Stratonovich formulation is indispensable in variational calculus and optimal control. The free variations of the action functional (2.7) may now be taken, and they will yield stochastic processes for the three variables \(\xi ,q\) and p.
For convenience in the next step of deriving a stochastic Euler–Poincaré equation, we will assume that the Lagrangian \(l(\xi )\) in the action (2.7) is hyperregular, so that \(\xi \) may be obtained from the fibre derivative \(\frac{\partial l(\xi )}{\partial \xi } = \mathrm {ad}^*_qp\). We will also specify that the stochastic potentials \(\Phi _i(q,p)\) should depend only on the momentum map \(\mu = \mathrm {ad}^*_q p\), so that the resulting stochastic equation will be independent of q and p. Following the detailed calculations in Holm (2015), we then find the stochastic Euler–Poincaré equation
$$\begin{aligned} d \frac{\partial l(\xi )}{\partial \xi } + \mathrm {ad}^*_\xi \frac{\partial l(\xi )}{\partial \xi } \hbox {d}t - \sum _i \mathrm {ad}^*_\frac{\partial \Phi _i(\mu )}{\partial \mu } \frac{\partial l(\xi )}{\partial \xi }\circ dW_t^i=0\,. \end{aligned}$$
(2.8)
In terms of the stochastic process
$$\begin{aligned} dX= \xi \hbox {d}t - \sum _i \frac{\partial \Phi _i(\mu )}{\partial \mu } \circ dW_t^i \,,\quad \hbox {with}\quad \mu =\frac{\partial l(\xi )}{\partial \xi }\, , \end{aligned}$$
(2.9)
the stochastic Euler–Poincaré equation (2.8) may be expressed in compact form, as
$$\begin{aligned} d\mu + \mathrm {ad}^*_{dX} \mu =0\,. \end{aligned}$$
(2.10)
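The compact form (2.10) says that \(\mu \) moves by infinitesimal coadjoint motions, so Casimirs are preserved pathwise. The following Python sketch illustrates this for \(\mathfrak {so}(3)\cong \mathbb {R}^3\). It rests on several assumptions that are not taken from the text: the convention \(\mathrm {ad}^*_a\mu = \mu \times a\), a quadratic Hamiltonian with principal moments `inertia` so that \(\xi _k=\mu _k/I_k\), constant \(\sigma _i\) (that is, \(\Phi _i\) linear in \(\mu \)), and an exponential-map update via Rodrigues' formula, which is a numerical choice made here rather than a scheme proposed in the paper.

```python
import numpy as np

def rotate(w, v):
    """Apply the rotation exp(hat(w)) to v using Rodrigues' formula."""
    theta = np.linalg.norm(w)
    if theta < 1e-14:
        return v + np.cross(w, v)          # first-order fallback for tiny angles
    k = w / theta
    return (v * np.cos(theta)
            + np.cross(k, v) * np.sin(theta)
            + k * np.dot(k, v) * (1.0 - np.cos(theta)))

def simulate_coadjoint_motion(mu0, inertia, sigma, dt, n_steps, seed=0):
    """Sketch of d mu + ad^*_{dX} mu = 0 on so(3), read as d mu = dX x mu.

    sigma has shape (n, 3); row i is the constant vector sigma_i = -dPhi_i/dmu.
    """
    rng = np.random.default_rng(seed)
    inertia = np.asarray(inertia, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mu = np.array(mu0, dtype=float)
    path = [mu.copy()]
    for _ in range(n_steps):
        xi = mu / inertia                                   # xi = dh/dmu (assumed h)
        dW = rng.normal(0.0, np.sqrt(dt), size=sigma.shape[0])
        dX = xi * dt + sigma.T @ dW                         # increment of X in (2.9)
        mu = rotate(dX, mu)                                 # stays on the sphere |mu| = const
        path.append(mu.copy())
    return np.array(path)
```

Because each step is an exact rotation, \(|\mu |^2\) is conserved up to rounding error for every noise realisation, which mirrors the statement that the stochastic dynamics remains on a coadjoint orbit.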
The introduction of noise in the Clebsch-constrained variational principle rather than using reduction theory provides a transparent approach for dealing with stochastic processes on Lie groups and constrained variations arising for such processes. In this approach, the momentum map plays the same central role in both the deterministic and stochastic formulations. See Arnaudon et al. (2014) for a different approach, resulting in the derivation and analysis of deterministic expectation-value Euler–Poincaré equations using reduction by symmetry with conditional expectation.
Remark 2.1
(Reduction with noise) The stochastic Euler–Poincaré equation (2.8) arises from a stochastic reduction by symmetry, as follows. First, the reconstruction relation \(\dot{g}= g \xi \) in the deterministic case has its stochastic counterpart
$$\begin{aligned} dg= g \xi \hbox {d}t + \sum _i g\sigma _i \circ dW_t^i\, , \end{aligned}$$
(2.11)
where \(\sigma _i:= -\,\frac{\partial \Phi _i(\mu )}{\partial \mu } \), and the expressions \(g \xi \) and \(g \sigma _i \) are understood as the tangent of the left action of the group on itself; or equivalently, the left action of the group on its Lie algebra. Equation (2.8) then results from taking the variation of \(g^{-1} dg\) with (2.11) and comparing with the derivative of \(g^{-1}\delta g\) while setting \(d(\delta g)= \delta (dg)\). This gives the variation \(\delta \xi \) as
$$\begin{aligned} \delta \xi = d\eta + \mathrm {ad}_{(g^{-1} dg)}\eta = d\eta + \mathrm {ad}_\xi \eta \, \hbox {d}t + \sum _i \mathrm {ad}_{\sigma _i}\eta \circ dW^i_t\, , \end{aligned}$$
(2.12)
where \( \eta = g^{-1}\delta g\in {\mathfrak {g}}\) is arbitrary, except that \(\delta g(0)=0=\delta g(T)\) at the endpoints in time \(t\in [0,T]\). Then, using these constrained variations in the reduced variational principle \(\delta \int l(\xi )\hbox {d}t = 0\) yields the stochastic Euler–Poincaré equation (2.8), by the following calculation,
$$\begin{aligned} \begin{aligned} 0&= \delta \int l(\xi )\hbox {d}t = \int \left\langle \frac{\delta l}{\delta \xi } \,,\, \delta \xi \right\rangle \hbox {d}t = \int \left\langle \frac{\delta l}{\delta \xi } \,,\, d\eta + \mathrm {ad}_{(g^{-1} dg)}\eta \right\rangle \hbox {d}t\\&= \int \left\langle -\,d\frac{\delta l}{\delta \xi } + \mathrm {ad}^*_{(g^{-1} dg)} \frac{\delta l}{\delta \xi } \,,\, \eta \right\rangle \hbox {d}t + \left\langle \frac{\delta l}{\delta \xi } \,,\, \eta \right\rangle \bigg |_0^T \end{aligned} \end{aligned}$$
(2.13)
upon imposing the condition that \(\eta \) vanishes at the endpoints in time.
As in the deterministic case, various generalisations of this theory are possible. For example, as mentioned earlier, the Clebsch phase-space variables can also be defined in \(T^*M\), and the Lagrangian can depend on q for systems of semidirect product type (Gay-Balmaz and Holm 2017). Another generalisation is to let the stochastic potentials \(\Phi _i(\mu )\) also depend separately on q in the semidirect product setting, as we will see later.
After having defined the Stratonovich stochastic process (2.8), one may compute its corresponding Itô form, which is readily given in terms of the \( \mathrm {ad}^*\) operation by
$$\begin{aligned} \begin{aligned} d \frac{\partial l(\xi )}{\partial \xi } + \mathrm {ad}^*_\xi \frac{\partial l(\xi )}{\partial \xi } \hbox {d}t&+ \sum _i \mathrm {ad}^*_{\sigma _i} \frac{\partial l(\xi )}{\partial \xi }dW_t^i - \frac{1}{2} \sum _i \mathrm {ad}^*_{\sigma _i}\left( \mathrm {ad}^*_{\sigma _i}\frac{\partial l(\xi )}{\partial \xi }\right) \hbox {d}t=0\, , \end{aligned} \end{aligned}$$
(2.14)
where \(\sigma _i:= -\,\frac{\partial \Phi _i(\mu )}{\partial \mu } \). Notice that the indices for \(\sigma _i\) in the Itô sum in (2.14) are the same, and the \(\sigma _i\) may be taken as a basis of the underlying vector space. In terms of \(\mu := \frac{\partial l(\xi )}{\partial \xi } \) the Itô stochastic Euler–Poincaré equation (2.14) may be expressed equivalently as
$$\begin{aligned} d \mu + \mathrm {ad}^*_\xi \mu \hbox {d}t + \sum _i \mathrm {ad}^*_{\sigma _i} \mu dW_t^i - \frac{1}{2} \sum _i \mathrm {ad}^*_{\sigma _i}\left( \mathrm {ad}^*_{\sigma _i}\mu \right) \hbox {d}t=0\,. \end{aligned}$$
(2.15)
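To see what the Itô correction in (2.15) does in a simple case, the sketch below evaluates it on \(\mathfrak {so}(3)\cong \mathbb {R}^3\) with \(\sigma _i=e_i\). The convention \(\mathrm {ad}^*_\sigma \mu =\mu \times \sigma \) and the helper names are assumptions made for illustration only.

```python
import numpy as np

def coad(sigma, mu):
    """ad^*_sigma mu on so(3) ~ R^3, taken here as mu x sigma (assumed convention)."""
    return np.cross(mu, sigma)

def ito_correction(sigmas, mu):
    """The term -(1/2) sum_i ad^*_{sigma_i}(ad^*_{sigma_i} mu) appearing in (2.15)."""
    return -0.5 * sum(coad(s, coad(s, mu)) for s in sigmas)

def ito_norm_drift(sigmas, mu):
    """dt-coefficient of d|mu|^2 produced by the noise terms of (2.15), via Ito's formula."""
    quadratic_variation = sum(np.dot(coad(s, mu), coad(s, mu)) for s in sigmas)
    # (2.15) reads d mu + ... + ito_correction dt = 0, so the drift of mu is -ito_correction
    return 2.0 * np.dot(mu, -ito_correction(sigmas, mu)) + quadratic_variation
```

With \(\sigma _i=e_i\) the correction equals \(\mu \), so the Itô drift contains the linear damping \(-\mu \,\hbox {d}t\). This damping is exactly balanced by the quadratic variation of the noise, so that \(|\mu |^2\) is still conserved, in agreement with the Stratonovich form (2.8).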
Another formulation of the stochastic Euler–Poincaré equation in (2.8) which will be used later in deriving the Fokker–Planck equation is the stochastic Lie–Poisson equation
$$\begin{aligned} d f(\mu )&= \left\langle \mu , \left[ \frac{\partial f}{\partial \mu }, \frac{\partial h}{\partial \mu }\right] \right\rangle \hbox {d}t + \sum _i\left\langle \mu , \left[ \frac{\partial f}{\partial \mu }, \frac{\partial \Phi _i}{\partial \mu }\right] \right\rangle \circ dW_t^i \end{aligned}$$
(2.16)
$$\begin{aligned}&=: \{f,h\} \hbox {d}t +\sum _i \{ f,\Phi _i\} \circ dW_t^i\,, \end{aligned}$$
(2.17)
where the Lie–Poisson bracket \(\{\cdot ,\cdot \}\) is defined just as in the deterministic case, from the adjoint action and the pairing on the Lie algebra \({\mathfrak {g}}\).
Extension to Semidirect Product Systems
As discussed in Holm et al. (1998), “It turns out that semidirect products occur under rather general circumstances when the symmetry in \(T^*G\) is broken”. The geometric mechanism for this remarkable fact is that under reduction by symmetry, a semidirect product of groups emerges whenever the symmetry in the phase space is broken. The symmetry breaking produces new dynamical variables, living in the coset space formed from taking the quotient \(G/G_a\) of the original unbroken symmetry G by the remaining symmetry \(G_a\) under the isotropy subgroup of the new variables. These new dynamical variables form a vector space \(G/G_a\simeq V\) on which the unbroken symmetry acts as a semidirect product, \(G\,\circledS \,V\). In physics, elements of the vector space V corresponding to the new variables are called “order parameters”. Typically, in physics, the original symmetry is broken by the introduction of potential energy depending on variables which reduce the symmetry to the isotropy subgroup of the new variables. Dynamics on the semidirect product \(G\,\circledS \,V\) results, and what had previously been flow under the action of the unbroken symmetry now becomes flow plus waves, or oscillations, produced by the exchange of energy between its kinetic and potential forms. The heavy top is the basic example of this phenomenon, and it will be treated in Sect. 6. The semidirect product motion for the heavy top arises in the presence of gravity when the support point of a freely rotating rigid body is shifted away from its centre of mass.
With this connection between symmetry breaking and semidirect products in mind, we now extend the stochastic Euler–Poincaré equations to include semidirect product systems. We refer to Holm et al. (1998) for a complete study of these systems. Although the deterministic equations of motion in Holm et al. (1998) are derived from reduction by symmetry, we will instead incorporate noise by simply extending the Clebsch-constrained variational principle used in the previous section.
The generalisation proceeds, as follows. We will begin by assuming that the Clebsch phase-space variables comprise the elements of \(T^*V\) for a given vector space V on which the Lie group G acts freely and properly. In fact we will have \((q,p)\in V\times V^*\). However, in this work, we will restrict ourselves to the case where V is the underlying vector space of \({\mathfrak {g}}\). Following the notation of Ratiu (1981), we denote \(\overline{ {\mathfrak {g}}}= V\) in the sequel. Then, from the Killing form on \({\mathfrak {g}}\), denoted by \(\kappa :{\mathfrak {g}}\times {\mathfrak {g}} \rightarrow \mathbb {R} \), there is a bi-invariant extension of the Killing form on \({\mathfrak {g}}\circledS \overline{{\mathfrak {g}} }\) defined as
$$\begin{aligned} \kappa _s \left( ( \xi _1, \xi _2) , (\eta _1, \eta _2) \right) := \kappa ( \xi _1,\eta _2) + \kappa ( \xi _2, \eta _1)\, . \end{aligned}$$
(2.18)
Although this pairing is non-degenerate and bi-invariant, we will not use it for the definition of the dual of the semidirect algebra \({\mathfrak {g}}\circledS V\). Instead, we will use the sum of both Killing forms, namely
$$\begin{aligned} \kappa _0 \left( ( \xi _1, \xi _2) , (\eta _1, \eta _2) \right) := \kappa ( \xi _1,\eta _1) + \kappa ( \xi _2, \eta _2)\, . \end{aligned}$$
(2.19)
The group action is defined via the adjoint representation of G on \(V=\overline{{\mathfrak {g}}}\), given by \((g_1,\eta _1)(g_2,\eta _2) = (g_1g_2, \eta _1 + \mathrm {Ad}_{g_1}\eta _2)\). We then directly have the infinitesimal adjoint and coadjoint actions as
$$\begin{aligned} \begin{aligned} \mathrm {ad}_{(\xi _1,q_1)} (\xi _2,q_2)&= ( \mathrm {ad}_{\xi _1}\xi _2 , \mathrm {ad}_{\xi _1}q_2+\mathrm {ad}_{q_1}\xi _2)\,,\\ \mathrm {ad}^*_{(\xi ,q)} (\mu ,p)&= ( \mathrm {ad}^*_\xi \mu + \mathrm {ad}^*_qp, \mathrm {ad}^*_\xi p)\, , \end{aligned} \end{aligned}$$
(2.20)
where the coadjoint action is taken with respect to \(\kappa _0\) in (2.19). The extended Killing form \(\kappa _s\) defined in (2.18), gives, apart from \(\kappa (\eta ,\eta )\) with \(\eta \in \overline{{\mathfrak {g}}}\), a second invariant function on the coadjoint orbit
$$\begin{aligned} \kappa _s\left( (\xi ,\eta ), (\xi ,\eta )\right) = 2\kappa (\xi , \eta )\, , \end{aligned}$$
found from the bi-invariance of the Killing form \(\kappa _s\). One then replaces the corresponding Lie algebra actions in the Clebsch-constrained variational principle (2.7), to obtain the stochastic process with semidirect product
$$\begin{aligned} \begin{aligned} d \left( \mu , q \right)&+ \mathrm {ad}^*_{(\xi ,r)}\left( \mu , q\right) \hbox {d}t + \sum _i \mathrm {ad}^*_{\left( \frac{\partial \Phi _i(\mu ,q)}{\partial \mu },\frac{\partial \Phi _i(\mu ,q)}{\partial q}\right) } \left( \mu , q \right) \circ dW_t^i=0\, , \end{aligned} \end{aligned}$$
(2.21)
where \(l:{\mathfrak {g}}\circledS V\rightarrow \mathbb {R} \), \(\Phi _i:{\mathfrak {g}}^*\rightarrow \mathbb {R}\),
$$\begin{aligned} \frac{\partial l(\xi ,q)}{\partial \xi }=: \mu \qquad \mathrm {and}\qquad \frac{\partial l(\xi ,q)}{\partial q}=: r\,. \end{aligned}$$
(2.22)
Remark 2.2
(q dependence of the stochastic potentials) Notice that the stochastic potentials do not depend on the advected variables. This is a consequence of the Clebsch construction. By using the Hamilton-Pontryagin derivation of this equation, the potentials may depend on the advected variables; see Remark 3.2 or Gay-Balmaz and Holm (2017) for more details. We will write here the dependence on the advected variables for completeness, but we will not use it in the examples.
After taking the Legendre transform of the reduced Lagrangian l, we have the Hamiltonian derivatives
$$\begin{aligned} \frac{\partial h(\mu ,q)}{\partial \mu }=: \xi \qquad \mathrm {and}\qquad \frac{\partial h(\mu ,q)}{\partial q}=: -\,r\,, \end{aligned}$$
(2.23)
for \(h:{\mathfrak {g}}^*\circledS V^*\rightarrow \mathbb {R}\). By substituting into (2.21) the expressions in (2.20) for the coadjoint action, we obtain the system
$$\begin{aligned} \begin{aligned}&d \mu + \left( \mathrm {ad}^*_\xi \mu + \mathrm {ad}^*_r q\right) {\text{ d }}t + \sum _i \left( \mathrm {ad}^*_\frac{\partial \Phi _i(\mu ,q)}{\partial \mu } \mu + \mathrm {ad}^*_\frac{\partial \Phi _i(\mu ,q)}{\partial q} q \right) \circ dW_t^i =0 \,,\\&\quad d q + \mathrm {ad}^*_\xi q {\text{ d }}t + \sum _i \mathrm {ad}^*_\frac{\partial \Phi _i(\mu ,q)}{\partial \mu }q \circ dW_t^i=0\,. \end{aligned} \end{aligned}$$
(2.24)
Although the number of stochastic potentials \(\Phi _i\) which one may consider is arbitrary, we shall find it convenient for our purposes to restrict to a maximum of \(n= \mathrm {dim}({\mathfrak {g}}) + \mathrm {dim}(V)\) such potentials. In fact, the potentials associated with V will not be fully treated here.
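The structure of (2.24) can be probed numerically without integrating it. The sketch below writes the drift and noise coefficients of (2.24) for \(\mathfrak {so}(3)\circledS \mathbb {R}^3\) and evaluates the rates of change of the two invariants \(\frac{1}{2}|q|^2\) and \(\mu \cdot q\) along them. As before, the convention \(\mathrm {ad}^*_a b = b\times a\), the identification of the Killing form with a multiple of the Euclidean dot product, and all function names are assumptions introduced for illustration.

```python
import numpy as np

def coad(a, b):
    """ad^*_a b on so(3) ~ R^3, taken here as b x a (assumed convention)."""
    return np.cross(b, a)

def drift(mu, q, xi, r):
    """dt-coefficients (for mu and q) implied by the system (2.24)."""
    return -(coad(xi, mu) + coad(r, q)), -coad(xi, q)

def noise(mu, q, dphi_dmu, dphi_dq):
    """dW-coefficients (for mu and q) implied by (2.24) for one potential Phi_i."""
    return -(coad(dphi_dmu, mu) + coad(dphi_dq, q)), -coad(dphi_dmu, q)

def casimir_rates(mu, q, dmu, dq):
    """Directional derivatives of |q|^2 / 2 and mu . q along the vector field (dmu, dq)."""
    return np.dot(q, dq), np.dot(q, dmu) + np.dot(mu, dq)
```

Both rates vanish identically for the drift and for each noise field, for any choice of \(\xi \), r and of the derivatives of \(\Phi _i\). Since Stratonovich calculus obeys the ordinary chain rule, this tangency is what keeps the semidirect product process on its coadjoint orbit.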
Remark 2.3
(General setting) The semidirect product theory described here is its simplest instance, as we are using a particular vector space V. In general, the advected quantities can also be in a Lie algebra, or an arbitrary manifold, provided the action of the group G on it is free and proper.
The Fokker–Planck Equation and Stationary Distributions
We derive here a geometric version of the classical Fokker–Planck equation (or forward Kolmogorov equation) using our SDE (2.8). Recall that the Fokker–Planck equation describes the time evolution of the probability distribution \(\mathbb {P}\) for the process driven by (2.8). We refer to Ikeda and Watanabe (2014) for the standard textbook treatment of stochastic processes. Here, we will consider \(\mathbb {P}\) as a function in \(C({\mathfrak {g}}^*)\) with the additional property that \(\int _{{\mathfrak {g}}^*} \mathbb {P}\, d\mu = 1\). First, the generator of the process (2.8) can be readily found from its Lie–Poisson form (2.17) to be
$$\begin{aligned} Lf(\mu ) = \left\langle \mathrm {ad}^*_\xi \mu ,\frac{\partial f}{\partial \mu }\right\rangle - \frac{1}{2} \sum _i\left\langle \mathrm {ad}^*_{\sigma _i}\mu ,\frac{\partial }{\partial \mu } \left\langle \mathrm {ad}^*_{\sigma _i}\mu ,\frac{\partial f}{\partial \mu }\right\rangle \right\rangle \, , \end{aligned}$$
(2.25)
where \(\langle \,\cdot \,,\, \cdot \,\rangle \) still denotes the Killing form on the Lie algebra \({\mathfrak {g}}\) and \(f\in C({\mathfrak {g}}^*)\) is an arbitrary function of \(\mu \). The Fokker–Planck equation
$$\begin{aligned} \partial _t \mathbb {P}= L^*\mathbb {P} \end{aligned}$$
obtained from the formula (2.25) is proved in the following proposition, cf. Holm et al. (2016).
Proposition 2.4
The forward Kolmogorov operator is
$$\begin{aligned} L^*\mathbb {P}(\mu ) = -\{ h,\mathbb {P}\} - \frac{1}{2} \sum _i\{\Phi _i, \{\Phi _i,\mathbb {P}\}\} \,. \end{aligned}$$
(2.26)
Proof
We first consider the drift term of (2.25) written in Lie–Poisson bracket form and use the Leibniz property of the Lie–Poisson bracket to get
$$\begin{aligned} \int \mathbb {P} \{ h,f\} d\mu = - \int f\{h, \mathbb {P}\} d\mu + \int \{h,f\mathbb {P}\}d\mu \, . \end{aligned}$$
(2.27)
Recall that a non-degenerate Poisson bracket can be expressed in terms of a symplectic form as \(\{F,G\}= \omega (X_F,X_G)\) where \(X_F\) and \(X_G\) are the Hamiltonian vector fields associated with the functions F and G. Now, the symplectic form on the coadjoint orbit is an exact 2-form, that is \(\omega = d\theta \) for some 1-form \(\theta \). These facts mean that, provided we have vanishing or periodic boundary conditions on the coadjoint orbits, the integration of the Lie–Poisson bracket corresponds to the integration of the exterior differential of a 1-form. This integral vanishes by Stokes’ theorem with suitable boundary conditions, so the last term in (2.27) drops out. For the double bracket term, we similarly obtain
$$\begin{aligned} \int \mathbb {P} \{\Phi _i, \{ \Phi _i,f\}\} d\mu&= - \int \{\Phi _i,f\}\{\Phi _i, \mathbb {P}\} d\mu + \int \{\Phi _i ,\{\Phi _i,f\}\mathbb {P}\}d\mu \\&= \int f\{\Phi _i,\{\Phi _i, \mathbb {P}\}\} d\mu - \int \{\Phi _i,f\{\Phi _i, \mathbb {P}\}\} d\mu \\&\quad +\, \int \{\Phi _i ,\{\Phi _i,f\}\mathbb {P}\}d\mu \, , \end{aligned}$$
where the last two terms vanish by the same argument as before, that is, by integration by parts. Collecting terms gives the forward Kolmogorov operator (2.26). \(\square \)
In (2.26), we recover the Lie–Poisson formulation (2.17) of the Euler–Poincaré equation together with a dissipative term arising from the noise of the original SDE in a double Lie–Poisson bracket form. This formulation gives the following theorem for stationary distributions of (2.25):
Theorem 2.5
The stationary distribution \(\mathbb {P}_\infty \) of the Fokker–Planck equation (2.25), i.e., the solution of \(L^*\mathbb {P}_\infty =0\), is uniform on the coadjoint orbits on which the SDE (2.8) evolves.
Proof
By a standard result in functional analysis (see, for example, Villani 2009), a linear differential operator of the form \(L = B + \frac{1}{2} \sum _i A_i^2\) has the property that \(\mathrm {ker}(L) = \bigcap _i \mathrm {ker}(A_i)\cap \mathrm {ker}(B)\), where here \(A_i = \{\Phi _i, \cdot \}\) and \(B = \{h, \cdot \}\). Consequently, the only functions g which satisfy \(\{ f,g \}= 0\) for every smooth function f are the Casimirs, or invariant functions, on the coadjoint orbits. When restricted to a coadjoint orbit, these functions become constants. Hence, the stationary distribution \(\mathbb {P}_\infty \) is a constant on the coadjoint orbit identified by the initial conditions of the system. \(\square \)
Since the dynamics are restricted to the coadjoint orbits, for the probability distribution \(\mathbb {P}\) to tend to a constant, yet remain normalisable, satisfying \(\int _{{\mathfrak {g}}^*} \mathbb {P}(\mu )d\mu = 1\), the value of the density must tend to the inverse of the volume of the coadjoint orbit. Of course, the compactness of the coadjoint orbit is equivalent to \(\mathbb {P}_\infty >0\). For non-compact orbits, Theorem 2.5 is still valid, and it will imply an asymptotically vanishing stationary distribution, in the same sense as for the stationary solution of the heat equation on the real line. In this case, a more detailed analysis of the stationary distribution can be performed by studying marginals, or projections onto a compact subspace of the coadjoint orbit.
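The statement of Theorem 2.5 can be illustrated numerically in the simplest compact case. The sketch below assumes \({\mathfrak {g}}=\mathfrak {so}(3)\), where the coadjoint orbits are spheres in \(\mathbb {R}^3\), and takes the stochastic equation in the form \(d\mu = \mu \times \nabla h\,dt + \sum _i \mu \times \sigma _i\circ dW^i\). This form, its sign convention, the moments of inertia and the choice of three constant \(\sigma _i\) along the coordinate axes are assumptions made for this example only. Each step applies an exact rotation by the combined increment, which is one simple way (not necessarily the scheme used elsewhere in this work) to stay on the orbit; time averages of \(\mu _k/\Vert \mu \Vert \) and of its square are then compared with the values 0 and 1/3 expected for a uniform distribution on the sphere.

```python
import math
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def norm(a):
    return math.sqrt(dot(a, a))

def rotate(v, omega):
    # Rodrigues' formula: rotate v about omega by the angle |omega|
    angle = norm(omega)
    if angle == 0.0:
        return v
    k = tuple(w / angle for w in omega)
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(k, v)
    kdv = dot(k, v)
    return tuple(v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c) for i in range(3))

def simulate(mu0, inertia, sigmas, dt, n_steps, seed=0):
    # Rotation-based sketch for d(mu) = mu x grad(h) dt + sum_i mu x sigma_i o dW_i
    rng = random.Random(seed)
    mu = tuple(mu0)
    radius = norm(mu)
    mean = [0.0, 0.0, 0.0]
    second = [0.0, 0.0, 0.0]
    for _ in range(n_steps):
        # Drift increment grad(h) dt with h = sum_k mu_k^2 / (2 I_k)
        increment = [m / I * dt for m, I in zip(mu, inertia)]
        for sigma in sigmas:
            dW = rng.gauss(0.0, math.sqrt(dt))
            for k in range(3):
                increment[k] += sigma[k] * dW
        # mu x w = -(w x mu), so mu is rotated about -w
        mu = rotate(mu, tuple(-w for w in increment))
        for k in range(3):
            x = mu[k] / radius
            mean[k] += x / n_steps
            second[k] += x * x / n_steps
    return mu, mean, second

mu_final, mean, second = simulate(
    mu0=(0.0, 0.0, 1.0),
    inertia=(1.0, 2.0, 3.0),                          # hypothetical values
    sigmas=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    dt=0.01,
    n_steps=40000,
)
radius_error = abs(norm(mu_final) - 1.0)
print(radius_error, mean, second)
```

Here the orbit through \(\mu (0)\) is the sphere of radius \(\Vert \mu (0)\Vert \), so the limiting density with respect to the area measure would be \(1/(4\pi \Vert \mu (0)\Vert ^2)\). The time averages are only a rough statistical indication of uniformity, not a proof of convergence.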
Examples of non-compact coadjoint orbits arise in the semidirect product setting. First, the Fokker–Planck equation for the semidirect product stochastic process (2.21) is given by
$$\begin{aligned} \begin{aligned} Lf(\mu ,q)&= \left\langle \mathrm {ad}^*_{(\xi ,r)} (\mu ,q) ,\left( \frac{\partial f(\mu ,q)}{\partial \mu }, \frac{\partial f(\mu ,q)}{\partial q}\right) \right\rangle - \\&\quad -\, \sum _i\left\langle \mathrm {ad}^*_{(\sigma _i,\eta _i)}(\mu ,q), \left\{ \frac{\partial }{\partial \mu } \left\langle \mathrm {ad}^*_{(\sigma _i,\eta _i)}(\mu ,q) ,\left( \frac{\partial f(\mu ,q)}{\partial \mu },\frac{\partial f(\mu ,q)}{\partial q}\right) \right\rangle \right. \right. , \\&\quad \times \left. \left. \frac{\partial }{\partial q} \left\langle \mathrm {ad}^*_{(\sigma _i,\eta _i)}(\mu ,q) ,\left( \frac{\partial f(\mu ,q)}{\partial \mu },\frac{\partial f(\mu ,q)}{\partial q}\right) \right\rangle \right\} \right\rangle , \end{aligned} \end{aligned}$$
(2.28)
where \(\sigma _i :=- {\partial \Phi _i}/{\partial \mu }\) and \(\eta _i := -{\partial \Phi _i}/{\partial q}\). The pairing used here is the sum of the pairings on \({\mathfrak {g}}\) and on V, given by \(\kappa _0\) in (2.19). Note that for some values of the index i, the vector fields \(\sigma _i\) or \(\eta _i\) may be absent. One can check that \(L^*= L\), so that L generates the Lie–Poisson Fokker–Planck equation for the probability density \(\mathbb {P}(\mu ,q)\). As before, upon using the explicit form of the coadjoint actions, one finds
$$\begin{aligned} \begin{aligned} Lf(\mu ,q)&= \left\langle \mathrm {ad}^*_\xi \mu + \mathrm {ad}^*_q r , \frac{\partial f}{\partial \mu } \right\rangle + \left\langle \mathrm {ad}^*_\xi q ,\frac{\partial f}{\partial q}\right\rangle -\\&\quad -\, \sum _i\left\langle \mathrm {ad}^*_{\sigma _i} \mu + \mathrm {ad}^*_q \eta _i , \frac{\partial A_i}{\partial \mu } \right\rangle -\sum _i \left\langle \mathrm {ad}^*_{\sigma _i} q ,\frac{\partial A_i}{\partial q}\right\rangle , \\ \mathrm {where}\qquad A_i:&= \left\langle \mathrm {ad}^*_{\sigma _i} \mu + \mathrm {ad}^*_q \eta _i , \frac{\partial f}{\partial \mu } \right\rangle + \left\langle \mathrm {ad}^*_{ \sigma _i} q ,\frac{\partial f}{\partial q}\right\rangle \, . \end{aligned} \end{aligned}$$
(2.29)
The Fokker–Planck equation (2.28) provides a direct corollary of Theorem 2.5.
Corollary 2.5.1
The stationary distribution \(\mathbb {P}_\infty (\mu ,q)\) of (2.28) is constant on the coadjoint orbit corresponding to the initial conditions of the stochastic process (2.21).
As mentioned earlier, the coadjoint orbit of this system is not compact, even when the coadjoint orbit of the Lie algebra \({\mathfrak {g}}\) alone is compact. Nevertheless, we can study the marginal distributions
$$\begin{aligned} \mathbb {P}^1(\mu )&:= \int \mathbb {P}(\mu ,q)dq\qquad \mathrm { and} \end{aligned}$$
(2.30)
$$\begin{aligned} \mathbb {P}^2(q)&:= \int \mathbb {P}(\mu ,q)d\mu \, , \end{aligned}$$
(2.31)
which of course extend to stationary marginal distributions \(\mathbb {P}^1_\infty \) and \(\mathbb {P}^2_\infty \). With these marginal distributions, we can get more information about the stationary distribution of the semidirect product Lie–Poisson Fokker–Planck equation (2.28), as summarised in the next theorem.
Theorem 2.6
For a semi-simple Lie algebra \({\mathfrak {g}}\) and \(V= \overline{{\mathfrak {g}}}\), the marginal stationary distributions defined in (2.30) and (2.31) of the Fokker–Planck equation (2.28), with \(\eta _i=0\), for all i, have the following forms.
-
(1)
The stationary distribution \(\mathbb {P}^2_\infty (q)\) is constant on the q-dependent subspace of the coadjoint orbit. If the Lie algebra \({\mathfrak {g}}\) is non-compact, the constant is zero.
-
(2)
The stationary distribution \(\mathbb {P}^1_\infty (\mu )\) restricted to \(\kappa (\mu ,\mu )\) is constant.
-
(3)
If \({\mathfrak {g}}\) is compact, the stationary distribution \(\mathbb {P}^1_\infty (\mu )\) is linearly bounded in time in the direction perpendicular to \(\kappa (\mu ,\mu )\).
Proof
We will compute the stationary marginal distributions separately, but first recall that the stationary distribution \(\mathbb {P}(\mu ,q)\) is constant on the Casimir level sets given by the initial conditions.
(1) By integrating the Fokker–Planck equation (2.28) over \(\mu \), one obtains
$$\begin{aligned} L\mathbb {P}^2(q)&= \int \left\langle \mathrm {ad}^*_\xi q ,\frac{\partial \mathbb {P}(\mu ,q)}{\partial q}\right\rangle d\mu -\frac{1}{2} \sum _i\left\langle \mathrm {ad}^*_{ \sigma _i} q ,\frac{\partial }{\partial q} \left\langle \mathrm {ad}^*_{\sigma _i} q ,\frac{\partial \mathbb {P}^2(q)}{\partial q}\right\rangle \right\rangle \, , \end{aligned}$$
(2.32)
where we have used the property that the coadjoint action is divergence-free (because of the anti-symmetry of the adjoint action, when identified with the coadjoint action via the Killing form) and have recalled that either the Lie algebra is compact or \(\mathbb {P}(\mu ,q)\) vanishes at the boundary.
Only the advection term in (2.32) still involves an integral over \(\mu \), as \(\xi =\frac{\partial h}{\partial \mu }\) depends on \(\mu \). Nevertheless, an argument similar to that for the proof of Theorem 2.5 may be applied here to give the result of a constant marginal distribution on the q-dependent part of the coadjoint orbits. Again, if the Lie algebra is non-compact, then the probability density \(\mathbb {P}^2_\infty (q)\) must vanish because of the normalisation.
(2) We first integrate the Fokker–Planck equation (2.28) with respect to the q variable to find
$$\begin{aligned} L\mathbb {P}^1(\mu )&= \left\langle \mathrm {ad}^*_\xi \mu , \frac{\partial \mathbb {P}^1}{\partial \mu } \right\rangle - \frac{1}{2} \sum _i\left\langle \mathrm {ad}^*_{\sigma _i} \mu , \frac{\partial }{\partial \mu } \left\langle \mathrm {ad}^*_{\sigma _i} \mu , \frac{\partial \mathbb {P}^1}{\partial \mu } \right\rangle \right\rangle \, , \end{aligned}$$
(2.33)
where we have again used that the coadjoint action is divergence-free, together with the same boundary conditions and the fact that \(\langle \mathrm {ad}_q \xi , \frac{\partial \mathbb {P}}{\partial \mu }\rangle = 0\) for all \(\xi \), since \(\frac{\partial \mathbb {P}}{\partial \mu }\) is aligned with q. Indeed, \(\mathbb {P}\) is a function of the Casimirs, and thus is a function of \(\kappa _s((\mu ,q),(\mu ,q))\). This fact prevents us from directly invoking Theorem 2.5: we would find that \(\mathbb {P}^1\) is constant on level sets of \(\kappa (\mu ,\mu )\), but \(\mu \) does not have an invariant norm. Nevertheless, we can still use this theorem by restricting \(\mathbb {P}^1\) to a sphere of constant \(\kappa (\mu ,\mu )\) or, equivalently, by using polar coordinates for \(\mu \) and discarding the radial coordinate. In this case, we can invoke Theorem 2.5 and obtain the result of a constant marginal distribution \(\mathbb {P}^1_\infty \) projected on the coadjoint orbit of the Lie algebra \({\mathfrak {g}}\) alone.
(3) We compute the time derivative of the quantity \(\Vert \mu \Vert ^2:= -\,\kappa (\mu ,\mu )\), which is positive definite and thus defines a norm, to get an upper estimate of the form
$$\begin{aligned} \frac{\text{ d }}{{\text{ d }}t}\frac{1}{2} \Vert \mu \Vert ^2&= \langle \mu , \dot{\mu }\rangle = \langle \mathrm {ad}_r q, \mu \rangle \le \Vert r\Vert \Vert q\Vert \Vert \mu \Vert \, . \end{aligned}$$
Then, because \(\Vert q\Vert =\sqrt{-\kappa (q,q)}\) is constant, and provided that r is bounded, we can integrate to find
$$\begin{aligned} \Vert \mu (t)\Vert \le \Vert \mu (0)\Vert +\alpha t\, , \end{aligned}$$
(2.34)
where \(\alpha \) is a constant depending on the Lie algebra and the Hamiltonian. \(\square \)
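The integration step leading to (2.34) may be spelled out as follows. The symbol R is introduced here only for illustration and denotes an assumed uniform bound \(\Vert r(t)\Vert \le R\), which is one way to read the boundedness condition on r used in part (3). Since \(\frac{\text{ d }}{{\text{ d }}t}\frac{1}{2} \Vert \mu \Vert ^2 = \Vert \mu \Vert \frac{\text{ d }}{{\text{ d }}t}\Vert \mu \Vert \), dividing the estimate by \(\Vert \mu \Vert \), wherever it is non-zero, and using that \(\Vert q\Vert \) is constant gives
$$\begin{aligned} \frac{\text{ d }}{{\text{ d }}t}\Vert \mu \Vert \le \Vert r\Vert \Vert q\Vert \le R\,\Vert q(0)\Vert =: \alpha \quad \Longrightarrow \quad \Vert \mu (t)\Vert \le \Vert \mu (0)\Vert + \int _0^t \alpha \,\text{ d }s = \Vert \mu (0)\Vert + \alpha t\, . \end{aligned}$$
Under this reading, \(\alpha \) depends on the Hamiltonian through the bound R and on the Lie algebra through the norm induced by the Killing form.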
Remark 2.7
(On ergodicity) An important question about any given dynamical system is whether its solution is ergodic. This question needs some clarification for the systems considered here. First, notice that the deterministic systems are not ergodic, as they are Hamiltonian systems with extra conserved quantities given by the Casimirs. If the \(\sigma _i\) span the entire Lie algebra, there is a constant invariant measure on the level set of the Casimir given by the initial conditions. This means that we have the ergodicity property on the coadjoint orbits but not on the full Euclidean space in which the coadjoint orbits are embedded. Ergodicity must then be defined only with respect to the coadjoint orbit, or the system will not be seen to be ergodic. Finally, the cases where the \(\sigma _i\) do not span the Lie algebra must be treated individually, depending on the system at hand. For example, for the rigid body in Sect. 5, having two independent non-trivial \(\sigma _i\) is sufficient for ergodicity, while having only one \(\sigma _i\) will make the system integrable, and thus non-ergodic on the coadjoint orbit, or momentum sphere.
Summary This section has reviewed the framework for the study of noise in dynamical systems defined on coadjoint orbits and has illustrated how noise may be included in these systems, so as to preserve the deterministic coadjoint orbits. This preservation property is seen clearly in the Clebsch formulation because the deterministic and stochastic systems share the same momentum map, whose level sets define the coadjoint orbits. The systems we have considered are the Euler–Poincaré equations on semi-simple finite-dimensional Lie groups and the semidirect product structures which appear when the advected quantities are introduced into the underlying vector space of the Lie algebra of the Lie group. These structures are not the most general. However, their study has allowed us to use the properties of the natural pairing given by the Killing form to prove a few illustrative results in a simple and transparent way. In particular, we showed that the stationary solution of the Fokker–Planck equation, written in Lie–Poisson form, is constant on the coadjoint orbits. In the semidirect product setting, a bit more care was needed to obtain similar results for the marginal distributions, as the coadjoint orbits are not compact in this case. We will illustrate our approach with the two basic examples of the rigid body and heavy top in Sects. 5 and 6, where more will be said about these systems, and in particular about their integrability.