1 Introduction

The concept of Lagrangian multiforms was introduced in [1] with the objective of providing a variational criterion of integrability. The pioneering insight was inspired by the well-established criterion for integrability known as multidimensional consistency [2, 3], which is the discrete analogue of the property of commuting (Hamiltonian) flows for a dynamical system sitting in an integrable hierarchy. It was proposed to introduce a generalised action and a variational principle, involving a new object (a Lagrangian multiform), to capture multidimensional consistency purely variationally. This idea grew quickly, first in the discrete realm, see [4] and references therein. Over the last decade or so, the universality of this idea and its connections with more traditional features of integrability (Lax pair, Hamiltonian structures) have been illustrated in many other incarnations of integrable systems: finite-dimensional systems [5] followed by [6, 7], continuous infinite-dimensional systems (field theories in \(1+1\) dimensions [8,9,10,11,12,13,14] and in \(2+1\) dimensions [15, 16]) and semi-discrete systems [17]. The relations between discrete and continuous multiforms were explored in [18]. The concept was even extended recently to non-commuting flows in [19]. In general, a Lagrangian multiform is a d-form which is integrated over a hypersurface of dimension d in a so-called multi-time space of dimension greater than d to yield an action functional depending not only on the field configurations but also on the hypersurface. This last point is the main departure from a traditional action and principle of least action. One postulates a principle of least action which must be valid for any hypersurface embedded in the multi-time space. This is the postulate which captures the idea of the commutativity of the flows and which was adopted as a definition of pluri-Lagrangians, see [7, 13] and references therein. In Lagrangian multiform theory, there is an additional postulate, the closure relation, which is the direct counterpart of the Poisson involutivity of Hamiltonians, the Liouville criterion for integrability.

The generalised variational principle produces equations that come in two flavours: 1) Euler–Lagrange equations associated with each of the coefficients of the Lagrangian multiform which form a collection of Lagrangian densities; 2) Corner or structure equations on the Lagrangian coefficients themselves which select possible models and ensure the compatibility of the various equations of motion imposed on a common set of fields. Classifying all possible Lagrangian multiforms along these lines would amount to classifying all integrable hierarchies. In practice, it is a nontrivial task to obtain all the Lagrangian coefficients of a multiform which produce compatible equations of motion. Beyond brute-force calculations to solve the corner equations [9], several works have used the idea of variational symmetries to achieve this goal [7, 10, 13]. This produces an algorithm to construct the Lagrangian coefficients one after the other from a given initial Lagrangian. Although perfectly fine in theory, this can become quickly unmanageable in practice and usually formulas for only a few Lagrangian coefficients are obtained. It also has the disadvantage of singling out some independent variables in the hierarchy which then appear as the so-called alien derivatives in the higher Lagrangian coefficients.

More recently, another approach was introduced which takes a more global view on a hierarchy and provides an efficient way of describing all the Lagrangian coefficients in one formula [12, 14], see also [16]. A key insight in [12, 14] was the incorporation in the Lagrangian multiform of key ingredients known in the Hamiltonian framework for integrable hierarchies, in particular the classical r-matrix, as well as the “compounding” of hierarchies following [20]. This paper draws on and expands this insight and is concerned with Lagrangian 1-forms which allow one to treat integrable hierarchies of finite-dimensional systems. Specifically, we show how the theory of Lie dialgebras [21] can be used to construct systematically a Lagrangian multiform for any finite-dimensional system which falls within the Lie dialgebra framework. The latter incorporates and generalises the perhaps more well-known Adler–Kostant–Symes scheme [22,23,24]. In terms of versatility, this goes beyond the results of [12, 14] which were confined to skew-symmetric classical r-matrices. The Lie dialgebra framework can easily accommodate the non-skew-symmetric case. For conciseness, we only illustrate this versatility and our construction on two famous models: the open Toda chain and the (rational) Gaudin model. However, the construction can in principle cover a much larger range of models which fall into the r-matrix scheme, see [25] for a description of many such systems including classical tops. To our knowledge, only one instance of a Lagrangian description of the AKS scheme has been proposed before in [26]. Compared to the present paper, [26] is limited to the AKS scheme and provides only one Lagrangian corresponding to the quadratic Hamiltonian \(\textrm{Tr}\,L^2/2\) (the idea of Lagrangian multiforms was not yet available at that time). By using ideas from Hamiltonian reduction, we produce a Lagrangian multiform on a general coadjoint orbit which encompasses the results of [26] as a special case.

Our main results are:

  1. The definition (3.1)-(3.3) of a Lagrangian multiform from the data of a Lie dialgebra and the proof that its multi-time Euler–Lagrange equations produce a hierarchy of compatible equations in Lax form, Theorem 3.2.

  2. For this Lagrangian multiform, the derivation of an identity relating its closure relation, its Euler–Lagrange equations and the Poisson involutivity of associated Hamiltonians, Theorem 3.3.

  3. The construction of a Lagrangian multiform from the reduction of a “free” Lagrangian on the cotangent bundle \(T^*A\) of a Lie group A and the connection with the above Lie dialgebra case.

  4. Explicit Lagrangian multiforms for the open Toda chain and the rational Gaudin model.

The paper is organised as follows. In Sect. 2, we briefly review the notions of Lagrangian multiforms and Lie dialgebras that we need. Section 3 introduces the Lagrangian multiform and contains two main results, Theorems 3.2 and 3.3. Sect. 4 deals with another main result. We recast our results in the context of reduction from free motion on the cotangent bundle of a Lie group and produce a Lagrangian multiform on a general coadjoint orbit. We show how to recover the case of a Lagrangian multiform associated with a Lie dialgebra described in Sect. 3. In Sect. 5, we illustrate the construction for the open Toda chain associated with a Lie dialgebra via a non-skew-symmetric r-matrix. We present explicit expressions for the Lagrangian coefficients and relate our results to the well-known formulations of the Toda chain in Flaschka and canonical coordinates. In Sect. 6, the same open Toda chain is used to illustrate our construction in the case of a skew-symmetric r-matrix. We also relate our results to the description in Flaschka and canonical coordinates. Section 7 is concerned with the rational Gaudin model and is the opportunity for us to show how our Lagrangian multiform operates in the case of an infinite-dimensional Lie algebra which accounts for the presence of a spectral parameter in the Lax matrices. Although it deals with a finite-dimensional Gaudin model, this section bears a lot of similarities with the framework introduced in [14] for integrable field theories. We end with concluding remarks in Sect. 8.

2 Background material

2.1 Lagrangian 1-forms

We review in more detail the notion of Lagrangian multiforms that we need, restricting our attention to Lagrangian 1-forms since our aim is to describe integrable hierarchies of finite-dimensional systems. The basic object is a Lagrangian 1-form

$$\begin{aligned} \mathscr {L}[q]=\sum _{k=1}^N \, \mathscr {L}_k[q] \, \textrm{d}t_k \end{aligned}$$
(2.1)

and the related generalised action

$$\begin{aligned} S[q,\Gamma ]=\int _\Gamma \mathscr {L}[q] \end{aligned}$$
(2.2)

where \(\Gamma \) is a curve in the multi-time \(\mathbb {R}^N\) with (time) coordinates \(t_1,\dots ,t_N\) and q denotes generic configuration coordinates. For instance, q could be a position vector in \(\mathbb {R}^d\) for some d or, as will be the case for us, an element of a (matrix) Lie group. The notations \(\mathscr {L}[q]\) and \(\mathscr {L}_k[q]\) mean that these quantities depend on q and a finite number of derivatives of q with respect to the times \(t_1,\dots ,t_N\). In this paper, we restrict ourselves to the case of first derivatives only and simply write \(\mathscr {L}_k\) for the Lagrangian coefficients. The application of the generalised variational principle leads to the following multi-time Euler–Lagrange equations [6]

$$\begin{aligned}&\frac{\partial \mathscr {L}_k}{\partial q}-\partial _{t_k} \frac{\partial \mathscr {L}_k}{\partial q_{t_k}}=0\,, \end{aligned}$$
(2.3)
$$\begin{aligned}&\frac{\partial \mathscr {L}_k}{\partial q_{t_\ell }}=0\,,\qquad \ell \ne k\,, \end{aligned}$$
(2.4)
$$\begin{aligned}&\frac{\partial \mathscr {L}_k}{\partial q_{t_k}}=\frac{\partial \mathscr {L}_\ell }{\partial q_{t_\ell }}\,,\qquad k,\ell =1,\dots ,N\,. \end{aligned}$$
(2.5)

Note that (2.3) is simply the standard Euler–Lagrange equation for each \(\mathscr {L}_k\). Condition (2.4) states that the Lagrangian coefficient \(\mathscr {L}_k\) cannot depend on the velocities \(q_{t_\ell }\) for \(\ell \ne k\). The last condition (2.5) requires that the conjugate momentum to q be the same with respect to all times \(t_k\). The closure relation then stipulates that

$$\begin{aligned} \textrm{d}\mathscr {L}[q]=0~~\Leftrightarrow ~~\partial _{t_k}\mathscr {L}_j-\partial _{t_j}\mathscr {L}_k=0, \end{aligned}$$
(2.6)

on solutions of (2.3)-(2.5).
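
As a hedged illustration of (2.3)-(2.6), not taken from the construction of this paper, consider a single degree of freedom in first-order (phase-space) form. The coordinates \((q,p)\) play the role of the generic q, and the Lagrangian coefficients are assumed to be \(\mathscr {L}_k=p\,q_{t_k}-H_k(q,p)\) for some smooth functions \(H_k\). Condition (2.4) holds by inspection, (2.5) holds because the momenta conjugate to \((q,p)\) are \((p,0)\) for every k, and (2.3) reduces to Hamilton's equations \(q_{t_k}=\partial H_k/\partial p\), \(p_{t_k}=-\partial H_k/\partial q\). The mixed second derivatives of q cancel in (2.6), and evaluating the remaining terms on these equations gives

$$\begin{aligned} \partial _{t_k}\mathscr {L}_j-\partial _{t_j}\mathscr {L}_k&=p_{t_k}\,q_{t_j}-p_{t_j}\,q_{t_k}-\partial _{t_k}H_j+\partial _{t_j}H_k\\&=\{H_j,H_k\}-\{H_j,H_k\}+\{H_k,H_j\}=\{H_k,H_j\}\,, \end{aligned}$$

with the canonical bracket \(\{f,g\}=\frac{\partial f}{\partial q}\frac{\partial g}{\partial p}-\frac{\partial f}{\partial p}\frac{\partial g}{\partial q}\). In this toy setting the closure relation therefore holds exactly when the \(H_k\) Poisson-commute, which is the mechanism behind the statement that closure is the counterpart of Poisson involutivity.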

2.2 Lie dialgebras and Lax equations

Here we collect facts from the theory of Lie dialgebras as defined in [21, Lecture 2], see also [27, Chapter 4]. Proofs are omitted for brevity and the reader is referred to [21, 27] for details. We emphasise that Lie dialgebras are different from the perhaps more familiar Lie bialgebras appearing in Drinfeld’s theory of Poisson-Lie groups. Connections and differences between these two structures are discussed in [21] and [28].

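Let \(\mathfrak {g}\) be a matrix Lie algebra, with matrix Lie group G, and \(\mathfrak {g}^*\) its dual space. We have the usual (co)adjoint actions: for all \(\xi \in \mathfrak {g}^*\), \(X,Y\in \mathfrak {g}\), \(g\in G\),

$$\begin{aligned}&\text {ad}_X\cdot Y=[X,Y]\,,\qquad \text {ad}^*_X\cdot \xi (Y)=-(\xi \,,\,\text {ad}_X\cdot Y)=-(\xi ,[X,Y])\,, \end{aligned}$$
(2.7)
$$\begin{aligned}&\text {Ad}_g\cdot X=g\,X\,g^{-1}\,,\qquad \text {Ad}^*_g\cdot \xi (X)=(\xi \,,\,\text {Ad}_{g^{-1}}\cdot X)\,. \end{aligned}$$
(2.8)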

The space \(\mathfrak {g}^*\) can be endowed with the Lie–Poisson bracket defined by

$$\begin{aligned} \{f,g\}(\xi )= (\xi ,[\nabla f(\xi ),\,\nabla g(\xi )]),~~f,g\in C^\infty (\mathfrak {g}^*), \end{aligned}$$
(2.9)

where we introduced the convenient notation \((\,~,~)\) for the natural pairing between \(\mathfrak {g}^*\) and \(\mathfrak {g}\): \(\xi (X)=(\xi ,X)\). The gradient \(\nabla f(\xi )\) is the element of \(\mathfrak {g}\) defined from the differential \(\delta f(\xi )\) by using the pairing

$$\begin{aligned} \delta f(\xi )(\eta )=\lim _{\epsilon \rightarrow 0}\frac{f(\xi +\epsilon \eta )-f(\xi )}{\epsilon }=(\eta ,\nabla f(\xi )). \end{aligned}$$
(2.10)

The Lie–Poisson bracket is degenerate in general and the \(\text {Ad}^*\)-invariant functions on \(\mathfrak {g}^*\) are the Casimir functions. Its symplectic leaves are the coadjoint orbits of G in \(\mathfrak {g}^*\). The restriction to a coadjoint orbit gives rise to the Lie–Kostant–Kirillov–Souriau symplectic form \(\omega _{KK}\).
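
A minimal worked example of (2.9)-(2.10), added here for orientation: for fixed \(X\in \mathfrak {g}\), take the linear function \(f_X(\xi )=(\xi ,X)\) (the notation \(f_X\) is introduced only for this illustration). Then

$$\begin{aligned} \delta f_X(\xi )(\eta )=\lim _{\epsilon \rightarrow 0}\frac{(\xi +\epsilon \eta ,X)-(\xi ,X)}{\epsilon }=(\eta ,X)\quad \Rightarrow \quad \nabla f_X(\xi )=X\,, \end{aligned}$$

and therefore

$$\begin{aligned} \{f_X,f_Y\}(\xi )=(\xi ,[X,Y])=f_{[X,Y]}(\xi )\,. \end{aligned}$$

On linear functions the Lie–Poisson bracket thus simply reproduces the Lie bracket of \(\mathfrak {g}\).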

Let \(R:\mathfrak {g}\rightarrow \mathfrak {g}\) be a linear map. It is a solution of the modified classical Yang–Baxter equation (mCYBE) if it satisfies

$$\begin{aligned}{}[R(X),R(Y)]-R\left( [R(X),Y]+[X,R(Y)]\right) =-[X,Y],~~\forall ~X,Y\in \mathfrak {g}.\quad \end{aligned}$$
(2.11)

By abuse of language, we will call a solution R of (2.11) a (classical) r-matrix, in relation to the fact that with R one can associate \(r\in \mathfrak {g}\otimes \mathfrak {g}\) (which is what is traditionally called the r-matrix) when \(\mathfrak {g}\) is equipped with a nondegenerate ad-invariant symmetric bilinear form \(\langle \,~,~\rangle \) (e.g. the Killing form when \(\mathfrak {g}\) is a finite-dimensional semi-simple Lie algebra). A famous example of an r-matrix arises in the case where \(\mathfrak {g}\) admits a direct sum decomposition (as a vector space) into two Lie subalgebras

$$\begin{aligned} \mathfrak {g}=\mathfrak {g}_+\oplus \mathfrak {g}_-. \end{aligned}$$
(2.12)

Then, \(R=P_+-P_-\) is a solution of (2.11), where \(P_\pm \) is the projector on \(\mathfrak {g}_\pm \) along \(\mathfrak {g}_\mp \).
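
This claim can be checked directly. The following is a short verification sketch in which the labels \(a=P_+X\), \(b=P_-X\), \(c=P_+Y\), \(d=P_-Y\) are introduced for convenience only. Since \(\mathfrak {g}_\pm \) are subalgebras, \([a,c]\in \mathfrak {g}_+\) and \([b,d]\in \mathfrak {g}_-\), so that

$$\begin{aligned}{}[R(X),R(Y)]&=[a,c]-[a,d]-[b,c]+[b,d]\,,\\ {}[R(X),Y]+[X,R(Y)]&=2[a,c]-2[b,d]\,,\\ R\left( [R(X),Y]+[X,R(Y)]\right)&=2[a,c]+2[b,d]\,. \end{aligned}$$

Subtracting the last line from the first gives \(-[a,c]-[a,d]-[b,c]-[b,d]=-[X,Y]\), which is (2.11).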

Given a solution R of the mCYBE, one can define on the vector space \(\mathfrak {g}\) a second Lie bracket

$$\begin{aligned}{}[X,Y]_R=\frac{1}{2}\left( [R(X),Y]+[X,R(Y)]\right) . \end{aligned}$$
(2.13)

The corresponding Lie algebra is denoted by \(\mathfrak {g}_R\). We therefore have an adjoint action of \(\mathfrak {g}_R\) on itself and a coadjoint action of \(\mathfrak {g}_R\) on \(\mathfrak {g}^*\) (\(\mathfrak {g}\) and \(\mathfrak {g}_R\), being the same vector space, have the same dual space)

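$$\begin{aligned}&\text {ad}^R_X\cdot Y=[X,Y]_R\,,\qquad \forall \, X,Y\in \mathfrak {g}_R\,, \end{aligned}$$
(2.14)
$$\begin{aligned}&\text {ad}^{R*}_X\cdot \xi (Y)=-(\xi \,,\,\text {ad}^R_X\cdot Y)=-(\xi ,[X,Y]_R)\,,\qquad \forall \,\xi \in \mathfrak {g}^*\,,~~X,Y\in \mathfrak {g}_R\,. \end{aligned}$$
(2.15)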

The algebraic significance of the mCYBE and of the second Lie bracket \([\,~,~]_R\) is given by the following results which lead to essential factorisation properties underlying integrable systems. The key objects are the maps

$$\begin{aligned} R_\pm =\frac{1}{2}\left( R\pm \textrm{id}\right) . \end{aligned}$$
(2.16)

Proposition 2.1

Let \(\mathfrak {g}_\pm =\textrm{Im}\,R_\pm \). Then,

  1. \(R_\pm :\mathfrak {g}_R\rightarrow \mathfrak {g}\) are Lie algebra homomorphisms:

    $$\begin{aligned} R_\pm \left( [X,Y]_R\right) =\left[ R_\pm (X),R_\pm (Y)\right] . \end{aligned}$$
    (2.17)

    In particular, \(\mathfrak {g}_\pm \subset \mathfrak {g}\) are Lie subalgebras of \(\mathfrak {g}\).

  2. The mapping \(i_R:\mathfrak {g}_R\rightarrow \mathfrak {g}_+\oplus \mathfrak {g}_-\), \(i_R(X)=(R_+(X),R_-(X))\) is a Lie algebra embedding. Thus \(\widetilde{\mathfrak {g}}_R=\textrm{Im}\,i_R\) is a Lie subalgebra of \(\mathfrak {g}_+\oplus \mathfrak {g}_-\).

  3. The composition of the maps

    $$\begin{aligned} i_R: \mathfrak {g}_R ~{\rightarrow }~ \mathfrak {g}_+\oplus \mathfrak {g}_-,~~X\mapsto (R_+(X),R_-(X)), \end{aligned}$$
    (2.18)

    followed by

    $$\begin{aligned} a:\mathfrak {g}_+\oplus \mathfrak {g}_-~{\rightarrow }~\mathfrak {g},~~(X_+,X_-)\mapsto X_+-X_-, \end{aligned}$$
    (2.19)

    provides a unique decomposition of any element \(X\in \mathfrak {g}\) as \(X=R_+(X)-R_-(X)\).

Note that \(R_+-R_-=\textrm{id}\) and

$$\begin{aligned}{}[X,Y]_R&=R_+\left( [X,Y]_R\right) -R_-\left( [X,Y]_R\right) \\&=\left[ R_+(X),R_+(Y)\right] -\left[ R_-(X),R_-(Y)\right] . \end{aligned}$$
(2.20)
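
To make (2.16) and (2.20) concrete, here is a small worked case, assuming the split situation \(R=P_+-P_-\) of (2.12) and using \(\textrm{id}=P_++P_-\). Then

$$\begin{aligned} R_+=\frac{1}{2}\left( P_+-P_-+P_++P_-\right) =P_+\,,\qquad R_-=\frac{1}{2}\left( P_+-P_--P_+-P_-\right) =-P_-\,, \end{aligned}$$

and (2.20) becomes

$$\begin{aligned}{}[X,Y]_R=\left[ P_+X,P_+Y\right] -\left[ P_-X,P_-Y\right] . \end{aligned}$$

In particular, \([X,Y]_R=0\) whenever \(X\in \mathfrak {g}_+\) and \(Y\in \mathfrak {g}_-\): in this case the second bracket decouples the two subalgebras and reverses the sign of the bracket on \(\mathfrak {g}_-\).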

We can express the actions of \(\mathfrak {g}_R\) in terms of those of \(\mathfrak {g}\). For convenience, we write \(X_\pm =R_\pm (X)\) for \(X\in \mathfrak {g}\). Then,

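$$\begin{aligned}&\text {ad}^R_X\cdot Y=\text {ad}_{X_+}\cdot Y_+-\text {ad}_{X_-}\cdot Y_-\,, \end{aligned}$$
(2.21)
$$\begin{aligned}&\text {ad}^{R*}_X\cdot \xi =R_+^*(\text {ad}^*_{X_+}\cdot \xi )-R_-^*(\text {ad}^*_{X_-}\cdot \xi )\,, \end{aligned}$$
(2.22)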

where the adjoint \(A^*:\mathfrak {g}^*\rightarrow \mathfrak {g}^*\) of a linear map \(A:\mathfrak {g}\rightarrow \mathfrak {g}\) is defined by \((A^*(\xi ),X)=(\xi ,A(X))\).

The application of this framework to integrable systems hinges on the interplay between the two Lie–Poisson brackets one can define on \(\mathfrak {g}^*\). Indeed, having a second Lie bracket, we can repeat the definition (2.9) to obtain

$$\begin{aligned} \{f,g\}_R(\xi )= (\xi ,[\nabla f(\xi ),\,\nabla g(\xi )]_R). \end{aligned}$$
(2.23)

A similar conclusion holds: the symplectic leaves are coadjoint orbits of \(G_R\), the Lie group of \(\mathfrak {g}_R\), in \(\mathfrak {g}^*\). The restriction to a coadjoint orbit gives rise to the symplectic form which we denote by \(\omega _R\). It is the interplay between these two structures that provides integrable systems whose equations of motion take the form of a Lax equation. For this last part, one needs one more ingredient: an \(\text {Ad}\)-invariant nondegenerate bilinear symmetric form \(\langle \,~,~\rangle \) on \(\mathfrak {g}\). It allows one to identify \(\mathfrak {g}^*\) with \(\mathfrak {g}\) and the coadjoint actions with the adjoint actions. Specifically, one has

Theorem 2.2

The \(\text {Ad}^*\)-invariant functions on \(\mathfrak {g}^*\) are in involution with respect to \(\{\,~,~\}_R\). The equation of motion

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}L=\{L,H\}_R \end{aligned}$$
(2.24)

induced by an \(\text {Ad}^*\)-invariant function H on \(\mathfrak {g}^*\) takes the following equivalent forms, for an arbitrary \(L\in \mathfrak {g}^*\),

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}L=\text {ad}^{R*}_{\nabla H(L)}\cdot L=\frac{1}{2}\,\text {ad}^*_{R\nabla H(L)}\cdot L=\text {ad}^*_{R_\pm \nabla H(L)}\cdot L. \end{aligned}$$
(2.25)

When there is an \(\text {Ad}\)-invariant nondegenerate bilinear form \(\langle \,~,~\rangle \) on \(\mathfrak {g}\) so that we can identify \(\mathfrak {g}^*\) with \(\mathfrak {g}\) and \(\text {ad}^*\) with \(\text {ad}\), the last equation takes the desired form of a Lax equation for \(L\in \mathfrak {g}\),

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}L=[M_\pm ,L],\qquad M_\pm =R_\pm \nabla H(L). \end{aligned}$$
(2.26)

The proof can be found for instance in [27] and we only elaborate on certain points which will be useful for our purposes below. The crucial point is to exploit the \(\text {Ad}^*\)-invariance of the function H defining the time flow. The latter means that the following property holds

$$\begin{aligned} \text {ad}^*_{\nabla H(\xi )} \cdot \xi = 0 ~~\Leftrightarrow ~~ (\xi ,\, [\nabla H(\xi ),X])=0\qquad \forall \xi \in \mathfrak {g}^*,~~\forall X\in \mathfrak {g}. \end{aligned}$$
(2.27)

Thus, for any two \(\text {Ad}^*\)-invariant functions \(H_1\) and \(H_2\),

$$\begin{aligned} \{H_1,H_2\}_R(\xi )&= (\xi ,[\nabla H_1(\xi )\,,\,\nabla H_2(\xi )]_R)\nonumber \\&=\frac{1}{2}(\xi ,[R\nabla H_1(\xi )\,,\,\nabla H_2(\xi )]+[\nabla H_1(\xi )\,,\,R\nabla H_2(\xi )])=0. \end{aligned}$$
(2.28)
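
The final equality in (2.28) deserves one more line. A sketch of the step, using only (2.27) and the antisymmetry of the bracket: each of the two terms vanishes separately,

$$\begin{aligned} (\xi ,[R\nabla H_1(\xi )\,,\,\nabla H_2(\xi )])&=-(\xi ,[\nabla H_2(\xi )\,,\,R\nabla H_1(\xi )])=0\,,\\ (\xi ,[\nabla H_1(\xi )\,,\,R\nabla H_2(\xi )])&=0\,, \end{aligned}$$

where the first line is (2.27) applied to \(H_2\) with \(X=R\nabla H_1(\xi )\) and the second is (2.27) applied to \(H_1\) with \(X=R\nabla H_2(\xi )\).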

For any function f on \(\mathfrak {g}^*\), the time evolution associated with the \(\text {Ad}^*\)-invariant H with respect to the Poisson bracket \(\{\,~,~\}_R\) is defined by

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}f(L)=\{f,H\}_R(L) \,, \end{aligned}$$

i.e.

$$\begin{aligned} \left( \frac{\textrm{d}}{\textrm{d}t}L\,,\, \nabla f(L) \right)&= (L,[\nabla f(L)\,,\,\nabla H(L)]_R)=-\frac{1}{2}\,(L,[R\nabla H(L)\,,\,\nabla f(L)])\\&=(\text {ad}^{R*}_{ \nabla H(L)}\cdot L\,,\,\nabla f(L))=\frac{1}{2}(\text {ad}^{*}_{R\nabla H(L)}\cdot L\,,\,\nabla f(L))\,. \end{aligned}$$

Finally, since \(R=2R_\pm \mp \textrm{id}\) by (2.16) and \(\text {ad}^*_{\nabla H(L)}\cdot L=0\) by (2.27), we have

$$\begin{aligned} \text {ad}^{*}_{R\nabla H(L)}\cdot L=2\,\text {ad}^{*}_{R_\pm \nabla H(L)}\cdot L, \end{aligned}$$
(2.29)

thus establishing the various equivalent forms of the equations in (2.25) (by restricting f to be any of the coordinate functions on \(\mathfrak {g}^*\)).

The involutivity property (2.28) ensures that we can define compatible time flows associated with a family of \(\text {Ad}^*\)-invariant Hamiltonian functions \(H_k\), \(k=1,\dots ,N\). If one can supply enough such independent functions, or work on a coadjoint orbit of low-enough dimension, one obtains an integrable system described by an integrable hierarchy of equations in Lax form (again using the identification provided by \(\langle \,~,~\rangle \))

$$\begin{aligned} \partial _{t_k}L=[R_\pm \nabla H_k(L),L],~~ k=1,\dots ,N. \end{aligned}$$
(2.30)

The typical example of an invariant function \(H_k\) is given by \(H_k=\frac{1}{k+1}\textrm{Tr}(L^{k+1})\).
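
To see how trace functions fit the general scheme, here is a hedged worked example. It assumes \(\mathfrak {g}=\mathfrak {gl}(n)\) with the pairing realised by the trace form, \((L,X)=\textrm{Tr}(LX)\), and uses the generic trace function \(F_m(L)=\frac{1}{m}\textrm{Tr}(L^m)\) (the labels \(F_m\) and m are introduced only for this illustration). From (2.10) and the cyclicity of the trace,

$$\begin{aligned} \delta F_m(L)(\eta )=\textrm{Tr}(\eta \,L^{m-1})\quad \Rightarrow \quad \nabla F_m(L)=L^{m-1}\,, \end{aligned}$$

and the invariance property (2.27) is immediate:

$$\begin{aligned} (L,[\nabla F_m(L),X])=\textrm{Tr}\left( L\,[L^{m-1},X]\right) =\textrm{Tr}\left( [L,L^{m-1}]\,X\right) =0\,. \end{aligned}$$

Under these assumptions the corresponding flow in (2.30) reads \(\partial _{t}L=[R_\pm (L^{m-1}),L]\).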

For our purposes, the Lie groups associated with \(\mathfrak {g}\) and \(\mathfrak {g}_R\) will be important. We introduce G and \(G_R\) as the (connected, simply connected) Lie groups defined for \(\mathfrak {g}\) and \(\mathfrak {g}_R\) respectively. For simplicity, we only think of matrix groups in this paper. Only in special circumstances are G and \(G_R\) diffeomorphic. In general, this is only true in a neighbourhood of the identity where the crucial difference between the two groups lies in their multiplications induced by \([\,~,~]\) and \([\,~,~]_R\) respectively. The homomorphisms \(R_\pm \) give rise to Lie group homomorphisms (which we denote by the same symbols) and we obtain a factorisation at the group level. With \(g=\text {e}^X\), \(X\in \mathfrak {g}\), we have

$$\begin{aligned} R_\pm \, g=\text {e}^{R_\pm X}. \end{aligned}$$
(2.31)

Specifically, let \(G_\pm =R_\pm (G_R)\) be the subgroups of G corresponding to \(\mathfrak {g}_\pm \). The composition of the maps

$$\begin{aligned} i_R: G_R\,\rightarrow \,G_+ \times G_-,~~ g\mapsto (R_+(g),R_-(g)), \end{aligned}$$
(2.32)

followed by

$$\begin{aligned} m:G_+\times G_- \,\rightarrow \,G,~~ (g_+,g_-)\mapsto g_+ \, g_-^{-1}, \end{aligned}$$
(2.33)

allows us to factorise uniquely an arbitrary element \(g\in G\) (sufficiently close to the identity) as

$$\begin{aligned} g=g_+ \, g_-^{-1},~~(g_+,g_-)\in \widetilde{G}_R=\textrm{Im}\,i_R. \end{aligned}$$
(2.34)

An element \(g\in G_R\) can be identified with its image \((g_+,g_-)\in \widetilde{G}_R\subseteq G_+\times G_-\) and the multiplication \(\cdot _R\) in \(G_R\) is most easily visualised using the homomorphism property

$$\begin{aligned} i_R( g\cdot _R h)=i_R( g)*i_R( h)=(g_+\,h_+,\,g_-\,h_-) \end{aligned}$$
(2.35)

where \(*\) is the direct product group structure of \(G_+\times G_-\). This is usually shortened to

$$\begin{aligned} g\cdot _R h= (g_+\,h_+,\,g_-\,h_-). \end{aligned}$$
(2.36)

The group \(G_R\) acts on \(\mathfrak {g}_R\) by the adjoint action and on \(\mathfrak {g}^*\) via the coadjoint action

$$\begin{aligned}&\text {Ad}^R_g\cdot X =g\cdot _R X\cdot _R g^{-1}\,,\qquad \forall \, X\in \mathfrak {g}_R\,,~~g\in G_R\,, \end{aligned}$$
(2.37)
$$\begin{aligned}&\text {Ad}^{R*}_g\cdot \xi (X) =(\xi \,,\,\text {Ad}^R_{g^{-1}}\cdot X)\,,\qquad \forall \,g\in G_R\,,\xi \in \mathfrak {g}^*\,,~~X\in \mathfrak {g}_R\,. \end{aligned}$$
(2.38)

Remark 2.3

When using the suggestive notation \(g\cdot _R X\cdot _R g^{-1}\) for the adjoint action, we tacitly view \(\cdot _R\) as an associative product on the matrix Lie algebra and its Lie group. Strictly speaking, this is not guaranteed if R is merely a solution of (2.11). It becomes possible for instance if \(\mathfrak {g}\) is an associative algebra and we require R to be a solution of the associative Yang–Baxter equation \(R(X)\, R(Y)-R(R(X)\, Y+ X\, R(Y))+ X\, Y=0\), see [29]. This implies that \(X\cdot _R Y=\frac{1}{2}\left( R(X)\, Y+X\, R(Y) \right) \) defines a second associative product on \(\mathfrak {g}\) and allows us to view \([X,Y]_R\) as the commutator \(X\cdot _R Y-Y\cdot _R X\), in complete analogy with \([X,Y]=X\, Y-Y\, X\). We will assume that \(\cdot _R\) is such an associative product in the rest of this paper and use the consequences, e.g. \([X,Y]_R=X\cdot _R Y-Y\cdot _R X\).
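
Two short checks, given here as a sketch, make the statements of this remark explicit. First, the commutator of the product \(\cdot _R\) reproduces (2.13):

$$\begin{aligned} X\cdot _R Y-Y\cdot _R X&=\frac{1}{2}\left( R(X)\,Y+X\,R(Y)-R(Y)\,X-Y\,R(X)\right) \\&=\frac{1}{2}\left( [R(X),Y]+[X,R(Y)]\right) =[X,Y]_R\,. \end{aligned}$$

Second, antisymmetrising the associative Yang–Baxter equation in X and Y gives

$$\begin{aligned}{}[R(X),R(Y)]-R\left( [R(X),Y]+[X,R(Y)]\right) +[X,Y]=0\,, \end{aligned}$$

which is (2.11). A solution of the associative equation is thus automatically a solution of the mCYBE, while the converse need not hold.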

The following relations are most useful in the practical calculations of the examples discussed below. With \(g_\pm =R_\pm \,g\), \(X_\pm =R_\pm \,X\), \(g\in G_R\), \(X\in \mathfrak {g}_R\),

$$\begin{aligned} \text {Ad}^R_g\cdot X&=g_+\, X_+\,g_+^{-1}-g_-\, X_-\,g_-^{-1}\,, \end{aligned}$$
(2.39)
$$\begin{aligned} \text {Ad}^{R*}_g\cdot \xi&=R_+^*(\text {Ad}^*_{g_+} \xi )-R_-^*(\text {Ad}^*_{g_-}\xi )\,,~~ \forall \xi \in \mathfrak {g}^*\,. \end{aligned}$$
(2.40)
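
For completeness, here is a sketch of how (2.40) follows from (2.38) and (2.39). It assumes the standard convention \(\text {Ad}^*_g\cdot \xi (X)=(\xi ,g^{-1}Xg)\) for the coadjoint action of G, mirroring (2.38), and uses \((g^{-1})_\pm =(g_\pm )^{-1}\), which holds because \(R_\pm \) are group homomorphisms. For any \(X\in \mathfrak {g}_R\),

$$\begin{aligned} (\text {Ad}^{R*}_g\cdot \xi \,,\,X)&=(\xi \,,\,\text {Ad}^R_{g^{-1}}\cdot X)=(\xi \,,\,g_+^{-1}\,X_+\,g_+)-(\xi \,,\,g_-^{-1}\,X_-\,g_-)\\&=(\text {Ad}^*_{g_+}\cdot \xi \,,\,R_+X)-(\text {Ad}^*_{g_-}\cdot \xi \,,\,R_-X)\\&=\left( R_+^*(\text {Ad}^*_{g_+}\cdot \xi )-R_-^*(\text {Ad}^*_{g_-}\cdot \xi )\,,\,X\right) . \end{aligned}$$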

Thus, the dual space \(\mathfrak {g}^*\) hosts two coadjoint actions of G and \(G_R\), as it does with the two coadjoint actions of the Lie algebras \(\mathfrak {g}\) and \(\mathfrak {g}_R\). The last main result of this framework is known as the factorisation theorem, see e.g. [21, 27, 30].

Theorem 2.4

Consider the system of compatible equations with the given initial condition

$$\begin{aligned} \partial _{t_k}L=\text {ad}^*_{R_\pm \nabla H_k(L)}\cdot L,\qquad k=1,\dots ,N, \qquad L(0,\dots ,0)=L_0\in \mathfrak {g}^*.\quad \end{aligned}$$
(2.41)

Denote \((t_1,\dots ,t_N)=\textbf{t}\) for conciseness. Let \(g_\pm (\textbf{t})\) be the smooth curves in \(G_\pm \) which solve the factorisation problem

$$\begin{aligned} \text {e}^{-\sum _{k=1}^N t_k\nabla H_k(L_0)}=g_+(\textbf{t})^{-1}\,g_-(\textbf{t}),\qquad g_\pm (\textbf{0})=e. \end{aligned}$$
(2.42)

Then, the solution to the initial-value problem (2.41) is given by

$$\begin{aligned} L(\textbf{t})=\text {Ad}^*_{g_+(\textbf{t})}\cdot L_0=\text {Ad}^*_{g_-(\textbf{t})}\cdot L_0, \end{aligned}$$
(2.43)

and \(g_{\pm }(\textbf{t})\) satisfy

$$\begin{aligned} \partial _{t_k} g_\pm (\textbf{t}) =R_\pm \nabla H_k(L(\textbf{t}))\,g_\pm (\textbf{t}). \end{aligned}$$
(2.44)

This result shows that the solution lies at the intersection of coadjoint orbits of G and \(G_R\). Combined with the fact that the coadjoint orbits provide the natural symplectic manifolds associated with the corresponding Lie–Poisson bracket, this means that the natural arena to define our phase space, i.e. where L lives, is a coadjoint orbit of \(G_R\) in \(\mathfrak {g}^*\)

$$\begin{aligned} \mathcal{O}_\Lambda =\{\text {Ad}^{R*}_{\varphi }\cdot \Lambda ;\varphi \in G_R\},~~\text {for some}~~\Lambda \in \mathfrak {g}^*. \end{aligned}$$
(2.45)
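
Returning to Theorem 2.4, the following consistency check is a sketch under the identification of \(\mathfrak {g}^*\) with \(\mathfrak {g}\) and of \(\text {Ad}^*\) with \(\text {Ad}\), so that \(L=g_\pm \,L_0\,g_\pm ^{-1}\) in the matrix setting. Differentiating and using (2.44),

$$\begin{aligned} \partial _{t_k}L=\left[ \partial _{t_k}g_\pm \,g_\pm ^{-1}\,,\,L\right] =\left[ R_\pm \nabla H_k(L)\,,\,L\right] , \end{aligned}$$

which is (2.30). Similarly, using \(R_+-R_-=\textrm{id}\) and the equivariance \(\nabla H_k(L)=g_+\,\nabla H_k(L_0)\,g_+^{-1}\) of the gradient of an invariant function,

$$\begin{aligned} \partial _{t_k}\left( g_+^{-1}g_-\right) =-g_+^{-1}\left( R_+-R_-\right) \nabla H_k(L)\,g_-=-g_+^{-1}\,\nabla H_k(L)\,g_-=-\nabla H_k(L_0)\,g_+^{-1}g_-\,. \end{aligned}$$

This matches the \(t_k\)-derivative of the left-hand side of (2.42) whenever the elements \(\nabla H_j(L_0)\) commute with each other, as they do for instance for trace functions, whose gradients are powers of \(L_0\).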

In the Lagrangian multiform theory, the prevalent idea is that one should think of an integrable system as an integrable hierarchy, in a way completely similar to the Hamiltonian integrable hierarchy we have just recalled. This leads us to work with the space where \((t_1,\dots ,t_N)=\textbf{t}\) lives: the multi-time space. Since the flows commute, the multi-time is simply (a subspace of) \(\mathbb {R}^{N_1}\times (S^1)^{N_2}\), \(N_1+N_2=N\) (in general we should allow for the possibility of having periodicity in some of the independent variables \(t_1,\dots ,t_N\)). The generalisation to the case where the vector fields giving the flows no longer commute but still form a Lie algebra was considered in [19] and leads to the consideration of the multi-time space being a (non-abelian) Lie group.

2.3 Extension to loop algebras and some special cases

The essential results of the Lie dialgebra construction discussed above extend to the infinite-dimensional setting, e.g. the case of loop algebras. The latter is relevant when one needs Lax matrices with spectral parameters. This is typically the case for integrable field theories but it can also be required for some finite-dimensional systems such as the closed Toda chain or Gaudin models. We will present the extension of the Lie dialgebra construction to this infinite-dimensional setting via the Gaudin example in Sect. 7 and we refer the reader to [21, Lecture 3] for more details.

There are special cases of the Lie dialgebra framework that may be more familiar to the reader and will play a role in our examples below. They both arise when \(\mathfrak {g}\) admits a direct sum decomposition (as a vector space) into two Lie subalgebras

$$\begin{aligned} \mathfrak {g}=\mathfrak {g}_+\oplus \mathfrak {g}_-, \end{aligned}$$
(2.46)

and we take \(R=P_+-P_-\), where \(P_\pm \) is the projector on \(\mathfrak {g}_\pm \) along \(\mathfrak {g}_\mp \). The decomposition of \(\mathfrak {g}\) induces the decomposition

$$\begin{aligned} \mathfrak {g}^*=\mathfrak {g}_+^*\oplus \mathfrak {g}_-^*. \end{aligned}$$
(2.47)

Using a nondegenerate ad-invariant bilinear form on \(\mathfrak {g}\), we can identify \(\mathfrak {g}_\pm ^*\) with \(\mathfrak {g}_\mp ^\perp \).

The first special case, which historically is at the origin of the so-called Adler–Kostant–Symes scheme [22,23,24] is obtained as follows. We fix \(\Lambda \) to be in \(\mathfrak {g}_-^*\) and consider the coadjoint orbit of elements \(L=\text {Ad}^{R*}_{\varphi } \cdot \Lambda \). As a result, only the subgroup \(G_-\) in \(G_R\simeq G_+\times G_-\) plays a role since \(L=\text {Ad}^{R*}_{\varphi } \cdot \Lambda =-R_-^*(\text {Ad}^*_{\varphi _-}\cdot \Lambda )\) and the coadjoint orbit \(\mathcal{O}_\Lambda \) lies in \(\mathfrak {g}_-^*\). This is the historic setup which can be used to formulate the open Toda chain in Flaschka coordinates. R is not skew-symmetric in this case. We will present this example in Sect. 5 where details on our Lagrangian multiform for this model will be given.

The second special case is a further specialisation where \(\mathfrak {g}_\pm \) are isotropic with respect to \(\langle \,~,~\rangle \), meaning

$$\begin{aligned} \langle \mathfrak {g}_\pm ,\mathfrak {g}_\pm \rangle =0 \end{aligned}$$

and implying that \(\mathfrak {g}_\pm ^*\) can be identified with \(\mathfrak {g}_\mp =\mathfrak {g}_\mp ^\perp \). This case can arise with loop algebras and will be discussed in Sect. 7 in relation to the Gaudin model. In this case, R is skew-symmetric, i.e.

$$\begin{aligned} \langle RX,Y\rangle =-\langle X,RY\rangle ,~~\forall ~X,Y\in \mathfrak {g}. \end{aligned}$$
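
A short check of this statement, assuming as above that \(R=P_+-P_-\) with \(\mathfrak {g}_\pm \) isotropic: writing \(X=P_+X+P_-X\) and similarly for Y, isotropy removes the terms pairing \(\mathfrak {g}_+\) with \(\mathfrak {g}_+\) and \(\mathfrak {g}_-\) with \(\mathfrak {g}_-\), leaving

$$\begin{aligned} \langle RX,Y\rangle&=\langle P_+X,P_-Y\rangle -\langle P_-X,P_+Y\rangle \,,\\ \langle X,RY\rangle&=-\langle P_+X,P_-Y\rangle +\langle P_-X,P_+Y\rangle =-\langle RX,Y\rangle \,. \end{aligned}$$

Without isotropy, the additional terms \(\langle P_+X,P_+Y\rangle -\langle P_-X,P_-Y\rangle \) appear with the same sign in both lines, which is consistent with R not being skew-symmetric in the first special case.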

Note that we will also illustrate the case where R is not defined from a decomposition into two subalgebras but rather from a decomposition into nilpotent and Cartan subalgebras. This different setup is accommodated without problems into Lie dialgebras. Interestingly, it can also be used to describe the same open Toda chain as in the AKS scheme and this will be illustrated in Sect. 6. The underlying algebraic structures are very different, though. In particular, R is skew-symmetric in this case while it is not in the AKS formulation, showing that the same Toda chain can arise from two distinct constructions.

3 Lagrangian multiform on a coadjoint orbit

3.1 The Lagrangian multiform and its properties

Recalling our comment about the coadjoint orbits of \(G_R\) in \(\mathfrak {g}^*\) being the natural arena for an integrable hierarchy, let us introduce the following Lagrangian 1-form

$$\begin{aligned} \mathscr {L}[\varphi ] = \sum _{k=1}^N \, \mathscr {L}_k \, \textrm{d}t_k =\mathcal{K}[\varphi ]-\mathcal{H}[\varphi ] \end{aligned}$$
(3.1)

with kinetic part

$$\begin{aligned} \mathcal{K}[\varphi ] = \sum _{k=1}^N \, \left( \, L,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) \, \textrm{d}t_k ,~~ L = \text {Ad}^{R*}_{\varphi } \cdot \Lambda ,~~\varphi \in G_R, \end{aligned}$$
(3.2)

and potential part

$$\begin{aligned} \mathcal{H}[\varphi ]=\sum _{k=1}^N \, H_k(L) \, \textrm{d}t_k . \end{aligned}$$
(3.3)

The field \(\varphi \in G_R\) contains the dynamical degrees of freedom and, as we will see, the Euler–Lagrange equation will take a natural form when expressed in terms of \(L = \text {Ad}^{R*}_{\varphi } \cdot \Lambda \). \(\Lambda \) is a fixed non-dynamical element of \(\mathfrak {g}^*\) which defines \(\mathcal{O}_\Lambda \), the phase space of the model. Each Lagrangian \(\mathscr {L}_k\) in the Lagrangian multiform has a structure comparable to the familiar Lagrangian \(p\dot{q}-H\) in classical mechanics. The potential part is expressed in terms of \(\text {Ad}^*\)-invariant functions \(H_k \in C^{\infty }(\mathfrak {g}^*)\) and we suppose we have N of them.

Remark 3.1

We emphasised that one important ingredient in producing equations of motion in Lax form from the coadjoint orbit construction is to use an \(\text {Ad}\)-invariant nondegenerate bilinear symmetric form \(\langle \,~,~\rangle \) on \(\mathfrak {g}\) to identify \(\mathfrak {g}^*\) with \(\mathfrak {g}\) and the coadjoint action with the adjoint action. The reader could therefore wonder why we have written our Lagrangian multiform using the pairing \((\,~,~)\), an element \(\Lambda \in \mathfrak {g}^*\) and functions \(H_k\) on \(\mathfrak {g}^*\). The point is that we found that it was less confusing to do so when deriving results in general and in examples, in order to identify correctly the subalgebras involved in the decomposition of \(\mathfrak {g}\) and \(\mathfrak {g}^*\). However, we cannot stress enough that ultimately we always use the bilinear form \(\langle \,~,~\rangle \) to make all the identifications and indeed obtain equations in Lax form, whether this is clearly mentioned or not. Hopefully, this understanding will make the exposition easier to follow.

We can now formulate our first main result.

Theorem 3.2

The Lagrangian 1-form (3.1) satisfies the corner equations (2.4)-(2.5) of the multi-time Euler–Lagrange equations. The standard Euler–Lagrange equations (2.3) associated with the Lagrangian coefficients \(\mathscr {L}_k\) take the form of compatible Lax equations

$$\begin{aligned} \partial _{t_k}L=[R_\pm \nabla H_k(L),L],\qquad k=1,\dots ,N. \end{aligned}$$
(3.4)

The closure relation holds: on solutions of (3.4) we have

$$\begin{aligned} \partial _{t_k}\mathscr {L}_j-\partial _{t_j}\mathscr {L}_k=0,\qquad j,k=1,\dots ,N. \end{aligned}$$

Proof

It is clear that each \(\mathscr {L}_k\) does not depend on \(\partial _{t_\ell } \varphi \) for \(\ell \ne k\) so the corner equation (2.4) is satisfied. To see that (2.5) holds, it is convenient to introduce local coordinates \(\phi _\alpha \), \(\alpha =1,\dots ,M\), on the group \(G_R\). The only source of dependence on velocities is in the kinetic term of \(\mathscr {L}_k\). Now

$$\begin{aligned} \left( \, \text {Ad}^{R*}_{\varphi } \cdot \Lambda ,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right){} & {} =\left( \, \Lambda ,\,\text {Ad}^{R}_{\varphi ^{-1}}\cdot \left( \partial _{t_k}\varphi \cdot _R \varphi ^{-1}\right) \right) \nonumber \\{} & {} = \left( \, \Lambda ,\,\varphi ^{-1} \cdot _R \partial _{t_k}\varphi \right) \nonumber \\{} & {} = \sum _{\alpha =1}^M\left( \, \Lambda ,\,\varphi ^{-1} \cdot _R \frac{\partial \varphi }{\partial \phi _\alpha } \,\right) \partial _{t_k}\phi _\alpha \equiv \sum _{\alpha =1}^M \pi _\alpha \, \partial _{t_k}\phi _\alpha \nonumber \\ \end{aligned}$$
(3.5)

where we have introduced the momentum

$$\begin{aligned} \pi _\alpha =\left( \, \Lambda ,\,\varphi ^{-1} \cdot _R \frac{\partial \varphi }{\partial \phi _\alpha } \,\right) \end{aligned}$$
(3.6)

conjugate to the field \(\phi _\alpha \). Thus,

$$\begin{aligned} \frac{\partial \mathscr {L}_k}{\partial \left( \partial _{t_k}\phi _\alpha \right) }=\pi _\alpha \end{aligned}$$

is independent of k. The remainder of the multi-time Euler–Lagrange equations consists of the standard Euler–Lagrange equations for each \(\mathscr {L}_k\). We compute

$$\begin{aligned} \begin{aligned} \delta \mathscr {L}_k&= \left( \, \delta L,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) + \left( \, L,\,\delta \,(\partial _{t_k}\varphi \cdot _R \varphi ^{-1}) \,\right) -\delta H_k(L) , \end{aligned} \end{aligned}$$
(3.7)

with (see Footnote 4)

$$\begin{aligned} \delta L=\text {ad}^{R*}_{\delta \varphi \cdot _R\varphi ^{-1}}\cdot L \end{aligned}$$
(3.8)

and

$$\begin{aligned} \delta H_k(L)&=\left( \delta L,\nabla H_k(L) \right) =-\left( L\,,\,\left[ \delta \varphi \cdot _R\varphi ^{-1},\nabla H_k(L)\right] _R \right) \\&=\frac{1}{2}\left( L\,,\,\left[ R\nabla H_k(L)\,,\, \delta \varphi \cdot _R\varphi ^{-1}\right] \right) =-\frac{1}{2}\left( \text {ad}^*_{R\nabla H_k(L)}\cdot L\,,\,\delta \varphi \cdot _R\varphi ^{-1} \right) \,. \end{aligned}$$

So,

$$\begin{aligned} \delta \mathscr {L}_k&= \left( \, \text {ad}^{R*}_{\delta \varphi \cdot _R\varphi ^{-1}}\cdot L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) + \left( \, L\,,\,\delta \,(\partial _{t_k}\varphi ) \cdot _R \varphi ^{-1} \,\right) \\&\quad - \left( \, L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \cdot _R \delta \varphi \cdot _R \varphi ^{-1} \,\right) +\frac{1}{2}\left( \text {ad}^*_{R\nabla H_k(L)}\cdot L\,,\,\delta \varphi \cdot _R\varphi ^{-1} \right) \\&= \left( \, \text {ad}^{R*}_{\delta \varphi \cdot _R\varphi ^{-1}}\cdot L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) - \left( \, \partial _{t_k}L\,,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) \\ {}&\quad + \left( \, L\,,\,\delta \varphi \cdot _R \varphi ^{-1} \cdot _R \partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) \\&\quad + \partial _{t_k}\left( \, L\,,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) - \left( \, L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \cdot _R \delta \varphi \cdot _R \varphi ^{-1} \,\right) \\ {}&\quad +\frac{1}{2}\left( \text {ad}^*_{R\nabla H_k(L)}\cdot L\,,\,\delta \varphi \cdot _R\varphi ^{-1} \right) \\&= \left( \, \text {ad}^{R*}_{\delta \varphi \cdot _R\varphi ^{-1}}\cdot L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) - \left( \, \text {ad}^{R*}_{\partial _{t_k}\varphi \cdot _R\varphi ^{-1}}\cdot L\,,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) \\&\quad + \left( \, L\,,\,\left[ \delta \varphi \cdot _R \varphi ^{-1} \,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \right] _R \,\right) +\frac{1}{2}\left( \text {ad}^*_{R\nabla H_k(L)}\cdot L\,,\,\delta \varphi \cdot _R\varphi ^{-1} \right) \\ {}&\quad +\partial _{t_k}\left( \, L\,,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) . \end{aligned}$$

The first and third terms cancel each other. In the second term we recognise \(\text {ad}^{R*}_{\partial _{t_k}\varphi \cdot _R\varphi ^{-1}}\cdot L=\partial _{t_k} L\). Hence,

$$\begin{aligned} \delta \mathscr {L}_k= \left( \, -\partial _{t_k} L+\frac{1}{2}\text {ad}^*_{R\nabla H_k(L)}\cdot L\,,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) +\partial _{t_k}\left( \, L\,,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) \end{aligned}$$

and we obtain the Euler–Lagrange equation for each \(\mathscr {L}_k\) as

$$\begin{aligned} \partial _{t_k} L= \frac{1}{2}\, \text {ad}^*_{R\nabla H_k(L)}\cdot L. \end{aligned}$$
(3.9)

Now recall that \( \frac{1}{2}\, \text {ad}^*_{R\nabla H_k(L)}\cdot L= \text {ad}^*_{R_\pm \nabla H_k(L)}\cdot L\) and that, with \(\mathfrak {g}\) being equipped with an \(\text {Ad}\)-invariant nondegenerate bilinear form, \(\text {ad}^*_{R_\pm \nabla H_k(L)}\cdot L\) is identified with \([R_\pm \nabla H_k(L), L]\). Thus, we have obtained (3.4) variationally as desired. That this set of equations is compatible follows from the commutativity of the flows which is a consequence of the mCYBE and the \(\text {Ad}\)-invariance of \(H_k\) as we now show. Going back to having \(L\in \mathfrak {g}^*\) and evaluating its derivatives on a fixed but arbitrary \(X\in \mathfrak {g}\), we have

$$\begin{aligned} (\partial _{t_k}\partial _{t_j}L)(X)&=-\frac{1}{2}\partial _{t_k}\left( L\,,\, [R\nabla H_j(L),X] \right) \\&=\frac{1}{4}\left( L\,,\,[R\nabla H_k(L)\,,\, [R\nabla H_j(L)\,,\,X]] \right) \\ {}&\quad -\frac{1}{4}\left( L\,,\,[R [R\nabla H_k(L)\,,\,\nabla H_j(L)]\,,\,X] \right) \,. \end{aligned}$$

Hence, using the Jacobi identity

$$\begin{aligned} ([\partial _{t_k}\,,\,\partial _{t_j}]L)(X)&=\frac{1}{4}\left( L\,,\,[[R\nabla H_k(L)\,,\, R\nabla H_j(L)]\,,\,X]\right) \\&\quad -\frac{1}{4}\left( L\,,\,[R ([R\nabla H_k(L)\,,\,\nabla H_j(L)]\right. \\ {}&\quad \left. +[\nabla H_k(L)\,,\,R\nabla H_j(L)])\,,\,X] \right) \\&=-\frac{1}{4}\left( L\,,\,[[\nabla H_k(L)\,,\, \nabla H_j(L)]\,,\,X]\right) =0 \end{aligned}$$

where we use the mCYBE in the second equality and property (2.27) in the last step. We now establish the closure relation, i.e. \(\textrm{d}\mathscr {L}=0\) on shell. It turns out that the kinetic and potential contributions vanish separately. We have

$$\begin{aligned} \partial _{t_j} \mathscr {L}_k-\partial _{t_k} \mathscr {L}_j= & {} \partial _{t_j} \left( \,L,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) -\partial _{t_k} \left( \,L,\,\partial _{t_j}\varphi \cdot _R \varphi ^{-1} \,\right) \nonumber \\ {}{} & {} -\partial _{t_j}H_k(L)+\partial _{t_k}H_j(L). \end{aligned}$$
(3.10)

Now, using (2.27), we find

$$\begin{aligned} \partial _{t_j}H_k(L)=\left( \partial _{t_j} L ,\, \nabla H_k(L)\right) =-\frac{1}{2}\,\left( L ,\, \left[ R\nabla H_j(L),\,\nabla H_k(L)\right] \right) =0.\quad \end{aligned}$$
(3.11)

Thus, it is a direct consequence of the \(\text {Ad}^*\)-invariance of the functions \(H_k\) that the potential contribution to \(\textrm{d}\mathscr {L}\) is zero on shell. We are now left with just the kinetic terms which can be rewritten as

$$\begin{aligned}&\left( \,\partial _{t_j} L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) - \left( \,\partial _{t_k} L\,,\,\partial _{t_j}\varphi \cdot _R \varphi ^{-1} \,\right) + \left( \, L\,,\,\partial _{t_j}(\partial _{t_k}\varphi \cdot _R \varphi ^{-1} ) \,\right) \\&\quad -\left( \, L\,,\,\partial _{t_k}(\partial _{t_j}\varphi \cdot _R \varphi ^{-1} ) \,\right) \\&\quad = \left( \,\partial _{t_j} L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) - \left( \,\partial _{t_k} L\,,\,\partial _{t_j}\varphi \cdot _R \varphi ^{-1} \,\right) \\&\qquad + \left( \, L\,,\,\partial _{t_j}\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,- \,\partial _{t_k}\partial _{t_j}\varphi \cdot _R \varphi ^{-1}\,\right) \\&\qquad + \left( \, L\,,\,\partial _{t_k}\varphi \cdot _R \partial _{t_j}\varphi ^{-1}\,-\,\partial _{t_j}\varphi \cdot _R \partial _{t_k}\varphi ^{-1}\,\right) . \end{aligned}$$

From the commutativity of flows, we have \(\partial _{t_j}\partial _{t_k}\varphi - \partial _{t_k}\partial _{t_j}\varphi = 0\), which leaves us with

$$\begin{aligned}{} & {} \left( \,\partial _{t_j} L,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \,\right) - \left( \,\partial _{t_k} L,\,\partial _{t_j}\varphi \cdot _R \varphi ^{-1} \,\right) \\{} & {} \quad + \left( \, L,\,\partial _{t_k}\varphi \cdot _R \partial _{t_j}\varphi ^{-1}\,-\,\partial _{t_j}\varphi \cdot _R \partial _{t_k}\varphi ^{-1}\,\right) . \end{aligned}$$

The on-shell relation

$$\begin{aligned} \partial _{t_j} L= \frac{1}{2}\, \text {ad}^*_{R\nabla H_j(L)}\cdot L\, \end{aligned}$$

allows us to express the first term as

$$\begin{aligned} \left( \,\partial _{t_j} L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,\right)&= \frac{1}{2} \left( \,\text {ad}^*_{R\,\nabla H_j(L)}\cdot L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,\right) \\&= -\frac{1}{2} \left( \,\text {ad}^*_{\partial _{t_k}\varphi \cdot _R \varphi ^{-1}}\cdot L\,,R\,\nabla H_j(L)\right) . \end{aligned}$$

Since

$$\begin{aligned} \left( \,\text {ad}^{R*}_{\partial _{t_k}\varphi \cdot _R \varphi ^{-1}}\cdot L,\nabla H_j(L)\right)= & {} \frac{1}{2} \left( \,\text {ad}^*_{\partial _{t_k}\varphi \cdot _R \varphi ^{-1}}\cdot L,R\nabla H_j(L)\right) \\ {}{} & {} + \frac{1}{2} \left( \,\text {ad}^*_{R\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}}\cdot L,\nabla H_j(L)\right) \end{aligned}$$

and

$$\begin{aligned} \left( \,\text {ad}^*_{R\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}}\cdot L,\nabla H_j(L)\right) =-\left( \,\text {ad}^*_{\nabla H_j(L)}\cdot L,R\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\right) = 0, \end{aligned}$$

we have a further simplification to

$$\begin{aligned} \left( \,\partial _{t_j} L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,\right)&= - \left( \,\text {ad}^{R*}_{\partial _{t_k}\varphi \cdot _R \varphi ^{-1}}\cdot L\,,\nabla H_j(L)\right) \\&= - \left( \,\partial _{t_k}L\,,\nabla H_j(L)\right) \\&= -\,\partial _{t_k}H_j(L) = 0 \end{aligned}$$

where we have used the result from (3.11) (with \(k\leftrightarrow j\)). Similarly, we have for the second term

$$\begin{aligned} \left( \,\partial _{t_k} L,\,\partial _{t_j}\varphi \cdot _R \varphi ^{-1}\,\right) = -\,\partial _{t_j}H_k(L) = 0. \end{aligned}$$
(3.12)

For the last remaining term, we have

$$\begin{aligned}&\left( \, L\,,\,\partial _{t_k}\varphi \cdot _R \partial _{t_j}\varphi ^{-1}\,-\,\partial _{t_j}\varphi \cdot _R \partial _{t_k}\varphi ^{-1}\,\right) \\&\qquad \qquad = \left( \, L\,,\,-\partial _{t_k}\varphi \cdot _R \varphi ^{-1} \cdot _R \partial _{t_j}\varphi \cdot _R \varphi ^{-1}\,+\,\partial _{t_j}\varphi \cdot _R \varphi ^{-1} \cdot _R \partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,\right) \\&\qquad \qquad = \left( \, L\,,\, \left[ \,\partial _{t_j}\varphi \cdot _R \varphi ^{-1}\,,\, \partial _{t_k}\varphi \cdot _R \varphi ^{-1}\, \right] _R\,\right) \\&\qquad \qquad = -\left( \,\text {ad}^{R*}_{\partial _{t_j}\varphi \cdot _R \varphi ^{-1}}\cdot L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,\right) \\&\qquad \qquad = - \left( \,\partial _{t_j} L\,,\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\,\right) = \,\partial _{t_k}H_j(L) = 0.\qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \square \end{aligned}$$

It is worth noting that the properties of our Lagrangian multiform heavily rely on the mCYBE for R. It is at the heart of the commutativity of the flows and the closure relation. The connection between the closure relation and the CYBE was first identified and established in [14] in the context of integrable field theories. Here it is established in the finite-dimensional context and related to Lie dialgebras.
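To make the role of the mCYBE concrete, the following is a minimal sketch under stated assumptions, not a construction taken from this paper. It assumes \(\mathfrak {g}=\mathfrak {gl}(3)\) split into upper triangular and strictly lower triangular matrices, with \(R=P_+-P_-\) the difference of the two projectors, and it assumes the normalisation \([RX,RY]-R([RX,Y]+[X,RY])=-[X,Y]\) and \([X,Y]_R=\frac{1}{2}([RX,Y]+[X,RY])\) suggested by the computations above. All function names are hypothetical. Integer matrices keep the checks exact.

```python
# Illustrative check (assumed example): g = gl(3), R = P_+ - P_- for the
# splitting into upper triangular and strictly lower triangular matrices.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(a, b, s=1):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return lin(matmul(a, b), matmul(b, a), -1)

def R(x):
    """R = P_+ - P_-: keep entries on or above the diagonal, negate those below."""
    return [[v if i <= j else -v for j, v in enumerate(row)]
            for i, row in enumerate(x)]

def twice_r_bracket(x, y):
    """2 [x, y]_R = [Rx, y] + [x, Ry] (the factor 2 keeps integer entries)."""
    return lin(comm(R(x), y), comm(x, R(y)))

def mcybe_defect(x, y):
    """[Rx, Ry] - R([Rx, y] + [x, Ry]) + [x, y]; zero when the mCYBE holds."""
    return lin(lin(comm(R(x), R(y)), R(twice_r_bracket(x, y)), -1), comm(x, y))

def jacobi_defect(x, y, z):
    """Cyclic sum of 4 [[x, y]_R, z]_R; zero when [ , ]_R satisfies Jacobi."""
    b = twice_r_bracket
    return lin(lin(b(b(x, y), z), b(b(y, z), x)), b(b(z, x), y))

X = [[1, 2, 0], [3, -1, 4], [5, 0, 2]]
Y = [[0, 1, 7], [-2, 3, 1], [4, 6, -5]]
Z = [[2, -3, 1], [0, 4, -1], [7, 1, 3]]
ZERO = [[0] * 3 for _ in range(3)]

print(mcybe_defect(X, Y) == ZERO)      # mCYBE on the sample pair
print(jacobi_defect(X, Y, Z) == ZERO)  # Jacobi identity for the R-bracket
```

For this particular \(R\), the R-bracket reduces to \([X_+,Y_+]-[X_-,Y_-]\), which is why the Jacobi identity follows from that of the two subalgebras.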

3.2 Closure relation, Hamiltonians in involution, Kostant–Kirillov form

In this section, we derive a structural result which brings together Lagrangian multiforms and essential Hamiltonian aspects of integrable systems. It will be convenient and clearer to work with local coordinates \(\phi _\alpha \), \(\alpha =1,\dots ,M\), on the group \(G_R\), as we did in (3.5). Then, our Lagrangian multiform can be written in the form

$$\begin{aligned} \mathscr {L}[\varphi ] =\sum _{k=1}^N\left( \,\sum _{\alpha =1}^M \pi _\alpha \, \partial _{t_k}\phi _\alpha - H_k\right) \textrm{d}t_k \end{aligned}$$
(3.13)

where we recall that the momentum \(\pi _\alpha \) is defined by

$$\begin{aligned} \pi _\alpha =\left( \, \Lambda ,\,\varphi ^{-1} \cdot _R \frac{\partial \varphi }{\partial \phi _\alpha } \,\right) . \end{aligned}$$
(3.14)

Each Lagrangian \(\mathscr {L}_k\) in the multiform has the structure \(p\dot{q}-H\) of a Lagrangian in phase space,

$$\begin{aligned} \mathscr {L}_k=\sum _{\alpha =1}^M \pi _\alpha \, \partial _{t_k}\phi _\alpha - H_k, \end{aligned}$$
(3.15)

and yields its Euler–Lagrange equations from the variation

$$\begin{aligned} \delta \mathscr {L}_k=\sum _{\beta =1}^M\left( \sum _{\alpha =1}^M \left( \frac{\partial \pi _\alpha }{\partial \phi _\beta }-\frac{\partial \pi _\beta }{\partial \phi _\alpha } \right) \, \partial _{t_k}\phi _\alpha - \frac{\partial H_k}{\partial \phi _\beta }\right) \delta \phi _\beta +\partial _{t_k}\left( \sum _{\alpha =1}^M \pi _\alpha \delta \phi _\alpha \right) .\qquad \nonumber \\ \end{aligned}$$
(3.16)

This is of course consistent with the general result of the previous section, and the comparison of the two expressions for \(\delta \mathscr {L}_k\) gives

$$\begin{aligned} \left( \, -\partial _{t_k} L+\frac{1}{2}\,\text {ad}^*_{R\nabla H_k(L)}\cdot L,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) = \sum _{\beta =1}^M\left( \sum _{\alpha =1}^M \Omega _{\alpha \beta }\, \partial _{t_k}\phi _\alpha - \frac{\partial H_k}{\partial \phi _\beta }\right) \delta \phi _\beta \nonumber \\ \end{aligned}$$
(3.17)

and

$$\begin{aligned} \left( \, L,\,\delta \,\varphi \cdot _R \varphi ^{-1} \,\right) =\left( \sum _{\alpha =1}^M \pi _\alpha \delta \phi _\alpha \right) . \end{aligned}$$
(3.18)

Thus, we have natural coordinate versions of key components of the theory. In particular, let us denote by \(\theta _R\) the vertical 1-form

$$\begin{aligned} \theta _R=-\sum _{\alpha =1}^M \pi _\alpha \delta \phi _\alpha =-\sum _{\alpha =1}^M \left( \, \Lambda ,\,\varphi ^{-1} \cdot _R \frac{\partial \varphi }{\partial \phi _\alpha } \,\right) \delta \phi _\alpha =-\left( \, \Lambda ,\,\varphi ^{-1} \cdot _R \delta \varphi \,\right) ,\nonumber \\ \end{aligned}$$
(3.19)

and let us introduce the vertical 2-form

$$\begin{aligned} \Omega _R=\sum _{\alpha <\beta } \Omega _{\alpha \beta }\,\delta \phi _\alpha \wedge \delta \phi _\beta ,\qquad \Omega _{\alpha \beta }=\frac{\partial \pi _\alpha }{\partial \phi _\beta }-\frac{\partial \pi _\beta }{\partial \phi _\alpha }. \end{aligned}$$
(3.20)

Observe the important relation

$$\begin{aligned} \Omega _R=\delta \theta _R. \end{aligned}$$
(3.21)

The form \(\Omega _R\) is the pullback to the group \(G_R\) by the map

$$\begin{aligned} \begin{aligned} \chi :~&G_R \rightarrow \mathcal{O}_\Lambda \\&\varphi \mapsto \text {Ad}^{R*}_{\varphi } \cdot \Lambda \end{aligned} \end{aligned}$$
(3.22)

of the Kostant–Kirillov symplectic form \(\omega _{R}\) on the coadjoint orbit through \(\Lambda \in \mathfrak {g}^*\). We recall here that we consider the coadjoint action of the group \(G_R\), not the group G. Relation (3.21) is the well-known fact that this pullback is an exact form. The expression \(\varphi ^{-1} \cdot _R \delta \varphi \) appearing in \(\theta _R\) can be interpreted as the Maurer–Cartan form on \(G_R\). The structure of our Lagrangian coefficients, in particular their kinetic part, is now elucidated in terms of fundamental objects associated with \(G_R\) and its coadjoint orbits in \(\mathfrak {g}^*\).

It is known that the map \(\chi \) is a submersion (see Footnote 5). Also, a coadjoint orbit is always even-dimensional as it admits the nondegenerate symplectic form \(\omega _{R}\). Let us introduce local coordinates \(\xi _m\), \(m=1,\dots , 2p\), on \(\mathcal{O}_\Lambda \) (\(2p\le M\)). The tangent map \(\chi _*\) is represented locally by the \(2p\times M\) matrix \(\left( \frac{\partial \xi _m}{\partial \phi _\alpha }\right) \). From now on, summation over repeated indices is understood. The pushforward of the vector fields \(\frac{\partial }{\partial \phi _\alpha }\) on \(G_R\) is given by

$$\begin{aligned} \chi _*\!\left( \frac{\partial }{\partial \phi _\alpha }\right) =\frac{\partial \xi _m}{\partial \phi _\alpha }\,\frac{\partial }{\partial \xi _m} \end{aligned}$$
(3.23)

and the pullback of the differential 1-forms \(\delta \xi _m\) on \(\mathcal{O}_\Lambda \) reads

$$\begin{aligned} \chi ^*(\delta \xi _m)=\frac{\partial \xi _m}{\partial \phi _\alpha }\, \delta \phi _\alpha . \end{aligned}$$
(3.24)

If we write for the Kostant–Kirillov form

$$\begin{aligned} \omega _R=\sum _{m<n}\omega _{mn}\,\delta \xi _m\wedge \delta \xi _n , \end{aligned}$$
(3.25)

then we have the following relation with the coefficients of its pullback \(\Omega _R=\chi ^*(\omega _R)\),

$$\begin{aligned} \Omega _{\alpha \beta }= \frac{\partial \xi _m}{\partial \phi _\alpha }\,\frac{\partial \xi _n}{\partial \phi _\beta }\, \omega _{mn} . \end{aligned}$$
(3.26)

In view of (3.17), it remains to introduce the Euler–Lagrange vertical 1-forms on \(G_R\)

$$\begin{aligned} EL_k\equiv EL_k^\beta \,\delta \phi _\beta \equiv \left( \Omega _{\alpha \beta }\,\partial _{t_k}\phi _\alpha -\frac{\partial H_k}{\partial \phi _\beta }\right) \,\delta \phi _\beta . \end{aligned}$$
(3.27)

This is the pullback of the following vertical 1-form on \(\mathcal{O}_\Lambda \),

$$\begin{aligned} EL_k=\chi ^*(\Upsilon _k)= \Upsilon _k^n\,\,\chi ^*(\delta \xi _n)=\left( \sum _{m}\omega _{mn}\, \partial _{t_k}\xi _m -\frac{\partial H_k}{\partial \xi _n} \right) \chi ^*(\delta \xi _n) \end{aligned}$$
(3.28)

with the relation

$$\begin{aligned} EL_k^\beta = \Upsilon _k^n\,\frac{\partial \xi _n}{\partial \phi _\beta }. \end{aligned}$$
(3.29)

Since \(\chi \) is a submersion, the matrix \(\left( \frac{\partial \xi _m}{\partial \phi _\alpha }\right) \) has maximal rank 2p, so the Euler–Lagrange equations \(EL_k^\beta =0\) imply the equations \(\Upsilon _k^n=0\) (and vice versa). This is of course just the confirmation in the present coordinate notations of the result we obtained previously that the (multi-time) Euler–Lagrange equations from our Lagrangian multiform produce Lax equations naturally living on coadjoint orbits of \(G_R\).

As a consequence, whenever we say that an equality holds “on shell”, we mean that it holds modulo \(EL_k^\beta =0\) or equivalently \(\Upsilon _k^n=0\). We can take advantage of this in the following way. \(\Omega _R\) is the pullback of the Kostant–Kirillov form \(\omega _R\) on the coadjoint orbit \(\mathcal{O}_\Lambda \). The latter is nondegenerate and therefore induces a Poisson bracket with bivector

$$\begin{aligned} P_R=\sum _{m<n}P_{mn}\frac{\partial }{\partial \xi _m}\wedge \frac{\partial }{\partial \xi _n},\qquad P_{mn}\, \omega _{nr}=\delta _{mr}. \end{aligned}$$
(3.30)

The corresponding Poisson bracket on \(\mathcal{O}_\Lambda \) is known (see e.g. [27, Chapter 14]) to be the restriction of the Lie–Poisson bracket (2.23) on \(\mathfrak {g}^*\)

$$\begin{aligned} \{f,g\}_R(\xi )=\Big (\xi ,\left[ \nabla f(\xi ),\nabla g(\xi )\right] _R\Big ). \end{aligned}$$
(3.31)

In other words, when f, g are restricted to \(\mathcal{O}_\Lambda \), we have

$$\begin{aligned} \{f,g\}_R=P_{mn}\frac{\partial f}{\partial \xi _m}\frac{\partial g}{\partial \xi _n}. \end{aligned}$$
(3.32)

With these notions introduced, we see that the Euler–Lagrange equations \(\Upsilon _k^n=0\) take the form

$$\begin{aligned} \sum _{m}\omega _{mn}\,\partial _{t_k}\xi _m =\frac{\partial H_k}{\partial \xi _n}, \end{aligned}$$
(3.33)

and can be written in Hamiltonian form

$$\begin{aligned} \partial _{t_k}\xi _m=P_{mn}\frac{\partial H_k}{\partial \xi _n}=\{\xi _m,H_k\}_R. \end{aligned}$$
(3.34)

The system of simultaneous equations (3.34) on the \(\xi _m\) admits a solution (at least locally) if and only if the flows are compatible, i.e. \([\partial _{t_k},\partial _{t_\ell }]=0\). For an arbitrary function f, this means

$$\begin{aligned}{}[\partial _{t_k},\partial _{t_\ell }]f=\{\{H_k, H_\ell \}_R,f\}_R=0. \end{aligned}$$

The stronger condition \(\{ H_k, H_\ell \}_R=0\) is the familiar Hamiltonian criterion for integrability (together with a sufficient number of independent such functions \(H_k\), of course).
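As an illustration of this involutivity criterion, here is a small sketch under explicit assumptions that are not part of the text above: \(\mathfrak {g}=\mathfrak {gl}(3)\) identified with its dual through the trace pairing \((\xi ,X)=\text {tr}(\xi X)\), \(R=P_+-P_-\) for the upper/strictly lower triangular splitting, and the invariant functions \(H_k(L)=\text {tr}(L^{k+1})/(k+1)\), whose gradients are \(L^k\) under this pairing. The helper names are hypothetical. The code evaluates twice the bracket (3.31) so that all entries stay integers, and contrasts the invariant functions with two linear, non-invariant ones.

```python
# Sketch (assumed setting): gl(3) with trace pairing, R = P_+ - P_-.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(a, b, s=1):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Matrix commutator [a, b]."""
    return lin(matmul(a, b), matmul(b, a), -1)

def R(x):
    """Keep entries on or above the diagonal, negate those below."""
    return [[v if i <= j else -v for j, v in enumerate(row)]
            for i, row in enumerate(x)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def power(a, k):
    """k-th matrix power by repeated multiplication."""
    n = len(a)
    out = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        out = matmul(out, a)
    return out

def twice_r_poisson(L, grad_f, grad_g):
    """2 {f, g}_R(L) = tr(L ([R grad_f, grad_g] + [grad_f, R grad_g]))."""
    inner = lin(comm(R(grad_f), grad_g), comm(grad_f, R(grad_g)))
    return trace(matmul(L, inner))

L = [[1, 2, 0], [3, -1, 4], [5, 0, 2]]

# Gradients of H_k(L) = tr(L^{k+1})/(k+1) are L^k under the trace pairing.
grads = [power(L, k) for k in (1, 2, 3)]
invariant_brackets = [twice_r_poisson(L, gj, gk) for gj in grads for gk in grads]

# Two linear functions f(L) = tr(L E11), g(L) = tr(L E12): not Ad-invariant.
E11 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
E12 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
linear_bracket = twice_r_poisson(L, E11, E12)

print(all(b == 0 for b in invariant_brackets))  # invariants are in involution
print(linear_bracket)                           # generic functions are not
```

The vanishing for the trace invariants uses only \([L,L^k]=0\) and the cyclicity of the trace, so it does not depend on the sample matrix chosen here.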

After these preliminary steps, we are now ready to state our second main result and its corollary, the significance of which will be discussed after the proofs.

Theorem 3.3

The following identity holds

$$\begin{aligned} \frac{\partial \mathscr {L}_k}{\partial t_\ell }-\frac{\partial \mathscr {L}_\ell }{\partial t_k}+\Upsilon _k^m\,P_{mn}\,\Upsilon _\ell ^n=\{H_k,H_\ell \}_R. \end{aligned}$$
(3.35)

Proof

The proof is by direct computation.

$$\begin{aligned} \frac{\partial \mathscr {L}_k}{\partial t_\ell }-\frac{\partial \mathscr {L}_\ell }{\partial t_k}&=\left( \frac{\partial \pi _\alpha }{\partial \phi _\beta }-\frac{\partial \pi _\beta }{\partial \phi _\alpha }\right) \partial _{t_\ell }\phi _\beta \,\partial _{t_k} \phi _\alpha -\frac{\partial H_k}{\partial \phi _\beta }\partial _{t_\ell }\phi _\beta +\frac{\partial H_\ell }{\partial \phi _\alpha }\partial _{t_k}\phi _\alpha \\&=\left( \Omega _{\alpha \beta } \,\partial _{t_k} \phi _\alpha -\frac{\partial H_k}{\partial \phi _\beta }\right) \partial _{t_\ell }\phi _\beta +\frac{\partial H_\ell }{\partial \xi _m}\frac{\partial \xi _m}{\partial \phi _\alpha }\partial _{t_k}\phi _\alpha \\&=\left( \omega _{mn} \,\partial _{t_k} \xi _m-\frac{\partial H_k}{\partial \xi _n}\right) \partial _{t_\ell }\xi _n+\frac{\partial H_\ell }{\partial \xi _m}\partial _{t_k}\xi _m\\&=\left( \omega _{mn} \,\partial _{t_k} \xi _m-\frac{\partial H_k}{\partial \xi _n}\right) P_{nr}\left( \omega _{rs}\,\partial _{t_\ell }\xi _s+\frac{\partial H_\ell }{\partial \xi _r}-\frac{\partial H_\ell }{\partial \xi _r} \right) +\frac{\partial H_\ell }{\partial \xi _m}\partial _{t_k}\xi _m\\&=-\Upsilon _k^n\,P_{nr}\,\Upsilon _\ell ^r+\frac{\partial H_k}{\partial \xi _n}\,P_{nr} \, \frac{\partial H_\ell }{\partial \xi _r}, \end{aligned}$$

hence the result. \(\square \)
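The last two steps of this computation are purely algebraic and can be checked numerically. The sketch below assumes a two-dimensional orbit with a constant antisymmetric \(\omega \) and arbitrary sample values for the velocities \(\partial _{t_k}\xi \), \(\partial _{t_\ell }\xi \) and the gradients of \(H_k\), \(H_\ell \); these numbers and all function names are hypothetical. It starts from the third line of the proof and verifies (3.35) off shell, then shows that the double-zero term drops out once \(\Upsilon _k^n=0\).

```python
from fractions import Fraction as F

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix with Fraction entries."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def bilinear(u, P, v):
    """u^n P_{nr} v^r."""
    return sum(u[n] * P[n][r] * v[r] for n in range(len(u)) for r in range(len(v)))

def upsilon(omega, vel, grad):
    """Upsilon^n = sum_m omega_{mn} vel_m - grad_n, as in (3.28)."""
    size = len(vel)
    return [sum(omega[m][n] * vel[m] for m in range(size)) - grad[n]
            for n in range(size)]

def closure_coefficient(omega, vel_k, vel_l, grad_k, grad_l):
    """Third line of the proof: Upsilon_k . vel_l + grad_l . vel_k."""
    return dot(upsilon(omega, vel_k, grad_k), vel_l) + dot(grad_l, vel_k)

omega = [[F(0), F(2)], [F(-2), F(0)]]
P = inverse_2x2(omega)                  # P_{mn} omega_{nr} = delta_{mr}, cf. (3.30)

v, w = [F(1), F(3)], [F(2), F(-1)]      # sample velocities (off shell)
g, h = [F(5), F(7)], [F(-4), F(6)]      # sample gradients of H_k and H_l

lhs = closure_coefficient(omega, v, w, g, h)
double_zero = bilinear(upsilon(omega, v, g), P, upsilon(omega, w, h))
poisson = bilinear(g, P, h)             # {H_k, H_l} in the form (3.32)
print(lhs + double_zero == poisson)     # identity (3.35), off shell

# On shell for the k-th flow: choose vel_k so that Upsilon_k vanishes.
v_shell = [sum(P[r][m] * g[r] for r in range(2)) for m in range(2)]
print(upsilon(omega, v_shell, g))       # both components vanish
print(closure_coefficient(omega, v_shell, w, g, h) == poisson)
```

Since the correction term is bilinear in \(\Upsilon _k\) and \(\Upsilon _\ell \), it vanishes as soon as either flow is on shell, which is what the last check exhibits.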

Corollary 3.4

The closure relation for the Lagrangian multiform \(\mathscr {L}\) is equivalent to the involutivity of the Hamiltonians \(H_k\) with respect to the Lie–Poisson R-bracket \(\{\,~,~\}_R\).

Proof

The closure relation requires that on shell, we have

$$\begin{aligned} \textrm{d}\mathscr {L}=\sum _{k<\ell }\left( \frac{\partial \mathscr {L}_\ell }{\partial t_k}-\frac{\partial \mathscr {L}_k}{\partial t_\ell }\right) \textrm{d}t_k\wedge \textrm{d}t_\ell =0. \end{aligned}$$
(3.36)

From the previous theorem, on shell we have

$$\begin{aligned} \frac{\partial \mathscr {L}_k}{\partial t_\ell }-\frac{\partial \mathscr {L}_\ell }{\partial t_k}=\{H_k,H_\ell \}_R, \end{aligned}$$
(3.37)

hence the result. \(\square \)

Remark 3.5

The connection between the closure relation for Lagrangian 1-forms and the involutivity of Hamiltonians was first discussed in [6]. The content of our Corollary establishes this result for all Lagrangian 1-forms in the class that we have introduced in this paper. They include any system describable by the coadjoint orbit and r-matrix methods of Lie dialgebras. An extension of the connection between closure and involutivity to the field theory context (Lagrangian 2-forms) was discussed in [31].

Remark 3.6

In the field theory context, the connection between the closure relation and the classical Yang–Baxter equation was elucidated in [14]. In the present article on Lagrangian 1-forms, this connection is also at the heart of our results since the entire construction is based on the availability of the second Lie bracket \([\,~,~]_R\) on \(\mathfrak {g}\), a feature ensured if R satisfies the mCYBE.

Remark 3.7

The content of the theorem sheds fundamental light on the link between the closure relation and the involutivity of the Hamiltonians, as it establishes an off-shell identity which clearly shows the interplay between the coefficients of \(\textrm{d}\mathscr {L}\), the Euler–Lagrange equations, the Poisson tensor on the coadjoint orbit, and the Poisson bracket of the Hamiltonians related to our Lagrangian coefficients. A particular point is that it shows in the present general setting that \(\textrm{d}\mathscr {L}\) has a so-called double zero on the equations of motion. This idea was introduced in [10] and developed in [15, 32] as an important ingredient of Lagrangian multiform theory. However, the relation to Hamiltonians in involution was not noticed there. The status of the “double zero” term \(\Upsilon _k^n\,P_{nr}\,\Upsilon _\ell ^r\) is now clearly identified as well as its relation to the Euler–Lagrange equations. This term is the off-shell element linking the Hamiltonian integrability criterion \(\{H_k,H_\ell \}_R=0\) and the integrability criterion advocated in Lagrangian multiform theory: the closure relation \(\textrm{d}\mathscr {L}=0\) on shell.

4 Lagrangian multiform on a coadjoint orbit from reduction

It is well-known that many integrable systems arise from Hamiltonian reduction on the cotangent bundle of a Lie group A following the intuitive idea that the more intricate dynamics of the integrable system of interest on the reduced phase space comes from the simplest “free” dynamics on the cotangent bundle. In this section, we show how one can construct a general Lagrangian multiform on a coadjoint orbit by a Lagrangian analogue of the procedure of Hamiltonian reduction. The multiform of Sect. 3 is recovered as a special case.

We follow mainly the exposition and ideas in [21, Lectures 1 & 2] to summarise the key notions. By fixing a left trivialisation of \(T^*A\) we can parametrise it with \((\alpha ,a)\in \mathfrak {a}^*\times A\) where \(\mathfrak {a}^*\) is the dual of the Lie algebra \(\mathfrak {a}\) of A. The canonical symplectic form \(\Omega \) is exact and derives from the canonical one-form \(\theta \)

$$\begin{aligned} \Omega =\delta \theta ~~\text {with}~~\theta =(\alpha ,a^{-1}\delta a). \end{aligned}$$
(4.1)

The cotangent lifts to \(T^*A\) of the action of A on itself by left and right translations read

$$\begin{aligned} \lambda _b:(\alpha ,a)\mapsto (\alpha ,b\,a),\quad \rho _b:(\alpha ,a)\mapsto (\text {Ad}_b^* \cdot \alpha ,a\,b^{-1}),~~b\in A. \end{aligned}$$
(4.2)

The canonical one-form and hence the symplectic form are invariant under these actions. The corresponding moment maps are given by

$$\begin{aligned} \mu _{\ell }(\alpha ,a)=\text {Ad}^*_{a}\cdot \alpha ,~~\mu _{r}(\alpha ,a)=-\alpha . \end{aligned}$$
(4.3)

In applications to integrable systems, one usually considers the case where only Lie subgroups \(A_+\) and \(A_-\) of A act by left and right translations. In this case, the moment maps are the restriction of the above moment maps to \(\mathfrak {a}_\pm \), the Lie algebras of \(A_\pm \). Thus they are elements of \(\mathfrak {a}^*_\pm \) and we denote them by

$$\begin{aligned} \mu _\ell (\alpha ,a)=\Pi _{\mathfrak {a}^*_+}\left( \text {Ad}^*_{a}\cdot \alpha \right) ,~~\mu _{r}(\alpha ,a)=-\Pi _{\mathfrak {a}^*_-}\alpha . \end{aligned}$$
(4.4)

In the special case where \(A_+\) is the trivial group and \(A_-=A\), it is known that the quotient Poisson manifold \(T^*A/A\) is isomorphic to \(\mathfrak {a}^*\) equipped with the Lie–Poisson bracket, see e.g. [21, Proposition 1.24].

Since our emphasis is on the Lagrangian formalism, let us describe the translation of the above situation into this framework. We consider the following Lagrangian on \(T^*A\)

$$\begin{aligned} \mathscr {L}^0=\left( \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) . \end{aligned}$$
(4.5)

The importance of \(\mathscr {L}^0\) is that the Cartan form arising from its variation is precisely the canonical one-form on \(T^*A\). Indeed, we have

$$\begin{aligned} \delta \mathscr {L}^0&=\left( \delta \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -\left( \alpha ,a^{-1}\delta a a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) +\left( \alpha ,a^{-1}\delta \frac{\textrm{d}a}{\textrm{d}t} \right) \\&=\left( \delta \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -\left( \alpha ,a^{-1}\delta a a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -\left( \frac{\textrm{d}\alpha }{\textrm{d}t},a^{-1}\delta a \right) \\ {}&\quad +\left( \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} a^{-1}\delta a \right) +\frac{\textrm{d}}{\textrm{d}t}\left( \alpha ,a^{-1}\delta a \right) \\&=\left( \delta \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -\left( \frac{\textrm{d}}{\textrm{d}t}\left( \text {Ad}^*_a\cdot \alpha \right) ,\delta a a^{-1}\right) +\frac{\textrm{d}}{\textrm{d}t}\left( \alpha ,a^{-1}\delta a \right) . \end{aligned}$$

In the last term we recognise that the Cartan form is \(\theta \) (up to a conventional sign). Also, we see that this Lagrangian yields trivial (free) equations of motion

$$\begin{aligned} a^{-1}\frac{\textrm{d}a}{\textrm{d}t}=0,~~\frac{\textrm{d}}{\textrm{d}t}\left( \text {Ad}^*_a\cdot \alpha \right) =0\quad \Leftrightarrow \quad \frac{\textrm{d}a}{\textrm{d}t}=0,~~\frac{\textrm{d}\alpha }{\textrm{d}t}=0. \end{aligned}$$
(4.6)

The Lagrangian \(\mathscr {L}^0\) is invariant under the global transformations \((\alpha ,a)\mapsto (\alpha ,b\,a)\) and \((\alpha ,a)\mapsto (\text {Ad}_b^*\cdot \alpha ,a\,b^{-1})\) where \(b\in A\) is constant. The conserved currents produced by Noether’s theorem are the moment maps \(\mu _{\ell ,r}\). It is immediate from (4.6) that they are indeed conserved currents. The symmetry group \(A\times A\) of this free theory is too large to produce systems of interest. One easy way to reduce the symmetry group to \(A_+\times A_-=\{e\}\times A\) acting by right translations only is to include a potential term where the potential function depends on \((\alpha ,a)\) only through \(\mu _\ell \):

$$\begin{aligned} \mathscr {L}=\left( \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -H\left( \mu _\ell (\alpha ,a)\right) =\left( \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -H\left( \text {Ad}^*_{a}\cdot \alpha \right) , \end{aligned}$$
(4.7)

where H is a function on \(\mathfrak {a}^*\). By Noether’s theorem, we expect that \(\mu _r=-\alpha \) is still a conserved current. Indeed, a computation analogous to that in the proof of Theorem 3.2 gives

$$\begin{aligned}&\delta \mathscr {L}=\left( \delta \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} -\text {Ad}_{a^{-1}}\cdot \nabla H\left( \text {Ad}^*_{a}\cdot \alpha \right) \right) \nonumber \\&-\left( \frac{\textrm{d}}{\textrm{d}t}\left( \text {Ad}^*_a\cdot \alpha \right) -\text {ad}^*_{\nabla H\left( \text {Ad}^*_{a}\cdot \alpha \right) }\cdot \left( \text {Ad}^*_{a}\cdot \alpha \right) ,\delta a a^{-1}\right) +\frac{\textrm{d}}{\textrm{d}t}\left( \alpha ,a^{-1}\delta a \right) . \end{aligned}$$
(4.8)

Thus the equations of motion read

$$\begin{aligned} \frac{\textrm{d}a}{\textrm{d}t}\,a^{-1} -\nabla H\left( \text {Ad}^*_{a}\cdot \alpha \right) =0,~~ \frac{\textrm{d}}{\textrm{d}t}\left( \text {Ad}^*_a\cdot \alpha \right) -\text {ad}^*_{\nabla H\left( \text {Ad}^*_{a}\cdot \alpha \right) }\cdot \left( \text {Ad}^*_{a}\cdot \alpha \right) =0\nonumber \\ \end{aligned}$$
(4.9)

or equivalently

$$\begin{aligned} \frac{\textrm{d}a}{\textrm{d}t}\,a^{-1} -\nabla H\left( \text {Ad}^*_{a}\cdot \alpha \right) =0,~~ \frac{\textrm{d}}{\textrm{d}t}\alpha =0. \end{aligned}$$
(4.10)

The analogue of fixing the moment map \(\mu _r=-\alpha \) to some fixed value \(-\Lambda \in \mathfrak {a}^*\) in the Hamiltonian reduction approach consists of “integrating out degrees of freedom”: we solve \(\frac{\textrm{d}}{\textrm{d}t}\alpha =0\) by \(\alpha =\Lambda \in \mathfrak {a}^*\) and insert this back into the Lagrangian to get the effective Lagrangian of the reduced model. This yields

$$\begin{aligned} \mathscr {L}_{\text {eff}}=\left( \Lambda ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -H\left( \text {Ad}^*_{a}\cdot \Lambda \right) . \end{aligned}$$
(4.11)

This Lagrangian describes a system on the coadjoint orbit of \(\Lambda \in \mathfrak {a}^*\) under A. At this stage, if we equip \(\mathfrak {a}\) with a nondegenerate symmetric bilinear form to identify \(\mathfrak {a}^*\) with \(\mathfrak {a}\), we obtain as before that the equations of motion take the Lax form for \(L=\text {Ad}_{a}^*\cdot \Lambda \)

$$\begin{aligned} \frac{\textrm{d}L}{\textrm{d}t}=\left[ \nabla H(L),\, L\right] . \end{aligned}$$
(4.12)
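
As a worked step, here is how (4.12) arises in the simplest setting. We assume, purely for illustration, that A is a matrix group and that the bilinear form is the trace form, so that \(\text {Ad}^*_a\cdot \Lambda \) is identified with \(a\,\Lambda \,a^{-1}\). Using that \(\Lambda \) is constant, that \(\frac{\textrm{d}}{\textrm{d}t}a^{-1}=-a^{-1}\frac{\textrm{d}a}{\textrm{d}t}a^{-1}\), and the first equation in (4.10) with \(\alpha =\Lambda \), one finds

$$\begin{aligned} \frac{\textrm{d}L}{\textrm{d}t}=\frac{\textrm{d}a}{\textrm{d}t}\,\Lambda \,a^{-1}-a\,\Lambda \,a^{-1}\,\frac{\textrm{d}a}{\textrm{d}t}\,a^{-1}=\left[ \frac{\textrm{d}a}{\textrm{d}t}\,a^{-1},\,L\right] =\left[ \nabla H(L),\,L\right] . \end{aligned}$$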

Note that we did not assume anything special about the function H so that strictly speaking there is no notion of integrability at this stage, only that the equations for the system under consideration are written in Lax form.

Applying this construction to the case \(A=G_R\), \(\mathfrak {a}=\mathfrak {g}_R\) we see that each of our Lagrangians \(\mathscr {L}_k\) in (3.1) is of the form of \(\mathscr {L}_{\text {eff}}\). Of course, in that case, each function \(H_k\) was assumed to have the additional property of being invariant under the coadjoint action of G so that the closure relation, the Lagrangian criterion for integrability (see Footnote 6), was valid. However, let us go back to the general situation above and suppose we form a Lagrangian 1-form by assembling N effective Lagrangians of the form (4.11) with N independent (arbitrary smooth) functions \(H_k\) defined on \(\mathfrak {a}^*\) (or possibly only on \(\mathcal{O}_\Lambda \)):

$$\begin{aligned} \mathscr {L}= \sum _{k=1}^N \, \mathscr {L}_k \, \textrm{d}t_k = \sum _{k=1}^N \, \left( \left( \Lambda ,a^{-1}\partial _{t_k}a \right) -H_k\left( \text {Ad}^*_{a}\cdot \Lambda \right) \right) \, \textrm{d}t_k. \end{aligned}$$
(4.13)

The arguments of Sect. 3.2 can be repeated verbatim and lead to the same conclusion as in Theorem 3.3.

Theorem 4.1

The following identity holds

$$\begin{aligned} \frac{\partial \mathscr {L}_k}{\partial t_\ell }-\frac{\partial \mathscr {L}_\ell }{\partial t_k}+\Upsilon _k^m\,P_{mn}\,\Upsilon _\ell ^n=\{H_k,H_\ell \}_{\mathfrak {a}^*}, \end{aligned}$$
(4.14)

where \(\{~,~\}_{\mathfrak {a}^*}\) is the Lie–Poisson bracket on \(\mathfrak {a}^*\) and \(P_{mn}\) the corresponding Poisson tensor on \(\mathcal{O}_\Lambda \).

Note that so far we have not assumed anything on the functions \(H_k\). The exact analogue of Corollary 3.4 follows from Theorem 4.1 in the present context. We stress its importance in this general situation: if we can solve for \(\mathscr {L}_k\) such that the multi-time Euler–Lagrange equations and the closure relation hold for the 1-form (4.13) then it qualifies as a Lagrangian multiform and (4.14) implies that the corresponding functions \(H_k\) are in involution. Conversely, if we use functions \(H_k\) that are in involution with respect to \(\{~,~\}_{\mathfrak {a}^*}\) then the 1-form (4.13) satisfies the closure relation and is a Lagrangian multiform. In Sect. 3, we used the latter point of view in a special situation: We took advantage of the Lie dialgebra construction which uses \(\text {Ad}^*_G\)-invariant functions to produce functions in involution with respect to \(\{~,~\}_R\) on the dual of \(\mathfrak {g}_R\). In the present section, the above results imply the stronger statement that one can associate a Lagrangian multiform with any family of Hamiltonians in involution on any coadjoint orbit of a Lie group.

The prospect of the reverse procedure, which consists of solving the multi-time Euler–Lagrange equations and the closure relations to produce (new?) integrable systems and Hamiltonians in involution, is tantalising. However, it is far from clear whether such a philosophy is more (or less) promising than the established integrable system classification tools such as symmetry analysis or classification of the solutions of the (modified) classical Yang–Baxter equation.

It is instructive to recover our Lagrangians \(\mathscr {L}_k\) as effective Lagrangians via a slightly different, but related, mechanism. Suppose now that we reduce the symmetry group to \(A_+\times A_-\) such that, at least locally, any element \(a\in A\) factorises uniquely as \(a=a_+^{-1}a_-\), \(a_\pm \in A_\pm \). At the Lie algebra level, we have a unique decomposition \(X=X_+-X_-\), \(X_\pm \in \mathfrak {a}_\pm \) and by duality \(\alpha =\Pi _{\mathfrak {a}^*_+}\alpha -\Pi _{\mathfrak {a}^*_-}\alpha \). Identifying a with \((a_+,a_-)\), we see that the action of \(A_+\times A_-\) on a amounts to an action by right translations \((a_+,a_-)\mapsto (a_+b_+^{-1},a_-b_-^{-1})\). Thus,

$$\begin{aligned} s:T^*A\rightarrow \mathfrak {a}^*,~~(\alpha ,a)\mapsto \text {Ad}^*_{a_-}\cdot \alpha \end{aligned}$$
(4.15)

is invariant under the action of \(A_+\times A_-\). Equipped with this, as before, it is easy to introduce a Lagrangian which has \(A_+\times A_-\) as symmetry group

$$\begin{aligned} \mathscr {L}=\left( \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) -H\left( -s(\alpha ,a)\right) , \end{aligned}$$
(4.16)

where H is a function on \(\mathfrak {a}^*\). By Noether’s theorem we expect that \(\mu _{\ell ,r}\) in (4.4) are conserved currents. The direct verification from the Euler–Lagrange equations follows by noticing that

$$\begin{aligned} \left( \alpha ,a^{-1}\frac{\textrm{d}a}{\textrm{d}t} \right) =-\left( s,\frac{\textrm{d}a_+}{\textrm{d}t}a_+^{-1}-\frac{\textrm{d}a_-}{\textrm{d}t}a_-^{-1} \right) . \end{aligned}$$

Therefore

$$\begin{aligned} \delta \mathscr {L}={} & {} -\left( \delta s,\frac{\textrm{d}a_+}{\textrm{d}t}a_+^{-1}-\frac{\textrm{d}a_-}{\textrm{d}t} a_-^{-1}-\nabla H(-s)\right) +\left( \frac{\textrm{d}}{\textrm{d}t}\left( \text {Ad}^*_{a_+^{-1}}\cdot s\right) ,\,\delta a_+a_+^{-1} \right) \nonumber \\{} & {} -\left( \frac{\textrm{d}}{\textrm{d}t}\left( \text {Ad}^*_{a_-^{-1}}\cdot s\right) ,\,\delta a_-a_-^{-1} \right) - \frac{\textrm{d}}{\textrm{d}t}\left( s,\delta a_+a_+^{-1}-\delta a_-a_-^{-1} \right) \end{aligned}$$
(4.17)

giving

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{\textrm{d}}{\textrm{d}t}\,\Pi _{\mathfrak {a}^*_+}\left( \text {Ad}^*_{a_+^{-1}}\cdot s \right) =\dfrac{\textrm{d}}{\textrm{d}t}\,\Pi _{\mathfrak {a}^*_+}\left( \text {Ad}^*_{a}\cdot \alpha \right) =\dfrac{\textrm{d}}{\textrm{d}t}\,\mu _\ell =0,\\ \dfrac{\textrm{d}}{\textrm{d}t}\,\Pi _{\mathfrak {a}^*_-}\left( \text {Ad}^*_{a_-^{-1}}\cdot s \right) =\dfrac{\textrm{d}}{\textrm{d}t}\,\Pi _{\mathfrak {a}^*_-} \alpha =-\dfrac{\textrm{d}}{\textrm{d}t}\,\mu _r=0,\\ \dfrac{\textrm{d}a_+}{\textrm{d}t}\,a_+^{-1}-\dfrac{\textrm{d}a_-}{\textrm{d}t}\,a_-^{-1} -\nabla H(-s)=0. \end{array}\right. } \end{aligned}$$
(4.18)

The first two equations are indeed the conservation of the Noether currents, as expected. We use them to integrate out the corresponding degrees of freedom. Namely, we set

$$\begin{aligned} \mu _\ell =-\Lambda _+\in \mathfrak {a}^*_+,~~\mu _r=\Lambda _-\in \mathfrak {a}^*_-, \end{aligned}$$
(4.19)

and substitute back into the Lagrangian. To obtain the effective Lagrangian, note that

$$\begin{aligned} s=\Pi _{\mathfrak {a}^*_+}\left( \text {Ad}^*_{{a_+}}\cdot \mu _\ell \right) +\Pi _{\mathfrak {a}^*_-}\left( \text {Ad}^*_{{a_-}}\cdot \mu _r \right) . \end{aligned}$$
(4.20)

This can be seen by the following computation (see Footnote 7). For any \(X\in \mathfrak {a}\),

$$\begin{aligned} \left( s,X\right) =\left( s,X_+-X_-\right) =&\left( \text {Ad}^*_{a_+^{-1}}\cdot s,\text {Ad}_{a_+^{-1}}\cdot X_+\right) -\left( \text {Ad}^*_{a_-^{-1}}\cdot s,\text {Ad}_{a_-^{-1}}\cdot X_-\right) \\ =&\left( \mu _\ell ,\text {Ad}_{a_+^{-1}}\cdot X_+\right) +\left( \mu _r,\text {Ad}_{a_-^{-1}}\cdot X_-\right) \\ =&\left( \Pi _{\mathfrak {a}^*_+}\!\left( \text {Ad}^*_{{a_+}}\cdot \mu _\ell \right) +\Pi _{\mathfrak {a}^*_-}\!\left( \text {Ad}^*_{{a_-}}\cdot \mu _r \right) ,X\right) \,. \end{aligned}$$

Putting everything together, and setting

$$\begin{aligned} L=-s=\Pi _{\mathfrak {a}^*_+}\left( \text {Ad}^*_{{a_+}}\cdot \Lambda _+ \right) -\Pi _{\mathfrak {a}^*_-}\left( \text {Ad}^*_{{a_-}}\cdot \Lambda _- \right) \end{aligned}$$
(4.21)

the effective Lagrangian is

$$\begin{aligned} \mathscr {L}_{\text {eff}}=\left( L,\frac{\textrm{d}a_+}{\textrm{d}t}\,a_+^{-1} -\frac{\textrm{d}a_-}{\textrm{d}t}\,a_-^{-1}\right) -H\left( L\right) . \end{aligned}$$
(4.22)

In the special case where \(A=G\), \(\mathfrak {a}=\mathfrak {g}\), \(\mathfrak {a}_\pm =\mathfrak {g}_{\pm }\) with \(\mathfrak {g}_\pm =R_\pm (\mathfrak {g})\), this effective Lagrangian is exactly of the form of our Lagrangian coefficients \(\mathscr {L}_k\). This alternative construction amounts to reducing a free system on \(T^*G\) by acting with \(G_+\times G_-\simeq G_R\). We refer the interested reader to [21, Section 2.4] for a discussion of the connection between the reduction on \(T^*G_R\) by left translations of \(G_R\) and the reduction on \(T^*G\) by left and right translations of \(G_+\times G_-\). The main reason for discussing the alternative construction here is that it suggests that one may be able to construct Lagrangian multiforms for systems of Calogero–Moser type. In the terminology of [21, Section 2.4], one would have to drop the assumption that the subgroup H of \(G\times G\) used for reduction is transversal to the diagonal subgroup and implement instead a reduction corresponding for instance to the action of G on itself by conjugation. We leave this open problem for future investigation.

5 Open Toda chain in the AKS scheme

As this is our first example, we first spend some time reviewing the known Adler–Kostant–Symes Lie algebraic construction of the Lax matrix and the Lax equation reproducing Flaschka’s approach. Then, we will make the connection with our variational approach.

Algebraic setup: Let us choose \(\mathfrak {g}=\mathfrak {sl}(N+1)\), the Lie algebra of \((N+1)\times (N+1)\) traceless real matrices, \(\mathfrak {g}_+\) the Lie subalgebra of skew-symmetric matrices and \(\mathfrak {g}_-\) the Lie subalgebra of upper triangular traceless matrices, yielding

$$\begin{aligned} \mathfrak {g}=\mathfrak {g}_+\oplus \mathfrak {g}_-. \end{aligned}$$
(5.1)

Here \(R=P_+-P_-\) and \(R_\pm =\pm P_\pm \) with \(P_\pm \) the projector on \(\mathfrak {g}_\pm \) along \(\mathfrak {g}_\mp \). The following \(\text {Ad}\)-invariant nondegenerate bilinear form

$$\begin{aligned} \langle \,X,Y\rangle =\textrm{Tr}(XY) \end{aligned}$$
(5.2)

allows the identification \(\mathfrak {g}^*\simeq \mathfrak {g}\), and it induces the decomposition

$$\begin{aligned} \mathfrak {g}^*=\mathfrak {g}_-^*\oplus \mathfrak {g}_+^*\simeq \mathfrak {g}_+^\perp \oplus \mathfrak {g}_-^\perp , \end{aligned}$$
(5.3)

where \(\mathfrak {g}_\pm ^\perp \) is the orthogonal complement of \(\mathfrak {g}_{\pm }\) with respect to \(\langle \,~,~\rangle \): \(\mathfrak {g}_+^\perp \) is the subspace of traceless symmetric matrices and \(\mathfrak {g}_-^\perp \) the subspace of strictly upper triangular matrices. Let us choose

$$\begin{aligned} \Lambda =\begin{pmatrix} 0 &{}\quad 1 &{}\quad 0 &{}\quad 0 &{}\quad \dots &{}\quad 0\\ 1 &{} 0 &{} 1 &{} 0 &{}\dots &{} 0\\ 0 &{} 1 &{} 0 &{} 1 &{}\dots &{} 0\\ 0 &{} 0 &{} 1 &{}\ddots &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} \ddots &{} 1\\ 0 &{} 0 &{} 0 &{} \dots &{}1 &{} 0\\ \end{pmatrix}\in \mathfrak {g}_-^*\simeq \mathfrak {g}_+^\perp \end{aligned}$$
(5.4)

and consider its orbit under the (co)adjoint action of \(G_-\), the Lie subgroup associated to \(\mathfrak {g}_-\) consisting of upper triangular matrices with unit determinant.

Lax matrix and Lax equations for the first two flows: As explained in Sect. 2.3, the AKS case corresponds to the particular case where \(\varphi \in G_-\) so that

$$\begin{aligned} L=\text {Ad}^{R*}_{\varphi } \cdot \Lambda =-R_-^*(\text {Ad}^*_{\varphi _-}\cdot \Lambda ), \end{aligned}$$

and the coadjoint orbit \(\mathcal{O}_{\Lambda }\) lies in \(\mathfrak {g}_-^*\). Using \(\langle \,~,~\rangle \) we can identify the adjoint and coadjoint actions. Also, we use it to identify the transpose \(A^*:\mathfrak {g}^*\rightarrow \mathfrak {g}^*\) of any linear map \(A:\mathfrak {g}\rightarrow \mathfrak {g}\) with the transpose of A with respect to \(\langle \,~,~\rangle \) defined on \(\mathfrak {g}\). Writing \((\xi ,X)=\langle Y,X\rangle \), this means that we have

$$\begin{aligned} (A^*(\xi ),X)=(\xi ,A(X))=\langle Y,A(X) \rangle =\langle A^*(Y),X \rangle . \end{aligned}$$

This allows us to work with

$$\begin{aligned} L=-R_-^*(\varphi _-\, \Lambda \,\varphi _-^{-1})=-R_-^*(\varphi \, \Lambda \,\varphi ^{-1}) , \end{aligned}$$
(5.5)

where we have dropped the redundant subscript on \(\varphi \) in the second equality with \(\varphi =\varphi _-\in G_-\). From the definitions \(\langle \,X\,,\,R_\pm Y\,\rangle =\langle \,R^*_\pm X\,,\,Y\,\rangle \) and \(\langle \,X\,,\,P_\pm Y\,\rangle =\langle \,\Pi _\mp X\,,\,Y\,\rangle \), where we denote by \(\Pi _\pm \) the projector onto \(\mathfrak {g}_\pm ^\perp \) along \(\mathfrak {g}_\mp ^\perp \), we find \(R^*_\pm =\pm \Pi _\mp \,\). Note that this is an example of non-skew-symmetric r-matrix since

$$\begin{aligned} R^*=\Pi _--\Pi _+\ne -R=P_--P_+. \end{aligned}$$
(5.6)
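
The adjoint relations behind \(R^*_\pm =\pm \Pi _\mp \) and the failure of skew-symmetry in (5.6) can be checked on explicit matrices. The following is a minimal sketch in plain Python; the function names (`P_plus`, `Pi_plus`, `R_star`, ...) are ours and not part of the source, and the projectors are implemented on all square matrices, which restricts consistently to the traceless case used in the text.

```python
import random

def tr_form(X, Y):
    """Bilinear form <X, Y> = Tr(XY) of (5.2)."""
    n = len(X)
    return sum(X[i][k] * Y[k][i] for i in range(n) for k in range(n))

def sub(X, Y):
    """Entrywise difference X - Y."""
    return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def P_plus(X):
    """Projector onto skew-symmetric matrices along upper triangular ones."""
    n = len(X)
    return [[X[i][j] if i > j else (-X[j][i] if i < j else 0)
             for j in range(n)] for i in range(n)]

def P_minus(X):
    """Projector onto upper triangular matrices along skew-symmetric ones."""
    return sub(X, P_plus(X))

def Pi_plus(X):
    """Projector onto symmetric matrices along strictly upper triangular ones."""
    n = len(X)
    return [[X[i][j] if i >= j else X[j][i] for j in range(n)]
            for i in range(n)]

def Pi_minus(X):
    """Projector onto strictly upper triangular matrices along symmetric ones."""
    return sub(X, Pi_plus(X))

def R(X):
    """R = P_+ - P_-."""
    return sub(P_plus(X), P_minus(X))

def R_star(X):
    """Candidate transpose R^* = Pi_- - Pi_+ from (5.6)."""
    return sub(Pi_minus(X), Pi_plus(X))

def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
X, Y = random_matrix(4, rng), random_matrix(4, rng)

# <X, P_+- Y> = <Pi_-+ X, Y> and <X, R Y> = <R^* X, Y>, exactly in integers
adjoint_ok = (tr_form(X, P_plus(Y)) == tr_form(Pi_minus(X), Y)
              and tr_form(X, P_minus(Y)) == tr_form(Pi_plus(X), Y)
              and tr_form(X, R(Y)) == tr_form(R_star(X), Y))

# R^* differs from -R already on the elementary matrix E_21
E21 = [[0, 0], [1, 0]]
not_skew = R_star(E21) != [[-x for x in row] for row in R(E21)]
print(adjoint_ok, not_skew)
```

Integer entries keep every comparison exact, so no numerical tolerance is needed.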

Now, \(\varphi \,\Lambda \,\varphi ^{-1}\) is of the form

$$\begin{aligned} \varphi \,\Lambda \,\varphi ^{-1}=\begin{pmatrix} ~a_1~ &{}\quad * &{}\quad * &{}\quad * &{}\quad \dots &{}\quad *\\ b_1 &{} a_2 &{} * &{} * &{}\dots &{} *\\ 0 &{} b_2 &{} a_3 &{} * &{}\dots &{} *\\ 0 &{} 0 &{} b_3 &{}\ddots &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} \ddots &{} *\\ 0 &{} 0 &{} 0 &{} \dots &{}b_{N} &{} ~a_{N+1}~\\ \end{pmatrix}. \end{aligned}$$
(5.7)

So, we find

$$\begin{aligned} L=\Pi _+(\varphi \,\Lambda \,\varphi ^{-1})=\begin{pmatrix} ~a_1~ &{}\quad b_1 &{}\quad 0 &{}\quad 0 &{}\quad \dots &{}\quad 0\\ b_1 &{} a_2 &{} b_2 &{} 0 &{}\dots &{} 0\\ 0 &{} b_2 &{} a_3 &{} b_3 &{}\dots &{} 0\\ 0 &{} 0 &{} b_3 &{}\ddots &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} \ddots &{} b_{N}\\ 0 &{} 0 &{} 0 &{} \dots &{}b_{N} &{} ~a_{N+1}~\\ \end{pmatrix}, \end{aligned}$$
(5.8)

i.e. it is symmetric tridiagonal. Using the Hamiltonian

$$\begin{aligned} H_1(L) = -\frac{1}{2} \,\textrm{Tr}\, L^2, \end{aligned}$$
(5.9)

we then find

$$\begin{aligned} R_+\nabla H_1(L)=P_+(-L)=\begin{pmatrix} 0 &{} b_1 &{} 0 &{} 0 &{}\dots &{} 0\\ ~-b_1~ &{} 0 &{} b_2 &{} 0 &{}\dots &{} 0\\ 0 &{} -b_2 &{} 0 &{} b_3 &{}\dots &{} 0\\ 0 &{} 0 &{} -b_3 &{}\ddots &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} \ddots &{} ~b_{N}~\\ 0 &{} 0 &{} 0 &{} \dots &{}-b_{N} &{} 0\\ \end{pmatrix}. \end{aligned}$$
(5.10)

A direct substitution in (3.4) with \(k=1\), i.e.

$$\begin{aligned} \partial _{t_1}L=[R_\pm \nabla H_1(L),L] \end{aligned}$$

reproduces the open finite Toda lattice equations in Flaschka’s coordinates \(a_n\), \(b_n\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_1}a_1=2b_1^2,\qquad \partial _{t_1}a_{N+1}=-2b_{N}^2,\\ \partial _{t_1}a_j=2(b_{j}^2-b_{j-1}^2),\qquad j=2,\dots ,N,\\ \partial _{t_1}b_j=b_j(a_{j+1}-a_j),\qquad j=1,\dots ,N. \end{array}\right. } \end{aligned}$$
(5.11)
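
The step from the Lax equation to (5.11) can be reproduced on a sample point. The sketch below (plain Python with exact rational arithmetic; the helper names and the sample values are ours, chosen arbitrarily with a traceless diagonal) builds L as in (5.8), forms \(P_+(-L)\) as in (5.10), and compares the commutator with the right-hand sides of (5.11) for \(N=3\).

```python
from fractions import Fraction as F

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

def tridiagonal(a, b):
    """Symmetric tridiagonal matrix with diagonal a and off-diagonal b, as in (5.8)."""
    n = len(a)
    L = [[F(0)] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = a[i]
    for i in range(n - 1):
        L[i][i + 1] = L[i + 1][i] = b[i]
    return L

def P_plus(X):
    """Projector onto skew-symmetric matrices along upper triangular ones."""
    n = len(X)
    return [[X[i][j] if i > j else (-X[j][i] if i < j else F(0))
             for j in range(n)] for i in range(n)]

def flaschka_rhs(a, b):
    """Right-hand sides of (5.11), with the convention b_0 = b_{N+1} = 0."""
    N = len(b)
    b2 = [F(0)] + [x * x for x in b] + [F(0)]
    da = [2 * (b2[i + 1] - b2[i]) for i in range(N + 1)]
    db = [b[i] * (a[i + 1] - a[i]) for i in range(N)]
    return da, db

# Arbitrary sample point for N = 3; the a_j sum to zero.
a = [F(1), F(-2, 3), F(1, 2), F(-5, 6)]
b = [F(2), F(1, 3), F(3, 4)]

L = tridiagonal(a, b)
M = P_plus([[-x for x in row] for row in L])   # R_+ grad H_1(L) = P_+(-L)
lax = commutator(M, L)                         # right-hand side of the Lax equation
da, db = flaschka_rhs(a, b)
first_flow_ok = lax == tridiagonal(da, db)
print(first_flow_ok)
```

The comparison with `tridiagonal(da, db)` also confirms that the commutator stays symmetric tridiagonal, so the flow preserves the shape (5.8).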

The next flow generated by the Hamiltonian

$$\begin{aligned} H_2(L) = -\frac{1}{3} \,\textrm{Tr}\, L^3, \end{aligned}$$
(5.12)

with gradient \(\nabla H_2(L)=-L^2\) yields \(R_+\nabla H_2(L)=P_+(-L^2)\) as

$$\begin{aligned} \begin{pmatrix} 0 &{} b_1(a_1+a_2) &{} b_1\,b_2 &{} 0 &{}\dots &{} 0\\ -b_1(a_1+a_2) &{} 0 &{} b_2(a_2+a_3) &{} b_2\,b_3 &{}\dots &{} 0\\ -b_1\,b_2 &{} -b_2(a_2+a_3) &{} 0 &{} b_3(a_3+a_4) &{}\dots &{} 0\\ 0 &{} -b_2\,b_3 &{} -b_3(a_3+a_4) &{}\ddots &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} \ddots &{} b_{N}(a_{N}+a_{N+1})\\ 0 &{} 0 &{} 0 &{} \dots &{}-b_{N}(a_{N}+a_{N+1}) &{} 0\\ \end{pmatrix}. \end{aligned}$$

The corresponding equations from (3.4) with \(k=2\) read

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_2}a_1=2\,b_1^2(a_1+a_2),\qquad \partial _{t_2}a_{N+1}=-2\,b_{N}^2(a_{N}+a_{N+1}),\\ \partial _{t_2}a_j=2b_j^2(a_j+a_{j+1})-2\,b_{j-1}^2(a_{j-1}+a_j),\qquad j=2,\dots ,N, \\ \partial _{t_2}b_1= b_1(a_2^2-a_1^2+b_2^2\,) ,\qquad \partial _{t_2}b_{N}=b_{N}(a_{N+1}^2-a_{N}^2-b_{N-1}^2),\\ \partial _{t_2}b_j=b_j(a_{j+1}^2-a_j^2+b_{j+1}^2-b_{j-1}^2),\qquad j=2,\dots ,N-1. \end{array}\right. } \end{aligned}$$
(5.13)
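
The second flow admits the same kind of check. The sketch below (plain Python, exact rationals; helper names and sample values are ours) compares \([P_+(-L^2),L]\) with the right-hand sides of (5.13), written uniformly with the conventions \(b_0=b_{N+1}=0\) so that the boundary cases are included.

```python
from fractions import Fraction as F

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

def tridiagonal(a, b):
    """Symmetric tridiagonal matrix with diagonal a and off-diagonal b."""
    n = len(a)
    L = [[F(0)] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = a[i]
    for i in range(n - 1):
        L[i][i + 1] = L[i + 1][i] = b[i]
    return L

def P_plus(X):
    """Projector onto skew-symmetric matrices along upper triangular ones."""
    n = len(X)
    return [[X[i][j] if i > j else (-X[j][i] if i < j else F(0))
             for j in range(n)] for i in range(n)]

def second_flow_rhs(a, b):
    """Right-hand sides of (5.13), using b_0 = b_{N+1} = 0."""
    N = len(b)
    ap = [F(0)] + list(a) + [F(0)]   # ap[k] = a_k for k = 1..N+1
    bp = [F(0)] + list(b) + [F(0)]   # bp[k] = b_k for k = 1..N
    da = [2 * bp[j] ** 2 * (ap[j] + ap[j + 1])
          - 2 * bp[j - 1] ** 2 * (ap[j - 1] + ap[j])
          for j in range(1, N + 2)]
    db = [bp[j] * (ap[j + 1] ** 2 - ap[j] ** 2
                   + bp[j + 1] ** 2 - bp[j - 1] ** 2)
          for j in range(1, N + 1)]
    return da, db

# Arbitrary sample point for N = 3; the a_j sum to zero.
a = [F(1), F(-2, 3), F(1, 2), F(-5, 6)]
b = [F(2), F(1, 3), F(3, 4)]

L = tridiagonal(a, b)
L2 = matmul(L, L)
M2 = P_plus([[-x for x in row] for row in L2])  # R_+ grad H_2(L) = P_+(-L^2)
da, db = second_flow_rhs(a, b)
second_flow_ok = commutator(M2, L) == tridiagonal(da, db)
print(second_flow_ok)
```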

Lagrangian description: We need to choose a convenient parametrisation of \(\varphi \) since this is the essential ingredient in the Lagrangians \(\mathscr {L}_k\). We choose

$$\begin{aligned} \varphi =U\,Y , \end{aligned}$$
(5.14)

where \(Y=\textrm{diag}(y_1,\dots ,y_{N+1})\) is the diagonal matrix of diagonal elements of \(\varphi \) (i.e. \(y_i=\varphi _{ii}\)) and \(U=\varphi \,Y^{-1}\) is the upper triangular matrix with 1 on the diagonal and arbitrary elements \(u_{ij}\), \(1\! \le \! i\!<j\!\le \!N+1\). Since \(\varphi \) has nonzero determinant, \(y_i\ne 0\), \(i=1,\dots ,N+1\), and with this parametrisation, we find L as in (5.8) with

$$\begin{aligned} {\left\{ \begin{array}{ll} a_1=\dfrac{y_{2}}{y_1}\,u_{12},\qquad a_{N+1}=-\dfrac{y_{N+1}}{y_{N}}\,u_{N,N+1}, \\ a_i=\dfrac{y_{i+1}}{y_i}\,u_{i,i+1}-\dfrac{y_{i}}{y_{i-1}}\,u_{i-1,i} ,\qquad i=2,\dots ,N,\\ b_i=\dfrac{y_{i+1}}{y_i},\qquad i=1,\dots ,N . \end{array}\right. } \end{aligned}$$
(5.15)
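
The parametrisation (5.15) can be confirmed by conjugating \(\Lambda \) explicitly. The following sketch (plain Python, exact rationals) uses arbitrary sample values for \(y_i\) and \(u_{ij}\) that are ours; it does not impose \(\det \varphi =1\), since that condition plays no role in the entries being compared.

```python
from fractions import Fraction as F

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(A):
    """Gauss-Jordan inverse over the rationals (A is assumed invertible)."""
    n = len(A)
    M = [list(row) + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

N = 3
n = N + 1
y = [F(2), F(-1, 3), F(5), F(3, 7)]            # nonzero y_1..y_{N+1}
# Unit upper triangular U with arbitrary rational entries above the diagonal
U = [[F(1) if i == j else (F(i + 2 * j + 1, j + 2) if i < j else F(0))
      for j in range(n)] for i in range(n)]
Y = [[y[i] if i == j else F(0) for j in range(n)] for i in range(n)]
Lam = [[F(1) if abs(i - j) == 1 else F(0) for j in range(n)] for i in range(n)]

phi = matmul(U, Y)                              # phi = U Y as in (5.14)
conj = matmul(matmul(phi, Lam), inverse(phi))   # phi Lambda phi^{-1}, cf. (5.7)

# Expected entries from (5.15)
b = [y[i + 1] / y[i] for i in range(N)]
bu = [b[i] * U[i][i + 1] for i in range(N)]
a = [(bu[i] if i < N else 0) - (bu[i - 1] if i > 0 else 0) for i in range(n)]

diag_ok = all(conj[i][i] == a[i] for i in range(n))
subdiag_ok = all(conj[i + 1][i] == b[i] for i in range(N))
hessenberg_ok = all(conj[i][j] == 0
                    for i in range(n) for j in range(n) if i > j + 1)
print(diag_ok, subdiag_ok, hessenberg_ok)
```

Only the diagonal and the first subdiagonal of \(\varphi \,\Lambda \,\varphi ^{-1}\) enter L after applying \(\Pi _+\), which is why the entries above the diagonal are not compared.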

Note that \(\displaystyle \sum \nolimits _{j=1}^{N+1}a_j=0\), so we have 2N independent variables on the coadjoint orbit \(\mathcal{O}_\Lambda \). We compute the kinetic part of \(\mathscr {L}_k\) defined in (3.2) as

$$\begin{aligned} \begin{aligned} K_k&= \langle -R^*_-(\varphi \,\Lambda \,\varphi ^{-1}),\,\partial _{t_k}\varphi \cdot _R \varphi ^{-1}\rangle = -\langle \varphi \,\Lambda \,\varphi ^{-1},\,R_-(\partial _{t_k}\varphi \cdot _R \varphi ^{-1})\rangle \\&=-\langle \varphi \,\Lambda \,\varphi ^{-1},\,\partial _{t_k}\varphi \, \varphi ^{-1}\rangle =-\textrm{Tr}\left( \Lambda \,\varphi ^{-1}\,\partial _{t_k}\varphi \right) , \end{aligned} \end{aligned}$$
(5.16)

where in the third step we have used the morphism property of \(R_-\)

$$\begin{aligned} R_-(\partial _{t_k}\varphi \cdot _R \varphi ^{-1})=\partial _{t_k}\varphi _- \, \varphi _-^{-1} \end{aligned}$$
(5.17)

and \(\varphi _-=\varphi \in G_-\). It remains to express it in terms of our chosen coordinates to get

$$\begin{aligned} K_k= & {} -\textrm{Tr}\left( \Lambda \, Y^{-1}\,U^{-1}\,\partial _{t_k}(UY) \right) =-\textrm{Tr}\left( Y\,\Lambda \, Y^{-1}\,U^{-1}\,\partial _{t_k}U \right) \nonumber \\ {}= & {} - \sum _{j=1}^{N}\frac{y_{j+1}}{y_j}\,\partial _{t_k}u_{j,j+1}. \end{aligned}$$
(5.18)

From these results, it becomes apparent that the convenient coordinates are \(b_i\) as given in (5.15) and \(u_{i}\equiv u_{i,i+1}\), \(i=1,\dots ,N\). The first two Lagrangians involve the Hamiltonians (5.9) and (5.12) respectively, and can now be expressed in the \(u_i,b_i\) coordinates as follows

$$\begin{aligned} \mathscr {L}_1= K_1 - H_1&= -\sum _{j=1}^{N}b_j\,\partial _{t_1}u_{j}+ \frac{1}{2}\sum _{j=2}^{N}(b_j\,u_j-b_{j-1}\,u_{j-1})^2\nonumber \\ {}&\quad +\sum _{j=1}^{N}b_j^2+ \frac{1}{2}b_1^2\,u_1^2+ \frac{1}{2}b_{N}^2\,u_{N}^2\,, \nonumber \\ \mathscr {L}_2=K_2 - H_2&= -\sum _{j=1}^{N}b_j\,\partial _{t_2}u_{j}+\frac{1}{3}\sum _{j=2}^{N}(b_j\,u_j-b_{j-1}\,u_{j-1})^3\nonumber \\ {}&\quad +\frac{1}{3}(b_1\,u_1)^3+\frac{1}{3}(b_{N}\,u_{N})^3\nonumber \\&\quad +\sum _{j=2}^{N-1}b_j^2(b_{j+1}\,u_{j+1}-b_{j-1}\,u_{j-1})\nonumber \\&\quad +b_1^2(b_2\,u_2)-b_{N}^2(b_{N-1}\,u_{N-1}) \,. \end{aligned}$$

The variation of \(\mathscr {L}_1\) reads

$$\begin{aligned} \delta \mathscr {L}_1&= -\sum _{j=1}^{N} \partial _{t_1}u_{j}\,\delta b_j+\sum _{j=1}^{N} \partial _{t_1}b_j\,\delta u_{j}- \partial _{t_1} \sum _{j=1}^{N}b_j\,\delta u_{j} \\&\quad + \sum _{j=2}^{N}(b_ju_j-b_{j-1}u_{j-1})(u_j\,\delta b_j+b_j\,\delta u_j)\\ {}&\quad - \sum _{j=1}^{N-1}(b_{j+1}u_{j+1}-b_{j}u_{j})(u_j\,\delta b_j+b_j\,\delta u_j)\\&\quad +2\sum _{j=1}^{N}b_j\,\delta b_j+ b_1u_1^2\,\delta b_1+b_1^2u_1\,\delta u_1+ b_{N}u_{N}^2\,\delta b_{N} + b_{N}^2u_{N}\,\delta u_{N} \,, \end{aligned}$$

and gives the following Euler–Lagrange equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_1}u_{1}=u_1^2\,b_1-u_1\,(b_{2}\,u_{2}-b_{1}\,u_{1})+2\,b_1,\\ \partial _{t_1}u_{N}=u_N\,(b_N\,u_N-b_{N-1}\,u_{N-1})+u_N^2\,b_N+2\,b_N,\\ \partial _{t_1}u_{j}=u_j(b_j\,u_j-b_{j-1}\,u_{j-1})-u_j(b_{j+1}\,u_{j+1}-b_{j}\,u_{j})+2\,b_j,\\ \partial _{t_1}b_1=b_1(b_2u_2-b_{1}u_{1})-b_1^2\,u_1 ,~~\\ \partial _{t_1}b_{N}=-b_{N}(b_{N}u_{N}-b_{N-1}u_{N-1})-b_{N}^2\,u_{N},\\ \partial _{t_1}b_j=b_j(b_{j+1}u_{j+1}-b_{j}u_{j})-b_j(b_ju_j-b_{j-1}u_{j-1}), \end{array}\right. } \end{aligned}$$
(5.19)

for \(j=2,\dots ,N-1\). It is easy to see that these equations give exactly (5.11) using the identification (see (5.15))

$$\begin{aligned} {\left\{ \begin{array}{ll} a_1=b_1u_1,\qquad a_{N+1}=-b_{N}u_{N},\\ a_j=b_ju_j-b_{j-1}u_{j-1},\qquad j=2,\dots ,N. \end{array}\right. } \end{aligned}$$
(5.20)
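
To spell out this check: with \(a_j\) as in (5.20), the six cases of (5.19) can be rewritten (this uniform rewriting is ours) as \(\partial _{t_1}u_j=u_j(a_j-a_{j+1})+2\,b_j\) and \(\partial _{t_1}b_j=b_j(a_{j+1}-a_j)\) for \(j=1,\dots ,N\). The second is already the last equation of (5.11), and the first gives

$$\begin{aligned} \partial _{t_1}(b_j\,u_j)=u_j\,\partial _{t_1}b_j+b_j\,\partial _{t_1}u_j=2\,b_j^2 \quad \Rightarrow \quad \partial _{t_1}a_j=\partial _{t_1}(b_j\,u_j-b_{j-1}\,u_{j-1})=2\,(b_j^2-b_{j-1}^2), \end{aligned}$$

where the conventions \(b_0=b_{N+1}=0\) and \(b_0u_0=b_{N+1}u_{N+1}=0\) cover the boundary cases \(j=1\) and \(j=N+1\).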

This provides a very explicit check that our Lagrangians produce the corresponding Lax equations, in coordinates naturally dictated by the coadjoint orbit construction of the kinetic term, here \(u_j\), \(b_j\). As recalled in Sect. 3.2, the kinetic part of a Lagrangian provides the (pullback of the) symplectic form of the model via the Cartan form \(\theta _R\). Here, we have (see the total derivative term in \(\delta \mathscr {L}_1\))

$$\begin{aligned} \theta _R=\sum _{j=1}^{N}b_j\,\delta u_{j}~\Rightarrow ~\Omega _R=\sum _{j=1}^{N}\delta b_j\wedge \delta u_{j}. \end{aligned}$$
(5.21)

This shows that the coordinates \(u_j\), \(b_j\) are canonical. In the present case, choosing \(b_j\), \(a_j\) for \(j=1,\dots ,N\), as the coordinates on the coadjoint orbit \(\mathcal{O}_\Lambda \), we can also express the Kostant–Kirillov form explicitly using the formula

$$\begin{aligned} u_j=\frac{1}{b_j}\sum _{\ell =1}^j a_{\ell } \end{aligned}$$

to get

$$\begin{aligned} \omega _R=\sum _{j=1}^{N}\frac{1}{b_j}\sum _{\ell =1}^j\delta b_j\wedge \delta a_{\ell }. \end{aligned}$$
(5.22)

It is instructive to see how the usual Hamiltonian formulation of the open Toda chain in canonical coordinates \(q_i,p_i\) is derived from our Lagrangian formulation. From the symplectic form (5.21), we deduce the following (canonical) Poisson brackets (see Footnote 8)

$$\begin{aligned} \{b_j,\,u_k\}=\delta _{jk},\qquad \{b_j,\,b_k\}=0=\{u_j,\,u_k\},\qquad j,k=1,\dots ,N. \end{aligned}$$
(5.23)

The Legendre transformation

$$\begin{aligned} \frac{\partial \mathscr {L}_1}{\partial (\partial _{t_1} u_j)}=b_j \end{aligned}$$

reproduces, as it should, the Hamiltonian

$$\begin{aligned} \sum _{j=1}^{N}\frac{\partial \mathscr {L}_1}{\partial (\partial _{t_1} u_j)}\,\partial _{t_1}u_{j}-\mathscr {L}_1= & {} -\,\frac{1}{2}\sum _{j=2}^{N}(b_j\,u_j-b_{j-1}\,u_{j-1})^2-\sum _{j=1}^{N}b_j^2\\ {}{} & {} - \frac{1}{2}b_1^2\,u_1^2- \frac{1}{2}b_{N}^2\,u_{N}^2=H_1(L) . \end{aligned}$$

The matrix L for \(\mathfrak {sl}(N+1)\) in canonical coordinates \((q_i,\, p_i)\) is given by

$$\begin{aligned} L = \begin{pmatrix} p_1 &{}\quad \text {e}^{q_1-q_2} &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad \dots \\ \text {e}^{q_1-q_2} &{}\quad p_2 &{}\quad \text {e}^{q_2-q_3} &{}\quad 0 &{}\quad 0 &{}\quad \dots \\ 0 &{}\quad \text {e}^{q_2-q_3} &{}\quad p_3 &{}\quad \text {e}^{q_3-q_4} &{}\quad 0 &{}\quad \dots \\ 0 &{}\quad 0 &{}\quad \ddots &{}\quad \ddots &{}\quad \ddots &{}\quad &{} \\ \vdots &{}\quad &{}\quad &{}\quad \text {e}^{q_{N-1}-q_{N}} &{}\quad p_N &{}\quad \text {e}^{q_{N}-q_{N+1}}\!\!\!\! \\ 0 &{}\quad &{}\quad &{}\quad &{}\quad \text {e}^{q_{N}-q_{N+1}} &{}\quad p_{N+1} \end{pmatrix} , \end{aligned}$$
(5.24)

and by comparison with (5.8), we set the change of variables

$$\begin{aligned} {\left\{ \begin{array}{ll} \,q_j=\displaystyle \sum _{k=j}^{N}\ln b_k,\qquad j=1,\dots ,N,\\ \,p_j=b_j\,u_j-b_{j-1}\,u_{j-1},\qquad j=2,\dots ,N,\\ \,p_1=b_1\,u_1,\qquad p_{N+1}=-b_{N}\,u_{N}. \end{array}\right. } \end{aligned}$$
(5.25)

From (5.23), we deduce by direct calculation

$$\begin{aligned} \begin{aligned}&\{q_j,\,p_k\}=\delta _{jk},~~\{q_j,\,q_k\}=0=\{p_j,\,p_k\},~~j,k=1,\dots ,N, \\&\{q_j,\,p_{N+1}\}=-1,~~\{p_j,\,p_{N+1}\}=0,~~j=1,\dots ,N. \end{aligned} \end{aligned}$$
(5.26)
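
The brackets (5.26) follow from (5.23) and the chain rule, which can be carried out mechanically. In the sketch below (plain Python, exact rationals) the gradients of \(q_j\) and \(p_j\) with respect to \((b,u)\) are coded by hand from (5.25); the function names and the sample point are ours.

```python
from fractions import Fraction as F

def bracket(f, g):
    """{f, g} from gradients f = (df/db, df/du) and g = (dg/db, dg/du),
    normalised so that {b_j, u_k} = delta_jk as in (5.23)."""
    (fb, fu), (gb, gu) = f, g
    return (sum(x * y for x, y in zip(fb, gu))
            - sum(x * y for x, y in zip(fu, gb)))

def grad_q(j, u, b):
    """Gradient of q_j = sum_{k=j}^N ln b_k, for j = 1..N."""
    N = len(b)
    dq_db = [F(1) / b[k - 1] if k >= j else F(0) for k in range(1, N + 1)]
    return dq_db, [F(0)] * N

def grad_bu(k, u, b):
    """Gradient of b_k u_k, read as the zero function for k = 0 and k = N+1."""
    N = len(b)
    gb, gu = [F(0)] * N, [F(0)] * N
    if 1 <= k <= N:
        gb[k - 1], gu[k - 1] = u[k - 1], b[k - 1]
    return gb, gu

def grad_p(j, u, b):
    """Gradient of p_j = b_j u_j - b_{j-1} u_{j-1}, for j = 1..N+1."""
    (hb, hu), (lb, lu) = grad_bu(j, u, b), grad_bu(j - 1, u, b)
    return ([x - y for x, y in zip(hb, lb)],
            [x - y for x, y in zip(hu, lu)])

N = 3
u = [F(1, 2), F(-3), F(2, 5)]
b = [F(2), F(1, 3), F(7, 4)]        # positive, so that ln b_j is defined

q = {j: grad_q(j, u, b) for j in range(1, N + 1)}
p = {j: grad_p(j, u, b) for j in range(1, N + 2)}

# Rows j = 1..N, columns k = 1..N+1 of {q_j, p_k}
qp = [[bracket(q[j], p[k]) for k in range(1, N + 2)] for j in range(1, N + 1)]
pp_zero = all(bracket(p[j], p[k]) == 0 for j in p for k in p)
qq_zero = all(bracket(q[j], q[k]) == 0 for j in q for k in q)
# {q_j, C} with C = p_1 + ... + p_{N+1}
q_with_C = [sum(row) for row in qp]
print(qp, pp_zero, qq_zero, q_with_C)
```

The last column of `qp` carries the \(-1\) entries of (5.26), and the vanishing row sums show that \(C=\sum _{j=1}^{N+1}p_j\) commutes with every \(q_j\).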

Note that \(p_{N+1}\) is redundant for the description of the dynamics since we only need the map \((u_j,b_j)\mapsto (q_j,p_j)\) for \(j=1,\dots ,N\). This is captured by the fact that the previous relations imply that \(C= \sum _{j=1}^{N+1} p_j\) is a Casimir on the \((2N+1)\)-dimensional space with coordinates \((q_1,\dots ,q_{N},p_1,\dots ,p_{N+1})\), so we can work on the 2N-dimensional level set \(C=0\). The coordinate \(p_{N+1}\) is still useful to write the Hamiltonian in the compact familiar form as

$$\begin{aligned} H_1= -\frac{1}{2}\sum _{j=1}^{N+1}p_j^2-\sum _{j=1}^{N-1}\text {e}^{2(q_j-q_{j+1})}-\text {e}^{2q_{N}}. \end{aligned}$$
(5.27)

Hamilton’s equations \(\partial _{t_1}q_j=\{q_j,H_1\}\), \(\partial _{t_1}p_j=\{p_j,H_1\}\) yield

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_1}p_1=2\text {e}^{2(q_{1}-q_2)},\qquad \partial _{t_1}p_{N+1}=-2\text {e}^{2q_{N}},\\ \partial _{t_1}p_j=2\left( \text {e}^{2(q_j-q_{j+1})}-\text {e}^{2(q_{j-1}-q_j)}\right) ,\qquad j=2,\dots ,N-1,\\ \partial _{t_1}p_N=2\left( \text {e}^{2q_{N}}-\text {e}^{2(q_{N-1}-q_N)}\right) ,\\ \partial _{t_1}q_j=p_{N+1}-p_j,~~j=1,\dots ,N. \end{array}\right. } \end{aligned}$$
(5.28)

These can be seen to be equivalent to (5.11), thus completing the Hamiltonian description of the first flow for the open Toda chain, from our Lagrangian formulation. The same analysis can be performed with \(\mathscr {L}_2\) although the calculations are longer. We simply record here the Euler–Lagrange equations obtained from \(\delta \mathscr {L}_2\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_2}u_1= u_1\left( (b_{1}\,u_{1})^2-(b_{2}\,u_{2}-b_{1}\,u_{1})^2-b_{2}^2\,\right) +2\,b_1\,b_{2}\,u_{2} ,\\ \partial _{t_2}u_{N}=u_N\left( (b_{N}\,u_{N}-b_{N-1}\,u_{N-1})^2-(b_{N}\,u_{N})^2+b_{N-1}^2\,\right) -2\,b_{N}\,b_{N-1}\,u_{N-1},\\ \partial _{t_2}u_j=u_j\left( (b_{j}\,u_{j}-b_{j-1}\,u_{j-1})^2-(b_{j+1}\,u_{j+1}-b_{j}\,u_{j})^2\right) +u_j\,(b_{j-1}^2-b_{j+1}^2) \\ \hspace{10ex} +2\,b_j\,(b_{j+1}\,u_{j+1}-b_{j-1}\,u_{j-1}),\\ \partial _{t_2}b_1= b_1\left( (b_{2}\,u_{2}-b_{1}\,u_{1})^2-(b_{1}\,u_{1})^2+b_{2}^2\,\right) ,\\ \partial _{t_2}b_{N}= b_N\left( (b_{N}\,u_{N})^2-(b_{N}\,u_{N}-b_{N-1}\,u_{N-1})^2-b_{N-1}^2\,\right) ,\\ \partial _{t_2}b_j= b_j\left( (b_{j+1}\,u_{j+1}-b_{j}\,u_{j})^2-(b_{j}\,u_{j}-b_{j-1}\,u_{j-1})^2\right) -b_j\,(b_{j-1}^2-b_{j+1}^2) , \end{array}\right. } \end{aligned}$$
(5.29)

for \(j=2,\dots ,N-1\). We leave it to the reader to check that these correctly reproduce (5.13) again using (5.20). To conclude this example we establish the closure relation for the first two flows, i.e.

$$\begin{aligned} \partial _{t_2}\mathscr {L}_1-\partial _{t_1}\mathscr {L}_2=0~~\text {on shell}. \end{aligned}$$

We know from our general results that this must hold, so this is simply an explicit check. We know that the kinetic and potential contributions give zero separately, so we split the calculations accordingly. For the potential terms, it is more expedient to use the \(a_j,b_j\) coordinates (see Footnote 9) and equations (5.11) and (5.13)

$$\begin{aligned} \partial _{t_2}H_1-\partial _{t_1}H_2&= \partial _{t_1}\left( \sum _{j=1}^{N+1}\frac{a_j^3}{3}+\sum _{j=1}^{N}b_j^2(a_j+a_{j+1}) \right) -\partial _{t_2}\left( \sum _{j=1}^{N+1}\frac{a_j^2}{2}+\sum _{j=1}^{N}b_j^2 \right) \\&= \sum _{j=1}^{N+1} 2 a_j^2(b_j^2-b_{j-1}^2) + \sum _{j=1}^{N}2b_j^2(a_{j+1}^2-a_j^2+b_{j+1}^2-b_{j-1}^2)\\&\quad -\sum _{j=1}^{N+1} 2 a_j(b_j^2(a_j+a_{j+1})-b_{j-1}^2(a_{j-1}+a_j))\\&\quad -\sum _{j=1}^{N}2b_j^2(a_{j+1}^2-a_j^2+b_{j+1}^2-b_{j-1}^2)\\&=\sum _{j=1}^{N+1} 2 (a_ja_{j-1}b_{j-1}^2-a_ja_{j+1}b_j^2 )\\&=0\,, \end{aligned}$$

where in the last step we recognise a telescopic sum. For the kinetic terms, we also use the \(a_j\), \(b_j\) coordinates wherever possible to expedite the calculations

$$\begin{aligned} \partial _{t_1}K_2 - \partial _{t_2}K_1&= -\sum _{j=1}^N(\partial _{t_1} (b_j\partial _{t_2}u_j)-\partial _{t_2}(b_j\partial _{t_1}u_j )) \\&=\sum _{j=1}^N(\partial _{t_1}((a_{j+1}^2-a_j^2+b_{j+1}^2-b_{j-1}^2)u_jb_j-2b_j^2(a_j+a_{j+1}) )\\ {}&\quad -\partial _{t_2}((a_{j+1}-a_j)u_jb_j-2b_j^2 ))\\&=\sum _{j=1}^N((\partial _{t_1}(a_{j+1}^2-a_j^2+b_{j+1}^2-b_{j-1}^2)- \partial _{t_2}(a_{j+1}-a_j)) u_jb_j \\ {}&\quad + 2b_j^2(b_{j+1}^2-b_{j-1}^2))\\&=0 \,, \end{aligned}$$

where in the last step the first term gives zero for each j upon using the equations of motion and the remaining terms form a telescopic sum adding up to zero.
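
The potential part of this check can also be run on a sample point. The sketch below (plain Python, exact rationals; helper names and sample values are ours) evaluates \(\partial _{t_2}H_1\) and \(\partial _{t_1}H_2\) by the chain rule, using the flows (5.11) and (5.13) and hand-coded gradients of \(H_1\) and \(H_2\) in the \(a_j,b_j\) coordinates.

```python
from fractions import Fraction as F

def pad(a, b):
    """1-based copies with the conventions a_0 = a_{N+2} = 0, b_0 = b_{N+1} = 0."""
    return [F(0)] + list(a) + [F(0)], [F(0)] + list(b) + [F(0)]

def flow1(a, b):
    """Right-hand sides of (5.11)."""
    ap, bp = pad(a, b)
    N = len(b)
    da = [2 * (bp[j] ** 2 - bp[j - 1] ** 2) for j in range(1, N + 2)]
    db = [bp[j] * (ap[j + 1] - ap[j]) for j in range(1, N + 1)]
    return da, db

def flow2(a, b):
    """Right-hand sides of (5.13)."""
    ap, bp = pad(a, b)
    N = len(b)
    da = [2 * bp[j] ** 2 * (ap[j] + ap[j + 1])
          - 2 * bp[j - 1] ** 2 * (ap[j - 1] + ap[j])
          for j in range(1, N + 2)]
    db = [bp[j] * (ap[j + 1] ** 2 - ap[j] ** 2
                   + bp[j + 1] ** 2 - bp[j - 1] ** 2)
          for j in range(1, N + 1)]
    return da, db

def grad_H1(a, b):
    """Gradient of H_1 = -(sum_j a_j^2 / 2 + sum_j b_j^2)."""
    return [-x for x in a], [-2 * x for x in b]

def grad_H2(a, b):
    """Gradient of H_2 = -(sum_j a_j^3 / 3 + sum_j b_j^2 (a_j + a_{j+1}))."""
    ap, bp = pad(a, b)
    N = len(b)
    dHa = [-(ap[j] ** 2 + bp[j] ** 2 + bp[j - 1] ** 2) for j in range(1, N + 2)]
    dHb = [-2 * bp[j] * (ap[j] + ap[j + 1]) for j in range(1, N + 1)]
    return dHa, dHb

def along(grad, flow):
    """Chain rule: derivative of a function along a flow."""
    (ga, gb), (da, db) = grad, flow
    return (sum(x * y for x, y in zip(ga, da))
            + sum(x * y for x, y in zip(gb, db)))

# Arbitrary sample point for N = 3.
a = [F(1), F(-2, 3), F(1, 2), F(-5, 6)]
b = [F(2), F(1, 3), F(3, 4)]

dt2_H1 = along(grad_H1(a, b), flow2(a, b))
dt1_H2 = along(grad_H2(a, b), flow1(a, b))
print(dt2_H1, dt1_H2, dt2_H1 - dt1_H2)
```

Both derivatives vanish separately at the sample point, consistent with \(H_1\) and \(H_2\) being in involution.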

A Lagrangian multiform for the Toda chain was first constructed in [7] using variational symmetries of a given starting Lagrangian, which would be \(\mathscr {L}_1\) in our context, to construct higher Lagrangian coefficients which constitute a multiform when assembled together. The infinite Toda chain was studied more recently in [17] to illustrate the newly introduced theory of Lagrangian multiforms over semi-discrete multi-time. In [7], the analogues of our \(\mathscr {L}_2\) and \(\mathscr {L}_3\) were constructed. The Noether integrals \(J_1\) and \(J_2\) (equations (10.11) and (10.12) in [7]) which constitute the potential part of their Lagrangians are nothing but \(H_2(L)\) and \(H_3(L)\) with L parametrised as in (5.24), up to an irrelevant change of convention \(\text {e}^{q_i-q_{i+1}}\rightarrow \text {e}^{q_{i+1}-q_i}\) and setting \(q_i=x_i\) and \(p_i=\dot{x}_i\). The kinetic part of the higher Lagrangians in [7] involves the so-called alien derivatives which are symptomatic of constructing a multiform from a starting Lagrangian and building compatible higher Lagrangian coefficients. Our construction avoids the problem of alien derivatives altogether, putting all the Lagrangian coefficients on an equal footing. This was also achieved previously in the context of field theories in [12, 14].

6 Open Toda chain with a skew-symmetric \(\varvec{r}\)-matrix

We now present the same model for the same algebra \(\mathfrak {g}=\mathfrak {sl}(N+1)\) but endowed with a different Lie dialgebra structure. This is based on the Cartan decomposition of \(\mathfrak {g}\) and leads to a skew-symmetric r-matrix. One attractive feature of this setup, which we only illustrate for \(\mathfrak {sl}(N+1)\), is that it allows for a generalisation to any finite-dimensional semi-simple Lie algebra, see [27, Chapter 4].

Algebraic setup: Consider the decomposition

$$\begin{aligned} \mathfrak {g} = \mathfrak {n}_+ \oplus \mathfrak {h} \oplus \mathfrak {n}_- , \end{aligned}$$
(6.1)

where \(\mathfrak {h}\) is the Cartan subalgebra of diagonal (traceless) matrices and \(\mathfrak {n}_\pm \) the nilpotent subalgebra of strictly upper/lower triangular matrices. Let \(P_\pm \), \(P_0\) be the projectors onto \(\mathfrak {n}_\pm \) and \(\mathfrak {h}\) respectively, relative to the decomposition (6.1) and set \(R=P_+-P_-\). It can be verified that R satisfies the mCYBE. Here \(R_{\pm }=\pm (P_\pm +P_0/2)\) and

$$\begin{aligned} \mathfrak {g}_{\pm } = \text {Im}(R_{\pm }) = \mathfrak {b}_{\pm } = \mathfrak {h} \oplus \mathfrak {n}_{\pm }. \end{aligned}$$
(6.2)

We have the following action of \(R_\pm \) on the elements \(y \in \mathfrak {h}\) and \(w_{\pm } \in \mathfrak {n}_{\pm }\),

$$\begin{aligned} R_{\pm } (y) = \pm \frac{1}{2}\, y ,\qquad R_{\pm } (w_{\pm }) = \pm \, w_{\pm } ,\qquad R_{\pm } (w_{\mp }) = 0 . \end{aligned}$$
(6.3)

Taking the same bilinear form as in (5.2), i.e. \(\langle X, Y\rangle ={{\,\textrm{Tr}\,}}(XY)\), we see that

$$\begin{aligned} P_\pm ^*=P_\mp ,~~P_0^*=P_0~~\text {so that}~~ R^*=-R. \end{aligned}$$
(6.4)

Thus, we have a skew-symmetric r-matrix here. For the related Lie groups, we have the following factorisations close to the identity,

$$\begin{aligned} \varphi = \varphi _{+}\,\varphi _-^{-1} , \qquad \varphi _{\pm } = W_{\pm }\,Y^{\pm 1} , \qquad Y\in \text {exp}(\mathfrak {h}) ,\,\, W_{\pm }\in \text {exp}(\mathfrak {n}_{\pm }) . \end{aligned}$$
(6.5)
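The two properties of R used above, skew-symmetry (6.4) and the mCYBE, can be spot-checked numerically. The following is a minimal sketch in plain Python for \(\mathfrak {sl}(3)\) with random sample matrices. The helper names are illustrative only, and the normalisation of the mCYBE written in the comment, \([RX,RY]-R([RX,Y]+[X,RY])=-[X,Y]\), is an assumption about the convention in force.

```python
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, c=1.0):
    """Return x + c*y for square matrices x and y."""
    n = len(x)
    return [[x[i][j] + c * y[i][j] for j in range(n)] for i in range(n)]

def comm(x, y):
    return add(matmul(x, y), matmul(y, x), -1.0)

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def r_matrix(x):
    """R = P_+ - P_-: +1 above the diagonal, -1 below it, 0 on it."""
    n = len(x)
    return [[((i < j) - (i > j)) * x[i][j] for j in range(n)] for i in range(n)]

def random_traceless(n, rng):
    x = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    shift = trace(x) / n
    for i in range(n):
        x[i][i] -= shift
    return x

rng = random.Random(0)
X, Y = random_traceless(3, rng), random_traceless(3, rng)
RX, RY = r_matrix(X), r_matrix(Y)

# Skew-symmetry (6.4): Tr(R(X) Y) + Tr(X R(Y)) should vanish.
skew_defect = abs(trace(matmul(RX, Y)) + trace(matmul(X, RY)))

# mCYBE in the assumed normalisation:
# [RX, RY] - R([RX, Y] + [X, RY]) + [X, Y] should vanish.
inner = add(comm(RX, Y), comm(X, RY))
lhs = add(add(comm(RX, RY), r_matrix(inner), -1.0), comm(X, Y))
mcybe_defect = max(abs(v) for row in lhs for v in row)

print(skew_defect, mcybe_defect)
```

Both defects should be at the level of floating-point rounding. A random test of this kind is a sanity check, not a proof.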

Lax matrix and Lax equations for the first two flows: For \(\Lambda \in \mathfrak {g}^*\simeq \mathfrak {g}\), the expression of L as a coadjoint orbit of \(\Lambda \) is given by

$$\begin{aligned} \begin{aligned} L&= \text {Ad}^{R*}_{\varphi } \cdot \Lambda = R^{^*}_+( W_+\,Y\,\Lambda \,Y^{-1}\,W_+^{-1}) - R^{^*}_-(W_-\,Y^{-1} \,\Lambda \,Y\,W_-^{-1}) . \end{aligned} \end{aligned}$$
(6.6)

We choose \(\Lambda \) as in (5.4), emphasising that in this case it is an element of the full \(\mathfrak {g}^* \simeq \mathfrak {g}\), and \(Y \in \text {exp}(\mathfrak {h})\), \(W_{\pm }\in \text {exp}(\mathfrak {n}_{\pm })\) given by

$$\begin{aligned}&\hspace{20ex}Y = \text {diag} \left( \eta _1 \,, \eta _2 \dots , \eta _{N+1} \right) \,, \qquad \det Y = 1\,, \end{aligned}$$
(6.7)
$$\begin{aligned}&W_- \!=\! {\small \begin{pmatrix} 1 &{} 0 &{} 0 &{}\dots &{} 0\\ \omega ^-_{2,1} &{} 1 &{} 0 &{}\dots &{} 0\\ \omega ^-_{3,1} &{} \omega ^-_{3,2} &{} 1 &{}\dots &{} 0\\ \vdots &{} \ddots &{} \ddots &{} \ddots &{} ~~0~~\\ ~\omega ^-_{N,1}~ &{} \omega ^-_{N,2} &{} \dots &{}\omega ^-_{N,N-1} &{} 1\\ \end{pmatrix}} , ~~ W_+ \!=\! {\small \begin{pmatrix} 1 &{} \omega ^+_{1, 2} &{} \omega ^+_{1, 3} &{}\dots &{} \omega ^+_{1, N}\\ 0 &{} 1 &{} \omega ^+_{2, 3} &{}\dots &{} \omega ^+_{2 ,N}\\ 0 &{} 0 &{} 1 &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} ~\omega ^+_{N-1, N}~\\ ~~0~~ &{} 0 &{} \dots &{} 0 &{} 1\\ \end{pmatrix}} . \end{aligned}$$
(6.8)

From (6.4), we deduce that \(R^*_{\pm }=\pm (P_\mp +P_0/2)\) so that

$$\begin{aligned} R_{\pm }^* (y) = \pm \frac{1}{2}\, y ,\qquad R_{\pm }^* (w_{\pm }) = 0 ,\qquad R_{\pm }^* (w_{\mp }) = \pm w_{\mp } , \end{aligned}$$
(6.9)

for \(y \in \mathfrak {h},~w_{\pm } \in \mathfrak {n}_{\pm }\). Let us introduce the variables \((w_i,z_i)\), defined as

$$\begin{aligned} w_i = \frac{\omega ^+_{i,i+1} - \omega ^-_{i+1,i}}{2}, \qquad z_i=2\,\frac{\eta _{i+1}}{\eta _i} , \end{aligned}$$
(6.10)

from which we determine the Flaschka coordinates as

$$\begin{aligned} {\left\{ \begin{array}{ll} a_i = \dfrac{w_i\,z_i-w_{i-1}\, z_{i-1}}{2} , \qquad i = 2, \dots , N, \\ a_1 = \dfrac{w_1\,z_1}{2} , \qquad a_{N+1} = -\dfrac{w_N\,z_N}{2} ,\\ b_i = \dfrac{z_i}{2} ,\qquad i=1, \dots , N. \end{array}\right. } \end{aligned}$$
(6.11)

The evaluation of (6.6) in those coordinates reproduces the tridiagonal form as in (5.8). One can then check that the equations for the first two flows (5.11) and (5.13) in the previous section derive from the Lax equation

$$\begin{aligned} \partial _{t_k} L = \left[ \, R_{+}(\nabla H_k(L)),L \,\right] , \qquad k = 1,2 , \end{aligned}$$

where the Hamiltonians are taken as

$$\begin{aligned} H_1(L) = {{\,\textrm{Tr}\,}}\,(L^2), \qquad H_2(L) = \frac{2}{3} {{\,\textrm{Tr}\,}}\,(L^3), \end{aligned}$$
(6.12)

and we recall that \(R_+=P_+ +P_0/2\) here.
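As an illustration of the Lax equation for \(k=1\), the sketch below evaluates \([R_+(\nabla H_1(L)),L]\) for a symmetric tridiagonal L with diagonal entries \(a_i\) and off-diagonal entries \(b_i\). It assumes that the gradient of \(H_1={{\,\textrm{Tr}\,}}(L^2)\) with respect to the trace form is 2L. The code then compares the result with a tridiagonal matrix having diagonal entries \(2(b_i^2-b_{i-1}^2)\) and off-diagonal entries \(b_i(a_{i+1}-a_i)\), with \(b_0=b_{N+1}=0\). Function names and the sample size are illustrative.

```python
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tridiagonal(a, b):
    """Symmetric tridiagonal matrix: diagonal a (length n), off-diagonal b (length n-1)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = a[i]
    for i in range(n - 1):
        L[i][i + 1] = L[i + 1][i] = b[i]
    return L

def r_plus(x):
    """R_+ = P_+ + P_0/2: strictly upper part plus half of the diagonal."""
    n = len(x)
    return [[x[i][j] if i < j else (0.5 * x[i][j] if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def lax_rhs(L):
    """[R_+(grad H_1), L], assuming grad H_1 = 2 L for H_1 = Tr(L^2)."""
    n = len(L)
    M = r_plus([[2.0 * v for v in row] for row in L])
    ML, LM = matmul(M, L), matmul(L, M)
    return [[ML[i][j] - LM[i][j] for j in range(n)] for i in range(n)]

rng = random.Random(1)
n = 4                                    # matrix size N + 1
a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
a = [x - sum(a) / n for x in a]          # make L traceless
b = [rng.uniform(0.5, 1.5) for _ in range(n - 1)]

rhs = lax_rhs(tridiagonal(a, b))

bb = [0.0] + b + [0.0]                   # boundary convention b_0 = b_{N+1} = 0
expected = tridiagonal([2.0 * (bb[i + 1] ** 2 - bb[i] ** 2) for i in range(n)],
                       [b[i] * (a[i + 1] - a[i]) for i in range(n - 1)])
lax_defect = max(abs(rhs[i][j] - expected[i][j])
                 for i in range(n) for j in range(n))
print(lax_defect)
```

The commutator stays symmetric and tridiagonal, which is what allows the Lax equation to be read as a closed system for the \(a_i\) and \(b_i\).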

Lagrangian description: The Lagrangian multiform takes the form

$$\begin{aligned} \mathscr {L}= \sum _k \mathscr {L}_{k} \, \textrm{d}t_k = \sum _k \left( {K}_k(L) - {H}_k(L) \right) \textrm{d}t_k , \end{aligned}$$
(6.13)

for \(L \in \mathcal {O}_{\Lambda } , \varphi \in G_R\,\), with the kinetic and the potential terms given by

$$\begin{aligned} {K}_k(L) = \text {Tr} \,( L \, \partial _{t_k} \varphi \cdot _R \varphi ^{-1} ), \qquad {H}_k(L) = \frac{2}{k+1} \,\text {Tr}\,(L^{k+1}\,), \end{aligned}$$
(6.14)

respectively. As in the previous section, the kinetic term will allow us to recognise natural canonical variables of the system in this description. Recalling (6.5), (6.7), (6.8) and (6.10), we find

$$\begin{aligned} \begin{aligned} {K}_k(L)&= \,\text {Tr}\,(\, \Lambda \, \varphi ^{-1} \cdot _R \partial _{t_k} \varphi \,) \\&= \, \,\text {Tr}\,(\, \Lambda \, \varphi ^{-1}_+ \cdot \partial _{t_k} \varphi _+ \, ) - \,\text {Tr}\,(\, \Lambda \, \varphi ^{-1}_- \cdot \partial _{t_k} \varphi _- \, ) \\&=\sum _{i=1}^N \dfrac{\eta _{i+1}}{\eta _i} \, \partial _{t_k} \omega _{i,i+1}^+ -\sum _{i=1}^N \dfrac{\eta _{i+1}}{\eta _i} \, \partial _{t_k} \omega _{i+1,i}^- \\&=\sum _{i=1}^N z_i \, \partial _{t_k} w_i . \end{aligned} \end{aligned}$$
(6.15)

The k-th Lagrangian coefficient expressed in terms of the coordinates \((w_i,z_i)\) reads

$$\begin{aligned} \mathscr {L}_k = {K}_k - {H}_k = \sum _{i=1}^N z_i\, \partial _{t_k}w_i - {H}_k , \end{aligned}$$
(6.16)

with \((w_i,z_i)\) being canonical coordinates, and for \(k=1,2\),

$$\begin{aligned} \begin{aligned} {H}_1(L)&= \,\text {Tr}\,(L^2) = \sum _{i=1}^{N} \frac{1}{2}\left( z_i^2+w_i^2\,z_i^2 \right) - \sum _{i=1}^{N-1} \frac{1}{2}\,w_i\,z_i\,w_{i+1}\,z_{i+1} , \\ {H}_2(L)&= \dfrac{2}{3} \,\text {Tr}\,(L^3)\\&= \sum _{i=1}^{N-1} \frac{1}{4}\left( z_i^2\,w_{i+1}\,z_{i+1}-z_{i+1}^2\,w_i\,z_i + w_i^2\,z_i^2 \,w_{i+1}\,z_{i+1} -w_i\,z_i\,w_{i+1}^2\,z_{i+1}^2 \right) . \end{aligned} \end{aligned}$$

To obtain the latter expressions of the Hamiltonians, it suffices to use the following expression for L in the \(w_i,z_i\) coordinates

$$\begin{aligned} L = \frac{1}{2}\begin{pmatrix} ~w_1\,z_1 &{} \,z_1 &{} 0 &{} 0 &{}\dots &{} 0\\ \,z_1 &{} w_2\,z_2-w_1\,z_1 &{} \,z_2 &{} 0 &{}\dots &{} 0\\ 0 &{} \,z_2 &{} w_3\,z_3-w_2\,z_2 &{} \,z_3 &{}\dots &{} 0\\ 0 &{} 0 &{} \,z_3 &{}\ddots &{}\ddots &{} \vdots \\ \vdots &{} &{} &{} \ddots &{} \ddots &{} \,z_N\\ 0 &{} 0 &{} 0 &{} \dots &{} \,z_N &{} -w_N\,z_N ~ \end{pmatrix} . \end{aligned}$$
(6.17)
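The closed-form expressions for \(H_1\) and \(H_2\) can be cross-checked against the traces of powers of the matrix (6.17). The sketch below builds L from random sample values of \((w_i,z_i)\) and compares \({{\,\textrm{Tr}\,}}(L^2)\) and \(\tfrac{2}{3}{{\,\textrm{Tr}\,}}(L^3)\) with the sums above. Terms whose index would reach \(N+1\) are taken to vanish. All function names are illustrative.

```python
import random

def lax_matrix(w, z):
    """Tridiagonal L of (6.17), of size N+1, from lists w and z of length N."""
    n = len(w)
    # u[i] = w_i z_i for i = 1..N, padded with u[0] = u[N+1] = 0
    u = [0.0] + [wi * zi for wi, zi in zip(w, z)] + [0.0]
    L = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        L[i][i] = 0.5 * (u[i + 1] - u[i])
    for i in range(n):
        L[i][i + 1] = L[i + 1][i] = 0.5 * z[i]
    return L

def trace_of_power(m, p):
    n = len(m)
    acc = m
    for _ in range(p - 1):
        acc = [[sum(acc[i][k] * m[k][j] for k in range(n)) for j in range(n)]
               for i in range(n)]
    return sum(acc[i][i] for i in range(n))

def h1_closed(w, z):
    n = len(w)
    u = [wi * zi for wi, zi in zip(w, z)]
    return (0.5 * sum(z[i] ** 2 + u[i] ** 2 for i in range(n))
            - 0.5 * sum(u[i] * u[i + 1] for i in range(n - 1)))

def h2_closed(w, z):
    n = len(w)
    u = [wi * zi for wi, zi in zip(w, z)]
    return 0.25 * sum(z[i] ** 2 * u[i + 1] - z[i + 1] ** 2 * u[i]
                      + u[i] ** 2 * u[i + 1] - u[i] * u[i + 1] ** 2
                      for i in range(n - 1))

rng = random.Random(2)
w = [rng.uniform(-1.0, 1.0) for _ in range(4)]
z = [rng.uniform(0.5, 1.5) for _ in range(4)]
L = lax_matrix(w, z)

h1_defect = abs(trace_of_power(L, 2) - h1_closed(w, z))
h2_defect = abs(2.0 / 3.0 * trace_of_power(L, 3) - h2_closed(w, z))
print(h1_defect, h2_defect)
```

The same `trace_of_power` helper gives the higher \(H_k\) directly, which is one way to generate the longer expressions mentioned in the text without expanding them by hand.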

Note that we can just as easily determine the higher Hamiltonians and hence the higher Lagrangian coefficients \(\mathscr {L}_k\), although the expressions become long. The variation \(\delta \mathscr {L}_1\) yields the following Euler–Lagrange equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_1} w_1 = z_1 - \dfrac{w_1}{2} \, \left( (w_{2}\, z_{2} - w_{1}\, z_{1}) - w_{1}\, z_{1} \right) , \\ \partial _{t_1} w_N = z_N - \dfrac{w_N}{2}\, \left( - w_{N}\, z_{N} - ( w_{N}\, z_{N} - w_{N-1}\, z_{N-1} ) \right) , \\ \partial _{t_1} w_i = z_i - \dfrac{w_i}{2}\, \left( (w_{i+1}\, z_{i+1} -w_{i}\, z_{i} ) -(w_{i}\, z_{i} - w_{i-1}\, z_{i-1} ) \right) , \\ \partial _{t_1} z_1 = \dfrac{z_1}{2} \, \left( (w_{2}\, z_{2} - w_{1}\, z_{1}) - w_{1}\, z_{1} \right) , \\ \partial _{t_1} z_N = \dfrac{z_N}{2} \, \left( -w_{N}\, z_{N} -(w_{N}\, z_{N} - w_{N-1}\, z_{N-1}) \right) ,\\ \partial _{t_1} z_i = \dfrac{z_i}{2} \, \left( (w_{i+1}\, z_{i+1}- w_{i}\, z_{i}) - ( w_{i}\, z_{i} - w_{i-1}\, z_{i-1} ) \right) , \end{array}\right. } \end{aligned}$$
(6.18)

for \(i=2,\,\dots ,\,N-1\), while the variation of \(\mathscr {L}_2\) gives the Euler–Lagrange equations for the second flow

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _{t_2} z_1 = \dfrac{z_1}{4} \big ( (w_{2}\,z_{2}-w_1\,z_1)^2 -(w_{1}\,z_{1})^2 + z_{2}^2 \big ) , \\ \partial _{t_2} z_N = \dfrac{z_N}{4} \left( (-w_N\,z_N)^2 - (w_N\,z_N - w_{N-1}\,z_{N-1})^2 -z_{N-1}^2 \right) , \\ \partial _{t_2} z_i = \dfrac{z_i}{4}\big ( (w_{i+1}\,z_{i+1}-w_i\,z_i)^2 -(w_{i}\,z_{i}-w_{i-1}\,z_{i-1})^2 + z_{i+1}^2- z_{i-1}^2 \big ),\\ \partial _{t_2} w_1 = \dfrac{z_1}{2}\big (w_{2}\,z_{2}\big ) -\dfrac{w_1}{4} \Big ( (w_{2}\,z_{2}-w_1\,z_1)^2 -(w_{1}\,z_{1})^2 + z_{2}^2 \Big ) ,\\ \partial _{t_2} w_N = -\dfrac{z_N}{2}\big (w_{N-1}\,z_{N-1}\big ) \\ \hspace{5ex}-\dfrac{w_N}{4} \Big ( (w_N\,z_N)^2 -(w_{N}\,z_{N}-w_{N-1}\,z_{N-1})^2- z_{N-1}^2 \Big ) ,\\ \partial _{t_2} w_i = \dfrac{z_i}{2}\big ((w_{i+1}\,z_{i+1}-w_i\,z_i)+(w_i\,z_i-w_{i-1}\,z_{i-1})\big ) \\ \hspace{5ex} -\dfrac{w_i}{4} \Big ( (w_{i+1}\,z_{i+1}-w_i\,z_i)^2 -(w_{i}\,z_{i}-w_{i-1}\,z_{i-1})^2 + z_{i+1}^2-z_{i-1}^2 \Big ) , \end{array}\right. } \end{aligned}$$
(6.19)

with \(i=2,\dots ,N-1\). One can check that these reproduce the more familiar equations (5.11)-(5.13) in Flaschka coordinates, using (6.11). As in the previous section, we can relate our results with the Hamiltonian formulation of the Toda chain in traditional canonical coordinates \((q_i,p_i)\). With

$$\begin{aligned} \theta _R = \sum _{i=1}^N z_i\,\delta w_i ~~ \implies ~~ \{z_i,w_j\} = \delta _{ij} , \qquad \{w_i,w_j\} = 0 = \{z_i,z_j\} , \end{aligned}$$
(6.20)

we see that it suffices to set

$$\begin{aligned} {\left\{ \begin{array}{ll} q_i = \displaystyle \sum _{\ell =i}^N \ln \dfrac{z_\ell }{2}, \qquad i = 1, \dots , N \\ p_i = \dfrac{w_i\,z_i-w_{i-1}\,z_{i-1}}{2}, \qquad i = 2, \dots , N \\ p_1 = \dfrac{w_1\,z_1}{2} , \qquad p_{N+1} = -\dfrac{w_N\,z_N}{2} . \end{array}\right. } \end{aligned}$$
(6.21)

The explicit verification of the closure relation in the first two flows is completely analogous to that given at the end of the previous section.
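Since \(\mathscr {L}_1=\sum _i z_i\,\partial _{t_1}w_i-H_1\), its Euler–Lagrange equations are Hamilton's equations \(\partial _{t_1}w_i=\partial H_1/\partial z_i\) and \(\partial _{t_1}z_i=-\partial H_1/\partial w_i\). The sketch below compares the generic (bulk) right-hand sides of (6.18) with central finite differences of \(H_1\). The boundary cases are covered by the convention \(w_0z_0=0=w_{N+1}z_{N+1}\). The step size, the tolerance and the function names are choices made for this illustration only.

```python
import random

def h1(w, z):
    """H_1 = Tr(L^2) in the closed form given in terms of (w_i, z_i)."""
    n = len(w)
    u = [wi * zi for wi, zi in zip(w, z)]
    return (0.5 * sum(z[i] ** 2 + u[i] ** 2 for i in range(n))
            - 0.5 * sum(u[i] * u[i + 1] for i in range(n - 1)))

def flow_1(w, z):
    """Generic right-hand sides of (6.18), with w_0 z_0 = w_{N+1} z_{N+1} = 0."""
    n = len(w)
    u = [0.0] + [wi * zi for wi, zi in zip(w, z)] + [0.0]
    dw, dz = [], []
    for i in range(1, n + 1):
        # discrete second difference of u_i = w_i z_i
        lap = (u[i + 1] - u[i]) - (u[i] - u[i - 1])
        dw.append(z[i - 1] - 0.5 * w[i - 1] * lap)
        dz.append(0.5 * z[i - 1] * lap)
    return dw, dz

def central_diff(f, x, i, h=1e-6):
    """Central finite-difference approximation of the i-th partial derivative of f at x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

rng = random.Random(3)
w = [rng.uniform(-1.0, 1.0) for _ in range(5)]
z = [rng.uniform(0.5, 1.5) for _ in range(5)]
n = len(w)

dw, dz = flow_1(w, z)
dH_dz = [central_diff(lambda zz: h1(w, zz), z, i) for i in range(n)]
dH_dw = [central_diff(lambda ww: h1(ww, z), w, i) for i in range(n)]

el_defect = max(max(abs(dw[i] - dH_dz[i]), abs(dz[i] + dH_dw[i]))
                for i in range(n))
print(el_defect)
```

Because finite differences are only approximate, the defect is expected to be small but not at machine precision.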

7 Rational Gaudin model

Gaudin models are a general class of integrable systems associated with Lie algebras with a nondegenerate invariant bilinear form. Unlike the case of the open Toda lattice, the Lax matrix of a Gaudin model is a Lie algebra-valued rational function of a variable \(\lambda \), the spectral parameter. We will only look at finite Gaudin models here, which describe certain spin chains and mechanical systems. To accommodate this, we need to extend our construction to certain infinite-dimensional Lie algebras.

Before diving into the required algebraic machinery, it is useful to recall the usual presentation of the equations of the model that we are aiming at describing variationally. We do so in the simplest case of a rational Lax matrix with simple poles. Many generalisations are known, including elliptic and non-skew-symmetric cases [33]. The Lax matrix of a (rational) Gaudin model associated with a finite Lie algebra \(\mathfrak {g}\) and a set of points \(\zeta _r \in \mathbb {C}\) \((r=1, \ldots , N)\) and the point at infinity is given by the following \(\mathfrak {g}\)-valued rational function

$$\begin{aligned} L(\lambda )= \sum _{r=1}^N \frac{X_r}{\lambda - \zeta _r} + X_\infty ,~~X_1,\dots ,X_N,X_\infty \in \mathfrak {g}. \end{aligned}$$
(7.1)

The coefficients \(H^n_{k, r}\) of \((\lambda -\zeta _r)^{-n-1}\), \(n\ge 0\), in \({{\,\textrm{Tr}\,}}(L(\lambda )^{k+1})/{(k+1)}\), \( k\ge 1\), are Hamiltonians in involution (with respect to the Sklyanin bracket). Of course, only a finite subset of them are independent and generate nontrivial flows. In the rest of this paper, we will focus on the coefficients corresponding to \(n=0\) and drop the extra label by simply writing \(H^0_{k, r}=H_{k, r}\). The most famous ones are the quadratic Gaudin Hamiltonians which are the coefficients \(H_{1, r}\) in

$$\begin{aligned} \frac{1}{2}{{\,\textrm{Tr}\,}}(L(\lambda )^2) = \frac{1}{2}\sum _{r=1}^N\frac{{{\,\textrm{Tr}\,}}(X_r^2)}{(\lambda - \zeta _r)^2} + \sum _{r=1}^N\frac{H_{1, r}}{\lambda - \zeta _r} + \frac{1}{2}{{\,\textrm{Tr}\,}}(X_\infty ^2), \end{aligned}$$
(7.2)

and read

$$\begin{aligned} H_{1, r}=\sum _{s\ne r}\frac{{{\,\textrm{Tr}\,}}(X_rX_s)}{\zeta _r-\zeta _s}+{{\,\textrm{Tr}\,}}(X_rX_\infty ),~~r=1,\dots ,N. \end{aligned}$$
(7.3)

The functions \(H_{k, r}\) give rise to a hierarchy of compatible equations in Lax form

$$\begin{aligned} \partial _{t_k^r}L(\lambda )=\left[ M_{k, r}(\lambda ),\,L(\lambda )\right] . \end{aligned}$$
(7.4)

For \(k=1\), we have

$$\begin{aligned} M_{1, r} = -\frac{X_r}{\lambda - \zeta _r}, \end{aligned}$$
(7.5)

and (7.4) gives the following equations of motion for the degrees of freedom in \(X_1,\dots ,X_N,X_\infty \)

$$\begin{aligned}&\partial _{t_1^r}X_s = \frac{[X_r\,,\,X_s]}{\zeta _r-\zeta _s}\,,\qquad s\ne r\,, \end{aligned}$$
(7.6)
$$\begin{aligned}&\partial _{t_1^r}X_r = -\sum _{s\ne r}\frac{[X_r\,,\,X_s]}{\zeta _r-\zeta _s}- [X_r\,,\,X_\infty ] \,, \end{aligned}$$
(7.7)
$$\begin{aligned}&\partial _{t_1^r}X_\infty = 0\,. \end{aligned}$$
(7.8)
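Equations (7.6)-(7.8) are obtained by comparing residues on both sides of (7.4) with \(M_{1,r}\) as in (7.5). As a hedged numerical illustration, the sketch below takes random \(2\times 2\) sample matrices for \(X_s\) and \(X_\infty \), evaluates both sides at one sample value of \(\lambda \), and checks that they agree. The pole positions, matrix size and helper names are arbitrary choices for the example.

```python
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, c=1.0):
    """Return x + c*y for square matrices x and y."""
    n = len(x)
    return [[x[i][j] + c * y[i][j] for j in range(n)] for i in range(n)]

def comm(x, y):
    return add(matmul(x, y), matmul(y, x), -1.0)

def rand_matrix(rng, n=2):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(4)
zeta = [0.3, -1.1, 2.0]                 # sample pole positions
X = [rand_matrix(rng) for _ in zeta]
X_inf = rand_matrix(rng)

def lax(lam):
    """L(lambda) of (7.1)."""
    L = add(ZERO, X_inf)
    for Xs, zs in zip(X, zeta):
        L = add(L, Xs, 1.0 / (lam - zs))
    return L

def flow_1(r):
    """Right-hand sides of (7.6)-(7.7) for the time t_1^r (r is 0-based here)."""
    dX = [None] * len(X)
    dX[r] = add(ZERO, comm(X[r], X_inf), -1.0)
    for s in range(len(X)):
        if s != r:
            dX[s] = add(ZERO, comm(X[r], X[s]), 1.0 / (zeta[r] - zeta[s]))
            dX[r] = add(dX[r], dX[s], -1.0)
    return dX

r, lam = 0, 0.7 + 0.4j
# Left-hand side of (7.4): derivative of L, using (7.8) for X_inf.
lhs = ZERO
for dXs, zs in zip(flow_1(r), zeta):
    lhs = add(lhs, dXs, 1.0 / (lam - zs))
# Right-hand side of (7.4) with M_{1,r} from (7.5).
rhs = comm(add(ZERO, X[r], -1.0 / (lam - zeta[r])), lax(lam))

lax_defect = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(lax_defect)
```

Agreement at a generic \(\lambda \) is consistent with the apparent double pole at \(\zeta _r\) cancelling in the commutator, as it must for (7.4) to preserve the form (7.1).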

We proceed to derive a Lagrangian multiform description of the set of equations (7.6)-(7.8), as well as those corresponding to the next higher Hamiltonians with \(k=2\). In principle, we could also include all higher Hamiltonians, but the first two levels are enough to illustrate our method. To do so, we need to be able to interpret \(L(\lambda )\) as living in a coadjoint orbit and use the framework of Lie dialgebras. This is described in [21, Lecture 3] which we now review and adapt to our purposes.

Algebraic setup: Let \(Q = \{\zeta _1, \ldots , \zeta _N, \infty \} \subset \mathbb {C}P^1\) be a finite set of points in \(\mathbb {C}P^1\) including the point at infinity, and denote by \(\mathcal {F}_Q(\mathfrak {g})\) the algebra of \(\mathfrak {g}\)-valued rational functions in the formal variable \(\lambda \) with poles in Q. Further, define the local parameters

$$\begin{aligned} \lambda _r = \lambda - \zeta _r,\quad \zeta _r \ne \infty ,\qquad \lambda _\infty = \frac{1}{\lambda }, \end{aligned}$$
(7.9)

and let \(S=\{1,\dots ,N,\infty \}\). This is to be used as an index set, so \(\infty \) is viewed here purely as a label for an index, not as the point at infinity. For each \(r\in S\), consider the algebra \(\widetilde{\mathfrak {g}}_r\) of formal Laurent series in the variable \(\lambda _r\) with coefficients in \(\mathfrak {g}\),

$$\begin{aligned} \widetilde{\mathfrak {g}}_r = \mathfrak {g}\otimes \mathbb {C}((\lambda _r)), \end{aligned}$$
(7.10)

with Lie bracket

$$\begin{aligned}{}[X\lambda _r^i, Y\lambda _r^j] = [X, Y] \lambda _r^{i+j},\qquad X,Y \in \mathfrak {g}. \end{aligned}$$
(7.11)

We have the vector space decomposition into Lie subalgebras

$$\begin{aligned} \widetilde{\mathfrak {g}}_r = \widetilde{\mathfrak {g}}_{r+} \oplus \widetilde{\mathfrak {g}}_{r-}, \end{aligned}$$
(7.12)

where

$$\begin{aligned} \widetilde{\mathfrak {g}}_{r+}=\mathfrak {g}\otimes \mathbb {C}[[\lambda _r]],\quad r\ne \infty ,\qquad \widetilde{\mathfrak {g}}_{\infty +}=\mathfrak {g}\otimes \lambda _\infty \mathbb {C}[[\lambda _\infty ]], \end{aligned}$$
(7.13)

and

$$\begin{aligned} \widetilde{\mathfrak {g}}_{r-}=\mathfrak {g}\otimes \lambda _r^{-1}\mathbb {C}[\lambda _r^{-1}],\quad r\ne \infty ,\qquad \widetilde{\mathfrak {g}}_{\infty -}=\mathfrak {g}\otimes \mathbb {C}[\lambda _\infty ^{-1}]. \end{aligned}$$
(7.14)

In other words, \(\widetilde{\mathfrak {g}}_{r+}\) is the algebra of formal Taylor series in \(\lambda _r\) (without constant term when \(r=\infty \)) and \(\widetilde{\mathfrak {g}}_{r-}\) is the algebra of polynomials in \(\lambda _r^{-1}\) without constant term (except when \(r=\infty \)). Associated with this decomposition, we have projectors \(P_{r\pm }\) onto \(\widetilde{\mathfrak {g}}_{r\pm }\) relative to \(\widetilde{\mathfrak {g}}_{r\mp }\). Let us now consider \(\widetilde{\mathfrak {g}}_Q\) defined as the following direct sum of Lie algebras

$$\begin{aligned} \widetilde{\mathfrak {g}}_Q = \bigoplus _{r\in S} \widetilde{\mathfrak {g}}_r. \end{aligned}$$
(7.15)

The above decompositions yield the decomposition of \(\widetilde{\mathfrak {g}}_Q\) as

$$\begin{aligned} \widetilde{\mathfrak {g}}_{Q}=\widetilde{\mathfrak {g}}_{Q+} \oplus \widetilde{\mathfrak {g}}_{Q-}~~\text {with}~~ \widetilde{\mathfrak {g}}_{Q+} = \bigoplus _{r\in S} \widetilde{\mathfrak {g}}_{r+} ~~ \text {and} ~~ \widetilde{\mathfrak {g}}_{Q-} = \bigoplus _{r\in S} \widetilde{\mathfrak {g}}_{r-}, \end{aligned}$$
(7.16)

and the related projectors \(P_\pm \). Although useful, as we will see below, the decomposition (7.16) is not what we need to interpret (7.4) within the Lie dialgebra setup. So, let us consider the map

$$\begin{aligned} \iota _\lambda : \mathcal {F}_Q(\mathfrak {g}) \rightarrow \widetilde{\mathfrak {g}}_Q, \qquad f \mapsto \left( \iota _{\lambda _1} f, \ldots , \iota _{\lambda _N} f, \iota _{\lambda _\infty } f \right) , \end{aligned}$$
(7.17)

where \(\iota _{\lambda _r} f \in \widetilde{\mathfrak {g}}_r\) is the formal Laurent series of \(f \in \mathcal {F}_Q(\mathfrak {g})\) at \(\zeta _r \in \mathbb {C}P^1\), \(r=1,\dots ,N\), and \(\iota _{\lambda _\infty } f \in \widetilde{\mathfrak {g}}_\infty \) is that of \(f \in \mathcal {F}_Q(\mathfrak {g})\) at the point at infinity. This is an embedding of Lie algebras. In addition, we have the vector space decomposition

$$\begin{aligned} \widetilde{\mathfrak {g}}_Q = \widetilde{\mathfrak {g}}_{Q+} \oplus \iota _\lambda \mathcal {F}_Q(\mathfrak {g}). \end{aligned}$$
(7.18)

Let us introduce the projectors \(\Pi _\pm \) associated with this decomposition. They are different from \(P_\pm \) related to (7.16). The following relation is useful in practical calculations (see below when computing gradients or in (7.26))

$$\begin{aligned} \Pi _-(X)=\iota _\lambda \circ \pi _\lambda \circ P_-(X),~~X\in \widetilde{\mathfrak {g}}_Q, \end{aligned}$$
(7.19)

where the map \(\pi _\lambda :\widetilde{\mathfrak {g}}_{Q-}\rightarrow \mathcal {F}_Q(\mathfrak {g})\) given by

$$\begin{aligned} \pi _\lambda \,(Y_1(\lambda _1),\dots , Y_N(\lambda _N),Y_\infty (\lambda _\infty )) =\sum _{r\in S}Y_r(\lambda _r) \end{aligned}$$
(7.20)

puts elements of \(\widetilde{\mathfrak {g}}_{Q-}\) and \(\mathcal {F}_Q(\mathfrak {g})\) in one-to-one correspondence. This amounts to decomposing an \(f\in \mathcal {F}_Q(\mathfrak {g})\) into the sum of its partial fractions \(Y_r(\lambda _r)\).

We define the r-matrix we need as

$$\begin{aligned} R=\Pi _+-\Pi _- \end{aligned}$$
(7.21)

and use it to define on \(\widetilde{\mathfrak {g}}_Q\) the structure of a Lie dialgebra to which we will apply the results of that theory. Since we want to work with rational fractions which we have naturally embedded as \(\iota _\lambda \mathcal {F}_Q(\mathfrak {g})\) into \(\widetilde{\mathfrak {g}}_Q\), we need to identify the dual space this corresponds to, so that we can identify the coadjoint action and its orbits appropriately. The nondegenerate invariant symmetric bilinear form on \(\mathfrak {g}\), given by \((X, Y) \mapsto {{\,\textrm{Tr}\,}}(XY)\), can be used to define a nondegenerate invariant symmetric bilinear form on \(\widetilde{\mathfrak {g}}_Q\) by setting

$$\begin{aligned} \langle X, Y \rangle = \sum _{r\in S} \textrm{res}_{\lambda _r = 0} {{\,\textrm{Tr}\,}}(X_r(\lambda _r) Y_r(\lambda _r)). \end{aligned}$$
(7.22)

Both \(\widetilde{\mathfrak {g}}_{Q+}\) and \(\iota _\lambda \mathcal {F}_Q(\mathfrak {g})\) are Lie subalgebras which are (maximally) isotropic with respect to the bilinear form \(\langle \, ~~,~~ \rangle \) in (7.22). This tells us that

$$\begin{aligned} \widetilde{\mathfrak {g}}_{Q+}^{*} \simeq \iota _\lambda \mathcal {F}_Q(\mathfrak {g}), \end{aligned}$$
(7.23)

so that elements of \(\widetilde{\mathfrak {g}}_{Q+}^{*}\) are those we should work with if we want to deal with Lax matrices which are rational fractions of the spectral parameter. Accordingly, coadjoint orbits of \(\widetilde{G}_{Q+}\) in \(\widetilde{\mathfrak {g}}_{Q+}^{*}\) are the natural arena for the description of Gaudin Lax matrices. \(\widetilde{G}_{Q+}\) is the group associated with the algebra \(\widetilde{\mathfrak {g}}_{Q+}\), with elements of the form

$$\begin{aligned} \varphi _+=\left( \varphi _{1+}(\lambda _1),\dots ,\varphi _{N+}(\lambda _N),\varphi _{\infty +}(\lambda _\infty ) \right) . \end{aligned}$$
(7.24)

Each component \(\varphi _{r+}(\lambda _r)\) is a Taylor series in the local parameter \(\lambda _r\) with values in G whose Lie algebra is \(\mathfrak {g}\),

$$\begin{aligned} \varphi _{r+}(\lambda _r)=\sum _{n=0}^\infty \phi _r^{(n)}\lambda _r^n,\quad r\ne \infty ,\qquad \varphi _{\infty +}(\lambda _\infty )=\varvec{1}+\sum _{n=1}^\infty \phi _\infty ^{(n)} \lambda _\infty ^n. \end{aligned}$$
(7.25)

As always, in practice we use the identification (7.23) (identifying the adjoint and coadjoint actions accordingly) and the (co)adjoint orbit of an element \(f\in \iota _\lambda \mathcal {F}_Q(\mathfrak {g})\) can be seen to be given by the elements

$$\begin{aligned} F=\Pi _-(\text {Ad}_{\varphi _+}\cdot f)\equiv \iota _\lambda L. \end{aligned}$$
(7.26)

In (7.26), the adjoint action of \(\varphi _+\) on f is defined component-wise

$$\begin{aligned} (\text {Ad}_{\varphi _+}\cdot f)_r(\lambda _r)=\varphi _{r+}(\lambda _r)\, f_r(\lambda _r)\,\varphi _{r+}(\lambda _r)^{-1},~~r\in S. \end{aligned}$$
(7.27)

Thus, we have a construction that allows us to interpret a rational Lax matrix \(L(\lambda )\) as an element of a (co)adjoint orbit and recast (7.4) as the following Lax equation in \(\iota _\lambda \mathcal {F}_Q(\mathfrak {g})\)

$$\begin{aligned} \partial _{t_k^r}\iota _\lambda L=[R_\pm \nabla H_{k, r}(\iota _\lambda L),\iota _\lambda L], \end{aligned}$$
(7.28)

where \(H_{k, r}\) are the following invariant functions on \(\widetilde{\mathfrak {g}}_Q\)

$$\begin{aligned} H_{k, r}:\, X\in \widetilde{\mathfrak {g}}_Q\mapsto {{\,\textrm{res}\,}}_{\lambda _r=0}\frac{{{\,\textrm{Tr}\,}}(X_r(\lambda _r)^{k+1})}{k+1},~~k\ge 1. \end{aligned}$$
(7.29)

We now apply the described framework to show how (7.4) is derived in this context for \(k=1,2\). Then we construct explicitly the corresponding Lagrangian coefficients of our multiform and check that their Euler–Lagrange equations produce the correct equations of motion.

Lax matrix and Lax equations for the first two flows: Let us choose

$$\begin{aligned} \Lambda (\lambda ) =\sum _{r=1}^N\frac{\Lambda _r}{\lambda -\zeta _r}+\Omega , \end{aligned}$$
(7.30)

and apply (7.26) to \(f=\iota _\lambda \Lambda \) to get

$$\begin{aligned} \iota _\lambda L=\Pi _-\left( \text {Ad}_{\varphi _+}\cdot \iota _\lambda \Lambda \right){} & {} = \iota _\lambda \circ \pi _\lambda \circ P_-\left( \text {Ad}_{\varphi _+}\cdot \iota _\lambda \Lambda \right) \nonumber \\{} & {} = \iota _\lambda \circ \pi _\lambda \left( \frac{\phi _1^{(0)}\,\Lambda _1\,(\phi _1^{(0)})^{-1}}{\lambda -\zeta _1},\dots , \frac{\phi _N^{(0)}\,\Lambda _N\,(\phi _N^{(0)})^{-1}}{\lambda -\zeta _N},\Omega \right) \nonumber \\{} & {} \equiv \iota _\lambda \circ \pi _\lambda \left( \frac{A_1}{\lambda -\zeta _1},\dots , \frac{A_N}{\lambda -\zeta _N},\Omega \right) \nonumber \\{} & {} =\iota _\lambda \left( \sum _{r=1}^N\frac{A_r}{\lambda -\zeta _r}+\Omega \right) . \end{aligned}$$
(7.31)

This is the desired form of (7.1) where now each \(X_r\) is of the form \(A_r=\phi _r^{(0)}\,\Lambda _r\,(\phi _r^{(0)})^{-1}\) with \(\Lambda _r\in \mathfrak {g}\) fixed and \(\phi _r^{(0)}\) containing the dynamical degrees of freedom. This is the (co)adjoint description required to compute our Lagrangian coefficients, see below.

Next, we derive the Lax equations in \(\iota _\lambda \mathcal {F}_Q(\mathfrak {g})\) associated with the functions \(H_{k, r}(\iota _\lambda L)\) for \(k=1, 2\). The gradient of \(H_{k, r}\) at the point \(\iota _\lambda L\) is defined as the element of \(\widetilde{\mathfrak {g}}_Q\) satisfying

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\frac{H_{k, r}(\iota _\lambda L+\epsilon \eta )-H_{k, r}(\iota _\lambda L)}{\epsilon }=\langle \eta ,\, \nabla H_{k, r}(\iota _\lambda L)\rangle , \end{aligned}$$
(7.32)

for all \(\eta \in \widetilde{\mathfrak {g}}_Q\). It is enough for our purposes to calculate \(R_-(\nabla H_{k, r}(\iota _\lambda L))\), therefore, we can restrict \(\eta \) to \(\widetilde{\mathfrak {g}}_{Q+}\). Thus, writing

$$\begin{aligned} \nabla H_{k, r}(\iota _\lambda L)=N^{(k)}+\iota _\lambda h^{(k)},\qquad N^{(k)}\in \widetilde{\mathfrak {g}}_{Q+},~~h^{(k)}(\lambda )\in \mathcal {F}_Q(\mathfrak {g}), \end{aligned}$$
(7.33)

recalling that \(\widetilde{\mathfrak {g}}_{Q+}\) and \(\iota _\lambda \mathcal {F}_Q(\mathfrak {g})\) are isotropic with respect to the bilinear form in (7.22), (7.32) becomes

$$\begin{aligned} {{\,\textrm{res}\,}}_{\lambda _r=0}{{\,\textrm{Tr}\,}}\left( \eta _r \,\iota _{\lambda _r} L^k \right) =\sum _{s \in S}{{\,\textrm{res}\,}}_{\lambda _s=0} {{\,\textrm{Tr}\,}}\left( \eta _s \,\iota _{\lambda _s} h^{(k)} \right) , \end{aligned}$$
(7.34)

for any \(\eta _s \in \widetilde{\mathfrak {g}}_{s+}\), \(s\in S\), implying

$$\begin{aligned} ( \iota _{\lambda _s}h^{(k)})_-&= 0\,,~~\forall \,s\ne r\,,\end{aligned}$$
(7.35)
$$\begin{aligned} (\iota _{\lambda _r}h^{(k)})_-&= (\iota _{\lambda _r} L^k)_-\,. \end{aligned}$$
(7.36)

This means that the rational function \(h^{(k)}(\lambda )\) has a nonzero principal part only at \(\zeta _r\), which equals \((\iota _{\lambda _r} L^k)_-\), so

$$\begin{aligned} h^{(k)}(\lambda ) = (\iota _{\lambda _r} L^k)_-(\lambda ), \end{aligned}$$
(7.37)

and we find

$$\begin{aligned} R_-(\nabla H_{k, r}(\iota _\lambda L)) = -\Pi _-(\nabla H_{k, r}(\iota _\lambda L))= -\iota _\lambda h^{(k)} = -\iota _\lambda \left( (\iota _{\lambda _r} L^k)_- \right) .\qquad \end{aligned}$$
(7.38)

For \(k=1, 2\), this gives us

$$\begin{aligned} R_-(\nabla H_{1, r}(\iota _\lambda L)) = -\iota _\lambda \frac{A_r}{\lambda -\zeta _r}, \end{aligned}$$
(7.39)

and

$$\begin{aligned} R_-(\nabla H_{2, r}(\iota _\lambda L)) = -\iota _\lambda \left( \frac{A_r^2}{(\lambda -\zeta _r)^2} +\sum _{s \ne r} \frac{A_rA_s+A_sA_r}{(\lambda - \zeta _r)(\zeta _r - \zeta _s)} + \frac{A_r\Omega +\Omega A_r}{\lambda - \zeta _r} \right) ,\nonumber \\ \end{aligned}$$
(7.40)

respectively. As a consequence, we find the Lax equations for the two levels of flows as

$$\begin{aligned}{} & {} \partial _{t_1^r}\iota _\lambda L=\left[ -\iota _\lambda \frac{A_r}{\lambda -\zeta _r},\,\iota _\lambda L\right] , \end{aligned}$$
(7.41)
$$\begin{aligned}{} & {} \partial _{t_2^r}\iota _\lambda L=\left[ -\iota _\lambda \left( \frac{A_r^2}{(\lambda -\zeta _r)^2} +\sum _{s \ne r} \frac{A_rA_s+A_sA_r}{(\lambda - \zeta _r)(\zeta _r - \zeta _s)} + \frac{A_r\Omega +\Omega A_r}{\lambda - \zeta _r} \right) ,\,\iota _\lambda L\right] .\nonumber \\ \end{aligned}$$
(7.42)
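The principal part entering (7.40) and (7.42) can be extracted numerically as Laurent coefficients of \(L(\lambda )^2\) at \(\zeta _r\). The sketch below approximates the contour integrals by the trapezoid rule on a small circle around \(\zeta _r\) and compares the coefficients of \((\lambda -\zeta _r)^{-2}\) and \((\lambda -\zeta _r)^{-1}\) with \(A_r^2\) and \(A_rB+BA_r\). Here \(B=\sum _{s\ne r}A_s/(\zeta _r-\zeta _s)+\Omega \) is a shorthand introduced only for this example. The sample matrices, the radius and the number of sample points are arbitrary.

```python
import cmath
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, c=1.0):
    """Return x + c*y for square matrices x and y."""
    n = len(x)
    return [[x[i][j] + c * y[i][j] for j in range(n)] for i in range(n)]

def rand_matrix(rng, n=2):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(5)
zeta = [0.0, 1.5, -2.0]                 # sample pole positions
A = [rand_matrix(rng) for _ in zeta]
Omega = rand_matrix(rng)

def lax_squared(lam):
    L = add(ZERO, Omega)
    for As, zs in zip(A, zeta):
        L = add(L, As, 1.0 / (lam - zs))
    return matmul(L, L)

def laurent_coefficient(f, center, power, radius=0.3, samples=64):
    """Coefficient of (lam - center)**power in f, via the trapezoid rule on a circle.

    The radius must be smaller than the distance to the nearest other pole.
    """
    acc = ZERO
    for j in range(samples):
        step = radius * cmath.exp(2j * cmath.pi * j / samples)
        acc = add(acc, f(center + step), step ** (-power) / samples)
    return acc

r = 0
B = add(ZERO, Omega)
for s, As in enumerate(A):
    if s != r:
        B = add(B, As, 1.0 / (zeta[r] - zeta[s]))

double_pole = laurent_coefficient(lax_squared, zeta[r], -2)
simple_pole = laurent_coefficient(lax_squared, zeta[r], -1)
expected_double = matmul(A[r], A[r])
expected_simple = add(matmul(A[r], B), matmul(B, A[r]))

def max_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

principal_defect = max(max_diff(double_pole, expected_double),
                       max_diff(simple_pole, expected_simple))
print(principal_defect)
```

For functions analytic on the circle the trapezoid rule converges geometrically, so 64 sample points are ample for this configuration.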

Explicitly, they yield the following equations on the \(A_s\),

$$\begin{aligned} \begin{aligned}&\partial _{t_1^r}A_s= \frac{[A_r,\,A_s]}{\zeta _r-\zeta _s},~~s\ne r,\\&\partial _{t_1^r}A_r= -\sum _{s\ne r}\frac{[A_r,\,A_s]}{\zeta _r-\zeta _s}- [A_r,\,\Omega ], \end{aligned} \end{aligned}$$
(7.43)

thus reproducing (7.6)-(7.7) ((7.8) is automatic here since \(\Omega \) is a constant element of \(\mathfrak {g}\)), and

$$\begin{aligned} \begin{aligned} \partial _{t_2^r}A_s&= - \frac{[A_r^2,\,A_s]}{(\zeta _r-\zeta _s)^2} + \sum _{{s^\prime } \ne r} \frac{[A_r A_{s^\prime } + A_{s^\prime }A_r, A_s]}{(\zeta _r-\zeta _s)(\zeta _r-\zeta _{s^\prime })} + \frac{[A_r\Omega + \Omega A_r,\,A_s]}{\zeta _r-\zeta _s} ,~~ s \ne r,\\ \partial _{t_2^r}A_r&= \sum _{s \ne r} \frac{[A_r^2,\,A_s]}{(\zeta _r-\zeta _s)^2} -\sum _{s\ne r}\sum _{{s^\prime }\ne r}\frac{[A_r,\,A_s A_{s^\prime }]}{(\zeta _r-\zeta _s)(\zeta _r-\zeta _{s^\prime })} \\ {}&\quad - \sum _{s \ne r} \frac{[A_r,\, A_s \Omega + \Omega A_s]}{\zeta _r-\zeta _s} - [A_r,\, \Omega ^2]. \end{aligned} \end{aligned}$$
(7.44)
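Equations (7.44) can be checked against the Lax form (7.42) in the same spirit. The sketch below uses the shorthand \(B=\sum _{s\ne r}A_s/(\zeta _r-\zeta _s)+\Omega \), introduced only for this example. With it, the second line of (7.44) reads \(\sum _{s\ne r}[A_r^2,A_s]/(\zeta _r-\zeta _s)^2-[A_r,B^2]\) and the first reads \(-[A_r^2,A_s]/(\zeta _r-\zeta _s)^2+[A_rB+BA_r,A_s]/(\zeta _r-\zeta _s)\). Sample data and names are illustrative.

```python
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, c=1.0):
    """Return x + c*y for square matrices x and y."""
    n = len(x)
    return [[x[i][j] + c * y[i][j] for j in range(n)] for i in range(n)]

def comm(x, y):
    return add(matmul(x, y), matmul(y, x), -1.0)

def rand_matrix(rng, n=2):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(6)
zeta = [0.0, 1.5, -2.0]                 # sample pole positions
A = [rand_matrix(rng) for _ in zeta]
Omega = rand_matrix(rng)

def lax(lam):
    L = add(ZERO, Omega)
    for As, zs in zip(A, zeta):
        L = add(L, As, 1.0 / (lam - zs))
    return L

def second_level(r):
    """Return (A_r^2, D, B) with D = A_r B + B A_r and B as in the lead-in."""
    B = add(ZERO, Omega)
    for s, As in enumerate(A):
        if s != r:
            B = add(B, As, 1.0 / (zeta[r] - zeta[s]))
    return matmul(A[r], A[r]), add(matmul(A[r], B), matmul(B, A[r])), B

def flow_2(r):
    """Right-hand sides of (7.44) for the time t_2^r (r is 0-based here)."""
    A2, D, B = second_level(r)
    dA = [None] * len(A)
    dA[r] = add(ZERO, comm(A[r], matmul(B, B)), -1.0)
    for s, As in enumerate(A):
        if s != r:
            d = zeta[r] - zeta[s]
            dA[s] = add(add(ZERO, comm(A2, As), -1.0 / d ** 2), comm(D, As), 1.0 / d)
            dA[r] = add(dA[r], comm(A2, As), 1.0 / d ** 2)
    return dA

r, lam = 0, 0.7 + 0.4j
A2, D, _ = second_level(r)
# Minus the principal part of L^2 at zeta_r, as in (7.40).
M2 = add(add(ZERO, A2, -1.0 / (lam - zeta[r]) ** 2), D, -1.0 / (lam - zeta[r]))
rhs = comm(M2, lax(lam))
lhs = ZERO
for dAs, zs in zip(flow_2(r), zeta):
    lhs = add(lhs, dAs, 1.0 / (lam - zs))

lax_defect = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(lax_defect)
```

The check at a single generic \(\lambda \) is only indicative. Repeating it for several values of \(\lambda \) and of r gives more confidence.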

Lagrangian description: Applying our formula for the Lagrangian coefficients, we obtain the following multiform on the orbit of \(\Lambda (\lambda )\), with elements \(\iota _\lambda L\) given in (7.31),

$$\begin{aligned} \mathscr {L}= \sum _{k=1}^{N} \sum _{r \in S} \mathscr {L}_{k, r}\, \textrm{d}t_k^r , \end{aligned}$$
(7.45)

with

$$\begin{aligned} \mathscr {L}_{k, r} = \sum _{s\in S} {{\,\textrm{res}\,}}_{\lambda _s = 0} {{\,\textrm{Tr}\,}}\left( \iota _{\lambda _s}L\, \partial _{t_k^r}\varphi _{s+}(\lambda _s)\,\varphi _{s+}(\lambda _s)^{-1}\right) -H_{k, r}(\iota _\lambda L), \end{aligned}$$
(7.46)

where \(H_{k, r}(\iota _\lambda L)\) is the restriction of \(H_{k, r}\) to \(\iota _{\lambda }L\). For the kinetic part, we have

$$\begin{aligned} {{\,\textrm{res}\,}}_{\lambda _s = 0} {{\,\textrm{Tr}\,}}(\iota _{\lambda _s}L\, \partial _{t_k^r}\varphi _{s+}(\lambda _s)\,\varphi _{s+}(\lambda _s)^{-1})={{\,\textrm{Tr}\,}}\left( \Lambda _s (\phi _s^{(0)})^{-1}\partial _{t_k^r}\phi _s^{(0)}\right) ,~~s=1,\dots ,N,\nonumber \\ \end{aligned}$$
(7.47)

and

$$\begin{aligned} {{\,\textrm{res}\,}}_{\lambda _\infty = 0} {{\,\textrm{Tr}\,}}(\iota _{\lambda _\infty }L\, \partial _{t_k^r}\varphi _{\infty +}(\lambda _\infty )\,\varphi _{\infty +}(\lambda _\infty )^{-1})= & {} {{\,\textrm{Tr}\,}}\left( \Omega \partial _{t_k^r}\phi _\infty ^{(1)}\,\phi _\infty ^{(1)}\right) \nonumber \\ {}= & {} \frac{1}{2}\partial _{t_k^r}{{\,\textrm{Tr}\,}}\left( \Omega (\phi _\infty ^{(1)})^2\right) . \end{aligned}$$
(7.48)

The contribution at \(\infty \) is a total derivative, so it does not enter the Euler–Lagrange equations and we discard it. Thus, only the term \(\phi _s^{(0)}\) in the Taylor series of \(\varphi _{s+}(\lambda _s)\) appears in the kinetic term. We will simply denote it by \(\phi _s\) to lighten the notation. The Lagrangian coefficients of the Gaudin multiform take the form

$$\begin{aligned} \mathscr {L}_{k, r} = \sum _{s=1}^N{{\,\textrm{Tr}\,}}\left( \Lambda _s \phi _s^{-1}\partial _{t_k^r}\phi _s\right) - H_{k, r}(\iota _\lambda L). \end{aligned}$$
(7.49)

More explicitly, for \(k=1,2\), we have

$$\begin{aligned} H_{1, r}(\iota _{\lambda }L) = \sum _{s\ne r}\frac{{{\,\textrm{Tr}\,}}(A_rA_s)}{\zeta _r-\zeta _s}+{{\,\textrm{Tr}\,}}(A_r\Omega ), \end{aligned}$$
(7.50)

and

$$\begin{aligned} H_{2, r}(\iota _{\lambda }L) = {{\,\textrm{Tr}\,}}\left( A_r \left( \sum _{s\ne r}\frac{A_s}{\zeta _r-\zeta _s} +\Omega \right) ^2\right) - {{\,\textrm{Tr}\,}}\left( A_r^2 \left( \sum _{s\ne r}\frac{A_s}{(\zeta _r-\zeta _s)^2}\right) \right) .\nonumber \\ \end{aligned}$$
(7.51)
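The expressions (7.50) and (7.51) are the residues at \(\lambda =\zeta _r\) of \({{\,\textrm{Tr}\,}}(L^2)/2\) and \({{\,\textrm{Tr}\,}}(L^3)/3\), in line with (7.29). The sketch below approximates these residues by the trapezoid rule on a small circle and compares them with the closed forms, for random \(2\times 2\) sample matrices. The radius, the number of sample points and the helper names are assumptions of this illustration.

```python
import cmath
import random

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y, c=1.0):
    """Return x + c*y for square matrices x and y."""
    n = len(x)
    return [[x[i][j] + c * y[i][j] for j in range(n)] for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def rand_matrix(rng, n=2):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(7)
zeta = [0.0, 1.5, -2.0]                 # sample pole positions
A = [rand_matrix(rng) for _ in zeta]
Omega = rand_matrix(rng)

def trace_lax_power(lam, p):
    """Tr(L(lam)^p) for the Lax matrix with residues A_s and constant term Omega."""
    L = add(ZERO, Omega)
    for As, zs in zip(A, zeta):
        L = add(L, As, 1.0 / (lam - zs))
    M = L
    for _ in range(p - 1):
        M = matmul(M, L)
    return trace(M)

def residue(f, center, radius=0.3, samples=64):
    """Residue of a scalar function at center, via the trapezoid rule on a circle."""
    total = 0.0
    for j in range(samples):
        step = radius * cmath.exp(2j * cmath.pi * j / samples)
        total += f(center + step) * step
    return total / samples

r = 0
B, C2 = add(ZERO, Omega), ZERO
h1_formula = trace(matmul(A[r], Omega))
for s, As in enumerate(A):
    if s != r:
        d = zeta[r] - zeta[s]
        B = add(B, As, 1.0 / d)          # sum_s A_s/(zeta_r - zeta_s) + Omega
        C2 = add(C2, As, 1.0 / d ** 2)   # sum_s A_s/(zeta_r - zeta_s)^2
        h1_formula += trace(matmul(A[r], As)) / d
h2_formula = (trace(matmul(A[r], matmul(B, B)))
              - trace(matmul(matmul(A[r], A[r]), C2)))

h1_defect = abs(residue(lambda lam: trace_lax_power(lam, 2) / 2.0, zeta[r]) - h1_formula)
h2_defect = abs(residue(lambda lam: trace_lax_power(lam, 3) / 3.0, zeta[r]) - h2_formula)
print(h1_defect, h2_defect)
```

The same `residue` helper applied to \({{\,\textrm{Tr}\,}}(L^{k+1})/(k+1)\) gives numerical values of the higher \(H_{k,r}\) without expanding them in closed form.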

Varying \(\mathscr {L}_{1, r}\) and \(\mathscr {L}_{2, r}\) with respect to \(\phi _s\), \(s=1,\dots ,N\) (recalling that \(A_s=\phi _s\,\Lambda _s\,\phi _s^{-1}\)), one can check by direct calculations that the Euler–Lagrange equations give exactly (7.43)-(7.44).

Remark 7.1

The algebraic framework we have used to describe the Lagrangian multiform for the Gaudin model is to a very large extent similar to that used in [14] to construct Lagrangian multiforms of Zakharov-Mikhailov type. Therefore, in hindsight, it is perhaps not so surprising that the Lagrangian

$$\begin{aligned} \mathscr {L}_{1, r} = \sum _{s=1}^N{{\,\textrm{Tr}\,}}\left( \Lambda _s \phi _s^{-1}\partial _{t_1^r}\phi _s\right) - \sum _{s\ne r}\frac{{{\,\textrm{Tr}\,}}(A_rA_s)}{\zeta _r-\zeta _s} - {{\,\textrm{Tr}\,}}(A_r\Omega ), \end{aligned}$$
(7.52)

appears to be the direct analogue in the finite-dimensional case of the Zakharov-Mikhailov Lagrangians which describe integrable field theories with rational Lax matrices [34]. It is a rather satisfying outcome that we have unravelled the unifying structure underlying such Lagrangians, whether in finite or infinite dimensions. They are all connected to Lie dialgebras which control the structure of their kinetic part and tell us which potentials to include (invariant functions on \(\mathfrak {g}^*\)). Note that in [35], a very similar Lagrangian, their Equation (24), was constructed by a completely different method: an adaptation of the idea of 4d Chern-Simons theory, see [36] and references therein, and of the construction in [37] to the case of a BF theory in 3d. This suggests the tantalising direction of deriving our Lagrangian multiforms from an appropriately adapted BF theory. This could perhaps offer an interpretation for the appearance of Lie dialgebras from this point of view, instead of introducing them ad hoc as we do in the present paper.

We know from the general theory that the closure relation \(\textrm{d}\mathscr {L}= 0\) holds on shell. This implies

$$\begin{aligned} \partial _{t_j^s} \mathscr {L}_{k, r} - \partial _{t_k^r} \mathscr {L}_{j, s} = 0, \end{aligned}$$
(7.53)

for all possible combinations of j, k and r, s. As we know, the kinetic and potential contributions give zero separately in each case. Let us illustrate the main steps here for \(k=1\), \(j=2\) and \(r\ne s\) in (7.53), the left-hand side of which will then read

$$\begin{aligned}&\sum _{s^\prime =1}^N \partial _{t_2^s} {{\,\textrm{Tr}\,}}\left( \Lambda _{s^\prime } \phi _{s^\prime }^{-1}\partial _{t_1^r}\phi _{s^\prime }\right) - \sum _{s^\prime =1}^N \partial _{t_1^r} {{\,\textrm{Tr}\,}}\left( \Lambda _{s^\prime } \phi _{s^\prime }^{-1}\partial _{t_2^s}\phi _{s^\prime }\right) \\&\quad - \partial _{t_2^s} H_{1, r}(\iota _\lambda L) + \partial _{t_1^r} H_{2, s}(\iota _\lambda L)\,. \end{aligned}$$

Using the equations of motion, we have

$$\begin{aligned}&\partial _{t_2^s} H_{1, r}(\iota _{\lambda }L)\\&= \sum _{s^\prime \ne r} \frac{1}{\zeta _r-\zeta _{s^\prime }} {{\,\textrm{Tr}\,}}\left( \left( - \frac{[A_s^2\,,\,A_r]}{(\zeta _s-\zeta _r)^2} + \sum _{{s^{\prime \prime }} \ne s} \frac{[A_s A_{s^{\prime \prime }} + A_{s^{\prime \prime }}A_s, A_r]}{(\zeta _s-\zeta _r)(\zeta _s-\zeta _{s^{\prime \prime }})} + \frac{[A_s \Omega + \Omega A_s\,,\,A_r]}{\zeta _s-\zeta _r} \right) A_{s^\prime } \right) \\&~~+ \sum _{\begin{array}{c} s^\prime \ne r\\ s^\prime \ne s \end{array}} \frac{1}{\zeta _r-\zeta _{s^\prime }} {{\,\textrm{Tr}\,}}\left( A_r \left( - \frac{[A_s^2\,,\,A_{s^\prime }]}{(\zeta _s-\zeta _{s^\prime })^2} + \sum _{{s^{\prime \prime }} \ne s} \frac{[A_s A_{s^{\prime \prime }} + A_{s^{\prime \prime }}A_s, A_{s^\prime }]}{(\zeta _s-\zeta _{s^\prime })(\zeta _s-\zeta _{s^{\prime \prime }})} + \frac{[A_s \Omega + \Omega A_s\,,\,A_{s^\prime }]}{\zeta _s-\zeta _{s^\prime }} \right) \right) \\&~~+ \frac{1}{\zeta _r-\zeta _s} {{\,\textrm{Tr}\,}}\left( A_r \left( \sum _{s^\prime \ne s} \frac{[A_s^2\,,\,A_{s^\prime }]}{(\zeta _s-\zeta _{s^\prime })^2} -\sum _{s^\prime \ne s}\sum _{{s^{\prime \prime }}\ne s}\frac{[A_s\,,\,A_{s^\prime } A_{s^{\prime \prime }}]}{(\zeta _s-\zeta _{s^\prime })(\zeta _s-\zeta _{s^{\prime \prime }})}\right. \right. \\&\quad \left. \left. - \sum _{s^\prime \ne s} \frac{[A_s\,,\, A_{s^\prime } \Omega + \Omega A_{s^\prime }]}{\zeta _s-\zeta _{s^\prime }} - [A_s\,,\, \Omega ^2] \right) \right) \\&~~+{{\,\textrm{Tr}\,}}\left( \left( - \frac{[A_s^2\,,\,A_r]}{(\zeta _s-\zeta _r)^2} + \sum _{{s^{\prime }} \ne s} \frac{[A_s A_{s^{\prime }} + A_{s^{\prime }}A_s, A_r]}{(\zeta _s-\zeta _r)(\zeta _s-\zeta _{s^{\prime }})} + \frac{[A_s \Omega + \Omega A_s\,,\,A_r]}{\zeta _s-\zeta _r} \right) \Omega \right) \,. \end{aligned}$$

This is seen to add up to zero by assembling the terms of the same nature (quartic, cubic or quadratic in A), manipulating the sums, using the \(\textrm{ad}\)-invariance property \({{\,\textrm{Tr}\,}}([A,B]C)={{\,\textrm{Tr}\,}}(A[B,C])\) and the identity

$$\begin{aligned} \frac{1}{(\zeta _r-\zeta _s)(\zeta _r-\zeta _{s^\prime })}+\frac{1}{(\zeta _s-\zeta _{s^\prime })(\zeta _r-\zeta _{s^\prime })} +\frac{1}{(\zeta _s-\zeta _r)(\zeta _s-\zeta _{s^\prime })}=0 . \end{aligned}$$
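Both ingredients can be checked in one line each. The following short verification is included for convenience and assumes only that \({{\,\textrm{Tr}\,}}\) is cyclic (as for the ordinary matrix trace) and that the points \(\zeta _r\), \(\zeta _s\), \(\zeta _{s^\prime }\) are pairwise distinct. For \(\textrm{ad}\)-invariance, cyclicity gives

$$\begin{aligned} {{\,\textrm{Tr}\,}}([A,B]C) = {{\,\textrm{Tr}\,}}(ABC) - {{\,\textrm{Tr}\,}}(BAC) = {{\,\textrm{Tr}\,}}(ABC) - {{\,\textrm{Tr}\,}}(ACB) = {{\,\textrm{Tr}\,}}(A[B,C]), \end{aligned}$$

while bringing the three fractions of the identity over the common denominator \((\zeta _r-\zeta _s)(\zeta _r-\zeta _{s^\prime })(\zeta _s-\zeta _{s^\prime })\) yields

$$\begin{aligned} \frac{(\zeta _s-\zeta _{s^\prime }) + (\zeta _r-\zeta _s) - (\zeta _r-\zeta _{s^\prime })}{(\zeta _r-\zeta _s)(\zeta _r-\zeta _{s^\prime })(\zeta _s-\zeta _{s^\prime })} = 0 . \end{aligned}$$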

Similar calculations give \(\partial _{t_1^r} H_{2, s}(\iota _\lambda L)=0\). For the kinetic terms we have

$$\begin{aligned}&\partial _{t_2^s} \sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( \Lambda _{s^\prime } \phi _{s^\prime }^{-1}\partial _{t_1^r}\phi _{s^\prime }\right) - \partial _{t_1^r} \sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( \Lambda _{s^\prime } \phi _{s^\prime }^{-1}\partial _{t_2^s}\phi _{s^\prime }\right) \\&\quad = \sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( (\partial _{t_2^s}A_{s^\prime }) (\partial _{t_1^r}\phi _{s^\prime }) \phi _{s^\prime }^{-1} \right) - \sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( (\partial _{t_1^r}A_{s^\prime }) (\partial _{t_2^s}\phi _{s^\prime }) \phi _{s^\prime }^{-1} \right) \\&\qquad + \sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( A_{s^\prime } [(\partial _{t_2^s}\phi _{s^\prime })\phi _{s^\prime }^{-1}\,, (\partial _{t_1^r}\phi _{s^\prime })\phi _{s^\prime }^{-1}] \right) \\&\qquad + \sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( A_{s^\prime } \left( (\partial _{t_2^s}\partial _{t_1^r}\phi _{s^\prime })\phi _{s^\prime }^{-1} - (\partial _{t_1^r}\partial _{t_2^s}\phi _{s^\prime })\phi _{s^\prime }^{-1}\right) \right) . \end{aligned}$$

The commutativity of flows ensures that the last term equals zero. Further, using the relation

$$\begin{aligned} \partial _{t_2^s}A_{s^\prime } = [(\partial _{t_2^s} \phi _{s^\prime })\phi _{s^\prime }^{-1}, A_{s^\prime }],~~s^\prime = 1,\dots ,N, \end{aligned}$$
(7.54)

it is easy to see that the first and the third terms cancel each other. Finally, for the second term, using \(\textrm{ad}\)-invariance, (7.54) and the on-shell relations in (7.43) and (7.44), we have

$$\begin{aligned}&\sum _{s^\prime =1}^N {{\,\textrm{Tr}\,}}\left( (\partial _{t_1^r}A_{s^\prime }) (\partial _{t_2^s}\phi _{s^\prime }) \phi _{s^\prime }^{-1} \right) \\&\quad = {{\,\textrm{Tr}\,}}\left( (\partial _{t_1^r}A_r) (\partial _{t_2^s}\phi _r) \phi _r^{-1} \right) + \sum _{s^\prime \ne r} {{\,\textrm{Tr}\,}}\left( (\partial _{t_1^r}A_{s^\prime }) (\partial _{t_2^s}\phi _{s^\prime }) \phi _{s^\prime }^{-1} \right) \\&\quad = -\sum _{s^\prime \ne r} {{\,\textrm{Tr}\,}}\left( \frac{[A_r\,,\,A_{s^\prime }]}{\zeta _r - \zeta _{s^\prime }} (\partial _{t_2^s}\phi _r) \phi _r^{-1} \right) - {{\,\textrm{Tr}\,}}\left( [A_r\,,\,\Omega ] (\partial _{t_2^s}\phi _r) \phi _r^{-1} \right) \\&\qquad + \sum _{s^\prime \ne r} {{\,\textrm{Tr}\,}}\left( \frac{[A_r\,,\,A_{s^\prime }]}{\zeta _r - \zeta _{s^\prime }} (\partial _{t_2^s}\phi _{s^\prime }) \phi _{s^\prime }^{-1} \right) \\&\quad = -\sum _{s^\prime \ne r} \frac{{{\,\textrm{Tr}\,}}(A_{s^\prime } \partial _{t_2^s} A_r)}{\zeta _r - \zeta _{s^\prime }} - {{\,\textrm{Tr}\,}}(\Omega \partial _{t_2^s}A_r) -\sum _{s^\prime \ne r} \frac{{{\,\textrm{Tr}\,}}(A_r \partial _{t_2^s} A_{s^\prime })}{\zeta _r - \zeta _{s^\prime }}\\&\quad =-\partial _{t_2^s} H_{1, r}(\iota _{\lambda }L) \end{aligned}$$

which we previously showed to be zero.
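For completeness, the cancellation of the first and third kinetic terms can be spelled out as follows. This is only a sketch: the symbols \(X_1\) and \(X_2\) are shorthand introduced here for readability, and we assume \(A_{s^\prime } = \phi _{s^\prime }\Lambda _{s^\prime }\phi _{s^\prime }^{-1}\) with \(\Lambda _{s^\prime }\) constant, as the rewriting of the kinetic term above indicates. Setting \(X_1 = (\partial _{t_1^r}\phi _{s^\prime })\phi _{s^\prime }^{-1}\) and \(X_2 = (\partial _{t_2^s}\phi _{s^\prime })\phi _{s^\prime }^{-1}\), and using \(\partial _{t_2^s}(\phi _{s^\prime }^{-1}) = -\phi _{s^\prime }^{-1}(\partial _{t_2^s}\phi _{s^\prime })\phi _{s^\prime }^{-1}\), the product rule reproduces (7.54):

$$\begin{aligned} \partial _{t_2^s}A_{s^\prime } = (\partial _{t_2^s}\phi _{s^\prime })\Lambda _{s^\prime }\phi _{s^\prime }^{-1} + \phi _{s^\prime }\Lambda _{s^\prime }\,\partial _{t_2^s}(\phi _{s^\prime }^{-1}) = X_2 A_{s^\prime } - A_{s^\prime } X_2 = [X_2, A_{s^\prime }] . \end{aligned}$$

Then, for each \(s^\prime \), cyclicity of the trace gives

$$\begin{aligned} {{\,\textrm{Tr}\,}}\left( (\partial _{t_2^s}A_{s^\prime }) X_1\right) = {{\,\textrm{Tr}\,}}\left( [X_2, A_{s^\prime }] X_1\right) = {{\,\textrm{Tr}\,}}\left( A_{s^\prime } X_1 X_2\right) - {{\,\textrm{Tr}\,}}\left( A_{s^\prime } X_2 X_1\right) = -{{\,\textrm{Tr}\,}}\left( A_{s^\prime } [X_2, X_1]\right) , \end{aligned}$$

which is precisely minus the summand of the third term.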

8 Conclusion

In this work, we provided an answer to the problem of constructing all the coefficients of a Lagrangian 1-form for a large class of finite-dimensional integrable systems (any model fitting the Lie dialgebra framework). A reinterpretation of our construction is that we proved that any collection of compatible equations in the Lax form

$$\begin{aligned} \partial _{t_k}L=[R_\pm \nabla H_k(L),L],~~ k=1,\dots ,N, \end{aligned}$$
(8.1)

is variational, by explicitly providing a collection of Lagrangians assembled in a multiform. The closure relation is equivalent to the involutivity of the Hamiltonians \(H_k\). This is a corollary of our stronger result, Theorem 3.3.

We recast our construction in a more general context which makes it clear how it descends from a “free” (phase space) Lagrangian on the cotangent bundle of a Lie group by reduction. This procedure is well known in the Hamiltonian framework and we have explained how it translates into our framework, by exploiting the correspondence between moment maps and Noether currents. This offers a larger perspective on our results. On the one hand, it may lead to the possibility of constructing Lagrangian multiforms for systems of Calogero-Moser type by using reduction ideas appropriately. From the point of view of r-matrices, a strong motivation, and at the same time an anticipated difficulty, is the appearance of dynamical r-matrices in such systems. Comparison with the early work on Calogero-Moser multiforms [5] would be beneficial. On the other hand, it shows that our Lagrangian coefficients have a structure similar to those appearing in so-called geometric actions. The latter can be traced back (at least) to [38,39,40,41] and are concerned with quantisation using Feynman’s path integral in conjunction with coadjoint orbit methods. This interesting connection deserves further investigation.

Of the many models we could have used to illustrate the results in the present work, we chose the open Toda chain and the Gaudin model, two emblematic finite-dimensional integrable systems. The motivation for studying the finite Gaudin model is Vicedo’s construction of a class of non-ultralocal field theories as affine Gaudin models [42]. We very much hope that the present results combined with the approach of [14] and Vicedo’s construction will allow us to overcome the current limitation of Lagrangian multiforms to only ultralocal field theories.