Classical Yang-Baxter equation, Lagrangian multiforms and ultralocal integrable hierarchies

We cast the classical Yang-Baxter equation (CYBE) in a variational context for the first time, by relating it to the theory of Lagrangian multiforms, a framework designed to capture integrability in a variational fashion. This provides a significant connection between Lagrangian multiforms and the CYBE, one of the most fundamental concepts of integrable systems. This is achieved by introducing a generating Lagrangian multiform which depends on a skew-symmetric classical $r$-matrix with spectral parameters. The multiform Euler-Lagrange equations produce a generating Lax equation which yields a generating zero curvature equation. The CYBE plays a role at three levels: 1) It ensures the commutativity of the flows of the generating Lax equation; 2) It ensures that the generating zero curvature equation holds; 3) It implies the closure relation for the generating Lagrangian multiform. The specification of an integrable hierarchy is achieved by fixing certain data: a finite set $S\in CP^1$, a Lie algebra $\mathfrak{g}$, a $\mathfrak{g}$-valued rational function with poles in $S$ and an $r$-matrix. We show how our framework is able to generate a large class of ultralocal integrable hierarchies by providing several known and new examples pertaining to the rational or trigonometric class. These include the Ablowitz-Kaup-Newell-Segur hierarchy, the sine-Gordon (sG) hierarchy and various hierachies related to Zakharov-Mikhailov type models which contain the Faddeev-Reshetikhin (FR) model and recently introduced deformed sigma/Gross-Neveu models as particular cases. The versatility of our method is illustrated by showing how to couple integrable hierarchies together to create new examples of integrable field theories and their hierarchies. We provide two examples: the coupling of the nonlinear Schr\"odinger system to the FR model and the coupling of sG with the anisotropic FR model.


Contents
A profound discovery in the modern theory of integrable systems was that the special partial differential equations originally treated in the seminal works [GGKM, ZS], using what is now known as the Inverse Scattering Method, were also infinite dimensional Hamiltonian systems for which an analog of the Liouville theorem for finite dimensional Hamiltonian systems could be established, [ZF, ZMan]. This allows one, in particular, to see such systems as Hamiltonian field theories. The developments based on these early examples led to the beautiful theory of the classical r-matrix which captures the special Hamiltonian features of these models [Dr1,STS]. An important condition usually required of the r-matrix is that it satisfies the classical Yang-Baxter equation (CYBE) r 12 (λ, µ), r 13 (λ, ν) + r 12 (λ, µ), r 23 (µ, ν) − r 13 (λ, ν), r 32 (ν, µ) = 0. (1.1) It ensures that a certain Poisson bracket defined using r satisfies the Jacobi identity. Another important condition is to decide if r is skew-symmetric or not, i.e. whether or not it satisfies r 12 (λ, µ) = −r 21 (µ, λ) .
This has deep mathematical and physical implications. If the r-matrix is skew-symmetric, the associated field theories are called ultralocal while they are non-ultralocal otherwise. In the present work, we restrict our attention to the ultralocal case. A characteristic feature of integrable field theories is that their equations of motion come in hierarchies. Specifically, any given integrable Hamiltonian field theory has infinitely many conserved charges which can, themselves, be used as Hamiltonians to define flows with respect to the Poisson bracket. Because all the conserved charges Poisson commute amongst themselves, it is possible to impose all these flows simultaneously on the fields of the theory and thus treat the latter as depending on infinitely many times. The collection of equations of motion thus obtained is referred to as an integrable hierarchy. Schematically, for a scalar field theory with field u, there would be a countable number of conserved charges H j , labelled by integers j ≥ 1 say, in involution with respect to a given Poisson bracket, namely {H i , H j } = 0 for every i, j ≥ 1. The hierarchy would then consist of all the equations where we have introduced an infinite number of times t j for j ≥ 1. Among all the conserved charges H j , one of them can be taken to be the Hamiltonian of the integrable field theory one started with. Studying the hierarchy as a whole can reveal much more structure and properties of the initial model. This is of course not a new idea but here we depart from the established point of view in that we want to exploit this idea in a Lagrangian setting.

Integrability in the Lagrangian framework
When turning to the Lagrangian setting, one is immediately faced with the following question: how should integrable hierarchies be captured in the Lagrangian formalism? This question found an answer relatively recently in the theory of Lagrangian multiforms which was introduced in the seminal paper [LN] and has rapidly developed into various direction. More recently, several works cast the original idea into the context of continuous integrable field theories, see [SV,V,SNC,PV,SNC2,CS1,CS2,CS3] for examples of two-dimensional field theories (e.g. Korteweg-de Vries, sine-Gordon and nonlinear Schrödinger) and [SNC2,SNC3] for a threedimensional example (Kadomtsev-Petviashvili). For a two-dimensional field theory, the central object is a differential two-form on an infinite-dimensional space R ∞ parametrised by the infinite collection of times t i of the hierarchy. The coefficients L ij [u] are Lagrangians depending on the fields of the theory, which are collectively denoted by u here for simplicity (even though we are no longer restricting to the case of a single scalar field). For each Lagrangian coefficient L ij [u] we can consider the associated action S ij [u] = R 2 L ij [u]dt i ∧ dt j . Using the differential two-form (1.3) we can succinctly rewrite all these actions as S ij [u] = σ ij L [u], where the integral here is over the two dimensional plane σ ij ≃ R 2 spanned by the coordinates t i and t j in R ∞ . At this point, of course, there is no reason for the field theories described by the actions S ij [u] to belong to the same integrable hierarchy, let alone to produce equations of motion that are integrable! The key new ingredient is to impose a generalised variational principle on the more general action which now also depends on an arbitrary choice of two-dimensional smooth surface σ in R ∞ . Note, in particular, that S ij [u] = S[u, σ ij ]. The generalised variational principle which ties all these theories together is a least action principle for S [u, σ] simultaneously for all smooth surfaces σ. This results in what are called the multiform Euler-Lagrange (EL) equations. These were first derived in [SV] for the two-form case that we consider in this paper. It can be shown [SV, SNC] that they can be written compactly as where d is the usual exterior derivative and δ denotes the variational derivative. In the Lagrangian multiform theory, the above generalised variational principle is complemented by another requirement: on critical points, one also requires that the action be stationary with respect to arbitrary local variations of σ. This gives us the important closure relation on the equations of motion, i.e. on shell dL = 0 . (1.6) Intuitively, requiring criticality of the action for an arbitrary surface is the new feature that encodes variationally the commutativity of the flows known to be a signature of integrability in the Hamiltonian world. Roughly speaking, the connection with (1.2) is that the Lagrangian coefficients L 1j correspond by a Legendre transform to the Hamiltonians H j , with the understanding that the time t 1 plays some preferred role (the "space" variable) and the t j , j ≥ 2 are all the higher times of the hierarchy. The interpretation of all the other Lagrangian coefficients L ij is best obtained by casting the hierarchy as a collection of compatible zero curvature equations involving Lax matrices V j (λ), namely (1.7) for i, j ≥ 1. It is known that all these equations are in fact Hamiltonian, see e.g. [AC], and the case i = 1 corresponds to (1.2). One of the main points of the present work is that they are also variational with Lagrangian L ij . It is important to realise that the multiform EL equations are largely overdetermined equations for the coefficients L ij . Part of these equations impose restrictions on the allowed coefficients, the idea being that they enforce the integrability of the corresponding theories. The rest consists of standard EL equations associated to these coefficients and give the equations of motion of the integrable hierarchy.

Motivating example: Ablowitz-Kaup-Newell-Segur hierarchy
In [CS3], on the example of the Ablowitz-Kaup-Newell-Segur (AKNS) hierarchy, the notion of Lagrangian multiform was successfully combined with the idea of "compounding hierarchies" introduced in the Lagrangian framework in [N]. This naturally leads to working with generating functions when dealing with hierarchies. The key object was what we can call a generating Lagrangian multiform. The simple idea is to organise the Lagrangian coefficients L ij of the 2-form (1.3) into a generating series involving formal (spectral) parameters (1.8) It is clear that there is a one-to-one corresponding between L [u] and L (λ, µ) where from the latter, one can extract the coefficients by the formula where res λ returns the coefficient of λ −1 in the series expansion, and similarly for res µ . One advantage of working with generating series such as (1.8) stems from the usefulness of generating functions in general: properties of their coefficients are more easily studied and derived from those of the generating function. In our context, this means that we can manipulate an integrable hierarchy as a whole instead of studying each Lagrangian coefficient L ij individually. Originally, the latter approach was used in the sense that only a given "starting" Lagrangian coefficient was known, say L 12 , and one would try to build the higher coefficients L ij so as to obtain a consistent Lagrangian multiform. Methods to compute these coefficients were introduced for instance in [V,SNC2]. Although the recursive algorithm could be repeated in principle, in practice this is hard to implement beyond the first few coefficients. Moreover, the Lagrangians L ij obtained in this way usually contain derivatives with respect to t 1 or t 2 (the times associated with L 12 ). These are not natural times from the point of L ij : this is the so-called problem of "alien derivatives" which was identified and explained in [V]. Having a generating Lagrangian multiform circumvents these issues. This will be elaborated upon in the examples. For AKNS, the generating Lagrangian multiform can be written as [CS3] L (λ, µ) with Q(λ) = −iφ(λ)σ 3 φ(λ) −1 , φ(λ) being a formal series in 1/λ whose coefficients contain the dynamical variables with constant term equal to the identity. The operator D λ = j≥0 λ −j−1 ∂ t j is a formal collection of all the AKNS flows ∂ t j , and similarly for D µ . The generating Lagrangian multiform (1.9) generates all the coefficients L ij systematically and without the problem of alien derivatives, reproducing the first few coefficients which had been constructed in [SNC,SNC2,PV], as it should. Its multiform EL equations yield the defining equations of the AKNS hierarchy as discussed by Flaschka-Newell-Ratiu (FNR) in [FNR], namely 1 where Q(λ) = ∞ j=0 Q j λ −j and Q (i) (λ) = i j=0 Q j λ i−j and Q 0 = −iσ 3 . More precisely, the multiform EL equations for (1.9) produce the equations (1.10) in generating form where we used the formal series identity (1.12)

Motivation, main results and plan
Motivation: The present work is motivated by the following observations made on the generating Lagrangian multiform (1.9) and the generating FNR equations (1.11): 1. The potential term in L (λ, µ) has a characteristic form which can be identified as the expression Tr 12 (r 12 (λ, µ)Q 1 (λ)Q 2 (µ)) where r 12 (λ, µ) = P 12 µ−λ is the rational r-matrix, known to describe the Hamiltonian structure of the AKNS hierarchy. One could then imagine replacing this particular r-matrix with another skew-symmetric r-matrix. This leads to the question of whether the nice properties of the generating Lagrangian multiform still hold. One of our main results is that this is the case by virtue of the CYBE. Correspondingly, the RHS of (1.11) can also be written as [Tr 2 r 12 (λ, µ)Q 2 (µ), Q 1 (λ)] and the same generalisation can be contemplated.
2. The choice of expanding all the objects as formal series in 1/λ and 1/µ is a sign that one is performing an expansion around the point at infinity. However, nothing would prevent us from considering other points in CP 1 .
3. The Pauli matrix σ 3 appearing in (1.9) is a special choice of constant element in the underlying loop algebra of sl 2 and the form of Q(λ) indicates that one is building a phase space for the field theory as a (co)adjoint orbit around this particular element. One could consider other elements in the loop algebra to construct different phase spaces and hence different models. Moreover, one could also consider other Lie algebras than sl 2 .
The careful implementation of these natural observations requires some machinery which is presented Section 2. In a first instance, the reader may choose to read the rest of this introduction containing a summary of the formalism and results, and go directly to Section 3. The idea is to substitute the loop algebra of sl 2 with a much more versatile structure: the Lie algebra of g-valued adèles associated with a Lie algebra g. This algebra is presented in [STS2] as the relevant structure to implement the second observation above. By doing so in our context, we build a "universal" generating Lagrangian multiform which is capable of describing a large class of ultralocal integrable hierarchies and we provide a large variety of examples.
In a nutshell, for a matrix Lie algebra g, the Lie algebra of g-valued adèles is defined as where λ a = λ − a for a ∈ C and λ ∞ = 1 λ are the local series expansion parameters. An element X(λ) = (X a (λ a )) a∈CP 1 of this algebra consist of tuples with all but finitely many of the formal Laurent series X a (λ a ) ∈ g ⊗ C((λ a )) being Taylor series in λ a , i.e. there exists a finite subset S ⊂ CP 1 such that X a (λ a ) ∈ g ⊗ C λ a for every a ∈ C \ S. Let R λ (g) denote the Lie algebra of g-valued rational functions in the formal variable λ and define the map where ι λa f ∈ g ⊗ C((λ a )) is the Laurent expansion of f ∈ R λ (g) at a ∈ CP 1 . Using certain solutions of the CYBE, it is possible to obtain a direct sum decomposition of this Lie algebra into two maximally isotropic Lie subalgebras (1.14) We can also define a group A + λ (G) associated to A + λ (g). If µ is another formal variable, we can work with double formal series locally in λ a and µ b , a, b ∈ CP 1 and consider tuples of the form X(λ, µ) = (X a,b (λ a , µ b )) a,b∈CP 1 .
Thanks to this adèlic framework, we can retain the power of the algebraic formulation of formal power series while working locally around arbitrary points in CP 1 . We introduce the following generalisation of (1.9) which realises the above three observations where the kinetic and potential terms are given by a∈CP 1 is the collection of principal parts of a g-valued rational function F (λ) ∈ R λ (g). In terms of components of the tuples, we have for every a, b ∈ CP 1 . The operator D λ := (D λa ) a∈CP 1 denotes the CP 1 -tuple of formal operators D λa which contain the partial differential operators ∂ t a n (see (3.6)). The times t a n will be the times of the integrable hierarchies we describe. The rational function r 12 (λ, µ) is the classical r-matrix defining the type of ultralocal hierarchies we consider (e.g. rational or trigonometric) and corresponds to the r-matrix yielding the decomposition (1.14).
Main results: 1. We show that the generating Lax equation (1.17) is variational. It arises as the multiform EL equations associated to our generating Lagrangian multiform (1.15). This is the content of Theorem 3.12. This generalises the analogous result first obtained in [SNC] in the context of the Zakharov-Mikhailov models [ZM1]. The generating Lax equation plays here for field theories a role similar to the traditional Lax equation for finite dimensional systems. This is explained in Section 3.1. We relate it to a generating zero curvature which is shown to hold as a consequence of the CYBE for the r-matrix appearing in (1.15).
2. We relate for the first time the CYBE with the relatively recent notion of Lagrangian multiforms. The closure relation (1.6) in generating form, i.e. the closure relation for (1.15), is shown to be a consequence of the CYBE for the r-matrix appearing in (1.15), see Theorem 3.13. On the one hand, this provides a variational interpretation of the CYBE, a fundamental equation that has only been introduced and studied from a Hamiltonian point of view so far. On the other hand, given the importance of the CYBE as a criterion for classical integrability, this further establishes the Lagrangian multiform approach as a variational criterion for integrability.
3. Specialising the generating Lagrangian multiform (1.15), we recover known examples of integrable hierarchies and produce several new examples. We also introduce an easy method for coupling hierarchies together.
Plan of the paper: In Section 2, we introduce the Lie algebra of g-valued adèles and establish its decomposition into two complementary maximal isotropic Lie subalgebras which allows us to introduce the classical r-matrix of interest via the corresponding projectors onto the Lie subalgebras. This generalises to the adèles case the well-known interpretation of a classical rmatrix as a difference of projectors. This is done explicitely for the rational and trigonometric cases. Section 3 introduces the main elements of our framework: we state the generalisation of the generating FNR equations (1.11), which we call generating Lax equation, taking into account the above observations. Its properties are directly connected to the CYBE. Then we introduce the generating Lagrangian multiform that produces the generating Lax equation as its multiform EL equations. Again, its properties, in particular the closure relation, are shown to be a direct consequence of the CYBE. The subsequent Sections 4 to 6 are devoted to examples. Several were known and are used to show how our framework contains them naturally, e.g. the AKNS hierarchy and the sine-Gordon hierarchy. For the latter example, we explain in detail how the first few known Lagrangian coefficients are recovered but without the problem of alien derivatives. Other examples, such as the trigonometric Zakharov-Mikhailov hierarchy, are new. For the recently introduced deformed sigma/Gross-Neveu models, the new feature brought in by our construction is that they are naturally embedded into an integrable hierarchy. Various conclusions and discussions are presented in Section 8. Finally, we recall in an appendix the relationship between the trigonometric r-matrix used in this paper and the more familiar rmatrix of the sine-Gordon model.
2 Lie algebra of g-valued adèles 2.1 General setup Let N ∈ Z ≥1 and consider either the Lie algebra gl N of all N × N or its Lie subalgebra sl N of traceless N × N matrices. We will treat both of these cases in parallel, using the common notation g throughout. The generalisation to other matrix Lie algebras is straightforward but for simplicity we shall restrict to these two cases. We also denote by G the associated Lie group which corresponds either to the general linear group GL N of invertible N × N matrices or to its Lie subgroup SL N of matrices with determinant 1. We use the trace Tr : gl N → C to endow the Lie algebra g with the non-degenerate invariant symmetric bilinear form g × g → C given by (X, Y ) → Tr(XY ). Let P 12 be the tensor Casimir of g with the property that Tr 2 (P 12 X 2 ) = X for any X ∈ g. Explicitly, for gl N it is given by . . , N is the standard basis of gl N . Similarly, for sl N we can write P 12 = a I a ⊗ I a where {I a } and {I a } are dual bases of sl N with respect to the above bilinear form. Let λ be a formal variable. For any a ∈ C we define the formal local coordinate around a as λ a := λ − a and to the point at infinity we associate the formal local coordinate λ ∞ := λ −1 . We consider the Lie algebra of g-valued adèles defined as Its elements consist of tuples X(λ) = (X a (λ a )) a∈CP 1 with all but finitely many of the formal Laurent series X a (λ a ) ∈ g ⊗ C((λ a )) being Taylor series, i.e. there exists a finite subset S ⊂ CP 1 such that X a (λ a ) ∈ g ⊗ C λ a for every a ∈ C \ S. The Lie bracket of two elements X(λ) = (X a (λ a )) a∈CP 1 and Y Let R λ denote the algebra of rational functions in the formal variable λ. The Laurent expansion of a rational function f ∈ R λ at any a ∈ CP 1 defines a homomorphism We will consider two possible non-degenerate invariant bilinear forms on the Lie algebra for k = 0 and k = −1, defined as for any X(λ) = (X a (λ a )) a∈CP 1 and Y (λ) = (Y a (λ a )) a∈CP 1 . Strictly speaking, the rational function λ k on the right hand side of (2.2b) should be expanded at a ∈ CP 1 , namely we should write ι λa λ k instead of λ k . In order to simplify the notation, such expansions will always be implicit when taking residues. Here, for any a ∈ CP 1 , the residue res λ a : C((λ a ))dλ a → C returns the coefficient of λ −1 a dλ a . For computing the residue at infinity we note that dλ = −λ −2 ∞ dλ ∞ . Note that only finitely many terms in the sum in (2.2b) are non-zero by definition of A λ (g).
Let R λ (g) := g ⊗ R λ denote the Lie algebra of g-valued rational functions in the formal variable λ. We have an embedding of Lie algebras where ι λa f ∈ g ⊗ C((λ a )) is the Laurent expansion of f ∈ R λ (g) at a ∈ CP 1 in the second tensor factor, as in (2.1). The Lie subalgebra ι λ R λ (g) ⊂ A λ (g) is maximally isotropic with respect to ·, · k , for any k ∈ Z, by the strong residue theorem; see for instance [Ta,Corollary 1].
In the remainder of this section we will describe two possible complementary Lie algebras to ι λ R λ (g) in A λ (g), which are maximally isotropic with respect to ·, · 0 and ·, · −1 , respectively. These two main examples, which can be found for instance in [Dr2,Example 4], correspond to the rational r-matrix and the trigonometric r-matrix, respectively. Notation We will generally use boldface to denote CP 1 -tuples. For instance, given any n ∈ Z we will write λ n X(λ) for the element (λ n a X a (λ a )) a∈CP 1 ∈ A λ (g) of the Lie algebra of gvalued adèles. More generally, we would write λ n X(λ)dλ as a shorthand for the CP 1 -tuple (λ n a X a (λ a )dλ a ) a∈CP 1 . Note, crucially, that although dλ a = dλ for all a ∈ C, we have dλ ∞ = −λ −2 dλ for the point at infinity. Therefore the two expressions λ n X(λ)dλ and λ n X(λ)dλ subtly differ only in the component at infinity. If µ is another formal variable then µ will denote a separate CP 1 -tuple carrying an independent label b ∈ CP 1 . For instance, we would have for any X(λ) ∈ A λ (g) and Y (µ) ∈ A µ (g). We will make use of such notation with multiple formal variables extensively from Section 3 onwards.

Rational r-matrix
Throughout this section we fix the choice k = 0 in the bilinear form (2.2). Consider the Lie subalgebra of g-valued integral adèles Note that we have excluded the constant term from the Taylor series at infinity. We shall also need the corresponding group (2.5) where in the GL N case G a consists of all invertible N × N matrices with entries in C λ a while G ∞ consists of all invertible N × N matrices with off-diagonal entries in λ ∞ C λ ∞ and diagonal entries in 1 + λ ∞ C λ ∞ . In the SL N case the groups G a for all a ∈ CP 1 are defined in the same way but with the added condition that the matrices are of determinant 1. For later practical purposes, it is convenient to collect the following notations in a definition.
Definition 2.1. Let a ∈ C and X a (λ a ) ∈ g ⊗ C((λ a )) be a Laurent series in λ a with coefficient in g. We shall use the notation to represent the pole part of X a (λ a ). Similarly, for X ∞ (λ ∞ ) ∈ g ⊗ C((λ ∞ )) = g ⊗ C((λ −1 )), we denote by (2.6b) the pole part of X ∞ (λ ∞ ). Note that the constant term in λ ∞ is included around infinity.
The Lie subalgebra A rat λ (g) ⊂ A λ (g) is clearly maximally isotropic with respect to the bilinear form ·, · 0 defined in (2.2). Here we made use of the fact that the constant term was excluded from the Taylor series at infinity in the definition (2.4). It follows that the Lie algebra A λ (g) decomposes as a direct sum of vector spaces into complementary Lagrangian (i.e. maximal isotropic) Lie subalgebras. Let π rat ± denote the projections onto A rat λ (g) and ι λ R λ (g), respectively, relative to (2.7).
Definition 2.2 (Rational r-matrix). Recall the notation P 12 for the tensor Casimir of g from Section 2.1. The rational r-matrix is defined as the following rational function of the formal variables λ and µ: r rat 12 (λ, µ) = P 12 µ − λ . (2.8) As is well-known, it is skew-symmetric: r rat 12 (λ, µ) = −r rat 21 (µ, λ). The following result shows that its known connection to projectors associated to the decomposition of a Lie algebra into isotropic Lie subalgebras extends to the present adèles setting.
Proposition 2.3. For any X(λ) ∈ A λ (g), its projections onto the complementary subalgebras A rat λ (g) and ι λ R λ (g) relative to the direct sum decomposition (2.7) are given respectively by π rat + X(λ) = (π rat + X) a (λ a ) a∈CP 1 and π rat − X(λ) = (π rat − X) a (λ a ) a∈CP 1 where Proof. Let X(λ) ∈ A λ (g). We consider, to begin with, its projection onto ι λ R λ (g). The gvalued rational function in R λ (g) constructed out of the pole parts of the collection of Laurent series in X(λ) is given by where in the first equality we took the trace and split the term at b = ∞ from the rest of the sum over b ∈ CP 1 . The expression (2.9b) is then obtained by taking the Laurent series expansion of this rational function at each a ∈ CP 1 , corresponding to applying the map (2.3). Consider now the projection of X(λ) ∈ A λ (g) onto A rat λ (g). Note that for any a ∈ C we have b∈CP 1 If X(λ) ∈ ι λ R λ (g), say X(λ) = ι λ f (λ) for some f (λ) ∈ R λ (g), then the above vanishes at each order in the λ a -expansion by the residue theorem. Indeed, the coefficient of λ n a is given by the sum of all the residues of (µ − a) −n−1 f (µ)dµ. On the other hand, if X(λ) ∈ A rat λ (g) then the only term contributing to the sum over b ∈ CP 1 is the term for b = a which is equal to X a (λ a ). The same statements hold for a = ∞ and hence the result follows.
Define the linear operator r rat := π rat + − π rat − . It follows from Proposition 2.3 that its kernel is given by (2.10) The kernel of the identity operator id = π rat + + π rat − is similarly given by an expansion of zero (see e.g. [LL,Chap. 2 a,b∈CP 1 = P 12 δ ab δ(λ a , µ a )dµ a a,b∈CP 1 where we introduced δ aa = 1 for any a ∈ CP 1 and δ ab = 0 for any a, b ∈ CP 1 with a = b, while we defined δ(λ a , µ a ) := n∈Z λ n a µ −n−1 a . (2.11) Lemma 2.4. Let X(µ) = X a (µ a ) a∈CP 1 ∈ A µ (g) with X a (µ a ) = ∞ n=−Na X a n µ n a for some N a ∈ Z, where N a > 0 for finitely many a ∈ CP 1 . For any a ∈ C we have rat − and at infinity we have Proof. First, let a ∈ C. Then we have where in the second equality we changed variable from s to r = s + n in the second sum and in the second line we changed the order of the sums. Consider now the point at infinity. We have where in the second equality we changed variables s = r − n as before and in the second line we changed the order of the sum.

Trigonometric r-matrix
Throughout this section we will choose k = −1 in the bilinear form (2.2). We shall also make use of the standard nilpotent subalgebras n ± and Borel subalgebras b ± of g. Explicitly, n + (resp. n − ) is spanned by E ij for i < j (resp. i > j). In the gl N The Cartan subalgebra h is spanned by E ii for i = 0, . . . , N in the gl N case and by E ii − E jj for i < j in the sl N case. We have the direct sum decompositions b ± = h ⊕ n ± . We shall also make use of the corresponding subgroups N ± , B ± and H in G.
For GL N these are the groups of unipotent upper/lower-triangular N × N matrices, invertible upper/lower-triangular N × N matrices and invertible diagonal N × N matrices, respectively. For SL N we add the condition that the matrices are of determinant 1. Recall the notation P 12 for the tensor Casimir of g from Section 2.1. We can split this into three parts as P 12 = P − 12 + P 0 12 + P + 12 where P ± 12 ∈ n ± ⊗ n ∓ and P 0 12 ∈ h ⊗ h. Explicitly, in the gl N case these read For sl N the expression for P 0 12 is given in terms of dual bases {u i } and {u i } of the Cartan subalgebra h with respect to the trace bilinear form as P 0 12 = N −1 i=1 u i ⊗ u i . We note that P + 21 = P − 12 , P 0 21 = P 0 12 and P 21 = P 12 . We also define the corresponding projectors P ± : g → n ± and P 0 : g → h onto the nilpotent Lie subalgebras n ± and the Cartan subalgebra h, respectively, given for any X ∈ g as P ± X := Tr 2 (P ± 12 X 2 ), P 0 X := Tr 2 (P 0 12 X 2 ), so that id g = P − + P 0 + P + .
In the trigonometric setting, the role of the Lie subalgebra A rat λ (g) ⊂ A λ (g) in (2.4) will be played by the following alternative Lie subalgebra We shall also need the corresponding group A trig λ (G) defined as follows.
In the GL N case we let B + denote the group of all invertible N × N matrices with entries below the diagonal in λC λ and entries on or above the diagonal in C λ . Likewise, we let B − be the group of all invertible N × N matrices with entries on or below the diagonal in C λ ∞ and entries above the diagonal in λ ∞ C λ ∞ . Concretely, an element of B + can be expanded as a Taylor series g(λ) = ∞ n=0 g n λ n with g 0 upper triangular and g n ∈ gl N for n ≥ 1, while an element of B − is a Taylor series h(λ ∞ ) = ∞ n=0 h n λ n ∞ with h 0 lower triangular and h n ∈ gl N for n ≥ 1. As usual, in the SL N case we define the subgroups B ± as in the GL N case but with the added condition that the matrices are of determinant 1. We then set where the first factor is the subgroup B 0,∞ λ (GL N ) ⊂ B + × B − consisting of pairs of Taylor series g 0 (λ) = ∞ n=0 g 0 n λ n and g ∞ (λ ∞ ) = ∞ n=0 g ∞ n λ n ∞ with g 0 n , g ∞ n ∈ gl N for all n ≥ 1 but where the upper triangular matrix g 0 0 and the lower triangular matrix g ∞ 0 are subject to the constraint P 0 g 0 0 = (P 0 g ∞ 0 ) −1 . Note that for consistency we should really keep denoting the local coordinate at the origin as λ 0 , following the general notation introduced in Section 2.1. However, since λ 0 is nothing but λ, we will most often prefer to write the local coordinate at the origin simply as λ, rather than use the more cumbersome notation λ 0 .
It will be convenient in what follows to introduce slightly different notions of pole parts of Laurent series at the origin and infinity in the trigonometric case. As they are important in practical calculations, we gather them in the following definition.
Proof. To see that A trig λ (g) is isotropic with respect to the bilinear form ·, · −1 , let X(λ), Y (λ) ∈ A trig λ (g) be arbitrary and consider the pairing X(λ), Y (λ) −1 as given in (2.2b). There are no contributions from any a ∈ C × . The only contributions come from 0 and ∞, which read In the first equality we wrote X 0 (λ) = ∞ n=0 X 0 n λ n , X ∞ (λ ∞ ) = ∞ n=0 X ∞ n λ n ∞ and similarly for Y 0 (λ) and Y ∞ (λ ∞ ). The second equality above follows from the fact that X 0 0 , Y 0 0 ∈ b + and X ∞ 0 , Y ∞ 0 ∈ b − and the last step makes use of the conditions in the definition of B 0,∞ λ (g) that In order to show that A trig λ (g) is maximally isotropic it suffices to prove the second statement, namely that we have the direct sum decomposition of vector spaces as in (2.15).
To any X(λ) ∈ A λ (g) we associate the rational function in R λ (g). Consider the element X(λ) = ( X a (λ a )) a∈CP 1 defined by for every a ∈ CP 1 . We have X a (λ a ) ∈ g ⊗ C λ a for every a ∈ CP 1 . But more precisely, noting that for every b ∈ C × , we have, in fact, X 0 (λ 0 ) ∈ b + ⊕ g ⊗ λC λ whose leading term in b + is given by Moreover, comparing the Cartan components of (2.17) and (2.18) we see that these are opposite.
Hence we conclude X(λ) ∈ A trig λ (g). In other words, gives the desired decomposition of a general element X(λ) ∈ A λ (g) as in (2.15). This decomposition is unique since any element which belongs to both A trig λ (g) and ι λ R λ (g) must vanish. Indeed, suppose f (λ) ∈ R λ (g) is such that ι λ f (λ) ∈ A trig λ (g). Then it is clear from the definition of A trig λ (g) in (2.12) that f (λ) cannot be singular at any point in CP 1 and so must be constant. But then it follows from the definition of B 0,∞ λ (g) that this constant must in fact be zero.
Definition 2.7 (Trigonometric r-matrix). The trigonometric r-matrix is defined as the following function of two formal variables λ and µ: It is skew-symmetric, namely we have r trig 21 (µ, λ) = −r trig 12 (λ, µ). It provides the trigonometric counterpart of the kernel (2.10) for the choice of complement (2.12). Indeed, we have the following analogue of Proposition 2.3 in the trigonometric case.
Proposition 2.8. For any X(λ) ∈ A λ (g), its projections onto the complementary subalgebras A trig λ (g) and ι λ R λ (g) relative to the direct sum decomposition (2.15) are given respectively by Proof. Given any X(λ) ∈ A λ (g), consider the g-valued rational function We compute the residues at each b ∈ C × and then at the origin and infinity. Firstly, for the residue at b ∈ C × we find X b (λ b ) trig − . For the residue at the origin we find X 0 (λ) trig − and, likewise, for the residue at infinity we find where in the first expression we are using the pole part X ∞ (λ ∞ ) rat − ∈ g ⊗ C[λ −1 ∞ ] defined in (2.6b) of a Laurent series at infinity, and in the second expression we are using the other notion of pole part X ∞ (λ ∞ ) trig − introduced above in (2.14b). Putting the above together we conclude that f X (λ) is the rational function (2.16) used in the proof of Proposition 2.6. By construction using the definition (2.14c), so the sum over b ∈ C × on the right hand side of (2.16) vanishes. On the other hand, , it follows that the remaining two terms in (2.16) cancel. So we have shown that π trig − X(λ) = 0 for any X(λ) ∈ A trig λ (g). On the other hand, suppose now that X(λ) = ι λ f (λ) for some f (λ) ∈ R λ (g). If the latter has a pole at some a ∈ C × then its pole part there is given by X a (λ a ) rat − . If it has a pole at the origin then its pole part there is equal to is given by the pole part X 0 (λ) rat − at the origin plus (P − + 1 2 P 0 )X 0 0 where X 0 0 is given here by the value at the origin of all the other pole parts of f (λ). This is why we must subtract the latter from X 0 (λ) trig − in (2.21) to be left only with the desired pole part at the origin. Finally, the pole part of f (λ) at infinity is given by Indeed, the pole part at infinity should contain the constant term but X ∞ (λ ∞ ) trig − only contains part of it. The remaining part is precisely the piece added in (2.22). It now follows that the expression on the right hand side of (2.16) built from X(λ) = ι λ f (λ) coincides exactly with the partial fraction decomposition of f (λ). This establishes that π trig − X(λ) = X(λ) for any X(λ) ∈ ι λ f (λ). In other words, we have therefore shown that π trig − is indeed the projection onto ι λ R λ (g) along A trig λ (g). It remains to consider π trig + . For any X(λ) ∈ A λ (g) and a ∈ C we have If X(λ) ∈ ι λ R λ (g) then the first term on the right hand side vanishes by the residue theorem and the second term likewise at each order in the λ a -expansion. If instead we consider a = ∞ then but both terms vanish once again by the residue theorem if X(λ) ∈ ι λ R λ (g). So we deduce that π trig + X(λ) = 0 for every X(λ) ∈ ι λ R λ (g). Suppose now that X(λ) ∈ A trig λ (g). The first term on the right hand side (2.23) gets a contribution only from the terms b = 0 and b = ∞, which read The second equality above follows since by assumption we have X 0 0 ∈ b + so that P − X 0 0 = 0 and also P 0 X 0 0 = −P 0 X ∞ 0 . The third equality also follows since by assumption X ∞ 0 ∈ b − . The second sum in (2.23) is just as in the rational case, however since the series at infinity now contains a constant term we get a contribution to the sum over b ∈ CP 1 from both b = a and b = ∞, yielding X a (λ a ) − X ∞ 0 . So in total, we deduce that (π trig + X) a (λ a ) = X a (λ a ) for every a ∈ C. Consider now the case a = ∞. The first term on the right hand side of (2.24) is again equal We can now define the linear operator r trig := π trig + − π trig − . It follows from Proposition 2.8 that its kernel reads Moreover, the kernel of the identity operator id = π trig + + π trig − is similarly given by an expansion of zero since (ι µ b ι λa − ι λa ι µ b )r trig 12 (λ, µ)µ −1 dµ a,b∈CP 1 = P 12 δ ab δ(λ a , µ a )dµ a a,b∈CP 1 using the same notation δ ab and δ(λ, µ) as in the rational case.
The following is the analogue of Lemma 2.4 in the trigonometric case.
Lemma 2.9. Let X(µ) = X a (µ a ) a∈CP 1 ∈ A µ (g) with X a (µ a ) = ∞ n=−Na X a n µ n a for some N a ∈ Z, where N a > 0 for finitely many a ∈ CP 1 . For any a ∈ C we have while at infinity we have In the third equality we split the double sum into two terms, containing µ a and a respectively from the first factor. We changed variable from s to r = s + n + 1 in the first and from s to r = s + n in the second, and then changed the order of the two double sums. It remains to note that r−1 n=−Na λ n−r a X a n + a r n=−Na and that this is equal to −X a r when evaluated at λ = 0. The result at a ∈ C × now follows by definition (2.14c) of the pole part at a.
At the origin we have In the second equality we changed variable in the double sum from s to r = n + s and then changed the order of the two sums. We have also added the term r = −N 0 in this double sum since this term vanishes due to the range in the sum over n being empty. The last equality uses the definition (2.14a). Note that the result at the origin coincides with the result obtained above for a ∈ C × but taken at a = 0. This is not completely obvious since the definitions of the pole parts (2.14a) and (2.14c) at 0 and a generic point a ∈ C × are different. Likewise, at infinity we have In the second equality we changed variable in the double sum from s to r = n + s and then changed the order of the two sums. The last equality uses (2.14b).

Generating Lagrangian multiform and CYBE
In this section we will treat uniformly both the rational and trigonometric cases discussed in Section 2.2 and Section 2.3, respectively. More precisely, we shall work with the Lie algebra of g-valued adèles A λ (g) equipped with the bilinear form (2.2) with either k = 0 or k = −1. The corresponding vector space direct sum decompositions (2.7) and (2.15) will be denoted by where A + λ (g) stands for the rational Lie subalgebra A rat λ (g) when k = 0 and the trigonometric Lie subalgebra A trig λ (g) when k = −1. Correspondingly, we shall use the common notation A + λ (G) for the groups A rat λ (G) in the rational case and A trig λ (G) in the trigonometric case. Given a general element X(λ) = (X a (λ a )) a∈CP 1 ∈ A λ (g) of the Lie algebra of g-valued adèles, we will also denote by X a (λ a ) − the principal part of the formal Laurent series X a (λ a ), which stands for X a (λ a ) rat − in the rational case, see (2.6), or for X a (λ a ) trig − in the trigonometric case, see (2.14).

Dynamical equations
We will describe integrable field theories by taking the point of view that the spatial coordinate x, on which all the Hamiltonian fields are usually taken to depend, should be treated on an equal footing to all the other times in the hierarchy. To explain this new perspective on integrable hierarchies it is useful to begin by recalling the traditional point of view.
The dynamical equations of different integrable field theories in the same hierarchy are usually described as zero curvature equations which encodes the finite collection of fields of the hierarchy. The V a n (λ) ∈ R λ (g), associated to the times t a n for some labels a ∈ C and n ∈ Z to be specified below and which we also refer to as Lax matrices, are coadjoint orbits of A + λ (G) in R λ (g) built out of differential polynomials in the fields. From this traditional point of view, (3.1) are seen as a natural extension of the Lax equations ∂ t a n L(λ) = [M a n (λ), L(λ)], used to describe finite-dimensional systems, to the field theory case where every degree of freedom now depends on x. In particular, U (λ) is usually treated as the fundamental object since the V a n (λ) can all be built out of it and as such it is seen as the natural analogue of the Lax matrix L(λ) in the field theory case.
The crucial point is that the particular flow ∂ x can, and from our point of view should, be thought of as a linear combination of some of the elementary time flows ∂ t a n . But if we are to treat the coadjoint orbit U (λ) ∈ R λ (g) on an equal footing to all the other coadjoint orbits V a n (λ) ∈ R λ (g) then we should also abandon the idea that each V a n (λ) is parametrised by differential polynomials with respect to x of the finite collection of fields contained in U (λ). Instead, we should treat all the coadjoint orbits V a n (λ) ∈ R λ (g) as truly independent. We shall see, in a sense which is much closer in spirit to the Lax formalism for finite-dimensional systems, that all the Lax matrices V a n (λ) can be derived from a single object Q(λ) ∈ A λ (g), a certain adjoint orbit of A + λ (G) in the full space of adèles A λ (g). In particular, the latter will satisfy a Lax equation (see (3.16) below) with respect to all the times t a n . As such, in our approach to hierarchies of integrable field theories, Q(λ) will play a very similar role to that of the usual Lax matrix L(λ) for finite-dimensional systems. For us, the fundamental object will therefore be Q(λ) rather than U (λ). The relationship between these two objects, and in particular the connection between our approach to hierarchies of integrable field theories and the usual one recalled above, comes from fixing a particular linear combination of time flows as our choice of spatial derivative ∂ x . We discuss this in detail in Section 3.1.4, together with what we call the FNR procedure.
Since there is a close parallel between our treatment of integrable field theories and various familiar constructions in the theory of finite-dimensional integrable systems, we will draw the comparison throughout this section in a series of remarks.

Adjoint orbit
Let φ(λ) = (φ a (λ a )) a∈CP 1 ∈ A + λ (G). We regard the entries of the matrix coefficients in the expansions φ a (λ a ) = ∞ n=0 φ a n λ n a for all a ∈ CP 1 as an infinite collection of dynamical variables. These are not all independent since φ a (λ a ) should at least be invertible in the GL N case, which means that the first term φ a 0 should be invertible, or φ a (λ a ) should have determinant 1 in the SL N case which will impose non-trivial relations between the coefficients at each order in λ a . The infinitely many degrees of freedom contained in φ(λ), or equivalently in Q(λ), will be used to describe infinitely many different integrable hierarchies of integrable field theories. We will refer to these as group or algebra coordinates (respectively): they represent the dependent coordinates and are the fields satisfying the equations of motion of a hierarchies. A particular integrable hierarchy will be determined by a choice of non-dynamical rational function, with poles in a finite subset S ⊂ CP 1 , which we can write using a partial fraction decomposition as are (rational or trigonometric, depending on the case) principal parts at each a ∈ S. In particular, F a (λ a ) − dλ has a pole of order N a > 0 at any a ∈ S ∩ C and a pole of order N ∞ + 2 ≥ 2 at infinity if ∞ ∈ S. Its expansion at all of the points a ∈ CP 1 defines an element ι λ F (λ) = (ι λa F (λ)) a∈CP 1 ∈ A λ (g) of the g-valued adèles, via the embedding (2.3). By design, we have (ι λa F (λ)) − = F a (λ a ) − for each a ∈ S and (ι λa F (λ)) − = 0 for every other points a ∈ CP 1 \ S. The element of the g-valued adèles with these components, which we can denote by is just a finite collection of principal parts. We consider its adjoint orbit under the group element Explicitly, its component at any pole a ∈ S is Q a (λ a ) = φ a (λ a )F a (λ a ) − φ a (λ a ) −1 while the component at any other a ∈ CP 1 \ S vanishes. We can further expand the latter as a Laurent series in λ a , namely for some Q a n ∈ g, where N a > 0 is the order of the pole of F (λ) at a ∈ S ∩ C. For the point at infinity we can have N ∞ ≥ 0.
Remark 3.1. The adjoint orbit (3.3) within the full Lie algebra of g-valued adèles A λ (g) will play the role of the Lax matrix in the present infinite-dimensional setting. For comparison, it is useful to recall that in the finite-dimensional setting the Lax matrix is given by a coadjoint orbit where Π − denotes either π rat − or π trig − , depending on whether we are in the rational or trigonometric setting, but without applying the expansion ι λ so that we obtain an element of R λ (g) rather than ι λ R λ (g). In particular, the rational function L(λ) depends only on finitely many dynamical variables in φ(λ) ∈ A λ (g). ⊳

Generating Lax equation
As our aim is to work with hierarchies of equations of motions, to each point a ∈ CP 1 we attach an infinite family of time coordinates t a n for n ∈ Z. Related to each time is the usual partial derivative ∂ t a n (meant as a total derivative when acting on functions of the fields). For our purposes, let us define the following generating operators with k = 0 in the rational case and k = −1 in the trigonometric case. We let D λ := (D λa ) a∈CP 1 denote the CP 1 -tuple of these differential operators. Then, if µ is another formal variable we will use the notation which encodes the flows ∂ t a m Q b n of all the dynamical variables Q b n with respect to all the times t a m for each pair of points a, b ∈ CP 1 .
Following the first observation in Section 1.2 of the introduction, we want to declare the evolution of Q(λ) ∈ A λ (g) with respect to the above infinite family of times t a n to be governed by the following general Lax equation in r-matrix and generating form ( 3.8) However, a few comments and precautions are necessary. First, writing such an equation with the understanding that D µ is the CP 1 -tuple of commuting differential operators defined in (3.6) assumes that the vector fields on the right-hand side commute, if we want to be able to interpret the times t a n as coordinates on a manifold. In other words, defining the generating vector X µ acting on A λ (g) by we must first prove that [X µ , X ν ] = 0. Only then can we set X µ = D µ and view the generating Lax equation (3.8) as describing compatible time flows on A λ (g). This is shown below in Proposition 3.4 and is a beautiful consequence of the CYBE for r. Second, note that the right-hand side of (3.8) lives in a,b∈CP 1 g⊗λ −Na by definition (3.4). By the following lemma we then also deduce that at a ∈ CP 1 the power of λ a on the right hand side of (3.8) is bounded below by −N a . For the left hand side, this means that the flow with respect to the times t b m , with m < −N b are trivial: Q(λ) does not depend on those times and for all practical purposes related to a hierarchy of field theories, they can be ignored.
Lemma 3.2. We have Proof. Using the identity (??) we deduce that for any a, b ∈ CP 1 we have Since [Q a (λ a ), Q a (µ a )] vanishes when λ a = µ a it is proportional to λ a − µ a and so it follows that the right hand side above vanishes, as required.
Remark 3.3. The Lax equation (3.8) is to be compared with the Lax equation in the usual finite-dimensional setting for the evolution of the Lax matrix with respect to the times associated with the coefficients in the partial fraction decomposition of the quadratic Hamiltonian If we gather together the flows ∂ t a n = {H a n , ·} associated with the Hamiltonians H a n by defining the differential operator valued rational function which is to be compared with the adèlic object (3.6) in the present infinite-dimensional setting, then the Lax equations in the finite-dimensional setting take the form (3.10) Both sides of this equation are g-valued rational functions in both λ and µ with poles in λ and µ at each a ∈ S of order at most N a . ⊳ Proposition 3.4. The flows (3.8) are compatible, as a consequence of the commutativity of the corresponding vector fields, i.e. for any three formal variables λ, µ and ν we have By using the cyclicity of the trace over space 2 in the first term on the right hand side and the Jacobi identity on the last term, this can be rewritten as Likewise, exchanging µ ↔ ν in (3.12) we obtain where the second equality we used Lemma 3.2 to swap the order of ι ν and ι µ in the first term, along with the cyclicity of the trace over space 3. Thus X ν , X µ Q 1 (λ) equals which vanishes as a consequence of the CYBE (1.1).

Generating zero curvature equation
In the context of integrable field theories the role of the Lax equation, cf. (3.10), is replaced by the zero curvature equation for a Lax connection. Therefore, as a first step towards relating the present formalism to integrable field theories, we now associate with each time t a n , for any a ∈ S and n ≥ −N a , a rational Lax matrix V a n (λ) ∈ R λ (g) such that any pair of these satisfies a zero curvature equation.
The equations of motion (3.8) can be written succinctly as (3.14) Note that we do not expand the right hand side in powers of λ a for a ∈ CP 1 , i.e. we do not apply the homomorphism ι λ . Instead, this expansion is taken explicitly in (3.13). In particular, the semi-colon in the notation V (λ; µ) is used to emphasise that λ is just a formal variable whereas µ is the usual boldface notation used as a shorthand for a collection As usual, we take k = 0 in the rational case and k = −1 in the trigonometric case. Here V b n (λ) ∈ R λ (g) are g-valued rational functions in λ with a pole at λ = b. Unpacking the notation in (3.13) slightly, recalling the definition of the operators D µ and (3.6), we see that the flow of Q(λ) with respect to the time t a n is controlled by V a n (λ), namely we have the Lax equation ( 3.16) Moreover, by the following proposition V b (λ; µ b ) can be seen as a generating series in µ b of a hierarchy of Lax matrices V b n (λ) associated with the times t b n .
Proposition 3.5. We have the zero curvature equation in generating form Equivalently, in components we have the zero curvature equation for every a, b ∈ CP 1 and m ≥ −N a and n ≥ −N b .
Proof. Using the Lax equation (3.8) we find where in the last equality we used the cyclicity of the trace in space 2. Likewise, we also have where in the final step we used Lemma 3.2 to swap the order of ι ν and ι µ , before using the cyclicity of the trace in space 3. Finally, we have The result now follows by the classical Yang-Baxter equation (1.1).
Remark 3.6. Note the clear resemblance between the generating series (3.14) for the hierarchy of Lax matrices V a n (λ) and the usual generating rational function in the finite-dimensional case. The coefficients in the partial fraction decomposition of the latter with respect to µ are g-valued rational matrices M a n (λ) which control the flow of the Lax matrix L(λ) with respect to the associated time t a n via the Lax equation ∂ t a n L(λ) = [M a n (λ), L(λ)], which is to be compared with (3.16).
In the finite-dimensional case, however, one may also need to consider the more general generating rational function M (n) (λ; µ) = Tr 2 r 12 (λ, µ)L 2 (µ) n−1 for integers n ≥ 2. Indeed, the Lax equation in (3.10) involves M (λ; µ) = M (2) (λ; µ) which is only associated with the quadratic Hamiltonians H(µ) = 1 2 Tr(L(µ) 2 ). But in the finitedimensional setting one should equally consider the Lax equations where M (n) (λ; µ) replaces M (2) (λ; µ) since these describe the flows of the Lax matrix L(λ) with respect to the higher order Hamiltonians built from 1 n Tr(L(µ) n ). In stark contrast, the generating series (3.14) is sufficient in the present infinite-dimensional context to produce the infinite number of Lax matrices V a n (λ) associated with the infinite number of times t a n in the hierarchy. Indeed, the key point is that Q(λ) is defined in (3.3) as an adjoint orbit so that its powers Q(λ) n−1 correspond simply to taking powers of the rational function F (λ). This is different to the finite-dimensional case where L(λ) is defined in (3.5) as a coadjoint orbit in R λ (g). ⊳ Remark 3.7. It is instructive to compare the generating series (3.14) for the hierarchy of Lax matrices V a n (λ) with formulas for similar generating series of Lax matrices obtained in the more traditional approach to integrable field theories which involves the monodromy matrix associated to a given auxiliary equation ∂ x Ψ = U Ψ. For example, in [FT,, it is shown that the object "is the generating series of the Lax matrices V n (x, λ) appearing in the zero curvature equation representation of the higher NS equations". The expansion is to be understood as The point is that (3.18) can be rewritten as We note the explicit dependence of the preferred variable x, signature that this object has been built from a particular, preferred time x associated to the Lax matrix denoted U (x, λ), which is nothing but V 1 (x, λ), as it should. Other than this dependence, formula (3.18) has exactly the same structure as our formula (3.14) when specialised to the AKNS hierarchy, see Section 4. Indeed, in that case the only pole to consider is at infinity and the function F (λ) is taken to be −iσ 3 . Hence the only non zero element in the tuple (3.14) is To complete the comparison, note that the term (1 + W (x, µ)) in (3.20) comes from writing the monodromy matrix T (x, y, λ) on the finite interval [y, x], associated to U (x, λ), as where Z is a diagonal matrix and both Z and W are Taylor series in 1/µ (with no constant term for W ). We refer the curious reader to [FT] for more details about Z and W which are not of importance for our discussion here. Considering for instance the case of fast decaying fields as |x| → ∞, we can work with the monodromy matrix on (−∞, x) This is the object that plays the role of our group element φ ∞ (µ ∞ ). Indeed, formally plugging in T − (x, µ) in place of φ ∞ (µ ∞ ) into (3.21), and remembering that e Z − (x,µ) commutes with σ 3 , we see that we get (3.20) (up to an irrelevant factor −1/2 which comes from a different choice of normalisation between us and [FT]). In [ACDK, AC], the argument from [FT] was generalised to obtain the analog of formula (3.20) but where one now builds it from the monodromy matrix associated to the time t k and Lax matrix V k (t k , λ), for an arbitrary but fixed k ≥ 1. This represented the first step towards providing a generating function of Lax matrices that treats all times in the AKNS hierarchy equally. Our formula (3.14) achieves this fully in that it makes no reference to a preferred time and an associated monodromy matrix as a starting point. It is also valid well beyond the realm of AKNS only, as our various examples below demonstrate. ⊳ It will be useful, in view of applying our general framework to construct explicit examples in the next few sections, to be more explicit about the form of the Lax matrices V a n (λ). This can be done using Lemma 2.4 in the rational case or Lemma 2.9 in the trigonometric case.
Proposition 3.8. In the rational case, for every a ∈ C and n ≥ −N a , we have In the trigonometric case, for every a ∈ C and n ≥ −N a we have which at the origin simply reads V 0 n (λ) = − λ −n Q 0 (λ) trig − , while at infinity we have, for every Proof. In the rational (resp. trigonometric) case this is a direct consequence of the definition (3.15) together with Lemma 2.4 (resp. Lemma 2.9).
Recall that Q a (λ a ) = 0 if a ∈ CP 1 \ S so that, in fact, V a n (λ) = 0 unless a ∈ S. By construction each Lax matrix V a n (λ) ∈ R λ (g) for any a ∈ S and n ≥ −N a , or rather their embedding in A λ (g) via (2.3), is a coadjoint orbit in ι λ R λ (g). For instance, in the rational case for a ∈ C ∩ S we have

Connection to integrable field theory and FNR procedure
Up to this point, the framework we have been discussing is very similar to the one used to describe finite-dimensional integrable systems, as emphasised in Remarks 3.1, 3.3 and 3.6. However, as we will see explicitly in all the examples discussed in later sections, our formalism encodes entire hierarchies of integrable field theories! The FNR procedure. One way to make explicit contact with the traditional approach to integrable field theory is to choose a preferred coordinate, denote it by x and set it as a particular combination of the fundamental times t a n for a ∈ S and n ≥ −N a . Quite generally, we can choose some finite subsets T a ⊂ Z ≥−Na for each a ∈ S and define ∂ x := a∈S n∈Ta r a n ∂ t a n for some r a n ∈ C * .
The Lax matrix associated with the coordinate x is then given by U (λ) := a∈S n∈Ta r a n V a n (λ) ∈ R λ (g). (3.24) As explained above, ι λ U (λ) is then a coadjoint orbit in the dual space ι λ R λ (g) of A + λ (g). This is the coadjoint orbit alluded to at the very start of this section which encodes the finite collection of fields of our integrable hierarchy. It follows from (3.13), or even more directly from (3.16), that the spatial dependence of Q(λ) is governed by the Lax equation (3.25) As we will see on examples, the equation (3.25) can be solved recursively to express the coefficients Q a n of Q a (λ a ), cf. (3.4), as differential polynomials in the fields, i.e. the variables contained in the Lax matrix U (λ). All other Lax matrices V a n (λ) associated to the fundamental times t a n will then have components expressed as differential polynomials of the fields. We will outline below how (3.25) can, in principle, be solved recursively for each Q a n . Since certain details of the recursive procedure depend on the model considered, we will only illustrate here the part of the construction which applies universally to all models in Lemma 3.9 below. We will see later on examples how to apply this construction to specific models.
To state the lemma, we first need to make a few observations and definitions. Since the Laurent expansions ι λa V a n (λ) each have a non-zero principal part, it follows from the definition (3.24) that we can write for some n a ≥ 1 and non-zero leading coefficient U a −na ∈ g. By definition (3.3) we have that Q a (λ a ) = φ a (λ a )F a (λ a ) − φ a (λ a ) −1 . It thus follows from the relationship between each ι λa V a n (λ) and Q a (λ a ), as described explicitly in Proposition 3.8, that the coefficients of the most singular terms in the formal Laurent series Q a (λ a ) and ι λa U (λ), given in (3.4) and (3.26) respectively, are proportional. In other words, we have Q a −Na = c U a −na for some c ∈ C * . Explicitly, it can be seen from Proposition 3.8 that c is given up to a sign by the coefficient r a n in (3.24) with n = max T a . We can thus write (3.27) Now let k := ker(ad U a −na ) and i := im(ad U a −na ). We fix any complements k ′ of k and i ′ of i in g so that we have the direct sum decompositions (3.28) Let π k : g → k and π k ′ : g → k ′ denote the projections onto k and k ′ relative to the first decomposition. Likewise, let π i : g → i and π i ′ : g → i ′ denote the projections onto i and i ′ relative to the second decomposition in (3.28).
Lemma 3.9. For any r ≥ 1, π k ′ (Q a −Na+r ) is expressible as a differential polynomial in x of the elements Q a −Na+s for s < r.
Proof. Using the explicit forms (3.4) and (3.26) for the Laurent series of Q a (λ a ) and ι λa U (λ), we may rewrite the component of (3.25) at a ∈ S more explicitly as In the second equality we have changed variables in the double sum from m ≥ −N a to n := m + p ≥ −N a − n a . Comparing the coefficients of λ −Na−na+r a on both sides of the above equation for all r ≥ 0 we find the following. For every 0 ≤ r ≤ n a − 1, where we changed variables in the sum from p to q := p + n a . Notice that for r = 0 this gives [U a −na , Q a −Na ] = 0 which is consistent with the observation in (3.27) that Q a −Na is proportional to U a −na . On the other hand, for r ≥ n a we have Denoting the right hand side of the equations (3.29) by B r , for each r ≥ 0 we can rewrite all of them more uniformly as for r ≥ 0. Since the left hand side of (3.30) lies in i we have, for every r ≥ 0, where in the first equation we have also decomposed Q a −Na+r relative to the first decomposition in (3.28) and used the fact that π k (Q a −Na+r ) commutes with U a −na . Now the linear map ad U a −na : k ′ → i is a bijection. Indeed, it is clearly surjective by definition of i. To see that it is injective, note that if [U a −na , X] = [U a −na , Y ] for any X, Y ∈ k ′ then X −Y ∈ k and hence X − Y = 0, as required. It follows that π k ′ (Q a −Na+r ) is uniquely determined in terms of π i (B r ) for every r ≥ 0 by the first equation in (3.31). The result now follows.
In order to completely determine the coefficients Q a −Na+r for r ≥ 0, it remains to show that the π k (Q a −Na+r ) for every r ≥ 0 can also be determined recursively. This is the part which will typically depend on the model considered. Here we will show, generalising an argument for the ZS-AKNS n × n hierarchy given in [TU,Theoerem 2.2], see also [Sa], how this can be done under the assumption that there is a polynomial P a with coefficients in C[λ a ] such that P a λ Na a F a (λ a ) − = 0 and P ′ a (c U a −na ) ∈ C[λ a ] is invertible in C λ a . Recalling (3.27), we have the identity And using the fact that each π k (Q a −Na+r ) commutes with U a −na , by definition of k, we can then rewrite the above in the form where the right hand side is a sum of terms, each of which contains either higher powers of ∞ r=1 π k (Q a −Na+r )λ r a or at least one factor of ∞ r=1 π k ′ (Q a −Na+r )λ r a . Since we are assuming that P ′ a (c U a −na ) ∈ C[λ a ] is invertible, it follows by comparing powers of λ r a on both sides of (3.32) that π k (Q a −Na+r ) can be expressed as a finite sum of terms involving only π k (Q a −Na+s ) for s < r or π k ′ (Q a −Na+s ) for s ≤ r. In conjunction with Lemma 3.9, this shows that each π k (Q a −Na+r ) and π k ′ (Q a −Na+r ), and therefore Q a −Na+r itself, can be determined recursively for each r ≥ 0. In particular, all the coefficients Q a n , n ≥ −N a of the Laurent series Q a (λ a ) in (3.4) can be expressed as differential polynomials in x of the coefficients of the rational function U (λ). The same conclusion still holds even when there is no polynomial P a with the above properties, as will be shown on the example of the sine-Gordon hierarchy in Section 5.
It is important to observe that our choice of 'spatial' coordinate x defined by the linear combination ∂ x = a∈S n∈Ta r a n ∂ t a n and its associated Lax matrix in (3.24) was completely arbitrary. Indeed, one of the main advantages of working with the adjoint orbit Q(λ) in A λ (g) rather than the coadjoint orbit U (λ) in R λ (g) is that it keeps all the times on an equal footing by not singling out a particular (linear combination of) time as 'space'.
On the redundancy of the FNR procedure. The previous discussion casts in the present framework the original idea of [FNR] whereby one should first solve for the coordinates in Q(λ) in terms of the finite collection of fields contained in a given Lax matrix U (λ), now interpreted as fields depending on a preferred space variable x. The other times in the hierarchies are viewed as (compatible) time flows imposed on this finite collection of fields and define a preferred field theory alongside its higher symmetries.
Here we want to elaborate on a point of view originally advocated in [CS3] whereby the above "traditional" approach is not needed at all and, in fact, represents a conceptual obstruction to the formalism we want to put forward in this work: we treat all the times in a hierarchy as well as all the (algebra or group) coordinates (i.e. the dependent variables contained in Q(λ) or φ(λ) respectively) on the same footing. From this point of view, one should consider the entirety of the Lax equations contained in the generating Lax equation (3.8), or equivalently, the collection of zero curvature equations (3.5). The point is that the latter implement the FNR procedure anyway but they present the advantage of being amenable to a covariant Hamiltonian formulation, which was one of the main results of [CS1,CS3]. This aspect is beyond the scope of the present work but remains one motivation for it. The fact that the zero curvature equations contain the equations of the FNR procedure was already observed and used in the particular example of the AKNS hierarchy in [AC]. For convenience, let us sketch the argument here in the simplest case of a single pole a ∈ C, with a collection of times t a n , n ≥ −N a . Suppose we fix n ≥ −N a and we want to solve ∂ t a n Q a (λ a ) = [ι λa V a n (λ), Q a (λ a )] , (3.33) given Q a −Na , along the lines of Lemma 3.9 and the discussion after it. Without loss of generality, shifting the power of λ a by N a , we can always assume for simplicity that N a = 0. Then, (3.33) amounts to the collection of equations ∂ t a n Q a j = n p=0 [Q a j+n−p+1 , Q a p ] , j ≥ 0 . (3.34) As discussed above, in certain cases (which include the AKNS hierarchy and the sG hierarchy as we show explicitly in Section 5), this allows one to express all the algebra coordinates in Q j , j ≥ n as differential polynomials with respect to t a n in the coordinates contained in Q k , k = 0, . . . , n. Now consider the zero curvature equations, for m ≥ n + 1, Looking at the coefficient of 1/λ j , for j = n + 2, . . . , m + 1, we find that they contain the equations If we set j = m + 1 − k, these become ∂ t a n Q a j = n p=0 [Q a j+n+1−p , Q a p ] , j = 0, . . . , m − n − 1 . (3.37) So the collection of zero curvature equations (3.35) for m ≥ n + 1 produces exactly the set of FNR equations (3.34). Hence, there is no point in implementing the FNR procedure a priori to determine the "fields" and then impose the zero curvature equations to determine their equations of motion. The latter suffices. With this in mind, we will come back to this point in certain examples below to illustrate our position and show how abandoning the FNR procedure allows us to eliminate the problem of alien derivatives mentioned in the introduction.

Extracting Lagrangians and Lagrangian multiforms
We have been using the generating formalism efficiently so far. Here, we spend some time discussing the connection of our generating Lagrangian multiform with Lagrangians and Lagrangians multiforms. This will be useful to reformulate the multiform EL equations and the closure relation in generating form, allowing to continue to take advantage of this for general computations. From the definition of the generating Lagrangian multiform (1.15), we see that the kinetic term K a,b (λ a , µ b ) given by (3.40a) is a Laurent series in both λ a and µ b , with powers bounded below by −N a and −N b , respectively. In particular, for any m, n ∈ Z the coefficient of λ m a µ n b is well defined. The same is true for the potential term (3.40b) by the following lemma.
Lemma 3.10. For any m, n ∈ Z and any a, b ∈ CP 1 , the coefficient of λ m a µ n b in the potential term U a,b (λ a , µ b ) given by (3.40b) is a well defined expression which is quadratic in the coefficients of Q a (λ a ) and Q b (µ b ).
is a Laurent series in both λ a and µ b , with powers bounded below by −N a and −N b , respectively.
If b = a then (ι λa ι µa + ι µa ι λa )r 12 (λ, µ) contains a doubly infinite Laurent series in λ a µ −1 a coming from the expansion of 1/(λ−µ), possibly also multiplied by some polynomial in λ a and µ a depending on the precise form of the r-matrix. Multiplying this by the Laurent series Q a (λ a ) ∈ g ⊗ λ −Na a C λ a and Q a (µ a ) ∈ g ⊗ µ −Na a C µ a , we produce terms of the form λ r+j+p a µ s−j+q a with r, s ≥ −N a , j ∈ Z and p, q ranging over finitely many possible values. In order to form a term proportional to λ m a µ n a we need m = r + j + p and n = s − j + q. But then m − j − p = r ≥ −N a so that j ≤ m + N a − p and also n + j − q = s ≥ −N a so that j ≥ −n − N a + q. In other words, j ∈ Z must be bounded from above and below so that it ranges only over finitely many values. Hence, there are only finitely many terms contributing to the coefficient of λ m a µ n a and the result follows.
As a consequence, for any a, b ∈ CP 1 and m, n ∈ Z with m ≥ −N a and n ≥ −N b , we may now extract the following Lagrangian coefficients associated to the times t a n and t b m : Definition 3.11 (Elementary Lagrangians).
L a,b m,n := res λ a res µ b L a,b (λ a , µ b )λ −m−1 dλ µ −n−1 dµ . (3.41) Recall the notational convention explained after (2.2b), in particular for residues computed at infinity. In short, Definition (3.41) means that L a,b m,n is the coefficient of λ m a µ n b in the expansion of L a,b (λ a , µ b ), as one would want. This is what we use to compute elementary Lagrangians in all our examples.
As explained below, when building a hierarchy, one chooses a finite set S ∈ CP 1 and all but a finite number of the elementary Lagrangians L a,b m,n vanish (those for which a and/or b is in CP 1 \ S). The Lagrangian multiform of the hierarchy is then given by (3.42) Note that we introduced an order on the pairs (m, a) ∈ Z × S in the last equality (recall that L a,b m,n = −L b,a n,m ). With S = {a 1 , . . . , a n }, it is defined by (m, a i ) < (n, a j ) ⇔ i < j or (i = j and m < n) .
These definitions generalise the correspondence explained in the introductory section 1.1.3 between L [u] and L (λ, µ) for the AKNS hierarchy. As we will see in detail in Section 4, the latter indeed corresponds the case where S = {∞}. In practice, one calculates the elementary Lagrangians (3.41) directly by computing the appropriate Laurent series expansion of L a,b (λ a , µ b ). The corresponding Lagrangian multiform is easily obtained as in (3.42).
The essential point of the present discussion is to identify the generating form of the two main equations of the theory of Lagrangian multiforms: the multiform EL equations δdL S = 0 and the closure relation dL S = 0 which should hold on solutions of the multiform EL equations. We see that the key object to translate in generating form is therefore dL S . In view of (3.42), dL S has the form The generating function corresponding to the coefficient Summarizing our discussion, the set S was fixed but arbitrary, so going back to the adélic setting, we will be working compactly with δD ν L (λ, µ) + δD µ L (ν, λ) + δD λ L (µ, ν) when deriving the multiform EL equations in generating form, and with when studying the closure relation.

Generating multiform Euler-Lagrange equations
Having introduced the main object of our framework, we proceed to derive the associated multiform EL equations (in generating form) and show that they give the generating Lax equation (3.8).
Explicitly, it reads

Generating closure relation
Theorem 3.13. The generating closure relation (3.50) holds when (3.8) is satisfied. It is a consequence of the CYBE for r.
By using the cyclicity of the trace in space 1 and 2 in the first and second terms, respectively, we may write this as where in the second equality we used Lemma 3.2 in both terms. By using once again the cyclicity of the trace in space 1 and 2 in the first and second terms, respectively, we arrive at the expression D ν U (λ, µ) = Tr 123 ι λ ι µ ι ν [r 12 (λ, µ), r 13 (λ, ν)] (3.52a) Likewise, using the skew-symmetry of the r-matrix we find Then by following the same steps as above for D ν U (λ, µ) we deduce that D λ U (µ, ν) = Tr 123 ι λ ι µ ι ν [r 12 (λ, µ), r 23 (µ, ν)] Similarly, we also find using the skew-symmetry of the r-matrix that D µ U (ν, λ) = Tr 123 ι λ ι µ ι ν [r 13 (λ, ν), r 23 (µ, ν)] It now follows from combining the three equations in (3.52) and using the classical Yang-Baxter equation for the skew-symmetry r-matrix that The result now follows from (3.51) and (3.53) but together.
The rest of the paper is devoted to examples. To specify an example, the following ingredients need to be fixed: (i) a skew-symmetric r-matrix as in Section 2 (rational or trigonometric in this work), (ii) an effective divisor D := a∈S N a a, in particular with support given by a finite subset S ⊂ CP 1 and with N a ∈ Z ≥1 for each a ∈ S, (N ∞ ∈ Z ≥0 if ∞ ∈ S), (ii) a Lie algebra g which for simplicity we take to be either gl N or sl N , (iv) a g-valued rational function F (λ) ∈ R λ (g) with poles divisor (F ) ∞ = D, i.e. with a pole of order N a ∈ Z ≥1 at each point a ∈ S, (N ∞ ∈ Z ≥0 if ∞ ∈ S).

AKNS hierarchy
We keep this section short as it is a matter of "closing the loop": we reproduce the motivating example of Section 1.1.3 which was dealt with in detail in [CS3]) and the started point of this whole project. The main objective is to illustrate on the simplest and most well known example how to use our machinery. We choose the rational r-matrix and we fix the required data as follows The adjoint orbit description of Section 3.1 is implemented with and gives , the familiar first two elements in the AKNS hierarchy. Since there is only one pole in this example, let us drop the subscripts and superscripts and simply write the fundamental objects in (4.2) and (4.3) as (4.4) Similarly, we will just write t n instead of t ∞ n for the times of the hierarchy. The generating Lax equation (3.8) gives us, using the definitions (3.6), (3.14) and (3.15), are the Lax matrices of the hierarchy. Eqs (4.5) are the the central equations of [FNR] where only Hamiltonian aspects of the theory were developed. The associated zero curvature equations read and produce the equations of motion of the hierarchy. The famous (unreduced) NLS system corresponds to n = 1 and k = 2. From our generating Lagrangian (1.15), we can of course reproduce the generating Lagrangian of [CS3] and all the Lagrangians forming the Lagrangian multiform that gives these equations as its (multiform) EL equations. Since S = {∞} we only have L ∞,∞ (λ ∞ , µ ∞ ) to consider. As above, let us simply denote it as L (λ, µ). The coefficient L mn of λ −m−1 µ −n−1 in its expansion reads where we wrote φ −1 (λ) = 1 + ∞ n=1φ n λ −n for convenience and where U mn is given by These are the coefficients of the AKNS Lagrangian multiform found in [CS3] (up to an overall minus sign) to which we refer for more details. It was explained in [CS3] that there exists a parametrization of φ(λ) in terms of very nice coordinates e (4.10) For the reader's convenience, let us give for instance and (4.12) Of course, one can check that the equations of motion for these Lagrangians give precisely the zero curvature equations (4.7) for (k, n) = (1, 2) and (k, n) = (1, 3) respectively. For instance, varying L 12 with respect to e j , f j , j = 1, 2, we have This is equivalent to (4.7) for (k, n) = (1, 2), upon recalling that The top two equations can be used to eliminate e 2 , f 2 in the bottom two equations. With (4.15) and the reduction r = ∓r * yields the well-known (de)focusing NLS equation for the complex field q. Similarly, L 13 gives the complex modified KdV equation.

Sine-Gordon hierarchy
For the example of the sine-Gordon equation u xy + sin u = 0 , (5.1) we choose the trigonometric r-matrix (2.19). The required data is fixed as follows and we work with the basis σ 3 , σ + , σ − . The adjoint orbit description of Section 3.1 is implemented with The phase space coordinate u will be the sine-Gordon field as will become clear soon. This gives, We now show how to use our formalism to recover the sine-Gordon equation (in light cone coordinates) as well as its first higher compatible flow which is nothing but the modified KdV equation, as presented in [Su]. We take advantage of sine-Gordon to illustrate how our formalism also produces the Lagrangian multiform corresponding to these 3 times. In this context, our motivation is to show that the so-called "alien derivatives" problem that was discussed in [V] does not appear with our approach. The problem only arises if one insists on using the variational equations we obtain to eliminate some of the phase space coordinates in favour of the sine-Gordon field u and its derivatives with respect to a given time.
In other words, we show in detail how our general discussion about the FNR procedure, when applied at the variational level, leads to this alien derivative problem. This is yet another reason in our opinion why it is preferable to work with the natural phase space coordinates that are provided by φ(λ).
It is convenient to parametrise (5.3)-(5.4) as where we recall that det φ 0 = 1 = φ ∞ should hold. Using the gauge freedom of multiplying φ 0 (λ 0 ) (resp. φ ∞ (λ ∞ )) on the right by a matrix which commutes with (ι λ 0 F (λ)) trig − (resp. (ι λ∞ F (λ)) trig − ), we can work with (5.10) Note that one can show that there is a bijection between the group coordinates A 0 n , C 0 n , D 0 n and A ∞ n , C ∞ n , D ∞ n , and the algebra coordinates a 0 n , b 0 n and c 0 n , which we would introduce via Q 0 n = a 0 n σ 3 +b 0 n σ + +c 0 n σ − (and similarly at ∞). The reader familiar with the FNR construction or only interesting in zero curvature equations would tend to use the algebra coordinates. However, since our Lagrangians are naturally expressed with group coordinates, we use the latter both for the zero curvature equations and the Lagrangians. It also facilitates comparison between the two ways of obtained the equations of motion.
By our general results in Sections 3.1.2 and 3.1.3, all the time flows commute and all the corresponding zero curvature equations of Proposition 3.5 hold, with the Lax matrices reading for n ≥ −1, (see Proposition 3.8) The sine-Gordon is recovered by taking the pair of Lax matrices (V 0 0 (λ), V ∞ 0 (λ)) and the compatible higher flow attached to the pair (V ∞ 0 (λ), V ∞ 1 (λ)) gives (potential) mKdV. The third possible Lax pair is (V 0 0 (λ), V ∞ 1 (λ)) and will be called the mixed equation. For convenience, let us label the corresponding times as follows t 0 0 = y, t ∞ 0 = x, t ∞ 1 = z. Therefore, we focus on the following three zero curvature equations Hence, Therefore, we obtain the following equations of motion from the zero curvature equations. (sG) (5.20) The first two equations show that the group coordinates C 0 1 , B ∞ 1 can be thought of as auxiliary fields and can be eliminated from the dynamics to get (5.1), as desired. (mKdV) We see that both (5.20) and (5.21) contain the same equation for B ∞ 1 in terms of u, as it should be. A comment is in order. Under the first two equations, the third and fourth equation consistently give the same expression for B ∞ 2 . In turn, replacing all the auxiliary fields into the last equation yields mKdV in potential form (i.e. mKdV for v = u x ) (5.23) Using the first two equations to eliminate the auxiliary fields and noting that the fourth and fifth equations are equivalent (modulo the third equation), we obtain after simplification the following system of equations for the three fields u, A ∞ 1 and B ∞ 1 , (5.24) Note that this system of equations in (y, z) can be perfectly studied on its own and is integrable. However, from our point of view, it should be included together with (5.20) and (5.21) into the sG hierarchy. This leads to interesting observations which are related to the Lagrangian multiform description we present below. First of all, using B ∞ 1 = −u x and the sine-Gordon equation, we see that the first equation in (5.24) is trivially satisfied. Similarly, the second equation in (5.24) is a consequence of B ∞ 1 = −u x , the sine-Gordon equation and the second equation in (5.21). Perhaps more interesting is the fact that combining the first three equations in (5.21) with the second equation in (5.23) yields of which (5.22) is simply a differential consequence. We now turn to the extraction of the coefficients of the Lagrangian multiform for the corresponding time flows. We need L 0,∞ 00 for (sG), L 0,∞ 01 for (mixed) and L ∞,∞ 01 for (mKdV). We have K 0,∞ (λ 0 , µ ∞ ) = Tr i 2 Hence, dropping irrelevant total derivative terms and using again t 0 0 = y, t ∞ 0 = x, t ∞ 1 = z for convenience, we find To compute the potential terms, observe that for the trigonometric r-matrix, we have Hence, This gives us the desired Lagrangian densities for (sG) and (mixed) as (5.26) and Similarly, we find and, with, Thus, the Lagrangian density for (mKdV) is given by (5.28) It remains to derive the EL equations associated to each Lagrangians. For instance, by varying B ∞ 1 , C 0 1 and u in L sG we find exactly the three equations in (5.20). Similarly, it can be checked that the E-L equations for L mKdV and L mixed reproduce (5.21) and (5.23) respectively.
In particular, all the equations that determine the group coordinates in terms of u and its (relevant) derivatives are reproduced variationally. This is an interesting feature that the FNR procedure is also obtained variationally with our construction. An important by-product is that the so-called problem of "alien-derivatives" is eliminated systematically. In the present context, the manifestation of this problem would be for instance that the Lagrangian L mixed contains terms with derivatives of u with respect to x, while this Lagrangian is supposed to produce equations of motion with respect to the variables y and z only. Clearly, our Lagrangians do not suffer from this problem since by construction, they always only involves the two times they are supposed to produce equations of motion for. The problem is an artefact of using some of the equations of motions to solve for some of the fields in terms of u and its derivatives. In other words, it is an artifact of implementing the FNR procedure a priori to eliminate some of the group coordinates. If we do implement this procedure of elimination, we obtain Lagrangians which form a Lagrangian multiform equivalent to the one given originally in [Su] and which suffers from this problem. Eliminating the auxiliary fields in favour of u and its derivatives, we obtain which is a well-known Lagrangian for (5.1), as well as Changing x to −x, multiplying all our Lagrangian by 2 and dropping the irrelevant total derivatives in x and y, we recover exactly the three Lagrangian coefficients, eqs (31)-(33), in [Su]. Our Lagrangian multiform expressed with the group coordinates (and restricted to the three times x, y, z) is thus equivalent to that in [Su] but, as noted before, it does not suffer from the alien derivative problem. The poles at 0 and ∞ play a symmetric role in the construction so it would be natural to consider also the time t 0 1 and the associated Lax matrix V 0 1 (λ). This naturally leads to two additional zero curvature equations (denote t 0 1 = t and the other times as above) that can be combined with (sG) The first one is called (mKdV2) as it is another copy of the mKdV equation but in (y, t) instead of (x, t). It is a compatible flow with (sG) where we can think of the roles of x and y being swapped. Then, naturally (mixed 2) is the remaining compatible flow between the variables x and t. All the expressions for the Lax matrices, the zero curvarture equations and the corresponding Lagrangians are similar to the above ones with the appropriate changes and we omit them. To complete the picture related to the four times we have focussed on, it would remain to consider the zero curvature equation The set of equations of motion is not particularly enlightening. When embedded in the hierarchy of the five zero curvature equations already discussed, this system is a consequence of them, as it should be. Our contruction gives us the means to derive the corresponding Lagrangian density L (0,∞) 11 if required but again we omit its lengthy expression here.
FNR procedure for the sine-Gordon hierarchy. We have discussed the FNR procedure at the level of the EL equations above, using some of the equations to eliminate certain auxiliary coordinates/fields. Here, we discuss it at the level of the algebra coordinates using the Lax equation. This is more in line with the original work [FNR] and with the explanation around Lemma 3.9 for which it provides an illustration in the sG case. We recall that our point of view is that the procedure is unnecessary. We show it in the sG case to make contact with a more traditional approach but also because to our knowledge, this is the first time that the FNR construction is obtained for a hierarchy other than AKNS. In the present sG case, it is based on the Lax equations (5.29)-(5.30) below. The generating Lax equation (3.8) gives the following equations, for n ≥ −1, We could use the Lax equations (5.29)-(5.30) to derive the coefficients of Q 0 n and Q ∞ n as differential polynomial in the coordinate u. Given the form of F (λ) here, we do not fall into the area of applicability of the argument given after Lemma 3.9. Nevertheless, it is still possible to proceed. We illustrate this with (5.30), the other case being similar.
Our choices (5.2) and (5.3) give c −1 = − i 2 e iu/2 , a −1 = 0 = b −1 . Then, consider (5.30) for n = 0 with Q ∞ (λ) = a(λ) b(λ) c(λ) −a(λ) (we drop the superscript for conciseness). Writing t ∞ 0 = x for convenience and projecting onto σ 3 , σ + and σ − , we obtain (5.31) Looking at the λ j coefficient, this yields the following system which we should use to determine the coefficients recursively. Suppose, we have determined a k , b k , c k for k = 1, . . . , n − 1 then the first equation gives us b n and hence the second equation yields a n . However, we cannot deduce c n from the third equation since it would require the knowledge of a n+1 . It is possible to replace (5.31) by the following equivalent system (5.33) To see this, note that (5.31) implies ∂ x Tr Q ∞ (λ ∞ ) 2 = 0 so that Tr Q ∞ (λ ∞ ) 2 = cst = Tr (ι λ∞ F (λ)) trig − 2 , as it should by construction. Conversely, assume (5.33) holds. The third equation implies 2∂ x a(λ)a(λ) + ∂ x b(λ)c(λ) + b(λ)∂ x c(λ) = 0. Using the first two equations to eliminate ∂ x a(λ) and ∂ x b(λ) yields b(λ) ∂ x c(λ) + a 0 c(λ) − 2 λ c −1 a(λ) = 0, and the claim follows. Now the advantage of system (5.33) is that the j-th term of the third equation gives the following relation Spelling it out, it can be seen that it can be used to determine c n from a k , b k , c k , k = 1, . . . , n − 1 and b n , a n obtained from the first two equations as explained before. Thus, (5.33) allow us to determine all a j , b j , c j , j ≥ 0 recursively. We find the first few as Now, for instance, the expression we find for a 0 is consistent with the fact that a 0 = − i 2 B ∞ 1 from (5.15) and with the second equation in (5.20). This is what we mean when we say that the FNR procedure is automatically implemented with our Lagrangian approach. We reiterate that the advantage of not applying it is that the problem of alien derivatives disappears and that dependent variables are also treated on an equal footing, like the independent variables.

Hierarchies of Zakharov-Mikhailov type
In this section, we introduce a rather large class of models and their hierarchies by using the following data S = {a 1 , . . . , a P } ⊂ C , P > 0 , g = gl N , (6.1) (6.2) Each A ir ∈ gl N is a non-dynamical constant matrix and we have chosen to write the order N a i of the pole a i , i = 1, . . . , P as N a i = n i + 1 for convenience. All the poles in S are distinct. The r-matrix can be the rational or trigonometric one at this stage.
The motivation behind such choices is that in the simplest setting (rational r-matrix and simple poles), our construction reproduces the Zakharov-Shabat Lax pair with simple poles whose equations of motion were cast in variational form in [ZM1]. In fact, our construction automatically embeds this single Lax pair, its zero curvature equation and its Lagrangian into an integrable hierarchy. This point of view was first introduced in [SNC] where the class of Zakharov-Mikhailov (ZM) models was cast into the formalism of Lagrangian multiforms. Allowing for higher order poles gives us the generalisation discussed in [Di,Chap. 20]. When we switch to the trigonometric r-matrix, we produce for the first time the trigonometric version of the large class of ZM models and their hierarchies. Finally, when specialising the construction via an appropriate reduction and choice of matrices A ir , we obtain as a special case the class of models studied in [ABW]. Their integrability is guaranteed by construction and they are naturally embedded in an integrable hierachy, a new feature for these models that were originally obtained as standalone models by a different method related to the 4d Chern-Simons construction (see conclusions for details and references). These examples are detailed in the next three subsections.

Rational Zakharov-Mikhailov models
We first describe in detail how to reproduce the class of Lax pairs and Lagrangians originally discussed in the pioneering paper [ZM1]. The generalisation to higher order poles presented in [Di] will be straightforward. The r-matrix is fixed to be the rational one in this subsection. We split the data (6.1)-(6.2) in the following way: P = P 1 + P 2 , P 1 , P 2 > 0, and For notational convenience, we simply denoted A j+P 1 ,r = B jr and n j+P 1 = m j for j = 1, . . . , P 2 .

Case of simple poles
Following [ZM1], let us consider a Lax pair of the form 2 A prominent example of an integrable field theory that falls into this class is the Faddeev-Reshetikhin model [FR] which was proposed as an ultralocal variant of the principal chiral model. The main result of [ZM1] is that the equations of motions encoded in the zero curvature (6.6) are variational and are obtained as the EL equations of the following Lagrangian density The key insight to obtain this result is to parametrise U i as ϕ i U We can reproduce (6.5) by choosing n i = 0 and m j = 0 in our data (6.4). Since a direct calculation using Proposition 3.8 gives Therefore, it remains to make the identifications φ a i 0 = ϕ i and A i0 = U . The corresponding Lax matrices are simply the sum of the elementary Lax matrices (6.8) which gives precisely (6.5).
To understand how to recover the Lagrangian (6.7) with our method, note that the zero curvature equation associated to the elementary times t a i −1 and t Summing these elementary zero curvature equations over i = 1 . . . , P 1 and j = 1, . . . , P 2 yields the desired ∂ η U (λ) − ∂ ξ V (λ) + [U (λ), V (λ)] = 0. Therefore, to find the Lagrangian L ZM it suffices to sum the elementary Lagrangians L which yields the equations of motion in (6.9)). A direct calculation gives (6.10) and the claim follows, i.e. , with identifications made above, we derive L ZM (up to an irrelevant minus sign) as in (6.7) by taking the double sum It was shown for the first time in [SNC] that the ZM Lagrangian can be incorporated into a Lagrangian multiform where each coefficient is a copy of the original ZM Lagrangian associated to the corresponding times. The explicit case of 3 times was considered. We now explain how to recover this multiform from our data. Instead of splitting the data (6.1)-(6.2) into two types of poles as in (6.3)-(6.4), we split it into three types of poles by setting P = P 1 + P 2 + P 3 and restrict our attention to simple poles, i.e. we set S = {a 1 , . . . , a P 1 , b 1 , . . . , b P 2 , c 1 , . . . , c P 3 } ⊂ C , P 1 , P 2 , P 3 > 0 , g = gl N , (6.11) As before, we take the linear combinations ∂ ξ = The Lagrangian multiform in [SNC,Section 2.4] is precisely (6.14) The associated Lax matrices and zero curvature equations also reproduce those of [SNC].

Case of higher poles
The generalisation of the ZM result to Lax matrices with higher order poles of the form (6.16) was presented in [Di]. We can reproduce it by simply allowing n i and m j in the data (6.4) to be arbitrary positive integers and by following the same steps as for simple poles. In that case we find where the coefficients are identified as Q a i −r−1 =: −U ir , i = 1, . . . , P 1 , r = 0, . . . , n i , (6.18) (6.19) and calculated from the group coordinates using the following expansions (6.21) As before, we simply assemble the elementary time flows into ∂ ξ = which have the desired Lax pair (6.15). This gives the corresponding equations of motion in zero curvature form ∂ η U (λ) − ∂ ξ V (λ) + [U (λ), V (λ)] = 0. The Lagrangian producing these equations of motion is obtained by adding the elementary Lagrangians L a i ,b j −1−1 . We give some details to show that we recover exactly [Di,Formula 20.2.12] (in the case of non coinciding poles which we consider here).
The kinetic part of L where in the last equality, we introduced g i (resp. h j ) to denote the truncation of φ a i (λ a i ) (resp. φ b j (λ b j )) up to the order n i (resp. m j ), in order to help make the comparison with Dickey's formula. The equality holds since the truncation is possible under the residue. We also denoted The potential term reads, noting that ι λa i ι µ b j = ι µ b j ι λa i when a i = b j , We obtain Dickey's Lagrangian, up to an overall sign and a relative sign due to a different convention in the zero-curvature equation, by taking the following sums (6.22) 6.1.3 Interplay between hierarchies associated to simple and higher order poles Following Proposition 3.8, the Lax matrices read, for each n ≥ −n i − 1 and i = 1, . . . , P 1 , and for each m ≥ −m j − 1 and j = 1, . . . , P 2 : (6.24) At first glance, it is tempting to suggest that a Dickey hierarchy with certain fixed order n i and m j simply sits higher or lower in another Dickey hierarchy with different fixed n i and m j . The situation is much more complicated in general. To illustrate what we mean and show that this is too naive, let us focus on the field content of a Lax matrix around a pole a and compare the ZM case (where a is a simple pole) with the Dickey case (where a has order n 1 + 1 > 1). The corresponding Lax matrices are Q ZM,a n−r (λ − a) r+1 , n ≥ −1, (6.25) and V D,a n (λ) = − (6.26) In general, it is always the case that the Dickey hierarchy contains the ZM case as its lowest level. Indeed, (6.27) and it suffices to choose A D n 1 = A ZM 0 to see that this is equal to However, the crucial point is that Q ZM,a (λ a ) and Q D,a (λ a ) are constructed as orbit around different elements in general so the phase is different in general. This means that the previous identification only gives some of the fields of the Dickey case which happen to be identifiable with the full phase space for ZM. The "converse" is not true in general. The Dickey case can only be seen as a higher flow in the ZM hierarchy if we construct it around a special element of the form n 1 r=0 A D r (λ − a) r+1 with A D r = 0 for r = 1, . . . , n 1 and A D n 1 = A ZM 0 . In that case, we see that V D,a n (λ) = V ZM,a n+n 1 (λ) (6.29) so that the two hierarchies simply correspond to shifting the starting point in the elementary times t a j . This discussion was local in the sense that we looked at a typical pole a. Of course, similar conclusion hold around the other poles. If one assemble them to obtain compound times, then the situation is similar but technically more complicated. The summary is that in general, the Dickey case is a genuine generalisation of the ZM case unless it is constructed as an orbit around a specific element dictated by the ZM element. Of course, this comparison extends to the corresponding Lagrangians since the building objects are the same as for the Lax matrices.

Trigonometric Zakharov-Mikhailov models
We can repeat the construction of the previous subsection only by changing the r-matrix from rational to trigonometric. To the best of our knowledge, this produces for the first time a new class of models which we call trigonometric Zakharov-Mikhailov models.
For conciseness, we simply illustrate this on the simplest example of simple poles in the data (6.4). To derive the elementary Lax matrices, we need to use the trigonometric formula in Proposition 3.8 which bring interesting differences compared to the rational case, already for the lowest times t a i −1 and t It will be convenient to introduce the following notations, for M ∈ gl N P + + 1 2 In particular M = M > + M < . We derive from our general formula the following elementary Lagrangian The last term represents the main difference with the rational case, see (6.10).
We now show that the so-called anisotropic chiral model presented in Section 6 of [FR] can be obtained as a particular case of our trigonometric ZM lagrangians and ZS Lax matrices. We will refer to it as anisotropic Faddeev-Reshetikhin model to avoid the confusion with the "anisotropic chiral model" terminology used in [FR] which would assume that we parametrise the currents differently from our coadjoint parametrization, see (6.34).
We proceed in two steps. First, we specialise our data as follows. in (6.3), we take P 1 = P 2 = 1 and write a 1 = a and b 1 = b. In (6.4), we simply write We also restrict g to sl 2 3 . Second, we apply the automorphism discussed in Appendix A to make the connection with [FR] easier. Let us denote for convenience t a −1 = ξ, t b −1 = η, (6.34) and the Lax pair (6.30), The Lagrangian (6.33) becomes Varying with respect to φ a 0 and φ b 0 , the EL equations read 4 Projecting on the basis J 0, (6.39) The residue at infinity of the zero curvature equation for the Lax pair (6.35)-(6.36) yields the equation −∂ η J < 0 + ∂ ξ J < 1 + [J < 0 , J < 1 ] = 0 in addition to (6.38). However, when projecting, one can see that this is a consequence of the system (6.39).
To make the comparison with the equations for the fields S 1,2,3 and T 1,2,3 used in [FR], we use the automorphism mentioned above and express the final answer using the Pauli matrices σ 1,2,3 . We also implement the changes λ → e 2λ , a → e 2a , b → e −2a to go from rational to hyperbolic parametrisation. We find and e λ/2σ 3 V (e 2λ )e −λ/2σ 3 = − 1 2 w 1 (λ + a) 1 2 e −a J + 1 + e a J − 1 σ 1 + w 2 (λ + a) i 2 e −a J + 1 − e a J − 1 σ 2 + w 3 (λ + a)J 3 1 σ 3 , 3 Thus, it would be more accurate to say that we derive the sl2 anisotropic FR model, as opposed to the su (2) version of [FR]. This is not important for our considerations here. 4 The property Tr J0J < 1 = Tr J > 0 J1 is useful in deriving the EL equations.
where w 1 (λ) = w 2 (λ) = 1 sinh λ , w 3 (λ) = coth λ. It remains to compare with the Lax operator (6.22) in [FR] and remember that they work with x and t instead of the light-cone coordinates ξ and η. This leads to the identifications        (6.40) Using (6.40), eqs (6.39) become (6.42) which are of the same form as (6.26)-(6.27) in [FR] when moving from the light-cone coordinates ξ, η to the coordinates x, t.

Deformed sigma and Gross-Neveu models
Here, we show how to produce the Lax pair and Lagrangian for the model discussed in [ABW,Section 16.2] (see also [By] and references therein for the particular case of rank M = 1) as a particular case of our construction. In fact, more than just this single Lagrangian and its Lax pair, we can in principle generate all the elementary Lagrangians in the whole Lagrangian multiform and all the elementary Lax pairs for the hierarchy containing this model as its main representative. This explains the origin of the integrability of such a class of models and is seen to be a particular case of our construction. The idea is to apply a reduction, in the spirit of [Mik], to a Zakharov-Mikhailov model. The r-matrix could in principle be any skew-symmetric solution of the CYBE as we have already mentioned. Of course, if we want to resort to our explicit formulas for Lagrangians or Lax matrices, then it will be either the rational or trigonometric one since we have given an explicit construction only in those cases. Nevertheless, we will write most results without specifying the r-matrix to emphasize this observation.
Choose the data in (6.3)-(6.4) as follows S = {a, a * } , a / ∈ R , g = gl N , (6.43) (6.44) In particular, we chose N a = 1 = N a * . As mentioned, we want to use the idea of reduction which we implement as a reality condition on the objects of the theory. Writing we require Q a * k = − (Q a k ) † for all k ≥ −1. Accordingly, at the group level, we require that when writing ϕ a k (λ − a) k (6.47) and ϕ a * (λ a * ) = ∞ k=0 ϕ a * k (λ − a * ) k (6.48) we must have ϕ a k = ϕ a * k † for all k ≥ 0. Then, for any skew-symmetric r-matrix which is well-defined at λ = a and µ = a * , a direct computation gives (6.49) It remains to choose A as a rank M matrix and parametrize it as This is the Lagrangian given in [ABW] (without the covariant derivative), with the relation to their notation being r a (A) 1 = Tr 2 (r 12 (a, a * )A 2 ) so that the potential term reads The interpretation of the parameter appearing in the r-matrix (a here, s in [ABW]) is clear in our context: it corresponds to the pole structure of the constant matrix in our data (6.4). The corresponding Lax pair is derived from (3.14) and reads, with K = U V , V a −1 (λ) = Tr 2 r 12 (λ, a)K † 2 , V a * −1 (λ) = − Tr 2 r 12 (λ, a * )K 2 , (6.51) and coincides with the Lax connection (16.7) in [ABW]. Hence, the zero curvature equation yields ∂ z Tr 2 res a λ r 12 (λ, a)K † 2 = Tr 2 res a λ r 12 (λ, a)K † 2 , Tr 2 r 12 (a, a * )K 2 , ∂z Tr 2 res a * λ r 12 (λ, a * )K 2 = Tr 2 r 12 (λ, a)K † 2 , Tr 2 res a * λ r 12 (λ, a * )K 2 , which reduces to 5 ∂ z K † = K † , Tr 2 r 12 (a, a * )K 2 , ∂zK 2 = Tr 2 r 12 (λ, a)K † 2 , K , In our opinion, it is rather beautiful that our generating Lagrangian multiform produces this class of models which was originally obtained via a completely different method, related to 4d Chern-Simons theory (see the conclusion for details and references). Unlike the latter method which necessarily focuses on a single Lagrangian at a time, we can also obtain all the Lagrangians corresponding to the higher commuting flows of the hierarchy, if desired.

Coupling integrable hierarchies together
To show the flexibility of the construction, we explain by way of two examples how we can couple integrable field theories together in a simple way. The reader familiar with integrable hierarchies will recognize the procedure of assembling elementary time flows and the corresponding Lax matrices into linear combinations. What we gain here is the possibility to derive the corresponding Lagrangian (multiform) systematically for the new model as well. The procedure is an analog in the ultralocal case of the construction presented in [DLMV1] for a class of non ultralocal field theories. Unlike the latter, the coupling here is at the level of an entire hierarchy. We give an example in the rational class and one in the trigonometric class of models. In the rational class, we couple together the AKNS hierarchy with the hierarchy of the Faddeev-Reshetikhin model (the simplest instance of a ZM model). In the trigonometric class, we couple the sine-Gordon hierarchy as discussed in Section 5 with the hierarchy of the anisotropic Faddeev-Reshetikhin model as presented in Section 6.2. In each case, for conciseness, we present all the details for the lowest levels of the hierarchy but it should be clear by now that one can extract higher levels (Lagrangians and Lax matrices) systematically if desired.

AKNS-FR hierarchy
To couple models in the AKNS hierarchy with models in the simplest ZM hierarchy (with two poles), we assemble the corresponding data as and we choose where A, B are constant sl 2 matrices. The parameter α is the coupling between the two theories: α = 0 gives a pure FR theory while sending α to infinity produces a pure AKNS hierarchy. The effect of multiplying F AKN S (λ) = −iσ 3 by α is to yield Q ∞ (λ ∞ ) = αQ(λ) where Q(λ) is the AKNS series (4.4). Hence the Lax matrix V ∞ n (λ) is equal to the AKNS Lax matrix V n (λ) multiplied by α. With this in mind, we have for instance V ∞ 1 (λ) = −iαλσ 3 + αQ 1 . For simplicity, we illustrate the coupling by looking at the two main models in each hierarchy( NLS in AKNS and FR in ZM), i.e. by considering the Lax pair with associated times ξ and η respectively. This choice of Lax pair corresponds to assembling the four flows t ∞ 1 , t ∞ 2 (AKNS) and t a −1 , t −a −1 (FR) Denoting Q a −1 = J 0 and Q −a −1 = J 1 and recalling the above comments on the effect of multiplying by α, we have The zero curvature equation ∂ η U (λ) − ∂ ξ V (λ) + [U (λ), V (λ)] = 0 yields the following four (matrix) equations by looking at the residue at λ = a, λ = −a, λ = ∞ and at the constant term in the 1/λ expansion respectively, Setting α = 0, (7.6) and (7.7) gives the FR version of the principal chiral model [ZM2,FR] which is usually written as (7.10) In the limit α → ∞ (recall that ∂ ξ scales like α∂ t 1 and ∂ η scales like α∂ t 2 , with t 1 , t 2 the NLS times), we see that (7.8)-(7.9) yield the NLS system (4.13)-(4.14) The Lagrangian of this coupled model is obtained by adding the NLS Lagrangian L ∞∞

Discussion and conclusion
By introducing a certain generating Lagrangian multiform, we were able to relate two important but so far separate aspects of integrable systems: the well established theory of the classical rmatrix and the comparatively much newer framework of Lagrangian multiforms. In doing so, we bring closer together the vast amount of results in the Hamiltonian approach to integrable systems and the Lagrangian approach in the form advocated in the seminal paper [LN]. A rich byproduct of this effort is that the generating Lagrangian multiform and its accompanying generating Lax equation and zero curvature equation provide a systematic framework to construct integrable hierarchies of field theories, both in terms of Lagrangians and of Lax matrices. This was illustrated at length over many examples, both known and new. As already emphasised in the introduction, this versatility to accommodate a very large class of examples stems from the fact that we work in the adèlic framework.
The most immediate question that comes to mind relates to the restrictions imposed on the classical r-matrix appearing in the generating Lagrangian multiform. Certain aspects of our construction appear to remain true under only the assumption that r is a solution of the CYBE (1.1). In particular, the restriction to the rational or trigonometric case that we studied in detail only played a role in the explicit construction of the projectors associated to the decomposition of the Lie algebra of g-valued adèles. It is easy to imagine that one could use a more general skew-symmetric r-matrix provided similar technicalities can be dealt with. Specifically, given a solution r of the CYBE, one would like to establish results along the following lines: -Define a pair of linear operators on the Lie algebra of g-valued adèles A λ (g) as π ± : A λ (g) −→ A λ (g), X(λ) −→ (π ± X) a (λ a ) a∈CP 1 (8.1) with formulas similar to e.g. (2.20).
-Show that the linear maps π ± so defined are projection operators onto complementary subspaces of A λ (g), i.e. (π ± ) 2 = π ± and π + + π k − = id. -Show that the images π ± A λ (g) of the projection operators π ± are both Lie subalgebras of A λ (g) and are isotropic with respect to the bilinear form analogous to that defined in (2.2).
If one could accomplish this then it would follow that one would have a direct sum decomposition of A λ (g) into complementary Lagrangian Lie subalgebras A λ (g) = π + A λ (g) ∔ π − A λ (g).
The corresponding r-matrix would be defined as r := π + − π − ∈ End A λ (g) and would presumably have a kernel of the form (ι µ b ι λa + ι λa ι µ b )r 12 (λ, µ) a,b∈CP 1 . We could then use this kernel into our generating Lagrangian multiform and construct integrable hierarchies by the same method as we have done. One candidate to see if such a programme can be realised is the elliptic r-matrix [Sk, Be].
The other obvious restriction of the present work is the condition that r be skew-symmetric. In fact, we wrote the CYBE (1.1) in its non-skew-symmetric form on purpose. Once again, some of our results appear to hold without this assumption. This is the case for the commutativity of the vector fields (3.9) as can be seen from the proof of Proposition 3.4. The extension of our construction to the non-skew-symmetric case, hence to non-ultralocal field theories, appears rather challenging as the current form of our generating Lagrangian multiform simply does not allow for such an extension. We are currently investigating this exciting issue which promises to have connection with the framework of classical affine Gaudin models, developed in [V1,DLMV2], that provides a unifying formalism for constructing and studying a very broad class of non-ultralocal classical integrable field theories.
It was shown in [V2] that classical affine Gaudin models are closely related to 4d mixed topological-holomorphic Chern-Simons theory introduced and studied in [Cos1,Cos2,CWY1,CWY2,CY], see also [DLMV3,BSV,LV]. In fact, 4d Chern-Simons theory also naturally provides a framework for constructing a very broad class of ultralocal integrable field theories (see also [Zo] for a description of ultralocal integrable field theories as affine Gaudin models). In this context, it was shown in [CSV], see also [FSY], that the rational Zakharov-Mikhailov models, one of the main classes of examples that we reproduced here, could be obtained from 4d Chern-Simons theory with certain line defects. However, the construction of [CSV] is, by design, able to produce only the action of a single Zakharov-Mikhailov model, as opposed to its entire hierarchy, starting from that of 4d Chern-Simons theory. It seems natural to wonder if such a construction, and in fact the whole 4d Chern-Simons approach, could be adapted to our generating Lagrangian multiform framework in order to derive entire integrable hierarchies and not just single models from this point of view.
In the simplest case of the AKNS hierarchy, the concept of Hamiltonian multiform, initially introduced in [CS2], was illustrated in [CS3]. The main idea is that it is possible to apply a version of the covariant Legendre transformation to an entire Lagrangian multiform to obtain the Hamiltonian analog of a multiform. Each coefficient of the resulting Hamiltonian multiform can be seen as a covariant Hamiltonian for the field theory described by the associated Lagrangian coefficient in the Lagrangian multiform. Important accompanying objects are the symplectic multiform and the multitime Poisson bracket which generalise to an entire hierarchy the concepts of multisymplectic form and of covariant Poisson bracket respectively. The latter are essential ingredients of the framework generally called covariant Hamiltonian field theory, see e.g. [G] and references therein for a very useful recent review of the many facets of this rich topic. We believe it is important to try and obtain the generating Hamiltonian multiform and related structures corresponding to our generating Lagrangian multiform. Indeed, historically, one of the driving motivations of the above mentioned covariant Hamiltonian approach to field theory has been to allow for a (canonical) quantization of field theories that removes from the start the breaking of covariance associated to the standard Hamiltonian approach. The idea of covariant Hamiltonian field theory is to use a Poisson bracket that does not suffer from the lack of covariance of the traditional Poisson bracket: a covariant Poisson bracket. The results of [CS2,CS3] show that one can extend this idea to a whole integrable hierarchy and that the classical r-matrix plays a key role in this "covariant" context, see also [CS1,CSV]. The hope is that this could allow one to use the nice features of integrability encoded in the passage from the classical r-matrix to the quantum R-matrix, to fully implement the idea of covariant canonical quantization for such field theories.
Finally, our work also opens the possibility for quantization using another route: combining Feynman's path integral ideas with a Lagrangian multiform, thus taking advantage again of integrability features now encoded in a Lagrangian object entering the path integral. This tantalising idea was first put forward and explored in [KN] but is still very much in its infancy.