1 Introduction

Determining whether a given two-dimensional classical field theory is integrable is something of an art. It requires finding a connection \(d + {\mathcal {L}}\) on the two-dimensional space-time \(\varSigma \), valued in some complex Lie algebra \(\mathfrak {g}^\mathbb {C}\), such that:

  (a) It depends meromorphically on an auxiliary Riemann surface C,

  (b) It is on-shell flat,

  (c) The integrals of motion constructed from it are in involution.

In this article, we shall restrict attention to the case when \(C = \mathbb {C}P^1\). We fix a global holomorphic coordinate z on \(\mathbb {C}\subset \mathbb {C}P^1\), called the spectral parameter.

Given the difficulty of the above task, one can turn the tables by seeking instead to construct connections with all the above properties and, only a posteriori, identify which classical integrable field theories they correspond to.

Very recently, two different approaches for constructing integrable field theories in this way have been developed.

The first, proposed in [40] and further developed more recently in [11, 31], is rooted in the representation theory of untwisted affine Kac–Moody algebras, or more precisely in the theory of Gaudin models associated with such algebras. The basic idea for constructing connections \(d + {\mathcal {L}}\) with all of the above desired properties is, roughly, to choose a representation of a certain infinite-dimensional Lie algebra associated with the datum of the Gaudin model and apply it to the corresponding canonical element \(I_A \otimes I^A\), where \(\{ I^A \}\) is a basis of this Lie algebra and \(\{ I_A \}\) is a basis of its dual. Specifically, under this representation, we obtain [40]

$$\begin{aligned} I_A \otimes I^A \; \longmapsto \; \omega (\partial _\sigma + {\mathcal {L}}_\sigma ) \end{aligned}$$

where \({\mathcal {L}}_\sigma \) is the component of the 1-form \({\mathcal {L}} = {\mathcal {L}}_\sigma d\sigma + {\mathcal {L}}_\tau d\tau \) along the spatial direction which we assume here to be a circle \(S^1\). By construction, it depends meromorphically on the spectral parameter z. The prefactor \(\omega \) is a meromorphic 1-form which in terms of the spectral parameter z is given by

$$\begin{aligned} \omega = \varphi (z) dz \end{aligned}$$
(1.1)

where \(\varphi \) is known as the twist function. The latter controls the form of the Poisson bracket of \({\mathcal {L}}_\sigma \) with itself which guarantees property (c). Note that this approach is intrinsically formulated within the Hamiltonian framework. In particular, the temporal component \({\mathcal {L}}_\tau \) of the on-shell flat connection \(d + {\mathcal {L}}\), which satisfies also (a) and (b) above, is induced by evolution with the Hamiltonian.

The second approach, proposed recently in [7], is based on a four-dimensional variant of Chern–Simons theory which was used in the earlier works [2,3,4,5,6, 42] to describe integrable lattice models. In fact, two types of integrable field theories were considered in [7], associated with so-called order and disorder defects, respectively. We shall restrict attention to the latter class here. The action of the four-dimensional theory reads (note that we take the same overall factor as used in [41])

$$\begin{aligned} S[A] = \frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} \omega \wedge CS(A), \end{aligned}$$
(1.2)

where CS(A) is the Chern–Simons 3-form and \(\omega \) is a meromorphic 1-form on \(\mathbb {C}P^1\) with zeroes. The four-dimensional \(\mathfrak {g}^\mathbb {C}\)-valued 1-form \(A = A_\sigma d\sigma + A_\tau d\tau + A_{{\bar{z}}} d{\bar{z}}\) is taken to have no dz-component: since \(\omega \) is proportional to dz, any such component would drop out of the action. To relate A to a connection on \(\varSigma \), one can write

$$\begin{aligned} d + A = \widehat{g} (d + {\mathcal {L}}) \widehat{g}^{-1} \end{aligned}$$

for some smooth \(G^\mathbb {C}\)-valued function \(\widehat{g}\) on \(\varSigma \times \mathbb {C}P^1\) and 1-form \({\mathcal {L}} = {\mathcal {L}}_\sigma d\sigma + {\mathcal {L}}_\tau d\tau \). The equations of motion derived from the action (1.2) ensure that \({\mathcal {L}}\) satisfies both properties (a) and (b). Crucially, these are accompanied by boundary equations of motion for the values of \(A_\sigma \) and \(A_\tau \) at the poles of \(\omega \). What determines the integrable field theory in this approach is then the choice of boundary conditions imposed on \(A_\sigma \) and \(A_\tau \) to ensure these boundary equations of motion hold.

It was shown recently in [41] that the two approaches outlined above are closely related. In particular, the Poisson bracket of \({\mathcal {L}}_\sigma \) with itself derived from a canonical analysis of the action (1.2) coincides with the nonultralocal Poisson algebra obtained in the affine Gaudin model approach, where the 1-forms \(\omega \) in both approaches are identified. It follows that the connection \(d + {\mathcal {L}}\) constructed using the Chern–Simons approach of [7] also satisfies property (c), as required.

The purpose of this article is to show that many of the integrable \(\sigma \)-models which had previously been described as realisations of affine Gaudin models can equally be described using action (1.2). Specifically, we identify the boundary conditions on the 1-form A which give rise to: the principal chiral model with WZ-term (already covered in [7]), the homogeneous Yang–Baxter deformation of the principal chiral model, the Yang–Baxter \(\sigma \)-model with WZ-term, the \(\lambda \)-deformation of the principal chiral model and the bi-Yang–Baxter \(\sigma \)-model.

More precisely, we will suppose that the 1-form \(\omega \) has at most double poles and consider three general classes of boundary conditions that can be imposed on the 1-form A at the set \({{\varvec{z}}}\) of poles of \(\omega \). These are determined by a choice of Lagrangian subalgebra of either the semi-direct product \(\mathfrak {g}\ltimes \mathfrak {g}_{\mathrm{ab}}\), where \(\mathfrak {g}\) is a real form of \(\mathfrak {g}^\mathbb {C}\) and \(\mathfrak {g}_\mathrm{ab}\) is an abelian copy of \(\mathfrak {g}\), the direct sum \(\mathfrak {g}\oplus \mathfrak {g}\) or the complexification \(\mathfrak {g}^\mathbb {C}\). They are, respectively, imposed at a real double pole, at a pair of real simple poles or at a pair of complex conjugate simple poles.

One of the main results of the present paper, Theorem 3.2, is that if we impose any combination of the above three types of boundary conditions on A, then the four-dimensional action (1.2) reduces to the two-dimensional action

$$\begin{aligned} S[ \{ g_x\}_{x \in {{\varvec{z}}}} ] = \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } \langle {{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}}, g_x^{-1} d g_x \rangle - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x], \end{aligned}$$
(1.3)

where the two-dimensional field \(g_x : \varSigma \rightarrow G\) is defined as the restriction \(\widehat{g}|_{\varSigma \times \{ x \}}\) for all \(x \in {{\varvec{z}}}\) and \(I_{\mathrm{WZ}}[g_x]\) denotes the corresponding Wess–Zumino term.

The meromorphic 1-form \({\mathcal {L}}\) can be expressed in terms of the set of fields \(\{ g_x \}_{x \in {{\varvec{z}}}}\) by solving the boundary conditions on A, so that the action is then a functional of these fields only. By construction, the equations of motion for these fields obtained by varying (1.3) are equivalent to the flatness of the connection \(d + {\mathcal {L}}\).

The two-dimensional action (1.3) unifies the actions of many integrable \(\sigma \)-models which had previously been described in the affine Gaudin model approach.

We also give an interpretation of Poisson–Lie T-duality in this context as arising in the case when the Lagrangian subalgebra of either \(\mathfrak {g}\oplus \mathfrak {g}\) or \(\mathfrak {g}^\mathbb {C}\) belongs to a Manin triple, i.e. there is a complementary Lagrangian subalgebra in \(\mathfrak {g}\oplus \mathfrak {g}\) or \(\mathfrak {g}^\mathbb {C}\).

Finally, we also consider a fourth kind of natural boundary condition on A imposed at a pair of simple poles, and for which the two-dimensional action (1.3) also holds. Imposing this boundary condition, we recover the action for the \(\textsf {E} \)-model also from (1.3). We stress, however, that this particular example is on a different footing to all of the others considered in this paper since the 1-form \({\mathcal {L}}\) in this case vanishes on-shell and so trivially satisfies condition (b) above.

2 The four-dimensional action

Let \(G^\mathbb {C}\) be a complex semisimple Lie group with Lie algebra \(\mathfrak {g}^\mathbb {C}\), on which we fix a choice of nondegenerate invariant symmetric bilinear form \(\langle \cdot , \cdot \rangle : \mathfrak {g}^\mathbb {C}\times \mathfrak {g}^\mathbb {C}\rightarrow \mathbb {C}\).

Let \(\mathbb {C}P^1\,{:}{=}\, \mathbb {C}\cup \{ \infty \}\) denote the Riemann sphere. We shall fix a choice of global holomorphic coordinate z on \(\mathbb {C}\subset \mathbb {C}P^1\).

2.1 Bulk and boundary equations of motion

Consider action (1.2) where \(\omega \) is a meromorphic 1-form on \(\mathbb {C}P^1\) and the Chern–Simons 3-form for the 1-form \(A = A_\sigma d\sigma + A_\tau d\tau + A_{{\bar{z}}} d{\bar{z}}\) is given by

$$\begin{aligned} CS(A) = \langle A, d A + {\small \frac{2}{3}} A \wedge A \rangle = \langle A, d A + {\small \frac{1}{3}} [A \wedge A] \rangle . \end{aligned}$$

The second equality uses the fact that \(B \wedge B = {\small \frac{1}{2}}[B \wedge B]\) for any \(\mathfrak {g}^\mathbb {C}\)-valued 1-form B. Note also that for any \(\mathfrak {g}^\mathbb {C}\)-valued 1-forms B, C and D we have

$$\begin{aligned} \langle B, [C \wedge D] \rangle = \langle C, [D \wedge B] \rangle \end{aligned}$$
(2.1)

by the invariance and symmetry of the bilinear form \(\langle \cdot , \cdot \rangle \).
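Both facts can be checked in components. The following is a short sketch, assuming local coordinates \(x^i\) on \(\varSigma \times \mathbb{C}P^1\) (a notation introduced here only for illustration) and writing \(B = B_i \, dx^i\), and similarly for C and D:

$$\begin{aligned} B \wedge B&= B_i B_j \, dx^i \wedge dx^j = \tfrac{1}{2} [B_i, B_j] \, dx^i \wedge dx^j = \tfrac{1}{2} [B \wedge B], \\ \langle B, [C \wedge D] \rangle&= \langle B_i, [C_j, D_k] \rangle \, dx^i \wedge dx^j \wedge dx^k = \langle C_j, [D_k, B_i] \rangle \, dx^j \wedge dx^k \wedge dx^i = \langle C, [D \wedge B] \rangle . \end{aligned}$$

The first line uses the antisymmetry of \(dx^i \wedge dx^j\). The second uses \(\langle X, [Y, Z] \rangle = \langle Y, [Z, X] \rangle \) for \(X, Y, Z \in \mathfrak{g}^\mathbb{C}\), together with the fact that a cyclic permutation of three 1-forms produces no sign.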

Varying action (1.2) with respect to the field A, we find

$$\begin{aligned} \delta S[A] = \frac{\mathrm{i}}{2 \pi } \int _{\varSigma \times \mathbb {C}P^1} \omega \wedge \langle \delta A, F(A) \rangle + \frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} d\omega \wedge \langle A, \delta A \rangle , \end{aligned}$$

where \(F(A) \,{:}{=}\, dA + A \wedge A\). The last term comes from applying Stokes’s theorem, and we have removed the boundary term using the fact that A vanishes at the boundary of \(\varSigma \times \mathbb {C}P^1\). The variation of the action therefore vanishes provided that

$$\begin{aligned} \omega \wedge F(A)&= 0, \end{aligned}$$
(2.2a)
$$\begin{aligned} d\omega \wedge \langle A, \delta A \rangle&= 0. \end{aligned}$$
(2.2b)

Equation (2.2a) is the bulk equation of motion, while Eq. (2.2b) will be referred to as the boundary equation of motion since \(d\omega \) is a distribution supported at the set \({{\varvec{z}}}\) of poles of \(\omega \) (see the proof of Lemma 2.1).

More explicitly, the \({\bar{z}}\)-, \(\tau \)- and \(\sigma \)-components of the bulk equation (2.2a) read

$$\begin{aligned} \partial _\sigma A_\tau - \partial _\tau A_\sigma + [A_\sigma , A_\tau ]&= 0, \end{aligned}$$
(2.3a)
$$\begin{aligned} \omega \big ( \partial _{{\bar{z}}} A_\sigma - \partial _\sigma A_{{\bar{z}}} + [A_{{\bar{z}}}, A_\sigma ] \big )&= 0, \end{aligned}$$
(2.3b)
$$\begin{aligned} \omega \big ( \partial _{{\bar{z}}} A_\tau - \partial _\tau A_{{\bar{z}}} + [A_{{\bar{z}}}, A_\tau ] \big )&= 0. \end{aligned}$$
(2.3c)

We have kept the factor of \(\omega \) in the last two equations since \(\partial _{{\bar{z}}} A_\sigma \) and \(\partial _{{\bar{z}}} A_\tau \) may be distributions on \(\mathbb {C}P^1\), with support at the zeroes of \(\omega \).

In order to rewrite the boundary equation of motion (2.2b) more explicitly, we begin by introducing some notation. Let \(\xi _x\) be a local holomorphic coordinate around \(x \in {{\varvec{z}}}\). Explicitly \(\xi _x = z - x\) for \(x \in {{\varvec{z}}} {\setminus } \{ \infty \}\) and \(\xi _\infty = z^{-1}\) for the point at infinity. It will also be convenient to introduce the shorthand notation \(f|_x \,{:}{=}\, f|_{\varSigma \times \{ x \}}\) for the function on \(\varSigma \) obtained by evaluating any function f on \(\varSigma \times \mathbb {C}P^1\) at \(x \in \mathbb {C}P^1\).

Lemma 2.1

The boundary equation of motion (2.2b) can be rewritten as

$$\begin{aligned} \sum _{x \in {{\varvec{z}}}} \sum _{p \ge 0} ({{\,\mathrm{res}\,}}_x \xi _x^p \omega ) \epsilon _{ij} \frac{1}{p!} \partial ^p_{\xi _x} \langle A_i, \delta A_j \rangle \big |_x = 0, \end{aligned}$$
(2.4)

where there is an implicit sum over the repeated space-time indices \(i, j = \tau , \sigma \).

Proof

The pole part of the 1-form \(\omega \) at each \(x \in {{\varvec{z}}}\) can be expressed as

$$\begin{aligned} \sum _{p \ge 0} \frac{k^{(x)}_p}{\xi _x^{p+1}} d\xi _x \end{aligned}$$
(2.5)

in the local variable \(\xi _x\) at x, where \(k^{(x)}_p \,{:}{=}\, {{\,\mathrm{res}\,}}_x \xi _x^p \omega \). Note that this also deals with the point at infinity if \(\infty \in {{\varvec{z}}}\). Concretely, since \(\xi _\infty = z^{-1}\) is the local variable at infinity, this means that the pole part of \(\omega \) there takes the form \(- \sum _{p \ge 0} k^{(\infty )}_p z^{p-1} dz\). We then have

$$\begin{aligned} d \omega = 2 \pi \mathrm{i}\sum _{x \in {{\varvec{z}}}} \sum _{p \ge 0} \frac{k^{(x)}_p}{p!} (-1)^{p+1} \partial ^p_{\xi _x} \delta _{\xi _x 0} d\xi _x \wedge d{\bar{\xi }}_x, \end{aligned}$$

where \(\delta _{\xi _y 0}\) denotes the Dirac \(\delta \)-distribution at y, with the property that

$$\begin{aligned} \int _{\mathbb {C}P^1} d\xi _y \wedge d{\bar{\xi }}_y \delta _{\xi _y 0} f = f|_y \end{aligned}$$

for any smooth function \(f : \mathbb {C}P^1\rightarrow \mathbb {C}\).

Integrating \(d\omega \wedge \langle A, \delta A \rangle = \epsilon _{ij} \langle A_i, \delta A_j \rangle d\omega \wedge d\sigma \wedge d\tau \) over a small open neighbourhood of \({{\varvec{z}}} \subset \mathbb {C}P^1\) using the above expression for \(d\omega \) gives the desired result. \(\square \)
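The last step of this proof is an integration by parts against the derivatives of the \(\delta \)-distribution. As a brief sketch, for a smooth test function f defined near x (f is a placeholder introduced here), one has

$$\begin{aligned} \int d\xi_x \wedge d{\bar{\xi}}_x \, \big( \partial^p_{\xi_x} \delta_{\xi_x 0} \big) f = (-1)^p \int d\xi_x \wedge d{\bar{\xi}}_x \, \delta_{\xi_x 0} \, \partial^p_{\xi_x} f = (-1)^p \, \partial^p_{\xi_x} f \big|_x . \end{aligned}$$

Combined with the factor \((-1)^{p+1}\) in the expression for \(d\omega \), every term acquires the same overall sign, so that with \(f = \epsilon_{ij} \langle A_i, \delta A_j \rangle \) the common factor \(- 2 \pi \mathrm{i}\) can be dropped, leaving (2.4).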

When the 1-form \(\omega \) has at most double poles, which is the case we shall focus on in the present paper, the boundary equation of motion (2.4) simply reads

$$\begin{aligned} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) \epsilon _{ij} \langle A_i|_x, \delta A_j|_x \rangle + \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \epsilon _{ij} \partial _{\xi _x} \langle A_i, \delta A_j \rangle \big |_x = 0. \end{aligned}$$
(2.6)
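As an illustration of (2.6), consider the 1-form \(\omega = (1 - z^2) z^{-2} dz\). This is an example chosen here for concreteness and is not fixed by the discussion above. It has double poles at 0 and \(\infty \), so \({{\varvec{z}}} = \{ 0, \infty \}\), and simple zeroes at \(\pm 1\). In the local coordinates \(\xi_0 = z\) and \(\xi_\infty = z^{-1}\), for which \(dz = - \xi_\infty^{-2} d\xi_\infty \), one finds

$$\begin{aligned} \omega&= \left( \frac{1}{\xi_0^2} - 1 \right) d\xi_0, \qquad \mathrm{res}_0 \, \omega = 0, \quad \mathrm{res}_0 \, \xi_0 \omega = 1, \\ \omega&= \left( \frac{1}{\xi_\infty^2} - 1 \right) d\xi_\infty, \qquad \mathrm{res}_\infty \, \omega = 0, \quad \mathrm{res}_\infty \, \xi_\infty \omega = 1. \end{aligned}$$

Under this assumption the first sum in (2.6) is absent, and the boundary equation of motion reduces to \(\epsilon_{ij} \partial_{\xi_0} \langle A_i, \delta A_j \rangle |_0 + \epsilon_{ij} \partial_{\xi_\infty} \langle A_i, \delta A_j \rangle |_\infty = 0\).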

Following the approach of [5, 7], we will impose appropriate boundary conditions on the 1-form A to ensure that the boundary equation of motion (2.6) is satisfied. Let us note that for a given meromorphic 1-form \(\omega \), different boundary conditions can be chosen, leading to different \(\sigma \)-models. We therefore postpone the detailed description of the various boundary conditions we shall consider until Sect. 4, concentrating for the time being on aspects which are common to all these choices.

2.2 Gauge transformations

The group consisting of smooth \(G^\mathbb {C}\)-valued functions u on \(\varSigma \times \mathbb {C}P^1\) acts on the space of \(\mathfrak {g}^\mathbb {C}\)-valued connections \(d+A\), considered in Sect. 2.1, by formal gauge transformations

$$\begin{aligned} d+A \longmapsto d+A^u \,{:}{=}\, u(d+A)u^{-1} = d - duu^{-1} + uAu^{-1} . \end{aligned}$$
(2.7)

Such transformations act on the curvature of A by conjugation, namely

$$\begin{aligned} F(A^u) = u F(A) u^{-1}. \end{aligned}$$
(2.8)

Thus, they are symmetries of the bulk equation of motion (2.2a). However, they are in general not symmetries of the boundary equation of motion (2.2b). In the rest of this article, we will use the term ‘gauge transformation’ to refer to the transformations \(A \mapsto A^u\) which preserve the boundary conditions imposed on the field A at the poles \({{\varvec{z}}}\) of \(\omega \), while keeping the denomination of ‘formal gauge transformation’ to describe the most general ones. In particular, only gauge transformations leave action (1.2) invariant and can thus be interpreted as local symmetries of the model.
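The conjugation law (2.8) can be verified directly from (2.7). A short sketch, using the shorthand \(\rho \,{:}{=}\, du \, u^{-1}\) (introduced here only for this check) together with \(d(u^{-1}) = - u^{-1} du \, u^{-1}\), so that \(d\rho = \rho \wedge \rho \):

$$\begin{aligned} d A^u&= - \rho \wedge \rho + \rho \wedge u A u^{-1} + u A u^{-1} \wedge \rho + u \, dA \, u^{-1}, \\ A^u \wedge A^u&= \rho \wedge \rho - \rho \wedge u A u^{-1} - u A u^{-1} \wedge \rho + u (A \wedge A) u^{-1}. \end{aligned}$$

Adding the two lines, all terms involving \(\rho \) cancel and one is left with \(F(A^u) = u (dA + A \wedge A) u^{-1}\).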

2.3 Lax connection

In order to connect A with the Lax connection of an integrable \(\sigma \)-model, one should work in a formal gauge where the \(d{\bar{z}}\)-component vanishes.

Indeed, if we denote by \({\mathcal {L}}\) the 1-form A in such a formal gauge, then \({\mathcal {L}}\) would only have components along \(d\sigma \) and \(d\tau \); it would be on-shell flat by the first bulk equation of motion (2.3a), and its dependence on \(\mathbb {C}P^1\) would be meromorphic by virtue of the remaining two bulk equations of motion (2.3b) and (2.3c). These are exactly the properties of a Lax connection of an integrable \(\sigma \)-model.

It is important to note that \({\mathcal {L}}\) is related to A only by a formal gauge transformation (2.7), which need not preserve the boundary conditions imposed on A. In particular, one cannot compute the value of the action (1.2) in this formal gauge. However, recall from Sect. 2.2 that, crucially, formal gauge transformations preserve the bulk equations (2.3): this is what allowed us to interpret \(\mathcal {L}\) as a Lax connection in the previous paragraph.

Let us be more explicit about the construction sketched above. Finding the formal gauge mentioned in the previous paragraph means writing the 1-form A in the form

$$\begin{aligned} A = - d \widehat{g} \widehat{g}^{-1} + \widehat{g} {\mathcal {L}} \widehat{g}^{-1}, \end{aligned}$$
(2.9)

for some smooth function \(\widehat{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\), denoted by \(\widehat{\sigma }^{-1}\) in [7], and where \({\mathcal {L}} \,{:}{=}\, {\mathcal {L}}_\sigma d\sigma + {\mathcal {L}}_\tau d\tau \) has no \(d{\bar{z}}\)-component, i.e. \({\mathcal {L}}_{{\bar{z}}} = 0\).

Substituting (2.9) into the bulk equation of motion (2.3a) implies that \({\mathcal {L}}\) is on-shell flat, while substituting it into (2.3b) and (2.3c) tells us that

$$\begin{aligned} \omega \wedge \partial _{{\bar{z}}} {\mathcal {L}} = 0. \end{aligned}$$
(2.10)

It follows from (2.10) that \({\mathcal {L}}\) is meromorphic with poles at the zeroes of \(\omega \), with the order of each pole of \({\mathcal {L}}\) being at most equal to the multiplicity of the corresponding zero of \(\omega \). In other words, \(\omega \wedge {\mathcal {L}}\) has the same poles as \(\omega \) and of the same order.
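One way to see how (2.10) arises is sketched below, under the assumption that A is written as in (2.9). Since (2.9) is the formal gauge transformation (2.7) of \({\mathcal{L}}\) by \(\widehat{g}\), the conjugation law (2.8) gives \(F(A) = \widehat{g} F({\mathcal{L}}) \widehat{g}^{-1}\). Because \({\mathcal{L}}_{{\bar{z}}} = 0\), the mixed components of the curvature of \({\mathcal{L}}\) reduce to

$$\begin{aligned} \partial_{{\bar{z}}} {\mathcal{L}}_\sigma - \partial_\sigma {\mathcal{L}}_{{\bar{z}}} + [{\mathcal{L}}_{{\bar{z}}}, {\mathcal{L}}_\sigma ]&= \partial_{{\bar{z}}} {\mathcal{L}}_\sigma , \\ \partial_{{\bar{z}}} {\mathcal{L}}_\tau - \partial_\tau {\mathcal{L}}_{{\bar{z}}} + [{\mathcal{L}}_{{\bar{z}}}, {\mathcal{L}}_\tau ]&= \partial_{{\bar{z}}} {\mathcal{L}}_\tau . \end{aligned}$$

Equations (2.3b) and (2.3c) then read \(\omega \, \partial_{{\bar{z}}} {\mathcal{L}}_\sigma = 0\) and \(\omega \, \partial_{{\bar{z}}} {\mathcal{L}}_\tau = 0\), which is (2.10) in components. Away from the zeroes of \(\omega \) this says that \({\mathcal{L}}\) is holomorphic in z.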

It is important to note here that there is, in fact, a large freedom in choosing a smooth \(G^\mathbb {C}\)-valued field \(\widehat{g}\) with the property (2.9). Since this will be a crucial point for us, we postpone its detailed discussion until Sect. 3.1.

We can be more explicit about the pole structure of \({\mathcal {L}}\) following [7], by making the choice that the singularity at each zero of \(\omega \) lies only in one component of \({\mathcal {L}}\) so that \(\omega \wedge CS(A)\) is regular. Let \({{\varvec{\zeta }}}\) denote the set of zeroes of the 1-form \(\omega \) which we assume to be simple. We will allow the meromorphic 1-form \({\mathcal {L}}\) to have the form

$$\begin{aligned} {\mathcal {L}} = \sum _{y \in {{\varvec{\zeta }}}} V^y \xi _y^{-1} d\sigma _y + U_\sigma d\sigma + U_\tau d\tau \end{aligned}$$
(2.11)

where \(\xi _y\) is the local coordinate at y and \(U_\tau , U_\sigma , V^y : \varSigma \rightarrow \mathfrak {g}^\mathbb {C}\) are smooth functions for each \(y \in {{\varvec{\zeta }}}\). Here, each \(\sigma _y\) for \(y \in {{\varvec{\zeta }}}\) is a linear combination of \(\sigma \) and \(\tau \).

The situation considered in [7] corresponds to the case where \(\sigma _y = w = {\small \frac{1}{2}}( \tau + \mathrm{i}\sigma )\) for some of the zeroes \(y \in {{\varvec{\zeta }}}\) and \(\sigma _y = {\bar{w}} = {\small \frac{1}{2}}(\tau - \mathrm{i}\sigma )\) for the others. This is the natural choice to obtain Euclidean invariant theories (see Remark 2.1). Since we are interested in Lorentz invariant theories rather than in Euclidean invariant ones, we will instead always (with the exception of the discussion in Sect. 6) make the choice \(\sigma _y = \sigma ^+\) for some of the zeroes \(y \in {{\varvec{\zeta }}}\) and \(\sigma _y = \sigma ^-\) for the other zeroes, where \(\sigma ^\pm \,{:}{=}\, {\small \frac{1}{2}}(\tau \pm \sigma )\) are the light-cone coordinates.

Remark 2.1

Form (2.11) is consistent with the derivation of integrable \(\sigma \)-model actions from their descriptions as affine Gaudin models [11]. Indeed, the spatial and temporal components of the Lax connection of an affine Gaudin model are given by very similar expressions (see [11, (2.39) & (2.40)] and [31, Theorem 2.1] for details); namely, we should have

$$\begin{aligned} {\mathcal {L}} = \left( \sum _{y \in {{\varvec{\zeta }}}} V^y \xi _y^{-1} + U_\sigma \right) d\sigma + \left( \sum _{y \in {{\varvec{\zeta }}}}\epsilon _y V^y \xi _y^{-1} + U_\tau \right) d\tau , \end{aligned}$$

for some fixed \(\epsilon _y \in \mathbb {C}\) for all \(y \in {{\varvec{\zeta }}}\). This is equivalent to (2.11) with \(d\sigma _y = d\sigma + \epsilon _y d\tau \).

It was also shown in [11, 31] that for the affine Gaudin model to describe a relativistic integrable \(\sigma \)-model, we should take \(\epsilon _y = \pm 1\) for each \(y \in {{\varvec{\zeta }}}\). This analysis can be generalised to show that the theory is Euclidean invariant if \(\epsilon _y = \pm \mathrm{i}\) for each \(y \in {\varvec{\zeta }}\). This was precisely the choice made in [7], i.e. \(\sigma _y = w, {\bar{w}}\). \(\vartriangleleft \)
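The correspondence between the choices of \(\sigma_y\) and the values of \(\epsilon_y\) can be made explicit. As a quick check, writing each differential as a constant multiple of \(d\sigma + \epsilon_y d\tau \):

$$\begin{aligned} d\sigma^\pm&= \tfrac{1}{2} (d\tau \pm d\sigma ) = \pm \tfrac{1}{2} (d\sigma \pm d\tau ),&\epsilon_y&= \pm 1, \\ dw&= \tfrac{1}{2} (d\tau + \mathrm{i}\, d\sigma ) = \tfrac{\mathrm{i}}{2} (d\sigma - \mathrm{i}\, d\tau ),&\epsilon_y&= - \mathrm{i}, \\ d{\bar{w}}&= \tfrac{1}{2} (d\tau - \mathrm{i}\, d\sigma ) = - \tfrac{\mathrm{i}}{2} (d\sigma + \mathrm{i}\, d\tau ),&\epsilon_y&= + \mathrm{i}. \end{aligned}$$

In each case the constant prefactor can be absorbed into a rescaling of \(V^y\).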

2.4 Action

We will now express the action (1.2) in terms of \(\widehat{g}\) and \({\mathcal {L}}\).

Lemma 2.2

Under a formal gauge transformation as in (2.9), the Chern–Simons 3-form transforms as

$$\begin{aligned} CS(A) = \langle {\mathcal {L}}, d {\mathcal {L}} \rangle + d \langle \widehat{g}^{-1} d \widehat{g}, {\mathcal {L}} \rangle - {\small \frac{1}{3}} \langle \widehat{g}^{-1} d \widehat{g}, \widehat{g}^{-1} d \widehat{g} \wedge \widehat{g}^{-1} d \widehat{g} \rangle . \end{aligned}$$
(2.12)

Proof

The behaviour of the Chern–Simons 3-form under formal gauge transformations,

$$\begin{aligned} CS(A) = CS({\mathcal {L}}) + d \langle \widehat{g}^{-1} d \widehat{g}, {\mathcal {L}} \rangle - {\small \frac{1}{3}} \langle \widehat{g}^{-1} d \widehat{g}, \widehat{g}^{-1} d \widehat{g} \wedge \widehat{g}^{-1} d \widehat{g} \rangle , \end{aligned}$$

is well known. In the present context, since the 1-form \({\mathcal {L}}\) only has components along \(d\sigma \) and \(d\tau \), we have \(CS({\mathcal {L}}) = \langle {\mathcal {L}}, d {\mathcal {L}} \rangle \) from which we deduce (2.12). Since this is an essential result on which the derivation of the two-dimensional action in Sect. 3.3 rests, we recall its proof in detail below for completeness.

Following [7], it is convenient to introduce the notation

$$\begin{aligned} \widehat{A} \,{:}{=}\, - d \widehat{g} \widehat{g}^{-1}, \quad A' \,{:}{=}\, \widehat{g} {\mathcal {L}} \widehat{g}^{-1}. \end{aligned}$$

We have the identity (valid for any 1-form A decomposed as a sum \(A = \widehat{A} + A'\))

$$\begin{aligned} CS(A)&= \langle \widehat{A} + A', d\widehat{A} + dA' \rangle + {\frac{1}{3}} \langle \widehat{A} + A', [\widehat{A} \wedge \widehat{A}] + 2 [A' \wedge \widehat{A}] + [A' \wedge A'] \rangle \nonumber \\&= CS(\widehat{A}) + 2 \langle A', F(\widehat{A}) \rangle - d \langle \widehat{A}, A' \rangle + \langle \widehat{A}, [A' \wedge A'] \rangle + CS(A'),\nonumber \\ \end{aligned}$$
(2.13)

where in the second line we have made use of (2.1) to rearrange terms. This is to be compared with [7, (8.8)], noting that \([A' \wedge A'] = 2 A' \wedge A'\).

The second term on the right-hand side of (2.13) vanishes by virtue of the fact that \(F(\widehat{A}) = 0\), since \(\widehat{A}\) is formally pure gauge. On the other hand,

$$\begin{aligned} CS(A')&= \langle A', d A' \rangle = \big \langle \widehat{g} {\mathcal {L}} \widehat{g}^{-1}, \big [ d\widehat{g} \widehat{g}^{-1} \wedge \widehat{g} {\mathcal {L}} \widehat{g}^{-1} \big ] \big \rangle + \langle {\mathcal {L}}, d {\mathcal {L}} \rangle \\&= - \langle A', [\widehat{A} \wedge A'] \rangle + \langle {\mathcal {L}}, d {\mathcal {L}} \rangle = - \langle \widehat{A}, [A' \wedge A'] \rangle + \langle {\mathcal {L}}, d {\mathcal {L}} \rangle , \end{aligned}$$

where in the first equality we have used \(\langle A', A' \wedge A' \rangle = 0\), which follows using the fact that \(A'\) only has \(d\sigma \)- and \(d\tau \)-components. The second and third equalities are by definition of \(\widehat{A}\) and \(A'\), while the last equality follows from (2.1). Finally, we have

$$\begin{aligned} CS(\widehat{A})&= \langle \widehat{A}, d \widehat{A} \rangle + {\small \frac{2}{3}} \langle \widehat{A}, \widehat{A} \wedge \widehat{A} \rangle = - {\small \frac{1}{3}} \langle \widehat{A}, \widehat{A} \wedge \widehat{A} \rangle . \end{aligned}$$

The second equality here uses the fact that \(F(\widehat{A}) = 0\) so that \(d\widehat{A} = - \widehat{A} \wedge \widehat{A}\). Putting all of the above together, we obtain the desired identity (2.12). \(\square \)
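The second equality in the computation of \(CS(A')\) in the proof above rests on the exterior derivative of \(A' = \widehat{g} {\mathcal{L}} \widehat{g}^{-1}\). A short sketch of that step, using \(d(\widehat{g}^{-1}) = - \widehat{g}^{-1} d\widehat{g} \, \widehat{g}^{-1}\) and the fact that \({\mathcal{L}}\) is a 1-form:

$$\begin{aligned} d A'&= d\widehat{g} \wedge {\mathcal{L}} \, \widehat{g}^{-1} + \widehat{g} \, d{\mathcal{L}} \, \widehat{g}^{-1} - \widehat{g} {\mathcal{L}} \wedge d(\widehat{g}^{-1}) \\&= d\widehat{g} \, \widehat{g}^{-1} \wedge A' + A' \wedge d\widehat{g} \, \widehat{g}^{-1} + \widehat{g} \, d{\mathcal{L}} \, \widehat{g}^{-1} = \big[ d\widehat{g} \, \widehat{g}^{-1} \wedge A' \big] + \widehat{g} \, d{\mathcal{L}} \, \widehat{g}^{-1}. \end{aligned}$$

Pairing with \(A'\) and using the invariance of the bilinear form under conjugation, \(\langle \widehat{g} {\mathcal{L}} \widehat{g}^{-1}, \widehat{g} \, d{\mathcal{L}} \, \widehat{g}^{-1} \rangle = \langle {\mathcal{L}}, d{\mathcal{L}} \rangle \), reproduces the expression used there.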

Lemma 2.3

For \({\mathcal {L}}\) of form (2.11), we have \(\omega \wedge \langle {\mathcal {L}}, d {\mathcal {L}} \rangle = 0\).

Proof

It follows from the explicit form (2.11) of the Lax connection that

$$\begin{aligned} \omega \wedge \langle {\mathcal {L}}, d {\mathcal {L}} \rangle = - 2 \pi \mathrm{i}\sum _{y \in {{\varvec{\zeta }}}} \omega \wedge \big \langle {\mathcal {L}}, V^y \delta _{\xi _y 0} d{\bar{\xi }}_y \wedge d\sigma _y \big \rangle . \end{aligned}$$

Consider each term in the sum over \(y \in {{\varvec{\zeta }}}\) individually. Since this already contains an explicit \(d\sigma _y\), the corresponding term in \({\mathcal {L}}\) which is singular at y cannot contribute. Thus, only the terms which are regular at y can contribute from \({\mathcal {L}}\). On the other hand, since y is a simple zero of \(\omega \), it follows that \(\omega \, \delta _{\xi _y 0} = 0\). Thus, each term in the above sum over \(y \in {{\varvec{\zeta }}}\) vanishes, as required. \(\square \)
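The two cancellations used in this proof can be written out. As a sketch, split \({\mathcal{L}}\) near a given zero y as \({\mathcal{L}} = V^y \xi_y^{-1} d\sigma_y + {\mathcal{L}}^{(y)}_{\mathrm{reg}}\), where \({\mathcal{L}}^{(y)}_{\mathrm{reg}}\) (a notation introduced here) collects all terms of (2.11) that are regular at y, and write \(\omega = \varphi (z) dz\) as in (1.1). Then

$$\begin{aligned} \big\langle V^y \xi_y^{-1} d\sigma_y , V^y \delta_{\xi_y 0} \, d{\bar{\xi}}_y \wedge d\sigma_y \big\rangle&= 0 \quad \text{since } d\sigma_y \wedge d\sigma_y = 0, \\ \omega \wedge \big\langle {\mathcal{L}}^{(y)}_{\mathrm{reg}} , V^y \delta_{\xi_y 0} \, d{\bar{\xi}}_y \wedge d\sigma_y \big\rangle&= \varphi (y) \, \delta_{\xi_y 0} \, dz \wedge \big\langle {\mathcal{L}}^{(y)}_{\mathrm{reg}} , V^y d{\bar{\xi}}_y \wedge d\sigma_y \big\rangle = 0 \quad \text{since } \varphi (y) = 0. \end{aligned}$$

The second line uses that the coefficients of \({\mathcal{L}}^{(y)}_{\mathrm{reg}}\) are smooth at y, so that multiplying by \(\delta_{\xi_y 0}\) evaluates \(\varphi \) at its zero.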

Substituting (2.12) into action (1.2) and using Lemma 2.3, we thus obtain

$$\begin{aligned} S[A]&= - \frac{\mathrm{i}}{12 \pi } \int _{\varSigma \times \mathbb {C}P^1} \omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, \widehat{g}^{-1} d \widehat{g} \wedge \widehat{g}^{-1} d \widehat{g} \rangle \nonumber \\&\quad - \frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} d\omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, {\mathcal {L}} \rangle , \end{aligned}$$
(2.14)

where in the second line we have used Stokes’s theorem and the fact that all fields are assumed to vanish at the boundary of \(\varSigma \times \mathbb {C}P^1\) to get rid of the boundary term.

2.5 Reality conditions

Action (1.2) is a functional of the complex valued 1-forms \(\omega \) and A. Without imposing conditions on \(\omega \) and A, it is certainly not real.

However, we will want to use this four-dimensional theory to construct the actions of two-dimensional integrable \(\sigma \)-models. In order to ensure that the latter are all real, we will impose suitable reality conditions on the 1-forms \(\omega \) and A so as to make the four-dimensional action (1.2) real itself.

Let \(\tau : \mathfrak {g}^\mathbb {C}\rightarrow \mathfrak {g}^\mathbb {C}\) be an anti-linear involutive automorphism of the complex Lie algebra \(\mathfrak {g}^\mathbb {C}\). It provides \(\mathfrak {g}^\mathbb {C}\) with an action of the cyclic group \(\mathbb {Z}_2\). Its fixed point subset is a real Lie subalgebra \(\mathfrak {g}\) of \(\mathfrak {g}^\mathbb {C}\), regarded itself as a real Lie algebra. The anti-linear involution \(\tau \) is compatible with the bilinear form on \(\mathfrak {g}^\mathbb {C}\) in the sense that

$$\begin{aligned} \overline{\langle B, C \rangle } = \langle \tau B, \tau C \rangle \end{aligned}$$
(2.15)

for any \(B, C \in \mathfrak {g}^\mathbb {C}\). We will also denote by \(\tau \) its lift to an involutive automorphism \(\tau : G^\mathbb {C}\rightarrow G^\mathbb {C}\) of the Lie group \(G^\mathbb {C}\) and denote by G its fixed point real subgroup.

Complex conjugation \(z \mapsto {\bar{z}}\) on \(\mathbb {C}\subset \mathbb {C}P^1\) defines an involution \(\mu _{\mathrm{t}} : \mathbb {C}P^1\rightarrow \mathbb {C}P^1\), which also provides \(\mathbb {C}P^1\) with a \(\mathbb {Z}_2\)-action. We will require both the 1-forms \(\omega \) and A to be equivariant under this action of \(\mathbb {Z}_2\) in the sense that

$$\begin{aligned} \overline{\omega } = \mu _{\mathrm{t}}^*\omega , \quad \tau A = \mu _{\mathrm{t}}^*A. \end{aligned}$$
(2.16)

Concretely, in terms of the twist function \(\varphi \), defined from \(\omega \) in (1.1), the first condition simply states that \(\overline{\varphi (z)} = \varphi ({\bar{z}})\).
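This can be seen in one line. Since \(\mu_{\mathrm{t}}\) acts as \(z \mapsto {\bar{z}}\), and writing \(\omega = \varphi (z) dz\),

$$\begin{aligned} \overline{\omega } = \overline{\varphi (z)} \, d{\bar{z}}, \qquad \mu_{\mathrm{t}}^*\omega = \varphi ({\bar{z}}) \, d{\bar{z}}, \end{aligned}$$

so the first condition in (2.16) holds if and only if \(\overline{\varphi (z)} = \varphi ({\bar{z}})\). As an example, any rational function \(\varphi \) with real coefficients satisfies this. In particular, the poles and zeroes of such an \(\omega \) are either real or come in complex conjugate pairs.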

Lemma 2.4

The reality conditions (2.16) ensure that the action (1.2) is real.

Proof

We have

$$\begin{aligned} \overline{S[A]}&= -\frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} \overline{\omega } \wedge CS(\tau A) = -\frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} \mu _{\mathrm{t}}^*\omega \wedge CS(\mu _{\mathrm{t}}^*A) \nonumber \\&= -\frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} \mu _{\mathrm{t}}^*(\omega \wedge CS(A)) = -\frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mu _{\mathrm{t}} \mathbb {C}P^1} \omega \wedge CS(A) = S[A], \end{aligned}$$
(2.17)

where in the first equality we used the fact that

$$\begin{aligned} \overline{CS(A)} = \overline{\big \langle A, d A + {\small \frac{1}{3}} [A \wedge A] \big \rangle } = \big \langle \tau A, d (\tau A) + {\small \frac{1}{3}} [\tau A \wedge \tau A] \big \rangle = CS(\tau A). \end{aligned}$$

In the middle step here, we have used both the identity (2.15) and the fact that \(\tau \) is an automorphism of \(\mathfrak {g}^\mathbb {C}\). The second step in (2.17) is by the equivariance property (2.16) of \(\omega \) and A. The very last step in (2.17) uses the fact that \(\mu _{\mathrm{t}}\) has the effect of conjugating the complex structure on \(\mathbb {C}P^1\) and thus also of reversing its orientation. Concretely, the integral over \(\mu _{\mathrm{t}} \mathbb {C}P^1\) with measure \(d{\bar{z}} \wedge dz\) is equal to the integral over \(\mathbb {C}P^1\) with measure \(d z \wedge d{\bar{z}}\). \(\square \)

Upon writing the 1-form A as in (2.9), to satisfy its equivariance property (2.16), we will impose the equivariance property

$$\begin{aligned} \tau \widehat{g} = \mu _{\mathrm{t}}^*\widehat{g}, \quad \tau {\mathcal {L}} = \mu _{\mathrm{t}}^*{\mathcal {L}} \end{aligned}$$
(2.18)

for the function \(\widehat{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) and \(\mathfrak {g}^\mathbb {C}\)-valued 1-form \({\mathcal {L}} = {\mathcal {L}}_\sigma d\sigma + {\mathcal {L}}_\tau d\tau \).
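That (2.18) is sufficient for the equivariance of A in (2.16) can be checked directly from (2.9). A sketch, assuming that \(\tau \) acts compatibly on the group and on the Lie algebra (as it does for the lift introduced above) and using that pullbacks commute with d:

$$\begin{aligned} \tau A&= - d(\tau \widehat{g}) (\tau \widehat{g})^{-1} + (\tau \widehat{g}) (\tau {\mathcal{L}}) (\tau \widehat{g})^{-1} \\&= - d(\mu_{\mathrm{t}}^*\widehat{g}) (\mu_{\mathrm{t}}^*\widehat{g})^{-1} + (\mu_{\mathrm{t}}^*\widehat{g}) (\mu_{\mathrm{t}}^*{\mathcal{L}}) (\mu_{\mathrm{t}}^*\widehat{g})^{-1} = \mu_{\mathrm{t}}^*A. \end{aligned}$$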

3 Integrable \(\sigma \)-model actions

3.1 Freedom in the choice of \({\varvec{\widehat{g}}}\)

Notice that (2.9) is equivalent to saying that \(A_{{\bar{z}}}\) is of the form

$$\begin{aligned} A_{{\bar{z}}} = - \partial _{{\bar{z}}} \widehat{g} \widehat{g}^{-1}. \end{aligned}$$
(3.1)

The smooth function \(\widehat{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) in this expression is by no means unique.

On the one hand, we can multiply it on the right by an arbitrary smooth function \(h : \varSigma \rightarrow G\) since we have

$$\begin{aligned} A_{{\bar{z}}} = - \partial _{{\bar{z}}} (\widehat{g} h) (\widehat{g} h)^{-1}, \end{aligned}$$
(3.2)

which is still of form (3.1). In order to preserve the equivariance of \(\widehat{g}\) in (2.18), we need h to take values in the real subgroup \(G \subset G^\mathbb {C}\) so that \(\tau h = h\).

Note that such a transformation \(\widehat{g}\mapsto \widehat{g}h\) does not modify \(A_{{\bar{z}}}\) and is thus a redundancy in definition (3.1) of \(\widehat{g}\) in terms of \(A_{{\bar{z}}}\). Recall also that this definition was obtained as the \(d{\bar{z}}\)-component of (2.9) and that the corresponding \(d\tau \)- and \(d\sigma \)-components serve as a definition of the Lax connection \(\mathcal {L}\) in terms of \(A_\tau \), \(A_\sigma \) and \(\widehat{g}\). One easily checks that for fixed A, the redundancy \(\widehat{g}\mapsto \widehat{g}h\) in the definition of \(\widehat{g}\) corresponds to the transformation

$$\begin{aligned} \mathcal {L} \longmapsto h^{-1} dh + h^{-1} \mathcal {L}h \end{aligned}$$
(3.3)

on \(\mathcal {L}\). This is a two-dimensional gauge transformation of the Lax connection \(\mathcal {L}\). It is well known that such a freedom on \(\mathcal {L}\) is always allowed in any integrable field theory, as it preserves its on-shell flatness.
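The transformation (3.3) can be verified in components. Equation (2.9) is not reproduced in this section; the sketch below assumes that its \(d\tau \)- and \(d\sigma \)-components read \(A_i = - \partial _i \widehat{g} \widehat{g}^{-1} + \widehat{g} {\mathcal {L}}_i \widehat{g}^{-1}\) for \(i = \tau , \sigma \), which is the form consistent with (3.1). Solving for \({\mathcal {L}}_i\) and replacing \(\widehat{g}\) by \(\widehat{g} h\) at fixed A then gives

$$\begin{aligned} {\mathcal {L}}_i&= \widehat{g}^{-1} A_i \widehat{g} + \widehat{g}^{-1} \partial _i \widehat{g},\\ {\mathcal {L}}_i&\longmapsto h^{-1} \big ( \widehat{g}^{-1} A_i \widehat{g} + \widehat{g}^{-1} \partial _i \widehat{g} \big ) h + h^{-1} \partial _i h = h^{-1} {\mathcal {L}}_i h + h^{-1} \partial _i h. \end{aligned}$$

On-shell flatness is preserved because the curvature transforms by conjugation:

$$\begin{aligned} \partial _\tau {\mathcal {L}}_\sigma - \partial _\sigma {\mathcal {L}}_\tau + [{\mathcal {L}}_\tau , {\mathcal {L}}_\sigma ] \longmapsto h^{-1} \big ( \partial _\tau {\mathcal {L}}_\sigma - \partial _\sigma {\mathcal {L}}_\tau + [{\mathcal {L}}_\tau , {\mathcal {L}}_\sigma ] \big ) h. \end{aligned}$$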

On the other hand, we can also perform a gauge transformation on the connection A by a smooth function \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) since the \(d{\bar{z}}\)-component of the gauge-transformed connection

$$\begin{aligned} A^u = - d u u^{-1} + u A u^{-1} \end{aligned}$$
(3.4)

is still of the form (3.1), explicitly

$$\begin{aligned} A^u_{{\bar{z}}} = - \partial _{{\bar{z}}}(u \widehat{g}) (u \widehat{g})^{-1}. \end{aligned}$$
(3.5)

However, it is important to note that u cannot be completely arbitrary here. Indeed, the gauge transformation by u must also preserve the boundary conditions imposed on A (which is why, following the terminology of Sect. 2.2, we call it a gauge transformation and not a formal gauge transformation). For \(A^u\) to be real, we must also require that u be equivariant under the action of \(\mathbb {Z}_2\).

Note that the transformation \(\widehat{g}\mapsto u\widehat{g}\) is of a different nature than the transformation \(\widehat{g}\mapsto \widehat{g}h\) considered in the previous paragraph. Indeed, the latter corresponds to a redundancy in the definition of \(\widehat{g}\) in terms of A and does not alter A itself, while the transformation \(\widehat{g}\mapsto u\widehat{g}\) corresponds to a gauge transformation on A. Moreover, the parameter h considered in (3.2) was a two-dimensional field on \(\varSigma \), independent of z and \({\bar{z}}\), while the parameter u in (3.4) is a four-dimensional field on \(\varSigma \times \mathbb {C}P^1\). Finally, let us note that contrary to the transformation \(\widehat{g} \mapsto \widehat{g} h\), the gauge transformation \(\widehat{g}\mapsto u\widehat{g}\) does not modify the Lax connection \(\mathcal {L}\).
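Both statements about the transformation \(\widehat{g}\mapsto u\widehat{g}\) can be checked by a short computation. Formula (3.5) follows from (3.4) and (3.1) alone:

$$\begin{aligned} A^u_{{\bar{z}}} = - \partial _{{\bar{z}}} u \, u^{-1} - u \, \partial _{{\bar{z}}} \widehat{g} \, \widehat{g}^{-1} u^{-1} = - \big ( \partial _{{\bar{z}}} u \, \widehat{g} + u \, \partial _{{\bar{z}}} \widehat{g} \big ) \widehat{g}^{-1} u^{-1} = - \partial _{{\bar{z}}} (u \widehat{g}) (u \widehat{g})^{-1}. \end{aligned}$$

For the invariance of \({\mathcal {L}}\), we assume, as a sketch, that the \(d\tau \)- and \(d\sigma \)-components of (2.9) amount to \({\mathcal {L}}_i = \widehat{g}^{-1} A_i \widehat{g} + \widehat{g}^{-1} \partial _i \widehat{g}\) for \(i = \tau , \sigma \). Denoting by \({\mathcal {L}}^u\) the 1-form built from \(A^u\) and \(u \widehat{g}\), one finds

$$\begin{aligned} {\mathcal {L}}^u_i&= (u \widehat{g})^{-1} \big ( - \partial _i u \, u^{-1} + u A_i u^{-1} \big ) (u \widehat{g}) + (u \widehat{g})^{-1} \partial _i (u \widehat{g})\\&= - \widehat{g}^{-1} u^{-1} \partial _i u \, \widehat{g} + \widehat{g}^{-1} A_i \widehat{g} + \widehat{g}^{-1} u^{-1} \partial _i u \, \widehat{g} + \widehat{g}^{-1} \partial _i \widehat{g} = {\mathcal {L}}_i. \end{aligned}$$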

3.2 Archipelago conditions

The action (2.14) derived in the previous section holds for an arbitrary meromorphic differential \(\omega \), in particular with poles of any order. It is, however, still four-dimensional, like the original action (1.2).

In order to reduce action (2.14) to a two-dimensional one, we will exploit the large freedom in the choice of \(\widehat{g}\) discussed in Sect. 3.1. Specifically, in the remainder of this section, we will identify sufficient conditions on the function \(\widehat{g}\), which guarantee that the action (2.14) can be explicitly reduced to an action on \(\varSigma \). In Sect. 4, we will then identify various boundary conditions for which such conditions on \(\widehat{g}\) can be made to hold by using the freedom discussed in Sect. 3.1.

We will say that a smooth equivariant function \(\widehat{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) is of archipelago type if it satisfies the following three archipelago conditions:

  1. (i)

    \(\widehat{g} = 1\) outside \(\varSigma \times \bigsqcup _{x \in {{\varvec{z}}}} U_x\) for some disjoint open discs \(U_x\) around \(x \in {{\varvec{z}}}\),

  2. (ii)

    \(\widehat{g}_x \,{:}{=}\, \widehat{g}|_{\varSigma \times U_x}\) only depends on \(\sigma \), \(\tau \) and the radial coordinate \(r_x \,{:}{=}\, |\xi _x|\),

  3. (iii)

    There is an open disc \(V_x \subset U_x\) for every \(x \in {{\varvec{z}}}\) such that \(g_x \,{:}{=}\, \widehat{g}|_{\varSigma \times V_x}\) only depends on \(\sigma \) and \(\tau \). By a slight abuse of notation, we also denote its further restriction \(\widehat{g}|_{\varSigma \times \{ x \}}\) to the point \(x \in {{\varvec{z}}}\) as \(g_x\).

Lemma 3.1

One can always ensure that the smooth \(G^\mathbb {C}\)-valued function \(\widehat{g}\) appearing in (3.1) satisfies the archipelago condition (i).

Proof

We will bring the function \(\widehat{g}\) to a form which satisfies the archipelago condition (i) by applying a suitable gauge transformation (3.4) for some smooth function u.

Given any disjoint open discs \(U_x\) around each \(x \in {{\varvec{z}}}\), we can choose a smooth function \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) such that \(u = \widehat{g}^{-1}\) outside \(\varSigma \times \bigsqcup _{x \in {{\varvec{z}}}} U_x\) and \(u = 1\) in some open neighbourhood of \(\varSigma \times {{\varvec{z}}}\). The latter condition is there to ensure that the gauge transformation by u preserves the boundary conditions at \({{\varvec{z}}}\). By construction, the new function \(u \widehat{g}\) appearing in (3.5) satisfies condition (i). \(\square \)

By contrast, conditions (ii) and (iii) are not always satisfied. Whether or not \(\widehat{g}\) can be made to satisfy them depends on the type of boundary conditions that are imposed on the Chern–Simons field A at the poles of \(\omega \) in order to satisfy (2.4).

3.3 Two-dimensional action with WZ-terms

Suppose that \(\widehat{g}\) can be chosen to be of archipelago type. We will show that the four-dimensional action (2.14) can then be further simplified to a two-dimensional action with WZ-terms.

Consider, to begin with, the first term in action (2.14). It can be written as

$$\begin{aligned}&- \frac{\mathrm{i}}{12 \pi } \int _{\varSigma \times \mathbb {C}P^1} \omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, \widehat{g}^{-1} d \widehat{g} \wedge \widehat{g}^{-1} d \widehat{g} \rangle \\&\quad = - \frac{\mathrm{i}}{12 \pi } \sum _{x \in {{\varvec{z}}}} \int _{\varSigma \times U_x} \omega \wedge \langle \widehat{g}_x^{-1} d \widehat{g}_x, \widehat{g}_x^{-1} d \widehat{g}_x \wedge \widehat{g}_x^{-1} d \widehat{g}_x \rangle \end{aligned}$$

using property (i) of the archipelago-type function \(\widehat{g}\), cf. Sect. 3.2, to localise the integral over \(\mathbb {C}P^1\) to the individual discs \(U_x\) around each \(x \in {{\varvec{z}}}\).

In each disc \(U_x\) centred on \(x \in {{\varvec{z}}} {\setminus } \{ \infty \}\), we introduce local polar coordinates \(z = x + r_x e^{\mathrm{i}\theta _x}\) and likewise \(z = r_\infty ^{-1} e^{-\mathrm{i}\theta _\infty }\) in \(U_\infty \) if \(\infty \in {{\varvec{z}}}\). We note that only the differential \(d \theta _x\) in \(dz = e^{\mathrm{i}\theta _x} (dr_x + \mathrm{i}r_x d \theta _x)\) contributes in the above integral for \(x \in {{\varvec{z}}} {\setminus } \{ \infty \}\). Indeed, since \(\widehat{g}_x\) is assumed to be independent of \(\theta _x\) in property (ii) of the archipelago type function \(\widehat{g}\), it follows that the 3-form \(\langle \widehat{g}_x^{-1} d \widehat{g}_x, \widehat{g}_x^{-1} d \widehat{g}_x \wedge \widehat{g}_x^{-1} d \widehat{g}_x \rangle \) is proportional to \(dr_x \wedge d\sigma \wedge d\tau \). Therefore, when taking the wedge product with \(\omega \), only the \(d\theta _x\) component of \(\omega \) can contribute. The same is true when \(x = \infty \). We can then rewrite the above integral as

$$\begin{aligned}&\frac{1}{12 \pi } \sum _{x \in {{\varvec{z}}} {\setminus } \{ \infty \}} \int _{\varSigma \times [0, R_x] \times [0, 2 \pi ]} r_x e^{\mathrm{i}\theta _x} \varphi \big ( x + r_x e^{\mathrm{i}\theta _x} \big ) d\theta _x \wedge \langle \widehat{g}_x^{-1} d \widehat{g}_x, \widehat{g}_x^{-1} d \widehat{g}_x \wedge \widehat{g}_x^{-1} d \widehat{g}_x \rangle \\&\quad - \frac{1}{12 \pi } \sum _{x \in {{\varvec{z}}} \cap \{ \infty \}} \int _{\varSigma \times [0, R_x] \times [0, 2 \pi ]} r_x^{-1} e^{-\mathrm{i}\theta _x} \varphi \big ( r_x^{-1} e^{-\mathrm{i}\theta _x} \big ) d\theta _x \wedge \langle \widehat{g}_x^{-1} d \widehat{g}_x, \widehat{g}_x^{-1} d \widehat{g}_x \wedge \widehat{g}_x^{-1} d \widehat{g}_x \rangle , \end{aligned}$$

where \(R_x\) is the radius of the disc \(U_x\) around \(x \in {{\varvec{z}}}\). Performing the integrals over the angular variables \(\theta _x\) for each \(x \in {{\varvec{z}}}\), we now deduce that when \(\widehat{g}\) is of archipelago type, the first term in action (2.14) reduces to

$$\begin{aligned}&- \frac{\mathrm{i}}{12 \pi } \int _{\varSigma \times \mathbb {C}P^1} \omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, \widehat{g}^{-1} d \widehat{g} \wedge \widehat{g}^{-1} d \widehat{g} \rangle = - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x]. \end{aligned}$$

Here, we introduce the standard WZ-term

$$\begin{aligned} I_{\mathrm{WZ}}[g_x] \,{:}{=}\, - \frac{1}{3} \int _{\varSigma \times [0, R_x]} \langle \widehat{g}_x^{-1} d \widehat{g}_x, \widehat{g}_x^{-1} d \widehat{g}_x \wedge \widehat{g}_x^{-1} d \widehat{g}_x \rangle . \end{aligned}$$

As usual, it depends only on the two-dimensional field \(g_x : \varSigma \rightarrow G\) up to an additive constant which is irrelevant classically. Note that the overall minus sign in the above definition is there to match with the conventions of [11]. Indeed, the boundary of the volume \(\varSigma \times [0, R_x]\) being at the origin of the interval \([0, R_x]\) accounts for this extra minus sign.
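The angular integration that produces the factor \({{\,\mathrm{res}\,}}_x \omega \) can be spelled out as follows, writing \(\omega = \varphi (z) dz\) as the appearance of \(\varphi \) in the integrals above suggests. For a finite pole x and fixed \(0< r_x < R_x\), the map \(\theta _x \mapsto z = x + r_x e^{\mathrm{i}\theta _x}\) parametrises a positively oriented circle around x which encloses no other pole of \(\omega \), and along which \(dz = \mathrm{i}r_x e^{\mathrm{i}\theta _x} d\theta _x\). Hence

$$\begin{aligned} \int _0^{2\pi } r_x e^{\mathrm{i}\theta _x} \varphi \big ( x + r_x e^{\mathrm{i}\theta _x} \big ) d\theta _x = \frac{1}{\mathrm{i}} \oint _{|z - x| = r_x} \varphi (z) dz = 2 \pi \, {{\,\mathrm{res}\,}}_x \omega , \end{aligned}$$

independently of \(r_x\) and of the order of the pole. Combining this with the definition of the WZ-term, the contribution of x becomes

$$\begin{aligned} \frac{1}{12 \pi } \cdot 2 \pi ({{\,\mathrm{res}\,}}_x \omega ) \int _{\varSigma \times [0, R_x]} \langle \widehat{g}_x^{-1} d \widehat{g}_x, \widehat{g}_x^{-1} d \widehat{g}_x \wedge \widehat{g}_x^{-1} d \widehat{g}_x \rangle = \frac{1}{6} ({{\,\mathrm{res}\,}}_x \omega ) \big ( - 3 I_{\mathrm{WZ}}[g_x] \big ) = - \frac{1}{2} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x]. \end{aligned}$$

At \(x = \infty \), the circle \(z = r_\infty ^{-1} e^{-\mathrm{i}\theta _\infty }\) is traversed clockwise in the z-plane and encloses all finite poles, so the analogous integral evaluates to \(- 2 \pi \, {{\,\mathrm{res}\,}}_\infty \omega \); together with the extra minus sign in front of the second sum, this yields the same coefficient \(\frac{1}{6} {{\,\mathrm{res}\,}}_\infty \omega \).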

Consider now the second term in action (2.14). It can be rewritten as

$$\begin{aligned} - \frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} d\omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, {\mathcal {L}} \rangle = - \frac{\mathrm{i}}{4 \pi } \sum _{x \in {{\varvec{z}}}} \int _{\varSigma \times V_x} d\omega \wedge \langle g_x^{-1} d g_x, {\mathcal {L}} \rangle \end{aligned}$$
(3.6)

where we have used the fact that \(d\omega \) is a distribution with support \({{\varvec{z}}}\) to localise the integral over \(\mathbb {C}P^1\) to the open discs \(V_x\) for each \(x \in {{\varvec{z}}}\) from property (iii) of the archipelago-type function \(\widehat{g}\). Writing this distribution explicitly in terms of the local coordinates \(\xi _x\) at each \(x \in {{\varvec{z}}}\), as in the proof of Lemma 2.1, and substituting the resulting expression into (3.6), we arrive at

$$\begin{aligned} - \frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} d\omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, {\mathcal {L}} \rangle = - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \sum _{p \ge 0} \int _{\varSigma } \frac{k^{(x)}_p}{p!} \big ( \partial _{\xi _x}^p \langle g_x^{-1} d g_x, {\mathcal {L}} \rangle \big ) \big |_x, \end{aligned}$$

where \(k^{(x)}_p = {{\,\mathrm{res}\,}}_x \xi _x^p \omega \) for each \(p \in \mathbb {Z}_{\ge 0}\) and \(x \in {{\varvec{z}}}\).

Now since \(g_x\) is independent of the local coordinate \(\xi _x\) on \(V_x\) by property (iii), it follows that \(\langle g_x^{-1} d g_x, {\mathcal {L}} \rangle \) is holomorphic in a neighbourhood of the pole x of \(\omega \) by virtue of (2.10) and we may thus rewrite each term in the above sum over \(x \in {{\varvec{z}}}\) as a residue. Indeed, for any \(\psi \) holomorphic at x, we have

$$\begin{aligned} {{\,\mathrm{res}\,}}_x \omega \wedge \psi = {{\,\mathrm{res}\,}}_x \left( \sum _{p \ge 0} \frac{k^{(x)}_p}{\xi _x^{p+1}} d\xi _x \wedge \sum _{q \ge 0} \frac{1}{q!} (\partial ^q_{\xi _x} \psi )|_x \xi _x^q \right) = \sum _{p \ge 0} \frac{k^{(x)}_p}{p!} (\partial ^p_{\xi _x} \psi )|_x, \end{aligned}$$

where in the first equality we made use of the expression (2.5) for the pole part of \(\omega \) at x, as well as the Taylor expansion of \(\psi \) near x. Finally, we thus obtain

$$\begin{aligned} - \frac{\mathrm{i}}{4 \pi } \int _{\varSigma \times \mathbb {C}P^1} d\omega \wedge \langle \widehat{g}^{-1} d \widehat{g}, {\mathcal {L}} \rangle&= - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } {{\,\mathrm{res}\,}}_x \big ( \omega \wedge \langle g_x^{-1} d g_x, {\mathcal {L}} \rangle \big )\\&= - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } \langle g_x^{-1} d g_x, {{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}} \rangle . \end{aligned}$$

Notice that the sign has not changed in the last line. Moving \(\omega \) past \(g_x^{-1} dg_x\) produces one sign, and moving the operation \({{\,\mathrm{res}\,}}_x\), which is given by a contour integral over a small circle around x, past \(g_x^{-1} dg_x\) produces a second one, since it reverses the orientation of the domain of integration.
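As an illustration of the residue formula used above, suppose that \(\omega \) has at most a double pole at x, so that only \(k^{(x)}_0 = {{\,\mathrm{res}\,}}_x \omega \) and \(k^{(x)}_1 = {{\,\mathrm{res}\,}}_x \xi _x \omega \) can be non-zero. For any \(\psi \) holomorphic at x, the sum over p then truncates to

$$\begin{aligned} {{\,\mathrm{res}\,}}_x \omega \wedge \psi = ({{\,\mathrm{res}\,}}_x \omega ) \, \psi |_x + ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \, (\partial _{\xi _x} \psi )|_x. \end{aligned}$$

Applied to \(\psi = {\mathcal {L}}\), and using that \(g_x\) does not depend on \(\xi _x\), this gives

$$\begin{aligned} \langle g_x^{-1} d g_x, {{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}} \rangle = ({{\,\mathrm{res}\,}}_x \omega ) \langle g_x^{-1} d g_x, {\mathcal {L}}|_x \rangle + ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \langle g_x^{-1} d g_x, (\partial _{\xi _x} {\mathcal {L}})|_x \rangle , \end{aligned}$$

so that a double pole contributes both the value of \({\mathcal {L}}\) at x and its first derivative there, while a simple pole contributes only the first term.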

We have thus shown the following.

Theorem 3.2

If \(\widehat{g}\) is of archipelago type, then action (2.14) reduces to the sum of a two-dimensional term and a Wess–Zumino term for each point in \({{\varvec{z}}}\), namely

$$\begin{aligned} S[ \{ g_x\}_{x \in {{\varvec{z}}}} ] = \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } \langle {{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}}, g_x^{-1} d g_x \rangle - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x], \end{aligned}$$
(3.7)

where \(g_x : \varSigma \rightarrow G\) is the restriction of \(\widehat{g}\) to \(\varSigma \times \{ x \}\) for each \(x \in {{\varvec{z}}}\).

Remark 3.1

The notation that we have used for the action in (3.7) suggests that it is only a functional of \(\{ g_x\}_{x \in {{\varvec{z}}}}\), even though the right-hand side clearly also depends on \({\mathcal {L}}\). This is because, as we shall see in a case-by-case analysis of all the examples discussed in Sect. 5, the 1-form \({\mathcal {L}}\) can always be expressed in terms of the set of fields \(\{ g_x\}_{x \in {{\varvec{z}}}}\) by solving the boundary condition imposed on A.\(\vartriangleleft \)

It follows from the equivariance properties (2.16) that the set \({{\varvec{z}}}\) of poles of \(\omega \) is invariant under complex conjugation, so that \(x \in {{\varvec{z}}}\) implies \({\bar{x}} \in {{\varvec{z}}}\). Using (2.18) as well, we find that

$$\begin{aligned} \overline{{{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}}} = {{\,\mathrm{res}\,}}_{{\bar{x}}} \omega \wedge {\mathcal {L}}, \quad \overline{{{\,\mathrm{res}\,}}_x \omega } = {{\,\mathrm{res}\,}}_{{\bar{x}}} \omega . \end{aligned}$$
(3.8)

Moreover, from the equivariance property (2.18), it follows that for any \(x \in {{\varvec{z}}}\) we have \(\tau (g_x) = g_{{\bar{x}}}\) and \(\tau (\widehat{g}_x) = \widehat{g}_{{\bar{x}}}\). This, together with (3.8), implies that the action (3.7) is real, as expected since it was obtained as a reduction of (1.2), which was real by virtue of the equivariance properties (2.16) imposed on \(\omega \) and A.
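The reality argument can be made explicit term by term. The sketch below rests on two assumptions not spelled out in this section: that the bilinear form is compatible with \(\tau \) in the sense \(\langle \tau {\textsf {X}}, \tau {\textsf {Y}} \rangle = \overline{\langle {\textsf {X}}, {\textsf {Y}} \rangle }\) (presumably the content of identity (2.15)), and that the first relation in (3.8) is read as the statement that \(\tau \) maps \({{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}}\) to \({{\,\mathrm{res}\,}}_{{\bar{x}}} \omega \wedge {\mathcal {L}}\). Denoting by \(S_x\) the contribution of \(x \in {{\varvec{z}}}\) to (3.7), one then has

$$\begin{aligned} S_x&\,{:}{=}\, \frac{1}{2} \int _{\varSigma } \langle {{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}}, g_x^{-1} d g_x \rangle - \frac{1}{2} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x],\\ \overline{S_x}&= \frac{1}{2} \int _{\varSigma } \langle \tau ( {{\,\mathrm{res}\,}}_x \omega \wedge {\mathcal {L}} ), \tau ( g_x^{-1} d g_x ) \rangle - \frac{1}{2} \overline{{{\,\mathrm{res}\,}}_x \omega } \; \overline{I_{\mathrm{WZ}}[g_x]}\\&= \frac{1}{2} \int _{\varSigma } \langle {{\,\mathrm{res}\,}}_{{\bar{x}}} \omega \wedge {\mathcal {L}}, g_{{\bar{x}}}^{-1} d g_{{\bar{x}}} \rangle - \frac{1}{2} ({{\,\mathrm{res}\,}}_{{\bar{x}}} \omega ) I_{\mathrm{WZ}}[g_{{\bar{x}}}] = S_{{\bar{x}}}. \end{aligned}$$

Since \({{\varvec{z}}}\) is invariant under complex conjugation, summing over x gives \(\overline{S} = \sum _{x \in {{\varvec{z}}}} S_{{\bar{x}}} = S\). Real poles contribute terms that are individually real, while complex poles contribute in conjugate pairs.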

3.4 Two-dimensional gauge invariance

Recall from the discussion in Sect. 3.1 that there is a redundancy in definition (2.9) of both the function \(\widehat{g}\) and the 1-form \({\mathcal {L}}\) in terms of A, namely

$$\begin{aligned} \widehat{g} \longmapsto \widehat{g} h, \quad {\mathcal {L}} \longmapsto h^{-1} dh + h^{-1} \mathcal {L} h, \end{aligned}$$

for an arbitrary smooth function \(h : \varSigma \rightarrow G\). We note that the above transformation on \(\widehat{g}\) will spoil the fact that \(\widehat{g}\) is of archipelago type. However, by combining it with the gauge transformation by u defined in the proof of Lemma 3.1, we are able to bring \(\widehat{g} h\) back to being of archipelago type. Note that the gauge transformation by u leaves invariant the 1-form \({\mathcal {L}}\) so that we obtain the combined transformation

$$\begin{aligned} \widehat{g} \longmapsto u \widehat{g} h, \quad {\mathcal {L}} \longmapsto h^{-1} dh + h^{-1} \mathcal {L} h. \end{aligned}$$
(3.9)

As \(u \widehat{g} h\) is of archipelago type, action (3.7) therefore holds after performing transformation (3.9) and in particular it makes sense to ask whether it is invariant under such a transformation.

More precisely, in terms of the fields \(\{ g_x \}_{x \in {{\varvec{z}}}}\) appearing in action (3.7), transformation (3.9) acts as

$$\begin{aligned} g_x \longmapsto g_x h \end{aligned}$$
(3.10)

for all \(x \in {{\varvec{z}}}\). Here, we used the property that \(u|_x = 1\) from the proof of Lemma 3.1.

As noted in Remark 3.1, in all the cases to be considered in Sect. 5 the 1-form \({\mathcal {L}}\) will be completely fixed in terms of \(\{ g_x \}_{x \in {{\varvec{z}}}}\) by solving the boundary condition imposed on A. In this sense, transformation (3.3) on the 1-form \({\mathcal {L}}\), i.e. the second relation in (3.9), can be seen as a consequence of (3.10).

Proposition 3.3

The two-dimensional action (3.7) is invariant under the gauge transformation (3.10) for an arbitrary smooth function \(h : \varSigma \rightarrow G\).

We can fix this gauge invariance by imposing that \(g_x = 1\) for some \(x \in {{\varvec{z}}}\).

Proof

We compute \(S[ \{ g_x h \}_{x \in {{\varvec{z}}}}]\) by substituting transformations (3.10) and (3.3) into (3.7). The first term in the action reads

$$\begin{aligned}&\frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } \big \langle {{\,\mathrm{res}\,}}_x \big ( \omega \wedge (h^{-1} dh + h^{-1} \mathcal {L} h) \big ), (g_x h)^{-1} d (g_x h) \big \rangle \\&\quad = \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } \big \langle {{\,\mathrm{res}\,}}_x \big ( \omega \wedge (d h h^{-1} + {\mathcal {L}}) \big ), g_x^{-1} d g_x \big \rangle \\&\qquad + \frac{1}{2} \sum _{x \in {{\varvec{z}}}} \int _{\varSigma } \big \langle {{\,\mathrm{res}\,}}_x \big ( \omega \wedge (h^{-1} dh + h^{-1} \mathcal {L} h) \big ), h^{-1} d h \big \rangle . \end{aligned}$$

The second term on the right-hand side vanishes because \(\omega \wedge (h^{-1} dh + h^{-1} \mathcal {L} h)\) is meromorphic on \(\mathbb {C}P^1\) with poles in \({{\varvec{z}}}\), so that the sum of its residues vanishes.

On the other hand, by using the Polyakov–Wiegmann formula [33], we find

$$\begin{aligned} \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x h]&= \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[g_x] + \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) I_{\mathrm{WZ}}[h]\\&\quad - \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) \int _\varSigma \langle g_x^{-1} dg_x, dh h^{-1} \rangle . \end{aligned}$$

The second term on the right-hand side vanishes using the fact that \(\sum _{x \in {{\varvec{z}}}} {{\,\mathrm{res}\,}}_x \omega = 0\). It now follows from combining the above that \(S[ \{ g_x h \}_{x \in {{\varvec{z}}}} ] = S[ \{ g_x \}_{x \in {{\varvec{z}}}} ]\). \(\square \)
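The final cancellation in this proof can be displayed explicitly. Since \(dh h^{-1}\) does not depend on z, we have \({{\,\mathrm{res}\,}}_x ( \omega \wedge dh h^{-1} ) = ({{\,\mathrm{res}\,}}_x \omega ) \, dh h^{-1}\), and exchanging the two 1-forms in the pairing costs a sign, as in the passage from the last display of Sect. 3.3 to (3.7). Collecting the surviving terms from the two computations above gives

$$\begin{aligned} S[ \{ g_x h \}_{x \in {{\varvec{z}}}} ]&= S[ \{ g_x \}_{x \in {{\varvec{z}}}} ] + \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) \int _\varSigma \langle dh h^{-1}, g_x^{-1} dg_x \rangle \\&\quad + \frac{1}{2} \sum _{x \in {{\varvec{z}}}} ({{\,\mathrm{res}\,}}_x \omega ) \int _\varSigma \langle g_x^{-1} dg_x, dh h^{-1} \rangle = S[ \{ g_x \}_{x \in {{\varvec{z}}}} ], \end{aligned}$$

where the first extra term comes from the two-dimensional term of the action and the second from the Polyakov–Wiegmann formula, taking into account the overall factor \(- \frac{1}{2}\) multiplying the WZ-terms in (3.7).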

Remark 3.2

In the approach to integrable \(\sigma \)-models based on affine Gaudin models, the gauge transformation (3.10) and its interplay with the integrable structure were studied in detail in [31], expanding on the description of gauge symmetries in affine Gaudin models given in [40]. In particular, it was shown in [31, Proposition 2.2] (see also [40, (4.61)]) that the gauge transformation of the fundamental fields of the \(\sigma \)-model, represented here by \(\{ g_x \}_{x \in {{\varvec{z}}}}\), acts as \(d+\mathcal {L} \mapsto h^{-1}(d+\mathcal {L})h\) on its Lax connection, in agreement with the situation considered in the present paper.\(\vartriangleleft \)

4 Boundary conditions

As already mentioned in Sect. 2.1, we shall restrict attention in this paper to the case when \(\omega \) has at most double poles, in which case the boundary conditions imposed on A should ensure that (2.6) holds. In the list of examples discussed in Sect. 5, we shall consider two types of boundary conditions.

The first is imposed at a double pole \(x \in {{\varvec{z}}}\) of \(\omega \) and ensures that the corresponding term in the sum of (2.6) vanishes by itself, i.e.

$$\begin{aligned} ({{\,\mathrm{res}\,}}_x \omega ) \epsilon _{ij} \langle A_i|_x, \delta A_j|_x \rangle + ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \epsilon _{ij} \partial _{\xi _x} \langle A_i, \delta A_j \rangle \big |_x = 0. \end{aligned}$$
(4.1a)

For the discussion of reality conditions, we will assume for simplicity that x lies on the real axis. We discuss the simplest possible boundary condition in Sect. 4.1 and then come back to more general boundary conditions that can be imposed in Sect. 4.5.

The second is imposed at a pair of simple poles \(x_+, x_- \in {{\varvec{z}}}\) of \(\omega \) and ensures that the corresponding terms in the sum of (2.6) cancel each other out, i.e.

$$\begin{aligned} ({{\,\mathrm{res}\,}}_{x_+} \omega ) \epsilon _{ij} \langle A_i|_{x_+}, \delta A_j|_{x_+} \rangle + ({{\,\mathrm{res}\,}}_{x_-} \omega ) \epsilon _{ij} \langle A_i|_{x_-}, \delta A_j|_{x_-} \rangle = 0. \end{aligned}$$
(4.1b)

There are two possibilities allowed by the reality conditions, corresponding to the case when \(x_+\) and \(x_-\) are both real and when they form a complex conjugate pair. These separate cases are discussed in Sects. 4.2 and 4.3, respectively.

In Sect. 4.4, we describe Poisson–Lie T-duality in the present context, as relating different choices of boundary conditions that can be imposed at a pair of simple poles.

4.1 Boundary conditions at a real double pole

Let \(x \in {{\varvec{z}}}\) be a real double pole of \(\omega \). One way the boundary equation of motion (4.1a) can be satisfied is by demanding that [5, 7]

$$\begin{aligned} A_i|_x = 0, \end{aligned}$$
(4.2)

for \(i = \tau , \sigma \), noting that we then also have \(\delta A_i|_x = 0\).
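To see that (4.2) indeed solves (4.1a), note that the first term of (4.1a) vanishes trivially, while the second can be expanded using the Leibniz rule:

$$\begin{aligned} \partial _{\xi _x} \langle A_i, \delta A_j \rangle \big |_x = \langle (\partial _{\xi _x} A_i)|_x, \delta A_j|_x \rangle + \langle A_i|_x, (\partial _{\xi _x} \delta A_j)|_x \rangle = 0. \end{aligned}$$

Each term contains either \(A_i|_x\) or \(\delta A_j|_x\), both of which vanish. In particular, no condition is needed on the derivatives \((\partial _{\xi _x} A_i)|_x\), and the argument holds whatever the values of \({{\,\mathrm{res}\,}}_x \omega \) and \({{\,\mathrm{res}\,}}_x \xi _x \omega \).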

Proposition 4.1

Suppose that A satisfies the boundary condition (4.2), and we are given a field \(\widehat{g}\) satisfying (3.1) for which the archipelago condition (i) holds.

Then, the value of \(\widehat{g}\) on the island \(U_x\) can be modified, without changing its value at x and its value outside \(U_x\), so as to also satisfy both of the remaining two archipelago conditions (ii) and (iii).

Proof

This will be achieved by applying a suitable gauge transformation (3.4) for some smooth function \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\), equal to 1 on the complement of \(\varSigma \times U_x\) so as not to modify the value of \(\widehat{g}\) there. Note, however, that in order for \(A^u\) to still satisfy the boundary condition (4.2), it is necessary to require that \((- \partial _i u u^{-1})|_x = 0\) for \(i = \tau , \sigma \). That is, u is an allowed gauge transformation parameter provided

$$\begin{aligned} \partial _i (u |_x) = 0, \end{aligned}$$
(4.3)

for \(i = \tau , \sigma \). Also, for \(A^u\) to still satisfy the reality condition (2.16), we should require that u be equivariant in the sense that \(\tau u = \mu _{\mathrm{t}}^*u\). We are thus seeking a smooth \(\mathbb {Z}_2\)-equivariant \(G^\mathbb {C}\)-valued function u equal to 1 outside \(\varSigma \times U_x\) and satisfying (4.3), such that \(u \widehat{g}\) satisfies the archipelago conditions (ii) and (iii) on the island \(U_x\).

Consider the smooth equivariant function \(\widetilde{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) defined as follows. Let \(\widetilde{g} \,{:}{=}\, \widehat{g}\) on the complement of \(\varSigma \times U_x\). Choose two open discs \(D^r_x \subset D^s_x \subset U_x\) of radii \(s> r > 0\) centred on x. Let \(\widetilde{g}\) in \(D^r_x\) be constant equal to \(\widehat{g}|_x\), and extend it to a smooth function on \(U_x\) such that \(\widetilde{g} \,{:}{=}\, 1\) on the complement \(U_x {\setminus } D^s_x\) and \(\widetilde{g}\) depends only on the radial coordinate \(|\xi _x|\) around x. More precisely, writing \(\widehat{g}|_x = \exp y\) for some \(y : \varSigma \rightarrow \mathfrak {g}\), we let \(\widetilde{g} \,{:}{=}\, \exp (f(|\xi _x|) y)\) where \(f : [0, R_x] \rightarrow \mathbb {R}\) is a smooth function equal to 1 on [0, r] and equal to 0 on \([s, R_x]\).

By construction, \(\widetilde{g}\) satisfies both of the archipelago conditions (ii) and (iii) on the island \(U_x\). It therefore remains to show that \(u = \widetilde{g} \widehat{g}^{-1} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) satisfies (4.3) and is also \(\mathbb {Z}_2\)-equivariant. The latter condition is evident from the equivariance of \(\widetilde{g}\) and \(\widehat{g}\). On the other hand, \(\partial _i (u |_x) = \partial _i \big ( \widetilde{g}|_x \widehat{g}|_x^{-1} \big ) = 0\) where in the second equality we used the fact that \(\widetilde{g} = \widehat{g}|_x\) in \(D^r_x\) and hence \(\widetilde{g}|_x = \widehat{g}|_x\). \(\square \)
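A cutoff function f with the properties used in this proof can be written down explicitly. One standard choice, given here purely as an illustration (the auxiliary function \(\chi \) is not part of the original argument), is

$$\begin{aligned} \chi (t) \,{:}{=}\, {\left\{ \begin{array}{ll} e^{-1/t}, &{} t > 0,\\ 0, &{} t \le 0, \end{array}\right. } \qquad f(\rho ) \,{:}{=}\, \frac{\chi (s - \rho )}{\chi (s - \rho ) + \chi (\rho - r)}. \end{aligned}$$

The denominator never vanishes because \(s > r\) implies that at least one of \(s - \rho \) and \(\rho - r\) is positive. The function f is smooth, equal to 1 for \(\rho \le r\) and equal to 0 for \(\rho \ge s\). Although \(|\xi _x|\) is not smooth at \(\xi _x = 0\), the composite \(f(|\xi _x|)\) is, since f is constant near \(\rho = 0\). Consequently, \(\widetilde{g} = \exp (f(|\xi _x|) y)\) equals \(\widehat{g}|_x\) on \(D^r_x\) and 1 on \(U_x {\setminus } D^s_x\), as required.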

4.2 Boundary conditions at pairs of real simple poles

Let \(x_\pm \in {{\varvec{z}}}\) be simple poles of \(\omega \) with \(x_\pm \in \mathbb {R}\), so that in particular \({{\,\mathrm{res}\,}}_{x_\pm } \omega \in \mathbb {R}\). Also, by the equivariance property (2.16) of A it follows that the components \(A_i|_{x_\pm }\), for \(i = \tau , \sigma \), are valued in the real Lie subalgebra \(\mathfrak {g}\).

The boundary equation of motion (4.1b) can then be rewritten as

$$\begin{aligned} \epsilon _{ij} \langle \!\langle (A_i|_{x_+}, A_i|_{x_-}), \delta (A_j|_{x_+}, A_j|_{x_-}) \rangle \!\rangle _{\mathfrak {d}; x_\pm } = 0, \end{aligned}$$
(4.4)

where \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {d}; x_\pm } : \mathfrak {d}\times \mathfrak {d}\rightarrow \mathbb {R}\) denotes the nondegenerate symmetric invariant bilinear form on the Lie algebra direct sum \(\mathfrak {d}\,{:}{=}\, \mathfrak {g}\oplus \mathfrak {g}\), defined by

$$\begin{aligned} \langle \!\langle ({\textsf {x} }, {\textsf {y} }), ({\textsf {x} }', {\textsf {y} }') \rangle \!\rangle _{\mathfrak {d}; x_\pm } \,{:}{=}\, ({{\,\mathrm{res}\,}}_{x_+} \omega ) \langle {\textsf {x} }, {\textsf {x} }' \rangle + ({{\,\mathrm{res}\,}}_{x_-} \omega ) \langle {\textsf {y} }, {\textsf {y} }' \rangle \end{aligned}$$

for any \({\textsf {x} }, {\textsf {y} }, {\textsf {x} }', {\textsf {y} }' \in \mathfrak {g}\). In the special case when \({{\,\mathrm{res}\,}}_{x_+} \omega = - {{\,\mathrm{res}\,}}_{x_-} \omega \), this reduces to the usual bilinear form \(\langle {\textsf {x} }, {\textsf {x} }' \rangle - \langle {\textsf {y} }, {\textsf {y} }' \rangle \) on \(\mathfrak {d}\) up to an overall factor of \({{\,\mathrm{res}\,}}_{x_+} \omega \).

One way of ensuring that (4.4) holds is as follows. Let \((\mathfrak {d}, \mathfrak {k})\) be a Manin pair, i.e. fix a Lagrangian subalgebra \(\mathfrak {k}\) of \(\mathfrak {d}\). We recall that Lagrangian here means ‘maximal isotropic’. We can demand that, for \(i = \tau , \sigma \),

$$\begin{aligned} (A_i|_{x_+}, A_i|_{x_-}) \in \mathfrak {k}, \end{aligned}$$
(4.5)

noting that we will then also have \(\delta (A_i|_{x_+}, A_i|_{x_-}) \in \mathfrak {k}\). This then ensures (4.4) holds by virtue of the isotropy of \(\mathfrak {k}\). The reason for using a Manin pair \((\mathfrak {d}, \mathfrak {k})\) rather than just an isotropic subspace \(\mathfrak {k}\) of \(\mathfrak {d}\) will be explained shortly.
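As a simple illustration, not necessarily the choice made in the examples of the text, consider the special case \({{\,\mathrm{res}\,}}_{x_+} \omega = - {{\,\mathrm{res}\,}}_{x_-} \omega \) and take \(\mathfrak {k}\) to be the diagonal subalgebra \(\mathfrak {g}_{\mathrm{diag}} \,{:}{=}\, \{ ({\textsf {x} }, {\textsf {x} }) \, | \, {\textsf {x} } \in \mathfrak {g}\}\). It is isotropic since

$$\begin{aligned} \langle \!\langle ({\textsf {x} }, {\textsf {x} }), ({\textsf {x} }', {\textsf {x} }') \rangle \!\rangle _{\mathfrak {d}; x_\pm } = \big ( {{\,\mathrm{res}\,}}_{x_+} \omega + {{\,\mathrm{res}\,}}_{x_-} \omega \big ) \langle {\textsf {x} }, {\textsf {x} }' \rangle = 0. \end{aligned}$$

It is maximal isotropic because \(\dim \mathfrak {g}_{\mathrm{diag}} = \dim \mathfrak {g}= \frac{1}{2} \dim \mathfrak {d}\), which is the largest dimension an isotropic subspace can have for a nondegenerate form, and it is closed under the bracket since \([({\textsf {x} }, {\textsf {x} }), ({\textsf {x} }', {\textsf {x} }')] = ([{\textsf {x} }, {\textsf {x} }'], [{\textsf {x} }, {\textsf {x} }'])\). For this choice, the boundary condition (4.5) reads \(A_i|_{x_+} = A_i|_{x_-}\) for \(i = \tau , \sigma \).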

Let K denote the subgroup of \(D = G \times G\) with Lie algebra \(\mathfrak {k}\subset \mathfrak {d}\).

Proposition 4.2

Suppose that A satisfies the boundary condition (4.5), and we are given a field \(\widehat{g}\) satisfying (3.1) for which the archipelago condition (i) holds.

Then, the value of \(\widehat{g}\) on the islands \(U_{x_\pm }\) can be modified, without changing its value outside, so as to also satisfy the remaining archipelago conditions (ii) and (iii).

Furthermore, the value \((g_{x_+}, g_{x_-}) : \varSigma \rightarrow D\) of the archipelago-type function \(\widehat{g}\) at the pair of points \(x_\pm \) can be adjusted using \((g_{x_+}, g_{x_-}) \mapsto a (g_{x_+}, g_{x_-})\) for any smooth function \(a : \varSigma \rightarrow K\).

Proof

We will find a gauge transformation (3.4) for some suitable equivariant \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) equal to 1 outside \(\varSigma \times (U_{x_+} \sqcup U_{x_-})\) such that \(u \widehat{g}\) also satisfies the archipelago conditions (ii) and (iii) on \(U_{x_\pm }\).

Evaluating (3.4) at the pair of points \(x_\pm \), we see that, for \(i= \tau , \sigma \),

$$\begin{aligned} (A_i^u|_{x_+}, A_i^u|_{x_-})&= - \big ( (\partial _i u u^{-1})|_{x_+}, (\partial _i u u^{-1})|_{x_-} \big )\\&\quad + (u|_{x_+}, u|_{x_-}) (A_i|_{x_+}, A_i|_{x_-}) (u|_{x_+}, u|_{x_-})^{-1}. \end{aligned}$$

The gauge transformation is allowed provided that this still takes values in \(\mathfrak {k}\), so that \(A^u\) still satisfies the boundary condition (4.5). For this, it is sufficient to ensure that both terms on the right-hand side above take values in \(\mathfrak {k}\). Therefore, we will demand that our gauge transformation parameter u should be such that

$$\begin{aligned} (u|_{x_+}, u|_{x_-}) \in K. \end{aligned}$$
(4.6)

Note that this then also implies \(\big ( (\partial _i u u^{-1})|_{x_+}, (\partial _i u u^{-1})|_{x_-} \big ) \in \mathfrak {k}\) for \(i = \tau , \sigma \). This is where we had to use the fact that \(\mathfrak {k}\) is a subalgebra, and not just a subspace, of \(\mathfrak {d}\) in order to define the corresponding Lie group K.

Proceeding as in the proof of Proposition 4.1, we consider the smooth equivariant function \(\widetilde{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) defined as follows. Let \(\widetilde{g} \,{:}{=}\, \widehat{g}\) on the complement of \(\varSigma \times (U_{x_+} \sqcup U_{x_-})\). Define \(\widetilde{g}\) locally in small open discs \(D^r_{x_\pm } \subset U_{x_\pm }\) around the points \(x_\pm \) as \((\widetilde{g}|_{D^r_{x_+}}, \widetilde{g}|_{D^r_{x_-}}) \,{:}{=}\, a (\widehat{g}|_{x_+}, \widehat{g}|_{x_-})\) for any smooth \(a : \varSigma \rightarrow K\) of our choice. Note here that \(\widehat{g}|_{x_\pm } \in G\) by the equivariance of \(\widehat{g}\) since \(x_\pm \in \mathbb {R}\). We can then extend the definition of \(\widetilde{g}\) to \(\varSigma \times (U_{x_+} \sqcup U_{x_-})\) as we did in Sect. 4.1 so that \(\widetilde{g}_{x_\pm } = \widetilde{g}|_{\varSigma \times U_{x_\pm }}\) depends only on \(\sigma \), \(\tau \) and the radial coordinate \(|\xi _{x_\pm }|\) around \(x_\pm \). In other words, \(\widetilde{g}\) satisfies the archipelago conditions (ii) and (iii) on \(U_{x_\pm }\).

It remains to show that \(u = \widetilde{g} \widehat{g}^{-1}\), i.e. the gauge transformation parameter from \(\widehat{g}\) to \(\widetilde{g}\), is equivariant and satisfies (4.6). The equivariance is clear from that of \(\widetilde{g}\) and \(\widehat{g}\). Now note that from the relation \(\widetilde{g} = u \widehat{g}\), it follows that

$$\begin{aligned} (\widetilde{g}|_{x_+}, \widetilde{g}|_{x_-}) = (u|_{x_+}, u|_{x_-}) (\widehat{g}|_{x_+}, \widehat{g}|_{x_-}). \end{aligned}$$
(4.7)

But since \((\widetilde{g}|_{x_+}, \widetilde{g}|_{x_-}) = a (\widehat{g}|_{x_+}, \widehat{g}|_{x_-})\), we deduce that \((u|_{x_+}, u|_{x_-}) = a \in K\), which is the required condition (4.6). \(\square \)

4.3 Boundary conditions at complex conjugate simple poles

Let \(x_\pm \in {{\varvec{z}}}\) be simple poles of \(\omega \) with \(x_- = \overline{x_+}\), so that \({{\,\mathrm{res}\,}}_{x_-} \omega = \overline{{{\,\mathrm{res}\,}}_{x_+} \omega }\). By the equivariance property (2.16) of A, it also follows that \(\tau (A_i|_{x_+}) = A_i|_{x_-}\) for \(i = \tau , \sigma \).

The boundary equation of motion (4.1b) can then be rewritten as

$$\begin{aligned} \epsilon _{ij} \langle \!\langle A_i|_{x_+}, \delta A_j|_{x_+} \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; x_\pm } = 0. \end{aligned}$$
(4.8)

Here, \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; x_\pm } : \mathfrak {g}^\mathbb {C}\times \mathfrak {g}^\mathbb {C}\rightarrow \mathbb {R}\) is the nondegenerate symmetric invariant bilinear form on the complexification \(\mathfrak {g}^\mathbb {C}\), regarded as a real Lie algebra, defined by

$$\begin{aligned} \langle \!\langle {\textsf {x} }, {\textsf {x} }' \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; x_\pm } \,{:}{=}\, 2 \mathfrak {R}\big ( ({{\,\mathrm{res}\,}}_{x_+} \omega ) \langle {\textsf {x} }, {\textsf {x} }' \rangle \big ) \end{aligned}$$

for any \({\textsf {x} }, {\textsf {x} }' \in \mathfrak {g}^\mathbb {C}\), where we denote by \(\mathfrak {R}z\) and \(\mathfrak {I}z\) the real and imaginary parts of a complex number z, respectively. When \({{\,\mathrm{res}\,}}_{x_+} \omega = - {{\,\mathrm{res}\,}}_{x_-} \omega \) so that \({{\,\mathrm{res}\,}}_{x_+} \omega \in \mathrm{i}\mathbb {R}\) this reduces, up to an overall factor, to the standard bilinear form \(\mathfrak {I}\langle {\textsf {x} }, {\textsf {x} }' \rangle \) on \(\mathfrak {g}^\mathbb {C}\).
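
As a quick check of the last statement, here is a sketch in which we write \({{\,\mathrm{res}\,}}_{x_+} \omega = \mathrm{i} \nu \) with \(\nu \in \mathbb {R}\); the symbol \(\nu \) is introduced only for this illustration. Substituting into the definition above gives

$$\begin{aligned} \langle \!\langle {\textsf {x}}, {\textsf {x}}' \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; x_\pm } = 2 \mathfrak {R}\big ( \mathrm{i} \nu \langle {\textsf {x}}, {\textsf {x}}' \rangle \big ) = - 2 \nu \, \mathfrak {I}\langle {\textsf {x}}, {\textsf {x}}' \rangle , \end{aligned}$$

so the overall factor mentioned above is \(-2\nu \) in this parametrisation.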

The discussion below is completely analogous to that of Sect. 4.2, just working with the complexification \(\mathfrak {g}^\mathbb {C}\) rather than the real double \(\mathfrak {d}\). We will thus be much briefer in the arguments presented and only highlight the differences with Sect. 4.2.

In particular, we can satisfy (4.8) by choosing a Manin pair \((\mathfrak {g}^\mathbb {C}, \mathfrak {k})\), this time for the complexification rather than the real double, and demanding that

$$\begin{aligned} A_i|_{x_+} \in \mathfrak {k}, \end{aligned}$$
(4.9)

for \(i= \tau , \sigma \), noting that this implies \(\delta A_i|_{x_+} \in \mathfrak {k}\).

Let K denote the Lie subgroup of \(G^\mathbb {C}\) with Lie algebra \(\mathfrak {k}\subset \mathfrak {g}^\mathbb {C}\).

Proposition 4.3

Suppose that A satisfies the boundary condition (4.9), and we are given a field \(\widehat{g}\) satisfying (3.1) for which the archipelago condition (i) holds.

Then, the value of \(\widehat{g}\) on the islands \(U_{x_\pm }\) can be modified, without changing its value outside, so as to also satisfy the remaining archipelago conditions (ii) and (iii).

Furthermore, the value \(g_{x_+} : \varSigma \rightarrow G^\mathbb {C}\) of the archipelago-type function \(\widehat{g}\) at the point \(x_+\) can be adjusted using \(g_{x_+} \mapsto a g_{x_+}\) for any smooth function \(a : \varSigma \rightarrow K\).

Proof

Evaluating (3.4) at \(x_+\) yields \(A^u|_{x_+} = - (d u u^{-1})|_{x_+} + u|_{x_+} A|_{x_+} u|_{x_+}^{-1}\). So a parameter u such that

$$\begin{aligned} u|_{x_+} \in K \end{aligned}$$
(4.10)

defines an allowed gauge transformation.

We proceed as in the proof of Proposition 4.2 to construct a smooth equivariant \(\widetilde{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\), which is equal to \(\widehat{g}\) on the complement of \(\varSigma \times (U_{x_+} \sqcup U_{x_-})\) and which satisfies both of the archipelago conditions (ii) and (iii) on the islands \(U_{x_\pm }\). Referring to the notation introduced in Sect. 4.2, in the present case we let \(\widetilde{g}|_{D^r_{x_+}} \,{:}{=}\, a \widehat{g}|_{x_+}\) for some smooth \(a : \varSigma \rightarrow K\) of our choice. The rest of the definition of \(\widetilde{g}\) over \(U_{x_+}\) is as in Sect. 4.2, and then we also let \(\widetilde{g}|_{U_{x_-}} \,{:}{=}\, \tau (\widetilde{g}|_{U_{x_+}})\).

The fact that \(u = \widetilde{g} \widehat{g}^{-1}\) is equivariant and satisfies (4.10) is established as in Sect. 4.2 with minor changes. Specifically, we have

$$\begin{aligned} \widetilde{g}|_{x_+} = u|_{x_+} \widehat{g}|_{x_+}. \end{aligned}$$
(4.11)

But since \(\widetilde{g}|_{x_+} = a \widehat{g}|_{x_+}\), we deduce that \(u|_{x_+} = a \in K\), which is condition (4.10), as required. \(\square \)

4.4 Manin triples and Poisson–Lie T-duality

In all examples where \(\omega \) has simple poles, we shall be interested in the special case where the Manin pair \((\mathfrak {d},\mathfrak {k})\) (resp. \((\mathfrak {g}^\mathbb {C}, \mathfrak {k})\)) can be extended to a Manin triple \((\mathfrak {d}, \mathfrak {k}, \mathfrak {p})\) (resp. \((\mathfrak {g}^\mathbb {C}, \mathfrak {k}, \mathfrak {p})\)). That is, \(\mathfrak {p}\) is another Lagrangian subalgebra of \(\mathfrak {d}\) (resp. \(\mathfrak {g}^\mathbb {C}\)) which is complementary to \(\mathfrak {k}\), i.e. we have a direct sum \(\mathfrak {d}= \mathfrak {k}\dotplus \mathfrak {p}\) (resp. \(\mathfrak {g}^\mathbb {C}= \mathfrak {k}\dotplus \mathfrak {p}\)). We denote by \(\dotplus \) the direct sum as vector spaces.

An important class of Manin triples is given by a choice of solution \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the modified classical Yang–Baxter equation

$$\begin{aligned}{}[R {\textsf {x} }, R {\textsf {y} }] - R \big ( [R {\textsf {x} }, {\textsf {y} }] + [{\textsf {x} }, R {\textsf {y} }] \big ) = - c^2 [{\textsf {x} }, {\textsf {y} }] \end{aligned}$$
(4.12)

for every \({\textsf {x} }, {\textsf {y} } \in \mathfrak {g}\), where either \(c = 1\) or \(c = \mathrm{i}\). We shall be particularly interested in solutions which are skew-symmetric with respect to the bilinear form \(\langle \cdot , \cdot \rangle \) on \(\mathfrak {g}\), namely such that

$$\begin{aligned} \langle R{\textsf {x} }, {\textsf {y} } \rangle = - \langle {\textsf {x} }, R {\textsf {y} } \rangle \end{aligned}$$

for any \({\textsf {x} }, {\textsf {y} } \in \mathfrak {g}\).

Specifically, in the real case where \(c=1\), we define

$$\begin{aligned} \mathfrak {g}_R \,{:}{=}\, \{ ((R-1){\textsf {x} }, (R+1){\textsf {x} }) \,|\, {\textsf {x} } \in \mathfrak {g}\}, \quad \mathfrak {g}^\delta \,{:}{=}\, \{ ({\textsf {x} }, {\textsf {x} }) \,|\, {\textsf {x} } \in \mathfrak {g}\}. \end{aligned}$$

It is clear that \(\mathfrak {g}^\delta \) is a Lie subalgebra of \(\mathfrak {d}\), and it follows from (4.12) that \(\mathfrak {g}_R\) also is. Suppose that \(\mathfrak {d}\) is equipped with its standard bilinear form, namely

$$\begin{aligned} \langle \!\langle ({\textsf {x} }, {\textsf {y} }), ({\textsf {x} }', {\textsf {y} }') \rangle \!\rangle _\mathfrak {d}\,{:}{=}\, \langle {\textsf {x} }, {\textsf {x} }' \rangle - \langle {\textsf {y} }, {\textsf {y} }' \rangle \end{aligned}$$

for any \({\textsf {x} }, {\textsf {y} }, {\textsf {x} }', {\textsf {y} }' \in \mathfrak {g}\). This corresponds, up to an overall factor, to the bilinear form considered in Sect. 4.2 when \({{\,\mathrm{res}\,}}_{x_-} \omega = - {{\,\mathrm{res}\,}}_{x_+} \omega \). In this case, \(\mathfrak {g}^\delta \) is clearly isotropic and so is \(\mathfrak {g}_R\) by the skew-symmetry of R. It follows that \((\mathfrak {d}, \mathfrak {g}_R, \mathfrak {g}^\delta )\) is a Manin triple.
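
Both claims can be checked in a few lines. The following is only a sketch, assuming that the Lie bracket of \(\mathfrak {d}\) is taken componentwise and using the shorthand \({\textsf {w}} \,{:}{=}\, [R {\textsf {x}}, {\textsf {y}}] + [{\textsf {x}}, R {\textsf {y}}]\), which is introduced here purely for illustration. With \(c = 1\), equation (4.12) reads \([R {\textsf {x}}, R {\textsf {y}}] = R {\textsf {w}} - [{\textsf {x}}, {\textsf {y}}]\), so that in each component

$$\begin{aligned} \big [ (R \mp 1) {\textsf {x}}, (R \mp 1) {\textsf {y}} \big ] = [R {\textsf {x}}, R {\textsf {y}}] \mp {\textsf {w}} + [{\textsf {x}}, {\textsf {y}}] = (R \mp 1) {\textsf {w}}, \end{aligned}$$

which shows that \(\mathfrak {g}_R\) closes under the bracket. For the isotropy, we compute

$$\begin{aligned} \langle (R - 1) {\textsf {x}}, (R - 1) {\textsf {y}} \rangle - \langle (R + 1) {\textsf {x}}, (R + 1) {\textsf {y}} \rangle = - 2 \big ( \langle R {\textsf {x}}, {\textsf {y}} \rangle + \langle {\textsf {x}}, R {\textsf {y}} \rangle \big ) = 0, \end{aligned}$$

where the last step uses the skew-symmetry of R. Finally, an element of \(\mathfrak {g}_R \cap \mathfrak {g}^\delta \) satisfies \((R-1){\textsf {x}} = (R+1){\textsf {x}}\), hence \({\textsf {x}} = 0\), and both subspaces have the dimension of \(\mathfrak {g}\), so they are complementary in \(\mathfrak {d}\).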

In the complex case, we take \(c=\mathrm{i}\) and define

$$\begin{aligned} \mathfrak {g}_R \,{:}{=}\, \{ (R-\mathrm{i}) {\textsf {x} } \,|\, {\textsf {x} } \in \mathfrak {g}\}, \end{aligned}$$

with \(\mathfrak {g}\subset \mathfrak {g}^\mathbb {C}\) denoting the real subalgebra of \(\mathfrak {g}^\mathbb {C}\) regarded itself as a real Lie algebra. It follows again from (4.12) that \(\mathfrak {g}_R\) is a Lie subalgebra of \(\mathfrak {g}^\mathbb {C}\). Suppose, moreover, that \(\mathfrak {g}^\mathbb {C}\) is equipped with its standard bilinear form, namely

$$\begin{aligned} \langle \!\langle {\textsf {x} }, {\textsf {x} }' \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}} = \mathfrak {I}\langle {\textsf {x} }, {\textsf {x} }' \rangle \end{aligned}$$

for any \({\textsf {x} }, {\textsf {x} }' \in \mathfrak {g}^\mathbb {C}\), which corresponds to the bilinear form considered in Sect. 4.3 with \({{\,\mathrm{res}\,}}_{x_-} \omega = - {{\,\mathrm{res}\,}}_{x_+} \omega \). In this case, we have that \(\mathfrak {g}\) is certainly isotropic and \(\mathfrak {g}_R\) also is by the skew-symmetry of R. Therefore, \((\mathfrak {g}^\mathbb {C}, \mathfrak {g}_R, \mathfrak {g})\) is a Manin triple.
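
The analogous check in the complex case runs as follows. It is a sketch, assuming that R and \(\langle \cdot , \cdot \rangle \) are extended complex-linearly and complex-bilinearly to \(\mathfrak {g}^\mathbb {C}\), and using the illustrative shorthand \({\textsf {w}} \,{:}{=}\, [R {\textsf {x}}, {\textsf {y}}] + [{\textsf {x}}, R {\textsf {y}}]\) for \({\textsf {x}}, {\textsf {y}} \in \mathfrak {g}\). With \(c = \mathrm{i}\), equation (4.12) reads \([R {\textsf {x}}, R {\textsf {y}}] = R {\textsf {w}} + [{\textsf {x}}, {\textsf {y}}]\), and therefore

$$\begin{aligned} \big [ (R - \mathrm{i}) {\textsf {x}}, (R - \mathrm{i}) {\textsf {y}} \big ]&= [R {\textsf {x}}, R {\textsf {y}}] - [{\textsf {x}}, {\textsf {y}}] - \mathrm{i}\, {\textsf {w}} = (R - \mathrm{i}) {\textsf {w}},\\ \mathfrak {I}\langle (R - \mathrm{i}) {\textsf {x}}, (R - \mathrm{i}) {\textsf {y}} \rangle&= \mathfrak {I}\big ( \langle R {\textsf {x}}, R {\textsf {y}} \rangle - \langle {\textsf {x}}, {\textsf {y}} \rangle \big ) - \big ( \langle R {\textsf {x}}, {\textsf {y}} \rangle + \langle {\textsf {x}}, R {\textsf {y}} \rangle \big ) = 0. \end{aligned}$$

In the second line, the first bracket is real because \({\textsf {x}}, {\textsf {y}} \in \mathfrak {g}\), and the second bracket vanishes by the skew-symmetry of R. The same reality argument gives \(\mathfrak {I}\langle {\textsf {x}}, {\textsf {y}} \rangle = 0\), i.e. the isotropy of \(\mathfrak {g}\).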

Consider the Lie subgroup \(G^\delta \,{:}{=}\, \{ (x, x) \,|\, x \in G \} \subset D\) with the Lie algebra \(\mathfrak {g}^\delta \). Also let \(G_R\) denote the Lie subgroup of D with Lie algebra \(\mathfrak {g}_R\). We will assume that the decomposition \(\mathfrak {d}= \mathfrak {g}_R \dotplus \mathfrak {g}^\delta \) lifts to the Lie group level, i.e. that \(D = G_R G^\delta \), or at least that \(G_R G^\delta \) forms a dense subset of D. It then follows that a natural parametrisation of the quotient \(G_R \backslash D\) in the case \(c = 1\) is given by elements of \(G^\delta \).

Likewise, in the case \(c = \mathrm{i}\), we let \(G_R \subset G^\mathbb {C}\) denote the Lie subgroup with Lie algebra \(\mathfrak {g}_R \subset \mathfrak {g}^\mathbb {C}\). Again, we will assume that the decomposition \(\mathfrak {g}^\mathbb {C}= \mathfrak {g}_R \dotplus \mathfrak {g}\) similarly lifts to the Lie group level, i.e. that \(G^\mathbb {C}= G_R G\), or at least that \(G_R G\) forms a dense subset of \(G^\mathbb {C}\). A natural parametrisation of the quotient \(G_R \backslash G^\mathbb {C}\) is then given by elements of G. An example is provided by the Iwasawa decomposition \(G^\mathbb {C}= A N G\), where G is the compact real form of \(G^\mathbb {C}\) and \(G_R = AN\).

Since a Manin triple \((\mathfrak {d}, \mathfrak {k}, \mathfrak {p})\) (resp. \((\mathfrak {g}^\mathbb {C}, \mathfrak {k}, \mathfrak {p})\)) gives rise to two Manin pairs, namely \((\mathfrak {d}, \mathfrak {k})\) and \((\mathfrak {d}, \mathfrak {p})\) (resp. \((\mathfrak {g}^\mathbb {C}, \mathfrak {k})\) and \((\mathfrak {g}^\mathbb {C}, \mathfrak {p})\)), we can apply the construction of Sect. 4.2 (resp. Sect. 4.3) at a pair of simple poles \(x_\pm \) of \(\omega \) using either of these Manin pairs. We expect the corresponding models obtained as in Sect. 3.3 to be Poisson–Lie T-dual [21, 22].

The main example of Poisson–Lie T-duality is provided by Manin triples of the form \(\mathfrak {d}= \mathfrak {g}_R \dotplus \mathfrak {g}^\delta \) or \(\mathfrak {g}^\mathbb {C}= \mathfrak {g}_R \dotplus \mathfrak {g}\). This includes the Poisson–Lie T-duality between the Yang–Baxter \(\sigma \)-model, discussed in Sect. 5.3, and the \(\lambda \)-deformation of the principal chiral model, discussed in Sect. 5.4. See, for instance, [15, 27, 38, 39].

The Yang–Baxter \(\sigma \)-model with WZ-term, discussed in Sect. 5.6, was also shown in [14] to be Poisson–Lie T-dual to itself for a different choice of parameters. In this case as well, the duality is underpinned by a certain choice of Manin triple, so that it can also be described in the present formalism.

Let us finally note that another way of ensuring the vanishing of the terms in the boundary equation of motion (2.6) corresponding to a pair of simple poles \(x_\pm \) of \(\omega \) is to ask that the terms associated with \(x_+\) and with \(x_-\) separately vanish. In other words, instead of (4.1b) one could impose the stronger condition

$$\begin{aligned} \epsilon _{ij} \langle A_i|_{x_\pm }, \delta A_j|_{x_\pm } \rangle = 0. \end{aligned}$$
(4.13)

This situation was discussed in detail in [5, §9.1]. In particular, it was argued that (4.13) can be satisfied by fixing a Manin triple \((\mathfrak {g}, \mathfrak {l}_+, \mathfrak {l}_-)\), i.e. making a choice of Lagrangian subalgebras \(\mathfrak {l}_\pm \subset \mathfrak {g}\) with \(\mathfrak {g}= \mathfrak {l}_+ \dotplus \mathfrak {l}_-\), and requiring that \(A_i|_{x_\pm }\) be \(\mathfrak {l}_\pm \)-valued. In the present language, working with such Manin triples on \(\mathfrak {g}\), as opposed to ones on \(\mathfrak {g}^\mathbb {C}\) or \(\mathfrak {d}\), corresponds to considering skew-symmetric solutions \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the modified classical Yang–Baxter equation (4.12) for which \(R^2 = 1\). The two subalgebras \(\mathfrak {l}_\pm \) then correspond to the two eigenspaces \(\ker (R\mp 1)\) of R.
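
To illustrate the last statement, here is a short sketch in which we assume that \(c = 1\) in (4.12) (the value of c is not specified above) and write \(s = \pm 1\) for the eigenvalue, a notation introduced only for this check. For \({\textsf {x}}, {\textsf {y}} \in \ker (R - s)\), equation (4.12) and the skew-symmetry of R give, respectively,

$$\begin{aligned}{}[{\textsf {x}}, {\textsf {y}}] - 2 s R [{\textsf {x}}, {\textsf {y}}] = - [{\textsf {x}}, {\textsf {y}}] \;\; \Longrightarrow \;\; R [{\textsf {x}}, {\textsf {y}}] = s [{\textsf {x}}, {\textsf {y}}], \qquad s \langle {\textsf {x}}, {\textsf {y}} \rangle = \langle R {\textsf {x}}, {\textsf {y}} \rangle = - \langle {\textsf {x}}, R {\textsf {y}} \rangle = - s \langle {\textsf {x}}, {\textsf {y}} \rangle . \end{aligned}$$

The first relation shows that each eigenspace is a subalgebra and the second that it is isotropic. Since \(R^2 = 1\) implies \(\mathfrak {g}= \ker (R - 1) \dotplus \ker (R + 1)\), the two eigenspaces are complementary.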

4.5 Generalised boundary conditions at a real double pole

In light of the discussion of boundary conditions at pairs of simple poles in Sects. 4.2 and 4.3, we will now consider more general boundary conditions that can be imposed at double poles. The algebraic setting of this section is similar to the one used in [29] in the context of \(\textsf {E} \)-models.

Let \(x \in {{\varvec{z}}}\) be a double pole of \(\omega \) along the real axis, as in Sect. 4.1. One can rewrite the boundary equation of motion (4.1a) in the following way. We consider the semi-direct product \(\mathfrak {t}\,{:}{=}\, \mathfrak {g}\ltimes \mathfrak {g}_{\mathrm{ab}}\) where \(\mathfrak {g}_{\mathrm{ab}}\) is an abelian copy of \(\mathfrak {g}\) on which \(\mathfrak {g}\) acts by the adjoint action. That is, \(\mathfrak {t}\) is isomorphic to the direct sum \(\mathfrak {g}\oplus \mathfrak {g}\) as a vector space with Lie bracket given by \([({\textsf {x} },{\textsf {y} }), ({\textsf {x} }',{\textsf {y} }')]_\mathfrak {t}= ([{\textsf {x} },{\textsf {x} }'], [{\textsf {x} }, {\textsf {y} }'] - [{\textsf {x} }', {\textsf {y} }])\) for any \({\textsf {x} }, {\textsf {y} }, {\textsf {x} }', {\textsf {y} }' \in \mathfrak {g}\). By the equivariance property of A in (2.16), since \(x \in \mathbb {R}\) we have \(A_i|_x \in \mathfrak {g}\). Also

$$\begin{aligned} \tau \big ( (\partial _{\xi _x} A_i)|_x \big ) = \big ( \tau (\partial _{\xi _x} A_i) \big ) \big |_x = \big ( \partial _{\bar{\xi }_x} (\tau A_i) \big ) \big |_x = \big ( \mu _{\mathrm{t}}^*(\partial _{\xi _x} A_i) \big ) \big |_x = (\partial _{\xi _x} A_i)|_x, \end{aligned}$$

where the second step is by the anti-linearity of \(\tau \), the third by the equivariance of A and the last step follows because \(\mu _{\mathrm{t}} x = x\). Hence, \((\partial _{\xi _x} A_i)|_x \in \mathfrak {g}\). We can therefore regard \((A_i|_x, (\partial _{\xi _x} A_i)|_x)\) as valued in \(\mathfrak {t}\), which allows us to rewrite (4.1a) as

$$\begin{aligned} \epsilon _{ij} \langle \!\langle (A_i|_x, (\partial _{\xi _x} A_i)|_x), \delta (A_j|_x, (\partial _{\xi _x} A_j)|_x) \rangle \!\rangle _{\mathfrak {t}; x} = 0, \end{aligned}$$
(4.14)

where \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {t}; x} : \mathfrak {t}\times \mathfrak {t}\rightarrow \mathbb {R}\) is the bilinear form on \(\mathfrak {t}\) defined by

$$\begin{aligned} \langle \!\langle ({\textsf {x} }, {\textsf {y} }), ({\textsf {x} }', {\textsf {y} }') \rangle \!\rangle _{\mathfrak {t}; x} \,{:}{=}\, ({{\,\mathrm{res}\,}}_x \omega ) \langle {\textsf {x} }, {\textsf {x} }' \rangle + ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \big ( \langle {\textsf {x} }, {\textsf {y} }' \rangle + \langle {\textsf {x} }', {\textsf {y} } \rangle \big ), \end{aligned}$$

for every \({\textsf {x} }, {\textsf {y} }, {\textsf {x} }', {\textsf {y} }' \in \mathfrak {g}\). One checks that this bilinear form is nondegenerate (using the fact that \({{\,\mathrm{res}\,}}_x \xi _x \omega \ne 0\) since x is a double pole of \(\omega \)), symmetric and invariant.
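
The nondegeneracy can be seen as follows. This is a sketch which relies only on the definition above and on the nondegeneracy of \(\langle \cdot , \cdot \rangle \) on \(\mathfrak {g}\). Pairing with elements of the two factors separately gives

$$\begin{aligned} \langle \!\langle ({\textsf {x}}, {\textsf {y}}), (0, {\textsf {y}}') \rangle \!\rangle _{\mathfrak {t}; x}&= ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \langle {\textsf {x}}, {\textsf {y}}' \rangle ,\\ \langle \!\langle (0, {\textsf {y}}), ({\textsf {x}}', 0) \rangle \!\rangle _{\mathfrak {t}; x}&= ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \langle {\textsf {x}}', {\textsf {y}} \rangle . \end{aligned}$$

If \(({\textsf {x}}, {\textsf {y}})\) lies in the kernel of the form, then the first line vanishes for all \({\textsf {y}}'\), which forces \({\textsf {x}} = 0\) because \({{\,\mathrm{res}\,}}_x \xi _x \omega \ne 0\). The second line then vanishes for all \({\textsf {x}}'\), which forces \({\textsf {y}} = 0\).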

Reformulation (4.14) of the general condition (4.1a) leads to a natural way of imposing boundary conditions at the real double pole x, mimicking the discussion of Sects. 4.2 and 4.3 for pairs of simple poles. Specifically, if we have a Manin pair \((\mathfrak {t}, \mathfrak {k})\), i.e. a Lagrangian subalgebra \(\mathfrak {k}\) of \(\mathfrak {t}\), then we can satisfy (4.14) by requiring that

$$\begin{aligned} (A_i|_x, (\partial _{\xi _x} A_i)|_x) \in \mathfrak {k}\end{aligned}$$
(4.15)

for \(i = \tau , \sigma \), noting that this then also implies \(\delta (A_i|_x, (\partial _{\xi _x} A_i)|_x) \in \mathfrak {k}\). For technical reasons to be discussed below, to do with making \(\widehat{g}\) of archipelago type, we need to assume that the subalgebra \(\mathfrak {g}\ltimes \{ 0 \} \subset \mathfrak {t}\) is complementary to our choice of Lagrangian subalgebra \(\mathfrak {k}\subset \mathfrak {t}\). That is, we assume that we have a direct sum decomposition

$$\begin{aligned} \mathfrak {t}= (\mathfrak {g}\ltimes \{ 0 \}) \dotplus \mathfrak {k}. \end{aligned}$$
(4.16)

Before proceeding, we note that the simple boundary condition (4.2) considered in Sect. 4.1 is a special case of (4.15). Indeed, an obvious choice of Lagrangian subalgebra of \(\mathfrak {t}\) satisfying condition (4.16) is the abelian subalgebra \(\{ 0 \} \ltimes \mathfrak {g}_{\mathrm{ab}}\). Imposing condition (4.15) in the case \(\mathfrak {k}= \{ 0 \} \ltimes \mathfrak {g}_{\mathrm{ab}}\) is equivalent to requiring (4.2).
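
This special case can be made explicit with the bilinear form on \(\mathfrak {t}\) defined above; the following is a brief sketch. Every term of that form contains at least one first component, so that

$$\begin{aligned} \langle \!\langle (0, {\textsf {y}}), (0, {\textsf {y}}') \rangle \!\rangle _{\mathfrak {t}; x} = 0 \end{aligned}$$

for all \({\textsf {y}}, {\textsf {y}}' \in \mathfrak {g}\). Hence \(\{ 0 \} \ltimes \mathfrak {g}_{\mathrm{ab}}\) is isotropic, and it has half the dimension of \(\mathfrak {t}\). It intersects \(\mathfrak {g}\ltimes \{ 0 \}\) trivially, which gives (4.16). Finally, \((A_i|_x, (\partial _{\xi _x} A_i)|_x) \in \{ 0 \} \ltimes \mathfrak {g}_{\mathrm{ab}}\) holds exactly when \(A_i|_x = 0\), with no constraint on \((\partial _{\xi _x} A_i)|_x\).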

Proposition 4.4

Suppose that A satisfies the boundary condition (4.15), and we are given a field \(\widehat{g}\) satisfying (3.1) for which the archipelago condition (i) holds.

Then, the value of \(\widehat{g}\) on the island \(U_x\) can be modified, without changing its value outside, so as to also satisfy the remaining archipelago conditions (ii) and (iii).

Proof

In order for \(A^u\) to satisfy the second condition in (2.16), we should require that the function \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) be equivariant, i.e. \(\tau u = \mu _{\mathrm{t}}^*u\). Evaluating the latter condition at the real pole x implies that \(u|_x \in G\) since \(\mu _{\mathrm{t}} x = x\). Also, we have

$$\begin{aligned} \tau \big ( (\partial _{\xi _x} u)|_x u|_x^{-1} \big )&= \big ( \tau (\partial _{\xi _x} u) \big ) \big |_x (\tau u)|_x^{-1} = \big ( \partial _{\bar{\xi }_x} (\tau u) \big ) \big |_x (\tau u)|_x^{-1}\\&= \big ( \mu _{\mathrm{t}}^*(\partial _{\xi _x} u) \big ) \big |_x (\mu _{\mathrm{t}}^*u)|_x^{-1} = (\partial _{\xi _x} u)|_x u|_x^{-1}, \end{aligned}$$

where in the second equality we use the anti-linearity of \(\tau \) and in the third equality the equivariance of u. Therefore, \((\partial _{\xi _x} u)|_x u|_x^{-1} \in \mathfrak {g}\). We thus obtain a function

$$\begin{aligned} U \,{:}{=}\, \big ( u|_x, (\partial _{\xi _x} u)|_x u|_x^{-1} \big ) : \varSigma \longrightarrow T \end{aligned}$$

valued in the Lie group \(T \,{:}{=}\, G \ltimes \mathfrak {g}_{\mathrm{ab}}\) with Lie algebra \(\mathfrak {t}= \mathfrak {g}\ltimes \mathfrak {g}_{\mathrm{ab}}\).

Next, we determine conditions on u for \(A^u\) to still satisfy the boundary condition (4.15). Evaluating (3.4) at x, we obtain

$$\begin{aligned} A_i^u|_x = - \partial _i (u|_x) u|_x^{-1} + u|_x A_i|_x u|_x^{-1}. \end{aligned}$$
(4.17a)

On the other hand, differentiating (3.4) first with respect to the local holomorphic coordinate \(\xi _x\) before evaluating at x, we find

$$\begin{aligned} (\partial _{\xi _x} A_i^u)|_x&= - \partial _i \big ( (\partial _{\xi _x} u)|_x u|_x^{-1} \big ) + \big [ \partial _i (u|_x) u|_x^{-1}, (\partial _{\xi _x} u)|_x u|_x^{-1} \big ] \nonumber \\&\quad + \big [ (\partial _{\xi _x} u)|_x u|_x^{-1}, u|_x A_i|_x u|_x^{-1} \big ] + u|_x (\partial _{\xi _x} A_i)|_x u|_x^{-1}. \end{aligned}$$
(4.17b)

Combining (4.17a) and (4.17b), we thus find

$$\begin{aligned} \big ( A_i^u|_x, (\partial _{\xi _x} A_i^u)|_x \big ) = - \partial _i U U^{-1} + U \big ( A_i|_x, (\partial _{\xi _x} A_i)|_x \big ) U^{-1}, \end{aligned}$$
(4.18)

where the first term on the right-hand side denotes the components of the Darboux derivative of \(U : \varSigma \rightarrow T\), while the second term denotes the adjoint action of \(U \in T\) on \(\big ( A_i|_x, (\partial _{\xi _x} A_i)|_x \big ) \in \mathfrak {t}\). These are given explicitly by (see, for instance, [29])

$$\begin{aligned} \partial _i (h, {\textsf {v} }) (h, {\textsf {v} })^{-1}&= \big ( \partial _i h h^{-1}, \partial _i {\textsf {v} } - \big [ \partial _i h h^{-1}, {\textsf {v} } \big ] \big ),\\ (k, {\textsf {w} }) ({\textsf {x} }, {\textsf {y} }) (k, {\textsf {w} })^{-1}&= (k {\textsf {x} } k^{-1}, k {\textsf {y} } k^{-1} + [{\textsf {w} }, k {\textsf {x} } k^{-1}]), \end{aligned}$$

for any smooth functions \(h : \varSigma \rightarrow G\), \({\textsf {v} } : \varSigma \rightarrow \mathfrak {g}\) and any elements \(k \in G\), \({\textsf {w} }, {\textsf {x} }, {\textsf {y} } \in \mathfrak {g}\).
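
For orientation, the Darboux derivative formula can be recovered from the group law of \(T = G \ltimes \mathfrak {g}_{\mathrm{ab}}\). The following is a sketch, assuming the product convention that is consistent with the one used later in this proof, and writing \((h', {\textsf {v}}')\) for the value of \((h, {\textsf {v}})\) at a nearby point of \(\varSigma \), a notation introduced only here:

$$\begin{aligned} (k, {\textsf {w}}) (k', {\textsf {w}}')&= \big ( k k', {\textsf {w}} + k {\textsf {w}}' k^{-1} \big ), \qquad (k, {\textsf {w}})^{-1} = \big ( k^{-1}, - k^{-1} {\textsf {w}} k \big ),\\ (h', {\textsf {v}}') (h, {\textsf {v}})^{-1}&= \big ( h' h^{-1}, {\textsf {v}}' - (h' h^{-1}) {\textsf {v}} (h' h^{-1})^{-1} \big ). \end{aligned}$$

Setting \(h' = h + \delta h\) and \({\textsf {v}}' = {\textsf {v}} + \delta {\textsf {v}}\) and expanding to first order, the right-hand side of the second line becomes \(\big ( 1 + \delta h h^{-1}, \delta {\textsf {v}} - [\delta h h^{-1}, {\textsf {v}}] \big )\), which reproduces the expression for \(\partial _i (h, {\textsf {v}}) (h, {\textsf {v}})^{-1}\) given above.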

It now follows from (4.18) that an allowed gauge transformation, in the present case, should have parameter u such that

$$\begin{aligned} U = \big ( u|_x, (\partial _{\xi _x} u)|_x u|_x^{-1} \big ) \in K, \end{aligned}$$
(4.19)

where K is the Lie subgroup of T with Lie algebra \(\mathfrak {k}\subset \mathfrak {t}\). We will assume that decomposition (4.16) lifts to a factorisation at the group level, namely that

$$\begin{aligned} T = K (G \ltimes \{ 0 \}). \end{aligned}$$
(4.20)

Having determined the set of allowed gauge transformations, we should find the one which brings the smooth function \(\widehat{g}\) to the desired archipelago form in \(U_x\).

We proceed exactly as in the proofs of Propositions 4.2 and 4.3, by considering a smooth equivariant \(\widetilde{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) defined as follows. Let \(\widetilde{g} \,{:}{=}\, \widehat{g}\) on the complement of \(\varSigma \times U_x\). We then define \(\widetilde{g}\) as being constant in a small open disc \(D^r_x \subset U_x\) around x by letting \((\widetilde{g}|_{D^r_x}, 0) \in G \ltimes \{ 0 \}\) be the representative of the class in \(K \backslash T\) of \((\widehat{g}|_x, (\partial _{\xi _x} \widehat{g})|_x \widehat{g}|_x^{-1}) \in T\). Note that here we have made use of property (4.20). The reason we had to choose a representative in \(G \ltimes \{ 0 \}\) is that we want \(\widetilde{g}\) to be of archipelago type on the island \(U_x\), which by definition means \(\widetilde{g}|_{D^r_x}\) is constant along \(\mathbb {C}P^1\) so that necessarily \((\partial _{\xi _x} \widetilde{g} \widetilde{g}^{-1})|_{D^r_x} = 0\). Finally, we can also extend the definition of \(\widetilde{g}\) to \(\varSigma \times U_x\) as we did in Sect. 4.1 so that \(\widetilde{g}_x = \widetilde{g}|_{\varSigma \times U_x}\) depends only on \(\sigma \), \(\tau \) and the radial coordinate \(|\xi _x|\) around x. Therefore, by construction \(\widetilde{g}\) satisfies both of the archipelago conditions (ii) and (iii) on \(U_x\).

Now consider the gauge transformation parameter \(u = \widetilde{g} \widehat{g}^{-1}\). Its equivariance is clear from that of \(\widetilde{g}\) and \(\widehat{g}\). And from the relation \(\widetilde{g} = u \widehat{g}\), we obtain

$$\begin{aligned} \widetilde{g}|_x = u|_x \widehat{g}|_x, \quad 0 = (\partial _{\xi _x} u)|_x u|_x^{-1} + u|_x (\partial _{\xi _x} \widehat{g})|_x \widehat{g}|_x^{-1} u|_x^{-1}. \end{aligned}$$

The second equality is obtained by computing \(\partial _{\xi _x} \widetilde{g} \widetilde{g}^{-1}\) in terms of u and \(\widehat{g}\) and then evaluating at x, noting that since \(\widetilde{g}\) is constant along \(\mathbb {C}P^1\) in a neighbourhood of x, we have \((\partial _{\xi _x} \widetilde{g} \widetilde{g}^{-1})|_x = 0\). By definition of the product in \(T = G \ltimes \mathfrak {g}_{\mathrm{ab}}\), the above two equations are equivalent to

$$\begin{aligned} ( \widetilde{g}|_x, 0) = \big ( u|_x, (\partial _{\xi _x} u)|_x u|_x^{-1} \big ) \big ( \widehat{g}|_x, (\partial _{\xi _x} \widehat{g})|_x \widehat{g}|_x^{-1} \big ). \end{aligned}$$

Yet since \(( \widetilde{g}|_x, 0)\) was defined as the representative in \(G \ltimes \{ 0 \}\) of the class in \(K \backslash T\) of \((\widehat{g}|_x, (\partial _{\xi _x} \widehat{g})|_x \widehat{g}|_x^{-1}) \in T\), condition (4.19) follows. \(\square \)

An important class of Lie subalgebras \(\mathfrak {k}\subset \mathfrak {t}\) with property (4.16) is provided by solutions \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the classical Yang–Baxter equation, i.e. (4.12) with \(c = 0\), which reads

$$\begin{aligned}{}[R {\textsf {x} }, R {\textsf {y} }] - R \big ( [R {\textsf {x} }, {\textsf {y} }] + [{\textsf {x} }, R {\textsf {y} }] \big ) = 0 \end{aligned}$$
(4.21)

for every \({\textsf {x} }, {\textsf {y} } \in \mathfrak {g}\). Specifically, given such a solution, we define the Lie subalgebra

$$\begin{aligned} \mathfrak {g}_R \,{:}{=}\, \{ (- R{\textsf {x} }, {\textsf {x} }) \,|\, {\textsf {x} } \in \mathfrak {g}\} \end{aligned}$$

of \(\mathfrak {t}\). The fact that it is a subalgebra is a direct consequence of (4.21). Indeed, for any \({\textsf {x} }, {\textsf {y} } \in \mathfrak {g}\), we have

$$\begin{aligned} \big [ (-R{\textsf {x} }, {\textsf {x} }), (-R{\textsf {y} }, {\textsf {y} }) \big ]_\mathfrak {t}= \big ( [- R{\textsf {x} }, - R {\textsf {y} }], [- R{\textsf {x} }, {\textsf {y} }] - [- R{\textsf {y} }, {\textsf {x} }] \big ) = (- R {\textsf {z} }, {\textsf {z} }) \in \mathfrak {g}_R \end{aligned}$$

where \({\textsf {z} } = - [R{\textsf {x} }, {\textsf {y} }] - [{\textsf {x} }, R {\textsf {y} }] \in \mathfrak {g}\).

In the case when \({{\,\mathrm{res}\,}}_x \omega = 0\), which we shall focus on in Sect. 5.2, it is clear that the Lie subalgebra \(\mathfrak {g}\ltimes \{ 0 \} \subset \mathfrak {t}\) is isotropic with respect to \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {t}; x}\). If, moreover, the solution \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of (4.21) is skew-symmetric in the sense that

$$\begin{aligned} \langle R{\textsf {x} }, {\textsf {y} } \rangle = - \langle {\textsf {x} }, R {\textsf {y} } \rangle \end{aligned}$$

for any \({\textsf {x} }, {\textsf {y} } \in \mathfrak {g}\), then the subalgebra \(\mathfrak {g}_R \subset \mathfrak {t}\) is also isotropic. In this case, we therefore have a Manin triple \((\mathfrak {t}, \mathfrak {g}_R, \mathfrak {g}\ltimes \{ 0 \})\).
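
The isotropy of \(\mathfrak {g}_R\) follows from a one-line computation, sketched here using the definition of \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {t}; x}\) with \({{\,\mathrm{res}\,}}_x \omega = 0\):

$$\begin{aligned} \langle \!\langle (- R {\textsf {x}}, {\textsf {x}}), (- R {\textsf {y}}, {\textsf {y}}) \rangle \!\rangle _{\mathfrak {t}; x} = - ({{\,\mathrm{res}\,}}_x \xi _x \omega ) \big ( \langle R {\textsf {x}}, {\textsf {y}} \rangle + \langle {\textsf {x}}, R {\textsf {y}} \rangle \big ) = 0. \end{aligned}$$

Moreover, an element \((- R {\textsf {x}}, {\textsf {x}})\) lies in \(\mathfrak {g}\ltimes \{ 0 \}\) only if \({\textsf {x}} = 0\), and both subalgebras have the dimension of \(\mathfrak {g}\), so they are complementary in \(\mathfrak {t}\).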

Let \(G_R\) denote the Lie subgroup of T with Lie algebra \(\mathfrak {g}_R\). We will assume, as in (4.20), that the vector space direct sum decomposition \(\mathfrak {t}= (\mathfrak {g}\ltimes \{0\}) \dotplus \mathfrak {g}_R\) lifts to the Lie group level, namely that \(T = G_R (G \ltimes \{ 0 \})\), or at least that \(G_R (G \ltimes \{ 0 \})\) forms a dense subset of T, cf. Sect. 4.4.

5 Examples

In this section, we rederive the actions of many known integrable \(\sigma \)-models from the four-dimensional Chern–Simons action (1.2). Specifically, our starting point in each case is the 1-form \(\omega \) given by

$$\begin{aligned} \omega = \varphi (z) dz \end{aligned}$$

where \(\varphi (z)\) is the twist function of the integrable \(\sigma \)-model that we want to consider, which has at most double poles. We then impose natural boundary conditions on the 1-form A at the poles of \(\omega \), of the various types discussed in Sect. 4. In each case, we then compute the corresponding action (3.7) and show that it coincides with the known action of the given integrable \(\sigma \)-model. In all cases, we also find that the meromorphic 1-form \({\mathcal {L}}\) coincides with the Lax connection of the integrable \(\sigma \)-model.

In every example, \(\omega \) will have a pair of simple zeroes, say at \(y_\pm \in {{\varvec{\zeta }}}\). Since all the \(\sigma \)-models that we want to reconstruct are relativistic, by Remark 2.1 we will thus take \(\sigma _{y_\pm } = \sigma ^\pm \) in the notation of (2.11). The reason for not taking \(\sigma _{y_\pm }\) both equal to \(\sigma ^+\) or both equal to \(\sigma ^-\) is that the resulting 1-form \({\mathcal {L}}\) would be quite degenerate, with one of its light-cone components being independent of the spectral parameter. In the absence of a Lax connection, there is no guarantee that the resulting \(\sigma \)-model would be integrable. We will come back in Sect. 6 to considering such a case.

5.1 Principal chiral model with WZ-term

Although the action for this model was already derived from (1.2) in [7], we give the derivation of this case in detail as it illustrates the general procedure for constructing the action of an integrable \(\sigma \)-model from the two-dimensional action (3.7) in the simplest possible setting.

Consider the 1-form (see, for instance, [40, §5.1.3] and [32] in the case \(k=0\))

$$\begin{aligned} \omega = K \frac{1 - z^2}{(z-k)^2} dz, \end{aligned}$$

where K and k are real parameters. It has a pair of double poles at \(k \in \mathbb {R}\) and \(\infty \). Note that, under the change of variable \(z \mapsto z + k\), this can also be brought to the equivalent form

$$\begin{aligned} \omega = - K \frac{(z - z_+)(z - z_-)}{z^2} dz \end{aligned}$$

with \(z_\pm \,{:}{=}\, -k\pm 1\). This is the 1-form used in [7] to describe the principal chiral model with WZ-term.
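
The equivalence of the two forms can be checked by direct substitution. Replacing z by \(z + k\) in the first expression for \(\omega \) gives

$$\begin{aligned} K \frac{1 - (z + k)^2}{z^2} dz = - K \frac{(z + k - 1)(z + k + 1)}{z^2} dz = - K \frac{(z - z_+)(z - z_-)}{z^2} dz, \end{aligned}$$

with \(z_+ = 1 - k\) and \(z_- = - 1 - k\), in agreement with the definition of \(z_\pm \).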

As discussed in Sect. 4.1, we can satisfy the boundary equations of motion (2.6) by requiring that

$$\begin{aligned} A_i|_k = 0, \quad A_i|_\infty = 0, \end{aligned}$$
(5.1)

for \(i = \tau , \sigma \). It follows from Lemma 3.1 and Proposition 4.1 that \(\widehat{g}\) can be chosen of archipelago type and, moreover, such that

$$\begin{aligned} g_k = g, \quad g_\infty = 1 \end{aligned}$$

for some \(g : \varSigma \rightarrow G\). The latter condition is used to fix the gauge invariance of Proposition 3.3. Evaluating (2.9) at k and \(\infty \), we then find

$$\begin{aligned} A|_k = - dg g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}|_k, \quad A|_\infty = {\mathcal {L}}|_\infty . \end{aligned}$$
(5.2)

Now the 1-form \(\omega \) has simple zeroes at \(\pm 1\), i.e. \({{\varvec{\zeta }}} = \{1, -1\}\). On the other hand, we also know from combining the second equations in (5.1) and (5.2) that \({\mathcal {L}}\) vanishes at infinity. Thus, \(U_\sigma = U_\tau = 0\) in the general expression (2.11) for the meromorphic dependence of \({\mathcal {L}}\) on \(\mathbb {C}P^1\). As discussed at the start of this section, in the general notation of (2.11) we choose \(\sigma _{\pm 1} = \sigma ^\pm \), so that the Lax connection in the present case takes the form

$$\begin{aligned} {\mathcal {L}} = \frac{V^{1}}{z-1} d\sigma ^+ + \frac{V^{-1}}{z+1} d\sigma ^-, \end{aligned}$$

for some \(V^{\pm 1} : \varSigma \rightarrow \mathfrak {g}\). Their expressions in terms of the G-valued field g can now be determined uniquely by solving \(- \partial _i g g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}_i|_k = 0\) for \(i = \tau , \sigma \), which follows from combining the first two equations in (5.1) and (5.2). We find

$$\begin{aligned} V^{\pm 1} = (k\mp 1) j_\pm , \end{aligned}$$

where \(j_\pm \,{:}{=}\, g^{-1} \partial _\pm g\).
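
The intermediate step can be sketched as follows, assuming the light-cone decompositions \({\mathcal {L}} = {\mathcal {L}}_+ d\sigma ^+ + {\mathcal {L}}_- d\sigma ^-\) and \(dg = \partial _+ g \, d\sigma ^+ + \partial _- g \, d\sigma ^-\). Evaluating the above form of the Lax connection at \(z = k\) and inserting it into the condition \(- \partial _\pm g g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}_\pm |_k = 0\), we get

$$\begin{aligned} \frac{V^{\pm 1}}{k \mp 1} = {\mathcal {L}}_\pm |_k = {{\,\mathrm{Ad}\,}}_g^{-1} \big ( \partial _\pm g g^{-1} \big ) = g^{-1} \partial _\pm g = j_\pm , \end{aligned}$$

which is the stated expression for \(V^{\pm 1}\).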

We now have all the ingredients to compute action (3.7) in the case at hand. Note that the terms in this action corresponding to the pole \(\infty \in {{\varvec{z}}}\) do not contribute since we chose to set \(g_\infty = 1\). To compute the first term, we thus only need the residue

$$\begin{aligned} {{\,\mathrm{res}\,}}_k \omega \wedge \mathcal {L}=-K \big ( (k-1) j_+ d\sigma ^++(k+1) j_- d\sigma ^- \big ), \end{aligned}$$

while for the WZ-term we note that \({{\,\mathrm{res}\,}}_k\omega =-2K k\). From these expressions and the fact that \(d\sigma ^+ \wedge d\sigma ^- = {\small \frac{1}{2}}d\sigma \wedge d\tau \), we finally obtain

$$\begin{aligned} S[g]&= \frac{K}{2}\int \langle j_+, j_- \rangle d\sigma \wedge d\tau + K k \, I_{\mathrm{WZ}}[g], \end{aligned}$$

which we recognise as the action of the principal chiral model in the presence of a WZ-term.
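
The two residues entering this computation can be verified directly. The following sketch uses only the expressions for \(\omega \), \({\mathcal {L}}\) and \(V^{\pm 1}\) given above, and takes the residue of the product of the coefficient of \(\omega \) with \({\mathcal {L}}\). Using \((1 - z^2)/(z \mp 1) = - (z \pm 1)\), we have

$$\begin{aligned} K \frac{1 - z^2}{(z - k)^2} \, {\mathcal {L}}&= - \frac{K}{(z - k)^2} \big ( (z + 1) V^{1} d\sigma ^+ + (z - 1) V^{-1} d\sigma ^- \big ),\\ {{\,\mathrm{res}\,}}_k \omega \wedge {\mathcal {L}}&= - K \big ( V^{1} d\sigma ^+ + V^{-1} d\sigma ^- \big ) = - K \big ( (k - 1) j_+ d\sigma ^+ + (k + 1) j_- d\sigma ^- \big ),\\ {{\,\mathrm{res}\,}}_k \omega&= \frac{d}{dz} \Big ( K (1 - z^2) \Big ) \Big |_{z = k} = - 2 K k. \end{aligned}$$

In the second and third lines, the residue at the double pole is the z-derivative of the numerator evaluated at \(z = k\).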

5.2 Homogeneous Yang–Baxter \(\sigma \)-model

We will follow the conventions of [11, §4.2.1].

The procedure for constructing a homogeneous Yang–Baxter deformation [18] of a given integrable \(\sigma \)-model does not modify the underlying twist function [39]. For this reason, we start from the same 1-form as in Sect. 5.1. However, to simplify the discussion, we set \(k = 0\) and take

$$\begin{aligned} \omega = K \frac{1 - z^2}{z^2} dz \end{aligned}$$

where K is a real parameter. The discussion of the more general case with \(k \ne 0\) could be done by proceeding along the same lines as in Sect. 5.6.

Now although the 1-form \(\omega \) is the same as in Sect. 5.1, here we will impose a different boundary condition at its double pole 0 compared to that in (5.1). More precisely, we will replace it with a boundary condition that is associated with a choice of Lagrangian subalgebra of \(\mathfrak {t}= \mathfrak {g}\ltimes \mathfrak {g}_{\mathrm{ab}}\), as discussed in Sect. 4.5.

The bilinear form on \(\mathfrak {t}\) in the present case reads

$$\begin{aligned} \langle \!\langle ({\textsf {x} }, {\textsf {y} }), ({\textsf {x} }', {\textsf {y} }') \rangle \!\rangle _{\mathfrak {t}; 0} = K \big ( \langle {\textsf {x} }, {\textsf {y} }' \rangle + \langle {\textsf {x} }', {\textsf {y} } \rangle \big ). \end{aligned}$$

Let us fix any skew-symmetric solution \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the classical Yang–Baxter equation (4.21). As recalled at the end of Sect. 4.5, it follows that \(\mathfrak {g}_R = \{ (- R{\textsf {x} }, {\textsf {x} }) \,|\, {\textsf {x} } \in \mathfrak {g}\}\) is a Lagrangian Lie subalgebra of \(\mathfrak {t}\).

We may use this Lagrangian subalgebra \(\mathfrak {g}_R \subset \mathfrak {t}\) to satisfy the boundary equations of motion (2.6) by requiring that (see Sect. 4.5)

$$\begin{aligned} \big ( A_i|_0, (\partial _z A_i)|_0 \big ) \in \mathfrak {g}_R, \quad A_i|_\infty = 0, \end{aligned}$$
(5.3)

for \(i = \tau , \sigma \). Recall here that \(\xi _0 = z\) is the local coordinate at 0. By virtue of Lemma 3.1 and Proposition 4.4, we can choose \(\widehat{g}\) to be of archipelago type and, moreover, such that

$$\begin{aligned} g_0 = g, \quad g_\infty = 1 \end{aligned}$$

for some \(g : \varSigma \rightarrow G\). The latter condition fixes the gauge invariance of Proposition 3.3. Evaluating (2.9) at 0 and \(\infty \), we then find

$$\begin{aligned} A|_0 = - dg g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}|_0, \quad A|_\infty = {\mathcal {L}}|_\infty , \end{aligned}$$
(5.4a)

but also, taking the derivative with respect to z before evaluating at 0 and using the fact that \((\partial _z \widehat{g})|_0 = 0\) by virtue of the archipelago condition (iii), we obtain

$$\begin{aligned} (\partial _z A)|_0 = {{\,\mathrm{Ad}\,}}_g (\partial _z {\mathcal {L}})|_0. \end{aligned}$$
(5.4b)

Since \({\mathcal {L}}\) is meromorphic with poles in the set \({{\varvec{\zeta }}} = \{ 1, - 1 \}\) of zeroes of \(\omega \) and since it vanishes at infinity by the last two equations in (5.3) and (5.4a), it follows from the general expression (2.11) that we can write

$$\begin{aligned} {\mathcal {L}} = \frac{V^{1}}{z - 1} d\sigma ^+ + \frac{V^{-1}}{z + 1} d\sigma ^-. \end{aligned}$$

Now the first condition in (5.3) implies that \(A_i|_0 = - R (\partial _z A_i)|_0\). By combining this with (5.4) and the above explicit form of \({\mathcal {L}}\), we obtain

$$\begin{aligned} V^{\pm 1} = \mp \frac{1}{1 \pm R_g} j_\pm , \end{aligned}$$

where \(R_g \,{:}{=}\, {{\,\mathrm{Ad}\,}}_{g^{-1}} \circ R \circ {{\,\mathrm{Ad}\,}}_g\).
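
The intermediate step can be sketched as follows, using only (5.4) and the rational form of \({\mathcal {L}}\) above. Evaluating at the origin gives

$$\begin{aligned} {\mathcal {L}}|_0 = - V^{1} d\sigma ^+ + V^{-1} d\sigma ^-, \quad (\partial _z {\mathcal {L}})|_0 = - V^{1} d\sigma ^+ - V^{-1} d\sigma ^-. \end{aligned}$$

The \(d\sigma ^\pm \)-components of \(A|_0 = - R (\partial _z A)|_0\) then read \(- \partial _\pm g g^{-1} \mp {{\,\mathrm{Ad}\,}}_g V^{\pm 1} = R \, {{\,\mathrm{Ad}\,}}_g V^{\pm 1}\). Applying \({{\,\mathrm{Ad}\,}}_{g^{-1}}\) to both sides, we arrive at

$$\begin{aligned} (1 \pm R_g) V^{\pm 1} = \mp j_\pm , \end{aligned}$$

which is equivalent to the expression for \(V^{\pm 1}\) given above.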

Finally, noting that \({{\,\mathrm{res}\,}}_0 \omega \wedge {\mathcal {L}} = - K (V^1 d\sigma ^+ + V^{-1} d\sigma ^-)\) and \({{\,\mathrm{res}\,}}_0 \omega = 0\), we find that action (3.7) reduces to

$$\begin{aligned} S[g]&= \frac{K}{2} \int _{\varSigma } \langle \!\langle j_+, \frac{1}{1 - R_g} j_- \rangle \!\rangle d\sigma \wedge d\tau . \end{aligned}$$

This is the action of the homogeneous Yang–Baxter deformation of the principal chiral model, as first constructed in [18] in the case of the semi-symmetric space \(\sigma \)-model.
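
For completeness, the two residues at the double pole used in this computation can be read off from the Laurent expansion of \(\omega \) at the origin. Since \({\mathcal {L}}\) is regular at 0, only its first derivative contributes:

$$\begin{aligned} \omega = K \big ( z^{-2} - 1 \big ) dz \quad \Longrightarrow \quad {{\,\mathrm{res}\,}}_0 \omega = 0, \quad {{\,\mathrm{res}\,}}_0 \omega \wedge {\mathcal {L}} = K (\partial _z {\mathcal {L}})|_0 = - K \big ( V^{1} d\sigma ^+ + V^{-1} d\sigma ^- \big ). \end{aligned}$$

In particular, the vanishing of \({{\,\mathrm{res}\,}}_0 \omega \) is the reason why no WZ-term appears in the resulting action.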

5.3 Yang–Baxter \(\sigma \)-model

The twist function in this case was first computed in [12]. We will follow the conventions of [11, § 4.2.2]. In particular, we take

$$\begin{aligned} \omega = \frac{K}{1-c^2 \eta ^2} \frac{1 - z^2}{z^2 - c^2 \eta ^2} dz, \end{aligned}$$

with \(K, \eta \) real parameters and \(c = 1\) or \(c = \mathrm{i}\).

We fix a skew-symmetric solution \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the modified classical Yang–Baxter equation (4.12). As \({{\,\mathrm{res}\,}}_{c \eta } \omega = - {{\,\mathrm{res}\,}}_{- c \eta } \omega \), it follows from Sect. 4.4 that \(\mathfrak {g}_R\) is a Lagrangian subalgebra of \(\mathfrak {d}\) when \(c = 1\) (resp. \(\mathfrak {g}^\mathbb {C}\) when \(c = \mathrm{i}\)). The boundary equations of motion (2.6) can then be satisfied by requiring that

$$\begin{aligned} (A_i|_{\eta }, A_i|_{-\eta }) \in \mathfrak {g}_R, \quad A_i|_\infty = 0, \end{aligned}$$
(5.5a)

for \(i = \tau , \sigma \), in the case \(c = 1\), or

$$\begin{aligned} A_i|_{\mathrm{i}\eta } \in \mathfrak {g}_R, \quad A_i|_\infty = 0, \end{aligned}$$
(5.5b)

for \(i = \tau , \sigma \), in the case \(c = \mathrm{i}\). It follows from Lemma 3.1 and Propositions 4.1, 4.2 and 4.3 that we can choose \(\widehat{g}\) to be of archipelago type.

Moreover, by the discussion in Sect. 4.4 and Proposition 3.3, we are able to choose our archipelago-type field \(\widehat{g}\) such that

$$\begin{aligned} g_{\pm c \eta } = g, \quad g_\infty = 1 \end{aligned}$$

for some \(g : \varSigma \rightarrow G\). More precisely, by the last part of Proposition 4.2 (resp. Proposition 4.3), the value of \(\widehat{g}\) at the pair of points \(\pm \eta \) when \(c = 1\) (resp. at the point \(\mathrm{i}\eta \) when \(c = \mathrm{i}\)) defines a field on \(\varSigma \) valued in \(G_R \backslash D\) (resp. in \(G_R \backslash G^\mathbb {C}\)). We can parametrise this quotient by the diagonal subgroup \(G^\delta \) (resp. the real subgroup G), which allows us to choose \(\widehat{g}\) such that \((\widehat{g}|_\eta , \widehat{g}|_{-\eta }) = (g, g)\) (resp. \(\widehat{g}|_{\mathrm{i}\eta } = g\)). In the case \(c = \mathrm{i}\), we then use the fact that \(\widehat{g}\) is equivariant to obtain also \(g_{-\mathrm{i}\eta } = \tau (g_{\mathrm{i}\eta }) = g\). With this choice, evaluating (2.9) at the poles of \(\omega \), we then obtain

$$\begin{aligned} A|_{\pm c \eta } = - dg g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}|_{\pm c \eta }, \quad A|_\infty = {\mathcal {L}}|_\infty . \end{aligned}$$
(5.6)

Now \(\omega \) has simple zeroes at \(\pm 1\) so \({{\varvec{\zeta }}} = \{ 1, -1 \}\). Moreover, combining the last two equations in (5.5) and (5.6), we find that \({\mathcal {L}}\) should vanish at infinity. By the same reasoning as in Sect. 5.1, this allows us to write the Lax matrix in the form

$$\begin{aligned} {\mathcal {L}} = \frac{V^1}{z-1} d\sigma ^+ + \frac{V^{-1}}{z+1} d\sigma ^- \end{aligned}$$

for some \(\mathfrak {g}\)-valued fields \(V^{\pm 1}\) to be determined.

It follows from the first condition in (5.5) that \((R+c) A_i|_{c \eta } = (R - c) A_i|_{-c \eta }\). By combining this with the first equation in (5.6) and the above explicit rational form of \({\mathcal {L}}\), we therefore deduce that

$$\begin{aligned}&- (R+c) dg g^{-1} + (R+c) {{\,\mathrm{Ad}\,}}_g \left( \frac{1}{c\eta -1} V^1 d\sigma ^+ + \frac{1}{c\eta +1} V^{-1} d\sigma ^- \right) \\&\quad = - (R-c) dg g^{-1} - (R-c) {{\,\mathrm{Ad}\,}}_g \left( \frac{1}{c\eta +1} V^1 d\sigma ^+ + \frac{1}{c\eta -1} V^{-1} d\sigma ^- \right) . \end{aligned}$$

By equating the \(d\sigma ^\pm \)-components on both sides, we obtain two equations for the two unknowns \(V^{\pm 1}\) which can be solved to give

$$\begin{aligned} V^{\pm 1} = \pm \frac{c^2\eta ^2-1}{1 \pm \eta R_g} j_\pm \end{aligned}$$

where \(R_g = {{\,\mathrm{Ad}\,}}_{g^{-1}} \circ R \circ {{\,\mathrm{Ad}\,}}_g\) as before and \(j_\pm = g^{-1} \partial _\pm g\).
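
To make the last step explicit, one may conjugate the \(d\sigma ^\pm \)-components of the boundary relation by \({{\,\mathrm{Ad}\,}}_{g^{-1}}\), which turns R into \(R_g\) and \(\partial _\pm g g^{-1}\) into \(j_\pm \). A sketch of the resulting algebra is

$$\begin{aligned} (R_g + c) \left( - j_\pm + \frac{V^{\pm 1}}{c \eta \mp 1} \right)&= (R_g - c) \left( - j_\pm - \frac{V^{\pm 1}}{c \eta \pm 1} \right) , \\ \frac{(R_g + c)(c \eta \pm 1) + (R_g - c)(c \eta \mp 1)}{c^2 \eta ^2 - 1} V^{\pm 1}&= 2 c \, j_\pm , \\ \pm \frac{2 c \, (1 \pm \eta R_g)}{c^2 \eta ^2 - 1} V^{\pm 1}&= 2 c \, j_\pm . \end{aligned}$$

Inverting the operator \(1 \pm \eta R_g\) reproduces the stated expression for \(V^{\pm 1}\).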

Since \(g_\infty = 1\), there is no WZ-term in action (3.7) corresponding to the double pole at \(\infty \). On the other hand, as \({{\,\mathrm{res}\,}}_{\pm c \eta } \omega = \pm K/(2 c \eta )\) and \(g_{c \eta } = g_{-c\eta }\), it follows that the WZ-terms associated with the simple poles \(\pm c \eta \) cancel out.
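
The residues at the two simple poles follow directly from the factorisation \(z^2 - c^2 \eta ^2 = (z - c \eta )(z + c \eta )\), as the following short computation shows:

$$\begin{aligned} {{\,\mathrm{res}\,}}_{\pm c \eta } \omega = \frac{K}{1 - c^2 \eta ^2} \, \frac{1 - z^2}{z \pm c \eta } \bigg |_{z = \pm c \eta } = \frac{K}{1 - c^2 \eta ^2} \, \frac{1 - c^2 \eta ^2}{\pm 2 c \eta } = \pm \frac{K}{2 c \eta }. \end{aligned}$$

The normalisation factor \((1 - c^2 \eta ^2)^{-1}\) in \(\omega \) is thus precisely what makes these residues independent of the numerator \(1 - z^2\).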

To compute the first term in action (3.7), we need the residue

$$\begin{aligned} {{\,\mathrm{res}\,}}_{\pm c \eta } \omega \wedge {\mathcal {L}} = ({{\,\mathrm{res}\,}}_{\pm c \eta } \omega ) {\mathcal {L}}|_{\pm c \eta } = \pm \frac{K}{2 c \eta } \left( \frac{c\eta +1}{1 \pm \eta R_g} j_\pm d\sigma ^\pm - \frac{c\eta -1}{1 \mp \eta R_g} j_\mp d\sigma ^\mp \right) . \end{aligned}$$

Putting everything together, we find that action (3.7) becomes

$$\begin{aligned} S[g] = \frac{K}{2} \int _{\varSigma } \langle \!\langle j_+, \frac{1}{1 - \eta R_g} j_- \rangle \!\rangle d\sigma \wedge d\tau \end{aligned}$$

which coincides with the Yang–Baxter \(\sigma \)-model action [24, 25].

5.4 \(\lambda \)-Deformation of the principal chiral model

The twist function in this case was first computed in [16]. We shall follow here the conventions of [11, §4.4]. In particular, we take

$$\begin{aligned} \omega = \frac{K}{1-\alpha ^2} \frac{1 - z^2}{z^2 - \alpha ^2} dz, \quad \lambda = \frac{1+\alpha }{1-\alpha }, \end{aligned}$$

with \(K, \alpha \) real parameters.

Since \({{\,\mathrm{res}\,}}_{\alpha } \omega = - {{\,\mathrm{res}\,}}_{- \alpha } \omega \), it follows from Sect. 4.4 that \(\mathfrak {g}^\delta \) is a Lagrangian subalgebra of \(\mathfrak {d}\). We can therefore satisfy the boundary condition (2.6) by requiring that

$$\begin{aligned} (A_i|_{\alpha }, A_i|_{-\alpha }) \in \mathfrak {g}^\delta , \quad A_i|_\infty = 0 \end{aligned}$$

for \(i= \tau , \sigma \). In other words, we have \(A_i|_{\alpha } = A_i|_{-\alpha }\) and \(A_i|_\infty = 0\). It follows from Lemma 3.1 and Propositions 4.1 and 4.2 that we can choose \(\widehat{g}\) to be of archipelago type. Now as in the corresponding discussion of Sect. 5.3, it follows from the last part of Proposition 4.2 that \((\widehat{g}|_\alpha , \widehat{g}|_{-\alpha })\) defines a field on \(\varSigma \) valued in \(G^\delta \backslash D\). A natural parametrisation of this quotient consists of elements of the form (h, 1) for \(h \in G\). We can thus choose our archipelago-type field \(\widehat{g}\) such that

$$\begin{aligned} g_\alpha = g, \quad g_{-\alpha } = 1, \quad g_\infty = 1 \end{aligned}$$

for some \(g : \varSigma \rightarrow G\). The condition on \(g_\infty \) is imposed by virtue of Proposition 3.3. Evaluating (2.9) at the poles of \(\omega \), we thus obtain

$$\begin{aligned} A|_\alpha = - dg g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}|_\alpha , \quad A|_{-\alpha } = {\mathcal {L}}|_{-\alpha }, \quad A|_\infty = {\mathcal {L}}|_\infty . \end{aligned}$$
(5.7)

Using the last equation and the boundary condition at infinity, we get \({\mathcal {L}}|_\infty = 0\). Since \({\mathcal {L}}\) is meromorphic with simple poles in the set \({{\varvec{\zeta }}} = \{ 1, - 1 \}\) of zeroes of \(\omega \), we deduce its dependence on z to be of the form, cf. Sects. 5.1, 5.2 and 5.3,

$$\begin{aligned} {\mathcal {L}} = \frac{\alpha +1}{z-1} U_+ d\sigma ^+ + \frac{\alpha +1}{z+1} U_- d\sigma ^- \end{aligned}$$

for some \(\mathfrak {g}\)-valued pair of fields \(U_\pm = (\alpha + 1)^{-1} V^{\pm 1}\) on \(\varSigma \). The normalising factor of \(\alpha +1\) is introduced for convenience. In particular, evaluating \({\mathcal {L}}\) at \(\pm \alpha \), we find

$$\begin{aligned} {\mathcal {L}}|_\alpha = -\lambda U_+ d\sigma ^+ + U_- d\sigma ^-, \quad {\mathcal {L}}|_{-\alpha } = - U_+ d\sigma ^+ + \lambda U_- d\sigma ^-. \end{aligned}$$
(5.8)
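
The coefficients in (5.8) are obtained by evaluating the four rational prefactors at \(z = \pm \alpha \) and using \(\lambda = (1 + \alpha )/(1 - \alpha )\):

$$\begin{aligned} \frac{\alpha + 1}{z - 1} \bigg |_{z = \alpha } = - \lambda , \quad \frac{\alpha + 1}{z + 1} \bigg |_{z = \alpha } = 1, \quad \frac{\alpha + 1}{z - 1} \bigg |_{z = - \alpha } = - 1, \quad \frac{\alpha + 1}{z + 1} \bigg |_{z = - \alpha } = \lambda . \end{aligned}$$

This is where the convenience of the normalising factor \(\alpha + 1\) becomes apparent: the only parameter left in \({\mathcal {L}}|_{\pm \alpha }\) is \(\lambda \).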

It then follows from the boundary conditions at \(\pm \alpha \) and the first two equations in (5.7) that

$$\begin{aligned} - dg g^{-1} - \lambda {{\,\mathrm{Ad}\,}}_g U_+ d\sigma ^+ + {{\,\mathrm{Ad}\,}}_g U_- d\sigma ^- = - U_+ d\sigma ^+ + \lambda U_- d\sigma ^-. \end{aligned}$$

Equating the coefficients of \(d\sigma ^\pm \) on both sides, solving for \(U_\pm \) and substituting back into (5.8), we find

$$\begin{aligned} {\mathcal {L}}|_\alpha = - \frac{\lambda {{\,\mathrm{Ad}\,}}_g}{1 - \lambda {{\,\mathrm{Ad}\,}}_g} j_+ d\sigma ^+ + \frac{{{\,\mathrm{Ad}\,}}_g}{{{\,\mathrm{Ad}\,}}_g - \lambda } j_- d\sigma ^-. \end{aligned}$$
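
In more detail, the \(d\sigma ^\pm \)-components of the boundary relation can be solved as follows, using \(\partial _\pm g g^{-1} = {{\,\mathrm{Ad}\,}}_g j_\pm \) with \(j_\pm = g^{-1} \partial _\pm g\):

$$\begin{aligned} (1 - \lambda {{\,\mathrm{Ad}\,}}_g) U_+&= {{\,\mathrm{Ad}\,}}_g j_+ \quad \Longrightarrow \quad U_+ = \frac{{{\,\mathrm{Ad}\,}}_g}{1 - \lambda {{\,\mathrm{Ad}\,}}_g} j_+, \\ ({{\,\mathrm{Ad}\,}}_g - \lambda ) U_-&= {{\,\mathrm{Ad}\,}}_g j_- \quad \Longrightarrow \quad U_- = \frac{{{\,\mathrm{Ad}\,}}_g}{{{\,\mathrm{Ad}\,}}_g - \lambda } j_-. \end{aligned}$$

Inserting these into \({\mathcal {L}}|_\alpha = - \lambda U_+ d\sigma ^+ + U_- d\sigma ^-\) gives the expression above.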

We did not specify \({\mathcal {L}}|_{-\alpha }\) since it will not be needed as \(g_{-\alpha } = 1\). It now follows that

$$\begin{aligned} {{\,\mathrm{res}\,}}_\alpha \omega \wedge \mathcal {L}= ({{\,\mathrm{res}\,}}_\alpha \omega ) {\mathcal {L}}|_\alpha = 2k \frac{\lambda {{\,\mathrm{Ad}\,}}_g}{1 - \lambda {{\,\mathrm{Ad}\,}}_g} j_+ d\sigma ^+ - 2k \frac{{{\,\mathrm{Ad}\,}}_g}{{{\,\mathrm{Ad}\,}}_g - \lambda } j_- d\sigma ^- \end{aligned}$$

using the fact that \({{\,\mathrm{res}\,}}_\alpha \omega = - 2k\) where \(k = - K/(4\alpha )\).

Inserting all of the above into action (3.7), we find it simplifies to

$$\begin{aligned} S[g]&= \frac{k}{2} \int _\varSigma \langle g^{-1} \partial _+ g, g^{-1} \partial _- g \rangle d\sigma \wedge d\tau + k \, I_{\mathrm{WZ}}[g]\\&\quad + k \int _\varSigma \langle \!\langle \frac{1}{\lambda ^{-1} - {{\,\mathrm{Ad}\,}}_g} \partial _+ g g^{-1}, g^{-1} \partial _- g \rangle \!\rangle d\sigma \wedge d\tau . \end{aligned}$$

It coincides with the action of the \(\lambda \)-deformation of the principal chiral model [37], written using the conventions of [11, §4.4].

5.5 Bi-Yang–Baxter \(\sigma \)-model

We follow the conventions used in [9]. In particular, we take

$$\begin{aligned} \omega = \frac{16 K z}{\zeta ^2 (z - z_+)(z - z_-)(z - {\tilde{z}}_+)(z - {\tilde{z}}_-)} dz, \end{aligned}$$
(5.9)

where \(K \in \mathbb {R}\). The four poles \(z_\pm \) and \({\tilde{z}}_\pm \) as well as \(\zeta \in \mathbb {R}\) are related to the two real deformation parameters \(\eta \) and \({\tilde{\eta }}\) of the model by

$$\begin{aligned} z_\pm&= \frac{- 2 \rho \pm \mathrm{i}\eta }{\zeta }, \quad {\tilde{z}}_\pm = - \frac{2 + 2 \rho \pm \mathrm{i}{\tilde{\eta }}}{\zeta }, \quad \rho = - {\small \frac{1}{2}}\left( 1 - \frac{\eta ^2 - {\tilde{\eta }}^2}{4} \right) , \\ \zeta ^2&= \left( 1 + \frac{(\eta + {\tilde{\eta }})^2}{4} \right) \left( 1 + \frac{(\eta - {\tilde{\eta }})^2}{4} \right) . \end{aligned}$$

Choose two skew-symmetric solutions \(R, {\tilde{R}} \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the modified classical Yang–Baxter equation (4.12) with \(c = \mathrm{i}\). Because \({{\,\mathrm{res}\,}}_{z_-} \omega = - {{\,\mathrm{res}\,}}_{z_+} \omega \) and \({{\,\mathrm{res}\,}}_{{\tilde{z}}_-} \omega = - {{\,\mathrm{res}\,}}_{{\tilde{z}}_+} \omega \), it follows from Sect. 4.4 that \(\mathfrak {g}_R\) and \(\mathfrak {g}_{{\tilde{R}}}\) are both Lagrangian subalgebras of \(\mathfrak {g}^\mathbb {C}\). To satisfy the boundary equations of motion (2.6), we impose that

$$\begin{aligned} A_i|_{z_+} \in \mathfrak {g}_R, \quad A_i|_{{\tilde{z}}_+} \in \mathfrak {g}_{{\tilde{R}}}, \end{aligned}$$
(5.10)

for \(i = \tau , \sigma \). By Lemma 3.1 and Proposition 4.3, we can choose \(\widehat{g}\) to be of archipelago type. And by the discussion in Sect. 4.4, see also the corresponding discussion in Sect. 5.3, we can take \(\widehat{g}\) such that

$$\begin{aligned} g_{z_\pm } = g, \quad g_{{\tilde{z}}_\pm } = {\tilde{g}} \end{aligned}$$
(5.11)

for some \(g, {\tilde{g}} : \varSigma \rightarrow G\). Evaluating (2.9) at the poles of \(\omega \), we obtain

$$\begin{aligned} A|_{z_\pm } = - dg g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}|_{z_\pm }, \quad A|_{{\tilde{z}}_\pm } = - d{\tilde{g}} {\tilde{g}}^{-1} + {{\,\mathrm{Ad}\,}}_{{\tilde{g}}} {\mathcal {L}}|_{{\tilde{z}}_\pm }. \end{aligned}$$
(5.12)

The 1-form \(\omega \) has a simple zero at the origin and at infinity, that is \({{\varvec{\zeta }}} = \{ 0, \infty \}\). In the present case, the general form (2.11) of the Lax connection therefore reads

$$\begin{aligned} {\mathcal {L}} = \left( B_+ + \frac{\zeta }{2} z J_+ \right) d\sigma ^+ + \left( B_- + \frac{\zeta }{2} z^{-1} J_- \right) d\sigma ^- \end{aligned}$$
(5.13)

for some \(\mathfrak {g}\)-valued fields \(B_\pm \,{:}{=}\,U_\tau \pm U_\sigma \), \(J_+\,{:}{=}\,2 \zeta ^{-1} V^\infty \) and \(J_- \,{:}{=}\, 2 \zeta ^{-1} V^0\) to be determined.

The \(d\sigma ^\pm \)-components of the two equations

$$\begin{aligned} (R+\mathrm{i})A_i|_{z_+} = (R-\mathrm{i})A_i|_{z_-}, \quad ({\tilde{R}}+\mathrm{i})A_i|_{{\tilde{z}}_+} = ({\tilde{R}}-\mathrm{i})A_i|_{{\tilde{z}}_-}, \end{aligned}$$

which follow from (5.10), give us four equations for the four unknowns \(B_\pm \) and \(J_\pm \). Explicitly, we have

$$\begin{aligned} j_\pm = B_\pm \pm \frac{\eta }{2} R_g J_\pm - \rho J_\pm , \quad {\tilde{\jmath }}_\pm = B_\pm \mp \frac{{\tilde{\eta }}}{2} {\tilde{R}}_{{\tilde{g}}} J_\pm - (\rho +1) J_\pm , \end{aligned}$$
(5.14)

where we have introduced \(j_\pm \,{:}{=}\, g^{-1} \partial _\pm g\) and \({\tilde{\jmath }}_\pm \,{:}{=}\, {\tilde{g}}^{-1} \partial _\pm {\tilde{g}}\). Taking the difference of these two equations yields

$$\begin{aligned} J_\pm = \frac{1}{1 \pm \frac{\eta }{2} R_g \pm \frac{{\tilde{\eta }}}{2} {\tilde{R}}_{{\tilde{g}}}} (j_\pm - {\tilde{\jmath }}_\pm ). \end{aligned}$$

The first equation in (5.14) then also yields \(B_\pm = j_\pm \mp \frac{\eta }{2} R_g J_\pm + \rho J_\pm \). In particular, the Lax connection (5.13) thus coincides with [30, (3.4.9)] or, up to a conventional sign, with [9, (2.18)].
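
The sign pattern in (5.14) can be traced back to a simple property of the poles, which we sketch here. From the definitions of \(\rho \) and \(\zeta \), one checks that \(4 \rho ^2 + \eta ^2 = (2 + 2 \rho )^2 + {\tilde{\eta }}^2 = \zeta ^2\), so that

$$\begin{aligned} z_+ z_- = {\tilde{z}}_+ {\tilde{z}}_- = 1, \quad \frac{\zeta }{2} z_\pm = - \rho \pm \frac{\mathrm{i}\eta }{2}, \quad \frac{\zeta }{2} z_\pm ^{-1} = - \rho \mp \frac{\mathrm{i}\eta }{2}. \end{aligned}$$

Writing \(X_\pm \,{:}{=}\, B_\pm - \rho J_\pm - j_\pm \) and conjugating the first boundary relation by \({{\,\mathrm{Ad}\,}}_{g^{-1}}\), its \(d\sigma ^\pm \)-components become

$$\begin{aligned} (R_g + \mathrm{i}) \left( X_\pm \pm \frac{\mathrm{i}\eta }{2} J_\pm \right) = (R_g - \mathrm{i}) \left( X_\pm \mp \frac{\mathrm{i}\eta }{2} J_\pm \right) \quad \Longrightarrow \quad 2 \mathrm{i}X_\pm \pm \mathrm{i}\eta R_g J_\pm = 0, \end{aligned}$$

which is the first equation in (5.14). The second equation follows in the same way from \(\frac{\zeta }{2} {\tilde{z}}_\pm = - (\rho + 1) \mp \frac{\mathrm{i}{\tilde{\eta }}}{2}\), that is, upon replacing \(\rho \) by \(\rho + 1\) and \(\eta \) by \(- {\tilde{\eta }}\).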

We have \({{\,\mathrm{res}\,}}_{z_\pm } \omega = \mp \frac{2 \mathrm{i}K}{\eta }\) and \({{\,\mathrm{res}\,}}_{{\tilde{z}}_\pm } \omega = \mp \frac{2 \mathrm{i}K}{{\tilde{\eta }}}\). It then follows from (5.11) that the four WZ-terms in action (3.7) cancel in pairs. We also have

$$\begin{aligned} {{\,\mathrm{res}\,}}_{z_+} \omega \wedge {\mathcal {L}} + {{\,\mathrm{res}\,}}_{z_-} \omega \wedge {\mathcal {L}}&= 2 K (J_+ d\sigma ^+ - J_- d\sigma ^-),\\ {{\,\mathrm{res}\,}}_{{\tilde{z}}_+} \omega \wedge {\mathcal {L}} + {{\,\mathrm{res}\,}}_{{\tilde{z}}_-} \omega \wedge {\mathcal {L}}&= - 2 K (J_+ d\sigma ^+ - J_- d\sigma ^-) \end{aligned}$$

so that action (3.7) takes the final form

$$\begin{aligned} S[g, {\tilde{g}}] = K \int _{\varSigma } \langle j_+ - {\tilde{\jmath }}_+, J_- \rangle d\sigma \wedge d\tau . \end{aligned}$$

This is the action of the bi-Yang–Baxter \(\sigma \)-model as written in [9, (2.2)].
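
The residues of \(\omega \) used above can be checked with the help of the identities \(z_+ z_- = {\tilde{z}}_+ {\tilde{z}}_- = 1\), which follow from the definitions of \(\rho \) and \(\zeta \). A sketch of the computation at \(z_+\) is as follows. Using \(z_+ - z_- = 2 \mathrm{i}\eta /\zeta \), \(z_+ + z_- = - 4 \rho /\zeta \) and \({\tilde{z}}_+ + {\tilde{z}}_- = - (4 + 4 \rho )/\zeta \), we have

$$\begin{aligned} (z_+ - {\tilde{z}}_+)(z_+ - {\tilde{z}}_-)&= z_+^2 + 1 - ({\tilde{z}}_+ + {\tilde{z}}_-) z_+ = z_+ \big ( z_+ + z_- - {\tilde{z}}_+ - {\tilde{z}}_- \big ) = \frac{4 z_+}{\zeta }, \\ {{\,\mathrm{res}\,}}_{z_+} \omega&= \frac{16 K z_+}{\zeta ^2 (z_+ - z_-)(z_+ - {\tilde{z}}_+)(z_+ - {\tilde{z}}_-)} = \frac{16 K}{8 \mathrm{i}\eta } = - \frac{2 \mathrm{i}K}{\eta }. \end{aligned}$$

The residues at \(z_-\) and \({\tilde{z}}_\pm \) are obtained in the same way.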

Note that, contrary to the examples discussed in all the previous sections, as well as in Sect. 5.6, we have not fixed the gauge invariance of Proposition 3.3 by fixing the value of \(\widehat{g}\) at any of the poles of \(\omega \). It follows that the above action still has the gauge invariance of Proposition 3.3 which here takes the form

$$\begin{aligned} (g, {\tilde{g}}) \longmapsto (g h, {\tilde{g}} h) \end{aligned}$$

for any smooth \(h : \varSigma \rightarrow G\). Fixing this gauge invariance by setting \({\tilde{g}} = 1\), we obtain the original action of the bi-Yang–Baxter \(\sigma \)-model [25, 26].

5.6 Yang–Baxter \(\sigma \)-model with WZ-term

Consider the 1-form [13]

$$\begin{aligned} \omega = \frac{K(1 - z^2)}{(z - k)^2 - c^2 {\mathcal {A}}^2} dz, \end{aligned}$$

with free parameters \(K, k, {\mathcal {A}} \in \mathbb {R}\). We shall consider in parallel the cases when \(c = 1\) and \(c = \mathrm{i}\). Note that in the limit \(k \rightarrow 0\) we recover the 1-form of the Yang–Baxter \(\sigma \)-model with \({\mathcal {A}} = \eta \), discussed in Sect. 5.3, up to an overall factor.

Besides the double pole at \(\infty \), the 1-form \(\omega \) has two simple poles at \(z_\pm = k \pm c {\mathcal {A}}\) which are both real for \(c=1\) and complex conjugate for \(c = \mathrm{i}\). However, in order to apply the construction of Sect. 4.2 (resp. Sect. 4.3) to the pair of simple poles \(z_\pm \), we require a Lagrangian subalgebra of \(\mathfrak {d}\) (resp. \(\mathfrak {g}^\mathbb {C}\)). But since the residues

$$\begin{aligned} {{\,\mathrm{res}\,}}_{z_\pm } \omega = \pm K \frac{1 - z_\pm ^2}{2c{\mathcal {A}}} \end{aligned}$$

are such that \({{\,\mathrm{res}\,}}_{z_-} \omega \ne - {{\,\mathrm{res}\,}}_{z_+} \omega \), the bilinear form \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {d}; z_\pm }\) on \(\mathfrak {d}\) (resp. \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; z_\pm }\) on \(\mathfrak {g}^\mathbb {C}\)) is not the standard one, by contrast with the situations of Sects. 5.3, 5.4 and 5.5. Our analysis, at least in the case \(c=\mathrm{i}\), is closely related to that of [28] where the double \(\mathfrak {g}^\mathbb {C}\) is also equipped with the more general bilinear form \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; z_\pm }\).
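
The failure of the two residues to be opposite can be quantified explicitly. Since \(z_+ - z_- = 2 c {\mathcal {A}}\) and \(z_+ + z_- = 2 k\), their sum is

$$\begin{aligned} {{\,\mathrm{res}\,}}_{z_+} \omega + {{\,\mathrm{res}\,}}_{z_-} \omega = \frac{K}{2 c {\mathcal {A}}} \big ( z_-^2 - z_+^2 \big ) = - \frac{K}{2 c {\mathcal {A}}} (z_+ - z_-)(z_+ + z_-) = - 2 K k, \end{aligned}$$

which vanishes only for \(k = 0\), in agreement with the Yang–Baxter \(\sigma \)-model limit mentioned above.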

A consequence of the bilinear form on \(\mathfrak {d}\) (resp. \(\mathfrak {g}^\mathbb {C}\)) not being the standard one is that the diagonal subalgebra \(\mathfrak {g}^\delta \subset \mathfrak {d}\) (resp. the real subalgebra \(\mathfrak {g}\subset \mathfrak {g}^\mathbb {C}\)) is no longer isotropic. Moreover, given any skew-symmetric solution \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the modified classical Yang–Baxter equation (4.12), the corresponding subalgebra \(\mathfrak {g}_R\) of \(\mathfrak {d}\) (resp. of \(\mathfrak {g}^\mathbb {C}\)) will in general not be isotropic either.

To construct a Lagrangian subalgebra of \(\mathfrak {d}\) (resp. of \(\mathfrak {g}^\mathbb {C}\)), we proceed as follows. Let \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) be a skew-symmetric solution of (4.12) such that

$$\begin{aligned} R^3 = c^2 R. \end{aligned}$$
(5.15)

This implies that R is diagonalisable with \(\mathfrak {g}= \mathfrak {g}_+ \dotplus \mathfrak {g}_0 \dotplus \mathfrak {g}_-\) its eigenspace decomposition where \(\mathfrak {g}_\pm \,{:}{=}\, \ker (R \mp c)\) and \(\mathfrak {g}_0 \,{:}{=}\, \ker R\) are subalgebras of \(\mathfrak {g}\), and moreover that \([\mathfrak {g}_0, \mathfrak {g}_\pm ] \subset \mathfrak {g}_\pm \) and \(\mathfrak {g}_0\) is abelian (see, for instance, [30, Proposition C.2.2]). In particular, we can thus write \(R = c (\pi _+ - \pi _-)\) where \(\pi _\pm \) and \(\pi _0\) are the projections onto the subalgebras \(\mathfrak {g}_\pm \) and \(\mathfrak {g}_0\) relative to the eigenspace decomposition of R.

It is useful to note that \(\pi _0 = - c^2 R^2 + 1\) which is symmetric with respect to the bilinear form \(\langle \cdot , \cdot \rangle \) on \(\mathfrak {g}\). Relation (5.15) then implies that \(\pi _0 R = R \pi _0 = 0\). Let

$$\begin{aligned} {\tilde{R}} \,{:}{=}\, R + \theta \pi _0 \in {{\,\mathrm{End}\,}}\mathfrak {g}\end{aligned}$$
(5.16)

for some real parameter \(\theta \in \mathbb {R}\) to be fixed shortly.
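
The algebraic relations quoted above follow from the projector identities \(\pi _a \pi _b = \delta _{ab} \pi _a\) for \(a, b \in \{ +, 0, - \}\) and \(\pi _+ + \pi _0 + \pi _- = 1\), together with \(c^4 = 1\). A short verification reads

$$\begin{aligned} R^2&= c^2 (\pi _+ + \pi _-) = c^2 (1 - \pi _0), \quad R^3 = c^3 (\pi _+ - \pi _-) = c^2 R, \\ \pi _0&= 1 - c^{-2} R^2 = 1 - c^2 R^2, \quad \pi _0 R = c \, (\pi _0 \pi _+ - \pi _0 \pi _-) = 0. \end{aligned}$$

In terms of the projectors, the operator in (5.16) is \({\tilde{R}} = c \, \pi _+ + \theta \, \pi _0 - c \, \pi _-\), so it differs from R only by acting as \(\theta \) rather than 0 on \(\mathfrak {g}_0\).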

Since \(\pi _0 \in {{\,\mathrm{End}\,}}\mathfrak {g}\) is symmetric, it follows that \({\tilde{R}}\) is not skew-symmetric. However, one checks that it still satisfies the modified classical Yang–Baxter equation (4.12), for the same value of c (see, for instance, [30, Theorem C.2.1]). So \(\mathfrak {g}_{{\tilde{R}}}\) defined as in Sect. 4.4 is a subalgebra of \(\mathfrak {d}\) complementary to the diagonal subalgebra \(\mathfrak {g}^\delta \) if \(c=1\) or a subalgebra of \(\mathfrak {g}^\mathbb {C}\) complementary to the real subalgebra \(\mathfrak {g}\) if \(c = \mathrm{i}\).

Moreover, we find that \(\mathfrak {g}_{{\tilde{R}}}\) is isotropic, and so in fact Lagrangian, with respect to the bilinear form \( \langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {d}; z_\pm }\) on \(\mathfrak {d}\) (resp. \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {g}^\mathbb {C}; z_\pm }\) on \(\mathfrak {g}^\mathbb {C}\)) provided that

$$\begin{aligned} (\theta - c)^2 ({{\,\mathrm{res}\,}}_{z_+} \omega ) + (\theta + c)^2 ({{\,\mathrm{res}\,}}_{z_-} \omega ) = 0. \end{aligned}$$

Of the two solutions for \(\theta \in \mathbb {R}\), the one which is regular in the limit \(k \rightarrow 0\) reads

$$\begin{aligned} \theta = \frac{- c^2 k \eta ^2}{(1 - c^2 \eta ^2){\mathcal {A}}}, \end{aligned}$$

where the real parameter \(\eta \) is related to the parameters \({\mathcal {A}}\) and k as (see [13, 19, 20] in the case \(c = \mathrm{i}\))

$$\begin{aligned} {\mathcal {A}} = \eta \sqrt{1 - \frac{k^2}{1 - c^2 \eta ^2}}. \end{aligned}$$
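
The isotropy condition can be brought to a more explicit form, which we sketch here. Inserting the residues \({{\,\mathrm{res}\,}}_{z_\pm } \omega \) given above, multiplying by \(2 c {\mathcal {A}}/K\) and using \(z_+^2 - z_-^2 = 4 k c {\mathcal {A}}\) and \(z_+^2 + z_-^2 = 2 (k^2 + c^2 {\mathcal {A}}^2)\), we obtain

$$\begin{aligned} 0&= (\theta - c)^2 (1 - z_+^2) - (\theta + c)^2 (1 - z_-^2) = - 4 c \, \Big ( k {\mathcal {A}} \, \theta ^2 + \big ( 1 - k^2 - c^2 {\mathcal {A}}^2 \big ) \theta + c^2 k {\mathcal {A}} \Big ). \end{aligned}$$

This is a quadratic equation for \(\theta \) whose two roots have product \(c^2\). One can check by direct substitution, using \(c^4 = 1\), that the value of \(\theta \) given above solves it. The second root is then \(c^2/\theta = - (1 - c^2 \eta ^2) {\mathcal {A}}/(k \eta ^2)\), which is singular as \(k \rightarrow 0\), whereas the first root tends to 0 so that \({\tilde{R}}\) reduces to R in this limit.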

It therefore follows that \(\mathfrak {g}_{{\tilde{R}}}\), with \({\tilde{R}} \in {{\,\mathrm{End}\,}}\mathfrak {g}\) defined in (5.16) and for \(\theta \in \mathbb {R}\) as above, is a Lagrangian subalgebra of \(\mathfrak {d}\) (resp. of \(\mathfrak {g}^\mathbb {C}\)). In other words, we have a Manin pair \((\mathfrak {d}, \mathfrak {g}_{{\tilde{R}}})\) (resp. \((\mathfrak {g}^\mathbb {C}, \mathfrak {g}_{{\tilde{R}}})\)), which we can use in the construction of Sect. 4.2 (resp. Sect. 4.3).

Concretely, we will realise the boundary equations of motion (2.6) by demanding that

$$\begin{aligned} (A_i|_{z_+}, A_i|_{z_-}) \in \mathfrak {g}_{{\tilde{R}}}, \quad A_i|_\infty = 0, \end{aligned}$$
(5.17a)

for \(i =\tau , \sigma \), in the case \(c = 1\), or

$$\begin{aligned} A_i|_{z_+} \in \mathfrak {g}_{{\tilde{R}}}, \quad A_i|_\infty = 0, \end{aligned}$$
(5.17b)

for \(i = \tau , \sigma \), in the case \(c = \mathrm{i}\). By virtue of Lemma 3.1 and Propositions 4.1, 4.2 and 4.3 we can choose \(\widehat{g}\) to be of archipelago type. Moreover, by the discussion in Sect. 4.4 and Proposition 3.3, we can take \(\widehat{g}\) to be such that

$$\begin{aligned} g_{z_\pm } = g, \quad g_\infty = 1 \end{aligned}$$
(5.18)

for some \(g : \varSigma \rightarrow G\). We refer to the corresponding discussion in Sect. 5.3 for details. In the case \(c = \mathrm{i}\), we used here the equivariance of \(\widehat{g}\) to show that \(g_{z_-} = \tau (g_{z_+}) = g\).

Evaluating (2.9) at the poles of \(\omega \), we obtain

$$\begin{aligned} A|_{z_\pm } = - dg g^{-1} + {{\,\mathrm{Ad}\,}}_g {\mathcal {L}}|_{z_\pm }, \quad A|_\infty = {\mathcal {L}}|_\infty . \end{aligned}$$
(5.19)

Combining the last two equations of (5.17) and (5.19), we deduce that \({\mathcal {L}}\) vanishes at \(\infty \). And since \({{\varvec{\zeta }}} = \{ 1, -1 \}\), we can write the general form (2.11) as

$$\begin{aligned} {\mathcal {L}} = \frac{V^1}{z-1} d\sigma ^+ + \frac{V^{-1}}{z+1} d\sigma ^-, \end{aligned}$$

for some \(V^{\pm 1} : \varSigma \rightarrow \mathfrak {g}\) to be determined. Now it follows from the first condition in (5.17) that \(({\tilde{R}} + c) A_i|_{z_+} = ({\tilde{R}} - c) A_i|_{z_-}\). We therefore obtain the two equations

$$\begin{aligned} ({\tilde{R}}_g + c) \left( - j_\pm + \frac{V^{\pm 1}}{z_+ \mp 1} \right) = ({\tilde{R}}_g - c) \left( - j_\pm + \frac{V^{\pm 1}}{z_- \mp 1} \right) , \end{aligned}$$

for the two unknowns \(V^{\pm 1}\), or in other words

$$\begin{aligned} \left( \frac{c(z_+ + z_- \mp 2)}{(z_+ \mp 1)(z_- \mp 1)} + \frac{z_- - z_+}{(z_+ \mp 1)(z_- \mp 1)} {\tilde{R}}_g \right) V^{\pm 1} = 2 c j_\pm . \end{aligned}$$

The operator on the left-hand side can be inverted by making use of the relations \(\pi _0 R = R \pi _0 = 0\), \(\pi _0^2 = \pi _0\) and \(R^2 = c^2(1 - \pi _0)\). We find

$$\begin{aligned} V^{\pm 1} = \mp (1 \mp k - c^2 \eta ^2 \mp {\mathcal {A}} R_g + \eta ^2 R^2_g) j_\pm . \end{aligned}$$
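
This inversion can be checked on each eigenspace of \(R_g\) separately. Using \(z_+ + z_- = 2 k\), \(z_- - z_+ = - 2 c {\mathcal {A}}\) and \((z_+ \mp 1)(z_- \mp 1) = (k \mp 1)^2 - c^2 {\mathcal {A}}^2\), the equation to be solved is \(\big ( (k \mp 1) - {\mathcal {A}} {\tilde{R}}_g \big ) V^{\pm 1} = \big ( (k \mp 1)^2 - c^2 {\mathcal {A}}^2 \big ) j_\pm \). On the eigenspace where \(R_g = \epsilon c\) with \(\epsilon = \pm 1\), and on \(\ker R_g\) where \({\tilde{R}}_g = \theta \), this gives, respectively,

$$\begin{aligned} V^{\pm 1} = \big ( (k \mp 1) + \epsilon c {\mathcal {A}} \big ) j_\pm , \quad V^{\pm 1} = \frac{(1 - c^2 \eta ^2) \big ( (k \mp 1)^2 - c^2 {\mathcal {A}}^2 \big )}{k \mp (1 - c^2 \eta ^2)} j_\pm = \big ( k \mp (1 - c^2 \eta ^2) \big ) j_\pm . \end{aligned}$$

Here we used \({\mathcal {A}} \theta = - c^2 k \eta ^2/(1 - c^2 \eta ^2)\) and the identity \((1 - c^2 \eta ^2) \big ( (k \mp 1)^2 - c^2 {\mathcal {A}}^2 \big ) = \big ( k \mp (1 - c^2 \eta ^2) \big )^2\), which follows from the relation between \({\mathcal {A}}\), \(\eta \) and k. Both values agree with the above formula for \(V^{\pm 1}\) upon setting \(R_g^2 = c^2\) and \(R_g = 0\), respectively.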

By contrast with the situation of Sect. 5.3, the WZ-terms associated with the poles \(z_\pm \) in action (3.7) do not cancel since \({{\,\mathrm{res}\,}}_{z_+} \omega + {{\,\mathrm{res}\,}}_{z_-} \omega = - 2 K k\), which is nonzero. On the other hand, we have

$$\begin{aligned} {{\,\mathrm{res}\,}}_{z_+} \omega \wedge {\mathcal {L}} + {{\,\mathrm{res}\,}}_{z_-} \omega \wedge {\mathcal {L}} = - K (V^1 d\sigma ^+ + V^{-1} d\sigma ^-), \end{aligned}$$

so that action (3.7) evaluates to

$$\begin{aligned} S[g]&= \frac{K}{2} \int _{\varSigma } \big \langle j_-, \big (1 - c^2 \eta ^2 - {\mathcal {A}} R_g + \eta ^2 R^2_g\big ) j_+ \big \rangle d\sigma \wedge d\tau + K k \, I_{\mathrm{WZ}}[g]. \end{aligned}$$

This coincides, in the case when \(c = \mathrm{i}\), with the action of the Yang–Baxter \(\sigma \)-model with WZ-term as given in [13, (2.7)].
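
For completeness, we sketch how the combination of residues entering the first term of the action is obtained. Since \({\mathcal {L}}\) is regular at \(z_\pm \), we have \({{\,\mathrm{res}\,}}_{z_\pm } \omega \wedge {\mathcal {L}} = ({{\,\mathrm{res}\,}}_{z_\pm } \omega ) {\mathcal {L}}|_{z_\pm }\), and the factors of \(1 - z_\pm ^2\) in the residues cancel against the poles of \({\mathcal {L}}\):

$$\begin{aligned} \frac{K}{2 c {\mathcal {A}}} \left( \frac{1 - z_+^2}{z_+ - 1} - \frac{1 - z_-^2}{z_- - 1} \right) V^{1}&= \frac{K}{2 c {\mathcal {A}}} (z_- - z_+) V^{1} = - K V^{1}, \\ \frac{K}{2 c {\mathcal {A}}} \left( \frac{1 - z_+^2}{z_+ + 1} - \frac{1 - z_-^2}{z_- + 1} \right) V^{-1}&= \frac{K}{2 c {\mathcal {A}}} (z_- - z_+) V^{-1} = - K V^{-1}. \end{aligned}$$

These are the coefficients of \(d\sigma ^+\) and \(d\sigma ^-\), respectively, in the sum of the two residues.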

6 \(\textsf {E} \)-models

We will take the 1-form \(\omega \) to be given by

$$\begin{aligned} \omega = K \frac{1 - z^2}{(z - z_+)(z - z_-)} dz \end{aligned}$$

where, for simplicity, we restrict attention to the case \(z_\pm \in \mathbb {R}\). The reasoning below can be easily adapted to the case of complex conjugate simple poles.

Even though the starting point \(\omega \) is of the same form as in Sects. 5.3, 5.4 and 5.6, and each of the integrable \(\sigma \)-models considered in those sections is known to be an example of an \(\textsf {E} \)-model, see [27, 28], we will proceed very differently to construct the underlying \(\textsf {E} \)-models themselves.

To begin with, we will impose a very different boundary condition on the 1-form A, at the poles \(z_\pm \) of \(\omega \), to those considered in Sects. 5.3, 5.4 and 5.6. In fact, our choice of boundary condition is very closely related to that considered in [34] for deriving the \(\textsf {E} \)-model from three-dimensional Chern–Simons theory.

Moreover, the choice of coordinates \(\sigma _{\pm 1}\) that we will make for the pair of zeroes \(\pm 1\) of \(\omega \) in the general expression (2.11) for \({\mathcal {L}}\) will be different from that used throughout Sect. 5. Indeed, the choice will result in the \(d\sigma \)-component of the 1-form \({\mathcal {L}}\) being trivial.

6.1 Boundary condition

Evaluating A at the pair of points \(z_\pm \) yields a \(\mathfrak {d}\)-valued 1-form \(\textsf {A} \,{:}{=}\, (A|_{z_+}, A|_{z_-})\) on \(\varSigma \), whose components we denote

$$\begin{aligned} \textsf {A} _i \,{:}{=}\, (A_i|_{z_+}, A_i|_{z_-}) : \varSigma \longrightarrow \mathfrak {d}\end{aligned}$$

for \(i = \tau , \sigma \). In terms of these, we can express the boundary equations of motion (2.6) as

$$\begin{aligned} \langle \!\langle \textsf {A} _\sigma , \delta \textsf {A} _\tau \rangle \!\rangle _{\mathfrak {d}; z_\pm } - \langle \!\langle \textsf {A} _\tau , \delta \textsf {A} _\sigma \rangle \!\rangle _{\mathfrak {d}; z_\pm } = 0, \end{aligned}$$
(6.1)

where \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {d}; z_\pm } : \mathfrak {d}\times \mathfrak {d}\rightarrow \mathbb {R}\) is defined in Sect. 4.2.

Let \(\textsf {E} : \mathfrak {d}\rightarrow \mathfrak {d}\) be a linear map such that \(\textsf {E} ^2 = \mathrm{id}\) which is symmetric with respect to the bilinear form \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {d}; z_\pm }\) on \(\mathfrak {d}\). We shall impose the boundary conditions at the poles \(z_\pm \) and \(\infty \) of \(\omega \) to be

$$\begin{aligned} \textsf {A} _\tau = \textsf {E} (\textsf {A} _\sigma ), \quad A_i|_\infty = 0, \end{aligned}$$
(6.2)

for \(i = \tau , \sigma \). The first boundary condition provides a simple way of satisfying the boundary equation of motion (6.1), merely as a consequence of the symmetry of the linear map \(\textsf {E} \).
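
Explicitly, assuming as we do here that \(\textsf {E} \) is a fixed, field-independent linear map, so that \(\delta \textsf {A} _\tau = \textsf {E} (\delta \textsf {A} _\sigma )\), the left-hand side of (6.1) becomes

$$\begin{aligned} \langle \!\langle \textsf {A} _\sigma , \textsf {E} (\delta \textsf {A} _\sigma ) \rangle \!\rangle _{\mathfrak {d}; z_\pm } - \langle \!\langle \textsf {E} (\textsf {A} _\sigma ), \delta \textsf {A} _\sigma \rangle \!\rangle _{\mathfrak {d}; z_\pm } = 0, \end{aligned}$$

where the two terms cancel because \(\textsf {E} \) is symmetric with respect to \(\langle \!\langle \cdot , \cdot \rangle \!\rangle _{\mathfrak {d}; z_\pm }\). Note that the condition \(\textsf {E} ^2 = \mathrm{id}\) plays no role in this step.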

Under a gauge transformation with parameter \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\), the components \(\textsf {A} _i\) for \(i = \tau , \sigma \) of the \(\mathfrak {d}\)-valued 1-form \(\textsf {A} \) become

$$\begin{aligned} \textsf {A} _i^{\textsf {u} } \,{:}{=}\, (A^u_i|_{z_+}, A^u_i|_{z_-}) = - \partial _i \textsf {u} \textsf {u} ^{-1} + \textsf {u} \textsf {A} _i \textsf {u} ^{-1}, \end{aligned}$$

where \(\textsf {u} \,{:}{=}\, (u|_{z_+}, u|_{z_-}) : \varSigma \rightarrow D\). If we require that \(\textsf {u} = 1 \in D\), then the components \(\textsf {A} _i^{\textsf {u} }\) for \(i = \tau , \sigma \) of the gauge-transformed connection \(\textsf {A} ^{\textsf {u} }\) are trivially seen to satisfy the boundary condition (6.2). Therefore, any gauge transformation with parameter \(u : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) such that \(u|_{z_\pm } = 1\) is allowed. Using such a gauge transformation, one can then ensure that \(\widehat{g} : \varSigma \times \mathbb {C}P^1\rightarrow G^\mathbb {C}\) satisfying (3.1) is of archipelago type, by the same arguments as in Sects. 3.2 and 4.1. Without loss of generality, we can also choose the radii of the discs \(U_{z_\pm }\) around the points \(z_\pm \) to be equal, namely \(R_{z_+} = R_{z_-}\). We shall denote this common radius by R.

As usual, we use Proposition 3.3 to set \(g_\infty = 1\). Then, \({\mathcal {L}}_i|_\infty = A_i|_\infty = 0\), where the last step uses the second boundary condition in (6.2). We take \({\mathcal {L}}\) of the form

$$\begin{aligned} {\mathcal {L}} = \left( \frac{V^1}{z - 1} + \frac{V^{-1}}{z + 1} \right) d\tau \end{aligned}$$
(6.3)

where we have chosen \(\sigma _1 = \sigma _{-1} = \tau \) in the notation of the general form (2.11).

6.2 Action

Since \({\mathcal {L}}\) is regular at the simple poles \(z_\pm \in {{\varvec{z}}}\) of \(\omega \), we have

$$\begin{aligned} {{\,\mathrm{res}\,}}_{z_\pm } \omega \wedge {\mathcal {L}} = ({{\,\mathrm{res}\,}}_{z_\pm } \omega ) {\mathcal {L}}|_{z_\pm }. \end{aligned}$$
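To spell out this step (a short sketch, assuming the notation \(\omega = \varphi (z) dz\) for the meromorphic 1-form and writing \(r_\pm \) for its residue at \(z_\pm \), a symbol introduced here only for illustration): near a simple pole of \(\omega \) at which \({\mathcal {L}}\) is regular, one has

$$\begin{aligned} \varphi (z) = \frac{r_\pm }{z - z_\pm } + O(1), \qquad {\mathcal {L}} = {\mathcal {L}}|_{z_\pm } + O(z - z_\pm ), \end{aligned}$$

so that the product \(\varphi (z) {\mathcal {L}}\) has a simple pole at \(z_\pm \) with coefficient \(r_\pm \, {\mathcal {L}}|_{z_\pm }\), which is the right-hand side above.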

Let \(\textsf {J}_\tau \,{:}{=}\, ({\mathcal {L}}_\tau |_{z_+}, {\mathcal {L}}_\tau |_{z_-}) : \varSigma \rightarrow \mathfrak {d}\) and

$$\begin{aligned} \ell \,{:}{=}\, (g_{z_+}, g_{z_-}) : \varSigma \longrightarrow D. \end{aligned}$$
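As an illustration of the definition of \(\textsf {J} _\tau \), with \({\mathcal {L}}\) of the form (6.3) and assuming \(z_\pm \ne \pm 1\) (as is needed for \({\mathcal {L}}\) to be regular at \(z_\pm \)), its two components can be written out as

$$\begin{aligned} {\mathcal {L}}_\tau |_{z_\pm } = \frac{V^1}{z_\pm - 1} + \frac{V^{-1}}{z_\pm + 1}. \end{aligned}$$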

We also let \(\widehat{\ell } \,{:}{=}\, (\widehat{g}_{z_+}, \widehat{g}_{z_-}) : \varSigma \times [0, R] \rightarrow D\). Action (3.7) can be rewritten in the present case as

$$\begin{aligned} S[\ell ] = - \frac{1}{2} \int _{\varSigma } \langle \!\langle d \ell \ell ^{-1}, {{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau d\tau \rangle \!\rangle _{\mathfrak {d}; z_\pm } - \frac{1}{6} \int _{\varSigma \times [0, R]} \langle \!\langle d \widehat{\ell } \widehat{\ell }^{-1}, d \widehat{\ell } \widehat{\ell }^{-1} \wedge d \widehat{\ell } \widehat{\ell }^{-1} \rangle \!\rangle _{\mathfrak {d}; z_\pm }. \end{aligned}$$

Evaluating (2.9) at the pair of points \(z_\pm \) yields

$$\begin{aligned} \textsf {A} _\tau = - \partial _\tau \ell \ell ^{-1} + \ell \textsf {J} _\tau \ell ^{-1}, \quad \textsf {A} _\sigma = - \partial _\sigma \ell \ell ^{-1}. \end{aligned}$$

By combining this with the first boundary condition in (6.2), it follows that

$$\begin{aligned} {{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau = \partial _\tau \ell \ell ^{-1} - \textsf {E} (\partial _\sigma \ell \ell ^{-1}). \end{aligned}$$
(6.4)
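In more detail (a sketch, assuming that the first boundary condition in (6.2) takes the form \(\textsf {A} _\tau = \textsf {E} (\textsf {A} _\sigma )\), which is the form consistent with (6.4)), inserting the above expressions for \(\textsf {A} _\tau \) and \(\textsf {A} _\sigma \) and using the linearity of \(\textsf {E} \) gives

$$\begin{aligned} - \partial _\tau \ell \ell ^{-1} + {{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau = - \textsf {E} (\partial _\sigma \ell \ell ^{-1}), \end{aligned}$$

where \({{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau = \ell \textsf {J} _\tau \ell ^{-1}\). Moving the first term to the right-hand side reproduces (6.4).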

Finally, substituting this into the above action yields

$$\begin{aligned} S[\ell ]&= - \frac{1}{2} \int _{\varSigma } \langle \!\langle \partial _\sigma \ell \ell ^{-1}, \partial _\tau \ell \ell ^{-1} \rangle \!\rangle _{\mathfrak {d}; z_\pm } d\sigma \wedge d\tau + \frac{1}{2} \int _{\varSigma } \langle \!\langle \partial _\sigma \ell \ell ^{-1}, \textsf {E} (\partial _\sigma \ell \ell ^{-1}) \rangle \!\rangle _{\mathfrak {d}; z_\pm } d\sigma \wedge d\tau \\&\quad - \frac{1}{6} \int _{\varSigma \times [0, R]} \langle \!\langle d \widehat{\ell } \widehat{\ell }^{-1}, d \widehat{\ell } \widehat{\ell }^{-1} \wedge d \widehat{\ell } \widehat{\ell }^{-1} \rangle \!\rangle _{\mathfrak {d}; z_\pm }. \end{aligned}$$

This is the action of the \(\textsf {E} \)-model [22, 23] in the case of the real double \(\mathfrak {d}= \mathfrak {g}\oplus \mathfrak {g}\).
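The substitution leading to this action can be sketched as follows, assuming (as the form of the result indicates) that the pairing of \(\mathfrak {d}\)-valued 1-forms in the first term of the action includes the wedge product of forms. Decomposing \(d \ell \ell ^{-1} = \partial _\sigma \ell \ell ^{-1} d\sigma + \partial _\tau \ell \ell ^{-1} d\tau \), the \(d\tau \) component drops out against \({{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau d\tau \) because \(d\tau \wedge d\tau = 0\), so that

$$\begin{aligned} \langle \!\langle d \ell \ell ^{-1}, {{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau d\tau \rangle \!\rangle _{\mathfrak {d}; z_\pm }&= \langle \!\langle \partial _\sigma \ell \ell ^{-1}, {{\,\mathrm{Ad}\,}}_\ell \textsf {J} _\tau \rangle \!\rangle _{\mathfrak {d}; z_\pm } d\sigma \wedge d\tau \\&= \langle \!\langle \partial _\sigma \ell \ell ^{-1}, \partial _\tau \ell \ell ^{-1} \rangle \!\rangle _{\mathfrak {d}; z_\pm } d\sigma \wedge d\tau - \langle \!\langle \partial _\sigma \ell \ell ^{-1}, \textsf {E} (\partial _\sigma \ell \ell ^{-1}) \rangle \!\rangle _{\mathfrak {d}; z_\pm } d\sigma \wedge d\tau , \end{aligned}$$

where the second equality uses (6.4). Multiplying by \(- \frac{1}{2}\) and integrating over \(\varSigma \) gives the first two terms of the action, while the three-dimensional term over \(\varSigma \times [0, R]\) is unaffected.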

Note that since we have set \({\mathcal {L}}_\sigma = 0\), it follows from the equations of motion (2.3a) expressed in terms of \({\mathcal {L}}\) that we have \(\partial _\sigma \textsf {J} _\tau = 0\). If \(\varSigma = \mathbb {R}^2\) and we assume that \(\textsf {J} _\tau \) vanishes at spatial infinity, then it follows that \(\textsf {J} _\tau = 0\). By virtue of (6.4), this implies the on-shell relation \(\partial _\tau \ell \ell ^{-1} = \textsf {E} (\partial _\sigma \ell \ell ^{-1})\), which we also recognise as the equation of motion of the \(\textsf {E} \)-model.

7 Conclusion

7.1 Integrable coupled \(\sigma \)-models

A general procedure for coupling together an arbitrary number of integrable \(\sigma \)-models in a way that preserves integrability was proposed in [11]. In particular, the action for an integrable \(\sigma \)-model coupling together N copies of the principal chiral model with WZ-term, as given in [10], was constructed by first devising its Hamiltonian as that of an affine Gaudin model and then performing its inverse Legendre transform.

This same action was recently rederived in [7] starting from the four-dimensional action (1.2). In fact, it follows from the results of the present paper that the action for this integrable \(\sigma \)-model can also be obtained directly from the two-dimensional action (1.3) by substituting for \(\varphi \) and \({\mathcal {L}}\) the twist function and the Lax connection, respectively, of the affine Gaudin model constructed in [11] or in [31] for the version with gauge invariance.

7.2 \(\lambda \)-Deformations and ‘doubled’ Chern–Simons

An appealing feature of the two-dimensional action (1.3) is its ‘universality’.

The \(\lambda \)-deformation considered in Sect. 5.4 is a particular example of an integrable \(\sigma \)-model whose action can be written in the form (1.3). It was constructed for the principal chiral model in [37], for the symmetric and semi-symmetric space \(\sigma \)-models in [16, 17] and more recently for the pure-spinor superstring on the \(AdS_5 \times S^5\) background in [1].

It was, in fact, already known that the actions of \(\lambda \)-deformations can be written in the ‘universal’ form (1.3), see [36, (3.98)]. Explicitly, in the case of the \(\lambda \)-deformation of the principal chiral model, it follows from Sect. 5.4 that the action reads

$$\begin{aligned} S_\lambda [g] = k \int _\varSigma \langle g^{-1} dg, {\mathcal {L}}|_\alpha \rangle + k\, I_{\mathrm{WZ}}[g]. \end{aligned}$$

Interestingly, this action was obtained in [35, 36] by starting from that of a ‘double’ Chern–Simons theory, whose Lagrangian is given by a difference \(CS(A_+) - CS(A_-)\) of two Chern–Simons 3-forms for \(\mathfrak {g}\)-valued 1-forms \(A_\pm \) on \(D \times \mathbb {R}\) where D is a disc.

It would be interesting to derive this ‘double’ Chern–Simons theory starting from the four-dimensional theory of [7]. More generally, one may wonder whether such an intermediate three-dimensional Chern–Simons theory also exists for other integrable \(\sigma \)-models whose action takes the universal form (1.3).

Finally, in connection with the derivation of \(\textsf {E} \)-models presented in Sect. 6, it would also be interesting to understand the relationship between the approach of [7], which we have been using, and the formalism of [34] in which \(\textsf {E} \)-models can equally be obtained but by starting instead from three-dimensional Chern–Simons theory.

7.3 Yang–Baxter-type deformations

In Sects. 5.3, 5.5 and 5.6, we imposed boundary conditions at each pair of simple poles \(x_\pm \) of \(\omega \) by applying the general procedure outlined in Sect. 4.2 or 4.3 with a choice of Lagrangian subalgebra \(\mathfrak {k}\) of the form \(\mathfrak {g}_R\) for some solution \(R \in {{\,\mathrm{End}\,}}\mathfrak {g}\) of the modified classical Yang–Baxter equation. This characterises the class of Yang–Baxter-type deformations of integrable \(\sigma \)-models, obtained by splitting a double pole x in \(\omega \) into two simple poles \(x_\pm \) as in [11, 12, 39]. Within this class, there is a WZ-term associated with the pair of poles \(x_\pm \) if and only if \({{\,\mathrm{res}\,}}_{x_+} \omega + {{\,\mathrm{res}\,}}_{x_-} \omega \ne 0\).

Just as Sect. 5.6 generalises Sect. 5.3 by introducing a WZ-term, one could also consider a similar generalisation of the construction of Sect. 5.5 by starting from a more general 1-form \(\omega \) with two arbitrary pairs of simple poles \(z_\pm \) and \({\tilde{z}}_\pm \) (respecting the reality conditions) but with \({{\,\mathrm{res}\,}}_{z_+} \omega + {{\,\mathrm{res}\,}}_{z_-} \omega \ne 0\) and \({{\,\mathrm{res}\,}}_{{\tilde{z}}_+} \omega + {{\,\mathrm{res}\,}}_{{\tilde{z}}_-} \omega \ne 0\). It is natural to conjecture that this would result in the bi-Yang–Baxter \(\sigma \)-model with WZ-term introduced in [8]. We leave the verification of this conjecture for future work.

In fact, the 1-form \(\omega \) considered in [7, (14.2)] is precisely of the general type described above (up to a Möbius transformation). The boundary conditions imposed on A at each pair of simple poles \(z_\pm \) and \(\tilde{z}_\pm \) of \(\omega \) in [7, §14] are associated with a choice of Manin triple \((\tilde{\mathfrak {g}},\mathfrak {l}_+, \mathfrak {l}_-)\) for the Lie algebra \(\tilde{\mathfrak {g}}=\mathfrak {g}\oplus \tilde{\mathfrak {h}}\), where \(\tilde{\mathfrak {h}}\) denotes an auxiliary copy of the Cartan subalgebra of \(\mathfrak {g}\) (considering \(\tilde{\mathfrak {g}}\) instead of \(\mathfrak {g}\) allows for a more direct construction of a Manin triple).

As recalled at the end of Sect. 4.4, such boundary conditions correspond in the present language to choosing an R-matrix satisfying \(R^2=1\). In light of the above discussion, this suggests that the trigonometric deformation of the principal chiral model constructed in [7, §14] coincides with a certain bi-Yang–Baxter \(\sigma \)-model with WZ-term on the Lie group \(\tilde{G}\) corresponding to \({\tilde{\mathfrak {g}}}\). We note, however, that its description in [7] uses a different parametrisation of the degrees of freedom than the one that we used in Sect. 5.5 to describe the bi-Yang–Baxter \(\sigma \)-model. It would be interesting to make this relation more explicit.

7.4 Boundary conditions versus representations

The nonabelian T-dual of the principal chiral model was described in [11] as an affine Gaudin model. Indeed, a particular realisation of the relevant affine Gaudin model, with \(\omega \) as in Sect. 5.2, was shown to reproduce the Hamiltonian, phase space and action of this \(\sigma \)-model.

We have not considered this particular model here, part of the reason being that we do not expect the two-dimensional action (1.3) to hold in this case. Indeed, since \({{\,\mathrm{res}\,}}_0 \omega = 0\), a natural choice of Lagrangian subalgebra \(\mathfrak {k}\subset \mathfrak {t}\) is given by \(\mathfrak {k}= \mathfrak {g}\ltimes \{0\}\). The quotient \(K \backslash T\) in this case is naturally parametrised by elements of the abelian subgroup \(\{ 0 \} \ltimes \mathfrak {g}_{\mathrm{ab}} \subset T\). In other words, the field of the resulting integrable \(\sigma \)-model would be \(\mathfrak {g}\)-valued, so it is natural to conjecture that this corresponds to the nonabelian T-dual of the principal chiral model. However, the Lagrangian subalgebra \(\mathfrak {k}\) does not satisfy condition (4.16), which was necessary to bring \(\widehat{g}\) to the archipelago form in the proof of Proposition 4.4. More precisely, the choice of representative in \(\{ 0 \} \ltimes \mathfrak {g}_{\mathrm{ab}}\) for the quotient \((G \ltimes \{ 0 \}) \backslash T\) is not compatible with the archipelago condition (iii). This is why we do not expect action (1.3) to hold in this case. It would be interesting to see how the present construction can be generalised to cover it.

More generally, it would be very interesting to understand the precise connection between the choice of boundary condition on A in the setting of [7] and the choice of realisation of a suitable infinite-dimensional Lie algebra in the setting of [40]. Recall, for instance, that affine Toda field theories were shown to admit an affine Gaudin model realisation in [40]. It would be important to clarify what boundary conditions, if any, can be imposed on A in order to derive the action of affine Toda field theories from the four-dimensional action (1.2).

With this in mind, it would also be very interesting to see whether the approach of [7] can be used to furnish new classes of models, for instance by identifying new types of suitable boundary conditions to be imposed on A. It would then also be interesting to understand what the corresponding infinite-dimensional Lie algebra representation is in the affine Gaudin model language.

7.5 Quantising integrable \(\sigma \)-models

Perhaps most importantly, the results of the present paper provide further evidence in support of the connection established in [41] between the two formalisms of [7] and [40].

While at present these two formalisms have mainly been used to describe classical integrable \(\sigma \)-models, much of their interest lies in their potential for addressing the long-standing open problem of quantising integrable \(\sigma \)-models from first principles. We expect that exploiting the close connection between them will be vital in making progress on this important question.