1 Introduction

Given the action for a 2d field theory, it is a difficult problem to decide whether or not it is classically integrable. Indeed, proving a theory is integrable requires showing, in particular, that the field equations of motion are equivalent to the flatness equation for a 2d connection \({\mathcal {L}}\) valued in some finite-dimensional Lie algebra \(\mathfrak {g}\) and depending meromorphically on some auxiliary complex parameter. Unfortunately, there is no systematic procedure for finding such a Lax connection when it exists.

In recent years, however, there has been tremendous progress on the problem of classifying 2d integrable field theories. In particular, Costello and Yamazaki proposed a general approach [3], based on earlier work in the context of integrable spin chains [4,5,6,7], for constructing 2d integrable field theories starting from 4d Chern–Simons theory. This approach, being rooted in the Lagrangian formalism, provides an elegant way of computing the 2d action for the integrable field theory as well as its Lax connection. It has been extensively applied to reproduce many existing 2d integrable field theories and also to construct a wide variety of new ones [2, 8,9,10,11,12,13,14,15,16,17,18,19,20,21]. See also [1, 22,23,24,25,26,27,28,29,30] for further developments in relation to 4d Chern–Simons theory.

It is important to note that establishing the complete integrability of a 2d field theory also requires showing that the integrals of motion constructed out of the Lax connection Poisson commute with one another, which necessitates moving to the Hamiltonian formalism. A general approach for constructing classical 2d integrable field theories directly in the Hamiltonian formalism was proposed in [31], and further developed in [32, 33], by starting from affine Gaudin models. It was shown in [34] by performing a Hamiltonian analysis of 4d Chern–Simons theory (see also [9] for the \({\mathbb {Z}}_2\)- and \({\mathbb {Z}}_4\)-equivariant settings in the context of the \(\lambda \)-model) that the formalisms of [3] and [31] are intimately related. In particular, all the 2d integrable field theories constructed from 4d Chern–Simons theory are integrable in this stronger sense.

The action of 4d Chern–Simons theory is specified by a choice of meromorphic 1-form \(\omega \) on \({\mathbb {C}}{P}^1\) (one could also consider higher genus Riemann surfaces [3] but in this article we focus on the Riemann sphere). Different 2d integrable field theories arise from different choices of \(\omega \) and various other data to be reviewed in Sect.2. The case when \(\omega \) has at most double poles was studied in detail in [8], where a very simple ‘unifying’ 2d action was derived for all integrable field theories belonging to this class of meromorphic 1-forms. A generalisation of this ‘unifying’ 2d action for arbitrary \(\omega \) was then obtained in [1], where a different perspective on the passage from 4d Chern–Simons theory to 2d integrable field theory was also advocated.

In order to write the ‘unifying’ 2d actions of [8] or [1] in terms of the field of the 2d integrable field theory alone, one needs to solve a certain constraint relating this field to the Lax connection. A very general class of solutions to this constraint was constructed in [2] when \(\omega \) has arbitrary poles and zeroes in the complex plane but a double pole at infinity. This technical assumption was required in order to fix some of the gauge invariance of 4d Chern–Simons theory and as a result remove any constant term from the Lax connection, as we recall at the start of Sect.3. It was also shown in [2] that all of the 2d integrable field theories arising from such solutions of the constraint are described by integrable non-degenerate \(\mathcal {E}\)-models.

Non-degenerate \(\mathcal {E}\)-models were introduced by Klimčík and Ševera in [35,36,37] as \(\sigma \)-models providing a natural setting for describing a non-Abelian generalisation of T-duality, known as Poisson–Lie T-duality. Even though the generic non-degenerate \(\mathcal {E}\)-model is not integrable, many interesting examples have been found to be integrable [38,39,40,41]. A simple condition on the data of the non-degenerate \(\mathcal {E}\)-model which ensures its integrability was also formulated in [42]. Of course, the data of the integrable non-degenerate \(\mathcal {E}\)-models constructed in [2] all satisfy this condition.

A generalisation of non-degenerate \(\mathcal {E}\)-models, known as degenerate \(\mathcal {E}\)-models or dressing cosets, was also introduced by Klimčík and Ševera in [37]. Whereas the non-degenerate \(\mathcal {E}\)-model can be described as a \(\sigma \)-model on a certain quotient space \(K \setminus D\) where K is a maximal isotropic subgroup of a Lie group D, the degenerate \(\mathcal {E}\)-model is a \(\sigma \)-model on a double quotient \(K \setminus D \,/\, F\) where F is another isotropic subgroup of D. It was recently shown in [43] that a certain integrable \(\sigma \)-model that was constructed in [44] provides an example of an integrable degenerate \(\mathcal {E}\)-model. Very recently, conditions on the degenerate \(\mathcal {E}\)-model data ensuring its integrability, in the Hamiltonian sense recalled above, were also given in [45] and a family of new integrable degenerate \(\mathcal {E}\)-models was constructed, including the pseudo-dual chiral model and its multifield generalisations.

The purpose of this article is to extend the construction of [2] to the most general setting of an arbitrary meromorphic 1-form \(\omega \). In particular, we drop the technical assumption made in [2] that \(\omega \) is required to have a double pole at infinity. We show that the solutions of the constraint equation from [1] which we construct by generalising the approach of [2] all give rise to integrable degenerate \(\mathcal {E}\)-models. Just as in [2], the Lie group D is determined by the pole structure of \(\omega \) and the maximal isotropic subgroup \(K \subset D\) is determined by the choice of boundary condition imposed on the Chern–Simons field at the collection of poles of \(\omega \). On the other hand, the isotropic subgroup F, which is specific to the present case, is a remnant of the gauge symmetry of the 4d Chern–Simons theory under the Lie group G and is given by the image of the diagonal embedding \(G \hookrightarrow D\).

The plan of the paper is as follows.

In Sect.2, we review the alternative, less conventional approach of [1] for extracting the action of a 2d integrable field theory from the 4d Chern–Simons theory of Costello and Yamazaki [3]. One advantage of this approach is that it makes the passage from 4d to 2d more direct, with the field h of the 2d theory being introduced along surface defects in the 4d theory to ensure gauge invariance.

In Sect.3, we generalise the approach of [2] for solving a constraint between the 2d field h and the 4d gauge field appearing in the construction of [1]. More precisely, we do away with the technical requirement in [2] that the 1-form \(\omega \) in the 4d Chern–Simons action should have a double pole at infinity. Starting from a general meromorphic 1-form \(\omega \), the resulting 2d integrable field theories are degenerate \(\mathcal {E}\)-models.

In Sect.4, we give two detailed examples of the construction from Sect.3. Namely, we apply the general formalism to recover the pseudo-dual of the principal chiral model, or pseudo-chiral model for short, of Zakharov and Mikhailov [46] and the bi-Yang-Baxter \(\sigma \)-model proposed by Klimčík in [47].

2 A Review on 4d Chern–Simons and 2d IFT

In this section, we begin by reviewing the correspondence between 4d Chern–Simons theory and 2d integrable field theories, proposed by Costello and Yamazaki in [3]. We will, however, follow the approach advocated in [1] which puts special emphasis on the principle of gauge invariance. In this approach, the 4d Chern–Simons field A is coupled to additional degrees of freedom, the so-called edge modes, living on certain surface defects. This is to ensure the full gauge invariance of the theory. The 2d integrable field theory is then seen to emerge in a particular gauge by going partly on-shell. Although the main ideas of [1] are intrinsically physical, the constructions rely heavily on methods of homotopical analysis and the theory of groupoids. So the purpose of this section is to review the key steps of the approach of [1] using a language more familiar to theoretical physicists.

2.1 4d Chern–Simons Theory

Let \(X :=\Sigma \times {\mathbb {C}}P^1\) where \(\Sigma \) denotes a 2d manifold which will eventually correspond to the space-time of the 2d integrable field theory. We take \(\Sigma ={\mathbb {R}}^2\) or \({\mathbb {R}}\times S^1\) with coordinates \((\tau ,\sigma )\).

Let G be a real, simply connected Lie group with Lie algebra \(\mathfrak {g}\). Let \(\mathfrak {g}^{\mathbb {C}}:=\mathfrak {g}\otimes _{{\mathbb {R}}} {\mathbb {C}}\) be the complexification of \(\mathfrak {g}\) and let \(G^{\mathbb {C}}\) denote the corresponding complex Lie group. We fix a non-degenerate, symmetric and ad-invariant bilinear form \(\langle \cdot ,\cdot \rangle :\mathfrak {g}\times \mathfrak {g}\rightarrow {\mathbb {R}}\) and denote by \(\langle \cdot ,\cdot \rangle :\mathfrak {g}^{\mathbb {C}}\times \mathfrak {g}^{\mathbb {C}}\rightarrow {\mathbb {C}}\) its complex linear extension to \(\mathfrak {g}^{\mathbb {C}}\).

2.1.1 The Meromorphic 1-Form

The key ingredient entering the definition of the 4d variant of Chern–Simons theory [3] is a choice of meromorphic 1-form \(\omega \) on \({\mathbb {C}}P^1\).

We denote by \(\Pi \varvec{z}\subset {\mathbb {C}}P^1\) the set of poles of \(\omega \) and by \(n_x\) the order of the pole \(x \in \Pi \varvec{z}\). Let \(\Pi \varvec{z}' :=\Pi \varvec{z}{\setminus } \{\infty \}\) be the subset of finite poles of \(\omega \). This notation will be justified shortly. Fixing a coordinate z on \({\mathbb {C}} \subset {\mathbb {C}}P^1\), we can write \(\omega \) explicitly as

$$\begin{aligned} \omega =\left( \sum _{x\in \Pi \varvec{z}'}\sum _{p=0}^{n_x-1} \frac{\ell ^x_p}{(z-x)^{p+1}}-\sum _{p=1}^{n_{\infty }-1}\ell ^{\infty }_p z^{p-1}\right) \textrm{d}z =:\varphi (z)\textrm{d}z\,, \end{aligned}$$
(2.1)

for some \(\ell ^x_p\in {\mathbb {C}}\) which we call levels. We impose reality conditions on each \(x\in \Pi \varvec{z}\) and its corresponding levels by requiring that \(\overline{\varphi (z)} = \varphi (\bar{z})\). In particular, introducing the subset of real poles \(\varvec{z}_{\textrm{r}} :=\varvec{z}'_{\textrm{r}} \,\sqcup \, \{\infty \}\), where \(\varvec{z}'_{\textrm{r}} :=\Pi \varvec{z}' \,\cap \, {\mathbb {R}}\), the associated levels are real, i.e. \(\ell ^x_p \in {\mathbb {R}}\) for \(x\in \varvec{z}_{\textrm{r}}\). The remaining poles come in complex conjugate pairs, and we define \(\varvec{z}_{\textrm{c}}:=\{x \in \Pi \varvec{z}\,|\, \Im x > 0\}\) so that \(\Pi \varvec{z}= \varvec{z}_{\textrm{r}}\,\sqcup \,\varvec{z}_{\mathrm c}\,\sqcup \, \bar{\varvec{z}}_{\mathrm c}\). For every \(x \in \varvec{z}_{\textrm{c}} \,\sqcup \, \bar{\varvec{z}}_{\mathrm c}\), we have \(n_{\bar{x}} = n_x\) and \(\overline{\ell ^x_p}=\ell ^{\bar{x}}_p\) for \(p = 0,\dots , n_{x}-1\). It is convenient to introduce the subset \(\varvec{z}:=\varvec{z}_{\textrm{r}} \,\sqcup \, \varvec{z}_{\textrm{c}}\) of independent poles, which contains exactly one representative of each orbit of poles under complex conjugation. Finally, we introduce the subset \(\varvec{z}':=\varvec{z}'_{\textrm{r}}\,\sqcup \, \varvec{z}_{\textrm{c}}\subset \varvec{z}\) of finite independent poles in \(\varvec{z}\). We let \(\Pi :=\{ \textrm{id}, {\textsf{t}} \}\) denote the group \({\mathbb {Z}}_2\) generated by the element \({\textsf{t}}\) acting by complex conjugation on \({\mathbb {C}}{P}^1\). The notation \(\Pi \varvec{z}\) (resp. \(\Pi \varvec{z}'\)) introduced above then corresponds to the orbit of the set \(\varvec{z}\) (resp. \(\varvec{z}'\)) under \(\Pi \).

We will also be interested in the set of zeroes of \(\omega \) which can be similarly decomposed as \({\varvec{\zeta }}_{\textrm{r}} \, \sqcup \, {\varvec{\zeta }}_{\textrm{c}} \, \sqcup \, \bar{{\varvec{\zeta }}}_{\textrm{c}}\) with \({\varvec{\zeta }}_{\textrm{r}}\subset {\mathbb {R}}\) the subset of real zeroes and \({\varvec{\zeta }}_{\textrm{c}}\subset \{y \in {\mathbb {C}} \,|\, \Im y > 0\}\) the subset of complex zeroes with positive imaginary part. Moreover, we introduce the set \({\varvec{\zeta }}:={\varvec{\zeta }}_{\textrm{r}} \,\sqcup \, {\varvec{\zeta }}_{\textrm{c}}\) of independent zeroes and let \(m_y \in {\mathbb {Z}}_{\ge 1}\) denote the order of the zero \(y \in {\varvec{\zeta }}\). For \(y \in {\varvec{\zeta }}_{\textrm{c}}\), \(\omega \) also has a zero of order \(m_{\bar{y}}:=m_y\) at \(\overline{y} \in \overline{{\varvec{\zeta }}}_{\textrm{c}}\). The set of all zeroes of \(\omega \) is \(\Pi {\varvec{\zeta }}\) and the set of all finite zeroes is \(\Pi {\varvec{\zeta }}'\).

The total number of poles of \(\omega \) (counting multiplicities) is related to the total number of zeroes of \(\omega \) (counting multiplicities) by

$$\begin{aligned} \sum _{x \in \Pi \varvec{z}} n_x = \sum _{y \in \Pi {\varvec{\zeta }}} m_y + 2. \end{aligned}$$
(2.2)

We will assume that the total number of poles of \(\omega \) (counting multiplicities) is even, so that the total number of zeroes of \(\omega \) (counting multiplicities) also is by (2.2).
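As a quick illustration of the notation and of the counting in (2.2), consider the following sample 1-form. It is our own choice, made purely for illustration, and is not singled out by the discussion above:

$$\begin{aligned} \omega = \frac{1 - z^2}{z^2} \, \textrm{d}z = \left( \frac{1}{z^2} - 1 \right) \textrm{d}z. \end{aligned}$$

Comparing with (2.1), the only finite pole is \(x = 0\) with \(n_0 = 2\), \(\ell ^0_0 = 0\) and \(\ell ^0_1 = 1\), while the constant term \(-1 = - \ell ^\infty _1 z^0\) gives \(n_\infty = 2\) and \(\ell ^\infty _1 = 1\). The zeroes are the simple real zeroes \(y = \pm 1\), so that

$$\begin{aligned} \sum _{x \in \Pi \varvec{z}} n_x = n_0 + n_\infty = 4, \qquad \sum _{y \in \Pi {\varvec{\zeta }}} m_y + 2 = 1 + 1 + 2 = 4, \end{aligned}$$

in agreement with (2.2). In this example all poles and zeroes are real, so \(\Pi \varvec{z}= \varvec{z}_{\textrm{r}} = \{ 0, \infty \}\) and \(\Pi {\varvec{\zeta }}= {\varvec{\zeta }}_{\textrm{r}} = \{ 1, -1 \}\), and the total number of poles is even, as assumed.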

We summarise the notations for the different subsets of poles and zeroes of \(\omega \) introduced above in the tables below.

Subsets of poles of \(\omega \)

$$\begin{array}{ll} \Pi \varvec{z} & \text {All} \\ \Pi \varvec{z}' & \text {Finite} \\ \varvec{z}_{\textrm{r}} & \text {Real} \\ \varvec{z}_{\textrm{r}}' & \text {Finite and real} \\ \varvec{z}_{\textrm{c}} & \text {Positive imaginary part} \\ \varvec{z} & \text {Independent} \\ \varvec{z}' & \text {Finite and independent} \end{array}$$

Subsets of zeroes of \(\omega \)

$$\begin{array}{ll} \Pi {\varvec{\zeta }} & \text {All} \\ \Pi {\varvec{\zeta }}' & \text {Finite} \\ {\varvec{\zeta }}_{\textrm{r}} & \text {Real} \\ {\varvec{\zeta }}'_{\textrm{r}} & \text {Finite and real} \\ {\varvec{\zeta }}_{\textrm{c}} & \text {Positive imaginary part} \\ {\varvec{\zeta }} & \text {Independent} \\ {\varvec{\zeta }}' & \text {Finite and independent} \end{array}$$

Following [2], to account for the fact that poles and zeroes of \(\omega \) have multiplicities, it will sometimes be convenient to use the notation \([{\varvec{z}}]\) for the set of pairs \([x, p]\) with \(x \in \varvec{z}\) and \(p = 0, \ldots , n_x - 1\), and similarly \(({{\varvec{\zeta }}})\) for the set of pairs \((y, q)\) with \(y \in {\varvec{\zeta }}\) and \(q=0,\ldots , m_y-1\). We will use other similar notations, such as \([{\Pi \varvec{z}}]\) and \(({\Pi {\varvec{\zeta }}})\) for the set of all poles and all zeroes of \(\omega \) with multiplicities included, respectively.

Finally, we shall also often make use of the local coordinates \(\xi _x :=z-x\) at any finite pole \(x \in \varvec{z}'\) or finite zero \(x \in {\varvec{\zeta }}'\) and the local coordinate at infinity \(\xi _{\infty } :=z^{-1}\) if \(\infty \in \varvec{z}\) or \(\infty \in {\varvec{\zeta }}\). The expansion of the meromorphic 1-form at each pole \(x \in \varvec{z}\) can then be written uniformly as

$$\begin{aligned} \iota _x \omega = \sum _{p=0}^{n_x-1} \ell ^x_p \xi _x^{-p-1} \textrm{d}\xi _x. \end{aligned}$$
(2.3)

For the point at infinity, we have used the fact that \(z^{p-1} \textrm{d}z = - \xi _{\infty }^{-p-1} \textrm{d}\xi _{\infty }\) for any \(p \in {\mathbb {Z}}\) and introduced the additional level \(\ell ^\infty _0 :=-\sum _{x\in \Pi \varvec{z}'} \ell ^x_0\) for convenience.
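Both facts used for the point at infinity follow from a short computation, which we sketch here for convenience. Since \(z = \xi _\infty ^{-1}\), we have \(\textrm{d}z = - \xi _\infty ^{-2} \textrm{d}\xi _\infty \) and hence

$$\begin{aligned} z^{p-1} \textrm{d}z = \xi _\infty ^{-(p-1)} \big ( - \xi _\infty ^{-2} \big ) \textrm{d}\xi _\infty = - \xi _\infty ^{-p-1} \textrm{d}\xi _\infty , \end{aligned}$$

so that the second sum in (2.1) becomes \(\sum _{p=1}^{n_\infty -1} \ell ^\infty _p \xi _\infty ^{-p-1} \textrm{d}\xi _\infty \). For the term of order \(\xi _\infty ^{-1} \textrm{d}\xi _\infty \), only the simple-pole parts at the finite poles contribute at large z, since \((z-x)^{-p-1} \textrm{d}z\) is regular at \(\xi _\infty = 0\) for \(p \ge 1\):

$$\begin{aligned} \sum _{x \in \Pi \varvec{z}'} \frac{\ell ^x_0}{z-x} \, \textrm{d}z = \sum _{x \in \Pi \varvec{z}'} \ell ^x_0 \big ( z^{-1} + O(z^{-2}) \big ) \textrm{d}z = - \bigg ( \sum _{x \in \Pi \varvec{z}'} \ell ^x_0 \bigg ) \xi _\infty ^{-1} \textrm{d}\xi _\infty + (\text {terms regular at } \xi _\infty = 0). \end{aligned}$$

This reproduces \(\ell ^\infty _0 = - \sum _{x \in \Pi \varvec{z}'} \ell ^x_0\), which is just the statement that the residues of \(\omega \) on \({\mathbb {C}}P^1\) sum to zero.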

2.1.2 The 4d Action and Space of Fields

Given a choice of meromorphic 1-form \(\omega \) as described in §2.1.1, the 4d Chern–Simons action for a \(\mathfrak {g}\)-valued 1-form A on X is given by [3]

$$\begin{aligned} S_{4{\text {d}}}(A)=\frac{i}{4\pi }\int _{X}\omega \wedge \textrm{CS}(A), \end{aligned}$$
(2.4)

where \(\textrm{CS}(A) :=\langle A, \textrm{d}A + \tfrac{1}{3}[A,A]\rangle \) denotes the Chern–Simons 3-form.

Strictly speaking, the action (2.4) only makes sense when \(\omega \) has at most simple poles, i.e. when \(n_x = 1\) for all \(x \in \varvec{z}\). Indeed, when \(\omega \) has higher-order poles, i.e. \(n_x > 1\) for some \(x \in \varvec{z}\), the 4-form \(\omega \wedge \textrm{CS}(A)\) is not locally integrable near the surface defects \(\Sigma _x :=\Sigma \times \{ x \}\) with \(n_x > 1\) and therefore needs to be suitably regularised [1, §3.1]. In what follows we will not need the precise form of the regularised action, only its variation under gauge transformations to be discussed shortly, so we refer to [1, §3.1], see also [48], for details of this regularisation procedure. We only point out that the regularisation is ‘local’ in the sense that it consists in modifying the 4-form \(\omega \wedge \textrm{CS}(A)\) only locally in small neighbourhoods of the surface defects \(\Sigma \times \{ x \}\) for each pole \(x \in \varvec{z}\) of \(\omega \). We will keep denoting the action as \(S_{4{\text {d}}}(A)\) in the presence of higher-order poles in \(\omega \).

Note that the \(\textrm{d}z\)-component of A drops out from the action (2.4) due to the presence of the meromorphic 1-form \(\omega = \varphi (z) \textrm{d}z\). Another way to say this is that (2.4) is trivially invariant under translations \(A \mapsto A + \chi \textrm{d}z\) for any \(\chi \in C^\infty (X, \mathfrak {g})\) and we can fix this invariance by simply focussing on gauge fields with no \(\textrm{d}z\)-component. This remains true when \(\omega \) has higher-order poles [1]. From now on, we will therefore always focus on fields of the form

$$\begin{aligned} A = A_{\tau } \textrm{d}\tau + A_{\sigma } \textrm{d}\sigma + A_{\bar{z}} \textrm{d}\bar{z}. \end{aligned}$$
(2.5)
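In the simple-pole case, where (2.4) applies as written, the decoupling of the \(\textrm{d}z\)-component can be checked directly. The following is a sketch, in which the bracket of \(\mathfrak {g}\)-valued 1-forms is understood as the graded bracket, so that \([\chi \textrm{d}z, \chi \textrm{d}z] = 0\):

$$\begin{aligned} \textrm{CS}(A + \chi \textrm{d}z) - \textrm{CS}(A) = \big \langle \chi \textrm{d}z, \textrm{d}A + \tfrac{1}{3}[A,A] \big \rangle + \big \langle A + \chi \textrm{d}z, \textrm{d}\chi \wedge \textrm{d}z + \tfrac{2}{3} [A, \chi \textrm{d}z] \big \rangle . \end{aligned}$$

Every term on the right-hand side contains a factor of \(\textrm{d}z\), so its wedge product with \(\omega = \varphi (z) \textrm{d}z\) vanishes and \(\omega \wedge \textrm{CS}(A + \chi \textrm{d}z) = \omega \wedge \textrm{CS}(A)\).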

On the other hand, it is too restrictive to consider only smooth \(\mathfrak {g}\)-valued 1-forms A, namely (2.5) where the components \(A_{\tau }\), \(A_{\sigma }\) and \(A_{\bar{z}}\) are smooth \(\mathfrak {g}\)-valued functions on X. Indeed, one can allow these component functions to be singular at the zeroes of \(\omega \) provided that the Lagrangian \(\omega \wedge \textrm{CS}(A)\), or its regularised version in the case when \(\omega \) has higher-order poles, remains locally integrable there. More precisely, let us fix a partition \({\varvec{\zeta }}= {\varvec{\zeta }}_+ \sqcup {\varvec{\zeta }}_-\) of the independent zeroes of \(\omega \) such that \(\sum _{y \in {\varvec{\zeta }}_+} m_y = \sum _{y \in {\varvec{\zeta }}_-} m_y\). In particular, if all the zeroes of \(\omega \) are simple, which will be the case in all our examples, then the latter condition means \(|\Pi {\varvec{\zeta }}_+| = |\Pi {\varvec{\zeta }}_-|\). We will take the space of fields of 4d Chern–Simons theory to consist of \(\mathfrak {g}\)-valued 1-forms as in (2.5) such that:

(i) \(A_{\pm } :=A_{\tau } \pm A_{\sigma }\) has singularities at \(\Sigma \times {\varvec{\zeta }}_{\pm }\) and is smooth elsewhere,

(ii) \(A_{\bar{z}}\) is smooth everywhere,

(iii) the 4-form \(\omega \wedge \textrm{CS}(A)\) is locally integrable near \(\Sigma \times {\varvec{\zeta }}\).

Condition (iii) puts constraints on the type of singularities allowed in condition (i) so that the action is well defined. On the other hand, condition (ii) is consistent with the gauge fixing condition \(A_{\bar{z}} = 0\) which we shall impose later on in Sect. 2.3 in order to describe integrable field theories, once we have established the gauge invariance of 4d Chern–Simons theory in the next section.

2.2 Gauge Invariance

Having defined the 4d Chern–Simons action in Sect. 2.1, the next step in the approach of [1] is to study its gauge invariance. We therefore consider the variation of the 4d Chern–Simons action under gauge transformations

$$\begin{aligned} A \longmapsto {}^g A :=gAg^{-1} - \textrm{d}g g^{-1} \end{aligned}$$
(2.6)

for \(g \in C^{\infty }(X,G)\) an arbitrary smooth G-valued function. Notice once again, as in Sect. 2.1.2, that the \(\textrm{d}z\)-component of the term \(\textrm{d}g g^{-1}\) will automatically drop out from the action due to the presence of the 1-form \(\omega \), so that the gauge transformation (2.6) effectively acts on connections of the form (2.5).

In the case when \(\omega \) has only simple poles, and the action takes the form (2.4), we easily see that

$$\begin{aligned} S_{4{\text {d}}}({}^g A)&= S_{4{\text {d}}}(A)+\frac{i}{4\pi }\int _X \omega \, \wedge \, \textrm{d}\langle g^{-1}\textrm{d}g,A\rangle \nonumber \\&\quad + \frac{i}{24\pi }\int _X \omega \, \wedge \,\langle g^{-1}\textrm{d}g,[g^{-1}\textrm{d}g,g^{-1}\textrm{d}g]\rangle \,. \end{aligned}$$
(2.7)

The 4d Chern–Simons action is thus manifestly not gauge invariant. It is instructive to compare (2.7) with the gauge variation of the usual 3d Chern–Simons action. In particular, the first additional term generated on the right-hand side of (2.7) is not the integral of an exact differential precisely due to the presence of the 1-form \(\omega \). In fact, neither of the two additional terms in (2.7) will vanish in general. Therefore, obtaining a better understanding of these two terms is key to being able to promote 4d Chern–Simons theory to a gauge invariant theory.

Before stating the general result from [1], it is helpful to first explain the result in the simplest case when \(\omega \) has only simple poles. It can be shown, see [3, 8], that the first additional term on the right-hand side of (2.7) localises on the surface defects \(\Sigma _x = \Sigma \times \{ x \}\). Explicitly, if we suppose for simplicity that all the poles are real, i.e. \(\varvec{z}= \varvec{z}_{\textrm{r}}\), then we have

$$\begin{aligned} \frac{i}{4\pi }\int _X \omega \, \wedge \, \textrm{d}\langle g^{-1}\textrm{d}g,A\rangle = - \frac{1}{2} \sum _{x \in \varvec{z}} \ell ^x_0 \int _{\Sigma _x} \langle g^{-1}\textrm{d}g, A \rangle |_{\Sigma _x}. \end{aligned}$$
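This localisation can be sketched as follows. We assume, for the purpose of this illustration only, that boundary terms on \(\Sigma \) may be dropped and that \({\mathbb {C}}P^1\) carries its standard orientation, and we ignore the possible singularities of A at the zeroes of \(\omega \). Write \(\eta :=\langle g^{-1}\textrm{d}g, A \rangle \) and let \(\delta _x\) denote the 2-form on \({\mathbb {C}}P^1\) such that \(\int _{{\mathbb {C}}P^1} f \, \delta _x = f(x)\). The identity \(\partial _{\bar{z}} (z-x)^{-1} \, \textrm{d}\bar{z} \wedge \textrm{d}z = 2 \pi i \, \delta _x\), together with \(\omega \wedge \textrm{d}\eta = \textrm{d}\omega \wedge \eta - \textrm{d}(\omega \wedge \eta )\), then gives

$$\begin{aligned} \textrm{d}\omega&= 2 \pi i \sum _{x \in \varvec{z}} \ell ^x_0 \, \delta _x, \\ \frac{i}{4\pi }\int _X \omega \wedge \textrm{d}\eta&= \frac{i}{4\pi }\int _X \textrm{d}\omega \wedge \eta = \frac{i}{4\pi } \cdot 2 \pi i \sum _{x \in \varvec{z}} \ell ^x_0 \int _{\Sigma _x} \eta |_{\Sigma _x} = - \frac{1}{2} \sum _{x \in \varvec{z}} \ell ^x_0 \int _{\Sigma _x} \langle g^{-1}\textrm{d}g, A \rangle |_{\Sigma _x}. \end{aligned}$$

A simple pole at infinity is treated in the same way using the local coordinate \(\xi _\infty \).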

In turn, the right-hand side of the above can be rewritten as a single integral over \(\Sigma \) as follows. The collection \((A|_{\Sigma _x})_{x \in \varvec{z}}\) of the restrictions of \(A \in \Omega ^1(X, \mathfrak {g})\) to each surface defect \(\Sigma _x\), \(x \in \varvec{z}\) defines a \(\mathfrak {g}\)-valued 1-form on the disjoint union of surface defects \(\sqcup _{x \in \varvec{z}} \Sigma _x\). Alternatively, this can also be thought of as defining a 1-form on \(\Sigma \) but valued in the direct product of Lie algebras \({\mathfrak {d}}= \prod _{x \in \varvec{z}} \mathfrak {g}\). Moreover, we have a map \(\varvec{j}^*: \Omega ^1(X,\mathfrak {g}) \rightarrow \Omega ^1(\Sigma , {\mathfrak {d}})\) given by \(\varvec{j}^*A = (A|_{\Sigma _x})_{x \in \varvec{z}}\). Likewise, the collection \((g|_{\Sigma _x})_{x \in \varvec{z}}\) of the restrictions of \(g \in C^\infty (X, G)\) to the surface defects defines a smooth function on \(\Sigma \) valued in the direct product Lie group \(D = \prod _{x \in \varvec{z}} G\), and we have a map \(\varvec{j}^*: C^\infty (X,G) \rightarrow C^\infty (\Sigma , D)\) given by \(\varvec{j}^*g = (g|_{\Sigma _x})_{x \in \varvec{z}}\). Defining the bilinear form \(\langle \langle \cdot , \cdot \rangle \rangle _{{\mathfrak {d}}}: {\mathfrak {d}}\times {\mathfrak {d}}\rightarrow {\mathbb {R}}\) as \(\langle \langle ({\textsf{u}}_x)_{x \in \varvec{z}}, ({\textsf{v}}_x)_{x \in \varvec{z}} \rangle \rangle _{{\mathfrak {d}}} = \sum _{x \in \varvec{z}} \ell ^x_0 \langle {\textsf{u}}_x, {\textsf{v}}_x \rangle \), we may finally rewrite the first additional term on the right-hand side of (2.7) as

$$\begin{aligned} \frac{i}{4\pi }\int _X \omega \, \wedge \, \textrm{d}\langle g^{-1}\textrm{d}g,A\rangle = - \frac{1}{2} \int _{\Sigma } \langle \langle (\varvec{j}^*g)^{-1} \textrm{d}(\varvec{j}^*g), \varvec{j}^*A \rangle \rangle _{{\mathfrak {d}}}. \end{aligned}$$
(2.8)

The second additional term in (2.7) may similarly be rewritten as a WZ-term for an extension of the D-valued field \(\varvec{j}^*g \in C^\infty (\Sigma , D)\) to \(\Sigma \times [0,1]\), see [3, 8].

2.2.1 Defect Lie Algebra and Lie Group

When \(\omega \) has higher-order poles, it was shown in [1] that the above rewriting of the gauge variation (2.7) of the 4d Chern–Simons action goes through with the obvious modifications. In particular, instead of just restricting the \(\mathfrak {g}\)-valued 1-form A on X to each surface defect \(\Sigma _x\) one should keep the Taylor expansion of A near \(\Sigma _x\) up to order \(n_x - 1\), i.e. its first \(n_x\) terms. Correspondingly, the direct product Lie algebra \({\mathfrak {d}}\) and Lie group D need to be replaced by the defect Lie algebra and Lie group [1, 2].

Let \({\mathcal {T}}_x :={\mathbb {R}}[\varepsilon _x]/ (\varepsilon _x^{n_x})\) for each real pole \(x \in \varvec{z}_{\textrm{r}}\) and \({\mathcal {T}}_x :={\mathbb {C}}[\varepsilon _x]/ (\varepsilon _x^{n_x})\) for each complex pole \(x \in \varvec{z}_{\textrm{c}}\). We define the defect Lie algebra as the real Lie algebra

$$\begin{aligned} {\mathfrak {d}}:=\prod _{x \in \varvec{z}_{\textrm{r}}} \mathfrak {g}\otimes _{{\mathbb {R}}} {\mathcal {T}}_x \times \prod _{x \in \varvec{z}_{\textrm{c}}} \mathfrak {g}^{\mathbb {C}}\otimes _{{\mathbb {C}}} {\mathcal {T}}_x, \end{aligned}$$
(2.9)

where \(\mathfrak {g}^{\mathbb {C}}\otimes _{{\mathbb {C}}} {\mathcal {T}}_x\) is regarded as a Lie algebra over \({\mathbb {R}}\). The Lie algebra relations of \({\mathfrak {d}}\) are given explicitly as

$$\begin{aligned} \big [{\textsf{u}} \otimes \varepsilon ^p_x, {\textsf{v}} \otimes \varepsilon ^q_y\big ] = \delta _{xy} [{\textsf{u}}, {\textsf{v}}] \otimes \varepsilon ^{p+q}_x \end{aligned}$$

where \(\varepsilon _x^{p+q}=0\) for \(p+q\ge n_x\). The truncated polynomial Lie algebras \(\mathfrak {g}\otimes _{{\mathbb {R}}} {\mathcal {T}}_x\) and \(\mathfrak {g}^{\mathbb {C}}\otimes _{{\mathbb {C}}} {\mathcal {T}}_x\) are sometimes referred to as Takiff algebras and we will refer to the integer p as the Takiff degree of the element \({\textsf{u}} \otimes \varepsilon ^p_x\). Note that

$$\begin{aligned} \dim {\mathfrak {d}}= \dim \mathfrak {g}\sum _{x \in \varvec{z}_{\textrm{r}}} n_x + 2 \dim \mathfrak {g}\sum _{x \in \varvec{z}_{\textrm{c}}} n_x = \dim \mathfrak {g}\sum _{x \in \Pi \varvec{z}} n_x. \end{aligned}$$
(2.10)
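For orientation, here is the simplest non-trivial case written out, namely a real double pole \(x \in \varvec{z}_{\textrm{r}}\) with \(n_x = 2\). A general element of \(\mathfrak {g}\otimes _{{\mathbb {R}}} {\mathcal {T}}_x\) is then \({\textsf{u}}_0 \otimes \varepsilon ^0_x + {\textsf{u}}_1 \otimes \varepsilon ^1_x\), and the Lie algebra relations of \({\mathfrak {d}}\) stated above give

$$\begin{aligned} \big [ {\textsf{u}}_0 \otimes \varepsilon ^0_x + {\textsf{u}}_1 \otimes \varepsilon ^1_x, {\textsf{v}}_0 \otimes \varepsilon ^0_x + {\textsf{v}}_1 \otimes \varepsilon ^1_x \big ] = [{\textsf{u}}_0, {\textsf{v}}_0] \otimes \varepsilon ^0_x + \big ( [{\textsf{u}}_0, {\textsf{v}}_1] + [{\textsf{u}}_1, {\textsf{v}}_0] \big ) \otimes \varepsilon ^1_x, \end{aligned}$$

since \(\varepsilon _x^2 = 0\). The Takiff degree 1 part is thus an abelian ideal on which the degree 0 part acts by the adjoint action, and this factor contributes \(2 \dim \mathfrak {g}\) to (2.10).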

Recall from Sect. 2.1.1 that we are assuming the total number of poles of \(\omega \) (counting multiplicities) to be even, which implies by (2.10) that \(\dim {\mathfrak {d}}\) is even.

We define a non-degenerate invariant symmetric bilinear form

$$\begin{aligned} \langle \langle \cdot , \cdot \rangle \rangle _{{\mathfrak {d}}} : {\mathfrak {d}}\times {\mathfrak {d}}\longrightarrow {\mathbb {R}}\end{aligned}$$
(2.11a)

with respect to which all the factors in (2.9) are orthogonal, as follows. For any \(x,y \in \varvec{z}_{\textrm{r}}\) we set

$$\begin{aligned} \big \langle \big \langle {\textsf{u}} \otimes \varepsilon ^p_x,{\textsf{v}} \otimes \varepsilon ^q_y \big \rangle \big \rangle _{{\mathfrak {d}}} :=\delta _{xy} \, \ell ^x_{p+q} \langle {\textsf{u}}, {\textsf{v}} \rangle , \end{aligned}$$
(2.11b)

where \(\ell ^x_p = 0\) for all \(p \ge n_x\), and for any \(x, y \in \varvec{z}_{\textrm{c}}\) we set

$$\begin{aligned} \big \langle \big \langle {\textsf{u}} \otimes \varepsilon ^p_x, {\textsf{v}} \otimes \varepsilon ^q_y \big \rangle \big \rangle _{{\mathfrak {d}}} :=\delta _{xy} \, \big ( \ell ^x_{p+q} \langle {\textsf{u}}, {\textsf{v}} \rangle + \ell ^{\bar{x}}_{p+q} \langle \tau {\textsf{u}}, \tau {\textsf{v}} \rangle \big ). \end{aligned}$$
(2.11c)
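The right-hand side of (2.11c) is real, as required by (2.11a). A brief check is given below. It assumes, since this is not spelled out in the present section, that \(\tau \) in (2.11c) denotes the anti-linear involution of \(\mathfrak {g}^{\mathbb {C}}\) whose fixed-point set is \(\mathfrak {g}\), so that \(\langle \tau {\textsf{u}}, \tau {\textsf{v}} \rangle = \overline{\langle {\textsf{u}}, {\textsf{v}} \rangle }\). Using also \(\ell ^{\bar{x}}_p = \overline{\ell ^x_p}\) from Sect. 2.1.1, we find

$$\begin{aligned} \ell ^x_{p+q} \langle {\textsf{u}}, {\textsf{v}} \rangle + \ell ^{\bar{x}}_{p+q} \langle \tau {\textsf{u}}, \tau {\textsf{v}} \rangle = \ell ^x_{p+q} \langle {\textsf{u}}, {\textsf{v}} \rangle + \overline{\ell ^x_{p+q} \langle {\textsf{u}}, {\textsf{v}} \rangle } = 2 \, \Re \big ( \ell ^x_{p+q} \langle {\textsf{u}}, {\textsf{v}} \rangle \big ). \end{aligned}$$

Under this reading, \(\tau \) in (2.11c) should not be confused with the coordinate \(\tau \) on \(\Sigma \).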

In the case when all poles are real and simple, i.e. \(\varvec{z}= \varvec{z}_{\textrm{r}}\) and \(n_x = 1\) for all \(x \in \varvec{z}\), the defect Lie algebra \({\mathfrak {d}}\) reduces to the direct product Lie algebra \(\prod _{x \in \varvec{z}} \mathfrak {g}\) considered above in the motivating example.
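By contrast, at a real double pole \(x \in \varvec{z}_{\textrm{r}}\) with \(n_x = 2\) the form (2.11b) mixes Takiff degrees. As an illustration, writing it out on general elements gives

$$\begin{aligned} \big \langle \big \langle {\textsf{u}}_0 \otimes \varepsilon ^0_x + {\textsf{u}}_1 \otimes \varepsilon ^1_x, {\textsf{v}}_0 \otimes \varepsilon ^0_x + {\textsf{v}}_1 \otimes \varepsilon ^1_x \big \rangle \big \rangle _{{\mathfrak {d}}} = \ell ^x_0 \langle {\textsf{u}}_0, {\textsf{v}}_0 \rangle + \ell ^x_1 \big ( \langle {\textsf{u}}_0, {\textsf{v}}_1 \rangle + \langle {\textsf{u}}_1, {\textsf{v}}_0 \rangle \big ), \end{aligned}$$

where the term pairing \({\textsf{u}}_1\) with \({\textsf{v}}_1\) drops out because \(\ell ^x_2 = 0\). Non-degeneracy on this factor relies on \(\ell ^x_1 \ne 0\), which holds because x is a pole of order exactly 2. In particular, the form remains non-degenerate on this factor even if \(\ell ^x_0 = 0\).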

An important subalgebra of the defect Lie algebra \({\mathfrak {d}}\) which will play a central role in the description of degenerate \(\mathcal {E}\)-models is the diagonal subalgebra. To introduce it, we define the diagonal map

$$\begin{aligned} \Delta : \mathfrak {g}\longrightarrow \mathfrak {g}^{\times |\varvec{z}|} \subset {\mathfrak {d}}, \qquad {\textsf{u}} \longmapsto ({\textsf{u}} \otimes \varepsilon ^0_x)_{x \in \varvec{z}}. \end{aligned}$$
(2.12)

Note that at complex points \(x \in \varvec{z}_{\textrm{c}}\) the defect Lie algebra contains a copy of the real Lie algebra \(\mathfrak {g}\) in Takiff degree 0 so that \(\mathfrak {g}^{\times |\varvec{z}|}\) is indeed a subalgebra of \({\mathfrak {d}}\). The image of (2.12) then defines the diagonal subalgebra \({\mathfrak {f}}:={{\,\textrm{im}\,}}\Delta \subset {\mathfrak {d}}\).
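It may be helpful to see why the diagonal subalgebra \({\mathfrak {f}}\) is isotropic with respect to (2.11). The following is a sketch, assuming that the map \(\tau \) in (2.11c) fixes \(\mathfrak {g}\) pointwise, i.e. \(\tau {\textsf{u}} = {\textsf{u}}\) for \({\textsf{u}} \in \mathfrak {g}\). For any \({\textsf{u}}, {\textsf{v}} \in \mathfrak {g}\) we then have

$$\begin{aligned} \langle \langle \Delta {\textsf{u}}, \Delta {\textsf{v}} \rangle \rangle _{{\mathfrak {d}}} = \sum _{x \in \varvec{z}_{\textrm{r}}} \ell ^x_0 \langle {\textsf{u}}, {\textsf{v}} \rangle + \sum _{x \in \varvec{z}_{\textrm{c}}} \big ( \ell ^x_0 + \ell ^{\bar{x}}_0 \big ) \langle {\textsf{u}}, {\textsf{v}} \rangle = \bigg ( \sum _{x \in \Pi \varvec{z}} \ell ^x_0 \bigg ) \langle {\textsf{u}}, {\textsf{v}} \rangle = 0. \end{aligned}$$

The last equality expresses the vanishing of the sum of the residues of \(\omega \) on \({\mathbb {C}}P^1\), cf. the definition of \(\ell ^\infty _0\) in Sect. 2.1.1.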

One can also introduce a real Lie group with Lie algebra \({\mathfrak {d}}\) which we will call the defect Lie group and denote by D. As a set this is given by the direct product

$$\begin{aligned} D = \prod _{x \in \varvec{z}_{\textrm{r}}} \big ( G \times (\mathfrak {g}\otimes _{{\mathbb {R}}} {\mathcal {T}}'_x) \big ) \times \prod _{x \in \varvec{z}_{\textrm{c}}} \big ( G^{\mathbb {C}}\times \left( \mathfrak {g}^{\mathbb {C}}\otimes _{{\mathbb {C}}} {\mathcal {T}}'_x \right) \big ), \end{aligned}$$
(2.13)

where \({\mathcal {T}}'_x :=\varepsilon _x {\mathbb {R}}[\varepsilon _x]/ (\varepsilon _x^{n_x})\) for \(x \in \varvec{z}_{\textrm{r}}\) and \({\mathcal {T}}'_x :=\varepsilon _x {\mathbb {C}}[\varepsilon _x]/ (\varepsilon _x^{n_x})\) for \(x \in \varvec{z}_{\textrm{c}}\). However, the general definition of the Lie group structure on D, which can be found in [49], is quite involved so we will not include it here to avoid clutter. In practice, we will only require the group law on D when discussing specific examples in Sect. 4, where the corresponding expressions will be explicitly stated. Finally, note that we have the diagonal embedding \(\Delta : G \rightarrow G^{\times |\varvec{z}|} \subset D\) corresponding to (2.12) at the group level.

Just as in the above motivating example, the purpose of introducing the defect Lie algebra is that we then have a map [1, 2]

$$\begin{aligned} \varvec{j}^*:\Omega ^1(X,\mathfrak {g}) \longrightarrow \Omega ^1 (\Sigma , {\mathfrak {d}}), \quad A \longmapsto \bigg (\sum _{p=0}^{n_x-1}\frac{1}{p!} (\partial _{\xi _x}^p A)|_{\Sigma _x} \otimes \varepsilon _x^p\bigg )_{x\in \varvec{z}} \end{aligned}$$
(2.14)

where \((\partial _{\xi _x}^p A)|_{\Sigma _x} \in \Omega ^{1}(\Sigma ,\mathfrak {g})\) denotes the pullback of \(\partial _{\xi _x}^p A\) to each surface defect \(\Sigma _x\). In other words, \(\varvec{j}^*\) sends a \(\mathfrak {g}\)-valued 1-form A on X to the first \(n_x\) terms in its Taylor expansion at each surface defect \(\Sigma _x\) for \(x \in \varvec{z}\). Similarly, for smooth G-valued functions on X we have a map

$$\begin{aligned} \varvec{j}^*:C^{\infty }(X,G)&\longrightarrow C^{\infty }(\Sigma ,D) \nonumber \\ g&\longmapsto \varvec{j}^*g=\left( g|_{\Sigma _x}, \sum _{p=1}^{n_x-1} \frac{1}{p!} \big (\partial _{\xi _x}^{p-1}(\partial _{\xi _x} g g^{-1}) \big )\big |_{\Sigma _x} \otimes \varepsilon _x^p\right) _{x\in \varvec{z}} \,.\nonumber \\ \end{aligned}$$
(2.15)

This definition differs from the one given in [1] which applies to matrix Lie groups.
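For instance, at a double pole \(x \in \varvec{z}\) with \(n_x = 2\), the maps (2.14) and (2.15) retain a single derivative transverse to the defect. Denoting by a subscript x (a notation introduced only for this illustration) the component in the factor of \({\mathfrak {d}}\) or D associated with x, they read

$$\begin{aligned} (\varvec{j}^*A)_x = A|_{\Sigma _x} \otimes \varepsilon ^0_x + (\partial _{\xi _x} A)|_{\Sigma _x} \otimes \varepsilon ^1_x, \qquad (\varvec{j}^*g)_x = \big ( g|_{\Sigma _x}, \, (\partial _{\xi _x} g g^{-1})|_{\Sigma _x} \otimes \varepsilon ^1_x \big ). \end{aligned}$$

At a simple pole both components reduce to the plain restrictions \(A|_{\Sigma _x}\) and \(g|_{\Sigma _x}\) used in the motivating example.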

With the above definitions in place, we are now in a position to state one of the main results of [1]. Namely, for an arbitrary meromorphic 1-form \(\omega \) as in (2.1), the variation (2.7) of the (regularised) 4d Chern–Simons action under an arbitrary gauge transformation (2.6) can be expressed as

$$\begin{aligned} S_{4{\text {d}}}({}^g A)=S_{4{\text {d}}}(A)-\frac{1}{2}\int _{\Sigma } \langle \langle (\varvec{j}^* g)^{-1}\textrm{d}(\varvec{j}^*g),\varvec{j}^* A \rangle \rangle _{{\mathfrak {d}}}-\frac{1}{2}I_{{\mathfrak {d}}}^{\textrm{WZ}}[\varvec{j}^* g] \,, \end{aligned}$$
(2.16)

where we have introduced the standard WZ-term for a field \(h\in C^{\infty }\left( \Sigma , D\right) \), namely

$$\begin{aligned} I_{{\mathfrak {d}}}^{\textrm{WZ}}[h]= -\frac{1}{6}\int _{\Sigma \times I} \langle \langle \widehat{h}^{-1}\textrm{d}\widehat{h}, {[}\widehat{h}^{-1}\textrm{d}\widehat{h},\widehat{h}^{-1} \textrm{d}\widehat{h}]\rangle \rangle _{{\mathfrak {d}}} \end{aligned}$$
(2.17)

where \(I:=[0,1]\) and \(\widehat{h}\in C^{\infty }\left( \Sigma \times I, D \right) \) is any smooth extension of h to \(\Sigma \times I\) with the property that \(\widehat{h}=h\) near \(\Sigma \times \{0\} \subset \Sigma \times I\) and \(\widehat{h}=\textrm{id}\) near \(\Sigma \times \{1\} \subset \Sigma \times I\). Of course, the second term on the right-hand side in (2.16) coincides with (2.8) when \(\omega \) has only simple poles. The virtue of the result (2.16) is that it holds for any meromorphic 1-form \(\omega \) with poles of arbitrary order.

2.2.2 Isotropy and Edge Modes

As already anticipated, it is now clear from (2.16) that the 4d Chern–Simons action is not gauge invariant. However, gauge invariance of the theory may still be achieved upon imposing boundary conditions on both \(\varvec{j}^*A\) and \(\varvec{j}^*g\) at the surface defects, in order for the two additional terms appearing in (2.16) to vanish.

Recall that a Lie subalgebra \({\mathfrak {k}}\subset {\mathfrak {d}}\) is said to be isotropic with respect to (2.11) if \(\langle \langle {\textsf{x}},{\textsf{y}} \rangle \rangle _{{\mathfrak {d}}}=0\) for every \({\textsf{x}}, {\textsf{y}} \in {\mathfrak {k}}\). Given a subgroup \(K\subset D\) whose Lie algebra \({\mathfrak {k}}\subset {\mathfrak {d}}\) is isotropic with respect to \(\langle \langle \cdot ,\cdot \rangle \rangle _{{\mathfrak {d}}}\), we can impose the boundary conditions

$$\begin{aligned} \varvec{j}^*A \in \Omega ^{1}(\Sigma ,{\mathfrak {k}}) \qquad \text {and} \qquad \varvec{j}^*g \in C^{\infty }(\Sigma ,K) \end{aligned}$$
(2.18)

on both the field A and the gauge transformation parameter g, so that in particular \((\varvec{j}^* g)^{-1}\textrm{d}(\varvec{j}^*g)\in \Omega ^{1}(\Sigma ,{\mathfrak {k}})\). The action then becomes manifestly gauge invariant since the last two terms on the right-hand side of (2.16) vanish due to isotropy.

There are, however, two important related issues with the boundary conditions in (2.18). Firstly, the condition imposed on A is a strict boundary condition which equates \(\varvec{j}^*A\), the restriction of A to the surface defects, with a \({\mathfrak {k}}\)-valued gauge field on \(\Sigma \). But in a gauge theory one should only compare gauge fields via gauge transformations and not via equalities. Secondly, the condition imposed on g restricts the set of allowed gauge transformations, thereby partially breaking the gauge invariance we are trying to achieve. In particular, the strict boundary condition imposed on A is preserved only by these restricted gauge transformations. Therefore, strictly speaking, even upon imposing boundary conditions, 4d Chern–Simons theory is not a fully gauge invariant theory.

Now the role of gauge transformations is to identify physically indistinguishable field configurations, by killing would-be degrees of freedom. So restricting the kind of gauge transformations we allow will resurrect some of these degrees of freedom from the dead [50]. In particular, if we insist on establishing a fully gauge invariant theory then these resurrected degrees of freedom must be included somehow.

This brings us to the second main result of [1]. Both issues with the boundary condition (2.18) can be resolved by introducing a new degree of freedom living on the surface defects, namely a smooth D-valued field \(h\in C^{\infty }(\Sigma ,D)\) called the edge mode. It was shown in [1] that 4d Chern–Simons theory with the boundary conditions (2.18) is equivalent to 4d Chern–Simons theory coupled to the edge mode by introducing the extended action

$$\begin{aligned} S_{4{\text {d}}}^{\textrm{ext}}(A,h)=S_{4{\text {d}}}(A)-\frac{1}{2} \int _{\Sigma }\langle \langle h^{-1}\textrm{d}h,\varvec{j}^* A \rangle \rangle _{{\mathfrak {d}}} -\frac{1}{2}I_{{\mathfrak {d}}}^{\textrm{WZ}}[h] \,, \end{aligned}$$
(2.19)

together with the alternate boundary condition

$$\begin{aligned} {}^h (\varvec{j}^*A) \in \Omega ^1(\Sigma ,{\mathfrak {k}})\,. \end{aligned}$$
(2.20)

That is, instead of imposing boundary conditions on A and g as in (2.18), we only impose the boundary condition on A and only up to a gauge transformation by h.

One can verify, using (2.16), the Polyakov–Wiegmann identity [51] and the invariance of the bilinear form, that both the extended action (2.19) and the constraint (2.20) are invariant under the gauge transformation

$$\begin{aligned} A\longmapsto {}^g A\,, \qquad h \longmapsto h (\varvec{j}^* g)^{-1} \end{aligned}$$
(2.21)

with arbitrary \(g\in C^{\infty }(X,G)\). Thus, we have defined a fully gauge invariant theory, at the price of adding a new field.
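The invariance of the constraint (2.20) can be seen in one line, under two assumptions that are not spelled out above: that \(\varvec{j}^*\) intertwines the gauge actions, \(\varvec{j}^*({}^g A) = {}^{\varvec{j}^*g}(\varvec{j}^*A)\), and that gauge transformations compose as a left action, \({}^{ab}B = {}^{a}({}^{b}B)\). Under the transformation (2.21) one then finds

$$\begin{aligned} {}^{h (\varvec{j}^* g)^{-1}} \big ( \varvec{j}^*({}^g A) \big ) = {}^{h (\varvec{j}^* g)^{-1}} \big ( {}^{\varvec{j}^* g} (\varvec{j}^* A) \big ) = {}^{h} (\varvec{j}^* A) , \end{aligned}$$

so the left-hand side of (2.20) is unchanged.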

Observe that if we restrict the edge mode h to take values in K then we recover the original 4d Chern–Simons action together with the original boundary conditions (2.18). More precisely, the extended action (2.19) enjoys the additional symmetry

$$\begin{aligned} h \longmapsto k h \end{aligned}$$
(2.22)

for arbitrary \(k\in C^{\infty }(\Sigma ,K)\). The invariance of (2.19) under (2.22) can be verified using the Polyakov–Wiegmann identity, the constraint (2.20) and the isotropy of \({\mathfrak {k}}\). In other words, the degrees of freedom added can be described by a smooth field on \(\Sigma \) valued in the quotient \(K \backslash D\).

2.3 2d Integrable Field Theories

Although 4d Chern–Simons theory coupled to the edge mode as described above is equivalent to the original 4d Chern–Simons theory with boundary conditions (2.18), the advantage of the former is that it leads more naturally to 2d integrable field theories. In particular, the field content of the latter will correspond precisely to the edge mode degrees of freedom living on the defect. Moreover, the Lax connection of the 2d integrable field theory will come directly from the gauge field A of the 4d Chern–Simons theory in (2.5).

There are, however, two glaring issues with interpreting the gauge field (2.5) as a Lax connection. The first is that A has a component along the \(\textrm{d}\bar{z}\) direction whereas a Lax connection should be a 1-form along \(\Sigma \). The second is that as it stands A is not meromorphic in the z-coordinate which we would like to interpret as the spectral parameter of the Lax connection.

The first issue is easily resolved. Indeed, we can partially fix the gauge invariance of (2.19) using the gauge fixing condition \(A_{\bar{z}} = 0\) in order to get rid of the undesired \(\textrm{d}\bar{z}\)-component. This is analogous to the axial gauge in electrodynamics and Yang–Mills theories, where one of the components of the gauge field is set to vanish. We will suggestively denote the gauge field A in this gauge by the letter \({\mathcal {L}}\). Note that there is a residual gauge symmetry (2.6) by \(g\in C^{\infty }(X,G)\) satisfying \(\partial _{\bar{z}}g g^{-1}=0\).
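To see where the condition on the residual gauge parameter comes from, assume the convention \({}^g A = g A g^{-1} - \textrm{d}g g^{-1}\) for the gauge transformation (2.6), which is not reproduced in this section. Its \(\textrm{d}\bar{z}\)-component reads

$$\begin{aligned} ({}^g A)_{\bar{z}} = g A_{\bar{z}} g^{-1} - \partial _{\bar{z}} g g^{-1} , \end{aligned}$$

so that the gauge \(A_{\bar{z}} = 0\) is preserved precisely when \(\partial _{\bar{z}} g g^{-1} = 0\). The same conclusion holds with the opposite sign convention for the inhomogeneous term.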

The second issue is more problematic. In order to resolve it, we will have to go partly on-shell, which we turn to next.

2.3.1 Solving the Bulk Equations of Motion

The action (2.19) defines a four-dimensional theory due to the presence of the ‘bulk’ term \(S_{4{\text {d}}}({\mathcal {L}})\) for the ‘bulk’ field \({\mathcal {L}}= {\mathcal {L}}_{\tau } \textrm{d}\tau + {\mathcal {L}}_{\sigma } \textrm{d}\sigma \). To obtain a two-dimensional theory, we will therefore restrict to solutions of the bulk equations of motion. Specifically, varying the action (2.19) with respect to both \({\mathcal {L}}\) and h, subject to the constraint (2.20), we find the bulk and boundary field equations of motion [1]

$$\begin{aligned} \partial _{\bar{z}} {\mathcal {L}}&= 0 \quad \text {on} \quad \Sigma \times ({\mathbb {C}}{P}^1\setminus {\varvec{\zeta }}), \end{aligned}$$
(2.23a)
$$\begin{aligned} \textrm{d}_{\Sigma }(\varvec{j}^*{\mathcal {L}}) +\tfrac{1}{2} \left[ \varvec{j}^*{\mathcal {L}},\varvec{j}^*{\mathcal {L}}\right]&= 0 \quad \text {on} \quad \Sigma , \end{aligned}$$
(2.23b)

where \(\textrm{d}_{\Sigma }\) denotes the de Rham differential on \(\Sigma \).

The bulk equation of motion (2.23a) expresses the fact that the components of \({\mathcal {L}}= {\mathcal {L}}_{\tau } \textrm{d}\tau + {\mathcal {L}}_{\sigma } \textrm{d}\sigma \) are holomorphic along \({\mathbb {C}}{P}^1\) away from the set \({\varvec{\zeta }}\) of zeroes of \(\omega \). However, recall from condition (i) in §2.1.2 that we allow the light-cone components \({\mathcal {L}}_{\pm } = {\mathcal {L}}_{\tau } \pm {\mathcal {L}}_{\sigma }\) of the gauge field to have singularities at the subsets \({\varvec{\zeta }}_{\pm } \subset {\varvec{\zeta }}\) of zeroes of \(\omega \), so long as the Lagrangian \(\omega \wedge \textrm{CS}({\mathcal {L}})\) remains locally integrable near each surface \(\Sigma \times \{ y \}\) for \(y \in {\varvec{\zeta }}\). We can therefore take the components of \({\mathcal {L}}\) to be of the form

$$\begin{aligned} {\mathcal {L}}_{\mu } = \sum _{y \in {\varvec{\zeta }}'}\sum _{q=0}^{m_y-1} \frac{{\mathcal {L}}_{\mu }^{(y,q)}}{(z-y)^{q+1}}+\sum _{q=0}^{m_{\infty }-1} {\mathcal {L}}_{\mu }^{(\infty ,q)}z^{q+1}+{\mathcal {L}}_{\mu }^{\textrm{c}} \end{aligned}$$
(2.24a)

for \(\mu =\tau ,\sigma \), where the coefficient functions \({\mathcal {L}}_{\mu }^{\textrm{c}}\in C^{\infty }(\Sigma ,\mathfrak {g})\), \({\mathcal {L}}_{\mu }^{(y,q)} \in C^{\infty }(\Sigma ,\mathfrak {g})\) for \((y, q) \in ({{\varvec{\zeta }}_{\textrm{r}}})\) and \({\mathcal {L}}_{\mu }^{(y,q)} \in C^{\infty }(\Sigma ,\mathfrak {g}^{\mathbb {C}})\) for \((y, q) \in ({{\varvec{\zeta }}_{\textrm{c}}})\) are related by

$$\begin{aligned} {\mathcal {L}}_{\tau }^{(y,q)} = \epsilon _y {\mathcal {L}}_{\sigma }^{(y,q)} \end{aligned}$$
(2.24b)

with \(\epsilon _y = \pm 1\) for \(y \in {\varvec{\zeta }}_{\pm }\). Note that (2.24b) ensures that the light-cone component \({\mathcal {L}}_{\pm }\) only has poles in \({\varvec{\zeta }}_{\pm }\), and not in \({\varvec{\zeta }}_{\mp }\), as required by condition (i) from Sect. 2.1.2. To see why singularities of the form (2.24) are allowed by condition (iii), consider the case when \(\omega \) has only simple poles so that the action takes the form (2.4). The cubic term in the Chern–Simons 3-form drops out since \({\mathcal {L}}\) only has legs along \(\textrm{d}\sigma \) and \(\textrm{d}\tau \) so that the bulk Lagrangian reads

$$\begin{aligned} \omega \wedge \textrm{CS}({\mathcal {L}}) = \omega \wedge \big ( \langle {\mathcal {L}}_+, \bar{\partial } {\mathcal {L}}_- \rangle -\langle {\mathcal {L}}_-, \bar{\partial } {\mathcal {L}}_+ \rangle \big ) \wedge \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^-. \end{aligned}$$
(2.25a)

It is then easy to see that this 4-form is locally finite near the surface \(\Sigma \times \{y\}\) for each \(y \in {\varvec{\zeta }}\). Notice, in particular, that the condition (2.24b) is used here to guarantee that the sets of poles of \({\mathcal {L}}_{+}\) and \({\mathcal {L}}_{-}\) are disjoint so that, for instance, the \(\delta \)-functions at the set \({\varvec{\zeta }}_-\) arising from \(\bar{\partial } {\mathcal {L}}_-\) are multiplied by \({\mathcal {L}}_+\), whose poles lie in \({\varvec{\zeta }}_+\) and which is therefore regular at \({\varvec{\zeta }}_-\). Moreover, note that the singularities of the form (2.24a) are also consistent with the cubic term in the Lagrangian before moving to the gauge \(A_{\bar{z}} = 0\), i.e. when \(A_{\bar{z}}\) is nonzero but smooth as in condition (ii) of §2.1.2, namely

$$\begin{aligned} \omega \wedge \langle A, \tfrac{1}{3}[A,A]\rangle =-2 \omega \wedge \langle A_{\bar{z}},[A_+,A_-]\rangle \wedge \textrm{d}\bar{z} \wedge \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^-. \end{aligned}$$
(2.25b)

The above remains true also when \(\omega \) has higher-order poles since the regularisation procedure, to make sense of the action in that case, only modifies the Lagrangian locally near the surface defects \(\Sigma \times \{ x \}\) for each \(x \in \varvec{z}\).
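The statement that (2.24b) confines the poles of each light-cone component can be checked directly on the coefficients. Writing \({\mathcal {L}}_{\pm }^{(y,q)} :={\mathcal {L}}_{\tau }^{(y,q)} \pm {\mathcal {L}}_{\sigma }^{(y,q)}\), a shorthand introduced here for convenience, the relation (2.24b) gives

$$\begin{aligned} {\mathcal {L}}_{\pm }^{(y,q)} = (\epsilon _y \pm 1) \, {\mathcal {L}}_{\sigma }^{(y,q)} = {\left\{ \begin{array}{ll} \pm 2 \, {\mathcal {L}}_{\sigma }^{(y,q)} &{} \text {if } \epsilon _y = \pm 1, \\ 0 &{} \text {if } \epsilon _y = \mp 1, \end{array}\right. } \end{aligned}$$

so that \({\mathcal {L}}_+\) has no pole at any \(y \in {\varvec{\zeta }}_-\) and \({\mathcal {L}}_-\) has no pole at any \(y \in {\varvec{\zeta }}_+\).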

Gauge fields \({\mathcal {L}}\) of the form (2.24a) satisfying the condition (2.24b) were referred to in [1, 2] as being admissible. More precisely, (2.24b) is the most natural solution of the admissibility conditions given there, cf. [1, Example 5.4] and [2, §3.5 & §4.2]. The observation we made above is that the admissibility condition can be traced back to the 4d Chern–Simons Lagrangian as the requirement that it be locally integrable. In fact, this new perspective on admissible solutions of the bulk equations of motion (2.23a) leads to the following observation which will be useful later.

Remark 2.1

The rational expression (2.24a), with poles of order \(m_y\) at each \(y \in {\varvec{\zeta }}\), is not the most general one for which the bulk action is locally integrable. Indeed, one could take poles of order \(m_y + 1\) at each \(y \in {\varvec{\zeta }}\) while maintaining the local integrability of the expression (2.25a) and also of (2.25b) before fixing the gauge \(A_{\bar{z}} = 0\). This is because both of the top forms in (2.25) are locally integrable near a simple pole of the component [1, Lemma 2.1]. Note that this is precisely why the 4d Chern–Simons action (2.4) was well defined in the case when \(\omega \) has only simple poles. The reason we have kept the strength of the poles in (2.24a) as they are is to ensure that the 2d action we end up with is integrable, as we will see shortly.

Upon restricting the gauge field \({\mathcal {L}}\) to be a solution of the bulk equation of motion (2.23a) as in (2.24a), the bulk four-dimensional term in the action (2.19) disappears and we are left with the two-dimensional action

$$\begin{aligned} S_{2d}({\mathcal {L}},h)=-\frac{1}{2}\int _{\Sigma } \langle \langle h^{-1}\textrm{d}h,\varvec{j}^* {\mathcal {L}} \rangle \rangle _{{\mathfrak {d}}} -\frac{1}{2}I_{{\mathfrak {d}}}^{\textrm{WZ}}[h] \,. \end{aligned}$$
(2.26)

The equation of motion of this action is the boundary equation of motion (2.23b). It was shown in [1, Proposition 5.6] that for admissible solutions of the bulk equations of motion, the flatness equation (2.23b) for \(\varvec{j}^*{\mathcal {L}}\) lifts to a flatness equation for \({\mathcal {L}}\) itself, namely

$$\begin{aligned} \textrm{d}{\mathcal {L}}+\tfrac{1}{2}[{\mathcal {L}},{\mathcal {L}}]=0 \quad \text {on }\Sigma \,. \end{aligned}$$
(2.27)

Here, we can make use of the observation in Remark 2.1 by noting that the argument in the proof of [1, Proposition 5.6] still applies if we increase the order of one of the poles of \({\mathcal {L}}\) with components (2.24a) by 1. In other words, although the requirement that the action be well defined allows us to increase the order of all the poles in \({\mathcal {L}}\) by 1, the requirement that the boundary equations of motion (2.23b) lift to the flatness of \({\mathcal {L}}\) in (2.27), which ultimately ensures integrability, only enables us to increase the order of one of the poles in \({\mathcal {L}}\) by 1 while keeping the strength of all the other poles the same. We will see another proof of this later in §3.3.

Finally, recall that after removing the \(\textrm{d}\bar{z}\)-component of the gauge field the gauge symmetry (2.21) was restricted to those \(g \in C^\infty (X, G)\) such that \(\partial _{\bar{z}} g g^{-1} = 0\). To preserve the pole structure of the meromorphic gauge field \({\mathcal {L}}\) in (2.24a) we can further restrict to \(g \in C^\infty (\Sigma , G)\) which are independent of the coordinate on \({\mathbb {C}}{P}^1\). It follows from (2.15) that for such gauge transformation parameters we have \(\varvec{j}^*g = \Delta (g)\), where we recall that \(\Delta : G \rightarrow G^{\times |\varvec{z}|} \subset D\) is the diagonal embedding.
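The identity \(\varvec{j}^*g = \Delta (g)\) can be read off from (2.15), since every term with \(p \ge 1\) there contains \(\partial _{\xi _x} g g^{-1}\). Assuming, as the Taylor expansion reading of (2.14) suggests, that \(\xi _x\) is a local coordinate on \({\mathbb {C}}{P}^1\) at x, this derivative vanishes for g independent of the coordinate on \({\mathbb {C}}{P}^1\), and

$$\begin{aligned} \varvec{j}^*g = \big ( g|_{\Sigma _x}, \, 0 \big )_{x\in \varvec{z}} = \Delta (g) , \end{aligned}$$

that is, the same group element g at every defect with trivial nilpotent part.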

2.3.2 The 2d Action

The two-dimensional action (2.26) which we obtained from (2.19) by solving the bulk equations of motion in Sect. 2.3.1 should, of course, be supplemented by the boundary condition given in (2.20). In other words, the fields \({\mathcal {L}}\) and h on which the action (2.26) depends are not independent but instead are related by the constraint

$$\begin{aligned} {}^h (\varvec{j}^*{\mathcal {L}}) \in \Omega ^{1}(\Sigma ,{\mathfrak {k}}). \end{aligned}$$
(2.28)

The final step for obtaining a two-dimensional integrable field theory is therefore to solve the constraint (2.28) to find an expression for \({\mathcal {L}}\) in terms of h.

Indeed, suppose that we can find a unique solution \({\mathcal {L}}={\mathcal {L}}(h)\) to the constraint (2.28). In order to respect the gauge invariance (2.21) and (2.22) we further assume that this solution is such that

$$\begin{aligned} \varvec{j}^*{\mathcal {L}}\big ( kh\Delta (g)^{-1} \big ) = {}^{\Delta (g)} \big (\varvec{j}^*{\mathcal {L}}(h) \big ) \end{aligned}$$
(2.29)

for every \(g \in C^{\infty }(\Sigma , G)\) and \(k\in C^{\infty }(\Sigma , K)\). Note that the existence and uniqueness of such a solution depend on the choice of Lagrangian subalgebra \({\mathfrak {k}}\subset {\mathfrak {d}}\). One of the main results of [2] was to explicitly construct such solutions. The resulting models were shown to coincide with integrable non-degenerate \(\mathcal {E}\)-models. In the remainder of this article, we will generalise the construction of [2] to obtain a more general class of solutions to (2.28) leading to the class of integrable degenerate \(\mathcal {E}\)-models.

Given any solution of the boundary condition (2.28) satisfying the equivariance property (2.29), the action (2.26) reduces to a two-dimensional action for the edge mode field \(h \in C^{\infty }(\Sigma ,D)\) alone given by

$$\begin{aligned} S_{2d}(h)=-\frac{1}{2}\int _{\Sigma } \langle \langle h^{-1} \textrm{d}h, \varvec{j}^*{\mathcal {L}}(h)\rangle \rangle _{{\mathfrak {d}}} -\frac{1}{2}I_{{\mathfrak {d}}}^{\textrm{WZ}}[h] \,. \end{aligned}$$
(2.30)

By virtue of the property (2.29), this action is invariant under the transformations

$$\begin{aligned} h \longmapsto kh\Delta (g)^{-1} \end{aligned}$$
(2.31)

for any \(k\in C^{\infty }(\Sigma ,K)\) and \(g\in C^{\infty }(\Sigma ,G)\). Moreover, the equations of motion (2.27) which arose from the boundary equations of motion of the original 4d Chern–Simons action now read

$$\begin{aligned} \textrm{d}{\mathcal {L}}(h) + \tfrac{1}{2}[{\mathcal {L}}(h),{\mathcal {L}}(h)]=0\,. \end{aligned}$$
(2.32)

In other words, the two-dimensional action (2.30) has an associated Lax connection \({\mathcal {L}}(h)\) and therefore describes a two-dimensional integrable field theory.
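In components, and assuming the usual convention for the bracket of Lie algebra-valued forms, for which \(\tfrac{1}{2}[{\mathcal {L}},{\mathcal {L}}] = [{\mathcal {L}}_{\tau }, {\mathcal {L}}_{\sigma }] \, \textrm{d}\tau \wedge \textrm{d}\sigma \), the flatness equation (2.32) is the familiar zero-curvature equation

$$\begin{aligned} \partial _{\tau } {\mathcal {L}}_{\sigma }(h) - \partial _{\sigma } {\mathcal {L}}_{\tau }(h) + [{\mathcal {L}}_{\tau }(h), {\mathcal {L}}_{\sigma }(h)] = 0 , \end{aligned}$$

which is required to hold identically in the spectral parameter z.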

3 Obtaining Degenerate \(\mathcal {E}\)-Models

The purpose of this section is to complete the passage from 4d Chern–Simons theory to 2d integrable field theories. In particular, we will show how to obtain integrable degenerate \(\mathcal {E}\)-models. Since the details of this section are quite technical, the reader interested in applying the construction may wish, on first read, to skip to Sect. 4 where we present various examples of the procedure in detail. They may then refer back to the present section for further details of the construction. Before presenting these details, and in order to facilitate the reading of this section, we begin by giving a brief outline of the main strategy.

As recalled in Sect. 2.3.2, the very last step in the approach of [1] for passing from 4d Chern–Simons theory to 2d integrable field theories consists in finding a solution of the constraint Eq. (2.28) which satisfies the transformation property (2.29). Such solutions were constructed in [2] under the assumption that \(\omega \) has a double pole at infinity. This technical assumption was used to fix the gauge symmetry under F. Specifically, under the assumption that \(n_{\infty } = 2\), the component of the edge mode \(h \in C^\infty (\Sigma , D)\) at infinity is a field on \(\Sigma \) valued in the semi-direct product \(G \ltimes \mathfrak {g}\). The latter can be brought to the identity by using the F symmetry and the component of the K symmetry associated with the point at infinity, see [2, §3.6] for details. With the gauge symmetry under F fixed in this way, the component at infinity of the constraint (2.28) forces the constant term in the Lax connection to vanish (note that since \(\omega \) initially has a pole at infinity, the Lax connection cannot have poles there). Therefore, the Lax connections considered in [2] are of the special form

$$\begin{aligned} {\mathcal {L}}_{\mu } = \sum _{y \in {\varvec{\zeta }}'}\sum _{q=0}^{m_y-1} \frac{{\mathcal {L}}_{\mu }^{(y,q)}}{(z-y)^{q+1}}. \end{aligned}$$
(3.1)

Moreover, the property (2.29) that the solution \({\mathcal {L}}= {\mathcal {L}}(h)\) is required to satisfy boils down to \(\varvec{j}^*{\mathcal {L}}(kh) = \varvec{j}^*{\mathcal {L}}(h)\). And indeed, the solutions constructed in [2] were shown to have this property and the resulting 2d integrable field theories were shown to coincide with integrable non-degenerate \(\mathcal {E}\)-models.

The main purpose of this section is to generalise the results of [2] to the case of a generic 1-form \(\omega \), as defined in Sect. 2.1.1. The key idea behind the approach of [2] for solving (2.28) is to construct an involution \(\mathcal {E}: {\mathfrak {d}}\xrightarrow {\cong }{\mathfrak {d}}\) on the defect Lie algebra with the property that

$$\begin{aligned} \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\tau } = \mathcal {E}(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\sigma }), \end{aligned}$$
(3.2)

where we wrote \(\varvec{j}^*{\mathcal {L}}= \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\tau } \textrm{d}\tau +\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\sigma } \textrm{d}\sigma \) in components. More precisely, the property (3.2) was the one imposed in [2] but it will have to be adapted in the present case, see (3.29). The property (3.2) was then used in [2] as the starting point for solving the constraint (2.28).

In order to build such an involution \(\mathcal {E}: {\mathfrak {d}}\xrightarrow {\cong }{\mathfrak {d}}\) satisfying (3.2), observe that the relationship between \({\mathcal {L}}_{\tau }\) and \({\mathcal {L}}_{\sigma }\) is in fact very simple to describe in terms of the coefficients of these rational functions (3.1). Indeed, recall that these are related by (2.24b), namely \({\mathcal {L}}_{\tau }^{(y,q)} = \epsilon _y {\mathcal {L}}_{\sigma }^{(y,q)}\) where \(\epsilon _y = \pm 1\) for each \(y \in {\varvec{\zeta }}\). The idea of [2] is then to build two isomorphisms

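$$\begin{aligned} {\mathfrak {d}}\xleftarrow {\;\cong \;} R'_{\Pi {\varvec{\zeta }}'}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \xrightarrow {\;\cong \;} \mathfrak {g}^{({{\varvec{\zeta }}'})} \end{aligned}$$
(3.3)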

from a certain space of rational functions \(R'_{\Pi {\varvec{\zeta }}'}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), where the components (3.1) of the Lax connection live, to the defect Lie algebra \({\mathfrak {d}}\) by Taylor expanding at each \(x \in \varvec{z}\) and to a vector space \(\mathfrak {g}^{({{\varvec{\zeta }}'})}\) by extracting the coefficients at each pole \(y \in {\varvec{\zeta }}'\).
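A minimal sketch of how two such isomorphisms produce an involution satisfying (3.2) is as follows; the names \(\phi \), \(\psi \) and \(\widetilde{\mathcal {E}}\) are introduced here purely for illustration. Write \(\phi \) for the Taylor expansion map from \(R'_{\Pi {\varvec{\zeta }}'}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) to \({\mathfrak {d}}\), write \(\psi \) for the map to \(\mathfrak {g}^{({{\varvec{\zeta }}'})}\) extracting the pole coefficients, and let \(\widetilde{\mathcal {E}}\) act on \(\mathfrak {g}^{({{\varvec{\zeta }}'})}\) by multiplying the component labelled by (y, q) by \(\epsilon _y\). One may then set

$$\begin{aligned} \mathcal {E} :=\phi \circ \psi ^{-1} \circ \widetilde{\mathcal {E}} \circ \psi \circ \phi ^{-1} , \qquad \mathcal {E}^2 = \phi \circ \psi ^{-1} \circ \widetilde{\mathcal {E}}^2 \circ \psi \circ \phi ^{-1} = \textrm{id} , \end{aligned}$$

where the second identity uses \(\epsilon _y^2 = 1\). By (2.24b) we have \(\psi ({\mathcal {L}}_{\tau }) = \widetilde{\mathcal {E}} \, \psi ({\mathcal {L}}_{\sigma })\), and applying \(\phi \circ \psi ^{-1}\) to both sides reproduces (3.2). Compatibility of \(\widetilde{\mathcal {E}}\) with the reality conditions at complex zeroes is assumed in this sketch and not checked.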

If we do not fix the gauge symmetry by F, as was done in [2], then the components of the Lax connection still have a constant term compared to (3.1) and can in general also have a pole at infinity, cf. (2.24). In this section we will adapt the construction of [2] summarised above to this case, in particular defining suitable generalisations of the above isomorphisms (3.3) in §3.2 and Sect. 3.3. These will then be used in §3.4 to build an involution \(\mathcal {E}:{\mathfrak {d}}\xrightarrow {\cong }{\mathfrak {d}}\) which is symmetric with respect to the bilinear form on \({\mathfrak {d}}\) introduced in §2.2.1. Finally, we will use the latter in §3.5 to construct solutions of the constraint (2.28) satisfying (2.29) and thereby obtain the action of integrable degenerate \(\mathcal {E}\)-models.

3.1 The Real Vector Space \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \)

Given a complex vector space V we let \(R_{\Pi {\varvec{\zeta }}}(V)\) denote the space of V-valued rational functions with poles at each \(y \in \Pi {\varvec{\zeta }}\) of order at most \(m_y\), the order of the zero y of \(\omega \). It will also be useful to define the subspace \(R'_{\Pi {\varvec{\zeta }}}(V) \subset R_{\Pi {\varvec{\zeta }}}(V)\) of V-valued rational functions without constant term.

If V is equipped with an anti-linear involution \(\tau : V \rightarrow V\) then we can define an action of \(\Pi \) on V by letting \({\textsf{t}} \in \Pi \) act as \(\tau \). This then also lifts to an action of \(\Pi \) on \(R_{\Pi {\varvec{\zeta }}}(V)\). We can also define an action of \(\Pi \) on \(R_{\Pi {\varvec{\zeta }}}(V)\) by letting \({\textsf{t}} \in \Pi \) act as the pullback by complex conjugation \(\mu _{{\textsf{t}}}: z \mapsto \bar{z}\). We let \(R_{\Pi {\varvec{\zeta }}}(V)^\Pi \) denote the real vector space of rational functions in \(R_{\Pi {\varvec{\zeta }}}(V)\) on which these two actions coincide. We will also make use of the subspace \(R'_{\Pi {\varvec{\zeta }}}(V)^\Pi \subset R_{\Pi {\varvec{\zeta }}}(V)^\Pi \) of such rational functions without constant term.

In what follows, we will either take \(V = \mathfrak {g}^{\mathbb {C}}\) or \(V = C^\infty (\Sigma , \mathfrak {g}^{\mathbb {C}})\), where the action of \(\Pi \) on the latter is induced from the action of \(\Pi \) on \(\mathfrak {g}^{\mathbb {C}}\). Explicitly, an element \(f \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is a \(\Pi \)-equivariant \(\mathfrak {g}\)-valued rational function of the form

$$\begin{aligned} f(z) = \sum _{y \in \Pi {\varvec{\zeta }}'} \sum _{q=0}^{m_y-1} \frac{f^{(y, q)}}{(z-y)^{q+1}} + \sum _{q=0}^{m_{\infty }-1} f^{(\infty , q)} z^{q+1} + f^{\textrm{c}} \end{aligned}$$
(3.4)

where \(f^{\textrm{c}} \in \mathfrak {g}\), \(f^{(y, q)} \in \mathfrak {g}\) for all \((y, q) \in ({{\varvec{\zeta }}_{\textrm{r}}})\) with \(q = 0, \ldots , m_y - 1\), and \(f^{(y, q)} \in \mathfrak {g}^{\mathbb {C}}\) for all \((y, q) \in ({{\varvec{\zeta }}_{\textrm{c}}})\) with \(q = 0, \ldots , m_y - 1\) and \(f^{(\bar{y}, q)} = \tau f^{(y, q)}\).

The dimension of the real vector space \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is given by

$$\begin{aligned} \dim R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi = \dim \mathfrak {g}\bigg ( \sum _{y \in \Pi {\varvec{\zeta }}} m_y + 1 \bigg ). \end{aligned}$$
(3.5)

The term \(\dim \mathfrak {g}\sum _{y \in \Pi {\varvec{\zeta }}} m_y\) comes from counting the degrees of freedom in the pole parts at each \(y \in \Pi {\varvec{\zeta }}\) of a generic rational function (3.4), see in particular the second equality in (3.12) later. The additional \(\dim \mathfrak {g}\) comes from the constant term \(f^{\textrm{c}}\) in (3.4). An alternative way of counting the dimension of \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is to consider instead the isomorphic space \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \omega \) which consists of \(\Pi \)-equivariant \(\mathfrak {g}^{\mathbb {C}}\)-valued meromorphic 1-forms with poles in \(\Pi \varvec{z}\). Its dimension is then given by

$$\begin{aligned} \dim R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi = \dim \mathfrak {g}\bigg ( \sum _{x \in \Pi \varvec{z}} n_x - 1 \bigg ), \end{aligned}$$
(3.6)

where the term \(\dim \mathfrak {g}\sum _{x \in \Pi \varvec{z}} n_x\) comes from counting the degrees of freedom in the pole parts of \(f \omega \) for \(f \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) at each \(x \in \Pi \varvec{z}\) and the additional \(-\dim \mathfrak {g}\) accounts for the fact that the sum of the residues of a meromorphic 1-form vanishes. Of course, the two expressions (3.5) and (3.6) coincide by virtue of (2.2).
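As a concrete check, assume that (2.2) is the usual degree count \(\sum _{x \in \Pi \varvec{z}} n_x = \sum _{y \in \Pi {\varvec{\zeta }}} m_y + 2\) for a meromorphic 1-form on \({\mathbb {C}}{P}^1\), and consider the following 1-form, chosen here only as an illustration. It has double poles at 0 and \(\infty \) and simple zeroes at \(\pm 1\), all of which are real:

$$\begin{aligned} \omega = \frac{1 - z^2}{z^2} \, \textrm{d}z , \qquad \sum _{x \in \Pi \varvec{z}} n_x = 2 + 2 = 4 , \qquad \sum _{y \in \Pi {\varvec{\zeta }}} m_y = 1 + 1 = 2 . \end{aligned}$$

Both (3.5) and (3.6) then give \(3 \dim \mathfrak {g}\). This agrees with the explicit form of (3.4) in this case, namely \(f(z) = f^{(1,0)}/(z-1) + f^{(-1,0)}/(z+1) + f^{\textrm{c}}\) with all three coefficients in \(\mathfrak {g}\).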

A \(\Pi \)-equivariant \(\mathfrak {g}\)-valued rational function \(f \in R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) takes the form

$$\begin{aligned} f(z) = \sum _{y \in \Pi {\varvec{\zeta }}'} \sum _{q=0}^{m_y-1} \frac{f^{(y, q)}}{(z-y)^{q+1}} + \sum _{q=0}^{m_{\infty }-1} f^{(\infty , q)} z^{q+1}. \end{aligned}$$
(3.7)

Note that the only difference with (3.4) is that the ‘constant’ term \(f^{\textrm{c}} \in \mathfrak {g}\) is missing in (3.7). The constant rational functions in \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) form a subspace isomorphic to \(\mathfrak {g}\) and we have a direct sum decomposition

$$\begin{aligned} R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi = \mathfrak {g}\dotplus R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \end{aligned}$$
(3.8)

given explicitly by writing a function \(f \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) as in (3.4), with the first two sums defining the component in \(R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), cf. (3.7), and the constant term \(f^{\textrm{c}} \in \mathfrak {g}\) corresponding to the component in \(\mathfrak {g}\).

It is useful to adjoin another copy of \(\mathfrak {g}\) to the space \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) by considering the direct sum \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\). It is important to note that this additional copy of \(\mathfrak {g}\) is distinct from the copy of \(\mathfrak {g}\) already present in (3.8), representing the constant term in the rational function. It follows from comparing (2.10) with (3.6) that

$$\begin{aligned} \dim \left( R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\right) = \dim {\mathfrak {d}}. \end{aligned}$$
(3.9)

We define the symmetric bilinear form

$$\begin{aligned} \langle \langle \cdot , \cdot \rangle \rangle _{\omega } : \left( R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\right) \times \left( R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\right) \longrightarrow {\mathbb {R}}, \end{aligned}$$
(3.10a)

given for any \(f, g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) and \({\textsf{u}}, {\textsf{v}} \in \mathfrak {g}\) by

$$\begin{aligned} \langle \langle (f, {\textsf{u}}), (g, {\textsf{v}}) \rangle \rangle _{\omega } :=\sum _{x \in \Pi \varvec{z}} {{\,\textrm{res}\,}}_x \langle f, g \rangle \omega + \langle {\textsf{u}}, g^{\textrm{c}} \rangle + \langle f^{\textrm{c}}, {\textsf{v}} \rangle . \end{aligned}$$
(3.10b)

By a slight abuse of notation, we will often denote the restriction of (3.10) to the subspace \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) also as \(\langle \langle \cdot , \cdot \rangle \rangle _{\omega }\). That is, we will write \(\langle \langle f, g \rangle \rangle _{\omega } :=\langle \langle (f, 0), (g, 0) \rangle \rangle _{\omega }\) for any \(f, g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \).

Lemma 3.1

The bilinear form (3.10) is non-degenerate. Its restriction to \(R_{\Pi {\varvec{\zeta }}} \left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is degenerate, whereas its restriction to \(R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is again non-degenerate.

Proof

Let us first show that the restriction to \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is degenerate. Consider a constant function \(f = f^{\textrm{c}} \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). Then, for any \(g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) the poles of the 1-form \(\langle f, g \rangle \omega \) are contained in \(\Pi \varvec{z}\). Since the sum of the residues of a meromorphic 1-form on \({\mathbb {C}}{P}^1\) vanishes, it follows that \(\langle \langle f, g \rangle \rangle _{\omega } = 0\).

Consider now the restriction of the bilinear form (3.10) to \(R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). We show that this is non-degenerate. For every \(f, g \in R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) we have

$$\begin{aligned} \langle \langle f, g \rangle \rangle _{\omega } = \sum _{x \in \Pi \varvec{z}} {{\,\textrm{res}\,}}_x \langle f, g \rangle \omega . \end{aligned}$$

Suppose f is non-zero. Since it has no constant term, it is not constant and therefore has poles at some of the zeroes of \(\omega \). By choosing g to also have a pole at one of these same zeroes, we can ensure that some of the poles of the 1-form \(\langle f, g \rangle \omega \) lie outside the subset \(\Pi \varvec{z}\), namely in \(\Pi {\varvec{\zeta }}\). It is then possible to choose g such that \(\langle \langle f, g \rangle \rangle _{\omega } \ne 0\).

It is clear from the form of the additional two terms in (3.10b) that the bilinear form on the whole of \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\) is itself also non-degenerate. \(\square \)
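The last step of the proof can be spelled out as follows. This is only a sketch, assuming (as the decomposition (3.8) used later suggests) that every \(f \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) splits uniquely as \(f = f^{\textrm{c}} + f'\) with \(f' \in R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), and that \(\langle \cdot , \cdot \rangle \) is non-degenerate on \(\mathfrak {g}\); the primed notation \(f', g'\) is introduced here purely for illustration. Since constant functions pair to zero with all of \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), the definition (3.10b) reduces to

$$\begin{aligned} \langle \langle (f, {\textsf{u}}), (g, {\textsf{v}}) \rangle \rangle _{\omega } = \langle \langle f', g' \rangle \rangle _{\omega } + \langle {\textsf{u}}, g^{\textrm{c}} \rangle + \langle f^{\textrm{c}}, {\textsf{v}} \rangle . \end{aligned}$$

Suppose this vanishes for all \((g, {\textsf{v}})\). Taking \(g = 0\) with \({\textsf{v}}\) arbitrary gives \(f^{\textrm{c}} = 0\), taking \(g = g^{\textrm{c}}\) constant with \({\textsf{v}} = 0\) gives \({\textsf{u}} = 0\), and taking \(g = g'\) with \({\textsf{v}} = 0\) gives \(f' = 0\) by the non-degeneracy on \(R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). Hence \((f, {\textsf{u}}) = 0\).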

3.2 The Isomorphism \(\varvec{\pi }_{{\varvec{\zeta }}}: R_{\Pi {\varvec{\zeta }}}(\mathfrak {g}^{{\mathbb {C}}})^\Pi \xrightarrow {\cong }\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\)

A rational function \(f \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), as in (3.4), is uniquely determined by the coefficients \(f^{(y, q)} \in \mathfrak {g}\) for \((y, q) \in ({{\varvec{\zeta }}_{\textrm{r}}})\) and \(f^{(y, q)} \in \mathfrak {g}^{\mathbb {C}}\) for \((y, q) \in ({{\varvec{\zeta }}_{\textrm{c}}})\) at each of its real and complex poles, along with the constant term \(f^{\textrm{c}} \in \mathfrak {g}\). It is therefore convenient to introduce the vector space in which the coefficients of such rational functions live. Explicitly, we associate with the zeroes of \(\omega \) the real vector space

$$\begin{aligned} \mathfrak {g}^{({\varvec{\zeta }})}:=\prod _{(y, q) \in ({{\varvec{\zeta }}_{\textrm{r}}})} \mathfrak {g}\times \prod _{(y, q) \in ({{\varvec{\zeta }}_{\textrm{c}}})} \mathfrak {g}^{\mathbb {C}}\end{aligned}$$
(3.11)

where \(\mathfrak {g}^{\mathbb {C}}\) is regarded as a real vector space. Its dimension is

$$\begin{aligned} \dim \mathfrak {g}^{({\varvec{\zeta }})}= \dim \mathfrak {g}\sum _{y \in {\varvec{\zeta }}_{\textrm{r}}} m_y + 2 \dim \mathfrak {g}\sum _{y \in {\varvec{\zeta }}_{\textrm{c}}} m_y = \dim \mathfrak {g}\sum _{y \in \Pi {\varvec{\zeta }}} m_y. \end{aligned}$$
(3.12)

We now have an obvious isomorphism

$$\begin{aligned}&{\varvec{\pi }}_{{\varvec{\zeta }}} : R_{\Pi {\varvec{\zeta }}}(\mathfrak {g}^{{\mathbb {C}}})^{\Pi } \overset{\cong }{\longrightarrow }\ \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}, \nonumber \\&f \longmapsto \Big (f^{\textrm{c}}, \big ( f^{(y, q)} \big )_{(y, q) \in ({{\varvec{\zeta }}})}\Big ) \end{aligned}$$
(3.13)

which takes a rational function in \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) and returns its constant term in \(\mathfrak {g}\) and the coefficients at each of its poles as an element of \(\mathfrak {g}^{({\varvec{\zeta }})}\).

We also extend this map to an isomorphism \(\varvec{\pi }_{{\varvec{\zeta }}}: R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\xrightarrow {\cong }\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\) by letting it act trivially on the additional copy of \(\mathfrak {g}\) introduced in Sect. 3.1.

We define the symmetric bilinear form, cf. [2, (4.16)],

$$\begin{aligned} \langle \langle \cdot , \cdot \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}} : \big ( \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\big ) \times \big ( \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\big ) \longrightarrow {\mathbb {R}}, \end{aligned}$$
(3.14a)

given for any \({\textsf{U}} = ({\textsf{U}}^{(y,q)})_{(y,q) \in ({{\varvec{\zeta }}})}, {\textsf{V}} = ({\textsf{V}}^{(y,q)})_{(y,q) \in ({{\varvec{\zeta }}})} \in \mathfrak {g}^{({\varvec{\zeta }})}\) and \({\textsf{x}}, {\textsf{x}}', {\textsf{y}}, {\textsf{y}}' \in \mathfrak {g}\) by

$$\begin{aligned}&\langle \langle ({\textsf{x}}, {\textsf{U}}, {\textsf{x}}'), ({\textsf{y}}, {\textsf{V}}, {\textsf{y}}') \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}} \nonumber \\&\qquad \qquad :=\sum _{y \in {\varvec{\zeta }}} \sum _{\begin{array}{c} p, q=0\\ p+q \ge m_y - 1 \end{array}}^{m_y-1} \frac{2}{|\Pi _y|} \Re \bigg ( \alpha _{p,q} \big \langle {\textsf{U}}^{(y,p)}, {\textsf{V}}^{(y,q)} \big \rangle \bigg ) + \langle {\textsf{x}}, {\textsf{y}}' \rangle + \langle {\textsf{x}}', {\textsf{y}} \rangle , \end{aligned}$$
(3.14b)

where \(\alpha _{p,q} :=- \frac{1}{(p+q+1-m_y)!} \partial _{\xi _y}^{p+q+1-m_y} \psi _y(\xi _y) \big |_y\) for \(p,q = 0, \ldots , m_y - 1\) with \(p + q \ge m_y - 1\) is symmetric under the exchange of p and q. Here, we wrote \(\omega = \psi _y(\xi _y) \xi _y^{m_y} \textrm{d}\xi _y\) in the local coordinate \(\xi _y\) at \(y \in {\varvec{\zeta }}\) where \(\psi _y(\xi _y)|_y \ne 0\) using the fact that \(\omega \) has a zero of order \(m_y\) at y. We also denote by \(\Pi _y \subseteq \Pi \) the stabiliser subgroup of y under the action of \(\Pi \) on \({\mathbb {C}}{P}^1\), and \(|\Pi _y|\) is its order. Explicitly, we have \(|\Pi _y| = 2\) for any real point \(y \in {\varvec{\zeta }}_{\textrm{r}}\) and \(|\Pi _y| = 1\) for any complex point \(y \in {\varvec{\zeta }}_{\textrm{c}}\).

The following is an immediate generalisation of [2, Lemma 4.3].

Lemma 3.2

For any \(f, g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) and \({\textsf{u}}, {\textsf{v}} \in \mathfrak {g}\) we have

$$\begin{aligned} \langle \langle \varvec{\pi }_{{\varvec{\zeta }}} (f, {\textsf{u}}), \varvec{\pi }_{{\varvec{\zeta }}} (g, {\textsf{v}}) \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}} = \langle \langle (f, {\textsf{u}}), (g, {\textsf{v}}) \rangle \rangle _{\omega }. \end{aligned}$$

Proof

The first term in the bilinear form (3.10b) can be rewritten as

$$\begin{aligned} \langle \langle f, g \rangle \rangle _{\omega } =\sum _{x \in \Pi \varvec{z}} {{\,\textrm{res}\,}}_x \langle f, g \rangle \omega = - \sum _{y \in \Pi {\varvec{\zeta }}} {{\,\textrm{res}\,}}_y \langle f, g \rangle \omega = - \sum _{y \in {\varvec{\zeta }}} \frac{2}{|\Pi _y|} \Re \big ( {{\,\textrm{res}\,}}_y \langle f, g \rangle \omega \big ) \end{aligned}$$
(3.15)

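where in the second equality we used the residue theorem and the fact that the poles of the meromorphic 1-form \(\langle f, g \rangle \omega \) belong to the set \(\Pi \varvec{z}\sqcup \Pi {\varvec{\zeta }}\), and in the third equality we used the reality conditions to rewrite the sum over \(\Pi {\varvec{\zeta }}\) as a sum over the independent zeroes in \({\varvec{\zeta }}\). The right-hand side of (3.15) can be evaluated more explicitly as follows: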

$$\begin{aligned} \langle \langle f, g \rangle \rangle _{\omega }&= - \sum _{y \in {\varvec{\zeta }}} \sum _{p=0}^{m_y-1} \frac{2}{|\Pi _y|} \Re \Big ( {{\,\textrm{res}\,}}_y \big \langle f^{(y,p)} \xi _y^{-p-1}, \psi _y(\xi _y) \xi _y^{m_y} g \big \rangle \textrm{d}\xi _y \Big )\\&= - \sum _{y \in {\varvec{\zeta }}} \sum _{p, q=0}^{m_y-1} \frac{2}{|\Pi _y|} \Re \bigg ( \frac{1}{p!} \partial _{\xi _y}^p \big ( \psi _y(\xi _y) \xi _y^{m_y-q-1} \big ) \big |_y \left\langle f^{(y,p)}, g^{(y,q)} \right\rangle \bigg ). \end{aligned}$$

In the first equality, we used the fact that \(g \omega \) is regular at \(y \in {\varvec{\zeta }}\) so that the only contribution to the residue is from the pole term \(f^{(y,p)} \xi _y^{-p-1}\) at y in f and we wrote \(\omega \) locally in the coordinate \(\xi _y\). In the second equality we took the residue and used the fact that the contributions of the constant term of g and of its pole terms at points \(x \ne y\) to the expression \(\partial _{\xi _y}^p \big ( \psi _y(\xi _y) \xi _y^{m_y} g \big ) \big |_y\) with \(p = 0, \ldots , m_y-1\) vanish, since \(\psi _y(\xi _y) \xi _y^{m_y}\) has a zero of order \(m_y\) at y. Finally, we note that \(\frac{1}{p!} \partial _{\xi _y}^p \big ( \psi _y(\xi _y) \xi _y^{m_y-q-1} \big ) \big |_y = \frac{1}{(p+q+1-m_y)!} \partial _{\xi _y}^{p+q+1-m_y} \psi _y(\xi _y) \big |_y\) if \(p+q \ge m_y - 1\) and is zero otherwise, from which the result now follows. \(\square \)
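As a sanity check of Lemma 3.2, consider the following illustrative example, which is not taken from the text. We assume \(\omega = \frac{1 - z^2}{z^2} \textrm{d}z\), which has double poles at 0 and \(\infty \) and simple real zeroes at \(y = \pm 1\), and we assume that the pole terms in (3.4) take the form \(f^{(y,q)} (z-y)^{-q-1}\), as the proof above suggests. Then \(f = f^{\textrm{c}} + \frac{f^{(1,0)}}{z-1} + \frac{f^{(-1,0)}}{z+1}\), and similarly for g. Writing \(\omega = \psi _y(\xi _y) \xi _y \textrm{d}\xi _y\) with \(\xi _{\pm 1} = z \mp 1\) gives \(\psi _1 = - (z+1)/z^2\) and \(\psi _{-1} = - (z-1)/z^2\), so that \(\alpha _{0,0} = - \psi _y|_y\) equals 2 at \(y = 1\) and \(-2\) at \(y = -1\). Since both zeroes are real, \(|\Pi _y| = 2\), and the first term of (3.14b) reads

$$\begin{aligned} 2 \big \langle f^{(1,0)}, g^{(1,0)} \big \rangle - 2 \big \langle f^{(-1,0)}, g^{(-1,0)} \big \rangle . \end{aligned}$$

On the other hand, only the double-pole terms of \(\langle f, g \rangle \) contribute to the residues of \(\langle f, g \rangle \omega \) at \(\pm 1\), so that

$$\begin{aligned} - \sum _{y = \pm 1} {{\,\textrm{res}\,}}_y \langle f, g \rangle \omega = - \Big ( - 2 \big \langle f^{(1,0)}, g^{(1,0)} \big \rangle + 2 \big \langle f^{(-1,0)}, g^{(-1,0)} \big \rangle \Big ), \end{aligned}$$

in agreement with (3.15). Note that the constant terms \(f^{\textrm{c}}, g^{\textrm{c}}\) drop out, which illustrates the degeneracy statement of Lemma 3.1 in this example.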

3.3 The Isomorphism \(\varvec{j}_{\varvec{z}}: R_{\Pi {\varvec{\zeta }}}(\mathfrak {g}^{{\mathbb {C}}})^{\Pi } \xrightarrow {\cong }{\mathfrak {f}}^{\perp }\)

Let us introduce a linear map

$$\begin{aligned} \varvec{j}_{\varvec{z}} : R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi&\longrightarrow {\mathfrak {d}}, \nonumber \\ f&\longmapsto \bigg (\sum _{p=0}^{n_x - 1} \frac{1}{p!} \big (\partial ^p_{\xi _x} f \big ) \big |_x \otimes \varepsilon _x^p \bigg )_{x \in \varvec{z}}, \end{aligned}$$
(3.16)

which takes a rational function with poles at the zeroes of \(\omega \) and returns the first \(n_x\) terms in its Taylor expansion at each pole \(x \in \varvec{z}\) of \(\omega \) in the local coordinate \(\xi _x\), where \(\xi _x = z-x\) if \(x \in \varvec{z}'\) and \(\xi _{\infty } = z^{-1}\) if \(\infty \in \varvec{z}\). The linear map (3.16) cannot be an isomorphism on dimensional grounds by (3.9). However, we will show in Proposition 3.4 that it is injective. Before doing so, we will show that (3.16) maps the bilinear form (3.10), or rather its restriction to \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), to the bilinear form (2.11) on \({\mathfrak {d}}\).

Lemma 3.3

For any \(f, g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), we have \(\langle \langle \varvec{j}_{\varvec{z}} f, \varvec{j}_{\varvec{z}} g \rangle \rangle _{{\mathfrak {d}}} =\langle \langle f, g \rangle \rangle _{\omega }\).

Proof

Let \(f, g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). First note that by using the reality conditions at all of the poles of \(\omega \) we may rewrite the bilinear form (3.10b) more explicitly as a sum over the independent poles, namely

$$\begin{aligned} \sum _{x \in \Pi \varvec{z}} {{\,\textrm{res}\,}}_x \langle f, g \rangle \omega =\sum _{x \in \varvec{z}} \frac{2}{|\Pi _x|} \Re \big ( {{\,\textrm{res}\,}}_x \langle f, g \rangle \omega \big ). \end{aligned}$$
(3.17)

Recall that \(\Pi _x \subseteq \Pi \) denotes the stabiliser subgroup of \(x \in \varvec{z}\).

Recall the explicit expression (2.1) for the meromorphic 1-form, and in particular its expansion (2.3) at each pole \(x \in \varvec{z}\). We then have

$$\begin{aligned} \langle \langle f, g \rangle \rangle _{\omega }&= \sum _{x \in \varvec{z}'} \sum _{p=0}^{n_x-1} \frac{2}{|\Pi _x|} \Re \bigg ( \frac{\ell ^x_p}{p!} \big ( \partial _{\xi _x}^p \langle f, g \rangle \big )\big |_x \bigg ) + \sum _{p=0}^{n_{\infty }-1} \frac{\ell ^{\infty }_p}{p!} \big ( \partial ^p_{\xi _{\infty }}\langle f, g \rangle \big )\big |_{\infty }\\&= \sum _{x \in \varvec{z}} \sum _{p=0}^{n_x-1} \sum _{q=0}^p \frac{2}{|\Pi _x|} \Re \bigg ( \ell ^x_p \bigg \langle \frac{1}{q!} (\partial _{\xi _x}^q f)|_x, \frac{1}{(p-q)!} (\partial _{\xi _x}^{p-q} g)|_x \bigg \rangle \bigg )\\&= \sum _{x \in \varvec{z}} \sum _{q,r=0}^{n_x-1} \frac{2}{|\Pi _x|} \Re \bigg ( \ell ^x_{q+r} \bigg \langle \frac{1}{q!} (\partial _{\xi _x}^q f)|_x, \frac{1}{r!} (\partial _{\xi _x}^r g)|_x \bigg \rangle \bigg ) = \langle \langle \varvec{j}_{\varvec{z}} f, \varvec{j}_{\varvec{z}} g \rangle \rangle _{{\mathfrak {d}}} \end{aligned}$$

where in the second last step we changed variable from p to \(r=p-q\) and used the convention that \(\ell ^x_p = 0\) for \(p \ge n_x\). The last equality is by definition (2.11) of the bilinear form on \({\mathfrak {d}}\) and of the map \(\varvec{j}_{\varvec{z}}\) in (3.16). \(\square \)
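The reindexing in the last two lines can be made explicit in the simplest non-trivial case. As a sketch, take a single real pole x with \(n_x = 2\) and introduce the shorthand (ours, for illustration only) \(f_q :=\frac{1}{q!} (\partial _{\xi _x}^q f)|_x\) and \(g_r :=\frac{1}{r!} (\partial _{\xi _x}^r g)|_x\). The Leibniz rule gives \(\frac{1}{p!} \big ( \partial _{\xi _x}^p \langle f, g \rangle \big )\big |_x = \sum _{q=0}^p \langle f_q, g_{p-q} \rangle \), so the contribution of x is

$$\begin{aligned} \ell ^x_0 \langle f_0, g_0 \rangle + \ell ^x_1 \big ( \langle f_0, g_1 \rangle + \langle f_1, g_0 \rangle \big ) = \sum _{q,r=0}^{1} \ell ^x_{q+r} \langle f_q, g_r \rangle , \end{aligned}$$

where the extra term \(\ell ^x_2 \langle f_1, g_1 \rangle \) on the right-hand side vanishes by the convention \(\ell ^x_p = 0\) for \(p \ge n_x\).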

Any \({\textsf{v}} \in \mathfrak {g}\) defines a constant function in \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). Explicitly, in the notation of (3.4) we have \(f^{\textrm{c}} = {\textsf{v}}\) and \(f^{(y, q)} =0\) for every \((y,q) \in ({\Pi {\varvec{\zeta }}})\). By abuse of notation we will denote this rational function also as \({\textsf{v}}\). Its image under \(\varvec{j}_{\varvec{z}}\) is the element \(\varvec{j}_{\varvec{z}} {\textsf{v}} = \Delta {\textsf{v}} \in {\mathfrak {f}}\) of the diagonal subalgebra \({\mathfrak {f}}= \Delta \mathfrak {g}\subset {\mathfrak {d}}\). Moreover, any element of \({\mathfrak {f}}\) can be represented in this way. Let

$$\begin{aligned} {\mathfrak {v}}:=\varvec{j}_{\varvec{z}}\big ( R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \big ). \end{aligned}$$
(3.18)

Proposition 3.4

The linear map \(\varvec{j}_{\varvec{z}}\) in (3.16) is an isomorphism onto its image \({\mathfrak {f}}^\perp \). In particular, we have the direct sum decomposition \({\mathfrak {f}}^\perp = {\mathfrak {f}}\dotplus {\mathfrak {v}}\).

Proof

We will first show that \(\varvec{j}_{\varvec{z}}\big ( R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \big ) \subset {\mathfrak {f}}^\perp \). To see this, let \(f \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) and \(\Delta {\textsf{v}} \in {\mathfrak {f}}\) be arbitrary. Then by Lemma 3.3 we have

$$\begin{aligned} \langle \langle \varvec{j}_{\varvec{z}} f, \Delta {\textsf{v}} \rangle \rangle _{{\mathfrak {d}}} =\langle \langle \varvec{j}_{\varvec{z}} f, \varvec{j}_{\varvec{z}} {\textsf{v}} \rangle \rangle _{{\mathfrak {d}}} =\langle \langle f, {\textsf{v}} \rangle \rangle _{\omega } = 0, \end{aligned}$$

where the last step follows by the residue theorem since \(\langle f, {\textsf{v}} \rangle \omega \) is a meromorphic 1-form with poles contained in the subset \(\Pi \varvec{z}\).

It remains to show that the map \(\varvec{j}_{\varvec{z}}\) is injective. The result will then follow by virtue of (3.6) which can be rewritten as \(\dim R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi = \dim {\mathfrak {d}}- \dim {\mathfrak {f}}= \dim {\mathfrak {f}}^\perp \) using (2.10). Equivalently, with the help of the bijection (3.13) it is enough to show that \(\varvec{j}_{\varvec{z}} \circ \varvec{\pi }_{{\varvec{\zeta }}}^{-1}: \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\rightarrow {\mathfrak {d}}\) is injective. The coefficients of the expansions to order \(n_x\) at all the poles \(x \in \Pi \varvec{z}\) are given by

$$\begin{aligned} \frac{1}{p!} ( \partial _z^p f)|_x = \sum _{(y,q) \in ({\Pi {\varvec{\zeta }}'})} C^{[x, p]}_{\quad \;\; (y,q)} f^{(y,q)} + \sum _{q=-1}^{m_{\infty } - 1} C^{[x, p]}_{\quad \;\; (\infty ,q)} f^{(\infty , q)} \end{aligned}$$

for all \([x,p] \in [{\Pi \varvec{z}}]\), where we have incorporated the constant term \(f^{(\infty , -1)} :=f^{\textrm{c}}\) into the \((-1)^{\textrm{st}}\) term of the second sum and introduced the coefficients

$$\begin{aligned} C^{[x, p]}_{\quad \;\; (y,q)} :=\left( {\begin{array}{c}p+q\\ p\end{array}}\right) \frac{(-1)^p}{(x-y)^{p+q+1}}, \qquad C^{[x, p]}_{\quad \;\; (\infty , q)} :=\left( {\begin{array}{c}q+1\\ p\end{array}}\right) x^{q+1-p} \nonumber \\ \end{aligned}$$
(3.19)

for all \([x,p] \in [{\Pi \varvec{z}}]\) and \((y,q) \in ({\Pi {\varvec{\zeta }}})\) where \(q = 0, \ldots , m_y - 1\) for all \(y \in {\varvec{\zeta }}{\setminus } \{ \infty \}\) and \(q = -1, \ldots , m_{\infty } - 1\) for \(y = \infty \). The expressions in (3.19) are the components of what is known as a confluent Cauchy–Vandermonde matrix, see for instance [52, Definition 13]. By combining (3.12) and (2.2), we find \(\dim \left( \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\right) = \dim {\mathfrak {d}}- \dim \mathfrak {g}\). The matrix specified by the components (3.19) is of dimension \(\dim {\mathfrak {d}}\times \dim \left( \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\right) \), i.e. it has \(\dim \mathfrak {g}\) more rows than columns, so removing \(\dim \mathfrak {g}\) rows (by removing the highest-order term in the expansion at any of the poles \(x \in \Pi \varvec{z}\)) we obtain a square confluent Cauchy–Vandermonde matrix. The result now follows from the fact that a square confluent Cauchy–Vandermonde matrix is invertible [52, Corollary 19], so that the full matrix has trivial kernel.

The last part follows from applying the injective linear map \(\varvec{j}_{\varvec{z}}\) to the direct sum decomposition (3.8). \(\square \)
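To illustrate the injectivity argument, here is a small example of the matrix (3.19); the choice of 1-form is ours and is only assumed to fit the general setup. Take \(\omega = \frac{z \, \textrm{d}z}{(z^2-1)^2}\), with double real poles at \(x = \pm 1\) (so \(n_{\pm 1} = 2\)) and simple zeroes at 0 and \(\infty \) (so \(m_0 = m_{\infty } = 1\)). Assuming the form of (3.4) implied by (3.19), a rational function reads \(f = f^{\textrm{c}} + f^{(\infty , 0)} z + f^{(0,0)} z^{-1}\). Ordering the rows as \([1,0], [1,1], [-1,0], [-1,1]\) and the columns as \((\infty , -1), (\infty , 0), (0,0)\), the components (3.19) give

$$\begin{aligned} \left( \begin{array}{c} f|_{1} \\ (\partial _z f)|_{1} \\ f|_{-1} \\ (\partial _z f)|_{-1} \end{array} \right) = \left( \begin{array}{ccc} 1 &{} 1 &{} 1 \\ 0 &{} 1 &{} -1 \\ 1 &{} -1 &{} -1 \\ 0 &{} 1 &{} -1 \end{array} \right) \left( \begin{array}{c} f^{\textrm{c}} \\ f^{(\infty , 0)} \\ f^{(0,0)} \end{array} \right) . \end{aligned}$$

Removing the last row, i.e. the highest-order term in the expansion at \(x = -1\), leaves a square matrix with determinant \(-4 \ne 0\). The \(4 \times 3\) matrix therefore has trivial kernel, so \(\varvec{j}_{\varvec{z}}\) is injective in this example.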

Even though we will not explicitly need it in order to construct the action for the Lax connection of the degenerate \(\mathcal {E}\)-model in Sect. 3.5, it is useful to extend the injective linear map (3.16) to an isomorphism \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\overset{\cong }{\longrightarrow }{\mathfrak {d}}\). This extension will be used in constructing the \(\mathcal {E}\)-operator \(\mathcal {E}: {\mathfrak {d}}\rightarrow {\mathfrak {d}}\) on all of \({\mathfrak {d}}\) in Sect. 3.4. Explicitly, we want to construct an isomorphism

$$\begin{aligned} \widehat{\varvec{\jmath }}_{\varvec{z}} :=\varvec{j}_{\varvec{z}} \oplus \varvec{\rho }_{\varvec{z}} : R_{\Pi {\varvec{\zeta }}} \left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\overset{\cong }{\longrightarrow }{\mathfrak {d}}, \end{aligned}$$
(3.20a)

where \(\varvec{\rho }_{\varvec{z}} : \mathfrak {g}\rightarrow {\mathfrak {d}}\) is a linear map whose image is complementary to \({\mathfrak {f}}^\perp \) in \({\mathfrak {d}}\). We will now give a general procedure for constructing such a linear map by requiring that its image \(\tilde{\mathfrak {f}}:={{\,\textrm{im}\,}}\varvec{\rho }_{\varvec{z}}\) be isotropic and perpendicular to \({\mathfrak {v}}\), i.e. such that

$$\begin{aligned} {\mathfrak {d}}= {\mathfrak {f}}^\perp \dotplus \tilde{\mathfrak {f}}, \qquad \tilde{\mathfrak {f}}\subset (\tilde{\mathfrak {f}}\dotplus {\mathfrak {v}})^\perp , \end{aligned}$$
(3.20b)

and with the property that, for any \({\textsf{x}}, {\textsf{y}} \in \mathfrak {g}\),

$$\begin{aligned} \langle \langle \varvec{\rho }_{\varvec{z}} {\textsf{x}}, \Delta {\textsf{y}} \rangle \rangle _{{\mathfrak {d}}} = \langle {\textsf{x}}, {\textsf{y}} \rangle . \end{aligned}$$
(3.20c)

For simplicity, we will suppose in the following argument that \(\omega \) does not have a pole at infinity. The construction could also be adapted to the case where \(\omega \) has a pole at infinity. Recall that \(m_{\infty }\) denotes the order of the zero of \(\omega \) at infinity. We let \(P: {\mathbb {C}}\rightarrow {\mathbb {C}}\) be a generic polynomial of degree \(m_{\infty } + 1\), which therefore contains \(m_{\infty } + 2\) arbitrary coefficients in \({\mathbb {C}}\). If \({\textsf{x}} \in \mathfrak {g}\) then \({\textsf{x}} \, P\) is a \(\mathfrak {g}\)-valued polynomial. Notice, however, that \({\textsf{x}} \, P\) does not lie in \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) since rational functions in \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), which are of the form (3.4), contain a polynomial of degree at most \(m_{\infty }\) (and not \(m_{\infty } + 1\)). Nevertheless, by a slight abuse of notation we will write

$$\begin{aligned} \varvec{j}_{\varvec{z}} ({\textsf{x}} \, P) = \bigg ( \sum _{p=0}^{n_x - 1} \frac{1}{p!} \big ( {\textsf{x}} (\partial ^p_{\xi _x} P)|_x \big ) \otimes \varepsilon _x^p \bigg )_{x \in \varvec{z}} \in {\mathfrak {d}}\end{aligned}$$

for the Taylor expansion of \({\textsf{x}} \, P\) at each \(x \in \varvec{z}\) up to order \(n_x - 1\), cf. (3.16). We then define the linear map

$$\begin{aligned} \varvec{\rho }_{\varvec{z}} : \mathfrak {g}\longrightarrow {\mathfrak {d}},\qquad {\textsf{x}} \longmapsto \varvec{j}_{\varvec{z}} ({\textsf{x}} \, P). \end{aligned}$$
(3.21)

By construction \(\tilde{\mathfrak {f}}= {{\,\textrm{im}\,}}\varvec{\rho }_{\varvec{z}}\) is a complement to \({\mathfrak {f}}^\perp = {{\,\textrm{im}\,}}\varvec{j}_{\varvec{z}}\) in \({\mathfrak {d}}\) since \({\textsf{x}} \, P \not \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). Next, we explain how to fix the coefficients of the polynomial P by imposing the second property in (3.20b) and the property (3.20c).

We will first show how to ensure that \(\varvec{j}_{\varvec{z}} ({\textsf{x}} \, P) \in {\mathfrak {v}}^\perp \). Suppose first that \(m_{\infty } = 0\), i.e. that \(\omega \) does not vanish at infinity. Then, any \(g \in R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), which by definition has no constant term, has a zero of order at least 1 at infinity so that the 1-form \(\langle {\textsf{x}} \, P, g \rangle \omega \) is regular at infinity. In particular, all of its poles lie in \(\Pi \varvec{z}\). By the same computation as in the proof of Lemma 3.3, one shows that \(\langle \langle \varvec{j}_{\varvec{z}} ({\textsf{x}} \, P), \varvec{j}_{\varvec{z}} g \rangle \rangle _{{\mathfrak {d}}} = \langle \langle {\textsf{x}} \, P, g \rangle \rangle _{\omega } = 0\) where the last step is by the residue theorem. Hence, \(\varvec{j}_{\varvec{z}} ({\textsf{x}} \, P) \in {\mathfrak {v}}^\perp \) where in this case P is an arbitrary polynomial of degree 1, i.e. with two arbitrary coefficients.

Suppose now that \(m_{\infty } > 0\). Then, we can write any \(g \in R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) as \(g = g' + g_{\infty }\) where \(g_{\infty }\) is a \(\mathfrak {g}^{\mathbb {C}}\)-valued polynomial of order \(m_{\infty }\) with no constant term, which therefore contains \(m_{\infty }\) arbitrary coefficients, and \(g'\) contains no polynomial term. Consider the equation \(\langle \langle \varvec{j}_{\varvec{z}} ({\textsf{x}} \, P), \varvec{j}_{\varvec{z}} g \rangle \rangle _{{\mathfrak {d}}} = 0\). By the exact same reasoning as above, we deduce \(\langle \langle \varvec{j}_{\varvec{z}} ({\textsf{x}} \, P), \varvec{j}_{\varvec{z}} g' \rangle \rangle _{{\mathfrak {d}}} = 0\), whereas the condition \(\langle \langle \varvec{j}_{\varvec{z}} ({\textsf{x}} \, P), \varvec{j}_{\varvec{z}} g_{\infty } \rangle \rangle _{{\mathfrak {d}}} = 0\) imposes a triangular system of \(m_{\infty }\) linear equations on the coefficients of the polynomial P. By solving these equations, we can then ensure that \(\varvec{j}_{\varvec{z}} ({\textsf{x}} \, P) \in {\mathfrak {v}}^\perp \) where P is a polynomial of degree \(m_{\infty } + 1\) but with only two free coefficients.

The two remaining coefficients in the polynomial P can be fixed by requiring that \(\tilde{\mathfrak {f}}\) is isotropic, which amounts to imposing the condition \(\langle \langle \varvec{\rho }_{\varvec{z}} {\textsf{x}}, \varvec{\rho }_{\varvec{z}} {\textsf{y}} \rangle \rangle _{{\mathfrak {d}}} = 0\) for any \({\textsf{x}}, {\textsf{y}} \in \mathfrak {g}\), and the additional property (3.20c).
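The procedure can be illustrated on a simple example, which is our own choice and is only assumed to fit the general setup: \(\omega = \frac{z \, \textrm{d}z}{(z^2-1)^2}\), with double real poles at \(\pm 1\), no pole at infinity and simple zeroes at 0 and \(\infty \), so that \(m_{\infty } = 1\). We take \(P = p_0 + p_1 z + p_2 z^2\) and, as in the argument above, compute all pairings as \(\sum _{x = \pm 1} {{\,\textrm{res}\,}}_x \langle \cdot , \cdot \rangle \omega \), evaluated via the residue theorem from the residue at infinity. The three conditions then read

$$\begin{aligned} \sum _{x = \pm 1} {{\,\textrm{res}\,}}_x \, z P \omega&= p_1 = 0,&\sum _{x = \pm 1} {{\,\textrm{res}\,}}_x \, P \omega&= p_2 = 1,&\sum _{x = \pm 1} {{\,\textrm{res}\,}}_x \, P^2 \omega&= 2 p_2 (p_0 + p_2) = 0, \end{aligned}$$

where the first is the single equation (\(m_{\infty } = 1\)) ensuring \(\varvec{j}_{\varvec{z}} ({\textsf{x}} \, P) \in {\mathfrak {v}}^\perp \), the second is (3.20c) and the third is isotropy, evaluated at \(p_1 = 0\). The pairing with the pole term \(g^{(0,0)} z^{-1}\) vanishes automatically. This gives \(P = z^2 - 1\), so that \(P|_{\pm 1} = 0\) and \((\partial _z P)|_{\pm 1} = \pm 2\), and hence

$$\begin{aligned} \varvec{\rho }_{\varvec{z}} {\textsf{x}} = \big ( 2 {\textsf{x}} \otimes \varepsilon _{1}, \; - 2 {\textsf{x}} \otimes \varepsilon _{-1} \big ). \end{aligned}$$

As a consistency check, \(P^2 \omega = z \, \textrm{d}z\) is regular at \(\pm 1\), so isotropy is manifest in this example.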

Note that the linear map (3.21) could be naturally incorporated into the linear map (3.16) by extending the space of rational functions \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) to include the polynomial functions of the form \({\textsf{x}} \, P\) for \({\textsf{x}} \in \mathfrak {g}\). In other words, if we define the space \(\widehat{R}_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \cong R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\) consisting of rational functions of the form (3.4) but with the pole at infinity of order \(m_{\infty } + 1\) rather than \(m_{\infty }\) then the isomorphism (3.20a) can equally be described as an isomorphism

$$\begin{aligned} \widehat{\varvec{\jmath }}_{\varvec{z}} : \widehat{R}_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \overset{\cong }{\longrightarrow }{\mathfrak {d}}, \end{aligned}$$

defined in exactly the same way as (3.16), namely by Taylor expanding in the local coordinate \(\xi _x\) at each \(x \in \varvec{z}\) up to the order \(n_x - 1\). This discussion relates back to the observation made in Remark 2.1 and in the paragraph after (2.27), namely that the \(\mathfrak {g}\)-valued gauge field \({\mathcal {L}}\), which after going on-shell was valued in \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), could in fact be taken to live in the larger space \(\widehat{R}_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) just defined.

The key property of the isomorphism (3.20a), generalising that of Lemma 3.3 and which we shall use to construct the \(\mathcal {E}\)-operator in the next section, is the following.

Lemma 3.5

For any \(f, g \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) and \({\textsf{u}}, {\textsf{v}} \in \mathfrak {g}\) we have

$$\begin{aligned} \langle \langle \widehat{\varvec{\jmath }}_{\varvec{z}} (f, {\textsf{u}}), \widehat{\varvec{\jmath }}_{\varvec{z}} (g, {\textsf{v}}) \rangle \rangle _{{\mathfrak {d}}} = \langle \langle (f, {\textsf{u}}), (g, {\textsf{v}}) \rangle \rangle _{\omega }. \end{aligned}$$

Proof

We have

$$\begin{aligned} \langle \langle \widehat{\varvec{\jmath }}_{\varvec{z}} (f, {\textsf{u}}), \widehat{\varvec{\jmath }}_{\varvec{z}} (g, {\textsf{v}}) \rangle \rangle _{{\mathfrak {d}}}&= \langle \langle \varvec{j}_{\varvec{z}} f, \varvec{j}_{\varvec{z}} g \rangle \rangle _{{\mathfrak {d}}} + \langle \langle \varvec{\rho }_{\varvec{z}} {\textsf{u}}, \varvec{j}_{\varvec{z}} g \rangle \rangle _{{\mathfrak {d}}} + \langle \langle \varvec{j}_{\varvec{z}} f, \varvec{\rho }_{\varvec{z}} {\textsf{v}} \rangle \rangle _{{\mathfrak {d}}}\\&= \langle \langle f, g \rangle \rangle _{\omega } + \langle \langle \varvec{\rho }_{\varvec{z}} {\textsf{u}}, \Delta g^{\textrm{c}} \rangle \rangle _{{\mathfrak {d}}} + \langle \langle \Delta f^{\textrm{c}}, \varvec{\rho }_{\varvec{z}} {\textsf{v}} \rangle \rangle _{{\mathfrak {d}}}\\&= \langle \langle f, g \rangle \rangle _{\omega } +\langle {\textsf{u}}, g^{\textrm{c}} \rangle + \langle f^{\textrm{c}}, {\textsf{v}} \rangle = \langle \langle (f, {\textsf{u}}), (g, {\textsf{v}}) \rangle \rangle _{\omega }. \end{aligned}$$

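Here, in the first step we used the isotropy of \(\tilde{\mathfrak {f}}\). In the second line we have used Lemma 3.3 for the first term and the fact that \(\tilde{\mathfrak {f}}\perp {\mathfrak {v}}\) for the last two terms, so that only the constant terms \(g^{\textrm{c}}\) and \(f^{\textrm{c}}\) contribute. In the third line we used the property (3.20c) together with the symmetry of the bilinear form on \({\mathfrak {d}}\). The very last step is by definition (3.10b) of the bilinear form on \(R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\). \(\square \)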

3.4 The \(\mathcal {E}\)-Operator

We now wish to define an operator \(\mathcal {E}: {\mathfrak {d}}\rightarrow {\mathfrak {d}}\) which is symmetric with respect to the bilinear form (2.11) on \({\mathfrak {d}}\). To do so, we will first construct an operator [2]

$$\begin{aligned} \widetilde{\mathcal {E}} : \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\overset{\cong }{\longrightarrow }\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\end{aligned}$$
(3.22)

which is symmetric with respect to the bilinear form (3.14) on \(\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\). This may then be transferred to \({\mathfrak {d}}\) using the isomorphism \(\widehat{\varvec{\jmath }}_{\varvec{z}}\) in (3.20a) and the isomorphism \(\varvec{\pi }_{{\varvec{\zeta }}}: R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \oplus \mathfrak {g}\xrightarrow {\cong }\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\) from Sect. 3.2.

Recall from Sect. 2.1.2 that we assumed we were given a partition \({\varvec{\zeta }}= {\varvec{\zeta }}_+ \sqcup {\varvec{\zeta }}_-\) of the set of zeroes of \(\omega \) such that \(\sum _{y \in {\varvec{\zeta }}_+} m_y = \sum _{y \in {\varvec{\zeta }}_-} m_y\). The latter condition means, in particular, that the vector space \(\mathfrak {g}^{({\varvec{\zeta }})}\) introduced in (3.11) splits into a direct sum

$$\begin{aligned} \mathfrak {g}^{({\varvec{\zeta }})}= \mathfrak {g}^{({{\varvec{\zeta }}_+})} \dotplus \mathfrak {g}^{({{\varvec{\zeta }}_-})} \end{aligned}$$
(3.23)

of two subspaces with \(\dim \mathfrak {g}^{({{\varvec{\zeta }}_+})} = \dim \mathfrak {g}^{({{\varvec{\zeta }}_-})}\). In Sect. 2.3.1, we then introduced signs \(\epsilon _y = \pm 1\) for each \(y \in {\varvec{\zeta }}\) which we used to impose the condition (2.24b) on the components of the gauge field. Given this data, we introduce the involution

$$\begin{aligned} \widetilde{\mathcal {E}} : \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}&\overset{\cong }{\longrightarrow }\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}, \nonumber \\ \Big ( f^{\textrm{c}}, \big ( f^{(y,q)} \big )_{(y,q) \in ({{\varvec{\zeta }}})}, {\textsf{u}} \Big )&\longmapsto \Big ( {\textsf{u}}, \big ( \epsilon _y f^{(y,q)} \big )_{(y,q) \in ({{\varvec{\zeta }}})}, f^{\textrm{c}} \Big ). \end{aligned}$$
(3.24)

In other words, we leave the elements of \(\mathfrak {g}^{({{\varvec{\zeta }}_+})}\) fixed, we change the sign of those in \(\mathfrak {g}^{({{\varvec{\zeta }}_-})}\) and we swap the two additional copies of \(\mathfrak {g}\). One easily checks that the involution (3.24) is symmetric with respect to the bilinear form (3.14).
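For completeness, the check can be sketched as follows. Let \(B({\textsf{U}}, {\textsf{V}})\) denote the first sum on the right-hand side of (3.14b); this shorthand is ours. Since (3.24) acts as \(({\textsf{x}}, {\textsf{U}}, {\textsf{x}}') \mapsto ({\textsf{x}}', \epsilon {\textsf{U}}, {\textsf{x}})\) with \((\epsilon {\textsf{U}})^{(y,q)} :=\epsilon _y {\textsf{U}}^{(y,q)}\), we find

$$\begin{aligned} \langle \langle \widetilde{\mathcal {E}} ({\textsf{x}}, {\textsf{U}}, {\textsf{x}}'), ({\textsf{y}}, {\textsf{V}}, {\textsf{y}}') \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}}&= B(\epsilon {\textsf{U}}, {\textsf{V}}) + \langle {\textsf{x}}', {\textsf{y}}' \rangle + \langle {\textsf{x}}, {\textsf{y}} \rangle , \\ \langle \langle ({\textsf{x}}, {\textsf{U}}, {\textsf{x}}'), \widetilde{\mathcal {E}} ({\textsf{y}}, {\textsf{V}}, {\textsf{y}}') \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}}&= B({\textsf{U}}, \epsilon {\textsf{V}}) + \langle {\textsf{x}}, {\textsf{y}} \rangle + \langle {\textsf{x}}', {\textsf{y}}' \rangle . \end{aligned}$$

The two expressions agree because each term of \(B\) pairs components at the same zero y and \(\epsilon _y = \pm 1\) is real, so that \(B(\epsilon {\textsf{U}}, {\textsf{V}}) = B({\textsf{U}}, \epsilon {\textsf{V}})\). The involution property follows from \(\epsilon _y^2 = 1\) and the fact that swapping the two copies of \(\mathfrak {g}\) twice is the identity.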

Following [2], consider the isomorphism

$$\begin{aligned} {\mathcal {C}}:=\widehat{\varvec{\jmath }}_{\varvec{z}} \varvec{\pi }_{{\varvec{\zeta }}}^{-1} : \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\overset{\cong }{\longrightarrow }{\mathfrak {d}}. \end{aligned}$$
(3.25)

We use this to transfer (3.22) to an operator on \({\mathfrak {d}}\). Explicitly, we have the following.

Lemma 3.6

The operator \(\mathcal {E}:={\mathcal {C}}\widetilde{\mathcal {E}} {\mathcal {C}}^{-1}: {\mathfrak {d}}\overset{\cong }{\longrightarrow }{\mathfrak {d}}\) is an involution and is symmetric with respect to the bilinear form (2.11) on \({\mathfrak {d}}\).

Proof

The involution property is immediate from that of \(\widetilde{\mathcal {E}}\).

By combining Lemmas 3.2 and 3.5, we have \(\langle \langle {\mathcal {C}}{\textsf{U}}, {\mathcal {C}}{\textsf{V}} \rangle \rangle _{{\mathfrak {d}}} = \langle \langle {\textsf{U}}, {\textsf{V}} \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}}\) for any \({\textsf{U}}, {\textsf{V}} \in \mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}\). Therefore, for any \({\textsf{x}}, {\textsf{y}} \in {\mathfrak {d}}\) we deduce

$$\begin{aligned} \langle \langle {\textsf{x}}, \mathcal {E}{\textsf{y}} \rangle \rangle _{{\mathfrak {d}}}&= \langle \langle {\textsf{x}}, {\mathcal {C}}\widetilde{\mathcal {E}} {\mathcal {C}}^{-1} {\textsf{y}} \rangle \rangle _{{\mathfrak {d}}} = \langle \langle {\mathcal {C}}^{-1} {\textsf{x}}, \widetilde{\mathcal {E}} {\mathcal {C}}^{-1} {\textsf{y}} \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}}\\&= \langle \langle \widetilde{\mathcal {E}} {\mathcal {C}}^{-1} {\textsf{x}}, {\mathcal {C}}^{-1} {\textsf{y}} \rangle \rangle _{\mathfrak {g}\oplus \mathfrak {g}^{({\varvec{\zeta }})}\oplus \mathfrak {g}} = \langle \langle {\mathcal {C}}\widetilde{\mathcal {E}} {\mathcal {C}}^{-1} {\textsf{x}}, {\textsf{y}} \rangle \rangle _{{\mathfrak {d}}} = \langle \langle \mathcal {E}{\textsf{x}}, {\textsf{y}} \rangle \rangle _{{\mathfrak {d}}}. \end{aligned}$$

Hence, \(\mathcal {E}\) is also symmetric, as required. \(\square \)

Lemma 3.7

The restriction of the symmetric bilinear form \(\langle \langle \cdot , \mathcal {E}\cdot \rangle \rangle _{{\mathfrak {d}}}: {\mathfrak {d}}\times {\mathfrak {d}}\rightarrow {\mathbb {R}}\) to \({\mathfrak {f}}\subset {\mathfrak {d}}\) is non-degenerate.

Proof

Let \({\textsf{v}} \in {\mathfrak {f}}\) be non-zero and suppose that \(\langle \langle {\textsf{u}}, \mathcal {E}{\textsf{v}} \rangle \rangle _{{\mathfrak {d}}} = 0\) for all \({\textsf{u}} \in {\mathfrak {f}}\). Then, \(\mathcal {E}{\textsf{v}} \in {\mathfrak {f}}^\perp \), which is a contradiction since \(\mathcal {E}{\textsf{v}}\) is a non-zero element of \(\mathcal {E}{\mathfrak {f}}= \tilde{\mathfrak {f}}\), which is by definition a complement of \({\mathfrak {f}}^\perp \) in \({\mathfrak {d}}\). \(\square \)

3.5 Solving the Constraint Between \({\mathcal {L}}\) and h

Having introduced all of the necessary ingredients in the previous sections, we are now in a position to complete the final step of the construction of 2d integrable field theories from 4d Chern–Simons theory. Recall from Sect. 2.3.2 that in order to write down the final 2d action as in (2.30) we need to have a solution \({\mathcal {L}}= {\mathcal {L}}(h)\) of the constraint (2.28) satisfying the transformation property (2.29). We will now show how to construct solutions which give rise to integrable degenerate \(\mathcal {E}\)-models.

Let \({\mathfrak {v}}_{\pm }\) denote the eigenspaces of \(\mathcal {E}\) restricted to \({\mathfrak {v}}\) with eigenvalues \(\pm 1\). These are the images of the subspaces \(\mathfrak {g}^{({{\varvec{\zeta }}_{\pm }})}\) under the isomorphism \({\mathcal {C}}\) defined in (3.25). Equivalently, we can describe \({\mathfrak {v}}_{\pm }\) as the image of the spaces \(R'_{\Pi {\varvec{\zeta }}_{\pm }}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) of \(\Pi \)-equivariant \(\mathfrak {g}^{\mathbb {C}}\)-valued rational functions with poles in \(\Pi {\varvec{\zeta }}_{\pm }\) (see the start of Sect. 3.1), namely

$$\begin{aligned} {\mathfrak {v}}_{\pm } :=\varvec{j}_{\varvec{z}} \big ( R'_{\Pi {\varvec{\zeta }}_{\pm }}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \big ). \end{aligned}$$
(3.26)

We have \(\dim {\mathfrak {v}}_+ = \dim {\mathfrak {v}}_- = \tfrac{1}{2} \dim {\mathfrak {d}}- \dim \mathfrak {g}\). Following [43, 45], define the projection operators \(W^\pm _h: {\mathfrak {d}}\rightarrow {\mathfrak {d}}\) by, see in particular [45, (3.23)],

$$\begin{aligned} \ker W^\pm _h = {{\,\textrm{Ad}\,}}_{h^{-1}} {\mathfrak {k}}, \qquad {{\,\textrm{im}\,}}W^\pm _h = {\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm }. \end{aligned}$$
(3.27)

The constraint (2.28) explicitly says \(B_{\pm } :={{\,\textrm{Ad}\,}}_h (\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }) - \partial _{\pm } h h^{-1} \in C^\infty (\Sigma , {\mathfrak {k}})\), which can be rewritten as \(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } = {{\,\textrm{Ad}\,}}_{h^{-1}} B_{\pm } + h^{-1} \partial _{\pm } h\). Applying the operator \(W^\pm _h\) to both sides, and using the fact that \({{\,\textrm{Ad}\,}}_{h^{-1}} B_{\pm }\) is valued in \(\ker W^\pm _h\), we get

$$\begin{aligned} W^\pm _h(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }) = W^\pm _h(h^{-1} \partial _{\pm } h). \end{aligned}$$
(3.28)

On the other hand, in terms of light-cone components the condition (2.24b) is equivalent to the statement that \({\mathcal {L}}^{(y,q)}_{\pm } = 0\) for any \(y \in {\varvec{\zeta }}_{\mp }\) and \(q = 0, \ldots , m_y-1\). We can rewrite this in terms of the \(\widetilde{\mathcal {E}}\)-operator (3.24) as \(\widetilde{\mathcal {E}} \big ( \varvec{\pi }_{{\varvec{\zeta }}} ({\mathcal {L}}_{\pm } -{\mathcal {L}}^{\textrm{c}}_{\pm }) \big ) = \pm \varvec{\pi }_{{\varvec{\zeta }}} ({\mathcal {L}}_{\pm } - {\mathcal {L}}^{\textrm{c}}_{\pm })\). Applying the isomorphism \({\mathcal {C}}\) from (3.25) on both sides and using the definition of the \(\mathcal {E}\)-operator in Lemma 3.6 we then obtain the equivalent condition

$$\begin{aligned} \mathcal {E}( \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } - \Delta {\mathcal {L}}^{\textrm{c}}_{\pm }) =\pm ( \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } - \Delta {\mathcal {L}}^{\textrm{c}}_{\pm }) \end{aligned}$$
(3.29)

which implies that \(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } - \Delta {\mathcal {L}}^{\textrm{c}}_{\pm } \in {\mathfrak {v}}_{\pm }\). Since \(\Delta {\mathcal {L}}^{\textrm{c}}_{\pm } \in {\mathfrak {f}}\) it follows that \(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } \in {{\,\textrm{im}\,}}W^\pm _h\) so that the left-hand side of (3.28) just becomes \(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }\). In other words, we arrive at the following solution

$$\begin{aligned} \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }(h) = W^\pm _h(h^{-1} \partial _{\pm } h) \end{aligned}$$
(3.30)

of the constraint (2.28).

It remains to show that this solution satisfies the desired transformation property (2.29). For this we need the following lemma.

Lemma 3.8

For any \(g \in C^\infty (\Sigma , G)\) and \(k \in C^\infty (\Sigma , K)\) we have

$$\begin{aligned} W^\pm _{kh\Delta (g)^{-1}} = {{\,\textrm{Ad}\,}}_{\Delta (g)} \circ W^\pm _h \circ {{\,\textrm{Ad}\,}}_{\Delta (g)^{-1}}. \end{aligned}$$

Proof

The projection operator \(W^\pm _{k h\Delta (g)^{-1}}: {\mathfrak {d}}\rightarrow {\mathfrak {d}}\) is defined by

$$\begin{aligned} \ker W^\pm _{k h\Delta (g)^{-1}} = {{\,\textrm{Ad}\,}}_{\Delta (g)} {{\,\textrm{Ad}\,}}_{h^{-1}} {\mathfrak {k}}, \qquad {{\,\textrm{im}\,}}W^\pm _{k h\Delta (g)^{-1}} = {\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm } \end{aligned}$$

where in the first equality we have used the fact that \({{\,\textrm{Ad}\,}}_{k^{-1}} {\mathfrak {k}}= {\mathfrak {k}}\) since k is valued in K. The statement now follows using the fact that \({\mathfrak {v}}\) is invariant under the adjoint action of the diagonal subgroup \(F = {{\,\textrm{im}\,}}\Delta \). \(\square \)

The property (2.29) now easily follows using Lemma 3.8, namely we have

$$\begin{aligned} \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }(kh\Delta (g)^{-1})&= W^\pm _{k h \Delta (g)^{-1}} \big ( \Delta (g) h^{-1} k^{-1} \partial _{\pm } (k h \Delta (g)^{-1}) \big )\\&= {{\,\textrm{Ad}\,}}_{\Delta (g)} \circ W^\pm _h \big ( {{\,\textrm{Ad}\,}}_{h^{-1}} (k^{-1} \partial _{\pm } k) + h^{-1} \partial _{\pm } h\\&\quad -\Delta (g)^{-1} \partial _{\pm } \Delta (g) \big )\\&= \Delta (g) \big ( W^\pm _h ( h^{-1} \partial _{\pm } h)\big ) \Delta (g)^{-1} - \partial _{\pm } \Delta (g) \Delta (g)^{-1}\\&= {}^{\Delta (g)} \big ( \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }(h) \big ), \end{aligned}$$

where in the third equality we have used the fact that \({{\,\textrm{Ad}\,}}_{h^{-1}} (k^{-1} \partial _{\pm } k)\) is valued in \({{\,\textrm{Ad}\,}}_{h^{-1}}{\mathfrak {k}}= \ker W^\pm _h\) and \(\Delta (g)^{-1} \partial _{\pm } \Delta (g)\) is valued in \({\mathfrak {f}}\).

Substituting the solution (3.30) into the 2d action (2.30) we get the degenerate \(\mathcal {E}\)-model action [45, (3.22)]

$$\begin{aligned} S_{2d}(h)&= \frac{1}{2}\int \Big (\langle \langle h^{-1} \partial _-h,W^+_h (h^{-1}\partial _+h) \rangle \rangle \nonumber \\&\quad - \langle \langle h^{-1}\partial _+h,W^-_h (h^{-1}\partial _-h) \rangle \rangle \Big )\textrm{d}\sigma ^{+}\wedge \textrm{d}\sigma ^{-}-\frac{1}{2}I^{\textrm{WZ}}[h] \,. \end{aligned}$$
(3.31)

This model can be defined for more general Lie algebras \({\mathfrak {d}}\) and \(\mathcal {E}\)-operators than the ones considered in this article. In general, the action (3.31) does not describe a 2d integrable field theory. However, in the present case where \({\mathfrak {d}}\) is the defect Lie algebra (2.9) and the \(\mathcal {E}\)-operator is as defined in Sect. 3.4, namely when the data originate from 4d Chern–Simons theory as reviewed in Sect. 2.3, the action (3.31) describes an integrable field theory by construction. In particular, since the right-hand side of (3.30) takes values in \({\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm } \subset {\mathfrak {f}}^\perp \) and \(\varvec{j}_{\varvec{z}}: R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \xrightarrow {\cong }{\mathfrak {f}}^\perp \) is an isomorphism by Proposition 3.4, we can apply its inverse to both sides to obtain the Lax connection. Namely, denoting this inverse by \(\varvec{p}: {\mathfrak {f}}^\perp \xrightarrow {\cong }R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) we have

$$\begin{aligned} {\mathcal {L}}_{\pm }(h) = \varvec{p} \big ( W^\pm _h(h^{-1} \partial _{\pm } h) \big ), \end{aligned}$$
(3.32)

which provides a Lax connection for the integrable degenerate \(\mathcal {E}\)-model (3.31).

Although it is immediate by construction that (3.32) satisfies the flatness equation (2.32), it is instructive to show this explicitly. We will need the following lemma which is an immediate generalisation of [2, Lemma 4.6 & Remark 4.7]. Recall the definition of the real vector spaces \({\mathfrak {v}}_{\pm }\) in (3.26).

Lemma 3.9

For any \({\textsf{u}}_{\pm } \in {\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm }\) we have \(\varvec{p}[{\textsf{u}}_+, {\textsf{u}}_-] = [\varvec{p} {\textsf{u}}_+, \varvec{p} {\textsf{u}}_-]\).

Proof

Let \({\textsf{u}}_{\pm } \in {\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm }\) which we can write as \({\textsf{u}}_{\pm } = \varvec{j}_{\varvec{z}} f_{\pm }\) for some \(f_{\pm } \in R_{\Pi {\varvec{\zeta }}_{\pm }}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \). Then, we have

$$\begin{aligned} {[}\varvec{p} {\textsf{u}}_+, \varvec{p} {\textsf{u}}_-] = [f_+, f_-] =\varvec{p} \varvec{j}_{\varvec{z}} [f_+, f_-] = \varvec{p} [\varvec{j}_{\varvec{z}} f_+, \varvec{j}_{\varvec{z}} f_-] = \varvec{p} [{\textsf{u}}_+, {\textsf{u}}_-]. \end{aligned}$$

To see the second step, first observe that \([f_+, f_-] \in R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) by virtue of the fact that the poles of \(f_+\), which lie in \({\varvec{\zeta }}_+\), are disjoint from those of \(f_-\), which lie in \({\varvec{\zeta }}_-\). Therefore, \(\varvec{j}_{\varvec{z}}\) has a well-defined action on \([f_+, f_-]\) and we can insert the identity in the form \(\textrm{id}= \varvec{p} \varvec{j}_{\varvec{z}}\) out front. In the third step, we then used the fact that the linear map \(\varvec{j}_{\varvec{z}}\), defined by taking the truncated Taylor expansions at the points \(x \in \varvec{z}\), is in fact a morphism of Lie algebras. \(\square \)

The equations of motion of the degenerate \(\mathcal {E}\)-model (3.31) take the form of a flatness equation [45, (2.8)]

$$\begin{aligned} \partial _+ \big ( W^-_h(h^{-1} \partial _- h) \big ) - \partial _- \big ( W^+_h(h^{-1} \partial _+ h) \big ) + \big [ W^+_h(h^{-1} \partial _+ h), W^-_h(h^{-1} \partial _- h) \big ] = 0 \end{aligned}$$

in \({\mathfrak {d}}\). Applying the linear map \(\varvec{p}: {\mathfrak {f}}^\perp \xrightarrow {\cong }R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) to both sides and using Lemma 3.9 on the commutator term, we obtain the desired flatness equation (2.32) for the Lax connection with light-cone components (3.32).

Remark 3.10

The action of the non-degenerate \(\mathcal {E}\)-model [45, (2.5)] can be written in exactly the same form as in (3.31) but where the projectors \(W^\pm _h\) are now defined by the conditions [45, (2.6)]

$$\begin{aligned} \ker W^\pm _h = {{\,\textrm{Ad}\,}}_{h^{-1}} {\mathfrak {k}}, \qquad {{\,\textrm{im}\,}}W^\pm _h = {\mathfrak {v}}_{\pm } \end{aligned}$$

instead of (3.27). Of course, this action for the non-degenerate \(\mathcal {E}\)-model is equivalent to the one derived in [2] (see [2, §2.2]). The action in the form (3.31) can be derived directly in exactly the same way as above. Explicitly, assuming that \(\omega \) has a double pole at infinity, as in [2], one first fixes the F symmetry by setting the edge mode at infinity equal to the identity. As recalled at the start of this section, this then removes the constant term in both components of the Lax connection. The exact same procedure as above then applies, with the absence of constant terms in the Lax connection reducing (3.29) to \(\mathcal {E}( \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm }) = \pm ( \varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } )\) which was the condition used in [2]. In particular, this condition now implies that \(\varvec{j}_{\varvec{z}} {\mathcal {L}}_{\pm } \in {\mathfrak {v}}_{\pm }\).

4 Examples

In Sects. 2 and 3, we presented a general construction of integrable degenerate \(\mathcal {E}\)-models from 4d Chern–Simons theory, with the final 2d action given in (3.31). In practice, starting from a choice of meromorphic 1-form \(\omega \) one should build the associated defect Lie algebra \({\mathfrak {d}}\), identify its non-degenerate bilinear form \(\langle \langle \cdot , \cdot \rangle \rangle _{{\mathfrak {d}}}\) and work out the Lie group structure of the defect Lie group D. The real vector space of rational functions \(R'_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \), as defined in Sect. 3.1, may then be used to explicitly construct the subspaces \({\mathfrak {v}}_{\pm }\) of \({\mathfrak {d}}\). By making a choice of isotropic Lie subalgebra \({\mathfrak {k}}\subset {\mathfrak {d}}\), one is then able to explicitly construct the projectors \(W^\pm _h\) defined by (3.27) in terms of which the action (3.31) is expressed. In this section, we apply this procedure to explicitly construct the pseudo-chiral model and the bi-Yang-Baxter model.

4.1 Pseudo-Chiral Model

The first example we consider is that of a meromorphic 1-form with a \(4^{\textrm{th}}\) order pole at the origin and two simple zeroes at \(\pm a\) with \(a>0\), namely

$$\begin{aligned} \omega = \frac{a^2-z^2}{z^4} \textrm{d}z\,. \end{aligned}$$

The defect Lie algebra (2.9) for this choice of meromorphic 1-form is given by

$$\begin{aligned} {\mathfrak {d}}= \mathfrak {g}\otimes {\mathbb {R}}[\varepsilon ]/(\varepsilon ^4) \,. \end{aligned}$$
(4.1)

We denote the elements of \({\mathfrak {d}}\) by \({\textsf{u}}^p :={\textsf{u}} \otimes \varepsilon ^p\) with \(p=0,1,2,3\). The bilinear form (2.11) on the defect Lie algebra reads

$$\begin{aligned} \langle \langle {\textsf{u}}^p,{\textsf{v}}^q\rangle \rangle _{{\mathfrak {d}}} ={\left\{ \begin{array}{ll} \begin{array}{cc} a^2\langle {\textsf{u}},{\textsf{v}}\rangle &{}\quad \text {if} \quad p+q=3 \\ -\langle {\textsf{u}},{\textsf{v}}\rangle &{}\quad \text {if} \quad p+q=1 \\ 0 &{}\quad \text {otherwise} \,. \end{array} \end{array}\right. } \end{aligned}$$
(4.2)
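As a quick consistency check, and assuming that (2.11) is the usual residue pairing weighted by \(\omega \) (our reading of the general definition, which is not restated in this section), the two non-zero entries of (4.2) can be read off from the Laurent expansion of \(\omega \) at the origin:

$$\begin{aligned} \omega = \Big ( \frac{a^2}{z^4} - \frac{1}{z^2} \Big ) \textrm{d}z \quad \Longrightarrow \quad \textrm{res}_{z=0}\, z^{p+q}\, \omega ={\left\{ \begin{array}{ll} a^2 &{}\quad \text {if} \quad p+q=3 \\ -1 &{}\quad \text {if} \quad p+q=1 \\ 0 &{}\quad \text {otherwise} \end{array}\right. } \end{aligned}$$

for \(p, q \in \{0,1,2,3\}\), in agreement with the coefficients of \(\langle {\textsf{u}},{\textsf{v}}\rangle \) in (4.2).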

Next, let us describe the Lie group structure on the defect Lie group D with Lie algebra (4.1). It is given by the third-order jet bundle \(J^3G\) of the Lie group G, which in the right trivialisation is isomorphic to \(G \times \mathfrak {g}\times \mathfrak {g}\times \mathfrak {g}\). That is, a general element \(h \in D \cong G \times \mathfrak {g}\times \mathfrak {g}\times \mathfrak {g}\) can be expressed as a tuple \(h=(g,{\textsf{u}}, {\textsf{v}},{\textsf{w}})\), with \(g\in G\) and \({\textsf{u}}, {\textsf{v}},{\textsf{w}}\in \mathfrak {g}\). The group product and inverse on D are then given by [49]

$$\begin{aligned} (g,{\textsf{u}}, {\textsf{v}},{\textsf{w}})(\tilde{g},{\textsf{x}},{\textsf{y}},{\textsf{z}})&=\left( g\tilde{g}, {\textsf{u}} +\textrm{Ad}_g {\textsf{x}},{\textsf{v}} + \textrm{Ad}_g {\textsf{y}}+\frac{1}{2}[{\textsf{u}}, \textrm{Ad}_g {\textsf{x}}], \right. \\&\quad \left. {\textsf{w}}+\textrm{Ad}_g {\textsf{z}} + \frac{2}{3}[{\textsf{u}}, \textrm{Ad}_g {\textsf{y}}]+\frac{1}{3}[{\textsf{v}}, \textrm{Ad}_g {\textsf{x}}] +\frac{1}{6}[{\textsf{u}},[{\textsf{u}}, \textrm{Ad}_g {\textsf{x}}]]\right) ,\\ (g,{\textsf{u}}, {\textsf{v}}, {\textsf{w}})^{-1}&=\left( g^{-1}, -\textrm{Ad}_g^{-1}{\textsf{u}},-\textrm{Ad}_g^{-1} {\textsf{v}},-\textrm{Ad}_g^{-1} {\textsf{w}}+\frac{1}{3}\textrm{Ad}_g^{-1}[{\textsf{u}}, {\textsf{v}}]\right) \,. \end{aligned}$$

To specify the kernels of the operators \(W_h^{\pm }\), as in (3.27), we need to make a choice of Lagrangian subalgebra \({\mathfrak {k}}\subset {\mathfrak {d}}\). A natural choice in the present case is

$$\begin{aligned} {\mathfrak {k}}= \mathfrak {g}\otimes \varepsilon ^2{\mathbb {R}}[\varepsilon ]/(\varepsilon ^4), \end{aligned}$$
(4.3)

which is easily seen to be Lagrangian. One additional nice feature of this choice of \({\mathfrak {k}}\) is that it is an ideal in \({\mathfrak {d}}\) and thus, for any \(h\in D\) we have \(\textrm{Ad}_h^{-1}{\mathfrak {k}}= {\mathfrak {k}}\). Hence,

$$\begin{aligned} \ker W^{\pm }_h= \mathfrak {g}\otimes \varepsilon ^2 {\mathbb {R}}[\varepsilon ]/(\varepsilon ^4) =\{{\textsf{y}}^{2}+{\textsf{z}}^3 \,|\, {\textsf{y}},{\textsf{z}} \in \mathfrak {g}\}\,. \end{aligned}$$
(4.4)
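Both properties of the subalgebra (4.3) can be checked degree by degree. As a brief sketch, assuming the Lie bracket on (4.1) is the standard one of a truncated polynomial (Takiff) algebra, with the convention that \([{\textsf{u}}, {\textsf{v}}]^{r} = 0\) for \(r \ge 4\), we have

$$\begin{aligned} {[}{\textsf{u}}^p, {\textsf{v}}^q] = [{\textsf{u}}, {\textsf{v}}]^{p+q} \in {\mathfrak {k}}\quad \text {for} \quad p \ge 2, \; q \ge 0, \qquad \langle \langle {\textsf{u}}^p, {\textsf{v}}^q \rangle \rangle _{{\mathfrak {d}}} = 0 \quad \text {for} \quad p, q \ge 2. \end{aligned}$$

The first statement holds because \(p + q \ge 2\), and the second because \(p + q \ge 4\) never equals 1 or 3 in (4.2). Together with \(\dim {\mathfrak {k}}= 2 \dim \mathfrak {g}= \tfrac{1}{2} \dim {\mathfrak {d}}\), this shows that \({\mathfrak {k}}\) is an isotropic ideal of maximal dimension.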

On the other hand, specifying the image of \(W^{\pm }_h\) requires identifying the diagonal subalgebra \({\mathfrak {f}}= {{\,\textrm{im}\,}}\Delta \) and the subspaces \({\mathfrak {v}}_{\pm }\) defined in (3.26). Starting with \({\mathfrak {f}}\), the diagonal embedding for the defect Lie algebra (4.1) is simply \({\textsf{w}} \mapsto {\textsf{w}}^0\), so that

$$\begin{aligned} {\mathfrak {f}}= {{\,\textrm{im}\,}}\Delta = \{ {\textsf{w}}^0 \,|\, {\textsf{w}} \in \mathfrak {g}\}\,. \end{aligned}$$
(4.5)

To describe \({\mathfrak {v}}_{\pm }\) we begin by identifying the spaces of rational functions \(R'_{\Pi {\varvec{\zeta }}_{\pm }} \left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) corresponding to the meromorphic 1-form (4.1). By partitioning the set of zeroes \(\Pi {\varvec{\zeta }}= {\varvec{\zeta }}= \{a, -a\}\) of \(\omega \) as \(\Pi {\varvec{\zeta }}_{\pm } = {\varvec{\zeta }}_{\pm } = \{\pm a\}\), we find

$$\begin{aligned} R'_{\Pi {\varvec{\zeta }}_{\pm }}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi =\left\{ \frac{{\textsf{x}}}{z\mp a} \, \bigg | \, {\textsf{x}}\in \mathfrak {g}\right\} \,, \end{aligned}$$
(4.6)

so that expanding such rational functions to \(4^{\textrm{th}}\) order at the origin gives

$$\begin{aligned} {\mathfrak {v}}_+&= \varvec{j}_z\left( R'_{\Pi {\varvec{\zeta }}_+}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \right) =\left\{ -\frac{{\textsf{x}}^0}{a}-\frac{{\textsf{x}}^1}{a^2} -\frac{{\textsf{x}}^2}{a^3}-\frac{{\textsf{x}}^3}{a^4} \,\,\bigg | \,\, \textsf{x}\in \mathfrak {g}\right\} , \end{aligned}$$
(4.7)
$$\begin{aligned} {\mathfrak {v}}_-&=\varvec{j}_z\left( R'_{\Pi {\varvec{\zeta }}_-}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \right) =\left\{ \frac{{\textsf{x}}^0}{a}-\frac{{\textsf{x}}^1}{a^2} +\frac{{\textsf{x}}^2}{a^3}-\frac{{\textsf{x}}^3}{a^4} \,\,\bigg | \,\, \textsf{x}\in \mathfrak {g}\right\} \,. \end{aligned}$$
(4.8)
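The coefficients in (4.7) and (4.8) are just the first four Taylor coefficients of \(1/(z \mp a)\) at the origin. The following minimal sketch reproduces them with exact rational arithmetic; the function name `taylor_simple_pole` and the sample value \(a = 2\) are illustrative choices, not part of the construction.

```python
from fractions import Fraction


def taylor_simple_pole(pole, order=4):
    """Coefficients c_0, ..., c_{order-1} of 1/(z - pole) expanded at z = 0.

    Uses 1/(z - pole) = -(1/pole) * sum_p (z/pole)^p, valid for pole != 0,
    so that c_p = -(1/pole)^(p+1).
    """
    pole = Fraction(pole)
    return [-((1 / pole) ** (p + 1)) for p in range(order)]


a = Fraction(2)                   # sample value of the parameter a > 0
v_plus = taylor_simple_pole(a)    # coefficients of x^0, ..., x^3 in (4.7)
v_minus = taylor_simple_pole(-a)  # coefficients of x^0, ..., x^3 in (4.8)
```

For \(a = 2\) this returns \(-1/2, -1/4, -1/8, -1/16\) for \({\mathfrak {v}}_+\) and the same magnitudes with alternating signs, starting from \(+1/2\), for \({\mathfrak {v}}_-\).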

Then \({{\,\textrm{im}\,}}W^{\pm }_h = {\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm }\). With the explicit expressions for the image and kernel of the projectors, we may proceed with the computation of \(W^{\pm }_h (h^{-1}\partial _{\pm } h)\).

In order to simplify the discussion, we start by fixing both the K-symmetry and the F-symmetry. First, we note that the Lie group K with Lie algebra \({\mathfrak {k}}\) is identified with the subgroup \(\{\textrm{id}\} \times \{0\} \times \mathfrak {g}\times \mathfrak {g}\) of \(G \times \mathfrak {g}\times \mathfrak {g}\times \mathfrak {g}\). On the other hand, the Lie group F with Lie algebra \({\mathfrak {f}}\) is identified with the subgroup \(G\times \{0\}\times \{0\}\times \{0\}\). Fixing both of these gauge symmetries implies that our physical degree of freedom will be described by a representative of the class of \(h \in C^\infty (\Sigma , D)\) in the double coset \(K {\setminus } D / F\). By a slight abuse of notation, we will also denote it by \(h=(\textrm{id},{\textsf{u}},0,0)\). We then have

$$\begin{aligned} h^{-1}\textrm{d}h = \textrm{d}{\textsf{u}}^1-\frac{1}{2}[{\textsf{u}},\textrm{d}{\textsf{u}}]^2 +\frac{1}{6}[{\textsf{u}},[{\textsf{u}},\textrm{d}{\textsf{u}}]]^3 \,. \end{aligned}$$
(4.9)

In order to find the explicit action of \(W^\pm _h\) on \(h^{-1}\partial _{\pm } h\), we decompose the latter with respect to the direct sum decomposition \({\mathfrak {d}}= \ker W^\pm _h \dotplus {{\,\textrm{im}\,}}W^\pm _h\). Focussing first on \(h^{-1}\partial _+ h\), we look for \({\textsf{w}}, {\textsf{x}},{\textsf{y}},{\textsf{z}} \in \mathfrak {g}\) such that

$$\begin{aligned} h^{-1}\partial _+ h = \left( {\textsf{y}}^2 + {\textsf{z}}^3\right) +\left( {\textsf{w}}^0 -\frac{{\textsf{x}}^0}{a}-\frac{{\textsf{x}}^1}{a^2} -\frac{{\textsf{x}}^2}{a^3}-\frac{{\textsf{x}}^3}{a^4}\right) \,, \end{aligned}$$
(4.10)

which will then give

$$\begin{aligned} W^+_h (h^{-1}\partial _+ h) = {\textsf{w}}^0 -\frac{{\textsf{x}}^0}{a} -\frac{{\textsf{x}}^1}{a^2}-\frac{{\textsf{x}}^2}{a^3}-\frac{{\textsf{x}}^3}{a^4}. \end{aligned}$$

Explicitly decomposing the \(\sigma ^+\)-component of (4.9) as in (4.10), we find

$$\begin{aligned} W^+_h (h^{-1}\partial _+ h) = \partial _+ {\textsf{u}}^1 +\frac{\partial _+{\textsf{u}}^2}{a}+\frac{\partial _+ {\textsf{u}}^3}{a^2}\,. \end{aligned}$$
(4.11a)

By a completely analogous argument, we get

$$\begin{aligned} W^-_h (h^{-1}\partial _- h) = \partial _- {\textsf{u}}^1 -\frac{\partial _-{\textsf{u}}^2}{a}+\frac{\partial _- {\textsf{u}}^3}{a^2}\,. \end{aligned}$$
(4.11b)
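Since the decomposition (4.10) only involves scalar coefficients multiplying \(\mathfrak {g}\)-valued quantities, it can be solved one Takiff degree at a time. The sketch below does this for a single component, treating the degree-\(p\) coefficients of \(h^{-1} \partial _{\pm } h\) as plain numbers; the function `project` and the sample inputs are hypothetical illustrations rather than anything defined in the text.

```python
from fractions import Fraction


def project(c, a, sign):
    """Apply W^+ (sign=+1) or W^- (sign=-1) to c_0 + c_1 eps + c_2 eps^2 + c_3 eps^3.

    Solves the analogue of (4.10) for the unknowns w, x, y, z and returns
    the four coefficients of the part lying in f + v_pm.
    """
    pole = Fraction(sign) * Fraction(a)
    # Taylor coefficients of 1/(z - pole) at z = 0, as in (4.7) and (4.8).
    t = [-((1 / pole) ** (p + 1)) for p in range(4)]
    c = [Fraction(ci) for ci in c]
    x = c[1] / t[1]        # degree 1 fixes x
    w = c[0] - x * t[0]    # degree 0 fixes w
    y = c[2] - x * t[2]    # degrees 2 and 3 fix the kernel part
    z = c[3] - x * t[3]
    image = [w + x * t[0], x * t[1], x * t[2], x * t[3]]
    kernel = [0, 0, y, z]
    # The two pieces must add back up to the input.
    assert [i + k for i, k in zip(image, kernel)] == c
    return image


# Sample input with c_0 = 0, mimicking (4.9), and a = 3.
w_plus = project((0, 5, 7, 11), 3, +1)
w_minus = project((0, 5, 7, 11), 3, -1)
```

With \(c_1 = 5\) standing in for \(\partial _{\pm } {\textsf{u}}\), the output is \((0, c_1, \pm c_1/a, c_1/a^2)\), matching the pattern of (4.11a) and (4.11b); the degree 2 and 3 inputs drop out because they are absorbed by the kernel.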

Using the expression for the bilinear form (4.2), we then obtain

$$\begin{aligned} \langle \langle W^\pm _h (h^{-1}\partial _{\pm } h) , h^{-1}\partial _{\mp } h\rangle \rangle = \pm a \langle \partial _+ {\textsf{u}},\partial _- {\textsf{u}}\rangle \pm \frac{a^2}{2}\langle {\textsf{u}}, [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}] \rangle . \end{aligned}$$
(4.12)
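The identity (4.12) can be spot-checked in a concrete case. The sketch below assumes, purely for illustration, that \(\mathfrak {g}\cong {\mathbb {R}}^3\) with the cross product as Lie bracket and the dot product as invariant bilinear form; all function names and sample vectors are hypothetical.

```python
from fractions import Fraction

ZERO = (0, 0, 0)


def cross(x, y):
    """Lie bracket on R^3 (illustrative stand-in for the bracket on g)."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    """Invariant bilinear form on R^3."""
    return sum(xi * yi for xi, yi in zip(x, y))


def scale(s, x):
    return tuple(s * xi for xi in x)


def pairing(U, V, a):
    """Bilinear form (4.2) on elements given as lists [U^0, U^1, U^2, U^3]."""
    total = 0
    for p in range(4):
        for q in range(4):
            if p + q == 3:
                total += a ** 2 * dot(U[p], V[q])
            elif p + q == 1:
                total -= dot(U[p], V[q])
    return total


def current(u, du):
    """One light-cone component of h^{-1} dh as in (4.9)."""
    return [ZERO, du,
            scale(Fraction(-1, 2), cross(u, du)),
            scale(Fraction(1, 6), cross(u, cross(u, du)))]


def W(du, a, sign):
    """Right-hand side of (4.11a) for sign=+1 and of (4.11b) for sign=-1."""
    return [ZERO, du, scale(Fraction(sign) / a, du), scale(1 / a ** 2, du)]


a = Fraction(2)
u, dp, dm = (1, 2, 3), (4, -1, 2), (0, 5, -3)   # u, d_+ u, d_- u at a point

rhs = a * dot(dp, dm) + a ** 2 / 2 * dot(u, cross(dp, dm))
lhs_plus = pairing(W(dp, a, +1), current(u, dm), a)
lhs_minus = pairing(W(dm, a, -1), current(u, dp), a)
```

Only the degree pairs \((1,2)\) and \((2,1)\) contribute, and the two signs of (4.12) come out as `lhs_plus == rhs` and `lhs_minus == -rhs`.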

Finally, we compute the Wess-Zumino term (2.17). Since the Cartan 3-form is cubic in \(\widehat{h}^{-1} \textrm{d}\widehat{h}\), with \(\widehat{h} \in C^\infty (\Sigma \times I, D)\), and the latter has no term of Takiff degree 0 by (4.9), or its analogue for \(\widehat{h} = (\textrm{id}, \widehat{{\textsf{u}}}, 0, 0)\), it follows from the explicit form (4.2) of the bilinear form on \({\mathfrak {d}}\) that only the term cubic in \(\textrm{d}\widehat{{\textsf{u}}}\) can contribute, so that

$$\begin{aligned} I^{\textrm{WZ}}[h]=-\frac{1}{6}\int _{\Sigma \times I}a^2\langle \textrm{d}\widehat{{\textsf{u}}},[\textrm{d}\widehat{{\textsf{u}}},\textrm{d}\widehat{{\textsf{u}}}]\rangle =\frac{a^2}{6}\int _{\Sigma }\langle {\textsf{u}},[\textrm{d}{\textsf{u}},\textrm{d}{\textsf{u}}] \rangle \,. \end{aligned}$$
(4.13)

The 2d action (3.31) of the integrable degenerate \(\mathcal {E}\)-model corresponding to the meromorphic 1-form (4.1), the choice of Lagrangian subalgebra \({\mathfrak {k}}\subset {\mathfrak {d}}\) in (4.3) and the split \({\varvec{\zeta }}_{\pm } = \{\pm a\}\) of the zeroes of \(\omega \), is therefore the \(\sigma \)-model with target space \(K \setminus D / F \cong \mathfrak {g}\) and action given by

$$\begin{aligned} S[{\textsf{u}}]=\int _{\Sigma } \Big ( a\langle \partial _+ {\textsf{u}},\partial _- {\textsf{u}}\rangle +\frac{a^2}{3} \langle {\textsf{u}},[\partial _+ {\textsf{u}},\partial _- {\textsf{u}}] \rangle \Big ) \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^- \end{aligned}$$
(4.14)

for \({\textsf{u}} \in C^\infty (\Sigma , \mathfrak {g})\), which is the pseudo-chiral model of Zakharov and Mikhailov [53].
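For completeness, the coefficient \(a^2/3\) in (4.14) can be traced as follows. This is a sketch which assembles (4.12) and (4.13) in (3.31), using the symmetry of the bilinear form and assuming the convention \([\textrm{d}{\textsf{u}}, \textrm{d}{\textsf{u}}] = 2 [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}] \, \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^-\) for the bracket of \(\mathfrak {g}\)-valued 1-forms:

$$\begin{aligned} S[{\textsf{u}}]&= \frac{1}{2} \int _{\Sigma } \Big ( 2 a \langle \partial _+ {\textsf{u}}, \partial _- {\textsf{u}} \rangle + a^2 \langle {\textsf{u}}, [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}] \rangle \Big ) \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^- - \frac{1}{2} \cdot \frac{a^2}{3} \int _{\Sigma } \langle {\textsf{u}}, [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}] \rangle \, \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^-\\&= \int _{\Sigma } \Big ( a \langle \partial _+ {\textsf{u}}, \partial _- {\textsf{u}} \rangle + \Big ( \frac{a^2}{2} - \frac{a^2}{6} \Big ) \langle {\textsf{u}}, [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}] \rangle \Big ) \textrm{d}\sigma ^+ \wedge \textrm{d}\sigma ^-. \end{aligned}$$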

4.1.1 Lax Connection

Having found the action of the 2d integrable field theory, we now proceed with the computation of its Lax connection. Its light-cone components are given by (3.32) where \(\varvec{p}: {\mathfrak {f}}^\perp \xrightarrow {\cong }R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is the inverse of \(\varvec{j}_z\) and

$$\begin{aligned} {\mathfrak {f}}^\perp = {\mathfrak {f}}\oplus {\mathfrak {v}}_+ \oplus {\mathfrak {v}}_- \end{aligned}$$
(4.15)

with \({\mathfrak {f}}\), \({\mathfrak {v}}_+\) and \({\mathfrak {v}}_-\) given in (4.5), (4.7) and (4.8), respectively. Hence, the action of \(\varvec{p}\) on an element of \({\mathfrak {f}}^\perp \) decomposed with respect to (4.15) is simply

$$\begin{aligned} \varvec{p} \left( {\textsf{w}}^0 -\frac{{\textsf{x}}^0}{a}-\frac{{\textsf{x}}^1}{a^2} -\frac{{\textsf{x}}^2}{a^3}-\frac{{\textsf{x}}^3}{a^4}+\frac{{\textsf{y}}^0}{a} -\frac{{\textsf{y}}^1}{a^2}+\frac{{\textsf{y}}^2}{a^3}-\frac{{\textsf{y}}^3}{a^4}\right) ={\textsf{w}} + \frac{{\textsf{x}}}{z-a}+ \frac{{\textsf{y}}}{z+a}\,. \end{aligned}$$
(4.16)

The action of \(\varvec{p}\) on \(W^\pm _{h}(h^{-1}\partial _{\pm } h)\) given in (4.11) can now be computed to give the light-cone components of the Lax connection (3.32), namely we find

$$\begin{aligned} {\mathcal {L}}_{\pm } = \frac{-a^2}{z\mp a}\partial _{\pm } {\textsf{u}} \mp a \partial _{\pm } {\textsf{u}} \,. \end{aligned}$$
(4.17)

The zero curvature equation for this Lax connection is equivalent to

$$\begin{aligned} \partial _+\partial _- {\textsf{u}} - \frac{a}{2}[\partial _+ {\textsf{u}},\partial _- {\textsf{u}}]=0\,, \end{aligned}$$
(4.18)

which corresponds to the equation of motion of \({\textsf{u}}\) for the action (4.14), thus proving the Lax integrability of the model.
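As a sketch of the computation behind (4.18), note that the two terms in (4.17) combine to \({\mathcal {L}}_+ = -\frac{a z}{z-a} \partial _+ {\textsf{u}}\) and \({\mathcal {L}}_- = \frac{a z}{z+a} \partial _- {\textsf{u}}\). Assuming the flatness equation (2.32) has the same sign conventions as its \({\mathfrak {d}}\)-valued counterpart above, we then find

$$\begin{aligned} \partial _+ {\mathcal {L}}_- - \partial _- {\mathcal {L}}_+ + [{\mathcal {L}}_+, {\mathcal {L}}_-]&= a z \Big ( \frac{1}{z+a} + \frac{1}{z-a} \Big ) \partial _+ \partial _- {\textsf{u}} - \frac{a^2 z^2}{z^2 - a^2} [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}]\\&= \frac{a z^2}{z^2 - a^2} \Big ( 2 \partial _+ \partial _- {\textsf{u}} - a [\partial _+ {\textsf{u}}, \partial _- {\textsf{u}}] \Big ), \end{aligned}$$

which vanishes for all values of \(z\) if and only if (4.18) holds.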

4.2 Bi-Yang-Baxter \(\sigma \)-Model

The second example we consider is the bi-Yang-Baxter \(\sigma \)-model [54, 55]. Following the conventions used in [8, 56], we take the meromorphic 1-form

$$\begin{aligned} \omega =\frac{16 K z}{\zeta ^2(z-z_+)(z-z_-) (z-\tilde{z}_+)(z-\tilde{z}_-)} \textrm{d}z \end{aligned}$$
(4.19)

where \(K\in {\mathbb {R}}\). The four simple poles \(z_{\pm }, \tilde{z}_{\pm } \in {\mathbb {C}}\) and the coefficient \(\zeta \in {\mathbb {R}}\) are related to the two real deformation parameters \(\eta \) and \(\tilde{\eta }\) of the model by

$$\begin{aligned}&z_{\pm } =\frac{-2\rho \pm i \eta }{\zeta }\,, \quad \tilde{z}_{\pm } =-\frac{2+2\rho \pm i \tilde{\eta }}{\zeta }\,, \quad \rho =-\tfrac{1}{2}\left( 1-\frac{\eta ^2 -\tilde{\eta }^2}{4}\right) ,\end{aligned}$$
(4.20)
$$\begin{aligned}&\zeta ^2=\left( 1+\frac{(\eta +\tilde{\eta })^2}{4}\right) \left( 1+\frac{(\eta -\tilde{\eta })^2}{4}\right) \,. \end{aligned}$$
(4.21)

Note also that \(\omega \) has two simple zeroes, at 0 and \(\infty \). The defect Lie algebra (2.9) for this choice of meromorphic 1-form is given by

$$\begin{aligned} {\mathfrak {d}}= \big (\mathfrak {g}^{{\mathbb {C}}}\otimes {\mathbb {C}}[\varepsilon ]/ (\varepsilon )\big ) \times \big (\mathfrak {g}^{{\mathbb {C}}} \otimes {\mathbb {C}}[\tilde{\varepsilon }]/ (\tilde{\varepsilon })\big ) \cong \mathfrak {g}^{\mathbb {C}}\times \mathfrak {g}^{\mathbb {C}}, \end{aligned}$$
(4.22)

where we recall that each factor is treated as a real vector space. Therefore, elements of \({\mathfrak {d}}\) are given by tuples \(({\textsf{u}},{\textsf{v}})\in {\mathfrak {d}}\) with \(\textsf{u},\textsf{v}\in \mathfrak {g}^{{\mathbb {C}}}\). The bilinear form (2.11c) on the defect Lie algebra reads

$$\begin{aligned} \langle \langle (\textsf{u},\textsf{v}),(\tilde{\textsf{u}}, \tilde{\textsf{v}})\rangle \rangle _{{\mathfrak {d}}} =\tfrac{4K}{\eta } \Im \langle \textsf{u},\tilde{\textsf{u}}\rangle +\tfrac{4K}{\tilde{\eta }}\Im \langle \textsf{v}, \tilde{\textsf{v}}\rangle . \end{aligned}$$
(4.23)
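The coefficients \(4K/\eta \) and \(4K/\tilde{\eta }\) in (4.23) can be checked numerically. The sketch below assumes that (2.11c) pairs elements at a complex pole through the residue of \(\omega \) plus its complex conjugate, so that a residue \(-2{\textsf{i}}K/\eta \) at \(z_+\) produces the term \(\tfrac{4K}{\eta } \Im \langle \textsf{u}, \tilde{\textsf{u}} \rangle \); the function name `residues` and the sample parameter values are hypothetical.

```python
import math


def residues(K, eta, eta_t):
    """Residues of the 1-form (4.19) at the poles z_+ and tilde z_+."""
    rho = -0.5 * (1 - (eta ** 2 - eta_t ** 2) / 4)
    zeta = math.sqrt((1 + (eta + eta_t) ** 2 / 4) * (1 + (eta - eta_t) ** 2 / 4))
    # Poles as in (4.20): z_+, z_-, tilde z_+, tilde z_-.
    poles = [(-2 * rho + 1j * eta) / zeta,
             (-2 * rho - 1j * eta) / zeta,
             -(2 + 2 * rho + 1j * eta_t) / zeta,
             -(2 + 2 * rho - 1j * eta_t) / zeta]

    def res(k):
        # Residue of 16 K z / (zeta^2 prod_i (z - p_i)) at the simple pole p_k.
        denom = zeta ** 2
        for i, p in enumerate(poles):
            if i != k:
                denom *= poles[k] - p
        return 16 * K * poles[k] / denom

    return res(0), res(2)


K, eta, eta_t = 1.5, 0.7, 0.3   # sample real parameters
r, r_t = residues(K, eta, eta_t)
```

For these values, `r` and `r_t` agree with \(-2{\textsf{i}}K/\eta \) and \(-2{\textsf{i}}K/\tilde{\eta }\) up to rounding, and \(2 \Re (-2{\textsf{i}}K c/\eta ) = \tfrac{4K}{\eta } \Im c\) for any \(c \in {\mathbb {C}}\).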

The defect Lie group D with Lie algebra \({\mathfrak {d}}\) is simply \(G^{{\mathbb {C}}}\times G^{{\mathbb {C}}}\), a general element of which is a tuple \((h,\tilde{h})\) with \(h,\tilde{h} \in G^{{\mathbb {C}}}\).

To specify the kernels of the projection operators defined by (3.27) we need to choose a Lagrangian subalgebra \({\mathfrak {k}}\subset {\mathfrak {d}}\). Following [8], we take two skew-symmetric solutions \(R,\tilde{R}\in \textrm{End}\,\mathfrak {g}\) to the modified Yang-Baxter equation with \(c={\textsf{i}}\) in terms of which we define the Lie subalgebra

$$\begin{aligned} {\mathfrak {k}}:=\mathfrak {g}_R\times \mathfrak {g}_{\tilde{R}}=\{((R-{\textsf{i}}) \textsf{x},(\tilde{R}-{\textsf{i}})\textsf{y})\,|\, \textsf{x}, \textsf{y}\in \mathfrak {g}\} \end{aligned}$$
(4.24)

which is seen to be Lagrangian with respect to the bilinear form (4.23). To simplify the discussion we will gauge fix the K-symmetry. Let \(K= G_R \times G_{\tilde{R}} \subset G^{{\mathbb {C}}}\times G^{{\mathbb {C}}}\) be the Lie group with Lie algebra \({\mathfrak {k}}= \mathfrak {g}_R\times \mathfrak {g}_{\tilde{R}}\). Following [8], we assume that the direct sum decomposition \(\mathfrak {g}^{{\mathbb {C}}} \times \mathfrak {g}^{{\mathbb {C}}}= {\mathfrak {k}}\dotplus (\mathfrak {g}\times \mathfrak {g})\) lifts to the group level, that is, \(G^{{\mathbb {C}}}\times G^{{\mathbb {C}}} = K (G \times G)\) so that a natural parametrisation of the quotient \(K \setminus (G^{\mathbb {C}}\times G^{\mathbb {C}})\) is then given by \(G\times G\). In this way, our physical degrees of freedom will be described by a representative of the class of \(h \in C^\infty (\Sigma , G^{\mathbb {C}}\times G^{\mathbb {C}})\) in the coset \(K {\setminus } (G^{\mathbb {C}}\times G^{\mathbb {C}})\) which we denote by \((g,\tilde{g})\) with \(g,\tilde{g} \in C^\infty (\Sigma , G)\). Hence, from (3.27) we have

$$\begin{aligned} \textrm{ker}\, W^\pm _{(g,\tilde{g})} = \textrm{Ad}_{(g,\tilde{g})}^{-1} \mathfrak {g}_R \times \mathfrak {g}_{\tilde{R}}=\left\{ \left( (R_g-{\textsf{i}})\textrm{Ad}_g^{-1} \textsf{x},(\tilde{R}_{\tilde{g}}-{\textsf{i}})\textrm{Ad}_{\tilde{g}}^{-1} \textsf{y}\right) \,|\, \textsf{x},\textsf{y}\in \mathfrak {g}\right\} \,, \end{aligned}$$
(4.25)

where we have defined \(R_g = \textrm{Ad}_g^{-1} \circ R \circ \textrm{Ad}_g\), and similarly for \(\tilde{R}_{\tilde{g}}\).

On the other hand, the images of \(W^{\pm }_{(g,\tilde{g})}\) are given in terms of the subalgebra \({\mathfrak {f}}= {{\,\textrm{im}\,}}\Delta \) and the subspaces \({\mathfrak {v}}_{\pm }\) defined in (3.26). The diagonal embedding for the defect Lie algebra (4.22) is simply \({\textsf{a}} \mapsto ({\textsf{a}},{\textsf{a}})\), so that

$$\begin{aligned} {\mathfrak {f}}=\{({\textsf{a}},{\textsf{a}}) \,|\, \textsf{a}\in \mathfrak {g}\}\,. \end{aligned}$$
(4.26)

To determine \({\mathfrak {v}}_{\pm }\), we must first identify the space of rational functions \(R'_{\Pi {\varvec{\zeta }}_{\pm }} \left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) corresponding to the meromorphic 1-form (4.19). Fixing the partition of the set of zeroes \(\Pi {\varvec{\zeta }}= {\varvec{\zeta }}= \{0, \infty \}\) of \(\omega \) to be \(\Pi {\varvec{\zeta }}_+ = {\varvec{\zeta }}_+ = \{\infty \}\) and \(\Pi {\varvec{\zeta }}_- ={\varvec{\zeta }}_- = \{0\}\) we have

$$\begin{aligned} R'_{\Pi {\varvec{\zeta }}_+}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi = \left\{ {\textsf{b}} z \, | \, {\textsf{b}} \in \mathfrak {g}\right\} \,,\quad R'_{\Pi {\varvec{\zeta }}_-}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi =\left\{ \frac{{\textsf{b}}}{z} \, | \, {\textsf{b}}\in \mathfrak {g}\right\} . \end{aligned}$$
(4.27)

Expanding such rational functions at the set of independent poles \(\varvec{z}= \{ z_+, \tilde{z}_+ \}\) of \(\omega \) yields

$$\begin{aligned} {\mathfrak {v}}_+&= \varvec{j}_z\left( R'_{\Pi {\varvec{\zeta }}_+}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \right) =\left\{ \left( {\textsf{b}} z_+,{\textsf{b}}\tilde{z}_+\right) \,|\, {\textsf{b}}\in \mathfrak {g}\right\} , \end{aligned}$$
(4.28)
$$\begin{aligned} {\mathfrak {v}}_-&=\varvec{j}_z\left( R'_{\Pi {\varvec{\zeta }}_-}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \right) =\left\{ \left( \frac{{\textsf{b}}}{z_+},\frac{{\textsf{b}}}{\tilde{z}_+}\right) \,|\, {\textsf{b}}\in \mathfrak {g}\right\} \,. \end{aligned}$$
(4.29)

We then have \({{\,\textrm{im}\,}}W^{\pm }_{(g,\tilde{g})} = {\mathfrak {f}}\oplus {\mathfrak {v}}_{\pm }\) and we may now proceed with the computation of \(W^{\pm }_{(g,\tilde{g})}(j_{\pm },\tilde{\jmath }_{\pm })\) where we defined \(j_{\pm }:=g^{-1}\partial _{\pm } g\) and \(\tilde{\jmath }_{\pm } :=\tilde{g}^{-1}\partial _{\pm } \tilde{g}\).

In order to find the explicit action of \(W^\pm _{(g,\tilde{g})}\) on \((j_{\pm },\tilde{\jmath }_{\pm })\), we decompose the latter with respect to the direct sum decomposition \({\mathfrak {d}}= \ker W^\pm _{(g,\tilde{g})} \dotplus {{\,\textrm{im}\,}}W^\pm _{(g,\tilde{g})}\). Explicitly, focussing first on \((j_+, \tilde{\jmath }_+)\), we look for \({\textsf{a}}, {\textsf{b}}, {\textsf{x}}, {\textsf{y}} \in \mathfrak {g}\) such that

$$\begin{aligned} \left( j_+,\tilde{\jmath }_+\right) =({\textsf{a}}, {\textsf{a}}) +\left( {\textsf{b}} z_+,{\textsf{b}} \tilde{z}_+\right) +\left( (R_g-{\textsf{i}})\textrm{Ad}_g^{-1}\textsf{x}, (\tilde{R}_{\tilde{g}}-{\textsf{i}})\textrm{Ad}_{\tilde{g}}^{-1} \textsf{y}\right) . \end{aligned}$$
(4.30)

To match the notation from [8] it is convenient to introduce

$$\begin{aligned} J_{\pm } = \frac{1}{1\pm \frac{\eta }{2} R_g \pm \frac{\tilde{\eta }}{2}\tilde{R}_{\tilde{g}}} (j_{\pm } - \tilde{\jmath }_{\pm })\,, \end{aligned}$$
(4.31)

in terms of which the solution to (4.30) can be written as

$$\begin{aligned} {\textsf{a}}= & {} j_+ + \bigg ( \rho - \frac{\eta }{2} R_g \bigg ) J_+ = \tilde{\jmath }_+ + \bigg ( 1+\rho + \frac{\tilde{\eta }}{2} \tilde{R}_{\tilde{g}} \bigg ) J_+, \\ {\textsf{b}}= & {} \frac{\zeta }{2} J_+, \qquad \textrm{Ad}_{g}^{-1} {\textsf{x}} = \frac{\eta }{2} J_+, \qquad \textrm{Ad}_{\tilde{g}}^{-1} {\textsf{y}} =-\frac{\tilde{\eta }}{2} J_+. \end{aligned}$$
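
As a quick consistency check on this solution (a sketch using only the formulas above; the scalar relations at the end are inferred here rather than quoted from the text), note first that the two expressions for \({\textsf{a}}\) agree precisely because of the definition (4.31) of \(J_+\):

$$\begin{aligned} \bigg ( 1+\rho + \frac{\tilde{\eta }}{2} \tilde{R}_{\tilde{g}} \bigg ) J_+ - \bigg ( \rho - \frac{\eta }{2} R_g \bigg ) J_+ = \bigg ( 1 + \frac{\eta }{2} R_g + \frac{\tilde{\eta }}{2} \tilde{R}_{\tilde{g}} \bigg ) J_+ = j_+ - \tilde{\jmath }_+ \,. \end{aligned}$$

Substituting \({\textsf{a}}\), \({\textsf{b}}\), \({\textsf{x}}\) and \({\textsf{y}}\) into the two components of (4.30), the terms involving \(R_g\) and \(\tilde{R}_{\tilde{g}}\) cancel and only scalar multiples of \(J_+\) remain. For arbitrary \(J_+\), (4.30) then holds provided

$$\begin{aligned} \rho + \frac{\zeta }{2} z_+ - \frac{{\textsf{i}} \eta }{2} = 0 \,,\qquad 1 + \rho + \frac{\zeta }{2} \tilde{z}_+ + \frac{{\textsf{i}} \tilde{\eta }}{2} = 0 \,. \end{aligned}$$

These two relations are expected to follow from the definition of \(\rho \) in (4.20) and the positions of the poles \(z_+\) and \(\tilde{z}_+\), which are not reproduced here.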

In particular, given that \(W^+_{(g,\tilde{g})}\) is a projector, its action on (4.30) is given by the first two terms on the right-hand side, namely we have

$$\begin{aligned} W_{(g,\tilde{g})}^+ (j_+, \tilde{\jmath }_+) = \bigg ( j_+ - \frac{\eta }{2} (R_g - \textrm{i}) J_+, \tilde{\jmath }_+ + \frac{\tilde{\eta }}{2} (\tilde{R}_{\tilde{g}} - \textrm{i}) J_+ \bigg ) \,. \end{aligned}$$
(4.32)

Similarly, to compute the action of \(W^-_{(g,\tilde{g})}\) on \((j_-,\tilde{\jmath }_-)\) we look again for \({\textsf{a}}, {\textsf{b}}, {\textsf{x}}, {\textsf{y}} \in \mathfrak {g}\) but this time such that

$$\begin{aligned} (j_-,\tilde{\jmath }_-)=({\textsf{a}}, {\textsf{a}}) +\left( \frac{{\textsf{b}}}{z_+},\frac{{\textsf{b}}}{\tilde{z}_+}\right) +\left( (R_g-{\textsf{i}})\textrm{Ad}_g^{-1}\textsf{x}, (\tilde{R}_{\tilde{g}}-{\textsf{i}})\textrm{Ad}_{\tilde{g}}^{-1} \textsf{y}\right) \,. \end{aligned}$$
(4.33)

Doing so, we find

$$\begin{aligned} W_{(g,\tilde{g})}^- (j_-,\tilde{\jmath }_-) = \bigg ( j_- + \frac{\eta }{2} (R_g - \textrm{i}) J_-, \tilde{\jmath }_- - \frac{\tilde{\eta }}{2} (\tilde{R}_{\tilde{g}} - \textrm{i}) J_- \bigg )\,. \end{aligned}$$
(4.34)
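
The solution to (4.33) is not displayed above. For comparison with the \(+\) case, a form consistent with (4.34), (4.40) and (4.41), inferred here rather than quoted, is

$$\begin{aligned} {\textsf{a}}&= j_- + \bigg ( \rho + \frac{\eta }{2} R_g \bigg ) J_- \,,\qquad {\textsf{b}} = \frac{\zeta }{2} J_- \,, \\ \textrm{Ad}_{g}^{-1} {\textsf{x}}&= -\frac{\eta }{2} J_- \,,\qquad \textrm{Ad}_{\tilde{g}}^{-1} {\textsf{y}} = \frac{\tilde{\eta }}{2} J_- \,. \end{aligned}$$

Indeed, since \(W^-_{(g,\tilde{g})}\) retains only the first two terms on the right-hand side of (4.33), subtracting the third term from \((j_-,\tilde{\jmath }_-)\) with these values of \({\textsf{x}}\) and \({\textsf{y}}\) reproduces (4.34).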

Using the expression for the bilinear form (4.23), we then obtain

$$\begin{aligned} \langle \langle W_{(g,\tilde{g})}^\pm (j_{\pm },\tilde{\jmath }_{\pm } ), (j_{\mp },\tilde{\jmath }_{\mp }) \rangle \rangle _{{\mathfrak {d}}} = \pm 2K \langle j_+ - \tilde{\jmath }_+, J_- \rangle \end{aligned}$$
(4.35)

where we used the fact that \(\langle j_- - \tilde{\jmath }_-, J_+ \rangle = \langle j_+ - \tilde{\jmath }_+, J_- \rangle \) which follows from the skew-symmetry of R and \(\tilde{R}\).
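
To spell out this step (a sketch, in which \(O_{\pm }\) is shorthand introduced here for the operators inverted in (4.31), and assuming that \(R_g\) and \(\tilde{R}_{\tilde{g}}\) inherit the skew-symmetry of \(R\) and \(\tilde{R}\) with respect to the symmetric bilinear form \(\langle \cdot , \cdot \rangle \)), write \(j_{\pm } - \tilde{\jmath }_{\pm } = O_{\pm } J_{\pm }\). Then

$$\begin{aligned} O_{\pm }&:=1 \pm \frac{\eta }{2} R_g \pm \frac{\tilde{\eta }}{2} \tilde{R}_{\tilde{g}} \,, \\ \langle j_- - \tilde{\jmath }_-, J_+ \rangle&= \langle O_- J_-, J_+ \rangle = \langle J_-, O_+ J_+ \rangle = \langle J_-, j_+ - \tilde{\jmath }_+ \rangle = \langle j_+ - \tilde{\jmath }_+, J_- \rangle \,. \end{aligned}$$

The second equality uses that moving a skew-symmetric operator across the bilinear form flips its sign, so that the transpose of \(O_-\) is \(O_+\).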

On the other hand, it is straightforward to verify that the Wess–Zumino term vanishes identically. We thus find that the 2d action (3.31) is given by

$$\begin{aligned} S[g,\tilde{g}]=K\int _{\Sigma }\left\langle j_+-\tilde{\jmath }_+, J_-\right\rangle \textrm{d}\sigma \wedge \textrm{d}\tau , \end{aligned}$$
(4.36)

matching the action of the bi-Yang-Baxter \(\sigma \)-model as written in [56, (2.2)].

4.2.1 Lax Connection

The Lax connection is given by (3.32), which for this specific example becomes

$$\begin{aligned} {\mathcal {L}}_{\pm }\big ( (g,\tilde{g}) \big ) = \varvec{p} \big (W^\pm _{(g,\tilde{g})}(j_{\pm },\tilde{\jmath }_{\pm }) \big )\,, \end{aligned}$$
(4.37)

where \(\varvec{p}: {\mathfrak {f}}^\perp \xrightarrow {\cong }R_{\Pi {\varvec{\zeta }}}\left( \mathfrak {g}^{\mathbb {C}}\right) ^\Pi \) is the inverse of \(\varvec{j}_{\varvec{z}}\) with

$$\begin{aligned} {\mathfrak {f}}^\perp = {\mathfrak {f}}\oplus {\mathfrak {v}}_+ \oplus {\mathfrak {v}}_- \end{aligned}$$
(4.38)

with \({\mathfrak {f}}\), \({\mathfrak {v}}_+\) and \({\mathfrak {v}}_-\) given in (4.26), (4.28) and (4.29), respectively, so that the action of \(\varvec{p}\) on an element of \({\mathfrak {f}}^\perp \) decomposed with respect to (4.38) is simply

$$\begin{aligned} \varvec{p} \left( ({\textsf{a}},{\textsf{a}})+({\textsf{b}} z_+,{\textsf{b}} \tilde{z}_+) +\left( \frac{{\textsf{c}}}{z_+},\frac{{\textsf{c}}}{\tilde{z}_+}\right) \right) ={\textsf{a}} + {\textsf{b}} z + \frac{{\textsf{c}}}{z}\,. \end{aligned}$$
(4.39)

Therefore, decomposing \(W^\pm _{(g,\tilde{g})}(j_{\pm },\tilde{\jmath }_{\pm })\) with respect to (4.38) we find

$$\begin{aligned} {\mathcal {L}}_+ = B_++\frac{\zeta }{2}z J_+ \,,\quad {\mathcal {L}}_-=B_- +\frac{\zeta }{2}z^{-1}J_- \end{aligned}$$
(4.40)

where we have defined

$$\begin{aligned} B_{\pm }= j_{\pm } + \bigg ( \rho \mp \frac{\eta }{2}R_g \bigg ) J_{\pm } \,, \end{aligned}$$
(4.41)

with \(\rho \) defined in (4.20). The expressions for the components of the Lax connection coincide, up to a conventional sign, with [56, (2.18)].
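
As a brief illustration of how the \(+\) component of (4.40) arises (a sketch combining (4.30), the solution for \({\textsf{a}}\) and \({\textsf{b}}\) given after (4.31), and (4.39)), recall that \(W^+_{(g,\tilde{g})}\) keeps the first two terms on the right-hand side of (4.30), so that

$$\begin{aligned} W^+_{(g,\tilde{g})}(j_+,\tilde{\jmath }_+)&= ({\textsf{a}},{\textsf{a}}) + ({\textsf{b}} z_+, {\textsf{b}} \tilde{z}_+) \,, \\ {\mathcal {L}}_+&= {\textsf{a}} + {\textsf{b}} z = j_+ + \bigg ( \rho - \frac{\eta }{2} R_g \bigg ) J_+ + \frac{\zeta }{2} z J_+ = B_+ + \frac{\zeta }{2} z J_+ \,. \end{aligned}$$

The \(-\) component follows in the same way from (4.33), with the \({\mathfrak {v}}_-\) part mapped by \(\varvec{p}\) to a term proportional to \(z^{-1}\).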

5 Outlook

In this work, we have shown that the 4d Chern–Simons action introduced by Costello and Yamazaki in [3] with the most general meromorphic 1-form \(\omega \) gives rise, when passing to 2d following the approach of [1, 2], to the actions of integrable versions of degenerate \(\mathcal {E}\)-models, or dressing cosets, introduced by Klimčík and Ševera [35].

Notably, this article resolves one of the open problems mentioned in [2], namely removing the assumption that the meromorphic 1-form \(\omega \) should have a double pole at infinity, thus allowing it to be completely arbitrary. There are, however, a number of other interesting problems that remain open which apply to the degenerate setting as well. We summarise them here for completeness.

The first is related to the Hamiltonian description of the integrable degenerate \(\mathcal {E}\)-models constructed in the present work. Indeed, as recalled in the introduction, establishing the complete integrability of a 2d field theory requires moving to the Hamiltonian formalism and showing that the Poisson bracket of the spatial component of the Lax connection with itself takes the non-ultralocal r/s-form [57, 58] with twist function. This, in turn, is equivalent to recasting the 2d field theory in question as a classical dihedral affine Gaudin model, see [31] and also [32, 33]. Although we have not shown this explicitly here, it follows indirectly from [34], where a Hamiltonian analysis of 4d Chern–Simons theory was performed. There it was shown that the Poisson bracket of \({\mathcal {L}}_{\sigma }\), i.e. the spatial component of the gauge field in the gauge \(A_{\bar{z}} = 0\), with itself is precisely of the required form, with the twist function \(\varphi (z)\) determined by the meromorphic 1-form \(\omega = \varphi (z) \textrm{d}z\).

It would, nevertheless, be interesting to perform the Hamiltonian analysis of the integrable degenerate \(\mathcal {E}\)-models constructed here to directly show that the spatial component of their Lax connection has a Poisson bracket with itself of the expected r/s-form with twist function determined by \(\omega \). In particular, sufficient conditions on the \(\mathcal {E}\)-model data ensuring its integrability in the Hamiltonian sense were given in [59]. These are analogous to the sufficient conditions on the \(\mathcal {E}\)-model data given in [42], see also [2] and Lemma 3.9, ensuring the existence of a Lax connection. It would therefore be interesting to check that the integrable degenerate \(\mathcal {E}\)-models constructed from 4d Chern–Simons satisfy the sufficient conditions of [59].

The second interesting open direction is to determine the relationship between 4d Chern–Simons and the usual 3d Chern–Simons theory. Indeed, it was shown in [60] (see also [61]) that the non-degenerate \(\mathcal {E}\)-model on \(S^1\times {\mathbb {R}}\) can be obtained from 3d Chern–Simons theory for the Lie group D on \({\mathbb {D}} \times {\mathbb {R}}\), with \({\mathbb {D}}\) a disc, by imposing twisted self-dual boundary conditions on the gauge field A of the form

$$\begin{aligned} * A = \mathcal {E} A \end{aligned}$$
(5.1)

on the boundary \(S^1 \times {\mathbb {R}}\). It was further shown in [60] that the \(\sigma \)-model on \(K \setminus D\) can be obtained from 3d Chern–Simons theory on a hollowed out cylinder \({\mathbb {A}} \times {\mathbb {R}}\), with \({\mathbb {A}}\) an annulus, by imposing twisted self-dual boundary conditions on the gauge field as in (5.1) at the outer boundary and imposing at the inner boundary \(\Sigma _{\textrm{inn}}\) a condition of the form

$$\begin{aligned} A|_{\Sigma _{\textrm{inn}}}\in \Omega ^{1}(\Sigma _{\textrm{inn}},{\mathfrak {k}}), \end{aligned}$$
(5.2)

with \({\mathfrak {k}}\subset {\mathfrak {d}}\) a Lagrangian subalgebra. Given the similarities between the boundary conditions considered in the present 4d Chern–Simons context, in particular (3.2) as in [2] or (3.29) as considered here together with (2.28), and the boundary conditions (5.1) and (5.2) considered in the 3d Chern–Simons setting, it would be interesting to understand whether there is any deeper connection between these two theories.

Finally, there is at least one other interesting direction in which to generalise the whole construction, which is to try to describe from a 4d Chern–Simons perspective the class of 2d integrable field theories whose Lax connections are equivariant under the action of a cyclic group \({\mathbb {Z}}_T\) for some \(T \in {\mathbb {Z}}_{\ge 2}\). Such 2d integrable field theories should, on general grounds, arise from 4d Chern–Simons theory on a certain orbifold quotient of \(\Sigma \times {\mathbb {C}}{P}^1\) by \({\mathbb {Z}}_T\). That is, one should start from 4d Chern–Simons theory on \(\Sigma \times {\mathbb {C}}{P}^1\) but impose that the gauge field A be equivariant with respect to actions of the cyclic group on \({\mathbb {C}}{P}^1\) and on \(\mathfrak {g}\). Such a setting has already been considered in [9] in relation to a specific 2d integrable field theory, namely the \(\lambda \)-model. More generally, it would be interesting to extend the results of [1] to the equivariant setting and use this to construct integrable (non-)degenerate \(\mathcal {E}\)-models with equivariant Lax connections along the lines of [2] and the present paper for general \(\omega \).