1 Introduction

We write \(\textrm{SU}(3)\) for the group of \(3 {\times } 3\) complex, unitary matrices with determinant equal to 1. Consider the closed subgroups \({\mathbb {T}}_1 = \{\textrm{diag}(z,1,z^{-1}) \mid z \in S^{1} \}\) and \({\mathbb {T}}_2 =\{\textrm{diag}(z,z,z^{-2}) \mid z \in S^{1} \}\), where \(S^1\) denotes the multiplicative group of complex numbers of norm one. Both \({\mathbb {T}}_1\) and \({\mathbb {T}}_2\) are isomorphic to \(S^1\) as topological groups, via the natural isomorphisms \(z \mapsto \textrm{diag}(z,1,z^{-1})\) and \(z \mapsto \textrm{diag}(z,z,z^{-2})\), respectively. However, the two representations of \(S^1\) in \(\textrm{SU}(3)\) are not equal and not even conjugate in \(\textrm{SU}(3).\) It is therefore natural to ask whether there exist unitary representations \(\pi _1,\pi _2 :\textrm{SU}(3) \rightarrow \textrm{U}(n)\) for some n, such that the two representations of \(S^1\) can be matched, or more precisely

$$\begin{aligned} \pi _1(\textrm{diag}(z,1,z^{-1}))=\pi _2(\textrm{diag}(z,z,z^{-2})), \quad \text { for all } z \in S^1. \end{aligned}$$

We denote by \(\rho \) the natural representation of \(\textrm{SU}(3)\) on \({\mathbb {C}}^3\). One can then check that the two 9-dimensional representations \(\pi _1=\rho \otimes \bar{\rho }\) and \(\pi _2 = \rho \oplus \bar{\rho } \oplus 1^{\oplus 3}\) solve this problem, where \(\bar{\rho }\) denotes the conjugate representation, and 1 is the trivial one-dimensional representation. We will come back to this example several times in this article.

This concrete question belongs to a more general set of problems that was first studied by Bergman (1987). Let A and B be compact groups that share a common closed subgroup C; see Hofmann and Morris (2020) for background on the theory of general compact groups. It is natural to consider the abstract amalgamated free product \(G:=A *_C B\) and try to study the analytic properties that it inherits from its constituents. A natural question is whether G can carry a possibly non-Hausdorff pre-compact group topology that restricts to the given topologies on A and B. Equivalently, we ask whether A and B can be amalgamated over C in the category of compact groups, i.e., whether there exists a compact group D and embeddings of A and B in D that agree on the common copy of C.

Compact groups carry bi-invariant metrics that generate the topology. Thus, a first obstruction to a positive answer is that A and B may not carry bi-invariant metrics that agree on C. If this is the case, a bi-invariant pseudo-metric on G that is faithful on A and B cannot exist. In particular, there does not exist a compact group D as described above. Bergman showed that this type of argument rules out the existence of compact amalgams in many cases. It is a fundamental open problem whether amalgamation is always possible for \(C=S^1\) or \(C=\textrm{SU}(n)\); see Bergman (1987, Question 20).

The purpose of this note is to explore an equivalent algebraic reformulation of the problem in the simplest possible case. What got us started was the strategy outlined after Question 20 in Bergman (1987). That strategy is amenable to standard computer algebra systems. Our computer experiments, with SCIP (Bestuzheva et al. 2021) and OSCAR (Decker et al. 2024; OSCAR 2023), suggest that solutions to the original problem can always be found but get increasingly complicated. For example, merging the subgroups \({\mathbb {T}}_1 =\{\textrm{diag}(z,z^5,z^{-6}) \mid z \in S^1 \}\) and \({\mathbb {T}}_2 =\{\textrm{diag}(z,z^7,z^{-8}) \mid z \in S^1 \}\) of \(\textrm{SU}(3)\) required us to consider all possible direct sums of the first 120 mutually non-equivalent irreducible representations of \(\textrm{SU}(3)\) (in some specified order) until a pair of unitary representations of \(\textrm{SU}(3)\) on a complex vector space of dimension roughly 300,000 could be found that solves the problem.

2 Some basic observations

Let us study the question of amalgamation over the base \(C=S^1\). It follows from the Peter–Weyl Theorem (Knapp 1986, Theorem 1.12) that in order to construct amalgams of general compact groups A and B, it is enough to consider the case \(A=\textrm{U}(n)\), \(B=\textrm{U}(m)\). Since \(\textrm{U}(n)\) embeds into \(\textrm{SU}(n+1)\), we can further restrict to the case \(A=\textrm{SU}(n)\), \(B=\textrm{SU}(m)\). The simplest case that comes to mind is \(A=B=\textrm{SU}(2)\). In this case, however, every embedding of \(S^1\) in \(\textrm{SU}(2)\) is conjugate to the map \(z\mapsto \textrm{diag}(z,z^{-1})\). Thus, the first truly non-trivial case might be \(A=\textrm{SU}(2)\) and \(B=\textrm{SU}(3)\) or, somewhat more generally, \(A=B=\textrm{SU}(3)\). Our first task is to describe the possible embeddings of \(S^1\) in \(\textrm{SU}(3)\). Our second task is to search for pairs of faithful, finite-dimensional, unitary representations of \(\textrm{SU}(3)\) that agree on the embedded copies of \(S^{1}\).

The first task is easy to solve. Up to conjugation in \(\textrm{SU}(3)\), every embedding of \(S^1\) into \(\textrm{SU}(3)\) is given by three integers \((a,b,c) \in {\mathbb {Z}}^3\) such that \(a+b+c=0\) and \(\gcd (a,b,c)=1\). The embedding associated with the triplet \((a,b,c)\) is concretely given by

$$\begin{aligned} \psi _{a,b,c}(z):= \textrm{diag}(z^{a},z^b,z^c) \in \textrm{SU}(3), \quad \text { for all } z \in S^1. \end{aligned}$$
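The two conditions on the triplet can be checked mechanically. The following is a minimal sketch in Python; the function names `is_embedding_triple` and `normal_form` are hypothetical and not part of the source. The first condition makes the determinant equal to 1, the second makes the map injective, and sorting the exponents picks one representative per conjugacy class under permutations of the diagonal entries.

```python
from math import gcd


def is_embedding_triple(a, b, c):
    """Check the two conditions a + b + c = 0 and gcd(a, b, c) = 1.

    The first condition places diag(z^a, z^b, z^c) in SU(3); the second
    makes z -> diag(z^a, z^b, z^c) injective on S^1.
    """
    return a + b + c == 0 and gcd(gcd(a, b), c) == 1


def normal_form(a, b, c):
    """Sorted exponents (illustrative choice of representative).

    Permuting the diagonal entries is a conjugation in SU(3), so the
    sorted triple describes the same embedding up to conjugation.
    """
    return tuple(sorted((a, b, c)))
```

For instance, the triplets \((1,5,-6)\) and \((1,7,-8)\) from the introduction pass the check, while \((2,2,-4)\) does not.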

In order to address the second task, we recall some facts about the finite-dimensional unitary representation theory of \(\textrm{SU}(3)\). Let \(\rho \) be the standard representation of \(\textrm{SU}(3)\) on \({\mathbb {C}}^3\) and \(\bar{\rho }\) be its dual or conjugate. We denote by \(\pi _{m,n} :\textrm{SU}(3) \rightarrow \textrm{U}(\Gamma _{m,n})\) the irreducible representation parameterized by the weight \((m,n)\), a pair of non-negative integers. This representation corresponds to the Young diagram of shape \((m+n,n)\). According to Fulton and Harris (1991, §13.2), we have

$$\begin{aligned} \Gamma _{m,n} = \ker \left( \textrm{Sym}^m(\rho ) \otimes \textrm{Sym}^n(\bar{\rho }) {\mathop {\rightarrow }\limits ^{\phi _{m,n}}} \textrm{Sym}^{m-1}(\rho ) \otimes \textrm{Sym}^{n-1}(\bar{\rho })\right) , \end{aligned} \tag{1}$$

where \(\phi _{m,n}\) denotes the natural (surjective) contraction map

$$\begin{aligned} \phi _{m,n} \left( (v_1\dots v_m)\otimes (v_1^*\dots v_n^*) \right) := \sum _{i=1}^m \sum _{j=1}^n \langle v_i,v_j^*\rangle (v_1\dots \hat{v_i}\dots v_m)\otimes (v_1^*\dots \hat{v}^*_j\dots v^*_n). \end{aligned}$$

Note that representations of the compact Lie group \(\textrm{SU}(3)\) correspond to representations of the simple Lie algebra \({\mathfrak {sl}}_3{\mathbb {C}}\); see Fulton and Harris (1991, §9.3).

Every unitary representation \(\sigma \) of \(\textrm{SU}(3)\) extends to a unitary representation \(\sigma '\) of \(\textrm{U}(3)\), which is however not unique. The character of a unitary representation \(\sigma :\textrm{SU}(3) \rightarrow U(k)\) is a symmetric polynomial \(\chi (\sigma ) \in {\mathbb {Z}}[x_1,x_2,x_3]\) determined uniquely up to some element in the ideal generated by \(x_1x_2x_3-1\) by the property

$$\begin{aligned} \chi (\sigma )(x_1,x_2,x_3)= \textrm{tr}\left( \sigma '(\textrm{diag} (x_1,x_2,x_3))\right) , \quad \text { for all } x_1,x_2,x_3 \in S^1. \end{aligned}$$

It is known that \(\chi (\pi '_{m,n}) = s_{m+n,n},\) where \(s_{\lambda }\) denotes the Schur polynomial of the Young diagram \(\lambda \), here with the two parts \((m+n,n).\) Here, \(\pi '_{m,n}\) is the representation of \(\textrm{U}(3)\) described by the same formula as in Equation (1). Recall that \(\chi (\sigma )(1,1,1)\) equals the dimension of the representation \(\sigma \).

Now, every finite-dimensional unitary representation is a direct sum of irreducible unitary representations, and thus every character of such a representation is a symmetric, non-negative integer linear combination of Schur polynomials. We call such a symmetric polynomial Schur positive. Here, the polynomial \(s_{1,1,1}=x_1x_2x_3\) corresponds to the trivial representation 1. We say that a Schur positive polynomial is non-trivial if it is non-scalar modulo the polynomial \(x_1x_2x_3-1.\)
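To make the Schur polynomials in three variables concrete, here is a minimal sketch in Python that lists the monomials of \(s_{\lambda }\) by enumerating Gelfand–Tsetlin patterns. This enumeration is a standard combinatorial description chosen here for illustration; it is not the method used in the source, and the names `schur3` and `dimension` are hypothetical.

```python
from collections import Counter


def schur3(lam):
    """Schur polynomial s_lam in x1, x2, x3 as {(e1, e2, e3): coefficient}.

    lam is a partition with at most three parts, e.g. (2,), (1, 1), (1, 1, 1).
    Each Gelfand-Tsetlin pattern with top row lam contributes one monomial.
    """
    l1, l2, l3 = (tuple(lam) + (0, 0, 0))[:3]
    poly = Counter()
    for m1 in range(l2, l1 + 1):          # interlacing: l1 >= m1 >= l2
        for m2 in range(l3, l2 + 1):      # interlacing: l2 >= m2 >= l3
            for n1 in range(m2, m1 + 1):  # interlacing: m1 >= n1 >= m2
                e1 = n1
                e2 = m1 + m2 - n1
                e3 = l1 + l2 + l3 - m1 - m2
                poly[(e1, e2, e3)] += 1
    return poly


def dimension(poly):
    """Value at (1, 1, 1), i.e. the dimension of the representation."""
    return sum(poly.values())
```

With this convention, `schur3((1, 1, 1))` is the single monomial \(x_1x_2x_3\), and `dimension(schur3((m + n, n)))` reproduces the classical dimension formula \((m+1)(n+1)(m+n+2)/2\) for \(\pi _{m,n}\).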

Back to the original problem, we ask whether, for a given pair of triplets of integers, \((v_1,v_2,v_3)\) and \((w_1,w_2,w_3)\), with \(v_1+v_2+v_3=w_1+w_2+w_3=0\) and \(\gcd (v_1,v_2,v_3)=\gcd (w_1,w_2,w_3)=1\), we can find a pair of unitary representations

$$\begin{aligned} \sigma _1,\sigma _2 :\textrm{SU}(3) \rightarrow \textrm{U}(k), \end{aligned}$$

such that

$$\begin{aligned} \sigma _1\left( \psi _{v_1,v_2,v_3}(z)\right) = \sigma _2 \left( \psi _{w_1,w_2,w_3}(z)\right) \quad \text { for all } z \in S^1. \end{aligned}$$

The two representations of \(S^1\) appearing on the left and on the right are conjugate (and hence can be made equal by conjugating one of them) if and only if the associated characters agree. Hence, we arrive at the equivalent condition

$$\begin{aligned} \chi (\sigma _1)(z^{v_1},z^{v_2},z^{v_3}) = \chi (\sigma _2)(z^{w_1},z^{w_2},z^{w_3}) \quad \text { for all } z \in S^1. \end{aligned}$$

Thus, putting everything in more algebraic terms, the second task amounts to finding Schur positive polynomials P and Q such that the equality

$$\begin{aligned} P(z^{v_1},z^{v_2},z^{v_3}) = Q(z^{w_1},z^{w_2},z^{w_3}) \end{aligned} \tag{2}$$

holds in the Laurent polynomial ring \({\mathbb {Q}}[z^\pm ]\). For brevity, given a vector \(v=(a,b,c)\) with \(a+b+c=0\), we write \(P_v(z) =P(z^a,z^b,z^c)\) for the substitution.
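The substitution \(P \mapsto P_v\) is easy to carry out on a sparse representation. The sketch below assumes, purely for illustration, that a polynomial in \(x_1,x_2,x_3\) is stored as a dictionary from exponent triples to coefficients; the name `substitute` is hypothetical.

```python
from collections import Counter


def substitute(poly, v):
    """Laurent polynomial P_v(z) = P(z^a, z^b, z^c) for v = (a, b, c).

    poly maps exponent triples (e1, e2, e3) to coefficients; the result
    maps exponents of z to coefficients, with zero terms dropped.
    """
    out = Counter()
    for exps, coeff in poly.items():
        out[sum(e * vi for e, vi in zip(exps, v))] += coeff
    return {d: c for d, c in out.items() if c != 0}


# Two elementary examples, written out by hand.
e1 = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}   # x1 + x2 + x3
e3 = {(1, 1, 1): 1}                               # x1 * x2 * x3
```

For \(v=(1,1,-2)\), the call `substitute(e1, (1, 1, -2))` yields \(2z+z^{-2}\), and `substitute(e3, v)` is the constant 1 for every \(v\) with \(a+b+c=0\).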

There is an additional subtlety that we did not address so far: unitary representations of \(\textrm{SU}(3)\) need not be injective. If we pick a non-trivial third root of unity, \(\xi =\exp (2\pi i/3)\in S^1\), then the subgroup

$$\begin{aligned} Z = \left\langle \textrm{diag}(\xi ,\xi ,\xi ) \right\rangle \cong {\mathbb {Z}}/3{\mathbb {Z}}, \end{aligned}$$

forms the center of \(\textrm{SU}(3)\). Note that Z is the only non-trivial normal subgroup of \(\textrm{SU}(3)\), i.e., the quotient \(\textrm{SU}(3)/Z\) is simple. The following result characterizes the injective unitary representations of \(\textrm{SU}(3)\).

Proposition 2.1

Let \(\sigma :\textrm{SU}(3) \rightarrow \textrm{U}(k)\) be a unitary representation with character \(P=\chi (\sigma ') \in {\mathbb {Z}}[x_1,x_2,x_3]\) for some extension \(\sigma '\) of \(\sigma \) to \(\textrm{U}(3)\). Then the following conditions are equivalent:

  1. \(\sigma \) is injective,

  2. \(P(\xi ,\xi ,\xi ) \ne P(1,1,1)\), and

  3. P, written in terms of the Schur basis, has a summand (with positive coefficient) having total degree not divisible by three.

Proof

Since the center Z of \(\textrm{SU}(3)\) is generated by \(\textrm{diag}(\xi ,\xi ,\xi )\) and because Z is the only non-trivial normal subgroup of \(\textrm{SU}(3)\), the representation \(\sigma \) is injective if and only if \(\sigma (\textrm{diag}(\xi ,\xi ,\xi ))\) is distinct from the \(k{\times }k\) unit matrix. The eigenvalues of \(\sigma (\textrm{diag}(\xi ,\xi ,\xi ))\) are complex numbers of modulus 1 and \(P(\xi ,\xi ,\xi )=\textrm{tr}(\sigma (\textrm{diag}(\xi ,\xi ,\xi )))\) is equal to their sum. Hence, \(P(\xi ,\xi ,\xi )=k=P(1,1,1)\) if and only if all these eigenvalues are equal to 1 if and only if \(\sigma \) is not injective. This proves the equivalence between (1) and (2). Note that P is necessarily Schur-positive since it is a character.

We proceed to show the equivalence of (2) and (3). If each summand of P has a total degree which is a multiple of three, then \(P(\xi ,\xi ,\xi )=P(1,1,1)\); this shows that (2) implies (3). To show the reverse direction, note that summands of total degree divisible by three contribute equally to \(P(\xi ,\xi ,\xi )\) and \(P(1,1,1)\). Hence, without loss of generality, we may assume that \(P\ne 0\) is a positive linear combination of Schur polynomials, none of which has total degree divisible by three. Let \(R(x) = P(x,x,x)\), which is a univariate polynomial in \({\mathbb {Q}}[x]\). Note that all non-zero coefficients of R are positive and that R has no term of degree divisible by three. Now let \(R'\in {\mathbb {Q}}[x]\) be the remainder of division of R by \((x^3-1)\). Then \(R'\ne 0\), has no constant term, and satisfies \(R'(1) = R(1)\) and \(R'(\xi ) = R(\xi )\). We need to show \(P(1,1,1) \ne P(\xi ,\xi ,\xi )\). So let us assume the contrary. Then \(R'(\xi ) =R'(1)\), so the polynomial \(R'(x)-R'(1)\) must be a multiple of the minimal polynomial of \(\xi \), namely \(x^2+x+1\). Since \(R'\) has degree at most two, we have \(R'(x)-R'(1) = c(x^2+x+1)\) for some constant c. Evaluating at \(x=1\) gives \(3c=0\), so \(R'\) is constant. This contradicts the fact that \(R'\) is non-zero and has no constant term. We conclude that \(P(1,1,1) \ne P(\xi ,\xi ,\xi )\), and this completes our proof. \(\square \)
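The equivalence of conditions (2) and (3) can be tested numerically on examples. The sketch below assumes that a Schur positive polynomial is given by a dictionary from partitions to non-negative coefficients; the function names are hypothetical. It uses that \(s_\lambda \) is homogeneous of degree \(|\lambda |\), so that \(s_\lambda (\xi ,\xi ,\xi )=\xi ^{|\lambda |}s_\lambda (1,1,1)\), together with the classical dimension formula for \(\textrm{SU}(3)\).

```python
import cmath


def dim_su3(lam):
    """s_lam(1, 1, 1) for a partition with at most three parts."""
    l1, l2, l3 = (tuple(lam) + (0, 0, 0))[:3]
    return (l1 - l2 + 1) * (l2 - l3 + 1) * (l1 - l3 + 2) // 2


def is_injective_by_degree(schur_coeffs):
    """Condition (3): some summand has total degree not divisible by 3."""
    return any(c > 0 and sum(lam) % 3 != 0 for lam, c in schur_coeffs.items())


def central_value(schur_coeffs):
    """P(xi, xi, xi) for xi = exp(2*pi*i/3), computed via homogeneity."""
    xi = cmath.exp(2j * cmath.pi / 3)
    return sum(c * dim_su3(lam) * xi ** sum(lam)
               for lam, c in schur_coeffs.items())


def total_dimension(schur_coeffs):
    """P(1, 1, 1)."""
    return sum(c * dim_su3(lam) for lam, c in schur_coeffs.items())
```

For example, \(s_{1,1}+s_2\) has only summands of degree two and is detected as injective, whereas the 8-dimensional representation with character \(s_{2,1}\) has degree three and takes the same value at \((\xi ,\xi ,\xi )\) and at \((1,1,1)\).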

Example 2.2

For simplicity, we denote the unitary representation of \(S^{1}\) with character \(\sum _i a_i z^i\) by the list \((i^{a_i}; i \in {\mathbb {Z}})\); where we omit the entry of i whenever \(a_i=0.\) In particular, in this notation we have \(\psi _{a,b,c} = (a,b,c)\). We now revisit the example at the beginning of the introduction. The computation

$$\begin{aligned} (-1,0,1)^{\otimes 2} = (-2,-1^2,0^3,1^2,2) = (-2,1^2) \oplus (-2,1^2)^{*} \oplus (0)^{\oplus 3} \end{aligned}$$

shows that the representations \(\psi _{-1,0,1}\) and \(\psi _{-2,1,1}\) can be amalgamated inside \(\textrm{SU}(9)\). In terms of polynomials, this corresponds to \(P(x_1,x_2,x_3)=(x_1+x_2+x_3)^2\), \(Q(x_1,x_2,x_3) = x_1+x_2+x_3+x_2x_3+x_1x_3+x_1x_2+3x_1x_2x_3\) and the identity \(P(z^{-1},1,z)=Q(z^{-2},z,z).\) Observe that \(P=s_{1,1}+s_2\) and \(Q=s_1+s_{1,1}+3s_{1,1,1}\). In particular, both polynomials are Schur positive. Moreover, \(P(1,1,1)=Q(1,1,1)=9\) is the dimension of the representation. Apart from this example, which we were able to work out by hand, only a few other cases seem suitable for pen and paper calculations.
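The weight computation of this example can be replayed with multisets of integers. In the following sketch a representation of \(S^1\) is stored as a `Counter` mapping each weight \(i\) to its multiplicity \(a_i\); the helper names `tensor`, `dual` and `direct_sum` are hypothetical.

```python
from collections import Counter


def tensor(u, w):
    """Weights of a tensor product: all pairwise sums, with multiplicity."""
    out = Counter()
    for i, a in u.items():
        for j, b in w.items():
            out[i + j] += a * b
    return out


def dual(u):
    """Weights of the dual representation: negate every weight."""
    return Counter({-i: a for i, a in u.items()})


def direct_sum(*reps):
    """Weights of a direct sum: add multiplicities."""
    out = Counter()
    for r in reps:
        out.update(r)
    return out


v = Counter([-1, 0, 1])     # psi_{-1,0,1}
w = Counter([-2, 1, 1])     # psi_{-2,1,1}

lhs = tensor(v, v)                              # (-1,0,1) tensor (-1,0,1)
rhs = direct_sum(w, dual(w), Counter({0: 3}))   # w + w^* + three trivial summands
```

Both `lhs` and `rhs` equal the multiset \((-2,-1^2,0^3,1^2,2)\), which has nine elements in total.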

It follows from the reasoning above that we can formulate Bergman’s problem (Bergman 1987, Question 20) in the first non-trivial case as follows:

Question 2.3

Given integer vectors \(v = (v_1,v_2,v_3)\) and \(w = (w_1,w_2,w_3)\) satisfying \(v_1+v_2+v_3=w_1+w_2+w_3=0\) and \(\gcd (v_1,v_2,v_3) = \gcd (w_1,w_2,w_3) = 1\), can we find Schur positive polynomials in three variables P and Q such that:

  1. \(P_v(z) = Q_w(z)\) and

  2. \(P(\xi ,\xi ,\xi ) \ne P(1,1,1)\) and

  3. \(Q(\xi ,\xi ,\xi )\ne Q(1,1,1)\)?

Our experiments suggest that the answer is always positive.

3 Computations

Now we recast Question 2.3 as a problem in polyhedral geometry and approach it computationally. The source code and its output can be found on our MathRepo page

https://mathrepo.mis.mpg.de/CompactAmalgamation/index.html.                   (3)

We fix \(v, w \in {\mathbb {Z}}^3\) such that \(v_1+v_2+v_3=w_1+w_2+w_3=0\) and \(\gcd (v_1,v_2,v_3) = \gcd (w_1,w_2,w_3) = 1\). Choosing an ordering for the Schur polynomials, we then make the problem finite by fixing a number k and considering only the first k Schur polynomials in three variables, denoted by \(S_1,\dots ,S_k\in {\mathbb {Q}}[x_1,x_2,x_3]\). We search for \(P = \lambda _1 S_1 + \dots + \lambda _k S_k\) and \(Q = \mu _1 S_1 + \dots + \mu _k S_k\), where \(\lambda _i, \mu _i\) are non-negative integers. These polynomials lie in \({\mathbb {Q}}[x_1,x_2,x_3]\), and their substitutions \(P_{v}(z)\) and \(Q_{w}(z)\) are univariate Laurent polynomials. The coefficients of the difference \(P_{v}(z)-Q_w(z)\) are integer linear combinations of \(\lambda _i\) and \(\mu _i\). Setting these coefficients to zero and letting \(\lambda _i \ge 0\) and \(\mu _i \ge 0\) defines a polyhedral cone in \({\mathbb {R}}^{2k}\). We denote that cone \({\mathcal {C}}={\mathcal {C}}_k(v,w)\). Recall that \({\mathcal {C}}\) depends on the chosen ordering of the Schur polynomials. Throughout we assume that \(S_1=s_{1,1,1}\) is the trivial representation. It plays a special role, as \(P = Q = \lambda _1 S_1\), for any \(\lambda _1\ge 0\), is a trivial solution to (1) in Question 2.3.
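The linear equations that cut out the cone can be assembled mechanically. The following is a minimal sketch, not the OSCAR code from the MathRepo page: it takes an explicit list of partitions, computes each Schur polynomial through Gelfand–Tsetlin patterns (an illustrative choice), and records, for every exponent of \(z\), the coefficients of \(\lambda _1,\dots ,\lambda _k,\mu _1,\dots ,\mu _k\) in \(P_v(z)-Q_w(z)\). All function names are hypothetical.

```python
from collections import Counter


def schur3(lam):
    """Monomials of s_lam in three variables via Gelfand-Tsetlin patterns."""
    l1, l2, l3 = (tuple(lam) + (0, 0, 0))[:3]
    poly = Counter()
    for m1 in range(l2, l1 + 1):
        for m2 in range(l3, l2 + 1):
            for n1 in range(m2, m1 + 1):
                poly[(n1, m1 + m2 - n1, l1 + l2 + l3 - m1 - m2)] += 1
    return poly


def substitute(poly, v):
    """P(z^a, z^b, z^c) as {exponent of z: coefficient}."""
    out = Counter()
    for exps, coeff in poly.items():
        out[sum(e * vi for e, vi in zip(exps, v))] += coeff
    return out


def cone_equations(partitions, v, w):
    """Rows of the system "coefficient of z^d in P_v - Q_w is zero".

    Returns {d: [a_1, ..., a_k, b_1, ..., b_k]}, encoding the equation
    sum_i a_i * lambda_i + sum_i b_i * mu_i = 0.
    """
    k = len(partitions)
    rows = {}
    for i, lam in enumerate(partitions):
        s = schur3(lam)
        for d, c in substitute(s, v).items():
            rows.setdefault(d, [0] * (2 * k))[i] += c
        for d, c in substitute(s, w).items():
            rows.setdefault(d, [0] * (2 * k))[k + i] -= c
    return rows


# Illustration: four Schur polynomials, v = (-1, 0, 1), w = (-2, 1, 1).
rows = cone_equations([(1, 1, 1), (1,), (1, 1), (2,)], (-1, 0, 1), (-2, 1, 1))
```

In this illustration `rows` has six entries, one for each of the exponents \(2, 1, 0, -1, -2, -4\); together with the non-negativity constraints they describe \({\mathcal {C}}_4(v,w)\).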

To find P and Q, we consider the integer linear program

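$$\begin{aligned} \begin{array}{ll} \text {minimize} &amp; c^{\top }(\lambda ,\mu ) \\ \text {subject to} &amp; (\lambda ,\mu ) \in {\mathcal {C}}\cap {\mathbb {Z}}^{2k}, \\ &amp; \sum _{i:\, 3 \nmid \deg S_i} \lambda _i \ge 1, \\ &amp; \sum _{i:\, 3 \nmid \deg S_i} \mu _i \ge 1, \end{array} \end{aligned} \tag{ILP$_k$}$$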

where \(c\in {\mathbb {R}}_{>0}^{2k}\) is some strictly positive linear objective function, to be discussed below. Let \({\mathcal {P}}={\mathcal {P}}_k(v,w)\) be the feasible region of the linear relaxation of (ILP\(_k\)).

Remark 3.1

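Conceptually, one could replace the weak inequality \(\sum _{i:\, 3 \nmid \deg S_i} \lambda _i \ge 1\) by the strict inequality \(\sum _{i:\, 3 \nmid \deg S_i} \lambda _i > 0\), and similarly for \(\mu \), but the description as an (integer) linear program requires weak inequalities.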

Proposition 3.2

The feasible solutions of (ILP\(_k\)), i.e., the lattice points in \({\mathcal {P}}\), are in bijection with those nontrivial solutions to Question 2.3 which can be written as a non-negative linear combination of the first k Schur polynomials.

Proof

Containment in the cone \({\mathcal {C}}\) is equivalent to the condition (1) in Question 2.3. The two additional constraints correspond to conditions (2) and (3); see Proposition 2.1. \(\square \)

Remark 3.3

In practice, we make the following choices. We order the 3-variate Schur polynomials lexicographically: a partition \((m+n,n)\), with \(m, n\ge 0\), is less than another partition \((m'+n',n')\) if either \(m+n < m'+n'\) or \(m+n = m'+n'\) and \(n < n'\); and the special partition (1, 1, 1) is defined to be smaller than \((m+n,n)\) for arbitrary m and n. Moreover, we take the objective function \(c=(c_i)\) with \(c_i=S_i(1,1,1)\), the dimension of the irreducible representation with character \(S_i\). So the optimal solutions are minimal with respect to dimension.

We abbreviate \((m)=(m,0)\).
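The ordering just described can be generated as follows. This is a minimal sketch with a hypothetical function name; it assumes, as Example 3.4 suggests, that the empty partition is skipped because the trivial representation is already represented by \((1,1,1)\).

```python
def first_partitions(k):
    """First k partitions: (1, 1, 1) first, then (p, n) by p, then by n.

    Here p = m + n is the first part and 0 <= n <= p is the second part.
    Assumption: the empty partition (p = 0) is not listed.
    """
    parts = [(1, 1, 1)][:k]
    p = 1
    while len(parts) < k:
        for n in range(p + 1):
            if len(parts) < k:
                parts.append((p, n))
        p += 1
    return parts
```

For \(k=4\) this gives \((1,1,1)\), \((1,0)\), \((1,1)\), \((2,0)\), where \((1,0)\) and \((2,0)\) are the partitions abbreviated as (1) and (2).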

Example 3.4

We consider \(v=(-1,0,1)\) and \(w=(-2,1,1)\) as in Example 2.2, and we pick \(k=4\). Then the first four Schur polynomials correspond to the partitions (1, 1, 1), (1), (1, 1), and (2). So we have \(S_1=x_1x_2x_3\), \(S_2=x_1+x_2+x_3\), \(S_3=x_1x_2+x_1x_3+x_2x_3\), and \(S_4=x_1^2 + x_1x_2 + x_1x_3 + x_2^2 + x_2x_3 + x_3^2\). Then

$$\begin{aligned} P_v(z)-Q_w(z)&= (\lambda _{4} - \mu _{3} - 3\mu _{4})z^2 + (\lambda _{2} + \lambda _{3} + \lambda _{4} - 2\mu _{2})z\\&\quad + \lambda _{1} + \lambda _{2} + \lambda _{3} + 2\lambda _{4} - \mu _{1}\\&\quad + (\lambda _{2} + \lambda _{3} + \lambda _{4} - 2\mu _{3} - 2\mu _{4})z^{-1} + (\lambda _{4} - \mu _{2})z^{-2} - \mu _{4}z^{-4}. \end{aligned}$$

Consequently, the unbounded polyhedron \({\mathcal {P}}\) in \({\mathbb {R}}^8\) is given by six homogeneous equations (from the coefficients of \(P_v(z)-Q_w(z)\), considered as a Laurent polynomial in \({\mathbb {Q}}[\lambda _1,\dots ,\mu _4][z^\pm ]\)), the eight nonnegativity constraints and two affine inequalities (from forcing injectivity). The polyhedron \({\mathcal {P}}\) is 3-dimensional. Solving the integer linear program (ILP\(_k\)) yields

$$\begin{aligned} \lambda _1=\lambda _2=0,\; \lambda _3=\lambda _4=1 \quad \text {and} \quad \mu _1=3,\; \mu _2=\mu _3=1,\; \mu _4=0 \end{aligned}$$

as an optimal solution of objective value \(3+6=3+3+3=9\). This recovers the pair of 9-dimensional representations given by \(P=s_{1,1}+s_2\) and \(Q=3s_{1,1,1}+s_1+s_{1,1}\) from Example 2.2. That pair of Schur positive polynomials corresponds to the lattice point marked \(0011\, 3110\) in Fig. 1. Our visualization artificially truncates the feasible region at representation dimension ten. We see two 9-dimensional solutions and two 10-dimensional ones. The solutions come in pairs since \(s_{1,v}=s_{(1,1),v}\) for the special choice of \(v=(-1,0,1)\). The 10-dimensional solutions are obtained from the 9-dimensional solutions by adding a trivial representation. In this way, the solution from Example 2.2 explains all four solutions shown here.

Fig. 1 Four integral points in \({\mathcal {P}}_4(v,w)\) for \(v=(-1,0,1)\) and \(w=(-2,1,1)\). Visualized with polymake (Gawrilow and Joswig 2000); hyperplane for artificial truncation at representation dimension 10 marked red (color figure online)
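The four lattice points of this example can be recovered by brute force, without any optimization software. The sketch below is an illustration under stated assumptions and not the SCIP/OSCAR workflow of the source: it enumerates all coefficient vectors of dimension at most ten, keeps those with a summand of degree not divisible by three, and matches \(P_v\) against \(Q_w\). All names are hypothetical.

```python
from collections import Counter
from itertools import product

PARTS = [(1, 1, 1), (1,), (1, 1), (2,)]   # S_1, ..., S_4
V, W = (-1, 0, 1), (-2, 1, 1)


def schur3(lam):
    """Monomials of s_lam in three variables via Gelfand-Tsetlin patterns."""
    l1, l2, l3 = (tuple(lam) + (0, 0, 0))[:3]
    poly = Counter()
    for m1 in range(l2, l1 + 1):
        for m2 in range(l3, l2 + 1):
            for n1 in range(m2, m1 + 1):
                poly[(n1, m1 + m2 - n1, l1 + l2 + l3 - m1 - m2)] += 1
    return poly


DIMS = [sum(schur3(p).values()) for p in PARTS]   # dimensions of S_1..S_4


def laurent(coeffs, vec):
    """Substitution of sum_i coeffs[i] * S_i at z^vec, as a hashable set."""
    total = Counter()
    for c, p in zip(coeffs, PARTS):
        for exps, m in schur3(p).items():
            total[sum(e * x for e, x in zip(exps, vec))] += c * m
    return frozenset((d, m) for d, m in total.items() if m)


def faithful(coeffs):
    """Some summand with positive coefficient has degree not divisible by 3."""
    return any(c > 0 and sum(p) % 3 != 0 for c, p in zip(coeffs, PARTS))


def dim(coeffs):
    return sum(c * d for c, d in zip(coeffs, DIMS))


def solutions(max_dim=10):
    """All pairs (lambda, mu) with P_v = Q_w, both faithful, dim <= max_dim."""
    ranges = [range(max_dim // d + 1) for d in DIMS]
    cands = [c for c in product(*ranges) if dim(c) <= max_dim and faithful(c)]
    by_q = {}
    for mu in cands:
        by_q.setdefault(laurent(mu, W), []).append(mu)
    return sorted((lam, mu) for lam in cands
                  for mu in by_q.get(laurent(lam, V), []))
```

Calling `solutions(10)` returns four pairs, among them \(\lambda =(0,0,1,1)\), \(\mu =(3,1,1,0)\), which is the point marked \(0011\,3110\).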

Solving integer linear programs is generally hard, both theoretically and in practice (Schrijver 1986). However, our integer linear program (ILP\(_k\)) has a particularly simple structure, which can be exploited computationally.

Lemma 3.5

Let \((\lambda ,\mu )\in {\mathbb {Q}}^{2k}\) be a rational point in \({\mathcal {P}}\). Then there is a positive integer \(\ell >0\) such that \((\ell \cdot \lambda ,\ell \cdot \mu )\) is a point in \({\mathcal {P}}\) which is integral.

Proof

Let \(\ell \) be the common denominator of \(\lambda _1,\lambda _2,\dots ,\mu _k\). Then \((\ell \cdot \lambda ,\ell \cdot \mu )\) is integral. The polyhedron \({\mathcal {P}}\) is the intersection of the cone \({\mathcal {C}}\) with two additional affine halfspaces. Clearly, \((\ell \cdot \lambda ,\ell \cdot \mu )\) lies in \({\mathcal {C}}\). Further, since \(\ell \ge 1\), we have \(\sum _{i:\, 3 \nmid \deg S_i} \ell \lambda _i \ge \sum _{i:\, 3 \nmid \deg S_i} \lambda _i \ge 1\), and similarly for the other inequality. Thus the point \((\ell \cdot \lambda ,\ell \cdot \mu )\) lies in \({\mathcal {P}}\cap {\mathbb {Z}}^{2k}\). \(\square \)
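The scaling step of this proof is a one-liner in exact arithmetic. The following sketch uses Python's `fractions` module; the function name `clear_denominators` is hypothetical.

```python
from fractions import Fraction
from math import lcm


def clear_denominators(point):
    """Return (ell, ell * point) with ell the common denominator.

    point is a sequence of rationals (Fraction or int); the scaled point
    has integer entries.
    """
    ell = lcm(*(Fraction(x).denominator for x in point))
    return ell, [int(Fraction(x) * ell) for x in point]
```

For example, the rational point \((1/2, 1/3, 0)\) is scaled by \(\ell =6\) to \((3,2,0)\).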

As a consequence, the integer linear program (ILP\(_k\)) is feasible if and only if its linear relaxation is. The latter condition can be tested much faster. Consequently, standard complexity bounds in linear optimization entail the following result; see Grötschel et al. (1993), Renegar (2001).

Proposition 3.6

Employing the interior point method, deciding the feasibility of the integer linear program (ILP\(_k\)) takes polynomial time in the five parameters k, \(\log |v_1|\), \(\log |v_2|\), \(\log |w_1|\), and \(\log |w_2|\).

Recall the condition \(v_1+v_2+v_3=0=w_1+w_2+w_3\), whence \(v_3\) and \(w_3\) are not mentioned. Now we can summarize how to address Question 2.3 computationally. First we pick some integer k. Then we decide the feasibility of (ILP\(_k\)) by solving the linear relaxation. If this is feasible we use a bisection to find the minimal \(k'\) such that (ILP\(_{k'}\)) is feasible. If it is infeasible we try 2k and repeat. Of course, this procedure does not terminate if no solution exists. Yet that did not occur so far.
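The doubling and bisection procedure can be sketched as follows. The feasibility test is passed in as a hypothetical callback `is_feasible(k)`, standing in for solving the linear relaxation of (ILP\(_k\)); the sketch assumes that feasibility is monotone in k (a solution for k remains one for larger k, with zero coefficients on the additional Schur polynomials) and treats \(k=0\) as infeasible. The optional bound `k_max` is an addition for illustration, since the procedure itself need not terminate.

```python
def minimal_feasible_k(is_feasible, k_start=1, k_max=None):
    """Smallest k with is_feasible(k), found by doubling and bisection.

    is_feasible: hypothetical monotone oracle k -> bool.
    Returns None if k_max is given and no feasible k <= k_max was reached.
    """
    lo, hi = 0, k_start            # lo is known (or assumed) infeasible
    while not is_feasible(hi):     # doubling phase
        lo = hi
        hi *= 2
        if k_max is not None and hi > k_max:
            return None
    while hi - lo > 1:             # bisection phase: lo infeasible, hi feasible
        mid = (lo + hi) // 2
        if is_feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

With a toy oracle such as `lambda k: k >= 37`, the function returns 37 regardless of the starting value.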

There are many implementations of algorithms for linear and integer optimization available, both open source and commercial. Yet the majority employ floating-point arithmetic, which may lead to errors and makes these software systems less suited for obtaining mathematical results. For this reason we use SCIP, which implements the simplex method in exact rational arithmetic (Bestuzheva et al. 2021). Setting up the (integer) linear program (ILP\(_k\)) is done in OSCAR, which provides partitions, Schur polynomials and the necessary commutative algebra (Decker et al. 2024; OSCAR 2023). OSCAR also inherits the full functionality of polymake (Gawrilow and Joswig 2000), which includes exact rational integer linear programming. SCIP is much faster at integer linear programming, but that part of its implementation is based on floating-point arithmetic.

Table 1 Minimal k for which \({\mathcal {P}}_k(v,w)\) is feasible, where \(v=(1,v_2,-1-v_2)\) and \(w=(1,w_2,-1-w_2)\)

3.1 Feasibility

For our first experiment, we consider pairs of vectors \(v=(v_1,v_2,-v_1-v_2)\) and \(w=(w_1,w_2,-w_1-w_2)\) such that \(v_1=w_1=1\). Such a pair \((v,w)\) is determined by the pair \((v_2,w_2)\) of integers. In Table 1, we give the minimal values of k for which \({\mathcal {P}}_k(v,w)\) is feasible, which we compute by solving the linear relaxation of (ILP\(_k\)). As pointed out in Remark 3.3, the parameter k refers to the lexicographic ordering of the Schur polynomials. That ordering does affect the value of k. That is to say, replacing the pure lexicographic ordering by, e.g., the graded lexicographic ordering may lead to a lower value of k. It is unclear whether one ordering is better than another.

3.2 Representation dimensions

For our second experiment we actually solve the integer linear program (ILP\(_k\)). We take the objective function

$$\begin{aligned} c = \left( S_1(1,1,1),\dots ,S_k(1,1,1)\right) \end{aligned}$$

to be the dimension of the representation corresponding to P; see Remark 3.3. Table 2 records pairs of vectors, the minimal value of k such that \({\mathcal {P}}_k(v,w)\) is feasible and the optimal value of the integer linear program, i.e., the smallest dimension achieved by solutions using only the first k Schur polynomials. Each row of that table corresponds to one entry in Table 1. The explicit Schur positive symmetric polynomials whose dimensions are recorded in Column 4 of Table 2 can be found on our MathRepo page (3) alongside the source code.

Table 2 Minimal k for which \({\mathcal {P}}_k(v,w)\) is feasible and the dimension of the representation that corresponds to an optimal integral solution

Note that the dimensions recorded in Table 2 might not be minimal among all solutions since they use only the first k Schur polynomials; allowing the use of more Schur polynomials can potentially provide a solution with smaller dimension.

3.3 Running times

We briefly comment on the computation time of Tables 1 and 2. Computing all entries in Table 1 took in total approximately 400,000 s (4.6 days). Optimal solutions in Table 2 are computed in SCIP, via floating-point arithmetic, and then verified in OSCAR, via exact arithmetic. Verification is fast and succeeded in all our cases. The longest computation was for the pair \((1,5,-6)\) and \((1,7,-8)\), which took 312 s in SCIP. Computations for pairs with \(k > 125\) did not terminate within a day.

All computations were done on the computer server Hydra at the MPI MiS, with the following system specifics: 4x16-core Intel Xeon E7-8867 v3 CPU (3300 MHz) on Debian GNU/Linux 5.10.149-2 (2022-10-21) x86_64.

Remark 3.7

In principle, the optimal (rational) solutions to the linear programming relaxations leading to Table 1 yield an upper bound on the smallest dimension of a representation of the amalgamation problem Question 2.3. However, these numbers are excessively large. For example, for \(v=(1,9,-10)\) and \(w=(1,10,-11)\) the bound we obtain is 2382041666750207. This is one of the smaller ones. Therefore, it is not desirable to provide a complete table here. However, using the Jupyter notebook available on the MathRepo page (3) the interested reader can compute some of these numbers by themselves.

4 A relaxed problem

In this section, we consider the following relaxed problem by dropping the Schur positivity condition and disregarding the case \((-1,0,1)\). Recall that the Schur polynomials form a basis of the space of all symmetric polynomials; see Fulton and Harris (1991, §A.1).

Question 4.1

Let \(v = (v_1,v_2,v_3)\) be such that \(v_1+v_2+v_3=0\), \(v_1v_2v_3\ne 0\) and \(\gcd (v_1,v_2,v_3) = 1\). For which Laurent polynomials \(F\in {\mathbb {Q}}[z^\pm ]\) can we find a symmetric polynomial P in three variables such that \(F=P_v(z)\)?

Remark 4.2

We pose the additional condition \(v_1v_2v_3\ne 0\), which excludes the case \((v_1,v_2,v_3)=(-1,0,1)\), because our argument does not apply to that case, see Remark 4.13.

From now on fix a triplet \(v = (v_1,v_2,v_3)\) such that \(v_1+v_2+v_3=0\), \(v_1v_2v_3\ne 0\) and \(\gcd (v_1,v_2,v_3) = 1\). Since every symmetric polynomial in three variables can be written as a polynomial in the first three elementary symmetric polynomials

$$\begin{aligned} e_1=x_1+x_2+x_3,\ e_2=x_1x_2+x_1x_3+x_2x_3,\ e_3=x_1x_2x_3 \end{aligned}$$

in \({\mathbb {Q}}[x_1,x_2,x_3]\), and because \(v_1+v_2+v_3=0\) implies that \((e_3)_v(z)=1\), answering Question 4.1 amounts to characterizing the \({\mathbb {Q}}\)-subalgebra A(v) of the Laurent polynomial ring \({\mathbb {Q}}[z^\pm ]\) generated by

$$\begin{aligned} F_1:=(e_1)_v(z)=z^{v_1}+z^{v_2}+z^{v_3} \quad \text {and} \quad F_2:=(e_2)_v(z)=z^{v_1+v_2}+z^{v_1+v_3}+z^{v_2+v_3}. \end{aligned}$$

Since we have \(F_1'(1)=F_2'(1)=0\), the product rule implies that \(F'(1)=0\) holds for all \(F\in A(v)\). This shows that A(v) is a proper subalgebra of \({\mathbb {Q}}[z^\pm ]\). As the next example shows, this is in general not the only constraint.

Example 4.3

Consider the case \(v=(1,1,-2)\), and let \(\xi \in {\mathbb {C}}\) be a primitive third root of unity. We have \(F_1'=2\cdot (1-z^{-3})\) and \(F_2'=2z\cdot (1-z^{-3})\) and this shows \(F_1'(\xi )=F_2'(\xi )=0\). Again this shows that \(F'(\xi )=0\) for all \(F\in A(1,1,-2)\). One can prove that these are all constraints in this case:

$$\begin{aligned} A(1,1,-2)=\{F\in {\mathbb {Q}}[z^\pm ]\mid F'(1)=F'(\xi )=F'(\xi ^2)=0 \}. \end{aligned}$$
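The necessary conditions of this example are easy to test numerically. The sketch below stores Laurent polynomials as dictionaries from exponents to coefficients and checks that derivatives of products of \(F_1\) and \(F_2\) vanish at the three cube roots of unity; the helper names are hypothetical, and the check only illustrates the inclusion of \(A(1,1,-2)\) in the right-hand side, not the reverse inclusion.

```python
import cmath


def multiply(F, G):
    """Product of two Laurent polynomials given as {exponent: coefficient}."""
    out = {}
    for d, a in F.items():
        for e, b in G.items():
            out[d + e] = out.get(d + e, 0) + a * b
    return out


def derivative(F):
    """Formal derivative with respect to z."""
    return {d - 1: d * c for d, c in F.items() if d != 0}


def evaluate(F, z):
    """Value of the Laurent polynomial at a non-zero complex number z."""
    return sum(c * z ** d for d, c in F.items())


F1 = {1: 2, -2: 1}    # 2z + z^{-2}   for v = (1, 1, -2)
F2 = {2: 1, -1: 2}    # z^2 + 2z^{-1}

xi = cmath.exp(2j * cmath.pi / 3)
ROOTS = [1, xi, xi ** 2]
SAMPLES = [F1, F2, multiply(F1, F2), multiply(F1, F1),
           multiply(multiply(F1, F2), F2)]
MAX_DEFECT = max(abs(evaluate(derivative(G), r))
                 for G in SAMPLES for r in ROOTS)
```

Here `MAX_DEFECT` is zero up to rounding, whereas the Laurent polynomial \(z\) has derivative 1 everywhere and therefore does not lie in \(A(1,1,-2)\).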

Our main contribution in this section is the following rather technical result which says that A(v) can, in general, be characterized by conditions similar to those in Example 4.3.

Theorem 4.4

There is a product \(\Phi \in {\mathbb {Q}}[z]\) of cyclotomic polynomials with \(\Phi (1)\ne 0\) and a subalgebra C of \({\mathbb {Q}}[z^\pm ]/(\Phi )\) such that for \(F\in {\mathbb {Q}}[z^\pm ]\) the following are equivalent:

  1. There is a symmetric polynomial P in three variables with rational coefficients such that \(F=P_v(z)\).

  2. We have \(F'(1)=0\), and the residue class of F modulo \(\Phi \) is in C.

Remark 4.5

The subalgebra C of \({\mathbb {Q}}[z^\pm ]/(\Phi )\) in Theorem 4.4 is the one generated by the residue classes of \(F_1\) and \(F_2\). Since \({\mathbb {Q}}[z^\pm ]/(\Phi )\) is a finite dimensional \({\mathbb {Q}}\)-vector space, this can be explicitly calculated once knowing \(\Phi \).

Before we give a proof of Theorem 4.4, we point out some consequences that are less technical.

Corollary 4.6

There are finitely many roots of unity \(\zeta _1,\ldots ,\zeta _r\in {\mathbb {C}}{\setminus }\{1\}\) and natural numbers \(a_1,\ldots ,a_r\) such that every \(F\in {\mathbb {Q}}[z^\pm ]\) with \(F'(1)=0\) which vanishes at \(\zeta _i\) with multiplicity at least \(a_i\) for \(i=1,\ldots ,r\) can be expressed as \(F=P_v(z)\) for some symmetric polynomial P in three variables.

Proof

Let \(\Phi \in {\mathbb {Q}}[z]\) be the polynomial from Theorem 4.4 and let

$$\begin{aligned} \Phi =\Phi _1^{a_1}\cdots \Phi _s^{a_s} \end{aligned}$$

where the \(\Phi _i\) are pairwise coprime cyclotomic polynomials. If \(F\in {\mathbb {Q}}[z^\pm ]\) vanishes at the zeros of each \(\Phi _i\) with multiplicity at least \(a_i\), then F is divisible by \(\Phi \). Thus the residue class of F modulo \(\Phi \) is zero and hence contained in every subalgebra of \({\mathbb {Q}}[z^\pm ]/(\Phi )\). \(\square \)

Corollary 4.7

There are natural numbers \(a_0,b_0>0\) such that for all \(a\ge a_0\), all b divisible by \(b_0\) we have

$$\begin{aligned} F_{a,b}=(1+z+\cdots +z^{b-1})^a\cdot (1+z^{-1}+\cdots +z^{-(b-1)})^a\in A(v). \end{aligned} \tag{4}$$

Proof

Let \(\zeta _1,\ldots ,\zeta _r\) and \(a_1,\ldots ,a_r\) be as in Corollary 4.6. Choose \(b_0\) such that \(\zeta _i^{b_0}=1\) for all \(i=1,\ldots ,r\), and choose \(a_0\ge \frac{1}{2}\max _{i=1}^r (a_i)\). Then for all \(a\ge a_0\) and all b divisible by \(b_0\) the Laurent polynomial \(F_{a,b}\) vanishes at \(\zeta _i\) with multiplicity at least \(a_i\) for \(i=1,\ldots ,r\). A straightforward calculation further shows that \(F'_{a,b}(1)=0\). \(\square \)
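The two facts used in this proof can be spelled out as follows; this is a short sketch of the omitted calculation and is not taken from the source. Since \(1+z+\cdots +z^{b-1}=(z^b-1)/(z-1)\) has a simple zero at every b-th root of unity different from 1, each of the two factors of \(F_{a,b}\) vanishes at \(\zeta _i\) with multiplicity exactly a, which gives multiplicity \(2a\) in total. Moreover, \(F_{a,b}(z)=F_{a,b}(z^{-1})\), and differentiating this identity yields

$$\begin{aligned} F'_{a,b}(z) = -z^{-2}\,F'_{a,b}(z^{-1}), \quad \text {hence} \quad F'_{a,b}(1) = -F'_{a,b}(1) = 0. \end{aligned}$$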

Corollary 4.8

Consider finitely many triplets

$$\begin{aligned} t_1,\ldots ,t_r\in \{(\alpha ,\beta ,\gamma )\in {\mathbb {Z}}^3 \mid \alpha +\beta +\gamma =0,\ \alpha \beta \gamma \ne 0\,\text { and }\gcd (\alpha ,\beta ,\gamma ) = 1\}. \end{aligned}$$

Then there are natural numbers a, b such that \(F_{a,b}\in \bigcap _{i=1}^r A(t_i)\).

Proof

For each \(i\in \{1,\ldots ,r\}\) we obtain \(a_0\) and \(b_0\) as in Corollary 4.7. We can choose a as the maximum of all such \(a_0\) and b as the product of all such \(b_0\). \(\square \)

In fact, we conjecture that the Laurent polynomials \(F_{a,b}\) in Eq. (4) can even be realized as positive rational linear combinations of Schur polynomials.

Conjecture 4.9

There are natural numbers \(a_0,b_0>0\) such that for all \(a\ge a_0\), all b divisible by \(b_0\) there is \(N\in {\mathbb {N}}\) and a Schur positive symmetric polynomial P in three variables such that

$$\begin{aligned} N\cdot F_{a,b}= P_v. \end{aligned}$$

In order to amalgamate two representations given by tuples v and w, let \(a_0,b_0,N\) and \(a_0',b_0',N'\) be the natural numbers from the previous conjecture for v and w, respectively. Then, if Conjecture 4.9 is true, letting \(a=\max (a_0,a_0')\), \(b=\textrm{lcm}(b_0,b_0')\) and \(M=\textrm{lcm}(N,N')\), we have

$$\begin{aligned} P_v=M\cdot F_{a,b}=Q_{w} \end{aligned}$$

for Schur positive symmetric polynomials P and Q.

Remark 4.10

In the case \(v = (1,1,-2)\) our computational experiments suggest that Conjecture 4.9 is true for \(a_0=1\) and \(b_0=3\).

We found N and Schur-positive symmetric polynomials \(P_v\) that satisfy \(N\cdot F_{a,b} = P_v\) for various pairs of a and b. We record the values of N and the dimensions of \(P_v\) in Table 3.

The computations are similar to those performed in the previous section. Given a, b and N, we obtain a Laurent polynomial \(N\cdot F_{a,b}\). The degree of this polynomial gives an upper bound on the degrees of the Schur polynomials that can appear in \(P_v\). We then take all available Schur polynomials and solve an integral linear program like before. The source code and explicit polynomials \(P_v\) can be found on our MathRepo page (3).

Table 3 Experimental data on Conjecture 4.9 with \(v = (1,1,-2)\)

4.1 Proof of Theorem 4.4

Our proof involves some algebraic geometry; see the textbooks by Hartshorne (1977) and Harris (1995). We consider the polynomial map

$$\begin{aligned} f:{\mathbb {C}}^*\rightarrow {\mathbb {C}}^2,\, z\mapsto \left( F_1(z),F_2(z)\right) =(z^{v_1}+z^{v_2}+z^{v_3},z^{v_1+v_2}+z^{v_1+v_3}+z^{v_2+v_3}). \end{aligned}$$
(5)

We first study where this map fails to be injective.

Lemma 4.11

We have \(f^{-1}(f(1))=\{1\}\).

Proof

Let \(x\in {\mathbb {C}}^*\) such that \(f(x)=f(1)\). This implies

$$\begin{aligned} (t-x^{v_1})(t-x^{v_2})(t-x^{v_3})=t^3-F_1(x)t^2+F_2(x)t -1=t^3-F_1(1)t^2+F_2(1)t-1=(t-1)^3, \end{aligned}$$

which entails \(x^{v_1}=x^{v_2}=x^{v_3}=1\). Since \(\gcd (v_1,v_2,v_3)=1\), we get \(x=1\). \(\square \)

For a complex number \(x\in {\mathbb {C}}\), let \(|x|=\sqrt{x\cdot \overline{x}}\) be its norm.

Lemma 4.12

For \(|x|\ne 1\) we have \(|f^{-1}(f(x))|=1\).

Proof

Let \(y\in {\mathbb {C}}^*\) such that \(f(y)=f(x)\). This implies that the zeros of the polynomial

$$\begin{aligned} (t-y^{v_1})(t-y^{v_2})(t-y^{v_3}) \end{aligned}$$

are the three complex numbers \(x^{v_1}\), \(x^{v_2}\) and \(x^{v_3}\). If \(|x|>1\), then \(|y|>1\) as well. Indeed, if two of the three integers \(v_1,v_2,v_3\) are positive, then two of the three real numbers \(|x|^{v_1}\), \(|x|^{v_2}\) and \(|x|^{v_3}\) are larger than one and thus the same must hold for the real numbers \(|y|^{v_1}\), \(|y|^{v_2}\) and \(|y|^{v_3}\). Otherwise two of the three integers \(v_1,v_2,v_3\) are negative, and a similar argument applies. Since for every real \(t>1\) the map \(d\mapsto t^d\) is strictly increasing, we must have \(y^{v_1}=x^{v_1}\), \(y^{v_2}=x^{v_2}\) and \(y^{v_3}=x^{v_3}\). This implies \(f(\frac{y}{x})=f(1)\), and hence Lemma 4.11 yields \(y=x\). The case \(|x|<1\) is analogous. \(\square \)

Remark 4.13

The statement of Lemma 4.12 is not true in the case \((v_1,v_2,v_3)=(-1,0,1)\). Indeed, in this case the preimage of f(x) under the map

$$\begin{aligned} f:{\mathbb {C}}^*\rightarrow {\mathbb {C}}^2,\, z\mapsto (z^{v_1}+z^{v_2}+z^{v_3}, z^{v_1+v_2}+z^{v_1+v_3}+z^{v_2+v_3})=(z^{-1}+1+z,z^{-1}+1+z) \end{aligned}$$
(6)

has two elements for all \(x\in {\mathbb {C}}^*{\setminus }\{-1,1\}\). This is why we have excluded this case.

We denote by B(v) the \({\mathbb {C}}\)-subalgebra of \({\mathbb {C}}[z^\pm ]\) generated by \(F_1\) and \(F_2\). Note that \(B(v)=A(v) \otimes _{{\mathbb {Q}}}{\mathbb {C}}\) and \(A(v)=B(v)\cap {\mathbb {Q}}[z^\pm ]\).

Lemma 4.14

The ring extension \(B(v)\subset {\mathbb {C}}[z^\pm ]\) is finite, i.e., \({\mathbb {C}}[z^\pm ]\) is finitely generated as a B(v)-module.

Proof

For \(t\in \{z^{v_1},z^{v_2},z^{v_3}\}\) we have

$$\begin{aligned} t^3-F_1t^2+F_2t-1=(t-z^{v_1})(t-z^{v_2})(t-z^{v_3})=0. \end{aligned}$$

This implies that \(t^k\), for \(k\in {\mathbb {N}}\), is contained in the B(v)-module that is generated by \(1,t,t^2\). Thus \({\mathbb {C}}[z^{v_1},z^{v_2},z^{v_3}]\) is equal to the B(v)-module that is generated by

$$\begin{aligned} \{z^{av_1}z^{bv_2}z^{cv_3}\mid 0\le a,b,c\le 2\}. \end{aligned}$$

Now it remains to show that \({\mathbb {C}}[z^\pm ]={\mathbb {C}}[z^{v_1},z^{v_2},z^{v_3}]\). The inclusion “\(\supset \)” is clear. Since \(\gcd (v_1,v_2,v_3)=1\), there are integers a, b, c such that

$$\begin{aligned} av_1+bv_2+cv_3=1. \end{aligned}$$

Since \(v_1+v_2+v_3=0\) we also have

$$\begin{aligned} (a+m)v_1+(b+m)v_2+(c+m)v_3=1 \end{aligned}$$

for every \(m\in {\mathbb {Z}}\). In particular, we can find natural numbers \(a',b',c'\) such that

$$\begin{aligned} a'v_1+b'v_2+c'v_3=1 \end{aligned}$$

meaning that \(z=(z^{v_1})^{a'}(z^{v_2})^{b'}(z^{v_3})^{c'}\in {\mathbb {C}}[z^{v_1},z^{v_2},z^{v_3}]\). Analogously, it can be proved that \(z^{-1}\in {\mathbb {C}}[z^{v_1},z^{v_2},z^{v_3}]\). \(\square \)
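To illustrate the last step of this proof, the following sketch searches for nonnegative exponents expressing \(z\) and \(z^{-1}\) as monomials in \(z^{v_1},z^{v_2},z^{v_3}\). It uses a plain brute-force search rather than the shifting argument of the proof; the function name `nonneg_combination` and the search bound are assumptions made for illustration only.

```python
from itertools import product

def nonneg_combination(v, target, bound=20):
    """Find (a, b, c) with 0 <= a, b, c <= bound and a*v1 + b*v2 + c*v3 == target.

    Returns the first solution in lexicographic order, or None if there is
    none within the bound.
    """
    for coeffs in product(range(bound + 1), repeat=3):
        if sum(k * vi for k, vi in zip(coeffs, v)) == target:
            return coeffs
    return None

v = (1, 2, -3)
print(nonneg_combination(v, 1))   # (0, 2, 1): z = (z^2)^2 * z^-3
print(nonneg_combination(v, -1))  # (0, 1, 1): z^-1 = z^2 * z^-3
```

Because \(v_1+v_2+v_3=0\), adding the same natural number to all three exponents gives further solutions, which is the mechanism used in the proof.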

The \({\mathbb {C}}\)-algebra B(v) is the coordinate ring of the algebraic curve \(X\subset {\mathbb {C}}^2\) cut out by the elements of the kernel of the map

$$\begin{aligned} {\mathbb {C}}[x,y]\rightarrow B(v),\, P\mapsto P(F_1,F_2); \end{aligned}$$

we denote the quotient field of B(v) by K. Lemma 4.14 implies that X is, in fact, the image of f because finite morphisms are closed (Hartshorne 1977, Exc. II.4.1).

Proposition 4.15

There are only finitely many \(x\in {\mathbb {C}}^*\) such that \(|f^{-1}(f(x))|>1\). All of them are roots of unity. Moreover, we have \(K={\mathbb {C}}(z)\), the rational function field.

Proof

For every \(x\in {\mathbb {C}}^*\) the fiber \(f^{-1}(f(x))\) is Zariski closed in \({\mathbb {C}}^*\). The Zariski closed subsets of \({\mathbb {C}}^*\) are either finite or all of \({\mathbb {C}}^*\). Therefore, since f is not constant, every fiber \(f^{-1}(f(x))\) is finite. By Harris (1995, Proposition 7.16) the field extension \({\mathbb {C}}(z)/K\) is finite and there is a nonempty Zariski open subset \(U\subset {\mathbb {C}}^*\) such that \(|f^{-1}(f(x))|=[{\mathbb {C}}(z):K]\) for all \(x\in U\). This implies that \(|f^{-1}(f(x))|=[{\mathbb {C}}(z):K]\) is true for all but finitely many \(x\in {\mathbb {C}}^*\). Lemma 4.12 thus shows that \([{\mathbb {C}}(z):K]=1\) and that each of the finitely many \(x\in {\mathbb {C}}^*\) with \(|f^{-1}(f(x))|>1\) must satisfy \(|x|=1\). Moreover, since f is defined over \({\mathbb {Q}}\), all such x are algebraic numbers and therefore roots of unity. \(\square \)

The situation is very similar for ramification points of f.

Proposition 4.16

If \(x\in {\mathbb {C}}^*\) is not a root of unity, then f is unramified at x.

Proof

Since every power sum in three variables can be written as a polynomial in the three elementary symmetric polynomials, and since the third elementary symmetric polynomial takes the value \(x^{v_1+v_2+v_3}=1\) at \((x^{v_1},x^{v_2},x^{v_3})\), there is for every \(n\in {\mathbb {N}}\) a polynomial map \(\varphi _n:{\mathbb {C}}^2\rightarrow {\mathbb {C}}^n\) such that

$$\begin{aligned} \varphi _n(f(x))=(x^{v_1}+x^{v_2}+x^{v_3}, x^{2v_1}+x^{2v_2}+x^{2v_3},\ldots ,x^{nv_1}+x^{nv_2}+x^{nv_3}). \end{aligned}$$

If f is ramified at \(x\in {\mathbb {C}}^*\), then \(\varphi _n\circ f\) is also ramified at x for every \(n\in {\mathbb {N}}\). Thus

$$\begin{aligned} kv_1x^{kv_1-1}+kv_2x^{kv_2-1}+kv_3x^{kv_3-1}=0 \end{aligned}$$

for all \(k\in {\mathbb {N}}\). As x and k are nonzero, this implies that

$$\begin{aligned} v_1x^{kv_1}+v_2x^{kv_2}+v_3x^{kv_3}=0 \end{aligned}$$

for all \(k\in {\mathbb {N}}\). This means that \(x^k\) is a zero of the nonconstant Laurent polynomial

$$\begin{aligned} v_1z^{v_1}+v_2z^{v_2}+v_3z^{v_3}\in {\mathbb {Q}}[z^\pm ] \end{aligned}$$

for all \(k\in {\mathbb {N}}\). Hence the set

$$\begin{aligned} \{x^k\mid k\in {\mathbb {N}}\} \end{aligned}$$

is finite which implies that x is a root of unity. \(\square \)
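For a concrete tuple v the ramification points can be computed directly: f is ramified at x exactly when \(F_1'(x)=F_2'(x)=0\), so the ramification points are the zeros of the greatest common divisor of the two derivatives after clearing negative powers of z. The following is a hedged sketch of this computation with exact rational arithmetic; the helper names and the list representation (lowest degree first) are our own choices and not taken from the authors' code.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (p is a list, lowest degree first)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_mod(p, q):
    """Remainder of p divided by the nonzero polynomial q over the rationals."""
    p = trim(list(p))
    while len(p) >= len(q):
        factor = p[-1] / q[-1]
        shift = len(p) - len(q)
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
        p = trim(p)
    return p

def poly_gcd(p, q):
    """Monic greatest common divisor via the Euclidean algorithm."""
    p, q = trim(list(p)), trim(list(q))
    while q:
        p, q = q, poly_mod(p, q)
    return [c / p[-1] for c in p]

def cleared_derivative(exponents):
    """Coefficients of z^s * d/dz(sum of z^e), with s clearing negative powers."""
    terms = [(e - 1, e) for e in exponents if e != 0]
    low = min(d for d, _ in terms)
    coeffs = [Fraction(0)] * (max(d for d, _ in terms) - low + 1)
    for d, c in terms:
        coeffs[d - low] += c
    return coeffs

def ramification_polynomial(v):
    """Monic polynomial whose zeros in C^* are the ramification points of f."""
    v1, v2, v3 = v
    dF1 = cleared_derivative([v1, v2, v3])
    dF2 = cleared_derivative([v1 + v2, v1 + v3, v2 + v3])
    return poly_gcd(dF1, dF2)

print(ramification_polynomial((1, 1, -2)) == [-1, 0, 0, 1])  # True: z^3 - 1
print(ramification_polynomial((1, 2, -3)) == [-1, 1])        # True: z - 1
```

In both cases all zeros are roots of unity, in line with Proposition 4.16: the third roots of unity for \(v=(1,1,-2)\), and only the point 1 for \(v=(1,2,-3)\).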

Corollary 4.17

There is a finite set \(S\subset {\mathbb {C}}^*\) of roots of unity such that

$$\begin{aligned} {\mathbb {C}}^*{\setminus } S\rightarrow X{\setminus } f(S),\, x\mapsto f(x) \end{aligned}$$

is an isomorphism.

Proof

This follows from Propositions 4.15, 4.16 and Lemma 4.14 by Harris (1995, Thm. 14.9). \(\square \)

Remark 4.18

The smallest set \(S\subset {\mathbb {C}}^*\), such that

$$\begin{aligned} {\mathbb {C}}^*{\setminus } S\rightarrow X{\setminus } f(S),\, x\mapsto f(x) \end{aligned}$$

is an isomorphism, is the preimage of the singular locus of the curve X under f.

Example 4.19

Let \(v=(1,1,-2)\). Then X is the zero set of the bivariate quartic polynomial

$$\begin{aligned} x_1^2 x_2^2-4 x_1^3-4 x_2^3+18 x_1 x_2-27 \end{aligned}$$

in \({\mathbb {C}}^2\). The three points \(f(1)=(3,3)\), \(f(\xi )=(3\xi ,3\xi ^2)\) and \(f(\xi ^2)=(3\xi ^2,3\xi )\) form the singular locus. In particular, its preimage under f is the set of third roots of unity. See Fig. 2. Note that this motivates the choice \(b_0=3\) in Remark 4.10.

Fig. 2 The real locus of X for \(v = (1,1,-2)\) plotted in the plane
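The equation in Example 4.19 can be confirmed by substituting \(x_1=F_1\) and \(x_2=F_2\) and expanding. The sketch below does this with integer arithmetic on Laurent polynomials stored as dictionaries from exponents to coefficients; the helper names `mul`, `power`, `power_sum` and `substitute` are illustrative assumptions of ours.

```python
from collections import defaultdict
from itertools import combinations

def mul(p, q):
    """Multiply two Laurent polynomials stored as {exponent: coefficient}."""
    out = defaultdict(int)
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] += a * b
    return {e: c for e, c in out.items() if c != 0}

def power(p, n):
    """Raise a Laurent polynomial to the n-th power (n >= 0)."""
    result = {0: 1}
    for _ in range(n):
        result = mul(result, p)
    return result

def power_sum(exponents):
    """Laurent polynomial given by the sum of z^e over the exponents."""
    p = defaultdict(int)
    for e in exponents:
        p[e] += 1
    return dict(p)

def substitute(poly, F1, F2):
    """Evaluate a bivariate polynomial {(i, j): coeff} at x1 = F1, x2 = F2."""
    total = defaultdict(int)
    for (i, j), coeff in poly.items():
        term = mul(power(F1, i), power(F2, j))
        for e, c in term.items():
            total[e] += coeff * c
    return {e: c for e, c in total.items() if c != 0}

v = (1, 1, -2)
F1 = power_sum(v)                                      # 2z + z^-2
F2 = power_sum(a + b for a, b in combinations(v, 2))   # z^2 + 2z^-1

# x1^2 x2^2 - 4 x1^3 - 4 x2^3 + 18 x1 x2 - 27
quartic = {(2, 2): 1, (3, 0): -4, (0, 3): -4, (1, 1): 18, (0, 0): -27}

print(substitute(quartic, F1, F2))  # {} : the quartic vanishes identically on the image of f
```

The quartic is the discriminant of \(t^3-x_1t^2+x_2t-1\), which explains why it vanishes here: for \(v=(1,1,-2)\) the cubic \((t-z)(t-z)(t-z^{-2})\) always has a double root.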

Example 4.20

For \(v=(1,2,-3)\) one computes that X is cut out by

$$\begin{aligned}&x_1^3 x_2^3-x_1^5-3 x_1^4 x_2-3 x_1 x_2^4-x_2^5 -x_1^4+5 x_1^3 x_2+10 x_1^2 x_2^2+5 x_1 x_2^3- x_2^4 \\&\quad +x_1^3 -x_1^2 x_2-x_1 x_2^2+x_2^3-7 x_1^2-13 x_1 x_2-7 x_2^2. \end{aligned}$$

The preimage of its singular locus under f is the set of all third roots of unity along with the set of primitive seventh and eighth roots of unity.
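The non-injectivity part of this statement can be checked combinatorially. As in the proof of Lemma 4.11, the value f(x) determines the multiset \(\{x^{v_1},x^{v_2},x^{v_3}\}\), and for \(x=e^{2\pi i j/n}\) this multiset is encoded by the residues \(jv_i \bmod n\). Moreover, since \(\gcd (v_1,v_2,v_3)=1\), two points with the same image have the same order. The following sketch (our own illustration; the function name `colliding_orders` and the search bound 60 are assumptions) lists the orders n for which two distinct primitive n-th roots of unity have the same image.

```python
from math import gcd

def colliding_orders(v, max_order):
    """Orders n <= max_order such that two distinct primitive n-th roots
    of unity are mapped to the same point by f."""
    orders = []
    for n in range(2, max_order + 1):
        seen = set()
        for j in range(1, n):
            if gcd(j, n) != 1:
                continue  # exp(2*pi*i*j/n) is not a primitive n-th root
            # Multiset {x^v1, x^v2, x^v3} encoded by exponents modulo n.
            key = tuple(sorted((j * vi) % n for vi in v))
            if key in seen:
                orders.append(n)
                break
            seen.add(key)
    return orders

print(colliding_orders((1, 2, -3), 60))  # [3, 7, 8]
print(colliding_orders((1, 1, -2), 60))  # []
```

For \(v=(1,2,-3)\) this recovers the orders 3, 7 and 8 named in Example 4.20; the point 1 does not appear in this list because it is a ramification point rather than a point of non-injectivity. For \(v=(1,1,-2)\) the map f is injective, so all three singular points of Example 4.19 come from ramification.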

Recall that the conductor of the ring extension \(B(v)\subset {\mathbb {C}}[z^\pm ]\) is defined as

$$\begin{aligned} I=\{a\in B(v)\mid a\cdot {\mathbb {C}}[z^\pm ]\subset B(v)\} \end{aligned}$$

which is an ideal in both B(v) and \({\mathbb {C}}[z^\pm ]\). The zero set of I is contained in the locus where f fails to be an isomorphism (Bourbaki 1998, p. 316). Thus by Corollary 4.17 and because \({\mathbb {C}}[z^\pm ]\) is a principal ideal domain, it follows that I is generated by a Laurent polynomial all of whose zeros are roots of unity. Since f is defined over \({\mathbb {Q}}\), this Laurent polynomial is also defined over \({\mathbb {Q}}\). Therefore, we can write the generator of I as \((z-1)^m\cdot \Phi \) where \(\Phi \) is a product of cyclotomic polynomials with \(\Phi (1)\ne 0\).

Lemma 4.21

We have \(m=2\).

Proof

Recall from (5) that X is the image of the map

$$\begin{aligned} f:{\mathbb {C}}^*\rightarrow {\mathbb {C}}^2,\, z\mapsto \left( F_1(z),F_2(z)\right) =(z^{v_1}+z^{v_2}+z^{v_3},z^{v_1+v_2}+z^{v_1+v_3}+z^{v_2+v_3}). \end{aligned}$$

We have \(F_1'(1)=F_2'(1)=0\) and \(F_1''(1)=F_2''(1)=v_1^2+v_2^2+v_3^2\ne 0\). Together with Lemma 4.11 this implies that X has an ordinary cusp at the image of 1. The coordinate ring of X is B(v). Proposition 4.15 and Lemma 4.14 imply that \({\mathbb {C}}[z^\pm ]\) is the integral closure of B(v). In this situation the conductor has been computed in Fulton (2004, Proposition 1). \(\square \)
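The derivative identities used in this proof follow from a short computation, which we spell out as a sketch of our own; it uses only \(v_1+v_2+v_3=0\), which gives \(F_2(z)=z^{-v_1}+z^{-v_2}+z^{-v_3}\):

$$\begin{aligned} F_1'(1)&=\sum _{i=1}^3 v_i=0,&F_1''(1)&=\sum _{i=1}^3 v_i(v_i-1)=\sum _{i=1}^3 v_i^2,\\ F_2'(1)&=-\sum _{i=1}^3 v_i=0,&F_2''(1)&=\sum _{i=1}^3 v_i(v_i+1)=\sum _{i=1}^3 v_i^2. \end{aligned}$$

One way to see that the cusp is ordinary, under the standing assumption \(v_1v_2v_3\ne 0\), is to compare third derivatives:

$$\begin{aligned} F_1'''(1)-F_2'''(1)=\sum _{i=1}^3 \bigl (v_i(v_i-1)(v_i-2)+v_i(v_i+1)(v_i+2)\bigr )=2\sum _{i=1}^3 v_i^3=6v_1v_2v_3\ne 0, \end{aligned}$$

where the last equality uses \(v_1^3+v_2^3+v_3^3=3v_1v_2v_3\) for \(v_1+v_2+v_3=0\). Hence the second and third derivative vectors of f at 1 are linearly independent.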

As we are actually interested in polynomials with rational coefficients, we consider the \({\mathbb {Q}}\)-subalgebra A(v) of \({\mathbb {Q}}[z^\pm ]\) generated by \(F_1\) and \(F_2\), and we let \(I'\) be the ideal of \({\mathbb {Q}}[z^\pm ]\) generated by \((z-1)^m\cdot \Phi \).

Corollary 4.22

The ideal \(I'\) is the conductor of the ring extension \(A(v)\subset {\mathbb {Q}}[z^\pm ]\):

$$\begin{aligned} I'=\{a\in A(v)\mid a\cdot {\mathbb {Q}}[z^\pm ]\subset A(v)\}. \end{aligned}$$

Proof

This follows from \(I'=I\cap {\mathbb {Q}}[z^\pm ]\) and \(A(v)=B(v)\cap {\mathbb {Q}}[z^\pm ]\). \(\square \)

Let \(C_0\subset {\mathbb {Q}}[z^\pm ]/I'\) be the image of A(v) modulo \(I'\).

Lemma 4.23

Let \(g\in {\mathbb {Q}}[z^\pm ]\). Then we have \(g\in A(v)\) if and only if the residue class of g modulo \(I'\) is in \(C_0\).

Proof

One direction is trivial, so let us assume that the residue class of g modulo \(I'\) is in \(C_0\). Then there are \(h_1\in A(v)\) and \(h_2\in I'\) such that \(g=h_1+h_2\). Thus \(g\in A(v)\) because \(I'\subset A(v)\). \(\square \)

By the Chinese remainder theorem we can naturally identify

$$\begin{aligned} {\mathbb {Q}}[z^\pm ]/I'=({\mathbb {Q}}[z^\pm ]/(z-1)^2)\times ({\mathbb {Q}}[z^\pm ]/\Phi ). \end{aligned}$$

Lemma 4.24

We have \(C_0={\mathbb {Q}}\times C\), where

$$\begin{aligned} C=\{g\in {\mathbb {Q}}[z^\pm ]/\Phi \mid \exists h\in {\mathbb {Q}}[z^\pm ]/(z-1)^2 :(h,g)\in C_0\}. \end{aligned}$$

Proof

Since \(F_i(1)=3\) and the derivative of \(F_i\) vanishes at 1, we have

$$\begin{aligned} F_i\equiv 3\mod (z-1)^2 \end{aligned}$$

for \(i=1,2\). This shows the inclusion "\(\subset \)". For the reverse inclusion we observe that, by Lemma 4.11, there is a polynomial \(G\in {\mathbb {Q}}[x_1,x_2]\) that vanishes on \(f(\zeta )\) for all zeros \(\zeta \) of \(\Phi \) but \(G(f(1))=1\). The residue class modulo \(I'\) of a large enough power of \(G(F_1,F_2)\in A(v)\subset {\mathbb {Q}}[z^\pm ]\) is then (1, 0). In particular, this shows that \((1,0)\in C_0\) and proves the claim. \(\square \)

Proof of Theorem 4.4

Let \(F\in {\mathbb {Q}}[z^\pm ]\). Then by Lemma 4.23 and Lemma 4.24 we have that F lies in A(v) if and only if the residue class of F modulo \(\Phi \) is in C and the residue class of F modulo \((z-1)^2\) lies in \({\mathbb {Q}}\). The latter condition is equivalent to \(F'(1)=0\), which implies the claim. \(\square \)

Example 4.25

In the case \(v=(1,2,-3)\), by Example 4.20 the smallest b such that \(F_{a,b}\) can possibly be divisible by the polynomial Q from Theorem 4.4 for some \(a\in {\mathbb {N}}\) is \(b=3\cdot 7\cdot 8=168\).

5 Conclusion and outlook

As indicated already in the remarks after Question 20 in Bergman (1987), a similar strategy applies also to the cases \(A=B=\textrm{SU}(n)\). The only difference is that we now have to consider exponent vectors \((v_1,\dots ,v_n) \in {\mathbb {Z}}^n\) satisfying \(v_1+\cdots + v_n=0\) and \(\gcd (v_1,\dots ,v_n)=1\) and Schur polynomials in n variables. Putting it more precisely, we obtain the following question:

Question 5.1

Let \(v = (v_1,\dots ,v_n)\) and \(w = (w_1,\dots ,w_n)\) be integer vectors satisfying \(v_1+\cdots +v_n=w_1+\cdots +w_n=0\) and \(\gcd (v_1,\dots , v_n) = \gcd (w_1,\dots ,w_n) = 1\). Can we find Schur positive symmetric polynomials P and Q in n variables such that

  (1) \(P_v(z) = Q_w(z)\),

  (2) \(P(\xi ,\dots ,\xi ) \ne P(1,\dots ,1)\) and \(Q(\xi ,\dots ,\xi )\ne Q(1,\dots ,1)\) for \(\xi ^n=1,\xi \ne 1\)?

Theorem 4.4 describes the Laurent polynomials which are a \({\mathbb {Q}}\)-linear combination of monomials in \((e_1)_v(z),(e_2)_v(z),(e_3)_v(z)\) where \(e_k\) is the elementary symmetric polynomial in three variables of degree k. For our purposes it would however be much more desirable to have an understanding of which Laurent polynomials are a positive \({\mathbb {Q}}\)-linear combination of monomials in \((e_1)_v(z),(e_2)_v(z),(e_3)_v(z)\). Indeed, since elementary symmetric polynomials are Schur positive and since the product of Schur positive polynomials is again Schur positive, a positive \({\mathbb {Z}}\)-linear combination of monomials in \(e_1,e_2,e_3\) is always Schur positive. We tried to get our hands on this by suitable variants of Pólya’s Theorem (Marshall 2008, Theorem 5.5.1) but we have not been successful.