1 Introduction

What does it mean to say that a class of (quantum-)physical systems is complex? One perspective is to look at the physical phenomena displayed by that type of system. If these phenomena are rich and complex, then the system arguably can be said to be complex itself. Another perspective is to look at the computational power of the system: the ability to build a universal computer using the system would serve as strong evidence that the system is complex.

Interestingly, in some cases these notions of complexity are equivalent. Recent work by us, together with Cubitt, introduced and characterised the notion of universality in many-body quantum Hamiltonians [17]. A family of Hamiltonians is said to be universal if any other quantum Hamiltonian can be simulated arbitrarily well by some Hamiltonian in that family. By “simulate”, we mean the following (see Sect. 2 below for a formal definition): Hamiltonian A simulates Hamiltonian B if the low-energy part of A is close to B in operator norm, up to a local isometry (i.e. a map which associates each subsystem of the B system with a discrete set of subsystems of the A system).

This notion of simulation is very strong, as it implies that the low-energy part of A reproduces all physical properties of B (such as eigenvalues, ground states, partition functions, correlation functions, etc.) in a technical sense made precise in [17]. Universality is correspondingly a very strong notion. As a universal family \({\mathcal {F}}\) of Hamiltonians can simulate any other quantum Hamiltonian, any physical phenomenon that can occur in a quantum system must occur within Hamiltonians picked from \({\mathcal {F}}\). This implies that the ability to implement Hamiltonians in \({\mathcal {F}}\) allows universal “analogue” simulation of arbitrary quantum systems [14, 21]. In addition, if one also assumes that the simulation can be computed efficiently (as is usually the case), universal families of Hamiltonians are computationally universal, in a number of senses [17]. First, they can be used to perform arbitrary quantum computations, either by preparing a simple initial state, evolving according to \(H \in {\mathcal {F}}\) for some time and measuring, or via adiabatic evolution. Second, the problem of approximately computing the ground-state energy of Hamiltonians from \({\mathcal {F}}\) is QMA-complete, where QMA is the quantum analogue of the complexity class NP [7, 22], and hence expected to be computationally hard.

A natural way to classify physical systems is in terms of the types of interactions that they are built from. Let \({\mathcal {S}}\) be a set of interactions on up to k qudits (d-level subsystems), i.e. each element of \({\mathcal {S}}\) is a Hermitian operator on \(({\mathbb {C}}^d)^{\otimes l}\) for some \(l \leqslant k\). Then we say that an n-qudit Hamiltonian H is an \({\mathcal {S}}\)-Hamiltonian if

$$\begin{aligned} H = \sum _i \alpha _i h^{(i)}, \end{aligned}$$
(1)

where for all i, \(\alpha _i \in {\mathbb {R}}\) and the non-trivial part of \(h^{(i)}\) is picked from \({\mathcal {S}}\). That is, \(h^{(i)} = h \otimes I\) for some \(h \in {\mathcal {S}}\). H is a so-called k-local Hamiltonian. We stress that the \(\alpha _i\) coefficients can (usually) be either positive or negative. If \({\mathcal {S}}\) contains a 2-local interaction h that is not symmetric, then a Hamiltonian built from terms of the form \(\alpha h_{ij}+\beta h_{ji}\) is also an \({\mathcal {S}}\)-Hamiltonian. We also say that H is an \({\mathcal {S}}\)-Hamiltonian with local terms if it can be written as the sum of an \({\mathcal {S}}\)-Hamiltonian of the form (1) and arbitrary 1-local operators. The form (1) encompasses a vast array of the Hamiltonians studied in condensed-matter physics, such as the general Ising model (\({\mathcal {S}}= \{ Z \otimes Z\}\)) and the general Heisenberg model (\({\mathcal {S}}= \{X \otimes X + Y \otimes Y + Z \otimes Z\}\)). In the case where \({\mathcal {S}}= \{h\}\) for some h, we just call H an h-Hamiltonian.
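To make the form (1) concrete, the following minimal sketch assembles a small \({\mathcal {S}}\)-Hamiltonian numerically. The helper names `embed` and `s_hamiltonian`, and the particular weights, are illustrative assumptions of ours and do not appear elsewhere in this work.

```python
import numpy as np

def embed(h, sites, n, d):
    """Return h acting on the listed qudits (in that order) of an n-qudit system."""
    k = len(sites)
    # Start with h on the first k tensor factors, then move the axes into place.
    full = np.kron(h, np.eye(d ** (n - k))).reshape([d] * (2 * n))
    rest = [q for q in range(n) if q not in sites]
    perm = np.argsort(list(sites) + rest)      # current axis holding each qudit
    axes = [int(p) for p in perm] + [n + int(p) for p in perm]
    return full.transpose(axes).reshape(d ** n, d ** n)

def s_hamiltonian(terms, n, d):
    """Sum of alpha * h over (alpha, h, sites) triples, as in Eq. (1)."""
    return sum(alpha * embed(h, sites, n, d) for alpha, h, sites in terms)

Z = np.diag([1.0, -1.0])
ZZ = np.kron(Z, Z)

# Hypothetical weights: a 3-qubit Ising chain with one negative coupling.
H_ising = s_hamiltonian([(1.5, ZZ, (0, 1)), (-0.5, ZZ, (1, 2))], n=3, d=2)
```

Passing the sites in a different order, e.g. `(1, 0)` instead of `(0, 1)`, gives the term \(h_{ji}\) rather than \(h_{ij}\), which matters only for non-symmetric interactions.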

Determining the complexity of \({\mathcal {S}}\)-Hamiltonians is a natural quantum generalisation of the long-running programme in classical complexity theory of classifying constraint satisfaction problems (CSPs) according to their complexity. Beginning with Schaefer’s famous 1978 dichotomy theorem for boolean CSPs [40], which has been extended in many different directions since (see e.g. [15, 42] for references), this project aims to pinpoint, for each possible set of constraints \({\mathcal {S}}\), the complexity of a CSP that uses only constraints from \({\mathcal {S}}\) (perhaps weighted, to give an optimisation problem). A quantum generalisation of this question is to determine the complexity of approximately computing the ground-state energy of \({\mathcal {S}}\)-Hamiltonians up to \(1/{{\,\mathrm{poly}\,}}(n)\) precision [22]. This problem, which we call simply \({\mathcal {S}}\) -Hamiltonian, is a special case of the Local Hamiltonian problem, which in general is QMA-complete [28, 30] when \({\mathcal {S}}\) contains all k-qubit interactions for any fixed \(k \geqslant 2\). The classical special case of the \({\mathcal {S}}\) -Hamiltonian problem corresponds to \({\mathcal {S}}\) containing only diagonal interactions; such problems are known as “valued” or “generalised” CSPs, and a full complexity classification of these was only obtained in 2016, by Thapper and Živný [42].

A full classification was given in [16] of the computational complexity of the \({\mathcal {S}}\) -Hamiltonian problem in the special case where all interactions in \({\mathcal {S}}\) are on at most 2 qubits; this was sharpened by [10], which showed that one complexity class in the classification was equivalent to the previously studied class StoqMA [8]. It was later shown in [17] that each of the classes in [16] corresponds to a physical universality class. These results can be summarised as follows:

Theorem 1

([10, 16, 17, 18, 26]). Let \({\mathcal {S}}\) be any fixed set of two-qubit and one-qubit interactions such that \({\mathcal {S}}\) contains at least one interaction which is not 1-local. Then:

  • If there exists \(U \in SU(2)\) such that U locally diagonalises \({\mathcal {S}}\), then \({\mathcal {S}}\)-Hamiltonians are universal classical Hamiltonian simulators [18] and the \({\mathcal {S}}\) -Hamiltonian problem is NP-complete [16, 26];

  • Otherwise, if there exists \(U \in SU(2)\) such that, for each 2-qubit matrix \(h_i \in {\mathcal {S}}\), \(U^{\otimes 2} h_i (U^\dag )^{\otimes 2} = \alpha _i Z^{\otimes 2} + A_i\otimes I + I \otimes B_i\), where \(\alpha _i \in {\mathbb {R}}\) and \(A_i\), \(B_i\) are arbitrary single-qubit interactions, then \({\mathcal {S}}\)-Hamiltonians are universal stoquastic Hamiltonian simulators [17] and the \({\mathcal {S}}\) -Hamiltonian problem is StoqMA-complete [10, 16];

  • Otherwise, \({\mathcal {S}}\)-Hamiltonians are universal quantum Hamiltonian simulators [17] and the \({\mathcal {S}}\) -Hamiltonian problem is QMA-complete [16].

A stoquastic Hamiltonian is one whose off-diagonal elements in the standard basis are all nonpositive. Here we sometimes generalise this terminology slightly by also calling H stoquastic if there exists a local unitary U such that \(U^{\otimes n} H (U^\dag )^{\otimes n}\) is stoquastic.
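As a small illustration of this definition, the sketch below checks the sign condition on the off-diagonal entries and shows a Hamiltonian that becomes stoquastic only after a local change of basis. The function name `is_stoquastic` and the example Hamiltonian are our own illustrative choices.

```python
import numpy as np

def is_stoquastic(H, tol=1e-12):
    """True if every off-diagonal entry of H (standard basis) is real and <= 0."""
    off = H - np.diag(np.diag(H))
    return bool(np.all(np.abs(off.imag) <= tol) and np.all(off.real <= tol))

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)

# Two-qubit example with positive off-diagonal entries (illustrative signs).
H = np.kron(Z, Z) + np.kron(X, I2) + np.kron(I2, X)

# Conjugating by U (x) U with U = Z maps X to -X and leaves Z (x) Z unchanged.
U = np.kron(Z, Z)
H_rot = U @ H @ U.conj().T
```

Here `H` fails the entrywise test while `H_rot` passes it, so `H` is stoquastic in the generalised sense used above.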

1.1 Our results

Here we continue the programme of classifying universality of Hamiltonians—and hence the computational complexity of the \({\mathcal {S}}\) -Hamiltonian problem—by generalising from qubit interactions to qudit interactions, i.e. local dimension \(d > 2\), or equivalently spin \(>1/2\). As well as being a natural next step from the perspective of computational complexity, this framework includes many important models studied in condensed-matter theory [1, 6, 25, 29, 31, 33, 39]. However, it is significantly more difficult than the qubit case. One reason for this is that in the case of qubits, there was a simple “canonical form” into which any 2-qubit interaction could be put by applying local unitaries [16], which dramatically reduced the number of types of interaction that needed to be considered. No comparably simple canonical form seems to exist for \(d > 2\) [32].

We first consider \({\mathcal {S}}\)-Hamiltonians with local terms. This family is larger than the family of \({\mathcal {S}}\)-Hamiltonians alone, so universality results are easier to prove for it. From a computer science point of view, allowing free local terms corresponds to allowing arbitrary constraints or penalties on individual variables in a CSP. For conciseness, we say that \({\mathcal {S}}\) is LA-universal (“locally assisted universal”) if the family of \({\mathcal {S}}\)-Hamiltonians with local terms is universal. Similarly, we say that \({\mathcal {S}}\) is LA-stoquastic-universal if it can simulate any stoquastic Hamiltonian. Then our main result about universality with local terms is a complete classification theorem:

Theorem 2

Let \({\mathcal {S}}\) be a set of interactions, which are not all 1-local, between qudits of dimension d. Then \({\mathcal {S}}\) is:

  • LA-stoquastic-universal, if there exists \({|{\psi }\rangle }\in {\mathbb {C}}^d\) such that all interactions in \({\mathcal {S}}\) are, up to the addition of 1-local terms, given by a linear combination of operators taken from the set \(\{I, | \psi \rangle \langle \psi |,| \psi \rangle \langle \psi |^{\otimes 2},| \psi \rangle \langle \psi |^{\otimes 3},\dots \}\)—furthermore, if \({\mathcal {S}}\) is of this form and H is an \({\mathcal {S}}\)-Hamiltonian with local terms, then H is stoquastic;

  • LA-universal, otherwise.

We note some general consequences of this result for Hamiltonians assisted by local terms. First, we see that any nontrivial k-qudit interaction can be used to simulate an arbitrary stoquastic Hamiltonian. Second, almost any k-qudit interaction can actually be used to simulate arbitrary general Hamiltonians. Third, perhaps surprisingly, there exist Hamiltonians whose 2-local part is diagonal, but which are LA-universal.

We highlight some examples for \(d=3\). Consider

$$\begin{aligned} {\mathcal {S}}_1 = \left\{ \begin{pmatrix} 1 &{} 0 &{} 0\\ 0 &{} -1 &{} 0\\ 0 &{} 0 &{} -1 \end{pmatrix}^{\otimes 2} \right\} , \;\;\;\; {\mathcal {S}}_2 = \left\{ \begin{pmatrix} 1 &{} 0 &{} 0\\ 0 &{} -1 &{} 0\\ 0 &{} 0 &{} 0 \end{pmatrix}^{\otimes 2} \right\} . \end{aligned}$$

The single interaction in \({\mathcal {S}}_1\) is, up to rescaling, equal to \(| 0 \rangle \langle 0 |^{\otimes 2}\) plus some 1-local terms, so \({\mathcal {S}}_1\) is stoquastic and LA-stoquastic-universal. On the other hand, the interaction in \({\mathcal {S}}_2\) cannot be decomposed in this way, so \({\mathcal {S}}_2\) is LA-universal. So, for example, given access to interactions of the form of \({\mathcal {S}}_2\) and arbitrary local terms, one can perform universal quantum computation.
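The difference between the two examples can be checked numerically. The sketch below verifies the explicit decomposition of the \({\mathcal {S}}_1\) interaction, and then uses a simple necessary condition for the second: an operator of the form \(a| \psi \rangle \langle \psi | + bI\) has at most two distinct eigenvalues. The variable and function names are illustrative assumptions.

```python
import numpy as np

I3 = np.eye(3)
P0 = np.diag([1.0, 0.0, 0.0])            # |0><0|
D1 = np.diag([1.0, -1.0, -1.0])          # single-qutrit factor of S_1
D2 = np.diag([1.0, -1.0, 0.0])           # single-qutrit factor of S_2

# D1 = 2|0><0| - I, so D1 (x) D1 is 4|0><0|^{(x)2} plus 1-local terms.
lhs = np.kron(D1, D1)
rhs = (4 * np.kron(P0, P0) - 2 * np.kron(P0, I3)
       - 2 * np.kron(I3, P0) + np.kron(I3, I3))
s1_decomposes = np.allclose(lhs, rhs)

def distinct_eigenvalues(A, tol=1e-9):
    """Number of distinct eigenvalues of a Hermitian matrix, up to tol."""
    ev = np.linalg.eigvalsh(A)           # returned in ascending order
    return 1 + int(np.sum(np.diff(ev) > tol))

# a|psi><psi| + bI has at most two distinct eigenvalues; D2 has three.
n1, n2 = distinct_eigenvalues(D1), distinct_eigenvalues(D2)
```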

Next we consider the more general h-Hamiltonian problem, where the lack of “free” 1-local terms makes it much more challenging to prove universality results. Here we focus on qudit generalisations of the qubit Heisenberg (exchange) interaction (\(h\propto X\otimes X+Y\otimes Y+Z\otimes Z\)). Hamiltonians built from this interaction enjoy significant levels of symmetry, which made it one of the most difficult cases to prove universal in previous work [16, 17]. The most symmetric such generalisation in local dimension d is the SU(d) Heisenberg model (often known as “SU(N) Heisenberg model” in the literature [6, 33]), where the interaction is

$$\begin{aligned} h=\sum _{a=1}^{d^2-1} T^a \otimes T^a \end{aligned}$$
(2)

for some \(d\times d\) traceless Hermitian matrices \(T^a\) such that \({{\,\mathrm{Tr}\,}}(T^a T^b)=\frac{1}{2}\delta _{ab}\). Up to adding an identity term and rescaling, h is just the swap operator, or the projector onto the symmetric subspace of two qudits,

$$\begin{aligned} P_{\text {sym}} = \frac{1}{4} \sum _{i,j} ({|{ij}\rangle } + {|{ji}\rangle })({\langle {ij}|} + {\langle {ji}|}). \end{aligned}$$

h is invariant under conjugation by local unitaries of the form \(U \otimes U\), implying that the eigenspaces of any Hamiltonian built only from h interactions inherit the corresponding symmetry. Nevertheless, we have the following result:
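The relation between h, the swap operator and \(P_{\text {sym}}\) can be checked numerically for small d. The sketch below uses generalised Gell-Mann matrices as one possible choice of \(T^a\) (any other basis with the same normalisation gives the same h); with this normalisation one finds \(h = \frac{1}{2}\mathrm {SWAP} - \frac{1}{2d} I\). The function names are illustrative assumptions.

```python
import numpy as np

def su_generators(d):
    """Generalised Gell-Mann matrices, normalised so Tr(T^a T^b) = delta_ab / 2."""
    gens = []
    for j in range(d):
        for k in range(j + 1, d):
            E = np.zeros((d, d), dtype=complex)
            E[j, k] = 1.0
            gens.append((E + E.T) / 2)            # symmetric off-diagonal
            gens.append(-1j * (E - E.T) / 2)      # antisymmetric off-diagonal
    for l in range(1, d):
        diag = np.zeros(d)
        diag[:l] = 1.0
        diag[l] = -l
        gens.append(np.diag(diag).astype(complex) / np.sqrt(2 * l * (l + 1)))
    return gens

def swap(d):
    """Swap operator on two qudits of dimension d."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[i * d + j, j * d + i] = 1.0
    return S

def heisenberg_sud(d):
    """SU(d) Heisenberg interaction, Eq. (2)."""
    return sum(np.kron(T, T) for T in su_generators(d))

def p_sym(d):
    """Projector onto the symmetric subspace of two qudits."""
    return (np.eye(d * d) + swap(d)) / 2
```

For example, `np.allclose(heisenberg_sud(3), swap(3) / 2 - np.eye(9) / 6)` evaluates to `True`.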

Theorem 3

For any \(d \geqslant 2\), the SU(d) Heisenberg interaction \(h:=\sum _{a} T^a \otimes T^a\), where \(\{T^a\}\) are traceless Hermitian matrices such that \({{\,\mathrm{Tr}\,}}(T^a T^b)=\frac{1}{2}\delta _{ab}\), is universal. This holds even if the weights \(\alpha _i\) in the decomposition (1) are restricted to be non-negative.

The special case \(d=2\) of Theorem 3 was shown in [17]. As a corollary of Theorem 3, we obtain QMA-hardness of a quantum variant of the Max-d-Cut problem [19] (equivalently, a quantum generalisation of the (classical) antiferromagnetic Potts model [44]). In the Max-d-Cut problem, we are given a graph where each edge (ij) has a non-negative weight \(w_{ij}\), and are asked to partition the vertices into d sets, such that the sum of the weights of edges between vertices in different sets is maximised. That is, we find a map c from each vertex i to an integer \(c(i) \in [d]\) such that \(\sum _{i<j} w_{ij} (1-\delta _{c(i)c(j)})\) is maximised. The natural “quantum” way of generalising this problem is to replace each vertex with a d-dimensional qudit, and replace each weighted edge across two vertices with a weighted projector onto the symmetric subspace across the corresponding qudits (equivalently, an interaction h). Then the task is to approximate the ground-state energy of the corresponding Hamiltonian \(\sum _{i<j} w_{ij} h_{ij}\), up to precision \(1/{{\,\mathrm{poly}\,}}(n)\). Call this problem Quantum Max-d-Cut.

To see why this is a suitable (and non-trivial) generalisation, note that \(P_{\text {sym}}\) gives an energy penalty to a pair of qudits that are both in the same computational basis state, similarly to the classical case, but that the behaviour of the quantum variant can sometimes be quite different. For example, consider the case \(d=2\), and four vertices arranged in an unweighted cycle. Classically, the vertices can clearly be partitioned into two sets such that there are no edges between vertices in the same set. However, there is no quantum state that is simultaneously in the ground space of all corresponding projectors \(P_{\text {sym}}\). This is because the unique ground state of \(P_{\text {sym}}\) is maximally entangled, and each qubit cannot be maximally entangled with both of its neighbours simultaneously.
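The four-vertex example can be verified directly by exact diagonalisation. The following sketch builds \(\sum P_{\text {sym}}\) over the edges of an unweighted 4-cycle of qubits and computes its lowest eigenvalue; the helper name `swap_on` is an illustrative assumption.

```python
import numpy as np
from itertools import product

def swap_on(i, j, n, d=2):
    """Operator exchanging qudits i and j of an n-qudit register."""
    dim = d ** n
    S = np.zeros((dim, dim))
    for idx, state in enumerate(product(range(d), repeat=n)):
        swapped = list(state)
        swapped[i], swapped[j] = swapped[j], swapped[i]
        target = 0
        for digit in swapped:                 # base-d index of the swapped state
            target = target * d + digit
        S[target, idx] = 1.0
    return S

n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]      # unweighted 4-cycle

# Sum of P_sym = (I + SWAP) / 2 over all edges.
H = sum((np.eye(2 ** n) + swap_on(i, j, n)) / 2 for i, j in edges)
ground_energy = np.linalg.eigvalsh(H)[0]
```

The lowest eigenvalue is strictly positive (it equals 1, since for qubits \(P_{\text {sym}} = \frac{3}{4} I + S_i \cdot S_j\) and the antiferromagnetic Heisenberg ring on four spins has ground-state energy \(-2\)), so no state lies in the kernel of all four projectors at once.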

It is an immediate consequence of Theorem 3 that:

Corollary 4

For any \(d \geqslant 2\), Quantum Max-d-Cut is QMA-complete.

The special case \(d=2\) of Corollary 4 was shown in [38, Theorem 2].

Next, we consider the case where the interactions are of the form \(P=| \psi \rangle \langle \psi |\) for an entangled two-qudit state \({|{\psi }\rangle }\).

Theorem 5

Let \(P= | \psi \rangle \langle \psi |\) be the projector onto an entangled two-qudit state \({|{\psi }\rangle } \in ({\mathbb {C}}^d)^{\otimes 2}\). Then \(\{P\}\)-Hamiltonians are universal.

In fact, Theorem 5 holds even in the restrictive setting where all the interactions are required to sit on the edges of a bipartite interaction graph (see Sect. 6 for a precise statement). Entanglement is a very well studied property of quantum systems, and is well known to be fundamental to many interesting quantum phenomena. This result can be viewed as an intriguing and apparently tight link between entanglement and universality.

A perhaps more familiar, and also very well-studied, interaction we consider is another generalisation of the qubit Heisenberg interaction (e.g. [1, 34, 37]): the SU(2) Heisenberg interaction in local dimension d (often just called the “spin-s Heisenberg interaction”, where \(s=(d-1)/2\)). Now the interaction is of the form

$$\begin{aligned} h = S^x \otimes S^x + S^y \otimes S^y + S^z \otimes S^z, \end{aligned}$$

where \(S^x\), \(S^y\), \(S^z\) generate a d-dimensional irreducible representation of \(\mathfrak {su}(2)\) and correspond to the familiar Pauli matrices X, Y, Z (up to an overall scaling factor). Note that, although the Lie algebra involved is the same as for the qubit case, the interaction h may have very different properties for higher d; for example, it has d distinct eigenvalues (see Eq. (44) below). Nevertheless, this generalisation turns out to be universal too:
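The statement about the number of distinct eigenvalues is easy to check numerically. The sketch below constructs the standard spin-s matrices in the basis \(m = s, s-1, \dots , -s\) (one conventional choice of representation) and counts the distinct eigenvalues of h; the function names are illustrative assumptions.

```python
import numpy as np

def spin_matrices(d):
    """S^x, S^y, S^z for spin s = (d-1)/2 in the basis m = s, s-1, ..., -s."""
    s = (d - 1) / 2
    m = s - np.arange(d)
    Sz = np.diag(m).astype(complex)
    # <m+1| S^+ |m> = sqrt(s(s+1) - m(m+1)), which sits on the superdiagonal.
    Sp = np.diag(np.sqrt(s * (s + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    Sx = (Sp + Sp.conj().T) / 2
    Sy = (Sp - Sp.conj().T) / 2j
    return Sx, Sy, Sz

def heisenberg_su2(d):
    """SU(2) Heisenberg interaction between two spins of local dimension d."""
    return sum(np.kron(S, S) for S in spin_matrices(d))

def count_distinct(values, tol=1e-8):
    """Number of distinct entries of a real array, up to tol."""
    return 1 + int(np.sum(np.diff(np.sort(values)) > tol))

distinct_by_d = {d: count_distinct(np.linalg.eigvalsh(heisenberg_su2(d)))
                 for d in range(2, 6)}
```

The count equals d in each case because the eigenvalues are \(\frac{1}{2}[J(J+1) - 2s(s+1)]\), one for each total spin \(J = 0, 1, \dots , 2s\).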

Theorem 6

For any \(d \geqslant 2\), the SU(2) Heisenberg interaction \(h= S^x \otimes S^x + S^y \otimes S^y + S^z \otimes S^z\), where \(S^x\), \(S^y\), \(S^z\) are representations of the Pauli matrices X, Y, Z, is universal.

Finally, we consider yet another well-studied generalisation of the Heisenberg model (see e.g. [2, 25, 29, 31]): the general bilinear-biquadratic Heisenberg model in local dimension \(d=3\) (spin 1). Here the interaction used is

$$\begin{aligned} h^{(\theta )} := (\cos \theta ) h + (\sin \theta ) h^2, \end{aligned}$$

where \(\theta \in [0,2\pi )\) is an arbitrary parameter and h is the spin-1 Heisenberg interaction, which can be written explicitly as

$$\begin{aligned} h = X_3 \otimes X_3 + Y_3 \otimes Y_3 + Z_3 \otimes Z_3 \end{aligned}$$
(3)

where

$$\begin{aligned} X_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 &{} 1 &{} 0\\ 1&{} 0&{} 1\\ 0 &{} 1 &{} 0\end{pmatrix},\;\;\;\;Y_3 = \frac{i}{\sqrt{2}} \begin{pmatrix} 0 &{} -1 &{} 0\\ 1&{} 0&{} -1\\ 0 &{} 1 &{} 0\end{pmatrix},\;\;\;\;Z_3 = \begin{pmatrix} 1 &{} 0 &{} 0\\ 0&{} 0&{} 0\\ 0 &{} 0 &{} -1\end{pmatrix}. \end{aligned}$$

The special case \(\theta = \arctan 1/3\) corresponds to the famous Affleck-Kennedy-Lieb-Tasaki (AKLT) model [2]. Our result here is as follows:

Theorem 7

Let \(h^{(\theta )} := (\cos \theta ) h + (\sin \theta ) h^2\), where \(\theta \in [0,2\pi )\) is an arbitrary parameter and h is the spin-1 Heisenberg interaction. For all \(\theta \), \(h^{(\theta )}\) is universal.

We therefore see that, although different values of \(\theta \) may correspond to very different physics [31], from a universality point of view they are all of equal power.

The family of \({\mathcal {S}}\)-Hamiltonians allows varying interaction strengths by definition. The simulations presented here are all efficient in the sense that it is possible to simulate an O(1)-local Hamiltonian of n qubits with maximum interaction strength \(J_{\text {max}}\) using a simulator Hamiltonian with interaction strengths that scale at most polynomially in n, \(J_{\text {max}}\), \(\Delta \), \(1/\epsilon \) and \(1/\eta \), where \(\Delta \), \(\epsilon \) and \(\eta \) are the parameters of Definition 1 that describe the accuracy of the simulation. However, since the constructions presented here often consist of multiple stages of simulations, with the degrees of the corresponding polynomials multiplying together, these interaction strengths can be very large, and we do not calculate these polynomials explicitly.

We remark that, in common with most previous work in this area [16, 17], we usually allow each interaction weight to be positive or negative. This can lead to physical systems built from the same interaction having very different physical properties (e.g. antiferromagnetism vs. ferromagnetism). It is sometimes possible to prove universality-type results for interactions whose weights all have the same sign [38]; we achieve this in Theorem 3, but in general leave this extension for future work. Another interesting direction is to prove universality for systems with simpler interaction patterns [17, 36, 38, 41], or with less heavily-weighted interactions [13].

1.2 Related work

There has been a substantial amount of work characterising the complexity of various types of qubit Hamiltonians from the perspective of QMA-completeness; see [7, 17, 22] for references. In the case of qudits, rather than general classification results, most work has considered carefully designed special cases where QMA-completeness can be achieved. Indeed, it is often the case that these results aim to reduce the local dimension of a QMA-complete construction that achieves some other desiderata. For example, Aharonov et al. [3] gave a QMA-complete family of local Hamiltonians on a 1D line with \(d=12\), later improved to \(d=8\) by Hallgren, Nagaj and Narayanaswami [24]; Gottesman and Irani [23] gave a \(\hbox {QMA}_{\text {EXP}}\)-complete family of translationally invariant Hamiltonians on a 1D line with \(d=O(10^6)\), later improved to \(d \approx 40\) by Bausch, Cubitt and Ozols [4]. The local dimension has been reduced even further to \(d=4\), for a translationally invariant Hamiltonian on a 3D lattice [5]. We refer to [7] for further examples, including the more general case where the local dimension can vary across the system being considered. In all these cases, one fixes the dimension and then carefully tunes the types of interactions used to achieve the desired result. Here, by contrast, we begin with a fixed set of interactions and attempt to determine the complexity of Hamiltonians based on these interactions.

1.3 Overview of proof of Theorem 2

We now give an informal discussion of our LA-universality classification result (Fig. 1). The majority of the work to prove Theorem 2 is taken up by the special case of 2-local interactions, and sets \({\mathcal {S}}\) containing only one interaction. To prove universality of an interaction h, we use simulations: showing that an interaction known to be universal [16, 17] can be implemented using Hamiltonians consisting of h terms and 1-local terms. Our simulations are all based on perturbative gadgets, as introduced in [28] and used for example in [10, 17, 36], to effectively implement one Hamiltonian within the ground space of another. For example, a type of gadget we often use is a so-called mediator gadget. In this type of gadget, one or more ancilla (“mediator”) qudits are added to the system. Strong interactions within the mediator qudits effectively project these qudits into a fixed state. Then weaker interactions between the mediator and original qudits implement effective interactions between the original qudits. The interactions produced are determined rigorously via perturbation theory.

Fig. 1 Sequence of simulations used in the proof of Theorem 2. An arrow from one box to another indicates that a Hamiltonian of the first type can be simulated by a Hamiltonian of the second type.

First we consider the special case of diagonal interactions with 2-local rank \(\geqslant 2\), where the 2-local rank of an interaction h is informally defined as follows: Writing \(h= h' + \text {1-local terms}\), and \(h' = \sum _{a,b} M_{ab} T^a \otimes T^b\) for some basis \(T^a\) of Hermitian matrices, the 2-local rank of h is the rank of M. (For example, \(h = X \otimes X + Y \otimes I\) has 2-local rank 1.) We can think of diagonal matrices symmetric under qudit interchange and with 2-local rank 2 as being of the form \(D \otimes D + E \otimes E\) for some diagonal matrices D and E. To show that such interactions are universal (a similar argument works for non-symmetric interactions), we use our free 1-local terms to apply a heavy interaction to each qudit which effectively projects it into a 2-dimensional subspace. Note that even though D and E commute, this need not be the case for the corresponding projected qubit interactions. This allows us to generate a 2-qubit effective interaction within this subspace which is universal [17].
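The informal definition of 2-local rank can be turned into a short numerical check. The sketch below rearranges a 2-qudit interaction into its operator-Schmidt matrix, projects out the identity component on each side (which removes the 1-local and identity parts), and returns the rank of what remains. The function name `two_local_rank` is an illustrative assumption, and this is only one of several equivalent ways to compute the rank of M.

```python
import numpy as np

def two_local_rank(h, d, tol=1e-9):
    """Rank of the coefficient matrix M of the purely 2-local part of h."""
    # Rearrange h[(i,j),(k,l)] into R[(i,k),(j,l)], so that A (x) B maps to
    # the rank-one matrix vec(A) vec(B)^T.
    R = h.reshape(d, d, d, d).transpose(0, 2, 1, 3).reshape(d * d, d * d)
    # Projector that removes the identity component from either tensor factor.
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    Q = np.eye(d * d) - np.outer(v, v)
    return int(np.linalg.matrix_rank(Q @ R @ Q, tol=tol))

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)
I2 = np.eye(2, dtype=complex)

rank_example = two_local_rank(np.kron(X, X) + np.kron(Y, I2), d=2)
rank_heisenberg = two_local_rank(np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z), d=2)
```

Here `rank_example` reproduces the value 1 quoted in the text, while the qubit Heisenberg interaction has 2-local rank 3.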

Remaining within the special case of diagonal interactions, the next step is to consider those with 2-local rank 1, which are of the form \(A \otimes A\). To deal with this case, we split into two parts. When A has at least 3 distinct eigenvalues, we design a gadget using an additional qudit to implement the effective interaction \(A \otimes A^2 + A^2 \otimes A\), which is universal from the previous case. When A has 2 distinct eigenvalues, but is not of the form \(a| \psi \rangle \langle \psi | + b I\), we show that another gadget can be used to simulate an interaction \(B \otimes B\) where B has 3 distinct eigenvalues. For the remaining diagonal case—interactions of the form \(A \otimes A\) for \(A = a| \psi \rangle \langle \psi | + b I\)—we show that local unitary rotations can be used to transform any Hamiltonian built of such interactions into a stoquastic Hamiltonian, so we cannot expect this case to be universal.

We then move on to non-diagonal interactions. We first consider those of the form \(A \otimes A + B \otimes B\) for some B that does not commute with A (otherwise we would be in the diagonal case). For all such interactions, we show there exists a gadget which projects the interaction onto a 2-qubit subspace on which the resulting interaction is universal. The non-commutativity makes this task simpler than in the diagonal case. The next step is interactions with 2-local rank \(\geqslant 2\), but not of the form \(A \otimes A + B \otimes B\). For these, we show that one can always produce an effective interaction of the form \(A \otimes A + B \otimes B\) using two rounds of simulation.

All 2-qudit interactions h can be handled using one of these lemmas. Considering the interaction \(h'\) formed by deleting the 1-local parts from h, we know that h is LA-universal if the 2-local rank of \(h'\) is \(\geqslant 2\). If not, then \(h' = A \otimes B\) for some A and B. Either \(A \otimes B + B\otimes A\) has 2-local rank \(\geqslant 2\), or B is proportional to A. Either way, we are in one of the previously considered cases.

The final step to complete the proof of Theorem 2 is to generalise to k-local interactions for \(k>2\). To do so, we show that our free 1-local terms can be used to extract 2-local “sub-interactions” from the interactions we are given; this is a generalisation to \(d>2\) of an analogous argument for qubits in [16]. Then either we can produce a universal sub-interaction, or all the sub-interactions of all interactions in \({\mathcal {S}}\) are proportional to \(| \psi \rangle \langle \psi |^{\otimes 2}\), up to 1-local terms. In the latter case, the overall interactions must all have been of the form \(| \psi \rangle \langle \psi |^{\otimes \ell }\), so the whole Hamiltonian is stoquastic.

1.4 Overview of proof of Theorems 3, 5, 6 and 7

The techniques required to prove universality of interactions without free local terms are very different, and in general this setting is much more challenging. Given the symmetry displayed by the interactions we consider, we need to consider some notion of encoding in order to implement arbitrary effective interactions. In the case of the SU(d) Heisenberg interaction, we proceed by using a perturbative gadget to encode a qubit within the 2-dimensional ground space of a system of 2d qudits; this generalises a similar (but significantly simpler) gadget used for the case \(d=2\) in [17]. Interactions across pairs of qudits within the gadget implement effective X and Z interactions, while interactions across two gadgets can be used to implement a non-trivial 2-qubit interaction, which is enough to prove universality using the results of [17, 38]. In order to analyse the gadget’s behaviour, we need to use the representation theory of the Lie algebra \(\mathfrak {su}(d)\), and in particular analysis of quadratic Casimir operators [20], which are operators of the form \(\sum _a R(T^a) R(T^a)\) for some representation R of the generators \(T^a\) of \(\mathfrak {su}(d)\). The Hamiltonian corresponding to the SU(d) Heisenberg interaction on the complete graph on k qudits turns out to have a close connection to the Casimir operator corresponding to the representation \(R(T^a) = \sum _{i=1}^k T^a_i\), whose spectral properties are well-understood, and which has beautiful algebraic features that enable suitable gadget weights to be determined for any d.

Theorem 5 is proven using a gadget that shows that, when P is the projector onto an entangled state of two qudits, \(\{P\}\)-Hamiltonians can simulate \(\{P'\}\)-Hamiltonians for some \(P'=| \psi ' \rangle \langle \psi ' |\) where either \({|{\psi '}\rangle }\) is an entangled state of two qubits, in which case universality follows from Theorem 1; or \({|{\psi '}\rangle } =\frac{1}{\sqrt{d}}\sum _i {|{i}\rangle }{|{i}\rangle }\), in which case universality can be shown to follow from universality of the SU(d)-Heisenberg interaction (Theorem 3).

The gadget for the SU(2) Heisenberg interaction h also relies on properties of the corresponding Casimir operator, but is more complicated than the SU(d) case. Here the key technical step is to give a gadget that allows \(h^2\) interactions to be simulated, given access to h interactions; once this is achieved, it is not too hard to show that for any d, this allows the SU(2) Heisenberg interaction to be simulated in local dimension 3 (qutrits). Applying the \(h \mapsto h^2\) gadget again, we can produce the interaction \(h + h^2\), which (in local dimension 3) is the same as the SU(3) Heisenberg interaction, and hence universal. The analysis of this gadget depends on fourth-order perturbation theory, for which we need to prove a new general simulation lemma based on the Schrieffer–Wolff transformation [9]. Previous work gave general simulation lemmas for up to third-order perturbations [10], but extending this line of argument to fourth order is technically more complex; in particular, there are non-trivial interference effects between different gadgets to take into account. We thus hope that this result will find other applications elsewhere.
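The claim that \(h + h^2\) coincides with the SU(3) Heisenberg interaction in local dimension 3 can be checked numerically, up to the identity shift and rescaling already noted for the SU(d) case. The sketch below uses the spin-1 matrices \(X_3\), \(Y_3\), \(Z_3\) given above and verifies that \(h + h^2 = \mathrm {SWAP} + I\); the variable names are illustrative assumptions.

```python
import numpy as np

# Spin-1 matrices X_3, Y_3, Z_3.
X3 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]) / np.sqrt(2)
Y3 = 1j * np.array([[0, -1, 0], [1, 0, -1], [0, 1, 0]]) / np.sqrt(2)
Z3 = np.diag([1.0, 0.0, -1.0])

# Spin-1 Heisenberg interaction, Eq. (3).
h = sum(np.kron(S, S) for S in (X3, Y3, Z3))

# Swap operator on two qutrits.
swap = np.zeros((9, 9))
for i in range(3):
    for j in range(3):
        swap[3 * i + j, 3 * j + i] = 1.0

identity_holds = np.allclose(h + h @ h, swap + np.eye(9))
```

The identity holds because h takes the values \(-2\), \(-1\), \(1\) on the total-spin sectors \(J = 0, 1, 2\), so \(h + h^2\) takes the values 2, 0, 2, matching \(\mathrm {SWAP} + I\) on the symmetric (\(J = 0, 2\)) and antisymmetric (\(J = 1\)) subspaces.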

We note that higher order perturbation theory has been considered before in the literature in slightly different settings, mostly in a framework where only the ground state energy is reproduced; for example [27] considers perturbation theory at arbitrary order. Although the contribution of the fourth order term in a Schrieffer–Wolff perturbative series has been considered before [12], we are not aware of any explicit demonstration of how the interactions must be chosen such that this fourth order term dominates as in Lemma 12. Cross gadget interference has been seen before for certain parameter regimes of low strength Hamiltonians [11], where it can be easily shown to disappear simply by increasing the strength of the interactions; whereas in Lemma 13, the cross gadget terms are independent of the strength of the Hamiltonian.

Finally, for the remaining bilinear-biquadratic Heisenberg interactions in dimension 3, we use different gadgets depending on the value of \(\theta \), which we can assume is within the range \([0,\pi ]\) because we are free to choose the signs of interactions arbitrarily. When \(\theta \in (0,\arctan 1/3) \cup (\pi /4, \pi )\) and \(\theta \ne \arctan 2\), then there exists an entangled state \({|{\psi }\rangle }\) which is either the unique ground state or the unique highest excited state of \(h^{(\theta )}\). Using a perturbative gadget to effectively project some qudits onto \({|{\psi }\rangle }\), we can obtain a new interaction \(h^{(\theta ')}\) for some \(\theta ' \ne \theta \). Taking a linear combination of these two interactions, we can simulate the SU(3) Heisenberg interaction. When \(\theta \in (\arctan 1/3,\arctan 5)\), \(h^{(\theta )}\) has a 3-dimensional ground space. We encode a qutrit within this subspace of two physical qutrits, and use \(h^{(\theta )}\) interactions across pairs of qutrits to simulate the SU(3) Heisenberg interaction across logical qutrits. These ranges encompass all values of \(\theta \) except \(\theta =\arctan 1/3\). In this last special case, \(h^{(\theta )}\) corresponds to the well-studied AKLT interaction [2]. Here the ground space of \(h^{(\theta )}\) is 4-dimensional, but we are able to construct a mediator qutrit gadget which effectively projects 3 qutrits into the unique ground state of a 3 qutrit AKLT Hamiltonian. This again allows us to simulate the SU(3) Heisenberg interaction.
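The spectral statements used in this case analysis can be explored numerically. The sketch below builds \(h^{(\theta )}\) from the spin-1 matrices and reports the degeneracy of its lowest and highest eigenvalues; the function names and the sample values of \(\theta \) are illustrative assumptions.

```python
import numpy as np

X3 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]) / np.sqrt(2)
Y3 = 1j * np.array([[0, -1, 0], [1, 0, -1], [0, 1, 0]]) / np.sqrt(2)
Z3 = np.diag([1.0, 0.0, -1.0])
h = sum(np.kron(S, S) for S in (X3, Y3, Z3))

def bilinear_biquadratic(theta):
    """h^(theta) = cos(theta) h + sin(theta) h^2 for spin 1."""
    return np.cos(theta) * h + np.sin(theta) * (h @ h)

def extremal_degeneracies(theta, tol=1e-9):
    """Degeneracies of the lowest and highest eigenvalues of h^(theta)."""
    ev = np.linalg.eigvalsh(bilinear_biquadratic(theta))   # ascending order
    return int(np.sum(ev - ev[0] < tol)), int(np.sum(ev[-1] - ev < tol))

aklt = extremal_degeneracies(np.arctan(1 / 3))   # AKLT point
small = extremal_degeneracies(0.1)               # inside (0, arctan 1/3)
middle = extremal_degeneracies(1.0)              # inside (arctan 1/3, arctan 5)
large = extremal_degeneracies(2.0)               # inside (pi/4, pi)
```

At the AKLT point the ground space is 4-dimensional, at \(\theta = 0.1\) the ground state is unique, at \(\theta = 1.0\) the ground space is 3-dimensional, and at \(\theta = 2.0\) the highest excited state is unique, in line with the cases described above.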

2 Summary of Techniques

Next, we give the required definitions to state our results formally, describe previous results that we use, and exemplify our results by giving a simple example of a simulation. We then proceed to a full technical presentation of the remainder of our results.

2.1 Definitions

We first formally define the notions of simulation and universality that we will use. For an arbitrary Hamiltonian \(H \in L({\mathbb {C}}^d)\), we let \(P_{\leqslant \Delta (H)}\) denote the orthogonal projector onto the subspace \(S_{\leqslant \Delta (H)} := {{\,\mathrm{span}\,}}\{ {|{\psi }\rangle } : H{|{\psi }\rangle }=\lambda {|{\psi }\rangle }, \lambda \leqslant \Delta \}\). We also let \(H'|_{\leqslant \Delta (H)}\) denote the restriction of some other arbitrary Hamiltonian \(H'\) to \(S_{\leqslant \Delta (H)}\), and write \(H|_{\leqslant \Delta } := H|_{\leqslant \Delta (H)}\) and \(H_{\leqslant \Delta } := H P_{\leqslant \Delta (H)}\). We let \(L({\mathcal {H}})\) denote the set of linear operators acting on a Hilbert space \({\mathcal {H}}\), and use the standard notation \([A,B] := AB-BA\) and \(\{A,B\} := AB+BA\) for the commutator and anticommutator of A and B, respectively.

Definition 1

(Special case of definition in [17]; variant of definition in [10]). We say that \(H'\) is a \((\Delta ,\eta ,\epsilon )\)-simulation of H if there exists a local isometry \(V = \bigotimes _i V_i\), where each isometry \(V_i\) acts on at most one qudit, such that:

  1. There exists an isometry \({\widetilde{V}}\) such that \({\widetilde{V}} {\widetilde{V}}^\dag = P_{\leqslant \Delta (H')}\) and \(\Vert {\widetilde{V}} - V\Vert \leqslant \eta \);

  2. \(\Vert H'_{\leqslant \Delta } - {\widetilde{V}}H{\widetilde{V}}^\dag \Vert \leqslant \epsilon \).

We say that a family \({\mathcal {F}}'\) of Hamiltonians can simulate a family \({\mathcal {F}}\) of Hamiltonians if, for any \(H \in {\mathcal {F}}\) and any \(\eta ,\epsilon >0\) and \(\Delta \geqslant \Delta _0\) (for some \(\Delta _0 > 0\), that depends only on H), there exists \(H' \in {\mathcal {F}}'\) such that \(H'\) is a \((\Delta ,\eta ,\epsilon )\)-simulation of H. We say that the simulation is efficient if, in addition, for H acting on n qudits, \(\Vert H'\Vert = {{\,\mathrm{poly}\,}}(n,1/\eta ,1/\epsilon ,\Delta )\); \(H'\) is efficiently computable given H, \(\Delta \), \(\eta \) and \(\epsilon \); and each isometry \(V_i\) maps to O(1) qudits.

The first part of Definition 1 says that H can be mapped exactly into the low-energy subspace of \(H'\) (the span of its eigenvectors with eigenvalue at most \(\Delta \)) by some “encoding” isometry \({\widetilde{V}}\) which is close to a local isometry V. The second part says that the low-energy part of \(H'\) is close to an encoded version of H. In [17] a more general notion of encoding was used, which allowed for complex Hamiltonians to be encoded as real Hamiltonians, for example; here we will not need this directly. (However, as we make use of the results of [17], we do use this notion of encoding indirectly.)

Definition 2

([17]). We say that a family of Hamiltonians is universal if any (finite-dimensional) Hamiltonian can be simulated by a Hamiltonian from the family. We say that the universal simulator is efficient if the simulation is efficient for all k-local Hamiltonians, for constant k.

Here all simulations we develop will be efficient, so whenever we say “universal”, we mean “efficiently universal” in the above sense.

2.2 Perturbative gadgets

The main technique we will use to prove universality will be the remarkably powerful concept of perturbative gadgets [28]. Let \({\mathcal {H}}_{\text {sim}}\) be a Hilbert space decomposed as \({\mathcal {H}}_{\text {sim}} = {\mathcal {H}}_+ \oplus {\mathcal {H}}_-\), and let \(\Pi _{\pm }\) denote the projector onto \({\mathcal {H}}_{\pm }\). For any linear operator O on \({\mathcal {H}}_{\text {sim}}\), write

$$\begin{aligned} O_{--} = \Pi _- O \Pi _-,\;\;\;\; O_{-+} = \Pi _- O \Pi _+,\;\;\;\; O_{+-} = \Pi _+ O \Pi _-,\;\;\;\; O_{++} = \Pi _+ O \Pi _+. \end{aligned}$$
(4)

Throughout, let \(H_0\) be a Hamiltonian such that \(H_0\) is block-diagonal with respect to the split \({\mathcal {H}}_+ \oplus {\mathcal {H}}_-\), \((H_0)_{--} = 0\), and \(\lambda _{\min }((H_0)_{++}) \geqslant 1\), where \(\lambda _{\min }(H)\) denotes the minimal eigenvalue of H. We write \(H^{-1}\) to denote the Moore–Penrose pseudoinverse, when H is not an invertible matrix.

Slight variants of the following lemmas were shown in [10], building on previous work [9, 36]:

Lemma 8

(First-order simulation [10]). Let \(H_0\) and \(H_1\) be Hamiltonians acting on the same space. Suppose there exists a local isometry V such that \({\text {Im}}(V)={\mathcal {H}}_-\) and

$$\begin{aligned} V H_{{\text {target}}} V^\dag = (H_1)_{--}. \end{aligned}$$
(5)

Then \(H_{{\text {sim}}} = \Delta H_0 + H_1\) \((\Delta /2,\eta ,\epsilon )\)-simulates \(H_{{\text {target}}}\), provided that \(\Delta \geqslant O(\Vert H_1\Vert ^2/\epsilon + \Vert H_1\Vert / \eta )\).

Lemma 9

(Second-order simulation [10]). Let \(H_0\), \(H_1\), \(H_2\) be Hamiltonians acting on the same space, such that: \(\max \{\Vert H_1\Vert ,\Vert H_2\Vert \} \leqslant \Lambda \); \(H_1\) is block-diagonal with respect to the split \({\mathcal {H}}_+ \oplus {\mathcal {H}}_-\); and \((H_2)_{--} =0\). Suppose there exists a local isometry V such that \({\text {Im}}(V)={\mathcal {H}}_-\) and

$$\begin{aligned} V H_{{\text {target}}} V^\dag = (H_1)_{--} - (H_2)_{-+} H_0^{-1} (H_2)_{+-}. \end{aligned}$$
(6)

Then \(H_{{\text {sim}}} = \Delta H_0 + \Delta ^{1/2} H_2 + H_1\) \((\Delta /2,\eta ,\epsilon )\)-simulates \(H_{{\text {target}}}\), provided that \(\Delta \geqslant O(\Lambda ^6/\epsilon ^2 + \Lambda ^2/\eta ^2)\).

Lemma 10

(Third-order simulation [10]). Let \(H_0\), \(H_1\), \(H_1'\), \(H_2\) be Hamiltonians acting on the same space, such that: \(\max \{\Vert H_1\Vert ,\Vert H_1'\Vert ,\Vert H_2\Vert \} \leqslant \Lambda \); \(H_1\) and \(H_1'\) are block-diagonal with respect to the split \({\mathcal {H}}_+ \oplus {\mathcal {H}}_-\); \((H_2)_{--}=0\). Suppose there exists a local isometry V such that \({\text {Im}}(V)={\mathcal {H}}_-\) and

$$\begin{aligned} V H_{{\text {target}}} V^\dag = (H_1)_{--} + (H_2)_{-+} H_0^{-1} (H_2)_{++} H_0^{-1} (H_2)_{+-} \end{aligned}$$
(7)

and also that

$$\begin{aligned} (H_1')_{--} = (H_2)_{-+} H_0^{-1} (H_2)_{+-}. \end{aligned}$$
(8)

Then \(H_{{\text {sim}}} = \Delta H_0 + \Delta ^{2/3} H_2 + \Delta ^{1/3} H_1' + H_1\) \((\Delta /2,\eta ,\epsilon )\)-simulates \(H_{{\text {target}}}\), provided that \(\Delta \geqslant O(\Lambda ^{12}/\epsilon ^3 + \Lambda ^3/\eta ^3)\).

These lemmas can be used to construct gadgets to simulate desired interactions via a mixture of design and trial and error. The intuition for first order gadgets is fairly straightforward: one just restricts to the groundspace of \(H_0\). For the higher order gadgets, there is still the restriction to this subspace, but now the lower strength interactions (\(H_1\), \(H_2\), etc.) multiply together, allowing the generation of more complex effective interactions. The \(H_0^{-1}\) terms appear to complicate this, but in practice \(H_0\) can often be chosen such that they are benign, for example when \(H_0\) is a projector.
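As a rough numerical illustration of this intuition, the following sketch builds a toy second-order gadget in the sense of Lemma 9 and compares the low-energy spectrum of \(H_{\text {sim}}\) with that of the target read off from (6). Everything specific here is an assumption made for illustration: one system qubit, one mediator qubit, \(H_0 = I \otimes | 1 \rangle \langle 1 |\), \(H_2 = B \otimes X\) and \(H_1 = C \otimes I\) for random Hermitian B and C of unit norm, so that the target is \(C - B^2\). It checks eigenvalues only, not the full conditions of Definition 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(dim):
    """Random Hermitian matrix rescaled to operator norm 1 (illustrative choice)."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = (m + m.conj().T) / 2
    return m / np.linalg.norm(m, 2)

# First tensor factor: system qubit. Second tensor factor: mediator qubit.
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
P1 = np.diag([0.0, 1.0])             # |1><1| on the mediator

B, C = random_hermitian(2), random_hermitian(2)
H0 = np.kron(I2, P1)                 # (H0)_{--} = 0 and lambda_min((H0)_{++}) = 1
H2 = np.kron(B, X)                   # flips the mediator, hence (H2)_{--} = 0
H1 = np.kron(C, I2)                  # block-diagonal w.r.t. the H_+ / H_- split

# Right-hand side of (6) for this toy gadget: C - B^2 on the system qubit.
H_target = C - B @ B

def low_energy_error(delta):
    """Largest gap between the two lowest eigenvalues of H_sim and those of H_target."""
    H_sim = delta * H0 + np.sqrt(delta) * H2 + H1
    low = np.linalg.eigvalsh(H_sim)[:2]
    return np.max(np.abs(low - np.linalg.eigvalsh(H_target)))

errors = {delta: low_energy_error(delta) for delta in (1e2, 1e4, 1e6)}
```

Inspecting `errors` shows the spectral mismatch shrinking as \(\Delta \) grows. Because \((H_2)_{++}=0\) in this toy example, the decay is faster than the general bound of Lemma 9 would suggest.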

We will often apply the simulation results in these lemmas to many individual interactions within a larger overall Hamiltonian, in parallel. For the gadgets we will use, it was shown in [17, Lemma 36] (following similar arguments in previous work, e.g. [10, 36]) that the overall simulation produced is what one would expect (i.e. a sum of the individual simulated interactions, without unexpected interference between the terms) at a cost of slightly larger interaction strengths. In addition, the simulations that we use will either associate a fixed number of ancilla (“mediator”) qudits with each interaction, or encode each logical qudit within a fixed number of physical qudits. In each such case, the overall isometry V is easily seen to be a tensor product of local isometries as required for Definition 1.

Later on, we will need a new fourth-order simulation lemma. As this is more technical to state (and its proof has some additional complications involving interference), we defer it to Sect. 3.

2.3 Example: the AKLT interaction

To see how the above simulation results can be used to prove universality, we give a simple example of how the AKLT interaction [2] can simulate the SU(3) Heisenberg interaction. The AKLT interaction \(h^{\text {AKLT}}\) is defined in local dimension \(d=3\) (spin 1) by \(h^{\text {AKLT}}:=3h+h^2\), where h is the SU(2) Heisenberg interaction defined in (3).

Lemma 11

The AKLT interaction \(h^{\text {AKLT}}:=3h+h^2\) is universal.

Proof

We will use a gadget construction to show that \(h^{\text {AKLT}}\) can simulate the SU(3) invariant interaction \(h+h^2\), which is shown to be universal in Theorem 3. We will use Lemma 9 and construct a second-order mediator qutrit gadget involving 3 mediator qutrits labelled 3, 4, 5 that will result in an effective interaction between qutrits 1 and 2. Let \(H_0 \in L(({\mathbb {C}}^d)^{\otimes 5})\) act trivially on qudits 1 and 2 as \(H_0=h^{\text {AKLT}}_{34}+h^{\text {AKLT}}_{45}+h^{\text {AKLT}}_{35}+6I\). The projector onto the ground space of \(H_0\) is of the form \(I_{12} \otimes | \psi \rangle \langle \psi |_{345}\) where

$$\begin{aligned} {|{\psi }\rangle }=\frac{1}{\sqrt{6}}\left( {|{012}\rangle }+{|{120}\rangle }+{|{201}\rangle }-{|{021}\rangle }-{|{210}\rangle }-{|{102}\rangle }\right) \end{aligned}$$

is the completely antisymmetric state on 3 qutrits. Write \(\Pi = I_{12} \otimes | \psi \rangle \langle \psi |_{345}\) for this projector, and let \(V=I_{12} \otimes {|{\psi }\rangle }_{345}\) be the isometry that maps \({|{\phi }\rangle } \mapsto {|{\phi }\rangle }_{12} {|{\psi }\rangle }_{345}\) and satisfies \(VV^{\dagger }=\Pi \). Let \(H_2=\alpha _2\left( h^{\text {AKLT}}_{13}+h^{\text {AKLT}}_{23}-\frac{8}{3}I\right) \) for some \(\alpha _2 \in {\mathbb {R}}\). The interaction graph of this gadget is pictured in Fig. 2.

Fig. 2. Interaction graph of the gadget used in Lemma 11. Thick lines indicate the heavy interactions of \(H_0\), and white circles denote the mediator qutrits (3, 4, 5). The gadget produces an effective interaction between the remaining qutrits (1, 2)

Then one can check (either by hand or using a computer algebra package) that \(\Pi H_2 \Pi =0\) and

$$\begin{aligned} -\Pi H_2 H_0^{-1} H_2 \Pi = -\frac{2\alpha _2^2}{27}\left( 23h_{12}+h_{12}^2+\frac{136}{3}I\right) \Pi . \end{aligned}$$

Let \(H_1=\alpha _1 h^{\text {AKLT}}_{12}\) for some \(\alpha _1 \in {\mathbb {R}}\) so that \(\Pi H_1\Pi =\alpha _1h^{\text {AKLT}}_{12}\Pi \). Then choosing \(\alpha _1=22\) and \(\alpha _2=\sqrt{27}\) we have

$$\begin{aligned} \Pi H_1 \Pi -\Pi H_2 H_0^{-1} H_2 \Pi =V\left( 20(h_{12}+h^2_{12})-\frac{272}{3}I\right) V^{\dagger } \end{aligned}$$

and so by Lemma 9 (second order), we can simulate the interaction \(20(h_{12}+h^2_{12})-\frac{272}{3}I\), which one can check is the SU(3) Heisenberg interaction as desired, up to rescaling and deletion of an identity term. Note that this can only produce positively-weighted interactions, but Hamiltonians of this restricted form are indeed proven universal in Theorem 3. \(\quad \square \)
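The identities used in this proof can also be checked numerically. The sketch below is one way to do so, under the assumption that h in (3) is the spin-1 Heisenberg interaction \(S^x\otimes S^x+S^y\otimes S^y+S^z\otimes S^z\) built from the standard spin-1 matrices; the helper names `site`, `heis` and `aklt` are ours and purely illustrative. It constructs \(H_0\), \(H_1\), \(H_2\) on five qutrits, forms the pseudoinverse of \(H_0\) from its eigendecomposition, and compares the second-order effective Hamiltonian with the claimed expression.

```python
import numpy as np
from functools import reduce

# Standard spin-1 matrices (ASSUMED convention for the h of (3)).
r = 1 / np.sqrt(2)
SX = r * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
SY = r * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
SZ = np.diag([1, 0, -1]).astype(complex)
N = 5                                   # qutrits 1,2 (logical) and 3,4,5 (mediators)
ID = np.eye(3 ** N)

def site(op, i):
    """Embed a single-qutrit operator on qutrit i (1-indexed) of N qutrits."""
    mats = [np.eye(3)] * N
    mats[i - 1] = op
    return reduce(np.kron, mats)

def heis(i, j):
    """Heisenberg interaction h_{ij}."""
    return sum(site(S, i) @ site(S, j) for S in (SX, SY, SZ))

def aklt(i, j):
    """AKLT interaction 3h + h^2 on qutrits i and j."""
    h = heis(i, j)
    return 3 * h + h @ h

H0 = aklt(3, 4) + aklt(4, 5) + aklt(3, 5) + 6 * ID
evals, evecs = np.linalg.eigh(H0)
ground = np.abs(evals) < 1e-8
G, E = evecs[:, ground], evecs[:, ~ground]
Pi = G @ G.conj().T                          # projector onto the ground space of H0
H0_pinv = (E / evals[~ground]) @ E.conj().T  # Moore-Penrose pseudoinverse of H0

alpha1, alpha2 = 22.0, np.sqrt(27.0)
H1 = alpha1 * aklt(1, 2)
H2 = alpha2 * (aklt(1, 3) + aklt(2, 3) - (8 / 3) * ID)

h12 = heis(1, 2)
second_order = -Pi @ H2 @ H0_pinv @ H2 @ Pi
effective = Pi @ H1 @ Pi + second_order
claimed = (20 * (h12 + h12 @ h12) - (272 / 3) * ID) @ Pi
```

Here `ground.sum()` counts the ground-space dimension of \(H_0\) on all five qutrits (9, i.e. a unique state on the mediators tensored with the two untouched qutrits), and `np.allclose(effective, claimed)` compares the two sides of the final display.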

3 Fourth-Order Perturbative Gadgets

We will need the following lemma, which we prove for the first time here (and hence state a bit more generally than the above simulation lemmas, although we will only need \(\epsilon =0\) on the right-hand side of (9)). The proof is technical, and hence (as with the subsequent lemma) deferred to Appendix A.

Lemma 12

(Fourth-order simulation). Let \(H_0\), \(H_1\), \(H_2\), \(H_3\), \(H_4\) be Hamiltonians acting on the same space, such that: \(\max \{\Vert H_1\Vert ,\Vert H_2\Vert ,\Vert H_3\Vert ,\Vert H_4\Vert \} \leqslant \Lambda \); \(H_2\) and \(H_3\) are block-diagonal with respect to the split \({\mathcal {H}}_+ \oplus {\mathcal {H}}_-\); \((H_4)_{--}=0\). Suppose there exists a local isometry V such that \({\text {Im}}(V)={\mathcal {H}}_-\) and

$$\begin{aligned} \Vert V H_{{\text {target}}} V^\dag - \Pi _-\left( H_1 +H_4 H_0^{-1} H_2 H_0^{-1} H_4 -H_4 H_0^{-1} H_4 H_0^{-1} H_4 H_0^{-1} H_4\right) \Pi _- \Vert \leqslant \epsilon /2 \end{aligned}$$
(9)

and also that

$$\begin{aligned} (H_2)_{--} = \Pi _- H_4 H_0^{-1}H_4 \Pi _- \quad \text { and } \quad (H_3)_{--} = -\Pi _- H_4 H_0^{-1}H_4 H_0^{-1}H_4 \Pi _-. \end{aligned}$$
(10)

Then \(H_{{\text {sim}}} = \Delta H_0 + \Delta ^{3/4} H_4 + \Delta ^{1/4} H_3 + \Delta ^{1/2}H_2+H_1\) \((\Delta /2,\eta ,\epsilon )\)-simulates \(H_{{\text {target}}}\), provided that \(\Delta \geqslant O(\Lambda ^{20}/\epsilon ^4+\Lambda ^4/\eta ^4)\).

For fourth-order gadgets, unlike the gadgets analysed in previous work, it is unfortunately not the case that one can disregard interference between different gadgets applied in parallel; there are additional terms generated by interference between gadgets. We calculate this interference in the following lemma.

Lemma 13

Consider a Hilbert space \({\mathcal {H}}={\mathcal {H}}_0 \otimes \bigotimes _{i\geqslant 1} {\mathcal {H}}_i\) with multiple fourth-order mediator gadgets labelled by \(i\geqslant 1\), each with heavy Hamiltonian \(H_0^{(i)}\) which acts non-trivially only on \({\mathcal {H}}_i\), and interaction terms \(H_1^{(i)}\), \(H_2^{(i)}\), \(H_3^{(i)}\), \(H_4^{(i)}\) which act non-trivially only on \({\mathcal {H}}_i \otimes {\mathcal {H}}_0\). Let \(\Pi _-^{(i)}\) denote the projector onto the ground space of \(H_0^{(i)}\), and \(\Pi _+^{(i)} = I - \Pi _-^{(i)}\). Suppose that for each i, these terms satisfy the conditions of Lemma 12; in particular, \(H_0^{(i)} \Pi _-^{(i)} = 0\), \(H_2^{(i)}\) and \(H_3^{(i)}\) are block diagonal with respect to the \(\Pi _-^{(i)}\), \(\Pi _+^{(i)}\) split, \(\Pi ^{(i)}_- H_4^{(i)} \Pi ^{(i)}_-=0\) and

$$\begin{aligned}&\Pi _-^{(i)}H_2^{(i)}\Pi _-^{(i)} = \Pi ^{(i)}_- H_4^{(i)} (H_0^{(i)})^{-1}H_4^{(i)} \Pi ^{(i)}_- \text { and }\\&\quad \Pi ^{(i)}_-H_3^{(i)} \Pi _-^{(i)} = -\Pi ^{(i)}_- H_4^{(i)} (H_0^{(i)})^{-1}H_4^{(i)} (H_0^{(i)})^{-1}H_4^{(i)} \Pi ^{(i)}_-. \end{aligned}$$

For each \(j \in \{0,\dots ,4\}\), let \(H_j = \sum _i H_j^{(i)}\), and let \(\Lambda \geqslant \max \{\Vert H_1\Vert ,\Vert H_2\Vert ,\Vert H_3\Vert ,\Vert H_4\Vert \}\).

Suppose there exists a local isometry V such that \({\text {Im}}(V)\) is the ground space of \(H_0\) and also \(\Vert V H_{{\text {target}}}V^{\dagger }-M\Vert \leqslant \epsilon /2\), where

$$\begin{aligned} M&=\sum _i \Pi _-\left( H_1^{(i)}+H_4^{(i)} (H_0^{(i)})^{-1} H_2^{(i)} (H_0^{(i)})^{-1} H_4^{(i)}\right. \\&\left. \quad -H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(i)}\right) \Pi _-\\&\quad +\sum _{i\ne j}\Pi _-\Big (H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(j)}(H_0^{(j)})^{-1}H_4^{(j)} (H_0^{(i)})^{-1} H_4^{(i)}\\&\quad -H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(j)} (H_0^{(i)}+H_0^{(j)})^{-1} H_4^{(j)} (H_0^{(i)})^{-1} H_4^{(i)}\\&\quad -H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(j)} (H_0^{(i)}+H_0^{(j)})^{-1} H_4^{(i)} (H_0^{(j)})^{-1} H_4^{(j)}\Big )\Pi _- \end{aligned}$$

and \(\Pi _-\) is the projector onto the ground space of \(H_0\).

Then \(H_{{\text {sim}}} = \Delta H_0 + \Delta ^{3/4} H_4 + \Delta ^{1/4} H_3 + \Delta ^{1/2}H_2+H_1\) \((\Delta /2,\eta ,\epsilon )\)-simulates \(H_{{\text {target}}}\), provided that \(\Delta \geqslant O(\Lambda ^{20}/\epsilon ^4+\Lambda ^4/\eta ^4)\).

Note that the first sum in M (over single gadgets i) is what one would expect when summing the contributions of each of the gadgets separately. The remaining terms, summed over pairs \(i \ne j\), are in general not zero and may be thought of as the cross-gadget interference.

We will only need to use Lemma 13 via the following simplified corollary.

Corollary 14

Suppose the conditions of Lemma 13 hold, and in addition \(H_0^{(i)}H_4^{(i)}\Pi _-=H_4^{(i)}\Pi _-\) for all i. Then the expression for M is given by

$$\begin{aligned} M= & {} \sum _i \Pi _-\left( H_1^{(i)}+H_4^{(i)} H_2^{(i)} H_4^{(i)} -H_4^{(i)}H_4^{(i)} (H_0^{(i)})^{-1} H_4^{(i)} H_4^{(i)}\right) \\&\quad \Pi _- -\frac{1}{2} \sum _{i<j}\Pi _-\left[ H_4^{(i)},H_4^{(j)}\right] ^2 \Pi _- \end{aligned}$$

Proof

Fix a pair \(i \ne j\), and let \(H_4^{(j)}= \sum _{\alpha } A_\alpha \otimes B_{\alpha }\) where \(A_{\alpha }\) acts non-trivially only on \({\mathcal {H}}_j\) and \(B_{\alpha }\) acts non-trivially only on \({\mathcal {H}}_0\). By the additional assumption of the present corollary,

$$\begin{aligned} H_0^{(i)} H_4^{(i)} H_4^{(j)} \Pi _-&= H_0^{(i)} H_4^{(i)} \left( \sum _{\alpha } A_\alpha \otimes B_{\alpha }\right) \Pi _- = \sum _{\alpha } (A_{\alpha } \otimes I)H_0^{(i)} H_4^{(i)} \Pi _- (I \otimes B_{\alpha }) \end{aligned}$$
(11)
$$\begin{aligned}&= \sum _{\alpha } (A_{\alpha } \otimes I) H_4^{(i)} \Pi _- (I \otimes B_{\alpha }) = H_4^{(i)} \left( \sum _{\alpha } A_\alpha \otimes B_{\alpha }\right) \Pi _- \end{aligned}$$
(12)
$$\begin{aligned}&=H_4^{(i)} H_4^{(j)} \Pi _- \end{aligned}$$
(13)

where it is possible to commute \(A_\alpha \) and \(B_\alpha \) to the front and back respectively because \(H_0^{(i)}\), \(H_4^{(i)}\) act trivially on \({\mathcal {H}}_j\) and \(\Pi _-\) acts trivially on \({\mathcal {H}}_0\). We also have \(H_0^{(j)} H_4^{(i)} H_4^{(j)} \Pi _- =H_4^{(i)} H_0^{(j)} H_4^{(j)} \Pi _-= H_4^{(i)} H_4^{(j)} \Pi _- \) since \([H_0^{(j)}, H_4^{(i)}]=0\).

Therefore \((H_0^{(i)}+H_0^{(j)})^{-1}H_4^{(i)}H_4^{(j)}\Pi _-=\frac{1}{2}H_4^{(i)}H_4^{(j)}\Pi _-\), so the expression for the cross-gadget interference from Lemma 13 simplifies to

$$\begin{aligned}&\sum _{i \ne j} \Pi _- \left( H_4^{(i)} H_4^{(j)} H_4^{(j)} H_4^{(i)} -\frac{1}{2}\left( H_4^{(i)} H_4^{(j)} H_4^{(j)} H_4^{(i)} +H_4^{(i)} H_4^{(j)} H_4^{(i)} H_4^{(j)} \right) \right) \Pi _-\\&\quad =\frac{1}{2}\sum _{i \ne j} \Pi _- \left( H_4^{(i)} H_4^{(j)} H_4^{(j)} H_4^{(i)} -H_4^{(i)} H_4^{(j)} H_4^{(i)} H_4^{(j)} \right) \Pi _-\\&\quad =-\frac{1}{2} \sum _{i<j}\Pi _-\left[ H_4^{(i)},H_4^{(j)}\right] ^2 \Pi _- \end{aligned}$$

where we note that the sum over \(i \ne j\) includes both cases \(i < j\) and \(i>j\). \(\quad \square \)
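The last two equalities in the display above are purely algebraic and hold for arbitrary matrices, with or without the projectors \(\Pi _-\). A quick numerical sanity check, using random Hermitian matrices as hypothetical stand-ins for the \(H_4^{(i)}\) (an assumption made only for illustration), might look as follows.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(dim):
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

# Hypothetical stand-ins for H_4^{(1)}, H_4^{(2)}, H_4^{(3)}.
H4 = [random_hermitian(4) for _ in range(3)]
n = len(H4)

# (1/2) * sum over ordered pairs i != j of (H_i H_j H_j H_i - H_i H_j H_i H_j)
lhs = 0.5 * sum(
    H4[i] @ H4[j] @ H4[j] @ H4[i] - H4[i] @ H4[j] @ H4[i] @ H4[j]
    for i in range(n) for j in range(n) if i != j
)

# -(1/2) * sum over unordered pairs i < j of [H_i, H_j]^2
rhs = -0.5 * sum(
    comm(H4[i], H4[j]) @ comm(H4[i], H4[j])
    for i in range(n) for j in range(i + 1, n)
)
```

Since the commutator of two Hermitian matrices is anti-Hermitian, each \(-[H_4^{(i)},H_4^{(j)}]^2\) is positive semidefinite, so in this sketch `rhs` has no negative eigenvalues.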

An example of the condition \(H_0^{(i)}H_4^{(i)}\Pi _-=H_4^{(i)}\Pi _-\) holding is when \(H_0^{(i)}\) is a projector. In this case the condition (from Lemma 13) that \(\Pi ^{(i)}_- H_4^{(i)} \Pi ^{(i)}_-=0\) ensures that \(H_4^{(i)}\) maps out of the ground space of \(H_0^{(i)}\) and into the +1 eigenspace of \(H_0^{(i)}\).

4 LA-Universal Hamiltonians

We first prove LA-universality (or otherwise) of various classes of interactions, before bringing these results together into a full classification theorem by showing that every interaction fits into one of these classes. Before embarking on the proof, we observe that for any interaction h, we can delete its 1-local part by using our free 1-local terms. This corresponds to replacing h with

$$\begin{aligned} h' = h - \frac{I}{d}\otimes {{\,\mathrm{Tr}\,}}_1(h) - {{\,\mathrm{Tr}\,}}_2(h)\otimes \frac{I}{d} + {{\,\mathrm{Tr}\,}}(h)\frac{I\otimes I}{d^2}. \end{aligned}$$
(14)

We call \(h'\) the 2-local part of h.

Definition 3

Let \(\{T^a\}_{a=1}^{d^2}\) be a basis of Hermitian \(d \times d\) matrices, and let the 2-local part of h be \(h'=\sum _{a,b} M_{ab} T^a \otimes T^b\) for some real \(d^2 \times d^2\) matrix M. We define the 2-local rank of h to be the rank of M.

Note that this definition is independent of the choice of basis \(T^a\). Suppose we instead write \(h'=\sum _{a,b}{\tilde{M}}_{ab} S^{a}\otimes S^{\prime b}\) for two other bases \(\{S^a\}_a\) and \(\{S^{\prime b}\}_b\) of Hermitian \(d\times d\) matrices. Since these are bases there must exist invertible matrices R and \(R'\) such that \(T^a=\sum _{b} R_{ab} S^b=\sum _{b} R'_{ab} S^{\prime b}\). Then

$$\begin{aligned} h'&=\sum _{a,b}{\tilde{M}}_{ab} S^{a}\otimes S^{\prime b} \\&=\sum _{c,d} M_{cd} T^c \otimes T^d=\sum _{a,b}( \sum _{c,d}R_{ca} M_{cd} R'_{db}) S^a \otimes S^{\prime b} \end{aligned}$$

and thus \({{\,\mathrm{rank}\,}}(\tilde{M})={{\,\mathrm{rank}\,}}(R^{T}MR') ={{\,\mathrm{rank}\,}}(M)\) since R and \(R'\) are both full rank.
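To make the definition concrete, here is a minimal sketch of how the 2-local rank could be computed numerically. The function names `two_local_part` and `two_local_rank` are ours, not taken from any library. Rather than fixing a Hermitian basis \(T^a\), the sketch uses the realigned matrix with entries \(\langle i,j|h'|k,l\rangle \) arranged as rows (i, k) and columns (j, l); by the same change-of-basis argument as above, its rank equals that of M.

```python
import numpy as np

def two_local_part(h, d):
    """Remove the 1-local part of a two-qudit operator h, following (14)."""
    t = h.reshape(d, d, d, d)                 # t[i, j, k, l] = <i,j| h |k,l>
    tr1 = np.einsum('ijil->jl', t)            # Tr_1(h): operator on qudit 2
    tr2 = np.einsum('ijkj->ik', t)            # Tr_2(h): operator on qudit 1
    eye = np.eye(d)
    return (h - np.kron(eye / d, tr1) - np.kron(tr2, eye / d)
            + np.trace(h) * np.eye(d * d) / d ** 2)

def two_local_rank(h, d, tol=1e-9):
    """Rank of the coefficient matrix M of the 2-local part of h."""
    hp = two_local_part(h, d).reshape(d, d, d, d)
    realigned = hp.transpose(0, 2, 1, 3).reshape(d * d, d * d)
    return int(np.linalg.matrix_rank(realigned, tol=tol))

# Qubit examples built from Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

heisenberg = np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)
ising_plus_fields = np.kron(Z, Z) + np.kron(Z, I2) + 0.3 * np.kron(I2, X)
only_fields = np.kron(Z, I2) + np.kron(I2, Z)
```

For these examples the Heisenberg interaction has 2-local rank 3, the Ising interaction with added local fields has rank 1, and a purely 1-local operator has rank 0.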

We comment on our use of subscript notation. For a local operator such as a 2-local interaction h, the subscript notation denotes which subsystem the interaction acts on, so \(h_{ij}\) denotes the interaction h acting on qudits i and j. However for a matrix of coefficients such as the matrix M above, the subscript notation is used to index the entries of the matrix, so \(M_{ab}\) denotes the entry in the ath row and bth column of the matrix M.

We now move on to the first case of the proof, diagonal interactions.

4.1 Interactions diagonalisable by local unitaries

Lemma 15

Let h be a nonzero diagonal 2-qudit interaction. If the 2-local rank of h is \(\geqslant 2\), then h is LA-universal; otherwise, h is LA-stoquastic-universal.

Proof

First note that we can use 1-local terms to replace h with its 2-local part, as in (14). This still results in a diagonal interaction and allows us to assume that \({{\,\mathrm{Tr}\,}}_1(h)=0={{\,\mathrm{Tr}\,}}_2(h)\). Let h be given by \(h=\sum _{i,j=1}^d A_{ij} | i \rangle \langle i | \otimes | j \rangle \langle j |\) for some \(d \times d\) matrix A. Then the 2-local rank of h is given by \({{\,\mathrm{rank}\,}}(A)\). Next observe that we can assume that the interaction h is either symmetric or antisymmetric with respect to permuting the qudits on which it acts, because we can apply it in either direction, with positive or negative weights. So we obtain either \(h_{ij} + h_{ji}\) or \(h_{ij}-h_{ji}\), corresponding to mapping A either to \(A+A^T\) or \(A-A^T\). This cannot affect the condition on the rank of A, because

$$\begin{aligned} {{\,\mathrm{rank}\,}}(A) = {{\,\mathrm{rank}\,}}((A+A^T) + (A-A^T)) \leqslant {{\,\mathrm{rank}\,}}(A+A^T) + {{\,\mathrm{rank}\,}}(A-A^T); \end{aligned}$$

if \({{\,\mathrm{rank}\,}}(A) \geqslant 2\), then either \(\max \{{{\,\mathrm{rank}\,}}(A+A^T),{{\,\mathrm{rank}\,}}(A-A^T)\} \geqslant 2\), or \({{\,\mathrm{rank}\,}}(A+A^T) = {{\,\mathrm{rank}\,}}(A-A^T) = 1\); but this latter possibility cannot occur because \(A-A^T\) is skew-symmetric, so \({{\,\mathrm{rank}\,}}(A-A^T) \ne 1\).

We will apply the first order perturbation theory Lemma 8 by using heavily-weighted local terms to effectively project each subsystem on which h acts into a 2-dimensional subspace, which will encode a qubit. Such a projection can be described by a \(2 \times d\) matrix P. We aim to produce an effective 2-qubit interaction \(h'\) which is universal. As we can apply arbitrary local terms, we can project each qudit onto an arbitrary 2-dimensional subspace S by choosing a “heavy” Hamiltonian \(H_0 = \sum _i H^P_i\) in Lemma 8 such that \(H^P\) has S as its ground space. The local isometry V in the lemma is just given by \(P^{\dagger }\).

The result of projecting h is the 2-qubit interaction

$$\begin{aligned} h' = \sum _{i,j=1}^d A_{ij} \left( P | i \rangle \langle i | P^{\dag }\right) \otimes \left( P | j \rangle \langle j | P^\dag \right) = \sum _{i,j=1}^d A_{ij} \left( \sum _{k=0}^3 \beta _{ik} \sigma ^k \right) \otimes \left( \sum _{\ell =0}^3 \beta _{j\ell } \sigma ^\ell \right) , \end{aligned}$$

for some real coefficients \(\beta _{ik}\) such that

$$\begin{aligned} \beta _{ik} = \frac{1}{2} {{\,\mathrm{Tr}\,}}[P | i \rangle \langle i | P^\dag \sigma ^k]. \end{aligned}$$

Reordering the sums, we obtain

$$\begin{aligned} h' = \sum _{k,\ell =0}^3 \left( \sum _{i,j=1}^d \beta _{ik} A_{ij} \beta _{j\ell } \right) \sigma ^k \otimes \sigma ^\ell = \sum _{k,\ell =0}^3 \langle \beta _k|A|\beta _{\ell } \rangle \sigma ^k \otimes \sigma ^\ell , \end{aligned}$$

where we define the unnormalised vector \({|{\beta _k}\rangle } = \sum _{i=1}^d \beta _{ik} {|{i}\rangle }\). We can write down explicit expressions for these vectors as

$$\begin{aligned} \beta _{i1} = {\text {Re}}( P_{1i}^* P_{2i} ),\;\;\;\; \beta _{i2} = {\text {Im}}( P_{1i}^* P_{2i} ),\;\;\;\; \beta _{i3} = \frac{1}{2}\left( |P_{1i}|^2 - |P_{2i}|^2\right) . \end{aligned}$$

It was shown in [16, 17, Theorem 44] that an interaction of the form \(\sum _{k,\ell =1}^3 M_{k\ell } \sigma ^k \otimes \sigma ^\ell \) is universal if the \(3 \times 3\) matrix M has rank at least 2. Our goal will be to choose the vectors \({|{\beta _k}\rangle }\) to achieve this.

If A is symmetric, we can expand it as a weighted sum of projectors onto real, orthonormal eigenvectors \({|{\eta _i}\rangle }\); as \({{\,\mathrm{rank}\,}}(A) \geqslant 2\), there exist \({|{\eta _1}\rangle }\), \({|{\eta _2}\rangle }\) with nonzero eigenvalues. If A is skew-symmetric, there exist real, orthonormal vectors \({|{\eta _i}\rangle }\) such that \(\langle \eta _i|A|\eta _i \rangle = 0\) for all i, and \(\langle \eta _1|A|\eta _2 \rangle = -\langle \eta _2|A|\eta _1 \rangle \ne 0\) (see e.g. [43]). Hence, in either the symmetric or skew-symmetric case, in order to achieve that M has rank at least 2, it is sufficient to have \({|{\beta _1}\rangle } = {|{\eta _1}\rangle }\) and \({|{\beta _3}\rangle } = {|{\eta _2}\rangle }\). This fixes a \(2 \times 2\) submatrix of M to be either diagonal (and rank 2), or proportional to \(\left( {\begin{matrix} 0 &{} 1\\ -1 &{} 0 \end{matrix}} \right) \). So we want to produce a matrix P that achieves \(\beta _{i1} = \langle i|\eta _1 \rangle \), \(\beta _{i3} = \langle i|\eta _2 \rangle \) for all i.

If we can find a real matrix P that achieves this, it will automatically have orthonormal rows (up to an overall normalising constant), and also the entries of M outside a \(2\times 2\) submatrix will be zero. To see this, first note that \({|{\eta _1}\rangle }\) and \({|{\eta _2}\rangle }\) are orthogonal to \({|{+}\rangle } = \sum _{i=1}^d {|{i}\rangle }\). This holds because \({{\,\mathrm{Tr}\,}}_1(h) = \sum _{j=1}^d \left( \sum _{i=1}^d A_{ij}\right) | j \rangle \langle j | = 0\), and similarly for \({{\,\mathrm{Tr}\,}}_2(h)\), so \(A {|{+}\rangle } = A^T {|{+}\rangle } = 0\). So as \({|{\beta _1}\rangle } = {|{\eta _1}\rangle }\) and \({|{\beta _3}\rangle } = {|{\eta _2}\rangle }\), \(\sum _i \beta _{i1} = \sum _i \beta _{i3} = 0\), implying that \(\sum _i P_{1i} P_{2i} = 0\) and \(\sum _i P_{1i}^2 = \sum _i P_{2i}^2\). We can find an explicit expression for each element of P by solving the simultaneous equations

$$\begin{aligned} P_{1i} P_{2i} = \gamma _i,\;\;\;\; \frac{1}{2}\left( P_{1i}^2 - P_{2i}^2\right) = \delta _i, \end{aligned}$$

where we write \(\gamma _i = \langle i|\eta _1 \rangle \), \(\delta _i = \langle i|\eta _2 \rangle \). It can readily be verified that the following is a valid solution:

$$\begin{aligned} {\left\{ \begin{array}{ll} P_{1i} = 0, P_{2i} = \sqrt{-2\delta _i} &{} \text {if }\gamma _i = 0\text { and }\delta _i \leqslant 0\\ P_{1i} = \sqrt{\delta _i + \sqrt{\gamma _i^2+\delta _i^2} }, P_{2i} = \frac{\gamma _i}{\sqrt{\delta _i + \sqrt{\gamma _i^2+\delta _i^2}}} &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$

Thus h is LA-universal. This completes the proof of the case \({{\,\mathrm{rank}\,}}(A) \geqslant 2\). If \({{\,\mathrm{rank}\,}}(A) = 1\), we know that there exists an eigenvector \({|{\eta _1}\rangle }\) with nonzero eigenvalue, and can take \({|{\eta _2}\rangle }\) to be an arbitrary orthogonal vector. Almost all the above steps go through, but we end up producing a matrix M such that \({{\,\mathrm{rank}\,}}(M) \geqslant 1\). This case is known to be stoquastic-universal [10, 17]. \(\quad \square \)
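The explicit solution for P used in the proof can be checked directly. The following sketch is illustrative only: the function name `projection_matrix` and the particular choice of \({|{\eta _1}\rangle }\), \({|{\eta _2}\rangle }\) for \(d=4\) are assumptions, chosen so that both branches of the case distinction are exercised. It builds the real \(2 \times d\) matrix P from \(\gamma _i\) and \(\delta _i\), then recomputes \(\beta _{i1}\) and \(\beta _{i3}\) and the Gram matrix of the rows of P.

```python
import numpy as np

def projection_matrix(gamma, delta):
    """Real 2 x d matrix P with P_1i P_2i = gamma_i and (P_1i^2 - P_2i^2)/2 = delta_i."""
    gamma = np.asarray(gamma, dtype=float)
    delta = np.asarray(delta, dtype=float)
    P = np.zeros((2, len(gamma)))
    for i, (g, dl) in enumerate(zip(gamma, delta)):
        if g == 0 and dl <= 0:
            P[1, i] = np.sqrt(-2 * dl)
        else:
            root = np.sqrt(dl + np.hypot(g, dl))   # sqrt(delta + sqrt(gamma^2 + delta^2))
            P[0, i] = root
            P[1, i] = g / root
    return P

# Hypothetical example for d = 4: orthonormal vectors, both orthogonal to |+>.
eta1 = np.array([1, -1, 0, 0]) / np.sqrt(2)
eta2 = np.array([1, 1, -1, -1]) / 2

P = projection_matrix(eta1, eta2)
beta1 = P[0] * P[1]                       # beta_{i1} for real P
beta3 = 0.5 * (P[0] ** 2 - P[1] ** 2)     # beta_{i3}
gram = P @ P.T                            # rows orthogonal with equal norms expected
```

In this example `beta1` and `beta3` reproduce `eta1` and `eta2`, and `gram` is a multiple of the identity, consistent with the claim that the rows of P are orthonormal up to an overall constant.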

Lemma 16

Let \(h=A\otimes A\) be a 2-qudit interaction such that A has three distinct eigenvalues. Then h is LA-universal.

Proof

We work in the eigenbasis of A, so that \(A=\sum _i \lambda _i | i \rangle \langle i |\). By adding 1-local terms of the form \(\mu A\otimes I +\mu I \otimes A+\mu ^2 I\otimes I\), which complete the square to give \((A+\mu I)\otimes (A + \mu I)\), we can shift the spectrum of A by any constant \(\mu \). Since A has three distinct eigenvalues, we may therefore assume wlog (relabelling eigenvectors if necessary) that A has eigenvalues \(\lambda _0<0\) and \(\lambda _1>0\) such that \(\lambda _0 + \lambda _1 > 0\).

We will use a third order mediator qudit gadget involving a mediator qudit labelled 3, which will simulate an interaction between two other qudits 1 and 2. The resulting effective interaction will be of the form shown to be universal in Lemma 15. Let \(H_2= A_1A_3+A_2 A_3\) and let \(H_0=I \otimes \left( I-| \psi \rangle \langle \psi |\right) \) in \( L(({\mathbb {C}}^d)^{\otimes 3})\) act non-trivially only on the mediator qudit 3, where \({|{\psi }\rangle }=\sqrt{\lambda _1}{|{0}\rangle }+\sqrt{-\lambda _0}{|{1}\rangle }\), up to normalisation. The interaction graph of this gadget is pictured in Fig. 3.

Fig. 3. Interaction graph of the gadget used in Lemma 16. \(H_0\) acts on the mediator qudit 3, which results in an effective interaction between the qudits 1 and 2

Note that \({|{\psi }\rangle }\) has been chosen so that

$$\begin{aligned} {\langle {\psi }|}A{|{\psi }\rangle }=0,\;\;\;\; {\langle {\psi }|}A^2{|{\psi }\rangle }>0,\;\;\;\; {\langle {\psi }|}A^3{|{\psi }\rangle }> 0, \end{aligned}$$
(15)

which implies that \((H_2)_{--}={\langle {\psi }|}A{|{\psi }\rangle } (A_1+A_2)\otimes | \psi \rangle \langle \psi |=0\).

Let \(H_1'={\langle {\psi }|}A^2{|{\psi }\rangle }(2A_1A_2 +A_1^2+A_2^2)\) so that

$$\begin{aligned} (H_2)_{-+} H_0^{-1} (H_2)_{+-}= {\langle {\psi }|}A^2{|{\psi }\rangle }(A_1+A_2)^2\otimes | \psi \rangle \langle \psi |=(H_1')_{--} \end{aligned}$$

as required, where we have used the fact that \(H_0^{-1}=H_0\) (since \(H_0\) is a projector) and \(H_0A{|{\psi }\rangle }=A{|{\psi }\rangle }\) (since \(A{|{\psi }\rangle }\) and \({|{\psi }\rangle }\) are orthogonal as shown in (15)). Finally we calculate the third order term: \((H_2)_{-+} H_0^{-1} (H_2)_{++} H_0^{-1} (H_2)_{+-} ={\langle {\psi }|}A^3{|{\psi }\rangle } (A_1+A_2)^3 \otimes | \psi \rangle \langle \psi |\).

Let \(H_1=-(A_1^3+A_2^3){\langle {\psi }|}A^3{|{\psi }\rangle }\) and let \(V:{|{\phi }\rangle }\rightarrow {|{\phi }\rangle }_{12} \otimes {|{\psi }\rangle }_3\) so that

$$\begin{aligned} (H_2)_{-+} H_0^{-1} (H_2)_{++} H_0^{-1} (H_2)_{+-}+(H_1)_{--}=V\left( 3{\langle {\psi }|}A^3{|{\psi }\rangle } (A_1\otimes A_2^2+A_1^2 \otimes A_2) \right) V^{\dagger } \end{aligned}$$

Therefore by Lemma 10 (third order) we can simulate an interaction proportional to \(A\otimes A^2 +A^2 \otimes A\) which is universal by Lemma 15 unless \(A^2=\lambda A+ \mu I\) for some \(\lambda , \mu \in {\mathbb {R}}\). But if A has three distinct eigenvalues, then it cannot be a root of any polynomial of degree 2. \(\quad \square \)
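The conditions in (15) and the form of the second- and third-order terms can be verified numerically for a concrete choice of A. The sketch below assumes a hypothetical qutrit operator with spectrum \((-1, 2, 1/2)\), which satisfies \(\lambda _0<0<\lambda _1\) and \(\lambda _0+\lambda _1>0\), and takes \({|{\psi }\rangle }\) normalised so that \(I-| \psi \rangle \langle \psi |\) is a projector. It confirms that, after adding \((H_1)_{--}\), the third-order term is proportional to \(A_1 A_2^2+A_1^2A_2\), with constant \(3{\langle {\psi }|}A^3{|{\psi }\rangle }\) in this normalisation.

```python
import numpy as np

# Hypothetical spectrum with lam0 < 0 < lam1 and lam0 + lam1 > 0, plus a third eigenvalue.
lam = np.array([-1.0, 2.0, 0.5])
A = np.diag(lam)
psi = np.array([np.sqrt(lam[1]), np.sqrt(-lam[0]), 0.0])
psi /= np.linalg.norm(psi)              # normalised so that I - |psi><psi| is a projector

# Moments <psi|A^k|psi> for k = 1, 2, 3, as in (15).
m1, m2, m3 = (psi @ np.linalg.matrix_power(A, k) @ psi for k in (1, 2, 3))

I3 = np.eye(3)

def on(op1, op2, op3):
    """Tensor product of operators on qudits 1, 2 and the mediator 3."""
    return np.kron(np.kron(op1, op2), op3)

proj = np.outer(psi, psi)
Pm = on(I3, I3, proj)                   # Pi_-: mediator in state |psi>
H0 = on(I3, I3, I3 - proj)              # a projector, equal to Pi_+ and to its own pseudoinverse
A1, A2, A3 = on(A, I3, I3), on(I3, A, I3), on(I3, I3, A)
H2 = (A1 + A2) @ A3
H1 = -m3 * (np.linalg.matrix_power(A1, 3) + np.linalg.matrix_power(A2, 3))

# Since H0 = H0^{-1} = Pi_+, inserting H0 implements both the projection and the inverse.
second = Pm @ H2 @ H0 @ H2 @ Pm
third = Pm @ H2 @ H0 @ H2 @ H0 @ H2 @ Pm
combined = third + Pm @ H1 @ Pm
expected = 3 * m3 * (A1 @ A2 @ A2 + A1 @ A1 @ A2) @ Pm
```

With these values the moments are `m1 = 0`, `m2 = 2` and `m3 = 2`, and `combined` agrees with `expected`.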

Lemma 17

Let \(h=A\otimes A\) be a 2-qudit interaction such that A is not of the form \(a| \psi \rangle \langle \psi |+bI\) for any \({|{\psi }\rangle } \in {\mathbb {C}}^d\), and \(a,b \in {\mathbb {R}}\). Then h is LA-universal.

Proof

By assumption A is not proportional to the identity so has at least two distinct eigenvalues. If A has three distinct eigenvalues then h is LA-universal by Lemma 16. It remains to consider the case where A has exactly two eigenvalues \(\lambda _1 \ne \lambda _2\). Since \(A\ne a| \psi \rangle \langle \psi |+bI\), there must be at least two orthonormal eigenvectors for each of the two eigenvalues of A, and it must be the case that \(d\geqslant 4\). Let \({|{\psi _i}\rangle }\) and \({|{\phi _i}\rangle }\) be orthonormal eigenvectors with eigenvalue \(\lambda _i\) for \(i \in \{1,2\}\).

We will use Lemma 8 (first order) to effectively project into a 3-dimensional subspace of each qudit, such that the effective 2-qutrit interaction is universal by Lemma 16.

Let P be the projector onto \(S={{\,\mathrm{span}\,}}\{{|{\psi _1}\rangle },{|{\psi _2}\rangle },\frac{{|{\phi _1}\rangle }+{|{\phi _2}\rangle }}{\sqrt{2}}\}\), and let \(H_0=2I-P_1-P_2 \in L(({\mathbb {C}}^d)^{\otimes 2})\) be a Hamiltonian on two qudits with groundspace \(S^{\otimes 2}\). Let \(V:({\mathbb {C}}^3)^{\otimes 2} \rightarrow ({\mathbb {C}}^d)^{\otimes 2}\) be the isometry that maps onto the groundspace of \(H_0\),

$$\begin{aligned} V=\left( {|{\psi _1}\rangle }{\langle {0}|} +{|{\psi _2}\rangle }{\langle {1}|} + \left( \frac{{|{\phi _1}\rangle }+{|{\phi _2}\rangle }}{\sqrt{2}}\right) {\langle {2}|}\right) ^{\otimes 2}. \end{aligned}$$

Let \(H_1=A\otimes A\) so that

$$\begin{aligned} (H_1)_{--}=(PAP)^{\otimes 2}&=\left[ \lambda _1 | \psi _1 \rangle \langle \psi _1 | +\lambda _2 | \psi _2 \rangle \langle \psi _2 | + \frac{\lambda _1+\lambda _2}{2}\left( \frac{{|{\phi _1}\rangle }+{|{\phi _2}\rangle }}{\sqrt{2}}\right) \left( \frac{{\langle {\phi _1}|}+{\langle {\phi _2}|}}{\sqrt{2}}\right) \right] ^{\otimes 2}\\&=V \left( \lambda _1 | 0 \rangle \langle 0 | +\lambda _2 | 1 \rangle \langle 1 | + \frac{\lambda _1+\lambda _2}{2}| 2 \rangle \langle 2 | \right) ^{\otimes 2}V^{\dagger }. \end{aligned}$$

Therefore by Lemma 8 (first order), we can simulate an interaction \(B\otimes B\), where B is a qutrit operator with three distinct eigenvalues \(\lambda _1, \lambda _2, \frac{\lambda _1+\lambda _2}{2}\); and so \(B \otimes B\) is LA-universal by Lemma 16. \(\quad \square \)
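The projection step in this proof can be checked numerically. The following is a minimal sketch, assuming \(d=4\) and a diagonal A with the arbitrarily chosen eigenvalues \(\lambda _1=1\), \(\lambda _2=-2\); the variable names are illustrative only and do not come from the paper.

```python
import numpy as np

# Assumed example: d = 4, A diagonal with two doubly degenerate eigenvalues.
lam1, lam2 = 1.0, -2.0
A = np.diag([lam1, lam1, lam2, lam2])
basis = np.eye(4)
psi1, phi1, psi2, phi2 = basis           # (psi1, phi1) have eigenvalue lam1, (psi2, phi2) have lam2
chi = (phi1 + phi2) / np.sqrt(2)         # third vector spanning the subspace S

V1 = np.column_stack([psi1, psi2, chi])  # single-qudit isometry C^3 -> C^4
P = V1 @ V1.T                            # projector onto S
B = V1.T @ A @ V1                        # effective qutrit operator

# (PAP) tensor (PAP) should equal V (B tensor B) V^dagger with V = V1 tensor V1.
lhs = np.kron(P @ A @ P, P @ A @ P)
V = np.kron(V1, V1)
rhs = V @ np.kron(B, B) @ V.T

eigenvalues_B = np.sort(np.linalg.eigvalsh(B))
print(np.allclose(lhs, rhs), eigenvalues_B)
```

The three eigenvalues of B are \(\lambda _1\), \(\lambda _2\) and their average, which are distinct whenever \(\lambda _1 \ne \lambda _2\).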

We next show that the one remaining case that is not covered by Lemma 17 corresponds to stoquastic Hamiltonians, so is unlikely to be universal.

Lemma 18

Let \(h=A\otimes A\) be a 2-qudit interaction where A is of the form \(A=a | \psi \rangle \langle \psi |\) for some \({|{\psi }\rangle } \in {\mathbb {C}}^d\) and \(a \ne 0\). Then any Hamiltonian of the form \(\sum _i M^{(i)} + \sum _{j \ne k} \alpha _{jk} h_{jk}\)—where \(M^{(i)}\) are arbitrary single qudit operators acting only on qudit i, \(h_{jk}\) refers to the interaction h applied to qudits j and k, and \(\alpha _{jk} \in {\mathbb {R}}\)—is equivalent to a stoquastic Hamiltonian under conjugation by a local unitary operation.

Proof

By conjugating h by a local unitary \(U\otimes U\) and rescaling, we may assume without loss of generality that \(A=| 0 \rangle \langle 0 |\). For each qudit, we demonstrate the existence of a local unitary acting on that qudit which leaves \({|{0}\rangle }\) unchanged, but rotates the 1-local term \(M^{(i)}\) acting on that qudit into a stoquastic term (i.e. non-positive off-diagonal entries). First we conjugate by a unitary \(U_1=| 0 \rangle \langle 0 |+{\widetilde{U}}\) where \({\widetilde{U}}\) acts only on \(S=\text {span}\{{|{1}\rangle },\dots {|{d-1}\rangle }\}\), such that \(U_1M^{(i)}U_1^{\dagger }\) is diagonal on the space S; that is,

$$\begin{aligned} U_1M^{(i)}U_1^{\dagger }=\sum _{j=0}^{d-1}{w_j}| j \rangle \langle j | +\sum _{j=1}^{d-1}\bigl (a_j{|{0}\rangle }{\langle {j}|}+a_j^*{|{j}\rangle }{\langle {0}|}\bigr ). \end{aligned}$$

Write \(a_j=|a_j|e^{i\theta _j}\) and define \(U_2=| 0 \rangle \langle 0 |+\sum _{j=1}^{d-1}-e^{i\theta _j}| j \rangle \langle j |\) so that

$$\begin{aligned} U_2U_1M^{(i)}U_1^{\dagger }U_2^{\dagger }=\sum _{j=0}^{d-1}w_j| j \rangle \langle j | +\sum _{j=1}^{d-1}- |a_j| \bigl ({|{0}\rangle }{\langle {j}|}+{|{j}\rangle }{\langle {0}|}\bigr ). \end{aligned}$$

This operator is clearly stoquastic. \(\quad \square \)
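The two-step rotation \(U_2U_1\) can be sketched numerically as follows. This is an illustration under the assumption that \(M^{(i)}\) is given as a dense Hermitian matrix in the basis where \(A=| 0 \rangle \langle 0 |\); the function name `stoquastic_rotation` is hypothetical.

```python
import numpy as np

def stoquastic_rotation(M):
    """Return a unitary U = U2 @ U1 that fixes |0> and makes U M U^dagger stoquastic."""
    d = M.shape[0]
    # U1 = |0><0| + U_tilde diagonalises M on span{|1>, ..., |d-1>}.
    _, W = np.linalg.eigh(M[1:, 1:])
    U1 = np.eye(d, dtype=complex)
    U1[1:, 1:] = W.conj().T
    M1 = U1 @ M @ U1.conj().T
    # a_j is the coefficient of |0><j| in U1 M U1^dagger; write a_j = |a_j| e^{i theta_j}.
    theta = np.angle(M1[0, 1:])
    # U2 = |0><0| + sum_j (-e^{i theta_j}) |j><j|.
    U2 = np.eye(d, dtype=complex)
    U2[1:, 1:] = np.diag(-np.exp(1j * theta))
    return U2 @ U1

# Example with a random Hermitian 1-local term on a qudit of dimension 5.
rng = np.random.default_rng(0)
G = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
M = G + G.conj().T
U = stoquastic_rotation(M)
M_rot = U @ M @ U.conj().T
off_diagonal = M_rot - np.diag(np.diag(M_rot))
print(np.max(off_diagonal.real), np.max(np.abs(off_diagonal.imag)))
```

Because U leaves \({|{0}\rangle }\) unchanged, the interaction \(| 0 \rangle \langle 0 |^{\otimes 2}\) is untouched and remains diagonal.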

4.2 Interactions not necessarily diagonalisable by local unitaries

Having dealt with the diagonal case, we now need to consider other types of interactions. The first is interactions of the form \(A\otimes A+B\otimes B\).

Lemma 19

Let A and B be single-qudit Hermitian operators such that the operators \(A'=A-{{\,\mathrm{Tr}\,}}(A)I/d\) and \(B'=B-{{\,\mathrm{Tr}\,}}(B)I/d\) are linearly independent, and write \(h = A\otimes A+B\otimes B\). Then h is LA-universal.

Proof

If A and B commute, then A and B are simultaneously diagonalisable by the same unitary U. Conjugating h by \(U \otimes U\), we have a diagonal 2-local interaction with Pauli rank 2 (since \(A'\) and \(B'\) are linearly independent), so the result follows from Lemma 15. So suppose A and B do not commute. Then there must exist an eigenstate \({|{\psi }\rangle }\) of A with eigenvalue \(\lambda \) such that \(AB{|{\psi }\rangle }\ne BA{|{\psi }\rangle }=\lambda B{|{\psi }\rangle }\). So \(B{|{\psi }\rangle }\) is not in the eigenspace of A corresponding to eigenvalue \(\lambda \), and there must exist an orthogonal eigenstate \({|{\phi }\rangle }\) of A with distinct eigenvalue \(\mu \ne \lambda \), such that \({\langle {\phi }|}B{|{\psi }\rangle } \ne 0\). By multiplying \({|{\phi }\rangle }\) by a phase \(e^{i\theta }\), we may assume \({\langle {\phi }|}B{|{\psi }\rangle }\) is real.

We will apply a heavy term \(I-P\) to each of the qudits on which h acts, where P is the projector onto the space \(S=\text {span}\{{|{\psi }\rangle },{|{\phi }\rangle }\}\). Then we can use first-order perturbation theory (Lemma 8) to produce a logical 2-qubit interaction within the \(S^{\otimes 2}\) subspace.

Let \(V:{\mathbb {C}}^2 \rightarrow {\mathbb {C}}^d\) be the isometry \(V={|{\psi }\rangle }{\langle {0}|} + {|{\phi }\rangle }{\langle {1}|}\), which maps onto S such that

$$\begin{aligned} PAP&=V\left( \lambda | 0 \rangle \langle 0 |+\mu | 1 \rangle \langle 1 |\right) V^{\dagger } = V\left( \frac{\lambda -\mu }{2}Z+\frac{\lambda +\mu }{2}I\right) V^{\dagger },\\ PBP&=V\left( aZ+{\langle {\phi }|}B{|{\psi }\rangle }X+\frac{{\langle {\psi }|}B{|{\psi }\rangle }+{\langle {\phi }|}B{|{\phi }\rangle }}{2}I\right) V^{\dagger }, \end{aligned}$$

where \(a=({\langle {\psi }|}B{|{\psi }\rangle }-{\langle {\phi }|}B{|{\phi }\rangle })/2\).

Let \(H_0=I \otimes (I-P)+(I-P) \otimes I \in L(({\mathbb {C}}^d)^{\otimes 2})\) be a Hamiltonian on two qudits, with ground space \(S^{\otimes 2}\). Let \(H_1= h\) so that

$$\begin{aligned} (H_1)_{--} =PAP \otimes PAP +PBP \otimes PBP =V^{ \otimes 2} h'(V^{\otimes 2})^{\dagger } \end{aligned}$$

for some two-local qubit interaction \(h'=\sum M_{ij} \sigma ^i \otimes \sigma ^j+\text {1-local terms}\), where M is the matrix defined by

$$\begin{aligned} M=\left( \begin{array}{ccc} {\langle {\phi }|}B{|{\psi }\rangle }^2 &{} 0 &{} a{\langle {\phi }|}B{|{\psi }\rangle } \\ 0 &{} 0 &{} 0 \\ a{\langle {\phi }|}B{|{\psi }\rangle } &{} 0 &{} a^2+(\lambda -\mu )^2/4 \\ \end{array} \right) , \end{aligned}$$

which has rank 2 whenever \({\langle {\phi }|}B{|{\psi }\rangle }(\lambda -\mu )\ne 0\), which holds here due to the choice of \({|{\phi }\rangle }\) and \({|{\psi }\rangle }\). As shown in [16, 17, Theorem 44], any 2-local qubit interaction with Pauli rank 2 is universal. Hence h is LA-universal. \(\quad \square \)
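As a sanity check on the matrix M, one can extract the Pauli coefficients of \(PAP \otimes PAP+PBP\otimes PBP\) directly. The sketch below works in the logical qubit space; the numerical values of \(\lambda \), \(\mu \), a, \({\langle {\phi }|}B{|{\psi }\rangle }\) (called `beta`) and the identity coefficient `c` are arbitrary assumptions chosen for illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)
paulis = [X, Y, Z]

def pauli_coefficients(h2):
    """M_ij = Tr[(sigma^i kron sigma^j) h2] / 4 for a Hermitian two-qubit operator h2."""
    return np.array([[np.trace(np.kron(si, sj) @ h2).real / 4 for sj in paulis]
                     for si in paulis])

lam, mu = 1.0, -0.5          # eigenvalues of A on |psi>, |phi> (assumed values)
a, beta, c = 0.3, 0.7, 0.2   # a, <phi|B|psi> (real), identity coefficient of PBP (assumed)

A_eff = (lam - mu) / 2 * Z + (lam + mu) / 2 * np.eye(2)
B_eff = a * Z + beta * X + c * np.eye(2)
h_eff = np.kron(A_eff, A_eff) + np.kron(B_eff, B_eff)

M = pauli_coefficients(h_eff)
M_expected = np.array([[beta**2, 0, a * beta],
                       [0, 0, 0],
                       [a * beta, 0, a**2 + (lam - mu)**2 / 4]])
print(np.allclose(M, M_expected), np.linalg.matrix_rank(M))
```

The determinant of the nonzero \(2\times 2\) block is \({\langle {\phi }|}B{|{\psi }\rangle }^2(\lambda -\mu )^2/4\), which is why the rank condition reduces to \({\langle {\phi }|}B{|{\psi }\rangle }(\lambda -\mu )\ne 0\).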

Next we use Lemma 19 to deal with almost all other types of interactions.

Lemma 20

Let h be a 2-qudit interaction with 2-local rank \(\geqslant 2\). Then h is LA-universal.

Proof

This proof consists of two gadgets. First, we use a first-order gadget to project into a two-dimensional subspace of a qudit to produce an effective interaction F between a qudit and a qubit. Second, we use the interaction F in a mediator qubit gadget, pictured in Fig. 4, to produce an effective 2-qudit interaction of the form shown to be universal in Lemma 19.

Fig. 4. Interaction graph of one of the gadgets in the proof of Lemma 20. \(H_0\) acts non-trivially only on the mediator qubit 3, and the gadget results in an effective interaction between the qudits 1 and 2.

Let \(h'\) be the 2-local part of h, given by \(h'=\sum _{a,b}M_{ab}T^a \otimes T^b\) where \({{\,\mathrm{rank}\,}}(M)\geqslant 2\) and \(\{T^a\}_a\) is a basis for the space of traceless Hermitian matrices. Let S be a two-dimensional subspace of \({\mathbb {C}}^d\) spanned by orthonormal vectors \({|{\psi }\rangle }\) and \({|{\phi }\rangle }\) to be chosen later. Let P be the projector onto S and let \(V:{\mathbb {C}}^2 \rightarrow {\mathbb {C}}^d\) map onto S by \(V={|{\psi }\rangle }{\langle {0}|}+{|{\phi }\rangle }{\langle {1}|}\), so that \(VV^{\dagger }=P\).

Let \(H_0=I \otimes (I-P) \in L(({\mathbb {C}}^d)^{\otimes 2})\) and let \(H_1=h\), then

$$\begin{aligned} (H_1)_{--}=(I \otimes V) \left( \sum _{a,b} M_{ab} T^a \otimes V^{\dagger }T^b V \right) (I \otimes V)^{\dagger } \end{aligned}$$

Then by Lemma 8 (first order), we can simulate an interaction F between a qudit and a qubit, where \(F=\sum _{a,b} M_{ab} T^a \otimes V^{\dagger }T^b V\).

Now we can assume we have access to F interactions, and we design another gadget, this time using a second-order mediator gadget involving two qudits 1, 2 and a mediator qubit 3 (of local dimension 2). We choose \(H_0=I_{12} \otimes | 1 \rangle \langle 1 |\) to act non-trivially only on the mediator qubit, and \(H_2=F_{13}+F_{23}\); the interaction graph is pictured in Fig. 4. The second-order term is given by

$$\begin{aligned}&-(H_2)_{-+} H_0^{-1} (H_2)_{+-}= -(I \otimes | 0 \rangle \langle 0 |) (F_{13}+F_{23}) (I \otimes | 1 \rangle \langle 1 |) (F_{13}+F_{23}) (I \otimes | 0 \rangle \langle 0 |) \\&\quad =-\sum _{a,b,c,d}M_{ab} (T_1^a+T_2^a) M_{cd}(T_1^c+T_2^c)\otimes \left( | 0 \rangle \langle 0 | V^{\dagger }T^b V| 1 \rangle \langle 1 | V^{\dagger }T^d V | 0 \rangle \langle 0 | \right) \\&\quad =-\sum _{a,b,c,d}M_{ab} M_{cd} {\langle {\psi }|}T^b| \phi \rangle \langle \phi |T^{d}{|{\psi }\rangle } (T_1^a+T_2^a) (T_1^c+T_2^c)\otimes | 0 \rangle \langle 0 |\\&\quad =-V'\left[ \sum _{a,c}(R_{ac}+R_{ca})T_1^a T_2^c + \text {1-local terms} \right] (V')^{\dagger } \end{aligned}$$

where \(V'=I_{12} \otimes {|{0}\rangle }\) and R is a \((d^2-1) \times (d^2 -1)\) matrix with entries \(R_{ac}=\sum _{b,d} M_{ab} M_{cd}{\langle {\psi }|}T^b| \phi \rangle \langle \phi |T^{d}{|{\psi }\rangle }\). By Lemma 9 (second order), and choosing \(H_1\) to cancel out the unwanted 1-local terms (and add additional 1-local terms if desired), we can simulate the interaction \(-\sum _{a,c}(R_{ac}+R_{ca})T^a \otimes T^c\).

Note that \(R_{ac}= {\langle {\psi }|}K^a| \phi \rangle \langle \phi |K^c{|{\psi }\rangle }\) where \(K^a=\sum _b M_{ab}T^b\), so R is positive semi-definite and rank 1. Since \(R+R^T\) is symmetric, if we can choose \({|{\psi }\rangle }\) and \({|{\phi }\rangle }\) such that \({{\,\mathrm{rank}\,}}(R+R^T)=2\), then the simulated interaction must be of the form \(-(A\otimes A+B \otimes B)\) and so is LA-universal by Lemma 19.
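The structure of R and \(R+R^T\) can be illustrated with a small example. The sketch below assumes \(d=2\) with \(T^a \in \{X/2,Y/2,Z/2\}\), the choice \({|{\psi }\rangle }={|{0}\rangle }\), \({|{\phi }\rangle }={|{1}\rangle }\), and two hand-picked coefficient matrices M; none of these choices are taken from the paper.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)
gens = [X / 2, Y / 2, Z / 2]

def R_matrix(M, psi, phi):
    """R_ac = <psi|K^a|phi><phi|K^c|psi> with K^a = sum_b M_ab T^b (M real)."""
    n = len(gens)
    K = [sum(M[a, b] * gens[b] for b in range(n)) for a in range(n)]
    v = np.array([psi.conj() @ Ka @ phi for Ka in K])
    return np.outer(v, v.conj())

psi = np.array([1, 0], dtype=complex)
phi = np.array([0, 1], dtype=complex)

M_rank2 = np.diag([1.0, 1.0, 0.0])   # corresponds to X(x)X/4 + Y(x)Y/4
M_rank1 = np.diag([1.0, 0.0, 0.0])   # corresponds to X(x)X/4

R2 = R_matrix(M_rank2, psi, phi)
R1 = R_matrix(M_rank1, psi, phi)
sym2 = R2 + R2.T
sym1 = R1 + R1.T
print(np.linalg.matrix_rank(R2), np.linalg.matrix_rank(sym2), np.linalg.matrix_rank(sym1))
```

Writing \(R=vv^{\dagger }\) with \(v=x+iy\) for real vectors x and y gives \(R+R^T=2(xx^T+yy^T)\), so the rank is 2 exactly when x and y are linearly independent.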

Suppose for a contradiction that \({{\,\mathrm{rank}\,}}(R+R^T)\ne 2\) for any choice of \({|{\psi }\rangle }\) and \({|{\phi }\rangle }\). Since \({{\,\mathrm{rank}\,}}(R)=1={{\,\mathrm{rank}\,}}(R^T)\), this can only happen if \(R=R^T\). That is, for any a and c and any choice of orthogonal normalised states \({|{\psi }\rangle }\) and \({|{\phi }\rangle }\),

$$\begin{aligned} {\langle {\psi }|}K^a| \phi \rangle \langle \phi |K^c {|{\psi }\rangle }={\langle {\psi }|}K^c| \phi \rangle \langle \phi |K^a {|{\psi }\rangle }. \end{aligned}$$
(16)

By the definition of \(K^a\) and the fact that M has rank at least 2, there must be a choice of a and c such that \(K^a\) and \(K^c\) are linearly independent. Fix this choice of a and c for the remainder of the proof. The contradiction we will show is that Eq. (16) implies that \(K^a\) and \(K^c\) are not linearly independent.

Fix \({|{\psi }\rangle }\) and extend it to an orthonormal basis \(B_{\psi }=\{{|{\psi }\rangle },{|{e_1}\rangle },\dots ,{|{e_{d-1}}\rangle }\}\). Then taking \({|{\phi }\rangle } = {|{e_i}\rangle }\) for any i, Eq. (16) holds. Taking the sum over all i we have \({\langle {\psi }|}K^a K^c{|{\psi }\rangle }={\langle {\psi }|}K^c K^a{|{\psi }\rangle }\). Since \({|{\psi }\rangle }\) was arbitrary, we conclude that \([K^a,K^c]=0\). So \(K^a\) and \(K^c\) are simultaneously diagonalisable. Let \({|{\Phi }\rangle }=\frac{1}{\sqrt{d}}\sum _i {|{i}\rangle }\), where \(\{{|{i}\rangle }\}\) is an eigenbasis for both \(K^a\) and \(K^c\). We can decompose an arbitrary state \({|{\psi }\rangle }\) as \({|{\psi }\rangle } = {|{\psi '}\rangle }+b{|{\Phi }\rangle }\) where \({|{\psi '}\rangle }\) is an unnormalised vector orthogonal to \({|{\Phi }\rangle }\). Then

$$\begin{aligned} {\langle {\Phi }|}K^a{|{\psi }\rangle }={\langle {\Phi }|}K^a{|{\psi '}\rangle }+b{\langle {\Phi }|}K^a{|{\Phi }\rangle }={\langle {\Phi }|}K^a{|{\psi '}\rangle }+b\frac{1}{d}{{\,\mathrm{Tr}\,}}(K^a)={\langle {\Phi }|}K^a{|{\psi '}\rangle } \end{aligned}$$

and similarly for \(K^c\). So, setting \({|{\phi }\rangle }={|{\Phi }\rangle }\), as \({|{\psi '}\rangle }\) is orthogonal to \({|{\Phi }\rangle }\) Eq. (16) holds for any choice of \({|{\psi }\rangle }\), and hence \(K^a| \Phi \rangle \langle \Phi |K^c=K^c| \Phi \rangle \langle \Phi |K^a\). Multiplying on the left by \({\langle {i}|}\) and on the right by \({|{j}\rangle }\) this gives \(\lambda _i\mu _j=\mu _i\lambda _j\) where \(\lambda _i\) and \(\mu _i\) are the eigenvalues corresponding to \({|{i}\rangle }\) of \(K^a\) and \(K^c\) respectively. This implies there exists \(C\in {\mathbb {R}}\) such that \(\lambda _i=C\mu _i\) for all i, and hence that \(K^a=C K^c\) which is the contradiction we desired. \(\quad \square \)

We have now proven all the ingredients we need to show the following theorem, which is the 2-local, single-interaction special case of Theorem 2:

Theorem 21

Let h be a 2-qudit interaction which is not 1-local. If, up to addition of 1-local terms, \(h = \alpha | \psi \rangle \langle \psi |^{\otimes 2}\) for some state \({|{\psi }\rangle }\) and some \(\alpha \ne 0\), then h is LA-stoquastic-universal. Otherwise h is LA-universal.

Proof

Let \(h'\) be the interaction obtained from h by deleting its 1-local part. Then, by Lemma 20, h is LA-universal unless \(h' = A \otimes B\) for some A and B. If A and B are linearly independent, then \(A\otimes B+B \otimes A\) has 2-local rank 2 and so is LA-universal by Lemma 20. Otherwise, \(B = \beta A \) for some \(\beta \ne 0\), so \(h' = \beta A \otimes A\). Diagonalising h using a local unitary \(U \otimes U\) and using Lemma 15, h is LA-stoquastic-universal. In addition, if \(A \ne a| \psi \rangle \langle \psi |+bI\) for any \({|{\psi }\rangle } \in {\mathbb {C}}^d\) and \(a,b \in {\mathbb {R}}\), then h is LA-universal by Lemma 17. \(\quad \square \)

We do not expect any larger class of two-local interactions to be LA-universal than in Theorem 21, as shown by Lemma 18.

4.3 Extension to k-local interactions

In order to extend our results to interaction terms that act on more than 2 qudits, we first show how 1-local terms can be used to extract \((k-1)\)-local interactions from k-local interactions.

Lemma 22

Let h be a k-local interaction with a decomposition \(h=\sum _{i=1}^l A_i \otimes B_i\) where the \(A_i\) operators act on \(k-1\) qudits and the \(B_i\) operators are linearly independent. Then using h interactions and additional 1-local terms we can simulate any interaction in \({{\,\mathrm{span}\,}}\{A_i\}_{i=1}^l\).

Proof

Fix a single-qudit state \({|{\psi }\rangle } \in {\mathbb {C}}^d\), and let \(H_0=I \otimes (I-| \psi \rangle \langle \psi |) \in L(({\mathbb {C}}^d)^{\otimes k})\), which acts non-trivially only on the kth qudit. Let \(H_1=h\) and \(V=I \otimes {|{\psi }\rangle }\) be the isometry \(V: ({\mathbb {C}}^d)^{\otimes (k-1)} \rightarrow ({\mathbb {C}}^d)^{\otimes k}\) onto the groundspace of \(H_0\). Then \((H_1)_{--}= V \left( \sum _{i=1}^l A_i {\langle {\psi }|} B_i {|{\psi }\rangle }\right) V^{\dagger }\).

So by Lemma 8 (first order), we can simulate \(\sum _{i=1}^l A_i {\langle {\psi }|} B_i {|{\psi }\rangle }\). Using different ancilla qudits projected into different states \({|{\psi }\rangle }\) we can produce a linear combination of such interactions. It therefore suffices to prove that \({{\,\mathrm{span}\,}}\{ x^{(\psi )} : {|{\psi }\rangle } \in {\mathbb {C}}^d\}={\mathbb {R}}^l\), where \(x^{(\psi )}\) is the vector in \({\mathbb {R}}^l\) with coefficients given by \(x^{(\psi )}_i={\langle {\psi }|} B_i {|{\psi }\rangle }\).

Suppose for a contradiction that the \(x^{(\psi )}\) do not span the whole of \({\mathbb {R}}^l\). Then there must exist some non-zero \(\lambda \in {\mathbb {R}}^l\) which is orthogonal to \(x^{(\psi )}\) for all \({|{\psi }\rangle }\), so

$$\begin{aligned} 0=\sum _{i=1}^l \lambda _i x_i^{(\psi )}={\langle {\psi }|}\left( \sum _i \lambda _i B_i \right) {|{\psi }\rangle } \quad \forall {|{\psi }\rangle } \quad \Rightarrow \quad \sum _i \lambda _i B_i=0 \end{aligned}$$

contradicting the assumption that the \(B_i\) are linearly independent. \(\quad \square \)
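The spanning argument can be illustrated numerically. The sketch below assumes \(d=2\) and the linearly independent operators \(B_i \in \{I,X,Y,Z\}\) (so \(l=4\)), samples a few random states, and solves for real weights that isolate a single \(A_i\); the sampling scheme and names are assumptions made for this example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)
Bs = [I2, X, Y, Z]                    # linearly independent single-qudit operators

def expectation_vector(psi):
    """x^(psi)_i = <psi|B_i|psi>, a real vector of length l."""
    return np.array([(psi.conj() @ B @ psi).real for B in Bs])

rng = np.random.default_rng(1)
states = rng.normal(size=(8, 2)) + 1j * rng.normal(size=(8, 2))
states /= np.linalg.norm(states, axis=1, keepdims=True)

x_vectors = np.array([expectation_vector(s) for s in states])   # shape (8, 4)
rank = np.linalg.matrix_rank(x_vectors)

# Real weights w_s with sum_s w_s x^(psi_s) = e_2, i.e. isolating the term A_2.
target = np.array([0.0, 1.0, 0.0, 0.0])
weights, *_ = np.linalg.lstsq(x_vectors.T, target, rcond=None)
print(rank, np.allclose(x_vectors.T @ weights, target))
```

Each weight corresponds to one ancilla qudit projected into the state \({|{\psi _s}\rangle }\), with the h interaction on that ancilla scaled by \(w_s\).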

Let h be a k-qudit Hamiltonian and S be a subset of those k qudits. Define \(h_S\) to be the part of h which acts non-trivially only on S but does not have any part in its decomposition which acts trivially on any subset of S. More precisely, take a basis \(\{I, B_i\}\) of Hermitian matrices on \({\mathbb {C}}^d\), where the \(B_i\) are traceless, and decompose h as a linear combination of tensor products of terms from these bases; then \(h_S\) is the sum of all terms which are non-identity on S and identity elsewhere. Note that \(h=\sum _S h_S\) and \({{\,\mathrm{Tr}\,}}_i(h_S)=0\) for any \(i \in S\).

The following corollary is an easy consequence of Lemma 22.

Corollary 23

Let h be a k-qudit interaction, with a decomposition \(h=\sum _S h_S\) where \(h_S\) is defined as above. Then, using h and additional 1-local terms, it is possible to simulate the interaction \(h_S\) for any subset S.

Proof

Let h have a decomposition \(h=A_0\otimes I+\sum _i A_i \otimes B_i\) where the \(B_i\) are traceless Hermitian matrices acting nontrivially on a single qudit. Then, by Lemma 22, we can simulate \(A_0\). This is the part of h which acts trivially on the last qudit and can hence be expressed as \(A_0 \otimes I =\sum _{S' \subseteq \{1,\dots k-1\}} h_{S'}\). By applying Lemma 22 repeatedly in this way, we can simulate any interaction of the form \(h(S)=\sum _{S'\subseteq S}h_{S'}\) for an arbitrary set S.

We now prove the corollary by induction on |S|, noting that the base case \(|S|=1\) is trivial since we have access to all 1-local terms. Assume the claim for all subsets of size l and let S be a subset of size \(l+1\). By the induction hypothesis, we can simulate \(h_{S'}\) for all subsets \(S'\subset S\). Taking these away from h(S) we are left with \(h_S\) as desired. \(\quad \square \)

We are now ready to generalise Theorem 21 to k-local interactions.

Theorem 2 (restated) Let \({\mathcal {S}}\) be a set of interactions, which are not all 1-local, between qudits of dimension d. Then \({\mathcal {S}}\) is:

  • LA-stoquastic-universal, if there exists \({|{\psi }\rangle }\in {\mathbb {C}}^d\) such that all interactions in \({\mathcal {S}}\) are, up to the addition of 1-local terms, given by a linear combination of operators taken from the set \(\{I, | \psi \rangle \langle \psi |,| \psi \rangle \langle \psi |^{\otimes 2},| \psi \rangle \langle \psi |^{\otimes 3},\dots \}\)—furthermore, if \({\mathcal {S}}\) is of this form and H is an \({\mathcal {S}}\)-Hamiltonian with local terms, then H is stoquastic;

  • LA-universal, otherwise.

Proof

First note that by the same argument as Lemma 18, the Hamiltonians given in the first case are stoquastic. Since not all interactions are 1-local, Lemma 22 can be used to extract a 2-local interaction with non-zero 2-local part, which is LA-stoquastic-universal by Theorem 21.

It remains to prove that any other set of interactions is universal. Define \(T_l\) to be the space of l-local interactions that have no m-local part in their decomposition for \(m<l\), and which can be generated by repeated applications of Lemma 22 to interactions \(h \in {\mathcal {S}}\) (and taking linear combinations of such interactions). Given an interaction h in \({\mathcal {S}}\), and a decomposition \(h=\sum _{S} h_S\), \(T_l\) includes all interactions \(h_S\) such that \(|S|=l\) by Corollary 23. It will therefore suffice to prove that there exists \({|{\psi }\rangle }\) such that \(T_l= {{\,\mathrm{span}\,}}\{(d | \psi \rangle \langle \psi |-I)^{\otimes l}\}\) for all l, as then \(h = \sum _S h_S\) will be of the desired form.

We prove this claim by induction on l. Note that \(T_2\) is non-empty unless all interactions in \({\mathcal {S}}\) are 1-local. By Theorem 21, each interaction in \(T_2\) must be proportional to \((d | \psi \rangle \langle \psi |-I)^{\otimes 2}\) for some state \({|{\psi }\rangle }\). Moreover, the state \({|{\psi }\rangle }\) must be the same for all interactions in \(T_2\), or we could simulate \((d | \psi \rangle \langle \psi |-I)^{\otimes 2}+(d | \psi ' \rangle \langle \psi ' |-I)^{\otimes 2}\) for some \({|{\psi }\rangle } \ne {|{\psi '}\rangle }\), which is LA-universal by Lemma 19.

Assume now that the claim holds for \(T_l\) and consider an interaction F in \(T_{l+1}\). Write \(F=\sum _i A_i \otimes B_i\), where \(B_i\) are traceless single-qudit operators. Then, by Lemma 22, \({{\,\mathrm{span}\,}}\{A_i\} \subseteq T_l\). Therefore, by the induction hypothesis, \(F=(d | \psi \rangle \langle \psi |-I)^{\otimes l}\otimes B\) for some single-qudit operator B. By applying Lemma 22 to a different qudit, we conclude that B must also be proportional to \((d| \psi \rangle \langle \psi |-I)\) as required. \(\quad \square \)

5 SU(d) Heisenberg Interaction

In the remainder of the paper we prove universality for some families of interactions where we are not assisted by free 1-local terms. We consider interactions that generalise the familiar Heisenberg interaction \(h = X \otimes X + Y \otimes Y + Z \otimes Z\) for qubits. The Pauli matrices X, Y, Z correspond to generators for the fundamental (2-dimensional) representation of the Lie algebra \(\mathfrak {su}(2)\). So two natural ways to generalise the interaction h are to consider \(\mathfrak {su}(d)\) for \(d > 2\), or to consider higher-dimensional representations of \(\mathfrak {su}(2)\). We study both of these generalisations, beginning with the former.

We first review the mathematical aspects of these generalised Heisenberg models that will be important for us, and in particular the required concepts from representation theory. Throughout this section, [20] will be a useful reference. The fundamental representation of the Lie algebra \(\mathfrak {su}(d)\) is given by the space of traceless anti-Hermitian \(d \times d\) matrices. We will follow the physics convention of considering a set of traceless Hermitian operators \(\{T^a\}\) such that the real linear span of \(\{iT^a\}\) gives the fundamental representation of \(\mathfrak {su}(d)\). The basis can be chosen such that \({{\,\mathrm{Tr}\,}}(T^a T^b)=\frac{1}{2}\delta _{ab}\) so that the structure constants \(f_{abc}\), defined by \([T^a,T^b]=\sum _{c}if_{abc}T^c\), are completely antisymmetric. For example, for \(d=2\) we may take \(T^a \in \{X/2, Y/2, Z/2\}\), so that iX/2, iY/2, iZ/2 form a basis of \(\mathfrak {su}(2)\). The SU(d) Heisenberg interaction h is given by

$$\begin{aligned} h:=\sum _{a=1}^{d^2-1} T^a \otimes T^a. \end{aligned}$$
(17)

which (up to rescaling and adding an identity term) is the only two-qudit operator which is invariant under conjugation by the unitary \(U\otimes U\) for any matrix U in SU(d).
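The interaction in Eq. (17) is easy to construct explicitly. The sketch below uses the generalised Gell-Mann matrices as one possible choice of basis \(\{T^a\}\) (an assumption; the paper does not fix a basis) and checks the normalisation, the \(U\otimes U\) invariance, and the standard identity \(h=\frac{1}{2}\mathrm {SWAP}-\frac{1}{2d}I\), which is not stated in the text but is consistent with the uniqueness remark above.

```python
import numpy as np

def su_generators(d):
    """Traceless Hermitian T^a with Tr(T^a T^b) = delta_ab / 2 (generalised Gell-Mann basis)."""
    gens = []
    for j in range(d):
        for k in range(j + 1, d):
            E = np.zeros((d, d), dtype=complex)
            E[j, k] = 1.0
            gens.append((E + E.T) / 2)     # symmetric off-diagonal generator
            gens.append((E - E.T) / 2j)    # antisymmetric off-diagonal generator
    for m in range(1, d):
        D = np.diag([1.0] * m + [-float(m)] + [0.0] * (d - m - 1))
        gens.append(D / np.sqrt(2 * m * (m + 1)))
    return gens

def heisenberg(d):
    """SU(d) Heisenberg interaction h = sum_a T^a kron T^a."""
    return sum(np.kron(T, T) for T in su_generators(d))

def swap_operator(d):
    """SWAP on C^d tensor C^d."""
    S = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            S[i * d + j, j * d + i] = 1.0
    return S

d = 3
h = heisenberg(d)
rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))  # random unitary
UU = np.kron(U, U)
print(np.allclose(UU @ h @ UU.conj().T, h),
      np.allclose(h, swap_operator(d) / 2 - np.eye(d * d) / (2 * d)))
```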

5.1 Notes on the representation theory of \(\mathfrak {su}(d)\)

A representation of a Lie algebra \({\mathfrak {g}}\) is a vector space \(\Lambda \) and a linear map \(R:{\mathfrak {g}}\rightarrow L(\Lambda )\) from \({\mathfrak {g}}\) to the space of linear maps on \(\Lambda \), such that \([R(x),R(y)]=R([x,y])\) for all \(x,y \in {\mathfrak {g}}\). The Lie algebra \(\mathfrak {su}(d)\) is semi-simple, which means that any representation R has a direct sum decomposition such that:

$$\begin{aligned} R=\bigoplus _{i} R_i \quad \text { and } \quad \Lambda =\bigoplus _{i} \Lambda _i \end{aligned}$$
(18)

where each \(R_i:{\mathfrak {g}}\rightarrow L(\Lambda _i)\) is an irreducible representation.

The irreducible representations of \(\mathfrak {su}(d)\) can be labeled with a Young diagram of at most d rows. The fundamental representation has a Young diagram of a single box. The antifundamental representation or conjugate representation has a Young diagram of a single column of \(d-1\) boxes, and is given by \(R_{{\text {conj}}}(T^a)=-(T^a)^*\) where \(*\) denotes complex conjugation. The trivial representation is a one-dimensional representation in which \(R_{{\text {trivial}}}(T^a)=0\), with Young diagram consisting of a single column of d boxes. The adjoint representation is a \((d^2-1)\)-dimensional representation in which \(R_{{\text {adjoint}}}\) acts on the Lie algebra itself with the action of the Lie bracket, \(R_{{\text {adjoint}}}(T^a) T^b=[T^a,T^b]\). The adjoint representation has a Young diagram of one column of \(d-1\) boxes and a second column of a single box.

For a given representation R of \(\mathfrak {su}(d)\), the quadratic Casimir operator \(C_R\) is defined by \(C_R=\sum _{a}R(T^a)R(T^a)\). Note that \(C_R\) commutes with all elements \(R(T^b)\):

$$\begin{aligned} {[}C_R,R(T^b)]&=\sum _a [R(T^a)R(T^a),R(T^b)]\\&=\sum _a \left( R(T^a)[R(T^a),R(T^b)]+[R(T^a),R(T^b)]R(T^a)\right) \\&=\sum _{a,c}if_{abc}\left( R(T^a)R(T^c)+R(T^c)R(T^a)\right) =0 \end{aligned}$$

since \(f_{abc}\) is antisymmetric in ac and \(R(T^a)R(T^c)+R(T^c)R(T^a)\) is clearly symmetric in ac.

When R is an irreducible representation, Schur’s Lemma implies that \(C_R=c_R I\) for some \(c_R \in {\mathbb {R}}\) known as the Casimir eigenvalue. For an irreducible representation R of \(\mathfrak {su}(d)\) with corresponding Young diagram of \(n_{row}\) rows of length \(b_1,b_2,\dots ,b_{n_{row}}\) and \(n_{col}\) columns of length \(a_1,a_2,\dots a_{n_{col}}\) and l boxes in total, the Casimir eigenvalue \(c_R\) is [20, equation (19.14)]

$$\begin{aligned} c_{R}=\frac{1}{2}\left[ l(d-l/d)+\sum _{i=1}^{n_{row}}b_i^2-\sum _{i=1}^{n_{col}}a_i^2\right] . \end{aligned}$$
(19)

For a representation R with a decomposition as in (18), \(C_R=\bigoplus _{i} C_{R_i}\) and so each eigenspace of \(C_R\) corresponds to a space \(\Lambda _i\) with corresponding Casimir eigenvalue \(c_{R_i}\).
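Equation (19) is straightforward to evaluate for a given Young diagram. The following sketch takes the row lengths as input and derives the column lengths from them; the function name `casimir_eigenvalue` is hypothetical, and exact rational arithmetic is used only to make the comparison with closed forms transparent.

```python
from fractions import Fraction

def casimir_eigenvalue(rows, d):
    """Casimir eigenvalue of Eq. (19) for the su(d) irrep with the given Young-diagram row lengths."""
    rows = [b for b in rows if b > 0]
    l = sum(rows)                                   # total number of boxes
    n_cols = max(rows, default=0)
    cols = [sum(1 for b in rows if b > j) for j in range(n_cols)]   # column lengths
    bracket = l * (d - Fraction(l, d)) + sum(b * b for b in rows) - sum(a * a for a in cols)
    return Fraction(1, 2) * bracket

d = 3
fundamental = casimir_eigenvalue([1], d)                  # single box
antifundamental = casimir_eigenvalue([1] * (d - 1), d)    # single column of d-1 boxes
trivial = casimir_eigenvalue([1] * d, d)                  # single column of d boxes
adjoint = casimir_eigenvalue([2] + [1] * (d - 2), d)      # columns of length d-1 and 1
print(fundamental, antifundamental, trivial, adjoint)
```

For general d the formula gives \(\frac{d^2-1}{2d}\) for the fundamental and antifundamental representations, 0 for the trivial representation and d for the adjoint representation.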

Given two representations \(R_1\) and \(R_2\), we can define a new representation \(R_1 \otimes R_2\) called the tensor product representation on the space \(\Lambda _1 \otimes \Lambda _2\) by

$$\begin{aligned} (R_1 \otimes R_2)(T^a)=R_1(T^a) \otimes I_2 + I_1\otimes R_2(T^a) \end{aligned}$$

Even when \(R_1\) and \(R_2\) are irreducible representations, the tensor product representation is not in general irreducible. The irreducible representations \(R_i\) in the decomposition (18) of \(R_1 \otimes R_2\) can be calculated using the Young diagrams of \(R_1\) and \(R_2\). This process is described in detail in, for example, [20, Section 19.3]. If \(R_1\) and \(R_2\) have Young diagrams of \(l_1\) and \(l_2\) boxes respectively, then every irreducible representation in the decomposition of \(R_1 \otimes R_2\) has a Young diagram of \(l_1+ l_2\) boxes.

5.2 Alternative SU(d) invariant interaction

We briefly note that an alternative generalisation of the Heisenberg model has also been studied in the condensed-matter theory literature [6, 33, 39]. The qudits of the system are partitioned into two subsets A and B, and the interaction graph is bipartite, with no interactions acting within A or B. The total Hamiltonian H is of the form

$$\begin{aligned} H=\sum _{\begin{array}{c} i \in A,\\ j \in B \end{array}}{\widetilde{h}}_{ij} \quad \text { where }{\widetilde{h}}=\sum _{a} T^a \otimes (-T^a)^{*} \end{aligned}$$

where \(^{*}\) denotes complex conjugation. Since \(\sum _a T^a T^a=\frac{d^2-1}{2d}I\) by Eq. (19), we have

$$\begin{aligned} {\widetilde{h}}+\frac{d^2-1}{2d} I&=\sum _a \left[ T^a \otimes (-T^a)^* + \frac{1}{2}\left( T^aT^a \otimes I +I \otimes (-T^a)^*(-T^a)^*\right) \right] \\&=\frac{1}{2}\sum _a {\widetilde{T}}^a {\widetilde{T}}^a \end{aligned}$$

where \({\widetilde{T}}^a= T^a\otimes I + I\otimes (-T^a)^*\). Thus \({\widetilde{h}}\) is, up to a multiple of the identity, the Casimir operator in the \({\widetilde{T}}^a\) representation and so commutes with \({\widetilde{T}}^a\) for all a. This implies that the total Hamiltonian H is now no longer invariant under conjugation by the unitary \(U^{\otimes n}\), but is invariant when conjugated by \(U^{\otimes |A|} \otimes (U^{*})^{\otimes |B|}\).

Fig. 5. Interaction graph of the gadget used in Sect. 5.2. \(H_0\) acts on the mediator qudits 3 and 4, which results in an effective interaction between the qudits 1 and 2.

Note that \({\widetilde{T}}^a\) is the tensor product of the fundamental and antifundamental representations, which decomposes into a direct sum of the trivial representation and the adjoint representation (this can be seen using the Young diagram method, as described for example in [20, Section 19.3]). Therefore, as \({\widetilde{T}}^a\) annihilates the state \({|{\phi }\rangle }=\frac{1}{\sqrt{d}}\sum _{i}{|{i}\rangle }{|{i}\rangle }\), \({\widetilde{h}}+\frac{d^2-1}{2d}I=\frac{1}{2}\sum _a {\widetilde{T}}^a{\widetilde{T}}^a\) also annihilates \({|{\phi }\rangle }\), and has eigenvalue \(\frac{1}{2}c_{{\text {adjoint}}}=d/2\) on the rest of the space. Therefore \({\widetilde{h}}\) is just a linear combination of the identity I and the projector onto \({|{\phi }\rangle }\):

$$\begin{aligned} {\widetilde{h}} =\frac{1}{2d}I-\frac{d}{2} | \phi \rangle \langle \phi | \end{aligned}$$
(20)
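Equation (20) can be confirmed numerically for small d. The sketch below again assumes the generalised Gell-Mann matrices as the basis \(\{T^a\}\); any basis with \({{\,\mathrm{Tr}\,}}(T^aT^b)=\frac{1}{2}\delta _{ab}\) should give the same operator.

```python
import numpy as np

def su_generators(d):
    """Traceless Hermitian T^a with Tr(T^a T^b) = delta_ab / 2 (generalised Gell-Mann basis)."""
    gens = []
    for j in range(d):
        for k in range(j + 1, d):
            E = np.zeros((d, d), dtype=complex)
            E[j, k] = 1.0
            gens.append((E + E.T) / 2)
            gens.append((E - E.T) / 2j)
    for m in range(1, d):
        D = np.diag([1.0] * m + [-float(m)] + [0.0] * (d - m - 1))
        gens.append(D / np.sqrt(2 * m * (m + 1)))
    return gens

def h_tilde(d):
    """Alternative invariant interaction: sum_a T^a kron (-(T^a)^*)."""
    return sum(np.kron(T, -T.conj()) for T in su_generators(d))

def eq20_rhs(d):
    """Right-hand side of Eq. (20): I/(2d) - (d/2)|phi><phi| with |phi> = sum_i |i>|i> / sqrt(d)."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.eye(d * d) / (2 * d) - d / 2 * np.outer(phi, phi)

for d in (2, 3, 4):
    print(d, np.allclose(h_tilde(d), eq20_rhs(d)))
```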

We will show that this Hamiltonian can simulate an arbitrarily weighted SU(d) invariant interaction \(h=\sum _{a} T^a \otimes T^a\) on the A qudits using a second-order mediator gadget. Consider a system of four qudits with qudits \(1,2,3 \in A\) and qudit \(4 \in B\) as in Fig. 5. Let \(V=I_{12} \otimes {|{\phi }\rangle }_{34}\) and let \(\Pi =VV^{\dagger }\) be the projector onto the state \({|{\phi }\rangle }_{34}\). Let \(H_0= I - \Pi = \frac{2}{d}({\widetilde{h}}_{34}+\frac{d^2-1}{2d}I)\), \(H_1=0\) and \(H_2={\widetilde{h}}_{14}+\mu {\widetilde{h}}_{24}\) for some \(\mu \in {\mathbb {R}}\). Since \(\Pi M_4 \Pi = \frac{1}{d}({{\,\mathrm{Tr}\,}}M) \Pi \) for any M and the \(T^a\)’s are traceless, \(\Pi H_2 \Pi = 0\), and so

$$\begin{aligned} \Pi H_1 \Pi -\Pi H_2 H_0^{-1} H_2 \Pi&=-\Pi H_2 (I-\Pi ) H_2 \Pi \\&=-\sum _{a,b}(T^a_1+\mu T^a_2)\frac{1}{d}{{\,\mathrm{Tr}\,}}(T^aT^b) (T^b_1+\mu T^b_2) \Pi \\&=-(1+\mu ^2)\frac{d^2-1}{4d^2} I - \frac{\mu }{d} \sum _a T_1^a T_2^a \Pi \\&=V \left( -(1+\mu ^2)\frac{d^2-1}{4d^2} I - \frac{\mu }{d} h \right) V^{\dagger } \end{aligned}$$

where we used that \(\sum _a (T^a)^2 = \frac{d^2-1}{2d} I\) in the third equality. Therefore by Lemma 9 (second order), and by adjusting \(\mu \) accordingly, we can simulate an arbitrarily weighted h interaction up to the identity term.
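The second-order calculation above can be reproduced numerically. The sketch below builds the four-qudit operators for small d, again assuming the generalised Gell-Mann basis, and compares \(-V^{\dagger }H_2(I-\Pi )H_2V\) with the claimed effective interaction; the helper names are illustrative only.

```python
import numpy as np
from functools import reduce

def su_generators(d):
    """Traceless Hermitian T^a with Tr(T^a T^b) = delta_ab / 2 (generalised Gell-Mann basis)."""
    gens = []
    for j in range(d):
        for k in range(j + 1, d):
            E = np.zeros((d, d), dtype=complex)
            E[j, k] = 1.0
            gens.append((E + E.T) / 2)
            gens.append((E - E.T) / 2j)
    for m in range(1, d):
        D = np.diag([1.0] * m + [-float(m)] + [0.0] * (d - m - 1))
        gens.append(D / np.sqrt(2 * m * (m + 1)))
    return gens

def embed(op, site, n, d):
    """Operator acting as `op` on qudit `site` (0-indexed) and as the identity elsewhere."""
    mats = [np.eye(d)] * n
    mats[site] = op
    return reduce(np.kron, mats)

def mediator_gadget(d, mu):
    """Return (first-order term, second-order term, claimed effective interaction) on qudits 1, 2."""
    gens, n = su_generators(d), 4
    # H_2 = h~_14 + mu h~_24, with qudits 1, 2, 3, 4 stored as sites 0, 1, 2, 3.
    H2 = sum((embed(T, 0, n, d) + mu * embed(T, 1, n, d)) @ embed(-T.conj(), 3, n, d)
             for T in gens)
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    V = np.kron(np.eye(d * d), phi.reshape(-1, 1))    # V = I_12 kron |phi>_34 (real matrix)
    Pi = V @ V.T
    first = V.T @ H2 @ V
    second = -V.T @ H2 @ (np.eye(d ** n) - Pi) @ H2 @ V   # H_0^{-1} = I - Pi on the excited space
    h = sum(np.kron(T, T) for T in gens)
    claimed = -(1 + mu ** 2) * (d * d - 1) / (4 * d * d) * np.eye(d * d) - mu / d * h
    return first, second, claimed

first, second, claimed = mediator_gadget(2, 0.7)
print(np.allclose(first, 0), np.allclose(second, claimed))
```

Choosing the sign and magnitude of \(\mu \) sets the sign and relative weight of the simulated h term, while the identity term only shifts the energy.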

In order to show that \({\widetilde{h}}\) is universal, it will therefore suffice to consider only h. We will do this for the rest of the paper.

5.3 Encoding a logical qubit in a 2d-qudit gadget

We now consider a system of k qudits each of dimension d, and will use subscript notation to denote which qudit an operator acts on, so \(T^a_i\) denotes the action of \(T^a\) on qudit i and the identity elsewhere. For a non-empty set \(S\subseteq \{1,\dots , k\}\) we use the shorthand \(T^a_S=\sum _{i \in S} T^a_i\). For any such S, the operators \(\{T_S^a\}_a\) form a representation of \(\mathfrak {su}(d)\); it is the representation given by the tensor product of the fundamental representation \(l=|S|\) times.

Consider the following Hamiltonian, given by the quadratic Casimir operator in the \(\{T^a_S\}_a\) representation:

$$\begin{aligned} C(S)&:=\sum _a T_S^a T_S^a=\sum _{a}\left( \sum _{i \ne j} T^a_i T^a_j + \sum _i T^a_i T^a_i\right) \end{aligned}$$
(21)
$$\begin{aligned}&=\sum _{i\ne j} h_{ij} + \frac{l(d^2-1)}{2d} I \end{aligned}$$
(22)

where we have used Eq. (19), the formula for the Casimir value. As discussed above, to understand the eigenspaces of C(S), it suffices to know the irreducible representations contained in the decomposition of \(\{T^a_S\}_a\). In particular we note that C(S) is a sum of squares of Hermitian matrices so is positive semidefinite, and the Young diagram consisting of a single column of d boxes is a one-dimensional irrep, with Casimir eigenvalue zero, corresponding to the state \({|{\Psi }\rangle }\), the completely antisymmetric state on d qudits. The 1-dimensional irrep is known as the trivial representation because \(T^a_S {|{\Psi }\rangle }=0\) for all a.

In [16], the representation theory of \(\mathfrak {su}(2)\) was used to understand the ground space of the qubit Heisenberg model on the complete (bipartite) graph. This was important as each gadget in their construction contained two logical qubits: one with which an interesting simulation could be implemented, and one which could only implement qubit Heisenberg interactions. The analysis of the complete bipartite Heisenberg model made it possible to project the second logical qubit of each gadget into the non-degenerate n-qubit ground state of this model. Here, we similarly use the representation theory of \(\mathfrak {su}(d)\) to understand the ground space of the SU(d) Heisenberg model on the complete graph, but our motivation is quite different. We will use a gadget construction to encode a single logical qubit within a gadget of 2d (constant, independent of system size) physical qudits. The construction here is more closely related to the construction in [17, Theorem 42].

The gadget construction, which encodes a logical qubit within 2d physical qudits, is a second-order perturbative gadget which via Lemma 9 will implement effective interactions across pairs of logical qubits. We consider a system of 2d qudits, each of dimension d, and each with a label in \(E=\{1,2,\dots , 2d\}\). Let \(A=\{3,4,\dots , d+1\}\) and \(B=\{d+2,\dots , 2d\}\), so that A and B each contain \(d-1\) qudits, and consider the Hamiltonian \(H_0 \in L(({\mathbb {C}}^d)^{\otimes 2d})\)

$$\begin{aligned} H_0=C(E)+C(A)+C(B) -\frac{(d^2-1)}{d}I, \end{aligned}$$
(23)

whose interaction graph is pictured in Fig. 6. The \(-\frac{(d^2-1)}{d}I\) term will simply ensure that the ground state energy of \(H_0\) is zero, so that the requirements of Lemma 9 (second order) are met.

Fig. 6 Interaction graph of \(H_0=C(E)+C(A)+C(B) -\frac{(d^2-1)}{d}I \) for \(d=4\)

First we will show that the ground space of \(H_0\), which will form our logical qubit, is indeed two-dimensional. In fact the two states in the ground space of \(H_0\) sit in the respective ground spaces of C(E), C(A) and C(B). The eigenvalues of C(A) are the Casimir values of the representations corresponding to Young diagrams of \(d-1\) boxes with values as given in Eq. (19). The lowest eigenvalue occurs when these boxes are arranged in a single column. The ground space of C(A) is therefore the d-dimensional space \({\mathcal {H}}_{{\text {antisym}}}(d-1)\) of antisymmetric states on the \(d-1\) qudits in A, corresponding to the Young diagram of a single column of \(d-1\) boxes. Let \(\{{|{i}\rangle }\}_{i=1}^{d}\) be an orthonormal basis for \({\mathbb {C}}^d\); then for each i there is a unique (up to a phase) antisymmetric state \({|{\psi _i}\rangle }\) in \({{\,\mathrm{span}\,}}\{ {|{1}\rangle },\dots ,{|{i-1}\rangle },{|{i+1}\rangle },\dots ,{|{d}\rangle }\}^{\otimes d-1}\). These states are orthonormal, since \({|{\psi _i}\rangle }\) and \({|{\psi _j}\rangle }\) with \(i\ne j\) are built from different sets of basis labels, and they form a basis for \({\mathcal {H}}_{{\text {antisym}}}(d-1)\).

Then the ground space of \(H_0\) contains

$$\begin{aligned} {|{\phi _1}\rangle }={|{\Psi }\rangle }_{1A}{|{\Psi }\rangle }_{2B} \quad \text { and } \quad {|{\phi _2}\rangle }={|{\Psi }\rangle }_{1B}{|{\Psi }\rangle }_{2A}, \end{aligned}$$

where \({|{\Psi }\rangle }\) is the completely antisymmetric state on d qudits,

$$\begin{aligned} {|{\Psi }\rangle }&=\frac{1}{\sqrt{d!}} \sum _{\sigma \in S_d} \text {sgn}(\sigma ){|{\sigma (1)}\rangle }{|{\sigma (2)}\rangle }\dots {|{\sigma (d)}\rangle }\end{aligned}$$
(24)
$$\begin{aligned}&=\frac{1}{\sqrt{d}}\sum _i {|{i}\rangle }{|{\psi _i}\rangle } \end{aligned}$$
(25)

and \(\{{|{i}\rangle }\}_i\) and \(\{{|{\psi _i}\rangle }\}_i\) are the orthonormal bases for \({\mathbb {C}}^d\) and \({\mathcal {H}}_{{\text {antisym}}}(d-1)\) as defined above. Clearly, these states are in the ground space of C(A) and C(B), and \({|{\Psi }\rangle }\) is the antisymmetric state on d qudits so \(T_E^a\) annihilates \({|{\phi _1}\rangle }\) and \({|{\phi _2}\rangle }\), implying that these states are also in the ground space of C(E). To see that these are the only two states in the ground space of \(H_0\), we note that the ground space of \(C(A)+C(B)\) is spanned by states in the representations given in Fig. 7. The C(E) term forces the ground space of \(H_0\) to be the two dimensional space corresponding to the two copies of the Young diagram of two columns of d boxes.

Fig. 7 Irreducible representations in the decomposition of the ground space of \(C(A)+C(B)\). The rules for taking the tensor product of representations given as Young diagrams can be found for example in [20, Section 19.3]

It is important to note that \({|{\phi _1}\rangle }\) and \({|{\phi _2}\rangle }\) are not orthogonal:

$$\begin{aligned} {\langle {\phi _1|\phi _2}\rangle }&=\frac{1}{d^2}\sum _{i,j,k,l}\left( {\langle {i}|}{\langle {\psi _i}|}{\langle {j}|}{\langle {\psi _j}|}\right) \left( {|{k}\rangle }{|{\psi _l}\rangle }{|{l}\rangle }{|{\psi _k}\rangle }\right) \end{aligned}$$
(26)
$$\begin{aligned}&=\frac{1}{d^2}\sum _{i,j,k,l}\delta _{ik}\delta _{il}\delta _{jl}\delta _{jk}=\frac{1}{d^2}\sum _{i}\delta _{ii}=\frac{1}{d}. \end{aligned}$$
(27)
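The overlap in Eq. (27) can also be confirmed by building the two states explicitly as tensors. The sketch below assumes the qudit ordering (1, A, 2, B) and uses the hypothetical helper names `antisymmetric_state` and `gadget_overlap`; it is an illustration for small d rather than part of the proof.

```python
import itertools
import numpy as np

def antisymmetric_state(d):
    """|Psi> of Eq. (24) as a rank-d tensor with entries sgn(sigma) / sqrt(d!)."""
    psi = np.zeros((d,) * d)
    for perm in itertools.permutations(range(d)):
        # sign of the permutation from its inversion count
        inversions = sum(perm[i] > perm[j] for i in range(d) for j in range(i + 1, d))
        psi[perm] = (-1) ** inversions
    return psi / np.sqrt(np.count_nonzero(psi))

def gadget_overlap(d):
    """<phi_1|phi_2> for phi_1 = Psi_{1A} Psi_{2B} and phi_2 = Psi_{1B} Psi_{2A}."""
    psi = antisymmetric_state(d)
    # phi_1: tensor axes ordered (1, A, 2, B)
    phi1 = np.multiply.outer(psi, psi)
    # phi_2: the same outer product has axes (1, B, 2, A);
    # permute them back to (1, A, 2, B) before taking the inner product
    axes = [0] + list(range(d + 1, 2 * d)) + [d] + list(range(1, d))
    phi2 = np.transpose(np.multiply.outer(psi, psi), axes)
    return float(np.vdot(phi1, phi2))
```

Calling `gadget_overlap(d)` for d = 2 or d = 3 gives a value that can be compared directly with 1/d.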

In order to calculate perturbative gadgets we want to understand the action of the physical interaction h defined in (17) in this logical qubit space. First we calculate \(M_{ij}(T^a_k T^b_l ):={\langle {\phi _i}|}T^a_k T^b_l {|{\phi _j}\rangle }\) for all a, b, i, j and any \(k,l \in \{1,2,A,B\}\), and then we will convert to an orthonormal basis later. We only show the calculations for three of these values; all others follow by symmetric arguments together with the identity \((T^a_1+T^a_A) {|{\Psi }\rangle }_{1A}=0\). For example, we can calculate \({\langle {\phi _1}|}T^a_1 T^b_2{|{\phi _2}\rangle }= -{\langle {\phi _1}|}T^a_1 T^b_A{|{\phi _2}\rangle }= {\langle {\phi _1}|} T^b_1 T^a_1{|{\phi _2}\rangle }\).

$$\begin{aligned} {\langle {\phi _1}|}T^a_1 T^b_1{|{\phi _1}\rangle }&=\frac{1}{d^2}\sum _{i,j,k,l}\left( {\langle {i}|}{\langle {\psi _i}|}{\langle {j}|}{\langle {\psi _j}|}\right) T_1^a T_1^b\left( {|{k}\rangle }{|{\psi _k}\rangle }{|{l}\rangle }{|{\psi _l}\rangle }\right) \end{aligned}$$
(28)
$$\begin{aligned}&=\frac{1}{d^2}\sum _{i,j,k,l} {\langle {i}|} T^a T^b {|{k}\rangle } \delta _{ik}\delta _{jl }\delta _{jl} \end{aligned}$$
(29)
$$\begin{aligned}&=\frac{1}{d} {{\,\mathrm{Tr}\,}}(T^aT^b) \end{aligned}$$
(30)
$$\begin{aligned} {\langle {\phi _1}|}T^a_1 T^b_1{|{\phi _2}\rangle }&=\frac{1}{d^2}\sum _{i,j,k,l}\left( {\langle {i}|}{\langle {\psi _i}|}{\langle {j}|}{\langle {\psi _j}|}\right) T_1^a T_1^b\left( {|{k}\rangle }{|{\psi _l}\rangle }{|{l}\rangle }{|{\psi _k}\rangle }\right) \end{aligned}$$
(31)
$$\begin{aligned}&=\frac{1}{d^2}\sum _{i,j,k,l} {\langle {i}|} T^a T^b {|{k}\rangle } \delta _{il}\delta _{jl}\delta _{jk} \end{aligned}$$
(32)
$$\begin{aligned}&=\frac{1}{d^2} {{\,\mathrm{Tr}\,}}(T^aT^b) \end{aligned}$$
(33)
$$\begin{aligned} {\langle {\phi _1}|}T^a_1 T^b_2{|{\phi _1}\rangle }&=\frac{1}{d^2}\sum _{i,j,k,l}\left( {\langle {i}|}{\langle {\psi _i}|}{\langle {j}|}{\langle {\psi _j}|}\right) T_1^a T_2^b\left( {|{k}\rangle }{|{\psi _k}\rangle }{|{l}\rangle }{|{\psi _l}\rangle }\right) \end{aligned}$$
(34)
$$\begin{aligned}&=\frac{1}{d^2}\sum _{i,j,k,l} {\langle {i}|} T^a{|{k}\rangle } \delta _{ik} {\langle {j}|} T^b{|{l}\rangle }\delta _{jl} \end{aligned}$$
(35)
$$\begin{aligned}&=\frac{1}{d^2} {{\,\mathrm{Tr}\,}}(T^a){{\,\mathrm{Tr}\,}}(T^b)=0. \end{aligned}$$
(36)

We then have

$$\begin{aligned} M(T^a_1T^b_1)&=\frac{{{\,\mathrm{Tr}\,}}(T^aT^b)}{d^2}\begin{pmatrix} d &{} 1\\ 1 &{} d \\ \end{pmatrix}&M(T^a_1T^b_A)&=\frac{{{\,\mathrm{Tr}\,}}(T^aT^b)}{d^2}\begin{pmatrix} -d &{} -1\\ -1 &{} 0 \\ \end{pmatrix} \end{aligned}$$
(37)
$$\begin{aligned} M(T^a_1T^b_B)&=\frac{{{\,\mathrm{Tr}\,}}(T^aT^b)}{d^2}\begin{pmatrix} 0 &{} -1\\ -1 &{} -d \\ \end{pmatrix}&M(T^a_1T^b_2)&=\frac{{{\,\mathrm{Tr}\,}}(T^aT^b)}{d^2}\begin{pmatrix} 0 &{} 1\\ 1 &{} 0 \\ \end{pmatrix} \end{aligned}$$
(38)

Now let \(V:{\mathbb {C}}^2 \rightarrow ({\mathbb {C}}^d)^{\otimes 2d}\) be an isometry that maps onto the ground space of \(H_0\), \({{\,\mathrm{span}\,}}\{{|{\phi _1}\rangle },{|{\phi _2}\rangle }\}\), defined by its action on the basis states:

$$\begin{aligned} V{|{0}\rangle }&= \sqrt{\frac{d}{2(d+1)}}\left( {|{\phi _1}\rangle }+{|{\phi _2}\rangle }\right) \end{aligned}$$
(39)
$$\begin{aligned} V {|{1}\rangle }&= \sqrt{\frac{d}{2(d-1)}}\left( {|{\phi _1}\rangle }-{|{\phi _2}\rangle } \right) \end{aligned}$$
(40)

Then the action of \(T^a_i T^a_j \) in the ground space of \(H_0\) is given by \(V^{\dagger } T^a_i T^a_j V\) in Table 1. Therefore by Lemma 8 (first-order), choosing \(H_1=\alpha h_{1A}+\beta h_{12}\) for \(\alpha ,\beta \in {\mathbb {R}}\), we can simulate any logical 1-local interaction in \({{\,\mathrm{span}\,}}\{X,Z\}\), up to an identity term.
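Because \({|{\phi _1}\rangle }\) and \({|{\phi _2}\rangle }\) are not orthogonal, the matrices M of Eqs. (37) and (38) have to be conjugated by the coefficient matrix of V to obtain the logical operators. The following sketch, assuming the coefficients of Eqs. (39) and (40) and dropping the common prefactor \({{\,\mathrm{Tr}\,}}(T^aT^b)/d^2\), shows the conversion; `logical_operator` is a hypothetical helper name.

```python
import numpy as np

def logical_operator(m_gram, d):
    """Convert M_ij = <phi_i|O|phi_j> into the 2x2 matrix V^dagger O V."""
    c_plus = np.sqrt(d / (2 * (d + 1)))
    c_minus = np.sqrt(d / (2 * (d - 1)))
    # columns hold the coefficients of V|0> and V|1> in the phi_1, phi_2 basis
    w = np.array([[c_plus, c_minus], [c_plus, -c_minus]])
    return w.T @ m_gram @ w

d = 3
gram = np.array([[1.0, 1.0 / d], [1.0 / d, 1.0]])       # <phi_i|phi_j>, Eq. (27)
m_12 = np.array([[0.0, 1.0], [1.0, 0.0]])               # M(T_1 T_2), Eq. (38)
m_1a = np.array([[-float(d), -1.0], [-1.0, 0.0]])       # M(T_1 T_A), Eq. (37)

isometry_check = logical_operator(gram, d)   # V^dagger V
logical_12 = logical_operator(m_12, d)       # diagonal: identity and Z components only
logical_1a = logical_operator(m_1a, d)       # non-zero off-diagonal: has an X component
```

Applying the same conversion to the Gram matrix is a quick check that V is an isometry. The diagonal form of `logical_12` and the off-diagonal entry of `logical_1a` illustrate why the two interactions \(h_{12}\) and \(h_{1A}\) together reach \({{\,\mathrm{span}\,}}\{X,Z\}\) up to an identity term.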

5.4 Second-order terms

We now want to simulate interactions between two logical qubits using a second-order gadget, via Lemma 9. Consider two copies of the gadget above with qudit labels \(\{1,2,\dots , 2d\}\) and \(\{1',2',\dots , 2d'\}\) respectively; so now the heavy term is \(\tilde{H}_0= I \otimes H_0 +H_0 \otimes I \in L(({\mathbb {C}}^d)^{\otimes 4d})\) and \(V \otimes V\) maps onto the ground space of \(\tilde{H}_0\). Let \({\tilde{\Pi }}=\Pi \otimes \Pi \) be the projector onto the ground space of \(\tilde{H}_0\), where \(\Pi \) is the projector onto the ground space of \(H_0\). \(H_1\) is chosen as in the previous section to simulate any 1-local terms desired. We will choose \(H_2=\sum _{i,j} \alpha _{i j} h_{ij'}\), so we need to calculate

$$\begin{aligned} {\tilde{\Pi }} H_2 (\tilde{H}_0)^{-1} H_2{\tilde{\Pi }} = \sum _{i,j,k,l} \alpha _{ij} \alpha _{kl}{\tilde{\Pi }} h_{ij'} (\tilde{H}_0)^{-1} h_{kl'} {\tilde{\Pi }}. \end{aligned}$$

The difficult part of this calculation is to understand how the \((\tilde{H}_0)^{-1}\) term acts. We consider just a single gadget first: for any state \({|{\psi }\rangle }\) in the ground space of \(H_0\), we will show that \(H_0 T^b_i {|{\psi }\rangle }= d T^b_i {|{\psi }\rangle }\) for any \(i \in \{1,2,A,B\}\). We provide a proof for the case \(i=1\), but the other cases are similar.

It is easy to check that the states \(\{T_1^b{|{\psi }\rangle }\}_b\) are orthogonal and that \(T_E^a\) acts on this space as the adjoint representation:

$$\begin{aligned} T^a_E T^b_1 {|{\psi }\rangle }=\left( T^a_1 T^b_1 +\sum _{i\ne 1}T^b_1T^a_i\right) {|{\psi }\rangle } =\left( T^a_1 T^b_1 -T^b_1 T^a_1 \right) {|{\psi }\rangle }= [T^a_1,T^b_1]{|{\psi }\rangle } \end{aligned}$$

where the second equality holds because \(T_E^a {|{\psi }\rangle }=0\) and so \(\sum _{i\ne 1} T_i^a {|{\psi }\rangle }= -T_1^a {|{\psi }\rangle }\).

Therefore \(T_1^b {|{\psi }\rangle }\) is an eigenvector of C(E) with the Casimir eigenvalue corresponding to the adjoint representation, which has Young diagram consisting of one column of length \(d-1\) and a second column of length 1. By Eq. (19), this eigenvalue is given by \(c_{\text {adjoint}}=d\), which we can also check directly:

$$\begin{aligned} C(E)T^b_1 {|{\psi }\rangle }&=\sum _a T^a_E T^a_E T^b_1{|{\psi }\rangle }=\sum _a\left[ T_1^a,[T_1^a,T_1^b]\right] {|{\psi }\rangle } \end{aligned}$$
(41)
$$\begin{aligned}&=-\sum _{a,c,e}f_{abc}f_{ace} T^e_1{|{\psi }\rangle }=-\sum _e \kappa _{be} T^e_1{|{\psi }\rangle }=dT^b_1 {|{\psi }\rangle } \end{aligned}$$
(42)

where we have used the antisymmetry of the structure constants \(f_{abc}\) and the definition of the Killing form \(\kappa _{ab}=\sum _{c,e} f_{ace}f_{bec}=-2d{{\,\mathrm{Tr}\,}}(T^aT^b)\).

Furthermore, the operator \(T^b_1\) does not act on A or B so the state \(T^b_1{|{\psi }\rangle }\) is still antisymmetric with respect to permutations within A and B and so is in the zero-energy ground space of \(C(A)+C(B)-\frac{d^2-1}{d}I\), and so \(H_0 T^b_1 {|{\psi }\rangle }= d T^b_1 {|{\psi }\rangle }\) as claimed.
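The two claims about a single gadget, a two-dimensional zero-energy ground space and \(H_0 T^b_1 {|{\psi }\rangle }= d\, T^b_1 {|{\psi }\rangle }\), can be tested by exact diagonalisation for small d. This is a sketch under the assumption that generalised Gell-Mann matrices are used for \(\{T^a\}\) and that qudits are indexed from 0, so qudits 1 and 2 of the text are sites 0 and 1; all function names are hypothetical.

```python
import itertools
import numpy as np

def su_generators(d):
    """Generalised Gell-Mann matrices scaled so that Tr(T^a T^b) = delta_ab / 2."""
    gens = []
    for j, k in itertools.combinations(range(d), 2):
        sym = np.zeros((d, d), dtype=complex)
        sym[j, k] = sym[k, j] = 0.5
        asym = np.zeros((d, d), dtype=complex)
        asym[j, k], asym[k, j] = -0.5j, 0.5j
        gens += [sym, asym]
    for m in range(1, d):
        diag = np.zeros(d)
        diag[:m] = 1.0
        diag[m] = -m
        gens.append(np.diag(diag).astype(complex) / np.sqrt(2 * m * (m + 1)))
    return gens

def site_op(mat, site, n):
    """`mat` acting on qudit `site` (0-based) of n qudits, identity elsewhere."""
    d = mat.shape[0]
    out = np.eye(1, dtype=complex)
    for q in range(n):
        out = np.kron(out, mat if q == site else np.eye(d))
    return out

def gadget_h0(d):
    """H_0 = C(E) + C(A) + C(B) - (d^2 - 1)/d I on 2d qudits, Eq. (23)."""
    n = 2 * d
    gens = su_generators(d)

    def casimir(sites):
        total = np.zeros((d ** n, d ** n), dtype=complex)
        for t in gens:
            t_s = sum(site_op(t, q, n) for q in sites)
            total += t_s @ t_s
        return total

    e, a, b = range(n), range(2, d + 1), range(d + 1, n)
    return casimir(e) + casimir(a) + casimir(b) - (d * d - 1) / d * np.eye(d ** n)

def check_gadget(d, tol=1e-8):
    """Return (lowest energy, ground-space dimension, ||H_0 T_1 psi - d T_1 psi||)."""
    h0 = gadget_h0(d)
    w, u = np.linalg.eigh(h0)
    ground = u[:, w < tol]
    t1 = site_op(su_generators(d)[0], 0, 2 * d)
    excited = t1 @ ground
    residual = np.linalg.norm(h0 @ excited - d * excited)
    return w[0], ground.shape[1], residual
```

`check_gadget(2)` runs on four qubits almost instantly; `check_gadget(3)` works on a 729-dimensional space and takes noticeably longer, but exercises the first genuinely qudit case.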

Table 1 Action of \(T^a_iT^a_j\) in the ground space of \(H_0=C(E)+C(A)+C(B) -\frac{(d^2-1)}{d}I\)

Thus \(\tilde{H}_0 h_{kl'} {\tilde{\Pi }}= 2d h_{kl'} {\tilde{\Pi }}\) for \(k,l \in \{1,2,A,B\}\) and so

$$\begin{aligned} {\tilde{\Pi }} h_{ij'} (\tilde{H}_0)^{-1} h_{kl'} {\tilde{\Pi }}&= \frac{1}{2d}{\tilde{\Pi }} h_{ij'} h_{kl'} {\tilde{\Pi }}=\frac{1}{2d}\sum _{a,b} \Pi T^a_i T^b_k \Pi \otimes \Pi T^a_{j'} T^b_{l'} \Pi \\&=\frac{1}{2d}\sum _{a} \Pi T^a_i T^a_k \Pi \otimes \Pi T^a_{j'} T^a_{l'} \Pi , \end{aligned}$$

which corresponds to a logical operator that can be read off from Table 1. We choose \(\alpha _{ij}=1\) if \((i,j) \in \{ (1,A), (2,B), (A,1), (B,A), (B,B) \}\) and \(\alpha _{ij}=0\) otherwise. Then

$$\begin{aligned} -{\tilde{\Pi }} H_2 (\tilde{H_0})^{-1} H_2{\tilde{\Pi }} = \frac{1}{8d(d^2-1)}(V\otimes V)\left( XX + \frac{3}{d^2-1}ZZ+ \text {1-local terms}\right) (V\otimes V)^{\dagger }, \end{aligned}$$

which can be checked either by hand or using a computer algebra package. Therefore by Lemma 9 (second order), we can choose \(H_1\) to cancel out the unwanted 1-local terms (as described in the previous section), and simulate the interaction \(\alpha (XX + \frac{3}{d^2-1} ZZ)\) for an arbitrary positive weight \(\alpha \). This family of Hamiltonians was shown to be universal in [38, Theorem 2] (see Footnote 1). This completes the proof of the following theorem:

Theorem 3 (restated). For any \(d \geqslant 2\), the SU(d) Heisenberg interaction \(h:=\sum _{a} T^a \otimes T^a\), where \(\{T^a\}\) are traceless Hermitian matrices such that \({{\,\mathrm{Tr}\,}}(T^a T^b)=\frac{1}{2}\delta _{ab}\), is universal.

The following corollary is an immediate consequence of Theorem 3 and the discussion in Sect. 5.2.

Corollary 24

For any \(d \geqslant 2\), the alternative SU(d) Heisenberg interaction \({\widetilde{h}}:=- \sum _a T^a \otimes (T^a)^{\star }\), where \(\{T^a\}\) are traceless Hermitian matrices such that \({{\,\mathrm{Tr}\,}}(T^a T^b)=\frac{1}{2}\delta _{ab}\), is universal even on a bipartite interaction graph.

6 Rank 1 Projectors

In this section we consider the family of \({\mathcal {S}}\)-Hamiltonians where \({\mathcal {S}}\) contains a single rank 1 projector P onto a two qudit state \({|{\psi }\rangle } \in ({\mathbb {C}}^d)^{\otimes 2}\). We prove universality even in the restricted setting where interactions are only allowed between qudits on a bipartite interaction graph. We note that this also trivially implies universality without such a restriction.

Theorem 5 (restated). Let \(P= | \psi \rangle \langle \psi |\) be the projector onto the two-qudit state \({|{\psi }\rangle } \in ({\mathbb {C}}^d)^{\otimes 2}\). Then Hamiltonians of the form

$$\begin{aligned} H= \sum _{i \in A, j \in B} \alpha _{ij}P_{ij} \end{aligned}$$

where A and B are disjoint subsets of qudits and \(\alpha _{ij} \in {\mathbb {R}}\), are universal if \({|{\psi }\rangle }\) is entangled.

Otherwise, if \({|{\psi }\rangle }\) is a product state, then this family of Hamiltonians is classical.

We observe that we have already shown that the alternative SU(d) Heisenberg interaction is universal in Corollary 24, which is the special case of Theorem 5 where \({|{\psi }\rangle }= \frac{1}{\sqrt{d}}\sum _i {|{i}\rangle }{|{i}\rangle }\).

Proof

We first conjugate the entire Hamiltonian by a total unitary \(\left( \bigotimes _{i \in A} U\right) \otimes \left( \bigotimes _{j \in B} V\right) \). This allows us to perform a change of basis of the form \((U \otimes V) P_{ij}(U\otimes V)^{\dag }\) for each projector \(P_{ij}\). Therefore, by the Schmidt decomposition, we may assume without loss of generality that \({|{\psi }\rangle }= \sum _{i=1}^d \lambda _i {|{i}\rangle }{|{i}\rangle }\), where \(\lambda _i \geqslant 0\) and the \(\lambda _i\) are in non-increasing order. If \({|{\psi }\rangle }\) is a product state, then the Hamiltonian is clearly classical, since P is diagonal in this basis: it is the projector onto \({|{1}\rangle }{|{1}\rangle }\).

So assume that \({|{\psi }\rangle }\) is entangled; we first show how to simulate some 1-local operators using mediator qudit gadgets. For three qudits \(1,3 \in A\) and \(2 \in B\), let \(H_0= I-P_{32}\) and \(H_{1}=P_{12}\) be operators in \(L(({\mathbb {C}}^d)^{\otimes 3})\). The interaction graph is pictured in Fig. 8. Let \(\Pi \) be the projector onto the ground space of \(H_0\) and let \(V=I \otimes {|{\psi }\rangle }_{32}\) (which maps onto the ground space of \(H_0\)), so that

$$\begin{aligned} \Pi H_1 \Pi&= P_{32}P_{12} P_{32}=\left( \sum _{i,j}\lambda _i \lambda _j I \otimes {|{i}\rangle }{\langle {j}|} \otimes {|{i}\rangle }{\langle {j}|}\right) \left( \sum _{k,l} \lambda _k \lambda _l {|{k}\rangle }{\langle {l}|} \otimes {|{k}\rangle } {\langle {l}|} \otimes I \right) P_{32}\\&=\left( \sum _{i,j,l}\lambda _i \lambda _j^2 \lambda _l {|{j}\rangle }{\langle {l}|} \otimes {|{i}\rangle }{\langle {l}|} \otimes {|{i}\rangle }{\langle {j}|}\right) \left( \sum _{m,n} \lambda _m \lambda _n I \otimes {|{m}\rangle }{\langle {n}|} \otimes {|{m}\rangle }{\langle {n}|}\right) \\&=\left( \sum _{i,j,n}\lambda _i \lambda _j^4 \lambda _n{|{j}\rangle }{\langle {j}|} \otimes {|{i}\rangle }{\langle {n}|} \otimes {|{i}\rangle }{\langle {n}|}\right) =R_1 P_{32} =VRV^{\dagger } \end{aligned}$$

where R is the single qudit operator \(R=\sum _j \lambda _j^4 {|{j}\rangle }{\langle {j}|}\). Then by Lemma 8 (first order) we can simulate R.
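The identity \(P_{32}P_{12}P_{32}=R_1P_{32}\) is easy to confirm numerically for any choice of Schmidt coefficients. The sketch below assumes the three qudits are ordered (1, 2, 3) in the tensor product and normalises the supplied coefficients; `schmidt_projector` and `first_order_gadget` are hypothetical names.

```python
import numpy as np

def schmidt_projector(lams):
    """P = |psi><psi| for |psi> = sum_i lam_i |i>|i>, with lam normalised here."""
    lams = np.asarray(lams, dtype=float)
    lams = lams / np.linalg.norm(lams)
    d = len(lams)
    psi = np.zeros(d * d)
    psi[:: d + 1] = lams            # amplitude lam_i on the basis state |i>|i>
    return np.outer(psi, psi), lams

def first_order_gadget(lams):
    """Return (P_32 P_12 P_32, R_1 P_32) with R = sum_j lam_j^4 |j><j|."""
    p, lams = schmidt_projector(lams)
    d = len(lams)
    eye = np.eye(d)
    p12 = np.kron(p, eye)           # acts on qudits 1 and 2
    p32 = np.kron(eye, p)           # |psi> is symmetric under swapping its two qudits
    r1 = np.kron(np.diag(lams ** 4), np.eye(d * d))
    return p32 @ p12 @ p32, r1 @ p32

lhs, rhs = first_order_gadget([0.8, 0.5, 0.3])
```

Comparing `lhs` and `rhs` with `np.allclose` checks the first-order effective interaction for this particular (arbitrarily chosen) set of coefficients.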

Fig. 8 Interaction graph of the gadgets used in the proof of Theorem 5. \(H_0\) acts on the mediator qudits 2 and 3. An effective 1-local interaction is produced on qudit 1

We can now therefore assume we also have access to the 1-local interaction R on any qudit in A, and we will construct another gadget of the same form, see Fig. 8. Let \(H_1= (\alpha +\beta ^2)P_{12}\) and \(H_2= \beta (P_{12}-R_1)\) for some arbitrary \(\alpha , \beta \in {\mathbb {R}}\), with \(H_0= I-P_{32}\), \(V=I \otimes {|{\psi }\rangle }_{32}\) and \(\Pi \) as before. We note that \(\Pi H_2\Pi =0\), so

$$\begin{aligned} \Pi \left[ H_1- H_2 (H_0)^{-1} H_2 \right] \Pi&= P_{32}\left[ (\alpha +\beta ^2) P_{12} - \beta ^2 P_{12}(I-P_{32})P_{12}\right] P_{32}\\&=\alpha P_{32}P_{12} P_{32}+\beta ^2 (P_{32}P_{12} P_{32})^2=(\alpha R_1+\beta ^2 R_1^2) P_{32} \\&= V(\alpha R +\beta ^2 R^2) V^{\dagger }. \end{aligned}$$

So by Lemma 9 (second order), we can simulate the 1-local interaction \(\alpha R + \beta ^2 R^2\) on any qudit in A. By a symmetric argument, we can also simulate \(\alpha R + \beta ^2 R^2\) on any qudit in B.
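The second-order expression \(\Pi \left[ H_1- H_2 (H_0)^{-1} H_2 \right] \Pi \) can be evaluated directly in the same way. This sketch assumes the qudit ordering (1, 2, 3) and uses the fact that \(H_0=I-P_{32}\) has eigenvalue 1 on its excited space, so that \(H_0^{-1}\) restricted to that space is \(I-P_{32}\); `second_order_gadget` is a hypothetical name and the numerical values of the coefficients are arbitrary.

```python
import numpy as np

def second_order_gadget(lams, alpha, beta):
    """Return (Pi [H_1 - H_2 H_0^{-1} H_2] Pi, (alpha R_1 + beta^2 R_1^2) P_32, Pi H_2 Pi)."""
    lams = np.asarray(lams, dtype=float)
    lams = lams / np.linalg.norm(lams)
    d = len(lams)
    psi = np.zeros(d * d)
    psi[:: d + 1] = lams                       # |psi> = sum_i lam_i |i>|i>
    p = np.outer(psi, psi)
    eye_d, eye_all = np.eye(d), np.eye(d ** 3)
    p12, p32 = np.kron(p, eye_d), np.kron(eye_d, p)
    r1 = np.kron(np.diag(lams ** 4), np.eye(d * d))
    h1 = (alpha + beta ** 2) * p12
    h2 = beta * (p12 - r1)
    h0_inv = eye_all - p32                     # inverse of H_0 on its excited space
    lhs = p32 @ (h1 - h2 @ h0_inv @ h2) @ p32
    rhs = (alpha * r1 + beta ** 2 * r1 @ r1) @ p32
    return lhs, rhs, p32 @ h2 @ p32

lhs, rhs, h2_ground = second_order_gadget([0.8, 0.5, 0.3], alpha=0.7, beta=-1.3)
```

The third return value checks the precondition \(\Pi H_2\Pi =0\), and the first two give the two sides of the displayed equation.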

To complete the proof, we consider the following two separate cases:

  • (i) R has a degenerate eigenspace with non-zero eigenvalue.

    Suppose there exists \(\mu >0\) such that \(J=\left\{ i \,|\, \lambda _i = \mu \right\} \subseteq \{1,2,\dots ,d\}\) has two or more elements. Then \(\tilde{R}=R^2-2\mu ^4 R +\mu ^8I=(R-\mu ^4 I)^2\) is positive semidefinite with ground space projector \(\Pi = \sum _{i \in J} | i \rangle \langle i |\). Let \(d'=|J|\) and let \(V:{\mathbb {C}}^{d'} \rightarrow {\mathbb {C}}^d\) map onto \({{\,\mathrm{span}\,}}_{i \in J} \{{|{i}\rangle }\}\).

    Let \(H_0=I \otimes \tilde{R} +\tilde{R} \otimes I\) and let \(H_1=P\) so that

    $$\begin{aligned} (H_1)_{--}=(\Pi \otimes \Pi )P (\Pi \otimes \Pi ) = \mu ^2 \sum _{i,j \in J} {|{i}\rangle }{\langle {j}|} \otimes {|{i}\rangle } {\langle {j}|}=\mu ^2(V\otimes V) {\widetilde{h}} (V \otimes V)^{\dagger } \end{aligned}$$

    where \({\widetilde{h}}\) is the alternative \(SU(d')\) Heisenberg interaction (see Eq. (20)). Therefore by Lemma 8 (first order), we can simulate a Hamiltonian of alternative \(SU(d')\) Heisenberg interactions on a bipartite lattice, which is universal by Corollary 24.

  • (ii) The eigenspaces of R with non-zero eigenvalue are non-degenerate.

    Without loss of generality, assume that the \( \lambda _i\) are ordered in non-increasing order. The assumption that \({|{\psi }\rangle }\) is entangled implies that \(\lambda _1, \lambda _2>0\). Since we are not in case (i), we know that \(\lambda _1>\lambda _2\) and \(\lambda _2> \lambda _i\) for all \(i \ne 1,2\). Then the operator \(\tilde{R}= R^2-(\lambda _1^4+ \lambda _2^4)R +\lambda _1^4\lambda _2^4 I\) has two-dimensional ground space \({{\,\mathrm{span}\,}}\{{|{1}\rangle },{|{2}\rangle }\}\). Let \(V:{\mathbb {C}}^2 \rightarrow {\mathbb {C}}^d\) be the isometry which acts on the qubit basis states as \(V{|{0}\rangle }={|{1}\rangle }\) and \(V{|{1}\rangle }={|{2}\rangle }\).

    Let \(H_0=I \otimes \tilde{R} +\tilde{R} \otimes I\) and let \(H_1=P\), so that

    $$\begin{aligned} (H_1)_{--}&= \sum _{i,j \in \{1,2\}}\lambda _i \lambda _j {|{i}\rangle }{\langle {j}|} \otimes {|{i}\rangle } {\langle {j}|}\\&= ( V \otimes V)\left( \frac{\lambda _1\lambda _2}{2}(XX-YY)+\frac{\lambda _1^2+\lambda _2^2}{4}(ZZ+I)+\frac{\lambda _1^2-\lambda _2^2}{4}(ZI+IZ) \right) ( V \otimes V)^{\dagger } \end{aligned}$$

    So by Lemma 8 (first order), we can simulate a Hamiltonian of interactions of the form \(\frac{\lambda _1\lambda _2}{2}(XX-YY)+\frac{\lambda _1^2+\lambda _2^2}{4}(ZZ+I)+\frac{\lambda _1^2-\lambda _2^2}{4}(ZI+IZ)\).

    The 2-local part of this interaction was shown to be universal in [38, Theorem 3], even when the interactions are restricted to a bipartite interaction graph. It remains to note that the gadget for removing the 1-local part of an interaction presented in [16] takes place on a bipartite interaction graph.\(\quad \square \)
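The Pauli decomposition used in case (ii) can be verified with a few lines of linear algebra. The sketch below works directly on the two-qubit logical space (that is, after applying \(V\otimes V\)) and uses the hypothetical function names `projected_interaction` and `pauli_form`.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def projected_interaction(l1, l2):
    """sum_{i,j in {1,2}} lam_i lam_j |i><j| (x) |i><j| on the logical qubits."""
    lam = [l1, l2]
    out = np.zeros((4, 4), dtype=complex)
    for i in range(2):
        for j in range(2):
            e_ij = np.zeros((2, 2), dtype=complex)
            e_ij[i, j] = 1.0
            out += lam[i] * lam[j] * np.kron(e_ij, e_ij)
    return out

def pauli_form(l1, l2):
    """The same operator written in the Pauli basis, as in the displayed equation."""
    xx_yy = np.kron(X, X) - np.kron(Y, Y)
    zz_ii = np.kron(Z, Z) + np.kron(I2, I2)
    zi_iz = np.kron(Z, I2) + np.kron(I2, Z)
    return (l1 * l2 / 2) * xx_yy + ((l1**2 + l2**2) / 4) * zz_ii + ((l1**2 - l2**2) / 4) * zi_iz
```

For any pair of coefficients, for instance `projected_interaction(0.9, 0.3)` and `pauli_form(0.9, 0.3)`, the two matrices can be compared with `np.allclose`.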

We remark on the efficiency of this simulation in the case when P projects onto a state \({|{\psi }\rangle }\) that is almost a product state. In this case \(\lambda _1 \approx 1\) and \(\lambda _2 \ll 1\), so the size of the effective interactions (in both cases (i) and (ii)) is small unless we scale up all the terms in the gadget. This means that the interaction strengths of the terms in the simulator Hamiltonian scale as \({{\,\mathrm{poly}\,}}(1/\lambda _2)\).

7 SU(2) Heisenberg Interaction on Qudits of Dimension d

Next we consider the SU(2) Heisenberg interaction in local dimension d. Let \(S^x, S^y, S^z\) form a d-dimensional irreducible representation of \(\mathfrak {su}(2)\) corresponding to the qubit operators \(\sigma ^x=X/2,\sigma ^y=Y/2,\sigma ^z=Z/2\). As a representation they must satisfy \([S^a,S^b]=\sum _{c}i\epsilon _{abc}S^c\), where \(\epsilon _{abc}\) is the completely antisymmetric Levi-Civita symbol which satisfies the following standard identities:

$$\begin{aligned} \sum _{a} \epsilon _{abc} \epsilon _{aef}=\delta _{be}\delta _{cf}-\delta _{bf}\delta _{ce} \quad \Rightarrow \quad \sum _{a,b} \epsilon _{abc} \epsilon _{abf}=2\delta _{cf}. \end{aligned}$$
(43)

Then the SU(2) Heisenberg interaction on qudits of dimension d is defined by

$$\begin{aligned} h=\sum _{a} S^a \otimes S^a. \end{aligned}$$

We first prove some preliminary technical results that will be useful later on.

The irreducible representations of \(\mathfrak {su}(2)\) can be labelled by their dimension. Let \(R^{(d)}\) be the unique d-dimensional irreducible representation which maps \(R^{(d)}(\sigma ^a)=S^a\). The Young diagram corresponding to \(R^{(d)}\) has a single row of \(d-1\) boxes and the Casimir eigenvalue of this representation is \(\lambda :=(d^2-1)/4\) by Eq. (19), so \(\sum _a S^aS^a=\lambda I\).

The tensor product of two d-dimensional representations has a direct sum decomposition into all odd-dimensional representations of sizes \(1,3,\dots ,2d-1\) (this can be seen using the Young diagram method, as described for example in [20, Section 19.3]):

$$\begin{aligned} R^{(d)}\otimes R^{(d)}=R^{(1)} \oplus R^{(3)} \oplus \dots \oplus R^{(2d-1)} \end{aligned}$$
(44)

Letting \(s=(d-1)/2\), this is the familiar decomposition of the total spin of two particles of spin s.

For two qudits of dimension d labelled E and F, let \(H_0=h_{EF}+\lambda I=\frac{1}{2}\sum _{a}(S_E^a+S_F^a)(S_E^a+S_F^a)\), which up to a multiplicative factor of 1/2 is the Casimir operator in the representation \(\{S^a_E+S^a_F\}_a\), so has eigenspace decomposition as given in Eq. (44), with eigenvalues half of the corresponding Casimir eigenvalue for that representation.
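The spectrum of \(H_0=h_{EF}+\lambda I\) predicted by Eq. (44) can be compared with exact diagonalisation. The sketch below assumes the standard spin-s matrices with \(s=(d-1)/2\) as the d-dimensional representation; `spin_matrices`, `heisenberg_plus_lambda` and `expected_spectrum` are hypothetical names.

```python
import numpy as np

def spin_matrices(d):
    """Spin-s matrices (S^x, S^y, S^z) in dimension d = 2s + 1, basis m = s, ..., -s."""
    s = (d - 1) / 2
    m = s - np.arange(d)
    s_plus = np.zeros((d, d), dtype=complex)
    for k in range(1, d):
        # raising operator: <m+1|S^+|m> = sqrt(s(s+1) - m(m+1))
        s_plus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    s_x = (s_plus + s_plus.conj().T) / 2
    s_y = (s_plus - s_plus.conj().T) / 2j
    s_z = np.diag(m).astype(complex)
    return [s_x, s_y, s_z]

def heisenberg_plus_lambda(d):
    """H_0 = h_EF + lambda I on two qudits, with lambda = (d^2 - 1)/4."""
    lam = (d * d - 1) / 4
    return sum(np.kron(s, s) for s in spin_matrices(d)) + lam * np.eye(d * d)

def expected_spectrum(d):
    """Eigenvalues J(J+1)/2 with multiplicity 2J+1 for J = 0, 1, ..., d-1."""
    levels = []
    for dim in range(1, 2 * d, 2):      # irreps R^(1), R^(3), ..., R^(2d-1)
        j = (dim - 1) / 2
        levels += [j * (j + 1) / 2] * dim
    return np.array(sorted(levels))
```

`np.linalg.eigvalsh(heisenberg_plus_lambda(d))` returns the eigenvalues in ascending order, so it can be compared entry by entry with `expected_spectrum(d)`; the lowest three levels are 0, 1 and 3, the values used in the gadget analysis.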

Let \({|{\psi _{EF}}\rangle }\) be the state corresponding to the trivial one dimensional representation in the decomposition, for which \((S_E^a+S_F^a){|{\psi _{EF}}\rangle }=0\) for all a. In the standard choice of basis this is given by

$$\begin{aligned} {|{\psi _{EF}}\rangle }=\frac{1}{\sqrt{d}}\sum _{i=0}^{d-1}(-1)^i{|{i}\rangle }_E{|{d-1-i}\rangle }_F. \end{aligned}$$

The following identities involving \({|{\psi _{EF}}\rangle }\) can be derived from the fact that \({\langle {\psi _{EF}}|}M_E{|{\psi _{EF}}\rangle }=\frac{1}{d}{{\,\mathrm{Tr}\,}}(M)\) for any single qudit interaction M and the trace formulas from [35]; we include the proofs in Appendix B.

$$\begin{aligned}&{\langle {\psi _{EF}}|}S_E^a{|{\psi _{EF}}\rangle }=0, \qquad {\langle {\psi _{EF}}|}S_E^aS_E^b{|{\psi _{EF}}\rangle }=\frac{\lambda }{3}\delta _{ab}, \qquad {\langle {\psi _{EF}}|}S_E^aS_E^bS_E^c{|{\psi _{EF}}\rangle }=\frac{i\lambda }{6}\epsilon _{abc} \end{aligned}$$
(45)
$$\begin{aligned}&{\langle {\psi _{EF}}|}S_E^aS_E^bS_E^cS_E^e{|{\psi _{EF}}\rangle }=\frac{\lambda }{15}\left( (\lambda -2)\delta _{ac}\delta _{be}+(\lambda +\tfrac{1}{2})(\delta _{ab}\delta _{ce}+\delta _{ae}\delta _{bc})\right) \end{aligned}$$
(46)
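The identities (45) and (46) can be spot-checked numerically for small d. The sketch below assumes the standard spin-s matrices in the basis \(m=s,\dots ,-s\) labelled \(0,\dots ,d-1\), and takes the singlet to be \(\frac{1}{\sqrt{d}}\sum _{i=0}^{d-1}(-1)^i{|{i}\rangle }{|{d-1-i}\rangle }\) in that labelling; the function names are hypothetical.

```python
import itertools
import numpy as np

def spin_matrices(d):
    """Spin-s matrices (S^x, S^y, S^z) in dimension d = 2s + 1, basis m = s, ..., -s."""
    s = (d - 1) / 2
    m = s - np.arange(d)
    s_plus = np.zeros((d, d), dtype=complex)
    for k in range(1, d):
        s_plus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    return [(s_plus + s_plus.conj().T) / 2, (s_plus - s_plus.conj().T) / 2j,
            np.diag(m).astype(complex)]

def singlet(d):
    """Two-qudit state annihilated by S_E^a + S_F^a (basis labels 0, ..., d-1)."""
    psi = np.zeros(d * d, dtype=complex)
    for i in range(d):
        psi[i * d + (d - 1 - i)] = (-1) ** i / np.sqrt(d)
    return psi

def identity_errors(d):
    """Largest deviations from Eq. (45) (2- and 3-operator cases) and Eq. (46)."""
    s_ops, psi, eye = spin_matrices(d), singlet(d), np.eye(d)
    lam = (d * d - 1) / 4
    delta = np.eye(3)
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

    def moment(*idx):
        prod = eye.astype(complex)
        for a in idx:
            prod = prod @ s_ops[a]
        return psi.conj() @ np.kron(prod, eye) @ psi

    err2 = err3 = err4 = 0.0
    for a, b, c, e in itertools.product(range(3), repeat=4):
        err2 = max(err2, abs(moment(a, b) - lam / 3 * delta[a, b]))
        err3 = max(err3, abs(moment(a, b, c) - 1j * lam / 6 * eps[a, b, c]))
        rhs4 = lam / 15 * ((lam - 2) * delta[a, c] * delta[b, e]
                           + (lam + 0.5) * (delta[a, b] * delta[c, e] + delta[a, e] * delta[b, c]))
        err4 = max(err4, abs(moment(a, b, c, e) - rhs4))
    return err2, err3, err4
```

`identity_errors(d)` returns three numbers that should all be at the level of floating-point rounding if the identities hold for that d.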

In particular the second equation of (45) shows that the states \(\{S_E^a {|{\psi _{EF}}\rangle }\}_{a=1}^3\) are orthogonal; in fact they span the space on which \(S_E^a+S_F^a\) acts as the 3-dimensional adjoint representation in the decomposition, since \((S_E^a+S_F^a){|{\psi _{EF}}\rangle }=0\) implies \((S_E^a+S_F^a) S_E^b{|{\psi _{EF}}\rangle }=[S_E^a,S_E^b]{|{\psi _{EF}}\rangle }\). We can check that \(H_0\) has eigenvalue 1 on this space:

$$\begin{aligned} H_0 S_E^b{|{\psi _{EF}}\rangle }=\frac{1}{2}\sum _{a}[S_E^a,[S_E^a,S_E^b]]{|{\psi _{EF}}\rangle }=\frac{1}{2}\sum _{a,c,e}-\epsilon _{ace}\epsilon _{abc}S_E^e{|{\psi _{EF}}\rangle }=S_E^b{|{\psi _{EF}}\rangle }. \end{aligned}$$

Finally we wish to show that the states \(\left( \frac{1}{2}\{S_E^b,S_E^c\}-\frac{\lambda }{3}\delta _{bc}\right) {|{\psi _{EF}}\rangle }\) are in the 5-dimensional eigenspace of \(H_0\) with eigenvalue 3.

$$\begin{aligned} H_0 S_E^b S_E^c {|{\psi _{EF}}\rangle }&=\frac{1}{2}\sum _{a} (S_E^a+S_F^a)(S_E^a+S_F^a) S_E^b S_E^c {|{\psi _{EF}}\rangle } =\frac{1}{2}\sum _{a} [S_E^a,[S_E^a,S_E^b S_E^c]] {|{\psi _{EF}}\rangle }\\&=\frac{1}{2}\sum _{a} \left( [S_E^a,[S_E^a,S_E^b ]]S_E^c+2[S_E^a,S_E^b] [S_E^a,S_E^c ]+S_E^b[S_E^a,[S_E^a,S_E^c ]] \right) {|{\psi _{EF}}\rangle }\\&=-\frac{1}{2}\sum _{a,e,f}\left( \epsilon _{abe}\epsilon _{aef}S_E^fS_E^c+2\epsilon _{abe}\epsilon _{acf}S_E^eS_E^f+\epsilon _{ace}\epsilon _{aef}S_E^bS_E^f\right) {|{\psi _{EF}}\rangle }\\&=\left( 2S_E^bS_E^c-\sum _{e,f}(\delta _{bc}\delta _{ef}-\delta _{bf}\delta _{ce})S_E^eS_E^f\right) {|{\psi _{EF}}\rangle }\\&=\left( 2S_E^bS_E^c-\delta _{bc}\lambda I +S^c_ES^b_E\right) {|{\psi _{EF}}\rangle } \end{aligned}$$

where we have used Eq. (43) and \(\sum _{e}S^eS^e=\lambda I\). This implies that \(H_0\left( \frac{1}{2}\{S_E^b,S_E^c\}-\frac{\lambda }{3}\delta _{bc}I\right) {|{\psi _{EF}}\rangle }=3\left( \frac{1}{2}\{S_E^b,S_E^c\}-\frac{\lambda }{3}\delta _{bc}I\right) {|{\psi _{EF}}\rangle }\) as desired.

Fig. 9 Interaction graph of the gadget used in the proof of Lemma 25. \(H_0\) acts on the mediator qudits E and F. An effective interaction is produced between the qudits 1 and 2

7.1 Simulating \(h^2\) with h

Lemma 25

A Hamiltonian consisting entirely of SU(2) Heisenberg interactions h can simulate a Hamiltonian of the form \(\sum _{ij} \alpha _{ij} h_{ij}+\beta _{ij} h_{ij}^2\) for arbitrary \(\alpha _{ij}\in {\mathbb {R}}\) and \(\beta _{ij}\geqslant 0 \).

Proof

To apply an arbitrary interaction of the form \(\alpha h +\beta h^2\) across qudits 1 and 2, we will use a mediator gadget with a pair of mediator qudits labelled EF under the heavy interaction \(H_0=I_{12} \otimes (h_{EF}+\lambda I) \in L(({\mathbb {C}}^d)^{\otimes 4})\) for \(\lambda =\frac{d^2-1}{4}\) as in the previous section. Let \(\Pi =I \otimes | \psi _{EF} \rangle \langle \psi _{EF} |\) be the projector onto the ground space of \(H_0\).

This will be a fourth-order gadget so we must define Hamiltonians \(H_1,H_2,H_3,H_4 \in L(({\mathbb {C}}^d)^{\otimes 4})\) in order to apply Lemma 12 (Fig. 9). Let

$$\begin{aligned} H_4=\mu _2(h_{1E}+h_{2E})=\mu _2\sum _{a}(S_1^a+S_2^a)S_E^a=\mu _2\sum _{a}{\widetilde{S}}^aS_E^a, \end{aligned}$$

where \({\widetilde{S}}^a =S_1^a+S_2^a\), and let \(H_1=\mu _1 h_{12}\), \(H_2=\frac{2\mu _2^2\lambda }{3}(h_{12}+\lambda I)\), and \(H_3=\frac{\mu _2^3\lambda }{3}(h_{12}+\lambda I)\), where \(\mu _1\), \(\mu _2\) are real coefficients to be chosen later. Note that \(h_{12}+\lambda I = \frac{1}{2} \sum _a {\widetilde{S}}^a {\widetilde{S}}^a\). \(H_1,H_2,H_3\) all commute with \(\Pi \), so are block diagonal with respect to the split \( {\mathcal {H}}_- \oplus {\mathcal {H}}_+\). We can use Eq. (45) to check that the remaining condition of Lemma 12 is satisfied,

$$\begin{aligned} \Pi H_4 \Pi =\mu _2\sum _{a}(S_1^a+S_2^a){\langle {\psi _{EF}}|}S_E^a{|{\psi _{EF}}\rangle }\Pi =0. \end{aligned}$$

Since \( S_E^b{|{\psi _{EF}}\rangle }\) is an eigenvector of \(h_{EF}+\lambda I\) with eigenvalue 1 , we have \(H_0 H_4 \Pi =H_4\Pi \). This significantly simplifies the calculations required to determine the effective interaction produced using Lemma 12:

$$\begin{aligned} \Pi H_4 H_0^{-1}H_4 \Pi&=\Pi (H_4)^2 \Pi =\mu _2^2\sum _{a,b} (S_1^a+S_2^a)(S_1^b + S_2^b){\langle {\psi _{EF}}|}S_E^aS_E^b{|{\psi _{EF}}\rangle }\Pi \\&=\frac{\mu _2^2\lambda }{3}\sum _{a,b} \delta _{ab}{\widetilde{S}}^a {\widetilde{S}}^b\Pi =\frac{2\mu _2^2\lambda }{3}(h_{12}+\lambda I)\Pi =\Pi H_2 \Pi ;\\ \Pi H_4 H_0^{-1}H_4 H_0^{-1}H_4 \Pi&=\Pi (H_4)^3 \Pi =\mu _2^3\sum _{a,b,c} (S_1^a+S_2^a)(S_1^b + S_2^b)(S_1^c+S_2^c){\langle {\psi _{EF}}|}S_E^aS_E^bS_E^c{|{\psi _{EF}}\rangle }\Pi \\&=\frac{\mu _2^3\lambda }{6}\sum _{a,b,c}i\epsilon _{abc}{\widetilde{S}}^a {\widetilde{S}}^b {\widetilde{S}}^c \Pi =-\frac{\mu _2^3\lambda }{6}\sum _{c} {\widetilde{S}}^c {\widetilde{S}}^c \Pi =-\frac{\mu _2^3\lambda }{3}(h_{12}+\lambda I) \Pi \\&=-\Pi H_3\Pi . \end{aligned}$$

In the final set of equations we have used the following useful identity which holds for any operators \({\widetilde{S}}^a\) which form a representation of \(\mathfrak {su}(2)\) and thus satisfy \([{\widetilde{S}}^a,{\widetilde{S}}^b]=\sum _c i\epsilon _{abc} {\widetilde{S}}^c\):

$$\begin{aligned} \sum _{a,b}i\epsilon _{abc}{\widetilde{S}}^a{\widetilde{S}}^b&=\sum _{a,b}\frac{i}{2}\left( \epsilon _{abc}{\widetilde{S}}^a{\widetilde{S}}^b +\epsilon _{bac}{\widetilde{S}}^b{\widetilde{S}}^a\right) =\frac{i}{2}\sum _{a,b}\epsilon _{abc}[{\widetilde{S}}^a,{\widetilde{S}}^b] \end{aligned}$$
(47)
$$\begin{aligned}&=-\frac{1}{2}\sum _{a,b}\epsilon _{abc}\epsilon _{abe}{\widetilde{S}}^e=-\delta _{ce}{\widetilde{S}}^e=-{\widetilde{S}}^c. \end{aligned}$$
(48)
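The identity of Eqs. (47) and (48) holds for any representation of \(\mathfrak {su}(2)\), including the reducible one generated by \({\widetilde{S}}^a=S_1^a+S_2^a\). A minimal numerical sketch, assuming standard spin-s matrices and the hypothetical helper names `spin_matrices` and `identity_residual`:

```python
import numpy as np

def spin_matrices(d):
    """Spin-s matrices (S^x, S^y, S^z) in dimension d = 2s + 1, basis m = s, ..., -s."""
    s = (d - 1) / 2
    m = s - np.arange(d)
    s_plus = np.zeros((d, d), dtype=complex)
    for k in range(1, d):
        s_plus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    return [(s_plus + s_plus.conj().T) / 2, (s_plus - s_plus.conj().T) / 2j,
            np.diag(m).astype(complex)]

def identity_residual(d):
    """max_c || sum_{a,b} i eps_abc St^a St^b + St^c || for St^a = S_1^a + S_2^a."""
    eye = np.eye(d)
    total = [np.kron(s, eye) + np.kron(eye, s) for s in spin_matrices(d)]
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    worst = 0.0
    for c in range(3):
        lhs = sum(1j * eps[a, b, c] * total[a] @ total[b]
                  for a in range(3) for b in range(3))
        worst = max(worst, np.linalg.norm(lhs + total[c]))
    return worst
```

A residual at rounding level from `identity_residual(d)` confirms the sign in Eq. (48), which is what fixes the sign of the third-order term above.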

To use Lemma 12 (fourth order), we need to calculate \((H_1)_{--}+A-B\) where \(A=\Pi H_4 H_0^{-1}H_2 H_0^{-1}H_4 \Pi \) and \(B=\Pi H_4 H_0^{-1}H_4 H_0^{-1}H_4 H_0^{-1}H_4 \Pi \). First we calculate A using Eq. (45) to find

$$\begin{aligned} A=\Pi H_4 H_2 H_4 \Pi =\frac{\mu _2^4\lambda }{3}\sum _{a,b,c} {\widetilde{S}}^a{\widetilde{S}}^b {\widetilde{S}}^b {\widetilde{S}}^c {\langle {\psi _{EF}}|}S_E^aS_E^c{|{\psi _{EF}}\rangle }\Pi =\frac{\mu _2^4\lambda ^2}{9}\sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^b {\widetilde{S}}^b {\widetilde{S}}^a \Pi . \end{aligned}$$

Calculating B is more complicated:

$$\begin{aligned} B=\Pi (H_4)^2 H_0^{-1}(H_4)^2 \Pi =\mu _2^4\sum _{a,b,c,e}{\widetilde{S}}^a{\widetilde{S}}^b {\widetilde{S}}^c {\widetilde{S}}^e {\langle {\psi _{EF}}|}S_E^aS_E^bH_0^{-1}S_E^cS_E^e{|{\psi _{EF}}\rangle }\Pi . \end{aligned}$$

We therefore need to calculate \({\langle {\psi _{EF}}|}S_E^aS_E^bH_0^{-1}S_E^cS_E^e{|{\psi _{EF}}\rangle }\), which can be done by recalling from above that \(\left( \frac{1}{2}\{S_E^b,S_E^c\}-\frac{\lambda }{3}\delta _{bc}I\right) {|{\psi _{EF}}\rangle }\) is in the eigenspace of \(H_0\) with eigenvalue 3, and \([S_E^b,S_E^c]=\sum _e f_{bce} S_E^e\) for some coefficients \(f_{bce}\), so \([S_E^b,S_E^c]{|{\psi _{EF}}\rangle }\) is in the eigenspace of \(H_0\) with eigenvalue 1. Then we have

$$\begin{aligned}&{\langle {\psi _{EF}}|}S_E^aS_E^bH_0^{-1}S_E^cS_E^e{|{\psi _{EF}}\rangle }\\&\quad ={\langle {\psi _{EF}}|}S_E^aS_E^bH_0^{-1}\left( \frac{1}{2}\{S_E^c,S_E^e\}-\frac{\lambda }{3}\delta _{ce}I +\frac{1}{2}[S_E^c,S_E^e]+\frac{\lambda }{3}\delta _{ce}I\right) {|{\psi _{EF}}\rangle }\\&\quad ={\langle {\psi _{EF}}|}S_E^aS_E^b\left( \frac{1}{3}\left( \frac{1}{2}\{S_E^c,S_E^e\}-\frac{\lambda }{3}\delta _{ce}I \right) +\frac{1}{2}[S_E^c,S_E^e]\right) {|{\psi _{EF}}\rangle }\\&\quad ={\langle {\psi _{EF}}|}S_E^aS_E^b\left( \frac{2}{3}S_E^cS_E^e-\frac{1}{3}S_E^eS_E^c-\frac{\lambda }{9}\delta _{ce} I \right) {|{\psi _{EF}}\rangle }\\&\quad =\frac{\lambda }{45}\left( (\lambda -\tfrac{9}{2})\delta _{ac}\delta _{be}+(\lambda +3)\delta _{ae}\delta _{bc}+(\tfrac{1}{2}-\tfrac{2}{3}\lambda )\delta _{ab}\delta _{ce}\right) \end{aligned}$$

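where the second equality uses that \(H_0^{-1}\) acts as \(\frac{1}{3}\) on the anticommutator term and as 1 on the commutator term, while the remaining term proportional to \({|{\psi _{EF}}\rangle }\) drops out because \(H_0^{-1}\) is taken on the excited space only, and the last equality uses Eqs. (45) and (46). Hence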

$$\begin{aligned} A-B=\frac{\mu _2^4\lambda }{45}\sum _{a,b}\left( (\tfrac{9}{2}-\lambda ){\widetilde{S}}^a{\widetilde{S}}^b {\widetilde{S}}^a {\widetilde{S}}^b+(4\lambda -3){\widetilde{S}}^a{\widetilde{S}}^b {\widetilde{S}}^b {\widetilde{S}}^a+(\tfrac{2}{3}\lambda -\tfrac{1}{2}){\widetilde{S}}^a{\widetilde{S}}^a {\widetilde{S}}^b {\widetilde{S}}^b\right) . \end{aligned}$$

Then we substitute in the following relations which are an easy consequence of Eq. (47):

$$\begin{aligned}&\sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^a{\widetilde{S}}^b\\&\quad =\sum _{a,b}\left( {\widetilde{S}}^a{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^b+ {\widetilde{S}}^a[{\widetilde{S}}^b,{\widetilde{S}}^a]{\widetilde{S}}^b\right) =\sum _{a,b}\left( {\widetilde{S}}^a{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^b+\sum _c i\epsilon _{bac}{\widetilde{S}}^a{\widetilde{S}}^c{\widetilde{S}}^b\right) \\&\quad =\sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^b -\sum _c{\widetilde{S}}^c{\widetilde{S}}^c,\\ \sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^b{\widetilde{S}}^a&\quad =\sum _{a,b}\left( {\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^a{\widetilde{S}}^b+ {\widetilde{S}}^a{\widetilde{S}}^b[{\widetilde{S}}^b,{\widetilde{S}}^a]\right) =\sum _{a,b}\left( {\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^a{\widetilde{S}}^b+\sum _c i\epsilon _{bac}{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^c\right) \\&\quad =\sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^a{\widetilde{S}}^b +\sum _c{\widetilde{S}}^c{\widetilde{S}}^c =\sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^b \end{aligned}$$

to get

$$\begin{aligned} A-B&=\frac{\mu _2^4\lambda }{45}\left( \left( \frac{11}{3}\lambda +1\right) \sum _{a,b}{\widetilde{S}}^a{\widetilde{S}}^a{\widetilde{S}}^b{\widetilde{S}}^b +\left( \lambda -\frac{9}{2}\right) \sum _ c{\widetilde{S}}^c{\widetilde{S}}^c \right) \\&=\mu _2^4\tfrac{\lambda }{135}\left( 4(11\lambda +3)h_{12}^2+(88\lambda ^2+30\lambda -27)h_{12}+(44\lambda ^2+18\lambda -27)\lambda I \right) \Pi \end{aligned}$$

where we have used \(\sum _c {\widetilde{S}}^c {\widetilde{S}}^c=2(h_{12}+\lambda I)\).
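The reordering relations and the final expansion can be spot-checked numerically. The following is a minimal sketch, assuming that the \(S^a\) are the standard spin-\((d-1)/2\) operators with \([S^a,S^b]=i\sum _c\epsilon _{abc}S^c\) and \(\sum _a S^aS^a=\lambda I\), which is consistent with the relations used above; the helper names `spin_ops` and `check_relations` are illustrative only.

```python
import numpy as np

def spin_ops(d):
    """Spin-(d-1)/2 operators (Sx, Sy, Sz) with [Sx, Sy] = i Sz."""
    s = (d - 1) / 2
    m = s - np.arange(d)                      # m = s, s-1, ..., -s
    sp = np.zeros((d, d), dtype=complex)      # raising operator S+
    for k in range(1, d):
        sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    sx = (sp + sp.conj().T) / 2
    sy = (sp - sp.conj().T) / 2j
    sz = np.diag(m).astype(complex)
    return sx, sy, sz

def check_relations(d):
    """Check the operator identities used to simplify A - B, for local dimension d."""
    S = spin_ops(d)
    lam = (d * d - 1) / 4
    I1, I2 = np.eye(d), np.eye(d * d)
    St = [np.kron(s, I1) + np.kron(I1, s) for s in S]   # S~^a = S_1^a + S_2^a
    h = sum(np.kron(s, s) for s in S)                   # h_12
    r = range(3)
    abab = sum(St[a] @ St[b] @ St[a] @ St[b] for a in r for b in r)
    abba = sum(St[a] @ St[b] @ St[b] @ St[a] for a in r for b in r)
    aabb = sum(St[a] @ St[a] @ St[b] @ St[b] for a in r for b in r)
    cc = sum(s @ s for s in St)
    # Bracket of the first line of the final expression for A - B ...
    lhs = (11 * lam / 3 + 1) * aabb + (lam - 4.5) * cc
    # ... and one third of the bracket of its second line.
    rhs = (4 * (11 * lam + 3) * (h @ h)
           + (88 * lam**2 + 30 * lam - 27) * h
           + (44 * lam**2 + 18 * lam - 27) * lam * I2) / 3
    return bool(np.allclose(cc, 2 * (h + lam * I2))
                and np.allclose(abab, aabb - cc)
                and np.allclose(abba, aabb)
                and np.allclose(lhs, rhs))
```

Calling `check_relations(d)` for a few small values such as d = 2, 3, 4 exercises all four identities at once.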

Let \(\mu _1=\alpha -\mu _2^4\frac{\lambda }{135}(88\lambda ^2+30\lambda -27)\) and \(\mu _2=(135\beta /4(11\lambda ^2+3\lambda ))^{1/4}\), noting that \(11\lambda ^2+3\lambda \) is positive for all \(d\geqslant 2\). Then

$$\begin{aligned} \Pi H_1 \Pi +A-B=(\alpha h_{12}+\beta h_{12}^2 +c I ) \Pi =V(\alpha h +\beta h^2 +cI)V^{\dagger } \end{aligned}$$

for some \(c \in {\mathbb {R}}\), and where \(V=I_{12} \otimes {|{\psi _{EF}}\rangle }\). So by Lemma 12 (fourth order) we can simulate \(\alpha h +\beta h^2 +cI\).

Finally, since this is a fourth-order gadget, we must check if there is any cross-gadget interference when we use multiple gadgets in parallel. Let \(\Pi _{{\text {tot}}}\) be the projector onto the ground space of all gadgets being applied in parallel. By Corollary 14, the interference between gadgets i and j is given by

$$\begin{aligned} -\frac{1}{2} \Pi _{{\text {tot}}}\left[ H_4^{(i)},H_4^{(j)}\right] ^2 \Pi _{{\text {tot}}}. \end{aligned}$$

If \(H_4^{(i)}\) and \(H_4^{(j)}\) commute then clearly there is no interference. Assume without loss of generality that gadget i simulates an interaction between qudits 1 and 2 with \(H_4^{(i)}=\mu _2^{(i)}( h_{1E_i}+h_{2E_i})\) and gadget j simulates an interaction between qudits 1 and 3 with \(H_4^{(j)}=\mu _2^{(j)} (h_{1E_j}+h_{3E_j})\). Normalising by a factor of \((\mu _2^{(i)})^2(\mu _2^{(j)})^2\) for convenience, the cross-gadget interference is proportional to

$$\begin{aligned}&-\frac{1}{2(\mu _2^{(i)})^2(\mu _2^{(j)})^2} \Pi _{{\text {tot}}}\left[ H_4^{(i)},H_4^{(j)}\right] ^2 \Pi _{{\text {tot}}}\\&\quad =-\frac{1}{2} \Pi _{{\text {tot}}}\left[ \sum _{a}(S_1^a+S_2^a)S_{E_i}^a,\sum _b (S_1^b+S_3^b)S_{E_j}^b\right] ^2\Pi _{{\text {tot}}}\\&\quad =-\frac{1}{2} \sum _{a,b,c,e}[S_1^a,S_1^b] [S_1^c,S_1^e]\Pi _{{\text {tot}}}S_{E_i}^a S_{E_i}^c S_{E_j}^b S_{E_j}^e\Pi _{{\text {tot}}}\\&\quad =-\frac{1}{2} \frac{\lambda ^2}{9}\sum _{a,b}[S_1^a,S_1^b][S_1^a,S_1^b] =\frac{\lambda ^2}{18} \sum _{a,b,c,e}\epsilon _{abc}\epsilon _{abe}S_1^cS_1^e\\&\quad =\frac{\lambda ^2}{9} \sum _{c}S_1^cS_1^c=\frac{\lambda ^3}{9}I \end{aligned}$$

where we have used Eq. (45) in the third equality. Therefore the cross-gadget interference is proportional to the identity, which corresponds only to an unimportant energy shift, and so can be ignored. \(\quad \square \)
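The last steps of the interference calculation only involve single-qudit spin operators, so they can be checked directly. This is a small sketch under the assumption that Eq. (45) has already been used to reduce the ancilla expectation values to \(\frac{\lambda }{3}\delta _{ac}\) and \(\frac{\lambda }{3}\delta _{be}\); the function names are illustrative.

```python
import numpy as np

def spin_ops(d):
    """Spin-(d-1)/2 operators (Sx, Sy, Sz) with [Sx, Sy] = i Sz."""
    s = (d - 1) / 2
    m = s - np.arange(d)                      # m = s, s-1, ..., -s
    sp = np.zeros((d, d), dtype=complex)      # raising operator S+
    for k in range(1, d):
        sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    return (sp + sp.conj().T) / 2, (sp - sp.conj().T) / 2j, np.diag(m).astype(complex)

def interference(d):
    """Evaluate -(1/2)(lambda^2/9) sum_{a,b} [S^a, S^b]^2 on a single qudit."""
    S = spin_ops(d)
    lam = (d * d - 1) / 4
    comm = lambda x, y: x @ y - y @ x
    total = sum(comm(S[a], S[b]) @ comm(S[a], S[b])
                for a in range(3) for b in range(3))
    return -0.5 * (lam**2 / 9) * total
```

The returned matrix can be compared with \(\frac{\lambda ^3}{9}I\), for example via `np.allclose(interference(d), ((d * d - 1) / 4) ** 3 / 9 * np.eye(d))`.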

7.2 h and \(h^2\) simulate qutrit SU(3) Heisenberg interaction

Let C be the Casimir operator corresponding to the \(\{S^a_1 + S^a_2\}_a\) representation of \(\mathfrak {su}(2)\). Given access to \(h^2\) and h interactions, we can produce the two-qudit interaction

$$\begin{aligned} H_0=(C-2I)^2&=\left( \sum _a (S_1^a+S_2^a)(S_1^a+S_2^a)-2I\right) ^2=(2h_{12}+2\lambda I-2I)^2\\&=4\left( h_{12}^2+2(\lambda -1)h_{12}+(\lambda -1)^2 I\right) , \end{aligned}$$

where as before \(\lambda = (d^2-1)/4\). This operator is clearly positive semidefinite and has eigenvalue zero only on the 3-dimensional representation in the decomposition (44), since the 3-dimensional representation has Casimir eigenvalue 2. We will use this 3-dimensional space to encode a logical qutrit. For any 4 qudits (1, 2), (3, 4), where each pair is restricted to this space, the operator
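As a numerical illustration of these properties of \(H_0\), the sketch below builds \((C-2I)^2\) for a given local dimension d, compares it with the expanded form in terms of \(h_{12}\), and counts the zero eigenvalues. It assumes the standard spin-\((d-1)/2\) matrices for \(S^a\); `spin_ops` and `heavy_term_checks` are hypothetical helper names.

```python
import numpy as np

def spin_ops(d):
    """Spin-(d-1)/2 operators (Sx, Sy, Sz) with [Sx, Sy] = i Sz."""
    s = (d - 1) / 2
    m = s - np.arange(d)                      # m = s, s-1, ..., -s
    sp = np.zeros((d, d), dtype=complex)      # raising operator S+
    for k in range(1, d):
        sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    return (sp + sp.conj().T) / 2, (sp - sp.conj().T) / 2j, np.diag(m).astype(complex)

def heavy_term_checks(d):
    """Return (expansion holds, H0 is PSD, dimension of the zero eigenspace)."""
    S = spin_ops(d)
    lam = (d * d - 1) / 4
    I1, I2 = np.eye(d), np.eye(d * d)
    J = [np.kron(s, I1) + np.kron(I1, s) for s in S]     # S_1^a + S_2^a
    C = sum(j @ j for j in J)                             # Casimir operator
    h = sum(np.kron(s, s) for s in S)                     # h_12
    H0 = (C - 2 * I2) @ (C - 2 * I2)
    expanded = 4 * (h @ h + 2 * (lam - 1) * h + (lam - 1) ** 2 * I2)
    evals = np.linalg.eigvalsh(H0)
    kernel_dim = int(np.sum(np.abs(evals) < 1e-8))
    return bool(np.allclose(H0, expanded)), bool(evals.min() > -1e-8), kernel_dim
```

For small d the third entry of the returned tuple is the dimension of the space used to encode the logical qutrit.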

$$\begin{aligned} H_1 = h_{13}+h_{14}+h_{23}+h_{24}=\sum _a (S_1^a+S_2^a)(S_3^a+S_4^a) \end{aligned}$$

acts as a logical qutrit SU(2) Heisenberg interaction. So by Lemma 8 (first order) we can use the interactions h and \(h^2\) to simulate a Hamiltonian of \(h'\) interactions between qutrits, where \(h'\) is the SU(2) Heisenberg qutrit interaction (Fig. 10).

Fig. 10. Interaction graph of the gadget used in Sect. 7.2. A logical qutrit is encoded into each pair of qudits (1, 2) and (3, 4)
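The claim that \(H_1\) acts as a spin-1 Heisenberg interaction on the encoded space can be illustrated by comparing spectra. The sketch below is written under stated assumptions: it uses the standard spin-\((d-1)/2\) matrices, obtains an isometry onto the Casimir-eigenvalue-2 subspace from a numerical eigendecomposition (so the logical basis is arbitrary), and uses the fact that \(H_1=\sum _a (S_1^a+S_2^a)(S_3^a+S_4^a)\) compresses to \(\sum _a J_L^a\otimes J_L^a\), where \(J_L^a\) is the encoded total spin of one pair. The names are illustrative.

```python
import numpy as np

def spin_ops(d):
    """Spin-(d-1)/2 operators (Sx, Sy, Sz) with [Sx, Sy] = i Sz."""
    s = (d - 1) / 2
    m = s - np.arange(d)                      # m = s, s-1, ..., -s
    sp = np.zeros((d, d), dtype=complex)      # raising operator S+
    for k in range(1, d):
        sp[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    return (sp + sp.conj().T) / 2, (sp - sp.conj().T) / 2j, np.diag(m).astype(complex)

def logical_spectrum(d):
    """Sorted eigenvalues of H_1 restricted to the two encoded qutrits."""
    S = spin_ops(d)
    I1 = np.eye(d)
    J = [np.kron(s, I1) + np.kron(I1, s) for s in S]   # total spin of one pair
    C = sum(j @ j for j in J)
    w, U = np.linalg.eigh(C)
    enc = U[:, np.abs(w - 2) < 1e-8]                   # d^2 x 3 isometry onto the encoded space
    JL = [enc.conj().T @ j @ enc for j in J]           # encoded spin operators (3 x 3)
    H1_logical = sum(np.kron(j, j) for j in JL)        # compression of H_1 to the 9-dim space
    return np.sort(np.linalg.eigvalsh(H1_logical))
```

The result can be compared with the spectrum of the spin-1 Heisenberg interaction, which has eigenvalues \(-2\), \(-1\), 1 with multiplicities 1, 3, 5.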

Then by Lemma 25, it is possible to simulate any Hamiltonian \(H=\sum _{ij} \alpha _{ij} h'_{ij}+\beta _{ij} (h'_{ij})^2\), where \(\beta _{ij}\geqslant 0\). In particular one can set \(\alpha _{ij}=\beta _{ij}\) and simulate \(\sum _{ij} \beta _{ij}( h'_{ij}+(h'_{ij})^2)\). Then \(h'+(h')^2\) is the SU(3) Heisenberg interaction, which is universal by Theorem 3 (even with non-negative weights). This completes the proof of the following theorem:

Theorem 6 (restated). For any \(d \geqslant 2\), the SU(2) Heisenberg interaction \(h= S^x \otimes S^x + S^y \otimes S^y + S^z \otimes S^z\), where \(S^x\), \(S^y\), \(S^z\) are representations of the Pauli matrices X, Y, Z, is universal.

8 Bilinear-Biquadratic Interaction in Dimension 3

We finally consider an important variant of the SU(2) Heisenberg model: the bilinear-biquadratic spin-1 Heisenberg model (i.e. in local dimension 3). Write \(X_3\), \(Y_3\), \(Z_3\) for matrices such that \(\{iX_3,iY_3,iZ_3\}\) generate a 3-dimensional irreducible representation of \(\mathfrak {su}(2)\). For example, we can take

$$\begin{aligned} X_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 &{} 1 &{} 0\\ 1&{} 0&{} 1\\ 0 &{} 1 &{} 0\end{pmatrix},\;\;\;\;Y_3 = \frac{i}{\sqrt{2}} \begin{pmatrix} 0 &{} -1 &{} 0\\ 1&{} 0&{} -1\\ 0 &{} 1 &{} 0\end{pmatrix},\;\;\;\;Z_3 = \begin{pmatrix} 1 &{} 0 &{} 0\\ 0&{} 0&{} 0\\ 0 &{} 0 &{} -1\end{pmatrix}; \end{aligned}$$

note that these obey the same commutation relations as the Pauli matrices (up to a scaling constant). Then the Heisenberg interaction is

$$\begin{aligned} h = X_3 \otimes X_3 + Y_3 \otimes Y_3 + Z_3 \otimes Z_3. \end{aligned}$$

Consider the algebra generated by h. We have \(h^3 = h - 2 h^2 + 2 I\), so up to scaling and an identity term any nontrivial interaction in this algebra can be written as \(h^{(\theta )} := (\cos \theta ) h + (\sin \theta ) h^2\) for some \(\theta \). Let \(\alpha =\cos \theta \) and \(\beta =\sin \theta \). Because of our freedom to choose the signs of interactions, we can further assume that \(0 \leqslant \theta \leqslant \pi \), and thus \(\beta \geqslant 0\). Then any Hamiltonian produced from such interactions can be written, up to an overall identity term, as

$$\begin{aligned} H = \sum _{i < j} a_{ij} h^{(\theta )}_{ij}. \end{aligned}$$

This model is known as the (general) bilinear-biquadratic Heisenberg model and has been a popular object of study [1, 25, 29, 31]. The special case \(\theta = \arctan 1/3\) is the interaction proportional to \(h + \frac{1}{3}h^2\) occurring in the famous AKLT model [2], which was handled in Lemma 11. We also already showed that the cases \(\theta \in \{0,\pi /4\}\) are universal in the previous section (\(\pi /4\) corresponds to the SU(3) Heisenberg interaction); here we prove universality for all other values of \(\theta \).

It is easy to check that h has three eigenspaces with eigenvalues \(-2\), \(-1\), 1 and dimensions 1, 3, 5 respectively. Therefore \(h^{(\theta )}\) has eigenvalues \(4\beta -2\alpha , \beta -\alpha ,\beta +\alpha \) with respect to the same eigenspaces. In addition, \(h^2\) is proportional to the projector onto \({|{\psi }\rangle }={|{02}\rangle }-{|{11}\rangle }+{|{20}\rangle }\) plus a multiple of the identity. Depending on \(\theta \), \(h^{(\theta )}\) has the following properties:

  • \(\theta = 0\): \(h^{(\theta )} = h\). The Heisenberg model.

  • \(0< \theta < \arctan 1/3\): ground state nondegenerate and equal to \({|{02}\rangle }-{|{11}\rangle }+{|{20}\rangle }\).

  • \(\theta = \arctan 1/3\): ground space 4-fold degenerate (the AKLT model).

  • \(\arctan 1/3< \theta < \pi /2\): ground space 3-fold degenerate and spanned by

    $$\begin{aligned} \{{|{01}\rangle }-{|{10}\rangle },{|{12}\rangle }-{|{21}\rangle },{|{02}\rangle }-{|{20}\rangle }\}. \end{aligned}$$
    (49)
  • \(\theta = \pi /2\): ground space 8-fold degenerate and the orthogonal complement of \({|{02}\rangle }-{|{11}\rangle }+{|{20}\rangle }\). The case \(h^{(\theta )} = h^2\).

  • \(\pi /2< \theta < \pi \): ground space 5-fold degenerate.

The special case \(\theta = \pi /4\) gives the qutrit swap operator (up to rescaling and subtracting an identity term), which is in addition SU(3)-invariant. For \(\theta > \pi /4\), the highest energy state is nondegenerate and is \({|{02}\rangle }-{|{11}\rangle }+{|{20}\rangle }\).
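The spectral facts listed above are easy to reproduce numerically. The following sketch uses the explicit matrices \(X_3\), \(Y_3\), \(Z_3\) given earlier and the basis ordering \({|{ij}\rangle }\mapsto 3i+j\); the function and variable names are illustrative.

```python
import numpy as np

r2 = np.sqrt(2)
X3 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / r2
Y3 = 1j * np.array([[0, -1, 0], [1, 0, -1], [0, 1, 0]], dtype=complex) / r2
Z3 = np.diag([1, 0, -1]).astype(complex)

h = sum(np.kron(m, m) for m in (X3, Y3, Z3))   # spin-1 Heisenberg interaction
I9 = np.eye(9)

# Normalised state |02> - |11> + |20>, with |ij> stored at index 3*i + j.
psi = np.zeros(9)
psi[[2, 6]] = 1
psi[4] = -1
psi /= np.sqrt(3)

# Qutrit swap operator.
SWAP = np.zeros((9, 9))
for i in range(3):
    for j in range(3):
        SWAP[3 * i + j, 3 * j + i] = 1

def h_theta(theta):
    """Bilinear-biquadratic interaction cos(theta) h + sin(theta) h^2."""
    return np.cos(theta) * h + np.sin(theta) * (h @ h)

def ground_degeneracy(theta, tol=1e-9):
    """Number of eigenvalues of h_theta within tol of the smallest one."""
    w = np.linalg.eigvalsh(h_theta(theta))
    return int(np.sum(w < w[0] + tol))
```

With these definitions one can check, for example, `np.allclose(h @ h @ h, h - 2 * h @ h + 2 * I9)`, `np.allclose(h @ h, I9 + 3 * np.outer(psi, psi))` and `np.allclose(h_theta(np.pi / 4) * r2, SWAP + I9)`, and evaluate `ground_degeneracy` at sample angles in each of the ranges listed above.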

8.1 Mediator gadget

We first consider the case where the state \({|{\psi }\rangle }={|{02}\rangle }-{|{11}\rangle }+{|{20}\rangle }\) is either the unique ground state or highest excited state of \(h^{(\theta )}\).

Lemma 26

Let \(\theta \in (0,\arctan 1/3) \cup (\pi /4, \pi )\setminus \{\arctan 2\}\). Then \(h^{(\theta )}\) is universal.

Proof

Our strategy will be to use a second-order gadget via Lemma 9 to implement the effective interaction \(h^{(\theta ')}\) for any choice of \(\theta '\). In particular this allows us to simulate the interaction \(h^{(\pi /4)}\) which is the qutrit swap operator—the unique SU(3) invariant interaction shown to be universal in Theorem 3. To use this approach, we need to define Hamiltonians \(H_0\), \(H_1\), \(H_2\) on a system of 4 qutrits. We label these qutrits 1, 2, 3, 4 where qutrits 3 and 4 are mediator qutrits, and the effective interaction \(h^{(\theta ')}\) is simulated on qutrits 1 and 2 (Fig. 11).

Fig. 11. Interaction graph of the mediator gadget used in the proof of Lemma 26. \(H_0\) acts on the mediator qutrits 3 and 4, and there is an effective interaction between the qutrits 1 and 2

The condition on \(\theta \) implies that \(\beta >0\) and that either \(\alpha > 3\beta \) or \(\alpha < \beta \). Consider the operator \(h^{(\theta )}+(2\alpha -4\beta )I\), which annihilates \({|{\psi }\rangle }={|{02}\rangle }-{|{11}\rangle }+{|{20}\rangle }\), and has eigenvalues \(\alpha -3\beta \) and \(3\alpha -3\beta \) on the two eigenspaces of h with dimension 3 and 5 respectively, which correspond to the eigenvalues \(-1\) and \(+1\) of h. If \(\alpha > 3\beta \) then both of these eigenvalues are positive and we set \(H_0=h^{(\theta )}_{34}+(2\alpha -4\beta )I\), while if \(\alpha <\beta \) then both of these eigenvalues are negative and the proof will continue analogously with \(H_0=-(h^{(\theta )}_{34}+(2\alpha -4\beta )I)\).

In either case, \(\Pi =I \otimes {|{\psi _{34}}\rangle }{\langle {\psi _{34}}|}\) is the projector onto the ground space of \(H_0\). Let \(H_1=\lambda _1 h^{(\theta )}_{12}\) for some \(\lambda _1 \in {\mathbb {R}}\), so that \(H_1\) commutes with \(\Pi \), and \(\Pi H_1\Pi =\lambda _1 h_{12}^{(\theta )} \Pi \). Then we choose

$$\begin{aligned} H_2=\lambda _2 \left( h^{(\theta )}_{13}+h^{(\theta )}_{23}-\frac{8\beta }{3} I\right) = \lambda _2(\alpha -\beta /2)A+\lambda _2\beta B \end{aligned}$$

where \(A=h_{13}+h_{23}\), \(B=h_{13}^2+\tfrac{1}{2}h_{13}+ h_{23}^2+\tfrac{1}{2}h_{23}-\tfrac{8}{3}I\), and \(\lambda _2 \in {\mathbb {R}}\). It is easy to check that for any \({|{\phi _{12}}\rangle }\), \(h_{13}{|{\phi _{12}}\rangle }{|{\psi _{34}}\rangle }\) and \(h_{23}{|{\phi _{12}}\rangle }{|{\psi _{34}}\rangle }\) are in the eigenspace of \(h_{34}\) with eigenvalue \(-1\), and therefore that \(A\Pi \) has support only on the eigenspace of \(H_0\) with eigenvalue \(\alpha -3\beta \). Similarly, one can check that \((h_{13}^2 +\frac{1}{2}h_{13}-\frac{4}{3}I){|{\phi _{12}}\rangle }{|{\psi _{34}}\rangle }\) and \((h_{23}^2 +\frac{1}{2}h_{23}-\frac{4}{3}I){|{\phi _{12}}\rangle }{|{\psi _{34}}\rangle }\) are in the eigenspace of \(h_{34}\) with eigenvalue \(+1\), which implies that \(B\Pi \) has support only on the eigenspace of \(H_0\) with eigenvalue \(3\alpha -3\beta \).

Therefore neither \(A\Pi \) nor \(B\Pi \) has support on the eigenspace of \(H_0\) with eigenvalue 0, and so \(\Pi H_2 \Pi =0\) as required to apply Lemma 9. The second-order term is given by

$$\begin{aligned} \Pi H_2 H_0^{-1} H_2 \Pi =\lambda _2^2\frac{(\alpha -\beta /2)^2}{\alpha -3\beta }\Pi A^2\Pi +\lambda _2^2\frac{\beta ^2}{3\alpha -3\beta }\Pi B^2\Pi . \end{aligned}$$

Calculating \(\Pi A^2 \Pi \) and \(\Pi B^2 \Pi \) separately we find that

$$\begin{aligned} \Pi A^2 \Pi&=\Pi (h_{13}^2 +h_{13}h_{23}+h_{23}h_{13}+h_{23}^2)\Pi =\frac{4}{3}(2I +h_{12})\Pi ,\\ \Pi B^2\Pi&=\left( \frac{2}{3}h_{12}^2+\frac{1}{3}h_{12}+\frac{2}{9}I\right) \Pi . \end{aligned}$$
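These projected operators, together with the support properties of \(A\Pi \) and \(B\Pi \), can be verified on the full 81-dimensional space of four qutrits. The sketch below assumes the explicit spin-1 matrices given earlier, labels the qutrits 1, 2, 3, 4 as sites 0, 1, 2, 3, and normalises \({|{\psi _{34}}\rangle }\); the helper names `on_site` and `heis` are illustrative.

```python
import numpy as np

r2 = np.sqrt(2)
S = (np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / r2,
     1j * np.array([[0, -1, 0], [1, 0, -1], [0, 1, 0]], dtype=complex) / r2,
     np.diag([1, 0, -1]).astype(complex))
I3 = np.eye(3)

def on_site(op, k, n=4):
    """Embed a single-qutrit operator on site k (0-based) of n qutrits."""
    out = np.eye(1)
    for site in range(n):
        out = np.kron(out, op if site == k else I3)
    return out

def heis(i, j):
    """Heisenberg interaction h_ij between sites i and j."""
    return sum(on_site(s, i) @ on_site(s, j) for s in S)

I81 = np.eye(81)
h12, h13, h23, h34 = heis(0, 1), heis(0, 2), heis(1, 2), heis(2, 3)

# Normalised mediator state |02> - |11> + |20> on qutrits 3 and 4.
psi = np.zeros(9)
psi[[2, 6]] = 1
psi[4] = -1
psi /= np.sqrt(3)
Pi = np.kron(np.eye(9), np.outer(psi, psi))

A = h13 + h23
B = h13 @ h13 + h13 / 2 + h23 @ h23 + h23 / 2 - (8 / 3) * I81

checks = {
    "Pi A Pi = 0": np.allclose(Pi @ A @ Pi, 0),
    "Pi B Pi = 0": np.allclose(Pi @ B @ Pi, 0),
    "A Pi lies in the h34 = -1 eigenspace": np.allclose(h34 @ A @ Pi, -A @ Pi),
    "B Pi lies in the h34 = +1 eigenspace": np.allclose(h34 @ B @ Pi, B @ Pi),
    "Pi A^2 Pi": np.allclose(Pi @ A @ A @ Pi, (4 / 3) * (2 * I81 + h12) @ Pi),
    "Pi B^2 Pi": np.allclose(
        Pi @ B @ B @ Pi, ((2 / 3) * (h12 @ h12) + h12 / 3 + (2 / 9) * I81) @ Pi),
}
```

Each entry of `checks` corresponds to one statement in the text, so a failing entry points at the step to re-examine.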

Let \(V= I_{12}\otimes {|{\psi _{34}}\rangle }\) so that \(VV^{\dagger }=\Pi \) and

$$\begin{aligned} \Pi H_1 \Pi - \Pi H_2H_0^{-1}H_2\Pi =V\left( \lambda _1 h^{(\theta )}+\lambda _2^2 \tilde{h}(\theta )\right) V^{\dagger } \end{aligned}$$

where

$$\begin{aligned} \tilde{h}(\theta )=\frac{2}{9( \alpha - \beta )}\left( \beta ^2 h^2+\frac{6 \alpha ^3 - 12 \alpha ^2 \beta + 8 \alpha \beta ^2 - 3 \beta ^3}{\alpha -3\beta }h+\frac{2 (18 \alpha ^3 - 36 \alpha ^2 \beta + 23 \alpha \beta ^2 - 6 \beta ^3)}{3(\alpha -3\beta )}I\right) . \end{aligned}$$

Therefore by Lemma 9 (second order), we can simulate \(\lambda _1 h^{(\theta )}+\lambda _2^2 \tilde{h}(\theta )\). By repeating the same calculation with \(H_2=\lambda _2(h^{(\theta )}_{13}-h^{(\theta )}_{23})\), it is possible to simulate the interaction \(\lambda _1 h_{12}^{(\theta )}-\lambda _2^2 \tilde{h}_{12}(\theta )\) instead. For all \(\theta \) satisfying the conditions in the lemma, it is easy to check that the 2-local part of \(\tilde{h}_{12}(\theta )\) is linearly independent of \(h_{12}^{(\theta )}\). So, by choosing \(\lambda _1\), \(\lambda _2\) appropriately, we can use this gadget to simulate any desired interaction \(h^{(\theta ')}\) (with an arbitrary weight), and in particular the case \(\theta ' = \pi /4\). \(\quad \square \)
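The linear-independence claim can be made explicit from the displayed expression for \(\tilde{h}(\theta )\). The following short calculation is a check derived from that expression rather than an additional result; the symbol p is introduced here only for convenience. Writing \(p=6 \alpha ^3 - 12 \alpha ^2 \beta + 8 \alpha \beta ^2 - 3 \beta ^3\), the 2-local part of \(\tilde{h}(\theta )\) is proportional to \(\beta ^2 h^2+\frac{p}{\alpha -3\beta }h\), which is parallel to \(\alpha h+\beta h^2\) if and only if

$$\begin{aligned} 0&=\beta \cdot \frac{p}{\alpha -3\beta }-\alpha \beta ^2 =\frac{\beta \left( p-\alpha \beta (\alpha -3\beta )\right) }{\alpha -3\beta } =\frac{\beta \left( 6\alpha ^3-13\alpha ^2\beta +11\alpha \beta ^2-3\beta ^3\right) }{\alpha -3\beta }\\&=\frac{\beta (2\alpha -\beta )\left( 3\alpha ^2-5\alpha \beta +3\beta ^2\right) }{\alpha -3\beta }. \end{aligned}$$

Since \(3\alpha ^2-5\alpha \beta +3\beta ^2=3\left( \alpha -\tfrac{5}{6}\beta \right) ^2+\tfrac{11}{12}\beta ^2>0\) for \(\beta >0\), the only solution with \(\beta >0\) is \(\beta =2\alpha \), i.e. \(\theta =\arctan 2\), which is exactly the value excluded in the statement of the lemma.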

8.2 Logical qutrit gadget

In the next case we consider, \(h^{(\theta )}\) has a 3-dimensional ground space.

Lemma 27

Let \(\theta \in (\arctan 1/3,\arctan 5)\). Then \(h^{(\theta )}\) is universal.

Proof

In this case, the condition on \(\theta \) implies that \(0< \beta /5<\alpha < 3 \beta \) and that \(h^{(\theta )}\)’s ground space is the 3-dimensional space with basis (49). Let \(V:{\mathbb {C}}^3 \rightarrow ({\mathbb {C}}^3)^{\otimes 2}\) be the isometry defined by

$$\begin{aligned} V=\frac{{|{01}\rangle }-{|{10}\rangle }}{\sqrt{2}}{\langle {0}|}+\frac{{|{12}\rangle }-{|{21}\rangle }}{\sqrt{2}}{\langle {1}|}+\frac{{|{02}\rangle }-{|{20}\rangle }}{\sqrt{2}}{\langle {2}|}, \end{aligned}$$

which maps onto the ground space of \(h^{(\theta )}\).

Fig. 12. Interaction graph of the gadget used in the proof of Lemma 27. A logical qutrit is encoded into each pair of qutrits (1, 2) and (3, 4)

We will construct a second-order gadget that encodes each logical qutrit into one of these 3-dimensional ground spaces of two physical qutrits (Fig. 12). Using Lemma 9, we choose \(H_0\), \(H_1\) and \(H_2\) such that the effective interaction between logical qutrits is proportional to \(h+h^2\), the SU(3) invariant SWAP interaction shown to be universal in Theorem 3.

By the anti-interference discussion presented in [17, Lemma 36], it will suffice to consider just two logical qutrits encoded in 4 physical qutrits. Let one logical qutrit be encoded into the ground space of \(h_{12}^{(\theta )}\) in a pair of physical qutrits labelled 1, 2 and a second logical qutrit be encoded into the ground space of \(h^{(\theta )}_{34}\) in a pair of physical qutrits labelled 3, 4. The overall heavy Hamiltonian \(H_0\), with an appropriate multiple of the identity to ensure the ground state energy is zero, is given by

$$\begin{aligned} H_0=h^{(\theta )}_{12}+h^{(\theta )}_{34}+2(\alpha -\beta )I. \end{aligned}$$

Let \(\Pi \) be the projector onto the 9-dimensional ground space of \(H_0\), in which the two logical qutrits are encoded. One can check that for \(i \in \{1,2\}\) and \(j \in \{3,4\}\),

$$\begin{aligned} \Pi h^{(\theta )}_{ij}\Pi =(V \otimes V) \left( \frac{1}{4} h^{(\theta )} + \beta I\right) (V \otimes V)^{\dagger } \end{aligned}$$
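The projection identity can be checked numerically on four qutrits. The sketch below makes two assumptions that should be read as part of the example rather than of the proof: the physical qutrits 1, 2, 3, 4 are labelled as sites 0, 1, 2, 3, and the columns of the isometry are ordered as \(({|{01}\rangle }-{|{10}\rangle })/\sqrt{2}\), \(({|{02}\rangle }-{|{20}\rangle })/\sqrt{2}\), \(({|{12}\rangle }-{|{21}\rangle })/\sqrt{2}\). This ordering is a relabelling of the logical basis relative to the displayed V, chosen so that the encoded spin operators coincide with the matrices \(X_3\), \(Y_3\), \(Z_3\) and the comparison can be made entry by entry.

```python
import numpy as np

r2 = np.sqrt(2)
S = (np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / r2,
     1j * np.array([[0, -1, 0], [1, 0, -1], [0, 1, 0]], dtype=complex) / r2,
     np.diag([1, 0, -1]).astype(complex))
I3 = np.eye(3)

def on_site(op, k, n=4):
    """Embed a single-qutrit operator on site k (0-based) of n qutrits."""
    out = np.eye(1)
    for site in range(n):
        out = np.kron(out, op if site == k else I3)
    return out

def h_theta_pair(i, j, alpha, beta, n=4):
    """alpha * h_ij + beta * h_ij^2 on n qutrits."""
    h = sum(on_site(s, i, n) @ on_site(s, j, n) for s in S)
    return alpha * h + beta * (h @ h)

def antisym(i, j):
    """Normalised antisymmetric state (|ij> - |ji>)/sqrt(2)."""
    v = np.zeros(9, dtype=complex)
    v[3 * i + j], v[3 * j + i] = 1, -1
    return v / r2

# Column ordering is an assumption of this sketch (see the lead-in).
V = np.column_stack([antisym(0, 1), antisym(0, 2), antisym(1, 2)])
VV = np.kron(V, V)   # encodes two logical qutrits into four physical qutrits

def is_ground_space(theta):
    """V is an isometry onto the lowest eigenspace of h_theta (two qutrits)."""
    alpha, beta = np.cos(theta), np.sin(theta)
    h2 = h_theta_pair(0, 1, alpha, beta, n=2)
    w = np.linalg.eigvalsh(h2)
    return bool(np.allclose(V.conj().T @ V, np.eye(3))
                and np.allclose(h2 @ V, w[0] * V))

def projection_matches(theta):
    """Compression of each h_theta_ij equals h_theta / 4 + beta * I."""
    alpha, beta = np.cos(theta), np.sin(theta)
    target = 0.25 * h_theta_pair(0, 1, alpha, beta, n=2) + beta * np.eye(9)
    return all(
        np.allclose(VV.conj().T @ h_theta_pair(i, j, alpha, beta) @ VV, target)
        for i in (0, 1) for j in (2, 3))
```

Here `projection_matches` compares the compressed operators \((V\otimes V)^{\dagger }h^{(\theta )}_{ij}(V\otimes V)\), which is equivalent to the displayed identity because \(\Pi =(V\otimes V)(V\otimes V)^{\dagger }\).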

Let \(H_2=\lambda _2(h^{(\theta )}_{13}-h^{(\theta )}_{24})\) so that \(\Pi H_2 \Pi =0\). Using a computer algebra package we can calculate the second-order term, remembering that \(H_0\) has zero energy on its ground space, and that \(H_0^{-1}\) denotes the inverse computed on the higher energy space only:

$$\begin{aligned} -\Pi H_2 H_0^{-1} H_2 \Pi&= \frac{\lambda _2^2}{2\alpha (\alpha -3\beta )}(V \otimes V)\bigg ( (-3\alpha ^3 + 6 \alpha ^2 \beta - 8\alpha \beta ^2 + \beta ^3)h \\&-\frac{1}{2}(5 \alpha ^3 - 7 \alpha ^2\beta + 9 \alpha \beta ^2 + \beta ^3)h^2+c I\bigg ) (V \otimes V)^{\dagger } \end{aligned}$$

for some \(c \in {\mathbb {R}}\).
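A numerical counterpart to this computer-algebra step is sketched below. It evaluates \(-\Pi H_2 H_0^{-1} H_2 \Pi \) on four qutrits, compresses it to the two logical qutrits, and fits the result to a combination of h, \(h^2\) and I by least squares. As assumptions of the sketch, qutrits 1, 2, 3, 4 are labelled 0, 1, 2, 3 and the logical basis is ordered so that the encoded spin operators are \(X_3\), \(Y_3\), \(Z_3\) (a relabelling of the columns of the displayed V); the function names are illustrative.

```python
import numpy as np

r2 = np.sqrt(2)
S = (np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / r2,
     1j * np.array([[0, -1, 0], [1, 0, -1], [0, 1, 0]], dtype=complex) / r2,
     np.diag([1, 0, -1]).astype(complex))
I3 = np.eye(3)

def on_site(op, k, n=4):
    """Embed a single-qutrit operator on site k (0-based) of n qutrits."""
    out = np.eye(1)
    for site in range(n):
        out = np.kron(out, op if site == k else I3)
    return out

def h_theta_pair(i, j, alpha, beta):
    h = sum(on_site(s, i) @ on_site(s, j) for s in S)
    return alpha * h + beta * (h @ h)

def antisym(i, j):
    v = np.zeros(9, dtype=complex)
    v[3 * i + j], v[3 * j + i] = 1, -1
    return v / r2

# Column ordering is an assumption of this sketch (see the lead-in).
V = np.column_stack([antisym(0, 1), antisym(0, 2), antisym(1, 2)])
VV = np.kron(V, V)

def second_order_coeffs(theta, lam2=1.0):
    """Fit -Pi H2 H0^{-1} H2 Pi to x*h + y*h^2 + c*I; return ((x, y, c), residual)."""
    alpha, beta = np.cos(theta), np.sin(theta)
    H0 = (h_theta_pair(0, 1, alpha, beta) + h_theta_pair(2, 3, alpha, beta)
          + 2 * (alpha - beta) * np.eye(81))
    H2 = lam2 * (h_theta_pair(0, 2, alpha, beta) - h_theta_pair(1, 3, alpha, beta))
    w, U = np.linalg.eigh(H0)
    inv_w = np.array([1 / x if abs(x) > 1e-9 else 0.0 for x in w])
    H0_inv = (U * inv_w) @ U.conj().T            # inverse on the excited space only
    M = -(VV.conj().T @ H2 @ H0_inv @ H2 @ VV)   # 9 x 9 logical operator
    h = sum(np.kron(s, s) for s in S)
    basis = np.column_stack([h.ravel(), (h @ h).ravel(), np.eye(9).ravel()])
    coeffs, *_ = np.linalg.lstsq(basis, M.ravel(), rcond=None)
    residual = float(np.linalg.norm(basis @ coeffs - M.ravel()))
    return coeffs.real, residual

def paper_coeffs(theta, lam2=1.0):
    """Coefficients of h and h^2 read off from the displayed formula."""
    a, b = np.cos(theta), np.sin(theta)
    pref = lam2**2 / (2 * a * (a - 3 * b))
    x = pref * (-3 * a**3 + 6 * a**2 * b - 8 * a * b**2 + b**3)
    y = -0.5 * pref * (5 * a**3 - 7 * a**2 * b + 9 * a * b**2 + b**3)
    return x, y
```

For a value of \(\theta \) in the range of the lemma, the first two entries returned by `second_order_coeffs(theta)` can be compared with `paper_coeffs(theta)`, and a small residual indicates that the compressed operator is indeed a combination of h, \(h^2\) and I. At \(\theta =\pi /4\) both interactions reduce to swap operators, which gives a case that can also be worked out by hand.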

Let \(H_1=4\lambda _1 h^{(\theta )}_{13}\) so that \(\Pi H_1\Pi = \lambda _1(h_{L}^{(\theta )} + 4\beta I) \Pi =\lambda _1(\alpha h_L +\beta h_L^2 + 4\beta I ) \Pi \), where the subscript L denotes an operator acting on the two logical qutrits. Then by Lemma 9 (second order), choosing \(H_0\) and \(H_2\) as above and setting \(\lambda _1=\alpha -\beta \), \(\lambda _2=2\sqrt{\alpha }\), we can simulate

$$\begin{aligned} \frac{5\alpha ^3 - 8 \alpha ^2 \beta + 13 \alpha \beta ^2 - 2 \beta ^3}{3 \beta -\alpha }\left( h+h^2\right) +\tilde{c}I \end{aligned}$$

for some \(\tilde{c} \in {\mathbb {R}}\), which is the SU(3) Heisenberg interaction as desired, up to rescaling and deletion of an identity term. We note that \(3\beta -\alpha >0\) and

$$\begin{aligned} 5\alpha ^3 - 8 \alpha ^2 \beta + 13 \alpha \beta ^2 - 2 \beta ^3=(5\alpha -\beta )(\alpha -\sqrt{2} \beta )^2 +(10\sqrt{2}-7)\alpha ^2\beta +(3-2\sqrt{2})\alpha \beta ^2>0 \end{aligned}$$

since \(\alpha ,\beta >0\) and \(5\alpha -\beta >0\). Therefore this gadget can only produce positively-weighted interactions, but this restriction is allowed in Theorem 3. \(\quad \square \)
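The arithmetic behind this choice of \(\lambda _1\) and \(\lambda _2\) can be spot-checked with a few lines of code. The sketch below takes the displayed second-order formula as given, adds the first-order contribution \(\lambda _1(\alpha h+\beta h^2)\), and checks on a grid of angles that the coefficients of h and \(h^2\) coincide, equal the stated closed form, and are positive; it also checks the decomposition involving \(\sqrt{2}\). Function names are illustrative.

```python
import math

def cubic(a, b):
    return 5 * a**3 - 8 * a**2 * b + 13 * a * b**2 - 2 * b**3

def gadget_weights(theta):
    """Coefficients of h and h^2 for lambda_1 = alpha - beta, lambda_2 = 2 sqrt(alpha)."""
    a, b = math.cos(theta), math.sin(theta)
    lam1, lam2_sq = a - b, 4 * a
    pref = lam2_sq / (2 * a * (a - 3 * b))
    coeff_h = lam1 * a + pref * (-3 * a**3 + 6 * a**2 * b - 8 * a * b**2 + b**3)
    coeff_h2 = lam1 * b - 0.5 * pref * (5 * a**3 - 7 * a**2 * b + 9 * a * b**2 + b**3)
    return coeff_h, coeff_h2

def sqrt2_decomposition(a, b):
    """Right-hand side of the positivity decomposition of the cubic."""
    r = math.sqrt(2)
    return ((5 * a - b) * (a - r * b) ** 2
            + (10 * r - 7) * a**2 * b + (3 - 2 * r) * a * b**2)

def check_range(samples=200):
    """Check the claims on a grid of angles in (arctan 1/3, arctan 5)."""
    lo, hi = math.atan(1 / 3), math.atan(5)
    for k in range(1, samples):
        theta = lo + (hi - lo) * k / samples
        a, b = math.cos(theta), math.sin(theta)
        weight = cubic(a, b) / (3 * b - a)           # stated closed form
        ch, ch2 = gadget_weights(theta)
        ok = (math.isclose(ch, weight, rel_tol=1e-9, abs_tol=1e-9)
              and math.isclose(ch2, weight, rel_tol=1e-9, abs_tol=1e-9)
              and weight > 0
              and math.isclose(cubic(a, b), sqrt2_decomposition(a, b),
                               rel_tol=1e-9, abs_tol=1e-12))
        if not ok:
            return False
    return True
```

A grid check of this kind does not replace the algebraic argument, but it guards against transcription slips in the polynomial coefficients.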

Combining Theorem 6, Lemma 11, Lemma 26 and Lemma 27 yields our final result:

Theorem 7 (restated). Let \(h^{(\theta )} := (\cos \theta ) h + (\sin \theta ) h^2\), where \(\theta \in [0,2\pi )\) is an arbitrary parameter and h is the spin-1 Heisenberg interaction. For any \(\theta \), \(h^{(\theta )}\) is universal.