Translationally invariant universal classical Hamiltonians

Spin models are widely studied in the natural sciences, from investigating magnetic materials in condensed matter physics to studying neural networks. Previous work has demonstrated that there exist simple classical spin models that are universal, in the sense that they can replicate the physics of all other classical spin models to any desired precision. However, all classical spin models which were previously known to be universal broke translational invariance. In this paper we demonstrate that there exist translationally invariant universal models. Our main result is to construct a translationally invariant Hamiltonian described by a single parameter that is universal. This means that there exists a single Hamiltonian which can replicate all classical spin physics, including every universality class, just by tuning a single parameter in the Hamiltonian and varying the size of the system that the Hamiltonian is acting on. This demonstrates that universality is not merely a consequence of inhomogeneous couplings in the Hamiltonian, and significantly strengthens previous results.


Introduction
Hamiltonian simulation is an active research area, with a number of potential applications. A large scale classical Hamiltonian simulator already exists [23], and analogue quantum Hamiltonian simulation is one of the most promising near-term applications of quantum technologies [4]. Recent work has investigated the theoretical framework of Hamiltonian simulation, and precisely defined what it means for one classical system to simulate another [7], and for one quantum system to simulate another [5]. In both the classical and the quantum cases it was demonstrated that within very demanding definitions of what it means for one system to simulate another there exist families of Hamiltonians that are universal, in the sense that they can simulate all other classical (respectively, quantum) Hamiltonians. One drawback to the universality results derived in [7] and [5] is that all the universal models found in the two papers break translational invariance. In this paper we improve on the classical result by demonstrating that there exist translationally invariant classical models which are universal.
The work on universal Hamiltonians arose in part due to previous work on the completeness of the partition functions of a set of classical spin models [10], [17], [9], [8], [18], [22], where a model is said to be "Hamiltonian complete" if its partition function can replicate (up to a constant multiplicative factor) the partition function of any other model. The requirements for a "universal model" in [7] are more demanding: a model is universal if for any classical Hamiltonian H ′ there exists a Hamiltonian from the model H that can simulate H ′ , where "simulate" means reproducing not only the partition function but also the energy levels and spin configurations. (See Definition 2 or [7] for the mathematically rigorous definition.) The main technical result in [7] is that a spin model is a universal model if and only if it is closed, and its ground state energy problem admits a polynomial-time faithful reduction from SAT. 1 The authors then show that the 2D Ising model with fields meets these criteria, hence is a universal classical Hamiltonian. They also prove universality of a number of other simple spin models, including the 3D Ising model and the Potts model. However all the models shown to be universal in [7] require the ability to tune individual interaction strengths in the Hamiltonian, raising the question of whether the universality is a consequence of breaking translational invariance in this way.
In [14] it was shown that the translationally invariant tiling problem is NEXP-complete, demonstrating that translational invariance is not a barrier to complexity. In this paper we go further, and show that translational invariance is not a barrier to universality. Our main result is that there exists a family of translationally invariant Hamiltonians, parameterised by a single parameter∆, which is universal. This translationally invariant universal model is definted on a 2D square lattice of spins with nearestneighbour interactions. The Hamiltonian can be written in the form: where H 1 and H 2 are fixed, translationally invariant, nearest-neighbour Hamiltonians. By varying the size of the lattice that the Hamiltonian is acting on, and tuning the∆ parameter in the construction, this family of Hamiltonians can replicate all classical spin physics in the strong sense of [7]. Our universality construction takes inspiration from the NEXP-complete tiling problem constructed by Gottesman and Irani in [14]. 2 The tiling problem in [14] is formally defined as follows: Definition 1 (Tiling, Definition 2.1 from [14]). Problem parameters: A set of tile T = {t 1 , ..., t m }. A set of horizontal constraints H ⊆ T × T such that if t i is placed to the left of t j , then it must be the case that (t i , t j ) ∈ H. A set of vertical constraints V ⊆ T × T such that if t i is placed below t j , then it must be the case that (t i , t j ) ∈ V . A designated tile t 1 that must be placed in the four corners of the grid. Problem Input: Integer N, specified in binary. Output: Determine whether there is a valid tiling of an N × N grid.
In order to demonstrate that this is NEXP complete, Gottesman and Irani [14] make use of the fact that tiling is Turing complete [1,21]; it is possible to construct sets of tiles and tiling constraints that mimic the behaviour of any Turing machine. This is done by constructing a set of tiles and horizontal constraints such that each row of a valid tiling represents the tape of a Turing machine at an instant in time. The time evolution of the Turing machine is then encoded in the vertical tiling constraints, so that if the Turing machine is running top to bottom on the grid, then in a valid tiling row a can only appear above row b if the Turing machine can evolve from the state in row a to the state in row b in one time-step.
The idea in [14] is to encode instances, x, of any problem in NEXP in the binary expansion of N , the size of the grid to be tiled. The Turing machine encoded in the tiling rules first extracts the binary representation of N from the grid size, by incrementing a binary counter for a number of time-steps equal to the grid size. It then feeds this as input to the verifier computation for the NEXP problem.
Note that the reason Gottesman and Irani demonstrate NEXP-completeness as opposed to NP-completeness is that they are encoding all the information about the problem instance in the binary expansion of the size of the grid to be tiled. It takes approximately log(N ) bits of information to specify N in binary, so the size of the grid is exponential in the problem size, which is why the complexity result is shifted up to an exponential time result. This is not a weakness of the result, but a necessary consequence of only being allowed to vary one input: the size of the grid to be tiled.

Main results
In this section we give the main results and a high-level overview of the construction. See Sections 4 and 5 for full technical details of the constructions and proofs.

Universal model described by two parameters
Our main result constructs a single-parameter translationally invariant universal model. But we also prove universality of a two-parameter model, where the scaling of the parameters in the two-parameter model is better in some cases than the scaling in the one-parameter model. We will consider the simpler two-parameter result first, as the one-parameter construction builds on this.
Theorem 2.1. There exist fixed two-body interactions h 3,vert , h 3,horiz , h 4,vert , h 4,horiz such that the that the family of translationally-invariant Hamiltonians on a 2D square lattice of size N with h 3 , h 4 as nearest-neighbour interactions: is an efficient universal model, where ∆, α are parameters of the Hamiltonian, and the sums are over adjacent sites along the rows and columns, respectively.
(The precise meaning of efficiency in the translationally invariant case is given in Section 5.2.) The universality construction relies on the fact that every tiling problem can easily be re-expresed in terms of a classical, translationally invariant, nearest-neighbour spin Hamiltonian on a two-dimensional square lattice (a proof is given in Section 5.1). H 3 is derived from a tiling problem that takes inspiration from the NEXP-complete tiling problem constructed in [14]. But here, instead of encoding a NEXP problem instance, the binary expansion of the lattice size N encodes the target Hamiltonian H ′ to be simulated.
The Turing machine we encode in the tiling rules carries out its computation in two stages. The first stage runs a binary counter Turing machine for a number of steps equal to the lattice size N , exactly as in [14]. This leaves N written in binary in the bottom row of tiles. The description of H ′ encoded in N takes up approximately log(N ) of the available tiles on the bottom row, and we leave the remaining tiles unconstrained -a fixed subset of these act as the 'physical' spins in our universal simulator, i.e. the ones that reproduce the spin configurations of the system being simulated.
The resulting N then acts as the input to a deterministic Turing machine that reads in the description of H ′ and the state of the physical spins, and computes the energy contribution (in multiples of α) of each term in H ′ with this particular configuration of physical spins. Thus α controls the precision to which the energies are computed. The tiling rules then enforce that if the energy of a particular spin configuration is M α, then at the top row of the tiling there are M tiles forced to be in a special "flag" state which picks up an energy of 1 from the H 4 Hamiltonian.
From Eq. (2), we have that for any configuration that obeys all the tiling rules in the H 3 construction, the energy M α matches (to within error at most α) that of the corresponding spin configuration of the target Hamiltonian H ′ . Whereas for any invalid tilings of the H 3 construction, the energy of the system is greater than ∆, where we choose ∆ to be much greater than the maximum energy of H ′ . We also have that each spin in the target Hamiltonian corresponds to a fixed subset of the physical spins in our Hamiltonian. And we show in Section 5.2 that with this construction the partition function of the system is reproduced. This covers all the criteria necessary for a simulation of an arbitrary Hamiltonian H ′ , and hence for a universal model.

Universal model described by one parameter
We can now turn to the one parameter universal model. Theorem 2.2. There exist fixed two-body interactions h 1,vert , h 1,horiz , h 2,vert , h 2,horiz such that the that the family of translationally-invariant Hamiltonians on a 2D square lattice of size N with h 1 , h 2 as nearest-neighbour interactions: is an efficient universal model, where∆ is a parameter of the Hamiltonian, and the sums are over adjacent sites along the rows and columns, respectively.
The one-parameter construction works in much the same as the twoparameter construction. The H 1 term is again a Hamiltonian derived from a tiling problem, and the H 2 term is used to produce the required energy. By using a simple lemma about irrational numbers we can remove the need for the α parameter in the construction. The price we pay is that the H 2 Hamiltonian now contains negative energy terms. These result in a larger scaling of the∆ parameter in the Hamiltonian (as compared with the ∆ in the two-parameter construction), to deal with the fact that invalid tilings may pick up negative energy bonuses. Full details are given in Section 5.3.
Our final result concerns the impossibility of constructing zero-parameter universal models, where the only thing that can vary is the number of spins on which the Hamiltonian acts: Theorem 2.3. It is not possible to construct a translationally invariant universal model whose only parameter is the number of spins on which the Hamiltonian acts.
Hence our one-parameter model is optimal in terms of the number of parameters required for a universal Hamiltonian.
The existence of translationally invariant universal models implies that every hardness result known about general classical Hamiltonians can now be extended to translationally invariant classical Hamiltonians, where the results are shifted up in time-complexity by an exponential factor due to the way problem instances are encoded. (Details of this are given in Section 5.2.)

Classical spin Hamiltonians
Discrete spins (also known as Ising spins) are variables which can take on values in some set of states S. Given a set of spins {σ i } for i ∈ {1, ..., n}, a spin configuration asigns a state from S to each spin {σ i }. A classical spin Hamiltonian, H, is a function, H : S ×n → R which specifies the energy H(σ) of each spin configuration σ = σ 1 σ 2 . . . σ n ∈ S ×n . We can also consider continuous spins, represented as unit vectors σ i ∈ S D , where S D is the Ddimensional unit sphere.
We refer to families of related spin Hamiltonians as "spin models". In this work, the spin models we consider will be translationally invariant Hamiltonians on a 2D square lattice with nearest-neighbour interactions, with some small number (1 or 2) of global parameters. Different Hamiltonians from the same model therefore differ only in the size of the lattice and the values of the parameters.

k-local Hamiltonians
Throughout our discussion of universal models we will assume that the Hamiltonians the models are simulating are k-local. While at first this may appear restrictive, we place no restriction on how large k is allowed to be, so a global Hamiltonian on n spins is just a special case of a k-local Hamiltonian for which n = k. Phrasing everything in terms of k-local Hamiltonians allows us to derive efficiency results at the same time as universality results.
The choice of defining efficiency in terms of k-local Hamiltonians is convenient, and covers the most important case. But it is more restrictive than is really required. When we talk about simulating a Hamiltonian efficiently, the most general requirement is that the number of parameters specifying the simulator Hamiltonian scales at most polynomially in terms of the number of parameters required to describe in the original system. For global Hamiltonians with no structure this requirement becomes a triviality, as the number of bits of information needed to describe a global Hamiltonian is anyway exponential in the number of spins. However, there do exist non-local Hamiltonians with efficient descriptions, the obvious example being sparse Hamiltonians. In the interests of simplicity of exposition, we will only give efficient explicit constructions for k-local Hamiltonians. But our constructions can easily be generalised to give efficient simulations of all Hamiltonians for which the energy of a given spin configuration can be computed efficiently, which includes sparse Hamiltonians.

Simulation and universality
A rigorous definition of what it means for one classical Hamiltonian to simulate another was formulated in [7] ( [5] extends this defintion to the quantum case): Definition 2 (Hamiltonian simulation (definition from main text of [7] 3 )). We say that a spin model with spin degrees of freedom σ = σ 1 , σ 2 , . . . can simulate H ′ if it satisfies all three of the following: 1. For any ∆ > max σ ′ H ′ (σ ′ ) and any 0 < δ < 1, there exists a Hamiltonian H in the model whose low-lying energy levels E σ = H(σ) < ∆ approximate the energy levels E ′ σ ′ = H ′ (σ ′ ) of H ′ to within additive error δ.

For every spin σ ′
i in H ′ , there exists a fixed subset P i of the spins of H (independent of ∆) such that states of σ ′ i are uniquely identified with configurations of σ P i , such that |E ′ σ ′ − E σ | ≤ δ for any energy level E σ < ∆. We refer to the spins P = ∪P i in the simulation that correspond to the spins of the target model as the "physical spins".

The partition function
to within arbitrarily small relative error: for some known constant µ.
In [7] an efficient universal model was defined as follows: This definition will be too restrictive for the translationally invariant case, but is included here for completeness. A generalisation of this definition which applies to the translationally invariant case will be introduced in Section 5.2.

Computing energy levels of k-local Hamiltonians
All of our constructions make use of a Turing machine, which reads in a description of a k-local Hamiltonian in binary, and outputs the energy of each k-local term in the Hamiltonian for a given spin configuration. We do not explicitly construct the transition rules for such a machine, appealing to the fact that calculating the energy levels of a k-local Hamiltonian term to any constant precision is evidently an efficiently computable function, and thus it is possible to construct a Turing machine which does this.

Complexity classes
In complexity theory problems are classified into complexity classes, defined by the amounts of certain computational resources needed to solve the problem. There are a number of complexity classes which will be relevant for this work.  Definition 7 (NEXP: Non-deterministic exponential time). The class of decision problems decidable in exponential time by a non-deterministic Turing machine. Equivalently, the class of decision problems for which if the answer is YES then there is a proof, exponential in the length of the input, that can be verified in EXP.
EXP and NEXP are the exponential time analogues of P and NP respectively. If A reduces to B then we can solve A by transforming it into B and solving B. We use the notation A ≤ B to denote that A reduces to B.

Definition 9.
A problem A is hard for a complexity class C if every problem in C can be reduced to A.

Definition 10.
A problem A is complete for a complexity class C if A is hard for C and A is in C.
The complete problems for any particular complexity class can be considered the hardest problems in that complexity class.
A decision problem which will be useful in this work is the Ground State Energy problem (GSE): Definition 11 (Ground State Energy, Definition 8 from [7]). The ground state energy problem of a model M = {H α }, asks: given H α ∈ M and c ∈ Q, is there a configuration of spins σ such that H α (σ) ≤ c.

NEXP-complete tiling construction
In this section we review the NEXP-complete tiling construction originally published in [14], on which our universality construction is based. In Section 4.1 we outline the general idea behind encoding a Turing machine in tiling rules, before going on in Section 4.2 to discuss the specific tiling rules used in [14] for showing NEXP-completeness when the boundary conditions of the tiling problem specify the tile to be placed in each corner. It is clear that translational invariance is broken at the corners in this construction, but this construction provides the basis for versions of the problem with open and periodic boundary conditions [14], which we cover in Section 4.3 and Section 4.4 respectively. We include the tiling rule definitions, and some intuition for why they work, but refer the reader to [14] for proofs. The notation and tile markings in this section are taken directly from [14]. Finally in Section 4.5 we discuss a modification of the NEXP-complete tiling construction which will be useful in our universality proof.

Encoding a Turing machine in tiling
Tiling is known to be Turing complete, which means that any Turing machine can be represented by a set of tiling rules [1,21]. The basic idea is that it is possible to construct tiling rules such that any row in a valid tiling represents the configuration of the Turing machine tape, internal state, and head position at a particular point in time, and the sequence of rows along the vertical direction represents the sequence of configurations in the time evolution of the Turing machine.
In order to make this explicit, we first review the definition of a Turing machine. We can define a Turing machine as a triple: M = Q, Σ, δ , where Q denotes a non-empty set of states, Σ is the Turing machine alphabet, and δ : Q × Σ → Q × Σ × {L, R} is the transition function (L/R denotes that the Turing machine head has moved to the left/right). The Turing machine will have a designated blank symbol # ∈ Σ, starting state q 0 ∈ Q and final state q F ∈ Q [15].
To encode this definition of a Turing machine into a tiling problem, [14] uses three different varieties of tile. The first variety is denoted [a] where a ∈ Σ is a Turing machine tape symbol. Tiles of this variety denote the symbol on the Turing machine tape away from the position of the Turing machine head. The second tile variety is denoted by a triple Σ × Q × {r, l}. These tiles denote the symbol on the Turing machine tape and the state of the head when the head has moved to a location, but not yet acted. The {r, l} signify which direction the head came from in its last move. The third tile variety is also denoted by a triple Σ × Q × {R, L}. Tiles of this variety denote the state of the head and the tape symbol after the head has acted, and the {R, L} denote which direction the head moved [14].
To encode a Turing machine in these tiles, one defines tiling rules that force each pair of adjacent rows to represent a valid update of head position, internal state and tape. The specific tiling rules used in [14] are given in Section 4.2. An illustration of a portion of tiles that encodes a Turing machine, reproduced from [14], is shown below. In this example the Turing machine is running from bottom to top on the tiling grid, and the Turing machine is undergoing the evolution (a, q) → (b, q ′ , L) in the first move, and (c, q ′ ) → (f, q ′′ , R) in the second move.
The [d, q, L] in the bottom right corner is from the previous step of the Turing machine evolution. It indicates a Turing machine head which is now in state q and has moved to the left. This is consistent with the [a, q, r] in the bottom middle row which signifies that the state of the Turing machine is q, and that the Turing machine tape came from the right. The [b, q ′ , L] shows that the Turing machine is going to execute the step (a, q) → (b, q ′ , L), and we can see in the adjacent tile [c, q ′ , r] that indeed the Turing machine head has moved to the left, and its state is now q ′ . Finally in the bottom row [f, q ′′ , R] indicates that the Turing machine is going to execute the step (c, q ′ ) → (f, q ′′ , R), and the [b, q ′′ , l] tile to the right indicates that indeed the Turing machine tape has moved one step to the right, and is now in state q ′′ .

NEXP-complete tiling rules with corner tiles fixed
We first review the boundary conditions and the resulting tile pattern on the border of the grid in [14].
Boundary conditions: The boundary conditions for the NEXP-complete tiling construction in [14] require a to be placed in each of the four corners of the grid. There are then four other tiles which we refer to as boundary tiles , , , . The tiling rules for the boundary tiles are reproduced in Table 1.
These tiling rules enforce that the only place a tile can go is on the left boundary, the only place a tile can go is the right boundary, the only place a tile can go is on the bottom boundary, and the only place a tile can go is the top boundary. The tiling rules also enforce that the only tiles can go adjacent to a tile are boundary tiles. If we consider the top boundary only and tiles can go to the left or right of a tile, so the entire top boundary will have to be or tiles, but no tile can go below a tile, so tiles cannot go on the top boundary. This leaves the entire top boundary, except the corner tiles, as tiles. Similar logic applies for the other boundaries, and we find that these tiles form a border for a grid, Table 1: (Table 2 from [14]) Tiling rules for the boundary tiles. Here a * denotes any interior tile, 'N' indicates a disallowed pair of neighbouring tiles, and 'Y' denotes an allowed pair of neighbouring tiles. In cases where the rule for the interior tile is not specified this is because the rule depends on the type of interior tile. which can be seen in Fig. 1. These boundary tiles allow us to implement special conditions along the borders of the tiling grid, while only breaking translational invariance at the four corners. We will see in Section 4.3 and Section 4.4 how the construction in [14] is modified to make it fully translationally invariant.
Turing machine tiling rules: The tiling rules required to simulate a Turing machine running from bottom to top on a grid using a tiling grid are given in Table 2, with the tile types discussed in Section 4.1. The rules enforce that each row in a valid tiling represents the state of the Turing machine tape at a moment in time, and that the vertical evolution in a valid tiling represents the time evolution of the Turing machine tape. (See [14] for full details.) It is convenient to think about the interior of the grid as having two "layers". The first layer represents the time-evolution of a binary counter Turing machine running from top to bottom. The second represents that of a non-deterministic Turing machine running from bottom to top. In both cases the main tiling rules are be of the form specified in Table 2.
It is important to note that there are not actually two layers of tiles in the construction -we are tiling one grid, using one set of tiles. But, apart from the boundary tiles, each tile in the construction is specified by a pair, denoting its "layer 1" type and its "layer 2" type: T = T 1 × T 2 , and the tiling constraints are given by Thinking about these tile markings as representing "two layers" of tiles in the interior is convenient, because most of the tiling rules will constrain one of the layers independently of the other.
First layer of tiling: For the first layer of the tiling, we will need to implement some additional rules in order to ensure that the binary counter Turing machine begins with the [#, q 0 , l] tile in the top left interior tile, followed by [#] tiles. This is enforced by additional tiling rules, given in Table 3.
Combining these tiling rules with those for running a Turing machine gives the complete set of tiling rules for the first layer, where the particular Turing machine we are implementing on the first layer is a binary counter Turing machine, M BC . It should also be noted that the binary counter  (Table 3 from [14]) Tiling rules for simulating a Turing machine using a tiling, given our boundary conditions. Here 'N' indicates a disallowed pair of neighbouring tiles, and 'Y' denotes an allowed pair of neighbouring tiles. 'If TM rule' means that the tiles can only be placed in that configuration if there is a Turing machine rule allowing that move. The 'Y*' entry will be modified later to get the correct starting configuration. Table 3: (Table 4 from [14]) Additional tiling rules for the first layer of tiling. Turing machine on the first layer "runs" from top to bottom, so the tiling rules for implementing a Turing machine from the previous section will be modified so that the Turing machine "runs" in the opposite direction, from bottom to top. Second layer of tiling: For the second layer we would like to copy the output from the binary counter Turing machine to the bottom layer of the second layer, and we would like to enforce that the head of the nondeterministic Turing machine goes at the left-most point of the grid on the bottom row. We also require that in a valid tiling the Turing machine must be in its accepting state, q F , at the top row of the grid. This is achieved with the additional tiling rules given in Table 4.
In order to force the head of the Turing machine to start at the leftmost point of the grid, the tiling rules enforce that no symbol from the binary counter Turing machine alphabet can ever go to the right of a tile. This means that after the Turing machine head has moved on from the leftmost point we will need to overwrite the symbol in the leftmost tile with a new symbol which does not appear in Σ M BC . This is accomplished by introducing an a ′ symbol in the alphabet of the non-deterministic Turing machine for every a ∈ Σ M BC . Once the Turing machine head has moved on from the leftmost point, it overwrites the symbol on the leftmost tile to its primed version, and this is treated as an ordinary symbol for the remainder of the computation.
With this construction Gottesman and Irani demonstrate that there is a valid tiling of the grid if and only if the non-deterministic Turing machine accepts on input x in N steps, so every problem in NEXP can be reduced to tiling, and therefore tiling is NEXP-complete.

NEXP-complete weighted tiling with open boundary conditions
To have a NEXP-complete version of the tiling problem with open boundary conditions, one must consider a variant of the tiling problem where the constraints are weighted.

Definition 12 (Weighted tiling (definition 4.3 from [14]).
Problem parameters: to the total cost of the tiling. A polynomial p. Boundary conditions (a tile to be placed at all four corners, open boundary conditions, or periodic boundary conditions). Problem input: Integer N , specified in binary. Output: Determine whether there is a tiling of an N × N grid such that the total cost is at most p(N ).
In [14] Gottesman and Irani construct a NEXP-complete weighted tiling problem with open boundary conditions, where p(N ) is a constant, c = −4. This is done using a three layer tiling construction. 5 The tile types for layers 1 and 2 are unchanged from Section 4.2, and there are five tile types which can be used in the third layer: , , , and . The tiling weights for the third layer are given in Table 5. These tiling rules ensure that the minimum cost of tiling the third layer is -4. The optimal configuration is shown in Fig. 2.
If we insist that any , , or tile in the third layer must correspond to a in the main layer then the optimal overall tiling has layers 1 and 2 constrained in precisely the way required for the NEXP-complete construction outlined in Section 4.2. If we assign all forbidden pairs of neighbouring tiles from the tiling rules in Section 4.2 a weight of +1, then the Tile on right

NEXP-complete weighted tiling with periodic boundary conditions
In [14] Gottesman and Irani also construct a NEXP-complete Weighted Tiling with periodic boundary conditions, where p(N ) is again a constant, in this case c = +2. As for open boundary conditions, the periodic boundary condition construction adds a third layer to the two-layer construction from Section 4.2. The tile types which can be used in the third layer are , , , and . The tiling rules for the third layer are given in Table 6. The minimum cost of the third layer is +2 and one possible optimal configuration is shown in Fig. 3. 6 The horizontal and vertical lines in Fig. 3 delineate the boundary of an (N − 1) × (N − 1) grid. Weighted tiling constraints can be implemented to ensure that in an optimal tiling of the overall grid, layers 1 and 2 are constrained to have at the border of this grid. Therefore, if we assign all forbidden pairs of neighbouring tiles from the tiling rules in Section 4.2 a weight of +1, the tiling problem from Section 4.2 reduces to Weighted Tiling with periodic boundary conditions.
It should be noted that this construction requires that N be odd. This restriction is unimportant both in the NEXP-completeness result and in Tile on right  our universality constructions, as we are free to choose an N satisfying this constraint. (The restriction to odd N can in fact be lifted using aperiodic tilings, see [6].)

Using tiling to encode a deterministic Turing machine
In [14] the tiling rules for the bottom boundary copy the entire bottom row of the first layer to the bottom row of the second layer, and used this as the input to a non-deterministic Turing machine. In their construction it was therefore only the problem instance x which was given as input to the Turing machine.
For our purposes it is more useful to rephrase the construction in terms of a deterministic Turing machine, which takes some additional input w as a witness. In order to do so we need to modify the rules for the bottom Boundary tile Adjacent interior tile location [a] If layer 1 is # or if a matches layer 1 Table 7: Modified tiling rules to simulate a deterministic Turing machine. boundary in the NEXP-complete tiling construction, so that the part of the boundary that is not specifying the problem instance x is allowed to be any Turing machine alphabet tile, rather than being constrained to be the blank symbol as is the case in [14]. These unconstrained tiles then play the role of the witness -the verifier Turing machine will accept if these tiles form a valid witness, and reject otherwise. To achieve this we need to update the tiling rule between the bottom tile and the [a] tile for layer 2. The updated tiling rules are summarised in Table 7.
We now have that at the bottom boundary the problem instance x is copied to the bottom row of the second layer, and the rest of the tiles in the bottom row of the second layer are allowed to be any of the Turing machine alphabet tiles. The rest of the tiling rules are unchanged from the NEXPcomplete tiling construction, except that the Turing machine rules which are encoded in the tiling rules will be rules for a deterministic Turing machine, rather than for a non-deterministic Turing machine. This modification to the construction does not change the complexity of the problem. Whilst the alteration to a deterministic Turing machine does mean that each row is uniquely determined by the previous row (except in the case of the bottom boundary), the bottom interior row is not given as input, and the tiling problem remains NEXP-complete.
It should be noted that if the witness takes up the entire remaining length of the bottom row then the Turing machine will not have space to carry out any computation. It will take N − 3 steps just to read in the input, and after N − 3 steps the entire grid is tiled. However, if we want the tiling rules to remain translationally invariant we cannot restrict the length of the unconstrained "witness" tiles via tiling rules. This will not, however, be an issue in our construction. We will choose the encoding of the problem instance x to include a specification the length of the witness, say |w|. The Turing machine will then only read in |w| of the unconstrained tiles as part of its input, and will then begin running the witness verification computation without reading any further input. The remainder of the unconstrained tiles on the bottom row do not form part of the witness, and their states are essentially arbitrary.

Tiling constraint Hamiltonians
The tiling problem is translationally invariant as the tiling rules are the same everywhere on the grid. The Gottesman and Irani construction therefore suggests that translational invariance may not be a barrier to universality. Indeed, it's well known that the existence of a translationally invariant classical Hamiltonian whose GSE problem (Definition 11) is NEXP-complete follows from [14], an argument we encapsulate in the following lemma for later convenience: Lemma 5.1. Every weighted tiling problem can be represented by a classical, translationally invariant, nearest-neighbour spin Hamiltonian on a 2D square lattice, in such a way that tiling configurations correspond to spin configurations, and the energy of any spin configuration is identical to the weight of the corresponding tiling configuration.
Proof. Consider a tiling problem described by a set of tiles T = {t} t=d t=1 , a set of horizontal weights w H : T × T → Z, and a set of vertical weights Consider a graph G = (V, E) which is restricted to be a N × N square lattice. Let there be a d-dimensional Ising spins, σ i ∈ {t} t=d t=1 , associated with each vertex i ∈ V . Construct the Hamiltonian H T : where E H , E V denote the sets of horizontal and vertical edges respectively. We can see that there is a one-to-one mapping between the possible spin states of the Ising spins σ i and the tiles in the tiling problem, and that H T assigns an energy penalty to pairs of adjacent spins which is equal to the weight given to the corresponding pairs of neighbouring tiles. The Hamiltonian H T is translationally invariant because its local interaction terms are the tiling rules, which are themselves translationally invariant. Throughout our proofs we will refer to Hamiltonians derived in this way as 'tiling constraint Hamiltonians'. The following rephrases the result from [14] in terms of classical Hamiltonians: Recalling from [7] that NP-hardness 8 of the GSE problem is a necessary (although not sufficient) condition for universality, Corollary 5.1.1 suggests that there may exist a family of translationally invariant universal classical Hamiltonians based on the construction in [14]. On the other hand, hardness is only a necessary condition for universality, not a sufficient one, and previous universality results relied heavily on breaking translational-invariance. Our main contribution is to prove that this complexity-theoretic intuition does extend to the translationally invariant case by explicitly constructing a universal model based on the construction from [14].

Two parameter model
We can now prove our first result, that there exists a family of translationally invariant classical Hamiltonians with two parameters which can simulate all other classical Hamiltonians. We will prove the result for open boundary conditions, but the construction works equally well for periodic boundary conditions. 9 We begin by defining the tiling model: Definition 13 (Tiling model). A 'tiling model' is a family of Hamiltonians specified by a graph G = (V, E) which is restricted to be a 2D square lattice; a tiling constraint Hamiltonian, H T ; an energy offset c; an energy penalty, ∆ ∈ R; and a "flag" energy α ∈ R.
A discrete, classical spin σ i which can take values in some finite set S is associated with each vertex i ∈ V , and Hamiltonians in the model are given by: whereγ ∈ S and fγ : S → Z is the function fγ(γ) = +1, and ∀σ i = γ fγ(σ i ) = 0. Proof. Proof overview: In order to prove the lemma, we will explicitly construct H univ . Our construction takes inspiration from the NEXP-complete weighted tiling problem with periodic boundary conditions constructed in [14]. As in the Gottesman and Irani construction, the problem instance (in 8 Recall from the previous section that tiling is NEXP-complete rather than NPcomplete as all the information about the problem instance is encoded in the binary expansion of the size of the grid to be tiled. 9 The proof for open boundary conditions requires we set the parameter c = −4, and use the third layer tiling rules from Section 4.3. For periodic boundary conditions, one should instead use c = +2 and the third layer tiling rules from Section 4.4. this case the Hamiltonian to be simulated, H ′ ) is encoded in the binary expansion of N , the size of the grid to be tiled. The tiling is carried out in three layers: 10 the first layer implements a binary counter Turing machine, so that the bottom row of the first layer contains N in binary -a description of H ′ .
This description of H ′ is copied to the bottom layer of the second row (taking up approximately log(N ) tiles), and the rest of the tiles in the bottom row of the second layer are left unconstrained. The first n of the unconstrained tiles act as the 'physical' spins (see Definition 2). The tiling rules for the second layer encode a Turing machine, which reads in the description of H ′ and the configuration of the physical spins, and computes the energy that should be assigned to this configuration in multiples of α. The tiling rules then force the correct number of spins to be in theγ state, so that the system has the desired energy. The third layer is used to implement boundary conditions on the first two layers.
Any non-optimal tiling will incur an energy penalty of ∆, so provided ∆ > E max (H ′ ) any spin configuration which corresponds to an non-optimal tiling will not have energy below the energy cut-off. As required by Definition 2, this energy cut-off can be made arbitrarily large by increasing the ∆ parameter in the construction. (Similarly to the non-translationallyinvariant models constructed in [7].) Encoding H ′ : The Hamiltonian to be simulated, H ′ , is encoded in the binary expansion of N , the size of the grid to be tiled. Let H ′ = m I=1 h I be a k-local Hamiltonian containing m terms, acting on n two-dimensional spins. H ′ is then fully described by n and the energy levels associated with each of the 2 k possible spin configurations for each of the m k-local terms. Let E = ∪ m I=1 spec(h I ) (where spec(h I ) denotes the spectrum of h I ), and let α be the greatest common measure of the set E. The greatest common measure (often called greatest common divisor in the special case of integers) is well defined for all sets of rational numbers. For irrational numbers there exist incommensurate pairs of numbers for which no common measure exists. However, since we only need to simulate energies up to some finite precision δ, we can assume without loss of generality that E is a set of rational numbers. A complete description of H ′ then consists of n and, for each of the m k-local terms, 2 k parameters giving the energy contribution (in integer multiples of α) of that local term for each of the 2 k possible configurations of the k spins it acts on.
It is worth considering explicitly how the Hamiltonian to be simulated, H ′ , will be encoded in binary. The first thing to note is that since we are in binary we have only two symbols available to us. As the length of the description depends on the Hamiltonian, we need to use a self-delimiting code. (An alternative would be to encode in ternary, and reserve one symbol exclusively as a delimeter. But using a self-delimiting code and sticking to binary makes the tiling construction slightly simpler.) We will use a simple self-delimiting code known as the Elias-γ code [11]. The length of an integer z encoded in the Elias-γ code scales as O(log(z)), so there is only a constant overhead compared with the binary representation of z. If we denote the binary representation of z by B(z), and its length by |B(z)|, then the Elias-γ code for z is given by |B(z)| − 1 zeros that indicate how long the input will be, followed by B(z) itself. All binary numbers (with the exception of zero) begin with a one, so B(z) starts with a one, and this delimits B(z) from the |B(z)| − 1 zeros. The Elias-γ code does not code zero or negative integers. But we do not need negative integers for our construction, and we can easily include zero in the code by adding one to every number before coding, and subtracting one after decoding. We will refer to this as the Elias-γ ′ coding.
Having established how to encode individual parameters in binary, we can now consider how to encode the full Hamiltonian. Every value needed to specify the Hamiltonian will be represented in Elias-γ ′ coding. For the purpose of the encoding we label the n spins in the original system by integers i = 1 : n, corresponding to the order in which these are represented in the physical spins that act as input to the Turing machine. The encoding of the Hamiltonian begins with n, followed by k. Each of the m k-local terms in H ′ is specified by giving the label of each spin involved in that the interaction (a total of k integers), followed by the energy values for each of the 2 k possible spin configurations of the k spins acted on by that local term, ordered by the index of the configuration considered as a binary number. These energy values are specified by the closest integer multiple of α to the desired value. (l-local terms for l < k can be specified by arbitrarily picking k − l spins to pad the number of spins to k, but specifying identical energy values for configurations differing only on those extra l − k spins).
After the last of the k-local terms has been described in this way, there will be a single 1, delimiting the end of the description of H ′ . This will be unambiguous, as it is the only time that an integer in Elias-γ ′ coding will be followed by a one, rather than by a string of zeros. 11 If we let γ ′ (z) denote the integer z represented in Elias-γ ′ code then putting all of this together, we can represent the Hamiltonian as follows: where · denotes concatenation of bit strings, the i denotes the index of the 11 The 1 at the end also ensures that N is an odd number. This is not important for the open boundary conditions, but would be necessary if we were using periodic boundary conditions (see Section 4.4). (Note that for using periodic boundary conditions, the size of the grid to be tiled is actually N + 4.)

Boundary tile
Adjacent interior tile location [a] If layer 1 is # and a = 0, 1 OR if layer 1 is not # and a matches layer 1 spins in each k-local term, and the λ j indicate the energy levels in terms of multiples of α of each k-local term in the Hamiltonian.
This encoding can unambiguously specify any k-local Hamiltonian in (the binary representation of) an integer N . It will take O(log(n)) bits to specify n in Elias-γ ′ code, O(log(k)) bits to specify k in Elias-γ ′ code, and O(m2 k ) bits to specify the energy levels of each k-local term. The total number of bits required therefore scales as O(m2 k ) = O(n k ), where we have used the fact that without loss of generality, m ≤ n k = O(n k ) for k-local Hamiltonians. The length of N in binary is approximately log(N ), so we have that log(N ) = O(n k ), and hence N = O(2 n k ). N is clearly efficiently computable from any other reasonable description of H ′ .
Tiling rules: 12 The tiling rules for the third layer of tiles and the border of layers one and two are unchanged from the construction in [14], outlined in Section 4.3 and Section 4.2 respectively. The first layer of the tiling will implement a binary counter Turing machine from top to bottom, as before. So that the bottom row of the first layer in any optimal tiling contains N in binary, with the remaining tiles blank. For simplicity we will assume that the tape alphabet for the binary counter Turing machine is given by Σ M BC = {0, 1, #}, where # denotes the blank symbol. The tiling rules needed to implement a binary counter Turing machine on the first layer of the tiling are unchanged from [14].
Bottom boundary: The tiling rules for the second layer will encode a Turing machine which reads in the description of the Hamiltonian to be simulated in binary, and the state of the physical spins, and calculates the energy of the system being simulated. For concreteness we will assume that the tiling rules encode a Turing machine with seven alphabet symbols: Σ M U = {0, 1, 0 ′ , 1 ′ , γ, #}. The # denotes a blank symbol, the 0,0 ′ and 1,1 ′ are used during the computation, and the γ is only used in the output of a computation. The input to the Turing machine represents the state of the physical spins. As the spins being simulated are two-level we need to update the tiling rules so that only 0 and 1 symbols can appear as input. The updated rules are provided in Table 8.
These rules are combined with the rules from [14], which state that in layer 2 an [a] tile can only go next to the left boundary if a / ∈ Σ M BC . This ensures that the left-most position contains the head of the Turing machine, and that the output of the first layer (i.e. N in binary) is copied to the first approximately log(N ) tiles, while the remaining tiles are constrained to be the tiles representing the Turing alphabet states 0 or 1. After the Turing machine head has moved on from the left-most position we need to have the left-most tile in an alphabet state which is not in Σ M BC , so the alphabet symbol from the left-most tile is updated to its primed version, which obeys the tiling rule, and is treated as a 0, 1 for the rest of the computation. Computation: With these tiling rules for the bottom boundary, the only optimal tilings of the bottom row will have the Turing machine head in the left-most position, with N in binary copied to the first approximately log(N ) tiles, and the remaining tiles constrained to be alphabet tiles containing either the symbols 0 or 1.
The Turing machine reads in the program, which contains a description of the Hamiltonian to be simulated, including the number of spins the Hamiltonian is acting on, n. It will then read in a further n input bits, which determine the state of the n 'physical' spins in the simulation. The Turing machine will then calculate the energy (in terms of multiples of α) arising from each of the k-body terms in the Hamiltonian given the state of the physical spins, and will output 0 if a k-body term contributes no energy with the given spin configuration, or a copies of the symbol γ if a k-body term contributes energy aα with the given spin configuration. After each of the m k-local terms has been calculated, the final state of the Turing machine tape will be a string of 0s and γs containing M γs (where the total energy of the particular spin configuration is M α), with a number of 0 and 1s remaining on parts of the tape that have not been used for output.
We have seen that N = O(2 n k ). It will take the Turing machine O(log(N )+ n) = O(n k ) steps to read in the input. The remainder of the calculation simply involves determining which of the 2 k possible configurations the relevant k spins are in, for each of the k-local terms, and outputting the corresponding energy. For each k-local term this can clearly be done in time O(poly(2 k )), giving a time for the total computation of O(poly(n k )). The system has N = O(2 n k ) rows of computation available to it, so will finish the computation in the space available in the tiling grid.
At the point the Turing machine has finished its computation, and determined the energy contribution from each k-body term in the Hamiltonian, the Turing machine tape will contain a string of 0s and γs, followed by some 0s and 1s left over from the initial input, and from the part of the tape that didn't form part of the initial input.
Note that there will always be enough room to write all the γ's in a single N -length row of tiles. The description of H, which is the binary expansion of N , lists the energy levels of each of the k-local terms in terms of multiples of α, written in binary, which means that N always grows faster than the  required number of γ's. Top boundary: The tiling rules introduced so far ensure that once the computation finishes the row which encodes the final line in the computation will just be copied until the grid is full. However, this means that there will be more than M of the γ symbols in the full grid, so if we give energy to the γ symbols we will not end up with the correct energy. To circumvent this issue, we introduce a new type of tile, which we will denote γ , and we will introduce tiling rules which ensure that the γ tiles only appear in the top interior row of the tiling, so we can give energy to just these tiles without introducing terms in the Hamiltonian which break translational invariance.
The additional tiling rules associated with the γ tile are given in Table 9. In an optimal tiling the γ tile can only appear in the top interior row as the only tile that can appear directly above a γ tile is a top boundary tile. In order to ensure that every γ symbol that is output in the computation is copied to a γ tile in the top interior row we need to update the rules for the top boundary. The updated rules are provided in Table 10.
These rules ensure that in every optimal tiling there will be exactly M of the γ tiles in the top interior row, and none appearing elsewhere in the tiling. If we associate the γ tile with theγ state of the spins, then the energy of this system is precisely M α, the energy of the physical spins being simulated.
Partition function: We have now established that for all optimal tilings, the energy of the simulator system will be equal to the energy of the target system when the spin configuration of the target system is equal to the spin configuration of the physical spins, and for all non-optimal tilings the energy of the simulator system will be greater than ∆, so H meets the first two requirements to be considered a simulation of H ′ . The final requirement to consider is that H reproduces the partition function of H ′ to within arbitrarily small relative error. For this, it is not enough that H has the same energy levels as H ′ . It must also introduce the same additional degeneracy to each of the energy levels of the original Hamiltonian (a proof of this can be found below). In our construction, the spins that encode H ′ are uniquely determined for all optimal tilings, so introduce no additional degeneracy to any energy levels below ∆.
As we are encoding a deterministic Turing machine, the tiles in each row in an optimal tiling are uniquely determined by the tiles in the preceeding row, so likewise in optimal tilings the spins in the higher rows do not introduce any additional degeneracy. The only spins we need to consider are therefore the spins that correspond to the tiles in the bottom interior row that are constrained to be the alphabet tiles 0 or 1, but that are not part of the set of physical spins. The states of these tiles are entirely independent of the states of the physical tiles. As such, they will introduce the same degeneracy to all possible configurations of physical spins, and hence to all the energy levels. The partition function of H therefore reproduces the partition function of H ′ to within arbitrarily small relative error up to physically irrelevant rescaling.
This completes the proof of Lemma 5.2.

Lemma 5.3. Let H be a Hamiltonian on N d-level
Ising spins that approximates all energy levels of H ′ to within error δ, up to an energy cut-off ∆.
Then Proof. Denote the spin configurations of H and H ′ by σ and σ ′ , respectively. Assume that H simulates H ′ , and the degeneracy of each energy level of the H system is multiplied by a factor of µ compared with the corresponding energy level of the H ′ system. Then the relative error in the partition function is given by: In the second step we have used the fact that below ∆ the energy levels of H are within δ of the energy levels of H ′ but repeated µ times, in order to replace the sum over σ configurations with a sum over σ ′ configurations. The δ σ ′ can take on values in the interval [−δ, +δ]. In the penultimate step we have assumed the worst case (all δ σ ′ = −δ) to upper-bound the error. In order to see the only if direction, consider the case where H approximates all energy levels of H ′ to within δ below energy cut-off ∆, but where it does not introduce the same degeneracy to each energy level. Denote the degeneracy of spin configuration σ ′ by m σ ′ . Then the relative error in the partition function is given by: The second term is independent of δ, and the first term will only be upperbounded by e βδ − 1 if There are two cases to consider: Eq. (11) becomes: This must hold for all β > 0, and all δ ∈ (0, 1). Taking the limit βδ → 0, Eq. (12) becomes: But σ ′ e −βH(σ ′ )) µ − m σ ′ e −βδ σ ′ > 0 by assumption, so it is not possible to satisfy Eq. (13), hence Eq. (11) cannot be satisfied for sufficiently small δ.
The only way to satisfy Eq. (16) is then to have m σ ′ − µ ∈ N, which implies: Together with Eq. (15), this implies: We can assume without loss of generality that the H ′ (σ ′ ) + δ σ are all distinct. 13 The functions e −β(H ′ (σ ′ )+δσ) are then linearly independent, so the only solution to Eq. (16), and therefore also the only solution to Eq. (11) is ∀σ ′ m σ ′ = µ, as claimed. 13 If there exist σ ′ i and σ ′ j such that It should be noted that in our proof of Lemma 5.2 we have implicitly assumed that all the energy levels in the target Hamiltonian H ′ are positive. This is a valid assumption as any classical Hamiltonian on a finite number of spins can be made positive by adding a constant energy shift. We could also prove the more general result where we allow H ′ to have negative energy levels, but this requires including negative energy terms in our universal model, which introduces additional complexity to the construction, as we will see in Section 5.3 when we prove our main result: the one-parameter universal model.
It is instructive to consider the number of parameters, and number of spins, required in the construction. The number of parameters needed to specify H ′ is O(poly(m, 2 k )), but the number of parameters in the simulator Hamiltonian H is just two: α, ∆, together with the size of the lattice the Hamiltonian is acting on, N . We have already calculated that N = O(2 n k ), so the number of spins in the simulation is given by N 2 = O(2 n k ). Comparing this with the definition of universality given in the introduction, we see that the number of parameters needed to specify H is in keeping with this definition of universality, but that the number of spins needed for the simulation is not efficient according to that definition.
However, this definition of efficiency is too restrictive for the translationally invariant case. A translationally invariant Hamiltonian on N spins can be described using only poly(log(N )) bits of information, whereas a klocal Hamiltonian which breaks translational invariance in general requires poly(N ) bits of information. The number of bits required to describe a klocal Hamiltonian on n spins is O(n k ). So by a simple counting argument we can see that it is not possible to encode all the information about a klocal Hamiltonian on n spins in any translationally invariant Hamiltonian on poly(n, m, 2 k ) spins, as this would require encoding O(n k ) bits of information in O(log(n)) bits. As such, we extend the definition of efficiency for universal models in such a way that it coincides with the original definition in [7] for the models considered previously, but is also meaningful for translationally invariant models: Definition 14 (Efficient universal model). We say that a universal model is efficient if, for any Hamiltonian H ′ = m I=1 h I on n spins composed of m separate k-body terms h I , H ′ can be simulated by some Hamiltonian H from the model specified by poly(m, 2 k ) parameters which can be computed in time poly(n, m, 2 k ), described by poly(n, m, 2 k ) bits of information.
This clearly encompasses the previous definition for non-translationally invariant Hamiltonians, as for these Hamiltonians the number of bits needed to describe the Hamiltonian scales polynomially with the number of spins in the system. But it also encompasses the possibility of efficient, universal, translationally invariant models.
With this modified definition of efficiency we can state and prove our main result for the two-parameter case: Proof. In Lemma 5.2 we showed that any any k-local Hamiltonian on twolevel Ising spins can be simulated by a Hamiltonian from this tiling model. The 2D Ising model with fields meets this requirements, and was already shown to be a universal model in [7]. Since we can simulate a universal model, our tiling model is itself universal.
If H ′ = m I=1 h I is a k-local Hamiltonian acting on n spins, then the Ising model H Ising simulating it will be specified by poly(m, 2 k ) parameters and act on poly(n, m, 2 k ) spins. Thus the tiling model Hamiltonian H simulating this will be specified by two parameters together with the lattice size, and will act on O(2 poly(n,m,2 k ) ) spins, so will be described by O(poly(n, m, 2 k )) bits of information. We demonstrated in the proof of Lemma 5.2 that the size of the lattice needed to simulate H ′ can be computed in time O(n k ).
It should be noted that an alternative method to prove Theorem 5.4 would be to generalise our construction from the proof of Lemma 5.2 to deal with d-level spins, and show directly that this tiling model can simulate any Hamiltonian on discrete Ising spins. This would require modifying the mapping between the states of the physical spins on the simulator system, so that instead of one physical spin on the simulator system mapping to one spin on the original system, we would now have log 2 (d) physical spins (which we will refer to as a physical set) on the simulator system mapping to one spin on the original system. In the case where log 2 (d) is an integer we would still have a one-to-one mapping between the states of the physical spins and the states of the original spins. In the case where log 2 (d) is not an integer the number of spins in a physical set would be rounded up to the nearest integer, and some states of the physical set would not correspond to any state of the original spin. In order to handle instances where the states of the physical set do not correspond to any state of the original spin we would introduce an additional term in the Hamiltonian, i∈V ∆f ν (σ i ), where f ν : S → Z is given by f ν (ν) = 1, f ν (σ i ) = 0 ∀σ i = ν and ν is a spin state that corresponds to a new alphabet symbol, ν, in the layer 2 Turing machine. We could construct Turing machine rules in such a way that if the states of the physical set do not correspond to a valid configuration the Turing machine will output a ν symbol, picking up energy ∆ and therefore taking that configuration above the energy cut-off. This does not affect the number of parameters in the model, as we were already using the ∆ parameter in the construction.
The argument sketched above gives an alternative, direct proof that this tiling model could simulate all Hamiltonians on discrete spins. In order to show that this tiling model can simulate all Hamiltonians on continuous spins which are Lipschitz-continuous in each argument we would then follow the argument in [7], showing that they can be simulated to arbitrary accuracy using discrete spins. The encoding of H ′ in the binary expansion of N would also need to specify the value of d.

One parameter model
Having established the existence of a universal model which requires only two parameters (∆, α) it is natural to ask whether all these parameters are strictly necessary, or whether it is possible to generate a universal model requiring fewer parameters. In order to show that it is indeed possible to construct a universal model with only one parameter we require a technical lemma regarding irrational rotations on the unit circle.
We begin by introducing some notation. The unit circle will be denoted S 1 , with the points on S 1 represented by the real interval [0, 1]. A rotation of a point x ∈ S 1 about an angle θ is then given by R θ (x) = x + θ 2π mod 1. The orbit of a point x under a rotation R θ is the set of points O = {R n θ (x)|n ∈ Z}, where R n θ (x) indicates that the rotation was applied n times.
Lemma 5.5. The orbit of any point on the circle under an irrational rotation R θ (x), where θ = 2πα, α / ∈ Q is dense in S 1 .
Proof. This is a standard result with a number of proofs. We sketch one argument here. First consider two points in the orbit, R n θ (x) and R m θ (x). Using the definition of the rotation we can write: Similarly: Hence if two points in the orbit are equal, we must have: Simplifying, this gives: where a, b ∈ Z such that (x + nα) mod 1 = x+nα−a, and (x + mα) mod 1 = x + mα − b. If both a and b were equal to zero, or a = b, then α would be equal to zero, which contradicts our initial statement. Hence at least one of a or b does not equal zero, and a = b. We have, therefore, that either n = m or that α is a ratio of integers, which contradicts our assertion that α / ∈ Q. Hence R n θ (x) = R m θ (x) for n = m. We now apply the pigeonhole principle, which states that if we try to fit n items into m containers, where n > m, then at least one container will contain more than one item. Consider dividing the unit circle into Z ∈ Z disjoint intervals of length ǫ = 1 Z . We have just shown that every point on an irrational orbit is distinct, so the points R 0 θ (x), R 1 θ (x), ..., R Z θ (x) are all distinct points on S 1 . We have, therefore, Z + 1 distinct points on S 1 , but only Z disjoint intervals, so at least one of our intervals contains two points. Let R l θ (x) and R m θ (x) be two such points which fall into the same interval, where 0 < m < l < Z.
We have that the intervals are of length ǫ, so the distance |R l |n ∈ Z} is therefore ǫ-dense in S 1 , since the first (Z + 1)-points on the orbit cover S 1 in equidistant steps which are separated by less than ǫ.
We have left x and Z arbitrary, and can make ǫ arbitrarily small by increasing Z, hence the orbit of any point on the circle under an irrational rotation is dense in S 1 .
It is useful to consider how many points on the orbit we have to take to be ǫ close to every point on the unit circle.
Proof. Lemma 5.5 implies that T is dense in [0, 1]. Any point in R can be reached from [0, 1] by adding or subtracting integers. The definition of T allows you to subtract multiples of 1 so we can subtract arbitrary integers, and 1 is in the set [0, 1]. We can also add multiples of a number which is arbitrarily close to 1. Thus, for any ǫ, we can get ǫ-close to any z ∈ Z + by adding together z copies of a number which is ǫ z -close to 1.
It should be noted that using √ 2 is arbitrary -any irrational number would do. We can now demonstrate that there exists a translationally invariant model on one parameter which is universal. As in the two-parameter case we are assuming open boundary conditions, but the construction can easily be adapted to work equally well with periodic boundary conditions. We first define the model: Definition 15 (Reduced parameter tiling model). A 'reduced parameter tiling model' is a family of Hamiltonians specified by a graph G = (V, E) which is restricted to be a 2D square lattice, a tiling constraint Hamiltonian H T ; an energy offset c; and an energy penalty,∆.
A discrete, classical spin σ i which can take on values in the set S is associated with each vertex i ∈ V , and Hamiltonians in the model are given by: whereγ,η ∈ S and fγ ,ν : S → Z is the function fγ ,ν (γ) = + √ 2, fγ ,ν (ν) = −1 and ∀σ i =ν,γ fγ ,ν (σ i ) = 0. Proof. The proof of this lemma is closely related to the proof of Lemma 5.2, so we omit some of the details here, and refer back to the previous proof.
Outline of proof: The target Hamiltonian H ′ is again encoded in the size of the grid to be tiled, N . As before, the first layer of the tiling implements a binary counter Turing machine from top to bottom, so that the bottom row of the first layer contains N in binary; this is copied to the bottom row of the second layer, taking up approximately log(N ) tiles, and the remaining tiles are unconstrained, with a fixed subset of these tiles acting as the 'physical' spins. The second layer of tiles again encodes a Turing machine, which reads in the description of H ′ (N in binary), and the state of the physical spins, and outputs the energy that should assigned to this configuration by computing the energy of each k-local term to the desired precision. However, this time the energy is computed in the form a √ 2 − b. The tiling rules then force the appropriate number of spins to be in theγ andη states, so the system has the correct energy. The third layer of tiles is used to implement boundary conditions. Any non-optimal tiling will incur an energy penalty∆, but we now have negative energy terms in the Hamiltonian, 14 so a non-optimal tiling may also pick up some energy bonuses (i.e. negative energy contributions). The maximum number of energy bonuses any non-optimal tiling could pick up is N 2 . We will later discuss ways to bound the number of tiling errors needed to induce a certain number of energy bonuses. But for now we assume that it is possible to cause N 2 energy bonuses with O(1) tiling errors. So in order to push all incorrect tilings above energy ∆ we require∆ = ∆ + N 2 . With this condition any non-optimal tiling will have an energy greater than ∆, so provided ∆ > E max(H ′ ) any spin configuration which corresponds to an non-optimal tiling will be above the energy cut-off.
Encoding H ′ : As in the previous construction the Hamiltonian to be simulated H ′ is encoded in the binary expansion of N . Let H ′ = m I=1 h I be a k-local Hamiltonian acting on n two-dimensional Ising spins. As before, H ′ is fully described by the energy of each of the 2 k possible configurations of each of the m k-local terms. Each energy will be specified by giving two numbers, a and b, where the energy λ = a √ 2 − b. Let us assume we want to specify each energy to precision ǫ. Then a, b scale as 1 ǫ . A complete description of H ′ therefore consists of n, and for each of the m k-local terms, 2 k elements giving the energy, where each of these elements contains two binary numbers which scale as 1 ǫ . The description of the Hamiltonian will again be given in Elias-γ ′ coding, using the same format as in Lemma 5.2, with the modification that the energy levels will now each be specified by two Elias-γ ′ numbers. This leads to the Hamiltonian H ′ being specified in binary in the form: In the proof of Lemma 5.2 we found that N scaled as O(2 n k ). Now that we are including the precision information in N too this scaling will become O(2 2 k ǫ ). As in the previous construction, N can clearly be efficiently computed from any other reasonable description of H ′ .
Tiling rules: 15 The tiling rules for the third layer of tiles and the border of layers one and two are unchanged from the construction in [14], outlined in Section 4.3 and Section 4.2 respectively. The rules for the first layer of the tiling are unchanged from Lemma 5.2.
Bottom boundary: As before the second layer will encode a Turing machine. But for this construction we will increase the alphabet of the Turing machine, and use the alphabet Σ M U = {0, 1, 0 ′ , 1 ′ , γ, η, #}. The meanings of the primed symbols are unchanged from Lemma 5.2, and the γ and η are only used in the output. The boundary rules for the bottom boundary are unchanged from Lemma 5.2, and again ensure that the physical spins can only be in the 0 or 1 state.
Computation: The computation is essentially unchanged from the previous construction, with the only difference being that now for each k-local 15 The tiling rules in this section are stated in terms of forbidden and allowed pairs of neighbouring tiles. As in Section 4.3, all forbidden pairs of neighbouring tiles have weight +1, and all allowed pairs have weight 0.  term in the Hamiltonian the computation will output a γs, and b ηs where the energy of that k-local term with the particular spin configuration of physical spins is given by λ = a √ 2 − b. The Turing machine can also output 0 in the case of a zero energy term. At the point the Turing machine has finished its computation its tape will contain a string of 0s, γs and ηs, followed by some string of 0s and 1s, such that if the total energy of the configuration of physical spins is E = a ′ √ 2 − b ′ then the number of γs is a ′ and the number of ηs is b ′ . As before, there will always be enough room to write all the γ's and η's in a single N -length row of tiles.

Position of η Adjacent tile type Below
Top boundary: The tiling rules for the top boundary are very similar to those in the construction of Lemma 5.2, but they have to be updated to allow for the inclusion of a η tile. The updated rules for the top boundary are provided in Table 12. The additional tiling rules which ensure that the η tile does not appear elsewhere in an optimal tiling are summarised in Table 11.
These tiling rules are just straightforward generalisations of the tiling rules from Lemma 5.2, with the tiling rules for η tiles following straightforwardly from the tiling rules for γ tiles.
Partition function: The argument for reproducing the partition function is unchanged Lemma 5.2.
It should be noted that in this construction we can simulate negative energy levels without a problem, so there is no need to add a constant energy shift to the target Hamiltonian in order to make the target Hamiltonian positive semidefinite as we did in the two-parameter case. We have now established the existence of a translationally invariant, universal family of Hamiltonians which require only one parameter. On the other hand, our construction now has a number of simulator spins that scales as O(2 n k ǫ ), whereas before it scaled as O(2 n k ), and the energy penalty in the Hamiltonian now requires an offset that scales as O(N 2 ). The scaling of N with ǫ is inevitable, as removing the α parameter in the Hamiltonian means that the precision information has to be encoded in N . However, it is not immediately clear whether this scaling of the∆ term is necessary.
We first note that it is not possible to have a universal model without a precision parameter unless the Hamiltonian contains 'energy bonuses': be a translationally invariant family of Hamiltonians, where where x denotes the parameters specifying Hamiltonians in the model and the sum is over all spins in the simulator system. If H (x) is universal and contains no precision parameter, then the spectrum of h (x) contains at least one energy level which is below that corresponding to the ground state energy on the target systems.
Proof. We can assume without loss of generality that the energy level of the universal Hamiltonian that maps to the ground state energy of the target Hamiltonian is the zero point energy of the universal Hamiltonian, as this can always be achieved by a constant energy shift to the universal Hamiltonian. The lemma can therefore be rephrased as stating that the energy spectrum of h (x) , denoted spec(h), contains at least one negative value.
Let us assume that there exists a translationally invariant universal model where spec(h) contains only positive values. Now the local Hamiltonian h (x) must act on some fixed number of fixed dimension spins, so the set spec(h)\{0} is finite. Any finite set of real numbers contains a minimum, so spec(h) \ {0} contains a minimum, which we will denote λ min .
If we consider the overall Hamiltonian H (x) = i h (x) acting on some large set of simulator spins then the spectrum of H (x) , denoted spec(H), can only contain different combinations of the energy levels of h(x) added together. So spec(H) \ {0} must have a minimum at least as large as the minimum of spec(h) \ {0} (in general it will be larger). Hence H (x) cannot simulate any Hamiltonian with a non-zero energy level lower then λ min .
Whichever λ min we choose we can construct a Hamiltonian with non-zero energy lower than λ min that the H (x) cannot simulate, hence H (x) is not universal.
We have established that any two parameter translationally invariant model without a precision parameter requires 'energy bonuses' in the Hamiltonian. This suggests that some offset to∆ will always be necessary in order to ensure that non-optimal tilings are pushed to energy above ∆.
Reducing the size of the offset would be possible if the computation encoded in the tiling rules can be carried out "fault tolerantly", so that the number of energy bonuses incurred by making a single tiling error somewhere in the lattice is bounded. There does exist a construction for a 1-dimensional fault tolerant cellular automata [12], which could be encoded into the tiling rules, suggesting the possibility of carrying out the computation fault tolerantly and reducing the offset to∆. However, the construction involves a complicated encoding scheme, and it is not obvious that it would be possible to decode the output of the computation fault tolerantly, so an error in the decoding procedure could still lead to an unbounded number of errors in the output of the simulator, which would give O(N ) energy bonuses in the final output. There also exists a 3-dimensional tolerant cellular automata which fault tolerantly simulates a Turing machine; this could be encoded into tiling rules on a 4-dimensional grid, and it requires no decoding [13]. This simpler fault tolerant device seems more likely to provide a route to reducing the offset to∆, with the caveat that it would only apply to 4-dimensional lattices.
Whether this, or another approach, can be successfully applied to produce a universal model requiring only two parameters with a better scaling of the∆ offset, is an open question which we leave to future research. However it should be noted that in the worst case E max (H ′ ) can scale as O(N ) = O(2 n ). In this case, even if the scaling of∆ with N were improved, we would require it to scale as O(2 n ). So our construction already achieves the optimal worst-case asymptotic scaling of∆ with n.

Necessity of one parameter to specify a universal model
We will now demonstrate that the one-parameter universal model is optimal in terms of the number of parameters required to describe a universal model. Proof. Assume that there exists a translationally invariant Hamiltonian, H U , which takes no parameters other than the number of spins it is acting on, and which is universal. Write H = i h where h is some k-local Hamiltonian containing m terms, acting on d-dimensional Ising spins, and where the sum is over all S = N × N spins on the simulator system. Denote the spectrum of h by spec(h) = {µ i }. Let µ max = max{µ i } and µ min = min{µ i }, where by the minimum we mean the largest magnitude negative value in the set. 16 Consider a target Hamiltonian H ′ , where the spectrum of H ′ is given by spec(H ′ ) = {λ i }. As H U is a universal Hamiltonian it must be able to simulate H ′ up to arbitrary precision ǫ, and below arbitrary energy cut-off ∆. We saw in the previous section that it is possible to encode the precision into N , now we also have to encode the energy cut-off into N .
Let us assume that we can encode ∆ into N in some way, and that when H acts on S = N × N spins it simulates H ′ below energy ∆. Consider a configuration, σ, of the S spins which is below the energy cut-off, so that E(σ) = λ x ∈ {λ i }. Now take one spin in the configuration, and change its state to one of the other d − 1 possible spin states. The maximum energy difference such a change in state can have is bounded by µ max − µ min . Now consider simulating H ′ with a different energy cut-off ∆ ′ , where ∆ ′ > ∆. We must have that this simulation involves H acting on S ′ = N ′ × N ′ spins, where N ′ = N for sufficiently larger ∆ ′ . Consider again a configuration σ ′ of the spins, such that E(σ ′ ) = λ x . We can again consider taking one spin in the configuration and changing its state, and the maximum energy difference such a change in state can have will again be bounded by µ max − µ min . We see that if we are in a spin configuration that is below the energy cut-off, then the energy difference between the current spin configuration and one which differs from the current configuration by the state of just one spin is bounded, and in particular is independent of N . Therefore, this energy difference cannot be made arbitrarily large by altering N . Hence in order to satisfy Definition 2 of simulation, the new spin configuration we reach by altering the state of one spin must also be below the energy cut-off.
We have not restricted λ x in any way, hence this argument must hold for the spin configuration reached via one spin flip from any state below the energy cut-off. So if start in a configuration with energy λ x ∈ {λ i }, after one spin flip we are in another configuration with energy λ y ∈ {λ i }, and if we carry out another single spin flip we will again be in a configuration below the energy cut-off. 17 Now, all of the d S possible spin configurations can be reached by starting in the configuration σ, and carrying out a series of single-spin flips. Each single-spin flip takes us to a different configuration below the energy cut-off, so we never exceed the energy cut-off. Hence we find that every possible configuration of the S spins must map to one of the energy levels of H ′ .
This argument holds regardless of whether we were trying to achieve the energy cut-off of ∆ or ∆ ′ . Thus we find that whenever H U is simulating a Hamiltonian H ′ , every spin configuration of the spins in the simulator system must map to one of the energy levels of H ′ .
However, in order to correctly simulate H ′ it is not enough to reproduce the energy levels. We must also introduce the same additional degeneracy to each energy level, so that the partition function is reproduced. The number of possible spin configurations of S simulator spins is d S . If we denote the number of spin configurations of the original spins by M , the requirement of introducing the same additional degeneracy to each energy level of the original system enforces that ∀M , ∃S such that d S M ∈ Z. To see that this isn't possible, consider arbitrary d, and select M such that d and M are coprime, i.e. so that the greatest common factor of d and M is 1. 18 If we let the prime factors of d (ignoring multiplicities) be the set P = {p i }, and the prime factors of M be the set P ′ = {p ′ j }, then from the fact that d and M are coprime we have that P ∩ P ′ = ∅. Now consider the prime factors of d S . Ignoring multiplicities, the set of prime factors of d S is the same as the set of prime factors of d, so we find that M and d S have no common factors greater than 1, and are coprime. Hence d S M is an irreducible fraction, and in particular d S M / ∈ Z. Therefore for any d we choose there will exist some Hamiltonians which cannot be simulated. Hence our one-parameter, translationally invariant Hamiltonian cannot be universal.
It should be noted that the requirement that the energy cut-off ∆ can be made arbitrarily large is crucial in order to ensure that the partition function can be approximated arbitrarily well. If we allowed a definition of simulation in which the energy cut-off was fixed we would only be able to reproduce the partition function to within some constant error.

A translationally invariant model on poly(n) spins
Finally, we note that it would be possible to construct a universal model on poly(n) spins by allowing the Turing machine that is encoded in the tiling rules to depend on what Hamiltonian is being simulated, which would vastly increase the number of parameters in the model.
In both the two-parameter and one-parameter universal constructions the computation only takes time poly(n) to run. The reason we need a grid of size O(2 n k ) is to allow the binary counter Turing machine to provide a description of the Hamiltonian being simulated. If instead of a universal Turing machine we used the tiling rules to encode a Turing machine which specifically computes the energy levels of the target Hamiltonian, then there would be no need for the binary counter Turing machine, and we could simply run the non-universal Turing machine on poly(n) spins.
The number of parameters needed to specify such a universal model would be large. In particular, since there are only a finite number of Turing machines with a given alphabet, but an infinite number of Hamiltonians we might want to simulate, the alphabets of the Turing machines, and hence the local state space of Hamiltonians in the model, would not be fixed.

Discussion
Our main result is that there exists a translationally invariant Hamiltonian which can replicate all classical spin physics, just by tuning one parameter in the Hamiltonian, and varying the number of spins the Hamiltonian is acting on. We mentioned in Section 2 that this result has some implications for complexity theory, but perhaps the more interesting implications are those for classical many-body physics.
It was commented on in [7] that the existence of universal models implies that the properties of classical many-body systems are not determined solely by the symmetries or number of spatial dimensions of the system, nor by the structure of their interaction graph. However, all the universal models constructed in [7] required inhomogeneous coupling strengths, and it was suggested that the inhomogeneity of the couplings could account for this. We can now go further, and state that even for translationally invariant models with homogenous couplings the properties of the system are not determined by the dimensionality or symmetry of the system, or by the interaction structure, since all physical properties of e.g. 3D spin systems with complicated interaction graphs can be simulated on a 2D square lattice with nearest-neighbour, translationally invariant interactions.
It should be noted that in the classical setting physically simulating one spin model using another is unlikely to be of much practical use, as numerical simulations will usually be more efficient. As such the significance of this classical result lies more in what it tells us about the properties of classical spin models, than in technological applications of the result.
When dealing with quantum systems, however, we can no longer efficiently simulate them numerically using a classical computer. Quantum systems can be efficiently numerically simulated using a quantum computer [19], however this requires a large-scale, fault-tolerant quantum computer, which is beyond the reach of current technology. This has led to interest in the concept of analogue quantum Hamiltonian simulation -simulating a quantum system by engineering the Hamiltonian and measuring its properties. There has been experimental progress in designing quantum simulators using a range of implementations [16], [2], [3]; and theoretical work has pro-vided a rigorous framework for analogue Hamiltonian simulation [5].
We mentioned in the introduction that within the rigorous framework of analogue quantum Hamiltonian simulation defined in [5], there exist quantum spin models that are universal, in the sense that they can simulate the entire physics of any other quantum many-body system. As in the classical case, some of the universal spin models are surprisingly simple, with examples including the Heisenberg, and XY-interaction [5]. However, every known universal quantum model breaks translational invariance. It would be interesting to consider whether there exists a quantum analogue to results in this paper -is it possible to construct a universal, translationally invariant quantum model?
In [14] it was demonstrated that the hardness of calculating the ground state energy of a one-dimensional translationally invariant quantum system is QMA EXP -complete, suggesting that in the quantum regime translational invariance is not a barrier to complexity. This work may prove a route to extending our result to the quantum case. But it should be noted that the quantum notion of simulation is more involved than the classical case, and it is not clear that there exists a straightforward generalisation of our result to the quantum case.