Entanglement in weakly coupled lattice gauge theories

We present a direct lattice gauge theory computation that, without using dualities, demonstrates that the entanglement entropy of Yang-Mills theories with arbitrary gauge group $G$ contains a generic logarithmic term at sufficiently weak coupling $e$. In two spatial dimensions, for a region of linear size $r$, this term equals $\frac{1}{2}\dim(G)\log(e^2 r)$ and it dominates the universal part of the entanglement entropy. Such logarithmic terms arise from the entanglement of the softest mode in the entangling region with the environment. For Maxwell theory in two spatial dimensions, our results agree with those obtained by dualizing to a compact scalar with spontaneous symmetry breaking.


Introduction
Entanglement entropy is a powerful tool for characterizing the entanglement structure of quantum states. It is a quantity of interest in high energy physics, condensed matter, and quantum information theory alike. Despite its relevance, the definition of entanglement entropy in gauge theories was not understood in depth until rather recently. The central difficulty was that gauge invariance introduces a degree of nonlocality at the UV scale that seemed to make it impossible to define the subsystems whose entanglement entropy we want to measure. A number of approaches have been proposed to address this issue; in subsequent sections we will review them and see how they all revolve around constructing a gauge-invariant density operator whose von Neumann entropy can be interpreted as the entanglement entropy of the given subsystem. This paper will take a route very close to the one outlined in [1][2][3][4][5].

JHEP04(2016)163
The main goal of this paper is to use these technical developments to further our understanding of the ground state entanglement in weakly coupled Yang-Mills theories. Working within a lattice gauge theory framework, we provide a comprehensive description of the calculation of entanglement entropy for Yang-Mills theories on arbitrary lattices and for arbitrary gauge groups $G$. In particular, we demonstrate the presence of a ubiquitous term $\Delta S$ whose form in a continuum with $d$ spatial dimensions is given in eq. (1.1). Here $r$ is the linear size of the entanglement region and $e$ is the continuum gauge coupling; for $d = 3$ the logarithm is replaced simply by $\log e^2$. This term is of particular importance in $d = 2$, where it is the dominant universal term of the entanglement entropy and takes the form
$$\Delta S = \frac{1}{2}\dim(G)\,\log\left(e^2 r\right). \qquad (1.2)$$
We obtain the advertised results by gauge-fixing to axial gauge, expressing the gauge theory as a principal chiral model with spontaneous symmetry breaking, and calculating the entanglement entropy of the resulting Nambu-Goldstone bosons.¹ The key ingredient here is the fact that Nambu-Goldstone bosons do not have a zero mode and therefore exhibit enhanced entanglement of the softest mode in the entanglement region [25]. Thus, we can qualitatively say that the $\Delta S$ term arises because weakly coupled gauge theories look like $(d - 1)\dim(G)$ decoupled photons, with each photon behaving like a scalar field with a zero mode removed (or gauged away). For the special case of the $d = 2$ Maxwell theory, this logarithmic term has been computed in an alternative way, by dualizing to a compact scalar theory (the O(2) model) that exhibits spontaneous symmetry breaking [15].²
The results presented in this paper touch on many other studies of entanglement in gauge theories. Our computation shows that the $\Delta S$ term is at least one part of the entanglement entropy that is invariant under field-theoretic dualities. Further, the weak-coupling entanglement entropy is explicitly shown to scale as $N^2$ in the planar limit, in contrast to the vanishing of the entropy at strong lattice coupling. This suggests, along the lines of [26], that entanglement entropy is indeed a good order parameter for confinement that can be explicitly computed in both weak and strong coupling regimes. Finally, we draw a connection between the logarithmic terms above and the topological entanglement entropy in $d = 2$ [6,18], arguing that $\Delta S$ is the analog of this important quantity for systems with continuous gauge groups.
¹ A clarification is in order here. Typically, a discussion of spontaneous symmetry breaking in an O(N) model of scalar matter assumes that the constant mode is frozen into a particular position on the sphere $S^{N-1}$. However, one can also construct the state that features "restoration of symmetry" by uniformly superposing all possible directions that the constant mode can point in. This construction effectively "gauges away" the zero mode of the theory, and it is this kind of state that we will encounter when gauge-fixing the Yang-Mills theory. In general, whenever we refer to "spontaneous symmetry breaking" in this paper, we will refer to a projection to just the sector of Nambu-Goldstone bosons, where the wavefunction does not depend on the zero mode.
² As we will describe in detail below, the gauge-fixed lattice theory is related to the dual scalar one by a canonical transformation.

The paper is structured as follows. In section 2, we give a short but self-contained introduction to lattice gauge theories and set the notation to be used throughout the paper. In section 3, we give a very explicit definition of entanglement entropy in a gauge theory [1][2][3][4][5], and we show how it agrees with other definitions in the literature. An example is given in section 4, where the strong coupling entanglement entropy at large N is computed for the first time. The main calculation is given in section 5 and it culminates with the above result for ∆S and with a reasonable conjecture for the general form of the total gauge theory entropy. Further implications of our results are discussed in the Conclusion, and miscellaneous technical points are collected in the appendices.

Notation and conventions
Our Hamiltonian formulation of non-Abelian lattice gauge theory in $d$ spatial dimensions is based on the seminal work of Kogut and Susskind [27]. The notation follows the approaches of [1,2]. We work on a finite lattice with open boundary conditions and no nontrivial topology. The lattice sites are labeled by $i, j, \ldots$, and the links are labeled either by a link index, $\ell$, by a pair of adjacent site indices, $(i, j)$, or by a site and a direction, $(i, \mu)$. Each link is oriented, so if $\ell = (i, j)$, the link of opposite orientation is $\bar\ell = (j, i)$. In all examples we will assume a hypercubic lattice, but our discussion applies to arbitrary lattices.
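Quantum variables live on links. The state on a link $\ell$ is labeled by an element $U$ of the gauge group $G$. Products of states $|U\rangle_\ell$ over all links $\ell$ form the Hilbert space $H_0$ of the whole lattice. The operator algebra on $H_0$ is generated by momentum operators $L^\Lambda_\ell$ and position operators $U^r_\ell$, which act on $|U\rangle_\ell$ via
$$L^\Lambda_\ell |U\rangle_\ell = |\Lambda U\rangle_\ell, \qquad U^r_\ell |U\rangle_\ell = r(U)\,|U\rangle_\ell, \qquad (2.1)$$
where $\Lambda \in G$ and $r$ is a representation of the gauge group. We also define $\bar L^\Lambda_\ell$ and $\bar U^r_\ell$ via
$$\bar L^\Lambda_\ell |U\rangle_\ell = |U\Lambda^{-1}\rangle_\ell, \qquad \bar U^r_\ell |U\rangle_\ell = r(U^{-1})\,|U\rangle_\ell. \qquad (2.2)$$
Note that $L^\Lambda_\ell$ and $\bar L^\Lambda_\ell$ commute, by construction. Operators on different links also all commute. By writing $L_\ell$ we also refer to the operator $\bigotimes_{\ell' \neq \ell} \mathbb{1}_{\ell'} \otimes L_\ell$ on the full lattice.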
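The statement that the two kinds of momentum operators commute can be checked on a small non-Abelian example. The following is a minimal sketch, assuming the unbarred operator acts on link labels by left multiplication, $U \to \Lambda U$, and the barred one by right multiplication with the inverse, $U \to U\Lambda^{-1}$, as in eq. (2.2). The finite group $S_3$ stands in for $G$ purely for illustration, and all function names are hypothetical.

```python
from itertools import permutations

# Elements of S_3 as tuples p, meaning i -> p[i]. S_3 is only an
# illustrative stand-in for the gauge group G.
GROUP = list(permutations(range(3)))

def mul(a, b):
    """Group product a*b: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(b)))

def inv(a):
    """Inverse permutation."""
    out = [0] * len(a)
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

def act_L(lam, u):
    """Assumed action of the unbarred momentum operator: U -> Lambda U."""
    return mul(lam, u)

def act_L_bar(lam, u):
    """Action of the barred momentum operator, eq. (2.2): U -> U Lambda^{-1}."""
    return mul(u, inv(lam))

def momentum_operators_commute():
    """Check that left and right actions commute on every basis label."""
    return all(
        act_L(a, act_L_bar(b, u)) == act_L_bar(b, act_L(a, u))
        for a in GROUP for b in GROUP for u in GROUP
    )
```

The check succeeds for any group because it only uses associativity, $\Lambda(U\Lambda'^{-1}) = (\Lambda U)\Lambda'^{-1}$; no commutativity of $G$ is needed.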
The state of the entire lattice (i.e. an element of $H_0$) will be denoted by kets without an index; e.g. we might write $|\Psi\rangle = \bigotimes_\ell |U\rangle_\ell$ for a given configuration $\{U_\ell\}$ on the lattice links.
Electric operators $J^a_\ell$ are representations of generators associated to momentum operators, as specified in eq. (2.3). Here $\theta^a$ are the coordinates on the group manifold and $T^a$ are the generators of $G$ normalized to $\mathrm{Tr}(T^a T^b) = \delta^{ab}$. Given a quantum state labeled by $U \equiv e^{iA^a T^a}$, the electric operators act on it as covariant derivatives on the Lie manifold with coordinates $A^a$:

The covariant Laplacian $J^2_\ell = J^a_\ell J^a_\ell$ is a Casimir invariant that we will use extensively. The sum $\sum_\ell J^2_\ell$ is the "electric term" in the Kogut-Susskind Hamiltonian [27]. The remaining "magnetic term" of this Hamiltonian is built from the magnetic operators $W^r_p$ and $\bar W^r_p$, defined on lattice plaquettes $p$ through a product over the links $\ell \in p$, where $r$ is a representation of the gauge group and the links in the product are traversed counterclockwise. Henceforth, when $r$ is dropped, the fundamental representation is understood. We will always use $N$ to denote the dimension of this representation. Using these conventions, the full Kogut-Susskind Hamiltonian is given in eq. (2.6), where $g^2$ is the gauge coupling. We will exclusively work with this Hamiltonian in this paper, though our results would not qualitatively change if we included more terms in the magnetic potential.
Gauge-invariant systems are insensitive to all local transformations $|\Psi\rangle \mapsto |\Psi^\Lambda\rangle$, where $\Lambda = \{\Lambda_i\}$ assigns an element of $G$ to each lattice site. Such transformations are implemented by a product over sites of the Gauss operators $G^\Lambda_i \equiv \prod_\mu L^\Lambda_{(i,\mu)}$, eq. (2.8). In this product there are exactly two momentum operators acting on each link, one acting in the direction of the link and one in the opposite direction. Together they implement the desired local transformation. Gauging this transformation, we demand that physical states are only those satisfying $|\Psi^\Lambda\rangle = |\Psi\rangle$ or, equivalently, $G^\Lambda_i |\Psi\rangle = |\Psi\rangle$ at each site and for any $\Lambda \in G$. The space of all such states is the physical Hilbert space $H$.
In the first half of the paper we will work in the electric basis of $H_0$ and $H$. This basis diagonalizes the electric term $J^2_\ell$ on each link. One element of this basis is the ground state of the electric term, $|\Omega\rangle$, defined in eq. (2.9), where $dU$ is the Haar measure on the group manifold $G$, normalized so that $\langle\Omega|\Omega\rangle = 1$. This state is gauge-invariant, as $L^\Lambda_\ell|\Omega\rangle = |\Omega\rangle$ for all momentum operators $L^\Lambda_\ell$. Excited states, i.e. the other elements of the electric basis, are formed by acting on $|\Omega\rangle$ with position operators $U^r_\ell$. These excitations are labeled by representations $r$, as per the Peter-Weyl theorem. Physical excitations can be viewed as closed lines of electric (color) flux. In the remainder of this subsection we review the systematics of these electric eigenstates. An excellent reference with many more details is [28].
Elements of the electric basis of $H_0$ can be thought of as products over links of wavefunctions on the space of representations of $G$. On a given link, an electric basis element labeled by an irreducible representation $r$ is denoted $|r\rangle_\ell$, with $d_r$ the dimension of $r$. The normalization ${}_\ell\langle r|r'\rangle_\ell = \delta_{rr'}\, r(\mathbb{1})$ follows from eq. (2.9) and the Weyl orthogonality property. Simple examples are the fundamental and antifundamental representations, $r = f, \bar f$. Given a (possibly self-intersecting or multiply-winding) closed loop of links $C = (\ell_1, \ldots, \ell_n)$, the physical states on $C$ are given in eq. (2.13), where $s_k = +1$ if the link $\ell_k \in C$ is traversed in the direction of its orientation, and $s_k = -1$ if not. These states form the electric basis of $H$; they are eigenstates of the electric term of (2.6) with eigenvalues $n g^2 C_2(r)$, where $C_2(r)$ is the quadratic Casimir of the representation $r$ of the gauge algebra. For example, we have $C_2(f) = C_2(\bar f) = N$ for U(N) and $C_2(f) = C_2(\bar f) = N - 1/N$ for SU(N). At large $N$, states defined on different loops are all orthogonal to each other, and hence electric basis elements in the planar limit are indexed by the set of all closed loops with a representation associated to each loop.
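The quoted Casimir values follow directly from the normalization $\mathrm{Tr}(T^a T^b) = \delta^{ab}$ and can be verified numerically. The sketch below is illustrative only: it builds one particular orthonormal basis of Hermitian generators (off-diagonal symmetric and antisymmetric matrices plus diagonal ones, a choice made here and not taken from the text) and sums $T^a T^a$ in the fundamental representation. The function names are hypothetical.

```python
import math

def zeros(n):
    """n-by-n complex zero matrix as nested lists."""
    return [[0j] * n for _ in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def generators(n, special):
    """Hermitian generators with Tr(T^a T^b) = delta^{ab}.

    special=True gives SU(n) (traceless), special=False gives U(n).
    """
    gens = []
    s = 1 / math.sqrt(2)
    for i in range(n):
        for j in range(i + 1, n):
            sym, asym = zeros(n), zeros(n)
            sym[i][j] = sym[j][i] = s
            asym[i][j], asym[j][i] = -1j * s, 1j * s
            gens += [sym, asym]
    for k in range(1, n):  # traceless diagonal generators
        d = zeros(n)
        norm = math.sqrt(k * (k + 1))
        for i in range(k):
            d[i][i] = 1 / norm
        d[k][k] = -k / norm
        gens.append(d)
    if not special:  # extra U(1) generator proportional to the identity
        u1 = zeros(n)
        for i in range(n):
            u1[i][i] = 1 / math.sqrt(n)
        gens.append(u1)
    return gens

def casimir_fundamental(n, special):
    """Return the matrix sum_a T^a T^a, which should equal C_2(f) times the identity."""
    total = zeros(n)
    for t in generators(n, special):
        tt = matmul(t, t)
        for i in range(n):
            for j in range(n):
                total[i][j] += tt[i][j]
    return total
```

For $N = 3$ the diagonal entries of the returned matrix should be $3$ for U(3) and $3 - 1/3$ for SU(3), with vanishing off-diagonal entries.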
3 Entanglement entropy in lattice gauge theory

Overview
Entanglement entropy quantifies how much information is lost by restricting ourselves to a part of the given system. The point of view of this paper is that, given a state of the whole system, one can always construct a reduced density operator that precisely reproduces the physics (i.e. all the correlation functions) of the original state in just a part of the full system. The entanglement entropy is then the von Neumann entropy associated to this operator; roughly speaking, this measures the number of states a subsystem can be in while the entire system is in a given state. This is the quantity we will compute. There exist alternative (possibly inequivalent) approaches to entanglement entropy, e.g. axiomatic definitions involving strong subadditivity, but we will not address them here.
The definition of a reduced density operator in gauge theories is tricky because the physical Hilbert space $H$ does not admit a decomposition into a direct product of Hilbert spaces defined solely on a region $V$ and its complement $\bar V$. There are several ways to address this subtlety:
• Embedding $H$ into a direct product of Hilbert spaces defined separately on $V$ and $\bar V$ gives rise to a density operator that can be reduced in the usual way [6][7][8][9][10][11].
• Elements of the reduced density matrix may be computable via a Euclidean path integral by directly path-integrating and using the replica trick (see e.g. [14][15][16] for some salient examples).
• In theories that admit holographic duals, entanglement entropy can be computed following the Ryu-Takayanagi prescription [12,13] (see [29] for an explanation on why this is equivalent to the computation via the replica method in the boundary theory).
• In special circumstances, the gauge theory setup can be mapped to a problem in which the reduced density operator/entanglement entropy calculation is tractable [17][18][19][20][21][22]. These approaches might not yield the most general method for computing a reduced density matrix, but they are important for checking any general prescription.
• Finally, a reduced density operator can be defined purely algebraically, as the unique density operator in any subalgebra of observables that reproduces the expectation values of all the operators in the subalgebra [1-5, 23, 24].
These methods are equivalent in a precise sense that we will discuss below. We will start from the particularly transparent approach taken in the progression of papers [1,2,4,5], where the last of the above prescriptions was followed. Given the algebra $A_V$ of all gauge-invariant operators on a set of links (not sites or plaquettes) $V$, it is always possible to find the density operator $\rho_V \in A_V$ and calculate its von Neumann entropy. This prescription is manifestly gauge-invariant and associates an entropy to all the physical data contained in a region of space. Varying what algebra one assigns to a region (e.g. dropping magnetic operators near the edges of the set $V$) leads to different values for the entanglement entropy [1,3]; the results of [30] suggest that these alternative entropies correspond to entanglement regions with operator insertions on the entangling edge, and we will have more to say about this at the end of this section.

Definition
The desired reduced density operator is constructed as follows. Let $V$ and $\bar V$ be a set of links and its complement, and let $\partial V$ be the set of sites for which some (but not all) emanating links are in $V$. For each $i \in \partial V$, one defines boundary electric operators $E^a_i$ (with counterparts $\bar E^a_i$ on the $\bar V$ side, cf. figure 1), in terms of which the Gauss operators (2.8) at boundary sites $i$ can be expressed; as in (2.3), the $\theta$'s are defined such that $\Lambda = e^{i\theta^a T^a}$. The gauge constraint requires $G^\Lambda_i = 1$ for all $\Lambda$, and hence it imposes a relation among the boundary electric operators when they act on any physical state. Of course, these operators are not gauge-invariant, but e.g. the quadratic Casimirs $E^2_i = E^a_i E^a_i$ are, and the above relation constrains them as well. All possible boundary Casimirs (not just the quadratic ones) generate the center of the algebra $A_V$. The center elements are all diagonalized simultaneously in the electric basis. The physical space $H$ thus naturally splits into superselection sectors $H^{(k)}$ labeled by $k = (k_1, \ldots, k_B)$, the collection of Casimir eigenvalues at various sites $i$, where $B$ is the total number of boundary Casimirs. See figure 1 for an illustration.
A subtle point arises here: the space $H^{(k)}$ is spanned by gauge-invariant states living wholly in $V$, wholly in $\bar V$, and partly in $V$ and partly in $\bar V$. This last group presents an obstruction to decomposing $H^{(k)}$ into a direct product, and we now describe how this is circumvented by working with the example of a square plaquette $p = (\ell_1, \ell_2, \ell_3, \ell_4)$ with just $\ell_1 \in V$. Our point is illustrated already by the fundamental excitation on this plaquette, $|\square_p\rangle$. This state can be decomposed as in eq. (3.5). The key point here is that the gauge-invariant state $|\square_p\rangle$ is written as a gauge-invariant entangled combination of gauge-variant states $|{-}\rangle_{\alpha\beta}$ (supported on $\ell_1$) and $|\sqcap\rangle_{\alpha\beta}$ (supported on the remaining three links).
Instead of using the single basis element $|\square_p\rangle \in H$, we may use a larger basis consisting of the $2N^2$ elements $|{-}\rangle_{\alpha\beta}, |\sqcap\rangle_{\alpha\beta} \in H_0$; as long as we act on it only with gauge-invariant operators and density matrices built out of gauge-invariant states like $|\square_p\rangle$, we will never ruin gauge invariance, and as an upshot we will be able to cleanly split this enlarged basis into elements in $V$ and elements in $\bar V$. Tracing over the gauge-variant basis elements in $\bar V$ gives a reduced density matrix that maximally mixes the gauge-variant basis elements in $V$. The entropy coming from this density matrix is $2\log N$. In general, the effect of splitting a flux line of representation $r$ will be to increase the entanglement entropy of the appropriate sector by $\log d_r$. For each piercing of the entanglement edge by a loop with representation $r$, another factor of $\log d_r$ should be added. Keeping this in mind we may now forget all about using gauge-variant basis elements, include this extra entropy $\sum_{\text{piercings}} \log d_r$ by fiat, and pretend that $H^{(k)}$ factorizes into a direct product $H^{(k)} = H^{(k)}_V \otimes H^{(k)}_{\bar V}$, where $H^{(k)}_V$ contains all gauge-invariant states defined purely on $V$ and all gauge-variant states represented by open flux lines that end at $\partial V$ and that have outgoing flux given by $k$. This contribution to the entanglement entropy was identified in [9], and further discussions of working with gauge-invariant operators on gauge-variant basis states may be found in [2,4].
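The counting behind the $2\log N$ and $\log d_r$ statements can be made explicit. The sketch below assumes, as the text describes, that tracing out $\bar V$ leaves a maximally mixed state on the open color indices in $V$, with one index of dimension $d_r$ per piercing of the entangling edge. The function names are hypothetical.

```python
import math

def von_neumann_entropy(eigenvalues):
    """Entropy -sum p log p of a density matrix given its eigenvalues."""
    return -sum(p * math.log(p) for p in eigenvalues if p > 0)

def cut_flux_entropy(d_r, piercings):
    """Entropy of the maximally mixed state on the open indices in V.

    Each piercing of the entangling edge by a flux line in representation r
    contributes one index taking d_r values (assumption stated in the lead-in).
    """
    dim = d_r ** piercings
    return von_neumann_entropy([1.0 / dim] * dim)
```

A single fundamental plaquette with one link in $V$ pierces the edge twice, so `cut_flux_entropy(N, 2)` reproduces $2\log N$; more generally the result is `piercings * log(d_r)`.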
We now return to constructing the desired reduced density operator. The density operator $\rho$ of a general state may have elements that mix two superselection sectors, but since no gauge-invariant operators in $A_V$ can change the superselection sector of a given state, these elements of $\rho$ can be set to zero. This way one obtains a block diagonal matrix $\rho = \bigoplus_k p_k \rho^{(k)}$ that is sufficient to describe all expectation values of operators in $A_V$ [1,2]. Here $\rho^{(k)}$ is a density matrix for states in the $H^{(k)}$ sector and $p_k \equiv \mathrm{Tr}_{H^{(k)}}\rho$ are c-numbers chosen so that $\rho^{(k)}$ has unit trace. By tracing out the $\bar V$ states, which is possible because we can effectively express $H^{(k)}$ as a direct product, one obtains the reduced matrix $\rho_V$ with blocks $p_k \rho^{(k)}_V$. This is the operator we were after. Its von Neumann entropy is the entanglement entropy we wish to compute. It takes the simple form given in eq. (3.7). It may be instructive to view this expression as a sum of two types of entropies, a Shannon (or "classical") entropy $-\sum_k p_k \log p_k$ that comes from boundary conditions/edge modes, and the average von Neumann entropy $\sum_k p_k S^{(k)}$ that comes from the entanglement of the interior modes with the exterior. For states that contain only a few electric flux lines, $\rho^{(k)}_V$ will describe pure states and the entanglement entropy will arise solely from the facts that gauge-invariant operators are cut by $\partial V$ and that the original state is a superposition of basis elements belonging to different sectors. States of this form are the strong-coupling ground state and its lowest excitations, and we will see that, for them, the Shannon entropy contains the most relevant information about entanglement.
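The split into a Shannon part and an average von Neumann part is a general property of block-diagonal density matrices, and it can be checked numerically. The following sketch represents each sector by its weight $p_k$ and by the eigenvalues of its unit-trace block; it illustrates only this identity and does not include the additional $\log d_r$ piercing terms discussed above. The function names and the sample numbers in the usage note are hypothetical.

```python
import math

def entropy(probs):
    """-sum p log p for a list of non-negative weights summing to one."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def shannon_and_average(sectors):
    """sectors: list of (p_k, eigenvalues of the unit-trace block rho^(k)_V).

    Returns the Shannon entropy of the p_k and the p_k-weighted average
    of the block entropies.
    """
    shannon = entropy([p for p, _ in sectors])
    average = sum(p * entropy(eigs) for p, eigs in sectors)
    return shannon, average

def full_entropy(sectors):
    """Entropy computed directly from the spectrum {p_k * lambda} of the full matrix."""
    return entropy([p * lam for p, eigs in sectors for lam in eigs])
```

For instance, with `sectors = [(0.5, [1.0]), (0.3, [0.5, 0.5]), (0.2, [0.7, 0.2, 0.1])]`, the sum of the two values returned by `shannon_and_average` agrees with `full_entropy(sectors)` up to rounding. When every block is pure, the average term vanishes and only the Shannon entropy survives.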
Ground states at weak coupling will feature a superposition of all superselection sectors. If the gauge group is continuous, the superposition will be Gaussian; if it is discrete, the superposition will be uniform. (We will review this in the appendix.) In the latter case the dominant contribution to the universal parts of the entanglement entropy will come from the $S^{(k)}_V$ entropies, while in the former case the Shannon entropy provides corrections comparable to the von Neumann entropy, as has been explicitly demonstrated in [11]. The weak-coupling logarithmic term that we will focus on in section 5 is unaffected by the presence of the edge modes.

Comments
The procedure described above certainly defines a gauge-invariant quantity that measures entanglement, but a few words are needed to justify calling it the entanglement entropy. Implicit choices were made at two junctures in the above discussion.
Maximal algebra of observables: the algebraic approach [1][2][3][4][5] shows in a particularly clear way that there exist alternative choices for the definition of entanglement entropy. Given a set of links $V$, we may start with any algebra of gauge-invariant observables that are defined only using operators $U_\ell$ and $J_\ell$ with $\ell \in V$. In this paper, we have chosen to work with the maximal algebra $A_V$ that contains all possible operators supported on $V$. Other algebras, such as those with a trivial center (and hence without superselection sectors), will lead to different reduced density operators and different entropies. The maximal algebra is often called the "electric center" choice [1].
The existence of these choices is by no means unique to gauge theories. A real scalar on a lattice has two operators at each site, $\phi_i$ and its conjugate momentum $\pi_i$. After choosing a set of sites $V$, we still have the freedom to choose an algebra generated by, say, all the $\phi_i$'s but only some $\pi_i$'s in $V$. Given a state of the scalar field on the entire lattice, the reduced density operator that belongs to this algebra will not be the same as the reduced density operator constructed out of all possible $\phi_i$'s and $\pi_i$'s that lie on sites in $V$. The von Neumann entropies of these operators will be different, as well.

The point of view of this paper is that the natural object to study is the maximal algebra of observables in a given region V . One reason for this is that the entropy associated to this algebra has a nice interpretation in terms of a replica trick path integral [4]. Another reason is that this seems to be the object that has been implicitly studied by most other approaches to gauge theory entanglement, and it is for this choice that we recover the familiar notion of topological entanglement entropy. A final reason is that working with the maximal algebra seems to be the approach that was already adopted for theories with matter and no gauge fields. Studying the entropies of non-maximal algebras remains a worthwhile task, and in particular it is of interest to know whether there are measures of entanglement that do not depend on the particular choice of algebra, as long as it is supported on the links in V and not in any subset of V [3].
Gauge theory as a projection: another choice we made was viewing the gauge theory as a projection to a $G$-invariant sector of a bigger theory with symmetry group $G$. For instance, the Hilbert space $H$ of a gauge theory could have been obtained from $H_0$ as in Kitaev's toric code, by simply positing that the mass of charged states is much greater even than the energy scale set by the lattice spacing [31]. This approach has been contrasted to working with a "true" gauge theory, in which there is no physical interpretation of the enlarged Hilbert space $H_0$ and the factors of $\log d_r$ in (3.7) are absent [32].
Whether one of the above options is more fundamental than the other is a deep question that we will not tackle in depth here; for our purposes this choice is a matter of taste. The present paper takes the former ("projection") approach because it appears more natural if we are interested in placing the gauge theory on manifolds with nontrivial topology. In the latter approach, Wilson loops along noncontractible cycles would need to be added manually to the set of usual plaquette excitations on flat space, whereas in the projection language the holonomies are automatically included from the start. The dichotomy between the two views of gauge theory seems intimately related to the old question of compact versus noncompact gauge groups: choosing a noncompact gauge group matches the Gaussian fluctuations in the compact theory but misses the topological/nonperturbative effects such as vortices and monopoles, and as a result it typically yields a nonunitary quantum theory unless these defects are added to the theory by fiat.
Even after the above choices are made, we are still left with a variety of methods to calculate the entanglement entropy, as enumerated in subsection 3.1. We will now comment on the equivalence of these approaches. The discussion of gauge-invariant entanglement of gauge-variant degrees of freedom around eq. (3.5) serves to justify the "embedding approach" to defining a reduced density operator [6][7][8][9][10][11]. If we embed the physical Hilbert space into a space where matter degrees of freedom can live on the entanglement edge (effectively splitting the links into two, as done in [6]), we can directly construct the appropriate density operator $\rho_V$ by the usual tracing out procedure. As long as the initial state is gauge-invariant, $\rho_V$ will entangle these edge degrees of freedom so that it computes the correct expectations of gauge-invariant operators in $A_V$ while giving zero for gauge-variant ones. (The spaces $H^{(k)}_V$ can be viewed as subsets of such a larger Hilbert space.) Extending this thought, instead of ever working with the physical space $H$, we can just work with $H_0$ from the outset and ask for the entropy of the algebra of gauge-invariant operators $A_V$ acting on $H_0$; the answer will be the same as for $H$, and it is given by tracing out the elements of $H_0$ defined on $\bar V$ [4]. This tracing out can be expressed by a lattice path integral over all the link configurations, and this connects the calculations in the previous subsection to the ones done by usual replica trick methods. This demonstrates the equivalence of all the methods that have been proposed for calculating entanglement entropy in gauge theories.
We can also connect these lattice computations to the ones done in the continuum limit. Consider a set of links $V$ and focus on a particular site $i \in \partial V$. To simplify, pick $i$ so that there is only one link emanating from it that is in $\bar V$. Now consider adding this one link to $V$, thereby creating a new set $V + \delta V$. As long as adding this link has not introduced an entire new plaquette into $V$, the Gauss law guarantees the equality of algebras, $A_V = A_{V+\delta V}$, and of the associated entropies, $S_V = S_{V+\delta V}$. Thus, as the continuum limit is taken, the boundary of the entangling region is realized as a "belt" or "buffer zone" of thickness equal to the lattice spacing; the entanglement entropy is not sensitive to whether the set of links $V$ is chosen to contain links in this belt or not. (For example, the link containing the $E^2_3$ operator on figure 1 can be removed and the link containing the $\bar E^2_2$ operator can be added to $V$ without affecting the entropy.) Thus, there are equivalence classes of algebras that all have the same entropy, and this explains why it is sensible to draw the entangling edge as a line cutting through links: if removing a link keeps you in the same equivalence class, then it is meaningless to ask if that link is in $V$ or not, and the entanglement edge might as well be drawn as cutting the link for illustrative purposes. A reasonable conjecture, supported by the results of [30], is that each equivalence class corresponds to a different entangling edge in the continuum path integral, with classes differing by a single generator corresponding to continuum entanglement edges differing by a single operator insertion along the edge.
We close this section by recalling that there are other choices for boundary conditions that label superselection sectors. (The detailed construction is given in [2], and here we just mention the basics.) Each choice corresponds to a commuting set of operators in the "buffer zone" of plaquettes around the entangling region, and the entanglement entropy does not depend on this choice. For instance, in $d = 2$, instead of specifying all the electric Casimirs in figure 1, we can specify the values of Wilson loops around the green plaquette, of the Casimirs of the total electric field through the green plaquette, and of electric Casimirs at the remaining boundary sites. A choice that will be particularly useful for studying the weakly coupled regime is that of magnetic boundary conditions, where one exclusively works with magnetic operators in the "buffer zone." The superselection sectors are labeled by independent values of $W^r_p$ for all plaquettes $p$ in the buffer zone. For U(1) theory, in the magnetic basis where Wilson loops are diagonalized, the sectors are labeled by a number $w_p = W_p$ for each boundary plaquette. Denoting by $w$ the set of all these labels, the entanglement entropy can be expressed analogously to (3.7), with the sectors labeled by $w$ in place of $k$; this is eq. (3.8). Below, we will give an explicit example of a calculation using these boundary conditions.

4 Example: entanglement at strong coupling
The preceding definitions are most clearly illustrated by studying the strong coupling regime on the lattice. The strong-coupling physics is dominated by the electric term in the Hamiltonian (2.6). The $g = \infty$ ground state, $|\Omega_{g=\infty}\rangle$, is the state $|\Omega\rangle$ introduced in (2.9). Given any region $V$, this state lies in the $k = (0, \ldots, 0)$ sector, and its entanglement entropy vanishes, $S_V = 0$. Lattice gauge theories confine at strong coupling, and if $g$ is sufficiently large we expect the confinement scale to be smaller than the lattice spacing. This intuition agrees with the lack of any entanglement structure at distances that can be probed by $S_V$. As we move away from infinite coupling, the ground state receives corrections from single-plaquette fundamental and antifundamental excitations $|\square_p\rangle$ and $|\bar\square_p\rangle$, cf. eq. (2.13). The corrected ground state is found using ordinary perturbation theory. This was done for $G = \mathrm{SU}(2)$ in ref. [9].
The leading term in the entanglement entropy comes from the first-order corrected state $|\Omega_{g\gg1}\rangle$. In this state, $N_P$ is the (finite) number of plaquettes on the lattice, and the effective coupling is $\lambda \equiv s g^4 N$, with $s$ being the number of links on a plaquette. The $O(1/\lambda^2)$ term serves to normalize the state. It can be shown that other $1/\lambda^2$ corrections that would come from second-order perturbation theory do not contribute to the entropy at leading order. If $\lambda^2 \sim N_P$, we need to go to higher orders in perturbation theory to obtain properly normalized states; to avoid technical complications we will assume that $\lambda^2 \gg N_P$.
After picking a subset of links $V$, applying eq. (3.7) is straightforward. The starting density matrix is $\rho = |\Omega_{g\gg1}\rangle\langle\Omega_{g\gg1}|$. States $|\Omega\rangle$, $|\square_p\rangle$, and $|\bar\square_p\rangle$ all belong to the same sector, $k = (0, \ldots, 0)$, when the plaquette $p$ has all its links in $V$ or all its links in $\bar V$. Plaquette excitations associated to $p$'s that are orthogonal to $\partial V$, i.e. that have links both in $V$ and $\bar V$, lie in different sectors. Let $\partial V_\perp$ be the set of such plaquettes. For each $p \in \partial V_\perp$, $|\square_p\rangle$ and $|\bar\square_p\rangle$ belong to a sector labeled by a nonzero $k$. The block-diagonal matrix $\rho$ is built from these sectors together with an auxiliary state in the $k = 0$ sector, whose definition involves the sum $\sum'_p$ over all plaquettes that are not orthogonal to $\partial V$. In each sector we have a pure state, so the entanglement entropy (3.7) only receives contributions from the Shannon entropy of the $p_k$'s and from the cuts through gauge-invariant states (with two $d_r = N$ piercings in each sector where $k \neq 0$); the result is eq. (4.5). This is the large-$N$ generalization of the entropy obtained in [9]. At strong coupling the entropy thus vanishes as $S_V \sim |\partial V_\perp| \, \frac{\log(g^2 N)}{g^8 N^2}$, and the planar limit only accelerates the vanishing.
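The scaling of the strong-coupling entropy can be illustrated with a rough numerical sketch. It rests on assumptions that are not fixed by the text: each plaquette in $\partial V_\perp$ is taken to contribute two sectors (fundamental and antifundamental), each with weight $\epsilon = 1/\lambda^2$ with order-one normalization factors ignored, and each such sector is taken to be pure with two piercings of dimension $N$. The function names are hypothetical, and the numbers are meant to show the scaling only, not to reproduce eq. (4.5).

```python
import math

def strong_coupling_entropy(n_cut, lam, N):
    """Shannon entropy of sector weights plus 2 log N per cut plaquette sector.

    n_cut: number of plaquettes in the set dV_perp
    lam:   effective coupling lambda = s g^4 N
    N:     dimension of the fundamental representation
    Assumption: each of the 2*n_cut nonzero sectors has weight eps = 1/lam^2.
    """
    eps = 1.0 / lam**2
    p0 = 1.0 - 2 * n_cut * eps          # weight left in the k = 0 sector
    shannon = -p0 * math.log(p0) - 2 * n_cut * eps * math.log(eps)
    piercings = 2 * n_cut * eps * 2 * math.log(N)
    return shannon + piercings

def leading_log(n_cut, lam, N):
    """Leading logarithmic behaviour, 4 |dV_perp| log(lam N) / lam^2."""
    return 4 * n_cut * math.log(lam * N) / lam**2
```

Since $\lambda N = s\,(g^2 N)^2$, the leading term is proportional to $|\partial V_\perp| \log(g^2 N)/(g^8 N^2)$ up to constants, which is the behaviour quoted above. Raising $N$ at fixed $g$ increases $\lambda$ and therefore suppresses the entropy further.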

The weak coupling limit
We now focus on the small-$g$ limit of the Hamiltonian (2.6). The conceptually simplest way to proceed is to fix the axial gauge. It may be useful to spell out this procedure in the context of our work. In the Hamiltonian formalism, going to axial gauge amounts to replacing the physical Hilbert space $\mathcal{H}$ with an isomorphic space $\mathcal{H}^\star \subset \mathcal{H}_0$ that is spanned by states $|U^\star_\ell\rangle$ located on a particular subset of all links on the lattice, the "living" links. In this nomenclature, "dead" links are the ones on which we can use a gauge transformation to set $U_\ell = \mathbb{1}$, and all remaining links are "living." On a hypercubic lattice one typically chooses all links in one direction (say, along the x-axis) to be dead, then one picks a fixed-x slice of the lattice and kills off all y-directed links on this slice, and so on through all the directions of the lattice. Once gauge freedom is completely exhausted, we are left with $N_P$ living links (one per plaquette) that directly correspond to states that span the physical Hilbert space. The states on living links are now allowed to take any value without restrictions.
It is important to stress that all gauge-invariant operators that act on $\mathcal{H}$ can still be defined on $\mathcal{H}^\star$. The Gauss law allows us to express electric operators on dead links as functions of electric operators on living links, and the gauge-fixing condition instructs us to just set $U_\ell = \mathbb{1}$ on all dead links $\ell$ that appear in magnetic operators. Thus no observables are dropped by the gauge-fixing.
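The counting of living links described above can be checked explicitly on a small lattice. The following is a minimal sketch, assuming an open (topologically trivial) rectangular lattice in d = 2 with `nx` by `ny` sites; the function names and the link labelling are illustrative choices and are not taken from the original text.

```python
def axial_gauge_links(nx, ny):
    """Split the links of an open nx x ny square lattice into dead and living.

    Links are labelled (x, y, direction), the link starting at site (x, y).
    Assumed procedure: kill every x-directed link, then kill the y-directed
    links on the x = 0 slice; everything else is living.
    """
    x_links = [(x, y, "x") for x in range(nx - 1) for y in range(ny)]
    y_links = [(x, y, "y") for x in range(nx) for y in range(ny - 1)]
    dead = x_links + [link for link in y_links if link[0] == 0]
    living = [link for link in y_links if link[0] != 0]
    return dead, living


def plaquette_count(nx, ny):
    """Number of elementary plaquettes N_P on the open lattice."""
    return (nx - 1) * (ny - 1)


dead, living = axial_gauge_links(5, 4)
print(len(living), plaquette_count(5, 4))  # living links and plaquettes agree
print(len(dead), 5 * 4 - 1)  # dead links number (sites - 1), a maximal tree
```

In this sketch the dead links form a tree that reaches every site, which is why no residual local gauge freedom is left and the number of living links matches $N_P$.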
At weak coupling, Yang-Mills theory is either in the Coulomb phase, or it confines at a length scale that grows exponentially with $1/g^2$. Thus, at sufficiently small distances and couplings the theory will always appear to be in the Coulomb phase. (See appendix A for a more precise justification of this statement.) The ground state $|\mathrm{Coulomb}\rangle$ is a direct product of dim(G) ground states of identical, free, noncompact photons. The total entanglement entropy in this state is $S_V = \dim(G)\, S_V^{(\mathrm{photon})}$. This entanglement entropy scales as $S_V \sim N^2$, in sharp contrast with the swift $\sim (\log N)/N^2$ vanishing of $S_V$ at strong coupling and large $N$, as found in eq. (4.5).
The entanglement entropy of a single noncompact photon, $S_V^{(\mathrm{photon})}$, can be computed using methods developed in section 3. In d = 2, this quantity has been investigated numerically in the gauge theory [3], and analytically in the dual scalar theory [15]. We now present a direct, analytic gauge theory calculation that shows that the entanglement entropy has the form of eq. (5.2), built from the entanglement entropy of a massless scalar, a ubiquitous coupling-dependent term $\Delta S(g)$ given in eq. (5.3), and corrections. The term $\Delta S(g)$ arises because noncompact photons, just like Nambu-Goldstone bosons, lack a zero mode; in other words, it comes about because weakly coupled gauge fields (in the right gauge) can be realized as d − 1 scalar fields with the identification $\phi(x) \equiv \phi(x) + \varepsilon$. The "corrections" are primarily terms that arise due to the presence of the edge modes; they will not be our concern in this paper, but we will outline how they are computed. The $\Delta S$ terms exist in all dimensions but the d = 2 ones have a particular significance, as we will discuss below.

Warm-up: the O(M ) model
As mentioned above, $\Delta S(g)$ is a term that can be found in systems with Nambu-Goldstone modes. Before calculating in the gauge theory, we now review how such ubiquitous logarithmic terms arise in symmetry-breaking ground states of nonlinear σ-models [25]. Consider a continuum O(M) model of radius $\sigma$ in d-dimensional flat space. The action is We wish to study the physics of $\vec n$ configurations that are all very close to a particular direction $\vec n_0$. This is a free theory, so given a region $V$, the ground state reduced density matrix for these fluctuations can be explicitly calculated to be where the kernel $Q(x, y)$ specifies the Hamiltonian $\frac{1}{2} \int d^d x \, d^d y \, \delta\vec n(x)\, Q(x, y)\, \delta\vec n(y) = \sum_k |k| \, \delta\vec n_k^\dagger \, \delta\vec n_k$ of the fluctuations $\delta\vec n = \vec n - \vec n_0$, and $\Delta(x, y)$ is a complicated kernel that arises after integrating out the modes outside $V$ and whose form we will not need. Throughout this derivation, it is assumed that fluctuations are small; the self-consistency of this assumption must be checked by computing the size of fluctuations à la Coleman-Weinberg. (For instance, from the Mermin-Wagner theorem we know that in d = 1 there will be no symmetry-breaking phase with small fluctuations.) We now decompose $\vec n$ into a soft ("zero") mode $\vec n_z$ and fluctuations $\chi^a$, $a = 1, \ldots, M - 1$, defined through$^6$ $\vec n \equiv \vec n_z \sqrt{1 - \chi^a \chi^a / \sigma^2} + \vec e_a \chi^a / \sigma$, $\vec e_a \cdot \vec n_z = 0$, $\vec e_a \cdot \vec e_b = \delta_{ab}$,

The last condition justifies the name "soft" or "zero" mode for $\vec n_z$, as it makes sure that $\vec n_z$ contains all information about modes whose wavelengths are greater than the size of $V$. In these new variables, the reduced density matrix takes the form where $\tilde\rho_V$ looks just like $\rho_V$ in eq. (5.5), but with compact fields $\vec n$ replaced by noncompact fields $\chi^a$. The dimensionless parameter $I$ can be calculated straight from eq. (5.6) and equals where $a$ is the lattice spacing/UV cutoff, and $r$ is the IR cutoff of the integral over $V$ (i.e. $r$ is the linear size of $V$). The dots represent corrections to the logarithm that are determined from the exact shape of $\partial V$. For any macroscopic region with $r \gg a$ we have $I \gg 1$, meaning that the density matrix for $\vec n_z$ is close to the identity. The reduced density operator $\rho_V$ has a very particular form. As an operator on the subspace associated to the soft mode $\vec n_z$, $\rho_V$ has matrix elements of the form $e^{-I (\vec n_z - \vec n_z')^2}$ and so, in operator form at $I \gg 1$, $\rho_V \sim e^{-\vec J^2 / 2I} \, \tilde\rho_V$, where $\vec J$ is the spin operator conjugate to $\vec n$. The exponent has the characteristic form of the "Anderson tower of states" Hamiltonian that describes the small gaps between vacua of a system with continuous symmetry breaking. This is a crucial observation: the symmetry-breaking ground state of the O(M) model is invariant under global rotations and does not depend on the zero mode of the system, and this lack of a zero mode forces the softest mode of the subsystem to have a very specific, $\sigma$-dependent entanglement spectrum with the environment.
The eigenvalues of the "tower of states" modular Hamiltonian scale as $1/I \sim [\sigma^2 r^{d-1} \log(r/a)]^{-1}$. Elements of $\tilde\rho_V$ have the form $e^{-Q(\chi - \chi')^2 - \Delta(\chi + \chi')^2}$. The presence of the $(\chi + \chi')^2$ term in the exponent makes this matrix qualitatively different from the soft mode density matrix, and indeed, in operator form $\tilde\rho_V$ is an exponential of an SHO Hamiltonian whose energy levels scale as $1/\log(r/a)$ and are independent of $\sigma$ [25]. The modular Hamiltonian thus splits into two parts, a quantum rotor that describes the soft mode states and an SHO that describes the other modes, and only the first part still carries $\sigma$-dependence. It is through this part of the density matrix that one recovers the coupling-dependent universal term announced in eq. (5.3), easily observed in the von Neumann entropy of the operator $e^{-\vec J^2 / 2I}$. The remaining terms from the soft mode entanglement entropy, such as various constants and a $\log\log(r/a)$ term, are also found in the entanglement entropy of the $\chi^a$ modes, and are not as ubiquitous as $\Delta S(g)$. The resummation of these terms is beyond the scope of this work, but ref. [25] has argued that this resummation gives the usual area law term and the accompanying subleading corrections.
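The logarithmic growth of the rotor entropy with $I$ can be made concrete in the simplest setting. The sketch below assumes a single planar rotor whose angular momentum takes integer values $m$ (an O(2)-like illustration, not the general O(M) case), with unnormalized weights $e^{-m^2/2I}$; the function name and the cutoff `m_max` are illustrative. For $I \gg 1$ the entropy should approach that of a Gaussian of variance $I$, namely $\frac{1}{2} \log(2\pi e I)$.

```python
import math


def rotor_entropy(inertia, m_max=None):
    """Von Neumann entropy of rho ~ exp(-m^2 / (2 I)) over integer momenta m."""
    if m_max is None:
        m_max = int(20 * math.sqrt(inertia)) + 10  # tail beyond this is negligible
    weights = [math.exp(-m * m / (2.0 * inertia)) for m in range(-m_max, m_max + 1)]
    norm = sum(weights)
    return -sum((w / norm) * math.log(w / norm) for w in weights if w > 0.0)


for inertia in (10.0, 100.0, 1000.0):
    exact = rotor_entropy(inertia)
    estimate = 0.5 * math.log(2 * math.pi * math.e * inertia)
    print(inertia, exact, estimate)
```

Each independent rotor direction contributes one such $\frac{1}{2} \log I$, which is how a prefactor counting the broken generators can arise; the constant $\frac{1}{2}\log(2\pi e)$ belongs to the non-universal remainder.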

In d = 2, the renormalized entanglement entropy for a circle is $F(r) = r S'(r) - S(r)$, and therefore the ubiquitous coupling-dependent term in two spatial dimensions is The remaining contribution to $F(r)$ does not depend on $\sigma$. As we flow to the IR and $\sigma$ increases, $\Delta F$ (and hence the entire $F$ quantity) decreases, as per the F-theorem.
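To see how the map $S \mapsto F$ acts on a logarithmic term, one can apply $F(r) = r S'(r) - S(r)$ to a toy entropy. The sketch below assumes a hypothetical $S(r)$ made of an area-law piece plus $c \log(e^2 r)$; the coefficients and function names are illustrative and are not values from the text. The area-law piece should drop out, leaving $c - c \log(e^2 r)$.

```python
import math


def renormalized_entropy(entropy, r, step=1e-4):
    """F(r) = r S'(r) - S(r), with S'(r) from a central finite difference."""
    derivative = (entropy(r + step) - entropy(r - step)) / (2.0 * step)
    return r * derivative - entropy(r)


def toy_entropy(r, area_coeff=3.0, log_coeff=0.5, coupling_sq=0.01):
    """Hypothetical S(r): area law plus log_coeff * log(e^2 r)."""
    return area_coeff * r + log_coeff * math.log(coupling_sq * r)


r = 2.0
print(renormalized_entropy(toy_entropy, r))
print(0.5 - 0.5 * math.log(0.01 * r))  # closed form; the area term cancels
```

A logarithm entering $S$ with coefficient $+c$ thus enters $F$ with coefficient $-c$, which is the sign convention to keep in mind when comparing with free-energy results.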

Gauge theory
We now return to the gauge theory case and repeat the same analysis. The first order of business is to find the ground state wavefunction and then reduce the density matrix in order to get the analog of eq. (5.5). The gauge-fixed Hamiltonian is where, on dead links, we set $A_\ell = 0$ and express $J_\ell^2$ in terms of living links using the Gauss law. The ground-state wavefunction can be found by expanding in small fluctuations around $U = \mathbb{1}$, diagonalizing the Hamiltonian, and determining the usual SHO ground state for each of the eigenmodes.
After using the Gauss law and expanding in fluctuations $A^a_\ell$, the Hamiltonian takes the form The ground state wavefunction is where $\langle A^a_\ell A^b_{\ell'} \rangle \propto g^2 \delta^{ab} Q^{-1}_{\ell\ell'}$ is the gluon propagator in this gauge, and This expression for the propagator and for the ground state is correct as long as there are no nonperturbative effects that cause the propagator to change at large distances. Such effects can indeed exist, e.g. they are present in theories with confinement via monopoles, but as long as we work with small enough coupling and at small enough distances, the above ansatz for the ground state will be correct. We will comment more on this issue below.
As an example, in d = 2 and at large distances compared to the lattice spacing, the matrix $Q$ is given by where $x$ and $x'$ are coordinates associated to links $\ell$ and $\ell'$. (At smaller distances one has to replace $k_i$ by $2 \sin(k_i/2)$ to get the correct lattice propagator.) As in the O(M) model, the wavefunction in terms of the original variables is invariant under global shifts in $A^a_\ell$, which is seen from the fact that $\sum^\star_{\ell, \ell'} Q_{\ell\ell'} = 0$. The density matrix corresponding to the state (5.14), (5.17) must be reduced following the prescription appropriate to the gauge theory. This is where one uses magnetic boundary conditions and eq. (3.8). If one were to ignore the presence of edge modes and the need to decompose the density matrix into subsectors, one would get a gauge-dependent result: states with a flux loop intersecting the edge $\partial V$ could be in principle represented as excitations in $V$ or outside $V$, depending on the gauge choice, and the edge modes are the gauge-invariant way to keep track of flux loops that enter and exit the entangling region.
Consider the case of one photon. The superselection sectors are labeled by the values of Wilson loops in the "buffer zone" around the region $V$; these live on plaquettes that have links both in $V$ and in $\bar V$, such as the green plaquette in figure 1. In axial gauge, a Wilson loop on a plaquette is equal to the difference of $A_\ell$'s on the two living links belonging to the plaquette. One superselection sector thus consists of all field configurations $A_\ell$ that satisfy $A_\ell - A_{\ell'} = w_p$ for two living links $\ell, \ell'$ in the edge plaquette $p$. This sector is labeled by the collection of all the $w_p$'s, denoted $\mathbf{w}$, and we will call the sector $\mathcal{H}^{(\mathbf{w})}$. We can now work sector by sector, defining the block density matrix in a given sector as where $A$ and $A'$ both belong to the sector labeled by $\mathbf{w}$. The normalization constants are defined as $p_{\mathbf{w}} = \mathrm{Tr}_{\mathcal{H}^{(\mathbf{w})}} \rho$, so the operators $\rho^{(\mathbf{w})}$ have unit trace. Now we can trace out the degrees of freedom on living links in $\bar V$, getting the reduced operator $\rho^{(\mathbf{w})}_V$ whose von Neumann entropy figures in eq. (3.8). For a general gauge group, the superselection sectors are labeled by a set of numbers $w^a_p$ for each gluon. Given a sector $\mathbf{w}$, we may now repeat the steps of the previous section and extract the density matrix for the softest mode in $V$. This density matrix will once again be the exponential of a "tower of states" Hamiltonian with prefactor proportional to $I = \frac{1}{g^2} \int_V d^d x \, d^d y \, Q_{xy}$, just like in eq. (5.8). The relevant logarithmic term in the von Neumann entropy of the soft mode in this sector will be $\Delta S_{\mathbf{w}} = \frac{\dim(G)}{2} \log I$. This term will not depend on $\mathbf{w}$ because the soft mode coupling only depends on $Q_{\ell\ell'}$ and $g^2$, so the logarithmic terms in the full entropy add up to $\Delta S(g) = \sum_{\mathbf{w}} p_{\mathbf{w}} \Delta S_{\mathbf{w}}(g) = \frac{\dim(G)}{2} \log I$ due to the normalization condition $\sum_{\mathbf{w}} p_{\mathbf{w}} = 1$.
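The sector-by-sector bookkeeping can be illustrated with a toy spectrum. The sketch below assumes the Abelian (one-photon) form of the prescription, in which the entropy of a block-diagonal density matrix is the Shannon entropy of the weights $p_{\mathbf{w}}$ plus the $p_{\mathbf{w}}$-weighted von Neumann entropies of the normalized blocks; representation-dimension terms that appear for non-Abelian groups are omitted. The block spectra and function names are hypothetical.

```python
import math


def von_neumann(eigenvalues):
    """Entropy -sum p log p of a normalized spectrum (zeros are skipped)."""
    return -sum(p * math.log(p) for p in eigenvalues if p > 0.0)


def sector_decomposition(blocks):
    """Shannon entropy of the sector weights plus the averaged sector entropies.

    `blocks` maps a sector label w to the eigenvalues of the unnormalized
    block of rho in that sector; the block traces p_w must sum to one.
    """
    weights = {w: sum(spec) for w, spec in blocks.items()}
    shannon = von_neumann(weights.values())
    averaged = sum(
        weights[w] * von_neumann([p / weights[w] for p in spec])
        for w, spec in blocks.items()
    )
    return shannon, averaged


# Hypothetical block spectra for three sectors.
blocks = {0: [0.4, 0.1], 1: [0.2, 0.05], -1: [0.2, 0.05]}
shannon, averaged = sector_decomposition(blocks)
full = von_neumann([p for spec in blocks.values() for p in spec])
print(shannon + averaged, full)  # the two agree
```

Because the averaged piece is linear in the sector entropies, any contribution that is the same in every sector, such as the $\log I$ term, passes through the average unchanged.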
We will prove that $g^2 I = |\partial V| \, f(|\partial V| / a^{d-1})$, where $f(x)$ is a slowly varying function, e.g. a log. For notational simplicity, let us focus on d = 2 with $V$ being a region of linear size $r$. We now wish to show that $g^2 I \propto r f(r/a)$. The most direct argument goes as follows. If we let $Q_{xy} = Q(x - y)$, the function $Q(x)$ will behave as $Q(x) \sim x_1^2 / |x|^5$ at $|x| \gg a$, assuming that the full system size is much larger than any other scale. Any divergences that appear in $I$ will come from the region where $x$ and $y$ are close to each other. Up to constant prefactors, the divergent pieces will be the same as in $\int_V d^2 x \, d^2 y \, |x - y|^{-3}$. After integrating out $y$ from this integral, we will be left with

By dimensional analysis and symmetries, $c$ must not depend on $x$ and $f_0(x)$ cannot have a divergence worse than $1/|x - x_0|$ for some set of points $x_0 \in V$. Integrating over $x$ gives eq. (5.19), where $f$ contains nothing worse than a logarithmic divergence. Now, if $V$ had been the entire system and if we had used the exact lattice propagator, $I$ would have been zero by the definition of $Q_{xy}$ in (5.16). This is consistent with the volume term $c|V|/a$ found above only if $c$ renormalizes to zero when using the exact propagator. This proves that $g^2 I = r f(r/a)$ in a continuum description. We have checked this numerically for fixed superselection sectors on small (up to 100 × 100), d = 2, square lattices. A detailed numerical analysis analogous to [33,34] could be used to determine the exact form of $f(r/a)$ and therefore the remaining terms in the entanglement entropy.
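The statement that the kernel sum vanishes for the whole system but not for a subregion can be illustrated on a small periodic lattice. The sketch below is not the gauge-fixed kernel $Q_{\ell\ell'}$ of the text, whose explicit form is not reproduced here; as a stand-in it assumes a scalar kernel with lattice dispersion $\omega(k) = \sqrt{\sum_i 4 \sin^2(k_i/2)}$, which shares the key property of annihilating the constant mode. All names are illustrative.

```python
import math


def lattice_kernel(n):
    """Real-space kernel Q(dx, dy) on an n x n periodic lattice, from omega(k)."""
    momenta = [2 * math.pi * j / n for j in range(n)]
    kernel = {}
    for dx in range(n):
        for dy in range(n):
            total = 0.0
            for kx in momenta:
                for ky in momenta:
                    omega = math.sqrt(
                        4 * math.sin(kx / 2) ** 2 + 4 * math.sin(ky / 2) ** 2
                    )
                    total += omega * math.cos(kx * dx + ky * dy)
            kernel[(dx, dy)] = total / n ** 2
    return kernel


def soft_mode_sum(kernel, n, r):
    """Sum of Q_xy over all pairs x, y in an r x r block (plays the role of g^2 I)."""
    sites = [(x, y) for x in range(r) for y in range(r)]
    return sum(
        kernel[((x1 - x2) % n, (y1 - y2) % n)]
        for (x1, y1) in sites
        for (x2, y2) in sites
    )


n = 8
kernel = lattice_kernel(n)
for r in (2, 4, 8):
    print(r, soft_mode_sum(kernel, n, r))  # vanishes only when r = n
```

For a proper subregion the sum is strictly positive because the indicator function of the block is not a zero mode of the kernel; extracting the boundary scaling reliably would require much larger lattices than this toy size.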
An analytic argument in favor of the $I \sim |\partial V|$ scaling in any $d$ goes as follows. Instead of doing the microscopic calculation outlined above, let us construct a toy model. We wish to study slow fluctuations of the gauge field in the region $V$ and to replace the entire field $U_\ell$, $\ell \in V$, with a single effective degree of freedom $U$. The effective Lagrangian for $U$ and its environment $W$ is given in eq. (5.20). The average coordinate $U$ in this region has coupling $c_U \sim |V| \sim r^d$, where $r$ is the characteristic size of $V$. If $V$ is much smaller than the entire system, the relative weight $c_r = c_U c_W / (c_W + c_U)$ will also scale like $r^d$. On the other hand, the coupling $J$ in (5.20) is the characteristic spin-wave coupling of a system to its boundary conditions that scales as $J \sim r^{d-2}$ for $d \geq 2$ [25]. Thus, the coupling of the soft (and only) mode in the effective, toy-model description is found to scale as $\xi^2 \sim 1/\sqrt{c_r J} \sim 1/r^{d-1}$. The ground state wavefunction of the above Lagrangian is given in eq. (5.22), in terms of the small fluctuations defined as $U = W e^{i A^a T^a}$. Thus, the soft mode coupling should be $I = 1/\xi^2 \sim |\partial V|$. This argument does not prevent possible multiplicative $\log r$ terms in $I$ since the scaling behavior is just captured by the leading powers of $r$, but this is good enough for our purposes.
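The power counting in the toy model can be verified numerically. The sketch below assumes that every order-one prefactor equals one, so that $c_U = r^d$, $c_W = L^d$ for a system of size $L \gg r$, and $J = r^{d-2}$; these normalizations and the function names are illustrative assumptions. The log-log slope of $I = \sqrt{c_r J}$ should come out close to $d - 1$.

```python
import math


def soft_mode_coupling(r, d, system_size=1.0e6):
    """I = 1/xi^2 = sqrt(c_r * J) with c_U ~ r^d, c_W ~ L^d, J ~ r^(d-2)."""
    c_u = r ** d
    c_w = system_size ** d
    c_r = c_u * c_w / (c_u + c_w)  # relative (reduced) weight
    spin_wave_coupling = r ** (d - 2)
    return math.sqrt(c_r * spin_wave_coupling)


def scaling_exponent(d, r1=10.0, r2=20.0):
    """Log-log slope of I(r) between two region sizes."""
    ratio = soft_mode_coupling(r2, d) / soft_mode_coupling(r1, d)
    return math.log(ratio) / math.log(r2 / r1)


for d in (2, 3, 4):
    print(d, scaling_exponent(d))  # close to d - 1
```

The slope departs from $d - 1$ only once the region becomes comparable to the whole system, where the reduced weight $c_r$ stops tracking $c_U$.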
Putting everything together, we can conclude that the entanglement entropy has a coupling-dependent term of the form

In the continuum limit in d = 2, however, only the combination $g^2 |\partial V|$ remains finite, so by writing $\log(|\partial V| / g^2) = \log(g^2 |\partial V|) - 2 \log g^2$ we can extract the term in the coupling-dependent piece with a regular continuum limit, $\Delta S(e) = \frac{1}{2} \dim(G) \log(e^2 r)$, (5.24) where $e$ is the continuum coupling and $r$ is the linear size of the region $V$. This is the advertised result (1.2). Comparing it to other terms in the universal part of the d = 2 entanglement entropy, we see that $\Delta S(e)$ at small $e^2 r$ is parametrically larger than the leading universal term (the $F$ term) which takes on $e$-independent values of order unity.
What happens for $d \geq 3$? The coupling-dependent piece (5.23) still has the same form. The continuum coupling is $e^2 = g^2 a^{d-3}$, meaning that the finite coupling-dependent piece of the entropy is Thus, in d = 3 the log term from the soft modes gets "contaminated" with universal constant terms from the other modes. Similar contamination of $\Delta S$ with $\log(r/a)$ terms will happen for other odd $d$. For even values of $d$ a ubiquitous $\Delta S$ term does exist as long as the weakly coupled ground state still displays symmetry breaking, but for $d > 2$ it will not be the dominant universal piece. In all dimensions, we find that $\Delta S$ is proportional to $d - 1$, the number of independent scalar degrees of freedom contained in a gauge field; this is the origin of the $d - 1$ prefactor in eq. (5.2). We will not give any specifics on the "correction" terms in eq. (5.2), but now we see that they come from the $\log g^2$ piece and the subleading behavior of the soft mode coupling $I$, from the von Neumann entropies of all the other modes that do not depend on $g^2$, and from the Shannon entropy of the edge modes which will also depend on $\log g^2$, as per the definition of the $p_{\mathbf{w}}$'s. The latter was shown to provide a contribution comparable to the one of a single scalar in [11]. None of these entropies can contain the combination $\log(e^2 r)$, however; the term we have found is indeed ubiquitous in d = 2.$^7$ Finally, we emphasize that just like in the O(M) model of the previous section, the above calculation on its own gives precious little insight into when the weak coupling, symmetry-breaking description is valid. The result above should thus be interpreted to mean that the Coulomb phase of a sufficiently weakly coupled gauge theory will always have this logarithmic term present, but the presence of the Coulomb phase must be determined by other considerations.$^8$ It remains to be seen whether the presence of $\Delta S$ terms in the entanglement entropy of a gauge theory always implies that the theory is in a Coulomb phase.
7 In d = 3 this lack of zero mode will shift the coefficient of the $\log(r/a)$ term in the entanglement entropy by the shape-independent constant dim(G). In this paper we have not discussed the signature of this weak-coupling entanglement in relation to the usual universal $\log(r/a)$ terms in odd $d$, but further discussion on this subject can be found in ref. [36] that appeared after the first version of this paper.
8 As we will comment below, nonperturbative effects may wash out this logarithmic term. This is precisely what happens for the d = 2 U(1) gauge theory and the O(2) model as one goes deeper into the IR. We make the plausible assumption that for any $d$ and $G$ there exists a coupling regime at which the theory looks deconfined at small enough length scales.
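The separation of the d = 2 logarithm into a cutoff-independent piece and a divergent piece can be checked with a few numbers. The sketch below assumes $g^2 = e^2 a$ (the d = 2 case of $e^2 = g^2 a^{d-3}$) and a boundary size $|\partial V| = r/a$ measured in lattice units; the numerical values and the function name are illustrative.

```python
import math


def split_log(e_sq, r, a):
    """Split log(|dV| / g^2) into log(g^2 |dV|) and -2 log g^2 in d = 2.

    Assumes g^2 = e^2 * a and |dV| = r / a (boundary in lattice units).
    """
    g_sq = e_sq * a
    boundary = r / a
    finite_part = math.log(g_sq * boundary)
    cutoff_part = -2.0 * math.log(g_sq)
    total = math.log(boundary / g_sq)
    return finite_part, cutoff_part, total


for a in (1e-1, 1e-2, 1e-3):
    finite_part, cutoff_part, total = split_log(e_sq=0.05, r=2.0, a=a)
    print(a, finite_part, cutoff_part, total)
```

As the lattice spacing shrinks, the first piece stays fixed at $\log(e^2 r)$ while the second grows without bound, which is why only the former survives as a regular continuum term.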

Related results in d = 2 Maxwell theory
It is instructive to review how other calculations relate to the d = 2 result, eq. (5.24), for gauge group G = U(1). This case has been the subject of much study, both due to the simplicity and richness of the gauge theory (see appendix B), and because it is possible to dualize the theory to a scalar one (see appendix C; invariance of the entanglement entropy under dualities will be discussed in the Conclusion). In particular, instead of the above Hamiltonian analysis, it is possible to carry out an explicit replica-trick calculation in the dual scalar theory, and indeed the entanglement entropy of the compact Maxwell theory on a disk of radius $r$ was found to contain a $\frac{1}{2} \log(e^2 r)$ term when $e^2 r \ll 1$, with the symmetry-breaking description becoming invalid at $e^2 r \gg 1$ [15]. (The path integral method has also been used to reproduce the log term in [25], although the validity of the calculation at large $e^2 r$ was not the focus of that work.) The disappearance of $\Delta S(e)$ in the IR, when $e^2 r \gg 1$ but still below the confinement scale, is realized as a nonperturbative effect in the path integral language. It is less clear how this effect is realized directly in our Hamiltonian gauge theory analysis; it is related to the fact that the free photon propagator, $Q_{\ell\ell'}$, starts changing at distances much greater than $1/e^2$, invalidating the initial expression for the wavefunction $\psi[A]$ in (5.14). A detailed understanding of IR effects on the $\Delta S$ term for various $d$ and $G$ remains a topic for future work.
A less direct reproduction of the same result appears in [14], where the renormalized free energy of compact Maxwell theory on a three-sphere of radius $r$ was found to contain a term $-\frac{1}{2} \log(e^2 r)$, and the presence of this term was linked to constant gauge transformations/zero modes on the sphere. While this free energy is very closely related to the renormalized entanglement entropy $F(r)$, these quantities do not match for nonconformal theories. It is interesting that the universal logarithmic term does seem to match; this suggests that similar universal terms might be extractable by performing free energy calculations on spheres.
A numerical lattice calculation for $F(r)$ of a noncompact photon [3] has yielded the same kind of term, $-\frac{1}{2} \log(r\mu)$, for an undetermined, UV-independent $\mu$, by computing the entropy of the "truncated scalar algebra," i.e. of the operator algebra containing only derivatives of a free scalar field. Working with the truncated algebra is equivalent to working with Nambu-Goldstones/noncompact photons, i.e. scalars that are simply missing a zero mode. Our analysis, moreover, shows that the constant $\mu$ is determined by the choice of normalization of the gauge field: while the algebra of observables is invariant under field rescalings (since these are canonical transformations), the entanglement entropy is not [3], and hence the operator algebra should be supplemented by a rule to fix this scaling ambiguity. Physically, fixing this scaling amounts to choosing the compact theory whose gauge-fixing or spontaneous symmetry breakdown gives the needed theory of Nambu-Goldstones.
Finally, we point out that the logarithmic term found above is analogous to the topological entanglement entropy found in systems with discrete Abelian gauge groups. Consider the $\mathbb{Z}_2$ gauge theory in d = 2. Its entanglement entropy famously contains the universal term $-\log 2$, the topological entanglement entropy [6,18]. The dual of this gauge theory is the Ising model defined up to a global spin flip. On its own, the Ising model has no topological entanglement entropy; the $-\log 2$ must come from the global $\mathbb{Z}_2$ ambiguity. This ambiguity can be viewed as the omission of the zero mode in the Ising model, in analogy with the U(1) case we studied in detail. We will further comment on this in the Conclusion.

Conclusion
This paper has presented a computation of entanglement entropy directly in Yang-Mills gauge theory on a lattice. In particular, we have provided a transparent connection between logarithmic terms in the entanglement entropy and the lack of the zero mode in the theory; the arguments in this paper complement the results of papers [15,25] from a gauge theory point of view. Our results are particularly significant in d = 2, where the logarithmic terms presented here dominate the universal part of entanglement entropy at weak coupling. We now review several points of interest that may warrant further work.
In section 4 we have shown that at strong lattice coupling the entanglement entropy vanishes as $(\log N)/N^2$ in the planar limit. Conversely, in section 5 we have shown that at sufficiently weak coupling the entropy scales as $N^2$. The entanglement entropy thus jumps by an infinite amount as the gauge coupling is dialed from strong to weak over a finite interval; this establishes the presence of a lattice phase transition in the planar limit. This result agrees with previous lattice studies [35] and establishes the entanglement entropy as a good order parameter for confinement that can be calculated directly in the gauge theory. The grand prize, proving the presence of the phase transition at small enough lattice couplings such that the continuum limit is applicable, as done holographically in [26], remains beyond our abilities for now.
Reference [15] has already shed significant light on the origin of logarithmic terms in the d = 2 Maxwell theory, but it has nevertheless relied heavily on the Maxwell-scalar duality. How invariant is the entanglement entropy under such dualities? Our calculation has shown that the logarithmic term is present on both sides of the duality. It would be of great interest to understand whether all universal terms are preserved under all dualities of Kramers-Wannier type. Progress on this front is most easily accessible by studying the Ising model and its dual $\mathbb{Z}_2$ gauge theory in d = 2. Here it is possible (and easy) to track how the maximal algebra $\mathcal{A}_V$ in a region $V$ of the gauge theory maps across the duality; the result is that $\mathcal{A}_V$ maps to an algebra $\tilde{\mathcal{A}}_{\tilde V}$ on a set of sites $\tilde V$ on the dual lattice, with $\tilde{\mathcal{A}}_{\tilde V}$ generated by operators $\sigma^x_i$ for $i \in \tilde V$ and $\sigma^z_i$ for $i \in \mathrm{Int}\, \tilde V = \tilde V - \partial \tilde V$. In other words, the maximal algebra in the gauge theory does not map to the maximal algebra in the dual theory. (Repeating this analysis in d = 3 would show that the maximal algebra, i.e. the electric center choice, maps to the algebra known as the magnetic center choice in the dual gauge theory.) In d = 2, this is reassuring, as the entanglement entropy of the weakly coupled gauge theory could not be equal to the usual entanglement entropy of the strongly coupled Ising model, which lacks the usual area law due to disorder at strong coupling. Fleshing out this duality will be the subject of a future publication.
In a similar vein, at several points we have alluded to an incomplete dictionary between alternative algebra choices and operator insertions in the continuum. Writing this correspondence has been initiated by [30] for the Ising model in d = 1. Continuing this program for other theories, both gauge and pure matter, would be a fruitful task.

The analysis in this paper has relied heavily on lattice gauge theory techniques. It would be of interest to develop a continuum approach that takes into account all the subtleties that come with a gauge theory but that does not require working directly with the lattice. Initial steps have been outlined in [11], and following this program through could lead to a versatile definition of entanglement entropy that can be more readily connected to path integral calculations using the replica trick.
Finally, this paper deals with a rather vast, formal topic: defining entanglement entropy in a theory with nonlocal degrees of freedom. Extending the kind of analysis given in this paper to other theories with gauge constraints, in particular to gravity, would be extremely interesting and is already a subject of investigation [37].
Acknowledgments I would like to thank Steve Shenker for his continuous support and many insightful ideas and questions. It is also a pleasure to thank Sinya Aoki, Shamik Banerjee, Horacio Casini, William Donnelly, Marina Huerta, Chao-Ming Jian, Edward Mazenc, Kantaro Ohmori, Masahiro Nozaki, Xiao-Liang Qi, Matt Roberts, Lenny Susskind, Yuji Tachikawa, Sandip Trivedi, and members of many audiences that have heard preliminary versions of this work; the questions and comments raised by all these people have helped direct and sharpen the arguments presented here. Finally, I thank Lina Wu for alerting me to a mistake in the original manuscript. The author is supported by a Stanford Graduate Fellowship.

A Weak coupling details
In this section we start from the gauge-fixed Kogut-Susskind Hamiltonian and analyze the weak coupling limit more thoroughly, justifying the statement that the ground state looks like dim(G) decoupled photons. Let us start by asking what happens at g = 0. The naïve (and wrong/incomplete) answer is simple: the electric part of the Hamiltonian disappears and the ground state is an eigenstate of the position operators, defined in (2.1), such that $W_p = \bar W_p = N$ on each plaquette. Since our lattice is topologically trivial, the only axial gauge configuration that satisfies this has $U = \mathbb{1}$ on all living links. This is the analog of the topological state $|\mathrm{topo}\rangle$ found in $\mathbb{Z}_\kappa$ gauge theories, e.g. the toric code in 2 + 1 dimensions [31]. In the electric basis this state becomes a sum over all representations on all living links, i.e.
Here $\prod^\star_\ell$ denotes the product over all living links. This is a sum over infinitely many states, and this is reflected in the logarithmic divergence of the entanglement entropy due to the presence of infinitely many superselection sectors with equal probability. This divergence goes back to the bad behavior of the norm, $\langle \mathrm{topo} | \mathrm{topo} \rangle = \prod^\star_\ell \delta(0)$, as expected for a state where each link has a definite position. This issue appears in the naïve g = 0 regime of any gauge theory with continuous gauge group, and as we will now show, it is an artifact of carelessly taking the weak coupling limit.

We must deal with this infinity in order to meaningfully talk about the weak coupling regime. To this end, we regulate the Lie manifold of $G$ with a short-distance cutoff $\varepsilon$. This way the allowed values of $U_\ell$ are of the form $e^{i \varepsilon n^a_\ell T^a}$ with $n^a_\ell \in \mathbb{Z}$. (This is a generalization of regulating the U(1) gauge group with $\mathbb{Z}_\kappa$ and taking $\varepsilon = 1/\kappa$.) In principle, this regularization should have further $\varepsilon^2$ and higher terms in the exponent in order to properly mimic the curvature of the Lie manifold. These corrections will not be important for our purposes, as we will focus on excitations with $|n^a_\ell| \ll 1/\varepsilon$; these will be the ones relevant for describing the physics at very small coupling where the field configurations tend to be very close to $U_\ell = \mathbb{1}$.$^9$ The effective Hamiltonian acting on these basis states is found by expanding (2.6) in $\varepsilon$: Here we use $\Delta/\Delta n$ to denote the discrete difference operator. The sums over links go over both living and dead links; on each dead link we set $n^a_\ell = 0$ and express $\Delta/\Delta n^a_\ell$ through difference operators on living links. In this limit the colors decouple and we can write $H_{\mathrm{eff}} = \sum_a H^a_{\mathrm{eff}}$. This is the first indication that in the weak coupling regime the theory in a sense behaves as dim(G) decoupled photons, which is of course the setup familiar from e.g. weakly coupled QCD. Because of this we might expect the entanglement entropy to scale with $\dim(G) \sim N^2$.
The crucial observation that allows us to understand the behavior of $H_{\mathrm{eff}}$ is that there exist two very different extremal regimes, $\varepsilon \gg g$ and $\varepsilon \ll g$. The regime we are in depends on how $\varepsilon$ and $g$ scale as they are both taken to zero. When $\varepsilon \ll g$, the $\varepsilon^2/g^2$ term in the "magnetic" part of $H_{\mathrm{eff}}$ is very small, and nearby configurations $n^a_\ell$ and $n^a_\ell + 1$ have infinitesimal energy differences. In other words, we may replace the states $\{|n^a_\ell\rangle\}$ at fixed $a$ and $\ell$ with a continuous set $\{|X^a_\ell\rangle\}$ with $X^a_\ell \in (-1/g, 1/g) \approx \mathbb{R}$, so $H_{\mathrm{eff}}$ describes dim(G) decoupled systems of harmonic oscillators (noncompact photons), each with Hamiltonian $H^a_{\mathrm{eff}} = -\sum_\ell \left( \frac{\partial}{\partial X^a_\ell} \right)^2 + \sum_p \left( \sum_{\ell \in p} X^a_\ell \right)^2$. This is the Coulomb phase, and we will denote the corresponding ground state by $|\mathrm{Coulomb}\rangle$. However, when $\varepsilon \gg g$, we cannot so carelessly rescale the fields and get a nice description in terms of continuous oscillators. At finite $\varepsilon/g$, $H_{\mathrm{eff}}$ describes a collection of particles moving in a quadratic potential on a one-dimensional lattice, with $\varepsilon/g$ being the spacing between the sites; as this spacing is taken to infinity, the oscillators all freeze into $n^a_\ell = 0$. This frozen-out configuration is precisely the topological state $|\mathrm{topo}\rangle$.$^{10}$ The limit $\varepsilon \gg g$ is exactly the regime in which our earlier g = 0 discussion was applicable. At finite $\varepsilon/g$ the intermediate ground state wavefunctions can be obtained in terms of Mathieu functions [39].

In order to access the weak coupling regime of a theory with a continuous gauge group, we must take $\varepsilon$ and $g$ small while keeping $\varepsilon \ll g$, and hence we cannot naïvely set g = 0 from the outset. If we are working with a discrete gauge group with $\kappa \gg 1$ elements, however, we are at liberty to take the coupling to zero with any ratio $\varepsilon/g \sim 1/\kappa g$; depending on this ratio the ground state interpolates between a topological state and the ground state of weakly coupled noncompact photons. For $\mathbb{Z}_\kappa$ gauge theory this crossover (or transition, depending on $d$) between the Higgs and Coulomb regimes is a classic result [40].

B Abelian gauge theory in d = 2
In this appendix we review the salient properties of Abelian gauge theories in d = 2. This section lies somewhat outside the main line of development of the paper, but we include it for completeness of presentation.

B.1 Lessons from the continuum
For most of this section we will focus on the U(1) theory in d = 2 spatial dimensions. Let us start from some continuum considerations; a good review is [41]. A general renormalizable Lagrangian for the compact U(1) theory, eq. (B.1), has a Maxwell term and a Chern-Simons (CS) term; here $\kappa \in \mathbb{Z}$ is the CS level and a is the lattice spacing used to regulate the theory in the UV. In the continuum description we have $A_\mu \in \mathbb{R}$, but the compactness is still recorded by the fact that the physics must be invariant under $A_\mu \to A_\mu + \partial_\mu \Lambda$ for Λ ∈ [0, 2π). The single propagating degree of freedom in this theory has mass $\kappa e^2$. At distances greater than $\xi = 1/\kappa e^2$ the theory behaves like pure CS at level κ, and below ξ the theory behaves like the compact Maxwell theory, whose propagators get nonperturbatively screened to zero at lengths $l_{\rm conf} \sim \exp\left(\frac{1}{a e^2}\right)$ [38]. (Typically one has $\xi \ll l_{\rm conf}$, so this effect is invisible; see [42] for a more thorough discussion of monopole screening in the presence of CS terms.) A related theory is the noncompact CS-Maxwell, which has the same Lagrangian $\mathcal{L}$ as (B.1), except that κ, $A_\mu$, and Λ all take values in $\mathbb{R}$. Physically, the only difference compared to the compact case is that now there is no nonperturbative screening of Maxwell propagators, so at distances below ξ the theory genuinely looks like just a free (noncompact) photon.
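To get a feel for the hierarchy of scales quoted above, the following sketch evaluates ξ = 1/(κe²) and the screening length exp(1/(ae²)) for sample parameters. Order-one prefactors are dropped, as in the scaling estimates themselves, and lengths are measured in units of the lattice spacing; the function name and the sample values are illustrative assumptions.

```python
import math

def cs_maxwell_scales(kappa, e2, a=1.0):
    """Return (xi, l_conf) with order-one factors dropped.

    xi     = 1 / (kappa * e^2)   : below this length the theory looks Maxwell-like
    l_conf ~ exp(1 / (a * e^2))  : nonperturbative screening length of compact Maxwell
    """
    xi = 1.0 / (kappa * e2)
    l_conf = math.exp(1.0 / (a * e2))
    return xi, l_conf

# Hypothetical sample point: level 2, weak continuum coupling e^2 = 0.1, a = 1.
xi, l_conf = cs_maxwell_scales(kappa=2, e2=0.1)
print(f"xi = {xi:.3g}, l_conf ~ {l_conf:.3g}, xi << l_conf: {xi < 1e-2 * l_conf}")
```

For weak coupling the exponential makes the screening length far larger than ξ, which is the sense in which the screening is typically invisible.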
This ubiquitous UV/IR structure has interesting consequences in light of the F-theorem, which states that the renormalized entanglement entropy $F(r) \equiv r S'(r) - S(r)$ of a circle of radius r is a monotonically decreasing function of r. Let $F_{\kappa, e}(r)$ be this F-function for the CS-Maxwell theory (B.1), compact or not. The continuum properties discussed above imply that
$$F_{\kappa, e}(r) = \begin{cases} F_{0, e}(r), & r \ll \xi, \\ F_{\kappa, \infty}(r), & r \gg \xi. \end{cases} \qquad \text{(B.2)}$$
For pure CS it is known that $F_{\kappa, \infty}(r) = \frac{1}{2} \log \kappa$, so by monotonicity we have $F_{0, e}(r) > \frac{1}{2} \log \kappa$ for any κ and any r ≤ ξ. This way one can justify finding a logarithmic divergence in the entanglement entropy of small regions in the pure Maxwell theory in the continuum and at any coupling. However, this conclusion is at odds with the entanglement entropy one could calculate for a Maxwell theory on a lattice. A simple way out is to notice that the CS term is chiral while the Maxwell one is not, and hence the RG flow of pure Maxwell theory will never generate a CS term, and so the above analysis does not say anything about pure Maxwell. However, we can run the above argument for two uncoupled CS-Maxwell theories with opposite levels, or (almost equivalently) for two Maxwell theories coupled with a BF term, and we still have the same conundrum (and no parity arguments to save us).
This tension is resolved in the following way. While the reviewed properties of continuum CS-Maxwell theories are all correct, they are not the whole story. The information missing in (B.2) is that $F_{\kappa, e}(r) = F_{0, e}(r)$ holds at best only at $a \ll r \ll \xi$. An easy way to see this is to keep e fixed and increase κ until $\xi = 1/\kappa e^2 \sim a$; for any $\kappa > 1/a e^2$ the presence of the CS level will be felt at all length scales and the theory will nowhere behave like pure Maxwell. Conversely, since CS-Maxwell is a massive theory, it must be defined with a UV cutoff and this UV completion will always know about the CS level κ. In fact, at each κ there exists a distinct UV theory at the lattice scale; the free Maxwell theory is not a UV fixed point that controls the flow down to all CS-Maxwell theories, and no CS term will ever be generated by flowing from the pure Maxwell theory. Thus the above F-theorem argument only ensures that far enough from both the deep UV and the deep IR, where the CS-Maxwell theory looks like a pure Maxwell theory, a term of the form $\log \kappa = -\log(e^2 r)$ appears in F(r). In the deep IR, at $e^2 r \gg 1$, this term becomes log κ (or zero, if there is no CS term), and in the deep UV, at r ∼ a, the log is replaced by a quantity proportional to $-\log g^2$, the logarithm of the bare coupling. In the main body of the paper we explicitly show that this is a correct prediction and that the entanglement entropy in the continuum Maxwell theory indeed contains the expected, UV-independent logarithmic term.
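The way a logarithm in S(r) feeds into F(r) can be checked numerically. The sketch below assumes a toy entropy $S(r) = c\,r/a + \frac{1}{2}\log(e^2 r)$, i.e. an area-law piece plus the logarithmic term for a single photon; the coefficient c, the sample coupling, and the finite-difference derivative are illustrative choices rather than anything computed in the text.

```python
import math

def toy_entropy(r, e2, c=3.0, a=1.0):
    """Toy S(r): area-law piece c*r/a plus the term (1/2) log(e^2 r)."""
    return c * r / a + 0.5 * math.log(e2 * r)

def f_function(entropy, r, h=1e-5):
    """F(r) = r S'(r) - S(r), with S' from a central finite difference."""
    slope = (entropy(r + h) - entropy(r - h)) / (2.0 * h)
    return r * slope - entropy(r)

e2 = 0.01  # hypothetical weak coupling, so that e^2 r << 1 for the radii below
for r in (1.0, 5.0, 25.0):
    numeric = f_function(lambda x: toy_entropy(x, e2), r)
    closed_form = 0.5 - 0.5 * math.log(e2 * r)
    print(f"r = {r:5.1f}  F numeric = {numeric:.6f}  F closed form = {closed_form:.6f}")
```

Under these assumptions the area-law piece cancels in $r S' - S$, leaving $F(r) = \frac{1}{2} - \frac{1}{2}\log(e^2 r)$, which decreases with r as the F-theorem requires.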

B.2 Lessons from the lattice
We can support the above points by explicitly constructing a lattice theory whose continuum behavior is described by a CS-Maxwell theory at given κ and e. This construction is standard in condensed matter lore, where it is known that the U(1) × U(1) CS theory at level κ, whose Lagrangian carries the prefactor $\frac{i\kappa}{4\pi}$, describes the topological phase of the $\mathbb{Z}_\kappa$ lattice gauge theory (see e.g. [43,44] for κ = 2 incarnations of this idea). Extending this idea, it can be shown that $\mathbb{Z}_\kappa$ gauge theory at coupling g and lattice spacing a is a UV completion of such a U(1) × U(1) CS-Maxwell theory at level κ and coupling e(g, a) for both Maxwell fields.¹¹ We will spell out below how the IR behavior of $\mathbb{Z}_\kappa$ coincides with the CS-Maxwell theory, but we may already notice that the topological state $|\text{topo}\rangle$ of this lattice gauge theory has renormalized entanglement entropy F(r) = log κ, which is precisely (and reassuringly) the F-function of two CS theories at level κ.

Figure 2. The level κ is an integer and is represented as continuous just for convenience. The thick line roughly connects critical couplings at which deconfinement happens (the transition is second order for κ = 2 but may be first order for other levels). The dashed line follows κ = 1/g, roughly indicating where the weakly coupled ground state crosses over from a noncompact photon to the topological state. The shaded region, given by κ ≫ 1 and g ≫ 1/κ, is where the theory looks like a compact Maxwell lattice theory. At 1 ≫ g ≫ 1/κ (the leftmost part of the shaded region) the theory confines at very large distances due to the Polyakov mechanism, and at distances below this confinement scale the theory behaves as noncompact Maxwell, just as in the Coulomb phase.
The phase diagram of the $\mathbb{Z}_\kappa$ theory can be worked out by the machinery developed in the previous two sections. We set the lattice spacing to be a = 1. At any κ the theory confines at large enough g. At 1 ≫ g ≫ ε = 1/κ, the theory at short distances appears to be in the Coulomb phase. However, in this limit $\mathbb{Z}_\kappa$ begins to look like U(1), so the Polyakov mechanism gives the photons a very small mass gap and screens them at very long distances, meaning that the theory is actually confining. As ε/g is dialed away from zero and towards infinity, the Coulomb phase crosses over to the topological phase. At high enough κ, the would-be-Coulomb-but-actually-confined phase undergoes a deconfinement transition, has a very short transient behavior, and settles into the topological phase. This behavior is shown in figure 2.
These phases precisely translate into the previously described regimes of continuum CS-Maxwell. As we have repeatedly emphasized, in the ε ≪ g region the gauge group is effectively U(1) and the continuum theory is compact Maxwell, which is always in the confined phase. When the coupling g is weak and ε/g ∼ 1, the oscillations about the ground state start becoming suppressed because the photons are restricted to take values on a discrete grid of spacing ε/g; this makes the photon in the Coulomb phase massive, which corresponds to giving it mass $\kappa e^2$ in the continuum, the hallmark of the CS-Maxwell theory. (It would be interesting to work out exactly how ε/g translates to $\kappa e^2$.) Finally, as the topological phase is reached at ε ≫ g, all the fluctuations become infinitely gapped and we are left with the topological state described by the pure CS theory in the continuum.
The Maxwell theory on a lattice, being obtained by sending κ → ∞ before taking any other limits of the $\mathbb{Z}_\kappa$ theory, is thus not described by an effective CS theory even in the deep IR. In particular, the compact Maxwell theory on a lattice will always be trivial in the IR due to the Polyakov mechanism, and the noncompact Maxwell theory on a lattice will be scale-invariant and will not flow under the RG.
Even though the pure Maxwell theory on a lattice does not have a topological phase, coupling it to other theories can allow it to flow to something topological in the IR. In particular, coupling Maxwell theory to scalars of charge κ ≥ 2 can Higgs it and lead to a $\mathbb{Z}_\kappa$ gauge theory [45]. In the continuum language, this means that scalar QED$_3$ can be written as a purely topological theory in the IR, which is a fact often used in studies of the fractional quantum Hall effect [46]. Conversely, coupling a pure Maxwell theory to a massive fermion will lead to just a U(1) CS theory, and indeed coupling any gauge theory to fermions is the right way to access the chiral CS regime, even for non-Abelian groups [47]. It would be fascinating to extend the work in this paper to gauge-matter theories and to verify that the F-theorem holds for them too.

C Maxwell-scalar duality in d = 2
In the Hamiltonian formalism on a d = 2 lattice, the Maxwell-scalar duality is the operator map (C.1), (C.2), where $\ell(p_1, p_2)$ is the link between adjacent plaquettes $p_1$ and $p_2$. The conjugate operators $\phi_p$ and $\pi_p$ describe a scalar theory on the dual lattice. At small g, the Hamiltonian is that of a free massless field, $\frac{1}{2} \sum_p \pi_p^2 + \frac{1}{2} \sum_{\langle p_1, p_2 \rangle} (\phi_{p_1} - \phi_{p_2})^2$. Eigenvalues of $J_\ell$ are integers, and hence $\phi_p$ has eigenvalues in $g\mathbb{Z}$, so at small gauge coupling the dual scalar takes values in a continuous set. The definition above specifies $\phi_p$ only up to a global shift, $\phi_p \to \phi_p + g$. This is crucial: it means that the dual scalar theory is not a simple free field, but rather a "truncated scalar" (a noncompact photon or a Nambu-Goldstone boson), i.e. a scalar field in which configurations related by global shifts by g have been identified. This "gauging" of the shift symmetry will lead to the promised appearance of log terms in the entanglement entropy.
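The global shift ambiguity shows up as a zero mode of the dual scalar's gradient term. The following sketch builds the quadratic form $\sum_{\langle p_1, p_2 \rangle} (\phi_{p_1} - \phi_{p_2})^2$ on a small square lattice of plaquettes and checks that a constant shift costs no energy. Periodic boundary conditions, the lattice size, and the function name are assumptions made purely for illustration.

```python
import numpy as np

def gradient_quadratic_form(L):
    """Matrix K with phi^T K phi = sum over neighbouring plaquette pairs of
    (phi_p1 - phi_p2)^2, on an L x L periodic lattice (assumes L >= 3)."""
    K = np.zeros((L * L, L * L))

    def index(x, y):
        return (x % L) * L + (y % L)

    for x in range(L):
        for y in range(L):
            p = index(x, y)
            # Each pair is visited once, via its right and upper neighbour.
            for q in (index(x + 1, y), index(x, y + 1)):
                K[p, p] += 1.0
                K[q, q] += 1.0
                K[p, q] -= 1.0
                K[q, p] -= 1.0
    return K

K = gradient_quadratic_form(4)
eigenvalues = np.linalg.eigvalsh(K)
constant_shift = np.ones(K.shape[0])
print("two lowest eigenvalues:", eigenvalues[:2])
print("max |K . constant|:", np.abs(K @ constant_shift).max())
```

The lowest eigenvalue vanishes and its eigenvector is the constant configuration; this is the mode that the identification under global shifts removes from the dual scalar.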
In the continuum limit, the mapping (C.1), (C.2) must first be amended by replacing g with $g/a^{1/2}$. Taking g → 0 and a → 0 with $e^2 = g^2/a$ and employing the appropriate continuum variables, the duality takes its well-known form, $\frac{1}{2} \epsilon^{\mu\nu\lambda} F_{\mu\nu} \equiv e\,\partial^\lambda \phi$ for short. The electric operator $J_i(x)$ has real eigenvalues so φ(x) is also real, but the shift identification becomes φ(x) ≡ φ(x) + e, meaning that the dual description of Maxwell theory is the spontaneous symmetry breaking phase of a compact scalar with radius e.
When the continuum coupling is small (i.e. at length scales r such that $e^2 r \ll 1$) the shift identification reduces to the gauging of infinitesimal shifts φ(x) → φ(x) + ε. This is equivalent to removing the zero mode from the theory. Qualitatively, the entanglement entropy at such small scales will reflect the loss of this one mode [33]; the softest (nearly constant) mode that can live in the entangling region will be forced to couple to the softest mode allowed in the exterior in order to put the overall zero mode into its ground state, a state it has to stay in because it plays no part at small coupling. As the entangling region is increased, its softest mode will be allowed more and more leeway, leading to the presence of a monotonically increasing $\log(e^2 r)$ term in the entanglement entropy, as we will show explicitly below. As we work our way to higher length scales, only finite shifts will be gauged away and the logarithmic term will cross over to an O(1) constant by the time we reach the $e^2 r \gg 1$ regime, where there are effectively no traces of the gauging left and the entanglement entropy corresponds to that of the noncompact, ordinary scalar. By further increasing $e^2 r$ we will eventually hit the confinement scale $r = l_{\rm conf}$, at which point the dual scalar becomes massive and the gauge fields become confined, and the entanglement entropy becomes zero.
We close this section by remarking that the dual photon φ is not the same field as the gauge-fixed vector potential, even though both can be written as a single real degree of freedom. Given a plaquette with living links $\ell_1$ and $\ell_2$, the two descriptions are related by eq. (C.2). In other words, axial gauge is related to the dual photon by a conjugate transformation that exchanges the position and momentum operators.
Open Access. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0), which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.