Yang–Mills Measure on the Two-Dimensional Torus as a Random Distribution

We introduce a space of distributional 1-forms $\Omega^1_\alpha$ on the torus $\mathbf{T}^2$ for which holonomies along axis paths are well-defined and induce Hölder continuous functions on line segments. We show that there exists an $\Omega^1_\alpha$-valued random variable $A$ for which Wilson loop observables of axis paths coincide in law with the corresponding observables under the Yang–Mills measure in the sense of Lévy (Mem Am Math Soc 166(790), 2003).
It holds furthermore that $\Omega^1_\alpha$ embeds into the Hölder–Besov space $\mathcal{C}^{\alpha-1}$ for all $\alpha \in (0,1)$, so that $A$ has the correct small scale regularity expected from perturbation theory. Our method is based on a Landau-type gauge applied to lattice approximations.


Introduction
The main object of study in this paper is the Yang-Mills (YM) measure on the two-dimensional torus $\mathbf{T}^2$ given formally by
$$d\mu(A) = Z^{-1} e^{-S_{\mathrm{YM}}(A)}\, dA. \tag{1.1}$$
Here $dA$ denotes a formal Lebesgue measure on the affine space $\mathcal{A}$ of connections on a principal $G$-bundle $P$ over $\mathbf{T}^2$, where $G$ is a compact, connected Lie group with Lie algebra $\mathfrak{g}$. For our purposes, we will always assume $P$ is trivial, so that after taking a global section, $\mathcal{A}$ can be identified with the space $\Omega^1(\mathbf{T}^2, \mathfrak{g})$ of $\mathfrak{g}$-valued 1-forms on $\mathbf{T}^2$. The constant $Z$ is a normalisation which makes $\mu$ a probability measure, and the YM action $S_{\mathrm{YM}}(A)$ is defined by
$$S_{\mathrm{YM}}(A) = \frac{1}{2} \int_{\mathbf{T}^2} |F_A(x)|^2 \, dx, \tag{1.2}$$
where $F_A$ is the curvature two-form of $A$. A number of authors with different techniques have investigated ways to give a rigorous meaning to (1.1) (and its variants); a highly incomplete list is [BFS79,BS83,GKS89,Fin91,Sen97,Ngu15]. See also [Cha19] for an extensive review of the literature associated with this problem.
One way to understand the measure is to study the distributions of certain gauge invariant observables. A popular class of such observables are Wilson loops defined via holonomies, and a complete characterisation of these distributions can be found in [Lév03], with related work going back to [Mig75,DM79,Bra80,Dri89,Wit91]. We shall follow [Lév03,Lévy10] and treat the YM measure as a stochastic process indexed by sufficiently regular loops in T 2 .
The purpose of this work is to realise the YM measure as a random distribution with the small scale regularity one expects from perturbation theory. We show that a Landau-type gauge applied to lattice approximations allows one to construct a (non-unique) random variable taking values in a space of distributional 1-forms for which a class of Wilson loops is canonically defined and has the same joint distributions as under the YM measure.
Outline of results. The main result of this paper can be stated as follows (we explain the notation after the theorem statement).
Theorem 1.1. Let $G$ be a compact, connected, simply connected Lie group with Lie algebra $\mathfrak{g}$. For all $\alpha \in (\tfrac{1}{2}, 1)$, there exists an $\Omega^1_\alpha(\mathbf{T}^2, \mathfrak{g})$-valued random variable $A$ such that for any $x \in \mathbf{T}^2$, finite collection of axis loops $\gamma_1, \dots, \gamma_n$ based at $x$, and Ad-invariant function $f : G^n \to \mathbf{R}$, it holds that $f(\mathrm{hol}(A, \gamma_1), \dots, \mathrm{hol}(A, \gamma_n))$ is equal in law to $f$ applied to the corresponding holonomies under the YM measure.
The class of all functions $A \mapsto f(\mathrm{hol}(A, \gamma_1), \dots, \mathrm{hol}(A, \gamma_n))$, where $f$ is Ad-invariant, is known to uniquely determine $A$ up to gauge equivalence (at least for smooth $A$), see [Sen92, Prop. 2.1.2]. This class includes the Wilson loop observables, i.e., functions which depend only on $\mathrm{Tr}[\varphi\, \mathrm{hol}(A, \gamma_1)], \dots, \mathrm{Tr}[\varphi\, \mathrm{hol}(A, \gamma_n)]$ where $\varphi$ is any finite-dimensional representation of $G$, but in general this class is strictly larger.
The article, as well as the proof of Theorem 1.1, which is given at the end of Sect. 5, is split into three parts. The first part, given in Sect. 3, constructs the space $\Omega^1_\alpha$ and derives its basic properties. In this part we work in arbitrary dimension $d \ge 1$. The second part, which can be seen as the main contribution of this paper, is given in Sect. 4 and defines a gauge on lattice approximations through iterations of the Landau gauge $\sum_{\mu=1}^{d} \partial_\mu A_\mu = 0$ (also called the Coulomb gauge in differential geometry). We furthermore apply an axial gauge in order to reach a small 1-form on some medium scale, after which the preceding gauge can be applied. The third part, given in Sect. 5, again uses an axial-type gauge together with a random walk argument to obtain probabilistic bounds necessary to apply the results from Sect. 4. We work with quite general discrete approximations as in [Dri89, Sect. 7] which cover the Villain (heat kernel) and Wilson actions.
Remark 1.2. The assumption that $G$ is simply connected appears for topological reasons when applying the axial gauge in Sect. 4.2 (and would not be necessary if we worked on the square $[0, 1]^2$ instead of $\mathbf{T}^2$). In fact, one does not expect to be able to represent a realisation of the YM holonomies as a global 1-form unless the realisation is associated to a trivial principal bundle. How to construct the YM measure associated to a specific principal bundle was understood in [Lév06], and it would be of interest to extend our results to this general case.
Remark 1.3. The restriction to axis paths appears superficial, and is certainly an artefact of our proof. The construction in [Lév03] makes sense of the corresponding random variables for any piecewise smooth embeddings γ i , and this was later extended to all bounded variation paths in [Lévy10]. It would be of interest to determine a more canonical space of "test" paths in our context for which hol(A, γ ) is well-defined together with regularity estimates. The construction in Sect. 3 could be adapted to different classes of paths, however it is unclear how to adapt the results of Sects. 4 and 5 to yield a satisfactory conclusion. See also Remark 3.3.
The Landau-type gauge defined in Sect. 4.1 can be loosely explained as follows: we first apply the classical Landau gauge on low dimensional subspaces, working up to the full dimension (for d = 2 this involves just two steps), and then propagate the procedure from large to small scales. The advantage of this gauge is that it is relatively simple to analyse and retains the small scale regularity expected from perturbation theory (which is not true, e.g., for the axial gauge). The exact form of this gauge appears new (although it is closely related to the classical Landau gauge, which is of course well-known) and its regularity analysis can be seen as the main technical contribution of this paper. We choose to study this gauge only in dimension d = 2 since this simplifies many arguments, and since this restriction is crucial for our probabilistic estimates, however we emphasise that an analogous construction works in arbitrary dimension. See Remarks 4.6 and 4.9 for the intuition behind this gauge coming from elliptic PDEs.
While we work with approximations of the YM measure taken from [Lév03,Dri89], we note that our analysis is closer in spirit to that of [Bal85a,Bal85c,Bal85b] (which was subsequently used to prove ultraviolet stability of three- and four-dimensional lattice approximations of the pure YM field theory under the action of a renormalisation group).
Motivation and further directions. It would be of interest to extend our work to higher dimensions to yield small scale regularity of lattice approximations to the YM measure in $d = 3$. See [Cha16] for recent work on the YM measure in three and four dimensions. The difficulty here is of course that the measure becomes much more singular and requires non-trivial renormalisation. Furthermore, one does not necessarily expect from perturbation theory that Wilson loop observables would be well-defined even for $d = 3$ (see Remark 3.1 and [CG15, Sect. 3.1], [Frö80, Sect. 3]). In this case one may need to regularise the connection as proposed in [CG13,CG15] or consider smooth averages of Wilson loops, see e.g. [Sin81, p. 819]. Another direction would be to work with so-called lasso variables [Gro85,Dri89] which could prove more regular in higher dimensions than Wilson loops.
We end the introduction with a discussion on one of the motivations behind this paper. An important feature of the space $\Omega^1_\alpha$ is its embedding into $\Omega^1_{\mathcal{C}^{\alpha-1}}$, the space of Hölder-Besov distributions commonly used in analysis of stochastic PDEs [Hai14,GIP15], see Corollary 3.23. The main result of this paper can thus be seen as a construction of a candidate invariant measure (up to suitable gauge transforms) for the connection-valued stochastic YM heat flow (1.3), where $d_A$ is the covariant derivative, $F_A$ is the curvature two-form of $A$, and $\xi$ is a space-time white noise built over the Hilbert space $\Omega^1(\mathbf{T}^2, \mathfrak{g})$, i.e., $(\xi_\mu)_{\mu=1}^{d}$ are i.i.d. $\mathfrak{g}$-valued space-time white noises. The term $d_A d_A^* A$, known as the DeTurck [DeT83] or Zwanziger [Zwa81] term, is a gauge breaking term which renders the equation parabolic (and the solution gauge equivalent to the solution without this term).
The YM heat flow without noise is a classical tool in geometry [DK90]; for a recent application, see [Oh14,Oh15] where the deterministic YM heat flow was applied to establish well-posedness of the YM equation in Minkowski space. It was also proposed in [CG13] as a gauge invariant continuum regularisation of rough connections; one of the motivations therein was to set up a framework in which one could define a non-linear distributional (negative index Sobolev) space which could support the YM measure for non-Abelian gauge groups (a goal which parallels the one of this article).
The motivation to study the stochastic dynamics arises from stochastic quantization [DH87,BHST87]. The principal idea is to view (1.3) as the Langevin dynamics for the Hamiltonian (1.2) of the YM model. This quantization procedure largely avoids gauge fixing, the appearance of Faddeev-Popov ghosts, and the Gribov ambiguity, which was one of the motivations for its introduction by Parisi-Wu [PW81]. It was furthermore recently used to rigorously construct the scalar $\Phi^4_3$ measure on the torus [MW17a].
Due to the roughness of the noise $\xi$ and the non-linearity of the term $d_A^* F_A$ in the non-Abelian case, equation (1.3) is classically ill-posed. The framework of regularity structures [Hai14,CH16,BHZ19,BCCH17] however provides an automated local solution theory for this equation in dimension $d < 4$ (at least via smooth mollifier approximations). Shen [She18] recently studied lattice approximations of the Abelian version of this equation coupled with a Higgs field using discretizations of regularity structures [EH17,HM18,CM18]. One also expects the equation to be amenable to paracontrolled analysis and its discretizations [GIP15,GP17,MP17,ZZ18].
Remark 1.4. Another way to construct the YM measure as a random distribution is through the axial gauge [Dri89]. One can verify however that this construction yields a random distribution of regularity $\mathcal{C}^\eta$ for $\eta < -\tfrac{1}{2}$ and that the procedure in [Hai14,BCCH17] yields a solution theory for (1.3) only for initial conditions in $\mathcal{C}^\eta$ for $\eta > -\tfrac{1}{2}$. In a similar way to [HM18], one could expect that (1.3) admits global in time solutions for a.e. starting point from an invariant measure. In addition to [LN06], where a large deviations principle is shown, such a result would provide a further rigorous link between the YM measure and the YM energy functional. It is therefore possible that global in time solutions could exist a.s. for arbitrary initial conditions, but it is unclear if this should be expected. This is true for the $\Phi^4$ models [MW17b,MW17a], though through a rather different mechanism. Global in time stability of the YM heat flow without noise is already somewhat non-trivial, even in $d = 2, 3$ [Rad92], and typically uses Uhlenbeck compactness [Uhl82,Weh04].

Notation and Conventions
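2.1. Paths. For a set $E$ and a function $\gamma : [0, 1] \to E$, we denote by $\gamma_{[0,1]} \subset E$ the image of $\gamma$. For a metric space $(E, d)$, $q \ge 1$, and a path $\gamma : [s, t] \to E$, we define the $q$-variation of $\gamma$ by
$$|\gamma|_{q\text{-var}} \overset{\mathrm{def}}{=} \sup_{D} \Big( \sum_{t_i \in D} d(\gamma(t_i), \gamma(t_{i+1}))^q \Big)^{1/q},$$
where the supremum is taken over all finite partitions $D = (s \le t_0 < t_1 < \dots < t_n \le t)$ (with $t_{n+1} \overset{\mathrm{def}}{=} t$ for the case $t_i = t_n$ in the sum above). For a sequence $(\gamma(i))_{i=1}^{k}$ with $\gamma(i) \in E$, we denote by $|\gamma|_{q\text{-var}}$ the same quantity with the supremum taken over all partitions $D \subset \{1, \dots, k\}$.

Let $(e_\mu)_{\mu=1}^{d}$ be an orthonormal basis of $\mathbf{R}^d$ and let $\mathbf{Z}^d$ denote the lattice generated by $(e_\mu)_{\mu=1}^{d}$. We will work primarily on the torus $\mathbf{T}^d \overset{\mathrm{def}}{=} \mathbf{R}^d / \mathbf{Z}^d$ equipped with its usual (geodesic) metric which, by an abuse of notation, we denote by $|x - y|$. As a set, we will identify $\mathbf{T}^d$ with $[0, 1)^d$ in the usual way and write $x = (x_1, \dots, x_d)$. For $N \ge 0$, we let $\Lambda_N \subset \mathbf{T}^d$ denote the lattice of points whose coordinates are integer multiples of $2^{-N}$.

We say that $x, y \in \Lambda_N$ are adjacent if $|x - y| = 2^{-N}$. An oriented bond, or simply bond, of $\Lambda_N$ is an ordered pair of adjacent points $\alpha = (x, y)$, and we call $\overleftarrow{\alpha} = (y, x)$ the reversal of $\alpha$. We denote by $\mathbf{B}_N$ the set of bonds of $\Lambda_N$. We further denote by $\overline{\mathbf{B}}_N$ the subset of bonds $(x, x + 2^{-N} e_\mu) \in \mathbf{B}_N$. Note that every $\alpha \in \mathbf{B}_N$ canonically defines a subset of $\mathbf{T}^d$ with one-dimensional Lebesgue measure $|\alpha| \overset{\mathrm{def}}{=} 2^{-N}$, and that $\alpha, \bar{\alpha} \in \mathbf{B}_N$ define the same subset of $\mathbf{T}^d$ if and only if $\bar{\alpha} = \alpha$ or $\bar{\alpha} = \overleftarrow{\alpha}$. In the same way, we can canonically identify every $\alpha \in \overline{\mathbf{B}}_N$ with a subset of $\mathbf{T}^d$.

A rectangle of $\Lambda_N$ is a triple $r = (x, m 2^{-N} e_\mu, n 2^{-N} e_\nu)$ with $x \in \Lambda_N$, $\mu \neq \nu$, and $1 \le m, n < 2^N$ with either $m = 1$ or $n = 1$. Observe that $r$ can be canonically identified with a subset of $\Lambda_N$ consisting of $(m + 1)(n + 1)$ points, as well as a (closed) subset of $\mathbf{T}^d$ with two-dimensional Lebesgue measure $|r| = mn 2^{-2N}$. We will freely interchange between these interpretations. If $m = n = 1$, we call $r$ a plaquette.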
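As a small illustration of the $q$-variation from Sect. 2.1 in the case of a finite sequence, the following Python sketch evaluates the supremum over partitions by dynamic programming. It assumes the standard definition, i.e. the supremum over partitions of $(\sum_i d(\gamma(t_i), \gamma(t_{i+1}))^q)^{1/q}$. The function name `q_variation` and the default distance on real numbers are illustrative choices and do not appear in the paper.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def q_variation(
    points: Sequence[T],
    q: float,
    dist: Callable[[T, T], float] = lambda a, b: abs(a - b),
) -> float:
    """q-variation of a finite sequence (illustrative helper, not from the paper).

    Returns sup over index partitions i_0 < i_1 < ... < i_m of
    (sum_j dist(points[i_j], points[i_{j+1}])**q) ** (1/q).
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    n = len(points)
    if n < 2:
        return 0.0
    # best[j] = largest value of sum dist**q over partitions whose last index is j
    best = [0.0] * n
    for j in range(1, n):
        best[j] = max(best[i] + dist(points[i], points[j]) ** q for i in range(j))
    # best is non-decreasing in j, so the last entry is the overall supremum
    return best[-1] ** (1.0 / q)


if __name__ == "__main__":
    zigzag = [0.0, 1.0, 0.0, 1.0]
    print(q_variation(zigzag, 1))  # total variation of the zigzag
    print(q_variation(zigzag, 2))  # 2-variation of the same sequence
```

The quadratic-time recursion is enough for short sequences such as the anti-developments along a rectangle considered later; for $q = 1$ it reduces to the total variation.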
We let G N ⊂ T d denote the grid induced by N , that is, . We call elements of 1,(N ) discrete E-valued 1-forms on N . Note that forN ≤ N , every A ∈ 1,(N ) canonically defines a function A ∈ 1,(N ) (which we denote by the same letter) via (2.1) We will often use the shorthand AN μ (x) . Throughout the paper we let G be a compact, connected Lie group (not necessarily simply connected) with Lie algebra g. We let 1 G denote the identity element of G. We equip G with the normalised Haar measure denoted in integrals by dx. We equip g with an Ad(G) invariant inner product ·, · and equip G with the corresponding Riemannian metric and geodesic distance. We fix a measurable map log : G → g with bounded image such that exp(log x) = x for all x ∈ G and such that log is a diffeomorphism between a neighbourhood of 1 G and a neighbourhood of 0 ∈ g. We further choose log so that log(yx y −1 ) = Ad y log x for all x, y ∈ G and log(x) = − log(x −1 ) for all x ∈ G outside a null-set (this is always possible by considering a faithful finitedimensional representation of G and the principal logarithm, cf. [Bal85a, Sect. A]; the last point follows from the fact if G is a compact, connected matrix group, then {x ∈ G | −1 ∈ σ (x)} has Haar measure zero -this is obvious if G is Abelian, and the general case follows e.g. from the Weyl integral formula [Hal15,Thm. 11.30]).
Remark 2.1. In the sequel, when we say that a quantity depends on G, we implicitly mean it depends also on the choice of log and inner product on g.
We denote by A (N ) the set of functions U : B N → G such that U (α) = U ( ← − α ) −1 . Observe that every A ∈ 1,(N ) (T d , g) defines an element of A (N ) via U = exp A. Note further that every U ∈ A (N ) canonically defines an element in A (N ) for allN ≤ N exactly as in (2.1) with the sum replaced by an ordered product. We will again often use the shorthand UN μ (x) We let G (N ) denote the set of functions g : N → G. We call elements of G (N ) discrete gauge transforms. For U ∈ A (N ) and g ∈ G (N ) , we define U g ∈ A (N ) by We define the binary power of a number q ∈ [0, 1) as the smallest k ≥ 0 such that p ∩ N such that z μ and z ν have binary power at most N − 1, and for the other three points y ∈ p ∩ N , at least one of y μ , y ν has binary power N . We call z the origin of . . , α 4 are the four bonds oriented to traverse the boundary of p anti-clockwise starting at z when viewed from the (μ, ν) plane.
In general, for a rectangle r = (x, m2 −N e μ , n2 −N e ν ), there is a unique plaquette p ⊂ r such that neither p − 2 −N e μ nor p − 2 −N e ν are contained in r . We define the origin z of r as the origin of p, and define U (∂r ) . . , α k are the bonds in B N which traverse the boundary of r anti-clockwise starting from z when viewed from the (μ, ν) plane.
Remark 2.3. The exact order of the bonds α i may seem arbitrary at this point (one usually simply starts at the south-west corner of r ), but this choice will be convenient in Sect. 4.1.
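The binary power used above to single out the origin of a plaquette can be computed exactly with rational arithmetic. The sketch below assumes that the binary power of a dyadic number $q \in [0, 1)$ is the smallest $k \ge 0$ such that $2^k q$ is an integer; this reading of the definition, and the function name `binary_power`, are assumptions made for illustration.

```python
from fractions import Fraction


def binary_power(q: Fraction) -> int:
    """Smallest k >= 0 with 2**k * q an integer (assumed reading of the definition)."""
    if not (0 <= q < 1):
        raise ValueError("q must lie in [0, 1)")
    denom = q.denominator  # Fraction is stored in lowest terms
    if denom & (denom - 1):
        raise ValueError("q is not a dyadic rational")
    # denom == 2**k, so k is the position of its single set bit
    return denom.bit_length() - 1


if __name__ == "__main__":
    N = 3
    for k in range(2 ** N):
        q = Fraction(k, 2 ** N)
        print(q, binary_power(q))
```

Under this assumption, exactly half of the coordinates $k 2^{-N}$, $0 \le k < 2^N$, have binary power $N$ (those with $k$ odd), and the remaining ones already belong to the coarser lattice of mesh $2^{-(N-1)}$.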

Holonomy on Distributions
In this section we introduce spaces of distributional 1-forms on $\mathbf{T}^d$ for which integration along axis paths is canonically defined. We will later show that the YM measure can be appropriately gauge fixed to have support on these spaces.

Motivation: the Gaussian free field.
From perturbation theory, we expect that in two and three dimensions the YM measure can be realised as a random distribution with the same regularity as the Gaussian free field (GFF) $\Phi$. In this subsection, we present an informal discussion about what precisely we mean by "regularity".
Working on T 2 , it is well-known that is not a function (though it is almost a function since it belongs to every Hölder-Besov space C −κ , κ > 0). Pointwise evaluation (x) = , δ x is therefore ill-defined. We claim however, that for certain regular curves γ : The point here is that ψ, δ can make sense for sufficiently regular distributions ψ.
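Hence $(-\Delta)^{-1/2} \delta_\ell$ is a function in $L^2$ (with plenty of room to spare) and the evaluation $\langle \Phi, \delta_\ell \rangle$ makes sense (as a random variable) where $\Phi = (-\Delta)^{-1/2} \xi$ is a GFF and $\xi$ is an $\mathbf{R}$-valued white noise on $\mathbf{T}^2$.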
Remark 3.1. Note that the same is not true in three dimensions. In this case $K(x) \sim |x|^{-2}$ so that $K * \delta_\ell(x) \sim |d(x, \ell)|^{-1}$, rendering the integral $\int |K * \delta_\ell(x)|^2 \, dx$ infinite (but only just). This suggests that, even in the smoothest gauge, Wilson loops would a.s. not be defined for the YM measure in dimension three, cf. [BFS80, p. 160]. We note however that replacing $\ell$ by a suitable surface $L$ again renders $K * \delta_L(x) \sim |\log d(x, L)|$ so that $(-\Delta)^{-1/2} \delta_L$ is in $L^2$ (with plenty of room to spare).
Furthermore, one can derive growth bounds and Hölder continuity with respect to . To see this, note that | | 2α for any α < 1 (e.g. by splitting the domain of integration into annuli around with radii | |2 N ). Hence , δ is a Gaussian random variable One can combine these two estimates in a Kolmogorov-type argument (at least for axis line segments) to show that, for any α < 1, (A more precise formulation would be that admits a modification for which these bounds holds.) Sections 4 and 5 of this paper can be seen as deriving these estimates and Kolmogorov argument when is replaced by discrete approximations of the YM measure (albeit with rather different methods). The remainder of this section sets up the space in which we will obtain weak limit points of these approximations.
Remark 3.2. The analogue for the YM measure $U$ (as a random holonomy) of the estimate $|\langle \Phi, \delta_\ell - \delta_{\bar{\ell}} \rangle| \lesssim |\ell|^{\alpha/2} d(\ell, \bar{\ell})^{\alpha/2}$ takes the form $|\log U(\partial r)| \lesssim |r|^{\alpha/2}$ where $r$ is the rectangle with $\ell, \bar{\ell}$ as two of its sides. This is certainly expected since the law of $U(\partial r)$ is close to that of $B_{|r|}$, where $B$ is a $G$-valued Brownian motion.
Remark 3.3. We restrict attention in this article to axis line segments (and thus finite concatenations thereof). It would be desirable to work with a more natural class of paths along which holonomies could be defined together with similar estimates, but it is not entirely clear what the correct "test-space" should be. For example, if A was a random g-valued 1-form which induced the YM holonomies, one would expect that for a.e. realisation there should exist a bounded variation path γ for which A(γ ) defined by (3.1) does not exist (e.g., concatenations of small square loops rapidly decreasing in size but with an increasing number of turns around each one). Thus it seems necessary to impose some control on the derivative of γ for A(γ ) and hol(A, γ ) to be well-defined pathwise (cf. Remark 1.3).

Functions on line segments.
We formalise the above discussion by introducing a suitable space of distributions.
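Definition 3.4. We call a subset $\ell \subset \mathbf{T}^d$ an axis line segment if $\ell = \{x + t e_\mu \mid t \in [0, \lambda]\}$ for some $x \in \mathbf{T}^d$, $\mu \in [d]$, and $\lambda \in [0, 1]$. In this case we define $|\ell| \overset{\mathrm{def}}{=} \lambda$ and, if $|\ell| > 0$, we say that the direction of $\ell$ is $\mu$. We let $\mathcal{X}$ denote the set of all axis line segments equipped with the Hausdorff metric $d_H$.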
Note that X is a compact metric space. We introduce another distance on X .
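Definition 3.5. For $\mu \in [d]$ let $\pi_\mu : \mathbf{T}^d \to \mathbf{T}$ denote the projection onto the $\mu$-th axis. We say that $\ell, \bar{\ell} \in \mathcal{X}$ are parallel if they have the same direction $\mu \in [d]$ and $\pi_\mu \ell = \pi_\mu \bar{\ell}$. For parallel $\ell, \bar{\ell} \in \mathcal{X}$ we define
$$\varrho(\ell, \bar{\ell}) \overset{\mathrm{def}}{=} |\ell|^{1/2} d_H(\ell, \bar{\ell})^{1/2}.$$
Note that $\varrho(\ell, \bar{\ell})^2$ is the area of the smallest rectangle with two of its sides as $\ell$ and $\bar{\ell}$.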
For the rest of the section, let E be a fixed finite-dimensional normed space.
Definition 3.6. We say that ,¯ ∈ X are joinable if ∪¯ ∈ X and | ∪¯ | = | | + |¯ |. We say that a function A : Definition 3.7. For A ∈ and α ∈ [0, 1] we define where the supremum is taken over all distinct parallel ,¯ ∈ X . We also define the α-growth norm where the supremum is taken over all ∈ X with | | > 0.
For ∈ X , we call a parametrisation of a path γ : [0, 1] → T d with constant derivative γ ≡ | |e μ such that γ [0,1] = . Note that if | | < 1, there is exactly one parametrisation of . For every A ∈ and ∈ X with | | < 1, one can canonically construct a path A : where γ is the unique parametrisation of . We have the following basic result, the proof of which is obvious.
We show next that $|\cdot|_{\alpha\text{-gr}}$ and $|\cdot|_{\alpha;\varrho}$ bound the $\tfrac{\alpha}{2}$-Hölder norm of $A$ with respect to $d_H$.
We break the proof up into several elementary lemmas.
Proof. Let μ be the direction of . Then where in the first inequality we used that π μ¯ is a single point, and in the second inequality we used that π μ : T d → T does not increase distance.
Let $|X|$ denote the Lebesgue measure of a (measurable) subset $X \subset \mathbf{T}$, and let $X \triangle Y$ denote the symmetric difference of $X, Y \subset \mathbf{T}$.

Lemma 3.11. Let $X, Y$ be subsets of $\mathbf{T}$ each with a single connected component. Then $|X \triangle Y| \le 4 d_H(X, Y)$.

Proof. Clearly $X \triangle Y$ has at most two connected components and every connected component has Lebesgue measure at most $2 d_H(X, Y)$.
Consider a pair ,¯ ∈ X with the same direction μ ∈ [d]. It holds that π μ ∩ π μ¯ has at most two connected components which we call X, Y (one or both possibly empty). Likewise, π μ π μ¯ has at most two connected components, which we call U, V (one or both possibly empty).
Proof of Proposition 3.9. Suppose ,¯ do not have the same direction. Then clearly and the conclusion follows by Lemma 3.10. Suppose now ,¯ have the same direction. By additivity of A, using the notation of Lemma 3.12, we have and the conclusion follows from Lemma 3.12.
For completeness, we record two further lemmas the proofs of which are obvious.
3.3. Additive functions from 1-forms. Let 1 denote the space of all bounded, measurable E-valued one forms, i.e., all For ∈ X with a parametrisation γ ∈ C 1-var ([0, 1], T d ), we then define A( ) def = A(γ ) (which is independent of the choice of parametrisation γ ). In such a way, we treat every element of 1 as an element of .
Note that this identification does not respect almost everywhere equality, i.e., if $A = \bar{A}$ a.e. on $\mathbf{T}^d$, it does not necessarily hold that $A(\ell) = \bar{A}(\ell)$ for all $\ell \in \mathcal{X}$. However, we have the following.
Proposition 3.15. Let $A \in \Omega^1$. If $A(\ell) = 0$ for all $\ell \in \mathcal{X}$, then $A$ is a.e. zero. Conversely, suppose $A \in \Omega^1$ is a.e. zero and that $\ell \in \mathcal{X}$ is a continuity point of $A$ (as a function on $\mathcal{X}$). Then $A(\ell) = 0$.
Proof. Let ψ ∈ C(T d , R) and μ ∈ [d], and write ds. The first claim follows by noting that X z (t) is the evaluation of A at an element of X . For the second claim, write = {x + te μ | t ∈ [0, λ]} for some λ ≥ 0. Let (ϕ ε ) ε>0 be a smooth approximation of the Dirac delta δ x . Denote On the one hand, since A μ is zero a.e., A μ ,φ ε = 0 for all ε > 0. On the other hand, A( y ) → A( ) as y → x since is a continuity point of A, so that the LHS of (3.2) converges to A( ) as ε → 0, from which it follows that A( ) = 0.
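As a consequence we may realise the space $\mathring{\Omega}^1_0 \overset{\mathrm{def}}{=} \{A \in \Omega^1 \mid A \text{ is continuous as a function on } \mathcal{X}\}$ simultaneously as a subspace of $C(\mathcal{X}, E)$ and as a space of $E$-valued $L^\infty$ 1-forms. Note that, by Proposition 3.9, every $A \in \Omega^1$ with $|A|_\alpha < \infty$ for some $\alpha > 0$ is in $\mathring{\Omega}^1_0$.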

Embeddings.
In this subsection, we show that $\Omega_\alpha$ is compactly embedded in $\Omega^1_{\bar{\alpha}}$ for $\bar{\alpha} < \alpha$, and that the latter is continuously embedded in $\Omega^1_{\mathcal{C}^{\bar{\alpha}-1}}$, the Hölder-Besov space of distributions commonly used in analysis of SPDEs [Hai14,GIP15].

Dyadic approximations and compact embeddings
Fix in this section A ∈ . We suppose further that A( ) = 0 unless has direction μ ∈ [d]. We construct a sequence of functions A (N ) ∈ 1 0 (which serve as dyadic approximations to A) as follows. For x be the unique axis line segment of length 2 −N containing x such that π μ Proof. For the first inequality, let us write ∈ X as = 1 ∪ 2 · · · ∪ n , where i and i+1 are joinable for i ∈ {1, . . . , n − 1}, and each i is contained in a single cell, i.e., a set of the form π −1 μ [k2 −N , (k + 1)2 −1 ]. Then For the second inequality, let ,¯ ∈ X be parallel. Let us decompose ,¯ exactly as above. Observe that where the first supremum is taken over all parallel a, b ∈ X which are in the same cell and for which d(a, b) = d( ,¯ ). The same holds for For the middle part, we simply have It follows that Lemma 3.18. Suppose A is continuous as a function on X . Then Proof. Since A( ) = 0 for all ∈ X consisting of a single point, (uniform) continuity of A on X implies lim ε→0 sup | |≤ε |A( )| = 0. The conclusion follows by additivity and the definition of A (N ) .
Proof. Proposition 3.9 implies that the $\tfrac{\alpha}{2}$-Hölder norm of $A \in \Omega_\alpha$ is bounded by $|A|_\alpha$, hence the unit ball of $\Omega_\alpha$ is equicontinuous and bounded in $C(\mathcal{X}, E)$. Since $\mathcal{X}$ is compact, the claim follows by Arzelà-Ascoli and Lemmas 3.13 and 3.14.

Lattice approximations.
We will see in the following sections that lattice gauge theory provides us with random approximations of elements in α defined on lattices. We show that one can take projective weak limit points of these random variables in α . Recall the definition of 1,(N ) and note that every A ∈ canonically defines an element of 1,(N ) . and for all K ≥ 0 (3.4)

Deterministic Bounds
In this section we collect the necessary deterministic results concerning lattice gauge theory. We restrict henceforth to the case $\mathbf{T}^d = \mathbf{T}^2$. We emphasise however that this assumption is not necessary in this section, and a similar analysis can be performed in arbitrary dimension. The presentation does simplify significantly in this case, however, and the probabilistic bounds in the following section depend crucially on the fact that $d = 2$.
We will henceforth take E = g when considering the spaces 1,(N ) (T 2 , g). Throughout this section let N 1 ≥ 0 and U ∈ A (N 1 ) . Definition 4.1. For N ≤ N 1 and a rectangle r ⊂ N , let p 1 , . . . , p k denote the plaquettes of N ordered so that neither p 1 − 2 −N e 1 nor p 1 − 2 −N e 2 are contained in r and so that the boundaries of p i+1 and p i share a common bond for i = 1, . . . , k − 1 (note this defines the order uniquely). Let r i denote the subrectangle of r consisting of the plaquettes p 1 , . . . , p i . See Fig. 1 for an example. We call the anti-development of U along r the g-valued sequence ( For an integer N ≤ N 1 and a rectangle r ⊂ N , consider the conditions for somē C ≥ 0 and α ∈ R | log U (∂r )| ≤C|r | α/2 , (4.1) and for some q ≥ 1 |X | q-var ≤C|r | α/2 , (4.2) where (X i ) k i=1 is the anti-development of U along r . Remark 4.4. As the name suggests, the development of X into G is exactly the sequence (U (∂r i )) k i=1 . As a result, by Young integration, if (4.2) holds for some q < 2, then so does (4.1) (potentially with a largerC). In our situation, we will only have (4.2) for q > 2, in which case (4.1) would only be implied by (4.2) if X is replaced by its rough path lift (and our probabilistic estimates in the following section indeed imply this stronger bound). However we choose the current formulation to keep the assumptions in this section more elementary and since the bound (4.2) will only be used in the "Young regime", cf. Lemma 4.11.
The main result of this section can be stated as follows.
Suppose further that G is simply connected. Then there exists A ∈ 1,(N 1 ) such that exp A = U g for some g ∈ G (N 1 ) and for everyᾱ < α, there exists C ≥ 0, independent of N 1 , such that |A| Proof. By Proposition 4.15 we can apply the axial gauge for sufficiently large N 0 ≥ 1 until the assumptions of Theorem 4.12 are satisfied, after which we can apply the binary Landau gauge for N 0 ≤ N ≤ N 1 .

Binary Landau gauge.
Throughout this subsection, let us fix N 0 ≤ N 1 . We should think of N 0 as providing a fixed medium scale while we take N 1 → ∞. We will define A ∈ 1,(N 1 ) and g ∈ G (N 1 ) such that exp(A) = U g with explicit bounds on |A| (N 1 ) α . Remark 4.6. We will be guided by the following observation. Let A be a smooth g-valued 1-form on a closed hypercube B in R d with curvature If A is small or if G is Abelian, the final terms can be ignored and we are left with a Poisson equation for A μ with a mixed Dirichlet-Neumann boundary condition (we ignore the non-smoothness of ∂ B in this discussion). The probabilistic representation of the solution is where W is a Brownian motion started at x, conditioned to exit B at ∂ B\∂ μ B, and τ is the first exit time of W from B. Using this representation (or the classical maximum principle) we see that A μ is bounded by its value on ∂ B\∂ μ B plus contributions from ∂ ν F μν .
Provided the contribution from ∂ ν F μν is small, this allows us to bound A on smaller scales by its value on large scales. The procedure in this subsection can be seen as a discrete version of this boundary value problem with a random walk approximation.
We define A and g inductively. To start, let N = N 0 and A(α) Suppose we have defined A and g on B N −1 and N −1 respectively for N 0 < N ≤ N 1 . To extend the definition to N , we consider intermediate lattices where k N is the subset of N consisting of vertices x = (x 1 , x 2 ) for which at most k coordinates have binary power at most N (see Sect. 2.3 for the definition of binary power). We correspondingly define the set of bonds B k N by B 0 N = B N −1 and for k = 1, 2 as the set of ordered pairs (x, y) where x, y ∈ k N with |x − y| = 2 −N (in particular B 2 N = B N ). For k = 1, 2, we define A and g on B k N and k N as follows. Let We then extend the definition of g to x by enforcing It clearly holds that exp A = U g on B 1 N (with U g defined in the obvious way). If k = 2, let p 1 , p 2 , p 3 , p 4 be the four plaquettes of N one of whose corners is x, ordered from the positive quadrant anti-clockwise, see Fig. 3. Note that the origin of p i is a point z i ∈ N −1 which is the corner of p i opposite to x. Define Lemma 4.7. For all n ≥ 1, there exists C > 0 depending only on n and G, such that for all A 1 , . . . , A n ∈ g, it holds that Proof. An immediate consequence of the compactness of G and non-zero radius of convergence of the Campbell-Baker-Hausdorff formula.
Lemma 4.8. Let $A$ and $g$ be defined as above on $B^1_N$ and $\Lambda^1_N$ respectively. For $x \in \Lambda^2_N$ as above, denote Then there exist $E_i \in \mathfrak{g}$ for $i = 1, 2, 3$, a constant $C \ge 0$ depending only on $G$, and a unique choice for $g(x)$, such that $|E_i| \le C\delta^2$ and such that

Remark 4.9. Following Remark 4.6, the ratios $\frac{3}{8}$ and $\frac{1}{8}$ arise from the following observation: let $X$ be a random walk on the bonds of $p_1, \dots, p_4$ parallel to $e_1$, starting on $(x, x + 2^{-N} e_1)$ and stopped the first time it hits the boundary of $p_1 \cup \dots \cup p_4$. Then $X$ will stop on $\partial(p_1 \cup p_4)$ with probability $\frac{3}{4}$ and on $\partial(p_2 \cup p_3)$ with probability $\frac{1}{4}$.
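The stopping probabilities in Remark 4.9 can be checked by a short exact computation. The following is only a minimal sketch under an assumption that is not spelled out in the text: from each of the two interior bonds parallel to $e_1$, the walk is taken to jump with probability 1/3 to each of the three neighbouring parallel bonds inside $p_1 \cup \dots \cup p_4$. The function name is hypothetical.

```python
from fractions import Fraction


def stopping_probabilities():
    """Exact stopping probabilities for a toy walk on horizontal bonds.

    Assumed model (an illustration, not taken from the text): the 2x2 block
    of plaquettes has two interior horizontal bonds,
    b_right = (x, x + e_1) and b_left = (x - e_1, x).
    From an interior bond the walk jumps with probability 1/3 to the bond
    above, the bond below, or the other interior bond. The bonds above and
    below lie on the boundary of the block, where the walk stops.
    """
    third = Fraction(1, 3)
    # h_right = P(stop on the right half | start at b_right)
    # h_left  = P(stop on the right half | start at b_left)
    # First-step analysis gives
    #   h_right = 2/3 + (1/3) * h_left,    h_left = (1/3) * h_right,
    # hence h_right * (1 - 1/9) = 2/3.
    h_right = (2 * third) / (1 - third * third)
    return h_right, 1 - h_right
```

Under this assumed transition rule, the walk started on $(x, x + 2^{-N} e_1)$ stops on the right half of the block, that is on $\partial(p_1 \cup p_4)$, with probability 3/4, in agreement with the remark.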
Proof. There clearly exists a unique choice for $g(x)$ such that $\exp$ from which it follows by Lemma 4.7 that where $x_1 = U^g(\partial p_1)$ and $x_i = u_i U^g(\partial p_i) u_i^{-1}$ for $i = 2, 3, 4$, where $u_i$ is a suitable product of elements of the form $U^g(\partial p_i)$ and $e^{\pm A^N_\mu(x^{\pm}_\nu)}$, $\mu \neq \nu$. By Lemma 4.7, we have Combining (4.3), (4.4), and the definition of $A^N_1(x)$, we obtain from which the existence of $E_1$ with the desired property follows. The existence of $E_2$ and $E_3$ follows in the same manner.
We now extend the definition of $A$ and $g$ to $B_N$ and $\Lambda_N$ as in Lemma 4.8, choosing the $E_i$ arbitrarily provided the bound $|E_i| \le C\delta^2$ is satisfied. By induction, we define $A \in \Omega^{1,(N_1)}$ such that $\exp A = U^g$, as desired.
We now show that this choice leads to a bound on |A| (N 1 ) α . In the following, we use the shorthand Lemma 4.10 (Bonds bound). Suppose there exists α ∈ (0, 1) andC ≥ 0 such that (4.1) holds for all plaquettes r ⊂ N for all N 0 ≤ N ≤ N 1 . Then there exists C ≥ 0, not depending on N 1 , such that if where c ∈ (0, ∞] is a constant depending only on G, then for all Proof. Fix any ε ∈ (0, 1 2 ) and consider N > N 0 . We may suppose thatC2 −N 0 α ≤ 1. Using Lemma 4.8 and the assumption that (4.1) holds for every plaquette, we have (4.6) where C 1 depends only on G and δ IfC2 −N α is furthermore sufficiently small, we have We conclude that there exists c > 0, depending only on G, such that if (4.5) holds, then (4.5) also holds with N 0 replaced by N > N 0 and where C 2 does not depend on N . Proceeding by induction and lowering ε if necessary so that θ def = (ε + 1/2)2 α < 1 we see that where C 3 can depend on θ and N 0 but not on N .
Lemma 4.11. Let $\bar\alpha \in (\frac{1}{2}, 1)$ and $q \in [1, \frac{1}{1-\bar\alpha})$. Then for every rectangle $r \subset \Lambda_N$ it holds that where $X$ is the anti-development of $U$ along $r$, $p_1, \dots, p_k$ are all the plaquettes contained in $r$, and $C$ is a constant depending only on $G$, $\bar\alpha$, and $q$.
Proof. The idea is to write k i=1 log U g (∂ p i ) as a Young integral against the antidevelopment of U along r . Using the notation from Definition 4.1, let i be the unique line contained in the boundary of r which connects z, the origin of r , and z i , the origin of p i . Note that i ∈ X (N −1) .
which is in the form of a Young integral. Using that exp(A) = U g on B N −1 , we see that into Aut(g) (through left multiplication in the adjoint representation) with initial point Y 1 = Ad g(z) . By Lemma 3.8, it holds that α-gr , and thus Young's estimate for controlled ODEs implies Since q −1 +ᾱ > 1 and since |Y 1 | = 1 (in fact |Y i | = 1 for all i = 1, . . . , k), the conclusion follows by Young integration. Proof. It suffices to considerᾱ ∈ ( 2 3 ∨ (1 − q −1 ), α). To prove (4.8), we proceed by induction on N ≥ N 0 . Assume that |A( )| ≤ P N −1 | |ᾱ for some constant P N −1 ≥ 1 and all ∈ X (N −1) .
Let ∈ X (N ) . Suppose first that is contained in G N −1 , the grid of N −1 . Then we can write = 1 ∪ 2 ∪ 3 where 1 ∈ X (N −1) and, for i = 2, 3, i is either empty or is a bond of N . By induction, we know that |A( 1 )| ≤ P N −1 | 1 |ᾱ. If both 2 , 3 are empty, then we are done. Otherwise, by Lemma 4.10, we have |A( 2 )| + |A( 3 )| ≤ C 1 2 −N α for a constant C 1 not depending on N . If 1 is empty, then again we are done by choosing P N ≥ C 1 . Otherwise we have Since C 1 is independent of N , we may increase P N −1 if necessary so that P N −1ᾱ ≥ C 1 . Hence which proves the inductive step in the case ⊂ G N −1 . Note that the same constant P N −1 appears, which will be used in the next case.
Suppose now is not contained in G N −1 . Then by the definition of A N , we have where 1 , 2 ∈ X (N ) are parallel to and are contained in G N −1 . Here 1 accounts for the terms ∂ μ F νμ and satisfies for a constant C 2 depending only on G, q, andᾱ where the sum is taken over all plaquettes p ⊂ N which have a corner belonging to and the second inequality is due to Lemma 4.11. The term 2 accounts for the errors E i from the CBH formula and satisfies, by Lemma 4.8, for a constant C 3 depending only on G where we have used that (4.2) holds for all plaquettes, Lemma 4.10 as above, and the fact that is a union of | |2 N bonds of N . Using these estimates for 1 , 2 , it follows from the previous case that |A( )| ≤ P N −1 | |ᾱ + C 5 2 −N (α−ᾱ) P N −1 | |ᾱ for C 5 independent of N . Hence we have shown the inductive step with P N def = P N −1 (1 + C 5 2 −N (α−ᾱ) ), and thus sup N P N < ∞. This completes the proof of (4.8).
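The boundedness of $P_N$ claimed above follows from a geometric series. The short computation below is a sketch using only the recursion $P_N = P_{N-1}(1 + C_5 2^{-N(\alpha-\bar\alpha)})$, the inequality $1 + x \le e^x$, and $\bar\alpha < \alpha$; it assumes that any enlargement of the constant in the first case is made once, at the initial scale $N_0$.

```latex
\begin{align*}
P_N &= P_{N_0} \prod_{M=N_0+1}^{N} \bigl(1 + C_5\, 2^{-M(\alpha-\bar\alpha)}\bigr)
  \le P_{N_0} \exp\Bigl( C_5 \sum_{M=N_0+1}^{\infty} 2^{-M(\alpha-\bar\alpha)} \Bigr) \\
  &= P_{N_0} \exp\Bigl( \frac{C_5\, 2^{-N_0(\alpha-\bar\alpha)}}{2^{\alpha-\bar\alpha} - 1} \Bigr) < \infty .
\end{align*}
```

The right-hand side does not depend on $N$, which gives $\sup_N P_N < \infty$.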
To prove (4.9), we again proceed by induction on N . Suppose that the case N − 1 holds with proportionality constant Q N −1 . Let ,¯ ∈ X (N ) be distinct and parallel. Suppose first that and¯ are both contained in G N −1 . We write = 1 ∪ 2 ∪ 3 as before and similarly for¯ . Note that we can take parallel 1 ,¯ 1 ∈ X (N −1) to which we can apply the inductive hypothesis. If 2 and 3 are both empty, or if 1 is empty, then we are done. Otherwise, in the same way as the proof of (4.8), (where we increase Q N −1 if necessary as before). Now suppose¯ is contained in G N −1 and is not. Then we know A( ) admits the expression (4.10) with the same bounds on 1 and 2 , and where 1 and 2 are parallel to¯ with By the previous case and the concavity of x → xᾱ /2 , we have From (4.11) we have (C 7 takes into account the fact that sup N P N < ∞). From (4.12) and the condition 2 3 <ᾱ < α we have It follows that for C 8 independent of N . For the final case, when neither nor¯ are contained in G N −1 , we write A( ) and A(¯ ) as in (4.10) with corresponding i ,¯ i and parallel i ,¯ i which are contained in G N −1 and d( i ,¯ i ) = d( ,¯ ) for i = 1, 2. By exactly the same argument we again obtain (4.13). Hence we have shown the inductive step with Q N def = Q N −1 + C 8 2 −N (α−ᾱ)/2 , and thus sup N Q N < ∞, which completes the proof of (4.9).

Axial gauge.
In this subsection we conclude the proof of Theorem 4.5 by showing that an axial-type gauge gives an easy bound of the order $|A^N_\mu(x)| \lesssim 2^{-N\alpha/2}$, which ensures that we can always start the induction in Lemma 4.10.
Remark 4.13. This is the only part where we use simple connectedness of G. If we chose to work on [0, 1] 2 instead of T 2 , then this assumption could be dropped and a simplified version of the gauge presented in this subsection could be used.

Probabilistic Bounds
In this section we show that discrete approximations of the Yang-Mills measure satisfy the bounds required in Theorem 4.5.
For every $N \ge 0$, let $Q_N : G \to [0, \infty)$ be a measurable map such that $\int_G Q_N(x)\,dx = 1$, $Q_N(x) = Q_N(x^{-1})$, and $Q_N(yxy^{-1}) = Q_N(x)$ for all $x, y \in G$. Consider the probability measure on $\mathfrak{A}^{(N)}$

$$\mu_N(dU) = Z_N^{-1} \prod_{p \subset \Lambda_N} Q_N(U(\partial p))\, dU,$$

where the product is over all plaquettes $p \subset \Lambda_N$, $dU$ is the Haar measure on $\mathfrak{A}^{(N)} \cong G^{|B_N|}$, and $Z_N$ is the normalisation constant which makes $\mu_N$ a probability measure.
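To make the structure of this measure concrete, here is a minimal sketch for the toy group $U(1)$ (angles added modulo $2\pi$), which is Abelian and not simply connected, so it falls outside the setting of this paper and serves only as an illustration. The orientation convention for plaquettes, the weight `even_weight`, and all function names are assumptions made for the example. The sketch computes the unnormalised density (the product over plaquettes, without $Z_N^{-1}$) and shows that it is unchanged by a gauge transformation.

```python
import math


def plaquette_angles(th1, th2):
    """Holonomy angle around every plaquette of an L x L periodic lattice.

    th1[x][y]: angle on the horizontal bond (x, y) -> (x+1, y)
    th2[x][y]: angle on the vertical bond   (x, y) -> (x, y+1)
    """
    L = len(th1)
    angles = []
    for x in range(L):
        for y in range(L):
            xp, yp = (x + 1) % L, (y + 1) % L
            # go right, then up, then back along the top and left sides
            angles.append(th1[x][y] + th2[xp][y] - th1[x][yp] - th2[x][y])
    return angles


def unnormalised_density(th1, th2, Q):
    """Product of Q over all plaquettes (the density without 1/Z_N)."""
    density = 1.0
    for a in plaquette_angles(th1, th2):
        density *= Q(a)
    return density


def gauge_transform(th1, th2, g):
    """Act on the bond angles by a gauge transformation g (one angle per vertex)."""
    L = len(th1)
    new1 = [[g[x][y] + th1[x][y] - g[(x + 1) % L][y] for y in range(L)]
            for x in range(L)]
    new2 = [[g[x][y] + th2[x][y] - g[x][(y + 1) % L] for y in range(L)]
            for x in range(L)]
    return new1, new2


def even_weight(angle, coupling=1.0):
    """Hypothetical even weight Q(a) = exp(coupling * cos(a)), left unnormalised."""
    return math.exp(coupling * math.cos(angle))
```

Since each vertex angle enters a plaquette holonomy once with each sign, `unnormalised_density` returns the same value before and after `gauge_transform`, up to floating-point error.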
For an integer $N \ge 0$ and constants $C_l, C_u, C \ge 0$ consider the conditions where $M = 1 \vee 2^{2N-3}$ and $Q^k_N$ denotes the $k$-fold convolution of $Q_N$ with itself, and for some $\beta \ge 1$ (5.2)

Condition (5.1) means that the $G$-valued random walk with increments $Q_N(x)\,dx$ has a density after $M$ steps which is bounded above and below. Condition (5.2) means that the $\beta$-th moment of $Q_N(x)\,dx$ is comparable to the $\beta$-th moment of $B(2^{-2N})$, where $B$ is a $G$-valued Brownian motion.
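The role of the $M$-fold convolution in condition (5.1) can be illustrated on a finite toy group. The sketch below is an assumption-laden analogue, not the setting of the paper: it replaces $G$ by the cyclic group $\mathbf{Z}/n\mathbf{Z}$, the Haar measure by the uniform measure, and $Q_N$ by a symmetric step distribution chosen for the example. All function names are hypothetical.

```python
def convolve_cyclic(p, q):
    """Convolution of two probability vectors on the cyclic group Z/nZ."""
    n = len(p)
    out = [0.0] * n
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[(i + j) % n] += pi * qj
    return out


def k_fold_convolution(p, k):
    """k-fold convolution of p with itself, for k >= 1."""
    result = p
    for _ in range(k - 1):
        result = convolve_cyclic(result, p)
    return result


def density_bounds(p, k):
    """Min and max of the k-step law, as a density w.r.t. the uniform measure."""
    n = len(p)
    law = k_fold_convolution(p, k)
    return n * min(law), n * max(law)
```

For the lazy symmetric step `[0.5, 0.25, 0, 0, 0, 0, 0, 0.25]` on $\mathbf{Z}/8\mathbf{Z}$, `density_bounds(step, 1)` has lower bound zero, whereas `density_bounds(step, 64)` gives two values close to 1: only after sufficiently many steps is the density bounded above and below, which is the kind of statement that (5.1) encodes.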
Remark 5.1. The symmetry assumption $Q_N(x) = Q_N(x^{-1})$ simplifies several points, notably the proof of Lemma 5.4 below, but is not necessary provided we make an assumption of the type $|\int_G \log(x) Q_N(x)\,dx| \lesssim 2^{-2N}$ to control the drift of the associated $G$-valued random walk.
Example 5.2. Two common choices for $Q_N$ are the
• Villain (heat kernel) action $Q_N = e^{t\Delta}$ at time $t = 2^{-2N}$, where $\Delta$ is the Laplace-Beltrami operator on $G$, , and we implicitly assume $G$ is a matrix Lie group.
One can check that for every $\beta \ge 1$ there exist $C_l, C_u, C \ge 0$ such that (5.1) and (5.2) hold for all $N \ge 0$ and these two choices of $Q_N$.
The main result of this section is the following Kolmogorov-type criterion. We henceforth fix $N \ge 0$ and let $U$ denote the $\mathfrak{A}^{(N)}$-valued random variable distributed by $\mu_N$.

Theorem 5.3. Let $\beta \ge 2$ and suppose that (5.1) and (5.2) hold. Then for any $q > 2$ and $\alpha < 1 - \frac{6}{\beta}$, there exists $\lambda \ge 0$ depending only on $G$, $\beta$, $q$, such that where the second supremum is taken over all rectangles $r \subset \Lambda_n$, and $X$ denotes the anti-development of $U$ along $r$.
The idea of the proof is to approximate the holonomy $U(\partial r)$ and the anti-development $X$ by pinned random walks, which we then control using rough path theory. We require the following lemma.
Lemma 5.4. Suppose (5.2) holds for some $\beta \ge 2$ and $C \ge 0$. Then for all $q > 2$, there exists $\lambda \ge 1$, depending only on $G$, $\beta$ and $q$, such that for all $M, k \ge 1$

Proof. We first prove the claim for $M = 1$. Let $k \ge 1$ and consider i.i.d. $\mathfrak{g}$-valued random variables $V_1, V_2, \dots, V_k$ equal in law to $\log(Y)$, where $Y \sim Q_N(x)\,dx$. Consider the martingale $(X_j)_{j=0}^{k}$ defined by $X_j \overset{\text{def}}{=} \sum_{i=1}^{j} V_i$ and let $\mathbf{X}$ denote its canonical (Marcus) level-2 rough path lift (see [CF19, Sect. 4]). Then
where $C_1$ depends only on $\beta$, $q$, and where we used the enhanced BDG inequality [CF19, Thm. 4.7] in the first inequality, the power-mean inequality in the second inequality, and (5.2) in the final inequality. Note that trivially $|X|_{q\text{-var}} \le \|\mathbf{X}\|_{q\text{-var}}$. Note also that $e^{V_1} \cdots e^{V_k}$ is the solution to a controlled (Marcus) differential equation driven by $\mathbf{X}$. By the local Lipschitz continuity of the rough path solution map, it follows that $|\log(e^{V_1} \cdots e^{V_k})| \le C_2 \|\mathbf{X}\|_{q\text{-var}}$, where $C_2$ depends only on $G$ and $q$. This proves the claim for $M = 1$.
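For readers less familiar with the $q$-variation appearing in these estimates, the following is a minimal sketch of how the level-1 quantity can be computed for a discrete path such as the partial sums $X_j$. It is a simplification: the path is taken real-valued rather than $\mathfrak{g}$-valued, and the level-2 part of the rough path lift is ignored. The function name is hypothetical.

```python
def q_variation(path, q):
    """q-variation of a finite real-valued path, given as a list of floats.

    Uses the dynamic programme
        best[j] = max over i < j of best[i] + |path[j] - path[i]|**q,
    which maximises the sum of q-th powers of increments over all
    partitions of the index set, and returns the q-th root of the optimum.
    """
    n = len(path)
    best = [0.0] * n
    for j in range(1, n):
        best[j] = max(best[i] + abs(path[j] - path[i]) ** q for i in range(j))
    return best[-1] ** (1.0 / q) if n > 0 else 0.0
```

For $q = 1$ this reduces to the total variation of the path, while for $q > 1$ a monotone path is optimally measured by the single increment from start to end. The quadratic cost in the path length is acceptable for illustration.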
For general $M \ge 1$, observe that taking $k = M$ in the previous case implies that (5.2) holds with $Q_N$ on the LHS replaced by $Q^M_N$, and with $C$ and $2^{-2N}$ on the RHS replaced by $\lambda C$ and $M 2^{-2N}$ respectively ($\lambda$ depending only on $G$, $\beta$, $q$). The conclusion again follows from the previous part by replacing $Q_N$ by $Q^M_N$.

Proof of Theorem 5.3. Let $n \le N$ and consider a rectangle $r \subset \Lambda_n$. We first show that

$$\mathbf{E}\bigl[\,|\log U(\partial r)|^{\beta} + |X|^{\beta}_{q\text{-var}}\bigr] \le \lambda C_l C_u C\, |r|^{\beta/2}, \qquad (5.3)$$

where $\lambda$ depends only on $G$, $q$, $\beta$. It suffices to consider $2 \le n \le N$ and $r = (0, k2^{-n}e_1, 2^{-n}e_2)$ where $k < 2^{n-1}$. Note that the discrete measure $\mu_N$ has a domain Markov property: if $D$ is a simply connected domain of $\Lambda_N$, then, conditioned on the bonds of the boundary, the measure inside $D$ is independent of the measure outside $D$. As a consequence, we can substitute the lattice $\Lambda_N$ by the square $D = [0, \frac{1}{2}]^2 \cap \Lambda_N$ (which contains $r$ by assumption) with prescribed bond variables on the boundary. More precisely, since $U(\partial r)$ and $|X|_{q\text{-var}}$ are functions only of the bond variables inside and on the boundary of $D$, we can write the LHS of (5.3) as where $F(U) = |\log U(\partial r)|^{\beta} + |X|^{\beta}_{q\text{-var}}$.

Suppose first that $n < N$. To facilitate analysis of the integrals, we fix a maximal tree $T \subset B_N$ inside $D$ as follows. We include in $T$ all bonds on the boundary of $D$ except $\bar\alpha \overset{\text{def}}{=} ((\frac{1}{2}, \frac{1}{2} - 2^{-N}), (\frac{1}{2}, \frac{1}{2}))$. We further include all horizontal bonds $2^{-N}((x, y), (x+1, y))$ where either
• $x \in \{0, \dots, 2^{N-1} - 2\}$ and $y = 2^{N-n} + 2m$ for some integer $m \ge 0$ such that $y \in \{2^{N-n}, \dots, 2^{N-1} - 1\}$, or
• $x \in \{1, \dots, 2^{N-1} - 1\}$ and $y = 2^{N-n} + (2m+1)$ for some integer $m \ge 0$ such that $y \in \{2^{N-n}, \dots, 2^{N-1} - 1\}$
and all vertical bonds $2^{-N}((x, y), (x, y+1))$ where either
See Fig. 4 for an example of $T$.
The case $n = N$ follows by similar (even simpler) considerations. The only changes which need to be made are that $T$ has no vertical bonds which are not on the boundary

The final term is bounded above independently of $N$ provided $3 - \beta(1-\alpha)/2 < 0$, i.e., $\alpha < 1 - 6/\beta$.
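The threshold on $\alpha$ is elementary arithmetic; spelled out, using only $\beta > 0$:

```latex
\[
3 - \frac{\beta(1-\alpha)}{2} < 0
\iff \beta(1-\alpha) > 6
\iff 1 - \alpha > \frac{6}{\beta}
\iff \alpha < 1 - \frac{6}{\beta}.
\]
```

In particular, the range of admissible $\alpha > 0$ is non-empty only when $\beta > 6$, and it approaches $(0, 1)$ as $\beta \to \infty$.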
Proof of Theorem 1.1. Combining Theorem 5.3, applied to the heat kernel action from Example 5.2, with Theorem 4.5 shows that for every $N \ge 1$, there exists an $\Omega^{1,(N)}(\mathbf{T}^2, \mathfrak{g})$-valued random variable $A^{(N)}$ for which $(|A^{(N)}|^{(N)}_\alpha)_{N \ge 1}$ is tight for any $\alpha \in (0, 1)$, and such that the associated gauge field induces the discrete YM measure on the lattice $\Lambda_N$. Recall that, by Young integration, the development map $C^{\alpha\text{-Höl}}([0, 1], \mathfrak{g}) \to C^{\alpha\text{-Höl}}([0, 1], G)$ is continuous (locally Lipschitz) for all $\alpha \in (\frac{1}{2}, 1]$. We thus obtain, for any $\alpha \in (\frac{1}{2}, 1)$, the existence of an $\Omega_\alpha$-valued random variable $A$ with the desired properties from Lemma 3.8, Theorem 3.26, and the characterisation of the YM measure in [Lév03, Thm. 2.9.1]. The fact that $A$ has support in $\Omega^1_\alpha$ follows from Proposition 3.20.