Inverse problem for the Yang-Mills equations

We show that a connection can be recovered up to gauge from source-to-solution type data associated with the Yang-Mills equations in the four dimensional Minkowski space. Our proof analyzes the principal symbols of waves generated by suitable nonlinear interactions and reduces the inversion to a broken non-abelian light ray transform. The principal symbol analysis of the interaction is based on a delicate calculation that involves the structure of the Lie algebra under consideration and the final result holds for any compact Lie group.


Introduction
The purpose of this paper is to solve an inverse problem associated with Yang-Mills theories in Minkowski space $\mathbb{R}^{1+3}$. The objective is the recovery of the gauge field $A$ on a causal domain where waves can propagate and return, given data on a small observation set inside the domain.
The starting point of Yang-Mills theories is a compact Lie group G with Lie algebra g. Without loss of generality, we shall think of G as a matrix Lie group and hence g will be a matrix Lie algebra. We assume also that G is connected and endowed with a bi-invariant metric, or equivalently, an inner product on g invariant under the adjoint action.
In their most general formulation, Yang-Mills theories take place in the adjoint bundle of a principal bundle with structure group $G$ over space-time. Since our region of interest in space-time will be a contractible set $M \subset \mathbb{R}^{1+3}$, we might as well assume from the start that we are working with the trivial adjoint bundle $M \times \mathfrak{g}$. The main object of the theory is a gauge field $A$, also known as Yang-Mills potential. In geometric language this is simply a connection $A \in C^\infty(M; T^*M \otimes \mathfrak{g}) = \Omega^1(M; \mathfrak{g})$, that is, a smooth $\mathfrak{g}$-valued 1-form. In general, we denote the set of $\mathfrak{g}$-valued forms of degree $k$ by $\Omega^k = \Omega^k(M; \mathfrak{g})$.
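There is a natural pairing $[\cdot,\cdot] : \Omega^p \otimes \Omega^q \to \Omega^{p+q}$ given in our situation as
$$[\omega, \eta] = \omega \wedge \eta - (-1)^{pq}\, \eta \wedge \omega,$$
where the wedge product of $\mathfrak{g}$-valued forms is understood using matrix multiplication in $\mathfrak{g}$. Using the pairing we define a covariant derivative $d_A : \Omega^k(M; \mathfrak{g}) \to \Omega^{k+1}(M; \mathfrak{g})$ by
$$d_A \omega = d\omega + [A, \omega].$$
Given a gauge field $A$, we can associate to it its field strength or curvature. This is defined as
$$F_A = dA + \tfrac{1}{2}[A, A] = dA + A \wedge A \in \Omega^2,$$
and it always satisfies the Bianchi identity $d_A F_A = 0$. Moreover, $d_A^2 \omega = [F_A, \omega]$ for any $\omega \in \Omega^k$.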
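As a sanity check on the identity $d_A^2\omega = [F_A,\omega]$, the case $k = 0$ can be verified by hand. The following is a sketch under the assumption that the covariant derivative is $d_A\omega = d\omega + [A,\omega]$ and the curvature is $F_A = dA + A\wedge A$ (the standard conventions); the symbol $\varphi$ for a $\mathfrak{g}$-valued function is introduced here only for illustration.

```latex
% Assumed conventions: d_A\omega = d\omega + [A,\omega],  F_A = dA + A \wedge A.
% \varphi is a g-valued 0-form, so [A,\varphi] = A\varphi - \varphi A.
\begin{align*}
d_A(d_A\varphi)
  &= d\bigl(d\varphi + [A,\varphi]\bigr) + \bigl[A,\ d\varphi + [A,\varphi]\bigr] \\
  &= d[A,\varphi] + [A, d\varphi] + [A,[A,\varphi]], \\
d[A,\varphi] + [A, d\varphi]
  &= (dA)\,\varphi - \varphi\,(dA) = [dA, \varphi], \\
[A,[A,\varphi]]
  &= (A\wedge A)\,\varphi - \varphi\,(A\wedge A) = [A\wedge A, \varphi], \\
d_A^2\varphi
  &= [\,dA + A\wedge A,\ \varphi\,] = [F_A, \varphi].
\end{align*}
```

In the second step the mixed terms $A\wedge d\varphi$ and $d\varphi\wedge A$ cancel between $d[A,\varphi]$ and $[A,d\varphi]$, which is why only the curvature survives.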
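1.1. Yang-Mills equations. The Yang-Mills equations arise as the Euler-Lagrange equations for the Yang-Mills action functional, which we now recall. The inner product on $\mathfrak{g}$ naturally induces a pairing $\langle\cdot,\cdot\rangle_{\mathrm{Ad}} : \Omega^p(M; \mathfrak{g}) \times \Omega^q(M; \mathfrak{g}) \to \Omega^{p+q}(M)$.
If $\star$ denotes the Hodge star operator of the Minkowski metric, the Yang-Mills functional is given by
$$S_{\mathrm{YM}}(A) = \int_M \langle F_A, \star F_A\rangle_{\mathrm{Ad}}.$$
If $G$ is a subgroup of the unitary group, we may take as adjoint invariant inner product $-\operatorname{trace}(XY)$, where $X, Y$ are matrices in $\mathfrak{g}$, and thus $S_{\mathrm{YM}}(A)$ may also be written as a constant multiple of
$$\int_M \operatorname{trace}(F_{\alpha\beta}F^{\alpha\beta})\, dt\, dx,$$
as is frequently found in the physics literature. From this functional one easily derives the Yang-Mills equations
$$d_A^* F_A = 0, \tag{1}$$
where $d_A^*$ is the formal adjoint of $d_A$. These equations are gauge invariant: if $A$ solves (1) and $U$ belongs to the gauge group $\mathcal{G}(M) = C^\infty(M; G)$, then
$$B = U^{-1} dU + U^{-1} A U \tag{2}$$
also solves (1). This property can be easily deduced from the fact that the action $S_{\mathrm{YM}}$ is gauge invariant.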
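The derivation of the Yang-Mills equations from the action can be sketched as a first variation. The block below assumes the conventions $d_A\omega = d\omega + [A,\omega]$ and $F_A = dA + \tfrac12[A,A]$, takes $d_A^*$ to be the formal adjoint of $d_A$ with respect to the pairing $\int_M\langle\cdot,\star\,\cdot\rangle_{\mathrm{Ad}}$, and uses a compactly supported variation $a \in \Omega^1(M;\mathfrak{g})$, a symbol introduced here only for illustration.

```latex
% Assumed: d_A\omega = d\omega + [A,\omega],  F_A = dA + \tfrac12 [A,A],
% a is a compactly supported g-valued 1-form, s a real parameter.
\begin{align*}
F_{A+sa} &= F_A + s\, d_A a + \tfrac{s^2}{2}[a,a], \\
\frac{d}{ds}\Big|_{s=0} S_{\mathrm{YM}}(A+sa)
  &= \int_M \langle d_A a, \star F_A\rangle_{\mathrm{Ad}}
   + \int_M \langle F_A, \star\, d_A a\rangle_{\mathrm{Ad}}
   = 2\int_M \langle d_A a, \star F_A\rangle_{\mathrm{Ad}} \\
  &= 2\int_M \langle a, \star\, d_A^* F_A\rangle_{\mathrm{Ad}} .
\end{align*}
% The variation vanishes for every such a exactly when d_A^* F_A = 0.
```

The middle equality uses the symmetry $\langle\alpha,\star\beta\rangle_{\mathrm{Ad}} = \langle\beta,\star\alpha\rangle_{\mathrm{Ad}}$ for forms of equal degree.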
1.2. Main result. We will consider an inverse problem for the Yang-Mills equations in the causal diamond $D = \{(t,x) \in \mathbb{R}^{1+3} : |x| \le t + 1,\ |x| \le 1 - t\}$. For a fixed $0 < \epsilon_0 < 1$, the data will be given on the subset $\mho = \{(t,x) : (t,x) \text{ is in the interior of } D \text{ and } |x| < \epsilon_0\}$.
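We say that $A \in \Omega^1(D; \mathfrak{g})$ is a background connection if it satisfies the Yang-Mills equations (1) in $D$. Due to the gauge invariance, the determination of a background connection on $D$ is considered only up to the action of the following pointed gauge group:
$$\mathcal{G}^0(D, p) = \{U \in \mathcal{G}(D) : U(p) = \mathrm{id}\},$$
where $p$ is a fixed point. The use of the pointed gauge group rather than the full gauge group $\mathcal{G}(D)$ is technical in nature, as we shall explain below; see the discussion after Lemma 6. Both gauge groups are clearly related by $\mathcal{G}(D)/\mathcal{G}^0(D, p) = G$.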
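For $A, B \in C^k(D; T^*D \otimes \mathfrak{g})$, with $k \in \mathbb{N}$, we say that $A \sim B$ in $D$ if there is $U \in \mathcal{G}^0(D, p)$ such that (2) holds in $D$. Moreover, we write
$$\partial^- D = \{(t,x) \in D : |x| = t + 1\}$$
and say that $A \sim B$ near $\partial^- D$ if there are $U \in \mathcal{G}^0(D, p)$ and a neighbourhood $\mathcal{U} \subset D$ of $\partial^- D$ such that (2) holds in $\mathcal{U} \cap D$. The sets $D$, $\mho$ and $\partial^- D$ are visualized in Figure 1.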
We let $A$ be a background connection, and consider the associated data set $\mathcal{D}_A$. Let us remark that we could consider the source-to-solution map given in Proposition 4 instead of the more abstract data set $\mathcal{D}_A$. We prefer to formulate our main result using $\mathcal{D}_A$ since the definition of the source-to-solution map is technical, requiring suitable gauge fixing among other things. In fact, it is precisely in the proof of Proposition 4 that the pointed gauge group is needed. Nevertheless, intuitively, it is helpful to think of the data set as that produced by an observer creating sources $J$ supported in $\mho$ and observing solutions $V$ to $d_V^* F_V = J$ in $\mho$. The data set $\mathcal{D}_A$ could also be reformulated in terms of the pairs $(J, V|_{\mho})$ satisfying $d_V^* F_V = J$, with $J$ supported in $\mho$. This formulation, while somewhat redundant as $J = d_V^* F_V$ can be computed given $V|_{\mho}$, suggests viewing $\mathcal{D}_A$ informally as the graph of the map taking $J$ to $V|_{\mho}$. However, we reiterate that defining such a map requires care. In addition to gauge fixing, we need to take into account the compatibility condition $d_V^* J = 0$ that every source must satisfy, see Lemma 2. Our abstract formulation of the data set $\mathcal{D}_A$ bypasses these problems while incorporating the natural gauge invariance of the theory.
We are now ready to formulate our main result.
1.3. Outline of the proof of Theorem 1. The objective is to reduce the proof of the theorem to an inversion result for a broken non-abelian light ray transform as in [7]. The broken light ray transform that arises in this paper is the one related to the adjoint representation, given the natural habitat of the Yang-Mills theories. In [7] we studied the broken light ray transform associated with the fundamental representation, so our first task is to relate the two.
To go from the data set $\mathcal{D}_A$ to the broken non-abelian light ray transform we follow the template laid out in [7], where a considerably simpler wave equation with cubic non-linearity was studied. The first step is then to process the abstract data set and convert it into a manageable source-to-solution map, and this already brings the question of gauge fixing to the forefront. The construction of the source-to-solution map uses two types of gauges: the temporal gauge and the relative Lorenz gauge. The temporal gauge is easy to implement as it involves solving a linear matrix ODE to make the time component of a Yang-Mills potential $A$ vanish, that is, $A_0 = 0$. This gauge is particularly suited to proving uniqueness results, cf. Proposition 2 below.
It is important to remark that uniqueness does really depend on the shape of the set where the connections satisfy the Yang-Mills equations. The causal diamond $D$ has the special feature that perturbations cannot propagate into it through the top boundary $|x| = 1 - t$, whereas the bottom boundary is under control due to the assumed gauge equivalence near $\partial^- D$. In particular, even if a background connection $A$ satisfies the Yang-Mills equations on a larger set than $D$, we do not expect to be able to recover it outside $D$ given data on $\mho$. Moreover, it does not appear to be possible to prove Theorem 1 using presently known unique continuation results, as discussed in more detail below.
A connection $V$ is said to be in relative Lorenz gauge with respect to the background connection $A$ if $d_A^*(V - A) = 0$. In this gauge the difference $V - A$ satisfies a semilinear wave equation where the leading part is given by the connection wave operator $\Box_A$. This is very helpful for solving the forward problem and for the microlocal analysis used to extract information from the source-to-solution map.
Following [7], the idea is to consider the non-linear interaction of three singular waves produced by sources which are conormal distributions. We carefully track the principal symbol produced by the non-linear interaction and extract from that the non-abelian broken light ray transform. This requires a delicate calculation unlike anything in the previous literature, in which the structure of the Lie algebra g comes into consideration. This is the technical core of the proof, and perhaps one of the most innovative aspects of the paper. After this computation, contained in Section 8.2, there is one further hurdle to overcome: to use the source-to-solution map we must revert back to the temporal gauge and check that no information is lost in the process.
1.4. Discussion and comparison with previous literature. It is tempting to think that a result like Theorem 1 can be obtained from a unique continuation principle. It must be stressed that unique continuation for linear wave equations with time-dependent coefficients is simply false as there are counterexamples [1]. Although the difference of two solutions to the Yang-Mills equations in the Lorenz gauge satisfies a linear wave equation (with coefficients depending on both the solutions), due to unique continuation failing, our inverse problem is not "immediately solvable" and hence a different approach is needed. We mention that an inverse problem for Yang-Mills connections on a Riemannian manifold was studied in [6]. The proofs there are based on unique continuation for elliptic systems, however, the elliptic case is very different from the hyperbolic one.
This paper sits firmly within the program, initiated in [7], that is motivated by the Yang-Mills-Higgs system. In addition to the Yang-Mills potential $A$, a Higgs field $\Phi \in C^\infty(M, \mathfrak{g})$ is present in this system. The equations for the pair of fields $(A, \Phi)$ are given by (4) and (5), where $V'$ is the derivative of a smooth function $V : [0, \infty) \to \mathbb{R}$. More generally, we can consider these equations when $\Phi$ is a section of an associated bundle determined by a given representation of $G$. The focus of [7] was the recovery of $A$ via the second equation (5), when $V$ is assumed to be a quadratic potential (the most popular choice in Yang-Mills-Higgs theories): this turns (5) into a wave equation with a cubic nonlinearity. The present paper focuses on the first equation (4), more precisely on the pure Yang-Mills case where $\Phi = 0$. There are two substantial differences between [7] and the present paper. First, when $A$ is fixed, the second equation (5) is no longer gauge invariant, and hence the construction of the source-to-solution map in [7] does not require gauge fixing. Second, the quadratic potential $V$ leads to a particularly simple non-linear structure in [7], and the resulting analysis of principal symbols is much more straightforward than in the present paper.
As already mentioned above, we consider the non-linear interactions of three singular waves. Interaction of singular waves has been studied outside the context of inverse problems. In particular, the wave front set of a triple cross-derivative has been studied in the case of the 1 + 2-dimensional Minkowski space by Rauch and Reed [39]. The references [3,24,34,35,40] have results of similar nature. The use of non-linear interactions in the context of inverse problems was initiated in [29], where the wave front set resulting from the interaction of four singular waves was studied. The same approach was used for the Einstein equations in [28], and subsequently in [32,46], in some ways the closest previous results to ours. For a review of this approach, see [30]. We observed in our above mentioned work [7] that it is sufficient to consider interactions of three singular waves, simplifying the analysis. Three-fold interactions are used in the present paper.
Non-linearities allow solving inverse problems that are open for the corresponding linearized equations. In particular, the inverse problem for the linearized Yang-Mills equation, see e.g. (32) below (where some lower order terms are discarded), is open.
The only known results are in the case G = U (1), see [41,12], and these results impose convexity assumptions not satisfied by the geometric setting of Figure 1. The same is true for the recovery of zeroth order terms, solved with and without convexity assumptions for certain scalar linear [43] and non-linear wave equations [14], respectively.
We mention that non-linear interactions have also been used to recover non-linear terms for scalar wave equations [33], scalar elliptic equations [13,31], and scalar real principal type equations [38]. In these four works, non-linear terms do not contain any derivatives, contrary to the Einstein and Yang-Mills equations. Nonlinear interactions involving derivatives have also been studied in the context of scalar wave equations [47] and elastodynamics [10]. In addition, inverse problems have been studied for various non-linear equations using methods originally developed in the context of linear elliptic equations. In particular, the method of complex geometrical optics originating from [45], and importantly extended by [37,27], was first applied to an inverse coefficient determination problem for a non-linear parabolic equation [21] and subsequently to several other inverse problems [2,5,22,23,25,42,44].
There are numerous analogies between the problem studied here and that of the Einstein equations considered in [28]. For starters, both problems have gauges: in the Einstein case the gauge group is the diffeomorphism group. The role of the relative Lorenz gauge is played by wave coordinates and one could also say that the Fermi coordinates used in [28] are the analogue of the temporal gauge. Both problems have a compatibility condition for the sources: the Einstein tensor has zero divergence and Yang-Mills has d * A d * A F A = 0. However, there are important differences and we want to stress those, since they are essential in resolving the inverse problem in the different contexts. After suitable gauge fixing and linearization, both the Einstein and Yang-Mills equations reduce to a linear wave equation. The unknown Lorentzian metric appears in the leading order terms of the equation in the former case while the background gauge field A features at the subprincipal level in the latter case. The Lorentzian metric affects the Lagrangian geometry of the parametrix for the wave equation but the effect of A is visible only in the principal symbol of the parametrix. Thus the need for a symbol calculation in the present paper that takes into consideration the structure of the Lie algebra g. Finally, the two inverse problems reduce to very different purely geometric problems. In our case, we read the broken non-abelian light ray transform from certain principal symbols, whereas in the Einstein case, the so-called light observation sets are obtained by analysing the wave front sets of suitable solutions, see [29,17] for the corresponding geometric problem.
1.5. Outline of the paper. Section 2 introduces parallel transport in both the principal and the adjoint representation and reduces Theorem 1 to inversion of the broken non-abelian light ray transform via [7, Proposition 2] in the case that G has finite centre. Section 3 discusses the Yang-Mills equations with a source. Section 4 introduces the relative Lorenz gauge and the temporal gauge, thus setting the scene for the source-to-solution map. The latter is discussed in Section 5 where the important Proposition 4 is proved. Section 6 computes the equations for the triple cross-derivative when three sources are introduced. Section 7 supplies the tools from microlocal analysis needed to compute the symbol of the triple interaction, and the latter is computed in Section 8. Section 9 proves a result about the structure of Lie algebras with trivial centre, and completes the proof of Theorem 1 in the case that G has finite centre. The final Section 10 contains the proof of Theorem 1 in the general case.
There are three appendices, the first of which derives explicit formulas in coordinates, for example, for $d_A^* F_A$. The second appendix discusses the direct problem for the Yang-Mills equations, and the last one gives an elementary alternative to the result in Section 9 in the case that $\mathfrak{g} = \mathfrak{su}(n)$ with $n \ge 2$.
Acknowledgements. ML was supported by Academy of Finland grants 320113 and 312119. LO was supported by EPSRC grants EP/P01593X/1 and EP/R002207/1, XC and GPP were supported by EPSRC grant EP/R001898/1, and XC was supported by NSFC grant 11701094. LO thanks Matthew Towers for discussions of Lie algebras.

Parallel transport
We will explain in Section 10 how the case of an arbitrary compact, connected Lie group G can be reduced to the case that G has finite centre, that is, the set Z(G) = {z ∈ G : zh = hz for all h ∈ G} is finite. In this case, the proof of Theorem 1 will ultimately boil down to inversion of a non-abelian broken light ray transform. This transform is the composition of two parallel transports, and we begin by defining the parallel transport used in the paper.
For the moment we may let $(M, g)$ be any Lorentzian manifold, and $G$ any compact matrix Lie group with Lie algebra $\mathfrak{g}$. However, we will work with trivial bundles for simplicity. Let $A \in \Omega^1(M; \mathfrak{g})$ be a connection and let us first define the parallel transport on the principal bundle $M \times G$ with respect to $A$: the parallel transport $U^A_\gamma$ along a curve $\gamma : [0, T] \to M$ is given by $U^A_\gamma = U(T)$ where $U$ is the solution of the ordinary differential equation
$$\dot U + \langle A, \dot\gamma\rangle U = 0, \qquad U(0) = \mathrm{id}. \tag{6}$$
Here $\langle\cdot,\cdot\rangle$ is the pairing between covectors and vectors.
In general, if $\mathcal{V}$ is a vector space and $\rho : G \to GL(\mathcal{V})$ is a linear representation, the parallel transport on the associated vector bundle $M \times \mathcal{V}$ is defined by $P^{A,\rho}_\gamma = \rho(U^A_\gamma)$. Two representations will be of importance to us. First, when $G \subset GL(\mathbb{C}^n)$ and $\mathcal{V} = \mathbb{C}^n$ we have the representation given by $\rho = \mathrm{id}$. In other words, $P^{A,\mathrm{id}}_\gamma = U^A_\gamma$. We call this the principal representation.
Second, when $\mathcal{V} = \mathfrak{g}$ we have the adjoint representation $\rho = \mathrm{Ad}$ where $\mathrm{Ad}(h)$, $h \in G$, is typically written $\mathrm{Ad}_h$ and defined by $\mathrm{Ad}_h b = h b h^{-1}$ for $b \in \mathfrak{g}$. We have $P^{A,\mathrm{Ad}}_\gamma b = U(T)\, b\, U(T)^{-1}$, where $U$ is the solution of (6).
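It may help to see the differential equation satisfied by the adjoint parallel transport itself. The sketch below assumes that (6) has the form $\dot U + \langle A,\dot\gamma\rangle U = 0$, $U(0) = \mathrm{id}$ (the sign convention matching $d_A = d + [A,\cdot\,]$); the abbreviations $a(t)$ and $b(t)$ are introduced here only for illustration.

```latex
% Assumed form of (6): \dot U + \langle A,\dot\gamma\rangle U = 0,  U(0) = id.
% Abbreviations: a(t) = \langle A(\gamma(t)), \dot\gamma(t)\rangle \in g,
%                b(t) = U(t)\, b\, U(t)^{-1} for a fixed b \in g.
\begin{align*}
\frac{d}{dt}\,U^{-1} &= -U^{-1}\dot U\, U^{-1} = U^{-1} a, \\
\dot b(t) &= \dot U\, b\, U^{-1} + U\, b\, \frac{d}{dt}U^{-1}
           = -a(t)\, b(t) + b(t)\, a(t), \\
\dot b + [\langle A,\dot\gamma\rangle,\, b] &= 0, \qquad b(0) = b .
\end{align*}
```

Under this assumption the adjoint transport solves the same kind of linear ODE as (6), with left multiplication by $\langle A,\dot\gamma\rangle$ replaced by the commutator.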
When M is a convex subset of Minkowski space R 1+3 and x, y ∈ M , there is a unique geodesic γ from x to y, up to reparametrization. The parallel transport U A γ does not depend on the parametrization of γ, and we write simply P A,ρ y←x = P A,ρ γ in this case.
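We are now ready to define the non-abelian broken light ray transforms used in the proof of Theorem 1. We write
$$L = \{(x, y) \in D^2 : \text{there is a lightlike geodesic joining } x \text{ and } y\},$$
$$S^+(\mho) = \{(x, y, z) \in D^3 : (x, y), (y, z) \in L,\ x < y < z,\ x, z \in \mho,\ y \notin \mho\},$$
where $x < y$ means that there is a future pointing causal curve from $x$ to $y$. (For $(x, y) \in L$, we have $x < y$ if and only if the time coordinate of $y - x$ is strictly positive.) Define
$$S^{A,\rho}_{z\leftarrow y\leftarrow x} = P^{A,\rho}_{z\leftarrow y} P^{A,\rho}_{y\leftarrow x}, \qquad (x, y, z) \in S^+(\mho).$$
We will reduce the transform $S^{A,\mathrm{Ad}}_{z\leftarrow y\leftarrow x}$ to $S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x}$ as follows:
Lemma 1. Suppose that a compact, connected matrix Lie group $G$ has finite centre and let $A, B \in \Omega^1(D; \mathfrak{g})$. If $S^{A,\mathrm{Ad}}_{z\leftarrow y\leftarrow x} = S^{B,\mathrm{Ad}}_{z\leftarrow y\leftarrow x}$ for all $(x, y, z) \in S^+(\mho)$ then $S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x} = S^{B,\mathrm{id}}_{z\leftarrow y\leftarrow x}$ for all $(x, y, z) \in S^+(\mho)$.
Proof. Let $(x, y, z) \in S^+(\mho)$ and $b \in \mathfrak{g}$. Since $S^{A,\mathrm{Ad}}_{z\leftarrow y\leftarrow x} b = S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x}\, b\, (S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x})^{-1}$, and similarly for $B$, we have $ub = bu$ where
$$u = (S^{B,\mathrm{id}}_{z\leftarrow y\leftarrow x})^{-1} S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x}.$$
As this holds for all $b \in \mathfrak{g}$ we see that $u$ is in the centre $Z(G)$. For the convenience of the reader we recall the proof of this well-known fact. Let $h \in G$. As $G$ is connected, there is a path $H : [0, 1] \to G$ satisfying $H(0) = \mathrm{id}$ and $H(1) = h$. Define the path $K(s) = H(s)\, u^{-1} H(s)^{-1}$. Then
$$\dot K = H\,(b\, u^{-1} - u^{-1} b)\, H^{-1} = 0,$$
where we used the fact that $b = H^{-1}\dot H \in \mathfrak{g}$ commutes with $u^{-1}$. We conclude that $h u^{-1} h^{-1} = K(1) = K(0) = u^{-1}$, that is, $uh = hu$. Now $u \in Z(G)$ depends continuously on $x$, $y$ and $z$, and $u \to \mathrm{id}$ when $y \to x$ and $z \to x$. As $Z(G)$ is finite, we have $u = \mathrm{id}$, and therefore $S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x} = S^{B,\mathrm{id}}_{z\leftarrow y\leftarrow x}$.
We have previously inverted the transform $S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x}$ in the case of the unitary group $G = U(n)$, see Proposition 2 of [7], where a slightly different choice of $\mho$ and $D$ is used. However, the proof works for any matrix Lie group, and also for the present choice of $\mho$ and $D$. Moreover, the gauge $u$ defined in Lemma 3 of [7] is smooth up to $\partial D$ whenever the two connections $A$ and $B$ are smooth up to $\partial D$.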
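The role of the finite centre assumption in Lemma 1 can be seen from the simplest abelian example. The following sketch takes $G = U(1)$, assumes that (6) has the form $\dot U + \langle A,\dot\gamma\rangle U = 0$, $U(0) = \mathrm{id}$, and writes $t_x$, $t_z$ for the time coordinates of $x$ and $z$; the constant $c$ is introduced only for illustration.

```latex
% Illustration: G = U(1), g = i\mathbb{R}; here Z(G) = U(1) is infinite.
% Assumed form of (6): \dot U + \langle A,\dot\gamma\rangle U = 0,  U(0) = id.
\begin{align*}
&\mathrm{Ad}_h b = h\, b\, h^{-1} = b
   \quad\Longrightarrow\quad
   S^{A,\mathrm{Ad}}_{z\leftarrow y\leftarrow x} = \mathrm{id}
   \ \text{ for every connection } A, \\
&A = 0: \qquad S^{A,\mathrm{id}}_{z\leftarrow y\leftarrow x} = 1, \\
&B = i c\, dt,\ c \in \mathbb{R}: \qquad
   S^{B,\mathrm{id}}_{z\leftarrow y\leftarrow x} = e^{-i c\,(t_z - t_x)} .
\end{align*}
% The adjoint transforms of A and B agree, but the principal ones differ
% whenever c (t_z - t_x) is not an integer multiple of 2\pi.
```

So the conclusion of Lemma 1 cannot hold without a restriction on the centre, which is consistent with the general case being treated separately in Section 10.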
Until treating the case of an arbitrary compact, connected Lie group in Section 10, we will focus on proving Proposition 1. Under the additional assumption that G has finite centre, Theorem 1 then follows from Proposition 1, Lemma 1 and the proof of Proposition 2 in [7].

Yang-Mills equations with a source
In this section we let $(M, g)$ be any oriented Lorentzian manifold, and consider the Yang-Mills equations with a source
$$d_V^* F_V = J. \tag{9}$$
Here the source $J$ cannot be arbitrarily chosen but must obey the compatibility condition
$$d_V^* J = 0 \tag{10}$$
due to the following well-known lemma. We give a proof for the convenience of the reader.
Lemma 2. Let $V \in \Omega^1(M; \mathfrak{g})$. Then $d_V^* d_V^* F_V = 0$, and the Yang-Mills equations with a source (9) imply the compatibility condition (10).
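Proof. Since $d_V^2 \omega = [F_V, \omega]$, the expression $d_V^* d_V^* F_V$ coincides, up to sign and an application of the Hodge star, with $[F_V, \star F_V]$. So it is enough to prove that $[F_V, \star F_V] = 0$. But this is a purely algebraic fact that holds for any $\omega \in \Omega^2(M; \mathfrak{g})$, that is, $[\omega, \star\omega] = 0$. To check this, write $\omega = \omega_{ij}\, dx^i \wedge dx^j$ and compute in coordinates.
The next lemma, proven again for convenience, implies that the source in (9) changes to $U^{-1} J U$ when a gauge transformation $U \in C^\infty(M, G)$ acts on $V$. We use the shorthand notation $B = U \cdot A$ for (2).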
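Lemma 3. Let $U \in C^\infty(M, G)$ and $A \in \Omega^1(M; \mathfrak{g})$, and write $B = U \cdot A$. Then $d_B^* F_B = U^{-1} (d_A^* F_A)\, U$.
Proof. By assumption $B = U^{-1} dU + U^{-1} A U$. A direct calculation from the definitions shows that $F_B = U^{-1} F_A U$, and the claim follows by applying $d_B^*$.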
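The direct calculation behind the transformation rule of the curvature can be spelled out as follows. This is a sketch assuming the curvature convention $F_A = dA + A\wedge A$ and the gauge action $B = U^{-1}dU + U^{-1}AU$ quoted in the proof.

```latex
% Assumed: F_A = dA + A \wedge A,  B = U^{-1} dU + U^{-1} A U,  d(U^{-1}) = -U^{-1}\, dU\, U^{-1}.
\begin{align*}
dB &= -U^{-1} dU \wedge U^{-1} dU \;-\; U^{-1} dU \wedge U^{-1} A U
      \;+\; U^{-1} (dA)\, U \;-\; U^{-1} A \wedge dU, \\
B \wedge B &= U^{-1} dU \wedge U^{-1} dU \;+\; U^{-1} dU \wedge U^{-1} A U
      \;+\; U^{-1} A \wedge dU \;+\; U^{-1} (A \wedge A)\, U, \\
F_B &= dB + B \wedge B = U^{-1} (dA + A \wedge A)\, U = U^{-1} F_A\, U .
\end{align*}
```

All terms containing $dU$ cancel in pairs, so the curvature transforms by conjugation only, and the same conjugation rule is then inherited by the source $J = d_V^* F_V$.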

Gauge fixing
Gauge fixing is a mathematical procedure for coping with redundant degrees of freedom in field variables. Our work uses two gauges, namely the temporal gauge and the relative Lorenz gauge. While these are typical gauge choices, we will give below a self-contained presentation of certain, perhaps less commonly used, properties of these gauges.
For a connection V ∈ Ω 1 (D; g) we define a connection T (V ) in temporal gauge by and We shall prove the following uniqueness result: Suppose, furthermore, that both A and B are in the temporal gauge. Then U does not depend on t, and A = U · B in D.
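The linear matrix ODE behind the temporal gauge can be read off from the gauge action. The following sketch assumes the action (2) in the form $U \cdot V = U^{-1} dU + U^{-1} V U$; the initial surface on which $U$ is normalized is part of the definition of $T(V)$ and is not reproduced here.

```latex
% Assumed gauge action as in (2): U \cdot V = U^{-1} dU + U^{-1} V U.
% V_0 denotes the time component of V.
\begin{align*}
(U \cdot V)_0 &= U^{-1} \partial_t U + U^{-1} V_0\, U, \\
(U \cdot V)_0 = 0 \quad &\Longleftrightarrow \quad \partial_t U = -V_0\, U .
\end{align*}
% For each fixed x this is a linear ODE in t for the matrix U(t,x);
% since V_0 takes values in g, a solution starting in G stays in G.
```

Once initial values for $U$ are fixed on a surface transversal to $\partial_t$, the solution is unique, which is why the temporal gauge leaves only time-independent gauge transformations free.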

Reduced equations.
We follow a reduction given in [9]. Suppose that a connection A ∈ Ω 1 (M ; g) is in temporal gauge and write d * For the convenience of the reader, we give a proof of the following formula, see Lemma 12 in Appendix A, Here, and throughout the paper, indices are raised and lowered by using the Minkowski metric. Taking β = 0 we get the constraint equation with a = 1, 2, 3, and taking β = j = 1, 2, 3 we get Here ∂ x A = (∂ 1 A, ∂ 2 A, ∂ 3 A) andÑ j contains the terms that are of order one and zero,Ñ In the remainder of this section, we will use systematically Greek letters for indices over 0, 1, 2, 3 and Latin letters for 1, 2, 3.
We differentiate (15) using ∂ j and (16) using ∂ 0 , to obtain Substituting the first equation to the second one gives where we have written and We call (17) Hence if A andÃ satisfy (17) with the same J, then the difference A −Ã satisfies a linear equation of the form where X j , j = 1, 2, are first order differential operators in the x 1 , x 2 and x 3 variables, with coefficients that depend on A andÃ, and whence also on the x 0 variable. Writing Due to its time-independence, U is well-defined and smooth in whole D and U = id in . We defineÃ = U · B and proceed to show that A =Ã in D.
As $\tilde A$ is gauge equivalent to $B$, the Yang-Mills equations $d_{\tilde A}^* F_{\tilde A} = 0$ hold in $D \setminus \mho$. As $U = \mathrm{id}$ in $\mho$, we have $\tilde A = B$ in $\mho$. Therefore $d_{\tilde A}^* F_{\tilde A} = d_A^* F_A$ in $\mho$. As $U$ does not depend on $t$, we see that $\tilde A_0 = 0$. Hence $A$ and $\tilde A$ are two solutions to the reduced Yang-Mills equations (17), with the same $J$, and the difference $A - \tilde A$ satisfies (19). As they also coincide near $\partial^- D$, Lemma 14 in Appendix B implies that $A = \tilde A$ in $D$.

4.2. Relative Lorenz gauge. For a moment we may let $(M, g)$ be any oriented Lorentzian manifold of even dimension. Consider two connections $A$ and $V$ on $M$ solving the Yang-Mills equations without a source (1) and with a source (9), respectively.
We will rewrite the latter equation in terms of $W = V - A$. Directly from the definition of curvature one expands $F_V$ in terms of $F_A$ and $W$, which leads to (20) and then to the Yang-Mills equations (21) written in terms of $W$. We say that $V \in \Omega^1(M; \mathfrak{g})$ is in the Lorenz gauge relative to a background connection $A$ if $d_A^* W = 0$. In this gauge, (21) becomes the semilinear wave equation (23), whose leading part is $\Box_A = d_A d_A^* + d_A^* d_A$, the connection wave operator. The semilinear wave equation (23), together with suitable initial conditions, is solvable when the source $J$ is small and smooth enough, see, for example, (the proof of) Theorem 6 in [26]. However, its solution $W$ solves the actual Yang-Mills equations (21) if and only if $d_A d_A^* W = 0$. Recall also that if $W$ solves (21), or equivalently (9), then $J$ satisfies the compatibility condition (10). We will therefore study the system combining (10) and (23). Observe that (10) is equivalent with an equation (24) relating $\partial_t J_0$ to the components $J_j$, where $j = 1, 2, 3$. This can be viewed as an ordinary differential equation for $J_0$.
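The passage from the Yang-Mills equations with a source to a semilinear wave equation can be sketched as follows. The block assumes the conventions $d_A\omega = d\omega + [A,\omega]$ and $F_A = dA + \tfrac12[A,A]$, and uses $N(W)$ as a hypothetical name for the collection of lower order and nonlinear terms, whose explicit form is not reproduced here.

```latex
% Assumed: d_A\omega = d\omega + [A,\omega],  F_A = dA + \tfrac12 [A,A],  W = V - A.
% N(W): hypothetical shorthand for terms containing at most first derivatives of W
%       (linear zeroth order terms involving F_A, plus quadratic and cubic terms in W).
\begin{align*}
F_V &= d(A+W) + \tfrac12 [A+W,\, A+W] = F_A + d_A W + \tfrac12 [W, W], \\
d_V^* F_V &= d_A^* F_A + d_A^* d_A W + N(W), \\
d_V^* F_V = J,\ \ d_A^* F_A = 0,\ \ d_A^* W = 0
  \quad &\Longrightarrow \quad
  (d_A^* d_A + d_A d_A^*)\, W + N(W) = J .
\end{align*}
```

Adding the term $d_A d_A^* W$, which vanishes in the relative Lorenz gauge, is what turns the degenerate operator $d_A^* d_A$ into a wave operator; conversely, a solution of the wave equation solves the original system only when $d_A d_A^* W = 0$.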
We begin with an uniqueness result that is similar to Proposition 2. For r > 0 and x ∈ R 1+3 we define the rescaled and translated diamond Lemma 4. Let r > 0 and x ∈ R 1+3 and writeD = D(x, r). Let A ∈ Ω 1 (D, g) and and that the spatial parts of J (1) and J (2) of coincide onD, that is, Proof. Pseudolinearization analogous to that in Section 4.1.2 shows that the difference (W (1) − W (2) , J (1) − J (2) ) solves a system of the form (65) in Appendix B with f 1 = 0 and f 2 = 0. The coefficients of this system depend on W ( ) , J ( ) and they satisfy the assumptions of Lemma 14 in Appendix B. Lemma 14 is formulated for D rather than forD, however, the form of the system (65) is invariant under a rescaling and translation. Therefore Lemma 14 holds also forD and we conclude by applying it.
We will now turn to existence of solutions to the Yang-Mills equations. It is convenient to work in the cylinder $M = (-2, 2) \times \mathbb{R}^3$ containing the diamond $D$, rather than in $D$. Let us consider again the system (25) combining (10) and (23).
Lemma 5. Let $A \in \Omega^1(M; \mathfrak{g})$ and suppose that $W, J \in C^3(M; T^*M \otimes \mathfrak{g})$ solve (25). Suppose moreover that $A$ solves (1) in $D$ and that $\mathrm{supp}(J_j)$, $j = 1, 2, 3$, is contained in the interior of $D$. Then $W$ solves (21) in $D$, with $J$ on the right-hand side.
Proof. The equations (21) and (23) (21) in D, and the first equation in (25), in other words (23) V to this equation, we have using Lemma 2 and the second equation in (25) This is a linear wave equation for H. We will show below that W vanishes near ∂ − D. Hence also H vanishes near ∂ − D, and as it satisfies the linear wave equation, it vanishes in the whole D. This type of finite speed of propagation result is of course standard, and it follows also from Lemma 14 Appendix B.
We prove the following result in Appendix B.
Proposition 3. Suppose that A ∈ Ω 1 (M ; g) is bounded, together with all its derivatives, and let k ≥ 4. Then there is a neighbourhood H of the zero function in H k+2 (M ; g) such that for all J j ∈ H, j = 1, 2, 3, there is a unique solution

Source-to-solution map
We begin with a lemma, that will be used only once, and that highlights the difference between the pointed gauge group G 0 (D, p) and the full gauge group G(D). Lemma 6. Suppose thatÃ ∼ A near ∂ − D and consider the modified data set Proof. It follows immediately from the definitions of the sets D A andD A that there are U ∈ G 0 (D, p) and V ∈ C 3 (D; If we used gauge equivalence with respect to G(D) in the definition D A , then (26) would still hold in a neighbourhood U ⊂ of ∂ − D ∩ , however, this simply says that U| U is in the stabilizer subgroup {U ∈ C ∞ (U; G) : U ·Ã =Ã} with respect tõ A| U . In general, the stabilizer subgroup may be non-trivial.
Recall that the temporal gauge version T (V ) of a connection V is defined by (14). Recall, furthermore, that the system (25) of Yang-Mills equations in relative Lorenz gauge with the compatibility condition is posed on M = (−2, 2) × R 3 .
Proposition 4. Suppose that A ∈ Ω 1 (D; g) satisfies (1) in D. Then there is a connectionÃ ∈ Ω 1 (D; g) such thatÃ ∼ A in D,Ã| is in temporal gauge, and the following holds: for all x ∈ there are a neighbourhood 0 ⊂ of x and a neighbourhood H of the zero function in H 7 0 ( 0 ; g) such that D A determinesÃ| and the source-to-solution map Extending (W, J 0 ) by zero we get a solution in the set − = ∩ {t < t 0 }. To summarize, the solution (W, J 0 ) in − is determined byÃ and our choice of J j , j = 1, 2, 3. Defining a connectionV =V (J 1 , We write + = ∩ {t > t 0 }, and consider the set and the spatial part of d * V F V vanishes in + }. Here T is defined by (14) with |x| < 0 , cf. (3). No confusion should arise from our use of T for temporal gauge both in and in D since T (V | ) = T (V )| for a connection V on D.
AsV is determined by D A (and the choice ofÃ ), also L is determined by D A . Moreover, T (V )| ∈ L where V = W +Ã and (W, J 0 ) is the solution of (25) in M with J j , j = 1, 2, 3, as above and A =Ã. The solution (W, J 0 ) in M is an extension of the solution (W, J 0 ) in (0, t 0 ) × R 3 , which justifies our reuse of symbols. Observe that Proposition 3, together with the Sobolev embedding theorem, guarantees that W ∈ C 3 (D; T * D ⊗ g), and that Remark 1 guarantees that supp(J 0 ) ⊂ .
To conclude the proof, it remains to show that L consists of a single element. Suppose that W ,W ∈ L. By Lemma 6 there are connections V ,Ṽ and gauges u, In fact, as V =V =Ṽ in − , we have U =Ũ and W =W in − . Hence also d * As gauge transformations act componentwise on d * W F W , see (12), also the spatial parts of d * W F W and d * W FW vanish in + . Writing J 0 for the temporal part of d * W F W , the compatibility condition d * W d * W F W = 0, see Lemma 2, together with W 0 = 0, implies that ∂ t J 0 = 0 in + . The same holds forJ 0 , the temporal part of d * W FW . But J 0 =J 0 on ∩ {t = t 0 }, and hence J 0 =J 0 in + . To summarize d * W F W = d * W FW in . Proposition 2 implies that W =W in . In other words W =W and this is the only element in L.

Linearization of the Yang-Mills equations in Lorenz gauge
Let us study multiple-fold linearizations of (23). Consider a three-parameter family of solutions to (23), vanishing for t ≤ 0, where is in a neighbourhood of the origin in R 3 . Assume that the source term is linear in the sense that J = 3 k=1 (k) J (k) for some J (k) ∈ Ω 1 (R 1+3 ; g). Writing and differentiating (23) in gives the following system of linear wave equations where the nonlinear terms read , and, writing S 3 for the set of permutations on {1, 2, 3}, Now we continue the calculation in Cartesian coordinates in Minkowski space R 1+3 , and use the formulas These formulas are derived in Appendix A. Using (29)-(31) and the Lorenz gauge condition d * A W = 0, we rewrite the first three equations in (28), modulo lower order terms, as follows where the components of the right-hand sides of the last two equations read
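The mechanism of multiple-fold linearization is easiest to see in a scalar toy model with a purely cubic nonlinearity, of the kind studied in [7]. The block below is not the Yang-Mills system: the unknown $u$, the sources $f_k$ and the equation $\Box u + u^3 = \sum_k \epsilon_k f_k$ are assumptions made only for illustration.

```latex
% Toy model (an assumption, not the system (28)):
%   \Box u + u^3 = \sum_{k=1}^3 \epsilon_k f_k,   u = 0 for t \le 0.
% Notation: u_k = \partial_{\epsilon_k} u|_{\epsilon=0},
%           u_{jk} = \partial_{\epsilon_j}\partial_{\epsilon_k} u|_{\epsilon=0},  etc.
\begin{align*}
\Box u_k &= f_k, \\
\Box u_{jk} &= -\bigl(6\, u\, u_j u_k + 3\, u^2 u_{jk}\bigr)\big|_{\epsilon=0} = 0
   \quad\Longrightarrow\quad u_{jk} = 0, \\
\Box u_{123} &= -\,\partial_{\epsilon_1}\partial_{\epsilon_2}\partial_{\epsilon_3}\bigl(u^3\bigr)\big|_{\epsilon=0}
   = -6\, u_1 u_2 u_3 .
\end{align*}
```

In the Yang-Mills case the nonlinearity also contains quadratic terms with derivatives, so the second order linearizations do not vanish in general and contribute to the triple cross-derivative alongside the genuinely cubic term.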

Preliminaries on microlocal analysis
7.1. Distributions associated to conormal bundles and two Lagrangians.
The advantage of working in the relative Lorenz gauge is that the Yang-Mills equations reduce to a cubic nonlinear wave equation with the linear part given by the connection wave operator $\Box_A$, modulo zeroth order terms. The parametrix for $\Box_A$ is a distribution associated to an intersecting pair of Lagrangians (an IPL distribution for short), in the sense of [36], and we use the product calculus of conormal distributions to study the non-linear part.
The proof of Proposition 1 in the next section relies solely on symbolic computations, and we recall here only that conormal and IPL distributions have principal symbols and that the corresponding symbol maps are isomorphisms, modulo lower order terms in a suitable sense. We will not recall the definitions of these classes of distributions, as they are somewhat technical; instead we refer the reader to [7] for a review of the theory that we use and that was originally developed in [18,11,36]. Even the precise definition of spaces of symbols is not important for our present purposes, since we will consider only symbols that are positively homogeneous in the fibre variable.
Recall that a pseudodifferential operator $A$ on a manifold $X$ with a homogeneous principal symbol $a$ is said to be elliptic at $(x, \xi) \in T^*X \setminus 0$ if $a(x, \xi) \ne 0$. The wavefront set $\mathrm{WF}(u) \subset T^*X \setminus 0$ of a distribution $u$ on $X$ is the complement of its regular set, where the regular set consists of those points $(x, \xi) \in T^*X \setminus 0$ for which there is a zeroth order pseudodifferential operator $A$ that is elliptic at $(x, \xi)$ and satisfies $Au \in C^\infty(X)$. We denote by $\mathrm{singsupp}(u)$ the projection of $\mathrm{WF}(u)$ on $X$, and by $\mathrm{WF}(A)$ the essential support of $A$, that is, the projection of $\mathrm{WF}(\mathcal{A}) \subset (T^*X \setminus 0)^2$ on the first factor $T^*X \setminus 0$, where $\mathcal{A}$ is the Schwartz kernel of $A$. Moreover, we say that $A$ is a microlocal cutoff near $(x, \xi) \in T^*X \setminus 0$ if $A$ is elliptic at $(x, \xi)$ and $\mathrm{WF}(A)$ is contained in a small neighbourhood of $\{(x, \lambda\xi) : \lambda > 0\}$.
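Two standard examples may help to fix the definitions of wavefront set and singular support. They are textbook facts stated in $\mathbb{R}^n$ and are not taken from the paper; the cutoff $\chi$ and the hypersurface $Y$ are introduced only for illustration.

```latex
% Example 1: the Dirac mass at the origin of \mathbb{R}^n.
\[
\mathrm{WF}(\delta_0) = \{(0,\xi) : \xi \in \mathbb{R}^n \setminus 0\},
\qquad
\mathrm{singsupp}(\delta_0) = \{0\}.
\]
% Example 2: u(x) = \chi(x)\,\delta(x_1) with \chi \in C_c^\infty(\mathbb{R}^n),
% a compactly supported distribution carried by the hypersurface Y = \{x_1 = 0\}.
\[
\mathrm{WF}(u) \subset N^*Y \setminus 0
  = \{(x,\xi) : x_1 = 0,\ \xi = (\xi_1, 0, \dots, 0),\ \xi_1 \neq 0\}.
\]
```

The second example is the prototype of a conormal distribution: it is singular only in the directions conormal to $Y$, which is the situation described for the sources in the next paragraph.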
Let E be a complex smooth vector bundle over X and Ω 1/2 the half density bundle. A conormal distribution u ∈ I m (N * Y ; E ⊗ Ω 1/2 ) of order m ∈ R is a compactly supported distribution taking values in the tensor bundle E ⊗ Ω 1/2 with WF(u) contained in the conormal bundle N * Y of a submanifold Y of X. In addition, u is required to have a certain local structure on Y , see (2.4.1) in [18], the precise form of which is not important for our purposes. What is important is that the principal symbol σ[u] of u is a smooth section of E ⊗ Ω 1/2 , invariantly defined on N * Y \ 0, and that the principal symbol map u → σ[u] gives the short exact sequence, for any λ > 0 and (x, ξ) ∈ N * Y \ 0.
Since the half density is involved here, the given homogeneity looks a little different from the classical definition in [19, p.67]. More generally, a Lagrangian distribution u ∈ I m (Λ; E ⊗ Ω 1/2 ) is a compactly supported distribution with WF(u) contained in a conical Lagrangian submanifold Λ of T * X \ 0, and with a certain local structure, see (3.2.14) in [18]. Its principal symbol is invariantly defined on Λ as a smooth section of the bundle E ⊗ Ω 1/2 ⊗ L, where L is the Maslov bundle over Λ. Analogously to (35), the principal symbol map again gives a short exact sequence.
The notion of Lagrangian distributions is insufficient to completely describe the fundamental solution of wave equations, as two Lagrangian manifolds are needed in order to describe the propagating singularities and the singularities at the source. An IPL distribution u ∈ I m (Λ 0 , Λ 1 ; E ⊗ Ω 1/2 ) is a compactly supported distribution with WF(u) contained in Λ 0 ∪ Λ 1 , where (Λ 0 , Λ 1 ) is a cleanly intersecting pair of conical Lagrangian submanifolds of T * X \ 0, and with a certain local structure on Λ 0 ∪ Λ 1 , see [36]. Here Λ 1 is a manifold with boundary, while Λ 0 is a manifold without boundary, and by cleanly intersecting, we mean
$$\Lambda_0 \cap \Lambda_1 = \partial\Lambda_1, \qquad T_\lambda \Lambda_0 \cap T_\lambda \Lambda_1 = T_\lambda(\partial\Lambda_1), \quad \lambda \in \partial\Lambda_1.$$
Again what we really need in the present paper is the symbol map for such distributions. In this case the symbol map is an isomorphism, modulo lower order terms, We remark that R maps the E ⊗ Ω 1/2 ⊗ L-valued symbols over Λ 0 to the E ⊗ Ω 1/2 ⊗ L-valued symbols over Λ 1 and acts as a multiplication by a scalar on E.
If (x, ξ) ∈ Λ j \ ∂Λ 1 for j = 0 or j = 1, then there is a microlocal cutoff χ near (x, ξ) such that χu ∈ I(Λ j ; E) for all u ∈ I m (Λ 0 , Λ 1 ; E ⊗ Ω 1/2 ). The only place where we need the full picture of IPL distributions, instead of the above microlocal reduction to Lagrangian distributions, is equation (39) giving an initial condition on ∂Λ 1 for a transport equation on Λ 1 . Moreover, apart from (39), we can also avoid the use of Lagrangian distributions in favour of conormal distributions, since all the Lagrangian manifolds Λ 0 and Λ 1 considered below will be conormal bundles away from ∂Λ 1 .
The principal symbol $\sigma[\Box_A]$ and the subprincipal symbol We denote by Φ s , s ∈ R, the flow of the Hamilton vector field $H_{\sigma[\Box_A]}$ of $\sigma[\Box_A]$, and define for a subset B of the characteristic set Σ of $\Box_A$ the future flowout of B by {(y, η) ∈ Σ; (y, η) = Φ s (x, ξ), s ∈ R, (x, ξ) ∈ B, y ≥ x}.
As $\Box_A$ is of real principal type, one can use the theory by Hörmander and Duistermaat [11] to understand its parametrix. A completely symbolic parametrix construction, based on IPL distributions, was given by Melrose and Uhlmann [36], and the following adaptation of their construction in the vector valued case can be found in [7]:
Proposition 5. Let Λ 0 be a conormal bundle such that $H_{\sigma[\Box_A]}$ is nowhere tangent to Λ 0 . Denote by Λ 1 the future flowout of Λ 0 ∩ Σ. Consider the wave equation where f ∈ I(Λ 0 ; E) and E = T * R 1+3 ⊗ g. Then $u \in \bigcup_{m \in \mathbb{R}} I^m(\Lambda_0, \Lambda_1; E \otimes \Omega^{1/2})$ and the corresponding principal symbols satisfy Here $L_{H_{\sigma[\Box_A]}}$ denotes the Lie derivative with respect to $H_{\sigma[\Box_A]}$.

7.2. Parallel transport for the principal symbol.
As in [7], the transport equation (38) can be understood as a parallel transport equation as in Section 2, Here µ is a nowhere vanishing half density on Λ 1 \ Λ 0 , β(s) = (γ(s), γ̇ * (s)), with γ̇ * = γ̇ α dx α , is the bicharacteristic curve emanating from β(0) ∈ Λ 0 ∩ Λ 1 , and Comparing with (7), we see that the 1-form components û α satisfy the parallel transport equation on M × g corresponding to the adjoint representation of G. In particular, if x, y ∈ L and the singular support of f does not intersect the line segment from x to y, then where ξ is the covector corresponding to the direction of the line segment, and β in (41) satisfies β(0) = (x, ξ) and β(s) = (y, ξ).
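The following minimal sketch illustrates parallel transport in the adjoint representation numerically. It assumes the transport equation has the form u' = -[a, u] with a fixed element a of su(2) standing in for the connection evaluated along the curve; the sign convention, the constant a, and the sample data are assumptions made for illustration and are not taken from (41). The sketch checks that the transported element stays in su(2) and that the Ad-invariant quantity -tr(u^2) is conserved.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def axpy(a, b, c=1.0):
    """Return a + c * b entrywise."""
    return [[x + c * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def comm(a, b):
    return axpy(matmul(a, b), matmul(b, a), -1.0)


def rhs(conn, u):
    """Right-hand side of the assumed transport equation u' = -[conn, u]."""
    return [[-z for z in row] for row in comm(conn, u)]


def rk4_step(conn, u, h):
    k1 = rhs(conn, u)
    k2 = rhs(conn, axpy(u, k1, h / 2))
    k3 = rhs(conn, axpy(u, k2, h / 2))
    k4 = rhs(conn, axpy(u, k3, h))
    incr = axpy(axpy(k1, k2, 2.0), axpy(k4, k3, 2.0))  # k1 + 2 k2 + 2 k3 + k4
    return axpy(u, incr, h / 6)


def transport(conn, u0, length=1.0, steps=1000):
    """Integrate the transport equation over a parameter interval of given length."""
    u, h = u0, length / steps
    for _ in range(steps):
        u = rk4_step(conn, u, h)
    return u


def invariant_norm_sq(u):
    """-tr(u^2), an Ad-invariant squared norm on su(n)."""
    uu = matmul(u, u)
    return -sum(uu[i][i] for i in range(len(u))).real


def distance_from_su(u):
    """Largest violation of the conditions u* = -u and tr(u) = 0."""
    n = len(u)
    skew = max(abs(u[i][j] + u[j][i].conjugate())
               for i in range(n) for j in range(n))
    return max(skew, abs(sum(u[i][i] for i in range(n))))


# Hypothetical data: conn and u0 are arbitrary elements of su(2).
conn = [[0.3j, 0.5 + 0.2j], [-0.5 + 0.2j, -0.3j]]
u0 = [[1j, 2 + 0j], [-2 + 0j, -1j]]
u1 = transport(conn, u0)
```

Conservation of -tr(u^2) reflects the fact that the adjoint action preserves the invariant inner product on g fixed in the introduction.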

Proof of Proposition 1
We follow the construction in [7]; however, the analysis in the present paper is more involved, both because the non-linearity in the Yang-Mills equations is more complicated than the simple cubic non-linearity considered in [7] and because of the gauge invariance of the Yang-Mills equations. We will focus on the new features of the proof and refer to [7] for technical details that are unchanged.
In order to apply the microlocal machinery in Section 7 we need to consider the Yang-Mills equations on the tensor product bundle T * R 1+3 ⊗ g ⊗ Ω 1/2 . This is achieved by choosing a nowhere vanishing half density µ on R 1+3 and by considering the conjugated operator µ −1 P (µW ) instead of $P(W) = \Box_A W + [W, F_A] + N(W)$, cf. (23). In fact, we choose µ so that µ = 1 identically in the Cartesian coordinates, and to simplify the notation, we omit writing µ in what follows. However, we warn the reader that additional determinant factors appear in other coordinates. These can be included in the factors in (51), and in α (k) , α (kl) and α in (53).
It turns out that in the coordinates satisfying (44)-(45) it is enough to use sources with all but the dx 2 component vanishing. Let b (k) ∈ g and set where δ x (k) is the Dirac delta distribution at x (k) and χ (k) is a microlocal cutoff near (x (k) , ±ξ (k) ). Here the sign is chosen to be that of κ (k) , that is, − for k = 1 and + for k = 2, 3. Moreover, χ (k) is chosen so that (χ1) the principal symbol σ[χ (k) ] is positively homogeneous of degree q; , and for all k ≠ l it holds that x (l) ∉ J + ( (k) ) where The degree q ∈ R is chosen negative enough so that J (k),2 ∈ H 7 0 ( ; g). The geometric setting is shown in Figure 2.
(1) , y, z and η, as well as, b (k) and J (k),2 (s), with k = 1, 2, 3 and small s > 0, be as above, and define for ε (k) ∈ R, k = 1, 2, 3, z←y←x (1) [b (2) , [b (1) , b (2) ]]. As (x (1) , y, z) ∈ S + ( ) and b (1) , b (2) ∈ g can be chosen arbitrarily apart from the constraint r ≠ 0, Proposition 1 follows from Propositions 4 and 8 together with Proposition 9 in Section 9 below. Here the case r = 0 follows by continuity.
For the convenience of readers who do not wish to enter into the theory of Lie algebras, we have included an elementary alternative to Proposition 9 in the case g = su(n), with n ≥ 2, see Lemma 16 in Appendix C. This special case is interesting in view of the SU(3) × SU(2) × U(1) gauge group of the standard model.
We will proceed to give a proof of Proposition 8 in Sections 8.1-8.3.
(25) to (23). Let J (k),2 , k = 1, 2, 3, be as in (47), and write J 2 = J 2 (ε, s) for the function defined by (48). To simplify the notation, we write J j = J (k),j = 0 for k = 1, 2, 3 and j = 1, 3, and, for the remainder of this section, somewhat abusively A = Ã where Ã is as in Proposition 4. Then we denote by the solution of (25) with J j , j = 1, 2, 3, as above and ε near the origin of R 3 . The derivatives of W with respect to ε are denoted by Y (k) , Y (kl) and Y (123) as in (27), and we write also For notational convenience, we translate the origin in (25) so that the initial conditions are given at t = 0 rather than at t = −1.
Recall that the second equation in (25) is equivalent to (24). Differentiating (24) with respect to ε (k) for k = 1, 2, 3 gives
Writing , the operator ∂ t is elliptic away from its characteristic set {τ = 0} ⊂ T * R 1+3 . The wave front set of the right-hand side of (50) is contained in a small neighbourhood of {(x (k) , λξ (k) ) : λ ≠ 0}, and therefore it is disjoint from {τ = 0}. It follows that ρ (k) ∈ I(N * {x (k) }; g) since the right-hand side of (50) is in this class. Recalling the form of ξ (k) , k = 1, 2, 3, see (44) and (45), symbol evaluation gives where the sign is that of (47), and It follows that away from x (k) , where N * K (k) is the bicharacteristic flowout emanating from (x (k) , ξ (k) ). In other words, writing x (k) = (t (k) , x (k) ), The second derivative of (24) in ε for distinct k, l = 1, 2, 3 reads As supp(J (k),j ) ⊂ (k) by (χ2), it follows from (50) and J 0 = 0 for t ≤ 0 that supp(ρ (k) ) ⊂ ˆ (k) . We see that Y (k) is smooth in the support of ρ (l) for distinct k and l, since ˆ (k) ∩ Γ (l) = ∅ by (χ3). Moreover, Y (k) solves (32) with vanishing initial conditions and with the source satisfying supp(J (k) ) ⊂ ˆ (k) ⊂ J + ( (k) ), whence supp(Y (k) ) ⊂ J + ( (k) ) due to finite speed of propagation (as discussed in the proof of Lemma 5, finite speed of propagation follows from Lemma 14 in Appendix B). As singsupp(ρ (l) ) = {x (l) }, it follows from (χ2) that ρ (l) is smooth in the support of Y (k) for distinct k and l. Analogously, Y (k) is smooth in supp(J (l) ) and J (l) is smooth in supp(Y (k) ) for k ≠ l. Therefore the right-hand side of (52) is smooth, and so is ρ (kl) . This again implies that Y (kl) satisfies (33) modulo smooth terms.

8.2. Principal symbols of interacting waves.
The linearized equation (33) has source Ñ (2) that consists of products of solutions Y (k) , k = 1, 2, 3, to the linear wave equation (32). These products can be viewed as the interactions of waves Y (k) and Y (l) . Then the solution Y (kl) to (33) describes the linear waves emanating from the source of such interacting waves Y (k) and Y (l) . Analogously the solution Y (123) to (34) describes waves emanating from interaction of Y (1) , Y (2) and Y (3) . As ξ (k) , k = 1, 2, 3, are linearly independent, the submanifolds K (k) , k = 1, 2, 3, intersect transversally at y, and we may compute the principal symbols σ[Y (123) ](y, η) using the product formula (40). This requires using the direct sum decomposition , where η (k) = κ (k) ξ (k) and the scalars κ (k) are given by (46). We will omit below the details related to the choices of the microlocal cutoff when applying (40). The same choices as in [7] can be used, see (54) there and its proof.
By (43) the incoming principal symbols satisfy , where the scalar factors α (k) converge in C \ 0 as s → 0. The factors α (k) are independent of A, and their precise form is not important for our purposes. We refer to [7] for more detail on how to compute these factors. Let us point out, however, that typically α (k) differs from the corresponding factor in (51), due to a contribution from R and $\sigma[\Box_A]^{-1}$ in (39).
Recall that L(0, J 2 (ε, s), 0) is defined by T (V )| where V = W + A and W is as in (49). To simplify the notation, we write . η). It remains to study how the principal symbol σ[V (123) ] transforms under passing to the temporal gauge with T . Let U = U(ε) be as in (14) with V = V (ε), and write .
In addition, U −1 U = id implies Therefore, modulo smooth terms, near z there holds Near z it holds that V (123) is a conormal distribution associated to the future flowout of N * (K (1) ∩ K (2) ∩ K (3) ) ∩ Σ, cf. (36). We refer to Appendix C of [7] for a precise description of this flowout. As the flowout is contained in the characteristic set Σ of $\Box_A$, it is disjoint from the characteristic set {τ = 0} of ∂ t . The second equation in (59) implies that U (123) is a conormal distribution associated to the same flowout near z.

Lie algebras with trivial centre
The material that follows is quite classical and can be found in many textbooks on Lie algebras. We start by fixing notation and recalling basic results, mainly following the exposition in [16, Chapter 7].
Let g be the Lie algebra of a compact connected Lie group of matrices G and let g C be its complexification. An element Z ∈ g C can be uniquely written as Z = X + iY for X, Y ∈ g, and we define Z * = −X + iY . Note that Z * is the usual conjugate transpose of Z in the case g = u(n). There is an inner product on g C that is real-valued on g and that satisfies, see [16, Proposition 7.4],
$$\langle [X, Y], Z \rangle = \langle Y, [X^*, Z] \rangle, \qquad X, Y, Z \in \mathfrak{g}_{\mathbb{C}}.$$
If t is a maximal commutative subalgebra of g, then
$$\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$$
is a Cartan subalgebra of g C and its dimension is called the rank of g C . The roots of g C relative to h are those elements α ∈ h such that there is 0 ≠ X ∈ g C so that
$$[H, X] = \langle \alpha, H \rangle X \quad \text{for all } H \in \mathfrak{h}, \tag{60}$$
where we use the convention that the inner product is linear in the second variable (and anti-linear in the first one). We let ∆ be the collection of roots. By [16, Proposition 7.15] each root α belongs to $i\mathfrak{t}$, and we can decompose g C as a direct sum
$$\mathfrak{g}_{\mathbb{C}} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Delta} \mathfrak{g}_\alpha,$$
where g α contains the eigenvectors associated to α, that is, the vectors X satisfying (60). Moreover, see [16, Proposition 7.18, Theorems 7.19 and 7.23],
(1) each g α is 1-dimensional;
(2) if X ∈ g α with α ∈ ∆, then X * ∈ g −α ;
(3) if g C has trivial centre, the roots span h.
We can in fact pick linearly independent elements X α ∈ g α , Y α = X * α ∈ g −α and H α ∈ h such that H α is a multiple of α and such that [X α , Y α ] = H α , [H α , X α ] = 2X α and [H α , Y α ] = −2Y α . This generates an sl(2, C)-subalgebra inside g C and implies that the elements
$$E^1_\alpha = \tfrac{i}{2} H_\alpha, \qquad E^2_\alpha = \tfrac{i}{2}(X_\alpha + Y_\alpha), \qquad E^3_\alpha = \tfrac{1}{2}(Y_\alpha - X_\alpha)$$
belong to g and span a Lie subalgebra isomorphic to su(2), see [16, Corollary 7.20]. Note that the set {E 1 α , E 2 α , E 3 α } α∈∆ spans g over the reals if g has trivial centre. The commutation relations of Pauli matrices imply that su(2) is spanned by the nested commutators [X, [X, Y ]] with X, Y ∈ su(2). Hence the discussion above immediately implies:
Proposition 9. Let g be the Lie algebra of a compact connected Lie group of matrices. Assume that g has trivial centre. Then g is the linear span of [X, [X, Y ]] for X, Y ∈ g.
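As a sanity check of the spanning statement in the special case g = su(n), the following sketch computes the real dimension of the span of the nested commutators [X, [X, Y]] when X and Y range over a basis of su(n). The basis used (i times the elementary symmetric, antisymmetric and diagonal matrices) and all function names are choices made for this illustration; the computation is exact because all entries are Gaussian integers.

```python
from fractions import Fraction
from itertools import product


def zeros(n):
    return [[0j] * n for _ in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def nested(x, y):
    """Plain nested commutator [X, [X, Y]]."""
    return comm(x, comm(x, y))


def su_basis(n):
    """A basis of su(n) with Gaussian-integer entries (illustrative choice)."""
    basis = []
    for j in range(n):
        for k in range(j + 1, n):
            s = zeros(n)
            s[j][k], s[k][j] = 1j, 1j                  # i * (E_jk + E_kj)
            a = zeros(n)
            a[j][k], a[k][j] = 1 + 0j, -1 + 0j         # E_jk - E_kj
            basis += [s, a]
    for l in range(1, n):
        d = zeros(n)
        for j in range(l):
            d[j][j] = 1j
        d[l][l] = -1j * l                              # i * diag(1,...,1,-l,0,...)
        basis.append(d)
    return basis


def real_vector(m):
    """Real and imaginary parts of all entries, as exact rationals."""
    flat = [z for row in m for z in row]
    return ([Fraction(round(z.real)) for z in flat]
            + [Fraction(round(z.imag)) for z in flat])


def rank(vectors):
    """Rank over the rationals by Gaussian elimination."""
    rows = [v[:] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def span_dimension(n):
    """Real dimension of span{[X, [X, Y]]} with X, Y in the chosen basis of su(n)."""
    basis = su_basis(n)
    return rank([real_vector(nested(x, y)) for x, y in product(basis, repeat=2)])
```

Since every nested commutator lies in su(n), the computed dimension can be at most n^2 - 1, so equality with n^2 - 1 means that the nested commutators of basis elements already span su(n).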

The case of general Lie group
Suppose now G is any compact connected Lie group. In what follows it is convenient to express some previous notions in slightly more abstract form. Let ω ∈ Ω 1 (G, g) be the (left) Maurer-Cartan 1-form of G. Given U ∈ G 0 (D, p) we express the gauge equivalence between A, B ∈ Ω 1 (M, g) as where Ad : G → GL(g) is the usual Adjoint representation. For matrix Lie groups ω = g −1 dg and Ad g (a) = gag −1 for a ∈ g and we recover the expression (2) for the gauge equivalence between A and B that we have used so far. Suppose now that p : $\tilde{G}$ → G is a covering of G, then p is a Lie group homomorphism and $p^* \omega_G = \omega_{\tilde{G}}$. Given U ∈ G 0 (D, p), there is a unique $\tilde{U} \in \tilde{G}^0(D, p)$ such that $p \circ \tilde{U} = U$. This is because the domain of U is simply connected and we are fixing the value of $\tilde{U}$ at p to be the identity. We deduce that (61) holds if and only if the following equation holds In other words, A and B are gauge equivalent via a gauge in G 0 (D, p) if and only if they are gauge equivalent via a gauge in $\tilde{G}^0(D, p)$. The same observation applies for gauges defined near ∂ − D. One very useful consequence is that the data set D A does not really depend on the group G as long as it has Lie algebra g.
We are going to use this setup as follows. Every compact connected Lie group G admits a finite cover of the form T r × G 1 , where T r is an r-torus and G 1 is a compact Lie group with finite centre [4, Theorem 8.1, p. 233]. At the level of the Lie algebra this corresponds to an orthogonal splitting g = z ⊕ g 1 , where g 1 is the Lie algebra of G 1 and has trivial centre. Given A ∈ Ω 1 (M, g) we split uniquely

Now we claim:
Proof. Using that elements in the centre z commute with everything, a quick calculation shows that given V ∈ C 3 (D; in D \ and the lemma follows.
We can deal with the abelian component A Z directly by unique continuation.
Proof. It suffices to prove the claim for r = 1, i.e. in the case of the circle S 1 . To avoid cluttering the notation we drop the subscript "Z" during the proof. If the group is abelian, the Yang-Mills equations reduce to the Maxwell equation d * F A = 0, where F A = dA. Since dF A = 0, the curvature satisfies $\Box F_A = 0$, where $\Box = d^* d + d d^*$. The gauges u ∈ C ∞ (D; S 1 ) all have the form $u = e^{i\phi}$ for φ a real-valued function since D is simply connected.
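The two facts used in this paragraph can be spelled out in one line each. The sketch below assumes the matrix-group form of the gauge action, $A \mapsto u^{-1} A u + u^{-1} du$, which is an assumption about the form of (2) and not quoted from it.

```latex
% Wave equation for the curvature: combine Maxwell's equation d^* F_A = 0
% with the Bianchi identity dF_A = 0.
\Box F_A = (d^* d + d d^*) F_A = d^*(dF_A) + d(d^* F_A) = 0.

% Gauge invariance of the curvature in the abelian case, for u = e^{i\phi}:
u^{-1} A u + u^{-1}\, du = A + i\, d\phi,
\qquad
F_{A + i\, d\phi} = dA + i\, d^2 \phi = dA = F_A .
```

In particular the curvature, unlike the potential itself, is determined independently of the choice of φ.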
Since where η = g αα g jj g kk . Solving for c and c gives cc = 2 ηη = g αα g αα g ββ g jj g kk = −g αα .
A.2. The adjoint d * A in coordinates.
Using the formulas (62)-(63) we can easily find expressions for d * A = d A in the Cartesian coordinates.
Lemma 10. If X = X α dx α , then
A.3. Proofs of (29)-(31).
In some of our computations we encounter terms of the form [X, Y ] ∈ Ω 1 for X ∈ Ω 1 and Y ∈ Ω 2 . The next elementary lemma computes this term explicitly.
We are now ready to prove (30) that expands [X, d A Z] for X, Z ∈ Ω 1 in coordinates. Using Lemma 11 with We apply Lemma 11 with Y αβ replaced by [Y α , Z β ], to establish (31), giving [X, [Y, Z]] for X, Y, Z ∈ Ω 1 in coordinates as follows,
The proof of (29), giving an analogous expansion of d * A [X, Z] for X, Z ∈ Ω 1 , is more involved. Let us consider first the terms in the βth component of that contain derivatives. Using Lemma 10 these read Similarly, the terms in the βth component of (64) that do not contain derivatives are We used here the Jacobi identity. Hence we obtain (29), that is,
A.4. Yang-Mills equations in coordinates.
For the convenience of the reader we prove the following well-known lemma.
Proof. We apply Lemma 10 with Y αβ = ∂ α A β + (1/2)[A α , A β ], to see that the components of d * and the claim follows after combining the terms with factors 1/2, and using
Appendix B. Direct problem
B.1. An energy estimate.
We write again (x 0 , x 1 , x 2 , x 3 ) = (t, x) ∈ R 1+3 for the Cartesian coordinates, and recall the sign convention (18) for the wave operator $\Box$. We write also ∇u = (∂ x 1 u, ∂ x 2 u, ∂ x 3 u) and denote by · the Euclidean inner product on R 3 .
Let X j , j = 1, 2, be first order and Y j , j = 1, 2, zeroth order differential operators on R 1+3 . Suppose, furthermore, that X 2 is of zeroth order with respect to the t variable. We will consider the system Here v and u are allowed to take values in a Hermitian vector bundle, but we do not emphasize this in the notation. We prove an energy estimate for (65). Write B(r) = {x ∈ R 3 : |x| < r}. Let R > 0 and define r(t) = R − t. Consider the following local energy and the norm of the source
Lemma 13. Let T > 0 and define the cut cone Suppose that v, u ∈ C 2 (C) satisfy (65) in C. Then for a constant C > 0 that depends only on the L ∞ (C)-norm of the coefficients of X j and W 1,∞ (C)-norm of the coefficients of Y j , j = 1, 2, E(t) ≤ e Ct E(0) + C
Proof. We differentiate the local energy ∂ t E = B(r(t)) ∂ 2 t v∂ t v + ∇v · ∇∂ t v + v∂ t v + ∇u · ∇∂ t u + u∂ t u dx − 1 2 ∂B(r(t)) Edx.
We write z 1 = −X 1 v − X 2 u + v + f 1 and z 2 = −Y 1 v − Y 2 u + f 2 , apply integration by parts to the second term in the first integral, and use (65) to obtain We have |z j | 2 ≤ C(E + F), j = 1, 2, and |∇z 2 | 2 ≤ C(E + F), where the constant C > 0 depends only on the L ∞ (C)-norm of the coefficients of X j and W 1,∞ (C)-norm of the coefficients of Y j , j = 1, 2. Moreover, 2|∂ ν v∂ t v| ≤ |∇v| 2 + |∂ t v| 2 ≤ E, and we obtain ∂ t E ≤ C(E + F ). Now we can use Grönwall's inequality, or simply notice that e Ct ∂ t (e −Ct E) ≤ CF, leading to the energy estimate (66).
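For completeness, the integrating-factor step mentioned above can be written out as follows. This is a sketch of the standard argument only; here $E$ denotes the local energy and $F$ the norm of the source as functions of $t$, and the precise form of the right-hand side of (66) is the one stated in Lemma 13, which is not reproduced here.

```latex
% From \partial_t E \le C(E + F):
\partial_t\big(e^{-Ct} E(t)\big) = e^{-Ct}\big(\partial_t E(t) - C E(t)\big) \le C e^{-Ct} F(t).

% Integrating from 0 to t and multiplying by e^{Ct}:
E(t) \le e^{Ct} E(0) + C \int_0^t e^{C(t-s)} F(s)\, ds
     \le e^{Ct}\Big( E(0) + C \int_0^t F(s)\, ds \Big).
```

The last inequality uses only $e^{-Cs} \le 1$ for $s \ge 0$ and $F \ge 0$.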
The energy estimate (66) implies the following two uniqueness results.
Proof. We write Ã = T (A) and B̃ = T (B), see (14). As A ∼ B near ∂ − D, also Ã ∼ B̃ there. That is, there is U ∈ G 0 (D, p) such that The system (67) is of the form (65) with v = Ẇ and u = J 0 , and the coefficients of X j and Y j , j = 1, 2, depend only on the background connection A and are smooth. Using the energy estimate (66), it is straightforward to show that (67) has a unique solution. However, we give a short proof based on the fact that the second equation in (67) is independent of Ẇ .
Proof. Solving the second equation gives J 0 ∈ H k+1 (M ; g). Then Ẇ can be solved from the linear wave equation
B.3. Proof of Proposition 3.
To simplify the notation in the proof, we write H k (M ) also for Sobolev spaces of vector valued functions. As k ≥ 4, the Sobolev embedding theorem implies that both H k (M ) and H k+1 (M ) are Banach algebras, and also that H k+1 (M ) embeds in C 2 (M ).
We define The map Φ is a third order polynomial, and therefore it is smooth. Moreover, K(u, 0) contains only monomials of order two and three, and it follows that ∂ u Φ(0, 0) = id. The implicit function theorem gives a neighbourhood H of the zero function in H k+2 (M ) and a smooth map J → u from U to H k+1 (M ) such that Φ(u(J ), J ) = 0 for all J ∈ H.
Appendix C. Generation of su(n) using nested commutators
We recall the definition of generalized Gell-Mann matrices. Denote by E jk the matrix with 1 in the jk-th entry and 0 elsewhere. The three types of generalized Gell-Mann matrices in C n×n are as follows:
symmetric type: for 1 ≤ j < k ≤ n let S jk = E jk + E kj ;
antisymmetric type: for 1 ≤ j < k ≤ n let A jk = −iE jk + iE kj ;
diagonal type: for 1 ≤ l ≤ n − 1 let D l be the matrix with 1 in the jj-th entry for 1 ≤ j ≤ l, −l in the jj-th entry with j = l + 1, and 0 elsewhere.
The diagonal type matrices D l are typically normalized by multiplying them with $\sqrt{2/(l(l+1))}$ but this is irrelevant for our purposes. A basis of su(n) is given by the matrices iS jk , iA jk and iD l .
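The definitions above translate directly into a small constructor. The sketch below builds the unnormalized matrices S jk , A jk and D l for a given n, using 1-based index keys as in the text; the function names are chosen for this illustration.

```python
def gell_mann(n):
    """Return dictionaries (S, A, D) of unnormalized generalized Gell-Mann matrices.

    Keys are 1-based: S[(j, k)] and A[(j, k)] for 1 <= j < k <= n, D[l] for 1 <= l <= n - 1.
    """
    def zeros():
        return [[0] * n for _ in range(n)]

    S, A, D = {}, {}, {}
    for j in range(1, n + 1):
        for k in range(j + 1, n + 1):
            s = zeros()
            s[j - 1][k - 1] = 1
            s[k - 1][j - 1] = 1
            a = zeros()
            a[j - 1][k - 1] = -1j
            a[k - 1][j - 1] = 1j
            S[(j, k)], A[(j, k)] = s, a
    for l in range(1, n):
        d = zeros()
        for j in range(l):
            d[j][j] = 1          # entries 1 in positions 1..l
        d[l][l] = -l             # entry -l in position l + 1
        D[l] = d
    return S, A, D


def is_traceless_hermitian(m):
    """Check that m equals its conjugate transpose and has zero trace."""
    n = len(m)
    hermitian = all(m[i][j] == complex(m[j][i]).conjugate()
                    for i in range(n) for j in range(n))
    return hermitian and sum(m[i][i] for i in range(n)) == 0
```

For n = 3 this produces 3 + 3 + 2 = 8 matrices, matching the dimension of su(3), and multiplying each of them by i gives the basis of su(n) mentioned above.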
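In the case n = 2, we obtain the Pauli matrices
$$S_{12} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad A_{12} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad D_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
We define the nested commutator Before giving the general proof, let us consider the case of su(2). A straightforward computation shows that S 12 = 4c(A 12 , S 12 ), A 12 = 4c(S 12 , A 12 ), D 1 = 4c(S 12 , D 1 ).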
Therefore the lemma holds in the case n = 2.
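The computation for n = 2 can be checked mechanically. The sketch below uses the plain nested commutator [X, [X, Y]] without any normalizing constant, which is an assumption: the normalization of c used in the text may differ, so only proportionality to S 12 , A 12 and D 1 should be read off from it.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def nested(x, y):
    """Plain nested commutator [X, [X, Y]] (no normalizing constant)."""
    return comm(x, comm(x, y))


def times(c, m):
    return [[c * z for z in row] for row in m]


# Pauli matrices in the notation of the generalized Gell-Mann matrices.
S12 = [[0, 1], [1, 0]]
A12 = [[0, -1j], [1j, 0]]
D1 = [[1, 0], [0, -1]]

# Each matrix is reproduced, up to the factor 4, by a nested commutator.
checks = {
    "S12": nested(A12, S12) == times(4, S12),
    "A12": nested(S12, A12) == times(4, A12),
    "D1": nested(S12, D1) == times(4, D1),
}
```

All three entries of `checks` evaluate to `True`, so each of S 12 , A 12 and D 1 is a multiple of a nested commutator of the other Pauli matrices.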
Proof. The computation in the case n = 2 generalizes immediately to S jk = 4c(A jk , S jk ), A jk = 4c(S jk , A jk ).
Also D 1 = 4c(S 12 , D 1 ). We will show using an induction that D l can be expressed as a linear combination of the nested commutators. Denote the upper left m × m block of a matrix A by A| m and the lower right m × m block by A| m . Then Analogously, with the rest of the entries zero. Therefore c(A l,l+1 , D l−1 ).
If D l−1 is a linear combination of the nested commutators, then so is D l .
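The induction step can be illustrated numerically. With the plain nested commutator [X, [X, Y]] (again an assumption about normalization, which may differ from the c of the text), a direct computation gives D_l = D_{l-1} - (l / (2(l-1))) [A_{l,l+1}, [A_{l,l+1}, D_{l-1}]] for 2 ≤ l ≤ n − 1. The sketch below verifies the equivalent integer identity 2(l-1)(D_l - D_{l-1}) = -l [A_{l,l+1}, [A_{l,l+1}, D_{l-1}]]; the helper names are chosen for this illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def nested(x, y):
    """Plain nested commutator [X, [X, Y]] (no normalizing constant)."""
    return comm(x, comm(x, y))


def diag_type(n, l):
    """D_l: entries 1 in positions 1..l, -l in position l + 1, 0 elsewhere."""
    d = [[0] * n for _ in range(n)]
    for j in range(l):
        d[j][j] = 1
    d[l][l] = -l
    return d


def antisym_type(n, j, k):
    """A_jk = -i E_jk + i E_kj with 1-based indices j < k."""
    a = [[0] * n for _ in range(n)]
    a[j - 1][k - 1] = -1j
    a[k - 1][j - 1] = 1j
    return a


def induction_step_holds(n, l):
    """Check 2(l-1)(D_l - D_{l-1}) == -l [A_{l,l+1}, [A_{l,l+1}, D_{l-1}]]."""
    d_prev, d_next = diag_type(n, l - 1), diag_type(n, l)
    c = nested(antisym_type(n, l, l + 1), d_prev)
    return all(2 * (l - 1) * (d_next[i][j] - d_prev[i][j]) == -l * c[i][j]
               for i in range(n) for j in range(n))
```

For example, `induction_step_holds(3, 2)` checks the step from D 1 to D 2 in su(3); the case l = 1 is covered separately by the identity for D 1 above.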