Abstract
We show that a connection can be recovered up to gauge from source-to-solution type data associated with the Yang–Mills equations in Minkowski space \({\mathbb {R}}^{1+3}\). Our proof analyzes the principal symbols of waves generated by suitable nonlinear interactions and reduces the inversion to a broken nonabelian light ray transform. The principal symbol analysis of the interaction is based on a delicate calculation that involves the structure of the Lie algebra under consideration and the final result holds for any compact Lie group.
Introduction
The purpose of this paper is to solve an inverse problem associated with Yang–Mills theories in Minkowski space \({\mathbb {R}}^{1+3}\). The objective is the recovery of the gauge field A on a causal domain where waves can propagate and return, given data on a small observation set inside the domain.
The starting point of Yang–Mills theories is a compact Lie group G with Lie algebra \({\mathfrak {g}}\). Without loss of generality, we shall think of G as a matrix Lie group and hence \({\mathfrak {g}}\) will be a matrix Lie algebra. We assume also that G is connected and endowed with a bi-invariant metric, or equivalently, an inner product on \({\mathfrak {g}}\) invariant under the adjoint action.
In their most general formulation, Yang–Mills theories take place in the adjoint bundle of a principal bundle with structure group G over spacetime. Since our region of interest in spacetime will be a contractible set \(M\subset {\mathbb {R}}^{1+3}\), we might as well assume from the start that we are working with the trivial adjoint bundle \(M\times {\mathfrak {g}}\). The main object of the theory is a gauge field A, also known as Yang–Mills potential. In geometric language this is simply a connection \(A\in C^{\infty }(M;T^*M\otimes {\mathfrak {g}})=\Omega ^{1}(M;{\mathfrak {g}})\), that is, a smooth \({\mathfrak {g}}\)-valued 1-form. In general, we denote the set of \({\mathfrak {g}}\)-valued forms of degree k by \(\Omega ^{k} = \Omega ^{k}(M;{\mathfrak {g}})\).
There is a natural pairing \([\cdot , \cdot ]: \Omega ^{p}\otimes \Omega ^{q}\rightarrow \Omega ^{p+q}\) given in our situation as
\([\omega ,\eta ] = \omega \wedge \eta - (-1)^{pq}\, \eta \wedge \omega ,\)
where the wedge product of \({\mathfrak {g}}\)-valued forms is understood using matrix multiplication in \({\mathfrak {g}}\). Using the pairing we define a covariant derivative
\(d_{A}: \Omega ^{k}\rightarrow \Omega ^{k+1}, \quad d_{A}\omega = d\omega + [A,\omega ].\)
Given a gauge field A, we can associate to it its field strength or curvature. This is defined as
\(F_{A} = dA + \tfrac{1}{2}[A,A] = dA + A\wedge A\)
and it always satisfies the Bianchi identity \(d_{A}F_{A}=0\). Moreover, \(d_A^2 \omega = [F_A, \omega ]\) for any \(\omega \in \Omega ^k\).
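As a sanity check of the identity \(d_A^2 \omega = [F_A, \omega ]\), the case of a 0-form can be worked out by hand. The sketch below assumes the standard graded-commutator convention \([\omega ,\eta ] = \omega \wedge \eta - (-1)^{pq}\eta \wedge \omega \) for forms of degree p and q, together with \(d_A\omega = d\omega + [A,\omega ]\) and \(F_A = dA + A\wedge A\); with other conventions some signs would change.

```latex
% Check of d_A^2 \omega = [F_A,\omega] for a g-valued 0-form \omega.
% Assumed conventions (standard, but stated here as assumptions):
%   [\omega,\eta] = \omega\wedge\eta - (-1)^{pq}\eta\wedge\omega,
%   d_A\omega = d\omega + [A,\omega],   F_A = dA + A\wedge A.
\[
\begin{aligned}
d_A(d_A\omega)
  &= d\bigl(d\omega + [A,\omega]\bigr) + \bigl[A,\, d\omega + [A,\omega]\bigr] \\
  &= [dA,\omega] - [A,d\omega] + [A,d\omega] + [A,[A,\omega]] \\
  &= [dA,\omega] + [A\wedge A,\omega] \\
  &= [F_A,\omega].
\end{aligned}
\]
% Second line: d[A,\omega] = [dA,\omega] - [A,d\omega], the graded Leibniz rule for the 1-form A.
% Third line: [A,[A,\omega]] = (A\wedge A)\,\omega - \omega\,(A\wedge A), since the mixed terms
% A\wedge\omega A and A\omega\wedge A coincide under matrix multiplication and cancel.
```

The same cancellation pattern, with additional signs, gives the identity for forms of higher degree.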
Yang–Mills equations
The Yang–Mills equations arise as the Euler–Lagrange equations for the Yang–Mills action functional which we now recall. The inner product in \({\mathfrak {g}}\) naturally induces a pairing \(\langle \cdot ,\cdot \rangle _{\text {Ad}}\)
If \(\star \) denotes the Hodge star operator of the Minkowski metric, the Yang–Mills functional is given by
If G is a subgroup of the unitary group, we may take as adjoint invariant inner product \(-\text {trace}(XY)\), where X, Y are matrices in \({\mathfrak {g}}\), and thus \(S_{{{\,\mathrm{YM}\,}}}(A)\) may also be written as a constant multiple of
as is frequently found in the physics literature. From this functional one easily derives the Yang–Mills equations (1):
\(d_{A}^*F_{A}=0,\)
where \(d_{A}^*\) is the formal adjoint of \(d_A\), given by
\(d_{A}^*=\star d_{A}\star .\)
(In general for a Lorentzian spacetime of dimension m, the formal adjoint acting on k-forms has the expression \(d_{A}^*=(-1)^{m+km}\star d_{A}\star \).)
The Yang–Mills equations are gauge invariant in the sense that if two connections A and B are gauge equivalent and if A satisfies (1) then also B satisfies \(d_{B}^*F_{B}=0\). The connections A and B being gauge equivalent means that there is a section \({\mathbf {U}}\in C^\infty (M; G)\) such that \(B = {\mathbf {U}}^{-1} d{\mathbf {U}} + {\mathbf {U}}^{-1} A {\mathbf {U}}\); we refer to this relation as (2).
This property can be easily deduced from the fact that the action \(S_{{{\,\mathrm{YM}\,}}}\) is gauge invariant.
Main result
We will consider an inverse problem for the Yang–Mills equations in the causal diamond
For a fixed \(0< \epsilon _0 < 1\), the data will be given on the subset
We say that \(A \in \Omega ^{1}({\mathbb {D}};{\mathfrak {g}})\) is a background connection if it satisfies the Yang–Mills equations (1) in \({\mathbb {D}}\). Due to the gauge invariance, the determination of a background connection on \({\mathbb {D}}\) is considered only up to the action of the following pointed gauge group
where \(p = (-1, 0) \in {{\overline{\mho }}}\). The reason for considering the pointed gauge group instead of the full gauge group
is technical in nature as we shall explain below, see discussion after Lemma 6. Both gauge groups are clearly related by \(G({\mathbb {D}})/G^{0}({\mathbb {D}},p)=G\).
For \(A, B \in C^k({\mathbb {D}};T^*{\mathbb {D}}\otimes {\mathfrak {g}})\), with \(k \in {\mathbb {N}}\), we say that \(A \sim B\) in \({\mathbb {D}}\) if there is \({\mathbf {U}}\in G^0({\mathbb {D}},p)\) such that (2) holds in \({\mathbb {D}}\). Moreover, we write
and say that \(A \sim B\) near \(\partial ^- {\mathbb {D}}\) if there are \({\mathbf {U}}\in G^0({\mathbb {D}},p)\) and a neighbourhood \({\mathcal {U}} \subset {\mathbb {D}}\) of \(\partial ^- {\mathbb {D}}\) such that (2) holds in \({\mathcal {U}} \cap {\mathbb {D}}\). The sets \({\mathbb {D}}\), \(\mho \) and \(\partial ^- {\mathbb {D}}\) are visualized in Figure 1.
We let A be a background connection, and consider the data set
Let us remark that we could consider the source-to-solution map given in Proposition 4 instead of the more abstract data set \({\mathcal {D}}_A\). We prefer to formulate our main result using \({\mathcal {D}}_A\) since the definition of the source-to-solution map is technical, requiring suitable gauge fixing among other things. In fact, it is precisely in the proof of Proposition 4 that the pointed gauge group is needed. Nevertheless, intuitively, it is helpful to think of the data set as that produced by an observer creating sources J supported in \(\mho \) and observing solutions V to \(d_{V}^*F_{V}=J\) in \(\mho \).
The data set \({\mathcal {D}}_A\) could also be reformulated in terms of the pairs \((J, V|_\mho )\) satisfying \(d_{V}^*F_{V}=J\), with J supported in \(\mho \). This formulation, while being somewhat redundant as \(J = d_{V}^*F_{V}\) can be computed given \(V|_\mho \), suggests viewing \({\mathcal {D}}_A\) informally as the graph of the map taking J to \(V|_\mho \). However, we reiterate that defining such a map requires care. In addition to gauge fixing, we need to take into account the compatibility condition \(d_{V}^*J=0\) that every source must satisfy, see Lemma 2. Our abstract formulation of the data set \({\mathcal {D}}_{A}\) bypasses these problems while incorporating the natural gauge invariance of the theory.
We are now ready to formulate our main result.
Theorem 1
Suppose that \(A, B \in \Omega ^{1}({\mathbb {D}};{\mathfrak {g}})\) solve (1) in \({\mathbb {D}}\). Then \({\mathcal {D}}_A = {\mathcal {D}}_B\) if and only if \(A \sim B\) in \({\mathbb {D}}\).
Clearly if \(A \sim B\) in \({\mathbb {D}}\) then \({\mathcal {D}}_A = {\mathcal {D}}_B\). The nontrivial content of the theorem is the opposite implication. It follows from Proposition 10 in Appendix B that if A and B are as in the theorem, then \(A \sim B\) in \({\mathbb {D}}\) if and only if \(A \sim B\) near \(\partial ^- {\mathbb {D}}\).
Outline of the proof of Theorem 1
The objective is to reduce the proof of the theorem to an inversion result for a broken nonabelian light ray transform as in [7]. The broken light ray transform that arises in this paper is that related to the adjoint representation given the natural habitat of the Yang–Mills theories. In [7] we studied the broken light ray transform associated with the fundamental representation, so our first task is to relate the two.
To go from the data set \({\mathcal {D}}_{A}\) to the broken nonabelian light ray transform we follow the template laid out in [7] where a considerably simpler wave equation with cubic nonlinearity was studied. The first step is then to process the abstract data set and convert it into a manageable source-to-solution map and this already brings the question of gauge fixing to the forefront. The construction of the source-to-solution map uses two types of gauges: the temporal gauge and the relative Lorenz gauge. The temporal gauge is easy to implement as it involves solving a linear matrix ODE to make the time component of a Yang–Mills potential A vanish, that is, \(A_0=0\). This gauge is particularly suited to prove uniqueness results, cf. Proposition 2 below.
It is important to remark that uniqueness does really depend on the shape of the set where the connections satisfy the Yang–Mills equations. The causal diamond \({\mathbb {D}}\) has the special feature that perturbations cannot propagate in it through the top boundary \(|x|=1-t\), whereas the bottom boundary is under control due to the assumed gauge equivalence near \(\partial ^- {\mathbb {D}}\). In particular, even if a background connection A satisfies the Yang–Mills equations on a larger set than \({\mathbb {D}}\), we do not expect to be able to recover it outside \({\mathbb {D}}\) given data on \(\mho \). Moreover, it does not appear to be possible to prove Theorem 1 using presently known unique continuation results, as discussed in more detail below.
A connection V is said to be in relative Lorenz gauge with respect to the background A if \(d_{A}^* V = d_{A}^* A\). The advantage of this gauge is that if A satisfies Yang–Mills \(d_{A}^*F_{A}=0\), and \(d_{V}^*F_{V}=J\), then the difference \(W=V-A\) satisfies a semilinear wave equation where the leading part is given by the connection wave operator \(\Box _{A}=d_{A}d^{*}_{A}+d_{A}^*d_{A}\), cf. (23). This is very helpful for solving the forward problem and for the microlocal analysis used to extract information from the source-to-solution map.
Following [7], the idea is to consider the nonlinear interaction of three singular waves produced by sources which are conormal distributions. We carefully track the principal symbol produced by the nonlinear interaction and extract from that the nonabelian broken light ray transform. This requires a delicate calculation unlike anything in the previous literature, in which the structure of the Lie algebra \({\mathfrak {g}}\) comes into consideration. This is the technical core of the proof, and perhaps one of the most innovative aspects of the paper. After this computation, contained in Section 8.2, there is one further hurdle to overcome: to use the source-to-solution map we must revert back to the temporal gauge and check that no information is lost in the process.
Discussion and comparison with previous literature
It is tempting to think that a result like Theorem 1 can be obtained from a unique continuation principle. It must be stressed that unique continuation for linear wave equations with time-dependent coefficients is simply false as there are counterexamples [1]. Although the difference of two solutions to the Yang–Mills equations in the Lorenz gauge satisfies a linear wave equation (with coefficients depending on both the solutions), due to unique continuation failing, our inverse problem is not “immediately solvable” and hence a different approach is needed. We mention that an inverse problem for Yang–Mills connections on a Riemannian manifold was studied in [6]. The proofs there are based on unique continuation for elliptic systems, however, the elliptic case is very different from the hyperbolic one.
This paper sits firmly within the program, initiated in [7], that is motivated by the Yang–Mills–Higgs system. In addition to the Yang–Mills potential A, a Higgs field \(\Phi \in C^{\infty }(M,{\mathfrak {g}})\) is present in this system. The equations for the pair of fields \((A,\Phi )\) are given by
where \(V'\) is the derivative of a smooth function \(V: [0,\infty )\rightarrow {\mathbb {R}}\). More generally, we can consider these equations when \(\Phi \) is a section of an associated bundle determined by a given representation of G. The focus of [7] was the recovery of A via the second equation (5), when V is assumed to be a quadratic potential (the most popular choice in Yang–Mills–Higgs theories): this turns (5) into a wave equation with a cubic nonlinearity. The present paper focuses on the first equation (4); more precisely in the pure Yang–Mills case where \(\Phi =0\). There are two substantial differences between [7] and the present paper. First, when A is fixed, the second equation (5) is no longer gauge invariant, and hence the construction of the source-to-solution map in [7] does not require gauge fixing. Second, the quadratic potential V leads to a particularly simple nonlinear structure in [7], and the resulting analysis of principal symbols is much more straightforward than in the present paper.
As already mentioned above, we consider the nonlinear interactions of three singular waves. Interaction of singular waves has been studied outside the context of inverse problems. In particular, the wave front set of a triple cross-derivative has been studied in the case of the \(1+2\)-dimensional Minkowski space by Rauch and Reed [39]. The references [3, 24, 34, 35, 40] have results of similar nature. The use of nonlinear interactions in the context of inverse problems was initiated in [29], where the wave front set resulting from the interaction of four singular waves was studied. The same approach was used for the Einstein equations in [28], and subsequently in [32, 46], in some ways the closest previous results to ours. For a review of this approach, see [30]. We observed in our above mentioned work [7] that it is sufficient to consider interactions of three singular waves, simplifying the analysis. Threefold interactions are used in the present paper.
Nonlinearities allow solving inverse problems that are open for the corresponding linearized equations. In particular, the inverse problem for the linearized Yang–Mills equation, see e.g. (32) below (where some lower order terms are discarded), is open. The only known results are in the case \(G=U(1)\), see [12, 41], and these results impose convexity assumptions not satisfied by the geometric setting of Figure 1. The same is true for the recovery of zeroth order terms, solved with and without convexity assumptions for certain scalar linear [43] and nonlinear wave equations [14], respectively.
We mention that nonlinear interactions have also been used to recover nonlinear terms for scalar wave equations [33], scalar elliptic equations [13, 31], and scalar real principal type equations [38]. In these four works, nonlinear terms do not contain any derivatives, contrary to the Einstein and Yang–Mills equations. Nonlinear interactions involving derivatives have also been studied in the context of scalar wave equations [47] and elastodynamics [10]. In addition, inverse problems have been studied for various nonlinear equations using methods originally developed in the context of linear elliptic equations. In particular, the method of complex geometrical optics originating from [45], and importantly extended by [27, 37], was first applied to an inverse coefficient determination problem for a nonlinear parabolic equation [21] and subsequently to several other inverse problems [2, 5, 22, 23, 25, 42, 44].
There are numerous analogies between the problem studied here and that of the Einstein equations considered in [28]. For starters, both problems have gauges: in the Einstein case the gauge group is the diffeomorphism group. The role of the relative Lorenz gauge is played by wave coordinates and one could also say that the Fermi coordinates used in [28] are the analogue of the temporal gauge. Both problems have a compatibility condition for the sources: the Einstein tensor has zero divergence and Yang–Mills has \(d_{A}^*d_{A}^*F_{A}=0\).
However, there are important differences and we want to stress those, since they are essential in resolving the inverse problem in the different contexts. After suitable gauge fixing and linearization, both the Einstein and Yang–Mills equations reduce to a linear wave equation. The unknown Lorentzian metric appears in the leading order terms of the equation in the former case while the background gauge field A features at the subprincipal level in the latter case. The Lorentzian metric affects the Lagrangian geometry of the parametrix for the wave equation but the effect of A is visible only in the principal symbol of the parametrix. Thus the need for a symbol calculation in the present paper that takes into consideration the structure of the Lie algebra \({\mathfrak {g}}\). Finally, the two inverse problems reduce to very different purely geometric problems. In our case, we read the broken nonabelian light ray transform from certain principal symbols, whereas in the Einstein case, the so-called light observation sets are obtained by analysing the wave front sets of suitable solutions, see [17, 29] for the corresponding geometric problem.
Outline of the paper
Section 2 introduces parallel transport in both the principal and the adjoint representation and reduces Theorem 1 to inversion of the broken nonabelian light ray transform via [7, Proposition 2] in the case that G has finite centre. Section 3 discusses the Yang–Mills equations with a source. Section 4 introduces the relative Lorenz gauge and the temporal gauge, thus setting up the scene for the source-to-solution map. The latter is discussed in Section 5 where the important Proposition 4 is proved. Section 6 computes the equations for the triple cross-derivative when three sources are introduced. Section 7 supplies the necessary tools from microlocal analysis needed to compute the symbol of the triple interaction and the latter is computed in Section 8. Section 9 proves a result about the structure of Lie algebras with trivial centre, and completes the proof of Theorem 1 in the case that G has finite centre. The final Section 10 contains the proof of Theorem 1 in the general case.
There are three appendices, the first of which derives explicit formulas in coordinates, for example, for \(d_A^* F_A\). The second appendix discusses the direct problem for the Yang–Mills equations, and the last one gives an elementary alternative to the result in Section 9 in the case that \({\mathfrak {g}}= {{\,\mathrm{{\mathfrak {su}}}\,}}(n)\) with \(n \ge 2\).
Parallel Transport
We will explain in Section 10 how the case of an arbitrary compact, connected Lie group G can be reduced to the case that G has finite centre, that is, the set
\(Z(G) = \{g \in G : gh = hg \text{ for all } h \in G\}\)
is finite. In this case, the proof of Theorem 1 will ultimately boil down to inversion of a nonabelian broken light ray transform. This transform is the composition of two parallel transports, and we begin by defining the parallel transport used in the paper.
For the moment we may let (M, g) be any Lorentzian manifold, and G any compact matrix Lie group with Lie algebra \({\mathfrak {g}}\). However, we will work with trivial bundles for simplicity. Let \(A\in \Omega ^{1}(M;{\mathfrak {g}})\) be a connection and let us first define the parallel transport on the principal bundle \(M \times G\) with respect to A: the parallel transport \({\mathbf {U}}_\gamma ^A\) along a curve \(\gamma :[0,T]\rightarrow M\) is given by \({\mathbf {U}}_\gamma ^A = U(T)\) where U is the solution of the ordinary differential equation
Here \(\left\langle \cdot ,\cdot \right\rangle \) is the pairing between covectors and vectors.
In general, if \({\mathbb {V}}\) is a vector space and \(\rho : G\rightarrow {{\,\mathrm{GL}\,}}({\mathbb {V}})\) is a linear representation, the parallel transport on the associated vector bundle \(M\times {\mathbb {V}}\) is defined by \({\mathbf {P}}_\gamma ^{A,\rho } = \rho ({\mathbf {U}}_\gamma ^A)\). Two representations will be of importance to us. First, when \(G \subset {{\,\mathrm{GL}\,}}({\mathbb {C}}^n)\) and \({\mathbb {V}} = {\mathbb {C}}^n\) we have the representation given by \(\rho ={{\,\mathrm{id}\,}}\). In other words, \({\mathbf {P}}_\gamma ^{A,{{\,\mathrm{id}\,}}} v = {\mathbf {U}}_\gamma ^A v\) for \(v \in {\mathbb {V}}\). We call this the principal representation.
Second, when \({\mathbb {V}}={\mathfrak {g}}\) we have the adjoint representation \(\rho = {{\,\mathrm{Ad}\,}}\) where \({{\,\mathrm{Ad}\,}}(h)\), \(h \in G\), is typically written \({{\,\mathrm{Ad}\,}}_h\) and defined by \({{\,\mathrm{Ad}\,}}_h b = h b h^{-1}\) for \(b \in {\mathfrak {g}}\). We have
It is straightforward to verify that \(W(t) = U(t) b U^{-1}(t)\) solves
where U is the solution of (6).
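The verification can be sketched in two lines. The display below assumes the sign convention \({\dot{U}} + \left\langle A,{\dot{\gamma }}\right\rangle U = 0\), \(U(0) = {{\,\mathrm{id}\,}}\), for the transport equation (6), and uses the abbreviation \(a(t) = \left\langle A(\gamma (t)),{\dot{\gamma }}(t)\right\rangle \), introduced here only for brevity; with the opposite sign convention the commutator term changes sign accordingly.

```latex
% Assumption: (6) reads \dot U = -a U with a(t) = <A(\gamma(t)), \dot\gamma(t)>, U(0) = id.
% Claim: W(t) = U(t) b U(t)^{-1} satisfies \dot W + [a, W] = 0, W(0) = b.
\[
\begin{aligned}
\dot W &= \dot U\, b\, U^{-1} + U\, b\, \tfrac{d}{dt}\bigl(U^{-1}\bigr)
        = \dot U\, b\, U^{-1} - U\, b\, U^{-1}\dot U\, U^{-1} \\
       &= -a\,U b\, U^{-1} + U b\, U^{-1} a
        = -[a, W], \qquad W(0) = b .
\end{aligned}
\]
% The first line uses d(U^{-1})/dt = -U^{-1}\dot U U^{-1}; the second inserts \dot U = -aU.
```

In particular, the adjoint parallel transport is governed by the commutator with the same matrix \(a(t)\) that drives the transport on the principal bundle.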
When M is a convex subset of Minkowski space \({\mathbb {R}}^{1 + 3}\) and \(x, y \in M\), there is a unique geodesic \(\gamma \) from x to y, up to reparametrization. The parallel transport \({\mathbf {U}}_\gamma ^A\) does not depend on the parametrization of \(\gamma \), and we write simply \({\mathbf {P}}_{y \leftarrow x}^{A, \rho } = {\mathbf {P}}_\gamma ^{A, \rho }\) in this case.
We are now ready to define the nonabelian broken light ray transforms used in the proof of Theorem 1. We write
where \(x<y\) means that there is a future pointing causal curve from x to y. (For \((x,y) \in {\mathbb {L}}\), we have \(x<y\) if and only if the time coordinate of \(y-x\) is strictly positive.) Define
We will reduce the transform \({\mathbf {S}}^{A,{{\,\mathrm{Ad}\,}}}_{z \leftarrow y \leftarrow x}\) to \({\mathbf {S}}^{A,{{\,\mathrm{id}\,}}}_{z \leftarrow y \leftarrow x}\) as follows:
Lemma 1
Suppose that a compact, connected matrix Lie group G has finite centre and let \(A, B \in \Omega ^{1}({\mathbb {D}};{\mathfrak {g}})\). If \({\mathbf {S}}^{A,{{\,\mathrm{Ad}\,}}}_{z \leftarrow y \leftarrow x} = {\mathbf {S}}^{B,{{\,\mathrm{Ad}\,}}}_{z \leftarrow y \leftarrow x}\) for all \((x,y,z) \in {\mathbb {S}}^+(\mho )\) then \({\mathbf {S}}^{A,{{\,\mathrm{id}\,}}}_{z \leftarrow y \leftarrow x} = {\mathbf {S}}^{B,{{\,\mathrm{id}\,}}}_{z \leftarrow y \leftarrow x}\) for all \((x,y,z) \in {\mathbb {S}}^+(\mho )\).
Proof
Let \((x,y,z) \in {\mathbb {S}}^+(\mho )\) and \(b \in {\mathfrak {g}}\). Then \({\mathbf {u}}b = b {\mathbf {u}}\) where
As this holds for all \(b \in {\mathfrak {g}}\) we see that \({\mathbf {u}}\) is in the centre Z(G). For the convenience of the reader we recall the proof of this well-known fact. Let \(h \in G\). As G is connected, there is a path \(H : [0,1] \rightarrow G\) satisfying \(H(0) = {{\,\mathrm{id}\,}}\) and \(H(1) = h\). Define the path \(F(t) = {\mathbf {u}}H(t) {\mathbf {u}}^{-1} H^{-1}(t)\) in G. Then \(F(0) = {{\,\mathrm{id}\,}}\) and
where we used the fact that \(b = H^{-1} {\dot{H}} \in {\mathfrak {g}}\) commutes with \({\mathbf {u}}^{-1}\). We conclude that \({\mathbf {u}}h {\mathbf {u}}^{-1} h^{-1} = F(1) = {{\,\mathrm{id}\,}}\).
Now \({\mathbf {u}}\in Z(G)\) depends continuously on x, y and z, and \({\mathbf {u}}\rightarrow {{\,\mathrm{id}\,}}\) when \(y \rightarrow x\) and \(z \rightarrow x\). As Z(G) is finite, we have \({\mathbf {u}}= {{\,\mathrm{id}\,}}\), and therefore
\(\square \)
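The derivative of the path F used in the proof above can be expanded as follows. This is a sketch assuming that the path H is chosen differentiable, with \(b(t) = H^{-1}(t){\dot{H}}(t) \in {\mathfrak {g}}\), so that \({\dot{H}} = Hb\).

```latex
% F(t) = u H(t) u^{-1} H(t)^{-1}, with u constant in t and \dot H = H b.
% Assumption: H is differentiable, so b(t) = H^{-1}\dot H is defined and lies in g.
\[
\begin{aligned}
\dot F &= \mathbf{u}\,\dot H\,\mathbf{u}^{-1} H^{-1}
          - \mathbf{u}\, H\, \mathbf{u}^{-1} H^{-1}\dot H\, H^{-1} \\
       &= \mathbf{u}\, H\, b\, \mathbf{u}^{-1} H^{-1}
          - \mathbf{u}\, H\, \mathbf{u}^{-1} b\, H^{-1} \\
       &= \mathbf{u}\, H\,\bigl( b\,\mathbf{u}^{-1} - \mathbf{u}^{-1} b \bigr) H^{-1} = 0 .
\end{aligned}
\]
% The last equality holds because u b = b u for all b in g implies b u^{-1} = u^{-1} b.
```

Since \(F(0) = {{\,\mathrm{id}\,}}\) and the derivative vanishes, F is constant, which is exactly what the proof uses at \(t = 1\).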
We have previously inverted the transform \({\mathbf {S}}^{A,{{\,\mathrm{id}\,}}}_{z \leftarrow y \leftarrow x}\) in the case of the unitary group \(G = \mathrm {U}(n)\), see Proposition 2 of [7], where a slightly different choice of \(\mho \) and \({\mathbb {D}}\) is used. However, the proof works for any matrix Lie group, and also for the present choice of \(\mho \) and \({\mathbb {D}}\). Moreover, the gauge \({\mathbf {u}}\) defined in Lemma 3 of [7] is smooth up to \(\partial {\mathbb {D}}\) whenever the two connections A and B are smooth up to \(\partial {\mathbb {D}}\).
Until treating the case of an arbitrary compact, connected Lie group in Section 10, we will focus on proving:
Proposition 1
Suppose that G has finite centre. If A and B are as in Theorem 1 and if \({\mathcal {D}}_A = {\mathcal {D}}_B\), then there are \({{\tilde{A}}} \sim A\) and \({{\tilde{B}}} \sim B\) in \({\mathbb {D}}\) such that \({\mathbf {S}}^{{{\tilde{A}}},{{\,\mathrm{Ad}\,}}}_{z \leftarrow y \leftarrow x} = {\mathbf {S}}^{{{\tilde{B}}},{{\,\mathrm{Ad}\,}}}_{z \leftarrow y \leftarrow x}\) for all \((x,y,z) \in {\mathbb {S}}^+(\mho )\).
Under the additional assumption that G has finite centre, Theorem 1 follows then from Proposition 1, Lemma 1 and the proof of Proposition 2 in [7].
Yang–Mills Equations with a Source
In this section we let (M, g) be any oriented Lorentzian manifold, and consider the Yang–Mills equations with a source (9),
\(d_{V}^*F_{V}=J\)
on M. Here the source J cannot be arbitrarily chosen but must obey the compatibility condition (10),
\(d_{V}^*J=0,\)
due to the following well-known lemma. We give a proof for the convenience of the reader.
Lemma 2
Let \(V \in C^3(M;T^*M\otimes {\mathfrak {g}})\). Then \(d^*_V d^*_V F_V = 0\), and the Yang–Mills equations with a source (9) imply the compatibility condition (10).
Proof
Since \(d_{V}^*= \pm \star d_{V}\star \) we see that given any \(\omega \in \Omega ^{k}(M;{\mathfrak {g}})\) we have
So it is enough to prove that \([F_{V},\star F_{V}]=0\). But this is a purely algebraic fact that holds for any \(\omega \in \Omega ^{2}(M;{\mathfrak {g}})\), that is,
This is equivalent with
To check this, write \(\omega =\omega _{ij}dx^{i}\wedge dx^{j}\) and note that
if and only if \(i=k\), \(j=l\), \(i \ne j\) and \(k \ne l\). Thus
and since
\(\star \omega \wedge \omega \) has the same expression and (11) holds. \(\quad \square \)
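The reduction in the proof above to the algebraic identity can be written out explicitly. The sketch below uses only \(d_{V}^*= \pm \star d_{V}\star \), the fact that \(\star \star = \pm 1\) on forms of fixed degree, and \(d_V^2 \eta = [F_V,\eta ]\); the graded-commutator convention \([\omega ,\eta ] = \omega \wedge \eta - (-1)^{pq}\eta \wedge \omega \) is assumed, and m denotes the dimension of M.

```latex
% Step 1: two codifferentials collapse to one commutator with the curvature.
\[
d_V^* d_V^* \omega
  = (\pm\star d_V \star)(\pm\star d_V\star)\,\omega
  = \pm \star d_V d_V \star \omega
  = \pm \star [F_V, \star\omega].
\]
% Step 2: for a 2-form \omega, \star\omega has degree m-2, so the graded sign is (-1)^{2(m-2)} = +1.
\[
[\omega, \star\omega]
  = \omega\wedge\star\omega - (-1)^{2(m-2)}\,\star\omega\wedge\omega
  = \omega\wedge\star\omega - \star\omega\wedge\omega .
\]
% Hence [F_V, \star F_V] = 0 is the statement \omega\wedge\star\omega = \star\omega\wedge\omega with \omega = F_V.
```

Taking \(\omega = F_V\) in the first line gives \(d_V^* d_V^* F_V = \pm \star [F_V,\star F_V]\), so the lemma follows once the two wedge products are shown to agree.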
The next lemma, proven again for convenience, implies that the source in (9) changes to \({\mathbf {U}}^{-1} J {\mathbf {U}}\) when a gauge transformation \({\mathbf {U}}\in C^\infty (M,G)\) acts on V. We use the shorthand notation \(B = {\mathbf {U}}\cdot A\) for (2).
Lemma 3
\(B = {\mathbf {U}}\cdot A\) implies \(d_{B}^*F_{B} = {\mathbf {U}}^{-1} (d_{A}^*F_{A}) {\mathbf {U}}\).
Proof
By assumption
A direct calculation from the definitions shows that
Using \(d^*_{A}=\star d_{A}\star \) and (13) we see that
since \(F_{B}={\mathbf {U}}^{-1}F_{A}{\mathbf {U}}\). \(\quad \square \)
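The transformation rule for the curvature used at the end of the proof can be checked directly. The sketch below assumes that the gauge relation (2) has the form \(B = {\mathbf {U}}^{-1}d{\mathbf {U}} + {\mathbf {U}}^{-1}A{\mathbf {U}}\) and that \(F_A = dA + A\wedge A\); it also uses \(d({\mathbf {U}}^{-1}) = -{\mathbf {U}}^{-1}(d{\mathbf {U}}){\mathbf {U}}^{-1}\).

```latex
% Assumed form of (2): B = U^{-1} dU + U^{-1} A U.  Curvature: F = d(.) + (.)\wedge(.).
\[
\begin{aligned}
dB &= -\mathbf{U}^{-1} d\mathbf{U}\wedge \mathbf{U}^{-1} d\mathbf{U}
      - \mathbf{U}^{-1} d\mathbf{U}\wedge \mathbf{U}^{-1} A\,\mathbf{U}
      + \mathbf{U}^{-1} (dA)\,\mathbf{U}
      - \mathbf{U}^{-1} A\wedge d\mathbf{U}, \\
B\wedge B &= \mathbf{U}^{-1} d\mathbf{U}\wedge \mathbf{U}^{-1} d\mathbf{U}
      + \mathbf{U}^{-1} d\mathbf{U}\wedge \mathbf{U}^{-1} A\,\mathbf{U}
      + \mathbf{U}^{-1} A\wedge d\mathbf{U}
      + \mathbf{U}^{-1} (A\wedge A)\,\mathbf{U}, \\
F_B &= dB + B\wedge B
     = \mathbf{U}^{-1}\,(dA + A\wedge A)\,\mathbf{U}
     = \mathbf{U}^{-1} F_A\, \mathbf{U}.
\end{aligned}
\]
% All terms containing dU cancel pairwise between dB and B\wedge B.
```

The curvature therefore transforms by conjugation, with no derivative of \({\mathbf {U}}\) surviving, which is what makes the source transform as \({\mathbf {U}}^{-1} J {\mathbf {U}}\).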
Gauge Fixing
Gauge fixing is a mathematical procedure for coping with redundant degrees of freedom in field variables. Our work uses two gauges, namely the temporal gauge and the relative Lorenz gauge. While these are typical gauge choices, we will give below a self-contained presentation of certain, perhaps less commonly used, properties of these gauges.
Temporal gauge
In this section we write \((x^0, x^1, x^2, x^3) = (t,x) \in {\mathbb {R}}^{1+3}\) for the Cartesian coordinates. The signature convention \((-+++)\) is chosen for the Minkowski metric. A connection \(A \in \Omega ^1(M;{\mathfrak {g}})\), with \(M \subset {\mathbb {R}}^{1+3}\), is said to be in the temporal gauge if \(A_0 = 0\) where \(A = A_\alpha dx^\alpha \).
For a connection \(V \in \Omega ^1({\mathbb {D}}; {\mathfrak {g}})\) we define a connection \({\mathscr {T}}(V)\) in temporal gauge by
and \(\psi (x) = |x|-1\). Observe that \(\{(t,x) \in {\mathbb {D}} : t = \psi (x)\} = \partial ^- {\mathbb {D}}\) and \({\mathbf {U}}\in G^0({\mathbb {D}},p)\). Therefore \({\mathscr {T}}(V) \sim V\) in \({\mathbb {D}}\).
We shall prove the following uniqueness result:
Proposition 2
Let \(A, B \in C^3({\mathbb {D}};T^*{\mathbb {D}}\otimes {\mathfrak {g}})\) solve the Yang–Mills equations (1) in the set \({\mathbb {D}} \setminus \mho \). Suppose that \(d_A^* F_A = d_B^* F_B\) in \(\mho \) and that there is \({\mathbf {U}}\in C^\infty ({\mathbb {D}}; G)\) such that \(A = {\mathbf {U}}\cdot B\) near \(\partial ^- {\mathbb {D}}\) and that \({\mathbf {U}}= {{\,\mathrm{id}\,}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\). Suppose, furthermore, that both A and B are in the temporal gauge. Then \({\mathbf {U}}\) does not depend on t, and \(A = {\mathbf {U}}\cdot B\) in \({\mathbb {D}}\).
Reduced equations
We follow a reduction given in [9]. Suppose that a connection \(A \in \Omega ^1(M;{\mathfrak {g}})\) is in temporal gauge and write \(d_A^* F_A = J\). For the convenience of the reader, we give a proof of the following formula, see Lemma 12 in Appendix A,
Here, and throughout the paper, indices are raised and lowered by using the Minkowski metric. Taking \(\beta = 0\) we get the constraint equation
with \(a=1,2,3\), and taking \(\beta = j = 1,2,3\) we get
Here \(\partial _x A = (\partial _1 A, \partial _2 A, \partial _3 A)\) and \({{\tilde{N}}}_j\) contains the terms that are of order one and zero,
In the remainder of this section, we will use systematically Greek letters for indices over 0, 1, 2, 3 and Latin letters for 1, 2, 3.
We differentiate (15) using \(\partial _j\) and (16) using \(\partial _0\), to obtain
Substituting the first equation into the second one gives
where we have written
and
We call (17) the reduced Yang–Mills equations.
Pseudolinearization
Observe that for bilinear and trilinear forms b and m,
Hence if A and \({{\tilde{A}}}\) satisfy (17) with the same J, then the difference \(A-{{\tilde{A}}}\) satisfies a linear equation of the form
where \(X_j\), \(j=1,2\), are first order differential operators in the \(x^1, x^2\) and \(x^3\) variables, with coefficients that depend on A and \({{\tilde{A}}}\), and whence also on the \(x^0\) variable. Writing \(u = A-{{\tilde{A}}}\), \(Y_1 = 1\) and \(Y_2 = 0\), the system (19) is equivalent to (65), with \(f_1 = 0\) and \(f_2 = 0\), studied in Appendix B.
Proof of Proposition 2
\(A_0 = 0 = B_0\) implies that \({\mathbf {U}}^{-1} \partial _t {\mathbf {U}}= 0\), that is, \(\partial _t {\mathbf {U}}= 0\). Due to its time-independence, \({\mathbf {U}}\) is well-defined and smooth in the whole of \({\mathbb {D}}\) and \({\mathbf {U}}= {{\,\mathrm{id}\,}}\) in \(\mho \). We define \({{\tilde{A}}} = {\mathbf {U}}\cdot B\) and proceed to show that \(A = {{\tilde{A}}}\) in \({\mathbb {D}}\).
As \({{\tilde{A}}}\) is gauge equivalent to B, the Yang–Mills equations \(d_{{{\tilde{A}}}}^* F_{{{\tilde{A}}}} = 0\) hold in \({\mathbb {D}} \setminus \mho \). As \({\mathbf {U}}= {{\,\mathrm{id}\,}}\) in \(\mho \), we have \({{\tilde{A}}} = B\) in \(\mho \). Therefore \(d_{{{\tilde{A}}}}^* F_{{{\tilde{A}}}} = d_{A}^* F_{A}\) in \(\mho \). As \({\mathbf {U}}\) does not depend on t, we see that \({{\tilde{A}}}_0 = 0\). Hence A and \({{\tilde{A}}}\) are two solutions to the reduced Yang–Mills equations (17), with the same J, and the difference \(A - {{\tilde{A}}}\) satisfies (19). As they also coincide near \(\partial ^- {\mathbb {D}}\), Lemma 14 in Appendix B implies that \(A={{\tilde{A}}}\) in \({\mathbb {D}}\).
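The two statements about time components in the proof above can be read off from the gauge relation. The sketch below assumes that (2) has the form \(B = {\mathbf {U}}^{-1}d{\mathbf {U}} + {\mathbf {U}}^{-1}A{\mathbf {U}}\), applied here with the roles \(A = {\mathbf {U}}\cdot B\) and \({{\tilde{A}}} = {\mathbf {U}}\cdot B\).

```latex
% Time component of A = U . B, valid where the gauge relation holds:
\[
A_0 = \mathbf{U}^{-1}\partial_t \mathbf{U} + \mathbf{U}^{-1} B_0\, \mathbf{U},
\qquad A_0 = 0 = B_0
\;\Longrightarrow\; \mathbf{U}^{-1}\partial_t\mathbf{U} = 0 .
\]
% Time component of \tilde A = U . B once \partial_t U = 0 and B_0 = 0 are known:
\[
\tilde A_0 = \mathbf{U}^{-1}\partial_t \mathbf{U} + \mathbf{U}^{-1} B_0\,\mathbf{U} = 0 .
\]
```

So the temporal gauge is preserved exactly by the time-independent gauge transformations, which is why \({{\tilde{A}}}\) again satisfies the reduced equations.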
Relative Lorenz gauge
For a moment we may let (M, g) be any oriented Lorentzian manifold of even dimension. Consider two connections A and V on M solving the Yang–Mills equations without (1) and with (9) a source, respectively. That is, \(d_A^* F_A = 0\) and \(d_V^* F_V = J\). We will rewrite the latter equation in terms of the difference \(W = V - A\).
Directly from the definition of curvature
and thus
Since \(d_A^*= \star d_A \star \) it follows that \(d_V^*= d_A^*+ \star [W, \star \cdot ]\). Combining this with (20) and \(d_A^* F_A = 0\), we see that \(d_V^* F_V = J\) is equivalent with
where the nonlinear part reads
We say that \(V \in \Omega ^1(M;{\mathfrak {g}})\) is in the Lorenz gauge relative to a background connection \(A \in \Omega ^1(M;{\mathfrak {g}})\) if \(d_A^*V=d_A^*A\). In this case (21) is equivalent with
where \(\Box _A = d_A d^*_A + d^*_A d_A\) is the connection wave operator.
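The passage from the definition of curvature to the semilinear equation can be sketched as follows. The display assumes the conventions \(F_A = dA + \tfrac{1}{2}[A,A]\) and \(d_A\omega = d\omega + [A,\omega ]\), and uses \(d_V^*= d_A^*+ \star [W, \star \cdot ]\) as stated in the text; the grouping of the higher order terms into a single expression \(N(W)\) is one natural choice and may be organized differently in (22).

```latex
% Curvature of V = A + W (uses [A,W] = [W,A] for 1-forms):
\[
F_V = d(A+W) + \tfrac12 [A+W, A+W] = F_A + d_A W + \tfrac12 [W,W].
\]
% Apply d_V^* = d_A^* + \star[W,\star\,\cdot\,] and use d_A^* F_A = 0:
\[
\begin{aligned}
d_V^* F_V
 &= d_A^* d_A W + \star[W,\star F_A] + N(W), \\
N(W) &= \tfrac12\, d_A^*[W,W] + \star[W,\star d_A W] + \tfrac12 \star\bigl[W,\star [W,W]\bigr].
\end{aligned}
\]
% In the relative Lorenz gauge d_A^* W = 0, so d_A d_A^* W = 0 may be added on the left:
\[
\Box_A W + \star[W,\star F_A] + N(W) = J,
\qquad \Box_A = d_A d_A^* + d_A^* d_A .
\]
```

The terms in \(N(W)\) are quadratic and cubic in W and contain at most one derivative of W, so the equation is semilinear with principal part \(\Box _A\).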
The semilinear wave equation (23), together with suitable initial conditions, is solvable when the source J is small and smooth enough, see, for example, (the proof of) Theorem 6 in [26]. However, its solution W solves the actual Yang–Mills equations (21) if and only if \(d_A d_A^*W=0\). Recall also that if W solves (21), or equivalently (9), then J satisfies the compatibility condition (10). We will therefore study the system combining (10) and (23). Observe that (10) is equivalent with
where \(j=1,2,3\). This can be viewed as an ordinary differential equation for \(J_0\).
We begin with a uniqueness result that is similar to Proposition 2. For \(r > 0\) and \(x \in {\mathbb {R}}^{1+3}\) we define the rescaled and translated diamond
Lemma 4
Let \(r > 0\) and \(x \in {\mathbb {R}}^{1+3}\) and write \(\tilde{{\mathbb {D}}} = {\mathbb {D}}(x,r)\). Let \(A \in \Omega ^1(\tilde{{\mathbb {D}}},{\mathfrak {g}})\) and suppose that \(W_{(\ell )}, J_{(\ell )} \in C^2(\tilde{{\mathbb {D}}};T^*\tilde{{\mathbb {D}}}\otimes {\mathfrak {g}})\) solve
in \(\tilde{{\mathbb {D}}}\) for \(\ell =1,2\). Suppose, furthermore, that \(W_{(\ell )}, J_{(\ell )}\), \(\ell =1,2\), vanish near \(\partial ^- \tilde{{\mathbb {D}}}\) and that the spatial parts of \(J_{(1)}\) and \(J_{(2)}\) coincide on \(\tilde{{\mathbb {D}}}\), that is, \(J_{(1),j} = J_{(2),j}\) for \(j=1,2,3\). Then \(W_{(1)} = W_{(2)}\) and \(J_{(1)} = J_{(2)}\) in \(\tilde{{\mathbb {D}}}\).
Proof
Pseudolinearization analogous to that in Section 4.1.2 shows that the difference \((W_{(1)} - W_{(2)}, J_{(1)} - J_{(2)})\) solves a system of the form (65) in Appendix B with \(f_1 = 0\) and \(f_2 = 0\). The coefficients of this system depend on \(W_{(\ell )}, J_{(\ell )}\) and they satisfy the assumptions of Lemma 14 in Appendix B. Lemma 14 is formulated for \({\mathbb {D}}\) rather than for \(\tilde{{\mathbb {D}}}\); however, the form of the system (65) is invariant under rescaling and translation. Therefore Lemma 14 holds also for \(\tilde{{\mathbb {D}}}\), and we conclude by applying it. \(\square \)
We will now turn to existence of solutions to the Yang–Mills equations. It is convenient to work in the cylinder \(M = (-2,2) \times {\mathbb {R}}^3\) containing the diamond \({\mathbb {D}}\), rather than in \({\mathbb {D}}\). Let us consider again the system combining (10) and (23),
Lemma 5
Let \(A \in \Omega ^1(M; {\mathfrak {g}})\) and suppose that \(W,J \in C^3(M;T^*M\otimes {\mathfrak {g}})\) solve (25). Suppose moreover that A solves (1) in \({\mathbb {D}}\) and that \({{\,\mathrm{supp}\,}}(J_j)\), \(j=1,2,3\), is contained in the interior of \({\mathbb {D}}\). Then W solves (21) in \({\mathbb {D}}\), with J on the right-hand side.
Proof
The equations (21) and (23) differ by the term \(d_A d_A^* W\) on the left-hand side. Hence it is enough to verify that \(H = 0\) in \({\mathbb {D}}\) where \(H = d^*_A W\). We write \(V = W + A\). As A solves (1) in \({\mathbb {D}}\), \(d_V^* F_V\) coincides with the left-hand side of (21) in \({\mathbb {D}}\), and the first equation in (25), in other words (23), implies that \(d^*_V F_V + d_A H = J\) in \({\mathbb {D}}\). Applying \(d^*_V\) to this equation and using Lemma 2 together with the second equation in (25), we obtain \(d_V^* d_A H = 0\) in \({\mathbb {D}}\). This is a linear wave equation for H. We will show below that W vanishes near \(\partial ^- {\mathbb {D}}\). Hence also H vanishes near \(\partial ^- {\mathbb {D}}\), and as it satisfies the linear wave equation, it vanishes in the whole of \({\mathbb {D}}\). This type of finite speed of propagation result is of course standard, and it follows also from Lemma 14 in Appendix B.
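The derivation of the equation for H can be spelled out in one line. This merely expands the step above, reading Lemma 2 as the identity \(d_V^* d_V^* F_V = 0\) and the second equation in (25) as \(d_V^* J = 0\) (our interpretation of these two references).

$$\begin{aligned} 0 = d_V^* J = d_V^* d_V^* F_V + d_V^* d_A H = d_V^* d_A H \quad \text {in } {\mathbb {D}}. \end{aligned}$$

Since \(d_V^* = d_A^* + \star [W, \star \cdot ]\) and W enters without derivatives, the operator \(d_V^* d_A\) acting on the \({\mathfrak {g}}\)-valued function H differs from \(d_A^* d_A\) only by a first order term, which is why the equation is a linear wave equation.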
Let us now show that W vanishes near \(\partial ^- {\mathbb {D}}\). There is \(r \in (0,1)\) such that \({{\,\mathrm{supp}\,}}(J_j) \subset {\mathbb {D}}(0,r)\) for \(j=1,2,3\). Let \(\tilde{{\mathbb {D}}}\) in Lemma 4 satisfy \(\tilde{{\mathbb {D}}} \cap {\mathbb {D}}(0,r) = \emptyset \) and \(\partial ^- \tilde{{\mathbb {D}}} \subset \{t < 1\}\). Lemma 4 implies that \(W = 0\) in \(\tilde{{\mathbb {D}}}\) by comparison with the trivial solution. By varying \(\tilde{{\mathbb {D}}}\) we see that W vanishes in \(\{t \le 0\} \setminus {\mathbb {D}}(0,r)\), and also near \(\partial {\mathbb {D}} \cap \{t=0\}\). In particular, W vanishes near \(\partial ^- {\mathbb {D}}\). \(\quad \square \)
Remark 1
As the second equation in (25) is equivalent with the ordinary differential equation (24), we see that if \({{\,\mathrm{supp}\,}}(J_j) \subset (0,T) \times K\), \(j=1,2,3\), for some \(K \subset {\mathbb {R}}^3\), then also \({{\,\mathrm{supp}\,}}(J_0) \subset (0,T) \times K\) for a solution of (25).
We prove the following result in Appendix B.
Proposition 3
Suppose that \(A \in \Omega ^1(M; {\mathfrak {g}})\) is bounded, together with all its derivatives, and let \(k \ge 4\). Then there is a neighbourhood \({\mathcal {H}}\) of the zero function in \(H^{k+2}(M;{\mathfrak {g}})\) such that for all \(J_j \in {\mathcal {H}}\), \(j=1,2,3\), there is a unique solution
of (25) with \(J=J_0 dx^0 + \dots + J_3 dx^3\). Moreover, the map \((J_1, J_2, J_3) \mapsto (W,J_0)\) is smooth from \({\mathcal {H}}^3\) to \(H^{k+1}(M; T^*M \otimes {\mathfrak {g}} \oplus {\mathfrak {g}})\).
Source-to-Solution Map
We begin with a lemma, that will be used only once, and that highlights the difference between the pointed gauge group \(G^0({\mathbb {D}}, p)\) and the full gauge group \(G({\mathbb {D}})\).
Lemma 6
Suppose that \({{\tilde{A}}} \sim A\) near \(\partial ^ {\mathbb {D}}\) and consider the modified data set
Let \(V' \in \tilde{{\mathcal {D}}}_A\). Then there are \({\mathbf {U}}\in G^0({\mathbb {D}}, p)\) and \(V \in C^3({\mathbb {D}}; T^* {\mathbb {D}} \otimes {\mathfrak {g}})\) such that \(V' = V_\mho \), \(V = {\mathbf {U}}\cdot {\tilde{A}}\) near \(\partial ^- {\mathbb {D}}\), and \({\mathbf {U}}= {{\,\mathrm{id}\,}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\).
Proof
It follows immediately from the definitions of the sets \({\mathcal {D}}_A\) and \(\tilde{{\mathcal {D}}}_A\) that there are \({\mathbf {U}}\in G^0({\mathbb {D}}, p)\) and \(V \in C^3({\mathbb {D}}; T^* {\mathbb {D}} \otimes {\mathfrak {g}})\) such that \(V' = V_\mho \), \(V = {\mathbf {U}}\cdot {\tilde{A}}\) near \(\partial ^- {\mathbb {D}}\), and \(V = {{\tilde{A}}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\). Then \({\mathbf {U}}\) satisfies
in \(\mho \) near \(\partial ^- {\mathbb {D}}\). As (26) is equivalent with the differential equation \(d{\mathbf {U}}= [{\tilde{A}}, {\mathbf {U}}]\), and \({\mathbf {U}}(p) = {{\,\mathrm{id}\,}}\), it follows that \({\mathbf {U}}= {{\,\mathrm{id}\,}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\). \(\quad \square \)
If we used gauge equivalence with respect to \(G({\mathbb {D}})\) in the definition of \({\mathcal {D}}_A\), then (26) would still hold in a neighbourhood \({\mathcal {U}} \subset {{\overline{\mho }}}\) of \(\partial ^- {\mathbb {D}} \cap {{\overline{\mho }}}\); however, this simply says that \({\mathbf {U}}_{{\mathcal {U}}}\) is in the stabilizer subgroup \(\{{\mathbf {U}}\in C^\infty ({\mathcal {U}}; G) : {\mathbf {U}}\cdot {{\tilde{A}}} = {{\tilde{A}}}\}\) with respect to \({{\tilde{A}}}_{{\mathcal {U}}}\). In general, the stabilizer subgroup may be nontrivial.
Recall that the temporal gauge version \({\mathscr {T}}(V)\) of a connection V is defined by (14). Recall, furthermore, that the system (25) of Yang–Mills equations in relative Lorenz gauge with the compatibility condition is posed on \(M = (-2,2) \times {\mathbb {R}}^3\).
Proposition 4
Suppose that \(A \in \Omega ^{1}({\mathbb {D}};{\mathfrak {g}})\) satisfies (1) in \({\mathbb {D}}\). Then there is a connection \({\tilde{A}} \in \Omega ^{1}({\mathbb {D}}; {\mathfrak {g}})\) such that \({\tilde{A}} \sim A\) in \({\mathbb {D}}\), \({\tilde{A}}_\mho \) is in temporal gauge, and the following holds: for all \(x \in \mho \) there are a neighbourhood \(\mho _0 \subset \mho \) of x and a neighbourhood \({\mathcal {H}}\) of the zero function in \(H_0^7(\mho _0;{\mathfrak {g}})\) such that \({\mathcal {D}}_A\) determines \({\tilde{A}}_\mho \) and the source-to-solution map
where \(V = W + {\tilde{A}}\) and \((W,J_0)\) is the solution of (25) with \(J=J_0 dx^0 + \dots + J_3 dx^3\) and with A replaced by an arbitrary smooth, compactly supported extension of \({\tilde{A}}\) to M.
Proof
Let \({\tilde{A}}' \in {\mathcal {D}}_A\) be in the temporal gauge and satisfy \(d_{{\tilde{A}}'}^*F_{{\tilde{A}}'}=0\) in \(\mho \). Such \({\tilde{A}}'\) exists, for example, \({\tilde{A}}' = {\mathscr {T}}(A)_\mho \) is a possible choice. There is \({\tilde{A}}\) such that \({\tilde{A}}' = {\tilde{A}}_\mho \), \(d_{{\tilde{A}}}^*F_{{\tilde{A}}}=0\) in \({\mathbb {D}}\) and \({\tilde{A}} \sim A\) near \(\partial ^- {\mathbb {D}}\). Proposition 10 in Appendix B implies that \({\tilde{A}} \sim A\) in \({\mathbb {D}}\). Choose a smooth, compactly supported extension of \({{\tilde{A}}}\) in M, still denoted by \({{\tilde{A}}}\).
For \(x \in \mho \) we choose \(\epsilon > 0\) small enough so that \({\mathbb {D}}(x,\epsilon ) \subset \mho \) and let \(\mho _0\) be the interior of \({\mathbb {D}}(x,\epsilon )\). Let \(t_0\) be the time coordinate of x. Let \(J_j \in H^7_0(\mho _0; {\mathfrak {g}})\), \(j=1,2,3\), be small, and consider the solution \((W,J_0)\) of the system (25) with \(A = {\tilde{A}}\) in \((-1,t_0) \times {\mathbb {R}}^3\). This solution vanishes outside \(\mho _0\) and near \(\partial ^- {\mathbb {D}}(x,\epsilon )\), and it does not depend on \({\tilde{A}}\) away from \(\mho _0\). The vanishing of \((W,J_0)\) outside \(\mho _0\) and near \(\partial ^- {\mathbb {D}}(x,\epsilon )\) is shown similarly to the vanishing of W near \(\partial ^- {\mathbb {D}}\) in the proof of Lemma 5, and we omit this argument. To see that \((W,J_0)\) does not depend on \({\tilde{A}}\) away from \(\mho _0\), we consider two solutions to (25) with different backgrounds A in \((-1,t_0 + \epsilon ) \times {\mathbb {R}}^3\). Both backgrounds are assumed to coincide with \({{\tilde{A}}}\) in \(\mho _0\). As both solutions vanish near \(\partial ^- {\mathbb {D}}(x,\epsilon )\), Lemma 4 implies that they are identical in \({\mathbb {D}}(x,\epsilon )\).
Extending \((W,J_0)\) by zero we get a solution in the set \(\mho _- = \mho \cap \{t < t_0\}\). To summarize, the solution \((W,J_0)\) in \(\mho _-\) is determined by \({\tilde{A}}'\) and our choice of \(J_j\), \(j=1,2,3\). Defining a connection \({{\hat{V}}} = {{\hat{V}}}(J_1, J_2, J_3)\) on \(\mho _-\) by \({{\hat{V}}} = W + {\tilde{A}}\) we have \(d_{{{\hat{V}}}}^* F_{{{\hat{V}}}} = J\) in \(\mho _-\) where \(J = J_0 dx^0 + \dots +J_3 dx^3\). We write \(\mho _+ = \mho \cap \{t > t_0\}\), and consider the set
Here \({\mathscr {T}}\) is defined by (14) with \(|x| < \epsilon _0\), cf. (3). No confusion should arise from our use of \({\mathscr {T}}\) for temporal gauge both in \(\mho \) and in \({\mathbb {D}}\) since \({\mathscr {T}}(V_\mho ) = {\mathscr {T}}(V)_\mho \) for a connection V on \({\mathbb {D}}\).
As \({{\hat{V}}}\) is determined by \({\mathcal {D}}_A\) (and the choice of \({{\tilde{A}}}'\)), also \({\mathcal {L}}\) is determined by \({\mathcal {D}}_A\). Moreover, \({\mathscr {T}}(V)_\mho \in {\mathcal {L}}\) where \(V = W + {\tilde{A}}\) and \((W,J_0)\) is the solution of (25) in M with \(J_j\), \(j=1,2,3\), as above and \(A={\tilde{A}}\). The solution \((W,J_0)\) in M is an extension of the solution \((W,J_0)\) in \((0,t_0) \times {\mathbb {R}}^3\), which justifies our reuse of symbols. Observe that Proposition 3, together with the Sobolev embedding theorem, guarantees that \(W \in C^3({\mathbb {D}}; T^* {\mathbb {D}} \otimes {\mathfrak {g}})\), and that Remark 1 guarantees that \({{\,\mathrm{supp}\,}}(J_0) \subset \mho \).
To conclude the proof, it remains to show that \({\mathcal {L}}\) consists of a single element. Suppose that \(W', {{\tilde{W}}}' \in {\mathcal {L}}\). By Lemma 6 there are connections V, \({{\tilde{V}}}\) and gauges \({\mathbf {u}}\), \({{\tilde{{\mathbf {u}}}}}\) satisfying \(W' = {\mathscr {T}}(V)_\mho \), \({{\tilde{W}}}' = {\mathscr {T}}({{\tilde{V}}})_\mho \), \(d_{V}^*F_{V}=0 = d_{{{\tilde{V}}}}^*F_{{{\tilde{V}}}}\) in \({\mathbb {D}} \setminus \mho \), \(V = {\mathbf {u}}\cdot {\tilde{A}}\) and \({{\tilde{V}}} = {{\tilde{{\mathbf {u}}}}} \cdot {{\tilde{A}}}\) near \(\partial ^- {\mathbb {D}}\), and \({\mathbf {u}}= {{\,\mathrm{id}\,}}= {{\tilde{{\mathbf {u}}}}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\). We define
and set \(W = {\mathbf {U}}\cdot V\) and \({{\tilde{W}}} = {{\tilde{{\mathbf {U}}}}} \cdot {{\tilde{V}}}\). Then \(W_0 = 0 = {{\tilde{W}}}_0\) in \({\mathbb {D}}\). Moreover, it follows from the definition of \({\mathscr {T}}\) that \(W' = W_{\mho }\) and \({{\tilde{W}}}' = {{\tilde{W}}}_{\mho }\).
There holds \(V = {{\tilde{A}}} = {{\tilde{V}}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\). This implies \({\mathbf {U}}= {{\tilde{{\mathbf {U}}}}}\) and \(W = {{\tilde{W}}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\). Writing \({\mathbf {U}}_- = {\mathbf {U}}{\mathbf {u}}{{\tilde{{\mathbf {u}}}}}^{-1}{{\tilde{{\mathbf {U}}}}}^{-1}\), we have that \(W = {\mathbf {U}}_- \cdot {{\tilde{W}}}\) near \(\partial ^- {\mathbb {D}}\) and \({\mathbf {U}}_-= {{\,\mathrm{id}\,}}\) in \(\mho \) near \(\partial ^- {\mathbb {D}}\).
In fact, as \(V = {{\hat{V}}} = {{\tilde{V}}}\) in \(\mho _-\), we have \({\mathbf {U}}= {{\tilde{{\mathbf {U}}}}}\) and \(W = {{\tilde{W}}}\) in \(\mho _-\). Hence also \(d_{W}^* F_W = d_{{{\tilde{W}}}}^* F_{{{\tilde{W}}}}\) in \(\mho _-\). The spatial parts of \(d_{V}^* F_V\) and \(d_{{{\tilde{V}}}}^* F_{{{\tilde{V}}}}\) vanish in \(\mho _+\). As gauge transformations act componentwise on \(d_{W}^* F_W\), see (12), also the spatial parts of \(d_{W}^* F_W\) and \(d_{{{\tilde{W}}}}^* F_{{{\tilde{W}}}}\) vanish in \(\mho _+\). Writing \(J_0\) for the temporal part of \(d_{W}^* F_W\), the compatibility condition \(d_W^* d_{W}^* F_W = 0\), see Lemma 2, together with \(W_0 = 0\), implies that \(\partial _t J_0 = 0\) in \(\mho _+\). The same holds for \({{\tilde{J}}}_0\), the temporal part of \(d_{{{\tilde{W}}}}^* F_{{{\tilde{W}}}}\). But \(J_0 ={{\tilde{J}}}_0\) on \(\mho \cap \{t=t_0\}\), and hence \(J_0 ={{\tilde{J}}}_0\) in \(\mho _+\). To summarize, \(d_{W}^* F_W = d_{{{\tilde{W}}}}^* F_{{{\tilde{W}}}}\) in \(\mho \). Proposition 2 implies that \(W = {{\tilde{W}}}\) in \(\mho \). In other words \(W' = {{\tilde{W}}}'\) and this is the only element in \({\mathcal {L}}\). \(\quad \square \)
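The step from the compatibility condition to \(\partial _t J_0 = 0\) in the proof above can be made explicit. The following sketch assumes the Cartesian form of the covariant divergence with \(d_W = d + [W, \cdot ]\), and writes \(J_j\) for the spatial components of \(d_W^* F_W\); this notation is introduced here for illustration only.

$$\begin{aligned} d_W^* d_W^* F_W = 0 \quad \Longleftrightarrow \quad \partial _t J_0 + [W_0, J_0] = \sum _{j=1}^3 \big (\partial _{x^j} J_j + [W_j, J_j]\big ). \end{aligned}$$

In \(\mho _+\) the right-hand side vanishes because \(J_j = 0\) for \(j=1,2,3\), and the commutator on the left vanishes because \(W_0 = 0\), which leaves \(\partial _t J_0 = 0\).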
Linearization of the Yang–Mills Equations in Lorenz Gauge
Let us study multiple-fold linearizations of (23). Consider a three-parameter family
of solutions to (23), vanishing for \(t \le 0\), where \(\epsilon \) is in a neighbourhood of the origin in \({\mathbb {R}}^3\). Assume that the source term is linear in the sense that \(J = \sum _{k = 1}^3 \epsilon _{(k)} J_{(k)}\) for some \(J_{(k)} \in \Omega ^1({\mathbb {R}}^{1+3}; {\mathfrak {g}})\). Writing
and differentiating (23) in \(\epsilon \) gives the following system of linear wave equations
where the nonlinear terms read
and, writing \(S_3\) for the set of permutations on \(\{1,2,3\}\),
Now we continue the calculation in Cartesian coordinates in Minkowski space \({\mathbb {R}}^{1+3}\), and use the formulas
These formulas are derived in Appendix A. Using (29)–(31) and the Lorenz gauge condition \(d_A^* W = 0\), we rewrite the first three equations in (28), modulo lower order terms, as follows
where the components of the right-hand sides of the last two equations read
Preliminaries on Microlocal Analysis
Distributions associated to conormal bundles and two Lagrangians
The advantage of working in the relative Lorenz gauge is that the Yang–Mills equations reduce to a cubic nonlinear wave equation with the linear part given by the connection wave operator \(\Box _A\), modulo zeroth order terms. The parametrix for \(\Box _A\) is a distribution associated to an intersecting pair of Lagrangians (an IPL distribution for short), in the sense of [36], and we use the product calculus of conormal distributions to study the nonlinear part.
The proof of Proposition 1 in the next section relies solely on symbolic computations, and we recall here only that conormal and IPL distributions have principal symbols and that the corresponding symbol maps are isomorphisms, modulo lower order terms in a suitable sense. We will not recall the definitions of these classes of distributions, as they are somewhat technical; instead we refer the reader to [7] for a review of the theory that we use and that was originally developed in [11, 18, 36]. Even the precise definition of spaces of symbols is not important for our present purposes, since we will consider only symbols that are positively homogeneous in the fibre variable.
Recall that a pseudodifferential operator A on a manifold X with a homogeneous principal symbol a is said to be elliptic at \((x,\xi ) \in T^*X \setminus 0\) if \(a(x,\xi ) \ne 0\). The wavefront set \({{\,\mathrm{WF}\,}}(u) \subset T^*X \setminus 0\) of a distribution u on X is the complement of its regular set, whilst the regular set consists of such points \((x,\xi ) \in T^*X \setminus 0\) that there is a zeroth order pseudodifferential operator A that is elliptic at \((x,\xi )\) and that satisfies \(Au \in C^\infty (X)\). We denote by \({{\,\mathrm{singsupp}\,}}(u)\) the projection of \({{\,\mathrm{WF}\,}}(u)\) on X, and by \({{\,\mathrm{WF}\,}}(A)\) the essential support of A, that is, the projection of \({{\,\mathrm{WF}\,}}({\mathscr {A}}) \subset (T^*X \setminus 0)^2\) on the first factor \(T^*X\setminus 0\) where \({\mathscr {A}}\) is the Schwartz kernel of A. Moreover, we say that A is a microlocal cutoff near \((x,\xi ) \in T^*X \setminus 0\) if A is elliptic at \((x,\xi )\) and \({{\,\mathrm{WF}\,}}(A)\) is contained in a small neighbourhood of \(\{(x, \lambda \xi ) : \lambda > 0\}\).
Let E be a complex smooth vector bundle over X and \(\Omega ^{1/2}\) the half density bundle. A conormal distribution \(u \in I^m(N^*Y; E \otimes \Omega ^{1/2})\) of order \(m \in {\mathbb {R}}\) is a compactly supported distribution taking values in the tensor bundle \(E \otimes \Omega ^{1/2}\) with \(\text {WF}(u)\) contained in the conormal bundle \(N^*Y\) of a submanifold Y of X. In addition, u is required to have a certain local structure on Y, see (2.4.1) in [18], the precise form of which is not important for our purposes. What is important is that the principal symbol \(\sigma [u]\) of u is a smooth section of \(E \otimes \Omega ^{1/2}\), invariantly defined on \(N^*Y \setminus 0\), and that the principal symbol map \(u \mapsto \sigma [u]\) gives the short exact sequence,
see [18, Theorem 2.4.2] and [19, Theorem 18.2.11]. Here n is the dimension of X and \(S^{m}(N^*Y; E \otimes \Omega ^{1/2})\), with \(m \in {\mathbb {R}}\), is the space of symbols, see [19, Definition 18.2.10]. For our purposes it suffices to note that positively homogeneous sections of degree m are in this space, and that if \(\Omega ^{1/2}\) is trivialized by choosing a nowhere vanishing positively homogeneous section \(\mu \) of degree r, then \(\sigma [u]\) is positively homogeneous of degree \(m + r\) if
Since the half density is involved here, the given homogeneity looks a little different from the classical definition in [19, p.67].
More generally, a Lagrangian distribution \(u \in I^{m}(\Lambda ; E \otimes \Omega ^{1/2})\) is a compactly supported distribution with \(\text {WF}(u)\) contained in a conical Lagrangian submanifold \(\Lambda \) of \(T^*X \setminus 0\), and certain local structure, see (3.2.14) in [18]. Its principal symbol is invariantly defined on \(\Lambda \) as a smooth section of the bundle \(E \otimes \Omega ^{1/2} \otimes L\), where L is the Maslov bundle over \(\Lambda \). Analogously to (35) the principal symbol map gives an isomorphism
modulo lower order terms, see [18, Theorem 3.2.5]. We write also
The notion of Lagrangian distributions is insufficient to completely describe the fundamental solution of wave equations as two Lagrangian manifolds are needed in order to describe the propagating singularities and the singularities at the source. An IPL distribution \(u \in I^{m}(\Lambda _0, \Lambda _1; E \otimes \Omega ^{1/2})\) is a compactly supported distribution with \(\text {WF}(u)\) contained in \(\Lambda _0\cup \Lambda _1\), where \((\Lambda _0, \Lambda _1)\) is a cleanly intersecting pair of conical Lagrangian submanifolds of \(T^*X \setminus 0\), and with a certain local structure on \(\Lambda _0 \cup \Lambda _1\), see [36]. Here \(\Lambda _1\) is a manifold with boundary, while \(\Lambda _0\) is a manifold without boundary, and by cleanly intersecting, we mean
Again what we really need in the present paper is the symbol map for such distributions. In this case the symbol map is an isomorphism, modulo lower order terms, from \(I^{m}(\Lambda _0, \Lambda _1; E \otimes \Omega ^{1/2})\) to the space
We remark that \({\mathscr {R}}\) maps the \(E\otimes \Omega ^{1/2} \otimes L\)-valued symbols over \(\Lambda _0\) to the \(E\otimes \Omega ^{1/2} \otimes L\)-valued symbols over \(\Lambda _1\) and acts as a multiplication by a scalar on E.
If \((x,\xi ) \in \Lambda _j \setminus \partial \Lambda _1\) for \(j=0\) or \(j=1\), then there is a microlocal cutoff \(\chi \) near \((x,\xi )\) such that \(\chi u \in I(\Lambda _j; E)\) for all \(u \in I^{m}(\Lambda _0, \Lambda _1; E \otimes \Omega ^{1/2})\). The only place where we need the full picture of IPL distributions, instead of the above microlocal reduction to Lagrangian distributions, is equation (39) giving an initial condition on \(\partial \Lambda _1\) for a transport equation on \(\Lambda _1\). Moreover, apart from (39), we can also avoid the use of Lagrangian distributions in favour of conormal distributions, since all the Lagrangian manifolds \(\Lambda _0\) and \(\Lambda _1\) considered below will be conormal bundles away from \(\partial \Lambda _1\).
The principal symbol \(\sigma [\Box _A]\) and the subprincipal symbol \(\sigma _{\text {sub}}[\Box _A]\) read
We denote by \(\Phi _s\), \(s \in {\mathbb {R}}\), the flow of the Hamilton vector field \(H_{\sigma [\Box _A]}\) of \(\sigma [\Box _A]\), and define for a subset \({\mathscr {B}}\) of the characteristic set \(\Sigma \) of \(\Box _A\) the future flowout of \({\mathscr {B}}\) by
As \(\Box _A\) is of real principal type, one can use the theory by Hörmander and Duistermaat [11] to understand its parametrix. A completely symbolic parametrix construction, based on IPL distributions, was given by Melrose and Uhlmann [36], and the following adaptation of their construction to the vector-valued case can be found in [7]:
Proposition 5
Let \(\Lambda _0\) be a conormal bundle such that \(H_{\sigma [\Box _A]}\) is nowhere tangent to \(\Lambda _0\). Denote by \(\Lambda _1\) the future flowout of \(\Lambda _0 \cap \Sigma \). Consider the wave equation
where \(f \in I(\Lambda _0; E)\) and \(E = T^*{\mathbb {R}}^{1+3} \otimes {\mathfrak {g}}\). Then \(u \in \bigcup _{m \in {\mathbb {R}}} I^{m}(\Lambda _0, \Lambda _1; E \otimes \Omega ^{1/2})\) and the corresponding principal symbols satisfy
Here \({\mathscr {L}}_{H_{\sigma [\Box _A]}}\) denotes the Lie derivative with respect to \(H_{\sigma [\Box _A]}\).
We will compute symbols related to the nonlinear terms by using the following result, implicitly contained in [15] and explicitly formulated for example in [7].
Proposition 6
Let \(K_{(1)}\) and \(K_{(2)}\) be two transversal submanifolds of X, let
and let \(u_{(j)} \in I(N^* K_{(j)}; E)\), \(j=1,2\). If \(\chi \) is a microlocal cutoff near \((x,\xi )\) and \(\mu \) is a nowhere vanishing half density on X, then writing \(u_{(1)} u_{(2)} = \mu (\mu ^{-1}u_{(1)}) (\mu ^{-1} u_{(2)})\), there holds \(\chi (u_{(1)}u_{(2)}) \in I(N^*(K_{(1)} \cap K_{(2)}); E)\) and
where \(\xi = \xi _{(1)} + \xi _{(2)}\) with \(\xi _{(1)} \in N^*K_{(1)}\) and \(\xi _{(2)} \in N^*K_{(2)}\).
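The decomposition \(\xi = \xi _{(1)} + \xi _{(2)}\) used in Proposition 6 is unique. This is a standard consequence of transversality, recorded here as a reminder and not as part of the cited statement: for \(x \in K_{(1)} \cap K_{(2)}\),

$$\begin{aligned} T_x K_{(1)} + T_x K_{(2)} = T_x X \quad \Longrightarrow \quad N_x^* K_{(1)} \cap N_x^* K_{(2)} = \{0\}, \qquad N_x^*(K_{(1)} \cap K_{(2)}) = N_x^* K_{(1)} \oplus N_x^* K_{(2)}. \end{aligned}$$

The first identity holds because a covector annihilating both tangent spaces annihilates their sum, and the second because \(T_x(K_{(1)} \cap K_{(2)}) = T_x K_{(1)} \cap T_x K_{(2)}\) for a transversal intersection.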
Parallel transport for the principal symbol
As in [7], the transport equation (38) can be understood as a parallel transport equation as in Section 2,
Here \(\mu \) is a nowhere vanishing half density on \(\Lambda _1 \setminus \Lambda _0\), \(\varvec{\beta }(s) = (\gamma (s), {{\dot{\gamma }}}^*(s))\), with \({{\dot{\gamma }}}^* = {{\dot{\gamma }}}_\alpha dx^\alpha \), is the bicharacteristic curve emanating from \(\varvec{\beta }(0) \in \Lambda _0 \cap \Lambda _1\), and
Comparing with (7), we see that the 1-form components \({{\hat{u}}}_\alpha \) satisfy the parallel transport equation on \(M \times {\mathfrak {g}}\) corresponding to the adjoint representation of G. In particular, if \(x,y \in {\mathbb {L}}\) and the singular support of f does not intersect the line segment from x to y, then
where \(\xi \) is the covector corresponding to the direction of the line segment, and \(\varvec{\beta }\) in (41) satisfies \(\varvec{\beta }(0) = (x,\xi )\) and \(\varvec{\beta }(s) = (y,\xi )\).
We will also need the fact that positive homogeneity is preserved in (42) in the sense of the following proposition, where we have fixed a nowhere vanishing half density \(\mu \) of degree 1/2 on \(\Lambda _1 \setminus \Lambda _0\).
Proposition 7
Let \(u \in I(\Lambda _0, \Lambda _1; T^*{\mathbb {R}}^{1+3} \otimes {\mathfrak {g}} \otimes \Omega ^{1/2})\) be an IPL distribution solving (37), and suppose that its symbol \(\sigma [u]\) is positively homogeneous of degree \(q + 1/2\) on \(\Lambda _1 \setminus \Lambda _0\). Suppose that \(\Lambda _1 \setminus \Lambda _0 = N^*K \setminus 0\) for some \(K \subset {\mathbb {R}}^{1+3}\). Then for any \((y, \xi ) \in N^*K \setminus 0\) with \((y, \xi ) = \Phi _s(x, \xi )\) for some \(s \in {\mathbb {R}}\), we have
Recall that \(\Phi _s\) is the flow of the Hamilton vector field \(H_{\sigma [\Box _A]}\). For the proof, the reader is referred to our work [7, Proposition 1].
Proof of Proposition 1
We follow the construction in [7], however, the analysis in the present paper is more involved due to the nonlinearity in Yang–Mills equations being more complicated than the simple cubic nonlinearity considered in [7], and also due to the gauge invariance of the Yang–Mills equations. We will focus on the new features of the proof and refer to [7] for technical details that are unchanged.
In order to apply the microlocal machinery in Section 7 we need to consider the Yang–Mills equations on the tensor product bundle \(T^* {\mathbb {R}}^{1+3} \otimes {\mathfrak {g}}\otimes \Omega ^{1/2}\). This is achieved by choosing a nowhere vanishing half density \(\mu \) on \({\mathbb {R}}^{1+3}\) and by considering the conjugated operator \(\mu ^{-1} P(\mu W)\) instead of \(P(W) = \Box _A W + \star [W, \star F_A] + {\mathcal {N}}(W)\), cf. (23). In fact, we choose \(\mu \) so that \(\mu =1\) identically in the Cartesian coordinates, and to simplify the notation, we omit writing \(\mu \) in what follows. However, we warn the reader that additional determinant factors appear in other coordinates. These can be included in the factors \({{\tilde{\alpha }}}_{(k)}\) in (51), and \(\alpha _{(k)}\), \(\alpha _{(kl)}\) and \(\alpha \) in (53).
Recall that \({\mathbb {S}}^+(\mho )\) is defined by (8). Let \((x_{(1)},y,z) \in {\mathbb {S}}^+(\mho )\) and consider the line segments \(\gamma _{y \leftarrow x_{(1)}}\) and \(\gamma _{z \leftarrow y}\) from \(x_{(1)}\) to y and from y to z, respectively. We write
where \(\ell \in {\mathbb {R}}\) satisfies \(\gamma _{y \leftarrow x_{(1)}}(\ell ) = y\) and \(\cdot ^* : T_y {\mathbb {R}}^{1+3} \rightarrow T_y^* {\mathbb {R}}^{1+3}\) denotes the tangent-cotangent isomorphism given by the Minkowski metric. After rescaling \(\eta \) and \(\xi _{(1)}\), and after a rotation in \({\mathbb {R}}^3\), we may assume that
where \(a(r) = \sqrt{1-r^2}\) and \(r \in (-1,1)\). Then we let \(s > 0\) be small and set
The rationale behind this choice of \(\xi _{(k)}\), \(k=2,3\), is that now \(\eta \) can be written as the linear combination
where the scalars \(\kappa _{(k)}\) are given explicitly by
Writing \(\gamma (\cdot ; x, \xi )\) for the geodesic on \({\mathbb {R}}^{1+3}\) with the initial conditions \(\gamma (0; x, \xi ) = x\) and \({{\dot{\gamma }}}^*(0;x, \xi ) = \xi \), we define
Then \(x_{(2)}, x_{(3)} \in \mho \) for small enough \(s > 0\).
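The decomposition of \(\eta \) can be checked by hand in a concrete normalization. The following is a sketch under the assumption (ours, chosen for illustration and not quoted from (44)–(45)) that \(\xi _{(1)} = (1,-1,0,0)\), \(\eta = (1,-a(r),r,0)\), \(\xi _{(2)} = (1,-a(s),s,0)\) and \(\xi _{(3)} = (1,-a(s),-s,0)\) with \(a(\rho ) = \sqrt{1-\rho ^2}\). All four covectors are lightlike because \(a(\rho )^2 + \rho ^2 = 1\). Comparing the components of \(\eta = \kappa _{(1)}\xi _{(1)} + \kappa _{(2)}\xi _{(2)} + \kappa _{(3)}\xi _{(3)}\) gives a linear system whose solution is

$$\begin{aligned} \kappa _{(1)}&= \frac{a(r)-a(s)}{1-a(s)}, \\ \kappa _{(2)}&= \frac{1}{2}\left( \frac{1-a(r)}{1-a(s)} + \frac{r}{s}\right) , \\ \kappa _{(3)}&= \frac{1}{2}\left( \frac{1-a(r)}{1-a(s)} - \frac{r}{s}\right) . \end{aligned}$$

In this normalization \(\kappa _{(1)}< 0 < \kappa _{(2)}, \kappa _{(3)}\) whenever \(0< s < |r|\), since then \(a(s) > a(r)\) and \((1-a(r))/|r| > (1-a(s))/s\). The condition \(s < |r|\) can only be met when \(r \ne 0\).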
It turns out that in the coordinates satisfying (44)–(45) it is enough to use sources with all but the \(dx^2\) component vanishing. Let \(b_{(k)} \in {\mathfrak {g}}\) and set
where \(\delta _{x_{(k)}}\) is the Dirac delta distribution at \(x_{(k)}\) and \(\chi _{(k)}\) is a microlocal cutoff near \((x_{(k)}, \pm \xi _{(k)})\). Here the sign is chosen to be that of \(\kappa _{(k)}\), that is, − for \(k=1\) and \(+\) for \(k=2,3\). Moreover, \(\chi _{(k)}\) is chosen so that

(\(\chi \)1) the principal symbol \(\sigma [\chi _{(k)}]\) is positively homogeneous of degree q;

(\(\chi \)2) \({{\,\mathrm{supp}\,}}(J_{(k),2}) \subset \mho _{(k)}\) where \(\mho _{(k)} \subset \mho \) is a neighbourhood of \(x_{(k)}\), and for all \(k \ne l\) it holds that \(x_{(l)} \notin {\mathcal {J}}^+(\mho _{(k)})\) where
$$\begin{aligned} {\mathcal {J}}^+(\mho _{(k)}) = \{y \in {\mathbb {R}}^{1+3} : x < y\text { or }x = y\text { for some }x \in \mho _{(k)} \}; \end{aligned}$$ 
(\(\chi \)3) \({{\hat{\mho }}}_{(k)} \cap \Gamma _{(l)} = \emptyset \) for all \(k \ne l\) where
$$\begin{aligned} {{\hat{\mho }}}_{(k)}&= \{(t,x') \in {\mathbb {R}}^{1+3}: ({{\tilde{t}}}, x') \in \mho _{(k)} \text { for some }{{\tilde{t}}} \in {\mathbb {R}}\}, \\ \Gamma _{(k)}&= \{\gamma ({{\tilde{t}}}; x_{(k)}, \xi ) : {{\tilde{t}}} \in {\mathbb {R}},\ (x_{(k)}, \xi ) \in {{\,\mathrm{WF}\,}}(\chi _{(k)})\}. \end{aligned}$$
The degree \(q \in {\mathbb {R}}\) is chosen negative enough so that \(J_{(k),2} \in H_0^7(\mho ; {\mathfrak {g}})\). The geometric setting is shown in Figure 2.
Proposition 8
Let \(x_{(1)},y,z\) and \(\eta \), as well as \(b_{(k)}\) and \(J_{(k),2}(s)\), with \(k=1,2,3\) and small \(s>0\), be as above, and define for \(\epsilon _{(k)} \in {\mathbb {R}}\), \(k=1,2,3\),
Let \({{\tilde{A}}}\) and L be as in Proposition 4. Suppose that \(r \ne 0\) in (44) and that \(b_{(2)} = b_{(3)}\). Then for any \(s_0>0\), the following point values of symbols
determine \({\mathbf {S}}^{{{\tilde{A}}},{{\,\mathrm{Ad}\,}}}_{z \leftarrow y \leftarrow x_{(1)}}[b_{(2)}, [b_{(1)}, b_{(2)}]]\).
As \((x_{(1)},y,z) \in {\mathbb {S}}^+(\mho )\) and \(b_{(1)}, b_{(2)} \in {\mathfrak {g}}\) can be chosen arbitrarily apart from the constraint \(r \ne 0\), Proposition 1 follows from Propositions 4 and 8 together with Proposition 9 in Section 9 below. Here the case \(r=0\) follows by continuity.
For the convenience of readers who do not wish to enter into the theory of Lie algebras, we have included an elementary alternative to Proposition 9 in the case \({\mathfrak {g}}= {{\,\mathrm{{\mathfrak {su}}}\,}}(n)\), with \(n \ge 2\), see Lemma 16 in Appendix C. This special case is interesting in view of the \({{\,\mathrm{SU}\,}}(3) \times {{\,\mathrm{SU}\,}}(2) \times \mathrm {U}(1)\) gauge group of the standard model.
We will proceed to give a proof of Proposition 8 in Sections 8.1–8.3.
Microlocal reduction from (25) to (23)
Let \(J_{(k),2}\), \(k=1,2,3\), be as in (47), and write \(J_2 = J_2(\epsilon ,s)\) for the function defined by (48). To simplify the notation, we write \(J_j = J_{(k),j} = 0\) for \(k=1,2,3\) and \(j=1,3\), and, for the remainder of this section, somewhat abusively \(A = {{\tilde{A}}}\) where \({\tilde{A}}\) is as in Proposition 4. Then we denote by
the solution of (25) with \(J_j\), \(j=1,2,3\), as above and \(\epsilon \) near the origin of \({\mathbb {R}}^3\). The derivatives of W with respect to \(\epsilon \) are denoted by \(Y_{(k)}\), \(Y_{(kl)}\) and \(Y_{(123)}\) as in (27), and we write also
For notational convenience, we translate the origin in (25) so that the initial conditions are given at \(t=0\) rather than at \(t=1\).
Recall that the second equation in (25) is equivalent with (24). Differentiating (24) with respect to \(\epsilon _{(k)}\) for \(k=1, 2, 3\) gives
Writing
the operator \(\partial _t\) is elliptic away from its characteristic set \(\{\tau = 0\} \subset T^* {\mathbb {R}}^{1+3}\). The wave front set of the right-hand side of (50) is contained in a small neighbourhood of \(\{(x_{(k)}, \lambda \xi _{(k)}) : \lambda \ne 0\}\), and therefore it is disjoint from \(\{\tau = 0\}\). It follows that \(\rho _{(k)} \in I(N^*\{x_{(k)}\}; {\mathfrak {g}})\) since the right-hand side of (50) is in this class. Recalling the form of \(\xi _{(k)}\), \(k=1,2,3\), see (44) and (45), symbol evaluation gives
Hence \(Y_{(k)}\) solves (32) with \(J_{(k)}\) satisfying
where the sign is that of \(\kappa _{(k)}\), \({{\tilde{\alpha }}}_{(k)} = \sigma [\chi _{(k)}](x_{(k)}, \pm \xi _{(k)}) \ne 0\), \(b_{(k)}\) is as in (47), and
It follows that away from \(x_{(k)}\),
where \(N^*K_{(k)}\) is the bicharacteristic flowout emanating from \((x_{(k)}, \xi _{(k)})\). In other words, writing \(x_{(k)} = (t_{(k)}, x_{(k)}')\),
Moreover, \({{\,\mathrm{singsupp}\,}}(Y_{(k)}) \subset \Gamma _{(k)}\).
The second derivative of (24) in \(\epsilon \) for distinct \(k, l =1, 2, 3\) reads
As \({{\,\mathrm{supp}\,}}(J_{(k),j}) \subset \mho _{(k)}\) by (\(\chi \)2), it follows from (50) and \(J_0 = 0\) for \(t \le 0\) that \({{\,\mathrm{supp}\,}}(\rho _{(k)}) \subset {{\hat{\mho }}}_{(k)}\). We see that \(Y_{(k)}\) is smooth in the support of \(\rho _{(l)}\) for distinct k and l, since \({{\hat{\mho }}}_{(k)} \cap \Gamma _{(l)} = \emptyset \) by (\(\chi \)3). Moreover, \(Y_{(k)}\) solves (32) with vanishing initial conditions and with the source satisfying \({{\,\mathrm{supp}\,}}(J_{(k)}) \subset {{\hat{\mho }}}_{(k)} \subset {\mathcal {J}}^+(\mho _{(k)})\), whence \({{\,\mathrm{supp}\,}}(Y_{(k)}) \subset {\mathcal {J}}^+(\mho _{(k)})\) due to finite speed of propagation (as discussed in the proof of Lemma 5, finite speed of propagation follows from Lemma 14 in Appendix B). As \({{\,\mathrm{singsupp}\,}}(\rho _{(l)}) = \{x_{(l)}\}\), it follows from (\(\chi \)2) that \(\rho _{(l)}\) is smooth in the support of \(Y_{(k)}\) for distinct k and l. Analogously, \(Y_{(k)}\) is smooth in \({{\,\mathrm{supp}\,}}(J_{(l)})\) and \(J_{(l)}\) is smooth in \({{\,\mathrm{supp}\,}}(Y_{(k)})\) for \(k \ne l\). Therefore the right-hand side of (52) is smooth, and so is \(\rho _{(kl)}\). This in turn implies that \(Y_{(kl)}\) satisfies (33) modulo smooth terms.
The third derivative of (24) in \(\epsilon \) can be written as
It follows from [20, Th. 8.2.10] that, for distinct k and l, any \((x,\xi ) \in {{\,\mathrm{WF}\,}}(Y_{(k)}Y_{(l)})\) with lightlike \(\xi \) satisfies \((x,\xi ) \in {{\,\mathrm{WF}\,}}(Y_{(j)})\) for \(j=k\) or \(j=l\). Then (33) implies that
Similarly to the above, we see also that \({{\,\mathrm{supp}\,}}(Y_{(kl)}) \subset {\mathcal {J}}^+(\mho _{(k)}) \cup {\mathcal {J}}^+(\mho _{(l)})\) and \({{\,\mathrm{supp}\,}}(\rho _{(kl)}) \subset {{\hat{\mho }}}_{(k)} \cup {{\hat{\mho }}}_{(l)}\) for \(k \ne l\). As above, this implies that \(\rho _{(123)}\) is smooth, and that \(Y_{(123)}\) satisfies (34) modulo smooth terms.
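The restriction to lightlike covectors in the wave front set of a product can be illustrated with a small computation. By the standard wave front bound for products, a covector in the wave front set of \(Y_{(k)}Y_{(l)}\) that does not already come from one of the factors is a sum of a covector from each factor. The following minimal sketch checks that such a sum of two linearly independent lightlike covectors is never lightlike. The covectors `xi_k` and `xi_l` are hypothetical sample values, not taken from the text, and the signature \((-,+,+,+)\) is assumed.

```python
def minkowski(xi, eta):
    """Minkowski pairing of two covectors, signature (-,+,+,+) assumed."""
    return -xi[0] * eta[0] + sum(a * b for a, b in zip(xi[1:], eta[1:]))

def is_lightlike(xi):
    """A covector is lightlike if it is nonzero and has zero Minkowski norm."""
    return any(c != 0 for c in xi) and minkowski(xi, xi) == 0

# Hypothetical sample covectors: both lightlike, linearly independent.
xi_k = (1, 1, 0, 0)
xi_l = (1, 0, 1, 0)

def combo(lam, mu):
    """The linear combination lam * xi_k + mu * xi_l."""
    return tuple(lam * a + mu * b for a, b in zip(xi_k, xi_l))

# For lightlike xi_k, xi_l the norm of the combination reduces to
# 2 * lam * mu * <xi_k, xi_l>, which is nonzero whenever lam * mu != 0.
for lam in (-3, -1, 1, 2):
    for mu in (-2, 1, 5):
        s = combo(lam, mu)
        assert minkowski(s, s) == 2 * lam * mu * minkowski(xi_k, xi_l)
        assert not is_lightlike(s)
```

Integer arithmetic keeps every comparison exact, so the check does not depend on a numerical tolerance.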
Principal symbols of interacting waves
The linearized equation (33) has source \({\tilde{N}}(2)\) that consists of products of solutions \(Y_{(k)}\), \(k=1,2,3\), to the linear wave equation (32). These products can be viewed as the interactions of waves \(Y_{(k)}\) and \(Y_{(l)}\). Then the solution \(Y_{(kl)}\) to (33) describes the linear waves emanating from the source of such interacting waves \(Y_{(k)}\) and \(Y_{(l)}\). Analogously the solution \(Y_{(123)}\) to (34) describes waves emanating from interaction of \(Y_{(1)}\), \(Y_{(2)}\) and \(Y_{(3)}\).
As \(\xi _{(k)}\), \(k=1,2,3\), are linearly independent, the submanifolds \(K_{(k)}\), \(k=1,2,3\), intersect transversally at y, and we may compute the principal symbols \(\sigma [Y_{(123)}](y,\eta )\) using the product formula (40). This requires using the direct sum decomposition
where \(\eta _{(k)} = \kappa _{(k)} \xi _{(k)}\) and the scalars \(\kappa _{(k)}\) are given by (46). We will omit below the details related to the choices of the microlocal cutoff when applying (40). The same choices as in [7] can be used, see (54) there and its proof.
By (43) the incoming principal symbols satisfy
where the scalar factors \(\alpha _{(k)}\) converge in \({\mathbb {C}}\setminus 0\) as \(s \rightarrow 0\). The factors \(\alpha _{(k)}\) are independent of A, and their precise form is not important for our purposes. We refer to [7] for more detail on how to compute these factors. Let us point out, however, that typically \(\alpha _{(k)} \ne {{\tilde{\alpha }}}_{(k)}\), with \({{\tilde{\alpha }}}_{(k)}\) as in (51), due to a contribution from \({\mathscr {R}}\) and \(\sigma [\Box _A]^{-1}\) in (39).
We use the shorthand notations
where \(\eta _{(kl)} = \eta _{(k)} + \eta _{(l)}\), \(\alpha _{(kl)} = \alpha _{(k)}\alpha _{(l)}\), and \(\alpha = \iota \alpha _{(1)}\alpha _{(2)}\alpha _{(3)}\). The constant \(\iota \in {\mathbb {C}} \setminus 0\) comes from (39) and is independent from A. Then
where \(p(y, \xi ) = -\xi _0^2 + \xi _1^2 + \xi _2^2 + \xi _3^2\). Writing
we have
and
Moreover,
For our purposes, it is enough to compute the leading order terms with respect to s, in the limit \(s \rightarrow 0\), of the first two 1-form components of \({\hat{Y}}_{(1 2 3)}\). The cubic terms
are of order s. Indeed, if \(\beta =1\) then the last factor vanishes, and if \(\beta =0\) then the last factor is of order s. Hence for \(\beta =0,1\),
It is in principle straightforward to express \({\hat{Y}}_{(1 2 3),\beta }\) in terms of \({{\tilde{b}}}_{(j)}\), analogously to (54). We do not reproduce the details of this long computation here; however, we have verified the expression (55) below using a computer algebra system, and our code is available online [8]. There holds
The terms of order \(s^{-1}\) cancel out due to the Jacobi identity. Hence
Taking \(b_{(3)} = b_{(2)}\) yields
where we used the following simple consequence of the Jacobi identity
Indeed, let \(W_j\), \(j=1,2\), be the solutions of (7) with \(V=V_j\). Then the Jacobi identity implies
Thus \([W_1, W_2]\) solves (7) with \(V = [V_1, V_2]\) and (56) follows.
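The step above uses the Jacobi identity in the form that \(X \mapsto [A, X]\) is a derivation of the bracket, that is, \([A,[W_1,W_2]] = [[A,W_1],W_2] + [W_1,[A,W_2]]\). The following minimal sketch checks this identity for sample elements of \({\mathfrak {su}}(2)\). The particular matrices `A`, `W1`, `W2` and the helper names are hypothetical choices made for illustration only; equation (7) itself is not reproduced here.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

# A basis of su(2): i times the Pauli matrices. All entries are Gaussian
# integers, so the complex arithmetic below is exact.
E1 = [[0, 1j], [1j, 0]]
E2 = [[0, 1], [-1, 0]]
E3 = [[1j, 0], [0, -1j]]

def combo(c1, c2, c3):
    """The element c1*E1 + c2*E2 + c3*E3 of su(2)."""
    return [[c1 * E1[i][j] + c2 * E2[i][j] + c3 * E3[i][j] for j in range(2)]
            for i in range(2)]

# Hypothetical sample elements.
A = combo(1, -2, 3)
W1 = combo(2, 0, 1)
W2 = combo(-1, 4, 2)

# Jacobi identity in derivation form.
lhs = comm(A, comm(W1, W2))
rhs = add(comm(comm(A, W1), W2), comm(W1, comm(A, W2)))
derivation_holds = lhs == rhs
```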
We apply (43) to obtain
where \(c = c(s) =  (1+a(r))(6r\alpha )^{1} \kappa _{(1)}\kappa _{(2)}\kappa _{(2)}^{1q}\) and \(\alpha _{(0)} \in {\mathbb {C}}\setminus 0\) is independent from A.
Principal symbol in temporal gauge
To finish the proof of Proposition 8, we show that for \(\beta =1,2,3\),
Indeed, Proposition 8 follows from (57) and (58) with \(\beta = 1\).
Recall that \(L(0, J_{2}(\epsilon , s), 0)\) is defined by \({\mathscr {T}}(V)|_\mho \) where \(V = W + A\) and W is as in (49). To simplify the notation, we write
As A is smooth, \(\sigma [V_{(123)}](z,\eta ) = \sigma [Y_{(1 2 3)}](z,\eta )\). It remains to study how the principal symbol \(\sigma [V_{(123)}]\) transforms under passing to the temporal gauge with \({\mathscr {T}}\).
Let \({\mathbf {U}}= {\mathbf {U}}(\epsilon )\) be as in (14) with \(V = V(\epsilon )\), and write
Recall that we are using the notation \(A = {{\tilde{A}}}\) where \({{\tilde{A}}}\) is as in Proposition 4. In particular, \(A|_\mho \) is in temporal gauge. This, together with \(V|_{\epsilon =0} = A\), implies that \({\mathbf {U}}|_{\epsilon = 0} = {{\,\mathrm{id}\,}}\) in \(\mho \).
We will consider V and \({\mathbf {U}}\) near the point \(z \in \mho \). Recall that \(Y_{(k)}\) is singular only in \(\Gamma _{(k)}\) and that \(Y_{(kl)}\) is singular only in \(\Gamma _{(k)} \cup \Gamma _{(l)}\). Therefore \(V_{(k)}\) and \(V_{(kl)}\) are smooth near z. Moreover, as \({{\,\mathrm{WF}\,}}(V_{(k)})\) and \({{\,\mathrm{WF}\,}}(V_{(kl)})\) are disjoint from the characteristic set \(\{\tau = 0\}\) of \(\partial _t\), the ordinary differential equation in (14) implies that also \(U_{(k)}\) and \(U_{(kl)}\) are smooth near z.
Writing
and differentiating (14) in \(\epsilon _1\), \(\epsilon _2\) and \(\epsilon _3\) at \(\epsilon = 0\) yields that
where \(U_{(123)}\) solves
In addition, \({\mathbf {U}}^{-1} {\mathbf {U}}= {{\,\mathrm{id}\,}}\) implies
Therefore, modulo smooth terms, near z there holds
Near z it holds that \(V_{(123)}\) is a conormal distribution associated to the future flowout of \(N^* (K_{(1)} \cap K_{(2)} \cap K_{(3)}) \cap \Sigma \), cf. (36). We refer to Appendix C of [7] for a precise description of this flowout. As the flowout is contained in the characteristic set \(\Sigma \) of \(\Box _A\), it is disjoint from the characteristic set \(\{\tau = 0\}\) of \(\partial _t\). The second equation in (59) implies that \(U_{(123)}\) is a conormal distribution associated to the same flowout near z.
We write \({{\hat{X}}} = \sigma [X](z,\eta )\) where \(X = T, V_{(123)}, U_{(123)}\). Then taking principal symbols in (59) gives for \(\beta = 0,1,2,3\),
Solving for \({\hat{U}}_{(123)}\) in the second equation and substituting in the first one yields (58). This finishes the proof of Proposition 8, and hence also Proposition 1 is proven.
Lie Algebras with Trivial Centre
The material that follows is quite classical and can be found in many textbooks on Lie algebras. We start by fixing notation and recalling basic results, following mainly the exposition in [16, Chapter 7].
Let \({\mathfrak {g}}\) be the Lie algebra of a compact connected Lie group of matrices G and let \({{\mathfrak {g}}}_{{\mathbb {C}}}\) be its complexification. An element \(Z\in {{\mathfrak {g}}}_{{\mathbb {C}}}\) can be uniquely written as \(Z=X+iY\) for \(X,Y\in {\mathfrak {g}}\), and we define \(Z^*=-X+iY\). Note that \(Z^*\) is the usual conjugate transpose of Z in the case \({\mathfrak {g}}= {\mathfrak {u}}(n)\). There is an inner product on \({{\mathfrak {g}}}_{{\mathbb {C}}}\) that is real-valued on \({\mathfrak {g}}\) and that satisfies, see [16, Proposition 7.4],
If \({\mathfrak {t}}\) is a maximal commutative subalgebra of \({\mathfrak {g}}\), then
is a Cartan subalgebra of \({{\mathfrak {g}}}_{{\mathbb {C}}}\) and its dimension is called the rank of \({{\mathfrak {g}}}_{{\mathbb {C}}}\). The roots of \({{\mathfrak {g}}}_{{\mathbb {C}}}\) relative to \({\mathfrak {h}}\) are those elements \(\alpha \in {\mathfrak {h}}\) such that there is \(0\ne X\in {{\mathfrak {g}}}_{{\mathbb {C}}}\) so that
where we use the convention that the inner product is linear in the second variable (and antilinear in the first one). We let \(\Delta \) be the collection of roots. By [16, Proposition 7.15] each root \(\alpha \) belongs to \(i{\mathfrak {t}}\), and we can decompose \({{\mathfrak {g}}}_{{\mathbb {C}}}\) as a direct sum
where \({\mathfrak {g}}_{\alpha }\) contains the eigenvectors associated to \(\alpha \), that is, the vectors X satisfying (60). Moreover, see [16, Proposition 7.18, Theorems 7.19 and 7.23],

(1) each \({\mathfrak {g}}_{\alpha }\) is 1-dimensional;
(2) if \(X\in {\mathfrak {g}}_{\alpha }\) with \(\alpha \in \Delta \), then \(X^*\in {\mathfrak {g}}_{-\alpha }\);
(3) if \({{\mathfrak {g}}}_{{\mathbb {C}}}\) has trivial centre, the roots span \({\mathfrak {h}}\).
We can in fact pick linearly independent elements \(X_{\alpha }\in {\mathfrak {g}}_{\alpha }\), \(Y_{\alpha }=X^*_{\alpha }\in {\mathfrak {g}}_{-\alpha }\) and \(H_{\alpha }\in {\mathfrak {h}}\) such that \(H_{\alpha }\) is a multiple of \(\alpha \) and such that \([X_{\alpha },Y_{\alpha }]=H_{\alpha }\), \([H_{\alpha },X_{\alpha }]=2X_{\alpha }\) and \([H_{\alpha },Y_{\alpha }]=-2Y_{\alpha }\). This generates an \(\mathfrak {sl}(2,{\mathbb {C}})\)-subalgebra inside \({{\mathfrak {g}}}_{{\mathbb {C}}}\) and implies that the elements
belong to \({\mathfrak {g}}\) and span a Lie subalgebra isomorphic to \({\mathfrak {su}}(2)\), see [16, Corollary 7.20]. Note that the set \(\{E_{\alpha }^{1}, E^{2}_{\alpha }, E^{3}_{\alpha }\}_{\alpha \in \Delta }\) spans \({\mathfrak {g}}\) over the reals if \({\mathfrak {g}}\) has trivial centre. The commutation relations of Pauli matrices imply that \({\mathfrak {su}}(2)\) is spanned by the nested commutators [X, [X, Y]] with \(X, Y \in {{\,\mathrm{{\mathfrak {su}}}\,}}(2)\). Hence the discussion above immediately implies:
Proposition 9
Let \({\mathfrak {g}}\) be the Lie algebra of a compact connected Lie group of matrices. Assume that \({\mathfrak {g}}\) has trivial centre. Then \({\mathfrak {g}}\) is the linear span of [X, [X, Y]] for \(X,Y\in {\mathfrak {g}}\).
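The key ingredient of Proposition 9, that \({\mathfrak {su}}(2)\) is spanned by nested commutators [X, [X, Y]], can be checked by a direct computation. The following minimal sketch uses the basis given by i times the Pauli matrices and verifies that three nested commutators reproduce the basis up to the factor \(-4\). The helper names are hypothetical and chosen for illustration; this is a sanity check for \({\mathfrak {su}}(2)\) only, not a proof for general \({\mathfrak {g}}\).

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

# Basis of su(2): i times the Pauli matrices (exact Gaussian-integer entries).
E1 = [[0, 1j], [1j, 0]]
E2 = [[0, 1], [-1, 0]]
E3 = [[1j, 0], [0, -1j]]

# Nested commutators [X, [X, Y]] for basis elements X, Y.
n1 = comm(E3, comm(E3, E1))  # expected to equal -4 * E1
n2 = comm(E1, comm(E1, E2))  # expected to equal -4 * E2
n3 = comm(E2, comm(E2, E3))  # expected to equal -4 * E3

# Each basis element is a nonzero multiple of a nested commutator,
# hence the nested commutators span su(2).
spans_su2 = (n1 == scale(-4, E1) and n2 == scale(-4, E2)
             and n3 == scale(-4, E3))
```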
The Case of a General Lie Group
Suppose now that G is any compact connected Lie group. In what follows it is convenient to express some previous notions in a slightly more abstract form. Let \(\omega \in \Omega ^{1}(G,{\mathfrak {g}})\) be the (left) Maurer–Cartan 1-form of G. Given \({\mathbf {U}}\in G^{0}({\mathbb {D}},p)\) we express the gauge equivalence between \(A,B\in \Omega ^{1}(M,{\mathfrak {g}})\) as
where \(\text {Ad}:G\rightarrow GL({\mathfrak {g}})\) is the usual adjoint representation. For matrix Lie groups \(\omega =g^{-1}dg\) and \(\text {Ad}_{g}(a)=gag^{-1}\) for \(a\in {\mathfrak {g}}\), and we recover the expression (2) for the gauge equivalence between A and B that we have used so far.
Suppose now that \(p:{\widetilde{G}}\rightarrow G\) is a covering of G, then p is a Lie group homomorphism and \(p^*\omega _{G}=\omega _{{\widetilde{G}}}\). Given \({\mathbf {U}}\in G^{0}({\mathbb {D}},p)\), there is a unique \({\widetilde{{\mathbf {U}}}}\in {\widetilde{G}}^{0}({\mathbb {D}},p)\) such that \(p\circ {\widetilde{{\mathbf {U}}}}={\mathbf {U}}\). This is because the domain of \({\mathbf {U}}\) is simply connected and we are fixing the value of \({\mathbf {U}}\) at p to be the identity. We deduce that (61) holds if and only if the following equation holds
In other words, A and B are gauge equivalent via a gauge in \(G^{0}({\mathbb {D}},p)\) if and only if they are gauge equivalent via a gauge in \({\widetilde{G}}^{0}({\mathbb {D}},p)\). The same observation applies to gauges defined near \(\partial ^{-}{\mathbb {D}}\). One very useful consequence is that the data set \({\mathcal {D}}_{A}\) does not really depend on the group G as long as it has Lie algebra \({\mathfrak {g}}\).
We are going to use this setup as follows. Every compact connected Lie group G admits a finite cover of the form \({\mathbb {T}}^r\times G_{1}\), where \({\mathbb {T}}^r\) is an r-torus and \(G_{1}\) is a compact Lie group with finite centre [4, Theorem 8.1, p. 233]. At the level of the Lie algebra this corresponds to an orthogonal splitting \({\mathfrak {g}}={\mathfrak {z}}\oplus {\mathfrak {g}}_{1}\), where \({\mathfrak {g}}_{1}\) is the Lie algebra of \(G_{1}\) and has trivial centre. Given \(A\in \Omega ^{1}(M,{\mathfrak {g}})\) we split uniquely
Now we claim:
Lemma 7
Let \(A,B\in \Omega ^{1}(M,{\mathfrak {g}})\). Then \({\mathcal {D}}_{A}={\mathcal {D}}_{B}\) iff \({\mathcal {D}}_{A_{Z}}={\mathcal {D}}_{B_{Z}}\) and \({\mathcal {D}}_{A_{1}}={\mathcal {D}}_{B_{1}}\).
Proof
Using that elements in the centre \({\mathfrak {z}}\) commute with everything, a quick calculation shows that given \(V\in C^{3}({\mathbb {D}};T^*{\mathbb {D}}\otimes {\mathfrak {g}})\) with \(V=V_{Z}+V_{1}\) we can write the curvature of V as
since \(d_{V}=d_{V_{1}}\). Hence
Again using commutativity, \(d_{V_{1}}^*dV_{Z}=d^*dV_{Z}\) since \(dV_{Z}\) is also in the centre. Hence
This implies that \(d_{V}^*F_{V}=0\) in \({\mathbb {D}}\setminus \mho \) iff \(d^*dV_{Z}=d^{*}_{V_{1}}F_{V_{1}}=0\) in \({\mathbb {D}}\setminus \mho \) and the lemma follows. \(\quad \square \)
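The proof above rests on the fact that the central components drop out of every bracket, so that \([V_Z + V_1, W_Z + W_1] = [V_1, W_1]\). The following minimal sketch illustrates this for \({\mathfrak {u}}(2) = {\mathfrak {z}}\oplus {\mathfrak {su}}(2)\), where the centre consists of imaginary multiples of the identity. The choice of \({\mathfrak {u}}(2)\) and the sample elements are assumptions made for illustration and do not come from the text.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def central(t):
    """Element i*t*Id of the centre of u(2)."""
    return [[1j * t, 0], [0, 1j * t]]

def su2(c1, c2, c3):
    """Element of su(2) in the basis i*sigma_1, i*sigma_2, i*sigma_3."""
    return [[1j * c3, 1j * c1 + c2], [1j * c1 - c2, -1j * c3]]

# Hypothetical sample elements, split into central and su(2) parts.
V_Z, V_1 = central(3), su2(1, 2, -1)
W_Z, W_1 = central(-5), su2(0, 4, 2)

# The central parts commute with everything, so only the su(2) parts
# contribute to the bracket.
full_bracket = comm(add(V_Z, V_1), add(W_Z, W_1))
reduced_bracket = comm(V_1, W_1)
centre_drops_out = full_bracket == reduced_bracket
```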
We can deal with the abelian component \(A_{Z}\) directly by unique continuation.
Lemma 8
If \({\mathcal {D}}_{A_{Z}}={\mathcal {D}}_{B_{Z}}\), then there is \(u\in C^{\infty }({\mathbb {D}};{\mathbb {T}}^{r})\) with \(u(p)=\text { id}\) such that
Proof
It suffices to prove the claim for \(r=1\), i.e. in the case of the circle \(S^{1}\). To avoid cluttering the notation we drop the subscript “Z” during the proof. If the group is abelian, the Yang–Mills equations reduce to the Maxwell equation \(d^*F_{A}=0\), where \(F_{A}=dA\). Since \(dF_{A}=0\), the curvature satisfies \(\Box F_{A}=0\), where \(\Box =d^*d+dd^*\). The gauges \(u\in C^{\infty }({\mathbb {D}};S^{1})\) all have the form \(u=e^{i\phi }\) for \(\phi \) a real-valued function since \({\mathbb {D}}\) is simply connected.
Since \(A\in {\mathcal {D}}_{A}={\mathcal {D}}_{B}\), there is V with \(d^*F_{V}=0\) in \({\mathbb {D}}\setminus \mho \), \(V\sim B\) near \(\partial ^- {\mathbb {D}}\) and \(A|_{\mho }=V|_{\mho }\). Thus \(d^*F_{V}=0\) in \({\mathbb {D}}\). It follows that \(\Box (F_{A}-F_{V})=0\) in \({\mathbb {D}}\) and \(F_{A}=F_{V}\) in \(\mho \), and by Holmgren’s unique continuation principle, \(F_{A}=F_{V}\) in \({\mathbb {D}}\), i.e. \(d(A-V)=0\). Since \({\mathbb {D}}\) is simply connected, A and V are gauge equivalent in \({\mathbb {D}}\). But since \(V\sim B\) near \(\partial ^- {\mathbb {D}}\), it follows that A and B are gauge equivalent near \(\partial ^- {\mathbb {D}}\). Proposition 10 implies now that A and B are gauge equivalent in the whole \({\mathbb {D}}\). \(\quad \square \)
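In the abelian case a gauge \(u = e^{i\phi }\) changes the connection by the exact form \(d\phi \) and leaves the curvature \(F_A = dA\) unchanged, which is the easy direction of the equivalence used in the proof above. The following minimal sketch illustrates this in a toy discrete model: a 2-dimensional lattice where the connection lives on edges and the curvature on plaquettes. The lattice, the sample data and all function names are hypothetical and serve only as an illustration; they are not the setting of the text.

```python
def plaquette_curvature(Ax, Ay, n):
    """Discrete curvature: oriented sum of edge values around each plaquette."""
    return {(i, j): Ax[i, j] + Ay[i + 1, j] - Ax[i, j + 1] - Ay[i, j]
            for i in range(n) for j in range(n)}

def gauge_transform(Ax, Ay, phi):
    """Add the discrete gradient of phi to the edge values (A -> A + d phi)."""
    Ax2 = {(i, j): a + phi[i + 1, j] - phi[i, j] for (i, j), a in Ax.items()}
    Ay2 = {(i, j): a + phi[i, j + 1] - phi[i, j] for (i, j), a in Ay.items()}
    return Ax2, Ay2

n = 4  # number of plaquettes per side; vertices are indexed 0..n

# Hypothetical integer sample data: Ax on horizontal edges, Ay on vertical
# edges, phi on vertices. Integers keep all comparisons exact.
Ax = {(i, j): 3 * i * i + j for i in range(n) for j in range(n + 1)}
Ay = {(i, j): i - 2 * i * j * j for i in range(n + 1) for j in range(n)}
phi = {(i, j): i * j * j + 5 * i - j
       for i in range(n + 1) for j in range(n + 1)}

F_before = plaquette_curvature(Ax, Ay, n)
Bx, By = gauge_transform(Ax, Ay, phi)
F_after = plaquette_curvature(Bx, By, n)

# The gradient terms cancel around every closed plaquette.
curvature_is_gauge_invariant = F_before == F_after
```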
We are now ready to prove our main result.
Proof of Theorem 1
We consider the finite cover \({\mathbb {T}}^r\times G_{1}\) of G as above. By Lemma 7 we know that \({\mathcal {D}}_{A_{Z}}={\mathcal {D}}_{B_{Z}}\) and \({\mathcal {D}}_{A_{1}}={\mathcal {D}}_{B_{1}}\). Let u be the gauge from Lemma 8. We have already proven Theorem 1 in the case that \(G = G_{1}\), since it has finite centre. Thus there is \({\mathbf {U}}\in G^{0}_{1}({\mathbb {D}},p)\) so that \(A_{1}\) and \(B_{1}\) are gauge equivalent via \({\mathbf {U}}\). Finally, \(p\circ (u,{\mathbf {U}})\in G^{0}({\mathbb {D}},p)\) gives a gauge equivalence between A and B as desired. \(\quad \square \)
References
Alinhac, S.: Non-unicité du problème de Cauchy. Ann. Math. (2) 117(1), 77–108 (1983)
Assylbekov, Y.M., Zhou, T.: Direct and inverse problems for the nonlinear time-harmonic Maxwell equations in Kerr-type media. Preprint, arXiv:1709.07767
Bony, J.-M.: Second microlocalization and propagation of singularities for semilinear hyperbolic equations. In: Hyperbolic Equations and Related Topics (Katata/Kyoto, 1984), pp. 11–49. Academic Press, Boston (1986)
Bröcker, T., tom Dieck, T.: Representations of Compact Lie Groups, Graduate Texts in Mathematics, vol. 98. Springer, New York (1985)
Cârstea, C.I., Nakamura, G., Vashisth, M.: Reconstruction for the coefficients of a quasilinear elliptic partial differential equation. Appl. Math. Lett. 98, 121–127 (2019)
Cekić, M.: Calderón problem for Yang–Mills connections. J. Spectr. Theory 10(2), 463–513 (2020)
Chen, X., Lassas, M., Oksanen, L., Paternain, G.P.: Detection of Hermitian connections in wave equations with cubic nonlinearity. J. Eur. Math. Soc. (JEMS) (to appear)
Chen, X., Lassas, M., Oksanen, L., Paternain, G.P.: Mathematica code verifying (55). https://github.com/loksanen/CLOP2020 (2020). GitHub repository
Choquet-Bruhat, Y.: Yang–Mills–Higgs fields in three space time dimensions. Mém. Soc. Math. France (N.S.) 46(2), 73–97 (1991)
de Hoop, M., Uhlmann, G., Wang, Y.: Nonlinear interaction of waves in elastodynamics and an inverse problem. Math. Ann. 376(1–2), 765–795 (2020)
Duistermaat, J.J., Hörmander, L.: Fourier integral operators. II. Acta Math. 128(3–4), 183–269 (1972)
Feizmohammadi, A., Ilmavirta, J., Kian, Y., Oksanen, L.: Recovery of time dependent coefficients from boundary data for hyperbolic equations. J. Spectr. Theory. Preprint arXiv:1901.04211
Feizmohammadi, A., Oksanen, L.: An inverse problem for a semilinear elliptic equation in Riemannian geometries. J. Differ. Equ. Preprint, arXiv:1904.00608
Feizmohammadi, A., Oksanen, L.: Recovery of zeroth order coefficients in nonlinear wave equations. J. Inst. Math. Jussieu. Preprint arXiv:1903.12636
Greenleaf, A., Uhlmann, G.: Recovering singularities of a potential from singularities of scattering data. Commun. Math. Phys. 157(3), 549–572 (1993)
Hall, B.: Lie Groups, Lie Algebras, and Representations, Graduate Texts in Mathematics, vol. 222, 2nd edn. Springer, Berlin (2015)
Hintz, P., Uhlmann, G.: Reconstruction of Lorentzian manifolds from boundary light observation sets. Int. Math. Res. Not. IMRN 22, 6949–6987 (2019)
Hörmander, L.: Fourier integral operators. I. Acta Math. 127(1–2), 79–183 (1971)
Hörmander, L.: The Analysis of Linear Partial Differential Operators. III, Grundlehren der Mathematischen Wissenschaften, vol. 274. Springer, Berlin (1985)
Hörmander, L.: The Analysis of Linear Partial Differential Operators. I. Springer Study Edition, 2nd edn. Springer, Berlin (1990)
Isakov, V.: On uniqueness in inverse problems for semilinear parabolic equations. Arch. Ration. Mech. Anal. 124(1), 1–12 (1993)
Isakov, V., Nachman, A.I.: Global uniqueness for a two-dimensional semilinear elliptic inverse problem. Trans. Am. Math. Soc. 347(9), 3375–3390 (1995)
Isakov, V., Sylvester, J.: Global uniqueness for a semilinear elliptic inverse problem. Commun. Pure Appl. Math. 47(10), 1403–1410 (1994)
Joshi, M.S., Sá Barreto, A.: The generation of semilinear singularities by a swallowtail caustic. Am. J. Math. 120(3), 529–550 (1998)
Kang, K., Nakamura, G.: Identification of nonlinearity in a conductivity equation via the Dirichlet-to-Neumann map. Inverse Probl. 18(4), 1079–1088 (2002)
Kato, T.: Quasilinear Equations of Evolution, with Applications to Partial Differential Equations. Lecture Notes in Mathematics, vol. 448, pp. 25–70. Springer, Berlin (1975)
Kenig, C.E., Sjöstrand, J., Uhlmann, G.: The Calderón problem with partial data. Ann. Math. (2) 165(2), 567–591 (2007)
Kurylev, Y., Lassas, M., Oksanen, L., Uhlmann, G.: Inverse problem for Einstein-scalar field equations. Preprint arXiv:1406.4776
Kurylev, Y., Lassas, M., Uhlmann, G.: Inverse problems for Lorentzian manifolds and nonlinear hyperbolic equations. Invent. Math. 212(3), 781–857 (2018)
Lassas, M.: Inverse problems for linear and nonlinear hyperbolic equations. In: Proceedings of the International Congress of Mathematicians—Rio de Janeiro 2018, vol. IV. Invited lectures, pp. 3751–3771. World Sci. Publ., Hackensack (2018)
Lassas, M., Liimatainen, T., Lin, Y.-H., Salo, M.: Partial data inverse problems and simultaneous recovery of boundary and coefficients for semilinear elliptic equations. Preprint, arXiv:1905.02764
Lassas, M., Uhlmann, G., Wang, Y.: Determination of vacuum spacetimes from the Einstein–Maxwell equations. Preprint arXiv:1703.10704
Lassas, M., Uhlmann, G., Wang, Y.: Inverse problems for semilinear wave equations on Lorentzian manifolds. Commun. Math. Phys. 360(2), 555–609 (2018)
Melrose, R., Ritter, N.: Interaction of nonlinear progressing waves for semilinear wave equations. Ann. Math. (2) 121(1), 187–213 (1985)
Melrose, R.B., Ritter, N.: Interaction of progressing waves for semilinear wave equations. II. Ark. Mat. 25(1), 91–114 (1987)
Melrose, R.B., Uhlmann, G.A.: Lagrangian intersection and the Cauchy problem. Commun. Pure Appl. Math. 32(4), 483–519 (1979)
Nachman, A.I.: Reconstructions from boundary measurements. Ann. Math. (2) 128(3), 531–576 (1988)
Oksanen, L., Salo, M., Stefanov, P., Uhlmann, G.: Inverse problems for real principal type operators. Preprint arXiv:2001.07599
Rauch, J., Reed, M.C.: Singularities produced by the nonlinear interaction of three progressing waves; examples. Commun. Partial Differ. Equ. 7(9), 1117–1133 (1982)
Sá Barreto, A., Wang, Y.: Singularities generated by the triple interaction of semilinear conormal waves. Preprint arXiv:1809.09253
Salazar, R.: Determination of time-dependent coefficients for a hyperbolic inverse problem. Inverse Probl. 29(9), 095015,17 (2013)
Salo, M., Zhong, X.: An inverse problem for the \(p\)-Laplacian: boundary determination. SIAM J. Math. Anal. 44(4), 2474–2495 (2012)
Stefanov, P.D.: Uniqueness of the multidimensional inverse scattering problem for time dependent potentials. Math. Z. 201(4), 541–559 (1989)
Sun, Z., Uhlmann, G.: Inverse problems in quasilinear anisotropic media. Am. J. Math. 119(4), 771–797 (1997)
Sylvester, J., Uhlmann, G.: A global uniqueness theorem for an inverse boundary value problem. Ann. Math. (2) 125(1), 153–169 (1987)
Uhlmann, G., Wang, Y.: Determination of spacetime structures from gravitational perturbations. Commun. Pure Appl. Math. Preprint arXiv:1806.06461
Wang, Y., Zhou, T.: Inverse problems for quadratic derivative nonlinear wave equations. Commun. Partial Differ. Equ. 44(11), 1140–1158 (2019)
Acknowledgements
ML was supported by Academy of Finland grants 320113 and 312119. LO was supported by EPSRC grants EP/P01593X/1 and EP/R002207/1, XC and GPP were supported by EPSRC grant EP/R001898/1, and XC was supported by NSFC grant 11701094. LO thanks Matthew Towers for discussions of Lie algebras.
Funding
Open Access funding provided by University of Helsinki including Helsinki University Central Hospital.
Additional information
Communicated by P. Chruscie.
Appendices
Appendix A. Elementary Computations
A.1. The Hodge star operator on Minkowski space \({\mathbb {R}}^{1+3}\)
In this section we use the Cartesian coordinates \(x^0, \dots ,x^3\) on \({\mathbb {R}}^{1+3}\) and write \(\left\langle \cdot , \cdot \right\rangle \) for the Minkowski metric with the signature \((-+++)\). We define also \({{\,\mathrm{vol}\,}}= dx^0 \wedge \dots \wedge dx^3\).
Definition 1
The Hodge star operator \(\star \) for any forms \(\omega \) and \(\eta \) of the same degree is the linear map defined by \(\omega \wedge (\star \eta ) = \langle \omega , \eta \rangle {{\,\mathrm{vol}\,}}\) where \(\langle \omega , \eta \rangle = \det (\langle \omega _j, \eta _k\rangle )\) if \(\omega = \omega _1 \wedge \cdots \wedge \omega _r\) and \(\eta = \eta _1 \wedge \cdots \wedge \eta _r\) for some 1-forms \(\omega _j\) and \(\eta _j\).
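Definition 1 can be made concrete on the basis forms \(dx^I = dx^{i_1} \wedge \cdots \wedge dx^{i_r}\). The following minimal sketch computes \(\star dx^I = c\, dx^J\), where J is the complementary multi-index, by solving \(dx^I \wedge (c\, dx^J) = \langle dx^I, dx^I \rangle {{\,\mathrm{vol}\,}}\) for the sign c. It assumes the signature \((-,+,+,+)\) with the time coordinate first; the function names are hypothetical and chosen for illustration.

```python
from itertools import combinations

# Diagonal entries g^{00}, ..., g^{33} of the Minkowski metric,
# assuming the signature (-,+,+,+).
G = (-1, 1, 1, 1)

def perm_sign(p):
    """Sign of the permutation p of distinct integers, via inversion count."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

def hodge_star(I):
    """Return (c, J) with star(dx^I) = c * dx^J for an increasing tuple I."""
    J = tuple(k for k in range(4) if k not in I)
    norm = 1  # <dx^I, dx^I> is the product of the g^{ii} for i in I
    for i in I:
        norm *= G[i]
    # dx^I ^ dx^J = perm_sign(I + J) * vol, so c * perm_sign(I + J) = norm.
    return norm * perm_sign(I + J), J

# On 2-forms in this signature, applying the star twice gives minus identity.
for I in combinations(range(4), 2):
    c1, J = hodge_star(I)
    c2, K = hodge_star(J)
    assert K == I and c1 * c2 == -1
```

For instance, `hodge_star((0, 1))` returns `(-1, (2, 3))`, that is, \(\star (dx^0 \wedge dx^1) = -dx^2 \wedge dx^3\) under the stated signature assumption.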
In order to express the Yang–Mills equations and their linearizations in local coordinates, we will need the following lemma:
Lemma 9
Writing \(g^{\alpha \beta } = \left\langle dx^\alpha , dx^\beta \right\rangle \) there holds
In (63) it is assumed that \(\alpha \ne \beta \).
Proof
Taking \(\omega = \eta = {{\,\mathrm{vol}\,}}\) in Definition 1, we see that \(\star {{\,\mathrm{vol}\,}}= g^{00} \cdots g^{33} = -1\). Then (62) follows immediately:
Let us turn to (63). Let \(\alpha \ne \beta \) and choose indices j, k and a sign \(\epsilon = \pm 1\) so that
Now \(\star (dx^\alpha \wedge dx^\beta ) = c dx^j \wedge dx^k\) for a sign \(c=\pm 1\) that satisfies
where \(\eta = \left\langle dx^\alpha \wedge dx^\beta , dx^\alpha \wedge dx^\beta \right\rangle = g^{\alpha \alpha } g^{\beta \beta }\). Both sides of (63) vanish if \(p \ne \alpha \) and \(p \ne \beta \). Suppose now that \(p=\alpha \); the case \(p=\beta \) is analogous and we omit its proof. There holds \(\star (dx^\alpha \wedge dx^j \wedge dx^k) = c' dx^\beta \) for a sign \(c'=\pm 1\) that satisfies
where \(\eta ' = g^{\alpha \alpha } g^{jj} g^{kk}\). Solving for c and \(c'\) gives
A.2. The adjoint \(d_A^*\) in coordinates
Using the formulas (62)–(63) we can easily find expressions for \(d_A^* = \star d_A \star \) in the Cartesian coordinates.
Lemma 10
If \(X= X_\alpha dx^\alpha \), then
If \(Y= Y_{\alpha \beta } dx^\alpha \wedge dx^\beta \), then
Proof
We have
and
\(\square \)
A.3. Proofs of (29)–(31)
In some of our computations we encounter terms of the form \(\star [X,\star Y]\in \Omega ^{1}\) for \(X\in \Omega ^{1}\) and \(Y\in \Omega ^{2}\). The next elementary lemma computes this term explicitly.
Lemma 11
If \(X=X_{\alpha }dx^{\alpha }\) and \(Y=Y_{\alpha \beta }dx^{\alpha }\wedge dx^{\beta }\) then
Proof
We have
\(\square \)
We are now ready to prove (30) that expands \(\star [X, \star d_A Z]\) for \(X, Z \in \Omega ^1\) in coordinates. Using Lemma 11 with \(Y_{\alpha \beta } = \partial _\alpha Z_\beta + [A_\alpha , Z_\beta ]\), we obtain
We apply Lemma 11 with \(Y_{\alpha \beta }\) replaced by \([Y_\alpha , Z_\beta ]\), to establish (31), giving \(\star [X, \star [Y, Z]]\) for \(X, Y, Z \in \Omega ^1\) in coordinates as follows,
The proof of (29), which gives the analogous expansion of \(d_A^*[X, Z]\) for \(X,Z \in \Omega ^1\), is more involved. Let us consider first the terms in the \(\beta \)th component of
that contain derivatives. Using Lemma 10 these read
Similarly, the terms in the \(\beta \)th component of (64) that do not contain derivatives are
We used here the Jacobi identity. Hence we obtain (29), that is,
A.4. Yang–Mills equations in coordinates
For the convenience of the reader we prove the following well-known lemma.
Lemma 12
If \(A = A_\alpha dx^\alpha \) then the components of \(d^*_A F_A\) are given by
Proof
We apply Lemma 10 with \(Y_{\alpha \beta } = \partial _\alpha A_\beta + \frac{1}{2} [A_\alpha , A_\beta ]\), to see that the components of \(d^*_A F_A\) are
and the claim follows after combining the terms with factors 1/2, and using
\(\square \)
Appendix B. Direct Problem
B.1. An energy estimate
We write again \((x^0, x^1, x^2, x^3) = (t,x) \in {\mathbb {R}}^{1+3}\) for the Cartesian coordinates, and recall the sign convention (18) for the wave operator \(\Box \). We write also \(\nabla u = (\partial _{x^1} u, \partial _{x^2} u, \partial _{x^3} u)\) and denote by \(\cdot \) the Euclidean inner product on \({\mathbb {R}}^3\).
Let \(X_j\), \(j=1,2\), be first order and \(Y_j\), \(j=1,2\), zeroth order differential operators on \({\mathbb {R}}^{1+3}\). Suppose, furthermore, that \(X_2\) is of zeroth order with respect to the t variable. We will consider the system
Here v and u are allowed to take values on a Hermitian vector bundle, but we do not emphasize this in the notation.
We prove an energy estimate for (65). Write \(B(r) = \{ x \in {\mathbb {R}}^3 : |x| < r\}\). Let \(R > 0\) and define \(r(t) = R - t\). Consider the following local energy
and the norm of the source
Lemma 13
Let \(T > 0\) and define the cut cone
Suppose that \(v,u \in C^2({\mathcal {C}})\) satisfy (65) in \({\mathcal {C}}\). Then for a constant \(C > 0\) that depends only on the \(L^\infty ({\mathcal {C}})\)-norm of the coefficients of \(X_j\) and the \(W^{1,\infty }({\mathcal {C}})\)-norm of the coefficients of \(Y_j\), \(j=1,2\),
Proof
We differentiate the local energy
We write \(z_1 =  X_1 v  X_2 u + v+ f_1\) and \(z_2 = Y_1 v  Y_2 u+f_2\), apply integration by parts to the second term in the first integral, and use (65) to obtain
We have \(z_j^2 \le C({\mathcal {E}}+ {\mathcal {F}})\), \(j=1,2\), and \(\nabla z_2^2 \le C({\mathcal {E}}+ {\mathcal {F}})\), where the constant \(C > 0\) depends only on the \(L^\infty ({\mathcal {C}})\)-norm of the coefficients of \(X_j\) and the \(W^{1,\infty }({\mathcal {C}})\)-norm of the coefficients of \(Y_j\), \(j=1,2\). Moreover,
and we obtain
Now we can use Grönwall’s inequality, or simply notice that
leading to the energy estimate (66). \(\quad \square \)
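The last step of the proof is a Grönwall-type argument. Assuming a differential inequality of the general shape \(\partial _t {\mathcal {E}} \le C({\mathcal {E}} + {\mathcal {F}})\) with \({\mathcal {F}} \ge 0\) (the precise inequality and the estimate (66) are not reproduced here, so this shape is an assumption), one obtains a bound of the form \({\mathcal {E}}(T) \le e^{CT}({\mathcal {E}}(0) + C\int _0^T {\mathcal {F}}\,dt)\). The following minimal sketch illustrates the mechanism with a forward Euler discretization of the extremal case \(\partial _t {\mathcal {E}} = C({\mathcal {E}} + {\mathcal {F}})\). The source `F`, the constants and the function name are hypothetical sample choices.

```python
import math

def euler_energy(E0, F, C, T, steps):
    """Forward Euler for E' = C * (E + F(t)) on [0, T].

    Returns the final value of E and the Riemann sum approximating the
    integral of F over [0, T].
    """
    h = T / steps
    E = E0
    F_int = 0.0
    for n in range(steps):
        Fn = F(n * h)
        E += h * C * (E + Fn)
        F_int += h * Fn
    return E, F_int

# Hypothetical sample data: nonnegative source, positive constant.
C, T, E0 = 2.0, 1.0, 1.0
E_final, F_int = euler_energy(E0, lambda t: 1.0 + t * t, C, T, 1000)

# Discrete Gronwall bound: each Euler step multiplies by (1 + C*h) <= exp(C*h),
# so the accumulated growth factor never exceeds exp(C*T).
gronwall_bound = math.exp(C * T) * (E0 + C * F_int)
assert E_final <= gronwall_bound
```

In particular, if \({\mathcal {E}}(0) = 0\) and \({\mathcal {F}} = 0\), the bound forces \({\mathcal {E}}\) to vanish, which is the mechanism behind uniqueness arguments of this type.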
The energy estimate (66) implies the following two uniqueness results.
Lemma 14
Suppose that \(v,u \in C^2({\mathbb {D}})\), that the coefficients of \(X_j\) are in \(L^\infty ({\mathbb {D}})\) and that the coefficients of \(Y_j\) are in \(W^{1,\infty }({\mathbb {D}})\) for \(j=1,2\). If (v, u) is a solution to (65) with \(f_1=0\) and \(f_2=0\) and if (v, u) vanishes near \(\partial ^- {\mathbb {D}}\), then (v, u) vanishes in \({\mathbb {D}}\).
Proof
As (v, u) vanishes near \(\partial ^- {\mathbb {D}}\), also the extension of (v, u) by zero to the cone
solves (65) with \(f_1=0\) and \(f_2=0\). Therefore the energy estimate (66) implies that (v, u) vanishes. \(\quad \square \)
Proposition 10
Let \(A, B \in \Omega ^{1}({\mathbb {D}};{\mathfrak {g}})\) solve (1) in \({\mathbb {D}}\). Suppose that \(A \sim B\) near \(\partial ^- {\mathbb {D}}\). Then \(A \sim B\) in \({\mathbb {D}}\).
Proof
We write \({{\tilde{A}}} = {\mathscr {T}}(A)\) and \({{\tilde{B}}} = {\mathscr {T}}(B)\), see (14). As \(A \sim B\) near \(\partial ^- {\mathbb {D}}\), also \({{\tilde{A}}} \sim {{\tilde{B}}}\) there. That is, there is \({\mathbf {U}}\in G^0({\mathbb {D}},p)\) such that
As both \({{\tilde{A}}}\) and \({{\tilde{B}}}\) are in the temporal gauge, \({\mathbf {U}}\) does not depend on time and we may define \(V = {\mathbf {U}}^{-1} d {\mathbf {U}}+ {\mathbf {U}}^{-1} {{\tilde{B}}} {\mathbf {U}}\) in the whole \({\mathbb {D}}\). Now both \({{\tilde{A}}}\) and V satisfy the Yang–Mills equations in \({\mathbb {D}}\). They are also both in the temporal gauge and coincide near \(\partial ^- {\mathbb {D}}\). Pseudolinearization in Section 4.1.2, together with Lemma 14, implies that \({{\tilde{A}}} = V\) in \({\mathbb {D}}\). Therefore \({{\tilde{A}}} \sim {{\tilde{B}}}\) in \({\mathbb {D}}\) and hence also \(A \sim B\) there. \(\quad \square \)
B.2. Linearized Yang–Mills equations in relative Lorenz gauge
A linearization of (25) can be solved using the following lemma. For notational convenience we translate the origin in time so that the initial conditions are posed on \(t=0\).
Lemma 15
Let \(T > 0\) and write \(M = (0,T) \times {\mathbb {R}}^3\). Let \(A \in \Omega ^1(M, {\mathfrak {g}})\) be as in Proposition 3. Let \(f_1 \in H^k(M; T^* M \otimes {\mathfrak {g}})\) and \(f_2 \in H^{k+1}(M; {\mathfrak {g}})\). Then
has a unique solution \((\dot{W},\dot{J}_0)\) and the map \({\mathcal {S}}(f_1,f_2) = (\dot{W},\dot{J}_0)\) is continuous
The system (67) is of the form (65) with \(v = \dot{W}\) and \(u= \dot{J}_0\), and the coefficients of \(X_j\) and \(Y_j\), \(j=1,2\), depend only on the background connection A and are smooth. Using the energy estimate (66), it is straightforward to show that (67) has a unique solution. However, we give a short proof based on the fact that the second equation in (67) does not involve \(\dot{W}\).
Proof
Solving the second equation gives \(\dot{J}_0 \in H^{k+1}(M; {\mathfrak {g}})\). Then \(\dot{W}\) can be solved from the linear wave equation
where \(f_1 + \dot{J}_0dt \in H^k(M; T^* M \otimes {\mathfrak {g}})\). \(\quad \square \)
B.3. Proof of Proposition 3
To simplify the notation in the proof, we write \(H^k(M)\) also for Sobolev spaces of vector valued functions. As \(k \ge 4\), the Sobolev embedding theorem implies that both \(H^k(M)\) and \(H^{k+1}(M)\) are Banach algebras, and also that \(H^{k+1}(M)\) embeds in \(C^2(M)\).
We define
where \(u = (W, J_0)\), \(J' = (J_1, J_2, J_3)\) and \(j=1,2,3\). Then (25) is equivalent to
Consider the map \(\Phi (u, J') = u - {\mathcal {S}} {\mathcal {K}}(u, J')\) where \({\mathcal {S}}\) is as in (68). Observe that if \(\Phi (u, J') = 0\) then \(u = {\mathcal {S}} {\mathcal {K}}(u, J')\) solves (69). Let us show that
We have \({\mathcal {N}}(W) \in H^k(M)\) since W, the first component of u, is in \(H^{k+1}(M)\) and since \(H^k(M)\) is a Banach algebra. Therefore the first component of \({\mathcal {K}}(u, J')\) is in \(H^k(M)\). Similarly, using the fact that \(H^{k+1}(M)\) is a Banach algebra, we have that the second component of \({\mathcal {K}}(u, J')\) is in \(H^{k+1}(M)\). The regularity (70) follows then from (68).
The map \(\Phi \) is a third order polynomial, and therefore it is smooth. Moreover, \({\mathcal {K}}(u, 0)\) contains only monomials of order two and three, and it follows that \(\partial _u \Phi (0,0) = {{\,\mathrm{id}\,}}\). The implicit function theorem gives a neighbourhood \({\mathcal {H}}\) of the zero function in \(H^{k+2}(M)\) and a smooth map \(J' \mapsto u\) from \({\mathcal {H}}\) to \(H^{k+1}(M)\) such that \(\Phi (u(J'), J') = 0\) for all \(J' \in {\mathcal {H}}\).
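The structure of this argument can be illustrated in finite dimensions. The following Python sketch is a toy analogue only, not the construction used in the proof: it replaces the implicit function theorem by a plain fixed-point iteration \(u \mapsto {\mathcal {S}} {\mathcal {K}}(u, J')\), takes \({\mathcal {S}}\) to be a hypothetical \(3 \times 3\) matrix, and takes \({\mathcal {K}}(u, J') = J' + u^2 + u^3\) componentwise, so that \({\mathcal {K}}(u,0)\) contains only monomials of order two and three. All names and numerical values are assumptions made for illustration.

```python
import numpy as np

def K(u, J):
    # Toy nonlinearity: the source J plus quadratic and cubic monomials in u
    # (componentwise), so K(u, 0) has no constant or linear part.
    return J + u**2 + u**3

def solve_fixed_point(S, K, J, tol=1e-12, max_iter=200):
    """Iterate u <- S K(u, J) from u = 0 until successive iterates agree to tol."""
    u = np.zeros(S.shape[0])
    for _ in range(max_iter):
        u_next = S @ K(u, J)
        if np.linalg.norm(u_next - u) < tol:
            return u_next
        u = u_next
    raise RuntimeError("no convergence; the source J may be too large")

# Hypothetical bounded linear solution operator (stand-in for S in the text).
S = np.array([[0.5, 0.1, 0.0],
              [0.0, 0.5, 0.1],
              [0.0, 0.0, 0.5]])
# A small source, playing the role of J' in a neighbourhood of zero.
J = 0.01 * np.array([1.0, -2.0, 3.0])

u = solve_fixed_point(S, K, J)
residual = np.linalg.norm(u - S @ K(u, J))  # norm of Phi(u, J)
print("u =", u)
print("residual of Phi(u, J) =", residual)
```

For a small source the iteration is a contraction near the origin, which mirrors the role of \(\partial _u \Phi (0,0) = {{\,\mathrm{id}\,}}\) above; for a large source the sketch raises an error rather than returning a spurious value.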
Appendix C. Generation of \({{\,\mathrm{{\mathfrak {su}}}\,}}(n)\) Using Nested Commutators
We recall the definition of generalized Gell-Mann matrices. Denote by \(E_{jk}\) the matrix with 1 in the jk-th entry and 0 elsewhere. The three types of generalized Gell-Mann matrices in \({\mathbb {C}}^{n \times n}\) are as follows:

symmetric type: for \(1 \le j < k \le n\) let \(S_{jk} = E_{jk} + E_{kj}\).

antisymmetric type: for \(1 \le j < k \le n\) let \(A_{jk} = -i E_{jk} + i E_{kj}\).

diagonal type: for \(1 \le l \le n - 1\) let \(D_l\) be the matrix with 1 in the jj-th entry for \(1 \le j \le l\), \(-l\) in the jj-th entry with \(j=l+1\), and 0 elsewhere.
The diagonal type matrices \(D_l\) are typically normalized by multiplying them with \(\sqrt{\frac{2}{l (l+1)}}\) but this is irrelevant for our purposes. A basis of \({{\,\mathrm{{\mathfrak {su}}}\,}}(n)\) is given by the matrices \(i S_{jk}\), \(i A_{jk}\) and \(i D_l\).
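The construction can be checked numerically. The following Python sketch assumes NumPy and the sign convention \(A_{jk} = -i E_{jk} + i E_{kj}\); the helper names `gell_mann`, `su_basis` and `real_rank` are illustrative and do not come from the paper. It builds the three families with zero-based indices and verifies that the matrices \(i S_{jk}\), \(i A_{jk}\) and \(i D_l\) are anti-Hermitian, traceless and linearly independent over \({\mathbb {R}}\), so that there are \(n^2 - 1\) of them, the dimension of \({{\,\mathrm{{\mathfrak {su}}}\,}}(n)\).

```python
import numpy as np

def gell_mann(n):
    """Return lists (S, A, D) of generalized Gell-Mann matrices in C^{n x n}."""
    def E(j, k):
        # Matrix unit with 1 in entry (j, k), zero-based, and 0 elsewhere.
        m = np.zeros((n, n), dtype=complex)
        m[j, k] = 1.0
        return m

    pairs = [(j, k) for j in range(n) for k in range(j + 1, n)]
    S = [E(j, k) + E(k, j) for j, k in pairs]
    A = [-1j * E(j, k) + 1j * E(k, j) for j, k in pairs]
    D = []
    for l in range(1, n):
        d = np.zeros(n, dtype=complex)
        d[:l] = 1.0   # entries 1, ..., l equal 1
        d[l] = -l     # entry l + 1 equals -l, which makes D_l traceless
        D.append(np.diag(d))
    return S, A, D

def su_basis(n):
    """Candidate basis i*S_jk, i*A_jk, i*D_l of su(n)."""
    S, A, D = gell_mann(n)
    return [1j * m for m in S + A + D]

def real_rank(mats):
    # Rank over the reals: stack real and imaginary parts of each matrix.
    rows = [np.concatenate([m.real.ravel(), m.imag.ravel()]) for m in mats]
    return int(np.linalg.matrix_rank(np.array(rows)))

for n in range(2, 6):
    basis = su_basis(n)
    assert all(np.allclose(m.conj().T, -m) for m in basis)   # anti-Hermitian
    assert all(abs(np.trace(m)) < 1e-12 for m in basis)      # traceless
    assert len(basis) == n * n - 1 == real_rank(basis)       # independent
    print(n, len(basis), real_rank(basis))
```

The unnormalized \(D_l\) are used here, in line with the remark that the normalization is irrelevant for the purposes of this appendix.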
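In the case \(n=2\), we obtain the Pauli matrices \(S_{12} = \begin{pmatrix} 0 &amp; 1 \\ 1 &amp; 0 \end{pmatrix}\), \(A_{12} = \begin{pmatrix} 0 &amp; -i \\ i &amp; 0 \end{pmatrix}\) and \(D_1 = \begin{pmatrix} 1 &amp; 0 \\ 0 &amp; -1 \end{pmatrix}\).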
We define the nested commutator
Lemma 16
\({{\,\mathrm{{\mathfrak {su}}}\,}}(n)\) with \(n \ge 2\) is the linear span of the set
Before giving the general proof, let us consider the case of \({{\,\mathrm{{\mathfrak {su}}}\,}}(2)\). A straightforward computation shows that
Therefore the lemma holds in the case \(n=2\).
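A computation of this kind is easy to reproduce numerically. The definition of the nested commutator \(c\) is not reproduced in the text above, so the following Python sketch assumes, purely for illustration, the double commutator \([X,[X,Y]]\); the function name `double_comm` is hypothetical and need not agree with \(c\). Under that assumption, each Pauli matrix is recovered, up to a factor, from a double commutator of two others.

```python
import numpy as np

def comm(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def double_comm(X, Y):
    # ASSUMPTION: illustrative stand-in [X, [X, Y]]; the paper's nested
    # commutator c(X, Y) may be defined differently.
    return comm(X, comm(X, Y))

# Pauli matrices, i.e. S_12, A_12 and D_1 in the case n = 2.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

for a in range(3):
    for b in range(3):
        if a != b:
            # For distinct Pauli matrices, [s_a, [s_a, s_b]] = 4 s_b.
            assert np.allclose(double_comm(sigma[a], sigma[b]), 4 * sigma[b])
            # For the su(2) elements i s_a and i s_b the factor becomes -4.
            X, Y = 1j * sigma[a], 1j * sigma[b]
            assert np.allclose(double_comm(X, Y), -4 * Y)
print("double commutator identities verified for su(2)")
```

Since every basis element \(i\sigma _b\) is a multiple of such a double commutator, these expressions span \({{\,\mathrm{{\mathfrak {su}}}\,}}(2)\) under the stated assumption.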
Proof
The computation in the case \(n=2\) generalizes immediately to
Also \(D_1 = 4 c(S_{12},D_1)\). We will show by induction on \(l\) that \(D_l\) can be expressed as a linear combination of the nested commutators. Denote the upper left \(m \times m\) block of a matrix A by \(A^m\) and the lower right \(m \times m\) block by \(A_m\). Then
and the rest of the entries of \(A_{23}\) and \(D_1\) are zero. Therefore
with the rest of the entries zero. It follows that
Analogously,
and hence
with the rest of the entries zero. Therefore
If \(D_{l-1}\) is a linear combination of the nested commutators, then so is \(D_l\). \(\quad \square \)
Chen, X., Lassas, M., Oksanen, L. et al. Inverse Problem for the Yang–Mills Equations. Commun. Math. Phys. 384, 1187–1225 (2021). https://doi.org/10.1007/s00220-021-04006-0