Background Independence in Gauge Theories

Classical field theory is insensitive to the split of the field into a background configuration and a dynamical perturbation. In gauge theories, the situation is complicated by the fact that a covariant (w.r.t. the background field) gauge fixing breaks this split independence of the action. Nevertheless, background independence is preserved on the observables, as defined via the BRST formalism, since the violation term is BRST exact. In quantized gauge theories, however, BRST exactness of the violation term is not sufficient to guarantee background independence, due to potential anomalies. We define background-independent observables in a geometrical formulation as flat sections of the observable algebra bundle over the manifold of background configurations, with respect to a flat connection which implements background variations. A theory is then called background independent if such a flat (Fedosov) connection exists. We analyze the obstructions to preserving background independence at the quantum level for pure Yang–Mills theory and for perturbative gravity. We find that in the former case, all potential obstructions can be removed by finite renormalization. In the latter case, as a consequence of power-counting non-renormalizability, there are infinitely many non-trivial potential obstructions to background independence. We leave open the question of whether these obstructions actually occur.


Introduction
In quantum field theory (QFT), one frequently considers the quantum fluctuations around classical field configurations. Examples are:
• Spontaneous symmetry breaking in the standard model, where one considers quantum fluctuations around a non-trivial classical configuration of the Higgs field;
• The background field method, which is an efficient tool, for example, for the computation of the renormalization group flow (see, e.g., [1]);
• Perturbative quantum gravity, where one has to use a non-trivial background metric, providing the necessary structure for the formulation of a QFT [2,3].
Hence, the issue of background independence seems to be of high conceptual importance. Apart from the discussion in [2], on which we comment in detail below, there are basically two approaches to deal with it in the literature. One is the Riemannian path integral framework, which faces the problem that, in the presence of non-trivial background fields, the relation between correlation functions on Riemannian spaces, and the QFT on Lorentzian spacetime in which one is ultimately interested, is unclear. In particular, in the absence of an Osterwalder-Schrader theorem, it is not clear whether such correlation functions define a QFT in the sense of observables represented by operators on some Hilbert space. The other approach, discussed in more detail at the end of this section, is to treat the background field as an infinitesimal perturbation around a fixed flat reference background. However, for a full proof of background independence, one should treat the background field non-perturbatively. Then, one faces the problem that on generic backgrounds, there is no unique vacuum state and that the usual renormalization techniques based on momentum space are not available. A further common shortcoming of these approaches is that they are not "operational" in the sense that they do not address the following question: Given a background configuration and an observable defined w.r.t. this background, what is the same observable on a different background?
In view of the difficulties mentioned above, we follow the algebraic approach, i.e., we directly (perturbatively) construct the algebras of observables for the different background configurations, using locally covariant renormalization techniques developed in the context of QFT on curved space-times [4]. Background independence for us then means that we can unambiguously identify observables on different backgrounds (at least for infinitesimally close backgrounds). As suggested in [5,6], this can be formulated in the spirit of Fedosov quantization [7]: One considers the bundle of observable algebras over the manifold of background configurations and constructs a flat connection on it. The sections that are flat, i.e., covariantly constant, w.r.t. this connection provide a consistent assignment of an observable to each background. The similarity of background independence and Fedosov's approach has already been noted, in a quantum mechanical framework, in [8].

The Scalar Field as a Toy Model
To motivate our definition of background independence and to introduce some of the relevant concepts, let us first discuss a toy model, namely the self-interacting $\Phi^4$-theory. Consider splitting the basic scalar field,
$$\Phi = \bar\phi + \phi, \qquad (1)$$
into a background configuration $\bar\phi$, which is kept classical at the quantum level (i.e., it commutes with all quantum fields), and a dynamical field $\phi$, which is viewed as a fluctuation around $\bar\phi$ and is quantized in perturbation theory. The question of background independence is then the following: Is field theory independent of the splitting of $\Phi$ into a background $\bar\phi$ and a perturbation $\phi$? Clearly, the action functional $S[\Phi]$ depends only on the combination $\bar\phi + \phi$; hence, the classical field theory is independent of this split. We say that it exhibits split independence. Here, we ask whether, and in which mathematically rigorous sense, this split independence is preserved at the quantum level.
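As a minimal sketch of the classical split independence just described, assume only that the action depends on the two fields through their sum. The direction $\bar\varphi$ of the variation is a symbol introduced here for illustration. The chain rule then gives:

```latex
% Sketch: S depends on (\bar\phi, \phi) only through \Phi = \bar\phi + \phi.
% \bar\varphi is an arbitrary direction of variation (illustrative notation).
\begin{align}
\frac{d}{ds}\, S\big[(\bar\phi + s\bar\varphi) + \phi\big]\Big|_{s=0}
  = \Big\langle \frac{\delta S}{\delta\Phi}[\bar\phi+\phi],\, \bar\varphi \Big\rangle
  = \frac{d}{ds}\, S\big[\bar\phi + (\phi + s\bar\varphi)\big]\Big|_{s=0}.
\end{align}
```

Varying the background and varying the dynamical field in the same direction therefore have the same effect on the classical action, so the difference of the two variations vanishes.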
To analyze the issue, we find it convenient to adopt the framework of locally covariant quantum field theory [4,11] which has proven to be powerful for QFT in curved space-time or in the presence of non-trivial background gauge connections [12]. In this framework, the covariance with respect to suitable transformations of background data (e.g., isometries of the background metric or gauge transformations of the background connection) is manifest by construction. The objects of primary interest are renormalized interacting time-ordered products, which include interacting fields. They are constructed in perturbation theory and generate the non-commutative local algebra of observables.
More concretely, for the example of scalar field theory expanded around a classical solution $\bar\phi$ of $\Phi^4$-theory, one constructs, for each such background $\bar\phi$, the local algebra $W_{\bar\phi}$. To each classical local functional $F[\bar\phi, \phi]$, one associates the generating functional $T^{int}_{\bar\phi}(e_\otimes^{iF})$ of interacting time-ordered products, which is an element of $W_{\bar\phi}$. These elements generate the algebra $W^{int}_{\bar\phi}$ of interacting observables. Now consider a local functional $F[\Phi]$. Obviously, it induces local functionals $F[\bar\phi, \phi] = F[\bar\phi + \phi]$ for the different backgrounds $\bar\phi$. Their background independence can be stated via functional derivatives as
$$\bar D_{\bar\varphi} F := \bar\delta_{\bar\varphi} F - \delta_{\bar\varphi} F := \Big\langle \frac{\delta}{\delta\bar\phi} F, \bar\varphi \Big\rangle - \Big\langle \frac{\delta}{\delta\phi} F, \bar\varphi \Big\rangle = 0, \qquad (2)$$
where $\bar\varphi$ is some variation of the background. The question is how to implement this on the quantum observables $T^{int}_{\bar\phi}(e_\otimes^{iF})$. While the second derivative (w.r.t. the dynamical field $\phi$) is well defined on $W_{\bar\phi}$, the first derivative (w.r.t. the background field $\bar\phi$) has no obvious meaning on $W_{\bar\phi}$, as one is comparing elements of different algebras. The way out is to replace this derivative with the retarded variation $\delta^r$ [13], which is the infinitesimal version of the Møller operator relating the algebras on the different backgrounds [14] (see below). The natural translation of (2) to an assignment
$$\bar\phi \mapsto T^{int}_{\bar\phi}\big(e_\otimes^{iF[\bar\phi,-]}\big) \qquad (3)$$
of interacting fields to different backgrounds is thus
$$\mathcal D_{\bar\varphi} T^{int}_{\bar\phi}(e_\otimes^{iF}) := \delta^r_{\bar\varphi} T^{int}_{\bar\phi}(e_\otimes^{iF}) - \delta_{\bar\varphi} T^{int}_{\bar\phi}(e_\otimes^{iF}) = 0. \qquad (4)$$
It turns out [5], cf. [15] for details, that in the $\Phi^4$-theory, this is equivalent to (2) in the sense that
$$\mathcal D_{\bar\varphi} T^{int}_{\bar\phi}(e_\otimes^{iF}) = i\, T^{int}_{\bar\phi}\big(\bar D_{\bar\varphi} F \otimes e_\otimes^{iF}\big) \qquad (5)$$
precisely if perturbative agreement [13] holds for changes in the (position dependent) mass of the scalar field. As a consequence of the flatness of $\bar D$, also $\mathcal D$ is then flat. For variations in the mass, perturbative agreement can be fulfilled [15,16], so that (5) indeed holds. It is natural to give this a geometric interpretation along the lines of Fedosov quantization, as suggested in [5,6] (see [15] for details). Consider the manifold $S_{\Phi^4}$ of solutions to the interacting $\Phi^4$ field equations. The tangent space at each $\bar\phi \in S_{\Phi^4}$ is the space of solutions $\bar\varphi$ of the field equations linearized around $\bar\phi$.
We patch all algebras $W^{int}_{\bar\phi}$ together to obtain the algebra bundle $W^{int}_{\Phi^4}$ over $S_{\Phi^4}$, whose fiber over $\bar\phi$ is $W^{int}_{\bar\phi}$. An assignment $\bar\phi \mapsto T^{int}_{\bar\phi}(e_\otimes^{iF[\bar\phi,-]})$ as above is then interpreted as a section of $W^{int}_{\Phi^4}$, and $\mathcal D_{\bar\varphi}$ as a covariant derivative (connection) on this bundle in the direction of the vector field $\bar\varphi$. If this connection is flat, we call the QFT background independent. Flatness ensures that, at least formally, any interacting observable on one background can be uniquely parallel transported to any other background, providing an answer to the question posed at the beginning of this section. Or, in the spirit of Fedosov quantization: The space of sections of $W^{int}_{\Phi^4}$ is much larger than the space of functionals of $\Phi$, i.e., the space of functions on $S_{\Phi^4}$. However, when restricting to sections that are flat w.r.t. $\mathcal D$, i.e., fulfill (4), one obtains a one-to-one correspondence between functions on $S_{\Phi^4}$ and flat sections of $W^{int}_{\Phi^4}$. Again, flatness of $\mathcal D_{\bar\varphi}$ is crucial.

Gauge Theories
The main aim of the present work is to analyze the issue of background independence for gauge theories where more complications arise due to gauge fixing. Let us for definiteness consider the pure Yang-Mills theory which is the theory of a G-connection A on a principal bundle, subject to the Yang-Mills field equations. We split into a background connectionĀ and a dynamical g-valued 1-form A (a vector potential) which will be quantized in perturbation theory.Ā is a solution to the Yang-Mills equation. Similar to the scalar case, the classical Yang-Mills action is independent of this split. However, for the purpose of perturbative quantization, one has to fix the gauge, which necessarily breaks this split independence if one requires a covariant gauge fixing. The gauge-fixed action exhibits a residual fermionic symmetry, the BV-BRST symmetry. It acts by a nilpotent operator s, and the physical (gauge invariant) observables are obtained as the cohomology of s. In fact, the violation of the split independence in the gauge-fixed action is s exact. It follows that, classically, split independence holds at the level of gauge-invariant observables, i.e., there is a flat connec-tionDā on classical local functionals that is well defined on s cohomology, i.e., Dā • s = s •Dā. Hereā is an infinitesimal variation of the background. To quantize, one constructs, for each backgroundĀ, the (unphysical) where Q int A is the renormalized interacting BRST charge, and the commutator is taken w.r.t. the algebra product. Therefore, for background independence to hold, the desired connection Dā has to be well defined on the BRST cohomology, that is, it must satisfy on-shell. Furthermore, on the kernel of [Q int A , −] , the curvature of Dā has to vanish modulo an element in the image of [Q int A , −] . If this is the case, background-independent observables can be defined as those sections of the observable algebra bundle which are flat w.r.t. Dā modulo Im[Q int A , −] . 
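The requirement that the connection be well defined on $s$ cohomology can be unpacked in a short computation. The following is only a sketch for a generic nilpotent operator $s$ and a linear operator $D$ commuting with it, as stated in the text; it uses no property specific to Yang-Mills theory, and the symbol $H(s)$ for the cohomology is introduced here for illustration.

```latex
% Assumptions: s^2 = 0 and D s = s D.
% H(s) denotes the cohomology of s (illustrative notation).
\begin{align}
H(s) &:= \ker s \,/\, \operatorname{im} s, \\
sF = 0 \;&\Rightarrow\; s(DF) = D(sF) = 0,
  && \text{so } D(\ker s) \subset \ker s, \\
F = sG \;&\Rightarrow\; DF = D(sG) = s(DG),
  && \text{so } D(\operatorname{im} s) \subset \operatorname{im} s.
\end{align}
% Hence [F] \mapsto [DF] is a well-defined map on H(s).
```

The same two inclusions are what the quantum condition demands once $s$ is replaced by the commutator with the interacting BRST charge, with the difference that there they can fail because of anomalies.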
We find that there are potential obstructions (anomalies) for the construction of such a connection. However, for pure Yang-Mills theory in D = 4 space-time dimensions, these turn out to be trivial. Power-counting renormalizability is a crucial ingredient of our proof. If the relevant anomaly is absent, then an identity analogous to (5) holds in F YM = ĀFĀ, namely whereÂ incorporates quantum corrections. Hence, a classically gauge-invariant and background-independent local functional does not automatically give rise to a background-independent observable at the quantum level, but quantum corrections may be necessary.
We also sketch the application of our framework to perturbative quantum gravity. As in any diffeomorphism-invariant theory, the definition of local observables is a major issue, and we follow recent proposals [2,17], based on [18], for the construction of such (relational) observables employing a set of configuration-dependent covariant coordinates. As opposed to the pure Yang-Mills case, our analysis of potential anomalies to background independence shows that for the case of perturbative gravity, one can indeed find infinitely many candidates for such anomalies using the dimensionful coupling of the theory. From this perspective, it seems difficult to prove the absence of anomalies, as they may appear at arbitrarily high order in perturbation theory.
We would like to point out that our work does not yet provide a full Fedosov quantization of Yang-Mills theories. First of all, one should then work on gauge equivalence classes of classical solutions as base space, not on the full space of classical solutions, as we do (but see [19] for a different point of view). Second, the set of solutions to the Yang-Mills equation is a manifold only up to singular points corresponding to solutions with symmetries [20]. We work locally in configuration space, i.e., in a neighborhood of a generic configuration, avoiding these singularities. We should also emphasize that the main focus of our work is algebraic, not (functional) analytic. In particular, we do not discuss the analytical aspects of the infinite-dimensional manifolds of solution spaces, and algebra bundles upon these. We refer to [15] for a thorough discussion.

Comparison with the Path Integral Approach
Let us compare our treatment of background independence with more formal approaches, in particular the path integral formalism. In the case of the scalar field, one defines the generating functional of connected graphs as and the corresponding effective action as Assuming that the path integral measure Dφ is shift invariant, one obtains, with Γ the generating functional in the absence of the background field [1]. In particular,Γ In this sense, background independence holds, provided that shift invariance of the path integral measure is fulfilled. One can thus see perturbative agreement as the rigorous version of the shift invariance of the formal path integral. 6 Shift invariance of the path integral measure is also a crucial requirement in the treatment of background independence in gauge theory given in [21]. However, as described above, this is not sufficient, as the gauge-fixed action is not split independent. To deal with this, an extended BRST differential is introduced in [21], which also implements a shift between the background and the dynamical vector potential. It is then argued that the corresponding Slavnov identities can be fulfilled. As in our treatment, a crucial ingredient in that proof is power-counting renormalizability, which restricts the number of possible counterterms.
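The formal shift argument for the scalar field can be sketched as follows. This is a purely formal computation under the assumption of a shift-invariant measure; the factors of $i$, the sign of the source term, and the symbols $\bar W$, $W$, $\phi_c$ are conventions chosen here for illustration and may differ from those of [1].

```latex
% Formal sketch; normalizations and sign conventions are assumptions.
\begin{align}
e^{i \bar W[J,\bar\phi]} &:= \int \mathcal{D}\phi\;
   e^{\,i S[\bar\phi+\phi] + i\langle J,\phi\rangle}, \\
\text{shift } \phi \to \phi - \bar\phi: \quad
\bar W[J,\bar\phi] &= W[J] - \langle J,\bar\phi\rangle,
   \qquad W[J] := \bar W[J,0], \\
\phi_c := \frac{\delta \bar W}{\delta J} &= \frac{\delta W}{\delta J} - \bar\phi,
   \qquad
\bar\Gamma[\phi_c,\bar\phi] := \bar W[J,\bar\phi] - \langle J,\phi_c\rangle, \\
\Rightarrow\quad
\bar\Gamma[\phi_c,\bar\phi] &= W[J] - \Big\langle J, \frac{\delta W}{\delta J}\Big\rangle
   = \Gamma[\phi_c + \bar\phi].
\end{align}
```

The effective action thus depends on the background only through the sum of background and classical field, and the only non-trivial input is the invariance of the measure under the shift.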
Let us summarize two major conceptual differences between our treatment and the path integral approach:
• Typically, renormalization techniques are employed which require that the propagator is translation invariant. This means that the background is in fact treated perturbatively, i.e., it enters only the vertices, not the propagators. This entails that the background field is a vector potential, not a principal bundle connection, and also that shift invariance of the path integral measure is trivially fulfilled. But the perturbative expansion with all the background fields in the vertices is ill-defined, unless the background field is treated as an infinitesimal perturbation, so that one may expand in powers of the background field. Hence, only an infinitesimal neighborhood of a fixed flat reference connection is actually treated. In contrast, in our approach, the background connection is treated non-perturbatively.
• A formulation of background independence such as (8) does not refer to observables, i.e., it does not address the question posed at the beginning of the introduction. For this, one would need to couple generic observables through source terms to the action and study the background independence of the resulting effective action. To the best of our knowledge, this has not been done in the literature.

Outline
The article is structured as follows. To set the stage, we review, in the next section, the case of scalar field theory, in particular the construction of the algebras Wφ. Following [5,15], the relation of background independence and perturbative agreement is discussed. In the main part of this work, Sect. 3, we study the case of Yang-Mills theories. Perturbative quantum gravity is treated in Sect. 4. An "Appendix" contains technical lemmata. For the convenience of the reader, we provide a glossary of symbols used.

Perturbative QFT on a Backgroundφ
In this section, we review the discussion of background independence for a self-interacting scalar field Φ [5,15]. Throughout this work, we consider globally hyperbolic space-times (M, g) with signature (−, +, · · · +) and compact Cauchy surfaces. J ± (L) denotes the causal future/past of a space-time region L ⊂ M , cf., for example, [22] for a definition.
Due to the time-slice axiom [23], it is sufficient to define the interacting observables localized in a causally closed, compact space-time region R ⊂ M which contains a Cauchy surface. In particular, we may choose R = J + (Σ 0 ) ∩ J − (Σ 1 ) for two non-intersecting Cauchy surfaces Σ 0/1 . We may thus replace the coupling constant λ 0 with a smooth compactly supported cutoff function λ(x) which equals λ 0 on a neighborhood of R. For the perturbations φ, we consider the expansion of the action Note that the free Lagrangians for different backgroundsφ coincide outside of the support of λ. This is essential for identifying quantum theories around different backgrounds as discussed in the next section. Also note that there is no source term in (9), i.e., a term linear in φ, since the background configuration is required to fulfill the interacting equation of motion The solutions to (10) form a manifold S Φ 4 , with tangent space TφS Φ 4 at φ given by the solution space to the linearized equation of motion This means that given a smooth curve {φ s } s in S Φ 4 , i.e., of solutions to (10), withφ 0 =φ, its derivativeφ is a solution to (11). We refer to [15] for details, in particular on the notion of smoothness. Background independence of the classical scalar field theory now means that it is independent of the arbitrary split (1) into background and dynamical fields. One manifestation of this is the split independence of the action in the sense that δS where the interaction part S int of the action was defined in (9).

The Free Algebra Wφ
The algebra Wφ (also called the free algebra in contrast to the interacting one defined below) consists of evaluation functionals where the singularities of the symmetric distributions f n on M n are constrained by a condition on their wave front set, cf. [4]. We define the support of a functional of the form (14) as Given a Hadamard two-point function ωφ for Pφ, cf. [4] for a definition, one defines a non-commutative product where m is the point-wise multiplication of functionals, Here, the functional derivative δ δφ(x) F is interpreted as a Wφ valued density, whose evaluation on test functions ϕ is defined as The definition of the product, and thus also that of Wφ, depends on the two-point function ωφ. However, it turns out that algebras equipped with products defined by different Hadamard two-point functions ωφ, ω φ are isomorphic [4], justifying the notation. 7 Note that, in particular, (15) implies that where Δφ = Δ ā φ − Δ rφ is the causal propagator of Pφ, with Δ r/ā φ denoting the retarded/advanced propagator.
Elements of $W_{\bar\phi}$ are considered in the sense of formal power series in $\hbar$, i.e., $W_{\bar\phi}$ is considered as a graded vector space with grading provided by $\deg_\hbar$, which counts the number of factors of $\hbar$. A further grading is given by $\mathrm{Deg} := 2\deg_\hbar + \deg_\phi$, where $\deg_\phi$ counts the number of fields. For example, for an $F$ of the form (14), with $f_n \neq 0$ and $f_m = 0$ for all $m \neq n$, one has $\deg_\phi(F) = n$. It is obvious that the product $\star$ respects the grading, i.e., $\mathrm{Deg}(F \star G) = \mathrm{Deg}(F) + \mathrm{Deg}(G)$. This grading is in fact the natural grading in the context of Fedosov quantization [7].
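To see why the product respects the total grading, one can count powers term by term. The sketch below assumes that the product has the standard exponential form built from the two-point function, and that the total grading is $\mathrm{Deg} = 2\deg_\hbar + \deg_\phi$; both are assumptions made here for illustration, and $F$, $G$ are taken homogeneous in both gradings.

```latex
% Assumption: F \star G = \sum_{k \ge 0} \frac{\hbar^k}{k!}
%   \langle F^{(k)}, \omega^{\otimes k} G^{(k)} \rangle,
% with F^{(k)} the k-th functional derivative w.r.t. \phi,
% and Deg := 2 \deg_\hbar + \deg_\phi.
\begin{align}
\deg_\hbar\big(\hbar^k \langle F^{(k)}, \omega^{\otimes k} G^{(k)}\rangle\big)
  &= \deg_\hbar F + \deg_\hbar G + k, \\
\deg_\phi\big(\hbar^k \langle F^{(k)}, \omega^{\otimes k} G^{(k)}\rangle\big)
  &= \deg_\phi F + \deg_\phi G - 2k, \\
\Rightarrow\quad \mathrm{Deg}(F \star G) &= \mathrm{Deg}(F) + \mathrm{Deg}(G).
\end{align}
```

Each contraction removes two fields and adds one power of $\hbar$, which is why $\hbar$ must be counted with weight two.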
Local covariance [4,11] is a crucial ingredient of our approach. It is implemented as follows: A morphism $\psi : (M', g', \bar\phi') \to (M, g, \bar\phi)$ is an isometric embedding $\psi : M' \to M$, i.e., $\psi^* g = g'$, which preserves the causal structure, and such that $\psi^* \bar\phi = \bar\phi'$. For each morphism $\psi$, there exists an algebra homomorphism $\alpha_\psi : W_{\bar\phi'} \to W_{\bar\phi}$, defined by
$$(\alpha_\psi F)[\phi] := F[\psi^* \phi]. \qquad (16)$$
To implement the equations of motion, one passes to the on-shell algebra. This proceeds by quotienting out the ideal (17) of functionals $F$ that vanish on all solutions $\phi$ of the linearized equations of motion $P_{\bar\phi}\phi = 0$.
The subspace W loc φ ⊂ Wφ of local functionals consists of those F of the form (14) for which each f n is supported on the total diagonal of M n . It is generated by smearing fields O(x) with appropriate test tensors. Fields depend locally and covariantly on g,φ, φ, or, abstractly, for a morphism ψ. They are of the form where P is a polynomial, α stands for multi-indices and R μνρσ is the Riemannian curvature of g. It is sometimes useful to express a local functional in terms of its integral kernel.

Time-Ordered Products
To obtain the interacting renormalized quantum fields, one needs to define renormalized time-ordered products (or renormalization schemes) on the algebra $W_{\bar\phi}$. These are a collection of symmetric multi-linear maps $T_{\bar\phi,n} : (W^{loc}_{\bar\phi})^{\otimes n} \to W_{\bar\phi}$ which are subject to the axioms (or renormalization conditions) of [4,24], cf. also the reviews [25][26][27]. In particular, they fulfill:
Ann. Henri Poincaré
Grading: Time-ordered products respect the Deg grading, i.e., $\mathrm{Deg}(T_{\bar\phi,n}(F_1 \otimes \cdots \otimes F_n)) = \sum_i \mathrm{Deg}(F_i)$.
Locality and covariance Let ψ : (M , g ,φ ) → (M, g,φ) be a morphism, and α ψ as in (16). Then, Scaling Each Tφ ,n scales almost homogeneously, cf. [4], under Field independence Each Tφ ,n is independent of the dynamical field φ, in the sense that Single field factor A time-ordered product with a single field factor simplifies as Support Time-ordered products do not increase the support, i.e., For fields, it is more convenient to use the mass dimension instead of the scaling dimension, defined by the power of μ in the scaling law (20). It is defined as the scaling dimension plus the number of lower indices minus the number of upper indices. It has the advantage that it does not depend on the position of the indices.
As shown in [24,28], time-ordered products exist and are unique up to a well-characterized, local and covariant renormalization ambiguity which is described by the main theorem of renormalization theory. These ambiguities are best expressed in terms of the generating functional for time-ordered products given by where we have introduced the notation In passing, we note that for F a proper interaction, i.e., deg φ (F ) ≥ 3, the expression is well defined w.r.t. the Deg grading, i.e., at any given grade only a finite number of terms contribute. Now, let Tφ and T φ be two different time-ordered products (renormalization schemes) which satisfy the above axioms. The main theorem of renormalization theory then states that they are related via correspond to finite local counter terms, characterizing the renormalization ambiguity. They are of order O( ), decrease the total Deg by 2(n − 1), are supported on the total diagonal, i.e., they vanish unless the supports of all arguments overlap, and are locally covariant and field independent, i.e., fulfill (19) and (22) with T n replaced by D n . Furthermore, they scale homogeneously under (20) and vanish if one of their arguments is a linear field. The time-ordered products Tφ ,1 (O) are usually called Wick powers and are constructed by point splitting w.r.t. the Hadamard parametrix h, cf. [4,29], which is constructed covariantly from the local geometric data and captures the singularities of Hadamard two-point functions ω, i.e., ω − h is smooth. Concretely, one defines δφ(x)δφ(y) F and the subscript ω on the l.h.s. denotes the two-point function w.r.t. which the product is defined. Time-ordered products Tφ ,n for n > 1 can be constructed recursively using in particular the causal factorization to define the distributions up to the diagonal in M n and extending them to the diagonal as first proposed by Epstein and Glaser [30] (for details, see [4,24,28]).
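For orientation, the relation between two renormalization schemes is commonly written in the following generating-functional form. This is a sketch of the standard statement, not necessarily in the normalization used here: the placement of the factors of $i/\hbar$ and the name $\tilde T$ for the second scheme are assumptions.

```latex
% Sketch of the standard form of the main theorem of renormalization theory.
% \tilde T denotes the second scheme; normalizations are assumptions.
\begin{align}
\tilde T_{\bar\phi}\big(e_\otimes^{\frac{i}{\hbar} F}\big)
  &= T_{\bar\phi}\Big(e_\otimes^{\frac{i}{\hbar}\left(F + D_{\bar\phi}(e_\otimes^{F})\right)}\Big),
\qquad
D_{\bar\phi}\big(e_\otimes^{F}\big) := \sum_{n \ge 1} \frac{1}{n!}\,
   D_{\bar\phi,n}\big(F^{\otimes n}\big), \\
\text{first order in } F: \quad
\tilde T_{\bar\phi,1}(F) &= T_{\bar\phi,1}\big(F + D_{\bar\phi,1}(F)\big).
\end{align}
```

In this form, a change of scheme amounts to a redefinition of the functional $F$ by local counterterms, which is the sense in which the ambiguity is "finite renormalization".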

The Interacting Algebra W int φ
Interacting observables are represented in Wφ via retarded products, defined by Bogoliubov's formula . By causal factorization (21), retarded products are trivial if the support of second argument does not intersect the past of the support of the first, i.e., The generating functional of interacting time-ordered products is then given by Given a field O, one thus defines the corresponding interacting field as As for time-ordered products, interacting time-ordered products fulfill causal factorization, i.e., The interacting algebra W int φ is the subalgebra of Wφ generated by the interacting time-ordered products for supp F ⊂ R. The subalgebras W int φ (L) of observables measurable in compact, causally closed space-time regions L ⊂ R are generated by T int φ (eī F ⊗ ) with supp F ⊂ L. By (29), the algebras corresponding to causally disjoint space-time regions commute.
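Bogoliubov's formula and the definition of interacting fields are usually written as follows. The sketch assumes a particular ordering of the arguments of the retarded product and a particular normalization of the test density $f$; both, as well as the symbol $S_{int}$ for the localized interaction, are conventions adopted here for illustration.

```latex
% Sketch of Bogoliubov's formula; argument ordering and normalization are assumptions.
% The inverse is taken w.r.t. the star product and exists as a formal power series.
\begin{align}
T^{int}_{\bar\phi}\big(e_\otimes^{\frac{i}{\hbar} F}\big)
  &:= R_{\bar\phi}\big(e_\otimes^{\frac{i}{\hbar} F};\, e_\otimes^{\frac{i}{\hbar} S_{int}}\big)
   := T_{\bar\phi}\big(e_\otimes^{\frac{i}{\hbar} S_{int}}\big)^{-1} \star\,
      T_{\bar\phi}\big(e_\otimes^{\frac{i}{\hbar} S_{int}} \otimes e_\otimes^{\frac{i}{\hbar} F}\big), \\
O^{int}_{\bar\phi}(x)
  &:= \frac{\hbar}{i}\, \frac{\delta}{\delta f(x)}\,
      T^{int}_{\bar\phi}\big(e_\otimes^{\frac{i}{\hbar} \langle O, f\rangle}\big)\Big|_{f=0}.
\end{align}
```

Because the inverse factor removes the part of the interaction that lies outside the causal past of the observable, the result depends on the interaction only in that past, which is the content of the support property of retarded products.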
Finally, we also introduce interacting retarded products by We note that (the equality holds both for usual and interacting timeordered/retarded products) and that, as a consequence of (22), field independence of interacting timeordered products holds in the sense that

Background Independence of Renormalized Scalar Field Theory
As discussed in the introduction, the naive derivativeδφ := δ δφ −,φ in (2) w.r.t. the background field is not properly defined on the algebra bundle The natural replacement is the retarded variation δ r ϕ defined as follows. Given two backgroundsφ andφ , one defines the retarded Møller operator [14], cf. also [4] for an on-shell version, as an algebra isomorphism [4,31] τ r φ,φ : Wφ → Wφ, by its action on functionals as Here, rφ ,φ is the retarded wave operator mapping solutions of Pφφ = 0 to solutions of Pφ φ = 0 which coincide outside of J + (supp(φ −φ )). 9 In (32), the subscript ω denotes a two-point function w.r.t. which the product on Wφ is defined, and ωφ is obtained by acting with rφ ,φ on both variables of ωφ. Given an infinitesimal background variation ϕ, as in (12), and a family {F s } s∈R of functionals, F s ∈ Wφ s , 10 one defines the retarded variation A key identity on which our discussion of background independence is based is the so-called perturbative agreement formulated in [13]. It is derived from the requirement that it should not matter whether one includes terms quadratic in the fields into the free or the interacting part of the action. The comparison between the two theories thus defined is performed by the retarded Møller operator or, infinitesimally, by the retarded variation. This implies a further renormalization condition, supplementing those mentioned in the previous section: Background variation For an infinitesimal variationφ of the backgroundφ, we have As shown in [15,16], this condition can indeed be implemented. In the following, we thus assume that (35) holds. In particular, we then have the following version of perturbative agreement on interacting time-ordered products.

Lemma 2.1. On interacting time-ordered products, perturbative agreement implies
Proof. We compute The claim then follows from which is a consequence of (30).
10 A typical family of such functionals would be given by the assignment $\bar\phi \mapsto O^{int}_{\bar\phi}$ of an interacting observable to each background, given a field $O$.
Corresponding to the subalgebras W int φ (L) for observables localized in the space-time region L, we may introduce the subbundles W int Φ 4 (L). The space of sections Γ(W int Φ 4 ) of the algebra bundle W int Φ 4 is an algebra in itself, with the product being fiber-wise given by . With a slight abuse of notation, we denote the resulting product again by . One may define the subalgebra Γ ∞ (W int Φ 4 ) of smooth sections, cf. [15] for details. For our purposes, it is sufficient to think of it as generated by sections (3) for local functionals F with a smooth dependence on the backgroundφ. Analogously to the usual definition of connections on vector bundles, we give a tentative definition of a connection on the interacting algebra bundle, with a supplementary space-time localization condition, which seems natural in a quantum field theoretical context.
linear in the first and additive in the second argument, reduces to the ordinary derivative on c-number functionals, i.e., is a derivation, i.e., fulfilling and respects space-time localization, in the sense that By (36), due to the second term on the r.h.s., the background variation δ r ϕ violates the locality requirement (38). 11 But, as seen in the following proposition, subtracting the derivative w.r.t. φ yields a connection. The following propositions, first proven in [5], cf. [15] for details, summarize background independence for scalar fields.

Proposition 2.3. The operator
where δφ and Dφ are defined in (2).
Proof. That Dφ is a derivation is a consequence of the retarded Møller operator being an algebra isomorphism and of δφ being a derivation. The localization requirement (38) is a consequence of (40). To prove the latter, we note that by (36) and (31), we have The claim then follows from (13). 11 It is not even obvious that it is well defined on W int Φ 4 , i.e., that DφF ∈ Γ ∞ (W int Φ 4 ), asδφS is not supported in R.

Proposition 2.4. The connection Dφ, defined in (39), is flat.
Proof. It is straightforward to check that Dφ satisfies is the Lie bracket of vector fieldsφ andφ on S Φ 4 . Therefore, using (40), the curvature of Dφ vanishes: Hence, defining the background-independent observables as sections which are covariantly constant w.r.t. Dφ, (40) implies that backgroundindependent interacting fields O int φ correspond to classically split independent fields O, i.e., fulfilling DφO = 0. This means that there is a one-to-one correspondence between classical and quantum background-independent fields.
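The notion of flatness used here is the standard one for connections. As a sketch, with $R$ and $\bar R$ denoting the curvatures of the quantum and classical connections (symbols introduced here for illustration), and writing the relation between the two connections schematically as an intertwining property:

```latex
% Standard definition of curvature; \bar\varphi_1, \bar\varphi_2 are vector fields on S_{\Phi^4}.
\begin{align}
R(\bar\varphi_1, \bar\varphi_2)
  &:= \mathcal D_{\bar\varphi_1} \mathcal D_{\bar\varphi_2}
    - \mathcal D_{\bar\varphi_2} \mathcal D_{\bar\varphi_1}
    - \mathcal D_{[\bar\varphi_1, \bar\varphi_2]}, \\
% Schematic assumption: on generators, \mathcal D \circ T^{int} = T^{int} \circ \bar D
% (up to normalization factors, which cancel in the curvature).
R(\bar\varphi_1, \bar\varphi_2) \circ T^{int}_{\bar\phi}
  &= T^{int}_{\bar\phi} \circ \bar R(\bar\varphi_1, \bar\varphi_2) = 0
  \quad \text{if } \bar R = 0.
\end{align}
```

Flatness of the classical connection is thus transported to the quantum connection on the generators of the interacting algebra, and the derivation property extends it to products.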

Pure Yang-Mills Theory
This main part of the article is structured as follows: We begin by setting up Yang-Mills theory on the classical level, culminating in the identification ofDā as the relevant connection on classical local functionals. In Sect. 3.2, we discuss, following [25,32], quantization, in particular the occurrence of anomalies. As a crucial ingredient for background independence, we prove a theorem on the background dependence of the anomaly, assuming that perturbative agreement holds. In Sect. 3.3, we then prove our main result on background independence in Yang-Mills theories.

Classical Gauge Theory
3.1.1. The Basic Setting. Let $P \to M$ be a $G$ principal fiber bundle over space-time $M$, with $G$ a semi-simple Lie group. We denote by Ad the adjoint action of the Lie group $G$ on itself, $\mathrm{Ad}_g h := g h g^{-1}$, and the adjoint action on the corresponding Lie algebra $\mathfrak g$ by ad. The Lie bracket on $\mathfrak g$ is denoted by $[\cdot,\cdot]$. The Yang-Mills theory is the dynamical theory of a $G$ connection $\mathcal A$ on $P$ whose dynamics is governed by the Yang-Mills action (41), where $F$ is the curvature of $\mathcal A$, interpreted as a section of $\mathfrak p \otimes \Omega^2$, with $\mathfrak p := P \times_{\mathrm{ad}} \mathfrak g$ and $\Omega^k$ the bundle of $k$-forms on $M$. Let $\{T_I\}_I$ be a basis of $\mathfrak g$, normalized as $\mathrm{Tr}(T_I T_J) = -\frac{1}{2}\delta_{IJ}$; the curvature $F$ can then be expanded in this basis. Classical solutions will play the role of background configurations, and these will typically be denoted by a bar, i.e., we will consider connections $\bar A$ which are solutions to the Yang-Mills equation (42), where $\bar F$ is the curvature of $\bar A$ and $\bar\nabla_\mu$ is the associated covariant derivative on sections of $\mathfrak p \otimes \Omega^k$. The Yang-Mills equation is well-posed [33], guaranteeing the existence of global solutions. Furthermore, the set $S_{YM}$ of such solutions is a manifold, i.e., its tangent space $T_{\bar A} S_{YM}$ at a solution $\bar A$ is the space of solutions $\bar a$ to the Yang-Mills equation linearized around $\bar A$, (43), except at certain symmetric background configurations $\bar A$, cf. [20]. At these symmetric background configurations, there are solutions to (43) that are not tangent to $S_{YM}$, i.e., do not arise as the derivative of a curve in $S_{YM}$. The presence of such singular points in configuration space $S_{YM}$ does not affect our considerations, as these are local in $S_{YM}$, so that we can restrict to regions not containing such exceptional points. Thus, we will henceforth identify the space of solutions to (43) with the tangent space $T_{\bar A} S_{YM}$ of $S_{YM}$ at $\bar A$.

Background and Dynamical Gauge Transformations
We consider the decomposition (6) of A into a background connection Ā and a dynamical $\mathfrak{g}$-valued one-form A, i.e., a section of $\mathfrak{p} \otimes \Omega^1$. In local coordinates, the corresponding covariant derivative operator D, when acting on sections of $\mathfrak{p} \otimes \Omega^k$, takes the form $D_\mu = \bar\nabla_\mu + [A_\mu, -]_{\mathfrak g}$. Then, the curvature two-form F in local coordinates is given by $F_{\mu\nu} = \bar F_{\mu\nu} + \bar\nabla_\mu A_\nu - \bar\nabla_\nu A_\mu + [A_\mu, A_\nu]_{\mathfrak g}$. Gauge transformations are parametrized by smooth sections g of $P \times_{\mathrm{Ad}} G$. On a connection A, they act as $A \to A^g$, where the inhomogeneous part of $A^g$ is built from the Maurer-Cartan form θ. For A split as in (6), there are then two natural implementations of this gauge transformation. A background gauge transformation acts as $\bar A \to \bar A^g$, $A \to \mathrm{Ad}_g A$. The covariance of the quantum theory under such a transformation will be part of the requirement of local (gauge) covariance. On the other hand, one may keep the background fixed and implement the change of the full connection solely by changing the dynamical field, i.e., $\bar A \to \bar A$, $A \to (\bar A + A)^g - \bar A$. This is called a dynamical gauge transformation; it is the one that needs to be gauge fixed.
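The difference between the two implementations is easiest to see infinitesimally. The following is a minimal sketch, assuming the convention $D_\mu = \bar\nabla_\mu + [A_\mu, -]_{\mathfrak g}$ and an infinitesimal parameter $\epsilon$ (a section of $\mathfrak{p}$); the symbol $\mathcal{A} = \bar A + A$ for the full connection and the overall signs are assumptions made here for illustration.

```latex
% Assumption: full connection \mathcal{A} = \bar A + A, parameter \epsilon a section of p,
% sign convention \delta_\epsilon \mathcal{A}_\mu = D_\mu \epsilon.
\begin{align}
  \text{full connection:} \quad
    & \delta_\epsilon \mathcal{A}_\mu = D_\mu \epsilon
      = \bar\nabla_\mu \epsilon + [A_\mu, \epsilon]_{\mathfrak g} , \\
  \text{background transformation:} \quad
    & \delta_\epsilon \bar A_\mu = \bar\nabla_\mu \epsilon , \qquad
      \delta_\epsilon A_\mu = [A_\mu, \epsilon]_{\mathfrak g} , \\
  \text{dynamical transformation:} \quad
    & \delta_\epsilon \bar A_\mu = 0 , \qquad
      \delta_\epsilon A_\mu = \bar\nabla_\mu \epsilon + [A_\mu, \epsilon]_{\mathfrak g} .
\end{align}
% In both cases \delta_\epsilon \bar A + \delta_\epsilon A reproduces \delta_\epsilon \mathcal{A}.
```

In the background version the dynamical field transforms homogeneously (in the adjoint), which is why a gauge fixing built from $\bar\nabla$ and A can respect it, whereas the dynamical version is inhomogeneous in A.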

Localization of the Interaction and Split Independence
As for the scalar field, we need to localize the interaction in a compact space-time region. For the scalar field, we used a smooth compactly supported cutoff function λ(x) which was equal to $\lambda_0$ in the space-time region R for which the algebra of interacting observables was constructed. This cutoff had the additional consequence that, for any two background solutions $\bar\varphi$, $\bar\varphi'$, the corresponding linearized wave operators $P_{\bar\varphi}$, $P_{\bar\varphi'}$, cf. (11), coincided outside of a compact space-time region (the support of λ). This made it possible to define the flat connection $D_{\bar\varphi}$, using the retarded variation $\delta^r$.

Also for Yang-Mills theory, we use a smooth cutoff function λ(x) to localize the interaction (see below). This cutoff, however, does not affect the linearized wave operator $P^{\mathrm{lin}}_{\bar A}$, defined in (43). Hence, the operators $P^{\mathrm{lin}}_{\bar A}$ for different backgrounds in general do not coincide outside of a compact space-time region, which, however, is a prerequisite for the use of the retarded variation. We therefore relax the condition that Ā is on-shell, i.e., a solution to (42), on the whole space-time.

We proceed as follows: We choose a neighborhood U of R on which we require the backgrounds Ā to be on-shell, i.e., to solve (42) there. Furthermore, we require all backgrounds Ā to coincide outside of a larger region V ⊃ U with an arbitrary reference connection $A_0$. Consequently, the variations ā of the background are supported in V and fulfill the linearized Yang-Mills equation (43) in U. In this way, one ensures that the retarded variation $\delta^r_{\bar a}$ is well defined. Furthermore, one localizes the interaction by introducing a cutoff function λ, which is supposed to be supported in U and equal to 1 on a neighborhood of R. The action is then defined as in (46), where summation over repeated indices I is understood. In R, where λ = 1 and the background Ā is on-shell, this is the Yang-Mills action (41). In general, (41) would have a source term, i.e., a term linear in A, which however vanishes in R, as Ā is on-shell there.
The setup of our localization prescription is summarized in Fig. 1.
Since $\bar\nabla$, $\bar F$ and A transform covariantly under background gauge transformations, the action (46) is invariant under background gauge transformations. Analogously to (13), the action is split independent in the sense of (47), which holds for x ∈ R. Here, $S_{YM,int}$ is the part of $S_{YM}$ which is of degree higher than 2 in A. The restriction to x ∈ R is due to the infrared cutoff λ of the interaction.

BV-BRST Formalism and Background Covariant Gauge Fixing.
In this section, we outline the straightforward generalization of the BV-BRST formalism [25, 34–36] to the case with non-trivial backgrounds.
In order to perform gauge fixing in the BV-BRST formalism, we need to augment the field variables with a set of ghosts and anti-fields, some of which are fermions, i.e., have an odd Grassmann parity. The resulting gauge-fixed theory enjoys the BV-BRST symmetry s as follows. Let us denote the set of all dynamical fields by $\Phi = (A^I_\mu, B^I, C^I, \bar C^I)$, where $C$ ($\bar C$) are called (anti-) ghosts and B is a Lagrange multiplier. One assigns mass dimensions $d_\Phi = (1, 2, 0, 2)$ and ghost numbers $g_\Phi = (0, 0, 1, -1)$ to the fields. The latter define the Grassmann parity. The BV-BRST operator s, which increases the ghost number by 1, acts on the fields Φ. To each field, one associates an anti-field $\Phi^\ddagger$, with mass dimensions $d_{\Phi^\ddagger} = (\ldots, 2)$ and ghost numbers $g_{\Phi^\ddagger} = (-1, -1, -2, 0)$. The anti-fields are interpreted as densities and act as classical, non-dynamical sources of the BRST transformations of the fields, appearing in the action via a source term $S_{sc}$. To perform the gauge fixing, we add a manifestly BV-BRST-invariant term sΨ to the action, where Ψ is a gauge-fixing fermion with ghost number −1 which does not contain anti-fields and which we choose here to be given by (48). This is the so-called background covariant gauge fixing. It breaks dynamical gauge invariance, while keeping the background gauge invariance. In this respect, (48) is a useful gauge in practical calculations and is commonly employed in the background field formalism [1, 21, 38–41].
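The ghost-number assignments quoted above can be cross-checked with simple bookkeeping. The sketch below tabulates them; the rule $g_{\Phi^\ddagger} = -g_\Phi - 1$ is not stated in the text but is read off here from the listed values, and the parity convention $\varepsilon = g \bmod 2$ is the one the text indicates ("the latter defines the Grassmann parity").

```latex
% Ghost numbers of the fields are taken from the text; the second row is the
% observed pattern g(\Phi^\ddagger) = -g(\Phi) - 1, which reproduces (-1,-1,-2,0).
\begin{equation}
\begin{array}{c|cccc}
  \Phi & A & B & C & \bar C \\ \hline
  g_\Phi & 0 & 0 & 1 & -1 \\
  g_{\Phi^\ddagger} = -g_\Phi - 1 & -1 & -1 & -2 & 0 \\
  \varepsilon_\Phi = g_\Phi \bmod 2 & 0 & 0 & 1 & 1
\end{array}
\end{equation}
% Consequences, using that s raises the ghost number by 1:
\begin{align}
  g\big(\Phi^\ddagger_i \, s\Phi^i\big) &= (-g_\Phi - 1) + (g_\Phi + 1) = 0 , &
  g(s\Psi) &= -1 + 1 = 0 .
\end{align}
```

Both the source term and the gauge-fixing term sΨ therefore carry ghost number 0, as required for terms in the action.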
The BV-BRST transformations of all fields and anti-fields can now be written as $s = (S, -)$, where S is the extended and gauge-fixed action and where (−, −) is the so-called anti-bracket, cf. [37] for a definition of left and right derivatives w.r.t. fields with Grassmann parity. In the following, field derivatives will be left derivatives, unless stated otherwise. The anti-bracket satisfies the graded Jacobi identity, cf. (51), and is graded symmetric. We remark that only on functionals supported in R, where Ā is on-shell and λ = 1, does the operator s coincide with the standard nilpotent BV-BRST differential, and only there does the gauge-fixed action fulfill the classical master equation $(S, S) = 0$, which expresses the BRST invariance of S.

As usual, we split the action into a free and an interaction part, $S = S_0 + S_{int}$, where the free action $S_0$ is quadratic in Φ and $\Phi^\ddagger$, and the compactly supported interaction $S_{int}$ contains the terms of degree higher than 2 in Φ and $\Phi^\ddagger$. This, in turn, leads to the decomposition $s = s_0 + s_{int}$ of the BV-BRST differential. The action of $s_0$ on all fields and anti-fields is given in Table 1. Note that the requirement of the background connection being on-shell is necessary for the nilpotency of $s_0$. For instance, one can check by direct calculation that $s_0^2 A^{\ddagger I}_\mu = [\bar\nabla^\nu \bar F_{\mu\nu}, C]^I_{\mathfrak g}$, which vanishes only if $\bar\nabla^\nu \bar F_{\mu\nu} = 0$. Hence, $s_0$ is only nilpotent when restricted to functionals localized in U, motivating our condition that supp λ ⊂ U.
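The link between the classical master equation and nilpotency of s can be made explicit in two lines. The following sketch assumes one common sign convention for the graded symmetry and the graded Jacobi identity of the anti-bracket (the text's own conventions, cf. [37] and (51), may differ by signs) and takes S to be Grassmann even.

```latex
% Assumed conventions (may differ from the paper's by signs):
\begin{align}
  (F, G) &= -(-1)^{(\varepsilon_F + 1)(\varepsilon_G + 1)} (G, F) , \\
  (F, (G, H)) &= ((F, G), H) + (-1)^{(\varepsilon_F + 1)(\varepsilon_G + 1)} (G, (F, H)) .
\end{align}
% For F = G = S with \varepsilon_S = 0 the sign factor equals -1, hence
\begin{align}
  (S, (S, H)) &= ((S, S), H) - (S, (S, H))
  \quad\Longrightarrow\quad
  s^2 H = (S, (S, H)) = \tfrac{1}{2}\, ((S, S), H) .
\end{align}
```

With these conventions, $s^2 = 0$ holds wherever $(S, S) = 0$, which matches the remark that s is the nilpotent BV-BRST differential only on functionals supported in R.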
The gauge-fixed action S is invariant under background gauge transformations since all the dynamical fields and anti-fields transform in the adjoint.
However, it is no longer split independent, not even in R, since Ψ destroys split independence: Ā and A no longer appear in Ψ in the form Ā + A.
where we have used that (47) also holds with $S_{YM}$ replaced by $S_{sc}$ and that $\frac{\delta \Psi}{\delta A}$ is proportional to $\bar C$, on which $s_{int}$ vanishes.
It is advantageous to also compute the action of Dā on S, the former being defined, analogously to (2), by

Corollary 3.2.
In R, i.e., when restricted to configurations supported in R, we have Proof. Using (52), we compute The second term on the r.h.s. vanishes due toā being, in R, a solution to the linearized equation of motion. The result then follows from δās 0 Ψ = sδāΨ, which holds for any Ψ which is quadratic in fields and does not contain antifields.

Local Gauge Covariance
Our background data now consist of (P → M, g, Ā), i.e., a principal fiber bundle P → M with a fixed structure group G, the metric g and a background connection Ā on P. To make the notion of local covariance precise, we define, following [12], morphisms χ : (P′ → M′, g′, Ā′) → (P → M, g, Ā) as G-equivariant smooth maps χ : P′ → P, which cover a causality preserving isometric embedding ψ : M′ → M, i.e., a morphism in the sense of the previous section, such that χ*Ā = Ā′. This covers the case of background gauge transformations, where $\chi_g$ : P → P is the natural action of a section g of $P \times_{\mathrm{Ad}} G$ on P.
Locally covariant fields should then satisfy By the Thomas replacement theorem [25,42], such a field takes the form where P is a polynomial, α stands for multi-indices, R μνρσ is the Riemannian curvature of g, andF μν is the curvature ofĀ.

Classical BV-BRST Cohomology
For the case of pure Yang-Mills theory, for semi-simple G, the cohomology ring H(s) is generated by elements of the form (55), where α stands for multi-indices, $p_r$ and $\Theta_s$ are invariant polynomials of $\mathfrak{g}$, and $r_t$ is a local functional of the metric g, the background field strength $\bar F$, the Riemann tensor R and their derivatives. F is the full field strength, cf. (44). This result for the case of trivial backgrounds, i.e., with $\bar F = 0$, is proven in [25,43]. The above expression is then obtained by the requirement of local covariance (54) in the presence of a non-trivial background connection. As there is no invariant polynomial of degree 1 on a semi-simple Lie algebra, the cohomology at ghost number 1, $H^1(s)$, is trivial.

Now restricting to sections of vector bundles associated with P via the trivial representation of G, that is, those O without a Lie algebra index, the cohomology ring H(s|d) is generated by linear combinations of elements of the form (55) and elements of the form (56), where $q_r(F, C + A, A)$ are the Chern-Simons forms in the presence of a background connection [44]. In this expression, $\bar d$ denotes the covariant differential, induced on sections of $\mathfrak{p} \otimes \Omega$ by the Leibniz rule and $\bar d b = \bar\nabla_\mu b \, dx^\mu$ for b a section of $\mathfrak{p}$, and m(r) are the degrees of the independent Casimir elements of G. The trace is in some representation of $\mathfrak{g}$. Furthermore, $f_s$ are strictly gauge-invariant monomials of F, and $r_t$ are closed forms. Again, the result (56) is a generalization of the well-known results in [25,43] to the case with non-trivial background connection.

Elements of the cohomology class $H^0(s)$ at ghost number 0 are in one-to-one correspondence with the gauge-invariant observables of the original Yang-Mills theory, while those in the class $H^4_1(s|d)$ of four-forms at ghost number 1 turn out to contain the gauge anomalies of the Yang-Mills theory, see, e.g., [25].
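The statement that a semi-simple Lie algebra admits no invariant polynomial of degree 1 can be verified in two steps. The sketch below uses only the standard fact that a semi-simple Lie algebra coincides with its derived algebra; the symbol $\ell$ for a linear invariant is introduced here for illustration.

```latex
% \ell : g -> R linear and Ad-invariant, \ell(Ad_g Y) = \ell(Y).
% Differentiating along g = exp(tX) at t = 0 gives infinitesimal invariance:
\begin{align}
  \ell\big([X, Y]_{\mathfrak g}\big) &= 0 \qquad \text{for all } X, Y \in \mathfrak{g} .
\end{align}
% Semi-simplicity implies [g, g] = g, so every Z is a finite sum of brackets:
\begin{align}
  Z = \sum_k [X_k, Y_k]_{\mathfrak g}
  \quad\Longrightarrow\quad
  \ell(Z) = \sum_k \ell\big([X_k, Y_k]_{\mathfrak g}\big) = 0 .
\end{align}
```

Hence $\ell = 0$, so no generator of the form "invariant polynomial of degree 1 in the ghost C" is available at ghost number 1. For a group with an abelian factor this argument fails, since $[\mathfrak{g}, \mathfrak{g}] \neq \mathfrak{g}$ there.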

The BRST Charge
Classically, the action of the BRST differential on fields is also generated by the Noether charge of the BRST symmetry via the graded Peierls bracket [45,46] The charge Q is constructed as follows [25]: One chooses a one-form γ μ , supported in R, such that for a Cauchy surface Σ contained in R and any closed three-form α. One then sets where J is the Noether current of the BRST symmetry, which is a 3-form with ghost number 1, and is conserved on-shell in R.

Background-Independent Local Functionals.
In the case of scalar field theory, we defined the background-independent classical local functionals as those in the kernel of $D_{\bar\varphi}$, cf. (2). However, as discussed above, the gauge-invariant observables are defined to be equivalence classes of the BV-BRST cohomology. Therefore, the suitable operator whose kernel defines the background-independent classical local functionals must be well defined on BV-BRST cohomology (i.e., it must commute with s). However, in view of (53), this is not the case for $D_{\bar a}$. We, therefore, define the following modified operator $\hat{D}_{\bar a}$, which turns out to have the desired properties, as stated in the following theorem.

Theorem 3.3. The operator $\hat{D}_{\bar a}$ defined in (58) satisfies, for $F_i$ with arbitrary support and F supported in R,
Proof. To prove (59), we calculate where we have used the identity where we have used (59) and (53). To prove (61), we calculate Therefore, we find

Remark 3.4. The definition (58) can also be motivated as follows. Before introducing the gauge-fixing Ψ in the action (50), the BV-BRST differential is given by $(S_{YM} + S_{sc}, -)$, which is related to the gauge-fixed differential s by a conjugation with $e^{(-,\Psi)}$, where $e^{(-,\Psi)} = \mathrm{id} + (-, \Psi) + \frac{1}{2!} ((-, \Psi), \Psi) + \frac{1}{3!} (((-, \Psi), \Psi), \Psi) + \dots$ is a "canonical transformation" generated by Ψ (in the cases of interest here, Yang-Mills theory and gravity, the series truncates, as Ψ does not contain anti-fields). Consequently, the cohomologies of $(S_{YM} + S_{sc}, -)$ and s turn out to be isomorphic under the map $F \mapsto e^{(-,\Psi)} F$. In the non-gauge-fixed theory, $D_{\bar a}$ is the correct derivative operator, in the sense that it commutes with $(S_{YM} + S_{sc}, -)$. The operator $\hat{D}_{\bar a}$ is then obtained by the same canonical transformation, applied to $D_{\bar a}$. Thus, in view of (62), the correction term can be seen to naturally arise as a consequence of gauge fixing.
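The truncation of the series for $e^{(-,\Psi)}$ discussed above can be understood by viewing it as a shift of the anti-fields. The following is a sketch under the assumption that the anti-bracket is a graded derivation in each entry; the overall sign of the shift depends on the convention for the anti-bracket and is therefore left open as $\pm$. The abbreviation X is introduced here for illustration.

```latex
% X := (-, \Psi) is an even derivation (ghost number +1 - 1 = 0).
% Since \Psi contains no anti-fields:
\begin{align}
  X\, \Phi^i &= 0 , &
  X\, \Phi^\ddagger_i &= \pm \frac{\delta \Psi}{\delta \Phi^i} , &
  X^2\, \Phi^\ddagger_i &= 0 ,
\end{align}
% because \delta\Psi / \delta\Phi^i again contains no anti-fields. Therefore
\begin{align}
  e^{X} \Phi^\ddagger_i = \Phi^\ddagger_i \pm \frac{\delta \Psi}{\delta \Phi^i} ,
  \qquad
  \big(e^{X} F\big)[\Phi, \Phi^\ddagger]
    = F\Big[\Phi,\; \Phi^\ddagger \pm \frac{\delta \Psi}{\delta \Phi}\Big] .
\end{align}
% On F of polynomial degree k in the anti-fields, X^{k+1} F = 0.
```

Each application of X removes one anti-field, so on a functional of degree k in the anti-fields the exponential series terminates after the k-th term.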
Remark 3.5. In view of (61), one may, similarly to Fedosov's approach, add the tangent vector fields ā to $S_{YM}$ as a new non-dynamical fermionic field and define a differential δ = D −,ā on ā-independent functionals and extend it naturally to ā-dependent ones. By (60), δ and s then anticommute, so that one may define a new differential ŝ = s + δ, whose cohomology at grade 0 gives the gauge-invariant, background-independent, on-shell local functionals. Such an approach was pursued by several authors in the literature, cf. [21, 47–50] for example. We do not proceed in this way here, basically because in the quantized theory, the flatness of the analog of $\hat{D}$ will only hold on cohomology, see below.

Perturbative Quantum Yang-Mills Theory on a BackgroundĀ
In this section, we outline the perturbative quantization of the gauge-fixed Yang-Mills theory, described in the previous section, i.e., we adapt [25] to the case of non-trivial background gauge fields. The construction of the free algebra $W_{\bar A}$ is similar to the scalar case, discussed in Sect. 2.1, now with the differential operator acting on $(A_\nu, B, C, \bar C)$. Here, $\bar P^{\mathrm{lin}}$ was defined in (43). The corresponding Hadamard two-point function is of the form where one assumes the vector and scalar two-point functions $\omega_v$, $\omega_s$ to be related by $\bar\nabla$ in U. The latter condition ensures that $s_0$ defines a graded derivation on $W_{\bar A}$, i.e., for $F_i$'s supported in U. That one can construct Hadamard two-point functions $\omega_v$, $\omega_s$ fulfilling these properties was shown in [51,52]. As for scalar fields, the on-shell algebra is defined by dividing out the ideal $J_{\bar A}$ generated by the equations of motion $s_0 \Phi^\ddagger_i = 0$. It is important to note that these in general contain anti-fields, cf. Table 1. These are treated as sources, cf. [13], for example.
Time-ordered products on the algebra $W_{\bar A}$ are defined analogously to the scalar case as a collection of graded symmetric linear maps $T_{\bar A, n} : (W^{\mathrm{loc}}_{\bar A})^{\otimes n} \to W_{\bar A}$, which satisfy the axioms mentioned below (18) with obvious modifications to adapt to the gauge fields, and with the difference that local covariance is now defined with respect to the morphisms χ. Time-ordered products with one factor, i.e., Wick powers, are defined analogously to the scalar case, cf. (27).

Ward Identities.
A crucial aspect of quantized gauge theory is the interplay of gauge invariance and renormalization. It is encoded in the anomalous Ward identity (67) [25], valid for F supported in U.¹⁵ Here $A(e_\otimes^{F}) = \sum_{n \geq 1} \frac{1}{n!} A_n(F^{\otimes n})$ is the anomaly, where each $A_n$ is a map $A_n : (W^{\mathrm{loc}}_{\bar A})^{\otimes n} \to W^{\mathrm{loc}}_{\bar A}$, with properties similar to $D_n$, cf. (26), that is, it is of order O(ħ), decreases the total Deg by 2(n − 1), is supported on the total diagonal, is local and covariant and graded symmetric, and scales homogeneously under (20). As proven in Lemmata A.1 and A.2, it is (anti-) field independent and vanishes if one of the arguments is a linear (anti-) field. In addition, each $A_n$ increases the ghost number by 1. Furthermore, it is subject to the consistency condition (68) [25]. In generating identities such as (67) or (68), we always assume F to be Grassmann even. To handle Grassmann odd F, one proceeds by multiplying with Grassmann odd parameters and differentiating w.r.t. them (taking care of the order).
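Generating identities of this kind determine the individual multilinear maps by polarization. The following sketch shows how $A_n$ on distinct Grassmann even arguments is recovered from the generating functional; the auxiliary parameters $\tau_i$ and functionals $F_i$ are introduced here for illustration and are not part of the text's notation.

```latex
% Insert F = \sum_i \tau_i F_i with real parameters \tau_i and Grassmann even F_i:
\begin{align}
  A\big(e_\otimes^{F}\big)
    &= \sum_{k \geq 1} \frac{1}{k!}\,
       A_k\Big( \big(\textstyle\sum_i \tau_i F_i\big)^{\otimes k} \Big) .
\end{align}
% Only k = n contributes to the coefficient of \tau_1 \cdots \tau_n, and by
% (graded) symmetry the n! orderings of F_1, ..., F_n contribute equally:
\begin{align}
  A_n(F_1 \otimes \dots \otimes F_n)
    &= \frac{\partial^n}{\partial \tau_1 \cdots \partial \tau_n}\,
       A\big(e_\otimes^{\sum_i \tau_i F_i}\big) \Big|_{\tau = 0} .
\end{align}
```

For Grassmann odd arguments, the parameters $\tau_i$ are taken Grassmann odd as well, and the order of the derivatives then fixes the signs, as indicated in the text.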
¹⁴ This can be shown, for example, using the methods developed in [53].
¹⁵ In [25], this was proven for a flat background connection without restrictions on the support of F. This proof can be straightforwardly generalized to general background connections. However, a crucial ingredient is that $s_0$ is a derivation and nilpotent, which is only true on functionals supported in U. This motivates the localization supp $S_{int}$ ⊂ U, which by (24) ensures that supp $T^{int}_{\bar A}(e_\otimes^{\frac{i}{\hbar} F})$ ⊂ U for supp F ⊂ R (those are the generators that we will be concerned with).
As argued below, a crucial consistency requirement is the absence of gauge anomalies, i.e., the vanishing of $A(e_\otimes^{S_{int}})$, cf. (69). The consistency condition (68) is crucial for the removal of anomalies, i.e., for achieving (69). Let us indicate how this proceeds. Consider the expansion of $A(e_\otimes^{S_{int}})$ in powers of ħ: $A(e_\otimes^{S_{int}}) = \hbar^m A^{(m)}(e_\otimes^{S_{int}}) + \hbar^{m+1} A^{(m+1)}(e_\otimes^{S_{int}}) + \dots$, for some integer m > 0. Now we write $A^{(m)}(e_\otimes^{S_{int}}) = \int_M \alpha$ as an integral of a local four-form α(x) with ghost number 1 and mass dimension 4 (this follows from the homogeneous scaling of the anomaly). The consistency condition (68) for $F = S_{int}$ implies that α(x) ∈ $H^4_1(s|d)$. If the cohomology ring $H^4_1(s|d)$ is trivial, then α = sβ + dγ for some fields β, γ, of ghost number 0 and 1, respectively.

Such an anomaly can be removed by passing to another renormalization scheme, as follows. Let us write the interaction (50) as $S_{int} = \int_M L_{int}$, and let $L_1$ be the term of degree 3 in fields and anti-fields (so that $\mathrm{Deg}(\frac{i}{\hbar} L_1) = 1$). We now choose a new scheme T′ by setting the local finite counter terms $D_n$ as in (70), where $D^{(m)}$ is the first non-trivial term in the expansion of $D(e_\otimes^{S_{int}})$ and where $n = 2(m - 1) + \deg_\varphi \beta$. The anomalies A and A′ in the schemes T and T′ are related via [25] $A'^{(m)}(e_\otimes^{S_{int}}) = A^{(m)}(e_\otimes^{S_{int}}) + s D^{(m)}(e_\otimes^{S_{int}})$, and therefore, with the choice (70), the anomaly in the new scheme vanishes: $A'^{(m)}(e_\otimes^{S_{int}}) = 0$. Repeating the argument for higher-order coefficients of A in ħ, we can fully remove the anomaly.

For the pure Yang-Mills case, as can be seen from (56), $H^4_1(s|d)$ is actually non-trivial. However, one can argue [25] that the parity property of the possible gauge anomaly is not compatible with that of $A(e_\otimes^{S_{int}})$, and hence it is absent, so that there exists a renormalization scheme in which (69) holds. In the following, we assume to work with such a scheme.
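The removal of a cohomologically trivial anomaly described above can be summarized schematically. The sketch below assumes that boundary terms can be dropped because the interaction is compactly supported, and it only records the integrated counter term; the precise form of the counter terms $D_n$ on individual arguments, as fixed in (70), is not reproduced here.

```latex
% Trivial class in H^4_1(s|d): \alpha = s\beta + d\gamma with compactly supported \gamma.
\begin{align}
  A^{(m)}\big(e_\otimes^{S_{int}}\big)
    &= \int_M \alpha
     = \int_M \big( s\beta + d\gamma \big)
     = s \int_M \beta .
\end{align}
% Scheme change T -> T' with A'^{(m)} = A^{(m)} + s D^{(m)}; choosing schematically
\begin{align}
  D^{(m)}\big(e_\otimes^{S_{int}}\big) := - \int_M \beta
  \quad\Longrightarrow\quad
  A'^{(m)}\big(e_\otimes^{S_{int}}\big)
    = s \int_M \beta - s \int_M \beta = 0 .
\end{align}
```

The argument uses the triviality of the class of α in an essential way: a non-trivial representative of $H^4_1(s|d)$ cannot be written as $s \int_M \beta$ and thus cannot be absorbed by a local finite counter term.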

Quantum BRST Charge and the Algebra of Physical Observables.
In analogy with the scalar field theory, we can now define the generating functional of interacting time-ordered products. These generate the interacting algebra W int A . Due to the time-slice axiom [23], it suffices to consider F 's supported in R. However, the algebra W int A also contains gauge-variant and unphysical functionals. They can be represented only on a space with indefinite inner product. However, the algebra of physical and gauge-invariant renormalized observables is defined to be [25,54] , at ghost number 0 in the interacting on-shell algebra W int A mod JĀ. Here, Q int A is the renormalized interacting quantum BRST charge, obtained by applying definition (28) to the local functional Q defined in (57). Equality in FĀ is thus equality modulo equations of motion and Im[Q int for some H. Under certain conditions, FĀ admits a Hilbert space representation [52,55].
Whether such a construction of FĀ can be implemented turns out to be closely related to the issue of local gauge-symmetry preservation at the quantum level, which has the following manifestations: (i) conservation of the renormalized interacting Noether current J int A of BRST symmetry, As proven in [25,32], for any theory with local gauge symmetry, the first two manifestations listed above hold in the absence of gauge anomalies, i.e., when (69) holds. Also, the last manifestation follows from the anomalous Ward identity (67) if, in addition to (69), we have [25,32]: ⊗ ) = 0, which turns out to be a consequence of the triviality of H 1 (s).
The key identity in the proof of the above statements is the interacting anomalous Ward identity (71) [32], which holds for all F supported in R, under assumption (69). Here, ≈ means equal modulo the ideal $J_{\bar A}$ of free equations of motion, defined analogously to (17), and $A^{int}(e_\otimes^{F})$ is the generating functional of interacting anomalies. These are subject to the interacting consistency conditions (72). At first order in F, this implies that the quantum BV-BRST operator q [32] is nilpotent, i.e., $q^2 = 0$. Using this notation, we may express (71) at first order in F in terms of q. We also note that by (71), the gauge-invariant generators of interacting time-ordered products are given by $T^{int}_{\bar A}(e_\otimes^{\frac{i}{\hbar} F})$ with F fulfilling (75). In particular, an interacting field $F^{int}_{\bar A} = T^{int}_{\bar A}(F)$ is gauge invariant if qF = 0. Furthermore, given F of ghost number 0 and fulfilling qF = 0, one may supplement it with "contact terms" to $F' = F + C(e_\otimes^{F})$ such that F′ fulfills (75) in the sense of power series in F [56].

Perturbative Agreement and the Background Dependence of the Anomaly. As for the scalar case, perturbative agreement is a crucial ingredient for background independence. For variations of the background connection, it means that (76) holds.
In the following, we sketch the proof that this can indeed be fulfilled in pure Yang-Mills theories, based on a proof in a simpler context given in [31].¹⁷ We then explore the interplay of perturbative agreement and anomalies. We first need to define the retarded variation, to make sense of the l.h.s. of (76). We recall the differential operator-valued matrix $\bar P_{ij}$, cf. (64), and denote the corresponding retarded/advanced propagator by $\Delta^{ij}_{r/a}$. Let us also introduce the (differential operator-valued) matrix $K^i{}_j$, defined so that $K^i{}_j \Phi^j = s_0 \Phi^i$, and its formal adjoint $\bar K^j{}_i$ such that Then, with ε the Grassmann parity of $\Phi^i$. Analogously to the definition of the retarded wave operator in the scalar case, cf. (33), we now define It maps solutions to the free equations of motion $s_0 \Phi^\ddagger_i = 0$ on the background Ā to solutions on the background Ā′. It follows that the retarded Møller operator $\tau^r$, defined as in (32), is well defined on the on-shell algebra. One also defines its infinitesimal version, the retarded variation $\delta^r_{\bar a}$, as for the scalar case, cf. (34).
¹⁷ Perturbative agreement will in general not hold when the gauge fields couple to chiral fermions, due to the usual chiral anomalies, cf. [53].
A crucial ingredient in the proof that perturbative agreement can be fulfilled is the free current, obtained as the variation of the free part of the action w.r.t. the background connection, i.e., j(a) :=δ a S 0 .
Here, we naturally extend the action to off-shell backgrounds, i.e., a is an arbitrary section of $\mathfrak{p} \otimes \Omega^1$, not subject to the linearized equations of motion. When no sources are present, this current is classically covariantly conserved on-shell. In the present case, this is spoiled by the presence of anti-fields. One finds the off-shell identity (83), with ε the Grassmann parity of $\Phi^i$. We now have all the necessary ingredients to prove that (76) can be fulfilled.
This quantity is (anti-) field independent. It was also shown [31] that, for space-time dimension D ≤ 4, (84) holds on-shell, provided that the divergence of the Wick-ordered current vanishes on-shell, As we argue below, this is true when anti-fields are set to zero (i.e., when the ideal generated by Φ ‡ i is modded out). Thus, (84) holds when equations of motion s 0 Φ ‡ i and anti-fields Φ ‡ i are modded out. But as E(a 1 , a 2 ) is independent of (anti-) fields, (84) then also holds off-shell, and so does perturbative agreement (76).
It remains to argue that (85) indeed holds when anti-fields are set to zero. The first term on the r.h.s. of (83) then yields equations of motion [Φ i ,P ij Φ j ] I g . To evaluate the corresponding Wick-ordered product, one has to applyP to the Hadamard parametrix H and evaluate the limit of coinciding points. This can be done, for example using the methods developed in [53]. However, one can directly see that the result must vanish, as it is a locally and covariantly constructed section of p of mass dimension 4. No such quantity exists in parity non-violating models for semi-simple gauge groups.

Theorem 3.8. If perturbative agreement (76) holds, background variations of the anomaly satisfyδā
Proof. As the anomaly is local, we may choose ā to be supported in the region in which the background Ā is on-shell. As nilpotency of $s_0$ and the anomalous Ward identity (67) also hold on functionals supported there, we may thus use perturbative agreement (76) and (67) to obtain where we have again used (86). Regarding the last term on the r.h.s., one computes In particular, this is supported outside of U. We may thus decompose as with $\mathrm{supp}(s_0 \bar\delta_{\bar a} S_0)^\pm \subset J^\pm(U) \setminus U$. It follows that the last term in (87) may be rewritten as a commutator. On the other hand, we have In particular, $[\delta^r_{\bar a}, s_0]$ acts on linear (anti-) fields as for x ∈ U, since the anomaly of a linear (anti-) field vanishes, cf. Lemma A.2.²¹ The action of both $s_0$ and $\delta^r_{\bar a}$, and thus also of $[\delta^r_{\bar a}, s_0]$, on nonlinear functionals is defined by their action on linear functionals.²² Comparing with (88) shows that we are finished if we can show that The r.h.s. of this equation is of the form with some smooth²³ kernel $W_{ij}$ which vanishes unless $\varepsilon_i + \varepsilon_j \bmod 2 = 1$. It thus suffices to show that this vanishes when acting on $\Phi^i(x) \Phi^j(y)$ with $\varepsilon_i + \varepsilon_j \bmod 2 = 1$. By this restriction, we have (88) and considering the equation at $O(\lambda_1 \lambda_2)$, we indeed find that $W_{ij}$ must vanish, again by the absence of anomalies of linear fields, Lemma A.2.
²¹ This can also be shown directly, using the definition of $s_0$ and $\delta^r_{\bar a}$ on $\Phi^i$.
²² To be precise, $\delta^r_{\bar a}$ also acts non-trivially on background fields, by $\delta^r_{\bar a} \bar A = \bar a$. However, as $s_0$ acts trivially on background fields, so does $[\delta^r_{\bar a}, s_0]$.
²³ Smoothness follows from the Hadamard property of the two-point function and [58], Thm. 8.2.14.
For the following considerations, it turns out to be convenient to introduce the notation even though outside of R, s does not need to be well defined as an operator on local functionals. The important point is that in R, i.e., when restricted to configurations supported in R, s reduces to the BV-BRST differential s, cf. Proposition 3.1.

Corollary 3.9.
If perturbative agreement (76) holds, then, for F supported in U, , with sδāΨ defined by (89). In particular, for F supported in R, and n ≥ 1, (90) Proof. By field independence of the anomaly, Lemma A.1, we have , which proves the first claim. The locality of the anomaly and the fact that on R, sδāΨ = sδāΨ, then leads to (90).

Background Independence
Having introduced the setting for the quantum Yang-Mills theory perturbatively constructed around each backgroundĀ, we now turn to the formulation of background independence. In analogy with the case of scalar field theory (Sect. 2.2), we can identify the theories defined on different backgrounds via the retarded variation δ r a . As shown in Sect. 3.2.3, we can assume that perturbative agreement (76) holds, and we will do so from now on. Using this variation, we want to define a flat connection Dā on the bundle where S YM is the manifold of background field configurations which are solutions to the Yang-Mills equation, cf. also the discussion following (43). A connection is here defined in complete analogy to Definition 2.2. The local algebras FĀ(L) are then generated by T int A (eī F ⊗ ) with F supported in L and fulfilling (75). We would also like to ensure that in the classical limit, it should reduce to the connectionDā on classical local functionals, in the sense that  (7) on-shell, ensuring that it maps kernel and image of [Q int A , −] onto themselves; (iii) it is a derivation, i.e., fulfills (37); (iv) it respects space-time localization in the sense defined in (38).
A crucial requirement for the fulfillment of these properties will be the absence of a certain anomaly. We will later show that time-ordered products can indeed be defined accordingly.

Remark 3.10.
There is a subtlety regarding the definition of the bundles F YM and W YM . We recall that the backgroundsĀ are only required to be on-shell in U (and to coincide with an arbitrary reference connection A 0 outside of V). Hence, their behavior in V\U is arbitrary. A further requirement should thus be that the construction is independent of the choice of a representative, i.e., the connection Dā should vanish forā supported in V\U, when applied to That this is indeed the case is checked below, cf. Remark 3.14.
To construct the desired connection D, it is useful to split the connection D on local functionals aŝ where the two terms on the r.h.s. are obtained by applying the canonical gaugefixing transformation as in (63) separately toδā and δā. Hence, it is natural to see the first term on the r.h.s. as the gauge-fixed background variation and replace it by the retarded variation. Our first tentative definition is thus That this is a natural starting point is evidenced by the following Lemma: Lemma 3.11. The operator D 0 a is well defined on the on-shell algebra.
Proof. As the retarded variation is well defined on the on-shell algebra, it remains to check for the last two terms. We have indeed fulfills that requirement, asā is, in U, a solution to (43).

Well-Definedness of the Connection on the Quantum BRST Cohomology.
Similarly to the case of scalar field theory (Proposition 2.3), D 0 a acts as δāΨ). We note that, by (S int , δāΨ) = 0, we have D 0 a S int = DāS int . With the notation (89), we thus obtain Note the presence of the second term on the r.h.s. of (93) which is absent in the case of scalar field theory, cf. (40). It leads to a violation of the locality requirement (38). This term appears because the gauge-fixing fermion breaks the split independence of the action S, cf. (52). We first state a lemma which is crucial for the proof of the following theorem.

Lemma 3.12. For all F supported in R, it holds
Proof. As a consequence of (60), (59), the graded Jacobi identity (51) and (90), the l.h.s. equals . The first three terms cancel due to (49) and (51) and the last two terms due to Lemma A.1, taking into account (92) and the fact that S int is independent ofC ‡ .

Theorem 3.13. Assuming
with a not necessarily a solution to (43), the operator where s(δ ηā Ψ) is defined in (89) and η is a smooth nonnegative function supported on J − (R) and equal to 1 on J − (R)\R, is well defined on the on-shell for all F supported in R. On this cohomology, it is independent of the choice of η. Furthermore, for F fulfilling (75), we have In particular, Dā is a connection on F YM fulfilling (91).
Remark 3.14. The last term in definition (96) can be motivated as follows: Assume that ā is supported outside of U. As discussed in Remark 3.10, the corresponding derivative Dā should vanish on $T^{int}_{\bar A}(e_\otimes^{\frac{i}{\hbar} F})$ with F localized in R. The first term on the r.h.s. of (93) does indeed vanish (as the supports of ā and F are disjoint), but the second one does not. However, due to causal factorization (29) of interacting time-ordered products, it is canceled by the commutator which is added in (96). This is completely analogous to the unitary transformation (more precisely, its generator in the present case) which compensates a change of the infrared cutoff of the interaction in the so-called algebraic adiabatic limit, cf. [24].
Proof. We begin by proving the independence of the choice of η. The difference $\xi = \eta_1 - \eta_2$ of two admissible η's is supported in R, where the operator defined in (89) coincides with the BV-BRST differential s. Hence, under the assumption (95) and using (74), the corresponding change of the functional is $[Q^{int}_{\bar A}, -]$ exact, i.e., a zero element in the cohomology.

We continue with proving (97). From Eq. (93) and for all F, we have By the above, we may, without loss of generality, assume that supp η ∩ $J^+$(supp F) = ∅. We split ā = ηā + χā + ψā in the second term on the r.h.s. of (99), where η, χ and ψ are smooth nonnegative functions, summing up to 1, with χ being supported inside R and equal to 1 in a neighborhood of supp F, η being supported in $J^-(R)$, and ψ supported in $J^+(R)$. By causal factorization (29) and (30), we have We thus obtain With (71), we compute, using the assumption (95), and with $C_{\bar a}(F)$ the expression on the l.h.s. of (94). Lemma 3.12 thus proves (97).
As a direct consequence of (98), Dā fulfills (91) and respects space-time localization in the sense defined in (38), and so defines a connection on F YM .

Flatness of the Connection on the Quantum BRST Cohomology.
Finally, we want to prove flatness of Dā.
Proof. Using (98), it suffices to provē . By (90), we have The last term on the r.h.s. vanishes by the flatness ofδ and Lemma A.2. Thus, with (61), we have . With the consistency condition (72), this simplifies to , where we used Lemma A.1, taking into account (92) and the fact that S int is independent ofC ‡ , and that Ψ does not contain anti-fields, so that (δā Ψ,δāΨ) = 0. With (71), we thus obtain , which proves the statement.

Absence of Obstructions to Background Independence. Above, we found that condition (95) is sufficient to ensure well-definedness and flatness of the connection Dā on [Q int A , −] cohomology. We now show that this can indeed be satisfied in pure Yang-Mills theory.
Using this identity, the anomalous Ward identity in the scheme T takes the form On the other hand using (25), we can write the anomalous Ward identity as Comparing (102) and (103), we arrive at ). Now (101) follows by replacing F with F + τ G, differentiating with respect to τ , and setting τ = 0 and F = S int .
In the following, we show that the violation of condition (95) can be removed by a redefinition of time-ordered products. The strategy is as follows: Assume that the anomaly has been removed up to order O(ħ m−1 ), i.e., with A int(n) independent of ħ. We denote by A (m) and D (m) the anomaly and the redefinition of time-ordered products at order O(ħ m ). From (101), we conclude that There are thus two possible strategies to remove the anomaly ofδ a Ψ at order O(ħ m ): The first one would be to set However, such a definition must not spoil the absence of gauge anomalies or perturbative agreement. As discussed in the proof of Proposition 3.7, achieving perturbative agreement proceeds by redefining time-ordered products involving at least one factor j(a) =δ a S 0 , cf. (82). Hence, redefinitions of such time-ordered products should not be allowed. Furthermore, due to field independence, a redefinition of a time-ordered product of the form T (δ a S i ⊗ eī Sint ⊗ ), with the interaction S int = S 1 + S 2 and Deg(īS i ) = i, would spoil the absence of gauge anomalies. However, by (52), we have Hence, the time-ordered products T (s 0δa Ψ ⊗ eī Sint ⊗ ) must not be redefined. Thus, to implement (106), one would have to redefine time-ordered products of the form T (s intδa Ψ ⊗ eī Sint ⊗ ). Concretely, one would set for n ≥ 1, with Note the different number of interaction terms on the two sides of (107), which is enforced by the fact that s intδa Ψ is cubic in the fields, whileδ a Ψ is only quadratic. This redefinition is still problematic. First, one has to show that the r.h.s. of (107) vanishes for n = 0. Second, and more severely, there are constraints from field independence. One can find a such that δ a s intδa Ψ vanishes. Hence, if such a derivative δ a acts on the first variable of the functional on the l.h.s., one gets a functional that identically vanishes. Hence, all such derivatives only act on the S i factors. But there are fewer such factors on the l.h.s. than on the r.h.s., so that the redefinition (107) might be inconsistent with field independence. In order to circumvent these difficulties, we exploit the second (in fact related, cf. Remark 3.21) possibility of removing an anomaly based on (105). Namely, if A (m) (δ a Ψ ⊗ e Sint ⊗ ) happens to be s exact, i.e., A (m) (δ a Ψ ⊗ e Sint ⊗ ) = sH a , then we may set Unfortunately, A (m) (δ a Ψ ⊗ e Sint ⊗ ) need not be s exact. However, it turns out to be s 0 exact, which is sufficient to remove the anomaly order by order in the number of fields. To prove these statements, we collect a few lemmata.
Proof. The statement was shown in [43], Thm. 7.1, for the full differential s and the restricted algebra not containing B,C and their anti-fields. However, adding these does not change the statement, as they form trivial pairs and do not modify the cohomology. The proof given in [43] only uses the triviality of the homology of the Koszul-Tate differential at positive anti-field number, and this also holds for its free part.
Proof. Let G j ≠ 0 be the lowest-order term in the (anti-) field number expansion of G. If j = i, we have found the sought-for G i . For j < i, we note that s 0 G j = 0. By Lemma 3.17, there is H j such that G j = s 0 H j . Define G (1) = G − sH j . We still have sG (1) = F , but now the lowest-order term of G (1) occurs at j (1) > j. We continue until j (k) = i.
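The inductive step in the last proof can be spelled out as follows. This is only a sketch: it assumes that s splits as s = s_0 + s_int, with s_0 preserving the total (anti-)field number and s_int strictly raising it, and that F has no component below order i. The component labels G_k, F_k and the symbol j^{(1)} follow the proof; everything else is notation introduced here for illustration.

```latex
% Sketch of one induction step (assumptions: s = s_0 + s_int,
% s_0 preserves the (anti-)field number, s_int strictly raises it,
% and F has no component of order below i).
\begin{align*}
  G &= \sum_{k \ge j} G_k, \qquad sG = F = \sum_{k \ge i} F_k, \qquad j < i, \\
  (sG)_j &= s_0 G_j = F_j = 0
     \quad\Longrightarrow\quad G_j = s_0 H_j \quad \text{(Lemma 3.17)}, \\
  G^{(1)} &:= G - s H_j, \qquad
     s G^{(1)} = sG - s^2 H_j = F \quad \text{(since } s^2 = 0\text{)}, \\
  \bigl(G^{(1)}\bigr)_j &= G_j - s_0 H_j = 0
     \quad\Longrightarrow\quad j^{(1)} > j .
\end{align*}
```

Each step raises the lowest order by at least one, so after finitely many steps the lowest-order term sits at order i.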
with Θ (m) a locally and covariantly constructed section of p ⊗ Ω 1 (M ) of ghost number 0, mass dimension 3, and in the kernel of s. By (55), it must thus be of the form Θ (m)Iμ = sΣ (m)Iμ + Ξ (m)Iμ , with Ξ (m) a c-number. However, the only such c-number would be∇ νF Iνμ , which vanishes in R, cf. (45). Noting that the first term on the r.h.s. of (109) can be rewritten as an element of the image of s 0 using s 0 (A ‡I μ +∇ μC I ) = (P lin A) I μ , and using Lemma 3.18 on the second term, we obtain the desired statement.
We are now ready to perform the necessary redefinitions.
where we used notation (108). Both expressions are at the same order in the interaction, so there are no potential obstructions from field independence. Expanding (105) in the total (anti-) field number, we see that the anomaly now occurs at a higher order in the total (anti-) field number. By power counting, the anomaly has a bounded total (anti-) field number, so the process terminates after finitely many steps, and the anomaly at order O(ħ m ) is removed. Continuing at higher orders, one removes the anomaly to all orders.
Remark 3.21. The two possibilities for removing the anomaly, i.e., by either redefining time-ordered products involving s intδa Ψ orδ a Ψ, are in fact related. This follows from field independence and the fact that δ δC s intδa Ψ involves the same Wick power asδ a Ψ, namelyCA. Hence, the redefinition (110) implies a redefinition of the form (107), with a modified right-hand side. One can thus see the approach chosen here as a means to rule out the potential clashes with field independence discussed below (108).
Remark 3.22. The above arguments invoked power counting and thus relied on power-counting renormalizability. Our method is thus not sufficient to rule out a violation of background independence, for example in Yang-Mills theory in higher dimensions.
Remark 3.23. Let us consider the situation when the gauge group is not semisimple, but contains abelian factors and possibly also matter fields. The proof of anomaly freedom given in [25] does then not apply, but let us assume that there are no gauge anomalies. How are our considerations then affected? The dynamical fields corresponding to the abelian factors are free (apart from the possible coupling to matter), so the gauge-fixing fermion Ψ is independent of the abelian background connection. In particular,δ a Ψ = 0 if the perturbation a is only in the abelian background connection. It follows that no further potential obstructions to achieving (95) arise by including abelian factors.

Summary of Assumptions.
Even though we discussed Yang-Mills theory here, the treatment of other gauge theories should be completely analogous, provided that a few conditions are met. Obviously, the theory should have no gauge anomaly, i.e., (69) holds, and fulfill perturbative agreement w.r.t. changes in the background. Also, the triviality of H 1 (s) was used. Apart from that, we used that (i) S int does not containC ‡ , (ii) the gauge-fixing fermion is quadratic in the fields, and (iii) the gauge-fixing fermion does not contain anti-fields. If these conditions are met, then background independence holds, provided that the analog of (95) does.
Remark 3.24. Throughout, we also assumed compact Cauchy surfaces. This assumption is of technical nature only. It is relevant for the existence of the interacting BRST charge Q int A , but as long as one is not interested in singling out the physical subspace in a Hilbert space representation, this charge is not needed. We only use it in the form [Q int A , −] of the on-shell interacting BRST differential. One could equally well work with the off-shell interacting BRST differential ŝ recently constructed in [56] (which does not require compact Cauchy surfaces). For non-compact Cauchy surfaces, the construction of the cutoff functions needed, for example in Theorem 3.13 or Theorem 3.8, becomes slightly more involved, but apart from the fact that the existence of Hadamard states for non-compact Cauchy surfaces has not been proven in full generality [51], our conclusions also hold for non-compact Cauchy surfaces (with the obvious replacements of [Q int A , −] by ŝ).

3.3.5. Renormalized Background-Independent Interacting Fields.
Lemma 3.12 can be seen as a master equation for the compatibility of Dā = D 0 a − (−,δāΨ) and s in the renormalized setting. Let us explore some consequences.
We recall that for a local functional F to give rise to a gauge-invariant interacting field T int A (F ), it must fulfill qF = 0, cf. (73) for the definition of q, corresponding to the linearization of (75). As shown in [56], for any field O of ghost number 0 which is classically gauge invariant, sO = 0, there is an extension O = O + O(ħ) such that qO = 0. A further structure that naturally occurs at second order in F is the quantum anti-bracket [32]: . We may now define , which is equal to Dā up to quantum corrections. It follows from (98) that a functional F giving rise to a background-independent interacting field T int A (F ) must fulfill A straightforward consequence of the consistency condition (72), Lemma 3.12 and Theorem 3.15 is the following: If furthermore qF = 0 = qG, then also A natural question is now the following. Assume a field O is given which is classically gauge invariant and background independent, i.e., A is a gauge-invariant, background-independent field? Given an extension O such that qO = 0, one may evaluate it on one background Ā and then obtain local functionals on general backgrounds by parallel transport w.r.t. D , at least locally on S YM (using that D is flat on q cohomology). However, it is not obvious whether one may choose the extension O such that this procedure results in a proper field in the sense defined in Sect. 2.1, i.e., is independent of Ā. We leave this as an interesting open problem.

Perturbative Quantum Gravity
Having treated background independence for Yang-Mills theory in full detail, we now turn to perturbative quantum gravity, with an emphasis on the differences to the Yang-Mills case. Quantum gravity in the sense of perturbation theory around generic backgrounds was recently formulated in [2]. Our setup differs in an important point, so this difference will also be highlighted.
The principal dynamical variable is the metric perturbation h μν , i.e., the full metric is given by with ḡ μν the background metric. It is supplemented by ghosts c μ , anti-ghosts c̄ μ and Lagrange multipliers b μ , which are (co-) vector fields and transform under the BRST transformation as with ∇ the Levi-Civita derivative w.r.t. g μν . Correspondingly, the Einstein-Hilbert action is extended to where the anti-fields Φ ‡ are interpreted as tensor-valued densities. The action is invariant under background gauge transformations, i.e., diffeomorphisms ψ : M → M acting via pullback on ḡ and the dynamical fields.
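For orientation, the metric split and the BRST transformations described in words above can be written out in one common convention. This is a sketch, not necessarily the authors' normalization: signs, factors of i and a possible coupling constant in front of h are assumptions, and indices on c are lowered with the full metric g.

```latex
% One common convention (assumed, not taken from the source):
% full metric = background + perturbation; BRST acts on h as the
% Lie derivative of the full metric along the ghost vector field c.
\begin{align*}
  g_{\mu\nu} &= \bar g_{\mu\nu} + h_{\mu\nu}, \\
  s\, h_{\mu\nu} &= (\mathcal{L}_c\, g)_{\mu\nu}
      = \nabla_\mu c_\nu + \nabla_\nu c_\mu, \\
  s\, c^\mu &= c^\nu \nabla_\nu c^\mu = c^\nu \partial_\nu c^\mu, \\
  s\, \bar c_\mu &= b_\mu, \qquad s\, b_\mu = 0 .
\end{align*}
```

The second equality in the ghost transformation holds because the Christoffel symbols are symmetric in their lower indices while the ghosts anticommute. With these signs, s is nilpotent on the ghost: the two terms cubic in c cancel against each other, and the second-derivative term vanishes by the anticommutativity of the ghosts.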
As for Yang-Mills fields, the interaction terms are adiabatically cut off. There is a slight complication w.r.t. the Yang-Mills case in that the cutoff function should be a function of covariant coordinates, cf. below. That, however, does not change anything substantial, so this cutoff can be treated as for Yang-Mills fields. Hence, we ignore this subtlety in the following. In the region where the cutoff function is equal to one, the extended action has the shift symmetry To implement the harmonic (or de Donder) gauge, we employ the gauge-fixing fermion [59] which is a covariant functional of the dynamical fields and the background metric ḡ. Here, ∇̄ is the Levi-Civita derivative w.r.t. ḡ μν . The gauge-fixed action then becomes with h := ḡ μν h μν . It leads to hyperbolic equations of motion at the linearized level for c μ , c̄ ν and γ μν := h μν − (1/2) ḡ μν h (after eliminating b ν ). Local observables can be constructed as proposed in [2,17], by what one might call covariant coordinates. One chooses backgrounds ḡ that are sufficiently generic to allow, in a neighborhood of ḡ, for four curvature scalars to provide a coordinate system X[g] : M → U ⊂ R 4 . By definition, these fulfill for a diffeomorphism ψ. It follows that Given a test tensor t on M , and T [g] a tensor covariantly constructed out of the metric, i.e., obtained by contractions of g μν , g μν , ∇ (λ1 . . . ∇ λr) R μνρσ , one defines From (111), it follows that the observable (112) transforms covariantly, and is in the kernel of the BRST operator. We refer to [2,17] for the interpretation of these observables. An adiabatic cutoff of the interaction terms, respecting covariance, can be implemented similarly. Let L int be the interaction Lagrangian density, obtained by Taylor expansion of the Lagrangian density in (h, c,c, b, h ‡ , c ‡ ,c ‡ ) and keeping only the terms of order higher than two.
Then, a covariant cutoff can be implemented as with λ a test function on the background, assumed to be equal to one in a neighborhood of the region R, cf. the setup for the Yang-Mills case.
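As a reminder of what the harmonic (de Donder) gauge mentioned above imposes at the linearized level, the standard form of the condition in terms of the trace-reversed perturbation γ is sketched below. The precise gauge-fixing fermion of the text is not reproduced here; the last two lines assume dim M = 4, consistent with the use of four curvature scalars as coordinates.

```latex
% Standard linearized de Donder condition (a sketch; the exact
% normalization of the gauge-fixing fermion is not taken from the source).
\begin{align*}
  \gamma_{\mu\nu} &:= h_{\mu\nu} - \tfrac{1}{2}\, \bar g_{\mu\nu}\, h,
     \qquad h := \bar g^{\mu\nu} h_{\mu\nu}, \\
  \bar\nabla^\mu \gamma_{\mu\nu}
     &= \bar\nabla^\mu h_{\mu\nu} - \tfrac{1}{2}\, \bar\nabla_\nu h = 0, \\
  \bar g^{\mu\nu} \gamma_{\mu\nu} &= h - \tfrac{4}{2}\, h = -h, \\
  h_{\mu\nu} &= \gamma_{\mu\nu}
     - \tfrac{1}{2}\, \bar g_{\mu\nu}\, \bar g^{\rho\sigma} \gamma_{\rho\sigma} .
\end{align*}
```

The last line shows that trace reversal is its own inverse in four dimensions, so one can pass freely between h and γ.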
As for the case of pure Yang-Mills theory, there are no gauge anomalies and H 1 (s) is trivial [62], and there is also no obstruction to the fulfillment of perturbative agreement [13] for variations in the background metric. Also, the conditions (i)-(iii) stated in Sect. 3.3.4 are met. It follows that it suffices to check the fulfillment of the analog of condition (95), which is A int 1 (δ k Ψ) = 0, whereδ k Ψ = δ δḡμν Ψ, k μν . Due to power-counting non-renormalizability, the arguments invoked in Sect. 3.3.3 to prove the fulfillment of (95) cannot be adapted to the present setting, cf. also Remark 3.22. For example, even if A int 1 (sδ k Ψ) = 0 holds, one can, using the covariant coordinates, still find non-trivial analogs of Θ in the proof of Lemma 3.19, such as for any covariant symmetric tensor T . We leave open the question whether such obstructions to background independence occur in perturbative quantum gravity.
Remark 4.1. In one respect, our setup severely deviates from the one employed in [2]. There, the gauge condition is that the four curvature scalars X that are used as coordinates are harmonic. The corresponding Lagrange multipliers b are then a collection of four scalars, and accordingly for the anti-ghostsc. It follows that the gauge-fixed action is no longer covariant, but explicitly depends on the choice of the coordinates X. It is in fact not even invariant under changing the coordinates to Y = ψ • X using a diffeomorphism ψ of R 4 , i.e., under relabeling the points in the chart. The advantage of this approach is that the gauge-fixing fermion does not break the split independence. The downside is of course that in the end, one has to show that covariance is still intact in the observable algebra. Furthermore, having given up covariance, renormalization schemes and thus also potential anomalies are much less constrained than in our approach.

Background Independence as Triviality of the Relative Cauchy Evolution
Finally, let us comment on a different criterion for background independence, which is used in [2] in the context of perturbative quantum gravity. Based on ideas formulated in [63], background independence is there defined as triviality of the interacting relative Cauchy evolution β. We first discuss it in the example of the scalar field. One defines Here, τ a is the advanced Møller operator, defined in complete analogy to the retarded one, cf. (32), and A is the advanced product defined as The inverses of retarded and advanced products appearing here are purely formal. However, the requirement that β is trivial on-shell can be properly formulated as The infinitesimal version of this is, using perturbative agreement, Formally, i.e., putting aside cutoff issues, we have δφS 0 = 0, so that, with (13), we may replaceδφS by δφS and conclude that the equation is indeed fulfilled, by the field equation, which follows from (23).
In the case of Yang-Mills theory, the split independence of the action is broken by gauge fixing, cf. (52), so that one then obtains, again ignoring cutoff issues, For F fulfilling (75) and assuming (95) and [Q int A , T (eī Sint ⊗ )] ≈ 0, the r.h.s. can be written as an element of Im[Q int A , −] , i.e., as a trivial element. Hence, assuming the absence of the anomaly (95), one finds that the interacting relative Cauchy evolution is indeed trivial on the cohomology.
Two comments are in order:
• As discussed in Remark 4.1, in [2] the breaking of the split independence of the action is avoided by the use of a non-covariant gauge fixing. In particular, the relevance of the absence of the anomaly (95) was not noted there. The problems with such a non-covariant gauge fixing were discussed in Remark 4.1.
• The significance of the criterion proposed in [2], i.e., triviality of the interacting relative Cauchy evolution, seems unclear. Following the derivation above, one finds that, in the case of gravity, it is implied by the on-shell vanishing of the stress-energy tensorδkS, or, equivalently, by the on-shell fulfillment of the equations of motion. However, it gives no information about how to relate observables defined on different backgrounds, i.e., does not answer our initial question, as evidenced by the fact that all derivatives of F w.r.t. the background fields drop out in the above calculations. We therefore think that triviality of the interacting relative Cauchy evolution is not a sufficient criterion for background independence.

A Lemmata on the Anomaly
Lemma A.1. The anomaly A(e F ⊗ ) is (anti-) field independent, in the sense that where Φ i = (A I μ , B I , C I ,C I ) and ε is the Grassmann parity of Φ i , and analogously for Φ ‡ i .
Proof. From the anomalous Ward identity (67) and field independence (22) On the other hand, we have We thus obtain where δ δΦ i , s 0 := δ δΦ i • s 0 − (−1) ε s 0 • δ δΦ i . It thus remains to show that which together with (114) implies the claim. To prove this, we note that δS0 δΦ i is a linear expression in fields and anti-fields and, hence, can be written δS0 δΦ i = a ij Φ j + b i j Φ ‡ j for some (differential operator valued) coefficients a ij , b i j . Thus, Therefore, (115) follows from (anti-) field independence of time-ordered products. For anti-field independence, the proof proceeds analogously. Proof. To prove the claim, we use the single field axiom, i.e., (23), which in the present situation reads where Δ ij a (x, y) is the advanced propagator of the differential operatorP ij defined in (77). We also recall definitions (79) and (80) of the (differential operator valued) matrices K i j andK j i . From (81) and s 2 0 Φ ‡ i = 0, it follows thatP ij K j k + (−1) εK j iP jk = 0, with ε the Grassmann parity of Φ i . This implies, cf. [64], From (117) it follows that for the linear fields s 0 Φ i , we have where for simplicity we omitted the variable x and Δ jk a δ δΦ k should be read as Δ jk a (x, y) δ δΦ k (y) . To prove the claim, we first apply s 0 on the left-hand side of (117) and find with ε the Grassmann parity of Φ i . In the first step, we have used the anomalous Ward identity (67) and the sign factor appears by commuting Φ i and s 0 F + 1 2 (F, F ) + A(e F ⊗ ), which is fermionic. In the second step, we used (117) to pull Φ i out of the time-ordered product, and in the last step, we have again used (67) and the field independence of A(e F ⊗ ), i.e., (113). Now applying s 0 on the r.h.s. of (117), we find Equating (120) and (121), we arrive at where we have used (116). Using (78) and noting that δΦ k F Inserting this back into (122) and using (118), we obtain which vanishes by (119). This proves the claim.
For an anti-field Φ ‡ i , the second term on the r.h.s. of (117) is absent, and so are the last two terms on the r.h.s. of (120) and (121). The claim then follows from (81), which entails , with ε the Grassmann parity of Φ i .

Glossary
ā Background variation satisfying P lin ā = 0 (tangent vector fields on S YM )
A Background G-connection
A Dynamical g-valued 1-form
A int n Interacting anomaly with n local insertions
Δ