Adler-Bardeen theorem and manifest anomaly cancellation to all orders in gauge theories

We reconsider the Adler-Bardeen theorem for the cancellation of gauge anomalies to all orders, when they vanish at one loop. Using the Batalin-Vilkovisky formalism and combining the dimensional-regularization technique with the higher-derivative gauge invariant regularization, we prove the theorem in the most general perturbatively unitary renormalizable gauge theories coupled to matter in four dimensions, and identify the subtraction scheme where anomaly cancellation to all orders is manifest, in the sense that no subtractions of finite local counterterms are required from two loops onwards. Our approach is based on an order-by-order analysis of renormalization and, unlike most derivations in the literature, does not make use of arguments based on the properties of the renormalization group. As a consequence, the proof we give also applies to conformal field theories and finite theories.


Introduction
The Adler-Bardeen theorem [1,2] is a crucial property of quantum field theory, and one of the few tools to derive exact results. In the literature various statements go under the name of "Adler-Bardeen theorem". They apply to different situations. The original statement by Adler and Bardeen says that (I) the axial anomaly is one-loop exact. The second statement, which is the one we are going to study here, says that (II) (there exists a subtraction scheme where) gauge anomalies vanish to all orders, if they vanish at one loop. Statement II is important to justify the cancellation of gauge anomalies to all orders in the Standard Model. A third statement concerns the one-loop exactness of anomalies associated with external fields. Statement I is expressed by a well-known operator identity for the divergence of the axial current. By means of a diagrammatic analysis, Adler and Bardeen were able to provide the subtraction scheme where that identity is manifestly one-loop exact in QED [1]. They emphasized that higher-order corrections vanish, unless they contain the one-loop triangle diagram as a subdiagram. Said like this, statement I intuitively implies statement II. However, the original proof of Adler and Bardeen applies only to QED.
Other approaches to the problem have appeared, since the paper by Adler and Bardeen, in Abelian and non-Abelian gauge theories [2]. Statement I can be proved using arguments based on the properties of the renormalization group [3,4,5], regularization independent algebraic techniques [6], or an algebraic/geometric derivation [7] based on the Wess-Zumino consistency conditions [8] and the quantization of the Wess-Zumino-Witten action. Statement II can also be proved using renormalization-group (RG) arguments, with the dimensional regularization [9] or regularization-independent approaches [10].
More recently, statement II was proved by the author of this paper in Standard Model extensions with high-energy Lorentz violation [11], which are renormalizable by "weighted power counting" [12]. The approach of [11] is closer to the original approach by Adler and Bardeen, in the sense that it does not make use of RG arguments, algebraic methods or geometric shortcuts, it naturally provides the subtraction scheme where the all-order cancellation is manifest, and it is basically a diagrammatic analysis, although instead of dealing directly with diagrams, it uses the Batalin-Vilkovisky formalism [13] to manage relations among diagrams in a compact and efficient way.
In the present paper we prove statement II in the most general perturbatively unitary and renormalizable gauge theories coupled to matter, and elaborate further along the guidelines of ref. [11]. We upgrade the approach of [11] in a number of directions, emphasize properties that were not apparent at that time, and expand the arguments that were presented concisely. We also gain a certain clarity dropping the Lorentz violation. A side purpose of this investigation is to develop new techniques and tools to prove all-order theorems in quantum field theory with a smaller effort.
Our results make progress in several directions. To our knowledge, if we exclude ref. [11] and this paper, statement II has been proved beyond QED only making use of arguments based on the renormalization group. However, RG arguments do not provide the subtraction scheme where all-order cancellation is manifest, and are not sufficiently general. For example, they are powerless when the beta functions identically vanish, so they exclude conformal field theories and finite theories, where however the Adler-Bardeen theorem does hold. Actually, RG techniques fail even when the first coefficients of the beta functions vanish [9,10]. Our approach does not suffer from these limitations. Another reason to avoid shortcuts is that in the past the Adler-Bardeen theorem caused some confusion in the literature, therefore new proofs, and even more generalizations, should be as transparent as possible. In this paper we pay attention to all details.
The all-order cancellation of gauge anomalies is a property that depends on the scheme, but the existence of a good scheme is not evident. Knowing the scheme where cancellation is manifest is very convenient from the practical point of view, because it saves the effort of subtracting ad hoc finite local counterterms at each step of the perturbative expansion. For example, using the dimensional regularization and the minimal subtraction scheme the cancellation of two-loop and higher-order corrections to gauge anomalies in the Standard Model is not manifest, and finite local counterterms must be subtracted every time.
To find the right subtraction scheme we need to define a clever regularization technique. It turns out that using the Batalin-Vilkovisky formalism and combining the dimensional regularization with the gauge-invariant higher-derivative regularization, the subtraction scheme where the Adler-Bardeen theorem is manifest emerges quite naturally [11].
It is well-known that, in general, gauge invariant higher-derivative regularizations do not regularize completely, because some one-loop diagrams can remain divergent. From our viewpoint, this is not a weakness, because it allows us to separate the sources of potential anomalies from everything else. We just have to use a second regulator, the dimensional one, to deal with the few surviving divergent diagrams.
The regularization we are going to use introduces two cutoffs: ε = 4 − D, where D is the continued complex dimension, and an energy scale Λ for the higher-derivative regularizing terms. The regularized action must be gauge invariant in D = 4, to ensure that the higher-derivative regulator has the minimum impact on gauge anomalies. The physical limit is defined by letting ε tend to 0 and Λ to ∞. When we have two or more cutoffs, physical quantities do not depend on the order in which we remove them. More precisely, exchanging the order of the limits ε → 0 and Λ → ∞ is equivalent to changing the subtraction scheme. That kind of scheme change is, however, crucial for our arguments.
Consider first the limit Λ → ∞ followed by ε → 0. When D ≠ 4 the limit Λ → ∞ is regular in every diagram and gives back the dimensionally regularized theory: no Λ divergences appear, but just poles in ε. In this framework there are no known subtraction schemes where the Adler-Bardeen theorem holds manifestly. Now, consider the limit ε → 0 followed by Λ → ∞. At fixed Λ we have a higher-derivative theory. If properly organized, that theory is superrenormalizable and contains just a few (one-loop) divergent diagrams, which are poles in ε and may be removed by redefining some parameters. At a second stage, we study the limit Λ → ∞, where Λ divergences appear and are removed by redefining parameters and making canonical transformations. We call the regularization technique defined this way dimensional/higher-derivative (DHD) regularization.
Intuitively, if gauge anomalies cancel at one loop there should be no further problem at higher orders, because the higher-derivative regularization is manifestly gauge invariant. Thus, we expect that the DHD regularization provides the framework where the Adler-Bardeen theorem is manifest. However, it is not entirely obvious that the two regularization techniques can be merged to achieve the goal we want. Among other things, ε-evanescent terms are around all the time and the O(1/Λ^n) regularizing terms can simplify power-like Λ divergences, causing trouble. Nevertheless, with a nontrivial amount of work we can prove that all difficulties can be properly dealt with.
Summarizing, the statement we prove in this paper is the following.

Theorem. In renormalizable perturbatively unitary gauge theories coupled to matter, there exists a subtraction scheme where gauge anomalies manifestly cancel to all orders, if they vanish at one loop.
Once we have this result, we know that no matter what scheme we use, it is always possible to find ad hoc finite local counterterms that ensure the cancellation of gauge anomalies at higher orders. Then we are free to use the more common minimal subtraction scheme and the pure dimensional regularization technique.
The paper is organized as follows. In sections 2-7 we prove the theorem in non-Abelian Yang-Mills theory coupled to left-handed chiral fermions. This model is sufficiently general to illustrate the key points of the proof, as well as the main arguments and tools, but relatively simple to free the derivation from unnecessary complications. At the end of the paper, in section 8, we show how to include the missing fields, namely right-handed fermions, scalars and photons, and cover the most general perturbatively unitary renormalizable gauge theory coupled to matter. Section 9 contains our conclusions. In the appendix we recall the calculation of gauge anomalies in chiral theories.
The proof for Yang-Mills theory coupled to chiral fermions is organized as follows. In sections 2 and 3 we formulate the dimensional and DHD regularization techniques. In sections 4-6 we prove the Adler-Bardeen theorem in the higher-derivative theory, studying the limit ε → 0 at Λ fixed. Precisely, in section 4 we work out the renormalization, in section 5 we study the one-loop anomalies and in section 6 we prove anomaly cancellation to all orders. In section 7 we take the limit Λ → ∞ and conclude the proof of the Adler-Bardeen theorem for the final theory.

Dimensional regularization of chiral Yang-Mills theory
We first prove the Adler-Bardeen theorem in detail in four-dimensional Yang-Mills theory coupled to left-handed chiral fermions. This model offers a sufficiently general arena to illustrate the key arguments and tools of our approach. At the same time, we make some clever choices to prepare the generalization (discussed in section 8) to the most general perturbatively unitary gauge theories coupled to matter. To begin with, in this section we dimensionally regularize chiral gauge theories and point out a number of facts and properties that are normally not emphasized, but are rather important for the arguments of this paper.
Consider a gauge theory with gauge group $G$ and left-handed chiral fermions $\psi_L^I$ in certain irreducible representations $R_L^I$ of $G$. If $G$ is the product of various simple groups $G_i$, we use indices $a, b, \ldots$ for $G$ and indices $a_i, b_i, \ldots$ for $G_i$. Denote the gauge coupling $g_i$ of each $G_i$ with $g r_i$, where $r_i$ are parameters of order one that we incorporate into the $G$ structure constants $f^{abc}$ and the anti-Hermitian matrices $T^a$ associated with the representations of matter fields. We call $g$ the overall gauge coupling. We organize the matrices $T^a$ in block-diagonal form, where each block refers to a $\psi_L^I$ and a representation $R_L^I$. When we write $T^a\psi_L^I$ we understand that $T^a$ is replaced by the appropriate block. More fermions in the same irreducible representations may be present. With these conventions the matrices $T^a$ still satisfy $[T^a, T^b] = f^{abc}T^c$ and the classical action reads

$$S_c(\phi) = -\frac{1}{4}\int \zeta_i F_{\mu\nu}^{a_i}F^{a_i\,\mu\nu} + \int \bar\psi_L^I\,\imath\slashed{D}\psi_L^I, \qquad (2.1)$$

where $F_{\mu\nu}^{a_i}$ (the sum over this kind of index $i$ being understood, here and in the rest of the paper) is the $G_i$ field strength, $D_\mu\psi_L^I = \partial_\mu\psi_L^I + gT^aA_\mu^a\psi_L^I$ is the fermion covariant derivative and $\imath$ is used for $\sqrt{-1}$ to avoid confusion with the index $i$. The parameters $\zeta_i$ could be normalized to 1, but for future use it is convenient to keep them free, because they are renormalized by poles in $\varepsilon$. Analogous parameters in front of the fermionic kinetic terms are not necessary.
To keep the presentation simple we make some simplifying assumptions that do not restrict the validity of our arguments. Specifically, we do not include right-handed fermions and scalar fields, and assume that the groups G i are non-Abelian, so there is no renormalization mixing among gauge fields, even when more copies of the same simple group are present. In section 8 we explain how to relax these assumptions and cover the most general Abelian and non-Abelian perturbatively unitary renormalizable gauge theories coupled to matter.
Let us briefly recall the Batalin-Vilkovisky formalism for general gauge theories [13]. The classical fields $\phi = \{A_\mu^a, \psi_L^I, \bar\psi_L^I\}$, together with the ghosts $C$, the antighosts $\bar C$ and the Lagrange multipliers $B$ for the gauge fixing, are collected into the set of fields $\Phi^\alpha = \{A_\mu^a, C^a, \bar C^a, B^a, \psi_L^I, \bar\psi_L^I\}$. An external source $K_\alpha$ with opposite statistics is associated with each $\Phi^\alpha$, and coupled to the $\Phi^\alpha$ transformations $R^\alpha(\Phi, g)$. We have $K_\alpha = \{K^{\mu a}, K_C^a, K_{\bar C}^a, K_B^a, K_\psi^I, \bar K_\psi^I\}$. If $X$ and $Y$ are functionals of $\Phi$ and $K$ their antiparentheses are defined as

$$(X,Y) \equiv \int\left(\frac{\delta_r X}{\delta\Phi^\alpha}\frac{\delta_l Y}{\delta K_\alpha} - \frac{\delta_r X}{\delta K_\alpha}\frac{\delta_l Y}{\delta\Phi^\alpha}\right),$$

where the integral is over spacetime points associated with repeated indices. The master equation $(S, S) = 0$ must be solved with the "boundary condition" $S(\Phi,K) = S_c(\phi)$ at $C = \bar C = B = K = 0$, where $S_c(\phi)$ is the classical action (2.1). The solution $S(\Phi, K)$ is the action we start with to quantize the theory.
In the model we are considering the gauge algebra closes off shell, so there exists a variable frame where $S(\Phi, K)$ is linear in $K$. The non-gauge-fixed solution of the master equation is $S_c + S_K$, where the functional $S_K(\Phi,K) = -\int R^\alpha(\Phi,g)K_\alpha$ collects the symmetry transformations of the fields, $D_\mu C^a = \partial_\mu C^a + gf^{abc}A_\mu^bC^c$ being the covariant derivative of the ghosts. The gauge-fixed solution of the master equation reads $S_c + S_K + (S_K,\Psi)$, where $\Psi(\Phi)$ is the "gauge fermion", a functional of ghost number $-1$ that collects the gauge-fixing conditions. For convenience, we choose standard linear gauge-fixing conditions, with gauge-fixing parameters $\xi_i$.

The naïve $D$-dimensional continuation of the action (2.1) is not well regularized, because chiral fermions do not have good propagators. To overcome this difficulty, we proceed as follows. As usual, we split the $D$-dimensional spacetime manifold $\mathbb{R}^D$ into the product $\mathbb{R}^4\times\mathbb{R}^{-\varepsilon}$ of ordinary four-dimensional spacetime $\mathbb{R}^4$ times a residual $(-\varepsilon)$-dimensional evanescent space $\mathbb{R}^{-\varepsilon}$. Spacetime indices $\mu, \nu, \ldots$ of vectors and tensors are split into bar indices $\bar\mu, \bar\nu, \ldots$, which take the values 0, 1, 2, 3, and formal hat indices $\hat\mu, \hat\nu, \ldots$, which denote the $\mathbb{R}^{-\varepsilon}$ components. For example, momenta $p^\mu$ are split into pairs $p^{\bar\mu}$, $p^{\hat\mu}$, or equivalently $\bar p^\mu$, $\hat p^\mu$. The flat-space metric $\eta_{\mu\nu} = \mathrm{diag}(1,-1,\ldots,-1)$ is split into $\eta_{\bar\mu\bar\nu} = \mathrm{diag}(1,-1,-1,-1)$ and $\eta_{\hat\mu\hat\nu} = -\delta_{\hat\mu\hat\nu}$. When we contract evanescent components we use the metric $\eta_{\hat\mu\hat\nu}$, so for example $\hat p^2 = p^{\hat\mu}\eta_{\hat\mu\hat\nu}p^{\hat\nu}$. We assume that the continued $\gamma$ matrices $\gamma^\mu$ satisfy the continued Dirac algebra $\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}$. We define $\gamma_5 = \imath\gamma^0\gamma^1\gamma^2\gamma^3$, $P_L = (1-\gamma_5)/2$, $P_R = (1+\gamma_5)/2$ and the charge-conjugation matrix $C = -\imath\gamma^0\gamma^2$ in the usual fashion. Full $SO(1,D-1)$ invariance is lost in most expressions, replaced by $SO(1,3)\times SO(-\varepsilon)$ invariance.

The action (2.1) gives the fermion propagator $P_R(\imath/\bar{\slashed{p}})P_L$, which involves only the four-dimensional components $\bar p^\mu$ of momenta, and therefore does not fall off in all directions of integration for $p\to\infty$. Applying the rules of dimensional regularization, fermion loops integrate to zero. To provide fermions with correct propagators we introduce right-handed partners $\psi_R^I$ of $\psi_L^I$ that decouple in four dimensions and are inert under all gauge transformations. We include $\psi_R$ and $\bar\psi_R$ into the set of fields $\Phi$. It is not necessary to introduce sources $K$ for them.
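The statement that the naïve continuation of (2.1) involves only the four-dimensional components of momenta can be checked with a short computation. The following is a sketch, writing $\psi_L = P_L\psi$ and $\bar\psi_L = \bar\psi P_R$ (a labeling assumed here for illustration) and using only the continued Dirac algebra and the definition of $\gamma_5$ given above; bar indices are four-dimensional and hat indices are evanescent.

```latex
\begin{align*}
% gamma_5 is a product of the four bar matrices: a bar matrix anticommutes
% with three of them, a hat matrix anticommutes with all four
\{\gamma_5,\gamma^{\bar\mu}\} &= 0, &
[\gamma_5,\gamma^{\hat\mu}] &= 0, \\
% moving the projectors through the gamma matrices
P_R\,\gamma^{\bar\mu}P_L &= \gamma^{\bar\mu}P_L, &
P_R\,\gamma^{\hat\mu}P_L &= \gamma^{\hat\mu}P_R P_L = 0, \\
% hence the evanescent derivative drops out of the chiral kinetic term
\bar\psi_L\,\imath\gamma^{\mu}\partial_{\mu}\psi_L
 &= \bar\psi_L\,\imath\gamma^{\bar\mu}\partial_{\bar\mu}\psi_L .
\end{align*}
```

The same algebra gives $P_L\gamma^{\hat\mu}P_L = \gamma^{\hat\mu}P_L \neq 0$, so bilinears that mix $\psi_L$ with right-handed partners can carry the evanescent components of momenta, which is consistent with the role assigned to $\psi_R$ in the text.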
Specifically, we start from the regularized classical action, which is the sum of the unregularized classical action (2.1) plus a correction $S_{LR}$ that depends on constants $\varsigma_{IJ}$ forming an invertible matrix $\varsigma$. The only nontrivial off-diagonal entries of $\varsigma$ (and of all the matrices $M_{IJ}$ we are going to meet in this paper) are those that mix equivalent irreducible representations $R_L^I$. The reason why the matrix $\varsigma$ is kept free is that later on it will help us reabsorb the renormalization constants of $\psi_L^I$, since $S_{LR}$ is nonrenormalized (see below). Using the polar decomposition, we can write $\varsigma = U_R^\dagger DU_L$, where $U_L$ and $U_R$ are unitary matrices and $D$ is a positive-definite diagonal matrix. In the basis where $\varsigma$ is replaced by its diagonal form $D \equiv \mathrm{diag}(\varsigma_I)$ the propagators of the Dirac fermions $\psi^I = \psi_L^I + \psi_R^I$ are those of formula (2.6) and coincide with the usual propagators for $\varsigma_I = 1$.
Next, observe that $(S_K, S_K) = 0$ in arbitrary $D$. The regularized gauge-fixed action $S_{r0}$, given by formula (2.7) up to an extension that will be discussed later, satisfies $(S_{r0}, S_{r0}) = O(\varepsilon)$, where "$O(\varepsilon)$" is used to denote any expression that vanishes in four dimensions. We have used $P_R\slashed{\partial}P_R = P_R\hat{\slashed{\partial}}P_R$ and a similar relation with $R \to L$. Observe that $S_{r0}$ is invariant under the global symmetry transformations of the group $G$.
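Given a regularized classical action $S(\Phi, K)$, the regularized generating functionals $Z$ and $W$ are defined by the formulas

$$Z(J,K) = \int[\mathrm{d}\Phi]\,\exp\left(\imath S(\Phi,K) + \imath\int\Phi^\alpha J_\alpha\right) = \exp\left(\imath W(J,K)\right), \qquad (2.9)$$

and the generating functional $\Gamma(\Phi, K) = W(J, K) - \int\Phi^\alpha J_\alpha$ of one-particle irreducible diagrams is the Legendre transform of $W(J, K)$ with respect to $J$, where the sources $K$ act as spectators.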
Often it is necessary to pay attention to the action used to define averages. We denote the averages $\langle\cdots\rangle$ defined by the action $S$ as $\langle\cdots\rangle_S$. The anomaly functional is

$$\mathcal{A} = (\Gamma,\Gamma) = \langle(S,S)\rangle_S \qquad (2.10)$$

and collects the set of one-particle irreducible correlation functions containing one insertion of $(S, S)$. The last equality of (2.10) can be proved by making the change of variables $\Phi^\alpha \to \Phi^\alpha + \theta(S, \Phi^\alpha)$ in the functional integral (2.9), where $\theta$ is a constant anticommuting parameter. For details, see for example the appendix of [14].
No one-particle irreducible diagrams can be constructed with external legs $\bar\psi_R$ or $\psi_R$, because $\bar\psi_R$ and $\psi_R$ do not appear in any vertices. Thus, the total $\Gamma$ functional depends on $\bar\psi_R$ and $\psi_R$ only through its tree-level part.

We have anticipated that the action (2.7) is not the final dimensionally regularized action we are going to use. Before moving to the appropriate extension $S_r$, we must describe the counterterms generated by $S_{r0}$, list a number of properties that can be used to restrict the $S_{r0}$ extensions and point out some subtleties concerning the dimensional regularization.
First, observe that the counterterms are $B$, $K_B$ and $K_{\bar C}$ independent. Indeed, the source $K_B$ appears nowhere in $S_{r0}$, while $K_{\bar C}$ appears only in $-\int BK_{\bar C}$. Moreover, the gauge-fixing conditions are linear in the fields, and the $B$-dependent terms of $S_{r0}$ are at most quadratic in $\Phi$, therefore no nontrivial one-particle irreducible diagrams can have external $B$ legs.
Second, the action $S_{r0}$ does not depend on the antighosts $\bar C^{a_i}$ and the sources $K^{\mu a_i}$ separately, but only through the combinations $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$. The $\Gamma$ functional must share the same property. Indeed, an antighost external leg actually carries the structure $\partial^\mu\bar C^{a_i}$, since all vertices containing antighosts do so. Given a diagram with $K^{\mu a_i}$ or $\partial^\mu\bar C^{a_i}$ on external legs, we can construct almost identical diagrams just by replacing one or more legs $K^{\mu a_i}$ with $\partial^\mu\bar C^{a_i}$, or vice versa.
Third, power counting and ghost-number conservation ensure that the counterterms are linear in the sources $K$. Using square brackets to denote dimensions in units of mass, we have $[K^{\mu a}] = [K_C^a] = 2$ and $[K_\psi] = 3/2$. Each of these sources has ghost number equal to $-1$, therefore the dimension of a term that is more than linear in $K$ and has vanishing ghost number necessarily exceeds 4.
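The power-counting argument can be made explicit with a short worked check. It is only a sketch, assuming the ghost dimension $[C] = 1$ (not stated above, but implied by $[K^{\mu a}] = 2$, the combination of $K^{\mu a}$ with $\partial^\mu\bar C^a$, and a ghost kinetic term of dimension 4) and assuming that the ghosts $C$ are the only fields of positive ghost number.

```latex
\begin{align*}
% each source of ghost number -1 must be compensated by one ghost C
[K\,C] &\ge \tfrac{3}{2} + 1 = \tfrac{5}{2}, \\
% a term with n such sources and vanishing ghost number
[K^{n}C^{n}\cdots] &\ge \tfrac{5}{2}\,n, \qquad
\tfrac{5}{2}\,n > 4 \ \text{ for } n \ge 2, \\
% two explicit instances
[K^{\mu a}K^{\nu b}C^{c}C^{d}] &= 2+2+1+1 = 6, \qquad
[\bar K_\psi K_\psi\,C\,C] = \tfrac{3}{2}+\tfrac{3}{2}+1+1 = 5 .
\end{align*}
```

Both instances exceed dimension 4, so under these assumptions no counterterm of this kind can be quadratic in the sources.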

Structure of the dependence on the overall gauge coupling
It is useful to single out how the functionals depend on the overall gauge coupling $g$. The tree-level functionals we work with have the $g$ structure

$$X(\Phi,K,g) = \frac{1}{g^2}X'(g\Phi,gK). \qquad (2.11)$$

If the action satisfies this condition at the tree level, then the renormalized action and the $\Gamma$ functional have the $g$ structure

$$X(\Phi,K,g) = \sum_{L\ge 0}g^{2(L-1)}X_L'(g\Phi,gK), \qquad (2.12)$$

where $X_L'$ collects the $L$-loop contributions. Basically, there is an additional factor $g^2$ for every loop. Indeed, when the action is of the form (2.11), every vertex is multiplied by a power $g^{N-2}$, where $N$ is the number of its $\Phi$ plus $K$ legs. Then, a one-particle irreducible diagram with $L$ loops, $I$ internal legs, $E$ external legs and $v_i$ vertices with $i$ legs is multiplied by

$$g^{\sum_{i>2}(i-2)v_i} = g^{2I+E-2V} = g^{E}\,g^{2(L-1)},$$

having used $L - I + V = 1$ and $\sum_{i>2}iv_i = 2I + E$, where $V = \sum_{i>2}v_i$ is the total number of vertices. We see that for $L \ge 1$ we have one power of $g$ for each external leg and a residual factor $g^{2(L-1)}$, in agreement with (2.12). The $g$ structures (2.11) and (2.12) are preserved by the antiparentheses: if the functionals $X(\Phi, K, g)$ and $Y(\Phi, K, g)$ satisfy (2.11), or (2.12), then the functional $(X, Y)$ satisfies (2.11), or (2.12), respectively.
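As a quick sanity check of this counting, one can evaluate the exponent $\sum_i (i-2)v_i$ for two simple diagram topologies and compare it with $E + 2(L-1)$. The two topologies below are chosen purely for illustration and are not singled out in the text.

```latex
\begin{align*}
% one-loop triangle: three cubic vertices
v_3 = 3,\ I = 3,\ E = 3,\ L = I - V + 1 = 1:
 &\quad \sum_i (i-2)v_i = 3 = E + 2(L-1), \\
% two-loop self-energy built from two quartic vertices
v_4 = 2,\ I = 3,\ E = 2,\ L = I - V + 1 = 2:
 &\quad \sum_i (i-2)v_i = 4 = E + 2(L-1).
\end{align*}
```

In both cases $\sum_i i\,v_i = 2I + E$ holds (9 = 6 + 3 and 8 = 6 + 2), and the overall factors are $g^{3}$ and $g^{2}\cdot g^{2}$, that is, one power of $g$ per external leg times $g^{2(L-1)}$.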

Properties of the dimensional regularization of chiral theories
Now we recall a few properties of the dimensional regularization of chiral theories, which are important for the rest of our analysis. It is well-known that divergences are just poles in $\varepsilon$. Instead, the terms that disappear when $D \to 4$, called "evanescences", can be of two types: formal or analytic. Analytically evanescent terms, briefly denoted as "aev", are those that factorize at least one $\varepsilon$, such as $\varepsilon F_{\mu\nu}F^{\mu\nu}$, $\varepsilon\bar\psi_L\imath\slashed{D}\psi_L$, etc. Formally evanescent terms, briefly denoted as "fev", are those that formally disappear when $D \to 4$, but do not factorize powers of $\varepsilon$. They are built with the tensor $\delta_{\hat\mu\hat\nu}$ and the evanescent components $\hat x$, $\hat p$, $\hat\partial$, $\hat\gamma$, $\hat A$ of coordinates, momenta, derivatives, gamma matrices and gauge fields. Examples are $\bar\psi_L\imath\hat{\slashed{\partial}}\psi_R$, $(\partial_{\hat\mu}A_\nu^a)(\partial^{\hat\mu}A^{\nu a})$, etc.
The distinction between formally evanescent and analytically evanescent expressions is to some extent ambiguous. Consider for example a basis $\bar\psi_1\gamma^{\rho_1\cdots\rho_k}\psi_2$ of fermion bilinears, where $\psi_1$, $\psi_2$ can be $\psi_L$ or $K_\psi$, and $\gamma^{\rho_1\cdots\rho_k}$ is the completely antisymmetric product of $\gamma^{\rho_1},\cdots,\gamma^{\rho_k}$. In dimensional regularization these bilinears are nonvanishing for every $k$, and they are evanescent for $k > 4$. We have several ways to rearrange the products of two or more fermion bilinears using Fierz identities, and such rearrangements can convert formally evanescent objects into analytically evanescent ones. For example, given some spinors $\psi_n$, $n = 1, 2, 3, 4$, we can expand the matrix $\psi_2\bar\psi_3$ in the basis made of $\gamma^{\rho_1\cdots\rho_k}$, $k = 0, \ldots, \infty$, with coefficients that involve $f(D) = \mathrm{tr}[\mathbb{1}]$. Using this expansion we find, for example, the identity (2.13). Basically, this equation has the form "fev = fev + aev". The existence of such relations poses some problems, which we now describe.
Feynman diagrams may generate "divergent evanescences", briefly denoted as "divev". They are made of products between poles and formal evanescences, such as $(\partial_{\hat\mu}A_\nu^a)(\partial^{\hat\mu}A^{\nu a})/\varepsilon$. The theorem of locality of counterterms demands that we renormalize divergent evanescences away, together with ordinary divergences (see below). However, this makes sense only if we can define divergent evanescences unambiguously, which could be problematic due to the observations made above. For example, if we multiply both sides of formula (2.13) by $1/\varepsilon$ we get a relation of the type "divev = finite + divev".
Ultimately, the problem does not arise in the theories we are considering here, for the following reasons. Both the classical action and counterterms are local functionals, equal to integrals of local functions of dimension 4. In the paper we also show that the first nonvanishing contributions to the anomaly functional (2.10) are local, equal to integrals of local functions of dimension 5. A fermion bilinear $\bar\psi_1\gamma^{\rho_1\cdots\rho_k}\psi_2$ has dimension 3, so power counting implies that the classical action, as well as counterterms and local contributions to anomalies, cannot contain products of two or more fermion bilinears, and therefore are not affected by the ambiguities discussed above. Those ambiguities can only occur in the convergent sector of the theory, where they are harmless, since both analytic and formal evanescences must eventually disappear.
Thanks to the properties just mentioned, it is meaningful to require that the action $S_{r0}$, as well as its extensions constructed in the rest of this paper, do not contain analytically evanescent terms. More precisely, the coefficients of every Lagrangian term should be equal to their four-dimensional limits. This request is important to avoid unwanted simplifications between $\varepsilon$ factors and $\varepsilon$ poles, when divergent parts are extracted from bilinear expressions such as $(\Gamma, \Gamma)$. It can be considered part of the definition of the minimal subtraction scheme. For the same reason, we must be sure that the antiparentheses do not generate extra factors of $\varepsilon$, or poles in $\varepsilon$, which is proved below.
Finite nonevanescent contributions will be called "nev". We need a convention to define these quantities precisely, otherwise they can mix with evanescent terms. For example, we need to state whether $\bar C\partial^2C$, or $\bar C\bar\partial^2C$, or a combination such as $(1 + \alpha\varepsilon)\bar C\bar\partial^2C + \beta\bar C\hat\partial^2C$, where $\alpha$ and $\beta$ are constants, is taken to be nonevanescent. The convention we choose is that nonevanescent terms are maximally symmetric with respect to the $D$-dimensional Lorentz group. For the arguments of this paper we just need to focus on local functionals contributing to counterterms and anomalies. In the case of counterterms the nonevanescent terms are those appearing in the action $S_{r0}$, which are $SO(D)$-invariant when chiral fermions are switched off. In the case of anomalies the nonevanescent terms are $SO(D)$-invariant unless they contain the tensor $\varepsilon^{\mu\nu\rho\sigma}$ or chiral fermions.

Evanescent extension of the classical action
It is convenient to extend the action $S_{r0}$ by adding all formally evanescent terms that have the features of divergent evanescences, multiplied by independent parameters $\eta$. In this way it is possible to subtract divergent evanescences by means of $\eta$ redefinitions. Denoting the correction collecting such terms with $S_{ev}$, the extended action reads

$$S_r = S_{r0} + S_{ev}. \qquad (2.14)$$

Then the generating functionals (2.9), the functional $\Gamma$ and the anomaly functional $\mathcal{A}$ of (2.10) are turned into those defined by $S_r$.
Each term of $S_{ev}$ is the integral of a monomial of dimension 4 and is globally invariant under $G$. It is not necessarily gauge invariant, since gauge invariance is violated away from four dimensions. Moreover, $S_{ev}$ is $B$, $K_B$, $K_{\bar C}$, $\bar\psi_R$ and $\psi_R$ independent, linear in $K$ and depends on $\bar C^{a_i}$ and the sources $K^{\mu a_i}$ only through the combinations $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$. It is also independent of $K_C$, $K_\psi$, $\bar K_\psi$, $\psi_L$ and $\bar\psi_L$, because no formally evanescent terms can be built with these objects. By power counting and ghost-number conservation the terms proportional to $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$ are independent of matter fields. In the end, $S_{ev}$ has the form

We can further restrict $S_{ev}$. Indeed, $S_{r0}$ satisfies (2.11), therefore the divergent evanescences have the form (2.12) with $L \ge 1$, and can be renormalized with an $S_{ev}$ of the form (2.11). Precisely, we can define the parameters $\eta$ so that $S_{ev}$ is linear in $\eta$ and its $g$ dependence has the form (2.16), so $S_r$ also satisfies (2.11).
Basically, the terms of $S_{ev}$ are similar to those appearing in $S_{r0}$, but contain some evanescent components of momenta and/or gauge fields, and are broken into gauge noninvariant pieces. We have while examples of contributions to $S_{cev}$ are

The terms multiplied by $\eta_{3i}, \cdots, \eta_{8i}$ are quadratic and modify the propagators of the gauge fields $A_\mu^{a_i}$ and the Lagrange multipliers $B^{a_i}$. We do not need to report here the modified propagators, which are rather involved. We have however checked with the help of a computer program that they satisfy the requirements we need. In particular, if $k$ denotes their momentum, (i) they are regular when any evanescent components $\hat k$ of $k$ are set to zero; (ii) when the propagators are differentiated with respect to any components $\bar k$, $\hat k$, or to parameters of positive dimensions (such as $\eta_{8i}$), their behaviors for large $k^2$ improve by at least one power; (iii) they have a regular infrared behavior, which corresponds to the decoupling of the evanescent components $A_{\hat\mu}^{a_i}$. Finally, their denominators are $SO(1,3)\times SO(-\varepsilon)$ scalars, like the denominators of the fermion propagators (2.6).

Structure of correlation functions
Now we analyze the evaluation of correlation functions. We use the same notation for a function and its Fourier transform, since no confusion is expected to arise.
In momentum space, the terms of the classical action can be written in the form (2.19) where k 1 , · · · , k n+r are the external momenta.The constants T β 1 ···βr µ 1 ···µpα 1 ···αn collect all tensors η µν , ε µνρσ , δμν , γ matrices, structure constants f abc and matrices T a . In particular, every projector onto hat components of momenta, fields and sources is moved inside T β 1 ···βr µ 1 ···µpα 1 ···αn . Momentum conservation ensures that where the tensorsG µ 1 ···µp are polynomials that depend on n + r − 1 external momenta. Propagators can be decomposed as sums of terms of the form to the parameters ς I of formula (2.6) and the parameters η brought by the extension S r0 → S r discussed above.
The Feynman diagrams of Γ and A have structures inherited from the structures (2.19) and (2.21) of vertices and propagators. They can be written as sums of contributions of the form (2.19), with tensors G µ 1 ···µp that satisfy (2.20), but nowG µ 1 ···µp are integrals over internal momenta p of rational functions where the polynomial N µ 1 ···µp (p, k) appearing in the numerator is an SO(1, D − 1) tensor, and the polynomial D(p, k) appearing in the denominator is an tensors. Note thatG µ 1 ···µp have a regular limit when the evanescent componentsk of external momenta k tend to zero. For example, we can write .
It may be useful to write (2.19) in the more compact form and then organize the expressions L µ 1 ···µp (Φ, K) using the basis of fermion bilinearsψ 1 γ ρ 1 ···ρ k ψ 2 , and explicitly evaluate traces of spinor indices and contractions of Lorentz indices. At the end, all Lorentz indices appear in gauge fields, fermion bilinears, the tensor ε µνρσ (if present) and G µ 1 ···µp , and are contracted among one another, possibly after projections onto bar or hat components. It is also convenient to expand are polynomials constructed with η µν , ε µνρσ , δμν and the n+r−1 independent momenta k. Then we can write the contribution (2.24) to Γ or A as After these operations, Lorentz indices appear in gauge fields, fermion bilinears, momenta k and the tensor ε µνρσ . They are contracted among themselves, possibly after projections onto bar or hat components. At this point, traces and index contractions must be evaluated explicitly, because they may produce factors ε, which are important for the expansions and limits that we are going to define. The analytic expansion around ε = 0 of (2.24) or (2.26) is defined expanding the scalars G i (k) in powers of ε without affecting the evanescent components of external momenta. The analytic limit is the order zero of the analytic expansion, once the poles in ε have been subtracted away. The formal limit ε → 0 is the limit where the evanescent components of gauge fields, external momenta and fermion bilinears are dropped. The limit ε → 0 is the analytic limit followed by the formal limit.
For the reasons explained above, the analytic and formal limits may be ambiguous in the convergent sector of the theory, but they are unambiguous in the divergent sector. More importantly, the limit $\varepsilon \to 0$ is always unambiguous. Since the tensors $G_{\mu_1\cdots\mu_p}$ are regular when any evanescent components $\hat k$ of external momenta $k$ are set to zero, the formal limits of (2.24) and (2.26) are well-defined.
When we use the expressions "O(ε)" or "ev" we mean any quantity that vanishes in the limit ε → 0. Clearly, ev = aev + fev.

Locality of counterterms
Now we comment on the locality of counterterms. The forms of the regularized propagators ensure that a sufficient number of derivatives with respect to the physical components k̄ and/or the evanescent components k̂ of the external momenta k kills the overall divergences of Feynman diagrams. If we subtract divergent evanescences, together with ordinary divergences, up to some order n, then both the ordinary divergences and the divergent evanescences of order n + 1 are polynomial in k̄ and k̂. The S_{r0}-extension S_r = S_{r0} + S_ev of formula (2.14) allows us to subtract all of them in a way that is efficient for the proof of the Adler-Bardeen theorem.
To complete the analysis it is useful to describe what happens if for some reason we do not subtract divergent evanescences. We use the abbreviations "loc" and "nl" to denote local and nonlocal contributions, respectively. At one loop we miss counterterms of the form loc fev ε . (2.27) Consequently, at two loops we also miss counterterms for subdivergences. Using the vertex (2.27) inside one-loop diagrams we get contributions of the form The first three terms are generated when the formal evanescence enters the diagram, is converted into a factor ε and simplifies a pole in ε. Symbolically, we express this occurrence (which is the basic mechanism that originates potential anomalies) as The last three terms of (2.28) describe what happens when the formal evanescence remains outside the diagram. The first term of (2.28) must be subtracted, so the missing counterterms at two loops are Even if the last term of this list is nonlocal, we still have no problem, since the residues of the poles in ε are formally evanescent. However, when we use the third term of (2.30) inside one-loop diagrams, the formal evanescence can simplify another pole, by the mechanism (2.29), and give We see that undesirable nonlocal, nonevanescent divergences are generated from three loops onwards.
A possible way to get rid of them might be to define an ad-hoc scheme that subtracts the local nonevanescent terms of (2.29) as soon as they appear. However, it is far from evident that it is always possible to do so. More importantly, a renormalization procedure of this type is cumbersome, and unfit to make anomaly cancellation manifest to all orders.
For these reasons we pursue the safer way, that is to say subtract divergent evanescences order by order, together with nonevanescent divergences.

Properties of the antiparentheses
Now we study how divergences and evanescences propagate through the antiparentheses. Indeed, in the proofs of renormalizability to all orders and the Adler-Bardeen theorem, it is necessary to extract divergent parts of antiparentheses such as A = (Γ, Γ) or (Γ, A). This operation is not as simple as it sounds, because we must be sure that the antiparentheses themselves do not generate poles or factors of ε, in order to be able to say that, for example, the divergent part of (S r , div is the divergent part of Γ (1) . Specifically, we prove that (i) the antiparentheses (X conv , Y conv ) of convergent functionals X conv and Y conv are convergent; (ii) the antiparentheses (X conv , Y ev ) of convergent functionals X conv and evanescent functionals Y ev are evanescent; (iii) the antiparentheses (X, Y ) do not generate either poles in ε or factors of ε if X, Y and (X, Y ) do not involve products of two or more fermion bilinears.
For the uses we have in mind it is convenient to rephrase property (iii) more explicitly as (iii ′ ) the antiparentheses (X A , Y B ) of functionals X A and Y B with the properties specified by their subscripts A and B, satisfy the identities as long as X A , Y B and (X A , Y B ) do not involve products of two or more fermion bilinears.
To prove these properties it is convenient to write the antiparentheses in momentum space. We have and a similar relation obtained exchanging Φ and K. Let us write formulas (2.26) for X, Y and (X, Y ) as Using (2.25) we find that the p integral of formula (2.32) can be readily done and gives where P is the total momentum ofG i X andG j Y . We see that the scalar "cores" G i of correlation functions just multiply each other in momentum space, which cannot generate new poles in ε or factors of ε.
It remains to study the relation between L_{ij(X,Y)} and L_{iX}, L_{jY}. The antiparentheses can produce index contractions by means of the paired functional derivatives (δ/δA_µ, δ/δK^µ) and (δ/δψ, δ/δK_ψ). Clearly, no such operations can generate poles in ε. This observation is sufficient to prove statements (i) and (ii).
As far as statement (iii) is concerned, we must assume that the functionals X, Y and (X, Y) do not involve products of two or more fermion bilinears, therefore they are free of ambiguities of type (2.13). The contraction of Lorentz indices brought by δ/δA_µ and δ/δK^µ gives a tensor η^{µν} with mixed indices (namely one index from X and one index from Y). The contraction of spinorial indices brought by δ/δψ and δ/δK_ψ gives structures such as where the ρ indices come from X and the σ indices come from Y. Anticommuting the γ's we can rearrange the indices so that ρ_1 < ρ_2 < ··· < ρ_k and σ_1 < σ_2 < ··· < σ_l. Reordering the indices we may get minus signs from further anticommutations or from squares of γ matrices with identical indices. In the end, we get a formula like where the breves denote missing indices that go into the tensors η^{µν}. Again, we get only tensors η^{µν} with mixed indices. We recall that all Lorentz indices, possibly after projection onto bar or hat components, are contracted with gauge fields, fermion bilinears, momenta and possibly ε^{µνρσ}, and that, by assumption, no products of two or more fermion bilinears are involved. Then the contractions originated by the antiparentheses cannot produce factors of ε. Using these properties it is easy to check that identities (2.31) hold, so statement (iii) is also proved. Statement (iii) also says that the antiparentheses cannot convert formal ε evanescences into analytic ones. It applies, for example, to local functionals X and Y that are equal to the integrals of functions of dimensions n_X, n_Y ≤ 5, such that n_X + n_Y ≤ 8, because then X, Y and (X, Y) cannot contain products of two or more fermion bilinears. In the paper we will apply statement (iii) to the divergent contributions to Γ and the first nonvanishing contributions to the anomaly functional A of (3.9).
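The reordering step used in the proof of statement (iii) can be illustrated with a minimal sketch. It assumes strictly four-dimensional indices 0..3 and the metric signature (+, −, −, −), neither of which is fixed by the text above, and the function name `reduce_gamma_product` is hypothetical. The sketch only shows that sorting a product of γ matrices produces an overall sign and no other factor: anticommuting two distinct matrices flips the sign, and a pair of identical matrices is replaced by the corresponding diagonal metric entry.

```python
def reduce_gamma_product(indices, metric=(1, -1, -1, -1)):
    """Sort a product of gamma matrices, tracking the overall sign.

    Assumptions (illustrative only): 4D indices 0..3, diagonal metric
    with signature (+,-,-,-), gamma^mu gamma^nu = -gamma^nu gamma^mu
    for mu != nu, and (gamma^mu)^2 = metric[mu].
    Returns (sign, sorted list of the surviving distinct indices).
    """
    sign = 1
    idx = list(indices)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(idx) - 1:
            if idx[i] == idx[i + 1]:
                # Square of a gamma matrix: replace the pair by the metric entry.
                sign *= metric[idx[i]]
                del idx[i:i + 2]
                changed = True
            elif idx[i] > idx[i + 1]:
                # Anticommute two distinct gamma matrices: one minus sign.
                idx[i], idx[i + 1] = idx[i + 1], idx[i]
                sign = -sign
                changed = True
                i += 1
            else:
                i += 1
    return sign, idx


if __name__ == "__main__":
    print(reduce_gamma_product([2, 1, 2]))
    print(reduce_gamma_product([3, 0, 1]))
```

In every case the result is ±1 times an ordered product, which is the four-dimensional analogue of the statement that the contractions generated by the antiparentheses produce signs and tensors η^{µν}, but no factors of ε.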

DHD regularization
The dimensional regularization alone does not provide the subtraction scheme where the cancellation of gauge anomalies is manifest to all orders. To find the right scheme, we modify the regularization technique adding higher-derivative terms that preserve gauge invariance in D = 4. We take the non-gauge-fixed regularized classical action where (3. 2) The higher-derivative structures of (3.1) and (3.2) are chosen to simplify the arguments of our derivations.
We gauge fix S cΛ using modified gauge-fixing functions of the form and a modified gauge fermion where λ ′ and ξ ′ are other (dimensionless) gauge-fixing parameters. Finally, we add which differs from S ev only because the combinations K µa i + ∂ µC a i are replaced by K µa i + Q (✷) ∂ µC a i . The regularized gauge-fixed action reads where S K is the same as before, and satisfies where h IJ (∂ 2 ) = (ς IJ Λ 6 + δ IJ (∂ 2 ) 3 )/Λ 6 . The reason why it is useful to separate the terms proportional to the parameters η will become clear later.
It is straightforward to derive the propagators and check that the ones of gauge fields, A µ (k)A ν (−k) 0 , and the ones of ghosts, C(k)C(−k) 0 , fall off as 1/(k 2 ) 9 for large momenta k, while the propagators A(k)B(−k) 0 fall off as k/(k 2 ) 9 , and B(k)B(−k) 0 as 1/(k 2 ) 8 . For example, in the "Feynman gauge" ξ i = ζ i , λ ′ = ξ ′ = 1 at η = 0 we have The fermion propagators, on the other hand, fall off as p/(p 2 ) 4 . For a while we need to work at finite Λ, where the action S Λ is superrenormalizable. To make its superrenormalizability manifest, it is convenient to parametrize it so that the Λ denominators cancel out. Let us first ignore the terms S Λev . We define tilde fields and tilde parameters as andr i = r i . The covariant derivatives remain Λ independent. To cancel the Λ denominators of the gauge-fixing sector we define C a =C a /Λ 8 ,B a = B a /Λ 8 andC a = C a /Λ 8 . Finally, we define the tilde sources so the tilde map is a canonical transformation combined with a redefinition of parameters. As far as S Λev is concerned, using (2.16) and the linearity in η we can write it as whereQ (✷) = Λ 16 + λ ′ ✷ 8 . In the tilde parametrization the full action reads The DHD-regularized generating functional Z Λ reads and the generating functional Γ Λ (Φ, K) = W Λ (J, K) − Φ α J α of one-particle irreducible diagrams is the Legendre transform of W Λ (J, K) with respect to J. Since no one-particle irreducible diagrams with external legs ψ R ,ψ R can be constructed, the action S Λ and the Γ functional Γ Λ depend on ψ R ,ψ R in exactly the same way. The DHD-regularized anomaly functional is When we switch to the tilde parametrization we writeZ Λ ,W Λ ,Γ Λ andÃ Λ . The tilde actionS Λ is polynomial in Λ, has properly normalized propagators and contains only parameters of nonnegative dimensions in units of mass. However, the tilde fields have negative dimensions, which in principle may jeopardize the (super)renormalizability we want to prove. 
Precisely, we have and [K ψ ] = 9/2. This problem is solved as follows. Since S Λ has the form (2.11), theg structure ofS Λ is the tilde version of (2.11). The tilde version of formula (2.12) ensures that the counterterms have theg structure L 1g 2(L−1) F L (gΦ,gK), (3.10) where the L-loop local functionals F L depend polynomially on the other dimensionful parameters of the theory. Then we see that the theory is indeed superrenormalizable, because the dimensions of all productsgΦ andgK are strictly positive.

The DHD limit
The basic idea behind the DHD regularization is to "first send ε to zero, then Λ to infinity". However, we must formulate the rules of such limits more precisely, since certain caveats demand attention. We distinguish the higher-derivative theory from the final theory. The higher-derivative theory is the one defined by the classical action S Λ (orS Λ , if we use the tilde parametrization), where the scale Λ is kept fixed and treated like any other parameter, instead of a cutoff. It is superrenormalizable and regularized by the dimensional technique. Its divergences, which are poles in ε, are subtracted in the next section using the minimal subtraction scheme. The final theory is obtained taking the limit Λ → ∞ on the renormalized higher-derivative theory, after subtracting the Λ divergences that emerge in that limit.
Having already expanded in ε, we may wonder what types of divergences appear in the final theory. We have products Λ^k ln^{k′} Λ of powers and logarithms of Λ, but we also have terms that are evanescent in ε and divergent in Λ. To understand what to do with these, we distinguish two types of them, according to whether the ε evanescence is analytic or formal.
(i) First, consider analytic evanescences in ε multiplied by products Λ k ln k ′ Λ, such as εΛ 2 ln Λ. Since we first send ε to zero, these quantities are not true divergences and must be neglected. In any case, they cannot be subtracted away, because the theorem of locality of counterterms does not apply to them. Consider for example the integral where for the purposes of our present discussion the mass m can also play the role of an external momentum. Expanding the right-hand side in powers of ε we find that the O(ε 0 ) terms, which are equal to 1 32π 2 πΛ 2 − 2m 2 ln have a Λ-divergent part that is polynomial in m, as expected, while the O(ε 1 ) terms have a Λ-divergent part that contains expressions such as which are not polynomial in m.
(ii) Next, consider formal evanescences times Λ^k ln^{k′} Λ, such as (ln Λ) ∂_µ A_ν̂ ∂^µ A^ν̂. These can (actually, must, for the reasons explained in subsection 2.5) be subtracted away (as long as their coefficients are calculated at ε = 0), because the form of the regularized propagators ensures that counterterms are polynomial in both physical and evanescent components of external momenta and fields.
(iii) Formally evanescent expressions multiplied by products Λ^k ln^{k′} Λ and by factors of ε are treated like case (i) and should not be subtracted away.
(iv) For completeness, we point out a fourth type of ε-evanescent Λ divergences, that is to say nonlocal contributions of type (ii), which can appear as artifacts of inconvenient manipulations. Precisely, because of the ambiguities encoded in formula (2.13) some quantities of type (i) can be converted into nonlocal divergences of type (ii). These conversions should just be avoided. To this purpose, it is sufficient to note that the structure (2.19) of diagrams and the expansion of the integrals G^{µ1···µp} only generate ε-evanescent Λ divergences of types (i), (ii) and (iii). In the event that "aev → fev conversions" of type (2.13) are accidentally applied, nonlocal divergences of type (ii) can just be ignored, because they cannot mix with the local terms belonging to the power-counting renormalizable sector and they are resummable into contributions of type (i).
To summarize, the Λ divergences are equal to Λ^k ln^{k′} Λ times local monomials of the fields, the sources and their derivatives. Those monomials may or may not be formally evanescent, and their coefficients must be evaluated in the analytic limit ε → 0.
We can thus define the procedure with which we renormalize the final theory and define physical quantities. We call it the DHD limit. We still organize the contributions to Γ and A in the form (2.19). Referring to (2.24) and (2.26), the DHD limit consists of the analytic limit ε → 0, followed by the limit Λ → ∞, followed by the formal limit ε → 0. We also have the DHD expansion, that is to say the analytic expansion around ε = 0 followed by the expansion around Λ = ∞.
The three steps that define the DHD limit are unambiguous in the divergent sector, which does not contain products of two or more fermion bilinears. Instead, the first and third steps are separately ambiguous in the convergent sector. What is important is that the DHD limit as a whole is unambiguous in the convergent sector as well.
It is useful to recapitulate the DHD limit in symbolic form. We first expand around ε = 0 at Λ fixed, and find poles, finite terms and evanescent terms: The symbols appearing in this list have the following meanings: 1/ε denotes any kind of divergence in ε, δ̂ is any formally evanescent quantity, ε^0 is any quantity that is convergent and nonevanescent in the analytic limit ε → 0, and ε denotes any analytic evanescence. After the expansion, we subtract the poles and remain with ε^0, δ̂ε^0, ε, δ̂ε. (3.11) The terms proportional to ε vanish in the DHD limit. The terms δ̂ε^0 also vanish in that limit, but for some time we treat them together with the ε^0 terms. Next, we study the Λ dependence. Expanding the coefficients of every surviving term of (3.11) around Λ = ∞, we find where Λ denotes any kind of Λ divergence (including the logarithmic ones), Λ^0 any Λ-convergent, non-Λ-evanescent term, and 1/Λ any Λ-evanescent term. Then we subtract the Λ divergences, namely the terms of types ε^0 Λ and δ̂ε^0 Λ. After that we remain with At this point we are ready to take the DHD limit, which drops all contributions of this list but the ε^0 Λ^0 terms.
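The symbolic recapitulation above can be mirrored by a small bookkeeping sketch. It is only a schematic encoding under stated assumptions: each contribution is labelled by a hypothetical triple (ε class, formal-evanescence flag, Λ class), every term surviving the pole subtraction is assumed to be expanded in Λ, and the function names are illustrative. The lists it produces are not claimed to coincide item by item with the numbered lists of the text.

```python
from itertools import product

# Hypothetical labels: "pole" ~ 1/eps, "eps0" ~ eps^0, "eps" ~ analytic evanescence.
EPS_CLASSES = ("pole", "eps0", "eps")
# Hypothetical labels: Lambda-divergent, Lambda^0, and 1/Lambda terms.
LAM_CLASSES = ("Lam_div", "Lam0", "Lam_ev")


def expand_in_eps():
    """Analytic expansion at fixed Lambda; the flag marks a formal evanescence."""
    return [(e, formal) for e, formal in product(EPS_CLASSES, (False, True))]


def subtract_poles(terms):
    """Minimal subtraction of the poles in eps."""
    return [t for t in terms if t[0] != "pole"]


def expand_in_lambda(terms):
    """Expand every surviving term around Lambda = infinity."""
    return [(e, formal, lam) for (e, formal) in terms for lam in LAM_CLASSES]


def subtract_lambda_divergences(terms):
    """Subtract only eps^0 * Lambda terms, formally evanescent or not.

    Analytic evanescences times Lambda divergences are left untouched,
    following cases (i) and (iii) of the discussion above.
    """
    return [t for t in terms if not (t[0] == "eps0" and t[2] == "Lam_div")]


def dhd_limit(terms):
    """Keep only nonevanescent, Lambda-finite terms."""
    return [t for t in terms if t == ("eps0", False, "Lam0")]


if __name__ == "__main__":
    after_poles = subtract_poles(expand_in_eps())
    renormalized = subtract_lambda_divergences(expand_in_lambda(after_poles))
    print(dhd_limit(renormalized))
```

The point of the encoding is the asymmetry between the two subtractions: poles in ε are removed irrespective of the Λ dependence, while Λ divergences are removed only when their coefficients are of order ε^0.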

Renormalization of the higher-derivative theory
In this section and the next two we study the higher-derivative regularized theoryS Λ , keeping Λ fixed and (mostly) using the tilde parametrization. We first work out the renormalization of the theory, then study its one-loop anomalies and finally prove anomaly cancellation to all orders. The counterterms (3.10) are local and largely constrained. We know that i) they are independent ofB,KC,K B ,ψ R and ψ R and ii) do not depend on antighosts C a i and sourcesK µa i separately, but only through the combinationsK µa i +Q(✷)∂ µ C a i . Indeed, we have arranged S Λev to preserve these properties. Actually, we have chosen the higher-derivative structure of S Λ to simplify the counterterms even more: iii) they cannot depend on the sourcesK and matter fieldsψ, because each productgK,gψ has dimension greater than 4; iv) they cannot contain antighosts, because of points (ii) and (iii); v) they cannot contain ghosts, because all objects with negative ghost numbers are excluded by points (iii) and (iv); vi) they can only be one-loop, because each loop carries an extra factorg 2 , which has dimension 16. In the end, there can only be one-loop divergences of the form (where derivatives can act on any objects to their right), and those obtained from these expressions suppressing somegÃ's or derivatives. The anomaly functional (3.9), if nonvanishing and nontrivial (in a sense specified below), is the anomaly of the higher-derivative theory. In the tilde parametrization we havẽ (4. 2) The one-loop contributionÃ whereΓ (1) Λ is the one-loop contribution toΓ Λ . Using (2.31) and (3.4) we see that (S Λ ,S Λ ) = fev. The right-hand side of (4.3) collects one-loop Feynman diagrams containing insertions of formally evanescent vertices. The formal evanescences can: (a) remain attached to external legs and momenta, or (b) be turned into a factor ε. In case (a) they give local divergent evanescences plus nonlocal evanescences. 
In case (b) the factor ε can simplify a local divergent part and give local nonevanescent contributions, in addition to nonlocal evanescences. Therefore, we can writẽ whereÃ (1) Λnev is local, convergent and nonevanescent,Ã Λdivev is local and divergent-evanescent andÃ (1) Λev is evanescent and possibly nonlocal. Now we take the divergent part of equation (4.3). DecomposeΓ (1) Λ as the sum of its divergent partΓ (1) Λdiv and its convergent partΓ (1) Λconv . Recalling that the antiparentheses of convergent functionals are convergent, we obtain that (S Λ ,Γ (1) Λconv ) is convergent. Properties (2.31) apply to (S Λ ,Γ (1) Λdiv ), therefore we have the identity Λdivev . (4.5) Now, formula (4.1) tells us thatΓ (1) Λdiv is just a functional ofgÃ, therefore its antiparenthesis with S Λ is only sensitive toS K and the K-dependent contributions toS Λev , which we denote asS ΛK ev . Moreover, we can further decomposeΓ (1) Λdiv as the sum of a nonevanescent divergent partΓ (1) Λnevdiv and a divergent evanescenceΓ (1) Λdivev . Doing so we find (4.6) At this point, taking the nonevanescent divergent part of this equation we obtain Λnevdiv ) = 0, which just states thatΓ (1) Λnevdiv is gauge invariant. Going back to the nontilde parametrization, we haveΓ Λnevdiv can only be a linear combination of the invariants F a i µν F a i µν , and can be subtracted redefining the parameters ζ i . The rest, Γ Λdivev , can be subtracted redefining the parameters η of S ev . The renormalized actionŜ Λ is obtained making the replacements in S Λ , where f i , f ′ are calculable numerical coefficients. Since S Λ is linear in ζ and η, we havê Λdiv .

One-loop anomalies
In this section we study the one-loop anomalies, and relate those of the final theory, which are trivial by assumption, to those of the higher-derivative theory, which turn out to be trivial as a consequence.
We begin by relating the one-loop anomalies Â_Λ and Ã_Λ. First we observe that Indeed, the correction Γ^(1)_{Λdiv} to the action provides O(ħ) vertices. If we use those vertices in one-particle irreducible diagrams together with vertices of (Ŝ_Λ, Ŝ_Λ), we must close at least one loop, which gives O(ħ^2) contributions. Using (4.9), we have As a check, recall that Â_Λ is convergent, therefore the divergent evanescences Ã^(1)_{Λdivev} must disappear from Â^(1)_Λ. We know that Ã^(1)_{Λnev} is the integral of a local function of dimension 5 and ghost number 1.
Recalling that a factorg is attached to every external leg, we havẽ Λnev cannot depend on the sourcesK and matter fieldsψ, because the productsgK andgψ have dimensions greater than 4.
Working out (S Λ ,S Λ ) in detail it is easy to check that it does not depend onB a i and depends onK µa i and C a i only through the combinationsK µa i +Q(✷)∂ µ C a i . Therefore, the same must be true ofÃ (1) Λ , which means thatÃ (1) Λnev cannot depend on either C orB. Then the functionsÃ a cannot even contain ghosts. Summarizing, we can writẽ Recall that the antiparentheses satisfy the identity (X, (X, X)) = 0 for any functional X. Taking X =Γ Λ , we obtain (Γ Λ ,Â Λ ) = 0, (5.4) which are the Wess-Zumino consistency conditions [8], written using the Batalin-Vilkovisky formalism. In particular, at one loop we have In section 2 we have proved that the antiparenthesis of an evanescent functional with a convergent functional is evanescent. Thus, For the same reason, (S Λ ,Ã At this point, we take the nonevanescent part of both sides and note that formulas (2.31) apply to (S K ,Ã Λnev ), because thanks to (5.3), no products of more fermion bilinears are involved in this antiparenthesis. We find (S K ,Ã Λnev is the (potential) one-loop anomaly of the higher-derivative regularized theorỹ S Λ , defined keeping Λ fixed. The final theory is instead obtained taking the DHD limit. We must relateÃ (1) Λnev to the potential one-loop anomaly A (1) f nev of the final theory. Indeed, we are assuming that A (1) f nev is trivial (the final theory cannot have gauge anomalies at one loop), but we still have no precise information aboutÃ (1) Λnev . We know howÃ (1) Λnev depends ong. The other dimensionful parameters ofS Λ (such asζ i and ξ i ), as well as the powers of Λ multiplying various terms (such as ψ I L ı / Dψ I L ), have dimensions greater than 4. They cannot contribute toÃ (1) Λnev , because the local functionsÃ a are polynomial in them and have dimension 4. ThusÃ (1) Λnev can only depend ongC,gÃ,r i , λ ′ , ξ ′ , η 1i and η 2i . Using (5.3), switching to nontilde variables, and recalling thatgÃ = gA,gC = gC, we obtain that Λnev is Λ independent. 
Now we show that A^(1)_{Λnev} actually coincides with the one-loop anomaly A^(1)_{f nev} of the final theory.
To prove this fact, we need to take Λ to infinity and study the DHD limit at one loop. A more comprehensive study of the DHD limit will be carried out later. The terms that are divergent in this limit are denoted as "Ddiv", to distinguish them from the divergences considered so far, which strictly speaking were "εdiv". Recall that, according to the definition of DHD limit, the Λ-divergent parts cannot contain analytic ε evanescences, but can contain formal ε evanescences.
ConsiderÂ Λ = (Γ Λ ,Γ Λ ) and take the one-loop DHD-divergent part of this equation. Using (5.1) and recalling that A (1) Λnev is Λ independent, we get whereΓ (1) ΛDdiv is the one-loop DHD-divergent part ofΓ Λ . In the last step we have dropped the contribution involving (S Λ − S r ,Γ ΛDdiv ), since this quantity vanishes in the limit Λ → ∞. The reason is that, by formulas (2.14) and (3.4), the difference S Λ − S r is made of O(1/Λ 6 ) terms, and the powerlike Λ divergences contained inΓ (1) ΛDdiv cannot exceed Λ 4 . Actually, this is one of the reasons why we have chosen the particular higher-derivative structure of the theory S Λ . Moreover, to make the last step of (5.7) we have applied (2.31) to (S r ,Γ ΛDdiv ). Because of the analysis of section 3, the Λ divergences ofΓ (1) ΛDdiv can be of two types, with respect to the limit ε → 0: nonevanescent or formally evanescent. Thanks to (2.31), the parenthesis with S r also gives nonevanescent or formally evanescent contributions, wherefrom the last equality of (5.7) follows.
Subtracting the Λ divergencesΓ ΛDdiv fromŜ Λ we can define the one-loop renormalized action S f ren of the final theory, which readŝ For the moment we do not need to specify the O( 2 ) terms of this subtraction (but later we will have to be precise about them). The anomaly of the final theory is and its one-loop nonevanescent part is the quantity A (1) f nev we want, where the subscript "nev" close to the subscript "f " denotes the contributions that do not vanish in the DHD limit. We have In these manipulations we have used the formulâ which holds because at one loop the vertices ofΓ (1) ΛDdiv , which are already O( ), cannot contribute to one-particle irreducible diagrams containing one insertion of (Ŝ Λ ,Ŝ Λ ).
At one loop, using (5.7), we obtain We are ready to take the DHD limit. Recall that (S Λ − S r ,Γ ΛDdiv ) tends to zero for Λ → ∞, while A Λev and its Λ-divergent part do not separately tend to zero, because they can contain (local) terms that are formally ε evanescent and Λ divergent. However, those terms are precisely A (1) Λev Ddiv , therefore disappear in the difference . Finally, using (5.3), we get as we wanted. Let us write the most general structure of the functions A a (gA). We know that they have dimension 4 and are sums of terms of the form g p ∂ k A p . Power counting gives k + p ≤ 4, hence we have plus the terms obtained from these suppressing some gA's or some derivatives. Now it remains to collect all pieces of information found so far and solve (5.6). We call condition (5.6) a little cohomological problem, because it involves a structure (5.3) that contains a finite number of terms, in our case just a few, and its solution can be worked out directly. We recall the solution without proof, because it is well-known and because it is not necessary for the other derivations of this paper. It can be split into the sum of trivial and nontrivial contributions. Trivial contributions are those of the form (S K , χ), where χ is a K-independent local functional of vanishing ghost number, equal to the integral of a local function of dimension 4 and having g structure as the one-loop sector of formula (2.12). The only nontrivial contributions to A (1) f nev are proportional to the famous Bardeen formula [15]. In the appendix the coefficient of the Bardeen term is calculated using our regularization technique. In the end, we have where C = C a T a , A µ = A a µ T a . One-loop gauge anomalies vanish when the trace appearing in (5.12) vanishes. Typically, the cancellation is possible when the gauge group is a product group and the theory contains various types of fermionic fields in suitable representations, as in the Standard Model. 
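The power-counting statement used above for the functions A_a(gA), namely that they are sums of terms g^p ∂^k A^p with k + p ≤ 4, can be enumerated explicitly. The following is a minimal sketch: the function name `anomaly_monomials` is hypothetical, and the restriction p ≥ 1 (at least one gauge field) is an assumption made here for illustration, not a statement taken from the text.

```python
def anomaly_monomials(max_dim=4):
    """List the pairs (p, k) labelling monomials g^p d^k A^p with k + p <= max_dim.

    p counts the gauge fields (assumed >= 1 here) and k counts the derivatives.
    """
    return [(p, k)
            for p in range(1, max_dim + 1)
            for k in range(0, max_dim - p + 1)]


def saturating(monomials, max_dim=4):
    """Select the monomials of maximal dimension, k + p == max_dim."""
    return [(p, k) for (p, k) in monomials if p + k == max_dim]


if __name__ == "__main__":
    all_terms = anomaly_monomials()
    print(all_terms)
    print(saturating(all_terms))
```

The pairs with k + p = 4 label the leading structures, while the remaining pairs correspond to the terms obtained by suppressing some gA's or some derivatives.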
Now we go back to the higher-derivative theory (the DHD limit being completed in section 7), precisely to the classical actionŜ Λ of formula (4.8). The trivial contributions (S K , χ) to anomalies can be canceled out redefining the action aŝ because then In the last step we used the fact that χ is K independent. Thus, at one loop we havê which means that when the Bardeen term vanishesÂ Finally, observe that the new Γ functionalΓ ′ Λ is still convergent to all orders. The reason is that it is convergent at one loop and the action has the g structure (2.12). Then, using tilde variables, counterterms must have the form (3.10), which however forbids divergent contributions from two loops onwards. The anomaly functional is also convergent to all orders and has the g structure (2.12). The next step is to prove anomaly cancellation to all orders in the higher-derivative theory. After that we have to complete the DHD limit by renormalizing the Λ divergences.

Manifest Adler-Bardeen theorem in the higher-derivative theory
In this section we prove that gauge anomalies manifestly cancel to all orders in the higherderivative theory S Λ . We assume that the final theory has no one-loop anomalies, which according to the previous section implies that the higher-derivative theory shares the same property, namely Here the "O(ε)" includes the tree-level contribution (S Λ , S Λ ). Now we move to higher orders. We have to study the diagrams with two or more loops, with one insertion of Λdivev − (S ΛK ev , χ), (6.2) calculated with the action (5.14). We have used (4.5), and replaced (S Λ , χ) with (S K + S ΛK ev , χ) and (S K , χ), with A Λnev . Both E andÂ ′ Λ have the structure (2.12) and (S Λ , S Λ ) is formally evanescent. To fix the notation, let us start from formula (2.19) for the ℓ-loop diagrams containing one (S Λ , S Λ ) insertion. We write them as sums of contributions of the form where the tensors T (ℓ)β 1 ···βr Aµ 1 ···µpα 1 ···αn are constant and evanescent, and the integrations over momenta are understood. We recall that G (ℓ)µ 1 ···µp A (k 1 , · · · , k n+r ) are the integrals coming from Feynman diagrams, once all tensors η µν , ε µνρσ , δμν , the γ matrices, the structure constants f abc and the matrices T a are moved outside into the structures T Λdivev appearing in (6.2). It is easy to see that these subtractions precisely cancel the nontrivial divergent parts of the integrals G where the subtracted integrals G is evanescent, in agreement with (6.1). At higher loops it is useful to make a similar analysis. We begin with ℓ = 2. The integrals G (2)µ 1 ···µp A are automatically equipped with the counterterms that subtract away their nontrivial subdivergences: first, the actionŜ ′ Λ is equipped with its own counterterms and, second, the sub- Λdivev appearing in (6.2) provide counterterms for the integrals G (1)µ 1 ···µp A associated with the one-loop subdiagrams containing the (S Λ , S Λ ) insertion. 
Thus, when we include counterterms for subdivergences, we can identify subtracted integrals G Λ , which can be divergent-evanescent, nonevanescent (due to simplified divergences), or still divergent. However, local contributions must have the structure (3.10), which implies that they are zero. Indeed, using the tilde parametrization, they are polynomial in all dimensionful parameters ofS Λ and carry an overall factorg 2 , which has dimension 16. We conclude that the overall divergences G (2)µ 1 ···µp Adiv are trivial, because they are killed by T (2) A , therefore G therefore formula (6.1) is promoted to the next order and we can writeÂ , where now "O(ε)" includes the evanescent contributions appearing on the right-hand side of (6.3).
At this point we can proceed by induction. Assume that for some ℓ 2, and that the overall divergent parts G Λloc must have the structure (3.10), which means that it vanishes. In the end, G (ℓ+1)µ 1 ···µp A div are also trivial, and Λev . Thus, if the inductive assumptions hold for some ℓ, they must also hold with ℓ → ℓ + 1 and therefore for ℓ = ∞. We conclude that the anomaly is evanescent to all orderŝ therefore vanishes in the limit D → 4. This result proves that if the final theory is anomaly-free at one loop, the higher-derivative theory S Λ is anomaly-free to all orders. It is important to stress that the DHD-regularization framework provides the subtraction scheme where this property is manifest: after the subtraction of (S K , χ) at one loop, no analogous subtractions are necessary at higher orders. This is not the final result we want, though. To get there we still need to take Λ to infinity and complete the DHD limit.

Manifest Adler-Bardeen theorem in the final theory
We are finally ready to study anomaly cancellation to all orders in the final theory. In this section we study the Λ dependence and complete the DHD limit, according to the rules of subsection 3.1. The subtraction of Λ divergences proceeds relatively smoothly, and preserves the master equation to all orders up to terms that vanish in the DHD limit.
Call S_n and Γ_n the action and the Γ functional DHD-renormalized up to n loops, where S_0 = Ŝ′_Λ = Ŝ_Λ − χ/2 is the action (5.14). The action S_n must satisfy two inductive assumptions to all orders in ħ: (I) Γ_n has a regular limit for ε → 0 at fixed Λ, and (II) the local functional (S_n, S_n) ≡ E_n (7.1) is "truly ε-evanescent at fixed Λ", namely a local functional such that E_n tends to zero when ε → 0 at fixed Λ. More precisely, Γ_n is a sum of terms (3.13) up to n loops (because it is DHD-convergent to that order) and a sum of terms (3.11) from n + 1 loops onwards. Instead, E_n = (Γ_n, Γ_n) contains the terms (3.13) except ε^0 Λ^0 and ε^0/Λ up to n loops, and the terms (3.11) except ε^0 from n + 1 loops onwards. Thanks to (6.5) we know that the inductive hypotheses are true for n = 0.
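The theorem of locality of counterterms ensures that the $(n + 1)$-loop divergent part $\Gamma^{(n+1)}_{n\,\mathrm{div}}$ of $\Gamma_n$ is a local functional. Since $\Gamma_n$ has a regular limit for $\varepsilon \to 0$ at fixed $\Lambda$, $\Gamma^{(n+1)}_{n\,\mathrm{div}}$ contains only divergences in $\Lambda$, not in $\varepsilon$. Precisely, we can write $\Gamma^{(n+1)}_{n\,\mathrm{div}} = \Gamma^{(n+1)}_{n\,\mathrm{divnev}} + \Gamma^{(n+1)}_{n\,\mathrm{divfev}}$, where $\Gamma^{(n+1)}_{n\,\mathrm{divnev}}$ and $\Gamma^{(n+1)}_{n\,\mathrm{divfev}}$ collect the terms $\varepsilon^0\Lambda$ and $\hat\delta\varepsilon^0\Lambda$ of the list (3.12), respectively. Now we study the $(n + 1)$-loop divergent part of $(\Gamma_n, \Gamma_n)$. We take the $(n + 1)$-loop DHD-divergent non-$\varepsilon$-evanescent part of $(\Gamma_n, \Gamma_n) = \langle(S_n, S_n)\rangle = \langle E_n\rangle$ (7.2), which means the terms of type $\varepsilon^0\Lambda$ of the list (3.12). Recall that $S_\Lambda$ is equal to the action $S_r$ of (2.14) plus $O(1/\Lambda^6)$ terms, so $(S_\Lambda - S_r, \Gamma^{(n+1)}_{n\,\mathrm{div}})$ is convergent for $\Lambda \to \infty$. Moreover, $S_r$ is equal to $S_{gf}$, which by formula (2.3) is non-$\varepsilon$-evanescent, plus the term $\bar\psi^I_R\, i\slashed{\partial}\psi^I_R$, plus $\varepsilon$-evanescent terms. Noting that the divergent part of $\langle E_n\rangle$ is just made of terms $\hat\delta\varepsilon^0\Lambda$, we obtain $(S_{gf}, \Gamma^{(n+1)}_{n\,\mathrm{divnev}}) = 0$ (7.3). Here we have dropped the contributions $(\Gamma^{(k)}_n, \Gamma^{(n+1-k)}_n)$ with $0 < k < n + 1$, because they are convergent in the DHD limit. Note that $\Gamma^{(k)}_n$, $0 < k < n + 1$, may contain terms $\varepsilon\Lambda$. Now, the powers of $\Lambda$ can get simplified inside $(\Gamma^{(k)}_n, \Gamma^{(n+1-k)}_n)$. However, $\Gamma_n$ is convergent for $\varepsilon \to 0$ and the antiparentheses cannot generate poles, so the resulting contributions remain negligible in the DHD limit. We must just pay attention not to manipulate the terms $\varepsilon\Lambda$ in inconvenient ways (see subsection 3.1 for details).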
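To see where the contributions with $0 < k < n + 1$ come from, it may help to write the loop expansion of the antiparentheses explicitly. The following is a sketch under two assumptions made here: $\Gamma^{(k)}_n$ denotes the $k$-loop contribution to $\Gamma_n$ (so that $\Gamma^{(0)}_n$ is the tree-level part), and the antiparentheses of two bosonic functionals are symmetric.

\[
(\Gamma_n,\Gamma_n)\Big|_{(n+1)\text{ loops}}
= 2\,\big(\Gamma^{(0)}_n,\Gamma^{(n+1)}_n\big)
+ \sum_{k=1}^{n}\big(\Gamma^{(k)}_n,\Gamma^{(n+1-k)}_n\big).
\]

Only the first term contains the $(n + 1)$-loop part of $\Gamma_n$, and hence its divergent part. The sum involves lower-order contributions, which are already DHD-convergent at this stage of the induction.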
Since the theory is power-counting renormalizable, (7.3) is another little cohomological problem, therefore it can be solved directly. Moreover, it is a purely four-dimensional problem, since all $\varepsilon$-evanescent terms have been dropped. Its solution is well known and states that $\Gamma^{(n+1)}_{n\,\mathrm{divnev}}$ can be reabsorbed by redefining the parameters of $S_{gf}$ and making a canonical transformation inside $S_{gf}$. Using the nonrenormalization of the $B$- and $K_{\bar C}$-dependent terms, and power counting, the canonical transformation is generated by a functional $F_n(\Phi, K')$ (7.4), and the parameter redefinitions (7.5) involve $\varepsilon$-independent, $\Lambda$-divergent renormalization constants $Z_{nAi}$, $Z_{nCi}$, $Z_{nIJ}$ and $Z_{ni}$. The $r_i$ redefinitions encode the renormalizations of gauge couplings. Instead, the $\xi_i$ redefinitions follow from the nonrenormalization of the terms quadratic in $B$. In the parametrization we are using there are no redefinitions of $g$ and $\zeta_i$. Making the canonical transformation (7.4) and the redefinitions (7.5) on $S_{gf}$ subtracts $\Gamma^{(n+1)}_{n\,\mathrm{divnev}}$. However, the classical action we have been using is not $S_{gf}$, and not even $S_r = S_{gf} + S_{LR} + S_{ev}$, but $S_\Lambda$, therefore we must inquire what happens when the operations (7.4) and (7.5) are made on $S_\Lambda$. Let us begin from $S_r$. Since $S_{LR}$ is nonrenormalized, we must also make the redefinitions (7.6). When we apply (7.4) and (7.5) to $S_{ev}$ we generate new formally $\varepsilon$-evanescent, $\Lambda$-divergent terms of order $\hbar^{n+1}$, which change $\Gamma^{(n+1)}_{n\,\mathrm{divfev}}$ into a new functional $\Gamma'^{(n+1)}_{n\,\mathrm{divfev}}$. The divergences $\Gamma'^{(n+1)}_{n\,\mathrm{divfev}}$ are not constrained by gauge invariance, but just by locality and power counting. They can be subtracted by redefining the parameters $\eta$ of $S_{ev}$, since $S_{ev}$ was added precisely for this purpose.
We denote the operations that subtract $\Gamma^{(n+1)}_{n\,\mathrm{div}}$ with $T_n$. They include the canonical transformation (7.4), the redefinitions (7.5) and (7.6), and the $\eta$ redefinitions that subtract $\Gamma'^{(n+1)}_{n\,\mathrm{divfev}}$. Note that $T_n = 1 + O(\hbar^{n+1})$. It remains to check what happens when the operations $T_n$ act on $S_\Lambda$. Observe that, since no $\varepsilon$ divergences are around, the operations $T_n$ are independent of $\varepsilon$ and divergent in $\Lambda$. However, the difference $S_\Lambda - S_r$ is of order $1/\Lambda^6$ and the operations $T_n$ cannot contain powers of $\Lambda$ greater than 4. Thus, $(T_n - 1)(S_\Lambda - S_r)$ vanishes in the DHD limit. Call $S_{n+1} = T_n S_n$ the action obtained by applying $T_n$ to $S_n$. Up to terms that vanish in the DHD limit, the operations $T_n$ renormalize the divergences due to $S_n$, therefore $S_{n+1}$ is the $(n + 1)$-loop DHD-renormalized action, namely it gives a generating functional $\Gamma_{n+1}$ that is convergent up to $n + 1$ loops in the DHD limit. Moreover, since the canonical transformations generated by (7.4) act multiplicatively on fields and sources, the operations $T_n$ act on the $\Gamma$ functional precisely as they act on the action, therefore $\Gamma_{n+1} = T_n\Gamma_n$. Since the operations $T_n$ are $\varepsilon$-independent, we conclude that $\Gamma_{n+1}$ is regular when $\varepsilon \to 0$ at fixed $\Lambda$, to all orders in $\hbar$, which promotes the inductive assumption (I) to $n + 1$ loops.
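The power counting behind the statement that $(T_n - 1)(S_\Lambda - S_r)$ vanishes in the DHD limit can be sketched as follows. We assume here, for illustration, that the $\Lambda$ dependence of $T_n - 1$ is of the form $\Lambda^p\ln^k\Lambda$ with $p \leq 4$; the presence of logarithms is an assumption of this sketch.

\[
(T_n-1)\,(S_\Lambda-S_r)\;\sim\;\hbar^{n+1}\,\Lambda^{p}\ln^{k}\Lambda\times\frac{1}{\Lambda^{6}}
\;=\;\hbar^{n+1}\,\frac{\ln^{k}\Lambda}{\Lambda^{6-p}},
\qquad 6-p\geq 2 ,
\]

which tends to zero for $\Lambda\to\infty$ for every $k$. The margin of two powers of $\Lambda$ is what makes the choice $S_\Lambda = S_r + O(1/\Lambda^6)$ sufficient.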
Finally, the operations $T_n$ preserve the antiparentheses. Applying them to (7.1) we also obtain $(S_{n+1}, S_{n+1}) = T_n E_n$. Now, taking the average of this equation we get $(\Gamma_{n+1}, \Gamma_{n+1}) = \langle T_n E_n\rangle_{n+1} = T_n\langle E_n\rangle_n$, where $\langle\cdots\rangle_k$ means that the average is calculated with the action $S_k$. If we take the limit of $T_n\langle E_n\rangle_n$ for $\varepsilon \to 0$ at fixed $\Lambda$ we get zero, because by assumption (II) $\langle E_n\rangle_n$ tends to zero for $\varepsilon \to 0$ at fixed $\Lambda$ and the operations $T_n$ are $\varepsilon$-independent. We conclude that the local functional $E_{n+1} \equiv T_n E_n$ is truly $\varepsilon$-evanescent at fixed $\Lambda$, therefore assumption (II) is also promoted to $n + 1$ loops. Since all inductive assumptions have been successfully promoted to $n + 1$ loops, the DHD-renormalized action $S_R = S_\infty$ satisfies $(S_R, S_R) = E_R$, where $\langle E_R\rangle$ vanishes in the DHD limit, because it contains only the terms of (3.13) except $\varepsilon^0\Lambda^0$ and $\varepsilon^0/\Lambda$. Finally, the DHD-renormalized $\Gamma$ functional $\Gamma_R = \Gamma_\infty$ is such that $(\Gamma_R, \Gamma_R) = \langle E_R\rangle$ tends to zero in the DHD limit, which means that gauge anomalies cancel out to all orders.
The DHD framework defines a subtraction scheme where the cancellation takes place naturally and manifestly. In any other framework, the right scheme must be identified step-by-step, from two loops onwards, fine-tuning local counterterms.
Some final comments are in order. Because of (4.7) higher-order divergent terms of the form $\Lambda^{p}\ln^{k}\Lambda/\varepsilon$ are generated along the way. They appear in $S_R$ and in the partially renormalized actions $S_n$. Our renormalization procedure (which is just made of redefinitions of parameters, fields and sources) makes them cancel opposite contributions coming from diagrams, therefore they do not appear in the $\Gamma$ functionals $\Gamma_R$ and $\Gamma_n$, which are indeed regular in the limit $\varepsilon \to 0$ at fixed $\Lambda$.
In several steps of the proof we have used the fact that S Λ = S r + O(1/Λ 6 ). It is important that the higher-derivative regularized classical action S Λ does not contain terms with fewer inverse powers of Λ. Consistently with this, renormalization does not require turning them on. The operations T n may contain powerlike divergences, which can generate terms with fewer than 6 inverse powers of Λ when they act on S Λ − S r . Those terms are at least one loop and not divergent, so they do not affect the structure of the classical action S Λ .

Standard Model and more general theories
In this section we show how to extend the proof of the previous sections to the Standard Model and the most general perturbatively unitary, power-counting renormalizable theories. We just need to include photons V µ , scalar fields ϕ and right-handed fermions χ R , which were dropped so far for simplicity. Depending on the representations, we can also add Majorana masses to the fermions ψ L .
We begin with the fermions. The starting classical action (2.1) is extended to include the right-handed fermions $\chi_R$ and a functional $S_m$ that collects the mass terms, when allowed by the representations. The functional $S_K$ that collects the symmetry transformations is extended accordingly. Clearly, $\Psi$ and $(S_K, \Psi)$ are unmodified. To regularize the right-handed fermions we mirror what we did for the left-handed ones. In the same way as we added partners $\psi_R$ for $\psi_L$ that decouple in four dimensions, we add partners $\chi_L$ for $\chi_R$ that also decouple when $D \to 4$, and correct $S_{LR}$ accordingly. Mass terms involving the regularizing partners $\psi^I_R$ and $\chi^I_L$ can also be added. Differently from (8.1), they are not renormalized, so their coefficients must be independent of the ones appearing in (8.1). The evanescent corrections $S_{ev}$ of formula (2.15) are affected only in the sector $S_{cev}$, which is extended to include new terms multiplied by independent constants. Next, we add the corresponding higher-derivative regularizing terms. Neither the gauge fermion $\Psi_\Lambda$ nor $S_{\Lambda ev} - S_{cev}$ changes. Tilde fields and sources are defined as before and every argument of the proof can be extended straightforwardly. Now, wave-function renormalization constants can mix right-handed fermions with conjugates of left-handed ones.
Scalars can be added by analogous replacements in the classical action, which now also includes the Yukawa terms $S_Y$. As before, the renormalized action is linear in the sources $K$, by ghost number conservation and power counting. The evanescent corrections $S_{cev}$ include new terms such as the integrals of $(\partial_\mu\varphi)^\dagger(\partial_\mu\varphi)$, $(\partial_\mu\varphi)^\dagger T^a(A^a_\mu\varphi)$, while $S_{\Lambda ev} - S_{cev}$ does not change. With the higher-derivative regularizing terms for the scalars, the tilde fields and sources $\tilde\varphi = \varphi/\Lambda^3$, $\tilde K_\varphi = \Lambda^3 K_\varphi$ are such that $[g\tilde\varphi] = 6$, $[g\tilde K_\varphi] = 13$. With these choices, the matter fields and their sources still cannot contribute to the one-loop counterterms $\tilde\Gamma^{(1)}_{\Lambda\,\mathrm{div}}$ of the higher-derivative theory $\tilde S_\Lambda$, nor to the nonevanescent one-loop gauge anomalies $\tilde{\mathcal A}^{(1)}_{\Lambda\,\mathrm{nev}}$. Moreover, we still have $S_\Lambda - S_r = O(1/\Lambda^6)$, therefore all arguments used in the proof of the previous sections generalize straightforwardly.
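A quick consistency check on the tilde fields and sources may be useful. It assumes that the rescalings are read as $\tilde\varphi = \varphi/\Lambda^3$ and $\tilde K_\varphi = \Lambda^3 K_\varphi$, with opposite powers of $\Lambda$ for the field and its source; this reading is an assumption of the sketch.

\[
\int \tilde\varphi\,\tilde K_\varphi
=\int \frac{\varphi}{\Lambda^{3}}\;\Lambda^{3}K_\varphi
=\int \varphi\,K_\varphi .
\]

A rescaling of this type, where a field is divided by a constant and its source is multiplied by the same constant, is a canonical transformation, so it preserves the antiparentheses. This is why switching to tilde variables does not alter the form of the master equation.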
Finally, we add photons. Assume that the group $G$ contains $N$ $U(1)$ factors and denote their gauge fields with $V^u_\mu$, $u = 1, \ldots, N$. Then make the replacements $S_c \to S_c - \frac{1}{4}\zeta_{uv}\int W^u_{\mu\nu}W^{v\mu\nu}$, $D_\mu\pi^I \to D_\mu\pi^I + iQ^uV^u_\mu\pi^I$, where $W^u_{\mu\nu} = \partial_\mu V^u_\nu - \partial_\nu V^u_\mu$, $\zeta_{uv}$ is an invertible constant matrix, $\pi^I$ is any matter field in the irreducible representation $R_I$ of $G$, and $\pi^{I\dagger}$, $K^{I\dagger}_\pi$ stand for $\bar\pi^I$, $\bar K^I_\pi$ if $\pi^I$ is a fermion. We define extended $G$ indices $\hat a, \hat b, \ldots$ to include both sets of indices $u, v, \ldots$ and $a, b, \ldots$, and write $A^{\hat a}_\mu = \{V^u_\mu, A^a_\mu\}$. The $U(1)$ charges of matter fields are denoted with $gq^u_I$. We also write $T^{\hat a} = \{iQ^u, T^a\}$, where $Q^u$ acts on $\pi^I$ by multiplying it by $q^u_I$. The gauge fermion (2.4) is modified accordingly. The sector $S_{cev}$ of $S_{ev}$ is also extended, to include $V$-dependent evanescent terms similar to those already met in (2.18), (8.2) and (8.3). Instead, $S_{\Lambda ev} - S_{cev}$ remains the same, since the $U(1)$ ghosts decouple. The action $S_{c\Lambda}$ is extended to include higher-derivative regularizing terms for $V^u_\mu$, while the change of the gauge fermion involves $P_{uv}(\Box) = \xi_{uv} + \delta_{uv}\frac{\xi'}{\Lambda^{16}}\Box^8$. Finally, $S_{\Lambda ev}$ inherits the modifications made on $S_{cev}$. Tilde fields and sources are defined as before. The one-loop renormalization of the higher-derivative theory $\tilde S_\Lambda$ is made of the replacements (4.7) plus similar replacements for $\zeta_{uv}$, with calculable constants $f_{uv}$. Formula (5.12) for the one-loop gauge anomalies holds with $C = C^{\hat a}T^{\hat a}$, $A_\mu = A^{\hat a}_\mu T^{\hat a}$. The correction to the canonical transformation (7.4) reads $F_n(\Phi, K') \to F_n(\Phi, K') + \int\big(V^u_\mu Z^{1/2}_{nuv}K'^{\mu v} + C^u Z^{1/2}_{nuv}K'^v_C + \bar C^u Z^{-1/2}_{nuv}K'^v_{\bar C} + B^u Z^{-1/2}_{nuv}K'^v_B\big)$, and the redefinitions (7.5) are accompanied by $q^{u\prime}_I = Z^{-1/2}_{nuv}q^v_I$, $\xi'_{uv} = Z^{1/2}_{nuw}\xi_{wz}Z^{1/2}_{nzv}$, so that the $U(1)$ gauge-fixing sector $(S_K, \Psi)$, including the ghost action, as well as the $U(1)$ sector of $S_K$, are nonrenormalized.
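The interplay between the field and charge redefinitions in the $U(1)$ sector can be checked in one line. The sketch below assumes the convention $\Phi' = \delta F_n/\delta K'$ for the transformed fields and takes $Z^{-1/2}_{n}$ to be the matrix inverse of $Z^{1/2}_{n}$; both are assumptions made here for illustration.

\[
V'^{v}_\mu=\frac{\delta F_n}{\delta K'^{\mu v}}=V^{u}_\mu Z^{1/2}_{nuv},
\qquad
q^{u\prime}_I\,V'^{u}_\mu
=Z^{-1/2}_{nuv}\,q^{v}_I\;V^{w}_\mu Z^{1/2}_{nwu}
=q^{v}_I\,V^{w}_\mu\,\delta_{wv}
=q^{v}_I\,V^{v}_\mu .
\]

The combination of charges and photons that enters the covariant derivative is therefore left unchanged. The same cancellation occurs between $q^{u\prime}_I$ and the transformed ghosts $C^u Z^{1/2}_{nuv}$, which is consistent with the nonrenormalization of the $U(1)$ sector of $S_K$.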
With the rules of this section gauge anomalies manifestly cancel to all orders in the most general perturbatively unitary, renormalizable gauge theory coupled to matter, as long as they vanish at one loop. We stress again that the proof we have given also works when the theory is conformal or finite, or the first coefficients of its beta functions vanish, where instead RG techniques are powerless.

Conclusions
We have reconsidered the Adler-Bardeen theorem, focusing on the cancellation of gauge anomalies to all orders, when they are trivial at one loop. The proof we have worked out is more powerful than those that have appeared so far and clarifies aspects that the previous derivations were unable to explain. Key ingredients of our approach are the Batalin-Vilkovisky formalism and a regularization technique that combines the dimensional regularization with the higher-derivative gauge invariant regularization. The most important result is the identification of the subtraction scheme where gauge anomalies manifestly cancel to all orders. We have not used renormalization-group arguments, so our results apply to the most general perturbatively unitary, renormalizable gauge theories coupled to matter, including conformal field theories, finite theories, and theories where the first coefficients of the beta functions vanish.
In view of future generalizations to wider classes of quantum field theories, we have paid attention to a considerable number of details and delicate steps that emerge along the proof. We are convinced that the techniques developed here may help us identify the right tools to upgrade the formulation of quantum field theory and simplify the proofs of all-order theorems.

A Appendix
In this appendix we calculate the one-loop coefficient of the Bardeen anomaly in chiral gauge theories. That coefficient is scheme independent, so we can work at $\Lambda = \infty$, which means that we can use the dimensionally regularized action $S_r$ of (2.14). Actually, we can equivalently use the action $S_{r0}$ of (2.7), because it is easy to check that the contributions due to $S_{ev}$ do not contain fermion loops, and therefore cannot generate the tensor $\varepsilon^{\mu\nu\rho\sigma}$.
We focus on the matter-independent contributions $\mathcal{A}_B$ to the anomaly $\mathcal{A} = \langle(S_{r0}, S_{r0})\rangle_{S_{r0}}$, so we can take the ghosts outside the average. Switching to momentum space, we get $\mathcal{A}_B = -2iq_L\int\frac{d^Dp}{(2\pi)^D}\,C(-k)\,\mathrm{tr}\big[\slashed{p}(P_R - P_L)\langle\psi(p + k_1)\bar\psi(-p + k_2)\rangle\big]$.
Here and below the integrals on $k_1$ and $k_2$ in $\mathcal{A}_B$ are understood. We expand the fermion two-point function in powers of the gauge field. The linear term gives a contribution whose form is fixed by power counting and ghost number conservation, and which can be subtracted away as explained in formula (5.13). Then we concentrate on the contributions $\mathcal{A}'_B$ to $\mathcal{A}_B$ that are quadratic in the gauge field. We observe that one fermion propagator is sandwiched between two $P_L$'s or two $P_R$'s, which projects its numerator onto the evanescent sector, and the other two propagators are sandwiched between $P_L$ and $P_R$, which projects their numerators onto the physical sector. The photons and their momenta $k_1$, $k_2$ can be taken to be strictly four dimensional. Turning to Euclidean space and evaluating the loop integral, we obtain a contribution proportional to $C(-k)\varepsilon^{\mu\nu\rho\sigma}k_{1\mu}A_\nu(k_1)k_{2\rho}A_\sigma(k_2)$, with a coefficient that contains the factor $1/(12\pi^2)$. Converting to coordinate space and including the trivial contributions, we finally get a term proportional to $\int C\varepsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$, plus $(S_K, \chi)$.
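The conversion to coordinate space relies on an elementary identity for the Abelian field strength, which we recall here as a sketch. It uses only the definition $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and the total antisymmetry of $\varepsilon^{\mu\nu\rho\sigma}$; factors of $i$ and signs arising from the Fourier transform depend on conventions and are not tracked.

\[
\varepsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}
=\varepsilon^{\mu\nu\rho\sigma}
(\partial_\mu A_\nu-\partial_\nu A_\mu)(\partial_\rho A_\sigma-\partial_\sigma A_\rho)
=4\,\varepsilon^{\mu\nu\rho\sigma}\,\partial_\mu A_\nu\,\partial_\rho A_\sigma .
\]

Each of the four terms in the product gives the same contribution after relabeling the antisymmetrized indices. In momentum space every derivative becomes a momentum, which reproduces the structure $\varepsilon^{\mu\nu\rho\sigma}k_{1\mu}A_\nu(k_1)k_{2\rho}A_\sigma(k_2)$.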
After subtraction of the trivial terms, the average of the divergence of the current is given by the $\varepsilon^{\mu\nu\rho\sigma}F_{\mu\nu}F_{\rho\sigma}$ term alone. Incidentally, the calculation shows that $\mathcal{A}_B$ receives no contributions proportional to $\int CF_{\mu\nu}F^{\mu\nu}$. This term is in principle allowed by the cohomological constraint (5.6) in Abelian theories, but actually does not show up. The reason is that it would imply that the global symmetry is anomalous, which is of course not true. The calculation just done also proves formula (5.12), after inserting matrices $T^a$ and structure constants $f^{abc}$ where appropriate.