Adler–Bardeen theorem and manifest anomaly cancellation to all orders in gauge theories

We reconsider the Adler–Bardeen theorem for the cancellation of gauge anomalies to all orders, when they vanish at one loop. Using the Batalin–Vilkovisky formalism and combining the dimensional-regularization technique with the higher-derivative gauge invariant regularization, we prove the theorem in the most general perturbatively unitary renormalizable gauge theories coupled to matter in four dimensions, and we identify the subtraction scheme where anomaly cancellation to all orders is manifest, namely no subtractions of finite local counterterms are required from two loops onwards. Our approach is based on an order-by-order analysis of renormalization and, unlike most derivations existing in the literature, does not make use of arguments based on the properties of the renormalization group. As a consequence, the proof we give also applies to conformal field theories and finite theories.


Introduction
The Adler-Bardeen theorem [1,2] is a crucial property of quantum field theory, and one of the few tools to derive exact results. In the literature various statements go under the name of "Adler-Bardeen theorem". They apply to different situations. The original statement by Adler and Bardeen says that (I) the Adler-Bell-Jackiw axial anomaly [3,4] is one-loop exact. The second statement, which is the one we are going to study here, says that (II) (there exists a subtraction scheme where) gauge anomalies vanish to all orders, if they vanish at one loop. Statement II is important to justify the cancellation of gauge anomalies to all orders in the standard model. A third statement concerns the one-loop exactness of anomalies associated with external fields. Statement I is expressed by a well-known operator identity for the divergence of the axial current. By means of a diagrammatic analysis, Adler and Bardeen were able to provide the subtraction scheme where that identity is manifestly one-loop exact in QED [1]. They emphasized that higher-order corrections vanish, unless they contain the one-loop triangle diagram as a subdiagram. Thus stated, statement I intuitively implies statement II. However, the original proof of Adler and Bardeen applies only to QED.

Author e-mail: damiano.anselmi@df.unipi.it
Other approaches to the problem have appeared, since the paper by Adler and Bardeen, in Abelian and non-Abelian gauge theories. For a review, see for example [2]. Statement I can be proved using arguments based on the properties of the renormalization group [5-7], regularization-independent algebraic techniques [8], or an algebraic/geometric derivation [9] based on the Wess-Zumino consistency conditions [10] and the quantization of the Wess-Zumino-Witten action. Statement II can also be proved using renormalization-group (RG) arguments, with the dimensional regularization [11] or regularization-independent approaches [12].
More recently, statement II was proved by the author of this paper in standard model extensions with high-energy Lorentz violation [13], which are renormalizable by "weighted power counting" [14]. The approach of [13] is closer to the original approach by Adler and Bardeen, in the sense that it does not make use of RG arguments, algebraic methods or geometric shortcuts, and it naturally provides the subtraction scheme where the all-order cancellation is manifest. It is basically a diagrammatic analysis, although, instead of dealing directly with diagrams, it uses the Batalin-Vilkovisky formalism [15-17] to manage relations among diagrams in a compact and efficient way.
In the present paper we prove statement II in the most general perturbatively unitary, renormalizable gauge theories coupled to matter, and elaborate further along the guidelines of Ref. [13]. We upgrade the approach of [13] in a number of directions, emphasize properties that were not apparent at that time, and expand the arguments that were presented concisely. We also gain a certain clarity by dropping the Lorentz violation. A side purpose of this investigation is to develop new techniques and tools to prove all-order theorems in quantum field theory with a smaller effort.
Our results make progress in several directions. To our knowledge, if we exclude Ref. [13] and this paper, statement II has been proved beyond QED only making use of arguments based on the renormalization group. However, RG arguments do not provide the subtraction scheme where the all-order cancellation is manifest, and they are not sufficiently general. For example, they are powerless when the beta functions identically vanish, so they exclude conformal field theories and finite theories, where, however, the Adler-Bardeen theorem does hold. Actually, RG techniques fail even when the first coefficients of the beta functions vanish [11,12]. Our approach does not suffer from these limitations. Another reason to avoid shortcuts is that in the past the Adler-Bardeen theorem caused some confusion in the literature. Therefore new proofs, and generalizations even more so, should be as transparent as possible. In this paper we pay attention to all details.
The all-order cancellation of gauge anomalies is a property that depends on the scheme, but the existence of a good scheme is not evident. Knowing the scheme where the cancellation is manifest is very convenient from the practical point of view, because it saves the effort of subtracting ad hoc finite local counterterms at each step of the perturbative expansion. For example, using the dimensional regularization and the minimal subtraction scheme the cancellation of two-loop and higher-order corrections to gauge anomalies in the standard model is not manifest, and finite local counterterms must be subtracted every time.
To find the right subtraction scheme we need to define a clever regularization technique. It turns out that using the Batalin-Vilkovisky formalism and combining the dimensional regularization with the gauge invariant higher-derivative regularization, the subtraction scheme where the Adler-Bardeen theorem is manifest emerges quite naturally [13].
It is well known that, in general, gauge invariant higher-derivative regularizations do not regularize completely, because some one-loop diagrams can remain divergent. From our viewpoint, this is not a weakness, because it allows us to separate the sources of potential anomalies from everything else. We just have to use a second regulator, the dimensional one, to deal with the few surviving divergent diagrams.
The regularization we are going to use introduces two cutoffs: ε = 4 − D, where D is the continued complex dimension, and an energy scale Λ for the higher-derivative regularizing terms. The regularized action must be gauge invariant in D = 4, to ensure that the higher-derivative regulator has the minimum impact on gauge anomalies. The physical limit is defined letting ε tend to 0 and Λ to ∞. When we have two or more cutoffs, physical quantities do not depend on the order in which we remove them. More precisely, exchanging the order of the limits ε → 0 and Λ → ∞ is equivalent to changing the subtraction scheme. That kind of scheme change is, however, crucial for our arguments.
Consider first the limit Λ → ∞ followed by ε → 0. When D ≠ 4 the limit Λ → ∞ is regular in every diagram and gives back the dimensionally regularized theory: no Λ divergences appear, but just poles in ε. In this framework there are no known subtraction schemes where the Adler-Bardeen theorem holds manifestly. Now, consider the limit ε → 0 followed by Λ → ∞. At fixed Λ we have a higher-derivative theory. If properly organized, that theory is superrenormalizable and contains just a few (one-loop) divergent diagrams, whose divergences are poles in ε and may be removed by redefining some parameters. At a second stage, we study the limit Λ → ∞, where Λ divergences appear and are removed by redefining parameters and making canonical transformations. We call the regularization technique defined this way dimensional/higher-derivative (DHD) regularization.
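The two orders of limits can be summarized schematically as follows. This is only a notational sketch: the labels $\Gamma_{\varepsilon,\Lambda}$, $\Gamma_{\text{dim}}$ and $\Gamma_{\text{DHD}}$ are introduced here for illustration and are not taken from the paper, and Λ denotes the energy scale of the higher-derivative regularizing terms.

```latex
% Schematic only. \Gamma_{\varepsilon,\Lambda} is a hypothetical label for the
% generating functional regularized with both cutoffs; the divergences are
% understood to be subtracted before each limit is taken. Requires amsmath.
\begin{align}
  \Gamma_{\text{dim}} &= \lim_{\varepsilon\to 0}\,\lim_{\Lambda\to\infty}
      \Gamma_{\varepsilon,\Lambda}
      && \text{(only poles in $\varepsilon$ are met)},\\
  \Gamma_{\text{DHD}} &= \lim_{\Lambda\to\infty}\,\lim_{\varepsilon\to 0}
      \Gamma_{\varepsilon,\Lambda}
      && \text{(poles in $\varepsilon$ first, then $\Lambda$ divergences)}.
\end{align}
% According to the text, exchanging the two limits amounts to a change of
% subtraction scheme.
```

The first line corresponds to ordinary dimensional regularization, the second to the DHD prescription used in the rest of the paper.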
Intuitively, if gauge anomalies are trivial at one loop, there should be no further problems at higher orders, because the higher-derivative regularization is manifestly gauge invariant. Thus, we expect that the DHD regularization provides the framework where the Adler-Bardeen theorem is manifest. However, it is not entirely obvious that the two regularization techniques can be merged to achieve the goal we want. Among other things, ε-evanescent terms are around all the time and the O(1/Λ^n) regularizing terms can simplify power-like divergences, causing trouble. Nevertheless, with a nontrivial amount of work we can prove that all difficulties can be properly dealt with.
Summarizing, the statement we prove in this paper is

Theorem. In renormalizable perturbatively unitary gauge theories coupled to matter, there exists a subtraction scheme where gauge anomalies manifestly cancel to all orders, if they are trivial at one loop.
Once we have this result, we know that no matter what scheme we use, it is always possible to find ad hoc finite local counterterms that ensure the cancellation of gauge anomalies to all orders. Then we are free to use the more common minimal subtraction scheme and the pure dimensional regularization technique.
The paper is organized as follows. In Sects. 2-7 we prove the theorem in non-Abelian Yang-Mills theory coupled to left-handed chiral fermions. This model is sufficiently general to illustrate the key points of the proof, as well as the main arguments and tools, but simple enough to free the derivation from unnecessary complications. At the end of the paper, in Sect. 8, we show how to include the missing fields, namely right-handed fermions, scalars, and photons, and cover the most general perturbatively unitary renormalizable gauge theory coupled to matter. Section 9 contains our conclusions. In Appendix A we recall the calculation of gauge anomalies in chiral theories. In Appendix B we recall the proof of a useful formula.
The proof for Yang-Mills theory coupled to chiral fermions is organized as follows. In Sects. 2 and 3 we formulate the dimensional and DHD regularization techniques. In Sects. 4-6 we prove the Adler-Bardeen theorem in the higher-derivative theory, studying the limit ε → 0 at fixed Λ. Precisely, in Sect. 4 we work out the renormalization, in Sect. 5 we study the one-loop anomalies and in Sect. 6 we prove the anomaly cancellation to all orders. In Sect. 7 we take the limit Λ → ∞ and conclude the proof of the Adler-Bardeen theorem for the final theory.

Dimensional regularization of chiral Yang-Mills theory
We first prove the Adler-Bardeen theorem in detail in four-dimensional non-Abelian Yang-Mills theory coupled to left-handed chiral fermions. This model offers a sufficiently general arena to illustrate the key arguments and tools of our approach. At the same time, we make some clever choices to prepare the generalization (discussed in Sect. 8) to the most general perturbatively unitary gauge theories coupled to matter. To begin with, in this section we dimensionally regularize chiral gauge theories and point out a number of facts and properties that are normally not emphasized, but are rather important for the arguments of this paper.

Consider a gauge theory with gauge group G and left-handed chiral fermions $\psi_L^I$ in certain irreducible representations $R_L^I$ of G. If G is the product of various simple groups $G_i$, we use indices a, b, ... for G and indices $a_i$, $b_i$, ... for $G_i$. Denote the gauge coupling $g_i$ of each $G_i$ with $g r_i$, where $r_i$ are parameters of order one that we incorporate into the G structure constants $f^{abc}$ and the anti-Hermitian matrices $T^a$ associated with the representations of matter fields. We call g the overall gauge coupling. We organize the matrices $T^a$ in block-diagonal form, where each block refers to a $\psi_L^I$ and its representation $R_L^I$. When we write $T^a\psi_L^I$ we understand that $T^a$ is replaced by the appropriate block. More fermions in the same irreducible representations may be present. With these conventions, the matrices $T^a$ still satisfy $[T^a, T^b] = f^{abc}T^c$ and the classical action reads

$$S_c(\phi) = -\frac{1}{4}\sum_i \zeta_i \int F^{a_i}_{\mu\nu}F^{a_i\,\mu\nu} + \int \bar\psi_L^I\,\imath\slashed{D}\,\psi_L^I, \tag{2.1}$$

where $F^{a_i}_{\mu\nu} = \partial_\mu A^{a_i}_\nu - \partial_\nu A^{a_i}_\mu + g f^{a_i b_i c_i}A^{b_i}_\mu A^{c_i}_\nu$ (no sum over this kind of index i being understood, here and in the rest of the paper) is the $G_i$ field strength, $D_\mu\psi_L^I = \partial_\mu\psi_L^I + gT^aA^a_\mu\psi_L^I$ is the fermion covariant derivative and ı is used for $\sqrt{-1}$ to avoid confusion with the index i. The parameters $\zeta_i$ could be normalized to 1, but for future uses it is convenient to keep them free, because they are renormalized by poles in ε.
Analogous parameters in front of the fermionic kinetic terms are not necessary.
To keep the presentation simple we make some simplifying assumptions that do not restrict the validity of our arguments. Specifically, we do not include right-handed fermions and scalar fields, and assume that the groups G i are non-Abelian, so there is no renormalization mixing among gauge fields, even when more copies of the same simple group are present. In Sect. 8 we explain how to relax these assumptions and cover the most general Abelian and non-Abelian perturbatively unitary renormalizable gauge theories coupled to matter.
Let us briefly recall the Batalin-Vilkovisky formalism for general gauge theories [15-17]. The classical fields $\phi = \{A^a_\mu, \psi_L^I, \bar\psi_L^I\}$, together with the ghosts C, the antighosts $\bar C$ and the Lagrange multipliers B for the gauge fixing, are collected into the set of fields $\Phi^\alpha = \{A^a_\mu, C^a, \bar C^a, B^a, \psi_L^I, \bar\psi_L^I\}$. An external source $K_\alpha$ with opposite statistics is associated with each $\Phi^\alpha$, and coupled to the $\Phi^\alpha$ transformations $R^\alpha(\Phi, g)$.
If X and Y are functionals of Φ and K their antiparentheses are defined as

$$(X,Y) \equiv \int\left(\frac{\delta_r X}{\delta\Phi^\alpha}\frac{\delta_l Y}{\delta K_\alpha} - \frac{\delta_r X}{\delta K_\alpha}\frac{\delta_l Y}{\delta\Phi^\alpha}\right),$$

where the integral is over spacetime points associated with repeated indices. The master equation (S, S) = 0 must be solved with the "boundary condition" $S(\Phi, K) = S_c(\phi)$ at $C = \bar C = B = K = 0$, where $S_c(\phi)$ is the classical action (2.1). The solution S(Φ, K) is the action we start with to quantize the theory.
In the model we are considering the gauge algebra closes off shell, so there exists a variable frame where S(Φ, K) is linear in K. The non-gauge-fixed solution of the master equation is

$$\bar S(\Phi,K) = S_c(\phi) + S_K(\Phi,K),$$

where the functional

$$S_K(\Phi,K) = -\int R^\alpha(\Phi,g)K_\alpha = -\int (D_\mu C^a)K^{\mu a} + \frac{g}{2}\int f^{abc}C^bC^cK_C^a - \int B^aK_{\bar C}^a + g\int\left(\bar\psi_L^I T^aC^aK_\psi^I + \bar K_\psi^I T^aC^a\psi_L^I\right)$$

collects the symmetry transformations of the fields, $D_\mu C^a = \partial_\mu C^a + g f^{abc}A^b_\mu C^c$ being the covariant derivative of the ghosts. The gauge-fixed solution of the master equation reads

$$S(\Phi,K) = \bar S(\Phi,K) + (S_K,\Psi),$$

where Ψ(Φ) is the "gauge fermion", a functional of ghost number −1 that collects the gauge-fixing conditions. For convenience, we choose standard linear gauge-fixing conditions and write

$$\Psi(\Phi) = \sum_i\int \bar C^{a_i}\left(\partial^\mu A^{a_i}_\mu + \frac{\xi_i}{2}B^{a_i}\right),$$

where $\xi_i$ are gauge-fixing parameters.

The naïve D-dimensional continuation of the action (2.1) is not well regularized, because chiral fermions do not have good propagators. To overcome this difficulty, we proceed as follows. As usual, we split the D-dimensional spacetime manifold $\mathbb{R}^D$ into the product $\mathbb{R}^4\times\mathbb{R}^{-\varepsilon}$ of ordinary four-dimensional spacetime $\mathbb{R}^4$ times a residual (−ε)-dimensional evanescent space $\mathbb{R}^{-\varepsilon}$. Spacetime indices μ, ν, ... of vectors and tensors are split into bar indices $\bar\mu, \bar\nu, \ldots$, which take the values 0, 1, 2, 3, and formal hat indices $\hat\mu, \hat\nu, \ldots$, which denote the $\mathbb{R}^{-\varepsilon}$ components. For example, the momenta $p^\mu$ are split into pairs $p^{\bar\mu}$, $p^{\hat\mu}$, or equivalently $\bar p^\mu$, $\hat p^\mu$. The flat-space metric $\eta_{\mu\nu} = \mathrm{diag}(1, -1, \ldots, -1)$ is split into $\eta_{\bar\mu\bar\nu} = \mathrm{diag}(1, -1, -1, -1)$ and $\eta_{\hat\mu\hat\nu} = -\delta_{\hat\mu\hat\nu}$. When we contract evanescent components we use the metric $\eta_{\hat\mu\hat\nu}$, so for example $\hat p^2 = p^{\hat\mu}\eta_{\hat\mu\hat\nu}p^{\hat\nu}$. We assume that the continued γ matrices $\gamma^\mu$ satisfy the continued Dirac algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. We define $\gamma_5 = \imath\gamma^0\gamma^1\gamma^2\gamma^3$, $P_L = (1 - \gamma_5)/2$, $P_R = (1 + \gamma_5)/2$ and the charge-conjugation matrix $\mathcal{C} = -\imath\gamma^0\gamma^2$ in the usual fashion. Full SO(1, D − 1) invariance is lost in most expressions, replaced by SO(1, 3) × SO(−ε) invariance.
The action (2.1) gives the fermion propagator $P_L(\imath/\bar{\slashed{p}})P_R$, which involves only the four-dimensional components $\bar p^\mu$ of momenta. Therefore, it does not fall off in all directions of integration for p → ∞. Applying the rules of the dimensional regularization, fermion loops integrate to zero. To provide fermions with correct propagators we introduce right-handed partners $\psi_R^I$ of the fields $\psi_L^I$, which decouple in four dimensions and are inert under every gauge transformation. We include $\psi_R$ and $\bar\psi_R$ into the set of fields Φ. It is not necessary to introduce sources K for them.
Specifically, we start from the regularized classical action, which is the sum of the unregularized classical action (2.1) plus a correction $S_{LR}$ involving the partners $\psi_R^I$ and depending on constants $\varsigma_{IJ}$ that form an invertible matrix ς. The only nontrivial off-diagonal entries of ς (and of all the matrices $M_{IJ}$ we are going to meet in this paper) are those that mix equivalent irreducible representations $R_L^I$. The reason why the matrix ς is kept free is that later on it will help us reabsorb the renormalization constants of $\psi_L^I$, since $S_{LR}$ is nonrenormalized (see below).
Using the polar decomposition, we can write $\varsigma = U_R^\dagger D U_L$, where $U_L$ and $U_R$ are unitary matrices and D is a positive-definite diagonal matrix. In the basis where ς is replaced by its diagonal form $D \equiv \mathrm{diag}(\varsigma_I)$, the propagators of the Dirac fermions $\psi^I = \psi_L^I + \psi_R^I$ are those of formula (2.7), and coincide with the usual propagators for $\varsigma_I = 1$.

Next, observe that $(S_K, S_K) = 0$ in arbitrary D. The regularized gauge-fixed action is (up to an extension that will be discussed later)

$$S_{r0}(\Phi,K) = S_c(\phi) + S_{LR} + (S_K,\Psi) + S_K(\Phi,K) \tag{2.8}$$

and satisfies $(S_{r0}, S_{r0}) = O(\varepsilon)$, where "O(ε)" is used to denote any expression that vanishes in four dimensions. We have used $P_R\,\slashed{\partial}\,P_R = P_R\,\hat{\slashed{\partial}}\,P_R$ and a similar relation with R → L. Observe that $S_{r0}$ is invariant under the global symmetry transformations of the group G.
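The projector identity quoted above, and the earlier statement that the chiral propagator involves only the four-dimensional momenta, both follow from the definition of $\gamma_5$ and the continued Dirac algebra. The short derivation below is a sketch added for illustration. It assumes $\bar\psi_L = \bar\psi P_R$, which holds for a Hermitian $\gamma_5$ that anticommutes with $\gamma^0$.

```latex
% Sketch of the chirality identities (requires amsmath and the slashed package).
% gamma_5 is the product of the four physical gamma matrices, so a physical
% gamma^{\bar\mu} anticommutes with three of its factors and commutes with one,
% while an evanescent gamma^{\hat\mu} anticommutes with all four.
\begin{align}
  \{\gamma^{\bar\mu},\gamma_5\} &= 0, &
  [\gamma^{\hat\mu},\gamma_5] &= 0, \\
  P_R\,\gamma^{\bar\mu} &= \gamma^{\bar\mu}P_L, &
  P_R\,\gamma^{\hat\mu} &= \gamma^{\hat\mu}P_R .
\end{align}
% Using P_L P_R = 0, P_L^2 = P_L and P_R^2 = P_R:
\begin{align}
  P_R\,\gamma^\mu P_L &= \gamma^{\bar\mu}P_L
     &&\Rightarrow\quad
     \bar\psi_L\,\imath\slashed{\partial}\,\psi_L
       = \bar\psi_L\,\imath\bar{\slashed{\partial}}\,\psi_L ,\\
  P_R\,\gamma^\mu P_R &= \gamma^{\hat\mu}P_R
     &&\Rightarrow\quad
     P_R\,\slashed{\partial}\,P_R = P_R\,\hat{\slashed{\partial}}\,P_R .
\end{align}
```

The first implication shows why the kinetic term of a purely left-handed fermion does not see the evanescent momenta; the second shows why terms that couple $\bar\psi_L$ to $\psi_R$ through one derivative are evanescent.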
Given a (dimensionally) regularized classical action S(Φ, K), the regularized generating functionals Z and W are defined by the formulas

$$Z(J,K) = \int[\mathrm{d}\Phi]\,\exp\left(\imath S(\Phi,K) + \imath\int\Phi^\alpha J_\alpha\right) = \exp\,\imath W(J,K), \tag{2.10}$$

and the generating functional $\Gamma(\Phi, K) = W(J, K) - \int\Phi^\alpha J_\alpha$ of one-particle irreducible diagrams is the Legendre transform of W(J, K) with respect to J, where the sources K act as spectators. Often it is necessary to pay attention to the action used to define averages. We denote the averages $\langle\cdots\rangle$ defined by the action S as $\langle\cdots\rangle_S$ (at $J_\alpha = 0$). The anomaly functional is

$$\mathcal{A}(\Phi,K) = (\Gamma,\Gamma) = \langle (S,S)\rangle_S \tag{2.11}$$

and collects the set of one-particle irreducible correlation functions containing one insertion of (S, S). The last equality of (2.11) can be proved by making the change of field variables $\Phi^\alpha \to \Phi^\alpha + \theta(S, \Phi^\alpha)$ inside the functional integral (2.10), where θ is a constant anticommuting parameter.
The proof is recalled in Appendix B, together with comments on the meaning of the formula. No one-particle irreducible diagrams can be constructed with external legs $\bar\psi_R$ or $\psi_R$, because $\bar\psi_R$ and $\psi_R$ do not appear in any vertices. Thus, the dependence of the total Γ functional on $\psi_R$ and $\bar\psi_R$ is entirely contained in its tree-level part.

We have anticipated that the action (2.8) is not the final dimensionally regularized action we are going to use. Before moving to the appropriate extension $S_r$, we must describe the counterterms generated by $S_{r0}$, list a number of properties that may be used to restrict the $S_{r0}$ extensions and point out some subtleties concerning the dimensional regularization.
First, observe that the counterterms are B, $K_B$ and $K_{\bar C}$ independent. Indeed, the source $K_B$ appears nowhere in $S_{r0}$, while $K_{\bar C}$ appears only in $-\int B K_{\bar C}$. Moreover, the gauge-fixing conditions are linear in the fields, and the B-dependent terms of $S_{r0}$ are at most quadratic in Φ. Therefore no nontrivial one-particle irreducible diagrams can have external B legs.
Second, the action $S_{r0}$ does not depend on the antighosts $\bar C^{a_i}$ and the sources $K^{\mu a_i}$ separately, but only through the combinations $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$. The functional Γ must share the same property. Indeed, an antighost external leg actually carries the structure $\partial^\mu\bar C^{a_i}$, since all vertices containing antighosts do so. Given a diagram with $K^{\mu a_i}$ or $\partial^\mu\bar C^{a_i}$ on external legs, we can construct almost identical diagrams by just replacing one or more legs $K^{\mu a_i}$ with $\partial^\mu\bar C^{a_i}$, or vice versa.
Third, power counting and ghost-number conservation ensure that the counterterms are linear in the sources K. Using square brackets to denote dimensions in units of mass, we have $[K^{\mu a}] = [K_C^a] = 2$ and $[K_\psi] = 3/2$. These sources have negative ghost numbers. Therefore, the dimension of a term that is more than linear in K and has vanishing ghost number necessarily exceeds 4.
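The counting behind the third property can be made explicit as follows. This is a sketch under assumptions that are standard in the Batalin-Vilkovisky formalism but are not spelled out in the text: ghost numbers gh(C) = 1, gh($K^{\mu a}$) = gh($K_\psi$) = −1, gh($K_C$) = −2, and ghost dimension [C] = 1 (the value obtained if the source term $K^{\mu a}D_\mu C^a$ has dimension 4).

```latex
% Assumed here (not stated in the text): gh(C) = 1, gh(K^{\mu a}) = -1,
% gh(K_\psi) = -1, gh(K_C) = -2, and [C] = 1. Requires amsmath.
% Only the ghosts C carry positive ghost number, so every source must be
% accompanied by enough factors of C to reach ghost number zero.
\begin{align}
  [K^{\mu a}\,C] &= 2 + 1 = 3, &
  [K_C^a\,C\,C]  &= 2 + 2 = 4, &
  [K_\psi\,C]    &= \tfrac{3}{2} + 1 = \tfrac{5}{2}.
\end{align}
% A term with two sources and vanishing ghost number has dimension at least
\begin{equation}
  2\times\tfrac{5}{2} = 5 > 4 ,
\end{equation}
% so it cannot appear among the counterterms.
```

The lowest bound comes from two fermionic sources; any other pair of sources gives a larger dimension.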

Structure of the dependence on the overall gauge coupling
It is useful to single out how the functionals depend on the overall gauge coupling g. The tree-level functionals we work with have the g structure

$$X_{\text{tree}}(\Phi,K,g) = \frac{1}{g^2}X'_{\text{tree}}(g\Phi, gK). \tag{2.12}$$

If the action satisfies this condition at the tree level, then the renormalized action and the Γ functional have the g structure

$$X(\Phi,K,g) = \sum_{L\geq 0} g^{2(L-1)}X_L(g\Phi, gK), \tag{2.13}$$

where $X_L$ collects the L-loop contributions. Basically, there is an additional factor $g^2$ for every loop. Indeed, when the action is of the form (2.12), every vertex is multiplied by a power $g^{N-2}$, where N is the number of its Φ plus K legs. Then, a one-particle irreducible diagram with L loops, I internal legs, E external legs and $v_i$ vertices with i legs, $V = \sum_i v_i$ vertices in total, is multiplied by

$$g^{\sum_i (i-2)v_i} = g^{2I+E-2V} = g^{E}\,g^{2(L-1)},$$

having used L − I + V = 1 and $\sum_i i\,v_i = 2I + E$. We see that for L ≥ 1 we have one power of g for each external leg and a residual factor $g^{2(L-1)}$, in agreement with (2.13).
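As a quick check of this counting, one can apply it to a few simple gauge-field diagrams. The examples below are chosen here for illustration and do not appear in the text; they use only the rule that a vertex with N legs carries $g^{N-2}$.

```latex
% Illustrative check of the factor g^{E} g^{2(L-1)} (examples not from the text).
% Requires amsmath. I = internal legs, V = vertices, L = I - V + 1 loops.
\begin{align}
  &\text{two cubic vertices } (E=2,\ I=2,\ V=2,\ L=1): &
     g\cdot g &= g^{2} = g^{E}g^{2(L-1)},\\
  &\text{one quartic vertex } (E=2,\ I=1,\ V=1,\ L=1): &
     g^{2} &= g^{E}g^{2(L-1)},\\
  &\text{two quartic vertices } (E=2,\ I=3,\ V=2,\ L=2): &
     g^{2}\cdot g^{2} &= g^{4} = g^{E}g^{2(L-1)}.
\end{align}
% In each case the total number of vertex legs equals 2I + E:
% 3+3 = 4+2, 4 = 2+2, 4+4 = 6+2.
```

The two one-loop diagrams carry the same power of g, as expected, and the two-loop diagram carries one extra factor $g^2$.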

Properties of the dimensional regularization of chiral theories
Now we recall a few properties of the dimensional regularization of chiral theories, which are important for the rest of our analysis. It is well known that divergences are just poles in ε. Instead, the terms that disappear when D → 4, called "evanescences", may be of two types: formal or analytic. Analytically evanescent terms, briefly denoted by "aev", are those that factorize at least one ε, such as $\varepsilon F_{\mu\nu}F^{\mu\nu}$, $\varepsilon\bar\psi_L\imath\slashed{D}\psi_L$, etc. Formally evanescent terms, briefly denoted by "fev", are those that formally disappear when D → 4, but do not factorize powers of ε. They are built with the tensor $\delta_{\hat\mu\hat\nu}$ and the evanescent components $\hat x$, $\hat p$, $\hat\partial$, $\hat\gamma$, $\hat A$ of coordinates, momenta, derivatives, gamma matrices and gauge fields. Examples are $\bar\psi_L\imath\hat{\slashed{\partial}}\psi_R$, $(\partial_{\hat\mu}A^a_\nu)(\partial^{\hat\mu}A^{\nu a})$, etc.
The distinction between formally evanescent and analytically evanescent expressions is to some extent ambiguous. Consider for example a basis $\bar\psi_1\gamma^{\rho_1\cdots\rho_k}\psi_2$ of fermion bilinears, where $\psi_1$, $\psi_2$ can be $\psi_L$ or $K_\psi$, and $\gamma^{\rho_1\cdots\rho_k}$ is the completely antisymmetric product of $\gamma^{\rho_1}, \ldots, \gamma^{\rho_k}$. In dimensional regularization these bilinears are nonvanishing for every k, and they are evanescent for k > 4. We have several ways to rearrange the products of two or more fermion bilinears by using Fierz identities, and such rearrangements can convert formally evanescent objects into analytically evanescent ones. For example, given some spinors $\psi_n$, n = 1, 2, 3, 4, we can expand the matrix $\psi_2\bar\psi_3$ in the basis made of $\gamma^{\rho_1\cdots\rho_k}$, k = 0, ..., ∞. The coefficients of the expansion are proportional to the bilinears $\bar\psi_3\gamma_{\rho_1\cdots\rho_k}\psi_2$ and depend on D through f(D) = tr[1]. Using this identity inside a product of two bilinears we find relations such as (2.14), which basically have the form "fev = fev + aev". The existence of such relations poses some problems, which we now describe.
Feynman diagrams may generate "divergent evanescences", briefly denoted by "divev". They are made of products between poles and formal evanescences, such as $(\partial_{\hat\mu}A^a_\nu)(\partial^{\hat\mu}A^{\nu a})/\varepsilon$. The theorem of locality of counterterms demands that we renormalize divergent evanescences away, together with ordinary divergences (see below). However, this makes sense only if we can define divergent evanescences unambiguously, which could be problematic due to the observations made above. For example, if we multiply both sides of formula (2.14) by 1/ε we get a relation of the type "divev = finite + divev".
Ultimately, the problem does not arise in the theories we are considering here, for the following reasons. Both the classical action and the counterterms are local functionals, equal to integrals of local functions of dimension 4. In the paper we also show that the first nonvanishing contributions to the anomaly functional (2.11) are local, equal to integrals of local functions of dimension 5. A fermion bilinear $\bar\psi_1\gamma^{\rho_1\cdots\rho_k}\psi_2$ has dimension 3, so power counting implies that the classical action, as well as counterterms and local contributions to anomalies, cannot contain products of two or more fermion bilinears. Therefore, they are not affected by the ambiguities discussed above. Those ambiguities can only occur in the convergent sector of the theory, where they are harmless, since both analytic and formal evanescences must eventually disappear.
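The power-counting step used here is elementary and can be written out explicitly. The sketch below assumes the canonical dimension 3/2 for the fermion fields, together with $[K_\psi] = 3/2$ as stated earlier in the text.

```latex
% Dimension counting for products of fermion bilinears. Requires amsmath.
% Assumes [\psi_L] = 3/2 (canonical) and [K_\psi] = 3/2.
\begin{align}
  \big[\bar\psi_1\gamma^{\rho_1\cdots\rho_k}\psi_2\big]
     &= \tfrac{3}{2} + \tfrac{3}{2} = 3 ,\\
  \big[(\bar\psi_1\gamma^{\rho_1\cdots\rho_k}\psi_2)\,
       (\bar\psi_3\gamma_{\sigma_1\cdots\sigma_l}\psi_4)\big]
     &= 6 > 5 > 4 .
\end{align}
% Local functions of dimension 4 (action, counterterms) or 5 (local
% contributions to anomalies) therefore contain at most one fermion bilinear.
```

Since Fierz rearrangements act only on products of two or more bilinears, they cannot affect these local functionals.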
Thanks to the properties just mentioned, it is meaningful to require that the action $S_{r0}$, as well as its extensions constructed in the rest of this paper, do not contain analytically evanescent terms. More precisely, the coefficient of every Lagrangian term should be equal to its four-dimensional limit. This request is important to avoid unwanted simplifications between ε factors and ε poles, when divergent parts are extracted from bilinear expressions such as (Γ, Γ). It can be considered part of the definition of the minimal subtraction scheme. For the same reason, we must be sure that the antiparentheses do not generate extra factors of ε, or poles in ε, which is proved below.
Finite nonevanescent contributions will be called "nev". We need a convention to define these quantities precisely, otherwise they can mix with evanescent terms. For example, we need to state whether $\bar C\partial^2 C$, or $\bar C\bar\partial^2 C$, or a combination such as $(1 + \alpha\varepsilon)\bar C\partial^2 C + \beta\bar C\hat\partial^2 C$, where α and β are constants, is taken to be nonevanescent. The convention we choose is that nonevanescent terms are maximally symmetric with respect to the D-dimensional Lorentz group. For the arguments of this paper we just need to focus on local functionals contributing to counterterms and anomalies. In the case of counterterms the nonevanescent terms are those appearing in the action $S_{r0}$, which are SO(D)-invariant when chiral fermions are switched off. In the case of anomalies the nonevanescent terms are SO(D)-invariant unless they contain the tensor $\varepsilon^{\mu\nu\rho\sigma}$ or chiral fermions.

Evanescent extension of the classical action
It is convenient to extend the action $S_{r0}$ by adding all formally evanescent terms that have the features of divergent evanescences, multiplied by independent parameters η. In this way it is possible to subtract divergent evanescences by means of η redefinitions. Denoting the correction collecting such terms with $S_{\text{ev}}$, the extended action reads

$$S_r = S_{r0} + S_{\text{ev}}. \tag{2.15}$$

Then the generating functionals (2.10), the functional Γ and the anomaly functional $\mathcal{A}$ of (2.11) are turned into those defined by $S_r$.
Each term of $S_{\text{ev}}$ is the integral of a monomial of dimension 4, globally invariant under G. It is not necessarily gauge invariant, since gauge invariance is violated away from four dimensions. Moreover, $S_{\text{ev}}$ is B, $K_B$, $K_{\bar C}$, $\bar\psi_R$ and $\psi_R$ independent, linear in K and depends on $\bar C^{a_i}$ and the sources $K^{\mu a_i}$ only through the combinations $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$. It is also independent of $K_C$, $K_\psi$, $\bar K_\psi$, $\psi_L$ and $\bar\psi_L$, because no formally evanescent terms can be built with these objects. By power counting and ghost-number conservation the terms proportional to $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$ are independent of matter fields. In the end, $S_{\text{ev}}$ has the form (2.16): it is the sum of a functional $S_{c\,\text{ev}}$ of the gauge fields and terms proportional to $K^{\mu a_i} + \partial^\mu\bar C^{a_i}$ that depend only on the gauge fields and the ghosts.

We can further restrict $S_{\text{ev}}$. Indeed, $S_{r0}$ satisfies (2.12). Therefore, the divergent evanescences have the form (2.13) with L ≥ 1, and they can be renormalized with an $S_{\text{ev}}$ of the form (2.12). Precisely, we can define the parameters η so that $S_{\text{ev}}$ is linear in η and its g dependence has the form

$$S_{\text{ev}}(\Phi,K,g,\eta) = \frac{1}{g^2}S'_{\text{ev}}(g\Phi, gK, \eta), \tag{2.17}$$

so $S_r$ also satisfies (2.12).
Basically, the terms of $S_{\text{ev}}$ are similar to those appearing in $S_{r0}$, but contain some evanescent components of momenta and/or gauge fields, and they are broken into gauge noninvariant pieces. Among the contributions to $S_{c\,\text{ev}}$, the terms multiplied by $\eta_{3i}, \ldots, \eta_{8i}$ are quadratic and modify the propagators of the gauge fields $A^{a_i}_\mu$ and the Lagrange multipliers $B^{a_i}$. We do not need to report here the modified propagators, which are rather involved. We have checked, with the help of a computer program, that they satisfy the requirements we need. In particular, if k denotes their momentum, (i) they are regular when any evanescent components $\hat k$ of k are set to zero; (ii) when the propagators are differentiated with respect to any components $\bar k$, $\hat k$, or to parameters of positive dimensions (such as $\eta_{8i}$), their behaviors for large $k^2$ improve by at least one power; (iii) they have a regular infrared behavior, which corresponds to the decoupling of the evanescent components $A^{a_i}_{\hat\mu}$. Finally, their denominators are SO(1, 3) × SO(−ε) scalars, like the denominators of the fermion propagators (2.7).

Structure of correlation functions
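Now we analyze the evaluation of correlation functions. We use the same notation for a function and its Fourier transform, since no confusion is expected to arise. In momentum space, the terms of the classical action can be written in the form

$$\int\prod_{i=1}^{n+r}\frac{\mathrm{d}^Dk_i}{(2\pi)^D}\;\Phi^{\alpha_1}(k_1)\cdots\Phi^{\alpha_n}(k_n)\,K_{\beta_1}(k_{n+1})\cdots K_{\beta_r}(k_{n+r})\;T^{\beta_1\cdots\beta_r\,\mu_1\cdots\mu_p}_{\alpha_1\cdots\alpha_n}\,G_{\mu_1\cdots\mu_p}(k_1,\ldots,k_{n+r}), \tag{2.20}$$

where $k_1, \ldots, k_{n+r}$ are the external momenta. The constants $T^{\beta_1\cdots\beta_r\,\mu_1\cdots\mu_p}_{\alpha_1\cdots\alpha_n}$ collect all tensors $\eta_{\mu\nu}$, $\varepsilon^{\mu\nu\rho\sigma}$, $\delta_{\hat\mu\hat\nu}$, γ matrices, structure constants $f^{abc}$ and matrices $T^a$. In particular, every projector onto hat components of momenta, fields, and sources is moved inside $T^{\beta_1\cdots\beta_r\,\mu_1\cdots\mu_p}_{\alpha_1\cdots\alpha_n}$. Momentum conservation ensures that

$$G_{\mu_1\cdots\mu_p}(k_1,\ldots,k_{n+r}) = (2\pi)^D\delta^{(D)}\Big(\sum_{i=1}^{n+r}k_i\Big)\,\tilde G_{\mu_1\cdots\mu_p}(k_1,\ldots,k_{n+r-1}), \tag{2.21}$$

where the tensors $\tilde G_{\mu_1\cdots\mu_p}$ are polynomials that depend on n + r − 1 external momenta.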
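Propagators can be decomposed as sums of terms of the form (2.22), namely constant tensors times ratios between tensor polynomials of the momentum and scalar polynomial denominators. The denominators are SO(1, 3) × SO(−ε) scalars, rather than SO(1, D − 1) scalars, due to the parameters $\varsigma_I$ of (2.7) and the parameters η provided by the extension $S_{r0} \to S_r$ discussed above.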
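The Feynman diagrams of Γ and $\mathcal{A}$ have structures inherited from the structures (2.20) and (2.22) of the vertices and propagators. They may be written as sums of contributions of the form (2.20), with tensors $G_{\mu_1\cdots\mu_p}$ that satisfy (2.21), but now $\tilde G_{\mu_1\cdots\mu_p}$ are integrals over internal momenta p of rational functions,

$$\tilde G_{\mu_1\cdots\mu_p}(k) = \int\frac{\mathrm{d}^Dp}{(2\pi)^D}\,\frac{N_{\mu_1\cdots\mu_p}(p,k)}{D(p,k)},$$

where the polynomial $N_{\mu_1\cdots\mu_p}(p, k)$ appearing in the numerator is an SO(1, D − 1) tensor, and the polynomial D(p, k) appearing in the denominator is an SO(1, 3) × SO(−ε) scalar. At $\varsigma_{IJ} = \delta_{IJ}$, η = 0 the integrals $\tilde G_{\mu_1\cdots\mu_p}$ are full SO(1, D − 1) tensors. Note that $\tilde G_{\mu_1\cdots\mu_p}$ have a regular limit when the evanescent components $\hat k$ of the external momenta k tend to zero. For example, an integral whose numerator contains $(\hat p^2)^2$ can be written as $\delta_{\hat\mu\hat\nu}\delta_{\hat\rho\hat\sigma}$ contracted with the integral of $p^\mu p^\nu p^\rho p^\sigma$ over the same denominator. Then we include $\delta_{\hat\mu\hat\nu}\delta_{\hat\rho\hat\sigma}$ inside the constants $T^{\beta_1\cdots\beta_r\,\mu_1\cdots\mu_p}_{\alpha_1\cdots\alpha_n}$. The remaining completely symmetric tensor $\tilde G_{\mu\nu\rho\sigma}(k, m)$ is an integral with the properties listed above.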
It may be useful to write (2.20) in the more compact form (2.25) and then organize the expressions L μ 1 ···μ p ( , K ) by using the basis of fermion bilinearsψ 1 γ ρ 1 ···ρ k ψ 2 , and explicitly evaluate traces of spinor indices and contractions of Lorentz indices. At the end, all Lorentz indices appear in gauge fields, fermion bilinears, the tensor ε μνρσ (if present), and G μ 1 ···μ p , and they are contracted among one another, possibly after projections onto bar or hat components. It is also convenient to expand are polynomials constructed with η μν , ε μνρσ , δμν and the n + r − 1 independent momenta k. Then we can write the contribution (2.25) to or A as are also SO(1, 3) × SO(−ε) scalars. After these operations, the Lorentz indices appear in gauge fields, fermion bilinears, momenta k and the tensor ε μνρσ . They are contracted among themselves, possibly after projections onto bar or hat components. At this point, traces and index contractions must be evaluated explicitly, because they may produce factors ε, which are important for the expansions and limits that we are going to define. The analytic expansion around ε = 0 of (2.25) or (2.27) is defined by expanding the scalars G i (k) in powers of ε without affecting the evanescent components of external momenta. The analytic limit is the order zero of the analytic expansion, once the poles in ε have been subtracted away. The formal limit ε → 0 is the limit where the evanescent components of gauge fields, external momenta and fermion bilinears are dropped. The limit ε → 0 is the analytic limit followed by the formal limit.
For the reasons explained above, the analytic and formal limits may be ambiguous in the convergent sector of the theory, but they are unambiguous in the divergent sector. More importantly, the limit $\varepsilon\to 0$ is always unambiguous. Since the tensors $G^{\mu_1\cdots\mu_p}$ are regular when any evanescent components $\hat k$ of the external momenta $k$ are set to zero, the formal limits of (2.25) and (2.27) are well defined.
When we use the expressions "O(ε)" or "ev" we mean any quantity that vanishes in the limit ε → 0. Clearly, ev = aev + fev.

Locality of counterterms
Now we comment on the locality of counterterms. The forms of the regularized propagators ensure that a sufficient number of derivatives with respect to the physical components $\bar k$ and/or the evanescent components $\hat k$ of the external momenta $k$ kills the overall divergences of Feynman diagrams. If we subtract the divergent evanescences, together with the ordinary divergences, up to some order $n$, then both ordinary divergences and divergent evanescences of order $n+1$ are polynomial in $\bar k$ and $\hat k$. The extension $S_r=S_{r0}+S_{\text{ev}}$ of $S_{r0}$ given in (2.15) allows us to subtract all of them in a way that is efficient for the proof of the Adler-Bardeen theorem.
To complete the analysis it is useful to describe what happens if for some reason we do not subtract some divergent evanescences. We use the abbreviations "loc" and "nl" to denote local and nonlocal contributions, respectively. At one loop we miss counterterms of the form $\hbar\,\dfrac{\text{loc fev}}{\varepsilon}$. (2.28) Consequently, at two loops we also miss counterterms for subdivergences. Using the vertex (2.28) inside one-loop diagrams we get the contributions listed in (2.29). The first three terms are generated when the formal evanescence enters the diagram, is converted into a factor $\varepsilon$ and simplifies a pole in $\varepsilon$. We refer to this occurrence (which is the basic mechanism that originates potential anomalies) symbolically as (2.30). The last three terms of (2.29) describe what happens when the formal evanescence remains outside the diagram.
The first term of (2.29) must be subtracted, so the missing counterterms at two loops are those listed in (2.31). Even if the last term of this list is nonlocal, we still have no problem, since the residues of the poles in $\varepsilon$ are formally evanescent. However, when we use the first and third terms of (2.31) inside one-loop diagrams, the formal evanescence can simplify another pole, by the mechanism (2.30), and give $\hbar^3\,\dfrac{\text{nl nev}}{\varepsilon}+\hbar^3\,\dfrac{\text{nl fev}}{\varepsilon^2}+\hbar^3\,\dfrac{\text{nl fev}}{\varepsilon}+\hbar^3\,\text{nl}$ plus local poles. We see that nonlocal, nonevanescent divergences appear at three loops. These are only partially compensated by analogous contributions originating from the subtraction of the first term of (2.29). Those due to the first term of (2.31), in particular, do not seem to disappear. On the other hand, it is safe to subtract the divergent evanescences order by order, together with nonevanescent divergences. In this paper we adopt this prescription.

Properties of the antiparentheses
Now we study how divergences and evanescences propagate through the antiparentheses. Indeed, in the proofs of renormalizability to all orders and the Adler-Bardeen theorem, it is necessary to extract divergent parts of antiparentheses such as $\mathcal{A}=(\Gamma,\Gamma)$ or $(\Gamma,\mathcal{A})$. This operation is not as simple as it sounds, because we must be sure that the antiparentheses themselves do not generate poles or factors of $\varepsilon$, in order to be able to say that, for example, the divergent part of $(S_r,\Gamma^{(1)})$ is equal to $(S_r,\Gamma^{(1)}_{\text{div}})$, where $\Gamma^{(1)}$ is the one-loop contribution to $\Gamma$ and $\Gamma^{(1)}_{\text{div}}$ is the divergent part of $\Gamma^{(1)}$. Specifically, we prove that (i) the antiparentheses $(X_{\text{conv}},Y_{\text{conv}})$ of convergent functionals $X_{\text{conv}}$ and $Y_{\text{conv}}$ are convergent; (ii) the antiparentheses $(X_{\text{conv}},Y_{\text{ev}})$ of convergent functionals $X_{\text{conv}}$ and evanescent functionals $Y_{\text{ev}}$ are evanescent; (iii) the antiparentheses $(X,Y)$ do not generate either poles in $\varepsilon$ or factors of $\varepsilon$ if $X$, $Y$ and $(X,Y)$ do not involve products of two or more fermion bilinears.
For the use we have in mind it is convenient to rephrase property (iii) more explicitly as (iii′) the antiparentheses $(X_A,Y_B)$ of functionals $X_A$ and $Y_B$ with the properties specified by their subscripts $A$ and $B$ satisfy the identities (2.32), as long as $X_A$, $Y_B$ and $(X_A,Y_B)$ do not involve products of two or more fermion bilinears.
To prove these properties it is convenient to write the antiparentheses in momentum space. We have and a similar relation obtained by exchanging and K . Let us write (2.27) for X , Y and (X, Y ) as Using (2.26) we find that the p integral of (2.33) can be readily done and gives where P is the total momentum ofG i X plus the ones ofG j Y . We see that the scalar "cores" G i of correlation functions just multiply each other in momentum space, which cannot generate new poles in ε or factors of ε.
It remains to study the relation between $L_{ij(X,Y)}$ and $L_{iX}$, $L_{jY}$. The antiparentheses can produce index contractions by means of the paired functional derivatives $\delta/\delta A_\mu$ with $\delta/\delta K_\mu$ and $\delta/\delta\psi$ with $\delta/\delta K_\psi$. Clearly, no such operations can generate poles in $\varepsilon$. This observation is sufficient to prove statements (i) and (ii).
As far as statement (iii) is concerned, we must assume that the functionals X , Y and (X, Y ) do not involve products of two or more fermion bilinears. Therefore, they are free of ambiguities of type (2.14). The contraction of Lorentz indices brought about by δ/δ A μ and δ/δK μ gives a tensor η μν with mixed indices (namely one index from X and one index from Y ). The contraction of spinorial indices brought about by δ/δψ and δ/δK ψ gives structures such as where the ρ indices come from X and the σ indices come from Y . Anticommuting the γ 's we can rearrange the indices so that ρ 1 < ρ 2 < · · · < ρ k and σ 1 < σ 2 < · · · < σ l . Reordering the indices we may get minus signs from further anticommutations or from squares of γ matrices with identical indices. In the end, we get a formula likē where the breves denote missing indices that go into the tensors η μν . Again, we get only tensors η μν with mixed indices. We recall that all Lorentz indices, possibly after projection onto bar or hat components, are contracted with gauge fields, fermion bilinears, momenta and possibly ε μνρσ , and that, by assumption, no products of two or more fermion bilinears are involved. Then it is obvious that the contractions originating by the antiparentheses cannot produce ε factors. Using these properties it is easy to check that identities (2.32) hold, so statement (iii) is also proved.
Statement (iii) also says that the antiparentheses cannot convert formal $\varepsilon$ evanescences into analytic ones. It applies, for example, to local functionals $X$ and $Y$ that are equal to the integrals of functions of dimensions $n_X, n_Y\leq 5$, such that $n_X+n_Y\leq 8$, because then $X$, $Y$ and $(X,Y)$ cannot contain products of two or more fermion bilinears. In this paper we will apply statement (iii) to the divergent contributions to $\Gamma$ and the first nonvanishing contributions to the anomaly functional $\mathcal{A}$ of (3.10).
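The dimension bounds quoted above can be checked by a short counting argument. The following is only a sketch: it assumes the standard Batalin-Vilkovisky assignment $[\Phi^\alpha]+[K_\alpha]=3$ in four dimensions and the canonical fermion dimension $3/2$, neither of which is spelled out in this section, and it writes $n_{(X,Y)}$ for the dimension of the integrand of $(X,Y)$.

```latex
% Sketch under the assumptions stated in the lead-in.
% Each term of (X,Y) pairs \delta X/\delta\Phi^\alpha with \delta Y/\delta K_\alpha
% (or vice versa), so the integrand loses [\Phi^\alpha] + [K_\alpha] = 3 units of dimension.
\begin{align}
n_{(X,Y)} &= n_X + n_Y - \bigl([\Phi^\alpha]+[K_\alpha]\bigr)
           = n_X + n_Y - 3 \;\le\; 8 - 3 = 5, \\
% A product of two fermion bilinears contains four fermion fields.
\bigl[(\bar\psi\cdots\psi)(\bar\psi\cdots\psi)\bigr] &\ge 4\cdot\tfrac{3}{2} = 6 \;>\; 5 .
\end{align}
```

With $n_X, n_Y\leq 5$ and $n_{(X,Y)}\leq 5$, none of the three functionals has room for a product of two fermion bilinears, which is the condition required by statement (iii).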

DHD regularization
The dimensional regularization alone does not provide the subtraction scheme where the cancellation of gauge anomalies is manifest to all orders. To find the right scheme, we modify the regularization technique by adding higher-derivative terms that preserve gauge invariance in $D=4$. We take the non-gauge-fixed regularized classical action (3.1), together with the definitions (3.2). The higher-derivative structures of (3.1) and (3.2) are chosen to simplify the arguments of our derivations.
We gauge fix S c by using modified gauge-fixing functions of the form and a modified gauge fermion where λ and ξ are other (dimensionless) gauge-fixing parameters. Finally, we add which differs from S ev only because the combinations The regularized gauge-fixed action reads where S K is the same as before, and satisfies where h I J (∂ 2 ) = (ς I J 6 +δ I J (∂ 2 ) 3 )/ 6 . The reason why it is useful to separate the terms proportional to the parameters η will become clear later.
It is straightforward to derive the propagators and check that the ones of gauge fields, $\langle A_\mu(k)A_\nu(-k)\rangle_0$, and the ones of ghosts, $\langle C(k)\bar C(-k)\rangle_0$, fall off as $1/(k^2)^9$ for large momenta $k$, while the propagators $\langle A(k)B(-k)\rangle_0$ fall off as $k/(k^2)^9$, and $\langle B(k)B(-k)\rangle_0$ as $1/(k^2)^8$. For example, in the "Feynman gauge" ξ i = λ = ξ = 1 at η = 0 we have The fermion propagators, on the other hand, fall off as $\slashed{p}/(p^2)^4$.
For a while we need to work at finite , where the action S is super-renormalizable. To make its superrenormalizability manifest, it is convenient to parametrize it so that the denominators cancel out. Let us first ignore the terms S ev . We define tilde fields and tilde parameters as andr i = r i . The covariant derivatives remain independent.
To cancel the denominators of the gauge-fixing sector we define C a =C a / 8 ,B a = B a / 8 andC a = C a / 8 . Finally, we define the tilde sources so the tilde map is a canonical transformation combined with a redefinition of parameters. As far as S ev is concerned, using (2.17) and the linearity in η we can write it as whereQ ( ) = 16 + λ 8 .
In the tilde parametrization the full action reads and the generating functional ( , K ) = W (J, K ) − α J α of one-particle irreducible diagrams is the Legendre transform of W (J, K ) with respect to J . Since no oneparticle irreducible diagrams with external legs ψ R ,ψ R can be constructed, the action S and the functional depend on ψ R ,ψ R in exactly the same way. The DHD-regularized anomaly functional is (3.10) When we switch to the tilde parametrization we writeZ , W ,˜ andÃ . See appendix B for the proof of the last equality of (3.10 ).
The tilde actionS is polynomial in , has properly normalized propagators and contains only parameters of nonnegative dimensions in units of mass. However, the tilde fields have negative dimensions, which in principle may jeopardize the (super)renormalizability we want to prove. Precisely, we have while [K a μ ] = [K a C ] = [K ā C ] = 10, [K B ] = 9 and [K ψ ] = 9/2. The problem is solved as follows. Since S has the form (2.12), theg structure ofS is the tilde version of (2.12). The tilde version of (2.13) ensures that the counterterms have thẽ g structure L 1g 2(L−1) F L (g˜ ,gK ), (3.11) where the L-loop local functionals F L depend polynomially on the other dimensionful parameters of the theory. Then we see that the theory is indeed super-renormalizable, because the dimensions of all productsg˜ andgK are strictly positive.

The DHD limit
The basic idea behind the DHD regularization is to "first send $\varepsilon$ to zero, then $\Lambda$ to infinity". However, we must formulate the rules of such limits more precisely, since certain caveats demand attention. We distinguish the higher-derivative theory from the final theory. The higher-derivative theory is the one defined by the classical action $S_\Lambda$ (or $\tilde S_\Lambda$, if we use the tilde parametrization), where the scale $\Lambda$ is kept fixed and treated like any other parameter, instead of a cutoff. It is super-renormalizable and regularized by the dimensional technique. Its divergences, which are poles in $\varepsilon$, are subtracted in the next section using the minimal subtraction scheme. The final theory is obtained by taking the limit $\Lambda\to\infty$ on the renormalized higher-derivative theory, after subtracting the divergences that emerge in that limit.
Having already expanded in $\varepsilon$, we may wonder what types of divergences appear in the final theory. We have products $\Lambda^{k}\ln^{k'}\Lambda$ of powers and logarithms of $\Lambda$, but we also have terms that are evanescent in $\varepsilon$ and divergent in $\Lambda$. To understand what to do with these, we distinguish two types of them, according to whether the $\varepsilon$ evanescence is analytic or formal.
(i) First, consider analytic evanescences in ε multiplied by products k ln k , such as ε 2 ln . Since we first send ε to zero, these quantities are not true divergences and must be neglected. In any case, they cannot be subtracted away, because the theorem of locality of counterterms does not apply to them. Consider for example the integral where for the purposes of our present discussion the mass m can also play the role of an external momentum. Expanding the right-hand side in powers of ε we find that the O(ε 0 ) terms, which are equal to 1 32π 2 π 2 − 2m 2 ln (iv) For completeness, we point out a fourth type of εevanescent divergences, that is to say, nonlocal contributions of type (ii), which can appear as artifacts of inconvenient manipulations. Precisely, because of the ambiguities encoded in (2.14) some quantities of type (i) can be converted into nonlocal divergences of type (ii). These conversions should just be avoided. To this purpose, it is sufficient to note that the structure (2.20) of diagrams and the expansion of the integrals G μ 1 ···μ p only generate ε-evanescent divergences of types (i), (ii) and (iii). In the event that "aev → fev conversions" of type (2.14) are accidentally applied, nonlocal divergences of type (ii) can just be ignored, because they cannot mix with the local terms belonging to the power-counting renormalizable sector and they are resummable into contributions of type (i).
To summarize, the $\Lambda$ divergences are equal to $\Lambda^{k}\ln^{k'}\Lambda$ times local monomials of the fields, the sources and their derivatives. From the point of view of the dimensional regularization, those monomials may be nonevanescent or formally evanescent, and their coefficients must be evaluated in the analytic limit $\varepsilon\to 0$.
We can thus define the procedure with which we renormalize the final theory and define the physical quantities. We call it the DHD limit. We still organize the contributions to $\Gamma$ and $\mathcal{A}$ in the form (2.20). Referring to (2.25) and (2.27), the DHD limit is made of the analytic limit $\varepsilon\to 0$, followed by the limit $\Lambda\to\infty$, followed by the formal limit $\varepsilon\to 0$. We also have the DHD expansion, that is to say, the analytic expansion around $\varepsilon=0$ followed by the expansion around $\Lambda=\infty$.
The three steps that define the DHD limit are unambiguous in the divergent sector, which does not contain products of two or more fermion bilinears. Instead, the first and third steps are ambiguous in the convergent sector. What is important is that the DHD limit as a whole is also unambiguous in the convergent sector.
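The order of operations in the DHD limit can be mimicked by a toy bookkeeping of term types. The sketch below is purely illustrative and not part of the proof: the labels `pole`, `eps0`, `aev` (behavior in ε), `div`, `conv`, `ev` (behavior in Λ) and the boolean flag for a formally evanescent factor are hypothetical names introduced here. It assumes that poles in ε are subtracted first, that Λ divergences multiplying analytic ε evanescences are neglected rather than subtracted, and that the remaining Λ divergences are subtracted before the limit is taken.

```python
from itertools import product

# Hypothetical labels for this illustration only.
EPS_KINDS = ("pole", "eps0", "aev")   # pole in eps, finite, analytically evanescent
LAM_KINDS = ("div", "conv", "ev")     # Lambda-divergent, convergent, evanescent


def all_terms():
    """Enumerate every term type (eps kind, formally evanescent?, Lambda kind)."""
    return [(e, f, l) for e, f, l in product(EPS_KINDS, (False, True), LAM_KINDS)]


def subtract_eps_poles(terms):
    """Minimal subtraction of the poles in eps at fixed Lambda."""
    return [t for t in terms if t[0] != "pole"]


def subtract_lambda_divergences(terms):
    """Subtract Lambda divergences whose coefficient survives the analytic limit.

    Terms that are analytically evanescent in eps are left alone: they are
    not true divergences, because eps is sent to zero first.
    """
    return [t for t in terms if not (t[0] == "eps0" and t[2] == "div")]


def dhd_limit(terms):
    """Analytic limit eps -> 0, then Lambda -> infinity, then formal limit."""
    step1 = [t for t in terms if t[0] == "eps0"]     # drops analytic evanescences
    assert all(t[2] != "div" for t in step1), "unsubtracted Lambda divergence"
    step2 = [t for t in step1 if t[2] == "conv"]     # drops Lambda-evanescent terms
    step3 = [t for t in step2 if not t[1]]           # drops formal evanescences
    return step3


remaining = subtract_lambda_divergences(subtract_eps_poles(all_terms()))
survivors = dhd_limit(remaining)
```

In this toy counting ten term types remain after the two subtractions, and only the finite, nonevanescent, Λ-convergent type survives the limit.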
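It is useful to recapitulate the DHD limit in symbolic form. We first expand around $\varepsilon=0$ at $\Lambda$ fixed, and find poles, finite terms and evanescent terms: $\dfrac{1}{\varepsilon},\ \dfrac{\hat\delta}{\varepsilon},\ \varepsilon^0,\ \hat\delta\varepsilon^0,\ \varepsilon,\ \hat\delta\varepsilon$. The symbols appearing in this list have the following meanings: $1/\varepsilon$ denotes any kind of divergence in $\varepsilon$, $\hat\delta$ is any formally evanescent quantity, $\varepsilon^0$ is any quantity that is convergent and nonevanescent in the analytic limit $\varepsilon\to 0$, and $\varepsilon$ denotes any analytic evanescence. After the expansion, we subtract the poles and remain with $\varepsilon^0,\ \hat\delta\varepsilon^0,\ \varepsilon,\ \hat\delta\varepsilon$. (3.12) The terms proportional to $\varepsilon$ vanish in the DHD limit. The terms $\hat\delta\varepsilon^0$ also vanish in that limit, but for some time we treat them together with the $\varepsilon^0$ terms. Next, we study the $\Lambda$ dependence. Expanding the coefficients of every surviving term (3.12) around $\Lambda=\infty$, we find $\varepsilon^0\Lambda,\ \hat\delta\varepsilon^0\Lambda,\ \varepsilon^0\Lambda^0,\ \hat\delta\varepsilon^0\Lambda^0,\ \varepsilon^0/\Lambda,\ \hat\delta\varepsilon^0/\Lambda,\ \varepsilon\Lambda,\ \hat\delta\varepsilon\Lambda,\ \varepsilon\Lambda^0,\ \hat\delta\varepsilon\Lambda^0,\ \varepsilon/\Lambda,\ \hat\delta\varepsilon/\Lambda$, (3.13) where $\Lambda$ denotes any kind of $\Lambda$-divergent expression (such as $\Lambda^{k}\ln^{k'}\Lambda$, with $k,k'\geq 0$ and $k+k'>0$), while $\Lambda^0$ is any $\Lambda$-convergent, non-$\Lambda$-evanescent expression, and $1/\Lambda$ is any $\Lambda$-evanescent expression. Then we subtract the divergences of the DHD limit, namely the terms of types $\varepsilon^0\Lambda$ and $\hat\delta\varepsilon^0\Lambda$. After that we remain with $\varepsilon^0\Lambda^0,\ \hat\delta\varepsilon^0\Lambda^0,\ \varepsilon^0/\Lambda,\ \hat\delta\varepsilon^0/\Lambda,\ \varepsilon\Lambda,\ \hat\delta\varepsilon\Lambda,\ \varepsilon\Lambda^0,\ \hat\delta\varepsilon\Lambda^0,\ \varepsilon/\Lambda,\ \hat\delta\varepsilon/\Lambda$. (3.14) At this point we are ready to take the DHD limit, which drops all contributions of this list but the $\varepsilon^0\Lambda^0$ terms.

The counterterms (3.11) are local and largely constrained. We know that (i) they are independent of $\tilde B$, $\tilde K_{\bar C}$, $\tilde K_B$, $\psi_R$ and $\bar\psi_R$ and (ii) do not depend on antighosts $\tilde{\bar C}^{a_i}$ and sources $\tilde K^{\mu a_i}$ separately, but only through the combinations $\tilde K^{\mu a_i}+\tilde Q(\Lambda)\partial^\mu\tilde{\bar C}^{a_i}$. Indeed, we have arranged $S_{\text{ev}}$ to preserve these properties.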
Actually, we have chosen the higherderivative structure of S to simplify the counterterms even more: (iii) they cannot depend on the sourcesK and matter fieldsψ, because each productgK ,gψ has dimension greater than 4; (iv) they cannot contain antighosts, because of points (ii) and (iii); (v) they cannot contain ghosts, because all objects with negative ghost numbers are excluded by points (iii) and (iv); (vi) they can only be one-loop, because each loop carries an extra factorg 2 , which has dimension 16. In the end, there can only be one-loop divergences of the form (where derivatives can act on any objects to their right), and those obtained from these expressions by suppressing somẽ gÃ's or derivatives. The anomaly functional (3.10), if nonvanishing and nontrivial (in a sense specified below), is the anomaly of the higher-derivative theory. In the tilde parametrization we havẽ The one-loop contributionÃ (1) is where˜ (1) is the one-loop contribution to˜ . Using (2.32) and (3.4) we see that (S ,S ) = fev. The right-hand side of (4.3) collects one-loop Feynman diagrams containing insertions of formally evanescent vertices. The formal evanescences can: (a) remain attached to external legs and momenta, or (b) be turned into one or more factors ε. In case (a) they give local divergent evanescences plus nonlocal evanescences. In case (b) the factors ε can simplify a local divergent part and give local nonevanescent contributions, in addition to (generically nonlocal) evanescences. Therefore, we can writẽ whereÃ (1) nev is local, convergent, and nonevanescent, A (1) divev is local and divergent-evanescent andÃ (1) ev is evanescent and possibly nonlocal. Now we take the divergent part of equation (4.3). Decom-pose˜ (1) as the sum of its divergent part˜ (1) div and its convergent part˜ (1) conv . Recalling that the antiparentheses of convergent functionals are convergent, we see that (S ,˜ (1) conv ) is convergent. 
The properties (2.32) apply to (S ,˜ (1) div ), so we have the identity (4.5) Now, (4.1) tells us that˜ (1) div is just a functional ofgÃ. Therefore, its antiparentheses withS are only sensitive tõ S K and the K -dependent contributions toS ev , which we denote byS K ev . Moreover, we can further decompose˜ (1) div as the sum of a nonevanescent divergent part˜ (1) nevdiv and a divergent evanescence˜ (1) divev . So doing we find At this point, taking the nonevanescent divergent part of this equation, we obtain (S K ,˜ (1) nevdiv ) = 0, which just states that˜ (1) nev div is gauge invariant. Going back to the nontilde parametrization, we have˜ (1) nev div (gÃ) = (1) nevdiv (g A). By power counting, (1) nevdiv can only be a linear combination of the invariants F a i μν F a i μν , and it can be subtracted by redefining the parameters ζ i . The rest, (1) divev , can be subtracted by redefining the parameters η of S ev . The renormalized actionŜ is obtained by making the replacements in S , where f i , f are calculable numerical coefficients. Since S is linear in ζ and η, we havê Moreover, using (4.5) and (˜ (1) div ,˜ (1) div ) = 0 we find The generating functionalˆ defined byŜ is convergent to all orders, because (3.11) ensures that no divergences can appear beyond one loop. Finally,ˆ and the anomalŷ A = (ˆ ,ˆ ) are obtained by making the replacements (4.7) inside˜ andÃ = (˜ ,˜ ), respectively. Clearly, A is convergent, becauseˆ is convergent, and because the antiparentheses of convergent functionals are convergent.

One-loop anomalies
In this section we study the one-loop anomalies, and relate those of the final theory, which are trivial by assumption, to those of the higher-derivative theory, which turn out to be trivial as a consequence.
We begin with the one-loop contributionsÂ (1) andÃ (1) toÂ andÃ . First, we observe that Indeed, the correction˜ (1) div to the action provides O(h) vertices. If we use those vertices in one-particle irreducible diagrams together with vertices of (Ŝ ,Ŝ ), we must close at least one loop, which gives O(h 2 ) contributions. Using (4.9), we havê As a check, recall thatÂ is convergent, so the divergent evanescencesÃ (1) divev must disappear fromÂ (1) .
We know thatÃ (1) nev is the integral of a local function of dimension 5 and ghost number 1. Recalling that a factorg is attached to every external leg, we havẽ whereÃ a are local functions of ghost number zero and dimension 4. However,Ã (1) nev cannot depend on the sources K and the matter fieldsψ, because the productsgK andgψ have dimensions greater than 4.
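Working out $(S_\Lambda,S_\Lambda)$ in detail, it is easy to check that it does not depend on $\tilde B^{a_i}$ and depends on $\tilde K^{\mu a_i}$ and $\tilde{\bar C}^{a_i}$ only through the combinations $\tilde K^{\mu a_i}+\tilde Q(\Lambda)\partial^\mu\tilde{\bar C}^{a_i}$. Therefore, the same must be true of $\tilde{\mathcal{A}}^{(1)}$, which means that $\tilde{\mathcal{A}}^{(1)}_{\text{nev}}$ cannot depend on either $\tilde{\bar C}$ or $\tilde B$. Then the functions $\tilde{\mathcal{A}}_a$ cannot even contain ghosts. Summarizing, we can write $\tilde{\mathcal{A}}^{(1)}_{\text{nev}}$ in the form (5.3). Recall that the antiparentheses satisfy the identity $(X,(X,X))=0$ for any functional $X$. Taking $X=\hat\Gamma_\Lambda$, we obtain $(\hat\Gamma_\Lambda,\hat{\mathcal{A}}_\Lambda)=0$, (5.4) which are the Wess-Zumino consistency conditions [10], written using the Batalin-Vilkovisky formalism. In particular, at one loop we have $(S_\Lambda,\hat{\mathcal{A}}^{(1)}_\Lambda)=-(\hat\Gamma^{(1)}_\Lambda,(S_\Lambda,S_\Lambda))$. (5.5) In Sect. 2 we have proved that the antiparentheses of an evanescent functional with a convergent functional are evanescent. Thus, $(\hat\Gamma^{(1)}_\Lambda,(S_\Lambda,S_\Lambda))=\text{ev}=O(\varepsilon)$.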
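The step from (5.4) to (5.5) can be made explicit by expanding in $\hbar$. The following is a sketch under an assumed bookkeeping, in which the powers of $\hbar$ are written out and the tree-level parts of $\hat\Gamma_\Lambda$ and $\hat{\mathcal{A}}_\Lambda$ are $S_\Lambda$ and $(S_\Lambda,S_\Lambda)$; the normalization of the one-loop terms is chosen here for illustration.

```latex
% Assumed expansions (illustrative normalization):
%   \hat\Gamma_\Lambda      = S_\Lambda + \hbar\,\hat\Gamma^{(1)}_\Lambda + O(\hbar^2),
%   \hat{\mathcal A}_\Lambda = (S_\Lambda,S_\Lambda) + \hbar\,\hat{\mathcal A}^{(1)}_\Lambda + O(\hbar^2).
\begin{align}
0 = (\hat\Gamma_\Lambda,\hat{\mathcal A}_\Lambda)
  &= \bigl(S_\Lambda,(S_\Lambda,S_\Lambda)\bigr)
   + \hbar\Bigl[\bigl(S_\Lambda,\hat{\mathcal A}^{(1)}_\Lambda\bigr)
   + \bigl(\hat\Gamma^{(1)}_\Lambda,(S_\Lambda,S_\Lambda)\bigr)\Bigr] + O(\hbar^2), \\
% The order-zero term vanishes by the identity (X,(X,X)) = 0 with X = S_\Lambda.
\bigl(S_\Lambda,\hat{\mathcal A}^{(1)}_\Lambda\bigr)
  &= -\bigl(\hat\Gamma^{(1)}_\Lambda,(S_\Lambda,S_\Lambda)\bigr).
\end{align}
```

The second line reproduces the structure of (5.5), and its right-hand side is the antiparenthesis of a convergent functional with a formally evanescent one.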
For the same reason, (S ,Ã (1) ev ) and (S K ev ,Ã (1) nev ) are evanescent. Using these facts, together with (5.1) and (5.3), (5.5) gives ev = (S ,Ã (1) nev +Ã (1) ev ) = (S ,Ã (1) nev ) + ev = (S K +S K ev ,Ã (1) nev ) + ev = (S K ,Ã (1) nev ) + ev. At this point, we take the nonevanescent part of both sides and note that the relations (2.32) apply to (S K ,Ã (1) nev ), because, thanks to (5.3), no products of more fermion bilinears are involved in these antiparentheses. We find nev is the (potential) one-loop anomaly of the higher-derivative regularized theoryS , defined keeping fixed. The final theory is instead obtained taking the DHD limit. We must relateÃ (1) nev to the potential one-loop anomaly A (1) f nev of the final theory. Indeed, we are assuming that A (1) f nev is trivial (the final theory cannot have gauge anomalies at one loop), but we have no information of this type as regardsÃ (1) nev . We know howÃ (1) nev depends ong. The other dimensionful parameters ofS (such asζ i andξ i ), as well as the powers of multiplying various terms (such as ψ I L ı / Dψ I L ), have dimensions greater than 4. They cannot contribute tõ A (1) nev , because the local functionsÃ a are polynomial in them and have dimension 4. Thus,Ã (1) nev can only depend ongC,gÃ,r i , λ , ξ , η 1i and η 2i . Using (5.3), switching to nontilde variables, and recalling thatgÃ = g A,gC = gC, we see that A (1) nev is independent. Now we show that actually A (1) nev coincides with the one-loop anomaly A (1) f nev of the final theory.
To prove this fact, we need to take to infinity and study the DHD limit at one loop. A more comprehensive study of the DHD limit will be carried out later. The terms that are divergent in this limit are denoted by "Ddiv", to distinguish them from the divergences considered so far, which strictly speaking were "εdiv". Recall that, according to the definition of DHD limit, the -divergent parts cannot contain analytic ε evanescences, but can contain formal ε evanescences. ConsiderÂ = (ˆ ,ˆ ) and take the one-loop DHDdivergent part of this equation. Using (5.1) and recalling that A (1) nev is independent, we get 1 2 A (1) ev Ddiv whereˆ (1) Ddiv is the one-loop DHD-divergent part ofˆ . In the last step we have dropped the contribution involving (S − S r ,ˆ (1) Ddiv ), since this quantity vanishes in the limit → ∞. The reason is that, by formulas (2.15) and (3.4), the difference S − S r is made of O(1/ 6 ) terms, and the powerlike divergences contained inˆ (1) Ddiv cannot exceed 4 . Actually, this is one of the reasons why we have chosen the particular higher-derivative structure of the theory S . Moreover, to make the last step of (5.7) we have applied (2.32) to (S r ,ˆ (1) Ddiv ). Because of the analysis of Sect. 3, the divergences ofˆ (1) Ddiv may be of two types, with respect to the limit ε → 0: nonevanescent or formally evanescent. Thanks to (2.32), the antiparentheses with S r also give nonevanescent or formally evanescent contributions, wherefrom the last equality of (5.7) follows.
Subtracting the divergencesˆ (1) Ddiv fromŜ , we can define the one-loop renormalized actionŜ f ren of the final theory, which readŝ For the moment we do not need to specify the O(h 2 ) terms of this subtraction (but later we will have to be precise about them). The anomaly of the final theory is and its one-loop nonevanescent part is the quantity A (1) f nev we want, where the subscript "nev" close to the subscript " f " denotes the contributions that do not vanish in the DHD limit. We have In these manipulations we have used the formulâ which holds because at one loop the vertices ofˆ (1) Ddiv , which are already O(h), cannot contribute to one-particle irreducible diagrams containing one insertion of (Ŝ ,Ŝ ).
At one loop, using (5.7), we obtain Ddiv ). (5.9) We are ready to take the DHD limit. Recall that (S − S r ,ˆ (1) Ddiv ) tends to zero for → ∞, while A ev and its -divergent part do not separately tend to zero, because they can contain (local) terms that are formally ε evanescent and divergent. However, those terms are precisely A (1) ev Ddiv , so they disappear in the difference A (1) ev − A (1) ev Ddiv . Finally, using (5.3), we get as we wanted. Let us write the most general structure of the functions A a (g A). We know that they have dimension 4 and are sums of terms of the form g p ∂ k A p . Power counting gives k + p ≤ 4, hence we have plus the terms obtained from these by suppressing some g A's or some derivatives. Now it remains to collect all pieces of information found so far and solve (5.6). We call condition (5.6) a little cohomological problem, because it involves a structure (5.3) that contains a finite number of terms, in our case just a few, and its solution can be worked out directly. We recall the solution without proof, because the proof is well known and not necessary for the other derivations of this paper. The solution can be split into the sum of trivial and nontrivial contributions. Trivial contributions are those of the form (S K , χ), where χ = χ(g A) is a local functional of the gauge fields A, equal to the integral of a local function of dimension 4 and ghost number 0, and having a g structure corresponding to the one-loop sector of (2.13). In the tilde parametrization, we write χ asχ(gÃ). The only nontrivial contributions to A (1) f nev are proportional to the famous Bardeen formula [18]. In appendix A, the coefficient of the Bardeen term is calculated using our regularization technique. In the end, we have (5.11) where C = C a T a , A μ = A a μ T a , the Bardeen term being the integral on the right-hand side.
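The power counting for the functions $A_a(gA)$ can be enumerated mechanically. The sketch below assumes that each factor $gA$ and each derivative carries dimension 1, and that at least one gauge field is present ($p\geq 1$); it counts only the powers $(k,p)$ of the monomials $g^p\partial^k A^p$, not their Lorentz or color index structures. The function names are hypothetical.

```python
def monomials(max_dim=4):
    """Return pairs (k, p) for terms g^p d^k A^p with k + p <= max_dim, p >= 1.

    k counts derivatives and p counts factors of gA; each is assumed to
    carry one unit of dimension.
    """
    return [(k, p)
            for p in range(1, max_dim + 1)
            for k in range(max_dim - p + 1)]


def top_terms(max_dim=4):
    """Terms that saturate the bound, k + p == max_dim."""
    return [(k, p) for (k, p) in monomials(max_dim) if k + p == max_dim]


all_pairs = monomials()
saturating = top_terms()
```

The saturating pairs correspond to the structures with three derivatives and one field, two derivatives and two fields, one derivative and three fields, and four fields; the other pairs are the ones obtained by suppressing some factors of $gA$ or some derivatives.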
One-loop gauge anomalies vanish when the trace appearing in (5.11) vanishes. Typically, the cancellation is possible when the gauge group is a product group and the theory contains various types of fermionic fields in suitable representations, as in the standard model. Now we go back to the higher-derivative theory (the DHD limit being completed in Sect. 7), precisely to the classical actionŜ of (4.8). The trivial contributions (S K , χ) to the anomalies can be canceled out by redefining the action aŝ because then In the last step we used the fact that χ is K independent. Thus, at one loop we havê which means that when the Bardeen term vanishesÂ Finally, observe that the new functionalˆ is still convergent to all orders. The reason is that it is convergent at one loop and the action has the g structure (2.13). Then, using tilde variables, the counterterms must have the form (3.11), which, however, forbids divergent contributions from two loops onwards. The anomaly functionalÂ = (ˆ ,ˆ ) is also convergent to all orders and has the g structure (2.13). The next step is to prove the anomaly cancellation to all orders in the higher-derivative theory. After that, we will have to complete the DHD limit by renormalizing the divergences.

Manifest Adler-Bardeen theorem in the higher-derivative theory
In this section we prove that gauge anomalies manifestly cancel to all orders in the higher-derivative theory S . We assume that the final theory has no one-loop anomalies, which, according to the previous section, implies that the higherderivative theory shares the same property, namely A (1) nev = (S K , χ) ,Â (1) nev = 0. Then, the one-loop contributionÂ (1) to the anomaly functionalÂ is evanescent, so we can writê Here "O(ε)" includes the tree-level contribution (S , S ). Now we move on to higher orders. We have to study the diagrams with two or more loops, and one insertion of calculated with the action (5.13). We have switched back to the tilde parametrization, used (4.5), and replaced (S ,χ) by (S K +S K ev ,χ) and (S K ,χ) byÃ (1) nev . Both E andÂ have the structure (3.11) and (S ,S ) is formally evanescent. To fix the notation, let us start from formula (2.20), applied to the -loop diagrams containing one (S ,S ) insertion. We write them as sums of contributions of the form where the tensors T ( )β 1 ···β r Aμ 1 ···μ p α 1 ···α n are constant and evanescent, and the integrations over momenta are understood. We recall that G ( )μ 1 ···μ p A   (k 1 , . . . , k n+r ) are the integrals coming from Feynman diagrams, once all tensors η μν , ε μνρσ , δμν , the γ matrices, the structure constants f abc and the matrices T a are moved outside into the structures T ( ) A . We call the divergent parts of G ( )μ 1 ···μ p A "nontrivial" if they are not killed by the structures T ( ) A . Let us first reconsider the case = 1. 
It is useful to describe the right-hand side of (6.2) from the point of view of the integrals G can be of three types: (a) divergences that are turned into nonevanescent contributions by T (1) A , which are subtracted byÃ (1) nev ; (b) divergences that remain divergent when T (1) A is applied to them, which are subtracted byÃ (1) divev ; (c) divergences that are turned into evanescences by T (1) A , which may be subtracted by further, one-loop, local evanescent terms L (1) ev with theg structure (3.11). We write E = E 1 + E 2 , where ,χ). The subtractions included in E 1 cancel all nontrivial divergences of G (1)μ 1 ···μ p A . Instead, E 2 collects the diagrams with one E 2 insertion. They can also be expressed in the form (6.3) and studied along the same lines. From now on we understand that the expressions (6.3) refer to the diagrams with one (S ,S ) insertion or one E 2 insertion.
Each contribution G (1) A is then equipped with counterterms G (1) Acounter , so that the difference involves fully convergent subtracted integrals G (1)μ 1 ···μ p Asubtr . Now, the evanescences provided by T (1) A cannot simplify any divergences, so the final resultÂ (1) is evanescent, in agreement with (6.1). At higher loops it is useful to make a similar analysis. We begin with = 2. The integrals G (2)μ 1 ···μ p A are automatically equipped with the counterterms that subtract their nontrivial subdivergences: first, the actionŜ is equipped with its own counterterms and, second, the subtractions contained in E 1 provide counterterms for the integrals G (1)μ 1 ···μ p A associated with (S ,S ). Instead, the two-loop contributions of E 2 do not have subdivergences, because E 2 is oneloop. When we include counterterms for subdivergences, we can identify subtracted integrals G (2) (by the theorem of locality of counterterms) and possibly nonlocal finite parts G (2)μ 1 ···μ p Afinite . When T (2) A acts on G (2)μ 1 ···μ p Adiv , it gives local contributions toÂ (2) , which can be nonevanescent (due to simplified divergences), evanescent or still divergent. However, local contributions must have the structure (3.11), which implies that they are zero. Indeed, using the tilde parametrization, they are polynomial in the dimensionful parameters ofS and carry an overall factorg 2 , which has dimension 16. We conclude that the overall divergences G (2)μ 1 ···μ p Adiv are trivial, because they are killed by it just gives (possibly nonlocal) evanescent contributions toÂ (2) . Finally, we havê Therefore, (6.1) is promoted to the next order, and we can writeÂ = O(ε) + O(h 3 ), where now "O(ε)" includes the evanescent contributions appearing on the right-hand side of (6.4).
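The vanishing of the local contributions from two loops onwards is a matter of dimensional counting. As a sketch, assume that the local part of the $\ell$-loop anomaly is the integral of a local function of dimension 5, as stated for the one-loop anomaly, that it has the $\tilde g$ structure (3.11), and that all other factors (the products $\tilde g\tilde\Phi$, $\tilde g\tilde K$ and the dimensionful parameters) enter polynomially with nonnegative dimensions.

```latex
% Sketch: the overall coupling factor at \ell loops in the structure (3.11),
% using [\tilde g^2] = 16 as stated in the text.
\begin{equation}
\bigl[\tilde g^{\,2(\ell-1)}\bigr] = 16\,(\ell-1) \;\ge\; 16 \;>\; 5
\qquad \text{for } \ell \ge 2 .
\end{equation}
```

Under these assumptions no local monomial of dimension 5 can carry the required power of $\tilde g$, so the local part must vanish for every $\ell\geq 2$.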
At this point we can proceed by induction. Assume that for some ℓ ≥ 2, (6.5) and that the overall divergent parts G Adiv , plus a possibly nonlocal evanescent partÂ (ℓ+1) loc must have the structure (3.11), which means that it vanishes. In the end, G^{(ℓ+1)μ_1···μ_p}_{A div} are also trivial, and ev . Thus, if the inductive assumptions hold for some ℓ, they must also hold with ℓ → ℓ + 1 and therefore for ℓ = ∞. We conclude that the anomaly is evanescent to all orders: Â = O(ε). (6.6)
This result proves that if the final theory is anomaly-free at one loop, the higher-derivative theory S_Λ is anomaly-free to all orders. It is important to stress that the DHD-regularization framework provides the subtraction scheme where this property is manifest: after the subtraction of (S_K, χ) at one loop, no analogous subtractions are necessary at higher orders. This is not the final result we want, though. To get there we still need to take Λ to infinity and complete the DHD limit.

Manifest Adler-Bardeen theorem in the final theory
We are finally ready to study anomaly cancellation to all orders in the final theory. In this section we study the Λ dependence and complete the DHD limit, according to the rules of Sect. 3.1. The subtraction of divergences proceeds relatively smoothly, and preserves the master equation to all orders up to terms that vanish in the DHD limit.
Call S_n and Γ_n the action and the Γ functional DHD-renormalized up to (and including) n loops, where S_0 = S_Λ = Ŝ_Λ − χ/2 is the action (5.13). The action S_n must satisfy two inductive assumptions to all orders in ħ: (I) Γ_n has a regular limit for ε → 0 at fixed Λ, and (II) the local functional E_n ≡ (S_n, S_n) (7.1) is "truly ε-evanescent at fixed Λ", that is to say a local functional such that ⟨E_n⟩_n tends to zero when ε → 0 at fixed Λ.
More precisely, n is a sum of terms (3.14) up to n loops (because it is DHD-convergent to that order) and a sum of terms (3.12) from n + 1 loops onwards. Instead, E n = ( n , n ) contains the terms (3.14) except ε 0 0 and ε 0 / up to n loops, and the terms (3.12) except ε 0 from n + 1 loops onwards. Thanks to (6.6) we know that the inductive hypotheses are true for n = 0.
The theorem of locality of counterterms ensures that the (n + 1)-loop divergent part (n+1) n div of n is a local functional. Since n has a regular limit for ε → 0 at fixed , (n+1) n div contains only divergences in , not in ε. Precisely, we can write n divfev collect the terms ε 0 andδε 0 of the list (3.13), respectively. Now we study the (n + 1)-loop divergent part of ( n , n ). We take the (n + 1)-loop DHD-divergent non-ε-evanescent part of ( n , n ) = (S n , S n ) = E n , (7.2) which means the terms of types ε 0 of the list (3.13). Recall that S is equal to the action S r of (2.15) plus O(1/ 6 ) terms, so (S − S r , (n+1) n div ) is convergent for → ∞. Moreover, S r is equal to S gf , which by (2.3) is non-ε-evanescent, plus ψ I R ı/ ∂ψ I R plus ε-evanescent terms. Noting that the divergent part of E n is just made of termsδε 0 , we obtain n in powers ofh and dropped all contributions ( (k) n , (n+1−k) n ) with 0 < k < n + 1, because they are convergent in the DHD limit. Note that (k) n , 0 < k < n + 1, may contain terms ε . Now, the powers of can get simplified inside ( ). However, n is convergent for ε → 0 and the antiparentheses cannot generate poles, so the resulting contributions remain negligible in the DHD limit. We must just pay attention not to manipulate the terms ε in inconvenient ways (see Sect. 3.1 for details).
Since the theory is power-counting renormalizable, (7.3) is another little cohomological problem, therefore it can be solved directly. Moreover, it is a purely four-dimensional problem, since all ε-evanescent terms have been dropped. Its solution is well known and states that (n+1) n divnev may be reabsorbed by redefining the parameters of S gf and making a canonical transformation inside S gf . Using the nonrenormalization of the B-and KC -dependent terms, and power counting, the canonical transformation is generated by a functional (7.4) and the parameter redefinitions read where Z n Ai , Z nCi , Z n I J , and Z ni are ε-independentdivergent renormalization constants. The r i redefinitions encode the renormalizations of gauge couplings. Instead, the ξ i redefinitions follow from the nonrenormalization of the terms quadratic in B. In the parametrization we are using there are no redefinitions of g and ζ i . Making the canonical transformation (7.4) and the redefinitions (7.5) on S gf we get However, the classical action we have been using is not S gf , and not even S r = S gf + S LR + S ev , but S n , whose classical limit is S , therefore we must inquire what happens by making the operations (7.4) and (7.5) on S . Let us begin from S r . Since S LR is nonrenormalized, we must also make the redefinitions When we apply (7.4) and (7.5) to S ev we generate new formally ε-evanescent, -divergent terms of orderh n+1 , which change (n+1) n divfev into some new (n+1) n divfev , plus O(h n+2 ). The divergences (n+1) n divfev are not constrained by gauge invariance, but just locality and power counting. They can be subtracted by redefining the parameters η of S ev , since S ev was added precisely for this purpose.
We denote the operations that subtract Γ^{(n+1)}_{n div} with T_n. They include the canonical transformation (7.4), the redefinitions (7.5) and (7.6), and the η redefinitions that subtract the resulting formally ε-evanescent, Λ-divergent terms. It remains to check what happens when the operations T_n act on S_Λ. Observe that, since no ε divergences are around, the operations T_n are independent of ε and divergent in Λ. However, the difference S_Λ − S_r is of order 1/Λ^6 and the operations T_n cannot contain powers of Λ greater than 4. Thus, (T_n − 1)(S_Λ − S_r) vanishes in the DHD limit. Call S_{n+1} the action obtained by applying T_n on S_n. We have S_{n+1} = T_n S_n = S_n − Γ^{(n+1)}_{n div} + O(ħ^{n+2}), up to terms that vanish in the DHD limit. This formula tells us that the operations T_n do renormalize the divergences due to S_n in the DHD limit. Therefore, S_{n+1} is the (n + 1)-loop DHD-renormalized action, namely it gives a generating functional Γ_{n+1} that is convergent up to (and including) n + 1 loops in the DHD limit.
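The power-counting step in the preceding paragraph can be spelled out with a schematic estimate. This is only a sketch: it assumes, as stated in the text, that the operations T_n grow at most like the fourth power of Λ and that the difference between the higher-derivative action and S_r is of order 1/Λ^6. The logarithmic factors, the exponent k and the Λ-independent coefficients c_k are introduced here purely for illustration.

```latex
% Schematic estimate (requires amsmath). c_k and k_max are illustrative symbols,
% not quantities defined in the paper.
\begin{align}
  (T_n - 1)\,(S_\Lambda - S_r)
  &\sim \Big(\sum_{k=0}^{k_{\max}} c_k\,\Lambda^{4}\ln^{k}\Lambda\Big)
        \times \mathcal{O}\!\left(\frac{1}{\Lambda^{6}}\right) \\
  &= \mathcal{O}\!\left(\frac{\ln^{k_{\max}}\Lambda}{\Lambda^{2}}\right)
     \xrightarrow[\Lambda\to\infty]{} 0 .
\end{align}
```

Since the operations T_n are ε-independent, the estimate is not affected by taking ε → 0 first at fixed Λ. It bounds only the classical term displayed here.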
Moreover, since the canonical transformations generated by (7.4) act multiplicatively on fields and sources, the operations T_n act on the Γ functional precisely as they act on the action. Therefore, Γ_{n+1} = T_n Γ_n. Since the operations T_n are ε-independent, we conclude that Γ_{n+1} is regular when ε → 0 at fixed Λ, to all orders in ħ, which promotes the inductive assumption (I) to n + 1 loops.
Finally, the operations T_n preserve the antiparentheses. Applying them to (7.1) we also obtain (S_{n+1}, S_{n+1}) = T_n E_n. Now, taking the average of this equation we get ⟨(S_{n+1}, S_{n+1})⟩_{n+1} = ⟨T_n E_n⟩_{n+1} = T_n ⟨E_n⟩_n, where ⟨···⟩_k means that the average is calculated with the action S_k. If we take the limit of T_n ⟨E_n⟩_n for ε → 0 at fixed Λ we get zero, because by assumption (II) ⟨E_n⟩_n tends to zero for ε → 0 at fixed Λ. We conclude that the local functional E_{n+1} ≡ T_n E_n is truly ε-evanescent at fixed Λ. Therefore, assumption (II) is also promoted to n + 1 loops.
Since all inductive assumptions have been successfully promoted to n + 1 loops, the DHD-renormalized action S_R = S_∞ satisfies (S_R, S_R) = E_R, where E_R vanishes in the DHD limit, because it contains only the terms of (3.14) except ε^0 Λ^0 and ε^0/Λ. Finally, the DHD-renormalized Γ functional Γ_R = Γ_∞ is such that the anomaly functional (Γ_R, Γ_R) = ⟨E_R⟩ tends to zero in the DHD limit, which means that gauge anomalies cancel out to all orders.
The DHD framework defines a subtraction scheme where the cancellation takes place naturally and manifestly. In any other framework, the right scheme must be identified step by step, from two loops onwards, by fine-tuning local counterterms.
Some final comments are in order. Because of (4.7), higher-order divergent terms of the form Λ^p ln^k Λ/ε are generated along the way. They appear in S_R and in the partially renormalized actions S_n. Our renormalization procedure (which is just made of redefinitions of parameters, fields, and sources) makes them cancel opposite contributions coming from diagrams. Therefore, they do not appear in the Γ functionals Γ_R and Γ_n, which are indeed regular in the limit ε → 0 at fixed Λ.
In several steps of the proof we have used the fact that S_Λ = S_r + O(1/Λ^6). It is important that the higher-derivative regularized classical action S_Λ does not contain terms with fewer inverse powers of Λ. Consistently with this, renormalization does not require turning them on. The operations T_n may contain power-like divergences, which can generate terms with fewer than six inverse powers of Λ when they act on S_Λ − S_r. Those terms are at least one loop and not divergent, so they do not affect the structure of the classical action S_Λ.

Standard Model and more general theories
In this section we show how to extend the proof of the previous sections to the standard model and the most general perturbatively unitary, power-counting renormalizable theories. We just need to include photons V_μ, scalar fields ϕ, and right-handed fermions χ_R, which have been omitted so far for simplicity. Depending on the representations, we can also add Majorana masses to the fermions ψ_L.
We begin from the fermions. The starting classical action (2.1) is modified as follows: where S m collects the mass terms, when allowed by the representations: The functional S K that collects the symmetry transformations is also extended: Clearly, and (S K , ) are unmodified. To regularize the right-handed fermions we mirror what we did for the lefthanded ones. In the same way as we added partners ψ R for ψ L that decouple in four dimensions, we add partners χ L for χ R that also decouple when D → 4. The correction to S LR is Massive terms involving the regularizing partners ψ I R and χ I L can also be added. Differently from (8.1), they are not renormalized, so their coefficients must be independent of the ones appearing in (8.1). The evanescent corrections S ev of (2.16) are affected only in the sector S cev , which is extended to include terms such as the integrals of multiplied by independent constants. Evanescent terms of the Majorana type may also be allowed. Next, we add the higher-derivative regularizing terms The gauge fermion does not change, as well as S ev − S cev . Tilde fields and sources are defined as before and every argument of the proof can be extended straightforwardly. Now, wave-function renormalization constants can mix right-handed fermions with conjugates of left-handed ones. The contributions of right-handed fermions to the one-loop anomalies A (1) f nev = A (1) nev are given by a formula similar to (5.11), the only difference being that the trace appearing in the Bardeen term on the righthand side is calculated on the appropriate representations T a R (C → C a T a R , A μ → A a μ T a R ) and is multiplied by a further minus sign. The one-loop gauge anomalies A (1) f nev are trivial when the Bardeen terms cancel out in the total, and there exists a local functional χ(g A) such that A (1) f nev = (S K , χ).

Scalars may be added by making the replacements
where S Y denotes the Yukawa terms. As before, the renormalized action is linear in the sources K , by ghost number conservation and power counting. The evanescent corrections S cev include new terms such as the integrals of  (1) div of the higher-derivative the-oryS , nor to the nonevanescent one-loop gauge anomalies A (1) nev . Moreover, we still have S − S r = O(1/ 6 ). Therefore, all arguments used in the proof of the previous sections generalize straightforwardly.
Finally, we add photons. Assume that the group G contains N U(1) factors and denote their gauge fields with V u μ , u = 1, . . . N . Then make the replacements where W u μν = ∂ μ V u ν − ∂ ν V u μ , ζ uv is an invertible constant matrix, π I is any matter field in the irreducible representation R I of G, and π I † , K I † π stand forπ I ,K I π if π I is a fermion. We define extended G indicesâ,b, . . . to include both sets of indices u, v, . . . and a, b, . . ., and write Aâ μ = {V u μ , A a μ }. The U (1) charges of matter fields are denoted by gq u I . We also write Tâ = {i Q u , T a }, where Q u acts on π I by multiplying it by q u I . The change of the gauge fermion (2.4) is The sector S c ev of S ev is also extended, to include Vdependent evanescent terms similar to those already met in (2.19), (8.2), and (8.3). Instead, S ev −S cev remains the same, since the U (1) ghosts decouple.
The action S c is extended to include the higher-derivative regularizing terms while the change of the gauge fermion is where P uv ( ) = ξ uv + δ uv ξ 16 8 .
Finally, S_ev inherits the modifications made on S_cev. Tilde fields and sources are defined as before. The one-loop renormalization of the higher-derivative theory S̃ is made of the replacements (4.7) plus similar replacements for ζ_{uv}, where f_{uv} are calculable constants. Let us describe the nontrivial contributions to the one-loop gauge anomalies A^{(1)}_{f nev}. We have terms of the Bardeen type and terms proportional to C^u W^v_{μν} W^{zμν}. Using differential forms, the terms of the Bardeen type are linear combinations of B_1 = Tr[dC∧A∧dA] and B_2 = Tr[dC∧A∧A∧A], as in (5.11), where now C = C^â T^â_f, A = A^â_μ T^â_f dx^μ, d = dx^μ ∂_μ and T^â_f are the matrices T^â restricted to the fermions. The coefficient of B_1 is the same as in (5.11), apart from the minus sign associated with right-handed fermions. The coefficient of B_2 is uniquely determined by the coefficient of B_1, but it differs from the one of (5.11) any time U(1) gauge fields and/or ghosts are involved. The terms proportional to C W_{μν} W^{μν} can only appear in (unusual) situations where global U(1) gauge symmetries are potentially anomalous. One-loop gauge anomalies are trivial when all these terms cancel out, and there exists a local functional χ(gA) such that A^{(1)}_{f nev} = (S_K, χ). The correction to the canonical transformation (7.4) reads
With the rules of this section gauge anomalies manifestly cancel to all orders in the most general perturbatively unitary, renormalizable gauge theory coupled to matter, as long as they vanish at one loop. We stress again that the proof we have given also works when the theory is conformal or finite, or the first coefficients of its beta functions vanish, where instead RG techniques are powerless.
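The one-loop hypothesis of the theorem, namely that the Bardeen-type terms cancel in the total once right-handed fermions are counted with a relative minus sign, can be checked explicitly for one standard model generation. The sketch below is only an illustration: the hypercharge assignments and the normalization Q = T3 + Y are standard textbook inputs that do not appear in the text, and all function and variable names are hypothetical. Group-theory factors common to all terms of a given sum (Dynkin indices of the fundamental representations) are dropped, since only the vanishing of each sum matters. The SU(2)^3 coefficient is omitted because it vanishes identically, and mixed gauge-gravitational anomalies are outside the scope of the discussion.

```python
from fractions import Fraction as F

# One standard model generation:
# (name, chirality, SU(3) dimension, SU(2) dimension, hypercharge Y).
# Assumed convention: Q = T3 + Y (textbook values, not taken from the paper).
FERMIONS = [
    ("Q_L", "L", 3, 2, F(1, 6)),
    ("u_R", "R", 3, 1, F(2, 3)),
    ("d_R", "R", 3, 1, F(-1, 3)),
    ("L_L", "L", 1, 2, F(-1, 2)),
    ("e_R", "R", 1, 1, F(-1)),
]


def sign(chirality):
    """Right-handed fermions enter with a relative minus sign."""
    return 1 if chirality == "L" else -1


def u1_cubed(fermions=FERMIONS):
    """U(1)^3 coefficient: signed sum of Y^3 over all components."""
    return sum(sign(c) * d3 * d2 * y**3 for _, c, d3, d2, y in fermions)


def su2_su2_u1(fermions=FERMIONS):
    """SU(2)^2 U(1) coefficient: signed sum of Y over SU(2) doublets."""
    return sum(sign(c) * d3 * y for _, c, d3, d2, y in fermions if d2 == 2)


def su3_su3_u1(fermions=FERMIONS):
    """SU(3)^2 U(1) coefficient: signed sum of Y over color triplets."""
    return sum(sign(c) * d2 * y for _, c, d3, d2, y in fermions if d3 == 3)


def su3_cubed(fermions=FERMIONS):
    """SU(3)^3 coefficient: signed count of color triplets."""
    return sum(sign(c) * d2 for _, c, d3, d2, y in fermions if d3 == 3)


def anomaly_coefficients(fermions=FERMIONS):
    """Collect all coefficients; every entry must vanish for cancellation."""
    return {
        "U(1)^3": u1_cubed(fermions),
        "SU(2)^2 U(1)": su2_su2_u1(fermions),
        "SU(3)^2 U(1)": su3_su3_u1(fermions),
        "SU(3)^3": su3_cubed(fermions),
    }


if __name__ == "__main__":
    for label, value in anomaly_coefficients().items():
        print(label, value)
```

Exact rational arithmetic is used so that the cancellation is verified without rounding. Passing a modified fermion list (for instance, one with a multiplet removed) shows how the cancellation fails when the matter content is changed.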

Conclusions
We have reconsidered the Adler-Bardeen theorem, focusing on the cancellation of gauge anomalies to all orders, when they are trivial at one loop. The proof we have worked out is more powerful than those that have appeared so far and clarifies aspects that the previous derivations left unexplained. Key ingredients of our approach are the Batalin-Vilkovisky formalism and a regularization technique that combines the dimensional regularization with the higher-derivative gauge invariant regularization. The most important result is the identification of the subtraction scheme where gauge anomalies manifestly cancel to all orders. We have not used renormalization-group arguments, so our results apply to the most general perturbatively unitary, renormalizable gauge theories coupled to matter, including conformal field theories, finite theories, and theories where the first coefficients of the beta functions vanish.
In view of future generalizations to wider classes of quantum field theories, we have paid attention to a considerable number of details and delicate steps that emerge in the course of the proof. We are convinced that the techniques developed here may help us identify the right tools to upgrade the formulation of quantum field theory and simplify the proofs of all-order theorems.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited. Funded by SCOAP 3 / License Version CC BY 4.0.

Appendix A: Calculation of one-loop anomalies
In this appendix we illustrate our approach by calculating the one-loop coefficient of the Bardeen anomaly in chiral gauge theories. That coefficient is scheme independent, so we can work at Λ = ∞, which means using the dimensionally regularized action S_r of (2.15). Actually, we can equivalently use the action S_{r0} of (2.8), because it is easy to check that the contributions due to S_ev do not contain fermion loops. Therefore, they cannot generate the tensor ε^{μνρσ}.
For simplicity, we first work with chiral QED and then generalize the result to non-Abelian theories. The action reads S r 0 ( , K ) = − 1 4 F μν F μν + ψ ıγ μ ∂ μ ψ − q LψL γ μ A μ ψ L + (S K , ) + S K , where q L is the charge and the gauge fermion is We have where J μ = q LψL γ μ ψ L is the gauge current andS( ) = S( , 0). We focus on the matter-independent contributions A B to the anomaly A = (S r 0 , S r 0 ) S r 0 , so we can take the ghosts outside the average. Switching to momentum space, we get ×tr / p(P R − P L ) ψ( p + k 1 )ψ(− p + k 2 ) .
Here and below the integrals on momenta k in A B are understood. We expand the fermion two-point function in powers of the gauge field. The linear term gives a contribution that by power counting and ghost number conservation is proportional to It can be subtracted away as explained in (5.12). Then we concentrate on the contributions A B to A B that are quadratic in the gauge field. We observe that one fermion propagator is sandwiched between two P L 's or two P R 's, which projects its numerator onto the evanescent sector, and the other two propagators are sandwiched between P L and P R , which projects their numerators onto the physical sector. We get The photons and their momenta k 1 , k 2 can be taken to be strictly four dimensional. Turning to Euclidean space and using we obtain where ε 0123 = 1. Converting to coordinate space and including the trivial contributions, we finally get After subtraction of the trivial terms the divergence of the current averages to Incidentally, the calculation shows that A B receives no contributions proportional to C F μν F μν . This term is in principle allowed by the cohomological constraint (5.6) in Abelian theories, but actually does not show up. If it did, it would imply that the global symmetry associated with the gauge symmetry is anomalous, which is of course not true.
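The statement about the projectors sandwiching the fermion propagators can be made explicit with a short worked identity. The sketch assumes conventions that are not restated in this appendix: the D-dimensional gamma matrices are split into four-dimensional components (barred) and evanescent components (hatted), with γ_5 anticommuting with the former and commuting with the latter, and P_{L,R} are the usual chiral projectors.

```latex
% Assumed conventions (requires amsmath and the slashed package):
%   P_L + P_R = 1,  P_L P_R = 0,  P_{L,R} built from \gamma_5,
%   \{\gamma_5, \gamma^{\bar\mu}\} = 0   (four-dimensional components),
%   [\gamma_5, \gamma^{\hat\mu}] = 0     (evanescent components),
%   \slashed{p} = \bar{\slashed{p}} + \hat{\slashed{p}}.
\begin{align}
  P_L\,\slashed{p}\,P_L &= \hat{\slashed{p}}\,P_L , &
  P_R\,\slashed{p}\,P_R &= \hat{\slashed{p}}\,P_R , \\
  P_R\,\slashed{p}\,P_L &= \bar{\slashed{p}}\,P_L , &
  P_L\,\slashed{p}\,P_R &= \bar{\slashed{p}}\,P_R .
\end{align}
```

Under these assumptions, a numerator between two equal projectors retains only its evanescent part, while a numerator between opposite projectors retains only its four-dimensional part, which is the pattern used in the calculation.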
The calculation just done also proves (5.11), after inserting matrices T a and structure constants f abc where appropriate.