A Forest Formula to Subtract Infrared Singularities in Amplitudes for Wide-angle Scattering

For any hard QCD amplitude with massless partons, infrared (IR) singularities arise from pinches in the complex planes of loop momenta, called pinch surfaces. To organize and study their leading behaviors in the neighborhoods of these surfaces, we can construct approximation operators for collinear and soft singularities. A BPHZ-like forest formula can be developed to subtract them systematically. In this paper, we utilize the position-space analysis of Erdogan and Sterman for Green functions, and develop the formalism for momentum space. A related analysis has been carried out by Collins for the Sudakov form factors, and is generalized here to any wide-angle kinematics with an arbitrary number of external momenta. We will first see that the approximations yield much richer IR structures than those of an original amplitude, then construct the forest formula and prove that all the singularities appearing in its subtraction terms cancel pairwise. With the help of the forest formula, the full amplitude can also be reorganized into a factorized expression, which helps to generalize the Sudakov form factor result to arbitrary numbers of external momenta. All our analysis will be on the amplitude level.


Introduction
The use of forest-like structures of subtractions to remove singularities has inspired research ever since Zimmermann [3] formulated it, starting from Bogoliubov's R-operation [1,2], for ultraviolet (UV) divergences. The BPHZ formalism treats nested and overlapping divergences by a set of nested forests of subtractions. Later, the BPHZ theorem was generalized to include massless fields by Lowenstein and Zimmermann [4,5], and also to include Euclidean infrared (IR) singularities by Chetyrkin, Tkachov and Smirnov [6,7]. Beyond this, a Hopf algebraic structure has been discovered by Kreimer [8,9], and its mathematical structures shed light on quantum field theory.
In comparison with the extensions mentioned above, our work concentrates on the forest-like treatment for IR divergences in Minkowski spacetime, with the subtraction terms motivated from the factorization theorems. This treatment remains under study because of the complex structures of pinch surfaces in Minkowski space [10,11], on which much previous work has centered. Long ago, Humpert and van Neerven discussed the analogy between multiplicative BPHZ renormalization and mass factorization [12], when they used a graphical method to achieve an alternative proof of the factorization of collinear singularities, with the factorized parts being the subtraction terms. Soon afterwards, Collins and Soper focused on the Drell-Yan process in the "back-to-back" limit [13]. Working in axial gauge in that well-known paper, they used a "botanical construction" with concepts "gardens" and "tulips" to disentangle the nested and overlapping IR divergences. Later in Collins' book [14], he developed a forest formula in Feynman gauge for Drell-Yan and related processes, where there are two back-to-back external particles. An all-order factorization discussion has been given long ago in axial gauge for wide-angle scattering with color exchange by Sen [15], and more recently by Feige and Schwartz, using a "factorization gauge" [16].
In a related work, Erdogan and Sterman have applied the forest formula to subtract the UV divergences for massless gauge theories in position space [17], for arbitrary wide-angle kinematics. Based on these pioneering works, our paper aims to provide a generalization to multi-particle amplitudes. As we will see, to carry out the analysis in momentum space and Feynman gauge does not simply involve a Fourier transformation; rather, subtleties will arise due to the complexity of IR structure in the forest subtractions.
This complexity originates from the nontriviality of IR singularities of QCD amplitudes. They are described by the Landau equations, the solutions of which define pinch surfaces, a set of classical pictures with a combination of collinear and soft divergences. The pinch surfaces of an amplitude can be obtained from the Coleman-Norton interpretation [18]. In more detail, each pinch surface consists of the hard, jet and soft subgraphs intertwining with each other. The short-distance interactions are encoded in the hard subgraph H, while the long-distance interactions are encoded in the jet subgraph J and soft subgraph S. To evaluate the contribution of a pinch surface σ, we distinguish between its internal coordinates and those transverse to σ, which are called normal coordinates. By studying the behavior of the graph near σ through the power counting technique of [10,11], we can identify the IR divergent pinch surfaces. For the amplitudes studied here and many other QCD processes, the result is that the divergences are at worst logarithmic, when the following three requirements are satisfied on the pinch surface [10,14,19].

1. A soft parton cannot be attached to the hard subgraph.
2. A soft fermion or scalar cannot be attached to the jet subgraph.
3. In each jet subgraph $J_I$, the full set of partons attached to a connected component of $H$ is made up of exactly one parton with physical polarization, with all others being scalar-polarized gauge bosons.
Strictly speaking, these requirements are not sufficient for an IR divergence. Imagine a pinch surface with these requirements satisfied, one of whose jets has only one internal vertex, to which two soft propagators are attached. Following the power counting procedure, we will find a suppression of the logarithmic divergence. On the other hand, a pinch surface can also be IR divergent without the third requirement satisfied, namely when all the propagators of a jet that are attached to the hard subgraph are scalar-polarized gauge bosons [14,20]. Such pinch surfaces are power divergent, giving a "super-leading" contribution [14]. But if we sum over the graphs representing different attachments of the collinear gluons to the hard subgraph, they will vanish due to the Ward identity. So we do not treat them separately, and will regard the requirements 1-3 as necessary conditions for an IR divergence.
We call a pinch surface meeting the requirements 1-3 above a leading pinch surface. For a decay process with $n$ outgoing particles, for example, a leading pinch surface $\sigma$ can be graphically represented as the RHS of figure 1, where we have used $(\sigma)$ in superscripts to denote the subgraphs. The set of leading pinch surfaces of an amplitude $A$ includes all its IR divergences, which are not cancelled in the sum over the gauge-invariant sets of graphs.
For each leading pinch surface $\sigma$, we will define an approximation operator $t_\sigma$ such that $A\,|_{\text{div.}\,\sigma} = t_\sigma A\,|_{\text{div.}\,\sigma}$. That is, in region $\sigma$ the divergences of $t_\sigma A$ are the same as those of $A$. Approximation operators that correspond to nested pinch surfaces can also act on $A$ repetitively as $t_{\sigma_n} \cdots t_{\sigma_1} A$, where $\sigma_1 \subset \ldots \subset \sigma_n$. With the help of the approximation operators, we will be able to construct the forest formula, which schematically reads:

$$\sum_{F \in \mathcal{F}[A]} \prod_{\sigma \in F} \left(-t_\sigma\right) A \,\Big|_{\text{div}} = 0. \qquad (1.1)$$

That is, after summing over all the "forests" $F$ in $\mathcal{F}[A]$, each of which corresponds to a set of approximations acting repetitively on $A$, all the IR divergences that may appear in any of the terms are cancelled. Note that these IR divergences include not only the ones from the original leading pinch surfaces, but also those of the subtraction terms, which are highly nontrivial and require a detailed discussion. But in the end, we will find that all these "induced" divergences form pairs to cancel each other, and are organized as the divergences along eikonal lines, which appear in a factorized expression of the full amplitude. The notations in eq. (1.1) will be explained in more detail in the following sections.

Besides the forest-like structure to subtract IR singularities, additional methods have been developed to separate the IR divergent parts from the finite parts in a Feynman integral, especially for the next-to-leading order (NLO) and the next-to-next-to-leading order (NNLO) cross sections. Some notable works define subtraction terms on the level of integrands, such as the Catani-Seymour method [21,22], the Nagy-Soper method [23], the antenna method [24][25][26], in addition to Refs. [27][28][29][30]. Alternative ways are to define subtraction terms on the level of integration measures [31][32][33][34][35], or the products of integrand and measure, like the Frixione-Kunszt-Signer method [36,37].
These works mainly focus on the practical evaluations of multi-loop or multi-particle Feynman integrals up to certain orders, but suggest that local IR subtractions can regularize an arbitrary amplitude in momentum space. Our project here, therefore, aims to provide such an all-order IR subtraction procedure.
Most of our calculations and discussions in this paper center on eq. (1.1). Sections 2-4 establish the validity of this formula. In section 2 we introduce the approximation operators, and study the IR singularities generated by them, which may not exist in the original amplitude $A$. The approximation operators help to motivate the idea to cancel nested divergences. As for the cancellation of non-nested or "overlapping" divergences, the concept of enclosed pinch surfaces is introduced in section 3, which will be shown to be a leading pinch surface of $A$. With the knowledge acquired, section 4 then serves as a proof of the pairwise cancellations in eq. (1.1), by focusing on every IR divergent region. Section 5 is mainly concerned with the application of the forest formula. That is, we apply the factorization theorem to the subtraction terms, and show that with the help of the gauge

with $A = 1, \ldots, N$), then the sets of normal and intrinsic coordinates of $\sigma$ are defined as (2.1). To study the behavior of an amplitude $A$ near the pinch surface, we scale the normal coordinates with $\lambda$ ($\ll 1$). Namely, $l^\mu \sim (\lambda, \lambda, \lambda, \lambda)\, Q$,
Note that by contour deformation, we can prove that there are no Glauber regions for wide-angle scatterings [14,15,43,44,45], so eq. (2.2) shows the unique way the soft momenta are scaled. We assume that the numerators and denominators of the integrand are polynomials in normal coordinates. As the normal coordinates are scaled as above near the pinch surface, each such polynomial can be approximated by keeping only the leading terms, which are isolated by the hard-collinear and soft-collinear approximations. These approximations act on the hard subgraph $H^{(\sigma)}$ and the jet subgraphs $J_A^{(\sigma)}$, as defined in eqs. (2.3) and (2.4).

In the hard-collinear approximation (denoted by $\text{hc}_A$ here), the jet momenta entering the hard subgraph are projected onto the directions of the jets, while the vector indices are projected onto the opposite directions of the jets. Moreover, for each fermion jet propagator attached to the hard subgraph, we insert the operator $\frac{1}{2}(\gamma\cdot\beta)(\gamma\cdot\bar\beta)$, where $(\gamma\cdot\beta)$ is next to the hard subgraph, to project onto the part of spinor space that gives the leading power in the Dirac traces. In the soft-collinear approximations (denoted by $\text{sc}_A$ here), the projections on the momenta work in the reversed way compared to the hard-collinear approximations.
To be more specific, the typical denominators become under these approximations

$$S:\quad l^2 \to l^2 \sim \mathcal{O}(\lambda^2), \qquad (2.5)$$

where in the second line $k_A^\mu$ is a sum of jet loop momenta and $l^\mu$ is soft. In the third line $q^\mu$ is a hard momentum, of order $\lambda^0$ in all components.
The approximation operators are projections, so that

$$t_\sigma^2 = t_\sigma \qquad (2.6)$$

holds for any $\sigma$. Regarding these approximations, we shall introduce some convenient notation. First, we define

$$\hat{k}_I^\mu \equiv \left(k\cdot\bar\beta_I\right)\beta_I^\mu, \qquad \tilde{k}_I^\mu \equiv \left(k\cdot\beta_I\right)\bar\beta_I^\mu, \qquad (2.7)$$

representing the projected $I$-th jet momenta that appear in the hard part, and soft momenta in the $I$-th jet part, respectively. For simplicity, whenever the jet label $I$ is unambiguous, we shall use $\hat{k}^\mu$ and $\tilde{k}^\mu$.
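The projection property of the momentum replacements in eq. (2.7) can be checked numerically. The following is a minimal sketch rather than part of the original analysis: it assumes a mostly-minus metric and the normalization $\beta_I = (1, \hat n)/\sqrt{2}$, $\bar\beta_I = (1, -\hat n)/\sqrt{2}$, so that $\beta_I\cdot\bar\beta_I = 1$, and the helper names (`mdot`, `lightlike_pair`, `hc_project`, `sc_project`) are hypothetical.

```python
import math

def mdot(a, b):
    """Minkowski product with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def lightlike_pair(n):
    """Return (beta, beta_bar) for a spatial direction n.

    Assumed normalization: beta.beta_bar = 1, beta^2 = beta_bar^2 = 0.
    """
    norm = math.sqrt(sum(x * x for x in n))
    s = 1.0 / math.sqrt(2.0)
    unit = [x / norm for x in n]
    beta = (s, s * unit[0], s * unit[1], s * unit[2])
    beta_bar = (s, -s * unit[0], -s * unit[1], -s * unit[2])
    return beta, beta_bar

def hc_project(k, beta, beta_bar):
    """Hard-collinear replacement: k -> (k.beta_bar) beta."""
    c = mdot(k, beta_bar)
    return tuple(c * x for x in beta)

def sc_project(k, beta, beta_bar):
    """Soft-collinear replacement: k -> (k.beta) beta_bar."""
    c = mdot(k, beta)
    return tuple(c * x for x in beta_bar)

beta, beta_bar = lightlike_pair((0.0, 0.0, 1.0))
k = (3.0, 0.4, -0.2, 2.5)           # arbitrary illustrative momentum
once = hc_project(k, beta, beta_bar)
twice = hc_project(once, beta, beta_bar)
```

Applying `hc_project` twice gives the same vector as applying it once, which is the momentum-level counterpart of the statement that the approximation operators are projections. A momentum exactly parallel to `beta` is sent to zero by `sc_project`, which is the mechanism behind the soft-exotic propagators discussed later.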
Since a propagator can belong to different subgraphs of different pinch surfaces (for example, a hard propagator in one pinch surface may be lightlike or soft in another), we will put the pinch surface in a bracket as an upper index of the subgraphs as in figure 1. For example, $H^{(\sigma)}$ refers to the hard subgraph of $\sigma$, whose lines may no longer be hard in other pinch surfaces. We will also use the notations $\text{sc}_I^{(\sigma)}$ and $\text{hc}_I^{(\sigma)}$ to denote soft-collinear or hard-collinear approximations in a given $t_\sigma$, with respect to the jet $J_I^{(\sigma)}$. For simplicity, when possible we will only use "sc." and "hc." if there are no ambiguities. Graphically we will draw round and square half-brackets to describe them, as in figure 2, with the projected momenta or vector indices appearing outside the brackets.
To identify the regions where eqs. (2.3) and (2.4) are good approximations, we introduce the neighborhood of a pinch surface $\sigma$ in terms of the coordinates in (2.1). This is defined as a region containing $\sigma$, where the normal coordinates $s_i^{(\sigma)}$ and the intrinsic coordinates $r_j^{(\sigma)}$ satisfy the bounds of eq. (2.8) [17], in which $p_0^2 \ll Q^2$, and $0 < \delta_j < 1/2$ is fixed for each intrinsic coordinate. The reason for this range of $\delta_j$ is that if $\delta_j < 1/2$, we may always neglect the $l^2$ and $k_A^2$ terms on the RHS of the soft- and hard-collinear approximations in eq. (2.5), because they are relatively suppressed by $\mathcal{O}(\lambda^{1/2})$. This restricted region in (2.8) is denoted as $n[\sigma]$.
We now study the relations between pinch surfaces in momentum space. To do this, we define the normal space of a momentum $k^\mu$, $N_\sigma(k)$, as the linear span of the sets of normal coordinates of momentum $k^\mu$ in $\sigma$, i.e.

$N_\sigma(k)$ = the full 4-dim space if $k^\mu$ is soft in $\sigma$. (2.9)

For any loop momentum $k_i^\mu$ of an amplitude $A$ at a pinch surface $\sigma$, the larger the dimension of its normal space, the more it is constrained, and the smaller the dimension of $\sigma$ will be. For example, it would be most constrained if it is soft, since all its four components are zero. We use normal coordinates to define orderings of pinch surfaces. Given any two distinct pinch surfaces $\sigma_1$ and $\sigma_2$, we define that $\sigma_1 \subset \sigma_2$ if and only if for any loop momentum $k_i^\mu$, its normal space in $\sigma_2$ is contained in (or equal to) that in $\sigma_1$, i.e.

$$N_{\sigma_2}(k_i) \subseteq N_{\sigma_1}(k_i), \qquad (2.10)$$

where the equal signs cannot be simultaneously taken for all the $k_i^\mu$. From this definition, we can deduce the relation between hard, jet and soft subgraphs. For $\sigma_1 \subset \sigma_2$, we define $J^{(\sigma_i)} \equiv \bigcup_I J_I^{(\sigma_i)}$ ($i = 1, 2$), and then the relations of eq. (2.11) follow, where one can derive the full set of relations using any two of them. Again, the equal signs cannot be simultaneously taken. If neither $\sigma_1 \subset \sigma_2$ nor $\sigma_2 \subset \sigma_1$, and moreover,

$$H^{(\sigma_1)} \cap H^{(\sigma_2)} \neq \varnothing, \qquad (2.12)$$

we say $\sigma_1$ and $\sigma_2$ are overlapping, denoted by the symbol $\sigma_1 : o : \sigma_2$. If the left hand side of eq. (2.12) is empty, $\sigma_1$ and $\sigma_2$ are called disjoint pinch surfaces. Note that pinch surfaces of a lowest-order electroweak decay process can never be disjoint, since $H^{(\sigma_1)} \cap H^{(\sigma_2)}$ always includes the electroweak vertex, and thus is always nonempty. For others, like the scattering processes, the hard subgraphs of two pinch surfaces can be non-overlapping, but we will show in section 3.4 that such configurations are not relevant in the forest formula.

Coming back to eq. (2.8), $\sigma_1 \subset \sigma_2$ does not imply $n[\sigma_1] \subset n[\sigma_2]$. Therefore, the neighborhoods of nested pinch surfaces may overlap each other. To avoid overcounting, we define the neighborhoods in the "reduced" way of eq. (2.13). Then the union of all the reduced neighborhoods, as defined above, takes account of all the singularities of an amplitude without double-counting. Note that larger pinch surfaces correspond to smaller reduced graphs, and vice versa.
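The ordering of pinch surfaces by normal spaces can be illustrated with a small sketch. It is an assumption-laden toy model, not the paper's construction: each pinch surface is represented by a hypothetical dictionary assigning to every loop-momentum label one of the types `"H"` (hard), `"S"` (soft) or `"J1"`, `"J2"`, ... (collinear to jet 1, 2, ...), and the normal space of a hard line is taken to be trivial, that of a soft line the full space, and those of two different jets not contained in one another.

```python
def contained(a, b):
    """True if the normal space of a type-a line lies inside that of a type-b line.

    Assumed model: hard -> trivial space, soft -> full 4-dim space,
    jets in different directions -> neither space contains the other.
    """
    return a == "H" or b == "S" or a == b

def is_nested(s1, s2):
    """True if s1 is a strictly smaller pinch surface than s2 (s1 inside s2).

    For every momentum, the normal space in s2 must be contained in the
    one in s1, and the two assignments must not coincide everywhere.
    """
    if s1.keys() != s2.keys():
        raise ValueError("both surfaces must label the same loop momenta")
    if s1 == s2:
        return False
    return all(contained(s2[k], s1[k]) for k in s1)

# Hypothetical two-loop assignments, for illustration only.
soft_surface = {"k1": "S", "k2": "J1"}
collinear_surface = {"k1": "J1", "k2": "J1"}
hard_surface = {"k1": "H", "k2": "H"}
other_surface = {"k1": "J2", "k2": "H"}

chain_ok = is_nested(soft_surface, collinear_surface) and is_nested(collinear_surface, hard_surface)
unordered = not is_nested(collinear_surface, other_surface) and not is_nested(other_surface, collinear_surface)
```

In this toy model the three surfaces `soft_surface`, `collinear_surface` and `hard_surface` form a nested chain, while `collinear_surface` and `other_surface` are not ordered; whether such a pair is overlapping or disjoint additionally depends on the intersection of the hard subgraphs, which the sketch does not model.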
With these tools at hand, our next task is to study the action of a single approximation operator, and see how it changes pinch surfaces compared with those of the original amplitudes. This will be necessary for our analysis in the following sections.

Pinch surfaces generated by a single approximation
In this subsection, we study the pinch surfaces of amplitudes with a single approximation operator, $t_\sigma A$, and our results will be generalized in section 2.3. The reason that the pinch surfaces of $t_\sigma A$ are different from those of $A$ is easy to see from the definitions in eqs. (2.3) and (2.4): after the action of the operator $t_\sigma$, only certain components of the momenta and numerator factors of $A$ are kept in specified subgraphs. Naturally, we need to look at the effects of these approximations. Most of the reasoning in this subsection, as a result, will apply to lines that attach $S^{(\sigma)}$ to $J_A^{(\sigma)}$, or $J_A^{(\sigma)}$ to $H^{(\sigma)}$.

To be specific, we wish to classify all the pinch surfaces of $t_\sigma A$. At pinch surface $\sigma$, the momenta are conserved to the leading order in the scaling variable $\lambda$ at each vertex. In region $n[\sigma]$, the action of $t_\sigma$ sets to zero only the components that are negligible in the momenta on which $t_\sigma$ acts. But the approximations still apply in other regions, where momentum conservation may not hold even to the leading order once these approximations have been made. A hard-collinear approximation provided by $t_\sigma$, for example, changes a jet momentum appearing in the hard subgraph, say $k^\mu$, into the form $(k\cdot\bar\beta_I)\,\beta_I^\mu$. Of course these two momenta provide the same leading contribution in the region $n[\sigma]$. But when we consider another pinch surface where they are not necessarily identical in the leading contributions, this pinch surface may be different from any of the ones of $A$. Similar considerations apply for the soft-collinear approximations.

To synthesize all the approximations, we depict $t_\sigma A$ in figure 3. The approximation $t_\sigma$ defines hard, jet and soft bubbles. Inside each bubble, the Landau equation is applicable and the physical picture from the Coleman-Norton interpretation still holds.
Figure 3: A pictorial representation of the approximated amplitude $t_\sigma A$, where $\sigma$ is shown in figure 1. The propagators with intact endpoints, denoted by dots, retain momenta before projections, while the truncated lines, denoted by bars, provide their projected momenta to the corresponding subgraphs. At each internal vertex of a bubble, the incoming and outgoing momenta are conserved.

However, between any two bubbles, the outgoing momenta of one bubble are generally not the same as the incoming momenta of the other one. Three comments regarding the momenta joining these bubbles summarize these new features.
• The jet bubbles have external momenta only in the directions of $\beta_I^\mu$ and $\bar\beta_I^\mu$. We shall see that as a result, at pinch surfaces all their internal loops can only be soft, or hard, or collinear to $\beta_I^\mu$ or $\bar\beta_I^\mu$.
• Loops joining the jet-$I$ bubble and the $H$ bubble depend only on the $\beta_I$-components of their momenta. These loops can have additional pinches at loops connecting $H$ and the jets, but they will only involve lines in the $\beta_I$-direction for jet $I$.
• Loops joining jets must flow through S, and can be pinched only in the jet directions.
Therefore, the original Coleman-Norton interpretation does not necessarily apply for the entire graph; we need a new analysis to see the formation of pinches for $t_\sigma A$. To distinguish them from the pinch surfaces of $A$, we will denote the pinch surfaces of $t_\sigma A$ as $\rho^{\{\sigma\}}$. Most of the time we will keep the superscript, but during some specific discussions we may drop it for simplicity.
To enumerate the possible pinch surfaces of $t_\sigma A$ in detail, we focus on any one of its propagators, whose momentum $k^\mu$ can be either soft, lightlike or hard before the projection.

Figure 4: The confluence of momenta $k^\mu$ and $p^\mu$ in $\rho^{\{\sigma\}}$, where the merged momentum is of the value $p^\mu + (k\cdot\bar v)\,v^\mu$ because of the approximation $t_\sigma$. The approximation could be either soft-collinear or hard-collinear, for which we will show examples, and the discussions regarding this figure are not restricted to 3-vertices.

Denoting by $(t_\sigma k)^\mu$ the value that the projected momentum takes in $t_\sigma A$, as a result of the approximations in eqs. (2.3) and (2.4), we then ask how the difference between $(t_\sigma k)^\mu$ and $k^\mu$ would change the pinch surface. Without loss of generality, we shall denote

$$(t_\sigma k)^\mu \equiv (k\cdot\bar v)\,v^\mu \qquad (2.14)$$

as the projected momentum after approximations are made. The vector $v^\mu$ here is a lightlike unit vector which is either in the same or opposite direction of a jet, and $\bar v^\mu$ is the lightlike vector in the opposite direction. That is, for a hard-collinear approximation with respect to the jet $J_I$, i.e. $k^\mu \to (k\cdot\bar\beta_I)\,\beta_I^\mu$, we have $v^\mu = \beta_I^\mu$; for a soft-collinear approximation $k^\mu \to (k\cdot\beta_I)\,\bar\beta_I^\mu$, we have $v^\mu = \bar\beta_I^\mu$. In particular, we focus on a "confluence" of the projected momentum $(t_\sigma k)^\mu$ and another momentum $p^\mu$, resulting in a momentum $p^\mu + (k\cdot\bar v)\,v^\mu$, as is shown in figure 4. Note that $p^\mu$ is the momentum entering the confluence, which can be either the original momentum of a propagator or projected by $t_\sigma$.
We assume that these momenta are at a pinch surface of $t_\sigma A$, say $\rho^{\{\sigma\}}$, and will study how they relate to the pinch surfaces of $A$ itself. In the paragraphs below, we will list all the possibilities by considering whether $k^\mu$ is soft, lightlike (in various directions) or hard in $\rho^{\{\sigma\}}$, and compare the obtained configurations of figure 4 (corresponding to the pinch surfaces of $t_\sigma A$) with the ones obtained by letting the original $k^\mu$ flow into the confluence (corresponding to the pinch surfaces of $A$). In figure 4 we exhibit a 3-point vertex, but the whole analysis also works in the presence of 4-point vertices.
A. $k^\mu$ is soft in $\rho^{\{\sigma\}}$

This case is the simplest, since a soft momentum after any projections is still soft. Then the pinch surface at $k^\mu = 0$ does not change if we replace $(t_\sigma k)^\mu = (k\cdot\bar v)\,v^\mu$ by $k^\mu$, meaning that the configuration in this case, being a subgraph of the pinch surface $\rho^{\{\sigma\}}$, also exists in a pinch surface $\rho$ of the original amplitude $A$.

B. $k^\mu$ is lightlike in $\rho^{\{\sigma\}}$ and not collinear to $\bar v^\mu$

Here $k^\mu$ is collinear to a certain lightlike vector, which is not necessarily $v^\mu$ but is not $\bar v^\mu$. Then the projected momentum will be collinear to $v^\mu$. To obtain the possible configurations of figure 4, the values of $p^\mu$ should be taken into account as well. We elaborate the discussion below, which involves a number of subcases.
(Bi) $p^\mu$ is hard in $\rho^{\{\sigma\}}$. Then generally both $p^\mu + k^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are hard, so the configurations obtained by $k^\mu$ and $(k\cdot\bar v)\,v^\mu$ are identical.
(Bii) $p^\mu$ is collinear to $v^\mu$ in $\rho^{\{\sigma\}}$. In other words, $p^\mu$ is pinched in alignment with the projected momentum $(k\cdot\bar v)\,v^\mu$. This configuration of momenta, as a subgraph of $\rho^{\{\sigma\}}$, may not exist in any pinch surface of the original amplitude $A$.
In detail, if $k^\mu$ itself is also collinear to $v^\mu$, then $k^\mu$, $p^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are all collinear to $v^\mu$. Such a configuration can appear at a pinch surface of $A$. But if $k^\mu$ is not in the direction of $v^\mu$, we will obtain a configuration where $k^\mu$ is collinear to one lightlike unit vector, while $p^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are collinear to another. In other words, the propagators in a connected jet subgraph are lightlike, but in different directions. Apparently, this never takes place in $A$. To see how the pinches are formed, we examine the denominators in the expression of $t_\sigma A$ that involve $p^\mu$, which are of the form:

$$p^2 + i\epsilon, \qquad \left(p + (k\cdot\bar v)\,v\right)^2 + i\epsilon. \qquad (2.15)$$

The solution that produces a pinch when both $p^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are lightlike is

$$p^\mu = \alpha\,(k\cdot\bar v)\,v^\mu, \qquad -1 < \alpha < 0. \qquad (2.16)$$

That is, when $p^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are both lightlike in $\rho^{\{\sigma\}}$, they can only be collinear to $(t_\sigma k)^\mu \propto v^\mu$, rather than the vector $k^\mu$ before approximations. The condition $-1 < \alpha < 0$ ensures a pinch. This analysis works for both hard-collinear and soft-collinear approximations, and examples are given for both cases in figure 5 below.
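The role of the condition $-1 < \alpha < 0$ can be made explicit with a short check. This is only a sketch under stated assumptions: $\bar v$ denotes the lightlike vector opposite to $v$ with the assumed normalization $v\cdot\bar v = 1$, the large component of $p$ is written as $p\cdot\bar v = \alpha\kappa$ with $\kappa$ the coefficient of $v^\mu$ in the projected momentum, and the function name `is_pinched` is hypothetical. With $p^2 = 2(p\cdot v)(p\cdot\bar v) - p_\perp^2$, the two denominators have poles in the small component $p\cdot v$ at $(p_\perp^2 - i\epsilon)/(2\alpha\kappa)$ and $(p_\perp^2 - i\epsilon)/(2(1+\alpha)\kappa)$, and the contour is trapped only if they lie on opposite sides of the real axis.

```python
def is_pinched(alpha, kappa):
    """Check whether the p.v contour is pinched between the two poles.

    alpha * kappa       : large component of p (assumed parametrization)
    (1 + alpha) * kappa : large component of p + kappa * v
    The pole of each denominator sits in the lower (upper) half plane
    when its large component is positive (negative), so a pinch needs
    the two large components to have opposite signs.
    """
    first = alpha * kappa
    second = (1.0 + alpha) * kappa
    return first * second < 0.0

inside = is_pinched(-0.5, 2.0)    # alpha in (-1, 0)
above = is_pinched(0.3, 2.0)      # alpha > 0
below = is_pinched(-1.7, 2.0)     # alpha < -1
```

Since the product of the two large components is $\alpha(1+\alpha)\kappa^2$, the outcome does not depend on the sign of $\kappa$, and the pinch occurs exactly for $-1 < \alpha < 0$.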
(Biii) $p^\mu$ is collinear to another vector $v'^\mu$ ($\neq v^\mu$) in $\rho^{\{\sigma\}}$. Then by construction, the confluence momentum $p^\mu + (k\cdot\bar v)\,v^\mu$ is hard, and both $(k\cdot\bar v)\,v^\mu$ and $k^\mu$ correspond to the same configuration of figure 4, i.e. two jet lines of different directions joining the hard subgraph together.
(Biv) $p^\mu$ is soft in $\rho^{\{\sigma\}}$. In the corresponding configuration of figure 4, a soft momentum $p^\mu$ is attached to a lightlike momentum $k^\mu$, which becomes $(k\cdot\bar v)\,v^\mu$ after the confluence. This configuration does not exist in any pinch surfaces of $A$, unless $k^\mu$ is collinear to $v^\mu$.

Figure 5: The upper row describes the case where $t_\sigma$ acts as a soft-collinear approximation on $k^\mu$ and $v^\mu = \bar\beta_I^\mu$. The lower row describes a hard-collinear approximation for which $v^\mu = \beta_I^\mu$. Each of them forces the external momentum entering the $p^\mu$-loop to be in the direction of $v^\mu$ in $\rho^{\{\sigma\}}$, and yields the configuration described in (Bii). The propagators marked bold and blue are for later use (to identify certain subgraphs) in section 2.4.
C. $k^\mu$ is collinear to $\bar v^\mu$ in $\rho^{\{\sigma\}}$

In this case $k^\mu$ is lightlike while $(t_\sigma k)^\mu = (k\cdot\bar v)\,v^\mu$ is soft. We again consider all the possible values of $p^\mu$, and discuss the configurations of figure 4 in the following subcases.
(Ci) $p^\mu$ is hard in $\rho^{\{\sigma\}}$. Then generally both $p^\mu + k^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are hard, so the configurations obtained by $k^\mu$ and $(k\cdot\bar v)\,v^\mu$ are identical.
(Cii) $p^\mu$ is lightlike in $\rho^{\{\sigma\}}$. We start with a special case: $p^\mu$ is collinear to $\bar v^\mu$. Then $k^\mu$, $p^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$ are all collinear to $\bar v^\mu$. However, there is a difference between this configuration and that in an original amplitude $A$, which we can observe from the denominator factors in eq. (2.17). The solution that produces a pinch when $k^\mu$ and $p^\mu$ are both in the direction of $\bar v^\mu$ is given by eq. (2.18). We notice that the range of $\alpha$ that gives a pinch is unbounded, which is different from the configuration in an original amplitude $A$, where $p^\mu$, $k^\mu$ and $p^\mu + k^\mu$ are all collinear to $\bar v^\mu$. This difference lies in the intrinsic coordinates.
In the general case, if $p^\mu$ is collinear to $v'^\mu \neq \bar v^\mu$, we still have eqs. (2.17) and (2.18), and the obtained configuration is still different from any configuration in $A$, due to the unbounded intrinsic variable $k\cdot v$. Meanwhile, since $v'^\mu \neq \bar v^\mu$, the two lightlike propagators carrying momenta $p^\mu$ and $k^\mu$ are in different directions, and then join each other to form another propagator collinear to $p^\mu$. This is another difference from any configuration in $A$, as we have encountered in (Bii) already.
With these differences in mind, we show in figure 6 two examples of case (Cii), with $t_\sigma$ as a hard- or soft-collinear approximation.
(Ciii) $p^\mu$ is soft in $\rho^{\{\sigma\}}$. In this case the three incoming (outgoing) momenta at the confluence, $p^\mu$, $(k\cdot\bar v)\,v^\mu$ and $p^\mu + (k\cdot\bar v)\,v^\mu$, are all soft, so they join at a soft vertex. This configuration does not exist in $A$, because we have a jet propagator attached to two or more soft propagators. For this reason, we shall call a jet propagator whose nonzero lightlike momentum becomes soft under $t_\sigma$, and which is attached to a soft vertex, a soft-exotic propagator. Two typical examples, where $t_\sigma$ is either a hard-collinear or a soft-collinear approximation on $k^\mu$, are shown in figure 7.
The two rows in figure 7 exhibit the lowest-order graphs, but in principle they can be the representatives of all-order graphs for the two cases, where the approximation on the momentum of the soft-exotic propagator ($k^\mu$) is hard-collinear or soft-collinear.
To be specific, if the approximation is hard-collinear, $k^\mu$ must be lightlike in $\sigma$, and collinear to the opposite direction in $\rho^{\{\sigma\}}$ (this phenomenon will be explained below in Theorem 1). The projected momentum is then automatically soft. If the approximation is soft-collinear, the propagator with $k^\mu$ must be soft and attached to one jet in $\sigma$, and become part of that jet in $\rho^{\{\sigma\}}$. Under the soft-collinear approximation, only the component opposite to the jet's direction is kept, so the projected momentum will be automatically soft in $\rho^{\{\sigma\}}$ as well.
D. $k^\mu$ is hard in $\rho^{\{\sigma\}}$

In this case, $(k\cdot\bar v)\,v^\mu$ is lightlike. Then if $p^\mu$ is neither collinear to $v^\mu$ nor soft, $p^\mu + (k\cdot\bar v)\,v^\mu$ is hard, and we come up with a configuration where $p^\mu$ flows into the hard subgraph, as at a corresponding pinch surface of $A$. If $p^\mu$ is collinear to $v^\mu$ or soft, the momentum $p^\mu + (k\cdot\bar v)\,v^\mu$ will be lightlike, and it is possible that some other collinear or soft subgraphs are pinched according to this lightlike momentum, and the hard subgraph may become disconnected (see figure 8). In other words, we have found a hard propagator attached to a jet vertex, all the other momenta flowing into which are collinear to a certain direction or soft. This can never happen in the pinch surfaces of $A$. This is similar to the previously discussed case, where a jet propagator is attached at a soft vertex. In comparison, we call a propagator carrying hard momentum which becomes lightlike under $t_\sigma$ a hard-exotic propagator.

Figure 6: Examples of case (Cii), where both $k^\mu$ and $p^\mu$ are lightlike in $\rho^{\{\sigma\}}$, and specially, $k^\mu$ is in the direction of $\bar v^\mu$. The upper row describes the case where $t_\sigma$ acts as a hard-collinear approximation on $k^\mu$ with $v^\mu = \beta_I^\mu$, while the lower row describes a soft-collinear approximation with $v^\mu = \bar\beta_I^\mu$. Due to these approximations, the component of $k^\mu$ which joins $p^\mu$ is $(k\cdot\bar v)\,v^\mu$, so that $k^\mu$ is pinched in the direction of $\bar v^\mu$ from our analysis. Meanwhile, the propagator with momentum $p^\mu + (k\cdot\bar v)\,v^\mu$ is put on shell as a jet propagator. The propagators marked bold and blue are for later use in section 2.4.

Regular and exotic configurations
Figure 7: Two examples where a jet propagator can end at a soft vertex in $\rho^{\{\sigma\}}$. In the upper row, $t_\sigma$ acts as a hard-collinear approximation and only keeps the $v$ ($= \beta_I$)-component of $k^\mu$ in $H^{(\sigma)}$. At the same time, it also acts as a soft-collinear approximation on the soft momentum entering $J_I^{(\sigma)}$. When this soft line in $\sigma$ becomes collinear to $\bar v^\mu$ ($= \beta_K^\mu$) in $\rho^{\{\sigma\}}$, for the same reasons as in (Bii) above, $k^\mu$ is pinched in the direction of $\bar v^\mu$. Then $(k\cdot\bar v)\,v^\mu$ vanishes at the pinch surface $\rho^{\{\sigma\}}$, and if the internal momentum $p^\mu$ is soft as well, all the incoming momenta at the confluence of $p^\mu$ and $(t_\sigma k)^\mu$ will be soft. In the lower row, $t_\sigma$ acts as a soft-collinear approximation on $k^\mu$, and projects it onto its $v$ ($= \bar\beta_I$)-component. Subsequently, a soft vertex forms when $k^\mu$ is collinear to $\beta_I^\mu$. Some propagators are colored red or green for later use in section 2.4.

After the enumeration above, we can classify the configurations of figure 4 into two types. To do so, we focus on a vertex of $t_\sigma A$, say $x$, and identify the momentum flowing into (or out of) this vertex whose normal space in $\rho^{\{\sigma\}}$ has the smallest dimension. We say that $x$ is a soft (jet, hard) vertex, if and only if the identified momentum is a soft (jet, hard) momentum. We then say the normal spaces are conserved at a given vertex if one of the following statements is true:
(1) the vertex is a soft vertex, and all the propagators attached to this vertex are soft;
(2) the vertex is a jet vertex, and all the propagators attached to this vertex are either soft or lightlike;
(3) the vertex is a hard vertex.
Otherwise, we say that the normal spaces are not conserved at this vertex. This concept helps us to classify the configurations of figure 4. That is, for a given vertex $x$ of an approximated amplitude, the subgraph composed by $x$ and its attached propagators is called a regular configuration if and only if the normal spaces are conserved at $x$. Otherwise, it is called an exotic configuration. Specifically, a jet propagator attached to a soft vertex corresponds to a soft-exotic configuration, while a hard propagator attached to a jet vertex corresponds to a hard-exotic configuration. A pinch surface with soft- or hard-exotic configurations is called an exotic pinch surface, otherwise it is called a regular pinch surface. These two types of pinch surfaces of $t_\sigma A$ are thus denoted as $\rho^{\{\sigma\}}_{\text{exo}}$ and $\rho^{\{\sigma\}}_{\text{reg}}$, and the divergences near their neighborhoods both should be considered in detail.

Figure 8: A pinch surface $\sigma$ that yields a hard-exotic configuration in $\rho^{\{\sigma\}}$, where the hard subgraph is disconnected in the latter. Here we use bold lines and arcs to denote hard propagators, and dashed arcs to denote soft propagators. Due to the hard-collinear approximations from $t_\sigma$, only the $\beta_I$- and $\beta_K$-components of the hard momenta flow into vertices $c$ and $d$ in $\rho^{\{\sigma\}}$ separately. So IR structures can emerge in the subgraph with these lightlike momenta being the external momenta. By definition, the hard-exotic propagators are $ac$ and $bd$.
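The classification into regular and exotic configurations can be phrased as a small decision procedure. The sketch below is one reading of the definitions above, with assumptions that should be checked against the full text: each line at a vertex is described by a hypothetical pair (type of the propagator's own momentum, type of the possibly projected momentum that enters the vertex), the vertex type is set by the entering momentum with the smallest normal space, and the names `RANK` and `classify_vertex` are invented for illustration.

```python
# Line types ordered by the dimension of their normal space (smallest first).
RANK = {"hard": 0, "jet": 1, "soft": 2}

def classify_vertex(lines):
    """Classify a vertex of an approximated amplitude.

    lines: list of (propagator_type, entering_type) pairs, where
    propagator_type describes the momentum carried by the propagator and
    entering_type the (possibly projected) momentum entering the vertex.
    Returns (vertex_type, configuration).
    """
    if not lines:
        raise ValueError("a vertex needs at least one attached line")
    # The vertex type follows the entering momentum with the smallest normal space.
    vertex = min((entering for _, entering in lines), key=RANK.__getitem__)
    # Normal spaces are conserved if no propagator is "harder" than the vertex.
    offenders = [prop for prop, _ in lines if RANK[prop] < RANK[vertex]]
    if not offenders:
        return vertex, "regular"
    if vertex == "soft" and "jet" in offenders:
        return vertex, "soft-exotic"   # jet propagator at a soft vertex
    if vertex == "jet":
        return vertex, "hard-exotic"   # hard propagator at a jet vertex
    return vertex, "exotic"

soft_exotic = classify_vertex([("jet", "soft"), ("soft", "soft"), ("soft", "soft")])
hard_exotic = classify_vertex([("hard", "jet"), ("jet", "jet"), ("soft", "soft")])
regular = classify_vertex([("jet", "jet"), ("jet", "jet"), ("soft", "soft")])
```

The first example mimics case (Ciii), where a lightlike propagator contributes only a soft projected momentum to the vertex; the second mimics case D, where a hard propagator contributes a lightlike projected momentum.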
We summarize our discussions in this subsection in table 1, giving all the possibilities of figure 4 that serve as configurations of sub pinch surfaces of t σ A. All the information in the table is given in our analysis and definitions above.
General approximated subgraphs of ρ {σ}
With these preparations, we now study the approximated subgraphs of ρ {σ} . For each ρ {σ} , its general picture can be obtained by combining the configurations in table 1. To make it clearer, we focus on the propagators of subgraph J (σ) I that are also lightlike in ρ {σ} , and make an observation that will be quite useful in the upcoming sections. We formalize it in bold as follows:
by the action of the approximations, which set its lines on shell. This is a general feature of nested subtractions, and we expect all such singularities to be cancelled in the full sum over forests. This will turn out to be the case.
Table 1: the possible configurations of figure 4, which depend on p µ , k µ in ρ {σ} and p µ + (t σ k) µ = p µ + (k · v) v µ . The abbreviation "col." represents "collinear".
Taking account of the associated approximations, we can depict γ in ρ {σ} , as is shown in figure 9. The whole subgraph includes the shaded area as well as those propagators whose momenta are denoted by l µ 2 , where the vertices denoted by x are arbitrary jet vertices in ρ {σ} . The approximations are from t σ : in σ the momenta l µ 1 are soft external momenta of J (σ) I while the l µ 2 are lightlike (in the β µ I -direction) external momenta of H (σ) in σ. Some propagators of γ may be attached to the hard part of ρ {σ} , either approximated by t σ or not.
The directions of the propagators in γ are determined only by the momenta l µ 1 and l µ 2 . On the one hand, the momenta l µ 1 are projected and become (l 1 · β I ) β µ I by the soft-collinear approximation of t σ . The momenta (l 1 · β I ) β µ I are lightlike, meaning all the external jet momenta from l 1 that enter the shaded area are parallel to β µ I , and only momenta parallel to β µ I can satisfy Landau equations for the internal loops of γ. On the other hand, only the β I -components of l µ 2 enter the subgraph H (σ) as shown by the hard-collinear approximation. Recalling that any vertex represented by x is a jet vertex, we identify the lightlike momentum that enters x (which is not l 2 · β I β µ I ), denote it by p µ , and assume it to be parallel to some jet direction β µ K . We may have K = I or not. If K = I, then no matter what directions the l µ 2 are in, all the momenta entering x are soft or collinear to β µ I . In this case the l µ 2 do not fix the direction of the propagators in γ. If K ≠ I, the propagators that contain the momenta l µ 2 have denominators of the form 2 p · β K (β I · β K ) l 2 · β I + iε if they are in the K-jet, and l 2 2 + iε if they are in γ. In order that these propagators are both lightlike at ρ {σ} , l µ 2 has to be pinched in the direction of β µ I since the only candidates for normal coordinates are their β I -components l 2 · β I . From these two aspects we see the whole shaded area can only be collinear to β µ I rather than any other direction, due to the effects of t σ . In conclusion, Theorem 1 is proved.
With the help of Theorem 1, we now define the jet subgraphs in ρ {σ} . We imagine a flow starting from the I-th external momentum and going inward to the hard subgraph. The flow only covers lightlike propagators, including those collinear to β µ I in ρ {σ} , and those lightlike in another direction β µ K ( = β µ I ) in σ, and become collinear to β µ K in ρ {σ} . The set of lightlike propagators that carry this flow, is defined as J  To end this subsection, we depict a general picture of ρ {σ} , figure 11, which contains all the configurations displayed in table 1, as well as the "merging of jets". To be specific, case (A) can occur inside the soft subgraphs S 1 and S 2 . Case (Bi) occurs at the connection between each jet and H 1 (or H 2 ); case (Bii) occurs where the propagators collinear to β µ J or β µ K flow into the blob in the direction of β

Subtraction terms with repetitive approximations
In this subsection we study the approximated amplitudes with repetitive approximations. The reason to introduce them, taking t σ 2 t σ 1 A for example, is to subtract double-counted and unphysical divergences from terms (approximated amplitudes) with fewer approximations, in other words t σ 1 A and t σ 2 A. The extra divergences of t σ 2 t σ 1 A are in turn cancelled by terms with more approximations. In order to obtain the subtractions that eliminate the infrared divergences properly, we should be clear about the rules such operators obey when transforming momenta and vector indices under repetitive approximations.
The rules themselves, should meet the following requirements. First, we only consider the "nested" repetitive approximations, namely, σ 1 ⊂ ... ⊂ σ n , and denote the relevant operator as t σn ...t σ 1 . We will define the action of t σn on t σ n−1 ...t σ 1 A in terms of its projections on the momenta and vector indices. Of course, momenta and vector indices may respond differently. For any two nested pinch surfaces σ 1 ⊂ σ 2 , we further require: and Both these relations imply two aspects. Taking eq. (2.19) for example, it implies the coincidence of pinch surfaces, i.e. σ , as well as the exactness of t σ 1 in the neighborhood n σ . In this subsection, we will give the rule for repetitive approximations, and only show that the rule is compatible with the exactness of relevant approximation operators. In section 4.1 later, we will see that σ , whose loop momenta have the same normal spaces as those of σ 1 ; similarly, σ , with identical normal spaces as σ 2 . These results are within our assumption in this subsection.
Before we work on the exactness of t σ 1 in eq. (2.19) and t σ 2 in (2.20), we make the following crucial observation: For any subtraction term t σn ...t σ 1 A that is not vanishing, each momentum or vector index of A is projected at most twice. More precisely, the projection is given by a soft-collinear approximation with respect to some β µ I followed by a hard-collinear approximation with respect to β µ K (≠ β µ I ). We see this as follows. First, when we increase the index i of σ i , going from smaller to larger regions, by eq. (2.11) a soft line may move into jets, and then become hard. So the approximations t σn ...t σ 1 can be seen as several sc I 's followed by several hc I′ 's, where I′ need not be different from I. Next, because both sc and hc act as projections, t σn ...t σ 1 for any given line is indeed an sc I followed by an hc I′ . Finally, if I′ = I, we will see at the end of this subsection (at eq. (2.31)) that a β 2 I = 0 factor will be produced. By neglecting those vanishing terms, the only nontrivial case we should consider is I′ ≠ I.
Figure 12: Subgraphs of two leading pinch surfaces σ 1 and σ 2 of A. The propagator with momentum k µ is soft in σ 1 , while collinear to β µ K (≠ β µ I ) in σ 2 . In the approximated amplitude t σ 1 A, the momentum k µ appearing in (p I + k) µ is replaced by
From the observations above, in order to verify eqs. (2.19) and (2.20), the only nontrivial case of σ 1 and σ 2 we need to consider is shown in figure 12 (as well as their corresponding approximations t σ 1 and t σ 2 ). Note that if β µ I and β µ K are back-to-back (β µ I = β µ K ), the proof of the two equations will be trivial. So for generality, we do not assume that. Now we can write down the rule of repetitive approximations on a momentum, and verify it is compatible with eqs. (2.19) and (2.20) by considering how the denominator factor (p I + k) 2 changes according to the approximations t σ 1 , t σ 2 and t σ 2 t σ 1 . According to (2.5), we have: The rule of repetitive approximations is as follows: we have p µ I ≈ p I · β I β µ I and k µ ≈ k · β K β µ K . This implies that (2.22) describes the correct rule for repetitive approximation on a momentum. Next, we show and verify the rule of repetitive approximations on a vector index. In order to do this, we consider a propagator carrying momentum k µ and a vertex to which the gauge boson attaches with index α in figure 12, and denote their product as V α . V α corresponds to either a ψψA α vertex or a ∂φφA α vertex, which can be generalized to a 3-gluon vertex. Again, it becomes trivial when β µ I = β µ K , so we do not assume this below. If the vertex is ψψA α , then V α = ( / p I + / k)γ α and under the approximations, (2.23) Then the rule gives In region n σ , V α always appears in the combination for V α / β I . Then eq. (2.25) where only the first term in this expression is leading, and is cancelled in (2.25).
where we have only kept the leading terms in the second line. This result contributes O (λ) to (1 − t σ 2 ) t σ 1 A because for the leading term in n σ , k α is scalar-polarized in the K-direction, so we can insert β Kα β α K in the first term without changing the leading behavior, which then cancels the second term exactly.
If the vertex is ∂φφA α , then V α = (2p I + k) α and the rule is given by eq. (2.28). As above, we easily verify eq. (2.29). The first equality there is due to the expansion of p I in n σ , while the second equality is due to treating k α as scalar-polarized in n σ . Note that this explanation can be generalized in the same way to a 3-gluon vertex, implying that eqs. (2.24) and (2.28) describe the correct rules for repetitive approximations on a vector index.
Figure 13: The graphical description for the repetitive approximations. The momentum k µ after the projections, which flows into the vertex x, is k · β K (β K · β I ) β µ I .
Having constructed the rules for repetitive approximations, eqs. (2.22), (2.24) and (2.28), on an amplitude A from the requirements (2.19) and (2.20), we emphasize again that in order to completely verify the requirements, we also need to show the coincidence of pinch surfaces. This will be done in section 4.1, when we deal with a stronger relation, eq. (4.5) there. The terms of the form t σn ...t σ 1 A will be seen as the proper subtractions to remove IR divergences.
For convenience, we add one more notation besides those introduced in eq. (2.7). As we have argued at the beginning of this subsection, the only nontrivial combination of approximations on a given momentum is a soft-collinear with respect to β µ I followed by a hard-collinear with respect to β µ K . In the upcoming text, especially in some of the figures, we will abbreviate a momentum projected by such a combination as IK k (where k µ is the original momentum) for simplicity. Explicitly, (2.30) In terms of graphs, they are represented by figure 13.
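The behavior of such a double projection can be checked numerically. The sketch below is illustrative only: it assumes lightlike pairs normalized as β · β̄ = 1, a hard-collinear projection k → (k · β̄) β and a soft-collinear projection k → (k · β) β̄. The assignment of β versus β̄ in each projection, the function names and the sample momenta are all assumptions of the sketch rather than definitions taken from the text. It uses exact rational arithmetic so that the vanishing for I = K is exact.

```python
from fractions import Fraction as F


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def scale(c, v):
    return tuple(c * x for x in v)


def null_pair(n):
    """Return (beta, beta_bar) for a unit 3-vector n, with beta.beta_bar = 1."""
    n = tuple(F(x) for x in n)
    beta = (F(1),) + n
    beta_bar = scale(F(1, 2), (F(1),) + tuple(-x for x in n))
    return beta, beta_bar


def hc(k, pair):
    """Assumed hard-collinear projection: k -> (k.beta_bar) beta."""
    beta, beta_bar = pair
    return scale(mdot(k, beta_bar), beta)


def sc(k, pair):
    """Assumed soft-collinear projection: k -> (k.beta) beta_bar."""
    beta, beta_bar = pair
    return scale(mdot(k, beta), beta_bar)


def double_projection(k, pair_I, pair_K):
    """Projection with respect to K, then I: (k.bbar_K)(b_K.b_I) bbar_I."""
    return sc(hc(k, pair_K), pair_I)


pair_I = null_pair((0, 0, 1))
pair_K = null_pair((F(3, 5), F(4, 5), 0))
k = (F(7), F(1), F(-2), F(3))

# Each projection is idempotent, so repeated copies collapse to one.
print(hc(hc(k, pair_K), pair_K) == hc(k, pair_K))
# For I = K the composite carries a factor beta_I^2 = 0 and vanishes.
print(double_projection(k, pair_I, pair_I))
# For I != K the result points along a single fixed lightlike direction.
print(double_projection(k, pair_I, pair_K))
```

Under these assumed conventions the composite is proportional to (β K · β I ), which is why the case of equal reference directions drops out identically.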
Given these results for repetitive approximations, we should generalize the analysis in section 2.2 to an amplitude acted on by repetitive projections: t σn ...t σ 1 A (σ 1 ⊂ ... ⊂ σ n ). Compared with the graphical representation of t σ A in figure 3, now we have more subgraphs whose external momenta have been modified for t σn ...t σ 1 A. Inside each of them there is a classical picture from the Landau equations at any pinch surface. To study the pinch surfaces, which are denoted by ρ {σn...σ 1 } , we again study the configuration in figure 4, but with repetitive approximations taken into consideration. In other words, we focus on the "confluence" of the double-projected momentum (t σ 2 t σ 1 k) µ with another momentum p µ in figure 14, using our notation for repetitive approximations introduced above.
In the figure I and K can be either equal or not. Let us assume I ≠ K first. Then we can analyze the configurations of figure 14 as we have done for figure 4, and one can check that everything follows similarly. The results are summarized in table 2, which can be seen as a generalization of table 1: in the former there are two vectors of reference (β µ I and β µ K ) while in the latter there is only one.
If we assume that β  is of the value p + IK k µ ≡ p µ + k · β K (β K · β I ) β µ I , due to the approximations t σ 1 (sc I ) and t σ 2 (hc K ).
If I = K, since β I · β K = 0, the confluence momentum p + IK k µ = p µ . Now we claim that whatever the configuration of figure 14 is, there is always a zero appearing as the overall factor in the approximated amplitude, which then will not contribute. This is because the soft-collinear approximations can only act on gauge bosons. In the presence of a hard-collinear approximation acting on the same line, this gauge boson must be scalar-polarized. From the definitions in eqs. (2.3) and (2.4), the vector index of the gauge boson is projected and we obtain for some jet velocity β. This vanishes because β 2 = 0. This observation implies that we do not need to consider the case I = K whenever we study the pinch surfaces of a repetitively-approximated amplitude and require them to be IR divergent, as we will see in Theorems 2, 4 and 6. Nevertheless, these terms will still be included in the analysis of section 4 so as to manifest the IR cancellation in a more direct way.
With the knowledge of table 2, we classify the various configurations of a pinch surface ρ {σn...σ 1 } into the types of regular and exotic, just as we have done for a ρ {σ} . Namely, the normal spaces are conserved at regular configurations and not conserved at exotic configurations. As one studies the divergences of t σn ...t σ 1 A, all such pinch surfaces need to be taken into consideration. Theorem 1, given at the end of section 2.2, can also be generalized from the knowledge of table 2. that are not collinear to β µ I , as γ. We depict γ in figure 15, which includes the shaded area as well as those propagators whose momenta are denoted as l µ 2 . The vertices x are arbitrary jet vertices in ρ {σn...σ 1 } . Since the propagators with a single approximation from t σn ...t σ 1 have been taken into account in the proof of Theorem 1, we only consider those with repetitive approximations. Each momentum denoted by l 1 is collinear to some β µ K in ρ {σn...σ 1 } and attached to H in some σ k 1 , while soft and attached to J I in some σ i 1 (⊂ σ k 1 ). The momenta denoted by l 2 are collinear to β µ I and attached to H in some σ i 2 , while soft and attached to J L in some σ j 2 (⊂ σ i 2 ). By construction, there are hard-collinear approximations hc K acting on l µ 1 , and soft-collinear approximations sc L on l µ 2 . Similarly to the proof of Theorem 1, first we focus on the external propagators with momenta l µ 1 . Due to the repetitive approximations, the external momenta that enter the shaded area will be of the form l 1 · β K (β K · β I ) β µ I , which is always in the direction of β µ I , ensuring that the propagators of γ that contains l µ 1 can only be collinear to β µ I in ρ {σn...σ 1 } as well. Then we focus on the propagators with momenta l µ 2 . After being acted on by the approximations, all the momenta entering the subgraph J through the jet vertices x are of the form l 2 · β I (β I · β L ) β µ L , rather than l µ 2 . 
In other words, the propagators that contain the momenta l µ 2 have denominators of the form 2 p · β L (β I · β L ) l 2 · β I + iε if they are in the subgraph J  be l 2 · β I , meaning that the propagators marked by momentum l µ 2 are also parallel to β µ I at ρ {σn...σ 1 } . In conclusion, Theorem 2 is proved.

Divergences are logarithmic
Another natural question on the effects of the approximation operators is whether they preserve the degree of divergence of the leading term near a pinch surface, which is logarithmic from power counting. We expect so, since in the process of showing IR finiteness, the IR divergences in an approximated amplitude need to be cancelled by some other subtraction terms, and if all the divergences are still logarithmic, we only need to show the coincidence of their leading terms (differing by a minus sign). Otherwise we would need cancellations for next-to-leading terms, etc., which would be more difficult. We shall take an arbitrary pinch surface ρ {σn...σ 1 } and discuss all its possible configurations in table 1, and relations with the forest {σ 1 , ..., σ n }. We classify the possibilities into four cases: "nested & regular", "overlapping & regular", "soft-exotic" and "hard-exotic". We will explain their meanings, and analyze them one by one. For the "overlapping & regular" and "soft-exotic" cases, we only consider a single approximation in this subsection, and put the generalizations to repetitive approximations in appendix A. After these analyses, we discuss whether the three features of a leading pinch surface of A, as introduced in section 1, still hold for a pinch surface ρ {σ 1 ,...,σn} with logarithmic divergence.

Nested & Regular
By "nested & regular" we mean that ρ {σn...σ 1 } is a regular pinch surface, and nested with every σ i (1 ≤ i ≤ n, and one may recall the definition in eq. (2.10)). Without loss of generality, we assume We argue below that though ρ {σn...σ 1 } may differ from any pinch surface of A, we can always find a corresponding pinch surface of A, say ρ A , which has the same set of normal coordinates as ρ {σn...σ 1 } .
The approximations acting inside S (ρ {σn...σ 1 } ) are provided by t σ m+1 , ..., t σn , because they must correspond to the pinch surfaces that contain ρ {σn...σ 1 } . But since a soft momentum remains soft after any projections, we can simply remove these approximations without changing the configuration of ρ {σn...σ 1 } or its degree of IR divergence (though the value of the leading term may vary). Similarly, the approximations inside H (ρ {σn...σ 1 } ) are provided by t σ 1 , ..., t σ m−1 . But since the momenta in H (ρ {σn...σ 1 } ) are hard and hence do not contribute to the degree of divergence, we can simply remove the approximations without changing the configuration of ρ {σn...σ 1 } or its degree of divergence. In conclusion, each ρ {σn...σ 1 } that is nested with every σ i (i = 1, ..., n) corresponds to a ρ A , a pinch surface of A, with their degrees of divergence being identical. Since ρ A is at worst logarithmically divergent, we conclude that the IR divergence of ρ {σn...σ 1 } is at worst logarithmic.

Overlapping & Regular
By "overlapping & regular" we mean the case where ρ {σn...σ 1 } is regular, and overlaps with some σ i (1 ≤ i ≤ n). To evaluate its degree of divergence, we consider the pinch surfaces with only a single approximation operator here, i.e. ρ {σ} , and include the discussion on repetitive approximations in appendix A.1. For simplicity, we will drop the superscript and use ρ instead, until the end of the power counting evaluation.
According to the definition, eq. (2.12), overlapping implies one of two possibilities. First, a hard, jet or soft subgraph of t σ A at ρ contains the corresponding subgraph at σ while another hard, jet or soft subgraph of A at σ contains the corresponding subgraph at ρ. Second, some jet subgraphs overlap, i.e. J The first case is simpler, since as in the "nested & regular" case, the action of t σ does not change the power counting procedure at ρ. We immediately come to the conclusion that the divergence near ρ is at worst logarithmic. Turning to the second case, the subtleties originate from the fact that J
In figure 16, m J is the number of external lines of γ that belong to the subgraph S (σ) J (ρ) K ; m S is the number of external lines that belong to J (σ) I S (ρ) ; n J is the number of internal lines that are attached to H (σ) but not to H (ρ) ; n H is the number of internal lines that are attached to H (ρ) but not to H (σ) ; finally, there are m H internal lines, each having an endpoint attached to both H (ρ) and H (σ) . These carry polarization β µ I into γ. Due to the operator t σ and the ranges of momenta in ρ, the vectors β µ I and β̄ µ I together with momenta from vertices in γ form invariants in the leading term, as is shown in the figure.
Now we undertake the power counting for this leading term. Suppose the degree of divergence is p (ρ) (γ), then by definition, where L is the number of loops, N is the number of propagators, V is the number of vertices, and n (ρ) num represents the numerator contribution. The number of independent loops in γ as well as those formed by the n J propagators, can be expressed in terms of N and V by Euler's formula, where the +1 in the bracket corresponds to the external vertices of the n J propagators.
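The loop counting used here is Euler's formula, L = N − V + C for a graph with N lines, V vertices and C connected components. A minimal sketch follows; the function name and the toy graphs are hypothetical, and the last example illustrates one reading of the "+1": the far ends of the n J lines are identified with a single extra vertex, so that L = N − (V + 1) + 1.

```python
def independent_loops(num_vertices, edges):
    """Number of independent loops, L = N - V + C, via union-find."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    components = num_vertices
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(edges) - num_vertices + components


# Triangle: three lines, three vertices, one loop.
print(independent_loops(3, [(0, 1), (1, 2), (2, 0)]))
# Two vertices joined by three lines: two loops.
print(independent_loops(2, [(0, 1), (0, 1), (0, 1)]))
# Two internal vertices (0, 1) plus one extra vertex (2) that collects the
# far ends of two external-type lines: N = 3, V + 1 = 3, so L = 1.
print(independent_loops(3, [(0, 1), (0, 2), (1, 2)]))
```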
Combining with the identity counting half-edges, num . In ρ, let α II be the number of the invariants β I · β I appearing in the expression of γ, α Il be the number of the invariants l · β I , and so on. Since from eq. (2.2) every momentum of the propagators in γ can be expressed as the numerator contribution can then be rewritten as in which each invariant counted in α Il and α ll contributes a factor λ, while the other invariants counted in α II and α Il contribute orders λ 0 . On one hand, the uppermost m J propagators in figure 16 are external propagators of γ, and projected onto the β I -component by the soft-collinear approximation of t σ . So each of them provides a β µ I to the subgraph γ. On the other hand, the lowest (n J + m H ) propagators are internal propagators of γ, and only the β µ I -component remains after the hard-collinear approximation. So equivalently, each of them is contracted with a β µ I . These lightlike vectors are generated by t σ , and there are also some generated when we focus on the leading behavior near ρ {σ} . For example, the m S external propagators are soft in ρ, so we can impose an scĪ (soft-collinear approximation with respect to β µ I ) on each of them without changing the leading behavior, and a β µ I is automatically provided to γ. Similarly, the n H internal propagators are collinear to β µ I and attached to H (ρ) in ρ, so we can impose an hc (ρ) I on each of them, and a β µ I is automatically provided. Now we can relate the numbers of different invariants by counting the vectors β µ I , β µ I and l µ . Explicitly, we have We can combine these relations and solve for α ll as The power counting carried out above is for the subgraph γ. If we consider the IR divergence of the entire graph t σ A, we also need to study how the external propagators of γ can affect the power counting for t σ A \ γ. First, lines counted in m H attach to H (ρ) and hence produce no other contributions to power counting. 
Second, each line counted in n J can produce a −1 in power counting by attaching to a line in some jet K (≠ I). Such lines would be in H (σ) J (ρ) K . Finally, each line in the set labelled m S produces a −1 in (2.42) from the power counting of γ. As we shall see, this is necessary to produce logarithmic divergences. Explicitly, for any regular ρ that overlaps with σ through a set of nonempty subgraphs J A is the contribution to p (ρ) (t σ A) from the jet subgraph in ρ. We decomposed p , whose lines are collinear to β µ A in ρ, and J A may be either from J (σ) A , H (σ) or S (σ) . That is, the contribution from the jet subgraph in ρ can be rewritten as the sum over those from J [A] A and γ A , as is shown above. The lower bound of the first term can be evaluated as A is the number of soft gauge bosons attached to J [A] A , and n J is the number of the n J propagators in figure 16 that are attached to J [A] A from subgraphs γ B (B ≠ A).
The first term in eq. (2.44) is from the standard power counting of a jet subgraph, which is obtained by removing the attachments of the n A , and the second (third) term is the extra denominator (numerator) contribution that is generated by these attachments.
The evaluation result of p (ρ) (γ A ) can be directly read from eq. (2.42), and the leading contribution of p (ρ) (S) can be obtained from a standard power counting. That is, (2.45) In (2.45), the symbols m S are the numbers of specific lines of γ A , which separately correspond to m H , n J and m S in figure 16. (Notice that n In order that the pinch surface ρ is divergent, we now have two additional requirements: A . This means that the propagators of J , a result that will be revisited in section 3.2. 2. num(n J ) = 0, which means that at the external vertices of the n J propagators (see figure 16), the numerator contribution must be of O(1). This property will be very helpful, and we will revisit it several times in the later analysis of section 4.2.
In figure 16, we have stated that for the leading contribution, each of the n H propagators provides a β µ I to contract with the other vectors in γ. If some of these n H propagators provide transverse polarizations β µ I⊥ instead, we can carry out a calculation similar to that from eq. (2.32) to (2.47), and find that each such propagator gives a λ 1/2 -suppression. Therefore, for an IR-divergent ρ {σ} , none of the n H propagators can be transversely polarized gauge bosons. The same conclusion holds for scalars and fermions. As a result, the subgraph γ can only be attached to H (ρ {σ} ) through scalar-polarized gauge bosons.
The analysis above is for a single approximation, and that for repetitive approximations is in appendix A.1. We restore the superscripts of ρ and draw the conclusion: the IR divergence at a regular pinch surface ρ {σn...σ 1 } , which overlaps with some of the {σ i }, is at worst logarithmic.
To end the discussion under the title of "overlapping & regular", we comment that for each ρ {σn...σ 1 } as a pinch surface of t σn ...t σ 1 A, we can find a corresponding pinch surface of A, say ρ A , as long as the jets do not have nontrivial overlaps (see figure 10 for example). To obtain the corresponding ρ A , we simply remove the approximation operators inside the solution. For example, after we do this for ρ {σn...σ 1 } in the upper row of figure 5, the subgraph J (σ) I J (ρ) K (blue lines) becomes collinear to β µ K . Similar procedures can be implemented for the repetitive-approximation case. The change from ρ {σn...σ 1 } to ρ A preserves the propagator types, i.e. a lightlike (hard, soft) propagator remains lightlike (hard, soft), though its direction may change. In other words, the set of normal coordinates of ρ {σn...σ 1 } may differ from that of ρ A , as long as β µ I and β µ K are not back-to-back, but their elements are in one-to-one correspondence.

Soft-Exotic
Having discussed the regular pinch surfaces, now we study the power counting when there are soft-exotic configurations. Again, we consider a single approximation t σ here, and the case of repetitive approximations will be discussed later in appendix A.2.
As is indicated before, a soft-exotic configuration can be induced by both hard-collinear and soft-collinear approximations. The first case is encountered when lines in a jet J (σ) I become part of another jet in ρ {σ} . As discussed in Theorem 1 of section 2.2, these momenta are pinched only in the direction of β µ I . The hard-collinear approximation then forces the projected momenta that flow into H (σ) to be soft, producing a pinch surface where a jet line appears to "decay" into soft lines (see figure 17(a)). The second case is encountered when a subgraph of S (σ) , which contains lines attached to J are soft in ρ {σ} . This can again produce a pinch surface where a jet line "decays" into soft lines (see figure 17(b)). We should consider both these cases, and shall start from the first one. The analysis of the second case will follow similarly.
For the hard-collinear case, the subgraph whose degree of divergence we shall calculate is J K as well as a soft subgraph S , where the m vertices of the exotic propagators are internal. In the following discussion, we define γ ≡ J (σ) I J (ρ) K S . We have marked this structure red in figure 17(a) below, as well as our previous example in (the upper row of) figure 7. The idea is similar to that in the calculations for the "overlapping & regular" case.
First, by dimensional analysis, the degree of divergence of the soft subgraph S , with external propagators removed is The degree of divergence p (ρ) (γ) is then where the α's satisfy From these, we can solve for n (ρ) num,J and hence p (ρ) (γ), with the result which shows the divergence is at worst logarithmic, because by referring to eq. (2.48), the first part 4 − b − 3 2 f is the contribution from a normal soft subgraph S , and the other term (−m S ) fits the contribution from the soft propagators attached to the jets, as we explained in the "Overlapping & Regular" discussion. So in this case the degree of divergence is unchanged.
Now we consider the case where the projection from t σ that acts on the lightlike propagators is a soft-collinear approximation. This time, the structure we shall study is S (σ) J Comparing figure 17(b) with 17(a), we see that this configuration can be treated in the same way as the previous case, since we can simply exchange β I and β̄ I . This indicates that the degree of divergence should be the same as in eq. (2.53). In conclusion, we have verified that the soft-exotic configuration preserves the logarithmic degree of divergence.

Hard-Exotic
Finally we study the degree of divergence of a pinch surface ρ {σn...σ 1 } with hard-exotic configurations. Denoting the graph of ρ {σn...σ 1 } as G, we can find a general procedure to decompose it into several subgraphs, whose degrees of divergence are known results, or easy to evaluate. The method separates contributions from the disjoint hard subgraphs that occur at generic hard-exotic pinch surfaces. This is achieved by the following recursive steps:
Step 1. Imagine a flow along the lightlike propagators of G in ρ {σn...σ 1 } , which starts from the external propagators and points towards the origin. Whenever a branch of the flow hits a vertex, it streams into the other propagators at this vertex, whose momenta are collinear to the same direction. A branch comes to its endpoint when it encounters a hard subgraph of ρ {σn...σ 1 } , and the whole flow is stopped when every branch comes to an end. We consider the union of the constructed flow, and contract all the endpoints together as a hard vertex, and denote the obtained subgraph as M 1 .
Step 2. Next we focus on the set of the hard propagators of G/M 1 "coterminous with" M 1 : those attached to the flow endpoints. We enlarge this set by including all the hard propagators which can join them through a series of other hard propagators, and denote this enlarged set as N 1 . The momenta of M 1 that flow into N 1 can be regarded as the external momenta of the truncated propagators of N 1 .
Step 3. If M 1 ∪ N 1 includes all the hard and jet propagators of G, there are no hard-exotic configurations, and we can jump to Step 4. Otherwise, some momenta of the propagators in N 1 must be projected to become lightlike, and combine with the normal coordinates of the loop momenta in G 1 ≡ G \ (M 1 ∪ N 1 ) to form pinches. From the definition in case (D) of section 2.2, these projected hard propagators are called hard-exotic propagators. Treating the momenta of these projected hard-exotic propagators as the external momenta of G 1 , we can recursively follow the same routine of Steps 1 and 2 to decompose G 1 into new subgraphs M i and N i , until their union has covered the whole of the hard and jet subgraphs of G. Note that in the M i 's, the "nested & regular" and "overlapping & regular" configurations can occur, which we have analyzed above.
Step 4. Finally we consider the subgraph of G, which is soft in ρ {σn...σ 1 } , and denote it as S. Technically S can be attached to both M i and N i . Given a vertex of ∪ i (M i ∪ N i ) where some propagators of S are attached, if it is a jet or soft vertex (defined in the "regular and exotic configurations" part of section 2.2), we say that S is attached to M at this vertex; if it is a hard vertex, we say that S is attached to N . For example, in figure 8, the soft subgraph S is attached to M at vertices a and b. Note that S can also combine with M to form soft-exotic configurations, which we have analyzed above.
The result of the procedure above is that the degree of divergence of G can be regarded as the sum of the contributions from {M i }, {N i }, and S. For each i, N i contributes zero to the degree of divergence, because it is made up of propagators with off-shell momenta in ρ {σn...σ 1 } . In other words, (2.54) Here we have again dropped the superscript of ρ for simplicity. Each subgraph M i can be seen as a set of jet subgraphs with external soft propagators, whose "external states" are the real external states of G (for i = 1), or the projected hard momenta from N i (for i > 1). Given a jet subgraph J Ki of M i , suppose b Ki is the number of soft bosons attached to it, f Ki is the number of attached soft fermions, v Ki is the number of its vertices to which a soft gauge boson is attached, and h Ki is the number of its physical partons attached to N i , then the degree of divergence of J Ki is by standard power counting methods like those used above [10,14,19,46]. The contribution from S is
From all the analyses for these four cases (nested & regular, overlapping & regular, soft-exotic and hard-exotic), we have verified that the approximation operators preserve the degree of IR divergence, which is hence still at worst logarithmic.
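Steps 1 and 2 of the decomposition can be sketched as two graph traversals. Everything below is a hypothetical illustration: the edge table, the vertex and line names, and the criterion that a vertex counts as an endpoint of the flow when a hard propagator is attached to it are assumptions of the sketch, and the toy graph is not a transcription of any figure.

```python
from collections import deque


def step1_flow(edges, external):
    """Step 1: follow lightlike lines of one direction from the external lines.

    edges: {name: (u, v, kind, direction)} with kind in {"jet", "hard", "soft"}.
    Returns the set of lines in the flow and the vertices where it stops.
    """
    hard_vertices = {x for u, v, kind, _ in edges.values()
                     if kind == "hard" for x in (u, v)}
    flow, endpoints = set(external), set()
    queue = deque(external)
    while queue:
        u, v, _, direction = edges[queue.popleft()]
        for x in (u, v):
            if x in hard_vertices:          # branch ends at a hard subgraph
                endpoints.add(x)
                continue
            for name, (a, b, kind, d) in edges.items():
                if (name not in flow and x in (a, b)
                        and kind == "jet" and d == direction):
                    flow.add(name)
                    queue.append(name)
    return flow, endpoints


def step2_hard_set(edges, endpoints):
    """Step 2: hard lines at the endpoints, closed under hard connectivity."""
    hard_set, seen, frontier = set(), set(), set(endpoints)
    while frontier:
        x = frontier.pop()
        seen.add(x)
        for name, (a, b, kind, _) in edges.items():
            if kind == "hard" and x in (a, b) and name not in hard_set:
                hard_set.add(name)
                frontier.update({a, b} - seen)
    return hard_set


# Hypothetical toy graph: two external jets, three hard lines, and two
# lightlike lines between c and d that the first flow never reaches.
edges = {
    "pI": ("extI", "a", "jet", "I"),
    "pK": ("extK", "b", "jet", "K"),
    "ab": ("a", "b", "hard", None),
    "ac": ("a", "c", "hard", None),
    "bd": ("b", "d", "hard", None),
    "cd1": ("c", "d", "jet", "I"),
    "cd2": ("c", "d", "jet", "K"),
}
m1, ends = step1_flow(edges, ["pI", "pK"])
n1 = step2_hard_set(edges, ends)
leftover = {name for name, e in edges.items()
            if e[2] in ("jet", "hard") and name not in m1 | n1}
print(sorted(m1), sorted(n1), sorted(leftover))
```

A nonempty `leftover` plays the role of G 1 in Step 3: the procedure would then be repeated with the projected hard momenta treated as external.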
Features of Leading ρ {σ 1 ...σn} As stated in the introduction, a leading pinch surface of A possesses three features: no soft lines attached to the hard subgraph, no soft fermions or scalars attached to the jet subgraph, and at most one line with physical polarization in each jet subgraph attached to the hard subgraph. However, due to the complex structures, as well as the breakdown of normal space conservation (defined in the "regular and exotic configurations" part of section 2.2), we are not guaranteed that these features are still present in a given t σn ...t σ 1 A at an arbitrary ρ {σ 1 ...σn} , even though the integral is (logarithmically) divergent there.
For example, in our power counting analysis for the soft-exotic configuration in figure  17, a soft fermion or scalar can join one of the m jet propagators without suppressing the logarithmic divergence, because they join each other at a soft vertex, which does not exert any constraints on the types of the soft partons. For hard-exotic configurations that yield logarithmic divergence, figure 8 for example, a soft parton can be attached to a hard propagator, and we can have more than one jet propagator with physical polarization attached to the (union of the connected components of the) hard subgraph.
Nevertheless, the basic features are still present for a regular pinch surface (more precisely, at a regular configuration). The reason is simple. As we have seen in the discussion of "nested & regular" and "overlapping & regular" in this subsection, as long as the jets do not have a nontrivial overlap in ρ {σ 1 ...σn} , each regular pinch surface of ρ {σ 1 ...σn} can be "mapped" to a pinch surface of A (ρ A ) by removing the approximations, and their normal coordinates are in one-to-one correspondence. So when we carry out the power counting procedure for ρ {σ 1 ...σn} , the factors that suppress its degree of divergence are exactly those that suppress the degree of divergence of ρ A . Namely, whenever there is more than one physical jet parton in the same direction, or a soft parton attached to the hard subgraph, or a soft fermion or scalar attached to the jet subgraph, a power suppression emerges.
This conclusion also holds when the jets in ρ {σ 1 ...σn} do have a nontrivial overlap (see figure 10 for example), because from our definition, the propagators of any given J . The other two features of leading pinch surfaces follow identically as above.
The whole of this section has centered on the approximation operators extensively. To summarize, we have figured out all the possible configurations in a pinch surface of t σ A, from which we deduced those in t σn ...t σ 1 A, and verified that all the pinch surfaces appearing in eq. (1.1), though various, still lead to logarithmic divergences. These results are fundamental in the upcoming study of IR cancellations in the forest formula.

Enclosed pinch surfaces
In section 2 we have studied the IR singularities of the approximated amplitude t σn ...t σ 1 A. An approximation operator in the approximated amplitude, say t σ i , describes the asymptotic behavior where the normal coordinates of σ i approach zero. The two terms t σn ...t σ i ...t σ 1 A and t σn ...t σ i+1 t σ i−1 ...t σ 1 A, then cancel in σ i for any choice of the other σ's. This motivates the idea of pairwise cancellations of the divergences near nested pinch surfaces. We must also, however, consider the cancellations near overlapping pinch surfaces.
More precisely, given a pinch surface ρ {σn...σ 1 } that overlaps with one or more σ i ∈ {σ 1 , ..., σ n }, how does one find the pairwise cancellation for its divergence? The way to do this, as we will explain in this subsection, is to study the maximal region simultaneously contained in σ i and ρ {σn...σ 1 } , which we call the "enclosed pinch surface of σ i and ρ {σn...σ 1 } ", and denote as "enc σ i , ρ {σn...σ 1 } " [17]. The formal construction of enc σ i , ρ {σn...σ 1 } follows from the definition of ordering in eq. (2.10), by defining the normal space (see (2.9)) of the loop momenta l µ at enc σ, ρ {σ} as The action of the direct sum symbol ⊕ is given in table 3. In the table, we use the notation, as in eq. (2.9),

with N (I) the normal space of momenta in the direction of β µ I . In the special case of a single approximation, (3.1) becomes This definition is natural because a larger normal space implies a smaller pinch surface (and a larger reduced graph). A direct sum of two normal spaces corresponds to a pinch surface simultaneously enclosed by the two pinch surfaces.
In the following, we will show that enclosed pinch surfaces are leading pinch surfaces of A. Once this is demonstrated, we are assured to find pairs of the subtraction terms from the forest formula (1.1), which contain the approximation operator associated with this enclosed pinch surface or not. This will result in the cancellation of overlapping divergences. We should note that if this were not the case, either the cancellation of overlapping divergences would not be this simple, or we would need to design a corresponding approximation operator for this enclosed pinch surface, as well as the subtraction terms containing this operator, then add them into the forest formula. This would turn out to enlarge the workload greatly, or even endlessly, which would be disastrous.
For this reason, it is necessary to study the definition, eq. (3.1) further. The first thing to do is to study the relations between the soft, jet and hard subgraphs of σ i , ρ {σn...σ 1 } and enc σ i , ρ {σn...σ 1 } . We will do this by developing a new algebra of normal spaces of pinch surfaces in section 3.1. We then turn to the question of whether enc σ i , ρ {σn...σ 1 } is a leading pinch surface of A. In section 3.2, we first assume ρ {σn...σ 1 } to be a regular pinch surface (defined in section 2.2), and then show that under this assumption, enc σ i , ρ {σn...σ 1 } possesses the three features of a leading pinch surface of A given in the introduction. After that, we consider the soft-and hard-exotic configurations of ρ {σn...σ 1 } in section 3.3, and confirm that they do not produce any anomalous structures of enc σ i , ρ {σn...σ 1 } that violate the three features. For convenience, the analysis in sections 3.2 and 3.3 is implemented within the framework of decay processes. Later in section 3.4, we explain why this analysis extends to wide-angle scatterings.
We note that section 3.1 shows the detailed analysis for a single approximation, and the corresponding details for repetitive approximations are provided in appendix B. In sections 3.2 and 3.3 it is sufficient to only work on the single-approximated amplitudes because the analysis for repetitive approximations will follow identically. Throughout these sections, we will denote enc σ, ρ {σ} ≡ τ for convenience.

Soft, jet and hard subgraphs of enclosed pinch surfaces
From the definition eq. (3.1) we can easily obtain the relation between the status of a loop momentum l µ in σ, in ρ {σ} and in τ . If l µ is hard in τ , then it does not have any normal coordinates, i.e. N τ (l µ ) is empty, so it must be hard in σ and ρ {σ} as well. If l µ is collinear in τ , to β µ for example, then it is either collinear to β µ in σ and hard in ρ {σ} , or vice versa, or collinear to β µ in both σ and ρ {σ} . If it is soft in τ , then either it is also soft in σ and/or ρ {σ} , or it is collinear in both σ and ρ {σ} , but to different lightlike vectors.
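The case analysis above can be sketched as a small lookup rule. This is only an illustration under stated assumptions: table 3 is not reproduced here, so the function `direct_sum` below encodes the rules as inferred from the preceding paragraph, and the names `NormalSpace`, `HARD`, `SOFT` and `collinear` are hypothetical labels rather than notation from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NormalSpace:
    """Label for the normal space of a momentum at a pinch surface.

    "hard"      -> empty normal space
    "collinear" -> normal space N(I) of a jet direction I
    "soft"      -> full 4-dimensional space
    """
    kind: str
    direction: Optional[str] = None


HARD = NormalSpace("hard")
SOFT = NormalSpace("soft")


def collinear(direction: str) -> NormalSpace:
    return NormalSpace("collinear", direction)


def direct_sum(a: NormalSpace, b: NormalSpace) -> NormalSpace:
    """Assumed action of the direct sum on two normal spaces.

    Inferred rules: hard acts as the identity, soft absorbs everything,
    equal jet directions are idempotent, different directions give soft.
    """
    if a.kind == "hard":
        return b
    if b.kind == "hard":
        return a
    if a.kind == "soft" or b.kind == "soft":
        return SOFT
    return a if a.direction == b.direction else SOFT


def enclosed(n_sigma: NormalSpace, n_rho: NormalSpace) -> NormalSpace:
    """Normal space of a loop momentum at tau = enc[sigma, rho]."""
    return direct_sum(n_sigma, n_rho)
```

For instance, `enclosed(collinear("I"), HARD)` gives the collinear label for direction I, while `enclosed(collinear("I"), collinear("K"))` gives `SOFT`, matching the last sentence of the paragraph above.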
It is not yet adequate, however, to only relate the loop momenta in τ to those of σ and ρ {σ} . Rather, we need to extend such relations to line momenta. In other words, we aim to prove the following, extending eq. (3.1) for loop momenta to all line momenta.
Theorem 3: For the momentum of any propagator of A, say p µ , we have where N τ (p µ ) is the same normal space defined by eq. (3.1). Once Theorem 3 is proved, we can immediately relate the subgraphs of τ to those of σ and ρ {σ} . That is, applying the very reasoning given in the first paragraph of this subsection to each propagator, we have where, for example, every line in J (τ ) I carrying momentum p µ has N τ (p µ ) = N (I) . Eq. (3.6) will be a powerful tool in constructing and understanding the subgraphs of an enclosed pinch surface.
To make the relation between the loop momenta and line momenta specific, we begin with planar graphs. We assign the loop momenta of A as the counter-clockwise momenta going around its loops as shown in figure 19. In order to match the values of the external momenta, we mark the "external loop momenta" ±p µ 1 , ..., ±p µ N , which are outside the graph and lightlike.
In this notation, the momentum of a propagator can be easily obtained from those of the loops. For planar graphs, it is simply the difference between two of its loop momenta that flow through the propagator. In figure 19, the momentum of propagator ab (from a to b) is thus l µ 5 − l µ 3 . By comparison, in a nonplanar graph we may have a linear combination (with coefficients ±1) of three or more loop momenta. We will first prove that a momentum of the form p µ ij ≡ ±l µ i ± l µ j also satisfies eq. (3.5), and then directly generalize the conclusion to nonplanar graphs. After showing these results, Theorem 3 is proved, and we have the relations between subgraphs, eq. (3.6).
As a preliminary to the proof, we introduce an operator denoted by ⊛: (N , N ) → N . The action of ⊛ is defined in table 4, where the notations N (soft) and N (I) have been introduced in eqs. (3.2) and (3.3). The motivation for this operation is to obtain the normal space of a momentum that is the linear combination of two independent momenta, whose normal spaces are known. In other words, given any independent momenta l µ i and l µ j together with a pinch surface λ (= σ, ρ {σ} or τ ), the normal space of the linear combination ±l µ i ± l µ j in λ is N λ (l i ) ⊛ N λ (l j ). For example, it equals N λ (l i ) if l µ j is soft in λ (and vice versa), and it equals N (I) if l µ i and l µ j are both collinear to the same direction β µ I in λ.
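A minimal sketch of the star operation may help fix ideas. Table 4 is not reproduced here, so the rules below are assumptions read off from the cases discussed in the text: a hard momentum combined with anything is hard, a soft momentum leaves the other normal space unchanged, two momenta collinear to the same direction stay collinear, and two momenta collinear to different directions combine into a hard momentum. The string labels and the function name `star` are hypothetical.

```python
HARD = "hard"   # empty normal space
SOFT = "soft"   # full 4-dimensional normal space


def star(a: str, b: str) -> str:
    """Assumed normal-space label of a linear combination of two momenta.

    Collinear momenta are labelled by strings such as "coll:I".
    """
    if a == HARD or b == HARD:
        return HARD
    if a == SOFT:
        return b
    if b == SOFT:
        return a
    # both collinear: same direction stays collinear, otherwise hard
    return a if a == b else HARD


def line_label(loop_labels):
    """Label of a line momentum built from several loop momenta."""
    result = SOFT  # SOFT is the neutral element of star
    for label in loop_labels:
        result = star(result, label)
    return result
```

With these rules the operation is associative and commutative, so `line_label` does not depend on the order in which the loop momenta are combined, which is the property used later when the argument is extended to nonplanar graphs.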
Clearly, this star symbol relates the loop momenta with the propagator momenta in our construction, though l µ i and l µ j can be either projected by t σ or not. In more detail, suppose where for some lines, t σ l µ = l µ . The normal space of p ij in σ and ρ {σ} then, separately satisfies Now we can go on with the proof of Theorem 3.
Proof of Theorem 3: Using eq. (3.8) with momentum p µ ij , the result we wish to prove, eq. (3.5), becomes Next, using the defining property of the normal space for a single loop momentum, eq. (3.1), we can rewrite (3.9) as We shall prove this relation, which is equivalent to (3.1) and hence Theorem 3. The method of proving eq. (3.10) is to find all the cases of N ρ {σ} (p µ ) that appear in t σ A, given the pinch surface σ, which may differ from N ρ (p µ ) where ρ is another pinch surface of A. The proof depends on the action of t σ on the momenta in eq. (3.10). This action can be an identity operator, or exert a hard-collinear or soft-collinear approximation on l µ i or l µ j .
Case 1: t σ = 1 in eq. (3.10). First we analyze the case where t σ = 1 on l µ i and l µ j . This happens when the confluence of l µ i and l µ j is at an internal vertex of H (σ) , J (σ) or S (σ) . Equivalently, l µ i and l µ j are simultaneously hard, soft or collinear to a given direction in σ. For the case where they are both hard in σ, we have N σ (l i ) = N σ (l j ) = ∅, and (3.10) reduces to the identity Similarly, for cases where they are both soft in σ, it is obvious that both sides of (3.10) equal the full 4-dim space.
For the case where l µ i and l µ j are both collinear to, say β µ I in σ, N σ (l i ) ⊛ N σ (l j ) is then equal to span β µ I , β µ I⊥ . Suppose first that β µ I is contained in N ρ {σ} (l i ) ⊛ N ρ {σ} (l j ) . If it is, then the RHS of eq. (3.10) equals the whole space, because β µ I is not contained in N (I) . Moreover, β µ I must be in both N ρ {σ} (l i ) and N ρ {σ} (l j ). So on the LHS of (3.10), using table 4 we have (3.12) and the two sides of (3.10) match.  So the LHS of (3.10) equals N (I) as well.
Case 2: t σ ≠ 1 in eq. (3.10). For cases where t σ provides nontrivial approximations on l µ i and l µ j , figure 20 shows how these arise. In figure 20, (a) describes a jet momentum collinear to β I joining the hard part, so a hard-collinear approximation is applied on l µ i ; (b) describes the case of two jet momenta of different directions (β I and β K ) merging into a hard momentum, so two hard-collinear approximations are applied separately on l µ i and l µ j ; (c) describes a soft momentum joining a jet momentum which is in the direction of β I , so a soft-collinear approximation is applied on l µ j . The considerations below apply when the line of momentum l µ i ±l µ j is internal to H (σ) (for (a)) or a J with I l i defined in (2.7). The RHS of (3.14) indicates that l µ i is replaced by I l µ i (= t σ l µ i ) inside H (σ) . Also, note that N ρ {σ} I l i may not be equal to N (I) because l µ i can be soft or parallel to β µ I in ρ {σ} . Similarly, for figure 20(b), eq. (3.10) becomes To verify eqs. (3.14) and (3.15), we show first
For the configuration in figure 20(c), we use that N σ (l j ) is the full space, and N σ (l i ) ⊛ N σ (l j ) = N (I) , to show that eq. (3.10) is equivalent to (3.17). To verify this relation, we make the following observation. Because the approximation sc I projects the momentum l µ j onto the direction of β µ I , the coordinate β µ I must be included in N ρ {σ} I l j . After this observation the idea is the same as that in the last paragraph. By noticing that in ρ {σ} , l µ i can only be of four types (hard, collinear to β µ I , collinear to β µ I and soft), we study them one by one. For the case of being hard or collinear to β µ I , both sides of (3.17) are equal to span β µ I , β µ I⊥ = N (I) . When l µ i is collinear to β µ I or soft in ρ {σ} , the LHS of (3.17) is the full 4-dim space. Also, we have β µ I ∈ N ρ {σ} (l i ) N ρ {σ} I l j , which means the RHS is the full 4-dim space as well. Therefore, we have proved eq. (3.17), and (3.10) holds for figure 20(c) as well.
Finally, we comment that the proof above is also sufficient for non-planar graphs, or other loop assignments in a planar graph, where the propagator momenta are of the form p µ = Σ i a i l µ i , with the sum running over i = 1, ..., L, L being the number of loops, and a i = ±1, 0. Taking the case where p µ = l µ i + l µ j + l µ k as an example, we use the associativity of ⊛ and aim to prove the following relation: which is equivalent to the following analogue of eq. (3.10): Define l µ ij ≡ l µ i + l µ j . Then from the proof of (3.10) we have Inserting this into (3.19) and using (3.8) again, we only need to show which is of the same form as (3.10), with l i → l ij , l j → l k . Therefore, we see that the same reasoning works for non-planar graphs as well. This completes the proof of Theorem 3.
Now we have verified the correctness of eq. (3.5), the normal space relation for arbitrary lines, which then implies the graphical relation (3.6) for the enclosed pinch surface. Relating the subgraphs of different pinch surfaces, (3.6) is very helpful for our understanding of the structures of an arbitrary enclosed pinch surface.
We emphasize that to derive the relation between subgraphs, the approximation operator t σ is indispensable in eq. (3.1). That is, if we rewrite (3.1) by replacing ρ {σ} by ρ (a pinch surface of A) in the definition of enclosed pinch surfaces, no relations between the subgraphs of σ, ρ and enc [σ, ρ], like (3.6), will still hold in general. To be specific, suppose we define the enclosed pinch surface of σ and ρ, which are two pinch surfaces of A, as Then we can still construct the pinch surface enc [σ, ρ] from the loop momenta. But we would not have any line-momentum relations that are as simple as eq. (3.6). An example to illustrate this is provided in appendix B.1.
Finally we generalize Theorem 3 by taking repetitive approximations into account.
Theorem 4: Suppose σ 1 ⊂ ... ⊂ σ n are a series of nested leading pinch surfaces of A. For the momentum of any propagator of A, say p µ , we have The proof is similar to that of Theorem 3, but it requires a more extensive discussion. We give the detailed proof in appendix B.2, and only provide a sketch here.
The case where every σ i is contained in ρ {σn...σ 1 } is automatic, because both sides of eq. (3.23) are N σn (p µ ). Otherwise, we can make use of the defining property (3.1) and table 4, to rewrite (3.23) into a form similar to (3.10). The result is, which depends upon the combination of approximations, t σn ...t σ 1 .
We classify the explicit expressions of eq. (3.24) into two types: those where t σm is an identity operator for l µ i and l µ j , and those where t σm is not the identity for l µ i or l µ j . In the case where t σm = 1, l µ i and l µ j must be simultaneously hard, soft or collinear to a certain direction in σ m . In the case where t σm ≠ 1, its actions on l µ i and l µ j are sufficiently described by the three cases in figure 20. These are in common with our arguments in proving Theorem 3. Meanwhile, approximations from other pinch surfaces, i.e. σ i where i ≠ m, may also act on l µ i and l µ j . We then need to consider all the possibilities, and check that the relation (3.24) is true in each of them. Throughout the proof, we will see that the fact that σ m is the smallest pinch surface that contains or overlaps with ρ {σn...σ 1 } plays a key role.
Once Theorem 4 is proved, we can immediately come to the relations among the subgraphs of τ , σ and ρ {σn...σ 1 } given in eq. (3.25), which generalize eq. (3.6) without changing its algebra.

Enclosed pinch surfaces are leading
In the previous subsection we verified eq. (3.6), the relation between the subgraphs of an enclosed pinch surface τ ≡ enc[σ, ρ {σ} ] and those of σ and ρ {σ} . We also generalized it to repetitive approximations in (3.25). If we construct an enclosed pinch surface by means of such relations, the result should be a well-defined pinch surface of A. Our next goal is to confirm that it is leading, when both σ and ρ {σ} are IR-divergent pinch surfaces of A and t σ A respectively. This is formulated as follows.
Theorem 5: If σ is a leading pinch surface of A, and ρ {σ} is an IR-divergent pinch surface of t σ A that overlaps with σ, then τ ≡ enc[σ, ρ {σ} ] is a leading pinch surface of A.
The notion of overlapping is defined in section 2.1. If ρ {σ} is an IR-divergent regular pinch surface, then from the discussion at the end of section 2.4, it possesses the three features of a leading pinch surface of A described in the introduction. A natural idea for the proof of Theorem 5, then, is to proceed by contradiction. Namely, suppose some of the three features are violated for our constructed τ ≡ enc[σ, ρ {σ} ] from eq. (3.6). Then we aim to find the contradiction by proving that either σ or ρ {σ} does not preserve these three features, which implies that (A) div n[σ] = 0, or (t σ A) div n[ρ {σ} ] = 0. This is what we shall do in this subsection, and the argument can be directly generalized to repetitive approximations, as will be discussed at the end.
For convenience, we again drop the superscript of ρ {σ} and denote it by ρ within this subsection. Some notations to be used are in table 5.

Connections between S (τ ) and H (τ )
We begin by showing that no soft propagators can join to the hard subgraph in τ . We only need to focus on one connected component of H (τ ) , and will prove the claim by contradiction: we suppose there exists a soft line attached to the hard subgraph in τ . It is easy to see that the specific soft propagator cannot be from S (σ) or S (ρ) , because otherwise either σ or ρ will have a soft line connected directly to its hard subgraph, and would then not be a leading pinch surface by power counting. So the propagator can only be from J

Soft fermions and scalars attached to J (τ )
We next show that no soft fermions or scalars can be attached to the jet subgraphs in the pinch surface τ .
We shall prove this by contradiction: in the presence of a soft fermion or scalar attached to J (τ ) , the IR divergence of A at σ, or that of t σ A at ρ, would be suppressed. Suppose such a soft fermion or scalar propagator is labelled ab, attaching vertices a and b of the graph. Similarly to the arguments above, we observe that ab can neither be from S (σ) nor S (ρ) , because otherwise, from S (σ) for example, it would then be attached to either J (σ) or H (σ) . Neither case is allowed since σ is a leading pinch surface of A.
To see that ab is not from J  K . The whole chain is then "distorted": it either passes through S (ρ) or H (ρ) . There is an obvious suppression on the IR divergence in the former case, because we will obtain soft scalars or fermions attached to the jets in ρ. The latter case, on the other hand, can be depicted almost identically as figure 21, except that the vertex a is not included in H (ρ) (enclosed by the dashed curve) in ρ. The argument that we have given in case (i) above, still applies, because all the external propagators of γ ≡ H (σ) H (ρ) are scalar-polarized gauge bosons. Such configurations vanish in the sum over all gauge-invariant set of graphs, and again region ρ is not leading.
Similarly, we rule out the possibility that ab = j (H ρ ) I,phys in ρ while being an internal propagator of J (σ) In conclusion, no soft fermions or scalars can be attached to the jets in τ .

Number of propagators in
The final step in verifying that τ is a leading pinch surface of A is to prove that in each jet J I,phys in the pinch surface σ. We start by proving that the contribution is at most one. We notice that j (H ρ ) I,phys cannot be from S (σ) or J (σ) K where K = I, because from eq. (3.6) it would then be a part of S (τ ) rather than J (τ ) I . We have treated these cases above, so all we will need to consider is j I,phys must be linked by a "chain of physical partons", which extends to the I-th external particle. Graphically, such a chain must cross the boundary between J The reason for the existence of a physical chain is as follows. If the I-th external particle is a scalar or a fermion, then as noted in case (ii) of the discussions on soft fermions and scalars attached to J (τ ) above, such a chain describes the flow of charge. Next we consider the case where the external particle is a transversely polarized gauge boson. For any graph γ with internal propagators lightlike in the direction of β µ I , and gauge bosons being its external propagators, we want to show that if one of these gauge bosons is transversely polarized, there must be another one transversely polarized. Denoting the external momenta of γ by p µ , and the loop momenta by l µ , then we have generic factors in the expression of γ of the type: where µ 1 , ..., µ m , ν 1 , ..., ν n are the vector indices of the external gauge bosons. Apparently in the first factor, µ 2 =⊥, and in order that the second factor is nonzero, one of the µ i 's (1 i m) must be ⊥. In other words, a gauge boson with transverse polarization must get out after going into γ. As a result, transversely polarized gauge bosons must also form a physical chain. We end by explaining why the number of lines in J (H τ ) I,phys is not zero. Suppose it is zero, we focus on the subgraph H (σ) H (ρ) , and observe that in ρ every attached parton plays the role of a scalar-polarized gauge boson. Figure 21 again shows the general case. 
Though such pinch surfaces may be IR divergent, the sum over all similar configurations vanishes due to the Ward identity [14,20] in region ρ, so that ρ is not leading.
To finish the proof of Theorem 5, we also need to consider the exotic configurations of ρ {σ} , and show that any enclosed pinch surface produced by them does not violate the three features of a leading pinch surface of A. We do this in the next subsection.

Extension to exotic configurations
In section 3.2, we have assumed ρ {σ} to be an IR-divergent regular pinch surface, so that we can apply the three features of A introduced in section 1 to ρ {σ} . That is, a soft parton cannot be attached to the hard subgraph; a soft fermion or scalar cannot be attached to the jet subgraph; in each jet there is exactly one physical parton attached to the hard subgraph. We "endowed" ρ {σ} with these features, with which τ can be proved leading.
However, as is pointed out at the end of section 2.4, these three features no longer hold at exotic configurations. As a result, we need to remove this limitation by showing that whatever exotic configurations ρ {σ} has, the corresponding configuration of τ ≡ enc σ, ρ {σ} never violates the three features above. This is what we shall do in this subsection.
Before we start, we recall that the jet propagators in ρ {σ} , whose projected momenta are soft and are attached to other soft propagators, have been denoted as "soft-exotic propagators" in case (Ciii) of section 2.2 (see figure 7 and table 1). Similarly, the hard propagators in ρ {σ} whose projected momenta are collinear to certain directions, and make some other momenta pinched in alignment with them, have been denoted as "hard-exotic propagators" (figure 8 and table 1).
We depict possible exotic configurations in ρ {σ} in figures 23 (soft-exotic) and 24 (hard-exotic) below. For a soft-exotic propagator that is collinear to, say β µ I , in ρ {σ} , t σ must project it onto the direction of β µ I in order to make it soft. If t σ acts as a hard-collinear approximation, then by definition, the soft-exotic propagator must be collinear to β µ I in σ and thus belong to J (σ) I .
The analysis for hard-exotic configurations is similar. For a given hard-exotic propagator, let us suppose t σ projects it onto the direction of β µ I . If t σ acts as a hard-collinear approximation, then by definition, the hard-exotic propagator must be collinear to β µ I in σ and attached to H (σ) . From the point of view of ρ {σ} , the propagators in H (σ) attached to this hard-exotic propagator can only be soft or collinear to β µ I , as is shown in figure 24(a). If t σ acts as a soft-collinear approximation, then the hard-exotic propagator is soft in σ and attached to jet lines in the direction of β µ I .
Moreover, the corresponding configurations in τ are relatively simple. In figures 23 and 24(b), the enclosed pinch surface includes a soft vertex and the (soft) propagators attached to it, which is compatible with any τ that is leading. In figure 24(a), region τ has a jet subgraph in a certain direction and one or two soft partons entering it. In order that it is compatible with a leading pinch surface τ , we need to show that these soft partons are gauge bosons. This follows directly from the requirement that (t σ A) div n[ρ {σ} ] ≠ 0, because otherwise there would be a suppression compared to the logarithmic divergence of ρ {σ} , making it IR finite. Therefore, all the configurations of τ in figures 23 and 24 are compatible with a leading pinch surface of the original amplitude A.
We comment on generalizing our argument above to repetitive approximations, i.e. configurations of the pinch surfaces ρ {σn...σ 1 } . Given a soft-exotic or hard-exotic propagator, if there is only one approximation acting on it, then everything follows identically to the single-approximation case. If there are two approximations, then from our explanations in section 2.3, these approximations must be an hc
After we combine these conclusions with those obtained in section 3.2, Theorem 5, which is introduced at the beginning of section 3.2, is proved. Moreover, we are able to generalize it to repetitive approximations. In summary, we have Theorem 6, the generalization of Theorem 5 to repetitive approximations.
We emphasize again that this conclusion is of great significance in the pairwise cancellations of the divergences in the forest formula, eq. (1.1). To be specific, each approximated amplitude t σn ...t σ 1 A corresponds to a series of nested pinch surfaces σ 1 , ..., σ n . If an IR-divergent pinch surface ρ {σn...σ 1 } is overlapping with some of them, we find the smallest one among them, say σ m (1 ≤ m ≤ n). Then from Theorem 6, τ ≡ enc[σ m , ρ {σn...σ 1 } ] is also a leading pinch surface of A (σ m ⊂ τ ⊆ σ m−1 ), so the terms with t τ appear in the forest formula. What we will find is that, by adding or eliminating t τ in t σn ...t σ 1 A, we obtain two terms whose divergences near ρ {σn...σ 1 } cancel each other. Such a pairwise cancellation will be discussed in more detail in section 4.2.
So far the discussions in sections 3.2 and 3.3 apply to electroweak induced decay processes, for which H (σ) ∩ H (ρ) is always non-empty, since it must contain the electroweak vertex (or other external current). In fact, we can generalize the analysis to wide-angle scatterings.

The case of wide-angle scatterings
We briefly recap what we have now. Given an amplitude A of the decay process, we can list its leading pinch surfaces {σ i }, and obtain approximated amplitudes of the form t σn ...t σ 1 A. For each approximated amplitude, though its IR-divergent pinch surfaces may not be leading pinch surfaces of the original amplitude A, we find that the enclosed pinch surface τ ≡ enc[σ, ρ {σ} ] must be, which can be generalized to repetitive approximations.
Now we extend our conclusion to the m → n wide-angle scattering amplitudes, namely, m external particles of the initial state are scattered into n external particles of the final state, with all the particle momenta in different directions. A leading pinch surface is shown in figure 25, with the external momenta being {p i } (i = 1, ..., m) and {q j } (j = 1, ..., n). Compared with the decay processes, the only subtlety in wide-angle scatterings lies in the intersection of hard subgraphs. That is, unless H (σ) ∩ H (ρ {σ} ) = ∅, the pinch surfaces of wide-angle scattering can be seen as identical to those of decay processes, and our analyses in sections 3.2 and 3.3 apply. Now we prove by contradiction that the case H (σ) ∩ H (ρ {σ} ) = ∅ never occurs for ρ {σ} . Note that if H (σ) ∩ H (ρ {σ} ) = ∅, H (ρ {σ} ) must transfer momentum by lines from S (σ) . But according to the soft-collinear approximations of t σ , momenta flowing out of S (σ) are always in the direction of β µ I , and hence never scatter the jet lines carrying momenta p µ 1 , ..., p µ m into external momenta q µ 1 , ..., q µ n ; instead, they are only taken off-shell. Figure 8 in section 2.2 serves as an example for the explanations above. We can consider figure 8 as part of a scattering process, with β I and β K labelling final-state lines, resulting from a one-loop hard scattering H (σ) .
In ρ {σ} the propagator of S (σ) is hard, which, after projection by the soft-collinear approximations, becomes lightlike in the directions of β In conclusion, Theorems 5 and 6 not only work for decay processes, but for hard scattering processes as well. We are now ready to work on the IR cancellations in the forest formula, eq. (1.1).

The proof of cancellations
In this section, we confirm that the full set of forest subtractions, eq. (1.1), eliminates all singularities. For convenience, we rewrite (1.1) as eq. (4.1) and explain its notation in more detail. Here F [A] refers to the set of forests of A, in which each forest F is defined by a set of leading pinch surfaces (LPS). For each F , we have a corresponding series of {σ 1 , ..., σ n }, and the t σ i products are ordered. (The value of n depends implicitly on each forest F .) Namely, the approximation operator with a smaller pinch surface appears to the right of that with a larger pinch surface, as discussed in section 2.3. Finally, the subscript "div" refers to the IR divergences from the whole sum over forests, except for singularities that are cancelled by the Ward identity in the sum over all hard subgraphs of the same order. Then eq. (4.1) means that in the whole sum, the IR divergences are thoroughly cancelled. In other words, the remainder of A after all the subtractions is finite. If we set the on-shell particle masses to be small rather than zero, then all the large logarithms disappear in the remainder.
As noted in the introduction, the idea of a forest formula originates from the BPHZ renormalization scheme as a subtraction method of UV divergences in Feynman graphs [1][2][3]. In the UV case, for the integrand of a general Feynman graph, with overlapping and nested subgraphs whose degrees of UV divergence are non-negative, the prescription for the renormalized integrand is given by eq. (4.3), where t d(γ) p(γ) is the operator on the subgraph γ, which acts by performing a Taylor expansion in its external momenta p (γ) up to the degree of UV divergence d (γ). For the remaining subgraph A \ γ, the operator acts as an identity. The t d(γ) p(γ) products are ordered, so that the operator with a smaller subgraph γ appears on the right. The integral after subtractions, ∫ Π i dk i R A (p, k), is then absolutely convergent. The subtraction terms in eq. (4.3) result from replacing the γ's by local counterterms, while in comparison, the subtraction terms in our forest formula (4.1) result from expanding the integrand near all the pinch surfaces of A. Due to the complicated IR structures (compared with UV), different constructions are needed for the proof of (4.1), as we have seen in the previous sections.
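For orientation, the two forest sums compared above can be written schematically as follows. This is a sketch in the standard form of Zimmermann's forest formula and its IR analogue as described in the text, not a transcription of eqs. (4.1) and (4.3); overall normalization, the placement of indices, and the convention that the empty forest supplies the unsubtracted term are assumptions.

```latex
\begin{align}
  % UV (BPHZ): F runs over forests of non-overlapping divergent subgraphs
  R_A(p,k) &= \sum_{F} \prod_{\gamma \in F}
      \left( - t^{\,d(\gamma)}_{p(\gamma)} \right) A(p,k) , \\
  % IR: F runs over forests of nested leading pinch surfaces of A
  \left[ \, \sum_{F \in \mathcal{F}[A]} \prod_{\sigma \in F}
      \left( - t_{\sigma} \right) A \, \right]_{\rm div} &= 0 .
\end{align}
```

In both lines the products are understood to be ordered, with the operator of the smaller subgraph or pinch surface standing to the right.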
Our method is to focus on any IR-divergent pinch surface $\rho^{\{\ldots\}}$ of an arbitrary term in eq. (4.1); to be specific, $\rho^{\{\sigma_n\ldots\sigma_1\}}$ of $t_{\sigma_n}\ldots t_{\sigma_1}A$. We aim to find a unique other term $t_{\sigma_m}\ldots t_{\sigma_1}A$ in (4.1), built from a different forest, that cancels the specific divergence near $\rho^{\{\sigma_n\ldots\sigma_1\}}$, eq. (4.4). We will use the defining properties of approximation operators and enclosed pinch surfaces to identify these pairs. Note that the assignment of the pairs is unique in both directions: each divergence of $t_{\sigma_n}\ldots t_{\sigma_1}A$ corresponds to one of $t_{\sigma_m}\ldots t_{\sigma_1}A$, and vice versa. Once this is done, we need two more steps in order to show this pairwise cancellation. They are: (a) to confirm that both terms of the pair are divergent at the same pinch surface; (b) to confirm that the approximation operator by which the two terms differ is exact there. These two steps verify the IR cancellation of each pair of terms in eq. (4.1).
To understand how the pairs are assigned, we need to discuss the relations between $\rho^{\{\sigma_n\ldots\sigma_1\}}$ and the elements of $\{\sigma_1, \ldots, \sigma_n\}$. This is how we organize this section. In section 4.1 we study the case where $\rho^{\{\sigma_n\ldots\sigma_1\}}$ is nested with all the $\sigma_i$'s, and in section 4.2 we deal with the case where $\rho^{\{\sigma_n\ldots\sigma_1\}}$ overlaps with certain elements of $\{\sigma_1, \ldots, \sigma_n\}$. In both these subsections, $\rho^{\{\sigma_n\ldots\sigma_1\}}$ is assumed to be a regular pinch surface (defined in section 2.3), and in each subcase we discuss, the two steps introduced in the previous paragraph will be applied. After that, we include exotic configurations in our analysis in section 4.3. Some comments on the proof are added in section 4.4.
We claim that the following two terms have the same divergence at $\rho$ up to a minus sign, so they cancel in the sum. In other words, eq. (4.4) is rewritten as eq. (4.5).

Figure 26: The pinch surfaces $\rho^{\{\sigma_n\ldots\sigma_1\}}$ of a back-to-back decay process as an example. As indicated in the text, the approximations from $t_{\sigma_p}$ ($m \le p \le n$) reside in the soft and jet subgraphs, and we mark them blue. In comparison, the approximations from $t_{\sigma_q}$ ($1 \le q < m$) reside in the jet and hard subgraphs, which we mark dark red.
(a) We confirm the divergence by studying the difference between the two pinch surfaces $\sigma_\rho\,(= \sigma_m)$ and $\rho^{\{\sigma_n\ldots\sigma_1\}}$. By definition, $\sigma_\rho$ and $\rho^{\{\sigma_n\ldots\sigma_1\}}$ have the same set of normal coordinates, so their differences can only lie in the ranges of intrinsic coordinates. It is then relatively easy to see that the approximation operators $t_{\sigma_n}, \ldots, t_{\sigma_m}$ do not lead to any differences between $\sigma_\rho$ and $\rho^{\{\sigma_n\ldots\sigma_1\}}$. Any given loop-momentum projection from these $t$'s acts either on $S^{(\sigma_\rho)}$ or on $J^{(\sigma_\rho)}$. On the one hand, at $\sigma_\rho$ there are no intrinsic coordinates in $S^{(\sigma_\rho)}$. On the other hand, as shown in figure 26, it follows from the ordering of the trees that these approximations in $J_I^{(\sigma_\rho)}$ can only be $\text{hc}_I$, and do not make any approximations on the intrinsic coordinates (the $\beta_I$-components of the jet momenta). So the ranges of intrinsic coordinates are unaffected by $t_{\sigma_n}, \ldots, t_{\sigma_m}$.
The approximations $t_{\sigma_1}, \ldots, t_{\sigma_{m-1}}$ apply projections on the loop momenta in $H^{(\sigma_\rho)}$ and $J^{(\sigma_\rho)}$. For $H^{(\sigma_\rho)}$, since the hard loop momenta are always integrated over the entire four-dimensional space, the projections do not make any difference between $\sigma_\rho$ and $\rho^{\{\sigma_n\ldots\sigma_1\}}$. For the subgraph $J_I^{(\sigma_\rho)}$, the approximations can only be $\text{sc}_I$, which projects the jet momenta onto their $\beta_I$-component. As we have shown in eq. (2.18), this leads to a pinch surface where the ranges of the $\beta_I$-components of the projected jet momenta (which are intrinsic coordinates) are unbounded. This is the only difference between $\sigma_\rho$ and $\rho^{\{\sigma_n\ldots\sigma_1\}}$. However, this difference does not affect eq. (4.5), because the same $t_{\sigma_1}, \ldots, t_{\sigma_{m-1}}$ are present in both terms.
From the two paragraphs above, it follows that both terms in eq. (4.5) are pinched at $\rho^{\{\sigma_n\ldots\sigma_1\}}$.
(b) From our previous analysis of the products of approximation operators in section 2.3, more precisely eqs. (2.19) and (2.20), it also follows directly that $t_{\sigma_m}$ is exact at $\rho^{\{\sigma_n\ldots\sigma_1\}}$. This implies that the two terms in (4.5), which differ by a minus sign, cancel at leading order. As we have analyzed in section 2.4, the divergences in (4.5) are at worst logarithmic, so the cancellation of the leading terms is sufficient to show IR finiteness.
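The logic of steps (a) and (b) can be condensed into a short sketch. It assumes that the two terms of eq. (4.5) differ only by the factor $-t_{\sigma_m}$, and it introduces a hypothetical scaling variable $\lambda$ for the normal coordinates, with $f(\lambda)$ standing for the integrand once the logarithmic measure is factored out. Neither symbol appears in the original text.

```latex
% Sketch only. Assumptions: the partner forest is F with \sigma_m removed;
% \lambda and f(\lambda) are illustrative, with f(\lambda) - f(0) = O(\lambda).
\begin{align*}
  & (-1)^{n}\, t_{\sigma_n} \cdots t_{\sigma_m} \cdots t_{\sigma_1} A
    + (-1)^{n-1}\, t_{\sigma_n} \cdots t_{\sigma_{m+1}} t_{\sigma_{m-1}} \cdots t_{\sigma_1} A \\
  &\quad = (-1)^{n-1}\, t_{\sigma_n} \cdots t_{\sigma_{m+1}}
           \bigl(1 - t_{\sigma_m}\bigr) t_{\sigma_{m-1}} \cdots t_{\sigma_1} A , \\[4pt]
  % exactness of t_{\sigma_m} at \rho means (1 - t_{\sigma_m}) removes the leading term,
  % which is enough when the divergence is at worst logarithmic:
  & \int_0^1 \frac{d\lambda}{\lambda}\, \bigl[ f(\lambda) - f(0) \bigr]
    = \int_0^1 \frac{d\lambda}{\lambda}\, \mathcal{O}(\lambda) < \infty .
\end{align*}
```

The second line is the reason why cancellation of the leading term alone suffices: a power-like divergence would require the subleading terms to cancel as well.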

Divergences at overlapping pinch surfaces
In this subsection we consider the case where $\rho^{\{\sigma_n\ldots\sigma_1\}}$ overlaps with certain $\sigma_i$'s appearing in $t_{\sigma_n}\ldots t_{\sigma_1}A$. We then consider the pinch surface $\tau \equiv \text{enc}[\sigma_m, \rho^{\{\sigma_n\ldots\sigma_1\}}]$, where $\sigma_m$ is the smallest of all the pinch surfaces in the forest $F \equiv \{\sigma_1, \ldots, \sigma_n\}$ that overlap with $\rho^{\{\sigma_n\ldots\sigma_1\}}$. By definition, $\tau$ is nested with the $\sigma_i$'s. From sections 3.1-3.3, we know that $\tau$ is a leading pinch surface of $A$, which may or may not have been included in $F$. Without loss of generality, we assume $\tau \notin F$. We will confirm that the IR divergences at $\rho^{\{\sigma_n\ldots\sigma_1\}}$ are cancelled between the two terms of eq. (4.6), which differ by a minus sign and the operator $t_\tau$. To prove eq. (4.6), we can first work on a simpler version with the minimum number of approximation operators, eq. (4.7). Here $\tau \equiv \text{enc}[\sigma, \rho^{\{\sigma\}}]$, and eq. (4.7) is what we aim to prove in this subsection. It is the special case of (4.6) in which $n = m = 1$. In fact the proof of (4.7) is sufficient to deduce (4.6), as we will explain at the end of this subsection.
As above, we need to show two things in order to prove eq. (4.7): (a) $t_\sigma t_\tau A$ is also divergent at $\rho^{\{\sigma\}}$, namely, $\rho^{\{\sigma\}} = \rho^{\{\sigma\tau\}}$; (b) $t_\tau$ is exact for $t_\sigma A$ there. Our method is to focus on an arbitrary configuration of $\rho^{\{\sigma\}}$, go through all the ways in which $t_\tau$ may act on line momenta and vector indices, and verify the two points above for each case. Throughout this subsection, $\rho^{\{\sigma\}}$ is assumed to be regular. Exotic configurations in $\rho^{\{\sigma\}}$ will be discussed in section 4.3.
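The statement to be proved can be summarized schematically as follows. This is a sketch assembled from the description in the text (two terms that differ by a minus sign and the operator $t_\tau$, with $t_\tau$ placed to the right of $t_{\sigma_m}$ as in the ordering convention of section 2.3). Overall signs are dropped, and the subscript "div $n[\rho]$", denoting the divergent part in the neighborhood of $\rho$, is an assumed shorthand.

```latex
% Sketch only; overall signs (-1)^n are suppressed and the
% "div n[\rho]" subscript is an assumed shorthand.
\begin{align*}
  % minimal case n = m = 1, cf. eq. (4.7)
  \Bigl[\, t_{\sigma} A - t_{\sigma} t_{\tau} A \Bigr]_{\text{div } n[\rho]}
    &= \Bigl[\, t_{\sigma} \bigl(1 - t_{\tau}\bigr) A \Bigr]_{\text{div } n[\rho]} = 0 , \\
  % general case, cf. eqs. (4.6) and (4.14)
  \Bigl[\, t_{\sigma_n} \cdots t_{\sigma_m} \bigl(1 - t_{\tau}\bigr)
           t_{\sigma_{m-1}} \cdots t_{\sigma_1} A \Bigr]_{\text{div } n[\rho]} &= 0 .
\end{align*}
```

Both lines hold once $t_\tau$ is shown to be exact near $\rho$, which is what the case-by-case analysis of this subsection establishes.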
Suppose $t_\tau$ acts as a hard-collinear approximation, say $\text{hc}_I$, on some momentum $k^\mu$ of a propagator. Then by definition this propagator is in the set $J$. Then by construction it is an internal propagator of $H^{(\sigma)}$. For this case the argument reduces to that of the nested case in section 4.1, and we can make use of (4.5). Therefore, we have verified (4.7) when $t_\tau$ acts as a hard-collinear approximation.
Then suppose $t_\tau$ acts as a soft-collinear approximation on the momentum $k^\mu$ of a propagator $ab$. Again from eq. (3.6), $ab$ lies in at least one of the following three subgraphs: $S^{(\sigma)}$, $S^{(\rho)}$, or $J_I^{(\sigma)} \cap J_K^{(\rho)}$ ($I \neq K$). For the cases of $S^{(\sigma)}$ and $S^{(\rho)}$, the arguments in the previous paragraph can be applied straightforwardly, after replacing the hard subgraphs there by jet subgraphs here, and the jet subgraphs there by soft subgraphs here. We find that the action of $t_\tau$ is exact and consistent with $\rho^{\{\sigma\}} = \rho^{\{\sigma\tau\}}$.
The case of $ab \in J_I^{(\sigma)} \cap J_K^{(\rho)} \subset S^{(\tau)}$ is more complicated: although line $ab$ is acted on by a soft-collinear approximation in $t_\tau$, it is also acted on by a hard-collinear approximation from $t_\sigma$ if it is attached to $H^{(\sigma)}$. We can classify all the relations between $ab$ and the hard subgraphs, $H^{(\sigma)}$ and $H^{(\rho)}$, into four types: (I) $ab$ is attached to neither $H^{(\sigma)}$ nor $H^{(\rho)}$; (II) $ab$ is attached to both $H^{(\sigma)}$ and $H^{(\rho)}$; (III) $ab$ is attached to $H^{(\sigma)}$ but not $H^{(\rho)}$; (IV) $ab$ is attached to $H^{(\rho)}$ but not $H^{(\sigma)}$. These subcases are considered one by one to verify eq. (4.7).
in $t_\sigma t_\tau A$. However, at the pinch surface $\rho^{\{\sigma\}}$ that we study, their contributions are both $(v \cdot \beta_K)(\beta_K \cdot \beta_I)\,\beta_I^\mu$. The reason is that the component of $v^\mu$ that appears in the leading term of $t_\sigma A|_{n[\rho]}$ is its $\beta_K$-component, i.e. $(v \cdot \beta_K)\,\beta_K^\mu$, because $a$ is a jet vertex with jet momenta parallel to $\beta_K^\mu$, while the other endpoint $b$ is in $H^{(\rho)}$. Therefore, (4.7) holds when $ab$ is attached to both $H^{(\sigma)}$ and $H^{(\rho)}$.
I . The shaded areas represent the hard subgraphs in each region (and are not all the same). Denote the momentum of ab as k µ . All the propagator momenta in A that contain k are of the forms p I + k and p K + k, where p I (p K ) is the momentum of a propagator in J I (J K ). In the approximated amplitude ρ {σ} , p K + k becomes p K + I k, while in ρ {στ } , p I + k becomes p I + I k and p K + k becomes p K + KI k.

(III) &(IV)
The two cases where ab ∈ J    Similarly, the two currents at b 4 in ρ {σ} and ρ {στ } also agree, because a 4 b 4 is collinear to β µ I there. Then the soft-collinear approximation in the β I −direction, which is equal to the hard-collinear approximation in the β I −direction, is a good approximation.
It is relatively more complicated for the vector indices of the currents v µ at vertices b 3 . They belong to the scalar-polarized gauge boson in J (H σ ) I , and are marked by k µ 3 . By definition, any such vertex is projected differently by t σ and t σ t τ : it appears as (v · β I ) β µ I in t σ A and v · β K (β I · β K ) β µ I in t σ t τ A. In order to see that their contributions to the leading terms are the same near the pinch surface ρ {σ} (= ρ {στ } ), we recall the second conclusion we have drawn from the power counting result eq. (2.47), that the numerators combined with v µ should offer an O(1)-contribution to the leading term, otherwise it will be suppressed. Then it is clear from the expressions of the three-point vertices in figure 29 that if the propagators marked by momentum k µ 3 are attached to scalars or fermions at b 3 , the only component of v µ that leads to O(1) is v · β I . Therefore, in the leading term we actually have (v · β I ) β µ I → v · β K (β I · β K ) β µ I , and the coincidence is automatic. It remains to analyze the case where the scalar-polarized gluons marked by k µ 3 are attached to other gluons through three-or four-gluon vertices at b 3 . We verify that the numerators of t σ A | n[ρ] and t σ t τ A | n[ρ] contribute identically to the leading term as follows.
Consider first that the junction is a three-gluon vertex V αβγ p, I k 3 , with γ being the vector index associated with k µ 3 in A, as is shown in figure 30. Then in t σ A it  Figure 30: A three-gluon vertex V αβγ p, I k 3 · β Iγ β γ I that appears in t σ A, where only the β I -component of the momentum k µ 3 joins the gluon with momentum p µ . Also, the vector index γ is also projected onto its β I -component according to the same hard-collinear approximation.
reads: Similarly, using β K · KI k 3 = 0, with KI k defined by eq. (2.30), we have in t σ t τ A: According to one of the conclusions in section 2.4 (more precisely, the second proposition following eq. (2.47)), in order to prove that eqs. (4.10) and (4.11) contribute identically to the leading terms of t σ A and t σ t τ A, it is equivalent to verify that (4.10) and (4.11) agree at O(1). Obviously, their first terms in the square bracket agree, because p µ is in the β K -direction in ρ, so their contributions are the same. As for their second and third terms, notice that the only vectors appearing in the square bracket of (4.10) are p µ and β µ I . Since p µ = p · β K β µ K at ρ {σ} , every β µ I in the whole expression of the leading term must form an invariant with β µ K , i.e. (β I · β K ). Equivalently, every β µ I is projected onto its β K -component. Therefore, the leading contributions from (4.10) and (4.11) near ρ {σ} (= ρ {τ σ} ) are automatically identical.
This argument also works for four-gluon vertices, because then we have the following factors in t σ A: Similarly to the analysis of the threepoint vertices, a β µ I must contract with a β µ K from other three-gluon vertices to form invariants of the form (β I · β K ) in the leading term, which exactly appears in t σ t τ A.
In conclusion, we have proved that eq. (4.7) also holds when $ab$ is attached to $H^{(\sigma)}$ but not $H^{(\rho)}$.

Now that we have finished the proof of eq. (4.7), we next show that it is equivalent to (4.6), which can be rewritten as eq. (4.14). We focus on any line momentum of $t_{\sigma_{m-1}}\ldots t_{\sigma_1}A$, say $k^\mu$, and examine how $t_\tau$ may project it. For a vector index that is contracted in $t_{\sigma_{m-1}}\ldots t_{\sigma_1}A$, the analysis below follows in the same way. First, eq. (4.14) would be trivial if $t_\tau$ acts as an identity operator on $k^\mu$, so we assume that $t_\tau$ is either a hard-collinear or a soft-collinear approximation. Then we recall our observation that for both $t_{\sigma_n}\ldots t_{\sigma_1}A$ and $t_{\sigma_n}\ldots t_{\sigma_m} t_\tau t_{\sigma_{m-1}}\ldots t_{\sigma_1}A$, $k^\mu$ is projected at most twice. Thus if $t_{\sigma_{m-1}}\ldots t_{\sigma_1}$ acts as an identity operator on $k^\mu$, eq. (4.14) reduces to the form of eq. (4.7), where $t_\sigma$ is the net projection on $k^\mu$ from $t_{\sigma_n}\ldots t_{\sigma_m}$. The result follows immediately from (4.7).
So we only need to consider the case where both $t_\tau$ and $t_{\sigma_{m-1}}\ldots t_{\sigma_1}$ are nontrivial (and not identical) on $k^\mu$. The only such case is when $t_\tau$ is hard-collinear and $t_{\sigma_{m-1}}\ldots t_{\sigma_1}$ is soft-collinear. We now consider the action of $t_{\sigma_n}\ldots t_{\sigma_m}$, for which there are two possibilities. If $t_{\sigma_n}\ldots t_{\sigma_m}$ is the same as $t_\tau$ on $k^\mu$, being a hard-collinear approximation, then from $t_\tau^2 = t_\tau$ we see that eq. (4.14) is also trivial. Otherwise $t_{\sigma_n}\ldots t_{\sigma_m} = 1$ on $k^\mu$, which means that the propagator with momentum $k^\mu$ can only be an internal hard propagator in the pinch surfaces $\sigma_m, \ldots, \sigma_n$. Since $\tau \equiv \text{enc}[\sigma_m, \rho^{\{\sigma_n\ldots\sigma_1\}}]$, in order that $t_\tau$ provides a hard-collinear approximation on $k^\mu$, the propagator must be lightlike and attached to $H^{(\rho^{\{\sigma_n\ldots\sigma_1\}})}$ in $\rho^{\{\sigma_n\ldots\sigma_1\}}$. In this case, $t_\tau$ is a good approximation at $\rho^{\{\sigma_n\ldots\sigma_1\}}$, and the cancellation of IR divergences in (4.14) is immediate.
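The idempotency $t_\tau^2 = t_\tau$ used above can be checked on a single momentum in a minimal sketch. The normalization $\beta_I \cdot \bar\beta_I = 1$ and the conjugate lightlike vector $\bar\beta_I$ are assumptions made for this illustration; the operators defined in section 2 may be written with a different convention.

```latex
% Assumptions (not taken from the text): \beta_I^2 = \bar\beta_I^2 = 0 and
% \beta_I \cdot \bar\beta_I = 1; P stands for the action of a hard-collinear
% projection on one momentum k.
\begin{align*}
  P(k)^{\mu} &\equiv (k \cdot \bar\beta_I)\, \beta_I^{\mu} , \\
  P\bigl(P(k)\bigr)^{\mu}
     &= \bigl( P(k) \cdot \bar\beta_I \bigr)\, \beta_I^{\mu}
      = (k \cdot \bar\beta_I)\,(\beta_I \cdot \bar\beta_I)\, \beta_I^{\mu}
      = P(k)^{\mu} .
\end{align*}
```

Applying the same projection twice therefore changes nothing, which is why the case where $t_{\sigma_n}\ldots t_{\sigma_m}$ coincides with $t_\tau$ on $k^\mu$ is trivial.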
In conclusion, eqs. (4.6) and (4.7) are equivalent, which indicates that for any approximated amplitude with an overlapping divergence, as long as it corresponds to a regular pinch surface, we can always find a counterterm to cancel it. This cancellation is pairwise, and in each pair one term has a $t_\tau$ while the other does not. To extend our analysis to all types of overlapping divergences, we examine the exotic configurations in the next subsection.

Divergences at exotic pinch surfaces
The analysis in the last subsection is based on the assumption that $\rho^{\{\sigma\}}$ is a regular pinch surface. For example, when we discussed the case where $t_\tau$ acts as a soft-collinear approximation on the momentum of a propagator, and that propagator is from $S^{(\rho)}$, we deduced that it must be attached to a jet subgraph in $\rho^{\{\sigma\}}$, where $t_\tau$ is a good approximation. However, in the presence of exotic configurations, it is possible that a soft propagator is attached to the hard subgraph in region $\rho^{\{\sigma\}}$. So if we take such configurations into account, the analysis in section 4.2 does not immediately apply.
Fortunately, in section 3.3 we have enumerated all the possible exotic configurations in $\rho^{\{\sigma\}}$, as well as the corresponding pinch surfaces $\sigma$ that provide approximations, for which $t_\sigma A$ has these configurations. From $\sigma$ and $\rho^{\{\sigma\}}$ we can derive $\tau$, as shown in figures 23 and 24. For figures 23(a), 23(b) and 24(b), the exotic configurations in $\rho^{\{\sigma\}}$ correspond to an internal soft vertex in $\tau$, which only contributes identity operators to $t_\tau$. So for these cases, eqs. (4.6) and (4.7) are automatic.
As for the case of figure 24(a), $t_\tau$ contains a soft-collinear approximation $\text{sc}_I$ on the soft propagator attached to the hard subgraph at $\rho^{\{\sigma\}}$, and we claim that it is a good approximation. The reason is simple: the vertex in $\rho^{\{\sigma\}}$ to which the soft lines are attached is a jet vertex, because all the lightlike momenta entering it are parallel to $\beta_I^\mu$. As a result, all the invariants formed by the jet momenta and the soft momenta in the leading term at $\rho^{\{\sigma\}}$ can only involve the $\beta_I$-component of the soft momenta.
After all these discussions, we can assert that $t_\tau$ is always exact in eqs. (4.6) and (4.7), with or without exotic configurations. Sections 4.1-4.3 together constitute our proof of the forest formula, eq. (1.1).

Discussion
Our discussion below treats four topics relevant to the arguments and results of this section. In Item 1 we explain why the proof is not graph-by-graph, and in Items 2-4 we relate our forest formula, eq. (1.1), to other subtraction methods formulated in forest-like expressions.
1. The fact that eq. (1.1) is not graph-by-graph is due to the possibility of unphysical pinch surfaces, namely, the solutions of the Landau equations where all the lightlike propagators of one or more jets that are attached to the hard subgraph are scalar-polarized gauge bosons. These pinch surfaces are not leading pinch surfaces by definition, but the IR behavior in their neighborhoods can be power divergent (see eq. (2.57), when one or more $h_{K_i} = 0$). Nevertheless, these divergences are cancelled by the Ward identity in the sum over all the attachments between the scalar-polarized gauge bosons and the hard subgraph. But given a single Feynman graph $A$, the remainder after all our subtractions is still divergent near these "super-leading" pinch surfaces [14,20], because the approximation operators appearing in the forest formula match only physical pinch surfaces. Therefore, the IR finiteness in (1.1) is not graph-by-graph.
2. Our forest formula sums over all the forests of $A$, which are defined in eq. (4.2). Another way to formulate the forest formula, as is done by Collins and Soper in [13], is to sum over only the "inequivalent forests". Two forests of $A$, say $\{\sigma_{a(1)}, \ldots, \sigma_{a(m)}\}$ and $\{\sigma_{b(1)}, \ldots, \sigma_{b(n)}\}$, are called equivalent if the two approximations $t_{\sigma_{a(m)}}\ldots t_{\sigma_{a(1)}}$ and $t_{\sigma_{b(n)}}\ldots t_{\sigma_{b(1)}}$ are the same. Given a class of equivalent forests, only one subtraction term is needed, and the overall sign is $(-1)^T$, where $T$ is the maximum number of trees in any forest of this class. For example, in the upcoming example in section 6.1, we will see that $t_{\sigma_6} = t_{\sigma_7} t_{\sigma_6} = t_{\sigma_8} t_{\sigma_6}$, so $\{\sigma_6\}$, $\{\sigma_6, \sigma_7\}$ and $\{\sigma_6, \sigma_8\}$ are three equivalent forests. As a result, we only need to include a single term, $(t_{\sigma_6} A)$, rather than the whole combination $(-t_{\sigma_6} A + t_{\sigma_7} t_{\sigma_6} A + t_{\sigma_8} t_{\sigma_6} A)$, in our forest formula (1.1). We believe that this equivalence between using all forests and using only inequivalent forests can be generalized to arbitrary orders, but a rigorous proof is left for future research.
3. The whole of our analysis in sections 2-4 can also be interpreted in position space. Previous work has already been carried out by Erdogan and Sterman in [17], where they focused on UV divergences of massless gauge theories in position space. In light of scale invariance, such UV structures of an original amplitude $A$ are very similar to the IR structures in our momentum-space study. The work of Erdogan and Sterman offers the precedent for this project, and we have provided, in the previous sections, a detailed illustration of what these singularities are like in momentum space, and how they are cancelled in the forest formula. We will also provide a sketch of the position-space version, especially for the pinch surfaces of $t_\sigma A$, in appendix C.
with subtraction terms from hard-collinear and soft-collinear approximations as well.
He proved the forest formula using an inductive strategy.
A recent work by Anastasiou and Sterman [47] studies the IR behavior of fixed-angle scattering from an iterative perspective, illustrating the idea at two loops. In contrast to the latter treatment, the forest formula method we take here offers a viewpoint on the IR singularities of an all-order amplitude, with or without approximation. This treatment, though much more laborious, enables us to generalize to arbitrary orders and numbers of external momenta, and to observe a number of general principles of IR cancellation.
The forest-like subtraction also appears in another recent work, which is from a slicing approach by Herzog [35]. In the paper he promotes the subtraction method by employing suitable phase space mappings. This method is based on the geometry of IR regions, and is carried out explicitly at NLO and NNLO. Although his construction of subtraction terms is different from ours, the formula that summarizes the combinatorics of various counterterms is still forest-like.
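The counting of equivalent forests in Item 2 above can be made explicit in a short worked line. It uses only the relations $t_{\sigma_6} = t_{\sigma_7} t_{\sigma_6} = t_{\sigma_8} t_{\sigma_6}$ quoted there; reading "the maximum number of trees" as the size of the largest forest in the class, so that $T = 2$, is an assumption of this sketch.

```latex
% Worked check for the class { {\sigma_6}, {\sigma_6,\sigma_7}, {\sigma_6,\sigma_8} }.
% Assumption: T counts the elements of the largest forest in the class, here T = 2.
\begin{align*}
  -t_{\sigma_6} A + t_{\sigma_7} t_{\sigma_6} A + t_{\sigma_8} t_{\sigma_6} A
    &= (-1 + 1 + 1)\, t_{\sigma_6} A \\
    &= (-1)^{T}\, t_{\sigma_6} A , \qquad T = 2 .
\end{align*}
```

The three equivalent forests thus collapse to the single term $+\,t_{\sigma_6} A$, in agreement with the sign rule $(-1)^T$.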

Factorization of the subtraction terms
As has been mentioned, our subtraction method implies a factorization structure. In more detail, for a QCD hard process with external momenta $p_1^\mu, \ldots, p_N^\mu$, the forest formula holds for every Feynman graph $A^{(n)}(p_1, \ldots, p_N)$, where $n$ represents the order $\mathcal{O}(\alpha^n)$. After we sum over all the $A^{(n)}$'s as well as the orders, we obtain the full amplitude $\mathcal{M}(p_1, \ldots, p_N)$, eq. (5.1). After we replace each $A^{(n)}$ by its subtraction terms, the sum over graphs leads to a factorized expression for $\mathcal{M}$. Deriving this expression is our aim in this section.
In the derivation, we will use the symbols $\gamma_H$, $\gamma_J$ and $\gamma_S$ to denote the hard, jet and soft subgraphs, in order to emphasize that their loop momenta are integrated over the full four-dimensional space, rather than certain restricted ranges. In section 5.1 we show eq. (5.5), a key result that is subsidiary to the factorization in the presence of repetitive approximations. In section 5.2, we use this result to derive the factorized expression for $\mathcal{M}$. A sketch of the argument is as follows.
Step 1. We use the forest formula to rewrite the $A^{(n)}$ in eq. (5.1) as the sum over forests. For each forest, we identify a specific pinch surface $\sigma_0^*$: it has the largest "reduced hard subgraph" (explained below), and is the smallest among all the pinch surfaces in this forest that have the same reduced hard subgraph. We denote the hard (jet, soft) subgraph of $\sigma_0^*$ as $\gamma_{H_0}$ ($\gamma_{J_0}$, $\gamma_{S_0}$).
Step 2. In the sum over $A^{(n)}$, the soft subgraph $\gamma_{S_0}$ is factorized from the hard-and-jet subgraph $\gamma_{H_0 J_0}$. After we take the approximations inside $\gamma_{S_0}$ into account, the soft part contributes a factor, which we denote $\gamma_{S_0,\text{eik}}/J_{0,\text{eik}}$.
Step 3. From our analysis in section 5.1, $\gamma_{H_0 J_0}$ can be further factorized into the reduced subgraphs $\gamma_{H_0}$ and $\gamma_{J_0}$. The approximations inside $\gamma_{J_0}$, which are soft-collinear, help us rewrite $\gamma_{J_0}$ as the factor $J_{0,\text{part}}$.
Step 4. We combine the remaining part, which involves $\gamma_{H_0}$ and its subtraction terms, with the factors obtained in Steps 2 and 3. The final result is eq. (5.27).

Factorization in the presence of approximations
To illustrate that QCD factorization can be achieved in the presence of repetitive approximations, we need to set up the following concept, which relates the pinch surfaces of different Feynman graphs. Given any two $\mathcal{O}(\alpha^n)$ Feynman graphs from the whole set of $A^{(n)}$'s, we say that a pinch surface of one and a pinch surface of the other are normal-space-equivalent ($N$-equivalent) if and only if there exists a one-to-one correspondence between their loop momenta, such that both momenta in each pair share the same normal space.
In the text below, we will group together all the $\mathcal{O}(\alpha^n)$ Feynman graphs sharing a given pinch surface, say $\sigma^*$, and write $\sigma^*[A^{(n)}]$ to represent each element of the $N$-equivalent class. Note that the symbol $\sigma^*$ contains the information about the normal spaces of the loop momenta, i.e. the orders of the soft, hard, and jet subgraphs.
Following this convention, we will use t σ * [A (n) ] to denote the approximation operator that is associated with σ * . Note that σ * A (n) may not be a leading pinch surface of every A (n) , and if so, t σ * [A (n) ] serves to annihilate A (n) , i.e. we define Otherwise, the rules of such operators are exactly identical to those we have given in section 2.1, in which case we say that A (n) is compatible with (the loop assignments in) σ * . With the help of the Ward identity applied to soft-collinear and hard-collinear attachments [38], we obtain the following factorization relation for each N -equivalent class σ * A (n) : S * , with its external lines attached to eikonal lines in the directions of β µ 1 , ..., β µ N . The subgraph γ J A * , deleting the soft lines attached to it, and attaching its lines that were previously attached to the hard subgraph to an eikonal line in the β µ A direction [38]. Similarly, γ (n−a * −b * ) H * is a subgraph dependent on p µ 1 , ..., p µ N , obtained from the hard subgraph γ H * by deleting the unphysically polarized jet gauge bosons attached to it. The subgraph orders, a * , b A * 's and (n − a * − b * ), are determined by specifying σ * . We note that eq. (5.3) not only holds for the full Feynman graphs, but also for subgraphs. For example, in the upcoming analysis of section 5.2, we will apply this relation to the graphs with their Figure 31: The graphic procedure to prove eq. (5.5). The figures (b) and (c) are the intermediate steps from (a) to (d), to obtain which we have repeatedly applied the Ward identity for factorization. The subgraph enclosed by the dotted curve is γ H * in (a), and becomes the reduced hard subgraph γ H * in (d) after factorization. At σ < , some of its propagators become soft (represented by dashed lines), some become lightlike (represented by the shadowed region), and some remains hard, which is γ H< . 
the sum over graphs of $A^{(n)}$ with $N$-equivalent pinch surfaces by the Ward identity), as in eq. (5.6). We now rewrite $R\,A^{(n)}$ as a sum over subtraction terms. Consider the trivial "pinch surface" of $A^{(n)}$ where the soft and jet subgraphs are trivial; in other words, all the loop momenta are off-shell. We denote this region by $\eta$. By definition, for any leading pinch surface of $A^{(n)}$, say $\sigma$, we have $\sigma \subset \eta$. Defining the approximation operator $t_\eta$ as an identity operator on $A^{(n)}$, we can rewrite $R\,A^{(n)}$ as eq. (5.7). We define the set of the extended leading pinch surfaces of $A^{(n)}$ to include all the leading pinch surfaces of $A^{(n)}$ as well as $\eta$, and the extended forests of $A^{(n)}$ as the sets of nested extended leading pinch surfaces. We denote a given extended forest by $F$, which may or may not contain $\eta$. Using this notation, we substitute eq. (5.7) into (5.6) and obtain eq. (5.8).

Given an arbitrary nonempty extended forest $F$, we first focus on the subset of its extended pinch surfaces that have the largest reduced hard subgraph. A reduced hard subgraph is obtained from the original hard subgraph by removing all the unphysical jet lines (scalar-polarized gauge bosons) from it. Then we select the smallest extended pinch surface from this set, namely the one having the largest soft subgraph (smallest jet subgraph). We denote this pinch surface by $\sigma_0^*$, and its corresponding hard (jet, soft) subgraph by $\gamma_{H_0}$ ($\gamma_{J_0}$, $\gamma_{S_0}$). For the special case where $\eta \in F$, we have $\sigma_0^* = \eta$, hence $\gamma_{H_0} = A^{(n)}$ and $\gamma_{J_0} = \gamma_{S_0} = \emptyset$. Otherwise $\sigma_0^*$ is a leading pinch surface of $A^{(n)}$. With this construction we can reorganize the sum over forests as eq. (5.9). In this expression, the $\sigma_>$ are pinch surfaces with the same hard subgraph $\gamma_{H_0}$, but with smaller soft subgraphs. In comparison, the pinch surfaces denoted as $\sigma_<$ have smaller hard subgraphs than $\gamma_{H_0}$, and are contained in $\sigma_0^*$. Note that the overall minus sign of the first term in eq. (5.6) has cancelled the minus sign in $(-t_{\sigma_0^*})$.
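The bookkeeping with the trivial region $\eta$ can be followed in a short derivation. This is a sketch under stated assumptions: $R[A^{(n)}]$ is taken to be the remainder defined by the sum over all forests (including the empty one), $\mathcal{F}'[A^{(n)}]$ is a hypothetical label for the set of extended forests, and only $t_\eta = 1$ is used. It is meant to be consistent with the verbal description of eqs. (5.6)-(5.8), not to reproduce them.

```latex
% Sketch only. Assumed notation: R[A] = finite remainder, \mathcal{F}[A] = forests,
% \mathcal{F}'[A] = extended forests (forests that may also contain \eta), t_\eta = 1.
\begin{align*}
  R[A^{(n)}] &\equiv \sum_{F \in \mathcal{F}[A^{(n)}]} \prod_{\sigma \in F} (-t_{\sigma})\, A^{(n)}
      = A^{(n)} + \sum_{\substack{F \in \mathcal{F}[A^{(n)}] \\ F \neq \emptyset}}
        \prod_{\sigma \in F} (-t_{\sigma})\, A^{(n)} , \\
  % insert t_\eta = 1, so that each forest F becomes the extended forest F \cup \{\eta\}
  R[A^{(n)}] &= - \sum_{F \in \mathcal{F}[A^{(n)}]} (-t_{\eta}) \prod_{\sigma \in F} (-t_{\sigma})\, A^{(n)} , \\
  % solve the first line for A and substitute the second
  A^{(n)} &= - \sum_{\substack{F' \in \mathcal{F}'[A^{(n)}] \\ F' \neq \emptyset}}
              \prod_{\sigma \in F'} (-t_{\sigma})\, A^{(n)} .
\end{align*}
```

The last line is an exact identity: the extended forests containing $\eta$ reproduce $-R[A^{(n)}]$, and the nonempty forests without $\eta$ reproduce $R[A^{(n)}] - A^{(n)}$. Selecting $\sigma_0^*$ in each extended forest then amounts to a regrouping of this sum.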
This reorganized sum enables us to arrive at a preliminary factorized form of $\mathcal{M}^{(n)}$ after we sum over all the $A^{(n)}$'s, eq. (5.10), where the $t_{\sigma_>}$'s only act on $\gamma^{(i)}_{S_0,\text{eik}}$ because the $\sigma_>$'s have the same reduced hard subgraph as that of $\sigma_0^*$, and soft subgraphs contained in $\gamma_{S_0}$. The hard-and-jet function, eq. (5.11), is a function of $p_1^\mu, \ldots, p_N^\mu$.
We describe the idea behind eqs. (5.10) and (5.11) before interpreting the details of the notation. We write the full $\mathcal{M}^{(n)}$ as the sum over all $A^{(n)}$'s. To perform this sum, we first group the terms with identical $\gamma_{H_0}$, $\gamma_{J_0}$ and the approximations inside these subgraphs, but with different $\gamma_{S_0}$'s. In the sum over all the elements in each group, the factor involving the loop momenta of $\gamma_{S_0}$ can be recast as a multi-eikonal graph $\gamma_{S_0,\text{eik}}$, due to the soft-collinear approximation from $t_{\sigma_0^*}$ and the Ward identity. In the resulting $\gamma_{S_0,\text{eik}}$, the external lines of the $\gamma_{S_0}$'s are attached to eikonal lines in the directions of the $\beta_A^\mu$'s. In the spirit of the factorization theorem [14,38], this graph is decoupled from the rest of $A^{(n)}$. Then we sum over the results from different groups of $\gamma_{H_0}$ and $\gamma_{J_0}$, in which all the possible hard-and-jet subgraphs are automatically included. Suppose the soft part is $\mathcal{O}(\alpha^i)$; then the sum over $A^{(n)}$ can be rewritten as the three-fold sum over $i$ (from 0 to $n$), the $\mathcal{O}(\alpha^i)$ soft subgraphs, and the $\mathcal{O}(\alpha^{n-i})$ hard-and-jet subgraphs. In this way, the sum over $\sigma_0^*[A^{(n)}]$ only remains in the hard-and-jet part. By definition, the approximations from the forests $F_>$ only act inside the soft part, while those from $F_<$ and the hard-collinear branch of $t_{\sigma_0^*}$ only act inside the hard-and-jet part. After this we arrive at the RHS of (5.10) and (5.11).
With the idea explained, the notation in eqs. (5.10) and (5.11) is natural. The symbol $\gamma_{S_0,\text{eik}}$ denotes the graph $\gamma_{S_0}$ with its external gauge boson lines attached to eikonal lines in the directions of $\beta_1, \ldots, \beta_N$, as we have explained in the paragraph above. In comparison, $\gamma_{H_0 J_0}$ denotes the remaining part of the Feynman graph, which is the union of $\gamma_{H_0}$ and $\gamma_{J_0}$, whose corresponding set of forests is denoted by $\mathcal{F}[\gamma_{H_0 J_0}]$. The forests $F_{HJ}$ are sets of nested pinch surfaces $\sigma_{HJ}$, which determine approximation operators acting on $\gamma_{H_0 J_0}$. Besides these, there are also hard-collinear approximations from $t_{\sigma_0^*}$ acting on $\gamma_{H_0 J_0}$. For the special case where $\sigma_0^* = \eta$, we have $i = 0$, so the factor involving $\gamma_{S_0,\text{eik}}$ is simply 1, and $\gamma^{(n-i)}_{H_0 J_0}$ is the full graph $A^{(n)}$. The full amplitude $\mathcal{M} = \sum_{n=0}^{\infty} \mathcal{M}^{(n)}$ then takes the form of eq. (5.12). In the paragraphs below, we shall separately study the two factors in the brackets.
The soft factor First we study the factor involving γ (i) S 0 ,eik , which we call the "soft factor". As is shown in eq. (5.12), all the approximations acting on γ  In more detail, according to the Ward identity, we have S 0 ,eik (β 1 , ..., β N ) , (5.13) where the Kronecker delta factor controls the orders on both sides to be O α i . The subgraphs γ S 0 and j 0A are attached to eikonal lines so we have added "eik" to their subscripts. For clarity, we have defined the union of j 0A,eik (A = 1, ..., N ) as J 0,eik , i.e. (5.14) From our construction, it is apparent that J A,eik . Each operator t σ> in (5.12) leads to a factor J (j) 0,eik with j > 0. (Note that j = 0 because different t σ> 's provide different soft-collinear approximations.) We also define for convenience. Combining eqs. (5.13)-(5.15) together, we rewrite the soft factor of (5.12) as where n sc is the number of different soft-collinear approximations that act inside γ (i) S 0 ,eik . The factor in the bracket can be further simplified, i.e.
Therefore, the factor involving γ Then in eq. (5.21), this identity implies that Γ H 0 J 0 = Γ H 0 J 1 . We now rephrase the calculations from eq. (5.20) to (5.23) in an iterative way. That is, in the subgraph γ H 0 J i (i = 0, 1, ...) we focus on a subset of all its pinch surfaces that have Figure 33: The pictorial representation of eq. (5.20). The LHS describes the subgraphs γ J 0 , γ S 1 (enclosed by dashed curves) and J 1 (shaded area). Note that γ S 1 may have a nonzero intersection with γ H 0 as well, which is not drawn in the figure. The RHS describes the factorized expression. Due to the soft-collinear approximations sc (σ * 1 ) , J 1 can be rewritten as J 1,eik , which is decoupled from the rest of γ J 0 . the largest hard subgraph, select a specific one with the largest soft subgraph, and denote this pinch surface as σ * i+1 and the corresponding hard (jet, soft) subgraph as γ H i+1 (γ J i+1 , γ S i+1 ). According to the soft-collinear approximations from t σ * i+1 , the subgraph J i+1 ≡ γ S i+1 γ J 0 is factorized from the remaining part γ H 0 J i+1 ≡ γ H 0 γ J i+1 to form J i+1,eik . By construction, J i+1,eik = J i,eik = ... = J 0,eik , and due to the soft-collinear approximations inside J i+1,eik , this subgraph contributes 1 (as a multiplicative factor) after we sum over all the possible graphs and subtraction terms. In other words, we have In eqs. (5.20)-(5.23) we have i = 0, and the procedure described above works for i = 1, 2, ..., as long as there are soft-collinear approximations inside γ J i+1 . Suppose we have carried out this calculation iteratively, until at a special value of i where no soft-collinear approximations exist inside γ J i+1 .
Now we can carry out the same procedure to obtain eq. (5.19) by applying (5.5), and rewrite Γ H 0 J i+1 into a factorized form: In the first line, γ J i+1 is obtained by attaching the scalar-polarized gauge bosons of the to eikonal lines, which are in the directions of β µ A (A = 1, ..., N ). We come to the second equality because the operators t σ HJ here do not act as softcollinear approximations on γ The subscript "part" is an abbreviation of "partonic", because each function J part is indeed a sum over partonic correlation functions. Finally, we insert eqs. (5.18), (5.24) and (5.25) into (5.12), drop the subscript 0 and obtain the gauge-invariant factorized form [14][15][16]: In this form, both the β A -direction collinear divergences in the first term and the β Adivergences in the second term are cancelled by the two factors J 1/2 eik in the denominator. This cancellation follows from the exponentiation of IR divergences for the cusp and jet functions [48][49][50][51][52]. Therefore, the first factor has only soft divergences, the second factor has only collinear divergences, and the third factor is IR finite. The renormalization of each of these functions has also been studied widely [17,53,54].
We comment on another form of eq. (5.28). In eqs. (10.118) and (10.119) of his book [14], Collins derived a special factorization formula for the Sudakov form factor, in which the jet functions are normalized to absorb the soft contributions. We now rewrite (5.28) by applying the normalization that Collins suggests, and extend his result to an arbitrary number of external lines. That is, where n A is a spacelike vector for each A (= 1, ..., N ), and J are the jet functions with Collins' normalization factor, which reads For the Sudakov form factor, where N = 2, the soft eikonal function γ S,eik is equal to the square root of each of the J eik 's in the denominator. So the first factor in eq. (5.29) is unity, and eqs. (5.29) and (5.30) reduce to the result for the form factor automatically.

Next-to-next-to-leading-order examples
With the rationale explained in sections 2-5, we now illustrate how it works in next-to-next-to-leading-order (NNLO) calculations. Namely, we consider the γ * , W ± , Z → q q̄ processes in QCD at two loops. There are eight Feynman graphs in total, as shown in figure 34. This section is arranged as follows. In section 6.1 we start by studying figure 34(a), showing its associated pinch surfaces and forests. After evaluating four selected subtraction terms, we analyze all the IR regions appearing in the forest formula, and exhibit how they cancel pairwise. In section 6.2, we evaluate a term that contributes to the final result in section 5.2, i.e. eq. (5.27), to see how such a factorized expression is formed by summing over a gauge-invariant set of graphs.

Figure 34: The two-loop QCD corrections to γ * , W ± , Z → q q̄.

Infrared regions, forests and IR cancellations
With the final-state momenta being p µ I and p µ K , which are not necessarily back-to-back, the expression for figure 34(a) is: The superscript µ of A is from the vector index of the gauge boson (γ * , W ± , Z), which we will omit in the upcoming text. On the RHS of eq. (6.1), k µ 1 (k µ 2 ) is the clockwise momentum of the left (right) loop, as shown in the figure, V αβγ (p, q, k) is the kinematic factor of the 3-gluon vertex, and [dk] is the integration measure with the other constants absorbed. Throughout this subsection, we use: where g W is the electroweak coupling, V CKM is the CKM matrix, and N c is the number of colors. Note that the +iε terms in the denominators are suppressed from now on.
We denote the leading pinch surfaces of A, from eq. (6.1), as follows:
σ 1 (SS), if k µ 1 and k µ 2 are both soft;
σ 2 (C 1 S), if k µ 1 is collinear to β µ I and k µ 2 is soft;
σ 3 (SC 2 ), if k µ 1 is soft and k µ 2 is collinear to β µ K ;
σ 4 (C 1 C 1 ), if k µ 1 and k µ 2 are both collinear to β µ I ;
σ 5 (C 2 C 2 ), if k µ 1 and k µ 2 are both collinear to β µ K ;
σ 6 (C 1 C 2 ), if k µ 1 is collinear to β µ I and k µ 2 is collinear to β µ K ;
σ 7 (C 1 H), if k µ 1 is collinear to β µ I and k µ 2 is hard;
σ 8 (HC 2 ), if k µ 1 is hard and k µ 2 is collinear to β µ K .

Taking into account the orderings allowed by the nesting requirements, the set of forests for figure 34(a) is: Each forest corresponds to a subtraction term. Now we evaluate some representatives among them: t σ 1 A, t σ 3 A, t σ 3 t σ 1 A, t σ 6 t σ 3 t σ 1 A and t σ 7 t σ 6 t σ 3 t σ 1 A. To begin, we analyze the IR regions of these subtraction terms, and see how they cancel pairwise.
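As an illustration of how forests follow from nesting relations, the Python sketch below enumerates all sets of mutually nested pinch surfaces built from the eight labels SS, C 1 S, SC 2 , C 1 C 1 , C 2 C 2 , C 1 C 2 , C 1 H and HC 2 that appear in (6.5). The containment rule used here is an assumption made only for this example and is not taken from the text: each loop momentum carries a region label, ordered as S < C 1 , C 2 < H with C 1 and C 2 incomparable, and one surface is contained in another when this holds for both loop momenta. This toy ordering may admit nestings that the actual nesting requirements exclude. The names RANK, SURFACES, contained, is_forest and all_forests are hypothetical.

```python
from itertools import combinations

# ASSUMED toy ordering of single-momentum regions: S < C1, C2 < H.
# C1 and C2 share a rank and are treated as incomparable.
RANK = {"S": 0, "C1": 1, "C2": 1, "H": 2}

# Each pinch surface is labelled by (region of k1, region of k2).
SURFACES = {
    "SS": ("S", "S"),
    "C1S": ("C1", "S"),
    "SC2": ("S", "C2"),
    "C1C1": ("C1", "C1"),
    "C2C2": ("C2", "C2"),
    "C1C2": ("C1", "C2"),
    "C1H": ("C1", "H"),
    "HC2": ("H", "C2"),
}


def region_leq(a, b):
    """True if region a equals b or sits strictly below it in the toy ordering."""
    return a == b or RANK[a] < RANK[b]


def contained(s, t):
    """True if surface s is contained in surface t (componentwise, assumed rule)."""
    return all(region_leq(a, b) for a, b in zip(SURFACES[s], SURFACES[t]))


def is_forest(names):
    """A forest is a set of surfaces in which every pair is nested."""
    return all(contained(s, t) or contained(t, s)
               for s, t in combinations(names, 2))


def all_forests():
    """Enumerate every non-empty forest over the eight surfaces."""
    names = sorted(SURFACES)
    return [set(c)
            for r in range(1, len(names) + 1)
            for c in combinations(names, r)
            if is_forest(c)]


if __name__ == "__main__":
    for forest in all_forests():
        print(sorted(forest))
```

Under this assumed ordering the chain {SS, SC 2 , C 1 C 2 , C 1 H} is a forest, in line with the representative term t σ 7 t σ 6 t σ 3 t σ 1 A, whereas {C 1 C 1 , C 2 C 2 } is not, because the two surfaces are not nested.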
We start from t σ 1 A:  With this the only change, we know that the pinch surfaces of t σ 1 A can be labelled as those of A. Namely, PS of t σ 1 A : SS, C 1 S, SC 2 , C 1 C 1 , C 2 C 2 , C 1 C 2 , C 1 H, and HC 2 . (6.5) The only difference from the pinch surfaces of A lies in the intrinsic coordinates of C 1 S and SC 2 above. That is, the β I -component of k µ 1 in C 1 S and the β K -component of k µ 2 in SC 2 are unbounded due to the soft-collinear approximations in t σ 1 . This follows from our previous discussion in section 2.2 (more precisely, the paragraph below eq. (2.18)). Figure  35 offers another direct way to see this result.
As another example, the expression for t σ 3 A is which is depicted in figure 35(b). Notice that here we have two eikonal lines (in the directions of β µ I and β µ K ) and a partonic line as the framework. The pinch surfaces of t σ 3 A are different from those of A, because we have an eikonal line in a new direction β µ K , and there is a soft approximation on the loop momentum k µ 1 . Therefore, the region where k µ 1 is collinear to β µ I and k µ 2 is collinear to β µ K , which we denote as C 1 C 2 , is a new pinch surface of t σ 3 A, while C 1 C 1 is no longer a pinch surface. In other words, From the rules of repetitive approximation, eqs. (2.22) and (2.24), we find that as are shown in figures 35(c) and (d). The pinch surfaces of t σ 3 t σ 1 A are identical to those of t σ 3 A, because the only difference between them is to replace the partonic line in (b) by an eikonal line in the same direction in (c). Meanwhile there is a new pinch surface appearing in t σ 6 t σ 3 t σ 1 A, i.e. C 1 C 2 , while C 2 C 2 is no longer pinched. In summary, the pinch surfaces of these subtraction terms are PS of t σ 3 t σ 1 A : SS, C 1 S, SC 2 , C 1 C 2 , C 2 C 2 , C 1 C 2 , C 1 H, and HC 2 ; (6.10) PS of t σ 6 t σ 3 t σ 1 A : SS, C 1 S, SC 2 , C 1 C 2 , C 1 C 2 , C 1 C 2 , C 1 H, and HC 2 . (6.11) In this notation, we list all the IR regions of the approximated amplitudes generated from figure 34(a) in the following long table, and distinguish them by enclosing them with numbered rectangles. Any two IR regions (from two subtraction terms) with the same lower index outside the rectangle cancel. By carrying out a laborious but direct check, it can be seen that all these listed terms cancel in a pairwise way, as expected. 
[Table: approximated amplitudes and their regions of IR divergences.]

We comment that one will not encounter exotic pinch surfaces in these two-loop calculations. The simplest example for a soft-exotic configuration to emerge is shown in figure 7, which is three-loop; the simplest example for a hard-exotic configuration to emerge is shown in figure 8, which corresponds to the ladder graph, figure 34(c). But this does not lead to more subtleties than our calculations shown above; such cancellations can be directly seen from the discussion in section 4.3.

Factorization of the subtraction terms
In this subsection we verify that at NNLO, the factorized expression of the full amplitude, eq. (5.27), can be directly obtained from the forest formula. From another point of view, we also verify that if we write out all the forest formulas of the O(α 2 S ) subgraphs, some of their terms cancel and the remaining terms can be reorganized into (5.27).
H . (6.12) As is explained in section 5.2, the last term in eq. (6.12) corresponds to the case where σ * 0 = η in (5.9), which comes from the remaining term R A (n) in (5.6). Each term in the square bracket is associated with a set of subtraction terms of the forest formula, which we check directly below: In eqs. (6.20)-(6.22), σ < are the pinch surfaces contained in those with hard loops: for example, σ < ⊂ SH in (6.20). Note that in eq. (6.21), there is one requirement on the pinch surfaces denoted by σ < : their soft subgraphs must not overlap with J (1) part . Otherwise, a J (1) eik will be factorized from J (1) part in the sum over different graphs, and we will not get the LHS. Therefore, the forest formula provides us with all the elements to obtain eq. (5.27).
Among all the subtraction terms of the forest formulas corresponding to the eight graphs in figure 34, one may find that some terms are missing throughout eqs. (6.13)-(6.22). The term t C 1 H t SS A (a) is an example. (As is explained above, it does not appear in (6.21); obviously, it does not appear in the other equations either.) But in the sum over different graphs, all such terms will cancel. For example, we will see where we sum over all the graphs that are compatible with the pinch surfaces appearing in the approximation operators. Such cancellations are related to the pattern in eq. (5.23).

Figure 36: The graphical contributions to A t C 1 H t SS A are equal to a factorized form.

Figure 37: The graphical contributions to A t C 1 H t C 1 S t SS A are equal to a factorized form.
In order to show eq. (6.23), we notice that the first term receives contributions from figures 34(a)-(d). From our analysis in section 5.1, the sum over these graphs is in a factorized form. We express this factorization in figure 36.
Explicit evaluation of the graphs on the LHS agrees with this conclusion, and we do not present it here for simplicity. In the same way, the second term in eq. (6.23) can be reorganized into the same factorized form. Here the only contributions are from figures 34(a), (b) and (d), as represented in figure 37. As a result, eq. (6.23) is verified, and other cancellations can be shown in the same way. With the help of these cancellations, we complete our verification of (5.27) at O(α 2 S ).

Summary and outlook
In this paper, we have developed a new forest formula for wide-angle scattering in momentum space, eq. (1.1), extending the previous work in coordinate space [17] and for the Sudakov form factor [14]. We first studied the pinch surfaces of the approximated amplitudes, which are generated by hard-collinear and soft-collinear approximations. There are many differences between these pinch surfaces and those of the original amplitude, which can be generalized to the case with repetitive approximations. Despite the differences, we have shown that the divergences of these new pinch surfaces are at worst logarithmic, through explicit power counting for each case. In order to clarify the pairwise cancellations of the divergences near overlapping pinch surfaces, we studied the maximal region enclosed by σ m and ρ {σn...σ 1 } , where σ m is the smallest one of all the pinch surfaces in forest {σ 1 , ..., σ n } that are not contained in ρ {σn...σ 1 } . We proved that this region, which we call an "enclosed pinch surface" τ ≡ enc σ, ρ {σ} [17], is indeed a leading pinch surface of the original amplitude, for both decay and scattering processes. The analysis involves studying a new algebra of normal spaces of pinch surfaces, which we developed for this purpose.
We then showed the pairwise cancellation within the forest formula. That is, for any IR divergent region of any subtraction term in the forest formula, we can uniquely find another subtraction term that cancels the divergence near that region. The proof includes the coincidence of pinch surfaces of the two subtraction terms in each pair, and the exactness of t τ at that pinch surface. We verified these two aspects case by case.
Finally we made use of the forest formula and the gauge theory Ward identity to rewrite any full amplitude of the decay or scattering processes into a gauge-invariant factorized expression with soft, jet and hard functions [14-16, 38, 55], where all the IR divergences are organized along eikonal lines in the directions of β µ and β̄ µ . In the obtained eq. (5.28), the first factor has only soft divergences, the second factor has only collinear divergences, and the last factor has no IR divergences. All the other divergences in the numerators are cancelled by the denominators. Our findings on the pinch surfaces of the approximated amplitudes may help with the study of Soft-Collinear Effective Theory (SCET) [39][40][41][42], because the approximated amplitudes are equivalent to the expanded integrals obtained by means of the method of regions [56,57], which are in one-to-one correspondence with the SCET Feynman graphs.
As we have mentioned in section 4.4, some other methods to construct subtraction terms also involve the forest-like structure. In principle, we should be able to also use equivalent forests [13] instead of forests to subtract IR divergences identically. These "freedoms" suggest a common and general mathematical structure (like the Hopf algebra [8,9]) in these different approaches.
There is another direction in which to extend this work, which so far is based on QCD amplitudes. It should be possible to generalize it to processes with jets in the final states, which can then be used to subtract IR divergences of cross sections. Beyond this, it may also be extended to weighted cross sections (angularity, N -jettiness, etc.) [69]. These topics are left for future research.

In the last column, we denote the produced vectors that form products with γ.

every one of the m propagators provides a β µ I to the red subgraph. Therefore, the power counting procedure in the case we study here is identical to that in section 2.4.
In the same way, we can analyze the other case, where t σm exerts a soft-collinear approximation. For simplicity, we shall not review it here since the power counting procedure is exactly the same with that for figure 17(b). In conclusion, a pinch surface ρ {σn...σ 1 } with soft-exotic configurations, is at worst logarithmically divergent.

B Some details in section 3.1
In this appendix, we provide the reader with some details that are omitted in section 3.1. First, we emphasize that the approximations in the definition of enclosed pinch surfaces, eq. (3.1), are necessary. After that we provide a detailed proof of Theorem 4, which can be seen as the generalization of Theorem 3 to repetitive approximations.

B.1 Necessity of the approximations
Our method to demonstrate the necessity of the approximations is by contradiction. Namely, we suppose that the ρ {σ} in eq. (3.4) is replaced by ρ, as is shown in (3.22). Then we come to a counterexample, which suggests that we cannot relate the subgraphs of σ, ρ and τ as simply as eq. (3.6).
Our example is given in figure 38. If eq. (3.22) is the definition of enclosed pinch surfaces, all the loop momenta should be soft in τ by definition. However, eq. (3.6) does not work here. On the one hand, both the propagators Oa and ab belong to H (σ) J (ρ) I , and therefore to J (τ ) I if (3.6) is assumed; on the other hand, Oa is collinear to β µ I in τ while ab is soft in τ from the definition (3.22). Then we see that (3.6), which relates the subgraphs of σ, ρ and τ , does not work here.
This problem is automatically cured in our original definition, eq. (3.4), because every momentum entering b is soft in ρ {σ} and ab is a soft propagator. From our previous knowledge in case (Ciii) of section 2.2, this corresponds to the soft-exotic configuration, as we have introduced. We can then easily verify that (3.6) holds.

B.2 Proof of Theorem 4
Now we prove Theorem 4 (eq. (3.23)), which generalizes Theorem 3 in section 3.1 to repetitive approximations. Recalling the definition of the -operation in table 4, we start by rewriting (3.23) as Then we insert the defining property of N τ , eq. (3.1) into the LHS of (B.1), and obtain an equivalent relation: This is exactly eq. (3.24), which we prove below. From the expression of eq. (B.2), it is clear that as long as one knows the explicit projections that t σn ...t σ 1 exerts on l µ i and l µ j , one can check its correctness. Once (B.2) is verified for all the possible projections, the proof of Theorem 4 is finished. We will discuss the possibilities by first classifying them into the following two cases: 1, t σm = id for l µ i and l µ j ; 2, t σm ≠ id for l µ i or l µ j . We highlight the requirement of Theorem 4, that σ m is the smallest pinch surface not contained in ρ {σn...σ 1 } . This implies that all the pinch surfaces σ 1 , ..., σ m−1 are contained in ρ {σn...σ 1 } , which is a crucial property that we will repeatedly refer to throughout the proof.
1, t σm = id in eq. (B.2). Because l µ i and l µ j flow into the same vertex, t σm = id implies that the propagators carrying l µ i and l µ j must be simultaneously in a hard, soft or jet subgraph in σ m .
(1) For the subcase of l µ i and l µ j soft in σ m , both sides of eq. (B.2) become the entire 4-dimensional space, and (B.2) holds automatically.
(2) For the subcase of l µ i and l µ j hard in σ m , N σm (l i ) N σm (l j ) = ∅, and eq. (B.2) becomes the following relation, which we aim to prove: The second equality is due to the fact that l µ i and l µ j are both hard in σ m , and consequently, all the projections on them should correspond to the pinch surfaces that are contained in σ m . Since any projected momentum always has at least as many normal coordinates as the original one, it follows that LHS ⊆ RHS. Next we argue that if l µ i and l µ j are hard in σ m , the strict inclusion LHS ⊂ RHS cannot occur, so that the two sides are equal. We do this by examining the possible actions of t σ m−1 ...t σ 1 on l µ i and l µ j .
First, suppose t σ m−1 ...t σ 1 acts as hard-collinear approximations on both l µ i and l µ j . Then at the specific pinch surface(s) where t σ m−1 ...t σ 1 exerts these approximations, at least one of the following three configurations will emerge: (1) l µ i is lightlike and l µ j is hard; (2) l µ j is lightlike and l µ i is hard; (3) l µ i and l µ j are both lightlike, in different directions. No matter which configuration we encounter, according to our observation that all the pinch surfaces σ 1 , ..., σ m−1 are contained in ρ {σn...σ 1 } , the momenta t σ m−1 ...t σ 1 l µ i and t σ m−1 ...t σ 1 l µ j in ρ {σn...σ 1 } are either hard or lightlike, and if they are both lightlike, they cannot be in the same direction. As a result, the RHS of eq. (B.3) is empty, and LHS ⊂ RHS cannot occur.
For the same reason, we observe that t σ m−1 ...t σ 1 cannot act as soft-collinear approximations on both l µ i and l µ j , otherwise we get the following implications: (1) l µ i is lightlike and l µ j is soft at some σ m′ (1 ≤ m′ ≤ m − 1); (2) l µ i is soft and l µ j is lightlike at some σ m″ (1 ≤ m″ ≤ m − 1). Since σ m′ and σ m″ are not nested, they cannot both be pinch surfaces appearing in t σ m−1 ...t σ 1 .
Finally, the only possibility left is that the approximation on l µ i is hard-collinear, provided by t σm 1 , while that on l µ j is soft-collinear, provided by t σm 2 (1 ≤ m 2 < m 1 ≤ m − 1). Then we can rewrite eq. (B.3) as where the hat and tilde are defined in (2.7). We also know that in σ m 1 , l µ i is collinear to β µ I while l µ j is hard. As we have explained, σ m 1 ⊆ ρ {σn...σ 1 } , so l µ i is either collinear to β µ I or hard, while l µ j must be hard in ρ {σn...σ 1 } . The RHS of (B.4) is then empty, and again LHS ⊂ RHS cannot occur. Therefore this possibility is also eliminated, and we have finished verifying (B.3).
(3) For the case of l µ i and l µ j being lightlike in a certain direction in σ m , say β µ I , eq. (B.2) becomes the following relation, which we aim to prove: To show this, we begin by observing that if there does not exist a lightlike vector v µ ( = β for any l µ , either N ρ {σn...σ 1 } (l i ) or N ρ {σn...σ 1 } (l j ) does not contain v µ as a result. Then the LHS will also be equal to N (I) , and (B.5), and hence (B.2), is valid.
We will now argue that eq. (B.5) holds in the alternative case,   In these relations, which we shall verify below, we have used the notations of the figures as the upper indices of ρ to denote pinch surfaces of certain approximated amplitudes. For example, ρ (a1) is a pinch surface of the approximated amplitude whose approximation on l µ i is a single hc I , as is shown in figure 40 Then We apply Lemma 1 to eq. (B.15) to rewrite the term in the bracket on the LHS as N ρ (a2) I l i . As can be inferred from the previous paragraph, both sides of (B.15) are equal to ∅, either from N (I) N (K) = ∅ or ∅ N (K) = ∅.
• a3 : To obtain this net approximation, we need a pinch surface σ k (1 ≤ k < m) with l µ i soft and l µ j collinear to β µ K there. So in ρ (a3) , l µ j is either hard or collinear to β µ K . Then we apply Lemma 2 to the RHS of eq. (B.16), using (2.30) for KI l i , after which we get our previous result a1, eq. (B.14).
• a4 : Here we need a pinch surface σ k (1 ≤ k < m) with l µ i collinear to β µ I and l µ j soft there. So l µ i is either hard or collinear to β µ I in ρ (a4) . Again we apply Lemma 2 to the RHS of eq. (B.17), and it returns to case a1.
• a5 : We compare this case with a2. Here everything is the same except that in eq. (B.18) we have KI l i rather than I l i in (B.15). But this difference is eliminated by applying Lemma 2 to the RHS, with l 1 = K l j in (B.28).
• a6 : Again we compare this case with a2. Everything is the same except that in eq. (B.19) we have IK l j rather than K l j in (B.15). This difference is likewise eliminated by applying Lemma 2 to the RHS, with l 1 = I l i in (B.28).
• b1 : Eq. (B.20) is the same as (3.15), which we have already shown in section 3.1.
• b2 : To obtain the net approximation in the figure, we need a pinch surface σ k (1 ≤ k < m) with l µ i being soft and l µ j being collinear to β µ K . So l µ j is either hard or collinear to β µ K in ρ (b2) . By applying Lemma 2 to the RHS of eq. (B.21), this relation returns to (B.20) above.
• b3 : This case can be treated in the same way as b2 above, by exchanging β µ I and β µ K .
• c1 : Eq. (B.23) is the same as (3.17), which we have already shown in section 3.1.
• c2 : We focus on the brackets of the RHSs of eqs. (B.24) and (B.23). Obviously the only difference lies in the hc I on l µ i . Therefore, if the two brackets are different due to this approximation, the difference must be within N (I) , because N ρ I l i ⊆ N ρ (l i ). Meanwhile, since they are both in the direct sums with N (I) , the difference is eliminated. In other words, the two relations (B.24) and (B.23) are identical.
• c3 : From the configuration of the net approximation, we know there is a pinch surface σ k (m < k ≤ n) where l µ j is collinear to β µ K . From the result of Theorem 2 in section 2.3, l µ j can only be hard, soft, collinear to β µ K or β

We have finished the discussion of all the possibilities of t σm in eq. (B.2). The analysis throughout this appendix is also sufficient for non-planar graphs, or other loop assignments in a planar graph. The reason has already been explained in (3.19)-(3.21), so we do not repeat it here. In conclusion, we have verified Theorem 4, which implies the relations between subgraphs in eq. (3.25).

C Interpretations in position space
This appendix aims to provide the position-space aspects of sections 2-4 [17]. In position space, IR divergences are long-distance effects, which come from the integration measure over the four-coordinates. Meanwhile, given any integral representing a hard QCD process with massless partons, we can multiply every momentum component by a scale factor without changing the form of the integrand. Due to this scale invariance, the pictures of IR divergences for massless partons should be related to those of UV divergences. To this extent, in order to identify the IR divergences in position space, it suffices to study where the integrand is pinched, which by definition corresponds to the UV divergences. However, we emphasize that the momentum-space analysis is not simply a Fourier transform of the position-space one, because it shows how pinch surfaces that are absent from the original amplitude A can emerge, and how loop momenta behave nearby, which is still necessary to understand factorization.
First we study the pinches formed between parallel propagators. Referring to figure 43(a), suppose 0, y µ 1 and y µ 2 are all vertices of jet J in the direction of β µ , i.e. y µ 1 ∼ (y 1 · β̄) β µ and y µ 2 ∼ (y 2 · β̄) β µ . Then for any intermediate vertex y µ joining y µ 1 and y µ 2 through propagators, there is a factor in the denominator given by:

$$\left[-2\,(y_1-y)\cdot\bar\beta\,\big((y_1-y)\cdot\beta\big)+(y_{1\perp}-y_\perp)^2+i\epsilon\right]\cdot\left[-2\,(y-y_2)\cdot\bar\beta\,\big((y-y_2)\cdot\beta\big)+(y_\perp-y_{2\perp})^2+i\epsilon\right]. \qquad \text{(C.1)}$$

As long as y 1 · β̄ < y · β̄ < y 2 · β̄ (or reversed), the coordinates (y · β) and y ⊥ are pinched at zero. At this pinch surface both y 1 y and yy 2 are jet propagators of J (see figure 43(a)).

Then we study the pinches formed by the intersection of two jets. Suppose y µ 1 and y µ 2 are vertices in different jets J I and J K , which are in the directions of β µ I and β µ K . Meanwhile, x µ is another vertex which is connected to y 1 , y 2 and 0 through propagators. Then in the denominator we have the following factor: A direct check shows that all the four components of y µ can be pinched at zero. At this pinch surface xy 1 and xy 2 are separately jet propagators of J I and J K , while x µ is in the hard subgraph (see figure 43(b)).

In terms of the language of pinch surfaces, given any propagator xy in an amplitude, all the four components of (x − y) µ are normal coordinates if xy is hard; its components (x − y) · β and x ⊥ − y ⊥ are normal coordinates if it is lightlike in the direction of β µ ; none of its components are pinched if it is soft. We can also scale these normal coordinates as we have in eq. (2.2), assuming the length scale L ∼ 1/Q:

$$\text{hard:}\quad (x-y)^\mu \sim (\lambda,\lambda,\lambda,\lambda)\,L;\qquad \text{collinear:}\quad (x-y)^\mu = \big((x-y)\cdot\bar\beta,\ (x-y)\cdot\beta,\ (x-y)_\perp\big) \sim \big(1,\lambda,\lambda^{1/2}\big)\,L$$

The power counting according to (C.3), as is carried out in detail in [70], suggests that such UV divergences are at worst logarithmic. This is in agreement with the conclusion in momentum space, as a direct result of scale invariance.
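The ordering condition for the pinch between parallel propagators can be checked numerically. The sketch below assumes the light-cone decomposition y µ = (y · β̄) β µ + (y · β) β̄ µ + y ⊥ with β · β̄ = 1, places y 1 and y 2 exactly on the β direction (so that y 1 · β = y 2 · β = 0 and y 1⊥ = y 2⊥ = 0), and locates the poles of the two denominators of (C.1) in the complex (y · β) plane. A pinch requires the two poles to sit on opposite sides of the real axis. The function names and the numerical value of ε are hypothetical choices for illustration.

```python
def poles_in_y_beta_plane(y1_bar, y_bar, y2_bar, y_perp_sq, eps=1e-6):
    """Poles of the two (C.1)-type denominators in the complex (y.beta) plane.

    Assumptions (for illustration only): y1 and y2 lie exactly along beta,
    so y1.beta = y2.beta = 0 and their transverse parts vanish.
    The *_bar arguments are the components y1.beta_bar, y.beta_bar, y2.beta_bar.
    """
    # First denominator:  2 (y.beta)(y1_bar - y_bar) + y_perp^2 + i*eps = 0
    pole1 = -(y_perp_sq + 1j * eps) / (2.0 * (y1_bar - y_bar))
    # Second denominator: -2 (y.beta)(y_bar - y2_bar) + y_perp^2 + i*eps = 0
    pole2 = (y_perp_sq + 1j * eps) / (2.0 * (y_bar - y2_bar))
    return pole1, pole2


def is_pinched(y1_bar, y_bar, y2_bar, y_perp_sq=0.0, eps=1e-6):
    """True if the two poles lie on opposite sides of the real (y.beta) axis."""
    pole1, pole2 = poles_in_y_beta_plane(y1_bar, y_bar, y2_bar, y_perp_sq, eps)
    return pole1.imag * pole2.imag < 0


if __name__ == "__main__":
    print(is_pinched(0.0, 1.0, 2.0))  # y between y1 and y2
    print(is_pinched(2.0, 1.0, 0.0))  # reversed ordering
    print(is_pinched(0.0, 3.0, 2.0))  # y outside the interval
```

In this sketch the contour is trapped only when y · β̄ lies between y 1 · β̄ and y 2 · β̄, and both poles approach the origin as y ⊥ and ε go to zero.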
With the knowledge of UV singularities in position space, we shall study how to identify each pinch surface through approximations. In appendix C.1 we derive the position-space version of hard-collinear and soft-collinear approximations; then in appendix C.2 we discuss how these approximations can change the pinch surfaces. The enclosed pinch surfaces introduced in section 3, and the pairwise cancellation of UV divergences in section 4, then follow similarly. This appendix is not supposed to be a full-fledged explanation of every detail; rather, it focuses on some representative examples to illustrate the main ideas.

C.1 Approximations
Given a Feynman integral in position space, the numerators and denominators of the integrand are polynomials in normal coordinates. As the normal coordinates are scaled as eq. (C.3), each such polynomial can be approximated by keeping only the leading term, corresponding to the hard-collinear and soft-collinear approximations in position space. In the text below we shall derive the forms of these approximations, and show their equivalence with those we have studied in momentum space.
Given a leading pinch surface σ, suppose x µ are the hard vertices, y µ A are the jet vertices in the direction of β µ A , and z µ are the soft vertices. Then for any jet subgraph J (σ) A , since the distances of the external propagators are of the form (y − x) µ , and the β A -components of y µ are much larger than its other components, only the β A -components of x µ will show up in J (σ) A in the leading term. Similarly, for any soft subgraph S (σ) , the distances of the external propagators are of the form (z − y) µ , and since all the components of its internal vertices z µ are of the same order, only the (largest) β A -components of y µ will show up in S (σ) . That is to say, the hard-collinear and soft-collinear approximations provided by t σ should read Here, as in eqs.

We now show that eqs. (C.4) and (C.5) imply the momentum-space approximation rules (2.3) and (2.4). In other words, the approximated amplitudes t σ A in position space are related to t σ A in momentum space through a standard Fourier transformation. To start with, we approximate the external vertices of the soft subgraph as y A i · β A β µ A , where A labels the jets. Then the soft subgraph in position space can be rewritten as: Here we have ignored the vector indices of S for simplicity, and n A is the number of soft propagators attached to J A . In the last step, we have Fourier-transformed S into momentum space, so p Ai represents the external momenta of J A that are soft in σ. The momentum-conservation delta functions have been absorbed into the definition of S. Eq. (C.6) implies that the approximated subgraph S in position space is equal to the Fourier transformation of the original S in momentum space, with its external vertices projected onto the attached jets. As the next step, we combine the phases of (C.6) with the approximated jet subgraph J A in position space, integrate over the vertices y A i 's, and express the remaining position dependence on x A j 's as a momentum transform, i.e.

(C.7)
Here m A is the number of scalar-polarized gauge bosons of J A that are attached to the hard subgraph. Eq. (C.7) implies that after we integrate over the jet-soft vertices, the approximated jet subgraph in position space becomes that in momentum space, with the incident soft momenta projected onto β µ A . Finally we integrate over the four components of the jet-hard vertices x A j . The remaining factors in t σ A is then equal to: One can also derive the rules for repetitive approximations in position space directly from what we have in section 2.3. As is shown in figure 12, with the presence of the operator t σ 2 t σ 1 , the line momentum (p I + q) µ in A becomes p I + (q · β K )(β K · β I )β I µ in t σ 2 t σ 1 A, where t σ 1 gives a soft-collinear and t σ 2 a hard-collinear approximation. Then if we perform Fourier transformations to rewrite the propagators in position space, we will encounter the following factor: e −i(p·β K )(βK·βI)(β I ·x) . (C.9) We can interpret the exponent from a position-space point of view: the vertex x µ is initially projected by a hard-collinear approximation hc (σ 2 ) K , and then projected by a soft-collinear approximation sc (σ 1 ) I . In other words, for figure 12, t σ 2 t σ 1 (y µ − x µ ) = y µ − (x · β I )(β I · β K )β µ K . (C.10) This rule of repetitive approximations is in agreement with the prescription given in [17], and through a direct calculation, it satisfies the requirements, eqs. (2.19) and (2.20). Another example is in figure 45, where we claim that x µ , y µ and w µ are all in alignment to β µ I . To see this, we focus on the denominators of t σ A that involve w and y, which reads: Again, since these denominators are associated with jet propagators in ρ {σ} , each term in the bracket should vanish. Given x µ = x · β K β µ K in ρ {σ} , we see that this occurs for y µ = y · β K β µ K ; w µ = (w · β I ) β µ I ; z µ = 0. (C.14) As a direct result, the propagators wx, wy and wz are all parallel to β µ I in ρ {σ} . 
Note that only the β I -components of x µ and y µ enter the expressions of wx and wy, so we have the conclusion above. Figures 44 and 45 are the position-space examples of the regular configurations of ρ {σ} , and we can also study the exotic configurations. An example of the soft-exotic configuration is shown in figure 46, whose momentum-space version was studied in the upper row of figure 7. After the action of soft- and hard-collinear approximations, the denominators of t σ A which involve w µ read: (C.15) In order to ensure that the approximated denominators in eq. (C.15) corresponding to the lines wx and wy both vanish at ρ {σ} , we only need w µ = (w · β I ) β µ I , even if w · β I = 0. Meanwhile, the position of y µ is not constrained in ρ {σ} . In other words, y µ is a soft vertex although it is an endpoint of the jet line wy. This is the position-space interpretation of the soft-exotic configuration.

Figure 47 provides an example of the hard-exotic configuration. In σ, the propagator xw is soft, xy is lightlike in the direction of β µ I and wz is lightlike in the direction of β µ K . Meanwhile, they all become hard in ρ {σ} . According to the hard-collinear approximations in t σ , y µ (z µ ) is projected onto β µ I (β µ K ) in xy (wz), but unchanged in the other propagators. So we can express ρ {σ} as the RHS of figure 47, which yields the following factor in the denominator of t σ A:

$$\left[-\big(x-(y\cdot\beta_I)\,\bar\beta_I\big)^2+i\epsilon\right]\left[-y^2+i\epsilon\right]=\left[2\,x\cdot\bar\beta_I\,\big((y-x)\cdot\beta_I\big)+x_{I\perp}^2+i\epsilon\right]\left[-2\,y\cdot\bar\beta_I\,(y\cdot\beta_I)+y_{I\perp}^2+i\epsilon\right]. \qquad \text{(C.16)}$$

Since xy is hard in ρ {σ} , we have (x − (y · β I ) β̄ I ) µ ∼ O(λ), so x · β̄ I ∼ O(λ). Then we examine the poles in the y · β I -plane. There are two poles:

$$x\cdot\beta_I-\frac{x_{I\perp}^2+i\epsilon}{2\,x\cdot\bar\beta_I},\qquad\text{and}\qquad\frac{y_{I\perp}^2+i\epsilon}{2\,y\cdot\bar\beta_I}. \qquad \text{(C.17)}$$

Clearly a pinch is formed when y µ = (y · β̄ I ) β µ I is finite, and y · β̄ I is of the same sign as x · β̄ I . Both poles are of order λ from the origin if y 2 I⊥ ∼ λ, as appropriate for a collinear pinch surface for y µ .
Similarly z µ = (z · β̄ K ) β µ K . This implies a pinch surface structure "inside" H (σ) , and provides the position-space interpretation of the hard-exotic configuration. These interpretations in figures 44-47 are compatible with the results in section 2.2, and are fundamental in proving the pairwise cancellation of UV divergences in the position-space forest formula.
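The light-cone algebra behind the hard-exotic denominators can be checked numerically. The sketch below is written under stated assumptions that are not fixed by the text: a mostly-minus metric, explicit lightlike vectors β I = (1, 0, 0, 1)/√2 and β̄ I = (1, 0, 0, −1)/√2 with β I · β̄ I = 1, and a projection of y µ that keeps only its y · β I component, i.e. y µ → (y · β I ) β̄ µ I . With these choices it verifies the rewriting of the first factor of (C.16) and evaluates the two poles of (C.17) in the complex (y · β I ) plane. All function names are hypothetical.

```python
import math

# ASSUMED explicit lightlike basis, mostly-minus metric, beta . beta_bar = 1.
BETA = (1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2))
BETA_BAR = (1 / math.sqrt(2), 0.0, 0.0, -1 / math.sqrt(2))


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def perp_sq(a):
    """Euclidean square of the components transverse to beta and beta_bar."""
    return a[1] ** 2 + a[2] ** 2


def projected_denominator(x, y):
    """-(x - (y.beta) beta_bar)^2, the left-hand form of the first factor."""
    coeff = mdot(y, BETA)
    d = tuple(xi - coeff * bi for xi, bi in zip(x, BETA_BAR))
    return -mdot(d, d)


def expanded_denominator(x, y):
    """2 (x.beta_bar) ((y - x).beta) + x_perp^2, the right-hand form."""
    return 2 * mdot(x, BETA_BAR) * (mdot(y, BETA) - mdot(x, BETA)) + perp_sq(x)


def poles(x, y, eps=1e-9):
    """Poles in the complex (y.beta) plane, holding y.beta_bar and y_perp fixed."""
    pole1 = mdot(x, BETA) - (perp_sq(x) + 1j * eps) / (2 * mdot(x, BETA_BAR))
    pole2 = (perp_sq(y) + 1j * eps) / (2 * mdot(y, BETA_BAR))
    return pole1, pole2


def is_pinched(x, y, eps=1e-9):
    """True if the two poles lie on opposite sides of the real axis."""
    pole1, pole2 = poles(x, y, eps)
    return pole1.imag * pole2.imag < 0


if __name__ == "__main__":
    x = (0.03, 0.01, -0.02, 0.005)   # a vertex close to the origin
    y = (2.0, 0.1, 0.0, 1.9)         # a vertex far out along beta
    print(projected_denominator(x, y), expanded_denominator(x, y))
    print(is_pinched(x, y))
```

In this setup the poles sit on opposite sides of the real axis exactly when x · β̄ I and y · β̄ I have the same sign, which is the pinch condition stated for the hard-exotic configuration.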