Cancellation of Glauber gluon exchange in the double Drell-Yan process

An essential part of any factorisation proof is the demonstration that the exchange of Glauber gluons cancels for the considered observable. We show this cancellation at all orders for double Drell-Yan production (the double parton scattering process in which a pair of electroweak gauge bosons is produced) both for the integrated cross section and for the cross section differential in the transverse boson momenta. In the process of constructing this proof, we also revisit and clarify some issues regarding the Glauber cancellation argument and its relation to the rest of the factorisation proof for the single Drell-Yan process.


Introduction
As the LHC continues to run after its energy upgrade, hopes are high that signals for physics beyond the Standard Model will show up in the coming years. To identify such signals and to understand their nature will be challenging for experiment and theory. This provides strong motivation to strive for a detailed and quantitative understanding of the final state in high-energy proton-proton collisions. Steady progress in perturbative calculations allows for increasing precision in describing the final state of a single parton-level scattering. However, much remains to be understood even at the conceptual level for multiple hard interactions, where several parton-level scatters occur in one proton-proton collision. While such multiple interactions are power suppressed in sufficiently inclusive cross sections, they are unsuppressed in specific regions of phase space [1]. Moreover, their importance grows with increasing collision energy. Since many search channels for new physics involve high multiplicity, a thorough understanding of multiple hard scattering is highly desirable. Recent years have seen considerable progress, from theory to phenomenology to experimental studies, as is for instance documented in the proceedings [2][3][4][5].
The theory for single hard scattering in proton-proton collisions rests on factorisation formulae, which describe the cross section in terms of parton distributions, hard-scattering cross sections at parton level, and possibly other quantities like soft factors. Substantial effort has gone into proving these formulae to all orders of perturbation theory in QCD, see e.g. [6][7][8] for comprehensive accounts. A crucial part of such proofs is to cast the exchange of soft gluons between the two colliding protons into a form consistent with the factorisation formula. In particular kinematics, which is referred to as the Glauber region and which essentially describes small-angle parton-parton scattering, the approximations required to achieve such a form break down. Factorisation therefore only holds if the net effect of all Glauber gluon exchanges cancels in the observable considered. A prominent example where such a cancellation fails to occur (and factorisation is strongly violated) is hard diffraction [9,10]. Observables in Drell-Yan production for which Glauber gluon exchange breaks factorisation are discussed in [11,12].
Factorisation formulae can also be written down for multiple hard scattering, where multi-parton distributions appear instead of the familiar single-parton distributions. Together with model assumptions connecting these multi-parton distributions with their single-parton counterparts, this forms the basis of most phenomenological investigations, see for instance . To put this framework on a more solid footing, it is necessary to understand whether factorisation actually holds for multiple hard scattering. Several pieces of evidence pointing towards a positive answer have been given in [1,46,47], but the issue of Glauber gluons in this context has not been analysed yet. It is the purpose of the present paper to fill this gap. We limit ourselves to double hard scattering and more specifically to the double Drell-Yan process, i.e. to the production of two electroweak gauge bosons. To the best of our knowledge, the cancellation of Glauber gluon exchange in single hard scattering has also been established only for Drell-Yan production [8,[48][49][50]. We will show that the proof given in [8] can indeed be adapted to the double Drell-Yan process. In doing so, we will revisit and clarify some subtle issues in the single Drell-Yan case.
The most commonly used form of factorisation involves parton distributions integrated over transverse parton momenta, often called "collinear" distributions. Correspondingly, the net transverse momentum of the particles produced in a hard-scattering subprocess is then integrated over in the cross section formula. A different factorisation formalism uses transverse-momentum dependent parton distributions (TMDs). It allows one for instance to compute the transverse-momentum spectrum of the gauge boson produced in the Drell-Yan process in the region where that transverse momentum is small compared to the boson invariant mass. TMD factorisation for hadron-hadron collisions has only been established for cases where the particles produced in the hard scattering are colourless, i.e. for the production of electroweak gauge bosons and Higgs particles. For final states involving hadrons, problems related to gluon exchange in the Glauber region have so far prevented the formulation of TMD factorisation [51]. Under the restriction to colourless final states, our proof of Glauber gluon cancellation for double hard scattering holds in both the collinear and TMD cases. As already noted for single DY production in [8], such a proof is no more complicated (if not simpler) in the TMD framework than for collinear factorisation, and we treat both cases together in this paper. A possible extension to final states involving hadrons, which is of obvious phenomenological relevance, must await further work.
This paper is organised as follows. In the next section we give an account of the overall logic and status of a factorisation proof for the double Drell-Yan process, in close analogy to the single Drell-Yan case. Section 3 analyses one-gluon exchange in a toy model, with explicit calculations illustrating the role of complex contour deformation in avoiding contributions from the Glauber region. Within this toy model we show how the exchange of a single Glauber gluon cancels in the cross section. In section 4 we present an all-order proof, following the work in [8,50] on single Drell-Yan production, where an essential tool is light-cone time ordered perturbation theory. We summarise our main results in section 5.

Overview of the factorisation proof
In this section we give a broad overview over the different steps of the factorisation proof for double hard scattering with colourless final state particles. For definiteness, we consider double Drell-Yan production, p + p → V 1 + V 2 + X, where V 1 and V 2 are electroweak gauge bosons and X is the unobserved hadronic final state. Although many of the steps do not directly concern the main topic of this work, namely the cancellation of Glauber gluon exchange, we find it important to exhibit the interplay of this part of the proof with the other steps, which need to be taken in a consistent order.
Several parts of the factorisation proof have been elaborated in [1] or [47], but others still require further work in our opinion. Many elements of the proof are the same for TMD and collinear factorisation, so that the discussion can conveniently treat both cases in parallel. We largely follow the factorisation proof for single Drell-Yan production as given in chapter 14 of [8], with some modifications that we will point out.

Overall strategy
1. Power counting. Both collinear and TMD factorisation are based on a power expansion in a small parameter Λ/Q. Here Q is the hard scale of the process, whereas Λ represents small kinematic quantities and the scale of nonperturbative QCD interactions. The terms "leading" and "dominant" refer to the limit of small Λ/Q.
In double Drell-Yan production, Q is given by the invariant masses of the two produced bosons (which we treat as being of the same order). For collinear factorisation there is no kinematic scale of size Λ, whereas for TMD factorisation the transverse momenta of the bosons are counted to be of order Λ. This does not mean that TMD factorisation is limited to transverse momenta in the nonperturbative region; what counts is that they are small compared with the hard scale of the process. Since Λ/Q is the parameter used to quantify the accuracy of the final factorisation formula, Λ is to be taken as the largest of all scales counted as "small" and Q as the smallest of all scales counted as "large".
2. Dominant graphs and regions. Following the analysis of Libby and Sterman [52,53], one identifies the dominant regions of loop integration for each individual graph that contributes to the cross section. These regions are characterised by the momenta of internal lines being either hard, soft (which includes the Glauber region), or collinear to an external direction. They correspond to pinch singularities of the graph when all quantities of order Λ (masses and transverse momenta) are set to zero. For double Drell-Yan production there are just two collinear directions, given by the incoming protons. Within collinear factorisation, unobserved jets in the final state provide additional directions for collinear lines, but they disappear in the final result. (In TMD factorisation, production of additional jets is power suppressed due to the requirement of small transverse momentum for the produced gauge bosons.) For each leading momentum region, a graph can be organised into subgraphs containing only hard, soft or collinear lines. Dimensional analysis shows how these subgraphs can be connected in order to give a leading contribution. In this context, different roles are played by gluons with transverse or longitudinal polarisation (respectively having a polarisation vector that is approximately transverse or approximately proportional to the gluon momentum). A collinear and a hard subgraph must be connected by the minimum number of quarks 1 or transverse gluons necessary for the process, whereas there can be any number of longitudinal gluons. Between a collinear and a soft graph there can be any number of soft gluons. We only consider soft subgraphs that couple to at least two different collinear graphs; soft subgraphs connecting to a single collinear graph are treated as a part of that graph. Soft gluons coupling to a hard subgraph are power suppressed.
For our further discussion we introduce light-cone coordinates, a ± = (a 0 ± a 3 )/ √ 2 for each four-vector and denote its transverse components as a = (a 1 , a 2 ). The components of the full vector are given as (a + , a − , a). We choose a coordinate system in the overall c.m. frame where both incoming protons have zero transverse momentum, with one proton moving fast to the right and the other fast to the left. The typical size of momenta is then l ∼ (Q, Q, Q) for hard lines, l ∼ (Q, Λ 2 /Q, Λ) for right-moving collinear lines, l ∼ (Λ 2 /Q, Q, Λ) for left-moving collinear lines. (2.1) Virtualities are thus of order Q 2 for hard lines and of order Λ 2 for collinear lines. We note that in low-order graphs, hard momenta may have small or zero transverse components, which does however not affect the general analysis. Soft momenta have all components of order Λ or smaller. As we will discuss in section 2.2, there are important differences whether soft momentum components are of order Λ or Λ 2 /Q.
The analysis described here is based on Feynman graphs in perturbation theory. It is understood that in the final factorisation formula, quantities that involve lines with virtuality Λ 2 or less (i.e. collinear or soft lines) will be represented by hadronic matrix elements of quark and gluon operators, which are meaningful beyond perturbation theory.
After these preliminaries, we can now specify the dominant graphs for double Drell-Yan production.
(a) For TMD factorisation, the dominant graphs are as shown in figure 1. They have the same structure as for single Drell-Yan production, except that there are two distinct hard-scattering subgraphs on either side of the final-state cut. For ease of notation, we group the hard subgraphs such that H i (i = 1, 2) comprises the two subgraphs (on either side of the final-state cut) that produce the gauge boson V i . There are exactly four quark lines entering each subgraph H i .
We denote the collinear subgraph with right-moving lines by A and the one with left-moving lines by B. Each of them has four external quark lines and is related to a double-parton distribution that appears in the final factorisation formula. We note that the soft factor S may consist of several connected components. As mentioned above, each of these components must couple to both A and B. The simplest connected subgraphs of S consist of one gluon directly connecting the two collinear graphs.
(b) The dominant graphs in collinear factorisation are more complicated since each of the two hard scatters can produce additional jets, which are part of the unobserved hadronic final state X in the cross section. These additional collinear factors in the final state can further couple to the soft subgraph, as shown in figure 2. A series of arguments is needed to convert this to two hard-scattering graphs with cut parton lines in the final state, as in figure 3. The arguments are the same as for the single Drell-Yan process. We will review them in a separate publication. Figure 1. Dominant graphs for double Drell-Yan production, p  Figure 2. Dominant graph for double Drell-Yan production, p + p → V 1 + V 2 + X, in collinear factorisation, with one unobserved jet produced by each hard scatter. Not shown for the sake of clarity are additional longitudinal gluons that can be exchanged between each pair of a hard and a collinear subgraph.
At higher orders in α s , incoming quarks in a hard graph can be replaced by gluons with transverse polarisation. The four partons with physical polarisation in each collinear factor thus can be different combinations of quarks and gluons. For brevity we will call these "physical partons".
(c) While the reaction p + p → V 1 + V 2 + X can proceed by double hard scattering (double parton scattering, DPS), two gauge bosons can also be produced in a single hard scattering (single parton scattering, SPS). In TMD factorisation, SPS and DPS have the same power behaviour in Λ/Q whereas in collinear factorisation SPS is enhanced over DPS by a factor Q 2 /Λ 2 . (On the other hand, due to a larger flux of incoming partons, DPS is generically enhanced over SPS at small Q 2 /s, where s is the squared pp c.m. energy.) There are also interference terms between SPS and DPS, which have the same power behaviour as DPS for both TMD and collinear factorisation. For details we refer to section 2.4 of [1].
As discussed in sections 5.2.3 and 5.3 of [1], certain graphs can contribute to both SPS and DPS in different momentum regions. (An example is graph b in figure 9 below.) The existing factorisation formulae for DPS extend into the SPS region and vice versa, so that there is a double counting problem. This problem has been discussed in [46,[54][55][56][57], but in our opinion it has not been solved in a satisfactory way.

3.
Approximations. In each dominant momentum region of a graph, one makes appropriate approximations. As discussed in section 10.4.2 of [8], these can be grouped into different types: (a) Kinematics. For collinear lines entering a hard subgraph one keeps only large light-cone components, neglecting small light-cone components and transverse components. A right-moving momentum ℓ entering H i is thus replaced bŷ A small rescaling of the large components can be made so as to have exact momentum conservation in the hard subgraph. In hard subgraphs one neglects all quark masses that are small compared with Q (heavy-quark contributions are not considered in this work).
One can also approximate soft momenta ℓ that enter a collinear subgraph. As explained in section 2.3, we take here a different approximation than the one in [8], which is not suitable for the case of double hard scattering. In section 3 we do not make any kinematic approximation for soft gluons, whereas in section 4 we neglect the light-cone component of ℓ that is much smaller than the large component of collinear momenta. For a soft gluon entering graph A we thus replace ℓ withl = (0, ℓ − , ℓ) .

(2.3)
After these kinematic approximations, certain loop integrals only involve some subgraphs rather than the product of all of them. This induces ultraviolet divergences that are not present in the original graph for the cross section and hence not cancelled by the counterterms in the QCD Lagrangian. The individual factors hence require additional ultraviolet renormalisation. At this stage of the argument, all computations are to be done in a regularised theory (in practice in d = 4 − 2ǫ dimensions). The required UV subtractions are done at the very end (see point 10 below).
A comment is in order about the routing of loop momenta. We choose independent loop integration variables such that momentum conservation is already satisfied for the collinear and soft factors (if such a factor has n external parton lines, there are hence n − 1 independent loop momenta). A particular choice of independent loop variables may be needed in parts of the proof and will be specified when necessary. For the hard subgraphs we write explicit δ functions to ensure momentum conservation. These δ functions constrain momentum components of the incoming partons, unless they disappear because corresponding momentum components of final state particles are integrated over. As part of the kinematic approximations, we neglect in the δ functions small plus or minus components compared to large ones (i.e. the minus momenta of right-moving and the plus momenta of left-moving collinear lines). δ functions for transverse parton momenta are to be kept unapproximated, even if these momenta are neglected inside the hard subgraph. In the final factorisation formula for double parton scattering these δ functions "tie together" certain transverse momenta in the double parton distributions of the two protons, as shown e.g. in section 2.  [58,59].
(c) Gluons connecting different subgraphs. For the numerator factors of soft or of longitudinally polarised collinear gluons, one makes a so-called Grammer-Yennie (also called eikonal) approximation, which generalises the treatment by Grammer and Yennie of soft photons in QED [60]. For a collinear gluon with momentum ℓ flowing from A into a hard subgraph H, we approximate is an auxiliary vector that is widely separated in rapidity from the right-moving gluon. 2 The rapidity of v A can hence be large and negative In the first and last steps of (2.4) we used that A µ scales like a right-moving collinear momentum, so that A + ∼ Q is its largest component. In the first step we also used that the components of H µ are all of size Q (the transverse components may be zero for symmetry reasons, which does not affect the argument). The iε in the denominator provides a regularisation of the pole at ℓv A = 0 and will be commented on in point 5. An analogous approximation is made for collinear gluons flowing from B into H, with an auxiliary vector v B that has either central or large positive rapidity (|v + B | ≫ |v − B |), and withl being obtained from ℓ by keeping only the large component ℓ − .
Similar approximations are made for soft gluons connecting to a collinear factor. For a gluon with momentum flowing from S into A, the Grammer-Yennie approximation reads where v R has large positive rapidity. In the first step we have again used the scaling properties of A µ , and in the first and last step we have additionally used that the minus component of S µ is not smaller than its other components (see below). An analogous approximation holds for a gluon flowing from S into B, with an auxiliary vector v L that has large negative rapidity. As discussed in section 2.3, our approximation (2.5) differs from the one in [8] because we use a different kinematic approximation for soft momenta entering collinear subgraphs. Equation (2.5) is a slight modification of the form used in [61], wherel was replaced with the full soft momentum ℓ (as we will do in section 3).
In collinear factorisation for single hard scattering, one can take lightlike auxiliary vectors, setting either v + or v − to zero as appropriate. However, for TMD factorisation this choice would lead to divergences in the rapidity of gluon momenta ℓ for the factors A, B and S. For double hard scattering this happens even in collinear factorisation. Different regulators for these divergences have been proposed, see e.g. [8,[62][63][64][65]. We follow [8] and use non-lightlike vectors v.
(d) Glauber region. A serious complication of the proof is that the soft Grammer-Yennie approximation (2.5) does not work for gluon momenta in the Glauber region, which we define as More precisely, if ℓ − ∼ Λ 2 /Q and |ℓ| ∼ Λ, then the approximation ℓ − A + ≈l ν A ν in the last step of (2.5) fails, because with A + ∼ Q and |A| ∼ Λ the contribution of transverse components to the scalar productl ν A ν cannot be neglected. A similar statement holds in the analogue of (2.5) for a gluon connecting S with B if the gluon momentum satisfies ℓ + ∼ Λ 2 /Q and |ℓ| ∼ Λ. Moreover, to derive (2.5) we used that the minus component of S µ is not smaller than its other components, which can be shown if the gluons attached to the soft factor have momenta with ℓ + j ∼ ℓ − j ∼ |ℓ j |, but which is not obvious if gluon momenta are in the Glauber region.
One can overcome these problems by deforming the integration contour for each soft gluon momentum ℓ into a region of the complex plane where the Grammer-Yennie approximation works. This deformation must be consistent with the analyticity properties in ℓ of the relevant subgraphs. It can be performed if the subgraphs do not have any singularities obstructing the contour deformation, or if the additional contributions obtained when crossing such singularities cancel after summing over all final-state cuts of a given graph. To show that one of these conditions is always met for double Drell-Yan production is the main objective of this paper.
4. Subtraction formalism. For each graph that gives a dominant contribution to the cross section, there are one or more terms with approximations made as just discussed, each corresponding to a distinct region of the loop momenta (and hence to a distinct way of organising the graph into subgraphs). In each of these terms, all loop momenta are integrated over their full range, not only over the momentum region for which the approximations have been designed. Using cutoffs to restrict the loop momenta would seriously complicate a systematic analysis, for instance because cutoffs break Lorentz invariance and lead to complicated nonlocal operators in the definition of parton distributions and soft factors.
The subtraction procedure discussed in sections 10.1 and 10.7 of [8] provides an alternative to momentum cutoffs and ensures that the sum over the terms approximated for different momentum regions correctly reproduces the graph, up to power corrections in Λ/Q. Since we rely on this method in our arguments, we briefly sketch its essentials here. Let Γ be a particular Feynman graph (integrated over all loop momenta) and R be a particular momentum region of that graph, defined by specifying for each line in the graph whether its momentum is hard, collinear in a certain direction, or soft. We call a region R ′ smaller than R (denoted as R ′ < R) if hard momenta in R are collinear or soft in R ′ , or if collinear momenta in R are soft in R ′ . Examples for the latter case are given in figure 17.
By T R Γ we denote the application of the approximations appropriate for the region R, as specified in point 3. By construction we have T R Γ restr ≈ Γ restr up to power corrections, where "restr" denotes the restriction of loop integrals to the design region R of T R . However, we need a representation without this restriction. The full loop integral T R Γ can contain leading contributions from momentum regions smaller or larger than R. Moreover, the application of T R does in general not give a correct approximation in those regions (it was not designed to do so). The combined problem of double counting and degraded quality of the approximations is solved by the subtraction formalism.
To cope with momentum regions smaller than R, an approximant C R Γ for R is defined by subtracting the contributions from smaller regions as Note that the approximant T R is applied also to the subtraction terms (e.g. if a line is hard in R, one will neglect its mass in T R C R ′ Γ even if it is a collinear or soft line in R ′ .) In the sum one may or may not include regions R ′ that only provide power suppressed contributions. The recursive definition (2.7) is possible since for each momentum the soft region is smallest. For a leading region R 0 with no smaller leading region one simply has C R 0 Γ = T R 0 Γ without subtractions. Up to power corrections one then finds with the sum running over all leading regions R. Notice that a term C R Γ has no leading contributions from smaller regions because of the subtractions in (2.7), but it may still have leading contributions from regions R ′′ > R (e.g. when a collinear momentum in R is hard in R ′′ ). This unwanted contribution is removed by the subtraction term −T R ′′ C R Γ in the approximant C R ′′ Γ for the larger region. As shown in [8], the subtractions also correctly treat the case where two regions intersect each other (collinear momenta in different directions have the soft region as a common smaller region and thus intersect).

5.
Back to real momenta. After Grammer-Yennie approximations have been made, one can deform the soft momenta back to the real axis. This requires a suitable choice for the auxiliary vectors and the iε prescription in the Grammer-Yennie approximants for soft gluons. Because of the soft subtraction terms discussed in point 8, the same holds for the Grammer-Yennie approximants for collinear gluons. The iε prescriptions in (2.4) and (2.5), together with the choice of auxiliary vectors specified in section 3.2 are such that the poles in the Grammer-Yennie approximants do not obstruct the contour deformation for soft momenta. If one uses a different prescription, one must show that the contributions from poles crossed during the contour deformation are power suppressed or cancel in the final factorisation formula.
Deforming contours back at this stage enables one to initially choose momentum routings for individual graphs as one finds suitable for establishing the cancellation of Glauber gluon contributions (cf. our comment in point 3a). In point 6, the conditions on loop momentum routings are more stringent since we consider sums over different graphs, whose momentum routings must match. Having deformed soft gluon momenta back to the real axis, one can readily make the required changes of integration variables in the graphs. The result of this procedure is shown in figure 4. The collinear factors A and B are a first version of double parton distributions, to be modified by subtractions as discussed below. They are given by operator matrix elements, which for two-quark distributions schematically read where k ′ , k, j ′ and j are colour indices in the fundamental representation and W is a Wilson line in the appropriate direction. Γ 1,2 are Dirac matrices, and the indices 1, 2 on the quark fields indicate that (after a Fourier transform) they create or annihilate a parton with momentum fraction x 1,2 in the proton. More detail is given in section 2.2 of [1].
First-order examples for the application of Ward identities and the emergence of Wilson lines have been given in sections 3.2 and 3.3 of [1]; an all-order derivation of the required identities in QCD remains to be worked out. For the single Drell-Yan process an all-order proof of the analogous procedure is given in [50], and for different single-parton scattering processes in an Abelian theory in [8].
Each connected part of the original soft factor couples to both A and B, so that after applying Ward identities each connected part couples to Wilson lines along both v R and v L . A corresponding restriction applies to the Wilson lines in the collinear factors, since they are generated by a Grammer-Yennie approximation for gluons connecting a collinear and hard subgraph rather than a collinear subgraph with itself. So-called Wilson line self-interactions must therefore be explicitly excluded from the matrix elements that define soft and collinear factors. In the overall factorisation formula, these self-interactions cancel. For further discussion we refer to sections 13.3.4 and 13.7.2 of [8].
Wilson lines are defined in position space, and the structure of the factorisation formula is simplified by a Fourier transform from transverse momenta in collinear and soft factors to coordinates in transverse configuration space (often referred to as b space). This replaces convolution integrals in transverse momenta by ordinary products.
7. Colour structure. The collinear factors have a nontrivial structure in the colour of the four physical parton lines (or more precisely, of the Wilson lines attached to them, see (2.9)). This is different from single hard scattering, where the two parton lines attached to a collinear factor can only couple to an overall colour singlet. Colour decompositions for double scattering are given in section 2.3 of [1], see [47,66] for alternative choices and [67] for a simplification in the gluon sector. If the colours of the two partons with the same momentum fraction (x 1 or x 2 ) are coupled to singlets, we speak of "colour singlet" double parton distributions, which are an immediate generalisation of the familiar distributions for a single parton. In (2.9) one then contracts j ′ with j and k ′ with k. Other colour combinations, referred to as "nonsinglet" channels, can be regarded as describing colour interference or colour correlations. In the basis of [1], the second colour structure for two-quark distributions is called "colour octet" combination and obtained by contracting (2.9) with t a j ′ j t a k ′ k .
In TMD factorisation, the hard subgraphs for double Drell-Yan production have a trivial colour structure since their final state is colourless. As a consequence, Wilson lines in the soft factor have their colour indices contracted pairwise, with one Wilson line along v R and the other along v L in each pair. Regarding colour indices, the cross section thus involves a matrix multiplication of the form where c and d label the different terms in the colour decompositions of A, B and S and are to be summed over. (In the basis just described, c and d run over the singlet and octet channels.) In collinear factorisation the hard factors can produce unobserved jets, so that several colour channels are open. Taking qq annihilation as an example, one can then decompose where j, k are the colour indices of the incoming partons in the amplitude, and j ′ , k ′ those in the conjugate amplitude. The tree-level graph for H contributes only to 1 H, whereas one-gluon emission contributes only to 8 H. The colour indices of H are to be contracted with the colour indices of Wilson lines in S. After a corresponding colour decomposition of S, the colour structure of the cross section involves an additional level of matrix multiplication: where f, g label the two colour channels of H 1,2 . This additional complication was not discussed in [1,47], where only the tree-level expression of H was considered.
8. Soft subtraction in collinear factors. As a consequence of the procedure described in point 4, the collinear factors require subtractions for regions where gluon lines have momenta that are not collinear but soft. As this has not been elaborated on in [1] we briefly explain it here.
According to point 2, soft gluons inside an unsubtracted collinear factor couple to other soft gluons, to a collinear line, or to the Wilson lines. Taking the right-moving factor for definiteness, A unsub (v A ) thus has a soft subgraph S that connects collinear lines with the Wilson lines. Collinear gluons couple to the Wilson lines as well, and because we have a non-abelian theory, the order of gluon attachments is relevant. Let us show that a leading contribution is only obtained if, starting at the physical parton line, first all collinear gluons couple to a Wilson line and then all soft gluons, as shown in figure 5a. According to our earlier discussion, v A may have central or large negative rapidity. Since |v A ℓ c | ≫ |v A ℓ s | if ℓ c is collinear and ℓ s is soft, this ordering has the smallest number of eikonal propagators carrying a collinear rather than a soft momentum. Example graphs are shown in figure 6.   One can now apply the same approximations made for soft gluons that couple to the collinear subgraph in the overall process (see point 3c), provided that one can again show the cancellation of Glauber gluon exchange. The result has the form where by construction the subtracted collinear factor A sub does not receive a leading contribution from soft momenta. (It may receive leading contributions from hard loop momenta, but as discussed in point 4 this will be removed by subtractions in the hard factors.) From (2.13) one readily obtains Note that one can take v A to be light-like in this expression, since potential rapidity divergences are removed by the soft subtraction term. In the cross section for TMD factorisation one has a product of matrices in colour space. A corresponding expression is obtained for collinear factorisation from (2.12).
For TMD factorisation in single hard scattering, different schemes have been proposed in the literature regarding the choice of Wilson lines. A vector v A = v B with central rapidity was chosen in [61], whereas v A = v L and v B = v R was taken in [62,68]. In section 13.7 of [8] a more complicated scheme involving additional soft factors is presented, in which the individual terms in the final factorisation formula do not require the explicit removal of Wilson line self-interactions. v A and v B are taken as light-like in this scheme. Its analogue for double hard scattering has not been studied so far.
9. Rapidity evolution. In general one cannot take all Wilson lines lightlike, as mentioned earlier. The collinear and soft factors then depend on the rapidities of the Wilson lines. This dependence is described by differential equations, commonly called Collins-Soper equations. The Collins-Soper equation for the unsubtracted collinear factors has been discussed in section 3.4 of [1]. A derivation of the rapidity evolution for the soft factor in the same framework is still missing. For the particular rapidity regulator introduced within SCET (soft-collinear effective theory) in [64], the rapidity evolution of collinear and soft factors has been discussed in [47].
In collinear factorisation, soft and collinear factors are integrated over transverse parton momenta (or equivalently, evaluated at b = 0 in transverse configuration space). In the colour singlet channel, one can take all Wilson lines as light-like since potential rapidity divergences cancel. The soft factor then reduces to unity, In colour nonsinglet channels, one must however keep non-lightlike vectors (or use a different rapidity regulator) and retains nontrivial soft factors.
Factors that depend on a large rapidity difference involve large logarithms, called Sudakov logarithms (recall that rapidity is a logarithm, see footnote 2). The power of these logarithms increases with the order of perturbation theory. The Collins-Soper equations sum these logarithms to all orders and can be used to ensure that the final factorisation formula contains no large logarithms in the hard factors, which can then be evaluated reliably in fixed-order perturbation theory. The resummed Sudakov logarithms typically lead to a suppression of the cross section. For collinear factorisation this means in particular that all channels except for the colour singlet channel in the double parton distributions are suppressed, as was realised long ago [69]. A simple but instructive numerical study of the suppression of colour nonsinglet channels is given in [47].  The UV subtractions do not change the overall cross section, where the UV divergences in the individual factors cancel, as ensured by the subtraction procedure of point 4. As mentioned earlier, the only UV divergences in the original graphs are those removed by the counterterms in the Lagrangian.
The UV subtracted factors depend on a renormalisation scale µ. For TMD functions, the dependence on that scale is given by simple renormalisation group equations with anomalous dimensions multiplying the factors themselves. For collinear parton distributions, one obtains DGLAP type equations containing a convolution product of the distributions with splitting functions, where the convolution is in longitudinal momentum fractions. For double parton distributions, these equations are discussed in [70][71][72][73][74]. Their form is connected with the double counting problem noted in point 2c, as explained in section 5.3.2 of [1]. As usual, the evolution equations can be used to ensure that the hard factors do not contain large logarithms that would spoil a fixed-order perturbative expansion.

Scaling of soft momenta
The typical size of collinear momenta, (ℓ + , ℓ − , |ℓ|) ∼ (Q, Λ 2 /Q, Λ) or (Λ 2 /Q, Q, Λ), is directly connected to the external kinematics of the process. Their large components must be of order Q to enable the hard scattering. Their small and transverse components correspond to generic virtualities of order Λ 2 in collinear factorisation, where there is no small external kinematic quantity and Λ is a generic nonperturbative scale. If in TMD factorisation the measured transverse momenta q are larger than a nonperturbative scale, then one must take Λ ∼ |q| as discussed earlier. The transverse momenta entering the hard scattering must then be of size Λ in order to produce the required q in the final state. The size of soft momenta is not directly set by external kinematics. The Libby-Sterman analysis only specifies that all soft momentum components vanish in the formal limit Λ → 0. A crucial part of the analysis is that soft gluons couple to collinear lines.
Let us take a closer look at soft momentum scalings that give a leading contribution to the cross section. For definiteness we consider the exchange of a single soft gluon between the collinear subgraphs A and B, with example graphs given in figure 7. Our discussion applies likewise to single and double Drell-Yan production. Leading regions can be conveniently identified by comparing the expressions for the graph with and without soft gluon exchange: Here the Lorentz index of the lowest-order expressions A (0) and B (0) originates from the Fierz decomposition mentioned in point 3b, and the additional index in A (1) and B (1) comes from the exchanged gluon. The size of each collinear factor can be determined by boosting to a frame where its momenta scale like (Λ, Λ, Λ). In such a frame, all components of A (0) and B (0) scale like Λ n (where n is the appropriate mass dimension), whilst all components of A (1) and B (1) scale like Λ n−1 . Boosting back to the overall c.m. frame, we find that the largest components of each factor scale like In (2.15) we have already used that other components of A and B give smaller contributions and neglected these. A momentum region in the one-gluon exchange graph thus contributes to the cross section with the same power as the leading-order graphs if d 4 ℓ/ℓ 2 ∼ Λ 4 /Q 2 , where d 4 ℓ is to be understood as the size of the integration volume for that region. The preceding analysis is only valid if ℓ + and ℓ − are at most of order Λ 2 /Q, so that they do not change the scaling of momenta in the collinear factors. If ℓ − ∼ Λ then at least one line in A (1) has momentum scaling like (Q, Λ, Λ) and hence a virtuality QΛ instead of Λ 2 . A corresponding statement holds for B (1) if ℓ + ∼ Λ. The approximations for collinear momenta discussed in point 3 of section 2.1 remain valid in this case.
A number of different momentum scalings for ℓ give a leading contribution to the cross section.
2. If all components of ℓ are of size Λ 2 /Q then d 4 ℓ ∼ Λ 8 /Q 4 and ℓ 2 ∼ Λ 4 /Q 2 , so that we again obtain a leading contribution. Momenta in this region are often called ultrasoft in the context of effective theories.
Since in this region the typical gluon virtuality goes to zero for Q → ∞, one may wonder whether nonperturbative effects such as confinement invalidate this analysis. Taking a nonzero gluon mass m g to simulate such effects, one finds that ultrasoft scaling only gives a leading contribution if Λ 2 /Q > ∼ m g .
3. If all components of ℓ are of size Λ, which in effective theories is often denoted as the soft 3 region, we have d 4 ℓ ∼ Λ 4 and ℓ 2 ∼ Λ 2 . The scaling in (2.16) must be amended for the presence of collinear lines with virtuality QΛ. One finds that for the graph in figure 7a there is an overall penalty factor (Λ/Q) 2 for one off-shell line in each subgraph A (1) and B (1) , so that one obtains a leading contribution. In figure 7b the subgraph A (1) has two quark lines with virtuality QΛ, which results in an overall penalty factor (Λ/Q) 3 and thus in a power suppressed contribution to the cross section.
4. The regions where ℓ scales like (Λ, Λ 2 /Q, Λ) or (Λ 2 /Q, Λ, Λ) may be regarded as hybrids between regions 1 and 3. With d 4 ℓ ∼ Λ 5 /Q and ℓ 2 ∼ Λ 2 and a penalty factor of Λ/Q for an off-shell line in only one of the two collinear factors, these regions contribute to leading power for the appropriate graphs.
This list is not complete, and other scalings such as ℓ ∼ (Λ, Λ 2 /Q, Λ Λ/Q) give a leadingpower contribution as well. Fortunately, a complete inventory of all leading momentum scalings is not needed for the factorisation proof. In the subtraction procedure sketched in point 4 of section 2.1, a region R is not primarily characterised by its momentum scaling, but by the approximations T R appropriate for that region. A controlled power-law accuracy of these approximations is an essential ingredient in establishing the method. For a detailed discussion we refer to section 10.7 of [8] and to sections VIII A to D of [61]. For the momentum scalings enumerated above, the Grammer-Yennie approximation (2.5) for a soft gluon coupling to A and its analogue for a soft gluon coupling to B work in regions 2 and 3, but at least one of the approximations fails in regions 1 and 4. To establish the cancellation of Glauber gluon exchange, we will deform ℓ out of regions 1 and 4 into region 2.

Approximations for soft gluons
As stated earlier, we approximate a soft gluon momentum ℓ entering the collinear factor A(ℓ) byl = (0, ℓ − , ℓ). The Grammer-Yennie approximation then fails if ℓ is in the Glauber region. A different kinematic approximation is described in section 10.4.2 of [8], wherel = (0, ℓ − − δ 2 ℓ + , 0), with δ being a parameter of order Λ/Q. (The same approximation with δ = 0 was taken in the original work [50].) The second step of (2.17) is then valid even in the Glauber region. However, the approximation to neglect ℓ inside A(ℓ) is not. The transverse momentum of the soft and collinear lines is comparable not only in the Glauber region but also in regions 3 and 4 described in the previous subsection. In section 3.2 of [50] an argument was given why one can nevertheless neglect ℓ in A for single Drell-Yan production, which required a specific routing of loop momenta. Let us see why this argument does not readily generalise to the double Drell-Yan case.
In figure 8a we show the graphs for a soft gluon coupling to A on the left of the finalstate cut in a simple spectator model. After contour deformation out of the Glauber region and use of the Grammer-Yennie approximation, a Ward identity converts the sum of these graphs into collinear factors without a soft gluon, as shown in figure 8b. As in section 3.2 of [50] we have routed the gluon momentum such that it does not cross the final-state cut.
In order to apply the Ward identity we must use the same external momentum assignment in each graph, and for definiteness we have chosen ℓ to flow out of the line with k 1 rather than the one with k 2 . The collinear factor in the first term of figure 8b does not depend on ℓ, so that we obtain the correct result whether or not we set ℓ to zero in the original graphs of figure 8a (the eikonal propagators do not depend on ℓ and the gluon propagator is to be truncated in all graphs). In the case of single hard scattering, this would be the only term to consider. The collinear factor in the second term of figure 8b depends however on ℓ, which can be of the same size as k 1 and k 2 , as discussed in the previous subsection. Neglecting ℓ in this factor gives an incorrect result. Following the derivation in section 3.2.2 of [1] one finds indeed that in the overall expression of the cross section, the dependence of this collinear factor on ℓ is crucial to give the correct transverse position of the Wilson lines to which the soft factor is eventually converted. This holds true even for collinear factorisation, where the Wilson lines associated with the partons of momenta k 1 and k 2 have a relative transverse distance y from each other. Neglecting ℓ in the graphs of figure 8a would put all Wilson lines in the soft factor at the same transverse position, which is incorrect.
3 Glauber cancellation for one-gluon exchange

Scope and general method
In this section we demonstrate the cancellation of Glauber gluons in the double Drell-Yan process at the level of single gluon exchange. The momentum ℓ of the exchanged gluon can lie in the hard, soft, or collinear-to-A or B regions, and what we shall show in this section is that the approximated expressions from these regions reproduce the exact graph at leading-power accuracy. There is no need for a separate term for the Glauber region, where the soft Grammer-Yennie approximation (2.5) breaks down. The hard region can be distinguished from the rest by the fact that it corresponds to generically large transverse momenta |ℓ| ∼ Q rather than |ℓ| ∼ Λ. By considering integrals differential in transverse momenta, and transverse momenta only of order Λ, we remove the need to consider the hard region in the following.
For simplicity, we adopt a model with scalar "partons" described by a field φ and coupling to each other via three-and four-point vertices. Gluons couple to these partons as required by gauge invariance, i.e. via the usual scalar QED/QCD-type vertices (see for example [75]). We thus have a momentum-dependent gφφ vertex and a ggφφ seagull vertex. Some example graphs for the double Drell-Yan process in this model are given in figure 9.
We shall not concern ourselves with the detailed colour structure of the model. As we shall see, the Glauber cancellation argument depends only on the analyticity properties of the loop integrands in the gluon momentum, which in turn is determined only by the denominators of the Feynman and eikonal propagators. It does not rely on the spin and colour structure of the model, and hence also applies to graphs in QCD.
The Glauber cancellation argument we shall present here works on a graph-by-graph basis, with the sum over cuts of the graph taken where necessary. We need to consider only graphs in which the gluon extends between right-and left-moving collinear partonsgraphs in which the gluon attaches only to right-or left-moving partons are topologically factorised already. We classify the partons into active ones and spectators, with active partons being the ones that attach to a vertex producing an electroweak gauge boson. All other partons are spectators. We thus have active-active, active-spectator, and spectatorspectator gluon attachments.
What we shall do in this section corresponds to applying the region approximants of point 3 in section 2.1 in the following order: first we make contour deformations needed to avoid the Glauber region (with a sum over cuts where necessary). Then we apply the Grammer-Yennie approximations, before deforming the contours back to the real axis. Finally the kinematic approximations and approximations for fermion lines are applied. With this order of steps, we do not need to concern ourselves with the kinematical and fermion line approximations in this section.
To illustrate where problematic pinched poles in ℓ + or ℓ − may generically arise, it is instructive to consider the example graph in figure 9d. If ℓ is in the Glauber region then ℓ − is negligible in the right-moving lines above the gauge boson vertices. It is also negligible in gluon propagator, which is dominated by ℓ 2 . Poles in the small ℓ − region thus result Figure 9. Example graphs for the double Drell-Yan amplitude within the model described in the text. In the physical process these graphs are embedded in graphs with further spectator lines and with a hadron entering each collinear subgraph.
from the propagator denominators The ℓ − contour is pinched near the origin, between a pair of poles with real parts of order Λ 2 /Q. We see that the position of a pole above or below the real axis depends on whether ℓ − is routed along or against the large and positive plus-momenta (p − k 1 ) + and k + 1 of the collinear lines.
Conversely the poles of ℓ + in the Glauber region are determined by the following denominator factors: In this case ℓ + flows in the same direction as the large positive minus momenta (p −k 1 ) − andk − 1 , so that both poles are on same side of real axis and ℓ + is not pinched at small values.
From this example we see that the pattern of poles and pinches in the small ℓ + , ℓ − region is determined by the flow of large plus or minus momentum. The integration over ℓ + is pinched if there is at least one line where it flows against a collinear momentum with large minus component, and at least one line where it flows with such a momentum. An analogous rule holds for ℓ − . This rule will be be essential in our general proof of the cancellation of the Glauber region at the one-gluon exchange level.
The remainder of our argument is structured as follows. In section 3.2 we describe the soft and collinear Grammer-Yennie approximants and give details of a method that allows us to investigate the validity of the factorisation formula at the one-gluon level. In section 3.3 we use this method to show by explicit calculation that the factorisation formula gives the correct leading-power expression for a simplified set of double Drell-Yan graphs. In section 3.4 we give an argument that -after a sum over cuts where necessary -there is no pinched Glauber exchange contribution for any one-gluon exchange graph within our model. Finally, in section 3.5 we discuss the difficulty to extend this argument to gluon exchange at higher orders. As a result, we will pursue a different approach for our all-order proof of Glauber gluon cancellation in section 4.

Grammer-Yennie approximation and rearrangement of terms
For reasons to become obvious shortly, we take the auxiliary vectors for the soft and collinear Grammer-Yennie approximants to be the same (v L = v A , v R = v B ) and call these common vectors v L and v R in the following. At a later stage of the factorisation proof, this equality can be lifted using the appropriate Collins-Soper equations.
We group the parton lines in each graph into subgraphs L and R, with L containing the lines above the vertices producing the gauge bosons, and R containing the lines below. In the lowest-order graphs with no gluon L (R) contains only left (right) moving collinear lines. With left-moving (right-moving) collinear gluon exchange, the line in R (L) between the gluon attachment and the gauge boson vertex is hard. We use Feynman gauge for the gluon propagator.
Let us consider first a graph in which the gluon couples to two active partons, as in figure 9a and b. The Grammer-Yennie approximation for the gluon with momentum ℓ flowing out of R or into L respectively reads Since we take the soft and collinear Grammer-Yennie approximants to be the same, and since the soft auxiliary vectors must have large positive or negative rapidity, we must take v R (v L ) to have large positive (negative) rapidity. Without loss of generality, we can take . We allow the infinitesimal parameters η and η ′ to be different from the ε in Feynman propagators, so that in later calculations we can see that the order in which these parameters are taken to zero does not matter. The imaginary parts in the eikonal propagators correspond to past-pointing Wilson lines, as appropriate for Drell-Yan. They are chosen such that the pole in ℓ − for the left-hand term in (3.3), and the pole in ℓ + for the right-hand term, lie on the same side of the real axis as the corresponding active parton propagators.
Whilst the use of light-like auxiliary vectors (v − R , v + L = 0) in the Grammer-Yennie approximation may simplify calculations, it can also result in rapidity divergences as already mentioned in section 2.1. We shall in the following investigate the cancellation of the Glauber region for a single gluon when the auxiliary vectors are lightlike, spacelike or timelike. An advantage of spacelike auxiliary vectors (resulting in spacelike Wilson lines) is that the additional pole of 1/ and hence does not complicate contour deformations, as was observed in [76]. A corresponding statement holds for the poles in ℓ + . The same clearly cannot be said when the Wilson lines are timelike. According to the subtraction method discussed in point 4 of section 2.1, the following approximant is good for a graph with an active-active gluon connection in the combined collinear and soft regions of gluon momentum: The first term on the right-hand side is the approximant for left moving collinear ℓ, including the soft subtraction. The second term is the corresponding term for right-moving collinear ℓ. Finally the third term is the approximant for the soft region of ℓ.
In fact, we can make this replacement also for graphs where the gluon couples to a spectator line on one or both sides. If it couples for instance to a spectator in R and an active parton in L, as in figure 9c, then the factorisation formula should only contain the second and third terms in (3.4), which describe the leading contributions from right moving and soft ℓ. The first term in (3.4) can however be added since it is power suppressed: if ℓ is left moving then R is suppressed because it contains too many off-shell propagators, and if ℓ is soft or right moving, the suppression is ensured by the factor in parentheses contracted with L. Likewise, in graphs where the gluon couples to spectator lines on both ends, the first and second term in (3.4) are power suppressed in all relevant regions and may be added without degrading the accuracy of approximations. The key advantage of using (3.4) is that it can be rewritten as To show that (3.5) is a valid approximation in the collinear and soft regions of ℓ is hence tantamount to showing that its second term is power suppressed in these regions. The advantage of this method, which has long ago been used for factorisation studies in [77], is that it avoids the explicit computation of complicated loop integrals since one can instead use power counting arguments.

Explicit calculations in a simplified setting
Before giving a general proof of Glauber exchange cancellation in the setting just specified, we find it instructive to perform explicit calculations for specific graphs. This will allow us to see how the general arguments of power counting and complex contour deformations work in concrete examples. We restrict our calculation to graphs without parton self-couplings and complement our model by introducing a colourless scalar "hadron" with a pointlike coupling to a parton pair. Thus, the basic graph we have to consider is the box graph in figure 10 (which has no spectator partons at all) and the one-gluon corrections to it. To regulate collinear Figure 10. divergences we use a small mass m for the scalar parton, and to regulate infrared divergences at the one-loop level considered here we give the gluon a mass λ.
Given the presence of the ggφφ seagull vertex in the model, there are one-loop corrections to figure 10 with a gluon tadpole insertion on any one of the scalar parton lines. The contribution of these diagrams is proportional to and therefore vanishes in the limit of zero gluon mass, which we take whenever possible. The remaining one-gluon corrections to the amplitude are then given by the graphs in figure 11 and by graphs with the same topology obtained by permutation of lines. The corresponding graphs for the cross section are given by the interference of these virtual corrections with the lowest-order graph. Real gluon emission does not contribute at this order because one cannot form a colour singlet state out of a single emitted gluon and two gauge bosons.
We remark in passing that the diagrams in figures 10 and 11 appear as subgraphs of more complex graphs possible in the model of section 3.1, although in that context the incoming particle to the box structure is a coloured parton rather than a colourless hadron. In this sense, figure 11d appears inside figure 9b. Since the calculations of the following subsections do not depend on the colour factors of the individual graphs and remain valid if the incoming lines are off-shell partons, the demonstration of Glauber cancellation in the graphs of figure 11 immediately carries over to the more general case in which these graphs are embedded in more complex structures. In that case, graphs with real gluon emission contribute to the cross section as well, but real gluons cannot be in the Glauber region for kinematic reasons. The graphs in which the gluon extends between two collinear partons travelling in approximately the same direction, i.e. graphs 11a and b, factorise topologically as already mentioned in section 3.1, and we need not consider them further. They are part of the oneloop expression of the collinear factor for right-moving partons. In the following subsection we will analyse the most complicated of the remaining graphs, namely the double box of figure 11d. The gauge boson vertex graph c, discussed in section 3.3.5, will then be comparatively simple.
As explained in section 3.1 we will limit our discussion to the kinematical region where all parton lines in figure 10 have left or right moving collinear momenta and thus transverse momenta of order Λ ≪ Q. With scalar partons, this is indeed the only leading momentum region of the graph, provided of course that the transverse momenta of the gauge bosons are also of order Λ. With spin 1/2 partons there would be also a leading region where partons have transverse momenta of order Q and are far off-shell, so that the graph describes a single hard scattering rather than two. This problem, mentioned in point 2c of section 2.1, does not interfere with the issue of Glauber gluon cancellation we are focusing on here.
As a further simplification, we replace each electroweak gauge boson by a scalar particle with a pointlike momentum-independent coupling to two partons. This removes the dependence of the scattering amplitude on polarisation vectors or open Lorentz indices, without affecting the analytic structure of the graphs. The results of the following subsections therefore generalise to the case where spin 1 bosons are produced.

Double box graph
We now turn to the calculation of the graph in figure 12, which up to a constant is given by Since this graph has no ultraviolet divergence we have set the number of spacetime dimensions to 4. It is understood that k 1 +k 1 = q 1 , k 2 +k 2 = q 2 , k 1 + k 2 = p,k 1 +k 2 =p and p +p = q 1 + q 2 by momentum conservation.
We choose a frame where the hadron with momentum p moves fast to the right and the hadron with momentump fast to the left, with p =p = 0. For k i andk i (i = 1, 2) in the collinear region, the momentum components have typical size The choice of independent integration variables in (3.7) is appropriate for power counting since bothk + 2 and k − 1 can freely vary over their natural range Λ 2 /Q. In the collinear momentum region, the lowest order graph in figure 10 has all four propagator denominators of order 1/Λ 2 and an integration volume of order Λ 6 /Q 2 , giving an overall behaviour like 1/(Λ 2 Q 2 ). It is easy to see that the Glauber region (2.6) of ℓ contributes at the same order: the numerator factor (2k 1 + ℓ)(2k 2 + ℓ) is of order Q 2 , the integration volume d 4 ℓ of order Λ 6 /Q 2 and each of the three additional propagators of order 1/Λ 2 . As discussed in section 2.2, both k − 1 and ℓ − can be of order Λ as long as k − 1 +ℓ − ∼ Λ 2 /Q, because this only carries the momentum k 1 off shell, which is compensated by a larger integration volume of ℓ − compared with the Glauber region. Likewise, ℓ + can also be of order Λ. Finally the ultrasoft region with all components of ℓ of order Λ 2 /Q has a smaller integration volume compared with the Glauber region, which is compensated by a smaller gluon propagator. We thus find that all four soft momentum scalings of ℓ enumerated in section 2.2 give leading contributions to the graph. Furthermore, the regions where ℓ is right or left collinear give leading contributions. As discussed in section 3.1, we can establish the validity of the Grammer-Yennie approximation by showing that the replacement in (3.7) gives a power suppressed result relative to the leading behaviour 1/(Λ 2 Q 2 ).

Lightlike Wilson lines
For simplicity we first consider the case in which the soft and collinear Wilson lines are taken lightlike, with v + R = v − L = 1 and v − R = v + L = 0. The replacement (3.9) then corresponds to the replacement in (3.7). The original numerator term on the left hand side of this equation is of order Q 2 -thus when we replace the left hand side by the right hand side, we only get a leading power contribution when this expression is also of order Q 2 . When ℓ is soft ∼ (Λ, Λ, Λ) or collinear, the numerator of the right-hand-side is of order Λ 4 and the denominator of order Λ 2 , giving a power-suppressed O(Λ 2 ) contribution overall. For the momentum scalings ℓ ∼ (Λ 2 /Q, Λ, Λ) and ℓ ∼ (Λ, Λ 2 /Q, Λ), the numerator remains of order Λ 4 whilst the denominator grows to order Λ 3 /Q, which is however not enough to give a leading contribution. In the ultrasoft region of ℓ the denominator is of order Λ 4 /Q 2 but the numerator shrinks to order Λ 6 /Q 2 , resulting again in a subleading contribution. Only when ℓ is in the Glauber region, where the numerator is of order Λ 4 and the denominator of order Λ 4 /Q 2 , do we get a leading contribution by power counting. It remains therefore to show that the contribution from the Glauber region is actually smaller than the naive prediction from power counting. For this we can neglect all terms in the numerator of (3.10) that contain either plus-or minus-components, since in the Glauber region these terms are subleading compared to the terms containing only transverse components.
Combining the denominators of all Feynman propagators that contain ℓ with the help of Feynman parameters we obtain The integration measure for Feynman parameters is [d n α] = dα 1 · · · dα n δ(α 1 + · · · + α n − 1) with 0 ≤ α i ≤ 1, (3.13) and it is understood that I is still to be integrated overk + 2 , k − 1 , k 1 and ℓ. Since the volume of this remaining integration space is O(Λ 8 /Q 2 ) and an overall leading power contribution to Γ is O[1/(Λ 2 Q 2 )], the integral I has to be of order 1/Λ 10 for a leading power contribution. Conversely, we must exclude that I is of order 1/Λ 10 in order to establish the validity of the Grammer-Yennie approximation.
As appropriate for the Glauber region, we only consider the case |ℓ| ∼ Λ in the following. Power counting thus gives A ∼ Λ 2 for any value of the Feynman parameters, 4 even if we take masses λ, m ≪ Λ. This implies that regions where α i ≪ 1 for one or several i are power suppressed: contributions from such regions are suppressed by the phase space [d n α] and cannot be enhanced by having a smaller denominator. Only the region where all Feynman parameters are of generic size can possibly give a contribution of order 1/Λ 10 to I. We then have whereas a + can have either sign, depending on the relative size of α 3 q + 1 and α 2 q + 2 . By power counting, the region where ℓ + ∼ ℓ − ∼ Λ 2 /Q could indeed give a leading contribution to I, and we have to show that this is not the case. According to the general rule explained in section 3.1 the integration over ℓ − has a pinched pair of poles in that region, whereas the one over ℓ + does not.
To further analyse I we carry out the integral over ℓ + using Cauchy's theorem. For ℓ − + a − > 0 both poles in ℓ + are on the same side of the real axis and one gets zero. From the region ℓ − + a − < 0 we obtain, picking up the residue at ℓ + = −iη for simplicity, where we have dropped iη in the last denominator, since it comes with the same sign as iε and thus does not change the position of the pole. We note that the evaluation of Feynman integrals in light-cone coordinates has to be done with some care, because in some cases the naive application of Cauchy's theorem can lead to wrong results. In the case at hand we have cross checked our results with the method described in appendix A. As explained there, we could not have used Cauchy's theorem for the ℓ + integration in (3.11) if we had kept the full numerator of the graph given in (3.10), because terms with ℓ + or (ℓ + ) 2 would have cancelled the denominator ℓ + +iη. Using the method just mentioned, we have checked that these terms give a power suppressed contribution and can hence be discarded, as we argued on the basis of power counting before (3.11). Since ℓ − is at least of size Q in (3.15), we have 2a + ℓ − + A ∼ Q 2 or bigger and obtain a power suppressed result for I, as we needed to show. We observe that in (3.15) we have two poles Figure 13. Different momentum routings for the double box graph: (a) routing used in (3.7), where ℓ − but not ℓ + is pinched in the Glauber region, (b) routing for which neither ℓ + nor ℓ − is pinched in the Glauber region, (c) routing for which both ℓ + and ℓ − are pinched in the Glauber region.
which are pinched for a + > 0 in agreement with the general rule stated above. However, they are far away from the integration region and therefore do not matter.
We can also explicitly calculate (3.15) by partial fractioning, which gives (3.17) Since a + a − ∼ Q 2 , the potentially leading terms in (3.17) are those proportional to 1/A 4 . In the generic region of Feynman parameters, the logarithm gives a suppression by a factor Λ 2 /Q 2 and the Θ function terms cancel exactly because a − > 0. This confirms that I is power suppressed.

Alternative momentum routings
We explained in section 3.1 that the pattern of pinched poles at small ℓ + (ℓ − ) is determined by the way in which these loop momentum components flow along the collinear lines with large minus (plus) momenta. It is therefore interesting to consider different routings for ℓ + and ℓ − for which their respective flow along the collinear momenta (and thus the pattern of pinches) is different. We will verify by explicit calculation that the final result for the double box graph, i.e. the validity of the factorisation formula, is independent of the momentum routing. The freedom of choice to re-route the ℓ + and ℓ − loop momentum components will be essential in section 3.4, where we give a general proof of Glauber gluon cancellation at the one-gluon level.
The momentum routing in figure 13a has already been discussed in the previous subsection. Let us now turn to the routing in figure 13b, for which neither the ℓ + nor the ℓ − integration is pinched in the Glauber region. This momentum routing can be obtained by the shift k − 1 → k − 1 − ℓ − in (3.7). Using separate sets of Feynman parameters for combining the propagators containing ℓ + with the gluon propagator on one hand and for the propagators containing ℓ − on the other hand, we get with A, a − and N as in (3.12) and We see that B ∼ Λ 2 for all values of the Feynman parameters (as is the case for A), even if m is neglected. Hence the regions with β i ≪ 1 cannot give a leading contribution to I due to the suppression from the integration volume [d 3 β]. In the generic region of Feynman parameters, we have Performing again the ℓ + integral using Cauchy's theorem we get (3.21) Given that a − /α 4 is of order Q and positive, the ℓ − integration is limited to a region where 2b + ℓ − + B ∼ Q 2 or bigger, so that we obtain a power suppressed result as before. With b + being negative, the two poles in ℓ − are near zero and on the same side of the real axis, confirming the absence of a pinch in this case. Finally we turn to the routing in figure 13c. Here, both the ℓ + and ℓ − integration contours are trapped in the Glauber region, so that one would expect a leading power behaviour of I in this case. Let us see how this can be reconciled with the previous calculations. The corresponding momentum routing can be obtained via the replacement k + 2 →k + 2 − ℓ + in (3.7). Combining the propagators in the same way as above yields with A, a + and N as in (3.12), B as in (3.19) and In the generic region of Feynman parameters, both a + and b − are of order Q but can have either sign. Now we perform the ℓ − integral using Cauchy's theorem and get (3.24) Once again, the poles of the ℓ + integration are near zero. In contrast to the routing chosen above, however, they can be close to the integration region. We have different cases: 1. a + < 0 and b − of either sign: both poles are far away from the integration region, 2. a + > 0 and b − > 0: both poles are below the real axis, 3. a + > 0 and b − < 0: the integration contour is pinched in the Glauber region.
If ℓ + is at least of order Q then 2b − ℓ + +B ∼ Q 2 or bigger and we obtain a power suppressed integral. This holds in case 1, and also in case 2 if we deform the integration contour away from the real axis on the semi-circle going through the points −a + , ia + , a + . It holds in case 3 if we deform the contour to the corresponding semi-circle in the lower half-plane, thus picking up the residue of the eikonal propagator. The additional term from the residue is proportional to [A + iε] 4 [B + iε] 3 , so that I ∼ 1/Λ 10 as we expected. However, this contribution vanishes when one integrates over either k − 1 ork + 2 . To see this, we rewrite where the ellipses denote terms independent of k − 1 andk + 2 . We see that for a + > 0 the poles of (3.25) in k − 1 are on the same side of the real axis and thus give a zero integral (note that a + depends onk + 2 but not on k − 1 ). The same result is obtained if one first integrates over k + 2 under the condition that b − < 0. We have thus shown again that the Grammer-Yennie approximation works, but this required integration over several loop variables.

Wilson lines with finite rapidity
We now re-analyse the double box graph for Wilson lines along directions v R and v L with a finite rapidity. According to the general argument in section 3.2, we expect factorisation to carry over if we change lightlike Wilson lines into spacelike ones, but for timelike Wilson lines we have no such expectation.

Spacelike Wilson lines. We take spacelike Wilson lines with
As discussed in section 3.2, we must take v R (v L ) to have large positive (negative) rapidity, of the same order as the appropriate collinear particles. Therefore, we count δ ∼ Λ/Q.
In the numerator of the right hand side of (3.10), replacing light-like with spacelike Wilson lines only results in power-suppressed changes in the Glauber region. Some momentum components c + are replaced by v L c, some components c − by v R c, and the leading power term is still the one with only transverse components (i.e. N given in (3.12)). In (3.11) we must make the replacements The new eikonal propagators have the following poles in the complex ℓ + plane: The first pole remains in the lightlike limit δ → 0, whereas the second one migrates to infinity. Since the two poles lie on the same side of the real axis, the contour deformations needed to avoid the Glauber region can be made as for lightlike Wilson lines, and one does expect the Grammer-Yennie approximation to be valid also in this case. Let us see this by explicit calculation of the ℓ + integral. For ℓ − + a − > 0, all poles lie on the same side of the real axis and one gets zero. For ℓ − + a − < 0 we close the contour in the lower half plane. Approximating (1 − δ 4 ) ≈ 1, we then get 4 (3.28) with η ′′ = η ′ − δ 2 η. Our previous argument for limiting ourselves to the generic region of Feynman parameters remains valid, so that we can use the properties in (3.14). With a − > 0 the ambiguous sign of η ′′ does not matter since the pole ℓ − eik = iη ′′ of the eikonal propagator is outside the integration region.
Since ℓ − ∼ Q or larger in the integral (3.28), the integrand is strongly power suppressed, as in the case of lightlike Wilson lines. It remains to discuss poles close to the integration path. With (3.14) we find that that poles of the combined Feynman propagators are approximately at for the first term in (3.28) and at for the second term. With A/(2a + ) ∼ Λ 2 /Q and δ 2 A/(2a − ) ∼ Λ 4 /Q 3 , the poles at ℓ − 1 and ℓ − 3 are close to zero and thus far away from the integration region. For a + > 0 the pole at ℓ − 2 (with a huge real part of order Q 3 /Λ 2 ) is infinitesimally close to the integration path but can easily be avoided by contour deformation. The pole at ℓ − 4 is located at a short distance of order Λ 2 /Q from the endpoint ℓ − = a − of the integration and therefore deserves closer examination. If a + < 0 then the pole is to the right of the endpoint, so that the closest point on the integration path is the endpoint itself. There we have which is of order Q 2 and provides a lower limit for the size of the left hand side on the entire path (i.e. an upper limit for the size of the integrand, where this expression is raised to the fourth power in the denominator). This is sufficient to suppress the second integral in (3.28). If a + > 0 then the pole is to the left of the integration endpoint. It can however be avoided by contour deformation, and on a semi-circle around ℓ − 4 going to the endpoint (see figure 14a) the upper limit for the denominator based on (3.31) remains valid. Thus the contribution to the integral is power suppressed, as we anticipated.
Timelike Wilson lines. With timelike Wilson lines, the leading numerator factor in the Glauber region is again given by N in (3.12). For the eikonal propagators, we must make the replacement in (3.11). The ℓ + poles of the eikonal propagators are now on opposite sides of the real axis. In order to evaluate the ℓ + integral, we close the integration contour in lower half plane for ℓ − + a − < 0 and take the residue at ℓ + = ℓ + eik,1 , whereas we close it in the upper half plane for ℓ − + a − > 0 and take the residue at ℓ + = ℓ + eik,2 . Approximating (1 − δ 4 ) ≈ 1 we then get where now η ′′ = η ′ + δ 2 η is always positive. In addition to the eikonal pole at we now have poles at for the first term in (3.34) and at for the second term. For the first integral in (3.34), the situation is essentially identical to that for the first integral in (3.28) (except that ℓ − 2 now has a negative real part for a + < 0 rather than a + > 0), and we can use the same strategy to avoid the poles. In the second integral, we have the poles at ℓ − eik and ℓ − 3 near the origin. These can be avoided by contour deformation: a semi-circle around the origin with radius of order Λ 2 /Q, as in figure 14b, is sufficient to ensure that the integrand remains power suppressed. Finally, there is the pole at ℓ − 4 near the endpoint of integration. For this pole we can use the same logic as in the spacelike case to determine that it does not result in a leading contribution. In summary then, there is no leading contribution to I TL , and the factorised expression for the double box is valid to leading power with timelike Wilson lines as well.

Gauge boson vertex correction
We now turn to the vertex correction shown in figure 15, which is given by . (3.38) Since the graph has a UV divergence we have kept the loop integrations in 4−2ǫ dimensions. The replacement (3.9) now gives As in the double box case, the leading term on the right hand side of (3.39) is the one with only transverse components in the numerator (when we are in the Glauber region). Thus, we drop all terms apart from this one.
Combining the denominators of the gluon propagator and the scalar parton propagators containing ℓ with the help of Feynman parameters, we get This has the same structure as the expression (3.11) for the double box, except that the term with A appears to the third instead of the fourth power, which was inessential for reaching our conclusion that (3.11) was power suppressed. One can again show that only the generic region of Feynman parameters can potentially give a leading contribution, and one finds that the quantities in (3.41) fulfil the same conditions as stated in (3.14) for the double box. In addition, a + < 0 for the vertex correction, whereas it can have both signs in the double box case. Hence our arguments regarding factorisation carry over to the vertex correction graph, for lightlike, spacelike and timelike Wilson lines.

General prescription for avoiding the Glauber region
Let us return to the general case of one gluon exchange in the model introduced in section 3.1. As illustrated in section 3.3.3, we can avoid pinched poles in the Glauber region in most cases by a judicious separate routing of the plus and minus components of the gluon momentum ℓ through the graph.
To simplify the proof that Glauber gluons do not inhibit factorisation, we route ℓ − such that it flows in the same direction as the positive plus momentum of right moving lines whenever possible. In the example of figure 9d we have discussed before, one then routes ℓ − to flow through the active parton line with momentum k 2 rather than through the line with momentum k 1 . Whenever such a routing is possible for a graph, all poles in ℓ − from the lower part of the graph are on the same side of the real axis, and one can deform the integration to the other side, out of the Glauber region to a region where |ℓ − | ∼ |ℓ| ∼ Λ. There are no poles in the small ℓ − region from the upper part of the graph, where lines are left moving and have much larger minus momenta. The gluon propagator has its pole at 2ℓ + ℓ − = ℓ 2 + λ 2 and thus does not inhibit the contour deformation just described. As already discussed, the spacelike Wilson lines specified in and below (3.3) are designed to have poles on the same side of the real axis as the propagators of the right-moving lines and thus do not obstruct deforming ℓ − back to the real axis after the Grammer-Yennie approximation has been made. This is not the case for timelike Wilson lines, whose use therefore requires additional justification. We will show in appendix C that at the onegluon exchange level and at leading-power accuracy they give indeed the same result for the double Drell-Yan cross section as their spacelike counterparts, generalising what we found for the specific graphs computed in section 3.3.
The preceding argument can be repeated to avoid pinched poles in the small ℓ + region. Whenever the momentum routing just described is possible for either ℓ − or ℓ + , we can thus perform the Grammer-Yennie approximation at least on one side of the graph, so that at least one of the factors in parentheses in (3.5) gives a power suppression. This is sufficient to ensure that (3.4) is a good approximation throughout the combined collinear and soft momentum regions, even if one of the ℓ − and ℓ + integrations is trapped by a pinch in the region Λ 2 /Q, as is the case for figure 9c and 9e. 5 It remains to consider the graphs where neither ℓ − nor ℓ + can be routed in a way to avoid pinched poles. In the model we are discussing, this is only the case if both in the upper and in the lower part of the graph the gluon attaches to a spectator parton that either goes directly into the final state or only splits into spectator partons that go into the final state. An example of such a graph is figure 9f. For these graphs one can use exactly the same argument as in single Drell-Yan production (reviewed for instance in [11]). Deforming the ℓ + and ℓ − contours away from the poles of the active partons, one crosses poles of spectator lines. The residues of these poles cancel by a unitarity argument when, for a given graph of the cross section, one sums over all possible final-state cuts.

Limitations of the present method
The arguments of the previous section can be rather easily generalised to the exchange of several gluons in the same model setting, as long as these gluons do not interact with each other. However there exist difficulties in generalising this proof to one that works at all orders in perturbation theory. These difficulties appear already for single Drell-Yan production.
As an example consider figure 16. To construct an approximation for the combined collinear and soft regions, several different regions must be considered: in fact there are leading contributions with three, one or none of the momenta ℓ 1 , ℓ 2 and ℓ 3 being soft. The required approximators and subtraction terms for these regions do not obviously sum up ℓ 1 ℓ 2 ℓ 3 Figure 16. A graph for which the methods of section 3, in particular the use of (3.5), cannot be generalised in an obvious manner.
to give a simple form like (3.5), which was a key part of our present argument. For the proof of Glauber gluon cancellation at all orders in the next section, a different method will therefore be used.

All-order proof of Glauber gluon cancellation
In this section we demonstrate the cancellation of Glauber gluons in double Drell-Yan production to all orders in QCD perturbation theory. We make use of the same techniques that were used to demonstrate the all-order cancellation of Glauber gluons in the single Drell-Yan process [8,50] -in particular, we use the light-front ordered version of QCD perturbation theory (LCPT).
A subtle point is that the approximations needed to obtain factorisation are valid in specific regions of the loop momenta, whereas the arguments that establish the possibility of contour deformations out of the Glauber region require integration over the full range of certain light-cone components of these momenta. We therefore need to refine steps 3 and 4 in the general overview of the proof given in section 2.1. This applies both to single and double Drell-Yan production.
1. The expression of a given graph for the double Drell-Yan process is decomposed into a sum over all possible regions as in (2.8). For each region we make the appropriate kinematic approximations specified in point 3a of section 2.1, as well as the approximations for fermion lines in point 3b. These approximations are valid for all soft gluon momenta, including those in the Glauber region. In accordance with the subtraction formalism, they are made not only in the approximant C R Γ for their design region, but also in the subtraction terms of that region in approximants for bigger regions. The sum R C R Γ, with each term being integrated over the full range of all loop momenta, then gives a correct approximation of the original graph.
At this stage we do not yet make the Grammer-Yennie approximations necessary for the application of Ward identities. Before doing so, we need to take care of soft gluon momenta in the Glauber region. We could already apply the Grammer-Yennie approximation (2.4) for collinear gluons and the associated Ward identity, but we refrain from doing so. This avoids complications of the contour deformation argument in the case where the collinear Wilson lines are not strictly lightlike (cf. our comment in section 4.3.3).
2. For a given graph Γ, approximated for a given region R, we consider the sum over all possible cuts, G R = cuts C R Γ. We partition the graph into the collinear factor A and a remainder factor, where the ℓ j are the soft momenta entering A from the soft subgraph andl j is defined in (2.3). For simplicity, we suppress other momentum arguments in A and R, as well as the associated integrations. 6 The following arguments can all be made at fixed values of transverse parton momenta. Using LCPT we show that, whilst individual cut graphs in G R may have pinched poles at small ℓ − j , the sum over all cuts G R is free from such pinches, and all poles in ℓ − j that originate from A(l j ) are in the lower half plane. This step is rather involved, and we discuss it in detail in the following subsections.
3. The amount to which the ℓ − j integration can be deformed into the upper half plane is determined by the poles of the remainder factor R(ℓ j ). In the unapproximated graph, each ℓ − j is routed to flow through S into B, and from there back via H 1 or H 2 into A. After the approximations in step 1, ℓ − j is set to zero in B and the hard subgraphs, so that we only need to worry about the poles of S(ℓ j ). For |ℓ + j | < ∼ |ℓ j | the propagator of the gluon with momentum ℓ j has its pole at |ℓ − j | > ∼ |ℓ j |, and we assume that one can route ℓ − j through S in a way that the same is true for the full soft factor. It would be desirable to have a formal proof that this is always possible in momentum regions that give a leading contribution to the cross section. One can then deform the integration contour of each soft momentum ℓ − j out of the Glauber region, e.g. to a semicircle around the origin in the upper half plane, with radius of order |ℓ j |.
4. The approximator C R for a region includes subtraction terms for smaller regions R ′ .
According to (2.7), the approximations for both regions R and R ′ are applied to a graph in these terms. Consider a region R ′ with a soft gluon momentum ℓ j that is collinear or hard in R. Some graphs, such as the one in figure 17a are leading in R but subleading in R ′ . It is advantageous to keep the corresponding subtraction terms nevertheless, so that one can apply exact Ward identities appropriate for the region R at later stages. In such graphs we can make the replacement (2.5) without loss of accuracy for the cross section. For graphs that are leading in R ′ , such as the one in figure 17b, we must show that ℓ j can be deformed out of the Glauber region, after which (2.5) is indeed a good approximation. One can treat the subtraction Figure 17. Subtraction terms for a gluon with momentum ℓ j that is collinear in region R and soft in region R ′ . The corresponding decomposition into subgraphs A, B, H or A ′ , B ′ , S ′ , H ′ is indicated (to avoid clutter, H and H ′ are not shown in graph b). Graph a is power suppressed in region R ′ because a soft gluon enters a hard subgraph, whereas graph b is leading.
term for region R ′ just as the original term for that region and follow steps 2 and 3 above. In the subtraction term, additional approximations are made: in figure 17b the transverse gluon momentum ℓ j is neglected inside A ′ and the masses of lines that are in A ′ but not in A are put to zero. These additional approximations do not affect the arguments for contour deformation, and we can thus deform the ℓ − j integration to a semicircle around the origin in the upper half plane, with radius of order |ℓ j |.

5.
We repeat steps 2 to 4 for soft gluons connecting B with S, and for subtraction terms for hard or collinear gluons that become soft in a smaller region. The roles of plus and minus momenta are now interchanged. A difference to the discussion above is that integrations over the minus components of soft gluon momenta in A and S are already deformed, as specified above. This does not affect the possibility to deform the integration over the plus components of soft gluon momenta in B and S.
6. All soft gluon momenta are now deformed out of the Glauber region into a region where ℓ + j ∼ ℓ − j ∼ |ℓ j |. One can now apply all Grammer-Yennie approximations for soft gluons (see points 3c and 3d in section 2.1), as well as all Grammer-Yennie approximations for collinear gluons. This is done in all terms, including subtraction terms. The approximations are valid in their design regions; their possible failure in larger regions does not degrade the approximation of the cross section because within power law accuracy the contribution from these regions is removed by appropriate subtraction terms.
Note that the discussion of subtractions in point 4 of section 2.1 was at the level of individual cut graphs and completely local in momentum space. We now consider the sum over all cuts of a graph, which does not pose any problems. Moreover, the subtractions for soft gluons now work only for integrals over ℓ − j and ℓ + j . This is because we have deformed integration contours in the subtraction term T R C R ′ Γ for a region R ′ where a gluon is soft, but not in the term T R Γ where the corresponding gluon is designated as collinear or hard. The subtraction works correctly for the integral because in the integration region where ℓ j is soft, the unsubtracted graph T R Γ is correctly approximated by T R C R ′ Γ, and for the approximated term we have shown that the contour can be deformed.
Having made all Grammer-Yennie approximations, one proceeds with the factorisation proof as described in section 2.1, starting with the deformation of all integration contours back to the real axis.
In the remainder of this section we give a detailed description of step 2. Since LCPT may not be familiar to all readers, we give a brief summary of the rules of LCPT in section 4.1. We derive these rules from conventional perturbation theory in section 4.2, since an understanding of this derivation turns out to be important when we consider how to treat the collinear partons from a proton that enter the hard interactions. In section 4.3 we present the details of the sum-over-cuts argument in step 2, starting with a review of the corresponding procedure for single Drell-Yan production given in [8,50].

Light-cone perturbation theory (LCPT)
LCPT is a method for computing Green functions or S matrix elements in which one sums diagrams over all possible orderings of the vertices in the "light-cone time" It is somewhat similar to old-fashioned time ordered perturbation theory (discussion of which can e.g. be found in [7,78,79]), where vertices are ordered according to the usual time variable t = x 0 .
LCPT ultimately gives the same results as conventional covariant perturbation theory and indeed may be derived from it [8,80], as we show in the next section. We here give the rules for LCPT following the conventions of [8]. The rules may also be found in [81][82][83][84][85][86], albeit with differing normalisation conventions.
1. The diagrams in LCPT are like the diagrams in covariant perturbation theory, except that vertices are ordered in the light-cone time x + . We sum over all possible orderings in x + as well as all possible graphs. In diagrams we take x + to increase from left to right in the amplitude (i.e. to the left of the final state cut) and to decrease from left to right in the conjugate amplitude.
2. We assign each line L an on-shell four-momentum κ L satisfying κ 2 L = m 2 L . The plus and transverse components of these four-momenta are conserved at vertices, but the minus components are not -these are instead fixed by the on-shell condition, such that where p − ξ,inc is the total external minus momentum entering the state from the left (i.e. from lower x + ) and κ − L is the on-shell minus momentum of a line L in ξ.
6. Coupling factors at vertices and symmetry factors are the same as in Feynman graphs.
Note that strictly speaking these are only the rules for scalar theories. For theories involving fermion and/or vector boson fields (such as QCD and QED), the LCPT formulation contains extra vertices associated with particle exchange(s) that are instantaneous from the point of view of the ordering in x + (see e.g. [82,84,85]). These vertices are somewhat similar to the new vertices one encounters connecting collinear particles in SCET [87,88]. We will not need the precise structure of the vertices and hence do not give the corresponding rules here.

Derivation of LCPT from covariant perturbation theory
We now briefly review the derivation of the LCPT rules from covariant perturbation theory. We follow section 7.2.3 of [8], although we use a slightly different notation here. The steps to obtain the light-front ordered diagrams from a given covariant Feynman graph are as follows.
1. Take a covariant Feynman graph calculated according to the usual momentum-space rules.
2. Order the vertices i = 1, . . . , n. Assign momentum κ ij to an internal line flowing from vertex i to j, and momentum p i to an external line flowing into the graph at vertex i.
3. At each vertex we have a δ function corresponding to the conservation of fourmomentum. Rewrite the minus component part of this δ function as follows: 4. For each vertex there is now an x + i which is integrated over the full range. Partition the multiple integral over all x + i into a sum over possible orderings of the x + i as where we have a sum over all permutations π of the n indices, i.e. over all orderings x + π(1) < x + π(2) < · · · < x + π(n) of the light-cone times.

5.
For the ordering i < j of vertices use the momentum κ ij rather than κ ji .
6. For simplicity of notation we now consider only the ordering where x + 1 < x + 2 < · · · < x + n . For the Θ functions use the momentum-space representation Performing the integrals over x + i gives back a "momentum conservation" δ function i−1 at the vertex i, although this δ function now includes the fictitious momenta ρ − i and ρ − i−1 . Using the δ functions to perform the ρ − i integrals, we generate the "light-cone energy denominators" in (4.3) from the denominator factors in (4.6): This can be shown by induction over i, starting at the first vertex. The factors in (4 .7) are not yet quite those in (4.3), due to the fact that the κ − jk are not yet on shell.
7. Integrate over the minus momenta κ − ij of internal lines. There are exactly two places where κ − ij occurs: in the light-cone energy denominators, where it appears always as −κ − ij + iǫ, and in the Feynman propagator for the appropriate internal line: From this we see that we get a nonzero integral only for κ + ij > 0 (this is the origin of the Θ function in rule 3 above). For κ + ij > 0 pick up the pole of the propagator denominator, which sets the κ − ij to its on-shell value in the energy denominators (4.7). Note that when performing the κ − ij integral we get a factor 1/(2κ + ij ) from the residue of (4.8), which is the denominator factor in rule 3 above.

Absence of pinched poles in the Glauber region
Having reviewed LCPT, we now show that in the sum over all cuts of a graph there are no pinched poles in the Glauber region, as announced in step 2 at the beginning of section 4. We first show how this step works for the single Drell-Yan process, since the generalisation to double Drell-Yan production turns out to be a small step on top of this. Figure 18. Partitioning of a leading graph and region in single Drell-Yan production into a collinear factor A and the remainder R. The solid lines are collinear quarks (going from A to H), whilst the gluons are soft lines (going from S to A).

Single Drell-Yan
Let us take a closer look at the sum over cuts G R for an approximated graph, which as already discussed can be written as the convolution of the collinear factor A with a remainder factor R, which contains all other subgraphs. We start with the case depicted in figure 18, where we have two physically polarised partons joining A to H (one in the amplitude, and one in the conjugate), but no longitudinally polarised gluons. We postpone the discussion of the general case with longitudinally polarised gluon attachments to section 4.3.3. The other gluons exchanged between R and A in figure 18 are soft and may be in the Glauber region.
The sum over cuts in G R is organised as follows. We consider the vertices at which the soft gluons enter A, and let V denote the partitioning of these vertices between the amplitude and its conjugate. For a given V , we sum over the set A(V ) of compatible cuts of A, and over the set R(V ) of compatible cuts of R. Then we sum over all V to get the full set of cuts. Explicitly this gives Here . A crucial point to note is that A is defined to include the propagators for the external collinear lines with momenta k + j ℓ j and k, but to exclude the propagators for the external soft lines ℓ j , which are included in R. For simplicity, we shall henceforth omit the Lorentz indices in A µ 1 ···µn and R µ 1 ···µn and simply write A and R. In section 4.3.4 we will show that the factor on the second line of (4.10) is independent of the partitioning V . The sum over V then only applies to A and gives We now consider this expression in LCPT, with light-cone time x + . Since we do not include the propagators of the soft lines ℓ j in A, we treat these as external lines, with an external momentum ℓ j being injected at the vertex where the soft line couples to a collinear line inside A. By contrast, we do include the propagators of the collinear lines, so we treat these as lines inside A that terminate on a two-point vertex, whose other line is an external line that carries the collinear momentum away (and whose associated coupling constant is unity). These vertices participate in the light-cone time ordering in the same way as the other vertices in A. We call them "hard vertices" because in the graph for the physical process they are replaced by vertices at which the collinear lines enter the hard scattering.
To implement this interpretation we rewrite the integration over k − as treating κ i (κ i ′ ) as the momentum of the internal collinear line at the left (right) of the final state cut and the corresponding δ functions as associated with the two-point vertices, as needed in step 3 of the derivation of the LCPT rules in section 4.2. The integral over the external momentum component k − is not used for the derivation of LCPT and will be performed explicitly. For a given time ordering T we now partition the states ξ according to whether they are before the hard vertex H in the amplitude, before the hard vertex H ′ in the conjugate amplitude, or in the "final state" between H or H ′ and the cut F A . States before H go into the factor I T , those before H ′ into I ′ T , and those in the final state into F T : The explicit expression of the "numerator" in (4.13) is not needed in the following. It includes propagator numerators and vertex factors, as well as a factor i or −i for each light-cone energy denominator (4.3), depending on whether the corresponding intermediate state is in the amplitude or its conjugate. This change of sign is compensated by a change of sign for each interaction vertex in the amplitude or its conjugate, so that the overall numerator factor does not depend on the placement of the final state cut F A . Furthermore, this factor is independent of k − . In LCPT, internal vertices and propagator numerators only depend on the on-shell momenta (4.2) of the relevant lines, so that k − could only enter the numerator through the two-point vertex at the end of a collinear line. The factor for that vertex is however momentum independent. The other factors in (4.13) read where p is the proton momentum and (in contrast to section 4.2) we simply number the on-shell minus momenta of the lines κ L sequentially. The initial state factors I T and I ′ T only have poles in the lower half plane for the momenta ℓ − j entering prior to H or H ′ . However, the ℓ − j poles of the factor F T are pinched and seem to prevent us from deforming the ℓ − j integration out of the Glauber region. To proceed, we label the N states ξ f in F T by an index f = 1, . . . , N running from left to right, and abbreviate the sum of on-shell minus momenta in state f as The sum over all final state cuts F A for F T gives which is essentially the Cutkosky identity [89,90] in the light-cone formalism and can readily be obtained by using It is now easy to perform perform the integral of (4.18) over the external momentum k − (note that no other factor in (4.13) depends on this variable): For N > 1 each product on the r.h.s. of (4.18) decreases faster than 1/k − at infinity, so that one obtains zero by the theorem of residues, whereas the result for N = 1 simply corresponds to the original δ function on the l.h.s. Once we sum over all cuts, F T thus gives either 0 or 1. The pinched poles in ℓ − j are then removed, and we can deform ℓ − j into the upper half plane as we set out to show. Notice that if one does not consider the inclusive Drell-Yan cross section (differential or not in the transverse momentum of the vector boson), but instead some observable that is not constant for all the cuts of F T , then this argument fails and Glauber gluon exchange may not cancel [11,12].
As already stated at the beginning of section 4, the preceding derivation works not only for the original terms in the region decomposition of a graph, but also for the subtraction terms. In the graph of figure 17b one neglects ℓ j and the masses of certain lines inside A ′ , which leads to different values of the on-shell momenta κ − L in the above expressions, but does not affect the arguments otherwise.

Double Drell-Yan
We now generalise the argument of the previous section to the double Drell-Yan process. In this case we have four physically polarised partons joining A to H (two in the amplitude, and two in its conjugate) -once again we ignore longitudinally polarised gluon attachments for the moment. The analogue of (4.10) now reads Detailed justification of this form for G R can be found in [1]. A diagram with the momentum assignments is given in figure 19. In section 4.3.4 we will show that the factor on the last line of (4.21) is independent of the partitioning V . In analogy to the single Drell-Yan process we can thus perform an unrestricted sum over all cuts F A of A, as in (4.11). Now we consider A in LCPT and establish the rules for the vertices to which the collinear lines attach. We could straightforwardly copy the procedure of the previous section, in which case each collinear line would attach to a two-point vertex leading to an external line. We would then consider all light-cone time orderings of the vertices (including allowed orderings of the two-point vertices), and integrate over the external minus momenta k − 1 , k − 2 and r − . In Appendix B we show how the cancellation of pinched poles arises in Figure 19. Partitioning of a leading graph and region in double Drell-Yan production into a collinear factor A and the remainder R. Note that our argument does not depend on which of the four collinear partons we chose to route the sum of soft momenta. We show the momentum assignments for double parton scattering found in [1] (involving k 1 , k 2 , r), as well as the alternative assignment (involving k, k ′ , K) used later in this section.
this setting. Here we take an alternative approach, which allows us to arrive at the desired result more efficiently. First, we make the following change of variables: As shown in figure 19, k − runs up the first collinear line in the amplitude and down the second, without crossing the final state cut. Since the factor R in (4.21) is independent of k − , we may treat this variable as internal in the derivation of the LCPT rules (see step 7 in section 4.2). We then have a vertex to which both collinear lines in the amplitude attach, as shown in figure 20. The momentum component k ′− in the conjugate amplitude can be treated in the same way. In generalisation of (4.12) we now write the integration over the external minus momenta as The δ functions in the last expression are exactly what is required for introducing threepoint vertices according to step 3 of section 4.2. These vertices are somewhat unusual in that the plus components (in TMD factorisation also the transverse ones) of the individual parton lines κ i and κ l (κ i ′ and κ l ′ ) are fixed, but this does not invalidate the derivation of the LCPT rules.
With this setup we can proceed exactly like in single Drell-Yan production, having exactly one hard vertex H on the left and one hard vertex H ′ on the right of the cut F A . Summing over all these cuts and integrating over K − we obtain the analogue of (4.20), which shows that all pinched poles from final state interactions cancel in the sum over F A . We can thus deform the integration over ℓ − j into the upper half plane as desired. Note that it is the integration over k − that is key in order for us to be able to attach the two collinear lines in the amplitude to a single vertex. In fact, the momentum k − is Fourier conjugate to the light-cone time separation x + 12 between the fields associated with these two collinear partons in the operator definition of A. Integrating over k − corresponds to setting x + 12 = 0. It is thus not surprising that the k − integration allows us to connect the ends of the two collinear lines together at a single vertex in the LCPT picture. In the physical process, this amounts to the two hard scatterings occurring at the same light-cone time.

Including longitudinally polarised collinear gluons
It is now easy to treat the case in which we have an arbitrary number of longitudinally polarised collinear gluon attachments between A and H, either in single or in double Drell-Yan production. For each one of these gluon lines there is an additional loop momentum λ k in (4.10) or (4.21), and the integral over λ − k concerns only the factor A because this component is set to zero in the hard subgraph. We can thus use the same procedure as for the two physically polarised partons in double Drell-Yan production, attaching the ends of all collinear lines in the amplitude to one vertex H and the ends of all collinear lines in the conjugate amplitude to another vertex H ′ . The argument for the cancellation of final state poles in soft momentum components ℓ − j then proceeds as before. We remark that, before discussing the contour deformation in ℓ − j , one could already make the Grammer-Yennie approximation (2.4) for collinear gluons and then apply the corresponding Ward identities to remove collinear gluon attachments from the hard factors. For lightlike Wilson lines the resulting eikonal propagators 1/(λ + k + iǫ) can readily be incorporated into the vertex factors for H and H ′ . For non-lightlike Wilson lines, however, the eikonal propagators also depend on λ − k , which complicates the derivation of the LCPT rules. For this reason we postpone the collinear Grammer-Yennie approximation to a later stage in our present treatment.

Independence of the remainder R on partitioning of soft vertices
As announced earlier, we now show that the remainder factor R is independent of the partitioning of soft vertices V in the collinear factor A. We perform this proof explicitly for the double Drell-Yan process. The proof for single Drell-Yan production readily follows from the derivation given here; a slightly different version of it has been given long ago in [50]. Consider the factor R for double Drell-Yan production, in the general case where we have an arbitrary number of longitudinally polarised gluon attachments to the hard process from both A and B. This can be written as where R k 1 ,k 2 ,r,λ l , ℓ j (4.25) A diagrammatic depiction of R and R is given in figure 21. The factor H in (4.24) contains both hard scatterings H 1 and H 2 of figure 21, plus all appropriate δ functions for momentum conservation in the hard subgraphs. It is summed over all possible final state cuts of H 1 and H 2 (in TMD factorisation there is only one such cut, but there can be several cuts in collinear factorisation with unobserved jets). The quantities λ k are the momenta of the longitudinally polarised gluons entering H from A, andλ l are the corresponding momenta of the longitudinally polarised gluon connections between H and B. Let us consider R in LCPT, with the light-cone time now being x − (rather than x + as in our analysis of the factor A). Note that we have the product of the B and S subgraphs in R, and that we make kinematic approximations for soft momenta when they enter B, neglecting their minus components. In the LCPT formulation these approximations only adjust the values of on-shell plus momenta κ + L , the precise values of which do not matter in the arguments to follow. We therefore can treat the soft and collinear lines in R in the same manner as we would without the approximations.
In (4.25) we have integrations over the plus momenta of all external parton lines, i.e. all collinear lines (the physically polarised lines plus the longitudinally polarised gluons) and all soft lines. We can hence take all external parton lines in the amplitude and tie their ends together at a single vertex H, using the same argument as in the previous subsections. Likewise, we can tie together all external lines in the conjugate amplitude at a single vertex H ′ . There is a remaining integral over the total plus momentumK + flowing out of the graph at H (and, by momentum conservation, flowing back into the graph at H ′ ). In this representation, the partitioning V determines which soft lines ℓ j originate in the vertex H and which ones originate in H ′ .
As we did for A, we partition R for each light-cone time ordering T into a numerator factor, a factor I T for states that occur before H in the amplitude, a factor I ′ T for states that occur before H ′ in the conjugate amplitude, and a factor F T for final states between H or H ′ and the cut F BS . Only F T depends onK + , so that we can write Let us consider a given graph and time ordering T of vertices, sum over the final state cuts F BS that are compatible with V , and also perform the integral overK + . We then get zero unless there is only a single state in F T , in which case the final state factor is simply unity. The argument is completely analogous to the one given in (4.18). In particular, this means that when we have graphs in which soft gluons arising from an external vertex attach to the final state, such as in figure 22c, the sum over cuts always gives zero. For the nonzero contributions to R it does not matter whether a soft gluon ℓ j couples to H or to H ′ , i.e. R does not depend on V as we set out to show. To see this, consider two partitionings V 1 and V 2 that differ only in whether ℓ j enters the graph in the amplitude or in its conjugate. For every nonzero graph in V 1 we can find a corresponding nonzero graph in V 2 with the same initial state factors I T and I ′ T . A simple example of corresponding graphs is given in figure 22a and b. Note that even though the final states in graphs a and b are different, we have (2π) −1 dK + F T (K + ) = 1 in both cases.
The proof for the independence of R on V in the single Drell-Yan process is completely analogous, the only difference being that in R there is only one physically polarised parton (rather than two) entering the external vertex in the amplitude or its conjugate. Figure 22. Examples of LCPT graphs for the factor R in a simple setting where the incident hadron is replaced by a single parton line.

Summary
Double parton scattering is an interesting signal and potentially relevant background process at the LHC. A factorisation formula for the double parton scattering cross section was put forward long ago, based on an analysis of the lowest order Feynman diagrams for the process and the usual approximations made in collinear factorisation [1,91,92]. More recently, a similar analysis was used to obtain a formula for the DPS cross section differential in final-state transverse momenta [1]. For the double Drell-Yan process, in which one does not have any complications associated with colour in the final state, several ingredients of a factorisation proof have been given in [1,47]. Notably missing in these papers is a proof of cancellation of so-called Glauber gluons, which are soft gluons with much larger transverse momentum components than longitudinal ones. For these momentum modes, the approximations required to obtain factorisation break down.
In the present paper we have filled this gap, demonstrating the cancellation of Glauber gluons in the double Drell-Yan process (both with and without measured transverse momenta of the produced gauge bosons). This was done first at the one-gluon level in a simple model, using momentum routing arguments and the same unitarity argument that can be used to show the cancellation of Glauber exchange at the one-gluon level in single Drell-Yan production (see e.g. [11]). Then, an all-order proof was given, which makes use of light-cone perturbation theory and generalises the corresponding proof for the single Drell-Yan process in [8,50]. In the process of constructing this proof, we also revisited and clarified some issues with regards to the Glauber cancellation argument and its relation to the rest of the factorisation proof in the single scattering case.
It is easy to see that both our proofs -the one for one-gluon exchange and the one for all orders -generalise to other double scattering processes producing colourless particles, such as the production of a Higgs boson and a Z. Likewise, both proofs carry over from double parton scattering to an arbitrary number n of hard scatters (each producing only colourless particles). Corresponding cross section formulae derived from lowest order Feynman diagrams can be found in [1]. It should not be difficult to adapt the other steps required for a factorisation proof, laid out in section 2.1, to this case. What becomes increasingly more complicated with the number of hard scatters is in particular the colour structure of multiparton distributions and soft factors, which is not a problem of conceptual nature but of increased complexity for calculations and phenomenology.
Even with the results of the present paper, there still remains work to do to establish the all-order factorisation of the double Drell-Yan process. Open issues are discussed in section 2.1. Perhaps the most prominent among them is the question how to treat the double counting between single and double scattering, and how one should divide "double splitting graphs" (e.g. the equivalent to figure 10 in QCD) between single and double scattering.
Aside from double Drell-Yan production, it would be desirable to investigate the factorisation properties of other processes, such as the production of double dijets. Before analysing such processes with colour in the final state, one first should investigate the factorisation for the corresponding single scattering processes, for which to the best of our knowledge the absence of Glauber gluon effects has not yet been proven. A more solid theoretical understanding of double hard scattering with jets or heavy quarks would be of great practical value, because such processes have much higher rates than double Drell-Yan production and were already studied experimentally at the Tevatron [93][94][95][96][97][98] and in Run 1 of the LHC [99][100][101][102][103][104]. Further in-depth studies can surely be expected from Run 2.

A Evaluation of Feynman integrals in light-cone coordinates
As mentioned in section 3.3 the evaluation of Feynman integrals in light-cone coordinates has to be done with some care. Consider for instance the integral The integral on the l.h.s. of (A.1) is obviously non-zero while a naive application of Cauchy's theorem for the ℓ + integration on the r.h.s. would yield zero. This apparent paradox is resolved by the fact that Cauchy's theorem is not applicable in this case, because at the point ℓ − = 0 the ℓ + integral is linearly divergent. This issue has been encountered in the past, and in [83,105] it was shown that the proper formula for such integrals reads This formula and the methods used to derive it can also be used to calculate the integral that we have encountered in section 3.3. After partial fractioning with respect to ℓ + we can adapt the methods of [83] and obtain the same result as we got using Cauchy's theorem for the ℓ + integration. Cauchy's theorem can be used in this case because at the point ℓ − = −a − the ℓ + integral diverges only logarithmically but not linearly. It can however not be used if the integrand comes with one or two powers of ℓ + in the numerator, which cancel the factor 1/(ℓ + + iη). The method using (A.2) combined with partial fractioning continues to work in this case.
B Alternative approach to the double scattering collinear factor In this appendix we consider the alternative approach, mentioned after (4.21), for treating the vertices at the ends of collinear lines in the double scattering collinear factor A. We take each collinear line to end on its own two-point vertex, where an external line carries away the appropriate momentum. We have to consider all possible orderings of these twopoint vertices in the amplitude and its conjugate, and a priori we can have any number of interactions and insertions of soft momenta between the two-point vertices. This seems to be rather different from the situation described in section 4.3.2, where the two collinear lines are connected together at a vertex, such that nothing can occur between them. However, we will see that in the end we obtain the same result. Let us consider the two LCPT graphs in figure 23, which are completely identical except for the fact that we have interchanged the order of the hard vertices in the amplitude. The internal structure of the blobs, positions of soft attachments, and light-cone time ordering of states in the blobs are the same. To calculate A we must sum over both graphs. Now we make the change of variables in (4.22). Then only the factors I T 2 andĨ T 2 depend on k − , which is to be integrated over. For either graph, if there is more than one state between H 1 and H 2 (in particular if there is an insertion of a soft momentum between H 1 and H 2 ) then the k − integration gives zero, since then there are at least two denominator factors that depend on k − , which all have their poles on the same side of the real axis. A nonzero result is hence only possible if there is exactly one state between H 1 and H 2 .
We now combine the two graphs. Since they are identical apart from the ordering of H 1 and H 2 , the factors I T 1 , F T , I ′ T 1 and I ′ T 2 are identical between the two and can be factored out. We thus need only consider the sum of I T 2 andĨ T 2 for the one case in which they are nonzero:

C Timelike Wilson lines
In this appendix, we show the viability of timelike Wilson lines in single and double Drell-Yan production at the level of one-gluon exchange. Rather than proving factorisation with timelike Wilson lines directly, we start from the factorised expression with spacelike Wilson lines, whose validity we have established in the main body of this paper. We then show that the expression with timelike Wilson lines is equivalent to this to leading-power accuracy. As in section 3, we restrict our attention to transverse gluon momenta ℓ of order Λ, since this covers the soft and collinear regions where the Wilson lines are relevant. We also take the collinear and soft Wilson lines to have the same directions. As in section 4.3, we begin with the simpler single Drell-Yan case and later explain how the argument generalises to double Drell-Yan production. Let us denote the directions of the left-and right-moving timelike Wilson lines as v LT = (δ 2 , 1, 0) and v RT = (1, δ ′2 , 0) respectively. We write their spacelike analogues as v LS = (−δ 2 , 1, 0) and v RS = (1, −δ ′2 , 0). Here δ, δ ′ ∼ Λ/Q, and it is understood that right-moving partons have plus components of order Q and transverse ones of order Λ.
We first derive an intermediate result that will prove useful later on. Let A sub (v Li , v Rj ) with i, j = T, S denote a subtracted collinear subgraph with a Wilson line along v Li and a Wilson line v Rj in the soft subtraction term (see (2.13)). We now prove that, for a single gluon attaching to the Wilson line v Lj , we have when the transverse gluon momentum ℓ is restricted to be of size Λ. Let us first discuss the case where the gluon does not cross the final-state cut inside A. Before integration over ℓ and up to a global factor irrelevant for our argument, the subtracted collinear factor reads where +δ 2 corresponds to v LT and −δ 2 to v LS . We have taken the gluon momentum to flow from the Wilson line downwards into A, on the left hand side of the final state cut. Note that as in section 3, we consider here only the Grammer-Yennie approximations in the soft and collinear factors, postponing kinematic approximations to a later stage.
To start, we consider the term with δ 2 A − (ℓ) in (C.2). This is only non-negligible compared to the other terms if A − (ℓ)/A + (ℓ) is very large, which requires ℓ − ≫ Q. The only dominant graphs in A are then those where the gluon couples to an active parton, since this carries the minimal number of parton lines off shell (see the discussion in section 2.2). Explicit power counting for such graphs shows that this region is strongly power suppressed, either due to the gluon propagator or due to the δ 2 ℓ − term in the eikonal propagator. Thus, we need not consider this term further.

(C.3)
We now show that the sign of the δ 2 ℓ − term in (C.2) is not important for either piece. We begin with the first piece, which together with the remaining factors and integral is For the term δ 2 ℓ − to be important, the gluon must be left moving with ℓ − ∼ δ −2 ℓ + or larger. To prevent the gluon propagator from becoming too large, one must then have ℓ + ≪ Q. Then, however, the difference in square brackets is negligible, and so is the integral. Hence there is no leading contribution from the region in which the sign of the δ 2 ℓ − term matters. We now move on to the second piece in (C.3), which gives .
We have performed the integration over ℓ + using Cauchy's theorem. With all eikonal propagators having their poles in ℓ + above the real axis (for this it is essential to take v R spacelike) we have closed the integration contour in the lower half plane, only picking up the pole of the gluon propagator if ℓ − > 0. In the second line of (C.5), the subtraction term leads to a suppression for ℓ − ∼ |ℓ| or larger, whereas for ℓ − ≪ |ℓ| the terms with ±δ 2 in the eikonal propagators are negligible. Thus the sign of the δ 2 ℓ − term is not important here either.
The preceding arguments also work if the gluon couples to a Wilson line left of the final state cut and then crosses the cut; in that case the gluon propagator in (C.2), (C.4) and the starting expression of (C.5) is to be replaced by 2πδ(ℓ 2 ) Θ(ℓ − ). This completes our proof of (C.1).
Note that this result cannot be obtained by power counting alone. For |ℓ| ∼ Λ, the region with ℓ − ∼ Λ 2 /Q and ℓ + ∼ Λ 4 /Q 3 gives a leading-power contribution to (C.2) since ℓ − is too small for the Grammer-Yennie approximation in the subtraction term to work, and the smallness of the eikonal propagator compensates the smallness of the integration volume of ℓ + in that region, where one cannot neglect ±δ 2 ℓ − . The position of poles in ℓ + was essential to pick up only the pole of the gluon propagator using Cauchy's theorem, and then the mass-shell condition for ℓ excluded the above dangerous region. Now let us consider the soft and collinear-to-A or B approximants to the full set of graphs in which one gluon extends between the left and right moving collinear sectors. We collect together graph approximants in which a gluon attaches in all different places to the upper and lower parts of the graph, with these attachments being either to the left or to the right of the final-state cut in each case. This allows us to use Ward identities to convert contractions of ℓ µ with A µ (ℓ) or with B µ (ℓ) to collinear factors A and B without an external gluon.
We write A i and B i with i = T, S for the unsubtracted collinear factor with one gluon coupling to a time-or spacelike Wilson line, and S ij for a graph contributing to the one-loop expression of the soft factor with time-or spacelike Wilson lines (with the gluon coupling to v L and v R on a definite side of the final state cut). The difference between the factorised expressions with timelike and spacelike auxiliary Grammer-Yennie vectors (or Wilson lines), after the sum over gluon attachments as specified, is It is easy to evaluate the graphs for the one-loop soft factor explicitly (see e.g. section 3.3.1 of [1]). We find that, up to power corrections, the result for each graph is a factor independent of v L and v R times log where σ L,R = −1 (+1) if the corresponding eikonal line is left (right) of the final state cut. Plugging this result into (C.7), we get zero. This result can also be shown in a simple way as follows. Up to global factor the combination of soft factors can be written either as 8) or as its analogy with the gluon propagator replaced by 2πδ(ℓ 2 ) Θ(ℓ − ). The differences of eikonal propagators yield a power suppression unless ℓ + < ∼ δ 2 ℓ − (for the first square bracket) and ℓ + > ∼ ℓ − /δ ′2 (for the second square bracket). However, both conditions cannot be satisfied at the same time, so one always has a suppression in (C.8).
Let us now move to the double Drell-Yan process. In each collinear factor we now have twice as many Wilson lines to which the single gluon may attach. Demonstrating that (C.1) still holds at the one-gluon level is a straightforward copy of the single Drell-Yan argument, provided that one routes ℓ through the relevant Wilson line, down into A through the gluon and back up through the parton associated with the Wilson line. In the formulae analogous to (C.6) and (C.7), both A i and B j are now summed over the two possible gluon attachments to Wilson lines. Likewise, the one-loop expression for S ij in (C.6) and (C.7) now contains a sum over the four possibilities how the gluon can attach to the different left-and right-moving Wilson lines (always on a definite side of the finalstate cut). For each of these possibilities, the longitudinal structure of the loop integral is exactly the same as for single Drell-Yan production. The argument of (C.8) works on a diagram-by-diagram basis, and of course also for the relevant sums over graphs. Thus our overall argument carries over to the double Drell-Yan process.
The simple structure in (C.7) is only obtained with a single exchanged gluon. We can therefore not decide from the above whether timelike Wilson lines are generally suitable for factorisation. However, the proof in the present section shows that a counter-example would require at least two exchanged gluons.