Improved TMD factorization for forward dijet production in dilute-dense hadronic collisions

We study forward dijet production in dilute-dense hadronic collisions. By considering the appropriate limits, we show that both the transverse-momentum-dependent (TMD) and the high-energy factorization formulas can be derived from the Color Glass Condensate framework. Respectively, this happens when the transverse momentum imbalance of the dijet system, $k_t$, is of the order of either the saturation scale, or the hard jet momenta, the former being always much smaller than the latter. We propose a new formula for forward dijets that encompasses both situations and is therefore applicable regardless of the magnitude of $k_t$. That involves generalizing the TMD factorization formula for dijet production to the case where the incoming small-$x$ gluon is off-shell. The derivation is performed in two independent ways, using either Feynman diagram techniques, or color-ordered amplitudes.


Introduction
Forward particle production observables in proton-proton (p+p) and proton-nucleus (p+A) collisions at the Large Hadron Collider (LHC) offer unique opportunities to study the dynamics of QCD at small $x$, and in particular the non-linear regime of parton saturation. Indeed, in high-energy hadronic collisions, forward particle production is sensitive only to high-momentum partons inside one of the colliding hadrons, which therefore appears dilute. By contrast, for the other hadron or nucleus, it is mainly small-momentum partons, whose density is large, that contribute to the scattering. Such processes, in which a large-$x$ projectile is used as a probe to investigate a small-$x$ target, are sometimes called dilute-dense collisions. Since the high-$x$ part of the projectile wave function is well understood in perturbative QCD, forward particle production is ideal for investigating the small-$x$ part of the target wave function. This is true both in p+p and p+A collisions, although using a target nucleus does enhance the dilute-dense asymmetry of such collisions.
The separation between the linear and non-linear regimes of the target wave function is characterized by a momentum scale $Q_s(x)$, called the saturation scale, which increases as $x$ decreases. Dilute-dense collisions can be described from first principles, provided $Q_s \gg \Lambda_{\rm QCD}$. This condition is better realized at higher energies (as they open up the phase space towards lower values of $x$), and with nuclear targets (since, roughly, $Q_s^2 \sim A^{1/3}$). Over the years, the Color Glass Condensate (CGC) effective theory [1] has emerged as the best candidate to approximate QCD in the saturation regime, both in terms of practical applicability and of phenomenological success [2]. In this paper, we focus on forward dijet production in p+A and p+p collisions. We note that the CGC approach has been very successful in describing forward di-hadron production at RHIC [3][4][5]; in particular, it predicted the suppression of azimuthal correlations in d+Au collisions compared to p+p collisions [6], which was later observed experimentally [7,8].
With forward dijets at the LHC, however, the full complexity of the CGC machinery is not needed. Indeed, for the di-hadron process at RHIC energies, no particular ordering of the momentum scales involved is assumed in CGC calculations, while, at the LHC, the presence of particles with transverse momenta much larger than the saturation scale must imply some simplifications. On the flip side, there are other complications, since further QCD dynamics, which is not part of the CGC framework but which is relevant at large transverse momenta, must also be considered. There are three important momentum scales in the forward dijet process: the typical transverse momentum of a hard jet, $P_t$, whose precise definition will be stated in the next section; the transverse momentum of the small-$x$ gluons involved in the hard scattering, $k_t$; and the saturation scale of the small-$x$ target, $Q_s$. Clearly, $P_t$ is always one of the hardest scales, and it is much bigger than $Q_s$, which is always one of the softest scales. Then, depending on where $k_t$ sits with respect to these two, three different regimes can be defined.
A first regime, with $Q_s \ll k_t \sim P_t$, corresponds to the domain of applicability of the so-called high energy factorization (HEF) framework [9,10], in which the description of forward dijets involves an unintegrated gluon distribution for the small-$x$ target along with off-shell matrix elements. That is explicitly shown in this work, starting from CGC calculations. A second regime, with $k_t \sim Q_s \ll P_t$, is where the so-called transverse momentum dependent (TMD) factorization [11] is valid. It involves on-shell matrix elements but two independent unintegrated gluon distributions [12]. In this regime, non-linear effects are present and, in the large-$N_c$ limit, equivalence with CGC expressions was shown in [13]. In the present work we shall keep $N_c$ finite. Finally, the intermediate regime $Q_s \ll k_t \ll P_t$, which is naturally obtained from the two others by taking the appropriate limits, corresponds to the collinear regime, with on-shell matrix elements and the standard integrated gluon distribution.
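As a rough illustration of the three regimes above, the sketch below labels a configuration by comparing $k_t$ with $Q_s$ and $P_t$. The function name `classify_regime` and the numerical factor `much`, which stands in for the parametric "much larger than", are assumptions made for this example only: the text defines the regimes through parametric inequalities, not through a sharp cut.

```python
def classify_regime(k_t, P_t, Q_s, much=5.0):
    """Label the forward-dijet regime for scales given in GeV.

    `much` is a hypothetical factor standing in for the parametric
    "much larger than" of the text; it is not taken from the paper.
    """
    if P_t <= much * Q_s:
        raise ValueError("expected a hard jet scale, P_t >> Q_s")
    if k_t <= much * Q_s:
        # k_t ~ Q_s << P_t: nearly back-to-back jets, saturation matters
        return "TMD"
    if k_t >= P_t / much:
        # Q_s << k_t ~ P_t: large imbalance, dilute target
        return "HEF"
    # Q_s << k_t << P_t: common limit of the two, collinear regime
    return "collinear"


if __name__ == "__main__":
    for k_t in (1.5, 7.0, 30.0):
        print(k_t, classify_regime(k_t, P_t=50.0, Q_s=1.0))
```

When the two windows overlap (a modest hierarchy between $P_t$ and $Q_s$), the order of the tests makes the function return "TMD" first; this is a choice of the sketch, not a physical statement.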
Separately, the HEF and TMD approaches to dijet production have been extensively studied in the literature, in [10,14-17] and [11,18-21] respectively, but little connection has been made between them so far. The first result of this paper is to reveal that connection, in the context of dilute-dense collisions, and to show that, in fact, they are both contained in the CGC description. However, as already mentioned, using the CGC approach is unnecessarily complicated and one should take advantage of the fact that $P_t \gg Q_s$ to simplify the theoretical formulation. The second result of the paper is precisely to develop a new formula for forward dijets in dilute-dense collisions that encompasses all three situations described above, meaning that it is applicable regardless of the magnitude of $k_t$. As explained below, this is obtained by extending the TMD factorization framework, more precisely by supplementing it with off-shell matrix elements.
Note that the derivation of our new unified formula is performed in two independent ways: first using the standard Feynman diagram technique, and second by exploiting the so-called helicity method that employs color-ordered amplitudes [22]. With this second method, the gauge invariance of the results is explicit, and the method will also prove very useful in the future, when processes with more particles in the final state are considered. As is the case in the CGC framework, our new formulation contains all the relevant limits, but it has the advantage that it is more amenable to phenomenological implementations than CGC calculations. In addition, it is also better suited to be supplemented with further QCD dynamics relevant at high P t , such as Sudakov logarithms [23,24] or coherence in the QCD evolution of the gluon density [25][26][27]. These tasks are left for future work.
The plan of the paper is as follows. In Section 2, we introduce kinematics and notations, and briefly present the HEF and TMD frameworks. In Section 3, we show that the HEF framework can be derived from CGC calculations, when the $Q_s \ll k_t \sim P_t$ limit is considered. Section 4 is devoted to the $k_t \sim Q_s \ll P_t$ limit: the derivation of the TMD factorization formula for forward dijets given in [12] is recalled and extended to the case of finite $N_c$. The hard factors of the TMD framework are computed again in Section 5, but keeping the small-$x$ gluon off-shell, which leads us to our new unified formula for forward dijets in p+A collisions. In Section 6, both the TMD factorization formula and the off-shell hard factors are derived again, but using color-ordered amplitudes. Finally, Section 7 is devoted to conclusions and outlook.

Forward dijets in p+A collisions
We shall discuss the process of inclusive dijet production in the forward region, in collisions of dilute and dense systems. The process is shown schematically in Fig. 1. The four-momenta of the projectile and the target are massless and purely longitudinal. In terms of the light-cone variables, $v^\pm = (v^0 \pm v^3)/\sqrt{2}$, and with components ordered as $(v^+, v_t, v^-)$, they take the simple form

$$p_p = \sqrt{\frac{s}{2}}\,(1, 0_t, 0)\,, \qquad p_A = \sqrt{\frac{s}{2}}\,(0, 0_t, 1)\,, \qquad (2.2)$$

where $s$ is the squared center of mass energy of the p+A system. The energy (or longitudinal momentum) fractions of the incoming parton (either a quark or a gluon) from the projectile, $x_1$, and of the gluon from the target, $x_2$, can be expressed in terms of the rapidities and transverse momenta of the produced jets as

$$x_1 = \frac{1}{\sqrt{s}}\left(|p_{1t}|\,e^{y_1} + |p_{2t}|\,e^{y_2}\right), \qquad x_2 = \frac{1}{\sqrt{s}}\left(|p_{1t}|\,e^{-y_1} + |p_{2t}|\,e^{-y_2}\right), \qquad (2.3)$$

where $p_{1t}$, $p_{2t}$ are transverse Euclidean two-vectors.

Figure 1: Inclusive dijet production in p+A collision. The blob H represents hard scattering. The solid lines coming out of H represent partons, which can be either quarks or gluons.

By looking at jets produced in the forward direction, we effectively select those fractions to be $x_1 \sim 1$ and $x_2 \ll 1$. Since the target A is probed at low $x_2$, the dominant contributions come from the subprocesses in which the incoming parton on the target side is a gluon:

$$qg \to qg\,, \qquad gg \to q\bar{q}\,, \qquad gg \to gg\,. \qquad (2.4)$$

In dilute-dense collisions, the large-$x$ partons of the dilute projectile are described in terms of the usual parton distribution functions of collinear factorization, $f_{a/p}$, with a scale dependence given by DGLAP evolution equations. By contrast, the small-$x$ gluons of the dense target nucleus are described by a transverse-momentum-dependent distribution, which evolves towards small $x$ according to non-linear equations. Therefore, the momentum $k$ of the incoming gluon from the target, besides the longitudinal component $k^- = x_2\sqrt{s/2}$, has in general a non-zero transverse component, $k_T$, which leads to an imbalance of transverse momentum of the produced jets:

$$|k_t|^2 = |p_{1t} + p_{2t}|^2\,. \qquad (2.5)$$

Here, by $k_T$ we mean a four-vector, as opposed to $k_t = p_{1t} + p_{2t}$, which is a two-dimensional vector in the transverse plane.
They are simply related by $k_T = (0, k_t, 0)$. Using the notation defined above, the gluon's four-momentum can also be parametrized as

$$k = x_2\, p_A + k_T\,. \qquad (2.6)$$

The Mandelstam variables at the partonic level, $\hat{s}$, $\hat{t}$ and $\hat{u}$, are defined in Eq. (2.7). They sum up to $\hat{s} + \hat{t} + \hat{u} = k_T^2$.

To take into account small-$x$ effects in dijet production, an approach that has been broadly used in phenomenological studies involves the so-called high energy factorization (HEF) formula [14] for the cross section $d\sigma^{pA\to{\rm dijets}+X}$, Eq. (2.9). This formula makes use of the unintegrated gluon distribution $\mathcal{F}_{g/A}$ that is involved in the calculation of the deep inelastic structure functions. It is determined from fits to DIS data, and then used in Eq. (2.9), along with matrix elements that depend on the transverse momentum imbalance (2.5). Even though the high energy factorization is not strictly valid for dijet production, there exists a kinematic window, the dilute limit $Q_s \ll |p_{1t}|, |p_{2t}|, |k_t|$, in which it can be motivated from the CGC approach. We shall demonstrate this explicitly for all channels in the next section.
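The kinematic relations of this section can be checked numerically with a short sketch. The function name `forward_dijet_kinematics`, the sample jet momenta and the collision energy are hypothetical. The explicit expressions used for $x_1$, $x_2$, $z$, $P_t$ and the Mandelstam variables are the standard relations for massless jets, written out here as assumptions of the sketch rather than as a quotation of Eqs. (2.3)-(2.8).

```python
import math


def forward_dijet_kinematics(p1t, p2t, y1, y2, sqrt_s):
    """Kinematics of two massless jets; momenta in GeV, p_it = (px, py)."""
    m1 = math.hypot(p1t[0], p1t[1])          # |p_1t|
    m2 = math.hypot(p2t[0], p2t[1])          # |p_2t|
    plus1 = m1 * math.exp(y1)                # proportional to p_1^+
    plus2 = m2 * math.exp(y2)                # proportional to p_2^+
    x1 = (plus1 + plus2) / sqrt_s
    x2 = (m1 * math.exp(-y1) + m2 * math.exp(-y2)) / sqrt_s
    z = plus1 / (plus1 + plus2)              # plus-momentum fraction of jet 1
    kt = (p1t[0] + p2t[0], p1t[1] + p2t[1])  # transverse imbalance
    Pt = ((1 - z) * p1t[0] - z * p2t[0], (1 - z) * p1t[1] - z * p2t[1])
    kt2 = kt[0] ** 2 + kt[1] ** 2
    Pt2 = Pt[0] ** 2 + Pt[1] ** 2
    # Assumed assignment of the Mandelstam variables (massless kinematics)
    s_hat = Pt2 / (z * (1 - z))
    t_hat = -m2 ** 2 / (1 - z)
    u_hat = -m1 ** 2 / z
    return {"x1": x1, "x2": x2, "z": z, "kt2": kt2, "Pt2": Pt2,
            "s_hat": s_hat, "t_hat": t_hat, "u_hat": u_hat}


# Hypothetical forward configuration at an assumed sqrt(s) = 8160 GeV
kin = forward_dijet_kinematics((20.0, 0.0), (-18.0, 3.0), 3.5, 4.0, 8160.0)
mandelstam_sum = kin["s_hat"] + kin["t_hat"] + kin["u_hat"]

if __name__ == "__main__":
    print("x1 =", kin["x1"], " x2 =", kin["x2"])
    print("s+t+u =", mandelstam_sum, " -|k_t|^2 =", -kin["kt2"])
```

With these inputs the sum of the Mandelstam variables equals $-|k_t|^2 = k_T^2$, as stated in the text, and the forward rapidities give $x_2 \ll x_1 < 1$.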
A second approach, valid in the regime where the transverse momentum imbalance between the outgoing particles, Eq. (2.5), is much smaller than their individual transverse momenta, is the so-called transverse momentum dependent (TMD) factorization. This limit, $|p_{1t} + p_{2t}| \ll |p_{1t}|, |p_{2t}|$, or $|k_t| \ll |P_t|$, corresponds to the situation of nearly back-to-back dijets. Even though, in general, there exists no TMD factorization theorem for jet production in hadron-hadron collisions, such a factorization can be established in the asymmetric "dilute-dense" situation considered here, where only one of the colliding hadrons is described by a transverse momentum dependent gluon distribution. Again, selecting dijet systems produced in the forward direction implies $x_1 \sim 1$ and $x_2 \ll 1$, which in turn allows us to make that assumption. The TMD factorization formula for $d\sigma^{pA\to{\rm dijets}+X}$, Eq. (2.10), was given in [13] (so far, this has been obtained in the large-$N_c$ approximation, but this restriction will be lifted in the present work). It involves several gluon distributions $\mathcal{F}^{(i)}_{ag}$, each accompanied by a hard factor $H^{(i)}_{ag\to cd}$. The hard factors were computed in [13] as if the small-$x_2$ gluon was on-shell (i.e. $|k_t| = 0$). The $k_t$ dependence survived only in the gluon distributions.
By restoring the $k_t$ dependence of the hard factors inside formula (2.10), we can bridge the HEF and TMD frameworks and obtain a unified formulation which encompasses both the dilute and the nearly back-to-back limits. Note that we follow the conventions used in earlier papers that dealt with these formalisms, namely Refs. [14] and [13], respectively. Therefore, contrary to the HEF matrix elements $|\mathcal{M}_{ag^*\to cd}|^2$, the hard factors $H^{(i)}_{ag\to cd}$ of the TMD factorization are defined without the $g^4$ factor. In addition, the definitions of the gluon distributions differ by a factor $\pi$. The integrated gluon distribution $x_2 f_{g/A}$ is obtained from the $\mathcal{F}^{(i)}_{ag}$ distributions in the TMD formalism.
Finally, let us point out that, in the frameworks described above, radiation is emitted in the transverse direction over which one has no control, as it is part of the small-$x$ gluon distributions and is therefore treated fully inclusively. To be more specific, at this level, transverse momentum conservation is ensured either by several particles of average transverse momentum $Q_s$, or by a third hard jet, depending on the magnitude of $|k_t|$. Due to the small-$x$ evolution, that radiation is ordered in rapidity, and therefore it does not contribute to the measured forward dijet systems.

High energy factorization derived from CGC: the $|p_{1t}|, |p_{2t}|, |k_t| \gg Q_s$ limit
We shall demonstrate that the high-energy factorization formula for double-inclusive particle production, Eq. (2.9), is identical to a result obtained from the CGC formalism in the dilute target approximation. This is a limit where all the momenta involved in the process are much larger than the saturation scale: $|p_{1t}|, |p_{2t}|, |k_t| \gg Q_s$. Here, we show explicitly the equivalence of the HEF and CGC formulas for the $qg^* \to qg$ channel and only provide the final results for the two other channels, as the derivations proceed identically for all of them. We derive the CGC cross sections for the $qg^* \to qg$ and $gg^* \to q\bar{q}$ channels in the dilute limit following a procedure developed in Ref. [28], where only the $gg^* \to gg$ sub-process was considered. The inclusive quark-gluon production cross section in the CGC is given by formula (3.1) [6], in terms of the amplitude squared $|\mathcal{M}(p, p_1, p_2)|^2$, Eq. (3.2), which involves the mixed-space quark wave functions $\phi^{\lambda}_{\alpha\beta}$ and the correlators of Wilson lines $S^{(i)}$ explained in detail below. Following the notation of Fig. 1 and Eq. (2.8), we use the fraction of the plus components of four-momenta, $z$, with $p_1$ being the four-momentum of the outgoing gluon and $p_2$ the four-momentum of the outgoing quark.
The fundamental, $U(x)$, and adjoint, $V(x)$, Wilson lines are defined as path-ordered exponentials of the gauge field (written here in the $A^+ = 0$ gauge), where $t^a$ and $T^a$ are the generators of the fundamental and adjoint representations of SU($N$), respectively. The correlators $S^{(i)}$ appearing in the cross section are traces of products of these Wilson lines, and the CGC average is taken over the background field evaluated at $Y = \ln(1/x_2)$.
In the product of wave functions, taken in the massless limit, we introduce a change of variables, $u = x - b$ and $v = zx + (1-z)b$ (and similarly for the primed coordinates), as in [6]. The conjugate momentum to $u - u'$ is $P_t = (1-z)p_{1t} - zp_{2t}$, and the one corresponding to $v - v'$ is the total transverse momentum of the produced particles, $k_t = p_{1t} + p_{2t}$. The correlators can then be written in terms of fundamental Wilson lines only, Eq. (3.10).
In the dilute target limit we use an expansion of the Wilson lines to second order in the background field. To this order, the expectation values of the correlators of Eq. (3.10), entering the cross section, are given by Eq. (3.14). In those expressions, $r = |x - y|$ and $\gamma_{x_2}(x^+, 0)$ is related to the expectation value of the two-field correlator. Using the expressions for the multi-point functions $S^{(i)}$, we obtain the amplitude squared. The Fourier transform of Eq. (3.15) gives the unintegrated gluon distribution $f_{x_2}(k_t)$, in terms of which the amplitude squared takes the form of Eq. (3.19), where $S_\perp$ is the transverse area of the target. We want to show that Eq. (3.19) reproduces the HEF formula (2.9) with the appropriate unintegrated parton distribution function and off-shell matrix elements. For this purpose, we need to find a relation between the unintegrated gluon distribution used in the above equation, $f_{x_2}(k_t)$, and $\mathcal{F}_{g/A}(x_2, k_t)$, which appears in the HEF formula (2.9).
This is easily done by considering the deep inelastic scattering process, since $\mathcal{F}_{g/A}(x_2, k_t)$ is precisely the unintegrated gluon distribution involved in the formulation of the $\gamma^* + A \to X$ total cross section, and is therefore related to the $q\bar{q}$ dipole scattering amplitude in a straightforward manner (see for instance [15,29]). In the weak-field limit, using formula (3.14), this gives the relation between $f_{x_2}(k_t)$ and $\mathcal{F}_{g/A}(x_2, k_t)$. Then, the cross section for the $qg$ production channel from Eq. (3.1) can be written in a more compact form, Eq. (3.22), where $P_{gq}(z)$ is related to the quark-to-gluon splitting function. It turns out that the above expression for the quark-gluon production cross section is identical to the result in the HEF formalism, Eq. (2.9), containing the off-shell amplitudes $|\mathcal{M}_{ag^*\to cd}|^2$.
The latter have been calculated in Refs. [10], [30] and [31]. The equivalence of the CGC and HEF formulas in the dilute limit can be shown in a similar way for the cross sections of the other two subprocesses, $gg^* \to q\bar{q}$ and $gg^* \to gg$. The CGC results for the cross sections in this limit are given in Eq. (3.24) and, for $d\sigma(pA \to ggX)$ [28], in Eq. (3.25); they involve the functions $P_{qg}(z)$ and $P_{gg}(z)$. Again, Eqs. (3.24) and (3.25) are equivalent to the HEF formulas for the corresponding cross sections [15]. Since this equivalence holds in the dilute limit only, the HEF formalism should not, in principle, be employed to include non-linear effects, and one should stick to Balitsky-Fadin-Kuraev-Lipatov (BFKL) evolution [32][33][34], or Ciafaloni-Catani-Fiorani-Marchesini evolution [25][26][27], when evaluating the gluon distribution. In this spirit, most studies are performed using a gluon density evolved with an improved BFKL equation that includes some higher-order corrections [35], but no non-linear effects. However, we note that the HEF framework could be used with the Balitsky-Kovchegov (BK) equation [36,37] in order to investigate the so-called geometric scaling regime, where saturation effects are felt even though $Q_s \ll k_t$. The full saturation region, $Q_s \sim k_t$, is however, in principle, out of reach of formula (2.9). Along these lines, an estimate of saturation effects was obtained in [38,39], using the BK equation extended to include the same higher-order corrections as included in the linear case [35].
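The change of variables used earlier in this section, $u = x - b$ and $v = zx + (1-z)b$, can be checked with a few lines of code. The sketch below verifies that the phase $p_{1t}\cdot x + p_{2t}\cdot b$ equals $P_t\cdot u + k_t\cdot v$, which is why $P_t$ and $k_t$ are the momenta conjugate to $u$ and $v$. The helper names, the sample numbers, and the pairing of $p_{1t}$ with $x$ and $p_{2t}$ with $b$ are assumptions of this example.

```python
def dot(a, b):
    """Euclidean scalar product of two transverse vectors."""
    return a[0] * b[0] + a[1] * b[1]


def to_relative_coordinates(x, b, z):
    """Return (u, v) with u = x - b and v = z*x + (1 - z)*b."""
    u = (x[0] - b[0], x[1] - b[1])
    v = (z * x[0] + (1 - z) * b[0], z * x[1] + (1 - z) * b[1])
    return u, v


def conjugate_momenta(p1t, p2t, z):
    """Return (P_t, k_t) with P_t = (1-z) p1t - z p2t and k_t = p1t + p2t."""
    P_t = ((1 - z) * p1t[0] - z * p2t[0], (1 - z) * p1t[1] - z * p2t[1])
    k_t = (p1t[0] + p2t[0], p1t[1] + p2t[1])
    return P_t, k_t


# Hypothetical transverse positions, momenta and momentum fraction
x, b, z = (0.3, -1.2), (0.7, 0.4), 0.35
p1t, p2t = (12.0, 5.0), (-10.0, -3.5)

u, v = to_relative_coordinates(x, b, z)
P_t, k_t = conjugate_momenta(p1t, p2t, z)

phase_before = dot(p1t, x) + dot(p2t, b)
phase_after = dot(P_t, u) + dot(k_t, v)

if __name__ == "__main__":
    print(phase_before, phase_after)
```

The identity holds for any $z$, since $x = v + (1-z)u$ and $b = v - zu$ invert the change of variables.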
with different operator definitions, are involved here. Indeed, as explained in [11], a generic unintegrated gluon distribution, built from the components $F^{i-}$ of the gluon field strength tensor, must also be supplemented with gauge links, in order to render such a bi-local product of field operators gauge invariant. The gauge links are path-ordered exponentials, with the integration path being fixed by the hard part of the process under consideration. Therefore, unintegrated gluon distributions are process-dependent.
In the following, we shall encounter two gauge links, $U^{[+]}$ and $U^{[-]}$, as well as the loop built from them. These links are composed of Wilson lines; their simplest expression is obtained in the $A^+ = 0$ gauge, but the expressions of the various gluon distributions given below are gauge-invariant. From now on, $F^{i-}(\xi^+, \xi)$ is simply denoted as $F(\xi)$, and the hadronic matrix elements $\langle A|\ldots|A\rangle$ as $\langle\ldots\rangle$. Note, however, that they are different from the CGC averages of the previous section.
This approach to dijet production in proton-nucleus collisions was analyzed in Ref. [13]. The TMD factorization formula (2.10) was derived there in the large-$N_c$ limit, and shown to be equivalent to CGC calculations (e.g. formulas (3.1) and (3.2) in the case of the $qA \to qg$ channel), after taking the limit $|p_{1t}|, |p_{2t}| \gg |k_t|, Q_s$. In this section, we derive the results keeping $N_c$ finite, and obtain corrections to the hard factors $H^{(i)}_{ag\to cd}$ previously derived, as well as new hard factors corresponding to gluon distributions that were omitted before (as they vanish in the large-$N_c$ limit). We also check explicitly the gauge invariance of these hard factors by computing them in a gauge different from the one used in [13].
An important fact to note is that, as a consequence of the $|k_t| \ll |p_{1t}|, |p_{2t}|$ limit, the $k_t$ dependence in (2.10) survives only in the gluon distributions, and the hard factors are calculated as if the small-$x_2$ gluon was on-shell. That is, looking at the hard partonic interaction represented by the blob H in Fig. 1, $k^2 = -|k_t|^2$ is set to zero, and $\hat{s} + \hat{t} + \hat{u} = 0$.

The qg → qg channel
The complete set of independent cut diagrams contributing to this channel is shown in Fig. 2 (mirror images of diagrams (3), (5) and (6) give identical expressions).
These are the same gluon distributions as in the large-N c limit [13], no additional ones are present in this channel. The only difference in the expression (4.3) when we go to finite N c will appear in the hard factor H qg . That gluon distribution is sometimes also denoted x 2 G (2) , and is called the dipole distribution, since it is the one that enters the formulation of the inclusive and semi-inclusive DIS. In the CGC approach, x 2 G (2) can be related to the qq dipole scattering amplitude, and therefore linked to the gluon distribution used in the HEF formalism: That distribution is not sufficient however to compute the forward dijet cross section when |k t | ∼ Q s (i.e. the case considered in this section).
The exact results for the two hard factors are expressed in terms of the $D_i$'s, the squared and interference diagrams corresponding to the $qg \to qg$ channel, following the numbering of Fig. 2. Each term $D_i = C_u^i\, h_i$ represents the product of the color factor, $C_u^i$, and the hard coefficient, $h_i$. Which diagrams enter each hard factor $H^{(i)}_{qg\to qg}$ depends on the type of gauge links appearing in them. As summarized in Table IV of Ref. [11], the distribution $\mathcal{F}^{(1)}_{qg}$ is present in diagrams (1), (2), (4), (5) and (6), while the distribution $\mathcal{F}^{(2)}_{qg}$ appears in diagrams (1), (2) and (3). The $D_i$ components were computed in Ref. [13] (Table II) in an axial gauge with the axial vector, $n$, set to $n = p$ for both the incoming and the outgoing gluon, where $p$ is the four-momentum of the incoming quark, as defined in Fig. 1. We recovered the same results for the $D_i$'s in that gauge and performed the same calculation in a different gauge, with the axial vector set to $n = p$ for the incoming gluon and $n = p_2$ for the outgoing gluon. The results for the hard factors $H^{(1)}_{qg\to qg}$ and $H^{(2)}_{qg\to qg}$ at finite $N_c$ are identical in both gauges. The hard factors and the TMDs entering the factorization formula (4.3) are all gauge invariant. In principle, that leaves us some freedom, and the factorization formula can be rewritten with new hard factors and corresponding new gluon distributions formed as linear combinations of the old ones.
For reasons that shall be discussed in detail in Section 6, let us define the new hard factors for the qg → qg subprocess and the corresponding new gluon TMDs such that the factorization formula (4.3) now takes the form The explicit expressions for K (1) qg→qg and K (2) qg→qg are given in Table 1.

The gg → qq̄ channel
The independent cut diagrams contributing to this channel are shown in Fig. 3.
In addition to the two gluon distributions, $\mathcal{F}^{(1)}_{gg}$ and $\mathcal{F}^{(2)}_{gg}$, used in Ref. [13], the result to all orders in $N_c$ involves a third distribution [11], $\mathcal{F}^{(3)}_{gg}$ (also sometimes denoted $x_2 G^{(1)}$ and called the Weizsäcker-Williams gluon distribution). The differential cross section is given in Eq. (4.14), with the three gluon TMDs defined in Eqs. (4.15)-(4.17).

Figure 3: Diagrams for the $gg \to q\bar{q}$ subprocess. The mirror diagrams of (3), (5) and (6) give identical contributions.

The appropriate hard factors are constructed from the expressions corresponding to the diagrams (1)-(6) depicted in Fig. 3. Again, the components $D_i = C_u^i\, h_i$ were computed in [13] (Table III), where they were used to determine the hard factors $H^{(1,2)}_{gg\to q\bar{q}}$ in the large-$N_c$ limit. Here, we generalize the results of [13] to the full, finite-$N_c$ case. The calculation is most readily done by exploiting the crossing symmetry that relates the $qg \to qg$ and $gg \to q\bar{q}$ channels. This allows one to identify the diagrams of Figs. 2 and 3 with each other and to recycle the $D_i$ expressions calculated in the previous subsection. For example, the expression corresponding to diagram (1) of Fig. 3, with the incoming and the outgoing legs connected, is identical to the already computed expression for diagram (4) of Fig. 2 (modulo a color averaging factor and the swapping of momenta $p_1 \leftrightarrow p$). The same holds for all the other diagrams. This yields the set of hard factors for the $gg \to q\bar{q}$ subprocess. Of the three hard factors, $H^{(1,2,3)}_{gg\to q\bar{q}}$, only two are independent: the third can be expressed in terms of the others, Eq. (4.24).
Therefore, the cross section for quark-antiquark production can be rewritten, Eq. (4.25), with only two hard factors and two gluon distributions that are linear combinations of the $\mathcal{F}^{(i)}_{gg}$. The new gluon TMDs, $\Phi^{(i)}_{gg\to q\bar{q}}$, are defined in Eqs. (4.26) and (4.27), and the hard factors $K^{(i)}_{gg\to q\bar{q}}$ in Eq. (4.28). The explicit expressions for the latter are given in Table 1.

The gg → gg channel
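Finally, the independent cut diagrams for the $gg \to gg$ channel are given in Figs. 4 and 5, and the corresponding differential cross section for two-gluon production involves six gluon distributions. The $\mathcal{F}^{(1,2,3)}_{gg}$ distributions are the same as the ones introduced in the previous section in Eqs. (4.15)-(4.17). The remaining three, $\mathcal{F}^{(4,5,6)}_{gg}$, are defined in [11].

Figure 4: Diagrams with 3-gluon vertices for the $gg \to gg$ subprocess. The mirror diagrams of (3), (5) and (6) give identical contributions.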
The associated hard factors are constructed as 2 : The calculation of the gg → gg subprocess requires inclusion of diagrams with four-gluon vertex. Therefore, in general, the expressions D i in the above equations contain contributions from both, the 3-gluon and 4-gluon vertex diagrams, the latter shown in Fig. 5. The corresponding expressions were computed in [13], where they were used to determine the hard factors in the large-N c limit. Below, we generalize the result of Ref. [13] to the case of finite-N c , with the help of the exact definitions given in Eqs. (4.33)-(4.35). The six hard factors read To get further insight into the above results, we have performed an independent calculation in a gauge with non-vanishing 4-gluon vertex contribution, with the axial vectors defined as: n = p for the gluon k , n = k for the gluon p , n = p 2 for the gluon p 1 , n = p 1 for the gluon p 2 . (4.39) The contributions to D i s in this gauge, coming from diagrams with 3-gluon vertices only and depicted in Fig. 4, are given in Table 2.
2 Note that the labeling of the hard factors $H^{(i)}_{gg\to gg}$ differs from that of Ref. [13]. Out of the six hard factors, only three, among them $H^{(6)}_{gg\to gg}$, survive in the large-$N_c$ limit.

Figure 5: Diagrams with a 4-gluon vertex for the $gg \to gg$ subprocess. The mirror diagrams of (8), (9) and (10) give identical contributions.

In order to add the 4-gluon vertex contribution and obtain a full result for the $D_i$ coefficients, let us consider a general 4-gluon amplitude, shown on the left-hand side of Fig. 6. A 3-gluon vertex brings a single SU($N$) structure constant factor. Each amplitude in Fig. 4 consists of two 3-gluon vertices, which results in three possible color factor products, Eq. (4.40), for the amplitudes with a gluon exchange in the $t$-, $s$- and $u$-channels, respectively. Each of these amplitudes can now be written as

$$\mathcal{M}^{3g}_i = c_i\, A^{3g}_i\,, \qquad (4.41)$$

where $i$ is either $t$, $s$ or $u$, $c_i$ is a color factor from Eq. (4.40), and $A^{3g}_i$ is the corresponding kinematic expression. The $3g$ superscript means that only 3-gluon vertices are involved in the given amplitude. Similarly, the conjugate amplitudes, following the notation of Fig. 6, involve the color factors $\bar{c}_i$, Eq. (4.42). That allows us to identify the color coefficients of the 3-gluon diagrams of Fig. 4 and write them in a compact form.

Table 2: Expressions for the $gg \to gg$ subprocess corresponding to diagrams (1)-(6) of Fig. 4, hence containing only 3-gluon vertices, in gauge (4.39) with non-vanishing 4-gluon vertex contributions.
The $\mathcal{O}(\alpha_s^2)$ contributions from diagrams with a 4-gluon vertex are depicted in Fig. 5, where the first row shows the 4-gluon vertex amplitude squared, and the second row gives the interference terms with the three types of $\mathcal{M}^{3g}$ amplitudes from Eq. (4.41). A 4-gluon vertex amplitude contains all three color factor products of Eq. (4.40) at once, Eqs. (4.45)-(4.50). The results for the $D_i$'s in the gauge (4.39) are summarized in Table 3. Plugging those expressions into the hard factor definitions (4.33)-(4.35) leads to results identical to Eqs. (4.36)-(4.38).
We have already seen that not all of the six hard factors that arise in the $gg \to gg$ subprocess are independent. As shown in Eq. (4.35), some of them, including $H^{(6)}_{gg\to gg}$, are linearly dependent. Hence, the cross section for two-gluon production from Eq. (4.29) can be written in a much simpler, factorized form, Eq. (4.52), with only two hard factors and two gluon distributions. In this channel, the new gluon TMDs, $\Phi^{(i)}_{gg\to gg}$, are defined as linear combinations of $\mathcal{F}^{(1)}_{gg}, \ldots, \mathcal{F}^{(6)}_{gg}$, with the associated new hard factors $K^{(i)}_{gg\to gg}$. The explicit expressions are given in Table 1. We note that the above simplification occurs naturally when utilizing gauge invariance from the start, as we will show in Section 6.
Finally, we point out that, in the large-$N_c$ limit, all the distributions that were introduced in this section, up to and including $\mathcal{F}^{(6)}_{gg}$, can be written in terms of $xG^{(1)}$ and $xG^{(2)}$, and equivalence of formulas (4.13), (4.25) and (4.52) with CGC results is obtained [13].
Let us conclude that this part of our work brings two improvements to the current state of the art for the TMD factorization in forward dijet production. First of all, we have obtained finite-$N_c$ corrections to the hard factors of Ref. [13]. More importantly, however, we have eliminated the redundancy in the number of gluon distributions needed to write a factorization formula for this process, which now takes a compact form, Eqs. (4.56), with only two gluon distributions and two hard factors required in each channel. Note that, as we shall discuss now, the incoming, small-$x$ gluon is kept on-shell. Eqs. (4.56) will be further generalized to the case of the off-shell gluon in Section 5.

The $|k_t| \gg Q_s$ limit
Finally, let us consider the limit $|k_t| \gg Q_s$. This is the dilute limit considered in Section 3, with the extra requirement that $|k_t| \ll |P_t|$, needed for the validity of the formulas of this section. In that limit, the transverse separation between the field operators in the definition of the gluon distribution goes to zero, and the gauge links can be dropped. As a result, all the $\mathcal{F}^{(i)}_{ag}$ distributions coincide, except $\mathcal{F}^{(2)}_{gg}$, which vanishes. In terms of the $\Phi^{(1,2)}_{ag\to cd}$ functions, all six distributions also reduce to that one gluon distribution, which can therefore be identified with $\mathcal{F}_{g/A}/\pi$.
Then, for all channels, one can easily sum the surviving hard factors. In terms of diagrams, we always obtain $D_1 + D_2 + 2D_3 + D_4 + 2D_5 + 2D_6$, meaning that we recover the collinear matrix elements. Therefore, we recover the HEF formula (2.9), except that, due to the $|k_t| \ll |P_t|$ limit, the matrix elements are on-shell: the transverse momentum of the incoming gluon, $k_t$, survives only in $\mathcal{F}_{g/A}$. In other words, we recover the standard high-$|P_t|$ limit. In the following section, we shall restore the $k_t$ dependence of the hard factors. This will extend our formulas such that they recover the full HEF formula when the dilute limit is considered. As a result, we will obtain a unified description, valid for generic forward dijet systems with $|p_{1t}|, |p_{2t}| \gg Q_s$, without any additional requirement on the magnitude of the transverse momentum imbalance $k_t$.
We shall now generalize the hard factors that enter the TMD factorization formula (2.10) to the case with one of the incoming gluons being off the mass shell, as illustrated in Fig. 7. As already stated, the motivation for including the off-shellness is to allow for configurations where the dijets are produced at any azimuthal angle (before the application of a jet algorithm, which suppresses very small angles and hence renders the results finite). As can be seen in Fig. 8 (as an example we chose the purely gluonic matrix element, but the same structure occurs for the other channels), the on-shell matrix element misses substantial contributions when the jets are produced at small angles, near $\Delta\phi = 0$, and at small rapidity differences $\Delta Y = |y_1 - y_2|$. In such configurations, the matrix element develops a divergent structure, which is suppressed only by the jet algorithm that has to be applied in order to ensure two-jet configurations [15]. The matrix elements squared we are after, i.e.
gg * → gg, gg * → qq and qg * → qg, can be extracted from the high energy limit (or eikonal limit) of q g → q g g and q g → qq q and q q → q q g. In this approach the quark q is an auxiliary line to which the initial state off-shell gluon g * couples eikonally. The high energy factorization is a direct procedure where one uses the standard Feynman rules for all vertices and color factors, and fixes the light-cone gauge for the on-shell gluons, using a gauge vector given by the longitudinal component of the off-shell, initial-state gluon. In particular, if we apply the high energy factorization to the process we are after, we set the gauge vector to n = p A , where p A is the target four-momentum, as defined in Fig. 1 and Eq. (2.2). Furthermore, the prescription is to associate with the off-shell gluon a longitudinal polarization vector, called nonsense polarization, of the form where λ counts just one longitudinal polarization vector. In the square amplitude, this leads to the polarization tensor of the form [9] 0 µ The √ 2 factor in Eq. (5.1) follows from the convention that allows one to use on-shell-like (two polarizations) averaging over two polarizations. In the above, x 2 = k µ p µ /p Aν p ν , which follows directly from the definition in Eq. where, depending on the channel, q = p, p 1 or p 2 , c.f. Eq. (4.39). Let us note that the procedure outlined above defines the hard process in a gauge invariant manner only when a special choice for polarization vectors of the on-shell gluons is taken. In an arbitrary gauge, for internal and external gluon lines, more sophisticated methods have to be used, see e.g. [31,[40][41][42][43].
To present our results in a compact form, with a direct relation to the on-shell formulas from Section 4, in addition to the standard Mandelstam variables given by Eqs. (2.7), which now, however, sum up to $\hat{s} + \hat{t} + \hat{u} = k_T^2$, we introduce their barred versions, defined only with the longitudinal component of the off-shell gluon, which are related via the equation

$$\bar{s} + \bar{t} + \bar{u} = 0 . \tag{5.5}$$

In the on-shell limit, $k_T^2 \to 0$, the variables defined above recover the standard Mandelstam variables from Eq. (2.7). As a consistency check, we have verified that, for all three subprocesses, the off-shell amplitudes that shall be used to build the hard factors in the remaining part of this section are identical to those first calculated in Ref. [10].
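The two sum rules quoted above can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes the kinematics $p = x_1 p_B$, $k = x_2 p_A + k_T$, the assignment $\hat{s} = (p+k)^2$, $\hat{t} = (p_2 - p)^2$, $\hat{u} = (p_1 - p)^2$, and barred variables obtained by replacing $k$ with its longitudinal part $x_2 p_A$. The function names and the sample kinematic point are hypothetical.

```python
import math

def mdot(a, b):
    """Minkowski product, metric (+,-,-,-); vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def combine(a, b, sign=1.0):
    """Return a + sign * b, component by component."""
    return tuple(x + sign * y for x, y in zip(a, b))

def square(a):
    return mdot(a, a)

def massless(pt, phi, y):
    """On-shell massless momentum from transverse momentum, azimuth, rapidity."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def mandelstam_sums(sqrt_s, pt1, phi1, y1, pt2, phi2, y2):
    """Return (s+t+u with hats, k_T^2, s+t+u with bars) for one dijet point."""
    p1 = massless(pt1, phi1, y1)
    p2 = massless(pt2, phi2, y2)
    # Longitudinal fractions fixed by momentum conservation (assumed setup).
    x1 = (pt1 * math.exp(y1) + pt2 * math.exp(y2)) / sqrt_s
    x2 = (pt1 * math.exp(-y1) + pt2 * math.exp(-y2)) / sqrt_s
    half = 0.5 * sqrt_s
    p = (x1 * half, 0.0, 0.0, x1 * half)        # p = x1 * p_B, on-shell
    k_long = (x2 * half, 0.0, 0.0, -x2 * half)  # x2 * p_A, light-like
    k_T = (0.0, p1[1] + p2[1], p1[2] + p2[2], 0.0)
    k = combine(k_long, k_T)                    # off-shell gluon momentum
    hat_sum = (square(combine(p, k)) + square(combine(p2, p, -1.0))
               + square(combine(p1, p, -1.0)))
    bar_sum = (square(combine(k_long, p)) + square(combine(k_long, p1, -1.0))
               + square(combine(k_long, p2, -1.0)))
    # As a four-vector, k_T squares to minus the transverse momentum squared.
    return hat_sum, square(k_T), bar_sum
```

For any such kinematic point the first two returned values agree and the third vanishes up to rounding, which is the content of $\hat{s} + \hat{t} + \hat{u} = k_T^2$ and of Eq. (5.5). The check does not depend on which final-state momentum is assigned to $\hat{t}$ and which to $\hat{u}$.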
From this point onwards, we shall discuss our results only in terms of the new $K^{(i)}$ hard factors and the new factorization formulas from Eqs. (4.13), (4.25) and (4.52). The results for the old hard factors, $H^{(i)}$, in the off-shell case are given in Appendix A for completeness.

The $qg^* \to qg$ channel
The off-shell hard factors for this channel are obtained using the definitions given in Eq. (4.10) and then Eqs. (4.6) and (4.7). The corresponding $D_i$ expressions are collected in Appendix A in Table 8. The two hard factors read … In the limit $|k_t| \to 0$, the simplification given by Eq. (5.6) occurs and the above formulas manifestly recover the on-shell results from Table 1.

The $gg^* \to q\bar{q}$ channel
The off-shell hard factors are obtained using the definitions given in Eq. (4.28) and then Eqs. (4.18), (4.19) and (4.20). The corresponding $D_i$ expressions are collected in Appendix A in Table 9. The two hard factors take the following compact form … Again, following Eq. (5.6), it is manifest that the above hard factors reduce to those given in Table 1 in the limit $|k_t| \to 0$.

The $gg^* \to gg$ channel
In the gauge chosen for our calculation, all the squared diagrams and interference terms that involve a 4-gluon vertex are identically zero. The corresponding $D_i$ expressions are given in Table 10 of Appendix A. Using the combinations from Eqs. (4.33)-(4.35) and then the definition from Eq. (4.55) leads to the following set of off-shell hard factors … The on-shell limit is again manifest, with the above equations reducing to those from Table 1 as $|k_t| \to 0$.

Helicity method for TMD amplitudes
In the preceding sections, the hard factors accompanying the gluon densities $\mathcal{F}^{(i)}_{ag}$ were calculated from the squared diagrams presented in Figs. 2-5. This procedure has certain drawbacks, especially when one would like to consider more complicated processes. For multiparticle processes, the color decomposition and the helicity method [22,44] are now considered the most effective tools. Moreover, it is not obvious how gauge invariance comes into play for the separate diagrams from Figs. 2-5 contributing to the hard factors. In the color decomposition method, the so-called color-ordered amplitudes are gauge invariant from the start and one can use them directly to construct the hard factors.
In view of the above, and to cross-check the results from Section 5, we will give an alternative procedure to obtain the factorization formulas with an off-shell gluon. To this end, we shall need the TMD gluon densities corresponding to the color decomposition of the amplitudes, and the color-ordered amplitudes themselves.

Color decompositions
Let us recall some basic facts about the color decompositions. We refer to [22,44] for more details.
We first consider a gluon amplitude $\mathcal{M}^{a_1 \ldots a_N}(\varepsilon_1^{\lambda_1}, \ldots, \varepsilon_N^{\lambda_N})$, where $a_1, \ldots, a_N$ are the external, adjoint color quantum numbers, and $\varepsilon_i^{\lambda_i}$ is a polarization vector for a gluon $i$ having momentum $k_i$ and helicity $\lambda_i = \pm$. The fundamental color decomposition reads … Consider now an amplitude involving a quark anti-quark pair, $\mathcal{M}^{D_1 a_2 \ldots a_{N-1} D_N}$, where $D_i$, $D_j$ are the color and the anti-color of the quark and the anti-quark, respectively. The color decomposition reads … Now $\lambda_1$ and $\lambda_N$ are the helicities of the quark and the anti-quark. For amplitudes involving more quark anti-quark pairs the decomposition is more complicated and we refer to [22] for details. It is important to note that the above color decompositions also work for the case when one of the gluons is off-shell.

Gluon TMDs for color ordered amplitudes
Let us now find the gluon TMDs corresponding to the color-ordered amplitudes squared, as defined in the previous subsection. We restrict ourselves to the $2 \to 2$ processes considered in this paper.
Let us first consider the $g(k_4)\, g^*(k_1) \to g(k_3)\, g(k_2)$ process. For the purpose of this and the next subsection we have assigned a new set of momenta to the partons. This assignment differs from the one used before, but it is more convenient when dealing with color-ordered amplitudes. The correspondence is achieved by the following relations: … Moreover, for the off-shell momentum we adopt a notation … where $n_1$ is placed for the off-shell gluon instead of a polarization vector (in fact it plays a similar role). As far as dual amplitudes are concerned, we indicate the off-shell gluon by a star. The gluon TMDs that correspond to the color structures exposed in (6.5) (after squaring) were calculated in [11] (by TMDs we mean here linear combinations of the elementary correlators $\mathcal{F}^{(i)}_{gg}$) and are given in rows 1 and 3 of Table 8 of [11], which defines six different gluon TMDs. From the color decomposition it follows, however, that they are the only relevant TMDs and all the others are in fact redundant. These TMDs contain all the necessary information and correspond to the two independent gauge invariant amplitudes squared and their interference, as summarized in Table 4.

Table 4: Gluon TMDs accompanying the color-ordered amplitudes for the $gg^* \to gg$ process (columns: color-ordered amplitude squared, gluon TMD). It has been assumed that the TMDs are real. The $\mathcal{F}$ …

Now, let us turn to the $g(k_4)\, g^*(k_1) \to q(k_3)\, \bar{q}(k_2)$ process. The color decomposition reads … The gluon TMDs corresponding to the color structures appearing after squaring this equation are gathered in Table 5. They correspond to rows 1 and 5 of Table 7 in [11]. Again, only two independent TMDs are needed.

For the process $q(k_4)\, g^*(k_1) \to q(k_3)\, g(k_2)$, the color decomposition reads … For anti-quarks we need to exchange the indices $3 \leftrightarrow 4$. The TMDs corresponding to those processes are given in Table 6.
In general, the TMDs for a sub-process with anti-quarks are different from those for quarks, but they turn out to be the same assuming that the correlators are real. Again, we end up with only two independent TMDs.

Off-shell color-ordered helicity amplitudes
In Section 5, we calculated the off-shell hard factors using the high energy projector (5.1), which, together with an axial gauge (for internal lines and external gluons), ensured that they were gauge invariant despite the off-shellness (the gauge invariance issues are discussed in [30]). There are also methods to calculate gauge invariant off-shell amplitudes in any gauge and for any choice of polarization vectors [31,42,43]. In what follows, we shall use the results of [31,43].
Consider first the gluon amplitudes. For the purpose of this section only, we assume all momenta to be outgoing. For the non-vanishing helicity configurations, in the helicity basis, we have

$$\mathcal{M}_{g^*g\to gg}(1^*, 2^-, 3^+, 4^+) = 2g^2 \rho_1 \frac{\langle 1^* 2\rangle^4}{\langle 1^* 2\rangle \langle 23\rangle \langle 34\rangle \langle 41^*\rangle} , \tag{6.8}$$

$$\mathcal{M}_{g^*g\to gg}(1^*, 2^+, 3^-, 4^+) = 2g^2 \rho_1 \frac{\langle 1^* 3\rangle^4}{\langle 1^* 2\rangle \langle 23\rangle \langle 34\rangle \langle 41^*\rangle} , \tag{6.9}$$

$$\mathcal{M}_{g^*g\to gg}(1^*, 2^+, 3^+, 4^-) = 2g^2 \rho_1 \frac{\langle 1^* 4\rangle^4}{\langle 1^* 2\rangle \langle 23\rangle \langle 34\rangle \langle 41^*\rangle} , \tag{6.10}$$

where we adopted a shorthand notation for the spinor products, $\langle ij\rangle = \langle k_i - | k_j + \rangle$ with $|k_i \pm\rangle = \frac{1}{2}(1 \pm \gamma_5)\, u(k_i)$, and where $\rho_1$ is a phase factor that is irrelevant for our purposes (see details e.g. in [43]). We also defined $\langle 1^* i\rangle = \langle n_1 i\rangle$, with $n_1$ being the longitudinal component of $k_1$, cf. Eq. (6.4). The remaining helicity configurations can be obtained from Eqs. (6.8)-(6.10) using CP invariance and so on. For the other color-ordered amplitude, $\mathcal{M}_{gg^*\to gg}(1^*, 3, 2, 4)$, we need to exchange $2 \leftrightarrow 3$ in the denominators.

The above helicity amplitudes can be efficiently evaluated and squared numerically; however, for the purpose of this paper we shall need analytic expressions. To this end, let us introduce $[ij] = \langle k_i + | k_j - \rangle$, which, up to an unimportant phase, is the complex conjugate of $\langle ij\rangle$. Moreover, we have the following relation

$$\langle ij\rangle [ji] = (k_i + k_j)^2 \equiv s_{ij} . \tag{6.12}$$

For the products involving $n_1$ we use the notation … With this, we get for the required amplitudes squared, summed and averaged over helicities, … for the pure gluon channel, and … for the $gg^* \to q\bar{q}$ channel. For the $qg^* \to qg$ sub-process we need to use the crossing symmetry as described in the preceding section. We have … (6.32), (6.33). In all the formulas above, the first color factor comes from color averaging. The minus signs in front of the amplitudes in (6.32), (6.33) come from the crossing of a fermion line. Table 7 is easily recovered using the following relations of $s_{ij}$ to the kinematic variables from Section 5:

$$s_{23} = s_{14} = \hat{s}, \qquad s_{34} = s_{12} = \hat{t}, \qquad s_{24} = s_{13} = \hat{u} . \tag{6.34}$$
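The relation between spinor products and invariants, Eq. (6.12), can be illustrated numerically for on-shell massless momenta. The sketch below is only an illustration under stated assumptions: it uses one common explicit representation of $\langle ij\rangle$ in terms of light-cone components (a convention chosen here, not taken from this paper), and it defines $[ji]$ as the complex conjugate of $\langle ij\rangle$, which fixes the "unimportant phase" by hand. The function names are hypothetical. Since this representation divides by the transverse momentum, it does not cover the purely longitudinal vector $n_1$, which would need separate treatment.

```python
import math

def mdot(a, b):
    """Minkowski product, metric (+,-,-,-); vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def massless(pt, phi, y):
    """On-shell massless momentum from transverse momentum, azimuth, rapidity."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def light_cone(k):
    """Return (k+, k-, exp(i*phi)) for a massless k with non-zero k_t."""
    plus, minus = k[0] + k[3], k[0] - k[3]
    phase = complex(k[1], k[2]) / math.sqrt(plus * minus)
    return plus, minus, phase

def angle_product(ki, kj):
    """<ij> in the assumed light-cone representation."""
    i_plus, i_minus, i_phase = light_cone(ki)
    j_plus, j_minus, j_phase = light_cone(kj)
    return (math.sqrt(i_minus * j_plus) * i_phase
            - math.sqrt(i_plus * j_minus) * j_phase)

def square_product(kj, ki):
    """[ji], defined here as the complex conjugate of <ij> (phase choice)."""
    return angle_product(ki, kj).conjugate()

def invariant(ki, kj):
    """s_ij = (k_i + k_j)^2 = 2 k_i.k_j for massless momenta."""
    return 2.0 * mdot(ki, kj)
```

With these definitions, `angle_product(ki, kj) * square_product(kj, ki)` reproduces `invariant(ki, kj)` up to rounding, and `angle_product` is antisymmetric under exchange of its arguments.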

Conclusions and outlook
Dijet production is one of the key processes studied at the LHC. Requiring the two jets to be produced in the forward direction creates an asymmetric situation, in which one of the incoming hadrons is probed at large x, while the other is probed at a very small momentum fraction. This kinematic regime poses various challenges, one of the biggest questions being the existence of a theoretically-consistent and, at the same time, practically-manageable factorization formula. The standard collinear factorization is not applicable in this case as the dependence on the transverse momentum of the low-x gluon in the target, k t , cannot be neglected.
In the limit where the jets' transverse momenta satisfy $|p_{1t}|, |p_{2t}| \gg |k_t| \sim Q_s$, with the latter being the saturation scale of the target, an effective transverse-momentum-dependent factorization formula for forward dijet production has been derived in Refs. [12,13] and it has been shown to be consistent with the CGC framework. On the other hand, the high energy factorization approach [9,10] has also been successfully applied to forward dijet production at the LHC. In this paper, we have examined the theoretical status of the HEF approach in the context of forward dijet production at hadron colliders and reconciled it with the TMD factorization by creating a unified framework valid in the limit $|p_{1t}|, |p_{2t}| \gg Q_s$ with an arbitrary value of $|k_t|$, as long as it is allowed by phase space constraints. In particular, we have shown in Section 3 that the HEF formula is indeed justified in the kinematic window $|p_{1t}|, |p_{2t}| \sim |k_t| \gg Q_s$, where it was explicitly derived from the CGC for all $2 \to 2$ channels. This limit corresponds to the dilute target approximation, hence no non-linear effects are expected.
The second major result of our work is an improvement of the effective TMD factorization for forward dijet production, first derived in Ref. [13], by taking into account in Section 4 all finite-$N_c$ corrections, as well as generalizing the factorization formula to the case with an off-shell incoming gluon in Sections 5 and 6. In addition, we were able to simplify the TMD factorization formula by reducing the number of gluon distributions to two independent TMDs for each channel. The main results of this part of our study are summarized in Eq. (6.27), which gives the new TMD factorization formula, as well as in Table 7, where we collect all the off-shell hard factors. The corresponding gluon distributions are given in Tables 4, 5 and 6. The above results were obtained with two independent techniques: a traditional Feynman diagram approach and helicity methods with color-ordered amplitudes. The improved TMD factorization formula (6.27) encapsulates both the result of Ref. [13] and the HEF framework as its limiting cases.
The results obtained in this paper open several avenues for future research that we plan to follow. First, a natural next step will be to use Eq. (6.27) for phenomenological studies. That will require some input for the six gluon TMDs $\Phi^{(1,2)}_{ag\to cd}(x, k_t)$, which may be difficult in the general case. But in the large-$N_c$ limit, they can all be written in terms of just two functions, $xG^{(1)}(x, k_t)$ and $xG^{(2)}(x, k_t)$, which in turn can be evaluated within certain models, as in [4].
Another possible extension of our framework is to supplement it with high-$|P_t|$ effects such as Sudakov logarithms or coherence in the evolution of the gluon density. Essentially, this can be done by adding a $\mu^2$ dependence to the unintegrated gluon distributions [25-27, 46, 47]. The equations that combine such effects with the small-$x$ evolution [48,49] show a nontrivial interplay between the non-linearities and the $\mu^2$ dependence and this may, in particular, weaken the saturation effects. At the linear level, the so-called single step inclusion of the hard-scale effects (as demonstrated in [16]) helps in the description of forward-central dijet data, therefore this direction seems to be relevant in order to provide complete predictions. Furthermore, first estimates of azimuthal decorrelations of the forward-forward dijets in the HEF framework, with inclusion of hard scale effects and non-linearities, show that they are of similar relevance for this process [29].
Last but not least, it remains to be proved that the large logarithms generated by higher-order corrections can indeed be absorbed into evolution equations for the various parton distributions (and jet fragmentation functions) involved, and potentially for additional soft factors [50]. This limitation, however, is not specific to our work; the same is true at the level of the TMD and HEF regimes independently. In the former case, it is known that TMD factorization generically does not apply for dijet production in hadron-hadron collisions [18,20]. It is nevertheless expected that, in dilute-dense collisions, initial state interactions originating from a dilute hadron do not interfere with the intrinsic transverse momentum and thus factorization may hold, although there is no formal proof of this statement yet.
In addition, even though it was possible to write formula (4.56) in terms of just two TMDs per channel, this simplification may not survive after small-$x$ evolution is included, as, in general, the non-linear equations mix the original $\mathcal{F}^{(i)}_{ag}$ functions. For instance, $xG^{(1)}$ does not obey a closed equation and, contrary to what happens with $xG^{(2)}$, the large-$N_c$ limit does not help [51]. We note that any equivalent linear combination of the gluon distributions, such as (2.10) and (4.56), is equally valid, and it may turn out that some alternative choice allows one to write the evolution equations directly in terms of TMDs. By contrast, it is also possible that the inclusion of small-$x$ evolution can only be achieved within the full complexity of the CGC, meaning that the $Q_s \sim |k_t| \ll |P_t|$ limit, which allows one to avoid the quadrupole operator in (3.10) and express the cross section in terms of gluon distributions, may not help when small-$x$ evolution is considered.
In the HEF regime, the issues are different. The $Q_s \ll |k_t| \sim |P_t|$ limit makes things simpler from the point of view of small-$x$ evolution, since non-linear effects can be neglected. However, the off-shellness of the hard process is not neglected and thus the standard power counting of the twist expansion becomes useless. One must then resort to different methods, such as those of Ref. [52]. Any progress towards an all-order proof of either HEF or TMD factorization for forward dijet production in dilute-dense collisions will naturally carry over to our improved TMD factorization formula (6.27) that combines both regimes. In the meantime, our results represent a viable alternative to CGC calculations, equivalent to them in the kinematic regime appropriate for dijets, $Q_s \ll |P_t|$, but more practical.

Acknowledgments
The work of K.K. has been supported by Narodowe Centrum Nauki with Sonata Bis grant DEC-2013/10/E/ST2/00656. P.K. acknowledges the support of the grants DE-SC-0002145 and DE-FG02-93ER40771. S.S. acknowledges useful discussions with Gavin Salam and Fabrizio Caola. P.K., K.K., S.S. and A.vH. are grateful to École Polytechnique for hospitality, where part of this work has been carried out. K.K. thanks Penn State University, where part of this research was done, for its hospitality.

A Off-shell expressions
In this appendix, we gather all expressions corresponding to the $D_i$ diagrams from Figs. 2-5 in the case where one of the incoming gluons is off-shell. All calculations were performed in the axial gauge discussed at the beginning of Section 5, with the axial vectors for the on-shell gluons set according to Eq. (4.39). For completeness, we also give here the results for the "old" hard factors defined in Eqs. (4.6), (4.7), (4.18), (4.19), (4.20), (4.33), (4.34) and (4.35), in the case with an off-shell incoming gluon. Table 8 gives the $D_i$ expressions for the subprocess $qg^* \to qg$. The two hard factors in this channel read … In the limit $|k_t| \to 0$, the simplification given by Eq. (5.6) occurs and the above formulas manifestly recover the on-shell results from Eqs. (4.8) and (4.9).