On the approaches to threshold resummation of rapidity distributions for the Drell-Yan process

We consider threshold resummation of rapidity distributions, for which various approaches exist in the literature. Recently, a work by Lustermans, Michel, Tackmann suggested that older approaches by Becher, Neubert, Xu (BNX) and Bonvini, Forte, Ridolfi (BFR) were wrong because they miss some leading power contributions at threshold. In this work, we prove and demonstrate that the BNX and BFR approaches are correct and able to resum threshold logarithms to leading power accuracy. We then show that the BNX and BFR approaches can provide rather good alternatives to more modern approaches to threshold resummation of rapidity distributions, provided the threshold logarithms are resummed according to the ψ\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\psi $$\end{document}-soft definition introduced in the context of Higgs production.


Introduction
The study of hadron-hadron collisions plays a fundamental role in the understanding of the Standard Model of particle physics and in the searches for new physics signals beyond it.Precision is the keyword to fully exploit the potential of hadron-hadron collider machines such as the Large Hadron Collider (LHC).Precision must be achieved both at the experimental level and in theoretical predictions.The latter are the subject of this paper.
Quantum Chromodynamics (QCD) is the main player in the theoretical description of hadronhadron collisions.Physical observables at high energy are computed through perturbation theory.However, in some cases, fixed-order perturbative predictions are not accurate enough.Indeed, perturbative computations may depend on some logarithms of kinematic origin, and in some kinematic regions these logarithms can be large and enhanced at every order, thus spoiling the perturbative result.In such cases, the all-order resummation of these logarithmic contributions is needed to stabilise the perturbative result.It is worth noting that, in some intermediate regions, the same logarithms may be small enough to behave perturbatively but still large enough to be the dominant contribution to the cross section.In such cases, the resummation of these contributions is not strictly necessary, but it helps improving the accuracy of the theoretical prediction.
In this work we consider the threshold region, usually defined by the limit in which the invariant mass of the tagged final state is close to the centre-of-mass energy of the initial state.In this limit, most of the available energy flows into the considered final state, so that any extra radiation has to be soft.It is this soft radiation that produces the threshold logarithms whose resummation is the subject of this paper.
The resummation in the threshold limit is widely discussed in the literature, for a large variety of processes.The case of the Drell-Yan (DY) pair production is particularly interesting since it

Threshold resummation of Drell-Yan rapidity distributions
We focus on the Drell-Yan process, namely the production of a lepton-antilepton pair in hadronhadron collisions.We denote by Q 2 the invariant mass squared of the pair, and by s the collider centre-of-mass energy.We introduce the variables where ŝ = x 1 x 2 s is the partonic centre-of-mass energy, namely the energy of the subsystem identified by the two partons coming from each hadron, and x 1,2 are the partons momentum fractions.
According to the QCD factorization theorem, the Drell-Yan cross section differential in invariant mass squared Q 2 and rapidity Y of the pair can be expressed as where σ 0 is the Born term (given e.g. in Ref. [16]), C ij (z, u, α s ) are the partonic coefficient functions computable in perturbation theory and L ij (z, u) are non-perturbative parton luminosities.We use the variable u in place of the more common parton-level rapidity y (namely the rapidity of the pair defined in the partonic centre-of-mass system) for later convenience.The two are related by the equations which show that the variable u ranges from 0 to 1, as a consequence of the physical constraint ze 2|y| ≤ 1.The parton luminosities are defined as the product of two parton distribution functions where the coefficients c ij depend on the vector boson mediating the production of the pair (see e.g.Ref. [16] for a general definition) and the momentum fractions x 1,2 are given in terms of z, u, τ and Y by ) The physical constraint x 1,2 ≤ 1 restricts the actual integration range of u, giving effectively max 0, τ e −2Y − z 2 (1 − z)(τ e −2Y + z) ≤ u ≤ min 1, z(1 − τ e 2Y ) (1 − z)(τ e 2Y + z) . (2.6) For a lighter notation, we are omitting the explicit dependence on Y , τ and on the factorization scale µ F of the luminosity, which is always implicitly understood.Similarly, we are not showing the dependence on the factorization and renormalization scales of the partonic coefficient function, as their dependence is not central in our discussion.

The threshold limit(s)
We now consider the threshold limit.As already mentioned in the introduction, this is the limit in which the available energy is just enough to produce the tagged final state.Namely, Drell-Yan is at threshold when the collider energy √ s is close to the invariant mass Q of the lepton pair, or in terms of the variables defined in Eq. (2.1) when τ → 1 (from below, as τ has to be smaller than 1 by definition).This limit, called henceforth the hadronic threshold limit, is not very interesting phenomenologically because the cross section is small (at τ = 1 it becomes identically zero).
A more interesting limit is the so-called partonic threshold limit, where it is the parton level centre-of-mass energy √ ŝ which is close to Q, namely z → 1.In the cross section formula Eq. (2.2), the variable z is integrated over in the range τ ≤ z ≤ 1.Consequently, if τ is close to 1 then also z is forced to be close to 1, namely hadronic threshold implies partonic threshold.However, the converse is not true: even far from hadronic threshold (which is the phenomenologically interesting region for Drell-Yan) contributions from the partonic threshold region z → 1 are always present, as the integral always extends to z = 1.Most importantly, the partonic threshold region often dominates the integral, due to the shape of the PDFs that act as a weight favouring large values of z even when τ is small.This phenomenon is sometimes called dynamical threshold enhancement [11], and it has been quantified through a saddle point argument in Mellin space [12,17] both in the context of Drell-Yan and of Higgs production.
The partonic threshold region is relevant also because the threshold logarithms mentioned in the introduction appear in the perturbative computations as logarithms in the variable z.More specifically, the partonic coefficient function in the q q channel develops contributions of the form (the + suffix denotes the usual plus distribution) at order n in the strong coupling α s .Because the highest power of the logarithm grows with the order, with two extra power for each extra order, the coefficient function is said to be affected by a double logarithmic enhancement.Clearly, in the partonic threshold limit, these enhanced logarithms spoil the reliability of a fixed-order computation and need to be resummed to all orders.On top of these singular contributions in the z variables there are other plus distributions and Dirac delta terms in the variable u, that are singular in u = 0 or u = 1.These contributions are not directly related to the threshold region, but they play a role in the accurate description of the parton-level coefficient function at the partonic rapidity endpoints.We will come back to this point later in this section.
Because threshold logarithms appear only in the q q channel,1 from now on we consider only its contribution to the cross section.We write the q q contribution to Eq. (2.2) as in terms of the "total" q q luminosity defined by L q q (z, u) = q c q q f q x 1 , µ 2 F f q x 2 , µ 2 F (2.9) and where we have removed the subscript q q from the coefficient function C(z, u, α s ) to keep the notation light.
As long as rapidity-integrated distributions are concerned (e.g. the invariant mass distribution or the total cross section) the definition of the threshold logarithm Eq. (2.7) is unique.However, when considering rapidity distributions, it is possible to distinguish between the logarithms originating from each incoming quark.To do so, it is more convenient to use a different set of variables: in place of z, u (or z, y) one can use z a , z b related to the former by which in turn gives z = z a z b .Each of these variables is related to each incoming parton.In particular, the momentum fractions x 1,2 defined in Eq. (2.5) are given by .11a) with x a x b = τ .In terms of these variables Eq. (2.8) takes the form of a double Mellin convolution, where The coefficient function in terms of these variables contains double logarithms of the form Eq. (2.7) but in the variables z a and z b separately: These are related to the threshold logarithms in the variable z, but the conversion is not straightforward, as it involves also the u dependence (see Ref. [14] for more detail).For completeness, we report in appendix A.1 the NLO contribution to the coefficient function, written with the two choices of sets of variables.When using these parton-specific variables, there are two regions that generate large logarithms: z a → 1 and z b → 1.The threshold region discussed before, z → 1, coincides with the overlap of the two regions z a → 1 and z b → 1, because of the relation z = z a z b .However, if just one of these two variables is large, say z a → 1, and the other is not large, z b ≪ 1, then the variable z is also not large and so the threshold logarithms Eq. (2.7) are harmless, but there are large logarithms in the coefficient functions (those from z a ) which may spoil perturbativity.This mechanism can be understood in terms of the z, u variables noting that there are other large contributions in the coefficient function coming from the u dependence, and in particular at the edge of the range of definition of u.Indeed, it is immediate to see from Eq. (2.10) that z a → 1 corresponds to u → 0, and z b → 1 corresponds to u → 1.These singular contributions from the u dependence are enhanced at the partonic rapidity endpoints, as one can see from Eq. (2.3), and are therefore relevant to describe the tails of the rapidity distribution.
These considerations show that there is a region, larger than the partonic threshold region previously defined, where logarithms of the form Eq. (2.14) may become large, possibly spoiling the reliability of the perturbative result.Adopting a notation introduced in Ref. [14], we may call it generalized partonic threshold region.We can see this as the partonic version of the generalized hadronic threshold region of Ref. [14], identified by the condition τ e 2|Y | → 1 (or equivalently either x a or x b close to 1, depending on the sign of Y ), which corresponds to the tails of the rapidity distribution irrespectively of the value of τ .In other words, it corresponds to the region in which the production of the Drell-Yan pair is at threshold at a given value of the rapidity Y .In this region, either z a or z b is forced to be large, but the other variable can take any accessible value.Therefore, the generalized hadronic threshold limit implies the generalized partonic threshold limit.It is worth noting that in the stronger hadronic threshold limit τ → 1, both z a and z b are forced to be large, and therefore this generalized partonic threshold region coincides with the partonic threshold region z → 1.

Threshold resummation
Large logarithms in the coefficient function need to be resummed in order to obtain reliable predictions.In the literature there exist at least two families of approaches: one aiming at resumming the logarithms in the variable z, Eq. (2.7), and one aiming at resumming the logarithms in the variables z a and z b , Eq. (2.14).As we stressed already, the second family is more powerful as it resums more terms than what is resummed in the first family, making the resummed result useful in a wider kinematic range.
To our knowledge, there exist four approaches to resum threshold logarithms in rapidity distribution for the Drell-Yan process.We present them in chronological order of appearance.
• The approach of Becher, Neubert, Xu (BNX henceforth) [11], belonging to the first family.It is based on the observation that the u dependence in the PDF luminosity Eq. (2.9) is nextto-leading power at large z, Eq. (2.5), which allows to write the leading power contribution in terms of the rapidity-integrated coefficient function, whose resummation is well known.• The approach of Bonvini, Forte, Ridolfi (BFR henceforth) [12], belonging to the first family.It is similar to the BNX approach, but the derivation is based on a different argument in Mellin-Fourier space [18], and extends the older result of Ref. [19].The resulting resummation formula differs from BNX, but it has been shown to be equivalent up to next-to-next-to-leading power in 1 − z.
• The approach of Banerjee, Das, Dhani, Ravindran (BDDR henceforth) [13] (see also [20]), belonging to the second family.This is a two-scale extension of the original approaches to rapidity-integrated resummation [5,6], already introduced for x F distributions in Ref. [5] and later converted to rapidity distributions in Refs.[21][22][23][24][25], where a double Mellin transform is taken with respect to the two variables z a and z b and logarithms of the product of the two Mellin conjugate variables are resummed to all orders.
• The approach of Lustermans, Michel, Tackmann (LMT henceforth) [14], belonging to the second family.It uses the framework of soft-collinear effective theory to factorise the cross section in terms of beam functions and a soft function, enabling the resummation of threshold logarithms after solving renormalization group equations that they obey.It is designed to be valid in the generalized threshold region, thus enlarging its kinematic range of applicability.
We stress that the BDDR approach has been recently extended [15,[26][27][28] to resum next-to-leading power contributions to the dominant q q channel, namely those suppressed by one power of 1 − z a or 1 − z b with respect to the leading power logarithms Eq. (2.14).This extension allows to enlarge the region where threshold contributions dominate, and is therefore very useful to obtain reliable predictions close to threshold and more precise predictions even far from threshold, even though for achieving higher accuracy the next-to-leading power contributions from other channels, i.e. the qg channel, should be included as well.
We also stress that the LMT approach already captures the subleading power contributions predicted in Refs.[15,[26][27][28].Indeed, it is constructed to resum the leading power terms in one of the two variables, say z a , with full dependence on the other variable, z b , and vice versa.Therefore, it contains contributions suppressed with any power of 1 − z b which multiples the leading-power terms in z a , and vice versa. 3Moreover, the LMT approach is able to predict also the off-diagonal qg channel at this accuracy.We recall that the current LMT work [14] presents only the analytic structure of this resummation but it contains no all-order numerical results yet.
The LMT paper, on top of proposing the virtually best approach to resum threshold logarithms in rapidity distributions, criticises the BNX and BFR approaches, stating explicitly that they are wrong as they miss leading power contributions in 1 − z.We disagree with this criticism, and we will show in the next section that the BNX and BFR approaches, which have been used in the literature [29][30][31][32], are perfectly consistent within their region of validity.

On the validity of BNX and BFR
In this section we focus on the approaches of BNX and BFR, whose validity has been criticised in Ref. [14].We will present a detailed proof valid for both approaches, that also reveals the expected quality of the threshold approximation at the core of these approaches.We discuss possible caveats, and finally comment on the arguments of Ref. [14] against the validity of BNX and BFR.
Before moving to this, we recall the explicit expressions of BNX and BFR resummation, also to establish our notation.When using the variables z and u, in the partonic threshold region z → 1 the coefficient function can be expanded as (we omit the argument α s from now on to emphasise the dependence on the other variables) where C thr (z, u) is defined to contain all leading power threshold contributions, namely the plus distributions Eq. (2.7) and delta functions of 1 − z. 4 In the BNX approach the u dependence of the coefficient function at threshold is further approximated as while BFR find where C thr (z) is the rapidity-integrated coefficient function at threshold, reported at fixed order in appendix A.2. Indeed, it is immediate to see that the integral over u of both expressions gives exactly C thr (z).This coefficient function is then the threshold limit of the well known inclusive coefficient function, whose resummation has been studied for decades [5][6][7][8][9][10].We stress that in the original BNX and BFR approaches two different methods for resumming C thr (z) were considered, but this is immaterial as any other choice is possible.Therefore, for our purposes, we refer to BNX and BFR as the two formulas Eq. (3.2) and (3.3), irrespectively of how C thr (z) is resummed to all orders.

Proof of BNX and BFR
The approximations Eq. (3.2) and Eq.(3.3) are not particularly significant written in that way.Indeed, the u dependence of the coefficient function is distributional, and it is clear for instance that each equation cannot be seen as an approximation of the other, namely there is no sense in which δ(u) ).The meaning of Eq. (3.2) and Eq.(3.3) resides in the way they appear in the cross section formula, Eq. (2.8), where also the parton luminosity is present and plays a crucial role in constructing the BNX/BFR approximations.We now show this in detail.
The key observation is the fact that the PDFs become independent of u in the partonic threshold limit z → 1.This can be seen by expanding the arguments of the PDFs, Eq. (2.5), in powers of 1 − z: Consequently, the luminosity Eq. (2.9) can be expanded as with L ′ q q the derivative with respect the z variable, where the first term which is the luminosity computed in z = 1 is independent of u, When we will want to emphasise this independence, we will write it as L q q (1, •).Note that the second term of Eq. (3.5) can be considered as truly suppressed by a power of 1 − z if L ′ q q (1, u) is of the same size of L q q (1, •).This is not always the case.We will investigate numerically the size of L ′ q q (1, u), which may potentially be a limiting factor of our proof of BNX and BFR, in section 3.2.
The fact that the u dependence of the luminosity is power suppressed in the partonic threshold limit z → 1 shows that the u dependence of the coefficient function can be integrated over at leading power.To formally prove this, we consider the cross section formula Eq. (2.8) and we expand the luminosity at threshold, along the lines of Ref. [11]: Here we have first used Eq.(3.5) to expand the luminosity at large z, then we used the uindependence of the luminosity in z = 1 to pull it out of the integral, we then computed the u integral to obtain the rapidity-integrated coefficient function C(z). 5 In the penultimate line we restored the z dependence of the luminosity, which is again legitimate up to O(1 − z) thanks to Eq. (3.5), but in doing so we also need to restore the u dependence in the second argument.As u has been integrated over, any value ū between 0 and 1 is formally acceptable.Finally, in the last step we have further approximated C(z) with its threshold limit.We stress that in Eq. (3.7) we have expanded in powers of 1 − z selected terms of the entire integrand.This is in contrast with the standard approach of threshold resummation [5,6], where only the coefficient function is expanded.Keeping the parton luminosity unexpanded is certainly more natural and ensures for instance that kinematic constraints coming from the limits of PDF arguments are preserved by the resummed expression.However, from a formal point of view, it is clearly allowed to also include the parton luminosity in the threshold expansion.
The selection of terms that are expanded in Eq. (3.7) is obviously arbitrary but legitimate according to the power counting in 1 − z, and it is made in a way to reproduce BNX and BFR as we shall see.
In the derivation of Eq. (3.7) we have made an assumption that is not entirely trivial.In particular, in the second step we have assumed the identity (3.8) As mentioned above, this assumption is correct if L ′ q q (1, u) is of the same size of L q q (1, •), which is not always the case.By comparing numerically the size of the two functions, we will see in section 3.2 that this assumption is satisfied in a wide kinematic range.In particular, at small τ and not too close to the rapidity endpoints, which is the phenomenologically relevant region, the assumption Eq. (3.8) is valid and Eq.(3.7) is correct.
Note that the contributions of O(1 − z) in the integrand of Eq. (3.7) may be large, as the integral extends down to z = τ and if τ is small there is a part of the integration region in which these corrections are not suppressed.This fact does not invalidate the derivation of Eq. (3.7), but poses questions about the quality of an approximation obtained neglecting those subleading power terms.We will see in section 3.2 that the approximation for the dependence on the luminosity is rather good even for small values of τ .The impact of approximating C(z) with C thr (z) in the last step of Eq. (3.7) depends on the form of threshold logarithms used to construct it, and it will be discussed in section 4.
Having established Eq. (3.7), we can obtain the BNX and BFR formulation by neglecting the subleading power contributions of O(1 − z) and for suitable choices of the variable ū.In particular, we obtain the BFR formulation by choosing ū = 1 2 , while the BNX one corresponds to the average of the result with ū = 0 and ū = 1: It is immediate to verify that these are indeed the results obtained by using Eq.(3.2) and Eq.(3.3) into Eq.(2.8).In order to use these equations for resummation, the function C thr (z) needs to be resummed to all orders in the threshold limit.The resummed result should then be matched to fixed-order computations, which shall not be approximated.The quality of the resummed and matched result will also depend on whether the resummed threshold logarithms are the dominant part of the higher orders or not: we will address this question in section 4. Note that BNX and BFR are just two of infinitely many possible alternative and equivalent formulations of resummation, which can be obtained by using different values of ū and averages thereof.We notice however that not all combinations make physical sense.Indeed, the rapidity distribution for the Drell-Yan process in proton-proton collisions is forward-backward symmetric, but the luminosity L (z, u) is not symmetric for Y → 1 − Y unless u = 1/2.More precisely, the luminosity Eq. (2.9) is symmetric under the exchange of x 1 and x 2 , which can in turn be realised by changing Y → −Y and u → 1 − u simultaneously, see Eq. (2.5).Therefore, only symmetric sums L (z, u) + L (z, 1 − u) are symmetric in Y .Since the dependence on the rapidity Y of Eq. (3.7)only comes from the luminosity, the general physically acceptable resummed expression is given by where any value of ū in the allowed range 0 ≤ ū ≤ 1 is acceptable.Also any weighted average of results with different ū provides a valid formulation of resummation.Obviously, BNX and BFR are two (maximally different) special cases of Eq. (3.11).
We recall that the integrands of the BFR and BNX expressions differ effectively by O[(1 − z) 2 ] terms, as it was noticed in Ref. [12].This fact does not mean that the accuracy of these two formulations is higher, it just tells us that the two approaches are more similar than expected from the accuracy of the derivation.We will see this explicitly in the numerical results of the next sections.In fact, it is easy to prove that any symmetric average of the form Eq. (3.11) with any value of ū is equivalent to BFR and BNX up to O[(1 − z) 2 ].

On the validity of the threshold expansion in the integrand
The derivation of the BNX and BFR results in Eq. (3.7) involves a number of assumptions and expansions whose validity we now address.
We have already commented that the derivation of Eq. (3.7) assumes the equality Eq. (3.8), which in turn is valid if L ′ q q (1, u) is of the same size of L q q (1, •).This is usually the case, but when the PDFs are computed close to their endpoint x = 1, which happens when τ e 2|Y | is close to 1, this assumption is no longer valid.We now investigate this effect in detail.
Let us focus on the case in which τ is close to 1 (the case in which τ is not large but the rapidity is large is similar but slightly more complicated to describe analytically).In this case, the identity Eq. (3.8) is no longer valid.We can see this analytically, by approximating the PDFs as for some positive value of α, which is a good approximation at large x.Considering a single flavour for simplicity (or equivalently assuming that the same value of α holds for all quark PDFs), the derivative of the luminosity can be written as It is clear that at large τ e 2|Y | → 1 one of the two denominators becomes parametrically small, and thus for some values of u the derivative becomes parametrically larger than the luminosity.This becomes even clearer at central rapidity Y = 0, where the denominator clearly enhances the derivative with respect to the luminosity for any value of u in the τ → 1 limit.In the derivation of Eq. (3.7), therefore, when we assume that τ is close to 1 which in turn implies that 1 − z is of the same order as 1 − τ , the term −L ′ q q (1, u)(1 − z) is not subleading power with respect to L q q (1, •) in Eq. (3.5) and the expansion in Eq. (3.7) is not accurate.
To appreciate the size of this effect we plot in figure 1 the ratio of the derivative of the true luminosity over the luminosity in z = 1 as a function of τ for different values of Y = 0, Y max /2, 2Y max /3 and u = 0, 1/2, 1, with Y max = 1 2 log 1 τ , using the PDF4LHC21 NNLO PDF set [33] and assuming photon-mediated Drell-Yan production at LHC √ s = 13 TeV.At central rapidity Y = 0 even with real PDFs we see that the luminosity is almost independent of u, and we observe that for a wide range of τ values 10 −4 < τ < 0.1 this ratio is of O(1), making the proof of the previous section valid for these kinematics.We see however a clear growth of this ratio going towards large τ , confirming that the assumption Eq. (3.8) breaks down at some point when τ is too large.Increasing the rapidity, the situation gets worse and the ratio becomes larger at smaller values of τ for some values of u (in this case, having used positive rapidity, the largest effect is at u = 1).Despite this deterioration, we notice that in the phenomenologically interesting region of mid-low τ this ratio remais of O(1) even at large rapidities.
We can thus conclude that the BNX and BFR approaches are formally valid in a restricted kinematical region.They are not supposed to be accurate at large τ and towards the rapidity endpoints.This is somewhat surprising and counterintuitive as these are precisely the regions identified by the hadronic threshold limit, where threshold resummation is certainly relevant.The point is that the BNX and BFR formulation, as already discussed, are based on an approximation of the dependence of z and u of the coefficient function, and it is this approximation that is not valid in the hadronic threshold region.Far from it, the approximation is legitimate, and it allows to resum the partonic threshold logarithms in the cross section.
We can try to estimate in a more quantitative way in which kinematic region the missing contributions from the derivative of the luminosity make the BNX and BFR formulation break down.To do so, we consider the BNX and BFR expressions, Eq. (3.9) and Eq.(3.10), in which we replace C thr (z) with the full C(z): These expressions represent alternative formulations for the rapidity distribution where only the luminosity is approximated, and correspond to neglecting the O(1 − z) contributions in the penultimate line of Eq. (3.7). 6As such, comparing them with the exact distribution (at fixed order), we are able to judge the quality of the approximation of the luminosity, and thus the impact of the neglected derivative terms.We do this in figures 2 and 3, where we plot the exact contributions to the rapidity distribution at NLO and NNLO (in the q q channel only) along with the approximations of the luminosity dependence à la BNX and BFR, Eqs.(3.15), (3.16).In the first figure the distribution is shown as a function of τ and for different values of Y = 0, Y max /2, 2Y max /3, while the second figure shows the same distribution but as a function of Y for three values of τ = 10 −4 , 10 −2 , 10 −1 .The plots are obtained considering photon-mediated Drell-Yan production at LHC √ s = 13 TeV, using again the PDF4LHC21 NNLO PDF set [33] and taking from it the value of the strong coupling.The exact NNLO result is taken from the Vrap code [16,34], selecting from it only the terms contributing to the q q channel.We immediately observe that at small τ and central rapidity, the quality of the approximation is excellent at NLO and NNLO, as in both cases the BNX/BFR curves are almost identical to the exact result.Note that while we expect a failure of the validity of the expansion in powers of 1 − z at large τ and/or large rapidity, it is perhaps surprising to see such a good agreement at such small values of τ .Indeed, the neglected O(1 − z) contributions, though geniunely subleading, are not necessarily small as the integration over z extends to values as small as τ .An explanation of this effect is the fact that the shape of the luminosity strongly favours large values of z in the integrand and suppresses the region of z close to τ , as a consequence of the small-x growth and the large-x suppression of the PDFs f i (x, µ 2 F ), respectively.Therefore, the integral over z is dominated by the large-z region, well described by a threshold approximation, while the contribution from medium-small z down to τ is a small correction.This phenomenon is the so-called dynamical threshold enhancement of Ref. [11], and it is responsible for the threshold dominance also at the rapidity-integrated level, see e.g.Ref. [35].
Moving towards larger values of Y and τ we see some deterioration of the agreement, more marked at NNLO.However, in the range of values of τ, Y considered here, the accuracy of the approximation remains very high, with discrepancies of the order of some percent.We can appreciate in particular a slight distorsion of the Y dependence of the distribution, clearly visible at NNLO in figure 3. We stress that, by construction, the integral in rapidity of all the curves in each plot is the same and coincides with the exact rapidity-integrated cross section, as we also verified numerically.This constraint also contributes to the high accuracy of the approximation.
We finally notice that the BNX and BFR approaches give basically identical results.This is a consequence of the fact that the two formulations differ by next-to-next-to-leading power contributions.
The excellent quality of the approximation of the luminosity at the core of the BNX/BFR formulation can be understood analytically.By repeating the derivation of Eq. (3.7) keeping also the linear term in 1 − z of the expansion of the luminosity Eq. (3.5), it is easy to find where the term proportional to L ′ q q (1, ū) appears when we replace L q q (1, •) with L q q (z, ū) in the fourth line of Eq. (3.7).We thus observe that the linear term in 1 − z is not proportional to the entire derivative L ′ q q (1, u), but to the difference L ′ q q (1, ū) − L ′ q q (1, u), which is obviously smaller.The same holds for higher derivative terms.In other words, restoring the z dependence in the luminosity, even if this introduces a dependence on the new, arbitrary variable ū, is beneficial as it allows to reduce the impact of the missing contributions at higher order in 1 − z.
We thus conclude that the limitations coming from the growth of the derivative of the luminosity at large τ e 2|Y | is more formal than practical, and for all phenomenologically relevant values of these parameters the approximation of the luminosity leading to the BNX/BFR expressions is fully valid.In passing, we have also shown that the large-z approximation on the luminosity in Eq. (3.7) is of very high quality even when τ is small, despite the integral contains many values of z which are far from 1 and for which the threshold expansion is not formally accurate.The actual quality of threshold resummation based on Eq. (3.7), performed either à la BNX Eq. (3.9) or à la BFR Eq. (3.10), also depends on how well C thr (z) approximates the exact C(z), and we will discuss this in section 4.

Objections to BNX and BFR raised in the LMT paper
We now analyse the objections raised in Ref. [14], which claims that BNX and BFR are not accurate at leading power in 1 − z.
One of the arguments used in Ref. [14] is the fact that a class of terms apparently enhanced in the z → 1 limit are missed in these approaches.Specifically, any term in C(z, u) that vanishes after integration over u cannot be captured by the BNX and BFR approaches.One such term is for instance the first term appearing in the last line of Eq. (A.2) at NLO, of the form This term is singular in z = 1, but since it multiplies plus distributions in u it vanishes after integration over u.According to Ref. [14], this is a leading power contribution that is missing in BNX and BFR.We agree with the authors of Ref. [14] that, if we were to count powers of 1 − z at the level of the coefficient function only, the term Eq. (3.18) is formally leading power, like the terms that are retained in BNX/BFR.However, our derivation of BNX/BFR in section 3.1 adopts a power counting in 1 − z at the level of the full integrand of Eq. (2.8), thus including also the parton luminosity.What we are now going to show is that the term Eq. (3.18), because of its peculiar u dependence, contributes at next-to-leading power to the integral, and is thus consistently missing in the BNX/BFR leading power result.
The proof that the term Eq. (3.18) does not contribute at leading power relies again on the fact that the luminosity in z = 1 is independent of u.To see this, we consider the u integral of that term multiplied by the luminosity, Each difference of luminosities in the integrands can be expanded in powers of 1 − z, L q q (z, u) − L q q (z, 0) = ( ( ( ( ( ( ( ( ( ( L q q (z, u) − L q q (z, 1) = ( ( ( ( ( ( ( ( ( ( and the zeroth-order contribution in each difference vanishes because the luminosity in z = 1 does not depend on u.Therefore, each integral in the big rounded brackets in Eq. (3.19) is of order 1 − z, and thus cancels the singularity of the 1 1−z term in front.We conclude that this contribution behaves as a constant in the z → 1 limit, and therefore counts as a next-to-leading power contribution.This is also the reason why there is no need to surround this term with a plus distribution, which would instead be needed if it counted as leading power.
We can verify this numerically, by plotting the function F (z) Eq. (3.19) to see that it does not diverge at z = 1.We do this in figure 4, for different values of τ = 10 −4 , 10 −2 , 10 −1 and of the rapidity Y = 0, Y max /2, 2Y max /3, using the usual physical setup.Since there are some numerical oscillations at large z, probably due to the fact that the curve is the ratio of two small numbers, we also plot the PDF uncertainty band to make sure that the interpretation of the result is solid.It is clear that F (z) does not diverge in z = 1, rather it is perfectly finite, showing explicitly that this term is next-to-leading power.We also show (without uncertainty) each of the two contributions to Eq. (3.19) coming from each integral in the rounded brackets, as they are separately regular.We see indeed that both of them do not diverge in z = 1.
In fact, it is possible to note from the plots that the full function F (z) seems to go to zero at z = 1.This is indeed the case, as we can verify analytically by noticing that the O(1 − z) terms in the expansions Eq. (3.20) satisfy the relation which is easy to prove using the general form of the derivative of the luminosity given by Therefore, the whole term contributes to the rapidity distribution at next-to-next-to-leading power, and it is thus more suppressed at threshold than naively expected.The same is true also for the individual integrals of Eq. (3.19) when Y = 0, as a consequence of the fact that the luminosity is symmetric for the exchange x 1 ↔ x 2 , that at Y = 0 corresponds to a symmetry for the exchange u ↔ 1 − u.
More in general, the BNX and BFR approaches miss any term in C(z, u) that vanishes after integration in u from 0 to 1. Consider a generic function G(z, u) that may seem to be leading power for some integer value of k, and with g(u) any function or distribution satisfying the constraint Expanding the luminosity in powers of 1 − z we find where, thanks to the fact that the luminosity in z = 1 is u independent, we could use Eq.(3.24) in the last step to show that the first term of the expansion vanishes.We thus conclude that any apparently leading power term of the general form Eq. (3.23) that vanishes after integration over u contributes effectively at next-to-leading power.Therefore, the fact that BNX and BFR miss these contributions does not represent a power-counting issue.
We observe that contributions of this kind are instead present in the approaches to threshold resummation that keep the separate dependence on z a and z b , e.g.Refs.[5,13,[21][22][23][24][25].These approaches are also supposed to be valid in the threshold z → 1 limit, but since the limit is performed at the level of the coefficient function these contributions count as leading power and are thus preserved.For what we have shown, in the strict z → 1 limit their inclusion does not increase the accuracy as they are suppressed with respect to the other leading power terms. 7However, if a threshold approximation/resummation is extended outside the threshold region, namely for values of z that are not large enough, these contributions may be relevant.Indeed, a term like Eq. (3.18) contains enhanced contributions in u → 0, 1, which are irrelevant at z → 1 but not at generic z.To understand this, recall that u → 0, 1 corresponds to z a → 1 and z b → 1 respectively.Since z = z a z b , when z → 1 both z a , z b → 1 irrespectively of the value of u.But when z is not large, one among z a and z b can be large if u tends to 0 or 1.These non-threshold but enhanced contributions are captured by the aformentioned approaches, and improve the description of the tails of the rapidity distribution.Therefore, the fact that these contributions are missing in BNX and BFR makes them less accurate than other approaches when the process is far from threshold.
In Ref. [14] there are other arguments used to criticise the validity of the BNX and BFR approaches.One of them is an explicit calculation using the same toy PDF Eq. (3.12) showing that after integration in z and u the two functions give rise to different leading power contributions.As these correspond to a BFR and a BNX implementation of the same term, if they are both correct at leading power, they should give the same leading power contributions.The problem here is that the computation of Ref. [14] assumes We have already commented in section 3.2 that in this limit the derivation of BNX and BFR of section 3.1 does not hold anymore, so this conclusion is not surprising.Moreover, the power counting is performed at hadron level, namely the terms identified in Ref. [14] are leading power in 1 − x a and 1 − x b , which makes sense because they assume τ → 1 but it cannot be directly related to the leading power terms in 1 − z when τ is not large.Another objection of Ref. [14] regarding the BNX and BFR approaches is related to the expansion of the luminosity Eq. (3.5).In particular, LMT say that the expansion of the arguments x 1,2 of the PDFs, Eq. (3.4), to the zeroth order is too trivial because it does not depend on z.More precisely, they say that "the u dependence is not power suppressed but multiplies the leading dependence of the PDF arguments on z itself, and so it cannot be dropped".We believe that this is not a real issue: the expansion in powers of 1 − z is legitimate at large z, and if the zeroth order of this expansion is independent of z it cannot represent a reason for expanding to one order higher.However, for completeness, we can consider an alternative expansion that overcomes the LMT objection, without affecting the proof Eq. (3.7).Specifically, we can expand in powers of 1 − z not the full x 1,2 expression Eq. (2.5), but only the square root that depends on u: Now the leading z dependence appears in the zeroth order term of the expansion, which is still u independent.Of course, this (arbitrary) expansion is equivalent to that of Eq. (3.4) up to the order at which it is truncated.We can write the luminosity as which is again u independent, thus making the rest of the proof identical to what discussed in section 3.1 (except for the fact that L (0) q q (z) can be moved outside the u integral but not the z integral, but this is immaterial for the derivation of the final result).
A final objection raised in Ref. [14] is the fact that the original proof of the BFR approach [12] has a conceptual problem in one of the steps.In particular, the proof is based on the expansion of the Fourier transform kernel e iM y , where M is the Fourier conjugate variable to the parton rapidity y.The expansion in powers of y was truncated at order 0 because y is a variable ranging in |y| < 1  2 log 1 z ∼ 1 2 (1 − z), and so higher orders in y are effectively suppressed by powers of 1 − z.The objection of Ref. [14] is that the conjugate variable M has to be counted as of order 1  1−z according to the Fourier inversion theorem.Therefore, the expansion in powers of y is not legitimate, in the sense that the neglected terms are not really power suppressed.We agree with this criticism, and confirm that the proof of BFR in Ref. [12] is not satisfactory.Indeed, we also notice that this proof does not make use of the peculiar z, u dependence of the luminosity, which is instead at the core of the derivation of the BNX and BFR resummation formulas, as we have shown in section 3.1.

How good can BNX/BFR be?
Having established the validity of the BNX and BFR formulation of threshold resummation of rapidity distributions, we now want to understand how well they approximate the full cross section at fixed order.This gives us information on how accurate threshold resummation based on BNX or BFR formulations can be.Despite the fact that there exist resummation formalisms [13][14][15] that are formally superior to BNX and BFR, they remain interesting alternatives for their simplicity, being based on the resummation of the rapidity-integrated coefficient function which is typically known for more processes and with higher logarithmic accuracy.
The accuracy of any threshold approximation (and hence resummation) depends on the definition of the threshold logarithms that are retained.Indeed, any definition of threshold logarithms that differs from Eq. (2.7) by subleading power contributions is formally equivalent and thus acceptable, but the results may differ significantly.Such a difference may be seen as a limitation of the threshold approximation, as it comes from sizeable contributions from next-to-leading power contributions that are beyond the control of leading power threshold resummation. 8However, some subleading power contributions have a universal structure that can be incorporated in the definition of threshold logarithms, improving the quality of a threshold approximation [42][43][44][45][46][47][48][49].In the following we will discuss different choices of threshold logarithms and compare them numerically against the exact NLO and NNLO results.

Definitions of threshold logarithms
Let us focus in this section on the rapidity-integrated coefficient function which is the ingredient of the BNX and BFR formulations.The most natural choice for constructing a threshold approximation is to retain all contributions of the form of Eq. (2.7) and delta functions.At order α n s , the coefficient function is thus approximated at threshold as where c n,k and d n are numerical coefficients (not functions of z).Extending the notation of Refs.[45,46], we shall call the threshold approximation based on Eq. (4.2) z-soft approximation, 9 meaning the natural threshold approximation in z space.
The z-soft approximation is not particularly convenient for two reasons.One is that it is not very accurate, as we shall see in section 4.2.The other reason is that it is not easy to construct an all-order resummed result that contains all and only those contributions. 10he most widespread definition of threshold logarithms that is used in resummed computation is done in Mellin conjugate space, where the phase space of the gluon emissions (responsible of threshold logarithms) factorises making possible the construction of an all-order expression in closed form.Such a definition is based on the expansion at large N (corresponding to the threshold region in Mellin space) of the Mellin transform of z-space logarithmic terms [46,50] where Γ (j) (x) is the j-th derivative of the Euler gamma function Γ(x).Neglecting the O(1/N ) contributions (which are subleading power at threshold), this expansion provides an alternative approximation at threshold, that we call N -soft according to the notation of Refs.[45,46].The Nsoft approximation can be used directly in N space, which is convenient for instance at resummed level.It can also be used in the z-space formula Eq. (4.2), by replacing the logarithmic terms with the inverse Mellin transform of the powers of log 1 N terms of Eq. (4.3).Details are given in appendix A.2.
We also consider an alternative definition of threshold logarithms proposed in Ref. [46], whereby log N is replaced by ψ 0 (N + 1), where ψ 0 is the digamma function.Indeed, the exact Mellin transform of the threshold logarithms is expressed in terms of polygamma functions ψ k (N ), all of which go to zero at large N as N −k with the exception of ψ 0 (N ), which grows as log N .The argument of ψ 0 (N ) is further shifted to N + 1, which is equivalent up to O(1/N ).This choice, denoted ψ-soft 1 in Ref. [46], corresponds to the approximation obtained by neglecting the O(1/N ) contributions in the expansion This approximation can be converted to z space order by order; details are given in appendix A.2.
The formal accuracy of this expression is equivalent to that of N -soft and z-soft, as they all differ among each other by subleading power contributions at threshold.However, the ψ-soft 1 approximation has some advantages that make its quality superior to other choices.
The key observation is that the use of ψ 0 (N +1) allows to include at the rapidity-integrated level subleading power contributions that have a kinematical origin and are thus universal [42][43][44][45][46][47][48][49].In particular, the leading logarithmic terms at next-to-leading power in 1 − z are predicted correctly to all orders in ψ-soft 1 in the dominant flavour-diagonal channel for color-singlet production processes such as Higgs and Drell-Yan, and subleading logarithmic contributions are also partially included.As a consequence, at the rapidity-integrated level, the ψ-soft 1 choice of logarithmic terms provides better numerical agreement with the exact result than z-soft or N -soft, and also leads to a better stabilisation of the scale dependence at resummed level.These results were obtained for Higgs production [45,46], and they hold also for the Drell-Yan process.Of course, this simple modification of the resummed logarithms is not able to predict subleading power contributions coming from the other channels, most importantly the qg channel, which would be needed for achieving a higher accuracy of the resummation.
Using ψ-soft 1 for BNX and BFR is expected to improve the resummation of rapidity distributions as well.Indeed, even though a direct analytical comparison at parton level cannot be done because u dependence in BNX and BFR is always approximate, we know that integrating over rapidity Eq. (3.9) and Eq.(3.10) we obtain the correct rapidity-integrated distribution, for which ψ-soft 1 is known to perform well.We will now see by a numerical comparison that the ψ-soft 1 approximation is more accurate than the others also for rapidity distributions, thereby providing the most convenient choice of threshold logarithms for an accurate resummation within the BNX/BFR formulation.

Numerical validation of BNX/BFR at NLO and NNLO
In this section we compare the exact NLO and NNLO contributions to the Drell-Yan rapidity distribution against threshold approximations based on the BNX and BFR formulations and for the three choices of threshold logarithms discussed in section 4.1.
The numerical setup is the same of section 3. We consider neutral current Drell-Yan production at LHC with √ s = 13 TeV, including only the contribution from the photon for simplicity.We use the PDF4LHC21 NNLO PDF set [33], and take from it the value of the strong coupling.We always sit at µ F = µ R = Q, with Q = √ τ s the invariant mass of the lepton pair.We only plot the q q contribution to the rapidity distribution.The exact NNLO result is taken from the Vrap code [16,34].
We start by showing plots of the rapidity distribution at fixed rapidity and as a function of We observe that ψ-soft 1 is by far the best approximation, in most cases overshooting the exact result by a small amount.In fact, comparing with figure 2, we can observe that the ψsoft 1 curve is very similar to the curves obtained with the exact C(z), as a consequence of the aformentioned fact that ψ-soft 1 provides an excellent description of the coefficient function at the rapidity-integrated level.The N -soft approximation is reasonably good but definitely worse than ψ-soft 1 , and it always undershoots the exact by an amount that can reach 35%.Only at large rapidity the large-τ behaviour seems to be better for N -soft than for ψ-soft 1 , but this is due to an accidental compensation of the deterioration of the approximation of the luminosity visible in figure 2 and the undershooting of N -soft.Finally, z-soft is the worst, especially at NNLO where it seems completely unrelated to the exact result, except at very large τ .This failure of the z-soft approximation is well known and expected at the rapidity-integrated level, and it has been studied in various works, mostly in the context of Higgs production [7,45,[51][52][53].
We now move to visualise the same differential distribution as a function of the rapidity Y for fixed values of τ .This is shown in figure 6. 11 The structure of the plots is the same of the previous figures, and the three columns correspond to the values τ = 10 −4 , 10 −2 , 10 −1 .
We clearly see that at large rapidity the agreement of all the approximations is good, but moving towards smaller rapidity z-soft deviates soon and significantly, N -soft also deviates undershooting the exact result by a large amount, while ψ-soft 1 is closer to the exact result, typically overshooting it by a small amount.Again, the shape of ψ-soft 1 is very similar to the result obtained with the full C(z), figure 3.There is a slight deterioration of the accuracy of ψ-soft 1 at NNLO with respect to the NLO, with the shape at this order being somewhat distorted.However, reassuringly, the quality of all the approximations is generally similar at NLO and NNLO, showing that the procedure is stable and hopefully preserves its reliability at higher orders.
These consideration hold the same for both BNX and BFR.While the two approaches give very similar results, in these plots we can see a small difference between them, which is more marked at medium-small τ and in the less accurate approximations based on z-soft and N -soft.In particular, it seems that BNX is able to better reproduce the little bump present in the rapidity distribution at the transition between the central rapidity plateau and the large rapidity drop.However, the difference between BNX and BFR is so mild and in particular much smaller than the difference between exact and approximate (and between the different choices of threshold logarithms) that it cannot be used to strongly favour one of the two approaches over the other.
In conclusion, we have seen the BNX and BFR approaches, which can be considered equivalent for all practical purposes, can describe rather well the exact result, even far from threshold, provided a good choice of threshold logarithms is used.The ψ-soft 1 choice is very convenient as it allows to reach the best description and at the same time it is very easy to implement at resummed level.We stress that any "traditional" Mellin-space resummation code that uses N -soft by default can be straightforwardly upgraded to ψ-soft 1 simply replacing log N with ψ 0 (N + 1).

Comparison of BNX/BFR with other approaches
Having seen that BNX/BFR are able to approximate sufficiently well the fixed-order result if a proper choice of the threshold logarithms is made, we now want to compare 12 these results with the other approaches to threshold resummation of the literature, namely BDDR [13] and LMT [14].We also consider the AMRST [15] approach, which is the next-to-leading power extension of BDDR.
For them, we stick to the original definition of threshold logarithms.For BDDR and AMRST, being them formulated in Mellin space, the logarithms are defined according to N -soft, with two separate logarithms in the variables z a , z b , corresponding to logarithms in Mellin space of two distinct variables N a and N b .Expressions for BDDR at NLO and NNLO are given in N a , N b space in Ref. [13] and can be converted to z a , z b space using the results of appendix A.2.For AMRST, we have expanded ourselves the resummed formulas of Ref. [15] to order α s and α 2 s .Explicit expressions for both BDDR and AMRST at NLO and NNLO are given in appendix A.3, both in Mellin space and in momentum space.
For LMT, the logarithms used correspond to the z-soft definition.Despite it being a choice that gives inaccurate approximations at leading power, the fact that LMT includes subleading power contributions partially cures this deficiency.We stress that the LMT approximation corresponds to Comparison of various approaches, s = 13 TeV, qq channel, photon exchange only the full distributional part of the coefficient function when written in terms of the z a , z b variables.Explicit expressions at NLO and NNLO are given in appendix A. 4. We now compare these results, BDDR, LMT and AMRST, with the BFR approximation based on ψ-soft 1 .We do not show BNX, that would give results very close to BFR, already presented in section 4. In figure 7 we plot the Drell-Yan rapidity distribution as a function of τ and in figure 8 as a function of Y .The physical setup is the same of the previous plots.BFR is shown in dot-dashed purple, BDDR in dashed green, LMT in dotted blue and AMRST in dot-dot-dashed orange.
We note that BFR with ψ-soft 1 is not worse than BDDR, rather it performs much better both at NLO and NNLO, being much closer to the exact result.The reason for this is probably due to the fact that the N -soft choice of threshold logarithms of BDDR is not optimal.Importantly, this comparison confirms once more that the BNX/BFR formulation of threshold resummation of rapidity distributions is legitimate and competitive with canonical, more complex approaches.Clearly, the shape in rapidity of the BFR curve is only approximate, and indeed it does not reproduce features of the exact result (e.g., the NNLO bump), due to the approximate nature of the BFR/BNX approaches.However, the BFR curve is overall close to the exact result thanks to the fact that the BFR/BNX approaches are designed to reproduce at the rapidity-integrated level the inclusive threshold result which is very well approximated by ψ-soft 1 .
Moving on, we observe that, not surprisingly, the LMT result is generally more accurate than BDDR, in particular at large rapidity where it is very close to the exact result.We note however that while at NLO the agreement is very good especially at large τ , the accuracy of LMT deteriorates significantly at NNLO, becoming even worse than BDDR at central rapidity for τ small.This is certainly a consequence of the use of z-soft for the form of the logarithms. 13Indeed, the structure of the LMT result is such that the helpful subleading power corrections in one variable (say z a ) are retained when they multiply only the leading power terms in the other variable (z b ), which are not accurate enough when using z-soft, unless the kinematics forces that variable to be large, i.e. at large rapidity.We finally move to the AMRST result.Formally, this result contains less information than the LMT one, as for each variable it only adds the next-to-leading power correction (multiplied by the leading power term in the other variable), and not all subleading power corrections as it is done in LMT.However, the advantage of the AMRST result is that it uses the N -soft definition of the threshold logarithms, which is superior to the z-soft one adopted in the LMT approach.Therefore, as we can see from the figures, the AMRST result is the one that best approximates the exact results.It is in particular much better than the LMT one at NNLO, where AMRST stays very close to the exact result, as it also does at NLO.
Notably, BFR with ψ-soft 1 is closer to the exact result than BDDR and LMT, with the exception of the high rapidity region.In fact it is rather close to the AMRST result as well, making BFR (and BNX) comparable even with the best approach on the market today.Of course AMRST is superior, as for instance it is able to reproduce the shape in rapidity which is only vaguely approximated by BFR, as we have already commented.However, we believe that BNX/BFR resummation can still prove useful when studying resummation for rapidity distributions.Indeed, the BFR/BNX approaches are based on the leading-power resummation of the rapidity integrated cross section, which is available for a large variety of processes and to a high logarithmic accuracy, while the recent AMRST approach requires more ingredients and it is only available for a limited number of processes so far.We will demonstrate the value of the BNX/BFR approaches by showing representative resummed results in section 6.
Moreover, motivated by these numerical results, we have investigated the analytical difference between the various approaches.The details are technical and are collected in appendix A. 5.
We conclude by stressing that the comparisons presented here are for the q q channel only, but at next-to-leading power also the other channels contribute.At the moment only the LMT approach can control the resummation in these subleading channels, which is a clear advantage for the goal of achieving the highest precision.It would thus be very interesting to understand if it is possible to modify the LMT approach by changing the form of the threshold logarithms to take advantage of other better definitions like N -soft or ψ-soft 1 , in order to improve its quality, reaching and possibly surpassing AMRST.In this way one could achieve the best description of the dominant q q channel, supplemented by the important contributions from the other subleading channels.We plan to investigate this possibility in future work.

All-order resummed results for BNX/BFR
Having established the accuracy of the BNX/BFR approaches to resummation, in this section we present some representative all-order results.We restrict our attention to BFR resummation (BNX would lead to very similar results), with the ψ-soft 1 choice for threshold logarithms.
We perform the resummation using the public TROLL code [30,46,54], which implements the resummation up to next-to-next-to-next-to-leading logarithmic accuracy (N 3 LL ′ ), 14 both for rapidity distributions and for rapidity-integrated cross sections.The ingredients for N 3 LL ′ resummation are all available in the literature [58][59][60], including the δ(1 − z) term at N 3 LO which is taken from Ref. [61] and the recent four-loop cusp anomalous dimension computed in Ref. [62].We do not exponentiate the constant terms in N space [46,54], thus sticking to a more standard attitude, as the effect for this process is rather mild.
In figure 9 we show the rapidity distribution as a function of τ for Y = 0, Y max /2, 2Y max /3 at fixed LO (solid black), NLO (solid blue) and NNLO (solid red) along with the resummed results at NLO+NLL ′ (dashed blue), NNLO+NNLL ′ (dashed red), and NNLO+N 3 LL ′ (dotted green).Similarly, in figure 10 we plot the same curves as a function of Y for τ = 10 −4 , 10 −2 , 10 −1 .In all plots we also show a lower panel with the ratio to the LO result, to better appreciate the relative size of the various perturbative corrections.As in the previous sections, we only plot the dominant q q channel.We observe a good convergence of the resummed result, improved with respect to that of the fixed-order result especially at large τ and Y , where threshold logarithms are more dominant.Overall, we notice that the effect of adding resummation over the fixed order is small at NLO and very small at NNLO, which is a consequence of the fact that the Drell-Yan process exhibits a good perturbative convergence (as opposed for instance to the Higgs production process in gluon  fusion, see e.g.[63,64]).From the ratio plots, 15 it is apparent that going towards large rapidity or large τ all the resummed curves tend to overlap, while the fixed-order contributions get larger thus showing a perturbative instability.This is a consequence of the fact that in these regions the threshold logarithms are dominant and large, and resumming them the perturbative expansion stabilizes significantly and leads to reliable perturbatively-stable results.
We were not able to include in the plots the N 3 LO curve [3] as to our knowledge there is no public code available out of which we can extract the q q contribution.Consequently, we could not show the N 3 LO+N 3 LL ′ resummed result, but we consider the NNLO+N 3 LL ′ result, which should be equivalent to the former in the region of large τ and Y where the threshold logarithms dominate the distribution.Far from this region, the addition of the N 3 LO result would instead improve the accuracy.
In conclusion, we have demonstrated that the BNX/BFR formulations are available for producing reliable resummed results for rapidity distributions at high logarithmic accuracy.The public TROLL code implements these results up to N 3 LL ′ accuracy, with the ψ-soft 1 choice of threshold logarithms (N -soft is also available).The results for ψ-soft 1 resummation for Drell-Yan rapidity distributions are presented here for the first time.

Conclusions
In this work we have studied threshold resummation of rapidity distributions, applied to the Drell-Yan process.Our work was motivated by the recent criticism of Ref. [14] that states that older approaches to threshold resummation by BNX [11] and BFR [12] are wrong.
We have proposed a detailed proof of the BNX and BFR approaches, emphasising their limitation but showing clearly that they are correct within their declared accuracy.We have rebutted several objections of Ref. [14], mostly focussed on the fact that BNX and BFR miss some leading power contributions at threshold, showing analytically and numerically that this is not the case.
To support our findings we have validated the approaches numerically against the exact NLO and NNLO results.The quality of these results depends on the choice of the form of threshold logarithms that one eventually wants to resum to all orders.We have shown that the ψ-soft 1 definition of threshold logarithms, proposed in Ref. [46] in the context of Higgs production, provides the best results.We have concluded that despite the approximate nature of the BNX and BFR approaches they perform rather well as they are able to reproduce to a good accuracy the exact NLO and NNLO results even far from threshold.
Motivated by this, we have performed a comparison at NLO and NNLO with other approaches in the literature, from Ref. [13] (BDDR), Ref. [15] (AMRST) and Ref. [14] itself (LMT).We noted that the best approximation is given by the AMRST result, despite the fact that it is formally less accurate than LMT.The reason for this is the different form of threshold logarithms used in the two approaches: the ones used in the LMT paper, namely the standard choice in z space, have a particularly poor quality in approximating the exact result.This suggests that the LMT approach could be improved by upgrading the form of threshold logarithms, and we plan to investigate this in future.
Notably, we have observed that the BNX/BFR approaches with the ψ-soft 1 definition of threshold logarithms lead to approximations that are comparable with (if not better than) the others, despite the latter are all formally more accurate.This confirms that the BNX/BFR are rather good alternatives to more modern approaches, and they can still provide a good framework for fast implementations of threshold resummation in rapidity distributions at high logarithmic accuracy.We concluded by showing representative results of BFR resummation up to N 3 LL ′ accuracy, obtained through the public TROLL code, available at http://l.infn.it/mb/troll/.
in the q q channel (the only one relevant at threshold) the Drell-Yan coefficient function at threshold Eq. (4.2) is given by and at NNLO by where with µ the factorization scale, assumed to be equal to the renormalization scale.In the expressions above we have conveniently written the distributional terms in the compact form defined by Using these distributions, Eqs.(A.4), (A.5) correspond to the z-soft choice of threshold logarithms.
To obtain the N -soft expression, we use the results of appendix B.4 of Ref. [50] to write where [46] We recognise in the right-hand side of this equation exactly the dominant large-N limit of the Mellin transform of the D k (z) distribution, Eq. (4.3), which is the part retained to build up the N -soft approximation.We immediately conclude that the N -soft choice of logarithmic terms corresponds in z space to the replacement O(1/N 2 ) contributions and retaining only the powers of ψ 0 (N ), which is easy to implement to all orders in N space.A z-space analog would be implemented by a modified distribution Dψ k (z) defined by for which however we cannot find an easy closed form for any k.Up to the order we are interested in, we have that Dψ 0 (z) = D0 (z) and Dψ 1 (z) = D1 (z), while The inverse Mellin transforms of these differences can be computed analytically, using e.g. the results of Refs.[50,65].We find The ψ-soft 1 prescription mentioned in the main text uses N + 1 rather than just N as the argument of the digamma function: ψ 0 (N +1).This is the simplest form of the "collinear improvement" introduced in Ref. [46] that includes subleading power contributions from universal splitting functions.
In z space a shift N → N + 1 corresponds to multiplication by z, so the recipe for the conversion of Eqs.(A.4) and (A.5) to ψ-soft 1 is simply given by the replacement (A.21)

A.3 Approximation based on the BDDR/AMRST approach to resummation
The expansion of the resummed result of BDDR [13] (at leading power) and AMRST [15] (at next-to-leading power) in N space is given at NLO by (for and at NNLO by16 To convert these results to z space, we can invert Eq. (A.8) for the leading power terms, while for the next-to leading power contributions we further need the relation which can be derived from the generating function log ξ 1 z deriving k times with respect to ξ in ξ = p.Using these results we get at NLO   having used the D log k (z) distributions defined in Eq. (A.9).

A.4 Approximation based on the LMT approach to resummation
In the generalized threshold expansion of LMT, the expansion of the resummed result can be obtained from the expression CLMT ij (z a , z b , α s ) = H kr (α s ) δ ki Îrj (z a , z b , α s ) + Îki (z b , z a , α s )δ rj − Ŝ(z a , z b , α s ) , (A.28) in terms of the functions defined in the LMT paper [14].Focussing on the q q channel and expanding in powers of α s , we obtain at NLO at central scales  In the last function we have kept the dependence on the functions I (2) qqV (z) and I (2) qqS (z) given in equation (S53) of Ref. [14].Specifically, these functions contain distributional terms in D k (z) and δ(1 − z), which are already taken into account explicitly in Eq. (A.30) as they contribute to the double-distributional part of the result, so here only the remaining regular part, denoted by reg[...] in the formula, has to be considered.The factor 1/16 finally fixes the different normalization due to the different expansion parameters (we use α s /π while LMT use α s /(4π)).It is useful to write explicitly the large z expansion of the F i (z) functions, From this expansion it is easy to verify that the double-Mellin transform of the LMT result coincides at next-to-leading power with the AMRST result Eq. (A.23).

A.5 Analytical comparison
We consider here an analytical comparison of the approaches studied in this work.We find it convenient to perform this comparison in double Mellin space, which allows us to obtain more compact expressions.The expressions for BDDR and AMRST are already given in this space, Eqs.(A.

1 Figure 1 .
Figure 1.Ratio of the derivative of the general luminosity over the luminosity in z = 1 as a function of τ for Y = 0 (left), Y = 1/2 (central) and Y = 2Ymax/3 (right) for different values of u, using PDF4LHC21 NNLO PDF set and considering photon-mediated Drell-Yan production at LHC √ s = 13 TeV.

Figure 2 .
Figure 2. Rapidity distributions at NLO (up) and NNLO (down) as a function of τ for Y = 0, Ymax/2, 2Ymax/3, using PDF4LHC21 NNLO PDF set and considering photon-mediated Drell-Yan production at LHC √ s = 13 TeV.The approximations are obtained expanding the luminosity à la BNX (darker color) and à la BFR (lighter color), given by Eq. (3.15) and (3.16) respectively, where we use the full C(z) instead of its threshold approximation C thr (z).

Figure 5 .
Figure 5. Rapidity distributions at NLO (up) and NNLO (down) as a function of τ for Y = 0, Ymax/2, 2Ymax/3, using PDF4LHC21 NNLO PDF set and considering photon-mediated Drell-Yan production at LHC √ s = 13 TeV.The approximations z-soft, N -soft and ψ-soft1 are shown in darker color for BNX and in lighter color for BFR.