1 Introduction

The Drell–Yan process is one of the standard channels for determining the parton distribution functions (PDFs), especially those of the sea quarks. At the CMS experiment, for instance, the production of muon pairs is measured over a wide range of dilepton invariant mass, \(15< M < 3000\) GeV, at \(\sqrt{s} = 13\) TeV [1]. The results are integrated in dilepton rapidity and show good agreement with next-to-next-to-leading order Drell–Yan predictions. For a similar result from ATLAS, see Ref. [2].

It is possible to calculate the Drell–Yan (DY) cross section through a factorized scheme: as a convolution of the parton distributions (one for each involved proton) with the matrix element using a factorization scale, \(\mu _F\). Schematically, we have

$$\begin{aligned} \sigma= & {} \int \mathrm {d}\, x_A \mathrm {d}\, x_B ~\mathrm{PDF}(x_A, \mu _F) \nonumber \\&\times |\mathcal {M}(\mu _\mathrm{F})|^2\times \mathrm{PDF}(x_B, \mu _F)~, \end{aligned}$$
(1)

where the matrix element, \(\mathcal {M}(\mu _F)\), is calculated perturbatively. The convolution is in x space, where x is the longitudinal momentum fraction carried by a parton. At leading order (LO) the scale dependence is large, at next-to-leading order (NLO) it is smaller, and so on for higher orders; if all perturbative terms were included, the result would be independent of the scale, assuming the series converges. The conventional choice for the factorization scale in the DY process is \(\mu _F = M\).
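The scale dependence of a truncated result can be illustrated numerically. The following is a minimal Python sketch of the schematic convolution in Eq. (1), not the calculation used in this paper: the functions `toy_pdf` and `toy_matrix_element_sq`, the parametrization of the exponent, the lower integration limit `x_min` and the grid size are all hypothetical choices made only for illustration. Because the toy matrix element carries no compensating \(\mu _F\) dependence, the result changes when the scale is varied, as happens at LO.

```python
import math


def toy_pdf(x, mu_f):
    """Hypothetical parton density, NOT a fitted PDF.

    The small-x rise steepens with mu_f (in GeV) to mimic, qualitatively,
    the effect of evolving the distribution to a larger scale.
    """
    lam = 0.2 + 0.05 * math.log(mu_f)  # assumed exponent, illustration only
    return x ** (-lam) * (1.0 - x) ** 3


def toy_matrix_element_sq(x_a, x_b, mu_f):
    """Hypothetical |M|^2: a constant with no mu_f dependence (LO-like)."""
    return 1.0


def toy_sigma(mu_f, n=100, x_min=1e-4):
    """Midpoint-rule estimate of Eq. (1) on a grid uniform in ln(x)."""
    step = math.log(1.0 / x_min) / n
    # Midpoints in ln(x); dx = x * d(ln x) supplies the Jacobian.
    xs = [x_min * math.exp((i + 0.5) * step) for i in range(n)]
    total = 0.0
    for x_a in xs:
        for x_b in xs:
            total += (toy_pdf(x_a, mu_f)
                      * toy_matrix_element_sq(x_a, x_b, mu_f)
                      * toy_pdf(x_b, mu_f)
                      * x_a * step * x_b * step)
    return total


if __name__ == "__main__":
    M = 12.0  # dilepton mass in GeV
    for mu_f in (M / 2, M, 2 * M):
        print(f"mu_F = {mu_f:5.1f} GeV -> toy sigma = {toy_sigma(mu_f):.4f}")
```

In this toy model the result grows monotonically with \(\mu _F\), since the assumed density increases with the scale at every x and nothing in the matrix element compensates for it.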

It is known that NLO theoretical predictions at small x have a large factorization scale dependence, usually quantified by varying the scale between \(\mu _F = M/2\) and \(\mu _F = 2M\). This is because a variation of the factorization scale changes the parton distributions. If the whole perturbative series were present, the matrix element would cancel this change. However, when truncated at NLO, the matrix element contains only one parton emission (see, e.g., Fig. 1), while the parton distributions can emit many partons (an average of 8 at small x and at LHC energies, as estimated in Ref. [3]) when they are evolved in \(\mu _F\). This uncertainty limits the precision with which the parton distributions can be probed by the Drell–Yan process.

However, there is a procedure [3] to set an optimal scale, which reduces the uncertainty due to the factorization scale. The main idea is, in the limit of small x, to include part of the NLO contribution already at LO by changing the factorization scale of the parton distributions at LO. It was first applied to the DY process, but it has also been applied to other processes, such as \(c\bar{c}\) and \(b\bar{b}\) production [4] and \(J/\psi \) production [5].

Given that at large scales the parton distributions are more or less understood, it would be desirable to lower the optimal scale. With this goal, in Ref. [6], an upper cutoff on the dilepton (or, equivalently, photon) transverse momentum (\(k_t\)) was imposed, making the NLO contribution smaller and therefore requiring a smaller optimal scale. In this way, one obtains information about the PDFs at smaller scales, i.e., smaller than the scales that can currently be measured due to experimental limitations. In this paper, we continue that work by imposing a cut on the azimuthal angle between the transverse momenta of the leptons (instead of a photon \(k_t\) cut). This is a complementary approach, which can be tested both theoretically and experimentally.

There are other ways to resum small x parton evolution. For example, an all-order small-x resummation matched to a fixed-order DGLAP anomalous dimension [7] was obtained some time ago. Also, by considering the perturbative coefficient functions at fixed order minus its expansion in \(\alpha _s\) series, it was possible to resum small x effects in Refs. [8, 9] and have a better description of DIS data. In our work, we have the advantage of being able to choose more exclusive observables by having an easier way of introducing cutoffs.

This paper is organized as follows: in Sect. 2, we discuss how to reduce the NLO phase space through the azimuthal angle cut. Then, in Sect. 3, we calculate the optimal scale as a function of the cutoff. In Sect. 4, we show the effect of the cutoff on the cross section and, in Sect. 5, we show the stability of the results with regard to the choice of the remaining factorization scale. Finally, we present our conclusions in Sect. 6.

2 Imposing an azimuthal angle cut \(\phi _0\)

The Drell–Yan process at NLO is given by a collision between a parton A and a parton B, resulting in another parton C and a photon, the latter splitting into leptons D and E. The most important case at small x, where the gluon distribution dominates, is QCD Compton scattering: a gluon and a quark are the initial partons that result in the quark C and the leptons D and E, as shown in Fig. 1. We define the Mandelstam variable \(t = (p_C - p_A)^2\).

The leptons D and E, with transverse momenta \(\vec {p}_{Dt}\) and \(\vec {p}_{Et}\), are separated by an azimuthal angle \(\phi \). If we take \(\phi \) to be the smallest angle between them, it lies in the range \(0< \phi < \pi \), with the upper limit corresponding to the back-to-back configuration. With an azimuthal angle cut, we reduce the number of events taken into account by selecting only the ones with \(\phi > \phi _0\), i.e., closer to the back-to-back configuration. In Fig. 2, we present only the lepton pair, D and E, and show the cut-off region in red.
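The selection can be sketched in a few lines of Python. This is only an illustration of the cut, assuming each lepton transverse momentum is supplied as a pair of Cartesian components (px, py); the function names `azimuthal_separation` and `passes_cut` are hypothetical and do not come from any analysis framework.

```python
import math


def azimuthal_separation(p_dt, p_et):
    """Smallest angle, in [0, pi], between two transverse momenta (px, py)."""
    dphi = math.atan2(p_dt[1], p_dt[0]) - math.atan2(p_et[1], p_et[0])
    dphi = abs(dphi) % (2.0 * math.pi)
    # Fold angles larger than pi back into [0, pi].
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def passes_cut(p_dt, p_et, phi_0):
    """Keep only events with phi > phi_0, i.e. closer to back-to-back."""
    return azimuthal_separation(p_dt, p_et) > phi_0


if __name__ == "__main__":
    phi_0 = 0.85 * math.pi
    back_to_back = ((1.0, 0.0), (-1.0, 0.0))
    perpendicular = ((1.0, 0.0), (0.0, 1.0))
    print(passes_cut(*back_to_back, phi_0))   # kept
    print(passes_cut(*perpendicular, phi_0))  # rejected
```

An exactly back-to-back pair has \(\phi = \pi \) and is kept for any \(\phi _0 < \pi \), whereas a pair with perpendicular transverse momenta has \(\phi = \pi /2\) and is rejected for \(\phi _0 = 0.85\pi \).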

By introducing the cutoff, we expect to lower the optimal scale, which will be described in the next section. In this way, we are able to safely probe the parton distributions at lower scales by reducing the large uncertainty involved in the choice of this scale, as shown in Fig. 1 of Ref. [6].

Fig. 1

The Compton scattering diagram of the NLO Drell–Yan process: gluon A and quark B are the initial particles, resulting in a quark C and a photon, which in turn splits into a pair of leptons D and E. This diagram has a divergence in the t channel and it is the most relevant one at NLO for small x due to the gluon distribution

Fig. 2

Transverse momentum vectors of leptons D and E, separated by an azimuthal angle \(\phi \). For a given cutoff \(\phi _0\), the green region corresponds to the allowed angles \(\phi > \phi _0\), i.e. the part of phase space which is taken into account in the calculation. The red region is cut off, so the events closer to the back-to-back configuration are the relevant (measured) ones

3 Determination of the optimal scale

Following the procedure of Ref. [6], we use the parton cross section for the NLO subprocess \(qg\rightarrow q \gamma ^* \rightarrow q l \overline{l}\) differential in \(M^2\), in t and in the lepton transverse momenta. This is integrated in the two lepton variables, keeping the restriction in the azimuthal angle of \(\phi > \phi _0\):

$$\begin{aligned} \int \mathrm {d}\, p_{Dt} \int \mathrm {d}\, p_{Et} \frac{\mathrm {d}\, \hat{\sigma }(qg \rightarrow q l \overline{l})}{\mathrm {d}\, M^2 \mathrm {d}\, t \mathrm {d}\, p_{Dt} \mathrm {d}\, p_{Et}} \Theta (\phi -\phi _0). \end{aligned}$$
(2)

We also use the LO parton cross section convoluted with the DGLAP \(g\rightarrow q \overline{q}\) splitting function [10], which does not depend on the lepton variables:

$$\begin{aligned} \frac{\alpha ^2 \alpha _s z}{9 M^4} \frac{z^2 + (1-z)^2}{t}. \end{aligned}$$
(3)

We equate both expressions (NLO vs. LO convoluted with DGLAP) and integrate in t. The infrared divergences cancel (the cut does not touch the divergence). There is a further integration in \(z = M^2/\hat{s}\) with fixed M, accounting for an incoming gluon flux of 1/z, where the parton c.o.m. energy is \(\sqrt{\hat{s}}\). Thus, we have an equation that can be used for finding the optimal scale, \(\mu _0\).

In the next step, to calculate the cross section, we will use the factorized scheme:

$$\begin{aligned} \sigma = \mathrm{PDF}(\mu _0) \times C^\mathrm{LO} + \mathrm{PDF}(\mu _F)\times C^\mathrm{NLO}(\mu _0) \end{aligned}$$
(4)

using the optimal scale, \(\mu _0\), in the parton distribution appearing at leading order and also in the next-to-leading order coefficient, \(C^\mathrm{NLO}\). By using the optimal scale \(\mu _0\) we include in the LO term all the NLO contributions which depend on the factorization scale and are enhanced by a large \(\ln (1/x)\); that is, we resum inside the LO low-x PDF the terms \([\alpha _s\ln (\mu _F/M)\ln (1/x)]^n\). The first of these terms should then not be taken into account at NLO, to avoid double counting; this is done by setting \(\mu _F = \mu _0\) in \(C^\text {NLO}\).

However, since there is a cutoff applied, it is necessary to take care of the situation of a parton that emits other partons during the evolution that may spoil the cutoff. In other words, we must take into account possible parton emissions from the optimal scale (\(\mu _0\)) up to the hard scale (\(\sqrt{\hat{s}}\)) which give a supplementary transverse momentum to the dilepton. For example, a configuration in which the leptons are exactly back-to-back (\(\phi = \pi \)) when the dilepton has no transverse momentum can be changed to another configuration like \(\phi = \pi /2\) if the dilepton is given the appropriate transverse momentum. We will do that at double logarithm accuracy.
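The example of a back-to-back pair being rotated to \(\phi = \pi /2\) can be checked with a simplified picture in the transverse plane. The Python sketch below is an illustration under stated assumptions and is not part of the calculation: the dilepton transverse momentum \(\vec {k}\) is assumed to be shared equally by the two leptons, and their relative transverse momentum \(\vec {q}\) is assumed to be perpendicular to \(\vec {k}\), so that \(\vec {p}_{Dt} = \vec {k}/2 + \vec {q}\) and \(\vec {p}_{Et} = \vec {k}/2 - \vec {q}\). The function name `lepton_azimuthal_angle` is hypothetical.

```python
import math


def lepton_azimuthal_angle(k_t, q_t):
    """Angle between p_D = k/2 + q and p_E = k/2 - q in the transverse plane.

    Simplifying assumptions (illustration only): the dilepton transverse
    momentum k (magnitude k_t) is shared equally by the two leptons and the
    relative transverse momentum q (magnitude q_t > 0) is perpendicular to k.
    """
    p_d = (k_t / 2.0, q_t)
    p_e = (k_t / 2.0, -q_t)
    dot = p_d[0] * p_e[0] + p_d[1] * p_e[1]
    norm = math.hypot(*p_d) * math.hypot(*p_e)
    # Clamp against rounding before taking the arccosine.
    cos_phi = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_phi)


if __name__ == "__main__":
    q_t = 3.0  # GeV, arbitrary illustrative value
    for k_t in (0.0, 2.0, 6.0):
        phi = lepton_azimuthal_angle(k_t, q_t)
        print(f"k_t = {k_t:3.1f} GeV -> phi = {phi / math.pi:.3f} pi")
```

Under these assumptions \(\cos \phi = (k_t^2/4 - q_t^2)/(k_t^2/4 + q_t^2)\), so \(\phi = \pi \) for \(k_t = 0\) and \(\phi = \pi /2\) for \(k_t = 2 q_t\): a sufficiently large recoil moves an event from the allowed region into the region removed by the cut.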

The possible spoiling of the cutoff by such emissions is addressed by including Sudakov form factors, which ensure that there is no emission between the optimal scale \(\mu _0\) and \(\sqrt{\hat{s}}\). This inclusion is detailed in Ref. [6]; here we briefly recall that, in the double log approximation, the quark Sudakov factor is given by

$$\begin{aligned} T_q = \exp {\left( -\alpha _s S_q(\mu _0,\sqrt{\hat{s}})\right) } \end{aligned}$$
(5)

with

$$\begin{aligned} S_q = \frac{C_F}{\pi } \mathfrak {R}\left( \ln (\sqrt{\hat{s}}/\mu _0) +i\pi /2 \right) ^2 \end{aligned}$$
(6)

where \(C_F = 4/3\) and, at leading order, \(\sqrt{\hat{s}} = M\). Similarly, there is a Sudakov factor for the gluon. They enter Eq. (4) as factors that multiply the respective parton distributions. We then have to exclude the first term, \(\alpha _s (\ln ^2(\sqrt{\hat{s}}/\mu _0) - \pi ^2/4)\), from the \(C^\text {NLO}\) expression to avoid double counting.
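For concreteness, Eqs. (5) and (6) can be evaluated directly. The following Python sketch assumes that the real part in Eq. (6) is taken of the squared complex logarithm, which reproduces the combination \(\ln ^2(\sqrt{\hat{s}}/\mu _0) - \pi ^2/4\) quoted above. The function names and the value of \(\alpha _s\) passed in the usage example are hypothetical; the text does not specify the coupling used.

```python
import math

C_F = 4.0 / 3.0  # quark colour factor, as given in the text


def sudakov_exponent_q(mu_0, sqrt_s_hat):
    """S_q of Eq. (6): (C_F / pi) * Re[(ln(sqrt_s_hat / mu_0) + i pi / 2)^2]."""
    z = complex(math.log(sqrt_s_hat / mu_0), math.pi / 2.0)
    return (C_F / math.pi) * (z * z).real


def sudakov_factor_q(mu_0, sqrt_s_hat, alpha_s):
    """T_q of Eq. (5) in the double log approximation."""
    return math.exp(-alpha_s * sudakov_exponent_q(mu_0, sqrt_s_hat))


if __name__ == "__main__":
    M = 12.0        # GeV; at leading order sqrt_s_hat = M
    alpha_s = 0.2   # assumed value, for illustration only
    for ratio in (0.44, 1.0, 1.45):
        t_q = sudakov_factor_q(ratio * M, M, alpha_s)
        print(f"mu_0 = {ratio:4.2f} M -> T_q = {t_q:.4f}")
```

As a check of the implementation, for \(\mu _0 = \sqrt{\hat{s}}\) the logarithm vanishes and only the \(-\pi ^2/4\) term survives, giving \(S_q = -C_F\,\pi /4\).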

One may argue that it is not clear how the Sudakov factors could be used with the angular cut, since they are traditionally used to account for no emission in a range of transverse momentum. However, the Sudakov factor depends on virtualities of single particles (as in the original paper [11]), not transverse momentum. This means that we can use them here, provided that we use as their arguments the inclusive scale M or \(\sqrt{\hat{s}}\), where all possible dileptons are taken into account, and the optimal scale \(\mu _0\) with our cutoff. This is valid at double logarithm accuracy and corrections to it will appear only at NNLO. As shown in Ref. [3], the NNLO contribution is rather small after the choice of the optimal scale, which justifies our approach. To make completely sure that the cutoff is not spoiled by the PDF evolution, we would have to calculate this process to all orders or do a Monte Carlo evolution keeping track of all variables of the intermediate partons, but we do not pursue this complicated approach.

In Fig. 3 we show the reduction of the optimal scale with the cutoff, with and without the Sudakov form factor, for dilepton masses of 6 and 12 GeV. The curves start at the case of no cut applied (\(\phi _0=0\)) and end at the most drastic case of \(\phi _0 = \pi \), where all phase space is cut off. In this range, the optimal scale varies from \(\mu _0 = 1.45\,M\) (no cutoff) to \(\mu _0 = 0\).

Fig. 3

Optimal scale as a function of the azimuthal angle cutoff with and without Sudakov factors. We observe that for \(\phi _0 > 0.7 \pi \) the Sudakov factor has little effect on the factorization scale

From Fig. 3, we clearly see that, in the region which starts around \(\phi _0 = 0.7\pi \), Sudakov effects have little impact on the determination of the factorization scale. This is the most important region for studying smaller scales, since \(\mu _0/M < 0.7\) in this case. We can then investigate predictions of the Drell–Yan cross section at smaller scales without worrying about a new theoretical uncertainty due to the Sudakov form factors.

After including most of the NLO contribution in the LO term, it would still be possible to use the parton distributions at a different scale \(\mu _1\) when computing the NLO contribution. Then the NNLO coefficient would depend on \(\mu _0\) and \(\mu _1\), and the idea would be to choose \(\mu _1\) such that almost all of the NNLO contribution is already taken into account at lower orders. This would further reduce the scale uncertainty. We do not pursue such a calculation here, but we argue, as first discussed in Ref. [3], that setting \(\mu _1 = \mu _0\) is already a good choice, since the dominant diagram at small x at NNLO is the one with two gluons in the initial state and most of its contribution will be taken into account by correcting both quark and antiquark legs of the LO diagram with LO (and not NLO) DGLAP.

Another possibility is to combine the azimuthal angle cutoff with the transverse momentum cutoff discussed in Ref. [6]. This would lower the optimal scale with respect to the application of a single cut, but we expect it will not be much lower. In fact, we expect the two cutoffs to be similar in the sense that a large part of the phase space removed by one is also removed by the other. For instance, the optimal scale for \(\phi _0 = 0.85 \pi \) is \(\mu _0 = 0.44 M\); if we also cut the dilepton transverse momentum at \(k_0 = M\), the optimal scale is still \(0.44 M\) within rounding error, and if we set \(k_0 = M/2\), we have \(\mu _0 = 0.42 M\). In conclusion, the benefit of applying both cuts should be weighed against the possible experimental difficulties of measuring this new cross section; depending on the setup, it may be better to apply a single but stricter cut.

4 Predictions with an azimuthal angle cutoff

As described in Sects. 2 and 3, we are now in a position to lower the scale with an azimuthal angle cut and investigate the effects of the cutoff on the cross section. We are interested in applying a cut for which Sudakov factors do not change our results much, \(\phi _0 > 0.7\pi \). A good choice is \(\phi _0 = 0.85\pi \), for which the optimal scale is reduced to \(\mu _0 = 0.44 M\). In Fig. 4, we show our predictions for the differential cross section in dilepton rapidity Y for the Drell–Yan process at the LHC energy of \(\sqrt{s}= 14\) TeV. We use MMHT14 NLO PDFs [12] and set the dilepton mass equal to 6 and 12 GeV.

Fig. 4

Drell–Yan differential cross section for \(M = 12\) GeV (left) and 6 GeV (right). The upper curves correspond to the result without any cut (\(\mu _0 = 1.45M\)), while the lower ones correspond to the result with an azimuthal angle cutoff of \(\phi _0 = 0.85\pi \). The bands display the \(1\sigma \) PDF uncertainty and show that it can be reduced by a proper measurement (with current LHC precision) of such an observable

The upper curves in Fig. 4 correspond to the absence of any cutoff; therefore \(\mu _F = \mu _0 = 1.45 M\). In this case, the scale at which the partons are probed is still larger than the usual choice \(\mu _F = M\). The lower curves correspond to the cutoff \(\phi _0 = 0.85\pi \), for which we have a much lower scale (less than a third of \(1.45 M\)) but still a considerable cross section, since approximately 50% of the dileptons produced are kept.

We also calculate the \(1\sigma \) error corridors coming from the PDF uncertainty, which, depending on Y, are rather large. The current precision of the measurements at the LHC is better than this PDF uncertainty, leading us to believe that a proper measurement of such an observable would add new precise knowledge about the PDFs. In the next section we will see that the remaining factorization scale uncertainty is smaller than these bands.

Fig. 5

Drell–Yan differential cross section given at two factorization scales \(\mu _F = \mu _0\) (black) and \(\mu _0/2\) (red). The azimuthal angle cutoff \(\phi _0 = 0.85 \pi \) is imposed, with optimal scale at LO given by \(\mu _0 = 0.44 M\). This shows that the remaining factorization scale uncertainty is greatly reduced

5 Sensitivity to the choice of factorization scale

We should now verify the behaviour of the cross section, Eq. (4), with respect to the remaining factorization scale dependence. To do so, we fix the scale in the LO PDF (\(\mu _F = \mu _0\)) and in the NLO coefficient \(C^\mathrm{NLO}(\mu _0)\), while varying the factorization scale, \(\mu _F\), in the PDF multiplying the NLO contribution. We investigate the central prediction \(\mu _F = \mu _0\) and also a smaller factorization scale \(\mu _F = \mu _0/2\). Here we cannot use the larger \(\mu _F = 2 \mu _0\), because it would allow the DGLAP evolution to violate the cutoff. This would happen through the emission of partons with enough transverse momentum to give the photon, and hence the dilepton, a non-negligible transverse momentum. The net effect would be a reduction of the azimuthal angle \(\phi \), moving into the forbidden region some events previously understood to be in the allowed region of \(\phi \).

In Fig. 5, we show the scale variation described above for the differential cross section in rapidity for \(M = 6\) GeV and 12 GeV, setting the LHC energy to 14 TeV and, as an example, applying the azimuthal angle cutoff \(\phi _0 = 0.85\pi \) with \(\mu _0 = 0.44 M\). The renormalization scale is kept fixed at \(\mu _R = M\). We can see that changing the factorization scale does not change the results much. Therefore, the role of the optimal scale still holds and the uncertainty in the choice of scale is reduced.

6 Conclusion

In this work, we investigated the production of Drell–Yan dileptons at small x with a cutoff that excludes smaller values of the azimuthal angle, \(\phi <\phi _0\). Following the prescription established in earlier work, we calculated the leading-order optimal factorization scale using the dominant diagram at NLO, i.e., the gluon–quark Compton scattering. In doing so, the main theoretical uncertainty (factorization scale) was reduced, as can be seen for \(\phi _0 = 0.85 \pi \) in Fig. 5.

We provided the optimal scale as a function of the size of the cutoff \(\phi _0\) in Fig. 3. By introducing the cutoff, it was possible to lower the scale at which the parton distributions are probed, for example, \(\mu _0=0.44M\) at \(\phi _0 = 0.85 \pi \). In order to avoid the DGLAP evolution of the PDFs spoiling the proposed observable by the emission of a parton in the cutoff region, appropriate Sudakov factors were included. They changed the dependence of the optimal scale on \(\phi _0\), but for \(\phi _0 > 0.7 \pi \), the change of its absolute value is very small and therefore the optimal scale is quite robust regarding this correction.

Finally, we calculated the cross section of the discussed observable with \(\phi _0 = 0.85 \pi \) in Fig. 4, showing that indeed we will have a smaller cross section by a factor of about 2 when compared with the case without the cutoff. The uncertainty bands shown indicate that the determination of the parton distributions can be improved, since the uncertainty due to the factorization scale was greatly reduced.