Perturbative QCD description of jet data from LHC Run-I and Tevatron Run-II

We present a systematic comparison of jet predictions at the LHC and the Tevatron, with accuracy up to next-to-next-to-leading order (NNLO). The exact computation at NNLO is completed for the gluons-only channel, so we compare the exact predictions for this channel with an approximate prediction based on threshold resummation, in order to determine the regions where this approximation is reliable at NNLO. The kinematic regions used in this study are identical to the experimental setup used by recently published jet data from the ATLAS and CMS experiments at the LHC, and CDF and D0 experiments at the Tevatron. We study the effect of choosing different renormalisation and factorisation scales for the NNLO exact prediction and as an exercise assess their impact on a PDF fit including these corrections. Finally we provide numerical values of the NNLO k-factors relevant for the LHC and Tevatron experiments.


Introduction
Single jet inclusive and dijet observables are the most fundamental QCD processes measured at hadron colliders. They probe the basic parton-parton scattering in QCD and thus allow for a determination of the parton distribution functions in the proton and for a direct probe of the strong coupling constant up to the highest energy scales that can be attained in collider experiments. In particular, gluon scattering is a direct contribution to the production of high-p T jets. For this reason, jet data is included in PDF fits with the goal of assessing the gluon distribution in the proton at medium to large values of the momentum fraction x.
Improvements in the accuracy of the theoretical predictions for the single jet inclusive cross section beyond next-to-leading order (NLO) in QCD perturbation theory have been achieved recently. First, the exact next-to-next-to-leading order (NNLO) prediction for the gluons-only channel has been published in [1,2]. Second, an approximate NNLO prediction based on threshold resummation is presented in [3].
In this work we perform a systematic study comparing theoretical predictions at leading (LO), next-to-leading (NLO) and next-to-next-to-leading order (NNLO) to recent data from the LHC and Tevatron experiments. The aim of this study is to understand and characterize the validity of the NNLO threshold approximation [3] by comparing it to the exact computation in the gluon-gluon channel [1,2]. In Ref. [3] the threshold approximation is compared to the exact calculation in the gluon-gluon channel, showing good agreement at large p T after integration over rapidity. At small p T, however, it tends to diverge from the exact computation. Our objective is to determine the experimental regions where this breakdown of the threshold approximation occurs. A rejection criterion to exclude approximate predictions will be suggested based on the gluon-gluon channel, which is dominant in the small p T region. In this region the full NNLO computation is dominated by the gluon-gluon channel, and therefore the predictions from the exact NNLO calculation in this channel are reliable for determining the kinematic regions in which the threshold terms become accurate. Moreover, and contrary to the study made in Ref. [3], we will compare both predictions using the same factorisation and renormalisation scales in both calculations and discuss the effects of making different scale choices in the theory predictions.
In order to obtain the exact NNLO predictions, the calculation in [1,2] used the antenna subtraction scheme [4] to perform the cancellation of IR singularities between real and virtual corrections at NNLO [5][6][7][8]. For hadron collider observables this includes contributions due to radiative corrections from partons in the initial state [9][10][11][12][13]. The cancellation of IR singularities is achieved analytically in all intermediate steps of the calculation thereby producing a strong check on the correctness of the calculation. In this calculation the exact two-loop [14][15][16], one-loop [17] and tree-level [18] QCD matrix elements at NNLO are included in a parton-level generator NNLOJET, which integrates them over the exact full phase space to compute any infrared safe two-jet observable to NNLO accuracy. For the purposes of the present study we compute the single jet inclusive cross section pp → j + X where we require to observe at least one jet in the final state and integrate inclusively any additional radiation.
In Ref. [3] approximate NNLO results for the same observable were derived using the formalism of threshold resummation for single jet production in hadron-hadron collisions. In this framework the threshold limit is defined by the vanishing of the invariant mass of the system that recoils against the observed jet, $s_4 = P_X^2 \to 0$. In this limit the phase space available for additional soft radiation is restricted such that the higher kth order coefficient functions are dominated by large logarithmic corrections. The NNLO threshold calculation then performs a systematic resummation of these logarithmically enhanced contributions for all partonic channels to all orders in the strong coupling $\alpha_s$, by determining the three leading logarithmic contributions $\propto (\log^3(z)/z)_+$, $(\log^2(z)/z)_+$, $(\log(z)/z)_+$ and keeping the full dependence of the cross section on the jet rapidity [3]. The soft contribution δ(z) as well as non-enhanced regular terms in z of NNLO accuracy are not computed in this approach. For this reason it has been shown in [3] that different approximate NNLO predictions can be derived from the threshold formalism if the variables used in the computation differ away from the threshold z = 0 limit (but are otherwise identical at z = 0). This effect can lead to a significant change in the shape of the approximate NNLO threshold prediction [3] and increases its uncertainty.
Together with the comparison between the predictions at NNLO obtained in the threshold formalism and in the exact fixed-order calculation, we also provide the NNLO/NLO k-factors relevant for the Tevatron and the LHC experiments. The paper is organized as follows. In Sect. 2 we present the jet data selected for the comparison, and the setup of the computational tools used for the generation of the theoretical predictions. In Sect. 3 and Sect. 4 we show the results for the LHC and the Tevatron experiments respectively. In Sect. 5 we present as an exercise an NNLO PDF fit using the NNLO k-factors computed in Sect. 3 and Sect. 4. In Sect. 6 we present our conclusions and directions for future work. An appendix is enclosed which provides tables with k-factors in the gluon-gluon channel at the LHC and the Tevatron.
2 Benchmark predictions for jet production

LHC and Tevatron jet data
In order to provide realistic comparisons, based on real data which is already included in the extractions of parton distribution functions (PDFs) [19], we have selected recent data sets obtained during the LHC Run-I and the Tevatron Run-II. Using data from both colliders provides the possibility of investigating differences and similarities between datasets for different collision energies and kinematic coverage. A summary of the experimental data included in our analysis is presented in Table 1. As we will show in Sect. 3
Table 1: Jet data included in the current analysis with the respective kinematic information.
From the LHC experiments we have included the CMS measurements of the double differential jet cross sections at √ s = 7 TeV [20], where jets are reconstructed up to |η| < 2.5. We have also included the ATLAS measurements of inclusive jet cross sections at √ s = 7 TeV [21] and √ s = 2.76 TeV [22], where the rapidity coverage reaches |η| < 4.4.
For both LHC experiments jets are reconstructed with the anti-k t algorithm. The main differences between the CMS and ATLAS data are the choice of jet resolution parameter R, which is R=0.7 for CMS and R=0.4 for ATLAS, and the p T coverage: CMS covers the very high p T region, reaching 2 TeV, while ATLAS measures very low p T jets starting from 20 GeV. Concerning the Tevatron data, we have included the most recent CDF Run-II k t jets [23] and the D0 Run-II cone data [24]. In contrast to the LHC data, the center of mass energy of both sets is √ s = 1.96 TeV and their coverage in rapidity and p T is smaller than that of the ATLAS and CMS experiments. It is important to highlight that CDF uses the k t algorithm for the jet reconstruction, while D0 presents data reconstructed with the MidPoint cone algorithm, which is infrared unsafe at NNLO.

Theoretical predictions
Theoretical predictions presented in this work are computed exclusively with the central value of the NNPDF23_nnlo_as_0118 set, presented by the NNPDF collaboration in Ref. [19]. This set is used for predictions at all perturbative orders. However, we are interested in comparing predictions at the same order and thus the choice of the input PDF is only marginally relevant. At LO and NLO full exact predictions are available and these have been computed with the FastNLO [25] interface for CMS, CDF and D0 and with the APPLgrid [26] tables for the ATLAS predictions. Tables used with both interfaces have been computed with the NLOjet++ program [27,28]. Predictions at NNLO are computed using the exact fixed-order results in the gluon-gluon channel and with the threshold approximation code.
To obtain the exact predictions at NNLO we use the parton-level Monte Carlo code NNLOJET recently presented in Refs. [1,2,29], interfaced with libHFILL (available at http://libhfill.hepforge.org/), a histogram library developed for this work and compatible with all MC programs, which allows the automatic construction of jet p T distributions from event weights using the binning and kinematic regions presented in Table 1. The Monte Carlo uncertainties presented for the exact predictions are below the percent level. In this code, the gg → gg + X contribution at full colour and the qq → gg + X [8,29] contribution at leading colour are available at NNLO; the current limitations are the missing partonic contributions for qg and qq scattering.
To obtain the approximate NNLO predictions based on threshold resummation we use the threshold approximation code [3], which implements predictions for all channels. We have also used the narrow-jet approximation code (NJA) presented in Refs. [30,31], which computes in analytic form the single jet inclusive cross section at LO and NLO in the narrow-jet limit, where both the matrix elements and phase space are expanded around the narrow-jet limit. The original versions of the NJA and threshold codes have been improved through comparisons with exact calculations and updated to use the LHAPDF [32] interface to PDFs and to include the bottom and anti-quark PDF contributions to the total luminosity. After these modifications both codes show full agreement at LO with the exact calculations and can be used for comparisons with experimental data.
For all predictions the value of α s is provided by the PDF set through the LHAPDF [32] interface. For the exact calculation we generate predictions using two different dynamical renormalisation and factorisation scales. One choice evaluates the fixed-order single jet inclusive cross section using µ R = µ F = µ = p T 1, where for each event the renormalisation and factorisation scales are set equal to each other and equal to the p T of the leading jet in the event. The p T of the leading jet is obtained after clustering all parton final-state momenta into jet momenta using the appropriate jet algorithm employed by each experimental setup. Whenever the jet algorithm determines a partonic clustering the 4D recombination scheme is applied, i.e., the 4-momenta of the jet's constituents are added, producing a list of final-state jets that are ordered in p T at the end of the clustering procedure. As a second choice we computed the fixed-order single jet inclusive cross section using µ R = µ F = µ = p T, where in this case each jet in every event is binned with the weight evaluated at the scale p T of the jet. While at LO the two final-state partons generate two jets with equal transverse momentum p T 1 = p T 2 = p T and the two scale choices coincide, radiative corrections can generate subleading jets and the effects of the different scale choice in the theory prediction become apparent at NLO and NNLO.
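The two scale choices described above can be illustrated with a minimal Python sketch. It assumes that the jet algorithm has already decided which partons belong to which jet, and the helper names (`recombine`, `jet_pt`, `scale_assignments`) are hypothetical; this is not the NNLOJET implementation.

```python
import math

def recombine(constituents):
    """Add the 4-momenta (E, px, py, pz) of all constituents of one jet."""
    return tuple(sum(p[i] for p in constituents) for i in range(4))

def jet_pt(jet):
    """Transverse momentum of a jet 4-momentum (E, px, py, pz)."""
    _, px, py, _ = jet
    return math.hypot(px, py)

def scale_assignments(clusters, choice):
    """Return (pT, mu) pairs for every jet in one event, ordered in pT.

    clusters: list of jets, each given as a list of parton 4-momenta
              (the clustering itself is assumed to be done elsewhere).
    choice:   "pt1" uses the leading-jet pT as the scale for all jets;
              "pt"  uses the pT of each individual jet.
    """
    pts = sorted((jet_pt(recombine(c)) for c in clusters), reverse=True)
    if choice == "pt1":
        return [(pt, pts[0]) for pt in pts]
    if choice == "pt":
        return [(pt, pt) for pt in pts]
    raise ValueError(f"unknown scale choice: {choice}")
```

For an illustrative three-jet event with jet p T of 100, 70 and 30 GeV, the choice `"pt1"` assigns µ = 100 GeV to all three entries of the inclusive distribution, while `"pt"` assigns 100, 70 and 30 GeV respectively, so the two choices differ only for the subleading jets.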
The approximate threshold prediction uses µ R = µ F = µ = p T , where the p T of the observed jet is generated by the MC integration and the general structure of the resummed cross section applies specifically to the anti-k T algorithm [3].
In the next section we present the theoretical predictions for all the experimental setups in Table 1 at each order in perturbation theory up to NNLO and perform a benchmark comparison of the various approximations.
When assessing the validity of the threshold prediction at NNLO we will suggest a rejection criterion, which is to exclude approximate predictions that are more than 10% off the exact prediction. We will apply this criterion in the gluons-only channel to help determine the regions in the experimental setup where the threshold approximation is applicable. This choice of criterion is not a recommendation; its purpose is to fix a level of accuracy when using approximate predictions. In Sect. 5 we will discuss the effects of being more restrictive or flexible with this choice.
For each experiment in Table 1 we have generated all channel full LO, NLO and approximate NNLO predictions and compared them directly against the experimental data. In Figure 1 we show an example of this analysis for the first rapidity bin of the CMS jets 2011 dataset. On the left plot of Fig. 1, the full channel theoretical predictions are normalized to the CMS data, corrected by non-perturbative corrections, where uncertainties are estimated from the diagonal of the covariance matrix, which is extracted by considering systematic uncertainties additively.
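A minimal sketch of how such a criterion could be applied bin by bin is given below. It assumes that the disagreement is measured relative to the exact k-factor; the function names and the numbers in the usage note are illustrative only and are not taken from the tables of this work.

```python
def relative_difference(k_exact, k_threshold):
    """Relative difference of the threshold k-factor with respect to the exact one."""
    return (k_threshold - k_exact) / k_exact

def accepted_bins(k_exact, k_threshold, delta=0.10):
    """Indices of the bins where |relative difference| does not exceed delta."""
    return [
        i
        for i, (ke, kt) in enumerate(zip(k_exact, k_threshold))
        if abs(relative_difference(ke, kt)) <= delta
    ]
```

With made-up k-factors `k_exact = [1.10, 1.15, 1.20]` and `k_threshold = [1.60, 1.30, 1.22]`, ordered from low to high p T, only the last bin passes the 10% cut, mimicking the typical situation where the low-p T bins are rejected.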
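The source does not spell out how the covariance matrix is built, so the following is only a sketch under a common assumption: statistical uncertainties are uncorrelated, each systematic source is fully correlated across bins and enters additively, and the plotted uncertainty of bin i is the square root of the diagonal element. The function and argument names are hypothetical.

```python
import math

def diagonal_uncertainties(stat, systematics):
    """Per-bin total uncertainty from the diagonal of an additive covariance matrix.

    stat:        uncorrelated absolute uncertainties, one per bin.
    systematics: list of sources; each source lists one absolute shift per bin.
    Assumed form: cov_ii = stat_i^2 + sum_k (sys_k,i)^2.
    """
    return [
        math.sqrt(s * s + sum(src[i] ** 2 for src in systematics))
        for i, s in enumerate(stat)
    ]
```

Only the diagonal is needed for the error bars of a data/theory ratio plot; the off-diagonal elements matter once a χ 2 is computed.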
From this plot we observe an excellent agreement at LO between all codes and that the data is well described by the NLO predictions. We note that the NNPDF2.3 set used in this comparison is obtained from a NNLO fit that includes jet data from the Tevatron and the LHC for which the corresponding theory predictions are known presently only to NLO accuracy. For this reason, higher order theory effects beyond NLO are not taken into account in the jet prediction used in the fit. As a result, we observe that the approximate NNLO prediction based on threshold resummation predicts a cross section above the data indicating the need to consistently include NNLO jet predictions in NNLO PDF fits of jet data. Part of this excess could be due to the inherent approximated nature of this prediction and for this reason we aim to disentangle in the next sections the regions which correspond to a breakdown of the threshold approximation.
On the right plot of Fig. 1 we quantify the size of the higher order corrections by computing ratios of higher order cross sections over the leading order one. These k-factors show that NLO corrections vary between 20% and 45% with respect to the LO prediction with the approximate NNLO threshold corrections varying between 40% and 70%. We also observe that the NLO/LO k-factors using the threshold (in green) and NJA (in red) codes are in good agreement with the exact computation (in blue).
A more detailed comparison between the all channels exact and approximate predictions is presented in Figure 2. At LO and NLO we computed ratios between the approximate predictions at each order and the exact predictions at the same order. We conclude that the LO predictions of all codes are in perfect agreement in all regions of p T, while as expected the NLO threshold (in green) and the NJA (in red) predictions converge to the exact computation for high values of p T. The high-p T region corresponds precisely to the threshold region $s_4 = 0$ where the phase space available for additional radiation is limited.
In the same figure we present the exact NLO/LO k-factor (in blue) and with the approximate NNLO prediction we have constructed NNLO/NLO k-factors using two different choices for the denominator: the approximate NLO threshold (in light magenta) and the NLO exact (in dark magenta). As we can see the effects due to this choice are at the few percent level, negligible in comparison to the size of the approximate NNLO threshold correction.
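The k-factor definitions used here amount to bin-by-bin ratios of predictions. A minimal sketch, with hypothetical function names and assuming all predictions share the same p T binning, is:

```python
def k_factor(numerator, denominator):
    """Bin-by-bin ratio of two cross-section predictions."""
    if len(numerator) != len(denominator):
        raise ValueError("predictions must share the same binning")
    return [n / d for n, d in zip(numerator, denominator)]

def threshold_nnlo_k_factors(nnlo_thr, nlo_thr, nlo_exact):
    """Two NNLO/NLO k-factors for the threshold prediction, one per denominator."""
    return {
        "thr/thr": k_factor(nnlo_thr, nlo_thr),      # approximate NLO denominator
        "thr/exact": k_factor(nnlo_thr, nlo_exact),  # exact NLO denominator
    }
```

Comparing the two entries of the returned dictionary bin by bin gives the size of the effect due to the choice of denominator.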
We have performed this exercise for all experiments, however, in the next subsections we limit the analysis to the gluon-gluon channel because we are interested in determining the regions where the NNLO threshold k-factors are in agreement with the exact computation, which is available for that channel. The full report with all plots and tables of

CMS jets
In Figure 3 we show the ratios between predictions in the gluons-only channel computed with the codes presented in Sect. 2 using the rapidity and p T bins of the CMS jets 2011 dataset. The LO, NLO and NNLO predictions labelled exact are obtained from the Monte Carlo NNLOJET presented in [1,2,29] and are compared with the NJA code [30,31] at LO and NLO and with the threshold code [3] at LO, NLO and NNLO. For all predictions we set the renormalisation and factorisation scales equal to each other and equal to the p T of each individual jet in every event (µ R = µ F = p T ). For all plots we first check the agreement at LO and NLO between all codes and then the agreement of the NNLO/NLO k-factors obtained with the threshold approximation code and the exact computation. As we did in the previous section, we provide two definitions for the NNLO threshold k-factor by dividing the approximate NNLO threshold predictions by the approximate NLO threshold predictions (in light magenta long-dashed curves) and also by dividing them by the exact NLO prediction (in dark magenta long-dashed curves). The NNLO/NLO k-factor using the exact computation at NLO and NNLO is plotted in long-dashed black curves. The distance between the long-dashed black curve and the long-dashed magenta curves in Figure 3 and subsequent Figures indicates the level of disagreement between the k-factors produced by the exact NNLO computation and the approximate NNLO threshold computations.
By looking at Figure 3 we conclude that all predictions at LO are in full agreement with differences below the percent level. At NLO the NJA code shows percent level differences at small p T . Similarly the NLO threshold code shows percent level differences at small p T which rise to 5% at central p T for the last bins in rapidity.
Concerning the NNLO predictions we looked at the NNLO k-factors and relative differences bin by bin. The relative difference between the exact computation and the threshold computations is documented in Tables 6 to 10 where, for each rapidity slice of the experiment, we show for each p T bin the experimental cross section, the experimental error, the gluons-only exact NNLO and threshold NNLO k-factors together with their relative percentage difference, and finally the relative percentage difference between the two possible NNLO gluons-only threshold k-factors.
We first notice that for the entire kinematic range of the experiment the choice of the denominator for the NNLO threshold k-factor produces relative differences which are much smaller than the difference to the exact k-factor. When comparing either of these with the exact NNLO computation we find for all rapidity bins an instability at low-p T in the approximate NNLO results. In these regions the approximate NNLO threshold k-factor starts to rise, generating large perturbative corrections. While for the first two bins in rapidity, |η| < 1.0, the relative differences with the exact calculation are below 10%, we observe strong deviations for |η| > 1.0.
Using the rejection criterion suggested at the end of Section 2.2 we conclude that for CMS the NNLO threshold prediction is not applicable for the rapidity slices |η| > 1.5, as relative differences with respect to the exact computation are larger than 20% and can rise up to 60%. Furthermore, due to the instability of the approximate prediction at low-p T, for the rapidity slice 1.0 < |η| < 1.5 the first seven p T bins should be excluded.
As mentioned in the introduction, the comparison between the exact fixed-order calculation and the threshold approximation performed in [3] used different central scale choices for each prediction. The predictions from the threshold resummation formalism were obtained using µ R = µ F = p T , where p T is the individual jet p T while the fixed-order calculation used µ R = µ F = p T 1 , where p T 1 is the transverse momentum of the leading jet in each event. In order to eliminate this inconsistency, and also to study the impact of the central scale choice in the fixed-order predictions, we show in Figure 4 NLO and NNLO gluons only cross sections evaluated at the two different scales for the first rapidity slice of the CMS experiment. We observe that at high-p T subleading jets tend to be soft and in this region p T 1 ∼ p T and the predictions using either scale choice coincide. In the low-p T region we observe low-p T jets accompanying a high-p T object. In this case p T < p T 1 and we observe an increase in the NLO prediction of about 5% when using the scale µ R = µ F = p T . This effect is due to the fact that the virtual contribution which has Born kinematics remains identical with either scale choice while the real radiation contribution is enhanced when its weight is computed at a lower scale. At NNLO we observe instead a reduction of the size of the fixed-order prediction at low-p T . As a consequence we conclude that the NNLO/NLO k-factor is typically smaller in the low-p T region with the scale choice µ R = µ F = p T as compared to the choice µ R = µ F = p T 1 . The resulting reduced NNLO k-factor for the exact prediction then shows that the disagreement between the exact calculation and the threshold calculation is enhanced when both calculations are performed using the same central scale choice. 
This observation is rapidity independent as emissions in events with a high-p T central object can produce low-p T jets entering the single jet inclusive p T distribution in the forward regions.

ATLAS jets at √ s = 7 TeV and √ s = 2.76 TeV
The main difference between ATLAS and CMS jets is the different kinematic coverage as shown in Table 1. ATLAS provides jet data at two distinct center of mass energies and kinematic ranges. Figures 5 and 6 present the gluons-only theoretical predictions for ATLAS jets at √ s = 7 TeV and √ s = 2.76 TeV respectively. For all predictions we set the renormalisation and factorisation scales equal to each other and equal to the p T of each individual jet in every event (µ R = µ F = p T ). In both cases the LO predictions are in perfect agreement for all bins. At NLO the NJA (narrow-jet approximation) prediction shows good agreement with the exact NLO result across the entire kinematic range. We note that the ATLAS experiment employs R=0.4 for the anti-k t jet clustering procedure and therefore the agreement between the NJA approximation and the exact calculation is expected and observed to improve for smaller values of R. The threshold prediction at NLO is also in good agreement with the exact calculation but shows an evident instability at small p T where discrepancies vary between 10% and 40%, increasing with rapidity.
The predictions at NNLO are in worse agreement when compared to the results obtained at CMS. There is a constant gap between the threshold and the exact predictions for ATLAS jets at √ s = 7 TeV and √ s = 2.76 TeV. The convergence of the threshold approximation code to the exact computation is not evident as it is for CMS because the maximum p T values are smaller for ATLAS.
Figure 4: NLO (left) and NNLO (right) exact gg-channel predictions for CMS evaluated with the renormalisation and factorisation scales µ R = µ F = p T and µ R = µ F = p T 1. In the lower pads we present the relative differences due to the different central scale choice.
In Tables 11 to 17 in the Appendix we document the NNLO k-factor results. The general behaviour is similar to CMS: differences between the exact NNLO computation and the approximate NNLO threshold computation are large at small p T (between 20% and 80%) and increase with rapidity. For the rapidity range 0.8 < |η| < 2.1 and for all p T the disagreement between the two predictions is between 10% and 100%. For the rapidity regions |η| > 2.1 the disagreement is larger than 100% for all p T. An application of the rejection criterion suggested before excludes approximate predictions for the first ten p T bins with |η| < 0.3 and the first thirteen p T bins with 0.3 < |η| < 0.8. In the regions of rapidity |η| > 0.8 we should discard all points, as differences between the exact and approximate calculation are for all values of jet p T much larger than 10%.
In particular, and contrary to statements made in Ref. [3], the large NNLO k-factors observed in the approximate threshold calculation (Figure 5), of order 5 or so at η ∼ 4, are not present in the exact NNLO calculation. As can be seen in Figure 5, while the exact NNLO k-factor decreases with rapidity, the NNLO k-factor of the approximate NNLO calculation increases with rapidity.
The comparison between the NNLO √ s = 7 TeV ATLAS k-factors and the NNLO √ s = 2.76 TeV ATLAS k-factors shows a moderate dependence on the center of mass energy, with the predictions at √ s = 2.76 TeV giving slightly smaller NNLO k-factors.
With the results documented in Tables 18 to 24 and by extending the rejection criterion suggested for √ s = 7 TeV to ATLAS at √ s = 2.76 TeV we conclude that the approximate calculation gives acceptable predictions only for the first rapidity slice |η| < 0.5 after removing the first eight p T bins.
Figure 7: NLO (left) and NNLO (right) exact gg-channel predictions for ATLAS evaluated with the renormalisation and factorisation scales µ R = µ F = p T and µ R = µ F = p T 1. In the lower pads we present the relative differences due to the different central scale choice.
We conclude this section by comparing the exact fixed-order predictions in the gg-channel evaluated at the scales µ R = µ F = p T 1 and µ R = µ F = p T. In Figure 7 we show NLO and NNLO gluons-only cross sections evaluated at the two different scales for the first rapidity slice of the ATLAS experiment. The qualitative effect is similar to the one shown for CMS in the previous section. By choosing µ R = µ F = p T as a central scale we observe that at low-p T the NLO prediction increases by about 10% while the NNLO prediction is reduced by around 20%, with respect to the results obtained using µ R = µ F = p T 1. Predictions at high-p T are as expected identical with either scale choice. We note that, because the p T coverage of ATLAS is more extreme than that of CMS (nearly two orders of magnitude in p T are covered by ATLAS), at low-p T we are more often in the kinematical regions where p T ≪ p T 1, and therefore the effects that result from changing the central scale choice for the predictions from µ = p T 1 to µ = p T are enhanced.
In Figure 8 we show the exact NLO/LO and NNLO/NLO k-factors for each scale choice. Interestingly we observe that for the first bin in p T the perturbative series behaves for µ R = µ F = p T as a 14% NLO correction with respect to LO and a 14% NNLO correction with respect to NLO. This compares with a 3% NLO correction with respect to LO and a 54% NNLO correction with respect to NLO when using µ R = µ F = p T 1. Therefore, the convergence of the perturbative series in the fixed-order calculation is improved using µ R = µ F = p T, as the NNLO/NLO k-factor is smaller than the NLO/LO k-factor for all p T and rapidity. Similarly to the results presented for CMS in the previous section we conclude that the disagreement between the exact calculation and the threshold calculation is enhanced when both calculations are performed using the same central scale choice. In particular, by cutting away kinematical regions where the disagreement between both calculations is larger than 10% we keep only data points for which the NNLO k-factors are typically smaller, ∼ 1.1 − 1.2. As we will show in Sect. 5 this has the consequence of improving the χ 2 of an NNLO PDF fit including the NNLO single jet inclusive prediction.
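The percentages quoted for the first p T bin can be turned into a small worked check. The sketch below uses hypothetical helper names and only the four numbers quoted in the text; it multiplies the successive k-factors to obtain the cumulative NNLO/LO ratio and tests whether the NNLO correction is no larger than the NLO one.

```python
def cumulative_k_factor(nlo_over_lo, nnlo_over_nlo):
    """NNLO/LO ratio obtained from the two successive k-factors."""
    return nlo_over_lo * nnlo_over_nlo

def corrections_decrease(nlo_over_lo, nnlo_over_nlo):
    """True if the NNLO correction is not larger than the NLO correction."""
    return abs(nnlo_over_nlo - 1.0) <= abs(nlo_over_lo - 1.0)

# First pT bin, numbers quoted in the text for each scale choice.
pt_scale = (1.14, 1.14)    # mu = pT:  14% NLO, 14% NNLO
pt1_scale = (1.03, 1.54)   # mu = pT1:  3% NLO, 54% NNLO
```

Under these inputs the cumulative NNLO/LO ratio is about 1.30 for µ = p T and about 1.59 for µ = p T 1, and only the first choice satisfies the simple ordering test.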

CDF jets
The gluons-only theory predictions for the CDF setup are presented in Figure 9. We observe that the level of agreement at NLO and LO is better with respect to the LHC comparisons presented in the previous section. As before, Tables 25 to 29 show the k-factors for the gg-channel, where non-perturbative corrections have been applied to the experimental data as performed for ATLAS and CMS.
At NNLO the situation has improved, since differences between exact and threshold approximation results are smaller than what we have observed for the LHC experiments. With a rejection criterion of excluding points where the disagreement is larger than 10% we observe that the last two rapidity slices, with |η| > 1.1, should be excluded.
The main differences between the Tevatron and LHC setups are the different center of mass energies, the projectile particles and the kinematic ranges, which are narrower at the Tevatron than at the LHC. For these reasons, the threshold approximation code provides at the Tevatron predictions closer to the exact NNLO calculation.

D0 jets
The results of the comparison for the D0 experimental setup are presented in Figure 10 with the respective Tables 30 to 35. As for CDF, the NLO and LO predictions are in good agreement for all rapidity and p T bins. The NNLO predictions behave similarly to the CDF results, with the threshold approximation code providing acceptable predictions at least for the first three rapidity slices. Predictions for the D0 experiment have been generated using the k t algorithm for the jet reconstruction instead of the MidPoint cone algorithm used for the measurement, because the latter is IR unsafe at NNLO.
This represents a drawback for analysing jet data from D0 at NNLO. At the moment D0 data has been included in PDF fits where the IR finiteness at NLO of the MidPoint cone jet algorithm allows the perturbative computation to be performed at this order. A possible solution to include a perturbative prediction at NNLO for the D0 data could be to identify a relationship between the MidPoint algorithm and an IR safe cone algorithm such as SISCone. As discussed in [33], relative differences between the two algorithms are expected to start at $O(\alpha_s^4)$, when we have 3 particles in a common neighbourhood plus one to balance the momentum. By producing an inclusive jet p T spectrum using just tree-level 2 → 4 diagrams, differences at the level of 1-2% are observed at the Tevatron in [33] between the predictions of both jet algorithms when the SISCone algorithm is employed with cone parameters R=0.7 and f=0.5. A detailed study applying this prescription for the NNLO inclusive jet prediction is beyond the scope of the current work.

Exclusion criteria summary
In the previous sections we performed a comparison at NLO and NNLO between the exact predictions and the approximate predictions from the threshold resummation formalism at the same order. We performed this comparison for all experimental hadron-hadron collider setups at the LHC and the Tevatron from which a wealthy dataset of inclusive jet data has been delivered.
This comparison was performed in the gg-channel where the exact NNLO results were delivered first [1,2] and therefore can be used to identify the regions where the approximate NNLO prediction emulates the exact results. As a result of this exercise we include in this work the full set of NLO/LO and NNLO/NLO k-factors based on both approaches for the gg-channel and also the approximate NNLO all-channel prediction. Using these predictions we can identify kinematical regions where discrepancies between the two results in the gg-channel are larger than δ=10%. In such regions we conclude that the results of the approximate prediction should not be trusted and for this reason, leads to an exclusion of part of the full dataset of single jet inclusive cross section measurements at hadron-hadron colliders that can be analysed. We studied the effect of being more restrictive or relaxed with this criteria and summarise experiment by experiment the resulting exclusion regions of experimental data as a function of δ in Tables 2 to 5. As expected, being more restrictive and demanding a smaller relative difference between exact and threshold k-factors leads to an increased exclusion region of data points. Using δ=5% results in excluding more than half the data points from CMS and all data points from ATLAS. This information is also reproduced in Figure 11 where we show the relative difference between the exact and approximate NNLO k-factor in the gg-channel in the (p T , |y|) plane of each experiment.
With this information at hand we aim in this section to perform template NNLO PDF fits of jet data including the approximate NNLO corrections. We choose to perform four fits, requiring less than 5%, 7.5%, 10% and 15% relative disagreement between the exact and the threshold gg-channel k-factors in order to exclude approximate corrections, and assess the impact of these choices on the quality of the fit. The kinematical regions which survive each cut are introduced in the fit through the full channel threshold approximation (the full channel approximate k-factors are available at: http://libhfill.hepforge.org/JetStudy2014). Studying the effect of these cut variations on the fit also provides a way to determine the best exclusion criterion empirically. As mentioned in Sect. 4, data from the D0 experiment will not be included in this exercise.
To perform the fits we have used the latest NNPDF fitting technology [19] to fit partial subsets of jet data determined by the relative difference criterion δ. Moreover, we have also eliminated a few points in large rapidity bins where the full channel k-factor is orders of magnitude larger than the gg-channel k-factor, and considered only datasets which contain at least 2 data points after applying the cuts.
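The two additional cuts can be illustrated with a short Python sketch. The function name, the data layout and the value of `max_ratio` are hypothetical: the text only says "orders of magnitude" and does not quote a numerical threshold, nor does it specify how the data points are grouped into datasets.

```python
def apply_fit_cuts(datasets, max_ratio=10.0, min_points=2):
    """Apply the two extra cuts to points that already passed the delta cut.

    datasets maps a dataset name to a list of (k_full, k_gg) pairs, where
    k_full is the full channel threshold k-factor and k_gg is the gg-channel one.
    max_ratio is a hypothetical stand-in for "orders of magnitude larger".
    """
    selected = {}
    for name, points in datasets.items():
        # Drop points where the full channel k-factor is far above the gg one.
        kept = [(k_full, k_gg) for k_full, k_gg in points
                if k_full / k_gg < max_ratio]
        # Keep only datasets that retain at least min_points points.
        if len(kept) >= min_points:
            selected[name] = kept
    return selected


# Made-up inputs for illustration only.
datasets = {
    "A": [(1.1, 1.2), (1.3, 1.25), (150.0, 1.4)],
    "B": [(1.2, 1.1)],
}
selected = apply_fit_cuts(datasets)
print(sorted(selected), len(selected["A"]))
```

In this toy input, dataset "A" loses its outlying point but survives with two points, while "B" is dropped because a single point remains.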
In Tables 2 to 5 we present, as a function of δ, the experimental χ 2 /dof obtained when fitting the respective subset of data points. For CMS the fits tend to include more data points, in particular in high-p T regions and for fairly central jets with |y| < 1.0. In these regions the jet data probes kinematics not constrained by other data, and we observe that the χ 2 /dof depends only weakly on δ. For CDF a large fraction of data points is included in the fit. In this case, however, we noticed that the χ 2 /dof improves when δ is reduced. Finally, for the ATLAS data, we observe that by reducing δ a large fraction of data points is excluded, and this results in large χ 2 /dof fluctuations. In conclusion, with these results we cannot find a precise exclusion criterion. However, we suggest a possible compromise of |δ| = 10%, which allows the inclusion of some data from all experiments and provides a reasonable and stable χ 2 /dof in all cases. In this way, within the chosen tolerance, perturbative QCD corrections can be included in PDF fits and result in a reduction of the gluon PDF uncertainty at high x.
Finally we would like to point out that computing the qg [34] and qq [34] scattering processes to exact NNLO accuracy in the fixed-order calculation, along the lines followed for gg [1,2] and qq [8,29] scattering, would remove the need for a rejection criterion to exclude/include approximate higher order predictions. Instead, the exact prediction from the fixed-order calculation allows the use of the full dataset of jet cross section measurements in global NNLO PDF fits of jet data. Moreover, we can test the resulting fit quality in the description of any fully differential 2-jet observable at NNLO. We leave this study for future work.

Table 5: Summary of exclusion regions in p T and rapidity |y| as a function of the relative difference between exact and threshold k-factors for the gluon-gluon channel for the 76 CDF data points. In the table we quote the χ 2 /dof for NNLO PDF fits performed with the full channel approximate k-factors. N dat represents the number of experimental data points included in the fit.

Conclusion and outlook
The purpose of this paper is to compare, in the gluons-only channel, predictions at NNLO for the single jet inclusive cross section based on the exact NNLO calculation published in [1,2] with an approximate NNLO calculation based on threshold resummation published in [3]. This comparison is performed using the same experimental setups employed by the Tevatron and LHC experiments in their jet analyses. With these results we deliver an updated description of the state of the art of the accuracy of the theoretical predictions for the single jet inclusive cross section in QCD at hadron colliders, and in particular revise contradictory statements in the literature [3]. We observe that when the predictions are compared using the same central scale choice for the renormalisation and factorisation scales, the disagreements are larger than previously quoted. Concerning the regions of validity of the NNLO approximation we conclude, based on a criterion of excluding approximate predictions which are more than 10% off the exact prediction, that the threshold approximation code provides predictions that are reasonably close to the exact calculation at large p T and in central rapidity regions. We observed smaller differences between the exact calculation and the approximate NNLO threshold calculations at the Tevatron than at the LHC. It is important to highlight that threshold predictions integrated over rapidity, as shown in [3], are dominated by the central rapidity regions and provide stable results. However, when looking at specific rapidity bins the threshold predictions are, in some cases, far from the exact computation. This remark is important and invites caution when using the threshold approximation for the determination of PDFs.
As an exercise, and to test this observation, we performed a PDF fit including approximate NNLO corrections. We observed that, as expected, the resulting fit quality depends on the criterion employed to exclude/include approximate NNLO corrections. A more conservative criterion has the effect of excluding a larger number of the experimental data points that go into the fit, and favours regions where the approximate prediction gives smaller NNLO corrections in agreement with the exact calculation.
Finally we conclude that, with the current results, there is no trivial way to determine the p T value above which the threshold approximation predictions are reliable. The only possible prescription is to check the relative difference to the exact computation bin by bin, admitting a tolerance which can be correlated to the real data uncertainty. As we have shown, the regions of validity of the threshold approximation depend strongly on the experimental setups that we have analysed and are very likely to be different for the future high-energy Run-II of the LHC.
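One way to read this prescription is sketched below in Python. It is an assumption-laden illustration rather than a procedure used in this work: the function name, the `scale` parameter and the choice of setting the tolerance in each bin equal to a multiple of the relative experimental uncertainty are all hypothetical, as are the input numbers.

```python
def validity_mask(k_exact, k_thr, rel_unc_percent, scale=1.0):
    """Flag, bin by bin, where the threshold k-factor is accepted.

    A bin is accepted when the relative difference between the threshold and
    exact k-factors (in percent, normalised to the exact one, an assumption)
    does not exceed scale times the relative experimental uncertainty.
    """
    mask = []
    for ke, kt, unc in zip(k_exact, k_thr, rel_unc_percent):
        diff = 100.0 * abs(kt - ke) / ke
        mask.append(diff <= scale * unc)
    return mask


# Made-up inputs: three bins with different experimental uncertainties.
k_exact = [1.20, 1.10, 1.25]
k_thr = [1.28, 1.43, 1.30]
rel_unc_percent = [8.0, 12.0, 3.0]

print(validity_mask(k_exact, k_thr, rel_unc_percent))
```

With such a per-bin tolerance, a bin with a small relative difference can still be rejected if the data there are very precise, as happens for the third bin in this toy input.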
As a further improvement it would be interesting to repeat this study when the NNLO exact prediction becomes available for all channels.

A Tables with k-factors for the gluon-gluon channel
In this Appendix we document the numerical results for the comparisons between the NNLO threshold approximation and the NNLO exact calculation in the gluons-only channel. In the following tables we show, for each p T bin of each experiment, the experimental cross section together with its experimental uncertainty computed as described in Sect. 2 (columns 2 and 3). Additionally we give NNLO/NLO gluons-only k-factors with both NNLO and NLO results computed in the exact calculation (column 4) and in the threshold approximation (column 5). The percentage relative difference between the two is given in column 6. For completeness we also give the NNLO threshold k-factor using the NLO exact calculation in the denominator (column 7) and using the approximate NLO threshold calculation in the denominator (column 8). Their percentage relative difference is given in column 9.