# The impact of LHC jet data on the MMHT PDF fit at NNLO

## Abstract

We investigate the impact of the high precision ATLAS and CMS 7 TeV measurements of inclusive jet production on the MMHT global PDF analysis at next-to-next-to-leading order (NNLO). This is made possible by the recent completion of the long-term project to calculate the NNLO corrections to the hard cross section. We find that a good description of the ATLAS data is not possible with the default treatment of experimental systematic errors, and propose a simplified solution that retains the dominant physical information of the data. We then investigate the fit quality and the impact on the gluon PDF central value and uncertainty when the ATLAS and CMS data are included in a MMHT fit. We consider both common choices for the factorization and renormalization scale, namely the inclusive jet transverse momentum, \(p_\perp \), and the leading jet \(p_\perp \), as well as the different jet radii for which the ATLAS and CMS data are made available. We find that the impact of these data on the gluon is relatively insensitive to these inputs, in particular the scale choice, while the inclusion of NNLO corrections tends to improve the data description somewhat and has a qualitatively similar though not identical impact on the gluon in comparison to NLO.

## 1 Introduction

Parton distribution functions (PDFs) are a fundamental input into hadron collider physics for both the theoretical and experimental particle physics communities; see [1] for a recent review. The dominant experimental input in determining PDFs comes from data on deep inelastic scattering (DIS) structure functions, with the final combined HERA Run I + II data set being the most prominent example [2]. By combining data from different processes (neutral and charged current) and targets (protons, deuterons, heavy nuclei) one can obtain much direct information about the quark content of the proton, while the quark evolution is sensitive to the gluon. Indeed, at small *x* this evolution is largely driven by the gluon. However, at moderate and high *x* the proton structure is dominated by the non-singlet valence quark distributions. Then, the evolution is largely decoupled from the gluon and the speed of evolution is determined mostly by the value of the strong coupling constant. There is, of course, some influence from the gluon in the high-*x* quark evolution; it provides an increase of quarks (and anti-quarks) with \(Q^2\), but this is difficult to decorrelate from the variation with \(\alpha _S\), and also decreases in significance at high *x*.

In order to obtain the most comprehensive constraints, various groups perform global fits to all available data for which precise theoretical calculations are available [3, 4, 5]. For example, Drell–Yan data in hadron–hadron collisions (both collider and fixed target) provide additional information on the anti-quarks and on the quark flavour decomposition, and improve the overall determination in comparison to DIS data alone [6, 7]. A direct constraint on the gluon, particularly at high *x*, can be obtained from high \(p_\perp \) jet production data at hadron colliders.^{1} However, until recently the data have been quite limited in precision, while the calculation of the hard cross section has only been available up to next-to-leading order (NLO), with some threshold resummation results also available [10, 11, 12, 13]. In our most recent global fit [3] we included data on inclusive jet production as a function of \(p_\perp \) in different rapidity bins from the D0 [14] and CDF [15] experiments at the Tevatron, and from early measurements at both the ATLAS [16, 17] and CMS [18] detectors, at 7 TeV. The Tevatron data were generally close to threshold, so that we could reliably include the NNLO approximations obtained from expanding out the threshold resummation of [10]. However, the LHC data extended much further from threshold, where it was clear these approximations break down [12]. Thus in that study we only included the Tevatron jet data in the NNLO fit. We note that at this time other groups used alternative approaches of including jet data at NNLO; see [4, 19].

Since this previous study there has been an increase in the range and precision of LHC jet data, together with the completion of a very large-scale and long-term project to calculate the NNLO corrections to the hard cross section [20, 21, 22]. In this paper we investigate the consequences of both of these new developments for PDF determination, concentrating on the final 7 TeV measurements from the ATLAS [23] and CMS [24] collaborations. We find that neither is quite as straightforward as might be hoped. First, the newer ATLAS jet data are impossible to fit well without some modifications. Second, the variation in NNLO corrections between scale choices and jet radii is potentially quite significant. We therefore examine both of these issues in detail, considering the impact of different choices of scale and jet radius on the fit at both NLO and NNLO, while suggesting a minimal manner in which to improve the fit quality in the case of the ATLAS data that retains the dominant physical constraints implied by the data. We then determine the consequences for both the central values and uncertainties of the gluon PDF obtained at both NLO and NNLO within the MMHT framework. We obtain the very encouraging, and not necessarily expected, result that in practice these are found to be very insensitive to any reasonable choices we make in either the treatment of the data or the theory input.^{2} We also find in general that the data description is somewhat improved by the inclusion of these NNLO corrections. As mentioned above, we only consider the 7 TeV data, for which the NNLO calculations are currently available, in this study. In fact, a range of precise jet data from ATLAS and CMS at 8 and 13 TeV [26, 27, 28, 29] is already available. This study will therefore guide the inclusion of these data in future MMHT fits at NNLO. For example, a similarly poor default description is also present in the ATLAS 8 TeV [26] and 13 TeV [27] data, and so it must be dealt with in any future fit.

## 2 Theoretical inputs for jet production

The NLO theoretical predictions for inclusive jet production are calculated using the NLOJet++ code [30], with the results stored in APPLgrid [31] format^{3} for fast use in the PDF fit. The theoretical approach used to calculate the NNLO corrections to this is described in [22] (see also [33]). The NNLO to NLO *K*-factors are provided by the authors for the ATLAS and CMS 7 TeV kinematics, for both jet radii presented by the collaborations and with the renormalization/factorization scale taken as either inclusive jet transverse momentum, \(p_\perp ^{\mathrm{jet}}\), or the maximum jet transverse momentum, \(p_\perp ^{\mathrm{max}}\). These are provided using the NNPDF3.0 set [5], although the *K*-factors are expected to be largely insensitive to the PDF choice. This therefore allows us to perform a detailed analysis of the impact of these LHC jet data for different jet radii and scale choices.

*K*-factors, while the central values are at most 10% from unity. These errors are therefore non-negligible and must be included in the fit. One possibility is to simply include these as an additional bin-by-bin source of uncorrelated uncertainty. However, in general we will expect the *K*-factors to be smoothly varying functions of the kinematic variables, and therefore this approach is unnecessarily conservative. We instead perform a simple four-parameter fit:

Here *i* labels the specific rapidity region, data set, choice of jet radius and scale. In general this enables a good fit, with \(\chi ^2/\mathrm{dof}\sim 1\), and the standard \(\Delta \chi ^2=1\) criterion can be applied to determine the uncertainties associated with the fit. This results in four sources of correlated systematic uncertainty for each rapidity bin, which we then include in the PDF fit. We have explicitly checked that the results are stable with respect to adding in further polynomial terms, and thus we can safely truncate at this order. The best fit curves are shown in Fig. 1 for four representative cases, with the uncertainty band due to the sum in quadrature of the errors in each bin, i.e. omitting correlations, shown for illustration. We can see that the biggest difference in the *K*-factors is due to the choice of scale, which gives quite different trends depending on whether \(p_\perp ^{\mathrm{max}}\) or \(p_\perp ^{\mathrm{jet}}\) is taken. The choice of jet radius also shows some impact, while the ATLAS and CMS results with the same scale choice and comparable (either low or high) jet radii, which are not shown here, show a qualitatively similar trend.
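The smoothing procedure described in this section can be illustrated with a short sketch. The functional form used below, a cubic polynomial in \(\log (p_\perp /p_{\perp ,\mathrm{ref}})\), is an assumption made purely for illustration, as are the function names and the reference scale `pt_ref`; only the number of parameters (four) and the \(\Delta \chi ^2=1\) criterion are taken from the text. The parameter covariance is diagonalised to give four correlated shifts of the *K*-factor per rapidity bin.

```python
import numpy as np

def design_matrix(pt, n_par, pt_ref):
    """Columns 1, x, x^2, ... with x = log(pt / pt_ref) (assumed basis)."""
    x = np.log(np.asarray(pt, dtype=float) / pt_ref)
    return np.vander(x, n_par, increasing=True)

def fit_k_factor(pt, k, k_err, n_par=4, pt_ref=500.0):
    """Weighted least-squares fit of K(pt) in one rapidity bin.

    Returns the best-fit coefficients, their covariance matrix
    (corresponding to Delta chi2 = 1) and chi2 per degree of freedom.
    """
    weights = 1.0 / np.asarray(k_err, dtype=float)
    a_mat = design_matrix(pt, n_par, pt_ref) * weights[:, None]
    b_vec = np.asarray(k, dtype=float) * weights
    coeffs, *_ = np.linalg.lstsq(a_mat, b_vec, rcond=None)
    cov = np.linalg.inv(a_mat.T @ a_mat)
    chi2 = float(np.sum((a_mat @ coeffs - b_vec) ** 2))
    return coeffs, cov, chi2 / (len(b_vec) - n_par)

def correlated_systematics(pt, cov, pt_ref=500.0):
    """Convert the parameter covariance into n_par correlated K-factor shifts.

    Column j is the change in K(pt) for a one-sigma move along the j-th
    eigenvector of the covariance matrix.
    """
    eigval, eigvec = np.linalg.eigh(cov)
    eigval = np.clip(eigval, 0.0, None)  # guard against rounding noise
    design = design_matrix(pt, cov.shape[0], pt_ref)
    return design @ (eigvec * np.sqrt(eigval))
```

Summing the outer products of the returned columns reproduces the full covariance of the fitted *K*-factor across bins, so treating them as correlated systematic sources retains the smoothness information that a bin-by-bin uncorrelated error would discard.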

## 3 Treatment of ATLAS correlated systematic errors

Before considering the general impact of the jet data on the NNLO fit, some care is needed when dealing with the ATLAS data [23]. The discussion follows closely that given in [34], but for completeness we present a summary below. For illustration, we will work at NLO only, using as a baseline PDF the MMHT14 set [3] including the HERA I + II combined data [2], that is, as presented in [35].

*x* and \(Q^2\) regions little improvement is possible (or observed) by refitting to these data.

*i*th data point, \(T_i\) is the theory prediction and \(\sigma ^{\mathrm{uncorr}}_i\) (\(\sigma ^{\mathrm{corr}}_{k,i}\)) are the uncorrelated (correlated) errors. We in particular evaluate the shifts for each of the first four rapidity bins (from 0 to 2.0 in steps of 0.5) individually; including the last two rapidity bins, where the data tend to be less precise, does not affect the conclusions that follow. Any tensions between the different bins may then show up through significantly different \(r_k\) values being preferred in the different rapidity bins, in order to achieve good individual fits. In Fig. 3 we show the average squared sum of the shift differences \((r_i-r_j)^2\) for the four bins. It is clear that for a small subset of the shifts the size of this difference is significantly larger than zero, indicating a large degree of tension.
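As an illustration of how such shifts can be obtained, the sketch below profiles the nuisance parameters \(r_k\) analytically. It assumes a \(\chi ^2\) of the commonly used form \(\sum _i \big [(D_i + \sum _k r_k \sigma ^{\mathrm{corr}}_{k,i} - T_i)/\sigma ^{\mathrm{uncorr}}_i\big ]^2 + \sum _k r_k^2\); this form, its sign convention and the function name are assumptions for illustration rather than a statement of the exact definition used in the analysis.

```python
import numpy as np

def profile_shifts(data, theory, sigma_uncorr, sigma_corr):
    """Minimise the assumed chi2 with respect to the shifts r_k.

    sigma_corr has shape (n_sys, n_pts): one row per correlated source.
    Returns the best-fit shifts r and the chi2 at the minimum.
    """
    delta = (data - theory) / sigma_uncorr        # normalised residuals
    beta = sigma_corr / sigma_uncorr              # normalised systematics
    # Setting d(chi2)/d(r_j) = 0 gives (beta beta^T + 1) r = -beta delta
    a_mat = beta @ beta.T + np.eye(beta.shape[0])
    r = np.linalg.solve(a_mat, -(beta @ delta))
    shifted = delta + r @ beta
    chi2 = float(shifted @ shifted + r @ r)
    return r, chi2
```

Running this separately on the points of each rapidity bin gives one set of preferred \(r_k\) per bin; a source whose preferred value differs strongly from bin to bin is the kind of tension discussed above.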

The three shifts jes21, jes45 and jes62 as defined in [36], which correspond [37] to the multi-jet balance asymmetry, an in-situ statistical uncertainty and the jet energy scale for close-by jets, respectively, show particularly large differences. However, for the in-situ statistical uncertainty the correlations are particularly well determined [37], and therefore we omit this from the investigation below. In fact, as shown in [34], decorrelating this source of uncertainty has a less significant impact on the quality of the data description in comparison to the other two sources.

We therefore investigate the impact of decorrelating the systematic uncertainties jes21 and jes62 alone between rapidity bins. We compare to the ATLAS data with \(R=0.4\) and \(p_\perp ^{\mathrm{jet}}\) as the scale choice. The result for the individual uncertainty sources, as well as the combination, is shown in Table 1, and is found to be dramatic. Simply decorrelating jes21, for example, leads to a reduction of about 180 points in \(\chi ^2\), giving almost a factor of 2 decrease in the \(\chi ^2/N_{\mathrm{pts.}}\) from 2.85 to 1.58. Decorrelating jes62 in addition gives a \(\chi ^2/N_{\mathrm{pts.}}\) of 1.27. The same data/theory comparisons as in Fig. 2, but including this decorrelation of jes21 and jes62, are shown in Fig. 4 and are visibly improved, with the additional freedom allowing the data/theory to shift in the different rapidity bins and achieve a good overall description. While the above analysis only considers the experimental sources of correlated uncertainty, we have also checked that decorrelating the quoted uncertainty associated with the non-perturbative corrections from [23] that we apply leads to some \(\sim 40\) point improvement in the \(\chi ^2\), which is significantly smaller than for those sources discussed above. In addition, we find that even omitting these corrections entirely has little impact on the fit quality; in other words, these appear to be correlated sufficiently with other sources of experimental systematics that their omission does not affect the comparison significantly.
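In practical terms, decorrelating a source between rapidity bins amounts to replacing one correlated uncertainty by several, each acting on a single bin only. The following is a minimal sketch of that bookkeeping, assuming the systematics are stored as an array with one row per source and that each data point carries an integer rapidity-bin label; the array layout and the function name are hypothetical.

```python
import numpy as np

def decorrelate(sigma_corr, source, bin_index):
    """Split one correlated source into independent per-bin sources.

    sigma_corr: array of shape (n_sys, n_pts), one row per source.
    source:     row index of the source to decorrelate (e.g. jes21).
    bin_index:  integer rapidity-bin label for each data point.
    Returns an array with (n_sys - 1 + n_bins) rows.
    """
    row = sigma_corr[source]
    bins = np.unique(bin_index)
    # One new row per bin: the original values inside the bin, zero elsewhere
    split = np.array([np.where(bin_index == b, row, 0.0) for b in bins])
    rest = np.delete(sigma_corr, source, axis=0)
    return np.vstack([rest, split])
```

The total uncertainty on each point is unchanged by this operation; only the assumption that a single shift applies coherently across all rapidity bins is relaxed, with each new row acquiring its own nuisance parameter in the fit.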

Table 1. \(\chi ^2\) per number of data points (\(N_{\mathrm{pts}}=140\)) for the fit to the ATLAS jet data [23], with the default systematic error treatment (‘full’) and with certain errors, defined in the text, decorrelated between jet rapidity bins

| | Full | 21 | 62 | 21, 62 |
|---|---|---|---|---|
| \(\chi ^2/N_{\mathrm{pts.}}\) | 2.85 | 1.58 | 2.36 | 1.27 |

Table 2. The \(\chi ^2\) for the ATLAS (\(N_{\mathrm{pts}}=140\)) and CMS 7 TeV jet data (\(N_{\mathrm{pts}}=158\)) at NNLO. The quality of the description using the baseline set is shown, while the result of re-fitting to the single jet data set is given in brackets. Results with the different treatments of the ATLAS systematic uncertainties, described in the text, are also shown

| | ATLAS | ATLAS, \(\sigma _{pd}\) | ATLAS, \(\sigma _{fd}\) | | CMS |
|---|---|---|---|---|---|
| \(R=0.4\) | 350.8 (333.7) | 183.1 (170.7) | 128.4 (122.2) | \(R=0.5\) | 191.7 (163.4) |
| \(R=0.6\) | 304.0 (264.0) | 178.8 (148.9) | 128.9 (115.7) | \(R=0.7\) | 200.1 (175.2) |

Table 3. The \(\chi ^2\) for the combined fit to the ATLAS (\(N_{\mathrm{pts}}=140\)) and CMS (\(N_{\mathrm{pts}}=158\)) 7 TeV jet data. The values for the ATLAS and CMS contributions are given, for different choices of jet radius and scale, at NLO and NNLO

| | \(R_{\mathrm{low}}\), \(p_\perp ^{\mathrm{jet}}\) | \(R_{\mathrm{low}}\), \(p_\perp ^{\mathrm{max}}\) | \(R_{\mathrm{high}}\), \(p_\perp ^{\mathrm{jet}}\) | \(R_{\mathrm{high}}\), \(p_\perp ^{\mathrm{max}}\) |
|---|---|---|---|---|
| ATLAS (NLO) | 213.8 | 190.5 | 171.5 | 161.2 |
| ATLAS (NNLO) | 172.3 | 199.3 | 149.8 | 152.5 |
| CMS (NLO) | 190.3 | 185.3 | 195.6 | 193.3 |
| CMS (NNLO) | 177.8 | 187.0 | 182.3 | 185.4 |

## 4 Fit quality at NNLO

### 4.1 Individual data sets at NNLO

In Table 2 we show the quality, \(\chi ^2\), of the prediction and fit to the ATLAS and CMS jet data. For the predictions, we take as a baseline set the fits to the same data set (and using the same theoretical parameters) as MMHT14 [3], but including the final HERA I+II combined data set [2], and excluding all Tevatron jet data. In the latter case the NNLO predictions are not currently publicly available and so these are omitted for consistency. Unless otherwise stated we take \(p_\perp ^{\mathrm{jet}}\) as the factorization/renormalization scale. The NLO (NNLO) results are all made with a fixed value of \(\alpha _s\) of 0.120 (0.118), as taken in [3], although the results are insensitive to this precise choice. We first consider the impact of fitting the ATLAS and CMS jet data individually. We show the ‘ATLAS’ result with the default treatment of systematic errors, with our model of partial error decorrelation (\(\sigma _{pd}\)), and with a full decorrelation of all systematic errors across jet rapidity bins (\(\sigma _{fd}\)). While as discussed above the latter approach is clearly overly conservative, we note that e.g. only fitting the first jet rapidity bin as in [5] implicitly assumes such a decorrelation.

As in the NLO case above, the description and fit of the ATLAS data with the default error treatment is poor, with \(\chi ^2/N_\mathrm{pts}\sim 2\) or higher, but this improves to be of order unity when taking our model of partial error decorrelation. If the systematic errors are fully decorrelated between rapidity bins, some further improvement is achieved, giving a value that is somewhat below unity. However, it is clear that the most dramatic change comes from the decorrelation of the first two systematic errors. We also show the comparison for different choices of jet radius, with \(R=0.4\) (0.5) and \(R=0.6\) (0.7) for the ATLAS (CMS) data, which in the following we will label as ‘low’ and ‘high’, respectively. Interestingly, with the higher choice of *R* the quality of the description of the ATLAS data is better, while the change when refitting is significantly increased; for the partial error decorrelation the \(\chi ^2\) decreases by \(\sim 30\) points, giving a final \(\chi ^2/N_{\mathrm{pts}}\) very close to unity. On the other hand, for the full error decorrelation, little difference is seen, which is perhaps unsurprising given the over-estimate in the freedom of the data uncertainties. We also show the \(\chi ^2\) for the prediction and fit to the CMS jet data. Here the description is fair, and a \(\chi ^2/ N_{\mathrm{pts}}\sim 1\) is achieved for both radii after refitting, with a reduction in the \(\chi ^2\) by \(\sim 30\) points. The fit quality is a little better for the lower choice of jet radius, although the difference is relatively small.
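The per-point values quoted in this discussion follow directly from Table 2. The short sketch below reproduces the arithmetic for the partial decorrelation and CMS columns; the dictionary layout and function name are illustrative choices, while the numbers are copied from the table.

```python
N_PTS = {"ATLAS": 140, "CMS": 158}

# (chi2 of baseline prediction, chi2 after refit), copied from Table 2
TABLE_2 = {
    ("ATLAS", "pd", 0.4): (183.1, 170.7),
    ("ATLAS", "pd", 0.6): (178.8, 148.9),
    ("CMS", "default", 0.5): (191.7, 163.4),
    ("CMS", "default", 0.7): (200.1, 175.2),
}

def chi2_per_point(experiment, chi2):
    """chi2 divided by the number of data points of the experiment."""
    return chi2 / N_PTS[experiment]

for (experiment, treatment, radius), (pred, refit) in TABLE_2.items():
    print(
        f"{experiment} ({treatment}), R={radius}: "
        f"prediction {chi2_per_point(experiment, pred):.2f}, "
        f"refit {chi2_per_point(experiment, refit):.2f}, "
        f"change {pred - refit:.1f}"
    )
```

For the ATLAS data with \(R=0.6\) and partial decorrelation this gives a drop of 29.9 points on refitting and a final \(\chi ^2/N_{\mathrm{pts}}\) of about 1.06, in line with the statements above.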

### 4.2 NNLO vs. NLO fit quality for combined data

In Table 3 we show the results of the combined fit to the ATLAS and CMS data, taking the partial error decorrelation model for the ATLAS data. As above, we show results for low and high jet radii, while we also consider the impact of the jet scale choice. The \(\chi ^2\) values for the ATLAS and CMS data sets are given.

Table 4. The \(\chi ^2\) for the combined NNLO fit to the ATLAS and CMS 7 TeV jet data, excluding and including the calculated NNLO *K*-factors, and excluding the errors associated with the polynomial fit to the *K*-factors. The \(p_\perp ^{\mathrm{jet}}\) factorization/renormalization scale is taken

| | NLO theory | NNLO | NNLO (no errors) |
|---|---|---|---|
| ATLAS, \(R_{\mathrm{low}}\) | 215.3 | 172.3 | 179.1 |
| ATLAS, \(R_{\mathrm{high}}\) | 159.2 | 149.8 | 153.5 |
| CMS, \(R_{\mathrm{low}}\) | 194.2 | 177.8 | 182.8 |
| CMS, \(R_{\mathrm{high}}\) | 198.5 | 182.3 | 188.8 |

Second, at NNLO some improvement in the fit quality is apparent for most choices of jet radii and scale; such an improvement is also visible upon comparing Tables 1 and 2 for the individual fits. This is more significant for the ATLAS data, where an improvement of up to 40 points in \(\chi ^2\) can be achieved, while for the CMS data the improvement is at most 10 points. On the other hand, for the low jet radius, and \(p_\perp ^{\mathrm{max}}\) choice, some slight deterioration in the fit quality is observed. We note that in [34] a deterioration in the fit quality to the ATLAS data when going to NNLO was reported; however, here it was precisely these choices of scale and jet radius that were taken. Following the more detailed study in this work, we can see that this effect is not in general present.

Third, we can see that for the joint fit a clear preference for the higher choice of jet radius is shown at both orders in the ATLAS data, while for the CMS any difference is relatively marginal. Moreover, while at NLO some preference (in particular in the ATLAS data) for the \(p_\perp ^{\mathrm{max}}\) scale choice is shown, at NNLO this trend is reversed for the low *R* choice, while for high *R* essentially no preference is indicated by the fit, with the descriptions of the ATLAS and CMS data being excellent for both scale choices. Thus to achieve the best NNLO fits to these data sets, a higher value of *R* is preferred, while the result is less sensitive to the choice of scale. As we will show in the following section, this relative insensitivity is also observed in the extracted PDFs, in particular for the gluon.

Finally, it is important to clarify the role played by the NNLO jet production theory, in contrast to the NNLO PDFs, in leading to the improvement in the fit quality at NNLO. In Table 4 we show the same \(\chi ^2\) values as before, resulting from the NNLO fit to the combined ATLAS and CMS data, but in addition excluding the NNLO *K*-factors, i.e. applying NLO theory only to the jet data. We can see that the improvement due to the NNLO corrections in the fit is still present at roughly the same level as before, with some variation in the precise amount. We also show the effect of excluding the correlated errors associated with the *K*-factor fit described in Sect. 2. This leads to some small increase in the \(\chi ^2\), as it must, but the trend is unchanged.
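To make the comparison in Table 4 explicit, the following sketch computes, for each row, the change in \(\chi ^2\) from including the NNLO *K*-factors and the change from including the *K*-factor fit errors. The numbers are copied from Table 4; the names used are illustrative only.

```python
# (NLO theory, NNLO, NNLO without K-factor fit errors), copied from Table 4
TABLE_4 = {
    "ATLAS, R_low": (215.3, 172.3, 179.1),
    "ATLAS, R_high": (159.2, 149.8, 153.5),
    "CMS, R_low": (194.2, 177.8, 182.8),
    "CMS, R_high": (198.5, 182.3, 188.8),
}

def k_factor_effects(row):
    """Return (gain from NNLO K-factors, gain from K-factor fit errors)."""
    nlo_theory, nnlo, nnlo_no_errors = row
    return round(nlo_theory - nnlo, 1), round(nnlo_no_errors - nnlo, 1)

for label, row in TABLE_4.items():
    gain_theory, gain_errors = k_factor_effects(row)
    print(f"{label}: K-factors {gain_theory:+.1f}, fit errors {gain_errors:+.1f}")
```

In every row the first number is positive, i.e. the NNLO jet theory itself lowers the \(\chi ^2\) within an otherwise NNLO fit, while the *K*-factor fit errors account for a further reduction of a few points.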

## 5 Impact of LHC jet data on PDFs

### 5.1 Central values

We can see that for both jet radii, despite leading to significantly different fit qualities, the partial decorrelation and default error treatments in fact result in quite similar fits for the gluon PDF, with some softening observed at high *x*. On the other hand, the full decorrelation of systematic uncertainties leads to a gluon that is qualitatively different, being much less soft at high *x*, although still consistent within PDF uncertainties. This is perhaps not surprising, as the systematic shifts we determine by profiling with respect to the various correlated uncertainties in (2) have a physical interpretation, giving us the best fit values of the various experimental parameters and a corresponding best fit measurement that is shifted with respect to the default. By treating these sources of uncertainty as uncorrelated across rapidity bins, this connection is largely lost, and in effect an imperfect measurement that is systematically different may be fit. The central value of the extracted gluon may then vary quite significantly. This effect is indeed observed in Fig. 5. Given these results, in what follows we will simply apply our model of partial error decorrelation, although we note that in all cases the results are very similar when taking the default treatment.

In Fig. 6 we show the result of the NNLO fit, including the CMS jet data only, for both jet radii. Here, the impact on the gluon is relatively flat out to quite high *x*, where some hardening is observed, albeit within the large PDF uncertainties in this region. As with the ATLAS data, the larger choice of jet radius leads to some softening in the gluon in comparison to the lower choice.

In Fig. 7 we now consider the effect of the combined fit to the ATLAS and CMS jet data on the gluon. As mentioned above, we take the partially decorrelated treatment of the ATLAS jet data in what follows. We show results for low and high jet radii, i.e. with \(R=0.4\) (0.5) and \(R=0.6\) (0.7) for the ATLAS (CMS) data, respectively. We also show the effect of taking the \(p_\perp ^{\mathrm{max}}\) scale choice in comparison to \(p_\perp ^{\mathrm{jet}}\). The result at NLO (NNLO) is shown in the left (right) panel. The impact of the scale choice on the gluon is quite small, of the same order as or less than that due to the choice of jet radius, although here the difference for the combined fit is also not dramatic. This is not necessarily to be expected, as the difference between the scale choices in the underlying theory prediction is not negligible. In addition, while the qualitative trend in the NLO and NNLO fits is similar, the latter leads to a somewhat softer gluon, which even lies somewhat outside the baseline PDF uncertainty band for the higher jet radius. We can also see that the gluon that results from the combined fit lies closer to the result from the ATLAS than the CMS fit, although these are all consistent within PDF uncertainties. This is consistent with the somewhat larger deterioration observed in the fit quality for the CMS-only case in comparison to the combined fit.

Thus, to summarise the effect of the LHC data and the accompanying theory improvements, in all cases we observe some relative stability in the overall trend of the extracted gluon, with the smallest differences being due to the choice of scale, followed by the choice of jet radius and finally the NLO vs. NNLO difference being largest. We study this result in more detail below.

To investigate the effect of scale choice further, in Figs. 8 and 9 we show the data/theory for both choices of scale, with lower and higher *R* choice, respectively. We show results for both the ATLAS and the CMS data, in the central rapidity bin (although similar results are seen at other rapidities). In the left hand plots we show the results prior to including the systematic shift in the correlated errors. We concentrate our discussion below on the lower *R* choice shown in Fig. 8, as here the difference between the two scale choices is more pronounced, but similar conclusions hold for the higher choice, shown in Fig. 9. An approximately \(10\%\) difference is observable at lower \(p_\perp \), with the \(p_\perp ^{\mathrm{max}}\) choice leading to the larger result, consistent with the findings in [33]. We can see that in both cases the description of the data is poor, highlighting the importance of the systematic experimental uncertainties. However, once the data are allowed to shift by these errors this difference largely disappears, and a good description of the data is achieved in all cases. The shift is somewhat larger for the \(p_\perp ^{\mathrm{max}}\) case, with a \(\sim 5\) (11) point increase in the \(\chi ^2\) due to the shift penalty found for the ATLAS (CMS) data, while for the higher *R* choice the overall shift penalty is only marginally increased, by 2 points. These findings are consistent with the trends found in Table 3. From Fig. 7 we can see that these results translate into a relative, although not complete, stability in the predicted PDFs. With a further reduction in the size of the systematic experimental uncertainties the difference may on the other hand become more pronounced, but no significant effect is observed with the 7 TeV data sets.

*K*-factors in the fit. In the left (right) panel we show the result with the low (high) choice of jet radius. This therefore shows the impact of the new NNLO theory calculation on the gluon. We can see that in both cases the effect is reasonably small, but not negligible, leading to some additional softening in the gluon at high *x*. Indeed, for the high jet radius choice, the inclusion of the NNLO theory leads to a central value at high *x* which lies somewhat outside the uncertainty band of the baseline fit. The effect of using \(p_\perp ^{\mathrm{max}}\) instead as the scale choice is similar.

*x* is observed, consistent with its impact in the MSTW08 fit [38]. This is in contrast to the LHC data, which we have seen prefer a softer gluon at higher *x*, although up to \(x \gtrsim 0.3\) these are consistent within PDF errors. Nonetheless some tension is observed, and indeed when including both the Tevatron and LHC data into the fit, the description deteriorates by about 10 and 8 points in comparison to the individual fits for the LHC and Tevatron, respectively. The resultant gluon is somewhat harder at high *x* than the LHC only fit, but still softer than the baseline. For clarity we do not include the PDF uncertainties in this case; these will be shown below. It will be interesting to see how this situation changes when the full NNLO corrections are included for the Tevatron predictions.

### 5.2 PDF uncertainties

*R* choice, for low and intermediate values of *x* the error reduction relative to the baseline ranges from 10–20%, but for \(x\sim 0.05-0.2\) there is little reduction and in some regions even a slight increase in the error. At high *x* there is again a reduction in the uncertainty, although as *x* approaches 1 and the jet data place little or no constraint, the quantitative result cannot be taken completely literally, as this will depend on the precise choice of PDF parameterisation. For the lower *R* choice the reduction in the PDF uncertainty is less significant, and the *x* region where this increases relative to the baseline is wider. In Fig. 13 (right) we show the results for the higher jet radius choice and for different treatments of the ATLAS systematic errors. We can see that the partial decorrelation leads to a similar, although in some places slightly less constraining, impact on the uncertainties across the entire *x* region in comparison to the default treatment, consistent with the impact on the central values shown before. On the other hand, for fully decorrelated uncertainties the impact at high *x* in particular is much less constraining, although in the \(x\sim 0.1\) region the uncertainties are in fact somewhat smaller.

In Fig. 14 we show the impact of fitting the ATLAS and CMS data individually on the PDF uncertainties. We can see that, consistent with the results of the previous section, the impact of the ATLAS data is generally larger, in particular at higher *x*, where the CMS data in fact lead to a somewhat larger uncertainty in comparison to the baseline. Including both the ATLAS and CMS data generally leads to some decrease in the uncertainties in comparison to the individual fits. In Fig. 15 we show the impact of the fits to the LHC and Tevatron data individually, as well as to the combination, at NLO and NNLO. We can see that with the exception of the intermediate \(x\sim 0.05-0.1\) region at NNLO, the LHC data have a greater impact in reducing the PDF uncertainties. For the combined fit, the relative uncertainties reduce by \(\sim 20\%\) across the entire *x* region at NNLO, while at NLO the reduction in uncertainty is somewhat milder in comparison to the LHC-only fit in the \(x\sim 0.01-0.1\) region, while in the highest *x* regions the impact is somewhat larger. We can also see that, with the exception of this very high *x* region, which will in any case be sensitive to parameterisation effects, the impact of the NNLO fit on the gluon is more significant in comparison to the NLO for all data combinations. Again, it will be interesting to see how this situation changes at NNLO when the full NNLO corrections are included in the Tevatron predictions.
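For readers wishing to reproduce a relative error reduction of the kind quoted above, a minimal sketch is given below. It assumes the PDF uncertainty is evaluated from Hessian eigenvector sets with the common symmetric formula \(\Delta F = \tfrac{1}{2}\sqrt{\sum _k (F_k^+ - F_k^-)^2}\), and that the gluon is tabulated with the central set first followed by alternating plus/minus members. Both the formula choice and the array layout are assumptions for illustration; the text above does not specify the prescription used for the figures.

```python
import numpy as np

def hessian_uncertainty(members):
    """Symmetric Hessian uncertainty at each x value.

    members: array of shape (2 * n_eig + 1, n_x); row 0 is the central
    set, followed by (plus, minus) pairs for each eigenvector direction.
    """
    plus = members[1::2]
    minus = members[2::2]
    return 0.5 * np.sqrt(np.sum((plus - minus) ** 2, axis=0))

def relative_reduction(baseline, new_fit):
    """Fractional reduction of the relative uncertainty w.r.t. the baseline."""
    rel_base = hessian_uncertainty(baseline) / np.abs(baseline[0])
    rel_new = hessian_uncertainty(new_fit) / np.abs(new_fit[0])
    return 1.0 - rel_new / rel_base
```

A returned value of 0.2 at some *x* corresponds to a 20% relative reduction in the uncertainty, while negative values flag regions where the uncertainty has increased relative to the baseline.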

## 6 Conclusions and outlook

Inclusive jet production data has played a key role in constraining the partonic structure of the proton, and in particular the gluon at higher *x*, in global PDF fits. The availability of high precision jet data from the LHC combined with the recent release of the NNLO corrections to the hard cross section therefore provides an invaluable tool for high precision PDF constraints.

In this paper, we have presented a detailed study of the impact of LHC jet data on a PDF fit within the MMHT global fitting framework, at NNLO. We have observed that to reliably perform such a study, certain issues require a careful treatment. Namely we have had to address the choice of jet scale and radius, and the fact that a satisfactory description of the systematics dominated ATLAS data cannot by default be achieved across the full kinematic region. After analysing the structure of the systematic shifts induced in describing the ATLAS data, we have determined a straightforward and minimal method to improve the fit quality; by decorrelating two sources of systematic uncertainty in rapidity, a greatly improved description is achieved. Crucially, despite this change in the fit quality, we have shown that this only has a relatively small impact on the determination of the gluon itself in comparison to the default treatment. The result of our minimal approach should then be in line with a more complete consideration of different decorrelation scenarios permitted by experimental considerations. This suggests that despite this question of the default fit quality, these data can still be reliably included in a PDF fit. On the other hand, we have found that decorrelating all sources of uncertainty in rapidity, in essence the approach that is assumed if only one rapidity bin is fitted, leads to larger shifts. Some caution in applying such a procedure therefore appears to be warranted.

We have then presented the fit quality at NLO and NNLO when the ATLAS and CMS jet data are included in a MMHT fit, for both the inclusive and leading jet \(p_\perp \) scale choices, and different values of the jet radius *R*. We find that some improvement is in general achieved when going to NNLO, with the exception of the \(p_\perp ^{\mathrm{max}}\) and lower *R* choice, where there is a slight deterioration. The impact on the gluon PDF is qualitatively similar between orders. Although the theory predictions are quite different at lower jet \(p_\perp \) when considering the two scale choices, we find that the fit quality including a proper treatment of the experimental systematics is in fact similar. Moreover, the impact on the gluon itself is very stable between the choices. This suggests that at least for the data sets under consideration in this paper, the effect of the choice of jet scale on PDF determination may not be as significant at NNLO as has sometimes previously been assumed.

In terms of the jet radius, the ATLAS data in particular show some preference for the larger (\(R=0.6\)) choice, although again the impact on the gluon is relatively stable in comparison to the smaller choice. In all cases the jet data are found to consistently prefer a somewhat softer gluon at high *x* and a harder gluon in the intermediate *x* region, with in general a \(\sim \) 10–20% *relative* reduction in the PDF uncertainty.

Thus, in this paper we have shown that LHC jet data may be reliably included in global PDF fits at NNLO, while addressing in a minimal way the issue of achieving a good description of the high precision, systematics-dominated ATLAS data across the whole kinematic region. We have only considered the 7 TeV data, for which the NNLO calculations are available. In future global fits, we will adopt our partially decorrelated treatment of the experimental systematic errors for these data sets. We also intend to confirm whether the above conclusions hold for the 8 and 13 TeV jet data from the LHC.^{4} Moreover, the issue with the description of the ATLAS data may become increasingly relevant in the high precision LHC era, and may warrant a more detailed future study of both the experimental and theoretical sources of uncertainty.

## Footnotes

- 1.
- 2. As such, a detailed investigation of the scale dependence as described in [25] may be avoided.
- 3. In the case of the ATLAS data with the \(p_\perp ^{\mathrm{max}}\) scale we take this directly from the APPLgrid website [32].
- 4. We note that the Run-II data sets are by default made available for a single common value of jet radius, \(R=0.4\), preventing any comparison of different jet radii. Interestingly, we have seen here that a better description of the ATLAS data in particular may be achieved for a higher choice of jet radius, \(R=0.6\).

## Notes

### Acknowledgements

We are grateful to Pavel Starovoitov for useful discussions and invaluable help with the NLOJet++ interface to APPLgrid. We are also grateful to Ulla Blumenschein, Amanda Cooper-Sarkar, James Currie, Nigel Glover, Claire Gwenlan, Bogdan Malaescu, and Matthias Schott for useful discussions. LHL thanks the Science and Technology Facilities Council (STFC) for support via Grant awards ST/L000377/1 and ST/P004547/1. RST thanks the Science and Technology Facilities Council (STFC) for support via Grant awards ST/L000377/1 and ST/P000274/1.

## References

- 1. J. Gao, L. Harland-Lang, J. Rojo (2017). arXiv:1709.04922
- 2. ZEUS, H1, H. Abramowicz et al., Eur. Phys. J. C **75**, 580 (2015). arXiv:1506.06042
- 3. L.A. Harland-Lang, A.D. Martin, P. Motylinski, R.S. Thorne, Eur. Phys. J. C **75**, 204 (2015). arXiv:1412.3989
- 4.
- 5. NNPDF, R.D. Ball et al., Eur. Phys. J. C **77**, 663 (2017). arXiv:1706.00428
- 6. A. Accardi, L.T. Brady, W. Melnitchouk, J.F. Owens, N. Sato, Phys. Rev. D **93**, 114017 (2016). arXiv:1602.03154
- 7. S. Alekhin, J. Blümlein, S. Moch, R. Placakyte, Phys. Rev. D **96**, 014011 (2017). arXiv:1701.05838
- 8. M. Czakon, N.P. Hartland, A. Mitov, E.R. Nocera, J. Rojo, JHEP **04**, 044 (2017). arXiv:1611.08609
- 9. R. Boughezal, A. Guffanti, F. Petriello, M. Ubiali, JHEP **07**, 130 (2017). arXiv:1705.00343
- 10. N. Kidonakis, J.F. Owens, Phys. Rev. D **63**, 054019 (2001). arXiv:hep-ph/0007268
- 11.
- 12. D. de Florian, P. Hinderer, A. Mukherjee, F. Ringer, W. Vogelsang, Phys. Rev. Lett. **112**, 082001 (2014). arXiv:1310.7192
- 13. X. Liu, S.-O. Moch, F. Ringer (2017). arXiv:1708.04641
- 14. D0, V.M. Abazov et al., Phys. Rev. D **85**, 052006 (2012). arXiv:1110.3771
- 15. CDF, A. Abulencia et al., Phys. Rev. D **75**, 092006 (2007). arXiv:hep-ex/0701051 (Erratum: Phys. Rev. D **75**, 119901 (2007))
- 16. ATLAS, G. Aad et al., Phys. Rev. D **86**, 014022 (2012). arXiv:1112.6297
- 17. ATLAS, G. Aad et al., Eur. Phys. J. C **73**, 2509 (2013). arXiv:1304.4739
- 18. CMS, S. Chatrchyan et al., Phys. Rev. D **87**, 112002 (2013). arXiv:1212.6660 (Erratum: Phys. Rev. D **87**, 119902 (2013))
- 19. NNPDF, R.D. Ball et al., JHEP **04**, 040 (2015). arXiv:1410.8849
- 20. A. Gehrmann-De Ridder, T. Gehrmann, E.W.N. Glover, J. Pires, Phys. Rev. Lett. **110**, 162003 (2013). arXiv:1301.7310
- 21. J. Currie, A. Gehrmann-De Ridder, E.W.N. Glover, J. Pires, JHEP **01**, 110 (2014). arXiv:1310.3993
- 22. J. Currie, E.W.N. Glover, J. Pires, Phys. Rev. Lett. **118**, 072002 (2017). arXiv:1611.01460
- 23.
- 24. CMS, S. Chatrchyan et al., Phys. Rev. D **90**, 072006 (2014). arXiv:1406.0324
- 25. A.D. Martin, M.G. Ryskin, Eur. Phys. J. C **77**, 218 (2017). arXiv:1702.01663
- 26. ATLAS, M. Aaboud et al., JHEP **09**, 020 (2017). arXiv:1706.03192
- 27. ATLAS, M. Aaboud et al. (2017). arXiv:1711.02692
- 28. CMS, V. Khachatryan et al., JHEP **03**, 156 (2017). arXiv:1609.05331
- 29. CMS, V. Khachatryan et al., Eur. Phys. J. C **76**, 451 (2016). arXiv:1605.04436
- 30.
- 31.
- 32. https://applgrid.hepforge.org. Accessed 21 Mar 2018
- 33.
- 34. L.A. Harland-Lang, R. Nathvani, R.S. Thorne, A.D. Martin, Acta Phys. Pol. B **48**, 1011 (2017). arXiv:1704.00162
- 35. L.A. Harland-Lang, A.D. Martin, P. Motylinski, R.S. Thorne, Eur. Phys. J. C **76**, 186 (2016). arXiv:1601.03413
- 36. ATLAS, G. Aad et al., Eur. Phys. J. C **75**, 17 (2015). arXiv:1406.0076
- 37. U. Blumenschein, C. Gwenlan, B. Malaescu, M. Schott (private communication)
- 38. A.D. Martin, W.J. Stirling, R.S. Thorne, G. Watt, Eur. Phys. J. C **63**, 189 (2009). arXiv:0901.0002

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funded by SCOAP^{3}