Infrared sensitivity of single jet inclusive production at hadron colliders

Jet production at hadron colliders is a benchmark process to probe the dynamics of the strong interaction and the structure of the colliding hadrons. One of the most basic jet production observables is the single jet inclusive cross section, which is obtained by summing all jets that are observed in an event. Our recent computation of next-to-next-to-leading order (NNLO) QCD contributions to single jet inclusive observables uncovered large corrections in certain kinematical regions, which also resulted in a sizeable ambiguity on the appropriate choice of renormalization and factorization scales. We now perform a detailed investigation of the infrared sensitivity of the different ingredients to the single jet inclusive cross section. We show that the contribution from the second jet, ordered in transverse momentum $p_{T}$, in the event is particularly sensitive to higher order effects due to implicit restrictions on its kinematics. By investigating the second-jet transverse momentum distribution, we identify large-scale cancellations between different kinematical event configurations, which are aggravated by certain types of scale choice. Taking perturbative convergence and stability as selection criteria enables us to single out the total partonic transverse energy $\hat{H}_{T}$ and twice the individual jet transverse momentum $2\,p_{T}$ (with which $\hat{H}_{T}$ coincides in Born kinematics) as the most appropriate scales in the perturbative description of single jet inclusive production.


Introduction
At hadron colliders, the factorised form of the inclusive cross section is given by

$$\mathrm{d}\sigma = \sum_{i,j} \int \mathrm{d}x_1\, \mathrm{d}x_2\; f_i(x_1)\, f_j(x_2)\; \mathrm{d}\sigma_{ij}, \qquad (1.1)$$

where $\mathrm{d}\sigma_{ij}$ is the parton-level scattering cross section for parton $i$ to scatter off parton $j$ and the sum runs over the possible parton types $i$ and $j$. The probability of finding a parton of type $i$ in the proton carrying a momentum fraction $x$ is described by the parton distribution function (PDF) $f_i(x)\,\mathrm{d}x$. By applying suitable cuts, one can study more exclusive observables such as the transverse momentum distribution or the rapidity distribution of the hard objects (jets or vector bosons, Higgs bosons or other new particles) produced in the hard scattering. In eq. (1.1), one has to fix the renormalization scale $\mu_R$ for the strong coupling $\alpha_s(\mu_R)$, and the mass factorization scale $\mu_F$ for the parton distribution functions $f_i(x, \mu_F)$.
JHEP10(2018)155

In this paper we study jet production at hadron colliders, in particular the single jet inclusive cross section in proton-proton collisions, dσ(p+p → jet+X), which is obtained by summing over all jets in the event. The observable is inclusive over all additional radiation as no further kinematical constraints are imposed on the final state particles beyond the requirement of observing at least a single jet. In this way, the full event can contain multiple jets and all jets that lie in a given range of rapidity $y$ and transverse momentum $p_T$ are taken into account in determining the single jet inclusive cross section for that bin.
Large-$p_T$ jet production at hadron colliders has been studied in particle accelerators over a period of many years by the UA1 [1] and UA2 [2] experiments at the Sp$\bar{\mathrm{p}}$S collider ($\sqrt{s}$ = 546 GeV and 630 GeV) and by the CDF [3] and D0 [4] experiments at the Tevatron ($\sqrt{s}$ = 1.96 TeV). At the Large Hadron Collider (LHC) at CERN, the ALICE, ATLAS and CMS collaborations have measured inclusive jet cross sections in proton-proton collisions at centre-of-mass energies of $\sqrt{s}$ = 2.76 TeV [5][6][7], 7 TeV [8,9], 8 TeV [10,11] and 13 TeV [12,13]. These precise measurements are crucial for understanding physics at hadron colliders as jet cross sections provide valuable information about the strong coupling constant $\alpha_s$, the non-perturbative structure of the proton encoded in the PDFs, and probe the shortest distance scales that are experimentally attainable. More recently, jet substructure techniques have been applied to understand the internal dynamics of QCD jets in order to identify discriminator variables which can more easily disentangle jets originating in the QCD parton scattering process from those produced by the hadronic decay of new heavy beyond-Standard-Model particles [14].
Hadron collider jet observables can be computed at a given fixed order in α s in perturbative QCD, by retaining the corresponding terms in the series expansion in α s for the parton-level cross sections and the PDFs, as presented in eq. (1.1). Next-to-leading-order (NLO) QCD corrections to jet production at hadron colliders were computed in [15][16][17] and later combined with a parton shower in [18,19]. First-order corrections in the electroweak (EW) coupling have been derived in [20,21], and the combination of NLO QCD and EW corrections was studied in [22]. A study of joint jet radius and threshold resummation has been presented in [23]. Progress in next-to-next-to-leading-order (NNLO) QCD calculations has been made over the past several years [24][25][26][27]. After the completion of the first calculations of the gluons-only subprocess [28,29], the complete leading-colour and leading-N F NNLO QCD corrections to the single jet inclusive production [30] and to di-jet production [31] were obtained recently.
The recent NNLO calculation provides new opportunities for QCD studies at hadron colliders: it enables precise theoretical predictions for jet observables to be compared with the wealth of experimental jet data of similar precision. More formally, the knowledge of three orders in the perturbative expansion in $\alpha_s$ for these jet observables provides a testing ground for the impact of the higher order corrections through the notion of perturbative convergence and the reduction of theoretical scale uncertainties.
One issue which requires particular attention is the role of the renormalization and factorization scales in the theoretical predictions. At a formal level, the parameters $\mu_R$ and $\mu_F$ are introduced as auxiliary quantities which allow meaningful predictions to be calculated at each order in perturbation theory. As auxiliary quantities, an all-order prediction would be independent of these parameters. However, truncating at fixed order yields a residual dependence, formally of one order higher in the strong coupling. Varying the numerical value of the scale (usually in an interval around a pre-defined central scale choice) is frequently used to quantify the uncertainty on the theory prediction due to the uncomputed higher orders. The huge dynamical range of jet production at the LHC and the three available perturbative orders in the theoretical prediction provide the opportunity to test thoroughly the commonly used arguments about scale dependence in perturbative calculations.
For these reasons we have recently provided jet cross section predictions for the LHC at NNLO using either the leading jet transverse momentum in an event, $\mu_R = \mu_F = p_{T,1}$ [30], or each individual jet transverse momentum, $\mu_R = \mu_F = p_T$ [32], as a central scale choice. We have observed an overall reduction in the scale dependence of the prediction at NNLO with respect to the NLO result with either of these scale choices. However, comparing the two predictions (which are both based on well-motivated central scale choices) against each other, we noticed a substantial difference in their quantitative behaviour [32], which can be viewed as a further uncertainty on the theory prediction. It is therefore important to arrive at a sensible central scale choice, which covers the range of jet kinematics accessible at the LHC.
In this paper, we perform a detailed study of the perturbative behaviour of the individual contributions to the single jet inclusive production cross section for a given set of sensibly chosen dynamical scales. In section 2 we present the structure and the scale dependence of the single jet inclusive cross section computed through to NNLO in QCD. We discuss possible functional forms for the scale choice in terms of kinematical variables, thereby carefully distinguishing scales which are based on individual jet kinematics (jet-based) or on full event kinematics (event-based). Particular attention is paid to the effects of the jet clustering algorithm and the jet resolution parameter on the kinematical variables used in the different scale choices.
In section 3 we subsequently perform a detailed investigation of the infrared sensitivity of the different ingredients to the single jet inclusive cross section for the jet-based scale choice µ R = µ F = p T and the event-based scale choice µ R = µ F = p T,1 . It is the aim of this section to identify the source of the different quantitative behaviour in the NNLO predictions between the µ R = µ F = p T and µ R = µ F = p T,1 scale choices.
In section 4 we analyse the behaviour of the perturbative expansion of the single jet inclusive observable, for the different functional forms for the central scale choice established in section 2. This allows us to assess the perturbative stability and convergence properties for each scale choice up to NNLO, thereby identifying the most appropriate candidates. Subsequently we compare our predictions at NLO and NNLO to the available CMS 13 TeV jet data for the first time in section 5. Finally in section 6 we present our conclusions.
2 Renormalization and factorization scales in the single jet inclusive cross section
When calculating jet cross sections to fixed order in perturbation theory, eq. (1.1), one has to fix the renormalization scale $\mu_R$ for the strong coupling $\alpha_s(\mu_R)$, and the mass factorization scale $\mu_F$ for the parton distribution functions $f_i(x, \mu_F)$.

The behaviour of the coupling constant and parton distributions under scale variations is determined by evolution equations. After fixing a central reference scale, all scale-dependent terms of hard scattering cross sections can be inferred by expanding the solutions of the evolution equations in powers of the strong coupling constant. These are collected in section 2.1 below. For processes involving massive particles (vector bosons, Higgs bosons or top quarks), the particle mass provides a natural candidate for the central reference scale. In contrast, no natural fixed scale is present in jet production processes, which involve only massless objects at parton level. Consequently, the central scale for jet production must be chosen dynamically, based on the kinematics of the final-state objects (jets or full events) under consideration. Section 2.2 discusses different prescriptions for the central scale in single jet inclusive production, based on the kinematics of each individual jet, or of the whole event. These kinematical variables depend on the jet algorithm and the jet resolution parameter. In section 2.3 we define the individual jet contributions to the single jet inclusive production cross section which are individually infrared safe only if they are inclusive in the jet rapidity. Analysing the different possible final state configurations up to NNLO, we finally discuss the impact of the jet resolution on the event properties and on the scale choices in section 2.4.

Renormalization scale dependence
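The renormalization group equation describing the running of $\alpha_s$ as a function of the renormalization scale $\mu_R$ reads:

$$\mu_R^2\, \frac{\mathrm{d}\alpha_s(\mu_R)}{\mathrm{d}\mu_R^2} = -\alpha_s(\mu_R) \left[ \beta_0 \left(\frac{\alpha_s(\mu_R)}{2\pi}\right) + \beta_1 \left(\frac{\alpha_s(\mu_R)}{2\pi}\right)^{2} + \beta_2 \left(\frac{\alpha_s(\mu_R)}{2\pi}\right)^{3} + \mathcal{O}(\alpha_s^4) \right], \qquad (2.1)$$

with the $\overline{\mathrm{MS}}$-scheme coefficients [33,34]

$$\begin{aligned}
\beta_0 &= \frac{11\, C_A - 4\, T_R N_F}{6}, \\
\beta_1 &= \frac{17\, C_A^2 - 10\, C_A T_R N_F - 6\, C_F T_R N_F}{6}, \\
\beta_2 &= \frac{1}{432} \left( 2857\, C_A^3 + 108\, C_F^2 T_R N_F - 1230\, C_F C_A T_R N_F - 2830\, C_A^2 T_R N_F + 264\, C_F T_R^2 N_F^2 + 316\, C_A T_R^2 N_F^2 \right),
\end{aligned} \qquad (2.2)$$

where $C_A = 3$, $C_F = 4/3$, $T_R = 1/2$ and $N_F$ is the number of light quark flavours.
Using the solution of this equation, the coupling at a fixed scale $\mu_{R_0}$ can be expressed in terms of the coupling at $\mu_R$ by introducing

$$L_R = \log\left(\frac{\mu_R^2}{\mu_{R_0}^2}\right), \qquad (2.3)$$

which yields

$$\alpha_s(\mu_{R_0}) = \alpha_s(\mu_R) \left[ 1 + \beta_0 L_R\, \frac{\alpha_s(\mu_R)}{2\pi} + \left( \beta_0^2 L_R^2 + \beta_1 L_R \right) \left(\frac{\alpha_s(\mu_R)}{2\pi}\right)^{2} + \mathcal{O}(\alpha_s^3) \right]. \qquad (2.4)$$

The perturbative expansion of the single jet inclusive cross section starts at order $\alpha_s^2$. In evaluating the expansion coefficients $\sigma^{(n)} = \sigma^{(n)}(\mu_{R_0})$, the renormalization scale is fixed to a value $\mu_{R_0}$ (which can be dynamically evaluated event-by-event). Rescalings can then be made for a fixed ratio $\mu_R/\mu_{R_0}$ for all events (e.g. if $\mu_{R_0} = p_{T,1}$, we can rescale to $\mu_R = 2\,p_{T,1}$ or $\mu_R = p_{T,1}/2$, but not to $\mu_R = M_Z$ or $\mu_R = H_T$).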
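As a numerical illustration of how well such a truncated expansion tracks the running coupling, the following minimal sketch integrates the two-loop running and compares it with the expansion truncated at relative order $\alpha_s^2$. It assumes the normalization in which the β-function is expanded in $\alpha_s/(2\pi)$, with $\beta_0 = (11 C_A - 4 T_R N_F)/6$ and $\beta_1 = (17 C_A^2 - 10 C_A T_R N_F - 6 C_F T_R N_F)/6$. The choice $N_F = 5$, the input value 0.118 at 100 GeV, the step count and all function names are illustrative assumptions, not part of the calculation described in this paper.

```python
import math

CA, CF, TR = 3.0, 4.0 / 3.0, 0.5

def beta_coefficients(nf):
    """First two beta-function coefficients, assuming an expansion in alpha_s/(2 pi)."""
    b0 = (11 * CA - 4 * TR * nf) / 6
    b1 = (17 * CA**2 - 10 * CA * TR * nf - 6 * CF * TR * nf) / 6
    return b0, b1

def run_alpha_s(alpha_start, mu_start, mu_end, nf=5, steps=2000):
    """RK4 integration of d alpha_s / d ln(mu^2) = -alpha_s (b0 a + b1 a^2), a = alpha_s/(2 pi)."""
    b0, b1 = beta_coefficients(nf)

    def rhs(al):
        a = al / (2 * math.pi)
        return -al * (b0 * a + b1 * a * a)

    h = (math.log(mu_end**2) - math.log(mu_start**2)) / steps
    al = alpha_start
    for _ in range(steps):
        k1 = rhs(al)
        k2 = rhs(al + 0.5 * h * k1)
        k3 = rhs(al + 0.5 * h * k2)
        k4 = rhs(al + h * k3)
        al += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return al

def alpha_s_truncated(alpha_mu, mu, mu0, nf=5):
    """alpha_s(mu0) written in terms of alpha_s(mu), truncated at relative order alpha_s^2."""
    b0, b1 = beta_coefficients(nf)
    log_ratio = math.log(mu**2 / mu0**2)  # L_R = log(mu_R^2 / mu_R0^2)
    a = alpha_mu / (2 * math.pi)
    return alpha_mu * (1 + b0 * log_ratio * a + (b0**2 * log_ratio**2 + b1 * log_ratio) * a**2)

# Illustrative input: alpha_s = 0.118 at mu0 = 100 GeV, evolved up to mu = 200 GeV
alpha_200 = run_alpha_s(0.118, 100.0, 200.0)
alpha_100_back = alpha_s_truncated(alpha_200, 200.0, 100.0)
print(alpha_200, alpha_100_back)
```

The residual mismatch between `alpha_100_back` and the input value is of the first neglected order in the coupling, which is the same mechanism that leaves a residual scale dependence in a fixed-order cross section.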

Factorization scale dependence
The evolution of parton distributions associated to a variation of the factorization scale µ F is determined by the Altarelli-Parisi equations [35] which read (omitting for simplicity the dependence on the Bjorken scaling variable x): (2.5) The expansion to the third order of the splitting functions P [36,37]: where we introduced Note that (2.6) can be rewritten as which implies that f i (µ F , µ R ) and f i (µ F , µ F ) fulfil the same evolution equation to all perturbative orders. The finite scheme transformation between both possible choices for µ F , equal or different from µ R , is thus vanishing to all orders and both functions can at most vary in their non-perturbative boundary conditions. For all perturbative purposes, we thus have which we will normally use in what follows (except if the scale transformation of the parton distribution is not expanded in α s (µ F ), but in α s (µ R )).

The parton distribution at a fixed scale µ F 0 can be expressed in terms of parton distributions at µ F by expanding the solution of (2.5). We distinguish the expansion in powers of α s (µ R ) and in powers of α s (µ F ) and introduce The expansion in α s (µ R ) of the parton distribution at µ F 0 reads then: The expansion in powers of α s (µ F ) is obtained from the above by setting µ R = µ F in α s to yield In both expressions, a summation over indices appearing twice is implicit.

Hadron collider jet cross section
Using the results presented above, one can compute the perturbative coefficients of the hadron collider cross section with default values of µ F 0 = µ R 0 = µ 0 . The perturbative expansion to NNLO reads: The full scale dependence of this expression, for µ F and µ R different from each other, can be recovered by inserting (2.4) and (2.11) into the above equation. It yields (2.14)
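For orientation, the renormalization-scale dependence alone can be written out explicitly. The following is a sketch under stated assumptions rather than a reproduction of the full result: the factorization scale is held fixed at $\mu_F = \mu_{F_0}$, the coupling is expanded in $\alpha_s/(2\pi)$ with β-function coefficients $\beta_0$ and $\beta_1$ in that normalization, $L_R = \log(\mu_R^2/\mu_{R_0}^2)$, and $\sigma^{(0)}$, $\sigma^{(1)}$, $\sigma^{(2)}$ denote the LO, NLO and NNLO coefficients evaluated at $\mu_{R_0}$ with the couplings factored out. Substituting the truncated expansion of $\alpha_s(\mu_{R_0})$ in terms of $\alpha_s(\mu_R)$ and collecting powers of the coupling gives:

```latex
\begin{aligned}
\sigma(\mu_R) ={}& \left(\frac{\alpha_s(\mu_R)}{2\pi}\right)^{2} \sigma^{(0)}
 + \left(\frac{\alpha_s(\mu_R)}{2\pi}\right)^{3}
   \left[ \sigma^{(1)} + 2\beta_0 L_R\, \sigma^{(0)} \right] \\
 &+ \left(\frac{\alpha_s(\mu_R)}{2\pi}\right)^{4}
   \left[ \sigma^{(2)} + 3\beta_0 L_R\, \sigma^{(1)}
        + \left( 3\beta_0^2 L_R^2 + 2\beta_1 L_R \right) \sigma^{(0)} \right]
 + \mathcal{O}(\alpha_s^5) .
\end{aligned}
```

The logarithmic terms at each order are fixed entirely by lower-order coefficients, which is why a rescaling by a constant factor can be predicted, while a change in the functional form of the central scale cannot.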

Scale choices
Inclusive jet observables accumulate each reconstructed jet in the event to the same kinematic distribution, resulting in multiple bookings of the event into a given histogram. The set of possible scale choices is consequently large and we shall distinguish two generic types: event-based and jet-based scales. A jet-based scale only uses kinematic information from the individual jet to determine the scale associated with the contribution from this jet to the cross section. In a given event, the event weight is thus evaluated at several different scales, one scale for each jet. In contrast to this, an event-based scale uses information from the full final state of the event to set a common scale for all binnings of the jets that are contained in this event.

In this paper we will consider the following set of functional forms for the scale choice (and multiples thereof):
• the individual jet transverse momentum $p_T$: when this jet-based scale choice is used for the inclusive $p_T$ distribution, the observable is directly aligned with the scale itself, making it a convenient choice for PDF fits. It mimics kinematical hierarchies in an event, where multiple jets can be reconstructed with very different $p_T$. However, this can lead to the scale being set to values that are not at all representative of the underlying hard scattering process.
• the leading-jet transverse momentum $p_{T,1}$: this event-based scale uses the $p_T$ of the hardest jet in the event, which is a better proxy for the scale of the hard interaction compared to the $\mu = p_T$ choice. For multi-jet events comprising many hard resolved jets, $p_{T,1}$ can still underestimate the scale of the hard interaction. Moreover, $p_{T,1}$ does not take account of scale hierarchies in an event.
• the scalar sum of the transverse momenta of all reconstructed jets $H_T$: with this event-based scale one incorporates the kinematics of all individual jets by summing up their respective transverse momenta, $H_T = \sum_{i \in \mathrm{jets}} p_{T,i}$. As such, it constitutes the hardest scale discussed so far and for the Born-level 2 → 2 process, it is related to the $p_T$ scales as $H_T = 2\,p_T = 2\,p_{T,1}$. It suffers, however, from discontinuous behaviour when the number of reconstructed jets changes: it displays a large displacement at the phase-space boundaries where $(n+1)$-jet events migrate to $n$-jet events. As a consequence, higher order corrections for values of $p_T$ close to the minimum jet acceptance $p_{T,\mathrm{min}}$ become unstable and we will no longer consider this scale in the remainder of this paper.
• the scalar sum of the transverse momenta of all partons $\hat{H}_T$: the undesirable discontinuous behaviour of $H_T$ can be alleviated if the transverse momentum sum is not based on the reconstructed jets, but instead obtained as the transverse momentum sum of all partons in the event: $\hat{H}_T = \sum_{i \in \mathrm{partons}} p_{T,i}$. This event-based scale choice also has the advantage of being insensitive to the jet reconstruction applied in the analysis and is an infrared-safe event shape variable.
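The four functional forms listed above can be made concrete with a small sketch. It uses a deliberately simplified, hypothetical jet finder in which every parton is well separated and forms its own jet whenever it passes a transverse momentum threshold; the threshold of 20 GeV, the event momenta and the function names are illustrative assumptions. The example shows the Born-level relation between the scales and the jump in the jet-based sum when a third parton crosses the jet threshold, while the parton-based sum changes smoothly.

```python
def reconstruct_jets(partons_pt, pt_min):
    """Toy jet finder (assumption): each parton is isolated and forms a jet if pT >= pt_min."""
    return sorted((pt for pt in partons_pt if pt >= pt_min), reverse=True)

def central_scales(partons_pt, pt_min=20.0):
    """Candidate central scales for one event, given the parton transverse momenta in GeV."""
    jets = reconstruct_jets(partons_pt, pt_min)
    return {
        "pT": jets,                 # jet-based: one scale per booked jet
        "pT1": jets[0],             # event-based: leading-jet pT
        "HT": sum(jets),            # event-based: scalar sum over reconstructed jets
        "HT_hat": sum(partons_pt),  # event-based: scalar sum over all partons
    }

born = central_scales([100.0, 100.0])
below = central_scales([100.0, 85.0, 19.9])  # third parton just fails the jet threshold
above = central_scales([100.0, 85.0, 20.1])  # third parton just passes it

print(born["HT"], 2 * born["pT1"], born["HT_hat"])
print(above["HT"] - below["HT"], above["HT_hat"] - below["HT_hat"])
```

In this toy setup a 0.2 GeV change in the softest parton shifts the jet-based sum by about 20 GeV but the parton-based sum by only 0.2 GeV.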
Table 1. Possible scale choices in inclusive jet production and their properties.

Any scale choice that is based on the kinematics of the reconstructed jets, i.e. $p_T$, $p_{T,1}$, and $H_T$ from the list above, inherits a dependence on the jet cuts and the details of the clustering employed in the analysis [38]. This means that for a given partonic configuration and the same scale definition, the determined value for the scale depends on the details of the jet algorithm, the allowed rapidity range and the rapidity and $p_T$ range probed by the experiment. In particular, the scale choice introduces an indirect dependence on the cone size $R$ of the jet algorithm and on the jet cuts:
• A sensitivity of event-based scales to the jet cuts induces the unwanted property that a variation of rapidity cuts can impact the predictions in other rapidity regions well away from the variation.
• The dependence of the scale on the jet-clustering algorithm can introduce an indirect sensitivity to the cone size $R$. Such an effect becomes hard to disentangle from the purely kinematical dependence on $R$, which is discussed in section 2.4 and which induces potentially enhanced corrections of the form $\log(R)$.
As an example of the above, consider the event-based scale µ = p T,1 and a configuration in which the leading jet is relatively forward and thus does not contribute in a central rapidity slice of the single jet inclusive cross section. If the detector rapidity coverage includes the jet, the scale will be the p T of this forward jet. On the other hand, if the forward jet lies outside of the detector coverage it will not be identified as the leading jet, and the event-based scale will be different. As a consequence, predictions for the jet cross section in the central region of the detector will depend on the rapidity coverage of the detector when the event-based scale p T,1 is used.
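This dependence on the detector coverage can be illustrated with a short sketch. The event below, the two rapidity acceptances and the function name are hypothetical values chosen only for illustration: a forward jet is the hardest object in the event, and whether it sets the event-based scale depends on whether it falls inside the accepted rapidity range.

```python
def leading_jet_pt(jets, y_max):
    """Leading-jet pT among jets with |y| < y_max; jets is a list of (pT [GeV], rapidity)."""
    accepted = [pt for pt, y in jets if abs(y) < y_max]
    return max(accepted) if accepted else None

# Hypothetical event: hardest jet is forward, second jet is central
event = [(120.0, 3.5), (95.0, 0.2), (30.0, -1.0)]

scale_wide = leading_jet_pt(event, y_max=4.7)    # forward jet inside the coverage
scale_narrow = leading_jet_pt(event, y_max=3.0)  # forward jet outside the coverage
print(scale_wide, scale_narrow)
```

The central jet with a transverse momentum of 95 GeV is booked in the same bin in both cases, but with an event-based scale of 120 GeV for the wide coverage and 95 GeV for the narrow one.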
In contrast, the µ = p T scale choice always uses the transverse momentum of the jet in the rapidity slice where the jet is observed and therefore its predictions are not sensitive to the jet-defining cuts. However, as will be detailed in section 2.4, the scales µ = p T,1 and µ = p T show a different sensitivity to the jet cone size.
Both of these issues are avoided for $\mu = \hat{H}_T$, which is defined on the basis of the parton kinematics. While not being directly accessible in the experimental measurement, $\hat{H}_T$ is infrared-safe and theoretically well-defined. Its use for scale settings is not problematic, since the renormalization and factorization scales are simply auxiliary quantities in the theoretical prediction. Table 1 summarises the different scale choices together with their respective properties discussed in this section.

Individual jet contributions to inclusive jet production
To illustrate the difference between an event-based and a jet-based scale choice, we consider two of the most common scale choices in studies of jet production at hadron colliders, i.e. $\mu = p_{T,1}$ and $\mu = p_T$. To this end, it is instructive to look at the composition of the single jet inclusive cross section in terms of contributions from individual jets in an event:

$$\frac{\mathrm{d}\sigma}{\mathrm{d}p_T}(\mu = p_{T,1}) = \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,1}}(\mu = p_{T,1}) + \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,2}}(\mu = p_{T,1}) + \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,3}}(\mu = p_{T,1}) + \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,4}}(\mu = p_{T,1})\,. \qquad (2.15)$$
Predictions for the jet-based scale choice $\mu = p_T$ can subsequently be obtained in the following way,

$$\begin{aligned}
\frac{\mathrm{d}\sigma}{\mathrm{d}p_T}(\mu = p_T) ={}& \frac{\mathrm{d}\sigma}{\mathrm{d}p_T}(\mu = p_{T,1}) \\
&+ \left[ \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,2}}(\mu = p_{T,2}) - \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,2}}(\mu = p_{T,1}) \right] \\
&+ \left[ \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,3}}(\mu = p_{T,3}) - \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,3}}(\mu = p_{T,1}) \right] \\
&+ \left[ \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,4}}(\mu = p_{T,4}) - \frac{\mathrm{d}\sigma}{\mathrm{d}p_{T,4}}(\mu = p_{T,1}) \right],
\end{aligned} \qquad (2.16)$$

such that the difference between the $\mu = p_{T,1}$ and $\mu = p_T$ results can be identified in the last three lines in equation (2.16). It will therefore be important to numerically study the individual sub-leading jet contributions to the inclusive jet sample and in particular the effects that can arise from changing the scale from an event-based scale to a jet-based scale.

When decomposing the inclusive jet cross section in terms of the contributions from leading and subleading jets, the individual jet distributions are well-defined and infrared-safe only if they are inclusive in the jet rapidity (with the same global rapidity cuts applied to all jets). Since the notion of leading and sub-leading jet is not well defined at leading order ($p_{T,1} = p_{T,2}$ at LO), the rapidity assignment to the leading and subleading jet is ambiguous for leading-order kinematics. When computing higher-order corrections, the rapidity of the leading and subleading jet may thus be interchanged between event and counter-event, causing them to end up in different rapidity bins, thereby obstructing their cancellation in infrared-divergent limits. On the other hand, in the inclusive jet transverse momentum distribution (which sums over all jets in the event) IR-safety is restored in differential distributions in rapidity $y$, since leading and subleading jet contributions are treated equally.
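The relation between the event-based and jet-based bookings can be sketched in a few lines. The weight function below is a hypothetical stand-in for the scale dependence of an event weight (a one-loop-like $\alpha_s^2$ behaviour with an arbitrary reference value of 0.2 GeV), and the events, bin width and function names are illustrative assumptions. The sketch books each jet of an event into a histogram labelled by its $p_T$ rank, once with the common scale $p_{T,1}$ and once with the individual jet $p_T$.

```python
import math
from collections import defaultdict

def toy_weight(mu):
    """Hypothetical event weight, mimicking an alpha_s(mu)^2-like scale dependence."""
    return 1.0 / math.log(mu / 0.2) ** 2

def book(events, scale_of, bin_width=50.0):
    """Return {rank: {bin: summed weight}}; scale_of(pts, k) gives mu for the k-th jet (0-based)."""
    hists = defaultdict(lambda: defaultdict(float))
    for pts in events:
        pts = sorted(pts, reverse=True)
        for k, pt in enumerate(pts):
            hists[k + 1][int(pt // bin_width)] += toy_weight(scale_of(pts, k))
    return hists

def inclusive(hists):
    """Sum the ranked-jet histograms into the single jet inclusive distribution."""
    total = defaultdict(float)
    for rank_hist in hists.values():
        for b, w in rank_hist.items():
            total[b] += w
    return total

events = [[100.0, 100.0], [120.0, 95.0], [210.0, 150.0, 40.0]]  # jet pT's in GeV
event_based = book(events, lambda pts, k: pts[0])  # mu = pT,1 for every jet in the event
jet_based = book(events, lambda pts, k: pts[k])    # mu = pT of the jet being booked

for b in sorted(inclusive(jet_based)):
    print(b, inclusive(event_based)[b], inclusive(jet_based)[b])
```

The leading-jet histogram is identical for both choices, and the balanced event contributes no difference, so the two inclusive results differ only through the subleading jets of the unbalanced events.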

Dependence on the jet resolution parameter R
In this subsection we discuss the effects stemming from the jet definition itself, in particular the jet resolution. For the sake of illustration, we represent the jets by cones of radius $R$ in rapidity and azimuthal angle, as obtained [39] by either a cone algorithm or the commonly used anti-$k_T$ clustering/recombination algorithm [40]. Figure 1 shows some illustrations of various jet configurations at LO and NLO where solid arrows represent partons and cones represent jets resulting from the jet algorithm. Figure 1a shows a dijet event at leading order where two back-to-back partons form two jets and $p_{T,1} = p_{T,2}$. In this case there is no difference in scale choice between $p_{T,1}$ and $p_T$. Figure 1b shows a dijet event where three partons are clustered by the jet algorithm into two jets such that the jets are still balanced in $p_T$ and the scale choice is identical. Figure 1c shows a trijet event where three partons are sufficiently hard and separated to form three distinct jets. In this configuration $p_{T,1} \neq p_{T,2} \neq p_{T,3}$ and so the scale choice does make a difference, although the three-jet contribution makes up only a very small fraction of the inclusive jet cross section as we will observe in section 3. Figure 1d depicts a dijet event where the third parton falls outside the jet radius and is not clustered but also is not sufficiently hard to form a jet on its own; such configurations typically lead to a small imbalance in the leading and subleading jet $p_T$ and their description is sensitive to the scale parameterization.
At NNLO there are more configurations to consider due to the presence of four final-state partons in the double real contribution forming either two, three or four jets. Once again, many configurations do not contribute to the difference in scale parameterization. Whenever the jet algorithm clusters two, three or four partons into two jets then the jets are balanced in $p_T$ and there is no difference between the $\mu = p_{T,1}$ and $\mu = p_T$ scale choices. The only NNLO configurations that can contribute to the difference are: three- or four-jet events (for which the cross section is very small) or two-jet events where additional radiation falls outside of the jet radius, see figure 2.
As illustrated in figure 3 the choice of the $R$ parameter in the jet algorithm can have an effect on how the partons are clustered into jets. We can take the same three-parton configuration and consider the clustering for a smaller value of $R$, figure 3a, and a larger value of $R$, figure 3b. For the smaller $R$ value the most subleading parton is more likely to fall outside the jet radius of the two leading jets and so generate a difference between the $\mu = p_{T,1}$ and $\mu = p_T$ scale parameterizations. Therefore, when using the $\mu = p_T$ scale choice, the value of the scale can vary with $R$ for a fixed event. On the other hand, with the choice $\mu = p_{T,1}$, the scale for the event is $R$-independent at NLO, where the leading jet is not sensitive to radiation outside the cone, and becomes $R$-dependent only at NNLO.
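The effect of the jet size on a fixed three-parton configuration can be reproduced with a minimal sketch of inclusive anti-$k_T$ clustering. The implementation below is a simplified illustration (brute-force distance search, four-momentum addition for recombination) and not the code used for the results in this paper; the parton momenta, the angular separation of 0.55 in azimuth and the jet threshold of 20 GeV are hypothetical values chosen so that the soft parton is clustered for $R = 0.7$ but not for $R = 0.4$.

```python
import math

def pt(p):
    return math.hypot(p[0], p[1])

def rap(p):
    return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

def delta_r2(a, b):
    dphi = abs(math.atan2(a[1], a[0]) - math.atan2(b[1], b[0]))
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return (rap(a) - rap(b)) ** 2 + dphi ** 2

def anti_kt(partons, R, pt_min):
    """Inclusive anti-kT clustering of (px, py, pz, E) momenta; returns jet pT's, hardest first."""
    objs = [tuple(p) for p in partons]
    jets = []
    while objs:
        # Beam distances d_iB = 1/pT_i^2; j = None marks a beam distance
        best = min(((1 / pt(p) ** 2, i, None) for i, p in enumerate(objs)), key=lambda t: t[0])
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                dij = min(1 / pt(objs[i]) ** 2, 1 / pt(objs[j]) ** 2)
                dij *= delta_r2(objs[i], objs[j]) / R**2
                if dij < best[0]:
                    best = (dij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))  # smallest distance is to the beam: promote to jet
        else:
            merged = tuple(x + y for x, y in zip(objs[i], objs[j]))
            objs = [p for k, p in enumerate(objs) if k not in (i, j)] + [merged]
    return sorted((pt(jet) for jet in jets if pt(jet) >= pt_min), reverse=True)

def massless(pT, y, phi):
    return (pT * math.cos(phi), pT * math.sin(phi), pT * math.sinh(y), pT * math.cosh(y))

# Hypothetical three-parton event, balanced in the transverse plane
p1 = massless(100.0, 0.0, 0.0)
p3 = massless(10.0, 0.0, math.pi + 0.55)          # soft parton near the recoiling hard parton
px2, py2 = -(p1[0] + p3[0]), -(p1[1] + p3[1])
p2 = massless(math.hypot(px2, py2), 0.0, math.atan2(py2, px2))

jets_R07 = anti_kt([p1, p2, p3], R=0.7, pt_min=20.0)
jets_R04 = anti_kt([p1, p2, p3], R=0.4, pt_min=20.0)
print(jets_R07, jets_R04)
```

For the larger jet size the soft parton is absorbed and the two jets are balanced, so both scale choices coincide. For the smaller jet size the second jet is softer than the first, while the leading-jet transverse momentum is the same in both cases, in line with the $R$-independence of $p_{T,1}$ for three-parton configurations.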
This difference between the two scale choices grows significantly for small $R$, decreases for large $R$, and is moderate for the phenomenologically relevant values used at the LHC, $R = 0.4$ and $R = 0.7$, as we will observe in section 3.

3 The scale choices $\mu = p_{T,1}$ and $\mu = p_T$
As observed in ref. [32], the spread in the NNLO predictions for single jet inclusive production between using the dynamical scales $\mu = p_{T,1}$ and $\mu = p_T$ can be comparable or even larger in size than the respective uncertainties estimated through scale variations. The significant effect of this scale ambiguity on the NNLO predictions, and the lack of a theoretically well-motivated preference, motivate us to revisit these results and to study this issue further.
For the leading jet in the event, the scale µ = p T is identical to µ = p T,1 and its contribution is therefore insensitive to the scale choice between p T and p T,1 . Furthermore, two-jet events where the jets are balanced in p T cannot generate any difference as p T = p T,1 = p T,2 . Away from these jet configurations, the subleading jets will have a smaller p T than the leading jet in the event so that p T,2 , p T,3 , . . . < p T,1 .

For these reasons, at LO the two scale choices generate the same prediction and similarly, for all events at higher order that have LO kinematics there is no difference between the two scale choices. In particular at high p T the scale choices once again converge as is to be expected for the largely back-to-back configurations encountered at high p T . Kinematical configurations where the scale choices do not coincide are events with three or more hard jets and events with hard emissions outside the jet fiducial cuts that generate an imbalance in p T between the leading and subleading jets in the event. For this reason, we can expect also that for larger jet cone sizes the difference in the predictions using µ = p T or µ = p T,1 will be smaller, since the increased number of parton clusterings driven by a larger cone size promotes final state jets balanced in p T .
It is the aim of this section to scrutinise how the contributions to the single jet inclusive transverse momentum distribution behave according to the choice of the functional form of the scale. To this end, we use the two central scale choices µ = p T and µ = p T,1 as representatives for jet-based and event-based scale settings. After describing our calculational set up in section 3.1, in sections 3.2 and 3.3 we study the impact on the transverse momentum distribution of the individual jet fractions at LO, NLO and NNLO level for the two central scale choices µ = p T and µ = p T,1 and for the two cone sizes R = 0.7 and R = 0.4. Having identified the crucial role of the second jet distribution from this analysis, in section 3.4 we focus our attention to this particular contribution and present how it behaves at a given perturbative order.

Calculational setup
In order to investigate the differences between the scale choices $\mu = p_T$ and $\mu = p_{T,1}$ and their origin, we perform a numerical study for the single jet inclusive cross section at a centre-of-mass energy of $\sqrt{s}$ = 13 TeV. The jets are identified using the anti-$k_T$ algorithm [40] and results are presented for both $R = 0.4$ and $R = 0.7$, so that the dependence on the jet cone size can also be inspected.
Jets are accepted within the fiducial volume defined through the cuts covering jet-$p_T$ values up to 2 TeV, and ordered in transverse momentum. As explained in section 2.3, we cannot apply a rapidity binning to leading and subleading jet distributions. We systematically use the PDF4LHC15_nnlo_100 PDF set [41] for the evaluation of the LO, NLO and NNLO contributions. This choice of a fixed PDF across the different perturbative orders allows us to quantify the effects of the two scale choices at the partonic cross section level, rendering our conclusions independent of the PDF set used. The value for the strong coupling constant is given by $\alpha_s(M_Z) = 0.118$, as provided by the PDF set.

Corrections to the transverse momentum distribution
As a first step, we investigate the impact of including NLO and NNLO corrections to the single jet inclusive transverse momentum distribution. For all event-based scales in the remainder of this paper the variation includes doubling and halving the central value of the scale independently for $\mu_R$ and $\mu_F$, with the constraint $1/2 \leq \mu_R/\mu_F \leq 2$. For jet-based scales the reevaluation of the event at several different scales is increasingly expensive to compute. For this reason we restrict the scale variation for jet-based scales to 3-point symmetric $\mu_R$, $\mu_F$ scale variations, noting that the bulk of the scale dependence comes from $\mu_R$ variations and that no significant differences are observed with respect to a 7-point scale variation. As expected, we can observe that at high $p_T$ the NLO and NNLO effects are small and similar using either of the two central scale choices while more pronounced and different effects can be observed at low $p_T$. In the low $p_T$ region we can observe larger NLO corrections with the scale $\mu = p_T$ than with the scale $\mu = p_{T,1}$, while one observes smaller NNLO corrections with the $\mu = p_T$ scale than with the scale $\mu = p_{T,1}$. As a result we see a faster convergence of the perturbative expansion when using the scale $\mu = p_T$, where in particular the NNLO result lies inside the NLO scale uncertainty band, which itself lies inside the LO scale band. Furthermore, the scale uncertainty at NNLO displays a greater reduction for the scale choice $\mu = p_T$.
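The two scale-variation prescriptions mentioned above can be enumerated explicitly. The sketch below lists the multipliers applied to the central scale for the 7-point variation (independent factors of 2 with the constraint $1/2 \leq \mu_R/\mu_F \leq 2$) and for the 3-point symmetric variation; the function names and the envelope helper are illustrative assumptions about how such a band might be formed.

```python
from itertools import product

def scale_variations(seven_point=True):
    """Return the (k_R, k_F) multipliers of the central scale.

    seven_point=True: independent factor-2 variations with 1/2 <= k_R/k_F <= 2.
    seven_point=False: 3-point symmetric variation with k_R = k_F.
    """
    factors = (0.5, 1.0, 2.0)
    if seven_point:
        return [(kr, kf) for kr, kf in product(factors, repeat=2) if 0.5 <= kr / kf <= 2.0]
    return [(k, k) for k in factors]

def envelope(values):
    """Lower and upper edge of the band spanned by the varied predictions."""
    return min(values), max(values)

seven = scale_variations()
three = scale_variations(seven_point=False)
print(seven)
print(three)
```

The constraint removes the two combinations in which the renormalization and factorization scales are varied in opposite directions, leaving seven of the nine possible pairs.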
It is instructive to compare what happens when using a smaller jet cone size. In this case, we fix the jet cone size to $R = 0.4$ and present the results in figure 5. Similarly to the $R = 0.7$ case, we observe identical higher order effects at high $p_T$ between the two scale choices while more pronounced effects can be seen at low $p_T$. In this case, we observe that the NLO corrections using the central scale $\mu = p_T$ are smaller than those corrections obtained for $R = 0.7$. The NLO scale uncertainty band is artificially small with the central scale choice sitting at the top of the band and the overlap between the NNLO result and the NLO scale band is no longer observed. Looking at the $\mu = p_{T,1}$ results, we observe an almost identical NLO scale band as for the results obtained with $R = 0.7$ and again non-overlapping NNLO and NLO scale bands. By comparing the NNLO/NLO K-factors for the two scale choices we observe that the NNLO cross section decreases (increases) with respect to the NLO result with $\mu = p_T$ ($\mu = p_{T,1}$). This is not unexpected since we have anticipated that for smaller jet cone sizes the effects of changing the scale from $\mu = p_T$ to $\mu = p_{T,1}$ would be more pronounced. In particular, by comparing the NNLO/LO curves for $\mu = p_{T,1}$ and $\mu = p_T$ (remembering that the LO result is identical for the two scale choices) we see that in the $R = 0.7$ case the NNLO predictions lie significantly closer to each other than for $R = 0.4$.
In order to demonstrate the last effect more clearly, figure 6 shows the ratios of the predictions at NLO (in blue) and NNLO (in red) for the two scale choices. We see that at NLO and NNLO, the impact of changing the scale from µ = p T,1 to µ = p T is more pronounced for the smaller jet size R = 0.4 (right) than it is for the larger jet size R = 0.7 (left). Interestingly, we also observe that for the two jet sizes, the impact of this change is bigger at NNLO than it is at NLO, which contradicts our expectation that the higher order corrections should lead to a smaller scale dependence.
It is worth noting that we do not necessarily expect that a change in the form of the central scale choice from µ = p T,1 to µ = p T can be captured by varying the values taken for renormalization and factorization scales in the predictions computed at a given fixed order. When the scale variation is performed, all the events are shifted simultaneously by a rescaling of the µ R and µ F scales. On the other hand, when we change the central scale from µ = p T,1 to µ = p T , events with LO kinematics are unchanged while events with higher order kinematics can change significantly.
The renormalization group equations (see section 2) can be used to predict a change in the cross section due to a multiplication of the scales by a constant shift factor, but are otherwise unable to predict the behaviour of the cross section with another functional form for the central scale choice. For this reason we can expect the potentially different behaviour of the two scales used to compute IR sensitive observables (which are subject to delicate cancellations between real and virtual corrections) to be the underlying cause of the discrepancy in the results at NNLO between µ = p T,1 and µ = p T .
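The difference between rescaling a scale and changing its functional form can be illustrated with a minimal sketch. The function names and the toy transverse momenta below are assumptions made for illustration only; Ĥ T is taken here as the scalar sum of the transverse momenta of all final-state partons.

```python
from typing import Dict, List


def central_scales(jet_pts: List[float], parton_pts: List[float]) -> Dict[str, List[float]]:
    """Scale assigned to each jet (ordered in pT) for three functional forms.

    jet_pts    : transverse momenta of the reconstructed jets in GeV
    parton_pts : transverse momenta of all final-state partons in GeV
    """
    jets = sorted(jet_pts, reverse=True)
    n = len(jets)
    return {
        "pT": list(jets),             # jet-based: each jet uses its own pT
        "pT1": [jets[0]] * n,         # event-based: leading-jet pT for every jet
        "HT": [sum(parton_pts)] * n,  # event-based: scalar sum of parton pT
    }


def rescale(scales: Dict[str, List[float]], factor: float) -> Dict[str, List[float]]:
    """Conventional scale variation: every scale is multiplied by the same factor."""
    return {name: [factor * mu for mu in values] for name, values in scales.items()}


# Born-like event: two balanced partons, each forming one jet (toy numbers).
born = central_scales([200.0, 200.0], [200.0, 200.0])
# Event with one extra resolved emission that forms a third jet (toy numbers).
real = central_scales([200.0, 150.0, 50.0], [200.0, 150.0, 50.0])

print("Born:", born)
print("Real:", real)
print("Born, rescaled by 1/2:", rescale(born, 0.5))
```

In the Born-like event the forms p T and p T,1 coincide and Ĥ T equals 2 p T , whereas in the three-jet event the second and third jets are assigned different scales by p T and p T,1 . A constant rescaling shifts all entries by the same factor and cannot reproduce this event-dependent change.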

Jet fractions in the single jet inclusive distribution
In order to explore this idea further, it is instructive to observe the breakdown of the single jet inclusive transverse momentum distribution into leading and subleading jet fractions, which is shown in figure 7 for the scale µ = p T,1 and jet sizes R = 0.4 (left) and R = 0.7 (right) at LO (top), NLO (middle) and NNLO (bottom). Beyond the trivial LO result, which as expected shows an equality between the first and second jet transverse momentum distributions, we observe interesting effects at higher orders. In particular, at NLO, we find that the leading jet contribution dominates the inclusive jet p T spectrum for both jet sizes, while the contribution from the third jet is negligible. As expected for the larger cone size we produce more events with jets that are balanced in p T and the jet fractions for the first and second jet are closer to the symmetric LO result. We can also identify a significant depletion of the second jet contribution in the NLO result for the jet cone size R = 0.4 at low p T with the scale choice µ = p T,1 . Finally, the NNLO results show a substantial increase in the second jet fraction for both jet sizes with respect to the NLO case, thereby coming closer to the LO result of similar-size first and second jet fractions.
With these results in mind, we can conclude that a small change in the second jet p T distribution can have a potentially larger impact on the inclusive jet transverse momentum distribution at NNLO than at NLO, since the second jet contributes significantly more to the inclusive jet sample at NNLO than it does at NLO. It is therefore plausible that a change in scale from µ = p T,1 to µ = p T which affects the second jet p T distribution produces a larger shift in the prediction of the inclusive jet p T distribution at NNLO than it does at NLO (as shown in figure 6).
For comparison, figure 8 shows the corresponding jet fractions for the µ = p T scale choice and jet sizes R = 0.4 (left) and R = 0.7 (right) at LO (top), NLO (middle) and NNLO (bottom). As expected, when we compare with the results obtained with the scale µ = p T,1 we do not see significant differences in the jet fractions for the larger jet size of R = 0.7. On the other hand, for R = 0.4 we observe an increase in the second jet contribution at low p T at NLO and a reduction in the same region at NNLO with respect to the results for µ = p T,1 .
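As a minimal sketch of how such jet fractions can be accumulated, the following example fills every jet of every event into the inclusive p T histogram and records which jet rank it came from. The event list, weights and bin edges are toy assumptions and are not taken from the calculation discussed here; in a fixed-order calculation the weights can be negative.

```python
from bisect import bisect_right
from collections import defaultdict


def jet_fractions(events, bin_edges):
    """Fraction of the inclusive jet pT histogram contributed by each jet rank.

    events    : iterable of (weight, [jet pT values in GeV])
    bin_edges : increasing list of pT bin edges in GeV
    """
    n_bins = len(bin_edges) - 1
    total = [0.0] * n_bins
    by_rank = defaultdict(lambda: [0.0] * n_bins)
    for weight, jet_pts in events:
        # Rank 1 is the leading jet, rank 2 the second jet, and so on.
        for rank, pt in enumerate(sorted(jet_pts, reverse=True), start=1):
            idx = bisect_right(bin_edges, pt) - 1
            if 0 <= idx < n_bins:
                total[idx] += weight          # inclusive sum over all jets
                by_rank[rank][idx] += weight  # contribution of this rank
    return {
        rank: [c / t if t != 0.0 else 0.0 for c, t in zip(contrib, total)]
        for rank, contrib in by_rank.items()
    }


# Toy input: one balanced two-jet event and one unbalanced three-jet event.
toy_events = [(1.0, [250.0, 250.0]), (1.0, [250.0, 150.0, 120.0])]
fractions = jet_fractions(toy_events, [100.0, 200.0, 300.0])
for rank in sorted(fractions):
    print(rank, fractions[rank])
```

By construction the fractions of all ranks sum to unity in every populated bin, which is why a depletion of the second jet fraction is always accompanied by an enhancement of the leading jet fraction.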

The second jet transverse momentum distribution
Given its potential impact on the scale uncertainty of the NNLO single jet inclusive cross section, we now focus our attention on the second jet transverse momentum distribution. Figure 9 shows the perturbative expansion of the second jet p T distribution for the jet cone size of R = 0.7 with the scale choice µ = p T,1 (left) and µ = p T (right). For the two scale choices, we observe that this distribution is subject to very large perturbative corrections indicating potentially IR-sensitive effects. In particular we identify the presence of very large negative NLO corrections and large positive NNLO corrections generating an alternating series expansion with large coefficients. It is reassuring that the results at NNLO for the two scale choices are still largely identical despite this effect. We can nonetheless discern a significantly improved behaviour in the perturbative expansion when the scale µ = p T is used. Both NLO and NNLO K-factors are significantly reduced and the NLO and NNLO scale uncertainty bands are also closer to each other for the µ = p T case. The same behaviour can be observed for the smaller jet cone size of R = 0.4 in figure 10 where the sensitivity to IR effects is even more pronounced. In this case, we find a negative NLO cross section for the scale choice µ = p T,1 which is clearly exhibiting a pathological behaviour. The NNLO corrections fix this unphysical behaviour even when the scale µ = p T,1 is used, but similarly to the R = 0.7 case, we see a significantly better convergence of the perturbative series using µ = p T as the central scale choice.
Interestingly, we also observe for both cone sizes that in this contribution the NNLO scale band (in red) is larger than the LO scale band (in green).
In order to understand the source of the IR sensitivity in the second-jet contribution, figure 11 shows the fractional contribution to the second jet p T distribution in a given p T,2 interval (133 GeV < p T,2 < 153 GeV) for particular p T,1 slices plotted along the x-axis, for either µ = p T,1 (left frames) or µ = p T (right frames), and using R = 0.7 (upper frames) and R = 0.4 (lower frames). The bin content is constrained to sum to unity by construction. We observe that this is achieved through a large cancellation (for both scale choices) between the first bin of the distribution (where p T,1 = p T,2 ) and the adjacent bin (where p T,1 > p T,2 ). In particular at NLO (in blue) the entire second bin content is filled from the NLO real emission (where p T,1 can be larger than p T,2 for the first time) while the virtual correction contributes to the first bin only. When comparing the behaviour of the two scale choices we note that for µ = p T,1 the scale is increasing along the x-axis. On the other hand for µ = p T , the scale is fixed to be equal to p T,2 for all contributions and the cancellation between the large positive real emission and large negative virtual correction is improved (as shown by the height of the bins). This effect is even more pronounced for the R = 0.4 jet size as shown in the two lower frames of figure 11. We can therefore infer that we observe an instability at higher order in the second jet p T distribution when additional radiation is not recombined into the outgoing jet and generates an imbalance between p T,1 and p T,2 . In this case, relatively soft emissions do not cancel fully against the virtual corrections and large logarithms appear. This effect has been observed to be particularly relevant for the smaller jet cone size distributions. After employing the µ = p T scale choice we see an improved convergence for the second jet p T distribution. The observed stabilisation for the jet-based scale µ = p T as opposed to the event-based scale µ = p T,1 is at first sight counter-intuitive, as one should expect an event-based scale to lead to an improved infrared stability [38], since all contributions from a single parton-level event are evaluated at the same scale. The situation is somewhat different for the jet inclusive p T distribution, since its infrared sensitivity stems only from the contribution from the second jet, which has implicit restrictions on its allowed phase space. If the second-jet cross section in a fixed kinematical bin is broken down according to the event properties that contribute to it, then the jet-based µ = p T is a fixed scale, while the event-based µ = p T,1 becomes a dynamical scale.
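The statement that µ = p T,1 becomes a dynamical scale within a fixed p T,2 bin can be made concrete with a minimal sketch. It evaluates the strong coupling at µ = p T,1 relative to µ = p T,2 using the standard one-loop running with five flavours. The reference value α s (M Z ) = 0.118, the choice p T,2 = 143 GeV (centre of the bin quoted above) and the sampled p T,1 values are assumptions for illustration. The sketch covers only the coupling and not the scale dependence of the PDFs or of the matrix elements.

```python
import math

M_Z = 91.1876        # Z-boson mass in GeV
ALPHA_S_MZ = 0.118   # assumed reference value of the strong coupling at M_Z
N_F = 5              # number of active quark flavours (assumption)


def alpha_s_one_loop(mu: float) -> float:
    """One-loop running coupling evolved from M_Z to the scale mu (GeV)."""
    b0 = (33.0 - 2.0 * N_F) / (12.0 * math.pi)
    return ALPHA_S_MZ / (1.0 + b0 * ALPHA_S_MZ * math.log(mu**2 / M_Z**2))


PT2 = 143.0  # centre of the 133 GeV < pT,2 < 153 GeV interval

# With mu = pT the coupling is evaluated at pT,2 in every pT,1 slice.
# With mu = pT,1 it changes from slice to slice.
ratios = []
for pt1 in (143.0, 200.0, 400.0, 800.0):
    ratio = alpha_s_one_loop(pt1) / alpha_s_one_loop(PT2)
    ratios.append(ratio)
    print(f"pT,1 = {pt1:6.1f} GeV: alpha_s(pT,1) / alpha_s(pT,2) = {ratio:.3f}")
```

The ratio equals one in the slice where p T,1 = p T,2 and decreases for larger p T,1 , so with µ = p T,1 the real-emission slices are weighted with a different coupling than the bin that contains the virtual correction.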
We conclude that by employing the scale µ = p T we improve the stability of the second jet transverse momentum distribution with respect to µ = p T,1 by improving the cancellation at fixed order between the real and virtual corrections. Since the leading jet p T,1 contribution is identical with either µ = p T or µ = p T,1 , the single jet inclusive cross section is potentially more stable when using the jet-based scale µ = p T .

Comparison of different scale choices
The renormalization and factorization scales are arbitrary dimensionful parameters and any scale is a priori an equally valid choice. Moreover, any ambiguity induced by different choices of the scales should ideally reduce as higher order terms in the perturbative expansion are included. As was shown in the previous section, however, the inclusive p T distribution suffers from an infrared sensitivity that exhibits a strong dependence on the scale that is used and a suboptimal choice can introduce pathological behaviours in the predictions.
It is the aim of this section to go beyond the two scale choices µ = p T and µ = p T,1 of the previous section and to study predictions for single jet inclusive production based on the comprehensive set of functional forms introduced in section 2.2. In particular, we will study the scale µ = Ĥ T and the appropriate scaling factor in front of the central scale choice. To this end, we introduce a set of criteria that define desirable properties for a suitable scale choice:
(a) perturbative convergence: we require that the size of the corrections reduces at each successive order in the perturbative expansion.
(b) scale uncertainty as error estimate: in order to have a reliable estimate of theory uncertainties due to missing higher-order corrections, we require overlapping scale-uncertainty bands between the last two orders, i.e. between the NLO and NNLO predictions. Ideally, the central prediction with the highest accuracy should lie within the scale variation of the order that precedes it.
(c) perturbative convergence of the individual jet spectra: based on the observation of the previous section, where the p T spectra of the individual jets receive large corrections with cancellations in the inclusive distribution, we further demand the convergence of the corrections to the individual p T,1 and p T,2 distributions.
(d) stability of the second jet distribution: the comparison between the scales µ = p T and p T,1 has exposed the second jet distribution to be especially sensitive to the scale choice, sometimes even exhibiting unphysical behaviour where the scale variation predicts negative cross sections. We therefore introduce an additional criterion based on the second jet distribution and its associated scale uncertainty and require the predictions to provide physical, positive cross sections.
In this way, a careful assessment of the behaviour of each scale can be made purely based on the behaviour of the predictions in perturbation theory, prior to any comparisons with experimental data (which are deferred to section 5). Section 4.1 is devoted to a comparison of the different scales on the basis of the criteria defined above and identifying the choices that satisfy our requirements on transverse momentum distributions integrated over rapidity. It is the aim of this section to arrive at a sensible scale choice for single jet inclusive production. In section 4.2, we validate the optimal scale choices we made by further looking at the inclusive jet p T distribution differentially in rapidity.
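The four criteria can be phrased as simple per-bin tests. The following is a minimal sketch under stated assumptions: the function name, the input layout and all numerical values are hypothetical and serve only to show the logic. Correction factors are understood as fractional corrections relative to the preceding order.

```python
def check_bin(dk_sum, dk_jets, band_nlo, band_nnlo, second_jet_band):
    """Evaluate criteria (a)-(d) in a single pT bin.

    dk_sum          : (NLO, NNLO) fractional corrections to the inclusive spectrum
    dk_jets         : list of (NLO, NNLO) fractional corrections, one pair per jet rank
    band_nlo        : (low, high) scale-variation envelope of the NLO cross section
    band_nnlo       : (low, high) scale-variation envelope of the NNLO cross section
    second_jet_band : (low, high) scale-variation envelope of the second-jet cross section
    """
    return {
        # (a) corrections to the inclusive spectrum shrink from NLO to NNLO
        "a": abs(dk_sum[1]) < abs(dk_sum[0]),
        # (b) NLO and NNLO scale-uncertainty bands overlap
        "b": band_nlo[0] <= band_nnlo[1] and band_nnlo[0] <= band_nlo[1],
        # (c) corrections to each individual jet spectrum shrink as well
        "c": all(abs(nnlo) < abs(nlo) for nlo, nnlo in dk_jets),
        # (d) the second-jet cross section stays positive over its scale band
        "d": second_jet_band[0] > 0.0,
    }


# Hypothetical numbers for one bin (arbitrary units), not results of the calculation.
well_behaved = check_bin(
    dk_sum=(0.20, 0.05),
    dk_jets=[(0.60, -0.10), (-0.40, 0.15)],
    band_nlo=(0.90, 1.10),
    band_nnlo=(1.02, 1.08),
    second_jet_band=(0.30, 0.50),
)
pathological = check_bin(
    dk_sum=(0.05, 0.15),
    dk_jets=[(0.80, -0.30), (-0.75, 0.90)],
    band_nlo=(0.98, 1.00),
    band_nnlo=(1.10, 1.18),
    second_jet_band=(-0.10, 0.20),
)
print(well_behaved)
print(pathological)
```

A scale choice would then be accepted if the tests pass across the p T range of interest. How strictly to treat isolated failures in individual bins is a judgement that such a sketch does not settle.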

Assessment of the convergence criteria
In order to test the convergence criteria (a)-(d) defined in the introduction of this section on a more quantitative level, we define the following correction factors for the individual jet p T spectra where p T,i denotes the i-th leading jet in the event. The K-factor for the inclusive jet distribution can be expressed in terms of the δk i as follows, and conditions (a), (c) are then given by ∀i .
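As a sketch of how these quantities can be written, one form that is consistent with the discussion in this section is given below. The normalisation to the inclusive cross section of the preceding order is an assumption made here so that the individual corrections add up to the inclusive K-factor minus one. All differential cross sections are understood to be evaluated in the same p T bin.

```latex
% Assumed normalisation: each correction is divided by the inclusive
% cross section of the preceding order in the same p_T bin.
\begin{align}
  \delta k_i^{\mathrm{NLO}} &=
    \frac{\mathrm{d}\sigma^{\mathrm{NLO}}/\mathrm{d}p_{T,i}
          - \mathrm{d}\sigma^{\mathrm{LO}}/\mathrm{d}p_{T,i}}
         {\sum_j \mathrm{d}\sigma^{\mathrm{LO}}/\mathrm{d}p_{T,j}} ,
  &
  \delta k_i^{\mathrm{NNLO}} &=
    \frac{\mathrm{d}\sigma^{\mathrm{NNLO}}/\mathrm{d}p_{T,i}
          - \mathrm{d}\sigma^{\mathrm{NLO}}/\mathrm{d}p_{T,i}}
         {\sum_j \mathrm{d}\sigma^{\mathrm{NLO}}/\mathrm{d}p_{T,j}} ,
  \\
  K^{\mathrm{NLO}} &= 1 + \sum_i \delta k_i^{\mathrm{NLO}}
                    \equiv 1 + \delta k_\Sigma^{\mathrm{NLO}} ,
  &
  K^{\mathrm{NNLO}} &= 1 + \sum_i \delta k_i^{\mathrm{NNLO}}
                     \equiv 1 + \delta k_\Sigma^{\mathrm{NNLO}} ,
  \\
  \text{(a):}\quad
  \bigl|\delta k_\Sigma^{\mathrm{NNLO}}\bigr| &<
  \bigl|\delta k_\Sigma^{\mathrm{NLO}}\bigr| ,
  &
  \text{(c):}\quad
  \bigl|\delta k_i^{\mathrm{NNLO}}\bigr| &<
  \bigl|\delta k_i^{\mathrm{NLO}}\bigr| \quad \forall i .
\end{align}
```

With this convention δk Σ vanishes when an order leaves the inclusive spectrum unchanged, which matches the reading of the black curves as being "closer to zero" when criterion (a) is satisfied.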

Given that the measured single jet inclusive sample receives contributions predominantly from the two leading jets in the event, it is sufficient to test condition (c) only for i = 1, 2. Figures 12 and 13 show the correction factors δk 1 , δk 2 , and δk Σ for the set of scale choices of section 2.2 at NLO (solid lines) and NNLO (dashed lines) for the cone sizes R = 0.7 and R = 0.4, respectively. As anticipated, we find large cancellations between the leading (blue) and the subleading jet contributions (red) at each order in the perturbative expansion for any scale choice.
In particular, we can observe a very large negative (positive) NLO coefficient for the second (first) jet contribution in solid red (blue). This effect explains why the second jet contribution to the inclusive jet p T sample at NLO is significantly reduced and the first jet fraction dominates (as shown in figures 7, 8). At the next order, the sign of the NNLO coefficient is reversed for the leading and subleading jet (dashed blue and red respectively), so that the leading and subleading fractions become similar over the whole p T range at NNLO.
Given that the aforementioned feature is common to all scale choices, we can now apply the criteria (a) and (c) to assess which scale choices show the most stable behaviour in the perturbative expansion. Criterion (c) is concerned with the spread of the blue and red curves, associated with δk 1 and δk 2 , respectively. Going from NLO to NNLO, we require the size of the corrections to the individual jet p T to become smaller and therefore that the dashed curves exhibit a smaller spread than the corresponding solid ones. We observe that for the scale choices µ = p T , µ = p T,1 , and µ = Ĥ T /2 this condition is not fulfilled, especially at low p T and in particular for the smaller jet cone size R = 0.4.
The net effect on the inclusive p T spectrum is given by the correction factors δk Σ , shown as the black lines. With criterion (a), we require the K-factor at NNLO to be smaller than at NLO, i.e. the dashed black lines to be closer to zero than the solid ones. Here, we observe that the scales p T and p T,1 give rise to sizeable NNLO corrections that are larger in magnitude than the corresponding corrections at NLO for R = 0.4, while for the bigger cone size of R = 0.7 the same effect is observed to be true again for the p T,1 scale choice. The remaining scales 2 p T,1 , 2 p T ,Ĥ T , andĤ T /2 fulfil criterion (a), with NNLO corrections at the level of 5-10%.
In figure 14 we examine criterion (b) on the theory error estimate by plotting the predictions at a given order with their respective scale uncertainty bands normalised to the NLO prediction. Given the potentially large impact of the cone size, we present results for both R = 0.7 (top) and R = 0.4 (bottom). For both cone sizes we observe that the scale choices p T,1 and 2 p T,1 give rise to scale uncertainty bands at NLO and NNLO that do not overlap in the low-p T region. For the scale p T , the conclusion depends strongly on the cone size: we observe overlapping bands for R = 0.7 but not for R = 0.4. The remaining three scales 2 p T , Ĥ T , and Ĥ T /2, on the other hand, exhibit good convergence with overlapping scale uncertainty bands independently of the cone size.
Finally, we study criterion (d) by investigating the perturbative behaviour of the second jet distribution and its associated scale uncertainties. In figures 15 and 16 we show the corrections to the p T distribution of the second jet for the cone sizes R = 0.7 and 0.4, respectively. As was already mentioned in the study of criterion (c), we clearly observe an improved perturbative behaviour with smaller higher-order corrections and scale uncertainties for the three harder scale choices 2 p T , 2 p T,1 , and Ĥ T compared to their respective counterparts that are smaller by a factor of a half. For R = 0.4, we find that only the scale µ = 2 p T is able to predict positive NLO cross sections across the entire p T range, both for the central value as well as for its variation. Although the scale choices 2 p T,1 and Ĥ T give rise to NLO scale uncertainties that extend to negative cross section values, this behaviour is less critical as it only occurs in the very first bin(s) below p T ∼ 150 GeV and the central predictions remain positive. The situation is much more severe for the remaining three scales p T , p T,1 , and Ĥ T /2, where the NLO prediction exhibits the unphysical behaviour of negative cross sections already starting from p T ∼ 400-600 GeV. In the case of µ = p T,1 and Ĥ T /2, even the central prediction turns negative at the lowest p T .
We summarise the findings of this section in tables 2a, 2b for cone sizes of R = 0.7 and R = 0.4 respectively. By comparing the two tables we see that, as expected, the various scale choices behave in a much more similar way for the larger cone size than for R = 0.4. It is interesting to note that the two most commonly used scales µ = p T and p T,1 perform by far the worst among the set of scale choices considered here. In particular, they are not able to fulfil any of the criteria for the smaller cone size of R = 0.4.
On the other hand, the scale µ = 2 p T fulfils all the requirements we identified at the beginning of this section, while the scale µ =Ĥ T satisfies all of the criteria for p T > 150 GeV. We therefore identify µ = 2 p T and µ =Ĥ T as the two theoretically best-motivated scale choices for single jet inclusive production, noting that the former belongs to the class of jet-based scales and the latter is an event-based scale.

Results for central and forward rapidity slices
Having discussed at length the behaviour of the leading and subleading jet contributions as a function of the scale choice integrated over rapidity, for the remainder of this section, we will focus on the single jet inclusive observable for the different rapidity bin intervals used by the CMS collaboration [12]. Figure 17 shows the perturbative corrections for the single jet inclusive cross section at NLO and at NNLO for the same six scale choices discussed earlier: µ = p T,1 , µ = p T , µ = 2 p T,1 , µ = 2 p T , µ =Ĥ T , and µ =Ĥ T /2 for a jet cone size of R = 0.7 and for jets produced at (a) central rapidity (|y| < 0.5) and (b) forward rapidity (2.5 < |y| < 3.0). The shaded bands represent the scale variation around the respective central scale choice.
Focussing first on the central rapidity region shown in figure 17a, we see that the shape and size of the LO/NLO K-factor (green) for the µ = p T,1 , µ = p T and µ =Ĥ T /2 scales are fairly similar. However, we observe larger NLO radiative corrections when these central choices are rescaled by a factor of 2.
Inspection of the NNLO/NLO K-factor (red) reveals that the NNLO corrections are generally smaller than the NLO ones, but that there is some dependence on the functional form of the scale choice. While the NNLO/NLO K-factor is never more than ±20% for any of the scale choices, the dependence on p T is quite varied. For µ = p T (µ = 2 p T ), the corrections grow from −10% (0%) at low p T to a few percent (10%) at large p T , while for µ = p T,1 (µ = 2 p T,1 ), the corrections fall from +15% (12%) at low p T to a few percent (10%) at large p T . For µ = Ĥ T , the corrections are always positive, growing from a few percent at low p T to 12% at large p T . In the case of µ = Ĥ T /2, the NNLO/NLO K-factor is always small. The same qualitative behaviour can be observed in the predictions for jet production at forward rapidity (2.5 < |y| < 3.0), shown in figure 17b.
Because of the significantly different behaviour of the perturbative expansion for each scale choice, it is instructive to compare the respective absolute cross sections in the central rapidity region with a fixed normalisation. Figure 18a shows the NLO results for all six scale choices.
Performing the same comparison for the different scale choices at NNLO in figure 18b and normalising to the NNLO prediction with µ = 2 p T , we observe the anticipated dramatic reduction in the scale variation with respect to NLO (as indicated by the reduction in the thickness of the red and green bands compared to figure 18a). We also conclude that the NNLO predictions are generally all in good agreement, particularly at high p T , and independently of the scale choice. However, at low p T we do observe larger differences, where the scales µ = p T,1 and µ = 2 p T,1 tend to look similar and predict an NNLO cross section that is approximately 10% larger than that obtained with the scales µ = p T , µ = 2 p T , µ = Ĥ T , µ = Ĥ T /2. The size of this effect combined with a significant reduction in the scale uncertainty of the NNLO prediction introduces an ambiguity because the scale variation of the NNLO cross section no longer captures the predictions of different functional forms for the central scale choice. This has an important interplay with PDF extractions [42] using jet data and NNLO predictions, and also a significant impact when comparing the NNLO predictions with jet data.
We present the study of the perturbative corrections for the smaller jet cone size R = 0.4 in figure 19. [...] accidentally minimise the scale uncertainty. At NNLO there is a more symmetric scale variation; however, the NNLO/NLO K-factors behave rather differently. The effect of the NNLO radiative corrections is positive for the scales µ = p T,1 , µ = 2 p T,1 , negligible for the scales µ = Ĥ T , µ = Ĥ T /2, and negative for the scales µ = p T , µ = 2 p T . As expected, the magnitude of the ambiguity in the scale choice for inclusive jet production is more severe for the smaller jet cone sizes. [...] variations at NLO whose uncertainty largely captures the effects of changing the functional form of the scale. That is to say that the red and green bands largely overlap. The same comparison is shown at NNLO in figure 20b where, as for R = 0.7, we observe a dramatic reduction in the scale variation with respect to NLO (as indicated by the reduction in the thickness of the red and green bands compared to figure 20a), except at low p T for the scale choices µ = p T , µ = p T,1 and µ = 2 p T,1 . For this reason, we can conclude that the NNLO predictions are generally in very good agreement at high p T , independently of the scale choice. At low p T we find larger differences, where in particular the scales µ = p T,1 and µ = 2 p T,1 tend to look similar and predict an NNLO cross section that is approximately 15%-20% larger than that obtained with µ = p T , µ = 2 p T , µ = Ĥ T , µ = Ĥ T /2.
The instability of the single jet inclusive cross section at low p T has been thoroughly discussed in sections 3 and 4.1. Due to implicit restrictions on its kinematics, it was found that the contribution from the second jet in the event is particularly sensitive to higher order effects, and that the perturbative stability of the predictions can be improved to some extent by adopting sensible scale choice criteria. Moreover, the largest difference in cross section and NNLO scale uncertainty is associated with using either µ = p T , µ = p T,1 or µ = 2 p T,1 as central scale choices. As documented in tables 2a, 2b, these scale choices introduce pathological behaviours in the perturbative expansion of the single jet inclusive observable. Since the spread in the NNLO predictions including these scale choices is larger in size than the NNLO scale variation, their inclusion (and associated pathological behaviours) overestimates the residual scale uncertainty at NNLO. It is therefore sensible to adopt well-motivated criteria for fixing the scale choice that maximise the impact of the knowledge of the higher order QCD corrections to the observable, to the extent that pathological behaviours are avoided. We have observed that the best perturbative stability can be obtained for µ = 2 p T or µ = Ĥ T , where the perturbative convergence of the individual jet contributions is vastly improved with respect to the other functional forms of the scale choice. It is therefore not surprising that these scales tend to show smaller NNLO corrections and lead to smaller residual NNLO scale uncertainties.
In the remainder of this paper we will employ these two functional forms of the central scale choice to compare our predictions with jet data from the CMS dataset at √ s = 13 TeV for the first time.

Comparison with CMS jet measurements at √ s = 13 TeV
Having discussed how the jet kinematics at the LHC affects the event-based and jet-based scale choices differently, in this section we present predictions for the double differential jet cross section at NLO and NNLO for the CMS measurement at √ s = 13 TeV [12]. We use the same numerical setup as described in section 3.1 and do not include non-perturbative effects from underlying event and hadronization in our predictions. An assessment of the size of the non-perturbative contributions has been presented in [12] and we note that these can vary significantly with the jet p T and the cone size R. In the study in [12] the non-perturbative corrections are expected to be negligible for R = 0.4 but can reach up to 10%-15% for R = 0.7 at low p T .
Figure 21 displays the NLO and NNLO predictions for the jet-based scale choice µ = 2 p T , as well as for the event-based scale choice µ = Ĥ T , compared to the CMS 13 TeV data [12] with a jet cone size of R = 0.7. For both scale choices we observe small positive NNLO corrections across all rapidity slices that improve the agreement with the CMS data as compared to the NLO prediction. In addition we identify a reduction in the scale uncertainty from NLO to NNLO across the entire p T range. Figure 22 shows the NLO and NNLO predictions for the smaller jet cone size of R = 0.4 (where non-perturbative corrections are expected to be less important than for R = 0.7 [12]). Similarly to the R = 0.7 case, we see that both scale choices provide reasonable predictions and that the agreement with data is improved at NNLO. For the µ = 2 p T scale choice this is achieved by having small negative NNLO corrections while for µ = Ĥ T the NNLO corrections are flat, leading to a smaller residual scale variation at NNLO than for µ = 2 p T .

Summary and conclusions
In this paper we have studied single jet inclusive production at hadron colliders and the jet transverse momentum distribution obtained by adding up the contributions from all jets that are observed in an event. Our predictions include the most up-to-date second order NNLO corrections in the perturbative expansion of the observable.
In detail we presented a breakdown of the inclusive jet-p T sample into leading and subleading jet contributions and found large radiative corrections to the first and second jet contributions (that dominate the inclusive jet sample) that largely cancel each other. By investigating the second-jet transverse momentum distribution we identified large cancellations between different kinematical event configurations, which are aggravated by certain types of scale choices. Since the notion of leading and subleading jet is not well defined at leading order (p T,1 = p T,2 at LO), the single jet inclusive observable is decomposed into IR-sensitive leading and subleading jet contributions and the functional form of the scale can have an impact on the final result, when the kinematics of the scale choice affects the IR cancellations between the different contributions. We have found this effect to be worse for the smaller jet cone size R = 0.4 than for R = 0.7.
Figure 21. Double-differential single jet inclusive cross-section measurement by CMS [12] and NNLO perturbative QCD predictions as a function of the jet p T in slices of rapidity, for anti-k T jets with R = 0.7 normalised to the NLO result for (a) µ = 2 p T , (b) µ = Ĥ T scales. The shaded bands represent the scale uncertainty.
The smaller cone size increases the contribution from events where relatively soft emissions are not recombined with outgoing jets. These do not cancel fully with virtual corrections, leading to an imbalance between p T,1 and p T,2 . Since the second jet contribution to the inclusive jet sample is increased at NNLO with respect to NLO we have identified this effect to be the cause of the mismatch between inclusive jet predictions at NNLO that employ µ = p T or µ = p T,1 as central scale choices. By investigating the kinematical properties of events that contribute to a fixed bin in p T,2 (as function of p T,1 in the event), we found that the imbalance between real and virtual emissions is much more serious for µ = p T,1 than for µ = p T , which can be understood from the fact that the former is changing event-by-event in this distribution, while the latter remains constant. We have observed that the spread in the NNLO predictions that use different functional forms of the scale is larger in size than the NNLO scale variation in the low-p T region of the transverse momentum distribution. For this reason we have introduced a sensible set of criteria that define desired properties for a suitable scale choice and that can maximise the impact of the knowledge of the higher order QCD corrections to the observable, to the extent that scale choices that introduce pathological behaviours can be identified and avoided.
Figure 22. Double-differential single jet inclusive cross-section measurement by CMS [12] and NNLO perturbative QCD predictions as a function of the jet p T in slices of rapidity, for anti-k T jets with R = 0.4 normalised to the NLO result for (a) µ = 2 p T , (b) µ = Ĥ T scales. The shaded bands represent the scale uncertainty.
We have identified µ = 2 p T and µ = Ĥ T as the two scales that fulfil all the criteria that we have defined, observing that they lead to important cancellations between the leading and subleading jet contributions which result in an improved perturbative convergence of the transverse momentum distributions, with overlapping scale uncertainty bands.

Subsequently we used these two functional forms of the central scale choice to compare our NNLO predictions with jet data from the CMS dataset at √ s = 13 TeV [12] for the first time. We have observed that both recommended scale choices are stable and provide reasonable predictions for the two jet cone sizes employed in the measurement across the entire p T and rapidity region where the observable is defined. In particular we find an improved agreement with data at NNLO with respect to NLO, with a significant reduction in scale uncertainty by roughly a factor of 2 or more in a wide range of p T and rapidity. We have refrained from comparing the measurement to predictions that employ scale choices that exhibit pathological behaviours since these scale choices are not recommended on the grounds introduced in this study.
The central scale choices µ = 2 p T and µ =Ĥ T are clearly found to be favoured in terms of stability and convergence of the predictions for single jet inclusive production. Both yield very similar predictions at NNLO. We expect that our findings will enable improved precision studies based on single jet inclusive production data, especially in using them as precision probes of the parton distributions in the proton and for a determination of QCD parameters.