Dynamical scales for multi-TeV top-pair production at the LHC

We calculate all major differential distributions with stable top-quarks at the LHC. The calculation covers the multi-TeV range that will be explored during LHC Run II and beyond. Our results are in the form of high-quality binned distributions. We offer predictions based on three different parton distribution function (pdf) sets. In the near future we will make our results available also in the more flexible fastNLO format that allows fast re-computation with any other pdf set. In order to be able to extend our calculation into the multi-TeV range we have had to derive a set of dynamic scales. Such scales are selected based on the principle of fastest perturbative convergence applied to the differential and inclusive cross-section. Many observations from our study are likely to be applicable and useful to other precision processes at the LHC. With scale uncertainty now under good control, pdfs arise as the leading source of uncertainty for TeV top production. Based on our findings, true precision in the boosted regime will likely only be possible after new and improved pdf sets appear. We expect that LHC top-quark data will play an important role in this process.


Introduction
The recent derivation of the fully differential next-to-next-to-leading order (NNLO) correction to top quark-pair production at the LHC [1] and at the Tevatron [2,3] naturally raises the question: what precision can be expected in top-quark pair production at the LHC across observables and in the widest achievable kinematical ranges? To address this question, it is instructive to first recall the situation with the total inclusive cross-section which is well understood in (resummed) NNLO QCD [4][5][6][7].
Upon the inclusion of the NNLO QCD correction, σ tot can be predicted with an accuracy of about 5%. A number of independent sources contribute to this total error, the most important ones being missing higher order terms (beyond NNLO), pdf error and parametric m t and α S uncertainties. Significantly, all these sources of error are comparable in magnitude which indicates that further reduction in the error of top-pair production at the LHC would be a significant challenge even in the long run. The next level of uncertainty contributors to σ tot are at the level of about 1% and include EW corrections, finite top width and various non-perturbative effects.
This uncertainty breakdown for σ tot is a good indicator for the sources of uncertainty to be expected in top-pair differential distributions. It is important to recognise, however, that the various sources of uncertainty mentioned in the context of σ tot could vary wildly across kinematics. For example, the electroweak (EW) corrections are expected to become on par with the NNLO QCD scale variation in the TeV range [8][9][10][11][12][13][14][15][16][17][18][19]. Finite top width effects are typically suppressed by powers of Γ t /m t but can be much larger in special kinematic regions [20][21][22][23][24][25][26][27]. Non-factorisable effects in inclusive observables are typically suppressed by powers of 1/m t but could be much larger, for example, in presence of jet vetoes if p T,veto ≪ m t , in which case they are suppressed only as 1/p T,veto [28].
In this paper we take the first step towards the systematic study of theoretical uncertainties in precision fully-differential top-pair production at the LHC with stable top quarks. Specifically, we focus our discussion on NNLO QCD scale uncertainty which, at present, is a main source of theoretical error. The framework of our discussion is as follows:
1. We consider the variation of factorisation and renormalisation scales as a proxy for missing higher order terms. The scale variation procedure we use is not ad hoc; its applicability to the total inclusive cross-section has been validated.
2. As a prerequisite to scale variation, one needs to specify a default central scale µ 0 . The main goal of this paper is to identify the functional form of µ 0 . We choose such a scale based on the criterion of perturbative convergence. In doing so we account for LO, NLO and NNLO corrections as well as, where available, NNLO plus soft-gluon resummation.
3. We assume that the sought default scale µ 0 is the same for both the renormalisation and factorisation scales, i.e. µ R,0 = µ F,0 = µ 0 . Scale variation, however, is done independently for µ F and µ R [29]: (1.1)
4. A dynamic scale is, a priori, better than a fixed scale. However, the spread among various dynamic scales can be comparable in size to scale variation and therefore a sensible choice among possible dynamic scales has to be made.
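The independent variation of µ F and µ R described in item 3 can be sketched in a few lines. This is a minimal illustration only: it assumes the conventional restricted variation, with each scale multiplied by 1/2, 1 or 2 and the ratio µ R /µ F kept between 1/2 and 2, which yields the seven combinations referred to later in the text. The function name and its parameters are hypothetical.

```python
from itertools import product


def seven_point_variations(factors=(0.5, 1.0, 2.0), max_ratio=2.0):
    """Return (k_R, k_F) multipliers of mu_0 for a restricted independent variation.

    Assumption: mu_R = k_R * mu_0 and mu_F = k_F * mu_0, with the two
    anti-correlated extremes (k_R / k_F = 1/4 or 4) discarded.
    """
    return [
        (k_r, k_f)
        for k_r, k_f in product(factors, repeat=2)
        if 1.0 / max_ratio <= k_r / k_f <= max_ratio
    ]


variations = seven_point_variations()
for k_r, k_f in variations:
    print(f"mu_R = {k_r} * mu_0, mu_F = {k_f} * mu_0")
```

Of the nine possible pairs, only (1/2, 2) and (2, 1/2) are dropped under this assumption.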
Perturbative convergence is an indicator of the reliability of perturbative predictions. Ever since the early days of heavy flavour NLO calculations [30,31], running scales motivated by physical arguments have been used. Clearly, different scale choices affect the rate of convergence through the higher-order terms they introduce. Since scales are unphysical, one may promote perturbative convergence to a principle and try to derive the "correct" scale with it. In this work we only invoke the principle of fastest perturbative convergence in a weak sense; we speak of the criterion of faster perturbative convergence which we define as follows (related past work is reviewed in sec. 2): between two scales, the one that offers faster convergence is better. Clearly, the scale µ 0 will depend on the set of considered functional forms.
We motivate and explain our choices for scale µ 0 in section 3, but before going into this, we would like to make the following comment. While the scale choices we identify in this paper are sensible and satisfy the above criteria we do not imply that even "better" dynamic scales cannot be derived in the future. In particular, such scale modifications may be needed to reflect improved future understanding of the large p T behaviour of top production due to resummation of large collinear logs ∼ ln(p T /m t ) as well as the validity of the five-flavour number scheme that is exclusively used in the description of top production at present (see refs. [32][33][34] for related work). As quality LHC data at large p T starts to appear and these two theoretical issues get scrutinised, the functional form for the scale µ 0 may potentially need to be revisited. We, however, find it unlikely that such potential future scales will lead to significant deviations in observables compared to the scales derived in this work.
The paper is organised as follows: in section 2 we offer a brief overview of past results on scale setting relevant for our discussion. In section 3 we analyse the total inclusive cross-section and differential distributions for LHC 8 TeV and derive the functional forms for "best" scales µ 0 . As it turns out, two scales are needed: one for the p T distribution and one for all other distributions. In section 4 we study the sensitivity of NNLO differential distributions and demonstrate that our "best" scales µ 0 are stable with respect to the choice of pdf. In section 5 we present our best predictions for all stable top differential distributions in NNLO QCD for LHC 8 and 13 TeV. Prospects for further improvements are discussed in the conclusions. All results are made available in electronic form with the arXiv submission of this paper.

Overview of past work related to scale setting
Interpreting scale variation as theoretical uncertainty due to missing higher order terms has a long history. Within such an approach factorisation and renormalisation scales are typically varied up and down by factors of two and one-half around a judiciously chosen default value. Such a default scale, often called the central scale, is specific to each process and observable. Clearly, the choices for both the central scale and the variation around it are arbitrary. Nevertheless, as a result of three decades of higher-order calculations for high-energy colliders, a common choice of scale variation (2,1/2) has emerged. Such a variation procedure, which is common across processes and observables, is very useful in practice because it allows one to easily interpret and compare theoretical errors derived for different, even unrelated, processes. One can justify the amount of scale variation around a central value a posteriori, by comparing predictions for central scales computed at different orders in perturbation theory. A scale variation procedure is deemed good if the error estimate at a given perturbative order contains the central value of the next higher order. Such a procedure requires at least NLO calculations. If NNLO results are available then such checks can even be quantitative.
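The a posteriori check described above, whether the scale band at one order contains the central value of the next order, can be made quantitative bin by bin. The sketch below is only an illustration; the function name is hypothetical and the numbers are toy values in arbitrary units, not results from this work.

```python
def next_order_inside_band(band_low, band_high, next_central):
    """Per-bin check: does the lower-order scale band contain the next-order central value?"""
    return [lo <= c <= hi for lo, hi, c in zip(band_low, band_high, next_central)]


# Hypothetical toy numbers (arbitrary units), chosen only to exercise the check.
nlo_low = [90.0, 45.0, 9.0]
nlo_high = [110.0, 55.0, 11.0]
nnlo_central = [105.0, 52.0, 11.5]

covered = next_order_inside_band(nlo_low, nlo_high, nnlo_central)
coverage_fraction = sum(covered) / len(covered)
print(covered, coverage_fraction)
```

The fraction of bins in which the band covers the next order gives a simple figure of merit for comparing variation procedures or central scales.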
In top-pair production the scale variation procedure eq. (1.1) based on restricted independent variation of the factorisation and renormalisation scales has been shown to work very well through NNLO for the total inclusive cross-section [35]. We expect that it will also work well for differential distributions, at least in the bulk low-p T region, and we also extend this variation procedure to the whole kinematic range for all kinematic variables. The choice for the central scale is, however, much less clear and often alternative choices are made in different calculations for the same observable. We hope that with the advent of NNLO collider phenomenology such choices will be more and more scrutinised in the future. We also hope that the present work will serve as an example in this regard. While we cannot give an exhaustive collection of scales used in collider physics, in the following we will review some past work which has some relevance for our present work in top-pair production.
A number of dynamic scales have been used in the past in top-pair production at hadron colliders. In refs. [22,25] a geometric average scale (see eq. (3.5) below) has been used for both tt and single top production. H ′ T -based scales are also used [24], where H ′ T includes all final state partons as in eq. (3.4) below. Scales based on m T (3.2) have been used since the early days of NLO calculations [30,31,36,37], as well as, more recently, m tt -based scales [38][39][40][41].
Similar functional forms for the factorisation and renormalisation scales have been used and discussed in other collider processes. For example, for W + jets production the H ′ T /2 scale has been used at NLO [42], while at NNLO a modified version of H ′ T was used in ref. [43]. A detailed study of dynamic scales in W + 3 jets was performed in ref. [44] where scales based on the MLM and CKKW procedures [45,46] were found to offer small corrections across different kinematics, at variance with the case of the W -boson transverse mass. Related discussion for V + jets can be found in ref. [47]. An often made choice in inclusive jet production is p T or p T,max [48][49][50] while for dijet mass distributions one typically has p T,ave and p T,max e^(0.3 y*) [48,51]. A recent summary of existing LHC jet measurements can be found in ref. [52].
Past approaches to scale setting include the Method of Effective Charges [53][54][55] (sometimes referred to as Fastest Apparent Convergence [56,57]; see also Ref. 14 in [54]), the Principle of Minimal Sensitivity [56,58]; the Complete Renormalization Group Improvement approach [59] which provides a factorisation scale based on an alternative collinear factorisation scheme [60], extending earlier work on factorisation scale setting in Higgs production [61,62]. Finally, the Brodsky-Lepage-Mackenzie scale setting approach [63] (and its further refinement known as Principle of Maximum Conformality) [64][65][66][67][68][69] is based on the idea of restoring the conformal symmetry of the QCD Lagrangian in observables. The BLM/PMC approach specifies a value for the renormalisation, but not factorisation, scale.
Our approach is closest, yet not identical, to the criterion of Fastest Apparent Convergence. This criterion derives from the Method of Effective Charges and sets the renormalisation scale at such a (process-dependent) value that the NLO correction for a particular observable vanishes. The Method of Effective Charges is more general; its application is process-dependent and sets to zero all terms in the perturbative expansion beyond the leading order. The conditions one imposes are such that the truncated perturbative expansion for an observable is renormalisation scheme independent to any finite order. In effect, this method replaces the fixed order expansion in the usual MS-bar coupling evaluated at scale µ R with a Born-level effective coupling defined in a new, process-dependent renormalisation scheme. As a by-product of this procedure the value of the renormalisation constant gets fixed, too. Our approach is similar to the above in that it tries to minimise the size of higher order corrections, but not necessarily set them to zero.
In this work we choose to follow the usual approach to scale setting due to its broadly-established applicability from fully inclusive observables to exclusive multi-particle final states.
In particular, here we only consider scales which are common to all orders in the strong coupling expansion. For this reason, in the present work we do not study the implications of the BLM/PMC procedures. Recent comparison of predictions based on the BLM/PMC and the usual scale setting approaches can be found in ref. [3]. Alternative approaches for estimating theory errors have been proposed in refs. [70][71][72].

Choosing the scale µ 0
In order to identify the most appropriate dynamical scale for use in top-pair production at the LHC, we perform a number of fully differential calculations based on the following set of functional forms: where the momentum p T entering the definition of m T in eq. (3.2) is either that of the top or the antitop, depending on the distribution. The sum in the definition of H ′ T runs over all massless partons present in the final state (at NNLO there could be up to two partons). Finally, an important part of the process of choosing the functional form of µ 0 involves the fixing of the proportionality constant, signified by the ∼ sign in the above equations. While for brevity we focus our presentation on LHC 8 TeV, we have also verified that our conclusions remain unchanged at LHC 13 TeV. Unless explicitly specified, throughout this work we combine partonic cross-sections with pdf of the same order (LO with LO, NLO with NLO, etc). Resummed NNLO partonic cross-sections are convoluted with NNLO pdf. The strong coupling constant α S is evaluated through the LHAPDF interface [73] as appropriate for the corresponding pdf set. Throughout this paper scale variation in differential distributions is performed by independently varying µ F and µ R (as defined in sec. 1). Only in sec. 3.1, in the context of the total inclusive cross-section, do we use simultaneous µ F = µ R scale variation.
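For orientation, the scale variables that the surrounding text pins down verbally can be written as short functions. The sketch below is a reconstruction from those descriptions, not a reproduction of eqs. (3.1)-(3.7): it assumes m T is the transverse mass sqrt(m t ^2 + p T ^2), H T is the sum of the top and antitop transverse masses, E T is their geometric average, and H ′ T adds the transverse momenta of the massless final-state partons. The function names and the top-mass value are illustrative assumptions.

```python
import math

M_TOP = 173.3  # GeV; illustrative value, assumed here rather than quoted from the text


def m_T(pt, m=M_TOP):
    """Transverse mass of a top (or antitop) with transverse momentum pt."""
    return math.hypot(m, pt)


def H_T(pt_t, pt_tbar, m=M_TOP):
    """Assumed form: sum of the top and antitop transverse masses."""
    return m_T(pt_t, m) + m_T(pt_tbar, m)


def E_T(pt_t, pt_tbar, m=M_TOP):
    """Assumed form: geometric average of the two transverse masses."""
    return math.sqrt(m_T(pt_t, m) * m_T(pt_tbar, m))


def H_T_prime(pt_t, pt_tbar, parton_pts=(), m=M_TOP):
    """Assumed form: H_T plus the transverse momenta of massless final-state partons."""
    return H_T(pt_t, pt_tbar, m) + sum(parton_pts)


# Compare candidate central scales at a few transverse momenta (GeV),
# using Born-like kinematics with equal top and antitop pt.
for pt in (0.0, 200.0, 1000.0):
    print(pt, m_T(pt) / 2, H_T(pt, pt) / 4, E_T(pt, pt) / 2)
```

With equal top and antitop transverse momenta the three candidates m T /2, H T /4 and E T /2 coincide, so differences between them only arise once additional radiation unbalances the pair.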

Total cross-section
We begin our investigation with the total inclusive cross-section based on the standard choice µ 0 = m t and computed with two pdf sets: MSTW2008 [74] and NNPDF3.0 [75]. The total cross-section is computed with the help of the program Top++ [76]. Besides the LO, NLO and NNLO QCD corrections we also include soft-gluon resummation through NNLL accuracy where available (i.e. for the total cross-section computed with a fixed scale µ 0 ∼ m t ).
Two important observations can be made from fig. 1 and they turn out to be central for this work: first, the scale for which perturbative convergence is maximised is slightly above m t /2, i.e. that scale is significantly lower than the standard one µ 0 = m t . Second, the value of the fixed order NNLO cross-section evaluated at the scale of fastest convergence is only about 0.5% higher than the NNLO+NNLL resummed one evaluated at the usual scale µ 0 = m t , i.e. the two values essentially agree (recall that 0.5% difference is only a small fraction of the scale uncertainty of the resummed result).
The numerical agreement between the fixed order result evaluated at a lower scale and the usual resummed result is significant. First, in practical terms, such an agreement allows the use of fixed order results without the need to worry about the numerical impact of soft-gluon resummation. The fact that the fixed order result at a smaller scale is larger than the standard resummed prediction (albeit by a tiny amount) is also consistent with what one might expect about yet uncalculated higher-order effects based purely on the behaviour of the known LO, NLO and NNLO corrections to top-pair production, as well as soft-gluon resummation, where one observes reasonably fast convergence of so-far always positive higher order corrections.
Perhaps not surprisingly, given the large uncertainty at LO (as evident from its large slope and from the difference between the two pdf sets), the LO correction is not a reliable input to the above analysis. The difference between the two pdf sets decreases fast with higher orders and is completely negligible at NNLO and at NNLO+NNLL. It thus appears that the point of fastest convergence is not very different for the two pdf sets and the values of the NNLO cross-section one derives from the two pdf sets are within less than 1% from each other. We also notice that for scales smaller than the one of fastest convergence the hierarchy of perturbative corrections gets completely inverted, i.e. the LO is largest and the inclusion of higher orders decreases the total cross-section.
With this observation in mind it is interesting to contrast our findings based on the principle of fastest convergence with the principle of minimal sensitivity which has often been invoked in the past. Had we followed the latter principle we would have found an NLO correction which is very large compared to the standard NNLO resummed result. The minimal sensitivity scale for which the NLO curve plateaus is particularly low, around m t /4. Furthermore, we notice a significant shift when going from NLO to NNLO both in terms of the minimal sensitivity scale and in terms of the values the cross-section takes at these two scales.
The picture emerging from fig. 1 has a direct analogue in inclusive Higgs production at the LHC. Following the recent work [77] on inclusive Higgs production in NNNLO QCD we observe an almost one-to-one correspondence between the top inclusive cross-section at order N n LO and the total Higgs cross-section at order N n+1 LO for n = 0, 1, 2 as a function of the scale µ. Importantly, the analogy extends also to the resummed NNLO cross-section, especially the rise of the NNLL resummed cross-section for larger values of µ. We have checked, but do not show it in fig. 1, that the inclusion of soft gluon resummation with lower logarithmic accuracy (NLL and LL) does not lead to such a rise for larger values of µ. Similar behaviour is seen also in the case of the Higgs cross-section. From this comparison we can conclude that both inclusive top-pair and Higgs production cross-sections exhibit fastest perturbative convergence at scales lower than the usual ones: m t for top production and m h /2 for Higgs production (note that in both cases these scales are half the mass of the Born-level final state). On the other hand, the fast rise of the resummed cross-section at larger values of µ indicates that the perturbative series is not converging well there and therefore such large scales should be avoided.
In the following we verify the above conclusion by considering the full set of scales (3.1-3.7). We consider the LO, NLO and NNLO cross-sections but no soft-gluon resummation.
We first study the most natural choice for a dynamic scale in inclusive top production, namely, µ 0 = H T /2. In fig. 2 we present the µ = µ F = µ R dependence of the total cross-section evaluated with this scale. We observe that the behaviour of the cross-section as a function of the scale µ is rather similar to the one with a fixed scale. The only noticeable difference between the two figures is the shift towards smaller scales, i.e. while the scale of fastest convergence was slightly above 1/2 of the nominal value (m t in that case) now it is almost exactly at 1/2 of the nominal value H T /2. Moreover, the value of the NNLO cross-section at such a scale is only 0.5% larger than the resummed NNLO+NNLL cross-section at scale m t , for both pdf sets studied here. From this we conclude that the optimal choice for a dynamic scale, and one that reproduces well the known total cross-section, is:
$$\mu_0 = \frac{H_T}{4}\,. \tag{3.8}$$
The fact that the optimal value of the dynamic scale is slightly below the value for the fixed scale is easy to understand. At low p T,t , which is the region that generates the bulk of the total cross-section, the scale in eq. (3.8) behaves as m t /2 + O(p 2 T,t ). Upon integration over p T,t the terms O(p 2 T,t ) generate an additional contribution which effectively increases the value of the scale or, in other words, an effective static scale has a value larger than m t /2 due to the running scale effects. In this sense we view the scale m t not as the "best" scale at which to evaluate the total cross-section, but as the best average value of the running scale which reproduces the total cross-section. The value for the fastest convergence scale of about 0.7m t observed in fig. 1 is consistent with this observation.
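The low-p T behaviour quoted above can be made explicit with a short expansion. The lines below are a sketch under two assumptions that are not spelled out at this point of the text: H T is taken to be the sum of the top and antitop transverse masses, and Born-level kinematics is used, so that both tops carry the same transverse momentum p T .

```latex
\begin{align}
\frac{H_T}{4}
  &= \frac{1}{4}\left(\sqrt{m_t^2 + p_{T,t}^2} + \sqrt{m_t^2 + p_{T,\bar t}^2}\right)
   \;\xrightarrow{\;p_{T,t} = p_{T,\bar t} = p_T\;}\;
   \frac{1}{2}\sqrt{m_t^2 + p_T^2} \\
  &= \frac{m_t}{2}\left(1 + \frac{p_T^2}{2 m_t^2} + \mathcal{O}\!\left(\frac{p_T^4}{m_t^4}\right)\right)
   = \frac{m_t}{2} + \frac{p_T^2}{4 m_t} + \mathcal{O}\!\left(\frac{p_T^4}{m_t^3}\right).
\end{align}
```

The correction term is positive for any non-zero p T , which is why averaging the running scale over the p T spectrum gives an effective static scale above m t /2.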
There are several alternative definitions of the scale H T that have been considered in the literature. One of them is eq. (3.5) which we denote as E T ; it differs from H T by taking the geometric as opposed to arithmetic average of the t and t̄ transverse masses. From fig. 3 (left) we conclude that the numerical difference between the two scales is immaterial. Another alternative definition (3.4), denoted here as H ′ T , involves the sum of the transverse masses of all final state partons. In fig. 3 (right) we see that the behaviour of this scale is very different from H T , especially at NNLO. Indeed, the NLO and NNLO curves do not even cross and the NNLO curve has monotonic behaviour over the whole interval 1/8 ≤ µ/µ 0 ≤ 8. We have not studied this peculiar behaviour in depth but point out that such a scale is much more sensitive to singular emissions (real and virtual). For this reason, a definition that relies on clustering the emitted partons into jets may alleviate such behaviour. Anticipating our findings for the scale µ 0 in differential distributions, in this work we find strong support for the idea that a good dynamical scale should, among other things, resemble as much as possible the Born-level observable for the process of interest. It seems to us this conclusion may also have implications for processes outside top physics, or at a minimum, may warrant similar investigations in other processes.
To summarise our discussion of scale-setting for the total cross-section, in fig. 4 we compare all scales used so far in NNLO QCD (and NNLO+NNLL where available) and for both pdf sets. From this figure it is easy to see that at this order of perturbation theory the predictions are rather stable with respect to the choice of pdf set (at least for the pdf sets we have studied) and that the choice of a scale ensuring fastest convergence is rather clear-cut. Moreover, such a scale returns a value for σ tot which is in nearly perfect agreement with the so-far default value for σ tot evaluated with NNLO+NNLL at the scale µ = m t . From this figure it is also evident that for the fastest convergence scale eq. (3.8), the scale behaviour of the total cross-section is very regular and monotonic around the value µ/µ 0 = 1/2.

Differential distributions
In determining the functional form of the scale µ 0 one is constrained by the following limiting cases: at p T → 0 we have µ 0 ≈ c 0 m t , while for very large p T we have µ 0 ≈ c ∞ p T . The two constants c 0 and c ∞ are a priori unknown as is the scale's functional form that interpolates between these two limits. The limit p T → 0 is, however, strongly correlated with the total cross-section. We will thus use the scale derived in section 3.1 in the context of the total inclusive cross-section, to fix the constant c 0 . From eq. (3.8) we have c 0 = 1/2.
The scale µ 0 = H T /4 (3.8) implies that c ∞ = 1/2. One may wonder, however, if the constants c ∞ and c 0 should necessarily be equal. Indeed, the typical value used in the past for the former constant is c ∞ = 1. Since σ tot is not sensitive to the large-p T,t limit, one will need to investigate differential distributions and we turn to them in the following.
We would like to stress that since the limit of large p T has not yet been experimentally constrained, in this study we cannot rely on data. For this reason, our only guiding principle will be the principle of fastest perturbative convergence. As it turns out, this principle is actually quite powerful and a quite clear picture of a "good" scale emerges from our analysis. We will allow for scales with different large-p T behaviour and will nevertheless conclude that the best scale is µ 0 = H T /4. We will also find that for the p T,t distribution (as well as for the p T,t/t of the average top/antitop) the best scale will be not H T /4 but µ 0 = m T /2 as defined in eq. (3.2). Both scales H T /4 and m T /2 have the same asymptotic behaviour in the limits p T,t → 0 and p T,t → ∞. We thus arrive at the following "best" scale:
$$\mu_0 = \begin{cases} \dfrac{m_T}{2} & \text{for: } p_{T,t},\ p_{T,\bar t} \text{ and } p_{T,t/\bar t}\,, \\[2mm] \dfrac{H_T}{4} & \text{for: all other distributions}\,. \end{cases} \tag{3.9}$$
Eq. (3.9) above is the main result of this work. In the following we present its justification by way of analysing differential distributions. We also compare three different pdf sets: NNPDF3.0 [75], CT14 [78] and MMHT2014 [79].
In fig. 5 we compare predictions for p T,t/t computed with five different dynamic scales: m T /2, m T , H T /4, H T,int /2 and m tt /4. We observe that the scale m T /2 consistently leads to K-factors that are closest to unity, i.e. it fits best the requirement for fastest perturbative convergence in the full kinematic range. A K-factor between orders a and b, a ≥ b, is defined as the ratio of the prediction at order a to the prediction at order b. We also notice that the scale m T /2 leads to a cross-section with the smallest scale variation. It is worth noting that the difference between the central values for the NNLO p T distribution based on the scales m T /2 and H T /4 never exceeds 2% for p T,t/t < 1 TeV, i.e. the effect of the scale choice at NNLO is rather limited.
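The comparison of scales by how close their K-factors are to unity can be phrased as a small numerical recipe. The following is a minimal sketch, assuming the K-factor is the bin-by-bin ratio of the higher-order to the lower-order prediction; the scale labels and all numbers are hypothetical toy values, not results of this work.

```python
def k_factor(higher_order, lower_order):
    """Bin-by-bin ratio of the higher-order to the lower-order prediction."""
    return [h / l for h, l in zip(higher_order, lower_order)]


def max_distance_from_unity(k):
    """Largest deviation of a K-factor from 1 over all bins."""
    return max(abs(x - 1.0) for x in k)


# Hypothetical toy predictions for two candidate scales (arbitrary units).
nlo = {"scale_A": [100.0, 10.0], "scale_B": [80.0, 8.0]}
nnlo = {"scale_A": [105.0, 10.4], "scale_B": [100.0, 10.4]}

distances = {
    name: max_distance_from_unity(k_factor(nnlo[name], nlo[name])) for name in nlo
}
preferred = min(distances, key=distances.get)
print(distances, preferred)
```

Taking the maximum over bins is one possible figure of merit; an average over bins or a restriction to a kinematic sub-range would be equally legitimate choices.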
Similarly, in fig. 6 we compare predictions for m tt also computed with five different dynamic scales: H T /4, H T /2, H T,int /2, m tt /2 and m tt /4. We observe that the scale H T /4 consistently leads to K-factors that are closest to unity, i.e. it fits best the requirement for fastest perturbative convergence. We also notice that this scale leads to a cross-section with the smallest scale variation.
The comparison in fig. 6 demonstrates that m tt -based scales lead to poor perturbative convergence. Even for an m tt -based scale that is as small as m tt /4 the deviation between the absolute predictions is large and exceeds the size of the scale error. Such scales have been used in the past [38,39] as well as recently in the resummation-based work [40,41]. Our findings seem to indicate that the large corrections found in refs. [40,41] are actually due to the particular scale choice. It will be interesting to check if a different scale choice (like, for example, H T /4) will lead to much smaller resummation corrections.

Pdf related issues
A major concern in a scale study like ours is if the conclusions drawn above apply independently of the pdf set. In figs. 7,8 we show the unnormalised p T,t/t and m tt differential distributions based on the following three pdf sets: NNPDF 3.0, CT14 and MMHT2014. To facilitate the comparison between the three predictions, we also show the ratios of both unnormalised (figs. 7,8) and normalised ( fig. 9) distributions with respect to NNPDF3.0.
It is immediately clear that the differential distributions are significantly impacted by the choice of pdf. Furthermore, the K-factors of these three sets behave very differently. In the following we will show that these differences are due to the pdf sets themselves and are not related to the choice of dynamic scale. To that end in fig. 10 we show the p T,t/t and m tt distributions always computed with an NNLO pdf set while varying the order of the perturbative cross-section (from LO to NNLO). The rationale for doing this is that in a ratio where the same pdf is used both in numerator and denominator, the dependence on the pdf is reduced or even completely drops out, i.e. the ratio is effectively dependent only on the partonic cross-sections. Similarly, in a ratio where the same partonic cross-sections are used in both the numerator and denominator (but different pdf's) the dependence on the partonic cross-section is effectively removed and the ratio becomes a function of the pdf's only. In fig. 10 we observe that such cancellations indeed take place: the top three plots show near-independence with respect to the choice of the perturbative cross-section (from LO through NNLO) while the bottom two plots show the near-independence of K-factors with respect to the choice of pdf set. Fig. 10 thus confirms that the large differences between differential distributions and K-factors apparent from figs. 7,8,9 are of pdf origin.
To further demonstrate this, in fig. 11 we show the gg-luminosities for the three pdf sets (footnote 7). We notice that above around 1 TeV the NLO and NNLO luminosities of the MMHT2014 set are incompatible with each other within the pdf error. At any rate it is evident that the growing pdf error plays a major role and that the predicted differential distributions at large values of p T,t/t and m tt are likely impacted by significant uncertainty due to the imperfect knowledge of pdf. It is clear that with the large amount of top data expected during Run II of the LHC, top-quark data has very strong potential for constraining pdfs. In this work we only highlight this problem and verify that the pdf uncertainty does not affect our optimal scale-choice. Detailed analysis of pdf and how they can be improved with top data should be the subject of a dedicated study.
Finally, before closing this section, we present another proof that the conclusion derived in section 3 regarding the choice of "best" scale µ 0 is not impacted by the choice of pdf. To that end, in figs. 12,13 we show plots analogous to the ones in figs. 5,6 but with all curves evaluated with the same NNLO pdf set (i.e. LO, NLO and NNLO partonic cross-sections are all convoluted with the same NNLO pdf). Based on the conclusions above, the K-factors for each scale should be pdf independent. We notice that all K-factors are very similar to the ones in figs. 5,6 and most importantly, the K-factors for the "best" scale choices eq. (3.9) are consistently the smallest ones, and the ones closest to unity, among all dynamic scales considered by us.
Footnote 7: The plots in fig. 11 are prepared with the help of the APFEL library [80]; we thank Juan Rojo for kindly providing us with these plots.

Phenomenological applications
As stated in the introduction, the ultimate goal of seeking a robust dynamic scale for top-pair production is to describe top production in the broadest kinematic ranges that will be accessible at the LHC. Indeed, as shown in the previous sections, the "best" scales from eq. (3.9) satisfy all our criteria for a "good" dynamic scale. In this work we calculate the NNLO QCD corrections to all stable top quark observables that have so far been measured at the LHC. We have predictions for LHC at 8 TeV and 13 TeV. Specifically, we compute the following distributions: p T,t/t , y t/t , m tt , p T,tt , y tt , at LO, NLO and NNLO QCD and with three different pdf sets: NNPDF3.0, MMHT2014 and CT14. All results are available for download in electronic format with the arXiv submission of this paper. For this reason, and due to the very large number of distributions, we do not specify here the bins and ranges of the various distributions. We would only like to remark that in order to achieve high-quality multi-TeV predictions (for example, our 13 TeV prediction for p T,t/t extends to 3 TeV while the one for m tt up to 6 TeV) we have taken special care to populate with a sufficient number of events the tails of distributions that span many orders of magnitude. In doing so we have used the narrowest bins possible that allow us to keep the Monte Carlo integration error within about 1% in almost all bins. The bins chosen do not correspond to a particular experimental analysis. They are, however, narrow enough so they might be combined to fit the usually much wider experimental bins. Another option is to fit the binned distribution with a smooth curve and then rebin that fit to any desired binning. The high quality of our result, paired with its extended range and narrow bins, should make these results useful for any future LHC experimental or theoretical analysis.
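Combining narrow bins into wider experimental bins, as suggested above, amounts to a width-weighted average. The sketch below assumes that each bin stores a differential cross-section (per unit of the observable) averaged over the bin, and that every wide-bin edge coincides with one of the narrow-bin edges; the function name and the numbers are hypothetical. If the files instead store cross-sections integrated over each bin, the merged value is simply the sum.

```python
def rebin(edges, values, new_edges):
    """Merge adjacent fine bins into coarser ones.

    `values[i]` is assumed to be the bin-averaged differential cross-section
    between edges[i] and edges[i + 1]. Every entry of `new_edges` must
    coincide with one of the fine edges.
    """
    if any(e not in edges for e in new_edges):
        raise ValueError("each new edge must coincide with an existing edge")
    merged = []
    for lo, hi in zip(new_edges[:-1], new_edges[1:]):
        # Integrate the fine bins that lie inside [lo, hi] ...
        integral = sum(
            v * (b - a)
            for a, b, v in zip(edges[:-1], edges[1:], values)
            if lo <= a and b <= hi
        )
        # ... and convert back to a bin-averaged differential value.
        merged.append(integral / (hi - lo))
    return merged


# Hypothetical toy distribution: three fine bins merged into two wide ones.
fine_edges = [0.0, 50.0, 100.0, 200.0]
fine_values = [4.0, 2.0, 1.0]
wide_values = rebin(fine_edges, fine_values, [0.0, 100.0, 200.0])
print(wide_values)
```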
In order to allow for the calculation of differential distributions that are normalised over any sub-range of the maximal ranges computed in this work, we make available the results for all seven µ F,R scale combinations. To obtain scale variations in absolutely normalised distributions one has to simply find the min/max in each bin. For the normalised distributions, one has to first normalise each one of the seven curves within the desired range and then search for the min/max value in every bin.
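The two prescriptions just described can be illustrated with a short sketch. The helper names and the list-based data layout are hypothetical, chosen only for this example; each curve is assumed to hold bin-averaged values of dσ/dX for one µ F,R combination on a common set of bin edges.

```python
def bin_widths(edges):
    """Widths of the bins defined by N+1 increasing edges."""
    return [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]


def envelope(curves):
    """Bin-by-bin (min, max) over a list of scale-varied curves."""
    return [(min(column), max(column)) for column in zip(*curves)]


def normalise(edges, values, first, last):
    """Return bins first..last-1 rescaled to unit integral over that sub-range."""
    widths = bin_widths(edges)
    norm = sum(values[k] * widths[k] for k in range(first, last))
    return [values[k] / norm for k in range(first, last)]


def normalised_envelope(edges, curves, first, last):
    """Normalise every scale-varied curve separately, then take the min/max per bin."""
    return envelope([normalise(edges, curve, first, last) for curve in curves])
```

For absolutely normalised distributions one would call `envelope` directly on the seven curves. For normalised ones the order of operations matters: each curve is normalised over the chosen sub-range before the min/max is taken, which is why all seven curves are needed rather than only a precomputed band.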
In the following we show some representative results for LHC 13 TeV. In fig. 14 we plot the p T,t/t and m tt distributions with absolute normalisation. Both are computed with NNPDF3.0 and with the optimal dynamic scales of eq. (3.9). The distributions have behaviour similar to the case of 8 TeV shown in figs. 7,8. The quality of the computation is high, with the aim of having Monte Carlo error typically within 1% in each bin.
The scale variation for the p T,t/t distribution is such that the central value is typically contained within the lower order scale variation band. At 8 TeV this is the case in the full kinematic range. At 13 TeV the NLO central scale is outside the LO error band in the interval 250 GeV−1000 GeV; the NNLO central value is, however, well within the NLO scale variation in this range. For very large p T both the NLO and NNLO central values at 13 TeV are outside the lower order scale bands. In this regard it is worth pointing out that the scale variation of the NLO correction, unlike the LO and NNLO ones, seems to be accidentally small at large p T and this may be the reason for such a behaviour. Furthermore, the resummation of collinear logs ∼ ln(p T /m t ) may also be playing a role in this kinematic range.
The m tt distribution at 13 TeV is rather well-behaved, similarly to the case of 8 TeV. Above m tt ≈ 3.5 TeV the NNLO correction tends to be outside the NLO scale variation range. This effect is comparable in size to the scale variation and so is not too significant. It would be interesting to revisit this upon supplementing the fixed order calculations with threshold and collinear resummation. The NNLO K-factor is rather mild for low m tt , although not as flat as it is for a fixed scale (see ref. [1]). The characteristic rise at absolute threshold noted in ref. [1] is also clearly visible.
In fig. 15 we show the absolutely normalised y t/t and y tt distributions. Both are computed with NNPDF3.0 and with the optimal dynamic scale of eq. (3.9). We notice good perturbative convergence as well as the tendency for the NLO and NNLO results to be within the scale error bands of the lower orders for both distributions. The MC errors are very small and the calculations of both spectra are of very high quality. In view of the importance of the y tt distribution for fits of parton distribution functions, in fig. 16 we show this distribution computed with all three pdf sets considered in this work. For both the unnormalised and normalised distributions we show the ratios with respect to the central value computed with NNPDF3.0. A large spread among the various pdf sets is evident. It is particularly significant in the normalised y tt distribution, where the difference due to different pdfs is on par with the scale error. Clearly, the y tt distribution suffers from significant pdf error and could, in turn, be used as a strong constraint on pdfs from high-precision LHC data.
We conclude this section with the following two comments. First, in this work we have not computed the pdf errors for any pdf set. As we concluded in the previous sections, however, pdf related uncertainties become the dominant source of error long before one reaches the end points of the computed ranges. To gain insight into the size of the pdf error we have compared predictions based on three pdf sets. It appears that at present the limiting factor in doing TeV analyses is the knowledge of pdfs. For this reason the results of the present work should be used with some care. Future progress in precision will critically depend on the availability of improved pdf sets. In order to facilitate the use of our calculations with any future pdf set, we will release in the near future our results also as tables in the fastNLO library format [81,82]. Second, we would like to emphasise that besides pdf errors, the results we present here will also be affected by the resummation of collinear logs and possibly by EW effects. Those contributions will require dedicated future studies. In any case the NNLO QCD result computed in this work offers the base for such future additions.

Conclusions
The main result of this work is the extension of the recently computed NNLO QCD differential distributions for stable top quark pair production at the LHC beyond the small p T /m tt regime studied so far at LHC Run I. The results derived here make it possible to describe stable top quark production into the multi-TeV regime which will be explored in detail during LHC Run II. We have presented high-quality predictions for most top-quark distributions for both LHC 8 TeV and 13 TeV. Our results are in the form of binned distributions and are computed with three different pdf sets. All results are available for download in electronic form with the arXiv submission of this work. The relatively small bin sizes for our results, coupled with their small Monte Carlo errors, would allow one to easily produce high-quality analytic fits to all distributions. We expect that such fits could subsequently be used for further rebinning to a different bin size, at the expense of tolerable errors. This way our results could be extended to accommodate diverse bin configurations; in order to also allow for a (fast) change of parton distribution sets we will release in the near future our results as fastNLO library tables.
At the technical level, the new ingredient that makes it possible to extend our previous NNLO QCD results to the widest ranges achievable at the LHC is a new dynamic renormalisation and factorisation scale µ 0 . We derive such a scale based on the principle of fastest perturbative convergence, i.e. we require the scale to be such that, both at NLO and NNLO, it introduces the smallest possible K-factors across the full kinematic range. Since the small p T behaviour of such a scale is strongly correlated with the well-understood total top-pair cross-section, we also find it desirable to have good numerical agreement with the value of the NNLO+NNLL cross-section.
The following scales satisfy our requirements best: µ 0 = m T /2 to be used for the description of the p T distribution of top/antitop quarks and µ 0 = H T /4 for all other distributions. These functional forms, along with other functional forms that we found to be less suitable, have been used in the past in NLO QCD calculations; the main new feature we uncover is that the scale µ 0 needs to be a factor of 2 smaller compared to the typical form in past studies. We demonstrate that such functional forms for µ 0 lead to fast perturbative convergence and small-to-moderate scale errors, and return an NNLO total cross-section that differs from the NNLO+NNLL σ tot (m t ) value at the sub-percent level.
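As a purely illustrative sketch, the two scale choices can be evaluated event by event as shown below. Eq. (3.9) is not reproduced in this section, so the definitions used here are assumptions: m T is taken to be the transverse mass sqrt(m t ^2 + p T ^2) of the top (or antitop) quark, and H T the sum of the top and antitop transverse masses. The function names are hypothetical, and the list of seven (µ R , µ F ) multipliers is the conventional seven-point set, assumed here rather than taken from the text.

```python
import math


def transverse_mass(m_t, p_t):
    """m_T = sqrt(m_t^2 + p_T^2); all quantities in GeV."""
    return math.hypot(m_t, p_t)


def mu0_pt_distribution(m_t, p_t):
    """Central scale mu_0 = m_T / 2, used for the top/antitop p_T distribution."""
    return transverse_mass(m_t, p_t) / 2.0


def mu0_other_distributions(m_t, p_t_top, p_t_antitop):
    """Central scale mu_0 = H_T / 4, with H_T assumed to be m_T(top) + m_T(antitop)."""
    h_t = transverse_mass(m_t, p_t_top) + transverse_mass(m_t, p_t_antitop)
    return h_t / 4.0


# Assumed conventional seven-point set of (mu_R / mu_0, mu_F / mu_0) multipliers:
# independent variation by factors of 2, excluding the two opposite extremes.
SEVEN_POINT = [
    (1.0, 1.0), (2.0, 2.0), (0.5, 0.5),
    (2.0, 1.0), (1.0, 2.0), (0.5, 1.0), (1.0, 0.5),
]


def scale_choices(mu0):
    """Return the seven (mu_R, mu_F) pairs built around the central scale mu0."""
    return [(k_r * mu0, k_f * mu0) for k_r, k_f in SEVEN_POINT]
```

With these definitions the two choices coincide whenever the top and antitop have equal transverse momenta, as in LO kinematics, since then H T = 2 m T and H T /4 = m T /2.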
A convincing derivation of a "good" dynamic scale is possible because of the full control over both NLO and NNLO corrections. Furthermore the reduced error of the NNLO-accurate cross-section makes it much easier to distinguish between various dynamic scale candidates. For example, we find that m tt -based dynamic scales are disfavoured, a result which may have implications in matching the NNLO results with NNLL soft-gluon resummation. We have also noted that the behaviour of the total tt cross-section through NNLO+NNLL is very similar to the Higgs production cross-section through resummed N 3 LO.
We estimate that the error due to missing higher orders is typically within 5%, at least for kinematic ranges of current phenomenological interest. Such a typical-size estimate, however, should only be used as a rough guide for the scale error of differential distributions in NNLO QCD, and one should keep in mind that the actual error varies across kinematic ranges and across distributions. Specifically, the top/antitop and top-pair rapidities seem to be under very good control in the full kinematic ranges considered here. The p T,t/t distribution also seems to be reliably predicted for p T,t/t as large as 2 TeV. The scale variation of the m tt distribution is within 5% for masses of up to 2 TeV, but increases steadily towards larger masses. For example, for m tt = 4 TeV, the scale error is as large as 10%. Moreover, the overlap between various perturbative orders is not as good for very large p T,t/t and m tt . Very importantly, by comparing predictions with three different pdf sets, we show that for p T and m tt that are just into the TeV range, as well as for medium and large values of y tt , the uncertainty due to the imperfect knowledge of pdfs quickly becomes the dominant source of error. Therefore, our results should be used with care over extended ranges with current pdf sets and one should be mindful of the implied pdf error (which is not plotted in any of the figures or included in the supplied electronic files). In fact, it seems to us that truly precise top-quark predictions in the TeV range will only be possible once a new generation of pdf sets becomes available, and it seems likely that such pdf sets will utilise, to some degree, LHC top quark data.
We should also emphasise that the direct phenomenological relevance of our results in the TeV range is additionally subject to the following so-far unaccounted effects: resummation of large collinear logs ln^n (p T /m t ), fixed-versus-variable flavour number scheme ambiguity for top production, as well as inclusion of EW corrections. The range of phenomenological relevance for these effects, however, has yet to be carefully investigated.
In conclusion, we mention a number of other lessons that can be drawn from our work. First, our approach to finding an appropriate dynamical scale is quite generic and it may benefit other LHC processes that are now, or will soon be, known at NNLO. In particular, we notice that our best scales have a feature that may well be process-independent: they tend to reflect the observable already encoded in the LO kinematics. Second, in the past [83,84] the use of the MS scheme for the top-quark mass has been advocated for, among other reasons, improved convergence of the perturbative series. Our work shows that in order to achieve good convergence no special choice for m t is needed. Third, our experience shows that the principle of fastest perturbative convergence works quite well. This may be contrasted, for example, with the principle of minimal sensitivity that has been used in the past in the context of NLO studies.