An extensive survey of the estimation of uncertainties from missing higher orders in perturbative calculations

We consider two approaches to estimate and characterise the theoretical uncertainties stemming from the missing higher orders in perturbative calculations in Quantum Chromodynamics: the traditional one based on renormalisation and factorisation scale variation, and the Bayesian framework proposed by Cacciari and Houdeau. We estimate uncertainties with these two methods for a comprehensive set of more than thirty different observables computed in perturbative Quantum Chromodynamics, and we discuss their performance in properly estimating the size of the higher order terms that are known. We find that scale variation with the conventional choice of varying scales within a factor of two of a central scale gives uncertainty intervals that tend to be somewhat too small to be interpretable as 68% confidence-level-heuristic ones. We propose a modified version of the Bayesian approach of Cacciari and Houdeau which performs well for non-hadronic observables and, after an appropriate choice of the relevant expansion parameter for the perturbative series, for hadronic ones too.


Introduction
Precision phenomenology of the kind aimed for by the Large Hadron Collider (LHC) physics program requires accurate and reliable theoretical predictions to be compared to an ever increasing range of high precision experimental measurements. Once theoretical and experimental uncertainties become of comparable size, it is crucial to be able to characterise quantitatively the relevance of missing higher order terms in perturbative calculations.
In Quantum Chromodynamics (QCD), which we take as a model here given its central role in LHC physics, theoretical uncertainties stemming from missing higher orders in the perturbative series are usually estimated by varying the unphysical renormalisation and factorisation scales that appear in the cross-sections and decay rates calculations. This approach has served the QCD community well for more than thirty years, and can still be regarded as the most effective way to quickly estimate the missing higher order uncertainties (MHOUs). It suffers, however, from some drawbacks. Chiefly among them the fact that its uncertainty intervals cannot be characterised in a statistically meaningful way and therefore cannot be combined easily with, e.g., likelihood profiles for other uncertainties, for instance of experimental origin.
One of us (MC) and N. Houdeau tried in [1] to overcome this limitation by proposing to estimate MHOUs in a Bayesian context, so as to obtain a statistically meaningful posterior distribution for the probability density profile of the uncertainty interval. The Cacciari-Houdeau approach led to a model (henceforth CH) that relies on simple priors that, at their core, partly mimic assumptions that are anyway implicitly made when one employs the scale-variation method. We refer to [1] for a more detailed description of the CH approach and its underlying Bayesian character, and e.g. to [2,3,4] for some examples of applications of its results. In a context of estimation of MHOUs, we also point out the different but possibly complementary approach of [5] that focuses on a mathematically motivated approximate completion of a perturbative series.
The purpose of this paper is twofold. On the one hand, we revisit the Bayesian CH model, and propose a modified version (which we will denote CH) which will trade some of the simplicity of the original CH model for a better adaptability to a broader class of observables, namely those related to processes with hadrons in the initial state. On the other hand, we study the results of both the scalevariation and the CH model on a large number of perturbatively calculated observables, so as to be able to assess their performance in a (frequentist) statistically meaningful way. For the scale-variation approach, this means that we can attempt to characterise a posteriori its uncertainty intervals in terms of some confidence level that they correctly describe the MHOUs. For the CH model, this study allows us to either assess whether the Degree of Belief (DoB) associated to the uncertainty intervals is correct or, where needed, to estimate the appropriate expansion parameter of the perturbative series that ensures that this be the case.
The paper is structured as follows. Section 2 reviews the scale-variation approach and the Bayesian method introduced in [1], and describes the modifications to the CH model that lead to the formulation of the CH approach used in this paper. Section 3 describes the methodology that we have followed in our study of the performances of the scale-variation and the CH approaches, introduces the list of calculated observables used in the survey, and presents our results. Section 4 compares the results of the scalevariation and the CH method for the determination of MHOUs for some benchmark processes that we consider either particularly relevant for LHC phenomenology or simply quite iconic, namely e + e − → hadrons, Higgs decay to two gluons and to two photons, W and Z production in pp collisions, pp → tt and Higgs production in proton-proton collisions. A concluding section follows, while a few appendices collect technical details and the numerical values of the perturbative coefficients of the observables used in the survey and the benchmarking.

Estimations of theoretical uncertainties
In this Section we introduce and describe two different approaches to the estimation of the uncertainty stemming from the missing higher orders of a perturbatively calculated observable: • the scale-variation approach, which involves varying the unphysical renormalisation and factorisation scales that appear in higher order perturbative calculations within a given range around a chosen central value; • the Bayesian approach introduced by Cacciari and Houdeau in [1], with its modification discussed below.
In the following we review how these two approaches work, and also set the appropriate notations.

Uncertainty estimation by scale variation
The truncated perturbative expansion of an arbitrary observable O calculated up to a fixed order k as a power series expansion in α s , contains a residual, higher-order dependence on the renormalisation and/or factorisation scales, here collectively denoted by µ.
The standard approaches to estimate the MHOUs are all based on the idea of varying the scale(s) µ in an interval [Q/r, rQ], where r is an arbitrary factor often chosen to be equal to 2, and Q is a typical hard scale of the process. The values of the observable obtained at different scales are then used to derive an uncertainty interval. Different recipes can be used to implement this prescription. Writing this interval as [O − k , O + k ] around O k (not necessarily centred around it), the most common choices are: 1.
where we have defined Generalisation to the case of two or more scales is straightforward and follows along the same lines. The prescription which is probably most commonly used (see e.g. the QCD review in [6]) , and which we will also use in our study, is an extension of eq. (2.2), i.e. varying both the renormalisation and the factorisation scale (µ r and µ f ) as shown there, but with the additional constraint 1/r ≤ µ r /µ f ≤ r, to avoid the appearance of unnaturally large logarithms. 1 The main problem with the scale-variation approach is that it does not provide a probability distribution for the uncertainty interval, which therefore has no statistical meaning. It is also worth noting that the common choice r = 2 is merely a convention, and that the choice of the central scale around which to perform the variation is also largely arbitrary. In fact, in some cases this central scale is deliberately chosen away from the characteristic scale of the process to satisfy other criteria. This is for instance the case for Higgs production in gluon fusion, where the central scale is often chosen equal to m H /2 to mimic the result obtained when performing soft-gluon resummation [8], and because around this value the cross section shows reduced sensitivity to the scale choice and an improved convergence of the perturbative series [9].

The Cacciari-Houdeau Bayesian approach
The approach of Cacciari and Houdeau [1] is a Bayesian framework to evaluate MHOUs. It makes assumptions on the behaviour of the coefficients of a series of the form where the unphysical scales µ have been set to the central value Q, and we have implicitly defined α s ≡ α s (Q) and c n ≡ c n (Q, Q). These assumptions are encoded into specific Bayesian priors (detailed in [1]) and in the choice of the expansion parameter, taken here to be α s , and allow one to determine an uncertainty density profile (the posterior of the model) in the form of a conditional probability density for the remainder of the series 2 , ∆ k ≡ ∞ n=k+1 α n s c n , given the known perturbative coefficients, {c l , . . . , c k }. Assuming that the dominant contribution to the remainder comes from the first unknown order, i.e. ∆ k α k+1 s c k+1 , one can derive [1] a simple analytic expression for the conditional density, wherec (k) ≡ max(|c l |, · · · , |c k |) and n c is the number of known perturbative coefficients. From this expression, one can appreciate the characteristics of the posterior distribution for this model: a central plateau with power suppressed tails.
The existence of such a probability density distribution for the uncertainty interval represents the main difference with the scale-variation approach, which only gives an interval without a density profile.
Given the conditional density in eq. (2.8) it is possible to compute the smallest credibility interval for ∆ k with a degree of belief (DoB) equal to p% (i.e. such that ∆ k is expected with p% credibility to be contained within the interval [−d 2.3 The modified Cacciari-Houdeau approach (CH) The CH model described above relies on a specific form of the perturbative expansion, namely eq. (2.7). As a result, its estimate for the uncertainty is not invariant under a rescaling of the expansion parameter from α s to α s /λ. While working on this project we made a number of attempts to reformulate the model in a rescaling-invariant way. Ultimately, none of them turned out to be satisfactory, to the extent that each required formulating priors much too informative, which shaped excessively the final posterior. We eventually settled instead on a slightly modified version of the CH model. In this modified model, henceforth denoted as CH, we rewrite the perturbative expansion of eq. (2.7) in the form and submit the new coefficients b n to the same priors originally used for the c n in the CH model. This leads to the following expressions for the probability density profile for the remainder function ∆ k (2.12) and the credibility interval (2.13) The introduction of the (n − 1)! term in the expansion, which represents the main modification with respect to the original CH model, can be justified on the ground that such a factor is expected to appear in higher order perturbative calculations, e.g. those in the large-β 0 limit and in connection with renormalon contributions [10,11,12,13].
The optimal value for the rescaling factor λ can be determined empirically by observing how the model fares in predicting MHOUs for observables for which higher order perturbative computations are available. In Section 3 we will present such a determination of λ from a study based on a comprehensive set including more than thirty observables. This method of determining λ brings some frequentist contamination into the Bayesian approach. We consider this drawback acceptable at the present stage, but we note that one could in principle further improve the model by introducing an additional prior for the value of λ and thus avoid the frequentist contamination. The frequentist study on λ performed in this work can then perhaps be used as a guide for the formulation of such an additional prior.

Extension to hadronic observables
The original CH model was formulated focusing on observables in processes without hadrons in the initial state, and its extension to observables with initial state hadrons is potentially not straightforward. A generic hadronic observable (e.g. a total cross section) can be written as a convolution integral where L is the parton-parton luminosity, C n (Q) is the hard-scattering coefficient function, τ is an appropriate hadronic scaling variable, Q is the characteristic energy scale of the process and ⊗ denotes a generic convolution in the space of the hadronic scaling variables (not explicitly shown on the right hand side of the equation). The unphysical renormalisation and factorisation scales are taken to be equal to Q as in the non-hadronic case, and they are not explicitly shown. In eq. (2.14), the perturbative coefficient functions C n are usually distributions, and not simple numbers like the coefficients c n in the perturbative expansion of the non-hadronic observables. This means that it is not possible to directly apply the CH method described in Section 2.3 to hadronic observables. This problem can be overcome in two ways.

1.
A first approach is to express the hadronic observable as a series expansion whose coefficients include the convolution with the parton-parton luminosities, i.e. to rewrite eq. (2.14), in analogy with the non-hadronic case, in the form where we have defined We now denote the rescaling parameter with λ h , rather than λ, to stress the fact that its value is a priori potentially different from the one used in the case of non-hadronic observables. We then proceed like in the non-hadronic case, submitting the expansion coefficients H n to the same Bayesian priors used in the non-hadronic case.
This approach is based on the assumption that the contribution coming from the non-perturbative physics encoded in the parton-parton luminosity is roughly the same at each perturbative order, or more generally that its presence does not spoil the assumptions of the model. This approach has been adopted in some of the papers that have used the CH approach in its original formulation, e.g. [3,4].

A second approach is based on rewriting the observable in Mellin space, in the form
is the Mellin transform of the short-distance coefficient function C n , rescaled by the factor λ n h /(n− 1)! introduced in CH, and L(N + 1) is the Mellin transform of the parton-parton flux. We then observe that, if the Mellin inversion integral can be shown to be dominated by a single Mellin moment O k (N 0 , Q), one can simply apply the Bayesian priors of the CH approach to the shortdistance coefficients B n (N 0 , Q), which are ordinary numbers, and determine the uncertainty for the dominant moment series. This uncertainty can then be translated back to the uncertainty on the full result by an appropriate rescaling. This approach is viable because one can show that at least in some cases (see e.g. [14,15]) such a dominant Mellin moment exists and gives a good approximation to the full result.
The main limitation of this approach, which a priori would be preferred because it eliminates the possible contamination due to non-perturbative physics, is that it relies on the predominance of not only a single Mellin moment but also a single production channel (e.g. gluon-gluon fusion in Higgs production at the LHC) at all orders. If this is not the case, the need to reweigh the various dominant Mellin moments in the different parton channels will reintroduce contamination from non-perturbative physics. A second, practical, limitation is that perturbative results are rarely available in Mellin moment space from public codes, limiting the straightforward application of this method to very few cases. Because of these limitations we use the first approach, i.e. the convolution, as our main tool in this paper, but we also present in Appendix B two case studies for the Mellin-moment method.

Global survey
In this Section we assess the performance of the scale-variation procedure and of the CH approach by studying how well they estimate the MHOUs when applied to a wide set of observables. For every observable in the set we consider two quantities: 1. the size of the uncertainty predicted at a given perturbative order k by the approach under consideration; 2. the known perturbative result for the same observable at order k + 1.
For each of the methods we then determine its global success rate in predicting the missing higher order uncertainties at order k, defined as fraction of observables for which the result of the calculation at order k + 1 falls within the uncertainty interval predicted by the model for the order k computation.
In the case of the scale-variation method we study the behaviour of the global success rate as we vary the scaling factor r defined in Section 2.1. The observed success rate can then be used to assign an a posteriori heuristic confidence level (CL) to the uncertainty intervals obtained with a given value of r.
In the case of the CH Bayesian approach we repeat the analysis described above for various values of λ and, since we now have a probabilistic interpretation of the resulting uncertainty intervals, various Degrees of Belief (DoB). This allows us to determine the optimal value of λ to be used in CH, defined as the value of λ for which the model has a global success rate which is closest to the requested DoB, for every possible DoB.

Setup
We perform two separate analyses, one for observables in processes without hadrons in the initial state (non-hadronic observables) and one for observables in processes with hadrons in the initial state (hadronic observables).
The non-hadronic observables considered in our analysis are listed in Table 1. For each observable we show the leading order in α s , the maximum known order in α s and a reference to the original literature from which we have extracted the values of the perturbative coefficients. When the leading order contribution for these observables is entirely electroweak in nature, we do not include the first

Non-Hadronic observables Observable
Leading order in α s Highest known order in α s Reference Bjorken sum rule 3-jets Thrust 1 3 [22] 3-jets Heavy jet mass 1 3 3-jets Wide jet broadening 1 3 3-jets Total jet broadening H → gg 2 5 [25] H → γγ 0 2 [26]   coefficient c 0 in the analysis when using the CH approach, as was done in [1]. This is because we are interested in a perturbative expansion in terms of the strong coupling. The observables included in our hadronic analysis are listed in Table 2, where again we show the leading order in α s , the highest known order in α s and a reference to the code implementing the computation that we used to evaluate the perturbative coefficients. In this case, the leading order coefficient (i.e. the first one) is always retained for the analysis with the CH approach, independently of its perturbative order in the strong coupling 3 . In order to avoid biasing the analysis by using different parton distribution functions (PDFs) at different orders, we always use the same NNLO PDFs for all perturbative orders, with the exception of the scale-variation study shown in the right plot of Figure 3.2.
All the coefficients and the specific parameters for the calculations are given in Appendix A, in Tables 5 and 6. For all our analyses, we have used a private Mathematica code.

Scale Variation
In this Section, we study the performance of the standard scale-variation approach. An outcome of this analysis is the determination of a heuristic confidence level (CL) for the uncertainty intervals given by scale variation, as a function of the scaling factor r that sets the range over which the scales are varied, µ ∈ [Q/r, rQ].
In the non-hadronic case, we also compare two of the prescriptions given in Section 2.1, which are supposedly the most widely used ones: a) take the maximum and the minimum of the cross sections obtained with µ = rQ or µ = Q/r, as explained in eq. (2.2); b) take the maximum and the minimum while scanning the whole interval of scales between Q/r and rQ, as explained in eq. (2.3). Results for the first prescription (i.e. using only the extreme values) are given in the left plot of Figure 3.1. At LO, the heuristic CL of the scale-variation uncertainty intervals for the conventional r = 2 value is of the order of 50%, and it reaches a 68% level for r close to 4. For larger values of r, the CL stabilises around 80%. At NLO, the CL is still of the order of 50% at r = 2, but it increases more rapidly with r than at LO, and it is already around 68% for r 2.5 − 3. For higher values of r it stabilises around 80%. Results for the second scale-variation prescription (i.e. doing a full scan) are given in the right plot of Figure 3.1. While the LO results are identical to those of the first prescription, the NLO heuristic CLs are significantly larger for r ≥ 4, reaching 100% at r = 5. This is likely explained by the fact that, being the scale variation of an observable calculated to NLO accuracy usually non monotonic, a full scan can capture better its overall variation than the evaluation of two or three fixed points only.
We have also examined scale-variation uncertainties in the case of hadronic observables. Since hadronic cross sections depend on two scales, the factorisation and renormalisation scale, we vary them independently to obtain the scale-variation interval. As often done in literature, we do not perform a full scan (too computationally expensive) but rather evaluate the observables only at the centre and at the extremes of a scale range, avoiding combinations that generate large logarithms, as explained at the end of Section 2.1. Figure 3.2 shows the results of the analysis of the full set of hadronic observables. We have calculated the cross sections both using NNLO PDFs at each order (left plot) and using ordermatched PDFs (right plot), i.e. using LO PDFs for the LO computation, NLO ones at NLO, etc 4 . At each perturbative order, the two choices are equivalent up to higher order terms. In both cases, we see that, as common wisdom dictates, the LO scale-variation uncertainty fails to capture the size of the NLO correction. At NLO the two prescriptions differ qualitatively in their performance. When using always NNLO PDFs we can associate a 40% heuristic CL to the standard scale variation with r = 2. The 68% CL level is attained around r = 3, and the CL then stabilises around 90% CL for r ≥ 3.5. When using order-matched PDFs, on the other hand, we obtain very small heuristic CL (less than 30%) for r ≤ 3.
The CL reaches 68% for r just over 4 and then stabilises around 80% for larger values of r. These two analyses for hadronic observables suggest that in the scale-variation approach one may wish to use a rescaling factor r ∼ 3 − 4 in order to obtain a reasonably conservative uncertainty interval, with a heuristic CL at least as large as 68%.

The modified Cacciari-Houdeau model (CH)
For each of the sets of observables listed in Tables 1 and 2 we have performed an analysis of the performance of the CH model in estimating the MHOUs. In this case, a parameter of the model is the λ (or λ h factor) that defines the effective expansion parameter of the perturbative series as written in the model, see eq. (2.10) and eq. (2.15). As far as the size of the uncertainty intervals is concerned, the parameter λ (or λ h ) plays a role analogous to that of r in the scale-variation approach: the final result will depend on its value. However, since in the Bayesian model the widths of the uncertainty intervals are associated with properly defined credibility values, one can explicitly determine the optimal value for λ    by requiring that the model performs as expected, i.e. that the observed global success rate corresponds to the DoB of the uncertainty intervals used in the analysis. We first study the non-hadronic case. We show the results of this analysis graphically in Figure 3.3 in two different and complementary ways. Both analyses use observables calculated at perturbative orders ranging from LO to N 3 LO, for a total of 37 tests performed using the numerical coefficients given in Table 5 in Appendix A. The histogram in Figure 3.3 (left) is obtained by varying the DoB between 0.05 and 0.95 in steps of 0.01 (the uncertainty interval returned by CH varies of course accordingly). For each DoB value, we determine the λ value which gives the best agreement with the condition DoB = global success rate. The resulting λ values are plotted in a histogram. At LO the preferred values for λ can be seen to be between 0.6 and 0.9, while at NLO the histogram shows a preference for the range 0.9 -1.1. The plot in Figure 3.3 (right) shows instead how DoB and success rate compare for different values of λ in a global analysis of LO, NLO and NNLO observables. We see that for values of λ in the 0.9 -1.1 range the requested DoB agrees well with the observed success rate of the uncertainty prediction. 5 This is in agreement with the result that we obtain from the histogram analysis.

CH LO NLO
We perform the same analysis for the hadronic observables set using the coefficients given in Table 6 in Appendix A. Figure 3.4 (left) shows the histogram of the optimal values of λ h for the DoB scan made using the full set of hadronic observables. The histogram peaks around λ h 0.5 at NLO, which is smaller than the preferred λ value obtained from the analysis of non-hadronic observables. In Figure 3.4 (right) we plot the success rate as a function of the DoB of the CH intervals for various values of λ h for all hadronic observables at LO and NLO. We observe that the preferred value of λ h oscillates between 0.5 and 0.6 according to the requested DoB. Since we are mainly interested in determining 68% and 95% DoB intervals, we choose a value of λ h equal to 0.6 as our best estimate, since it appears to be the one for which the model performs better in this DoB range.
The results of the analyses presented in this section allow us to define the optimal values for the parameters in the CH model as follows: we use a parameter λ = 1 when considering non-hadronic observables, while we use λ h = 0.6 when considering hadronic observables. 6

Benchmark processes
In this Section, we compare the results obtained when computing MHOUs using either the CH or the scale-variation prescription for a set of benchmark processes that we consider interesting either because they provide an ideal testing ground for the CH method (e + e − → hadrons and the Higgs decay into two gluons) or are particularly relevant for LHC phenomenology (electroweak vector boson, top quark and Higgs production, Higgs decay into two photons).
We use the results obtained in the global survey (see Section 3) to fix the parameters of the models. We recall that, for the CH model, in the case of observables without initial-state hadrons the preferred value is λ = 1, while for observables involving initial-state hadrons it is λ h = 0.6.
For each process, we compare the uncertainty intervals obtained from the scale-variation procedure with r = 2 and r = 4 with the 68% and 95% DoB intervals obtained using CH. When analysing the CH results, we also consider the behaviour of the posterior density function for the remainder of the series ∆ k when increasing the perturbative order. We show how, in most cases, the inclusion of further information leads to a progressive narrowing of the distribution and a consequent reduction of the uncertainty.

Processes without hadrons in the initial state
We first consider three processes without hadrons in the initial state: the total cross section for the production of hadrons in e + e − collisions and the decay of a Standard Model Higgs boson into a pair of gluons or a pair of photons.
As discussed in [1], the total cross section for e + e − → hadrons is an ideal testing case for understanding the behaviour of the CH model, as perturbative coefficients up to order α 3 s are available in the literature. Their numerical values are listed in Table 5 in Appendix A. In Table 3(a), we summarise the results of our study, comparing the size of the 68% and 95% DoB intervals obtained with the CH method with the uncertainty interval of the scale-variation procedure for r = 2. A graphical representation of these intervals is shown in Figure 4.1. These results show that 68% DoB intervals from CH are always larger than scale-variation intervals for r = 2 and, especially at higher orders, agree better in size with those obtained using scale-variation with r = 4. In Figure 4.2 we plot the full posterior distribution for the remainder of the perturbative expansion, ∆ k ≡ ∞ n=k+1 , at each order k. We highlight the regions of ±0.2 in the determination of λ. 6 It may be tempting to speculate that the smaller value of λ in the hadronic case (and therefore a larger effective expansion parameter for the series) may be explained by the generally larger number of gluons involved in these processes, and therefore by an expansion parameter closer to αsCA than to αsCF , but we will refrain from doing so. Table 3: Results for the analysis of missing higher order uncertainties for benchmark processes without hadrons in the initial state. We quote the perturbative order k at which the observable is calculated, the central value for the theoretical prediction at that order, the MHOs uncertainty intervals computed using the CH model at 68% DoB and 95% DoB, and the uncertainty interval obtained using the scale variation (SV) procedure with r = 2.
(a) MHOUs for hadron production in e + e − collisions at the Z pole. The perturbative series of the observable at the order k is defined as R k (e + e − → Z → hadrons) = k n=0 α n s cn. R0 is normalised to 1. that contribute to the 68% and 95% DoB intervals and compare them to the r = 2 scale-variation intervals, showing how, in this case, the latter is always contained in the flat part of the Bayesian credibility distribution for ∆ k . At the LHC, Higgs decay rates constitute one of the most important processes which do not involve initial-state hadrons. Their precise knowledge is crucial for the extraction of the Higgs couplings to the other SM particles. For our study we consider two Higgs decay channels which present complementary characteristics with respect to our theoretical knowledge and to their phenomenological relevance. The first one is the Higgs decay into two gluons which, despite being not relevant for Higgs phenomenology at the LHC because of the large irreducible background due to QCD jets, is especially well suited as a test case for our Bayesian analysis. Indeed its perturbative QCD expansion is known up to N3LO, and QCD corrections are quite sizeable. Next we study the decay of a Higgs boson into two photons. In this case, QCD corrections are rather small and its perturbative expansion is only known up to NLO. On the other hand, due to its clean experimental signature, it is of great phenomenological importance and it is indeed one of the channels where signs of new physics beyond the Standard Model are expected to appear. Again, numerical values for the coefficients of the perturbative expansions of these observables are collected in Table 5 in Appendix A, while the results of our analysis are summarised in Tables 3(b)-3(c).
For the H → gg process, we plot the uncertainty intervals obtained using the scale-variation and the CH methods in Figure 4.3. We observe that the size of the 68% DoB intervals obtained with the CH model is, with the exception of the N3LO band, slightly bigger than the r = 2 and smaller than the r = 4 scale-variation intervals, coherently with what we have observed in the global survey. We note here that, possibly because of a large NLO K-factor, neither the r = 2 and r = 4 scale-variation interval nor the 68% DoB CH interval at LO contains the NLO result. Conversely at higher orders, where successive perturbative corrections decrease in size, the next order result is always included in both the 68% DoB and the r = 2 intervals. The posterior density distributions for ∆ k are plotted in Figure 4.4. We notice that also for this observable, the r = 2 scale-variation interval is always contained in the central plateau. In addition, we observe a progressive narrowing of distributions with increasing perturbative order.
Finally we consider Higgs decay into two photons. In this case predictions at NLO are included within the LO uncertainty intervals determined using both the CH and the scale-variation (r = 2) methods, as can be seen in Figure 4.5. On the other hand, we notice that in this case the 68% DoB intervals are comparable in size with the intervals obtained from scale variation with r = 4. This suggests that theoretical uncertainties of the Higgs decay into two photon process determined with the scale-variation procedure using r = 2 may be underestimated if one attempts to assign them a 68% or larger heuristic CL.

Processes with hadrons in the initial state
We now consider a number of processes with hadrons in the initial state for which theoretical predictions are available at least up to NNLO, namely the production in proton-proton collisions of Z and W bosons, top-antitop pairs, and Higgs. These processes are either considered to be standard candles at hadronic colliders or are particularly relevant for LHC phenomenology. They provide precision tests of the Standard Model, and a significant discrepancy between their experimental measurement and theoretical predictions might be a hint of new physics at the TeV scale.
The numerical values of the perturbative coefficients together with the values used for the renormalisation and factorisation scales and the strong coupling constant, are collected in Table 6 in Appendix A. In Table 7 in the same Appendix we list the cuts used in the computation of the observables.
In Table 4, we summarise the results of our studies comparing, for each process, the size of the uncertainty intervals obtained with the CH method (68% and 95% DoB) to the ones obtained using the scale-variation procedure (r = 2) at different perturbative orders.
As far as weak vector boson production is concerned, a graphical comparison of predictions obtained with the CH and the scale-variation methods is shown in Figures 4.7 and 4.9. We notice a similar behaviour for W + and Z production. At LO the 68% DoB uncertainty intervals obtained with the CH prescription are substantially larger than the intervals obtained from the scale-variation prescription with either r = 2 or r = 4. At NLO the difference in size is reduced, while at NNLO the CH intervals turn out to be smaller in size than the scale-variation ones. The posterior distributions for ∆ k , shown in For the top-pair production process the comparison of uncertainty intervals obtained using the CH and scale-variation methods are shown in Figure 4.11. In this case, the NLO result for the cross section is contained in the LO uncertainty band determined by 68% DoB interval of the CH prescription, while this is not the case for the scale-variation interval obtained with r = 2. On the other hand, the NNLO central prediction for the cross section is outside the LO intervals computed with either method. CH intervals with a DoB of 68% are similar in size to the scale-variation intervals with r = 4, and scale-variation intervals with r = 2 are always smaller than the CH ones at 68% DoB. The posterior distributions for ∆ k plotted in Figure 4.12 show the expected narrowing as the perturbative order increases, and that the r = 2 scale-variation interval is always contained in the flat part of the distribution.
As a final benchmark process, we consider Higgs production in proton-proton collisions, a process which is characterised by large perturbative corrections. The graphical comparison in Figure 4.13 shows that uncertainty intervals determined by scale variation with r = 2 do not give a reliable error estimate for MHOUs for this observable, neither at LO nor at NLO. We observe that the CH 68% DoB intervals are comparable in size with scale-variation intervals obtained with r = 4. They both fail to properly estimate the large NLO correction, the central result at NLO being far outside the LO uncertainty band, and the error bars at NLO not even overlapping with the LO ones. The CH method appears to produce a more reliable estimation of uncertainty intervals at higher orders, with the 68% DoB intervals at NLO and NNLO showing a substantial overlap. The slowly-converging pattern of the perturbative expansion for the Higgs cross section prediction is reflected in the behaviour of the posterior distributions for ∆ k , which are shown in Figure 4.14. We observe that the narrowing of the posterior distribution with increasing perturbative order is now much less pronounced than for the other observables that we have considered in this Section. Also, the flat part of the posterior distribution for Higgs production broadens significantly when going from LO to NLO. Table 4: Results for the analysis of missing higher order uncertainties for benchmark processes with initial-state hadrons. We quote the perturbative order k at which the observable is calculated, the central value for the theoretical prediction at that order, the MHOs uncertainty intervals computed using the CH model at 68% DoB and 95% DoB, and the uncertainty interval obtained using the scale variation (SV) procedure with r = 2.
(a) MHOUs for W production in the Drell-Yan process at √ S = 8 TeV. The perturbative series of the observable at the order k is defined as σ k (pp → W + → l + ν l ) = k n=0 α n s hn.

Conclusions and outlook
In this paper we have investigated the performance of two approaches in estimating the theoretical uncertainties due to missing higher orders in perturbative QCD computations. The first one is the widely used prescription of varying the unphysical factorisation and renormalisation scales around a central value. The second (CH) is a modified version of the Bayesian approach introduced by Cacciari and Houdeau in [1].
We have performed a global survey based on a wide set of perturbative observables. Within this set we have considered two categories, characterised by the absence ("non-hadronic") or the presence ("hadronic") of hadrons in the initial state of a process, and we have analysed them separately. The outcome of this survey has allowed us to assign an heuristic confidence level to the uncertainty intervals returned by the scale-variation approach and, in a separate analysis, to determine an optimal expansion parameter to be employed in the CH Bayesian approach.
We have found that, in the scale-variation approach, the standard variation within a factor of two with respect to the central scale can lead to uncertainty intervals whose heuristic confidence level (CL) falls short of a conventional 68%, thereby leading to possibly underestimate the real uncertainty. This is true for both the non-hadronic observables and, more markedly, for the hadronic ones. We have determined that the rescaling factor needed to obtain 68%-heuristic CL intervals is usually larger than two, with the specific value depending on the class of observables under consideration and on the prescription used. In general, and conservatively, a rescaling factor of between three and four appears more likely to give an estimation of the missing higher orders uncertainty that is consistent with a 68%-heuristic CL interpretation of the scale-variation intervals.
Our analysis of the CH approach has allowed us to determine that α s is an appropriate expansion parameter for non-hadronic observables, while a slightly larger parameter, α s /λ h , with λ h 0.6, appears to be more appropriate for hadronic observables.
Armed with the determination of these expansion parameters from the global survey, we have then compared the performances of the scale-variation and the CH Bayesian approaches in the estimation of the MHOUs for a set of benchmark observables of particular interest for LHC physics, namely the production in proton-proton collisions of electroweak vector bosons, top-antitop pairs and Higgs. The two approaches perform similarly in estimating, to a given heuristic confidence level for the scalevariation approach or to a given credibility level for the Bayesian CH approach, the MHOUs for these observables, provided that a rescaling factor larger than two is used for scale variation. More importantly, however, the CH approach additionally provides a full probability density distribution for the missing higher orders uncertainty. This probability density can then be used to combine in a meaningful way the MHOU with uncertainties of different origin, e.g. experimental ones.
We conclude by commenting on two possible avenues for further development. First, in this paper we have determined the optimal expansion parameter for the perturbative series in the CH approach by performing a frequentist analysis of a set of known observables. One could envisage replacing this analysis with an additional prior on the expansion parameter. Second, in this work we have adopted, in the form of the CH model, an as generic Bayesian approach as possible, meant to be applied to a wide and open-ended class of perturbative observables. One could instead envisage developing other, more refined Bayesian models, crafted to work on a more restricted and more uniform class of observables. In such a case, more specific knowledge about the perturbative behaviour of the observables would be input into the model, and a trade-off would exist between this amount of information and the model's eventual predictivity. We leave exploration of these avenues to further work.

Non-Hadronic observables Parameters Coefficients
Observable    Our preferred extension of the CH model to hadronic observables is the one based on the convoluted coefficients (see Section 2.3.1), due to its ability to capture effectively the full complexity of a process with initial-state hadrons and multiple partonic channels. However, it is instructive to compare its results with those of the Mellin-moment approach described in Section 2.3.1. In this Appendix we review two processes: Higgs production in gluon fusion, where, as we will see, the Mellin method and the coefficient one return equivalent results, and the Drell-Yan process, where the Mellin method fails to capture the essence of the process.

B.1 Higgs production in gluon fusion at the LHC
This process is characterised by the dominance at every perturbative order of the gluon-gluon channel. The other partonic channels, which only enter at NLO and at NNLO order, turn out to only give subleading contributions. We calculate the dominant moment from the Higgs coefficient functions in Mellin space given in [38] at N 0 = 2, this value giving the best approximation of the full Mellin inversion integral, as established using a saddle-point approximation [14,15]. We then compare the MHOUs thus obtained with those from the convoluted coefficients method in Table 8(a). We see that the behaviour of the dominant Mellin moments at the various perturbative orders is very similar to the one of the coefficients extracted after the convolution with the parton distribution functions. Indeed, as shown also in Figure B.1, the resulting uncertainty intervals in the two approaches agree very well.

B.2 The Drell-Yan process at the LHC
The Drell-Yan process for Z production at the LHC is dominated by the qq channel both at LO (where it is the only channel) and at NLO, where also the quark-gluon channel starts to contribute. At NNLO also the gluon-gluon channel opens up. It is known that the total net effect of the new channels on the total NNLO contribution is PDF-dependent, due to the uncertainties of the gluon PDFs. In particular, the NNLO contribution can change sign according to the PDF set used. In our case, using the NNLO NNPDF 2.3 set [36], it is negative due to the predominance of the negative quark-gluon channel at this order. On the other hand, the Mellin-space coefficient function for the qq channel, which we use in the analysis, is always positive. Hence, it is not able to capture, alone, the complex pattern of the perturbative expansion for this process. Indeed we see in Table 8(b) and in Figure B.2 that, starting from NLO, the uncertainty interval obtained with the Mellin method is systematically larger than the one obtained with the standard coefficient-based one.
One could in principle work around this issue performing a Mellin analysis for each channel separately. However, to recombine the different uncertainties in order to get a single band for the complete cross section would then require the knowledge of the weight of each channel. This in turn requires the use of the PDFs to determine the corresponding parton fluxes, introducing a dependence on long-range physics into the Mellin moment method, and therefore spoiling its main advantage. Table 8: Results for the analysis of Higgs production in proton-proton collisions and of Drell-Yan production of a Z boson at √ S = 8 TeV. We quote the perturbative order k at which the observable is calculated, the central value for the theoretical prediction at that order, the value of the coefficient function at the dominant Mellin moment N 0 and the MHOs uncertainty intervals computed using the CH model at 68% DoB and 95% DoB with both the Mellin moment method and the coefficient-based approach.
(a) MHOUs for Higgs production in proton-proton collisions via gluon fusion. The perturbative series of the observable at the order k is defined as σ k (pp → H) = k n=2 α n s hn.
pp → H  C Statistical uncertainty in the determination of λ In this Appendix we detail the procedure through which we determine the uncertainty on our determination of the rescaling factor λ (or λ h ) of the expansion parameter in the CH approach. We recall that the optimal value for λ is determined by comparing, for the CH model with a given value of λ, its measured success rate in describing the missing higher order uncertainty (i.e. in returning an interval that includes the known higher order result) with the value of the Degree of Belief given as an input.
What we need to establish is the statistical uncertainty on the measured success rate, resulting from the finite size of the sample of observables that we employ in our survey. For this purpose, we follow the procedure suggested in [39]. It relies on the assumption that the measured success rate, s/n, where s is the number of successes and n the size of the observables' sample, is a point estimator for p, the real proportion of successes in the underlying population. The likelihood of observing a success rate value s/n, given the true value p, is then proportional to p s (1 − p) n−s , and upon normalisation over the interval 0 < p < 1 it can be written as a Beta distribution, B(s + 1, n − s + 1) = (n + 1)! s!(n − s)! p s (1 − p) n−s . (C.1) If we express our ignorance on the value of p by means of a Bayes-Laplace uniform prior, P pr (p) = 1 over the interval 0 < p < 1, we can, upon application of Bayes' theorem, use the normalised likelihood function (C.1) as a posterior probability distribution. We can then define the lower and upper bounds, p l and p u , of an equal-tailed c% = 1 − α credibility interval for p through the relations p l 0 dp B(s + 1, n − s + 1) = α 2 and 1 pu dp B(s + 1, n − s + 1) = α 2 (C.2) respectively. A proper application of this procedure to our problem would require the evaluation, for each value of λ and for each measured s/n rate, of the credibility interval [p l , p u ] for a given value of α. In practice, in order to simplify both the calculation and the graphical representation of this uncertainty, we determine a 68.3%-credible interval for an ideal curve where success rate = requested DoB. This is the interval that is represented as a grey band in the right hand plots of Figures 3.3 and 3.4. As long as the actual curves do not differ too much from this ideal one, one can easily gauge the size of the uncertainty, and therefore to what extent two curves obtained with two values of λ are, or are not, significantly different.