# Parton-shower uncertainties with Herwig 7: benchmarks at leading order

## Abstract

We perform a detailed study of the sources of perturbative uncertainty in parton-shower predictions within the Herwig 7 event generator. We benchmark two rather different parton-shower algorithms, based on angular-ordered and dipole-type evolution, against each other. We deliberately choose leading order plus parton shower as the benchmark setting to identify a controllable set of uncertainties. This will enable us to reliably assess improvements by higher-order contributions in a follow-up work.

## 1 Introduction

General purpose Monte Carlo (MC) event generators [1, 2, 3, 4, 5, 6] are central to both theoretical and experimental collider physics studies. Recent development of these simulations has seen improvements in various areas, both within perturbative calculations, through matching to fixed order [2, 7, 8, 9, 10, 11, 12, 13, 14, 15], combining higher jet multiplicities [16, 17, 18, 19, 20, 21, 22], as well as the all-order resummation with parton showers [23, 24, 25, 26] and also within the non-perturbative, phenomenological models [27, 28]. While there are well established prescriptions on how to quantify the theoretical uncertainty of fixed-order calculations due to missing higher-order contributions [29, 30, 31, 32, 33, 34, 35]^{1}, there is no such consensus for general resummed calculations [36, 37, 38, 39, 40, 41], and parton-shower algorithms in particular [42, 43, 44, 45, 46, 47], since a number of ambiguities are present within the different schemes; however, there is progress towards this goal. Given the perturbative improvements, and the expected precision from data-taking at *Run II* of the Large Hadron Collider [48, 49], the task of assigning theoretical uncertainties to MC event generators is becoming increasingly crucial. This also applies to validating new approaches against existing data, as well as using predictions to design future observables and/or collider experiments. Phenomenological studies, for example, indicate that MC event generators can be used even in primarily data-driven methods to perform powerful analyses once theoretical uncertainties are under control [50, 51]. It is therefore important to quantify the uncertainties associated with an event generator in a reliable way.

Uncertainties due to non-perturbative modelling have been addressed in [52, 53], as well as the impact of the parton shower on reconstructed observables [54]. Various ambiguities and sources of uncertainty have been addressed within the context of other multi-purpose event generators as well; in particular recoil schemes [55, 56] and parton distribution functions (PDF) [57, 58, 59] have so far been considered, both for pure showers and in the context of matched or merged samples, see e.g. [60, 61, 62]. The eigentune method [63] has been applied by both ATLAS [64] and CMS [65] to determine systematic tune variations. All of these studies share a commonality in that they focus on a single source of uncertainty which is usually connected to the development/improvement studied. Contrary to this, the authors in [66, 67, 68, 69] describe possible approaches to uncertainty handling for the Drell–Yan process. An even more systematic approach for how to handle the possible interplay between theoretical and experimental uncertainties can be found in [70]. Further in the direction of a systematic approach, CMS published a short guide on how to estimate MC uncertainties [71] and outlined some issues to address. Finally, new techniques of propagating uncertainties through the parton shower by means of an alternate event weight were proposed [72, 73].

In the present work, we address uncertainties of parton-shower algorithms within the Herwig 7 event generator [1, 2]. Herwig 7 is a general purpose event generator that computes any observable at next-to-leading order (NLO) precision in perturbation theory automatically matched to a parton shower. It includes sophisticated modules for very different physics aspects ranging from interfaces for physics beyond the standard model and two independent parton-shower algorithms [55, 74], to a detailed modelling of multiple particle interactions [75, 76, 77].

It is our aim to develop a consistent uncertainty evaluation for event generators, and Herwig 7 in particular. This work is a first step in this direction, concerning the parton-shower part and will be extended by further detailed studies in the context of higher-order improvements and the interplay with non-perturbative, phenomenological models and parameter fitting. The present paper is therefore structured as follows: in Sect. 2 we classify all different types of uncertainties and their respective sources. We then argue as to why we start with a pure leading order (LO) plus parton-shower (PS) study. The sources of uncertainty tested in this study are described in detail in Sect. 3. Our results are presented in Sect. 4 for \(e^+e^-\) and fully inclusive *pp* production, while our findings including additional jet radiation are described in Sect. 5. The results establish a baseline of a set of controllable uncertainties, which can then be used to quantify the impact of higher-order corrections to be addressed in upcoming work. Finally, we present a summary and outlook in Sect. 6.

## 2 Context

### 2.1 Sources of uncertainty

- *Numerical*: Computational precision and statistical convergence. This is clearly a limitation which can be overcome by investing enough computing resources and will hence not be addressed further.
- *Parametric*: Quantities taken from measurements or fits beyond the event generator parameters. This includes masses, coupling constants, and PDFs, and the impact of these needs to be quantified separately and potentially on a process-by-process basis watching out for maximum sensitivity.
- *Algorithmic*: The actual parton-shower algorithm, matching and merging prescriptions, and phenomenological models considered. The last are not considered here, as we limit ourselves to the simulation available in Herwig 7.
- *Perturbative*: Truncation of expansion series in coupling or logarithmic order. The main purpose of this work is to elaborate on quantifying these uncertainties in the case of leading order plus parton-shower simulation, which will be motivated in more detail below.
- *Phenomenological*: Goodness of fit uncertainties (e.g. the so-called eigentunes) regarding parameters in the non-perturbative models. We will argue that a remaining spread of predictions obtained by fitting parameters for each of the variations of controllable perturbative uncertainties is able to quantify the cross talk to non-perturbative models and a genuine model uncertainty.

Phenomenological uncertainties will be subject to future investigations. However, we will point out first hints towards their influence by considering variations of the shower infrared cutoff in a selected number of cases. The reasoning for this is twofold: On the one hand, we want to stress the fact that parton level studies should typically be carried out with care, and their region of validity can be estimated by cutoff variations, with large changes indicating non-negligible hadronisation corrections. On the other hand, this fact also indicates how cutoff variations, along with other variations, may actually point to the possibility of quantifying otherwise unknown, generic, model uncertainties and the interplay with non-perturbative corrections.

### 2.2 Why leading order?

We solely consider LO plus PS simulation in this work. The motivation to do so is as follows: With fixed-order improvements it is clearly very hard to disentangle sources of uncertainty stemming from pure parton showering, and those which have been potentially improved by higher-order corrections. In order to quantify genuine parton-shower uncertainties in an improved setting one would typically need to look at jets beyond those that received fixed-order hard process input (e.g. the second jet from a leading order configuration in a NLO matched simulation). Not only is this computationally unnecessary for the sake of studying only parton-shower uncertainties, it also introduces slightly different shower dynamics, the differences of which, with respect to leading order, would also need to be quantified carefully. Additionally, it is our aim to show where and how fixed-order input improves the simulation along with the expected reduction in uncertainty; stated otherwise: To use the NLO matched simulation in order to identify which of the non-first-principle variations considered in this work are indeed reliable estimators of theoretical uncertainty in the perturbative part of event generator predictions.

### 2.3 Different algorithms or uncertainties?

To quantify to what extent commonly used recipes are a sensible measure of uncertainties in parton-shower algorithms, the first step is a clear distinction between the possible sources that exist within a fixed algorithm and the differences that should actually be attributed to the consideration of distinct algorithms. Looking at different algorithms, we obtain a strong cross-check on whether the uncertainties assigned to one algorithm are sensible, provided we consider algorithms that exhibit similar resummation properties. We will also show that changes to the algorithms that are naively expected to be subleading can cause severe differences in the resummation properties. Similarly, kinematics parametrisations to convert on-shell partons to off-shell ones after multiple radiation are known to cause numerically significant differences [55, 56, 78].

Such details, as well as the choice of splitting kernels and evolution variable should not be considered a source of uncertainty within an algorithm but are details that fix a distinct algorithm; we therefore call them algorithmic uncertainties. An uncertainty band based on varying such details cannot serve as a systematic framework to quantify missing higher-logarithmic contributions. If differences between algorithms are not covered by variation of the scales involved, either the estimate of uncertainty or the resummation properties of the algorithms should be questioned.

The scales in question are:

- the hard scale \(\mu _\mathrm{H}\) (factorisation and renormalisation scale in the hard process);
- the veto scale \(\mu _\mathrm{Q}\) (boundary on the hardness of emissions);
- the shower scale \(\mu _\mathrm{S}\) (argument of \(\alpha _S\) and PDFs in the parton shower).

At least two parameters in our shower algorithms are typically obtained in the course of tuning to data: the strong coupling \(\alpha _s(M_Z)\) and the shower cutoff parameter.^{2} Using the different tuned values (with at least the latter having, in general, a different meaning in the two showers), the predictions at parton level will differ, though fully simulated hadronic events will yield a comparable description of data. We argue that these differences should be evaluated carefully, but belong to a future study that will address the interplay with non-perturbative models in more detail.

### 2.4 Simulation setup

We consider both parton-shower modules available in Herwig 7, the default angular-ordered shower [74] and the dipole-type shower based on [8, 55]; in addition to their default settings, which we have adjusted to make them as similar as possible by choosing the same \(p_\perp \) cutoff and \(\alpha _s\) running (the ‘baseline’ settings for this work), we consider a number of modifications mainly outlined in Sect. 3, all of which constitute different algorithms in the sense outlined above. The two showers are very different in their nature: The angular-ordered, QTilde, shower evolves on the basis of \(1\rightarrow 2\) splittings with massive DGLAP functions, using a generalised angular variable and employs a global recoil scheme once showering has terminated; its available phase space is intrinsically limited by the angular-ordering criterion, resulting in a ‘dead zone’, though it is able to generate emissions with transverse momenta larger than the hard process scale and so typically an additional veto on jet radiation is imposed (see Sect. 3 for more details). The dipole-based shower, Dipole, uses \(2\rightarrow 3\) splittings with Catani–Seymour kernels with an ordering in transverse momentum and so is able to perform recoils on an emission-by-emission basis; the splitting kernels naturally require the two possible emitting legs of each dipole to share their phase space and there is no a priori phase-space limitation, but the available phase space is controlled by the starting scale of the shower.

^{3}Hard processes are simulated at leading order (see the previous discussion), using the Matchbox infrastructure powered by amplitudes generated by MadGraph5_aMC@NLO [11]. In \(e^+e^-\) collisions, we consider di-jet production; at hadron colliders, in addition, we consider stable *Z*-boson Drell–Yan production, \((e^+e^-j)\) production within the mass window \(66~\mathrm {GeV}< m_{ll} < 116~\mathrm {GeV}\) around the *Z* mass, as well as production of a stable, \(125\ \mathrm{GeV}\), Higgs accompanied by zero or one jet. In the presence of additional jets in the hard process we use FastJet [83, 84] to perform the generation cuts; analyses are performed throughout using the Rivet framework [85], with analysis modules based on existing experimental and generic Monte Carlo implementations. In \(e^+e^-\) collisions, where we choose a centre-of-mass energy of \(\sqrt{s} = 100\) GeV as baseline, we reconstruct jets with the Durham algorithm [86], while the hadron collider setup reconstructs anti-\(k_\perp \) jets with a radius of \(R=0.4\) within a rapidity range \(|y|<5\) and a transverse momentum threshold of \(p_{\perp } > 20\ \mathrm{GeV}\). Parton level without multiple interactions and hadronisation is employed, and partons up to and including *b*-quarks are treated as massless objects. Both parton showers mentioned above use a \(p_\perp \) cutoff prescription with a value of \(\mu _{\text {IR}}=1\ \mathrm{GeV}\). Electroweak parameters are kept at their default values.
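The hadron-collider selection quoted above can be written down as a minimal Python sketch. The `Jet` container and the helper names are hypothetical and exist only for this illustration; they do not correspond to the FastJet or Rivet interfaces actually used in the study.

```python
from dataclasses import dataclass


@dataclass
class Jet:
    """Hypothetical minimal jet record: transverse momentum (GeV) and rapidity."""
    pt: float
    y: float


def select_jets(jets, pt_min=20.0, y_max=5.0):
    """Keep jets with pt > pt_min and |y| < y_max, hardest jet first."""
    kept = [j for j in jets if j.pt > pt_min and abs(j.y) < y_max]
    return sorted(kept, key=lambda j: j.pt, reverse=True)


def in_mass_window(m_ll, low=66.0, high=116.0):
    """Dilepton invariant-mass window around the Z mass (GeV)."""
    return low < m_ll < high


jets = [Jet(pt=15.0, y=0.1), Jet(pt=45.0, y=-2.3), Jet(pt=80.0, y=5.4), Jet(pt=30.0, y=4.9)]
selected = select_jets(jets)
```

The default values mirror the thresholds stated in the text; everything else is an assumption made for readability.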

### 2.5 Consistency checks

The ability to compare different algorithms puts us into the unique position of performing a number of consistency checks for the uncertainty estimate that we advocate. In particular, perturbative error bands should cover algorithmic discrepancies, if these algorithms are expected to deliver the same accuracy. If that is not the case then the algorithm at hand is questionable. Furthermore, by construction the shower approximates emissions in the soft and collinear region. If we force the shower to produce hard emissions, larger uncertainties are to be expected by a controllable prescription. Another point is the possibility of double counting hard emissions. The shower should not cover phase-space regions that are already covered by the hard process input. This property is typically reflected in demanding that observables that receive input at fixed order are not significantly altered by subsequent showering. Clearly, the definition of ‘region’, which in this case is covered by the veto scale on hard emissions (see Sect. 3 for a more detailed discussion), is again only precise to the level of accuracy covered by the parton shower and varying this boundary should serve as a measure of missing logarithmic orders. We emphasise that a boundary chosen to be far away from the correct ordering behaviour may introduce severe double counting issues, ultimately impacting on a resummation of a tower of logarithms which is not typical to the process, *i.e.* not encountered in any higher-order corrections to an observable considered. Furthermore, the perturbative uncertainties for observables in phase-space regions that do not receive logarithmically enhanced contributions should be driven by the hard scale alone, while the other scales have negligible impact. Logarithmically sensitive observables, on the other hand, should be altered by the parton shower and the uncertainties should be driven by all possible scale variations together. 
The setting where this is least clear is pure jet production, which we will address amongst other ‘jetty’ processes in Sect. 5.

## 3 Scale choices, variations and profiles

### 3.1 Phase-space restrictions and profile choices

The quantity central to parton showers is the splitting kernel. Its exponentiation gives rise to the Sudakov form factor, which regulates the divergence of the splitting kernel for soft and/or collinear emissions. On top of this, there are two further crucial ingredients (besides formally subleading, though not necessarily small, issues like kinematic parametrisations): the evolution variable chosen, and the phase space accessible at a fixed value of the evolution variable. Emissions are typically further subject to an upper bound on their hardness. This cannot be directly deduced from a priori principles but should be chosen to be of the order of the typical hardness scale of the process being evolved, to avoid the double counting issues mentioned before.

*z*, has limits that read

*z* boundaries being crucial to produce the correct logarithmic pattern [25, 55]. Instead, if one desires to make all of the phase space available to parton-shower emissions, \(K_\perp ^2=R_\perp ^2\) is chosen and no constraint other than the kinematic one, \(p_\perp ^2<R_\perp ^2\), is in place.
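A minimal worked illustration of why the *z* boundaries control the logarithmic pattern, assuming the common massless-parton form of the limits with \(K_\perp \) the maximal transverse momentum (an assumption made here for illustration; the precise expressions depend on the kinematics of the shower considered):

$$\begin{aligned} z_\pm&= \frac{1}{2}\left( 1\pm \sqrt{1-\frac{p_\perp ^2}{K_\perp ^2}}\right) , \\ \int _{z_-}^{z_+}\frac{\mathrm {d}z}{1-z}&= \ln \frac{1+\sqrt{1-p_\perp ^2/K_\perp ^2}}{1-\sqrt{1-p_\perp ^2/K_\perp ^2}} \approx \ln \frac{4K_\perp ^2}{p_\perp ^2} \quad \text {for } p_\perp ^2\ll K_\perp ^2 . \end{aligned}$$

Under this assumption the soft-enhanced part of the *z* integration yields a logarithm of \(K_\perp ^2/p_\perp ^2\), so choosing \(K_\perp ^2\) of the order of the hard scale fixes the argument of the resummed logarithms up to a constant.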

^{4}

- theta: \(\kappa (Q_\perp ^2,q_\perp ^2) = \theta (Q_\perp ^2 - q_\perp ^2)\), which is expected to reproduce the correct tower of logarithms;
- resummation: \(\kappa (Q_\perp ^2,q_\perp ^2)\) is one below \((1-2\rho )\ Q_\perp \), zero above \(Q_\perp \), and quadratically interpolating in between. This profile is expected to reproduce the correct towers of logarithms, and switches off the resummation smoothly towards the hard region (currently we use \(\rho =0.3\)^{5}):

  $$\begin{aligned} \kappa (Q_\perp ^2,q_\perp ^2)= \left\{ \begin{array}{ll} 1 & \quad q_\perp /Q_\perp \le 1-2\rho \\ 1 - \frac{(1-2\rho -q_\perp /Q_\perp )^2}{2\rho ^2} & \quad q_\perp /Q_\perp \in (1-2\rho ,1-\rho ] \\ \frac{(1-q_\perp /Q_\perp )^2}{2\rho ^2} & \quad q_\perp /Q_\perp \in (1-\rho ,1] \\ 0 & \quad q_\perp /Q_\perp > 1 \end{array} \right. ; \end{aligned}$$(5)
- hfact: \(\kappa (Q_\perp ^2,q_\perp ^2) = \left( 1+q_\perp ^2/Q_\perp ^2\right) ^{-1}\), which is also referred to as damping factor within the POWHEG-BOX implementation [7]; and
- power shower: imposing nothing but the phase-space restrictions inherent to the shower algorithm considered.
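The four profile choices listed above can be sketched in a few lines of Python. This is an illustration only and not the Herwig 7 implementation: the function names are hypothetical, the arguments are the scales \(q_\perp \) and \(Q_\perp \) themselves rather than their squares, and the value of the theta profile exactly at \(q_\perp =Q_\perp \) is a convention chosen here.

```python
def theta_profile(q, Q):
    """Hard cutoff theta(Q^2 - q^2); the value at q == Q is a convention (1 here)."""
    return 1.0 if q <= Q else 0.0


def resummation_profile(q, Q, rho=0.3):
    """Quadratic interpolation between 1 (below (1-2*rho)*Q) and 0 (above Q), Eq. (5)."""
    x = q / Q
    if x <= 1.0 - 2.0 * rho:
        return 1.0
    if x <= 1.0 - rho:
        return 1.0 - (1.0 - 2.0 * rho - x) ** 2 / (2.0 * rho ** 2)
    if x <= 1.0:
        return (1.0 - x) ** 2 / (2.0 * rho ** 2)
    return 0.0


def hfact_profile(q, Q):
    """Damping-factor profile 1 / (1 + q^2 / Q^2)."""
    return 1.0 / (1.0 + q ** 2 / Q ** 2)


def power_profile(q, Q):
    """No restriction beyond the phase space inherent to the shower itself."""
    return 1.0


# Tabulate the profiles for a hypothetical hard scale of 100 GeV.
Q = 100.0
for q in (10.0, 40.0, 70.0, 100.0, 150.0):
    print(q, theta_profile(q, Q), resummation_profile(q, Q),
          hfact_profile(q, Q), power_profile(q, Q))
```

The two quadratic branches of the resummation profile meet at \(q_\perp /Q_\perp =1-\rho \) with the common value 1/2, so the profile is continuous, whereas hfact varies over the whole range of \(q_\perp \) and is still 1/2 at \(q_\perp =Q_\perp \).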

Different combinations of \(R_\perp ^2\) and \(K_\perp ^2\) can be achieved within the two showers. In particular, the dipole shower is able to populate the region up to \(K_\perp ^2=R_\perp ^2\) (‘power shower’), while, for \(2\rightarrow 1\) processes at hadron colliders, the angular-ordered phase space, by construction, imposes \(K_\perp ^2=Q_\perp ^2\) to be the mass of the singlet which is produced.

*z* integration at this simple qualitative level is given by

- \(K_\perp ^2\sim Q_\perp ^2\) is imposed by the *z* boundaries; *and*
- \(\kappa (Q_\perp ^2,q_\perp ^2) \sim \text {const}\) whenever \(q_\perp ^2\) is not of the order of \(Q_\perp ^2\) for the term involving the derivative of \(\kappa \) to become subleading.

^{6}The second restriction also excludes choices of \(\kappa \) providing a ratio of logarithms to effectively replace \(K_\perp ^2\) by \(Q_\perp ^2\) in the first term in Eq. 7. We therefore conclude that only those profiles that are narrowly smeared versions of a theta-type cutoff (in the sense of varying only in a region where \(Q_\perp ^2/q_\perp ^2\) is of order one) will provide the proper tower of logarithms. Choices such as the resummation profile are desirable to avoid discontinuities introduced by the theta-type cutoff, which are beyond the accuracy considered, while keeping the resummation properties of the parton shower; the profile we consider here is only one of this kind, and there is no restriction on the exact form considered. The name ‘profile’ is chosen since the treatment of the hard scale we consider here closely resembles prescriptions on scale variations within the analytic resummation context [87].

### 3.2 Identifying a ‘Resummation Scale’

### 3.3 Scale variations

In addition to variations of the scales in the hard process, \(\mu '_{R/F}=\xi _H \mu _{R/F}\), we vary both the hard veto scale, \(\mu _\mathrm{Q}=\xi _Q Q_{\perp }\), and the arguments of \(\alpha _s\) and the PDFs in the parton-shower splitting kernels, \(\mu _\mathrm{S}=\xi _S q_\perp \). We restrict each variation factor to \(\xi \in \{1/2,1,2\}\). This spans a cube, \(\xi _H \otimes \xi _S \otimes \xi _Q\), of 27 combinations. All these choices are connected to logarithmic scale choices, so there is no a priori way of reducing their number. We emphasise that in principle only the full 27-point envelope constitutes a comprehensive uncertainty measure. We therefore always produce the full envelope along with envelopes for each of the individual variations. Using this it is possible to observe which scale drives the overall uncertainty in a particular region of phase space. While one expects the variation of \(Q_\perp ^2\) to cancel out to the level of NLL accuracy (if this is indeed achieved by the parton shower), the situation is less clear for the other variations, and different proposals have been made as to what extent the contribution at the level of NLL contributions should be cancelled (see e.g. [88] for a discussion) or otherwise considered as a probe of where precisely higher accuracy of the shower is missing. We do not consider introducing any terms that cancel these variations to the NLL order, and postpone a detailed analysis of this issue to future work. We do, however, analyse these variations as we are convinced that they are another clean handle on identifying where we expect, specifically, soft emissions and contributions by the hadronisation model to dominate. A recent Les Houches study [89] has also shown that, when the full variations of this kind are not taken into account, discrepancies between different shower algorithms that are expected to be similar are not covered by the variations.
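The bookkeeping behind the 27-point envelope and the single-scale envelopes can be sketched as follows. This is a minimal illustration, not the tooling used for the study: all function names are hypothetical, and `toy_prediction` is a made-up stand-in for a binned generator prediction, chosen only so that the example runs.

```python
from itertools import product

XI_VALUES = (0.5, 1.0, 2.0)  # allowed factors for each of xi_H, xi_S, xi_Q


def scale_variations():
    """Return all 27 (xi_H, xi_S, xi_Q) points of the variation cube."""
    return list(product(XI_VALUES, repeat=3))


def single_scale_points(axis):
    """Points where only the factor at index `axis` is varied (others fixed at 1)."""
    return [p for p in scale_variations()
            if all(x == 1.0 for i, x in enumerate(p) if i != axis)]


def envelope(histograms, points=None):
    """Bin-by-bin (lower, upper) envelope over the chosen variation points.

    `histograms` maps (xi_H, xi_S, xi_Q) to a list of bin contents.
    """
    points = list(histograms) if points is None else list(points)
    n_bins = len(histograms[points[0]])
    lower = [min(histograms[p][i] for p in points) for i in range(n_bins)]
    upper = [max(histograms[p][i] for p in points) for i in range(n_bins)]
    return lower, upper


def toy_prediction(xi_h, xi_s, xi_q):
    """Made-up two-bin 'histogram'; the second bin is insensitive to xi_H."""
    return [100.0 / xi_h ** 0.1, 10.0 * xi_q ** 0.5 / xi_s ** 0.2]


histograms = {p: toy_prediction(*p) for p in scale_variations()}
full_lo, full_hi = envelope(histograms)
hard_lo, hard_hi = envelope(histograms, single_scale_points(0))
```

In the toy example the hard-scale envelope of the second bin has zero width although the full envelope does not, which is the kind of underestimate that considering a single variation in isolation can produce.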

### 3.4 Real-life constraints

Besides the unclear definition of a resummation scale in the context of different shower algorithms, another word of caution needs to be raised when considering the hard veto scales: There are cases in which there is no meaningful variation as the hard scale is a fixed quantity such as the mass of an independently evolving final-state emitting system, e.g. showers in \(e^+e^-\rightarrow \) hadrons. It is not clear how one would quantify the respective shower uncertainty in this case, besides looking at shape differences encountered at different centre-of-mass energies of the \(e^+e^-\) collider to quantify the scaling of the predictions with respect to ratios of the hard scale to the infrared sensitive quantity considered. Already this observation makes clear that no claim of a full and well-understood uncertainty recipe can be made at this point; we are only able to take initial steps in this direction. Similarly, for the power shower there is no meaningful variation of \(\mu _\mathrm{Q}\). It can also happen, as is the case for the angular-ordered shower, that the algorithm chosen naturally imposes an upper bound on the hardness of the emission. In the case of Drell–Yan-type processes, the angular-ordered shower, for example, will only allow for a down-variation of \(Q_\perp ^2\), and it is thus questionable whether this variation is the right measure in these cases.

## 4 Clean benchmarks

To begin exploring the uncertainties that arise from the considerations of Sect. 3 we start by studying ‘clean benchmarks’, *i.e.* hard processes with the least number of legs: \(e^+e^-\) annihilation, and Drell–Yan-type \(2\rightarrow 1\) processes producing either a *Z* or Higgs boson. For the case of \(e^+e^-\) collisions, the notion of a hard veto scale does not directly exist owing to the fact that the phase-space boundary and relevant hard scale coincide. However, we can compare variations of the collision energy and quantify this impact at the level of normalised distributions to acquire a handle on variations of the logarithmic structure similar to hadron–hadron collisions^{7}. On top of the three scales \(\mu _\mathrm{H,S,Q}\) described above, we vary the infrared cutoff of the shower by factors of 1/2 and 2 for the \(e^+e^-\) setting, in order to obtain a first indication of how much of the shower dynamics is expected to be absorbed into hadronisation effects; notice that varying the argument of \(\alpha _s\) may serve a similar purpose.

### 4.1 Final-state showers

\(e^+ e^- \rightarrow q\bar{q}\) provides a clean environment to study final-state radiation. Note that in this case the power and theta profiles coincide, which is also our choice in the following.

The Thrust distribution, Fig. 3, shows good agreement between the showers; this is true both for the central prediction and its variations, and shows that they possess the same resummation accuracy. Differences that do emerge between the showers are related to cutoff effects and non-radiating events in the region towards \(T=1\); these offer no insight into the resummation properties. A further difference emerges from the dead zone of the QTilde shower; however, this is a region that can be supplemented by using matching or matrix-element (ME) corrections. For this observable we note that the \(\sqrt{s}\) and \(\mu _\mathrm{S}\) variations are similar in magnitude.

In Fig. 4 we show results for the integrated two-jet rate; the uncertainties are dominated by \(\sqrt{s}\) as well as cutoff variations at small \(y_\mathrm {cut}\). Again, the overall uncertainties are comparable between the showers; as expected, we obtain large uncertainties in the small \(y_\mathrm{cut}\) region, which is dominated by hadronisation effects.
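For orientation, the resolution variable behind \(y_\mathrm {cut}\) can be sketched in Python using the standard Durham (\(k_\perp \)) distance measure, \(y_{ij}=2\min (E_i^2,E_j^2)(1-\cos \theta _{ij})/s\). This is an illustration only: the function name and the four-vector layout are assumptions, normalising to \(s\) rather than to the visible energy squared is a choice made here, and the analyses in this study rely on Rivet and FastJet rather than on code of this kind.

```python
import math


def durham_y(p_i, p_j, s):
    """Durham distance between two momenta given as (E, px, py, pz), normalised to s."""
    e_i, e_j = p_i[0], p_j[0]
    mag_i = math.sqrt(sum(c * c for c in p_i[1:]))
    mag_j = math.sqrt(sum(c * c for c in p_j[1:]))
    dot = sum(a * b for a, b in zip(p_i[1:], p_j[1:]))
    cos_theta = dot / (mag_i * mag_j)
    return 2.0 * min(e_i, e_j) ** 2 * (1.0 - cos_theta) / s


# Two hypothetical massless partons at sqrt(s) = 100 GeV.
s = 100.0 ** 2
hard = (50.0, 0.0, 0.0, 50.0)
soft_collinear = (2.0, 0.0, 0.2, math.sqrt(2.0 ** 2 - 0.2 ** 2))
print(durham_y(hard, soft_collinear, s))
```

A pair is resolved as two separate jets when its distance exceeds \(y_\mathrm {cut}\); soft or collinear emissions give small values of \(y_{ij}\), which is why the small-\(y_\mathrm {cut}\) region is the one most sensitive to the shower cutoff and to hadronisation.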

### 4.2 Initial-state showers

As far as initial-state showering is concerned, we investigate a gluon-initiated process \(pp \rightarrow H\) (in the large-\(m_t\) effective theory), and a quark-initiated process \(pp \rightarrow Z\); these particles are set stable for simplicity. Inclusive observables, such as the rapidity of the resonance in this case, are quantities expected to be well described by the matrix element, and thus should be unmodified by the parton shower; this is reflected in Figs. 5 and 7 where both showers display good agreement, with uncertainties mainly driven by the hard process variation. The differences in magnitude should be attributed to different couplings for each process, with envelope shape differences attributed to the PDFs.

The jets in these samples are generated solely from the parton shower; therefore the \(p_\perp \) of the leading (hardest) jet directly probes the impact of the profile scales.^{8}

Comparing Figs. 6, 7, 8, 9, 10, we find that the different profile choices exhibit significantly different behaviours, both amongst themselves as well as between different showers. The resummation and theta profiles, as intended, yield comparable results in terms of central predictions and uncertainties and across the different shower algorithms. This clearly shows that we can indeed expect the same resummation accuracy using these profiles. The variations towards high \(p_\perp \) for the theta profile expose the effect of the different phase-space limitations. In the QTilde shower the upward variation of the scales (\(\mu _\mathrm{Q}\)) is ultimately irrelevant, as there are no possible emissions at this scale; looking at the dipole shower one sees the effect of such variations. However, this is not the case for the resummation profile whose interpolating region is sensitive to such variations, and displays similar variations between showers.

For large transverse momenta, the uncertainties should reflect the fact that parton-shower emissions in these regions are unreliable. We observe this for both the theta and resummation profiles and to some extent for the hfact choice, though the variation is considerably smaller than indicated by the theta-type choices. The power shower, however, shows no increased uncertainty and in fact is dominated by variations of \(\mu _\mathrm{H}\), since by definition there is no variation of \(\mu _\mathrm{Q}\). Given the marked differences in the hardness of jets between the two showers, the power shower seems to offer no handle towards the assessment of shower uncertainties. We can also clearly observe the intrinsic limitation of the QTilde shower phase space, which in this case is not able to populate high-\(p_\perp \) emissions; these ultimately need to be supplied by matching and/or matrix element corrections, similarly to the ‘dead zone’ effect in \(e^+e^-\) collisions.

We therefore conclude that within this basic setting the showers and profile scale choices exhibit the expected behaviour, and the two showers using theta-type profiles yield similar central predictions and uncertainties.

## 5 Jetty processes

Having established shower uncertainties using simple benchmark processes, the next simplest examples are the processes studied in Sect. 4 with an additional hard emission off the hard process, e.g. *H*/*Z* plus one (inclusive) jet. In addition, pure di-jet production is investigated because of the absence of a colour singlet setting a hard scale and the related ambiguities in possible hard scale choices. We do not investigate the shower cutoff as we shall now focus on properties which are not expected to be significantly altered by hadronisation effects.

As with the clean benchmarks presented in Sect. 4, we consider variations of the three relevant scales discussed in Sect. 3, changing them by factors of 1/2 and 2, respectively, to span a cube of a total of 27 variations; we will also perform cross-validations between both available showers. From arguments given in Sect. 3 we expect observables and/or regions in phase space where the uncertainty is mainly driven by \(\xi _\text {H}\), i.e. in the case of inclusive observables. As all uncertainties connected with scale choices stem from logarithmic arguments, there is no a priori way to exclude any of the possible variations when determining shower uncertainties, unless one is able to identify scale compensation patterns between the different scales, for which we see no evidence in the setting considered in this study.

For the rapidity distributions of the Higgs and *Z* boson, shown in Figs. 11 and 12, respectively, we find that the distributions are consistent with the prediction of the hard matrix element, as is expected from such inclusive quantities; this applies to all of the profile scales considered, with the power shower showing larger deviations in the forward region. Scale variations affect these observables mainly through variations present in the hard process.

^{9}For the hfact profile with the QTilde shower we find a spectrum compatible with the one anticipated by the matrix element; for the dipole shower, a significantly harder spectrum is obtained. A similar, but even more dramatic picture emerges for the power shower setting. The spread of predictions for the QTilde shower is smaller than the spread for the dipole shower, owing to the intrinsic limitations of the phase-space volume available to angular-ordered emissions as already pointed out in the previous sections. The combinations QTilde plus power, and Dipole plus hfact or power contradict the criterion of controllable showering, which in this case is expected to not significantly alter the jet \(p_\perp \) spectrum. Combined with the empirical findings of Sect. 4, we will therefore not consider the power shower profile choice any further.

^{10}, we consider the angular separation between the boson and the leading jet, \(\Delta R_{(H/Z)j}\), which probes both matrix-element and shower-dominated regions in a single continuous observable: matrix element emissions in this case can only populate the phase-space region \(\Delta R_{(H/Z)j} \ge \pi \). The region below is filled solely by the parton shower, typically operating at the boundary of validity of the underlying approximation, as this phase space requires the shower to produce a hard, large-angle emission. Within the definition of controllable and consistent uncertainties, we therefore expect large uncertainties for \(\Delta R_{(H/Z)j} \le \pi \), while the shower should reproduce the matrix element dynamics above. Results for the QTilde shower are shown in Figs. 17 and 19 (*H* and *Z* production, respectively) and for the Dipole shower in Figs. 18 and 20.

We place particular emphasis on the comparison of the resummation and hfact profiles. For all processes/showers we find that hfact predicts a small uncertainty band and produces slightly more hard jets; the latter can be attributed to the available phase space, while the former can be understood by analysing Eq. 7: the region in which the derivative of the profile varies significantly is larger than for the other profiles, though with less overall variation implied. In contrast, and matching the expectations motivated by the logarithmic structure, the uncertainty for the resummation profile in the small \(\Delta R_{(H/Z)j}\) region is large and driven by all scale variations together. In addition, in the bottom ratio plot of Figs. 17, 18, 19 and 20 we show a subset of scale variations for the resummation profile choice, varying the hard and shower scales in a correlated and anti-correlated setting, at a fixed \(\mu _\mathrm{Q}\). This breakdown shows how different subsets of variations constitute the full uncertainty band. While in some regions the uncertainty is simply dominated by one variation, in other regions of phase space it is strongly underestimated when the variations are considered separately. We therefore argue that only the full, combined scale variation produces a reliable error band.

As another probe of the interaction of shower emissions with the hardest jet, we consider \(k_\perp \)-splitting scales, particularly the one at which an event with two jets would turn into an event with one jet as the jet \(p_\perp \) threshold passes through the scale obtained. These observables have also been shown to be accessible to analytic considerations [90]; comparisons of these to full parton showers are highly desirable but beyond the scope of this paper. In Fig. 21 we show our results for the QTilde shower for Higgs production.
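The boundary at \(\Delta R_{(H/Z)j} = \pi \) quoted for the angular separation above follows from the boson and a single recoiling jet being back-to-back in azimuth. A minimal Python sketch of this kinematic statement is given below, assuming the usual definition \(\Delta R = \sqrt{(\Delta y)^2 + (\Delta \phi )^2}\) in terms of rapidity and azimuth; the function names are illustrative, and whether rapidity or pseudorapidity enters in the analysis is not specified here.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation folded into the interval [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


def delta_r(y1, phi1, y2, phi2):
    """Separation in the rapidity-azimuth plane (assumed definition)."""
    return math.hypot(y1 - y2, delta_phi(phi1, phi2))


# One jet recoiling against the boson: delta_phi = pi, so delta_r >= pi
# for any rapidity difference.
back_to_back = delta_r(1.0, 0.0, -0.5, math.pi)

# Further emissions can move the boson and the leading jet away from the
# back-to-back configuration, so values below pi become accessible.
with_extra_recoil = delta_r(0.2, 0.0, 0.0, 2.5)
```

With a single emission the rapidity difference can only increase the separation beyond \(\pi \), which is why smaller values require at least one additional emission.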

^{11} Once again we compare the resummation profile choice with the hfact profile. It is noteworthy that the hfact profile introduces a strong change in the shape of the Sudakov peak, on top of the harder spectrum already observed for the first jet; besides the tail effects, we are therefore concerned that profile scale choices along these lines may significantly impact the resummation properties of the parton shower, as may already be expected from the arguments presented in Sect. 3. We therefore conclude that, even with intrinsically restricted phase space, the hfact profile does not provide controllable uncertainties, and it will not be taken further into account in this study. We also use Fig. 21 to perform a comprehensive breakdown of the different variation directions in the ‘cube’ of possible variations, showing that no individual variation actually covers the full dynamics present. For LO plus PS simulations, we therefore argue that the full band should be taken into consideration; improvements in the context of matching and merging will be the subject of future work.
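The statement that subsets of the cube can underestimate the full band may be illustrated with a toy model. In the following Python sketch, `toy_prediction` is a purely hypothetical stand-in for a single histogram bin with an arbitrary logarithmic dependence on the three scale factors; its coefficients are invented for illustration and carry no physical meaning. The correlated and anti-correlated subsets vary the hard and shower factors at a fixed third factor, mirroring the breakdown described above.

```python
import math
from itertools import product

FACTORS = (0.5, 1.0, 2.0)


def toy_prediction(xi_h, xi_s, xi_q):
    """Hypothetical single-bin prediction; coefficients are arbitrary."""
    return (1.0 + 0.10 * math.log(xi_h)
            - 0.15 * math.log(xi_s)
            + 0.05 * math.log(xi_q))


def band(points):
    """Minimum and maximum of the toy prediction over a set of points."""
    values = [toy_prediction(*point) for point in points]
    return min(values), max(values)


def width(interval):
    """Width of a (lower, upper) interval."""
    return interval[1] - interval[0]


# All 27 points of the cube.
FULL_CUBE = list(product(FACTORS, repeat=3))
# Hard and shower factors varied together, third factor fixed at 1.
CORRELATED = [(f, f, 1.0) for f in FACTORS]
# Hard and shower factors varied in opposite directions, third factor fixed.
ANTI_CORRELATED = [(f, 1.0 / f, 1.0) for f in FACTORS]

for name, points in [("full", FULL_CUBE),
                     ("correlated", CORRELATED),
                     ("anti-correlated", ANTI_CORRELATED)]:
    print(name, width(band(points)))
```

Because every subset is contained in the cube, its band can never exceed the full one; in this toy the correlated direction nearly cancels and would, taken alone, suggest a much smaller uncertainty than the combined variation.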

With the transverse momentum of the third jet and the \(2\rightarrow 3\) resolution shown in Figs. 25 and 26 we consider purely shower driven quantities; both of these nicely reveal that the two showers, together with the resummation profile, are perfectly compatible with each other, exhibiting the same resummation accuracy.

## 6 Conclusions and outlook

We have performed a comprehensive and detailed study of the sources of uncertainty in parton showers, utilising the two parton-shower algorithms available in Herwig 7. We have investigated different choices of profile scales to approach the boundary of hard emissions, as these are highly relevant to effects that appear in the context of NLO plus PS matching. We have systematically categorised the sources of uncertainty and outlined their interplay with other simulation components, putting this study into the context of a larger work programme to eventually establish uncertainties for event generators as a whole.

Focussing on the perturbative, parton-shower part of the simulation, we have deliberately chosen LO plus PS calculations to establish a baseline of controllable and consistent variations that will allow us to subsequently identify improvements and reductions in these uncertainties as higher-order corrections are included. We have found that profile scale choices are strongly constrained when applying consistency conditions on both central predictions (which should not alter input distributions of the hard process) and uncertainties (with large uncertainties to be expected in unreliable regions or regions dominated by hadronisation corrections), as well as requiring stable results in the Sudakov region. In particular, the hfact and power shower configurations do not admit results compatible with these criteria. Utilising a resummation profile, which is very close to the theta cutoff for hard emissions as implemented in previous algorithms, we find that the angular-ordered and dipole-based shower algorithms are compatible with each other, both in central predictions and uncertainty claims, despite their very different nature.

While being based on scale compensation arguments, these methods are, however, not able to predict the impact of finite corrections.

One can argue that the tuning of \(\alpha _s(M_Z)\) typically absorbs the CMW correction advertised in [80], which would otherwise have to be included to obtain a satisfactory description of data.

This setup has been chosen so as to enable, later on, a fair comparison to NLO-improved simulations, which necessitate these orders of running.

Typically, the splitting kernel for exact phase-space factorisation is then accompanied by a damping factor \(\sim 1 - p_\perp ^2/R_\perp ^2\) towards the edge of phase space.

In principle \(\rho \) should be varied within a reasonable range, though we do not expect a big effect from this variation, given the similarities between \(\rho =0.3\) and \(\rho =0\), the latter corresponding to the theta profile; see the following sections.

Lifting this restriction in the case of the dipole shower will induce logarithms of \(K_\perp /Q_\perp \), which can become parametrically as large as the leading \(Q_\perp /p_\perp \) ones if these scales are no longer of the same order.

We do not consider deep inelastic scattering, which is interesting in its own right. Similarly, a (hypothetical) \(e^+e^-\rightarrow gg\) collider setting should be explored to complement our studies of *Z* versus *H* production in *pp* collisions; we postpone these discussions to later work elaborating on the interplay with hadronisation models, where these differences are expected to be more relevant; the reader is also referred to the Les Houches study [89] in this context.

Note that the profile scales, especially in the case of the QTilde shower, need to be applied to all emissions so as to ensure that the hardest emission is corrected in the intended way.

Cut migration for jetty processes should actually be considered another source of uncertainty beyond the ones discussed here; however, we do not address it in detail but choose to use equal generation and analysis cuts to highlight these effects.

We remind the reader that ‘exclusive’ here means: potentially probing more and more shower emissions on top of the hard process.

## Acknowledgements

We are grateful to the other members of the Herwig collaboration for encouragement and helpful discussions; in particular we would like to thank Stefan Gieseke, Peter Richardson and Mike Seymour for a careful review of the manuscript. We also acknowledge fruitful exchange with Mrinal Dasgupta, Keith Hamilton and Frank Tackmann. The work of JB and PS has been supported by the European Union as part of the FP7 Marie Curie Initial Training Network MCnetITN (PITN-GA-2012-315877). GN acknowledges a short term student visit funded by MCnetITN. SP acknowledges support by an FP7 Marie Curie Intra European Fellowship under Grant Agreement PIEF-GA-2013-628739. We are also grateful to the Cloud Computing for Science and Economy project (CC1) at IFJ PAN (POIG 02.03.03-00-033/09-04) in Cracow and the U.K. GridPP project whose resources were used to carry out some of the numerical calculations for this project. Thanks also to Mariusz Witek and Miłosz Zdybał for their help with CC1 and Oliver Smith for his help with grid computing.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funded by SCOAP^{3}