Monte Carlo tuning in the presence of Matching

We consider the impact of varying α s choices (and scales) on each side of the so-called “matching scale” in MLM-matched matrix-element + parton-shower predictions of collider observables. We explain how inconsistent prescriptions can lead to counter-intuitive results and present a few explicit examples, focusing mostly on W / Z + jets processes. We give a specific prescription for how to improve the consistency of the matching and also address how to perform consistent tune variations (e.g., of the renormalization scale) around a central choice. Comparisons to several collider processes are included to illustrate the properties of the resulting improved matching, relying on AlpGen + Pythia 6, with the latter using the so-called Perugia 2011 tunes, developed as part of this effort.


Introduction
The theoretical description of multijet production in hadronic collisions is one of the key ingredients for the interpretation of the data from high-energy hadron colliders, the 1.96 TeV proton-antiproton Tevatron collider at Fermilab, and the proton-proton Large Hadron Collider (LHC) at CERN. Final states with multijets, possibly associated with electroweak gauge bosons, are in fact the dominant signature of the decay of heavy particles produced at high energy, whether in the Standard Model (top quarks and Higgs bosons), or in theories beyond the Standard Model (BSM), such as supersymmetry. The identification of these particles, and the study of their properties, requires an accurate modelling of the Standard Model (SM) sources of multijets. Great progress was achieved towards this goal in the past decade. On one side, the calculation of inclusive, parton-level, cross-sections to next-to-leading-order (NLO) in QCD has produced results for processes as complex as W+4 jets [1]. On the other, algorithms have been developed and implemented in numerical codes to provide a complete description of the hadronic final states emerging from processes with up to 6 jets, merging the exact leading-order (LO) calculation of the partonic matrix elements (ME) with the evolution, provided by so-called shower Monte Carlo (MC) codes, of the partonic shower (PS) and the subsequent hadronization of the partons into physical hadrons.
The development of theoretical tools has been accompanied by experimental measurements, which provide the necessary validation test-bed for these calculations. Parton-level NLO calculations provide a first-principles description of inclusive final states: they have an intrinsic high degree of precision, due to the reduced dependence on the unphysical choice of a renormalization and factorization scale and, furthermore, are not subject to modelling uncertainties related to the details of the non-perturbative phase of the final state evolution. Calculations based on the merging of LO matrix elements, shower evolution and hadronization, on the other hand, while affected by the larger scale-setting uncertainty due to the LO approximation, provide a fully exclusive description of the final states, and are therefore more suitable for the experimental analyses. Their ultimate goal is not only to give reliable estimates of the inclusive jet rates and energy distributions, but also to reproduce properties of the final states such as the jet inner structure and the distribution of softer particles produced outside of the jets, including those resulting from the evolution of the fragments of the original colliding hadrons¹. These properties, which depend on the details of the non-perturbative dynamics, can only be described through the phenomenological models embedded in the shower MC codes. The parameters of these phenomenological models need to be tuned using experimental data for suitable observables. The factorization assumption built into any description of large-Q processes justifies the use of these same parameters in the prediction of different observables, and provides the basis for the predictive power of such tools. This assumption, however, must be validated by direct comparison with data.
Elements that need to be probed include the scaling with beam energy of the UE parameters, the universality of the parameters controlling the shower evolution and hadronization, and the overall independence of all parameters on the type of hard process. Deviations from the expected universality would highlight faults in the underlying modelling of effects beyond perturbative physics, or could be due to the insufficient precision of the perturbative description, in case NLO effects were to modify significantly the LO predictions. Differences compatible with the theoretical systematics of the LO approximation could however be reabsorbed by modifying the perturbative parameters that govern the LO systematics, for example the renormalization and factorization scales, or the matching variables used in the matrix-element/shower merging algorithm.
It is therefore important to understand the correlations between the effects of changing the soft and UE parameters on one side, and the perturbative parameters on the other. In this paper we present studies which demonstrate that, in the tuning of ME-PS matched predictions, it is vital that there is consistency in the treatment of α S in both the ME and PS components. While this is a general issue for all shower MCs, we consider as an explicit example the merging of LO matrix elements with the Pythia 6 shower MC [2], as implemented in the framework of the AlpGen code [3], one of the reference tools for experimental multijet studies at the Tevatron and at the LHC. The most recent versions of the Pythia (6.425) and AlpGen (2.14) codes were used for producing the results.
¹ We refer to the ensemble of these particles as the underlying event, or UE.
On the Pythia 6 side, we consider several different tune variations of the interleaved p T -ordered parton-shower model [4], focusing on the so-called "Perugia" set of tunes of [5,6], ranging from the Perugia 0 tune (from 2009) to the Perugia 2011 updates that have been developed as part of this work, including systematic up/down variations of the shower activity (see the Appendix and [5,6] for details). We also compare to the "DW" tune [7] of the virtuality-ordered shower model [8,9]. For Herwig [10], we include the "Jimmy" underlying-event model [11], with default parameters. We emphasize that the qualitative conclusions presented in this paper carry over to other shower models, including the ones implemented in Pythia 8 [12,13] and Herwig++ [14], but the quantitative aspects should still be considered limited to the particular tunes and shower models studied here. We rely on Fastjet [15] for jet clustering and have further used the Rivet-based [16] mcplots web site [17] for some of our comparisons.
The paper is organised as follows: in Section 2 we describe in detail the theoretical nature of the α S consistency problem, and give a practical example of how it can manifest itself in the prediction of high p T observables. In Section 3 we show how a simple prescription can be applied to stabilise ME-PS tunings against this problem, propose a new tune for AlpGen + Pythia 6 matched predictions, and demonstrate the behaviour of this tune under tuning variations. In Section 4 we show that this new AlpGen + Pythia 6 tune is able to reproduce (within statistical errors) the Tevatron and LHC vector boson plus jets data. In addition we also compare the tune predictions to the jet shape measurements at the Tevatron and LHC. Finally we conclude in Section 5.

The Importance of Consistent α S Treatment in ME-PS Matched Predictions
In this section we demonstrate that consistent treatment of α S in ME-PS matched predictions is important in order to achieve the desired accuracy in the prediction of high p T observables. We first present the theoretical arguments behind this, and then go on to show and explain that without adopting this approach one can observe undesirable and counter-intuitive effects on experimental observables.

Theoretical Background
The philosophy behind matching prescriptions such as the MLM one [18,19] employed by AlpGen is to separate phase space cleanly into two distinct regions; a short-distance one, which is supposed to be described by matrix elements, and a long-distance one described by parton showers. In the long-distance region, real and virtual corrections, with the latter represented by Sudakov factors, are both generated by the shower and are intimately related by unitarity (for pedagogical reviews, see, e.g., [20,21]). On the short-distance side, the real corrections are generated by the matrix elements while the virtual ones are still generated by the shower.
Much effort has gone into ensuring that the behaviour across the boundary between the two regions be as smooth as possible. CKKW showed [22] that it is possible to remove any dependence on this "matching scale" at NLL precision by careful choices of all ingredients in the matching; technical details of the implementation are important, and the dependence on the unphysical matching scale may be larger than NLL unless the implementation matches the theoretical algorithm precisely [23][24][25].
Especially when two different computer codes are used for matrix elements and showering, respectively (as when AlpGen or MadGraph [26] is combined with Pythia 6 or Herwig), inconsistent parameter sets between the two codes can jeopardise the consistency of the calculation and lead to unexpected results, as will be illustrated in the following sections.
To give a very simple theoretical example, suppose a matched matrix-element generator (MG) uses a different definition of α s than the parton-shower generator (SG). Suppressing parton luminosity factors to avoid clutter, the real corrections, integrated over the hard part of phase space, for some arbitrary final state F, will then have the form where we have factored out the coupling corresponding to the "+1" parton and suppressed the dependence on any other couplings that may be present in |M F+1 | 2 . The virtual corrections at the same order, generated by the shower off F, will have the form with P i (z) the DGLAP splitting kernels (or equivalent radiation functions in dipole or antenna shower approaches). If the two codes use the same definitions for the strong coupling, α SG s = α MG s , then the fact that P(z)/Q 2 captures the leading singularities of |M F+1 | 2 guarantees that the difference between the two expressions can at most be a non-singular term. Integrated over phase space, such a term merely leads to a finite O(α s ) change to the total cross section, which is within the expected precision. Indeed, it is a central ingredient in both the MLM and (L)-CKKW matching prescriptions that a reweighting of the matched matrix elements be performed in order to ensure that the scales appearing in α s match smoothly between the hard and soft regions. Thus, we may assume that the choice of renormalization scale after matching is µ ∼ p T on both sides of the matching scale, where p T is a scale characterising the momentum transfer at each emission vertex, as established by [27,28] and encoded in the CKKW formalism [22].
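The displayed expressions referred to in the paragraph above can be sketched schematically as follows. This is a reconstruction from the surrounding description only and is not necessarily the authors' exact notation: the label "hard" on the integrals, the phase-space symbol $\mathrm{d}\Phi_{F+1}$, and the omission of overall normalisation factors are all assumptions.

```latex
% Real corrections from the matrix-element generator (MG), hard region,
% with the coupling of the "+1" parton factored out:
\begin{equation*}
  \alpha_s^{\mathrm{MG}} \int_{\mathrm{hard}} \mathrm{d}\Phi_{F+1}\, |M_{F+1}|^2
\end{equation*}
% Virtual corrections at the same order, generated by the shower (SG) off F:
\begin{equation*}
  -\,\alpha_s^{\mathrm{SG}} \int_{\mathrm{hard}} \frac{\mathrm{d}Q^2}{Q^2}
  \int \mathrm{d}z \sum_i P_i(z)\, |M_F|^2
\end{equation*}
% Leading-singular part of their sum, using |M_{F+1}|^2 -> (P/Q^2) |M_F|^2:
\begin{equation*}
  \left(\alpha_s^{\mathrm{MG}} - \alpha_s^{\mathrm{SG}}\right)
  \int_{\mathrm{hard}} \frac{\mathrm{d}Q^2}{Q^2}
  \int \mathrm{d}z \sum_i P_i(z)\, |M_F|^2
\end{equation*}
```

Written this way, the third expression vanishes when the two couplings are defined identically, leaving only the non-singular remainder described in the text.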
In the case of the CKKW approach as implemented in the Sherpa MC framework [29], this prescription can be controlled exactly, since the matrix element and the shower evolution are part of the same computer code and hence naturally use the same α s definition. This is also true in Lönnblad's variant [23] of the algorithm, used in Ariadne [30]. In the case of codes like AlpGen or MadGraph, on the other hand, an issue emerges. These codes are designed to generate parton-level event samples to be used with an arbitrary shower MC. Different shower MCs, however, use slightly different scales for the parton branchings, as a result of different approaches to the shower evolution, and may use different values of Λ QCD , as a result of the tuning of the showers and/or underlying events. A possible mismatch therefore arises between the values of α s used by the matrix-element calculation and those used by the shower.
If there is a mismatch in Λ QCD or α s (M Z ), then this will effectively generate a real-virtual difference whose leading singularities are proportional to the difference between the two couplings, α_s^MG − α_s^SG, which is of next-to-leading logarithmic order (unless Λ MG ∼ Λ SG , in which case it vanishes). Similarly, even if both matrix-element and shower codes are using the same Λ QCD , but they use different running orders, then there will be an O(α_s^3 ln(p_T^2/Λ^2)) mismatch, which may also become large if p T ≫ Λ.
To be more concrete, let us consider a specific example. Compare A) a matched MG+SG calculation which uses the same Λ QCD value on both sides of the matching to B) a calculation in which the value used on the MG side is reduced to half its previous value but the SG one remains the same, as summarised by the first two columns of Table 1. Going from case A to B, the following changes result:
1. The number of (F + 1) states added by the MG decreases, due to the lowering of the Λ QCD value on the MG side, while the number of surviving F states remains constant, since the shower Sudakov is not modified. The total estimated cross section therefore decreases.
2. At the differential level, the smaller number of (F + 1) states combined with the unchanging number of F states implies smaller absolute jet cross sections and smaller fractions σ jet /σ tot .
Similarly we may consider what happens if C) we reduce the Λ QCD value on the SG side instead, as summarised in the last column of Table 1. Going from case A to C, the following changes result:
1. The number of (F + 1) states added by the MG remains constant, while the number of surviving F states increases, since the SG is generating fewer branchings. The total estimated cross section therefore increases.
2. Since the number of (F + 1) states is constant, while the shower is made less active, the final jets will actually be narrower, which increases the rate of reconstructed jets at any given fixed p T value.
3. Since both the total cross section increases and the number of reconstructed additional jets also increases, jet fractions can either increase or decrease.
In particular, note the somewhat counter-intuitive effect that decreasing the shower α s value actually increases the jet rates in a matched calculation, while it normally decreases them in a standalone shower calculation.
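The bookkeeping behind cases A, B and C can be illustrated with a deliberately crude toy model. It is a sketch under stated assumptions, not the AlpGen or Pythia 6 implementation: the shower Sudakov is taken as exp(−α_s^SG K), the (F+1) rate as α_s^MG K, a single representative scale `mu` is used for all emissions, and the values of `kernel`, `mu` and `LAMBDA` are invented for illustration. The toy does not model the jet-narrowing effect of item 2 in case C.

```python
import math

def alpha_s(mu, lam, nf=5):
    """One-loop running coupling, alpha_s = 1 / (b0 * ln(mu^2 / lam^2))."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    return 1.0 / (b0 * math.log(mu * mu / (lam * lam)))

def matched_rates(lam_mg, lam_sg, mu=20.0, kernel=2.0):
    """Toy matched cross sections in units of the Born cross section.

    kernel stands in for the integrated radiation function above the
    matching scale and mu for a representative emission scale in GeV;
    both are illustrative assumptions, not values from the text.
    """
    # F states that survive the shower Sudakov (controlled by the SG coupling).
    sigma_f = math.exp(-alpha_s(mu, lam_sg) * kernel)
    # (F+1) states added by the matrix element (controlled by the MG coupling).
    sigma_f1 = alpha_s(mu, lam_mg) * kernel
    return sigma_f, sigma_f1, sigma_f + sigma_f1

LAMBDA = 0.2  # GeV, arbitrary reference value for the toy

case_a = matched_rates(LAMBDA, LAMBDA)        # A: consistent choice
case_b = matched_rates(LAMBDA / 2.0, LAMBDA)  # B: Lambda halved on the MG side
case_c = matched_rates(LAMBDA, LAMBDA / 2.0)  # C: Lambda halved on the SG side

for name, (f, f1, total) in (("A", case_a), ("B", case_b), ("C", case_c)):
    print(f"case {name}: sigma_F={f:.4f}  sigma_F+1={f1:.4f}  total={total:.4f}")
```

In this toy the total decreases from A to B because only the (F+1) term shrinks, and increases from A to C because only the Sudakov-suppressed F term grows, mirroring item 1 in each list.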
Since, as was discussed above, inconsistencies among the choices on the two sides can lead to differences at the NLL level, it is obviously important to ensure that they are consistent within a reasonable margin. This is particularly true in the context of event-generator tuning, in which specifically the NLL components of the shower description are sought to be optimized with respect to measured data, and hence changes at this level could effectively destroy the tuning.
Figure 1: Ratio of predictions for the leading-jet E T spectrum in W+jets final states at the Tevatron, obtained with AlpGen plus various MC codes and tunes. The leading jet observable is defined at the particle-level as in the CDF W+jets analysis [32].
Finally, we remind the reader that a change in Λ QCD can be interpreted as a change in the opposite direction of the renormalization scale argument (for constant Λ QCD ), modulo small flavour threshold effects that we shall ignore here. This is easy to realise from the definition of the coupling, which at one loop reads α_s(µ²) = 1/(b_0 ln(µ²/Λ²_QCD)) and therefore depends on the scale only through the ratio µ/Λ QCD . Thus, we may write renormalization scale variations (e.g., by a factor of 2 in each direction) either by applying a prefactor directly on the renormalization scale argument of α s or by applying the inverse of that factor to Λ QCD while keeping the renormalization scale argument unchanged. Due to the technical structure of the codes, the former is more convenient in AlpGen (via the ktfac setting) whilst the latter is more convenient for Pythia 6.
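The equivalence between the two ways of varying the scale can be checked numerically. The snippet below is a minimal sketch assuming one-loop running with a fixed number of flavours (so flavour thresholds are ignored, as in the text); the function names and numerical inputs are illustrative and are not taken from AlpGen or Pythia 6.

```python
import math

def alpha_s_one_loop(mu, lam, nf=5):
    """One-loop coupling alpha_s(mu^2) = 1 / (b0 * ln(mu^2 / lam^2)), fixed nf."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    return 1.0 / (b0 * math.log(mu ** 2 / lam ** 2))

def with_scale_factor(pt, lam, k):
    """Apply a prefactor k to the renormalization scale argument."""
    return alpha_s_one_loop(k * pt, lam)

def with_rescaled_lambda(pt, lam, k):
    """Apply the inverse factor 1/k to Lambda, leaving the scale argument at pt."""
    return alpha_s_one_loop(pt, lam / k)

PT, LAM = 30.0, 0.2  # GeV, illustrative values only
for k in (0.5, 1.0, 2.0):
    a = with_scale_factor(PT, LAM, k)
    b = with_rescaled_lambda(PT, LAM, k)
    print(f"k={k}: scale prefactor -> {a:.5f}, rescaled Lambda -> {b:.5f}")
```

Both routes agree to floating-point precision because the coupling depends only on the ratio of the scale to Λ QCD ; a prefactor below one (equivalently, a larger Λ QCD ) gives a larger coupling.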

Examples of the interplay between tunes and matching
In this section we give several examples of how the issues in ME-PS matching described in Section 2 can affect high-p T observables using AlpGen interfaced to Pythia 6 with DW [7], Perugia 0 (P0) [5,31] and Perugia 2010 (P2010) tunes [6].
In Fig. 1 we show the ratio of predictions for the transverse energy (E T ) spectrum of the leading jet (the jet with the highest E T in each event) in W +jet final states at the Tevatron, obtained by the merging of AlpGen with different shower codes: Herwig, the Pythia 6 virtuality-ordered shower (DW), and the p T -ordered shower (Perugia 0). The differences between Herwig and Herwig plus Jimmy at small E T can be explained by the different amounts of energy that, in the various cases, are deposited by the UE in the jet cones. In particular, as shown in Fig. 2, these differences can accommodate the slight shape discrepancy between data and AlpGen +Herwig that was noted, at small E T , in the CDF study [32]. It is difficult, however, to attribute to the UE energy the significant differences seen in Fig. 1 at large E T .
Figure 2: Comparison of CDF data [32] with the leading-jet E T spectrum predicted by AlpGen plus various MC codes and tunes.
In order to investigate the source of the differences in the predictions, systematic parameter variations of the perturbative and non-perturbative model components of Pythia 6 have been studied using the Perugia family of Pythia 6 tunes with Perugia 0 as the central tune and Perugia Hard and Perugia Soft as the systematic variation tunes. The Perugia Soft and Perugia Hard tunes both use the same Parton Density Function (PDF) as Perugia 0, CTEQ5L [33], but differ in the values of Pythia 6 parameters controlling both perturbative and non-perturbative activity levels. In comparison to Perugia 0, the Perugia Hard tune has more perturbative (initial and final state radiation) activity but less non-perturbative (multiple interactions, beam remnant and hadronization) activity. Perugia Soft on the other hand has less perturbative but more non-perturbative activity than the Perugia 0 tune. In order to investigate the interplay of the tuning variations with the MLM matching, the effect of the change of the tune on the physics observables in both the Pythia 6 standalone case and AlpGen + Pythia 6 case is presented.
In Fig. 3 the distribution of jet multiplicity (N jet ) in W +jets events is compared for events generated with the Perugia 0, Perugia Hard and Perugia Soft tunes. The N jet observable is defined at the particle-level according to the definition used in the ATLAS measurement of the W +jets cross-section at √ s=7 TeV [34]. Jets are clustered from stable particles using the anti-Kt jet algorithm [35] with the radius parameter R = 0.4, and are retained if they satisfy the following kinematic cuts: p T > 20 GeV and |η| < 2.8. Comparisons are performed for both the Pythia 6 standalone (left) and AlpGen + Pythia 6 (right) cases. For the Pythia 6 standalone case we observe that the Perugia Hard tune yields more high-p T jets than the Perugia 0 tune, while Perugia Soft correspondingly yields fewer final state jets. For the AlpGen + Pythia 6 case the opposite trend is observed: the Perugia Hard tune yields fewer high-p T jets and the Perugia Soft tune yields more high-p T jets. In order to determine which modelling components of the Perugia Soft and Perugia Hard tunes cause this behaviour, we considered the effect of varying individual sets of parameters of the Perugia Soft and Perugia Hard tunes in AlpGen + Pythia 6 predictions. Parameters were grouped according to the modelling aspect they control into Initial State Radiation (ISR), Final State Radiation including the FSR from the ISR partons (FISR), the Underlying Event (UE) and Colour Reconnections (CR) blocks². Dedicated samples were produced in which only the parameters of an individual block were varied, on top of the Perugia 0 tune, over the ranges used in the Perugia Soft and Perugia Hard tunes. The results of the study, in terms of the cross-section contribution of each AlpGen sub-sample (after MLM matching), are given in Table 2. As was already noted in Fig. 3, the cross-section for multijet production in the Perugia Hard case decreases with respect to Perugia 0, and vice versa for Perugia Soft.
From Table 2, we see that the parameter blocks that produce this effect are the ISR and FISR blocks, while the impact of the CR and UE block variations on the cross-sections is negligible. In addition to the simultaneous variations of parameters in the blocks, we have also performed individual parameter variations for each of the parameters in order to check that potential correlations between the parameters do not affect the conclusions. Studies have also been performed for the Hadronization and Beam Remnant blocks of [6]. The variations of these parameters also had a negligible effect on the kinematic distributions and cross-section values.
In Fig. 4 we demonstrate that increased parton shower activity can indeed lead to a reduced cross-section (and softer jet spectra) due to the increased rate at which the AlpGen + Pythia 6 events are vetoed during the MLM matching. In the figure the distributions of the events that pass or fail (ISVETO=0 or ISVETO≠0, respectively) the MLM matching criterion are shown for the exclusive sub-sample of AlpGen + Pythia 6 Perugia 2010 W +jets events with exactly three additional partons from the matrix element in the final state³. Each of the distributions is normalised to unit area. The distributions are shown as a function of the largest p T shower emission from the initial state radiation (left) and as a function of the largest p T multiple proton-proton interaction (right)⁴. In the left hand side figure we see that the larger the p T of the hardest ISR branching in the event, the higher the probability that the event is rejected. Therefore, a Pythia 6 standalone tune which increases the ISR activity can, somewhat counter-intuitively, reduce the rate for multijet and hard emissions. In the right hand side figure we demonstrate that the events are accepted and rejected independently of the transverse momentum of the hardest multiple interaction in the event (which is the desired behaviour of the matching application used with the parton shower code).
To conclude, the origin of the differences observed in the predictions of tunes with different ISR/FSR activity matched to AlpGen is rather due to the mismatch between the jet-emission probability predicted by the matrix elements and by the shower. This comes from the mismatch in the value of α S discussed earlier, arising from different values of Λ QCD or from the use of a different evolution variable in the shower. If the value of α S in the shower increases, the emission rate of additional jets during the shower evolution will increase. Since the matching algorithm rejects events with extra jets generated by the shower, to replace them with events where the jet is accounted for by a higher-order matrix element calculation, a larger value of α S in the shower leads to a higher rejection rate. Unless this change in α S is accompanied by a similar change in the matrix element calculation, the additional rejection is not compensated by the relative increase in rate for the higher-order parton-level contributions, leading to the effects reported in this section.
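The veto mechanism described above can be sketched with a simplified exclusive-matching test. This is an illustrative assumption about the general shape of an MLM-style criterion, not the AlpGen implementation: the ΔR matching radius `r_match`, the (pT, η, φ) tuple layout and the function names are hypothetical, and the jets passed in are assumed to be already above the matching p T threshold.

```python
import math

def delta_r(a, b):
    """Distance in the (eta, phi) plane between two (pt, eta, phi) objects.

    Assumes both phi values lie in the same 2*pi-wide interval.
    """
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a[1] - b[1], dphi)

def passes_exclusive_matching(partons, jets, r_match=0.7):
    """True if every ME parton matches a distinct jet and no jet is left over."""
    unmatched = sorted(jets, key=lambda j: -j[0])
    for parton in sorted(partons, key=lambda p: -p[0]):
        candidates = [j for j in unmatched if delta_r(parton, j) < r_match]
        if not candidates:
            return False  # a matrix-element parton was not reconstructed as a jet
        closest = min(candidates, key=lambda j: delta_r(parton, j))
        unmatched.remove(closest)
    # Any remaining jet was produced by the shower: the event is vetoed.
    return not unmatched

partons = [(40.0, 0.5, 1.0)]                         # one ME parton
jets_quiet = [(38.0, 0.55, 1.05)]                    # shower adds no extra jet
jets_with_isr = jets_quiet + [(25.0, -1.2, -2.0)]    # hard ISR emission adds a jet

print(passes_exclusive_matching(partons, jets_quiet))     # accepted
print(passes_exclusive_matching(partons, jets_with_isr))  # vetoed
```

A shower with a larger coupling produces the second kind of event more often, which is the origin of the higher rejection rate discussed in the text.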
This important interplay between MC parameters, which are typically tuned to "soft" observables such as UE or the small-p T DY spectrum, and the performance of the matching algorithms for "hard" observables, calls for particular attention when adopting new UE tunes in the framework of multijet studies with matrix-element matching. Along the same lines, it should be kept in mind that tuning a stand-alone shower MC to better model multijet final states will force it to emulate effects present in the multiparton matrix elements. Using such tunes with matrix-element matching therefore requires ad-hoc modifications of the matching algorithm, or of its parameters.
² The organisation of the parameter blocks is similar to the one introduced in [6]; the blocks are listed in A.2.
³ The observations in the text are largely independent of the final-state parton multiplicity.
⁴ These p T values are reported by the Pythia 6 parameters VINT(357) (ISR) and VINT(359) (MPI), respectively.

Stabilising ME-PS Matched Tunings
In this section we discuss how to overcome the problems described in the previous section with a simple prescription, and outline a tuning strategy that should allow one to optimize consistently, in the context of the Pythia 6 shower MC, the description of both the UE and the high-E T properties of final states.

A New AlpGen + Pythia 6 α S Consistent Tune
As explained in Section 2.1, and demonstrated in practice in Section 2.2, it is highly desirable to have a consistent treatment of α S on either side of the ME and PS boundary. In Appendix A.1 the relevant settings for a new α S consistent AlpGen + Pythia 6 tune are described in detail. In this tune, the α S consistency is essentially ensured by setting the effective value of Λ QCD to be the same throughout the Pythia 6 parton shower algorithms and in the AlpGen matrix elements. A consistent choice for Λ QCD of is made, where the superscript indicates the number of flavours. This choice is informed by comprehensive Professor tunings [36,37] of the p T -ordered shower in Pythia 6 [4] to event shapes and other LEP data. Note that the settings for Pythia 6 are those of the central Perugia 2011 (P2011) tune [6], which was inspired by these studies. We will refer to this new tune of AlpGen + Pythia 6 as the Perugia 2011 "matched" tune.
Figure 5: Comparison of AlpGen + Pythia 6 (p T >20 GeV) jet multiplicity (left) and leading jet transverse momentum (right) distributions in W +jets electron channel events. The samples are generated using different AlpGen + Pythia 6 parameter setups described in the text.

Tests of the Consistent α S Approach: Behaviour Under Scale Variations
In this section we study the behaviour of the new AlpGen + Pythia 6 Perugia 2011 "matched" tune under Λ QCD variations to demonstrate that, with a consistent treatment of α S , the expected behaviour of ME-PS matched predictions under variations of tuning parameters is restored. W +jets events selected with the same criteria applied for Fig. 3 are used. Figure 5 shows the jet multiplicity (left) and leading jet transverse momentum (right) distributions for the Perugia 2011 "matched" tune and four variant tune samples generated with different Λ QCD values. Two samples, labelled as "Λ Alp. ↑" and "Λ Alp. ↓", have Λ QCD respectively increased and decreased by a factor of 2 only in the ME calculation. This is achieved by setting the AlpGen parameter ktfac to 1/2 and 2, respectively. The increase (decrease) of the Λ QCD value in AlpGen results in more (fewer) jets and a harder (softer) leading jet spectrum as shown in Fig. 5. The two samples labelled as "Λ PS ↑, Λ Alp. ↑" and "Λ PS ↓, Λ Alp. ↓" correspond to a consistent variation of Λ QCD both in the ME and PS, with Λ QCD respectively increased and decreased by a factor of 2. The impact of these variations is qualitatively similar to the case where Λ QCD is only varied in the ME, restoring the expected behaviour of ME-PS matched predictions under variation of Λ QCD . However, the samples with Λ QCD varied simultaneously in the ME and in the PS exhibit a smaller deviation from the nominal sample. The reduced impact of a coherent Λ QCD change in a ME-PS matched sample, compared to the same change applied only in the ME calculation, is due to the interplay between the radiation produced by the PS and the matching algorithm, as detailed in Section 2.1.
While the choice of the xlclu parameter allows one to adapt AlpGen directly to possible future changes in the choice of Λ QCD in Pythia, the variation of the ktfac parameter in the standard range 0.5 < ktfac < 2 can be used to establish the range of the systematic uncertainty, or to tune the description of specific observables.

Comparisons with Data
In this section we demonstrate that the new Λ QCD -consistent Perugia 2011 "matched" tuning of AlpGen + Pythia 6 introduced in Section 3.1 compares well with recent Tevatron and LHC measurements, and that, with the arrival of improved precision measurements, there should be room for further tuning of these predictions.

Z/W +jets production
The figures that follow show comparisons of AlpGen + Pythia 6 Monte Carlo predictions to published measurements of Z+jets and W +jets processes from CDF [32,38,39] and of W +jets from ATLAS [40]⁵. These cross-section measurements are corrected for all known detector effects to particle level and compared to Monte Carlo predictions. The Monte Carlo predictions of V +jet production cross-sections are formed by clustering the stable final state particles (τ > 10 ps) following parton shower and hadronization of the unweighted events. This clustering is done using the same jet algorithm as the measurement, as implemented in the Fastjet [15] package. All stable final state particles are used, with the exception of the leptons that result from the decay of the signal W or Z boson. The decay leptons are corrected for the final state QED radiation such that their 4-momentum is equivalent to that before radiation. After the events have been clustered, the restrictions on the allowed phase space of the jets and of the W /Z boson decay products are applied to be consistent with the measurement to which we are comparing. The prediction of the final AlpGen + Pythia 6 cross-sections contains contributions from V +0, 1, 2, 3 parton samples (showered with exclusive MLM matching), and V +4 parton samples (showered with inclusive MLM matching).
Two different AlpGen + Pythia 6 generations are compared to the data: the new AlpGen + Pythia 6 Perugia 2011 "matched" tune introduced in Section 3.1 (labelled "Alp.+Pyt. P2011"), and an AlpGen + Pythia 6 prediction using the default settings of AlpGen and the Pythia 6 DW tune (labelled "Alp.+Pyt. DW"). The ratios of the matched predictions to the data are shown and compared. Additionally, the results of variations of ktfac by factors of 0.5 and 2.0 in the AlpGen + Pythia 6 Perugia 2011 "matched" prediction are shown as solid lines. The hatched regions show the total error (statistical plus systematic) propagated to the theory/data ratio from the data measurements. The error bars on the points show the statistical error on the theoretical prediction.
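The construction of such ratio plots, including the optional normalisation of the prediction to the data in a chosen bin, can be sketched as below. The function name and all numerical values are invented purely for illustration; they are not measured or predicted cross sections.

```python
import math

def theory_over_data(theory, data, data_err, norm_bin=None):
    """Return per-bin theory/data ratios and the relative data-error band.

    If norm_bin is given, the prediction is first rescaled so that it
    equals the data in that bin (as done in the normalised panels).
    """
    scale = 1.0
    if norm_bin is not None:
        scale = data[norm_bin] / theory[norm_bin]
    ratio = [scale * t / d for t, d in zip(theory, data)]
    # The data uncertainty propagated to the ratio is a band around unity.
    band = [e / d for e, d in zip(data_err, data)]
    return ratio, band

# Made-up inclusive N_jet cross sections (arbitrary units), NOT real results.
theory = [90.0, 18.0, 3.2]
data = [100.0, 21.0, 4.0]
data_err = [8.0, 2.5, 0.8]

raw_ratio, band = theory_over_data(theory, data, data_err)
norm_ratio, _ = theory_over_data(theory, data, data_err, norm_bin=0)
for i, (r, n, b) in enumerate(zip(raw_ratio, norm_ratio, band)):
    print(f"bin {i}: ratio={r:.3f}  normalised={n:.3f}  band=1 +/- {b:.3f}")
```

Normalising to one bin removes an overall rate offset, so the remaining deviations from unity probe the shape of the distribution rather than its absolute normalisation.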
In Figure 6 we show the ratio of the predicted theory cross-sections to the data for the CDF Z+jets measurement [39]. In this measurement jets are defined by the CDF midpoint algorithm [46], with R cone = 0.7 and are required to have p jet T > 30 GeV and |y jet | < 2.1. In Figure 7 we show the ratio of the predicted theory cross-sections to the data for the CDF W +jets measurement [32]. In this measurement jets are defined by the CDF JetClu algorithm [47], with R cone = 0.4 and are required to have p T > 20 GeV and |η| < 2.5. In Figure 8 we show the ratio of the predicted theory cross-sections to the data for the ATLAS W +jets measurements [40]. In this measurement jets are defined by the anti-Kt algorithm [35], with a radius parameter R = 0.4 and are required to have p T > 20 GeV and |η| < 2.8.
The AlpGen + Pythia 6 Perugia 2011 "matched" prediction compares well with the measured cross sections both as a function of the inclusive jet multiplicity and jet p T . In particular, the prediction correctly describes the low p T region of the differential cross section and the jet sub-structure without presenting any significant disagreement with data at high p T . This shows that it is possible to tune separately the long- and short-distance contributions of the prediction to obtain a satisfactory description of the observables in the whole experimentally accessible phase space. Remarkably, the prediction describes the data both at √ s = 1.96 TeV and √ s = 7 TeV. This illustrates that the scaling properties of the long-distance contribution with the centre of mass energy of the collision do not produce unexpected effects in the high p T region of the cross section.
Figure 6: (Top) The ratio of predicted theory and CDF measured data cross-sections for the production of a Z → ee boson in association with at least N jet jets [39]. In the left hand figure the theory predictions are not normalised to the data. In the right hand figure the theory predictions are normalised such that they equal the data measurement in the ≥ 1 jet bin. (Bottom) The ratio of predicted theory and CDF measured data cross-sections for the production of a Z → ee boson in association with at least 1 jet (left hand side) and at least 2 jets (right hand side) as a function of jet E T . In the left hand plot, the theory prediction is normalised such that the predicted rate for ≥ 1 jet production is equal to that measured in the data. In the right hand plot, the theory prediction is normalised such that the predicted rate for ≥ 2 jet production is equal to that measured in the data.
A coherent rescaling of α S with ktfac = 0.5, 2 has little effect on the shapes of the differential cross-sections, while for the inclusive N jet cross-sections it produces variations that bracket the default prediction. The ktfac parameter can therefore be used to explore the sensitivity of the prediction to variations of the renormalization and factorization scales, in addition to allowing tuning to data. With the statistics of the currently available measurements, there is not much room for optimising the parameters of the AlpGen + Pythia 6 Perugia 2011 "matched" prediction into a tune that better describes the measurements. However, with the imminent arrival of more precise, higher-statistics LHC measurements, this should become possible in the near future.
Even though not explicitly shown here, in the relevant publications one can find similarly good agreement between the available measurements [32,34,44] and predictions based on AlpGen + Herwig.

Figure 7: (Top) The ratio of predicted theory and CDF measured data cross-sections for the production of a W → eν boson in association with at least N jet jets [32]. In the left hand figure the theory predictions are not normalised to the data. In the right hand figure the theory predictions are normalised such that they equal the data measurement in the ≥ 1 jet bin. (Bottom) The ratio of predicted theory and CDF measured data cross-sections for the production of events containing a W → eν boson in association with at least 1 jet (left hand side) and at least 2 jets (right hand side), as a function of the leading jet E T (left hand side), and the sub-leading jet E T (right hand side). In the left hand plot, the theory prediction is normalised such that the predicted rate for ≥ 1 jet production is equal to that measured in the data. In the right hand plot, the theory prediction is normalised such that the predicted rate for ≥ 2 jet production is equal to that measured in the data.

Figure 8: (Top) The ratio of predicted theory and ATLAS data cross-sections [40] for the production of a W → eν boson in association with at least N jet jets. In the left hand figure the theory predictions are not normalised to the data. In the right hand figure the theory predictions are normalised such that they equal the data measurement in the ≥ 0 jet bin. (Bottom) The ratio of predicted theory and ATLAS data cross-sections for the production of events containing a W → eν boson in association with at least 1 jet (left hand side) and at least 2 jets (right hand side), as a function of the leading jet p T (left hand side), and the sub-leading jet p T (right hand side). In the left hand plot, the theory prediction is normalised such that the predicted rate for ≥ 1 jet production is equal to that measured in the data. In the right hand plot, the theory prediction is normalised such that the predicted rate for ≥ 2 jet production is equal to that measured in the data.
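The two kinds of ratio used in the figures (absolute, and normalised to a chosen jet bin) can be illustrated with a minimal Python sketch. The function name and the cross-section values below are hypothetical and serve only to show the arithmetic; they are not taken from the measurements.

```python
def ratio_to_data(theory, data, norm_bin=None):
    """Return theory/data per bin.

    If norm_bin is given, the theory is first rescaled by a single
    global factor so that it equals the data in that bin.
    """
    if len(theory) != len(data):
        raise ValueError("theory and data must share the same binning")
    scale = 1.0 if norm_bin is None else data[norm_bin] / theory[norm_bin]
    return [scale * t / d for t, d in zip(theory, data)]


# Hypothetical inclusive cross-sections for >=1, >=2, >=3 jets (arbitrary units)
data = [100.0, 20.0, 4.0]
theory = [80.0, 15.0, 2.7]

print(ratio_to_data(theory, data))              # not normalised to data
print(ratio_to_data(theory, data, norm_bin=0))  # normalised to the >=1 jet bin
```

A single global factor removes the overall rate difference, so the normalised ratio isolates how well the prediction describes the relative jet rates or the shape of a spectrum.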

Jet shapes
Finally, we test the ability of the new AlpGen + Pythia 6 Perugia 2011 "matched" tune and its systematic variations to describe the jet shapes at the LHC and the Tevatron.
For the LHC, the jet shapes measured in inclusive jet production by the ATLAS collaboration [48] are taken as reference. For this measurement the jets are reconstructed using the anti-kt algorithm with the distance parameter R=0.6, the transverse momentum range 30 GeV < p T < 600 GeV and rapidity in the region |y| < 2.8. The jet shapes are expected to be sensitive to both perturbative (parton shower) and non-perturbative (fragmentation and underlying event) modelling aspects. We perform the data comparisons for both the AlpGen + Pythia 6 and Pythia 6 standalone cases. The samples are generated using different Pythia 6 standalone and AlpGen + Pythia 6 parameter settings as follows: for Pythia 6 standalone the Perugia 2011 and the associated systematics tunes Perugia 2011 radHi and Perugia 2011 radLo are compared. The same tunes are also used for generating the AlpGen + Pythia 6 distributions, whereby the Λ QCD values are always set to the same values in AlpGen (using ktfac) and Pythia 6 (i.e. the Perugia 2011 "matched" central settings and the systematic variations around the central settings). The setups are compared to the integral jet shape distributions as measured in the data. The integral jet shape is defined as the average fraction of the jet p T that lies inside a cone of radius r concentric with the jet cone [48]:

\[ \Psi(r) = \frac{1}{N_{\mathrm{jet}}} \sum_{\mathrm{jets}} \frac{p_T(0, r)}{p_T(0, R)}, \qquad 0 \le r \le R, \]

where p T (0, r) is the transverse momentum contained within a cone of radius r around the jet axis and R is the jet distance parameter. The sum is performed over all the N jet jets in the kinematic region of interest.
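The verbal definition of the integral jet shape (the average, over jets, of the fraction of jet p T inside radius r) can be sketched in Python as follows. The input layout (a list of jets, each carrying its axis and a list of constituents) and the function name are assumptions made for illustration; the experimental analysis [48] defines its own inputs and corrections, which are not reproduced here.

```python
import math


def integral_jet_shape(jets, r, R):
    """Average over jets of pT(0, r) / pT(0, R).

    Each jet is (y_jet, phi_jet, constituents), where constituents is a
    list of (pT, y, phi) tuples. This layout is hypothetical.
    """
    fractions = []
    for y_jet, phi_jet, constituents in jets:
        pt_in_r = 0.0
        pt_in_R = 0.0
        for pt, y, phi in constituents:
            # Wrap the azimuthal difference into [-pi, pi)
            dphi = (phi - phi_jet + math.pi) % (2.0 * math.pi) - math.pi
            dist = math.hypot(y - y_jet, dphi)
            if dist <= R:
                pt_in_R += pt
                if dist <= r:
                    pt_in_r += pt
        if pt_in_R > 0.0:
            fractions.append(pt_in_r / pt_in_R)
    if not fractions:
        raise ValueError("no jets with constituents inside the jet cone")
    return sum(fractions) / len(fractions)


# Two hypothetical jets, the second one close to the phi = +-pi boundary
jets = [
    (0.0, 0.0, [(30.0, 0.0, 0.05), (10.0, 0.2, -0.2), (10.0, 0.0, 0.5)]),
    (1.0, 3.1, [(40.0, 1.0, 3.1), (10.0, 1.4, -3.1)]),
]
print(integral_jet_shape(jets, r=0.3, R=0.6))
```

By construction the function returns 1 at r = R, and narrower jets give larger values at small r.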
In Figure 9 the integral jet shape distributions are compared to the ATLAS data for jets in the transverse momentum ranges of 40-60 GeV (top) and 260-310 GeV (bottom) in the whole measured rapidity range (|y| < 2.8). We observe that both Pythia 6 standalone (left) and AlpGen + Pythia 6 (right) with Perugia 2011 provide a reasonably good description of the jet shapes. Due to the MLM matching, the jets in the AlpGen + Pythia 6 case tend to be narrower than in the Pythia 6 standalone case.
For the Tevatron, the shapes of jets produced in association with a Z boson as measured by CDF [49] are used. In this measurement jets are defined by the CDF midpoint algorithm [46], with R cone = 0.7 and are required to have p jet T > 30 GeV and |y jet | < 2.1. Figure 10 shows good agreement between AlpGen + Pythia 6 and the measurement for both the DW and Perugia 2011 tunes.
The comparisons in Figures 9 and 10 (as well as comparisons to the jet shapes in other kinematic regions and comparisons to further LHC measurements) reveal no major shortcomings of the AlpGen + Pythia 6 Perugia 2011 "matched" tune. The Perugia 2011 tune has been developed by tuning Pythia 6 standalone, whereby the effective value of Λ QCD was set to be the same throughout the Pythia 6 parton shower in anticipation of using it with the AlpGen matrix elements with the same effective Λ QCD value. The agreement with the measured jet shapes data could therefore potentially be improved by performing a dedicated tuning of AlpGen + Pythia 6.

Conclusions
We have shown that, in the context of tuning ME-PS matched predictions, it is vital that the tuning adopted ensures a consistent treatment of α S on either side of the "matching boundary". In the case of AlpGen + Pythia 6 matched predictions, we have outlined a simple prescription to ensure this. This can be easily generalised, and applied to the case of matching AlpGen with other shower MCs, such as Herwig. We have then given an example of such a tune that compares well to Tevatron and LHC measurements of vector boson plus multijet final states. In addition, we have shown how consistent variations around a central ME-PS matched tune can be performed, so as to define a systematic uncertainty on that prediction. This knowledge should prove valuable in defining a new set of consistent ME-PS tunes for the precise future study of LHC multijet final states.
Further, to prevent the code from modifying any of these values, we set MSTP(64)=2 and PARP(64)=1.0. The former forces the code to keep Λ QCD unmodified for ISR. In particular, the translation from "MSbar" to "CMW", which is applied for MSTP(64)=3, is not performed. This is equivalent to interpreting the effective Λ QCD value as already being in a scheme similar to CMW. The latter, PARP(64)=1.0, sets the prefactor for the renormalization scale used for ISR equal to unity, i.e., the renormalization scale will just be p T . Any re-interpretation of Λ QCD , for instance to translate between different effective scheme definitions or to introduce multiplicative factors on the effective renormalization scale for scale-variation purposes, should then be imposed directly on the three Λ QCD values above. This is the prescription followed in the so-called Perugia 2011 tunes, which were developed as part of this effort, with parameters as listed in [6].
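That a multiplicative factor on the renormalization scale can be absorbed into Λ QCD follows from the fact that, for a fixed number of flavours, the running coupling depends only on the ratio of the scale to Λ QCD . The following Python sketch checks this numerically. It assumes one-loop running with five flavours and no threshold matching, and the Λ value is a hypothetical placeholder; none of this is meant to reproduce the exact running implemented in the shower.

```python
import math


def alpha_s_one_loop(q, lam, nf=5):
    """One-loop alpha_s at scale q (GeV) for a given Lambda (GeV).

    Simplification: fixed flavour number, no threshold matching.
    """
    if q <= lam:
        raise ValueError("scale must be larger than Lambda")
    beta0 = 11.0 - 2.0 * nf / 3.0
    return 4.0 * math.pi / (beta0 * math.log(q * q / (lam * lam)))


def rescaled_lambda(lam, k):
    """Lambda that absorbs a renormalization-scale prefactor k,
    i.e. alpha_s(k * pT; lam) == alpha_s(pT; lam / k)."""
    return lam / k


lam = 0.2  # GeV, hypothetical placeholder value
k = 0.5    # scale prefactor, i.e. alpha_s evaluated at 0.5 * pT
for pt in (5.0, 20.0, 100.0):
    direct = alpha_s_one_loop(k * pt, lam)
    absorbed = alpha_s_one_loop(pt, rescaled_lambda(lam, k))
    print(pt, direct, absorbed)
```

In this simplified picture, evaluating the coupling at half the scale is the same as doubling Λ QCD while keeping the scale prefactor at unity, which is why the prefactor can be fixed to 1.0 and all variations expressed through the Λ QCD values.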
Finally, one needs to settle on an effective value for Λ QCD . According to comprehensive Professor tunings [36,37] of the p T -ordered shower in Pythia [4] to event shapes and other LEP data, one needs values of order where the superscript indicates the number of flavours. We interpret this as an effective value, derived directly from data using a "Pythia scheme" that is defined numerically by Pythia's shower algorithm. It is not necessarily directly comparable to MS determinations 7 .
The Strong Coupling in Pythia 8: Although we restrict the numerical studies in this paper to Pythia 6, for completeness we also include the case of Pythia 8 [50], for which the corresponding relevant parameters are
• TimeShower:alphaSvalue,
• TimeShower:alphaSorder,
• SpaceShower:alphaSvalue,
• SpaceShower:alphaSorder,
for final-state (timelike) and initial-state (spacelike) showers, respectively. Notice in particular that one here specifies the value of α s (M Z ) rather than that of Λ QCD . Similar comments about the effective scheme definition as for Pythia 6 apply.
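Since Pythia 8 takes α s (M Z ) as input, an effective Λ QCD has to be converted before it can be used there. A minimal Python sketch of such a conversion is given below. It assumes one-loop running with five flavours, which is only an approximation to any particular shower setup; the helper names and the Λ value are hypothetical, and only the four setting keys are taken from the text.

```python
import math

M_Z = 91.1876  # GeV


def alpha_s_mz_from_lambda(lam5):
    """One-loop alpha_s(M_Z) from a five-flavour Lambda (GeV).

    Assumption: one-loop running, no threshold matching.
    """
    beta0 = 11.0 - 2.0 * 5 / 3.0
    return 4.0 * math.pi / (beta0 * math.log(M_Z * M_Z / (lam5 * lam5)))


def pythia8_alpha_s_settings(alpha_s_mz, order=1):
    """Format the same coupling for both shower types as 'key = value'
    strings. Using one common value is the consistency choice sketched
    here, not a statement about Pythia 8 defaults."""
    lines = []
    for shower in ("TimeShower", "SpaceShower"):
        lines.append(f"{shower}:alphaSvalue = {alpha_s_mz:.4f}")
        lines.append(f"{shower}:alphaSorder = {order}")
    return lines


lam5 = 0.2  # GeV, hypothetical placeholder value
for line in pythia8_alpha_s_settings(alpha_s_mz_from_lambda(lam5)):
    print(line)
```

The order passed to alphaSorder should match the running assumed in the conversion; a one-loop conversion combined with a higher-order running in the shower would no longer correspond to the same effective Λ QCD .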
Radiation Phase Space: The size of the allowed phase space for radiation in the shower generator may also affect the matched result. In the p T -ordered shower in Pythia 6, the switch MSTP(72) controls the starting scale for final-state radiation off jets that are produced by initial-state radiation and/or are colour-connected to the beam. Naively, the FSR off such a parton should start at the scale at which it was created, which is obtained with MSTP(72)=2, the recommended option. Using the other available options is strongly discouraged, as these lead to rather poor agreement with NLL resummations, as underscored by Banfi et al. [51]. The Perugia 2011 tunes all use MSTP(72)=2.
Further, in the Perugia 2011 tunes, we start both the ISR and FSR evolutions at p T evol = SCALUP, with the Pythia evolution variable p T evol defined in [4] and SCALUP the scale parameter defined in the Les Houches Accord for event generators [52,53]. Technically, this is achieved by setting PARP(71) = PARP(67) = 1.0, where the former controls the scale factor applied to the starting scale for FSR and the latter sets the one for ISR. Note that these parameters could still be varied somewhat around their central values, since the p T evol variable used by Pythia is not 100% identical to the p T definition that might be used to place cuts in a matrix-element generator, but we have not judged this difference essential at the current level of precision.
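For bookkeeping, the Pythia 6 switches quoted in this appendix can be collected in one place, for instance as in the following Python sketch. The dictionary and helper names are hypothetical; the parameter names and values are those given in the text, and the strings are formatted in the "NAME(i)=value" form accepted by the Pythia 6 PYGIVE routine. The Λ QCD parameters themselves are deliberately left out, since their values depend on the tune.

```python
# Shower-consistency settings quoted in the text (Pythia 6 parameter names).
SHOWER_SETTINGS = {
    "MSTP(64)": 2,    # keep Lambda_QCD unmodified for ISR (no MSbar -> CMW translation)
    "PARP(64)": 1.0,  # prefactor of the ISR renormalization scale
    "MSTP(72)": 2,    # FSR off ISR partons starts at the scale where they were created
    "PARP(67)": 1.0,  # scale factor on the ISR starting scale
    "PARP(71)": 1.0,  # scale factor on the FSR starting scale
}


def pygive_strings(settings):
    """Render settings as 'NAME(i)=value' strings, one per parameter."""
    return [f"{name}={value}" for name, value in settings.items()]


for command in pygive_strings(SHOWER_SETTINGS):
    print(command)
```

Keeping these switches together makes it easier to verify that a given run does not silently modify the effective Λ QCD or the shower starting scales.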

A.2 PYTHIA Tunes
In this work, we have used the Perugia 0, Perugia Hard, Perugia Soft, Perugia 2010, Perugia 2011, Perugia 2011 radHi, Perugia 2011 radLo [6], and the DW [7] tunes of Pythia 6.4 [2]. All use the CTEQ5L PDF set [33]. For a complete description, see the indicated references. The salient features of the tunes are as follows.
Tune DW [7] is a tune of the Q 2 -ordered shower. It is based on the Tevatron "Tune A", which had great success in describing the underlying event measured at the Tevatron. In contrast to Tune A, however, DW also included the Drell-Yan p T spectrum among its constraints, for which Tune A predicted a far too soft spectrum. Tune DW therefore has a significantly lower renormalization scale for ISR (and thus a larger value of α s ), and 2 GeV of so-called "primordial k ⊥ ", as compared to 1 GeV in Tune A. The energy scaling of the underlying event was based on comparisons of the underlying-event level at the Tevatron at 630 and 1800 GeV.
The Perugia tunes [6] are all tunes of the p T -ordered shower. Unlike DW, which was developed by tuning to the underlying event in jet events, the Perugia tunes primarily used minimum-bias data as drivers, relying on the universality of PYTHIA's MPI modelling to extrapolate to the underlying event. In addition, a comprehensive update of the LEP fragmentation parameters was included in all tunes.
The first set, the Perugia 0 family, used LEP event shapes and fragmentation data, Tevatron minimum-bias data, and the Tevatron Drell-Yan p T spectrum. Again, the scaling from Tevatron data at 630 GeV was used to determine the scaling with CM energy, with some additional constraints from older UA5 data also included. A "Hard" and a "Soft" variation attempted to vary the shower radiation up and down, respectively. Both Perugia 0 and the "Hard" variation use the so-called "CMW" scheme for Λ QCD for ISR, while the "Soft" variation retained the unmodified MSbar value, in all cases taking the numerical value from the PDF set used. The renormalization scale for ISR was 0.5p T for the "Hard" variation, p T for Perugia 0, and √ 2p T for the "Soft" variation. In addition, the "Hard" variation had higher-than-nominal values for FSR and a slightly harder hadronization spectrum, while the converse was true for the "Soft" one. None of these early tunes used the recommended MSTP(72)=2 setting, and hence predicted rather narrow ISR jets. The Pythia tune numbers are 320, 321, and 322, for Perugia 0, "Hard", and "Soft", respectively.
Table 3 lists the parameter settings of the Perugia family that were used for the block variations in Section 2.2.
In Perugia 2010, jet shapes were included among the tuning constraints. The amount of FSR outside resonance decays (previously controlled by the Λ QCD value read from the PDF set) was adjusted to agree with the level inside them (constrained by fits to LEP event shapes), combined with the recommended MSTP(72)=2. The Λ QCD value for ISR was still read from the PDF set, and translated from MSbar to CMW, as in Perugia 0. A few fragmentation parameters were slightly revised, since some of the previous ones had only been constrained using the Q 2 -ordered shower, and a new colour-reconnection model was introduced. No "Hard" and "Soft" variations were produced for this tune. The Pythia tune number is 327 for Perugia 2010.
In Perugia 2011, it was possible to include some early lessons from LHC at 7 TeV. Based on observed strangeness and baryon production rates, a few of the fragmentation parameters were again revised. The universal effective Λ QCD choice advocated in this paper was introduced. Variations labelled "radHi" and "radLo" were defined as well, expressing a factor 2 variation in the Λ QCD values used for ISR and FSR. The Pythia tune numbers for Perugia 2011, radHi, and radLo, are 350, 351, and 352, respectively.
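To give a feeling for what a factor 2 variation in Λ QCD means for the coupling, the following Python sketch evaluates the ratio of the varied to the central α s at a few scales. It assumes one-loop running with five flavours and a hypothetical central Λ value; it further assumes that "radHi" corresponds to the doubled and "radLo" to the halved Λ QCD , since more radiation requires a larger coupling. The numbers are therefore illustrative only and are not the tune parameters, for which see [6].

```python
import math


def alpha_s_one_loop(q, lam, nf=5):
    """One-loop alpha_s at scale q (GeV); fixed nf, no thresholds (simplification)."""
    if q <= lam:
        raise ValueError("scale must be larger than Lambda")
    beta0 = 11.0 - 2.0 * nf / 3.0
    return 4.0 * math.pi / (beta0 * math.log(q * q / (lam * lam)))


def relative_variation(pt, lam, factor):
    """alpha_s(pt; factor * lam) / alpha_s(pt; lam)."""
    return alpha_s_one_loop(pt, factor * lam) / alpha_s_one_loop(pt, lam)


LAMBDA_CENTRAL = 0.2  # GeV, hypothetical placeholder value
# Assumed mapping: radHi -> Lambda * 2, radLo -> Lambda / 2
FACTORS = {"radHi": 2.0, "radLo": 0.5}

for pt in (5.0, 20.0, 100.0):
    row = {name: relative_variation(pt, LAMBDA_CENTRAL, f) for name, f in FACTORS.items()}
    print(pt, row)
```

Because the coupling runs logarithmically, the same factor 2 in Λ QCD produces a larger relative change of α s at low p T than at high p T .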
Tabulated values of the parameters of all of the Perugia tunes can be found in the appendices of [6].