Matching Tree-Level Matrix Elements with Interleaved Showers

We present an implementation of the so-called CKKW-L merging scheme for combining multi-jet tree-level matrix elements with parton showers. The implementation uses the transverse-momentum-ordered shower with interleaved multiple interactions as implemented in PYTHIA8. We validate our procedure using e+e--annihilation into jets and vector boson production in hadronic collisions, with special attention to details in the algorithm which are formally sub-leading in character, but may have visible effects in some observables. We find substantial merging scale dependencies induced by the enforced rapidity ordering in the default PYTHIA8 shower. If this rapidity ordering is removed the merging scale dependence is almost negligible. We then also find that the shower does a surprisingly good job of describing the hardness of multi-jet events, as long as the hardest couple of jets are given by the matrix elements. The effects of using interleaved multiple interactions as compared to more simplistic ways of adding underlying-event effects in vector boson production are shown to be negligible except in a few sensitive observables. To illustrate the generality of our implementation, we also give some example results from di-boson production and pure QCD jet production in hadronic collisions.


Introduction
Production rates for multi-jet events at the LHC are very large, and the understanding of such events is important, not least as most discovery channels for new physics involve jets. The main irreducible, and often huge, background for such signals comes from QCD processes. To distill a signal one therefore needs to make complicated cuts to decrease the QCD background, sometimes by several orders of magnitude. For this it is very important that we have a good understanding, not only of the average behaviour of multi-jet processes, but also the fluctuations and very rare events coming from standard QCD.
The state-of-the-art for simulating multi-jet final states with Monte Carlo event generators is to use Ckkw-based algorithms to combine exact tree-level matrix elements (ME) with parton showers (PS) in a consistent way. Here, the matrix elements describe accurately the production of several hard, well-separated partons, while the parton shower encodes how these are evolved into partonic jets by accurately modelling the soft and collinear partonic emissions, in a way such that standard hadronisation models can be applied to produce realistic exclusive hadronic multi-jet final states.
However, Ckkw merging algorithms mainly focus on the jets produced in the primary interaction, and little attention is normally given to jets which may arise from rare, but hard fluctuations in the underlying events. If at all, the underlying-event contribution is typically added to the merged sample assuming that the additional scatterings are completely independent of the primary interaction. This may be a good approximation in most cases, but it is clear that there are correlations between the primary interaction and the underlying event, which we think are important to investigate carefully. The multiple interaction model in PYTHIA8 is arguably the most advanced model for the underlying event today. It contains several sources of correlations between the primary interaction and the underlying event. In particular, the model for multiple scatterings is tightly tied to the parton shower in that additional scatterings are interleaved with the parton evolution.
In this paper we implement the Ckkw-l algorithm for merging parton showers with tree-level matrix elements in PYTHIA8, and in doing so we consider possible effects of the fact that the PYTHIA8 shower is interleaved with multiple interactions. Although the effects turn out to be small, we note that there may be more sources of correlations which are currently not taken into account by PYTHIA8, and our scheme is a way to automatically take into account any such correlations also in the merging with tree-level matrix elements.
It should be noted that this is not the first implementation of matrix-element merging with the PYTHIA shower. Interfaces exists for the FORTRAN version of PYTHIA to the ALPGEN [1] program by employing the MLM matching prescription [2], and to MADGRAPH/ MADEVENT [3] using so-called Pseudo-Shower merging [4].
The outline of this article is as follows. First in sections 2 and 3 we briefly recapitulate the main features of the Ckkw-l merging scheme and the interleaved showers respectively, before we describe the details of our PYTHIA8 implementation in section 4. Then we present results in section 5, starting with some control plots to validate the implementation before we study the effects of multiple interactions and other formally sub-leading features on the production of vector bosons with additional jets at the LHC. We end with showing some comparisons with data, and some preliminary results also from di-boson and pure QCD jet production. Finally we present our conclusions in section 6.

The Ckkw-l merging scheme
Here we will present the main features of the Ckkw-l merging procedure. For a more detailed discussion of Ckkw-l and other similar merging algorithms we refer to [5,6] and the original publications [7,8].
The starting point for Ckkw-l is that we have a tree-level matrix-element generator capable of generating the Born-level process of interest, as well as the same process with up to N additional partons. The matrix elements used are regularised with a jet cutoff which we refer to as the merging scale, t MS . To the states generated in this way we want to add a parton shower to dress up the hard partons with emissions below the merging scale in a way such that the soft and collinear emissions are properly modelled.
As the matrix elements are inclusive, in that they give the cross section for states with at least n additional partons resolved above the merging scale, it is obvious that we cannot simply add the event samples generated with different parton multiplicities. Instead we want to make the samples exclusive by reweighting them with Sudakov form factors taken to be the no-emission probabilities the parton shower would have used to produce the same partonic states.
To calculate the form factor we first have to reconstruct a parton-shower history for the states with n additional partons, S +n , given by the matrix element generator. This means that we have to answer the question, how would my parton shower have generated this state? The answer to this question is not necessarily unique. The parton shower may produce a given final parton state in several ways, just as a given state may be represented by many different Feynman diagrams. In Ckkw-l, these different path are considered by reconstructing all possible parton shower histories, and picking one of them according to probabilities calculated from the relevant splitting functions.
Doing this, we arrive at a history, in which a sequence of parton shower emissions are specified by the ordering scale of each emission ρ i and other splitting variables such as the energy fractions, and azimuthal angles, denoted by z i . We also obtain a sequence of intermediate parton states, S +i . The requirement on the parton shower is therefore that it must have complete on-shell intermediate parton states between each splitting. Until fairly recently this was only true for the ARIADNE program [9], which also was the first to use the Ckkw-l merging [8].
Let us denote Sudakov form factors by dρ dzα s (ρ)P i (ρ, z) . (2.1) This is the probability that there are no parton shower emissions from the state S +i between the scales ρ i , and ρ i+1 . The reweighting with Sudakov form factors now proceeds by starting the parton shower at a given intermediate state S +i , setting ρ i as the maximum scale, and generating one emission (ρ, z). The probability that this emission is above ρ i+1 is exactly 1 − ∆ S +i (ρ i , ρ i+1 ), so throwing away the event if the emission is above ρ i+1 is equivalent to reweighting with the Sudakov form factor. A special treatment is called for in the Sudakov between the last emission scale, ρ n , and the merging scale, in the case the cutoff in the matrix elements is not defined in terms of the parton shower ordering variable. In the case of n < N , the event is rejected if the trial emission from the state S +n is above the matrix element cut-off, irrespective of how it is defined. In the case of n = N , however, no Sudakov-reweighting is done.
The n-parton state is typically generated using matrix elements with a fixed α s (µ), so that we also reweight the event with n i=1 α s (ρ i ) α s (µ) (2.2) to obtain the same running of α s as in the shower. Finally, note that for initial-state parton-shower splittings, the no-emission probability Πis not the same as the Sudakov form factor needed to reweight the matrix-element generated state. Instead we have [10,11], (2.3) and the corresponding ratios of parton density functions are included as an additional weight.
We have thus constructed exclusive final states with an arbitrary number of partons resolved above the parton shower cutoff scale ρ c . The distribution of these states are resummed to all orders in α s , according the precision of the parton shower. However, the n ≤ N emissions which are considered hardest in the parton-shower sense, and are above the merging scale as well, will have their splitting functions corrected to reproduce the correct tree-level matrix element.
It should be noted that if the merging scale is defined in the same way as the parton shower evolution scale, the Ckkw-l is equivalent to standard Ckkw, as long as the latter is used with a shower which is properly vetoed and truncated [12]. In Appendix A we elaborate on how the logarithmic accuracy of the shower is preserved in Ckkw-l and compare with the case of standard Ckkw using truncated showers.

Interleaved showers
As mentioned in the previous section, the requirement on a parton shower to be used in the Ckkw-l procedure is that it gives complete on-shell partonic states between each emission. In this respect, the transverse-momentum ordered shower in PYTHIA8 [13] is perfectly suited. However, it is not completely straight forward to implement Ckkw-l with PYTHIA8, as the parton shower in the case of hadron collisions is interleaved with multiple interactions.
The philosophy behind the interleaved shower is that processes with a high scale in some sense happen before processes at lower scales. As the emissions in a parton shower are not completely independent in that every emission will give rise to recoils and will carry away some energy and momentum, it is important that the emissions are performed in the right order. It is, for example, not reasonable that an emission of a gluon with small transverse momentum removes so much energy as to make an emission with a higher transverse momentum impossible. The argument is based on formation times -a final state parton with large transverse momentum is to some extend formed long before one with small transverse momentum.
If we consider standard QCD jet production in proton collisions, a parton shower is typically initiated by a hard 2 → 2 matrix element at some transverse momentum. The parton shower then evolves these hard jets by emitting final-state radiation from the outgoing partons and initial-state radiation from the incoming partons. This is done iteratively, ordering the emissions in transverse momentum.
There is also a chance for a second (semi-)hard interaction between the colliding protons. Also, one of the outgoing partons from the hard interaction can rescatter with one of the spectator partons in one of the colliding protons, and in addition, outgoing partons are allowed to rescatter among themselves. In PYTHIA8 such scatterings are included in the shower procedure such that an additional scattering at a scale ρ MI will happen before e.g. an initial-state splitting at a scale, ρ i < ρ MI .
This means that the no-emission probabilities are modified in PYTHIA8, and now consist of several pieces, where the superscript refers to the standard parton shower (PS), multiple interactions (MI) and rescattering (RS). If we have resolved a state S +i at a scale ρ i , the probability for a change of type a at scale ρ is where P a is the inclusive probability. PYTHIA8 uses an interleaved treatment of spacelike (initial-state radiation -ISR) and timelike showers (final-state radiation -FSR), so that the no-emission probability Π PS S +i is further subdivided as The ordering scale, ρ, is defined in different ways for different processes, but they all correspond to a relative transverse momentum of emitted partons. For ISR the scale is where −Q 2 is the virtuality of the incoming original parton and z is its momentum fraction, and for FSR we have where Q 2 is the invariant mass of the radiating parton, and z the energy fraction (in the dipole rest frame) of the emitted parton. For MI and RS the scale is simply given by the squared transverse momentum of the emitted partons.
The full interleaving of all shower components makes PYTHIA8 ideal for our prescription of matrix element merging, since the full no-emission probability can, as will be explained below, easily be generated in only one step.

Implementation in PYTHIA8
Due to the requirement of fully on-shell intermediate states, Ckkw-l merging has so far only been implemented in the ARIADNE shower. Here, we present a new implementation within PYTHIA8, which is conceptually equivalent to the former, but differs in details relating to the differences in the parton showers.

Constructing the parton shower history
A key concept of the merging algorithm is the assignment of a shower history -a sequence of shower states and evolution scales -to each n-particle configuration supplied by the matrix element generator. In the Ckkw-l approach, this is done by constructing every possible path from a core Born-level process to the current n-particle state.
Here we encounter the first difference between ARIADNE and PYTHIA8. The first gluon emission is particularly simple in e + e − in ARIADNE, where the evolution variable and the splitting kernel for the first splitting are symmetrical between both outgoing legs, thus resulting in only one possible path: One dipole splitting into two dipoles. In PYTHIA8, the approach is slightly different. Also here a dipole-like approach is used, but the emission is explicitly divided up into two contributions stemming from each of the dipole ends, where the radiation close to one end of the dipole is considered more likely to come from this dipole end itself. Different splitting probabilities for either dipole end will thus result in two different ways in which PYTHIA8 could have arrived at the +1-parton state. In general there are more possible paths in PYTHIA8 than in ARIADNE. We therefore try to investigate in some detail the effects of different ways of choosing a path.
What we basically want to do is to reconstruct which Feynman diagram gives the largest contribution to the state produced by the matrix element generator. The preferred option would be to ask the matrix element generator itself, but this information is not always easily accessible. Even if such details were available, it is not always enough, as a given Feynman diagram may also correspond to different parton shower histories. As is discussed in Appendix B, we approach this issue by constructing all possible path of collinear splittings, and pick a path according to the product of splitting probabilities. For the hardest emission, the splitting probability is supplemented so that the matrix element transition probability is assured. More precisely, we choose a path according to the probability It must be noted that in the limit of strong ordering, which is the relevant limit when looking at the formal logarithmic accuracy of the procedure, picking the most likely path is trivial. Hence, the way a path is selected will only give sub-leading effects on any observable. We will nevertheless investigate how large these effects are by implementing two different schemes. One is similar to the original ARIADNE-implementation, and is based on eq. (4.1). The other is inspired by the Ckkw-implementation in HERWIG++ [14], where the path which has the smallest sum of transverse momenta in the splittings is chosen exclusively. Clearly, in the strongly ordered limit, both of these will find the "right" path, but as we will see in section 5, there are visible differences.
For higher jet multiplicities, minor complications of the path concept arise. First, we know that shower emissions are always ordered in some scale variable ρ (virtuality, angle, transverse momentum). This is not always true for consecutive clusterings of jets from a matrix element. We choose to interpret a sequence of such unordered splittings as a single step in the algorithm such that all steps will be ordered. We must then decide which scale to use for this combined emission step. Assume that we have a sequence of reconstructed scales given by ρ 1 > ρ 3 > ρ 2 > ρ 4 . The combined emission then corresponds to ρ 2 and ρ 3 and we can generate the total no-emission probability as In the former case, the no-emission probability between the scales ρ 3 and ρ 2 is calculated using the 1-parton state, while in the latter, the 3-parton state is used. We will investigate the difference between using the higher (ρ 3 ) or lower scale (ρ 2 ) as minimal scale for rejecting trial emissions off the 1-parton state in section 5.1. Some rare matrix element configū d c c W − Figure 1: An example of a matrix element contribution without a complete shower history. In this case, only the two gluon emission can be reclustered, cc → udW − is regarded a separate hard process.
urations, e.g. massive electroweak corrections to an underlying QCD process, as shown in Figure 1, could never have been produced in the shower algorithm. For such processes, clustering will be attempted as far as possible. The last, irreducible, state will be treated as a new hard process, and be assigned a shower starting scale in the same way PYTHIA8 normally would have assigned a scale when presented with such processes. When handling externally generated processes, PYTHIA8 would by default start the evolution at the factorisation scale defined in the matrix element evaluation. However, different user choices are allowed. In section 5.1, we will also investigate the effects of other scale choices.
On a more technical note, we disallow clusterings that will result in a unreasonable Born-level process. An example would involve starting from the configuration shown in  . From only recombining colour and flavour, alternative (b) would be identical, and, albeit being disconnected, allowed. The interpretation of the configuration as either (a) or (b) is tied to which uū pair is clustered to a gluon. Since we will always be able to find sensible paths like (a), impossible paths (b) leading to disconnected diagrams will be discarded.
In other merging prescriptions, these problems are addressed with other strategies. In HERWIG++ [14], the authors found that results where insensitive to the treatment of unordered or incomplete paths and chose to retain incomplete contributions. SHERPA [15] follows a different approach in that no incomplete histories are constructed, since if necessary, electroweak bosons will be clustered as well. This would interpret the diagram in Figure 1 as an electro-weak matrix element correction to di-jet production.

Interleaved multiple interactions
At the LHC, events with only one parton-parton scattering per collision are highly improbable, and a lot of effort has gone in to the modelling of multiple scatterings in PYTHIA8. When merging the PYTHIA8 shower with matrix elements, it is therefore desirable to keep the modelling of multiple scatterings as intact as possible.
In PYTHIA8, multiple interactions and radiation compete for the available phase space. To make sure that some part of phase space is exclusively filled by matrix element configurations, another part by shower radiation and multiple interactions, we minimally modify the Ckkw-l algorithm. The generation of no-emission probabilities has to be slightly refined to keep the effect of multiple interactions on the no-emission probability, while assuring the validity of our algorithm.
The formal proof that the merging scale dependence cancels to the accuracy of the shower rests on the assumption that the factorisation scheme defined by the shower evolution equation is uniform over all of phase space. In Ckkw-l, this is realised by allowing trial emissions in the matrix element domain, i.e. off reclustered states, without phase space restrictions, and vetoing events if the first emission off a ME configuration produced another ME configuration, i.e. a parton above t MS . In this way, events with non-zero weight have been treated identically in the ME and PS regions. This prescription has to be generalised to include additional sources of emissions, e.g. multiple interactions.
The requirement that the shower evolution is identical in ME and PS domains forces us to treat multiple interactions on equal footing with radiative emissions, once secondary scatterings are included in the evolution of partons by allowing for competition over phase space. When performing trial showers on a reclustered state, we thus treat multiple interactions identical to "ordinary" emissions. The treatment of the first emission off ME configurations defines the border between ME and PS regions. We choose to slightly refine this definition by requiring that the matrix element region contains only radiative emissions above a cut t MS . This means that once a different type of emission has been produced, we are in the parton shower domain, and we should continue the shower without any additional phase space restriction. More concretely, when checking the first shower evolution response from a ME configuration, we keep the state if an emission below t MS or a secondary scattering has been generated. Hence, the lower bound on the matrix-element-corrected region is changed to t ′ MS = max(t MS , ρ MI ). The reason for this treatment is that we want to keep hard multiple interactions generated by the shower, rather than unjustifiably restricting them to be below t MS .
Let us describe our procedure with a specific example for merging up to three additional jets. Consider a W + 3 gluon event, with scales ρ 1 ≥ ρ 2 ≥ ρ 3 . When only allowing QCD radiation and multiple scatterings, this state could be produced by 1. Three gluon emissions off W production; 2. One gluon emission off W production, and one secondary gg → gg or qq → gg scattering; Clearly, the first possibility can and should be corrected with matrix elements according to the standard Ckkw-l procedure. In the second case, the hardest scale can be attributed to either MI (ρ 1 = ρ MI = ρ 2 > ρ 3 ) or an emission. In the former, we think of the state as inside the PS domain. This means that the shower would have produced the secondary interaction first, "freeing" the subsequent emissions from phase space restrictions. Thus, we have to generate this state from the 0-jet matrix element, and, to avoid double counting, veto it in trial showers off reconstructed configurations. If the hardest scale was associated with an emission (ρ 1 > ρ 2 = ρ MI = ρ 3 ), we can distinguish two cases. If the hardest emission is in the PS domain already, there is no reason to restrict the event generation further by disallowing MI above a certain scale. In effect, the configuration is taken from the evolution of the 0-jet ME sample, while removing it from the 1-jet sample by vetoing configurations with ρ MI > ρ 1,reclus in the trial showers. Finally, the emission with ρ 1 can be in the matrix element phase space. Adding one secondary interaction will produce a state of two correlated 2 → 2 processes. Since no matrix elements can include this state, it is unambiguously inside the PS region, even without applying additional constraints related to a merging scale. This reasoning leads us to define the cross-over of ME and PS domains by a phase space cut for emissions, or the existence of more than one 2 → 2 process. Coming back to the example, we will generate this state from the 1-jet matrix element by adding a secondary scattering. In order to avoid double counting, in trial showers off reconstructed states, we veto the event if the trial emission resulted in ρ 1,reclus > ρ MI > ρ 2,reclus . This example illustrates the algorithm and sheds light on how particular configurations are generated. The bottom line is that every event where the n hardest (according to the parton shower ordering) partons can produced in one of the matrix elements samples, it will be taken from this sample. Hence, we are still true to the philosophy of Ckkw-l merging. Note that in this publication, we will only consider merging matrix elements with additional QCD-induced jets. Therefore we will e.g. treat photon radiation in the shower in the same way as to multiple interactions.
To validate our algorithm, we chose to implement an alternative treatment of multiple interactions, which is similar to the prescription applied in SHERPA [15]. For this, we exclude multiple interactions when performing trial showers on reclustered states, keeping only the shower emissions in the Sudakov form factors. Then, when showering the matrix element configurations, we allow additional interactions below the scale ρ 1 of the reclustered 2 → 2 process. For the +0 jet contribution, we choose ρ 0 = t MS as maximal scale. Differences between both treatments are investigated in section 5.1.

The algorithm step-by-step
After choosing a parton shower history for the matrix element state, the weight the parton shower would have generated while evolving to this state has to be calculated. This includes the running of α s in the shower, the no-emission probabilities generated by choosing particular splittings and the way parton distribution functions guide the space-like evolution. In the Ckkw-l scheme, a seamless inclusion of ME configurations into the parton shower is then achieved by reweighting the state with the parton shower weight where ρ i are the reconstructed scales of the splittings. The first PDF ratio in eq. (4.4) means that the total cross section is given by the lowest order Born-level matrix element, which is what the non-merged PYTHIA8 shower uses. The PDF ratio in brackets comes from of the fact that shower splitting probabilities are products of splitting kernels and PDF factors. The running of α s is correctly included by the second bracket. Finally, the event is made exclusive by multiplying no-emission probabilities. In our implementation, we chose to reorder the PDF ratios according to eq. (4.5), so that only PDFs of fixed flavour and x-values are divided, thus making the weight piecewise numerically more stable. The algorithm to calculate and apply this weight can be summarised as follows: I. Produce Les Houches event files (LHEF) [16] with a matrix element generator for n = 0, 1 . . . N extra jets with a regularisation cut-off, t MS , typically using a fixed factorisation scale, µ F , and a fixed α sME .
II. Pick a jet multiplicity, n, and a state S n according to the cross sections given by the matrix element generator.
1. Find all shower histories for the state S n , pick a sequence according to the product of splitting probabilities. Only pick un-ordered sequences if no ordered sequence was found. Only pick incomplete paths if no complete path was constructed.
2. Perform reweighting according to eq. (4.5): ii. Calculate the weight factor 3. Start the shower from S n at ρ n , giving a state R n+1 with the scale ρ R n+1 .
i. If n < N , and R n+1 was produced from S n by QCD radiation, and k ⊥ (R n+1 ) > t MS , reject the event and start again from II. Otherwise, accept the event and the emission and continue the shower. If a multiple interaction was generated, keep it and continue the shower without restrictions.
ii. If n = N , continue the shower without vetoing.
III. If the event was not rejected, multiply the event weight by IV. Start again from II.
Our merging approach is, with dynamically generated Sudakov factors, tailored to always reproduce what PYTHIA8 would most probably have done to arrive at the current configuration. Starting scales are of course no exception. Thus, we will start (trial) showering of electroweak 2 → 2 processes at the kinematical limit √ s, both for radiation and for multiple interactions, which is the default procedure in PYTHIA8. In this way, the question for a starting scale of multiple interactions when merging additional emissions is irrelevant.
For jet production in the pure QCD case, by default we set the transverse momentum of the outgoing partons in the 2 → 2 process as starting scale in the shower and multiple interactions. This should be adequate as long as the merging scale is not too small. For very small merging scales we have the option of including a Sudakov form factor giving the probability that no additional scatterings are produced between the maximum scale, √ s, and the transverse momentum of the 2 → 2 process. This would make the primary process exclusive, in the sense that we make sure that there are no harder scatterings in the event.
Note also that in pure QCD, the Born-level 2 → 2 process is in itself divergent and we must introduce a cutoff regularisation. This cutoff need not be the same as the merging scale. In fact we will here choose a much lower scale to avoid having a large fraction of the reclustered multi-jet ME-states ending up below the cut and resulting in un-ordered paths. In addition, the procedure must be changed slightly since also the scale of the reclustered 2 → 2 state is included in the classification of un-ordered histories.
In all cases, we implemented the scale settings such that user choices (e.g. forcing "power showers") are always transferred to the trial showers off 2 → 2 processes. For higher-order tree-level matrix elements, we use the reconstructed splitting scale of the state as starting point.
When comparing alternative MI treatments, special care is required when setting the starting scale. For the SHERPA-inspired prescription, we will set the scale ρ 1 of the reclustered 2 → 2 process as the MI starting scale for states S +n>0 , and allow multiple scatterings below ρ 0 = t MS for the +0 jet matrix element contributions.

Results
We have implemented the necessary code for Ckkw-l merging in PYTHIA8, where it has been publicly available as of version 8.157.
In the following, we will first show some validation plots on parton level for jet production in e + e − collisions and weak boson production at hadron colliders. We then move to more realistic observables for these processes, and compare to data. Thereafter, di-boson production and pure QCD jet production are examined.
As input matrix element kinematics, we choose Les Houches Event Files generated with MADGRAPH/MADEVENT and the following settings 1 : • CTEQ6L1 parton distributions used for hadron collisions.
• Fixed factorisation scale µ F set to M W for W+jets, M Z for Z+jets, M W + M Z for WZ+jets and M Z for pure QCD di-jets.
with D = 0.4, to regularise the QCD divergences and act as merging scale t MS .
• Require p T,ℓ > 20 GeV in Z+ jets to avoid low momentum in γ propagators.
• Require p T,j > 10 GeV in QCD di-jet events.
For brevity, we will refer to results of merging of up to N additional jets as MEN PS.
Contributions for a fixed number n ≤ N of jets from the matrix element will be indicated by a superscript n, as in ME n N PS. Also, we will write PYTHIA8 when talking about the default PYTHIA8 behaviour. For all distributions, we use routines of the fastjet package [17] to define and analyse jets. If not otherwise indicated, we present plots at the parton level, i.e. after shower and multiple interaction evolution, since merging effects are more visible without smearing due to hadronisation.

Validation
We begin by considering the simplest case, with only one extra parton added to the Bornlevel state. This is a very useful benchmark for any matching or merging algorithm, as emphasised in [5], because many parton shower programs, such ARIADNE and PYTHIA8, implement directly the tree-level matching by modifying the splitting functions for the first emission. Hence, when comparing a merged parton shower with the matched one, it is very easy to see if the merging algorithm, for example, has any non-trivial dependence on the merging scale.
Merging scale dependence in e + e − → jjj The PYTHIA8 parton cascade by default includes reweighting of the first splitting of the hard process with the correct matrix element expression, thus giving an excellent handle to check our implementation of e + e − → jets. To compare our result with PYTHIA8, we however have to make a minor change to the shower. When supplied with a e + e − → qq state, PYTHIA8 will use the three body matrix element as splitting kernel for the first splitting of q and the first splitting ofq. This is done since the e + e − → qqg matrix element provides a better estimate of the dipole splitting kernel than the DGLAP kernel. However, when starting from e + e − → qqg input, PYTHIA8 will use DGLAP kernels in the evolution of the quarks. Thus the showers response to LHEF input of e + e − → qq and e + e − → qqg will slightly differ when constructing additional jets. Since we want to merge also higher jet multiplicities with the PYTHIA8 cascade, it is natural to exclude the improvement in the e + e − → qq case, and switch off the usage of matrix element correction weights for more than three final partons. In the most recent versions of PYTHIA8, such a switch is available for user input. Doing this, we can compare ME1PS with PYTHIA8. The variable used as a separation cut t MS between matrix element and parton shower domains is most sensitive to the implementation of the merging procedure. In Figure 3, we show the value of k ⊥ for which three jets would be clustered to two jets. As desired, we find excellent agreement, and, when examining different values of the separation cut t MS , vanishing merging scale dependence.  Merging scale dependence in pp → V + 1 jet Similarly, the implementation of V+1 jet merging can be validated against default PYTHIA8. In accordance with the discussion above, we switch off additional matrix element reweighting factors in default PYTHIA8 after the first initial state emission. Further, it is important to note that in PYTHIA8, infrared divergences in space-like splittings are regularised by shifting the denominator of the integration measure in the evolution equation by a small ρ reg . This shift is inspired by the interleaved evolution of space-like splittings and multiple interactions, where colour screening will dampen the number of interactions. Not strictly perturbative effects like these will be present in the default PYTHIA8 distributions, even at p ⊥ ≈ O(10 GeV). That the merging is well under control is shown in Figure 4, where we set ρ reg = 0 for the first splitting in default PYTHIA8 to remove the deliberate mismatch in integration measures. We then find complete agreement in the k ⊥ distributions.

Influence of the prescription on how to choose a shower history
That different prescriptions to choose amongst reconstructed histories differ only by subleading terms is exemplified in Figure 5. We see a small merging scale dependence when always choosing the history with the smallest sum of transverse momenta. The smallness of the effect stems from the fact the probabilistic choice -on average giving the "correct" shower history -is dominated by a 1 ρ factor, so that picking a history by lowest scale ρ  Variation when changing the starting scales for un-ordered histories In the following, we refrain from setting the infrared regularisation parameter ρ reg to zero. When facing histories with unordered emission sequences, different ways to assign an emission scale to the combined splitting are conceivable, as discussed in section 4.1.
To investigate this we turn to two-jet merging, the lowest non-trivial jet multiplicity at which non-ordered histories may occur. Figure 6 highlights that when choosing the lower scale as a common scale, the transverse momentum of the second jet has a harder tail compared to setting the higher of both scales as the scale of the combined emission. Also, back-to-back jets are more prominent. This is an effect of the reweighting with a running coupling constant, which produces a more pronounced enhancement of the cross section when choosing smaller scales. For all further results, we will use the larger scale when evaluating α s (ρ).
Variation due the choice of starting scales for incomplete histories ρ 0 = s = (1960 GeV) 2 allows to conclude that the dependence on the starting scale for incomplete emissions is negligible, which reflects the fact that the corresponding states are very rare.

Differences between treatments of multiple interactions
Different treatments of multiple interactions are presented in Figure 8, which illustrates that at the LHC, variations of up to 10% may occur between the default Ckkw-l recipe and the SHERPA-inspired alternative. Due to the phase space restriction ρ 0 = t MS for additional scatterings in Z + 0 jet matrix element samples, the alternative treatment produces fewer multiple interactions. Thus, the k ⊥1 spectrum for intermediate scales 15 GeV < k ⊥1 < 30 GeV is softer than the Ckkw-l result. At scales k ⊥1 > 50 GeV, the two prescriptions become indistinguishable. The behaviour at low scales is also anticipated, since the alternative sample does not include suppression due to MI no-emission probabilities. Since these are present in default PYTHIA8, the alternative recipe exhibits a higher maximum, whereas the default prescription reproduces the showers low scale features closely.
Our goal when developing a generalisation of the Ckkw-l method including interleaved showers was to be as similar for low scales to the event generator as possible, meaning that the modelling of PYTHIA8 in regions where multiple interactions are important should be left unchanged. As pointed out in 4.2, this can be formally be achieved in PYTHIA8 by employing the "Ckkw-l" prescription. The discussion of the last paragraph also showed that in the implementation of the method, low scale features of PYTHIA8 are retained. Hence, we choose the "Ckkw-l" prescription of adding the influence of multiple scatterings as the default. As can be inferred from Figure 16 below, this method succeeds in not changing the underlying event description of PYTHIA8.
Because in weak boson measurements at low scales, the shape and position of maxima is unchanged in the Ckkw-l approach, we also minimise the need for changes of some tuning parameters, e.g. primordial p ⊥ . This is not obviously true for the alternative method, in which some changes in primordial p ⊥ might be necessary. Meanwhile, once hadronisation is added and experimental cuts are applied, Z+ jets observables at the Tevatron show only little dependence on the strategy how multiple interactions are included in merged samples.

Unitarity violations
We finish our validation by discussing a theoretical issue. Parton shower resummation alone does not change the cross section of the hard process, since the probability of having no emission, together with the sum of probabilities to evolve into states with an arbitrary number of emissions adds to unity -a property dubbed unitarity. This however is only true if the transition probabilities used in generating additional emissions are identical to 1.5⋅10 -9 2.0⋅10 -9 2.5⋅10 -9 dσ/d∆φ 12 [mb] the terms exponentiated in Sudakov form factors. As pointed out in [18,19], unitarity is violated by tree level merging due to the fact that the transition probabilities above and below t MS are different, while Sudakov factors are always generated with shower splitting kernels, i.e. the transition probabilities below t MS . The magnitude of the resulting unitarity violations for different merging scales is assessed for W+ jets in Table 1. We have also verified that the main points of the following discussion apply to all example processes used in this report. First, we note that including one additional jet does not lead to unitarity violations for vector boson production, since PYTHIA8 is already matrix element corrected, so that the full tree-level splitting probability is exponentiated. When including more than one jet, we observe smaller deviations from the hard process cross section as we increase the merging scale. This is expected since for larger t MS , the Sudakov form factors generated by trial showering quickly approach unity. Because of the higher merging scale, phase space regions with low scale emissions (where Sudakov factors differ from unity) are generated by the parton shower. Thus, identical splitting probabilities are used to generate the emissions and Sudakov form factors, and unitarity is preserved to reasonable accuracy. One immediate consequence is that we should not choose t MS too low, since otherwise, sizable violations      can lead to different regions of the full phase space -which includes unordered emissionsbeing neglected in the parton shower approximation. These regions of unordered emission sequences are formally beyond the accuracy of the shower. Figure 9 shows the differences in transverse momentum distributions between merged distributions and PYTHIA8, for two different ways of ordering emissions. Deviations from unitarity are more significant if the shower evolution is ordered both in ρ and rapidity. This is due to neglecting larger regions of the full phase space in the parton shower. We have verified that when only keeping ME configurations for which a history ordered in ρ and rapidity can be constructed, unitarity violations are greatly reduced. Nonetheless, we have to conclude that ordering the cascade in these two variables makes the parton shower approximation worse than ordering in ρ alone. Only ordering in ρ, we find in Table 2 that the inclusive 2 → 2 cross section is not changed drastically when including additional jets. Also, the k ⊥1 spectrum becomes only slightly harder in this case, as is seen in the right panel of Figure 9.
It should be noted that the rapidity ordering was introduced in PYTHIA8 to suppress the high transverse momentum emissions from dipoles between incoming and outgoing partons. However, this is now achieved through another damping mechanism described in [20], which means that the rapidity ordering is no longer needed to achieve a reasonable description of data.
We checked that deviations from unitarity can be even further reduced when excluding unordered emissions. However, in Ckkw-l, we want to include states which are out of the reach of the shower, and thus, as discussed in section 4.1, by default keep ME configurations for which only unordered histories can be found. For enthusiasts, switches for rejecting configurations with unordered histories are available in the public code. Provided considerable unitarity violations remain after excluding differences between the full allowed and the PS phase space, this could suggest large higher-order corrections, since by choosing a low merging scale we effectively include major parts of the real emission phase space of an NLO calculation [21]. It should be noted that unitarity is a parton shower concept and need not be fulfilled in other contexts, see e.g. [22].  Merging procedures aim for a better description of well separated jets in the parton shower. Historically, angular correlations in e + e − → 4 jet production have been used to investigate the 3-gluon vertex. The description of these observables should be improved when including additional jets. More specifically, we look at the the (modified) Nachtmann-Reiter angle 1) and the angle between the two lowest energy jets

e + e − four-jet observables
where p i are the energy ordered three-vectors of the jets. As shown in Figure 10, the default PYTHIA8 description of these observables is fairly good to start with, reflecting the fact that some azimuthal correlations are included in the shower, and it is only slightly changed when merging additional jets. We notice that |cos θ N R | becomes slightly worse when including additional jets. The different shape of the generator curves can be explained by the fact that the data was corrected to the parton level, whereas the MC samples where generated with full hadronisation. In |cos θ N R |, the hadronisation corrections [23] would change the MC shapes towards a better agreement. cos α 34 is captured slightly better for cos α 34 ≈ −1, when including additional jets. The trend to overshoot at cos α 34 ≥ 0.5 can again be explained by the fact that we have generated the distributions at the hadron level, whereas the data was corrected to the parton level. We have checked by excluding hadronisation that these statements are true, and that the irregularities are reduced. However, the general trends in both |cos θ N R | and cos α 34 remain, albeit less pronounced. This might be explained with the fact that the hadronisation corrections applied to the data are estimated with a model different from the one used by PYTHIA8. Since the cross-over from partonic to hadronic states is a highly model-dependent statements, artifacts of the model used to estimate corrections could be present in the data. Even so, we think Figure 10 illustrates that when including higher-order tree-level matrix elements in the description of e + e − → jets, changes as compared to the default shower are fairly modest, which indicates that PYTHIA8 already nicely describes observables at LEP. When checking further LEP observables, we find that Ckkw-l does as good or moderately better than default PYTHIA8. This means that when developing a new tune including additional matrix elements, the hadronisation parameters, which are predominantly constrained at LEP, may not have to be touched.

Vector boson production
In hadron collisions, we can assess the extent of change when including additional jets by looking at vector boson production with two or more additional jets. In Figures 11 and 12, we compare jet k ⊥ spectra and jet multiplicities for W production and in Drell-Yan events to data, respectively. In general, we find more jets with high k ⊥ and better agreement with jet multiplicity data. It is particularly instructive to investigate the change of k ⊥ distributions when increasing the numbers of jets in the matrix element generation. Figure 13 again shows that the k ⊥ spectra develop harder tails when including higher multiplicity matrix element configurations.
Analysing the k ⊥2 separation when two jets are clustered into a single jet in the right panel, it is interesting to see how this increase arises. For small merging scales (e.g. 10, 15 GeV), k ⊥2 in two-jet merging quickly grows at the merging scale and remains flat until a more gradual ascend sets in at k ⊥2 ≈ 60 GeV. There, the ME2PS distributions MC/data Figure 11: Jet multiplicity and transverse momentum of the hardest jet in W + jet events as measured in the electron channel by ATLAS [25]. The merging scale is t MS = min{k ⊥i } = 15 GeV. Effects of multiple scatterings and hadronisation are included. The plots were produced with RIVET [24]. MC/data Figure 12: Jet multiplicity and inclusive jet transverse momentum in Drell-Yan events, as measured by CDF [26]. The merging scale is t MS = 30 GeV. Effects of multiple scatterings and hadronisation are included, and Tune 4C was chosen. The plots were produced with RIVET [24].
for t MS = 10, 15 GeV also join the curves for larger merging scales (30,45 GeV). This behaviour of ME2PS for low t MS can more clearly be seen in the left panel of Figure 14. When inspecting the ME3PS curves for t MS = 10, 15 GeV, we again see a hardening of the spectrum, which is to some extent stable when going to ME4PS. Such a stabilisation inspires the conclusion that the k ⊥n≤N separation between the n'th and (n − 1)'th hardest   jets is stable once the maximal number of merged jets is increased above n, as was found in [21].
One possible argument for this effect is that when looking at the k ⊥2 separation at which a jet a 1 and a jet a 2 are clustered into a single jet in ME3PS, the parent jets b 1 , b 2 , b 3 which produced a 1 and a 2 were harder than in ME2PS, thus again favouring harder jets a 1,2 , i.e. larger separations, k ⊥2 . The stabilisation could then be explained by assuming that the parent jets producing b 1,2,3 in ME4PS will not greatly increase the hardness of b 1,2,3 because in ME4PS, most jets will be just above the merging scale due to a steeply falling k ⊥4 spectrum.
However, in our implementation, the question arises if a stable k ⊥2 distribution will also be stable to changing the value of the merging scale. First, notice that there is no shape change in the t MS = 45 GeV curves when going from ME2PS to ME3PS (or ME4PS), even though by the above reasoning, further distortions should be more pronounced at high merging scales. It is critical to notice (see Figure 14) that for low merging scales, the spectrum in ME4PS is significantly harder than the ME4PS reference at t MS = 30 GeV, whereas once their initial ascend is over, the curves for t MS = 45 GeV nicely join the t MS = 30 GeV ones. These observations can again be explained by unitarity violation for t MS = 10 GeV and t MS = 15 GeV, which stabilise when merging more jets, but do not decrease. Since the changed cross section is stable while the sample composition changes between ME2PS and ME3PS, the shape of k ⊥2 has to change. In support of this rationale, the right panel of Figure 14 shows that when reducing unitarity violations by not enforcing rapidity ordering in the shower, the effects are significantly reduced as well. These considerations can be applied to jet separations k ⊥n≥2 as well. Every parton shower relies on phenomenological models to confine partons into hadrons, thus making systematic tuning is a critical step in the development of an event generator. Tuning however should not hide the shortcomings due to approximations made. If residual tuning effects because of correlations between tuning parameters remain in phase space regions with well-separated jets, we expect such changes to be stabilised when correcting with higher multiplicity matrix elements. The impact of changing between different tunes in PYTHIA8 is shown in Figure 15, where we show the results of using Tune 2C, Tune 4C and forcing α s (M Z ) = 0.129783 (the CTEQ6L1 fit value) in all components of PYTHIA8, in comparison with ATLAS data [25]. We find that the pp → W+jets predictions are fairly stable with respect to changing tunes. As expected, we observe that the ME3PS sample is harder than default PYTHIA8.
Finally in Figure 16 we show the effect of our treatment of multiple interactions. The associated hadronic activity in Z production events, especially in the azimuthal direction direction of the Z, is very sensitive to underlying event effects, and hence also to multiple interactions [27]. In our merging scheme we have been very careful to make sure that multiple interactions are treated exactly the same way as in standard PYTHIA8 without inclusion of matrix element configurations. And, as seen in Figure 16, the differences between the merged sample and default PYTHIA8 are indeed very small.

Di-boson and QCD jet production
Our implementation is in principle general enough to be applied to any process that can be handled by PYTHIA8. However, in this publication, we restrict ourselves to two further examples. First, let us examine di-boson production, with one of the bosons decaying hadronically. Allowing hadronic decays of weak bosons in the underlying Born process provides another complication, and for technical reasons we here restrict the matrix element to only produce extra jets in from the incoming partons, while additional jets in the hadronic boson decay are only produced by the shower. As the first emission in the boson decay is anyway ME-corrected in standard PYTHIA8, this is not a severe restriction. Note however that this means that we have to treat emissions from the boson decay on the same footing as multiple interactions (and QED radiation). This means that they are included in the Sudakov form factors generated from reclustered states, but when showering from a n < N state, if the first emission is from the boson decay, the event is never vetoed. The partons from the boson decay are also not involved in the reclustering of matrix element states.
The performance of our implementation concerning these issues can be tested when merging pp → W + Z → e + ν e jj+jets matrix elements. The left panel of Figure 17 shows that also in the case of di-boson production, the k ⊥3 spectrum becomes harder on inclusion of additional jets. There are no visible differences in the default PYTHIA8 results when changing between only ordering emissions in evolution ρ and ordering both in ρ and rapidity, since k ⊥3 is dominated by the hardest shower emission, which is not affected  Figure 16: Toward region charged particle density and average p ⊥ in Drell-Yan events, as measured by CDF [27]. The merging scale is t MS = min{k ⊥i } = 30 GeV. Effects of multiple scatterings and hadronisation are included, and Tune 4C was chosen. The plots were produced with RIVET [24].
by the additional rapidity ordering. We observe only small differences between merged samples with and without enforced rapidity ordering in the shower. Relative changes in k ⊥3 are, as expected, comparable to the effects on k ⊥1 when including additional jets in pp → W + → e + ν e (see e.g. Figure 13). We have checked that different jet definitions do not change this trend. A consequence of harder jets can be seen in the right panel of Figure 17, where we show the di-jet invariant mass distribution with cuts and jet definition from CDF [28]. The spectrum develops a harder tail compared to default PYTHIA8. Particularly in the region 140 < m jj < 200 GeV we find an increase around 10%. Also, the distribution is sensitive to the unitarity violations due to enforced rapidity ordering, so that care has to taken when comparing MEPS distributions to experimental data. In [28], the shape of the di-boson backgrounds was modelled by PYTHIA6.216, which should behave similar to default PYTHIA8. Merging additional jets in pp → W + Z → e + ν e jj can affect the shape of the di-jet invariant mass spectrum in a way which will reduce the significance of the effect found by CDF. We plan to further investigate these issues in a future publication.
Finally, we examine QCD jet production. For such events, we set the shower starting scale for the 2 → 2 process to the transverse momentum of the outgoing partons. The maximal scale for secondary scatterings is set to the same value. In PYTHIA8, users are generally allowed to choose a different prescription of setting a maximal scale of multiple interactions, e.g. the energy of the colliding hadrons. Adopting this example, we risk double-counting configurations, since interactions identical to the hard process can be generated.
To remove this double counting, an additional veto on the transverse momentum of multiple interactions in the trial shower has to be applied. We have checked that when  [28]. Jets were defined with the CDF JETCLU algorithm [29] as implemented in fastjet.
allowing secondary scatterings up to the kinematical limit and applying a veto, distributions are not changed with respect to setting ρ MI,max = p ⊥,2→2 . The results presented here have been produced with fixing the starting scales for the hard process to the transverse momentum, as is the default in PYTHIA8. Figure 18 shows that for QCD jets as well, inclusion of additional jets increases the hardness k ⊥3 of the third jet. Compared to the changes in k ⊥1 for W+jets, the effect is, however, moderate. This is in accord with the findings in [20], which showed good agreement in the p ⊥ of the softest of three partons (there called p ⊥5 ), when comparing 2 → 3 matrix elements to the default shower after the first emission from a 2 → 2 core process. There, the shower was slightly harder than the matrix element until p ⊥5 ≈ 80 GeV. A similar effect can be seen in the k ⊥3 separation of jets, which is related to the p ⊥5 of partons.
The inclusion of a sample with two additional jets does not change the situation dramatically, leading us to the conclusion that once the first few hard jets are generated according to the tree-level matrix elements, the parton shower does a fairly good job in describing the hardness of additional jets. This is supported by the upper panel of Figure 19, showing the k ⊥3 separation between the third and second hardest jets in W+jets events. Clearly, there are only little changes in the hardness of the third jet when going from ME2PS to ME4PS, i.e. the merging has less impact once a couple of jets are included from the matrix element states 2 . Coming back to pure QCD, we show in Figure 20 that also in this case, results for k ⊥3 are fairly stable when changing the merging scale. We register only small unitarity violations of O(10%), which matches the changes in k ⊥3 in W+jets events when going from ME2PS to ME4PS without requiring rapidity ordering, as illustrated in the upper part of Figure 19. As W + n jets contains colour configurations similar to di-jet+(n − 2) jets, this is another indication for the consistency of the implementation. However, in Figure 20, we find only minor changes between different treatments of rapidity ordering for di-jet events, whereas for W+jets events, we observe dramatic effects (see lower plot in Figure 19). This can be explained by the fact that when requiring rapidity ordering, PYTHIA8 orders all emissions after the first shower emission in rapidity, meaning that for di-jet events, k ⊥3 is virtually unaffected by the constraint, while in W+jets events, major restrictions on the phase space of the second and third jet lead to large unitarity violations. This argument is substantiated by Figure 21, which shows that once rapidity ordering becomes relevant, the additional ordering results in larger deviations for low merging scales.
It is worth noting that since jet spectra are not changed dramatically when including additional jets, only small differences are expected when comparing to experimental data.
violations -which should be avoided anyway. Hadronisation and multiple interactions were switched off. Curves with enforced rapidity ordering in the shower carry the label "y-ordered", while results without explicit rapidity ordering are labelled "y-unordered".
In Figure 22, we examine the description of CDF jet shapes [30] for two exemplary p jet ⊥ bins. For a p jet ⊥ in 55 GeV < p jet ⊥ < 63 GeV, we find only very minor changes. However, for higher p jet ⊥ in the region 128 GeV < p jet ⊥ < 148 GeV the differences between default PYTHIA8 and the merged sample ME2PS with two additional jets are more pronounced, and we see that the latter gives a slightly broader shape. This is expected as at high transverse momentum the effect of the harder third jet in Figure 18 should come into play, resulting in more jets containing two partons from the matrix element. Such jets are of course broader.
When checking differential jet shapes for other p jet ⊥ bins, we find that ME2PS does as good or slightly better than PYTHIA8 for p jet ⊥ 120 GeV, while decreasing too slowly for p jet ⊥ 120 GeV. This indicates that at least some revisions need to be made when tuning matrix-element-merged PYTHIA8 to pure QCD jet data. Since the influence of multiple interactions on jets with p jet ⊥ 120 GeV is likely to be small, a possible new tune would potentially feature changes in α s (M Z ) and other parameters to prescribe the physics of hard jets.

Conclusions and Outlook
We have implemented Ckkw-l merging inside the PYTHIA8 framework, and have shown that it works well for several sample processes: e + e − → jets, (di-) boson and pure QCD jet production in hadronic collisions. The implementation is, however, quite general and could be used for any process which PYTHIA8 is able to handle.
The algorithm is true to the Ckkw-l spirit, in that if matrix element samples are provided for up to N extra partons, every event where the n ≤ N hardest (in the parton shower sense) partons can be produced by the matrix element, it will be evolved from the corresponding matrix element state.
By construction, the dependence on the merging scale vanishes to the logarithmic precision of the PYTHIA8 parton shower. Nevertheless, we find visible sub-leading effects due to different choices that can be made in the procedure. In particular we have investigated • different ways of choosing parton shower histories, • different strategies for handling unordered histories, • different starting scales for incomplete histories, • different options for including multiple scatterings.
In all these cases we found the effects to be small. Hadronisation and multiple interactions were switched off. Curves with enforced rapidity ordering in the shower carry the label "y-ordered", while results without explicit rapidity ordering are labelled "y-unordered".
However, we found that in some cases there are large merging scale dependences from unitarity violations. These problems have been noted before in other Ckkw-based algorithms [18,19], and arise from the fact that what is exponentiated in the Sudakov form factors is only the parton shower approximation to the matrix elements, rather than the matrix elements themselves. In addition, the phase space integrated over in the Sudakov may differ from the full phase space available to the matrix element.
For our implementation, one would expect the unitarity violations to be diminished in the cases where PYTHIA8 already include a matrix-element reweighting of the first parton shower emission (similar to what is done in POWHEG [12,31]). However, we found that the effects on the contrary are very large, and traced the reason for this to the fact that the default tune of PYTHIA8 uses a rapidity ordering for the initial-state shower in addition to the ordering in the transverse momentum evolution scale. This results in a severe restriction of the phase space over which Sudakov form factors are integrated, giving increased merging scale dependences. When removing the rapidity ordering, the unitarity violations are reduced to an almost negligible effect.
An important result of our investigations is that the PYTHIA8 shower (without enforced  Figure 22: Jet shapes in QCD events as measured by CDF [30]. The merging scale is t MS = min{k ⊥i } = 30 GeV. Effects of multiple scatterings and hadronisation are included. The plots were produced with RIVET [24]. rapidity ordering) actually is quite good at describing the hardness of multi-jet events, as long as the hardest few jets are generated according to the exact matrix elements. Of course, there may be special observables related to details in the correlations between jets, where merging with high-multiplicity matrix elements is still necessary to get a correct description, but for the main features of multi-jet event it seems to be enough to merge with a limited number of extra jets. Before our Ckkw-l implementation can be used for reliable predictions and comparisons with experimental data, the parameters of PYTHIA8 need to be retuned. We have shown that for e + e − → jets, the merging with multi-jet matrix elements barely changes the description of data, and we can assume that the parameters for the hadronisation and final state showers will not need to be substantially changed. Furthermore, for pure QCD processes in hadronic collisions, the effects of multi-jet merging are again very modest, except for very high transverse momentum jets, so also for minimum bias and underlying event observables the tuning needed can be assumed to be minor. On the other hand, for electro-weak processes and for very hard jets in pure QCD processes in hadronic collisions the merging gives quite substantial effects, which means retuning is necessary. To get stable results, this new tune should be done without the rapidity ordering discussed above. element configuration. In this way, for non-zero weighted events, every ME configuration is treated exactly as in the parton shower. It is also clear that panel (d), a configuration with two jets which are soft in t should come from the parton shower. Panel (c) provides the next complicated region. Since region (a) is already correctly accounted for, we veto shower emissions that would evolve the state into one in (a). This means that all states which evolved from a matrix element state with one jet above t MS , and which have nonzero weight, i.e. with the first emission below t MS or no emission at all, are produced by PS evolution. As outlined in section 4.3, Sudakov form factors below the reconstructed evolution scale of the ME emission are added by using the full shower when producing trial emissions. We ensure in this way that form factors are added to the ME one-jet configuration in the correct parton shower manner.
To our understanding, the truncated shower approach and Ckkw-l only differ in their treatment of panel (b). Let us clarify this statement. In Ckkw-l, we define the ME region to contain the n ≤ N hardest jets in the evolution variable ρ, which are also above the cut in t. Once we are inside the PS region, we believe the shower is performing well, so that all further emissions will be taken from the parton shower. The first emission in panel (b) is already in the PS region. Thus, every further splitting is taken from the shower. For the example this means that this two jet state is generated from the 0-jet ME sample. Since the shower is the only ingredient in how the state is produced, the accuracy of the shower is preserved.
Truncated showers differ, in that they allow what we call a pure PS state to evolve into something which could have evolved from a ME 1-jet state. Here, the emission is kept, the reclustered state changed, and evolved further until the scale ρ 2 is reached. Then, an emission with the reconstructed ME splitting variables is forced. In this way, the path how the state was reached is correctly described, and the accuracy of the shower is retained. Truncated showering is allowed if the emissions that were inserted before the hard emission are soft and did not change the flavour of the line that will emit the hard jet.
This example reveals the different philosophies behind the Ckkw-l and Truncated Shower approaches. In Ckkw-l, a compromise is made in that only the n ≤ N emissions hardest in the evolution variable, and above the merging scale cut t MS , are corrected with matrix element configurations. Thus, in comparison to using truncated showers, a smaller region of phase space have a matrix element structure. However, we are allowed to use the full shower to generate no-emission probabilities.
When using truncated showers, the flavour of the splitting lines has to be conserved in order to be able to attach the ME emission, i.e. truncated showers only allow gluon emissions. Also, splittings in the truncated showers cannot be allowed to remove too much momentum from the line, since otherwise, the ME emission could not be forced. These restrictions make the Sudakov form factors differ slightly between the full shower and the truncated shower, though differences are sub-leading and might be tiny in an actual implementation.
Summarising, we believe that both Ckkw-l and the Truncated Shower approach have to compromise in regions with t MS -unordered splittings. In Ckkw-l, only the hardest partons in the evolution variable will be corrected with tree-level matrix elements, as long as they are above t MS as well. This effectively means that the shower evolution variable should be some measure of hardness, since otherwise, only small regions of the relevant phase space will be endowed with corrections. Choosing e.g. a shower with an ordering variable defined by angles would not be suitable. Truncated Shower prescriptions allow correcting larger parts of the phase space with ME configurations, though at the expense of compromising in the generation of Sudakov form factors. This approach is particularly suited if the evolution variable does not provide a hardness measure, since then, the differences in the Sudakov form factors are vanishing, while large fractions of the phase space can be described by ME emissions. Since the evolution in transverse momentum provides a good hardness measure, Ckkw-l provides a natural merging scheme in PYTHIA8.

B. Reconstructing shower splitting probabilities and intermediate states
In a numerical fixed order calculation, different Feynman graphs can contribute to a particular phase space point. The analogue in a parton shower is that a multitude of different sequences of shower splittings can fill the same phase space point. We describe in this appendix the construction and choice of parton shower histories. The prescriptions below are implemented in PYTHIA8, with the code being publicly available from version 8.157 onwards. Given a matrix element state S +n , all possible intermediate states, splitting probabilities and splitting scales are reconstructed. We first detail how splitting probabilities are calculated and used to choose a particular path of shower splittings. We will after this outline how intermediate states are constructed.

B.1 Calculation of splitting probabilities
When assigning a parton shower history to a matrix element state, we have to decide on how to choose amongst all possible splitting sequences. Our choice of a suitable discriminant between these "paths" is guided by the collinear factorisation of n-particle matrix elements: dσ n = L n (x n , t n )F n |M n | 2 dΦ n ≈ α s 2π 1 Q 2 P (z)L n (x n , t n )F n |M n−1 | 2 dk 2 where F m is the flux factor and L m the parton luminosity for the m-parton final state using the factorisation scale t n , while (k 2 ⊥ , z, φ) are the splitting variables, Q 2 the virtuality of the splitting parton, and P (z) is the DGLAP splitting kernel for the splitting. The integration measure is given by where dφ m is the m-particle phase space volume and x ± m are the momentum fractions of the incoming partons moving in ±z direction. Using the fact that L n (x n , t n ) = x + n f + n (x + n , t n )x − n f − n (x − n , t n ) , with L n (x n , t n ) = L n−1 (x n−1 , t αs 2π x + n f + n (x + n ,tn) x + n−1 f + n−1 (x + n−1 ,t n−1 ) P (z) ρ dk 2 ⊥ dz dφ 2π dσ n−1 for ISR.

(B.4)
To illustrate initial state radiation, we have here chosen the parton moving along +z direction to split. Iterating this procedure down to the desired Born-level state represented by m = 0, we can construct one path of collinear splittings by which we may have arrived at the n-particle state. We can use the sum over all different possible paths p, where (ρ 2 ip , z ip , φ ip ) and P ip are the reconstructed splitting variables and splitting function for the i'th splitting in the path p, as an approximation of the n-parton cross section. The PDF ratio will equal unity for final state splitting. We can make this correspondence exact for the very first splitting, by adding matrix element corrections to the splitting kernel and finding a common integration measure for the joined evolution along the possible paths, as will be addressed in the following. The very first emission can be attributed to either a splitting of dipole end "1", or of dipole end "2". If the momentum of dipole end i ∈ {1, 2} after the splitting is p i , and the momentum of the emitted parton is p 3 , we define α s 2π The common integration measures for both paths were defined by introducing the variables For initial state splittings, the weight takes a more complicated form since in PYTHIA8, infrared singularities are regularised by the introduction of a small scale ρ reg . This is inspired by the regularisation of multiple interactions using arguments relating to colour screening effects [20]. For vanishing ρ reg , the weight for the first splitting of initial particles again takes the form given in eq. (B.4). Note that we keep α s fixed, as the running of α s is corrected for later in the algorithm. Also, the change in incoming parton content, compensated by ratios of parton distributions, will be corrected later on. Hence, the weight for each splitting should not contain α s or PDF ratio factors. The product of these weights of individual splittings in a path will then be used as the weight when choosing a path with the normalised probability In section 5, we compare this probabilistic prescription with a winner-takes-it-all strategy based on the smallest sum of transverse momenta, and observe minor, though visible, differences.

B.2 Reconstruction of intermediate states
Given an n-parton phase space point S +n from a matrix element generator, we explicitly construct all possible intermediate states S +0 . . . , S +n−1 in all paths p by reclustering allowed shower emissions. For the construction of the state S +i , given that we have the state S +i+1 , this rather complicated step is achieved by inverting all the changes the shower would have applied in the construction of the emission. This means that we need construct 1. The underlying momentap = {p 0 , . . . ,p k+i } from the momenta p = {p 0 , . . . , p k+i+1 }; 2. The underlying flavour configuration F +i from the configuration F +i+1 ; 3. The underlying colour configuration C +i from C +i+1 .
In the following, write "before" for values before the clustering, and "after" for values after the clustering.

Reclustering of momenta
The construction of the reclustered momentap from the momenta p is achieved by exactly reverting all changes PYTHIA8 would have done if the showers would have constructed an emission resulting in the momenta p. Formally speaking, this means that we invert the radiative phase space mapping of the shower. The construction of the momentum of an emission in PYTHIA8 differs between initial state and final state splittings, leading to different prescriptions how underlying kinematicsp are constructed.
q → g r q emt ,q → g rqemt (ISR) set the g r colour (anticolour) to the anticolour (colour) of the emitted parton. Set the reconstructed gluon anticolour (colour) to the anticolour (colour) of the radiating parton.
Once a pair of radiating and emitted partons is chosen, these rules can be applied to deduce the colour configuration C +i of the state S +i . Combining the inversion of the parton shower kinematics, the construction of the underlying flavour configuration and reclustering of colours, the complete state S +i can be generated. We have extensively tested that, given a state S +n , our implementation exactly reproduces all states S +(m<n) , if the states S +m+1 , . . . S +n were generated by shower splittings, verifying that we have used the exact inversion of the radiative mappings of PYTHIA8.