Merging meets matching in MC@NLO

The next-to-leading order accuracy for MC@NLO results exclusive in J light jets is achieved if the computation is based on matrix elements that feature J and J+1 QCD partons. The simultaneous prediction of observables which are exclusive in different light-jet multiplicities cannot simply be obtained by summing the above results over the relevant range in J; rather, a suitable merging procedure must be defined. We address the problem of such a merging, and propose a solution that can be easily incorporated into existing MC@NLO implementations. We use the automated aMC@NLO framework to illustrate how the method works in practice, by considering the production at the 8 TeV LHC of a Standard Model Higgs in association with up to J=2 jets, and of an e\nu_e pair or a t\bar{t} pair in association with up to J=1 jet.


Introduction and definitions
Let us consider the process P 1 + P 2 −→ S + M jets , (1.1) where P 1 and P 2 can be either hadrons or leptons, and in order to simplify the discussion we assume that the final state is defined at the parton level (that is, before the hadronization phase in an event generator). S is a set of particles which does not contain any QCD massless partons, and the M jets are light, i.e., obtained by clustering light quarks and gluons. If the definition of a given observable O explicitly involves J jets, with 0 ≤ J ≤ M (with or without a further dependence on the four-momenta of the particles in S), we refer to such an observable as exclusive in J jets, and inclusive in the remaining M − J jets; when J = 0, the observable is typically called fully inclusive. In Monte Carlos (MCs) based on the leading-order (LO) approximation, the description of the process in eq. (1.1) stems from the underlying tree-level matrix elements relevant to the process a 1 + a 2 −→ S + i partons , with a 1 and a 2 partons or leptons (according to the identities of P k ), and i a fixed number with 0 ≤ i ≤ M ; the shower must generate at least M − i partons, in order to obtain a final state that can contribute to eq. (1.1). The most straightforward approach is that of using matrix elements with i = 0 (which, for later use, we suppose to be of O(α b S )), and thus of letting the shower generate all partons. The apparent simplicity of this procedure has the drawback that cross sections are severely underestimated in the hard regions (i.e., for energetic and well-separated jets). This problem, which obviously worsens when increasing the c.m. energy, stems from the fact that showers give only a leading-logarithmic (LL) approximation to the full matrix elements that would be the correct description in those phase-space regions. As a result, one may say that with i = 0 one obtains a LL accuracy for observables exclusive in J jets, with J ≥ 1; for fully-inclusive observables, the accuracy is LO+LL. In order to partly remedy to the lack of hard emissions in Monte Carlos, the shower scale (which is directly related to the largest hardness attainable) can be increased in the latter to very large values. While this is formally correct, it entails the stretching of Monte Carlo simulations outside their range of validity, and thus a loss of predictive power. If one is interested in J-jet exclusive observables, for J given and fixed, it is best to use matrix elements with i = J, which promotes such observables to LO+LL accuracy. This has the disadvantage that the event samples generated in this way cannot be used for observables exclusive in less than J jets, since in doing so one would be forced to integrate matrix elements over soft and collinear regions, where they diverge. There are technical implications as well. Firstly, the matrix-element divergences must be avoided by introducing (unphysical) cutoffs, chosen so as not to bias physical predictions. Secondly, one requires that the largest jet hardness generated in the shower phase be less than that at the matrix-element level, which can be typically done with a suitable choice of the shower scale. This requirement is not necessary in order to avoid double counting (which cannot occur in this context: since each shower emission is associated with one power of α S , after a single emission one has already a contribution of relative O(α i+1 S )), but rather in order to maintain the expected LO+LL accuracy of the J-jet exclusive observables. The two issues discussed here are specific manifestations of the general problem of the matching between matrix-element calculations and parton shower simulations.
A first improvement on the situation presented in the preceding paragraph aims at the LO+LL-accurate description of fully inclusive and 1-jet exclusive observables, in the context of a single simulation; the corresponding techniques are generally known as matrix element corrections (MECs), although this terminology has been lately adopted to identify other, and more sophisticated, approaches. MECs rely on the computations of i = 0 and i = 1 matrix elements, which are combined by either "switching" between the two descriptions they underpin [1,2], or by directly including the information on i = 1 in the showers, which is still initiated starting from an i = 0 configuration [3][4][5]. These consistent combinations of matrix elements characterized by different multiplicities are the first examples of merging procedures 1 .
Extensions of MECs to arbitrarily-large multiplicities have attracted considerable interest, in view of their relevance to Tevatron and LHC phenomenology, where multi-jet processes are ubiquitous. There are now well-established procedures, such as CKKW [6,7], CKKW-L [8,9] (and their later improvements, see refs. [10][11][12]), and MLM [13]. These merging approaches are based on the computations of all tree-level matrix elements (which is why they can be referred to as attaining an LO merging) with 0 ≤ i ≤ N , where N is as large as a computer can handle; for J ≤ N the J-jet exclusive observables are LO+LL accurate. Since LO-merging techniques use tree-level results, cutoffs are necessary that prevent infrared (IR) divergences from occurring. The avoidance of biases in physical results is much more difficult to achieve than in the case of the two parton multiplicities relevant to MECs, and the interplay between the merging and matching conditions is in fact non trivial. The matching can be expressed in terms of the IR cutoffs, which can be collectively (and slightly improperly) called matching scale 2 . The dependence of observables on the matching scale never completely vanishes, but in practice it is sufficiently small numerically (consistently with the proof [6] that for CKKW in e + e − collisions it is suppressed by two logarithmic powers w.r.t. the nominal accuracy of the shower). We point out that such a dependence is the unavoidable consequence of the fact that a cutoff-free cross section at relative O(α N S ) can only be achieved by a full N N LO computation (i.e., which includes loop corrections). Clever matching conditions reduce the cutoff dependence, but cannot eliminate it.
In parallel to the development of LO-merging procedures, the problem was considered of matching next-to-leading order (NLO) QCD computations with parton showers. This spurred quite a lot of theoretical activity [14][15][16][17][18][19][20][21][22][23][24][25][26], but currently the only proposals that are systematically applied to hadroproduction processes are MC@NLO [15] and POWHEG [18] (possibly in their SHERPA implementations, refs. [26] and [25] respectively). The problem is the analogue of the LO matching for J-jet exclusive observables, with J fixed, discussed above. The novelty is that the underlying matrix elements are of O(α b+J S ) and of O(α b+J+1 S ). The former are the same as those relevant to the LO matching, but play the role of Born contributions in this context; the latter include the real corrections, their IR subtraction terms, and the one-loop contributions. Although real matrix elements have parton multiplicities one unit larger than the other matrix elements involved, which is also what happens in MECs, the problem at hand is a matching, and not a merging, one. In fact, its aim is not that of enlarging the number of jet multiplicities which can be predicted to LO+LL accuracy, but rather that of promoting J-jet exclusive observables from LO+LL to NLO+LL accuracy; the fact that (J + 1)-jet exclusive observables are indeed predicted at LO+LL is beside the point, since this fact is a spinoff of the method, rather than its primary motivation. Technically, what happens is that one set of IR divergences (those arising from one parton becoming soft, or two partons becoming collinear) are cancelled without the need of introducing a cutoff. Apart from the case of zero jet at the Born level (J = 0), cutoffs are still required in order to prevent the cross sections from diverging because of multi-parton IR configurations (e.g., two partons becoming soft, or three partons becoming collinear); however, their impact on fully-showered events is much reduced w.r.t. the analogous situation at the LO (loosely speaking, by a factor of α S -see e.g. ref. [27] for a discussion on this point).
The scope of LO-merging procedures is larger than that of NLO matching, since in the 2 It is in fact a matching and merging scale. We stick to the standard notation.
latter the relevance of matrix-element information is limited to a couple of jet multiplicities (of which only one is truly NLO, as mentioned above). Still, NLO-matched approaches must be the method of choice whenever possible, because they are capable of giving predictions affected by theoretical uncertainties (such as cutoff biases and scale dependences) significantly smaller than at the LO. It is therefore clear that the situation can be improved by combining the two strategies, in order to merge consistently event samples which are individually matched with parton showers to NLO accuracy (this, we call for brevity an NLO-merging approach). Merging at NLO requires one to tackle two issues. Firstly, there is the purely theoretical problem of devising an acceptable solution, which has stimulated much work lately [24,[28][29][30][31][32][33] 3 . Secondly, in order for the above solution not to remain an academic achievement, the computations must be feasible of all matrix elements involved in the relevant NLO cross sections (i.e., up to relatively large multiplicities). This is indeed the case thanks to the extremely high level of automation achieved in the past few years for the one-loop [34][35][36][37][38][39][40][41][42][43][44] and the subtracted real [45][46][47][48][49][50] contributions, both of which have greatly benefitted from the thorough understanding of tree-level-amplitude calculations [51][52][53][54][55][56][57][58][59][60][61].
The goal of this paper is that of proposing a prescription for an NLO merging built upon the matching achieved according to the MC@NLO formalism [15], and which requires only very minimal modifications to the latter. This allows us to easily implement such a merging scheme into the existing aMC@NLO framework (see e.g. refs. [27,62] for recent applications), and to test it by considering, at the 8 TeV LHC, the production of S = H (Standard Model Higgs), S = e + ν e , and S = tt, for observables exclusive in up to one or two jets at the NLO. The paper is organized as follows: in sect. 2 we describe our approach; in sect. 3 we present sample predictions; finally, in sect. 4 we draw our conclusions.

Outline of the procedure
Let us consider the process in eq. (1.2), and interpret it as the O(α b+i S ) Born contribution to a (J = i)-jet exclusive cross section -hence, all final-state partons are hard and well separated. The total transverse momentum vanishes because of momentum conservation: The O(α b+i+1 S ) real corrections feature (S + i + 1)-body final states, which define the H-event configurations in MC@NLO. For such final states, p Born T is different from zero 4 ; this property will be exploited in what follows as an intuitive way to measure the "extra" radiation w.r.t. a given (Born) kinematic configuration 5 . The (i + 1) th parton can be arbitrarily soft or collinear to any other parton (which implies p Born T ≃ small), but also hard and well separated (where p Born T ≃ large). All the other O(α b+i+1 S ) contributions to the cross section have an (S + i)-body kinematics, identical to that of the Born; these are the S-event configurations in MC@NLO.
After processing hard events with parton showers, one will obtain configurations quite different from those of the S and H events; in particular, final-state multiplicities will have greatly increased. However, these differences may be irrelevant to physics observables, which may be almost identical, in shape and normalization, to those resulting from an NLO parton-level computation 6 . For this not to be the case, two conditions must be fulfilled. Firstly, the observable must be IR-sensitive (i.e., large logarithms can appear in the coefficients of its perturbative expansion). Secondly, one must be in an IR phasespace region, where partons are soft and/or collinear (which causes those logarithms to grow large); this corresponds to having p Born T ≃ small. When this happens, the shape of the observable is determined by the MC (large logarithms are resummed), while its normalization is still dictated by the underlying NLO matrix elements (thanks to the unitarity property of the shower). This implies, in particular, that the value of p Born T of the configuration emerging from the shower can be markedly different from that relevant to the H event from where the shower started (which is trivially true for S events, since they have p Born T = 0). On average, one can say that in the IR regions S and H events provide the normalization, while the kinematics is controlled by the MC.
Let us now consider the hard regions, where p Born T ≃ large. S events do not contribute there, since in order to do so the shower would have to provide all the extra radiation leading to p Born T (which is still possible, but at the price of choosing unjustifiably large shower scales). On the other hand, H events do contribute; more specifically, the values of p Born T before and after the shower do not differ significantly. Thus, on average, in the hard regions H events provide one with both the normalization and the kinematic configurations. Finally, it should be stressed that the characteristics of the S and H events discussed here are quite directly related to the fact that MC@NLO is designed to perturb in a minimal manner both the MC and the matrix-element results (in particular, there are no contributions of relative O(α 2 S ) which are not of MC origin). The above observations underpin the proposal for the NLO-merging strategy that we sketch here.
1. For any given Born multiplicity, except for the largest one considered, there must not be contributions to the hard regions. This implies, in particular, that in such regions real emissions must not occur, and the corresponding matrix elements must rather be viewed as defining the Born process for the next (i.e., one unit larger) multiplicity.
2. Suitable choices of veto scales in showers must be made for consistency with item 1.
3. Further conditions can be imposed which are similar to those used in LO-merging procedures; for a given multiplicity, the combination of H-and S-event contributions plays essentially the same role as a tree-level matrix element in LO mergings.
A few comments are in order here. The definition of a "hard" region is realised by introducing a (generalised) matching scale, analogously to what is done in LO mergings. Item 1. is achieved by cuts at the matrix-element level, defined by means of the matching scale.
Since MC@NLO and NLO parton-level predictions are quite similar in the hard regions, by getting rid of the latter, one also eliminates the former without the need of any extra conditions at the MC level. However, the hard region defined here may still be populated by the Monte Carlo (whose choice of shower scales is made a priori, and independently of any merging procedure) when showering S events, hence item 2. Note that the relevant scale choices there are easily worked out by taking into account what is done in item 1., owing to the interplay between H and S events. As far as item 3. is concerned, let us suppose that, for a given (S + i)-body Born kinematics, MC@NLO were used to obtain only i-jet exclusive observables. This is effectively as if the (i + 1) th parton were never resolved, and always integrated over. It thus suggests to formally treat the combination of S and H events as equivalent to (S + i)-body tree-level matrix elements in an LO merging which implies, among other things, the reweighting of these events by suitable combinations of Sudakov factors (or equivalent suppressions, as in the MLM procedure). By construction, such factors must then be obtained using (S + i)-body configurations (possibly effective), also in the case of H events. One readily observes that Sudakov reweightings are used in an LO merging (together with conditions on showers) in order to prevent different multiplicities from double counting which, in the present context, is supposed to be guaranteed by items 1. and 2.. This is correct; the idea is indeed that of employing reweighting factors which contribute to a relative O(α 2 S ) (hence, beyond NLO), so as the perturbative expansions of the cross sections that define S and H events are identical to the original ones up to O(α b+i+1 S ). This is a condition whose application guarantees that the accuracy of the MC@NLO calculation is not spoiled, and which results in the insertion into the MC@NLO short distance cross sections of an extra relative O(α S ) term, that we shall call dσ (∆) i . It also implies that Sudakov reweightings are totally general, and do not constrain the type of observables that one predicts starting from a given multiplicity -it should be clear that the assumption made at the beginning of this paragraph has the sole role of simplifying the picture. Still, it will remain true (essentially because of item 1.) that i-jet exclusive observables will receive the dominant contributions from MC@NLO samples associated with an underlying (S + i)-body Born kinematics. This is directly related to the fact that, when Sudakov reweightings are applied, the term dσ (∆) i is numerically small; in fact, as we shall explicitly show later, the procedures in items 1. and 2. are sufficient to obtain smooth results for most observables.

Technicalities
The contributions to the process in eq. (1.2) will be denoted by: where T i are the O(α b+i S ) tree-level matrix elements, and V i is the O(α b+i+1 S ) sum of the finite part of the one-loop amplitude times that of the Born, plus the finite remainders of the soft and collinear subtractions. The MC@NLO cross section for what we shall call the i-parton sample can thus be written schematically as follows: where K and K MC indicate symbolically the kernels relevant to the NLO and MC subtractions respectively. In order to simplify the notation, in eqs. (2.4) and (2.5) we have understood the interface-to-MC's I M C of ref. [15] (in turn equivalent to the generating functionals F (k) MC of ref. [63]), since they will not play any role in what follows. In fact, all manipulations of the short-distance cross sections that will be carried out below are relevant to hard events, i.e. at the parton level and before the shower phase (with one exception, discussed in sect.

2.2.3).
For our merging scheme we introduce a function: with µ 1 ≤ µ 2 two arbitrary mass scales, whose role can be roughly summarized as follows: Although one may want to choose a smooth function D for numerical reasons, it is more transparent from the physics viewpoint to adopt a sharp version: which is a particular case of eq. (2.6). In eq. (2.7), the identification of µ Q as the matching scale is obvious. We shall also denote by the scale (with canonical dimension equal to one, i.e. mass) at which a given S+partons configuration passes from being reconstructed as an j-jet one to being reconstructed as an (j − 1)-jet one, according to a k T jet-finding algorithm [64] (in other words, there are j jets of hardness d j − ε, and (j − 1) jets of hardness d j + ε, with ε arbitrarily small). For example: In general, for n final-state partons one will have (2.10) It will also turn out to be convenient to define with √ s the parton c.m. energy, i.e. the largest energy scale available event-by-event. Equations (2.3)-(2.5) imply that the i-parton MC@NLO sample gets contributions from both the i-and the (i + 1)-parton tree-level matrix elements. This is what usually happens in MC@NLO, when eq. (2.3) is used to compute the i-jet exclusive cross section with NLO+LL accuracy. However, in the context where MC@NLO samples with different multiplicities must be consistently merged, this implies that a given tree-level matrix element T i will contribute to both the i-parton sample (as Born contribution) and to the (i − 1)-parton sample (as real correction). This fact is peculiar of the merging at the NLO (since at the LO one imposes that T i contributes solely to the i-jet exclusive cross section), and can lead to problems of double-counting nature even before considering the matching to showers. In order to avoid these problems, we introduce the following rule: R.1 For any given (S + i)-body kinematic configuration (with i ≥ 1) at the matrixelement level, the sum of the contributions due to T i to the i-and (i − 1)-parton samples must be equal to T i (possibly times a factor smaller than one if this helps prevent the reconstruction of a number of hard jets smaller than (i − 1)).
We now proceed to incorporate rule R.1 into the MC@NLO short-distance cross sections. In order to be definite, let us consider the merging of the i-parton samples with that is, the largest final-state multiplicity will be N + 1 partons, relevant to the real corrections to the N -parton sample. We then formally define modified MC@NLO formulae in the following way: with dσ i given in eq. (2.3). As it will be discussed in what follows, these expressions are incorrect; however, their intuitive meaning is easy to grasp. Thus, we discuss here their physics contents, and refine them later. The D and Θ functions in eq. (2.13) imply that, out of the i + 1 jets in the i-parton sample (i < N ), there are at least: A. i − 1 jets harder than µ 2 (owing to Θ (d i−1 − µ 2 )); B. one jet harder than µ 1 (owing to (1 − D(d i ))); C. one jet softer than µ 2 (owing to D(d i+1 )).
Note that one jet is degenerate (i.e., has zero four momentum) in the case of S events, and that condition C. does not apply to H events when i = N . We stress again that no shower is involved yet, and the jets are thus defined at the matrix-element level. Furthermore, according to the definition of d i , it would be more appropriate to talk about a hardness scale at which one resolves i or (i − 1) jets; in practice, a less precise language is acceptable, since no confusion is possible here. It is easier to start analysing the implications of the previous formulae by considering first a sharp D function, eq. (2.7). In such a case, items A.-C. imply that there are i jets harder than µ Q , and one jet softer than µ Q . The idea is that the matrix element description of the former i jets is adequate (and of NLO accuracy), while the latter one jet will be heavily affected by MC showers. Consistently with this picture, one will not want the MC to generate emissions harder than µ Q . When a generic and smooth D function is considered instead (eq. (2.6)), the interpretation is basically the same, only slightly more involved. In particular, the description of the (i − 1) jets harder than µ 2 is still a fully matrix-element one. In the intermediate-hardness region (µ 1 , µ 2 ), item B. implies the presence of an extra jet, with "probability" given by 1 − D. This is more correctly interpreted as our confidence in the correctness of a matrix-element description for such a jet, which is maximal (i.e., equal to one) for a hardness equal to µ 2 , and minimal (i.e., equal to zero) for a hardness equal to µ 1 . This damping factor 1 − D is arbitrary, and must be compensated. This will happen thanks the contribution to the i-jet cross section due to the (i − 1)-parton sample, the extra jet being generated in the intermediate region (µ 1 , µ 2 ) by means of MC radiation. This compensation is consistent with the idea that in the intermediate-hardness region both the matrix element and the MC description are on equal footing. Finally, the case of item C. is analogous to that of item B., but specular. In particular, the matrixelement description of the extra jet relevant here is turned off with probability D, so that an MC description is generally dominant.
The latter point is not only motivated by physical arguments, but is actually necessary in the context of a well-behaved NLO computation. In fact, while the last two factors on the r.h.s. of eq. (2.13) limit the hardness of i jets from below, the factor D(d i+1 ) limits that of the (i + 1) th jet from above. It is clear that for such a jet (which generally corresponds to the softest parton in the event) there must not be a lower bound on hardness, because of infrared safety. On the other hand, the bound from above prevents one from having an (i + 1)-jet configuration generated by the i-parton sample, since this would effectively be an LO (rather than an NLO) prediction. The exception is of course that of the largest parton multiplicity available to the calculation, simply because one cannot do better than LO there; this is the reason for the special case considered in eq. (2.14).
We can now check the consistency of eqs. (2.13) and (2.14) with rule R.1. As it can be seen from eqs. (2.4)and (2.5), T i enters the S events of the i-parton sample, and the H events of the (i − 1)-parton sample. Explicitly, one obtains: where in eq. (2.15) we have exploited the fact that, for S events in the i-parton sample (with i < N ), d i+1 = 0 and hence D(d i+1 ) = 1 (when i = N , this D factor simply does not appear in eq. (2.14)). By using and the properties of the D function, eq. (2.16) can be rewritten as follows: This result indeed obeys rule R.1. In fact, the first term on the r.h.s. of eq. (2.19) is a contribution to the hard region for (i − 1) jets; the hardness of the remaining one jet can be either small (thus playing the role of an NLO correction to an (i − 1)-jet cross section), or large (thus being a Born contribution to an i-jet cross section). In both cases, the use of a matrix element description is fully justified, and T i appears with its proper weight (which implies that there is no double counting of matrix-element origin in the combination of dσ H,i−1 and dσ S,i ). On the other hand, in the second term on the r.h.s. of eq. (2.19) T i is multiplied by a non-trivial weight factor. However, that term corresponds to having only (i − 2) jets in the hard region, while one of them is forced to be in the intermediate region ), and another one to be either in the intermediate or in the soft region (owing to D(d i )). As was discussed before, in this situation one should not expect the matrix elements to give the only correct description, and it therefore appears desirable that T i be multiplied by a number smaller than one.
Ultimately, effects such as that giving rise to the second term on the r.h.s. of eq. (2.19) can be ascribed to the systematics of the merging scheme. This can be checked e.g. by changing the values of µ 1 and µ 2 , and the functional form of D in the intermediate region.
For example, one can see immediately that the sharp D form of eq. (2.7) simply sets the term above identically equal to zero (as it must happen, since with eq. (2.7) there is no intermediate region). It has to be stressed that, in the case of a smooth D, eq. (2.19) will only give an upper bound to the merging systematics. In fact, we expect a compensating effect, mainly due to the S events of the (i − 1)-parton sample, giving rise through showers to at least two extra jets, one in the intermediate region, and one in the soft region. Finally, the particular source of systematics we are discussing here is essentially due to the fact that two jets can simultaneously be present in the intermediate region. In order to avoid this, one can think to variants of the prescription given in eq. (2.13) such as: (2.20) This option and its possible variants will not be considered in this paper.
What was done so far brings out the physics contents of eqs. (2.13) and (2.14). As they stand, however, those equations are ambiguous, since the two components of dσ i , namely dσ S,i and dσ H,i , are associated with different kinematics configurations (2 → S + i and 2 → S + i + 1 respectively), and one needs to specify which of these is used in the computation of the d i 's. Let us denote by these kinematic configurations (the notation reminds one that they are associated with the S and H events of the i-parton sample). It is fairly obvious that Ξ S,i must be used in the dσ S,i contribution to eq. (2.13), while Ξ H,i must be used in the dσ H,i bit. In fact, by doing so the manipulations carried out in eqs. (2.15)-(2.19) are still correct, since one can always identify Ξ H,i−1 with Ξ S,i . As an explicit example of the cross sections that one obtains with these kinematic assignments, we write here the results relevant to the simplest case of the merging of the two lowest parton multiplicities (N = 1): Here, we have used eq. (2.11), d 1 (Ξ S,0 ) = 0, and the properties of the D function in eq. (2.6) (obviously, µ 2 < d 0 ). While the above prescription removes the ambiguity in eqs. (2.13) and (2.14), and gives those equations an operative meaning, it leads to cross sections affected by double counting within the i-parton sample (i.e., even before merging different multiplicities), as one can readily prove by using eqs. (2.22)-(2.25), and by proceeding e.g. as is done in appendix B of ref. [15]. However, the correct formulae can easily be obtained by means of a few simple modifications. We shall illustrate them in the following, starting from the case N = 1 in order to simplify the discussion, and moving next to the fully general case.

Merging 0and 1-parton samples
The correct, non-double-counting versions of eqs. (2.22)-(2.25) read as follows: where changes have occurred in the definitions of dσ S,0 and of dσ H,1 . The factor D(d 1 (Ξ H,0 )) in eq. (2.27) limits the hardness of the final-state parton as prescribed by the function D (with a sharp D, eq. (2.7), the parton relative transverse momentum will obey p T < µ Q , see eq. (2.9)). While this condition is imposed at the matrix-element level, one should keep in mind that the MC subtraction term, T 0 K MC , appears in eq. (2.27) in order to prevent double counting at the NLO. Hence, consistency demands that its modification due to the D-dependent prefactor be accompanied by a prescription for the shower scale that limits emissions within the same hardness range. Given the NLO accuracy of the MC subtraction terms, this can be conveniently done by means of the LH-interface [65] parameter SCALUP, which will be chosen event-by-event in a random manner (so as to avoid biases) according to the inverse of the function D (for example, with a sharp D function and SCALUP having the meaning of relative p T , such a scale will be always set equal to µ Q ). The modifications of the shower scale and of the MC subtraction term in H events imply that the MC subtraction term must be modified in S events as well; this is the reason for the factor D(d 1 (Ξ H,0 )) in eq. (2.26). As far as the 1-parton sample is concerned (eqs. (2.28) and (2.29)), the factors 1 − D limit from below what is essentially the relative p T of the Born-level parton -in the case of a sharp D function, this is therefore equivalent to imposing hard Born-level cuts. Thus, it should be intuitively clear, and could be formally proven using again the techniques of appendix B of ref. [15], that the proper 1 − D prefactor for the H-event MC subtraction term is that in eq. (2.29), and not that in eq. (2.25).

The general case
What is done in sect. 2.2.1 is sufficient to sketch the procedure one has to follow in order to convert the naive prescriptions of eqs. (2.13) and (2.14) into correct expressions for MC@NLO short-distance cross sections. We obtain: We for the fact that the hardness of the (N + 1) th parton is not bounded from above. This is correct, since there is no higher multiplicity whose Born-level kinematics could compensate for the lack of hard emissions in the N -parton sample.
As was already mentioned in sect. 2.2.1, the equations above entail specific choices of shower scales. We have chosen: for S and H i-sample events respectively 7 . We have introduced the quantity with r a random number, to be generated event by event; one has p T dependence in eq. (2.36) has already been discussed in sect. 2.2.1. That on d i (Ξ S,i ) is intuitively clear: since such a quantity is directly related to the hardness of the softest Born-level jet, one does not want the shower to generate jets harder than that. This argument can be substantiated analytically in the context of the toy model of ref. [15], where one can actually show that any monotonically-growing function of d i , subject to the conditions 1/2d i ≤ f (d i ) ≤ d i , will do. For un-merged MC@NLO samples, the toy model does not give any prescription for the shower-scale choice of H events; when merging, however, one obtains the constraint that the scale be either a monotonically-decreasing function of the hardness of the real emission, or a constant, which motivates eq. (2.37) (but does not determine it uniquely). Incidentally, such a choice is just the generalisation of what was done in ref. [15], where it was motivated simply by the argument that the shower scale has to be related to the hardness "left" in the system after the first emission.

Sudakov reweighting
The formulae presented in sect. 2.2.2 achieve the strategy described in items 1. and 2. of sect. 2.1. As far as item 3. there is concerned, the basic idea has already been discussed, which is essentially that of following the CKKW prescription [6] with a reweighting of the short-distance cross sections by a combination of Sudakov factors. We implement this by defining modified MC@NLO cross sections as follows: where by ∆ (1) i we have denoted the O(α S ) term in the perturbative expansion of ∆ i . We point out that the quantities ∆ i are products of ordinary Sudakov factors, which we define following the CKKW prescription of reconstructing the most probable shower history, according to the jet-finding algorithm that determines the d i . When doing that, we use the (S + i)-body kinematic configurations when dealing with S events (eq. (2.39)), and the (S + i + 1)-body kinematic configurations for H events (eq. (2.40)), as one would naively expect. However, in the latter case the softest of the d's is discarded. This is in keeping with what has been discussed at the end of sect. 2.1, that H events have to be treated on the same footing as S ones as far as the multiplicity relevant to the definition of Sudakovs is concerned. The scales entering the Sudakov factors in eqs. (2.39)-(2.41) are defined as follows: Here, µ ME is a hard scale, which can be generically associated with matrix-elements computations (e.g. an NLO parton-level result which corresponds to a given MC@NLO simulation); explicit examples will be given below. Note that, in the case of a sharp D function, the r.h.s. of eqs. (2.44) and (2.45) are equal to p (D) T = µ Q . In the CKKW procedure, a reweighting by α S factors is also performed. Although this would technically be possible in the context of MC@NLO, the complications it entails (owing to the more involved dependence on the renormalization scale of NLO cross sections w.r.t. that of LO ones) do not seem justified, in view of the fact that in MC@NLO there is already an O(α S ) cancellation between matrix elements and Monte Carlo effects. On the other hand, it is probably best to choose a renormalization scale whose definition exploits the CKKW-like considerations which lead to eqs. (2.39) and (2.40). We adopt therefore what is used in the MINLO procedure [67], which is tailored for NLO computations, and in view of its connections with CKKW: . (2.48) We remind the reader that b is the power of α S that appears in the Born contribution to the 0-parton sample; furthermore, in eq. (2.48) both µ ME and the d j 's are meant to be computed with the kinematics proper of either S or H events. For consistency with ref. [67], we also set the factorization scale equal to d i . We remark that both renormalization and factorization scales are set equal to µ ME for NLO mergings that do not include the Sudakov reweighting discussed here, and for un-merged MC@NLO predictions.
Reference [67] also suggests a simplification in the implementations of eqs. (2.39) and (2.40), which is useful because of a known issue with out-of-the-box CKKW (see e.g. ref. [9,68]). Namely, Sudakov reweighting leads to better results from the numerical viewpoint if the Sudakov factors used in the short-distance cross sections are equal to those that enter the Monte Carlo which is matched to the matrix elements. However, analytical NLL Sudakov, such as those considered in ref. [6], are appealing precisely because, being MC-independent, they give one the possibility of an error-free, easy, and universal implementation. The differences induced by analytical or actual-MC Sudakovs will grow with the distance between the largest and smallest scales entering them. We can therefore envisage the following possibility: in eqs. (2.39) and (2.40), when i < N we use the identity:  [67]). On the other hand, the Sudakov factors inside the square brackets are effectively computed using an MLM-type rejection procedure, which is supposed to be a good approximation of the use of actual-MC Sudakovs (in that it exploits information obtained from the Monte Carlo). When such a procedure is applied at the LO, one matches partons with jets 8 . In order to extend this idea to the NLO, we must make use of jets, reconstructed at the level of the hard subprocess (which corresponds to the matrix-element level), instead of partons; this preserves IR safety. We then require these hard-subprocess jets to match parton-level jets after shower, in essentially the same way as in the original MLM procedure [13]. The only difference is that, while MLM uses a cone jet-finding algorithm, we adopt a k T one [64], as is also done in the MLM LOimplementation in MadGraph [69], and consistently with the construction of the d i 's. More specifically, if jets are defined by the k T algorithm with a given radius R 0 , we tag a jet after shower as matched with a hard-subprocess-level one if the two are less than 1.5R 0 apart in the η − φ plane. All the results we shall present in sect. 3 have been obtained with R 0 = 1 (large radii have to be preferred, but there does not appear to be a strict constraint on what to choose. For example, we have verified that with R 0 = 0.8 our results are unchanged). Finally, after having obtained a set of shower-level matched jets, we impose that in the i-parton sample there be exactly i jets with hardness larger than p (D) T . In practice, as a further measure to ensure that there be no biases after applying the MLM rejection, we generate hard events by relaxing the conditions enforced by the 1 − D factors in eqs. (2.32)-(2.35). For i < N , this implies a loss of efficiency, while for i = N it requires that an MLM condition be imposed as well, with hardness d N . In particular, one demands that, if there are N or N + 1 jets at the hard-subprocess level, then these should match the N or N + 1 hardest ones after shower. We finally point out that the MLM conditions above are identical in the cases of S and H events, again consistently with the treatment of these two event classes in the CKKW-type procedure set up here.

Results
In this section, we present results obtained with the NLO-merging procedure described in sect. 2 which are relevant to the production of a Standard Model Higgs (denoted by H henceforth), of an e + ν e pair, and of a tt pair, at the 8 TeV LHC. We concentrate in particular on H production, which is an ideal test case since it features a very large amount of radiation, in this way helping expose any problems in the matching and merging techniques. Furthermore, the matrix elements relevant to this process are relatively simple, and thus fast to evaluate. We consider the cases of sharp and smooth D functions, and of Sudakov reweighting; we study the merging with N = 1 and with N = 2. As far as e + ν e and tt pair production are concerned, we limit ourselves to presenting a few key observables with N = 1 and Sudakov reweighting, which is sufficient to show the generality and flexibility of the procedure. The latter is also guaranteed by its implementation in the automated aMC@NLO framework, which has been used to obtain all the results shown below.
The relevant information are the following: a. The merged result.
c. The un-merged ("standalone") MC@NLO results for the corresponding multiplicities. For each observable, we present all the results above in one plot, with the following layout. The main frame displays a, superimposed to b, thus allowing one to check the origin of the features of the merged results, and to see the interplay among the various i-partonsample contributions. An upper inset shows the ratios (c/a); in this way, it is easy to assess how much the merged results differ from the standalone MC@NLO ones, in both shape and normalization. It should be kept in mind that the latter, if obtained with an underlying (S + J)-parton description, are physical only for observables that feature at least J hard jets (be them obtained by explicit cuts, or by effective ones enforced by other constraints, such as reconstructing the p T of recoiling objects). A lower inset shows the ratios (d/a) and/or (e/a) -the idea here is that of assessing the merging systematics, and of studying the effect of increasing the largest multiplicity that enters the merging procedure. Finally, in the cases of H and tt production we have also compared our results with those of Alpgen [57] (including parton showers and MLM merging). The latter have been renormalized by a process-dependent overall factor, at the sole purpose of rendering them more visible in the figures, where they typically appear as ratios in the lower insets. All the Monte Carlo simulations have been performed with HERWIG6 [70]; however, the merging procedure has been implemented so as no or minor changes are foreseen in the cases of Herwig++ [71] and Pythia8 [72] (while the case of Q 2 -ordered Pythia6 [66] may require further consideration), which are currently available and under testing respectively in aMC@NLO. Underlying events have not been generated. Jets are reconstructed (by clustering all final-state stable hadrons) with the k T [64], anti-k T [73], and Cambridge/Aachen [74,75] algorithms as implemented in FastJet [76], for several different jet radii -our default choice, to be shown in the plots, is the k T algorithm with R = 0.6. We have used the PDF set MSTWnlo200868cl [77].

Standard Model Higgs production
Our runs have been performed with m H = 125 GeV; we have set µ ME equal to the Higgs transverse mass. We have also considered µ ME = H T /2, and found the same patterns as with our default choice; hence, the corresponding results will not be presented here. The one-loop matrix elements relevant to the 1-and 2-parton samples have been taken from the MCFM code [78] as implemented in ref. [79]. We have studied the following four merging scenarios: N = 1, sharp-D, non-Sudakov-reweighted; N = 1, smooth-D, non-Sudakov-reweighted; N = 1, sharp-D, Sudakov-reweighted; and N = 2, sharp-D, Sudakovreweighted.
We start by discussing the former case, for the three choices µ Q = 30, 50, and 70 GeV. Sample observables are shown in fig. 1; the merged results in the main frame correspond to µ Q = 30 GeV, in order to facilitate the direct comparison with Alpgen, where the matching scale has also been set equal to 30 GeV, which is fairly typical for this process. The two observables displayed on the top panels of fig. 1 are representative of the behaviour of all observables which are not directly related to jet transverse momenta: namely, the 0-and 1-parton samples merge smoothly. For the Higgs transverse momentum, this smoothness, and the agreement with the standalone H + 1j prediction at large p T (H) (which results naturally from the merging procedure), gives an effective constraint on the total rate, which can only act as a normalization effect at small p T (H), where differences can be seen w.r.t the standalone H + 0j prediction (while the shapes are identical, as they should be by construction). We note that normalization effects have to be expected in this case, where the fact that the shower may cause "leaks" into larger exclusive multiplicities w.r.t. those of the underlying parton cross sections is not compensated by any suppression (e.g. by Sudakov reweighting). In spite of this, and of the fairly severe conditions posed by a sharp D function, it is reassuring that these effects are smaller than 20%, as shown by the two lower insets of the upper-left panel. From the comparison between those two insets, one can also see that, by setting µ Q = 50 GeV, the merged result agrees almost perfectly (in shape and normalization) with the standalone H + 0j and H + 1j predictions (where relevant, i.e. at small and large p T 's respectively). Although this may suggest a way towards the definition of an optimal matching scale, it is important to keep in mind that the capability of changing the merging conditions is essential in order to assess the systematics that affects the procedure. The dotted-plus-symbols green histograms in fig. 1, also presented in terms of ratios over the merged MC@NLO results in the lower insets, are the Alpgen predictions; for consistency with the N = 1 case considered here, the H + 0, H + 1, and H + 2 parton samples in Alpgen have been generated, and combined according to the MLM prescription. Although in reasonable agreement, we see that the p T (H) shape as given by Alpgen is harder than that of MC@NLO; we point out, however, that to some extent this difference might be due to the fact that we have run Alpgen with its default scales, and with the LO version of the NLO PDFs we have used for MC@NLO.
In the bottom panels of fig. 1 we present two observables directly related to jet hardness, namely the p T of the leading jet, and d 1 . They display similar features, the most striking of which is a kink at p T (j 1 ) ≃ d 1 ≃ µ Q . It is worth stressing that the same kink appears in the Alpgen results. This emphasises the fact that such a kink is a more general feature than being simply an artifact of the merging prescriptions adopted in these plots (which are obviously very different from each other 9 ), since it is ultimately caused by a significant 9 Just as obviously, it is not implied that kinks will appear regardless of the merging technique employed; mismatch between the Monte Carlo and matrix-element descriptions in the region chosen for merging. As far as we know, the problem posed by this kink is very often ignored: in fact, if one is interested in 1-jet exclusive observables, a lower bound p cut T on the jet transverse momentum is imposed, and the issue of the kink is simply bypassed, in the context of a merging procedure, by choosing µ Q < p cut T . With this, one is essentially assuming that the proper description for such a jet is a matrix-element one. However, this is not necessarily the case; for example, a 40-GeV-p T jet is presumably well described by matrix elements if produced in association with a 80 GeV Higgs, but likely much better modeled by parton showers when the mass of the Higgs is 600 GeV. Furthermore, the mass scales involved in the problem are not the only relevant factors: for example, in a gg-dominated process such as Higgs production, Monte Carlo effects will extend farther in p T in comparison with a process characterized by exactly the same mass scales, but qq-initiated (e.g., Z ′ production with m Z ′ = m H ). The bottom line is that, even for J-jet exclusive observables, the choice µ Q < p cut T can be misleading, if anything because it prevents one from assessing theoretical uncertainties in a complete manner. Merging procedures do offer a systematic way of addressing this problem, provided that p cut T is not regarded as a natural upper bound for the matching scale; jet transverse momenta (or related quantities) must be studied in a range that includes the matching scale. We conclude this discussion by mentioning that the kink that appears in the bottom panels of fig. 1 is not peculiar of the k T algorithm with R = 0.6 -we have found the same feature for all the jet algorithms we have considered.
One interesting aspect of fig. 1 is the general agreement between MC@NLO and Alpgen for the basic features of the observables. This may be surprising at first, given the fact that only in Alpgen a Sudakov suppression is implemented (effectively), which incorporates information on the behaviour of the Monte Carlo in the merging scheme. In fact, such information is also included in MC@NLO via the matching procedure (in particular, in the MC subtractions), in a way that differs from that of Alpgen at relative O(α 2 S ). This confirms the naive expectation that effects due to the mismatch between matrix elements and Monte Carlos, which are mitigated at the LO by merging, are largely dealt with by matching at the NLO. It also leads one to expect that, if similar merging procedures can be implemented at both the LO and the NLO, the latter results will be better behaved than the former. We shall show later explicit examples that this is indeed the case.
In fig. 2 we plot the same observables as in fig. 1, computed with the same settings as before, except for the fact that here the function D is smooth, with µ 1 = 30 GeV and µ 2 = 70 GeV. We have adopted the following functional form: with α = c = 1. The histograms shown in the lower insets are computed by taking the ratios of the predictions obtained with a sharp D (for the three values of µ Q previously we shall show later how to get rid of them within the NLO approach proposed here. However, they are a persistent characteristic -see e.g. ref. [68] for examples relevant to CKKW (although for processes different from Higgs hadroproduction), and ref. [11] for a possible amendment, shown there to work in e + e − collisions. considered), over those obtained with the smooth D of eq. (3.1). Perhaps not surprisingly, the new results are in a very good overall agreement with those relevant to a sharp D function with µ Q = 50 GeV. The exception is the marked disagreement around the p T (j 1 ) ≃ d 1 ≃ µ Q regions, where the results obtained with a smooth D function do not present any kinks, and are fairly regular. We thus conclude that the merging procedure with a smooth D function (and without Sudakov reweighting) leads to satisfactory results 10 . Its drawback is that the assessment of the theoretical systematics is rendered more involved because of the presence of two scales, that define the position and width of a merging range. One possibility is that of taking the envelope of the predictions obtained with a sharp D, for values of µ Q that span the merging range (which is essentially what is done in the lower insets of fig. 2). On the one hand, this overestimates the systematics, since the contributions due to scales close to the end-points of the merging range are less important (in the effective average performed by the smooth D function) than those at its center. On the other hand, this is not equivalent to assessing the effect of changing the position and width of the merging range, which should probably also be done. In any case, these appear to be pretty minor issues, given that the theoretical systematics associated with merging cannot be given a precise statistical meaning, and some degree of arbitrariness is always present.
We now study the effect of the Sudakov reweighting, following the procedure described in sect. 2.2.3. We start by considering again the N = 1 case, which we generate with a sharp D function, and the three values µ Q = 30, 50, and 70 GeV already employed. In fig. 3 we plot the same observables as in fig. 1 and 2; a few more jet-related observables are  fig. 3, for the pseudorapidity of the hardest jet (upper left), the pseudorapidity (upper right) and p T (lower left) of the second-hardest jet, and d 2 (lower right). In the case of η(j k ), we have imposed a p T (j k ) > 30 GeV cut. displayed in figs. 4 and 5. In all these figures, the main frame presents the µ Q = 50 GeV results, our "central" predictions henceforth. The histograms in the lower insets are the ratios of the Sudakov-reweighted µ Q = 30 GeV and 70 GeV results over the central ones (in other words, there are no merged predictions in these plots that do not include the Sudakov reweighting). Also shown there are the ratios computed using Alpgen in the numerator, over the central NLO-merged results.
The comparison of fig. 3 with figs. 1 and 2 shows that the Sudakov reweighting on top of a sharp D function is as effective as the use of a smooth D function (without Sudakov reweighting) in removing the kinks. There are quite small residual wiggles 11 , which may be  fig. 3, for the difference in rapidity between the Higgs and the hardest jet, with four different p T cuts on the latter. seen in some of the lower insets in the vicinity of the transition regions, that are in any case within a modest theoretical systematics; this is smaller than 10% everywhere, and in most regions quite negligible. As one can infer from the comparison between the upper insets of fig. 3 with those of fig. 2 (keeping in mind that the standalone H +0j and H +1j MC@NLO predictions are the same in these two figures) the sharp-D, Sudakov-reweighted results are quite close to the smooth-D, non-Sudakov-reweighted ones, with differences of the order of 5% or smaller. The same holds true for the sharp-D, non-Sudakov-reweighted results obtained with µ Q = 50 GeV, in keeping with the idea that Sudakov effects in the context of option here, since it appears to be just a phenomenological issue analogous to tuning. We reckon that a relatively small range (µ1, µ2) will be sufficient; a further handle can be provided by the parameters α and c that appear in eq. (3.1).
the present merging procedure are beyond NLO, hence small. Obviously, this smallness is true in a parametric sense, and numerically the case µ Q = 50 GeV is particularly fortunate, just because such a value happens to be quite suited for this process. When µ Q = 30 GeV or µ Q = 70 GeV the differences induced by the Sudakov reweighting on top of a sharp D function are numerically a bit larger, but there is no difference of principle among the various scale choices.
The transverse momentum of the second-hardest jet, and d 2 , presented in the bottom panels of fig. 4 show the same patterns as p T (j 1 ) and d 1 (possibly with an even smaller theoretical systematics). This also applies to the comparison with Alpgen, since the latter has kinks at p T (j 2 ) ≃ d 2 ≃ 30 GeV, analogous to those affecting p T (j 1 ) and d 1 (if a bit smaller). The pseudorapidities of the hardest and second-hardest jets are presented in the upper panels of fig. 4, for p T (j k ) > p cut T ≡ 30 GeV. In both cases, the merged predictions are more central than those obtained with the standalone H +0j MC@NLO simulations, owing to the contribution of the 1-parton sample. Still, since pseudorapidities receive the most important contributions from the region p T (j k ) ≃ p cut T , the 0-parton sample will typically be dominant (except for very large p cut T , which is not the case here), with the 1-parton sample providing a subleading correction. Hence, the onset of the 1-parton sample regime determines directly the amount of migration towards central η values w.r.t. the standalone H + 0j results. In turn, this onset is controlled by the matching scale; this explains why the systematics affecting η(j 1 ) and η(j 2 ) is larger than for other observables. In the case of η(j 1 ), the Alpgen result is in fact quite close to the NLO-merged prediction obtained with the same matching scale (30 GeV). On the other hand, for η(j 2 ) Alpgen is significantly more central than MC@NLO, even with the same matching scale. This can be understood as follows. Let us consider the H + 2 parton matrix element which, if surviving the merging "cuts" in Alpgen, will result into two hard jets. This is not quite the case in MC@NLO (except in the large-p T region, which is not important here), where its contribution to the H events of the 1-parton sample is partly compensated by the MC subtractions. This is what underscores the intuitive picture of the 1-parton sample being kinematically a 1-jet cross section, plus (small) corrections, and is consistent with the fact that NLO computations must be inclusive to a certain degree. Hence, despite receiving contributions from the same tree level matrix elements, Alpgen results will be more matrix-element driven than MC@NLO ones, for observables sensitive to the largest multiplicity 12 . Indeed, we shall show later that the inclusion of the 2-parton sample in MC@NLO results in a more central η(j 2 ) distribution.
As the final example for the N = 1, sharp-D function, Sudakov-reweighted merging, we present in fig. 5 the difference in rapidity between the Higgs and the hardest jet, by imposing that the p T of the latter be larger than 10, 30, 50, and 70 GeV. This observable has attracted some attention in the past, because of the presence in the standalone H + 0j MC@NLO results of a dip in the central region (analogous features can be found in y(S) − y(j 1 ) or y(j 1 ) for standalone S + 0j MC@NLO runs -see e.g. ref. [80] for a discussion of the case S = tt, to which we shall return later). It should be pretty clear that the dip is inherited by MC@NLO from the underlying Monte Carlo, which has in fact a deeper dip, partly filled in MC@NLO by the H-event contribution. This is documented in ref. [81], where MC@NLO was matched with Q 2 -ordered Pythia; since Pythia, at variance with HERWIG6 or Herwig++, does not have a dip (at least in the low-p T region), MC@NLO does not have a dip. Having clarified this, the natural question is the following: if, for a given p T (j 1 ) > p cut T , the underlying Monte Carlo has a dip in the rapidity difference, why MC@NLO does not remove it completely? The answer is pretty simple: as mentioned above, MC@NLO fills the dip through the H-event contribution, which is matrix-element driven and of relative O(α S ). Such contribution will thus be overwhelmed by Monte Carlo effects (from the first emission onwards), when radiation of O(p cut T ) can be easily achieved by parton showers 13 . In other words, in spite of the fact that p cut T may seem to define a hard scale, p T (j 1 ) ≃ p cut T could still be in an MC-dominated region (which is a processand MC-dependent statement). Hence, in such a case MC@NLO will not "assume" that a matrix element description is correct, but will rather follow the pattern of the underlying Monte Carlo. The implication is that, if the dip is phenomenologically untenable, a solution has to be found at the Monte Carlo level, and not in the matching procedure.
An alternative point of view, which will however lead one to the same conclusions, is the following. Since y(S) − y(j 1 ) is one-jet exclusive, standalone S + 0j MC@NLO will give an LO-accurate description at best, so we are much better off by considering standalone S + 1j MC@NLO instead. While this is true, it is indeed equivalent to assuming that for p T (j 1 ) ≃ p cut T a matrix-element description is correct. This is obviously the same problem as that of p T (j 1 ), discussed in the context of fig. 1. Therefore, as in that case, a merging procedure will certainly constitute an improvement over standalone results, provided that the merging systematics is correctly assessed (i.e., µ Q must be chosen and varied independently of p cut T ). As can be seen from the four panels of fig. 5, the NLOmerged results are in much better agreement with what one expects from matrix elements (equivalent to the standalone H + 1j histograms here) than with the standalone H + 0j predictions. There are, however, significant differences among the results obtained with different µ Q 's, since the dip is typically present only when p cut T < µ Q (but not necessarily so: e.g. when µ Q = 30 GeV, there is no dip even for p cut T = 10 GeV). For example, when p cut T = 30 GeV, the NLO-merged result has a dip when µ Q = 50 and 70 GeV, while it has no dip when µ Q = 30 GeV. It is interesting to observe that Alpgen does have dips when p cut T ≤ µ Q = 30 GeV, while the NLO-merged prediction obtained with the same matching scale does not, as mentioned above. This is not surprising, on the basis of what was discussed before about the inheritance of this feature from the underlying Monte Carlo. The conclusion is that, in the context of merging at both the LO and the NLO accuracy, the dip can be tuned away by a suitable choice of merging parameters; this is acceptable only if the merging systematics is exhaustively assessed. We expect that, for small p cut T (where "small" can be precisely defined given the production process), such systematics will be large if the underlying Monte Carlo features the dip, and much smaller otherwisehence, theoretical uncertainties will be strictly MC-dependent. This dependence is bound to disappear, and the merging-parameter dependence reduced, when p cut T becomes large.
We finally turn to discussing the case of the N = 2, sharp-D function, Sudakovreweighted merging; that is, we increase the largest multiplicity by one unit w.r.t. what was done before. The settings are the same as in the N = 1 case, and figs. 6, 7, and 8 are the analogues of figs. 3, 4, and 5 respectively (with the exception of one panel in fig. 8). The numerators of the ratios that appear in the upper insets are the same as before for the H + 0j and H + 1j cases; that for H + 2j is obviously specific to N = 2. In the lower insets, together with the ratios that allow one to assess the merging systematics, we have plotted (as histograms overlaid with open circles) the ratios of the N = 1 results over the N = 2 ones, both for µ Q = 50 GeV. We have also recomputed the Alpgen predictions, by adding the H + 3 parton sample, for consistency with N = 2. The corresponding results will not be shown in the plots, since these are already quite busy, and there is no difference at all in the patterns discussed above, except in a very few cases which we shall comment upon when appropriate.
The common feature of all but one of the observables presented in figs. 6-8 is that they are extremely close, in both shape and normalization, to their N = 1 counterparts of figs. 3-5. This is highly non-trivial, since the individual i-parton contributions are different in the two cases. The exception is the pseudorapidity of the second-hardest jet (upper right panel of fig. 7), which the inclusion of the 2-parton sample turns into a more central distribution, as anticipated in the discussion relevant to fig. 4, and brings it very close to the Alpgen result obtained with the same µ Q .
The small impact of the increase of the largest multiplicity is also generally in agreement with what is found in Alpgen, where the inclusion of the H + 3 parton contribution changes the fully-inclusive rate by +0.3%. The effects on differential observables are also comparably small, growing to only a few percent in the tail of p T distributions. The exceptions are p T (j 2 ) and p T (j 3 ), which are significantly harder in Alpgen after the inclusion of the H + 3 parton sample (a 20% and 30% effect respectively). However, this may be related to the fact that such an inclusion also leads to larger kinks (of which we see no trace in the NLO-merged results) at p T ≃ µ Q for these two observables; thus, the hardening may be partly an artifact of the LO-merging procedure; in order to further this point, it will be useful to study the merging systematics affecting p T (j 2 ) and p T (j 3 ) in Alpgen. It is obvious that the relative stability of the NLO-merged results against the inclusion of higher multiplicities is observable-dependent. η(j 2 ) gives a counterexample, but more spectacular ones can be found by looking at correlations. As an example of this, we show in the upper left panel of fig. 8 the azimuthal distance ∆φ(j 1 , j 2 ) between the two hardest jets, for three different cuts p T (j 2 ) > p cut T on the subleading jet 14 . The plots are presented in the form of ratios of the N = 1 results over the N = 2 ones. When p cut T is well below the matching scale (10 GeV vs 50 GeV, see the upper inset), ∆φ(j 1 , j 2 ) is MC-dominated, and the inclusion of the 2-parton sample is irrelevant. However, matrix-element effects start to be felt when p cut T is increased (middle inset), to become quite important at p cut T = 50 GeV (lower inset). This clearly shows the impact of the largest matrix-element multiplicity on ∆φ(j 1 , j 2 ), and demonstrates that for the predictions of observables exclusive in up to J jets is best to have N ≥ J. We conclude this section by remarking that in general the merging systematics is smaller (although for some observables marginally so) when N = 2 than in the N = 1 case, which is exactly what one expects in a "converging" procedure, where the N = 2 results are better (i.e., more accurate) than the N = 1 ones.

e + ν e production
In this section, we present the results for e + ν e production, limiting ourselves to the case N = 1, sharp-D function, and Sudakov-reweighted merging. We have treated the electron as massless, and have set m W = 80.419 GeV, Γ W = 2.0467 GeV, and where M (eν) and p T (eν) are the invariant mass and transverse momentum of the lepton pair respectively; the former is also required to obey the following constraints: The one-loop matrix elements are computed with MadLoop [37]. We have considered three values for the matching scale, µ Q = 35 (our default), 20, and 50 GeV. As a shorthand notation, we may denote by W the e + ν e pair in the following and in the labels of the plots. Our predictions are presented in figs. 9-11, where we have used the same layout and patterns as e.g. in fig. 3 (except for the Alpgen results, which we did not generate for this process). As in the case of Higgs production, the merged results are fairly smooth, and affected by merging systematics which are at most a 15% effect, but typically much smaller than that. Similarly to what happens for Higgs, some of the p T or d distributions display wiggles at the transition between regions dominated by different i-parton samples. While this is just a manifestation of the merging systematics, in phenomenology-oriented applications one may consider using a smooth D function, as was already suggested in sect. 3.1. As is expected, the inclusion of the 1-parton sample induces a hardening in the tails of p T distributions w.r.t. the standalone W +0j results (see the dashed blue histograms in the upper insets), while affecting the fully-inclusive rates only in a marginal manner. In fig. 11 we show the rapidity difference between the lepton pair and the hardest jet, for four different p T cuts on the latter. This is in fully analogy with the case of fig. 5, and it is immediate to see that the general discussion given there applies to e + ν e production as well. In particular, the pattern of the presence or absence of the dip is exactly the same, while the specific behaviour at a given p cut T is different because of the differences between the two processes (i.e., m W vs m H and qq vs gg). We also point out that fig. 11 can be quite directly compared with figs. 5 and 6 of ref. [81], where the same observable is computed with standalone MC@NLO matched with Q 2 -ordered Pythia6; this underlines again the MC-dependence of these distributions for small p cut T . Figure 10: As in fig. 9, for the electron p T (upper left) and pseudorapidity (upper right), the pair rapidity (lower left), and the hardest-jet pseudorapidity (lower right). The latter observable is obtained with a p T (j 1 ) > 30 GeV cut.

tt production
We conclude this phenomenology section by presenting our predictions for tt production, again limiting ourselves to the case N = 1, sharp D function, and Sudakov-reweighted merging. The top quarks are produced on the mass shell; they are decayed leptonically in the shower phase, in order to limit the contamination of the hadronic activity of the events. Furthermore, the b quarks emerging from the top decays are not included in the jets whose distributions we present below. We have set m t = 172.5 GeV, and µ ME = max (m T (t), m T (t)) , where we have denoted by m T (X) the transverse mass of X. We point out that the top Figure 11: As in fig. 9, for the difference in rapidity between the e + ν e pair and the hardest jet, for four different p T cuts on the latter.
quarks are not included in the clustering algorithm that determines the d i 's of sect. 2.2, which enter in the merging procedure. This implies that gluon radiation off t andt is not constrained, which is an acceptable (and approximate) solution only because such radiation is quite unlikely to be hard in the case of the very massive actual top. The one-loop matrix elements which contribute to the 0-and 1-parton samples are those originally computed in refs. [82] and [83] respectively. We have considered three values for the matching scale, µ Q = 100 (our default), 45, and 155 GeV. We have also obtained results with Alpgen, using 45 GeV as matching scale in the MLM procedure, and generating the tt + 0, tt + 1 and tt + 2 parton samples for consistency with the N = 1 case.
For tt production, one could repeat most of what has been said in sects. 3.1 and 3.2. However, there are a few specific features that are worth stressing. Firstly, the merging systematics is greater than previously observed. In part, this is due to the very large range of matching scales adopted here, but it is also related to the dynamics of the present process. Namely, up to quite large p T values (one can use the pair transverse momentum to be definite) the standalone MC@NLO tt + 0j result is larger in absolute normalization than the tt + 1j one; this is the combined effect of the fact that the shower easily produces hard radiation (as a consequence of the top mass driving the setting of the shower scale to relatively large values), and of the large K factor in the tt + 0j NLO computation. This feature is easily seen e.g. in the upper inset of the upper-left panel of fig. 12 -the relative difference between the dashed blue and dotted red histograms is about 30% for p T (tt) ≥ 100 GeV. Secondly, there is a good agreement between Alpgen, and the merged-NLO Figure 13: As in fig. 12, for the top-quark p T (upper left) and rapidity (upper right), the pair rapidity (lower left), and the hardest-jet pseudorapidity (lower right). The latter observable is obtained with a p T (j 1 ) > 30 GeV cut.
results obtained with the same matching scale µ Q = 45 GeV (see the comparison between the dot-dashed blue and the solid-plus-crosses green histograms in the lower insets). It will be interesting to see whether this agreement holds also for much larger matching scales, such as those adopted here at the NLO. However, one starts to see deviations between the two computations when approaching large p T 's, which can be in part due to the feature of the NLO cross sections outlined above, and in part to the different scale and PDF choices made in Alpgen and MC@NLO. Thirdly, the agreement between the NLO-merged and Alpgen results for η(j 1 ) (see the lower right panel in fig. 13) shows how the inclusion of the 1-parton sample addresses the issue raised for standalone MC@NLO at the end of Figure 14: As in fig. 12, for the difference in rapidity between the tt pair and the hardest jet, for four different p T cuts on the latter.
sect. 4 of ref. [80] for such a variable 15 . However, from that figure one can also see that the systematics affecting the shape is quite large. Finally, for the y(tt)−y(j 1 ) difference, shown in fig. 14, a dip or depletion is present for all the p T (j 1 ) cuts if µ Q = 100 GeV is used, while it disappears in all cases except in the p T (j 1 ) > 10 GeV one when µ Q = 45 GeV, for both the NLO-merged and Alpgen results.

Conclusions
We have presented a procedure for the merging of MC@NLO simulations characterized 15 Ref. [80] actually discussed the case of rapidity (and with different jet-finding parameters), which is obviously fully analogous to what is done here. by underlying processes with different parton multiplicities. It entails only minor changes to the standard MC@NLO short-distance cross sections, and therefore is not difficult to incorporate into existing implementations. We have done so in the automated aMC@NLO framework, which guarantees maximum flexibility and independence of the process.
We have provided no proof, and made no claims, on the formal accuracy of the merging. Instead, we have thoroughly compared the merged results with those of standalone MC@NLO, for the three sample cases of Standard Model Higgs, e + ν e , and tt hadroproduction, and found agreement in shape and normalization where relevant. We have also studied the theoretical uncertainties that affect the merging, and they have turned out to be quite small. Although the agreement with LO-merged results computed with Alpgen and the MLM approach is generally good, there are a few differences which will deserve further investigations.
There are obviously several open questions, both theoretical and phenomenological, and this work should be seen only as a first step towards further improvements. To name just a few: the application to processes that feature b quarks or light jets at the Born level of the lowest multiplicity; the use of alternative definitions of the d i 's; the use of different scales in the short-distance cross sections and in the showers; the possibility of including other features of the CKKW scheme, such as PDF reweighting; the potential role of vetoedtruncated showers; the use of tree-level matrix elements only (as opposed to NLO cross sections) for large multiplicities; the issue of logarithms of higher orders in showers. In the near future, we plan to apply the method proposed here to the Herwig++ and Pythia8 Monte Carlos (which should not require any modifications), to extensively compare our results with other LO and NLO merging techniques, and to validate our approach using LHC data.