W and Z bosons in association with two jets using the POWHEG method

In this work we present the implementation of generators for W and Z bosons in association with two jets interfaced to parton showers using the POWHEG BOX. We incorporate matrix elements from the parton-level Monte Carlo program MCFM in the POWHEG BOX, allowing for a considerable improvement in speed compared to previous implementations. We address certain problems that arise when processes that are singular at the Born level are implemented in a shower framework using either a generation cut or a Born suppression factor to yield weighted events. In such a case, events with very large weights can be generated after the shower through a number of mechanisms. Events with very small transverse momentum at the Born level can develop large transverse momentum either after the hardest emission, after the shower, or after the inclusion of multi-parton interactions. We present a solution to this problem that can be easily implemented in the POWHEG BOX. We also show that a full solution to this problem can only be achieved if the generator maintains physical validity also when the transverse momentum of the emitted partons becomes unresolved. One such scheme is the recently-proposed MiNLO method for the choice of scale and the exponentiation of Sudakov form factors in NLO computations. We present a validation study of our generators, by comparing their output to available LHC data.


Introduction
The production of a vector boson at large transverse momentum, i.e. in association with jet activity, played an important role as a testing ground for perturbative QCD calculations and event generators. Nowadays, these processes continue to be used for model testing. However, their primary importance derives from their role as a background to rarer physics processes. Thus the production of a Higgs boson decaying to W + W − , where one of the W bosons decays hadronically, has as its principal background the W + 2 jet process. The W + 2 jet process is also a background to W W and W Z production where one W or the Z boson decays hadronically. With higher jet multiplicities the W +jets process is the primary background to top pair production. In addition the Z + 2-jet process with the Z-boson decaying to neutrinos gives a signature of two jets and missing energy that constitutes a background to searches for weakly interacting particles, such as those that occur in supersymmetry [1] or in the search for dark matter [2].
Ten years ago, the next-to-leading order (NLO) corrections to W +2-jet and Z+2-jet production (from now on W jj and Zjj) at hadron colliders were published [3,4]. These calculations were performed at the parton level, so the predicted final states are jets of partons, rather than the hadrons observed in the experiment. NLO results for vector bosons in association with 3 and 4 jets have also been obtained [5,6,7,8,9].
Although parton level predictions have been successfully compared with experiment [10,11,12,13] and continue to provide a useful benchmark, NLO calculations are most useful when matched with parton showers such as PYTHIA [14,15] or HERWIG [16,17], including an implementation of hadronization models, that describe the transition from partons to the observed hadrons. Viable methods for matching NLO calculations with parton shower generators have been given in refs. [18,19,20].
Both the W jj and Zjj processes at NLO matched with parton showers have been previously considered in the literature. Using the aMC@NLO method, ref. [21] gives results for the W jj process at the Tevatron, in order to more reliably assess the principal Standard Model background to the Tevatron anomaly [22]. 1 More recently, results for W jj and W jjj processes have been presented using a variant of the MC@NLO method [18] in the SHERPA framework [24]. Meanwhile, the Zjj process has been considered previously in the context of the POWHEG BOX by Re in ref. [25]. The electroweak production of W jj [26] and Zjj [26,27] matched with parton showers has also been considered recently.
In this paper we present new POWHEG generators, based upon the POWHEG BOX [28] framework, for the W jj and Zjj processes. We will refer to these henceforth as the WJJ and ZJJ generators. These generators present a considerable advantage in speed (more than a factor of three improvement) as compared to the previous ones, thanks to the use of the matrix elements taken from the MCFM program. These rely upon the calculation of ref. [29] for the virtual amplitudes, and of ref. [30] for the real contributions.
Our generators, being based upon POWHEG, offer the possibility of investigating the effects of matching with all parton shower models that comply with the Les Houches requirements for interfacing showers to user processes. In this work we have mainly used PYTHIA 6, but we have also checked that no substantial differences are seen with PYTHIA 8. The programs can be interfaced as easily to HERWIG and HERWIG++. In this case one should also supply truncated showers [19] in order for coherence in soft emissions to be correctly treated. The full impact of truncated showers, in particular for complex processes, remains to be properly assessed. However, in the POWHEG simulations where they have been implemented and studied, their effect was found to be basically negligible [31,32,33,34,35]. It was also shown that truncated shower effects can be mimicked well simply by a clever choice of shower starting scales [36].
In this work we also address some physics issues, that are relevant to the matching of 1 A recent CDF analysis of a larger data set [23], accounting for a number of subtle systematic effects -most notably improving the treatment of jet energy scale corrections, finds no evidence for the earlier anomaly.
NLO and parton showers for all processes with associated jets present at the Born level. 2 As is well known, in the framework of shower models, a generation cut is needed in these cases at the Born level, in order to build a finite sample. In fact, the Born level cross section diverges when the associated jet has small transverse momentum. One then imposes that, at the Born level, the transverse momenta of the jets must be above a generation cut. Thereafter it is necessary to ensure that events that do not pass the generation cut, would not, after showering and hadronization, contribute appreciably to the final output after the analysis cuts are imposed. In other words, the generation cut should be low enough so that events lying just below the p t cut at the Born level, would only rarely be promoted to high p t events after showering and hadronization. In the POWHEG approach, this "promotion" of low p t to high p t events can take place at three levels: at the generation of the hardest radiation, at the shower level, and at the hadronization level, especially after accounting for multi-parton interactions. This last point is particularly insidious, since (for example) a W production event with very little jet activity after the hardest emission and parton shower, may acquire hard jets because of multi-parton interactions. In this paper we examine in detail these problems and propose a suitable solution that makes use of the recently developed MiNLO method [37,38].
In our current implementation of the ZJJ and WJJ generators, we have incorporated several extensions and improvements to the POWHEG BOX framework, that will become eventually a "Version 2" of the POWHEG BOX. These extensions are described in the following section.

Description of the implementation
The routines for the generation of the Born phase space have been taken, with minor modifications, from ref. [39]. The remaining ingredients needed for the POWHEG BOX implementation are as follows: the lowest order (Born approximation) matrix element squared; the real matrix element squared for the emission of one extra parton; the virtual contributions coming from the interference of the one-loop matrix element with the lowest order matrix element; the color correlated Born and spin correlated Born matrix elements squared; the Born amplitudes in the large N c limit (where N c is the number of colours) needed for assigning a colour structure to the generated Les Houches events.
Previous implementations of W/Zjj processes have used automatically generated Madgraphstyle calculations for all or some of these ingredients. The Madgraph-POWHEG BOX interface presented in ref. [39] can be used to generate automatically all the above ingredients, except for the phase space and the virtual corrections.
Our implementation uses the simpler analytic expressions for all matrix elements that are used in the MCFM program, leading to an improvement in computing time performance. These rely upon the calculation of ref. [30] for the real, and of ref. [29] for the virtual contributions. The new analytic routines have been checked by comparing with the routines generated automatically using the Madgraph-POWHEG BOX interface. We have also compared the virtual matrix elements of the ZJJ generator with those derived from the BlackHat results and found agreement. The MCFM matrix elements have also been checked with the MadLoop results [40].

Improved aspects of the POWHEG BOX
Our implementation makes use of improved aspects of the POWHEG BOX, that at some point will lead to a new version of the package. These aspects are: • The possibility of generating the importance sampling grids by parallel runs. This was not possible with the standard POWHEG BOX version. Without this feature, the generation of the importance sampling grid for the integration becomes a bottleneck for complex processes, where a single processor run requires too much time.
• The possibility of performing scale variation or PDF variation studies using reweighting. Again, for complex processes this is the only practical way to study uncertainties.
• The possibility to adopt the MiNLO [37,38] procedure for processes with associated jets.
• An improved structure of the generation of the underlying Born kinematics, first introduced in ref. [41], and now made available for all processes, which largely increases the generation efficiency.

Generation cuts and Born suppression factor
The Born matrix elements for the Zjj and W jj processes contain both soft and collinear divergences. These require a special treatment, in order to avoid the predominant generation of soft or collinear events that will be removed by phase space cuts on the jets. This problem is typically handled by introducing generation cuts that avoid the soft and collinear region in the underlying Born. Alternatively one can introduce a suppression factor [42] that depends upon the kinematics of the underlying Born configuration. Here we use this second alternative. We use a modified (Born suppressed)B function of the formB Events are then generated as usual, except that a variable weight equal to 1/F (Φ n ) is applied, leading to a weighted event sample. We chose as the reweighting factor the form where p 2 1 , p 2 2 are the transverse momentum squared of the two recoiling partons 1 and 2 and .
The cutoff parameter Λ is set of the order of the typical jet transverse momentum cutoff that we need in the analysis.

Scale choice
We consider in the present work two schemes for the factorization and renormalization scale. The first one usesĤ T /2 as a central scale, witĥ where M V is the mass of the W or Z boson, p t,V its transverse momentum and p (j) t denotes the transverse momentum of parton j. As customary in POWHEG, theĤ T value is computed using the underlying Born configuration of the event. We have checked that there is little difference when using real momenta in theĤ T definition. The second scheme relies on the MiNLO method, illustrated in detail in ref. [37]. We remark here that, when using the MiNLO method, it is no longer necessary to include a suppression factor. The Sudakov form factors introduced in the MiNLO procedure suppress the region of soft and collinear emission, leading to finite predictions. Furthermore, as shown in refs. [37,38], when MiNLO is applied, the predictions for W/Zjj merge smoothly into the prediction for W/Zj and even for the inclusive W/Z production when one or two partons become unresolved.

Problems with singular underlying Born configurations
The Born suppression method works if, after imposing the analysis cuts, the events with larger weight appear with a much smaller frequency. We note that the method of Born suppression is in fact more general than the commonly adopted method of using generation cuts. In fact, one may argue that these are equivalent to a Born suppression factor that is infinitesimally small in the regions below the cut. This leads to the possibility of events with infinitely large weights, that however will never appear in any finite number of events. Thus, when generation cuts are used, one should study the dependence of the result upon the cuts to make sure that events right below the generation cut do not contribute appreciable effects to relevant kinematic observables. On the other hand, if a suppression factor is used, it is guaranteed that for a sufficiently long run one obtains the right result. Thus, the use of a Born suppression factor has the advantage that one does not need to check the dependence of the result upon the generation cut.
However if the Born suppression factor is very severe then, in a run with a limited number of events, it is possible that no events with very large weights are generated. In this case these events will only appear for a sufficiently large sample and, because of their very large weight, they will lead to large spikes in the computed distributions. This is a problem that has occurred in practice while studying our W/ZJJ generators with high statistics runs.

Statement of the problems
In the present work we have extensively studied this "spikes" problem, and have found essentially three problematic areas that have to be dealt with in order to solve it. These are: 1. Given an underlying Born configuration near a singularity, the POWHEG hardest radiation may generate a phase space point that is away from any singularity. It may thus happen that a point passing the analysis cuts at the Les Houches level, may in fact carry a large weight, because the corresponding underlying Born configuration is soft or collinear.
2. An event that originates from an underlying Born configuration near a singularity, and that after the generation of the hardest radiation is still near the singularity, may lead to an event that passes the analysis cuts after the parton shower. In other words the shower may lead, in rare cases, to radiation that is harder than the one in the Les Houches event.
3. An event that originates from an underlying Born configuration near a singularity, and that reaches the shower stage still remaining near the singular region, may become a hard event, passing the analysis cuts, after hadronization and the inclusion of the underlying event. This possibility is especially due to multi-parton interactions, that can generate pairs of hard jets accompanying the basic hard interaction.

Problem 1
We illustrate the first problem using the simpler example of vector boson production in association with one jet. In POWHEG, one begins by generating an underlying Born configuration, consisting of the vector boson V in association with one parton. In the partonic CM system, this configuration is singular when the emitted parton has small transverse momentum. Let us now consider the configuration when the parton has appreciable energy, but nearly zero emission angle. This event is passed to the mechanism for the generation of the hardest radiation in POWHEG. It may happen that the hardest radiation is from the initial state. In this case, the full event kinematics is constructed by boosting the underlying Born event in the transverse direction, in order to balance the transverse momentum of the radiation [20]. Following this, the real emission configuration would be away from all singular regions, since the transverse boost will give a transverse momentum also to the parton in the underlying Born configuration. In the case of final state radiation, a similar circumstance may arise, with the real radiation originating from the large angle splitting of the Born parton (collinear to the beam) into two partons, balanced in transverse momentum. Also in this case the real emission configuration is non-singular. The POWHEG formula for the hardest radiation is schematically given by where we have used the standard notation of the POWHEG literature [19,20]. The Sudakov form factor ∆(k T ) is such that up to corrections that are suppressed by typical hadronic scales. Let us assume first that no generation cuts or Born suppression factors are used. The POWHEG generator works in a two-stage process. First the underlying Born kinematics is generated with a probability proportional toB. Then the radiation kinematics is generated with a probability given by eq. (3.1). So, if there is a singular region for the underlying Born, that region will have a largely enhanced generation. However radiations that are not as singular will be suppressed by the R/B ratio, leading to a finite, regular contribution. We now observe that, although formally correct, the above approach may lead to practical problems in the implementation. Let us suppose first that we use a generation cut, avoiding the regions of underlying Born kinematics where the parton in the Born configuration has small transverse momentum. By doing this, we are also cutting off the corresponding region of real radiation that has a singular underlying Born.
A very clear example is given again in the W/Zj example, when in the real emission configuration the vector boson has nearly zero transverse momentum, while the two emitted partons have a relatively large, balanced transverse momenta. This real event admits an underlying Born structure where the vector boson has nearly zero transverse momentum, and the two partons are merged into a single parton with small transverse momentum. Clearly, when using a generation cut, one removes this type of configuration. The more one lowers the cut, the better these configurations are covered, at the cost of increasing the rate of production of soft events that will never pass the final cuts. If instead we are using a Born suppression factor then these configurations will always be produced, but they will arise rarely and with a very large weight, leading to undesirable spikes with large errors in the final distributions.
A simple way to avoid this problem is to modify the association of the real contributions with an underlying Born, in such a way that the regular real contributions associated with singular Born ones are suppressed. In POWHEG the separation of regions in the real cross section is performed as follows where α labels each possible collinear singular region of the real amplitude and d α are functions of the event kinematics that vanish in the singular regions. They are typically given by (some power of) the transverse momentum of the parton for the ISR collinear regions, and by the relative transverse momentum of the partons in the FSR ones. We thus replace the functions d α with new functionsd α , given by the following prescription:d where now with β we label the singular regions of the underlying Born configuration associated with the singular region α in the real contribution, and d ub α,β are functions of the underlying Born configuration kinematics that vanish when such a configuration approaches the singular region β. In practice, we choose them of the same form as the d α . This modification has the effect of slowing the vanishing of the d α coefficient near the singular region, if the corresponding underlying Born is also singular.
It is instructive to see the effect of the modification eq. (3.4) in our previous example of a vector boson produced at very small transverse momentum, associated with two partons with relatively large transverse momenta that balance each other. Labelling the two partons with 1 and 2, we have three regions: region 1, with parton 1 collinear to the beam, region 2, with 2 collinear to the beam, and region 12 with 1 collinear to 2, with their associated d α functions d 1 , d 2 and d 12 . With the old d α definition the d 12 term is of the same order as the others. With the new definition, instead,d 12 is much bigger, because the underlying Born configuration of the 12 region is singular, and, as dictated by eq. (3.4),d 12 includes a large extra factor 1/d ub 12,0 , where 0 labels the (only) singular region of the underlying Born corresponding to the Born parton becoming collinear to the beam. On the other hand, the underlying Born corresponding to regions 1 and 2 is non-singular. In fact, if for example we consider region 1, the underlying Born in the POWHEG formalism is obtained by boosting the V -2 system (i.e. the system consisting of the vector boson plus parton 2) in the transverse direction, in such a way that after the boost the V -2 system is balanced, and the vector boson has non vanishing transverse momentum. Thus, in the partition of R according to the singular regions, the 12 region is largely suppressed.
In the following we show the effect of the implementation of the prescription in eq. (3.4). We consider W + → µ + ν production at the 7 TeV LHC, with a standard set of cuts on the lepton, missing energy and jets, requiring two jets with transverse momentum larger than 20 GeV. The cuts are given later in table 1, but their precise form is irrelevant for the present considerations. In order to speed up the computation, we have not used in our run the "folding" [28,43] option of the POWHEG BOX. By doing so, we have a relatively large fraction of negative weighted events.
In fig. 1 we plot the cross section as a function of the event weight, as obtained with the old definition of the d α functions and the new one (i.e. thed α of eq. (3.4)). In the figure we plot both the absolute value of the cross section (left) and the signed cross section (right). We see that there is a tail of events with large weight, their contribution becoming smaller as the weight increases. In order to have convergence, the contribution to the cross section should decrease faster than the inverse of the weight. We see that this is not the case for the old d α definition as far as the absolute value of the cross section is concerned. The large tail of weights causes the "spike" problem when building histograms using the old d α definition.
It is likely that, by using the folding feature to reduce the impact of negative weights, the "spike" problem would also be reduced with the old d α definition. However it is also clear from fig. 1 that the new definition, besides being more correct from a formal point of view (as we have shown previously), also performs better in all cases.  Figure 1: Cross section as a function of the absolute value of the event weight for W + production in association with two jets (where cuts and jet definition are specified in table 1) at the 7 GeV LHC. The POWHEG generation uses theĤ T /2 scale choice, and the analysis is performed at the Les Houches Event level, that is to say, no further shower is applied to the event. The effect of using the old or the new d α definition is shown. In the left plot, the absolute value of the cross section is histogrammed, while on the right plot the signed value is used.

Problem 2
The second kind of problem may arise if the POWHEG generator passes to the shower program a transverse momentum upper limit for subsequent radiation that is too high. (In practice the transverse momentum upper limit is the value for the scalup variable in the Les Houches common block). This problem was seen to affect especially the POWHEG dijet production generator [44], and has been definitely solved with the introduction of the doublefsr option [45]. In the case of vector boson production in association with two jets, this option does not seem to have visible consequences. However, we have set this feature as default in our WJJ and ZJJ generators.

Problem 3
We now examine how an event that is soft at the parton level may become hard after the inclusion of hadronization effects. In our studies, we have considered a large set of physical distributions at three levels: at the parton level after shower, after the hadronization but without multi-parton interactions (MPI from now on), and after hadronization including MPI. Our observables typically require jets with a transverse momentum above 20 GeV. With this requirement we see only small effects when comparing the first two cases. However we do see important effects at the third stage, when MPI are turned on. This is not surprising, since an event of inclusive vector boson production with relatively modest radiation activity may be accompanied by a pair of relatively hard balanced jets due to a secondary partonic collision. The cross section for such events is given by the product of the inclusive W cross section and the two jet cross section for dijets with transverse momentum above 20 GeV, divided by the effective cross section for multi-parton interactions, σ eff . The probability for MPI events in our sample is enhanced due to the fact that the inclusive W cross section is about a factor of 30 larger than the cross section for W production in association with at least two 20 GeV jets. A back-of-the-envelope estimate yields a probability of the order of 10%. This genuine physical effect should be accounted for correctly in a generator, and thus the NLO generator should also provide a fair description of the inclusive production of the vector boson.
In the present work we will rely upon the MiNLO method, introduced in ref. [37]. With this method the inclusive cross section for the generation of the vector boson comes out remarkably close to the NLO inclusive cross section. This property is discussed in detail in the relevant references [37,38]. Here we only give a brief reminder of how this is possible, and why it is the case. In the MiNLO method, generated parton level kinematic configurations are clustered (using, for example, the k t -clustering algorithm) in such a way that a sort of branching history is constructed. Coupling constants and Sudakov form factors are assigned to the branching vertices, according to the CKKW procedure [46], taking care to subtract from the NLO corrections all contributions that are overcounted. Once Sudakov form factors are included, if we integrate out an unresolved emission, the combination of the Sudakov form factor, with the soft or collinear factorization of the amplitude yields an approximate identity, often referred to as the "unitarity" of the splitting process. Thus the amplitude approaches an amplitude with one less emission. This property works recursively, so that the generator for the production of a heavy system plus two jets, when integrating over the jets, yields a good approximation to the distributions for the inclusive production of the heavy system alone.
In fig. 2 we plot the cross section for events that have at least two jets at the hadron  Figure 2: Fraction of events as a function of the absolute value of the event weight for W + production, with the standard cuts of table 1, in association with two jets with transverse momentum larger than 20 GeV, at the 7 GeV LHC. The POWHEG generation uses theĤ T /2 scale choice, and the events are fully showered with PYTHIA 6. The effect of switching on and off the multi-parton interaction is shown. In the left plot, the absolute value of the cross section is histogrammed, while on the right plot the signed value is used. level, as a function of the event weight, comparing the results obtained by switching on and off the multi-parton interaction mechanism in PYTHIA. The results are obtained with the prescription of eq. (3.4), which from now on will be our default, and with theĤ T /2 scale choice. From the previous discussion, it is clear that we don't expect to give a reasonable description of events where relatively soft QCD radiation is accompanied by relatively hard multiple scattering events. It is however interesting to see what goes wrong. From the absolute value plot, it is clear that the inclusion of MPI leads to the presence of a large number of events with large weights. When histogramming standard observables, these large weights lead to very poor results. From the plot of the signed value of the cross section we see that these large weights largely cancel. Nevertheless, the statistics required in order to produce reasonable plots of standard physics observables becomes prohibitively large.
The large, rising tail in the left plot of Figure 2 can only arise from events that are soft at the parton level, and that acquire hard jets because of secondary collisions. By comparing this tail with the one on the right plot, we deduce that there are a large number of negative weighted events that cancel a large fraction of positive weighted ones, all being at low p t . This is as expected, since the calculation performed with theĤ T /2 scale becomes unreliable in the low p t region, where negative virtual corrections can overwhelm the Born cross section and lead to negative results. In this case, even the folding feature may not be sufficient to get positive events. Fig. 3 is fully analogous to fig. 2, except that now the MiNLO prescription is used. We see that the rising contribution of high-weight events disappear, and that the total contribution is somewhat smaller. We expect this, since the MiNLO result has a finite total integral, while the one with a standard scale choice diverges at small transverse momenta. However, it is also clear that also in this case we will get spikes in the final distributions. The reason for this is quite clear. We have suppressed with a Born suppression factor the configurations of W production accompanied by low radiation activity. The MPI can promote these configurations to hard ones, so that they eventually pass our jet cuts. However, they are produced rarely and with a large weight. If we want to describe reasonably well this process, the only way is not to suppress the low p t region. In fact, it is not necessary to use a Born suppression factor at all when using MiNLO, since the MiNLO Sudakov form factors damp the cross section in the singular regions. If one would like to increase the number of events produced at relatively large transverse momenta, one can use a suppression factor that does not vanish in the regions with singular Born kinematics.
We finally stress that the true advantage of the MiNLO approach is that the value of the inclusive cross section is sensible, and it is quite close to the results that can be obtained with the W [47] and WJ [42] generators [37]. This being the case, we can trust that also the contribution coming from events with low radiation, accompanied by hard secondary parton interactions, are represented in a fair way. In section 5 we will only use the MiNLO improved generators to compare with real data. Later in section 6 we will study some observables that are sensitive to multi-parton interactions.

Comparison of the MiNLO andĤ T /2 results
Since the MiNLO method is relatively new, in this section we compare the results obtained using it with the results using the more commonĤ T /2 scale choice. For this purpose we only consider W → µν at 7 TeV, with the cuts of table 1. We don't expect significant differences for the Z production case.
We shower the events generated with POWHEG using the PYTHIA 6 Monte Carlo. We have used the AMBT1 PYTHIA tune (i.e. we have performed the call CALL PYTUNE(340)). We have switched off the hadronization and the underlying event, by setting MSTP(111)=0, MSTP(91)=0 for switching off hadronization, and MSTP(81)=20 for switching off the MPI. We do not see significant differences when hadronization is turned on, provided that MPI are switched off. On the other hand, as explained earlier, we were unable to produce usable plots when using the scaleĤ T /2, if MPI are turned on.
We have considered a very large set of distributions, comprising all those that have been considered by the ATLAS collaboration [12]. Observables that do not require the presence of at least two jets cannot be computed using the WJJ generator with theĤ T /2 scale choice. For all observables that do require at least two jets, we see very good agreement between the two approaches. Here we show only, in figs 4 and 5, the inclusive jet multiplicities,  (for details on how this is implemented in MiNLO, see refs. [37,38]). Statistical errors (not included) are negligible with respect to the scale variation range. Notice that in fig. 4, left plot, no results are given for the 0 and 1 inclusive jet multiplicity using theĤ T /2 scale choice, since, as stressed before, these cannot be computed with the WJJ generator. We notice that not only the central values are in good agreement, but also the scale variation bands are of the same order.

Comparison with available data
Results for W +jets production at the LHC, operating at 7 TeV, have been presented in ref. [48] by the CMS collaboration and in ref. [13] by the ATLAS collaboration. Similarly, CMS and ATLAS have published results for Z+jets final states, at the same operating energy, in ref. [48] and ref. [12] respectively. In this section we compare the results of the WJJ and ZJJ generators with LHC data. We will limit ourselves to ATLAS data for reasons of space.
We will include in our predictions only the theoretical errors due to scale uncertainties. We believe that this is sufficient for the present purpose of validating our generators. We remind the reader, however, that other sources of errors are present, especially when dealing with quantities that are sensitive to the fourth jet or beyond. These jets are generated by PYTHIA, and are only accurate in the collinear approximation. Changing or tuning the shower model may well have a considerable impact on these quantities.

W production data
For the sake of illustration, in this section we present results for W +jets production at 7 TeV compared to the ATLAS data of ref. [12]. We adopt the set of cuts displayed in table 1, that are taken from ref. [12]. The parameters used in our simulation are displayed Cuts for W production Jets defined using the anti-k t algorithm (R = 0.4), with p min t > 20 GeV, |η| < 4.4; One lepton required with p (l) t > 20 GeV, |η l | < 2.5; Lepton isolation required: ∆R lj for all jets (as defined above) > 0.5; One neutrino (missing E t ) required with p (ν) t > 25 GeV; Transverse mass constraint required: 2p Events are classified according to the number of jets, as defined above. in table 2. The parton distributions used are the CTEQ6M set taken from ref. [49]. We use FastJet [50] to cluster partons into jets via the anti-kt algorithm [51]. As motivated earlier, we use the MiNLO procedure for the scale choice, that gives reasonable predictions also for quantities requiring only one or no jets. The uncertainty bands comprise only the scale uncertainties, estimated with the prescription given in sec. 4.
We have generated samples of W + → e + ν and W − → e −ν production, and added them together. We have switched off electromagnetic radiation in PYTHIA 6 (i.e. we have set MSTJ(41)=11). In this way, the electron and muon decay modes do not differ in any appreciable way. Our results can be considered an average over muon and electron decay modes provided a dressed, rather than a bare lepton, definition is used. We thus compare them with the ATLAS results averaged over electrons and muons. We have used the AMBT1 PYTHIA tune (i.e. we have performed the call CALL PYTUNE(340)). We show here results obtained with the jet cut set at 20 GeV, while in Appendix A we show those obtained with a 30 GeV cut.
In fig. 6 we show the inclusive jet multiplicity compared to ATLAS data. We have also shown in this figure the results obtained interfacing the WJJ generator with PYTHIA 8. The    We see that, within present uncertainties, the data and Monte Carlo predictions agree quite well. Notice that also for four and five jet multiplicities, that are only described with collinear accuracy by the generator, the agreement is quite good. Notice also that the quantities depending upon the one jet final state are predicted adequately by the calculation.
The same consideration applies to all remaining observables measured by the ATLAS collaboration, that are shown in the remaining part of this section. We do however observe a tendency of our calculation to overshoot the data at small transverse momentum. This trend is also observed in ref. [13] when comparing data with NLO calculations.           Figure 20: Distance R j1j2 = (∆ 2 y j1j2 + ∆ 2 φ j1j2 ) 1/2 between the two leading jets in inclusive two-jet events.

Z production data
In this section we compare the output of our ZJJ+MiNLO generator with the data of ref. [12]. The parameters used in the simulation are the same as in sec. 5.1. The set of cuts are displayed in table 3. Again, we consider only e + e − decay but, since we switch off Cuts for Z production Jets defined using the anti-k t algorithm (R = 0.4), with p min t > 30 GeV, |η| < 4.4; Two opposite charge, same flavour leptons required with p (l) t > 20 GeV, |η l | < 2.5; Lepton-jet isolation: ∆R lj for all jets (as defined above) > 0.5; Lepton-lepton isolation: ∆R ll > 0.5; Constraint on dilepton invariant mass: 66 < m ll < 116 GeV; Events are classified according to the number of jets, as defined above. electromagnetic radiation in PYTHIA, our result can be considered valid for any "dressed" lepton analysis. Therefore they will be shown in comparison with the average data for electrons and muons.
We observe good agreement between the prediction of our generator and ATLAS data.    ZJJ-MiNLO ATLAS Figure 26: Distance R j1j2 = (∆y j1j2 + ∆φ j1j2 ) 1/2 between the two leading jets and invariant mass of the system of all jets (right) in inclusive two-jet events.

Observables sensitive to the MPI
As discussed in section 3, the contribution of the MPI to the cross section for vector boson production plus two jets may not be negligible. On the other hand, the W/ZJJ generators presented in this work, when augmented with the MiNLO procedure, yield a good prediction also for the MPI contributions, since they describe reasonably well also inclusive distributions.
An analysis of double parton scattering in the framework of W production has been presented by the ATLAS collaboration in ref. [52]. Here we examine two observables that should be particularly sensitive to the MPI. The first and most natural one is the transverse momentum of the vector boson in the two jet inclusive sample. The second one is the vector sum of the transverse momenta of the two jets, p jj T = |p j 1 T +p j 2 T | in two-jet exclusive events. 3 In all cases one expects a non negligible contribution from the MPI in the region of small transverse momenta.
In fig. 27 we show the transverse momentum of the W boson and the p jj T distribution for W production. It is clear that the inclusion of the MPI yields a noticeable change in the shapes. In particular, below 10 GeV, the contribution of the MPI increases, until it becomes dominant at small transverse momenta.
In fig. 28 we show the same quantities for Z production. Since now the jet cut is at 30 GeV rather than 20, the relative effect of MPI is reduced, but still quite visible. In conclusion, our study indicates that both observables can be conveniently used to study and tune multi-parton interactions. The transverse momentum of the Z bosons in the 2-jet sample (inclusive or exclusive) is also a particularly promising observable, provided the jet energy cut is low enough.

Conclusions
In the present work we have presented new POWHEG BOX generators for the production of a vector boson in association with two jets. This is not the first time that generators for these processes have appeared in the literature, and in the present work we claim to have improved in terms of speed (more than a factor of three) over previous POWHEG implementations. However, this is not the only progress that we have achieved. Although building NLO generators matched with showers for processes with associated jet production is today a straightforward task, several details about their correct use are still subject to study. In our opinion, substantial theoretical progress in this framework is desirable and indeed possible. The problem of scale choice in associated jet production is, for example, a detail with very profound implications [37,38].
In the present work, we have addressed issues and implications related to the use of a generation cut or of a Born suppression factor, that are needed when associated jet production is considered. It would be desirable that events that are soft at the underlying Born level remain soft after the hardest radiation, after shower, and after the inclusion of the underlying events and of multi-parton interactions. If this is not the case, hard jet distributions become sensitive to the generation cut, or, alternatively, if a suppression factor is used instead, rare events with very large weight contribute, making it difficult to interpret the results.
We have found that some simple modification of the kinematic functions used for the separation of the singular regions in POWHEG is very beneficial in reducing the probability that a soft underlying Born configuration may become hard after the POWHEG hardest radiation. The analogous problem, of an event that is soft at the underlying Born level, but becomes hard after the parton shower, and before hadronization, has also been discussed in the past in the framework of dijet production.
On the other hand, the case of events that are soft after shower, but become hard after the hadronization stage, because of multi-parton-interactions, cannot be dismissed or alleviated. This is in fact a physical effect that should be properly simulated. In order to do so, the generator should also be capable of describing the inclusive cross section (i.e. without jet cuts) with reasonable accuracy. By using the MiNLO procedure, this feature comes as a bonus.
These considerations are not only relevant for the ZJJ and WJJ generators. They are, of course, as significant for the HJJ generator. But we have found that they have also considerable relevance for processes involving a single jet, like the HJ, WJ and ZJ processes. We expect them to be important for all processes involving jet production, irrespective of the number of jets.
In this work we have also validated our full setup, involving both MiNLO and the newly introduced kinematic functions for the separation of the singular regions in POWHEG, against available data. This is the first extensive validation study of the MiNLO method. We have found a generally good agreement with data, even for one jet inclusive and totally inclusive observables, for which the method does not achieve NLO accuracy.
A. Comparison with data for p (j) t > 30 GeV for W + 2 jet production In this appendix we present plots comparing the data of the ATLAS collaboration for W + 2 jet production using a jet cut of p (j) t > 30 GeV with our predictions using the MiNLO method.              Figure 43: Distance R j1j2 = (∆ 2 y j1j2 + ∆ 2 φ j1j2 ) 1/2 between the two leading jets in inclusive two-jet events.