Automating the POWHEG method in Sherpa

A new implementation of the POWHEG method into the Monte-Carlo event generator Sherpa is presented, focusing on processes with a simple colour structure. Results for a variety of processes, namely e+e- to hadrons, deep-inelastic lepton-nucleon scattering, hadroproduction of single vector bosons and of vector boson pairs as well as the production of Higgs bosons in gluon fusion serve as test cases for the successful realisation. The algorithm is fully automated such that for further processes only virtual matrix elements need to be included.


Introduction
Higher-order QCD corrections by now form an important ingredient to many phenomenological studies and experimental analyses at both the Tevatron and the LHC. The impact of these corrections has been similarly important for various studies of HERA and LEP data. Calculations invoking such corrections, typically at next-to leading order (NLO) in the perturbative expansion in the strong coupling α s , and in very few cases also at next-to-next-to leading order (NNLO) accuracy, have been used for a wide range of precision tests of our understanding of QCD and the Standard Model. They are also important for the subtraction of backgrounds in searches for new physics. When being compared to such calculations, experimental measurements are usually corrected for detector effects, while the perturbative result is corrected for hadronisation. Only after performing these corrections, theoretical predictions and experimental data are on the same footing. Hadronisation corrections are typically determined by using multi-purpose Monte-Carlo event generators, such as PYTHIA/PYTHIA 8 [3], HERWIG/HERWIG++ [4], or, more recently, SHERPA [2]. In the past two decades, such event generators have been the workhorses of particle physics phenomenology, including in their simulation many features in addition to the perturbative aspects of a collision, such as hadronisation, the underlying event, hadron decays etc.. But, to describe perturbative QCD, they typically rely on leadingorder (LO) matrix elements only, combined with parton showers, which in turn model QCD radiation effects in a leading logarithmic approximation. Improvements to this approximation can be obtained through merging methods, pioneered in [5,6], and further worked out in different varieties at different accuracies and for different parton showers, e.g. in [7,8]. In this merging approach, tree-level matrix elements for processes with different jet multiplicities are combined with parton showers, avoiding problems related to double counting of emissions. Lately, a new formulation has been proposed which can be proved to preserve the formal accuracy of the parton shower,

Parton showers and the POWHEG method
In this publication, the POWHEG method is reinterpreted as an advanced reweighting technique for standard parton showers. The following section introduces the necessary notation and outlines the parallels between the POWHEG method and traditional matrix-element reweighting. The starting point of the discussion is the factorisation theorem underlying the specific parton-shower model, like the DGLAP equation [28], the colour-dipole model [29], Catani-Seymour factorisation [24] or antenna factorisation [30]. Except in collinear factorisation, the splitting functions of the parton shower depend on (at least) one additional parton, which is often referred to as the "spectator". In order to make this connection explicit, the notation of a dipole-like factorisation is adopted, which is sufficiently general to discuss all relevant features of the POWHEG method and its implementation into the SHERPA event generator.

Decomposition of real-emission cross sections
In the following, sets of n particles in a 2 → (n− 2) process will summarily be denoted by { a} = {a 1 , . . . , a n }, and the particles will be specified through their flavours { f } = {f 1 , . . . , f n } and momenta { p } = {p 1 , . . . , p n }. The generic expression for a fully differential Born-level cross section in a scattering process with (n − 2) final-state particles can be written as a sum over all contributing flavour combinations as The individual terms in the sum are given by Here, |M B | 2 ({ a}) denotes the partonic matrix element squared, with all factors due to averaging over initial state quantum numbers such as spin or colours absorbed into it, and dΦ B ({ a}) is the corresponding differential n-particle partonic phase-space element; S({ f }) is the symmetry factor due to identical flavours associated to the partonic subprocess, while F ({ a}) denotes the flux factor and L is the parton luminosity given by the corresponding parton distribution functions (PDFs). In the case of leptonic initial states, ignoring QED initial state radiation, the parton distribution functions f (x, µ 2 ) are replaced by δ(1 − x). In a similar fashion, the real-emission part of the QCD next-to-leading order cross section can be written as a sum, this time over parton configurations {a 1 , . . . , a n+1 }, i.e. including one additional parton. A corresponding subprocess cross section reads At this point, it is helpful to introduce a notation for mappings from real-emission parton configurations to Born parton configurations. Such mappings combine the partons a i and a j into a common "mother" parton a ı , in the presence of the spectator a k by defining a new flavour f ı and by redefining the particle momenta.
To be specific, The flavour of the "mother" parton, f ı , is thereby fixed unambiguously by the QCD interactions, while the flavour of the spectator, f k , remains unaltered. The momentum map guarantees that all partons are kept on their mass shell. Conversely, any Born parton configuration and a related branching process ı,k → ij, k determine the parton configuration of a real-emission subprocess as The radiative variables Φ R|B are thereby employed to turn the n-parton momentum configuration into an n+1-parton momentum configuration using the inverse of the phase-space map defined by Eq. (2.4). The flavour f j is again determined unambiguously by the QCD interactions. Here, also two obvious generalisations of Eq. (2.4) shall be defined, b ij,k ({ f }) and b ij,k ({ p }), which act on the parton flavours and on the parton momenta only. Correspondingly, such generalisations exist for Eq. (2.5).
In the soft and collinear limits, the partonic matrix element squared, R({ a}), can be decomposed as a sum of terms D ij,k ({ a}), These terms factorise into a Born-level term and a universal splitting kernel, encoding the transition of a ı to a i and a j [24]. The splitting is associated with a universal procedure for factorising the phase space integral into a Born level part and a one-particle radiative phase space.
The existence of universal decompositions like in Eq. (2.6) forms the basis of subtraction methods like the Catani-Seymour dipole subtraction [24], antenna subtraction [30], or the subtraction method of Frixione, Kunszt, and Signer [27]. It also serves as starting point for the construction of parton shower algorithms [31,32], which aim at approximating parton emissions in the collinear and soft limits of the radiation phase space, to resum the associated large logarithms, cf. Sec. 2.2. However, it is important to stress that, also away from the infrared limits, R({ a}) can be decomposed into a number of terms R ij,k analogous to D ij,k through .
3) can now be rewritten as a sum of trivially factorised contributions . (2.10) and R ij,k ({ a}) = L({ a})R ij,k ({ a}). These equations are key ingredients to understanding and implementing the POWHEG method.

Construction of the parton shower
Due to the non-Abelian nature of QCD, the terms D ij,k in Eq. (2.6) in general do not factorise on the level of squared matrix elements, including all colour contributions. To arrive at a practical model for a parton shower, sub-leading colour configurations are therefore neglected, which leads to an assumed factorisation on the level of squared matrix elements. In the infrared limits one can then write where the set of momenta b ij,k ({ p }) is determined by the phase space map of the parton shower model. 3 The quantities K ij,k are the parton shower evolution kernels, which depend on the parton flavours f i , f j and f k and on the radiative phase space. The denominator factor 2 p i p j or any linearly dependent quantity is usually used to define the parton shower evolution variable, in the following denoted by t.
Using the above model, the parton-shower approximation of Eq. (2.10) can be derived as . (2.12) Particles produced in the parton shower are resolved at a certain evolution scale and can therefore be distinguished from particles at higher and lower scales. At most the two particles a i and a j , emerging from the same splitting process, can be seen as identical. Hence, the ratio of symmetry factors in Eq. (2.10) changes to The integral over the radiative phase space can be written as with t the evolution variable, z the splitting variable, and φ an azimuthal angle. Here, J denotes the Jacobian factor, that potentially arises due to the transformation of variables. Equation (2.12) thus becomes .
(2.15) The assignment of the mother parton, the spectator and the underlying Born process can now be assumed to be fixed. Then, the sum runs over all possible real-emission configurations originating from this particular Born state instead. Furthermore, assuming independence of the individual emissions, i.e. Poissonian statistics, this leads to the constrained no-branching probability of the parton-shower model [33] between the two scales t ′′ and t ′ (2.16) It is worth noting that Eq. (2.16) depends on the underlying Born process, since the flavour and momentum of the spectator enter as arguments of J ij,k and K ij,k . The ratio of L in Eq. (2.16) accounts for a potential change of the parton luminosity when integrating over the initial-state phase space 4 . Note that the partons { a} B denote a Born-level set, while in (2.12) and (2.15) { a} denote a set of partons at real-emission level. This is also indicated by the subscript B in Eq. (2.16). Using the definition the total cross section in the parton shower approximation reads (2.18) 4 Note that, depending on the parton shower model, the x i do not necessarily fulfil the relation x i =x i /z [34,35].
The scale t 0 acts as the infrared cutoff of the parton shower. Simple inspection shows that the sum in the square bracket equals unity, since the second term can be written as This makes the probabilistic properties of the parton shower explicit. At the same time it also shows that this unitarity leads to the cross section in standard parton-shower Monte Carlos to be exactly the respective leading-order cross section. In order to evaluate the formal accuracy of the description of the radiation pattern, induced by the second term in the square bracket -the first term encodes the probability that there is no resolvable emission off the Born-level configuration -a corresponding observable must be introduced. This complicates the discussion somewhat and is therefore postponed to Secs. 2.4 and 2.5.

Correcting parton showers with matrix elements
The aim of this section is to devise a simple method for reinstating O(α s ) accuracy in the emission pattern of the parton shower, i.e. the hardest emission in the parton shower should follow the distribution given by the corresponding real-emission matrix element. Loosely speaking, the key idea is to replace the splitting kernels K with the ratio of real-emission and Born-level matrix elements. Thus, instead of the splitting kernels, this ratio is exponentiated in the Sudakov form factor and employed in simulating the splitting. Comparing Eqs. (2.10) and (2.12), a corresponding factor correcting the latter to the former can be easily identified. Using Eq. (2.13) it reads . (2.20) Employing the parton shower approximation, Eq. (2.11), to replace ρ ij,k yields (2.21) Note that this implies a corrective weight, which is actually splitter-spectator independent.
Correcting the parton shower to the full matrix element can thus be achieved through the following algorithm: 1. Determine an overestimate for Eq. (2.21), i.e. find a set of W ı,fi ({ a}), such that w(r ı,k (f i , Φ R|B ; { a})) ≤ W ı,fi ({ a}) for allk and throughout the real-emission phase space.
2. Replace the parton shower splitting kernels K ij,k by W ı,fi ({ a}) K ij,k .
It is straightforward to show that the constrained no-branching probability of such a matrix-element corrected parton shower reads (2.22) The ratio R/B in Eq. (2.22) coincides with the ratio in the original publications presenting the POWHEG method. In the relatively simple cases treated so far [1,15,16], the various symmetry factors in the equation above cancel and can be neglected. For more complicated flavour structures this factor may differ from one and therefore must be retained.
Employing again the definition of Eq. (2.17), but this time for the Sudakov form factor constructed from the ratio R/B yields the cross section in the matrix element improved parton shower approximation. It reads (2.23) Again, the term in the square bracket equals one and thus reflects the probabilistic nature of this approach. Consequently, in the matrix-element improved parton-shower approximation the total cross section is given by the Born cross section, although the radiation pattern has improved. For a detailed discussion of the real-emission term see Sec. 2.5.

Approximate NLO cross sections
In the previous two sections it has become clear that the total cross section of events simulated in a partonshower Monte-Carlo is determined by the "seed" cross section, typically computed at Born level. While matrix-element improvement of the naive parton shower picture will lead to radiation patterns which are accurate to O(α s ), the total cross section of the event sample and any observable that can be defined at Born level will still be given by the respective leading-order expression. To allow for a simulation with next-to-leading order accuracy, including the cross section of the event sample, a prescription to assign a corresponding weight and multiplicity of the seed event must be found. The solution is to replace the original Born-level matrix element with a modified one [1], denoted byB, such that the "seed" cross section, dσB, integrates to the full NLO result. When constructing such an NLO-weighted differential cross section for the Born configuration, certain approximations must be made, since NLO cross sections have two contributions, one with Born-like kinematics and one with real-emission like kinematics, both of which exhibit divergent structures. The value of a given infrared and collinear safe observable, O, computed at NLO, is given in terms of the Born term B, the real emission term R, and the virtual contribution (including the collinear counter-terms), denoted byṼ, as

(2.25)
It is obvious that the real-emission contribution cannot be simply combined with the Born and virtual terms, as it depends on different kinematics. In the following, the solution of this problem in the framework of the POWHEG method is outlined. In order to compute Eq. (2.25) in a Monte-Carlo approach, subtraction terms, rendering the real emission finite in D = 4 space-time dimensions are introduced. Corresponding integrated subtraction terms regularise the infrared divergences of the virtual terms. In the dipole subtraction method [24], the equation above can then be written as (2.26) Note that each S ij,k defines a separate phase space map and that the observable O in the last term depends on b ij,k ({ p }), rather than { p }, which is a crucial feature of the subtraction procedure. The real and integrated subtraction terms S ij,k ({ a}) and I({ a}) fulfil the relation In the POWHEG method, this term is approximated as (2.30) In the next section it will be shown that the combination of this term,B with a matrix-element reweighted parton shower yields the attempted O(α s ) accuracy, not only of the total cross section, but also of the real-emission contribution.

The POWHEG method and its accuracy
The key point of the POWHEG method is, to supplement Monte Carlo event samples from matrix-element corrected parton showers with a next-to-leading order weight to arrive at full NLO accuracy. This is achieved by combining the two methods discussed in Secs. 2.3 and 2.4. To obtain the O(α s ) approximation to the cross section in the POWHEG method, the parton-shower expression of the real-emission probability is combined with the approximated initial cross section, dσB. This yields the following master formula for the value of an infrared and collinear safe observable, O, Two special cases should now be considered • The infrared limit (t → 0) In this case, only the first term in Eq. (2.33) contributes, as any infrared safe observable maps the real-emission kinematics for collinear (soft) emissions to the kinematics of the (any) underlying Born configuration The contribution to O (POWHEG) from this phase-space region is therefore correct to O(α s ).
• Hard emissions (t → µ 2 ) In this case, a cancellation between the first and the last term in Eq. (2.33) is achieved and only the second term remains. Also, ∆(t, Comparing this result with the real-emission term in Eq. (2.30) reveals that both expressions differ by the factor , which arises solely due to the way the real-emission phase space is populated by the parton shower (cf. Sec. 2.2). Therefore, the contribution to O (POWHEG) from this phase-space region is correct to O(α s ).
In the phase-space regions "between" these limits, the POWHEG method interpolates smoothly between the two above results.

Realisation of the POWHEG method in the SHERPA Monte Carlo
SHERPA is a multi-purpose Monte-Carlo event generator for collider experiments [2]. The goal of this project is a complete simulation of all aspects of the collision. Despite being focused on developments improving the treatment of perturbative QCD, over the past years significant improvements have been achieved regarding the description of soft QCD and QED dynamics, like the process of hadronisation, the decays of the produced hadrons, and the implementation of QED radiation in these decays [36]. One of the traditional key features of the SHERPA program, however, is a consistent merging of multi-jet matrix elements at tree-level with the subsequent parton shower in the spirit of [6]. An improvement of this method and its consistent implementation have been presented in [9] and extended to include hard QED radiation [35]. To this end, SHERPA uses its two internal tree-level matrix element generators AMEGIC++ [37] and COMIX [38], which are capable of calculating cross sections for processes in the Standard Model (AMEGIC++ and COMIX) and beyond (AMEGIC++), involving final states with high multiplicities. Soft and collinear parton radiation is generated in SHERPA by means of a parton shower based on Catani-Seymour dipole factorisation [31]. The program also allows to steer external modules for the computation of virtual corrections using a standardised interface [39]. The corresponding real corrections and subtraction terms in the Catani-Seymour formalism [24] are then provided automatically by AMEGIC++ [40]. SHERPA is therefore perfectly suited to implement the POWHEG method as all prerequisites outlined in Sec. 2 are found within a single, coherent framework. In this section the basic features of the corresponding modules of the generator are reviewed, as far as they are important in the context of this work. A complete overview of SHERPA can be found in [2]. Figure 1: Effective diagram for the splitting of (1) a final-state parton connected to a final-state spectator, (2) a final-state parton connected to an initial-state spectator, (3) an initial-state parton connected to a final-state spectator and (4) an initial-state parton connected to an initial-state spectator in the standard Catani-Seymour notation. The blob denotes the colour correlated leading order matrix element, and the incoming and outgoing lines label the initial-state and final-state partons participating in the splitting.

Matrix elements and subtraction terms
For this study, the matrix-element generator AMEGIC++ [37], is employed. It is based on the construction of Feynman diagrams, which are evaluated using the helicity methods introduced in [41]. For the computation of NLO cross sections in QCD-associated processes, AMEGIC++ provides the fully automated generation of dipole subtraction terms [40], implementing the Catani-Seymour formalism [24]. As outlined in Sec. 2.4, such a subtraction procedure is a necessary ingredient to be able and compute NLO QCD cross sections with Monte Carlo methods.
In the Catani-Seymour method, the soft and collinear singularities of the real emission amplitude squared, R({ a}), are removed by a local subtraction term (cf. Eq. (2.26)) On the first line, the notation of Sec. 2.4 is adopted, while on the second line the definitions of [24] are restored by defining i, j, k > 2, a, b = 1, 2 and requiring all indices to be mutually distinct. Along those lines,D ij,k , D a ij ,D aj k andD aj,b are the four types of Catani-Seymour dipole terms, as depicted in Fig. 1, for final-state splittings with final-state spectators, final-state splittings with initial-state spectators, initial-state splittings with final-state spectators and initial-state splittings with initial-state spectators, respectively. This implies that for final state splittings, i.e. i, j > 2, Due to its analytic integrability over the extra emission phase space dΦ ij,k R|B in D = 4 − 2ǫ dimensions, the subtraction term S, in its integrated form I of Eq. (2.27), as well as the collinear counter-terms can be added back to the virtual contributions to cancel their poles in ǫ, wherein V is the one-loop matrix element convoluted with the Born amplitude and C is the collinear counterterm.
The implementation in SHERPA's matrix element generator AMEGIC++, expanding upon its tree-level capabilities to generate B and R, is able to generate both the subtraction terms S and their integrated counterparts I as well as the collinear counter-term C in an automated fashion. The virtual contributions V, however, are obtained from dedicated external codes interfaced using the Binoth-Les Houches Accord [39]. Having all this at hand, the assembly of theB-function of Eq. (2.30), integrable in D = 4 dimensions, is feasible in an automated way. This involves integrating over the real emission subspace of the phase space of the NLO real correction to the Born process, cf. Eq. (2.30) and adding the result to the terms with Born-level kinematics.
In SHERPA, this integration is performed in a Monte-Carlo fashion, by selecting a single point in the realemission phase space. This technique potentially generates negative weights. In the standard POWHEG method, the emergence of such negative weights is suppressed by either performing the integration analytically or by sampling over sufficiently many real-emission phase space points. Tests to decide which method is better for practical applications are beyond the scope of this publication and will be addressed in a future work; here it should suffice to state that, of course, sampling over more than one phase space point is neither a conceptual problem nor a practical obstacle. Also, for all processes under study in this publication, no problems have been encountered by the possibility of having negative weighted events. Loosely speaking, the problem can in no case be more severe than the possibility of having negative weights in a standard NLO calculation. Therefore, the only remaining issue is to construct an integration method, which, starting from a given Born configuration, is able to fill the real-emission phase space in an efficient manner. Having an implementation of the Catani-Seymour subtraction method at hand, the construction of an integrator for the real-emission subspace based on CS-subtraction terms is rendered a straightforward exercise. The actual integration can be decomposed into three one-dimensional integrals (cf. Eq. (2.14)) [24] dΦ ij,k R|B = with the two integration variables α and z given in Tab. 1, cf. [40]. The azimuthal angle, φ, is common to all configurations. Several different integration channels, each based on a a separate CS dipole, can be combined to yield a multi-channel integrator [42] for the real-emission phase space. The a-priori weights in the multi-channel can be employed to better adapt to the emission pattern of the process under consideration. Additionally, every one-dimensional integrator can be individually improved using the VEGAS algorithm [43].
Type  .7) and (3.15). Note that the definitions for massless partons in [24] are employed.

The parton shower
SHERPA implements a parton shower based on Catani-Seymour (CS) dipole factorisation, which will be denoted by CSS [31]. The model was originally proposed in [44] and worked out and implemented in parallel in [31,32]. It relies on the factorisation of real-emission matrix elements in the CS subtraction framework [24].
Since the original algorithm has been improved in several ways, which have not yet been compiled in a single reference, the parton shower model and its features are briefly reviewed here.

Ordering parameters and splitting functions
Consider the process depicted in Fig. 1, where a parton ı, accompanied by a spectator partonk, splits into partons i and j, with the recoil absorbed by the spectator k. Conveniently the combined momenta are identified as p ij = p i + p j and Q = p ij + p k . A Lorentz-invariant transverse momentum,k 2 T , which acts as ordering parameter in the parton shower algorithm, can now be defined as This relation holds, independent of whether the spectator parton is a final-or initial-state particle for all final-state splittings, while for initial-state splittings a → a j, again independent of the type of spectator, the ordering parameter is given bỹ The precise definition ofỹ and the splitting variables z for the various dipole types are listed in Tab. 2. Sudakov form factors for all branching types, taking into account finite masses of final-state partons and strictly relying on the Lorentz-invariant variables z andk 2 T , have been derived. The corresponding evolution kernels, as defined in Eq. (2.11), read where N spec is the number of spectator partons of the off-shell particle a ı in the large N C limit. The spin-averaged dipole functions V , taken in four dimensions, depend on the type of emitter and spectator parton and are listed in [31]. Their infrared singularities are regularised through the parton shower cutoff, t 0 , typically of the order of 1 GeV 2 . The denominator factorS ij avoids double-counting identical final states in final-state evolution and is given by A potential shortcoming of the original approach in [31] is that certain dipole functions connecting the initial and final state may acquire negative values in some non-singular regions of the phase space. This prohibits their naive interpretation in terms of splitting probabilities and leaves the corresponding parts of the phase space unpopulated. The problem was solved recently by altering the finite parts of the affected splitting functions such that they reproduce corresponding full matrix elements [45].

Splitting kinematics
All branchings in the CSS formalism implement exact four-momentum conservation and the particles are kept on their mass-shell before and after every evolution step. The phase-space maps from n-to n+1particle final states are exact and cover the whole phase space. They are, however, not unambiguous. One of the most prominent criticisms of this particular parton-shower model is the "unphysical" recoil strategy originally employed in configurations with initial-state splitter and final-state spectator. This problem was addressed in [34] for the case of massless partons and has been implemented for prompt photon production and extended to the fully massive case in [35]. The latter publication defined a general algorithm for constructing the emission kinematics, independent of the type of splitter and spectator parton, which is summarised as follows: 1. Determine the new momentum of the spectator parton as [24] 2. Find the light-like helper vectors l and n as where 3. Express the momenta p i and p j in terms of l, n and a transverse component k ⊥ as 14) The parametersz i and k 2 ⊥ of this decomposition are given bȳ where ξ i,jk is defined in Tab. 2. Equations (3.10) and (3.15) are valid for all dipole configurations, i.e. initial and final-state branchings with the recoil partner being either in the initial or in the final state.
To generate an emission using the reweighting technique presented in Sec. 2.3, it is important to be able and access matrix-element information during the parton-shower evolution, such that Eq. (2.21) can be implemented in a process-independent manner. SHERPA provides an interface between its tree-level matrixelement generators and its parton shower, which allows for all the necessary interactions. Together with an implementation of the phase-space maps b ij,k ({ a}), that correspond to the inverse of the above splitting kinematics, Eq. (2.21) can then be realised easily.

Automatic identification of Born zeros
It was noted in [16] that Eq. (2.21) can develop spurious singularities as the matrix element of the underlying Born process may be zero, while the real-emission matrix element is not. Such configurations do not exponentiate, as R is not singular when B → 0. This fact can be employed to formulate a general solution to the problem [16]. One can split R into two parts, a singular one, R (s) , and a regular one, R (r) .
Note that B max can be determined during the integration of the seed cross section, while t max is given as a universal function of the hadronic centre-of-mass energy, depending only on the definition of t in the parton shower model. The resolution factor κ res then determines the relative splitting between R (s) and R (r) : the larger κ res , the larger the fraction R (r) of R.
The necessity of such a splitting of the real emission matrix element can be determined on an event-by-event basis by comparing the correction factor to the parton shower of Eq. (2.20), w ij,k , to a predefined threshold w th ij,k . Thus, regular non-exponentiated R (r) events are only produced if w ij,k > w th ij,k . Such a treatment ensures that both the exponentiation of the real-emission matrix element is as inclusive as possible and the parton shower correction factor does not get too large, rendering event generation too inefficient.

Results
This section collects results obtained with the implementation of the POWHEG algorithm in the SHERPA event generator. We exemplify the performance in a variety of processes which are listed in Sec The annihilation of e + e − into hadrons is studied at LEP Run 1 energies, E CMS = 91.25 GeV. This setup allows to validate the algorithms of Sec. 2 in pure final-state QCD evolution, which is the simple-most testing ground. The parton shower cut-off scale has been set to k 2 T, min = 1.6 GeV 2 . Even though the improvements in this paper are purely related to perturbative physics, the results are presented after hadronisation with the Lund model [3] to make comparison to experimental results more meaningful. The ME+PS samples have been generated with up to one additional jet in the matrix elements and the phase space slicing parameter was set to log(y cut ) = −2.25. For the virtual matrix elements, the code provided by the BlackHat collaboration [25] was used.

Deep-inelastic lepton-nucleon scattering
Hadronic final states in deep-inelastic lepton-nucleon scattering (DIS) are studied at HERA Run 1 energies, E CMS = 300 GeV. Just like e + e − -annihilation into hadrons, this process boasts a wealth of precise experimental data. From the theoretical perspective, it is invaluable, as it allows to test QCD factorisation in an extremely clean environment. The associated scale, given by the virtuality of the exchanged γ * /Z-boson is not fixed, but potentially varies by orders of magnitude, which allows to test perturbative QCD predictions in various kinematic limits. Our results are presented at the parton level only, as hadronisation corrections have little effect on the observables and the focus lies on the potential improvements on the perturbative part of the simulation. The CTEQ6.6 [46] parton distribution functions have been employed and the strong coupling has been defined accordingly as α s (m Z ) = 0.118 with NLO running for both the matrix elements and the parton shower. The remaining settings correspond to those in [45]. ME+PS samples have been generated with up to one additional jet in the matrix element and the phase space slicing parameters were set toQ cut = 5 and S DIS = 0.6 (cf. [45]). Virtual matrix elements were provided by BlackHat [25].

Drell-Yan lepton pair production
We investigate Drell-Yan lepton pair production at Tevatron Run 2 energies, simulating pp collisions at E CMS = 1.96 TeV. The CTEQ6.6 [46] parton distribution functions are employed and the strong coupling is defined accordingly as α s (m Z ) = 0.118 with NLO running for both the matrix elements and the parton shower. A cut on the invariant mass of the lepton pair of 66 < m ℓℓ /GeV < 116 is applied at the matrixelement level. For the ME+PS samples matrix elements with up to one additional jet were generated and a phase-space slicing cut of Q cut = 20 GeV was applied. Virtual matrix elements were provided by BlackHat [25]. The factorisation and renormalisation scales for the NLO matrix element were chosen as µ 2 R = µ 2 F = m 2 ⊥, ℓℓ . In all tree-level matrix elements SHERPA's default scale choice was employed: The matrix element is clustered onto a core 2 → 2 configuration using a k T -type algorithm with recombination into on-shell particles. Scales are defined as the lowest invariant mass or negative virtuality in the core process. Hadronisation and multiple parton interactions have been disabled to allow for a study at the parton shower level. The Z → ℓℓ decay is corrected for QED next-to-leading order and soft-resummation effects in the Yennie-Frautschi-Suura (YFS) approach [36]. The three reactions listed in Sec. 4.1.1-4.1.3 essentially amount to one and the same process at the parton level, as they only differ by crossing of initial and final state legs. Their combination allows to validate the implementation of the matrix-element reweighting in Sec. 2.3 for all possible dipole configurations with quark splitters.

W boson production
Production of W bosons is presented in pp collisions at E CMS = 1.8 TeV. Although in principle similar to the Drell-Yan case, this process is of special interest to validate the automatic decomposition of the real-emission term into singular and non-singular pieces, as outlined in Sec. 3.3. If not stated otherwise, the parameters for this decomposition are set to κ res = 4 and w th ij,k = 100. The CTEQ6.6 [46] parton distribution functions have been employed and the strong coupling has been defined accordingly as α s (m Z ) = 0.118 with NLO running for both the matrix elements and the parton shower. A cut on the invariant mass of the lepton-neutrino pair of m ℓν > 10 GeV was applied at the matrix-element level. For the ME+PS samples matrix elements with up to one additional jet were used and a phase space slicing cut of Q cut = 20 GeV was applied. Virtual matrix elements were provided by BlackHat [25]. The factorisation and renormalisation scales for the NLO matrix element were chosen as µ 2 R = µ 2 F = m 2 ⊥, ℓν . In all tree-level matrix elements SHERPA's default scale choice was employed, cf. Sec. 4.1.3. Hadronisation and multiple parton interactions have been disabled. The W → ℓν decay is corrected for QED next-to-leading order and soft-resummation effects in the YFS approach [36].

Higgs boson production through gluon-gluon fusion
The production of Higgs bosons through gluon-gluon fusion is simulated for proton-proton collisions at E CMS = 14 TeV. The coupling to gluons is mediated by a top-quark loop and modeled through an effective Lagrangian [47]. Again, this process is technically very similar to the Drell-Yan case, but it also allows to validate matrix-element corrections to the remaining initial state splitting functions. Next-to-leading order corrections are rather large at nominal LHC energies, with a ratio of K ≈ 2 between the NLO and the LO result for the total cross section. This fact has spurred tremendous efforts to perform fully differential calculations at NNLO [48] and several predictions have been presented which merged such fixed-order results with resummation at next-to-next-to-leading logarithmic accuracy [49], as the process is expected to have high phenomenological relevance at LHC energies. However, this publication centres on the behaviour of the theory at NLO only, as a prediction beyond this level of accuracy is clearly not within the capabilities of the POWHEG method. The CTEQ6.6 [46] parton distribution functions have been employed and the strong coupling has been defined accordingly as α s (m Z ) = 0.118 with NLO running for both the matrix elements and the parton shower. A cut for the invariant mass of the τ pair of 115 < m τ τ /GeV < 125 was applied at the matrix-element level. For the ME+PS merged samples matrix elements with up to one additional jet were used and a phase-space slicing cut of Q cut = 20 GeV was applied. The virtual matrix elements have been implemented according to [18]. The factorisation and renormalisation scales for the NLO matrix element were chosen as µ 2  Table 3: Cross sections in pb for e + -e − annihilation into hadrons at LEP and deep-inelastic positron-proton scattering at HERA as calculated in the POWHEG framework and in a conventional fixed order NLO calculation [40].
h → τ τ decay is corrected for QED soft-resummation and approximate next-to-leading order effects in the YFS approach [36].

Z-pair production
The production of pairs of Z bosons is studied for proton-proton collisions at E CMS = 14 TeV. This is an important background for the golden-plated Higgs boson discovery mode at the LHC. Detailed studies of the decay properties of the Z bosons and their correlations are known to allow for a determination of some properties of the Higgs boson, when found. Among these correlations are, e.g. the relative orientation of the decay planes of the bosons. The CTEQ6.6 [46] parton distribution functions have been employed and the strong coupling has been defined accordingly as α s (m Z ) = 0.118 with NLO running for both the matrix elements and the parton shower. A cut on the invariant mass of each lepton pair of 66 < m ℓℓ /GeV < 116 was applied at the matrix-element level. For the ME+PS samples matrix elements with up to one additional jet were used and a phase-space slicing cut of Q cut = 20 GeV was applied. Virtual matrix elements were provided by MCFM [26,50]. The factorisation and renormalisation scales were chosen as µ 2 R = µ 2 F = m 2 ZZ . Hadronisation and multiple parton interactions have been disabled to allow for a study at the parton shower level. Each Z → ℓℓ decay is corrected for QED next-to-leading order and soft-resummation effects in the YFS approach [36].

W + W − -production
W + W − -production is also studied for proton-proton collisions at E CMS = 14 TeV. It is worth noting that this process hitherto has not been treated in the POWHEG approach. Similar to the Z-pair production, it is an important background to the search channel for the Standard Model Higgs boson, at masses around and above 130 GeV. Again, in order to suppress this background, distributions which depend on correlations of decay products of the W 's in phase space are heavily employed. In the simulation here, again the CTEQ6.6 [46] parton distribution functions have been employed and the strong coupling has been defined accordingly as α s (m Z ) = 0.118 with NLO running for both the matrix elements and the parton shower. A cut on the invariant mass of each lepton-neutrino pair of m ℓν > 10 GeV was applied at the matrix-element level. For the ME+PS samples matrix elements with up to one additional jet were used and a phase-space slicing cut of Q cut = 20 GeV was applied. Virtual matrix elements were provided by MCFM [26,50]. The factorisation and renormalisation scales were chosen as µ 2 R = µ 2 F = m 2 W W . Hadronisation and multiple parton interactions have been disabled to allow for a study at the parton shower level. Each W → ℓν decay is corrected for QED next-to-leading order and soft-resummation effects in the YFS approach [36].

Tests of internal consistency
The aim of this section is to provide consistency checks on the different aspects of the POWHEG implementation in SHERPA. At first, total cross sections as obtained from POWHEG are compared with the corresponding results from a standard NLO calculation. In this case, the public release SHERPA-1.2.2 5 serves as the reference, which includes an implementation of [40]. Results for e + -e − annihilation into hadrons and deep-inelastic positron-proton scattering are presented in Tab. 3. Numbers for inclusive Z-boson production with decay to an electron-positron pair, for inclusive W -boson production with decay to an electron-neutrino pair, and for Higgs-boson production via a top-quark loop with decay into τ are listed in Tab. 4. The agreement between the POWHEG results and those of the standard integration method typically is within a 1σ range as given by the respective Monte-Carlo errors. To examine differences between POWHEG and a parton-shower Monte Carlo regarding the exponentiation of the real-emission matrix elements in POWHEG, R can be approximated by R (PS) in Eq. (2.21). Performing this replacement does not only constitute a mandatory cross-check, whether the parton-shower approximation is retained, but it also estimates the size of corrections that can be expected at all when switching to NLO accuracy in the event simulation. Apart from the overall normalisation, in processes with no additional phase space dependence introduced by the loop matrix element, the emission pattern in POWHEG should be identical to the parton-shower result. This is verified in inclusive Z-boson production at Tevatron energies as displayed in Fig. 2. For low transverse momentum (low jet resolution) p ⊥ ≪ µ F both distributions coincide within statistical errors. For large values the emission phase space is severely restricted in the parton-shower approach, as t < µ 2 F ≈ m 2 Z and p ⊥ t. Any contribution to this phase-space region must therefore originate from configurations where more than one hard parton recoils against the lepton pair. Such configurations are suppressed by higher orders of α s , and therefore the emission rate is gravely underestimated by the parton shower. As a direct consequence, all deviations are then manifestations of the exponentiation of non-logarithmic terms, which can be sizeable in the hard wide-angle emission region. The automatic splitting of the real-emission matrix element into singular and regular contributions as presented in Sec. 3.3 contains two unphysical parameters: κ res , which governs the relative sizes of the exponentiated, singular part R (s) and the non-exponentiated, regular part R (r) , and w th ij,k , which determines when the above separation is actually employed. The effect of κ res on the central parton shower reweighting factor w ij,k is detailed Fig. 3. There, it can be seen that for values of κ res chosen neither too low, such that the maximum of the reweighting factor rises beyond reasonable bounds rendering the reweighting of the parton shower inoperable, nor too high, such that parts of leading logarithmic structure of R are not exponentiated, event generation with the accuracy aimed at by the POWHEG algorithm is feasible. Hence, the results of the Monte-Carlo simulation should be fairly independent of κ res and w th ij,k , if varied within a reasonable range. Figure 4 displays predictions for transverse momentum spectra in W -boson production for several values of κ res . As expected, no significant variations of the emission pattern can be observed. The small differences

Comparison with tree-level matrix-element parton-shower merging
By comparing POWHEG results to a standard parton shower combined with LO matrix elements (LO+PS), it can be established whether observables are produced correctly in regions where the soft/collinear approximations in the parton shower are equivalent to the R/B ratios in POWHEG. An example is the distribution of the jet resolution scale d 01 , using the longitudinally invariant k T -algorithm in W/Z + jets production. This observable amounts to the k T -scale where a 1-jet event is clustered into a 0-jet event. Figure 5 shows that there is good agreement between the LO+PS and POWHEG results for d 01 < 50 GeV. For harder emissions the LO+PS approach fails due to the restricted phase space, as explained in the previous section.
In this publication, we regard the POWHEG method as an advanced matrix-element reweighting technique for the parton-shower algorithm; the reweighting is supplemented with local K-factors to implement full NLO corrections. It is therefore useful to compare the respective results with matrix-element parton-shower merged samples (ME+PS), which are rescaled by a suitably chosen global K-factor. Such samples are known to yield approximate NLO radiation patterns by effectively implementing higher-order matrix-element corrections into the parton shower. An implementation of one of the most advanced ME+PS algorithms to date is available within the SHERPA framework [9] such that a direct comparison with POWHEG is a straightforward exercise. However, because of the lack of virtual contributions in the LO+PS and ME+PS samples, an agreement on the total rate cannot be expected. Thus, in the comparisons below the following global K-factors were employed: • K = 1.038 for e + e − → hadrons at LEP energies, • K = 1.2 for Z/γ * and W production at Tevatron energies, • K = 1.2 for ZZ production at the LHC (14 TeV), • K = 1.34 for W + W − production at the LHC (14 TeV), and • K = 2.1 for Higgs production through gluon fusion at the same LHC energies.
When comparing POWHEG results to ME+PS results including matrix elements up to the 1-jet final state one should obtain a very similar radiation pattern. The observed agreement indeed is very good, as expected. Figure 5 shows that, for example, the differential one-jet rates in W/Z-boson production agrees on the 20% level, even for relatively large scales (d 01 > 50 GeV). The remaining differences can be attributed to the differences in the Sudakov form factors: While POWHEG exponentiates R/B, the ME+PS method uses standard Sudakov form factors at the logarithmic accuracy of the parton shower. Such differences become visible also in the multiplicity distribution of k T jets with p ⊥ > 20 GeV in Drell-Yan and W production, cf. Fig. 6. The 0-jet and 1-jet rates agree within 10% between POWHEG and ME+PS, but for higher multiplicity final states the POWHEG method predicts significantly more jets. Here a ME+PS simulation with more jets in the matrix element would lead to better agreement. Now focussing on the properties of the leading jet produced in association with a W or Z boson, the transverse momentum of the leading jet is shown in Fig. 8. Here the LO+PS approach fails to describe the hard tail of the distribution, again doe to lacking phase space, while the POWHEG and ME+PS approaches agree within 20%. The separation in η-φ space between this jet and the W/Z boson is displayed in Fig. 7. Clear differences in the shape of the distribution comparing the LO+PS approach with both POWHEG and ME+PS are found, as expected, since the other hand, parton showers cover only a restricted area of the phase space, and, in addition, they do not encode the full final-state correlations described by the matrix elements. Results from the POWHEG and ME+PS methods agree very well, with differences below 10% only. A similar finding applies to the transverse momentum of the leading jet, which is shown in Fig. 8. Here the LO+PS approach fails to describe the hard tail of the distribution, while the POWHEG and ME+PS approaches agree within 20%. The transverse momentum of the Higgs boson and the transverse momentum of the leading jet displayed in Fig. 9 give a similar picture as in vector boson production: All three methods agree very well for low transverse momenta. In the high p ⊥ region the POWHEG and ME+PS approaches agree within 15%. Figure 10 shows that minor differences arise between the LO+PS and the POWHEG and ME+PS approaches in the pseudorapidity spectrum of the leading jet. This can be understood as a direct consequence of the different transverse momentum distributions in the LO+PS method, as harder jets tend to be more central than softer ones. The POWHEG and ME+PS approaches agree well in the central rapidity region and show up to 10% difference only in the forward region. The distribution of η-φ separation between the two leading jets proves again that the POWHEG and ME+PS predictions are very similar, with deviations below the 5% level. Again, the LO+PS prediction shows a slightly different behaviour, because of the reasons stated above. Now we turn to look at diboson production at nominal LHC energies of 14 TeV. Figure 11 (left) shows a comparison of the scalar sum H T of the transverse momenta of jets and leptons in Z-pair production.
Deviations of up to 50% become visible between the three compared approaches. This is especially true in the high-H T region. It is well understood that the predictions of the LO+PS approach are softer than either of the two other approaches, due to the restricted emission phase space. The relatively large differences between the ME+PS approach and POWHEG are naively not expected, but might stem from using consistent but somewhat oversimplified scale schemes. This surely should be analysed in more detail, in a forthcoming publication, where pair production processes, including W H and ZH would be studied. The transverse momentum distributions of the individual Z bosons (Figure 11 right) on the other hand agree very well in both approaches, while it is again obvious that the LO+PS sample cannot describe the hard region of this spectrum.
In the azimuthal separation of the two Z bosons, see Fig. 12, a similar feature as in H T can be found: The events are harder in ME+PS than in POWHEG, leading to increased decorrelation of the boson pair. In Figure 12 (right) it can be seen that the angle between the boson decay planes is predicted very consistently in all three approaches. Properties of the leading jet in W + W − pair production at LHC energies are displayed in Figure 13. On the left hand side one can see the transverse momentum of the leading jet and on the right hand side the separation between lepton and leading jet. For both the ME+PS and POWHEG approaches agree well within 20% and the LO+PS sample severely underestimates the hardness of the first jet due to its phase-space restrictions. Figure 14 displays observables related to the two oppositely charged leptons from the two decays. The pseudorapidity difference (left) agrees within 20% for all three approaches, while their azimuthal decorrelation is significantly lower in the LO+PS sample than in the ME+PS and POWHEG approaches, which agree very well. Transverse momentum of leading jet dσ/dp ⊥ (jet 1) [pb/GeV] Transverse momentum of leading jet dσ/dp ⊥ (jet 1) [pb/GeV]  Transverse momentum of leading jet dσ/dp ⊥ (jet 1) [pb/GeV]

Comparison with experimental data
The remainder of this section is dedicated to a comparison of results from the POWHEG approach with experimental data to assert the improved description of data, provided by this method. For the reaction e + e − → hadrons at LEP1 energies the LO+PS and ME+PS predictions do not show significant differences except in extreme regions of phase space. The POWHEG prediction confirms that picture. This is largely due to the fact that the parton-shower algorithm, which is employed in SHERPA is based on Catani-Seymour subtraction terms and those terms constitute a rather good approximation to the real-emission matrix element in the process e + e − → qqg.
In the distribution of the Durham jet resolution at which 3-jet events are clustered into 2-jet events (Fig. 15 left) all three approaches agree very well with the measurement over large parts of the phase space. Only in the hard emission region y 23 > 0.05 deviations from the LO+PS result can be seen. It is encouraging, although not surprising to note that both POWHEG and ME+PS describe the data better. Good agreement of all three approaches with each other and with the measurement is also observed e.g. for the thrust distribution ( Fig. 15 right), the total jet broadening (Fig. 16 left) and the C-Parameter (Fig. 16 right).
As was discussed in [45], deep-inelastic scattering processes offer the opportunity to test perturbative QCD in a region where the factorisation scale of the inclusive process, Q 2 , is smaller than the scale of the actual event, which might be set e.g. by the transverse momentum of a hard jet. As measurements can be performed down to very low values of Q 2 , many hard jets must usually be included in the simulation to achieve a good description of data throughout the phase space. This method cannot be used in the context of this work, as the POWHEG implementation in SHERPA can so far only be employed for the core process e ± q → e ± q. Therefore, results are presented for the high-Q 2 region only and the discussion of the low-Q 2 domain is postponed to a forthcoming publication [23], where the merging of POWHEG samples with higher-multiplicity matrix elements will be discussed. Figure 17 shows reasonable agreement between the POWHEG results and experimental data in a measurement of the di-jet cross section performed at the H1 experiment [51,52]. Deviations from the LO+PS result are apparent, especially at lower values of Q 2 , as the phase space of the parton shower is severely restricted by the low factorisation scale. Similar findings apply to the rapidity spectra shown in Fig. 18. The probably most discussed observable probing the influence of QCD radiation in hadron-hadron collisions is the transverse momentum of the lepton pair in Drell-Yan production, which is displayed in Fig. 19. Very good agreement, compared to a recent measurement, is found for both the POWHEG and ME+PS approaches, while the LO+PS method is not able to describe large parts of the spectrum because of the restricted realemission phase space. The rapidity of the Z boson in Fig. 19 is described very well by all three approaches. The situation is very similar in W -boson production. A comparison of POWHEG predictions with Tevatron data from the DØ experiment [53] is shown in Fig. 20, where very good agreement between the Monte-Carlo result and the data can be observed. In addition to the direct comparison the uncertainties related to a variation of the renormalisation and factorisation scales are also shown. Thereby, two different strategies are pursued: While the inner (dark) band shows the uncertainty related to a variation of the scale in the hard matrix elements only, the outer (light) band shows the influence of varying the scales also in the partonshower evolution. It is rather obvious that the latter approach yields the larger variations, as it is associated with an uncertainty in the choice of the strong coupling for the real-emission subprocess. While this process essentially determines the shape of the transverse momentum distribution in Fig. 20, it enters at tree-level accuracy only, thus leading to a rather large scale dependence.  Figure 14: Pseudorapidity difference (left) and azimuthal angle (right) between the two oppositely charged leptons in W + W − production at nominal LHC energies.   Figure 17: The di-jet cross section as a function of Q 2 in bins of E T,1 + E T,2 (left), the three-jet cross section as a function of Q 2 (right top) and the ratio of the three-over the two-jet rate as a function of Q 2 (right bottom) compared to data from the H1 experiment [51,52].  Figure 18: The di-jet cross section as a function of η ′ , compared to data from the H1 experiment [51]. η ′ denotes half the rapidity difference of the two leading jets in the Breit frame. MC/data Figure 19: Transverse momentum and rapidity of the Z boson in Drell-Yan lepton-pair production at the Tevatron compared to data from the DØ experiment [55]. MC/data Figure 20: Transverse momentum of the W boson in W +jets production at the Tevatron compared to data from the DØ experiment [53]. Scale variations of the POWHEG prediction by factors of 1/2 and 2 are displayed for two different scale schemes, µ F = µ R = m lν (left) and µ F = µ R = m ⊥,lν (right). The inner (dark) band displays the variations associated with redefining the scales for matrix elements alone, while the outer (light) band also takes variations in the running coupling of the parton shower evolution into account.

Conclusions and outlook
In this publication the successful implementation of the POWHEG algorithm into the SHERPA framework was reported. The program is fully automated, relying on SHERPA's efficient matrix-element generation modules, which allow to construct real correction terms for given processes and their Catani-Seymour dipole subtraction kernels in both differential and integrated form. It is worth stressing that this is the first time that the POWHEG method has been applied simultaneously to various higher-order calculations using Catani-Seymour dipole terms for partitioning the real-emission phase space. This implementation makes a number of processes, computed at NLO and available in public program libraries accessible for matching with a parton shower. Additional processes are easily added in SHERPA, by merely linking the corresponding code for the virtual correction terms 6 . The implementation was validated by a number of systematic checks, including • the stability of cross sections, as exhibited in Tables 3 and 4; • the radiation patterns, through comparison with a fake POWHEG algorithm, based on shower kernels, c.f. Figure 2; • the automated detection of Born zeroes and their stable cure, as shown in Figure 3 and 4; • merged LO samples, see  • and comparison with a variety of data, in  It also included, for the first time, the treatment of W -pair production; results for this, along with some plots for Z-pair production are displayed in Figures 11-14.
In the near future, more processes with one coloured line only, such as W H and ZH associated production, will be added to the SHERPA framework and in a future publication we will also discuss the slightly more subtle treatment of processes with more than one coloured line. Furthermore, the methods developed in this publication have been used to generate merged samples with an inclusive cross section at NLO accuracy, according to the method already presented in [22]; we will report on our parallel development in a second publication [23]. This set of NLO-implementations will be made available in the future, in a new and extended release of SHERPA.