Bounding the Higgs width at the LHC using full analytic results for gg ->2e 2\mu

We revisit the hadronic production of the four-lepton final state, e^- e^+ \mu^- \mu^+, through the fusion of initial state gluons. This process is mediated by loops of quarks and we provide first full analytic results for helicity amplitudes that account for both the effects of the quark mass in the loop and off-shell vector bosons. The analytic results have been implemented in the Monte Carlo program MCFM and are both fast, and numerically stable in the region of low Z transverse momentum. We use our results to study the interference between Higgs-mediated and continuum production of four-lepton final states, which is necessary in order to obtain accurate theoretical predictions outside the Higgs resonance region. We have confirmed and extended a recent analysis of Caola and Melnikov that proposes to use a measurement of the off-shell region to constrain the total width of the Higgs boson. Using a simple cut-and-count method, existing LHC data should bound the width at the level of 25-45 times the Standard Model expectation. We investigate the power of using a matrix element method to construct a kinematic discriminant to sharpen the constraint. In our analysis the bound on the Higgs width is improved by a factor of about 1.6 using a simple cut on the MEM discriminant, compared to an invariant mass cut m_{4l}>300 GeV.


INTRODUCTION
The discovery of a boson consistent with the Standard Model Higgs [1,2] has set a large part of the agenda for the LHC physics program over the next couple of decades. The data collected in Run 1 has provided first information about the new particle. The mass of the new boson has been measured to be near 126 GeV [3,4] and the $0^+$ spin-parity state is strongly favoured [4,5]. Finally, the total rates of production and decay of the boson are broadly compatible with the predictions of the Standard Model [6,7].
Turning the observed cross sections into statements regarding the coupling of the Higgs boson to Standard Model particles is a non-trivial, but desirable goal. A typical measurement of a Higgs process at the LHC focuses on events which lie in the Higgs resonance region, where the cross section depends on the initial and final state Higgs couplings, $g_i$, $g_f$, and on the total width $\Gamma_H$ as follows,

$$\sigma_{i \to H \to f} \sim \frac{g_i^2 g_f^2}{\Gamma_H} . \qquad (1)$$

Therefore in order to measure the Higgs couplings $g_{i,f}$ one must either first measure the width, or measure the couplings under the assumption of a known total width. Clearly, the cross section in the narrow width approximation is invariant under the rescaling $g_x \to \xi g_x$, $\Gamma_H \to \xi^4 \Gamma_H$. Information on the couplings alone can only be obtained by either constraining the width directly, or by using ratios of cross sections to eliminate the dependence on the total width. Direct measurement of the Higgs width in a hadronic environment is curtailed for widths smaller than the detector resolution (typically around 1 GeV). Lepton colliders offer more promising prospects, although an $e^+e^-$ machine will only be able to constrain the total width by measuring the invisible branching fraction (in ZH production). Muon colliders offer the possibility to measure the width directly, by performing a threshold scan around the Higgs mass.
In an interesting recent paper, Caola and Melnikov [8] proposed to constrain the total width using the number of ZZ events away from the Higgs resonance region. This method exploits the fact that at least 15% of the Higgs cross section with the Higgs boson decaying to four charged leptons comes from the off-peak region corresponding to a four-lepton invariant mass above 130 GeV [9]. In the phase space region away from the Higgs resonance Eq. (1) is no longer valid, since the Higgs propagator is dominated by the $(s - m_H^2)$ term for large $s$ and the cross section is essentially independent of the width. Therefore if one performs the same rescaling $g_x \to \xi g_x$, $\Gamma_H \to \xi^4 \Gamma_H$ the compensation which occurs in the resonance region no longer exists. The off-shell cross section thus depends on $\xi$ and therefore by measuring the total number of off-shell Higgs events one can place a limit on the total width. The method proposed in ref. [8] using Run 1 data suggests current constraints on the total width corresponding to $\Gamma_H \lesssim (20-38)\,\Gamma_H^{SM}$, with a potential limit of around $\Gamma_H \lesssim (5-10)\,\Gamma_H^{SM}$ obtainable with larger LHC data sets and sufficient control of experimental and theoretical systematic uncertainties.
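The scaling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the proportionalities quoted in the text (on-peak rate proportional to $g_i^2 g_f^2/\Gamma_H$, far off-peak Higgs-squared rate proportional to $g_i^2 g_f^2$); the function names and the unit normalization are illustrative and not taken from the paper.

```python
def on_peak_rate(g_i, g_f, width):
    """Narrow-width scaling of Eq. (1): sigma ~ g_i^2 g_f^2 / Gamma_H (arbitrary units)."""
    return g_i**2 * g_f**2 / width


def off_peak_rate(g_i, g_f):
    """Far off-peak Higgs-squared scaling: sigma ~ g_i^2 g_f^2, independent of the width."""
    return g_i**2 * g_f**2


def rescale(g_i, g_f, width, xi):
    """Apply g_x -> xi * g_x together with Gamma_H -> xi^4 * Gamma_H."""
    return xi * g_i, xi * g_f, xi**4 * width


# Illustrative (hypothetical) starting point in units where everything is 1.
g_i, g_f, width = 1.0, 1.0, 1.0
xi = 2.0
g_i2, g_f2, width2 = rescale(g_i, g_f, width, xi)

on_ratio = on_peak_rate(g_i2, g_f2, width2) / on_peak_rate(g_i, g_f, width)
off_ratio = off_peak_rate(g_i2, g_f2) / off_peak_rate(g_i, g_f)

print(on_ratio)   # 1.0: the peak rate cannot see the rescaling
print(off_ratio)  # 16.0: the off-peak rate grows like xi^4 = Gamma_H / Gamma_H^SM
```

The interference with the continuum is not included in this sketch; it is linear in the Higgs amplitude and so would scale like $\xi^2$.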
In the approach of Caola and Melnikov it is imperative to obtain a precise prediction for the off-peak cross section. For large $\hat{s}$ the effective field theory in which the top quark is integrated out is no longer valid. In addition, a significant contribution in this region comes from the effect of interference between amplitudes representing the Higgs-related diagrams and those representing the continuum background. Example Feynman diagrams that enter the calculation of these two amplitudes are shown in Fig. 1(a) and (b). In the Standard Model the impact of this interference is significant, with the result that the effect of including the Higgs boson diagrams is to reduce rather than increase the number of off-shell events expected. Indeed, in the SM the total number of off-shell Higgs-mediated events is negative, as the Higgs unitarizes the continuum $gg \to ZZ$ cross section. A similar interference effect exists in the $H \to \gamma\gamma$ channel [10-12] and another recent proposal [13] exploits this to similarly constrain the total width. This latter method requires a precise measurement of the shift in the mass (when compared to the results in other channels such as ZZ) caused by the interference, to constrain the couplings of Higgs to photons and gluons. This can then be used to constrain the total width given the form of the total cross section formula. An alternative to these strategies is to combine experimental results across all Higgs boson production and decay channels and apply extra constraints on individual Higgs boson couplings based on theoretical arguments [14]. The method of ref. [14] currently provides rather stringent limits on the Higgs boson width, $\Gamma_H \lesssim (3-4)\,\Gamma_H^{SM}$, albeit with the caveat of mild theoretical assumptions.
In this paper we shall consider the hadronic production of four charged leptons in the final state. As we have already discussed, this proceeds both by the standard electroweak production$^1$, Eq. (2), and by the mediation of a Higgs boson produced in the s-channel, Eq. (3). The underlying parton processes for the hadronic reactions in Eqs. (2) and (3) are shown in Table I, (a)-(c), with representative Feynman diagrams depicted in Fig. 1.

Process                                                                                                Order
(a): $g(-p_1) + g(-p_2) \to H \to e^-(p_3) + e^+(p_4) + \mu^-(p_5) + \mu^+(p_6)$                      $O(g_s^2 e^4)$
(b): $g(-p_1) + g(-p_2) \to e^-(p_3) + e^+(p_4) + \mu^-(p_5) + \mu^+(p_6)$                            $O(g_s^2 e^4)$
(c): $q(-p_1) + \bar{q}(-p_2) \to e^-(p_3) + e^+(p_4) + \mu^-(p_5) + \mu^+(p_6)$                      $O(e^4)$
(d): $q(-p_1) + g(-p_2) \to H \to e^-(p_3) + e^+(p_4) + \mu^-(p_5) + \mu^+(p_6) + q(p_7)$             $O(g_s^3 e^4)$
(e): $q(-p_1) + g(-p_2) \to e^-(p_3) + e^+(p_4) + \mu^-(p_5) + \mu^+(p_6) + q(p_7)$                   $O(g_s e^4)$
TABLE I: Partonic processes which contribute to the four charged-lepton final state. The second column shows the order of the strong coupling, $g_s$, and the electromagnetic coupling, $e$, in which the partonic process first contributes. For the purposes of this counting we do not distinguish between the weak coupling $g_W$, the electromagnetic coupling $e$, and the Yukawa coupling $g_W m_t/(2 M_W)$. In the cases where the initial and final states are the same, interference needs to be taken into account.

We shall refer to the amplitude for the Higgs production process (a) in Table I as $M_H$ and to the continuum amplitude (b) as $M_C$.
The dominant continuum contribution is represented by the quark-initiated continuum reaction (c).
One of the aims of this paper is to compute the complete set of 1-loop amplitudes for process (b), $gg \to ZZ$, using the spinor-helicity formalism, to provide analytic formulae for helicity amplitudes including massive quarks in the loop. The amplitudes can then be included together with the Higgs-mediated diagrams in order to provide a prediction for the number of off-shell Higgs events including all interference effects. The analytic results that are presented here will have a significant advantage in calculation speed over more numerical methods. In addition, it is known that the amplitudes $M_C$, when expressed in terms of scalar integrals, can develop numerical instabilities when the transverse momentum of the produced vector bosons tends to zero. These are apparent singularities that cancel when relations between the scalar integrals in the singular region are taken into account. In calculations based on the Passarino-Veltman formalism [16] such apparent singularities appear as inverse powers of the determinant, $\Delta_4$, of the Gram matrix, $G_{ij} = p_i \cdot p_j$. In particular, in our case the Gram determinant vanishes as $p_T \to 0$, where $p_T$ is the transverse momentum of the vector boson with momentum $p_{34}$ and $p_1$ and $p_2$ are the momenta of the incoming partons. These delicate numerical points are particularly trying in this case because cuts on the transverse momenta of the final state leptons do not exclude the region where the $p_T$ of the vector boson is equal to zero. Moreover, simply excising these regions can compromise the accuracy of the theoretical prediction. For example, imposing a $p_T$ cut, $p_T > 7$ GeV, on the transverse momentum of the vector boson produced by collisions at $\sqrt{s} = 7$ TeV, would exclude 8% of the gg-initiated cross section. Since in a spinor helicity treatment the apparent singularities appear as $\langle p_1|(p_3 + p_4)|p_2]$, which are proportional to the square root of the Gram determinant, the severity of the numerical problems is reduced$^2$.
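The link between the Gram determinant and the boson transverse momentum can be illustrated numerically. The sketch below builds the $3 \times 3$ Gram matrix $G_{ij} = p_i \cdot p_j$ from $p_1$, $p_2$ and $p_{34}$ for massless partons along the beam axis and compares its determinant with $s^2 p_T^2/4$, which is what a short calculation gives for this configuration. The normalization, the function names and the sample kinematics are assumptions made for the illustration and need not match the convention used for $\Delta_4$ in the paper.

```python
import math


def mdot(p, q):
    """Minkowski product with metric (+,-,-,-); vectors are (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def gram_det3(p1, p2, p3):
    """Determinant of the 3x3 Gram matrix G_ij = p_i . p_j."""
    vecs = (p1, p2, p3)
    g = [[mdot(a, b) for b in vecs] for a in vecs]
    return (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
            - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
            + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))


def gram_vs_pt(pt, e1=400.0, e2=300.0, m34=91.0, pz=50.0):
    """Return (Gram determinant, s^2 pT^2 / 4) for hypothetical sample kinematics in GeV."""
    p1 = (e1, 0.0, 0.0, e1)      # massless parton along +z
    p2 = (e2, 0.0, 0.0, -e2)     # massless parton along -z
    energy = math.sqrt(m34**2 + pt**2 + pz**2)
    p34 = (energy, pt, 0.0, pz)  # vector boson with transverse momentum pt
    s = 2.0 * mdot(p1, p2)
    return gram_det3(p1, p2, p34), 0.25 * s**2 * pt**2


for pt in (7.0, 1.0, 0.1):
    det, expected = gram_vs_pt(pt)
    print(pt, det, expected)
```

The determinant is obtained here as a small difference of large products, which is a simple reminder of why inverse powers of it are numerically delicate at low $p_T$.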
Moreover the existence of a compact analytic answer allows us to rearrange the calculation to mitigate potential numerical problems at small $p_T$.
Returning to Table I, we will now discuss the role of processes (d) and (e). These contributions naturally arise as part of the NLO corrections to reactions (a) and (c) respectively. For our purposes it is not this aspect that is most relevant. Instead, from the order of the couplings presented in Table I, it is clear that a consistent treatment of Higgs-related 4-lepton production at order $g_s^4 e^8$, i.e. consideration of $|M_H + M_C|^2$, should also include the interference between processes (d) and (e). A useful way of visualizing the relevant interference contributions is shown in Fig. 2 where the different contributions are represented by various cuts of a single master topology. In addition to the two interference contributions highlighted here, in principle a further cut may be performed that leads to contributions from the interference of tree-level $q\bar{q} \to ZZ$ and 2-loop $q\bar{q} \to H \to ZZ$ amplitudes. However, this vanishes for massless fermions since by helicity conservation the $H \to q^+ \bar{q}^-$ amplitude vanishes. As a result there can only be contributions through bottom quarks, i.e. $q = b$. We do not consider such a contribution in this paper since it is heavily suppressed by the initial state b-quark parton distribution functions (PDFs)$^3$. However, in order to quantify the number of off-shell events expected in the four-lepton channel, we assess the impact of the $qg$- and $\bar{q}g$-initiated interferences originating from amplitudes (d) and (e). These terms contain a final state parton, which may or may not be resolved as a jet, but the contribution is finite.
Given the importance of a measurement of the Higgs boson width, it is natural to consider methods that could improve the limits that were suggested in Ref. [8]. One possible strategy is to use event-by-event discriminants to separate signal and background events on a probabilistic basis. This type of matrix element method (MEM) has already been successfully applied in the on-shell region [2,17,18]. In this paper we will investigate the potential of a recent MEM formulation [19] to identify off-shell Higgs events and therefore provide more stringent constraints on the total width of the Higgs boson. This paper proceeds as follows. In section 2 we collect the needed Higgs amplitudes for the interference studies. In section 3 we discuss the calculation of the continuum amplitudes, including an outline of the result for the calculation of the gg → 4ℓ continuum amplitude including loops of massive fermions. Full details of the result for this one-loop calculation are given in Appendices B and C. In section 4 we present a phenomenological study of the 4-lepton final state, including the effect of all interferences considered here, and consider the impact on Higgs width studies. We investigate the potential improvements on these constraints using the matrix element method in section 5. Finally in section 6 we draw our conclusions.

GLUE-GLUE INITIATED AND QUARK-GLUON INITIATED HIGGS AMPLITUDES
In this section we describe the amplitudes appearing in Table I which contain a Higgs boson, namely amplitudes (a) and (d). Although the production of a Higgs boson through gluon fusion via a heavy fermion loop and its subsequent decay to four charged leptons is well known [20], for completeness and to introduce our notation, we reproduce the results in this section. The amplitudes for continuum processes that do not involve a Higgs boson propagator will be presented in Section 3.
We begin by re-deriving the well-known gg initiated amplitudes. We first extract color, couplings and phases to define our reduced amplitude. Since the Higgs boson is a propagating s-channel scalar we can further divide this amplitude into component pieces: $A^{gg \to H}(1_g^{h_1}, 2_g^{h_2})$ represents the Higgs production through gluon fusion, and $A^{H \to 4l}(3_e^{h_3}, 4_e^{h_4}, 5_\mu^{h_5}, 6_\mu^{h_6})$ represents the decay of the Higgs into four leptons. The amplitudes are sewn together using the propagator function $P_H(s)$ of Eq. (7). As is well-known, for a spin zero Higgs boson there are only two non-zero helicity amplitudes, namely those in which the two gluons have the same helicity. In these instances the amplitude is written in terms of the function $C_0$, the scalar triangle integral. The exact definition is given in Appendix A and $m$ represents the mass of the fermion in the loop. Sizeable contributions result only from the cases $m = m_t$ or $m = m_b$.
The reduced amplitudes for decay of the Higgs boson into four leptons are defined in Eq. (9), where the couplings of the Z boson to the charged lepton line are $l_e$, given in Eq. (10), and

$$r_e = \frac{2 \sin^2\theta_W}{\sin(2\theta_W)} , \qquad (11)$$

and $\theta_W$ is the Weinberg angle. With our conventions we recover the full amplitudes for Higgs decay by multiplying the expression in Eq. (9) by $-2ie^3$. In writing these equations we have introduced additional notation: the amplitudes are expressed in terms of spinor products, and we further define spinor sandwiches for massless momenta $j$ and $k$.
For the studies of the qg-initiated interference we define the reduced amplitude for the crossed process $0 \to q\bar{q}e\bar{e}\mu\bar{\mu}g$ in Eq. (15), together with the corresponding reduced Higgs production amplitude, using the color normalization $\mathrm{Tr}(t^{C_1} t^{C_2}) = \frac{1}{2}\delta^{C_1 C_2}$. Since the amplitude factors onto the s-channel propagator in exactly the same manner as in the previous sub-section, the amplitudes for the decay of the Higgs, $A^{H \to 4l}$, can be re-cycled from Eq. (9).
We therefore only require the amplitudes for production of a Higgs and $q\bar{q}g$ via a heavy fermion loop, of which there are two. The scalar integrals $B_0$ and $C_0$ that appear in them are defined in Appendix A, and as before $m$ is the mass of the fermion circulating in the loop.

CALCULATION OF THE NON-HIGGS BOSON MEDIATED AMPLITUDES
In this section we describe the amplitudes required for the calculation of the non-Higgs boson mediated, or continuum, amplitudes. These correspond to the reactions (b), (c) and (e) in Table I.
The NLO corrections to the quark-initiated process (c) were first calculated in refs. [21,22], while the inclusion of spin correlations in the decays and phenomenology for the Tevatron and LHC was presented in refs. [23-26]. This channel is the most important contribution to the four lepton production process. In estimating the size of this background we will use the implementation of this process in MCFM. This implementation includes the contributions of both virtual photons and Z-bosons in producing the final state leptons. In addition, single resonant diagrams that contribute to the same final state are also included through next-to-leading order.

Process (b): Calculation of the gluon induced continuum amplitude $gg \to e^-e^+\mu^-\mu^+$
This calculation corresponds to the gg initiated box diagrams, which produce pairs of Z's from a fermion loop. These calculations have a rich history. The first calculation of ZZ production via gluon fusion (with on-shell Z's) was completed over 25 years ago [27,28]. These results were later extended to include off-shell Z's [29]. More recently, a public code gg2VV was developed [30,31] which includes the full mass dependence in the fermion loop, and leptonic decays of the $Z/\gamma^*$. This code has been used to study the interference with the Higgs signal in ref. [9]. Fully analytic helicity amplitudes with massless fermion loops for $gg \to VV$ were presented in ref. [26], using the earlier results for $V + 2j$ from ref. [32]$^4$.
Here we will describe our analytic calculation of the helicity amplitudes for the process $gg \to e^-e^+\mu^-\mu^+$ with a massive fermion propagating in the loop. The contributing diagrams are shown in Fig. 3, where the produced electroweak bosons that each decay to a charged lepton pair can either be a virtual photon or a Z-boson. This amplitude receives contributions proportional to $V_f^2$ and $A_f^2$ (the mixed terms vanish) where $V_f$, $A_f$ are the vector and axial couplings of the fermions to the Z-bosons or virtual photons.
In the first instance we shall consider the leptons to be produced by an off-shell photon. However we shall decompose the vector coupling of the photon into left- and right-handed pieces that will be calculated separately. This is necessary for the generalization to the Z-boson case in which the left- and right-handed couplings differ. There are four sub-amplitudes to consider, which we denote by LL, LR, RL and RR. The first label refers to the coupling of the boson with momentum $p_{34}$ and, correspondingly, the second to that of the boson with momentum $p_{56}$. Since the mixed vector-axial contributions vanish there are only two independent amplitudes corresponding to left-left and left-right couplings ($A_{LL}$ and $A_{LR}$) with $A_{RR} = A_{LL}$ and $A_{RL} = A_{LR}$. Note that, if desired, the vector-vector and axial-axial contributions can be reconstructed from linear combinations of $A_{LL}$ and $A_{LR}$.
We now describe how to construct the full amplitude containing both Z-bosons and virtual photons, given the four QED amplitudes $A_{LL}$, $A_{RL}$, $A_{LR}$, $A_{RR}$. We shall make a default choice for the helicity labels of the final state leptons, given in Eq. (25). The other cases can be easily obtained by interchanging $(3 \leftrightarrow 4)$ and/or $(5 \leftrightarrow 6)$. In addition we will define reduced amplitudes, labelled by $j, k = L, R$, by removing our default overall factor. The full reduced amplitude for our default lepton helicity is assembled from these with coupling factors that depend on the type $i$ of the quark running in the loop. Here $Q_i$ and $\tau_i = \pm 1$ are the charge of the $i$th quark (in units of the positron charge) and the weak isospin of the $i$th quark, and $l_e$ and $r_e$ are given by Eqs. (10) and (11). The propagator function is defined in Eq. (7).
The calculational strategy for the LR and LL pieces will be different. The LR pieces vanish for the case of massless quarks and consequently the tensor box integrals which occur are at most of rank two. Because of this low rank it is easy to obtain a compact analytic result using Passarino-Veltman reduction.
Indeed the result given in the Appendix for the LR piece differs little from the result of Glover and van der Bij [27], apart from the extension to off-shell bosons. This is mandatory for a description of the region below the Z-pair threshold, relevant for the Higgs boson. In addition we construct the helicity amplitudes in terms of spinor products. The full result for the LR helicity amplitudes is given in Appendix B.
The LL pieces contain tensor integrals of rank 4 and are treated with a different strategy. For the LL pieces we use the decomposition of Eq. (31), in which the amplitude is expanded in terms of a basis of box ($D_0$), triangle ($C_0$) and bubble ($B_0$) scalar integrals, with the sum running over the relevant kinematic configurations labelled by $j$. The precise definition of the scalar integrals is given in Appendix B. The basis also includes a purely rational term $R$. The box and triangle coefficients are determined using D-dimensional unitarity techniques [34-36]. In general these coefficients are expansions in $m^2$. The bubble coefficients are independent of the mass and can be constructed from the massless results of ref. [32]. There is an intimate relationship between the $m^4$ pieces in the box coefficients, the $m^2$ pieces in the triangle coefficients and the rational terms. We exploit these relationships wherever possible to lighten the computational burden. The full analytic results for the coefficients in Eq. (31) are given in Appendix C.
One of the features of our expansion is the introduction of the six-dimensional box in the basis set of integrals in Eq. (31). We have found that the formulation in this fashion increases the degree of numerical stability in the low $p_T^Z$ region. The six-dimensional box can be expressed in terms of normal four-dimensional box and triangle scalar integrals. This expansion introduces one power of the inverse Gram determinant. Note however that the apparent singularity for vanishing Gram determinant is cancelled by relationships between the scalar integrals in this limit. We find that grouping the terms by expressing the four-dimensional integrals in the combination dictated by the six-dimensional box leads to greater numerical stability.
As before we will consider the virtual photon process first and include the additional electroweak couplings later. There are six Feynman diagrams for this process which can potentially contain two resonant propagators.
There are four Feynman diagrams for the singly resonant process which can potentially contain only one resonant propagator. We say 'potentially' because the resonant propagators will be added at a later stage.
The other needed helicity amplitudes can be obtained from these basic amplitudes.

PHENOMENOLOGY
The full calculation of the production of $e^-e^+\mu^-\mu^+$ has been implemented in the parton level integrator MCFM. We present the relative importance of the dominant processes at $\sqrt{s} = 8$ and 13 TeV in Figs. 4 and 5. These plots have been prepared using the parameters shown in Table II and applying the CMS cuts [37], which include the requirement that the transverse momentum of the hardest (next-to-hardest) lepton should be larger than 20 (10) GeV. The gg-initiated contributions are formally of higher order, i.e. they are next-to-next-to-leading with respect to the $q\bar{q}$ process, but enhanced by large gluon fluxes at the LHC. The peaks at the Higgs boson mass and at the Z-boson mass (from the singly resonant diagrams) are visible. At high invariant mass $m_{4\ell}$ one can clearly see the destructive interference canceling the leading high energy behaviour of the $gg \to ZZ \to ee\mu\mu$ process [27]. Fig. 5 also demonstrates that the relative fraction of $gg$- and $q\bar{q}$-initiated processes changes at higher energy with the gg process becoming more important at $\sqrt{s} = 13$ TeV. The method of ref. [8] relies on gg-initiated events and is thus expected to improve with increasing energy.
To discuss the structure of our results we introduce the notation of Eq. (38) to distinguish the different squared amplitudes that are included in the gluon-gluon initiated contributions, where $M_H$ is the Higgs production amplitude and $M_C$ is the amplitude for the continuum background. Thus, for instance, $\sigma_I$ reflects the pure interference contribution while $\sigma_{H+I}$ denotes the effect of including the Higgs-mediated diagrams. As stressed in refs. [8,9] the interference is primarily of importance in the off-peak region. The overall size of the interference can be assessed from Fig. 6 which shows the cross sections $\sigma_H$ and $\sigma_{H+I}$. It is apparent that the description of the off-peak region without accounting for the interference is unreliable.
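For orientation, the decomposition described in words above can be written out as follows. This is a sketch consistent with the text rather than a transcription of the paper's equation: the label $\sigma_B$ for the continuum-only piece and the use of proportionality signs (which hide flux, phase-space and PDF factors) are assumptions.

```latex
\begin{align}
\sigma_B     &\propto |M_C|^2 , \\
\sigma_H     &\propto |M_H|^2 , \\
\sigma_I     &\propto 2\,\mathrm{Re}\left(M_H M_C^{*}\right)
              = |M_H + M_C|^2 - |M_C|^2 - |M_H|^2 , \\
\sigma_{H+I} &\propto |M_H|^2 + 2\,\mathrm{Re}\left(M_H M_C^{*}\right) .
\end{align}
```

Written this way, $\sigma_{H+I}$ is the change in the gg-initiated cross section when the Higgs diagrams are switched on, and it can be negative wherever the destructive interference outweighs $|M_H|^2$.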
In Table III we compare our results with similar results presented by Caola and Melnikov. Our results display the same general pattern as those reported in ref. [8], but differ in detail on the size of the gg interference contribution, despite using what we believe to be identical input parameters.
The results of ref. [8] were obtained using the code gg2VV [9]. We believe that the cause of the discrepancy is a cut of $p_T^Z > 7$ GeV imposed in the double precision version of gg2VV for the continuum process, but not on the Higgs signal process. The interference contribution is obtained by forming the combination (c.f. Eq. (38)) given in Eq. (39). The $p_T$ cut is performed on the first two terms on the right hand side of Eq. (39) but not on the third. The cut on the amplitudes that involve the continuum background in the gg2VV code is presumably performed for reasons of numerical stability.
We shall now discuss the treatment of the region of low $p_T$ of the Z-boson in our code, and illustrate the importance of low $p_T$. In Fig. 7 we first demonstrate the impact of the spurious $1/p_T$ singularities that appear in the amplitudes. The figures show the calculation of the $gg \to ZZ$ cross section in the region $0.1 < p_T(Z) < 2$ GeV, including only the effect of the massive top and bottom quark loops. The calculation is performed using the CMS cuts that were previously described. The calculation is performed in two different ways. The "original" calculation includes only the 4-dimensional scalar integrals in the basis, with explicit factors of $1/p_T^4$ and $1/p_T^3$ in the amplitudes for opposite helicity incoming gluons. The "improved" calculation, presented in Appendices B and C, extends the basis to also include 6-dimensional box integrals, and simplifies the remaining coefficients so that only $1/p_T^2$ factors remain. The original calculation becomes numerically unstable for $p_T < 0.4$ GeV, whereas the improved calculation provides a reliable prediction down to the $p_T = 0.1$ GeV threshold. The significance of the low-$p_T$ region is demonstrated in Fig. 8. The figure shows the contribution to the total Higgs and continuum cross sections from the phase space below a given $p_T$ cut. For the $gg \to ZZ$ continuum process, the effect of enforcing a cut at 7 GeV is a reduction in the cross section of about 8%. In contrast, a cut at the level of 0.1 GeV has a negligible (< 0.01%) effect.
In Fig. 6 we also show the effect of Higgs-mediated diagrams in the $qg$ ($\bar{q}g$) initiated interference and in Table III quantify the size of this contribution to the cross section in two mass ranges. Compared to the Higgs peak cross section this contribution is small. Further, as the figure illustrates, this interference is significant primarily in the region around $2m_Z$ and above 300 GeV it can be safely neglected. A full assessment of the import of these terms will have to await a complete NNLO calculation of the $pp \to e^-e^+\mu^-\mu^+$ process since we expect an intimate relationship between the Higgs-mediated contribution we have considered and other box diagrams contributing to the full $q\bar{q}gZZ$ amplitude. Alternatively, if we identify a jet, we could compare the data with a complete NLO calculation of $pp \to e^-e^+\mu^-\mu^+$ + jet, of which the interference that we present is also a part$^5$. At present we merely note that these terms do not overwhelm the contributions from the gg-initiated terms. In view of the fact that our results for this interference term are small and only a partial calculation, we will not include them in the following discussion.
We now investigate the dependence of the on-shell, off-shell and interference contributions on the choice of parton distribution function and scale. For the sake of illustration we undertake this analysis for $\sqrt{s} = 8$ TeV. Results at other centre of mass energies are similar. For the PDF set we consider CTEQ6L1 [40] in addition to our standard choice of MSTW08LO [38]. We also investigate the use of a dynamic scale that is more natural for events that lie far beyond the Higgs boson on-shell peak, namely $m_{4\ell}/2$. Our results are summarized in Table IV. The cross section changes considerably when switching from the fixed to the dynamic choice of scale, since the off-peak contribution is considerably suppressed by the running of the strong coupling.
However the ratio of off-peak to on-peak cross sections is relatively stable under PDF variation. Our best predictions for the effect of the interference, obtained using the running scale $m_{4\ell}/2$, are presented in Table V.
We now turn to the issue of constraining the Higgs width by measuring the fraction of off-shell ZZ events, as proposed in Ref. [8]. The scenario we consider is one in which the peak Higgs cross section is constrained to its Standard Model value while the width is changed. Such a scenario is realized by a universal rescaling of the coupling of the Higgs boson, $g_x \to \xi g_x$, accompanied by $\Gamma_H \to \xi^4 \Gamma_H$. The off-shell cross sections then depend on the width as $\sigma^{H+I}_{\rm off}(m_{4\ell} > 300~{\rm GeV}) = 0.025 \ldots$ In these equations the linear scaling with the Higgs width originates from the genuine off-shell contribution while the interference contribution scales with the square root. The coefficients entering the equivalent relations at 7 and 13 TeV can be read directly from Table V. With these results in hand it is straightforward to repeat the analysis of Ref. [8] in order to obtain the number of off-shell Higgs-related 4-lepton events ($N^{4\ell}_{\rm off}$) expected in the CMS analysis presented in Ref. [37]. The number of such events expected in the combined 7 and 8 TeV data sample is obtained by summing the appropriately-weighted cross sections and normalizing to the peak cross section reported in Ref. [37]. We find $N^{4\ell}_{\rm off}(m_{4\ell} > 300~{\rm GeV}) = 2.02 \ldots$ Comparing the first of these equations to the equivalent one found in Ref. [8] we see that the coefficients are both smaller, due to the difference between our choice of dynamic scale and the approximate suppression factor employed in Ref. [8]. The interference term differs further due to the use of the gg2VV code in Ref. [8] that employs a $p_T^Z$ cut, as discussed previously. The limit on the Higgs width is then determined by comparing the background-subtracted number of events observed with the number of Higgs-related events expected. This is illustrated graphically in Fig. 9. We obtain limits on the width at the level of 25-45 times the Standard Model expectation. These limits are slightly weaker than those reported in Ref. [8] due to the different choice of scale, as discussed above. Since the current limits are far from the Standard Model value, the analysis is not affected by the small difference in the interference term which is insignificant for large values of the rescaling parameter.
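The way such a limit follows from the scaling relations can be sketched in a few lines. The example assumes the functional form described in the text, a term linear in $x = \Gamma_H/\Gamma_H^{SM}$ from the off-shell Higgs contribution and a negative term proportional to $\sqrt{x}$ from the destructive interference. The coefficients `a` and `b` and the allowed event count `n_max` used below are hypothetical placeholders, not the values of the paper.

```python
import math


def offshell_events(width_ratio, a, b):
    """Higgs-related off-shell yield N(x) = a*x - b*sqrt(x), with x = Gamma_H / Gamma_H^SM.

    a*x comes from |M_H|^2 and -b*sqrt(x) from the (destructive) interference.
    """
    return a * width_ratio - b * math.sqrt(width_ratio)


def width_upper_bound(n_max, a, b):
    """Largest x with N(x) <= n_max, from the quadratic a*t^2 - b*t - n_max = 0 in t = sqrt(x).

    For a > 0, b >= 0 and n_max >= 0 the positive root is unique, and N(x)
    exceeds n_max for every x above it.
    """
    t = (b + math.sqrt(b * b + 4.0 * a * n_max)) / (2.0 * a)
    return t * t


# Hypothetical inputs chosen only to exercise the functions.
a, b, n_max = 2.0, 3.0, 60.0
x_max = width_upper_bound(n_max, a, b)
print(x_max, offshell_events(x_max, a, b))
```

Because the interference term grows only like $\sqrt{x}$, it has little influence on the bound when the allowed $x$ is large, in line with the remark above about the current limits.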

CONSTRAINING THE HIGGS WIDTH USING THE MATRIX ELEMENT METHOD
The results presented in the previous section highlight the difficulty of measuring the off-shell Higgs-mediated contributions to four-lepton production at the LHC. It is therefore natural to investigate the possibility of using advanced techniques to extend the experimental analyses beyond a cut and count approach. One such technique is the use of kinematic discriminants, which assign each event a weight associated with a given hypothesis. The variant of this method that we adopt is the matrix element method (MEM), in which a fixed-order matrix element is used to assign a probabilistic weight to individual events. In this way all of the theoretical information encoded in the matrix element is utilized in the analysis. The MEM has been used successfully in the on-shell region [2,17,18] and it is therefore natural to investigate the possibility of using such a kinematic discriminant in the off-shell region. In this section we will use the matrix element method algorithms presented in ref. [19] to compute kinematic discriminants in the off-shell region. Although Ref. [19] presented an extension of the MEM to NLO accuracy, since the gg initiated matrix elements are currently only available at LO, our analysis will focus on the LO implementation of this algorithm. We will briefly discuss the potential impact of the MEM@NLO at the end of this section.
The aim of the MEM is to associate a probabilistic weight to each input event (from Monte Carlo or data), with the weight computed under a given theoretical hypothesis. In the case at hand we must map an input data event to a partonic configuration in which the 4-lepton system has no transverse momentum. In order to implement this map for an input data event, which may contain significant recoil, we perform a transverse boost. To ensure that the weight is unique, we integrate over all longitudinally equivalent boosts. Each weight is thus obtained from a fixed-order matrix element and an integration over the longitudinal degrees of freedom associated with the production through two colliding partons. Explicitly, at LO the weights are defined as follows, In this equation $\hat\sigma_{ij}$ is the LO parton cross section, evaluated at the phase space point $\phi$, defined for incoming partons of flavour $i$ and $j$, which occur in the proton with probability $f_{i,j}$ given by the parton distribution functions. $Q^2$ represents the overall center of mass energy of the event, which is kept invariant under the longitudinal integration. In this equation we have assumed that the leptons are well-measured in order to reduce the computational load. Lifting this assumption is straightforward and we believe that the results presented here serve as a well-motivated and useful starting point for future studies.
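The longitudinal integration described above can be sketched numerically. The example assumes the weight contains an integral of the form $\int dx_1\, dx_2\, \delta(x_1 x_2 s - Q^2)\, f_i(x_1) f_j(x_2)\, \hat\sigma_{ij}$, which the delta function reduces to a one-dimensional integral over the rapidity $y$ of the partonic system. The functions `toy_pdf` and `toy_sigma_hat` are hypothetical stand-ins, not a real PDF set or matrix element, and any overall normalization of the weight is omitted.

```python
import math


def toy_pdf(x):
    """Hypothetical parton density, NOT a fitted PDF set."""
    return (1.0 - x) ** 3 / x


def toy_sigma_hat(x1, x2, phi):
    """Hypothetical stand-in for the LO partonic cross section at point phi."""
    return 1.0


def lo_weight(phi, q2, s, pdf=toy_pdf, sigma_hat=toy_sigma_hat, n_steps=2000):
    """Unnormalized weight for one flavour combination.

    Evaluates int dx1 dx2 delta(x1*x2*s - q2) f(x1) f(x2) sigma_hat using
    x1 = sqrt(tau)*exp(y), x2 = sqrt(tau)*exp(-y) with tau = q2/s.
    The Jacobian is unity, so the result is (1/s) * int dy f(x1) f(x2) sigma_hat.
    """
    tau = q2 / s
    if not 0.0 < tau < 1.0:
        raise ValueError("need 0 < q2 < s")
    y_max = -0.5 * math.log(tau)      # rapidity range keeping x1, x2 below 1
    dy = 2.0 * y_max / n_steps
    root_tau = math.sqrt(tau)
    total = 0.0
    for k in range(n_steps):
        y = -y_max + (k + 0.5) * dy   # midpoint rule
        x1 = root_tau * math.exp(y)
        x2 = root_tau * math.exp(-y)
        total += pdf(x1) * pdf(x2) * sigma_hat(x1, x2, phi)
    return total * dy / s


# Illustrative call: an 8 TeV collision and a four-lepton system with Q = 300 GeV.
weight = lo_weight(phi=None, q2=300.0 ** 2, s=8000.0 ** 2)
```

Because $Q^2$ is held fixed, only the boost of the partonic system along the beam axis is integrated over, which is what makes the weight independent of the unknown longitudinal momentum of the event.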

The Kinematic Discriminant
For each event we compute three weights, corresponding to different hypotheses:
$P_{q\bar q}$ : $q\bar q$ initiated background.
$P_{gg}$ : $gg$ initiated pieces, including Higgs signal, box diagrams and interference.
$P_H$ : $gg$ initiated Higgs signal squared.
The kinematic discriminant $D_S$ is then computed from these according to, Note that, since $P_{gg}$ contains both the effect of the Higgs diagram squared and the interference term between the signal and background it is possible that $P_H > P_{gg}$ so that $D_S > 0$. We have chosen $P_H$ in the numerator (compared to $P_{gg}$) since $P_{gg}$ will favor events which either have a large continuum or Higgs probability. To constrain the Higgs width we primarily seek off-shell Higgs events, and our discriminant is thus constructed to reflect this.
The samples of events that we use for our study are generated as follows. For the background $q\bar q$ events we use POWHEG [15] to produce NLO events matched to the PYTHIA [41] parton shower. We will use the term $q\bar q$ background to refer to all non $gg$-initiated backgrounds, even though this sample contains some fraction of $gq$ initiated events that enter at NLO. Events from the Higgs signal, $gg$ background and interference terms are generated using the results of this paper, using the same PYTHIA interface to produce showered events. We then perform a basic simulation of detector effects by performing Gaussian smearing of the $p_T$ of each of the leptons, with a width of 0.5 GeV. After this we require exactly four leptons that pass cuts based on the CMS selection criteria presented in the previous section. For efficiency of generation we have raised the minimum invariant mass of the off-shell lepton pair to 20 GeV and, for simplicity, have fixed $|\eta_\ell| < 2.4$ for all leptons.
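As an illustration of how the three weights can be combined, the sketch below assumes a log-ratio discriminant of the form $D_S = \log\left[P_H/(P_{gg} + P_{q\bar q})\right]$ with a natural logarithm. Both the functional form and the base of the logarithm are assumptions made for this example; the input weights are invented numbers.

```python
import math


def kinematic_discriminant(p_h, p_gg, p_qq):
    """Assumed form D_S = ln(P_H / (P_gg + P_qq)) for a single event."""
    background_like = p_gg + p_qq
    if p_h <= 0.0 or background_like <= 0.0:
        raise ValueError("weights must give a positive ratio")
    return math.log(p_h / background_like)


# Invented weights: a Higgs-like event and a continuum-like event.
d_higgs_like = kinematic_discriminant(p_h=2.0, p_gg=0.5, p_qq=0.5)
d_continuum_like = kinematic_discriminant(p_h=0.01, p_gg=0.5, p_qq=0.5)
```

With this choice, events for which the Higgs-squared weight exceeds the sum of the other two weights populate $D_S > 0$, while continuum-dominated events fall at negative values.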
We begin by validating the discriminant on our $gg$-initiated samples. Samples are generated using the prescription and cuts described above, for two different values of the total Higgs width: $\Gamma_H \to \xi^4 \Gamma_H$ with $\xi^4 = 1, 10$. In order to understand the behaviour of the discriminant on the different events that may be present we generate three samples for each $\xi$, corresponding to $|\mathcal{M}_H|^2$, $|\mathcal{M}_{H+C}|^2$ and $|\mathcal{M}_C|^2$. Our results are summarized in Fig. 10, which clearly indicates that the discriminant is working as expected. The continuum-only sample peaks at $D_S \approx -2$ while the event samples containing the Higgs boson produce a significant feature in the region $D_S > 0$. In addition, the number of events present in this $D_S > 0$ region depends strongly on the rescaling factor $\xi$. The difference between the number of events found there for $\xi^4 = 1$ and $\xi^4 = 10$ scales roughly as $\Gamma_H/\Gamma_H^{\rm SM}$, i.e. an order of magnitude. This should be compared to the overall scaling of the total $gg$ cross section, which for the same values of $\xi$ increases by around 24%. The impact of the interference is also clear from the figure. The destructive interference reduces the overall cross section and particularly suppresses the number of events in the region in which the Higgs signal is largest. These results clearly demonstrate the importance of modeling the interference in this measurement. Indeed, in the Standard Model the peak associated with the off-shell production of Higgs bosons is completely washed out by the interference, as expected from the results of the previous section.
Having validated our discriminant on control $gg$ samples, we now compare our $gg$ events to the $q\bar q$ sample. Our results are shown in Fig. 11. Due to the much larger cross section, $\sigma^{\rm NLO}_{q\bar q} \approx 10\,\sigma_{gg}$, the $q\bar q$ initiated events now dominate the discriminant. However, it is also clear from Fig. 11 that these events have the same shape as the continuum $gg$ background. As a result the region $D_S > 0$ remains sensitive to the value of $\xi$ and for $\xi^4 = 10$ the number of expected $gg$ events in the tail is comparable to the number of $q\bar q$ events.
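The different scaling of the $D_S > 0$ region and of the total $gg$ rate can be made concrete with a rough arithmetic sketch. It assumes that, with the on-shell rate held fixed, the off-shell Higgs amplitude scales as $\xi^2$, so that the Higgs-squared term scales as $\xi^4$ and the interference as $\xi^2$. The inputs are the approximate fractions quoted for Fig. 4 later in the text (interference of about $-200\%$ of the Higgs term, Higgs term of about 5%); taking the 5% relative to the continuum, and applying these fractions under the cuts used here, are both assumptions.

```python
import math


def gg_cross_section(xi4, sigma_c, sigma_h, sigma_i):
    """Off-shell gg rate after Gamma_H -> xi4 * Gamma_H at fixed on-shell rate.

    Assumes the Higgs-squared term scales as xi^4 and the interference as xi^2.
    """
    return sigma_c + math.sqrt(xi4) * sigma_i + xi4 * sigma_h


SIGMA_C = 1.0               # continuum, arbitrary units
SIGMA_H = 0.05 * SIGMA_C    # approximate Higgs-squared fraction (assumed relative to continuum)
SIGMA_I = -2.0 * SIGMA_H    # approximate destructive interference

total_ratio = (gg_cross_section(10.0, SIGMA_C, SIGMA_H, SIGMA_I)
               / gg_cross_section(1.0, SIGMA_C, SIGMA_H, SIGMA_I))
higgs_ratio = 10.0 * SIGMA_H / SIGMA_H   # Higgs-squared term alone
```

Under these assumptions the total $gg$ rate grows by roughly a quarter between $\xi^4 = 1$ and $\xi^4 = 10$, in the same range as the 24% quoted above, while the Higgs-squared term alone grows by a factor of ten.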

Measuring the Higgs width using the MEM
In order to determine the expected limit on $\Gamma_H$ we must first form a prediction for the total number of expected events in our data sample. In our setup the total number of expected events consists of those arising from the $q\bar q$, $gg$ continuum and Higgs-mediated contributions, We wish to normalize the samples according to the number of expected $q\bar q$ events, i.e. we define, In Eq. (48) the best prediction for $\sigma_{q\bar q}$ is obtained from a NLO calculation and we generate it using POWHEG. For $\sigma^{C}_{gg}$ the current state of the art is the LO calculation presented in this paper. However the part of $\sigma^{H+I}_{gg}$ that represents Higgs diagrams squared (i.e. $\sigma^{H}_{gg}$) is known to NNLO and the higher order corrections are large. For this reason we rescale the results of this paper for $\sigma^{H+I}_{gg}$ by a NLO K-factor of 1.76. This is derived in the effective theory, under the CMS cuts with $m_{4\ell} > 100$ GeV. This approach treats the higher-order corrections to the Higgs-squared diagram and the Higgs-continuum interference equally. However, as we have seen in the previous section, for the current LHC sensitivity the limits on the width do not depend strongly on the effect of the interference.
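The normalization to the $q\bar q$ expectation can be sketched as follows. The example assumes the total takes the form $N_{\rm exp} = N_{q\bar q}\left[1 + (\sigma^{C}_{gg} + K\,\sigma^{H+I}_{gg})/\sigma_{q\bar q}\right]$, which is one natural reading of the prescription above rather than a quoted formula. The K-factor of 1.76 and the value $N_{q\bar q} = 400$ are taken from the text; the cross-section values are hypothetical numbers in arbitrary units, chosen only so that $\sigma_{q\bar q}$ is about ten times the $gg$ rate.

```python
def expected_events(n_qq, sigma_qq, sigma_c_gg, sigma_hi_gg, k_factor=1.76):
    """Total expected events, normalized to the qq-bar expectation.

    Assumed form: N_qq * (1 + (sigma_C + K * sigma_{H+I}) / sigma_qq).
    The K-factor multiplies only the Higgs-mediated (H + interference) piece.
    """
    if sigma_qq <= 0.0:
        raise ValueError("sigma_qq must be positive")
    return n_qq * (1.0 + (sigma_c_gg + k_factor * sigma_hi_gg) / sigma_qq)


# Hypothetical cross sections in arbitrary units; a negative H+I value
# corresponds to destructive interference outweighing the Higgs-squared term.
n_total = expected_events(n_qq=400.0, sigma_qq=10.0,
                          sigma_c_gg=1.0, sigma_hi_gg=-0.05)
```

Since the same K-factor multiplies both the Higgs-squared and the interference contributions, any difference between their true higher-order corrections is not captured by this normalization.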
In our analysis we will use a fixed $q\bar q$ expectation $N_{q\bar q} = 400$. As a systematic uncertainty on our method we will consider the variation of $\sigma^{C}_{gg}$ and $\sigma^{H}_{gg}$ over the scale choices $\mu = \{m_{4\ell}/4,\, m_{4\ell}/2,\, m_{4\ell}\}$. The number of Higgs-mediated events in the off-shell region, $m_{4\ell} > 130$ GeV, can then be parametrized by, The systematic uncertainties in this constraint correspond to the variation of the scale about the central value of $m_{4\ell}/2$ as described above. Despite the small differences in the analysis compared to the last section, the final constraints are rather similar, cf. Eq. (44).
We can now compare the effect of performing a MEM analysis with a cut on the discriminant variable, $D_S > D_S^{\rm cut}$. In order to obtain our expected number of events, given a cut on $D_S$, we use the Monte Carlo samples discussed previously (see Fig. 11). For each sample we calculate the fraction of events that pass the cut on the discriminant. We then use the normalization prescription of Eq. (48) to combine the samples, weighted by the appropriate cut efficiency. Our results are summarized in Fig. 12. It is clear that application of a cut on the discriminant variable strengthens the constraint on the Higgs width. Given our expected number of events, the largest values of $D_S^{\rm cut}$ actually result in weaker constraints on the Higgs width since there are too few events to effectively discriminate between hypotheses. The strongest expected constraint on the Higgs width is around $D_S^{\rm cut} = 1$ for which we find, This is around a factor of 2.6 better than the cut-and-count method with $m_{4\ell} > 130$ GeV, and about 1.6 times better than the result for the $m_{4\ell} > 300$ GeV cut, cf. Eq. (50). Note that it may be possible to improve these limits in a full experimental analysis, for instance by using a template fit to fully exploit the shape of the full $D_S$ distribution rather than simply cutting on it.
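The procedure of computing a cut efficiency for each sample and combining the samples can be sketched as below. The $D_S$ values and event counts are invented toy numbers, and the helper names `cut_efficiency` and `events_after_cut` are hypothetical; a real analysis would use the showered Monte Carlo samples and the normalization of Eq. (48).

```python
def cut_efficiency(d_values, d_cut, weights=None):
    """Fraction of (optionally weighted) events with D_S > d_cut."""
    if weights is None:
        weights = [1.0] * len(d_values)
    if len(weights) != len(d_values):
        raise ValueError("one weight per event is required")
    total = sum(weights)
    if total == 0.0:
        raise ValueError("sample has zero total weight")
    passed = sum(w for d, w in zip(d_values, weights) if d > d_cut)
    return passed / total


def events_after_cut(n_by_sample, eff_by_sample):
    """Combine samples: sum over samples of (expected events) * (cut efficiency)."""
    return sum(n_by_sample[name] * eff_by_sample[name] for name in n_by_sample)


# Toy discriminant values, for illustration only.
toy_ds = {
    "qq": [-3.0, -2.2, -2.0, -1.5, 1.1],
    "higgs": [-1.0, 0.2, 1.2, 1.5, 2.0],
}
toy_counts = {"qq": 400.0, "higgs": 20.0}

D_CUT = 1.0
efficiencies = {name: cut_efficiency(ds, D_CUT) for name, ds in toy_ds.items()}
n_pass = events_after_cut(toy_counts, efficiencies)
```

Scanning `D_CUT` over a range of values and repeating the limit extraction for each choice is one way to locate the cut that gives the strongest expected constraint.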

Future Theoretical Improvements
The results of the previous subsection illustrate the potential of the MEM to constrain the Higgs width. Given its important role in determining Higgs couplings, it is natural to consider potential improvements which may lead to stronger constraints in the future. Obviously the limits derived previously will improve with the collection of larger data sets, eventually becoming dominated by systematic errors.
The most obvious potential improvement is the calculation of the complete $gg$-initiated contributions (continuum and Higgs-mediated) at NLO. This would improve the cut-and-count method, and also allow for the use of the MEM@NLO [19]. Given the long lifetime of the LHC, this calculation is a realistic possibility. Indeed the NLO corrections to the Higgs signal are already known [42][43][44].
A second improvement, which is simpler to implement, could come from binning the events according to the number of associated jets and using the MEM@LO in each bin separately. Indeed we know that for the $gg \to ZZ$+jet process the interference between Higgs and continuum diagrams in the off-peak region is around $-160\%$ of the off-peak Higgs cross section and that about 9% of the $gg$-initiated cross section is due to Higgs diagrams [45]. This is to be contrasted with our results reported in Fig. 4, where the interference is approximately $-200\%$ and only about 5% of the $gg$-initiated cross section is due to Higgs diagrams. We leave a detailed investigation of this possibility to future work.

CONCLUSIONS
In this paper we have revisited the cross sections for the hadronic production of four charged leptons, $e^- e^+ \mu^- \mu^+$, focussing in particular on the gluon-gluon initiated processes that involve closed fermion loops. We include the full amplitude, i.e. both the diagrams containing an s-channel Higgs boson and those proceeding through a closed loop of (massive and massless) fermions radiating vector bosons ($Z/\gamma^*$). Our result for the full amplitude includes the interference between the two types of processes. We have obtained analytic formulae for the $gg$-initiated helicity amplitudes, retaining the mass of the fermion that circulates in the closed loop. Even though numerical results using these amplitudes have been presented before, we believe this is the first paper to publish analytic results for helicity amplitudes including off-shell vector bosons in the final state. The inclusion of off-shell vector bosons is clearly necessary to describe the region where the mass of the four leptons is below twice the Z-boson mass, relevant for Higgs boson studies. Our analytic approach has advantages over a more numerical approach, both in terms of calculational speed and in terms of numerical stability. Numerical stability can be an issue in the region where the vector boson transverse momentum $p_T$ is small. We have demonstrated that our code is stable down to $p_T = 0.1$ GeV, where we perform a cut that removes a negligible fraction of the cross section (0.01%).
The experimental study of the Higgs boson in the four-lepton channel has focussed on the resonant region where the mass of the four leptons is close to the mass of the Higgs boson. Somewhat surprisingly, the narrow width approximation for the Higgs boson fails because of the proximity of the Z-pair threshold and the production of longitudinal Z-bosons. Indeed 15% of the cross section deriving from diagrams with a Higgs boson in the s-channel lies hundreds of Higgs widths above its mass, $m_{4\ell} > 130$ GeV. It is essential to include interference in the $gg$-channel to accurately describe this region. The interference in the $qg$-channel is found to be smaller. Its contribution can be further reduced by binning the data in the number of associated jets, or by considering only the $m_{4\ell} > 300$ GeV region. A definitive analysis of its importance will require a complete higher order calculation.
Following a suggestion of Caola and Melnikov we have used the off-resonant production through the Higgs channel to bound the total width of the Higgs boson. We confirm and extend the results of Caola and Melnikov, giving more precise results for the effect of the interference and investigating alternative choices for the renormalization and factorization scale. We find that the choice of scale can substantially affect the ratio of off-shell to on-shell Higgs production, although the effect of different parton distributions on this quantity is less important. More precise predictions for this ratio will require a campaign to include strong and electroweak higher order effects into the cross section for the four lepton final state.
A cut-and-count style analysis of current data gives limits $\Gamma_H < 43.2\,(25.2)\,\Gamma_H^{\rm SM}$ using off-resonance events with $m_{4\ell} > 130\,(300)$ GeV. We investigated the use of a matrix element method to construct an event-by-event kinematic discriminant as a means of improving the constraint on the Higgs width. Using Monte Carlo pseudo-data we found that such a MEM analysis could suppress the $q\bar q$ and $gg$ continuum backgrounds whilst still remaining sensitive to the width of the Higgs boson. In our analysis the bound on the Higgs width was improved by a factor of about 1.6 using a simple cut on the MEM discriminant, compared to an invariant mass cut $m_{4\ell} > 300$ GeV. Our results motivate a more complete experimental analysis including real data and a full detector simulation.
The scalar integrals themselves are defined as follows, where the denominator function is For the purposes of this paper we take the masses in the propagators to be real. Near four dimensions we use $d = 4 - 2\epsilon$ (and for clarity the small imaginary part which fixes the analytic continuations is specified by $+i\varepsilon$). $\mu$ is a scale introduced so that the integrals preserve their natural dimensions, despite excursions away from $d = 4$. We have removed the overall constant which occurs in d-dimensional integrals. For the final numerical evaluation of the amplitudes, the ff [46,47] and QCDLoop [48] libraries provide values for these scalar integrals.
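For orientation, the definitions referred to above can be sketched in the normalization commonly used with the QCDLoop library. This is an assumed reconstruction for a single internal mass $m$, not a quotation of the paper's equations; the symbol $r_\Gamma$ for the removed overall constant and the argument conventions of $B_0$, $C_0$ and $D_0$ are assumptions made for this example.

```latex
% Assumed QCDLoop-style normalization with equal internal masses m.
\begin{align}
B_0(p_1; m) &= \frac{\mu^{4-d}}{i\pi^{d/2}\, r_\Gamma}
  \int d^d\ell \; \frac{1}{d(\ell)\, d(\ell+p_1)} , \\
C_0(p_1, p_2; m) &= \frac{1}{i\pi^{2}}
  \int d^4\ell \; \frac{1}{d(\ell)\, d(\ell+p_1)\, d(\ell+p_1+p_2)} , \\
D_0(p_1, p_2, p_3; m) &= \frac{1}{i\pi^{2}}
  \int d^4\ell \; \frac{1}{d(\ell)\, d(\ell+p_1)\, d(\ell+p_1+p_2)\, d(\ell+p_1+p_2+p_3)} , \\
d(\ell) &= \ell^2 - m^2 + i\varepsilon , \qquad
r_\Gamma = \frac{\Gamma^2(1-\epsilon)\,\Gamma(1+\epsilon)}{\Gamma(1-2\epsilon)}
         = \frac{1}{\Gamma(1-\epsilon)} + O(\epsilon^3) .
\end{align}
```

In this convention the triangle and box integrals with a massive internal line are finite and can be written directly in four dimensions, while the bubble retains its ultraviolet pole in $\epsilon$.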
The expression for the six-dimensional box with two adjacent external massless lines is, ($p_1^2 = p_2^2 = 0$), with $Y = s_{134} s_{234} - s_{34} s_{56}$. The six-dimensional box is both infra-red and ultraviolet finite, even in the limit $m \to 0$. We also note that, despite the overall factor of $1/Y$, it approaches a finite limit as $Y$ (or $p_T$) goes to zero.
There are two independent helicity configurations, $(1^+, 2^+)$ and $(1^-, 2^+)$. To obtain the remaining helicities we define the operation, flip : $(3 \leftrightarrow 4)$, $(5 \leftrightarrow 6)$, The remaining two helicities are then obtained by,
The LR amplitude is simple because it vanishes in the $m \to 0$ limit. Thus the tensor rank of the integrals that appear is at most two. We will first consider the LR amplitude for the gluonic production of two virtual photons, The virtual photons will subsequently decay to charged lepton pairs with momenta $p_3$, $p_4$ and $p_5$, $p_6$, so that $p_{34} = p_3 + p_4$, $p_{56} = p_5 + p_6$. These decays will be added later in this section. For definiteness we consider the LR amplitude for the case where the couplings of the virtual photons with momenta $p_{34}$ and $p_{56}$ are We remove a series of overall factors to define a reduced amplitude $P^{\mu\nu\rho\sigma}_{LR}$ for this process, The indices $\mu$ and $\nu$ refer to the two gluons with momenta $p_1$ and $p_2$ respectively (see Fig. 3). $C_1$ and $C_2$ similarly denote the color labels of the gluons. Including the factor of $-1$ for a fermion loop we find that the form of the reduced amplitude, consistent with QCD gauge invariance, is [27], The six form factors In writing these equations we have introduced the notation, We also note the following relations, The notation for the scalar integrals, $D_0(j)$, $C_0(j)$ is given in Table VI so that $A_6$ can be dropped from further discussion. These formulae, up to an overall factor, are in agreement with the result given in Ref. [29]. In addition, in the limit $p_{34}^2 = p_{56}^2 = M_Z^2$, they are in agreement with the formula of Ref. [27].
This concludes our discussion of the tensor $P^{\mu\nu\rho\sigma}_{LR}$. Contracting with the polarization vectors of the gluons, $\epsilon_\pm$, we find that the helicity amplitudes for the $(1^-, 2^+)$ and $(1^+, 2^+)$ polarizations are, Thus the amplitude for the standard polarization of the final state leptons can be written with an overall factor extracted, where the reduced amplitudes are given in terms of the form factors $A_i$ defined in Eq. (B8) by, For this helicity combination these coefficients are equal to zero. The results for $d_3$ can be found from the results for $d_2$ by applying the operation $\mathrm{flip}_2$ defined by, Note that under this operation the helicities of the lepton lines are switched.
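The statement made earlier for the six-dimensional box, that the limit $Y \to 0$ corresponds to $p_T \to 0$, can be checked numerically. The sketch below assumes an all-outgoing momentum convention, so that $s_{134}$ and $s_{234}$ play the role of the Mandelstam variables $t$ and $u$ of the $2 \to 2$ process with incoming massless gluons; under that assumption $Y = s_{12}\, p_T^2$, where $p_T$ is the transverse momentum of either vector boson. The function name and the input values are illustrative only.

```python
import math


def mdot(p, q):
    """Minkowski product with metric (+,-,-,-); vectors are (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def y_and_pt2(sqrt_s, m34, m56, theta):
    """Return (Y, s12, pT^2) for gg -> V(p34) V(p56) in the centre-of-mass frame."""
    if sqrt_s <= m34 + m56:
        raise ValueError("below threshold")
    s12 = sqrt_s ** 2
    e_beam = 0.5 * sqrt_s
    k1 = (e_beam, 0.0, 0.0, e_beam)     # physical incoming gluon momenta
    k2 = (e_beam, 0.0, 0.0, -e_beam)
    e34 = (s12 + m34 ** 2 - m56 ** 2) / (2.0 * sqrt_s)
    p_abs = math.sqrt(e34 ** 2 - m34 ** 2)
    p34 = (e34, p_abs * math.sin(theta), 0.0, p_abs * math.cos(theta))
    # All-outgoing convention: p1 = -k1, p2 = -k2, so s_134 = (p34 - k1)^2.
    d1 = tuple(a - b for a, b in zip(p34, k1))
    d2 = tuple(a - b for a, b in zip(p34, k2))
    s134 = mdot(d1, d1)
    s234 = mdot(d2, d2)
    y = s134 * s234 - (m34 ** 2) * (m56 ** 2)
    pt2 = (p_abs * math.sin(theta)) ** 2
    return y, s12, pt2


# Illustrative kinematics (GeV): one boson near the Z mass, one off-shell.
y_val, s12_val, pt2_val = y_and_pt2(sqrt_s=500.0, m34=91.0, m56=60.0, theta=0.7)
```

In this check $Y$ vanishes exactly when the bosons are produced along the beam direction, which is the region where numerical stability of the amplitudes matters most.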
Note that the standard helicity choice, $(1^-, 2^+)$, can be recovered by applying the flip operation defined in Eq. (B1).
Note that the standard helicity choice can be recovered by applying the flip operation defined in Eq. (B1).
The kinematic quantities $\Delta_3$ and $\delta_{12,34,56}$ are defined in Eq. (B11). The coefficient $c_2(1^-, 2^+)$ is again obtained by exploiting the relation between mass-dependent coefficients of boxes and triangles and the total rational contribution, cf. Eq. (C19).
The kinematic quantities $\Delta_3$ and $\delta_{ij,kl,mn}$ are defined in Eq. (B11).
appearing in the amplitudes $A(1^+, 2^+, 3^-, 4^+, 5^-, 6^+)$ and $A(1^-, 2^+, 3^-, 4^+, 5^-, 6^+)$ are shown in Table VII. In summary we give the value of the reduced matrix elements at our standard point Eq. (D1) and for a quark of mass $m = 0.4255266775$ running in the loop. We find for the LR combination, where,