Singular ways to search for the Higgs boson

The discovery or exclusion of the fundamental standard scalar is a hot topic, given the data of LEP, the Tevatron and the LHC, as well as the advanced status of the pertinent theoretical calculations. With the current statistics at the hadron colliders, the workhorse decay channel, at all relevant H masses, is H to WW, followed by W to light leptons. Using phase-space singularity techniques, we construct and study a plethora of "singularity variables" meant to facilitate the difficult tasks of separating signal and backgrounds and of measuring the mass of a putative signal. The simplest singularity variables are not invariant under boosts along the collider's axes and the simulation of their distributions requires a good understanding of parton distribution functions, perhaps not a serious shortcoming during the boson hunting season. The derivation of longitudinally boost-invariant variables, which are functions of the four charged-lepton observables that share this invariance, is quite elaborate. But their use is simple and they are, in a kinematical sense, optimal.


I. INTRODUCTION
Recent data from the LHC [1] on a putative standard Higgs boson exclude, at a 95% confidence level (CL), the mass domain 127 GeV < M H < 600 GeV (CMS) and, with some narrow gaps, 131 GeV < M H < 453 GeV (ATLAS). These results are obtained with full use of the standard theory, including radiative corrections which sometimes constitute the dominant effect. The amplitude for Higgs boson production, for example, is largely dominated by gluon fusion via a t-quark loop and so is the amplitude for H → γγ decay.
In the "quantum-level" setting we recounted, it would be inconsistent not to analyze the LHC data in conjunction with the constraints on M H which follow from the profusion of high precision measurements that test the standard theory beyond tree level. These constraints (and the direct searches [2] at the Tevatron) result in M H < 161 (156) GeV at a CL of 95% [3], while the direct LEP limit is M H > 115 GeV.
In mass intervals akin to the one implied by the quoted constraints, CMS finds a 1.9σ excess of events (which could be an indication of a Higgs signal) at M H = 124 GeV and ATLAS a 2.5σ one at M H = 126 GeV [1]. In the current broad mass range(s) of the searches, the corresponding "local" significances are somewhat larger [1], but have no rigorous statistical interpretation.
For a standard Higgs boson of mass M H > 140 GeV the branching ratio for the decay H → W W is the dominant one. Below this mass and above the LEP limit, the winner is H → b b̄, a process beset by terrifying backgrounds at a hadron collider. The branching ratio W → q q̄ is one order of magnitude larger than the one for W → ℓν, ℓ = e, µ. But a light-lepton signal is much "cleaner" than that of a quark-generated jet. This makes the chain H → W⁺W⁻, W± → ℓ±ν the all-mass workhorse at a hadron collider. In brief, we refer to this process as H → W W, including the "off-shell" M H < 2 M W case, often dubbed H → W W*.
The obvious problem with the H → W W channel is that M H cannot be reconstructed event by event, as a lot of information escapes detection with the unobserved neutrinos and, at a hadron collider, also with the unobserved hadrons that exit "longitudinally" close to the beam pipe(s). This makes taming the workhorse almost an art, not only a science. The formal and theoretically optimal singularity variable procedure to deal with this kind of incomplete information is summarized in [4,5] and exploited for the hadron-collider production of a single W in [5]. We shall see that, for the H → W W process, the situation is much more challenging, mainly because two missing neutrinos are many more than one.
The other crucial obstacles in the process we study are the large backgrounds with kinematics akin to those of the signal. The main and irreducible one is the direct non-resonant production of W pairs by q q̄ annihilation. The next most relevant one is t t̄ production, which also results in W pairs. For simplicity we shall illustrate our theoretical results only for the chain W → eν, W → µν, for which the "Drell-Yan" background is not a problem.
arXiv:1202.2552v1 [hep-ph] 12 Feb 2012
The analysis tools used to deal with the H → W W channel range from a simple "cut and count" approach to "matrix-element" techniques and avant-garde neural networks or "boosted decision trees" [1]. There is no question that in the long range the methods that input and utilize the largest amount of information are likely to be the most powerful ones. Whether this is also the case at an exploratory "Higgs-hunting" stage is more doubtful. Here we shall explore a "copy and paste" avenue of intermediate sophistication: the derivation of singularity variables (functions of the observable momenta and of M H) whose measured histograms are to be compared (in one or more dimensions) with pre-prepared templates.
For the production of a Higgs boson at a hadron collider, followed by the decays H → W⁺W⁻, W± → ℓ±ν, we shall limit our discussion to the distributions of various functions of the charged lepton three-momenta k and l. The treatment of the 2D transverse momentum of the final state hadrons, p T, deserves a separate paragraph, the next. The use of other observables, such as the number of jets, is beyond our scope.
Two practical problems are that k and l are measured to much higher precision than p T and that the formulae for the H → W W singularity variables are much more complex for p T ≠ 0 than for p T = 0. We deal with both problems by setting p T = 0 in our theoretical expressions. This is less cavalier than it seems. The transverse momentum of a Higgs boson in a given event is −p T. For a given ansatz M H value, its observed lepton momenta can be Lorentz boosted closer to the p T = 0 frame. The precise boost would require knowledge of the boson's longitudinal momentum, p 3. But since, typically, p_3^2 ≪ 2 M_H^2, it is a fair approximation to neglect p 3. More importantly, the singularity variables for p T = 0 are very useful even in the analysis of events with p T ≠ 0, even if one does not boost the events back closer to the p T = 0 frame, and even if one is also dealing with the quoted backgrounds, whose W pairs do not have a fixed invariant mass.
Let the lepton momenta be k ≡ {k T, k 3} and l ≡ {l T, l 3}. Because of the rotational symmetry around the beams' axis, the six-dimensional observable space {k, l} is in practice just five-dimensional. One possible choice of variables is the set (k + l)^2, the invariant mass squared of the lepton pair; k T, l T, the moduli of the transverse momenta; and k T · l T, or the familiar ∆ϕ = arccos[k T · l T /(k T l T)]. All four of these "transverse" observables are invariant under longitudinal boosts along the beams' axis. The remaining variable, for instance k 3 + l 3, is not.
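The invariance properties just stated can be checked numerically. The following is a minimal sketch, assuming massless charged leptons; the function names `boost_z` and `observables` and the sample momenta are illustrative choices, not taken from the text.

```python
import math

def boost_z(p, rapidity):
    """Boost a massless lepton three-momentum (px, py, pz) along the beam axis."""
    px, py, pz = p
    e = math.sqrt(px * px + py * py + pz * pz)  # massless lepton: E = |p|
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (px, py, ch * pz + sh * e)           # transverse components unchanged

def observables(k, l):
    """Return ((k+l)^2, k_T, l_T, delta_phi, k_3 + l_3) for two massless leptons."""
    ek = math.sqrt(sum(c * c for c in k))
    el = math.sqrt(sum(c * c for c in l))
    dot3 = sum(a * b for a, b in zip(k, l))
    m2 = 2.0 * (ek * el - dot3)                 # (k + l)^2 when k^2 = l^2 = 0
    kt = math.hypot(k[0], k[1])
    lt = math.hypot(l[0], l[1])
    cos_dphi = (k[0] * l[0] + k[1] * l[1]) / (kt * lt)
    dphi = math.acos(max(-1.0, min(1.0, cos_dphi)))
    return m2, kt, lt, dphi, k[2] + l[2]

# Illustrative momenta in GeV (not from any data set).
k = (30.0, 10.0, 25.0)
l = (-20.0, 15.0, -40.0)
before = observables(k, l)
after = observables(boost_z(k, 0.7), boost_z(l, 0.7))
```

Comparing `before` and `after`, the first four entries agree to rounding accuracy while the last one, k 3 + l 3, changes with the boost.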
We shall derive two types of singularity variables: those which do (or do not) depend only on transverse observables. Transverse variables are preferable, in that they are insensitive to the significant uncertainties associated with the (longitudinal) parton distribution functions (pdfs). In practice the uncertainties are to a modest extent reintroduced via the angular coverage limitations of an actual experiment, which are not invariant under longitudinal boosts. The histograms of singularity variables that are not longitudinally invariant do depend on the pdfs, but, particularly during a Higgs-hunting or initiatory epoch, this is not a serious limitation.
In the problem at hand, the quintessential function of transverse variables (in the sense of its ability to tell apart signal from backgrounds) is ∆ϕ, the angle between the charged leptons in the transverse plane. In Fig. 1, for comparison with our coming results, we recall this fact by showing the (arbitrarily normalized) shapes of signal and background distributions for two examples, with M H = 500 and 120 GeV. As is well known, the V-A nature of the weak current and the specific spin zero nature of the W pair in signal events favour collinear vs. anticollinear leptons. The effect weakens as M H increases and the leptons are boosted away from each other.
The simulations of Fig. 1, as well as all others in this note, were made with use of the PYTHIA6 event generator [6]. They are for the H → W W, W → eν, W → µν channel, with leptons of transverse momentum greater than 15 GeV, and satisfying the pseudorapidity cuts |η(e)| < 2.5, |η(µ)| < 2.1. Our goal is two-fold. First and foremost, to derive the complete set of phase-space singularity variables (functions analogous to ∆ϕ) for the process at hand. Second, to illustrate with examples their potential phenomenological usage. At least at low M H values, ∆ϕ is more heavily dependent on dynamics than on kinematics. The singularity variables we shall derive are the other way around.
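The lepton selection quoted above can be written as a short filter. This is a sketch under the assumption that the pseudorapidity cuts apply to the absolute value of η; the function names and default arguments are hypothetical and do not correspond to any PYTHIA6 interface.

```python
import math

def pseudorapidity(p):
    """Pseudorapidity of a three-momentum (px, py, pz): sinh(eta) = pz / pT."""
    px, py, pz = p
    return math.asinh(pz / math.hypot(px, py))

def passes_selection(p_e, p_mu, pt_min=15.0, eta_max_e=2.5, eta_max_mu=2.1):
    """True if both leptons have pT > pt_min (GeV) and lie within the |eta| limits."""
    for p, eta_max in ((p_e, eta_max_e), (p_mu, eta_max_mu)):
        if math.hypot(p[0], p[1]) <= pt_min:
            return False
        if abs(pseudorapidity(p)) >= eta_max:
            return False
    return True
```

Because the two η limits differ, the same forward momentum can pass as an electron and fail as a muon, which is one way the angular coverage reintroduces a mild dependence on longitudinal boosts.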
Individually, several of them are nearly "as good" as ∆ϕ in disentangling a signal from the backgrounds. The ensemble of their distributions, particularly in conjunction with ∆ϕ itself, should be a powerful and relatively simple tool to search for a Higgs boson, which, in the sense of signal kinematics, is guaranteed to be optimal.
Since we investigate a plethora of singularity variables, this paper is long and detailed. The reader mainly interested in results may be well advised to start reading it from the end: §XI and §XII.

II. OUTLINE
The simple example of single-W production is used in §III to clarify what singularity conditions and singularity variables are. After posing the formal problem in §IV we proceed in §V to solve it in the center of mass (CM) reference system of the Higgs boson. There are two reasons for this. First, it is a necessary intermediate step in the theoretical derivation, in §VI, of the general case with a boson which is not at rest. Second, the "approximation" of a heavy particle made at rest by gluon-gluon or qq fusion in a hadron collider is not so bad, since the quark and gluon pdfs are fast falling functions of their fractional momenta. Our results will reflect this fact.
We shall have to deal with the case M H < 2M W , in which at least one of the W s is off-shell; the relative probability for both of them having an invariant mass significantly different from M W is small. In §VII we explain the simple way in which we treat this case. Further details of our data analysis not already discussed in the introduction are given in §VIII.
We analyze MC-generated data in §IX in the theoretical approximation of a Higgs boson made at rest. To some extent, this section is a "warm-up" for the general results wherein we lift this approximation, discussed in §X, to which the reader interested in the most powerful results may prefer to jump.
A summary of results is given in §XI. Our data analysis is not as thorough as the theoretical one; it is only meant to illustrate our points. But it suffices to reach our conclusions, which, naturally, are drawn in the last section. A very formal but important step in our theoretical analysis is relegated to the Appendix.

III. SIMPLE SINGULARITY VARIABLES
Our main result is the theoretical derivation of the phase space singularity variables and singularity conditions for the process H → W⁺W⁻, W± → ℓ±ν. To understand these concepts it is easiest to recall a simpler problem: the analogous one for single-W production at a hadron collider, followed by the same leptonic decay. In this case, the singularity condition [5] is Σ T = 0, with: [...] (1). Of the four M-roots of Σ T = 0, one is not unphysical: [...] (2), which reduces to M T = 2 |l T| for p T = 0. The function in Eq. (2) is the habitual M T^2 originally derived in [7,8]. The result of Eq. (1) is obtained by projecting the full phase space (which includes the neutrino momentum) onto the observable phase space. The function Σ T is a singularity variable which, for a general non-singular event, is a measure of its distance to the nearest singularity at the singular Σ T = 0 border of the projected space. In Eq. (1) the mass of the W appears in two ways: the physical M W is imprinted in the data and also appears as an implicit "trial" mass M → ℳ in the equation. In the M T singularity variable of Eq. (2) M W is only reflected in the observables. In applying the phase-space singularity approach to our two-W problem, we shall encounter both types of singularity variables.
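As a numerical illustration of the transverse mass, the sketch below uses the textbook definition M_T^2 = 2(|l_T||ν_T| − l_T·ν_T). It assumes that the neutrino transverse momentum is ν_T = −(l_T + p_T), with p_T the hadronic transverse momentum; this is an assumption about how Eq. (2) is written, and the function name is hypothetical.

```python
import math

def transverse_mass(l_t, p_t=(0.0, 0.0)):
    """Textbook transverse mass for W -> l nu from the charged-lepton and
    hadronic transverse momenta (2D tuples, GeV)."""
    # Assumed transverse momentum balance: nu_T = -(l_T + p_T).
    nu_t = (-(l_t[0] + p_t[0]), -(l_t[1] + p_t[1]))
    l_abs = math.hypot(*l_t)
    nu_abs = math.hypot(*nu_t)
    mt2 = 2.0 * (l_abs * nu_abs - (l_t[0] * nu_t[0] + l_t[1] * nu_t[1]))
    return math.sqrt(max(mt2, 0.0))
```

For p T = 0 the neutrino is back to back with the lepton in the transverse plane and the expression collapses to 2 |l T|, as stated in the text; for example a lepton with l T = (30, 40) GeV gives 100 GeV.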
In the single-W case one can refine the result in the sense of finding the optimal singularity variable, that which would result in the most precise measurement of M W [5]. In the two-W case this is not worthwhile, as there are decay channels, such as H → γγ and H → ZZ, Z → ℓ⁺ℓ⁻, for which the mass is reconstructible.

IV. THE FORMAL PROBLEM
Back to the H → W⁺W⁻, W± → ℓ±ν process, let y and x, respectively, be the four-momenta of the neutrinos accompanying the charged leptons of four-momentum k and l. The full information relevant to the reconstruction of the boson's mass for a signal event is embedded in the kinematical equations: [...] (3), where we have made the approximation l^2 = k^2 = 0 for the charged leptons and (fleetingly in error for the W W* case) set the masses of the two W s to their central values. There are 9 unknowns (2 neutrino four-momenta and M H) and only 7 equations. In spite of this, is there a systematic way to extract the kinematically most stringent information on M H? This is the problem to face. Consider the 14D space of the components l, k of the three-momenta of the two (approximately massless) charged leptons and the four-momenta x, y of the two neutrinos. For a fixed M H, the seven Equations (3) define a 14 − 7 = 7D manifold, the phase space. This surface is to be projected onto the 6D hyper-plane of observable three-momenta. The points in the full phase space that project onto the boundary of the 6D space of observables are singular: at such points one or more of the invisible directions are contained in the tangent plane to the full phase space [4,5], and a tangent to a surface is singular in that it "touches it" at more than one single point.
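For orientation, one system of seven constraints that matches the counting above (massless neutrinos, two W mass shells, the Higgs mass shell and transverse momentum balance) can be written as follows. This is a sketch consistent with the description in the text, not a transcription of Eqs. (3); the ordering and the exact form of the individual equations are assumptions.

```latex
\begin{aligned}
x^2 &= 0, \qquad & y^2 &= 0,\\
(l+x)^2 &= M_W^2, \qquad & (k+y)^2 &= M_W^2,\\
(l+x+k+y)^2 &= M_H^2, &&\\
\vec l_T+\vec x_T+\vec k_T+\vec y_T &= -\,\vec p_T
  \quad\text{(two equations)}. &&
\end{aligned}
```

With the eight components of x and y plus M H as unknowns, this gives the 9 unknowns and 7 equations quoted in the text.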
The equation, Σ( l, k, M 2 W ; M 2 H ) = 0, describing the boundary of the projected phase space is a singularity condition. A general event (i.e. specific values of l and k) is non-singular and its corresponding value of Σ is, once again, a measure of distance to the Σ = 0 singularity. The shape of the distribution of the values of the singularity variable Σ is sensitive to the unknown mass M H in a manner that allows one to extract its true value, be it physical or Monte Carlo (MC) generated.
The formal modus operandi to obtain singularity variables is summarized in [4] and discussed in detail in [5]. We recalled that at a singularity one or more of the invisible directions are contained in the tangent plane to the full phase space. The general condition for this to happen is that, in the space {z} = {x, y} of invisible directions, the row vectors of the Jacobian matrix J ij ≡ ∂E i /∂z j (with the row index i running along the number of equations and the column index j over the number of invisible coordinates) be linearly dependent. In other words, at a singularity, the rank of J ij must be smaller than its rank at nonsingular points. There are 7 equations and 8 invisible directions in Eqs. (3). The vanishing of the Jacobian J ij (a 7 × 8 matrix) entails 8 conditions: the nullification of all 7 × 7 minors. Two of these minors coincide, up to their sign, with two others. Moreover the sums of two pairs of minors are of the forms D S 0, D S 3, with [...] (4). Given that S 0 > 0, one condition is D = 0, that is, the coplanarity of the four lepton four-momenta, equivalent to [...]. Introducing this into the 8 original minors, it is easy to see that they all vanish provided that [...]. The transposed Jacobian matrix, with use of E6 and E7 of Eqs. (3), is [...] (8), where the functional dependence of J on S 0,3 has been made explicit for later convenience.
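The rank condition is easiest to see at work in the single-W problem of §III. The sketch below assumes p T = 0 and the four constraints E1 = x^2, E2 = (l + x)^2 − M^2, E3 = x 1 + l 1, E4 = x 2 + l 2 on the neutrino four-momentum x; this choice of constraints and all function names are illustrative assumptions. The 4 × 4 Jacobian ∂E i /∂x j then has a determinant proportional to x 3 l 0 − x 0 l 3, which vanishes when lepton and neutrino have equal rapidity, precisely where the W mass equals 2 |l T|.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def jacobian(l, x):
    """dE_i/dx_j for E1 = x^2, E2 = (l+x)^2 - M^2, E3 = x1 + l1, E4 = x2 + l2,
    with metric (+,-,-,-) and four-vectors ordered as (E, px, py, pz)."""
    s = [a + b for a, b in zip(l, x)]
    return [
        [2 * x[0], -2 * x[1], -2 * x[2], -2 * x[3]],
        [2 * s[0], -2 * s[1], -2 * s[2], -2 * s[3]],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]

def massless(pt, eta):
    """Massless four-vector from a transverse momentum 2-tuple and a rapidity."""
    mod = math.hypot(*pt)
    return (mod * math.cosh(eta), pt[0], pt[1], mod * math.sinh(eta))

def pair_mass(l, x):
    s = [a + b for a, b in zip(l, x)]
    return math.sqrt(s[0] ** 2 - s[1] ** 2 - s[2] ** 2 - s[3] ** 2)

lepton = massless((30.0, 40.0), 0.5)             # |l_T| = 50 GeV, illustrative
nu_singular = massless((-30.0, -40.0), 0.5)      # same rapidity as the lepton
nu_generic = massless((-30.0, -40.0), 1.2)       # different rapidity
```

With these inputs the determinant vanishes for `nu_singular`, where the pair mass is 2 |l T| = 100 GeV, and is non-zero for `nu_generic`, where the pair mass is larger. The two-W case replaces the single determinant by the set of 7 × 7 minors discussed in the text.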
It turns out to be very useful to study the behaviour of the 7 × 7 minors of J under longitudinal Lorentz transformations, the boosts along the axis "3" of the proton beams. To proceed recall that, for the reasons stated in the Introduction, we are setting the hadron momenta p 1 = p 2 = 0. Next, parametrize an event in the usual Cabibbo-Maksymowicz manner [9], illustrated in our notation in Fig. 2. That is, consider the lepton momenta as if both W bosons were at rest, boost them by the antiparallel motion of the W s in the H rest system and finally boost the Higgs boson longitudinally left or right along the beams' axis: [...] (9), where n k, n l, n and n p are unit vectors, L(y, n) is a Lorentz boost along n with velocity β = tanh(y), y = arccosh(γ), γ = M H /(2 M W), and analogously for the longitudinal boost along n p, of rapidity y H.
Fig. 2: [...] [9]. The vectors n l and n k are the directions of the charged leptons (a µ⁺ and an e⁻ in this illustration) in the respective rest systems of their parent W s. The overall W W system, shown here at rest, is to be boosted along the direction n p of the gluon or q q̄ pair that fuse to produce the W pair, resonantly (for the H signal) or not (for the irreducible background).
Label m j, j = 1 to 8, the 7 × 7 minors of J in Eq. (8), lacking the row 9 − j of J. Under a longitudinal boost L(y H, n 3), they transform as m j → m̃ j, with: [...], where we used the definitions in Eqs. (4).
The conditions m j = 0, ∀ j imply that D = 0, and consequently that m̃ j = 0, ∀ j. Thus, we reach a crucial point: the general singular configurations can be obtained by boosts of the ones in the boson's rest system. This is one of the reasons why we pause to study this latter simpler case.
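The three-step construction described above (leptons in the W rest systems, antiparallel W boosts, longitudinal boost of the whole system) can be sketched as follows for on-shell W s. The value of M W, the function names and the example directions are illustrative assumptions; the code is not a transcription of Eq. (9).

```python
import math

M_W = 80.4  # GeV; illustrative value (assumption)

def boost(p, n, y):
    """Boost four-vector p = (E, px, py, pz) with rapidity y along unit vector n."""
    e, v = p[0], p[1:]
    par = sum(a * b for a, b in zip(v, n))       # component of momentum along n
    ch, sh = math.cosh(y), math.sinh(y)
    e_new = ch * e + sh * par
    par_new = ch * par + sh * e
    return (e_new,) + tuple(a + (par_new - par) * b for a, b in zip(v, n))

def minkowski2(p):
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def add(*ps):
    return tuple(sum(c) for c in zip(*ps))

def make_event(m_h, n_l, n_k, n, y_h):
    """Four-momenta (l, x, k, yv): leptons l, k and their neutrinos x, yv."""
    y = math.acosh(m_h / (2.0 * M_W))            # W rapidity in the H rest system
    half = 0.5 * M_W                             # lepton energy in the W rest system
    l_rest = (half,) + tuple(half * c for c in n_l)
    x_rest = (half,) + tuple(-half * c for c in n_l)
    k_rest = (half,) + tuple(half * c for c in n_k)
    y_rest = (half,) + tuple(-half * c for c in n_k)
    beam = (0.0, 0.0, 1.0)
    l, x = (boost(boost(p, n, +y), beam, y_h) for p in (l_rest, x_rest))
    k, yv = (boost(boost(p, n, -y), beam, y_h) for p in (k_rest, y_rest))
    return l, x, k, yv

l, x, k, yv = make_event(200.0, (1.0, 0.0, 0.0), (0.0, 0.6, 0.8), (0.6, 0.0, 0.8), 0.3)
total = add(l, x, k, yv)
```

By construction each lepton-neutrino pair has invariant mass M W, the four leptons together have invariant mass M H, and the total transverse momentum vanishes, in line with the p 1 = p 2 = 0 assumption of the text.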

V. LESSONS FROM A GLUON COLLIDER
A standard Higgs can be made in various ways, with top-mediated gluon fusion being the dominant mechanism up to very high M H values. The gluonic pdfs, as well as those of the other partons, are fast-falling functions of their fractional momentum. This implies that Higgs bosons are made with a narrow distribution of rapidities, centered at y H = 0. The same is true for the backgrounds to the H → W W channel, e.g. non-resonant pairs of relatively heavy objects, such as W -bosons, are also made with a moderate collective motion. Thus, the approximation of a monochromatic "gluon collider" (or qq collider) is a good starting point for our analysis.

A. Derivation of the singularity conditions
In the W W center of mass (CM) system an extra working condition is to be added to Eqs. (3): S 3 = 0, and the Jacobian is now J(S 0, 0), with J as in Eq. (8).
Since S 0 > 0 and the fifth column of J(S 0, 0) is proportional to S 0, it suffices to consider the vanishing of the eight 7 × 7 minors of J(1, 0), of which only four, e.g. m i, i = 1, ...4, are independent modulo D. In the CM system, let E = M H /2 denote the W's energy and P the corresponding momentum modulus. The four-momenta of the individual W s in the notation of Eq. (9) are: p W l = {E, +P n}, p W k = {E, −P n} (12), and it is convenient to put the neutrino's momenta in the form [...]. The conditions x^2 = y^2 = 0 now read [...]. Stepping back to Eq. (8) and introducing the explicit lepton four-momenta in the minors of J(1, 0) and in det(l, x, k, y), the vanishing of the results requires, in particular, that det(l, k, n) = 0, that is, the 3D coplanarity of l, k and n and, consequently, of all four lepton three-momenta. We may write [...]. Gathering results and imposing n · n = 1, one may express a, b, x and y as functions of l and k. Two families of CM critical configurations are obtained. They differ by the sign of δ in [...] and satisfy: [...]. Substituting these expressions into m i, i = 1, ...4 one finds that m 4 vanishes automatically. The other independent minors acquire the form [...] (17), where N and D are lengthy functions of k and l.
There are two alternative ways to satisfy m i = 0 ∀i. One of them is to let all three parentheses in Eqs. (17) vanish simultaneously, tantamount to imposing k ∝ l, a specific case of the condition to be obtained anon from the second alternative: N = 0. Eliminating the sign ambiguity of δ yields a first requirement for an event to be singular, C = 0, with [...] (18). The non-trivial vanishing of C 3 implicitly presupposes det(l, k, n) = 0. Up to non-vanishing overall factors, a second requirement for an event to be singular is this coplanarity condition, squared such as to eliminate the sign of δ: C 0 = 0, with [...] (19), where l·k has its customary Minkowskian meaning.

B. Questions of nomenclature
For a singular event the values of C in Eq. (18) and C 0 in Eq. (19) must both vanish. Given the form of C, there are three nontrivial ways for this to happen: C i = C 0 = 0, i = 1 to 3, which we call complete singularity conditions. Of these, only C 3 = C 0 = 0 guarantees that all minors of the Jacobian vanish. The other two conditions, C i = C 0 = 0, i = 1 to 2, are mock singularity conditions, in a sense occasionally used in mathematics, that is, they do not satisfy all wanted conditions, but are useful for one's purposes. As it turns out, even the four partial singularity conditions C j = 0, j = 0 to 3, are of interest.
We choose C 0 as the example to make our next linguistic points. Consider a real or MC-generated event due to the production and decay of a Higgs boson. Its corresponding value of the C 0 function in Eq. (19), a (partial) measure of distance to the C 0 = 0 singularity, depends on the Higgs boson mass in two distinct senses. The first is that k 0, l 0 and l·k are contingent on this "input" mass. The second is the explicit M H in the expression of C 0, which is a variable that one may, naturally, vary at will. To emphasize this point, we label this analyst's mass calligraphically: M H → ℳ.
It is convenient to rescale and rewrite C 0 as: [...] (20), where M ± are the non-zero roots of C 0 = 0 and M^2 M,E are the Minkowski and "Euclidean" masses of the (approximately massless) charged lepton pair. Notice that Σ 0 depends on the implicit variable ℳ, while its roots, M ±, do not. That is why we refer to Σ 0 and M ± with different symbols, even though their distributions are in all cases diagnostics of the value of the real or simulated Higgs boson mass (we reserve the nomenclature "M" for all singularity variables of the latter kind). Implicit masses become theoretically inevitable in cases for which, unlike for Σ 0, the ℳ roots cannot be made explicit.
Functions of an implicit mass ℳ, such as Σ 0, are also singularity variables. They vanish at singular points of phase space, iff the correct choice ℳ = M H has been made, with M H the physical or Monte Carlo "truth".

C. Partial and complete singularity conditions and variables
The singularity condition C = 0 of Eq. (18) can be satisfied in various ways. Two of them (M H = 0 and M H = 2 M W) are of little practical relevance. Two others correspond to the naïve-looking observables [...] (21,22). The remaining possibility is C 3 = 0 in Eq. (18), a cubic polynomial in the Higgs boson mass. In analogy with Eq. (20) for C 0, we introduce its roots: [...] (23), where the explicit forms of M̃ i are lengthy. It is not useless to rewrite Eqs. (20,22,23) in the form: [...] (24). This is because to construct true singularity variables that reflect a complete set of singularity conditions we must introduce a measure of the distance between a data point (given values of k and l) and one of the three center-of-mass singular manifolds: the points {0, 0} of the planes {C i = 0} ∩ {C 0 = 0}, i = 1 to 3. With the help of Eqs. (20,24) we define the following quantities with unit mass dimension: [...] (25). The functions D i (ℳ) are the full set of complete center-of-mass singularity variables for the case at hand.

D. From Algebra to Geometry
An advantage of the approximation in which Higgs bosons would be produced at rest is that the locus of the singular points in the observable { k, l} phase space can be visualized. Let c ϕ ≡ cos ∆ϕ = k T · l T /(k T l T ). The singular phase space is shown in Fig. 3 in the variables {k 0 , l 0 , c ϕ }, in an example wherein we have chosen M H = 2.5 in M W = 1 units. The closed surface in the three subfigures is the coplanarity condition C 0 = 0, see Eq. (19). The thick lines in the top and middle figure correspond to the singularity conditions C 0 = C 1 = 0 and C 0 = C 2 = 0, see Eqs (18). The last figure partly describes the singular phase space C 0 = C 3 = 0 for the choice k 3 +l 3 = 0; the complete space would be the direct product of this latter line in {k 3 , l 3 } space with the thick line in the figure.
The C 0 = C 3 = 0 singularity condition, as one varies k 3 + l 3 , covers all of the C 0 = 0 coloured surface of Fig. 3. This reflects the fact that the other two conditions are mock, and of zero measure relative to C 0 = C 3 = 0.

VI. BACK TO A HADRON COLLIDER
The derivation of a singularity variable for the more realistic case of an H boson of rapidity y H ≠ 0 is akin to that of the y H = 0 case, requiring only one extra step. Naturally, this is to start by applying the Lorentz boost L(y H, n p) to the W momenta of Eq. (12), to obtain: p W l = {cE + sP n 3, +P n 1, +P n 2, +cP n 3 + sE}, p W k = {cE − sP n 3, −P n 1, −P n 2, −cP n 3 + sE} (26), where c ≡ cosh(y H) and s ≡ sinh(y H). Following precisely the same steps as in §V A, one concludes that the partial singularity conditions are C i = 0, i = 1 to 4, with [...], where [...]. In C 2,3,0, whose expressions in terms of unprimed momenta are easily obtained but lengthy, we have only given the first and last term in their expansion in ξ, which are sufficient to specify their mass dimension and their grade as polynomials in ξ, two numbers that we shall need.
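A quick consistency check of the boosted W momenta: applying a longitudinal boost of rapidity y H to {E, ±P n} must leave each W on its mass shell and give a total four-momentum {M H cosh y H, 0, 0, M H sinh y H}. The sketch below assumes on-shell W s; the function name and the numerical inputs are hypothetical.

```python
import math

def boosted_w_momenta(m_h, m_w, n, y_h):
    """Longitudinal boost of the CM-frame W momenta (E, +P n) and (E, -P n)."""
    e = 0.5 * m_h                       # W energy in the H rest system
    p = math.sqrt(e * e - m_w * m_w)    # W momentum modulus (needs m_h >= 2 m_w)
    c, s = math.cosh(y_h), math.sinh(y_h)
    p_wl = (c * e + s * p * n[2], p * n[0], p * n[1], c * p * n[2] + s * e)
    p_wk = (c * e - s * p * n[2], -p * n[0], -p * n[1], -c * p * n[2] + s * e)
    return p_wl, p_wk

def mass2(q):
    return q[0] ** 2 - q[1] ** 2 - q[2] ** 2 - q[3] ** 2

p_wl, p_wk = boosted_w_momenta(200.0, 80.4, (0.6, 0.0, 0.8), 0.3)
p_total = tuple(a + b for a, b in zip(p_wl, p_wk))
```

The transverse components of the two W s cancel, while their energies and longitudinal momenta add up to those of a Higgs boson of rapidity y H.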
To obtain longitudinally boost-invariant results analogous to the ones in Eqs. (25) one must eliminate the unknown boost parameter ξ between the pairs of polynomials {C j (ξ), C 0 (ξ)}, j = 1 to 3. The first and simplest of these results, for j = 1, is the singularity condition ∆ 1 = 0, with [...] (29), where E = ℳ/2 is the energy of a W in the rest system of a Higgs boson of trial mass ℳ and P is the corresponding momentum. We have followed our convention to label M the singularity variables that do not depend on ℳ, and Σ (and now ∆) those which do. Notice that ∆ 1, by construction and demonstration, is a function of longitudinally boost-invariant observables.
The derivation of analytical results for the remaining polynomial pairs is not as simple as it was for {C 0, C 1}, C 1 being merely quadratic in ξ. The expressions for C 2,3,0 are polynomials in ξ of degrees 4, 12 and 8, respectively. The condition for two polynomials Σ_{i=0}^{n} a_i ξ^i and Σ_{j=0}^{m} b_j ξ^j to vanish simultaneously (to have a common root) is the vanishing of their resultant, which is a sum of products of powers of a i and b j. The resultant of C 1 = 0 and C 0 = 0 is the condition Res{C 1, C 0} ≡ ∆ 1 = 0, see Eq. (29). The number of terms of a resultant grows very rapidly with m × n: it is 95 for (m, n) = (2, 8) and 4970 for Res{C 2, C 0}, for which (m, n) = (4, 8). For this case, after considerable simplifications, the singularity condition is ∆ 2 = 0, with [...] (30). Notice that ∆ 2, as was the case for ∆ 1, is a function of only longitudinally boost-invariant observables.
What is the number of terms in Res{C 3, C 0}, for which the degrees of the polynomials in ξ are (m, n) = (12, 8)? For m, n larger than a small integer the resultant soon becomes obdurately complex. Not even the number of addends in the (monomial) products of its formal coefficients is known. Only upper bounds to that number are, to our knowledge, published. The tightest one is [10]: S(m, n) = F (m, n, m n/2 ) m + n n , where, for integer a, b, c, F satisfies the recurrence [...]. We shall not be discouraged by the mathematical hardship of constructing explicit algebraic resultants. In analogy with ∆ 1 in Eq. (29) and given the complexity of ∆ 2 in Eq. (30), we shall simply define: [...] (31), and find, event by event, the resultant numerically. The coefficients of the powers of ξ in C 2,3,0 being, for a given event, numbers as opposed to symbols, this is doable and, for the computer, trivial. The formal proof that the resultants in Eqs. (31) ought to be boost-invariant is given in the Appendix.
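A numerical resultant of two polynomials with known numerical coefficients can be obtained as the determinant of their Sylvester matrix. The sketch below is one standard way of doing this and is not necessarily the procedure used by the authors; the function names are illustrative, and coefficients are listed from the highest power of ξ to the lowest.

```python
def sylvester(a, b):
    """Sylvester matrix of polynomials with coefficient lists a, b
    (highest power first), of degrees m = len(a) - 1 and n = len(b) - 1."""
    m, n = len(a) - 1, len(b) - 1
    size = m + n
    rows = []
    for i in range(n):                   # n shifted copies of a
        rows.append([0.0] * i + [float(c) for c in a] + [0.0] * (size - m - 1 - i))
    for i in range(m):                   # m shifted copies of b
        rows.append([0.0] * i + [float(c) for c in b] + [0.0] * (size - n - 1 - i))
    return rows

def det_gauss(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    d = 1.0
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d                       # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, size):
            f = a[r][c] / a[c][c]
            for j in range(c, size):
                a[r][j] -= f * a[c][j]
    return d

def resultant(a, b):
    """Resultant of two polynomials; zero if and only if they share a root
    (for non-vanishing leading coefficients)."""
    return det_gauss(sylvester(a, b))
```

For example, `resultant([1, 0, -1], [1, -2])` (that is, ξ^2 − 1 and ξ − 2) is 3, while two polynomials with a common root, such as ξ^2 − 4ξ + 3 and ξ^2 + ξ − 2, give zero. For polynomials of degrees (12, 8) the matrix is only 20 × 20, so an event-by-event evaluation is inexpensive, although for high degrees the floating-point determinant should be interpreted with a tolerance.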

VII. DEALING WITH THE MH < 2MW CASE
In an H → W W* process followed by leptonic decays of both W s, there is no way to assign a mass, M *, to the W which is putatively off-shell, even for a fixed M H. Moreover, there is no deterministic way to decide which W was approximately on-shell. Finally, except close to the M H = 2M W threshold, the theoretical distribution of off-shell masses, dΓ/dM *, is very wide. To confront this situation we choose to analyze this case by assigning to both W s an adequately averaged squared mass: [...] (32). Since the non-observation of two neutrinos results in wide distributions for all observables, there is very little difference between using this prescription and other sensible ones, such as substituting the average M *^2 in Eq. (32) by its most probable value.
For M H > M W, up to a few W widths, Γ W, below the two-W threshold, and for the leading order standard-model matrix element for the H → W W* decay, the distribution of W* masses, in the excellent approximation of neglecting Γ W^2 /M W^2, is: [...]. The corresponding M *^2 distribution is shown in Fig. 4 for the relevant range of M H values. In this range and to a good approximation [...]

VIII. DETAILS OF OUR DATA ANALYSIS
We have derived singularity variables only for the signal process, not for its backgrounds, and we use the signal singularity variables to compare the distributions of MC-generated signals and backgrounds.
We present results only for the H → W W, W → eν, W → µν channel and its non-resonant W W and t t̄ backgrounds, with leptons of transverse momentum greater than 15 GeV, and satisfying the pseudorapidity cuts |η(e)| < 2.5, |η(µ)| < 2.1 [6].
Given the delicacies of measuring or simulating (at a "reconstruction level") the transverse momentum of hadrons, p T , we have not boosted each event to the approximate frame wherein the putative Higgs boson is transversally at rest. Our MC-simulations are for "generator level" events and do not have a p T = 0 requirement. No doubt this makes our results look somewhat weaker than they might otherwise be.
The ratios of signal to background yields are fastvarying functions of M H . The selections made by experimentalists on the way to focus on signal events are many and are also mass-dependent. These are reasons why we shall limit ourselves to illustrating only the different shapes (and not the absolute scales) of the signal and background histograms of various singular variables.
In discussing singularity variables such as Σ 0 (ℳ) of Eq. (20) or ∆ 1 (ℳ) of Eq. (29), it is informative to do it not only for the correct "guess" ℳ = M H, but also for incorrect ones. Naturally, the histograms for a fixed M H and various ℳ contain precisely the same statistical information. An experimentalist using an observable such as Σ 0 or ∆ 1 would deal with data (with M H not known a priori) armed with a plethora of "diagonal" MC templates with ℳ = M H, with which to compare the observed distributions.
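The comparison of an observed histogram with a set of "diagonal" templates can be sketched as a shape-only distance minimisation. The distance used below, a symmetric chi-square between normalised histograms, is just one simple choice and not the authors' procedure; the bin contents and the trial masses are toy numbers invented for illustration, not results.

```python
def normalise(hist):
    """Scale a list of bin contents to unit total, so that only shapes are compared."""
    total = float(sum(hist))
    return [v / total for v in hist]

def chi2_shape(observed, template):
    """Symmetric chi-square distance between two normalised histograms."""
    o, t = normalise(observed), normalise(template)
    return sum((a - b) ** 2 / (a + b) for a, b in zip(o, t) if a + b > 0)

def best_template(observed, templates):
    """Return the trial mass whose template is closest in shape to the data."""
    return min(templates, key=lambda mass: chi2_shape(observed, templates[mass]))

# Toy templates, keyed by hypothetical trial mass in GeV (illustrative only).
templates = {
    120: [40, 30, 15, 10, 5],
    140: [25, 30, 25, 15, 5],
    160: [10, 20, 30, 25, 15],
}
observed = [52, 58, 49, 31, 10]   # toy "data", closest in shape to the 140 template
```

Here `best_template(observed, templates)` selects the 140 GeV template. A realistic analysis would also account for backgrounds, statistical uncertainties and absolute normalisations, which this shape-only sketch ignores.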

IX. DATA ANALYSIS IN THE CM APPROXIMATION
In this section we sketch a numerical analysis of the partial and complete "Higgs at rest" singularity variables derived in §V. Recall that these theoretically-obtained expressions ignore both the longitudinal and transverse momentum of the Higgs-boson signal to be analyzed.

A. Partial singularity conditions
We start this part of the discussion with the singularity variable Σ 0 , a measure of distance of an event to the partial singularity condition of coplanarity: Σ 0 = 0. We chose M H = 500 and 120 GeV as examples of the "true" mass of the events in this first illustration.
The ability of the Σ 0 distribution to sieve apart signal and background shapes is illustrated in Fig. 6. Its left (right) columns are for M H = 500 (120) GeV, both with ℳ set to its corresponding correct value. The top (bottom) rows refer to the W W and t t̄ backgrounds. In all cases we have simulated equally many signal and background events, so that the figure reflects the shape of the distributions, not their relative weights. At M H = 120 GeV the shapes of signal and backgrounds are very different, while at M H = 500 GeV this is less so.
The conclusions on the ability to distinguish signal and backgrounds or different Higgs masses are, as we saw, very mass dependent. The rest of the questions to be discussed in this chapter are quite insensitive to M H . We shall study them only for the M H = 120 GeV example.
The quantity Σ 0 of Eq. (20) is real, but its roots, M ± , need not be. For input MC data corresponding to M H = 120 GeV, about 14% of the roots are a real pair, the rest being two conjugate complex numbers. The conclusion that the complex roots are useless would be most premature. A feature of these roots to be studied ab initio is the correlation between their absolute value and phase. This is done for the M H = 120 GeV signal and the W W and tt backgrounds in Fig. 7, where the mass axis is the absolute value of the roots (shown once for each complex root and for its two values for each real pair). The ϕ axis is the phase of the roots having a non-negative imaginary part. We see that the {ϕ, |M |} correlation is weak and the distributions are significantly different for signal and background.
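The bookkeeping described above (one entry per complex-conjugate pair, two entries per real pair, with the phase taken from the root of non-negative imaginary part) can be sketched as follows. The sketch assumes that Σ 0 (M) = 0 reduces, event by event, to a quadratic a M² + b M + c = 0 with real coefficients; the coefficients themselves would come from Eq. (20) and are treated here as hypothetical inputs, as is the tolerance `tol`.

```python
import cmath
import math

def quadratic_roots(a, b, c):
    """Both roots of a*M**2 + b*M + c = 0 for real a != 0, b, c."""
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)

def modulus_phase_entries(a, b, c, tol=1e-12):
    """Histogram entries (|M|, phi) for one event.

    Real pair: one entry per root, with phase 0 (or pi for a negative root).
    Complex pair: a single entry, from the root with positive imaginary part.
    """
    roots = quadratic_roots(a, b, c)
    if all(abs(r.imag) <= tol for r in roots):
        return [(abs(r.real), 0.0 if r.real >= 0.0 else math.pi) for r in roots]
    upper = next(r for r in roots if r.imag > 0.0)
    return [(abs(upper), cmath.phase(upper))]

complex_case = modulus_phase_entries(1.0, 0.0, 1.0)   # roots +i and -i
real_case = modulus_phase_entries(1.0, -3.0, 2.0)     # roots 2 and 1
```

An absolute tolerance on the imaginary part is the simplest choice; for coefficients of widely varying scale a relative criterion based on the discriminant may be more robust.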
The distributions of absolute values and phases of the roots of Σ 0 , that is the projections of the results of Fig. 3 onto the |M | and ϕ axes, are shown in Fig. 8. The results for signal and W W and tt backgrounds are significantly different. Notice in particular how the signal has a much higher fraction than the background of events with |M | and ϕ close to zero.
The variables M 1,2 of Eqs. (21,22) are akin to M ± in that they do not refer to an ansatz mass M. In spite of their naiveté, these observables, particularly M 2 , are quite good at telling signal from backgrounds. Their shapes for an M H = 120 GeV signal and the W W and tt backgrounds are shown in Fig. 9.
Because the variable Σ 3 of Eq. (23) has mass dimension 3, it is convenient to plot its sign-recalling cubic root of Eq. (24). This we do in the left column of Fig. 10 for an M H = 120 GeV signal and the W W and tt backgrounds. The signal and the illustrated backgrounds are seen to result in distributions with similar looks but significantly different details. In the right column of Fig. 10 we show the three roots of the cubic equation Σ 3 (M) = 0, see Eq. (23). The taller (yellow) histograms are the M H = 120 GeV signal; they are compared with those of the W W background (the tt distributions, not shown, differ a bit more than the W W ones from the signal distributions). The three roots M̃ i of Σ 3 , unlike the roots M ± of Σ 0 , are not so useful in telling signal from backgrounds.
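A sign-recalling root of the kind used for Σ 3 can be sketched as below. The function name `signed_root` is hypothetical, and the form sign(x)|x|^(1/n) is an assumption about what Eq. (24) defines, based on the description in the text.

```python
import math

def signed_root(x, n):
    """Sign-preserving n-th root: sign(x) * |x|**(1/n).

    Maps a quantity of mass dimension n to one of mass dimension 1
    while remembering whether the original value was positive or negative.
    """
    return math.copysign(abs(x) ** (1.0 / n), x)

positive_example = signed_root(27.0, 3)
negative_example = signed_root(-8.0, 3)
```

Taking the root of the absolute value avoids the complex result that a fractional power of a negative float would otherwise produce.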

B. Correlations between partial singularity variables
A question of practical interest is the extent to which the C 0 and C 1,2,3 distributions of Eqs. (20,18) are correlated, for a putative signal, and for the backgrounds. In the two other pairs, we plot results for {Σ 0 , Σ 1 } and {Σ 0 , Σ 2 }, see Eq. (24). All this we do in Fig. 11.
Two conclusions are to be extracted from the quoted figure, after noticing that its horizontal scales for signal and backgrounds are not always the same. The signal variable pairs are quite correlated, with the exception of {Σ 0 , Σ 1 }. The W W background is less correlated and its distribution is significantly different from that of the signal, for all variable pairs. These statements hold even more strongly for the tt background, which we have not shown.

C. Complete CM singularity conditions
To construct true singularity variables that reflect a complete set of singularity conditions we must exploit a measure of the distance between a data point (its values of k and l ) and one of the three center-of-mass singular- are shown in Fig. 12. The three choices appear to be comparably efficient at telling signal from backgrounds. The {D i , D j } correlations are illustrated in Fig. 13 for a signal with M H = 120 GeV and for the W W background, all with M = M H . Only the {D 2 , D 3 } correlation is strong. In all cases the signal and background results are fairly distinct. This is even more so for the tt background, results for which we do not show.

X. DATA ANALYSIS BEYOND THE CM APPROXIMATION
We have derived three longitudinally boost-invariant singularity variables. The first of them, ∆ 1 , is algebraically simple and factorizable as ∆ 1 ∝ M 2 Σ, see Eq. (29). The mass dimension of C 1 is 1 and its degree in ξ is 2. The corresponding numbers for C 0 are 6 and 8, see Eqs. (27). The mass dimension of their resultant is 1 × 8 + 6 × 2 = 20. The mass dimensions of M and Σ are 4 and 12, respectively. It is therefore convenient to discuss the results in terms of M 1/4 , Σ 1/12 and $\tilde\Delta_1 \equiv \Delta_1^{1/20}$, where the root is always real, since ∆ 1 is always positive.
For the quoted variables we show in Fig. 14 [...] We conclude that the singularity variable ∆ 1 and its factors are strong boost-invariant tools to tell signal from backgrounds, but are not very stringent in constraining the value of M H .
The mass dimension of C 2 is 3 and its degree in ξ is 4. The corresponding numbers for C 0 are 6 and 8, see Eq. (27). Thus, the mass dimension of ∆ 2 ≡ Res{C 2 , C 0 } is 3 × 8 + 4 × 6 = 48. In analogy with Eqs. (24), it is convenient to define $\tilde\Delta_2 \equiv \mathrm{sign}(\Delta_2)\,|\Delta_2|^{1/48}$. Results for the distributions of this singularity variable are presented in Figs. 17, 18 and commented on later.
The mass dimension of C 3 is 7 and its degree in ξ is 12. Recall that the corresponding numbers for C 0 are 6 and 8. The mass dimension of ∆ 3 ≡ Res{C 3 , C 0 } is 7 × 8 + 12 × 6 = 128. In analogy with Eq. (36), it is thus convenient to define $\tilde\Delta_3 \equiv \mathrm{sign}(\Delta_3)\,|\Delta_3|^{1/128}$. Results for the distributions of this singularity variable are presented in Figs. 17, 18.
The message of these figures is that the variables ∆̃ 1,2,3 are very good at distinguishing signal and background events, and that ∆̃ 2,3 are also very good at pinpointing the mass of a putative signal.
An interesting feature emerges when some of the histograms in Fig. 18 are remade with higher statistics and resolution, concerning the singularity functions ∆̃ 2,3 , but not ∆̃ 1 . This is shown in Fig. 19. A very clear narrow double-peak shape appears for ∆̃ 3 (lower figure), and a hint of a similar structure for ∆̃ 2 (upper figure).
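The dimension counting used for the three resultants can be checked with a few lines. The sketch assumes that ξ is dimensionless and that all coefficients of a given polynomial share one mass dimension, so that the resultant is homogeneous of degree deg(C 0 ) in the coefficients of C i and of degree deg(C i ) in those of C 0 . The function and variable names are hypothetical.

```python
def resultant_mass_dimension(dim_a, deg_a, dim_b, deg_b):
    """Mass dimension of Res{A, B} for polynomials in a dimensionless variable.

    The resultant has degree deg_b in the coefficients of A and degree deg_a
    in the coefficients of B, hence the weighted sum below.
    """
    return dim_a * deg_b + dim_b * deg_a

# (mass dimension, degree in xi) as quoted in the text
C0 = (6, 8)
C = {1: (1, 2), 2: (3, 4), 3: (7, 12)}

dims = {i: resultant_mass_dimension(*C[i], *C0) for i in C}
# dims == {1: 20, 2: 48, 3: 128}
```

These are the exponents whose inverses appear in the roots that bring ∆ 1,2,3 to unit mass dimension.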
The peaks in Fig. 19 [...] has an even more intractable degree in M 2 : twenty-two. Thus, their roots can only be extracted numerically event by event, a rather laborious task, which we postpone.

A. Correlations
The longitudinally boost-invariant singularity variables ∆ 1,2,3 have correlations similar to the ones between D 1,2,3 that we showed in Fig. 13. They are shown in Fig. 20, for an M H = 120 GeV signal. Once again, there are significant but not extreme correlations. Moreover, the correlated histograms are quite different for the signal and W W background. This is even more so for the tt background, which we do not show.
In Fig. 21, we illustrate the correlations between ∆̃ 1,2,3 and ∆ϕ for the signal and the W W background; the tt background is, in this sense, again similar to the latter. For the relatively light M H = 120 GeV signal shown in the figure, as expected, signal and background densely populate very different regions of phase space.

XI. SUMMARY OF RESULTS
With an eye on potential practical usefulness, let us call "good" the singularity variables that do an efficient job at focusing on the correct value of M H and, more so, the ones that produce the most significant difference in [...] Eqs. (22,24), is not good. The variable Σ 2 (Σ 3 ), also defined in Eq. (24), is good (not so good), as one can conclude from Figs. 9, 10. These limitations are lifted as we construct from these variables the quantities D i defined in Eq. (25), which are histogrammed in Fig. 12: they are reasonably good at telling signal from backgrounds. Their correlation plots, shown in Fig. 13, are quite dissimilar for the signal and the W W irreducible background.
The longitudinally boost-invariant analogs of D i , i = 1 to 3, are the singularity variables ∆ 1 of Eq. (29) and ∆ 2,3 of Eq. (31). We have redefined them to have unit [...] Fig. 18 with the corresponding result for the CM variables D i , Fig. 12, shows, once more, that the ∆̃ i are demonstrably better.
The confrontation of the results for the boost-invariant variables, ∆ i , and their siblings, D i , obtained in the approximation in which the boson is at rest is very gratifying. The signal peaks are significantly narrower and the correlations weaker for the ∆s than for the Ds.
In the sense of their correlations with the function ∆ϕ, shown in Fig. 21, the singularity variables ∆̃ i are optimal tools to separate signal from backgrounds.

XII. CONCLUSIONS
Recall that, as discussed in the Introduction, in the case of the CM singularity variables one can construct up to five independent combinations of the relevant observables. The best choice is the set {M + , M − , D 1 , D 2 , D 3 }, whose ingredients are defined in Eqs. (20,25). The main appeal of these CM variables is that they are simple and explicit functions of the relevant observables. Their main drawback is that they are not as good as the boost-invariant variables, to be revisited next.
The only imperfection of the boost-invariant singularity variables is that for one of them, ∆ 3 , we are unable to derive its explicit analytical expression. For ∆ 2 the analytical expression, Eq. (30), is so complex that we have opted to compute it event by event as a numerical resultant, as we are forced to do in the case of ∆ 3 . For the computer, this is the fastest and simplest option.
The practical virtues of the variables ∆ i , in being able to pinpoint the actual value of M H and to tell apart signal from backgrounds, amply overcome their quoted single limitation. The theoretical toil required to go beyond the Higgs-at-rest approximation pays off.
The bell shapes of the signal histograms in Fig. 18 are very satisfactory, even if obtained with theoretical expressions in the p T = 0 approximation for the produced hadrons. We have checked, by generating and analyzing events with p T ≠ 0, that the improvement brought by theoretical variables that avoid the p T = 0 approximation is unlikely to be very significant.
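One common way to obtain a resultant numerically is as the determinant of the Sylvester matrix of the two polynomials. The sketch below illustrates that technique with NumPy; it is not claimed to be the procedure actually used for ∆ 2,3 , and the function names and the toy polynomials are hypothetical.

```python
import numpy as np

def sylvester_matrix(p, q):
    """Sylvester matrix of two polynomials.

    `p` and `q` are coefficient sequences, highest power first, with
    nonzero leading coefficients. The matrix is (m+n) x (m+n), where
    m = deg p and n = deg q.
    """
    m, n = len(p) - 1, len(q) - 1
    S = np.zeros((m + n, m + n))
    for i in range(n):              # n shifted copies of p
        S[i, i:i + m + 1] = p
    for j in range(m):              # m shifted copies of q
        S[n + j, j:j + n + 1] = q
    return S

def resultant(p, q):
    """Resultant of p and q as the determinant of their Sylvester matrix."""
    return float(np.linalg.det(sylvester_matrix(p, q)))

# Toy checks: x^2 - 1 and x - 2 share no root; the second pair shares x = 1.
no_common_root = resultant([1.0, 0.0, -1.0], [1.0, -2.0])
common_root = resultant([1.0, -3.0, 2.0], [1.0, -4.0, 3.0])
```

The resultant vanishes exactly when the two polynomials share a root, which is what makes it a measure of distance to a singularity condition. For resultants of very high mass dimension the determinant can span many orders of magnitude, so working with `np.linalg.slogdet` (sign and logarithm of the determinant) may be numerically safer.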
We have only studied variables and their pair-wise correlations. We have not attempted to quantify the absolute values of signals and backgrounds, as opposed to just the shape of their distributions. Thus, we are far from being able to show potential "significance" results in terms of a full multi-dimensional analysis of all variables and their correlations. Yet our results for ∆̃ 1,2,3 in Fig. 18 are competitive in "goodness" with the ∆ϕ diagnosis recalled in Fig. 1. That was one of our goals. For an experimentalist eager to test the tantalizing hints that M H = 126 or 124 GeV [1], it should not be too strenuous to prepare the relevant one- or multi-dimensional singularity-variable templates for the relatively copious channel H → W W → leptons.
Our main aim was the theoretical derivation of a complete set of phase-space singularity conditions and variables for the process H → W + W − , W ± → ℓ ± ν. We have seen it is a rather laborious task. The origin of its difficulty is manifold. First, because of the elusiveness of neutrinos, the kinematical constraints of Eqs. (3) are incomplete. Second, several of these equations are nonlinear. Finally and most severely, the 7-th equation, the one reflecting that the invariant mass of the four leptons is M H , inextricably links the leptons resulting from the decay of one W to those from the other, very significantly complicating the ensuing algebra.
In most processes relevant to a hadron-collider search for new physics involving unobservable particles, the initial step is a non-resonant production of a pair of novel particles. This means one cannot assume a fixed invariant mass for the pair and (approximately) boost each event to the pair's rest system. But the last difficulty mentioned in the previous paragraph is absent. That is why, even for p T ≠ 0 and a surfeit of unknown masses, the pertinent singularity variables are relatively simple, and analytical [11].
The desired "Q.E.D." is simply reached by putting together Eqs. (43,45). It is also simple and gratifying, in the case of the ∆ 3 resultant that we were unable to derive explicitly, to check its boost invariance numerically.
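A numerical boost-invariance check of this kind can be sketched as follows: boost the charged-lepton four-momenta along the collider axis by several velocities and verify that the observable does not change. The test observables below (a rapidity difference and an energy sum) are simple stand-ins chosen for illustration, not the ∆ 3 resultant itself, and all function names and sample momenta are hypothetical.

```python
import math

def boost_z(p, beta):
    """Boost a four-vector p = (E, px, py, pz) along z with velocity beta."""
    E, px, py, pz = p
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (E - beta * pz), px, py, gamma * (pz - beta * E))

def rapidity(p):
    """Rapidity along z; shifts by a constant under longitudinal boosts."""
    E, _, _, pz = p
    return 0.5 * math.log((E + pz) / (E - pz))

def is_boost_invariant(func, leptons, betas=(-0.9, -0.3, 0.5, 0.8), rtol=1e-9):
    """True if func(*leptons) is unchanged under all the trial z-boosts."""
    reference = func(*leptons)
    for beta in betas:
        boosted = [boost_z(p, beta) for p in leptons]
        if not math.isclose(func(*boosted), reference, rel_tol=rtol, abs_tol=1e-12):
            return False
    return True

def massless(px, py, pz):
    """Four-vector of a massless particle with the given three-momentum."""
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)

def rapidity_difference(l1, l2):
    return rapidity(l1) - rapidity(l2)

def energy_sum(l1, l2):
    return l1[0] + l2[0]

leptons = [massless(30.0, 10.0, 20.0), massless(-25.0, 15.0, -40.0)]
invariant_ok = is_boost_invariant(rapidity_difference, leptons)
not_invariant = is_boost_invariant(energy_sum, leptons)
```

The same `is_boost_invariant` helper could be applied to any function of the lepton momenta, including a numerically computed resultant, provided the tolerance is adapted to the numerical precision of that computation.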