Enhancing the discovery prospects for SUSY-like decays with a novel kinematic variable

The lack of a new physics signal thus far at the Large Hadron Collider motivates us to consider how to look for challenging final states, with large Standard Model backgrounds and subtle kinematic features, such as cascade decays with compressed spectra. Adopting a benchmark SUSY-like decay topology with a four-body final state proceeding through a sequence of two-body decays via intermediate resonances, we focus our attention on the kinematic variable $\Delta_{4}$ that delineates the boundary of four-body phase space. We highlight the advantages of using $\Delta_{4}$ as a discovery variable, and present an analysis suggesting that the pairing of $\Delta_{4}$ with another invariant mass variable leads to a significant improvement over more conventional variable choices and techniques.


Introduction
The possible existence of particles beyond the Standard Model (SM) at the TeV scale is theoretically motivated both by naturalness considerations for the electroweak scale [1], and by the so-called WIMP (weakly interacting massive particle) miracle for obtaining the correct dark matter relic abundance [2]. Nevertheless, as we approach the end of Run II of the Large Hadron Collider (LHC), we have as yet no conclusive evidence of new particles beyond the SM (BSM) [3]. This requires us to pause, rethink and perhaps re-optimize our search strategies, in preparation for what may lie ahead. In particular, we should be mindful of the following challenges:
• The signal may be buried under a large SM background. One obvious possibility for why partner particles may so far have evaded detection is that they are simply too heavy and therefore have small production cross sections. If that is the case, then discovery could be waiting around the corner, provided that the signatures of the new particles are distinctive. For instance, significant mass gaps in the spectrum of the new particles will result in high-$p_T$ leptons and jets in the final state and a sizable missing transverse energy, $\not\!\!E_T$. Therefore, while the signal cross section may be low, signal over background can still be large, and reaching discovery sensitivity will simply be a question of collecting sufficient statistics. This scenario is rather uninteresting to us. Instead, in this paper we focus on the alternative that the new particles are being produced in sizable numbers, but their signatures are plagued by large SM backgrounds, so that the key question is whether we can identify the selection criteria with the best potential for discriminating against the background. This attitude is supported by the flurry of theoretical activity in recent years in designing models which "hide" the new physics from the LHC.
One of the standard methods for doing so is to arrange for a "compressed" mass spectrum with a mass degeneracy of the relevant particles, such as supersymmetric (SUSY) partners, so that the resulting decay products are too soft to be triggered upon and tagged in the experimental analysis [4][5][6][7][8][9][10][11][12][13], or a "stealth" mass spectrum, where the new physics signature becomes identical to the SM background, since the additional particles are too soft to make any appreciable difference [14][15][16][17][18][19][20]. Our aim will be to highlight a kinematic variable that, either by itself or in conjunction with more conventional variables, can more effectively select signal over background when the signal spectrum is compressed and when signal events contain multi-stage cascade decays.
• Exclusive searches may be reducing the signal statistics to unobservable levels. When searching for new physics, one has to find the right balance between inclusive and exclusive searches. Inclusive searches are more robust since they have fewer theoretical assumptions about the event topology and have a higher signal efficiency. On the flip side, they tend to suffer from larger SM backgrounds. In contrast, exclusive searches have the potential to reach higher sensitivity when the correct assumptions are made about the features of signal events, since those features can then be used to reduce backgrounds, but at the cost of relying on the assumptions about event topology that may prove to be incorrect.
In our study we will remain much more inclusive than experimental searches that model the topology of the entire event. We will only operate on the assumption that the event contains (at least) one SUSY-like cascade decay proceeding through a sequence of two-body decays and with an invisible particle at the end of the decay chain. We will make no assumptions about whether the particle at the beginning of the cascade is singly or pair-produced, and if the latter, what the "other side" of the event looks like. Because of this, we will not make direct use of $\not\!\!E_T$, or any other transverse variables. Adopting a benchmark final state with three visible particles and one invisible particle [see Fig. 1(d)], we will focus our attention on fully Lorentz-invariant kinematic variables.
• Uncertainties in background modeling. A required component of any new physics search is the prediction of the expected SM background. Depending on the final state, this may turn out to be a difficult task, plagued by large systematics. Ideally one would like to use data-driven background estimates, and not rely on theoretical input or Monte Carlo. The classic technique for such searches is the "bump hunting" method with sideband subtraction. Fig. 1(a-c) shows examples of simpler decay chains for which this method is easily applied. Fig. 1(a) depicts a visibly decaying resonance, here to two visible particles $v_1$ and $v_2$. In this case, the relevant kinematic variable is the invariant mass $m_{v_1 v_2}$ of the decay products; it exhibits a Breit-Wigner peak at the mass $m_{X_1}$ of the new resonance. Since the $m_{v_1 v_2}$ distribution for the SM background is expected to be smooth, one can interpolate from the sidebands and obtain a reliable prediction for the background under the peak. This tried-and-true method has been used successfully many times in the past, including most recently for the discovery of the Higgs boson in the diphoton channel [21,22].

Figure 1. Here $v_1$, $v_2$ and $v_3$ are SM particles which are reconstructed in the detector (either directly, or through their respective visible decay products), while $\chi$ is a potential dark matter candidate which is invisible in the detector. $X_2$ and $X_3$ are additional BSM particles.
However, the method runs into a complication if one of the final state particles is invisible in the detector, e.g. particle $\chi$ in Fig. 1(b). Nevertheless, the procedure still goes through, only this time one has to use a suitable kinematic variable which retains the "bump" feature for the signal, namely the transverse invariant mass $m_{T,X_1}$ [23][24][25]. The downside of the transverse mass variable $m_T$ (and the related mass variables $m_{T2}$ [26], $m_2$ [27][28][29], etc.) is that its definition uses the $\not\!\!E_T$ measurement, which forces a departure from inclusivity and also inherits the systematics of all detector effects that enter $\not\!\!E_T$. For decay chains containing more than one visible particle, one can remain more inclusive by working only with Lorentz-invariant variables constructed from the momenta of these particles. For the two-stage decay chain in Fig. 1(c), the only such kinematic variable is the invariant mass $m_{v_1 v_2}$, whose distribution does have a distinctive feature [30]. While these cases have all been studied in great detail in the past, there has not been a comparable effort to design optimized variables for a longer decay chain, such as in Fig. 1(d). We will therefore adopt this decay topology as our benchmark in this paper. Our main goal will be to identify and study a kinematic variable for this decay topology that is robust to a certain amount of uncertainty in the modeling of the relevant backgrounds.
Based on the arguments above, an obvious choice of kinematic variables to consider is the set of pair-wise$^1$ invariant masses of the visible decay products, $m_{v_1 v_2}$, $m_{v_2 v_3}$, and $m_{v_1 v_3}$, or some combination of those. For plotting convenience, in what follows we shall actually use the squares of those variables and denote them as
$$ m^2_{12} \equiv m^2_{v_1 v_2}, \qquad m^2_{23} \equiv m^2_{v_2 v_3}, \qquad m^2_{13} \equiv m^2_{v_1 v_3}. \qquad (1.1) $$
The variables (1.1) are in principle good candidates for the analysis, not only because they are Lorentz invariant, but also because their distributions exhibit interesting kinematic features (edges and endpoints) which are traditionally used for determining the masses of the new particles $X_1$, $X_2$, $X_3$ and $\chi$ [31][32][33][34][35][36][37][38][39][40]. However, as discussed in refs. [41][42][43][44][45][46], the multidimensional phase space $(m^2_{12}, m^2_{23}, m^2_{13})$ in this case in fact contains more information than is captured by edge-and-endpoint variables alone. As we will describe in more detail in section 2, the vicinity of the endpoints corresponds only to a fraction of the full boundary of the kinematically available phase space. This boundary is defined via the condition $\Delta_4 = 0$, where the variable $\Delta_4$ will be introduced and defined in section 2 below. For now we simply remark that the location of this boundary contains the complete information about the spectrum in the cascade decay [41,42]. A determination of this boundary (using Voronoi tessellations [47,48]) has already been shown to result in an improvement in the measurement of the new physics mass spectrum [45]. More importantly, the phase space volume element has an enhancement near the boundary, even in the case of a compressed spectrum [43]. This suggests that $\Delta_4$ may be an effective discovery variable, especially in difficult scenarios of compressed spectra. The main goal of this paper will be to investigate the suitability of $\Delta_4$ as an analysis variable, either on its own, or when paired with the edge-and-endpoint variables.
In order to demonstrate the basic idea, we adopt a specific realization of our benchmark decay topology from Fig. 1(d), by specifying a final state on which we will base our analysis (see Fig. 2). In particular, we will take $X_1$ and $X_3$ to be charged particles, while $X_2$ and $\chi$ are neutral. We also take the neutral particles to be flavor singlets. The SM particles produced in the second and third stages of the cascade are therefore oppositely charged, and have the same flavor, whereas the charge and flavor assignments of the SM particle produced in the first stage of the cascade are uncorrelated with the other two. Furthermore, in order to concentrate on what can be achieved using phase space techniques for discovery, we will aim to minimize possible complications due to challenging collider objects, so we choose the visible particles to be leptons. It is worth reiterating that our choice of final state is simply a choice of convenience in order to demonstrate the applicability of our methods. The methods can equally be applied to photons, jets or even unstable SM particles with fully visible decays (such as visibly decaying Z-bosons), at the potential cost of worse detector energy resolution and combinatorics. Our analysis will take into account the effect of finite energy resolution for leptons, as well as the combinatoric ambiguity about which lepton is emitted at the various decay stages. In particular, there will not in general be a way to distinguish which of the same-flavor, opposite-charge leptons is emitted higher upstream in the cascade. On the other hand, the lepton emitted in the first stage of the cascade can be distinguished by demanding that it carry a flavor different from that of the same-flavor, opposite-charge lepton pair.

$^1$ The invariant mass variable $m_{v_1 v_2 v_3}$ of all three visible particles is not an independent quantity, since $m^2_{v_1 v_2 v_3} = m^2_{12} + m^2_{23} + m^2_{13}$ for massless visible particles.

Figure 2. The specific realization of the event topology from Fig. 1(d) which will be studied in this paper. Here $\ell^\pm$ and $\ell^\mp$ are a pair of opposite-sign, same-flavor leptons, while $\ell'$ is a third lepton of a different flavor.
Since we aim to focus on improving signal selection in the case of compressed spectra, we adopt the following benchmark spectrum: $m_{X_1} = 390$ GeV, $m_{X_2} = 360$ GeV, $m_{X_3} = 330$ GeV and $m_\chi = 300$ GeV. Note that the choice of spectrum is mainly intended to demonstrate how well the kinematic variables in question compare to one another. Our conclusions would not be affected by raising all masses in the spectrum (while preserving the mass gaps), if we wanted to assign additional significance to this mass benchmark and avoid existing exclusion constraints for various potential underlying models, such as supersymmetry.
The outline of this paper is as follows: In the next section we will review the theoretical aspects of multidimensional phase space and formally introduce the Δ 4 variable. In section 3, we will then perform a preliminary study with simplified assumptions to outline the salient features of Δ 4 as a discovery variable. In section 4 we will address a subtlety about the use of a hypothesis spectrum in order to calculate Δ 4 . Once this is done, we will then perform a realistic study of the performance of Δ 4 as a discovery variable in section 5. We conclude in section 6.

Mathematical description of four-body phase space
Let us start by introducing a manifestly Lorentz-invariant parametrization of the phase space for the cascade decay of our benchmark decay topology. Using the formalism of ref. [49],$^5$ we introduce the matrix
$$ Z \equiv \{ p_i \cdot p_j \}, \qquad i, j = 1, \ldots, 4, \qquad (2.1) $$
where the $\{p_i\}$ are the four-momenta of the final state particles $\ell'$, $\ell^\pm$, $\ell^\mp$ and $\chi$. The variables $\Delta_i$ can then be defined as
$$ \Delta_i \equiv (-1)^{i+1} \det [Z_i], \qquad (2.2) $$
where $Z_i$ denotes the upper-left $i \times i$ block of $Z$, so that in particular $\Delta_4 = -\det[Z]$. Among these variables, $\Delta_4$ will play a special role in the rest of this paper. As described in ref. [49], the kinematically allowed region is given by $\Delta_{1,2,3,4} > 0$, with the boundary located at$^6$
$$ \Delta_4 = 0. \qquad (2.3) $$
With the requirement that all $m^2_{ij} \geq 0$, outside of the kinematically allowed region the values of $\Delta_4$ are negative and become arbitrarily large in magnitude as one moves towards infinity.
The general four-body phase space volume element$^7$ is, up to normalization,
$$ d\Pi_4 \propto \Delta_4^{-1/2} \prod_{i<j} dm^2_{ij}, \qquad (2.4) $$
which causes an enhancement near the boundary $\Delta_4 = 0$.
Of course, the physically observable quantities depend not only on $d\Pi_4$ but also on $|\mathcal{M}|^2$, the quantum mechanical matrix element squared for the decay:
$$ d\Gamma \propto |\mathcal{M}|^2 \, d\Pi_4. \qquad (2.5) $$
In particular, for the benchmark decay topology of Fig. 2, the volume element will be combined with the squares of the internal propagators in the cascade, which in the narrow width approximation are given as delta functions with arguments linear in the $m^2_{ij}$ and can therefore be used to perform some of the $m^2_{ij}$ integrals. As a result, the events fill out a three-dimensional phase space that can conveniently be fully parameterized in terms of the observables $m^2_{12}$, $m^2_{13}$ and $m^2_{23}$. The enhancement in the phase space volume element near the boundary should make it clear why it is promising to consider $\Delta_4$ as a discovery variable. The prominent features in the edge-and-endpoint variable distributions happen at the extremes of linear slicings of the three-dimensional phase space, and therefore only a small fraction of signal events contribute to these features. In contrast, the prominent feature in the $\Delta_4$ distribution at $\Delta_4 = 0$ captures the full boundary of phase space, where the density of signal events is enhanced, so it is reasonable to expect that by selecting events near $\Delta_4 = 0$, one could significantly enhance signal over background.

$^5$ For an alternative derivation, the curious reader is invited to follow Exercise 11 on page 574 in [42].
$^6$ Alternative equivalent parametrizations of this kinematic boundary were previously derived in [41,42,44]. However, those results were not used to study the interior of the kinematically allowed phase space, as we will be doing here.
$^7$ This is the general formula. For our analysis, while $m_\chi > 0$, we will take the leptons to be massless.
It is worth remarking that the phase space for any known SM background process does not develop a singular structure like the one described in eq. (2.4). Furthermore, there is no reason to expect the |M| 2 factor for the background to have any sharp features over the kinematically accessible signal region (the location of which depends on the signal spectrum). In particular, for a compressed signal spectrum which results in a relatively small signal region, the variation of the background matrix element over this region will in all likelihood be mild.
Note that for a given event, ∆ 4 cannot be calculated from the observable data alone. As can be seen from eq. (2.2), ∆ 4 is equal to −det [Z], and the last column and row of Z contain the four momentum of the lightest supersymmetric particle (LSP) χ, which is unobservable. However, if one starts with a hypothesis for the spectrum {m X 1 , m X 2 , m X 3 , m χ }, the onshell constraints allow one to solve for all entries of Z, and thus a mass hypothesis dependent value of ∆ 4 can be calculated. The obvious question to ask then is whether this requirement for a spectrum hypothesis significantly weakens the usefulness of the ∆ 4 variable. We will take up this question in section 4, drawing the conclusion that ∆ 4 is a powerful variable despite this caveat.
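The hypothesis-dependent calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that particle 1 is emitted in the first stage of the cascade and particles 2 and 3 in the second and third stages, that $Z$ is the matrix of Minkowski products $p_i \cdot p_j$, and that the on-shell conditions $(p_3+p_\chi)^2 = m_{X_3}^2$, $(p_2+p_3+p_\chi)^2 = m_{X_2}^2$ and $(p_1+p_2+p_3+p_\chi)^2 = m_{X_1}^2$ are used to fix the unobserved products $p_i \cdot p_\chi$. The function names `mdot`, `det` and `delta4` are hypothetical.

```python
def mdot(p, q):
    """Minkowski product with metric (+,-,-,-); vectors are (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def delta4(p1, p2, p3, masses):
    """Delta_4 = -det[Z] for visible momenta p1, p2, p3 and a mass hypothesis
    masses = (m_X1, m_X2, m_X3, m_chi), all in the same units (e.g. GeV)."""
    m_x1, m_x2, m_x3, m_chi = masses
    d11, d22, d33 = mdot(p1, p1), mdot(p2, p2), mdot(p3, p3)
    d12, d13, d23 = mdot(p1, p2), mdot(p1, p3), mdot(p2, p3)
    # Products with the invisible momentum, solved from the on-shell conditions.
    d3c = 0.5 * (m_x3 ** 2 - m_chi ** 2 - d33)
    d2c = 0.5 * (m_x2 ** 2 - m_x3 ** 2 - d22) - d23
    d1c = 0.5 * (m_x1 ** 2 - m_x2 ** 2 - d11) - d12 - d13
    z = [
        [d11, d12, d13, d1c],
        [d12, d22, d23, d2c],
        [d13, d23, d33, d3c],
        [d1c, d2c, d3c, m_chi ** 2],
    ]
    return -det(z)
```

For a physical event evaluated with the true masses, the Gram matrix of four real four-vectors has a non-positive determinant, so `delta4` returns a non-negative value. A wrong hypothesis can push the same event to negative values.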

Preliminary study with uniform background
In order to illustrate the usefulness of ∆ 4 , we wish to compare its performance as a discovery variable to the conventional edge-and-endpoint variables using the benchmark cascade decay and spectrum specified in the introduction. The performance of all variables will depend on the differential distribution of signal and background events, which as mentioned in the previous section will in turn depend on both the geometry of phase space as well as the matrix elements for signal and background. Again as emphasized in the previous section, the usefulness of ∆ 4 originates from the phase space geometry for signal, in particular, the enhancement of the signal event density near the boundary of the kinematically allowed region where there is no strong reason to expect a feature in the density of background events. Therefore, we devote this section to a toy study where we minimize the effects of the matrix elements and of the background event distribution, by taking all particles in the signal decay chain to be scalars, and we make the highly simplifying approximation that the background varies not only slowly over the signal region but is in fact uniformly distributed over phase space (parameterized in terms of the coordinates m 2 ij ). We will also use the true signal spectrum in calculating ∆ 4 and return to the issue of having to scan over spectrum hypotheses in the next section, before we do a full analysis with SM backgrounds and a signal model with spins of new particles assigned SUSY-like in section 5.
Since we use a uniformly distributed background, we need to define a finite box in the three-dimensional space formed by the three $m^2_{ij}$ variables in order to deal with only a finite number of background events. We choose the box size as twice the maximal possible signal value in each of the $m^2_{ij}$ variables. This choice ensures that finite energy resolution in the detector does not push signal events outside the box, and that no artificial features are introduced in background distributions at small but negative values of $\Delta_4$, close to but outside the signal region. We generate high-statistics samples with one million signal and background events each, where in the signal the flavors of the leptons $\ell$ and $\ell'$ are randomly assigned as electrons or muons. We only consider events where those two flavors are distinct.
Even in this preliminary study, we will need to face two complications. One is finite energy resolution, as mentioned, while the other arises from combinatoric ambiguities. Note that in our benchmark topology of Fig. 2, it cannot be experimentally determined in which order the particles $\ell^+$ and $\ell^-$ are emitted in the cascade, leading to a combinatoric ambiguity. As argued in ref. [34], in such a case it is advantageous to work with ordered variables instead, so we define and work with the variables
$$ m^2_{1(\mathrm{hi})} \equiv \max\left(m^2_{12}, m^2_{13}\right), \qquad m^2_{1(\mathrm{lo})} \equiv \min\left(m^2_{12}, m^2_{13}\right). \qquad (3.1) $$
Note that there is no combinatorial ambiguity in defining $m^2_{23}$, as we require $\ell$ and $\ell'$ to have distinct flavors. Due to the combinatorial ambiguity, there are two possible values of $\Delta_4$ for every event, and both of them will be used when populating $\Delta_4$ histograms. In setting up our study, we will choose to start by using perfect energy resolution and by ignoring the combinatoric ambiguity, before introducing them below. We do this because there are a few important lessons we can learn even before the analysis is made more complicated by these effects.
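As a small illustration of the ordered variables, the sketch below assumes that they are the larger and smaller of the two invariant masses squared that pair the first-stage lepton with each lepton of the same-flavor pair. The function name `ordered_masses` is hypothetical.

```python
def ordered_masses(m2_12, m2_13):
    """Return (m2_1hi, m2_1lo): the two ambiguous pairings of the first-stage
    lepton with the same-flavor leptons, sorted from larger to smaller."""
    return max(m2_12, m2_13), min(m2_12, m2_13)
```

The result is unchanged if the labels 2 and 3 are swapped, which is what makes these variables insensitive to the unknown emission order.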
As mentioned in the introduction, an ideal discovery variable that eliminates the need for precise background modeling would exhibit a strong feature in the distribution of the signal while the background distribution is smooth at the same position, such that a sideband analysis can pick out the signal as in a bump-hunting analysis. At first sight, ∆ 4 seems to be a promising variable along these lines, since the signal event density is enhanced near ∆ 4 = 0 while the background event density has no reason to be enhanced at the same surface, the location of which after all is dependent on the signal spectrum. Unfortunately, this line of thinking misses a potential problem, namely that even though the density of background events may be smooth near the surface ∆ 4 = 0, the phase space in which signal and background events are distributed is three-dimensional, and in making a one-dimensional histogram of ∆ 4 , one has to integrate the phase space volume between surfaces of constant ∆ 4 . This can still introduce a feature into the background ∆ 4 histogram if the volume between contours itself exhibits a feature near ∆ 4 = 0. This does in fact happen to be the case, since the gradient of ∆ 4 is small on a significant portion of the boundary surface, increasing the volume between ∆ 4 contours there. The resulting ∆ 4 histogram for signal and background (uniform density) is shown in Fig. 3, where the normalization of the signal and background histograms has been chosen such that they both contain the same total number of events. Here ∆ 4 values are normalized by the maximum ∆ 4 for the chosen mass spectrum, (m X 1 , m X 2 , m X 3 , m χ ) = (390, 360, 330, 300) GeV. 
When the number of background events is significantly higher than the number of signal events, as is often the case for searches for new physics, and when the distributions become smeared due to finite energy resolution, the presence of the background feature at $\Delta_4 = 0$ will make a simple bump hunt based on a sideband analysis difficult, since the signal can be misinterpreted as a background systematic [42].
We therefore switch to a different approach for a search strategy. In order to compare the effectiveness of the different variables in selecting signal events, we construct a performance curve for each variable as follows.$^8$ For a given variable, a histogram is made of the signal and background. For the $m^2$ variables, the interval of interest in the histogram is between the maximum and minimum possible values predicted by the spectrum, and for $\Delta_4$ it is the interval between $\pm\Delta_{4(\max)}$, also as predicted by the spectrum. The interval of interest is divided into 100 bins.$^9$ The first entry in the performance curve is the ratio of signal to background events ($S/B$) in the bin with the highest number of signal events. To obtain the second entry in the performance curve, this bin is combined with the bin to its left or to its right, whichever of the two has the larger number of signal events, and $S/B$ is calculated for the combined two-bin region. For the third entry in the performance curve, these two bins are combined with the neighboring bin with the higher number of signal events, and so on. The procedure stops when all bins containing signal events are exhausted, and therefore the last entry in the performance curve corresponds to $S/B$ over the full signal region for the variable in question. Note that the ordering of the bins in terms of signal events (as opposed to $S/B$) reduces the reliance on background modeling. We point out that the performance curves of any two variables may be meaningfully compared independently of the overall signal and background normalizations, since any change in the signal and background normalizations will multiply the performance curves of all variables by the same common factor.

$^8$ The spirit of these curves is similar to a receiver operating characteristic (ROC) curve, even though they are not technically ROC curves.
$^9$ We verify that the procedure outlined here is not sensitive to the choice of binning.
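The bin-merging procedure described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the inputs are per-bin signal and background counts over the interval of interest, ties between the left and right neighbors are broken in favor of the right one (the text does not specify a tie rule), and the function name `performance_curve` is hypothetical.

```python
def performance_curve(signal, background):
    """Return the cumulative (S, B) pairs obtained by starting from the bin with
    the most signal and repeatedly absorbing the neighboring bin (left or right)
    that holds more signal, until all signal events are included."""
    n = len(signal)
    total = sum(signal)
    lo = hi = max(range(n), key=lambda i: signal[i])  # seed bin
    s, b = signal[lo], background[lo]
    curve = [(s, b)]
    while s < total and (lo > 0 or hi < n - 1):
        left = signal[lo - 1] if lo > 0 else -1       # -1 marks "no bin there"
        right = signal[hi + 1] if hi < n - 1 else -1
        if right >= left:                             # assumed tie rule: go right
            hi += 1
            idx = hi
        else:
            lo -= 1
            idx = lo
        s += signal[idx]
        b += background[idx]
        curve.append((s, b))
    return curve
```

The two metrics then follow from the returned pairs, for example `[s / b for s, b in curve]` for $S/B$ and `[s / b ** 0.5 for s, b in curve]` for $S/\sqrt{B}$, provided that every merged region contains background events.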
Using the same procedure, for completeness we also produce performance curves for the $S/\sqrt{B}$ metric. These performance curves are shown in Fig. 4. Note that by construction, the background has a flat distribution in all $m^2_{ij}$ variables, and in the absence of spin correlations, the signal has an exactly flat distribution in $m^2_{12}$ and $m^2_{23}$, and a nearly flat distribution in $m^2_{13}$ as well. This explains the near-flatness of the $S/B$ performance curves of the $m^2_{ij}$ variables, as well as the $\sqrt{N_{\rm bins}}$ scaling of their $S/\sqrt{B}$ performance curves. As can be seen from the figures, $\Delta_4$ performs significantly better than these variables with respect to both metrics.
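The flatness and the $\sqrt{N_{\rm bins}}$ scaling can be made explicit with a short worked estimate. The symbols $s$ and $b$ (the signal and background counts per bin) and $n$ (the number of bins combined so far) are introduced here only for illustration, and both distributions are assumed to be exactly flat:

```latex
\begin{align}
  \frac{S}{B} &= \frac{n\,s}{n\,b} = \frac{s}{b}
    && \text{(independent of $n$: a flat performance curve)}, \\
  \frac{S}{\sqrt{B}} &= \frac{n\,s}{\sqrt{n\,b}} = \sqrt{n}\,\frac{s}{\sqrt{b}}
    && \text{(grows as the square root of the number of bins)}.
\end{align}
```

A variable whose signal is concentrated in a few bins departs from this behavior: its curves start above the flat-case value and fall towards it as less signal-rich bins are added.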
Encouraged by this result, we proceed to check whether it is robust in the presence of finite detector energy resolution and combinatorial ambiguities. We use the EM calorimeter resolution based on the CMS-TDR [50] [eq. (3.2)], where the energy $E$ is defined in GeV. For the muon resolution we use the values (in terms of muon momentum and pseudorapidity) summarized in Figure 1.5 of the CMS-TDR [50].
Since the background that we consider in this preliminary study is not physical and has no four-vectors associated with it, we leave it unmodified. To incorporate combinatorial ambiguities into the analysis, we use the ordered m 2 variables as defined in eq. (3.1), and we populate Δ 4 histograms by both possible values for each event as mentioned above. The effect of smearing and combinatorics on the Δ 4 distribution of figure 3 is shown in Fig. 5.   As a result of both smearing and combinatorics, the performance curves for Δ 4 in Fig. 4 are mildly degraded, which can be seen in Fig. 6. In Fig. 7, the performance curves of Δ 4 and the edge-and-endpoint variables are compared with energy resolution and combinatorics included. Δ 4 is seen to still outperform the edge-and-endpoint variables, but by a smaller margin.
After this preliminary comparison among single kinematic variables as discovery tools, it is also interesting to look at how well pairs of variables compare to one another. In particular we will be interested in whether pairing Δ 4 with the m 2 variables will be more effective than pairing one of the m 2 variables with another one. The procedure we use to perform this comparison closely mirrors the procedure outlined above for the case of a single variable. In particular, for any pair of variables, signal and background events populate a double histogram in the two variables in question (the same binning parameters are used in each variable as described earlier in this section). The (double) bins are then ordered in order of their signal contribution, but without demanding that the bins that are combined neighbor one another, and performance curves of S/B and of S/ √ B are made. The effects of both smearing and of combinatorics are included. We exhibit the results in Fig. 8 from which it is easy to see that variable pairs including Δ 4 perform better than

Scanning over mass spectra
Encouraged by the promising results of our preliminary study described in the previous section, we will devote this section to address the issue of the spectrum dependence in calculating Δ 4 . In particular, since the true signal spectrum is not known a priori, analyses involving Δ 4 will need to scan over all possible (correctly ordered) spectra. Below, we will show that the significance is maximized at least locally for the true spectrum, a result which is consistent with the conclusions of ref. [45]. Therefore, if one were to scan over all spectra and use the spectrum that yields the highest significance, then the performance curve based on the true spectrum offers a guaranteed, and in fact potentially conservative (should other spectra exist far from the true spectrum that lead to even higher significance), benchmark for comparison against the performance curves of the m 2 variables. The significances we report will be local. The calculation of a global significance requires the use of a trials factor which is tricky to define for this analysis and is beyond the scope of this paper. The question of the potential existence of other local (or even global) maxima of significance requires extensive calculational resources, since a fine scan over four masses is required 11 . However, since we will show below that the true spectrum yields at least a local maximum, with a high significance value, then if other local maxima with even higher significance should exist, this would only strengthen the discovery potential, not reduce it, but at the cost of having to give up the claim that the spectrum can be simultaneously measured in the same analysis. We will therefore not make this claim in this study.
To demonstrate that the true spectrum yields a local maximum of significance, we will compare the performance curves of Δ 4 for a range of hypothesized spectra obtained by local deformations around the true spectrum. A background uniform in the m 2 ij variables is used as in the previous section, and finite energy resolution as well as combinatorial ambiguities are included in the analysis.
For the local scan near the true spectrum, we allow each of the four masses to change up or down by 10 GeV, resulting in 8 variations. The performance curves obtained as a result of the scan are shown on the left-hand side of Fig. 9. It is easy to see that for any low or moderate number of bins in the performance curve, the true spectrum yields the highest significance. The strong reduction in the performance as one goes away from the true spectrum (along any direction other than the flat direction, see the next paragraph) can be traced to the fact that the sharp peak at Δ 4 = 0 is only present when Δ 4 is calculated for the true spectrum, and is severely distorted otherwise, thereby erasing the most distinctive feature in the signal distribution compared to the background distribution.
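The set of hypotheses used in the local scan can be enumerated as in the sketch below. It assumes that exactly one mass is shifted at a time and that only correctly ordered spectra ($m_{X_1} > m_{X_2} > m_{X_3} > m_\chi > 0$) are kept. The function name `local_variations` is hypothetical.

```python
def local_variations(spectrum, step=10.0):
    """Shift each mass in turn up and down by `step` (GeV), keeping only
    spectra that remain strictly ordered from heaviest to lightest."""
    variations = []
    for i in range(len(spectrum)):
        for sign in (+1, -1):
            trial = list(spectrum)
            trial[i] += sign * step
            ordered = all(a > b for a, b in zip(trial, trial[1:]))
            if ordered and trial[-1] > 0:
                variations.append(tuple(trial))
    return variations
```

For the benchmark spectrum (390, 360, 330, 300) GeV with 30 GeV mass gaps, a 10 GeV shift never violates the ordering, so all 8 variations survive.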
We also perform a finer one-dimensional scan along a special direction. In particular, while the m 2 ij variables are sensitive to changes in the mass gaps in the spectrum, there is a direction where the endpoints of all three m 2 ij distributions remain fixed. We parameterize this direction in terms of the change in the mass of the LSP from its benchmark value. As shown in ref. [45], ∆ 4 is sensitive to changes along the flat direction, while the effect on the shape of the m 2 ij distributions is minimal. These results are shown in the righthand side of Fig. 9, with the conclusion that small deformations along the flat direction leave the performance curve unchanged (within statistical errors) while more substantial deformations reduce the significance. The results of the scans presented above thus confirm our claim that the ∆ 4 performance has a local maximum for the true spectrum.

Study with SM background
Having obtained encouraging results in our toy study with uniform background, and having dealt with the subtlety of scanning over spectrum hypotheses in calculating $\Delta_4$, we are now in the position to conduct a much more realistic study, with SM backgrounds, matrix element effects in the signal, finite detector resolution, and combinatorics taken into consideration. For the signal, we consider a benchmark model where $X_1$ is a scalar muon partner, $X_2$ is a heavy fermion, $X_3$ is a scalar electron partner, and $\chi$ is the fermionic LSP. It should be emphasized again that we are not arguing for this as a signal model to be taken literally; as argued in the introduction, this model is chosen to make an apples-to-apples comparison between $\Delta_4$ and the $m^2$ variables possible, without introducing distracting complications. Nevertheless, we believe that our proposed analysis is straightforwardly applicable to the SUSY signal searches in the channel we study here. This signal model guarantees the flavor arrangement of the three leptons in our benchmark cascade. The dominant SM background for this final state is $WZ^{(*)}$ production followed by leptonic decays. Since our benchmark spectrum ensures that the opposite-sign, same-flavor lepton pair invariant mass remains well below $m_Z$, we impose a Z-veto in simulating the background, so that the region with off-shell Z's can be scanned efficiently. We perform our parton-level simulation for signal and background using MG5@aMC [51], and apply energy resolution for final state leptons according to the CMS-TDR [50] [see also eq. (3.2)]. We use the following selection cuts on the events:
$$ p_{T,\ell} > 10~\text{GeV}, \qquad |\eta_\ell| < 2.5, \qquad \Delta R_{\ell\ell} \geq 0.4, $$
$$ 15~\text{GeV} < m_{\ell^+ \ell^-} < 65~\text{GeV} \qquad (\ell = e, \mu). \qquad (5.1) $$
Here the invariant mass cut in the second line is relevant only to same-flavor opposite-sign lepton pairs.
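The selection cuts of eq. (5.1) can be sketched as a simple event filter. The sketch makes several assumptions that are not stated in the text: leptons are treated as massless and described by $(p_T, \eta, \phi)$, the $\Delta R$ requirement is applied to every lepton pair, and the `Lepton` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Lepton:
    pt: float     # transverse momentum in GeV
    eta: float    # pseudorapidity
    phi: float    # azimuthal angle in radians
    flavor: str   # "e" or "mu"
    charge: int   # +1 or -1


def delta_r(a, b):
    """Angular separation, with the azimuthal difference wrapped to [-pi, pi]."""
    dphi = math.remainder(a.phi - b.phi, 2.0 * math.pi)
    return math.hypot(a.eta - b.eta, dphi)


def inv_mass(a, b):
    """Invariant mass of two massless leptons."""
    m2 = 2.0 * a.pt * b.pt * (math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi))
    return math.sqrt(max(m2, 0.0))


def passes_selection(leptons):
    """Apply the pT, eta, Delta R and same-flavor opposite-sign mass cuts."""
    if any(l.pt <= 10.0 or abs(l.eta) >= 2.5 for l in leptons):
        return False
    for a, b in combinations(leptons, 2):
        if delta_r(a, b) < 0.4:
            return False
        sfos = a.flavor == b.flavor and a.charge == -b.charge
        if sfos and not (15.0 < inv_mass(a, b) < 65.0):
            return False
    return True
```

The mass window is checked only for same-flavor opposite-sign pairs, matching the remark that follows eq. (5.1).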
For the generated signal and background event samples, we plot the $\Delta_{4}$ distributions, as well as the effect of smearing and combinatorics on these distributions, in Fig. 10. The resulting performance curves for $\Delta_{4}$ are obtained following the same steps as in section 3, and are shown in Fig. 11. We then compare the performance of $\Delta_{4}$ to the edge-and-endpoint variables in Fig. 12. We observe that the $\Delta_{4}$ variable becomes less powerful than it was in our preliminary study with uniform background. The main reason for this degradation is that the matrix elements and the parton distribution functions that govern the phase space distribution of SM background events place more events in the regions of small $\Delta_{4}$ than the uniform background distribution does [42]; for example, the event population in the same-flavor lepton pair invariant mass is enhanced at small values due to the mixing between $\gamma$ and $Z$, resulting in more background population at small values of $\Delta_{4}$. Nonetheless, $\Delta_{4}$ shows a performance comparable to that of the strongest $m^2$ variable with respect to both metrics.
Figure 10. The $\Delta_{4}$ histograms for signal (blue) and the SM background (green), with energy resolution and combinatoric ambiguities included.
Furthermore, as we pointed out in our preliminary exercise, an $m^2$ variable, when combined with $\Delta_{4}$, may outperform traditional approaches that use $m^2$ variables only. Indeed, this expectation holds for the signal under consideration, as supported by the results presented in Fig. 13. As one would expect based on the single-variable results of Fig. 12, the best performance is achieved by the combination of $m^2_{1(\mathrm{hi})}$ and $\Delta_{4}$ (blue lines) in both the $S/B$ (left panel) and the $S/\sqrt{B}$ (right panel) metrics. Therefore, we find that $\Delta_{4}$ can play at least a complementary role in separating signal from background, thereby expediting a discovery of new physics.
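To make the two metrics concrete, the sketch below evaluates $S/B$ and $S/\sqrt{B}$ for a cut on one variable and for a combined cut on two variables. It is only a schematic illustration under stated assumptions: the function names, the per-event weights, the cut thresholds, and the toy event lists are all hypothetical and do not reproduce the samples or the performance curves of Figs. 11-13.

```python
import math

def count_passing(events, cut, weight=1.0):
    """Weighted number of events passing `cut` (weight is an assumed per-event normalization)."""
    return weight * sum(1 for ev in events if cut(ev))

def metrics(signal, background, cut, w_sig=1.0, w_bkg=1.0):
    """Return (S, B, S/B, S/sqrt(B)); the ratios are None when no background survives."""
    s = count_passing(signal, cut, w_sig)
    b = count_passing(background, cut, w_bkg)
    if b == 0:
        return s, b, None, None
    return s, b, s / b, s / math.sqrt(b)

# Toy events (hypothetical numbers): each entry is (first variable, second variable).
toy_signal = [(0.1, 5.0), (0.2, 6.0), (0.9, 5.0)]
toy_background = [(0.1, 1.0), (0.15, 5.5), (0.8, 5.0), (0.7, 6.0)]

def one_variable_cut(ev):
    return ev[0] < 0.5

def two_variable_cut(ev):
    return ev[0] < 0.5 and ev[1] > 4.0

single = metrics(toy_signal, toy_background, one_variable_cut)
paired = metrics(toy_signal, toy_background, two_variable_cut)
```

Scanning the thresholds in the two cut functions and recording the resulting metrics is one simple way to build a performance curve; the study itself follows the procedure described in section 3.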

Conclusions
As we approach the end of Run II at the LHC, the absence of a discovery of new physics makes it increasingly imperative to focus on scenarios where a new physics signal may exist in the data, but is not distinctive enough to register in searches looking for high momentum particles. This happens, for example, when the new particles that are produced decay in a cascade with a compressed spectrum. We argued that using the variable $\Delta_{4}$, which arises naturally in describing four-body phase space, allows one to design a search strategy in such a scenario that is quite inclusive and does not rely strongly on background modeling. We do this by focusing our attention on only the part of the event containing the cascade decay, by using Lorentz-invariant variables, and by not using detailed properties of the background in designing our search strategy. We have argued that even though the calculation of $\Delta_{4}$ requires a hypothesis for the mass spectrum in the cascade decay, the significance has a local maximum for the true signal spectrum, which can be used as a benchmark for comparison against the performance of other variables. We have compared the performance of the variable $\Delta_{4}$, both singly and paired with conventional edge-and-endpoint variables, in a study using SM backgrounds, spin correlations, finite energy resolution and combinatoric effects, concluding that $\Delta_{4}$ can significantly enhance the signal both for systematics-dominated ($S/B$ metric) and statistics-dominated ($S/\sqrt{B}$ metric) searches.