Mass Measurement Using Energy Spectra in Three-body Decays

In previous works we have demonstrated how the energy distribution of massless decay products in two-body decays can be used to measure the mass of decaying particles. In this work we show how such results can be generalized to the case of multi-body decays. The key ideas that allow us to deal with multi-body final states are an extension of our previous results to the case of massive decay products and the factorization of the multi-body phase space. The mass measurement strategy that we propose is distinct from alternative methods because it does not require an accurate reconstruction of the entire event, as it does not involve, for instance, the missing transverse momentum, but rather requires measuring only the visible decay products of the decay of interest. To demonstrate the general strategy, we study a supersymmetric model wherein pair-produced gluinos each decay to a stable neutralino and a bottom quark-antiquark pair via an off-shell bottom squark. The combinatorial background stemming from the indistinguishable visible final states on both decay sides can be treated by an "event mixing" technique, the performance of which is discussed in detail. Taking into account dominant backgrounds, we are able to show that the mass of the gluino and, in favorable cases, that of the neutralino can be determined by this mass measurement strategy.


Introduction and general strategy of the mass measurement
A stable weakly interacting massive particle (WIMP), with a weak-scale mass, is a well-motivated candidate for the dark matter (DM) of the universe, as it naturally yields the observed relic density after thermal freeze-out in the early universe (for a review, see, for example, Ref. [1]). In fact, such WIMP DM can also have weak-scale interactions with various Standard Model (SM) particles. This feature leads to the exciting possibility that DM arising in this manner could be detected non-gravitationally. In these cases, a symmetry under which the dark matter is charged, and under which the SM particles are uncharged, must be present in order to stabilize the DM against decay to purely SM final states.
While there are many possible non-gravitational probes of such DM, we focus here on its production at hadron colliders, specifically the Large Hadron Collider (LHC). Many models incorporating DM of the aforementioned type contain particles that are not only heavier than the DM and charged under the DM stabilization symmetry, but that also interact via SM gauge bosons. If these heavier particles (dubbed "parent" particles) interact via, say, QCD, they could be copiously produced at hadron colliders and subsequently decay into the concomitant DM and SM particles. By design, the DM particle leaves no visible trace in the particle detector, so its presence in an event is typically inferred from the missing transverse momentum (p_T^miss), which represents a loss of kinematic information about the event.
The primary goal of this paper is to devise a strategy for the simultaneous measurement of the masses of the parent and the DM particles in the associated processes despite this loss of information. This strategy for the mass measurement also has further applications beyond the study of DM particles; it can be applied to any case where a new particle decays to a semi-invisible final state. In fact, the invisible particle neither has to be a DM candidate, nor has to be absolutely stable; it only needs to live long enough to escape the detector. However, for notational simplicity we shall still refer to it as "DM".
A full reconstruction of such a decay chain is typically not possible, given that it contains an invisible particle. On top of this, due to the DM stabilization symmetry the parent particles are typically pair-produced, implying that each event comes with two invisible particles. The presence of two invisible particles involves an even greater loss of kinematic information from each event, and poses a sizable challenge for the associated mass measurement. Methods using the M_T2 variable and its variations [2-7] have been proposed as a solution to overcome this challenge. This class of variables is well known for its usefulness both in measuring the masses of particles [8] and in isolating new physics signals from their backgrounds (e.g. Ref. [9]). Despite their utility, these variables have a possible drawback when aiming at a precise mass measurement: they all require information about the total missing momentum. Unfortunately, a precise measurement of the missing momentum is often difficult, for instance due to the relatively poor reconstruction of the jets that are usually part of the overall event structure. This is an unpleasant feature of missing momentum measurements, especially in those cases in which many of the jets involved in the measurement of the missing momentum are not actually involved in the decay process of interest. Said another way, the missing transverse momentum is in general measured as the negative of the vector sum of the momenta of all the reconstructed objects (leptons, jets, photons, ...) in the event, which makes it an inherently "global" measurement of said event.
In light of this, we have recently proposed complementary methods for mass measurements which instead use only the energies of the visible particles. The reason to pursue this strategy is, of course, that it relies intrinsically on more "local" information, ideally using only a subset of the particles coming from a given decay chain. The main idea behind the method that we propose is to use the energy spectra of the visible particles. The basic result upon which the method is predicated was shown in Ref. [18]: namely, for a massless child from the two-body decay of an unpolarized parent, the peak in the energy spectrum of the child (henceforth denoted as "energy-peak") seen in the laboratory frame (henceforth denoted by "laboratory-frame" energy) is the same as the fixed value of its energy in the rest frame of the associated parent (henceforth denoted by "rest-frame" energy). The latter value is given by a simple relation in terms of the masses of the two massive particles (the parent and the other child) involved in the decay, and hence can give information about these masses. In a subsequent paper [19], we then applied this observation to measuring the unknown masses in the semi-invisible decay of a heavy new particle involving a multi-step cascade of two-body decays. In this paper, we consider instead a single-step, three-body, semi-invisible decay of a heavier new particle, which we denote as

B → A a b ,   (1.1)

where a, b are visible SM particles and A, B are massive new particles, with A assumed invisible. In order to deal with this specific decay topology, we need to extend the result of Ref. [18] to multi-body decays. The key idea is to map a multi-body final state into a two-body one by the factorization of phase space. In carrying out this mapping, we will take particular care in correctly partitioning and grouping the multi-body decay products and in selecting an appropriate region of the multi-body phase space upon which to apply the above two-body result.
To reduce the multi-body final state to a two-body one, we first form a compound system made of all of the visible particles, labeled as a and b in eq. (1.1). We denote this compound system as (ab) and graphically represent the corresponding partitioning in Figure 1. After this partitioning the decay does not yet look like a true two-body decay, because the compound system (ab) does not have a fixed mass: the combination (ab) has its own phase space in invariant mass. This is apparent from the well-known [21] recursive formula for the multi-particle phase space of N particles of masses m_1, ..., m_N, which can be thought of as a sum of many two-body phase spaces, each made of a single particle on its own and the remaining (N − 1) particles clustered into a single object whose mass μ depends on the momenta and the angles between the (N − 1) clustered particles:

dφ_N(m_1, ..., m_N) = ∫ dμ dφ_2(m_N, μ(m_1, ..., m_{N−1})) · dφ_{N−1}(m_1, ..., m_{N−1}) .   (1.2)
Considering separately each value of the mass that the compound system (ab) can take, we can regard the N-body final state as a weighted sum of a collection of two-body systems, each of which is characterized by the mass μ of the compound system and its probability dφ_{N−1}. This probability, together with the actual squared matrix element of the decay, would give the rate of decay in that particular kinematic configuration. In the following, we do not assume any knowledge of the matrix element of the decay and we shall make no use of these rates; all we need for our strategy to work is the ability to represent the multi-body final state as the sum of the collection of all possible two-body final states. The fact that we do not need to know the rate for each possible kinematic configuration of the multi-particle final state is a remarkable strength of our method; it is especially powerful when applied to newly discovered particles, as their matrix elements are a priori essentially unknown.
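For readers who want to experiment numerically, the factorization of eq. (1.2) can be illustrated with a short Monte Carlo sketch: for a three-body decay B → A a b, one draws the invariant mass μ of the (ab) system with a weight given by the product of the two two-body momenta, which is the flat phase-space density at fixed μ. The function names and the accept/reject scheme below are our own illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def kallen(a, b, c):
    """Kaellen triangle function lambda(a, b, c)."""
    return a*a + b*b + c*c - 2*(a*b + b*c + a*c)

def two_body_p(M, m1, m2):
    """Momentum of either child in the rest frame of a parent of mass M."""
    return np.sqrt(np.maximum(kallen(M*M, m1*m1, m2*m2), 0.0)) / (2*M)

def sample_mab(mB, mA, ma, mb, n):
    """Sample the invariant mass mu of the (ab) system in B -> A a b.
    Per the recursive factorization, the flat phase-space weight at fixed mu
    is p*(B -> A (ab)) x p*((ab) -> a b)."""
    lo, hi = ma + mb, mB - mA
    mu = rng.uniform(lo, hi, size=10 * n)          # oversample proposals
    w = two_body_p(mB, mA, mu) * two_body_p(mu, ma, mb)
    keep = rng.uniform(0.0, w.max(), size=mu.size) < w   # accept/reject
    return mu[keep][:n]
```

Histogramming the returned μ values reproduces the invariant-mass distribution of the (ab) system for a structureless (constant matrix element) three-body decay; it vanishes at both kinematic endpoints, as expected from the two momentum factors.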
For the case of a three-body decay, the above outlined procedure gives

B → A a b  ≃  Σ_{m_ab} B → A (ab)_{m_ab} ,

where the equality should be taken in the sense of an equivalence. We also remark that for practical reasons the integral in the phase-space factorization formula has been discretized. In this way, we can form a finite number of compound systems (ab)_{m_ab} of mass m_ab ± δ with δ ≪ m_ab. This procedure ensures that each of the compound systems has an approximately fixed invariant mass, and we can think of it as a "pseudo-particle" having a width of order δ. This means that we partition, or "slice", the data according to the total invariant mass of a compound particle formed from the a and b particles and apply the result for two-body decays to each mass partition, as the overall system has been reduced to an effective two-body decay. In the rest frame of the parent particle, each partition of the pseudo-particle (ab) has energy

E*_(ab) = (m_B² − m_A² + m_ab²) / (2 m_B) ,   (1.3)

based simply on two-body kinematics for the decay of B into A and (ab). Using the appropriate extension of the result in Ref. [18] to the case m_ab ≠ 0 (see more on this point in Section 2), we are able to extract E*_(ab) from the laboratory-frame energy distribution of that particular (ab) compound particle. We then repeat this procedure for each of the mass partitions in the overall range of m_ab. When plotted versus m_ab², the fitted values of E*_(ab) extracted from the energy distributions should lie along a straight line as per eq. (1.3). It is straightforward to see that m_B can be determined from the slope of this line and that m_A can be determined from the intercept on the vertical axis once m_B has been determined. The available information can be fully utilized in constructing this straight line by analyzing the data in all slices.
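The final fitting step of this procedure can be sketched in a few lines: since E*_(ab) is linear in m_ab², with slope 1/(2 m_B) and intercept (m_B² − m_A²)/(2 m_B), a straight-line fit of the per-slice peak energies against m_ab² returns both masses. The function names below are ours, for illustration only.

```python
import numpy as np

def estar_ab(m_ab, mB, mA):
    """Rest-frame energy of the (ab) compound system in B -> A (ab):
    E* = (mB^2 - mA^2 + m_ab^2) / (2 mB)."""
    return (mB**2 - mA**2 + m_ab**2) / (2.0 * mB)

def masses_from_energy_peaks(m_ab_slices, e_star_fits):
    """Recover (mB, mA) from the fitted energy peaks E*(m_ab) of the slices:
    E* is linear in m_ab^2, so a degree-1 fit gives slope = 1/(2 mB) and
    intercept = (mB^2 - mA^2)/(2 mB)."""
    slope, intercept = np.polyfit(np.asarray(m_ab_slices, float)**2,
                                  np.asarray(e_star_fits, float), 1)
    mB = 1.0 / (2.0 * slope)
    mA = np.sqrt(mB**2 - 2.0 * mB * intercept)
    return mB, mA
```

In a real analysis the inputs `e_star_fits` would carry fit uncertainties from each slice, and a weighted fit would propagate them to the extracted masses.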
We remark that, although the characteristic signature of the production of an invisible particle A is missing momentum, our method does not make explicit use of this quantity and yet still offers a way to measure the mass of this particle! In other words, any specific property of the invisible particle is almost completely irrelevant to our method, except for the assumption that there is at least one invisible particle per decay chain. We again emphasize that this achievement is remarkable, especially in comparison with other mass measurement methods involving p_T^miss, such as M_T2 and its variants. Despite the simplicity of the general idea, there are still some potential issues that should be properly dealt with in order to successfully execute the strategy outlined above. First, the compound particle (ab), the visible child particle of the effective two-body decay, has a non-negligible mass; thus, it is essential to generalize the energy-peak result of Ref. [18] to the case of a massive visible child particle. Our companion paper [27] is devoted to studying how to deal with such massive child particles in more detail. In the present work, we shall merely quote the final result of this companion paper and use it for our investigation. Nevertheless, the discussion presented here is largely independent of the derivation of this result.
As mentioned earlier, the DM model under consideration has the parent particles being produced in pairs. If both of them decay to the same final state, a combinatorial ambiguity arises in attempting to correctly partition and group those particles originating from the same parent; multiple pairs can be formed from the final state as seen in the detector, but it is not known a priori which is the correct pairing, that is, the pairing whose particles originated from the same decay. This partitioning is a crucial necessity in forming the (ab) compound system that plays the role of the child pseudo-particle. For this reason, we allot Section 3 to a thorough discussion of the treatment of this combinatorial ambiguity. In particular, we propose an "event mixing" technique [22, 24-26] as a way to remove the combinatorial background.
To illustrate our method we discuss in detail its application to a specific process. As a concrete example, we choose the pair production of gluinos in an R-parity conserving supersymmetric model. Here the gluinos are assumed to decay into b b̄ and an invisible light stable neutralino via an off-shell bottom squark:

g̃ → b b̄ χ .

This scenario is chosen primarily because it has been thoroughly studied in the literature, and thus should be familiar and interesting to a large audience. Indeed, this process has also been investigated at the LHC [9, 28-32]. In order to provide a fully realistic example, and to demonstrate some of the issues that arise in using our method, we incorporate the relevant Standard Model backgrounds in our analysis as well. We take particular care in devising cuts for background rejection so that these selections do not affect the shape of the energy spectrum near the peak, which is the critical region of interest for our energy-peak method. Obviously, the optimization of these cuts is a process-dependent issue, and so must be evaluated on a case-by-case basis. The goal of our discussion is to present potential systematic uncertainties and biases arising from the specific details of our method, such as those induced by phase-space slicing, event mixing, imperfect knowledge of the background, and overly restrictive event selection criteria. We also present other complementary observables that enhance the findings obtained using the energy-peak method, one of which is the kinematic endpoint of the di-jet invariant mass distribution. The rest of the paper is organized as follows. We continue in the next section with a discussion of a template function used to describe the energy spectrum of a massive child particle. Section 3 is devoted to dealing with the combinatorial ambiguity inherent in our chosen decay topology.
Then in Section 4, we detail our selected signal process and the relevant backgrounds, both those from SM processes and the "event mixing" scheme for signal combinatorics. Section 5 contains the main results for the mass measurement of the aforementioned example process together with a discussion of several opportunities for improvement to the method. In Section 6 we present our conclusions and outlook.
2 A template for the energy spectrum of a massive child particle

As outlined in the previous section, the essence of our mass measurement technique is to fit the data to extract the value of E* for each of the fixed masses of the compound system (ab), and then to fit these values onto the straight line of eq. (1.3). The mass of the compound system (ab), being a system of two particles, is not fixed and in general spans a range set by the masses of all the particles in the decay. Since we are a priori unaware of the masses of the parent and invisible child particles, it is not possible to know whether or not a given value of m_ab is small enough in comparison to those unknown masses to justifiably trust the validity of our previous results for effectively massless child particles [18]; thus we are motivated to extend that finding to massive child particles.
The primary difficulty in generalizing to a massive child is the potential loss of correlation between the peak location in the laboratory-frame energy distribution of the massive child and its energy in the rest frame of the parent. This can readily be seen by considering the decay of a massive particle B → X ψ at the kinematic endpoint of the phase space, i.e., m_B = m_X + m_ψ. For any value of m_X, ψ is at rest in the B rest frame, hence E*_ψ = m_ψ. If each particular event is boosted to the laboratory frame, the energy of ψ becomes simply γ_B m_ψ, with γ_B being the boost of particle B relative to the laboratory. This direct linear relationship between E_ψ and γ_B implies that the shape of the energy distribution of particle ψ in the laboratory frame is simply that of the boost distribution of particle B. In this case, it is clear that the peak of the energy distribution of the massive child ψ carries essentially no information about the masses; rather, it carries information on the most probable boost of particle B. This is in contrast to the "invariance" that holds for a massless child: the energy-peak in the laboratory frame is the same as the rest-frame energy value irrespective of the details of the boost distribution of particle B.
For a more formal understanding of this problem, it is instructive to analyze the Lorentz transformation of a massive child particle from the rest frame of its parent particle, where it has energy-momentum (E*, p*), to the laboratory frame. Given the boost factor γ of the parent particle and the emission angle θ* of the child relative to the boost direction, we find the energy of the child particle in the laboratory frame (denoted by E) to be

E = γ E* (1 + β β* cos θ*) ,   (2.1)

where we have used p* = β* E*. We observe that the laboratory-frame energy E becomes equal to the rest-frame energy E* only if

cos θ* = − (γ − 1) / (γ β β*) .   (2.2)

Denoting this θ* as the "reference" angle, we see that any value of cos θ* smaller (larger) than eq. (2.2) gives rise to a laboratory-frame energy value E smaller (larger) than E*.
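As a quick numerical cross-check of the two relations above, the snippet below (our own illustrative code, not from the paper) evaluates the laboratory-frame energy at the reference angle and confirms that it returns E* whenever that angle exists.

```python
import numpy as np

def lab_energy(gamma, e_star, m, cos_theta_star):
    """Lab-frame energy of a child with rest-frame energy E* and mass m:
    E = gamma * E* * (1 + beta * beta_star * cos(theta*))."""
    beta = np.sqrt(1.0 - 1.0 / gamma**2)
    beta_star = np.sqrt(1.0 - (m / e_star)**2)
    return gamma * e_star * (1.0 + beta * beta_star * cos_theta_star)

def reference_cos(gamma, e_star, m):
    """cos(theta*) at which E equals E*:
    cos(theta*) = -(gamma - 1) / (gamma * beta * beta_star).
    Only physical (|cos| <= 1) when gamma satisfies the bound of the text."""
    beta = np.sqrt(1.0 - 1.0 / gamma**2)
    beta_star = np.sqrt(1.0 - (m / e_star)**2)
    return -(gamma - 1.0) / (gamma * beta * beta_star)
```

Note that at the reference angle the product β β* cos θ* equals −(γ − 1)/γ, so E = γ E* / γ = E* identically, independent of the individual velocities.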
To ensure the existence of the reference angle, the boost factor must satisfy

γ ≤ (1 + β*²) / (1 − β*²) .   (2.3)

When this condition is satisfied, the energy distribution in the laboratory frame is non-zero at E = E*, which is a necessary, but not sufficient, condition to have a maximum at E*. Obviously, if for some γ the condition set by eq. (2.3) is not satisfied, then E > E* for all cos θ*, which potentially invalidates the statement that the peak in the laboratory-frame energy distribution appears at E*. In fact, for the typical boost distributions of parent particles produced at hadron colliders, one can see that if all of the boosts of the parent particle(s) lie outside of the range given by eq. (2.3), it is then guaranteed that the peak of the energy distribution in the laboratory frame will not be located at the rest-frame energy value. A more rigorous characterization of the energy distribution of a massive child will be presented in a separate work [27]. We hope that the argument above, while not as rigorous as our companion paper, convinces the reader that the maximum boost of the parent particle is the key parameter that controls the position of the peak in the laboratory-frame energy distribution of a massive child particle. Beyond affecting the peak position, the overall shape of the energy distribution for a massive child is expected to differ from that for a massless child. This means that the function used to fit the massless-child energy spectra in previous works cannot be used in the present work. In order to obtain a suitable description of the massive-child energy spectrum, we revisit the corresponding discussion for the case of a massless child particle. The value of the energy distribution at a given laboratory-frame energy E is given by a Lebesgue-type integral over the range of γ values contributing to that E, together with the associated weight for each γ [18]. For the case at hand, this range is found by solving eq.
(2.1) for γ with cos θ* = ±1, which gives:

γ_−(E) = γ_E γ* − √((γ_E² − 1)(γ*² − 1)) ,   (2.4)

γ_+(E) = γ_E γ* + √((γ_E² − 1)(γ*² − 1)) ,   (2.5)

where γ_E = E/m and γ* = E*/m, with m the mass of the child. In the massless limit the upper boundary γ_+(E) goes to infinity, while the lower boundary reduces to

γ_−(E) → (1/2) (E/E* + E*/E) .   (2.6)

In light of eq. (2.6), we can express the energy spectrum for a massless child that we used in previous works [18] as

f₀(E) = N exp[ − (w/2) (E/E* + E*/E) ] .   (2.7)

Motivated by the success of this exponential form in the massless case [18, 19], and noting via eq. (2.6) that its argument is just the massless limit of γ_−(E), we propose the following ansatz for the shape of the laboratory-frame energy spectrum of a massive child:

f(E) = N { exp[ − w γ_−(E) ] − exp[ − w γ_+(E) ] } ,   (2.8)

where N and w are a normalization factor and the width of the function, respectively. A complete evaluation of the accuracy with which this function describes the laboratory-frame energy spectrum of a massive child will be presented in a dedicated work [27]. For the purposes of this paper, it is sufficient to know that this function reproduces our ansatz for massless child particles for γ* → ∞. In any case, we will explicitly show below that this function provides a good description of simulation data for our example process. A comment on the location of the maximum of this function is in order. The maximum of this function coincides with E* only in the limit w → ∞, in which the function becomes a δ function. For all finite values of w, the actual location of the maximum of the function is slightly larger than E*. However, we empirically observe that for parent particles that would typically be produced at colliders, and for γ* somewhat larger than 1, the typical value of w is large enough that this effect is negligible. Therefore, we expect that eq. (2.8) properly describes a large class of energy spectra. Because the peak location and the E* parameter are no longer necessarily the same, the determinations of the best-fit values of E* and w are interrelated when fitting the data to the massive template function eq. (2.8). The fact that the maximum of the spectrum is a function of both E* and w is an inherent feature of our ansatz for the massive-child energy spectrum, which did not exist in the massless case of Ref. [18].
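For readers who wish to explore the shape of such a template numerically, the sketch below assumes a form N[exp(−w γ_−) − exp(−w γ_+)] built from the boost-range boundaries; the precise ansatz and its validation are those of Ref. [27], and the function names here are our own.

```python
import numpy as np

def gamma_pm(E, e_star, m):
    """Boundaries gamma_- and gamma_+ of the boost range contributing to
    lab energy E, for a child of mass m and rest-frame energy E*."""
    gE, gS = E / m, e_star / m
    root = np.sqrt((gE**2 - 1.0) * (gS**2 - 1.0))
    return gE * gS - root, gE * gS + root

def template(E, e_star, m, w, norm=1.0):
    """Exponential ansatz for the massive-child energy spectrum, assumed
    here as f(E) = N [exp(-w gamma_-) - exp(-w gamma_+)]."""
    gm, gp = gamma_pm(E, e_star, m)
    return norm * (np.exp(-w * gm) - np.exp(-w * gp))
```

A scan over E shows that for moderate γ* and finite w the maximum sits slightly above E*, while for large γ* the second exponential is negligible and the massless behavior, with the peak at E*, is recovered.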
In this paper we will study the possible effects that arise in our explicit example due to this feature of eq. (2.8). For a fully general investigation of this issue we refer to our companion paper [27]. In Sec. 4 we shall fit the energy distribution of each mass partition of the pseudo-particle (ab) both with the new template for massive children, eq. (2.8), and with its massless limit (i.e., γ* → ∞), the latter being the template employed previously for massless child particles. The comparison of the results from these two templates will allow us to demonstrate the necessity of using eq. (2.8) and of generalizing what we had used in our previous work [18, 19].
3 "Event mixing" to estimate the combinatorial background

As explained in the Introduction and depicted in Figure 1, the success of our strategy hinges on correctly identifying the pairs of particles coming from the same decay side. If this identification is done correctly, the idea of phase-space factorization can safely be applied to reduce the multi-body final state to a two-body final state. Generally speaking, the identification of the correct pairs of particles to be grouped together is a tremendously difficult task, as we have no systematic way of knowing which particles come from the same decay and can thus be correctly paired in the analysis.
One approach to surmounting this challenge is to attempt to correctly identify the pairs of particles that come from the same decay by exploiting some preferential kinematic correlation between particles that originate from the same decay. This correlation could then be translated into a selection criterion that would correctly pair the appropriate particles with relatively high accuracy. Of course, this selection process would not be 100% effective and would fail to pick the correct pairings for some fraction of events. As a result, we would be left with a certain amount of combinatoric pollution from pairings whose constituent particles did not come from the same decay. Several event-by-event strategies have been developed to identify which pairs of particles come from the same decay (see, for example, Refs. [33][34][35][36][37]). It is however generally true that, in order to maximize the chance of pairing particles correctly, the kinematic selection criteria which form the basis of each method must be rather restrictive, so as to guarantee a sufficient rejection of unwanted pairings. Events that pass these highly constrained kinematic criteria will be preferentially selected from isolated regions of the kinematic phase space of the scattering, and therefore, the kinematic distributions of the final state will be significantly altered by the imposition of these criteria. Our method of mass extraction relies critically on the fidelity of the energy distribution around the location of the peak; without this, the templates we use to extract the masses return biased information. Thus, because the above set of procedures for selecting correct decay siblings greatly disturbs the resultant kinematic distributions, they are not suitable for use alongside our method. 
For this reason, we do not even attempt to identify the correct pairings in each event; we instead try to obtain the energy distribution of the correct pairs without knowing which are the correct pairs event-by-event.
In order to determine the distribution of a given observable as it would arise from picking just the correct pairs, we use an event mixing technique whose basic idea is as follows. We consider a scattering where two heavy particles B_1 and B_2 are produced and decay to final states of the same kind, B → A a b, whose products are labeled to correspond with their respective parent. In the analysis of events of this nature, we follow the procedure laid out in Section 1 for making pseudo-particles out of the a and b particles for all possible equivalent pairings (e.g. a_1 with b_1, but also a_1 with b_2, and so on). From these pairs, we obtain a fully inclusive distribution of the observable of interest, which in our case is the total energy of the pair, E_(ab) = E_a + E_b. In order to obtain the distribution stemming only from the pairs of particles coming from the same decay, it is sufficient to come up with an estimate of the distribution that stems from the pairs that we would like to discard and subtract it from the fully inclusive distribution:

[dN/dE]_same decay = [dN/dE]_all pairs − [dN/dE]_different decays .

This equality might seem trivial, but it is in fact very powerful. To wit, the object that we must have for our method to be successful, the distribution of pairings from the same decay, which is quite difficult to obtain in itself, is now expressed in terms of two objects that are much simpler to obtain. The first piece is obviously attainable, as it is simply the distribution from all the pairs that can be formed in each event. The second piece, the distribution from pairs not coming from the same decay, can be estimated by the "event mixing" technique, which is based on making a distribution from pairs of jets that come from different events. The intuition that justifies the usefulness of the event mixing technique originates from the fact that pairs (a_1 b_2) and (a_2 b_1) from the same event are made of particles which are produced with almost no kinematic correlation.
Therefore, it seems reasonable to mimic the effect of these "incoherent" pairs with pairs of particles taken from different events, which intuitively have no correlation. More precisely, we can see that the phase-space point from which a_1 and b_1 originate in the decay of B_1 is very close to being uncorrelated with the phase-space point from which a_2 and b_2 originate in the decay of B_2. This approximately vanishing correlation between the products of different decays implies that pairs formed by particle a_1 taken from one event and particle b_2 taken from another event are expected to be statistically equivalent to pairs (a_1 b_2) in which both particles are taken from the same event. This means that the distributions of a quantity over a given sample of events obtained from either the (a_1 b_2)-type incorrect pairings within the same event, or from pairing a_1 in one event with b_2 in another event, are equivalent. This implies that we can estimate the distribution stemming from pairs of particles from the same decay as

[dN/dE]_same decay ≃ [dN/dE]_all pairs − [dN/dE]_mixed events .   (3.4)

This observation is at the center of the event mixing idea, and we henceforth denote the procedure described by the right-hand side of eq. (3.4) as "mixed event subtraction".
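A schematic implementation of the mixed event subtraction might look as follows; the data layout (a flat list of jet energies per event) and the normalization convention are simplifying assumptions of ours, used only to illustrate the idea.

```python
import numpy as np
from itertools import combinations

def all_pairs_hist(events, bins):
    """Inclusive distribution of the pair energy from every within-event pairing."""
    vals = [ea + eb for ev in events for ea, eb in combinations(ev, 2)]
    return np.histogram(vals, bins=bins)[0].astype(float)

def mixed_hist(events, bins, rng):
    """Mixed-event estimate of the wrong-pairing distribution: pair each jet
    of an event with the jets of a different, randomly chosen event."""
    vals, n = [], len(events)
    for i, ev in enumerate(events):
        j = (i + 1 + rng.integers(n - 1)) % n   # any event other than i
        vals += [ea + eb for ea in ev for eb in events[j]]
    return np.histogram(vals, bins=bins)[0].astype(float)

def same_decay_estimate(events, bins, n_correct_per_event=2, rng=None):
    """Estimate of the correct-pair distribution: all pairs minus the mixed
    estimate, the latter rescaled to the expected number of wrong pairings."""
    rng = rng or np.random.default_rng(0)
    incl = all_pairs_hist(events, bins)
    mix = mixed_hist(events, bins, rng)
    n_pairs = len(events[0]) * (len(events[0]) - 1) // 2  # assumes equal-size events
    n_wrong = n_pairs - n_correct_per_event
    mix *= n_wrong * len(events) / mix.sum()   # normalize to the wrong-pair yield
    return incl - mix
```

Note that the mixed histogram must be rescaled before subtraction, since mixing produces far more combinations than the number of wrong pairings actually present in the sample; here its integral is normalized to the expected wrong-pair yield.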
The plausibility of the distribution obtained by pairing particles from different decays within the same event being equivalent to the distribution arising from pairings between different events can be seen intuitively in certain simple cases. To illustrate, imagine having an ensemble E of pairs of particles

(B_1^(1), B_2^(1)), (B_1^(2), B_2^(2)), ... ,   (3.5)

where the superscripts denote the associated event number. For simplicity, we take the particles B to be scalars sitting at rest in the laboratory, where they decay as B → a b A. It is obvious that the distributions made from pairs coming from different B particles in the same event of the ensemble are the same as the distributions made from pairs coming from B particles in different events of the ensemble. In this example, we are simply sampling the phase space of the B decay in two different ways: in one case taking kinematic information from instances of the decay that happen at the same time, and in the other taking that information from instances of the decay that are separated in time.
However, one must note that the situation described above may not correspond to reality. As a concrete example, we take the pair (B_1^(i), B_2^(i)) and assume that the system of the two B particles has a center of mass energy √ŝ_i (> 2m_B). In the laboratory frame, the phase space accessible to the decay products of the two B particles depends on √ŝ_i. If we consider the jth event, with the intention of mixing particles from the ith and jth events, we are forced to confront the fact that the center of mass energy of the jth event, √ŝ_j, and thus the phase space accessible to the jth event's particles, is different from that of the ith event. The mismatch in the phase space accessible to particles in events at different √ŝ is clearly a potential source of error in the identification of the "different-decay" and "different-event" distributions, which naturally poses a threat to the successful application of the event mixing technique.
It is quite difficult to estimate the size of this inaccuracy, though we expect that for typical situations at hadron colliders the event mixing technique works quite well. One reason for this is the small variance of the boosts of the B particles produced in typical collisions at hadron colliders. In addition to this effect, however, there may be other potential sources of error in the event mixing technique, and a case-by-case study is needed to check the performance of the method. Because of this, we take a pragmatic approach in the following and apply the event mixing technique to our example while explicitly checking its performance for our example process.

Application to the gluino decay
We now demonstrate how the general strategy detailed above is realized by taking as an example a particular gluino decay channel. We first illustrate the signal process and its particular characteristics, and then move to discussing the possible backgrounds of this signal process. The discussion of these backgrounds is separated into two categories: 1) the real background from SM processes, and 2) the systematic background from incorrect pairings of the final state particles used in forming the invariant mass and energy distributions. As we detail the various processes involved, we shall utilize Monte Carlo simulation to generate the relevant event samples, construct and analyze the appropriate kinematic data from these samples, and end with a discussion of the effectiveness of our technique based on the results of this analysis.

4.1 Signal process: gluino decay

We apply the general idea developed in the previous sections to the case of pair-produced SUSY gluinos and their subsequent decay into two bottom quarks and a neutralino via a three-body decay,

pp → gg → bbχ bbχ, (4.1)

at the 14 TeV LHC. In terms of the notation used in Section 1, the gluino and the neutralino correspond to particles B and A, respectively, and the two visible particles a and b are the bottom quark and anti-quark in a decay chain. In reality, the particle detector cannot reliably discriminate between bottom and anti-bottom quarks, so particles a and b in this example are considered indistinguishable.
Though we use the specific decay above as a concrete example, we emphasize that the decay mode and underlying model at hand are chosen only to enable us to demonstrate the proposed technique, and that the general idea can be applied to multi-body decay processes in other models. We also point out that the applicability of the method is not affected in any way by the strengthening of bounds on supersymmetric particles, because the method applies to parent and invisible (child) particles of any mass. To illustrate our technique, we choose the masses of these particles to be m_g = 1.2 TeV and m_χ = 100 GeV with a decoupled bottom squark, and assume that the only decay mode of the gluino is the three-body decay to bbχ.⁹ The Monte Carlo signal sample for our study is simulated using MadGraph5 v1.4.8 [39], and the structure of the proton is parametrized by the CTEQ6L1 parton distribution functions (PDFs) [40], evaluated with the default renormalization and factorization scale settings of MadGraph5. The production cross section of gluino pairs is computed with MadGraph5 and is reported in the first column of Table 1. Since we assume that all produced gluinos decay into bbχ as described above, σ(pp → gg) is equal to σ(pp → gg → bbbbχχ).
The neutralinos in the final state of our signal do not interact with the detectors of the LHC, resulting in missing transverse momentum. The four bottom quarks give rise to jets of hadrons, in particular B-hadrons. The characteristics of the B-hadrons in these jets allow us to distinguish them from jets that do not originate from a bottom quark, making it possible to identify and tag b-quark-initiated jets in a large fraction of the events.⁹ With the requirement that four of the reconstructed jets in the final state carry this tag, the signal features four bottom jets plus missing transverse momentum, 4b + p̸_T. Before closing this section, we remark that the chosen example process poses an extra challenge for the application of our method: all visible final-state particles are indistinguishable, hence there are three different ways to form pseudo-particles from the b-tagged final-state particles, all of which must be considered in the analysis. Together with the SM backgrounds, this can be interpreted as another background, as explained in the next section.

⁹ During the completion of this work the limit on the gluino mass given by the LHC experiments has risen to about 1.4 TeV for light χ [31, 32]. Although these new limits rule out the spectrum we consider at the 95% confidence level, this spectrum still serves its purpose as an illustration of the technique. We do not expect qualitative differences in the application of our strategy to the mass measurement of a heavier, not yet experimentally excluded gluino.

Table 1. The cross sections for the signal process (before and after a set of cuts is imposed) and the (main) background processes (only after cuts are imposed). The cuts are described in eqs. (4.2)-(4.4), and also include all the identification and isolation criteria explained in the text. The effect of the b-tagging efficiency is not taken into account in the numbers of this table.

4.2 Backgrounds
In this section, we discuss the backgrounds relevant to the signal process defined in the preceding section. As mentioned earlier, there are two types of backgrounds: those coming from Standard Model processes that give rise to the same signature as our supersymmetric process, and those coming from taking the wrong pairings of two visible particles when we evaluate the energy sum and the pair-wise invariant mass. We start by discussing the "real" background from the Standard Model, and then we discuss the combinatorial background.

4.2.1 Standard Model backgrounds and event selection
For our collider signature 4b + p̸_T, two SM processes are identified as the major backgrounds: pp → Z(→ νν)bbbb and pp → ttbb. The Monte Carlo generation of background events is done using the same event generator and input PDFs as for the signal events. Since the detector signature is exactly the same as the one used in our earlier work, Ref. [19], we adopt a similar strategy for handling these backgrounds, with only slight modifications. The Z boson background is irreducible, whereas the ttbb background is reducible and can be suppressed until it becomes sub-dominant with respect to the Z boson background. The ttbb background might seem different from the signal process in terms of its partonic final state, but it can mimic our signal, and thus become a relevant background, by "losing" some of the final-state partons in the detectors. In order to match the signal's detector signature, the two W bosons originating from the decays of the two top quarks must go undetected, which makes those W bosons the main source of p̸_T in the ttbb background. Although the rate at which the detector misses the W bosons is expected to be small, the sizable production rate of pp → ttbb can compensate for this, making the pp → ttbb process a potentially important background.
A W boson will go unseen in the detector for primarily two reasons: 1) its decay products fall outside the experimental acceptance of the detector, due to insufficient p_T, too large |η|, or both; or 2) its decay products are not adequately isolated from other particles, i.e., they are merged with other particles in the reconstruction of the event. For the first case, we define as missed any object that satisfies the following criteria:
• for jets, p_T,j < 30 GeV or |η_j| > 5,
• for leptons, p_T,l < 10 GeV or |η_l| > 3, with l = e, µ, τ.
In the second case, the following rules determine when a particle is missed:
• for merging jets, ∆R_{j1 j2} < 0.4, with j1 and j2 denoting any jet pair, including b-jets,
• for merging leptons, ∆R_{jl} < 0.3, with j and l denoting a jet and a lepton, respectively.
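To make the acceptance and isolation rules above concrete, they can be encoded as simple predicates. The following Python sketch uses the thresholds listed above; representing reconstructed objects as (pT, η, φ) tuples is our own illustrative convention, not part of the paper's analysis code.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    dphi = abs(phi1 - phi2)
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi  # wrap the azimuthal difference into [0, pi]
    return math.hypot(eta1 - eta2, dphi)

def jet_missed(pt, eta):
    """A jet fails acceptance if p_T < 30 GeV or |eta| > 5."""
    return pt < 30.0 or abs(eta) > 5.0

def lepton_missed(pt, eta):
    """A lepton (e, mu, tau) fails acceptance if p_T < 10 GeV or |eta| > 3."""
    return pt < 10.0 or abs(eta) > 3.0

def jets_merged(j1, j2):
    """Two jets (pt, eta, phi) are merged if Delta R(j1, j2) < 0.4."""
    return delta_r(j1[1], j1[2], j2[1], j2[2]) < 0.4

def lepton_merged_with_jet(j, l):
    """A lepton is absorbed into a jet if Delta R(j, l) < 0.3."""
    return delta_r(j[1], j[2], l[1], l[2]) < 0.3
```

A W boson is then counted as missed when every one of its decay products fails one of these tests.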
With the acceptance and isolation requirements listed above, we observe that most of the background events come from the fully leptonic or semi-leptonic decay channels of the top quark pair, because these channels require fewer otherwise-visible partons to be missed than the fully hadronic channel.
To devise an adequate strategy for rejecting a large number of these background events while preferentially keeping the signal events, we must adopt event selection criteria that exploit the kinematic differences between signal and background events. We observe first that, thanks to the heaviness of the parent particles, signal events will typically contain jets with larger transverse momentum than those found in background events. This is a strong hint as to which cuts will be suitable for rejecting the background. However, one must be especially cautious in selecting these cuts, because the method proposed here is based on extracting E*_bb, which relies in part on a shape analysis of the energy distribution in each invariant mass slice. Cuts should therefore be chosen such that they do not considerably distort the energy distributions. For this reason, we prefer softer cuts than in most searches performed at the LHC, and we adopt the baseline selection criteria of eq. (4.2) for identifying the bottom jets in all events that we analyze.

In order to further suppress the Standard Model backgrounds, we consider requiring a large missing transverse momentum. In signal events, the missing transverse momentum is set by some combination of the new particle masses, and is thus expected to be large. In background events, by contrast, it is determined by the largest of the total hardness of the event, the mass of the Z boson, and the mass of the top quark. A hard p̸_T cut therefore allows us to efficiently discriminate signal from background. However, in our case special care is needed in choosing the scale of this cut, as it risks introducing unwanted bias in the energy distributions.
In particular, the missing transverse momentum can be interpreted as the recoil of the invisible particles against the visible ones, which suggests that a hard p̸_T cut would preferentially select events with very hard visible particles and correspondingly bias the b-jet energy spectrum toward higher energies. As mentioned before, this could lead to a misidentification of the value of E*_bb and, as a consequence, an inaccurate measurement of the associated masses. Fortunately, the relatively large mass hierarchy between the gluino and the neutralino in our signal process ensures multiple hard b-jets on average, and thus a sizable recoil for the invisible neutralinos. We therefore anticipate that the E_bb distribution will only be mildly affected, even by a fairly hard p̸_T cut. For our signal and backgrounds we impose the p̸_T cut of eq. (4.3), which strongly suppresses the backgrounds with negligible deformation of the E_bb distributions.
In addition to the p̸_T cut, we introduce a cut that requires each b-jet to have a minimum angular separation from the p̸_T vector. This enables us to reject events where the measured missing momentum is caused by the mismeasurement of jets. For our analysis we impose the ∆φ(b, p̸_T) requirement of eq. (4.4), which has a negligible effect on the shape of the E_bb distributions. In Table 1 we show the cross sections for both signal and background events after applying the set of cuts listed above. We clearly see that the ttbb background is sub-dominant with respect to the Z + bbbb background. We also remark that the expected signal-to-background ratio (S/B) is large, which is certainly favorable for extracting E*_bb from the E_bb distribution. Indeed, if new physics particles are discovered in the forthcoming runs of the LHC, it would be natural to measure their masses in channels with a clean signal, and hence a large S/B. In this sense, the context in which we present our mass measurement technique is expected to be typical when attempting a mass measurement beyond an order-of-magnitude estimate, as we do here.
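The full event selection can be sketched as a single predicate. The numerical thresholds below are placeholders standing in for eqs. (4.2)-(4.4), whose actual values are not reproduced in this excerpt; they are our own hypothetical choices for illustration.

```python
import math

# Hypothetical thresholds standing in for eqs. (4.2)-(4.4); the paper's
# actual cut values are NOT reproduced here.
PT_B_MIN = 30.0   # baseline b-jet p_T requirement (GeV), placeholder
MET_MIN = 300.0   # missing-p_T cut (GeV), placeholder
DPHI_MIN = 0.3    # minimum Delta phi(b-jet, missing p_T), placeholder

def passes_selection(bjets, met, met_phi):
    """bjets: list of (pt, eta, phi) tuples; met: magnitude of missing p_T."""
    if len(bjets) < 4:
        return False                              # require four b-tagged jets
    if any(pt < PT_B_MIN for pt, _, _ in bjets):
        return False                              # soft baseline p_T requirement
    if met < MET_MIN:
        return False                              # hard missing-p_T cut
    for _, _, phi in bjets:
        dphi = abs(phi - met_phi)
        if dphi > math.pi:
            dphi = 2 * math.pi - dphi
        if dphi < DPHI_MIN:
            return False                          # reject jets aligned with missing p_T
    return True
```

The ordering of the checks mirrors the text: a soft baseline selection first, then the missing-momentum cut, then the angular-separation requirement against jet mismeasurement.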
Beyond the backgrounds discussed above, QCD multi-jet production pp → bbbb is another possible source of SM background events, in which the missing transverse momentum typically arises from imperfectly measured jet energies. Unsurprisingly, an accurate estimation of this background is quite challenging because it involves detector effects. We expect that the QCD multi-jet background would be largely suppressed by the cuts of eqs. (4.2), (4.3) and (4.4), to the point that it becomes sub-dominant, and in the following we do not take this background into account (see, for example, Ref. [19]).

4.2.2 Combinatorial background and mixed event subtraction
As mentioned earlier, the procedure of phase space slicing inevitably requires the formation of the invariant mass of two visible objects. Since the process eq. (4.1) under consideration has pair-produced parent particles which both decay to indistinguishable visible children, grouping the four b-jets into two pairs gives the correct choice in only one case out of the three possible combinations. Following the strategy outlined in Section 3, we form all possible pairs and obtain an inclusive energy distribution. We then subtract the contributions originating from the wrong combinations by estimating the corresponding distribution through the event mixing technique as described before. In order to validate the performance of this mixed event subtraction scheme in our example, we first study a large number of pure signal events where no selection cuts are imposed and without including backgrounds. We then discuss how the inclusion of backgrounds complicates the estimate of the combinatorial background when using the mixed event subtraction.
i) Pure signal: For our study it is necessary to check that the b-jet pairs' energy and invariant mass distributions are well reproduced by the mixed event subtraction scheme. In order to apply it as per eq. (3.4), we need the distributions of observables formed from pairs of b-jets belonging to different events, as explained in Section 3. In principle, there are several options for choosing the two events used to compute these observables: one can use all possible pairs of events, meaning that each event is reused many times, or one can use a procedure in which each event is used only a few times. The detailed way in which the event mixing is done can, in principle, affect the result, but in most cases the alternative methods lead to only minor differences.¹⁰ Among these possibilities, we show results for which the events were mixed as follows: given a sample of N events, (1) we first randomly shuffle and reorder the events to remove any potential correlation between events arising from the way in which they were generated; (2) we compute the different-event observables by taking b-jets from the ith and (i + 1)th events,¹¹ so that each event is used twice; (3) we finally renormalize these distributions to carry the same weight as the contribution from the incorrect pairings in the signal sample that we intend to remove, i.e., two thirds of the total number of pairings in the signal sample.
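Steps (1)-(3) of the mixing procedure can be sketched as follows. This is an illustrative implementation under our own conventions: events are lists of four b-jet objects, and all 4×4 cross pairs are formed between neighboring events (a detail of the mixing on which, as the text notes, the results depend only weakly).

```python
import itertools
import random

def build_pairs(events, seed=0):
    """
    events: list of events, each a list of four b-jets (arbitrary objects).
    Returns (same_pairs, diff_pairs, diff_weight): all within-event b-jet
    pairs, the cross-event pairs from mixing event i with event i+1 (the
    last wraps around to the first), and the per-pair weight normalizing
    the mixed sample to the wrong-pairing fraction (2/3 of same-event pairs).
    """
    rng = random.Random(seed)
    evs = list(events)
    rng.shuffle(evs)                           # step (1): decorrelate the ordering

    # Same-event pairs: all C(4,2) = 6 pairs per event; only 2 of them
    # correspond to b-jets from the same gluino decay.
    same = [p for ev in evs for p in itertools.combinations(ev, 2)]

    # Step (2): mix event i with event i+1 (wrap-around), so each event
    # enters the mixed sample exactly twice.  Forming all 4x4 cross pairs
    # is one simple choice.
    diff = []
    for i, ev in enumerate(evs):
        nxt = evs[(i + 1) % len(evs)]
        diff += [(a, b) for a in ev for b in nxt]

    # Step (3): weight so the mixed sample totals 2/3 of the same-event pairs.
    diff_weight = (2.0 / 3.0) * len(same) / len(diff)
    return same, diff, diff_weight
```

With N events this yields 6N same-event pairs and 16N mixed pairs, so the normalization weight is (2/3)·6N/16N = 1/4.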
We label the inclusive invariant mass distribution formed from all the pairs in the same event as F_SE(m_bb), and we denote by F_DE(m_bb) the distribution obtained from pairs in different events. Then, given an invariant mass value, we take from the same-event sample all the pairs whose invariant mass lies within the range of interest and plot the spectrum of the energy of the sum of the two b-jets, E_bb. We repeat the same operation in the different-event sample, and we obtain the E_bb spectrum for the fixed ranges of m_bb. Denoting the spectrum from the same-event pairs as f_SE(E_bb), the spectrum from different-event pairs as f_DE(E_bb), and the result of the mixed event subtraction as f_S(E_bb), our estimate of the energy distribution from the correct pairs is

f_S(E_bb) = f_SE(E_bb) − f_DE(E_bb), (4.5)

from which we will ultimately extract the rest-frame energy value E*_bb for each fixed m_bb. Similarly to the subtraction for the E_bb spectrum, we obtain an estimate of the overall invariant mass distribution for the correct pairings using the mixed event technique,

F_S(m_bb) = F_SE(m_bb) − F_DE(m_bb). (4.6)

Note that this distribution is not used to extract the masses, but serves as an illustration of the effectiveness of the mixed event subtraction scheme.

¹⁰ One can also form "events" out of randomly selected sets of four particles from the entire event sample and compute the observables by forming pairs from the particles constituting these new "events." We evaluated the efficacy of constructing the distribution to be subtracted using this alternative method and found little difference in the end results, both in the distributions and in the extracted energy and mass values.
¹¹ The last event is mixed with the first one.
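A minimal sketch of the slice-by-slice subtraction of eq. (3.4), under the assumption (ours, for illustration) that pairs are summarized as (m_bb, E_bb) values and histograms are plain lists of bin contents:

```python
def subtract_mixed(same_pairs, diff_pairs, diff_weight, m_lo, m_hi, e_bins):
    """
    Estimate of the correct-pair energy spectrum,
        f_S(E_bb) = f_SE(E_bb) - f_DE(E_bb),
    restricted to pairs with m_lo <= m_bb <= m_hi.  Pairs are (m_bb, E_bb)
    tuples; e_bins is a sorted list of energy bin edges.
    """
    def fill(pairs, weight):
        counts = [0.0] * (len(e_bins) - 1)
        for m, e in pairs:
            if m_lo <= m <= m_hi:                 # keep the mass slice only
                for i in range(len(e_bins) - 1):
                    if e_bins[i] <= e < e_bins[i + 1]:
                        counts[i] += weight
                        break
        return counts

    f_se = fill(same_pairs, 1.0)                   # same-event spectrum
    f_de = fill(diff_pairs, diff_weight)           # renormalized mixed spectrum
    return [s - d for s, d in zip(f_se, f_de)]     # subtracted estimate
```

The same function applied with the mass window removed (m_lo = 0, m_hi = ∞) and m_bb in place of E_bb reproduces the invariant-mass subtraction of eq. (4.6).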
To quantify the fidelity of the distribution obtained from the mixed event subtraction scheme, we define the ratio

R_i = N_{i,S} / N_{i,C},

where N_{i,S} is the ith bin-count of the "subtracted" distribution F_S(m_bb) and N_{i,C} is the corresponding bin-count of the distribution obtained considering only the correct pairs. The latter is obtained by exploiting the fact that in simulated events the full history of the particles is available. The left panel of Figure 2 compares the invariant mass distribution from the correct pairs, shown as the blue dot-dashed histogram, with the distribution obtained by mixed event subtraction, shown as the red solid histogram. To show how much the original invariant mass distribution is contaminated by the combinatorial background, the distribution before the subtraction procedure is also plotted, as the green dashed histogram. Each bin count is normalized to an integrated luminosity of 3 ab⁻¹. We observe that the distribution obtained from the mixed event subtraction is very close to that obtained from the correct pairs. A more quantitative comparison is provided in the right panel of Figure 2, showing the bin-by-bin ratio R_i against m_bb. All bin counts are quite close to their associated theoretical values; in fact, R_i ∼ 1 over the whole range of m_bb. We do not show R_i in the vicinity of the two kinematic endpoints, because the significantly smaller N_{i,C} there leads to unreliable R_i values. Besides visualizing the effectiveness of the mixed event subtraction, this check confirms that, for a given invariant mass slice, a similar amount of data remains available after the mixed event subtraction as would have been available had we been able to completely eliminate the combinatorial background.
In Figure 3 we compare the energy distribution from the correct pairs with that obtained from the mixed event subtraction. The two distributions are shown for the mass slice 275 GeV ≤ m_bb ≤ 325 GeV in the upper panel of the figure, and for the mass slice 675 GeV ≤ m_bb ≤ 725 GeV in the bottom panel. The ratio of the correct-pair and subtracted histograms is provided for each choice of m_bb as well. Since the fit to extract E*_bb from the energy spectra will be performed using only the data around the peak, we show the bin-by-bin ratio only for the energy range corresponding to the full width at half maximum (FWHM), which is indicated by black dashed lines in each plot. In this case, we see that the energy spectrum processed with the mixed event subtraction (blue histogram) is also quite close to the associated theory expectation (red histogram). To be more quantitative, we compute the average of R_i over the FWHM range. This average, denoted R, is close to 1, which suggests that the mixed event subtraction scheme works quite well, i.e., the shape of the energy spectrum is reasonably preserved. From this exploratory analysis, we expect that the extraction of E*_bb from the energy distribution obtained by the mixed event subtraction is unlikely to suffer major bias due to the subtraction.
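The FWHM-restricted average ratio R can be computed as in the sketch below, where the FWHM range is approximated by the bins whose correct-pair count is at least half of the peak count (an assumption of this illustration, not a prescription from the text):

```python
def fwhm_mean_ratio(n_subtracted, n_correct):
    """
    Bin-by-bin ratio R_i = N_{i,S} / N_{i,C}, averaged over the bins whose
    correct-pair count is at least half the peak count (the FWHM range).
    n_subtracted, n_correct: lists of bin contents of equal length.
    """
    peak = max(n_correct)
    ratios = [s / c for s, c in zip(n_subtracted, n_correct)
              if c >= 0.5 * peak]        # keep only the FWHM bins
    return sum(ratios) / len(ratios)
```

Values of R near 1 indicate that the subtracted spectrum tracks the correct-pair spectrum in the peak region used for the fit.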
ii) Background and "signal-background interference": Once the SM backgrounds come into play, a non-trivial complication is introduced by the event mixing. Since we do not know a priori whether a given event comes from the signal or the background, it is not possible to perform the event mixing using only signal events. Therefore, the distribution returned by the mixed event subtraction scheme contains "signal-background interference", i.e., pairs formed by picking one particle from a signal event and the other from a background event. In general, the kinematic characteristics of the background events differ from those of the signal events, and these interference pairs will therefore make the overall distribution deviate from that of the pure signal or the pure background combinatorially-generated background.¹² As a consequence, a naively subtracted distribution would be distorted with respect to the distribution of a pure signal sample.
To understand the quantitative impact of the inclusion of physical backgrounds, we first need to assess the hierarchy of the effects that arise from the simple addition of these backgrounds and from the event mixing. We focus on the situation where n_ev events have been collected after the application of selection cuts, e.g., eqs. (4.2)-(4.4). These events come both from the signal process and from background processes: in general, we have n_s signal events and n_b = n_ev − n_s background events.¹³ In a situation where a mass measurement is attempted, however, we expect the signal to dominate the backgrounds, n_s ≫ n_b. Under this assumption, we can quantify how likely the event mixing procedure is to form pairs where both particles come from the signal process, both come from background processes, or one comes from signal and the other from background. Clearly, the probabilities of selecting a particle from a signal or a background event in the sample are n_s/n_ev and n_b/n_ev for a signal and a background event, respectively. Therefore, most of the pairs formed in the event mixing procedure are made of two particles from the signal process. Pairs made of two background particles are much less abundant, arising only in a small fraction, n_b²/n_s², of the cases. Strikingly, pairs made of one background particle and one signal particle are much more abundant than pure background pairs, as their probability is 2 n_b/n_s.

¹² For the dominant background in our case (i.e., Z + 4b), the pure background combinatorial distribution is somewhat tricky: the distinction between correct and incorrect pairings is meaningless because the associated event topology is ill-defined. In the interest of generality, however, we imagine that our background can also give rise to fictitious correct and wrong combinations.
¹³ Since different types of backgrounds will, in principle, form different distributions, we assume here a single type of background to avoid potential complications.
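The hierarchy of pair origins follows from elementary probability. The function below is an illustrative sketch (not part of the paper's analysis) of the fractions of mixed pairs that are signal-signal, signal-background, and background-background:

```python
def mixing_fractions(n_s, n_b):
    """
    Fractions of mixed-event pairs by origin, for n_s signal and n_b
    background events (the text assumes n_s >> n_b).  Returns
    (signal-signal, signal-background "interference", background-background).
    """
    n_ev = n_s + n_b
    p_s, p_b = n_s / n_ev, n_b / n_ev      # single-draw probabilities
    return p_s * p_s, 2 * p_s * p_b, p_b * p_b
```

For example, with n_b/n_s = 1/9 the pure background pairs make up only 1% of the total, while the interference pairs make up 18%, confirming that the interference term dominates the pure background term.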
The effect of the pairs involving both background and signal is not predictable unless one specifies the signal. However, some general features of this "interference" contribution to the event mixing estimate of the correct-pairing distribution can easily be anticipated. First, the "interference" distribution tends to produce an underestimation of the bin-counts in the estimate, eq. (3.4), of the distribution of b-jet pairs originating from the same decay. The reason is that the inclusive distribution from all pairs in the same event, the first term in eq. (3.4), contains contributions from events that are entirely signal or entirely background, but there is no way to construct a hybrid of the two:

dσ_SE/dE_bb = dσ_S/dE_bb + dσ_B/dE_bb,

where by σ_S and σ_B we mean the signal and background contributions, respectively. In mixed events, on the other hand, we have

dσ_DE/dE_bb = dσ_SS/dE_bb + 2 dσ_SB/dE_bb + dσ_BB/dE_bb.

As suggested by Figure 3, the contribution from pairs where both events come from the signal, dσ_SS/dE_bb, does a good job of estimating the effect of the pairs of b-jets not coming from the same decay in the signal. Similarly, the contribution from pairs of events where both come from the background, dσ_BB/dE_bb, is a good estimate of the combinatorial background generated by the background itself. The contribution dσ_SB/dE_bb is therefore the piece that typically spoils the result, because it gives rise to an excessive subtraction in eq. (3.4). This phenomenon obviously cannot be avoided, since we are unable to distinguish signal and background events with absolute certainty. The presence of this type of "interference" background is inherent to the event mixing technique, and dealing with it requires special care. A more quantitative argument about the interference is given in App. A for the interested reader.
Given its peculiar origin, it may be desirable in some cases to remove the "signal-background interference" contribution. In order to do so, one must model the shape of this distribution, which in general depends on the signal and is therefore a priori unknown. However, some general features of the "signal-background interference" distribution can be predicted with the following argument. The distribution that arises from pairs made of one background and one signal particle reflects in part the kinematics of the signal events and in part that of the background events. The background is typically expected to have softer particles than the signal; therefore, the "interference" b-jet pair energy distribution is expected to be skewed towards energies somewhat larger than the characteristic values of the distribution for pairs made of two background particles. For our example process, we display the distribution from pairs of signal particles in Figure 3, while the distribution from pairs of background particles and the interference distribution are shown in Figure 4. Comparison of these distributions confirms the intuition from the argument above.
From Figure 4 we see that the interference effect dominates the pure background effect essentially everywhere in the distribution. As discussed above, this is due to the fact that we envision a situation where signal events are much more abundant than background events. The shape of the interference contribution is also quite different from that of the pure background contribution.
When discussing results in a later section, we will study the effect of removing the background contributions from our results. With this goal in mind, we study what functional form describes the total effect of the backgrounds, that is, the distribution from the pure background pairs plus that from the "signal-background interference" pairs. The combination is shown in Figure 4; in the lower panel, we show a possible fit of this distribution. Due to the importance of the signal in determining the shape of the "signal-background interference" distribution, we model the total effect of the backgrounds with a function of the family of eq. (2.8). The fit in Figure 4 is rather good, but we do not attach any special significance to this finding; a better description of this background may well exist and might be preferred. More generally, we stress that the "signal-background interference" distribution is not universal, and our choice could be unreliable for other signals. In our application to the gluino decay process, the fairly good description provided by eq. (2.8), shown in Figure 4, is satisfactory for our current purposes.

5 Mass measurement results and discussion
In this section, we demonstrate the application of the proposed technique to the gluino decay. Results on the mass measurement from fitting the energy spectra for the compound system of two b-jets are presented in the following subsections along with the possible issues and limitations of our method. In the final subsection, we discuss possible improvements of the mass measurement with the aid of the di-jet invariant mass endpoint.

5.1 Measurement of gluino and neutralino masses
Following the strategy outlined in the previous sections, we present results for the determination of the masses of the gluino and the neutralino from the b-jet energies. The energy spectra that we study are obtained from simulated event samples generated as described in Sec. 4.1, for both the signal and the dominant background processes at the 14 TeV LHC. We recall that the relevant channel is characterized by large missing transverse momentum and four bottom-tagged jets, selected as per eqs. (4.2)-(4.4). Since the primary interest of this paper is the theoretical aspects of energy peaks in a multi-body decay, rather than data analysis under realistic statistics, we take a sufficiently large number of events to minimize potential statistical fluctuations within the data sample, which is then normalized to an integrated luminosity of 3 ab⁻¹.
Note that we study the distribution of the sum of the energies of b-jet pairs (say, b1 and b2), i.e., E_bb = E_b1 + E_b2, for which the associated invariant mass value belongs to a narrow window

m_bb ∈ [m̄_bb − ∆m_bb/2, m̄_bb + ∆m_bb/2], (5.1)

where ∆m_bb denotes the width of the mass window. We henceforth identify each individual mass window by its central value m̄_bb. Due to the indistinguishability of the final-state particles, we form the E_bb distribution from all possible pairs of b-jets in an event, and subsequently apply the mixed event subtraction technique described in Sec. 4.2.2 to eliminate the contamination from pairs of b-jets not coming from the same gluino. In Figure 5, we present the average of the bin-by-bin ratio of the distributions from only the correct pairs and from the mixed event subtraction technique at various values of m̄_bb. For each point in the figure, the average is restricted to the energy range defined by the full width at half maximum (FWHM) of the distribution. The figure suggests that the average deviation between the two distributions is at most about 8%, meaning that the mixed event subtraction scheme reproduces the original distribution fairly well. Although we do not show it here, we also remark that for a given m̄_bb the standard deviation of the bin-by-bin ratio is small enough that the distribution from the mixed event subtraction consistently tracks the corresponding distribution from the correct pairs. For each m̄_bb, the rest-frame energy of the b-jet pair system (i.e., E*_bb) is extracted from the energy distribution by fitting the data to a template function. We have two possible template functions, given in eqs. (2.7) and (2.8), and we use both of them to see which better suits the data and whether there are significant differences between the E*_bb values found by the two functions. Since eq.
(2.7) is suitable only for the case where the relevant m̄_bb is effectively negligible, we expect an increasing discrepancy between the results obtained with eqs. (2.7) and (2.8) as m̄_bb grows. The reduced χ² values for each fit to the energy spectrum are shown in Figure 6. This χ² measures how well the template function describes the data globally, but we do not necessarily attach any statistical meaning to it; we instead use it as a measure of the distance between the two template functions. Looking at the figure, we observe that for all m̄_bb the massive template describes the data as well as or better than the massless template. As expected, the performance of the massless template becomes progressively worse as m̄_bb increases. The massless template also appears inferior to the massive template in another respect: it typically returns an estimate of the rest-frame energy that is larger than the expected value, whereas the massive template does not introduce such a pronounced bias.
To demonstrate the difference between fits made with the two templates, we show in Figure 7 sample fit results for two different nominal m̄_bb values, 250 GeV and 650 GeV. In the left panels we provide results from the massive template, while in the right panels we provide results from the massless template. The errors quoted for the extracted E*_bb are estimated at the 95% confidence interval (C.I.) from the variation of the χ² of the fit. For both m̄_bb values, the massless template estimates E*_bb as being slightly larger than the estimate given by the massive template, and the discrepancy grows as we go to larger m̄_bb. Although the discrepancy is within the 95% confidence interval, we regard this as an important characteristic of the fit results: the massless template has a systematic tendency to return a larger E*_bb, which implies a possible bias in the mass measurement. The results of the fits for all values of m̄_bb are reported in Table 2, from which we see that the massless template consistently overshoots the estimate of E*_bb obtained from the massive template, and that it also overestimates the theory values of E*_bb. Finally, we take all the values of E*_bb obtained from fitting the energy spectrum for each m̄_bb and fit them to the line given by eq. (1.3), taking into account and displaying the associated errors from the fit procedure used to extract the values of E*_bb. The expression in eq. (1.3), adjusted to our example process, reads

E*_bb = (m_g² − m_χ² + m_bb²) / (2 m_g). (5.2)

We perform the fit of eq. (5.2) on both the results from the massive template and those from the massless template. For the fits using the massive template, we use only the results obtained for m̄_bb in the range from 200 GeV to 650 GeV, in which the errors from the fit of the energy spectra are quite small.
Other choices of the m_bb range give similar mass measurements; we simply make a conservative choice for the range of m_bb included, so as to avoid the values of m_bb where fewer events are expected. For the results extracted using the massless template, we fit eq. (5.2) only for m_bb in the range from 200 GeV to 600 GeV, where treating the b-jet pair system as massless is more reasonably accurate. The fit parameters are the slope of eq. (5.2) and its vertical intercept (that is, E*_bb at m_bb = 0), which we denote as s and y, respectively. Fitting the line eq. (5.2) to the results obtained from the energy spectra, we obtain the best-fit lines shown in Figure 8 for the massive and massless templates (eqs. (5.5) and (5.6), respectively). Not surprisingly, the values extracted using the massive template are closer to the input values than those from the massless template, although even the massless template was able to give a rough estimate of the rest-frame energy of the bb system for the chosen values of m_bb.

Table 2. The fit results for fifteen invariant mass slices. For each fit, a mass slice of 50 GeV was chosen; for example, for the slice with nominal m_bb = 200 GeV, E_bb is selected such that the corresponding m_bb lies between 175 and 225 GeV. The bin size for all energy distributions is 20 GeV. The error estimation for each fit parameter is performed at the 95% confidence interval. All values except χ² are given in GeV.

In Figure 9, we also show the 68% and 95% confidence level contours obtained from the χ² variation of the fit of the slope and intercept parameters; the results with the massive template are in the left panel and those with the massless template in the right panel. One can clearly see that the distance between the theory values and the best-fit values is smaller for the massive template (left panel) than for the massless template (right panel). As mentioned before, s and y can easily be converted into the masses of the gluino and neutralino. Based on eqs. (5.5) and (5.6), we obtain the following measurements of the two masses:

Massive template : mg = 1042 ± 65 GeV, mχ² = −159000 ± 59000 GeV² , (5.7)
Massless template : mg = 964 ± 56 GeV, mχ² = −240000 ± 41000 GeV² . (5.8)
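For concreteness, the conversion from the fitted line parameters (s, y) to the masses can be sketched as follows, assuming eq. (5.2) has the two-body form E*_bb = m_bb²/(2 mg) + (mg² − mχ²)/(2 mg), so that s = 1/(2 mg) and y = (mg² − mχ²)/(2 mg). The input masses below are illustrative, not the paper's benchmark:

```python
def masses_from_line(s, y):
    """Invert slope s and intercept y of E*_bb vs m_bb^2 into (m_gluino, m_chi^2).

    Assumes E*_bb = s * m_bb^2 + y with s = 1/(2 m_g) and
    y = (m_g^2 - m_chi^2) / (2 m_g); a sketch, not the paper's actual code.
    """
    m_g = 1.0 / (2.0 * s)
    m_chi_sq = m_g**2 - 2.0 * m_g * y   # can come out negative for a biased fit
    return m_g, m_chi_sq

# round trip with illustrative masses: m_g = 1250 GeV, m_chi = 150 GeV
m_g_in, m_chi_in = 1250.0, 150.0
s_in = 1.0 / (2.0 * m_g_in)
y_in = (m_g_in**2 - m_chi_in**2) / (2.0 * m_g_in)
m_g_out, m_chi_sq_out = masses_from_line(s_in, y_in)
print(m_g_out, m_chi_sq_out)   # recovers 1250.0 and 150.0**2
```

Note that mχ² is obtained as a difference of two fitted quantities, which is why a modest bias in the slope (and hence in mg) propagates into a large error on the neutralino mass, as observed in eq. (5.7).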
We remark that the gluino mass, while quite precisely determined, is underestimated by about 20%, with the value from the massive template closer to the true value than that from the massless template. The neutralino mass is poorly determined with both templates. Possible causes of this poor estimation are discussed in the next subsection.

Figure 8. The theoretical expectation for the given mass spectrum is represented by the solid black line. The data points obtained by fitting the E_bb distributions with the massless and massive templates are marked by "cross" and "square" symbols, respectively. The mass measurement obtained from the cross symbols is represented by the red dashed line. For the blue dot-dashed line, the measurement is done using only the data points for which the massless template works reasonably well.

Study of systematic effects
The results in the previous subsection on the measurement of the gluino and neutralino masses are fairly good, considering the challenging circumstances of the mass measurement, particularly the fully indistinguishable character of final state particles in our chosen signal process. Despite an adequate result for the gluino mass measurement, the neutralino mass measurement is very poor; the only conclusion that one is able to draw is that the neutralino in our example process is consistent with being massless.
As noted already, the measurement of E*_bb for each m_bb is statistically compatible with the theory value, but the extracted E*_bb is systematically overestimated. In the fit of the data in Figure 8, this mismeasurement of E*_bb primarily implies a slope larger than that predicted by theory, which in turn biases the extracted gluino mass towards values smaller than the true mass. This bias, at the level of about 10%, is not particularly worrisome per se. However, given the relation between the masses and the observables in eq. (5.3), this underestimation of the gluino mass severely affects the neutralino mass determination. More generally, there are nine possible cases, based on under-, over-, or consistent estimation of the slope and of the intercept of the straight line in eq. (5.2). The implication of each case for the extracted mass parameters is summarized in Table 3.
It is interesting to examine possible causes of this bias in the best-fit line of Figure 8, which also serves as a basis for possible improvements of our method. In order to clarify the origin of the incorrect estimation of E*, we study the following potential sources of inaccuracy in our fits of the energy spectra: i) an imperfect fit of the data with the massive template eq. (2.8); ii) contamination due to the background; iii) biases introduced by the event-mixing subtraction; iv) the finite size of the m_bb ranges used to discretize the multi-body phase space; v) biases due to event selection.

Table 4. Description of the original sample (OS) and several selected check samples (CS). The width of the m_bb ranges used for the discretization of the multi-body phase space is reported in the first column. Samples marked as event-mixed are those in which the mixed-event subtraction has been carried out; in those marked "no", the correct pairs are identified from the event record, which eliminates the effect of combinatorial backgrounds. Samples where the background has been completely neglected are marked in the third column. For the samples where the background has been added, the fourth column reports whether a template describing the background events has been included in the overall fit of the data. Finally, the fifth column reports whether the selection cuts of eqs. (4.2) through (4.4) have been applied to the events.
For the first potential source, we recall the discussion in Section 2, where the massive template function was introduced; this template has its maximum at E*_bb only in the limit w → ∞, which corresponds to producing the gluino(s) at rest. In practical cases, w is finite and the maximum of the function appears at a value somewhat larger than E*_bb. On the other hand, physical energy distributions can have their maximum at E*_bb, in particular when m_bb can be treated as effectively massless. Therefore, the fit could in principle return a value that does not match the corresponding expectation. Fortunately, for the case at hand we find that w is large enough that the resulting shift in the peak position is negligible, i.e., such a potential mismatch is very small. Consequently, we do not ascribe the systematic overestimate of E*_bb to any inaccuracy of the template function eq. (2.8). In order to see the effect of the other four potential sources of bias on the final result, we conduct a dedicated analysis for each. In each analysis, we repeat the same procedure as described in the previous section, that is, we extract the values of E*_bb from an event sample that isolates the effect under study. The event samples used for these studies are denoted as "Check Samples" (CS). We then compare the results obtained from these check samples to those obtained from the Original Sample (OS). The attributes of the check samples are summarized in Table 4 and are also described in the following.
The first check sample enables us to assess the effect of the background on the extraction of E*_bb. We first study the pure background energy distributions in order to calibrate the template function describing them in the fit; this calibration is done for each of the m_bb slices. Although there are two types of backgrounds, the dominant SM background and the interference from the event mixing, we employ a single template of the form of eq. (2.8) to describe both of them collectively. We then repeat the fit of the energy spectra for each m_bb, including the template function for the backgrounds as well.

Figure 10. Comparisons of fit results from the five samples used to assess the effects of several potential sources of bias in the gluino mass determination, as described in Table 4.

The results for the determination of E*_bb for each m_bb are labeled "CS I" and plotted as red open circles in Figure 10. In this figure, the left panel shows the absolute shift of the measured E*_bb from the corresponding theory value for each sample, while the right panel shows the ratio of the E*_bb values from the check samples to the corresponding value in the original sample. We observe that the effect of background modeling is negligible for all m_bb; from the right panel of Figure 10 we see that it produces less than a 1% shift in E*_bb. The next potential source of bias that we study is the event mixing, examined with the second and third check samples, denoted CS II and CS III in the following. In these samples we use the event record to identify the correct pairs of jets coming from the same gluino, and we therefore obtain the correct energy spectra without applying the mixed-event subtraction. The two samples CS II and CS III differ by the inclusion of the SM background. The results for the determination of E*_bb are reported in Figure 10 as blue filled and blue open triangles, respectively. From the figure we see that the determination of E*_bb improves significantly. In fact, in the left panel of Figure 10 a mild positive shift of the determined E*_bb values still exists, but it is greatly reduced compared to the original sample. We remark further that only minor differences are found between the results obtained from CS II and CS III, which again confirms that the effect of background events is negligible. Therefore, we conclude that the event mixing is a major cause of the shift that we observe in the gluino mass determination.
Next, we study the effect of the discretization of the m_bb spectrum by taking smaller m_bb windows. This is performed on the check sample denoted CS IV. This narrow-window analysis is intended as an improvement over the previous two analyses, where the combinatorial issues were artificially resolved using the information in the event record; in fact, CS IV is identical to CS III except for the value of Δm_bb. The results of this analysis are reported in Figure 10 as black filled rhombuses, and they suggest that the effect of the discretization of the m_bb spectrum is negligible. This is not surprising in light of the following observation: for a given nominal m_bb, the E*_bb from eq. (5.2) for (m_bb + 25) GeV ((m_bb − 25) GeV) is at most about 15 GeV larger (smaller) than the E*_bb for the nominal m_bb. The absolute size of this shift is already quite small, and it is further reduced by the fact that each m_bb range receives contributions from both below and above the nominal m_bb. In all, we expect a very small net effect, making the discretization of the m_bb spectrum in increments of 50 GeV suitable for the precision sought.
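The size of the discretization shift quoted above can be checked with a few lines of arithmetic, assuming the linear relation in m_bb² behind eq. (5.2) and an illustrative mass spectrum (mg = 1250 GeV, mχ = 150 GeV are stand-ins, not the paper's inputs):

```python
def e_star(m_bb, m_g, m_chi):
    # E*_bb from the two-body relation assumed for eq. (5.2)
    return (m_g**2 - m_chi**2 + m_bb**2) / (2.0 * m_g)

m_g, m_chi = 1250.0, 150.0      # illustrative masses
for m_bb in (200.0, 400.0, 650.0):
    # shift of E*_bb at the edges of a 50 GeV window around the nominal m_bb
    up = e_star(m_bb + 25.0, m_g, m_chi) - e_star(m_bb, m_g, m_chi)
    down = e_star(m_bb, m_g, m_chi) - e_star(m_bb - 25.0, m_g, m_chi)
    print(m_bb, round(up, 2), round(down, 2))
```

Even at the top of the fitted range (m_bb = 650 GeV), the shift is about 13 GeV, consistent with the "at most about 15 GeV" estimate in the text.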
Finally, we study the bias induced by the selection cuts. To assess their effect, we produce a sample along the lines of CS IV, but fully inclusive in the signal phase space. The results of fits performed on the energy spectra from this sample are reported in Figure 10 as black open rhombuses. The fully inclusive sample gives values of E*_bb from the fits that agree with the theory predictions within a few percent up to m_bb = 650 GeV.
For the check samples II, III, IV and V, we remark that the agreement of the fit results with the theory value deteriorates as one approaches the endpoint of the range covered by m_bb. We observe that good agreement is retained up to m_bb = 650 GeV, which lies in the falling tail of the m_bb distribution, as is apparent from Figure 2. We suspect that the mismatch between the fitted E*_bb and the theory values is connected to the massive template becoming less accurate in fitting the data. Indeed, in Figure 8 we can see that the error on the fitted E*_bb becomes larger for m_bb ≥ 650 GeV. In a realistic application of our mass measurement method, we would not know up to what precise value of m_bb the massive template can be trusted. However, it is clear that the values of m_bb for which the extracted E*_bb comes with a large error should be avoided. We remark that in Figure 8 all the fit results for m_bb ≥ 650 GeV have a significantly larger error, clearly signaling a transition to a region of m_bb where the fit template eq. (2.8) can no longer be trusted. A more detailed investigation of this transition boundary is beyond the scope of this paper; we refer to Ref. [27] for a more systematic study of it.

Improving the mass measurement using the m bb endpoint
In this section, we discuss a possible improvement of the mass measurement with the aid of the kinematic endpoint of the dijet invariant mass distribution. The point is that, without prior knowledge of the masses, it is not possible to say whether the measurement in eq. (5.7) is biased. However, we can devise a check and an improvement of the obtained measurement using an independent observable. To this end, we study the combination of the results of fits to energy spectra with the measurement of the endpoint of the invariant mass distribution of the pairs of b-jets, which we denote as m_bb^max. This observable is a simple function of the gluino and neutralino masses,

m_bb^max = mg − mχ ,   (5.9)

and is expected to be very useful in combination with the results of fits to energy spectra. In fact, in light of eq. (5.9), eq. (5.2) can be rewritten as

E*_bb = m_bb^max − ((m_bb^max)² − m_bb²)/(2 mg) ,   (5.10)

which is a straight line in the plane (E*_bb, m_bb²) described by just one free parameter. This should be compared with eq. (5.2), which we used previously to find the masses and which has two independent parameters: the slope and the constant term of the straight line.
If the m_bb endpoint is assumed to be well measured, we can use eq. (5.10) to fit our results more accurately to a line in the plane (m_bb², E*_bb). This relation is shown in Figure 11 for m_bb^max = 1.1 TeV. Using this relation as a template for the fit of the (m_bb², E*_bb) data points in Figure 8, along with eq. (5.9), we obtain the mass measurement mg = 1236 ± 31 GeV, mχ = 134 ± 31 GeV . (5.11) In this case, as for the analyses in the previous sections, the fit is performed between m_bb = 200 GeV and m_bb = 650 GeV, with the data points obtained from the massive template. This result is more accurate and in better agreement with the expected values than what we obtained in eq. (5.7) using only energy distributions. Therefore, we conclude that, depending on the accuracy with which the m_bb endpoint can be experimentally determined, the addition of this information can grant a very significant improvement over the results obtained from the energy spectra alone.
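A minimal sketch of this one-parameter fit, with the endpoint fixed at m_bb^max = 1.1 TeV and toy (E*_bb, m_bb²) data points standing in for the fit results of Figure 8 (the "measured" points are generated from an assumed mg = 1250 GeV with 5 GeV scatter):

```python
import numpy as np

M_MAX = 1100.0   # assumed well-measured m_bb endpoint (1.1 TeV in the text)

def e_star_line(m_bb_sq, m_g, m_max=M_MAX):
    # eq. (5.10) as assumed here: one-parameter line in (E*_bb, m_bb^2)
    return m_max - (m_max**2 - m_bb_sq) / (2.0 * m_g)

# toy "fit results": generated from m_g = 1250 GeV with 5 GeV scatter
rng = np.random.default_rng(0)
m_bb = np.arange(200.0, 700.0, 50.0)           # slices from 200 to 650 GeV
data = e_star_line(m_bb**2, 1250.0) + rng.normal(0.0, 5.0, m_bb.size)

# one-dimensional chi^2 scan over the single free parameter m_g
scan = np.arange(1000.0, 1500.0, 1.0)
chi2 = [np.sum((data - e_star_line(m_bb**2, m)) ** 2) / 5.0**2 for m in scan]
m_g_fit = scan[int(np.argmin(chi2))]
m_chi_fit = m_g_fit - M_MAX                    # invert eq. (5.9)
print(m_g_fit, m_chi_fit)
```

Because the intercept is no longer free, the scatter of the data points constrains mg directly, and mχ follows from eq. (5.9); this is the mechanism behind the improved precision of eq. (5.11).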

Summary and Conclusions
In this paper, we have discussed how to use the energy spectra of visible decay products to measure the masses of "parent" particles in semi-invisible multi-body (i.e., more than two-body) decays. The results extend previous results on the properties of the energy distribution of massless decay products in a two-body decay. In particular, we extended the two-body results by discretizing the multi-body phase space and treating it as a combination of multiple two-body decays, as suggested by the recursive factorization formula for the multi-body phase space. The fictitious two-body systems involved in this recursive factorization are necessarily massive. Therefore, we utilized the results of a companion paper [27] on the description of the energy spectra of massive decay products in two-body decays. Armed with these results, we can fit the energy spectra of the visible part of the fictitious two-body decay and extract an estimate of the masses involved in the process from these fits.
A particularly challenging aspect of our analysis stems from the fact that the decaying particle (the particle of interest) is typically produced in association with other particles. In this situation, some of the "child" particles from the decay of the parent may be identical to particles contained in the rest of the event, which means that, in reducing the multi-body final state of the parent decay to one with fewer bodies, the particle pairings that we form may unintentionally include particles that have nothing to do with the mass measurement at hand. In particular, the parent particles are often produced in pairs: if the two parents in each event undergo the same decay process, this combinatorial background is clearly inevitable. These particles extraneous to the decay potentially hamper the mass measurement of the parent, and the contamination they add must therefore be addressed. Most of the general discussion above can be succinctly illustrated with a suitable example; furthermore, tackling a concrete example enables us to quantify the quality of the mass measurement that one can achieve with our method. With these goals in mind, we studied in detail the production of a pair of gluinos in a supersymmetric model with conserved R-parity, in which each gluino decays directly to bbχ_1^0 (a three-body decay) via an off-shell bottom squark. To this end, we simulated events for the process, including the relevant SM contributions. We identified selection cuts that reduce the SM backgrounds to a level that clears a path toward a successful determination of the masses of the new physics states, while at the same time minimizing the changes in the shape of the energy spectra caused by these event selection criteria.
Our actual analysis then starts by forming all possible pairs of bottom quarks in each event that passes the selections in eqs. (4.2)-(4.4), from which we obtain a distribution of the energy of the b quark pairs, E_bb. This distribution includes the contribution from pairs formed by bottom quarks not originating from the same decay, i.e., the wrong combinations. Note that the final state of each decay in our chosen process consists of two indistinguishable (visible) particles, so the pollution from combinations of particles not coming from the same decay is even more severe: each event gives 6 pairings of two bottom quarks, of which only 2 are correct, compared with the case of distinct particles a and b from each decay, which would give 2 correct pairings out of 4. To remove this contamination of the event sample, we subtracted an estimate obtained using the event-mixing technique described in Sec. 3. This estimate was obtained from pairs of b quarks taken from different events, and we showed that the contribution from pairs of b quarks not coming from the same gluino can thus be effectively removed, as seen in Figure 3. In other words, this method has a natural tendency to preserve the shapes of the distributions that we need to analyze to carry out our measurement.
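The pairing combinatorics and the mixed-event estimate described above can be sketched as follows; the jet "energies" and the pair observable are toy stand-ins for the E_bb (and m_bb) of real b-jet pairs:

```python
import itertools
import random

random.seed(7)
# toy dataset: 200 events, each with four b-jet "energies"
events = [[random.uniform(100.0, 500.0) for _ in range(4)] for _ in range(200)]

def pair_obs(e1, e2):
    # placeholder pair observable; the analysis uses E_bb (and m_bb) of the pair
    return e1 + e2

# all 6 same-event pairings of the 4 jets (2 correct + 4 wrong combinations)
same_event = [pair_obs(a, b)
              for ev in events
              for a, b in itertools.combinations(ev, 2)]

# mixed-event pairs: one jet from each of two different events
# (m = 1 pairing per event mixing, out of the 16 possible)
mixed = [pair_obs(ev1[0], ev2[0])
         for ev1, ev2 in itertools.combinations(events, 2)]

# reweight the mixed sample to match the expected number of wrong pairs,
# N_WC = 4 * n_events, before subtracting it bin by bin (appendix A)
weight = 4.0 * len(events) / len(mixed)
print(len(same_event), len(mixed), round(weight, 4))
```

The subtraction itself is then performed histogram bin by histogram bin: the mixed-pair distribution, scaled by `weight`, is removed from the same-event distribution, leaving an estimate of the correct-pair spectrum.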
The energy spectra, once effectively rid of the contribution from pairs of b quarks coming from different gluinos, were fitted in a region around their peak with the function eq. (2.8), which is taken from our work Ref. [27] and briefly described in Sec. 2. For comparison, these energy spectra were also fitted with the function eq. (2.7), which was used in our original paper on the peak of energy spectra of massless particles. The comparison of the results from fits with the two functions highlights the improvement achieved by our new result for massive decay products. The better description of the energy spectra with the function eq. (2.8) can be seen in the comparison of the χ 2 for the various fits performed, as reported in Figure 6.
The result of the fits to the energy spectra is the extraction of the parameter E*_bb, which is the energy of the system of the two b quarks in the rest frame of the gluino that produced them. The determination of E*_bb is the core of our procedure, as this value is connected to the masses of the gluino and the neutralino via eq. (5.2), E*_bb = (mg² − mχ² + m_bb²)/(2 mg), for a pair of b quarks with invariant mass m_bb. The results of the extraction of E*_bb for several choices of the mass of the two-b-quark system were shown in Figure 8. The determinations of E*_bb for each m_bb were fitted with the straight line of eq. (5.2) given above. This fit is essentially our mass measurement, as the gluino mass is determined by the slope of the line and the neutralino mass by its constant term. The resulting mass measurement was given in eq. (5.7); it was found to be within 20% of the gluino mass but gave a rather poor determination of the neutralino mass. This inaccuracy is mainly due to the fact that in each fit of the energy spectra we tend to overestimate the energy E*_bb. In Sec. 5.2, we studied several possible causes of this error and identified the modest shape changes due to the mixed-event subtraction as the primary source.
In order to improve the mass measurement, we then studied how including information about the endpoint location in the m bb distribution altered the quality of the mass measurements. If well-measured, this quantity should correspond to the mass difference between the gluino and the neutralino, and hence is expected to aid in the measurement of these two masses. Indeed, a much more accurate measurement was obtained using information from this observable, as shown in eq. (5.11).
In conclusion, we have demonstrated the use of energy distributions, and in particular the region close to their peaks, to measure the masses of new physics particles involved in a single-step multi-body decay. Taking the example of gluino production and decay in an R-parity conserving supersymmetric model, we have found that, using only the visible decay products of the gluino decay g̃ → bbχ, it is possible to measure with good accuracy the masses of both the gluino and the neutralino, with the best results obtained when both the energy and the invariant mass distributions of the pairs of b quarks are used. Rather strikingly, the mass measurement technique that we discussed does not actually use information about the missing momentum, except for event selection purposes. The example that we considered also required a proper removal of the effect of pairs of b quarks not coming from the same gluino; to this end, we have shown that the event-mixing technique is especially well suited. We anticipate that the general methodology presented here can be applied to mass measurements in other processes.
We begin with the total number of pairings of b-jets (denoted by N_NC) in a data sample consisting of n_s signal events and n_b background events. Since there are six possible pairings of the four b-jets in each event, we have

N_NC = 2n_s + 2n_b + 4n_s + 4n_b ,   (A.1)

where the last two terms represent the total number of wrong pairings of b-jets, denoted by N_WC = 4(n_s + n_b).
As mentioned before, our ignorance of which events are signal and which are background precludes us from performing the event mixing (symbolized by ⊗) only between signal events or only between background events. Therefore, if we perform the event-mixing procedure on the total (n_s + n_b) events, we obtain signal-signal mixing, signal-background mixing, and background-background mixing. Since there are four b-jets in each event, a single event mixing allows up to 4 × 4 = 16 b-jet pairings. Of course, for practical purposes one could use a subset of these 16 pairs, say m pairs. Since m is a common prefactor for every single event mixing, we then need to know the number of event mixings. The total number of signal-signal mixings is given by a binomial coefficient,

n_s ⊗ n_s = C(n_s, 2) = n_s(n_s − 1)/2 ,   (A.2)

and so is that of background-background mixings,

n_b ⊗ n_b = C(n_b, 2) = n_b(n_b − 1)/2 .   (A.3)

Likewise, the number of signal-background mixings, i.e., the interference, is

n_s ⊗ n_b = n_s n_b .   (A.4)

Denoting the total number of event mixings by N_MC = n_s(n_s − 1)/2 + n_s n_b + n_b(n_b − 1)/2, we eventually use m · N_MC b-jet pairs to estimate the distribution of the N_WC wrong pairs formed within the same events. 14 Since in general m · N_MC ≠ N_WC, each mixed pair should be re-weighted in making the "different events" distribution, so as to match the contribution of the N_WC wrong pairs. Keeping this normalization issue in mind, let us first see how the wrong b-jet pairings in the signal events can be treated by the mixed-event subtraction even in the presence of a small background. Since the total number of mixed pairings is normalized to N_WC, the presence of the background affects the weight of the "different event" distribution made from signal events alone. In fact, in the "different event" distribution, once normalized so as to match the contribution of the N_WC wrong pairs, the fraction of b-jet pairs stemming from the signal-signal mixing is given by eq.
(A.2) times a rescaling factor to match the normalization to N_WC, that is,

n_s(n_s − 1)/2 × 4(n_s + n_b) / [n_s(n_s − 1)/2 + n_s n_b + n_b(n_b − 1)/2] ≈ 4n_s (1 − n_b/n_s) ,   (A.5)

where the common prefactor m is omitted for simplicity and the approximation assumes n_s ≫ n_b and n_s ≫ 1. The implication of this result is that the combinatorial background (caused by the signal itself) can be almost completely eliminated with the event-mixing technique, thanks to the dominance of the signal assumed throughout this paper. In the total "different event" distribution, reweighted so as to match the expected number N_WC of wrong pairings, we find that the fraction of b-jet pairs from the background-background mixing is

n_b(n_b − 1)/2 × 4(n_s + n_b) / [n_s(n_s − 1)/2 + n_s n_b + n_b(n_b − 1)/2] ≈ 4n_b (n_b − 1)/n_s .   (A.6)

14 Note that the only quantity we know is n_s + n_b, not n_s and n_b separately.
In the same distribution, the fraction of b-jet pairs from the signal-background mixing is

n_s n_b × 4(n_s + n_b) / [n_s(n_s − 1)/2 + n_s n_b + n_b(n_b − 1)/2] ≈ 8n_b (1 − n_b/n_s + 1/n_s) .   (A.7)

Comparing eq. (A.6) and eq. (A.7), we conclude that the effect of the "interference" is, in general, more important than the contribution from the background-background mixing, as also argued in the main text. Besides corroborating that argument, these equations quantify more precisely the effect of the background, which becomes more important when the mass measurement strategy described in this paper is applied to situations where S/B is less favorable than in our example process.
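The counting in eqs. (A.1)-(A.7) can be verified numerically; a sketch for an illustrative choice of n_s ≫ n_b (the common prefactor m is omitted, as in the text):

```python
# Numerical check of the mixed-pair fractions in eqs. (A.5)-(A.7), in the
# regime n_s >> n_b assumed in the appendix (illustrative event counts).
n_s, n_b = 100000.0, 1000.0

# total number of event mixings, N_MC, and the rescaling to N_WC = 4(n_s+n_b)
n_mc = n_s * (n_s - 1) / 2 + n_s * n_b + n_b * (n_b - 1) / 2
norm = 4.0 * (n_s + n_b) / n_mc

ss = n_s * (n_s - 1) / 2 * norm    # signal-signal fraction, eq. (A.5)
bb = n_b * (n_b - 1) / 2 * norm    # background-background fraction, eq. (A.6)
sb = n_s * n_b * norm              # signal-background "interference", eq. (A.7)

# leading approximations quoted in the text
approx_ss = 4 * n_s * (1 - n_b / n_s)
approx_bb = 4 * n_b * (n_b - 1) / n_s
approx_sb = 8 * n_b * (1 - n_b / n_s + 1 / n_s)

print(ss, bb, sb)
print(approx_ss, approx_bb, approx_sb)
```

By construction the three exact fractions sum to N_WC = 4(n_s + n_b), and for this choice of n_s and n_b each agrees with its approximation at the percent level, with the interference term (sb) indeed dominating over the background-background term (bb).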