Multi-Dimensional Phase Space Methods for Mass Measurements and Decay Topology Determination

Collider events with multi-stage cascade decays fill out the kinematically allowed region in phase space with a density that is enhanced at the boundary. The boundary encodes all available information about the spectrum and is well populated even with moderate signal statistics due to this enhancement. In previous work, the improvement in the precision of mass measurements for cascade decays with three visible and one invisible particles was demonstrated when the full boundary information is used instead of endpoints of one-dimensional projections. We extend these results to cascade decays with four visible and one invisible particles. We also comment on how the topology of the cascade decay can be determined from the differential distribution of events in these scenarios.


Introduction
Naturalness of the Higgs sector as well as the weakly interacting massive particle (WIMP) paradigm for dark matter provide strong motivations for new physics at the TeV scale. The most commonly studied extensions of the Standard Model (SM) that attempt to solve the hierarchy problem do so by positing the existence of partners to the SM particles that cancel divergent contributions to the Higgs mass. Many of these scenarios also provide a dark matter candidate since they incorporate a parity symmetry under which the partner particles are odd, making the lightest partner particle stable. Arguably the best known example for such scenarios is the minimal supersymmetric extension of the SM, the MSSM.
The collider phenomenology of these scenarios has been studied extensively in the literature. The most promising discovery channels include production of colored partners, which then decay, often in multiple stages, until the lightest partner is reached. Since the lightest partner is assumed to constitute dark matter, it leaves the detector without interacting. Thus no resonances can be constructed from the visible decay products and discovery as well as mass measurement prospects often rely on endpoints of one-dimensional distributions of Lorentz invariant (e.g. edges, endpoints) [1][2][3][4][5][6][7][8][9][10][11][12] (for a comprehensive review, see [13]) or boost invariant (e.g. m T , m T 2 ) variables [10,.
The Large Hadron Collider (LHC) has completed its 7 and 8 TeV runs and is currently running with a center of mass energy of 13 TeV. The LHC experiments currently do not have significant indications of physics beyond the SM. Considering that the center of mass energy is already near the design value, one needs to take seriously the possibility that if new physics is discovered by the LHC experiments, the signal statistics will remain low, or moderate at best. Therefore, it will be of paramount importance to optimize the methods by which the signal will be studied for low statistics.
Let us consider mass measurement techniques in particular. For cascade decay chains with sufficiently many intermediate on-shell stages, polynomial methods  can be applied to algebraically solve for all unknown masses based on a small number of events. However, there exist decay chains which do not have sufficiently many on-shell stages for these methods to be applicable. For such decay chains, the one-dimensional variables mentioned above are commonly accepted as the tool to be used for mass measurements. It was argued in ref. [60] however that when there are more than two visible particles in the final state, the kinematically accessible region in phase space is multidimensional and the commonly used one-dimensional variables are inefficient at low statistics. It was demonstrated specifically for final states with three visible particles and one invisible particle that the density of events near the boundary of the kinematically accessible phase space is enhanced, and that a determination of this boundary in the multidimensional phase space could yield significantly higher precision and accuracy for mass measurements at low statistics.
In this paper we will extend the conclusions of ref. [60] to the remaining cascade decay topologies where polynomial methods are not applicable. If all on-shell decay stages are 2 or 3-body decays with one invisible particle emitted from the last stage of the cascade, then it is straightforward to show that any cascade decay with more than five final state particles can be analyzed using polynomial methods, therefore we will restrict ourselves to final states with at most five final state particles. We will show that the enhancement in the density of events near the boundary is in fact even stronger for five-body decays compared to four-body decays, and in a number of representative cases for decay topologies we will demonstrate the improvement for mass measurements compared to the more traditional methods based on kinematic edges or endpoints.
Various techniques involving kinematic variables have also been proposed for the purpose of determining decay topologies [61][62][63][64][65][66][67]. We will provide a preliminary assessment of the sensitivity of the full phase space boundary method to the topology, and suggest an algorithm by which the topology underlying a signal sample should be determined.
Our goal will be to provide a proof of principle that these improvements can be obtained, and therefore as in ref. [60] we will compare our methods to those based on kinematic endpoints under ideal circumstances, without SM or combinatorial backgrounds, spin correlations or realistic detector effects. While these certainly pose additional challenges in the construction of a fully realistic analysis, they will deteriorate the results of both our methods and any method based on kinematic endpoints, with no obvious reason why one should be more negatively affected than the other. Also, as in [60] we will restrict our study to "one-sided" events, where the cascade decay takes place on one side of the event, and the other side is assumed to include only the lightest partner. This corresponds to scenarios such as gluino-LSP associated production in the MSSM. The reason for this choice is that our methods use only Lorentz-invariant observables and are therefore used on one decay chain at a time, with no obvious way to combine the two sides of the event using the missing transverse energy (MET) for example. Therefore, for reasons of simplicity, we demonstrate the applicability of our methods in the simplest possible case of one-sided events. The same methods can of course simply be used twice in a symmetric event, but that comes at the cost of combinatoric issues such as identifying which side of the event any final state particle belongs to. We will leave a more realistic study including all these complications to future work. In fact, in parallel to this work, methods are already being developed to address some of these complications, and for one decay topology featured in ref. [60] it has already been demonstrated [68][69][70] that the improvement for mass measurements based on the determination of the full phase space boundary over one-dimensional variables can be maintained in the presence of SM and combinatorial backgrounds, by using Voronoi tessellations.
The layout of the paper is as follows. In section 2 we review the mathematical description of many-body phase space and we quantify the enhancement near the boundary for five-body final states. In section 3 we focus on mass measurements and we set up an analysis to compare the results of mass measurement based on our methods to those obtained from kinematic endpoints. In section 4 we comment on the potential use of our methods for determining the underlying decay topology. We conclude in section 5. Certain details of our methods are more fully described in appendices A through C.

Mathematical Description of Many-Body Phase Space
The standard form of the phase space volume element of n final state particles with 4momenta p µ i and total 4-momentum P µ is expressed as a function of individual components of 4-momenta which are not manifestly Lorentz invariant. There also exists a less well-known formulation which is expressed purely in terms of Lorentz scalars [71,72]. As argued in [60] this form contains important clues to optimizing the sensitivity of mass measurements, therefore we will review it below. We start by defining M n as the n × n matrix with elements p i · p j , and define ∆ i as the coefficients of the characteristic polynomial of M n as follows: The kinematically accessible region of phase space corresponds to ∆ 1,2,3 > 0, ∆ 4 ≥ 0 and ∆ 5,...,n = 0, with ∆ 4 = 0 defining the boundary of this region [72]. For the specific case of n = 4, the volume element is given by where M 2 X = P µ P µ and where the δ-function at the end enforces energy conservation. Note that the volume element scales as ∆ −1/2 4 , diverging near the boundary in an integrable way. This can be understood as follows: ∆ 4 , which for n = 4 is equal to (− det M 4 ) can be rewritten as − det(V T gV ) = det 2 V , where V is the 4 × 4 matrix whose columns are the p µ i and g = diag(1, −1, −1, −1) is the metric. This makes it clear that the boundary of the kinematically accessible region corresponds to the final state momenta becoming linearly dependent. When this happens, the coordinate change from Cartesian coordinates to the Lorentz-invariant coordinates m 2 ij becomes singular and the Jacobian diverges. Note also that the presence of intermediate on-shell particles in the cascade does not change this conclusion, since in the narrow width approximation, these contribute δ-functions to the amplitude squared |M| 2 , the arguments of which are linear in the m 2 ij . Therefore, using these δ-functions to eliminate some of the integrals over m 2 ij never produces nontrivial Jacobian factors.
Going beyond n = 4, the phase space volume element has the form [72] dP S n = (const.) × M −2 Naively, this expression seems to imply that the enhancement in the volume element near the boundary is absent for n > 4. However, a more careful examination reveals that the arguments of the δ(∆ n ) factors are non-linear in the m 2 ij , and therefore nontrivial Jacobians arise as those δ-functions are integrated over.
In order to isolate the scaling of the volume element near the boundary, an alternative expression can be used [72], which locally takes the form: (2.5) Here the e αβ , where 1 ≤ α ≤ β ≤ n − 4, are a set of (n − 4)(n − 3)/2 constraints that are linear in all m 2 ij to first order. The form of the e αβ are complicated, which makes this expression less useful for practical purposes. However since no nontrivial Jacobians arise, the correct scaling ∆ −(n−3)/2 4 is revealed, which results in a stronger and stronger enhancement near the boundary with increasing n. In particular, for n = 5 the volume element scales as ∆ −1 4 . This can be understood in a similar way to our argument above for n = 4: as we approach the three dimensional boundary of phase space, a larger number of 4-vectors must become linearly dependent, and the coordinate change from Cartesian coordinates to the m 2 ij becomes more singular. We have also verified the scaling for n = 5 numerically by generating Monte Carlo data for 5-body decays. Specifically, after restricting ourselves to the physical hypersurface specified by the ∆ 5 = 0 constraint, we have sampled the density of Monte Carlo events in a narrow tube perpendicular to the boundary near randomly chosen points on the boundary. Our results are shown in figure 1 and they confirm the ∆ −1 4 scaling, demonstrated visually by the red line.

Mass Determination
In this section we will compare the results of mass measurements obtained by using the multi-dimensional nature of the kinematically accessible region in phase space to those obtained from the traditional kinematic edges and endpoints. In order to perform this comparison, we introduce quality-of-fit functions, to be described below, for the two methods, and we search for the spectrum that results in the best fit, using Monte Carlo samples of 100 events each for several decay topologies. By finding the best fit spectrum over many samples, and studying the distribution of the best fit masses, we can evaluate the precision and accuracy of the two techniques. We do this by studying representative benchmark spectra in the decay topologies of interest. Our setup is similar to the analysis performed in ref. [60], where final states with three visible particles and one invisible particle were considered. In this paper we extend this to final states with four visible particles and one invisible particle. We use a shorthand notation to classify the topologies of interest by using the multiplicity of final states in each stage of the cascade: for instance "2+2+3" denotes a decay topology where the initial state decays through a 2-body decay, the resulting intermediate particle decays through another 2-body decay, and the intermediate particle resulting from the second stage decays through a 3-body decay, where the final state of the last decay stage includes the lightest partner particle.
We do not consider the 2+2+2+2 topology since it has sufficiently many on-shell intermediate particles to be analyzed by polynomial methods. The four on-shell constraints for the intermediate particles together with the 5-body constraint ∆ 5 = 0 are sufficient to restrict the likelihood function to have support on a set of measure zero in the space of mass spectra. Therefore, the true spectrum can be determined with a finite number of events. The 2+2+3, 2+3+2 and 3+2+2 topologies are very similar, and therefore we will study the 2+2+3 topology as a representative case. We also study the 3+3 topology which is inequivalent to those. We do not consider decay topologies involving a direct 4-body decay. The endpoint formulas in certain topologies have different analytical forms in different regions of the space of spectra (see appendix A). Some forms are functions of mass differences only and cannot contribute to a determination of the overall mass scale, while others do contain some absolute scale dependence. In order to gauge the performance of the endpoint method in a more representative way, we pick two benchmark mass spectra for the 2+2+3 topology. The benchmark mass spectra are listed in table 1. In particular, benchmark spectrum 2 is expected to be less sensitive to the overall mass scale, as both the m 2 1234 and m 2 234 endpoints depend only on mass differences (see the last lines of equations A.5 and A.6), while the endpoint formulas for benchmark spectrum 1 do have some dependence on the overall mass scale (line 2 of equations A.5 and the first line of equation A.6).
The Monte Carlo events are generated using the phase space routines in ROOT [73]. We also use the optimization routines in ROOT to find the best fit spectrum. We assume that the underlying decay topology is known; we will comment on the question of determining the decay topology in the next section. We start the optimization procedure within a rectangular box in the space of spectra where each mass is varied by ±25% of its correct (1)  500  400  150  100  5  2+2+3 (2)  400  350  300  100  5  3+3 (3)  500  300  -100  5   Table 1. Benchmark mass spectra used in our analysis. For the labeling of the masses in the spectra, see figures 7 and 8 in appendix A.
value (for the multidimensional phase space method) or up to several TeV (for the endpoint method). We perform a random scan inside this box to find the best fit spectrum. We then refine the best fit spectrum using the simulated annealing algorithm. As mentioned in the introduction, an important caveat in our methods is that all spin correlations are ignored, in other words we use isotropic decays in our Monte Carlo events, and in the quality-of-fit variables described below. Therefore, in the presence of spin correlations, the specific quality-of-fit variable described below for the multidimensional phase space method may develop biases. However, we reiterate that the main purpose of our study in this paper is to provide a proof of principle that multidimensional phase space methods can provide an improvement over kinematic endpoints for mass measurements. Fundamentally, all the information about the spectrum is encoded in the shape of the boundary of the kinematically accessible region in phase space, not in the distribution of events, which will have additional dependence on the matrix element. The ideal mass measurement analysis would therefore be based on finding the boundary alone, for example by using our methods combined with Voronoi tessellations as has already been done in refs. [68][69][70]. Once the masses have been measured by using the boundary, more sophisticated methods such as matrix element matching can then be utilized for determining the spins of the particles in the decay chain. We proceed with the quality-of-fit variables below mainly to keep the comparison between the two methods as simple as possible for this initial study of the five-body decay chains. We leave a more realistic work incorporating tools such as Voronoi tessellations to future work.

Quality-of-fit variable for the kinematic endpoint method
We define the measured position of a kinematic endpoint as the highest value obtained for the observable in question within the data sample. We construct the quality-of-fit function to quantify the agreement between the measured endpoints and those predicted by the spectrum hypothesis: where Ξ = 1 if all measured endpoints occur at smaller (or equal) values than the predicted ones. If any one of the measured endpoints exceeds the predicted value, the mass hypothesis is rejected (Ξ is taken to be ∞). We consider all possible Lorentz-invariant endpoints, with pairs, triplets, etc. of visible final state particles. All endpoints used in our analysis and their predicted values are listed in appendix A. The best fit mass hypothesis is the one that minimizes Q.

Quality-of-fit variable for the multidimensional phase space method
To quantify the quality-of-fit using the multidimensional phase space method, we introduce a likelihood function. In particular, let L({M i }|data) denote the likelihood for a hypothesis spectrum {M i } given the data. By Bayes' theorem, using a flat prior over spectra, this is proportional to L(data|{M i }), the probability of obtaining the data from the underlying spectrum {M i }. This probability can be factored over the events in the data sample as where {m 2 ij } denote all Lorentz-invariant observables in the event. The form of the L event factors and the details of their calculation are described in appendix B. Ultimately, we bring the likelihood functions for each topology into a standard form where for any given decay topology, the Θ[D] factors encode the kinematically accessible region in phase space, F contains all dependence on the hypothesis spectrum {M i }, and N includes all remaining dependence on the observables in the events. Note that as in the setup for the kinematic endpoint method, spectra for which there exist events that fall outside the (hypothetical) kinematically accessible region are considered excluded. Since the phase space density becomes large near the boundary of the kinematically accessible region, the likelihood function favors spectra where as many events as possible lie near the boundary, with no events lying outside the boundary. The best fit mass hypothesis is the one that maximizes L (to be more precise, we use the logarithm of L).

Analysis and Results
As mentioned above, kinematic endpoint methods are generically much more sensitive to mass differences in the spectrum than to the overall mass scale, parameterized e.g. by the mass of the lightest partner. Therefore, when the statistical distribution of best fit values for the mass of any particle in the spectrum is considered, the spread in the distribution is dominated by the uncertainty in the overall scale. In order to better compare the performance of the two methods to the overall mass scale and to the mass gaps in the spectrum separately, it is preferable to find an alternative parametrization for spectra rather than using the masses of the individual particles. In particular, we parameterize the spectrum in terms of one parameter that sets the overall mass scale, and three other parameters that only depend on mass gaps. For the 2+2+3 topology, we define the dimensionless parameters {α, β, γ, δ} as Similarly, for the 3+3 topology we define {α, β, γ} as Again, α parameterizes the overall mass scale, and similar consideration as above apply in choosing the allowed range for these parameters.
Our results for the 2+2+3 topology are shown in figure 2 for benchmark spectrum 1, in figure 3 for benchmark spectrum 2. The results for the 3+3 topology is shown in figure 4. The mean value and standard deviation of the distributions for α, β, γ and δ are listed in table 2. It is easy to see that the conclusions obtained for the four-body decay topologies [60] continue to hold, namely that the multidimensional phase space method yields both more precise and more accurate results for the overall mass scale as well as for the mass gaps. The reasons for the mean values of the distributions obtained by the kinematic endpoint method to be biased away from the correct masses is similar to those discussed in appendix C of ref. [60] for the four-body final states. Note also that for the 2+2+3 topology, although the α distribution for the kinematic endpoint method of benchmark spectrum 1, which was chosen to have lesser sensitivity on the overall scale, seems to be broader compared to benchmark spectrum 2, this is somewhat misleading. The lower end of the α distribution for benchmark spectrum 2 is cut off by the constraint that all masses in the spectrum be positive numbers, which obscures the true spread in the distribution.  Figure 4. Distribution of the best fit values of α, β and γ (defined in equation 3.9) for the kinematic endpoint method (blue) and the multidimensional phase space method (yellow), using the benchmark spectrum for the 3+3 topology and data samples of 100 events. The true masses correspond to α, β and γ all being zero.

Topology Determination
In this section we will consider how different event topologies may be distinguished from one another by using the distribution of events in phase space. We will consider both 4-body and 5-body final states, since ref. [60] did not consider the question of topology determination. In particular, let {T i } be the set of event topologies that are compatible with the number of observed particles, with one invisible particle assumed to be produced in the last stage of the cascade. We will now write the likelihood function as L(T i , {M i }|data), making the dependence on the topology explicit. As before, with a flat prior over topologies and spectra, the likelihood can be related to the probability of obtaining a given distribution of events from an underlying topology (4.1) We can now use the likelihood functions listed in appendix B, in the standard form As for the analysis for mass measurements, we adopt log L as the quality of fit variable.
Maximizing over spectra as before, statistical statements (such as exclusion with a given Multidimensional Phase Space Kinematic Endpoints 2+2+3 Topology Benchmark Spectrum 1 m X (GeV) 500 ± 1 543 ± 24 Table 2. The mean value and standard deviation of the distributions masses in the spectrum as well as of the parameters α, β etc. for the two benchmark spectra of the 2+2+3 topology, and for the benchmark spectrum of the 3+3 topology, for data samples of 100 events.
confidence level) can then be made about a number of possible topology hypotheses based on the data. We will not attempt to perform a detailed analysis of this type in this work, since the idealizations we work with, such as perfect energy resolution and the absence of backgrounds and combinatoric effects, would render the conclusions unreliable. Nevertheless, one general conclusion can be drawn immediately: When a topology hypothesisT contains more on-shell particles than the "true" topology T , it can be ruled out (for any spectrum) with a very small number of events. Indeed, for the hypothesis T , the optimization over mass spectra will be trying to enforce an on-shell constraint among the visible particles where no such constraint is actually obeyed by the data. In general, there is no reason for a constraint that appears to be satisfied by one event to also be satisfied by any other. Conversely, a choice forT that contains fewer on-shell particles than T , while it cannot be ruled out completely, will generally result in a significantly lower  Figure 5.
For data samples of 100 events each generated with the spectrum (500, 350, 200, 100) GeV, the distribution of log-likelihood values for the 2 + 2 + 2, 2 + 3 and 3 + 2 topology hypotheses where the likelihood is maximized over spectra for each sample and each topology hypothesis. likelihood than when the correct topology hypothesis is used, sinceT will not provide a very good fit to the distribution of events in the data.
Let us demonstrate this on a specific example. The 2 + 2 + 2 topology with the spectrum (500, 350, 200, 100) GeV was used to generate Monte Carlo samples of 100 events each, and for the topology hypotheses 2 + 2 + 2, 2 + 3, 3 + 2 and 4, all possible spectra were scanned until the spectrum with the highest likelihood was found for each sample. Note that unlike in the analysis in section 3, for an incorrect topology hypothesis there is no "correct" mass point to center the scan region on, therefore we scan the spectra over a larger region where each mass is varied between zero and several TeV. The distribution of the best-fit log-likelihood over samples for each topology hypothesis is shown in figure 5. In accordance with our expectations, the 2 + 3 and 3 + 2 topologies with fewer on-shell particles result in a poor fit, and the correct topology results in the highest likelihoods.
It should be noted that for certain incorrect hypotheses, there exists a runaway direction in the space of spectra {M i }, namely the likelihood increases as all masses go to infinity with fixed mass gaps. This happens for instance when a direct 4-body decay topology hypothesis is used in the example above. In addition to being completely unphysical (which is why they are not plotted in figure 5), the likelihood values for this topology hypothesis in any case turn out to be smaller than for the other topology hypotheses. Runaway directions do not exist for the correct topology hypothesis, and therefore the presence of a runaway direction can be used to rule out a topology hypothesis.
Based on the above considerations, a rather general conclusion can be reached that when analyzing a given data sample, the correct topology is among those hypotheses that have the highest number of on-shell particles and that are not immediately ruled out. If there is only one such hypothesis (2 + 2 + 2 in the above example), then it must be the correct one. Things are more interesting when there are competing hypotheses with the same number of on-shell particles.
For the final states with three visible particles and one invisible particle, the following  Figure 6. For data samples of 100 events each generated with the spectrum (500, 350, 100) GeV and the 2 + 3 hypothesis, the distribution of log-likelihood values for the 2 + 3 and 3 + 2 topology hypotheses where the likelihood is maximized over spectra for each sample and each topology hypothesis.
outcomes are therefore possible: • The data does not rule out the 2+2+2 topology hypothesis, which is then established as the correct one.
• The data is not compatible with the 2+2+2 topology hypothesis but it is compatible with the 2 + 3 and 3 + 2 topology hypotheses. This is the only nontrivial case that can arise with this number of final state particles, and a statistical analysis would be needed to find the best fit topology hypothesis. An example of this is demonstrated in figure 6, where the log-likelihood distributions are plotted for the two competing hypotheses for data samples of 100 events each, generated with the 2 + 3 topology and the spectrum (500, 350, 100) GeV. The log-likelihood distribution clearly favors the correct topology.
• The data is only compatible with a direct 4-body decay hypothesis, which is then established as the correct topology.
Similarly, for the final states with four visible particles and one invisible particle, the possible outcomes are: • The data is compatible with the 2+2+2+2 topology hypothesis, which consequently must be the correct one.
• The data rules out the 2 + 2 + 2 + 2 topology hypothesis but it is compatible with the 2 + 2 + 3, 2 + 3 + 2 and 3 + 2 + 2 topology hypotheses. Since these have the same number of on-shell particles, a statistical analysis would need to be performed to determine the correct topology hypothesis. We have performed a numerical study of this scenario with samples of 100 events each, generated with the 2+2+3 topology and the spectrum (500, 350, 200, 100) GeV. The log-likelihood distribution not only favors the correct topology but in fact the incorrect topology hypotheses are ruled out completely since no spectrum can be found that is consistent with the data.
• If the data is not compatible with any of the above, then the 3+3 topology hypothesis is the most likely fit, though technically 4 + 2 or 2 + 4 are also potential topology hypotheses since they have the same number of on-shell intermediate particles. It is rare for particles in beyond the SM scenarios to not have any 2-body or 3-body decay channels such that the dominant decay mode is a direct 4-body decay, but from a purely model-independent point of view this should not be discarded off-hand and a likelihood analysis should be performed as in the above examples.
• While extremely unlikely from a theoretical point of view, there is also a possibility that none of the above topology hypotheses provide a good fit such that a direct 5-body decay topology hypothesis may need to be considered.

Conclusions
With the LHC already operating close to its design energy, it is not unreasonable to expect that even if new physics is discovered, the signal will not have high statistics. Earlier work [60] demonstrated that for limited signal statistics, kinematic endpoints are inefficient for mass measurements in cascade decays with three visible particles and one invisible particle, and that a determination of the phase space boundary in its full dimensionality can lead to significant improvement. This conclusion was borne out further with a subsequent study with a more realistic analysis [70], using the method of Voronoi tessellations [68,69] to find the boundary of the signal region in the presence of background. In this paper we explored additional decay topologies, including those with four visible particles and one invisible particle, and we have shown that the enhancement in the density of events near the boundary of the kinematically allowed region not only persists, but is even stronger. We have also demonstrated the improvement in mass measurements that can be obtained with these methods on several benchmark decay topologies, for which polynomial methods are not applicable. We have performed this comparison in a very idealized setup, mainly as a proof of principle; however there is no reason to expect that in a more realistic analysis the results obtained by the methods presented in this paper should degrade more than traditional methods based on kinematic endpoints. As has already been done in the case of 4-body final states [70], it should be possible to verify whether our conclusions continue to hold using a more realistic analysis. Finally, we have explored the possibility of determining the underlying decay topology using our methods, and we concluded that at least in principle topology determination is achievable. The construction of a more realistic analysis both for mass measurements and for topology determination will be performed in future work.

A Endpoint formulas
In this appendix, we list the formulae for the endpoints used in the analysis of section 3. Additional details and derivations can be found in [12]. We work in the limit of massless visible final state particles (except for the lightest partner) for which simple expressions for the endpoints are available. Numerical verification shows that including small masses has a negligible effect on the endpoint positions.

A.1 2+2+3
The labeling of the particles is illustrated in figure 7. There are eight endpoints for this topology. The positions of the following four endpoints are spectrum independent: The positions of the remaining four endpoints are given by expressions that depend on the spectrum: (A.8)

A.2 3+3
The labeling of the particles is illustrated in figure 8. There are six endpoints for this topology. The positions of the following four of the endpoints are spectrum independent: The positions of the remaining two endpoints are given by expressions that depend on the spectrum: (A.14)

B Likelihood functions
In this appendix, we will derive analytical expressions for the likelihood functions that we use in our analysis. We treat all particles as spin-0 and we work in the narrow width approximation for any on-shell intermediate states. For a given data sample, we define the likelihood function as the probability that these events were produced from a certain underlying event topology with a spectrum {M i } of intermediate on-shell states. Using Bayes' theorem with a flat prior over spectra, one can relate this to the probability of obtaining a given distribution of events from a given spectrum To capture the multidimensionality of the phase space, we choose L event factors to be normalized fully differential decay widths, integrated over all unobservable m 2 ij involving the lightest partner 1 . The differential decay width is simply the product of the squared matrix element and the phase space volume element (see equation 2.4): Since we treat all particles as spin-0, the matrix element squared only contains factors of effective couplings for each decay stage and propagators that are simplified using the narrow width approximation. Therefore, L event breaks up into factors for each on-shell stage of the cascade decay. Note that each decay stage involves one heavy particle of mass M i that decays to another heavy particle M i+1 and a number of light particles, assumed massless. The energy-momentum conserving δ-functions and factors of 1/Γ arising from the narrow width approximation for each intermediate on-shell state are also combined with the vertices that they are attached to. See ref. [60] for additional calculational details. For 2-and 3-body decay stages, the width of the decaying particle is given by where µ and λ are the effective couplings of the 2-and 3-body decay vertices (of mass dimension 1 and 0, respectively), and r is the ratio of the heavy daughter mass to the mass of the decaying particle in that decay stage. With the correct normalization, the phase space factors for the 4-or 5-body final state in terms of Lorentz invariants are given by dP S n = M −2 X 2 8 π 6 ∆ −1/2 4 n = 4 2 11 π 9 δ(∆ 5 ) n = 5 encodes overall energy conservation. Here M X is the mass of the decaying particle, m LP is the mass of the lightest partner particle at the end of the decay chain, and the remaining m i are the masses of the light particles in the final state, which we set to zero in our analysis. Performing the integration over the unobservable m 2 ij , we bring the likelihood functions into a standard form where for any given decay topology, the Θ[D] factors encode the kinematically accessible region, F contains all dependence on the spectrum {M i }, and N includes all remaining dependence on the observed Lorentz invariants m 2 ij as well as on numerical factors. In tables 3 and 4 we present the likelihood functions for all 4-and 5-body decays consisting of 2-and 3-body decay stages. We express the results in terms of the kinematic functions λ n with n(n + 1)/2 arguments, defined as the determinant of a (n + 2) × (n + 2) symmetric matrix [71] given by x n(n+1)/2 1 x n x n(n+1)/2 0 1 Note that λ 2 is the triangle function that appears in 2-body decays, and λ n is proportional to ∆ n in an n-body decay. The likelihood functions can be expressed in terms of ∆ i 's by using the invariant masses m 2 ij of pairs of particles, or in terms of λ n 's by using the invariant masses of larger collections of particles as shown in tables 3 and 4. For this reason, the usage of the λ n is superior at making the dependence on the masses of on-shell mediators higher up in the decay chain more explicit, and we adopt this notation in reporting our results.
The structure of the D factors has interesting properties as well, which we describe in more detail in appendix C.

C Factorization of the domain function
In this appendix, we will further study the structure of the factors in the likelihood function encoding the kinematically accessible region of phase space. Any cascade decay can be broken down into a number n s of stages, each stage corresponding to the presence of an onshell intermediate particle. Let us explore how this structure is related to the factorization of the domain function. In particular, consider the i-th stage as a heavy particle X i decaying to another heavy particle X i+1 and a number n i of SM particles. The domain function cannot depend on whether the n i particles are emitted promptly from the decay vertex, or whether the decay proceeds as X i → X i+1 Σ i , with Σ i being a metastable particle that much later decays into the n i SM particles 2 . In the likelihood function, an essential property of the domain function is to ensure that the full cascade X 1 → Σ 1 . . . Σ ns X ns+1 can proceed, where X ns+1 is assumed to be the lightest partner particle which is stable.  This consideration gives the key to the factorization of the domain function. There is always a "skeleton factor" corresponding to n s consecutive 2-body decays, with the Σ i and the lightest partner as final state particles. The skeleton factor cannot be factorized further, and it depends on the spectrum of the X i . In addition there are a number of other factors that have to do with the decays of the Σ i and these factors depend only on the m 2 ij of the final state SM particles, but not on the spectrum of the X i . Since the m 2 ij are actually observed in the data, they correspond to a physical configuration of particles and therefore these factors in the domain function can never become negative. In other words, for computing the domain function in the likelihood, only the skeleton factor is nontrivial. The exact form of the remaining factors also depends on the order in which the integrals over the m 2 ij are performed. For a concrete example, consider the 2+3 decay topology, where the labeling of the particles is given in figure 9. m 2 14 , m 2 24 and m 2 34 cannot be measured and they need to be integrated over. There are however only two δ-functions arising from on-shell particles X and Y in the narrow width approximation. The phase space volume element also includes a factor of ∆ −1/2 4 . Since ∆ 4 = − det M 4 , it is quadratic in all of the m 2 ij . After using If the last integral is chosen to be over m 2 34 , then the discriminant can be factored into two factors D A and D B , where D A is the determinant of the 3 × 3 matrix, the entries of which are dot products of pairs of the four momenta p µ 1 , p µ 2 and p µ 3 . Similarly, D B is the determinant of the 3 × 3 matrix, the entries of which are dot products of pairs of the four momenta p µ 1 , (p µ 2 + p µ 3 ) and p µ Z . D B can then be recognized as the skeleton factor, with particles 2 and 3 grouped together into a fictitious Σ particle, while D A depends only on the measured m 2 ij . This structure is reflected in the D entry for the 2+3 topology in table 3, with the λ 3 functions corresponding to D A and D B .