Identifying Dark Matter Event Topologies at the LHC

Assuming dark matter particles can be pair-produced at the LHC from cascade decays of heavy particles, we investigate strategies to identify the event topologies based on the kinematic information of final state visible particles. This should be the first step towards measuring the masses and spins of the new particles in the decay chains including the dark matter particle. As a concrete example, we study in detail the final states with 4 jets plus missing energy. This is a particularly challenging scenario because of large experimental smearing effects and no fundamental distinction among the 4 jets. Based on the fact that the invariant mass of particles on the same decay chain has an end point in its distribution, we define several functions which can distinguish different topologies depending on whether they exhibit the end-point structure. We show that all possible topologies (e.g., two jets on each decay chain or three jets on one chain and the other jet on the other chain, and so on) in principle can be identified from the distributions of these functions of the visible particle momenta. We also consider cases with one jet from the initial state radiation as well as off-shell decays. Our studies show that the event topology may be identified with as few as several hundred signal events after basic cuts. The method can be readily generalized to other event topologies. In particular, event topologies including leptons will be easier because the end points are expected to be sharper and there are more distinct invariant mass distributions from different charges.


Introduction
The Large Hadron Collider (LHC) has started taking data and many beyond the standard model (SM) theories will be tested. Many of these new models contain dark matter candidates whose stability is protected by some unbroken Z_2 symmetry. Examples are the R-parity in the supersymmetric model [1] and the KK-parity in models with Universal Extra Dimensions (UEDs) [2,3,4]. In these models, the dark matter particle is the lightest Z_2-odd particle and may be produced in pairs at the LHC via cascade decays of heavier Z_2-odd particles. At colliders, the dark matter particles escape the detectors and result in missing energy in the events. Knowing the missing particle mass and its interactions with the SM particles is crucial for testing whether it can indeed be the major component of the dark matter in the universe.
Because only the transverse part of the sum of the missing particle momenta can be measured at hadron colliders, the kinematics cannot be fully reconstructed on an event-by-event basis. It is a challenging task to measure the dark matter particle mass at hadron colliders. Nonetheless, there has been a lot of progress in developing new techniques to determine the dark matter particle mass at hadron colliders despite the partially unknown dark matter particle momenta (see [5] for a review).
On a single decay chain, one can use the end points of visible particle invariant mass distributions to obtain relations among the masses of the particles on the same decay chain [6,7,8,9,10,11,12,13,14].
If one uses both decay chains, the end point of the variable M_T2 can also be used to determine missing particle masses [15,16,17,18,19,20,21,22,23,24,25,26,27]. If the decay chains are long enough and the event topology is known, one can even solve the kinematics of the full system by combining events [28,29,30,31,32].
In almost all of those analyses, the event topology has been assumed to be known from the beginning. However, it frequently happens that the same final states can come from different event topologies arising from the same model or different models. If a wrong topology is assumed, these mass determination methods will not give sensible results. Therefore, the first step towards measuring the dark matter mass should be identifying the dark matter event topology. In this paper, we are trying to clear the topology ambiguities and close the gap between the real experimental data and the different dark matter mass measurement techniques.
For a given final state, there are discrete choices of corresponding event topologies. The kinematic distributions are in general functions of the masses, spins and couplings of all particles in the process, in addition to the event topology. Therefore, to distinguish topologies we would like to use binary information in these distributions which is independent of the details of the model. An important observation is that the kinematics of the visible particles from a single decay chain is constrained by the mother particle mass, and hence the invariant mass distributions of these visible particles will have end points. On the other hand, the invariant mass combinations of particles from different chains are not expected to have end points. By checking whether an end point exists, one can tell if the objects in the corresponding invariant mass combination belong to the same decay chain. The task is more difficult if all the visible objects are jets. Not only are there fewer different types of invariant mass combinations, but the jets also suffer more from the experimental smearing effects. In this paper, rather than exhausting all possible final states we choose the quite challenging 4 jets plus missing transverse energy (MET) scenario as a case study. This final state can have a large production cross section at the LHC and a good discovery chance even with early LHC data [33,34]. Without considering the case in which two jets are from W or Z boson decays, there are three different topologies for 4j + E̸_T even without the contamination of the initial state radiation (ISR): two jets on each decay chain (2⊕2); three jets on one chain and the other jet on the second chain (3⊕1); all four jets on a single chain and only the dark matter particle on the other chain (4⊕0).
The 4⊕0 topology in principle can be identified by looking at the invariant mass distribution of all four jets, since only 4⊕0 can have an end point under this function. To identify the other cases one needs to cleverly define certain functions of the 2-jet or 3-jet invariant masses which can preserve the end-point structure given that we do not know which 2 jets or 3 jets to use a priori.
In our detailed study, we consider the case in which the four jets have similar energies so that they cannot be divided into different groups based on their E_T values. We include the detector effects in the simulations as well as the initial and final state radiation to make the study more realistic.
Furthermore, we will consider two other cases which give rise to the same final state: two jets on one chain and one jet on the other chain with the fourth jet coming from ISR; three jets on one chain and zero jet on the other chain with the fourth jet from ISR. We will show that with the help of two additional functions those two topologies may also be distinguished.
Our paper is organized as follows. In Section 2, we first define four invariant-mass functions for distinguishing the 4j + E̸_T event topologies based on some theoretical motivations, and we perform detailed parton-level studies to show that they indeed exhibit the expected behaviors.
In Section 3 we perform the realistic particle-level analysis on these functions. Even though the showering, hadronization, and the detector smearing make the end points harder to identify, we show that by fitting the slopes of the distributions and comparing the orders we can still distinguish different topologies. We also study the two ISR cases: 2⊕1⊕ISR and 3⊕0⊕ISR in Section 3.1 and off-shell decays in Section 3.2. The cases with mixtures of topologies are considered in Section 3.3. We then discuss strategies to identify event topologies for other final states and conclude our paper in Section 4.

Kinematic Functions for 4j + E̸_T Event Topology Identification
In this section, we construct functions of visible particle momenta which can be useful to identify event topologies. We use the 4j + E̸_T final state for a detailed case study. Similar functions can easily be constructed for other final states. As motivated by the SUSY-like models, we consider that the signal events come from the pair-production of heavy parity-odd particles, which then go through cascade decays that end at the dark matter particle. We will first consider the case in which all 4 jets come from heavy parity-odd particle cascade decays and defer the cases with one jet from ISR to a later section.
Under these assumptions, there are three topologies for this final state: four jets from one chain and no jet from the other chain; three jets from one chain and one jet from the other chain; and two jets from each chain. The Feynman diagrams for those three cases are depicted in Fig. 1. In these diagrams, we denote the dark matter particle as χ and the decaying particles in the cascading chains as Y_i. The intermediate particles in each decay chain can be either on-shell or off-shell, both of which will be covered in the following sections.
Although we do not restrict ourselves to any particular model, we note that all three topologies can arise in the popular SUSY scenario. In SUSY models, the lightest neutralino is usually the lightest supersymmetric particle (LSP) and also the dark matter particle. The 4⊕0 case can come from the associated production of the lightest neutralino and gluino with the gluino going through the cascade decays: pp → g̃ + χ with g̃ → ũ_L + 1j → χ_2 + 2j → ũ_R + 3j → χ + 4j. The 3⊕1 case can arise from the squark pair production as pp → ũ_R + ũ_L with two squarks decaying as ũ_R → χ + j and ũ_L → χ_2 + 1j → ũ_R + 2j → χ + 3j. Finally, the 2⊕2 case can come from the gluino pair production, pp → g̃ + g̃, with the same decaying processes on both chains as g̃ → ũ_R + 1j → χ + 2j. Whether any of these processes occur and dominate the 4j + E̸_T signal events depends on the spectrum and details of the specific model. It is also possible that more than one process gives comparable contributions to the particular final state. In this section we are looking for functions which can identify the event topologies, so we will consider the idealized case where all events come from a single process, and leave the more complicated situations to Section 3.3 after we discuss the realistic implementation of the strategy.
Figure 1: The three event topologies, 4⊕0, 3⊕1 and 2⊕2, for 4j + E̸_T without specifying the spins of particles. The production mechanisms are not specified here and are represented by a circle. The dark matter particle is denoted by χ. The intermediate particles in each chain can be either on-shell or off-shell.
For illustration, when we define our kinematic functions below, we demonstrate them with a SUSY spectrum in which the LSP mass is m_χ = 200 GeV and, in each decay, the mother parity-odd particle is heavier than the neighboring daughter parity-odd particle by 200 GeV, for each topology. The jets coming from these on-shell decays will have similar energies and hence are indistinguishable. This is the most challenging scenario. If instead there are large hierarchies among these four jets, we can consider separately the invariant mass combinations of the jets based on their energy hierarchies and obtain more handles on whether the harder jets and/or softer jets come from the same decay chain. The partonic events are generated with the MadGraph/MadEvent [35] package for the 14 TeV center of mass energy of the LHC with the CTEQ 6L1 [36] parton distribution functions.
The kinematic functions that we are looking for need to be able to distinguish different event topologies. As mentioned in the Introduction, the invariant mass combinations of visible particles coming from the same decay chain are constrained by the mass of the decaying mother particle and hence have end points in their distributions, while the invariant mass combinations of particles from different decay chains are not expected to exhibit the end-point structure. Therefore we focus on functions of various invariant mass combinations of the 4 jets. The end-point formulae for various invariant mass combinations of particles from the same decay chain are given in Appendix A. We found that the invariant mass distributions are sufficient and fast enough to achieve our goal. It might be worth exploring more complicated strategies and other kinematic variables to improve the results.
To identify the 4⊕0 topology, the obvious function to use is the total invariant mass distribution of all four jets. We should anticipate a sharp end point for the 4⊕0 case but not for the other two cases. Therefore, we define the first function, which is specifically sensitive to the 4⊕0 topology, as
$$ F_1(p_1, \dots, p_4) = \mathrm{inv}[1, 2, 3, 4] . \tag{1} $$
Here, p_i is defined according to the ordering of jet E_T in each event and inv[ , · · · , ] means the total invariant mass of all momenta in the bracket, $\sqrt{(\sum_i p_i)^2}$. The parton-level distributions of F_1 are shown in Fig. 2.
For the 3⊕1 topology, we have found two potentially useful functions. The first one is the smallest 3-particle invariant mass combination:
$$ F_2(p_1, \dots, p_4) = \min\{\mathrm{inv}[1,2,3],\ \mathrm{inv}[1,2,4],\ \mathrm{inv}[1,3,4],\ \mathrm{inv}[2,3,4]\} . \tag{2} $$
The minimum of the invariant masses of the four combinations has a high probability of picking the set of three jets on the same chain. Even if it sometimes picks the wrong set, it will not exceed the expected end point since it is smaller than the correct combination. Plotting the event numbers as functions of F_2, one may expect a sharp end point for the 3⊕1 case but not for the 2⊕2 case. An end point for the 4⊕0 topology is also expected since all visible particles come from the same chain. Another function uses 2-particle invariant masses. Since the invariant mass of 2 particles from different chains can be very large, we consider the invariant mass of the pair of particles which is opposite to the pair that forms the largest invariant mass:
$$ F_3(p_1, \dots, p_4) = \mathrm{inv}[i, j] \quad \text{with } \epsilon_{klij} \neq 0 \text{ and } \mathrm{inv}[k, l] = \max_{m,n} \mathrm{inv}[m, n] . \tag{3} $$
Here, i, j, k, l, m, n = 1, · · · , 4 and ε_klij is the totally antisymmetric tensor. For the 3⊕1 case, this pair of particles has a large chance to come from the same chain. Even if occasionally they come from different chains, their invariant mass is bounded by an invariant mass from the same chain and hence will not exceed the corresponding end point. On the other hand, this combination for the 2⊕2 case is likely to come from opposite chains and is not expected to have an end point. The event distributions of these two functions are shown in Fig. 3, where one can see that both 4⊕0 and 3⊕1 topologies have obvious end points at around 600 GeV for F_2 and around 400 GeV for F_3. (The exact expected values can be found in Appendix A.) On the contrary, the 2⊕2 distributions have long tails without end points. In the next section we find that F_2 seems to work better than F_3 after the experimental smearing. However, F_3 may still be useful for some other final states (e.g., leptons which do not suffer too much from smearing effects).
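The first three functions can be sketched in a few lines of Python. This is a minimal illustration following the verbal definitions above (total 4-jet mass, smallest 3-jet mass, and the pair opposite to the largest-mass pair); the representation of a jet as an (E, px, py, pz) tuple, the helper name `inv`, and the toy event are assumptions made for the example and are not taken from the analysis code of the paper.

```python
import math
from itertools import combinations


def inv(*momenta):
    """Invariant mass of the sum of four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (sum(component) for component in zip(*momenta))
    # Clip tiny negative values caused by floating-point rounding.
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def f1(jets):
    """F_1: total invariant mass of all four jets."""
    return inv(*jets)


def f2(jets):
    """F_2: smallest of the four 3-jet invariant masses."""
    return min(inv(*trio) for trio in combinations(jets, 3))


def f3(jets):
    """F_3: invariant mass of the pair opposite to the largest-mass pair."""
    pairs = list(combinations(range(4), 2))
    k, l = max(pairs, key=lambda p: inv(jets[p[0]], jets[p[1]]))
    i, j = (n for n in range(4) if n not in (k, l))
    return inv(jets[i], jets[j])


# Toy event with four massless jets (made-up numbers, in GeV).
jets = [(100.0, 100.0, 0.0, 0.0), (100.0, -100.0, 0.0, 0.0),
        (50.0, 0.0, 50.0, 0.0), (50.0, 0.0, 0.0, 50.0)]
print(f1(jets), f2(jets), f3(jets))
```

Filling one histogram per function over all selected events then gives distributions of the kind discussed in the text.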
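To identify the 2⊕2 topology, we define the following function which is sensitive to this topology:
$$ F_4(p_1, \dots, p_4) = \min\{\max(\mathrm{inv}[1,2], \mathrm{inv}[3,4]),\ \max(\mathrm{inv}[1,3], \mathrm{inv}[2,4]),\ \max(\mathrm{inv}[1,4], \mathrm{inv}[2,3])\} . \tag{4} $$
For each event there are 3 ways to pair those four jets. One first chooses the pair with the larger invariant mass for each way of pairing, and then takes the minimum of these three larger invariant masses. For the 2⊕2 topology, both pairs in the correct pairing come from a single chain, so F_4 cannot exceed the corresponding end point. The distributions of F_4 are shown in Fig. 4.
At the parton level, one can see that the three different topologies can be easily identified by checking whether there are end points in those four functions F_1-F_4. However, once showering, hadronization, and other experimental smearing effects are included, these end points become less distinct and more concrete strategies need to be developed to identify the topologies, which is the subject of the next section.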
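The pairing logic of the fourth function (for each of the three pairings keep the larger pair mass, then take the smallest of those) can be sketched as below. The tuple layout (E, px, py, pz), the helper `pair_mass` and the toy jets are illustrative assumptions, not part of the original study.

```python
import math


def pair_mass(p, q):
    """Invariant mass of two four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p, q))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def f4(jets):
    """F_4: minimum over the three pairings of the larger pair mass."""
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    return min(
        max(pair_mass(jets[a], jets[b]), pair_mass(jets[c], jets[d]))
        for (a, b), (c, d) in pairings
    )


# Toy event with four massless jets (made-up numbers, in GeV).
toy_jets = [(100.0, 100.0, 0.0, 0.0), (100.0, -100.0, 0.0, 0.0),
            (50.0, 0.0, 50.0, 0.0), (50.0, 0.0, 0.0, 50.0)]
print(f4(toy_jets))
```

In this toy event the back-to-back pairing gives a large mass of 200 GeV, and the other two pairings are the ones that set the value of the function.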

Dealing with Realistic Particle-Level Event Distributions
In reality, the kinematic distributions proposed in the previous section will receive significant experimental smearing effects from showering, hadronization, detector resolutions, backgrounds, and so on.
It is important to check whether the distinctive features of these functions can survive the smearing effects. We also need an unambiguous procedure to identify the event topology from these kinematic functions rather than just looking for end points by eye.
To include the experimental smearing effects for our particle-level analysis, we first generate parton-level events using MadGraph/MadEvent as before. We then process the parton-level events with Pythia [37] for showering and hadronization including initial and final state radiation, and PGS [38] (with the default CMS detector card) for the detector simulation. Some basic cuts are imposed on all signal events after the detector simulation.
We start with the same spectra used in the previous section with all on-shell decays and generate 10000 events for each topology. We require at least four jets with E_T > 100 GeV and missing transverse energy E̸_T > 200 GeV in each event. This set of cuts is used only for illustration, and our following analysis is insensitive to those cuts. The signal selection efficiencies of 4⊕0, 3⊕1 and 2⊕2 are 19.3%, 10.1% and 13.1%, respectively.
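The basic selection quoted above can be written as a small predicate. The sketch below assumes that each event is already reduced to a list of jet E_T values and a missing E_T value in GeV; the function names and the toy events are hypothetical and serve only to show how a selection efficiency would be counted.

```python
def passes_basic_cuts(jet_ets, met, n_jets=4, jet_et_min=100.0, met_min=200.0):
    """True if the event has at least n_jets jets above jet_et_min and MET above met_min."""
    hard_jets = [et for et in jet_ets if et > jet_et_min]
    return len(hard_jets) >= n_jets and met > met_min


def selection_efficiency(events):
    """Fraction of (jet_ets, met) events that pass the basic cuts."""
    if not events:
        return 0.0
    passed = sum(1 for jet_ets, met in events if passes_basic_cuts(jet_ets, met))
    return passed / len(events)


# Made-up events: (list of jet E_T in GeV, missing E_T in GeV).
toy_events = [
    ([150.0, 120.0, 110.0, 105.0], 250.0),        # passes
    ([150.0, 120.0, 110.0, 90.0], 250.0),         # only three hard jets
    ([150.0, 120.0, 110.0, 105.0, 30.0], 150.0),  # MET too low
    ([300.0, 200.0, 150.0, 101.0, 100.5], 201.0), # passes
]
print(selection_efficiency(toy_events))
```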
The event distributions in terms of those four functions F_i with i = 1, · · · , 4 are shown in Fig. 5.
Compared to the parton-level distributions in Figs. 2, 3 and 4, we can still roughly tell some of the end points for the first three functions. The end-point structure for the last function F_4, on the other hand, is not so clear. Nevertheless, for the distributions which were supposed to have end points, the after-peak slopes look steeper than the ones without end points. We will exploit this observation to come up with a concrete procedure to identify different topologies.
To find the falling slopes after the peaks of these distributions, we fit the histograms in Fig. 5 with some simple functions. The first fitting function that we consider is a three-parameter log-normal function, Eq. (5).
Figure 5: The particle-level event distributions of F_1-F_4. We select the signal events by requiring at least 4 jets with E_T > 100 GeV and missing transverse energy E̸_T > 200 GeV. The continuous lines are fitted lines using the log-normal function. Only bins with a height above 1/2 of the peak height on the left side and above 1/4 of the peak height on the right side are included in the fit.
This log-normal function has a peak structure with asymmetric half-height widths. The three parameters a_0, a_1 and a_2 determine the peak height, the peak location and the falling slope after the peak, respectively.
The last parameter a_2 can be used to distinguish different topologies. A smaller value of a_2 means a steeper slope after the peak. Given the large statistical uncertainties on the tails of those distributions and also potentially large contaminations from backgrounds, we only include in the fit bins with a height above one half of the peak height on the left side and above one quarter of the peak height on the right side. The χ² in our fit is defined to take into account only the statistical uncertainties from the signal events. The asymmetric choice of bins around the peak is to minimize the effects of cuts, which affect the left side of the peak more severely. The fitted curves are shown as the continuous lines in Fig. 5 for the 30 GeV bin size. As one can see, except for the F_3 distributions all other histograms are well fitted by the log-normal functions.
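The asymmetric bin selection and a signal-only χ² can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the selected bins are taken to be the contiguous range around the peak (the scan stops at the first bin below threshold), and the statistical uncertainty of a bin with N_i events is taken to be the Poisson value sqrt(N_i). The function names and the toy histogram are hypothetical.

```python
def select_fit_bins(counts, left_frac=0.5, right_frac=0.25):
    """Indices of the contiguous bins around the peak that enter the fit."""
    peak = max(range(len(counts)), key=counts.__getitem__)
    lo = peak
    while lo > 0 and counts[lo - 1] > left_frac * counts[peak]:
        lo -= 1
    hi = peak
    while hi < len(counts) - 1 and counts[hi + 1] > right_frac * counts[peak]:
        hi += 1
    return list(range(lo, hi + 1))


def chi2_signal_only(counts, centers, model, bins):
    """Chi-square with Poisson variance N_i per bin (assumed form)."""
    return sum(
        (counts[i] - model(centers[i])) ** 2 / counts[i]
        for i in bins
        if counts[i] > 0
    )


# Made-up histogram with 30 GeV bins.
counts = [2, 10, 30, 60, 100, 80, 50, 30, 20, 5]
centers = [15.0 + 30.0 * i for i in range(len(counts))]
bins = select_fit_bins(counts)
print(bins)
```

Any fit function, log-normal or otherwise, can be passed as `model`, and a minimizer of one's choice would then vary its parameters to minimize this χ².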
The fitted values of a_2 are listed in Table 1.
Table 1: The fitted values of a_2 of the log-normal function in Eq. (5), which determine the steepness of the slopes of the histograms after the peak. The bin size has been chosen to be 30 GeV (left), and 40 GeV (right).
A smaller bin size introduces larger fluctuations of the number of events in each bin, and makes the fit unreliable.
Comparing those numbers for the 30 GeV and the 40 GeV bin sizes, we can see that the fitted slopes are fairly consistent.
To examine the sensitivity to the functions used in the fits, we also fit the histograms with two straight lines around the peak. This fit also gives an estimate of the end-point value at the point where the right straight line crosses zero. The "broken-line" function used in the fits is given by
$$ f(x) = \begin{cases} b + d\,(x - a), & x \le a, \\ b - c\,(x - a), & x > a, \end{cases} \tag{7} $$
with all parameters being positive. The parameters "(a, b)" determine the location of the peak and "c" and "d" determine the slopes of the curves on the right side and the left side of the peak. For the 30 GeV bin size, we compare the fitted results with the simulated distributions in Fig. 6. We see that the "broken-line" function can fit all histograms, including the F_3 distributions. The fitted values of b/(ac) are listed in Table 2. Similarly, the fitted results are insensitive to the choices of bin sizes.
Table 2: The fitted values for b/(ac) of the "broken-line" function in Eq. (7), which determine the normalized steepness of the slope of the histogram after the peak. The bin size has been chosen to be 30 GeV (left), and 40 GeV (right). We have also shown the 1σ statistical uncertainties for the 30 GeV bin size, based on 1932, 1013 and 1313 events after cuts for 4⊕0, 3⊕1 and 2⊕2, respectively.
One can naïvely define the end point to be the intersecting point of the right branch of the broken line with the x-axis, which is given by
$$ F_{\rm end} = a + \frac{b}{c} . $$
Figure 7: The flow chart of our procedures to identify the event topologies with 4j + E̸_T when using the log-normal function to fit the F_i distributions. For the broken-line fit, one needs to replace a_2 by b/(ac).
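A piecewise-linear form consistent with the parameter roles described above (peak at (a, b), falling slope c on the right, rising slope d on the left) can be sketched as follows. The exact parametrization is an assumption of this sketch, chosen so that b/(ac) equals the distance from the peak to the end point divided by the peak position; the numerical values are made up.

```python
def broken_line(x, a, b, c, d):
    """Two straight lines meeting at the peak (a, b); all parameters positive."""
    if x <= a:
        return b + d * (x - a)  # rising branch with slope d
    return b - c * (x - a)      # falling branch with slope c


def end_point(a, b, c):
    """Zero crossing of the falling branch."""
    return a + b / c


def normalized_steepness(a, b, c):
    """b/(ac): smaller values mean a steeper fall after the peak."""
    return b / (a * c)


# Made-up fit parameters: peak at 400 GeV with height 100 events.
a, b, c, d = 400.0, 100.0, 0.5, 1.0
print(end_point(a, b, c), normalized_steepness(a, b, c), d / c)
```

With this form the ratio d/c compares the rising slope to the falling slope, and a sharper end point shows up as a smaller b/(ac).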
After fitting the shapes of the distributions, we can now use the information in Tables 1 and 2 to identify different topologies. From these tables, one can see that the topology can be identified by finding the index "i" that minimizes a_2(F_i) for the log-normal fit or b/(ac)(F_i) for the broken-line fit among the four functions. If i = 1, then the topology is 4⊕0; if i = 2 (or 3), it is 3⊕1; if i = 4, it is 2⊕2. The procedure described above to identify the dark matter event topologies with 4j + E̸_T is summarized in the flow chart in Fig. 7.
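The decision rule above amounts to an argmin over the four fitted steepness values. A minimal sketch, in which the dictionary layout and the numbers are invented for illustration and are not the values of Tables 1 or 2:

```python
TOPOLOGY_OF_STEEPEST = {"F1": "4⊕0", "F2": "3⊕1", "F3": "3⊕1", "F4": "2⊕2"}


def identify_topology(steepness):
    """Map the function with the smallest fitted a_2 or b/(ac) to a topology.

    `steepness` maps "F1".."F4" to the fitted steepness parameter.
    """
    steepest = min(TOPOLOGY_OF_STEEPEST, key=lambda name: steepness[name])
    return TOPOLOGY_OF_STEEPEST[steepest]


# Invented fit results, for illustration only.
example = {"F1": 0.9, "F2": 0.4, "F3": 0.6, "F4": 0.8}
print(identify_topology(example))
```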
The two choices of the fitting functions give similar results, so we will only use the broken-line fits in the rest of the paper.
Based on the central values and 1σ statistical uncertainties in Table 2, we estimate the numbers of events required to identify the topology correctly 90% of the time to be 446 for 4⊕0, 777 for 3⊕1 and 740 for 2⊕2. All of those numbers are defined after the basic kinematic cuts and assume negligible SM backgrounds. A more complete analysis requires simulations of SM backgrounds and is beyond the scope of this paper.

Initial State Radiation
There exist other possibilities to have the 4j + E̸_T final state after passing the basic kinematic cuts. [...] and apply the procedure proposed earlier to at least understand the relative positions of the four hard jets, though the end points become less sharp and the formulae in Appendix A may not be accurate.
The third possibility is to have only 3 jets coming from the decay chains but the fourth jet from ISR. There are two types of topologies, 3⊕0⊕ISR (three jets from a single chain) and 2⊕1⊕ISR (two jets from one chain and another jet from the other chain). In this subsection, we describe the additional procedures for identifying those two new topologies with one ISR jet.
The 3⊕0⊕ISR topology is very similar to the 3⊕1 topology. The only difference is that the isolated jet comes from ISR instead of from the decay of a heavy particle. Indeed, the fitted slopes for the 3⊕0⊕ISR topology have the same pattern as the 3⊕1 topology, with the F_2 function giving the steepest slope. To distinguish it from the 3⊕1 topology, we should examine the E_T distribution of the isolated jet. If it comes from the ISR, it should have a falling E_T distribution starting from the kinematic cut. On the other hand, a jet coming from a heavy particle decay should have a peak value determined by the mass difference. Because the 3 jets from the same decay chain are likely to have a smaller invariant mass, we define the following 3⊕0⊕ISR-specific function:
$$ F_5(p_1, \dots, p_4) = E_T(p_l) \quad \text{with } \epsilon_{ijkl} \neq 0 \text{ and } \mathrm{inv}[i, j, k] = \min_{m,n,r} \mathrm{inv}[m, n, r] , $$
i.e., the E_T of the jet left out of the 3-jet combination with the smallest invariant mass. The distributions of F_5 for different topologies are shown in Fig. 8. There, we use the broken-line function to fit various distributions with 30 GeV bin size. As can be seen from this plot, 3⊕1 and 3⊕0⊕ISR have different locations of the peak with a lower value for the 3⊕0⊕ISR case. To quantify this difference, we compare the dimensionless parameter d/c, which measures the ratio of the rising slope before the peak to the falling slope after the peak. The numerical values of b/(ac) and d/c for those five topologies and five functions are listed in Table 4. We see that 3⊕0⊕ISR and 3⊕1 can be distinguished by examining d/c(F_i). The 3⊕0⊕ISR topology has the largest value in F_5 while the largest value for 3⊕1 occurs at F_1. The 2⊕1⊕ISR topology is not expected to have an end point for any of these functions. Indeed, we find that all four b/(ac)(F_i) are comparable and none of them takes a value as small as those of the functions with end points in the other topologies. The absence of any end point among the four functions, in contrast to the other topologies, which should each have at least one, can be taken as a sign of the 2⊕1⊕ISR topology. It is also not so easy to identify the ISR jet for this topology.
Unambiguously distinguishing it from other topologies is more challenging and probably requires additional functions. We will have more discussion on this topology in Appendix B.
Table 4: The fitted values of b/(ac), which determine the normalized steepness of the slopes of the histograms after the peak, and the fitted values of d/c, which determine the ratios of the rising slopes and the falling slopes around the peak. The broken-line function in Eq. (7) is used to fit the distributions. The bin size has been chosen to be 30 GeV.
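The isolated-jet variable discussed in this subsection (the E_T of the jet left out of the 3-jet combination with the smallest invariant mass) can be sketched as below. The sketch assumes jets stored as (E, px, py, pz) tuples and uses the transverse momentum as a stand-in for E_T, which is exact only for massless jets; the helper names and toy event are hypothetical.

```python
import math
from itertools import combinations


def inv_mass(momenta):
    """Invariant mass of a list of four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (sum(component) for component in zip(*momenta))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def f5(jets):
    """Transverse energy of the jet outside the smallest-mass 3-jet set."""
    trio = min(combinations(range(4), 3),
               key=lambda idx: inv_mass([jets[i] for i in idx]))
    (isolated,) = set(range(4)) - set(trio)
    _, px, py, _ = jets[isolated]
    return math.hypot(px, py)  # p_T, used here as a proxy for E_T


# Toy event: three softer jets plus one hard isolated jet (made-up, in GeV).
isr_like = [(50.0, 50.0, 0.0, 0.0), (50.0, 0.0, 50.0, 0.0),
            (50.0, 0.0, 0.0, 50.0), (200.0, -200.0, 0.0, 0.0)]
print(f5(isr_like))
```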

Off-Shell Decays
It frequently happens in models with dark matter that the dominant decays of some particles are three-body processes. For example in SUSY, if squarks are heavier than the gluino, the dominant decay channel of the gluino is to two quarks plus one neutralino via off-shell squarks. In this subsection, we use the SUSY model to generate events with off-shell decay processes in the decay chains with the assistance of the Monte Carlo tool BRIDGE [39]. For the 2⊕2 event topology, the events are generated by pair producing two gluinos, and then decaying each one to ū + u + χ via off-shell squarks. For 3⊕1, we first generate events with ũ_L + ũ_R in the final state, and then we require ũ_L → ū + u + ũ_R via the off-shell gluino or neutralino and ũ_R → u + χ. For 4⊕0, we first generate events with g̃ + χ in the final state, and then we require g̃ → ū + u + χ_2 and χ_2 → ū + u + χ via off-shell squarks. To produce events with similar visible kinematics for all three topologies, the LSP χ is fixed to have a mass of 200 GeV, and the mass difference between the mother superparticle and the daughter superparticle is chosen to be 200 GeV for two-body decays and 400 GeV for three-body decays. As a consequence, all four jets should have similar E_T distributions on average.
After the basic cuts on all signal events (at least four jets with E_T > 100 GeV and missing transverse energy E̸_T > 200 GeV), the acceptance efficiencies are 11.5%, 6.7% and 9.2% for 4⊕0, 3⊕1 and 2⊕2, respectively. Those efficiencies are lower than in the on-shell decay cases simply because jets in the off-shell case have a larger probability to become soft.
Repeating the same procedure as in the on-shell decay case, we list the fitted slopes of the distributions after the peak in Table 5. Comparing Table 5 with Table 2 for the on-shell case, we can see that, although the procedure in Fig. 7 can still be used to identify those three topologies, more signal events or a higher integrated luminosity at the LHC are required to identify the topologies for the off-shell case. After the event topology is identified, the on-shell and off-shell decay cases may be distinguished by examining the correlations of the invariant masses [13] and/or the function M_T2,max [15,16,18,19,20].
Table 5: The fitted slopes for the off-shell case, for bin size = 25 GeV and bin size = 30 GeV.
For completeness, we also report the fitted end-point values in Table 6.
Table 6: The fitted values (in GeV) of the end points, which are the right intersection points of the broken lines with the x-axis. The bin size has been chosen to be 30 GeV. The numbers in parentheses are the end-point values at the parton level. The numbers with * occur in the soft limit of some jet(s) which in practice will not pass the cuts. Therefore the actual end points are much smaller.

The Cases of Mixtures of Different Event Topologies
It may also happen that the signal events of the same final state come from a combination of two or more different event topologies. The method discussed in this paper should work when one topology dominates over the others. To quantify the limit at which the topology can be identified by our method, we use a mixture of 2⊕2 and 3⊕1 topologies as a case study. Fixing the total number of events after the basic cuts to be 1000, we study the patterns of the fitted b/(ac) as a function of the mixture fraction, x. For x = 0, all events are from the 3⊕1 topology, while for x = 1 all events are from the 2⊕2 topology. For the 4 functions F_1-F_4, the fitted values of b/(ac) as functions of the mixture fraction are shown in Fig. 9. As x increases, one can see from Fig. 9 that there is a transition from the 3⊕1 pattern to the 2⊕2 pattern.
Figure 10: The same as Fig. 9 but for a mixture of 2⊕1⊕ISR and 2⊕2 topologies with 1000 combined events (after the cuts). Here, x is the fraction of 2⊕2 type events in a mixture of 2⊕1⊕ISR and 2⊕2 topologies.
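A mixed sample with a fixed total size and a given fraction x of 2⊕2 events can be assembled as in the sketch below. The sampling scheme (drawing without replacement from two pre-selected event lists with a fixed seed) is an assumption of this example; the text does not specify how the mixtures were built.

```python
import random


def mix_samples(events_31, events_22, x, n_total=1000, seed=0):
    """Return n_total events with a fraction x taken from the 2⊕2 sample.

    Both inputs are lists of events that already passed the basic cuts.
    Raises ValueError if either list is too short for the requested mix.
    """
    rng = random.Random(seed)
    n_22 = round(x * n_total)
    n_31 = n_total - n_22
    return rng.sample(events_22, n_22) + rng.sample(events_31, n_31)


# Placeholder events tagged with their true topology.
events_31 = [("3⊕1", i) for i in range(1500)]
events_22 = [("2⊕2", i) for i in range(1500)]
mixed = mix_samples(events_31, events_22, x=0.3)
print(len(mixed), sum(1 for tag, _ in mixed if tag == "2⊕2"))
```

For each value of x one would then histogram F_1-F_4 on the mixed sample and repeat the fits.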

Discussion and Conclusions
Although in this paper we have generated events from SUSY models with certain specific spectra, the strategies employed in this paper should work for general dark matter models with an unbroken Z_2 discrete symmetry [3,4,41,42,43,44], or even models with more complicated symmetries (e.g., Refs. [45,46,47]) as long as the events contain 2 decay chains which end with missing particles. Additional missing particles like neutrinos coming from the decay chains may be difficult to identify themselves, but should not prevent us from identifying the topology of the visible particle part. There could also be accompanying events with neutrinos replaced by their charged lepton partners, which may be used to identify the topology. The invariant mass distributions may have different behaviors for different models. Depending on the spins of the intermediate particles, the sharpness of the end points of invariant mass distributions varies [48]. For example, comparing the processes g̃ → q̄ + q̃ → q̄ + q + χ in SUSY and g^(1) → q̄ + q^(1) → q̄ + q + B^(1) in the UED model, the invariant mass distribution of two jets in the UED model has a sharper end point than in SUSY because it has a spin-1/2 intermediate particle, in contrast to the spin-0 particle in the SUSY case. In addition, the production cross section can vary a lot among different models with different spectra. Therefore, the actual required luminosity at the LHC to identify a particular topology depends on the specific model and requires a detailed study for the individual model. (For a general study of model discriminations at the LHC, see [49,50].)
Although we have concentrated on identifying event topologies in this paper, we would like to point out that the functions defined in this paper can also be useful for reducing the combinatorial problems. Once a particular topology is identified, the functions in which this topology has end points can be used to determine the end points of some invariant masses of particles from the same decay chain. Then the end points can be used to cut wrong combinations [31,51,52] by performing an event-by-event analysis and removing combinations with invariant masses above the corresponding end-point value. The order of the visible particles from a single chain for each individual event remains a difficult problem without additional handles. We also note that the signal events often have a peak structure in these functions. The backgrounds, on the other hand, are not expected to have a peak structure in general and should be falling rapidly above the kinematic cut. A cut around the peak of the signal region may increase the signal-to-background ratio, which could help the discovery and/or the follow-up signal analysis.
In conclusion, we have studied how to identify different event topologies with dark matter particles produced in pairs from cascade decays of heavy particles. Taking $4j + \not\!E_T$ as a case study, we have shown that one can identify all event topologies based on the existence of end points in several functions of invariant masses defined in this paper. We have also extended our studies to include the cases with ISR and with off-shell particles in the decay chains. We find that most of those topologies can be identified with $O(10^3)$ signal events after basic kinematic cuts. We believe that this study should be the first step towards measuring the masses and spins of the new particles in the decay chains, including the dark matter particle, and that similar studies should be performed for events with other final states.

A Invariant Mass End Point Formulae
In this Appendix we summarize the formulae for various invariant mass end points (for up to four particles) in a decay chain, with either on-shell or off-shell intermediate particles. We start with a general consideration. Suppose that a mother particle M goes through cascade decays, emitting several visible particles and ending with a single missing particle A, as shown in Fig. 11.
Figure 11: The general Feynman diagram in which a mother particle M cascade-decays into visible particles and a single missing particle A.
Denoting the total momentum of all visible particles by $p_V$, the invariant mass-squared of all visible particles is given by
$$p_V^2 = (p_M - p_A)^2 = m_M^2 + m_A^2 - 2 m_M E_A \qquad (9)$$
in the rest frame of the mother particle M. We see that the maximum of $p_V^2$ occurs when $E_A$ is minimized in the rest frame of M. In particular, if it is possible to make A at rest in the M rest frame, then $E_{A,\min} = m_A$ and $p^2_{V,\max} = (m_M - m_A)^2$. However, it is not always possible to make A at rest: one of the on-shell decays may produce a large boost in some direction that cannot be compensated by the boosts in the opposite direction from the other stages of the decay chain. In this case, to reach the maximum of $p_V^2$ we would still like the other decays to boost A in the direction opposite to the largest-boost decay, in order to minimize $E_A$ in the M rest frame. We will see some examples in the invariant mass end point formulae.
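As a small numerical illustration of the general consideration above, the following Python sketch evaluates the relation $p_V^2 = (p_M - p_A)^2 = m_M^2 + m_A^2 - 2 m_M E_A$, which follows from four-momentum conservation in the M rest frame. The function names and the sample masses are assumptions chosen for illustration only.

```python
def visible_mass_sq(m_M, m_A, E_A):
    """p_V^2 = m_M^2 + m_A^2 - 2 m_M E_A, with E_A measured in the M rest frame."""
    if E_A < m_A:
        raise ValueError("E_A cannot be smaller than m_A")
    return m_M**2 + m_A**2 - 2.0 * m_M * E_A

def two_body_E_A(m_M, m_A):
    """Energy of A in the M rest frame for M -> A + one massless visible particle."""
    return (m_M**2 + m_A**2) / (2.0 * m_M)

# A at rest in the M frame: p_V^2 reaches (m_M - m_A)^2.
at_rest = visible_mass_sq(1000.0, 200.0, 200.0)

# Single two-body decay: A necessarily recoils, and the one massless
# visible particle has p_V^2 = 0.
single_step = visible_mass_sq(1000.0, 200.0, two_body_E_A(1000.0, 200.0))
```

The second case is the simplest example in which A cannot be brought to rest: with only one decay stage there is no other boost available to compensate the recoil.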
Many invariant mass end point formulae (up to three particles) can be found in the literature [7,8,9,10,11,13,14]. For our study, we extend the formulae to any invariant mass combination in a decay chain with four visible particles, shown in Fig. 12. For simplicity we assume that all visible particles are massless, $p_1^2 = p_2^2 = p_3^2 = p_4^2 = 0$, which is a good approximation for most cases. The intermediate particles D, C, B may be off shell. The modifications of the end point formulae for off-shell decays will be remarked upon where relevant. As a rule of thumb, when an end point formula contains a mass parameter explicitly, it applies to the case in which the corresponding intermediate particle is on shell. If a mass parameter does not appear in an end point formula, then the formula applies whether that particle is on shell or off shell. We follow the notation of Ref. [14] by defining $R_{ij} \equiv m_i^2/m_j^2$, where $i, j = A, B, C, D, E$.
Figure 12: The Feynman diagram in which a mother particle E cascade-decays into four jets $j_i$ and a single missing particle A.
A.1 Two-particle invariant masses
1. $m^2_{12,\max}$: The extremal configuration is shown in Fig. 13. The invariant mass end point for on-shell B is given by the well-known formula
$$m^2_{12,\max} = m_C^2 (1 - R_{BC})(1 - R_{AB}).$$
In the case that B is off shell, it is possible to make A at rest in the C rest frame, so
$$m^2_{12,\max} = (m_C - m_A)^2.$$
2. $m^2_{13,\max}$: The extremal configuration for on-shell decays is shown in Fig. 14. The corresponding invariant mass end point is
$$m^2_{13,\max} = m_D^2 (1 - R_{CD})(1 - R_{AB}).$$
If C or B is off shell, then the invariant mass end point occurs in the soft limit of visible particle 2, $p_2 \to 0$. The invariant mass end point formula can be obtained in the same way as for $m^2_{12}$. However, unless the initial particle D is highly boosted, particle 2 is likely to be too soft to pass the cut in this limit. The end point will not be very sharp in an event sample that requires a hard enough particle 2. The formula only works as an upper bound in practice, and the actual end point may be much smaller depending on the jet $E_T$ cut. Similar considerations also apply to other cases in which only one of the two visible particles coming from an off-shell decay is included in the invariant mass calculation. The invariant mass maximum occurs in the soft limit of the other visible particle from that decay. The invariant mass end point is the same as in the case with that soft visible particle removed, though in practice it will be smaller and not as sharp.
3. $m^2_{14,\max}$: The extremal configuration for on-shell decays is shown in Fig. 15. The corresponding invariant mass end point is
$$m^2_{14,\max} = m_E^2 (1 - R_{DE})(1 - R_{AB}).$$
A.2 Three-particle invariant masses
1. $m^2_{123,\max}$: [...]
(c) If $R_{AB} < R_{BD}$, irrespective of whether C is on shell or not, the end point of $m^2_{123}$ is given by
$$m^2_{123,\max} = m_D^2 (1 - R_{BD})(1 - R_{AB}).$$
(d) In other cases, when there is no single large boost and A can be put at rest in the D rest frame, the end point of $m^2_{123}$ is given by the standard formula,
$$m^2_{123,\max} = (m_D - m_A)^2.$$
2. $m^2_{124,\max}$: If C or D is off shell, the end point occurs in the soft limit of visible particle 3. It then reduces to the previous case (case 1), and the end point formulae can be easily obtained by replacing the appropriate masses. In the following discussion we assume that both C and D are on shell. The general extremal configuration is shown in Fig. 16. The double arrow represents the total momentum of particles 1 and 2, while the individual particles may go in different directions depending on the relative mass parameters.
(a) If $R_{AC}(1 - R_{DE} + R_{CE}) > R_{CE}$, the extremal configuration is that particles 1 and 2 move in the same direction, which is opposite to particles 3 and 4. The end point of $m^2_{124}$ in this case is given by [...]
(c) If all intermediate particles are on shell and $R_{AB}(1 - R_{DE} + R_{CE}) > R_{BE}$, and if $R_{AB} < R_{BC}$, the extremal configuration is that particles 2, 3 and 4 move in the same direction, which is opposite to particle 1. The end point of $m^2_{124}$ in this case is given by [...]
(d) In other cases, particles 1, 2 and 4 are not collinear in the extremal configuration. The end point is given by [...]
3. $m^2_{134,\max}$: If B or C is off shell, the end point occurs in the soft limit of visible particle 2. It also reduces to case 1, and the end point formulae can be easily obtained by replacing the appropriate masses. In the following discussion we assume that both B and C are on shell. The general extremal configuration is shown in Fig. 17. The double arrow represents the total momentum of particles 3 and 4, while the individual particles may go in different directions depending on the relative mass parameters.
[...], the extremal configuration is that particles 2, 3 and 4 move in the same direction, which is opposite to particle 1. This is independent of whether D is on shell or not. The end point of $m^2_{134}$ in this case is given by [...]
(d) In other cases, particles 1, 3 and 4 are not collinear in the extremal configuration. The end point is given by [...]
A.3 Four-particle invariant masses
1. $m^2_{1234,\max}$: Again, the end point depends on whether there is a large boost from one of the on-shell decays which cannot be compensated by the combined boost from the other decays.
(a) If $R_{DE} < R_{AD}$ (B, C can be on shell or off shell), the end point is given by
$$m^2_{1234,\max} = m_E^2 (1 - R_{DE})(1 - R_{AD}).$$
(b) If $R_{CD} < R_{AC} R_{DE}$ (B can be on shell or off shell), the end point is given by
$$m^2_{1234,\max} = m_E^2 (1 - R_{CD})(1 - R_{AC} R_{DE}).$$
(c) If $R_{BC} < R_{AB} R_{CE}$ (D can be on shell or off shell), the end point is given by
$$m^2_{1234,\max} = m_E^2 (1 - R_{BC})(1 - R_{AB} R_{CE}).$$
(d) If $R_{AB} < R_{BE}$ (C, D can be on shell or off shell), the end point is given by
$$m^2_{1234,\max} = m_E^2 (1 - R_{AB})(1 - R_{BE}).$$
(e) In other cases, when there is no single large boost and A can be made at rest in the E rest frame, the end point of $m^2_{1234}$ is given by the standard formula,
$$m^2_{1234,\max} = (m_E - m_A)^2.$$
A.4 End point formulae for functions $F_1$, $F_2$, $F_3$, $F_4$
Now we can write down the formulae for the end points (if they exist) of the functions $F_1$, $F_2$, $F_3$, $F_4$ defined in Section 2 for various topologies.
For the 4⊕0 topology, the invariant mass end points come from the decay chain with 4 jets. It has exactly the topology in Fig. 12. The end points for $F_1$-$F_4$ are [...]
For the 3⊕1 and 3⊕0⊕ISR topologies, the invariant mass end points come from the decay chain with 3 jets, which are labeled as 1, 2, 3. The end points for $F_1$-$F_4$ are
$F_{1,\max}$: no end point,
[...]
$F_{4,\max}$: no end point.
For the three visible particles from the same decay chain, the maximum of $m^2_{12,\max}$, $m^2_{13,\max}$, $m^2_{23,\max}$ occurs when two of the three particles are parallel and the other one is anti-parallel. However, the definition of $F_3$ will take the more energetic of the two parallel particles away to pair with the particle from the other decay chain, so the actual $F_{3,\max}$ will in general be smaller than the above formula.
For the 2⊕2 topology, the invariant mass end points can come from either decay chain. We assume that the two decay chains are symmetric and label the two jets on the same chain as 1 and 2. The end points for $F_1$-$F_4$ are
$F_{1,\max}$: no end point,
$F_{2,\max}$: no end point,
[...]
The 2⊕1⊕ISR topology is not expected to have a sharp end point in any of the $F_1$-$F_4$ functions.
For the parameters used in Sec. 2, $m_A = 200$ GeV, $m_B = 400$ GeV, $m_C = 600$ GeV, $m_D = 800$ GeV, and $m_E = 1000$ GeV, the various end points (in GeV) are listed in Table 7.
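The case analysis for the invariant mass of all visible particles in a chain (A.3, and its two- and three-particle analogues) can be sketched in Python as follows. This is a hedged illustration, not the code used for Table 7. It assumes that all intermediate particles are on shell, that the visible particles are massless, and that the single-large-boost end point takes the form $m_{\rm top}^2 (1 - x)(1 - y)$ whenever the corresponding condition $x < y$ holds, with $x$ the mass-squared ratio across the large-boost decay and $y$ the product of the ratios above and below it. The function name `max_visible_mass` is hypothetical.

```python
import math

def max_visible_mass(masses):
    """End point (not squared) of the invariant mass of all visible particles.

    masses: on-shell masses ordered from the missing particle upward,
            e.g. [m_A, m_B, m_C, m_D, m_E]. Visible particles are massless.
    """
    m_bottom, m_top = masses[0], masses[-1]
    for lower, upper in zip(masses[:-1], masses[1:]):
        # Ratio across this decay step, e.g. R_CD for D -> C.
        r_step = (lower / upper) ** 2
        # Product of the ratios above and below the step, e.g. R_DE * R_AC.
        r_rest = (upper / m_top) ** 2 * (m_bottom / lower) ** 2
        if r_step < r_rest:
            # This decay gives a boost that the other decays cannot compensate.
            return m_top * math.sqrt((1.0 - r_step) * (1.0 - r_rest))
    # Otherwise A can be put at rest in the rest frame of the top particle.
    return m_top - m_bottom

# Spectrum of Sec. 2, in GeV.
spectrum = [200.0, 400.0, 600.0, 800.0, 1000.0]
m1234_max = max_visible_mass(spectrum)      # four-jet chain E -> ... -> A
m12_max = max_visible_mass(spectrum[:3])    # two-jet sub-chain C -> B -> A
```

For the spectrum of Sec. 2 none of the single-boost conditions is satisfied in the four-jet chain, so `m1234_max` equals $m_E - m_A = 800$ GeV, while the two-jet sub-chain reproduces $m^2_{12,\max} = (m_C^2 - m_B^2)(m_B^2 - m_A^2)/m_B^2$.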

B Identifying the 2⊕1⊕ISR topology
To identify the 2⊕1⊕ISR topology, we need to find a way to pick out the ISR jet and to distinguish this topology from the cases without ISR. The ISR jets should have a falling $E_T$ distribution, while jets from on-shell decays of heavy particles should have an $E_T$ distribution with a peak structure, provided the peak is above the cut. In order to keep the peak structure of those energetic jets, the cut on the jet $E_T$ cannot be too strong. Therefore, in this section we impose softer basic kinematic cuts: at least 4 jets with $E_T > 50$ GeV and missing transverse energy $\not\!E_T > 100$ GeV.
To increase the probability of finding the ISR jet for the 2⊕1⊕ISR topology, we only choose those events in which the two pairs with the largest invariant masses contain the same jet. We then plot the E T distribution of the remaining one which does not appear in the two largest invariant masses.
We use the following function to describe this procedure:
$$F_6(p_1, p_2, p_3, p_4) = E_T(j_l), \quad \text{such that } \epsilon_{ijkl} \neq 0 \text{ and } m_{ij}, m_{ik} = \text{two largest invariant masses}.$$
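The selection encoded in $F_6$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions: jets are four-vectors `(E, px, py, pz)`, the jet $E_T$ is approximated by the transverse momentum of a massless jet, and the function names are hypothetical. Events in which the two largest dijet invariant masses do not share a jet are rejected by returning `None`.

```python
import math
from itertools import combinations

def pair_mass(p, q):
    """Invariant mass of two four-vectors (E, px, py, pz)."""
    E, px, py, pz = (p[i] + q[i] for i in range(4))
    return math.sqrt(max(E * E - px * px - py * py - pz * pz, 0.0))

def transverse_energy(p):
    """E_T of a massless jet, taken here as its transverse momentum."""
    return math.hypot(p[1], p[2])

def f6(jets):
    """E_T of the jet absent from the two largest dijet masses.

    Returns None when the two largest-mass pairs do not share a jet,
    i.e. when the event fails the selection.
    """
    pairs = sorted(
        combinations(range(4), 2),
        key=lambda ij: pair_mass(jets[ij[0]], jets[ij[1]]),
        reverse=True,
    )
    first, second = set(pairs[0]), set(pairs[1])
    if len(first & second) != 1:
        return None
    (left_over,) = set(range(4)) - (first | second)
    return transverse_energy(jets[left_over])
```

A histogram of the non-`None` values of `f6` over an event sample would then correspond to the distribution described in the text.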
The event distributions for different topologies in terms of $F_6$ are shown in Fig. 18. With a properly chosen bin size, we can see from Fig. 18 that the two topologies with ISR have their largest number of events in the first bin, which can be used to distinguish them from topologies without ISR.
The fitted values of b/(ac), which determine the normalized steepness of the slopes of the histograms after the peak. The broken-line function in Eq. (7) is used to fit the distributions. The bin size has been chosen to be 40 GeV (left) and 50 GeV (right). Again each topology has 10000 events before the cuts, and there are 3841 and 3632 events passing the cuts for 3⊕0⊕ISR and 2⊕1⊕ISR, respectively.
To further distinguish 2⊕1⊕ISR from 3⊕0⊕ISR, we can simply check the pattern of the fitted parameters of the first four functions. If b/(ac)($F_2$) is smaller than the other values, we can then identify the topology as 3⊕0⊕ISR. On the other hand, for the 2⊕1⊕ISR topology, none of the normalized inverse slopes exhibits a particularly small value and b/(ac)($F_2$) is not significantly smaller than the others. In this case the topology can be 2⊕1⊕ISR or with even more ISRs,