On the Origin of Long-Lived Particles

MATHUSLA is a proposed large-volume displaced vertex (DV) detector, situated on the surface above CMS and designed to search for long-lived particles (LLPs) produced at the HL-LHC. We show that a discovery of LLPs at MATHUSLA would not only prove the existence of BSM physics, it would also uncover the theoretical origin of the LLPs, despite the fact that MATHUSLA gathers no energy or momentum information on the LLP decay products. Our analysis is simple and robust, making it easily generalizable to include more complex LLP scenarios, and our methods are applicable to LLP decays discovered in ATLAS, CMS, LHCb, or other external detectors. In the event of an LLP detection, MATHUSLA can act as a Level-1 trigger for the main detector, guaranteeing that the LLP production event is read out at CMS. We perform an LLP simplified model analysis to show that combining information from the MATHUSLA and CMS detectors would allow the LLP production mode topology to be determined with as few as $\sim 100$ observed LLP decays. Underlying theory parameters, like the LLP and parent particle masses, can also be measured with $\lesssim 10\%$ precision. Together with information on the LLP decay mode from the geometric properties of the observed DV, it is clear that MATHUSLA and CMS together will be able to characterize any newly discovered physics in great detail.


Contents
1 Introduction it might not be obvious how to characterize any new particles that are discovered. Fortunately, MATHUSLA is not only an LLP-discovery machine, and its physics utility goes beyond initial observation of LLP signals. Despite its simplicity, MATHUSLA can use geometric information to extract the likely velocity and decay mode of detected LLPs [20]. This also allows the LHC bunch crossing that produced the LLP to be identified. Furthermore, MATHUSLA can act as a Level-1 trigger for CMS, ensuring information about the LLP production event is written to tape regardless of main detector trigger thresholds. Together, these abilities suggest that it should be possible to combine information from both MATHUSLA and CMS in order to study the properties of detected LLPs and details of the underlying theory.
In this work we analyze the prospect of using correlated MATHUSLA and main detector information to diagnose the production mode of LLPs, and estimate underlying model parameters under a given production mode hypothesis. One initial obstruction to this effort is the vast space of possible models that give rise to LLPs. Fortunately, recent progress has been made to organize much of this space into a set of just a few 'simplified models' that describe particular experimental signatures [13]. Complicated scenarios exist that do not fit easily into any of these categories, but the simplified model framework provides a good starting point for analyses that cover a wide range of commonly studied BSM scenarios in a relatively model-independent fashion.
Assuming that MATHUSLA observes a number of LLP decays that all originate from a single production mode, we show that with simple cuts using only a few event observables, observation of ∼ O(100) events allows the production mode to be determined with 90% confidence. Even for O(10) observed events, some simplified models can be correctly identified with high probability, while higher event yield obviously results in even greater classification confidence. Assuming correct model classification, the LLP and parent particle masses can be determined to ∼ 10% precision with similarly modest statistics.
Our simple analysis can be generalized to include more production modes, and more complicated scenarios with several LLP production and decay modes considered simultaneously. Our methods should also be applicable to other proposed external LLP detectors [16,17], as well as to LLPs discovered in the LHC main detectors themselves (where additional information on the LLP decay products may be available). Together with information on the LLP decay mode from the geometric properties of the DVs [20], a discovery at MATHUSLA would therefore not only prove the existence of BSM physics, it would allow the theoretical origin of LLPs to be uncovered.
The paper is structured as follows. First we will review the MATHUSLA detector and its potential to be used as a trigger for CMS in Section 2. In Section 3 we will summarize the simulation of LLP production under several simplified models and across a wide range of parameters. In Section 4 we will develop a robust, physically motivated classifier to diagnose the LLP production mode using both CMS and MATHUSLA observables, and evaluate its performance. We describe how simple analyses can extract the LLP and parent particle masses for the considered LLP production modes in Section 5, and perform maximum likelihood estimation using pseudo-data compared to simulation-derived distributions to estimate the achievable precision. Section 6 contains a brief discussion of complications due to multiple possible LLP production vertices, and we make concluding remarks in Section 7. Appendix A contains various Monte Carlo details for LLP event generation in our analysis.

The MATHUSLA Detector
We first review the principles and geometry of the MATHUSLA detector in Section 2.1 to provide context for the rest of the analysis. Section 2.2 contains a brief overview of the physics reach of MATHUSLA. In Section 2.3 we review how MATHUSLA can measure LLP boost, and the implications for coordinating between MATHUSLA and CMS.

Detector overview
The proposed MATHUSLA detector is a large-volume surface detector, to be placed near CMS [19,21], see Figure 1. The goal is to instrument a large volume to search for displaced vertex (DV) decays of ultra long-lived particles produced at the LHC, focusing on decay lengths cτ 100m. It has been shown [19] that MATHUSLA can operate in a lowbackground environment by making use of strong restrictions on the final state multiplicity, timing, and geometry of candidate events. The detector principle is to place several (∼5) layers of trackers, likely to be plastic scintillator, above a large decay volume of air. LLPs decaying to SM charged particles inside MATHUSLA will give rise to upwards going tracks. By requiring reconstructed tracks to travel upwards relativistically and originate both spatially and temporally at a point inside the empty detector volume, backgrounds such as LHC muons, atmospheric neutrinos and cosmic rays can be rejected.
MATHUSLA will be equipped with an internal trigger system for LLPs, which relies on real-time tagging of upwards-traveling track-candidates within groups of neighboring detector modules [19]. The trigger rate will be low enough that the MATHUSLA trigger can also act as a Level-1 burst-trigger for CMS: If the upwards tracks originate from the decay of an LLP, there is a range of < 10 candidate LHC bunch crossings that are very likely to include the production event at CMS. The MATHUSLA Level-1 trigger can therefore request a range of CMS events to be written to tape, which would enable the kind of off-line analysis we study in this work. If only CMS triggers were available, then the post-discovery analysis of many scenarios we consider here would be significantly more difficult or even impossible, due to much smaller trigger efficiencies in the main detector [14].
In this work, we assume the original MATHUSLA benchmark geometry [14,15], which is a 200m × 200m square detector with a 20m high decay volume, placed 100m displaced from the interaction point both horizontally and vertically. More recently, the collaboration has considered more realistic designs with a 100m × 100m detector volume at an available site near CMS, with significantly smaller horizontal and vertical displacement compensating for the smaller size. This results in practically identical sensitivities to LLP decays [22]. Our results apply almost verbatim to this more realistic design as well.

Physics Reach
Due to MATHUSLA's near-zero backgrounds, ∼ 10 observed events are likely enough for a discovery in most of the targeted LLP scenarios, especially if they decay to hadrons. This   [14,15]. Our results will apply to the more realistic geometries currently considered by the MATHUSLA collaboration, with ∼ (100 m) 2 area situated closer to the collision point, as well. The tracking planes in the roof detect charged particles, allowing for the reconstruction of displaced vertices in the air-filled decay volume. The floor protector provides vetoing capability against charged particles entering the detector from below.
low threshold allows MATHUSLA to probe much lower cross-sections than ATLAS and CMS in the long-lifetime regime. The number of LLPs MATHUSLA will observe depends primarily on the production cross-section and decay length of the LLPs, and has only mild model dependence. In the long-lifetime regime, the lowest cross-section that MATHUSLA can exclude scales linearly withbcτ , whereb is the mean boost of the LLPs. The minimum signal cross-section required for discovery is roughly [14] σ LHC sig,min ∼ (1 fb) b cτ To give a concrete example of the detector's physics reach, Figure 2 shows that MATHUSLA can probe exotic Higgs decay branching ratios to LLPs down to 10 −5 . For this study we will work in terms of the number of observed LLP decays, N obs , ranging from 10 to 1000 without specifying the corresponding cross-sections for each model under consideration. Clearly this is a plausible event yield for many BSM scenarios.

Measuring LLP boost
There are two important reasons to measure LLP boost at MATHUSLA [20]. First, the boost is highly correlated with LLP mass for a given production mode. Second, knowledge of LLP boost allows for the LHC production event to be identified, or at least narrowed down to a few possible bunch crossings, which allows us to use information from both MATHUSLA and CMS to learn about the nature of newly detected LLPs. Ref. [20] showed that MATHUSLA can determine the velocity of detected LLPs using only geometric information of a displaced vertex and the knowledge of LLP's direction of travel, from the LHC IP to the reconstructed DV. For a two-body decay, the velocity of the LLP decaying to two particles at angles θ 1 , θ 2 is  Figure 2: MATHUSLA sensitivity to new LLPs X pair-produced in exotic Higgs decays, where X decays hadronically. Exclusion curves correspond to 4 events in MATHUSLA. Plot from [14].

BBN Limit Decays dominantly in Main Detector
Since in the vast majority of cases the LLP decay products are ultra-relativistic, β 1 and β 2 can be set to 1 while incurring only negligible error. Regardless, the timing resolution of the proposed tracking detectors should allow these velocities to be determined within ∼ 5%. The uncertainty in θ 1 and θ 2 is dominated by detector effects and is estimated at roughly 0.2% for cm spatial resolution.
In the case of an LLP decay mode whose final states have high multiplicity, i.e. hadronic decays, there is an alternative method to estimate its velocity, relying on the approximate sphericity of charged decay products in the LLP rest frame. Labelling the final-state momenta p i , we solve for β X in the constraint Essentially, the measured track momenta are boosted backward along the direction toward the LHC IP until they are approximately spherically distributed, which approximates the LLP rest frame. The accuracy of this estimate is negatively affected by the fact that the decay product momenta may not be lightlike as well as the fact that some decay products might be downgoing in the lab frame and therefore not included in the calculation. Thus the measured LLP boost distribution suffers some systematic bias (which can be accounted for) and a small increase in spread. Nonetheless, the LLP boost can be determined sufficiently well event-by-event to narrow down the candidate bunch crossings in which it was produced to ∼ 5. There may be multiple hard-scattering vertices in the list of candidate events, but as discussed in 6, this does not jeopardize our analysis strategy.

LLP Simplified Models
The space of all physical models that could produce long-lived particles is immense. To identify, consider, and discard them one by one would be enormously impractical. Simplified models break down BSM theories into smaller effective models that reflect experimental signatures and can be characterized by a small number of parameters that correspond to experimental observables, like cross sections, branching ratios, and masses. In the past decade, the use of simplified models in searches for prompt particles at the LHC has become a standard practice [23]. Recently, considerable work has been done to extend the simplified model framework to LLPs [13]. This includes, in particular, separate simplified models for LLP production and decay, since they do not a priori have to be related. This is very suitable for our analysis, since the decay mode can be reasonably well characterized by MATHUSLA alone [20], and we focus on the production mode. In this section, we review and slightly expand upon the simplified model basis for LLP production modes introduced in [13], and briefly explain how we generate Monte Carlo event samples for our study.

LLP Production Modes
In this paper we will employ the LLP simplified models recently defined and presented in the LHC LLP white paper [13], with the addition of LLP production in exotic B-meson decays. This set of simplified models, each representing an LLP production event topology, is meant to cover a wide range of well-motivated LLP production scenarios while remaining agnostic of the underlying mechanism generating the particles' long lifetimes. They are as follows, with schematic Feynman diagrams for each model in Figure 3.
1. The Exotic B-meson Decay channel is abbreviated as BB and has a the SM B-meson decay to an LLP plus one or more SM products. This scenario is not included in the simplified model library constructed in [23], but we include this mode due to its importance for low-mass LLP benchmark models [24] below ∼ 5 GeV. Two distinct scenarios for this production channel are right-handed neutrino models [25,26] and light dark scalars [27]. Both are explored in this paper.
2. In the Charged Current (CC) scenario, a BSM particle possessing a SM electric charge is produced in the s-channel and decays to a charged SM particle and an LLP. In this case, assuming the dark sector particle is of integer charge, the charged SM decay product can be assumed to be a charged lepton. Models containing W bosons [28][29][30] decaying to long-lived right-handed neutrinos [31,32] are especially representative of this topology.
3. In the Heavy Parent (HP) scenario some BSM particle is pair produced before each decays to an LLP and one or more standard model particles.   (gluino) pair production and subsequent decay (via an off-shell squark) to a long-lived neutralino would produce one (two) jet(s) per decay chain. We consider both possibilities separately, and comment on how our methods could be easily extended to include leptonic or multi-step decay chains. 4. In the Exotic Higgs Decay (HIG) scenario, the Standard Model Higgs boson couples to the dark sector and decays to two long-lived particles. Hidden sector models with a Higgs portal fall into this category [24,33]. More complex Higgs decay final states are of course a possibility, especially with decays into confining dark sectors [34][35][36][37]. The two-particle final state serves as a starting point when considering this space of models.
5. In the Direct Pair Production (DPP) scenario, two long-lived particles are produced non-resonantly, through some effective operator, such as that generated by a very heavy squark leading to neutralino pair production, or pair-production of a hidden sector LLP via an off-shell s-channel mediator.
6. The Heavy Resonance (RES) model has some BSM particle, such as a Z' boson (see e.g. [38][39][40]), produced on-shell in the s-channel that decays to two long-lived particles. In the high-mass limit, the Heavy Resonance model and Direct Pair Production model should coincide.
We will use these simplified models as our 'basis set' of LLP production topologies. Many searches at the LHC constrain the available parameter space for these models. However, in general, production cross-sections and branching ratios can be dialed down while still ensuring events in MATHUSLA, so we do not concern ourselves with current bounds on these signatures, except to note that they are not excluded for weak-scale masses.
All of these simplified models have either one or two parameters that are varied in simulation. The only parameter that can be freely chosen in the BB, DPP, and HIG models is the LLP mass m LLP . The CC, HP, and RES models also have a parent particle that decays to the LLP, so the two free parameters are m parent and m LLP . The width of BSM parent particles can in principle be varied as well. For simplicity, Γ parent is set to 0.01 × m parent for each of these models, since the exact width does not matter as long as it is small enough. A high-width parent particle case is also simulated for the RES model, with Γ parent = 0.3 × m parent , to illustrate the effect of parent particle width on classification accuracy.
The range of LLP and parent masses for which simulation was generated in each model is summarized in Table 1. In single-parameter models, the chosen m LLP values are equally spaced within the indicated ranges (on a linear scale). In models with a variable parent mass, m parent and m LLP /m parent values are equally spaced within the indicated ranges. This is relevant in interpreting our probabilistic results regarding the achievable accuracy in classifying datasets of observed LLP decays. Further details on the simulation of each simplified model are in appendix A.

LLP decay in MATHUSLA
For each point in parameter space, 10 6 events were initially simulated. Only events with LLPs in the right rapidity range η ∈ (0.8, 2) to potentially intersect MATHUSLA were considered. To increase the efficiency of our simulations for LLP decays in MATHUSLA we took advantage of the fact that the orientation of each event around the beam axis is random. Therefore, we rotated and appropriately reweighed events to assign LLPs a random azimuthal angle that guaranteed them flying through some part of MATHUSLA. Each event was also weighted to account for the probability of the LLP decaying inside MATHUSLA by a factor L/bcτ , where L is the distance each particular LLP travels through MATHUSLA, b is its boost, and cτ is its decay length. In the long lifetime regime, this is proportional to the probability that the LLP decays within MATHUSLA. This event selection and weighting defines our high-statistics weighted samples. Since our analysis assumes some number of observed LLP decays N obs , there is no weighting applied for a particular cross-section, and our results are valid for arbitrarybcτ 10 2 m, whereb is average LLP boost. (Repeating our analysis for shorter lifetimes would be straightforward, except it would allow for additional information on the LLP lifetime to be extracted from the geometric distribution of decays in the detector.) For each of these sets of weighted events, an unweighted event sample of size ∼ O(10 4 ) was drawn. These are used to construct representative signal samples of N obs = 10, 100, 1000 observed events.
We make two assumptions that simplify our analysis and simulations. The first is that however the LLP decays, MATHUSLA will accurately measure its boost using the methods outlined in Section 2.3. This allows us to omit detailed MATHUSLA detector simulation of the LLP reconstruction, and makes our analysis independent of details of the final detector design. The second assumption is that each LLP decay can be uniquely matched to a hard scattering and hence a single primary vertex in an event at CMS. This is to simplify our first pilot study of analyzing LLP events in MATHUSLA and CMS, and a more realistic analysis can be generalized to account for the possibility that each LLP decay is associated with several possible hard-scattering candidate events due to high pile-up at the HL-LHC. As we discuss in Section 6, this complication should not qualitatively impact our conclusions on production mode classification and parameter estimation.

Determining Production Mode
In this Section we will describe the strategy for identifying the LLP production mechanism at the LHC using combined data from CMS and MATHUSLA.
We assume that MATHUSLA observes N obs = 10, 100 or 1000 LLP decays, all resulting from the same single production topology. Sample-level variables describing characteristics of the entire observed LLP dataset, like fraction of events with some number of jets above some p T in CMS, are used to classify the production mode. The algorithm is summarized in Figure 4 and Table 2.
Using characteristic features of each production mode, we find that simple cuts in sample-level observables can be used to achieve 90% probability of correct model classification for all but small corners of BSM particle parameter space with O(100) observed events, and 98% with O(1000) observed events. For the BB, CC, and HP models, > 90% probabilities of correct classification can be achieved with only N obs = 10 events.

Hierarchical Classification Algorithm
Our goal is to find robust, physically motivated observables that allow us to classify samples of LLP production events into the categories defined by the different simplified models in Figure 3. We therefore construct a hierarchical classification algorithm that addresses each production mode hypothesis in order of increasing difficulty. It is summarized in Figure 4 and Table 2. Below, each step is discussed in detail.
The two easiest tasks are to identify data arising from the Exotic B-meson Decay and Charged Current models. We start with BB. Because the vast majority of B-mesons are produced in relatively soft bb events at the LHC with p T < 20 GeV, we can expect very few jets and even fewer b-tags in events from this model. Only ≈ 0.5% of events have a b-tagged jet with p T > 20 GeV in the Delphes simulation. The presence of a b-tag could thus only be useful when O(1000) LLPs have been detected. However, there is an upside to the low characteristic energy scale of these events -a complete absence of hard jets in > 90% of events. All of the other models under consideration have a hard process at some high scale, producing ISR jets at the least in a sizeable fraction of events. The cut used to classify a sample as BB-like is essentially a jet veto, demanding zero jets above a certain p min T cutoff in a high fraction of events. If the event sample fails this jet veto, we move on to the next possibility.
Next, we check whether the sample could arise from the Charged Current model. In this model, the LLP produced is accompanied by a SM charged lepton. For most values of m LLP and m parent , this lepton is generically produced at high transverse momentum, and in the opposite azimuthal direction from MATHUSLA for events where the LLP is detected at MATHUSLA. None of the other simplified models have a mechanism to produce a single hard SM lepton in such a high fraction of events. The cut for this model will demand an isolated lepton above some minimum p min T in a high fraction of events. The next step in the classification is to check for the Heavy Parent model. In the other models featuring LLPs produced in the decay of a heavy particle, the only jets in an LLP production event come from initial state radiation (with the exception of VBF jets in Higgs  Figure 4: Summary of our hierarchical LLP production mode classification algorithm.

Variable
Cut for model classification B decay (BB) Fraction of events with no jets of p T > 20 GeV >0.9 Charged Current (CC) Fraction of events with one lepton of p T > 10 GeV >0.2 Heavy Parent (LLP +1 jet) Fraction of events with at least one jet of p T > 20 GeV >0.85 Heavy Parent (LLP +2 jet) Fraction of events with at least six jets with p T > 20 GeV

>0.3 Exotic Higgs Decay
Fraction of events y with highest jet p T > 90 GeV 0.04 < y < 0.22 Fraction of events x with two jets with ∆η jj > 2.5 y < 3.5x -0.2

Direct Pair Production
Standard deviation σ and mean µ of LLP log 10 (b) distribution Heavy Resonance -- Table 2: Cuts defining the LLP production model classification algorithm, applied in sequence from top to bottom. An observed event sample that fails all cuts is classified as Heavy Resonance production mode by a process of elimination.
production). The spectrum of this radiation is dependent on the characteristic energy scale (and therefore BSM particle masses) of each model, but in general the number of jets per event and their p T is lower than for the HP model, where we assume the SM products of the heavy parent decay give rise to jets. To be classified as HP, some minimum fraction of events in a sample will have to contain one or more jets above some p min T . Within the HP model, we can further distinguish between the decay of the heavy parent particle to one SM particle plus an LLP, and the decay to two SM particles plus an LLP. This secondary classification asks for an even higher jet multiplicity in some minimum fraction of events, and may take advantage of the fact that 4 hard quarks or gluons at parton-level may be quite likely to split into even more jets after showering.
After these easier classification decisions, we must turn to more difficult cases requiring slightly more sophisticated checks. The Heavy Resonance and Direct Pair Production modes produce no associated SM particles in the hard process that creates the LLP, and the same is true in the dominant gluon fusion channel for Higgs production. However, in the vector boson fusion (VBF) and associated vector boson (VH) production channels for Higgs production, there are additional particles created in the hard process. In particular, we will look for the signature of VBF events to distinguish the HIG model. These events are characterized by a pair of hard jets well-separated in pseudorapidity (∆η jj ), originating from the two quarks surviving from the VBF process. Since VBF accounts for approximately  Table 1, the left plot shows for each sample the fraction of events with ∆n jj > 2.5 vs the fraction of events with hardest jet p T above 90 GeV. Clearly, there is very good separation between HIG and RES/DPP in the high-statistics limit. The right plot shows the distribution of these two variables for samples of size N obs = 100 for the various production modes and masses. There is now significantly more spread, but HIG events are still well separated. The orange shaded region defines our cuts the sample must pass to be classified as Higgs production.
10% of the Higgs production cross-section, we expect to see roughly the same fraction of Higgs events with this distinctive signature. The DPP and RES models are much less likely to produce events of this nature. Apart from this, the jets present in HIG events have a weaker p T spectrum than for RES or DPP in most regions of parameter space, due to the heavy resonance having mass > m H or the produced LLPs having 2m LLP > m H , controlling the scale of the interaction and therefore the momentum of produced ISR jets. With both of these considerations in mind, we can construct a combined cut in the fraction of events with highest jet p T larger than some cutoff, along with fraction of events with ∆η jj greater than some cutoff. See Figure 5 for the shape of this cut, along with the highstatistics prediction of these two variables for the HIG, RES and DPP samples with various masses, and the spread for observed samples of size N obs = 100.
Finally, we must devise a strategy to distinguish between the two most similar models, Heavy Resonance and Direct Pair Production. Indeed, in the high m parent limit, the RES and DPP models coincide. For the first time, we will use the information gathered by MATHUSLA for production mode classification, not just the fact that it triggers CMS. In the RES model, the boost of the detected LLPs is determined almost exclusively by m LLP /m parent , since the heavy resonant particles will be produced dominantly on-threshold, giving the LLPs a boost b ∼ m parent /2m LLP . This produces a fairly narrow boost distribution for the LLPs if the heavy resonance has a narrow width. Contrastingly, in the DPP model the LLP boost is correlated directly with m LLP , with higher masses corresponding to lower boosts. Because there is no resonance decaying to the LLPs in the DPP model, the energy distribution of the LLPs is wider, leading to a wider boost distribution compared to even a high-width heavy resonance.
The difference in correlations of the LLP boost with particle masses between the two models will allow us to define a cut that separates the two models. Figure 6 shows the LLP boost mean and standard deviation of the low-and high-width Heavy Resonance samples for the various simulated masses, compared to the Direct Pair Production samples, in both the high-statistics limit and for observed samples of size N obs = 100. A diagonal cut in this plane can separate DPP and RES. This cut only loses some of its distinguishing power in the high-width limit for the RES model. 2 For all of the above cuts we optimize the precise thresholds to maximize classification performance for event samples of size N obs = 100. The resulting cuts are summarized in Table 2, and visualized as a flowchart in Figure 4. Note that these cuts are applied in order, such that only samples failing to meet the criteria of a given step are tested in the next. While more sophisticated simulations or data-driven optimizations might result in slightly different optimal values of these cuts, their physical basis nevertheless makes them robust.
Our classification framework is easily extendable to include other LLP scenarios. For example, leptonic decay in the Heavy Parent model can be accommodated by testing for a minimum fraction of events with two or more hard leptons. A modification of the two-jet Heavy Parent scenario where the heavy parent undergoes a decay chain rather than threebody decay could be distinguished using jet kinematic variables like M T 2 [53]. Finally, a generalized Higgs decay scenario with indeterminate mass for the Higgs-like scalar could still be identified by the presence of VBF production events, 3 with the task of distinguish-  Figure 7: Flow chart summary of our hierarchical LLP production mode classification algorithm.
showing possible additions to include other LLP production scenarios not considered in our detailed analysis.  ing the scalar resonance from the SM Higgs becoming a problem of parameter estimation similar to the cases we discuss in Section 5. We have omitted these scenarios from our first pilot analysis for simplicity, or because they are easier to classify than the cases we study (in particular the leptonic heavy-parent models). Even so, we can anticipate how our classification algorithm would be modified to include these possibilities in Figure 7. We hope this will be useful for future extensions of our analysis.
It is important to note that nothing about our methods requires the LLP to be detected in an external detector like MATHUSLA. If LLPs are discovered at the LHC main detectors, the same kind of analysis can be used to characterize them as well. In fact, more information will be available, since the main detectors might be able to extract energy and momentum data on the LLP decay products, depending on the position of the DV within the detector.

Performance
Unweighted sets of pseudo-data with N obs = 10, 100, 1000 were generated for the models and mass ranges shown in Table 1. For most models, 30-40 parameter-points were generated, forming a rough grid in the m parent , m LLP /m parent plane. For the one-parameter models, between 8 (HIG) and 40 (DPP) values of m LLP were generated.
The classification algorithm was run on all of these pseudo-data samples, and the rates of correct and incorrect classifications were recorded. At each of the three benchmark values N obs , we report an average classification accuracy for each model. This average of classification accuracies is taken first over all samples at a given point in parameter space, then over all points in parameter space, for each model. This average is particular to the prior imposed by our choice of simulated BSM particle masses, but generally reflects the ability of the classifier to distinguish a given model from the others across parameter space. Corners of parameter space for which the classification accuracy differs significantly from the average are discussed on a case-by-case basis, as are trends of variation in the accuracy as a function of BSM particle masses and widths. Table 3 summarizes the performance of the classifier for each model, while Table 4 shows the full classification matrices for each benchmark N obs .   7. For the RES model with a high-width resonance (Γ Z = 0.3 × m Z ), classification accuracy was lower than for narrow-width, averaging 86% across the simulated parameter space. The variation of classification accuracy across parameter space was similar to the narrow-width case.
Clearly, our classification algorithm performs well in most regions of BSM parameter space. A small minority of scenarios is misclassified, but in each such case this reflects either a genuine physical degeneracy of observables, like in various low-mass limits, requiring much more detailed analysis, or simply a 'special case', where an extension of our simple classifier could greatly improve accuracy. Having demonstrated that classification is generally robust, we leave such further details to future studies.

Determining Model Parameters
The second task for which we would like to estimate the prospective capabilities of MATH-USLA and CMS is the measurement of the properties of the newly discovered BSM particles. In Ref. [20] it was shown that the measured LLP boost distribution gives information about the LLP mass under the assumption of production by exotic Higgs decay. We will attempt to demonstrate similar relations for the rest of the simplified models under our consideration. Without the knowledge of an intermediate parent particle whose mass is itself known, this task becomes more difficult. Each subsection below will describe the estimators of BSM particle masses and precision achievable for each of the simplified models. One important assumption that we make here is that the production mode of the LLPs in the sample has already been accurately identified, so that we know which analysis to perform. 4 The main conclusion of our analysis is that MATHUSLA can determine BSM particle masses with useful O(10%) precision assuming only O(100) observed events.
For each production mode, we use binned maximum-likelihood estimation to compare distributions of event-level observables from pseudo-data with N obs = 10, 100 or 1000 events against distributions generated using our full high-statistics simulations. Best-fit masses are computed for many pseudo-data samples, and the standard deviation of the distribution of best-fit masses is taken as an estimate of the precision MATHUSLA can achieve when measuring LLP model parameters. The log-likelihood (LLH) function to be minimized as a function of the theory parameters is where i indexes bins in a histogram of the chosen observables, n i is the observed count in bin i for a given sample of pseudo-data, and µ i (m parent , m LLP ) is the expected count in bin i. Bins with 0 observed count are ignored. The likelihood function is evaluated over a grid of points in (m parent , m LLP /m parent ) with available simulation, then interpolated and minimized. For models without varying parent particle masses, simulation is generated for m LLP at regular intervals, and the likelihood depends only on m LLP . The spacing and size of the grid were chosen to accurately demonstrate the precision achievable at N obs = 100. This means that for N obs = 10, some outlier pseudo-experiments do not have best-fit mass values within the grid. In such cases, the reported precision  Table 5: Summary of parameter estimation performance for all of the simplified models we consider.
The variables x 1 , x 2 chosen to estimate BSM particle masses are listed. The precisions shown are the characteristic standard deviation/mean of best-fit masses for benchmark BSM particle masses that are approximately representative for each model.
should be regarded as an approximate lower bound on the uncertainty in BSM particle masses MATHUSLA can achieve. For N obs = 1000 the grid spacing is much larger in some cases than the variation in best-fit masses, meaning that the actual distribution of best-fit masses is not accurately portrayed. However, systematic uncertainties driven by the final experimental design are likely to dominate in this regime, so the exact values obtained for 1000 observed events are less important than the qualitative lesson that the BSM parameter measurement might become systematics dominated. For single-parameter models without a resonant LLP parent particle (Direct Pair Production), or where the parent mass is known (B-decay, Higgs decay), we only need a single variable x 1 (m LLP ) = b = | p LLP |/m LLP , the LLP boost that MATHUSLA can measure from the DV geometry. The binned distribution µ i (m LLP ) in this variable is sufficient to measure LLP mass, since b ∼ m parent /m LLP .
For models with unknown parent particle mass (Charged Current, Heavy Parent and Heavy Resonance), we also make use of the LLP boost x 1 (m LLP , m parent ) = b. However, we need to define an additional variable x 2 (m LLP , m parent ) that can be constructed from information at MATHUSLA and CMS, and which supplies information about the LLP or parent particle masses independent of the boost. We do this case-by-case below, and the variables are summarized in Table 5, along with the approximate precision MATHUSLA can achieve for each model.
Due to the high computational cost of generating simulated datasets on the finelyspaced grid of LLP and parent particle masses necessary to accurately maximize the likelihood over parameter space, we demonstrate parameter estimation performance for one set of BSM particle masses in the two-parameter models. We choose m parent = 1 TeV, m LLP = 300 GeV for all models as a benchmark that is fairly representative of MATH-USLA's abilities across a large range of parameter space.

Exotic B-Meson Decay
The first model we consider is Exotic B-meson Decay, in the two cases of a heavy neutral lepton (HNL) LLP or a scalar LLP. We assume these can be distinguished by examining the decay products of observed LLPs. In the HNL case, the LLP decay includes an invisible neutrino, so the visible decay products in MATHUSLA do not point back at the LHC interaction point. The details of how this classification could be accomplished would require careful further study as well as detailed knowledge on the final experimental design of MATHUSLA, which is not currently available. Furthermore, the boost measurement requires special treatment when the LLP mass is not much greater than that of its decay products. These issues are important and require their own dedicated investigation, which is beyond the scope of this work. For our purposes, we will assume that that the LLP decay mode can be determined and its boost measured with useful accuracy.
With the mass of SM mesons known, the only free parameter in this model is the LLP mass. If m LLP m B , the LLPs produced will be highly boosted, and their boost is strongly correlated with their mass. Figure 8 shows this relationship for both scalar and RHN LLPs, using the mean boost to demonstrate the dependence on m LLP . It should be noted that as m LLP increases, the boost rapidly loses its dependence on m LLP .
The precision of the m LLP estimation was evaluated at three benchmark LLP masses: 0.5, 1.0, 3.0 GeV, for both HNL and scalar LLPs. Maximum likelihood estimation was performed on many samples of N obs = 10, 100, 1000, comparing each sample's boost histogram against expected histograms for log 10 (b LLP ) at each mass point created from high-statistics simulated samples. Tables 6 and 7 show the standard deviation of best-fit masses for each mass point and pseudo-experiment sample size under the two different models. In cases where no local minimum in −LLH was found within the range of masses with available simulation, typically occurring for high m LLP , the best-fit LLP mass was set to 4 GeV. Consequently, the spread of best-fit masses is generally underestimated for the m LLP = 3 GeV case.
Comparing the two kinds of LLP under consideration, we find that LLP mass can be estimated with lower standard error in the scalar case than the HNL case. At or below ≈ 1 GeV, the HNL LLP mass can be determined with relative precision ≈ 0.15 − 0.2 with N obs = 100, while the scalar LLP mass can be determined with relative precision ≈ 0.1.   At m LLP = 3 GeV, the spread in best-fit masses is large enough for both types of LLP that a large fraction of samples have no local −LLH minimum within the range of masses simulated, indicating a large uncertainty in the LLP mass.

Charged Current
Second, let us consider the Charged Current production model, where two new particles participate in the LLP production process. We need to define a variable that is comple-   mentary to the LLP boost b = x 1 , in order to determine both parent and LLP masses. In the CC model, the LLP is produced in a decay whose other product is a SM lepton , which is most likely observed and whose momentum is measured in CMS. Its transverse momentum p T is highly correlated with m parent −m LLP , which is an orthogonal combination of masses to m LLP . We therefore set x 2 = p T . 5 The strong dependences of the high-statistics distribution average of x 1 = log 10 (b) on m LLP /m parent , and the peak value of the x 2 = p T distribution on m parent − m LLP are illustrated in Figure 9. Figure 10 shows the precision attained for m parent and m LLP /m parent in the cases N obs = 10, 100, 1000 for our benchmark parameter point of (m parent , m LLP ) = (1 TeV, 300 GeV). Table 8 gives the relative precisions for parent and LLP masses, defined as the standard deviation divided by the mean of best-fit masses. The most important result is that under the CC production model, with 100 LLP observations in MATHUSLA successfully matched with their production events in CMS, the parent particle mass and the m LLP /m parent mass ratio can both be determined with better than 5% resolution. Even with only 10 events, these resolutions degrade to ≈ 8% and 10% respectively.

Heavy Parent
Next, consider the Heavy Parent model. Again, one of our variables will be the LLP boost log 10 (b) = x 1 , and we need to find a complementary variable x 2 . We select x 2 = H T , the scalar sum of jet transverse momenta in CMS. Its distribution is highly correlated with m parent − m LLP , similarly to p T in the Charged Current case, and works well for both oneand two-jet parent decay scenarios. 6 Figure 11 shows the dependence of the high-statistics average value of both estimators on the BSM particle masses.
Ellipses showing the spread of best-fit masses for pseudo-data samples of size N obs = 10, 100, 1000 using H T are displayed in Figure 12 for the one-and two-jet decay modes of our benchmark parameter point (m parent , m LLP ) = (1 TeV, 300 GeV). Tables 9 and 10 summarize the distributions of best-fit particle masses in the one-and two-jet Heavy Parent decay models respectively. For the one-jet case, 5% resolution in m parent and m LLP /m parent is possible with 100 observed LLP events in MATHUSLA, and 20% resolution with just 10 events. In the two-jet case, resolution of 5-10% in m parent and m LLP /m parent is possible with 100 observed LLP events in MATHUSLA, and 20-30% resolution with just 10 events.
Since the H T variable can be used to estimate masses in the one-jet and two-jet Heavy Parent decay scenarios, it is interesting to consider whether the parameter estimation even requires an accurate determination of the production mode (within the Heavy Parent class    Table 9: Heavy Parent 1-jet decay model mass estimation results for m LLP = 300 GeV, m parent = 1 TeV, using estimators x 1 = log 10 (b LLP ), x 2 = H T . N samples is the number of pseudo-data sets tested for the indicated number of events N obs .  Table 11: Exotic Higgs decay model mass estimation results with estimator log 10 (b LLP ) and true m LLP = 35 GeV. N samples is the number of pseudo-data sets tested for the indicated number of events N obs .
of scenarios). In other words, could one measure the parent particle mass in the HP scenario, without knowing the exact hadronic decay mode of the parent? To test this, we performed a parameter estimation on HP two-jet data with m parent = 1000 GeV, m LLP = 300 GeV, but using the one-jet hypothesis as our fitting templates. The best-fit masses had a large positive bias, in many cases beyond the range of simulated masses. This is not unexpected, since the final states of a 2-body decay have lower combined H T than the final states of a 3-body decay. This bias should therefore persist for lower masses, indicating that precise measurements of m parent and m LLP requires fairly precise knowledge of the parent decay mode. 7

Exotic Higgs Decay
In the Exotic Higgs Decay model, the average boost b ∼ m h /2m LLP of LLPs is strongly correlated with LLP mass, see Figure 13 (left). This scenario was already analyzed in Ref. [20], but we include it here for completeness. Maximum likelihood estimation was performed on the boost distributions of many pseudo-data samples at m LLP = 35 GeV, similarly to the procedure for exotic B-meson decays to evaluate the mass estimation precision. Because the precision does not vary significantly over the range of m LLP considered for this model, we only show results for this one mass point. Table 11 summarizes the distributions of best-fit masses for N obs = 10, 100, 1000.

Direct Pair Production
Our strategy for the Direct Pair Production model is the same as for Higgs decay. The correlation between boost and LLP mass is strong for this model, degrading in the high-  Table 12: Direct Pair Production model mass estimation results with estimator log 10 (b LLP ). N samples is the number of pseudo-data sets tested for the indicated number of events N obs .
mass limit, see Figure 13 (right). Maximum likelihood estimation was performed for ensembles of pseudo-experiments with N obs = 10, 100, 1000 GeV, for benchmark masses 100, 1000, 1700 GeV. Table 12 gives the mean and relative precision of the best-fit masses obtained for each choice. Note that since the highest mass point simulated was 2 TeV, this is the upper bound on best-fit masses reported. Therefore, the spread in best-fit masses may be underestimated for m LLP = 1700 GeV, N obs = 10, 100 (although the prior on detecting much heavier states at the LHC would be suppressed due to kinematic suppression of the production rate). For all three benchmark masses, resolution of 15% or better can be achieved with 100 observed LLPs.

Heavy Resonance
The final simplified model to consider is the intermediate s-channel Heavy Resonance. Like the Heavy Parent scenario, there are two sub-categories. In this case they are the narrowand high-width limits for the resonant particle. Just as for HP, we will first consider parameter estimation in each case separately, then investigate the impact of using the incorrect width hypothesis to estimate parameters. Finally, we will demonstrate that the shape of the boost distribution provides useful information on the resonance width.
In the RES simplified model, the LLP boost distribution is almost entirely dependent on m LLP /m parent , with very little separate dependence on m parent . This is demonstrated in Figure 14 (left). This not only makes x 1 = b the obvious choice for one of our fit variables, it also means the LLP to parent mass ratio can be extracted independently. To find the second complementary variable that independently measures m parent , the only handle we have available is the fact that the energy scale of ISR jets in the detector is determined by the overall energy scale of the hard process, which is m parent . Several variables were considered as candidates for x 2 , including the highest main-detector jet transverse momentum, p j1 T , the scalar sum of jet transverse momenta H T , and the number of jets with p T > 20 GeV, n jet . It was found that n jet gave the best sensitivity, and its dependence on m parent is shown in Figure 14 (right). The weakening dependence of n jet at high m parent is consistent with the logarithmic dependence of ISR kinematics on the hard scale of the event.
These two variables were used for the two-dimensional binned likelihood fit to measure m parent and m LLP /m parent in the same fashion as the Charged Current and Heavy Parent analyses, using pseudo-data samples with m parent = 1000 GeV, m LLP = 300 GeV. Unfortunately, the variance in n jet is so high that only for N obs = 1000 was m parent estimated with precision better than the range of heavy resonance masses simulated. We therefore do not report a precision for measurement of m parent for N obs 100 for our benchmark point, but given the steeper m parent dependence of n jet for lighter parent masses, we expect that observation of fewer events might be sufficient to measure parent masses in the O(100 GeV) range.
Parameter estimation in the Heavy Resonance model is likely to benefit from other measurements at the LHC main detectors, since the s-channel mediator must have a visible SM decay channel, at minimum through the same process by which it was produced. This is similar to dark matter simplified models involving a heavy vector mediator, where some of the strongest constraints come from LHC searches for di-jet resonances [54][55][56][57]. Current and future colliders have the potential to detect the heavy mediator and measure its mass with O(10%) or better precision with standard resonance search techniques [58,59]. If LLPs are discovered at MATHUSLA and classified as being produced via an s-channel resonance, the  Table 13: Heavy Resonance model mass estimation results for m LLP = 300 GeV, m parent = 1 TeV, using estimators x 1 = log 10 (b LLP ), x 2 = n jet . N samples is the number of pseudo-data sets tested for the indicated number of events N obs . Results for high-width are shown in brackets next to those for narrow-width. Note that a measurement of m parent using LLP events is only possible for ∼ 1000 observed decays for our benchmark point.
observed events in MATHUSLA could be used to independently measure m LLP /m parent . Table 13 demonstrates that this ratio can be measured with ∼ 5% precision or better for 10-100 observed events. Resonance searches at the main detectors would then provide an independent measurement of m parent , allowing all mass parameters of the production mode to be determined.
To evaluate the robustness of parameter estimation for this model under variation of the width of the heavy resonance, likelihood fits were performed for samples of narrow width events using template distributions from the high-width case, and vice versa. Using this method, m LLP /m parent can be estimated with similar precision to that achieved using the correct template distributions, but with a small bias. The average m LLP /m parent for narrow width samples under the high-width assumption is higher than the true value by approximately 1%, but this is smaller than the spread of the distribution of best-fit masses up to N obs = 1000. The average mass ratio for high-width samples under the narrow width assumption is biased slightly lower than the true value, by approximately 3%. The LLP to parent mass ratio can therefore be reliably extracted from the LLP boost distribution measured at MATHUSLA alone, once the production mode has been classified as Heavy Resonance, regardless of the resonance width.
On the other hand, estimation of m parent using the incorrect width fails completely. In both cases, the best-fit masses were at the edge of the range of masses simulated, at least 50% off of the true value, indicating that it is not possible to estimate m parent without an estimate or assumption of the heavy resonant particle's width, regardless of N obs . This emphasizes the importance of main detector resonance searches to determine all parameters of this LLP simplified production model.
In the high statistics limit, the width and mass of the resonance could be measured independently of a separate resonance search by performing a full three-dimensional likelihood fit for the LLP mass, parent mass, and parent width. This is beyond our scope, but we demonstrate that this is possible in principle by classifying pseudo-data sets as lowor high-width based on the shape of the sample's boost distribution, using the ratio of maximum likelihoods under the two width hypotheses as a discriminant. The results are summarized in Table 14 for true m parent = 1 TeV, m LLP = 300 GeV. It is clear that at least the two extremes of possible heavy resonance widths can be reliably distinguished,  Table 14: Heavy Resonance model width classification results from likelihood ratios of log 10 (b LLP ) distribution with true m parent = 1 TeV, m LLP = 300 GeV, and Γ parent = 10 GeV, 300 GeV. Results for high-width are shown in brackets next to those for narrow width. N samples is the number of pseudo-data sets tested for the indicated number of events N obs .
and that with enough observed events the heavy resonance's width could be estimated simultaneously with BSM particle masses.

Uncertainty in Matching LLP Decay to Production Vertex
The analyses above demonstrate that combining information from MATHUSLA and the main detector in the event of an LLP discovery would allow both the production mode and the mass parameters of the underlying simplified model to be determined. For simplicity, we have so far assumed that each LLP decay at MATHUSLA could be associated with a single hard primary vertex at the HL-LHC. However, this is unlikely to be the case, partially due to some uncertainty in the LLP boost measurement at MATHUSLA, which narrows down the bunch crossings that produced the LLP down to a few, but more importantly because of the high pile-up multiplicity of 140-200 during the HL-LHC running conditions [60,61], Even given the fact that 'interesting' hard events are in absolute terms fairly rare in pp collisions, and that the primary vertex with the highest associated total energy in the main detector is likely to be the one that corresponds to the LLP production event for most production modes, it seems apparent that in a realistic analysis, each LLP observation at MATHUSLA will correspond to a list of candidate primary vertices, instead of just one. We must therefore ask how our classification and parameter estimation analysis would be affected by this complication. We find that our methods are robust and should be adaptable to account for an ambiguity in the exact production vertex.

Impact on Production Mode Classification
If there is an 'ensemble' of multiple candidate hard primary vertices per LLP, then instead of cuts on the fraction of events meeting certain criteria, the classifier would impose cuts on the fraction of ensembles of candidate primary vertices meeting criteria. This could be as simple as demanding that at least one of the candidate vertices corresponding to an LLP observation satisfy certain conditions. We explore the impact of such a change for each of the simplified models.
1. Since the requirement to classify a sample as Exotic B-meson Decay is a lack of hard scattering events, the additional candidate hard scatters will not lead to other models being classified as BB. However, it is likely that BB samples will be more likely to fail this criterion. The maximum fraction of events across all ensembles with jets of p T > 20 GeV can be decreased to accommodate this, or we could impose a cut on not the fraction of events with a hard jet, but the fraction of ensembles where a certain fraction of events have a hard jet. This can be optimized in a realistic analysis.
2. The correct production vertex in the Charged Current model can still be identified by a hard lepton, which does not occur often in the detector. Therefore, classification for this model should be relatively unaffected.
3. Classification of the Heavy Parent model relies on jet multiplicity, which is vulnerable to contamination from spurious vertices. This contamination can be reduced by raising the minimum p T and/or jet multiplicity for a candidate vertex to pass selection criteria. There may be a small penalty to classification accuracy for this model, but it should still perform similarly to what we demonstrated in this work.
4. The main signature used to identify Exotic Higgs Decay samples is the presence of well-separated jets from VBF. While correct identification of the production vertex may be more difficult for the GF production channel, the presence of VBF-like jets in roughly 10% of the sets of candidate vertices should be roughly unaltered, as should the classifier's performance for this model. 5. The classification of the DPP model uses only LLP boost information, and is therefore unaffected by uncertainty in the production vertex identification.
With only small modifications, the classification procedure outlined in Section 4 should therefore be remarkably robust under the introduction of multiple possible LLP production vertices. We turn now to a discussion of how parameter estimation would proceed in these circumstances.

Impact on Parameter Estimation
For the Exotic B-meson Decay, Exotic Higgs Decay, and Direct Pair Production models, parameter estimation only uses LLP boost, so it is unaffected. For the Charged Current and Heavy Parent models, the correct production vertex can likely be identified in a high fraction of events due to associated hard objects or leptons, so parameter estimation can proceed as normal. Finally, for the Heavy Resonance model, estimation of m LLP /m parent uses only LLP boost and is unaffected. Estimation of m parent using the ISR spectrum is unreliable even when the correct vertex is identified, and is likely to rely on separate resonance searches performed with the main detectors. Therefore, the ability of MATHUSLA to estimate BSM particle masses is essentially unaffected by the complication of multiple possible LLP production vertices.

Conclusions
Many well-motivated scenarios for BSM physics include long-lived particles, and it is important that the experimental search program for long-lived particles continue to be developed and expanded. The proposed MATHUSLA detector represents a vital part of that program, able to probe parameter space inaccessible by any other experiment. Allowing MATHUSLA to act as a Level-1 Trigger for CMS is required to take full advantage of both detectors' potential. This would enable an analysis of LLP observations at MATHUSLA to be augmented by observation of their production events in the main detector. With information from both experiments, both the production and decay topology of long-lived particles, as well as the BSM particle masses, can be accurately determined.
We have shown that for heavy LLPs, an accurate diagnosis at the simplified model level can be achieved > 90% of the time with only 100 observed events, and ≈ 98% of the time with 1000 observed events. Similar performance is possible for lighter LLPs, except for the Heavy Parent and Charged Current models with squeezed spectra m LLP ∼ m parent < 100 GeV, where classification fails because the associated objects being used to identify the production mode become too soft. With similar statistics, the underlying parameters of the simplified model, like LLP and parent particle mass, can be measured with ∼ 10% precision or better in most cases. The exception is the Heavy Resonance production mode, where the ratio of LLP to parent particle mass can be reliable extracted from the LLP boost distribution, but measurement of the parent particle mass would likely benefit from separate resonance searches conducted with the main detectors, in analogy to mediator resonance searches in Dark Matter simplified models [54][55][56][57].
This performance is achieved with extremely simple cuts and analyses using robust, physically motivated features of LLP production events. Further work is sure to improve on our demonstrated classification accuracy and measurement precision. One case that will require further study is LLP production in exotic B-meson decays, where the masses of LLP decay products cannot be neglected, and the measurement of LLP boost at MATHUSLA is consequently more difficult. We have also discussed how to extend our work to include other LLP production scenarios like Heavy Parent decay to leptons + LLP, Heavy Parent with multi-step decay chains, or the consideration of a Higgs-like intermediate resonance of indeterminate mass. It is also imaginable that post-discovery, we suffer an embarrassment of riches, with many different LLP production and decay modes being observed. In that case, our analysis would have to be extended to allow for more than one dominant channel, but this work still demonstrates in principle how the information in each channel could be extracted. Finally, if the LLP lifetime is in the rangebcτ ∼ 1 − 100 m, the geometric distribution of DVs in MATHUSLA's decay volume could be used to extract the lifetime directly. Our methods are applicable not just to MATHUSLA and CMS, but also to other external LLP detector proposals, or even LLPs discovered using LHC main detectors alone. This emphasizes not only the great discovery potential of new LHC detectors like MATH-USLA [15], FASER [17] or CODEX-b [16], but also shows that in the event of a discovery, however it took place, the origin of Long-Lived Particles can be uncovered in great detail.
supercomputer at the SciNet HPC Consortium [62,63]. SciNet is funded by: the Canada Foundation for Innovation; the Government of Ontario; Ontario Research Fund -Research Excellence; and the University of Toronto. This research was enabled in part by support provided by Compute Canada (www.computecanada.ca).

A Simulation Details
This Section contains a detailed description of how LLP production events were simulated for each of the simplified models considered in this work.
1. For the B-meson decay model, bb production was simulated with the MadGraph Standard Model. The B-mesons produced were identified in the simulation at hadronization level. If two were present, one was randomly chosen and manually decayed to two-or three-body final states consisting of SM electrons and/or mesons as well as the LLP in accordance with the relative branching ratios in [25,27], depending on the LLP type. Heavier B-baryons were present in approximately 10% of events, but were ignored for simplicity. (A more realistic analysis would include them separately, but this would not affect our conclusions.) If a jet with p T > 20 GeV corresponding to the B-meson chosen for LLP decay was present in the simulated event at detector level, it was removed from the event. A jet was defined as corresponding to a B-meson if the ∆R = (∆φ 2 ) + (∆η) 2 between the jet and B-meson was less than 0.5. This value was chosen by considering the distribution of ∆R to find the characteristic distance between B-mesons and their corresponding jets at detector level. The SM products of the exotic B decay were not added back into the event. Since most bb production at the LHC is near threshold energy, and the B-mesons characteristically have momentum ≈ 5 GeV, only ≈ 2% of events contained jets with p T above the 20 GeV threshold at detector level. Therefore, the effect of completely discarding the SM products of the exotic decay is negligible. A reweighting of these events was performed by comparing the B-meson p T spectrum to one obtained from FONLL for B-mesons with 0 < η < 3, roughly corresponding to those heading towards MATHUSLA [64].
2. For the Charged Current model, the ChargeWN MadGraph model from the LLP Simplified Model Library was used [13]. The simulated process was production of an (anti-)muon plus the lightest neutralino, mediated by an s-channel W ± boson. Because of the distinctive hard lepton present in these events, rendering analysis of jet variables irrelevant for model classification, jet matching was not performed for these samples. The choice was made to restrict to the case of decay of the W to muons plus LLPs. Decay to other leptons or lepton-universal decays would not present any additional challenges for LLP production mode identification.
3. For the Heavy Parent model, the the minimal supersymmetric standard model (MSSM ) model was used [7,[45][46][47][48][49]. In the one-jet decay mode the simulated process was squark anti-squark production, with each squark (anti-squark) decaying to either one quark or anti-quark as well as the lightest neutralino, which serves as the LLP. In the twojet decay mode the simulated process was gluino pair production, with each gluino decaying to two quarks via an off-shell 10 TeV squark. Because the most important jets in these events originate from the hard process, jet matching was not performed on these samples. The choice was made to restrict the decay of the heavy parent particle to jets only, because this presents the most difficult case to distinguish from other simplified models. Decay to leptons plus LLPs would be identifiable by a large proportion of events containing multiple hard leptons.
4. For the Exotic Higgs Decay model, Higgs production was simulated using the heft model in Madgraph [50][51][52]. The gluon fusion, vector boson fusion, and associated vector boson production modes were included. Jet matching was performed up to one jet for the gluon fusion channel, with the jet matching variable xqcut set to 15. Jet matching was performed to increase the reliability of simulated jet-related variables, since the only jets in these events are due to initial state radiation, and none are produced in the hard process. The Higgs boson was not permitted to decay. For each event, a pair of LLP 4-vectors with energy m H /2 were manually generated pointing in a random direction and its opposite, as one would expect in Higgs decay to two scalars in the Higgs rest frame. These 4-vectors were then boosted into the lab frame using the Higgs momentum from simulation.

5.
For the Direct Pair Production model, the same MSSM package was used, and pair production of the lightest neutralino as the LLP was the simulated process. The intermediate, t-/u-channel squark was set to 10 TeV to remove it as a dynamical degree of freedom. Jet matching was performed up to one jet, and the variable xqcut was set to 0.4×m LLP . This choice of xqcut was made to produce smooth distributions of jet kinematics variables that were stable under small variation of xqcut.
6. For the Heavy Resonance model, the simplified model Zp 2LLP was used [13]. The process simulated was dark scalar pair production with a Z in the s-channel, with the dark scalar's width set to 1 × 10 −20 GeV so that they did not decay in simulation. Jet matching was performed up to one jet, and the variable xqcut was set to 10 + 0.1 × m parent after a similar optimization procedure to that used for Direct Pair Production.